Compare commits

...

14 Commits

Author SHA1 Message Date
qzhu
9a066f9330 fix a bug 2024-01-04 13:47:07 -08:00
qzhu
88f6ae7cd1 fix example code 2024-01-03 15:59:30 -08:00
qzhu
f2d90331a4 example code 2023-12-22 11:25:10 -08:00
qzhu
997a50900d fix example code 2023-12-22 10:19:20 -08:00
QianZhu
0009206829 Update docs/src/javascript/saas-modules.md
Co-authored-by: Aidan  <64613310+aidangomar@users.noreply.github.com>
2023-12-22 09:53:01 -08:00
QianZhu
cab74c31d0 Update docs/src/javascript/saas-modules.md
Co-authored-by: Aidan  <64613310+aidangomar@users.noreply.github.com>
2023-12-22 09:52:29 -08:00
qzhu
fbb7a546df SaaS JS API sdk doc 2023-12-21 21:28:24 -08:00
Ayush Chaurasia
693091db29 chore(python): Reduce posthog event count (#661)
- Register open_table as event 
- Because we're dropping 'seach' event currently, changed the name to
'search_table' and introduced throttling
- Throttled events will be counted once per time batch so that the user
is registered but event count doesn't go up by a lot
2023-12-08 11:00:51 -08:00
Ayush Chaurasia
dca4533dbe docs: Update roboflow tutorial position (#666) 2023-12-08 11:00:11 -08:00
QianZhu
f6bbe199dc Qian/minor fix doc (#695) 2023-12-08 09:58:53 -08:00
Kaushal Kumar Choudhary
366e522c2b docs: Add badges (#694)
adding some badges
added a gif to readme for the vectordb repo

---------

Co-authored-by: kaushal07wick <kaushalc6@gmail.com>
2023-12-08 20:55:04 +05:30
Chang She
244b6919cc chore: Use m1 runner for npm publish (#687)
We had some build issues with npm publish for cross-compiling arm64
macos on an x86 macos runner. Switching to m1 runner for now until
someone has time to deal with the feature flags.

follow-up tracked here: #688
2023-12-07 15:49:52 -08:00
QianZhu
aca785ff98 saas python sdk doc (#692)
<img width="256" alt="Screenshot 2023-12-07 at 11 55 41 AM"
src="https://github.com/lancedb/lancedb/assets/1305083/259bf234-9b3b-4c5d-af45-c7f3fada2cc7">
2023-12-07 14:47:56 -08:00
Chang She
bbdebf2c38 chore: update package lock (#689) 2023-12-06 17:14:56 -08:00
16 changed files with 1087 additions and 73 deletions

View File

@@ -88,6 +88,9 @@ jobs:
cd docs/test cd docs/test
node md_testing.js node md_testing.js
- name: Test - name: Test
env:
LANCEDB_URI: ${{ secrets.LANCEDB_URI }}
LANCEDB_DEV_API_KEY: ${{ secrets.LANCEDB_DEV_API_KEY }}
run: | run: |
cd docs/test/node cd docs/test/node
for d in *; do cd "$d"; echo "$d".js; node "$d".js; cd ..; done for d in *; do cd "$d"; echo "$d".js; node "$d".js; cd ..; done

View File

@@ -37,14 +37,10 @@ jobs:
path: | path: |
node/vectordb-*.tgz node/vectordb-*.tgz
node-macos: node-macos-x86:
runs-on: macos-13 runs-on: macos-13
# Only runs on tags that matches the make-release action # Only runs on tags that matches the make-release action
if: startsWith(github.ref, 'refs/tags/v') if: startsWith(github.ref, 'refs/tags/v')
strategy:
fail-fast: false
matrix:
target: [x86_64-apple-darwin, aarch64-apple-darwin]
steps: steps:
- name: Checkout - name: Checkout
uses: actions/checkout@v3 uses: actions/checkout@v3
@@ -54,11 +50,30 @@ jobs:
run: | run: |
cd node cd node
npm ci npm ci
- name: Install rustup target
if: ${{ matrix.target == 'aarch64-apple-darwin' }}
run: rustup target add aarch64-apple-darwin
- name: Build MacOS native node modules - name: Build MacOS native node modules
run: bash ci/build_macos_artifacts.sh ${{ matrix.target }} run: bash ci/build_macos_artifacts.sh x86_64-apple-darwin
- name: Upload Darwin Artifacts
uses: actions/upload-artifact@v3
with:
name: native-darwin
path: |
node/dist/lancedb-vectordb-darwin*.tgz
node-macos-arm64:
runs-on: macos-13-xlarge
# Only runs on tags that matches the make-release action
if: startsWith(github.ref, 'refs/tags/v')
steps:
- name: Checkout
uses: actions/checkout@v3
- name: Install system dependencies
run: brew install protobuf
- name: Install npm dependencies
run: |
cd node
npm ci
- name: Build MacOS native node modules
run: bash ci/build_macos_artifacts.sh aarch64-apple-darwin
- name: Upload Darwin Artifacts - name: Upload Darwin Artifacts
uses: actions/upload-artifact@v3 uses: actions/upload-artifact@v3
with: with:

View File

@@ -5,10 +5,11 @@
**Developer-friendly, serverless vector database for AI applications** **Developer-friendly, serverless vector database for AI applications**
<a href="https://lancedb.github.io/lancedb/">Documentation</a> <a href='https://github.com/lancedb/vectordb-recipes/tree/main' target="_blank"><img alt='LanceDB' src='https://img.shields.io/badge/VectorDB_Recipes-100000?style=for-the-badge&logo=LanceDB&logoColor=white&labelColor=645cfb&color=645cfb'/></a>
<a href="https://blog.lancedb.com/">Blog</a> <a href='https://lancedb.github.io/lancedb/' target="_blank"><img alt='lancdb' src='https://img.shields.io/badge/DOCS-100000?style=for-the-badge&logo=lancdb&logoColor=white&labelColor=645cfb&color=645cfb'/></a>
<a href="https://discord.gg/zMM32dvNtd">Discord</a> [![Medium](https://img.shields.io/badge/Medium-12100E?style=for-the-badge&logo=medium&logoColor=white)](https://blog.lancedb.com/)
<a href="https://twitter.com/lancedb">Twitter</a> [![Discord](https://img.shields.io/badge/Discord-%235865F2.svg?style=for-the-badge&logo=discord&logoColor=white)](https://discord.gg/zMM32dvNtd)
[![Twitter](https://img.shields.io/badge/Twitter-%231DA1F2.svg?style=for-the-badge&logo=Twitter&logoColor=white)](https://twitter.com/lancedb)
</p> </p>

View File

@@ -80,7 +80,6 @@ nav:
- Ingest Embedding Functions: embeddings/embedding_functions.md - Ingest Embedding Functions: embeddings/embedding_functions.md
- Available Functions: embeddings/default_embedding_functions.md - Available Functions: embeddings/default_embedding_functions.md
- Create Custom Embedding Functions: embeddings/api.md - Create Custom Embedding Functions: embeddings/api.md
- Example - Calculate CLIP Embeddings with Roboflow Inference: examples/image_embeddings_roboflow.md
- Example - Multi-lingual semantic search: notebooks/multi_lingual_example.ipynb - Example - Multi-lingual semantic search: notebooks/multi_lingual_example.ipynb
- Example - MultiModal CLIP Embeddings: notebooks/DisappearingEmbeddingFunction.ipynb - Example - MultiModal CLIP Embeddings: notebooks/DisappearingEmbeddingFunction.ipynb
- 🔍 Python full-text search: fts.md - 🔍 Python full-text search: fts.md
@@ -99,6 +98,7 @@ nav:
- YouTube Transcript Search: notebooks/youtube_transcript_search.ipynb - YouTube Transcript Search: notebooks/youtube_transcript_search.ipynb
- Documentation QA Bot using LangChain: notebooks/code_qa_bot.ipynb - Documentation QA Bot using LangChain: notebooks/code_qa_bot.ipynb
- Multimodal search using CLIP: notebooks/multimodal_search.ipynb - Multimodal search using CLIP: notebooks/multimodal_search.ipynb
- Example - Calculate CLIP Embeddings with Roboflow Inference: examples/image_embeddings_roboflow.md
- Serverless QA Bot with S3 and Lambda: examples/serverless_lancedb_with_s3_and_lambda.md - Serverless QA Bot with S3 and Lambda: examples/serverless_lancedb_with_s3_and_lambda.md
- Serverless QA Bot with Modal: examples/serverless_qa_bot_with_modal_and_langchain.md - Serverless QA Bot with Modal: examples/serverless_qa_bot_with_modal_and_langchain.md
- 🌐 Javascript examples: - 🌐 Javascript examples:
@@ -146,8 +146,10 @@ nav:
- Serverless Chatbot from any website: examples/serverless_website_chatbot.md - Serverless Chatbot from any website: examples/serverless_website_chatbot.md
- TransformersJS Embedding Search: examples/transformerjs_embedding_search_nodejs.md - TransformersJS Embedding Search: examples/transformerjs_embedding_search_nodejs.md
- API references: - API references:
- Python API: python/python.md - OSS Python API: python/python.md
- SaaS Python API: python/saas-python.md
- Javascript API: javascript/modules.md - Javascript API: javascript/modules.md
- SaaS Javascript API: javascript/saas-modules.md
- LanceDB Cloud↗: https://noteforms.com/forms/lancedb-mailing-list-cloud-kty1o5?notionforms=1&utm_source=notionforms - LanceDB Cloud↗: https://noteforms.com/forms/lancedb-mailing-list-cloud-kty1o5?notionforms=1&utm_source=notionforms
extra_css: extra_css:

View File

@@ -164,6 +164,7 @@ You can further filter the elements returned by a search using a where clause.
const results_2 = await table const results_2 = await table
.search(Array(1536).fill(1.2)) .search(Array(1536).fill(1.2))
.where("id != '1141'") .where("id != '1141'")
.limit(2)
.execute() .execute()
``` ```
@@ -187,6 +188,7 @@ You can select the columns returned by the query using a select clause.
const results_3 = await table const results_3 = await table
.search(Array(1536).fill(1.2)) .search(Array(1536).fill(1.2))
.select(["id"]) .select(["id"])
.limit(2)
.execute() .execute()
``` ```

View File

@@ -0,0 +1,226 @@
[vectordb](../README.md) / [Exports](../saas-modules.md) / RemoteConnection
# Class: RemoteConnection
A connection to a remote LanceDB database. The class RemoteConnection implements interface Connection
## Implements
- [`Connection`](../interfaces/Connection.md)
## Table of contents
### Constructors
- [constructor](RemoteConnection.md#constructor)
### Methods
- [createTable](RemoteConnection.md#createtable)
- [tableNames](RemoteConnection.md#tablenames)
- [openTable](RemoteConnection.md#opentable)
- [dropTable](RemoteConnection.md#droptable)
## Constructors
### constructor
**new RemoteConnection**(`client`, `dbName`)
#### Parameters
| Name | Type |
| :------ | :------ |
| `client` | `HttpLancedbClient` |
| `dbName` | `string` |
#### Defined in
[remote/index.ts:37](https://github.com/lancedb/lancedb/blob/main/node/src/remote/index.ts#L37)
## Methods
### createTable
**createTable**(`name`, `data`, `mode?`): `Promise`<[`Table`](../interfaces/Table.md)<`number`[]\>\>
Creates a new Table and initialize it with new data.
#### Parameters
| Name | Type | Description |
| :------ | :------ | :------ |
| `name` | `string` | The name of the table. |
| `data` | `Record`<`string`, `unknown`\>[] | Non-empty Array of Records to be inserted into the Table |
| `mode?` | [`WriteMode`](../enums/WriteMode.md) | The write mode to use when creating the table. |
#### Returns
`Promise`<[`Table`](../interfaces/Table.md)<`number`[]\>\>
#### Implementation of
[Connection](../interfaces/Connection.md).[createTable](../interfaces/Connection.md#createtable)
#### Defined in
[remote/index.ts:75](https://github.com/lancedb/lancedb/blob/main/node/src/remote/index.ts#L75)
**createTable**(`name`, `data`, `mode`): `Promise`<[`Table`](../interfaces/Table.md)<`number`[]\>\>
#### Parameters
| Name | Type |
| :------ | :------ |
| `name` | `string` |
| `data` | `Record`<`string`, `unknown`\>[] |
| `mode` | [`WriteMode`](../enums/WriteMode.md) |
| `embeddings` | [`EmbeddingFunction`](../interfaces/EmbeddingFunction.md)<`T`\> | An embedding function to use on this Table |
#### Returns
`Promise`<[`Table`](../interfaces/Table.md)<`number`[]\>\>
#### Implementation of
Connection.createTable
#### Defined in
[remote/index.ts:231](https://github.com/lancedb/lancedb/blob/b1eeb90/node/src/index.ts#L231)
___
### dropTable
**dropTable**(`name`): `Promise`<`void`\>
Drop an existing table.
#### Parameters
| Name | Type | Description |
| :------ | :------ | :------ |
| `name` | `string` | The name of the table to drop. |
#### Returns
`Promise`<`void`\>
#### Implementation of
[Connection](../interfaces/Connection.md).[dropTable](../interfaces/Connection.md#droptable)
#### Defined in
[remote/index.ts:131](https://github.com/lancedb/lancedb/blob/main/node/src/remote/index.ts#L131)
___
### openTable
**openTable**(`name`): `Promise`<[`Table`](../interfaces/Table.md)<`number`[]\>\>
Open a table in the database.
#### Parameters
| Name | Type | Description |
| :------ | :------ | :------ |
| `name` | `string` | The name of the table. |
#### Returns
`Promise`<[`Table`](../interfaces/Table.md)<`number`[]\>\>
#### Implementation of
[Connection](../interfaces/Connection.md).[openTable](../interfaces/Connection.md#opentable)
#### Defined in
[remote/index.ts:65](https://github.com/lancedb/lancedb/blob/main/node/src/remote/index.ts#L65)
**openTable**<`T`\>(`name`, `embeddings`): `Promise`<[`Table`](../interfaces/Table.md)<`T`\>\>
Open a table in the database.
#### Type parameters
| Name |
| :------ |
| `T` |
#### Parameters
| Name | Type | Description |
| :------ | :------ | :------ |
| `name` | `string` | The name of the table. |
| `embeddings` | [`EmbeddingFunction`](../interfaces/EmbeddingFunction.md)<`T`\> | An embedding function to use on this Table |
#### Returns
`Promise`<[`Table`](../interfaces/Table.md)<`T`\>\>
#### Implementation of
Connection.openTable
#### Defined in
[remote/index.ts:66](https://github.com/lancedb/lancedb/blob/main/node/src/remote/index.ts#L66)
**openTable**<`T`\>(`name`, `embeddings?`): `Promise`<[`Table`](../interfaces/Table.md)<`T`\>\>
#### Type parameters
| Name |
| :------ |
| `T` |
#### Parameters
| Name | Type |
| :------ | :------ |
| `name` | `string` |
| `embeddings?` | [`EmbeddingFunction`](../interfaces/EmbeddingFunction.md)<`T`\> |
#### Returns
`Promise`<[`Table`](../interfaces/Table.md)<`T`\>\>
#### Implementation of
Connection.openTable
#### Defined in
[remote/index.ts:67](https://github.com/lancedb/lancedb/blob/main/node/src/remote/index.ts#L67)
___
### tableNames
**tableNames**(): `Promise`<`string`[]\>
Get the names of all tables in the database, with pagination.
#### Parameters
| Name | Type |
| :------ | :------ |
| `pageToken` | `string` |
| `limit` | `int` |
#### Returns
`Promise`<`string`[]\>
#### Implementation of
[Connection](../interfaces/Connection.md).[tableNames](../interfaces/Connection.md#tablenames)
#### Defined in
[remote/index.ts:60](https://github.com/lancedb/lancedb/blob/main/node/src/remote/index.ts#L60)

View File

@@ -0,0 +1,76 @@
[vectordb](../README.md) / [Exports](../saas-modules.md) / RemoteQuery
# Class: Query<T\>
A builder for nearest neighbor queries for LanceDB.
## Type parameters
| Name | Type |
| :------ | :------ |
| `T` | `number`[] |
## Table of contents
### Constructors
- [constructor](RemoteQuery.md#constructor)
### Properties
- [\_embeddings](RemoteQuery.md#_embeddings)
- [\_query](RemoteQuery.md#_query)
- [\_name](RemoteQuery.md#_name)
- [\_client](RemoteQuery.md#_client)
### Methods
- [execute](RemoteQuery.md#execute)
## Constructors
### constructor
**new Query**<`T`\>(`name`, `client`, `query`, `embeddings?`)
#### Type parameters
| Name | Type |
| :------ | :------ |
| `T` | `number`[] |
#### Parameters
| Name | Type |
| :------ | :------ |
| `name` | `string` |
| `client` | `HttpLancedbClient` |
| `query` | `T` |
| `embeddings?` | [`EmbeddingFunction`](../interfaces/EmbeddingFunction.md)<`T`\> |
#### Defined in
[remote/index.ts:137](https://github.com/lancedb/lancedb/blob/main/node/src/remote/index.ts#L137)
## Methods
### execute
**execute**<`T`\>(): `Promise`<`T`[]\>
Execute the query and return the results as an Array of Objects
#### Type parameters
| Name | Type |
| :------ | :------ |
| `T` | `Record`<`string`, `unknown`\> |
#### Returns
`Promise`<`T`[]\>
#### Defined in
[remote/index.ts:143](https://github.com/lancedb/lancedb/blob/main/node/src/remote/index.ts#L143)

View File

@@ -0,0 +1,355 @@
[vectordb](../README.md) / [Exports](../saas-modules.md) / RemoteTable
# Class: RemoteTable<T\>
A LanceDB Table is the collection of Records. Each Record has one or more vector fields.
## Type parameters
| Name | Type |
| :------ | :------ |
| `T` | `number`[] |
## Implements
- [`Table`](../interfaces/Table.md)<`T`\>
## Table of contents
### Constructors
- [constructor](RemoteTable.md#constructor)
### Properties
- [\_name](RemoteTable.md#_name)
- [\_client](RemoteTable.md#_client)
- [\_embeddings](RemoteTable.md#_embeddings)
### Accessors
- [name](RemoteTable.md#name)
### Methods
- [add](RemoteTable.md#add)
- [countRows](RemoteTable.md#countrows)
- [createIndex](RemoteTable.md#createindex)
- [delete](RemoteTable.md#delete)
- [listIndices](classes/RemoteTable.md#listindices)
- [indexStats](classes/RemoteTable.md#liststats)
- [overwrite](RemoteTable.md#overwrite)
- [search](RemoteTable.md#search)
- [schema](classes/RemoteTable.md#schema)
- [update](RemoteTable.md#update)
## Constructors
### constructor
**new RemoteTable**<`T`\>(`client`, `name`)
#### Type parameters
| Name | Type |
| :------ | :------ |
| `T` | `number`[] |
#### Parameters
| Name | Type |
| :------ | :------ |
| `client` | `HttpLancedbClient` |
| `name` | `string` |
#### Defined in
[remote/index.ts:186](https://github.com/lancedb/lancedb/blob/main/node/src/remote/index.ts#L186)
**new RemoteTable**<`T`\>(`client`, `name`, `embeddings`)
#### Type parameters
| Name | Type |
| :------ | :------ |
| `T` | `number`[] |
#### Parameters
| Name | Type | Description |
| :------ | :------ | :------ |
| `client` | `HttpLancedbClient` | |
| `name` | `string` | |
| `embeddings` | [`EmbeddingFunction`](../interfaces/EmbeddingFunction.md)<`T`\> | An embedding function to use when interacting with this table |
#### Defined in
[remote/index.ts:187](https://github.com/lancedb/lancedb/blob/main/node/src/remote/index.ts#L187)
## Accessors
### name
`get` **name**(): `string`
#### Returns
`string`
#### Implementation of
[Table](../interfaces/Table.md).[name](../interfaces/Table.md#name)
#### Defined in
[remote/index.ts:194](https://github.com/lancedb/lancedb/blob/main/node/src/remote/index.ts#L194)
## Methods
### add
**add**(`data`): `Promise`<`number`\>
Insert records into this Table.
#### Parameters
| Name | Type | Description |
| :------ | :------ | :------ |
| `data` | `Record`<`string`, `unknown`\>[] | Records to be inserted into the Table |
#### Returns
`Promise`<`number`\>
The number of rows added to the table
#### Implementation of
[Table](../interfaces/Table.md).[add](../interfaces/Table.md#add)
#### Defined in
[remote/index.ts:293](https://github.com/lancedb/lancedb/blob/main/node/src/remote/index.ts#L293)
___
### countRows
**countRows**(): `Promise`<`number`\>
Returns the number of rows in this table.
#### Returns
`Promise`<`number`\>
#### Implementation of
[Table](../interfaces/Table.md).[countRows](../interfaces/Table.md#countrows)
#### Defined in
[remote/index.ts:290](https://github.com/lancedb/lancedb/blob/main/node/src/remote/index.ts#L290)
___
### createIndex
**createIndex**(`metric_type`, `column`): `Promise`<`any`\>
Create an ANN index on this Table vector index.
#### Parameters
| Name | Type | Description |
| :------ | :------ | :------ |
| `metric_type` | `string` | distance metric type, L2 or cosine or dot |
| `column` | `string` | the name of the column to be indexed |
#### Returns
`Promise`<`any`\>
#### Implementation of
[Table](../interfaces/Table.md).[createIndex](../interfaces/Table.md#createindex)
#### Defined in
[remote/index.ts:249](https://github.com/lancedb/lancedb/blob/main/node/src/remote/index.ts#L249)
___
### delete
**delete**(`filter`): `Promise`<`void`\>
Delete rows from this table.
#### Parameters
| Name | Type | Description |
| :------ | :------ | :------ |
| `filter` | `string` | A filter in the same format used by a sql WHERE clause. |
#### Returns
`Promise`<`void`\>
#### Implementation of
[Table](../interfaces/Table.md).[delete](../interfaces/Table.md#delete)
#### Defined in
[remote/index.ts:295](https://github.com/lancedb/lancedb/blob/main/node/src/remote/index.ts#L295)
___
### overwrite
**overwrite**(`data`): `Promise`<`number`\>
Insert records into this Table, replacing its contents.
#### Parameters
| Name | Type | Description |
| :------ | :------ | :------ |
| `data` | `Record`<`string`, `unknown`\>[] | Records to be inserted into the Table |
#### Returns
`Promise`<`number`\>
The number of rows added to the table
#### Implementation of
[Table](../interfaces/Table.md).[overwrite](../interfaces/Table.md#overwrite)
#### Defined in
[remote/index.ts:231](https://github.com/lancedb/lancedb/blob/main/node/src/remote/index.ts#L231)
___
### search
**search**(`query`): [`Query`](Query.md)<`T`\>
Creates a search query to find the nearest neighbors of the given search term
#### Parameters
| Name | Type | Description |
| :------ | :------ | :------ |
| `query` | `T` | The query search term |
#### Returns
[`Query`](Query.md)<`T`\>
#### Implementation of
[Table](../interfaces/Table.md).[search](../interfaces/Table.md#search)
#### Defined in
[remote/index.ts:209](https://github.com/lancedb/lancedb/blob/main/node/src/remote/index.ts#L209)
___
### update
**update**(`args`): `Promise`<`void`\>
Update zero to all rows depending on how many rows match the where clause.
#### Parameters
| Name | Type | Description |
| :------ | :------ | :------ |
| `args` | `UpdateArgs` or `UpdateSqlArgs` | The query search arguments |
#### Returns
`Promise`<`any`\>
#### Implementation of
[Table](../interfaces/Table.md).[search](../interfaces/Table.md#update)
#### Defined in
[remote/index.ts:299](https://github.com/lancedb/lancedb/blob/main/node/src/remote/index.ts#L299)
___
### schema
**schema**(): `Promise`<`void`\>
Get the schema of the table
#### Returns
`Promise`<`any`\>
#### Implementation of
[Table](../interfaces/Table.md).[search](../interfaces/Table.md#schema)
#### Defined in
[remote/index.ts:198](https://github.com/lancedb/lancedb/blob/main/node/src/remote/index.ts#L198)
___
### listIndices
**listIndices**(): `Promise`<`void`\>
List the indices of the table
#### Returns
`Promise`<`any`\>
#### Implementation of
[Table](../interfaces/Table.md).[search](../interfaces/Table.md#listIndices)
#### Defined in
[remote/index.ts:319](https://github.com/lancedb/lancedb/blob/main/node/src/remote/index.ts#L319)
___
### indexStats
**indexStats**(`indexUuid`): `Promise`<`void`\>
Get the indexed/unindexed of rows from the table
#### Parameters
| Name | Type | Description |
| :------ | :------ | :------ |
| `indexUuid` | `string` | the uuid of the index |
#### Returns
`Promise`<`numIndexedRows`\>
`Promise`<`numUnindexedRows`\>
#### Implementation of
[Table](../interfaces/Table.md).[search](../interfaces/Table.md#indexStats)
#### Defined in
[remote/index.ts:328](https://github.com/lancedb/lancedb/blob/main/node/src/remote/index.ts#L328)

View File

@@ -0,0 +1,93 @@
# Table of contents
## Installation
```bash
npm install vectordb
```
This will download the appropriate native library for your platform. We currently
support x86_64 Linux, aarch64 Linux, Intel MacOS, and ARM (M1/M2) MacOS. We do not
yet support Windows or musl-based Linux (such as Alpine Linux).
## Classes
- [RemoteConnection](classes/RemoteConnection.md)
- [RemoteTable](classes/RemoteTable.md)
- [RemoteQuery](classes/RemoteQuery.md)
## Methods
- [add](classes/RemoteTable.md#add)
- [countRows](classes/RemoteTable.md#countrows)
- [createIndex](classes/RemoteTable.md#createindex)
- [createTable](classes/RemoteConnection.md#createtable)
- [delete](classes/RemoteTable.md#delete)
- [dropTable](classes/RemoteConnection.md#droptable)
- [listIndices](classes/RemoteTable.md#listindices)
- [indexStats](classes/RemoteTable.md#liststats)
- [openTable](classes/RemoteConnection.md#opentable)
- [overwrite](classes/RemoteTable.md#overwrite)
- [schema](classes/RemoteTable.md#schema)
- [search](classes/RemoteTable.md#search)
- [tableNames](classes/RemoteConnection.md#tablenames)
- [update](classes/RemoteTable.md#update)
## Example code
```javascript
const lancedb = require('vectordb');
const { Schema, Field, Int32, Float32, Utf8, FixedSizeList } = require ("apache-arrow/Arrow.node")
// connect to a remote DB
const devApiKey = process.env.LANCEDB_DEV_API_KEY
const dbURI = process.env.LANCEDB_URI
console.log(devApiKey)
const db = await lancedb.connect({
uri: dbURI, // replace dbURI with your project, e.g. "db://your-project-name"
apiKey: devApiKey, // replace dbURI with your api key
region: "us-east-1-dev"
});
// create a new table
const tableName = "my_table_000"
const data = [
{ id: 1, vector: [0.1, 1.0], item: "foo", price: 10.0 },
{ id: 2, vector: [3.9, 0.5], item: "bar", price: 20.0 }
]
const schema = new Schema(
[
new Field('id', new Int32()),
new Field('vector', new FixedSizeList(2, new Field('float32', new Float32()))),
new Field('item', new Utf8()),
new Field('price', new Float32())
]
)
const table = await db.createTable({
name: tableName,
schema,
}, data)
// list the table
const tableNames_1 = await db.tableNames('')
// add some data and search should be okay
const newData = [
{ id: 3, vector: [10.3, 1.9], item: "test1", price: 30.0 },
{ id: 4, vector: [6.2, 9.2], item: "test2", price: 40.0 }
]
await table.add(newData)
// create the index for the table
await table.createIndex({
metric_type: "L2",
column: "vector"
})
let result = await table.search([2.8, 4.3]).select(["vector", "price"]).limit(1).execute()
// update the data
await table.update({
where: "id == 1",
values: { item: "foo1" }
})
//drop the table
await db.dropTable(tableName)
```

View File

@@ -0,0 +1,18 @@
# LanceDB Python API Reference
## Installation
```shell
pip install lancedb
```
## Connection
::: lancedb.connect
::: lancedb.remote.db.RemoteDBConnection
## Table
::: lancedb.remote.table.RemoteTable

18
node/package-lock.json generated
View File

@@ -316,6 +316,18 @@
"@jridgewell/sourcemap-codec": "^1.4.10" "@jridgewell/sourcemap-codec": "^1.4.10"
} }
}, },
"node_modules/@lancedb/vectordb-darwin-arm64": {
"version": "0.3.9",
"resolved": "https://registry.npmjs.org/@lancedb/vectordb-darwin-arm64/-/vectordb-darwin-arm64-0.3.9.tgz",
"integrity": "sha512-irtAdfSRQDcfnMnB8T7D0atLFfu1MMZZ1JaxMKu24DDZ8e4IMYKUplxwvWni3241yA9yDE/pliRZCNQbQCEfrg==",
"cpu": [
"arm64"
],
"optional": true,
"os": [
"darwin"
]
},
"node_modules/@lancedb/vectordb-darwin-x64": { "node_modules/@lancedb/vectordb-darwin-x64": {
"version": "0.3.9", "version": "0.3.9",
"resolved": "https://registry.npmjs.org/@lancedb/vectordb-darwin-x64/-/vectordb-darwin-x64-0.3.9.tgz", "resolved": "https://registry.npmjs.org/@lancedb/vectordb-darwin-x64/-/vectordb-darwin-x64-0.3.9.tgz",
@@ -4856,6 +4868,12 @@
"@jridgewell/sourcemap-codec": "^1.4.10" "@jridgewell/sourcemap-codec": "^1.4.10"
} }
}, },
"@lancedb/vectordb-darwin-arm64": {
"version": "0.3.9",
"resolved": "https://registry.npmjs.org/@lancedb/vectordb-darwin-arm64/-/vectordb-darwin-arm64-0.3.9.tgz",
"integrity": "sha512-irtAdfSRQDcfnMnB8T7D0atLFfu1MMZZ1JaxMKu24DDZ8e4IMYKUplxwvWni3241yA9yDE/pliRZCNQbQCEfrg==",
"optional": true
},
"@lancedb/vectordb-darwin-x64": { "@lancedb/vectordb-darwin-x64": {
"version": "0.3.9", "version": "0.3.9",
"resolved": "https://registry.npmjs.org/@lancedb/vectordb-darwin-x64/-/vectordb-darwin-x64-0.3.9.tgz", "resolved": "https://registry.npmjs.org/@lancedb/vectordb-darwin-x64/-/vectordb-darwin-x64-0.3.9.tgz",

View File

@@ -27,7 +27,7 @@ def connect(
uri: URI, uri: URI,
*, *,
api_key: Optional[str] = None, api_key: Optional[str] = None,
region: str = "us-west-2", region: str = "us-east-1",
host_override: Optional[str] = None, host_override: Optional[str] = None,
) -> DBConnection: ) -> DBConnection:
"""Connect to a LanceDB database. """Connect to a LanceDB database.
@@ -39,7 +39,7 @@ def connect(
api_key: str, optional api_key: str, optional
If presented, connect to LanceDB cloud. If presented, connect to LanceDB cloud.
Otherwise, connect to a database on file system or cloud storage. Otherwise, connect to a database on file system or cloud storage.
region: str, default "us-west-2" region: str, default "us-east-1"
The region to use for LanceDB Cloud. The region to use for LanceDB Cloud.
host_override: str, optional host_override: str, optional
The override url for LanceDB Cloud. The override url for LanceDB Cloud.

View File

@@ -56,16 +56,20 @@ class RemoteDBConnection(DBConnection):
self._loop = asyncio.get_event_loop() self._loop = asyncio.get_event_loop()
def __repr__(self) -> str: def __repr__(self) -> str:
return f"RemoveConnect(name={self.db_name})" return f"RemoteConnect(name={self.db_name})"
@override @override
def table_names(self, page_token: Optional[str] = None, limit=10) -> Iterable[str]: def table_names(
self, page_token: Optional[str] = None, limit: int = 10
) -> Iterable[str]:
"""List the names of all tables in the database. """List the names of all tables in the database.
Parameters Parameters
---------- ----------
page_token: str page_token: str
The last token to start the new page. The last token to start the new page.
limit: int, default 10
The maximum number of tables to return for each page.
Returns Returns
------- -------
@@ -120,6 +124,97 @@ class RemoteDBConnection(DBConnection):
fill_value: float = 0.0, fill_value: float = 0.0,
embedding_functions: Optional[List[EmbeddingFunctionConfig]] = None, embedding_functions: Optional[List[EmbeddingFunctionConfig]] = None,
) -> Table: ) -> Table:
"""Create a [Table][lancedb.table.Table] in the database.
Parameters
----------
name: str
The name of the table.
data: The data to initialize the table, *optional*
User must provide at least one of `data` or `schema`.
Acceptable types are:
- dict or list-of-dict
- pandas.DataFrame
- pyarrow.Table or pyarrow.RecordBatch
schema: The schema of the table, *optional*
Acceptable types are:
- pyarrow.Schema
- [LanceModel][lancedb.pydantic.LanceModel]
on_bad_vectors: str, default "error"
What to do if any of the vectors are not the same size or contains NaNs.
One of "error", "drop", "fill".
fill_value: float
The value to use when filling vectors. Only used if on_bad_vectors="fill".
Returns
-------
LanceTable
A reference to the newly created table.
!!! note
The vector index won't be created by default.
To create the index, call the `create_index` method on the table.
Examples
--------
Can create with list of tuples or dictionaries:
>>> import lancedb
>>> db = lancedb.connect("db://...", api_key="...", region="...") # doctest: +SKIP
>>> data = [{"vector": [1.1, 1.2], "lat": 45.5, "long": -122.7},
... {"vector": [0.2, 1.8], "lat": 40.1, "long": -74.1}]
>>> db.create_table("my_table", data) # doctest: +SKIP
LanceTable(my_table)
You can also pass a pandas DataFrame:
>>> import pandas as pd
>>> data = pd.DataFrame({
... "vector": [[1.1, 1.2], [0.2, 1.8]],
... "lat": [45.5, 40.1],
... "long": [-122.7, -74.1]
... })
>>> db.create_table("table2", data) # doctest: +SKIP
LanceTable(table2)
>>> custom_schema = pa.schema([
... pa.field("vector", pa.list_(pa.float32(), 2)),
... pa.field("lat", pa.float32()),
... pa.field("long", pa.float32())
... ])
>>> db.create_table("table3", data, schema = custom_schema) # doctest: +SKIP
LanceTable(table3)
It is also possible to create an table from `[Iterable[pa.RecordBatch]]`:
>>> import pyarrow as pa
>>> def make_batches():
... for i in range(5):
... yield pa.RecordBatch.from_arrays(
... [
... pa.array([[3.1, 4.1], [5.9, 26.5]],
... pa.list_(pa.float32(), 2)),
... pa.array(["foo", "bar"]),
... pa.array([10.0, 20.0]),
... ],
... ["vector", "item", "price"],
... )
>>> schema=pa.schema([
... pa.field("vector", pa.list_(pa.float32(), 2)),
... pa.field("item", pa.utf8()),
... pa.field("price", pa.float32()),
... ])
>>> db.create_table("table4", make_batches(), schema=schema) # doctest: +SKIP
LanceTable(table4)
"""
if data is None and schema is None: if data is None and schema is None:
raise ValueError("Either data or schema must be provided.") raise ValueError("Either data or schema must be provided.")
if embedding_functions is not None: if embedding_functions is not None:

View File

@@ -37,7 +37,10 @@ class RemoteTable(Table):
@cached_property @cached_property
def schema(self) -> pa.Schema: def schema(self) -> pa.Schema:
"""Return the schema of the table.""" """The [Arrow Schema](https://arrow.apache.org/docs/python/api/datatypes.html#)
of this Table
"""
resp = self._conn._loop.run_until_complete( resp = self._conn._loop.run_until_complete(
self._conn._client.post(f"/v1/table/{self._name}/describe/") self._conn._client.post(f"/v1/table/{self._name}/describe/")
) )
@@ -53,24 +56,17 @@ class RemoteTable(Table):
return resp["version"] return resp["version"]
def to_arrow(self) -> pa.Table: def to_arrow(self) -> pa.Table:
"""Return the table as an Arrow table.""" """to_arrow() is not supported on the LanceDB cloud"""
raise NotImplementedError("to_arrow() is not supported on the LanceDB cloud") raise NotImplementedError("to_arrow() is not supported on the LanceDB cloud")
def to_pandas(self): def to_pandas(self):
"""Return the table as a Pandas DataFrame. """to_pandas() is not supported on the LanceDB cloud"""
Intercept `to_arrow()` for better error message.
"""
return NotImplementedError("to_pandas() is not supported on the LanceDB cloud") return NotImplementedError("to_pandas() is not supported on the LanceDB cloud")
def create_index( def create_index(
self, self,
metric="L2", metric="L2",
num_partitions=256,
num_sub_vectors=96,
vector_column_name: str = VECTOR_COLUMN_NAME, vector_column_name: str = VECTOR_COLUMN_NAME,
replace: bool = True,
accelerator: Optional[str] = None,
index_cache_size: Optional[int] = None, index_cache_size: Optional[int] = None,
): ):
"""Create an index on the table. """Create an index on the table.
@@ -81,39 +77,28 @@ class RemoteTable(Table):
---------- ----------
metric : str metric : str
The metric to use for the index. Default is "L2". The metric to use for the index. Default is "L2".
num_partitions : int
The number of partitions to use for the index. Default is 256.
num_sub_vectors : int
The number of sub-vectors to use for the index. Default is 96.
vector_column_name : str vector_column_name : str
The name of the vector column. Default is "vector". The name of the vector column. Default is "vector".
replace : bool
Whether to replace the existing index. Default is True.
accelerator : str, optional
If set, use the given accelerator to create the index.
Default is None. Currently not supported.
index_cache_size : int, optional
The size of the index cache in number of entries. Default value is 256.
Examples Examples
-------- --------
import lancedb >>> import lancedb
import uuid >>> import uuid
from lancedb.schema import vector >>> from lancedb.schema import vector
conn = lancedb.connect("db://...", api_key="...", region="...") >>> db = lancedb.connect("db://...", api_key="...", region="...") # doctest: +SKIP
table_name = uuid.uuid4().hex >>> table_name = uuid.uuid4().hex
schema = pa.schema( >>> schema = pa.schema(
[ ... [
pa.field("id", pa.uint32(), False), ... pa.field("id", pa.uint32(), False),
pa.field("vector", vector(128), False), ... pa.field("vector", vector(128), False),
pa.field("s", pa.string(), False), ... pa.field("s", pa.string(), False),
] ... ]
) ... )
table = conn.create_table( >>> table = db.create_table( # doctest: +SKIP
table_name, ... table_name, # doctest: +SKIP
schema=schema, ... schema=schema, # doctest: +SKIP
) ... )
table.create_index() >>> table.create_index("L2", "vector") # doctest: +SKIP
""" """
index_type = "vector" index_type = "vector"
@@ -135,6 +120,28 @@ class RemoteTable(Table):
on_bad_vectors: str = "error", on_bad_vectors: str = "error",
fill_value: float = 0.0, fill_value: float = 0.0,
) -> int: ) -> int:
"""Add more data to the [Table](Table). It has the same API signature as the OSS version.
Parameters
----------
data: DATA
The data to insert into the table. Acceptable types are:
- dict or list-of-dict
- pandas.DataFrame
- pyarrow.Table or pyarrow.RecordBatch
mode: str
The mode to use when writing the data. Valid values are
"append" and "overwrite".
on_bad_vectors: str, default "error"
What to do if any of the vectors are not the same size or contains NaNs.
One of "error", "drop", "fill".
fill_value: float, default 0.
The value to use when filling vectors. Only used if on_bad_vectors="fill".
"""
data = _sanitize_data( data = _sanitize_data(
data, data,
self.schema, self.schema,
@@ -158,6 +165,58 @@ class RemoteTable(Table):
def search( def search(
self, query: Union[VEC, str], vector_column_name: str = VECTOR_COLUMN_NAME self, query: Union[VEC, str], vector_column_name: str = VECTOR_COLUMN_NAME
) -> LanceVectorQueryBuilder: ) -> LanceVectorQueryBuilder:
"""Create a search query to find the nearest neighbors
of the given query vector. We currently support [vector search][search]
All query options are defined in [Query][lancedb.query.Query].
Examples
--------
>>> import lancedb
>>> db = lancedb.connect("db://...", api_key="...", region="...") # doctest: +SKIP
>>> data = [
... {"original_width": 100, "caption": "bar", "vector": [0.1, 2.3, 4.5]},
... {"original_width": 2000, "caption": "foo", "vector": [0.5, 3.4, 1.3]},
... {"original_width": 3000, "caption": "test", "vector": [0.3, 6.2, 2.6]}
... ]
>>> table = db.create_table("my_table", data) # doctest: +SKIP
>>> query = [0.4, 1.4, 2.4]
>>> (table.search(query, vector_column_name="vector") # doctest: +SKIP
... .where("original_width > 1000", prefilter=True) # doctest: +SKIP
... .select(["caption", "original_width"]) # doctest: +SKIP
... .limit(2) # doctest: +SKIP
... .to_pandas()) # doctest: +SKIP
caption original_width vector _distance # doctest: +SKIP
0 foo 2000 [0.5, 3.4, 1.3] 5.220000 # doctest: +SKIP
1 test 3000 [0.3, 6.2, 2.6] 23.089996 # doctest: +SKIP
Parameters
----------
query: list/np.ndarray/str/PIL.Image.Image, default None
The targetted vector to search for.
- *default None*.
Acceptable types are: list, np.ndarray, PIL.Image.Image
- If None then the select/where/limit clauses are applied to filter
the table
vector_column_name: str
The name of the vector column to search.
*default "vector"*
Returns
-------
LanceQueryBuilder
A query builder object representing the query.
Once executed, the query returns
- selected columns
- the vector
- and also the "_distance" column which is the distance between the query
vector and the returned vector.
"""
return LanceVectorQueryBuilder(self, query, vector_column_name) return LanceVectorQueryBuilder(self, query, vector_column_name)
def _execute_query(self, query: Query) -> pa.Table: def _execute_query(self, query: Query) -> pa.Table:
@@ -165,7 +224,51 @@ class RemoteTable(Table):
return self._conn._loop.run_until_complete(result).to_arrow() return self._conn._loop.run_until_complete(result).to_arrow()
def delete(self, predicate: str): def delete(self, predicate: str):
"""Delete rows from the table.""" """Delete rows from the table.
This can be used to delete a single row, many rows, all rows, or
sometimes no rows (if your predicate matches nothing).
Parameters
----------
predicate: str
The SQL where clause to use when deleting rows.
- For example, 'x = 2' or 'x IN (1, 2, 3)'.
The filter must not be empty, or it will error.
Examples
--------
>>> import lancedb
>>> data = [
... {"x": 1, "vector": [1, 2]},
... {"x": 2, "vector": [3, 4]},
... {"x": 3, "vector": [5, 6]}
... ]
>>> db = lancedb.connect("db://...", api_key="...", region="...") # doctest: +SKIP
>>> table = db.create_table("my_table", data) # doctest: +SKIP
>>> table.search([10,10]).to_pandas() # doctest: +SKIP
x vector _distance # doctest: +SKIP
0 3 [5.0, 6.0] 41.0 # doctest: +SKIP
1 2 [3.0, 4.0] 85.0 # doctest: +SKIP
2 1 [1.0, 2.0] 145.0 # doctest: +SKIP
>>> table.delete("x = 2") # doctest: +SKIP
>>> table.search([10,10]).to_pandas() # doctest: +SKIP
x vector _distance # doctest: +SKIP
0 3 [5.0, 6.0] 41.0 # doctest: +SKIP
1 1 [1.0, 2.0] 145.0 # doctest: +SKIP
If you have a list of values to delete, you can combine them into a
stringified list and use the `IN` operator:
>>> to_remove = [1, 3] # doctest: +SKIP
>>> to_remove = ", ".join([str(v) for v in to_remove]) # doctest: +SKIP
>>> table.delete(f"x IN ({to_remove})") # doctest: +SKIP
>>> table.search([10,10]).to_pandas() # doctest: +SKIP
x vector _distance # doctest: +SKIP
0 2 [3.0, 4.0] 85.0 # doctest: +SKIP
"""
payload = {"predicate": predicate} payload = {"predicate": predicate}
self._conn._loop.run_until_complete( self._conn._loop.run_until_complete(
self._conn._client.post(f"/v1/table/{self._name}/delete/", data=payload) self._conn._client.post(f"/v1/table/{self._name}/delete/", data=payload)

View File

@@ -785,7 +785,7 @@ class LanceTable(Table):
and also the "_distance" column which is the distance between the query and also the "_distance" column which is the distance between the query
vector and the returned vector. vector and the returned vector.
""" """
register_event("search") register_event("search_table")
return LanceQueryBuilder.create( return LanceQueryBuilder.create(
self, query, query_type, vector_column_name=vector_column_name self, query, query_type, vector_column_name=vector_column_name
) )
@@ -906,6 +906,8 @@ class LanceTable(Table):
f"Table {name} does not exist." f"Table {name} does not exist."
f"Please first call db.create_table({name}, data)" f"Please first call db.create_table({name}, data)"
) )
register_event("open_table")
return tbl return tbl
def delete(self, where: str): def delete(self, where: str):

View File

@@ -64,8 +64,10 @@ class _Events:
Initializes the Events object with default values for events, rate_limit, and metadata. Initializes the Events object with default values for events, rate_limit, and metadata.
""" """
self.events = [] # events list self.events = [] # events list
self.max_events = 25 # max events to store in memory self.throttled_event_names = ["search_table"]
self.rate_limit = 60.0 # rate limit (seconds) self.throttled_events = set()
self.max_events = 5 # max events to store in memory
self.rate_limit = 60.0 * 5 # rate limit (seconds)
self.time = 0.0 self.time = 0.0
if is_git_dir(): if is_git_dir():
@@ -112,10 +114,9 @@ class _Events:
return return
if ( if (
len(self.events) < self.max_events len(self.events) < self.max_events
): # Events list limited to 25 events (drop any events past this) ): # Events list limited to self.max_events (drop any events past this)
params.update(self.metadata) params.update(self.metadata)
self.events.append( event = {
{
"event": event_name, "event": event_name,
"properties": params, "properties": params,
"timestamp": datetime.datetime.now( "timestamp": datetime.datetime.now(
@@ -123,7 +124,11 @@ class _Events:
).isoformat(), ).isoformat(),
"distinct_id": CONFIG["uuid"], "distinct_id": CONFIG["uuid"],
} }
) if event_name not in self.throttled_event_names:
self.events.append(event)
elif event_name not in self.throttled_events:
self.throttled_events.add(event_name)
self.events.append(event)
# Check rate limit # Check rate limit
t = time.time() t = time.time()
@@ -135,7 +140,6 @@ class _Events:
"distinct_id": CONFIG["uuid"], # posthog needs this to accepts the event "distinct_id": CONFIG["uuid"], # posthog needs this to accepts the event
"batch": self.events, "batch": self.events,
} }
# POST equivalent to requests.post(self.url, json=data). # POST equivalent to requests.post(self.url, json=data).
# threaded request is used to avoid blocking, retries are disabled, and verbose is disabled # threaded request is used to avoid blocking, retries are disabled, and verbose is disabled
# to avoid any possible disruption in the console. # to avoid any possible disruption in the console.
@@ -150,6 +154,7 @@ class _Events:
# Flush & Reset # Flush & Reset
self.events = [] self.events = []
self.throttled_events = set()
self.time = t self.time = t