This includes a handful of minor edits I made while reading the docs. In addition to a few spelling fixes, * standardize on "rerank" over "re-rank" in prose * terminate sentences with periods or colons as appropriate * replace some usage of dashes with colons, such as in "Try it yourself - <link>" All changes are surface-level. No changes to semantics or structure. --------- Co-authored-by: Will Jones <willjones127@gmail.com>
3.6 KiB
Rust-backed Client Migration Guide
In an effort to ensure all clients have the same set of capabilities we have
migrated the Python and Node clients onto a common Rust base library. In Python,
both the synchronous and asynchronous clients are based on this implementation.
In Node, the new client is available as @lancedb/lancedb, which replaces
the existing vectordb package.
This guide describes the differences between the two Node APIs and will hopefully assist users that would like to migrate to the new API.
TypeScript/JavaScript
For JS/TS users, we offer a brand new SDK @lancedb/lancedb
We tried to keep the API as similar as possible to the previous version, but there are a few small changes. Here are the most important ones:
Creating Tables
CreateTableOptions.writeOptions.writeMode has been replaced with CreateTableOptions.mode
=== "vectordb (deprecated)"
```ts
db.createTable(tableName, data, { writeMode: lancedb.WriteMode.Overwrite });
```
=== "@lancedb/lancedb"
```ts
db.createTable(tableName, data, { mode: "overwrite" })
```
Changes to Table APIs
Previously Table.schema was a property. Now it is an async method.
Creating Indices
The Table.createIndex method is now used for creating both vector indices
and scalar indices. It currently requires a column name to be specified (the
column to index). Vector index defaults are now smarter and scale better with
the size of the data.
=== "vectordb (deprecated)"
```ts
await tbl.createIndex({
column: "vector", // default
type: "ivf_pq",
num_partitions: 2,
num_sub_vectors: 2,
});
```
=== "@lancedb/lancedb"
```ts
await table.createIndex("vector", {
config: lancedb.Index.ivfPq({
numPartitions: 2,
numSubVectors: 2,
}),
});
```
Embedding Functions
The embedding API has been completely reworked, and it now more closely resembles the Python API, including the new embedding registry:
=== "vectordb (deprecated)"
```ts
const embeddingFunction = new lancedb.OpenAIEmbeddingFunction('text', API_KEY)
const data = [
{ id: 1, text: 'Black T-Shirt', price: 10 },
{ id: 2, text: 'Leather Jacket', price: 50 }
]
const table = await db.createTable('vectors', data, embeddingFunction)
```
=== "@lancedb/lancedb"
```ts
import * as lancedb from "@lancedb/lancedb";
import * as arrow from "apache-arrow";
import { LanceSchema, getRegistry } from "@lancedb/lancedb/embedding";
const func = getRegistry().get("openai").create({apiKey: API_KEY});
const data = [
{ id: 1, text: 'Black T-Shirt', price: 10 },
{ id: 2, text: 'Leather Jacket', price: 50 }
]
const table = await db.createTable('vectors', data, {
embeddingFunction: {
sourceColumn: "text",
function: func,
}
})
```
You can also use a schema driven approach, which parallels the Pydantic integration in our Python SDK:
const func = getRegistry().get("openai").create({apiKey: API_KEY});
const data = [
{ id: 1, text: 'Black T-Shirt', price: 10 },
{ id: 2, text: 'Leather Jacket', price: 50 }
]
const schema = LanceSchema({
id: new arrow.Int32(),
text: func.sourceField(new arrow.Utf8()),
price: new arrow.Float64(),
vector: func.vectorField()
})
const table = await db.createTable('vectors', data, {schema})