docs: add a reference to @lancedb/lance in the docs (#1166)

We aren't yet ready to switch over the examples since almost all JS
examples rely on embeddings and we haven't yet ported those over.
However, this makes it possible for those that are interested to start
using `@lancedb/lancedb`.
Weston Pace
2024-03-29 06:55:03 -05:00
parent 1b0aaf9ec3
commit e21b56293c
34 changed files with 3561 additions and 7 deletions


@@ -0,0 +1,239 @@
[@lancedb/lancedb](../README.md) / [Exports](../modules.md) / Connection
# Class: Connection
A LanceDB Connection that allows you to open tables and create new ones.
A connection can be local (against a filesystem) or remote (against a server).
A Connection is intended to be a long lived object and may hold open
resources such as HTTP connection pools. This is generally fine and
a single connection should be shared if it is going to be used many
times. However, if you are finished with a connection, you may call
`close` to eagerly free these resources. Any call to a Connection
method after it has been closed will result in an error.
Closing a connection is optional. Connections will automatically
be closed when they are garbage collected.
Any created tables are independent and will continue to work even if
the underlying connection has been closed.
## Table of contents
### Constructors
- [constructor](Connection.md#constructor)
### Properties
- [inner](Connection.md#inner)
### Methods
- [close](Connection.md#close)
- [createEmptyTable](Connection.md#createemptytable)
- [createTable](Connection.md#createtable)
- [display](Connection.md#display)
- [dropTable](Connection.md#droptable)
- [isOpen](Connection.md#isopen)
- [openTable](Connection.md#opentable)
- [tableNames](Connection.md#tablenames)
## Constructors
### constructor
**new Connection**(`inner`): [`Connection`](Connection.md)
#### Parameters
| Name | Type |
| :------ | :------ |
| `inner` | `Connection` |
#### Returns
[`Connection`](Connection.md)
#### Defined in
[connection.ts:72](https://github.com/lancedb/lancedb/blob/9d178c7/nodejs/lancedb/connection.ts#L72)
## Properties
### inner
`Readonly` **inner**: `Connection`
#### Defined in
[connection.ts:70](https://github.com/lancedb/lancedb/blob/9d178c7/nodejs/lancedb/connection.ts#L70)
## Methods
### close
**close**(): `void`
Close the connection, releasing any underlying resources.
It is safe to call this method multiple times.
Any attempt to use the connection after it is closed will result in an error.
#### Returns
`void`
#### Defined in
[connection.ts:88](https://github.com/lancedb/lancedb/blob/9d178c7/nodejs/lancedb/connection.ts#L88)
___
### createEmptyTable
**createEmptyTable**(`name`, `schema`, `options?`): `Promise`\<[`Table`](Table.md)\>
Creates a new, empty Table.
#### Parameters
| Name | Type | Description |
| :------ | :------ | :------ |
| `name` | `string` | The name of the table. |
| `schema` | `Schema`\<`any`\> | The schema of the table |
| `options?` | `Partial`\<[`CreateTableOptions`](../interfaces/CreateTableOptions.md)\> | - |
#### Returns
`Promise`\<[`Table`](Table.md)\>
#### Defined in
[connection.ts:151](https://github.com/lancedb/lancedb/blob/9d178c7/nodejs/lancedb/connection.ts#L151)
___
### createTable
**createTable**(`name`, `data`, `options?`): `Promise`\<[`Table`](Table.md)\>
Creates a new Table and initializes it with new data.
#### Parameters
| Name | Type | Description |
| :------ | :------ | :------ |
| `name` | `string` | The name of the table. |
| `data` | `Table`\<`any`\> \| `Record`\<`string`, `unknown`\>[] | Non-empty Array of Records to be inserted into the table |
| `options?` | `Partial`\<[`CreateTableOptions`](../interfaces/CreateTableOptions.md)\> | - |
#### Returns
`Promise`\<[`Table`](Table.md)\>
#### Defined in
[connection.ts:123](https://github.com/lancedb/lancedb/blob/9d178c7/nodejs/lancedb/connection.ts#L123)
___
### display
**display**(): `string`
Return a brief description of the connection
#### Returns
`string`
#### Defined in
[connection.ts:93](https://github.com/lancedb/lancedb/blob/9d178c7/nodejs/lancedb/connection.ts#L93)
___
### dropTable
**dropTable**(`name`): `Promise`\<`void`\>
Drop an existing table.
#### Parameters
| Name | Type | Description |
| :------ | :------ | :------ |
| `name` | `string` | The name of the table to drop. |
#### Returns
`Promise`\<`void`\>
#### Defined in
[connection.ts:173](https://github.com/lancedb/lancedb/blob/9d178c7/nodejs/lancedb/connection.ts#L173)
___
### isOpen
**isOpen**(): `boolean`
Return true if the connection has not been closed
#### Returns
`boolean`
#### Defined in
[connection.ts:77](https://github.com/lancedb/lancedb/blob/9d178c7/nodejs/lancedb/connection.ts#L77)
___
### openTable
**openTable**(`name`): `Promise`\<[`Table`](Table.md)\>
Open a table in the database.
#### Parameters
| Name | Type | Description |
| :------ | :------ | :------ |
| `name` | `string` | The name of the table |
#### Returns
`Promise`\<[`Table`](Table.md)\>
#### Defined in
[connection.ts:112](https://github.com/lancedb/lancedb/blob/9d178c7/nodejs/lancedb/connection.ts#L112)
___
### tableNames
**tableNames**(`options?`): `Promise`\<`string`[]\>
List all the table names in this database.
Tables will be returned in lexicographical order.
#### Parameters
| Name | Type | Description |
| :------ | :------ | :------ |
| `options?` | `Partial`\<[`TableNamesOptions`](../interfaces/TableNamesOptions.md)\> | options to control the paging / start point |
#### Returns
`Promise`\<`string`[]\>
#### Defined in
[connection.ts:104](https://github.com/lancedb/lancedb/blob/9d178c7/nodejs/lancedb/connection.ts#L104)
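
The paging behaviour of `tableNames` can be pictured with a small stand-alone model. This is only a sketch: `pageOfNames` is a hypothetical helper, and the option names `startAfter` and `limit` are assumptions made for illustration; check `TableNamesOptions` for the actual fields.

```typescript
// Hypothetical model of paging through table names in lexicographical order.
// Nothing here calls LanceDB; it only illustrates "start point + page size".
interface PageOptions {
  startAfter?: string; // assumed name: return only names sorting after this one
  limit?: number;      // assumed name: maximum number of names in the page
}

function pageOfNames(allNames: string[], options: PageOptions = {}): string[] {
  const { startAfter, limit } = options;
  // The default sort compares strings lexicographically by UTF-16 code unit.
  const sorted = [...allNames].sort();
  const remaining =
    startAfter === undefined ? sorted : sorted.filter((name) => name > startAfter);
  return limit === undefined ? remaining : remaining.slice(0, limit);
}

const names = ["users", "events", "orders", "items"];
const firstPage = pageOfNames(names, { limit: 2 });
// Feed the last name of one page back in as the start point of the next.
const secondPage = pageOfNames(names, {
  startAfter: firstPage[firstPage.length - 1],
  limit: 2,
});
```

Because names come back in a stable order, the last name of one page is enough to resume the listing.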


@@ -0,0 +1,121 @@
[@lancedb/lancedb](../README.md) / [Exports](../modules.md) / Index
# Class: Index
## Table of contents
### Constructors
- [constructor](Index.md#constructor)
### Properties
- [inner](Index.md#inner)
### Methods
- [btree](Index.md#btree)
- [ivfPq](Index.md#ivfpq)
## Constructors
### constructor
**new Index**(`inner`): [`Index`](Index.md)
#### Parameters
| Name | Type |
| :------ | :------ |
| `inner` | `Index` |
#### Returns
[`Index`](Index.md)
#### Defined in
[indices.ts:118](https://github.com/lancedb/lancedb/blob/9d178c7/nodejs/lancedb/indices.ts#L118)
## Properties
### inner
`Private` `Readonly` **inner**: `Index`
#### Defined in
[indices.ts:117](https://github.com/lancedb/lancedb/blob/9d178c7/nodejs/lancedb/indices.ts#L117)
## Methods
### btree
**btree**(): [`Index`](Index.md)
Create a btree index
A btree index is an index on a scalar column. The index stores a copy of the column
in sorted order. A header entry is created for each block of rows (currently the
block size is fixed at 4096). These header entries are stored in a separate
cacheable structure (a btree). To search for data the header is used to determine
which blocks need to be read from disk.
For example, a btree index in a table with 1Bi rows requires sizeof(Scalar) * 256Ki
bytes of memory and will generally need to read sizeof(Scalar) * 4096 bytes to find
the correct row ids.
This index is good for scalar columns with mostly distinct values and does best when
the query is highly selective.
The btree index does not currently have any parameters though parameters such as the
block size may be added in the future.
#### Returns
[`Index`](Index.md)
#### Defined in
[indices.ts:175](https://github.com/lancedb/lancedb/blob/9d178c7/nodejs/lancedb/indices.ts#L175)
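
The sizing arithmetic in the btree description can be checked with a short calculation. This is a sketch under the stated assumptions only (one header entry per 4096-row block, each entry the size of one scalar); `btreeIndexCost` and its field names are hypothetical and not part of the library.

```typescript
// Back-of-the-envelope cost model for the btree index described above.
// `btreeIndexCost` is a hypothetical helper, not a LanceDB API.
interface BtreeIndexCost {
  headerEntries: number;      // one header entry per block of rows
  headerBytes: number;        // memory held by the cacheable header structure
  bytesReadPerLookup: number; // one block read to locate the matching row ids
}

function btreeIndexCost(
  numRows: number,
  scalarBytes: number,
  blockSize: number = 4096, // fixed block size quoted in the description
): BtreeIndexCost {
  const headerEntries = Math.ceil(numRows / blockSize);
  return {
    headerEntries,
    headerBytes: headerEntries * scalarBytes,
    bytesReadPerLookup: blockSize * scalarBytes,
  };
}

// 1Bi (2^30) rows of an 8-byte scalar, such as a 64-bit integer.
const cost = btreeIndexCost(2 ** 30, 8);
console.log(cost);
```

For this input the header has 256Ki entries (2 MiB of memory) and a lookup reads one 32 KiB block, which matches the `sizeof(Scalar) * 256Ki` and `sizeof(Scalar) * 4096` figures.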
___
### ivfPq
**ivfPq**(`options?`): [`Index`](Index.md)
Create an IvfPq index
This index stores a compressed (quantized) copy of every vector. These vectors
are grouped into partitions of similar vectors. Each partition keeps track of
a centroid which is the average value of all vectors in the group.
During a query the centroids are compared with the query vector to find the closest
partitions. The compressed vectors in these partitions are then searched to find
the closest vectors.
The compression scheme is called product quantization. Each vector is divided into
subvectors and then each subvector is quantized into a small number of bits. The
parameters `num_bits` and `num_subvectors` control this process, providing a tradeoff
between index size (and thus search speed) and index accuracy.
The partitioning process is called IVF and the `num_partitions` parameter controls how
many groups to create.
Note that training an IVF PQ index on a large dataset is a slow operation and
currently is also a memory intensive operation.
#### Parameters
| Name | Type |
| :------ | :------ |
| `options?` | `Partial`\<[`IvfPqOptions`](../interfaces/IvfPqOptions.md)\> |
#### Returns
[`Index`](Index.md)
#### Defined in
[indices.ts:144](https://github.com/lancedb/lancedb/blob/9d178c7/nodejs/lancedb/indices.ts#L144)
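
The partition-then-search flow described for the IvfPq index can be sketched as a toy model. It covers only the IVF half (vectors stay uncompressed, with no product quantization), and every name in it (`Partition`, `buildPartition`, `ivfSearch`, `nprobes`) is hypothetical rather than taken from the library.

```typescript
// Toy model of IVF partitioning: group vectors, keep a centroid per group,
// and scan only the groups whose centroids are closest to the query.
type Vec = number[];

interface Member {
  id: number;
  vector: Vec;
}

interface Partition {
  centroid: Vec;
  members: Member[];
}

function squaredDistance(a: Vec, b: Vec): number {
  let sum = 0;
  for (let i = 0; i < a.length; i++) {
    const d = a[i] - b[i];
    sum += d * d;
  }
  return sum;
}

// The centroid is the average value of all vectors in the group.
function buildPartition(members: Member[]): Partition {
  const dim = members[0].vector.length;
  const totals = new Array<number>(dim).fill(0);
  for (const m of members) {
    for (let i = 0; i < dim; i++) totals[i] += m.vector[i];
  }
  return { centroid: totals.map((x) => x / members.length), members };
}

function ivfSearch(partitions: Partition[], query: Vec, nprobes: number, k: number): number[] {
  // Step 1: compare the query with every centroid; keep the closest partitions.
  const probed = [...partitions]
    .sort((p, q) => squaredDistance(p.centroid, query) - squaredDistance(q.centroid, query))
    .slice(0, nprobes);
  // Step 2: search only the vectors stored in those partitions.
  const candidates: { id: number; distance: number }[] = [];
  for (const p of probed) {
    for (const m of p.members) {
      candidates.push({ id: m.id, distance: squaredDistance(m.vector, query) });
    }
  }
  return candidates
    .sort((a, b) => a.distance - b.distance)
    .slice(0, k)
    .map((c) => c.id);
}

const partitions = [
  buildPartition([{ id: 0, vector: [0, 0] }, { id: 1, vector: [1, 1] }]),
  buildPartition([{ id: 2, vector: [10, 10] }, { id: 3, vector: [11, 11] }]),
];
const nearest = ivfSearch(partitions, [9, 9], 1, 2);
```

Probing fewer partitions scans fewer vectors but can miss neighbours that sit in an unprobed partition, which is why the results of such a search are approximate.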


@@ -0,0 +1,75 @@
[@lancedb/lancedb](../README.md) / [Exports](../modules.md) / MakeArrowTableOptions
# Class: MakeArrowTableOptions
Options to control the makeArrowTable call.
## Table of contents
### Constructors
- [constructor](MakeArrowTableOptions.md#constructor)
### Properties
- [dictionaryEncodeStrings](MakeArrowTableOptions.md#dictionaryencodestrings)
- [schema](MakeArrowTableOptions.md#schema)
- [vectorColumns](MakeArrowTableOptions.md#vectorcolumns)
## Constructors
### constructor
**new MakeArrowTableOptions**(`values?`): [`MakeArrowTableOptions`](MakeArrowTableOptions.md)
#### Parameters
| Name | Type |
| :------ | :------ |
| `values?` | `Partial`\<[`MakeArrowTableOptions`](MakeArrowTableOptions.md)\> |
#### Returns
[`MakeArrowTableOptions`](MakeArrowTableOptions.md)
#### Defined in
[arrow.ts:100](https://github.com/lancedb/lancedb/blob/9d178c7/nodejs/lancedb/arrow.ts#L100)
## Properties
### dictionaryEncodeStrings
**dictionaryEncodeStrings**: `boolean` = `false`
If true then string columns will be encoded with dictionary encoding.
Set this to true if your string columns tend to repeat the same values
often. For more precise control use the `schema` property to specify the
data type for individual columns.
If `schema` is provided then this property is ignored.
#### Defined in
[arrow.ts:98](https://github.com/lancedb/lancedb/blob/9d178c7/nodejs/lancedb/arrow.ts#L98)
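
To see why dictionary encoding suits columns that repeat the same values, here is a minimal stand-alone illustration. `dictionaryEncode` is a hypothetical helper written for this note; it is not how Arrow or LanceDB implement the encoding.

```typescript
// Minimal illustration of dictionary encoding for a string column.
interface DictionaryEncoded {
  dictionary: string[]; // each distinct value, stored once
  indices: number[];    // one small integer per row, pointing into `dictionary`
}

function dictionaryEncode(values: string[]): DictionaryEncoded {
  const positions = new Map<string, number>();
  const dictionary: string[] = [];
  const indices = values.map((value) => {
    let index = positions.get(value);
    if (index === undefined) {
      // First time this value is seen: append it to the dictionary.
      index = dictionary.length;
      positions.set(value, index);
      dictionary.push(value);
    }
    return index;
  });
  return { dictionary, indices };
}

const encoded = dictionaryEncode(["red", "green", "red", "red", "green"]);
// encoded.dictionary is ["red", "green"]; encoded.indices is [0, 1, 0, 0, 1]
```

Five strings shrink to two stored strings plus five integers; the saving grows with the amount of repetition and disappears when most values are distinct.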
___
### schema
`Optional` **schema**: `Schema`\<`any`\>
#### Defined in
[arrow.ts:67](https://github.com/lancedb/lancedb/blob/9d178c7/nodejs/lancedb/arrow.ts#L67)
___
### vectorColumns
**vectorColumns**: `Record`\<`string`, [`VectorColumnOptions`](VectorColumnOptions.md)\>
#### Defined in
[arrow.ts:85](https://github.com/lancedb/lancedb/blob/9d178c7/nodejs/lancedb/arrow.ts#L85)


@@ -0,0 +1,368 @@
[@lancedb/lancedb](../README.md) / [Exports](../modules.md) / Query
# Class: Query
A builder for LanceDB queries.
## Hierarchy
- [`QueryBase`](QueryBase.md)\<`NativeQuery`, [`Query`](Query.md)\>
**`Query`**
## Table of contents
### Constructors
- [constructor](Query.md#constructor)
### Properties
- [inner](Query.md#inner)
### Methods
- [[asyncIterator]](Query.md#[asynciterator])
- [execute](Query.md#execute)
- [limit](Query.md#limit)
- [nativeExecute](Query.md#nativeexecute)
- [nearestTo](Query.md#nearestto)
- [select](Query.md#select)
- [toArray](Query.md#toarray)
- [toArrow](Query.md#toarrow)
- [where](Query.md#where)
## Constructors
### constructor
**new Query**(`tbl`): [`Query`](Query.md)
#### Parameters
| Name | Type |
| :------ | :------ |
| `tbl` | `Table` |
#### Returns
[`Query`](Query.md)
#### Overrides
[QueryBase](QueryBase.md).[constructor](QueryBase.md#constructor)
#### Defined in
[query.ts:329](https://github.com/lancedb/lancedb/blob/9d178c7/nodejs/lancedb/query.ts#L329)
## Properties
### inner
`Protected` **inner**: `Query`
#### Inherited from
[QueryBase](QueryBase.md).[inner](QueryBase.md#inner)
#### Defined in
[query.ts:59](https://github.com/lancedb/lancedb/blob/9d178c7/nodejs/lancedb/query.ts#L59)
## Methods
### [asyncIterator]
**[asyncIterator]**(): `AsyncIterator`\<`RecordBatch`\<`any`\>, `any`, `undefined`\>
#### Returns
`AsyncIterator`\<`RecordBatch`\<`any`\>, `any`, `undefined`\>
#### Inherited from
[QueryBase](QueryBase.md).[[asyncIterator]](QueryBase.md#[asynciterator])
#### Defined in
[query.ts:154](https://github.com/lancedb/lancedb/blob/9d178c7/nodejs/lancedb/query.ts#L154)
___
### execute
**execute**(): [`RecordBatchIterator`](RecordBatchIterator.md)
Execute the query and return the results as an `AsyncIterator` of `RecordBatch`.
By default, LanceDB will use many threads to calculate results and, when
the result set is large, multiple batches will be processed at one time.
This readahead is limited, however, and backpressure will be applied if this
stream is consumed slowly (this constrains the maximum memory used by a
single query).
#### Returns
[`RecordBatchIterator`](RecordBatchIterator.md)
#### Inherited from
[QueryBase](QueryBase.md).[execute](QueryBase.md#execute)
#### Defined in
[query.ts:149](https://github.com/lancedb/lancedb/blob/9d178c7/nodejs/lancedb/query.ts#L149)
___
### limit
**limit**(`limit`): [`Query`](Query.md)
Set the maximum number of results to return.
By default, a plain search has no limit. If this method is not
called then every valid row from the table will be returned.
#### Parameters
| Name | Type |
| :------ | :------ |
| `limit` | `number` |
#### Returns
[`Query`](Query.md)
#### Inherited from
[QueryBase](QueryBase.md).[limit](QueryBase.md#limit)
#### Defined in
[query.ts:129](https://github.com/lancedb/lancedb/blob/9d178c7/nodejs/lancedb/query.ts#L129)
___
### nativeExecute
**nativeExecute**(): `Promise`\<`RecordBatchIterator`\>
#### Returns
`Promise`\<`RecordBatchIterator`\>
#### Inherited from
[QueryBase](QueryBase.md).[nativeExecute](QueryBase.md#nativeexecute)
#### Defined in
[query.ts:134](https://github.com/lancedb/lancedb/blob/9d178c7/nodejs/lancedb/query.ts#L134)
___
### nearestTo
**nearestTo**(`vector`): [`VectorQuery`](VectorQuery.md)
Find the nearest vectors to the given query vector.
This converts the query from a plain query to a vector query.
This method will attempt to convert the input to the query vector
expected by the embedding model. If the input cannot be converted
then an error will be thrown.
By default, there is no embedding model, and the input should be
an array-like object of numbers (something that can be used as input
to `Float32Array.from`).
If there is only one vector column (a column whose data type is a
fixed size list of floats) then the column does not need to be specified.
If there is more than one vector column you must use
[VectorQuery#column](VectorQuery.md#column) to specify which column you would like
to compare with.
If no index has been created on the vector column then a vector query
will perform a distance comparison between the query vector and every
vector in the database and then sort the results. This is sometimes
called a "flat search".
For small databases, with a few hundred thousand vectors or less, this can
be reasonably fast. In larger databases you should create a vector index
on the column. If there is a vector index then an "approximate" nearest
neighbor search (frequently called an ANN search) will be performed. This
search is much faster, but the results will be approximate.
The query can be further parameterized using the returned builder. There
are various ANN search parameters that will let you fine tune your recall
accuracy vs search latency.
Vector searches always have a `limit`. If `limit` has not been called then
a default `limit` of 10 will be used. See [Query#limit](Query.md#limit).
#### Parameters
| Name | Type |
| :------ | :------ |
| `vector` | `unknown` |
#### Returns
[`VectorQuery`](VectorQuery.md)
#### Defined in
[query.ts:370](https://github.com/lancedb/lancedb/blob/9d178c7/nodejs/lancedb/query.ts#L370)
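
The "flat search" mentioned in the `nearestTo` description can be sketched in a few lines. This is an illustrative model, not the library's implementation: `flatSearch`, `Row`, and `l2Squared` are hypothetical names, squared L2 distance is an assumed metric, and the default of 10 simply mirrors the default vector-search limit stated in the description.

```typescript
// Sketch of a flat (exhaustive) search: compare the query with every stored
// vector, sort by distance, and keep the first `limit` rows.
interface Row {
  id: number;
  vector: number[];
}

function l2Squared(a: ArrayLike<number>, b: ArrayLike<number>): number {
  let sum = 0;
  for (let i = 0; i < a.length; i++) {
    const d = a[i] - b[i];
    sum += d * d;
  }
  return sum;
}

function flatSearch(rows: Row[], query: ArrayLike<number>, limit: number = 10): Row[] {
  // Any array-like of numbers is accepted and converted to 32-bit floats.
  const q = Float32Array.from(query);
  return rows
    .map((row) => ({ row, distance: l2Squared(row.vector, q) }))
    .sort((a, b) => a.distance - b.distance)
    .slice(0, limit)
    .map((scored) => scored.row);
}

// 25 rows laid out along the x axis: (0,0), (1,0), ..., (24,0).
const rows: Row[] = Array.from({ length: 25 }, (_, i) => ({ id: i, vector: [i, 0] }));
const top = flatSearch(rows, [3.2, 0]);    // no limit given, so 10 rows come back
const top3 = flatSearch(rows, [0, 0], 3);  // explicit limit
```

The cost grows linearly with the number of rows because every vector is compared, which is the reason a vector index is recommended for larger tables.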
___
### select
**select**(`columns`): [`Query`](Query.md)
Return only the specified columns.
By default a query will return all columns from the table. However, this can have
a very significant impact on latency. LanceDB stores data in a columnar fashion. This
means we can finely tune our I/O to select exactly the columns we need.
As a best practice you should always limit queries to the columns that you need. If you
pass in an array of column names then only those columns will be returned.
You can also use this method to create new "dynamic" columns based on your existing columns.
For example, you may not care about "a" or "b" but instead simply want "a + b". This is often
seen in the SELECT clause of an SQL query (e.g. `SELECT a+b FROM my_table`).
To create dynamic columns you can pass in a `Map<string, string>`. A column will be returned
for each entry in the map. The key provides the name of the column. The value is
an SQL string used to specify how the column is calculated.
For example, an SQL query might state `SELECT a + b AS combined, c`. The equivalent
input to this method would be:
#### Parameters
| Name | Type |
| :------ | :------ |
| `columns` | `string`[] \| `Record`\<`string`, `string`\> \| `Map`\<`string`, `string`\> |
#### Returns
[`Query`](Query.md)
**`Example`**
```ts
new Map([["combined", "a + b"], ["c", "c"]])
```
Columns will always be returned in the order given, even if that order is different than
the order used when adding the data.
Note that you can pass in a `Record<string, string>` (e.g. an object literal). This method
uses `Object.entries` which should preserve the insertion order of the object. However,
object insertion order is easy to get wrong and `Map` is more foolproof.
#### Inherited from
[QueryBase](QueryBase.md).[select](QueryBase.md#select)
#### Defined in
[query.ts:108](https://github.com/lancedb/lancedb/blob/9d178c7/nodejs/lancedb/query.ts#L108)
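
The three accepted shapes of `columns` can be related to an SQL `SELECT` list with a small stand-alone sketch. `normalizeColumns` and `toSelectClause` are hypothetical helpers written for illustration; they are not necessarily how the library processes its input.

```typescript
// Normalize string[], Record, or Map input into ordered [name, expression]
// pairs, then render them the way an SQL SELECT list would read.
type ColumnSpec = string[] | Record<string, string> | Map<string, string>;

function normalizeColumns(columns: ColumnSpec): [string, string][] {
  if (Array.isArray(columns)) {
    // A plain column name selects that column under its own name.
    return columns.map((name): [string, string] => [name, name]);
  }
  if (columns instanceof Map) {
    // Map iteration order is insertion order.
    return Array.from(columns.entries());
  }
  // Object literals go through Object.entries, as noted in the description.
  return Object.entries(columns);
}

function toSelectClause(columns: ColumnSpec): string {
  return normalizeColumns(columns)
    .map(([name, expr]) => (name === expr ? name : `${expr} AS ${name}`))
    .join(", ");
}

const fromMap = toSelectClause(new Map([["combined", "a + b"], ["c", "c"]]));
const fromRecord = toSelectClause({ combined: "a + b", c: "c" });
const fromArray = toSelectClause(["a", "b"]);
```

Here `fromMap` and `fromRecord` both read `a + b AS combined, c`, while `fromArray` reads `a, b`.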
___
### toArray
**toArray**(): `Promise`\<`unknown`[]\>
Collect the results as an array of objects.
#### Returns
`Promise`\<`unknown`[]\>
#### Inherited from
[QueryBase](QueryBase.md).[toArray](QueryBase.md#toarray)
#### Defined in
[query.ts:169](https://github.com/lancedb/lancedb/blob/9d178c7/nodejs/lancedb/query.ts#L169)
___
### toArrow
**toArrow**(): `Promise`\<`Table`\<`any`\>\>
Collect the results as an Arrow table (see `ArrowTable`).
#### Returns
`Promise`\<`Table`\<`any`\>\>
#### Inherited from
[QueryBase](QueryBase.md).[toArrow](QueryBase.md#toarrow)
#### Defined in
[query.ts:160](https://github.com/lancedb/lancedb/blob/9d178c7/nodejs/lancedb/query.ts#L160)
___
### where
**where**(`predicate`): [`Query`](Query.md)
A filter statement to be applied to this query.
The filter should be supplied as an SQL query string. For example:
#### Parameters
| Name | Type |
| :------ | :------ |
| `predicate` | `string` |
#### Returns
[`Query`](Query.md)
**`Example`**
```sql
x > 10
y > 0 AND y < 100
x > 5 OR y = 'test'
```
Filtering performance can often be improved by creating a scalar index
on the filter column(s).
#### Inherited from
[QueryBase](QueryBase.md).[where](QueryBase.md#where)
#### Defined in
[query.ts:73](https://github.com/lancedb/lancedb/blob/9d178c7/nodejs/lancedb/query.ts#L73)


@@ -0,0 +1,291 @@
[@lancedb/lancedb](../README.md) / [Exports](../modules.md) / QueryBase
# Class: QueryBase\<NativeQueryType, QueryType\>
Common methods supported by all query types
## Type parameters
| Name | Type |
| :------ | :------ |
| `NativeQueryType` | extends `NativeQuery` \| `NativeVectorQuery` |
| `QueryType` | `QueryType` |
## Hierarchy
- **`QueryBase`**
↳ [`Query`](Query.md)
↳ [`VectorQuery`](VectorQuery.md)
## Implements
- `AsyncIterable`\<`RecordBatch`\>
## Table of contents
### Constructors
- [constructor](QueryBase.md#constructor)
### Properties
- [inner](QueryBase.md#inner)
### Methods
- [[asyncIterator]](QueryBase.md#[asynciterator])
- [execute](QueryBase.md#execute)
- [limit](QueryBase.md#limit)
- [nativeExecute](QueryBase.md#nativeexecute)
- [select](QueryBase.md#select)
- [toArray](QueryBase.md#toarray)
- [toArrow](QueryBase.md#toarrow)
- [where](QueryBase.md#where)
## Constructors
### constructor
**new QueryBase**\<`NativeQueryType`, `QueryType`\>(`inner`): [`QueryBase`](QueryBase.md)\<`NativeQueryType`, `QueryType`\>
#### Type parameters
| Name | Type |
| :------ | :------ |
| `NativeQueryType` | extends `Query` \| `VectorQuery` |
| `QueryType` | `QueryType` |
#### Parameters
| Name | Type |
| :------ | :------ |
| `inner` | `NativeQueryType` |
#### Returns
[`QueryBase`](QueryBase.md)\<`NativeQueryType`, `QueryType`\>
#### Defined in
[query.ts:59](https://github.com/lancedb/lancedb/blob/9d178c7/nodejs/lancedb/query.ts#L59)
## Properties
### inner
`Protected` **inner**: `NativeQueryType`
#### Defined in
[query.ts:59](https://github.com/lancedb/lancedb/blob/9d178c7/nodejs/lancedb/query.ts#L59)
## Methods
### [asyncIterator]
**[asyncIterator]**(): `AsyncIterator`\<`RecordBatch`\<`any`\>, `any`, `undefined`\>
#### Returns
`AsyncIterator`\<`RecordBatch`\<`any`\>, `any`, `undefined`\>
#### Implementation of
AsyncIterable.[asyncIterator]
#### Defined in
[query.ts:154](https://github.com/lancedb/lancedb/blob/9d178c7/nodejs/lancedb/query.ts#L154)
___
### execute
**execute**(): [`RecordBatchIterator`](RecordBatchIterator.md)
Execute the query and return the results as an `AsyncIterator` of `RecordBatch`.
By default, LanceDB will use many threads to calculate results and, when
the result set is large, multiple batches will be processed at one time.
This readahead is limited, however, and backpressure will be applied if this
stream is consumed slowly (this constrains the maximum memory used by a
single query).
#### Returns
[`RecordBatchIterator`](RecordBatchIterator.md)
#### Defined in
[query.ts:149](https://github.com/lancedb/lancedb/blob/9d178c7/nodejs/lancedb/query.ts#L149)
___
### limit
**limit**(`limit`): `QueryType`
Set the maximum number of results to return.
By default, a plain search has no limit. If this method is not
called then every valid row from the table will be returned.
#### Parameters
| Name | Type |
| :------ | :------ |
| `limit` | `number` |
#### Returns
`QueryType`
#### Defined in
[query.ts:129](https://github.com/lancedb/lancedb/blob/9d178c7/nodejs/lancedb/query.ts#L129)
___
### nativeExecute
**nativeExecute**(): `Promise`\<`RecordBatchIterator`\>
#### Returns
`Promise`\<`RecordBatchIterator`\>
#### Defined in
[query.ts:134](https://github.com/lancedb/lancedb/blob/9d178c7/nodejs/lancedb/query.ts#L134)
___
### select
**select**(`columns`): `QueryType`
Return only the specified columns.
By default a query will return all columns from the table. However, this can have
a very significant impact on latency. LanceDB stores data in a columnar fashion. This
means we can finely tune our I/O to select exactly the columns we need.
As a best practice you should always limit queries to the columns that you need. If you
pass in an array of column names then only those columns will be returned.
You can also use this method to create new "dynamic" columns based on your existing columns.
For example, you may not care about "a" or "b" but instead simply want "a + b". This is often
seen in the SELECT clause of an SQL query (e.g. `SELECT a+b FROM my_table`).
To create dynamic columns you can pass in a `Map<string, string>`. A column will be returned
for each entry in the map. The key provides the name of the column. The value is
an SQL string used to specify how the column is calculated.
For example, an SQL query might state `SELECT a + b AS combined, c`. The equivalent
input to this method would be:
#### Parameters
| Name | Type |
| :------ | :------ |
| `columns` | `string`[] \| `Record`\<`string`, `string`\> \| `Map`\<`string`, `string`\> |
#### Returns
`QueryType`
**`Example`**
```ts
new Map([["combined", "a + b"], ["c", "c"]])
```
Columns will always be returned in the order given, even if that order is different than
the order used when adding the data.
Note that you can pass in a `Record<string, string>` (e.g. an object literal). This method
uses `Object.entries` which should preserve the insertion order of the object. However,
object insertion order is easy to get wrong and `Map` is more foolproof.
#### Defined in
[query.ts:108](https://github.com/lancedb/lancedb/blob/9d178c7/nodejs/lancedb/query.ts#L108)
___
### toArray
**toArray**(): `Promise`\<`unknown`[]\>
Collect the results as an array of objects.
#### Returns
`Promise`\<`unknown`[]\>
#### Defined in
[query.ts:169](https://github.com/lancedb/lancedb/blob/9d178c7/nodejs/lancedb/query.ts#L169)
___
### toArrow
**toArrow**(): `Promise`\<`Table`\<`any`\>\>
Collect the results as an Arrow table (see `ArrowTable`).
#### Returns
`Promise`\<`Table`\<`any`\>\>
#### Defined in
[query.ts:160](https://github.com/lancedb/lancedb/blob/9d178c7/nodejs/lancedb/query.ts#L160)
___
### where
**where**(`predicate`): `QueryType`
A filter statement to be applied to this query.
The filter should be supplied as an SQL query string. For example:
#### Parameters
| Name | Type |
| :------ | :------ |
| `predicate` | `string` |
#### Returns
`QueryType`
**`Example`**
```sql
x > 10
y > 0 AND y < 100
x > 5 OR y = 'test'
```
Filtering performance can often be improved by creating a scalar index
on the filter column(s).
#### Defined in
[query.ts:73](https://github.com/lancedb/lancedb/blob/9d178c7/nodejs/lancedb/query.ts#L73)


@@ -0,0 +1,80 @@
[@lancedb/lancedb](../README.md) / [Exports](../modules.md) / RecordBatchIterator
# Class: RecordBatchIterator
## Implements
- `AsyncIterator`\<`RecordBatch`\>
## Table of contents
### Constructors
- [constructor](RecordBatchIterator.md#constructor)
### Properties
- [inner](RecordBatchIterator.md#inner)
- [promisedInner](RecordBatchIterator.md#promisedinner)
### Methods
- [next](RecordBatchIterator.md#next)
## Constructors
### constructor
**new RecordBatchIterator**(`promise?`): [`RecordBatchIterator`](RecordBatchIterator.md)
#### Parameters
| Name | Type |
| :------ | :------ |
| `promise?` | `Promise`\<`RecordBatchIterator`\> |
#### Returns
[`RecordBatchIterator`](RecordBatchIterator.md)
#### Defined in
[query.ts:27](https://github.com/lancedb/lancedb/blob/9d178c7/nodejs/lancedb/query.ts#L27)
## Properties
### inner
`Private` `Optional` **inner**: `RecordBatchIterator`
#### Defined in
[query.ts:25](https://github.com/lancedb/lancedb/blob/9d178c7/nodejs/lancedb/query.ts#L25)
___
### promisedInner
`Private` `Optional` **promisedInner**: `Promise`\<`RecordBatchIterator`\>
#### Defined in
[query.ts:24](https://github.com/lancedb/lancedb/blob/9d178c7/nodejs/lancedb/query.ts#L24)
## Methods
### next
**next**(): `Promise`\<`IteratorResult`\<`RecordBatch`\<`any`\>, `any`\>\>
#### Returns
`Promise`\<`IteratorResult`\<`RecordBatch`\<`any`\>, `any`\>\>
#### Implementation of
AsyncIterator.next
#### Defined in
[query.ts:33](https://github.com/lancedb/lancedb/blob/9d178c7/nodejs/lancedb/query.ts#L33)


@@ -0,0 +1,594 @@
[@lancedb/lancedb](../README.md) / [Exports](../modules.md) / Table
# Class: Table
A Table is a collection of Records in a LanceDB Database.
A Table object is expected to be long lived and reused for multiple operations.
Table objects will cache a certain amount of index data in memory. This cache
will be freed when the Table is garbage collected. To eagerly free the cache you
can call the `close` method. Once the Table is closed, it cannot be used for any
further operations.
Closing a table is optional. If not closed, it will be closed when it is garbage
collected.
## Table of contents
### Constructors
- [constructor](Table.md#constructor)
### Properties
- [inner](Table.md#inner)
### Methods
- [add](Table.md#add)
- [addColumns](Table.md#addcolumns)
- [alterColumns](Table.md#altercolumns)
- [checkout](Table.md#checkout)
- [checkoutLatest](Table.md#checkoutlatest)
- [close](Table.md#close)
- [countRows](Table.md#countrows)
- [createIndex](Table.md#createindex)
- [delete](Table.md#delete)
- [display](Table.md#display)
- [dropColumns](Table.md#dropcolumns)
- [isOpen](Table.md#isopen)
- [listIndices](Table.md#listindices)
- [query](Table.md#query)
- [restore](Table.md#restore)
- [schema](Table.md#schema)
- [update](Table.md#update)
- [vectorSearch](Table.md#vectorsearch)
- [version](Table.md#version)
## Constructors
### constructor
**new Table**(`inner`): [`Table`](Table.md)
Construct a Table. Internal use only.
#### Parameters
| Name | Type |
| :------ | :------ |
| `inner` | `Table` |
#### Returns
[`Table`](Table.md)
#### Defined in
[table.ts:69](https://github.com/lancedb/lancedb/blob/9d178c7/nodejs/lancedb/table.ts#L69)
## Properties
### inner
`Private` `Readonly` **inner**: `Table`
#### Defined in
[table.ts:66](https://github.com/lancedb/lancedb/blob/9d178c7/nodejs/lancedb/table.ts#L66)
## Methods
### add
**add**(`data`, `options?`): `Promise`\<`void`\>
Insert records into this Table.
#### Parameters
| Name | Type | Description |
| :------ | :------ | :------ |
| `data` | [`Data`](../modules.md#data) | Records to be inserted into the Table |
| `options?` | `Partial`\<[`AddDataOptions`](../interfaces/AddDataOptions.md)\> | - |
#### Returns
`Promise`\<`void`\>
#### Defined in
[table.ts:105](https://github.com/lancedb/lancedb/blob/9d178c7/nodejs/lancedb/table.ts#L105)
___
### addColumns
**addColumns**(`newColumnTransforms`): `Promise`\<`void`\>
Add new columns with defined values.
#### Parameters
| Name | Type | Description |
| :------ | :------ | :------ |
| `newColumnTransforms` | [`AddColumnsSql`](../interfaces/AddColumnsSql.md)[] | pairs of column names and the SQL expression to use to calculate the value of the new column. These expressions will be evaluated for each row in the table, and can reference existing columns in the table. |
#### Returns
`Promise`\<`void`\>
#### Defined in
[table.ts:261](https://github.com/lancedb/lancedb/blob/9d178c7/nodejs/lancedb/table.ts#L261)
___
### alterColumns
**alterColumns**(`columnAlterations`): `Promise`\<`void`\>
Alter the name or nullability of columns.
#### Parameters
| Name | Type | Description |
| :------ | :------ | :------ |
| `columnAlterations` | [`ColumnAlteration`](../interfaces/ColumnAlteration.md)[] | One or more alterations to apply to columns. |
#### Returns
`Promise`\<`void`\>
#### Defined in
[table.ts:270](https://github.com/lancedb/lancedb/blob/9d178c7/nodejs/lancedb/table.ts#L270)
___
### checkout
**checkout**(`version`): `Promise`\<`void`\>
Checks out a specific version of the Table
Any read operation on the table will now access the data at the checked out version.
As a consequence, calling this method will disable any read consistency interval
that was previously set.
This is a read-only operation that turns the table into a sort of "view"
or "detached head". Other table instances will not be affected. To make the change
permanent you can use the `restore` method.
Any operation that modifies the table will fail while the table is in a checked
out state.
To return the table to a normal state use `checkoutLatest`.
#### Parameters
| Name | Type |
| :------ | :------ |
| `version` | `number` |
#### Returns
`Promise`\<`void`\>
#### Defined in
[table.ts:317](https://github.com/lancedb/lancedb/blob/9d178c7/nodejs/lancedb/table.ts#L317)
___
### checkoutLatest
**checkoutLatest**(): `Promise`\<`void`\>
Ensures the table is pointing at the latest version
This can be used to manually update a table when no read consistency interval is set.
It can also be used to undo a `checkout` operation.
#### Returns
`Promise`\<`void`\>
#### Defined in
[table.ts:327](https://github.com/lancedb/lancedb/blob/9d178c7/nodejs/lancedb/table.ts#L327)
___
### close
**close**(): `void`
Close the table, releasing any underlying resources.
It is safe to call this method multiple times.
Any attempt to use the table after it is closed will result in an error.
#### Returns
`void`
#### Defined in
[table.ts:85](https://github.com/lancedb/lancedb/blob/9d178c7/nodejs/lancedb/table.ts#L85)
___
### countRows
**countRows**(`filter?`): `Promise`\<`number`\>
Count the total number of rows in the dataset.
#### Parameters
| Name | Type |
| :------ | :------ |
| `filter?` | `string` |
#### Returns
`Promise`\<`number`\>
#### Defined in
[table.ts:152](https://github.com/lancedb/lancedb/blob/9d178c7/nodejs/lancedb/table.ts#L152)
___
### createIndex
**createIndex**(`column`, `options?`): `Promise`\<`void`\>
Create an index to speed up queries.
Indices can be created on vector columns or scalar columns.
Indices on vector columns will speed up vector searches.
Indices on scalar columns will speed up filtering (in both
vector and non-vector searches)
#### Parameters
| Name | Type |
| :------ | :------ |
| `column` | `string` |
| `options?` | `Partial`\<[`IndexOptions`](../interfaces/IndexOptions.md)\> |
#### Returns
`Promise`\<`void`\>
**`Example`**
```ts
// If the column has a vector (fixed size list) data type then
// an IvfPq vector index will be created.
const table = await conn.openTable("my_table");
await table.createIndex("vector");
```
**`Example`**
```ts
// For advanced control over vector index creation you can specify
// the index type and options. This assumes `Index` is imported from
// "@lancedb/lancedb" and that `IndexOptions` accepts a `config` field.
const table = await conn.openTable("my_table");
await table.createIndex("vector", {
  config: Index.ivfPq({ numPartitions: 128, numSubVectors: 16 }),
});
```
**`Example`**
```ts
// Or create a Scalar index
await table.createIndex("my_float_col");
```
#### Defined in
[table.ts:184](https://github.com/lancedb/lancedb/blob/9d178c7/nodejs/lancedb/table.ts#L184)
___
### delete
**delete**(`predicate`): `Promise`\<`void`\>
Delete the rows that satisfy the predicate.
#### Parameters
| Name | Type |
| :------ | :------ |
| `predicate` | `string` |
#### Returns
`Promise`\<`void`\>
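Because `predicate` is a plain SQL string, it is often built from application data. The sketch below shows one cautious way to do that for integer keys. The names `DeletableTable`, `idInPredicate` and `deleteByIds` are hypothetical, and the column passed in is whatever key column your table happens to have.

```ts
// Hypothetical structural type mirroring the documented signature.
interface DeletableTable {
  delete(predicate: string): Promise<void>;
}

// Build a predicate such as `id IN (1, 2, 3)`. Only integers are accepted
// so that arbitrary text can never be spliced into the SQL string.
function idInPredicate(column: string, ids: number[]): string {
  if (ids.length === 0) throw new Error("ids must not be empty");
  for (const id of ids) {
    if (!Number.isInteger(id)) throw new Error(`not an integer: ${id}`);
  }
  return `${column} IN (${ids.join(", ")})`;
}

// Delete every row whose `column` value is one of `ids`.
async function deleteByIds(
  table: DeletableTable,
  column: string,
  ids: number[],
): Promise<void> {
  await table.delete(idInPredicate(column, ids));
}
```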
#### Defined in
[table.ts:157](https://github.com/lancedb/lancedb/blob/9d178c7/nodejs/lancedb/table.ts#L157)
___
### display
**display**(): `string`
Return a brief description of the table
#### Returns
`string`
#### Defined in
[table.ts:90](https://github.com/lancedb/lancedb/blob/9d178c7/nodejs/lancedb/table.ts#L90)
___
### dropColumns
**dropColumns**(`columnNames`): `Promise`\<`void`\>
Drop one or more columns from the dataset
This is a metadata-only operation and does not remove the data from the
underlying storage. In order to remove the data, you must subsequently
call `compact_files` to rewrite the data without the removed columns and
then call `cleanup_files` to remove the old files.
#### Parameters
| Name | Type | Description |
| :------ | :------ | :------ |
| `columnNames` | `string`[] | The names of the columns to drop. These can be nested column references (e.g. "a.b.c") or top-level column names (e.g. "a"). |
#### Returns
`Promise`\<`void`\>
#### Defined in
[table.ts:285](https://github.com/lancedb/lancedb/blob/9d178c7/nodejs/lancedb/table.ts#L285)
___
### isOpen
▸ **isOpen**(): `boolean`
Return true if the table has not been closed
#### Returns
`boolean`
#### Defined in
[table.ts:74](https://github.com/lancedb/lancedb/blob/9d178c7/nodejs/lancedb/table.ts#L74)
___
### listIndices
▸ **listIndices**(): `Promise`\<[`IndexConfig`](../interfaces/IndexConfig.md)[]\>
List all indices that have been created with `createIndex`
#### Returns
`Promise`\<[`IndexConfig`](../interfaces/IndexConfig.md)[]\>
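One possible use of `listIndices` is to avoid rebuilding an index that already exists. The sketch below assumes each returned entry exposes a `columns: string[]` field; check `IndexConfig` for the actual shape. `IndexInfo`, `IndexableTable` and `ensureIndex` are hypothetical names.

```ts
// Assumed shape of one listed index; only the field this sketch reads.
interface IndexInfo {
  columns: string[];
}

// Hypothetical structural type with the two methods the sketch calls.
interface IndexableTable {
  listIndices(): Promise<IndexInfo[]>;
  createIndex(column: string): Promise<void>;
}

// Create an index on `column` unless one is already listed for it.
// Returns true when a new index was created.
async function ensureIndex(
  table: IndexableTable,
  column: string,
): Promise<boolean> {
  const indices = await table.listIndices();
  const exists = indices.some((idx) => idx.columns.includes(column));
  if (!exists) await table.createIndex(column);
  return !exists;
}
```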
#### Defined in
[table.ts:350](https://github.com/lancedb/lancedb/blob/9d178c7/nodejs/lancedb/table.ts#L350)
___
### query
▸ **query**(): [`Query`](Query.md)
Create a [Query](Query.md) Builder.
Queries allow you to search your existing data. By default the query will
return all the data in the table in no particular order. The builder
returned by this method can be used to control the query using filtering,
vector similarity, sorting, and more.
Note: By default, all columns are returned. For best performance, you should
only fetch the columns you need. See the `select` method for
more details.
When appropriate, various indices and statistics based pruning will be used to
accelerate the query.
#### Returns
[`Query`](Query.md)
A builder that can be used to parameterize the query
**`Example`**
```ts
// SQL-style filtering
//
// This query will return up to 20 rows whose value in the `id` column
// is greater than 1. LanceDb supports a broad set of filtering functions.
for await (const batch of table.query()
  .where("id > 1").select(["id"]).limit(20)) {
console.log(batch);
}
```
**`Example`**
```ts
// Vector Similarity Search
//
// This example will find the 10 rows whose value in the "vector" column is
// closest to the query vector [1.0, 2.0, 3.0]. If an index has been created
// on the "vector" column then this will perform an ANN search.
//
// The `refineFactor` and `nprobes` methods are used to control the recall /
// latency tradeoff of the search.
for await (const batch of table.query()
  .nearestTo([1, 2, 3])
  .refineFactor(5).nprobes(10)
.limit(10)) {
console.log(batch);
}
```
**`Example`**
```ts
// Scan the full dataset
//
// This query will return everything in the table in no particular order.
for await (const batch of table.query()) {
console.log(batch);
}
```
#### Defined in
[table.ts:238](https://github.com/lancedb/lancedb/blob/9d178c7/nodejs/lancedb/table.ts#L238)
___
### restore
▸ **restore**(): `Promise`\<`void`\>
Restore the table to the currently checked out version
This operation will fail if checkout has not been called previously
This operation will overwrite the latest version of the table with a
previous version. Any changes made since the checked out version will
no longer be visible.
Once the operation concludes the table will no longer be in a checked
out state and the read_consistency_interval, if any, will apply.
#### Returns
`Promise`\<`void`\>
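Combining `version`, `checkout` and `restore` gives a simple rollback pattern. The following is a sketch under the behaviour described above, not an official recipe: `RollbackTable` and `withRollback` are hypothetical names, and the helper assumes no other writer modifies the table in the meantime.

```ts
// Hypothetical structural type: only the methods this sketch relies on.
interface RollbackTable {
  version(): Promise<number>;
  checkout(version: number): Promise<void>;
  restore(): Promise<void>;
}

// Run `change`; if it throws, restore the table to the version it had
// before `change` started and rethrow the original error.
async function withRollback<T>(
  table: RollbackTable,
  change: () => Promise<T>,
): Promise<T> {
  const before = await table.version();
  try {
    return await change();
  } catch (err) {
    // `restore` requires a prior checkout, so check out the old version first.
    await table.checkout(before);
    await table.restore();
    throw err;
  }
}
```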
#### Defined in
[table.ts:343](https://github.com/lancedb/lancedb/blob/9d178c7/nodejs/lancedb/table.ts#L343)
___
### schema
▸ **schema**(): `Promise`\<`Schema`\<`any`\>\>
Get the schema of the table.
#### Returns
`Promise`\<`Schema`\<`any`\>\>
#### Defined in
[table.ts:95](https://github.com/lancedb/lancedb/blob/9d178c7/nodejs/lancedb/table.ts#L95)
___
### update
▸ **update**(`updates`, `options?`): `Promise`\<`void`\>
Update existing records in the Table
An update operation can be used to adjust existing values. Use the
`updates` argument to specify which columns to update. The new value
can be a literal value (e.g. replacing nulls with some default value)
or an expression applied to the old value (e.g. incrementing a value).
An optional condition can be specified (e.g. "only update if the old
value is 0").
Note: if your condition is something like "some_id_column == 7" and
you are updating many rows (with different ids) then you will get
better performance with a single `merge_insert` call instead of
repeatedly calling this method.
#### Parameters
| Name | Type | Description |
| :------ | :------ | :------ |
| `updates` | `Record`\<`string`, `string`\> \| `Map`\<`string`, `string`\> | The columns to update. Keys in the map should specify the name of the column to update. Values in the map provide the new value of the column. These can be SQL literal strings (e.g. "7" or "'foo'") or they can be expressions based on the row being updated (e.g. "my_col + 1"). |
| `options?` | `Partial`\<[`UpdateOptions`](../interfaces/UpdateOptions.md)\> | additional options to control the update behavior |
#### Returns
`Promise`\<`void`\>
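Since every value in `updates` is an SQL expression, a JavaScript string has to be quoted before it can be used as a literal. The sketch below shows one way to do that. It assumes `UpdateOptions` carries a `where` condition (see the `UpdateOptions` interface); `sqlString`, `UpdatableTable`, `renameCategory` and the `category` column are hypothetical.

```ts
// Quote a JS string as an SQL string literal by doubling single quotes.
function sqlString(value: string): string {
  return `'${value.replace(/'/g, "''")}'`;
}

// Hypothetical structural type; the `where` option is an assumption.
interface UpdatableTable {
  update(
    updates: Record<string, string>,
    options?: { where?: string },
  ): Promise<void>;
}

// Set `category` to `to` on every row where it currently equals `from`.
async function renameCategory(
  table: UpdatableTable,
  from: string,
  to: string,
): Promise<void> {
  await table.update(
    { category: sqlString(to) },
    { where: `category = ${sqlString(from)}` },
  );
}
```

Passing `"foo"` without quoting would be read as a reference to a column named `foo` rather than as the text value.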
#### Defined in
[table.ts:137](https://github.com/lancedb/lancedb/blob/9d178c7/nodejs/lancedb/table.ts#L137)
___
### vectorSearch
▸ **vectorSearch**(`vector`): [`VectorQuery`](VectorQuery.md)
Search the table with a given query vector.
This is a convenience method for preparing a vector query and
is the same thing as calling `nearestTo` on the builder returned
by `query`.
#### Parameters
| Name | Type |
| :------ | :------ |
| `vector` | `unknown` |
#### Returns
[`VectorQuery`](VectorQuery.md)
**`See`**
[Query#nearestTo](Query.md#nearestto) for more details.
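A minimal sketch of a top-k lookup built on `vectorSearch`, using only the `limit` and `toArray` methods documented for `VectorQuery`. The structural types and the `topK` helper are hypothetical and exist only to keep the example self-contained.

```ts
// Hypothetical structural types: only the methods this sketch relies on.
interface VectorQueryLike {
  limit(n: number): VectorQueryLike;
  toArray(): Promise<unknown[]>;
}
interface SearchableTable {
  vectorSearch(vector: unknown): VectorQueryLike;
}

// Return the `k` rows nearest to `vector` as plain objects.
async function topK(
  table: SearchableTable,
  vector: number[],
  k: number,
): Promise<unknown[]> {
  return table.vectorSearch(vector).limit(k).toArray();
}
```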
#### Defined in
[table.ts:249](https://github.com/lancedb/lancedb/blob/9d178c7/nodejs/lancedb/table.ts#L249)
___
### version
▸ **version**(): `Promise`\<`number`\>
Retrieve the version of the table
LanceDb supports versioning. Every operation that modifies the table increases the
version. As long as a version hasn't been deleted you can `checkout` that
version to view the data at that point. In addition, you can `restore` the
version to replace the current table with a previous version.
#### Returns
`Promise`\<`number`\>
#### Defined in
[table.ts:297](https://github.com/lancedb/lancedb/blob/9d178c7/nodejs/lancedb/table.ts#L297)

@@ -0,0 +1,45 @@
[@lancedb/lancedb](../README.md) / [Exports](../modules.md) / VectorColumnOptions
# Class: VectorColumnOptions
## Table of contents
### Constructors
- [constructor](VectorColumnOptions.md#constructor)
### Properties
- [type](VectorColumnOptions.md#type)
## Constructors
### constructor
**new VectorColumnOptions**(`values?`): [`VectorColumnOptions`](VectorColumnOptions.md)
#### Parameters
| Name | Type |
| :------ | :------ |
| `values?` | `Partial`\<[`VectorColumnOptions`](VectorColumnOptions.md)\> |
#### Returns
[`VectorColumnOptions`](VectorColumnOptions.md)
#### Defined in
[arrow.ts:49](https://github.com/lancedb/lancedb/blob/9d178c7/nodejs/lancedb/arrow.ts#L49)
## Properties
### type
**type**: `Float`\<`Floats`\>
Vector column type.
#### Defined in
[arrow.ts:47](https://github.com/lancedb/lancedb/blob/9d178c7/nodejs/lancedb/arrow.ts#L47)

@@ -0,0 +1,531 @@
[@lancedb/lancedb](../README.md) / [Exports](../modules.md) / VectorQuery
# Class: VectorQuery
A builder used to construct a vector search
This builder can be reused to execute the query many times.
## Hierarchy
- [`QueryBase`](QueryBase.md)\<`NativeVectorQuery`, [`VectorQuery`](VectorQuery.md)\>
**`VectorQuery`**
## Table of contents
### Constructors
- [constructor](VectorQuery.md#constructor)
### Properties
- [inner](VectorQuery.md#inner)
### Methods
- [[asyncIterator]](VectorQuery.md#[asynciterator])
- [bypassVectorIndex](VectorQuery.md#bypassvectorindex)
- [column](VectorQuery.md#column)
- [distanceType](VectorQuery.md#distancetype)
- [execute](VectorQuery.md#execute)
- [limit](VectorQuery.md#limit)
- [nativeExecute](VectorQuery.md#nativeexecute)
- [nprobes](VectorQuery.md#nprobes)
- [postfilter](VectorQuery.md#postfilter)
- [refineFactor](VectorQuery.md#refinefactor)
- [select](VectorQuery.md#select)
- [toArray](VectorQuery.md#toarray)
- [toArrow](VectorQuery.md#toarrow)
- [where](VectorQuery.md#where)
## Constructors
### constructor
**new VectorQuery**(`inner`): [`VectorQuery`](VectorQuery.md)
#### Parameters
| Name | Type |
| :------ | :------ |
| `inner` | `VectorQuery` |
#### Returns
[`VectorQuery`](VectorQuery.md)
#### Overrides
[QueryBase](QueryBase.md).[constructor](QueryBase.md#constructor)
#### Defined in
[query.ts:189](https://github.com/lancedb/lancedb/blob/9d178c7/nodejs/lancedb/query.ts#L189)
## Properties
### inner
`Protected` **inner**: `VectorQuery`
#### Inherited from
[QueryBase](QueryBase.md).[inner](QueryBase.md#inner)
#### Defined in
[query.ts:59](https://github.com/lancedb/lancedb/blob/9d178c7/nodejs/lancedb/query.ts#L59)
## Methods
### [asyncIterator]
**[asyncIterator]**(): `AsyncIterator`\<`RecordBatch`\<`any`\>, `any`, `undefined`\>
#### Returns
`AsyncIterator`\<`RecordBatch`\<`any`\>, `any`, `undefined`\>
#### Inherited from
[QueryBase](QueryBase.md).[[asyncIterator]](QueryBase.md#[asynciterator])
#### Defined in
[query.ts:154](https://github.com/lancedb/lancedb/blob/9d178c7/nodejs/lancedb/query.ts#L154)
___
### bypassVectorIndex
**bypassVectorIndex**(): [`VectorQuery`](VectorQuery.md)
If this is called then any vector index is skipped
An exhaustive (flat) search will be performed. The query vector will
be compared to every vector in the table. At high scales this can be
expensive. However, this is often still useful. For example, skipping
the vector index can give you ground truth results which you can use to
calculate your recall to select an appropriate value for nprobes.
#### Returns
[`VectorQuery`](VectorQuery.md)
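The recall calculation mentioned above can be sketched as a small pure function. It assumes you have already collected row identifiers from two runs of the same query, one with `bypassVectorIndex()` (ground truth) and one without; `recall` is a hypothetical helper name.

```ts
// Fraction of the exact (ground-truth) ids that also appear in the
// approximate result. An empty ground truth counts as perfect recall.
function recall(exactIds: number[], approxIds: number[]): number {
  if (exactIds.length === 0) return 1;
  const approx = new Set(approxIds);
  const hits = exactIds.filter((id) => approx.has(id)).length;
  return hits / exactIds.length;
}
```

For instance, `recall([1, 2, 3, 4], [1, 2, 9, 4])` is `0.75`, because three of the four true neighbours were found by the indexed search.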
#### Defined in
[query.ts:321](https://github.com/lancedb/lancedb/blob/9d178c7/nodejs/lancedb/query.ts#L321)
___
### column
**column**(`column`): [`VectorQuery`](VectorQuery.md)
Set the vector column to query
This controls which column is compared to the query vector supplied in
the call to [Query#nearestTo](Query.md#nearestto).
This parameter must be specified if the table has more than one column
whose data type is a fixed-size-list of floats.
#### Parameters
| Name | Type |
| :------ | :------ |
| `column` | `string` |
#### Returns
[`VectorQuery`](VectorQuery.md)
#### Defined in
[query.ts:229](https://github.com/lancedb/lancedb/blob/9d178c7/nodejs/lancedb/query.ts#L229)
___
### distanceType
**distanceType**(`distanceType`): [`VectorQuery`](VectorQuery.md)
Set the distance metric to use
When performing a vector search we try and find the "nearest" vectors according
to some kind of distance metric. This parameter controls which distance metric to
use. See [IvfPqOptions.distanceType](../interfaces/IvfPqOptions.md#distancetype) for more details on the different
distance metrics available.
Note: if there is a vector index then the distance type used MUST match the distance
type used to train the vector index. If this is not done then the results will be
invalid.
By default "l2" is used.
#### Parameters
| Name | Type |
| :------ | :------ |
| `distanceType` | `string` |
#### Returns
[`VectorQuery`](VectorQuery.md)
#### Defined in
[query.ts:248](https://github.com/lancedb/lancedb/blob/9d178c7/nodejs/lancedb/query.ts#L248)
___
### execute
**execute**(): [`RecordBatchIterator`](RecordBatchIterator.md)
Execute the query and return the results as an `AsyncIterator` of `RecordBatch`.
By default, LanceDb will use many threads to calculate results and, when
the result set is large, multiple batches will be processed at one time.
This readahead is limited however and backpressure will be applied if this
stream is consumed slowly (this constrains the maximum memory used by a
single query).
#### Returns
[`RecordBatchIterator`](RecordBatchIterator.md)
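To illustrate batch-at-a-time consumption, the sketch below totals rows across a stream of batches without holding them all in memory. It accepts any `AsyncIterable`, which a query object should satisfy given the `[asyncIterator]` method documented on this page. `BatchLike` is a hypothetical type that relies only on the `numRows` property of an Arrow `RecordBatch`, and `countStreamedRows` is an illustrative name.

```ts
// Hypothetical minimal view of a record batch.
interface BatchLike {
  numRows: number;
}

// Consume the stream one batch at a time and add up the row counts.
// Each batch can be released before the next one is requested.
async function countStreamedRows(
  batches: AsyncIterable<BatchLike>,
): Promise<number> {
  let total = 0;
  for await (const batch of batches) {
    total += batch.numRows;
  }
  return total;
}
```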
#### Inherited from
[QueryBase](QueryBase.md).[execute](QueryBase.md#execute)
#### Defined in
[query.ts:149](https://github.com/lancedb/lancedb/blob/9d178c7/nodejs/lancedb/query.ts#L149)
___
### limit
**limit**(`limit`): [`VectorQuery`](VectorQuery.md)
Set the maximum number of results to return.
By default, a plain search has no limit. If this method is not
called then every valid row from the table will be returned.
#### Parameters
| Name | Type |
| :------ | :------ |
| `limit` | `number` |
#### Returns
[`VectorQuery`](VectorQuery.md)
#### Inherited from
[QueryBase](QueryBase.md).[limit](QueryBase.md#limit)
#### Defined in
[query.ts:129](https://github.com/lancedb/lancedb/blob/9d178c7/nodejs/lancedb/query.ts#L129)
___
### nativeExecute
**nativeExecute**(): `Promise`\<`RecordBatchIterator`\>
#### Returns
`Promise`\<`RecordBatchIterator`\>
#### Inherited from
[QueryBase](QueryBase.md).[nativeExecute](QueryBase.md#nativeexecute)
#### Defined in
[query.ts:134](https://github.com/lancedb/lancedb/blob/9d178c7/nodejs/lancedb/query.ts#L134)
___
### nprobes
**nprobes**(`nprobes`): [`VectorQuery`](VectorQuery.md)
Set the number of partitions to search (probe)
This argument is only used when the vector column has an IVF PQ index.
If there is no index then this value is ignored.
The IVF stage of IVF PQ divides the input into partitions (clusters) of
related values.
The partitions whose centroids are closest to the query vector will be
exhaustively searched to find matches. This parameter controls how many
partitions should be searched.
Increasing this value will increase the recall of your query but will
also increase the latency of your query. The default value is 20. This
default is good for many cases but the best value to use will depend on
your data and the recall that you need to achieve.
For best results we recommend tuning this parameter with a benchmark against
your actual data to find the smallest possible value that will still give
you the desired recall.
#### Parameters
| Name | Type |
| :------ | :------ |
| `nprobes` | `number` |
#### Returns
[`VectorQuery`](VectorQuery.md)
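The tuning procedure recommended above can be sketched as a simple sweep. This is only an outline: `measureRecall` is a hypothetical callback that you would implement by running benchmark queries at a given `nprobes` value and comparing them against ground-truth results, and `smallestNprobes` is an illustrative name.

```ts
// Try candidate values in ascending order and return the first one whose
// measured recall reaches the target, or undefined if none does.
async function smallestNprobes(
  candidates: number[],
  measureRecall: (nprobes: number) => Promise<number>,
  targetRecall: number,
): Promise<number | undefined> {
  const sorted = [...candidates].sort((a, b) => a - b);
  for (const n of sorted) {
    if ((await measureRecall(n)) >= targetRecall) return n;
  }
  return undefined;
}
```

Testing in ascending order stops at the cheapest acceptable setting, which matters because latency grows with the number of partitions probed.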
#### Defined in
[query.ts:215](https://github.com/lancedb/lancedb/blob/9d178c7/nodejs/lancedb/query.ts#L215)
___
### postfilter
**postfilter**(): [`VectorQuery`](VectorQuery.md)
If this is called then filtering will happen after the vector search instead of
before.
By default filtering will be performed before the vector search. This is how
filtering is typically understood to work. This prefilter step does add some
additional latency. Creating a scalar index on the filter column(s) can
often improve this latency. However, sometimes a filter is too complex or scalar
indices cannot be applied to the column. In these cases postfiltering can be
used instead of prefiltering to improve latency.
Post filtering applies the filter to the results of the vector search. This means
we only run the filter on a much smaller set of data. However, it can cause the
query to return fewer than `limit` results (or even no results) if none of the nearest
results match the filter.
Post filtering happens during the "refine stage" (described in more detail in
[VectorQuery#refineFactor](VectorQuery.md#refinefactor)). This means that setting a higher refine
factor can often help restore some of the results lost by post filtering.
#### Returns
[`VectorQuery`](VectorQuery.md)
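The difference between the two filtering orders can be seen in a toy in-memory model. This is not how LanceDB implements either mode; it only shows why postfiltering can return fewer than `limit` rows. `Row`, `prefilterSearch` and `postfilterSearch` are hypothetical names, and `keep` stands in for "this row matches the filter".

```ts
interface Row {
  id: number;
  distance: number; // distance to the query vector
  keep: boolean;    // whether the row satisfies the filter
}

// Prefilter: drop non-matching rows first, then take the nearest `limit`.
function prefilterSearch(rows: Row[], limit: number): Row[] {
  return rows
    .filter((r) => r.keep)
    .sort((a, b) => a.distance - b.distance)
    .slice(0, limit);
}

// Postfilter: take the nearest `limit` rows first, then drop non-matching ones.
function postfilterSearch(rows: Row[], limit: number): Row[] {
  return [...rows]
    .sort((a, b) => a.distance - b.distance)
    .slice(0, limit)
    .filter((r) => r.keep);
}
```

With four rows at distances 0.1 to 0.4, where only the second and fourth match the filter, a limit of 2 yields two rows with prefiltering but only one with postfiltering.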
#### Defined in
[query.ts:307](https://github.com/lancedb/lancedb/blob/9d178c7/nodejs/lancedb/query.ts#L307)
___
### refineFactor
**refineFactor**(`refineFactor`): [`VectorQuery`](VectorQuery.md)
A multiplier to control how many additional rows are taken during the refine step
This argument is only used when the vector column has an IVF PQ index.
If there is no index then this value is ignored.
An IVF PQ index stores compressed (quantized) values. The query vector is compared
against these values and, since they are compressed, the comparison is inaccurate.
This parameter can be used to refine the results. It can both improve recall
and correct the ordering of the nearest results.
To refine results LanceDb will first perform an ANN search to find the nearest
`limit` * `refineFactor` results. In other words, if `refineFactor` is 3 and
`limit` is the default (10) then the first 30 results will be selected. LanceDb
then fetches the full, uncompressed, values for these 30 results. The results are
then reordered by the true distance and only the nearest 10 are kept.
Note: there is a difference between calling this method with a value of 1 and never
calling this method at all. Calling this method with any value will have an impact
on your search latency. When you call this method with a `refineFactor` of 1 then
LanceDb still needs to fetch the full, uncompressed, values so that it can potentially
reorder the results.
Note: if this method is NOT called then the distances returned in the `_distance` column
will be approximate distances based on the comparison of the quantized query vector
and the quantized result vectors. This can be considerably different than the true
distance between the query vector and the actual uncompressed vector.
#### Parameters
| Name | Type |
| :------ | :------ |
| `refineFactor` | `number` |
#### Returns
[`VectorQuery`](VectorQuery.md)
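The refine step described above can be modelled in a few lines. This is a toy illustration of the shortlist-then-rerank idea, not LanceDB's implementation; `Candidate` and `refine` are hypothetical names, and both distances are supplied directly rather than computed from vectors.

```ts
interface Candidate {
  id: number;
  approxDistance: number; // distance computed on quantized values
  trueDistance: number;   // distance computed on uncompressed values
}

// Shortlist `limit * refineFactor` rows by approximate distance, then
// rerank the shortlist by true distance and keep the nearest `limit`.
function refine(
  candidates: Candidate[],
  limit: number,
  refineFactor: number,
): number[] {
  const shortlist = [...candidates]
    .sort((a, b) => a.approxDistance - b.approxDistance)
    .slice(0, limit * refineFactor);
  return shortlist
    .sort((a, b) => a.trueDistance - b.trueDistance)
    .slice(0, limit)
    .map((c) => c.id);
}
```

A larger factor widens the shortlist, which gives a truly near row that the quantized comparison ranked poorly a chance to be recovered.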
#### Defined in
[query.ts:282](https://github.com/lancedb/lancedb/blob/9d178c7/nodejs/lancedb/query.ts#L282)
___
### select
**select**(`columns`): [`VectorQuery`](VectorQuery.md)
Return only the specified columns.
By default a query will return all columns from the table. However, this can have
a very significant impact on latency. LanceDb stores data in a columnar fashion. This
means we can finely tune our I/O to select exactly the columns we need.
As a best practice you should always limit queries to the columns that you need. If you
pass in an array of column names then only those columns will be returned.
You can also use this method to create new "dynamic" columns based on your existing columns.
For example, you may not care about "a" or "b" but instead simply want "a + b". This is often
seen in the SELECT clause of an SQL query (e.g. `SELECT a+b FROM my_table`).
To create dynamic columns you can pass in a Map<string, string>. A column will be returned
for each entry in the map. The key provides the name of the column. The value is
an SQL string used to specify how the column is calculated.
For example, an SQL query might state `SELECT a + b AS combined, c`. The equivalent
input to this method would be:
```ts
new Map([["combined", "a + b"], ["c", "c"]])
```
Columns will always be returned in the order given, even if that order is different than
the order used when adding the data.
Note that you can pass in a `Record<string, string>` (e.g. an object literal). This method
uses `Object.entries` which should preserve the insertion order of the object. However,
object insertion order is easy to get wrong and `Map` is more foolproof.
#### Parameters
| Name | Type |
| :------ | :------ |
| `columns` | `string`[] \| `Record`\<`string`, `string`\> \| `Map`\<`string`, `string`\> |
#### Returns
[`VectorQuery`](VectorQuery.md)
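One concrete way object ordering can surprise you: JavaScript visits integer-like keys first, in ascending numeric order, regardless of where they were written. The sketch below compares the key order seen by `Object.entries` with the order kept by a `Map`. The column names (`total`, `2024`, `c`) are hypothetical.

```ts
// Object literal: the integer-like key "2024" is moved to the front.
const objectOrder: string[] = Object.entries({
  total: "a + b",
  2024: "y2024",
  c: "c",
}).map(([name]) => name);

// Map: keys stay in exactly the order they were inserted.
const mapOrder: string[] = [
  ...new Map<string, string>([
    ["total", "a + b"],
    ["2024", "y2024"],
    ["c", "c"],
  ]).keys(),
];
```

Here `objectOrder` is `["2024", "total", "c"]` while `mapOrder` is `["total", "2024", "c"]`, so only the `Map` would return the columns in the order written.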
#### Inherited from
[QueryBase](QueryBase.md).[select](QueryBase.md#select)
#### Defined in
[query.ts:108](https://github.com/lancedb/lancedb/blob/9d178c7/nodejs/lancedb/query.ts#L108)
___
### toArray
**toArray**(): `Promise`\<`unknown`[]\>
Collect the results as an array of objects.
#### Returns
`Promise`\<`unknown`[]\>
#### Inherited from
[QueryBase](QueryBase.md).[toArray](QueryBase.md#toarray)
#### Defined in
[query.ts:169](https://github.com/lancedb/lancedb/blob/9d178c7/nodejs/lancedb/query.ts#L169)
___
### toArrow
**toArrow**(): `Promise`\<`Table`\<`any`\>\>
Collect the results as an Arrow `Table`.
#### Returns
`Promise`\<`Table`\<`any`\>\>
#### Inherited from
[QueryBase](QueryBase.md).[toArrow](QueryBase.md#toarrow)
#### Defined in
[query.ts:160](https://github.com/lancedb/lancedb/blob/9d178c7/nodejs/lancedb/query.ts#L160)
___
### where
**where**(`predicate`): [`VectorQuery`](VectorQuery.md)
A filter statement to be applied to this query.
The filter should be supplied as an SQL query string. For example:
```sql
x > 10
y > 0 AND y < 100
x > 5 OR y = 'test'
```
Filtering performance can often be improved by creating a scalar index
on the filter column(s).
#### Parameters
| Name | Type |
| :------ | :------ |
| `predicate` | `string` |
#### Returns
[`VectorQuery`](VectorQuery.md)
#### Inherited from
[QueryBase](QueryBase.md).[where](QueryBase.md#where)
#### Defined in
[query.ts:73](https://github.com/lancedb/lancedb/blob/9d178c7/nodejs/lancedb/query.ts#L73)

@@ -0,0 +1,111 @@
[@lancedb/lancedb](../README.md) / [Exports](../modules.md) / [embedding](../modules/embedding.md) / OpenAIEmbeddingFunction
# Class: OpenAIEmbeddingFunction
[embedding](../modules/embedding.md).OpenAIEmbeddingFunction
An embedding function that automatically creates vector representations for a given column.
## Implements
- [`EmbeddingFunction`](../interfaces/embedding.EmbeddingFunction.md)\<`string`\>
## Table of contents
### Constructors
- [constructor](embedding.OpenAIEmbeddingFunction.md#constructor)
### Properties
- [\_modelName](embedding.OpenAIEmbeddingFunction.md#_modelname)
- [\_openai](embedding.OpenAIEmbeddingFunction.md#_openai)
- [sourceColumn](embedding.OpenAIEmbeddingFunction.md#sourcecolumn)
### Methods
- [embed](embedding.OpenAIEmbeddingFunction.md#embed)
## Constructors
### constructor
**new OpenAIEmbeddingFunction**(`sourceColumn`, `openAIKey`, `modelName?`): [`OpenAIEmbeddingFunction`](embedding.OpenAIEmbeddingFunction.md)
#### Parameters
| Name | Type | Default value |
| :------ | :------ | :------ |
| `sourceColumn` | `string` | `undefined` |
| `openAIKey` | `string` | `undefined` |
| `modelName` | `string` | `"text-embedding-ada-002"` |
#### Returns
[`OpenAIEmbeddingFunction`](embedding.OpenAIEmbeddingFunction.md)
#### Defined in
[embedding/openai.ts:22](https://github.com/lancedb/lancedb/blob/9d178c7/nodejs/lancedb/embedding/openai.ts#L22)
## Properties
### \_modelName
`Private` `Readonly` **\_modelName**: `string`
#### Defined in
[embedding/openai.ts:20](https://github.com/lancedb/lancedb/blob/9d178c7/nodejs/lancedb/embedding/openai.ts#L20)
___
### \_openai
`Private` `Readonly` **\_openai**: `OpenAI`
#### Defined in
[embedding/openai.ts:19](https://github.com/lancedb/lancedb/blob/9d178c7/nodejs/lancedb/embedding/openai.ts#L19)
___
### sourceColumn
**sourceColumn**: `string`
The name of the column that will be used as input for the Embedding Function.
#### Implementation of
[EmbeddingFunction](../interfaces/embedding.EmbeddingFunction.md).[sourceColumn](../interfaces/embedding.EmbeddingFunction.md#sourcecolumn)
#### Defined in
[embedding/openai.ts:61](https://github.com/lancedb/lancedb/blob/9d178c7/nodejs/lancedb/embedding/openai.ts#L61)
## Methods
### embed
**embed**(`data`): `Promise`\<`number`[][]\>
Creates a vector representation for the given values.
#### Parameters
| Name | Type |
| :------ | :------ |
| `data` | `string`[] |
#### Returns
`Promise`\<`number`[][]\>
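When embedding many strings it can help to send them in smaller groups, for example to stay under a provider's request-size limits. The sketch below wraps any object with the documented `embed` signature. `Embedder` and `embedInBatches` are hypothetical names, and a suitable `batchSize` depends on your provider.

```ts
// Hypothetical structural type mirroring the documented `embed` signature.
interface Embedder {
  embed(data: string[]): Promise<number[][]>;
}

// Embed `data` in consecutive slices of at most `batchSize` strings and
// return the vectors in the same order as the input.
async function embedInBatches(
  fn: Embedder,
  data: string[],
  batchSize: number,
): Promise<number[][]> {
  if (!Number.isInteger(batchSize) || batchSize <= 0) {
    throw new Error("batchSize must be a positive integer");
  }
  const out: number[][] = [];
  for (let i = 0; i < data.length; i += batchSize) {
    const vectors = await fn.embed(data.slice(i, i + batchSize));
    out.push(...vectors);
  }
  return out;
}
```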
#### Implementation of
[EmbeddingFunction](../interfaces/embedding.EmbeddingFunction.md).[embed](../interfaces/embedding.EmbeddingFunction.md#embed)
#### Defined in
[embedding/openai.ts:48](https://github.com/lancedb/lancedb/blob/9d178c7/nodejs/lancedb/embedding/openai.ts#L48)