diff --git a/docs/src/js/README.md b/docs/src/js/README.md index 895872c3..fc2b81a5 100644 --- a/docs/src/js/README.md +++ b/docs/src/js/README.md @@ -40,37 +40,4 @@ The [quickstart](../basic.md) contains a more complete example. ## Development -```sh -npm run build -npm run test -``` - -### Running lint / format - -LanceDb uses [biome](https://biomejs.dev/) for linting and formatting. if you are using VSCode you will need to install the official [Biome](https://marketplace.visualstudio.com/items?itemName=biomejs.biome) extension. -To manually lint your code you can run: - -```sh -npm run lint -``` - -to automatically fix all fixable issues: - -```sh -npm run lint-fix -``` - -If you do not have your workspace root set to the `nodejs` directory, unfortunately the extension will not work. You can still run the linting and formatting commands manually. - -### Generating docs - -```sh -npm run docs - -cd ../docs -# Asssume the virtual environment was created -# python3 -m venv venv -# pip install -r requirements.txt -. ./venv/bin/activate -mkdocs build -``` +See [CONTRIBUTING.md](_media/CONTRIBUTING.md) for information on how to contribute to LanceDB. diff --git a/docs/src/js/_media/CONTRIBUTING.md b/docs/src/js/_media/CONTRIBUTING.md new file mode 100644 index 00000000..c8a347ea --- /dev/null +++ b/docs/src/js/_media/CONTRIBUTING.md @@ -0,0 +1,76 @@ +# Contributing to LanceDB Typescript + +This document outlines the process for contributing to LanceDB Typescript. +For general contribution guidelines, see [CONTRIBUTING.md](../CONTRIBUTING.md). + +## Project layout + +The Typescript package is a wrapper around the Rust library, `lancedb`. We use +the [napi-rs](https://napi.rs/) library to create the bindings between Rust and +Typescript. 
+ +* `src/`: Rust bindings source code +* `lancedb/`: Typescript package source code +* `__test__/`: Unit tests +* `examples/`: An npm package with the examples shown in the documentation + +## Development environment + +To set up your development environment, you will need to install the following: + +1. Node.js 14 or later +2. Rust's package manager, Cargo. Use [rustup](https://rustup.rs/) to install. +3. [protoc](https://grpc.io/docs/protoc-installation/) (Protocol Buffers compiler) + +Initial setup: + +```shell +npm install +``` + +### Commit Hooks + +It is **highly recommended** to install the [pre-commit](https://pre-commit.com/) hooks to ensure that your +code is formatted correctly and passes basic checks before committing: + +```shell +pre-commit install +``` + +## Development + +Most common development commands can be run using the npm scripts. + +Build the package + +```shell +npm install +npm run build +``` + +Lint: + +```shell +npm run lint +``` + +Format and fix lints: + +```shell +npm run lint-fix +``` + +Run tests: + +```shell +npm test +``` + +To run a single test: + +```shell +# Single file: table.test.ts +npm test -- table.test.ts +# Single test: 'merge insert' in table.test.ts +npm test -- table.test.ts --testNamePattern=merge\ insert +``` diff --git a/docs/src/js/classes/Table.md b/docs/src/js/classes/Table.md index 145ec945..370a941c 100644 --- a/docs/src/js/classes/Table.md +++ b/docs/src/js/classes/Table.md @@ -317,6 +317,32 @@ then call ``cleanup_files`` to remove the old files. *** +### dropIndex() + +```ts +abstract dropIndex(name): Promise<void> +``` + +Drop an index from the table. + +#### Parameters + +* **name**: `string` + The name of the index. + +#### Returns + +`Promise`<`void`> + +#### Note + +This does not delete the index from disk, it just removes it from the table. +To delete the index, run [Table#optimize](Table.md#optimize) after dropping the index. 
+ +Use [Table.listIndices](Table.md#listindices) to find the names of the indices. + +*** + ### indexStats() ```ts @@ -336,6 +362,8 @@ List all the stats of a specified index The stats of the index. If the index does not exist, it will return undefined +Use [Table.listIndices](Table.md#listindices) to find the names of the indices. + *** ### isOpen() diff --git a/docs/src/js/classes/VectorQuery.md b/docs/src/js/classes/VectorQuery.md index d24da6fb..b970ea7e 100644 --- a/docs/src/js/classes/VectorQuery.md +++ b/docs/src/js/classes/VectorQuery.md @@ -128,6 +128,24 @@ whose data type is a fixed-size-list of floats. *** +### distanceRange() + +```ts +distanceRange(lowerBound?, upperBound?): VectorQuery +``` + +#### Parameters + +* **lowerBound?**: `number` + +* **upperBound?**: `number` + +#### Returns + +[`VectorQuery`](VectorQuery.md) + +*** + ### distanceType() ```ts @@ -528,6 +546,22 @@ distance between the query vector and the actual uncompressed vector. *** +### rerank() + +```ts +rerank(reranker): VectorQuery +``` + +#### Parameters + +* **reranker**: [`Reranker`](../namespaces/rerankers/interfaces/Reranker.md) + +#### Returns + +[`VectorQuery`](VectorQuery.md) + +*** + ### select() ```ts diff --git a/docs/src/js/globals.md b/docs/src/js/globals.md index a0222172..7e4758fd 100644 --- a/docs/src/js/globals.md +++ b/docs/src/js/globals.md @@ -7,6 +7,7 @@ ## Namespaces - [embedding](namespaces/embedding/README.md) +- [rerankers](namespaces/rerankers/README.md) ## Enumerations diff --git a/docs/src/js/interfaces/IvfPqOptions.md b/docs/src/js/interfaces/IvfPqOptions.md index dc2a5c33..a2b1bda1 100644 --- a/docs/src/js/interfaces/IvfPqOptions.md +++ b/docs/src/js/interfaces/IvfPqOptions.md @@ -68,6 +68,21 @@ The default value is 50. *** +### numBits? + +```ts +optional numBits: number; +``` + +Number of bits per sub-vector. + +This value controls how much each subvector is compressed. The more bits the more +accurate the index will be but the slower search. 
The default is 8 bits. + +The number of bits must be 4 or 8. + +*** + ### numPartitions? ```ts diff --git a/docs/src/js/namespaces/rerankers/README.md b/docs/src/js/namespaces/rerankers/README.md new file mode 100644 index 00000000..0ddacfb3 --- /dev/null +++ b/docs/src/js/namespaces/rerankers/README.md @@ -0,0 +1,17 @@ +[**@lancedb/lancedb**](../../README.md) • **Docs** + +*** + +[@lancedb/lancedb](../../globals.md) / rerankers + +# rerankers + +## Index + +### Classes + +- [RRFReranker](classes/RRFReranker.md) + +### Interfaces + +- [Reranker](interfaces/Reranker.md) diff --git a/docs/src/js/namespaces/rerankers/classes/RRFReranker.md b/docs/src/js/namespaces/rerankers/classes/RRFReranker.md new file mode 100644 index 00000000..7beb9aff --- /dev/null +++ b/docs/src/js/namespaces/rerankers/classes/RRFReranker.md @@ -0,0 +1,66 @@ +[**@lancedb/lancedb**](../../../README.md) • **Docs** + +*** + +[@lancedb/lancedb](../../../globals.md) / [rerankers](../README.md) / RRFReranker + +# Class: RRFReranker + +Reranks the results using the Reciprocal Rank Fusion (RRF) algorithm. 
+ +Internally this uses the Rust implementation + +## Constructors + +### new RRFReranker() + +```ts +new RRFReranker(inner): RRFReranker +``` + +#### Parameters + +* **inner**: `RrfReranker` + +#### Returns + +[`RRFReranker`](RRFReranker.md) + +## Methods + +### rerankHybrid() + +```ts +rerankHybrid( + query, + vecResults, + ftsResults): Promise<RecordBatch<any>> +``` + +#### Parameters + +* **query**: `string` + +* **vecResults**: `RecordBatch`<`any`> + +* **ftsResults**: `RecordBatch`<`any`> + +#### Returns + +`Promise`<`RecordBatch`<`any`>> + +*** + +### create() + +```ts +static create(k): Promise<RRFReranker> +``` + +#### Parameters + +* **k**: `number` = `60` + +#### Returns + +`Promise`<[`RRFReranker`](RRFReranker.md)> diff --git a/docs/src/js/namespaces/rerankers/interfaces/Reranker.md b/docs/src/js/namespaces/rerankers/interfaces/Reranker.md new file mode 100644 index 00000000..1d056673 --- /dev/null +++ b/docs/src/js/namespaces/rerankers/interfaces/Reranker.md @@ -0,0 +1,30 @@ +[**@lancedb/lancedb**](../../../README.md) • **Docs** + +*** + +[@lancedb/lancedb](../../../globals.md) / [rerankers](../README.md) / Reranker + +# Interface: Reranker + +## Methods + +### rerankHybrid() + +```ts +rerankHybrid( + query, + vecResults, + ftsResults): Promise<RecordBatch<any>> +``` + +#### Parameters + +* **query**: `string` + +* **vecResults**: `RecordBatch`<`any`> + +* **ftsResults**: `RecordBatch`<`any`> + +#### Returns + +`Promise`<`RecordBatch`<`any`>> diff --git a/nodejs/__test__/table.test.ts b/nodejs/__test__/table.test.ts index 09f4cd7a..33e0daa2 100644 --- a/nodejs/__test__/table.test.ts +++ b/nodejs/__test__/table.test.ts @@ -473,6 +473,10 @@ describe("When creating an index", () => { // test offset rst = await tbl.query().limit(2).offset(1).nearestTo(queryVec).toArrow(); expect(rst.numRows).toBe(1); + + await tbl.dropIndex("vec_idx"); + const indices2 = await tbl.listIndices(); + expect(indices2.length).toBe(0); }); it("should search with distance range", async () => { diff --git 
a/nodejs/lancedb/table.ts b/nodejs/lancedb/table.ts index f3158c7d..b581ea30 100644 --- a/nodejs/lancedb/table.ts +++ b/nodejs/lancedb/table.ts @@ -226,6 +226,19 @@ export abstract class Table { column: string, options?: Partial, ): Promise; + + /** + * Drop an index from the table. + * + * @param name The name of the index. + * + * @note This does not delete the index from disk, it just removes it from the table. + * To delete the index, run {@link Table#optimize} after dropping the index. + * + * Use {@link Table.listIndices} to find the names of the indices. + */ + abstract dropIndex(name: string): Promise<void>; + /** * Create a {@link Query} Builder. * @@ -426,6 +439,8 @@ export abstract class Table { * * @param {string} name The name of the index. * @returns {IndexStatistics | undefined} The stats of the index. If the index does not exist, it will return undefined + * + * Use {@link Table.listIndices} to find the names of the indices. */ abstract indexStats(name: string): Promise<IndexStatistics | undefined>; @@ -591,6 +606,10 @@ export class LocalTable extends Table { await this.inner.createIndex(nativeIndex, column, options?.replace); } + async dropIndex(name: string): Promise<void> { + await this.inner.dropIndex(name); + } + query(): Query { return new Query(this.inner); } diff --git a/nodejs/src/table.rs b/nodejs/src/table.rs index 5a9f4298..5a674414 100644 --- a/nodejs/src/table.rs +++ b/nodejs/src/table.rs @@ -135,6 +135,14 @@ impl Table { builder.execute().await.default_error() } + #[napi(catch_unwind)] + pub async fn drop_index(&self, index_name: String) -> napi::Result<()> { + self.inner_ref()? 
+ .drop_index(&index_name) + .await + .default_error() + } + #[napi(catch_unwind)] pub async fn update( &self, diff --git a/python/python/lancedb/table.py b/python/python/lancedb/table.py index 10bd3316..e38cfcb2 100644 --- a/python/python/lancedb/table.py +++ b/python/python/lancedb/table.py @@ -586,6 +586,26 @@ class Table(ABC): """ raise NotImplementedError + def drop_index(self, name: str) -> None: + """ + Drop an index from the table. + + Parameters + ---------- + name: str + The name of the index to drop. + + Notes + ----- + This does not delete the index from disk, it just removes it from the table. + To delete the index, run [optimize][lancedb.table.Table.optimize] + after dropping the index. + + Use [list_indices][lancedb.table.Table.list_indices] to find the names of + the indices. + """ + raise NotImplementedError + @abstractmethod def create_scalar_index( self, @@ -1594,6 +1614,9 @@ class LanceTable(Table): ) ) + def drop_index(self, name: str) -> None: + return LOOP.run(self._table.drop_index(name)) + def create_scalar_index( self, column: str, @@ -2716,6 +2739,26 @@ class AsyncTable: add_note(e, help_msg) raise e + async def drop_index(self, name: str) -> None: + """ + Drop an index from the table. + + Parameters + ---------- + name: str + The name of the index to drop. + + Notes + ----- + This does not delete the index from disk, it just removes it from the table. + To delete the index, run [optimize][lancedb.table.AsyncTable.optimize] + after dropping the index. + + Use [list_indices][lancedb.table.AsyncTable.list_indices] to find the names + of the indices. 
+ """ + await self._inner.drop_index(name) + async def add( self, data: DATA, diff --git a/python/python/tests/test_index.py b/python/python/tests/test_index.py index 6cdee77c..aefecc79 100644 --- a/python/python/tests/test_index.py +++ b/python/python/tests/test_index.py @@ -80,6 +80,10 @@ async def test_create_scalar_index(some_table: AsyncTable): # can also specify index type await some_table.create_index("id", config=BTree()) + await some_table.drop_index("id_idx") + indices = await some_table.list_indices() + assert len(indices) == 0 + @pytest.mark.asyncio async def test_create_bitmap_index(some_table: AsyncTable): diff --git a/python/python/tests/test_table.py b/python/python/tests/test_table.py index 2fdf73da..82b29db0 100644 --- a/python/python/tests/test_table.py +++ b/python/python/tests/test_table.py @@ -1008,6 +1008,10 @@ def test_create_scalar_index(mem_db: DBConnection): results = table.search([5, 5]).where("x != 'b'").to_arrow() assert results["_distance"][0].as_py() > 0 + table.drop_index(scalar_index.name) + indices = table.list_indices() + assert len(indices) == 0 + def test_empty_query(mem_db: DBConnection): table = mem_db.create_table( diff --git a/python/src/table.rs b/python/src/table.rs index 59ac29ce..211487fa 100644 --- a/python/src/table.rs +++ b/python/src/table.rs @@ -194,6 +194,14 @@ impl Table { }) } + pub fn drop_index(self_: PyRef<'_, Self>, index_name: String) -> PyResult> { + let inner = self_.inner_ref()?.clone(); + future_into_py(self_.py(), async move { + inner.drop_index(&index_name).await.infer_error()?; + Ok(()) + }) + } + pub fn list_indices(self_: PyRef<'_, Self>) -> PyResult> { let inner = self_.inner_ref()?.clone(); future_into_py(self_.py(), async move { diff --git a/rust/lancedb/src/remote/table.rs b/rust/lancedb/src/remote/table.rs index 9186667a..a98996db 100644 --- a/rust/lancedb/src/remote/table.rs +++ b/rust/lancedb/src/remote/table.rs @@ -816,6 +816,14 @@ impl TableInternal for RemoteTable { Ok(Some(stats)) } + + 
/// Not yet supported on LanceDB Cloud. + async fn drop_index(&self, _name: &str) -> Result<()> { + Err(Error::NotSupported { + message: "Drop index is not yet supported on LanceDB Cloud.".into(), + }) + } + async fn table_definition(&self) -> Result { Err(Error::NotSupported { message: "table_definition is not supported on LanceDB cloud.".into(), diff --git a/rust/lancedb/src/table.rs b/rust/lancedb/src/table.rs index 4022d588..1d8cce32 100644 --- a/rust/lancedb/src/table.rs +++ b/rust/lancedb/src/table.rs @@ -410,6 +410,7 @@ pub(crate) trait TableInternal: std::fmt::Display + std::fmt::Debug + Send + Syn async fn update(&self, update: UpdateBuilder) -> Result; async fn create_index(&self, index: IndexBuilder) -> Result<()>; async fn list_indices(&self) -> Result>; + async fn drop_index(&self, name: &str) -> Result<()>; async fn index_stats(&self, index_name: &str) -> Result>; async fn merge_insert( &self, @@ -984,6 +985,18 @@ impl Table { self.inner.index_stats(index_name.as_ref()).await } + /// Drop an index from the table. + /// + /// Note: This is not yet available in LanceDB cloud. + /// + /// This does not delete the index from disk, it just removes it from the table. + /// To delete the index, run [`Self::optimize()`] after dropping the index. + /// + /// Use [`Self::list_indices()`] to find the names of the indices. + pub async fn drop_index(&self, name: &str) -> Result<()> { + self.inner.drop_index(name).await + } + // Take many execution plans and map them into a single plan that adds // a query_index column and unions them. 
pub(crate) fn multi_vector_plan( @@ -1871,6 +1884,12 @@ impl TableInternal for NativeTable { } } + async fn drop_index(&self, index_name: &str) -> Result<()> { + let mut dataset = self.dataset.get_mut().await?; + dataset.drop_index(index_name).await?; + Ok(()) + } + async fn update(&self, update: UpdateBuilder) -> Result { let dataset = self.dataset.get().await?.clone(); let mut builder = LanceUpdateBuilder::new(Arc::new(dataset)); @@ -2897,6 +2916,9 @@ mod tests { assert_eq!(stats.num_unindexed_rows, 0); assert_eq!(stats.index_type, crate::index::IndexType::IvfPq); assert_eq!(stats.distance_type, Some(crate::DistanceType::L2)); + + table.drop_index(index_name).await.unwrap(); + assert_eq!(table.list_indices().await.unwrap().len(), 0); } #[tokio::test]