feat: search multiple query vectors as one query (#1811)

Allows users to pass multiple query vectors as part of a single query
plan. This just runs the queries in parallel without any further
optimization. It's mostly a convenience.

Previously, I think this was only handled by the sync Python remote API.
This makes it common across all SDKs.

Closes https://github.com/lancedb/lancedb/issues/1803

```python
>>> import lancedb
>>> import asyncio
>>> 
>>> async def main():
...     db = await lancedb.connect_async("./demo")
...     table = await db.create_table("demo", [{"id": 1, "vector": [1, 2, 3]}, {"id": 2, "vector": [4, 5, 6]}], mode="overwrite")
...     return await table.query().nearest_to([[1.0, 2.0, 3.0], [4.0, 5.0, 6.0], [4.0, 5.0, 6.0]]).limit(1).to_pandas()
... 
>>> asyncio.run(main())
   query_index  id           vector  _distance
0            2   2  [4.0, 5.0, 6.0]        0.0
1            1   2  [4.0, 5.0, 6.0]        0.0
2            0   1  [1.0, 2.0, 3.0]        0.0
```
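
A rough mental model of "runs the queries in parallel without any further optimization" is sketched below. This is not LanceDB's implementation: `ROWS`, `l2_squared`, `search_one`, and `search_many` are hypothetical names invented for illustration, and the brute-force scan stands in for whatever the real query plan does. It only shows the shape of the result, with one independent search per query vector and each row tagged with the `query_index` it came from.

```python
import asyncio

# Hypothetical in-memory "table"; an assumption for illustration only,
# not how LanceDB stores data.
ROWS = [
    {"id": 1, "vector": [1.0, 2.0, 3.0]},
    {"id": 2, "vector": [4.0, 5.0, 6.0]},
]


def l2_squared(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


async def search_one(query_index, query, limit):
    """Brute-force nearest-neighbour search for a single query vector."""
    scored = [
        {"query_index": query_index, **row, "_distance": l2_squared(row["vector"], query)}
        for row in ROWS
    ]
    scored.sort(key=lambda r: r["_distance"])
    return scored[:limit]


async def search_many(queries, limit):
    """Run one search per query vector concurrently, then concatenate."""
    batches = await asyncio.gather(
        *(search_one(i, q, limit) for i, q in enumerate(queries))
    )
    return [row for batch in batches for row in batch]


results = asyncio.run(
    search_many([[1.0, 2.0, 3.0], [4.0, 5.0, 6.0], [4.0, 5.0, 6.0]], limit=1)
)
for row in results:
    print(row["query_index"], row["id"], row["_distance"])
```

In this sketch the rows come back in query order because `asyncio.gather` preserves argument order. The real output above is not in that order, so callers should probably group on `query_index` rather than rely on row position.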
Will Jones
2024-11-13 16:05:16 -08:00
committed by GitHub
parent 0fd8a50bd7
commit abd75e0ead
9 changed files with 366 additions and 64 deletions


@@ -135,6 +135,16 @@ impl VectorQuery {
        self.inner = self.inner.clone().column(&column);
    }
    #[napi]
    pub fn add_query_vector(&mut self, vector: Float32Array) -> Result<()> {
        self.inner = self
            .inner
            .clone()
            .add_query_vector(vector.as_ref())
            .default_error()?;
        Ok(())
    }
    #[napi]
    pub fn distance_type(&mut self, distance_type: String) -> napi::Result<()> {
        let distance_type = parse_distance_type(distance_type)?;