feat: search multiple query vectors as one query (#1811)

Allows users to pass multiple query vector as part of a single query
plan. This just runs the queries in parallel without any further
optimization. It's mostly a convenience.

Previously, I think this was only handled by the sync Python remote API.
This makes it common across all SDKs.

Closes https://github.com/lancedb/lancedb/issues/1803

```python
>>> import lancedb
>>> import asyncio
>>> 
>>> async def main():
...     db = await lancedb.connect_async("./demo")
...     table = await db.create_table("demo", [{"id": 1, "vector": [1, 2, 3]}, {"id": 2, "vector": [4, 5, 6]}], mode="overwrite")
...     return await table.query().nearest_to([[1.0, 2.0, 3.0], [4.0, 5.0, 6.0], [4.0, 5.0, 6.0]]).limit(1).to_pandas()
... 
>>> asyncio.run(main())
   query_index  id           vector  _distance
0            2   2  [4.0, 5.0, 6.0]        0.0
1            1   2  [4.0, 5.0, 6.0]        0.0
2            0   1  [1.0, 2.0, 3.0]        0.0
```
This commit is contained in:
Will Jones
2024-11-13 16:05:16 -08:00
committed by GitHub
parent 0fd8a50bd7
commit abd75e0ead
9 changed files with 366 additions and 64 deletions

View File

@@ -998,4 +998,18 @@ describe("column name options", () => {
const results = await table.query().where("`camelCase` = 1").toArray();
expect(results[0].camelCase).toBe(1);
});
test("can make multiple vector queries in one go", async () => {
const results = await table
.query()
.nearestTo([0.1, 0.2])
.addQueryVector([0.1, 0.2])
.limit(1)
.toArray();
console.log(results);
expect(results.length).toBe(2);
results.sort((a, b) => a.query_index - b.query_index);
expect(results[0].query_index).toBe(0);
expect(results[1].query_index).toBe(1);
});
});