Files
lancedb/python/python/tests/test_duckdb.py
Weston Pace c998a47e17 feat: add a pyarrow dataset adapater for LanceDB tables (#1902)
This currently only works for local tables (remote tables cannot be
queried)
This is also exclusive to the sync interface. However, since the pyarrow
dataset interface is synchronous I am not sure if there is much value in
making an async-wrapping variant.

In addition, I added a `to_batches` method to the base query in the sync
API. This already exists in the async API. In the sync API this PR only
adds support for vector queries and scalar queries and not for hybrid or
FTS queries.
2024-12-03 15:42:54 -08:00

22 lines
579 B
Python

import duckdb
import pyarrow as pa
import lancedb
from lancedb.integrations.pyarrow import PyarrowDatasetAdapter
def test_basic_query(tmp_path):
data = pa.table({"x": [1, 2, 3, 4], "y": [5, 6, 7, 8]})
conn = lancedb.connect(tmp_path)
tbl = conn.create_table("test", data)
adapter = PyarrowDatasetAdapter(tbl) # noqa: F841
duck_conn = duckdb.connect()
results = duck_conn.sql("SELECT SUM(x) FROM adapter").fetchall()
assert results[0][0] == 10
results = duck_conn.sql("SELECT SUM(y) FROM adapter").fetchall()
assert results[0][0] == 26