mirror of
https://github.com/lancedb/lancedb.git
synced 2025-12-26 22:59:57 +00:00
1.4 KiB
1.4 KiB
DuckDB
LanceDB works with DuckDB via PyArrow integration.
Let us start with installing duckdb and lancedb.
pip install duckdb lancedb
We will re-use the dataset created previously:
import lancedb
db = lancedb.connect("data/sample-lancedb")
data = [
{"vector": [3.1, 4.1], "item": "foo", "price": 10.0},
{"vector": [5.9, 26.5], "item": "bar", "price": 20.0}
]
table = db.create_table("pd_table", data=data)
arrow_table = table.to_arrow()
DuckDB can directly query the arrow_table:
import duckdb
duckdb.query("SELECT * FROM arrow_table")
┌─────────────┬─────────┬────────┐
│ vector │ item │ price │
│ float[] │ varchar │ double │
├─────────────┼─────────┼────────┤
│ [3.1, 4.1] │ foo │ 10.0 │
│ [5.9, 26.5] │ bar │ 20.0 │
└─────────────┴─────────┴────────┘
duckdb.query("SELECT mean(price) FROM arrow_table")
┌─────────────┐
│ mean(price) │
│ double │
├─────────────┤
│ 15.0 │
└─────────────┘