DuckDB
LanceDB works with DuckDB through its PyArrow integration: a LanceDB table can be exported as a PyArrow table, which DuckDB can query in place.
Start by installing duckdb and lancedb:
pip install duckdb lancedb
We will re-use the dataset created previously:
import pandas as pd
import lancedb
db = lancedb.connect("data/sample-lancedb")
data = pd.DataFrame({
    "vector": [[3.1, 4.1], [5.9, 26.5]],
    "item": ["foo", "bar"],
    "price": [10.0, 20.0],
})
table = db.create_table("pd_table", data=data)
arrow_table = table.to_arrow()
DuckDB can query arrow_table directly. It resolves the table name in the SQL string to the Python variable of the same name that is in scope:
import duckdb
duckdb.query("SELECT * FROM arrow_table")
┌─────────────┬─────────┬────────┐
│   vector    │  item   │ price  │
│   float[]   │ varchar │ double │
├─────────────┼─────────┼────────┤
│ [3.1, 4.1]  │ foo     │   10.0 │
│ [5.9, 26.5] │ bar     │   20.0 │
└─────────────┴─────────┴────────┘
duckdb.query("SELECT mean(price) FROM arrow_table")
┌─────────────┐
│ mean(price) │
│   double    │
├─────────────┤
│        15.0 │
└─────────────┘