lancedb/docs/src/python/duckdb.md
2023-10-14 14:07:43 -07:00

DuckDB

LanceDB works with DuckDB through PyArrow: a LanceDB table can be exported as a PyArrow table, which DuckDB can then query with SQL.

Start by installing duckdb and lancedb:

pip install duckdb lancedb

We will re-use the dataset created previously:

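import lancedb

# Connect to (or create) a local LanceDB database directory
db = lancedb.connect("data/sample-lancedb")
data = [
    {"vector": [3.1, 4.1], "item": "foo", "price": 10.0},
    {"vector": [5.9, 26.5], "item": "bar", "price": 20.0}
]
# If "pd_table" already exists, use db.open_table("pd_table") instead
table = db.create_table("pd_table", data=data)
# Export the LanceDB table as a PyArrow table
arrow_table = table.to_arrow()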

DuckDB can query arrow_table directly. The table name in the SQL statement is resolved against the Python variable of the same name:

import duckdb

duckdb.query("SELECT * FROM arrow_table")
┌─────────────┬─────────┬────────┐
│   vector    │  item   │ price  │
│   float[]   │ varchar │ double │
├─────────────┼─────────┼────────┤
│ [3.1, 4.1]  │ foo     │   10.0 │
│ [5.9, 26.5] │ bar     │   20.0 │
└─────────────┴─────────┴────────┘
duckdb.query("SELECT mean(price) FROM arrow_table")
┌─────────────┐
│ mean(price) │
│   double    │
├─────────────┤
│        15.0 │
└─────────────┘