DuckDB

LanceDB works with DuckDB through PyArrow: a LanceDB table can be exported as an Arrow table, which DuckDB can then query with SQL.

Start by installing duckdb and lancedb:

pip install duckdb lancedb

We will reuse the dataset created previously and export it to an Arrow table with to_arrow():

import pandas as pd
import lancedb

db = lancedb.connect("data/sample-lancedb")
data = pd.DataFrame({
    "vector": [[3.1, 4.1], [5.9, 26.5]],
    "item": ["foo", "bar"],
    "price": [10.0, 20.0]
})
table = db.create_table("pd_table", data=data)
arrow_table = table.to_arrow()

DuckDB can query arrow_table directly; the SQL refers to the Python variable by name:

import duckdb

duckdb.query("SELECT * FROM arrow_table")
┌─────────────┬─────────┬────────┐
│   vector    │  item   │ price  │
│   float[]   │ varchar │ double │
├─────────────┼─────────┼────────┤
│ [3.1, 4.1]  │ foo     │   10.0 │
│ [5.9, 26.5] │ bar     │   20.0 │
└─────────────┴─────────┴────────┘
duckdb.query("SELECT mean(price) FROM arrow_table")
┌─────────────┐
│ mean(price) │
│   double    │
├─────────────┤
│        15.0 │
└─────────────┘