# DuckDB

`LanceDB` works with `DuckDB` via [PyArrow integration](https://duckdb.org/docs/guides/python/sql_on_arrow).

Let us start by installing `duckdb` and `lancedb`.

```shell
pip install duckdb lancedb
```

We will re-use [the dataset created previously](./arrow.md):

```python
import lancedb

# Connect to (or create) a local LanceDB database directory
db = lancedb.connect("data/sample-lancedb")
data = [
    {"vector": [3.1, 4.1], "item": "foo", "price": 10.0},
    {"vector": [5.9, 26.5], "item": "bar", "price": 20.0}
]
table = db.create_table("pd_table", data=data)
# Export the LanceDB table as a PyArrow table
arrow_table = table.to_arrow()
```

`DuckDB` can directly query the `arrow_table`:

```python
import duckdb

duckdb.query("SELECT * FROM arrow_table")
```

```
┌─────────────┬─────────┬────────┐
│   vector    │  item   │ price  │
│   float[]   │ varchar │ double │
├─────────────┼─────────┼────────┤
│ [3.1, 4.1]  │ foo     │   10.0 │
│ [5.9, 26.5] │ bar     │   20.0 │
└─────────────┴─────────┴────────┘
```

```python
duckdb.query("SELECT mean(price) FROM arrow_table")
```

```
┌─────────────┐
│ mean(price) │
│   double    │
├─────────────┤
│        15.0 │
└─────────────┘
```