lancedb/docs/src/guides/sql_querying.md

You can query your LanceDB tables with SQL using either DuckDB or Apache DataFusion. This guide shows how to do both.

We will re-use the dataset created previously:

```python
import lancedb

db = lancedb.connect("data/sample-lancedb")
data = [
    {"vector": [3.1, 4.1], "item": "foo", "price": 10.0},
    {"vector": [5.9, 26.5], "item": "bar", "price": 20.0}
]
table = db.create_table("pd_table", data=data)
```

## Querying a LanceDB Table with DuckDB

The `to_lance` method converts the LanceDB table to a `LanceDataset`, which DuckDB can read through its Arrow compatibility layer. To query the resulting dataset, reference the Python variable that holds it by name in your SQL query, as if it were a table.

```python
import duckdb

arrow_table = table.to_lance()

duckdb.query("SELECT * FROM arrow_table")
```

```
┌─────────────┬─────────┬────────┐
│   vector    │  item   │ price  │
│   float[]   │ varchar │ double │
├─────────────┼─────────┼────────┤
│ [3.1, 4.1]  │ foo     │   10.0 │
│ [5.9, 26.5] │ bar     │   20.0 │
└─────────────┴─────────┴────────┘
```

## Querying a LanceDB Table with Apache DataFusion

Add the required imports before running any queries.

=== "Python"

```python
--8<-- "python/python/tests/docs/test_guide_tables.py:import-lancedb"
--8<-- "python/python/tests/docs/test_guide_tables.py:import-session-context"
--8<-- "python/python/tests/docs/test_guide_tables.py:import-ffi-dataset"
```

Register the table with the DataFusion session context, then query it with SQL.

=== "Python"

```python
--8<-- "python/python/tests/docs/test_guide_tables.py:lance_sql_basic"
```

```
┌─────────────┬─────────┬────────┐
│   vector    │  item   │ price  │
│   float[]   │ varchar │ double │
├─────────────┼─────────┼────────┤
│ [3.1, 4.1]  │ foo     │   10.0 │
│ [5.9, 26.5] │ bar     │   20.0 │
└─────────────┴─────────┴────────┘
```