Closes #721

FTS returns results as a pyarrow table. Pyarrow tables have a `filter` method, but it does not take SQL filter strings (only pyarrow compute expressions). Instead, we do one of two things to support `tbl.search("keywords").where("foo=5").limit(10).to_arrow()`:

Default path: If duckdb is available, use duckdb to execute the SQL filter string on the pyarrow table.
Backup path: Otherwise, write the pyarrow table to a lance dataset and then call `to_table(filter=<filter>)`.

Neither is ideal. The default path has two issues:
1. It requires installing an extra library (duckdb).
2. duckdb mangles some fields (for example, fixed size list => list).

The backup path incurs a latency penalty (~20ms on SSD) to write the result set to disk.

In the short term, once #676 is addressed, we can write the dataset to "memory://" instead of disk, which makes the post-filter evaluate much more quickly (ETA next week). In the longer term, we'd like to evaluate the filter string on the pyarrow Table directly. One possibility is to use Substrait to generate pyarrow compute expressions from the SQL string. Alternatively, if there is enough progress on pyarrow, it could support Substrait expressions directly (no ETA).

---------

Co-authored-by: Will Jones <willjones127@gmail.com>
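The default and backup paths described in the commit message above can be sketched roughly as follows. This is a minimal illustration, not the code from the change itself: the function name `post_filter` and the registered view name `results` are hypothetical, and it assumes `duckdb` and/or `lance` (installed as `pylance`) are available alongside `pyarrow`.

```python
import tempfile


def post_filter(table, where):
    """Apply a SQL filter string to a pyarrow Table (hypothetical helper)."""
    try:
        import duckdb  # optional dependency: default path
    except ImportError:
        duckdb = None

    if duckdb is not None:
        con = duckdb.connect()  # in-memory DuckDB database
        try:
            # Expose the pyarrow Table to SQL under the name "results".
            con.register("results", table)
            query = f"SELECT * FROM results WHERE {where}"
            return con.execute(query).fetch_arrow_table()
        finally:
            con.close()

    # Backup path: write to a temporary lance dataset, then filter on read.
    import lance

    with tempfile.TemporaryDirectory() as tmp:
        ds = lance.write_dataset(table, tmp)
        return ds.to_table(filter=where)


# Example usage (requires pyarrow plus duckdb or lance):
#   import pyarrow as pa
#   hits = pa.table({"foo": [5, 6, 5], "text": ["a", "b", "c"]})
#   filtered = post_filter(hits, "foo = 5")
```

The filter string is interpolated into the query as-is, which mirrors the idea of passing a user-supplied SQL predicate through, so it should only be used with trusted input.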
LanceDB Documentation
LanceDB docs are deployed to https://lancedb.github.io/lancedb/.
The docs are built and deployed automatically by GitHub Actions
whenever a commit is pushed to the main branch, so the published docs may describe
unreleased features.
Building the docs
Setup
- Install LanceDB. From LanceDB repo root:
  pip install -e python
- Install dependencies. From LanceDB repo root:
  pip install -r docs/requirements.txt
- Make sure you have node and npm set up
- Make sure protobuf and libssl are installed
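Before building, it can help to confirm that the command-line prerequisites from the list above are on your `PATH`. The sketch below is only a convenience check: the helper name `missing_tools` is hypothetical, and using `protoc` as the executable name for protobuf is an assumption. It does not check for libssl, which is a library rather than a command.

```python
import shutil


def missing_tools(tools):
    """Return the names in `tools` that cannot be found on PATH."""
    return [name for name in tools if shutil.which(name) is None]


# Assumed executable names for the prerequisites listed above.
REQUIRED_TOOLS = ["node", "npm", "protoc"]

missing = missing_tools(REQUIRED_TOOLS)
if missing:
    print("Missing tools:", ", ".join(missing))
else:
    print("All listed tools were found on PATH.")
```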
Building the node module and creating markdown files
Build docs
From LanceDB repo root:
Run: PYTHONPATH=. mkdocs build -f docs/mkdocs.yml
If successful, you should see a docs/site directory that you can verify locally.
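If you prefer to script the build step, the command above can be wrapped as in the sketch below. The function names `docs_build_command` and `build_docs` are hypothetical. The sketch assumes `mkdocs` is installed and that `repo_root` points at the LanceDB repo root.

```python
import os
import subprocess
from pathlib import Path


def docs_build_command():
    """Return the mkdocs invocation used to build the docs."""
    return ["mkdocs", "build", "-f", "docs/mkdocs.yml"]


def build_docs(repo_root="."):
    """Run the docs build from `repo_root` and return the site directory."""
    root = Path(repo_root)
    # Equivalent to prefixing the command with PYTHONPATH=.
    env = dict(os.environ, PYTHONPATH=".")
    subprocess.run(docs_build_command(), cwd=root, env=env, check=True)

    site = root / "docs" / "site"
    if not site.is_dir():
        raise FileNotFoundError(f"expected build output at {site}")
    return site


# Example usage, from the LanceDB repo root:
#   site_dir = build_docs(".")
```

One way to inspect the result locally is to serve the output directory with Python's built-in server, for example `python -m http.server --directory docs/site`.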