mirror of
https://github.com/lancedb/lancedb.git
synced 2026-01-06 03:42:57 +00:00
This is v1 of the full-text search (FTS) index integration for LanceDB.
# API
The query API is roughly the same as before, except that if the input is
text instead of a vector, we assume it is a full-text search (FTS) query.
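
The dispatch rule above can be sketched in a few lines. This is only an illustration: `infer_query_type` and the returned labels are hypothetical names, not LanceDB's actual internals.

```python
from typing import Sequence, Union

# Hypothetical alias: a query is either text or a vector of floats.
Query = Union[str, Sequence[float]]


def infer_query_type(query: Query) -> str:
    """Return "fts" for text input and "vector" for anything else."""
    # A str is itself a Sequence, so it has to be checked first.
    if isinstance(query, str):
        return "fts"
    return "vector"


print(infer_query_type("puppy"))          # fts
print(infer_query_type([0.1, 0.2, 0.3]))  # vector
```

Checking for `str` explicitly keeps the existing vector path untouched, which is presumably why the API can stay "roughly the same as before".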
## Example
If `table` is a LanceDB LanceTable, then:
Build index: `table.create_fts_index("text")`
Query: `df = table.search("puppy").limit(10).select(["text"]).to_df()`
# Implementation
We use the tantivy-py package to build the index. Row IDs serve as the
doc IDs in the full-text search index, so a query returns row IDs and a
Take operation then fetches the matching rows.
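
To make the "row ID as doc ID, then Take" flow concrete, here is a minimal pure-Python sketch. It swaps tantivy for a toy in-memory inverted index with whitespace tokenization and no relevance ranking; `build_index`, `search`, and `take` are hypothetical helpers, not tantivy-py or LanceDB functions.

```python
from collections import defaultdict

# Toy table: a row's position in the list plays the role of its row id.
rows = [
    {"text": "the puppy runs"},
    {"text": "a sleepy cat"},
    {"text": "puppy and cat play"},
]


def build_index(rows, column):
    """Map each lowercase token to the set of row ids that contain it."""
    index = defaultdict(set)
    for row_id, row in enumerate(rows):
        for token in row[column].lower().split():
            index[token].add(row_id)
    return index


def search(index, term, limit=10):
    """Return up to `limit` matching row ids in ascending order."""
    return sorted(index.get(term.lower(), ()))[:limit]


def take(rows, row_ids):
    """Fetch full rows by row id, mirroring the Take step."""
    return [rows[i] for i in row_ids]


index = build_index(rows, "text")
hits = search(index, "puppy")
print(hits)              # [0, 2]
print(take(rows, hits))
```

Because the index stores only row ids, the table data is never duplicated inside the index; the search step and the fetch step stay independent.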
# Limitations
1. Incremental row appends are not supported yet: new data won't show up
in search results.
2. Local filesystem only.
3. Requires building tantivy explicitly.
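
The first limitation follows from the index being a snapshot of the table at build time. The toy sketch below (a plain dict-based inverted index, not tantivy) shows a stale lookup after an append. Rebuilding the index is what fixes it in this toy; whether calling `create_fts_index` again is the right remedy in LanceDB is an assumption the notes above do not confirm.

```python
rows = ["the puppy runs", "a sleepy cat"]


def build_index(rows):
    """Snapshot index: token -> set of row ids, as of the time of the call."""
    index = {}
    for row_id, text in enumerate(rows):
        for token in text.lower().split():
            index.setdefault(token, set()).add(row_id)
    return index


index = build_index(rows)

# Append a row after the index was built; the index is not updated.
rows.append("another puppy")
stale_hits = sorted(index.get("puppy", ()))

# Rebuilding from scratch picks up the new row (toy behaviour only).
index = build_index(rows)
fresh_hits = sorted(index.get("puppy", ()))

print(stale_hits)  # [0]
print(fresh_hits)  # [0, 2]
```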
---------
Co-authored-by: Chang She <chang@lancedb.com>
58 lines
1.2 KiB
YAML
name: Python

on:
  push:
    branches:
      - main
  pull_request:
    paths:
      - python/**
      - .github/workflows/python.yml
jobs:
  linux:
    timeout-minutes: 30
    strategy:
      matrix:
        python-minor-version: [ "8", "9", "10", "11" ]
    runs-on: "ubuntu-22.04"
    defaults:
      run:
        shell: bash
        working-directory: python
    steps:
      - uses: actions/checkout@v3
        with:
          fetch-depth: 0
          lfs: true
      - name: Set up Python
        uses: actions/setup-python@v4
        with:
          python-version: 3.${{ matrix.python-minor-version }}
      - name: Install lancedb
        run: |
          pip install -e ".[fts]"
          pip install pytest
      - name: Run tests
        run: pytest -x -v --durations=30 tests
  mac:
    timeout-minutes: 30
    runs-on: "macos-12"
    defaults:
      run:
        shell: bash
        working-directory: python
    steps:
      - uses: actions/checkout@v3
        with:
          fetch-depth: 0
          lfs: true
      - name: Set up Python
        uses: actions/setup-python@v4
        with:
          python-version: "3.11"
      - name: Install lancedb
        run: |
          pip install -e ".[fts]"
          pip install pytest
      - name: Run tests
        run: pytest -x -v --durations=30 tests