mirror of
https://github.com/lancedb/lancedb.git
synced 2025-12-26 22:59:57 +00:00
first off, apologies for any folly since i'm new to contributing to lancedb. this PR is the continuation of [a discord thread](https://discord.com/channels/1030247538198061086/1030247538667827251/1278844345713299599): ## user story here's the lance db search query i'd like to run: ``` def search(phrase): logger.info(f'Searching for phrase: {phrase}') phrase_embedding = get_embedding(phrase) df = (table.search((phrase_embedding, phrase), query_type='hybrid') .limit(10).to_list()) logger.info(f'Success search with row count: {len(df)}') search('howdy (howdy)') search('howdy(howdy)') ``` the second search fails due to `ValueError: Syntax Error: howdy(howdy)` i saw on the [docs](https://lancedb.github.io/lancedb/fts/#phrase-queries-vs-terms-queries) that i can use `phrase_query()` to [enable a flag](https://github.com/lancedb/lancedb/blob/main/python/python/lancedb/query.py#L790-L792) to wrap the query in double quotes (as well as sanitize single quotes) prior to sending the query to search. this works for [normal FTS](https://lancedb.github.io/lancedb/fts/), but the command is unavailable on [hybrid search](https://lancedb.github.io/lancedb/hybrid_search/hybrid_search/). ## changes i added `phrase_query()` function to `LanceHybridQueryBuilder` by propagating the call down to its `self. _fts_query` object. i'm not too familiar with the codebase and am not sure if this is the best way to implement the functionality. feel free to riff on this PR or discard ## tests ``` (lancedb) JamesMPB:python james$ pwd /Users/james/src/lancedb/python (lancedb) JamesMPB:python james$ pytest python/tests/test_table.py python/tests/test_table.py ....................................... [100%] ====================================================== 39 passed, 1 warning in 2.23s ======================================================= ```
LanceDB
A Python library for LanceDB.
Installation
pip install lancedb
Usage
Basic Example
import lancedb
db = lancedb.connect('<PATH_TO_LANCEDB_DATASET>')
table = db.open_table('my_table')
results = table.search([0.1, 0.3]).limit(20).to_list()
print(results)
Development
LanceDb is based on the rust crate lancedb and is built with maturin. In order to build with maturin
you will either need a conda environment or a virtual environment (venv).
python -m venv venv
. ./venv/bin/activate
Install the necessary packages:
python -m pip install .[tests,dev]
To build the python package you can use maturin:
# This will build the rust bindings and place them in the appropriate place
# in your venv or conda environment
maturin develop
To run the unit tests:
pytest
To run the doc tests:
pytest --doctest-modules python/lancedb
To run linter and automatically fix all errors:
ruff format python
ruff --fix python
If any packages are missing, install them with:
pip install <PACKAGE_NAME>
For Windows users, there may be errors when installing packages, so these commands may be helpful:
Activate the virtual environment:
. .\venv\Scripts\activate
You may need to run the installs separately:
pip install -e .[tests]
pip install -e .[dev]
tantivy requires rust to be installed, so install it with conda, as it doesn't support windows installation:
pip install wheel
pip install cargo
conda install rust
pip install tantivy