Chang She 3c46d7f268 Handle NaN input data (#241)
LangChain sometimes inserts a single `[np.nan]` as a placeholder when
the embedding function fails. This is a problem for the Lance format
because the array can then no longer be stored as a FixedSizeListArray.

Instead:
1. By default, rows whose embedding length is less than the maximum
length in the batch are removed
2. If the `strict=True` kwarg is set, a `ValueError` is raised
if the embeddings aren't all the same length
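
The two behaviors described above can be sketched roughly as follows. This is an illustrative sketch only, not the code from this commit: the helper name `sanitize_embeddings` and its signature are assumptions made for the example, and plain Python lists stand in for the real arrays.

```python
import math
from typing import List, Sequence


def sanitize_embeddings(
    embeddings: Sequence[Sequence[float]], strict: bool = False
) -> List[List[float]]:
    """Hypothetical helper (not LanceDB's API): enforce one embedding length per batch."""
    if not embeddings:
        return []
    # The longest embedding in the batch is treated as the expected length.
    max_len = max(len(e) for e in embeddings)
    # In strict mode, any length mismatch is an error.
    if strict and any(len(e) != max_len for e in embeddings):
        raise ValueError("embeddings must all have the same length")
    # Otherwise, drop rows shorter than the maximum (e.g. a lone [nan] placeholder).
    return [list(e) for e in embeddings if len(e) == max_len]


batch = [[0.1, 0.3], [math.nan], [0.2, 0.4]]
print(sanitize_embeddings(batch))  # prints [[0.1, 0.3], [0.2, 0.4]]
```

With `strict=True`, the same batch would raise a `ValueError` instead of silently dropping the placeholder row.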

---------

Co-authored-by: Chang She <chang@lancedb.com>
2023-07-04 20:00:46 -07:00

LanceDB

A Python library for LanceDB.

Installation

pip install lancedb

Usage

Basic Example

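import lancedb

# Connect to the database at the given path
db = lancedb.connect('<PATH_TO_LANCEDB_DATASET>')
# Open an existing table by name
table = db.open_table('my_table')
# Search for the 20 rows nearest to the query vector and load them into a DataFrame
results = table.search([0.1, 0.3]).limit(20).to_df()
print(results)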

Development

Create a virtual environment and activate it:

python -m venv venv
. ./venv/bin/activate

Install the necessary packages:

python -m pip install .

To run the unit tests:

pytest

To run the formatters and automatically fix formatting and import-order errors:

black .
isort .

If any packages are missing, install them with:

pip install <PACKAGE_NAME>

Windows users may run into errors when installing packages. The following commands may help:

Activate the virtual environment:

. .\venv\Scripts\activate

You may need to run the installs separately:

pip install -e .[tests]
pip install -e .[dev]

tantivy requires Rust to be installed. On Windows, install Rust with conda:

pip install wheel
pip install cargo
conda install rust
pip install tantivy

To run the unit tests:

pytest