Mirror of https://github.com/lancedb/lancedb.git (synced 2026-07-03 11:00:40 +00:00)
## Summary

- When an embedding function returns an empty list (e.g. `[]`) for an input row, as can happen when a model produces no output for a blank string, `_append_vector_columns` crashed with `ArrowInvalid: Length of item not correct: expected N but got array of size 0` because PyArrow cannot fit a zero-length value into a fixed-size list element.
- The fix adds a validation step in `gen()`, inside `_append_vector_columns`, that replaces any vector whose length does not match the expected `ndims` (including empty lists and `None`) with `None` before `pa.array()` is called.
- `None` is a valid null in a PyArrow fixed-size list array, so the bad entry flows into `_handle_bad_vectors` and is handled according to the caller-supplied `on_bad_vectors` policy (`error` / `drop` / `fill` / `null`) instead of causing an unconditional crash.

## Test plan

- [ ] Added `test_embedding_with_empty_output_vectors` in `python/python/tests/test_embeddings.py` that uses an embedding function returning `[]` for empty-string inputs, calls `table.add(..., on_bad_vectors="drop")`, and asserts no crash and that bad rows are correctly dropped.
- [ ] Existing `test_embedding_with_bad_results` continues to pass (NaN vectors still handled correctly).
- [ ] Verified manually that `pa.array([[1.,2.,3.,4.], []], type=pa.list_(pa.float32(), 4))` raises `ArrowInvalid` without the fix, and succeeds with `None` in place of `[]`.

Fixes #1672

---------

Co-authored-by: Will Jones <willjones127@gmail.com>
Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
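
The validation step described in the summary can be sketched in plain Python. This is only an illustration under stated assumptions: `normalize_vectors` is a hypothetical helper, not the code inside `_append_vector_columns`, and it assumes the expected dimension `ndims` is already known. It shows the idea of swapping any wrongly sized vector for `None` so that a later fixed-size list conversion sees a null instead of a zero-length value.

```python
from typing import List, Optional, Sequence


def normalize_vectors(
    vectors: Sequence[Optional[Sequence[float]]], ndims: int
) -> List[Optional[List[float]]]:
    """Hypothetical helper: replace bad vectors with None.

    A vector is treated as bad when it is None or when its length
    differs from ndims (an empty list is one such case).
    """
    cleaned: List[Optional[List[float]]] = []
    for vec in vectors:
        if vec is None or len(vec) != ndims:
            # Wrong size: emit a null so a downstream policy can decide
            # whether to error, drop, fill, or keep the null.
            cleaned.append(None)
        else:
            cleaned.append(list(vec))
    return cleaned


raw = [[1.0, 2.0, 3.0, 4.0], [], None, [0.5, 0.5]]
cleaned = normalize_vectors(raw, ndims=4)
print(cleaned)  # [[1.0, 2.0, 3.0, 4.0], None, None, None]
```

With the bad entries normalized to `None`, the choice of what to do with them stays with the `on_bad_vectors` policy rather than being forced by a conversion error.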
LanceDB Python SDK
A Python library for LanceDB.
Installation
pip install lancedb
Preview Releases
Stable releases are created about every 2 weeks. For the latest features and bug fixes, you can install the preview release. These releases receive the same level of testing as stable releases, but are not guaranteed to be available for more than 6 months after they are released. Once your application is stable, we recommend switching to stable releases.
pip install --pre --extra-index-url https://pypi.fury.io/lancedb/ lancedb
Usage
Basic Example
import lancedb
db = lancedb.connect('<PATH_TO_LANCEDB_DATASET>')
table = db.open_table('my_table')
results = table.search([0.1, 0.3]).limit(20).to_list()
print(results)
Development
See CONTRIBUTING.md for information on how to contribute to LanceDB.