Mirror of https://github.com/lancedb/lancedb.git (synced 2026-05-14 02:20:40 +00:00)
Provides first-class PyTorch `Dataset`/`IterableDataset` wrappers around a LanceDB table or permutation. The wrapper:

* Captures only the URI / table name / connect kwargs needed to re-open the table, so no Rust handles end up in the pickle output. Works out of the box with `DataLoader(num_workers > 0)`, which would otherwise crash a hand-rolled subclass.
* Implements both `__getitem__` and PyTorch's `__getitems__` dunder so the underlying batched `Permutation.fetch` is used when DataLoader fetches a batch of indices.
* Forwards column selection / format / transform / batch_size to the underlying Permutation, so users do not have to hand-roll the `_ensure_open` boilerplate from the issue.

Builds on the public `Permutation.fetch` API (#3243).

Closes lancedb/lancedb#3242
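The pattern described in this change can be sketched in plain Python. This is a minimal illustration, not the wrapper added by the change: `FakeTable`, `open_fake_table`, and `LazyTableDataset` are hypothetical names, and the in-memory `fetch` method only stands in for a batched fetch call (its real signature is not assumed here). The sketch shows the two ideas from the list above: keep only re-open information in the pickled state, and route both single-index and batched access through one batched fetch.

```python
from typing import Any, Callable, Dict, List, Optional, Sequence


class FakeTable:
    """Hypothetical stand-in for a table handle that cannot be pickled."""

    def __init__(self, uri: str, name: str) -> None:
        self.rows = [{"id": i, "value": i * i} for i in range(10)]
        self.fetch_calls = 0  # counts batched fetches, for illustration only

    def fetch(self, indices: Sequence[int]) -> List[Dict[str, Any]]:
        # One call returns all requested rows, like a batched lookup.
        self.fetch_calls += 1
        return [self.rows[i] for i in indices]


def open_fake_table(uri: str, name: str) -> FakeTable:
    """Hypothetical opener; a real one would connect and open the table."""
    return FakeTable(uri, name)


class LazyTableDataset:
    """Map-style dataset that stores only what is needed to re-open the table."""

    def __init__(
        self,
        uri: str,
        table_name: str,
        opener: Callable[[str, str], FakeTable] = open_fake_table,
    ) -> None:
        self.uri = uri
        self.table_name = table_name
        self._opener = opener
        self._table: Optional[FakeTable] = None  # opened lazily, per process

    def _ensure_open(self) -> FakeTable:
        # Each worker process opens its own handle on first access.
        if self._table is None:
            self._table = self._opener(self.uri, self.table_name)
        return self._table

    def __getstate__(self) -> Dict[str, Any]:
        # Drop the live handle so only URI / table name / opener are pickled.
        state = self.__dict__.copy()
        state["_table"] = None
        return state

    def __len__(self) -> int:
        return len(self._ensure_open().rows)

    def __getitem__(self, index: int) -> Dict[str, Any]:
        # Single-index access reuses the batched path.
        return self.__getitems__([index])[0]

    def __getitems__(self, indices: Sequence[int]) -> List[Dict[str, Any]]:
        # PyTorch's DataLoader calls this, when present, with a whole batch
        # of indices instead of calling __getitem__ once per index.
        return self._ensure_open().fetch(list(indices))
```

Because `__getstate__` clears the handle, a copy sent to a worker process starts with `_table = None` and re-opens the table on its first fetch, which is what makes `num_workers > 0` safe in this sketch.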
LanceDB Python SDK
A Python library for LanceDB.
Installation
pip install lancedb
Preview Releases
Stable releases are created about every 2 weeks. For the latest features and bug fixes, you can install the preview release. These releases receive the same level of testing as stable releases, but are not guaranteed to be available for more than 6 months after they are released. Once your application is stable, we recommend switching to stable releases.
pip install --pre --extra-index-url https://pypi.fury.io/lancedb/ lancedb
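To confirm whether an environment has a preview or a stable build installed, a small check along these lines may help. It is a sketch that uses only the standard library; the helper names `is_prerelease` and `installed_version` are hypothetical, and the pattern assumes the installed version string follows normalized PEP 440 pre-release markers (`a`, `b`, `rc`, `.dev`).

```python
import re
from importlib import metadata
from typing import Optional

# Normalized PEP 440 pre-release and development markers, e.g. "1.2.3b1".
_PRE_RELEASE = re.compile(r"(a|b|rc)\d+|\.dev\d+")


def is_prerelease(version: str) -> bool:
    """Return True if the version string carries a pre-release marker."""
    # Ignore any local version label after "+", which may contain letters.
    public = version.split("+", 1)[0]
    return _PRE_RELEASE.search(public) is not None


def installed_version(package: str = "lancedb") -> Optional[str]:
    """Return the installed version of a package, or None if it is absent."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None
```

Calling `is_prerelease(installed_version())` after checking for `None` then tells you which channel the current environment is on.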
Usage
Basic Example
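import lancedb
# Connect to an existing database; replace the placeholder with its path or URI.
db = lancedb.connect('<PATH_TO_LANCEDB_DATASET>')
# Open a table that already exists in that database.
table = db.open_table('my_table')
# Find the 20 rows whose vectors are closest to the query vector.
results = table.search([0.1, 0.3]).limit(20).to_list()
print(results)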
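Conceptually, the search in the example above ranks rows by how close their vectors are to the query and keeps the top few. The following brute-force sketch illustrates that idea on an in-memory list. It is not how LanceDB executes a search; the `nearest` helper, the `vector` and `distance` field names, and the choice of Euclidean distance are all assumptions made for illustration.

```python
import math
from typing import Any, Dict, List, Sequence


def nearest(
    rows: List[Dict[str, Any]],
    query: Sequence[float],
    limit: int,
) -> List[Dict[str, Any]]:
    """Return the `limit` rows closest to `query`, nearest first."""
    scored = []
    for row in rows:
        # Euclidean distance between the stored vector and the query.
        distance = math.dist(row["vector"], query)
        scored.append({**row, "distance": distance})
    # Smaller distance means a closer match.
    scored.sort(key=lambda r: r["distance"])
    return scored[:limit]


rows = [
    {"id": "a", "vector": [0.0, 0.0]},
    {"id": "b", "vector": [0.1, 0.3]},
    {"id": "c", "vector": [1.0, 1.0]},
]
top = nearest(rows, [0.1, 0.3], limit=2)
print([r["id"] for r in top])
```

A scan like this touches every row, so it only suits small collections; it is shown here purely to make the ranking step concrete.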
Development
See CONTRIBUTING.md for information on how to contribute to LanceDB.