mirror of
https://github.com/lancedb/lancedb.git
synced 2025-12-24 22:09:58 +00:00
This PR refactors how we handle read consistency: does the `LanceTable` class always pick up modifications to the table made by other instance or processes. Users have three options they can set at the connection level: 1. (Default) `read_consistency_interval=None` means it will not check at all. Users can call `table.checkout_latest()` to manually check for updates. 2. `read_consistency_interval=timedelta(0)` means **always** check for updates, giving strong read consistency. 3. `read_consistency_interval=timedelta(seconds=20)` means check for updates every 20 seconds. This is eventual consistency, a compromise between the two options above. ## Table reference state There is now an explicit difference between a `LanceTable` that tracks the current version and one that is fixed at a historical version. We now enforce that users cannot write if they have checked out an old version. They are instructed to call `checkout_latest()` before calling the write methods. Since `conn.open_table()` doesn't have a parameter for version, users will only get fixed references if they call `table.checkout()`. The difference between these two can be seen in the repr: Table that are fixed at a particular version will have a `version` displayed in the repr. Otherwise, the version will not be shown. ```python >>> table LanceTable(connection=..., name="my_table") >>> table.checkout(1) >>> table LanceTable(connection=..., name="my_table", version=1) ``` I decided to not create different classes for these states, because I think we already have enough complexity with the Cloud vs OSS table references. Based on #812
1.3 KiB
1.3 KiB
LanceDB
A Python library for LanceDB.
Installation
pip install lancedb
Usage
Basic Example
import lancedb
db = lancedb.connect('<PATH_TO_LANCEDB_DATASET>')
table = db.open_table('my_table')
results = table.search([0.1, 0.3]).limit(20).to_list()
print(results)
Development
Create a virtual environment and activate it:
python -m venv venv
. ./venv/bin/activate
Install the necessary packages:
python -m pip install .
To run the unit tests:
pytest
To run the doc tests:
pytest --doctest-modules lancedb
To run linter and automatically fix all errors:
ruff format python
ruff --fix python
If any packages are missing, install them with:
pip install <PACKAGE_NAME>
For Windows users, there may be errors when installing packages, so these commands may be helpful:
Activate the virtual environment:
. .\venv\Scripts\activate
You may need to run the installs separately:
pip install -e .[tests]
pip install -e .[dev]
tantivy requires rust to be installed, so install it with conda, as it doesn't support windows installation:
pip install wheel
pip install cargo
conda install rust
pip install tantivy
To run the unit tests:
pytest