Mirror of https://github.com/lancedb/lancedb.git, synced 2025-12-25 14:29:56 +00:00.

This PR refactors how we handle read consistency: whether the `LanceTable` class always picks up modifications to the table made by other instances or processes. Users have three options they can set at the connection level:

1. (Default) `read_consistency_interval=None` means it will not check at all. Users can call `table.checkout_latest()` to manually check for updates.
2. `read_consistency_interval=timedelta(0)` means **always** check for updates, giving strong read consistency.
3. `read_consistency_interval=timedelta(seconds=20)` means check for updates every 20 seconds. This is eventual consistency, a compromise between the two options above.

## Table reference state

There is now an explicit difference between a `LanceTable` that tracks the current version and one that is fixed at a historical version. We now enforce that users cannot write if they have checked out an old version. They are instructed to call `checkout_latest()` before calling the write methods. Since `conn.open_table()` doesn't have a parameter for version, users will only get fixed references if they call `table.checkout()`.

The difference between these two can be seen in the repr: tables that are fixed at a particular version will have a `version` displayed in the repr. Otherwise, the version will not be shown.

```python
>>> table
LanceTable(connection=..., name="my_table")
>>> table.checkout(1)
>>> table
LanceTable(connection=..., name="my_table", version=1)
```

I decided to not create different classes for these states, because I think we already have enough complexity with the Cloud vs OSS table references.

Based on #812

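The behaviour described above can be sketched in plain Python. This is a conceptual illustration only, not LanceDB's implementation: `should_check_for_updates` and `TableRef` are hypothetical names invented for this sketch, and the real `LanceTable` talks to storage rather than counting rows.

```python
from datetime import datetime, timedelta
from typing import Optional


def should_check_for_updates(
    interval: Optional[timedelta], last_checked: datetime, now: datetime
) -> bool:
    """Hypothetical helper mirroring the three read_consistency_interval options."""
    if interval is None:
        # Option 1: never check automatically; the caller uses checkout_latest().
        return False
    # Option 2 (timedelta(0)) always satisfies this comparison; option 3
    # satisfies it once at least `interval` has passed since the last check.
    return now - last_checked >= interval


class TableRef:
    """Hypothetical stand-in for a table reference (not the real LanceTable)."""

    def __init__(self, name: str) -> None:
        self.name = name
        # None means "track the latest version"; an int means "fixed".
        self.version: Optional[int] = None

    def checkout(self, version: int) -> None:
        self.version = version

    def checkout_latest(self) -> None:
        self.version = None

    def add(self, rows: list) -> int:
        # Writes are refused while the reference is fixed at an old version.
        if self.version is not None:
            raise ValueError(
                f"Table is fixed at version {self.version}; "
                "call checkout_latest() before writing."
            )
        return len(rows)

    def __repr__(self) -> str:
        # The version is shown only for fixed references.
        suffix = "" if self.version is None else f", version={self.version}"
        return f'TableRef(name="{self.name}"{suffix})'


t0 = datetime(2024, 1, 1, 12, 0, 0)
print(should_check_for_updates(None, t0, t0 + timedelta(hours=1)))  # False
print(should_check_for_updates(timedelta(0), t0, t0))  # True
print(should_check_for_updates(timedelta(seconds=20), t0, t0 + timedelta(seconds=5)))  # False

table = TableRef("my_table")
table.checkout(1)
print(table)  # TableRef(name="my_table", version=1)
```

Using `None` for "tracking latest" keeps a single class for both states, which matches the decision above not to introduce separate classes.
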
# LanceDB

A Python library for [LanceDB](https://github.com/lancedb/lancedb).

## Installation

```bash
pip install lancedb
```

## Usage

### Basic Example

```python
import lancedb
db = lancedb.connect('<PATH_TO_LANCEDB_DATASET>')
table = db.open_table('my_table')
results = table.search([0.1, 0.3]).limit(20).to_list()
print(results)
```

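To make the `search(...).limit(...)` call above more concrete, here is a minimal sketch of what a nearest-neighbour query computes, written in pure Python. It is a conceptual illustration under stated assumptions, not how LanceDB executes a search: `brute_force_search` is a hypothetical helper, the rows are made-up sample data, and Euclidean distance is assumed as the metric.

```python
import math
from typing import Dict, List, Sequence


def brute_force_search(
    rows: List[Dict], query: Sequence[float], limit: int
) -> List[Dict]:
    """Return the `limit` rows whose "vector" is closest to `query`.

    Hypothetical helper: scans every row and sorts by Euclidean distance.
    """
    ranked = sorted(rows, key=lambda row: math.dist(row["vector"], query))
    return ranked[:limit]


# Made-up sample data standing in for a table with a "vector" column.
rows = [
    {"id": 1, "vector": [0.0, 0.0]},
    {"id": 2, "vector": [0.1, 0.3]},
    {"id": 3, "vector": [1.0, 1.0]},
]

results = brute_force_search(rows, query=[0.1, 0.3], limit=2)
print([row["id"] for row in results])  # [2, 1]
```

A full scan like this costs time proportional to the number of rows per query, which is the cost that vector indexes are designed to avoid.
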
## Development

Create a virtual environment and activate it:

```bash
python -m venv venv
. ./venv/bin/activate
```

Install the necessary packages:

```bash
python -m pip install .
```

To run the unit tests:

```bash
pytest
```

To run the doc tests:

```bash
pytest --doctest-modules lancedb
```

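For readers unfamiliar with doc tests, the sketch below shows the kind of docstring example that `--doctest-modules` collects. The `normalize` function is hypothetical and is not part of the `lancedb` package; it only illustrates the format.

```python
from typing import List, Sequence


def normalize(vector: Sequence[float]) -> List[float]:
    """Scale a vector to unit length.

    The interactive example below is executed as a test, and the printed
    result must match the expected line exactly.

    >>> normalize([3.0, 4.0])
    [0.6, 0.8]
    """
    norm = sum(x * x for x in vector) ** 0.5
    return [x / norm for x in vector]
```

Because the expected output is compared as text, doc test examples work best with values whose printed form is stable.
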
To run the formatter and the linter, automatically fixing the errors that can be fixed:

```bash
ruff format python
ruff check --fix python
```

If any packages are missing, install them with:

```bash
pip install <PACKAGE_NAME>
```

___

**Windows** users may run into errors when installing packages; the following commands may help.

Activate the virtual environment:

```bash
. .\venv\Scripts\activate
```

You may need to run the installs separately:

```bash
pip install -e .[tests]
pip install -e .[dev]
```

`tantivy` requires Rust to be installed, so install Rust with `conda`, as it doesn't support Windows installation:

```bash
pip install wheel
pip install cargo
conda install rust
pip install tantivy
```

To run the unit tests:

```bash
pytest
```