lancedb/python
Weston Pace 8033a44d68 feat: add support for add to async python API (#1037)
In order to add support for `add` we needed to migrate the rust `Table`
trait to a `Table` struct and `TableInternal` trait (similar to the way
the connection is designed).

While doing this we also cleaned up some inconsistencies between the
SDKs:

* Python and Node are garbage collected languages and it can be
difficult to trigger something to be freed. The convention for these
languages is to have some kind of close method. I added a close method
to both the table and connection which will drop the underlying rust
object.
* We made significant improvements to table creation in
cc5f2136a6
for the `node` SDK. I copied these changes to the `nodejs` SDK.
* The nodejs tables were using fs to create tmp directories and these
were not getting cleaned up. This is mostly harmless but annoying and so
I changed it up a bit to ensure we clean up tmp directories.
* ~~countRows in the node SDK was returning `bigint`. I changed it to
return `number`~~ (this actually happened in a previous PR)
* Tables and connections now implement `std::fmt::Display` which is
hooked into python's `__repr__`. Node has no concept of a regular "to
string" function and so I added a `display` method.
* Python method signatures are changing so that optional parameters are
always `Optional[foo] = None` instead of something like `foo = False`.
This is because we want those defaults to be in rust whenever possible
(though we still need to mention the default in documentation).
* I changed the python `AsyncConnection/AsyncTable` classes from
abstract classes with a single implementation to just classes because we
no longer have the remote implementation in python.
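
The `Optional[foo] = None` convention described above can be sketched as follows. This is a minimal illustration, not the actual LanceDB code: `core_add`, `add`, and the `"append"` default are hypothetical stand-ins, with `core_add` playing the role of the rust layer that owns the default.

```python
from typing import Optional

# Hypothetical stand-in for a default that is owned by the rust layer.
_CORE_DEFAULT_MODE = "append"


def core_add(rows: list, mode: Optional[str]) -> dict:
    """Hypothetical stand-in for the rust core: it resolves the default."""
    resolved = _CORE_DEFAULT_MODE if mode is None else mode
    return {"rows": len(rows), "mode": resolved}


def add(rows: list, mode: Optional[str] = None) -> dict:
    """Python wrapper that forwards None instead of picking its own default.

    mode: defaults to "append" (documented here, decided in the core).
    """
    return core_add(rows, mode)
```

Because the wrapper only forwards `None`, changing the default in the core changes it for every SDK at once; the python docstring still has to mention the default for readers.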

Note: this does NOT add the `add` function to the remote table. This PR
was already large enough, and the remote implementation is unique
enough, that I am going to do all the remote stuff at a later date (we
should have the structure in place and correct so there shouldn't be any
refactor concerns).

---------

Co-authored-by: Will Jones <willjones127@gmail.com>
2024-04-05 16:31:36 -07:00

LanceDB

A Python library for LanceDB.

Installation

pip install lancedb

Usage

Basic Example

import lancedb
db = lancedb.connect('<PATH_TO_LANCEDB_DATASET>')
table = db.open_table('my_table')
results = table.search([0.1, 0.3]).limit(20).to_list()
print(results)

Development

LanceDB is based on the Rust crate lancedb and is built with maturin. To build with maturin you need either a conda environment or a virtual environment (venv).

python -m venv venv
. ./venv/bin/activate

Install the necessary packages:

python -m pip install .[tests,dev]

To build the python package you can use maturin:

# This will build the rust bindings and place them in the appropriate place
# in your venv or conda environment
maturin develop

To run the unit tests:

pytest

To run the doc tests:

pytest --doctest-modules python/lancedb

To run the linter and automatically fix all errors:

ruff format python
ruff check --fix python

If any packages are missing, install them with:

pip install <PACKAGE_NAME>

Windows users may run into errors when installing packages. The following commands may help:

Activate the virtual environment:

. .\venv\Scripts\activate

You may need to run the installs separately:

pip install -e .[tests]
pip install -e .[dev]

tantivy requires Rust to be installed and does not support Windows installation directly, so install Rust with conda first:

pip install wheel
pip install cargo
conda install rust
pip install tantivy