lancedb/python/CONTRIBUTING.md
msu-reevo cc81f3e1a5 fix(python): typing (#2167)
@wjones127 is there a standard way you guys set up your virtualenv? I can
either relist all the dependencies in the pyright pre-commit section,
specify a venv, or require the user to be in the virtual environment when
they run `git commit`. If the venv location were standardized, or a Python
manager like `uv` were used, it would be easier to avoid duplicating the
pyright dependency list.

Per your suggestion, I added all the passing files to the `include`
section in `pyproject.toml`.

For ruff I upgraded the version and removed "TCH" which doesn't exist as
an option.

I added a `pyright_report.csv` that lists all files sorted by pyright
error count, ascending, to serve as a to-do list.
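
One way such a report could be produced is sketched below. It assumes the JSON emitted by `pyright --outputjson`, where each entry in `generalDiagnostics` carries a `file` and a `severity`. The sample data and the helper name `error_counts_csv` are made up for illustration, and this is not necessarily how `pyright_report.csv` was generated.

```python
import csv
import io
from collections import Counter

# Made-up sample shaped like the output of `pyright --outputjson`.
sample_report = {
    "generalDiagnostics": [
        {"file": "lancedb/table.py", "severity": "error"},
        {"file": "lancedb/table.py", "severity": "error"},
        {"file": "lancedb/query.py", "severity": "error"},
        {"file": "lancedb/query.py", "severity": "warning"},
    ]
}


def error_counts_csv(report: dict) -> str:
    """Return CSV text of files and error counts, fewest errors first."""
    counts = Counter(
        diag["file"]
        for diag in report["generalDiagnostics"]
        if diag["severity"] == "error"
    )
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(["file", "errors"])
    # Sort ascending by error count, then by path for a stable order.
    for path, count in sorted(counts.items(), key=lambda item: (item[1], item[0])):
        writer.writerow([path, count])
    return buffer.getvalue()


report_csv = error_counts_csv(sample_report)
```

Files without any errors do not appear in the diagnostics, so a complete list would need them added separately with a count of zero.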

I fixed about 30 issues in `table.py` that stemmed from plain `str` values
being passed to methods requiring one of a fixed set of string `Literal`s,
by extracting those literal types into `types.py`.
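
As a minimal sketch of that pattern, a shared `Literal` alias lets pyright reject arbitrary strings at call sites. The alias `DistanceType` and the function `set_distance` are hypothetical names chosen for illustration, not taken from `types.py`.

```python
from typing import Literal, get_args

# Hypothetical alias for illustration; the real names in `types.py` may differ.
DistanceType = Literal["l2", "cosine", "dot"]


def set_distance(distance_type: DistanceType) -> str:
    """Accept only one of the literal values declared in DistanceType."""
    # Runtime guard for illustration; pyright performs the check statically.
    if distance_type not in get_args(DistanceType):
        raise ValueError(f"unsupported distance type: {distance_type!r}")
    return distance_type


set_distance("cosine")  # a string literal from the allowed set type-checks

metric: str = "cosine"
# set_distance(metric)  # pyright rejects this: a plain `str` is not the Literal type
```

Defining the alias once and importing it wherever the argument is passed along keeps the allowed values in a single place.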

Can you verify in the Rust bridge that `schema` should be a property
and not a method here? If it's a method, then there's another place in
the code where `inner.schema` should be `inner.schema()`:
``` python
class RecordBatchStream:
    @property
    def schema(self) -> pa.Schema: ...
```
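
To illustrate why the distinction matters, here is a small sketch with two hypothetical stand-in classes (neither is the real `RecordBatchStream`, and a `dict` stands in for `pa.Schema`). Access must match the declaration, otherwise the call fails at runtime.

```python
class PropertyStream:
    """Hypothetical stand-in where `schema` is a property, as in the stub."""

    @property
    def schema(self) -> dict:
        return {"id": "int64"}


class MethodStream:
    """Hypothetical stand-in where `schema` is a regular method."""

    def schema(self) -> dict:
        return {"id": "int64"}


property_schema = PropertyStream().schema  # property: no call needed
method_schema = MethodStream().schema()  # method: must be called

# Calling the property's value fails, because the dict itself is not callable.
try:
    PropertyStream().schema()
    call_failed = False
except TypeError:
    call_failed = True
```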

Also, unless the `_lancedb.pyi` file is wrong, there is no `__anext__`
on `_inner` when it is not an `AsyncGenerator`; only `next` is defined:
``` python
    async def __anext__(self) -> pa.RecordBatch:
        # Before this change: return await self._inner.__anext__()
        if isinstance(self._inner, AsyncGenerator):
            batch = await self._inner.__anext__()
        else:
            batch = await self._inner.next()
        if batch is None:
            raise StopAsyncIteration
        return batch
```
In the `else` branch, `_inner` is a `RecordBatchStream`:
```python
class RecordBatchStream:
    @property
    def schema(self) -> pa.Schema: ...
    async def next(self) -> Optional[pa.RecordBatch]: ...
```
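
The dispatch described above can be tried in isolation with the sketch below. `FakeStream`, `BatchIterator`, `collect`, and `generate` are hypothetical names, and strings stand in for `pa.RecordBatch`. Only the branch structure mirrors the snippet from this change.

```python
import asyncio
from collections.abc import AsyncGenerator
from typing import Optional


class FakeStream:
    """Hypothetical stand-in for RecordBatchStream: only `next` is defined."""

    def __init__(self, batches: list) -> None:
        self._batches = list(batches)

    async def next(self) -> Optional[str]:
        # Return None once the stream is exhausted.
        return self._batches.pop(0) if self._batches else None


class BatchIterator:
    """Wraps either an async generator or an object exposing `next`."""

    def __init__(self, inner) -> None:
        self._inner = inner

    def __aiter__(self):
        return self

    async def __anext__(self):
        if isinstance(self._inner, AsyncGenerator):
            # An exhausted async generator raises StopAsyncIteration itself.
            batch = await self._inner.__anext__()
        else:
            batch = await self._inner.next()
        if batch is None:
            raise StopAsyncIteration
        return batch


async def collect(inner) -> list:
    return [batch async for batch in BatchIterator(inner)]


async def generate():
    yield "a"
    yield "b"


from_stream = asyncio.run(collect(FakeStream(["a", "b"])))
from_generator = asyncio.run(collect(generate()))
```

Both sources yield the same batches through `async for`, even though only the async generator implements `__anext__` natively.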

---------

Co-authored-by: Will Jones <willjones127@gmail.com>
2025-03-10 09:01:23 -07:00


Contributing to LanceDB Python

This document outlines the process for contributing to LanceDB Python. For general contribution guidelines, see CONTRIBUTING.md.

Project layout

The Python package is a wrapper around the Rust library, lancedb. We use pyo3 to create the bindings between Rust and Python.

  • src/: Rust bindings source code
  • python/lancedb: Python package source code
  • python/tests: Unit tests

Development environment

To set up your development environment, you will need to install the following:

  1. Python 3.9 or later
  2. Cargo (Rust's package manager). Use rustup to install.
  3. protoc (Protocol Buffers compiler)

Create a virtual environment to work in:

python -m venv venv
source venv/bin/activate
pip install maturin

Commit Hooks

We highly recommend installing the pre-commit hooks so that your code is formatted correctly and passes basic checks before each commit:

make develop # this will install pre-commit itself
pre-commit install

Development

Most common development commands can be run using the Makefile.

Build the package:

make develop

Format:

make format

Run tests:

make test
make doctest

Run type checking:

make typecheck

To run a single test, you can use the pytest command directly. Provide the path to the test file, and optionally the test name after ::.

# Single file: test_table.py
pytest -vv python/tests/test_table.py
# Single test: test_basic in test_table.py
pytest -vv python/tests/test_table.py::test_basic

To see all commands, run:

make help