docs: contributing guide (#1970)

* Adds basic contributing guides.
* Simplifies Python development with a Makefile.
This commit is contained in:
Will Jones
2025-01-07 15:11:16 -08:00
committed by GitHub
parent 17c9e9afea
commit 0e496ed3b5
7 changed files with 346 additions and 131 deletions

78
CONTRIBUTING.md Normal file
View File

@@ -0,0 +1,78 @@
# Contributing to LanceDB
LanceDB is an open-source project and we welcome contributions from the community.
This document outlines the process for contributing to LanceDB.
## Reporting Issues
If you encounter a bug or have a feature request, please open an issue on the
[GitHub issue tracker](https://github.com/lancedb/lancedb).
## Picking an issue
We track issues on the GitHub issue tracker. If you are looking for something to
work on, check the [good first issue](https://github.com/lancedb/lancedb/contribute) label. These issues are typically the best described and have the smallest scope.
If there's an issue you are interested in working on, please leave a comment on the issue. This will help us avoid duplicate work. Additionally, if you have questions about the issue, please ask them in the issue comments. We are happy to provide guidance on how to approach the issue.
## Configuring Git
First, fork the repository on GitHub, then clone your fork:
```bash
git clone https://github.com/<username>/lancedb.git
cd lancedb
```
Then add the main repository as a remote:
```bash
git remote add upstream https://github.com/lancedb/lancedb.git
git fetch upstream
```
## Setting up your development environment
We have development environments for Python, Typescript, and Java. Each environment has its own setup instructions.
* [Python](python/CONTRIBUTING.md)
* [Typescript](nodejs/CONTRIBUTING.md)
<!-- TODO: add Java contributing guide -->
* [Documentation](docs/README.md)
## Best practices for pull requests
For the best chance of having your pull request accepted, please follow these guidelines:
1. Unit test all bug fixes and new features. Your code will not be merged if it
doesn't have tests.
1. If you change the public API, update the documentation in the `docs` directory.
1. Aim to minimize the number of changes in each pull request. Keep to solving
one problem at a time, when possible.
1. Before marking a pull request ready-for-review, do a self review of your code.
Is it clear why you are making the changes? Are the changes easy to understand?
1. Use [conventional commit messages](https://www.conventionalcommits.org/en/) as pull request titles. Examples:
* New feature: `feat: adding foo API`
* Bug fix: `fix: issue with foo API`
* Documentation change: `docs: adding foo API documentation`
1. If your pull request is a work in progress, leave the pull request as a draft.
We will assume the pull request is ready for review when it is opened.
1. When writing tests, test the error cases. Make sure they have understandable
error messages.
## Project structure
The core library is written in Rust. The Python, Typescript, and Java libraries
are wrappers around the Rust library.
* `src/lancedb`: Rust library source code
* `python`: Python package source code
* `nodejs`: Typescript package source code
* `node`: **Deprecated** Typescript package source code
* `java`: Java package source code
* `docs`: Documentation source code
## Release process
For information on the release process, see: [release_process.md](release_process.md)

View File

@@ -9,36 +9,81 @@ unreleased features.
## Building the docs
### Setup
1. Install LanceDB. From LanceDB repo root: `pip install -e python`
2. Install dependencies. From LanceDB repo root: `pip install -r docs/requirements.txt`
3. Make sure you have node and npm setup
4. Make sure protobuf and libssl are installed
1. Install LanceDB Python. See setup in [Python contributing guide](../python/CONTRIBUTING.md).
Run `make develop` to install the Python package.
2. Install documentation dependencies. From LanceDB repo root: `pip install -r docs/requirements.txt`
### Building node module and create markdown files
### Preview the docs
See [Javascript docs README](./src/javascript/README.md)
### Build docs
From LanceDB repo root:
Run: `PYTHONPATH=. mkdocs build -f docs/mkdocs.yml`
If successful, you should see a `docs/site` directory that you can verify locally.
### Run local server
You can run a local server to test the docs prior to deployment by navigating to the `docs` directory and running the following command:
```bash
```shell
cd docs
mkdocs serve
```
### Run doctest for typescript example
If you want to just generate the HTML files:
```bash
cd lancedb/docs
npm i
npm run build
npm run all
```shell
PYTHONPATH=. mkdocs build -f docs/mkdocs.yml
```
If successful, you should see a `docs/site` directory that you can verify locally.
## Adding examples
To make sure examples are correct, we put examples in test files so they can be
run as part of our test suites.
You can see the tests are at:
* Python: `python/python/tests/docs`
* Typescript: `nodejs/examples/`
### Checking python examples
```shell
cd python
pytest -vv python/tests/docs
```
### Checking typescript examples
The `@lancedb/lancedb` package must be built before running the tests:
```shell
pushd nodejs
npm ci
npm run build
popd
```
Then you can run the examples by going to the `nodejs/examples` directory and
running the tests like a normal npm package:
```shell
pushd nodejs/examples
npm ci
npm test
popd
```
## API documentation
### Python
The Python API documentation is organized based on the file `docs/src/python/python.md`.
We manually add entries there so we can control the organization of the reference page.
**However, this means any new types must be manually added to the file.** No additional
steps are needed to generate the API documentation.
### Typescript
The typescript API documentation is generated from the typescript source code using [typedoc](https://typedoc.org/).
When new APIs are added, you must manually re-run the typedoc command to update the API documentation.
The new files should be checked into the repository.
```shell
pushd nodejs
npm run docs
popd
```

76
nodejs/CONTRIBUTING.md Normal file
View File

@@ -0,0 +1,76 @@
# Contributing to LanceDB Typescript
This document outlines the process for contributing to LanceDB Typescript.
For general contribution guidelines, see [CONTRIBUTING.md](../CONTRIBUTING.md).
## Project layout
The Typescript package is a wrapper around the Rust library, `lancedb`. We use
the [napi-rs](https://napi.rs/) library to create the bindings between Rust and
Typescript.
* `src/`: Rust bindings source code
* `lancedb/`: Typescript package source code
* `__test__/`: Unit tests
* `examples/`: An npm package with the examples shown in the documentation
## Development environment
To set up your development environment, you will need to install the following:
1. Node.js 14 or later
2. Rust's package manager, Cargo. Use [rustup](https://rustup.rs/) to install.
3. [protoc](https://grpc.io/docs/protoc-installation/) (Protocol Buffers compiler)
Initial setup:
```shell
npm install
```
### Commit Hooks
It is **highly recommended** to install the [pre-commit](https://pre-commit.com/) hooks to ensure that your
code is formatted correctly and passes basic checks before committing:
```shell
pre-commit install
```
## Development
Most common development commands can be run using the npm scripts.
Build the package
```shell
npm install
npm run build
```
Lint:
```shell
npm run lint
```
Format and fix lints:
```shell
npm run lint-fix
```
Run tests:
```shell
npm test
```
To run a single test:
```shell
# Single file: table.test.ts
npm test -- table.test.ts
# Single test: 'merge insert' in table.test.ts
npm test -- table.test.ts --testNamePattern=merge\ insert
```

View File

@@ -36,37 +36,4 @@ The [quickstart](../basic.md) contains a more complete example.
## Development
```sh
npm run build
npm run test
```
### Running lint / format
LanceDb uses [biome](https://biomejs.dev/) for linting and formatting. if you are using VSCode you will need to install the official [Biome](https://marketplace.visualstudio.com/items?itemName=biomejs.biome) extension.
To manually lint your code you can run:
```sh
npm run lint
```
to automatically fix all fixable issues:
```sh
npm run lint-fix
```
If you do not have your workspace root set to the `nodejs` directory, unfortunately the extension will not work. You can still run the linting and formatting commands manually.
### Generating docs
```sh
npm run docs
cd ../docs
# Asssume the virtual environment was created
# python3 -m venv venv
# pip install -r requirements.txt
. ./venv/bin/activate
mkdocs build
```
See [CONTRIBUTING.md](./CONTRIBUTING.md) for information on how to contribute to LanceDB.

78
python/CONTRIBUTING.md Normal file
View File

@@ -0,0 +1,78 @@
# Contributing to LanceDB Python
This document outlines the process for contributing to LanceDB Python.
For general contribution guidelines, see [CONTRIBUTING.md](../CONTRIBUTING.md).
## Project layout
The Python package is a wrapper around the Rust library, `lancedb`. We use
[pyo3](https://pyo3.rs/) to create the bindings between Rust and Python.
* `src/`: Rust bindings source code
* `python/lancedb`: Python package source code
* `python/tests`: Unit tests
## Development environment
To set up your development environment, you will need to install the following:
1. Python 3.9 or later
2. Cargo (Rust's package manager). Use [rustup](https://rustup.rs/) to install.
3. [protoc](https://grpc.io/docs/protoc-installation/) (Protocol Buffers compiler)
Create a virtual environment to work in:
```bash
python -m venv venv
source venv/bin/activate
pip install maturin
```
### Commit Hooks
It is **highly recommended** to install the pre-commit hooks to ensure that your
code is formatted correctly and passes basic checks before committing:
```bash
make develop # this will install pre-commit itself
pre-commit install
```
## Development
Most common development commands can be run using the Makefile.
Build the package
```shell
make develop
```
Format:
```shell
make format
```
Run tests:
```shell
make test
make doctest
```
To run a single test, you can use the `pytest` command directly. Provide the path
to the test file, and optionally the test name after `::`.
```shell
# Single file: test_table.py
pytest -vv python/tests/test_table.py
# Single test: test_basic in test_table.py
pytest -vv python/tests/test_table.py::test_basic
```
To see all commands, run:
```shell
make help
```

32
python/Makefile Normal file
View File

@@ -0,0 +1,32 @@
PIP_EXTRA_INDEX_URL ?= https://pypi.fury.io/lancedb/
help: ## Show this help.
@sed -ne '/@sed/!s/## //p' $(MAKEFILE_LIST)
.PHONY: develop
develop: ## Install the package in development mode.
PIP_EXTRA_INDEX_URL=$(PIP_EXTRA_INDEX_URL) maturin develop --extras tests,dev,embeddings
.PHONY: format
format: ## Format the code.
cargo fmt
ruff format python
.PHONY: check
check: ## Check formatting and lints.
cargo fmt --check
ruff format --check python
cargo clippy
ruff check python
.PHONY: fix
fix: ## Fix python lints
ruff check python --fix
.PHONY: doctest
doctest: ## Run documentation tests.
pytest --doctest-modules python/lancedb
.PHONY: test
test: ## Run tests.
pytest python/tests -vv --durations=10 -m "not slow"

View File

@@ -8,6 +8,15 @@ A Python library for [LanceDB](https://github.com/lancedb/lancedb).
pip install lancedb
```
### Preview Releases
Stable releases are created about every 2 weeks. For the latest features and bug fixes, you can install the preview release. These releases receive the same level of testing as stable releases, but are not guaranteed to be available for more than 6 months after they are released. Once your application is stable, we recommend switching to stable releases.
```bash
pip install --pre --extra-index-url https://pypi.fury.io/lancedb/ lancedb
```
## Usage
### Basic Example
@@ -20,76 +29,6 @@ results = table.search([0.1, 0.3]).limit(20).to_list()
print(results)
```
## Development
### Development
LanceDb is based on the rust crate `lancedb` and is built with maturin. In order to build with maturin
you will either need a conda environment or a virtual environment (venv).
```bash
python -m venv venv
. ./venv/bin/activate
```
Install the necessary packages:
```bash
python -m pip install .[tests,dev]
```
To build the python package you can use maturin:
```bash
# This will build the rust bindings and place them in the appropriate place
# in your venv or conda environment
maturin develop
```
To run the unit tests:
```bash
pytest
```
To run the doc tests:
```bash
pytest --doctest-modules python/lancedb
```
To run linter and automatically fix all errors:
```bash
ruff format python
ruff --fix python
```
If any packages are missing, install them with:
```bash
pip install <PACKAGE_NAME>
```
___
For **Windows** users, there may be errors when installing packages, so these commands may be helpful:
Activate the virtual environment:
```bash
. .\venv\Scripts\activate
```
You may need to run the installs separately:
```bash
pip install -e .[tests]
pip install -e .[dev]
```
`tantivy` requires `rust` to be installed, so install it with `conda`, as it doesn't support windows installation:
```bash
pip install wheel
pip install cargo
conda install rust
pip install tantivy
```
See [CONTRIBUTING.md](./CONTRIBUTING.md) for information on how to contribute to LanceDB.