LanceDB Documentation

LanceDB docs are available at docs.lancedb.com.

The SDK docs are built and deployed automatically by GitHub Actions whenever a commit is pushed to the main branch, so the published docs may describe unreleased features.

Building the docs

Setup

  1. Install LanceDB Python. Follow the setup instructions in the Python contributing guide, then run make develop to install the Python package.
  2. Install documentation dependencies. From the LanceDB repo root: pip install -r docs/requirements.txt

Preview the docs

cd docs
mkdocs serve

To generate only the HTML files, run the following from the repo root:

PYTHONPATH=. mkdocs build -f docs/mkdocs.yml

If the build succeeds, it produces a docs/site directory that you can inspect locally.

Adding examples

To make sure examples are correct, we put examples in test files so they can be run as part of our test suites.

The tests are located at:

  • Python: python/python/tests/docs
  • TypeScript: nodejs/examples/
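
As a rough illustration of that pattern, a documentation example can live inside an ordinary pytest test so that the behaviour the docs describe is asserted on every run. The sketch below is hypothetical: the file name test_nearest.py and the helper nearest_item are invented for illustration, and the helper uses only the standard library as a stand-in for a real LanceDB query.

```python
# Hypothetical file: python/python/tests/docs/test_nearest.py
# The names below are illustrative and are not part of the LanceDB API.


def nearest_item(rows, query):
    """Return the row whose vector is closest to `query` (squared L2 distance)."""

    def squared_distance(row):
        return sum((a - b) ** 2 for a, b in zip(row["vector"], query))

    return min(rows, key=squared_distance)


def test_nearest_item():
    # The body of this test is what a docs page would show as the example.
    rows = [
        {"vector": [1.0, 0.0], "item": "a"},
        {"vector": [0.0, 1.0], "item": "b"},
    ]
    result = nearest_item(rows, [0.9, 0.1])
    # Asserting the result keeps the documented behaviour from drifting.
    assert result["item"] == "a"
```

Keeping the assertion next to the example means a change that breaks the documented behaviour also breaks the test suite.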

Checking Python examples

cd python
pytest -vv python/tests/docs

Checking TypeScript examples

The @lancedb/lancedb package must be built before running the tests:

pushd nodejs
npm ci
npm run build
popd

Then you can run the examples by going to the nodejs/examples directory and running the tests like a normal npm package:

pushd nodejs/examples
npm ci
npm test
popd

API documentation

Python

The Python API documentation is organized based on the file docs/src/python/python.md. We manually add entries there so we can control the organization of the reference page. However, this means any new types must be manually added to the file. No additional steps are needed to generate the API documentation.
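
Because entries are added by hand, it can help to check that a new type actually appears on the reference page. The sketch below assumes the page lists each object on its own line in the mkdocstrings style, "::: dotted.path"; that is an assumption about the file's contents, not something stated here. The functions documented_entries and missing_entries, and the name lancedb.example.NewType, are hypothetical.

```python
import re

# Assumption: each documented object appears on its own line as
# "::: package.module.Name" (mkdocstrings-style directive).
ENTRY_PATTERN = re.compile(r"^:::[ \t]+([\w.]+)[ \t]*$", re.MULTILINE)


def documented_entries(markdown_text: str) -> set[str]:
    """Collect the dotted paths referenced by ':::' lines."""
    return set(ENTRY_PATTERN.findall(markdown_text))


def missing_entries(markdown_text: str, expected: list[str]) -> list[str]:
    """Return the expected dotted paths that the page does not reference."""
    found = documented_entries(markdown_text)
    return [name for name in expected if name not in found]


# Inline sample standing in for the contents of docs/src/python/python.md.
sample_page = "# Python API\n\n::: lancedb.connect\n\n::: lancedb.table.Table\n"

# Hypothetical list of objects that should be on the reference page.
expected_names = ["lancedb.table.Table", "lancedb.example.NewType"]
print(missing_entries(sample_page, expected_names))
# prints ['lancedb.example.NewType']
```

In practice the text would be read from docs/src/python/python.md, and the expected list would come from whatever new types a change introduces.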

TypeScript

The TypeScript API documentation is generated from the TypeScript source code using TypeDoc.

When new APIs are added, you must manually re-run the TypeDoc command to update the API documentation, then check the regenerated files into the repository.

pushd nodejs
npm run docs
popd