lancedb

mirror of https://github.com/lancedb/lancedb.git synced 2026-03-25 01:50:39 +00:00

Author	SHA1	Message	Date
Lance Release	c89240b16c	Bump version: 0.30.0-beta.6 → 0.30.0 python-v0.30.0	2026-03-16 22:46:19 +00:00
Lance Release	099ff355a4	Bump version: 0.30.0-beta.5 → 0.30.0-beta.6	2026-03-16 22:46:17 +00:00
Weston Pace	c5995fda67	feat: update lance dependency to 3.0.0 release (#3137 ) ## Summary - Update all 14 lance crates from `3.0.0-rc.3` (git source) to `3.0.0` (crates.io release) - Remove git/tag source references since 3.0.0 is published on crates.io ## Test plan - [x] `cargo check --features remote --tests --examples` passes - [x] `cargo clippy --features remote --tests --examples` passes - [ ] CI passes 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>	2026-03-16 15:29:18 -07:00
Weston Pace	25eb1fbfa4	fix: restore storage options on copy in localstack tests (#3148 )	2026-03-16 14:02:19 -07:00
Weston Pace	4ac41c5c3f	fix(ci): upgrade LocalStack to 4.0 for S3 integration tests (#3147 ) ## Summary - Upgrade LocalStack from 3.3 to 4.0 in `docker-compose.yml` to fix S3 integration test failures in CI - Version 3.3 has compatibility issues with newer Python 3.13 and updated boto3 dependencies - Matches the LocalStack version used successfully in the lance repository ## Test plan - [ ] Verify `docker compose up --detach --wait` completes successfully in CI - [ ] All tests in `test_s3.py` pass (5 tests) - [ ] All `@pytest.mark.s3_test` tests in `test_namespace_integration.py` pass (7 tests) - [ ] No regressions in non-integration test jobs (Mac, Windows) 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>	2026-03-16 09:02:11 -07:00
Will Jones	9a5b0398ec	chore: fix ci (#3139 ) * Move away from buildjet, which is shutting down runners for GHA [^1] * Add `Cargo.lock` to build jobs, so when we upgrade locked dependencies we check the builds actually pass. CI started failing because dependencies were changed in #3116 without running all build jobs. * Add fixes for aws-lc-rs build in NodeJS. [^1]: https://buildjet.com/for-github-actions/blog/we-are-shutting-down --------- Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-03-16 06:25:40 -07:00
Pratik Dey	d1d720d08a	feat(nodejs): support field/data type input in add_columns() method (#3114 ) Add support for passing field/data type information into add_columns() method, bringing parity with Python bindings. The method now accepts: - AddColumnsSql[] - SQL expressions (existing functionality) - Field - single Arrow field with explicit data type - Field[] - array of Arrow fields with explicit data types - Schema - Arrow schema with explicit data types New columns added via Field/Schema are initialized with null values. All field-based columns must be nullable due to null initialization. Resolves #3107 --------- Signed-off-by: Pratik <pratikrocks.dey11@gmail.com> Co-authored-by: Claude <noreply@anthropic.com>	2026-03-13 12:57:14 -07:00
Mesut-Doner	c2e543f1b7	feat(rust): support Expr in projection query (#3069 ) Referred and followed [`Select::Dynamic`] implementation. Closes #3039	2026-03-13 12:54:26 -07:00
Weston Pace	216c1b5f77	docs: remove experimental label from optimize and warn about delete_unverified (#3128 ) ## Summary - Removes the "Experimental API" section from `optimize` method documentation across Rust, Python, and TypeScript - Adds a warning to `delete_unverified` documentation in all bindings: this should only be set to true if you can guarantee no other process is working on the dataset, otherwise it could be corrupted - Fixes a typo ("shoudl" → "should") Closes #3125 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>	2026-03-12 14:37:42 +08:00
Xin Sun	fc1867da83	chore: remove the duplicate snafu-derive dependency in the lockfile (#3124 )	2026-03-10 21:43:51 -07:00
Esteban Gutierrez	f951da2b00	feat: support prewarm_index and prewarm_data on remote tables (#3110 ) ## Summary - Implement `RemoteTable.prewarm_data(columns)` calling `POST /v1/table/{id}/page_cache/prewarm/` - Implement `RemoteTable.prewarm_index(name)` calling `POST /v1/table/{id}/index/{name}/prewarm/` (previously returned `NotSupported`) - Add `BaseTable::prewarm_data(columns)` trait method and `Table` public API in Rust core - Add PyO3 bindings and Python API (`AsyncTable`, `LanceTable`, `RemoteTable`) for `prewarm_data` - Add type stubs for `prewarm_index` and `prewarm_data` in `_lancedb.pyi` - Upgrade Lance to 3.0.0-rc.3 with breaking change fixes Co-authored-by: Will Jones <willjones127@gmail.com> Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>	2026-03-10 15:39:39 -05:00
Esteban Gutierrez	6530d82690	chore: dependency updates and security fixes (#3116 ) ## Summary - Update dependencies across Rust, Python, Node.js, Java, Docker, and docs - Pin unpinned dependency lower bounds to prevent silent downgrades - Bump CI actions to current major versions 🤖 Generated with [Claude Code](https://claude.com/claude-code) --------- Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>	2026-03-09 20:04:27 -07:00
Lance Release	b3fc9c444f	Bump version: 0.27.0-beta.4 → 0.27.0-beta.5	2026-03-09 19:58:12 +00:00
Lance Release	6de8f42dcd	Bump version: 0.30.0-beta.4 → 0.30.0-beta.5 python-v0.30.0-beta.5	2026-03-09 19:56:15 +00:00
Will Jones	5c3bd68e58	feat: upgrade Lance to 3.0.0-rc.3 (#3104 ) Co-authored-by: Jack Ye <yezhaoqin@gmail.com>	2026-03-09 12:55:20 -07:00
Weston Pace	4be85444f0	feat: infer js native arrays (#3119 ) When we create tables without using Arrow by parsing JS records we always infer to float64. Many times embeddings are not float64 and it would be nice to be able to use the native type without requiring users to pull in Arrow. We can utilize JS's builtin Float32Array to do this. This PR also adds support for UInt8/16/32 and Int8/16/32 arrays as well. Closes #3115	2026-03-09 10:13:59 -07:00
Xuanwo	68c07f333f	chore: unify component README titles (#3066 )	2026-03-09 21:47:58 +08:00
Lance Release	814a379e08	Bump version: 0.27.0-beta.3 → 0.27.0-beta.4	2026-03-09 08:47:17 +00:00
Lance Release	f31561c5bb	Bump version: 0.30.0-beta.3 → 0.30.0-beta.4 python-v0.30.0-beta.4	2026-03-09 08:45:25 +00:00
Jack Ye	e0c5ceac03	fix: propagate managed versioning for namespace connection (#3111 ) Without this fix, if a user directly uses the native table for operations like `add_columns`, the namespace DB connection is not actually propagated through, even if the table is configured to use it. The fix is to bring lancedb's Python binding up to date with an implementation similar to https://github.com/lance-format/lance/pull/5968, and to make sure the namespace is fully propagated through all the related calls. --------- Co-authored-by: Claude Opus 4.5 <noreply@anthropic.com>	2026-03-09 01:44:31 -07:00
Prashanth Rao	e93bb3355a	docs: add meth/func names to mkdocstrings (#3101 ) LanceDB's SDK API docs do not currently show method names under any given object, and this makes it harder to quickly understand and find relevant method names for a given class. Geneva docs show the available methods in the right navigation. This PR standardizes the appearance of the LanceDB SDK API in the docs to be more similar to Geneva's.	2026-03-06 08:54:45 -08:00
Will Jones	b75991eb07	fix: propagate cast errors in `add()` (#3075 ) When we write data with `add()`, we cast the input data to the table's schema. However, we were using "safe" mode, which turns cast errors into nulls. For example, if you pass `u64::max` into a field that is a `u32`, it would just write null instead of raising an overflow error. Now it propagates the overflow. This is the same behavior as other systems like DuckDB. --------- Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-03-05 20:24:50 -08:00
Wyatt Alt	97ca9bb943	feat: allow passing azure client/tenant ID through remote SDK (#3102 ) Prior to this commit we supported passing the azure storage account name to the lancedb remote SDK through headers. This adds support for client ID and tenant ID as well.	2026-03-04 11:11:36 -08:00
Xuanwo	fa1b04f341	chore: migrate Rust crates to edition 2024 and fix clippy warnings (#3098 ) This PR migrates all Rust crates in the workspace to Rust 2024 edition and addresses the resulting compatibility updates. It also fixes all clippy warnings surfaced by the workspace checks so the codebase remains warning-free under the current lint configuration. Context: - Scope: workspace edition bump (`2021` -> `2024`) plus follow-up refactors required by new edition and clippy rules. - Validation: `cargo fmt --all` and `cargo clippy --quiet --features remote --tests --examples -- -D warnings` both pass.	2026-03-03 16:23:29 -08:00
mrncstt	367abe99d2	feat(python): support dict to SQL struct conversion in table.update() (#3089 ) ## Summary - Add `@value_to_sql.register(dict)` handler that converts Python dicts to DataFusion's `named_struct()` SQL syntax - Enables updating struct-typed columns via `table.update(values={"col": {"field_a": 1, "field_b": "hello"}})` - Recursively handles nested structs, lists, nulls, and all existing scalar types Closes #1363 ## Details The `named_struct` function was introduced in DataFusion 38 and is now available (LanceDB uses DataFusion 52.1). The implementation follows the existing `singledispatch` pattern in `util.py`. Example conversion: ```python value_to_sql({"field_a": 1, "field_b": "hello"}) # => "named_struct('field_a', 1, 'field_b', 'hello')" ``` ## Test plan - [x] Unit tests for flat struct, nested struct, list inside struct, mixed types, null values, and empty dict - [ ] CI integration tests with actual table.update() on struct columns 🔗 [DataFusion named_struct docs](https://datafusion.apache.org/user-guide/sql/scalar_functions.html#named-struct)	2026-03-03 13:36:08 -08:00
Xuanwo	52ce2c995c	fix(ci): only run npm publish on release tags (#3093 ) This PR fixes the npm publish dry-run failure for prerelease versions without changing the existing workflow trigger behavior. The publish step now detects prerelease versions from `nodejs/package.json` and always appends `--tag preview` when needed. Context: - On `main` pushes, the workflow still runs `npm publish --dry-run` by design. - Recent failures were caused by prerelease versions (for example `0.27.0-beta.3`) running without `--tag`, which npm rejects. - The previous `refs/tags/v...-beta...` check did not apply on branch pushes, so dry-run could fail even though release tags worked.	2026-03-04 01:35:10 +08:00
Sean Mackrory	e71a00998c	ci: add regression test for fastSearch in FTS queries in TypeScript (#3090 ) We recently added support for this for the Python bindings, and wanted to confirm this already worked as expected in the TS bindings.	2026-03-03 07:09:09 -08:00
Sean Mackrory	39a2ac0a1c	feat: add parity between fast_search keyword argument between vector and FTS searches (#3091 ) We don't necessarily need to do this, but one user was confused having used `fast_search=True` as a keyword argument for vector searches, but being unable to do so for FTS, even after the most recent changes. I think this is the only discrepancy in where that is possible.	2026-03-03 05:21:36 -08:00
Wyatt Alt	bc7b344fa4	feat: add support for remote index params (#3087 ) Prior to this commit the remote SDK did not support the full set of index parameters. This extends the SDK to support them.	2026-03-02 11:14:28 -08:00
Will Jones	f91d2f5fec	ci(python): pin maturin to work around bug (#3088 ) Work around for https://github.com/PyO3/maturin/issues/3059	2026-03-02 09:38:54 -08:00
Wyatt Alt	cf81b6419f	feat: add `num_deleted_rows` to delete result (#3077 )	2026-03-02 08:37:14 -08:00
Lance Release	0498ac1f2f	Bump version: 0.27.0-beta.2 → 0.27.0-beta.3	2026-02-28 01:31:51 +00:00
Lance Release	aeb1c3ee6a	Bump version: 0.30.0-beta.2 → 0.30.0-beta.3 python-v0.30.0-beta.3	2026-02-28 01:29:53 +00:00
Weston Pace	f9ae46c0e7	feat: upgrade lance to 3.0.0-rc.2 and add bindings for fast_search (#3083 )	2026-02-27 17:27:01 -08:00
Will Jones	84bf022fb1	fix(python): pin pylance to make datafusion table provider match version (#3080 )	2026-02-27 13:34:05 -08:00
Will Jones	310967eceb	ci(rust): fix linux job (#3076 )	2026-02-26 19:25:46 -08:00
Jack Ye	154dbeee2a	chore: fix clippy for PreprocessingOutput without remote feature (#3070 ) Fix clippy: ``` error: fields `overwrite` and `rescannable` are never read Error: --> /home/runner/work/xxxx/xxxx/src/lancedb/rust/lancedb/src/table/add_data.rs:158:9 \| 156 \| pub struct PreprocessingOutput { \| ------------------- fields in this struct 157 \| pub plan: Arc<dyn datafusion_physical_plan::ExecutionPlan>, 158 \| pub overwrite: bool, \| ^^^^^^^^^ 159 \| pub rescannable: bool, \| ^^^^^^^^^^^ \| = note: `-D dead-code` implied by `-D warnings` = help: to override `-D warnings` add `#[allow(dead_code)]` ```	2026-02-25 14:59:32 -08:00
Lance Release	c9c08ac8b9	Bump version: 0.27.0-beta.1 → 0.27.0-beta.2	2026-02-25 07:47:54 +00:00
Lance Release	e253f5d9b6	Bump version: 0.30.0-beta.1 → 0.30.0-beta.2 python-v0.30.0-beta.2	2026-02-25 07:46:06 +00:00
LanceDB Robot	05b4fb0990	chore: update lance dependency to v3.1.0-beta.2 (#3068 ) ## Summary - Bump Lance Rust workspace dependencies to `v3.1.0-beta.2` via `ci/set_lance_version.py`. - Update Java `lance-core.version` to `3.1.0-beta.2`. ## Verification - `cargo clippy --workspace --tests --all-features -- -D warnings` - `cargo fmt --all` ## Release Reference - refs/tags/v3.1.0-beta.2	2026-02-24 23:02:22 -08:00
Mesut-Doner	613b9c1099	feat(rust): add expression builder API for type-safe query filters (#3032 ) ## Summary Adds a Rust expression builder API as a type-safe alternative to SQL strings for query filters. ## Motivation Filtering with raw SQL strings can be awkward when using variables and special types: Closes #3038 --------- Co-authored-by: Will Jones <willjones127@gmail.com>	2026-02-24 18:44:03 -08:00
Will Jones	d5948576b9	feat: parallel inserts for local tables (#3062 ) When input data is sufficiently large, we automatically split up into parallel writes using a round-robin exchange operator. We sample the first batch to determine data width, and target size of 1 million rows or 2GB, whichever is smaller.	2026-02-24 12:26:51 -08:00
Will Jones	0d3fc7860a	ci: fix python DataFusion test (#3060 )	2026-02-24 07:59:12 -08:00
Weston Pace	531cec075c	fix: don't expect all offsets to fit in one batch in permutation reader (#3065 ) This would cause takes against large permutations to fail	2026-02-24 06:32:54 -08:00
Will Jones	0e486511fa	feat: hook up new writer for insert (#3029 ) This hooks up a new writer implementation for the `add()` method. The main immediate benefit is it allows streaming requests to remote tables, and at the same time allowing retries for most inputs. In NodeJS, we always convert the data to `Vec<RecordBatch>`, so it's always retry-able. For Python, all are retry-able, except `Iterator` and `pa.RecordBatchReader`, which can only be consumed once. Some, like `pa.datasets.Dataset` are retry-able and streaming. A lot of the changes here are to make the new DataFusion write pipeline maintain the same behavior as the existing Python-based preprocessing, such as: * casting input data to target schema * rejecting NaN values if `on_bad_vectors="error"` * applying embedding functions. In future PRs, we'll enhance these by moving the embedding calls into DataFusion and making sure we parallelize them. See: https://github.com/lancedb/lancedb/issues/3048 --------- Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>	2026-02-23 14:43:31 -08:00
Will Jones	367262662d	feat(nodejs): upgrade napi-rs from v2 to v3 (#3057 ) ## Summary - Upgrades `@napi-rs/cli` from v2 to v3, `napi`/`napi-derive` Rust crates to 3.x - Fixes a bug ([napi-rs#1170](https://github.com/napi-rs/napi-rs/issues/1170)) where the CLI failed to locate the built `.node` binary when a custom Cargo target directory is set (via `config.toml`) ## Changes package.json / CLI: - `napi.name` → `napi.binaryName`, `napi.triples` → `napi.targets` - Removed `--no-const-enum` flag and fixed output dir arg - `napi universal` → `napi universalize` Rust API migration: - `#[napi::module_init]` → `#[napi_derive::module_init]` - `napi::JsObject` → `Object`, `.get::<_, T>()` → `.get::<T>()` - `ErrorStrategy` removed; `ThreadsafeFunction` now takes an explicit `Return` type with `CalleeHandled = false` const generic - `JsFunction` + `create_threadsafe_function` replaced by typed `Function<Args, Return>` + `build_threadsafe_function().build()` - `RerankerCallbacks` struct removed (`Function<'env,...>` can't be stored in structs); `VectorQuery::rerank` now accepts the function directly - `ClassInstance::clone()` now returns `ClassInstance`, fixed with explicit deref - `Vec<u8>` in `#[napi(object)]` now maps to `Array<number>` in v3; changed to `Buffer` to preserve the TypeScript `Buffer` type TypeScript: - `inner.rerank({ rerankHybrid: async (_, args) => ... })` → `inner.rerank(async (args) => ...)` - Header provider callback wrapped in `async` to match stricter typed constructor signature 🤖 Generated with [Claude Code](https://claude.com/claude-code) --------- Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-02-23 14:42:55 -08:00
Lance Release	11efaf46ae	Bump version: 0.27.0-beta.0 → 0.27.0-beta.1	2026-02-23 18:34:48 +00:00
Lance Release	1ea22ee5ef	Bump version: 0.30.0-beta.0 → 0.30.0-beta.1 python-v0.30.0-beta.1	2026-02-23 18:33:28 +00:00
LanceDB Robot	8cef8806e9	chore: update lance dependency to v3.0.0-beta.5 (#3058 ) ## Summary - Bump Lance Rust dependencies and Java `lance-core` to v3.0.0-beta.5 (refs/tags/v3.0.0-beta.5). - Update workspace toolchain and dependency defaults needed for the new Lance release. - Resolve new clippy lint defaults introduced by the toolchain update. ## Validation - `cargo clippy --workspace --tests --all-features -- -D warnings` - `cargo fmt --all` --------- Co-authored-by: Jack Ye <yezhaoqin@gmail.com>	2026-02-23 00:39:30 -08:00
Will Jones	a3cd7fce69	fix: update DatasetConsistencyWrapper to accept same-version updates (#3055 ) ## Summary `DatasetConsistencyWrapper::update()` only stored datasets with a strictly newer version. This caused `migrate_manifest_paths_v2` to silently drop its update since the migration renames files without bumping the dataset version. The subsequent `uses_v2_manifest_paths()` call would then return the stale cached dataset. Changed the version check from `>` to `>=` so same-version updates are accepted. ## Test plan - [x] Existing `test_create_table_v2_manifest_paths_async` Python test should pass - [x] Existing `should be able to migrate tables to the V2 manifest paths` NodeJS test should pass - [x] All dataset wrapper unit tests pass locally 🤖 Generated with [Claude Code](https://claude.com/claude-code) --------- Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>	2026-02-20 16:01:15 -08:00
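
The version check described in commit a3cd7fce69 can be illustrated with a small Python sketch. This is not the Rust `DatasetConsistencyWrapper` itself: the `Snapshot` and `ConsistencyCache` names, the `manifest_scheme` field, and the `accept_same_version` switch are all hypothetical and exist only to contrast the old `>` comparison with the new `>=` one.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Snapshot:
    """Hypothetical stand-in for a cached dataset handle."""

    version: int
    manifest_scheme: str  # illustrative label only, e.g. "v1" or "v2"


class ConsistencyCache:
    """Toy cache that keeps the newest snapshot it has been handed."""

    def __init__(self, initial: Snapshot, accept_same_version: bool = True) -> None:
        self._current = initial
        self._accept_same_version = accept_same_version

    def update(self, candidate: Snapshot) -> bool:
        """Store the candidate unless it is considered stale; return whether it was stored."""
        if self._accept_same_version:
            # Behavior after the fix: a same-version snapshot replaces the cached one.
            accepted = candidate.version >= self._current.version
        else:
            # Behavior before the fix: only a strictly newer version is stored.
            accepted = candidate.version > self._current.version
        if accepted:
            self._current = candidate
        return accepted

    def get(self) -> Snapshot:
        return self._current
```

With the strict comparison, an operation that rewrites files without bumping the version (as the commit says `migrate_manifest_paths_v2` does) would leave the stale snapshot in place; with `>=`, the refreshed snapshot is stored while older versions are still rejected.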


2373 Commits
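
Commit 367abe99d2 describes converting Python dicts to DataFusion's `named_struct()` syntax through a `singledispatch` handler. The following is a minimal, self-contained sketch of that pattern, not the code in LanceDB's `util.py`: the scalar and list handlers, the quoting rules, and the bracketed list literal are assumptions made so the example runs on its own.

```python
from functools import singledispatch


@singledispatch
def value_to_sql(value) -> str:
    """Fallback for types this sketch does not handle."""
    raise NotImplementedError(f"unsupported type: {type(value).__name__}")


@value_to_sql.register(type(None))
def _(value) -> str:
    return "NULL"


@value_to_sql.register(bool)
def _(value: bool) -> str:
    # bool is registered separately so it is not rendered as an int.
    return "TRUE" if value else "FALSE"


@value_to_sql.register(int)
@value_to_sql.register(float)
def _(value) -> str:
    return str(value)


@value_to_sql.register(str)
def _(value: str) -> str:
    # Assumed quoting rule: single quotes, with embedded quotes doubled.
    return "'" + value.replace("'", "''") + "'"


@value_to_sql.register(list)
def _(value: list) -> str:
    # Assumed list syntax: a bracketed array literal.
    return "[" + ", ".join(value_to_sql(item) for item in value) + "]"


@value_to_sql.register(dict)
def _(value: dict) -> str:
    # Alternate quoted field names and recursively converted values.
    parts = []
    for key, item in value.items():
        parts.append(value_to_sql(str(key)))
        parts.append(value_to_sql(item))
    return "named_struct(" + ", ".join(parts) + ")"
```

Under these assumptions, `value_to_sql({"field_a": 1, "field_b": "hello"})` produces the string quoted in the commit message, `named_struct('field_a', 1, 'field_b', 'hello')`, and nested dicts recurse through the same handler.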