lancedb

mirror of https://github.com/lancedb/lancedb.git synced 2026-05-14 02:20:40 +00:00

Author	SHA1	Message	Date
Wyatt Alt	cf81b6419f	feat: add `num_deleted_rows` to delete result (#3077 )	2026-03-02 08:37:14 -08:00
Lance Release	0498ac1f2f	Bump version: 0.27.0-beta.2 → 0.27.0-beta.3	2026-02-28 01:31:51 +00:00
Lance Release	c9c08ac8b9	Bump version: 0.27.0-beta.1 → 0.27.0-beta.2	2026-02-25 07:47:54 +00:00
Will Jones	0e486511fa	feat: hook up new writer for insert (#3029 ) This hooks up a new writer implementation for the `add()` method. The main immediate benefit is it allows streaming requests to remote tables, and at the same time allowing retries for most inputs. In NodeJS, we always convert the data to `Vec<RecordBatch>`, so it's always retry-able. For Python, all are retry-able, except `Iterator` and `pa.RecordBatchReader`, which can only be consumed once. Some, like `pa.datasets.Dataset` are retry-able and streaming. A lot of the changes here are to make the new DataFusion write pipeline maintain the same behavior as the existing Python-based preprocessing, such as: * casting input data to target schema * rejecting NaN values if `on_bad_vectors="error"` * applying embedding functions. In future PRs, we'll enhance these by moving the embedding calls into DataFusion and making sure we parallelize them. See: https://github.com/lancedb/lancedb/issues/3048 --------- Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>	2026-02-23 14:43:31 -08:00
Will Jones	367262662d	feat(nodejs): upgrade napi-rs from v2 to v3 (#3057 ) ## Summary - Upgrades `@napi-rs/cli` from v2 to v3, `napi`/`napi-derive` Rust crates to 3.x - Fixes a bug ([napi-rs#1170](https://github.com/napi-rs/napi-rs/issues/1170)) where the CLI failed to locate the built `.node` binary when a custom Cargo target directory is set (via `config.toml`) ## Changes package.json / CLI: - `napi.name` → `napi.binaryName`, `napi.triples` → `napi.targets` - Removed `--no-const-enum` flag and fixed output dir arg - `napi universal` → `napi universalize` Rust API migration: - `#[napi::module_init]` → `#[napi_derive::module_init]` - `napi::JsObject` → `Object`, `.get::<_, T>()` → `.get::<T>()` - `ErrorStrategy` removed; `ThreadsafeFunction` now takes an explicit `Return` type with `CalleeHandled = false` const generic - `JsFunction` + `create_threadsafe_function` replaced by typed `Function<Args, Return>` + `build_threadsafe_function().build()` - `RerankerCallbacks` struct removed (`Function<'env,...>` can't be stored in structs); `VectorQuery::rerank` now accepts the function directly - `ClassInstance::clone()` now returns `ClassInstance`, fixed with explicit deref - `Vec<u8>` in `#[napi(object)]` now maps to `Array<number>` in v3; changed to `Buffer` to preserve the TypeScript `Buffer` type TypeScript: - `inner.rerank({ rerankHybrid: async (_, args) => ... })` → `inner.rerank(async (args) => ...)` - Header provider callback wrapped in `async` to match stricter typed constructor signature 🤖 Generated with [Claude Code](https://claude.com/claude-code) --------- Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-02-23 14:42:55 -08:00
Lance Release	11efaf46ae	Bump version: 0.27.0-beta.0 → 0.27.0-beta.1	2026-02-23 18:34:48 +00:00
Lance Release	7be6f45e0b	Bump version: 0.26.2 → 0.27.0-beta.0	2026-02-17 00:28:24 +00:00
Will Jones	c0230f91d2	feat(rust)!: accept `RecordBatch`, `Vec<RecordBatch>` in `create_table()` and `Table.add()` (#2948 ) BREAKING CHANGE: Arbitrary `impl RecordBatchReader` is no longer accepted, it must be made into `Box<dyn RecordBatchReader>`. This PR replaces `IntoArrow` with a new trait `Scannable` to define input row data. This provides the following advantages: 1. We can implement `Scannable` for more types than `IntoArrow`, such as `RecordBatch` and `Vec<RecordBatch>`. The `IntoArrow` trait was implemented for arbitrary `T: RecordBatchReader`, and the Rust compiler would prevent us from implementing it for foreign types like `RecordBatch` because (theoretically) those types might implement `RecordBatchReader` in the future. That's why we implement `Scannable` for `Box<dyn RecordBatchReader>` instead; since it's a concrete type it doesn't block implementing for other foreign types. 2. We can potentially replay `Scannable` values. Previously, we had to choose between buffering all data in memory and supporting retries of writes. But because `Scannable` things can optionally support re-scanning, we now have a way of supporting retries while also streaming. 3. `Scannable` can provide hints like `num_rows`, which can be used to schedule parallel writers. Without knowing the total number of rows, it's difficult to know whether it's worth writing multiple files in parallel. We don't yet fully take advantage of (2) and (3) yet, but will in future PRs. For (2), in order to be ready to leverage this, we need to hook the `Scannable` implementation up to Python and NodeJS bindings. Right now they always pass down a stream, but we want to make sure they support retries when possible. And for (3), this will need to be hooked up to #2939 and to a pipeline for running pre-processing steps (like embedding generation). ## Other changes * Moved `create_table` and `add_data` into their own modules. I've created a follow up issue to split up `table.rs` further, as it's by far the largest file: https://github.com/lancedb/lancedb/issues/2949 * Eliminated the `HAS_DATA` generic for `CreateTableBuilder`. I didn't see any public-facing places where we differentiated methods, which is why I felt this simplification was okay. * Added an `Error::External` variant and integrated some conversions to allow certain errors to pass through transparently. This will fully work once we upgrade Lance and get to take advantage of changes in https://github.com/lance-format/lance/pull/5606 * Added LZ4 compression support for write requests to remote endpoints. I checked and this has been supported on the server for > 1 year. --------- Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>	2026-02-13 14:18:36 -08:00
Jack Ye	53c7c560c9	feat: add third party licenses lists (#3010 ) The files are generated with `make licenses`, currently expected to run manually. In the future, some automations could be built.	2026-02-09 16:16:46 -08:00
Lance Release	de4f77800d	Bump version: 0.26.2-beta.0 → 0.26.2	2026-02-09 06:06:22 +00:00
Lance Release	b6ab721cf7	Bump version: 0.26.1 → 0.26.2-beta.0	2026-02-09 06:06:03 +00:00
Jack Ye	826a3e5ee9	ci(nodejs): add repository field to package.json for npm provenance (#3003 ) ## Summary - Added `repository` field to all nodejs package.json files (main package + 7 platform-specific packages) - This fixes the npm publish E422 error where sigstore provenance verification fails because the repository.url was empty ## Root Cause Failing CI: https://github.com/lancedb/lancedb/actions/runs/21770794768/job/62821570260 npm's sigstore provenance verification requires the `repository.url` field in package.json to match the GitHub repository URL from the provenance bundle. The platform-specific packages (`@lancedb/lancedb-darwin-arm64`, etc.) were missing this field entirely, causing the publish to fail with: ``` npm error 422 Unprocessable Entity - Error verifying sigstore provenance bundle: Failed to validate repository information: package.json: "repository.url" is "", expected to match "https://github.com/lancedb/lancedb" from provenance ``` 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-authored-by: Claude Opus 4.5 <noreply@anthropic.com>	2026-02-08 22:04:32 -08:00
Lance Release	9fac56252e	Bump version: 0.26.1-beta.0 → 0.26.1	2026-02-07 00:33:18 +00:00
Lance Release	c55ca20c1b	Bump version: 0.26.0 → 0.26.1-beta.0	2026-02-07 00:33:02 +00:00
Lance Release	55f09ef1cd	Bump version: 0.26.0-beta.0 → 0.26.0	2026-02-06 18:08:30 +00:00
Lance Release	e9d8651d18	Bump version: 0.25.0-beta.0 → 0.26.0-beta.0	2026-02-06 18:08:08 +00:00
Jack Ye	3ad7be9825	fix: remove x86_64-apple-darwin from list of npm triples (#2987 ) Missed during https://github.com/lancedb/lancedb/pull/2987	2026-02-06 09:43:44 -08:00
Jack Ye	0859312b83	feat: add initial and latest storage options apis (#2966 ) Expose `initial_storage_options()` and `latest_storage_options()` in lance Dataset, in lancedb rust, python and typescript SDKs. --------- Co-authored-by: Claude Opus 4.5 <noreply@anthropic.com>	2026-02-04 10:31:39 -08:00
Jack Ye	bd2c6d0763	chore: update lance dependency to v2.0.0-rc.4 (#2972 )	2026-02-03 14:38:39 -08:00
Vedant Madane	d3e15f3e17	fix(node): allow bigint[] for takeRowIds (#2916 ) ## Summary This PR changes takeRowIds to accept bigint[] instead of number[], matching the type of _rowid returned by withRowId(). ## Problem When retrieving row IDs using \withRowId()\ and querying them back with takeRowIds(), users get an error because: 1. _rowid values are returned as JavaScript bigint 2. takeRowIds() expected number[] 3. NAPI failed to convert: Error: Failed to convert napi value BigInt into rust type i64 ## Reproduction \\\js import lancedb from '@lancedb/lancedb'; const db = await lancedb.connect('memory://'); const table = await db.createTable('test', [{ id: 1, vector: [1.0, 2.0] }]); const results = await table.query().withRowId().toArray(); const rowIds = results.map(row => row._rowid); console.log('types:', rowIds.map(id => typeof id)); // ['bigint'] await table.takeRowIds(rowIds).toArray(); // âŒ Error before fix \\\ ## Solution - Updated TypeScript signature from takeRowIds(rowIds: number[]) to takeRowIds(rowIds: bigint[]) - Updated Rust NAPI binding to accept Vec<BigInt> and convert using get_u64() Fixes #2722 --------- Co-authored-by: Will Jones <willjones127@gmail.com>	2026-02-03 10:09:51 -08:00
Lance Release	571295b0d9	Bump version: 0.24.1 → 0.25.0-beta.0	2026-02-03 04:48:34 +00:00
Lance Release	8500b16eca	Bump version: 0.24.1-beta.0 → 0.24.1	2026-01-26 23:39:18 +00:00
Lance Release	57e7282342	Bump version: 0.24.0 → 0.24.1-beta.0	2026-01-26 23:38:50 +00:00
Jack Ye	e4552e577a	chore(revert): revert update lance dependency to v2.0.0-rc.1 (#2936 ) (#2941 ) This reverts commit `bd84bba14d`, so that we can bump version to 1.0.4-rc.1	2026-01-26 11:13:59 -08:00
LanceDB Robot	bd84bba14d	chore: update lance dependency to v2.0.0-rc.1 (#2936 ) ## Summary - bump Lance dependencies to v2.0.0-rc.1 (git tag) - align Arrow/DataFusion/PyO3 versions for the new Lance release - update Python bindings for PyO3 0.26 (attach API + Py<PyAny>) ## Verification - `cargo clippy --workspace --tests --all-features -- -D warnings` - `cargo fmt --all` ## Reference - https://github.com/lance-format/lance/releases/tag/v2.0.0-rc.1 --------- Co-authored-by: Jack Ye <yezhaoqin@gmail.com> Co-authored-by: Will Jones <willjones127@gmail.com> Co-authored-by: Claude Opus 4.5 <noreply@anthropic.com> Co-authored-by: BubbleCal <bubble_cal@outlook.com>	2026-01-22 13:14:38 -08:00
Lance Release	ac07f8068c	Bump version: 0.24.0-beta.1 → 0.24.0	2026-01-22 01:10:15 +00:00
Lance Release	bba362d372	Bump version: 0.24.0-beta.0 → 0.24.0-beta.1	2026-01-22 01:09:53 +00:00
Lance Release	790ba7115b	Bump version: 0.23.1 → 0.24.0-beta.0	2026-01-21 12:21:53 +00:00
Will Jones	1840aa7edc	feat(rust)!: remove default features (#2912 ) BREAKING CHANGE: removes `aws`, `dynamodb`, `azure`, `gcs`, `oss`, `huggingface` from default Rust features. They can be enabled by users as needed. They are still enabled for Python and NodeJS, since those users don't control the compilation of artifacts. Closes #2911	2026-01-13 11:23:14 -08:00
Lance Release	8bcac7e372	Bump version: 0.23.1-beta.2 → 0.23.1	2026-01-02 17:39:19 +00:00
Lance Release	e496184ab2	Bump version: 0.23.1-beta.1 → 0.23.1-beta.2	2026-01-02 17:38:54 +00:00
Lance Release	0667fa38d4	Bump version: 0.23.1-beta.0 → 0.23.1-beta.1	2025-12-17 06:59:29 +00:00
Lance Release	2fd712312f	Bump version: 0.23.0 → 0.23.1-beta.0	2025-12-17 03:30:51 +00:00
Lance Release	94bdffe13c	Bump version: 0.23.0-beta.2 → 0.23.0	2025-12-16 16:58:35 +00:00
Lance Release	b93ea3a388	Bump version: 0.23.0-beta.1 → 0.23.0-beta.2	2025-12-16 16:57:55 +00:00
Lance Release	0960e19559	Bump version: 0.23.0-beta.0 → 0.23.0-beta.1	2025-12-05 00:36:39 +00:00
Lance Release	6f79770248	Bump version: 0.22.4-beta.3 → 0.23.0-beta.0	2025-12-04 19:33:37 +00:00
Lance Release	b2d06a3a73	Bump version: 0.22.4-beta.2 → 0.22.4-beta.3	2025-12-02 22:01:59 +00:00
Prashanth Rao	76bcc78910	docs: nodejs failing CI is fixed (#2802 ) Fixes the breaking CI for nodejs, related to the documentation of the new Permutation API in typescript. - Expanded the generated typings in `nodejs/lancedb/native.d.ts` to include `SplitCalculatedOptions`, `splitNames` fields, and the persist/options-based `splitCalculated` methods so the permutation exports match the native API. - The previous block comment block had an inconsistency. `splitCalculated` takes an options object (`SplitCalculatedOptions`) in our bindings, not a bare string. The previous example showed `builder.splitCalculated("user_id % 3");`, which doesn’t match the actual signature and would fail TS typecheck. I updated the comment to `builder.splitCalculated({ calculation: "user_id % 3" });` so the example is now correct. - Updated the `splitCalculated` example in `nodejs/lancedb/permutation.ts` to use the options object. - Ran `npm docs` to ensure docs build correctly. > [!NOTE] > Disclaimer: I used GPT-5.1-Codex-Max to make these updates, but I have read the code and run `npm run docs` to verify that they work and are correct to the best of my knowledge.	2025-11-20 16:16:38 -08:00
Prashanth Rao	135dfdc7ec	docs: 404 and outdated URLs should now work (#2800 ) Did a full scan of all URLs that used to point to the old mkdocs pages, and now links to the appropriate pages on lancedb.com/docs or lance.org docs. --------- Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>	2025-11-20 11:14:20 -08:00
Lance Release	3531393523	Bump version: 0.22.4-beta.1 → 0.22.4-beta.2	2025-11-19 20:25:41 +00:00
Wyatt Alt	386fc9e466	feat: add num_attempts to merge insert result (#2795 ) This pipes the num_attempts field from lance's merge insert result through lancedb. This allows callers of merge_insert to get a better idea of whether transaction conflicts are occurring.	2025-11-19 09:32:57 -08:00
Lance Release	ce1bafec1a	Bump version: 0.22.4-beta.0 → 0.22.4-beta.1	2025-11-19 12:58:59 +00:00
Will Jones	1cf3917a87	ci: make rust ci faster, get ci green (#2782 ) * Add `ci` profile for smaller build caches. This had a meaningful impact in Lance, and I expect a similar impact here. https://github.com/lancedb/lance/pull/5236 * Get caching working in Rust. Previously was not working due to `workspaces: rust`. * Get caching working in NodeJs lint job. Previously wasn't working because we installed the toolchain after we called `- uses: Swatinem/rust-cache@v2`, which invalidates the cache locally. * Fix broken pytest from async io transition (`pytest.PytestRemovedIn9Warning`) * Altered `get_num_sub_vectors` to handle bug in case of 4-bit PQ. This was cause of `rust future panicked: unknown error`. Raised an issue upstream to change panic to error: https://github.com/lancedb/lance/issues/5257 * Call `npm run docs` to fix doc issue. * Disable flakey Windows test for consistency. It's just an OS-specific timer issue, not our fault. * Fix Windows absolute path handling in namespaces. Was causing CI failure `OSError: [WinError 123] The filename, directory name, or volume label syntax is incorrect: `	2025-11-18 09:04:56 -08:00
Lance Release	e2d7640021	Bump version: 0.22.3 → 0.22.4-beta.0	2025-11-17 08:43:51 +00:00
Lance Release	e34f51713a	Bump version: 0.22.3-beta.6 → 0.22.3	2025-11-07 04:59:18 +00:00
Lance Release	abaf5ac27f	Bump version: 0.22.3-beta.5 → 0.22.3-beta.6	2025-11-07 04:58:38 +00:00
Weston Pace	aeac9c7644	feat: add python Permutation class to mimic hugging face dataset and provide pytorch dataloader (#2725 )	2025-11-06 16:15:33 -08:00
Lance Release	aed4a7c98e	Bump version: 0.22.3-beta.4 → 0.22.3-beta.5	2025-10-31 17:08:56 +00:00
Lance Release	0b7b27481e	Bump version: 0.22.3-beta.3 → 0.22.3-beta.4	2025-10-31 01:14:39 +00:00

1 2 3 4 5 ...

442 Commits