lancedb

mirror of https://github.com/lancedb/lancedb.git synced 2026-06-29 00:50:38 +00:00

Author	SHA1	Message	Date
Jack Ye	d96c90c5b9	docs(node): update OAuth config docs	2026-06-27 00:35:55 -07:00
Jack Ye	c1a8702c65	feat(node): expose OAuth connection config	2026-06-27 00:02:59 -07:00
Jack Ye	3df3043563	feat(rust): add OAuth header provider (#3579 ) ## Summary Add the Rust OAuth header provider for remote LanceDB connections. This supports client credentials and Azure managed identity flows, handles token caching and refresh, redacts secrets in Debug output, and wires `ConnectBuilder::oauth_config()` into the remote client while rejecting ambiguous API-key/header-provider combinations.	2026-06-26 23:57:16 -07:00
Ryan Green	8a5cd74e48	fix: ensure read freshness provider is built into namespace client (#3571 ) By default the read freshness provider was not included in the namespace client, preventing the read freshness headers from being included in the request. This prevents checkout_latest() from working as expected when using the namespace client. This fix ensures the provider is built into the client when the namespace impl and properties are provided.	2026-06-25 21:47:55 -07:00
Lance Release	448d5ec20f	Bump version: 0.31.0-beta.2 → 0.31.0-beta.3	2026-06-25 01:55:06 +00:00
Lance Release	8718345229	Bump version: 0.34.0-beta.2 → 0.34.0-beta.3 python-v0.34.0-beta.3	2026-06-25 01:53:51 +00:00
LanceDB Robot	026fedc286	chore: update lance dependency to v9.0.0-beta.8 (#3580 ) Updates Lance dependencies from v9.0.0-beta.4 to v9.0.0-beta.8. This refreshes the Rust workspace lockfile and the Java lance-core version. Triggering Lance tag: https://github.com/lance-format/lance/releases/tag/v9.0.0-beta.8	2026-06-24 18:52:59 -07:00
Jack Ye	fe287dc98c	fix(remote): support namespace clients with dynamic headers Bridge LanceDB dynamic header providers into Lance Namespace dynamic context providers for live remote namespace clients.	2026-06-24 15:30:00 -07:00
Jack Ye	411568b72c	fix(remote): omit empty api key header (#3573 ) ## Summary Skip inserting the x-api-key header when the configured API key is empty. This lets bearer-token or other dynamic-header authentication avoid sending an empty static API key header alongside the real auth header.	2026-06-24 13:25:59 -07:00
LanceDB Robot	ebf8d55ede	chore: update lance dependency to v9.0.0-beta.4 (#3570 ) Bumps the Lance dependencies to v9.0.0-beta.4 and refreshes the generated lockfile metadata. No compatibility fixes were required beyond the dependency updates. Triggered by https://github.com/lance-format/lance/releases/tag/v9.0.0-beta.4	2026-06-24 10:16:29 -05:00
Raphael Malikian	0ba70d96c3	fix: add missing stacklevel=2 to warnings.warn() and fix broken message concatenation (Fixes #3563 ) (#3564 ) Fixes #3563 ## Summary - Add `stacklevel=2` to 10 `warnings.warn()` calls across 4 files - Fix broken message concatenation in `table.py` where the second string was incorrectly passed as the `category` parameter ## Problem Multiple `warnings.warn()` calls in the `python/lancedb/` codebase were missing the `stacklevel` parameter. Without `stacklevel=2`, warnings point to library internals instead of the caller's code, making it impossible for users to identify which of their function calls triggered the warning. Additionally, two calls in `table.py` (lines 3411 and 3420) had a more serious bug: the deprecation message was split across two separate string arguments, causing the second string to be passed as the `category` parameter instead of being concatenated with the first string. This would cause `TypeError` when the warning was triggered. ## Changes \| File \| Fixes \| Description \| \|------\|-------\|-------------\| \| `embeddings/colpali.py` \| 1 \| Add `stacklevel=2` to `use_token_pooling` deprecation warning \| \| `remote/db.py` \| 3 \| Add `stacklevel=2` to `request_thread_pool`, `connection_timeout`, `read_timeout` deprecation warnings \| \| `remote/table.py` \| 3 \| Add `stacklevel=2` to `cleanup_old_versions`, `compact_files`, `optimize` no-op warnings \| \| `table.py` \| 3 \| Fix broken message concatenation for `data_storage_version` and `enable_v2_manifest_paths` deprecation warnings + add `stacklevel=2` to `retrain` deprecation warning \| ## Verification ```python # All warnings.warn() calls now have stacklevel python3 -c "import ast, os; ..." # Result: All warnings.warn() calls now have stacklevel! 
``` ## Changelog \| Date \| Change \| Author \| \|------\|--------\|--------\| \| 2026-06-20 \| Fix missing stacklevel=2 in 10 warnings.warn() calls + fix broken message concatenation \| rtmalikian \| ### Files Changed - `python/python/lancedb/embeddings/colpali.py` — Add stacklevel=2 - `python/python/lancedb/remote/db.py` — Add stacklevel=2 to 3 deprecation warnings - `python/python/lancedb/remote/table.py` — Add stacklevel=2 to 3 no-op warnings - `python/python/lancedb/table.py` — Fix broken message concatenation + add stacklevel=2 ### Verification - AST-based audit confirms all `warnings.warn()` calls now include `stacklevel=2` - Syntax check passes for all 4 modified files --- About the Author: Raphael Malikian — Clinical AI Solutions Architect. I specialise in building and fixing AI/ML systems for healthcare, including vector databases, RAG pipelines, and clinical NLP. If you need help with your project or think I can add value to your organisation, feel free to reach out — I'd love to connect. 📧 rtmalikian@gmail.com 🔗 GitHub: https://github.com/rtmalikian 🔗 LinkedIn: http://www.linkedin.com/in/raphael-t-malikian-mbbs-bsc-hons-71075436a --- Disclosure: This code was developed with assistance from Hermes Agent (Nous Research). All changes were reviewed, tested against the actual codebase, and verified for correctness. Signed-off-by: rtmalikian <rtmalikian@gmail.com>	2026-06-23 13:42:59 -07:00
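The two problems described in the commit above (#3564) can be reproduced in a few lines. This is a minimal sketch, not code from the LanceDB repository: `old_api` and `broken_api` are hypothetical names used only for illustration.

```python
import warnings


def old_api():
    # stacklevel=2 attributes the warning to the caller of old_api(),
    # not to this line inside the library.
    warnings.warn(
        "old_api() is deprecated; "
        "use new_api() instead",  # adjacent literals form one message
        DeprecationWarning,
        stacklevel=2,
    )


def broken_api():
    # Hypothetical reproduction of the bug: the comma turns the second
    # string into the `category` argument, which must be a Warning subclass.
    warnings.warn("broken_api() is deprecated; ", "use new_api() instead")


if __name__ == "__main__":
    old_api()  # reported at this call site because of stacklevel=2
    try:
        broken_api()
    except TypeError as exc:
        print(f"TypeError: {exc}")
```

Adjacent string literals without a comma are concatenated at compile time, which is why removing the comma is enough to restore a single message.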
Lance Release	0749532c3c	Bump version: 0.31.0-beta.1 → 0.31.0-beta.2	2026-06-23 16:23:08 +00:00
Lance Release	26481a4b74	Bump version: 0.34.0-beta.1 → 0.34.0-beta.2 python-v0.34.0-beta.2	2026-06-23 16:21:52 +00:00
dependabot[bot]	08596f1644	chore(deps): bump the rust-minor-patch group with 2 updates (#3565 ) Bumps the rust-minor-patch group with 2 updates: [bytes](https://github.com/tokio-rs/bytes) and [napi](https://github.com/napi-rs/napi-rs). Updates `bytes` from 1.11.1 to 1.12.0. Release notes, sourced from [bytes's releases](https://github.com/tokio-rs/bytes/releases) (the same entry appears in [bytes's changelog](https://github.com/tokio-rs/bytes/blob/master/CHANGELOG.md)): ## Bytes v1.12.0 (June 18th, 2026) ### Added - Add `BytesMut::extend_from_within()` ([#818](https://redirect.github.com/tokio-rs/bytes/issues/818)) - Add `BytesMut::try_unsplit()` ([#746](https://redirect.github.com/tokio-rs/bytes/issues/746)) ### Fixed - Fix panic in `get_int` if `nbytes` is zero ([#806](https://redirect.github.com/tokio-rs/bytes/issues/806)) ### Changed - Pass vtable data by value ([#826](https://redirect.github.com/tokio-rs/bytes/issues/826)) - Exclude development scripts from published package ([#810](https://redirect.github.com/tokio-rs/bytes/issues/810)) ### Documented - Document that `BytesMut::{reserve,try_reserve}` doesn't preserve unused capacity ([#808](https://redirect.github.com/tokio-rs/bytes/issues/808)) ### Commits - `91402ce` Release bytes v1.12.0 ([#831](https://redirect.github.com/tokio-rs/bytes/issues/831)) - `2256e6d` chore: add safety comments on unsafe blocks ([#827](https://redirect.github.com/tokio-rs/bytes/issues/827)) - `245adff` Pass vtable data by value ([#826](https://redirect.github.com/tokio-rs/bytes/issues/826)) - `00cc5ff` Implement `BytesMut::extend_from_within` ([#818](https://redirect.github.com/tokio-rs/bytes/issues/818)) - `5b79d31` Merge tag 'v1.11.1' - `804ee6d` Make try_unsplit method public ([#746](https://redirect.github.com/tokio-rs/bytes/issues/746)) - `fd426ca` Exclude development scripts from published package ([#810](https://redirect.github.com/tokio-rs/bytes/issues/810)) - `b4ed70d` Add test for copy_to_bytes() -> BytesMut avoiding clone ([#809](https://redirect.github.com/tokio-rs/bytes/issues/809)) - `94e4291` Document that `BytesMut::{reserve,try_reserve}` doesn't preserve unused capac... - `acd1e0f` Fix `get_int` if `nbytes` is zero ([#806](https://redirect.github.com/tokio-rs/bytes/issues/806)) - See full diff in [compare view](https://github.com/tokio-rs/bytes/compare/v1.11.1...v1.12.0) Updates `napi` from 3.9.1 to 3.9.3. Release notes, sourced from [napi's releases](https://github.com/napi-rs/napi-rs/releases): ## napi-v3.9.3 ### Fixed - *(napi)* sync referred flag when creating a weak ThreadsafeFunction ([#3337](https://redirect.github.com/napi-rs/napi-rs/pull/3337)) ### Other - *(napi)* outline non-generic core of ThreadsafeFunction::create ([#3334](https://redirect.github.com/napi-rs/napi-rs/pull/3334)) ## napi-v3.9.2 ### Fixed - *(napi)* ReadableStream Reader loses chunks and aborts on errored streams ([#3328](https://redirect.github.com/napi-rs/napi-rs/pull/3328)) ### Commits - `ee58383` chore(napi): release v3.9.3 ([#3335](https://redirect.github.com/napi-rs/napi-rs/issues/3335)) - `c787276` fix(napi): sync referred flag when creating a weak ThreadsafeFunction ([#3337](https://redirect.github.com/napi-rs/napi-rs/issues/3337)) - `d4276ca` chore(deps): update dependency oxc-parser to ^0.137.0 ([#3336](https://redirect.github.com/napi-rs/napi-rs/issues/3336)) - `a0b1831` perf(napi): outline non-generic core of ThreadsafeFunction::create ([#3334](https://redirect.github.com/napi-rs/napi-rs/issues/3334)) - `3759d7b` chore(deps): update rust-lang/crates-io-auth-action action to v1.0.5 ([#3333](https://redirect.github.com/napi-rs/napi-rs/issues/3333)) - `dd41eeb` build(deps): bump protobufjs from 7.6.2 to 7.6.4 ([#3332](https://redirect.github.com/napi-rs/napi-rs/issues/3332)) - `cdd48b3` chore(deps): update dependency oxc-parser to ^0.136.0 ([#3314](https://redirect.github.com/napi-rs/napi-rs/issues/3314)) - `e98762d` chore(deps): update yarn monorepo to v4.17.0 ([#3330](https://redirect.github.com/napi-rs/napi-rs/issues/3330)) - `529a78d` chore(napi): release v3.9.2 ([#3329](https://redirect.github.com/napi-rs/napi-rs/issues/3329)) - `88f4b97` fix(napi): ReadableStream Reader loses chunks and aborts on errored streams (... - Additional commits viewable in [compare view](https://github.com/napi-rs/napi-rs/compare/napi-v3.9.1...napi-v3.9.3) Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>	2026-06-23 09:21:05 -07:00
LanceDB Robot	f16da19b78	chore: update lance dependency to v9.0.0-beta.2 (#3569 ) Updates LanceDB's Lance dependencies to v9.0.0-beta.2 across the Rust workspace and Java lance-core dependency. No compatibility fixes were required; clippy and formatting pass after installing the missing toolchain components on the runner. Triggering Lance tag: https://github.com/lance-format/lance/releases/tag/v9.0.0-beta.2	2026-06-23 09:20:13 -07:00
Drew Gallardo	41ac32a344	feat(rust): add blob read and materialization APIs (#3562 ) This PR is for the Read path against blob v2. #3528 handles declare + write, and this adds materialization on local tables. - blob_columns() - fetch_blobs(column, row_ids) → bytes - fetch_blob_files(column, row_ids) → lazy handles - Pass _rowid from query().with_row_id(). Remote returns NotSupported (for now). ### Use cases Search, grab row ids, materialize images: ```rust let row_ids = /* _rowid from hits */; let images = table.fetch_blobs("image", &row_ids).await?; ``` Large blobs: open handles, read only what you need: ```rust let handles = table.fetch_blob_files("image", &row_ids).await?; let bytes = handles[0].as_ref().unwrap().read().await?; ``` Filter then batch fetch: collect ids from a filter, one call. Multiple blob columns: image and thumbnail independently. Row ids from before compact: still resolve. ### Alignment note Lance `read_blobs` drops null rows. We descriptor-take first, read non-null ids, re-expand to match input order. Null and zero-length blobs come back null/None. Bytes path sets `preserve_order(true)`. So I added: ``` TODO(lance): expose selection_index or an aligned execute so we can drop the pre-read. ``` ### Tests `cargo test -p lancedb --test blob_integration` - 30 tests covering nulls, reorder, dups, cross-fragment bytes + files, compact, delete, legacy v1 errors. --------- Co-authored-by: Claude Opus 4.8 (1M context) <noreply@anthropic.com>	2026-06-23 06:58:26 -07:00
Drew Gallardo	ba1ef34481	feat(rust): add blob v2 schema declaration and write path (#3528 ) First Rust PR for #3231. Lance already stores blob v2. This adds the LanceDB write side. ```rust let schema = Schema::new(vec![ Field::new("id", DataType::Int64, false), lancedb::blob("image", true), ]); let table = db.create_table("photos", schema).execute().await?; table.add(batch_with_large_binary_image_column).execute().await?; ``` Read/materialize and Python are follow-up PRs. ### Testing - cargo test -p lancedb --test blob_integration - cargo test -p lancedb blob:: datafusion::blob_coerce - cargo test -p lancedb (591 passed) - cargo clippy --features remote --tests --------- Co-authored-by: Claude Fable 5 <noreply@anthropic.com>	2026-06-19 12:33:15 -07:00
Will Jones	85d870b397	fix: parse RFC 3339 created_at and improve IndexConfig repr (#3558 ) The server now serializes an index's `created_at` as an RFC 3339 string (e.g. `"2026-06-18T21:37:36.637Z"`), but the client deserializer only accepted a unix timestamp in milliseconds. This caused `list_indices` to fail with: ``` Failed to parse list_indices response: invalid type: string "2026-06-18T21:37:36.637Z", expected a unix timestamp in milliseconds ``` This PR replaces the fixed millisecond deserializer with a custom one that accepts both an RFC 3339 string (current server) and a unix-millisecond integer (legacy deployments), so the client works against any server version. It also improves the `IndexConfig` repr in the Python bindings. Previously it printed only three fields (`Index(FTS, columns=["text"], name="text_idx")`), hiding the metadata that `list_indices` returns. It now renders every populated field, omitting any that are `None`. Each value is valid Python — integer counts use `_` thousands separators and `created_at` uses the `datetime` repr — so values round-trip. 
The real repr is a single line; it's wrapped here for readability: ```python >>> table.list_indices() [IndexConfig( name="text_idx", index_type="FTS", columns=["text"], index_uuid="aefd3e00-2f95-4bdc-92ac-06de84442bf1", type_url="/lance.table.InvertedIndexDetails", created_at=datetime.datetime(2026, 6, 18, 21, 37, 36, 637000, tzinfo=datetime.timezone.utc), num_indexed_rows=2, size_bytes=3_669, num_segments=1, index_version=1, index_details={ 'lance_tokenizer': None, 'base_tokenizer': 'simple', 'language': 'English', 'with_position': False, 'max_token_length': 40, 'lower_case': True, 'stem': True, 'remove_stop_words': True, 'custom_stop_words': None, 'ascii_folding': True, 'min_ngram_length': 3, 'max_ngram_length': 3, 'prefix_only': False, }, )] ``` Fixes #3556 🤖 Generated with [Claude Code](https://claude.com/claude-code) --------- Co-authored-by: Claude Opus 4.8 (1M context) <noreply@anthropic.com>	2026-06-19 10:40:56 -07:00
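The dual-format `created_at` handling described in the commit above (#3558) lives in the Rust client. The following is only a Python sketch of the same idea, assuming a hypothetical helper named `parse_created_at`; it is not the client's deserializer.

```python
from datetime import datetime, timedelta, timezone
from typing import Union

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def parse_created_at(value: Union[str, int]) -> datetime:
    """Accept an RFC 3339 string or a unix timestamp in milliseconds."""
    if isinstance(value, bool):
        # bool is a subclass of int; reject it explicitly.
        raise TypeError("created_at must be a string or an integer")
    if isinstance(value, int):
        # Legacy form: integer milliseconds since the unix epoch.
        return EPOCH + timedelta(milliseconds=value)
    if isinstance(value, str):
        # Current form: RFC 3339. Rewrite a trailing "Z" as an explicit
        # offset so older Python versions can parse it too.
        if value.endswith("Z"):
            value = value[:-1] + "+00:00"
        return datetime.fromisoformat(value)
    raise TypeError("created_at must be a string or an integer")


if __name__ == "__main__":
    print(repr(parse_created_at("2026-06-18T21:37:36.637Z")))
```

Trying the integer branch first keeps legacy payloads cheap to parse, and anything that is neither a string nor an integer fails with a typed error instead of a silent default.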
LanceDB Robot	c46d59d2ee	chore: update lance dependency to v8.0.0-rc.1 (#3557 ) Updates LanceDB Lance dependencies to Lance v8.0.0-rc.1. This includes the Rust workspace Lance crates, Cargo.lock, and Java lance-core version. Triggering tag: https://github.com/lance-format/lance/releases/tag/v8.0.0-rc.1	2026-06-19 11:40:38 -05:00
Lance Release	113f187c2d	Bump version: 0.31.0-beta.0 → 0.31.0-beta.1	2026-06-19 16:00:59 +00:00
Lance Release	3b279f5705	Bump version: 0.34.0-beta.0 → 0.34.0-beta.1 python-v0.34.0-beta.1	2026-06-19 15:59:43 +00:00
Ryan Green	e1334954d7	fix: overflow using sys.maxsize for k in query with namespace connection (#3561 )	2026-06-19 12:57:10 -02:30
LanceDB Robot	2f65a233fe	chore: update lance dependency to v8.0.0-beta.19 (#3555 ) Updates LanceDB's Lance dependencies from v8.0.0-beta.17 to v8.0.0-beta.19. This includes the Rust workspace Lance crates, Cargo.lock refresh, and Java lance-core version bump. Triggering Lance tag: https://github.com/lance-format/lance/releases/tag/v8.0.0-beta.19	2026-06-18 14:16:57 -05:00
Lance Release	e81356089a	Bump version: 0.30.1-beta.2 → 0.31.0-beta.0	2026-06-18 18:43:22 +00:00
Lance Release	4f4cce3f64	Bump version: 0.33.1-beta.2 → 0.34.0-beta.0 python-v0.34.0-beta.0	2026-06-18 18:42:07 +00:00
LanceDB Robot	c1c19cd133	chore: update lance dependency to v8.0.0-beta.17 (#3552 ) Updates the Lance Rust workspace dependencies and Java lance-core dependency to v8.0.0-beta.17. No LanceDB compatibility code changes were required; validation passed with cargo clippy and cargo fmt. Triggering Lance tag: https://github.com/lance-format/lance/releases/tag/v8.0.0-beta.17	2026-06-17 16:08:09 -07:00
Will Jones	ce5dadd386	fix(ci): allow shell pre-commit hooks in bumpversion configs (#3554 ) The "Create release commit" workflow (`make-release-commit.yml`) has failed on its last two runs; no release tags have been created since June 4. Since this workflow creates the tag that the cargo/npm/pypi/java publish workflows trigger off of, all recent releases are effectively blocked. The workflow installs `bump-my-version` unpinned. Version `1.4.0` added a check that refuses to run `pre_commit_hooks` containing shell syntax (pipes, `&&`, `if`, variable expansion) unless `allow_shell_hooks = true` is set. Both bumpversion configs use such hooks: - `python/.bumpversion.toml` — updates `Cargo.lock` after the bump (fails first) - `.bumpversion.toml` — runs `mvn versions:set` for the Java packages The job dies at the version-bump step with: > Hook '…' contains shell syntax (pipes, redirects, or variable expansion). Set `allow_shell_hooks = true` in your configuration to enable shell execution… This sets `allow_shell_hooks = true` in both configs to restore the previous behavior. Co-authored-by: Claude Opus 4.8 (1M context) <noreply@anthropic.com>	2026-06-17 15:22:05 -07:00
Armaan Sandhu	1f8ebef3cd	fix(rust): return typed errors instead of panicking in Bedrock embedding path (#3512 ) Closes #3506 ## Problem The Bedrock embedding compute path (`rust/lancedb/src/embeddings/bedrock.rs`) panics instead of returning a typed error in several places: - `serde_json::to_vec(&request_body).unwrap()`: request serialization. - `block_in_place(...).unwrap()`: the AWS `invoke_model` send result; any API error terminates the worker instead of propagating. - `v.as_f64().unwrap() as f32`: panics on non-numeric values in the returned embedding array. - `Handle::current()` + `block_in_place` assume a multi-threaded Tokio runtime and panic when that assumption does not hold (no runtime, or a current-thread runtime). Malformed payloads, non-numeric embedding values, or an incompatible runtime should surface as typed errors and never panic. ## Fix - Serialize the request body before the blocking section so a serialization failure returns `Error::Runtime` via `?`. - Map the `invoke_model` send error to `Error::Runtime` instead of `unwrap`. - Add a `json_array_to_f32` helper that converts the response array to `Vec<f32>`, returning `Error::Runtime` for a missing/non-array field or a non-numeric element (used by both the Titan and Cohere paths). - Add `current_multi_thread_handle()` (`Handle::try_current()` + a `RuntimeFlavor::CurrentThread` guard) so an absent or incompatible runtime returns a typed error rather than panicking in `block_in_place`. Scope note: the sibling `openai.rs` provider uses the same `block_in_place` + `block_on` bridge, so the bridge pattern itself is kept; this change only removes the panic paths that are specific to the Bedrock provider. ## Testing Added 6 unit tests (no AWS credentials required): - `json_array_to_f32`: valid numbers, non-array payload, and non-numeric element. - `current_multi_thread_handle`: errors with no runtime, errors on a current-thread runtime, and succeeds on a multi-threaded runtime. 
All pass; `cargo fmt` and `cargo clippy` clean. Build/test with `--features bedrock,lance/protoc`.	2026-06-17 15:06:44 -07:00
whitewooood	217fd8491d	fix(python): clarify single dictionary input error (#3537 ) ## Summary - clarify the Python error for passing a single dictionary to table creation/add paths - add a regression test for `create_table(..., data=dict)` so it points users to a list of dictionaries Fixes #409 ## Testing - `python -m pytest python/tests/test_table.py -q` - `python -m ruff format python/lancedb/table.py python/lancedb/scannable.py python/tests/test_table.py` - `python -m ruff check python/lancedb/table.py python/lancedb/scannable.py python/tests/test_table.py`	2026-06-17 12:55:55 -07:00
JSap0914	9128dbcd7a	fix(util): escape single quotes in struct field names in value_to_sql (#3548 ) ### Bug `value_to_sql({...})` builds a DataFusion `named_struct(...)` literal but interpolates the struct field names directly as `f"'{k}'"`. A field name that contains a single quote therefore produces invalid SQL: ```python >>> from lancedb.util import value_to_sql >>> value_to_sql({"it's": 1}) "named_struct('it's', 1)" # invalid SQL — the quote terminates the literal ``` String values are already escaped (single quotes doubled) by the `str` branch of `value_to_sql`, so keys and values were handled inconsistently. This affects `Table.update(values={...})` / `merge_insert` when a struct column has a field name containing `'`. ### Fix Render the key through `value_to_sql(str(k))` so field names are escaped exactly like string values: ```python >>> value_to_sql({"it's": 1}) "named_struct('it''s', 1)" ``` Keys without special characters are unchanged (`'a'` stays `'a'`), so existing behavior is preserved. ### Verification ``` $ pytest python/tests/test_util.py -k value_to_sql_dict ``` The new `test_value_to_sql_dict_key_escaping` covers quoted keys (incl. nested structs) and fails on `main` (`named_struct('it's', 1)`), passes with this change; the existing `test_value_to_sql_dict` still passes. Co-authored-by: JSap0914 <JSap0914@users.noreply.github.com>	2026-06-17 12:55:43 -07:00
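The escaping rule from the commit above (#3548) can be shown without LanceDB installed. This is a simplified sketch under stated assumptions: `quote_sql_string` and `to_sql` are hypothetical stand-ins, not `lancedb.util.value_to_sql`, and only strings, numbers, and nested dictionaries are handled.

```python
def quote_sql_string(text: str) -> str:
    # Standard SQL escaping: double every single quote inside the literal.
    return "'" + text.replace("'", "''") + "'"


def to_sql(value) -> str:
    # Hypothetical, reduced renderer covering only str, int, float and dict.
    if isinstance(value, str):
        return quote_sql_string(value)
    if isinstance(value, dict):
        parts = []
        for key, item in value.items():
            # Keys go through the same escaping as string values.
            parts.append(quote_sql_string(str(key)))
            parts.append(to_sql(item))
        return "named_struct(" + ", ".join(parts) + ")"
    if isinstance(value, (int, float)):
        return str(value)
    raise TypeError(f"unsupported value type: {type(value).__name__}")


if __name__ == "__main__":
    print(to_sql({"it's": 1}))
```

Routing keys through the same function as string values is what keeps the two paths consistent, so a later change to the escaping rule applies to both.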
Ghxst ☠️	394bb34fa2	fix(rust): report local write progress bytes from Lance (#3422 ) Fixes #3360. This updates native table writes so local write progress uses Lance writer byte stats instead of Arrow in-memory batch size once write bytes are available. The change wires the existing `WriteProgressTracker` into `InsertExec` for native `add` writes, installs a Lance `WriteProgressFn` only when no lower-level callback is already configured, and keeps the existing public `InsertExec::new` signature unchanged. Validation: - `cargo test -p lancedb --features remote table::write_progress::tests::test_progress_uses_lance_write_bytes_for_local_tables -- --nocapture` passed: 1 passed, 0 failed. - `cargo test -p lancedb --features remote table::write_progress::tests -- --nocapture` passed: 7 passed, 0 failed. - `cargo check --quiet --features remote --tests --examples` passed. - `cargo fmt --all --check` passed. - `git diff --check` passed. - `git diff \| gitleaks stdin --no-banner --redact --timeout 30` passed: no leaks found. I did not run the full `cargo test --quiet --features remote --tests` suite. Co-authored-by: Ghxst <200635707+GHX5T-SOL@users.noreply.github.com>	2026-06-17 12:05:59 -07:00
Armaan Sandhu	b2ae763254	fix(python): raise clear TypeError for bare List/Tuple in pydantic schema conversion (#3511 ) Closes #3502 ## Problem A bare, unparameterised `typing.List` / `typing.Tuple` field crashes `to_arrow_schema` with an opaque `AttributeError: __args__`: ```python from typing import Tuple from lancedb.pydantic import LanceModel class Doc(LanceModel): items: Tuple Doc.to_arrow_schema() # AttributeError: __args__ ``` In `_py_type_to_arrow_type`, the branch `elif getattr(py_type, "__origin__", None) in (list, tuple)` is taken for a bare generic (its `__origin__` is `list` / `tuple`), but the next line reads `py_type.__args__[0]`, and a bare generic has no `__args__`. Other unsupported types (e.g. `Dict[str, int]`) correctly raise a clear `TypeError`, so this case is inconsistent. ## Fix Guard the element-type lookup with `getattr(py_type, "__args__", None)` and raise a clear `TypeError` when it is missing, matching the existing behavior for other unsupported types. Bare builtin `list` / `tuple` are unaffected (their `__origin__` is `None`, so they already fall through to the existing `TypeError`). ## Testing - Added `test_bare_generic_raises_type_error` covering both `List` and `Tuple`. - ruff format and ruff check clean.	2026-06-17 11:58:48 -07:00
Drew Gallardo	1bead6960c	fix: pin mock clock in eventual consistency test (#3547 ) This PR fixes a flaky test I hit on the Windows test run in #3528. `test_eventual_consistency_background_refresh` was failing with `v_cached` expected 1, got 2. An earlier PR swapped `tokio::time::sleep(300ms)` for `clock::advance_by(300ms)`, which is fine in itself, but the test never pinned the clock, so the first `get()` sets `cached_at` from wall time. If CI runs slowly enough, the TTL expires before the value assertion in the test. The fix adds a `pin()` call before the first `get()`; after that the clock can be advanced manually without problems. I also tried pinning in `BackgroundCache::new()` first. That broke another test, `test_reload_resets_consistency_timer`, which uses real `tokio::time::sleep` and needs the wall clock after `clear_mock()`. So the pin stays in this test only, and this should unblock us. Failing instances: - https://github.com/lancedb/lancedb/actions/runs/27567527236/job/81495265474?pr=3528 - https://github.com/lancedb/lancedb/actions/runs/27560366489/job/81470414928	2026-06-17 11:56:40 -07:00
Brendan Clement	0abf641733	feat: send read-freshness signal on the lance-namespace path (#3551 ) ### Description `db://`-style connections that use the lance-namespace path (`LanceNamespaceDatabase` → `NativeTable` + the lance-namespace REST client) never sent a read-freshness signal. Against a server configured to serve cached table metadata up to some staleness window, this allows stale-read-after-write across handles and processes. The remote table path already solved this (#3439). This brings the namespace path to parity. The namespace REST client doesn't let callers attach headers directly, but it forwards a `DynamicContextProvider`'s `headers.*` context entries as HTTP headers per request. So: - A shared per-table baseline map is created before the namespace client is built, and is installed on the `ConnectBuilder` via a context provider. - On read operations the provider emits `x-lancedb-min-timestamp = max(baseline, now − read_consistency_interval)` (RFC 3339), keyed by the operation's `object_id`. - Each table handle bumps its baseline (monotonically) on `checkout_latest()`, `restore()`, and every data/schema write. `checkout_latest()` is the primary hook: consumers refresh a handle there after writing elsewhere, then poll. Read operations that carry the floor: `describe_table`, `list_table_versions`, `query_table`, `list_tables`. `list_table_versions` is what resolves "latest" for managed-versioning tables (`get_latest_version`), so it's the op that makes `checkout_latest()` actually observe a prior write. `describe_table_version` is excluded (pinned to an immutable version). This mirrors #3439 (timestamp baseline, `max(baseline, now − interval)`, monotonic); no `min_version` and no body channel, since the namespace path has no version-returning write responses. 
### Testing - Unit tests for `compute_min_timestamp` / `next_freshness_baseline` and the provider (header at/after a bumped baseline; nothing for an empty baseline + no interval; interval floor applies; non-read ops emit nothing; `list_tables` uses only the interval floor). - Verified end-to-end against a local server that honors the header: reads carry `x-lancedb-min-timestamp`, writes don't, and read-your-write holds.	2026-06-17 13:30:53 -04:00
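The floor computation in #3551, `max(baseline, now − read_consistency_interval)`, can be sketched as below. This is a minimal Python illustration rather than the Rust implementation: the function names echo `compute_min_timestamp` and `next_freshness_baseline` from the commit's test list, but the signatures and bodies are assumptions, and `freshness_header` is a hypothetical helper.

```python
from datetime import datetime, timedelta, timezone
from typing import Optional


def compute_min_timestamp(
    baseline: Optional[datetime],
    now: datetime,
    read_consistency_interval: Optional[timedelta],
) -> Optional[datetime]:
    """Return max(baseline, now - interval), or None when neither is set."""
    candidates = []
    if baseline is not None:
        candidates.append(baseline)
    if read_consistency_interval is not None:
        candidates.append(now - read_consistency_interval)
    return max(candidates) if candidates else None


def next_freshness_baseline(current: Optional[datetime], proposed: datetime) -> datetime:
    """Bump the baseline monotonically: it never moves backwards."""
    if current is None or proposed > current:
        return proposed
    return current


def freshness_header(min_ts: Optional[datetime]) -> dict:
    """Format the floor as an RFC 3339 header value; emit nothing without a floor."""
    if min_ts is None:
        return {}
    return {"x-lancedb-min-timestamp": min_ts.isoformat()}
```

An empty baseline with no interval yields no header at all, which matches the "nothing for an empty baseline + no interval" case listed under Testing.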
Yang Cen	976edeb2ff	feat(query): add approx mode to vector queries (#3549 ) ## Feature ### What is the new feature? Adds Rust core API support for configuring vector query approximation mode with `ApproxMode::{Fast, Normal, Accurate}`. ### Why do we need this feature? Lance already exposes `lance_index::vector::ApproxMode` and scanner support for controlling the speed/accuracy tradeoff for approximate vector search. LanceDB Rust queries need to expose and pass this setting through for local/native and remote vector searches. ### How does it work? - Adds public `ApproxMode` in `rust/lancedb`, with lowercase serde, `Default::Normal`, parse/display, and conversions to/from Lance's `ApproxMode`. - Adds `approx_mode: Option<ApproxMode>` to `VectorQueryRequest` and a `VectorQuery::approx_mode(...)` builder. - Applies the mode to native/local Lance scanners after `nearest(...)` when explicitly set. - Sends `approx_mode` in remote query JSON only when explicitly set; default requests omit it. ## Validation - `cargo fmt --all` - `cargo test --quiet --features remote approx_mode` - `cargo test --quiet --features remote test_query_vector_default_values` - `cargo check --quiet --features remote --tests --examples` - `git diff --check`	2026-06-17 19:28:42 +08:00
Yang Cen	b46a44f873	feat(query): add approx mode to vector queries (#3549 ) ## Feature ### What is the new feature? Adds Rust core API support for configuring vector query approximation mode with `ApproxMode::{Fast, Normal, Accurate}`. ### Why do we need this feature? Lance already exposes `lance_index::vector::ApproxMode` and scanner support for controlling the speed/accuracy tradeoff for approximate vector search. LanceDB Rust queries need to expose and pass this setting through for local/native and remote vector searches. ### How does it work? - Adds public `ApproxMode` in `rust/lancedb`, with lowercase serde, `Default::Normal`, parse/display, and conversions to/from Lance's `ApproxMode`. - Adds `approx_mode: Option<ApproxMode>` to `VectorQueryRequest` and a `VectorQuery::approx_mode(...)` builder. - Applies the mode to native/local Lance scanners after `nearest(...)` when explicitly set. - Sends `approx_mode` in remote query JSON only when explicitly set; default requests omit it. ## Validation - `cargo fmt --all` - `cargo test --quiet --features remote approx_mode` - `cargo test --quiet --features remote test_query_vector_default_values` - `cargo check --quiet --features remote --tests --examples` - `git diff --check`	2026-06-17 19:28:36 +08:00
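The "send only when explicitly set" behaviour from #3549 can be sketched as follows. This is a Python illustration, not the Rust code: the enum mirrors `ApproxMode::{Fast, Normal, Accurate}` with lowercase serialized values as the commit describes, while `vector_query_body` and the `vector` and `k` field names are assumptions made for the example.

```python
import json
from enum import Enum
from typing import List, Optional


class ApproxMode(Enum):
    """Lowercase wire values, mirroring the lowercase serde in the commit."""

    FAST = "fast"
    NORMAL = "normal"
    ACCURATE = "accurate"


def vector_query_body(
    vector: List[float],
    k: int,
    approx_mode: Optional[ApproxMode] = None,
) -> str:
    """Build a hypothetical remote query body; omit approx_mode unless set."""
    body = {"vector": vector, "k": k}
    if approx_mode is not None:
        body["approx_mode"] = approx_mode.value
    return json.dumps(body)
```

Leaving the field out by default keeps requests from unmodified callers unchanged, which is presumably why the commit omits it rather than always sending `normal`.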
Brendan Clement	f76b075d13	feat: add table branch support to remote tables and Python/TS bindings (#3540 ) ### Description Adding branch support for RemoteTable by threading a branch selector onto every operation the data plane accepts it on. Exposes the currentBranch to nodejs and python through the bindings. Matching the server handlers, the branch rides as: - a `?branch=` query parameter for Arrow-body and query-only ops (insert, merge_insert, multipart_*, version/list, drop_index) - a `branch` field in the JSON body for everything else (count_rows, query, update, delete, create_index, column ops, index list/stats, stats, restore, describe, tags create/update) A main-branch handle (`branch == None`) produces byte-identical requests to before: no `branch` field and no `?branch=` - Handle-per-branch: `create_branch` / `checkout_branch` return a new handle with fresh caches and reset version/freshness state, mirroring `NativeTable`. - `create_branch` maps 409 to already-exists, 400 to invalid, and 404 to not-found with source context, and sends without retry so the 409 stays observable. - `Ref` translation covers version, version-number (relative to the handle's branch), and tag (resolved via the tags endpoint); `"main"` and empty normalize to the main branch. - Python branch handles persist their branch (and pinned version) across pickle/fork, so a forked or pickled handle reopens on its branch rather than silently reverting to main. ### Tests - Rust mock tests per op category (query-param and body mechanisms, branch CRUD, error paths, backward-compat). - Python sync branch CRUD, `open_table(branch=)`, and a pickle round-trip regression test.	2026-06-15 18:07:40 -04:00
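The two transport mechanisms in #3540 (query parameter versus JSON body field) can be sketched as a small routing function. This is a hedged Python illustration: `attach_branch` is a hypothetical helper, the operation names are taken from the lists in the commit message, and treating `"main"` and the empty string as the main branch at this layer is an assumption based on the `Ref` normalization note.

```python
from typing import Dict, Optional, Tuple

# Ops the commit lists as carrying the branch in `?branch=` (multipart_* handled by prefix).
QUERY_PARAM_OPS = {"insert", "merge_insert", "version/list", "drop_index"}


def attach_branch(
    op: str,
    branch: Optional[str],
    params: Optional[Dict[str, str]] = None,
    body: Optional[Dict[str, object]] = None,
) -> Tuple[Dict[str, str], Dict[str, object]]:
    """Return (query params, JSON body) with the branch selector attached."""
    params = dict(params or {})
    body = dict(body or {})
    # Main-branch handles must produce requests identical to the pre-branch ones.
    if branch is None or branch in ("", "main"):
        return params, body
    if op in QUERY_PARAM_OPS or op.startswith("multipart_"):
        params["branch"] = branch
    else:
        body["branch"] = branch
    return params, body
```

For example, `attach_branch("insert", "dev")` puts the branch in the query parameters, while `attach_branch("count_rows", "dev")` puts it in the body.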
LanceDB Robot	393ec981bf	chore: update lance dependency to v8.0.0-beta.14 (#3546 ) Updates LanceDB's Lance dependencies to v8.0.0-beta.14. This refreshes the Rust workspace lockfile and Java lance-core version; no compatibility code changes were required. Triggering Lance tag: https://github.com/lance-format/lance/releases/tag/v8.0.0-beta.14	2026-06-15 16:56:16 -04:00
Will Jones	6219975222	perf: drop N+1 in RemoteTable::list_indices (#3535 ) `RemoteTable::list_indices` currently makes one `/index/list/` call plus one `/index/{name}/stats/` call per index just to recover `index_type`. When the server returns `index_type` directly in the `/index/list/` response, all enriched fields are used and the per-index stats fan-out is skipped entirely. When `index_type` is absent (legacy servers), the existing stats fallback runs as before. This is content-based: no version header required. ## Changes - `RemoteTable::parse_index_list_response` replaces the old split between enriched and legacy parsers. A single struct deserializes both old and new response shapes, with all fields except `index_name` and `columns` optional. `index_type` acts as the sentinel: present → use enriched fields directly; absent → call `/index/{name}/stats/`. ## Tests Added `test_list_indices_enriched` covering: - All enriched fields populated correctly when `index_type` is in the list response - Optional fields absent from the response deserialize as `None` - Stats endpoint is not called (panics if hit), verifying the fan-out is eliminated Existing `test_list_indices` and `test_list_indices_nested_field_paths` exercise the legacy path unchanged. ## Depends on - #3497 (expand `IndexConfig`) — already merged - Server-side enriched response support Closes #3494 Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-06-15 09:21:17 -07:00
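The content-based sentinel from #3535 can be sketched like this. It is a Python illustration under stated assumptions, not `RemoteTable::parse_index_list_response` itself: `parse_index_list` and the `fetch_stats` callback are hypothetical, and the dictionary keys follow the field names mentioned in the commit (`index_name`, `columns`, `index_type`).

```python
from typing import Callable, Dict, List


def parse_index_list(
    entries: List[Dict[str, object]],
    fetch_stats: Callable[[str], Dict[str, object]],
) -> List[Dict[str, object]]:
    """Parse both response shapes; call the stats endpoint only for legacy entries."""
    result = []
    for entry in entries:
        index_type = entry.get("index_type")
        if index_type is None:
            # Legacy server: recover index_type from the per-index stats call.
            index_type = fetch_stats(str(entry["index_name"]))["index_type"]
        result.append(
            {
                "name": entry["index_name"],
                "columns": entry["columns"],
                "index_type": index_type,
                # Optional enriched fields stay None when the server omits them.
                "num_indexed_rows": entry.get("num_indexed_rows"),
            }
        )
    return result
```

Against an enriched response, `fetch_stats` is never invoked, so listing N indices costs one request instead of N + 1.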
Dan Tasse	d9f9a51668	feat: skills to connect and update column metadata (#3541 ) Two skills to help people connect and manage their column metadata using a server that implements the [REST API](https://lance.org/format/catalog/rest/). lancedb-column-metadata was built using the [Claude skill creator](https://claude.com/plugins/skill-creator); without the skill it was usually calling at least one method that didn't exist and usually not setting "replace": "false". So, while the base case is already pretty good, adding this skill improves things somewhat. lancedb-connect should help with most agentic workflows, because "finding all the things you need to connect to your server" can be the hardest part.	2026-06-15 11:42:01 -04:00
Brendan Clement	c187ff7712	chore: ignore pyo3 advisories RUSTSEC-2026-0176/0177 in cargo-deny (#3542 )	2026-06-15 21:37:03 +08:00
LanceDB Robot	dfbe5becaa	chore: update lance dependency to v8.0.0-beta.12 (#3538 ) Updates Rust workspace Lance crates and Java lance-core to v8.0.0-beta.12. No compatibility fixes were required; validation passed with cargo clippy and cargo fmt. Lance tag: https://github.com/lance-format/lance/releases/tag/v8.0.0-beta.12	2026-06-11 15:03:33 -07:00
Xuanyi Li	49815da933	refactor: extract create_index module from table.rs (#3521 ) ## Summary - Extracts the `create_index` code cluster from `table.rs` into a new `rust/lancedb/src/table/create_index.rs` submodule, continuing the work from #2949. - Moves 8 `NativeTable` inherent methods (`load_indices`, `validate_index_type`, `build_ivf_params`, `get_num_sub_vectors`, `get_vector_dimension`, `resolve_index_field`, `make_index_params`, `get_index_type_for_field`) and 11 associated tests into the new module. - Reduces `table.rs` from ~5009 to ~3804 lines (-1205 lines) with no behavioral changes. ## Test plan UT	2026-06-11 14:06:44 -07:00
Will Jones	f8caef3aca	feat(bindings): expose new IndexConfig fields in Python and Node.js (#3534 ) ## Summary Surfaces the rich per-index metadata added in #3497 to the Python and Node.js language bindings. Closes #3495. New optional fields exposed on `IndexConfig` in both bindings: - `index_uuid` / `indexUuid` — UUID of the first index segment - `type_url` / `typeUrl` — protobuf type URL for the index - `created_at` / `createdAt` — creation timestamp (milliseconds since Unix epoch) - `num_indexed_rows` / `numIndexedRows` — rows covered by the index - `num_unindexed_rows` / `numUnindexedRows` — rows not yet indexed - `size_bytes` / `sizeBytes` — total index file size in bytes - `num_segments` / `numSegments` — number of index segments - `index_version` / `indexVersion` — on-disk format version - `index_details` / `indexDetails` — type-specific JSON details string All fields are `None`/`undefined` for remote tables (which don't yet surface this metadata through the server response). ## Changes - `python/src/index.rs`: extend `IndexConfig` pyclass; update `From` impl; update `__getitem__` - `python/python/lancedb/_lancedb.pyi`: add type hints for new fields - `python/python/tests/test_table.py`: new `test_index_config_fields` test - `nodejs/src/table.rs`: extend `IndexConfig` napi struct; update `From` impl - `nodejs/__test__/table.test.ts`: new test; update existing `toEqual` assertions to `expect.objectContaining` to accommodate new fields ## Test plan - [x] Python: `uv run --extra tests pytest python/tests/test_table.py::test_index_config_fields` - [x] Node.js: `pnpm test __test__/table.test.ts` 🤖 Generated with [Claude Code](https://claude.com/claude-code) --------- Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-06-11 13:37:39 -07:00
nuthalapativarun	40f3e22600	feat: support rename_table on LanceNamespaceDatabase (#3520 ) ## Summary Closes #3412 Implements `rename_table` for `LanceNamespaceDatabase` (sync and async Python) and the Rust `NamespaceDatabase` backend. Previously these raised `NotImplementedError`; this PR delegates to the `LanceNamespace.rename_table` method which is part of the lance-namespace spec. ### Changes - `rust/lancedb/src/database/namespace.rs`: Remove the `NotImplementedError` stub for `rename_table`. Build a `RenameTableRequest` (with `id`, `new_table_name`, and optionally `new_namespace_id`) and call `self.namespace.rename_table(...)`, mirroring the existing `drop_table` pattern. - `python/python/lancedb/namespace.py`: Import `RenameTableRequest` from `lance_namespace`. Replace the `raise NotImplementedError` in both `LanceNamespaceDatabase.rename_table` (sync) and `AsyncLanceNamespaceDatabase.rename_table` (async) with a call to `self._namespace_client.rename_table(request)`. - `python/python/tests/test_namespace.py`: Replace the `test_rename_table_not_supported` test (which checked for `NotImplementedError`) with `test_rename_table`, which: 1. Creates a table in a namespace 2. Calls `rename_table` with `cur_namespace_path` and `new_namespace_path` 3. Asserts the old name is gone from `table_names()` 4. Asserts the new name appears in `table_names()` 5. Verifies the renamed table can be opened ## Test plan - [ ] Existing namespace tests pass in CI (all rely on `lance.namespace.DirectoryNamespace` which requires the full lance package) - [ ] `test_rename_table` exercises the full rename path: create → rename → verify old gone → verify new present → open - [ ] Rust build passes with the updated `namespace.rs` (requires Rust toolchain in CI)	2026-06-11 11:41:07 -07:00
nuthalapativarun	04480c274a	test(python): add nested field regression matrix tests (#3518 ) ## Summary Closes #3406 Add a regression matrix in `python/python/tests/test_nested_fields.py` that exercises the full nested field index lifecycle for both the sync and async Python table APIs. The tests will fail if any implementation regresses to leaf-only field names in `list_indices`, `index_stats`, search, or filter results. ## Test scenarios covered Index types: BTree scalar, IvfPq vector, FTS Field-name edge cases (per acceptance criteria): - `rowId` — camelCase top-level field - `` `row-id` `` — hyphenated top-level field (escaped) - `parent.`\``leaf.name`\`` ` — struct leaf whose name contains a literal dot - `MetaData.userId` — mixed-case nested path - `` `meta-data`.`user-id` `` — hyphenated struct with hyphenated leaf Lifecycle operations per index type: - `create_index` / `create_scalar_index` / `create_fts_index` - `list_indices` → verify canonical full dotted path (not leaf name) - `index_stats` → verify row count and index type - Filtered scan (`WHERE nested.field = value`) - Vector search via nested embedding column - FTS search via nested text column - `add` (append) then re-check index listing - `optimize` then re-check index listing Both sync and async APIs are covered in parallel test classes. ## Notes Lance forbids top-level field names that contain a literal `.`, so the `` `a.b` `` acceptance-criterion variant is exercised as a struct leaf field (`parent.`\``leaf.name`\``) rather than a top-level column.	2026-06-11 08:06:04 -07:00
Trenton H	ae7f2cbfe8	feat(python): accept Expr in Table.delete and merge when_not_matched_by_source_delete (#3524 ) Another little pain point as I was working to integrate with paperless-ngx. The read path of table.search() or table.query() already accepted an Expr, but write paths Table.delete and merge_insert(...).when_not_matched_by_source_delete did not. This PR attempts to close that gap, so writes and reads can both use Expr, instead of one side needing to build a string.	2026-06-11 07:59:49 -07:00
LanceDB Robot	4fb7c92e86	chore: update lance dependency to v8.0.0-beta.11 (#3533 ) Updates Lance dependencies to v8.0.0-beta.11 and refreshes the Rust and Java lock/config files. This also adapts namespace external manifest store call sites to the new table-root-aware constructor required by Lance. Triggering tag: https://github.com/lancedb/lance/releases/tag/v8.0.0-beta.11	2026-06-10 17:53:58 -07:00
Will Jones	f03abc27e3	feat: expand IndexConfig with rich per-index metadata (#3497 ) `IndexConfig` (returned by `Table::list_indices`) previously exposed only `name`, `index_type`, and `columns`. Lance's `describe_indices` provides richer per-index info cheaply (reads manifest metadata, often cached), so this surfaces it. Adds these `Option<T>` fields to `lancedb::index::IndexConfig`, populated in `NativeTable::list_indices` from the `IndexDescription`: - `index_uuid`: uuid of the first segment - `type_url`: protobuf type URL (`IndexDescription::type_url`) - `created_at`: minimum creation time across segments - `num_indexed_rows`: approximate rows indexed across segments - `num_unindexed_rows`: table row count minus `num_indexed_rows` - `size_bytes`: total size of index files across segments - `num_segments`: number of segments making up the index - `index_version`: on-disk index format version (first segment) - `index_details`: index-type-specific details as JSON This field set mirrors the lance-namespace `IndexContent` contract (lance-format/lance-namespace#348) so client and server agree on the same shape. Note these are populated locally via `describe_indices` — `NativeTable::list_indices` reads the dataset directly and does not depend on the namespace spec change. `RemoteTable` leaves the new fields `None` until a follow-up wires them through the server response (#3494). Bindings exposure will also be a follow up: #3495 Existing `list_indices` tests in `rust/lancedb/src/table.rs` are extended to assert the new fields. Fixes #3492 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-authored-by: Claude Opus 4.8 <noreply@anthropic.com>	2026-06-10 16:14:33 -07:00
Trenton H	85d9c1ce63	feat: adds isin support to the 'Expr' builder (#3523 ) The `Expr` builder already includes a lot of useful filtering options, `eq, ne, gt/gte, lt/lte, and_, or_, contains, cast`, but it was missing a membership test like `isin`. This PR adds that support, as minimally as possible, allowing easy filtering for membership in a list, without needing a series of `where` expressions. I didn't see anything in CONTRIBUTING.md about needing a feature request or issue first, so I just made the change. My apologies if I missed that somewhere. Thanks for the vector store, we're using it now in paperless-ngx.	2026-06-10 15:28:19 -07:00

2620 Commits