Summary:
- Route built-in sync namespace connections through the Rust namespace
connector.
- Keep custom namespace clients on the existing Python fallback.
- Preserve namespace-backed to_lance compatibility with lazy Python
client construction and add regressions.
## Summary
Adds per-session monotonic reads for remote (LanceDB Cloud/Enterprise)
tables, preventing successive reads on a handle from moving *backward*
in dataset version when a load balancer routes them to query nodes with
differently-cached views.
Each `RemoteTable` handle tracks the highest dataset version it has
observed in a read response — surfaced by the server via a new
`x-lancedb-version` response header — and sends it back as
`x-lancedb-min-read-version` on subsequent reads (`count_rows`,
`query`). A query node whose cache is behind that version refreshes
before serving; a node already at/beyond it serves from cache at no
extra cost.
The watermark is sourced only from reads (always committed dataset
versions), so unlike the retired `x-lancedb-min-version` it is
unaffected by WAL writes returning WAL entry ids. It is reset on
`checkout_latest()`. Both headers are optional and ignored by older
peers.
Server-side enforcement lives in LanceDB Enterprise. Targets the
`codex/update-lance-9-0-0-beta-8` integration branch to match the
Enterprise submodule pin.
This PR is part cleanup, part feature, part example.
It removes `IntoArrow` and `IntoArrowStream`. There was only one
redundant call site between the two. Once we moved everything to
`Scannable` these traits no longer serve any purpose.
It adds a `Scannable` impl for a polars DataFrame. We used to have this
at one point for `IntoArrow` so this is more like a regression fix than
anything.
It adds an example (and unit test) which ensures we can ingest from a
Polars DataFrame and export to one. LazyFrame support would be a
follow-up (though a pretty straightforward one) but we've never had
proper LazyFrame support before.
Agents seemed to have trouble finding the right calls to work with
branches (create, list, delete) and passing the right params to get it
to work. We probably don't need a big skill to get it on the right track
but a little nudge seems helpful. Doing a couple simple tasks, it saved
about half the time and tokens, so feels worthwhile. Created with the
Claude skills creator, hence the "skill.md in a bare folder"
organization - happy to move it if that's not the standard anymore.
```
Benchmark results (3 evals, with-skill vs baseline):
┌────────────────┬────────────┬────────────────────┐
│ Metric │ With skill │ Without skill │
├────────────────┼────────────┼────────────────────┤
│ Pass rate │ 3/3 (100%) │ 3/3 (100%) │
├────────────────┼────────────┼────────────────────┤
│ Avg time │ 51s │ 142s (2.8× slower) │
├────────────────┼────────────┼────────────────────┤
│ Avg tokens │ 19,305 │ 36,513 (47% more) │
├────────────────┼────────────┼────────────────────┤
│ Avg tool calls │ 5.7 │ 26 (4.5× more) │
└────────────────┴────────────┴────────────────────┘
```
Expose the merged Rust OAuth header provider through the Node/TypeScript
connection path.
Includes:
- Native OAuthConfig conversion for napi-rs
- ConnectionOptions.oauthConfig plumbing
- Public TypeScript OAuthConfig and OAuthFlowType exports
- Generated TypeScript API docs for the new config surface
- input-validation and debug-redaction coverage in the Rust binding
layer
Local validation: cargo fmt --all; git diff --check.
Fixes#3589
## Problem
Multiple `warnings.warn()` calls across the Python client are missing
the `stacklevel=2` parameter. This causes warning messages to point to
lancedb internal code instead of the user's code that triggered the
warning, making debugging difficult.
## Solution
Add `stacklevel=2` to 7 `warnings.warn()` calls across 4 files:
| File | Warnings Fixed |
|------|---------------|
| `remote/db.py` | `request_thread_pool`, `connection_timeout`,
`read_timeout` deprecation warnings |
| `remote/table.py` | `cleanup_old_versions`, `compact_files`,
`optimize` no-op warnings |
| `table.py` | `data_storage_version`, `enable_v2_manifest_paths`,
`retrain` deprecation warnings |
| `embeddings/colpali.py` | `use_token_pooling` deprecation warning |
## Verification
- All 4 modified files pass `ast.parse()` syntax check
- Only `stacklevel=2` added — no other changes
## Changelog
| Date | Change | Author |
|------|--------|--------|
| 2026-06-27 | Add missing stacklevel=2 to warnings.warn() calls |
rtmalikian |
### Files Changed
- `python/python/lancedb/remote/db.py` — Add stacklevel=2 to 3
deprecation warnings
- `python/python/lancedb/remote/table.py` — Add stacklevel=2 to 3 no-op
warnings
- `python/python/lancedb/table.py` — Add stacklevel=2 to 3 deprecation
warnings
- `python/python/lancedb/embeddings/colpali.py` — Add stacklevel=2 to 1
deprecation warning
### Verification
- Syntax check passed on all modified files
---
**About the Author:** Raphael Malikian — Clinical AI Solutions
Architect. I specialise in building and fixing AI/ML systems for
healthcare, including vector databases, RAG pipelines, and clinical NLP.
If you need help with your project or think I can add value to your
organisation, feel free to reach out — I'd love to connect.
📧rtmalikian@gmail.com🔗 GitHub: https://github.com/rtmalikian🔗 LinkedIn:
http://www.linkedin.com/in/raphael-t-malikian-mbbs-bsc-hons-71075436a
---
**Disclosure:** This code was developed with assistance from
DeepSeek-V4-Pro (DeepSeek) via Hermes Agent (Nous Research). All changes
were reviewed, tested against the actual codebase, and verified for
correctness.
Signed-off-by: rtmalikian <rtmalikian@gmail.com>
Fixes#2934
## Problem
Passing a `RemoteTable` to `permutation_builder()` raises a cryptic
`AttributeError`:
```
AttributeError: 'RemoteTable' object has no attribute '_inner'
```
This leaves users confused about what went wrong and why.
## Root Cause
`PermutationBuilder.__init__()` calls `async_permutation_builder(table)`
which accesses `table._inner` — the underlying Rust Lance table object.
`RemoteTable` connects to LanceDB Cloud/Enterprise and does not have a
local `_inner` attribute, making permutations fundamentally unsupported
on remote tables.
## Solution
Added an early check in `PermutationBuilder.__init__()` that verifies
the table has `_inner` before calling the Rust function, raising a clear
`TypeError` with an explanation of why permutations don't work on remote
tables.
## Verification
- Syntax validated with `ast.parse()`
- Structural verification: single call site (`permutation_builder()`),
guard placed before Rust FFI call
- Error message tested with mock: `MockRemoteTable()` correctly triggers
`TypeError`
## Changelog
| Date | Change | Author |
|------|--------|--------|
| 2026-06-28 | Added remote table guard in PermutationBuilder.__init__ |
rtmalikian |
### Files Changed
- python/python/lancedb/permutation.py — Added `hasattr(table,
"_inner")` check with clear error
---
**About the Author:** Raphael Malikian — Clinical AI Solutions
Architect. I specialise in building and fixing AI/ML systems for
healthcare, including vector databases, RAG pipelines, and clinical NLP.
If you need help with your project or think I can add value to your
organisation, feel free to reach out — I'd love to connect.
📧rtmalikian@gmail.com🔗 GitHub: https://github.com/rtmalikian🔗 LinkedIn:
http://www.linkedin.com/in/raphael-t-malikian-mbbs-bsc-hons-71075436a
---
**Disclosure:** This code was developed with assistance from
deepseek-v4-pro (DeepSeek) via Hermes Agent (Nous Research). All changes
were reviewed, tested against the actual codebase, and verified for
correctness.
Signed-off-by: rtmalikian <rtmalikian@gmail.com>
Updates Lance Rust workspace dependencies and Java lance-core to
v9.0.0-beta.10.
No compatibility code changes were required; clippy and rustfmt passed
after installing the missing runner components.
Lance tag:
https://github.com/lance-format/lance/releases/tag/v9.0.0-beta.10
Expose the merged Rust OAuth header provider through the Python async
connection path.
Includes:
- Python OAuthConfig and OAuthFlowType public config objects
- PyO3 conversion into the Rust OAuthConfig
- connect_async(oauth_config=...) plumbing
- repr redaction coverage for client_secret
Local validation: cargo fmt --all; ruff format/check on touched Python
files.
## Summary
Add the Rust OAuth header provider for remote LanceDB connections.
This supports client credentials and Azure managed identity flows,
handles token caching and refresh, redacts secrets in Debug output, and
wires `ConnectBuilder::oauth_config()` into the remote client while
rejecting ambiguous API-key/header-provider combinations.
By default the read freshness provider was not included in the namespace
client, preventing the read freshness headers from being included in the
request. This prevents checkout_latest() from working as expected when
using the namespace client.
This fix ensures the provided is built into the client when the
namespace impl and properties are provided.
## Summary
Skip inserting the x-api-key header when the configured API key is
empty.
This lets bearer-token or other dynamic-header authentication avoid
sending an empty static API key header alongside the real auth header.
Fixes#3563
## Summary
- Add `stacklevel=2` to 10 `warnings.warn()` calls across 4 files
- Fix broken message concatenation in `table.py` where the second string
was incorrectly passed as the `category` parameter
## Problem
Multiple `warnings.warn()` calls in the `python/lancedb/` codebase were
missing the `stacklevel` parameter. Without `stacklevel=2`, warnings
point to library internals instead of the caller's code, making it
impossible for users to identify which of their function calls triggered
the warning.
Additionally, two calls in `table.py` (lines 3411 and 3420) had a more
serious bug: the deprecation message was split across two separate
string arguments, causing the second string to be passed as the
`category` parameter instead of being concatenated with the first
string. This would cause `TypeError` when the warning was triggered.
## Changes
| File | Fixes | Description |
|------|-------|-------------|
| `embeddings/colpali.py` | 1 | Add `stacklevel=2` to
`use_token_pooling` deprecation warning |
| `remote/db.py` | 3 | Add `stacklevel=2` to `request_thread_pool`,
`connection_timeout`, `read_timeout` deprecation warnings |
| `remote/table.py` | 3 | Add `stacklevel=2` to `cleanup_old_versions`,
`compact_files`, `optimize` no-op warnings |
| `table.py` | 3 | Fix broken message concatenation for
`data_storage_version` and `enable_v2_manifest_paths` deprecation
warnings + add `stacklevel=2` to `retrain` deprecation warning |
## Verification
```python
# All warnings.warn() calls now have stacklevel
python3 -c "import ast, os; ..."
# Result: All warnings.warn() calls now have stacklevel!
```
## Changelog
| Date | Change | Author |
|------|--------|--------|
| 2026-06-20 | Fix missing stacklevel=2 in 10 warnings.warn() calls +
fix broken message concatenation | rtmalikian |
### Files Changed
- `python/python/lancedb/embeddings/colpali.py` — Add stacklevel=2
- `python/python/lancedb/remote/db.py` — Add stacklevel=2 to 3
deprecation warnings
- `python/python/lancedb/remote/table.py` — Add stacklevel=2 to 3 no-op
warnings
- `python/python/lancedb/table.py` — Fix broken message concatenation +
add stacklevel=2
### Verification
- AST-based audit confirms all `warnings.warn()` calls now include
`stacklevel=2`
- Syntax check passes for all 4 modified files
---
**About the Author:** Raphael Malikian — Clinical AI Solutions
Architect. I specialise in building and fixing AI/ML systems for
healthcare, including vector databases, RAG pipelines, and clinical NLP.
If you need help with your project or think I can add value to your
organisation, feel free to reach out — I'd love to connect.
📧rtmalikian@gmail.com🔗 GitHub: https://github.com/rtmalikian🔗 LinkedIn:
http://www.linkedin.com/in/raphael-t-malikian-mbbs-bsc-hons-71075436a
---
**Disclosure:** This code was developed with assistance from **Hermes
Agent** (Nous Research). All changes were reviewed, tested against the
actual codebase, and verified for correctness.
Signed-off-by: rtmalikian <rtmalikian@gmail.com>
Updates LanceDB's Lance dependencies to v9.0.0-beta.2 across the Rust
workspace and Java lance-core dependency.\n\nNo compatibility fixes were
required; clippy and formatting pass after installing the missing
toolchain components on the runner. Triggering Lance tag:
https://github.com/lance-format/lance/releases/tag/v9.0.0-beta.2
This PR is for the Read path against blob v2. #3528 handles declare +
write, and this this adds materialization on local tables.
- blob_columns()
- fetch_blobs(column, row_ids) → bytes
- fetch_blob_files(column, row_ids) → lazy handles
- Pass _rowid from query().with_row_id(). Remote returns NotSupported.
(for now)
### Use cases
search, grab row ids, materialize images:
```rust
let row_ids = /* _rowid from hits */;
let images = table.fetch_blobs("image", &row_ids).await?;
```
Large blobs: open handles, read only what you need:
```rust
let handles = table.fetch_blob_files("image", &row_ids).await?;
let bytes = handles[0].as_ref().unwrap().read().await?;
```
Filter then batch fetch: collect ids from a filter, one call.
Multiple blob columns: image and thumbnail independently.
Row ids from before compact: still resolve.
### Alignment note
Lance `read_blobs` drops null rows. We descriptor-take first, read
non-null ids, re-expand to match input order. Null and zero-length blobs
come back null/None. Bytes path sets `preserve_order(true)`. So I added:
```
TODO(lance): expose selection_index or an aligned execute so we can drop the pre-read.
```
### Tests
`cargo test -p lancedb --test blob_integration`
- 30 tests covering nulls, reorder, dups, cross-fragment bytes + files,
compact, delete, legacy v1 errors.
---------
Co-authored-by: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
The server now serializes an index's `created_at` as an RFC 3339 string
(e.g. `"2026-06-18T21:37:36.637Z"`), but the client deserializer only
accepted a unix timestamp in milliseconds. This caused `list_indices` to
fail with:
```
Failed to parse list_indices response: invalid type: string "2026-06-18T21:37:36.637Z", expected a unix timestamp in milliseconds
```
This PR replaces the fixed millisecond deserializer with a custom one
that accepts both an RFC 3339 string (current server) and a
unix-millisecond integer (legacy deployments), so the client works
against any server version.
It also improves the `IndexConfig` repr in the Python bindings.
Previously it printed only three fields (`Index(FTS, columns=["text"],
name="text_idx")`), hiding the metadata that `list_indices` returns. It
now renders every populated field, omitting any that are `None`. Each
value is valid Python — integer counts use `_` thousands separators and
`created_at` uses the `datetime` repr — so values round-trip. The real
repr is a single line; it's wrapped here for readability:
```python
>>> table.list_indices()
[IndexConfig(
name="text_idx",
index_type="FTS",
columns=["text"],
index_uuid="aefd3e00-2f95-4bdc-92ac-06de84442bf1",
type_url="/lance.table.InvertedIndexDetails",
created_at=datetime.datetime(2026, 6, 18, 21, 37, 36, 637000, tzinfo=datetime.timezone.utc),
num_indexed_rows=2,
size_bytes=3_669,
num_segments=1,
index_version=1,
index_details={
'lance_tokenizer': None,
'base_tokenizer': 'simple',
'language': 'English',
'with_position': False,
'max_token_length': 40,
'lower_case': True,
'stem': True,
'remove_stop_words': True,
'custom_stop_words': None,
'ascii_folding': True,
'min_ngram_length': 3,
'max_ngram_length': 3,
'prefix_only': False,
},
)]
```
Fixes#3556🤖 Generated with [Claude Code](https://claude.com/claude-code)
---------
Co-authored-by: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Updates LanceDB's Lance dependencies from v8.0.0-beta.17 to
v8.0.0-beta.19.
This includes the Rust workspace Lance crates, Cargo.lock refresh, and
Java lance-core version bump. Triggering Lance tag:
https://github.com/lance-format/lance/releases/tag/v8.0.0-beta.19
Updates the Lance Rust workspace dependencies and Java lance-core
dependency to v8.0.0-beta.17.
No LanceDB compatibility code changes were required; validation passed
with cargo clippy and cargo fmt. Triggering Lance tag:
https://github.com/lance-format/lance/releases/tag/v8.0.0-beta.17
The "Create release commit" workflow (`make-release-commit.yml`) has
failed on its last two runs; no release tags have been created since
June 4. Since this workflow creates the tag that the cargo/npm/pypi/java
publish workflows trigger off of, all recent releases are effectively
blocked.
The workflow installs `bump-my-version` unpinned. Version `1.4.0` added
a check that refuses to run `pre_commit_hooks` containing shell syntax
(pipes, `&&`, `if`, variable expansion) unless `allow_shell_hooks =
true` is set. Both bumpversion configs use such hooks:
- `python/.bumpversion.toml` — updates `Cargo.lock` after the bump
(fails first)
- `.bumpversion.toml` — runs `mvn versions:set` for the Java packages
The job dies at the version-bump step with:
> Hook '…' contains shell syntax (pipes, redirects, or variable
expansion). Set `allow_shell_hooks = true` in your configuration to
enable shell execution…
This sets `allow_shell_hooks = true` in both configs to restore the
previous behavior.
Co-authored-by: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Closes#3506
## Problem
The Bedrock embedding compute path
(`rust/lancedb/src/embeddings/bedrock.rs`) panics instead of returning a
typed error in several places:
- `serde_json::to_vec(&request_body).unwrap()`: request serialization.
- `block_in_place(...).unwrap()`: the AWS `invoke_model` send result;
any API error terminates the worker instead of propagating.
- `v.as_f64().unwrap() as f32`: panics on non-numeric values in the
returned embedding array.
- `Handle::current()` + `block_in_place` assume a multi-threaded Tokio
runtime and panic when that assumption does not hold (no runtime, or a
current-thread runtime).
Malformed payloads, non-numeric embedding values, or an incompatible
runtime should surface as typed errors and never panic.
## Fix
- Serialize the request body before the blocking section so a
serialization failure returns `Error::Runtime` via `?`.
- Map the `invoke_model` send error to `Error::Runtime` instead of
`unwrap`.
- Add a `json_array_to_f32` helper that converts the response array to
`Vec<f32>`, returning `Error::Runtime` for a missing/non-array field or
a non-numeric element (used by both the Titan and Cohere paths).
- Add `current_multi_thread_handle()` (`Handle::try_current()` + a
`RuntimeFlavor::CurrentThread` guard) so an absent or incompatible
runtime returns a typed error rather than panicking in `block_in_place`.
Scope note: the sibling `openai.rs` provider uses the same
`block_in_place` + `block_on` bridge, so the bridge pattern itself is
kept; this change only removes the panic paths that are specific to the
Bedrock provider.
## Testing
Added 6 unit tests (no AWS credentials required):
- `json_array_to_f32`: valid numbers, non-array payload, and non-numeric
element.
- `current_multi_thread_handle`: errors with no runtime, errors on a
current-thread runtime, and succeeds on a multi-threaded runtime.
All pass; `cargo fmt` and `cargo clippy` clean. Build/test with
`--features bedrock,lance/protoc`.
## Summary
- clarify the Python error for passing a single dictionary to table
creation/add paths
- add a regression test for `create_table(..., data=dict)` so it points
users to a list of dictionaries
Fixes#409
## Testing
- `python -m pytest python/tests/test_table.py -q`
- `python -m ruff format python/lancedb/table.py
python/lancedb/scannable.py python/tests/test_table.py`
- `python -m ruff check python/lancedb/table.py
python/lancedb/scannable.py python/tests/test_table.py`
### Bug
`value_to_sql({...})` builds a DataFusion `named_struct(...)` literal
but interpolates the struct field names directly as `f"'{k}'"`. A field
name that contains a single quote therefore produces invalid SQL:
```python
>>> from lancedb.util import value_to_sql
>>> value_to_sql({"it's": 1})
"named_struct('it's', 1)" # invalid SQL — the quote terminates the literal
```
String *values* are already escaped (single quotes doubled) by the `str`
branch of `value_to_sql`, so keys and values were handled
inconsistently. This affects `Table.update(values={...})` /
`merge_insert` when a struct column has a field name containing `'`.
### Fix
Render the key through `value_to_sql(str(k))` so field names are escaped
exactly like string values:
```python
>>> value_to_sql({"it's": 1})
"named_struct('it''s', 1)"
```
Keys without special characters are unchanged (`'a'` stays `'a'`), so
existing behavior is preserved.
### Verification
```
$ pytest python/tests/test_util.py -k value_to_sql_dict
```
The new `test_value_to_sql_dict_key_escaping` covers quoted keys (incl.
nested structs) and fails on `main` (`named_struct('it's', 1)`), passes
with this change; the existing `test_value_to_sql_dict` still passes.
Co-authored-by: JSap0914 <JSap0914@users.noreply.github.com>
Fixes#3360.
This updates native table writes so local write progress uses Lance
writer byte stats instead of Arrow in-memory batch size once write bytes
are available. The change wires the existing `WriteProgressTracker` into
`InsertExec` for native `add` writes, installs a Lance `WriteProgressFn`
only when no lower-level callback is already configured, and keeps the
existing public `InsertExec::new` signature unchanged.
Validation:
- `cargo test -p lancedb --features remote
table::write_progress::tests::test_progress_uses_lance_write_bytes_for_local_tables
-- --nocapture` passed: 1 passed, 0 failed.
- `cargo test -p lancedb --features remote table::write_progress::tests
-- --nocapture` passed: 7 passed, 0 failed.
- `cargo check --quiet --features remote --tests --examples` passed.
- `cargo fmt --all --check` passed.
- `git diff --check` passed.
- `git diff | gitleaks stdin --no-banner --redact --timeout 30` passed:
no leaks found.
I did not run the full `cargo test --quiet --features remote --tests`
suite.
Co-authored-by: Ghxst <200635707+GHX5T-SOL@users.noreply.github.com>
Closes#3502
## Problem
A bare, unparameterised `typing.List` / `typing.Tuple` field crashes
`to_arrow_schema` with an opaque `AttributeError: __args__`:
```python
from typing import Tuple
from lancedb.pydantic import LanceModel
class Doc(LanceModel):
items: Tuple
Doc.to_arrow_schema() # AttributeError: __args__
```
In `_py_type_to_arrow_type`, the branch `elif getattr(py_type,
"__origin__", None) in (list, tuple)` is taken for a bare generic (its
`__origin__` is `list / tuple`), but the next line reads
`py_type.__args__[0]`, and a bare generic has no `__args__`. Other
unsupported types (e.g. `Dict[str, int]`) correctly raise a clear
`TypeError`, so this case is inconsistent.
Fix
Guard the element-type lookup with `getattr(py_type, "__args__", None)`
and raise a clear `TypeError` when it is missing, matching the existing
behavior for other unsupported types. Bare builtin list / tuple are
unaffected (their `__origin__` is `None`, so they already fall through
to the existing `TypeError`).
Testing
- Added `test_bare_generic_raises_type_error` covering both `List` and
`Tuple`.
- ruff format and ruff check clean.
This PR fixes a flaky test I hit on Windows test in #3528.
Looks like `test_eventual_consistency_background_refresh` was failing
with `v_cached` expected 1, got 2. There was a pr which swapped
`tokio::time::sleep(300ms)` for `clock::advance_by(300ms)`, which is
pretty much fine but the test necer pinned the clock so the first
`get()` locks the `cached_at` on wall time. Therefore, if our CI is
taking long enough the ttl expires before the value assertion in the
test.
So now we can add a `pin()` and call it first `get()`. After that we can
advance the clock manually with no problems.
Also, it's worth noting that I tried pinning in `BackgroundCache::new()`
first. That broke another test `test_reload_resets_consistency_timer`,
which uses real `tokio::time::sleep` and needs wall clock after
`clear_mock()`. So the pin stays in this test only. And this should
unblock us.
Failing instances:
-
https://github.com/lancedb/lancedb/actions/runs/27567527236/job/81495265474?pr=3528
-
https://github.com/lancedb/lancedb/actions/runs/27560366489/job/81470414928
### Description
`db://`-style connections that use the lance-namespace path
(`LanceNamespaceDatabase` → `NativeTable` + the lance-namespace REST
client) never sent a read-freshness signal. Against a server configured
to serve cached table metadata up to some staleness window, this allows
stale-read-after-write across handles and processes. The remote table
path already solved this (#3439). This brings the namespace path to
parity.
The namespace REST client doesn't let callers attach headers directly,
but it forwards a `DynamicContextProvider`'s `headers.*` context entries
as HTTP headers per request. So:
- A shared per-table baseline map is created before the namespace
client. I built and installed on the `ConnectBuilder` via a context
provider.
- On read operations the provider emits ·x-lancedb-min-timestamp =
max(baseline, now − read_consistency_interval)`
(RFC3339), keyed by the operation's `object_id`.
- Each table handle bumps its baseline (monotonically) on
`checkout_latest()`, `restore()`, and every data/schema write.
`checkout_latest()` is the primary hook: consumers refresh a handle
there after writing elsewhere, then poll.
Read operations that carry the floor: `describe_table`,
`list_table_versions`, `query_table`, `list_tables`.
`list_table_versions` is what resolves "latest" for managed-versioning
tables (`get_latest_version`), so it's the op that makes
`checkout_latest()` actually observe a prior write.
`describe_table_version` is excluded (pinned to an immutable version).
This mirrors #3439 (timestamp baseline, `max(baseline, now − interval)`,
monotonic); no `min_version` and no body channel, since the namespace
path has no version-returning write responses.
### Testing
- Unit tests for `compute_min_timestamp` / `next_freshness_baseline` and
the provider (header at/after a bumped baseline; nothing for an empty
baseline + no interval; interval floor applies; non-read ops emit
nothing; `list_tables` uses only the interval floor).
- Verified end-to-end against a local server that honors the header:
reads carry `x-lancedb-min-timestamp`, writes don't, and read-your-write
holds.
## Feature
### What is the new feature?
Adds Rust core API support for configuring vector query approximation
mode with `ApproxMode::{Fast, Normal, Accurate}`.
### Why do we need this feature?
Lance already exposes `lance_index::vector::ApproxMode` and scanner
support for controlling the speed/accuracy tradeoff for approximate
vector search. LanceDB Rust queries need to expose and pass this setting
through for local/native and remote vector searches.
### How does it work?
- Adds public `ApproxMode` in `rust/lancedb`, with lowercase serde,
`Default::Normal`, parse/display, and conversions to/from Lance's
`ApproxMode`.
- Adds `approx_mode: Option<ApproxMode>` to `VectorQueryRequest` and a
`VectorQuery::approx_mode(...)` builder.
- Applies the mode to native/local Lance scanners after `nearest(...)`
when explicitly set.
- Sends `approx_mode` in remote query JSON only when explicitly set;
default requests omit it.
## Validation
- `cargo fmt --all`
- `cargo test --quiet --features remote approx_mode`
- `cargo test --quiet --features remote
test_query_vector_default_values`
- `cargo check --quiet --features remote --tests --examples`
- `git diff --check`
## Feature
### What is the new feature?
Adds Rust core API support for configuring vector query approximation
mode with `ApproxMode::{Fast, Normal, Accurate}`.
### Why do we need this feature?
Lance already exposes `lance_index::vector::ApproxMode` and scanner
support for controlling the speed/accuracy tradeoff for approximate
vector search. LanceDB Rust queries need to expose and pass this setting
through for local/native and remote vector searches.
### How does it work?
- Adds public `ApproxMode` in `rust/lancedb`, with lowercase serde,
`Default::Normal`, parse/display, and conversions to/from Lance's
`ApproxMode`.
- Adds `approx_mode: Option<ApproxMode>` to `VectorQueryRequest` and a
`VectorQuery::approx_mode(...)` builder.
- Applies the mode to native/local Lance scanners after `nearest(...)`
when explicitly set.
- Sends `approx_mode` in remote query JSON only when explicitly set;
default requests omit it.
## Validation
- `cargo fmt --all`
- `cargo test --quiet --features remote approx_mode`
- `cargo test --quiet --features remote
test_query_vector_default_values`
- `cargo check --quiet --features remote --tests --examples`
- `git diff --check`
### Description
Adding branch support for RemoteTable by threading a branch selector
onto every operation the data plane accepts it on. Exposes the
currentBranch to nodejs and python through the bindings.
Matching the server handlers, the branch rides as:
- a `?branch=` query parameter for Arrow-body and query-only ops
(insert, merge_insert, multipart_*, version/list, drop_index)
- a `branch` field in the JSON body for everything else (count_rows,
query, update, delete, create_index, column ops, index list/stats,
stats, restore, describe, tags create/update)
A main-branch handle (`branch == None`) produces byte-identical requests
to before: no `branch` field and no `?branch=`
- Handle-per-branch: `create_branch` / `checkout_branch` return a new
handle with fresh caches and reset version/freshness state, mirroring
`NativeTable`.
- `create_branch` maps 409 to already-exists, 400 to invalid, and 404 to
not-found with source context, and sends without retry so the 409 stays
observable.
- `Ref` translation covers version, version-number (relative to the
handle's branch), and tag (resolved via the tags endpoint); `"main"` and
empty normalize to the main branch.
- Python branch handles persist their branch (and pinned version) across
pickle/fork, so a forked or pickled handle reopens on its branch rather
than silently reverting to main.
### Tests
- Rust mock tests per op category (query-param and body mechanisms,
branch CRUD, error paths, backward-compat).
- Python sync branch CRUD, `open_table(branch=)`, and a pickle
round-trip regression test.
Updates LanceDB's Lance dependencies to v8.0.0-beta.14.\n\nThis
refreshes the Rust workspace lockfile and Java lance-core version; no
compatibility code changes were required. Triggering Lance tag:
https://github.com/lance-format/lance/releases/tag/v8.0.0-beta.14
`RemoteTable::list_indices` currently makes one `/index/list/` call plus
one `/index/{name}/stats/` call per index just to recover `index_type`.
When the server returns `index_type` directly in the `/index/list/`
response, all enriched fields are used and the per-index stats fan-out
is skipped entirely. When `index_type` is absent (legacy servers), the
existing stats fallback runs as before. This is content-based: no
version header required.
## Changes
- `RemoteTable::parse_index_list_response` replaces the old split
between enriched and legacy parsers. A single struct deserializes both
old and new response shapes, with all fields except `index_name` and
`columns` optional. `index_type` acts as the sentinel: present → use
enriched fields directly; absent → call `/index/{name}/stats/`.
## Tests
Added `test_list_indices_enriched` covering:
- All enriched fields populated correctly when `index_type` is in the
list response
- Optional fields absent from the response deserialize as `None`
- Stats endpoint is **not** called (panics if hit), verifying the
fan-out is eliminated
Existing `test_list_indices` and `test_list_indices_nested_field_paths`
exercise the legacy path unchanged.
## Depends on
- #3497 (expand `IndexConfig`) — already merged
- Server-side enriched response support
Closes#3494
Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>