Commit Graph

2579 Commits

Author SHA1 Message Date
Brendan Clement
b48a1f3c7e test: add current_branch to the namespace pushdown fake 2026-06-08 15:51:34 -07:00
Brendan Clement
d6143c5426 test: assert branch handle read-consistency behavior 2026-06-08 14:50:47 -07:00
Brendan Clement
6064cd45b4 test: skip async namespace branch test without lance 2026-06-08 14:50:47 -07:00
Brendan Clement
12c66f55e5 fix: address review comments on branch support 2026-06-08 14:50:47 -07:00
Brendan Clement
bcd38512cf docs: trim branch API comments 2026-06-08 14:48:05 -07:00
Brendan Clement
bf5f6844ac fix: validate branch inputs (empty names, negative versions) 2026-06-08 14:48:05 -07:00
Brendan Clement
61141e87ee fix(python): skip server-side query pushdown on branch handles 2026-06-08 14:48:05 -07:00
Brendan Clement
f1817fc555 test(python): guard pylance-dependent branch tests with importorskip 2026-06-08 14:46:39 -07:00
Brendan Clement
3ed2282431 fix(python): branch-scope to_lance so branch handles don't read main 2026-06-08 14:46:39 -07:00
Brendan Clement
705595ac56 fix(python): keep open_table override signatures compatible with branch 2026-06-08 14:46:39 -07:00
Brendan Clement
9923105954 feat: support opening a branch directly via open_table 2026-06-08 14:46:39 -07:00
Brendan Clement
3103b57eda address sync / namespace issue in python sdk 2026-06-08 14:46:39 -07:00
Brendan Clement
c795b2e02b feat(typescript): export Branches from the public API 2026-06-08 14:46:39 -07:00
Brendan Clement
8fb77a2fcc refactor: use Self in branch method return types 2026-06-08 14:46:39 -07:00
Brendan Clement
aa938bdb1c feat(typescript): add table branch support 2026-06-08 14:46:39 -07:00
Brendan Clement
7b50fa287b feat(python): add table branch support 2026-06-08 14:46:39 -07:00
Brendan Clement
f0c3eda784 feat: add table branch support to the Rust core 2026-06-08 14:46:39 -07:00
Yang Cen
3e25f584eb fix(python): push down namespace full reads (#3516)
## Bug Fix

### What is the bug?

Namespace-backed `LanceTable.to_arrow()` full-table reads bypassed the
existing `QueryTable` server-side query path and called the lower-level
table `to_arrow()` implementation directly. In Geneva/Sophon this could
fail while parsing the Arrow IPC response for
`hist.get_table().to_arrow()` / `to_pandas()`, even though
`hist.get_table().search().to_arrow()` worked.

### What issues or incorrect behavior does the bug cause?

Full-table reads on namespace-backed tables with `QueryTable` pushdown
could fail with Arrow IPC parse errors, while query/search reads on the
same table succeeded. Since `to_pandas()` delegates through `to_arrow()`
for non-blob/native cases, pandas export was affected too.

### How does this PR fix the problem?

When `QueryTable` pushdown is enabled, sync and async table `to_arrow()`
now construct a plain no-filter, no-limit, all-columns query and execute
it through the table-level `_execute_query()` path. `AsyncTable` now
preserves namespace context from async namespace connections so async
full reads can make the same pushdown decision. Non-namespace tables and
namespace tables without `QueryTable` pushdown keep their existing
behavior.

### Tests

- `uv run --extra tests --extra dev --no-sync ruff check
python/lancedb/table.py python/lancedb/namespace.py
python/tests/test_namespace.py`
- `uv run --extra tests --extra dev --no-sync ruff format
python/lancedb/table.py python/lancedb/namespace.py
python/tests/test_namespace.py`
- `uv run --extra tests --extra dev --no-sync pytest
python/tests/test_namespace.py::TestPushdownOperations::test_lance_table_to_arrow_uses_query_pushdown
python/tests/test_namespace.py::TestAsyncPushdownOperations::test_async_table_to_arrow_uses_query_pushdown
python/tests/test_namespace.py::test_local_table_to_arrow_and_to_pandas_are_unchanged
-q`
- `uv run --extra tests --extra dev --no-sync pytest
python/tests/test_namespace.py -q`
2026-06-08 19:48:40 +08:00
LanceDB Robot
59fbfd4158 chore: update lance dependency to v8.0.0-beta.6 (#3510)
Updates LanceDB Lance dependencies from v8.0.0-beta.5 to v8.0.0-beta.6
and refreshes Cargo metadata.

No compatibility fixes were required; Java lance-core was bumped to
8.0.0-beta.6 as well.

Lance tag:
https://github.com/lance-format/lance/releases/tag/v8.0.0-beta.6
2026-06-05 16:55:16 -07:00
LanceDB Robot
f37e698e2f chore: update lance dependency to v8.0.0-beta.5 (#3508)
Updates Lance dependencies from v8.0.0-beta.4 to v8.0.0-beta.5 across
the Rust workspace and Java lance-core version.

No compatibility code changes were required; clippy and rustfmt pass
after installing the missing runner components.

Lance tag:
https://github.com/lance-format/lance/releases/tag/v8.0.0-beta.5
2026-06-05 12:20:08 -05:00
Will Jones
09b1bbc12a refactor!: drop unused loss field from IndexStatistics (#3496)
BREAKING CHANGE: direct Rust users lose the `IndexStatistics::loss`
field. Python and Node.js consumers are unaffected in practice for
remote tables (the value was always `None`/absent), but the attribute is
gone for local tables too.

`IndexStatistics::loss` was local-only — LanceDB Cloud never returned
it, so
`RemoteTable::index_stats` always set `loss: None`. It's vestigial; this
removes it.

- Remove `loss` from `IndexStatistics` and the internal `IndexMetadata`
in `rust/lancedb/src/index.rs`, plus the summing logic in
`NativeTable::index_stats`.
- Drop `loss` from the Python and Node.js bindings (and their
tests/docs).

Fixes #3493

🤖 Generated with [Claude Code](https://claude.com/claude-code)

---------

Co-authored-by: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-05 07:52:40 -07:00
LanceDB Robot
c484b24e51 chore: update lance dependency to v8.0.0-beta.4 (#3507)
Updates LanceDB Lance dependencies to Lance v8.0.0-beta.4.

Includes the required compatibility fix for the new Lance file writer
finish summary API.

Lance tag:
https://github.com/lance-format/lance/releases/tag/v8.0.0-beta.4
2026-06-05 08:28:14 -05:00
Armaan Sandhu
3868965413 fix(python): run AsyncTable.search embeddings on a dedicated executor (#3459)
## Summary
  
`AsyncTable.search()` computes the query embedding with
`loop.run_in_executor(None, ...)`, which uses asyncio's **default**
`ThreadPoolExecutor`. That pool is shared with all other
`run_in_executor(None, ...)` work, so a slow embedding call — a heavy
local model or an HTTP request to an embeddings API — ties up those
threads and starves unrelated async I/O under concurrent load.
  
This moves the (potentially blocking) embedding call onto a **dedicated
executor**, isolating it from the default pool.
  
  Closes #3310.
  
  ## Problem

  `python/lancedb/table.py`, `AsyncTable.search()`:

  ```python
  return (
      await loop.run_in_executor(
None, # asyncio's default executor, shared with other blocking I/O
          embedding.function.compute_query_embeddings_with_retry,
          query,
      )   
  )[0]
  ```
  
Under load, concurrent searches whose embeddings block (or any other
code using the default executor) contend for the same small thread pool.
  
  ## Change

- Add a dedicated
`ThreadPoolExecutor(thread_name_prefix="lancedb-embedding")` in
`background_loop.py`, exposed via `embedding_executor()`.
- Use it in `AsyncTable.search()`'s `make_embedding` instead of the
default executor.
- Reset the executor in the existing `_reset_after_fork` hook — its
worker threads don't survive `fork()`, same as the background event
loop. It's recreated lazily, so this is cheap.

  ## Design notes
  
The issue asked whether maintainers preferred a configurable executor, a
dedicated internal one, or another approach (no response in the thread).
I went with a **dedicated internal executor**: it fixes the starvation
with no public API change and stays consistent with the existing `LOOP`
singleton. Making the pool size configurable would be an easy follow-up
if preferred.
  
Scope is limited to `search()`. The broader "embedding functions need
real async support" (including `add()`) is tracked separately in #3268.
  
  ## Testing
  
- Added `test_async_search_runs_embedding_on_dedicated_executor`:
patches the embedding function to record the executing thread during an
async search and asserts it runs on a `lancedb-embedding` thread.
Verified it **fails** against the previous `run_in_executor(None, ...)`
and passes with the fix.
- `ruff format`, `ruff check`, and `pyright` pass on the changed files.
2026-06-04 21:57:16 -07:00
Dan Rammer
c13ebc6796 feat(remote): implement set/unset_lsm_write_spec REST variant (#3501)
## Summary

Wires `RemoteTable::set_lsm_write_spec` / `unset_lsm_write_spec` to the
sophon REST endpoints added in
[lancedb/sophon#6181](https://github.com/lancedb/sophon/pull/6181),
replacing the previous `NotSupported` stubs.

- `set_lsm_write_spec` maps the `LsmWriteSpec` onto sophon's request DTO
— mode-tagged `sharding` (`unsharded` / `bucket` / `identity`),
`maintained_indexes`, and `writer_config_defaults` — and POSTs to
`/v1/table/{name}/set_lsm_write_spec/`.
- `unset_lsm_write_spec` POSTs to
`/v1/table/{name}/unset_lsm_write_spec/`.
- Both call `check_mutable` first, matching the other remote mutations.
- `maintained_indexes` is sent verbatim (an empty list means "no
maintained indexes", matching native semantics).

## Testing

- Added mocked-endpoint unit tests for unsharded / bucket / identity set
and for unset.
- `cargo check --features remote --tests` passes.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-authored-by: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-04 21:47:52 -05:00
LanceDB Robot
4b287fd9c4 chore: update lance dependency to v8.0.0-beta.2 (#3500)
Updates Lance dependencies to v8.0.0-beta.2 across the Rust workspace
and Java lance-core metadata.

The update was generated with ci/update_lance_dependency.py and required
no compatibility code changes.

Lance tag:
https://github.com/lance-format/lance/releases/tag/v8.0.0-beta.2

##  Merge blocker: legal review required

This bump pulls in a new transitive **dev/profiling** dependency chain
`inferno v0.11.21` → `pprof v0.15.0` → `lance-testing`, and `inferno` is
licensed **CDDL-1.0** (copyleft). To get `cargo-deny` green, `CDDL-1.0`
was added to the `deny.toml` allow list.

**Do not merge until legal has reviewed and signed off on allowing
CDDL-1.0.** The dependency is dev/test-only and not distributed, but the
allow-list addition still requires legal approval per our policy.

---------

Co-authored-by: Daniel Rammer <hamersaw@protonmail.com>
Co-authored-by: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-04 12:26:04 -05:00
hashwnath
64194ea8ad fix(python): make LanceDBClientError pickleable (#3470)
## Summary

- Add `__reduce__` methods to `LanceDBClientError` and `RetryError` so
that instances can be pickled and unpickled correctly
- `HttpError` inherits the fix from `LanceDBClientError` since it has no
additional `__init__` parameters
- Add tests verifying pickle roundtrip for all three exception classes

Fixes #3447

## Test plan

- [x] Verified pickle roundtrip for `LanceDBClientError` with and
without `status_code`
- [x] Verified pickle roundtrip for `HttpError` (subclass, no extra init
params)
- [x] Verified pickle roundtrip for `RetryError` (subclass with many
extra params)
- [ ] CI tests pass

🤖 Generated with [Claude Code](https://claude.com/claude-code)

---------

Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
Co-authored-by: Will Jones <willjones127@gmail.com>
2026-06-04 09:29:15 -07:00
dependabot[bot]
e6c5de1a58 chore(deps): bump the rust-minor-patch group with 3 updates (#3499)
Bumps the rust-minor-patch group with 3 updates:
[log](https://github.com/rust-lang/log),
[test-log](https://github.com/d-e-s-o/test-log) and
[serial_test](https://github.com/palfrey/serial_test).

Updates `log` from 0.4.30 to 0.4.31
<details>
<summary>Release notes</summary>
<p><em>Sourced from <a
href="https://github.com/rust-lang/log/releases">log's
releases</a>.</em></p>
<blockquote>
<h2>0.4.31</h2>
<h2>What's Changed</h2>
<ul>
<li>fix typos in kv compile errors and log documentation by <a
href="https://github.com/Isvane"><code>@​Isvane</code></a> in <a
href="https://redirect.github.com/rust-lang/log/pull/726">rust-lang/log#726</a></li>
<li>Leverage static str key when possible by <a
href="https://github.com/tisonkun"><code>@​tisonkun</code></a> in <a
href="https://redirect.github.com/rust-lang/log/pull/727">rust-lang/log#727</a></li>
<li>Prepare for 0.4.31 release by <a
href="https://github.com/KodrAus"><code>@​KodrAus</code></a> in <a
href="https://redirect.github.com/rust-lang/log/pull/728">rust-lang/log#728</a></li>
</ul>
<h2>New Contributors</h2>
<ul>
<li><a href="https://github.com/Isvane"><code>@​Isvane</code></a> made
their first contribution in <a
href="https://redirect.github.com/rust-lang/log/pull/726">rust-lang/log#726</a></li>
</ul>
<p><strong>Full Changelog</strong>: <a
href="https://github.com/rust-lang/log/compare/0.4.30...0.4.31">https://github.com/rust-lang/log/compare/0.4.30...0.4.31</a></p>
</blockquote>
</details>
<details>
<summary>Changelog</summary>
<p><em>Sourced from <a
href="https://github.com/rust-lang/log/blob/master/CHANGELOG.md">log's
changelog</a>.</em></p>
<blockquote>
<h2>[0.4.31] - 2026-06-02</h2>
<h2>What's Changed</h2>
<ul>
<li>Leverage static str key when possible by <a
href="https://github.com/tisonkun"><code>@​tisonkun</code></a> in <a
href="https://redirect.github.com/rust-lang/log/pull/727">rust-lang/log#727</a></li>
</ul>
<h2>New Contributors</h2>
<ul>
<li><a href="https://github.com/Isvane"><code>@​Isvane</code></a> made
their first contribution in <a
href="https://redirect.github.com/rust-lang/log/pull/726">rust-lang/log#726</a></li>
</ul>
<p><strong>Full Changelog</strong>: <a
href="https://github.com/rust-lang/log/compare/0.4.30...0.4.31">https://github.com/rust-lang/log/compare/0.4.30...0.4.31</a></p>
<h2>[Unreleased]</h2>
</blockquote>
</details>
<details>
<summary>Commits</summary>
<ul>
<li><a
href="580839288e"><code>5808392</code></a>
Merge pull request <a
href="https://redirect.github.com/rust-lang/log/issues/728">#728</a>
from rust-lang/cargo/0.4.31</li>
<li><a
href="86d739f51a"><code>86d739f</code></a>
prepare for 0.4.31 release</li>
<li><a
href="c906cfb02e"><code>c906cfb</code></a>
Merge pull request <a
href="https://redirect.github.com/rust-lang/log/issues/727">#727</a>
from tisonkun/leverage-static-str-key-when-possible</li>
<li><a
href="756c279649"><code>756c279</code></a>
leverage str literal as well</li>
<li><a
href="3dd250d153"><code>3dd250d</code></a>
rename Key::from_static_str to from_str_static</li>
<li><a
href="db145979e2"><code>db14597</code></a>
Leverage static str key when possible</li>
<li><a
href="761461a5d0"><code>761461a</code></a>
Merge pull request <a
href="https://redirect.github.com/rust-lang/log/issues/726">#726</a>
from Isvane/fix/typos</li>
<li><a
href="48ce372edd"><code>48ce372</code></a>
fix typos in kv compile errors and log documentation</li>
<li>See full diff in <a
href="https://github.com/rust-lang/log/compare/0.4.30...0.4.31">compare
view</a></li>
</ul>
</details>
<br />

Updates `test-log` from 0.2.20 to 0.2.21
<details>
<summary>Release notes</summary>
<p><em>Sourced from <a
href="https://github.com/d-e-s-o/test-log/releases">test-log's
releases</a>.</em></p>
<blockquote>
<h2>v0.2.21</h2>
<ul>
<li>Fixed spans in generated code, improving <code>rust-analyzer</code>
interaction</li>
</ul>
<h2>New Contributors</h2>
<ul>
<li><a
href="https://github.com/jorendorff"><code>@​jorendorff</code></a> made
their first contribution in <a
href="https://redirect.github.com/d-e-s-o/test-log/pull/68">d-e-s-o/test-log#68</a></li>
</ul>
<p><strong>Full Changelog</strong>: <a
href="https://github.com/d-e-s-o/test-log/compare/v0.2.20...v0.2.21">https://github.com/d-e-s-o/test-log/compare/v0.2.20...v0.2.21</a></p>
</blockquote>
</details>
<details>
<summary>Changelog</summary>
<p><em>Sourced from <a
href="https://github.com/d-e-s-o/test-log/blob/main/CHANGELOG.md">test-log's
changelog</a>.</em></p>
<blockquote>
<h2>0.2.21</h2>
<ul>
<li>Fixed spans in generated code, improving <code>rust-analyzer</code>
interaction</li>
</ul>
</blockquote>
</details>
<details>
<summary>Commits</summary>
<ul>
<li><a
href="b7b9da0345"><code>b7b9da0</code></a>
Bump version to 0.2.21</li>
<li><a
href="db522dc408"><code>db522dc</code></a>
Add CHANGELOG entry for <a
href="https://redirect.github.com/d-e-s-o/test-log/issues/68">#68</a></li>
<li><a
href="5e996d9ac6"><code>5e996d9</code></a>
Wrap the injected init code, not the original test body</li>
<li><a
href="c78563c1ca"><code>c78563c</code></a>
Retain existing spans for test code</li>
<li>See full diff in <a
href="https://github.com/d-e-s-o/test-log/compare/v0.2.20...v0.2.21">compare
view</a></li>
</ul>
</details>
<br />

Updates `serial_test` from 3.4.0 to 3.5.0
<details>
<summary>Release notes</summary>
<p><em>Sourced from <a
href="https://github.com/palfrey/serial_test/releases">serial_test's
releases</a>.</em></p>
<blockquote>
<h2>v3.5.0</h2>
<h2>What's Changed</h2>
<ul>
<li>Replace scc/sdd with std::sync::Mutex for Miri strict provenance
compatibility by <a
href="https://github.com/justanotheranonymoususer"><code>@​justanotheranonymoususer</code></a>
in <a
href="https://redirect.github.com/palfrey/serial_test/pull/157">palfrey/serial_test#157</a></li>
</ul>
<h2>New Contributors</h2>
<ul>
<li><a
href="https://github.com/justanotheranonymoususer"><code>@​justanotheranonymoususer</code></a>
made their first contribution in <a
href="https://redirect.github.com/palfrey/serial_test/pull/157">palfrey/serial_test#157</a></li>
</ul>
<p><strong>Full Changelog</strong>: <a
href="https://github.com/palfrey/serial_test/compare/v3.4.0...v3.5.0">https://github.com/palfrey/serial_test/compare/v3.4.0...v3.5.0</a></p>
</blockquote>
</details>
<details>
<summary>Commits</summary>
<ul>
<li><a
href="6181f64de9"><code>6181f64</code></a>
3.5.0</li>
<li><a
href="480bead2f6"><code>480bead</code></a>
Merge pull request <a
href="https://redirect.github.com/palfrey/serial_test/issues/157">#157</a>
from justanotheranonymoususer/remove-scc-dep</li>
<li><a
href="e03019e3cd"><code>e03019e</code></a>
Update ci.yml</li>
<li><a
href="820c0f3de9"><code>820c0f3</code></a>
Update ci.yml</li>
<li><a
href="62a89b055f"><code>62a89b0</code></a>
Only skip file_lock with filesystem access</li>
<li><a
href="5ff550164e"><code>5ff5501</code></a>
Update ci.yml</li>
<li><a
href="0bd996de9e"><code>0bd996d</code></a>
Let's try --all-features</li>
<li><a
href="338e4ed891"><code>338e4ed</code></a>
Fix formatting</li>
<li><a
href="a55cde5d1d"><code>a55cde5</code></a>
Cleanup code_lock.rs</li>
<li><a
href="9ad7a8f18c"><code>9ad7a8f</code></a>
Remove unnecessary test leftover changes</li>
<li>Additional commits viewable in <a
href="https://github.com/palfrey/serial_test/compare/v3.4.0...v3.5.0">compare
view</a></li>
</ul>
</details>
<br />


Dependabot will resolve any conflicts with this PR as long as you don't
alter it yourself. You can also trigger a rebase manually by commenting
`@dependabot rebase`.

[//]: # (dependabot-automerge-start)
[//]: # (dependabot-automerge-end)

---

<details>
<summary>Dependabot commands and options</summary>
<br />

You can trigger Dependabot actions by commenting on this PR:
- `@dependabot rebase` will rebase this PR
- `@dependabot recreate` will recreate this PR, overwriting any edits
that have been made to it
- `@dependabot show <dependency name> ignore conditions` will show all
of the ignore conditions of the specified dependency
- `@dependabot ignore <dependency name> major version` will close this
group update PR and stop Dependabot creating any more for the specific
dependency's major version (unless you unignore this specific
dependency's major version or upgrade to it yourself)
- `@dependabot ignore <dependency name> minor version` will close this
group update PR and stop Dependabot creating any more for the specific
dependency's minor version (unless you unignore this specific
dependency's minor version or upgrade to it yourself)
- `@dependabot ignore <dependency name>` will close this group update PR
and stop Dependabot creating any more for the specific dependency
(unless you unignore this specific dependency or upgrade to it yourself)
- `@dependabot unignore <dependency name>` will remove all of the ignore
conditions of the specified dependency
- `@dependabot unignore <dependency name> <ignore condition>` will
remove the ignore condition of the specified dependency and ignore
conditions


</details>

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2026-06-04 09:29:08 -07:00
Lance Release
39a9f3e1e9 Bump version: 0.30.1-beta.1 → 0.30.1-beta.2 2026-06-04 06:05:35 +00:00
Lance Release
952055d428 Bump version: 0.33.1-beta.1 → 0.33.1-beta.2 python-v0.33.1-beta.2 2026-06-04 06:04:37 +00:00
Yang Cen
927ba2c948 fix(python): route blob query pandas through scanner (#3491)
## Bug Fix

### What is the bug?
`QueryBuilder.to_pandas(blob_mode="descriptions")` could still fall back
to `self.to_arrow()` for query outputs with blob columns. Custom query
subclasses or wrappers can have `to_arrow()` behavior that is not
compatible with pandas blob-description conversion, which can surface as
low-level Arrow/list-batch conversion failures.

### What issues or incorrect behavior does the bug cause?
Callers need to carry local `to_pandas` or plain-scan adapter special
casing for blob descriptions, and scanner-only kwargs such as row
addresses and fragment selection are not represented in LanceDB query
state.

### How does this PR fix the problem?
This PR routes blob-output query `to_pandas()` through the Lance scanner
path for `lazy`, `bytes`, and `descriptions` modes when the query is a
scanner-backed plain scan. For `blob_mode="descriptions"` with
`flatten`, it collects scanner Arrow/table output, applies LanceDB
`flatten_columns`, and converts to pandas from there. Non-plain blob
query shapes now fail with a clear unsupported error instead of falling
into subclass `to_arrow()` behavior.

It also adds Python query state and builder methods for scanner-only
plain-scan parameters:

- `with_row_address()` for `_rowaddr`
- `with_fragments(...)` for Lance fragment objects
- `fragment_ids([...])` as a convenience wrapper that resolves IDs to
Lance fragments

## Validation

- `cd python && uv run --no-sync ruff format --check
python/lancedb/query.py python/tests/test_query.py`
- `cd python && uv run --no-sync ruff check python/lancedb/query.py
python/tests/test_query.py`

Targeted pytest was intentionally not run locally per maintainer
request.
2026-06-04 14:03:33 +08:00
Armaan Sandhu
415d199c15 feat(rust): support datafusion expressions for merge insert predicates (#3444)
### Description
This PR exposes native DataFusion expression support in the Rust SDK's
`MergeInsertBuilder` via two new builder methods:
`when_matched_update_all_expr` and
`when_not_matched_by_source_delete_expr`.

For remote LanceDB tables (where operations are serialized over
HTTP/JSON to the SaaS backend), native DataFusion expression trees
cannot be executed directly. The SDK handles this gracefully by
returning a `NotSupported` error.

### Key Changes
- **`MergeFilter` Enum**: Introduced a helper enum to store either a SQL
string or a native `datafusion_expr::Expr`.
- **`MergeInsertBuilder`**: Updated `when_matched_update_all_filt` and
`when_not_matched_by_source_delete_filt` fields to store the new enum,
and added `when_matched_update_all_expr` and
`when_not_matched_by_source_delete_expr` builder methods.
- **Execution & Remote Dispatch**: Dispatched the filter variants during
local execution, and rejected expression filters with a clean
`NotSupported` error in remote table request conversion.
- **Testing**: Added a `test_merge_insert_expr` unit test covering
conditional updates and deletes with programmatically built DataFusion
expressions.

### Verification
- Added integration test `test_merge_insert_expr` which successfully
compiles and passes.
- Formatted and linted the code.

Closes #3416
2026-06-03 15:47:51 -07:00
Will Jones
a16676e05f ci: update python lockfile weekly (#3498)
Make sure we are getting security fixes in there regularly, and other
useful bumps.
2026-06-03 15:24:32 -07:00
Harikrishna KP
4e44262499 test(python): add regression test for nullable struct with None (#2654) (#3483)
## Summary

Regression test for [issue
#2654](https://github.com/lancedb/lancedb/issues/2654) — a nullable
struct column whose first batch contains only `None` values crashed in
`_align_field_types` with `AttributeError: 'pyarrow.lib.DataType' object
has no attribute 'fields'`.

The actual fix landed in #3394, but no test was added. This PR adds the
reproducer from the issue as a test.

## Test plan

- `test_add_nullable_struct_with_none`: creates a table with a nullable
struct column, adds a row with a non-null struct value, then a row with
`None` for the struct field. Verifies both rows land correctly.
- Uses Lance file format v2.1 (`new_table_data_storage_version="2.1"`)
because nullable structs aren't supported on v2.0.

## Related

- #3028 (the original fix attempt, now superseded)
2026-06-03 14:13:09 -07:00
Brendan Clement
632375faf1 docs: add cross-SDK parity guidance for code review (#3464)
Adds a REVIEW.md at the repo root with cross-SDK parity guidance for
automated code review. The Claude Code review feature automatically
loads `REVIEW.md` as review-only context.

This is intentionally a semantic nudge, not a deterministic check, it
relies on the reviewer reading the sibling SDK, so it will catch most
gaps.
2026-06-03 14:11:33 -07:00
devteamaegis
9969191d0d fix(rerankers): guard against empty vector_results in RRFReranker.rerank_multivector (#3467)
## What's broken

Calling `RRFReranker().rerank_multivector([])` crashes with `IndexError:
list index out of range` because the method accesses `vector_results[0]`
for the type-homogeneity check before verifying the list is non-empty.
The `all()` call passes vacuously on an empty iterable so the crash hits
the next lines.

```python
from lancedb.rerankers import RRFReranker
RRFReranker().rerank_multivector([])
# IndexError: list index out of range
```

## Why it happens

The type check uses `vector_results[0]` as the reference type but never
guards against an empty list. `all(...)` short-circuits to `True` when
the iterable is empty, so the bad index access on the lines that follow
is never reached by the existing guard logic.

## Fix

Add an explicit empty-list check before any indexing.
2026-06-03 14:06:33 -07:00
devteamaegis
1e7326cd8c fix(rerankers/mrr): raise ValueError on empty vector_results list (#3469)
## What's broken

`MRRReranker.rerank_multivector([])` raises `IndexError: list index out
of range`. The crash happens on line 128 (the `all()` type-homogeneity
check passes vacuously on an empty iterable) and on line 134 which
accesses `vector_results[0]` unconditionally, with no prior guard for an
empty list.

## Why it happens

`all()` over an empty iterable returns `True`, so the type check
silently passes and execution falls through to `vector_results[0]` which
crashes.

## Fix

Added a two-line guard at the top of `rerank_multivector` that raises a
clear `ValueError("vector_results must not be empty")` before any
indexing occurs.

## Test

Added `test_mrr_reranker_empty_input` in `test_rerankers.py` which calls
`rerank_multivector([])` and asserts that a `ValueError` with the
message "must not be empty" is raised.

Fixes #3468

Co-authored-by: Aegis Dev <aegis@devteamaegis.com>
2026-06-03 14:05:43 -07:00
Lance Release
9483b534af Bump version: 0.30.1-beta.0 → 0.30.1-beta.1 2026-06-03 11:17:37 +00:00
Lance Release
ac3411e81e Bump version: 0.33.1-beta.0 → 0.33.1-beta.1 python-v0.33.1-beta.1 2026-06-03 11:16:51 +00:00
Yang Cen
6f18eb4cce feat(python): support blob modes in query to_pandas (#3487)
## Feature

- What is the new feature?
- Adds `blob_mode` support to sync and async Python query `to_pandas()`
APIs.
- Enables plain scan queries to return blob columns as lazy `BlobFile`
objects, raw bytes, or blob descriptions.
- Lets namespace-backed local tables use Lance native blob-aware pandas
conversion for lazy blobs.

- Why do we need this feature?
- Table and Lance dataset/scanner APIs already support blob-aware pandas
conversion, but LanceDB query builders did not expose that capability.
- Geneva and other callers should be able to use query-level
`to_pandas(blob_mode=...)` without manually constructing Lance scanners.

- How does it work?
- Plain scan queries route through Lance scanner native
`to_pandas(blob_mode=...)`, preserving filter, projection, limit,
offset, row id, and alias/expression projection behavior.
- Non-native query shapes keep existing Arrow fallback semantics and
raise a clear error when they return blob columns with
`blob_mode="lazy"` or `blob_mode="bytes"`.
- Focused tests cover table/query blob modes,
filter/select/limit/offset/alias query cases, async query behavior,
vector-query error boundaries, and namespace-backed lazy blobs.

## Validation

- `cd python && .venv/bin/maturin develop --uv --extras tests,dev
--profile dev`
- `cd python && uv run --frozen --no-sync pytest
python/tests/test_table.py::test_table_to_pandas_blob_modes
python/tests/test_table.py::test_async_table_to_pandas_blob_bytes
python/tests/test_query.py::test_plain_scan_query_to_pandas_blob_modes
python/tests/test_query.py::test_plain_scan_query_to_pandas_blob_projection
python/tests/test_query.py::test_async_plain_scan_query_to_pandas_blob_projection
python/tests/test_query.py::test_vector_query_to_pandas_blob_mode_requires_native_path
python/tests/test_namespace.py::TestNamespaceConnection::test_table_to_pandas_blob_lazy_through_namespace
-q`
- `cd python && uv run --frozen --no-sync ruff format --check .`
- `cd python && uv run --frozen --no-sync ruff check .`
- `git diff --check`
2026-06-03 19:15:44 +08:00
Brendan Clement
379684391e feat: deprecate replace_field_metadata for update_field_metadata (#3484)
### Summary
Deprecates the Python replace_field_metadata (on Table and AsyncTable)
in favor of update_field_metadata. Mirrors Lance, which already
deprecated Dataset.replace_field_metadata for update_field_metadata.

Stacked on top of #3482 as this was a follow-up task after adding
update_field_metadata
2026-06-02 14:02:22 -07:00
Brendan Clement
d065be0474 feat: add update_field_metadata to edit per-field metadata (#3482)
### Summary
Adds update_field_metadata to the client SDK (Rust core, Python, and
TypeScript) so clients can edit per-field (column) Arrow metadata
(schema.fields[].metadata)

### Testing
- added unit tests
- ran E2E against a local server on both local and remote tables (set →
merge → delete), across Python sync/async and TypeScript

### Next steps
- deprecate replace_field_metadata in the python lancedb favor of this
(typescript didn't have replace_field_metadata method). This matches
Lance's API direction (Lance already deprecated replace_field_metadata
for update_field_metadata)
2026-06-02 07:00:00 -07:00
Xuanwo
7b874905fd ci: move Lance dependency bump flow into skill (#3475)
Moves the Lance dependency bump process into an in-repository skill so
local agents and GitHub Actions share the same workflow definition.

The update workflow is now an explicit, optional-tag entrypoint;
latest-release resolution, duplicate PR handling, Java/Rust dependency
updates, and Sophon follow-up are documented in the skill and backed by
a small deterministic helper.
2026-06-02 16:05:37 +08:00
Xuanwo
a327044e2f feat(python): support remote tables in PyTorch dataloaders (#3432)
This PR makes remote LanceDB tables usable from PyTorch multiprocessing
workers. Remote tables now carry enough safe JSON connection state to
reopen themselves after pickle/spawn or fork, and permutations lazily
rebuild their reader from restored tables instead of trying to reuse
process-local handles.

This addresses the remote-table gap in the PyTorch dataset path while
preserving the explicit connection factory escape hatch for custom
worker-side credential loading or non-serializable header providers.

Validated with targeted remote table, permutation, and PyTorch
DataLoader tests.
2026-06-02 15:38:28 +08:00
Lance Release
f20ec99dec Bump version: 0.30.0-beta.1 → 0.30.1-beta.0 2026-06-01 12:41:45 +00:00
Lance Release
60f961584c Bump version: 0.33.0-beta.1 → 0.33.1-beta.0 python-v0.33.1-beta.0 2026-06-01 12:41:02 +00:00
Xuanwo
ac699d7ecf chore: bump lance to 7.2.0-beta.3 (#3471)
This updates the workspace Lance dependencies from `v7.1.0-beta.4` to
`v7.2.0-beta.3` and refreshes `Cargo.lock`.

The lockfile now points at Lance commit
`7c070f760fa8e24c8015cb2afbd22c5e6b7898e8` and includes the transitive
dependency updates required by the new beta.
2026-06-01 20:40:07 +08:00
dependabot[bot]
968277be79 chore(deps): bump the rust-minor-patch group with 5 updates (#3465)
Bumps the rust-minor-patch group with 5 updates:

| Package | From | To |
| --- | --- | --- |
| [log](https://github.com/rust-lang/log) | `0.4.29` | `0.4.30` |
| [serde_json](https://github.com/serde-rs/json) | `1.0.149` | `1.0.150`
|
| [http](https://github.com/hyperium/http) | `1.4.0` | `1.4.1` |
| [uuid](https://github.com/uuid-rs/uuid) | `1.23.1` | `1.23.2` |
| [aws-smithy-runtime](https://github.com/smithy-lang/smithy-rs) |
`1.11.1` | `1.11.3` |

Updates `log` from 0.4.29 to 0.4.30
<details>
<summary>Release notes</summary>
<p><em>Sourced from <a
href="https://github.com/rust-lang/log/releases">log's
releases</a>.</em></p>
<blockquote>
<h2>0.4.30</h2>
<h3>What's Changed</h3>
<ul>
<li>Support capturing of <code>std::net</code> types by <a
href="https://github.com/KodrAus"><code>@​KodrAus</code></a> in <a
href="https://redirect.github.com/rust-lang/log/pull/724">rust-lang/log#724</a></li>
</ul>
<h3>New Contributors</h3>
<ul>
<li><a href="https://github.com/V0ldek"><code>@​V0ldek</code></a> made
their first contribution in <a
href="https://redirect.github.com/rust-lang/log/pull/720">rust-lang/log#720</a></li>
<li><a href="https://github.com/woodruffw"><code>@​woodruffw</code></a>
made their first contribution in <a
href="https://redirect.github.com/rust-lang/log/pull/723">rust-lang/log#723</a></li>
</ul>
<p><strong>Full Changelog</strong>: <a
href="https://github.com/rust-lang/log/compare/0.4.29...0.4.30">https://github.com/rust-lang/log/compare/0.4.29...0.4.30</a></p>
<h3>Notable Changes</h3>
<ul>
<li>MSRV is bumped to 1.71.0 in <a
href="https://redirect.github.com/rust-lang/log/pull/723">rust-lang/log#723</a></li>
</ul>
</blockquote>
</details>
<details>
<summary>Changelog</summary>
<p><em>Sourced from <a
href="https://github.com/rust-lang/log/blob/master/CHANGELOG.md">log's
changelog</a>.</em></p>
<blockquote>
<h2>[0.4.30] - 2026-05-21</h2>
<h3>What's Changed</h3>
<ul>
<li>Support capturing of <code>std::net</code> types by <a
href="https://github.com/KodrAus"><code>@​KodrAus</code></a> in <a
href="https://redirect.github.com/rust-lang/log/pull/724">rust-lang/log#724</a></li>
</ul>
<h3>New Contributors</h3>
<ul>
<li><a href="https://github.com/V0ldek"><code>@​V0ldek</code></a> made
their first contribution in <a
href="https://redirect.github.com/rust-lang/log/pull/720">rust-lang/log#720</a></li>
<li><a href="https://github.com/woodruffw"><code>@​woodruffw</code></a>
made their first contribution in <a
href="https://redirect.github.com/rust-lang/log/pull/723">rust-lang/log#723</a></li>
</ul>
<p><strong>Full Changelog</strong>: <a
href="https://github.com/rust-lang/log/compare/0.4.29...0.4.30">https://github.com/rust-lang/log/compare/0.4.29...0.4.30</a></p>
<h3>Notable Changes</h3>
<ul>
<li>MSRV is bumped to 1.71.0 in <a
href="https://redirect.github.com/rust-lang/log/pull/723">rust-lang/log#723</a></li>
</ul>
</blockquote>
</details>
<details>
<summary>Commits</summary>
<ul>
<li><a
href="9c55760b49"><code>9c55760</code></a>
Merge pull request <a
href="https://redirect.github.com/rust-lang/log/issues/725">#725</a>
from rust-lang/cargo/0.4.30</li>
<li><a
href="d1acb0585c"><code>d1acb05</code></a>
update docs on current MSRV and note latest bump in changelog</li>
<li><a
href="50682937b0"><code>5068293</code></a>
prepare for 0.4.30 release</li>
<li><a
href="7ccd873cb5"><code>7ccd873</code></a>
Merge pull request <a
href="https://redirect.github.com/rust-lang/log/issues/724">#724</a>
from rust-lang/feat/net-to-value</li>
<li><a
href="923dfaaf00"><code>923dfaa</code></a>
fix up test cfgs</li>
<li><a
href="ecb7de8daf"><code>ecb7de8</code></a>
gate net value impls on std</li>
<li><a
href="67bb4f6d2e"><code>67bb4f6</code></a>
run fmt</li>
<li><a
href="25f49fe3d3"><code>25f49fe</code></a>
rework net type capturing</li>
<li><a
href="7087dcb95c"><code>7087dcb</code></a>
feat: impl ToValue for core::net types</li>
<li><a
href="67bc7e32c6"><code>67bc7e3</code></a>
Merge pull request <a
href="https://redirect.github.com/rust-lang/log/issues/723">#723</a>
from woodruffw-forks/ww/ci</li>
<li>Additional commits viewable in <a
href="https://github.com/rust-lang/log/compare/0.4.29...0.4.30">compare
view</a></li>
</ul>
</details>
<br />

Updates `serde_json` from 1.0.149 to 1.0.150
<details>
<summary>Release notes</summary>
<p><em>Sourced from <a
href="https://github.com/serde-rs/json/releases">serde_json's
releases</a>.</em></p>
<blockquote>
<h2>v1.0.150</h2>
<ul>
<li>Reject non-string enum object keys (<a
href="https://redirect.github.com/serde-rs/json/issues/1324">#1324</a>,
thanks <a
href="https://github.com/puneetdixit200"><code>@​puneetdixit200</code></a>)</li>
</ul>
</blockquote>
</details>
<details>
<summary>Commits</summary>
<ul>
<li><a
href="a1ae73ac6a"><code>a1ae73a</code></a>
Release 1.0.150</li>
<li><a
href="1a360b0a6c"><code>1a360b0</code></a>
Merge pull request <a
href="https://redirect.github.com/serde-rs/json/issues/1324">#1324</a>
from puneetdixit200/reject-non-string-enum-keys</li>
<li><a
href="2037b634f9"><code>2037b63</code></a>
Reject non-string enum object keys</li>
<li><a
href="5d30df60e9"><code>5d30df6</code></a>
Resolve manual_assert_eq pedantic clippy lint</li>
<li><a
href="dc8003a88e"><code>dc8003a</code></a>
Raise required compiler for preserve_order feature to 1.85</li>
<li><a
href="a42fa980f8"><code>a42fa98</code></a>
Unpin CI miri toolchain</li>
<li><a
href="684a60eba1"><code>684a60e</code></a>
Pin CI miri to nightly-2026-02-11</li>
<li><a
href="7c7da3302b"><code>7c7da33</code></a>
Raise required compiler to Rust 1.71</li>
<li><a
href="acf4850e29"><code>acf4850</code></a>
Simplify Number::is_f64</li>
<li><a
href="6b8ceab565"><code>6b8ceab</code></a>
Resolve unnecessary_map_or clippy lint</li>
<li>Additional commits viewable in <a
href="https://github.com/serde-rs/json/compare/v1.0.149...v1.0.150">compare
view</a></li>
</ul>
</details>
<br />

Updates `http` from 1.4.0 to 1.4.1
<details>
<summary>Release notes</summary>
<p><em>Sourced from <a
href="https://github.com/hyperium/http/releases">http's
releases</a>.</em></p>
<blockquote>
<h2>v1.4.1</h2>
<h2>tl;dr</h2>
<ul>
<li>Fix <code>PathAndQuery::from_static()</code> and
<code>from_shared()</code> to reject inputs that do not start with
<code>/</code>.</li>
<li>Fix <code>Extend</code> for <code>HeaderMap</code> to clamp max size
hint and not overflow.</li>
<li>Fix <code>header::IntoIter</code> that could use-after-free if the
generic value type could panic on drop.</li>
<li>Fix <code>header::{IterMut, ValuesIterMut}</code> to not violate
stacked borrows.</li>
</ul>
<h2>What's Changed</h2>
<ul>
<li>chore(header): fix clippy::assign_op_pattern by <a
href="https://github.com/rxc-amzn"><code>@​rxc-amzn</code></a> in <a
href="https://redirect.github.com/hyperium/http/pull/806">hyperium/http#806</a></li>
<li>ci: pin itoa in msrv job by <a
href="https://github.com/seanmonstar"><code>@​seanmonstar</code></a> in
<a
href="https://redirect.github.com/hyperium/http/pull/813">hyperium/http#813</a></li>
<li>Remove unnecessary explicit lifetimes by <a
href="https://github.com/jplatte"><code>@​jplatte</code></a> in <a
href="https://redirect.github.com/hyperium/http/pull/815">hyperium/http#815</a></li>
<li>chore(ci): update to actions/checkout@v6 by <a
href="https://github.com/tottoto"><code>@​tottoto</code></a> in <a
href="https://redirect.github.com/hyperium/http/pull/819">hyperium/http#819</a></li>
<li>tests: update to rand 0.10 by <a
href="https://github.com/tottoto"><code>@​tottoto</code></a> in <a
href="https://redirect.github.com/hyperium/http/pull/818">hyperium/http#818</a></li>
<li>refactor: Remove usage of float instruction by <a
href="https://github.com/AurelienFT"><code>@​AurelienFT</code></a> in <a
href="https://redirect.github.com/hyperium/http/pull/823">hyperium/http#823</a></li>
<li>refactor(uri): consolidate PathAndQuery::from_shared and from_static
by <a
href="https://github.com/seanmonstar"><code>@​seanmonstar</code></a> in
<a
href="https://redirect.github.com/hyperium/http/pull/825">hyperium/http#825</a></li>
<li>fix(uri): reject Path::from_shared/from_static if doesn't start with
slash by <a
href="https://github.com/seanmonstar"><code>@​seanmonstar</code></a> in
<a
href="https://redirect.github.com/hyperium/http/pull/826">hyperium/http#826</a></li>
<li>Rephrase comment by <a
href="https://github.com/daalfox"><code>@​daalfox</code></a> in <a
href="https://redirect.github.com/hyperium/http/pull/827">hyperium/http#827</a></li>
<li>Fix typo in request builder docs by <a
href="https://github.com/vleksis"><code>@​vleksis</code></a> in <a
href="https://redirect.github.com/hyperium/http/pull/831">hyperium/http#831</a></li>
<li>fix: clamp Extend size hint so HeaderMap reserve cannot overflow by
<a href="https://github.com/SAY-5"><code>@​SAY-5</code></a> in <a
href="https://redirect.github.com/hyperium/http/pull/833">hyperium/http#833</a></li>
<li>fix(headers): fix stacked borrows for IterMut/ValuesIterMut by <a
href="https://github.com/seanmonstar"><code>@​seanmonstar</code></a> in
<a
href="https://redirect.github.com/hyperium/http/pull/837">hyperium/http#837</a></li>
<li>fix(header): use a set_len guard in IntoIter drop by <a
href="https://github.com/seanmonstar"><code>@​seanmonstar</code></a> in
<a
href="https://redirect.github.com/hyperium/http/pull/838">hyperium/http#838</a></li>
</ul>
<h2>New Contributors</h2>
<ul>
<li><a href="https://github.com/rxc-amzn"><code>@​rxc-amzn</code></a>
made their first contribution in <a
href="https://redirect.github.com/hyperium/http/pull/806">hyperium/http#806</a></li>
<li><a
href="https://github.com/AurelienFT"><code>@​AurelienFT</code></a> made
their first contribution in <a
href="https://redirect.github.com/hyperium/http/pull/823">hyperium/http#823</a></li>
<li><a href="https://github.com/daalfox"><code>@​daalfox</code></a> made
their first contribution in <a
href="https://redirect.github.com/hyperium/http/pull/827">hyperium/http#827</a></li>
<li><a href="https://github.com/vleksis"><code>@​vleksis</code></a> made
their first contribution in <a
href="https://redirect.github.com/hyperium/http/pull/831">hyperium/http#831</a></li>
<li><a href="https://github.com/SAY-5"><code>@​SAY-5</code></a> made
their first contribution in <a
href="https://redirect.github.com/hyperium/http/pull/833">hyperium/http#833</a></li>
</ul>
<p><strong>Full Changelog</strong>: <a
href="https://github.com/hyperium/http/compare/v1.4.0...v1.4.1">https://github.com/hyperium/http/compare/v1.4.0...v1.4.1</a></p>
</blockquote>
</details>
<details>
<summary>Changelog</summary>
<p><em>Sourced from <a
href="https://github.com/hyperium/http/blob/master/CHANGELOG.md">http's
changelog</a>.</em></p>
<blockquote>
<h1>1.4.1 (May 25, 2026)</h1>
<ul>
<li>Fix <code>PathAndQuery::from_static()</code> and
<code>from_shared()</code> to reject inputs that do not start with
<code>/</code>.</li>
<li>Fix <code>Extend</code> for <code>HeaderMap</code> to clamp max size
hint and not overflow.</li>
<li>Fix <code>header::IntoIter</code> that could use-after-free if the
generic value type could panic on drop.</li>
<li>Fix <code>header::{IterMut, ValuesIterMut}</code> to not violate
stacked borrows.</li>
</ul>
</blockquote>
</details>
<details>
<summary>Commits</summary>
<ul>
<li><a
href="a24c968ba3"><code>a24c968</code></a>
v1.4.1</li>
<li><a
href="bc3b0441be"><code>bc3b044</code></a>
fix(header): use a set_len guard in IntoIter drop (<a
href="https://redirect.github.com/hyperium/http/issues/838">#838</a>)</li>
<li><a
href="1b968dc519"><code>1b968dc</code></a>
fix(header): fix stacked borrows for IterMut/ValuesIterMut (<a
href="https://redirect.github.com/hyperium/http/issues/837">#837</a>)</li>
<li><a
href="6e2dd42a15"><code>6e2dd42</code></a>
fix: clamp Extend size hint so HeaderMap reserve cannot overflow (<a
href="https://redirect.github.com/hyperium/http/issues/833">#833</a>)</li>
<li><a
href="68e0abb052"><code>68e0abb</code></a>
docs: fix typo in request builder docs (<a
href="https://redirect.github.com/hyperium/http/issues/831">#831</a>)</li>
<li><a
href="29dd307b3e"><code>29dd307</code></a>
docs(extensions): rephrase internal comment (<a
href="https://redirect.github.com/hyperium/http/issues/827">#827</a>)</li>
<li><a
href="ae48fb55b0"><code>ae48fb5</code></a>
fix(uri): reject Path::from_shared/from_static if doesn't start with
slash (#...</li>
<li><a
href="1ad200ec4c"><code>1ad200e</code></a>
refactor(uri): consolidate PathAndQuery::from_shared and from_static (<a
href="https://redirect.github.com/hyperium/http/issues/825">#825</a>)</li>
<li><a
href="d59d939f92"><code>d59d939</code></a>
refactor: Remove usage of float instruction (<a
href="https://redirect.github.com/hyperium/http/issues/823">#823</a>)</li>
<li><a
href="ed680c4d90"><code>ed680c4</code></a>
tests: update to rand 0.10 (<a
href="https://redirect.github.com/hyperium/http/issues/818">#818</a>)</li>
<li>Additional commits viewable in <a
href="https://github.com/hyperium/http/compare/v1.4.0...v1.4.1">compare
view</a></li>
</ul>
</details>
<br />

Updates `uuid` from 1.23.1 to 1.23.2
<details>
<summary>Release notes</summary>
<p><em>Sourced from <a
href="https://github.com/uuid-rs/uuid/releases">uuid's
releases</a>.</em></p>
<blockquote>
<h2>v1.23.2</h2>
<h2>What's Changed</h2>
<ul>
<li>Improve error messages for ambiguous formats by <a
href="https://github.com/KodrAus"><code>@​KodrAus</code></a> in <a
href="https://redirect.github.com/uuid-rs/uuid/pull/882">uuid-rs/uuid#882</a></li>
<li>Prepare for 1.23.2 release by <a
href="https://github.com/KodrAus"><code>@​KodrAus</code></a> in <a
href="https://redirect.github.com/uuid-rs/uuid/pull/883">uuid-rs/uuid#883</a></li>
</ul>
<p><strong>Full Changelog</strong>: <a
href="https://github.com/uuid-rs/uuid/compare/v1.23.1...v1.23.2">https://github.com/uuid-rs/uuid/compare/v1.23.1...v1.23.2</a></p>
</blockquote>
</details>
<details>
<summary>Commits</summary>
<ul>
<li><a
href="d11965705f"><code>d119657</code></a>
Merge pull request <a
href="https://redirect.github.com/uuid-rs/uuid/issues/883">#883</a> from
uuid-rs/cargo/v1.23.2</li>
<li><a
href="0651cfcb89"><code>0651cfc</code></a>
prepare for 1.23.2 release</li>
<li><a
href="e8dea0c1fd"><code>e8dea0c</code></a>
Merge pull request <a
href="https://redirect.github.com/uuid-rs/uuid/issues/882">#882</a> from
uuid-rs/fix/error-msgs</li>
<li><a
href="bdc429a8c7"><code>bdc429a</code></a>
fix up serde messages</li>
<li><a
href="d4342e400d"><code>d4342e4</code></a>
make indexes 0 based and fix up more error messages</li>
<li><a
href="4ad479fc20"><code>4ad479f</code></a>
work on more accurate parser errors</li>
<li>See full diff in <a
href="https://github.com/uuid-rs/uuid/compare/v1.23.1...v1.23.2">compare
view</a></li>
</ul>
</details>
<br />

Updates `aws-smithy-runtime` from 1.11.1 to 1.11.3
<details>
<summary>Commits</summary>
<ul>
<li>See full diff in <a
href="https://github.com/smithy-lang/smithy-rs/commits">compare
view</a></li>
</ul>
</details>
<br />


Dependabot will resolve any conflicts with this PR as long as you don't
alter it yourself. You can also trigger a rebase manually by commenting
`@dependabot rebase`.

[//]: # (dependabot-automerge-start)
[//]: # (dependabot-automerge-end)

---

<details>
<summary>Dependabot commands and options</summary>
<br />

You can trigger Dependabot actions by commenting on this PR:
- `@dependabot rebase` will rebase this PR
- `@dependabot recreate` will recreate this PR, overwriting any edits
that have been made to it
- `@dependabot show <dependency name> ignore conditions` will show all
of the ignore conditions of the specified dependency
- `@dependabot ignore <dependency name> major version` will close this
group update PR and stop Dependabot creating any more for the specific
dependency's major version (unless you unignore this specific
dependency's major version or upgrade to it yourself)
- `@dependabot ignore <dependency name> minor version` will close this
group update PR and stop Dependabot creating any more for the specific
dependency's minor version (unless you unignore this specific
dependency's minor version or upgrade to it yourself)
- `@dependabot ignore <dependency name>` will close this group update PR
and stop Dependabot creating any more for the specific dependency
(unless you unignore this specific dependency or upgrade to it yourself)
- `@dependabot unignore <dependency name>` will remove all of the ignore
conditions of the specified dependency
- `@dependabot unignore <dependency name> <ignore condition>` will
remove the ignore condition of the specified dependency and ignore
conditions


</details>

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2026-05-30 19:22:50 -07:00
Xuanwo
5638907fa5 chore: update Lance to v7.2.0-beta.1 (#3461)
Update the Rust workspace Lance git dependencies and Java lance-core
dependency to v7.2.0-beta.1.

This keeps LanceDB aligned with the latest Lance beta release and
refreshes the Cargo lockfile for the new Lance dependency graph.
2026-05-30 00:18:22 +08:00
Heng Ge
048f52c2aa feat(table): route merge_insert through the MemWAL LSM write path (#3354)
## Summary

When an `LsmWriteSpec` is installed on a table (#3396), `merge_insert`
upsert
calls are dispatched through Lance's MemWAL `ShardWriter` (LSM-style
append)
instead of the standard merge path.

- **`use_lsm_write`** — a `merge_insert` builder option, default `true`;
set it
  `false` to use the standard path for a call even when a spec is set.
- **`assume_pre_sharded`** — a `merge_insert` builder option, default
`false`;
  skips the per-row shard check and routes by the first row only.
- **`close_lsm_writers`** — drains and closes the table's cached MemWAL
shard
  writers.
- The `merge_insert` **`on`** columns default to, and are validated
against,
  the table's unenforced primary key.
- Shard writers are cached alongside the dataset (in
  `DatasetConsistencyWrapper`) and reused for the session.
- `MergeResult` gains **`num_rows`** — on the LSM path the insert/update
  breakdown is unknown until compaction, so only the total is reported.

Routing covers all three sharding strategies — bucket (murmur3,
Iceberg-compatible), identity, and unsharded. Each `merge_insert` call
targets
a single shard; the whole input is collected and validated before a
single
atomic `ShardWriter::put`, so a validation failure leaves the MemWAL
untouched.

Bindings: Python (`merge_insert(...).use_lsm_write(...)` /
`.assume_pre_sharded(...)`, `Table.close_lsm_writers`) and TypeScript
(`mergeInsert(...).useLsmWrite(...)` / `.assumePreSharded(...)`,
`Table.closeLsmWriters`).

## Context

Reconstructed from the original #3354 branch onto current `main`: the
branch
predated the #3394 (unenforced primary key) / #3396 (`LsmWriteSpec`)
split and
has been rebuilt on that merged foundation. Depends on Lance
`v7.0.0-beta.13`.

The MemWAL read path (reading un-flushed shard data back into queries)
and
remote (LanceDB Cloud) LSM support are follow-ups.

---------

Co-authored-by: Jack Ye <yezhaoqin@gmail.com>
2026-05-29 08:48:11 -07:00
Will Jones
458dcabbd2 chore: upgrade Rust toolchain to 1.95.0 (#3390)
Bumps the pinned toolchain in `rust-toolchain.toml` from 1.94.0 to
1.95.0.

Fixes new lints surfaced by clippy on 1.95.0:

- `manual_checked_ops` — fragment size mean in `table.rs` uses
`checked_div`
- `explicit_counter_loop` — shuffle test loop in `shuffle.rs`

No rustc warnings were introduced.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-29 08:21:45 -07:00