Compare commits

...

10 Commits

Author SHA1 Message Date
Weston Pace
f6c9d31f98 feat: add polars dataframe integration (#3584)
This PR is part cleanup, part feature, part example.

It removes `IntoArrow` and `IntoArrowStream`. There was only one
redundant call site between the two. Once we moved everything to
`Scannable` these traits no longer serve any purpose.

It adds a `Scannable` impl for a polars DataFrame. We used to have this
at one point for `IntoArrow` so this is more like a regression fix than
anything.

It adds an example (and unit test) which ensures we can ingest from a
Polars DataFrame and export to one. LazyFrame support would be a
follow-up (though a pretty straightforward one) but we've never had
proper LazyFrame support before.
2026-06-30 08:28:41 -07:00
Dan Tasse
a8f1c5a69f feat: add skill to work with branches better (#3596)
Agents seemed to have trouble finding the right calls to work with
branches (create, list, delete) and passing the right params to get it
to work. We probably don't need a big skill to get it on the right track
but a little nudge seems helpful. Doing a couple simple tasks, it saved
about half the time and tokens, so feels worthwhile. Created with the
Claude skills creator, hence the "skill.md in a bare folder"
organization - happy to move it if that's not the standard anymore.

```
Benchmark results (3 evals, with-skill vs baseline):

┌────────────────┬────────────┬────────────────────┐
│     Metric     │ With skill │   Without skill    │
├────────────────┼────────────┼────────────────────┤
│ Pass rate      │ 3/3 (100%) │ 3/3 (100%)         │
├────────────────┼────────────┼────────────────────┤
│ Avg time       │ 51s        │ 142s (2.8× slower) │
├────────────────┼────────────┼────────────────────┤
│ Avg tokens     │ 19,305     │ 36,513 (47% more)  │
├────────────────┼────────────┼────────────────────┤
│ Avg tool calls │ 5.7        │ 26 (4.5× more)     │
└────────────────┴────────────┴────────────────────┘
```
2026-06-30 09:27:21 -04:00
Jack Ye
10fecdf051 feat(node): expose OAuth connection config (#3587)
Expose the merged Rust OAuth header provider through the Node/TypeScript
connection path.

Includes:
- Native OAuthConfig conversion for napi-rs
- ConnectionOptions.oauthConfig plumbing
- Public TypeScript OAuthConfig and OAuthFlowType exports
- Generated TypeScript API docs for the new config surface
- input-validation and debug-redaction coverage in the Rust binding
layer

Local validation: cargo fmt --all; git diff --check.
2026-06-29 16:55:45 -07:00
Raphael Malikian
c9ae93a7fa fix: add missing stacklevel=2 to warnings.warn() calls (Fixes #3589) (#3590)
Fixes #3589

## Problem
Multiple `warnings.warn()` calls across the Python client are missing
the `stacklevel=2` parameter. This causes warning messages to point to
lancedb internal code instead of the user's code that triggered the
warning, making debugging difficult.

## Solution
Add `stacklevel=2` to 7 `warnings.warn()` calls across 4 files:

| File | Warnings Fixed |
|------|---------------|
| `remote/db.py` | `request_thread_pool`, `connection_timeout`,
`read_timeout` deprecation warnings |
| `remote/table.py` | `cleanup_old_versions`, `compact_files`,
`optimize` no-op warnings |
| `table.py` | `data_storage_version`, `enable_v2_manifest_paths`,
`retrain` deprecation warnings |
| `embeddings/colpali.py` | `use_token_pooling` deprecation warning |

## Verification
- All 4 modified files pass `ast.parse()` syntax check
- Only `stacklevel=2` added — no other changes

## Changelog

| Date | Change | Author |
|------|--------|--------|
| 2026-06-27 | Add missing stacklevel=2 to warnings.warn() calls |
rtmalikian |

### Files Changed
- `python/python/lancedb/remote/db.py` — Add stacklevel=2 to 3
deprecation warnings
- `python/python/lancedb/remote/table.py` — Add stacklevel=2 to 3 no-op
warnings
- `python/python/lancedb/table.py` — Add stacklevel=2 to 3 deprecation
warnings
- `python/python/lancedb/embeddings/colpali.py` — Add stacklevel=2 to 1
deprecation warning

### Verification
- Syntax check passed on all modified files

---

**About the Author:** Raphael Malikian — Clinical AI Solutions
Architect. I specialise in building and fixing AI/ML systems for
healthcare, including vector databases, RAG pipelines, and clinical NLP.
If you need help with your project or think I can add value to your
organisation, feel free to reach out — I'd love to connect.

📧 rtmalikian@gmail.com
🔗 GitHub: https://github.com/rtmalikian
🔗 LinkedIn:
http://www.linkedin.com/in/raphael-t-malikian-mbbs-bsc-hons-71075436a

---

**Disclosure:** This code was developed with assistance from
DeepSeek-V4-Pro (DeepSeek) via Hermes Agent (Nous Research). All changes
were reviewed, tested against the actual codebase, and verified for
correctness.

Signed-off-by: rtmalikian <rtmalikian@gmail.com>
2026-06-29 16:36:44 -07:00
Raphael Malikian
05756f0bbf fix(python): raise clear error when permutation API is used on remote tables (Fixes #2934) (#3591)
Fixes #2934

## Problem
Passing a `RemoteTable` to `permutation_builder()` raises a cryptic
`AttributeError`:
```
AttributeError: 'RemoteTable' object has no attribute '_inner'
```
This leaves users confused about what went wrong and why.

## Root Cause
`PermutationBuilder.__init__()` calls `async_permutation_builder(table)`
which accesses `table._inner` — the underlying Rust Lance table object.
`RemoteTable` connects to LanceDB Cloud/Enterprise and does not have a
local `_inner` attribute, making permutations fundamentally unsupported
on remote tables.

## Solution
Added an early check in `PermutationBuilder.__init__()` that verifies
the table has `_inner` before calling the Rust function, raising a clear
`TypeError` with an explanation of why permutations don't work on remote
tables.

## Verification
- Syntax validated with `ast.parse()`
- Structural verification: single call site (`permutation_builder()`),
guard placed before Rust FFI call
- Error message tested with mock: `MockRemoteTable()` correctly triggers
`TypeError`

## Changelog

| Date | Change | Author |
|------|--------|--------|
| 2026-06-28 | Added remote table guard in PermutationBuilder.__init__ |
rtmalikian |

### Files Changed
- python/python/lancedb/permutation.py — Added `hasattr(table,
"_inner")` check with clear error

---

**About the Author:** Raphael Malikian — Clinical AI Solutions
Architect. I specialise in building and fixing AI/ML systems for
healthcare, including vector databases, RAG pipelines, and clinical NLP.
If you need help with your project or think I can add value to your
organisation, feel free to reach out — I'd love to connect.

📧 rtmalikian@gmail.com
🔗 GitHub: https://github.com/rtmalikian
🔗 LinkedIn:
http://www.linkedin.com/in/raphael-t-malikian-mbbs-bsc-hons-71075436a

---

**Disclosure:** This code was developed with assistance from
deepseek-v4-pro (DeepSeek) via Hermes Agent (Nous Research). All changes
were reviewed, tested against the actual codebase, and verified for
correctness.

Signed-off-by: rtmalikian <rtmalikian@gmail.com>
2026-06-29 16:36:01 -07:00
LanceDB Robot
2a0945443e chore: update lance dependency to v9.0.0-beta.10 (#3594)
Updates Lance Rust workspace dependencies and Java lance-core to
v9.0.0-beta.10.

No compatibility code changes were required; clippy and rustfmt passed
after installing the missing runner components.

Lance tag:
https://github.com/lance-format/lance/releases/tag/v9.0.0-beta.10
2026-06-29 15:28:47 -05:00
Jack Ye
39e819b6a7 feat(python): expose OAuth connection config (#3586)
Expose the merged Rust OAuth header provider through the Python async
connection path.

Includes:
- Python OAuthConfig and OAuthFlowType public config objects
- PyO3 conversion into the Rust OAuthConfig
- connect_async(oauth_config=...) plumbing
- repr redaction coverage for client_secret

Local validation: cargo fmt --all; ruff format/check on touched Python
files.
2026-06-29 12:36:35 -07:00
dependabot[bot]
70126943ff chore(deps): bump the rust-minor-patch group across 1 directory with 6 updates (#3588)
Bumps the rust-minor-patch group with 6 updates in the / directory:

| Package | From | To |
| --- | --- | --- |
| [env_logger](https://github.com/rust-cli/env_logger) | `0.11.10` |
`0.11.11` |
| [log](https://github.com/rust-lang/log) | `0.4.32` | `0.4.33` |
| [uuid](https://github.com/uuid-rs/uuid) | `1.23.3` | `1.23.4` |
| [anyhow](https://github.com/dtolnay/anyhow) | `1.0.102` | `1.0.103` |
| [napi](https://github.com/napi-rs/napi-rs) | `3.9.3` | `3.9.4` |
| [napi-derive](https://github.com/napi-rs/napi-rs) | `3.5.6` | `3.5.7`
|


Updates `env_logger` from 0.11.10 to 0.11.11
<details>
<summary>Release notes</summary>
<p><em>Sourced from <a
href="https://github.com/rust-cli/env_logger/releases">env_logger's
releases</a>.</em></p>
<blockquote>
<h2>v0.11.11</h2>
<h2>[0.11.11] - 2026-06-25</h2>
<h3>Internal</h3>
<ul>
<li>Updated <code>env_filter</code></li>
</ul>
</blockquote>
</details>
<details>
<summary>Changelog</summary>
<p><em>Sourced from <a
href="https://github.com/rust-cli/env_logger/blob/main/CHANGELOG.md">env_logger's
changelog</a>.</em></p>
<blockquote>
<h2>[0.11.11] - 2026-06-25</h2>
<h3>Internal</h3>
<ul>
<li>Updated <code>env_filter</code></li>
</ul>
</blockquote>
</details>
<details>
<summary>Commits</summary>
<ul>
<li><a
href="b4d3f2b8dd"><code>b4d3f2b</code></a>
chore: Release</li>
<li><a
href="cc2b2efcd7"><code>cc2b2ef</code></a>
chore: Release</li>
<li><a
href="69e27d1e82"><code>69e27d1</code></a>
docs: Update changelog</li>
<li><a
href="166880db07"><code>166880d</code></a>
Merge pull request <a
href="https://redirect.github.com/rust-cli/env_logger/issues/411">#411</a>
from epage/parse</li>
<li><a
href="0a580d06e7"><code>0a580d0</code></a>
fix(filter): Remove 'parse' on no_std</li>
<li><a
href="78d8ef116e"><code>78d8ef1</code></a>
Merge pull request <a
href="https://redirect.github.com/rust-cli/env_logger/issues/404">#404</a>
from cagatay-y/feature/filter-no_std</li>
<li><a
href="132fe86c8c"><code>132fe86</code></a>
feat(filter): Add support for no_std environments</li>
<li><a
href="4feafa4c3c"><code>4feafa4</code></a>
refactor(env_filter): Fix unreachable pub warning</li>
<li><a
href="92f8d8d083"><code>92f8d8d</code></a>
Merge pull request <a
href="https://redirect.github.com/rust-cli/env_logger/issues/410">#410</a>
from rust-cli/renovate/crate-ci-typos-1.x</li>
<li><a
href="4e57784e0a"><code>4e57784</code></a>
chore(deps): Update pre-commit hook crate-ci/typos to v1.47.0</li>
<li>Additional commits viewable in <a
href="https://github.com/rust-cli/env_logger/compare/v0.11.10...v0.11.11">compare
view</a></li>
</ul>
</details>
<br />

Updates `log` from 0.4.32 to 0.4.33
<details>
<summary>Changelog</summary>
<p><em>Sourced from <a
href="https://github.com/rust-lang/log/blob/master/CHANGELOG.md">log's
changelog</a>.</em></p>
<blockquote>
<h2>[0.4.33] - 2026-06-20</h2>
<h2>What's Changed</h2>
<ul>
<li>Fixed key comparison by <a
href="https://github.com/matteo-zeggiotti-ok"><code>@​matteo-zeggiotti-ok</code></a>
in <a
href="https://redirect.github.com/rust-lang/log/pull/732">rust-lang/log#732</a></li>
</ul>
<h2>New Contributors</h2>
<ul>
<li><a
href="https://github.com/matteo-zeggiotti-ok"><code>@​matteo-zeggiotti-ok</code></a>
made their first contribution in <a
href="https://redirect.github.com/rust-lang/log/pull/732">rust-lang/log#732</a></li>
</ul>
<p><strong>Full Changelog</strong>: <a
href="https://github.com/rust-lang/log/compare/0.4.32...0.4.33">https://github.com/rust-lang/log/compare/0.4.32...0.4.33</a></p>
</blockquote>
</details>
<details>
<summary>Commits</summary>
<ul>
<li><a
href="f405739f3a"><code>f405739</code></a>
Merge pull request <a
href="https://redirect.github.com/rust-lang/log/issues/734">#734</a>
from rust-lang/cargo/0.4.33</li>
<li><a
href="6a24abf083"><code>6a24abf</code></a>
prepare for 0.4.33 release</li>
<li><a
href="87e062162e"><code>87e0621</code></a>
Merge pull request <a
href="https://redirect.github.com/rust-lang/log/issues/732">#732</a>
from matteo-zeggiotti-ok/fix-key-comparison</li>
<li><a
href="a9b57119a6"><code>a9b5711</code></a>
Review: fallback to the &amp;str hash</li>
<li><a
href="cc89cc6e41"><code>cc89cc6</code></a>
Review: fixed other comparisons</li>
<li><a
href="920e7dc281"><code>920e7dc</code></a>
Review: fixed comparison on <code>MaybeStaticStr</code></li>
<li><a
href="0d71d3c685"><code>0d71d3c</code></a>
Fixed key comparison</li>
<li>See full diff in <a
href="https://github.com/rust-lang/log/compare/0.4.32...0.4.33">compare
view</a></li>
</ul>
</details>
<br />

Updates `uuid` from 1.23.3 to 1.23.4
<details>
<summary>Release notes</summary>
<p><em>Sourced from <a
href="https://github.com/uuid-rs/uuid/releases">uuid's
releases</a>.</em></p>
<blockquote>
<h2>v1.23.4</h2>
<h2>What's Changed</h2>
<ul>
<li>Fix up name of fuzz script in readme by <a
href="https://github.com/KodrAus"><code>@​KodrAus</code></a> in <a
href="https://redirect.github.com/uuid-rs/uuid/pull/888">uuid-rs/uuid#888</a></li>
<li>document fixes by <a
href="https://github.com/frostyplanet"><code>@​frostyplanet</code></a>
in <a
href="https://redirect.github.com/uuid-rs/uuid/pull/889">uuid-rs/uuid#889</a></li>
<li>Prepare for 1.23.4 release by <a
href="https://github.com/KodrAus"><code>@​KodrAus</code></a> in <a
href="https://redirect.github.com/uuid-rs/uuid/pull/890">uuid-rs/uuid#890</a></li>
</ul>
<h2>New Contributors</h2>
<ul>
<li><a
href="https://github.com/frostyplanet"><code>@​frostyplanet</code></a>
made their first contribution in <a
href="https://redirect.github.com/uuid-rs/uuid/pull/889">uuid-rs/uuid#889</a></li>
</ul>
<p><strong>Full Changelog</strong>: <a
href="https://github.com/uuid-rs/uuid/compare/v1.23.3...v1.23.4">https://github.com/uuid-rs/uuid/compare/v1.23.3...v1.23.4</a></p>
</blockquote>
</details>
<details>
<summary>Commits</summary>
<ul>
<li><a
href="3296d64a19"><code>3296d64</code></a>
Merge pull request <a
href="https://redirect.github.com/uuid-rs/uuid/issues/890">#890</a> from
uuid-rs/cargo/v1.23.4</li>
<li><a
href="cba53d0da2"><code>cba53d0</code></a>
prepare for 1.23.4 release</li>
<li><a
href="e347af48aa"><code>e347af4</code></a>
Merge pull request <a
href="https://redirect.github.com/uuid-rs/uuid/issues/889">#889</a> from
frostyplanet/main</li>
<li><a
href="e9bf55c222"><code>e9bf55c</code></a>
doc: Fix broken link warnings</li>
<li><a
href="5351af40a0"><code>5351af4</code></a>
doc: Enable feature flag label for docs.rs</li>
<li><a
href="1e6a9669e3"><code>1e6a966</code></a>
Merge pull request <a
href="https://redirect.github.com/uuid-rs/uuid/issues/888">#888</a> from
uuid-rs/KodrAus-patch-1</li>
<li><a
href="c9619f639c"><code>c9619f6</code></a>
fix up name of fuzz script in readme</li>
<li>See full diff in <a
href="https://github.com/uuid-rs/uuid/compare/v1.23.3...v1.23.4">compare
view</a></li>
</ul>
</details>
<br />

Updates `anyhow` from 1.0.102 to 1.0.103
<details>
<summary>Release notes</summary>
<p><em>Sourced from <a
href="https://github.com/dtolnay/anyhow/releases">anyhow's
releases</a>.</em></p>
<blockquote>
<h2>1.0.103</h2>
<ul>
<li>Fix Stacked Borrows violation (UB) in
<code>Error::downcast_mut</code> (<a
href="https://redirect.github.com/dtolnay/anyhow/issues/451">#451</a>,
<a
href="https://redirect.github.com/dtolnay/anyhow/issues/452">#452</a>)</li>
</ul>
</blockquote>
</details>
<details>
<summary>Commits</summary>
<ul>
<li><a
href="5bdb0e24db"><code>5bdb0e2</code></a>
Release 1.0.103</li>
<li><a
href="e621bd35dd"><code>e621bd3</code></a>
Merge pull request <a
href="https://redirect.github.com/dtolnay/anyhow/issues/452">#452</a>
from dtolnay/downcast</li>
<li><a
href="6e8c000690"><code>6e8c000</code></a>
Eliminate pointer-&gt;reference-&gt;pointer during downcast</li>
<li><a
href="67c4abd771"><code>67c4abd</code></a>
Add regression test for issue 451</li>
<li><a
href="917a169320"><code>917a169</code></a>
Update actions/upload-artifact@v6 -&gt; v7</li>
<li><a
href="d9dc3faf78"><code>d9dc3fa</code></a>
Update actions/checkout@v6 -&gt; v7</li>
<li><a
href="841522b2aa"><code>841522b</code></a>
Raise minimum tested compiler to rust 1.85</li>
<li>See full diff in <a
href="https://github.com/dtolnay/anyhow/compare/1.0.102...1.0.103">compare
view</a></li>
</ul>
</details>
<br />

Updates `napi` from 3.9.3 to 3.9.4
<details>
<summary>Release notes</summary>
<p><em>Sourced from <a
href="https://github.com/napi-rs/napi-rs/releases">napi's
releases</a>.</em></p>
<blockquote>
<h2>napi-v3.9.4</h2>
<h3>Other</h3>
<ul>
<li><em>(napi-derive)</em> outline #[napi(object)] field-error
decoration (<a
href="https://redirect.github.com/napi-rs/napi-rs/pull/3338">#3338</a>)</li>
</ul>
</blockquote>
</details>
<details>
<summary>Commits</summary>
<ul>
<li><a
href="9cc199fa34"><code>9cc199f</code></a>
chore: release (<a
href="https://redirect.github.com/napi-rs/napi-rs/issues/3345">#3345</a>)</li>
<li><a
href="b77119e711"><code>b77119e</code></a>
chore(release): publish</li>
<li><a
href="71ce9f6015"><code>71ce9f6</code></a>
chore(deps): update actions/cache action to v6 (<a
href="https://redirect.github.com/napi-rs/napi-rs/issues/3349">#3349</a>)</li>
<li><a
href="8c87f474c8"><code>8c87f47</code></a>
chore(deps): update <code>@​tybys/wasm-util</code> to 0.10.3 (<a
href="https://redirect.github.com/napi-rs/napi-rs/issues/3348">#3348</a>)</li>
<li><a
href="04e2a7655d"><code>04e2a76</code></a>
chore(deps): update cross-platform-actions/action action to v1.3.0 (<a
href="https://redirect.github.com/napi-rs/napi-rs/issues/3346">#3346</a>)</li>
<li><a
href="54ecbe4915"><code>54ecbe4</code></a>
chore(deps): update actions/checkout action to v7 (<a
href="https://redirect.github.com/napi-rs/napi-rs/issues/3340">#3340</a>)</li>
<li><a
href="3dd0c309da"><code>3dd0c30</code></a>
perf(napi-derive): outline #[napi(object)] field-error decoration (<a
href="https://redirect.github.com/napi-rs/napi-rs/issues/3338">#3338</a>)</li>
<li><a
href="81ac3d98c3"><code>81ac3d9</code></a>
build(deps): bump undici from 6.26.0 to 6.27.0 (<a
href="https://redirect.github.com/napi-rs/napi-rs/issues/3342">#3342</a>)</li>
<li>See full diff in <a
href="https://github.com/napi-rs/napi-rs/compare/napi-v3.9.3...napi-v3.9.4">compare
view</a></li>
</ul>
</details>
<br />

Updates `napi-derive` from 3.5.6 to 3.5.7
<details>
<summary>Release notes</summary>
<p><em>Sourced from <a
href="https://github.com/napi-rs/napi-rs/releases">napi-derive's
releases</a>.</em></p>
<blockquote>
<h2>napi-derive-v3.5.7</h2>
<h3>Other</h3>
<ul>
<li>updated the following local packages: napi-derive-backend</li>
</ul>
</blockquote>
</details>
<details>
<summary>Commits</summary>
<ul>
<li><a
href="9cc199fa34"><code>9cc199f</code></a>
chore: release (<a
href="https://redirect.github.com/napi-rs/napi-rs/issues/3345">#3345</a>)</li>
<li><a
href="b77119e711"><code>b77119e</code></a>
chore(release): publish</li>
<li><a
href="71ce9f6015"><code>71ce9f6</code></a>
chore(deps): update actions/cache action to v6 (<a
href="https://redirect.github.com/napi-rs/napi-rs/issues/3349">#3349</a>)</li>
<li><a
href="8c87f474c8"><code>8c87f47</code></a>
chore(deps): update <code>@​tybys/wasm-util</code> to 0.10.3 (<a
href="https://redirect.github.com/napi-rs/napi-rs/issues/3348">#3348</a>)</li>
<li><a
href="04e2a7655d"><code>04e2a76</code></a>
chore(deps): update cross-platform-actions/action action to v1.3.0 (<a
href="https://redirect.github.com/napi-rs/napi-rs/issues/3346">#3346</a>)</li>
<li><a
href="54ecbe4915"><code>54ecbe4</code></a>
chore(deps): update actions/checkout action to v7 (<a
href="https://redirect.github.com/napi-rs/napi-rs/issues/3340">#3340</a>)</li>
<li><a
href="3dd0c309da"><code>3dd0c30</code></a>
perf(napi-derive): outline #[napi(object)] field-error decoration (<a
href="https://redirect.github.com/napi-rs/napi-rs/issues/3338">#3338</a>)</li>
<li><a
href="81ac3d98c3"><code>81ac3d9</code></a>
build(deps): bump undici from 6.26.0 to 6.27.0 (<a
href="https://redirect.github.com/napi-rs/napi-rs/issues/3342">#3342</a>)</li>
<li><a
href="ee58383da4"><code>ee58383</code></a>
chore(napi): release v3.9.3 (<a
href="https://redirect.github.com/napi-rs/napi-rs/issues/3335">#3335</a>)</li>
<li><a
href="c78727667b"><code>c787276</code></a>
fix(napi): sync referred flag when creating a weak ThreadsafeFunction
(<a
href="https://redirect.github.com/napi-rs/napi-rs/issues/3337">#3337</a>)</li>
<li>Additional commits viewable in <a
href="https://github.com/napi-rs/napi-rs/compare/napi-derive-v3.5.6...napi-derive-v3.5.7">compare
view</a></li>
</ul>
</details>
<br />

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2026-06-29 07:44:46 -07:00
Lance Release
e01777070d Bump version: 0.31.0-beta.3 → 0.31.0-beta.4 2026-06-29 11:12:18 +00:00
Lance Release
3878adc6dc Bump version: 0.34.0-beta.3 → 0.34.0-beta.4 2026-06-29 11:11:05 +00:00
43 changed files with 1045 additions and 157 deletions

View File

@@ -0,0 +1,137 @@
---
name: lancedb-branch-ops
description: Branch management for LanceDB tables via the REST API. Use this skill whenever someone wants to create, delete, list, or switch branches on a LanceDB table — or needs to make sure a write (metadata update, index build, etc.) lands on a specific branch instead of main. Invoke it even without the word "branch" if context makes clear they want an experimental copy of a table, want to isolate changes, or want to confirm a mutation didn't touch main. Covers: branches/list, branches/create, branches/delete, and passing "branch" in describe/update_field_metadata/create_index to target a non-main version.
---
## Goal
Manage branches on a LanceDB table: list what exists, create new ones, delete stale ones, and direct read/write operations at a specific branch without touching main.
## Step 0: Establish the connection
Use the `lancedb-connect` skill to resolve the base URL and auth headers (`x-api-key`, `x-lancedb-database`). Skip this only if the connection is already known from the current conversation.
All examples below use `{base_url}` — substitute the resolved endpoint and include the auth headers on every request.
## The branch model (important)
LanceDB branches are named snapshots that diverge from the table's current state at creation time. There is **no checkout command** — you never switch the whole table to a branch. Instead, you **pass `"branch": "<name>"` in the request body** of any operation to target that branch. Omitting the key (or sending an empty body) always targets main.
`branches/list` returns only non-main branches. Main always exists and is not listed.
## List branches
```http
POST {base_url}/v1/table/{table_id}/branches/list
Content-Type: application/json
{}
```
Response:
```json
{
"branches": {
"experiment-reindex": {"parentVersion": 1, "createAt": 1782506085, "manifestSize": 1029}
}
}
```
If `branches` is `{}`, the table has no branches besides main.
## Create a branch
```http
POST {base_url}/v1/table/{table_id}/branches/create
Content-Type: application/json
{"name": "experiment-reindex"}
```
HTTP 200 with `{}` body = success. The branch is created off the table's current state on main.
Verify by calling `branches/list` and confirming the new name appears.
## Delete a branch
```http
POST {base_url}/v1/table/{table_id}/branches/delete
Content-Type: application/json
{"name": "stale-2024"}
```
HTTP 200 with `{}` body = success. Only the branch pointer is removed — main and all row data remain intact.
Verify by calling `branches/list` (name gone) and `describe` with no branch param (main still responds).
## Operate on a specific branch
Pass `"branch": "<name>"` in the body of any operation to scope it to that branch:
**Read schema on a branch:**
```http
POST {base_url}/v1/table/{table_id}/describe
Content-Type: application/json
{"branch": "wip-branch"}
```
**Write metadata to a branch (not main):**
```http
POST {base_url}/v1/table/{table_id}/update_field_metadata
Content-Type: application/json
{
"branch": "wip-branch",
"updates": [
{
"path": "category",
"metadata": {"lancedb:description": "Product category label."},
"replace": false
}
]
}
```
**Build an index on a branch:**
```http
POST {base_url}/v1/table/{table_id}/create_index
Content-Type: application/json
{
"branch": "wip-branch",
"column": "category",
"index_type": "BTREE"
}
```
## Verifying isolation
After writing to a branch, always confirm the change did NOT land on main:
```bash
# Should show the new metadata
curl -s -X POST {base_url}/v1/table/{table_id}/describe \
-H "x-api-key: <key>" -H "x-lancedb-database: <db>" \
-H "content-type: application/json" \
-d '{"branch": "wip-branch"}'
# Should NOT show the new metadata
curl -s -X POST {base_url}/v1/table/{table_id}/describe \
-H "x-api-key: <key>" -H "x-lancedb-database: <db>" \
-H "content-type: application/json" \
-d '{}'
```
## Quick reference
| Goal | Endpoint | Body |
|------|----------|------|
| List all branches | `branches/list` | `{}` |
| Create a branch | `branches/create` | `{"name": "..."}` |
| Delete a branch | `branches/delete` | `{"name": "..."}` |
| Read schema on branch | `describe` | `{"branch": "..."}` |
| Write metadata on branch | `update_field_metadata` | `{"branch": "...", "updates": [...]}` |
| Build index on branch | `create_index` | `{"branch": "...", "column": ..., "index_type": ...}` |
| Target main (default) | any endpoint | omit `"branch"` key |

View File

@@ -1,5 +1,5 @@
[tool.bumpversion]
current_version = "0.31.0-beta.3"
current_version = "0.31.0-beta.4"
parse = """(?x)
(?P<major>0|[1-9]\\d*)\\.
(?P<minor>0|[1-9]\\d*)\\.

132
Cargo.lock generated
View File

@@ -157,9 +157,9 @@ dependencies = [
[[package]]
name = "anyhow"
version = "1.0.102"
version = "1.0.103"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "7f202df86484c868dbad7eaa557ef785d5c66295e41b460ef922eca0723b842c"
checksum = "2a4385e2e34eb35d6b3efe798b9eb88096925d87726c0798709bf56d9ed84af3"
[[package]]
name = "approx"
@@ -1297,15 +1297,6 @@ version = "2.11.1"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "c4512299f36f043ab09a583e57bceb5a5aab7a73db1805848e8fef3c9e8c78b3"
[[package]]
name = "bitpacking"
version = "0.9.3"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "96a7139abd3d9cebf8cd6f920a389cf3dc9576172e32f4563f188cae3c3eb019"
dependencies = [
"crunchy",
]
[[package]]
name = "bitvec"
version = "1.0.1"
@@ -3186,9 +3177,9 @@ dependencies = [
[[package]]
name = "env_filter"
version = "1.0.1"
version = "2.0.0"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "32e90c2accc4b07a8456ea0debdc2e7587bdd890680d71173a15d4ae604f6eef"
checksum = "900d271a03799a1ee8d1ca9b19893b48ca674a9284fefcfb85f05e74ed314217"
dependencies = [
"log",
"regex",
@@ -3196,9 +3187,9 @@ dependencies = [
[[package]]
name = "env_logger"
version = "0.11.10"
version = "0.11.11"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "0621c04f2196ac3f488dd583365b9c09be011a4ab8b9f37248ffcc8f6198b56a"
checksum = "de671bd27a75a797dc9ae289ba1e77276e75e2026408aab65185384e2d5cd3f6"
dependencies = [
"anstream",
"anstyle",
@@ -3432,8 +3423,8 @@ checksum = "42703706b716c37f96a77aea830392ad231f44c9e9a67872fa5548707e11b11c"
[[package]]
name = "fsst"
version = "9.0.0-beta.8"
source = "git+https://github.com/lance-format/lance.git?tag=v9.0.0-beta.8#71c4aa2174971e98acb7e256fde1e1589024f5bc"
version = "9.0.0-beta.10"
source = "git+https://github.com/lance-format/lance.git?tag=v9.0.0-beta.10#e25b71e74b89d10c57b412d111bde087117383f3"
dependencies = [
"arrow-array",
"rand 0.9.4",
@@ -4735,8 +4726,8 @@ checksum = "e037a2e1d8d5fdbd49b16a4ea09d5d6401c1f29eca5ff29d03d3824dba16256a"
[[package]]
name = "lance"
version = "9.0.0-beta.8"
source = "git+https://github.com/lance-format/lance.git?tag=v9.0.0-beta.8#71c4aa2174971e98acb7e256fde1e1589024f5bc"
version = "9.0.0-beta.10"
source = "git+https://github.com/lance-format/lance.git?tag=v9.0.0-beta.10#e25b71e74b89d10c57b412d111bde087117383f3"
dependencies = [
"arc-swap",
"arrow",
@@ -4754,7 +4745,6 @@ dependencies = [
"async_cell",
"aws-credential-types",
"aws-sdk-dynamodb",
"bitpacking",
"byteorder",
"bytes",
"chrono",
@@ -4773,6 +4763,7 @@ dependencies = [
"humantime",
"itertools 0.14.0",
"lance-arrow",
"lance-bitpacking",
"lance-core",
"lance-datafusion",
"lance-encoding",
@@ -4810,8 +4801,8 @@ dependencies = [
[[package]]
name = "lance-arrow"
version = "9.0.0-beta.8"
source = "git+https://github.com/lance-format/lance.git?tag=v9.0.0-beta.8#71c4aa2174971e98acb7e256fde1e1589024f5bc"
version = "9.0.0-beta.10"
source = "git+https://github.com/lance-format/lance.git?tag=v9.0.0-beta.10#e25b71e74b89d10c57b412d111bde087117383f3"
dependencies = [
"arrow-array",
"arrow-buffer",
@@ -4832,7 +4823,7 @@ dependencies = [
[[package]]
name = "lance-arrow-scalar"
version = "58.0.0"
source = "git+https://github.com/lance-format/lance.git?tag=v9.0.0-beta.8#71c4aa2174971e98acb7e256fde1e1589024f5bc"
source = "git+https://github.com/lance-format/lance.git?tag=v9.0.0-beta.10#e25b71e74b89d10c57b412d111bde087117383f3"
dependencies = [
"arrow-array",
"arrow-buffer",
@@ -4846,7 +4837,7 @@ dependencies = [
[[package]]
name = "lance-arrow-stats"
version = "58.0.0"
source = "git+https://github.com/lance-format/lance.git?tag=v9.0.0-beta.8#71c4aa2174971e98acb7e256fde1e1589024f5bc"
source = "git+https://github.com/lance-format/lance.git?tag=v9.0.0-beta.10#e25b71e74b89d10c57b412d111bde087117383f3"
dependencies = [
"arrow-array",
"arrow-schema",
@@ -4855,18 +4846,19 @@ dependencies = [
[[package]]
name = "lance-bitpacking"
version = "9.0.0-beta.8"
source = "git+https://github.com/lance-format/lance.git?tag=v9.0.0-beta.8#71c4aa2174971e98acb7e256fde1e1589024f5bc"
version = "9.0.0-beta.10"
source = "git+https://github.com/lance-format/lance.git?tag=v9.0.0-beta.10#e25b71e74b89d10c57b412d111bde087117383f3"
dependencies = [
"arrayref",
"crunchy",
"paste",
"seq-macro",
]
[[package]]
name = "lance-core"
version = "9.0.0-beta.8"
source = "git+https://github.com/lance-format/lance.git?tag=v9.0.0-beta.8#71c4aa2174971e98acb7e256fde1e1589024f5bc"
version = "9.0.0-beta.10"
source = "git+https://github.com/lance-format/lance.git?tag=v9.0.0-beta.10#e25b71e74b89d10c57b412d111bde087117383f3"
dependencies = [
"arrow-array",
"arrow-buffer",
@@ -4904,8 +4896,8 @@ dependencies = [
[[package]]
name = "lance-datafusion"
version = "9.0.0-beta.8"
source = "git+https://github.com/lance-format/lance.git?tag=v9.0.0-beta.8#71c4aa2174971e98acb7e256fde1e1589024f5bc"
version = "9.0.0-beta.10"
source = "git+https://github.com/lance-format/lance.git?tag=v9.0.0-beta.10#e25b71e74b89d10c57b412d111bde087117383f3"
dependencies = [
"arrow",
"arrow-array",
@@ -4935,8 +4927,8 @@ dependencies = [
[[package]]
name = "lance-datagen"
version = "9.0.0-beta.8"
source = "git+https://github.com/lance-format/lance.git?tag=v9.0.0-beta.8#71c4aa2174971e98acb7e256fde1e1589024f5bc"
version = "9.0.0-beta.10"
source = "git+https://github.com/lance-format/lance.git?tag=v9.0.0-beta.10#e25b71e74b89d10c57b412d111bde087117383f3"
dependencies = [
"arrow",
"arrow-array",
@@ -4953,8 +4945,8 @@ dependencies = [
[[package]]
name = "lance-derive"
version = "9.0.0-beta.8"
source = "git+https://github.com/lance-format/lance.git?tag=v9.0.0-beta.8#71c4aa2174971e98acb7e256fde1e1589024f5bc"
version = "9.0.0-beta.10"
source = "git+https://github.com/lance-format/lance.git?tag=v9.0.0-beta.10#e25b71e74b89d10c57b412d111bde087117383f3"
dependencies = [
"proc-macro2",
"quote",
@@ -4963,8 +4955,8 @@ dependencies = [
[[package]]
name = "lance-encoding"
version = "9.0.0-beta.8"
source = "git+https://github.com/lance-format/lance.git?tag=v9.0.0-beta.8#71c4aa2174971e98acb7e256fde1e1589024f5bc"
version = "9.0.0-beta.10"
source = "git+https://github.com/lance-format/lance.git?tag=v9.0.0-beta.10#e25b71e74b89d10c57b412d111bde087117383f3"
dependencies = [
"arrow-arith",
"arrow-array",
@@ -4999,8 +4991,8 @@ dependencies = [
[[package]]
name = "lance-file"
version = "9.0.0-beta.8"
source = "git+https://github.com/lance-format/lance.git?tag=v9.0.0-beta.8#71c4aa2174971e98acb7e256fde1e1589024f5bc"
version = "9.0.0-beta.10"
source = "git+https://github.com/lance-format/lance.git?tag=v9.0.0-beta.10#e25b71e74b89d10c57b412d111bde087117383f3"
dependencies = [
"arrow-arith",
"arrow-array",
@@ -5030,8 +5022,8 @@ dependencies = [
[[package]]
name = "lance-index"
version = "9.0.0-beta.8"
source = "git+https://github.com/lance-format/lance.git?tag=v9.0.0-beta.8#71c4aa2174971e98acb7e256fde1e1589024f5bc"
version = "9.0.0-beta.10"
source = "git+https://github.com/lance-format/lance.git?tag=v9.0.0-beta.10#e25b71e74b89d10c57b412d111bde087117383f3"
dependencies = [
"arc-swap",
"arrow",
@@ -5043,7 +5035,6 @@ dependencies = [
"async-channel",
"async-recursion",
"async-trait",
"bitpacking",
"bitvec",
"bytes",
"chrono",
@@ -5061,6 +5052,7 @@ dependencies = [
"jsonb",
"lance-arrow",
"lance-arrow-stats",
"lance-bitpacking",
"lance-core",
"lance-datafusion",
"lance-datagen",
@@ -5096,8 +5088,8 @@ dependencies = [
[[package]]
name = "lance-io"
version = "9.0.0-beta.8"
source = "git+https://github.com/lance-format/lance.git?tag=v9.0.0-beta.8#71c4aa2174971e98acb7e256fde1e1589024f5bc"
version = "9.0.0-beta.10"
source = "git+https://github.com/lance-format/lance.git?tag=v9.0.0-beta.10#e25b71e74b89d10c57b412d111bde087117383f3"
dependencies = [
"arrow",
"arrow-arith",
@@ -5138,8 +5130,8 @@ dependencies = [
[[package]]
name = "lance-linalg"
version = "9.0.0-beta.8"
source = "git+https://github.com/lance-format/lance.git?tag=v9.0.0-beta.8#71c4aa2174971e98acb7e256fde1e1589024f5bc"
version = "9.0.0-beta.10"
source = "git+https://github.com/lance-format/lance.git?tag=v9.0.0-beta.10#e25b71e74b89d10c57b412d111bde087117383f3"
dependencies = [
"arrow-array",
"arrow-buffer",
@@ -5155,8 +5147,8 @@ dependencies = [
[[package]]
name = "lance-namespace"
version = "9.0.0-beta.8"
source = "git+https://github.com/lance-format/lance.git?tag=v9.0.0-beta.8#71c4aa2174971e98acb7e256fde1e1589024f5bc"
version = "9.0.0-beta.10"
source = "git+https://github.com/lance-format/lance.git?tag=v9.0.0-beta.10#e25b71e74b89d10c57b412d111bde087117383f3"
dependencies = [
"arrow",
"async-trait",
@@ -5168,8 +5160,8 @@ dependencies = [
[[package]]
name = "lance-namespace-impls"
version = "9.0.0-beta.8"
source = "git+https://github.com/lance-format/lance.git?tag=v9.0.0-beta.8#71c4aa2174971e98acb7e256fde1e1589024f5bc"
version = "9.0.0-beta.10"
source = "git+https://github.com/lance-format/lance.git?tag=v9.0.0-beta.10#e25b71e74b89d10c57b412d111bde087117383f3"
dependencies = [
"arrow",
"arrow-ipc",
@@ -5223,8 +5215,8 @@ dependencies = [
[[package]]
name = "lance-select"
version = "9.0.0-beta.8"
source = "git+https://github.com/lance-format/lance.git?tag=v9.0.0-beta.8#71c4aa2174971e98acb7e256fde1e1589024f5bc"
version = "9.0.0-beta.10"
source = "git+https://github.com/lance-format/lance.git?tag=v9.0.0-beta.10#e25b71e74b89d10c57b412d111bde087117383f3"
dependencies = [
"arrow-array",
"arrow-buffer",
@@ -5239,8 +5231,8 @@ dependencies = [
[[package]]
name = "lance-table"
version = "9.0.0-beta.8"
source = "git+https://github.com/lance-format/lance.git?tag=v9.0.0-beta.8#71c4aa2174971e98acb7e256fde1e1589024f5bc"
version = "9.0.0-beta.10"
source = "git+https://github.com/lance-format/lance.git?tag=v9.0.0-beta.10#e25b71e74b89d10c57b412d111bde087117383f3"
dependencies = [
"arrow",
"arrow-array",
@@ -5279,8 +5271,8 @@ dependencies = [
[[package]]
name = "lance-testing"
version = "9.0.0-beta.8"
source = "git+https://github.com/lance-format/lance.git?tag=v9.0.0-beta.8#71c4aa2174971e98acb7e256fde1e1589024f5bc"
version = "9.0.0-beta.10"
source = "git+https://github.com/lance-format/lance.git?tag=v9.0.0-beta.10#e25b71e74b89d10c57b412d111bde087117383f3"
dependencies = [
"arrow-array",
"arrow-schema",
@@ -5293,8 +5285,8 @@ dependencies = [
[[package]]
name = "lance-tokenizer"
version = "9.0.0-beta.8"
source = "git+https://github.com/lance-format/lance.git?tag=v9.0.0-beta.8#71c4aa2174971e98acb7e256fde1e1589024f5bc"
version = "9.0.0-beta.10"
source = "git+https://github.com/lance-format/lance.git?tag=v9.0.0-beta.10#e25b71e74b89d10c57b412d111bde087117383f3"
dependencies = [
"icu_segmenter",
"jieba-rs",
@@ -5307,7 +5299,7 @@ dependencies = [
[[package]]
name = "lancedb"
version = "0.31.0-beta.3"
version = "0.31.0-beta.4"
dependencies = [
"ahash",
"anyhow",
@@ -5391,7 +5383,7 @@ dependencies = [
[[package]]
name = "lancedb-nodejs"
version = "0.31.0-beta.3"
version = "0.31.0-beta.4"
dependencies = [
"arrow-array",
"arrow-buffer",
@@ -5416,7 +5408,7 @@ dependencies = [
[[package]]
name = "lancedb-python"
version = "0.34.0-beta.3"
version = "0.34.0-beta.4"
dependencies = [
"arrow",
"async-trait",
@@ -5649,9 +5641,9 @@ dependencies = [
[[package]]
name = "log"
version = "0.4.32"
version = "0.4.33"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "953f07c43838f8e6f9758cab68bf5bed85465e7587ebe0b823f1bcd81978ad3a"
checksum = "0ceec5bc11778974d1bcb055b18002eba7f4b3518b6a0081b3af5f21666da9ad"
[[package]]
name = "loom"
@@ -5959,9 +5951,9 @@ dependencies = [
[[package]]
name = "napi"
version = "3.9.3"
version = "3.9.4"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "fbd9f9295f3ff5921e78a71222c3361a8216f7760b1a99a6ad4e8441de18bbb9"
checksum = "b41bda2ac390efb5e8d22025d925ccc3f3807d8c1bea6d19b36127247c4b8f83"
dependencies = [
"bitflags 2.11.1",
"chrono",
@@ -5984,9 +5976,9 @@ checksum = "c9c366d2c8c60b86fa632df75f745509b52f9128f91a6bad4c796e44abb505e1"
[[package]]
name = "napi-derive"
version = "3.5.6"
version = "3.5.7"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "89b3f766e04667e6da0e181e2da4f85475d5a6513b7cf6a80bea184e224a5b42"
checksum = "61d66f70256ad5aef58659966064471d0ad90e2897bc36a5a5e0389c85aabc1e"
dependencies = [
"convert_case",
"ctor 1.0.5",
@@ -5998,9 +5990,9 @@ dependencies = [
[[package]]
name = "napi-derive-backend"
version = "5.0.4"
version = "5.0.5"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "0d5af30503edf933ce7377cf6d4c877a62b0f1107ea05585f1b5e430e88d5baf"
checksum = "81b4b08f15eed7a2a20c3f4c6314013fc3ac890a3afa9892b594485299ebdb2d"
dependencies = [
"convert_case",
"proc-macro2",
@@ -10128,9 +10120,9 @@ checksum = "06abde3611657adf66d383f00b093d7faecc7fa57071cce2578660c9f1010821"
[[package]]
name = "uuid"
version = "1.23.3"
version = "1.23.4"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "144d6b123cef80b301b8f72a9e2ca4370ddec21950d0a103dd22c437006d2db7"
checksum = "bf80a72845275afea99e7f2b434723d3bc7e38470fcd1c7ed39a599c73319a53"
dependencies = [
"getrandom 0.4.2",
"js-sys",

View File

@@ -13,20 +13,20 @@ categories = ["database-implementations"]
rust-version = "1.91.0"
[workspace.dependencies]
lance = { "version" = "=9.0.0-beta.8", default-features = false, "tag" = "v9.0.0-beta.8", "git" = "https://github.com/lance-format/lance.git" }
lance-core = { "version" = "=9.0.0-beta.8", "tag" = "v9.0.0-beta.8", "git" = "https://github.com/lance-format/lance.git" }
lance-datagen = { "version" = "=9.0.0-beta.8", "tag" = "v9.0.0-beta.8", "git" = "https://github.com/lance-format/lance.git" }
lance-file = { "version" = "=9.0.0-beta.8", "tag" = "v9.0.0-beta.8", "git" = "https://github.com/lance-format/lance.git" }
lance-io = { "version" = "=9.0.0-beta.8", default-features = false, "tag" = "v9.0.0-beta.8", "git" = "https://github.com/lance-format/lance.git" }
lance-index = { "version" = "=9.0.0-beta.8", "tag" = "v9.0.0-beta.8", "git" = "https://github.com/lance-format/lance.git" }
lance-linalg = { "version" = "=9.0.0-beta.8", "tag" = "v9.0.0-beta.8", "git" = "https://github.com/lance-format/lance.git" }
lance-namespace = { "version" = "=9.0.0-beta.8", "tag" = "v9.0.0-beta.8", "git" = "https://github.com/lance-format/lance.git" }
lance-namespace-impls = { "version" = "=9.0.0-beta.8", default-features = false, "tag" = "v9.0.0-beta.8", "git" = "https://github.com/lance-format/lance.git" }
lance-table = { "version" = "=9.0.0-beta.8", "tag" = "v9.0.0-beta.8", "git" = "https://github.com/lance-format/lance.git" }
lance-testing = { "version" = "=9.0.0-beta.8", "tag" = "v9.0.0-beta.8", "git" = "https://github.com/lance-format/lance.git" }
lance-datafusion = { "version" = "=9.0.0-beta.8", "tag" = "v9.0.0-beta.8", "git" = "https://github.com/lance-format/lance.git" }
lance-encoding = { "version" = "=9.0.0-beta.8", "tag" = "v9.0.0-beta.8", "git" = "https://github.com/lance-format/lance.git" }
lance-arrow = { "version" = "=9.0.0-beta.8", "tag" = "v9.0.0-beta.8", "git" = "https://github.com/lance-format/lance.git" }
lance = { "version" = "=9.0.0-beta.10", default-features = false, "tag" = "v9.0.0-beta.10", "git" = "https://github.com/lance-format/lance.git" }
lance-core = { "version" = "=9.0.0-beta.10", "tag" = "v9.0.0-beta.10", "git" = "https://github.com/lance-format/lance.git" }
lance-datagen = { "version" = "=9.0.0-beta.10", "tag" = "v9.0.0-beta.10", "git" = "https://github.com/lance-format/lance.git" }
lance-file = { "version" = "=9.0.0-beta.10", "tag" = "v9.0.0-beta.10", "git" = "https://github.com/lance-format/lance.git" }
lance-io = { "version" = "=9.0.0-beta.10", default-features = false, "tag" = "v9.0.0-beta.10", "git" = "https://github.com/lance-format/lance.git" }
lance-index = { "version" = "=9.0.0-beta.10", "tag" = "v9.0.0-beta.10", "git" = "https://github.com/lance-format/lance.git" }
lance-linalg = { "version" = "=9.0.0-beta.10", "tag" = "v9.0.0-beta.10", "git" = "https://github.com/lance-format/lance.git" }
lance-namespace = { "version" = "=9.0.0-beta.10", "tag" = "v9.0.0-beta.10", "git" = "https://github.com/lance-format/lance.git" }
lance-namespace-impls = { "version" = "=9.0.0-beta.10", default-features = false, "tag" = "v9.0.0-beta.10", "git" = "https://github.com/lance-format/lance.git" }
lance-table = { "version" = "=9.0.0-beta.10", "tag" = "v9.0.0-beta.10", "git" = "https://github.com/lance-format/lance.git" }
lance-testing = { "version" = "=9.0.0-beta.10", "tag" = "v9.0.0-beta.10", "git" = "https://github.com/lance-format/lance.git" }
lance-datafusion = { "version" = "=9.0.0-beta.10", "tag" = "v9.0.0-beta.10", "git" = "https://github.com/lance-format/lance.git" }
lance-encoding = { "version" = "=9.0.0-beta.10", "tag" = "v9.0.0-beta.10", "git" = "https://github.com/lance-format/lance.git" }
lance-arrow = { "version" = "=9.0.0-beta.10", "tag" = "v9.0.0-beta.10", "git" = "https://github.com/lance-format/lance.git" }
ahash = "0.8"
# Note that this one does not include pyarrow
arrow = { version = "58.0.0", optional = false }

View File

@@ -14,7 +14,7 @@ Add the following dependency to your `pom.xml`:
<dependency>
<groupId>com.lancedb</groupId>
<artifactId>lancedb-core</artifactId>
<version>0.31.0-beta.3</version>
<version>0.31.0-beta.4</version>
</dependency>
```

View File

@@ -0,0 +1,29 @@
[**@lancedb/lancedb**](../README.md) • **Docs**
***
[@lancedb/lancedb](../globals.md) / OAuthFlowType
# Enumeration: OAuthFlowType
OAuth authentication flow types.
## Enumeration Members
### AzureManagedIdentity
```ts
AzureManagedIdentity: "azure_managed_identity";
```
Azure Managed Identity via IMDS.
***
### ClientCredentials
```ts
ClientCredentials: "client_credentials";
```
Client Credentials grant (service-to-service / M2M).

View File

@@ -12,6 +12,7 @@
## Enumerations
- [FullTextQueryType](enumerations/FullTextQueryType.md)
- [OAuthFlowType](enumerations/OAuthFlowType.md)
- [Occur](enumerations/Occur.md)
- [Operator](enumerations/Operator.md)
@@ -85,6 +86,8 @@
- [ListNamespacesResponse](interfaces/ListNamespacesResponse.md)
- [LsmWriteSpec](interfaces/LsmWriteSpec.md)
- [MergeResult](interfaces/MergeResult.md)
- [NativeOAuthConfig](interfaces/NativeOAuthConfig.md)
- [OAuthConfig](interfaces/OAuthConfig.md)
- [OpenTableOptions](interfaces/OpenTableOptions.md)
- [OptimizeOptions](interfaces/OptimizeOptions.md)
- [OptimizeStats](interfaces/OptimizeStats.md)

View File

@@ -64,6 +64,19 @@ client used by manifest-enabled native connections.
***
### oauthConfig?
```ts
optional oauthConfig: NativeOAuthConfig;
```
(For LanceDB cloud only): OAuth configuration for IdP-based
authentication (e.g., Azure Entra ID). When set, token acquisition
and refresh are handled entirely in Rust. TypeScript users should pass
the public `OAuthConfig` type exported from `@lancedb/lancedb`.
***
### readConsistencyInterval?
```ts

View File

@@ -0,0 +1,88 @@
[**@lancedb/lancedb**](../README.md) • **Docs**
***
[@lancedb/lancedb](../globals.md) / NativeOAuthConfig
# Interface: NativeOAuthConfig
OAuth configuration for LanceDB authentication.
This is the generated napi-rs binding shape. TypeScript users should prefer
the public `OAuthConfig` type exported from `@lancedb/lancedb`.
All token acquisition and refresh is handled in the Rust layer.
## Properties
### clientId
```ts
clientId: string;
```
Application / Client ID.
***
### clientSecret?
```ts
optional clientSecret: string;
```
Client secret (required for client_credentials).
***
### flow?
```ts
optional flow: string;
```
Authentication flow: "client_credentials" or "azure_managed_identity"
***
### issuerUrl
```ts
issuerUrl: string;
```
OIDC issuer URL or OAuth authority URL.
For Azure: `https://login.microsoftonline.com/{tenant_id}/v2.0`
***
### managedIdentityClientId?
```ts
optional managedIdentityClientId: string;
```
Client ID for user-assigned managed identity (azure_managed_identity).
***
### refreshBufferSecs?
```ts
optional refreshBufferSecs: number;
```
Seconds before expiry to trigger proactive refresh (default: 300).
Keep this well below the token TTL; if it is greater than or equal to
the TTL, each request refreshes the token.
***
### scopes
```ts
scopes: string[];
```
OAuth scopes to request. For Azure managed identity, exactly one scope
or resource is required. For example: `["api://{app_id}/.default"]`

View File

@@ -0,0 +1,111 @@
[**@lancedb/lancedb**](../README.md) • **Docs**
***
[@lancedb/lancedb](../globals.md) / OAuthConfig
# Interface: OAuthConfig
OAuth configuration for LanceDB authentication.
This is the public TypeScript OAuth configuration type. The generated
`NativeOAuthConfig` type has the same runtime shape but is an implementation
detail of the napi-rs binding.
All token acquisition and refresh is handled in the Rust layer.
This config is passed through to Rust via napi-rs.
## Examples
```typescript
const config: OAuthConfig = {
issuerUrl: "https://login.microsoftonline.com/{tenant}/v2.0",
clientId: "app-id",
clientSecret: "secret",
scopes: ["api://lancedb-api/.default"],
};
```
```typescript
const config: OAuthConfig = {
issuerUrl: "https://login.microsoftonline.com/{tenant}/v2.0",
clientId: "app-id",
scopes: ["api://lancedb-api/.default"],
flow: OAuthFlowType.AzureManagedIdentity,
};
```
## Properties
### clientId
```ts
clientId: string;
```
Application / Client ID.
***
### clientSecret?
```ts
optional clientSecret: string;
```
Client secret (required for ClientCredentials).
***
### flow?
```ts
optional flow: OAuthFlowType;
```
Authentication flow (default: ClientCredentials).
***
### issuerUrl
```ts
issuerUrl: string;
```
OIDC issuer URL or OAuth authority URL.
For Azure: `https://login.microsoftonline.com/{tenant_id}/v2.0`
***
### managedIdentityClientId?
```ts
optional managedIdentityClientId: string;
```
Client ID for user-assigned managed identity (AzureManagedIdentity).
***
### refreshBufferSecs?
```ts
optional refreshBufferSecs: number;
```
Seconds before expiry to trigger proactive refresh (default: 300).
Keep this well below the token TTL; if it is greater than or equal to
the TTL, each request refreshes the token.
***
### scopes
```ts
scopes: string[];
```
OAuth scopes to request.
For Azure managed identity, exactly one scope or resource is required.
For example: `["api://{app_id}/.default"]`

View File

@@ -8,7 +8,7 @@
<parent>
<groupId>com.lancedb</groupId>
<artifactId>lancedb-parent</artifactId>
<version>0.31.0-beta.3</version>
<version>0.31.0-beta.4</version>
<relativePath>../pom.xml</relativePath>
</parent>

View File

@@ -6,7 +6,7 @@
<groupId>com.lancedb</groupId>
<artifactId>lancedb-parent</artifactId>
<version>0.31.0-beta.3</version>
<version>0.31.0-beta.4</version>
<packaging>pom</packaging>
<name>${project.artifactId}</name>
<description>LanceDB Java SDK Parent POM</description>
@@ -28,7 +28,7 @@
<properties>
<project.build.sourceEncoding>UTF-8</project.build.sourceEncoding>
<arrow.version>15.0.0</arrow.version>
<lance-core.version>9.0.0-beta.8</lance-core.version>
<lance-core.version>9.0.0-beta.10</lance-core.version>
<spotless.skip>false</spotless.skip>
<spotless.version>2.30.0</spotless.version>
<spotless.java.googlejavaformat.version>1.7</spotless.java.googlejavaformat.version>

View File

@@ -1,7 +1,7 @@
[package]
name = "lancedb-nodejs"
edition.workspace = true
version = "0.31.0-beta.3"
version = "0.31.0-beta.4"
publish = false
license.workspace = true
description.workspace = true

View File

@@ -52,6 +52,7 @@ export {
SplitHashOptions,
SplitSequentialOptions,
ShuffleOptions,
OAuthConfig as NativeOAuthConfig,
} from "./native.js";
export {
@@ -130,6 +131,8 @@ export {
TokenResponse,
} from "./header";
export { OAuthConfig, OAuthFlowType } from "./oauth";
export { MergeInsertBuilder, WriteExecutionOptions } from "./merge";
export * as embedding from "./embedding";

76
nodejs/lancedb/oauth.ts Normal file
View File

@@ -0,0 +1,76 @@
// SPDX-License-Identifier: Apache-2.0
// SPDX-FileCopyrightText: Copyright The LanceDB Authors
/**
* OAuth authentication flow types.
*/
export enum OAuthFlowType {
/** Client Credentials grant (service-to-service / M2M). */
ClientCredentials = "client_credentials",
/** Azure Managed Identity via IMDS. */
AzureManagedIdentity = "azure_managed_identity",
}
/**
* OAuth configuration for LanceDB authentication.
*
* This is the public TypeScript OAuth configuration type. The generated
* `NativeOAuthConfig` type has the same runtime shape but is an implementation
* detail of the napi-rs binding.
*
* All token acquisition and refresh is handled in the Rust layer.
* This config is passed through to Rust via napi-rs.
*
* @example Client Credentials (service-to-service):
* ```typescript
* const config: OAuthConfig = {
* issuerUrl: "https://login.microsoftonline.com/{tenant}/v2.0",
* clientId: "app-id",
* clientSecret: "secret",
* scopes: ["api://lancedb-api/.default"],
* };
* ```
*
* @example Azure Managed Identity:
* ```typescript
* const config: OAuthConfig = {
* issuerUrl: "https://login.microsoftonline.com/{tenant}/v2.0",
* clientId: "app-id",
* scopes: ["api://lancedb-api/.default"],
* flow: OAuthFlowType.AzureManagedIdentity,
* };
* ```
*/
export interface OAuthConfig {
/**
* OIDC issuer URL or OAuth authority URL.
* For Azure: `https://login.microsoftonline.com/{tenant_id}/v2.0`
*/
issuerUrl: string;
/** Application / Client ID. */
clientId: string;
/**
* OAuth scopes to request.
* For Azure managed identity, exactly one scope or resource is required.
* For example: `["api://{app_id}/.default"]`
*/
scopes: string[];
/** Authentication flow (default: ClientCredentials). */
flow?: OAuthFlowType;
/** Client secret (required for ClientCredentials). */
clientSecret?: string;
/** Client ID for user-assigned managed identity (AzureManagedIdentity). */
managedIdentityClientId?: string;
/**
* Seconds before expiry to trigger proactive refresh (default: 300).
* Keep this well below the token TTL; if it is greater than or equal to
* the TTL, each request refreshes the token.
*/
refreshBufferSecs?: number;
}

View File

@@ -1,6 +1,6 @@
{
"name": "@lancedb/lancedb-darwin-arm64",
"version": "0.31.0-beta.3",
"version": "0.31.0-beta.4",
"os": ["darwin"],
"cpu": ["arm64"],
"main": "lancedb.darwin-arm64.node",

View File

@@ -1,6 +1,6 @@
{
"name": "@lancedb/lancedb-linux-arm64-gnu",
"version": "0.31.0-beta.3",
"version": "0.31.0-beta.4",
"os": ["linux"],
"cpu": ["arm64"],
"main": "lancedb.linux-arm64-gnu.node",

View File

@@ -1,6 +1,6 @@
{
"name": "@lancedb/lancedb-linux-arm64-musl",
"version": "0.31.0-beta.3",
"version": "0.31.0-beta.4",
"os": ["linux"],
"cpu": ["arm64"],
"main": "lancedb.linux-arm64-musl.node",

View File

@@ -1,6 +1,6 @@
{
"name": "@lancedb/lancedb-linux-x64-gnu",
"version": "0.31.0-beta.3",
"version": "0.31.0-beta.4",
"os": ["linux"],
"cpu": ["x64"],
"main": "lancedb.linux-x64-gnu.node",

View File

@@ -1,6 +1,6 @@
{
"name": "@lancedb/lancedb-linux-x64-musl",
"version": "0.31.0-beta.3",
"version": "0.31.0-beta.4",
"os": ["linux"],
"cpu": ["x64"],
"main": "lancedb.linux-x64-musl.node",

View File

@@ -1,6 +1,6 @@
{
"name": "@lancedb/lancedb-win32-arm64-msvc",
"version": "0.31.0-beta.3",
"version": "0.31.0-beta.4",
"os": [
"win32"
],

View File

@@ -1,6 +1,6 @@
{
"name": "@lancedb/lancedb-win32-x64-msvc",
"version": "0.31.0-beta.3",
"version": "0.31.0-beta.4",
"os": ["win32"],
"cpu": ["x64"],
"main": "lancedb.win32-x64-msvc.node",

View File

@@ -1,12 +1,12 @@
{
"name": "@lancedb/lancedb",
"version": "0.31.0-beta.3",
"version": "0.31.0-beta.4",
"lockfileVersion": 3,
"requires": true,
"packages": {
"": {
"name": "@lancedb/lancedb",
"version": "0.31.0-beta.3",
"version": "0.31.0-beta.4",
"cpu": [
"x64",
"arm64"

View File

@@ -11,7 +11,7 @@
"ann"
],
"private": false,
"version": "0.31.0-beta.3",
"version": "0.31.0-beta.4",
"main": "dist/index.js",
"exports": {
".": "./dist/index.js",

View File

@@ -112,6 +112,12 @@ impl Connection {
builder = builder.client_config(rust_config);
if let Some(oauth_config) = options.oauth_config {
let config: lancedb::remote::oauth::OAuthConfig =
oauth_config.try_into().default_error()?;
builder = builder.oauth_config(config);
}
if let Some(api_key) = options.api_key {
builder = builder.api_key(&api_key);
}

View File

@@ -65,6 +65,11 @@ pub struct ConnectionOptions {
/// (For LanceDB cloud only): the host to use for LanceDB cloud. Used
/// for testing purposes.
pub host_override: Option<String>,
/// (For LanceDB cloud only): OAuth configuration for IdP-based
/// authentication (e.g., Azure Entra ID). When set, token acquisition
/// and refresh are handled entirely in Rust. TypeScript users should pass
/// the public `OAuthConfig` type exported from `@lancedb/lancedb`.
pub oauth_config: Option<remote::OAuthConfig>,
}
#[napi(object)]

View File

@@ -3,7 +3,7 @@
use std::time::Duration;
use lancedb::{arrow::IntoArrow, ipc::ipc_file_to_batches, table::merge::MergeInsertBuilder};
use lancedb::{ipc::ipc_file_to_batches, table::merge::MergeInsertBuilder};
use napi::bindgen_prelude::*;
use napi_derive::napi;
@@ -66,11 +66,9 @@ impl NativeMergeInsertBuilder {
#[napi(catch_unwind)]
pub async fn execute(&self, buf: Buffer) -> napi::Result<MergeResult> {
let data = ipc_file_to_batches(buf.to_vec())
.and_then(IntoArrow::into_arrow)
.map_err(|e| {
napi::Error::from_reason(format!("Failed to read IPC file: {}", convert_error(&e)))
})?;
let data = ipc_file_to_batches(buf.to_vec()).map_err(|e| {
napi::Error::from_reason(format!("Failed to read IPC file: {}", convert_error(&e)))
})?;
let this = self.clone();

View File

@@ -3,6 +3,7 @@
use std::collections::HashMap;
use lancedb::error::Error;
use napi_derive::*;
/// Timeout configuration for remote HTTP client.
@@ -140,6 +141,84 @@ impl From<TlsConfig> for lancedb::remote::TlsConfig {
}
}
/// OAuth configuration for LanceDB authentication.
///
/// This is the generated napi-rs binding shape. TypeScript users should prefer
/// the public `OAuthConfig` type exported from `@lancedb/lancedb`.
///
/// All token acquisition and refresh is handled in the Rust layer.
#[napi(object)]
#[derive(Clone)]
pub struct OAuthConfig {
/// OIDC issuer URL or OAuth authority URL.
/// For Azure: `https://login.microsoftonline.com/{tenant_id}/v2.0`
pub issuer_url: String,
/// Application / Client ID.
pub client_id: String,
/// OAuth scopes to request. For Azure managed identity, exactly one scope
/// or resource is required. For example: `["api://{app_id}/.default"]`
pub scopes: Vec<String>,
/// Authentication flow: "client_credentials" or "azure_managed_identity"
pub flow: Option<String>,
/// Client secret (required for client_credentials).
pub client_secret: Option<String>,
/// Client ID for user-assigned managed identity (azure_managed_identity).
pub managed_identity_client_id: Option<String>,
/// Seconds before expiry to trigger proactive refresh (default: 300).
/// Keep this well below the token TTL; if it is greater than or equal to
/// the TTL, each request refreshes the token.
pub refresh_buffer_secs: Option<u32>,
}
impl std::fmt::Debug for OAuthConfig {
fn fmt(&self, f: &mut std::fmt::Formatter<'_>) -> std::fmt::Result {
f.debug_struct("OAuthConfig")
.field("issuer_url", &self.issuer_url)
.field("client_id", &self.client_id)
.field("scopes", &self.scopes)
.field("flow", &self.flow)
.field(
"client_secret",
&self.client_secret.as_deref().map(|_| "<redacted>"),
)
.field(
"managed_identity_client_id",
&self.managed_identity_client_id,
)
.field("refresh_buffer_secs", &self.refresh_buffer_secs)
.finish()
}
}
impl TryFrom<OAuthConfig> for lancedb::remote::oauth::OAuthConfig {
type Error = Error;
fn try_from(config: OAuthConfig) -> Result<Self, Self::Error> {
use lancedb::remote::oauth::OAuthFlow;
let flow = match config.flow.as_deref().unwrap_or("client_credentials") {
"client_credentials" => OAuthFlow::ClientCredentials,
"azure_managed_identity" => OAuthFlow::AzureManagedIdentity {
client_id: config.managed_identity_client_id,
},
other => {
return Err(Error::InvalidInput {
message: format!("Unknown OAuth flow type: {other}"),
});
}
};
Ok(Self {
issuer_url: config.issuer_url,
client_id: config.client_id,
client_secret: config.client_secret,
scopes: config.scopes,
flow,
refresh_buffer_secs: config.refresh_buffer_secs.map(|v| v as u64),
})
}
}
impl From<ClientConfig> for lancedb::remote::ClientConfig {
fn from(config: ClientConfig) -> Self {
Self {
@@ -156,3 +235,45 @@ impl From<ClientConfig> for lancedb::remote::ClientConfig {
}
}
}
#[cfg(test)]
mod tests {
use super::*;
#[test]
fn test_unknown_oauth_flow_returns_invalid_input() {
let config = OAuthConfig {
issuer_url: "https://issuer.example.com".to_string(),
client_id: "client-id".to_string(),
scopes: vec!["scope".to_string()],
flow: Some("typo".to_string()),
client_secret: None,
managed_identity_client_id: None,
refresh_buffer_secs: None,
};
let err = lancedb::remote::oauth::OAuthConfig::try_from(config).unwrap_err();
assert!(matches!(
err,
Error::InvalidInput { message }
if message == "Unknown OAuth flow type: typo"
));
}
#[test]
fn test_oauth_config_debug_redacts_client_secret() {
let config = OAuthConfig {
issuer_url: "https://issuer.example.com".to_string(),
client_id: "client-id".to_string(),
scopes: vec!["scope".to_string()],
flow: Some("client_credentials".to_string()),
client_secret: Some("super-secret".to_string()),
managed_identity_client_id: None,
refresh_buffer_secs: None,
};
let debug = format!("{config:?}");
assert!(!debug.contains("super-secret"));
assert!(debug.contains("client_secret: Some(\"<redacted>\")"));
}
}

View File

@@ -1,5 +1,5 @@
[tool.bumpversion]
current_version = "0.34.0-beta.3"
current_version = "0.34.0-beta.4"
parse = """(?x)
(?P<major>0|[1-9]\\d*)\\.
(?P<minor>0|[1-9]\\d*)\\.

View File

@@ -1,6 +1,6 @@
[package]
name = "lancedb-python"
version = "0.34.0-beta.3"
version = "0.34.0-beta.4"
publish = false
edition.workspace = true
description = "Python bindings for LanceDB"

View File

@@ -89,6 +89,8 @@ def connect(
If presented, connect to LanceDB cloud.
Otherwise, connect to a database on file system or cloud storage.
Can be set via environment variable `LANCEDB_API_KEY`.
OAuth configuration is currently supported only by ``connect_async``;
synchronous LanceDB Cloud connections require an API key.
region: str, default "us-east-1"
The region to use for LanceDB Cloud.
host_override: str, optional
@@ -340,6 +342,7 @@ async def connect_async(
session: Optional[Session] = None,
manifest_enabled: bool = False,
namespace_client_properties: Optional[Dict[str, str]] = None,
oauth_config=None,
) -> AsyncConnection:
"""Connect to a LanceDB database.
@@ -389,6 +392,10 @@ async def connect_async(
namespace_client_properties : dict, optional
Additional directory namespace client properties to use with
``manifest_enabled=True``.
oauth_config : OAuthConfig, optional
OAuth configuration for LanceDB Cloud/Enterprise. This is supported by
``connect_async`` only; synchronous ``connect`` uses API key
authentication for ``db://`` URIs.
Examples
--------
@@ -435,6 +442,7 @@ async def connect_async(
session,
manifest_enabled,
namespace_client_properties,
oauth_config,
)
)

View File

@@ -280,6 +280,7 @@ async def connect(
session: Optional[Session],
manifest_enabled: bool = False,
namespace_client_properties: Optional[Dict[str, str]] = None,
oauth_config: Optional[Any] = None,
) -> Connection: ...
class RecordBatchStream:

View File

@@ -48,6 +48,14 @@ class PermutationBuilder:
By default, the permutation builder will create a single split that contains all
rows in the same order as the base table.
"""
if not hasattr(table, "_inner"):
raise TypeError(
f"PermutationBuilder requires a local LanceTable, "
f"got {type(table).__name__}. "
"The permutation API is not supported on remote tables. "
"Remote tables connect to LanceDB Cloud or Enterprise and do not have "
"direct access to the underlying Lance dataset needed for permutations."
)
self._async = async_permutation_builder(table)
def split_random(

View File

@@ -9,6 +9,7 @@ from typing import List, Optional
from lancedb import __version__
from .header import HeaderProvider
from .oauth import OAuthConfig, OAuthFlowType
__all__ = [
"TimeoutConfig",
@@ -16,6 +17,8 @@ __all__ = [
"TlsConfig",
"ClientConfig",
"HeaderProvider",
"OAuthConfig",
"OAuthFlowType",
]

View File

@@ -0,0 +1,75 @@
# SPDX-License-Identifier: Apache-2.0
# SPDX-FileCopyrightText: Copyright The LanceDB Authors
from dataclasses import dataclass, field
from enum import Enum
from typing import List, Optional
class OAuthFlowType(str, Enum):
"""OAuth authentication flow types."""
CLIENT_CREDENTIALS = "client_credentials"
"""Client Credentials grant (service-to-service / M2M)."""
AZURE_MANAGED_IDENTITY = "azure_managed_identity"
"""Azure Managed Identity via IMDS."""
@dataclass
class OAuthConfig:
"""OAuth configuration for LanceDB authentication.
All token acquisition and refresh is handled in the Rust layer.
This config is passed through to Rust via PyO3.
Parameters
----------
issuer_url : str
OIDC issuer URL or OAuth authority URL.
For Azure: ``https://login.microsoftonline.com/{tenant_id}/v2.0``
client_id : str
Application / Client ID.
scopes : List[str]
OAuth scopes to request.
For Azure managed identity, exactly one scope or resource is required.
For example: ``["api://{app_id}/.default"]``
flow : OAuthFlowType
Authentication flow to use. Default: CLIENT_CREDENTIALS.
client_secret : Optional[str]
Client secret (required for CLIENT_CREDENTIALS).
managed_identity_client_id : Optional[str]
Client ID for user-assigned managed identity (AZURE_MANAGED_IDENTITY).
refresh_buffer_secs : Optional[int]
Seconds before expiry to trigger proactive refresh (default: 300).
Keep this well below the token TTL; if it is greater than or equal to
the TTL, each request refreshes the token.
Examples
--------
Client Credentials (service-to-service):
>>> config = OAuthConfig(
... issuer_url="https://login.microsoftonline.com/{tenant}/v2.0",
... client_id="app-id",
... client_secret="secret",
... scopes=["api://lancedb-api/.default"],
... )
Azure Managed Identity:
>>> config = OAuthConfig(
... issuer_url="https://login.microsoftonline.com/{tenant}/v2.0",
... client_id="app-id",
... scopes=["api://lancedb-api/.default"],
... flow=OAuthFlowType.AZURE_MANAGED_IDENTITY,
... )
"""
issuer_url: str
client_id: str
scopes: List[str]
flow: OAuthFlowType = OAuthFlowType.CLIENT_CREDENTIALS
client_secret: Optional[str] = field(default=None, repr=False)
managed_identity_client_id: Optional[str] = None
refresh_buffer_secs: Optional[int] = None

View File

@@ -539,7 +539,7 @@ impl Connection {
}
#[pyfunction]
#[pyo3(signature = (uri, api_key=None, region=None, host_override=None, read_consistency_interval=None, client_config=None, storage_options=None, session=None, manifest_enabled=false, namespace_client_properties=None))]
#[pyo3(signature = (uri, api_key=None, region=None, host_override=None, read_consistency_interval=None, client_config=None, storage_options=None, session=None, manifest_enabled=false, namespace_client_properties=None, oauth_config=None))]
#[allow(clippy::too_many_arguments)]
pub fn connect(
py: Python<'_>,
@@ -553,6 +553,7 @@ pub fn connect(
session: Option<crate::session::Session>,
manifest_enabled: bool,
namespace_client_properties: Option<HashMap<String, String>>,
oauth_config: Option<crate::oauth::PyOAuthConfig>,
) -> PyResult<Bound<'_, PyAny>> {
future_into_py(py, async move {
let mut builder = lancedb::connect(&uri);
@@ -582,6 +583,11 @@ pub fn connect(
if let Some(client_config) = client_config {
builder = builder.client_config(client_config.into());
}
if let Some(oauth_config) = oauth_config {
let config: lancedb::remote::oauth::OAuthConfig =
oauth_config.try_into().infer_error()?;
builder = builder.oauth_config(config);
}
if let Some(session) = session {
builder = builder.session(session.inner.clone());
}

View File

@@ -26,6 +26,7 @@ pub mod expr;
pub mod header;
pub mod index;
pub mod namespace;
pub mod oauth;
pub mod permutation;
pub mod query;
pub mod runtime;

72
python/src/oauth.rs Normal file
View File

@@ -0,0 +1,72 @@
// SPDX-License-Identifier: Apache-2.0
// SPDX-FileCopyrightText: Copyright The LanceDB Authors
use pyo3::FromPyObject;
use lancedb::error::Error;
use lancedb::remote::oauth::{OAuthConfig, OAuthFlow};
/// Python-side OAuth configuration, extracted via FromPyObject.
/// Maps to `lancedb.remote.oauth.OAuthConfig` Python dataclass.
#[derive(FromPyObject)]
pub struct PyOAuthConfig {
pub issuer_url: String,
pub client_id: String,
pub scopes: Vec<String>,
pub flow: String,
pub client_secret: Option<String>,
pub managed_identity_client_id: Option<String>,
pub refresh_buffer_secs: Option<u64>,
}
impl TryFrom<PyOAuthConfig> for OAuthConfig {
type Error = Error;
fn try_from(py: PyOAuthConfig) -> Result<Self, Self::Error> {
let flow = match py.flow.as_str() {
"client_credentials" => OAuthFlow::ClientCredentials,
"azure_managed_identity" => OAuthFlow::AzureManagedIdentity {
client_id: py.managed_identity_client_id,
},
other => {
return Err(Error::InvalidInput {
message: format!("Unknown OAuth flow type: {other}"),
});
}
};
Ok(Self {
issuer_url: py.issuer_url,
client_id: py.client_id,
client_secret: py.client_secret,
scopes: py.scopes,
flow,
refresh_buffer_secs: py.refresh_buffer_secs,
})
}
}
#[cfg(test)]
mod tests {
use super::*;
#[test]
fn test_unknown_oauth_flow_returns_invalid_input() {
let config = PyOAuthConfig {
issuer_url: "https://issuer.example.com".to_string(),
client_id: "client-id".to_string(),
scopes: vec!["scope".to_string()],
flow: "typo".to_string(),
client_secret: None,
managed_identity_client_id: None,
refresh_buffer_secs: None,
};
let err = OAuthConfig::try_from(config).unwrap_err();
assert!(matches!(
err,
Error::InvalidInput { message }
if message == "Unknown OAuth flow type: typo"
));
}
}

View File

@@ -0,0 +1,33 @@
# SPDX-License-Identifier: Apache-2.0
# SPDX-FileCopyrightText: Copyright The LanceDB Authors
import importlib.util
import sys
from pathlib import Path
def _load_oauth_module():
oauth_path = (
Path(__file__).parents[1] / "python" / "lancedb" / "remote" / "oauth.py"
)
spec = importlib.util.spec_from_file_location("lancedb_remote_oauth", oauth_path)
module = importlib.util.module_from_spec(spec)
assert spec.loader is not None
sys.modules[spec.name] = module
spec.loader.exec_module(module)
return module
def test_oauth_config_repr_redacts_client_secret():
oauth = _load_oauth_module()
config = oauth.OAuthConfig(
issuer_url="https://issuer.example.com",
client_id="client-id",
scopes=["scope"],
client_secret="super-secret",
)
rendered = repr(config)
assert "super-secret" not in rendered
assert "client_secret" not in rendered

View File

@@ -1,6 +1,6 @@
[package]
name = "lancedb"
version = "0.31.0-beta.3"
version = "0.31.0-beta.4"
edition.workspace = true
description = "LanceDB: A serverless, low-latency vector database for AI applications"
license.workspace = true
@@ -166,6 +166,10 @@ required-features = ["bedrock"]
[[example]]
name = "simple"
[[example]]
name = "polars"
required-features = ["polars"]
[[example]]
name = "full_text_search"

View File

@@ -0,0 +1,47 @@
// SPDX-License-Identifier: Apache-2.0
// SPDX-FileCopyrightText: Copyright The LanceDB Authors
//! This example demonstrates ingesting a Polars DataFrame into LanceDB and
//! reading it back out as a Polars DataFrame.
use lancedb::arrow::IntoPolars;
use lancedb::query::ExecutableQuery;
use lancedb::{Result, connect};
use polars::prelude::{DataFrame, NamedFrom, Series};
fn make_dataframe() -> DataFrame {
let ids = Series::new("id", &[1i32, 2, 3, 4, 5]);
let names = Series::new("name", &["Alice", "Bob", "Carol", "Dave", "Eve"]);
let scores = Series::new("score", &[9.5f64, 8.1, 7.3, 9.0, 6.5]);
DataFrame::new(vec![ids, names, scores]).unwrap()
}
#[tokio::main]
async fn main() -> Result<()> {
let tmp = tempfile::tempdir().unwrap();
let db = connect(tmp.path().to_str().unwrap()).execute().await?;
// Ingest a Polars DataFrame directly — DataFrame now implements Scannable.
let df = make_dataframe();
println!("Input DataFrame:\n{df}");
let table = db.create_table("people", df).execute().await?;
// Append more rows.
let more = DataFrame::new(vec![
Series::new("id", &[6i32, 7]),
Series::new("name", &["Frank", "Grace"]),
Series::new("score", &[7.8f64, 8.9]),
])
.unwrap();
table.add(more).execute().await?;
// Read back as a Polars DataFrame.
let result_df = table.query().execute().await?.into_polars().await?;
println!(
"\nRound-tripped DataFrame ({} rows):\n{result_df}",
result_df.height()
);
Ok(())
}

View File

@@ -112,54 +112,14 @@ impl<S: Stream<Item = Result<arrow_array::RecordBatch>>> RecordBatchStream
/// A trait for converting incoming data to Arrow
///
/// Integrations should implement this trait to allow data to be
/// imported directly from the integration. For example, implementing
/// this trait for `Vec<Vec<...>>` would allow the `Vec` to be directly
/// used in methods like [`crate::connection::Connection::create_table`]
/// or [`crate::table::Table::add`]
pub trait IntoArrow {
/// Convert the data into an iterator of Arrow batches
fn into_arrow(self) -> Result<Box<dyn arrow_array::RecordBatchReader + Send>>;
}
pub type BoxedRecordBatchReader = Box<dyn arrow_array::RecordBatchReader + Send>;
impl<T: arrow_array::RecordBatchReader + Send + 'static> IntoArrow for T {
fn into_arrow(self) -> Result<Box<dyn arrow_array::RecordBatchReader + Send>> {
Ok(Box::new(self))
}
}
/// A trait for converting incoming data to Arrow asynchronously
///
/// Serves the same purpose as [`IntoArrow`], but for asynchronous data.
///
/// Note: Arrow has no async equivalent to RecordBatchReader and so
pub trait IntoArrowStream {
/// Convert the data into a stream of Arrow batches
fn into_arrow(self) -> Result<SendableRecordBatchStream>;
}
impl<S: Stream<Item = Result<arrow_array::RecordBatch>>> SimpleRecordBatchStream<S> {
pub fn new(stream: S, schema: Arc<arrow_schema::Schema>) -> Self {
Self { schema, stream }
}
}
impl IntoArrowStream for SendableRecordBatchStream {
fn into_arrow(self) -> Result<SendableRecordBatchStream> {
Ok(self)
}
}
impl IntoArrowStream for datafusion_physical_plan::SendableRecordBatchStream {
fn into_arrow(self) -> Result<SendableRecordBatchStream> {
let schema = self.schema();
let stream = self.map_err(|df_err| df_err.into());
Ok(Box::pin(SimpleRecordBatchStream::new(stream, schema)))
}
}
pub trait LanceDbDatagenExt {
fn into_ldb_stream(
self,
@@ -264,9 +224,7 @@ impl IntoPolars for SendableRecordBatchStream {
#[cfg(all(test, feature = "polars"))]
mod tests {
use super::SendableRecordBatchStream;
use crate::arrow::{
IntoArrow, IntoPolars, PolarsDataFrameRecordBatchReader, SimpleRecordBatchStream,
};
use crate::arrow::{IntoPolars, PolarsDataFrameRecordBatchReader, SimpleRecordBatchStream};
use polars::prelude::{DataFrame, NamedFrom, Series};
fn get_record_batch_reader_from_polars() -> Box<dyn arrow_array::RecordBatchReader + Send> {
@@ -280,10 +238,7 @@ mod tests {
float_series = Series::new("float", &[2.0]);
let df2 = DataFrame::new(vec![string_series, int_series, float_series]).unwrap();
PolarsDataFrameRecordBatchReader::new(df1.vstack(&df2).unwrap())
.unwrap()
.into_arrow()
.unwrap()
Box::new(PolarsDataFrameRecordBatchReader::new(df1.vstack(&df2).unwrap()).unwrap())
}
#[test]

View File

@@ -185,6 +185,43 @@ impl Scannable for SendableRecordBatchStream {
}
}
#[cfg(feature = "polars")]
impl Scannable for polars::frame::DataFrame {
fn schema(&self) -> SchemaRef {
crate::polars_arrow_convertors::convert_polars_df_schema_to_arrow_rb_schema(
self.schema().clone(),
)
.expect("failed to convert Polars DataFrame schema to Arrow schema")
}
fn scan_as_stream(&mut self) -> SendableRecordBatchStream {
let schema = Scannable::schema(self);
let batches: crate::Result<Vec<RecordBatch>> =
match crate::arrow::PolarsDataFrameRecordBatchReader::new(self.clone()) {
Err(e) => Err(e),
Ok(reader) => reader.map(|b| b.map_err(Into::into)).collect(),
};
match batches {
Err(e) => Box::pin(SimpleRecordBatchStream {
schema,
stream: once(async move { Err(e) }),
}),
Ok(batches) => {
let stream = futures::stream::iter(batches.into_iter().map(Ok));
Box::pin(SimpleRecordBatchStream { schema, stream })
}
}
}
fn num_rows(&self) -> Option<usize> {
Some(self.height())
}
fn rescannable(&self) -> bool {
true
}
}
#[async_trait]
impl StreamingWriteSource for Box<dyn Scannable> {
fn arrow_schema(&self) -> SchemaRef {
@@ -1089,4 +1126,60 @@ mod tests {
);
}
}
#[cfg(feature = "polars")]
mod polars_tests {
use super::*;
use crate::arrow::IntoPolars;
use crate::query::ExecutableQuery;
use polars::prelude::{DataFrame, NamedFrom, Series};
fn make_df() -> DataFrame {
DataFrame::new(vec![
Series::new("id", &[1i32, 2, 3]),
Series::new("val", &[1.1f64, 2.2, 3.3]),
])
.unwrap()
}
#[tokio::test]
async fn test_dataframe_scannable_round_trip() {
let tmp = tempfile::tempdir().unwrap();
let db = crate::connect(tmp.path().to_str().unwrap())
.execute()
.await
.unwrap();
let df = make_df();
let table = db.create_table("t", df.clone()).execute().await.unwrap();
// Append the same rows again.
table.add(df.clone()).execute().await.unwrap();
let result = table
.query()
.execute()
.await
.unwrap()
.into_polars()
.await
.unwrap();
assert_eq!(result.height(), df.height() * 2);
assert_eq!(result.schema(), df.schema());
}
#[tokio::test]
async fn test_dataframe_scannable_rescannable() {
let mut df = make_df();
assert!(df.rescannable());
let batches1: Vec<RecordBatch> = df.scan_as_stream().try_collect().await.unwrap();
assert_eq!(batches1.iter().map(|b| b.num_rows()).sum::<usize>(), 3);
// Can be scanned again.
let batches2: Vec<RecordBatch> = df.scan_as_stream().try_collect().await.unwrap();
assert_eq!(batches2.iter().map(|b| b.num_rows()).sum::<usize>(), 3);
}
}
}