feat: add table branch support to remote tables and Python/TS bindings

chore: update lance dependency to v8.0.0-beta.12 (#3538)
Updates Rust workspace Lance crates and Java lance-core to v8.0.0-beta.12. No compatibility fixes were required; validation passed with cargo clippy and cargo fmt. Lance tag: https://github.com/lance-format/lance/releases/tag/v8.0.0-beta.12
2026-06-12 16:50:41 +00:00 · 2026-06-12 09:08:43 -07:00 · 2026-06-11 15:03:33 -07:00 · 2026-06-11 14:06:44 -07:00 · 2026-06-11 13:37:39 -07:00 · 2026-06-11 11:41:07 -07:00
141 changed files with 19567 additions and 6104 deletions
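
Reviewer note: the diff below removes the inline bash in `codex-update-lance-dependency.yml` that derived the version, branch name, and commit type, and delegates that to `ci/update_lance_dependency.py`, whose source is not shown in this excerpt. As a rough aid for reading the change, the removed bash logic can be sketched in Python as follows. The names `UpdateMetadata` and `derive_update_metadata` are hypothetical and chosen for illustration; this is a sketch of the old shell behavior under those assumptions, not the actual implementation of the script.

```python
import re
from dataclasses import dataclass


@dataclass(frozen=True)
class UpdateMetadata:
    """Hypothetical container mirroring the metadata keys named in SKILL.md."""

    tag: str
    version: str
    branch_name: str
    commit_message: str
    pr_title: str


def derive_update_metadata(ref: str) -> UpdateMetadata:
    """Sketch of the removed bash logic; not the real ci/update_lance_dependency.py."""
    # Equivalent of VERSION="${TAG#refs/tags/}" followed by VERSION="${VERSION#v}".
    version = ref.strip().removeprefix("refs/tags/").removeprefix("v")
    if not version or version == "latest":
        # Assumption: "latest" is resolved elsewhere (ci/check_lance_release.py).
        raise ValueError("resolve 'latest' to an explicit tag first")

    # Equivalent of ${VERSION//[^a-zA-Z0-9]/-}: every non-alphanumeric becomes '-'.
    slug = re.sub(r"[^a-zA-Z0-9]", "-", version)

    # The old workflow used "chore" for beta/rc versions and "feat" for stable ones.
    commit_type = "chore" if ("beta" in version or "rc" in version) else "feat"
    title = f"{commit_type}: update lance dependency to v{version}"

    return UpdateMetadata(
        tag=f"v{version}",
        version=version,
        branch_name=f"codex/update-lance-{slug}",
        commit_message=title,
        pr_title=title,
    )


if __name__ == "__main__":
    print(derive_update_metadata("refs/tags/v8.0.0-beta.12"))
```

Under these assumptions, a version, a `v`-prefixed tag, and a `refs/tags/` ref all normalize to the same branch name and Conventional Commit title, which is what the duplicate-PR check by title relies on.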
--- a/.agents/skills/README.md
+++ b/.agents/skills/README.md
@@ -0,0 +1,7 @@
+# Agent Skills
+
+This directory contains repo-scoped code agent skills for the LanceDB project.
+
+Each skill is a folder that contains a required `SKILL.md` and optional bundled resources.
+
+Codex discovers skills from `.agents/skills` in the current working directory and parent directories.
--- a/.agents/skills/lancedb-update-lance-dependency/SKILL.md
+++ b/.agents/skills/lancedb-update-lance-dependency/SKILL.md
@@ -0,0 +1,98 @@
+---
+name: lancedb-update-lance-dependency
+description: Update LanceDB to a specific Lance release or tag. Use when bumping Lance dependencies in the lancedb repository, including Rust workspace Lance crates, Java lance-core, validation, branch creation, commit, push, and PR creation when requested.
+---
+
+# LanceDB Update Lance Dependency
+
+## Scope
+
+Use this skill in the `lancedb/lancedb` repository when updating the Lance dependency to a specific Lance version or tag.
+
+Inputs can be a version (`7.2.0-beta.1`), a tag (`v7.2.0-beta.1`), a tag ref (`refs/tags/v7.2.0-beta.1`), or `latest`.
+
+## Workflow
+
+1. Confirm the worktree status with `git status --short`.
+2. Resolve the target Lance version:
+
+   - If the input is `latest`, empty, or omitted, run:
+
+     ```bash
+     python3 ci/check_lance_release.py
+     ```
+
+     Parse the JSON output. If `needs_update` is not `true`, stop without creating a PR. Otherwise use `latest_tag`.
+
+   - If the input is explicit, use it directly.
+
+3. Compute update metadata without changing files:
+
+   ```bash
+   python3 ci/update_lance_dependency.py "$TAG_OR_VERSION" --metadata-only
+   ```
+
+   Before making changes, check for an existing open PR with the emitted `pr_title`:
+
+   ```bash
+   gh pr list --search "\"$PR_TITLE\" in:title" --state open --limit 1 --json number,url,title
+   ```
+
+   If a matching open PR exists, stop and report it instead of creating a duplicate.
+
+4. Run the deterministic update entrypoint:
+
+   ```bash
+   python3 ci/update_lance_dependency.py "$TAG_OR_VERSION"
+   ```
+
+   This updates the Rust workspace Lance dependencies through `ci/set_lance_version.py`, updates `java/pom.xml`, refreshes Cargo metadata, and prints JSON metadata containing `branch_name`, `commit_message`, and `pr_title`.
+
+5. Run validation:
+
+   ```bash
+   cargo clippy --quiet --workspace --tests --all-features -- -D warnings
+   cargo fmt --all --quiet
+   ```
+
+   Fix real diagnostics and rerun clippy until it succeeds. Do not skip warnings.
+
+6. Inspect `git status --short` and `git diff` to ensure only the Lance dependency update and required compatibility fixes are present.
+
+7. If the task only asks to prepare local changes, stop here and report the changed files and validation result.
+
+8. If the task asks to publish the update, create a branch using the printed `branch_name`, stage all relevant files, and commit using the printed `commit_message`. Do not amend or rewrite existing commits.
+
+9. Push to `origin`. Before creating the PR, check that the current token has push permission:
+
+   ```bash
+   gh api repos/lancedb/lancedb --jq .permissions.push
+   ```
+
+   If the remote branch already exists for the same generated branch name, delete the remote ref with `gh api -X DELETE repos/lancedb/lancedb/git/refs/heads/$BRANCH_NAME`, then push. Do not force-push.
+
+10. Create a PR targeting `main` with the printed `pr_title`. If there is no PR template, keep the body to two or three concise sentences: state the Lance dependency bump, note any required compatibility fixes, and link the triggering Lance tag or release.
+
+11. Read back the remote PR title after creation. If it is not a Conventional Commit title, fix it immediately.
+
+12. When running in GitHub Actions after creating the LanceDB PR, trigger the Sophon dependency update:
+
+    ```bash
+    gh workflow run codex-bump-lancedb-lance.yml \
+      --repo lancedb/sophon \
+      -f lance_ref="$LANCE_TAG" \
+      -f lancedb_ref="$BRANCH_NAME"
+    gh run list --repo lancedb/sophon --workflow codex-bump-lancedb-lance.yml --limit 1 --json databaseId,url,displayTitle
+    ```
+
+    Use the emitted metadata `tag` value as `LANCE_TAG`. Do this only after a new LanceDB PR has been created. If the update was skipped because no update is needed or an open PR already exists, do not trigger Sophon.
+
+## GitHub Actions
+
+When this skill is used from GitHub Actions, `TAG`, `GH_TOKEN`, and `GITHUB_TOKEN` may already be set. Resolve `latest` first when `TAG` is empty. Once an explicit tag or version is known, use:
+
+```bash
+python3 ci/update_lance_dependency.py "$TAG" --github-output "$GITHUB_OUTPUT"
+```
+
+Then use the emitted `branch_name`, `commit_message`, and `pr_title` values for branch, commit, and PR creation.
--- a/.bumpversion.toml
+++ b/.bumpversion.toml
@@ -1,5 +1,5 @@
 [tool.bumpversion]
-current_version = "0.29.1-beta.0"
+current_version = "0.30.1-beta.2"
 parse = """(?x)
    (?P<major>0|[1-9]\\d*)\\.
    (?P<minor>0|[1-9]\\d*)\\.
--- a/.github/dependabot.yml
+++ b/.github/dependabot.yml
@@ -11,8 +11,24 @@ updates:
    schedule:
      interval: weekly
    open-pull-requests-limit: 10
+    # Only update Cargo.lock, never widen/raise the version requirements in
+    # Cargo.toml. The goal is keeping the lockfile (and the binaries we ship)
+    # current on security fixes, not forcing our library's consumers onto
+    # newer minimum versions.
+    versioning-strategy: lockfile-only
    groups:
      rust-minor-patch:
        update-types:
          - minor
          - patch
+
+  - package-ecosystem: pip
+    directory: /python
+    schedule:
+      interval: weekly
+    # Only update uv.lock, never widen version requirements in pyproject.toml.
+    versioning-strategy: lockfile-only
+    groups:
+      python-deps:
+        patterns:
+          - "*"
--- a/.github/workflows/build_windows_wheel/action.yml
+++ b/.github/workflows/build_windows_wheel/action.yml
@@ -29,7 +29,3 @@ runs:
        args: ${{ inputs.args }}
        docker-options: "-e PIP_EXTRA_INDEX_URL='https://pypi.fury.io/lance-format/ https://pypi.fury.io/lancedb/'"
        working-directory: python
-    - uses: actions/upload-artifact@v4
-      with:
-        name: windows-wheels
-        path: python\target\wheels
--- a/.github/workflows/codex-update-lance-dependency.yml
+++ b/.github/workflows/codex-update-lance-dependency.yml
@@ -4,14 +4,16 @@ on:
  workflow_call:
    inputs:
      tag:
-        description: "Tag name from Lance"
-        required: true
+        description: "Tag name from Lance. If omitted, the skill will use the latest Lance release that needs an update."
+        required: false
+        default: ""
        type: string
  workflow_dispatch:
    inputs:
      tag:
-        description: "Tag name from Lance"
-        required: true
+        description: "Tag name from Lance. Leave empty to use the latest Lance release that needs an update."
+        required: false
+        default: ""
        type: string

 permissions:
@@ -25,7 +27,7 @@ jobs:
    steps:
      - name: Show inputs
        run: |
-          echo "tag = ${{ inputs.tag }}"
+          echo "tag = ${{ inputs.tag || 'latest' }}"

      - name: Checkout Repo LanceDB
        uses: actions/checkout@v4
@@ -71,65 +73,21 @@ jobs:
          OPENAI_API_KEY: ${{ secrets.CODEX_TOKEN }}
        run: |
          set -euo pipefail
-          VERSION="${TAG#refs/tags/}"
-          VERSION="${VERSION#v}"
-          BRANCH_NAME="codex/update-lance-${VERSION//[^a-zA-Z0-9]/-}"
-
-          # Use "chore" for beta/rc versions, "feat" for stable releases
-          if [[ "${VERSION}" == *beta* ]] || [[ "${VERSION}" == *rc* ]]; then
-            COMMIT_TYPE="chore"
-          else
-            COMMIT_TYPE="feat"
-          fi
+          TARGET_TAG="${TAG:-latest}"

          cat <<EOF >/tmp/codex-prompt.txt
-          You are running inside the lancedb repository on a GitHub Actions runner. Update the Lance dependency to version ${VERSION} and prepare a pull request for maintainers to review.
+          You are running inside the lancedb repository on a GitHub Actions runner.

-          Follow these steps exactly:
-          1. Use script "ci/set_lance_version.py" to update Lance Rust dependencies. The script already refreshes Cargo metadata, so allow it to finish even if it takes time.
-          2. Update the Java lance-core dependency version in "java/pom.xml": change the "<lance-core.version>...</lance-core.version>" property to "${VERSION}".
-          3. Run "cargo clippy --workspace --tests --all-features -- -D warnings". If diagnostics appear, fix them yourself and rerun clippy until it exits cleanly. Do not skip any warnings.
-          4. After clippy succeeds, run "cargo fmt --all" to format the workspace.
-          5. Ensure the repository is clean except for intentional changes. Inspect "git status --short" and "git diff" to confirm the dependency update and any required fixes.
-          6. Create and switch to a new branch named "${BRANCH_NAME}" (replace any duplicated hyphens if necessary).
-          7. Stage all relevant files with "git add -A". Commit using the message "${COMMIT_TYPE}: update lance dependency to v${VERSION}".
-          8. Push the branch to origin. If the remote branch already exists, delete it first with "gh api -X DELETE repos/lancedb/lancedb/git/refs/heads/${BRANCH_NAME}" then push with "git push origin ${BRANCH_NAME}". Do NOT use "git push --force" or "git push -f".
-          9. env "GH_TOKEN" is available, use "gh" tools for github related operations like creating pull request.
-          10. Create a pull request targeting "main" with title "${COMMIT_TYPE}: update lance dependency to v${VERSION}". First, write the PR body to /tmp/pr-body.md using a heredoc (cat <<'EOF' > /tmp/pr-body.md). The body should summarize the dependency bump, clippy/fmt verification, and link the triggering tag (${TAG}). Then run "gh pr create --body-file /tmp/pr-body.md".
-          11. After creating the PR, display the PR URL, "git status --short", and a concise summary of the commands run and their results.
+          Use \$lancedb-update-lance-dependency with target "${TARGET_TAG}".

          Constraints:
-          - Use bash commands; avoid modifying GitHub workflow files other than through the scripted task above.
-          - Do not merge the PR.
-          - If any command fails, diagnose and fix the issue instead of aborting.
+          - Use env "GH_TOKEN" for GitHub operations.
+          - Do not merge the pull request.
+          - Do not force-push.
+          - Do not create a duplicate pull request if an open PR already exists for the target Lance version.
+          - If any command fails, diagnose and fix the root cause instead of aborting.
+          - After creating the PR, display the PR URL, "git status --short", and a concise summary of the commands run and their results.
          EOF

          printenv OPENAI_API_KEY | codex login --with-api-key
          codex --config shell_environment_policy.ignore_default_excludes=true exec --dangerously-bypass-approvals-and-sandbox "$(cat /tmp/codex-prompt.txt)"
-
-      - name: Trigger sophon dependency update
-        env:
-          TAG: ${{ inputs.tag }}
-          GH_TOKEN: ${{ secrets.ROBOT_TOKEN }}
-        run: |
-          set -euo pipefail
-          VERSION="${TAG#refs/tags/}"
-          VERSION="${VERSION#v}"
-          LANCEDB_BRANCH="codex/update-lance-${VERSION//[^a-zA-Z0-9]/-}"
-
-          echo "Triggering sophon workflow with:"
-          echo "  lance_ref: ${TAG#refs/tags/}"
-          echo "  lancedb_ref: ${LANCEDB_BRANCH}"
-
-          gh workflow run codex-bump-lancedb-lance.yml \
-            --repo lancedb/sophon \
-            -f lance_ref="${TAG#refs/tags/}" \
-            -f lancedb_ref="${LANCEDB_BRANCH}"
-
-      - name: Show latest sophon workflow run
-        env:
-          GH_TOKEN: ${{ secrets.ROBOT_TOKEN }}
-        run: |
-          set -euo pipefail
-          echo "Latest sophon workflow run:"
-          gh run list --repo lancedb/sophon --workflow codex-bump-lancedb-lance.yml --limit 1 --json databaseId,url,displayTitle
--- a/.github/workflows/lance-release-timer.yml
+++ b/.github/workflows/lance-release-timer.yml
@@ -1,62 +0,0 @@
-name: Lance Release Timer
-
-on:
-  schedule:
-    - cron: "*/10 * * * *"
-  workflow_dispatch:
-
-permissions:
-  contents: read
-  actions: write
-
-concurrency:
-  group: lance-release-timer
-  cancel-in-progress: false
-
-jobs:
-  trigger-update:
-    runs-on: ubuntu-latest
-    steps:
-      - name: Checkout repository
-        uses: actions/checkout@v4
-
-      - name: Check for new Lance tag
-        id: check
-        env:
-          GH_TOKEN: ${{ secrets.ROBOT_TOKEN }}
-        run: |
-          python3 ci/check_lance_release.py --github-output "$GITHUB_OUTPUT"
-
-      - name: Look for existing PR
-        if: steps.check.outputs.needs_update == 'true'
-        id: pr
-        env:
-          GH_TOKEN: ${{ secrets.ROBOT_TOKEN }}
-        run: |
-          set -euo pipefail
-          TITLE="chore: update lance dependency to v${{ steps.check.outputs.latest_version }}"
-          COUNT=$(gh pr list --search "\"$TITLE\" in:title" --state open --limit 1 --json number --jq 'length')
-          if [ "$COUNT" -gt 0 ]; then
-            echo "Open PR already exists for $TITLE"
-            echo "pr_exists=true" >> "$GITHUB_OUTPUT"
-          else
-            echo "No existing PR for $TITLE"
-            echo "pr_exists=false" >> "$GITHUB_OUTPUT"
-          fi
-
-      - name: Trigger codex update workflow
-        if: steps.check.outputs.needs_update == 'true' && steps.pr.outputs.pr_exists != 'true'
-        env:
-          GH_TOKEN: ${{ secrets.ROBOT_TOKEN }}
-        run: |
-          set -euo pipefail
-          TAG=${{ steps.check.outputs.latest_tag }}
-          gh workflow run codex-update-lance-dependency.yml -f tag=refs/tags/$TAG
-
-      - name: Show latest codex workflow run
-        if: steps.check.outputs.needs_update == 'true' && steps.pr.outputs.pr_exists != 'true'
-        env:
-          GH_TOKEN: ${{ secrets.ROBOT_TOKEN }}
-        run: |
-          set -euo pipefail
-          gh run list --workflow codex-update-lance-dependency.yml --limit 1 --json databaseId,url,displayTitle
--- a/.github/workflows/nodejs.yml
+++ b/.github/workflows/nodejs.yml
@@ -157,7 +157,10 @@ jobs:
        npx jest --testEnvironment jest-environment-node-single-context --verbose
  macos:
    timeout-minutes: 30
-    runs-on: "macos-14"
+    # macos-15 ships a newer linker; the older macos-14 linker fails to insert
+    # branch islands when the debug cdylib's __text section exceeds the 128 MB
+    # AArch64 B/BL branch range.
+    runs-on: "macos-15"
    defaults:
      run:
        shell: bash
--- a/.github/workflows/pypi-publish.yml
+++ b/.github/workflows/pypi-publish.yml
@@ -8,6 +8,9 @@ on:
    # This should trigger a dry run (we skip the final publish step)
    paths:
      - .github/workflows/pypi-publish.yml
+      - .github/workflows/build_linux_wheel/action.yml
+      - .github/workflows/build_mac_wheel/action.yml
+      - .github/workflows/build_windows_wheel/action.yml
      - Cargo.toml # Change in dependency frequently breaks builds
      - Cargo.lock

@@ -21,32 +24,21 @@ jobs:
  linux:
    name: Python ${{ matrix.config.platform }} manylinux${{ matrix.config.manylinux }}
    timeout-minutes: 60
-    permissions:
-      id-token: write
-      contents: read
    strategy:
      matrix:
        config:
-          - platform: x86_64
-            manylinux: "2_17"
-            extra_args: ""
-            runner: ubuntu-22.04
          - platform: x86_64
            manylinux: "2_28"
            extra_args: "--features fp16kernels"
            runner: ubuntu-22.04
-          - platform: aarch64
-            manylinux: "2_17"
-            extra_args: ""
-            # For successful fat LTO builds, we need a large runner to avoid OOM errors.
-            runner: ubuntu-2404-8x-arm64
+          # For successful fat LTO builds, we need a large runner to avoid OOM errors.
          - platform: aarch64
            manylinux: "2_28"
            extra_args: "--features fp16kernels"
            runner: ubuntu-2404-8x-arm64
    runs-on: ${{ matrix.config.runner }}
    steps:
-      - uses: actions/checkout@v4
+      - uses: actions/checkout@v6
        with:
          fetch-depth: 0
          lfs: true
@@ -60,15 +52,14 @@ jobs:
          args: "--release --strip ${{ matrix.config.extra_args }}"
          arm-build: ${{ matrix.config.platform == 'aarch64' }}
          manylinux: ${{ matrix.config.manylinux }}
-      - uses: ./.github/workflows/upload_wheel
+      - uses: actions/upload-artifact@v7
        if: startsWith(github.ref, 'refs/tags/python-v')
        with:
-          fury_token: ${{ secrets.FURY_TOKEN }}
+          name: wheels-linux-${{ matrix.config.platform }}-${{ matrix.config.manylinux }}
+          path: target/wheels/lancedb-*.whl
+          if-no-files-found: error
  mac:
    timeout-minutes: 90
-    permissions:
-      id-token: write
-      contents: read
    runs-on: ${{ matrix.config.runner }}
    strategy:
      matrix:
@@ -78,7 +69,7 @@ jobs:
    env:
      MACOSX_DEPLOYMENT_TARGET: 10.15
    steps:
-      - uses: actions/checkout@v4
+      - uses: actions/checkout@v6
        with:
          fetch-depth: 0
          lfs: true
@@ -90,18 +81,21 @@ jobs:
        with:
          python-minor-version: 10
          args: "--release --strip --target ${{ matrix.config.target }} --features fp16kernels"
-      - uses: ./.github/workflows/upload_wheel
+      - uses: actions/upload-artifact@v7
        if: startsWith(github.ref, 'refs/tags/python-v')
        with:
-          fury_token: ${{ secrets.FURY_TOKEN }}
+          name: wheels-mac-${{ matrix.config.target }}
+          path: target/wheels/lancedb-*.whl
+          if-no-files-found: error
  windows:
-    timeout-minutes: 60
-    permissions:
-      id-token: write
-      contents: read
+    timeout-minutes: 90
    runs-on: windows-latest
+    env:
+      # link.exe is single-threaded and the long pole on Windows builds. Use
+      # rustc's bundled lld-link instead.
+      CARGO_TARGET_X86_64_PC_WINDOWS_MSVC_LINKER: rust-lld
    steps:
-      - uses: actions/checkout@v4
+      - uses: actions/checkout@v6
        with:
          fetch-depth: 0
          lfs: true
@@ -113,18 +107,70 @@ jobs:
        with:
          python-minor-version: 10
          args: "--release --strip"
-          vcpkg_token: ${{ secrets.VCPKG_GITHUB_PACKAGES }}
-      - uses: ./.github/workflows/upload_wheel
+      - uses: actions/upload-artifact@v7
        if: startsWith(github.ref, 'refs/tags/python-v')
        with:
-          fury_token: ${{ secrets.FURY_TOKEN }}
+          name: wheels-windows
+          path: target/wheels/lancedb-*.whl
+          if-no-files-found: error
+  publish:
+    name: Publish wheels
+    if: startsWith(github.ref, 'refs/tags/python-v')
+    needs: [linux, mac, windows]
+    runs-on: ubuntu-latest
+    permissions:
+      id-token: write
+      contents: read
+    steps:
+      - uses: actions/checkout@v6
+      - name: Download wheel artifacts
+        uses: actions/download-artifact@v8
+        with:
+          pattern: wheels-*
+          path: target/wheels
+          merge-multiple: true
+      - name: List wheels
+        run: ls -la target/wheels
+      - name: Choose repo
+        id: choose_repo
+        run: |
+          if [[ ${{ github.ref }} == *beta* ]]; then
+            echo "repo=fury" >> $GITHUB_OUTPUT
+          else
+            echo "repo=pypi" >> $GITHUB_OUTPUT
+          fi
+      - name: Publish to Fury
+        if: steps.choose_repo.outputs.repo == 'fury'
+        env:
+          FURY_TOKEN: ${{ secrets.FURY_TOKEN }}
+        run: |
+          shopt -s nullglob
+          WHEELS=(target/wheels/lancedb-*.whl)
+          if [[ ${#WHEELS[@]} -eq 0 ]]; then
+            echo "No wheels found in target/wheels/" >&2
+            exit 1
+          fi
+          for WHEEL in "${WHEELS[@]}"; do
+            echo "Uploading $WHEEL to Fury"
+            curl -f -F package=@"$WHEEL" "https://$FURY_TOKEN@push.fury.io/lancedb/"
+          done
+      # NOTE: pypa/gh-action-pypi-publish must be invoked directly from a
+      # workflow file, not from inside a composite action. When called from a
+      # composite, `github.action_repository` is empty (actions/runner#2473)
+      # and the action falls back to `github.repository`, producing a bogus
+      # `docker://ghcr.io/<repo>:<ref>` image reference that GHA tries to pull.
+      - name: Publish to PyPI
+        if: steps.choose_repo.outputs.repo == 'pypi'
+        uses: pypa/gh-action-pypi-publish@release/v1
+        with:
+          packages-dir: target/wheels/
  gh-release:
    if: startsWith(github.ref, 'refs/tags/python-v')
    runs-on: ubuntu-latest
    permissions:
      contents: write
    steps:
-      - uses: actions/checkout@v4
+      - uses: actions/checkout@v6
        with:
          fetch-depth: 0
          lfs: true
@@ -187,13 +233,13 @@ jobs:
  report-failure:
    name: Report Workflow Failure
    runs-on: ubuntu-latest
-    needs: [linux, mac, windows]
+    needs: [linux, mac, windows, publish]
    permissions:
      contents: read
      issues: write
    if: always() && failure() && startsWith(github.ref, 'refs/tags/python-v')
    steps:
-      - uses: actions/checkout@v4
+      - uses: actions/checkout@v6
      - uses: ./.github/actions/create-failure-issue
        with:
          job-results: ${{ toJSON(needs) }}
--- a/.github/workflows/python.yml
+++ b/.github/workflows/python.yml
@@ -205,7 +205,7 @@ jobs:
      - name: Delete wheels
        run: rm -rf target/wheels
  pydantic1x:
-    timeout-minutes: 30
+    timeout-minutes: 60
    runs-on: "ubuntu-24.04"
    defaults:
      run:
--- a/.github/workflows/rust.yml
+++ b/.github/workflows/rust.yml
@@ -233,6 +233,26 @@ jobs:
          cargo update -p aws-sdk-sso --precise 1.62.0
          cargo update -p aws-sdk-ssooidc --precise 1.63.0
          cargo update -p aws-sdk-sts --precise 1.63.0
+          # aws-runtime/sigv4/credential-types/types and the aws-smithy-*
+          # crates bumped their MSRV to 1.91.1 in late 2026; pin to the last
+          # 1.91.0-compatible versions. The order matters — each downgrade
+          # only succeeds once everything that still pins it at a higher
+          # version has itself been downgraded.
+          cargo update -p aws-runtime --precise 1.5.12
+          cargo update -p aws-types --precise 1.3.9
+          cargo update -p aws-sigv4 --precise 1.3.5
+          cargo update -p aws-credential-types --precise 1.2.8
+          cargo update -p aws-smithy-checksums --precise 0.63.9
+          cargo update -p aws-smithy-runtime --precise 1.9.3
+          cargo update -p aws-smithy-http --precise 0.62.4
+          cargo update -p aws-smithy-eventstream --precise 0.60.12
+          cargo update -p aws-smithy-http-client --precise 1.1.3
+          cargo update -p aws-smithy-observability --precise 0.1.4
+          cargo update -p aws-smithy-query --precise 0.60.8
+          cargo update -p aws-smithy-runtime-api --precise 1.9.1
+          cargo update -p aws-smithy-async --precise 1.2.6
+          cargo update -p aws-smithy-types --precise 1.3.5
+          cargo update -p aws-smithy-xml --precise 0.60.11
          cargo update -p home --precise 0.5.9
      - name: cargo +${{ matrix.msrv }} check
        env:
--- a/.github/workflows/upload_wheel/action.yml
+++ b/.github/workflows/upload_wheel/action.yml
@@ -1,34 +0,0 @@
-name: upload-wheel
-
-description: "Upload wheels to Pypi"
-inputs:
-  fury_token:
-    required: true
-    description: "release token for the fury repo"
-
-runs:
-  using: "composite"
-  steps:
-  - name: Choose repo
-    shell: bash
-    id: choose_repo
-    run: |
-      if [[ ${{ github.ref }} == *beta* ]]; then
-        echo "repo=fury" >> $GITHUB_OUTPUT
-      else
-        echo "repo=pypi" >> $GITHUB_OUTPUT
-      fi
-  - name: Publish to Fury
-    if: steps.choose_repo.outputs.repo == 'fury'
-    shell: bash
-    env:
-      FURY_TOKEN: ${{ inputs.fury_token }}
-    run: |
-      WHEEL=$(ls target/wheels/lancedb-*.whl 2> /dev/null | head -n 1)
-      echo "Uploading $WHEEL to Fury"
-      curl -f -F package=@$WHEEL https://$FURY_TOKEN@push.fury.io/lancedb/
-  - name: Publish to PyPI
-    if: steps.choose_repo.outputs.repo == 'pypi'
-    uses: pypa/gh-action-pypi-publish@release/v1
-    with:
-      packages-dir: target/wheels/
--- a/.gitignore
+++ b/.gitignore
@@ -27,6 +27,7 @@ python/dist
 *.so
 *.dylib
 *.dll
+*.pdb

 ## Javascript
 *.node
--- a/AGENTS.md
+++ b/AGENTS.md
@@ -17,9 +17,33 @@ Common commands:
 * Run tests: `cargo test --quiet --features remote --tests`
 * Run specific test: `cargo test --quiet --features remote -p <package_name> --test <test_name>`
 * Lint: `cargo clippy --quiet --features remote --tests --examples`
-* Format: `cargo fmt --all`
+* Format Rust: `cargo fmt --all`
+* Format Python: `ruff format .`
+* Lint Python: `ruff check .`
+* Bootstrap Python dev env: `cd python && uv run --extra tests --extra dev maturin develop --extras tests,dev`
+* Run Python tests: `cd python && uv run --extra tests pytest python/tests -vv --durations=10 -m "not slow and not s3_test"`
+* Run specific Python test: `cd python && uv run --extra tests pytest python/tests/<test_file>.py::<test_name> -q`

-Before committing changes, run formatting.
+For Python validation, prefer the uv-managed environment declared by `python/uv.lock`.
+Do not treat system `python`, global `pytest`, or missing editable-install errors as
+final blockers; bootstrap or enter the uv environment instead. If `lancedb._lancedb`
+is missing or stale, or if Rust/PyO3 binding code changed, rebuild the Python
+extension with the bootstrap command above before running tests.
+
+Before committing changes, run formatting for every language you touched. At minimum:
+
+* Rust changes: run `cargo fmt --all`.
+* Python changes: run `ruff format .` and `ruff check .` from the repository root,
+  and run targeted tests through `cd python && uv run ...`.
+* TypeScript changes: run the relevant `npm`/`pnpm` lint, format, build, and docs commands in `nodejs`.
+
+Before creating a PR, the exact value passed to `gh pr create --title` must follow
+Conventional Commits, such as `fix: support nested field paths in native index creation`
+or `feat(python): add dataset multiprocessing support`. Do not use a plain natural
+language summary like `Support nested field paths in native index creation` as the PR
+title. The semantic-release check uses the PR title and body as the merge commit message,
+so a non-conventional PR title will fail CI. After creating a PR, read the remote PR title
+back and fix it immediately if it is not conventional.

 ## Coding tips

--- a/Cargo.lock
+++ b/Cargo.lock
--- a/Cargo.toml
+++ b/Cargo.toml
@@ -13,20 +13,20 @@ categories = ["database-implementations"]
 rust-version = "1.91.0"

 [workspace.dependencies]
-lance = { "version" = "=7.0.0-beta.13", default-features = false, "tag" = "v7.0.0-beta.13", "git" = "https://github.com/lance-format/lance.git" }
-lance-core = { "version" = "=7.0.0-beta.13", "tag" = "v7.0.0-beta.13", "git" = "https://github.com/lance-format/lance.git" }
-lance-datagen = { "version" = "=7.0.0-beta.13", "tag" = "v7.0.0-beta.13", "git" = "https://github.com/lance-format/lance.git" }
-lance-file = { "version" = "=7.0.0-beta.13", "tag" = "v7.0.0-beta.13", "git" = "https://github.com/lance-format/lance.git" }
-lance-io = { "version" = "=7.0.0-beta.13", default-features = false, "tag" = "v7.0.0-beta.13", "git" = "https://github.com/lance-format/lance.git" }
-lance-index = { "version" = "=7.0.0-beta.13", "tag" = "v7.0.0-beta.13", "git" = "https://github.com/lance-format/lance.git" }
-lance-linalg = { "version" = "=7.0.0-beta.13", "tag" = "v7.0.0-beta.13", "git" = "https://github.com/lance-format/lance.git" }
-lance-namespace = { "version" = "=7.0.0-beta.13", "tag" = "v7.0.0-beta.13", "git" = "https://github.com/lance-format/lance.git" }
-lance-namespace-impls = { "version" = "=7.0.0-beta.13", default-features = false, "tag" = "v7.0.0-beta.13", "git" = "https://github.com/lance-format/lance.git" }
-lance-table = { "version" = "=7.0.0-beta.13", "tag" = "v7.0.0-beta.13", "git" = "https://github.com/lance-format/lance.git" }
-lance-testing = { "version" = "=7.0.0-beta.13", "tag" = "v7.0.0-beta.13", "git" = "https://github.com/lance-format/lance.git" }
-lance-datafusion = { "version" = "=7.0.0-beta.13", "tag" = "v7.0.0-beta.13", "git" = "https://github.com/lance-format/lance.git" }
-lance-encoding = { "version" = "=7.0.0-beta.13", "tag" = "v7.0.0-beta.13", "git" = "https://github.com/lance-format/lance.git" }
-lance-arrow = { "version" = "=7.0.0-beta.13", "tag" = "v7.0.0-beta.13", "git" = "https://github.com/lance-format/lance.git" }
+lance = { "version" = "=8.0.0-beta.12", default-features = false, "tag" = "v8.0.0-beta.12", "git" = "https://github.com/lance-format/lance.git" }
+lance-core = { "version" = "=8.0.0-beta.12", "tag" = "v8.0.0-beta.12", "git" = "https://github.com/lance-format/lance.git" }
+lance-datagen = { "version" = "=8.0.0-beta.12", "tag" = "v8.0.0-beta.12", "git" = "https://github.com/lance-format/lance.git" }
+lance-file = { "version" = "=8.0.0-beta.12", "tag" = "v8.0.0-beta.12", "git" = "https://github.com/lance-format/lance.git" }
+lance-io = { "version" = "=8.0.0-beta.12", default-features = false, "tag" = "v8.0.0-beta.12", "git" = "https://github.com/lance-format/lance.git" }
+lance-index = { "version" = "=8.0.0-beta.12", "tag" = "v8.0.0-beta.12", "git" = "https://github.com/lance-format/lance.git" }
+lance-linalg = { "version" = "=8.0.0-beta.12", "tag" = "v8.0.0-beta.12", "git" = "https://github.com/lance-format/lance.git" }
+lance-namespace = { "version" = "=8.0.0-beta.12", "tag" = "v8.0.0-beta.12", "git" = "https://github.com/lance-format/lance.git" }
+lance-namespace-impls = { "version" = "=8.0.0-beta.12", default-features = false, "tag" = "v8.0.0-beta.12", "git" = "https://github.com/lance-format/lance.git" }
+lance-table = { "version" = "=8.0.0-beta.12", "tag" = "v8.0.0-beta.12", "git" = "https://github.com/lance-format/lance.git" }
+lance-testing = { "version" = "=8.0.0-beta.12", "tag" = "v8.0.0-beta.12", "git" = "https://github.com/lance-format/lance.git" }
+lance-datafusion = { "version" = "=8.0.0-beta.12", "tag" = "v8.0.0-beta.12", "git" = "https://github.com/lance-format/lance.git" }
+lance-encoding = { "version" = "=8.0.0-beta.12", "tag" = "v8.0.0-beta.12", "git" = "https://github.com/lance-format/lance.git" }
+lance-arrow = { "version" = "=8.0.0-beta.12", "tag" = "v8.0.0-beta.12", "git" = "https://github.com/lance-format/lance.git" }
 ahash = "0.8"
 # Note that this one does not include pyarrow
 arrow = { version = "58.0.0", optional = false }
--- a/REVIEW.md
+++ b/REVIEW.md
@@ -0,0 +1,26 @@
+# Code review guidelines
+
+Repo-specific guidance for automated PR reviews.
+
+## Cross-SDK parity
+
+LanceDB exposes the same core (`rust/lancedb`) through Python, TypeScript (`nodejs`),
+and Java bindings. Behavioral drift between SDKs is a recurring problem, so watch for
+parity gaps when reviewing — but only flag real ones:
+
+* If the change adds or modifies user-facing API or behavior in the shared core
+  (`rust/lancedb`), check whether each binding that should expose it (`python`,
+  `nodejs`) does. A core change with no corresponding binding update is worth a note.
+* If the change adds or modifies a public API in one SDK but not the other, open the
+  sibling SDK's corresponding module and state whether an equivalent exists. If not,
+  note it as a possible parity gap and suggest a follow-up issue.
+* For bug fixes, first read the sibling SDK's analogous code path to check whether the
+  same bug exists there. Only raise parity if it actually does. Do not ask to "port" a
+  fix for a bug that only ever existed in one binding.
+* Stay silent on internal-only refactors, tests, docs, and changes with no cross-SDK
+  surface.
+* Parity expectations apply to the Python and TypeScript (`nodejs`) SDKs. Java currently
+  implements only the remote table, not the local/embedded backend, so it is expected to
+  be partial — do not flag Java for missing local-only functionality.
+* Keep parity feedback to a short, clearly-labeled note (e.g. "Possible SDK parity
+  gap: …"). It is advisory, not a merge blocker.
--- a/ci/check_lance_release.py
+++ b/ci/check_lance_release.py
@@ -112,25 +112,25 @@ def fetch_remote_tags() -> List[TagInfo]:
            "api",
            "-X",
            "GET",
-            f"repos/{LANCE_REPO}/git/refs/tags",
-            "--paginate",
+            f"repos/{LANCE_REPO}/releases",
            "--jq",
-            ".[].ref",
+            ".[].tag_name",
+            "-F",
+            "per_page=20",
        ]
    )
    tags: List[TagInfo] = []
    for line in output.splitlines():
-        ref = line.strip()
-        if not ref.startswith("refs/tags/v"):
+        tag = line.strip()
+        if not tag.startswith("v"):
            continue
-        tag = ref.split("refs/tags/")[-1]
        version = tag.lstrip("v")
        try:
            tags.append(TagInfo(tag=tag, version=version, semver=parse_semver(version)))
        except ValueError:
            continue
    if not tags:
-        raise RuntimeError("No Lance tags could be parsed from GitHub API output")
+        raise RuntimeError("No Lance releases could be parsed from GitHub API output")
    return tags


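Review note on the unchanged context line `version = tag.lstrip("v")` in this hunk: `str.lstrip` removes every leading `v` character, not a single prefix. For ordinary release tag names the result is the same, but the behavior differs from `str.removeprefix` (Python 3.9+), which the new `ci/update_lance_dependency.py` uses. A minimal sketch of the difference; the helper names and sample tag names are hypothetical and exist only for illustration:

```python
def strip_all_leading_v(tag: str) -> str:
    # lstrip treats its argument as a set of characters and removes
    # every leading occurrence.
    return tag.lstrip("v")


def strip_one_leading_v(tag: str) -> str:
    # removeprefix removes the literal prefix at most once.
    return tag.removeprefix("v")


if __name__ == "__main__":
    # Made-up tag names, not real Lance releases.
    for tag in ["v8.0.0-beta.12", "vv1.2.3"]:
        print(tag, strip_all_leading_v(tag), strip_one_leading_v(tag))
```

Whether this matters depends on what tag names the releases endpoint can return; the later `parse_semver` call would still reject anything that is not a valid version.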
--- a/ci/update_lance_dependency.py
+++ b/ci/update_lance_dependency.py
@@ -0,0 +1,126 @@
+#!/usr/bin/env python3
+"""Prepare a Lance dependency update for LanceDB."""
+
+from __future__ import annotations
+
+import argparse
+import json
+import re
+import subprocess
+import sys
+from pathlib import Path
+from typing import Sequence
+
+try:
+    from check_lance_release import parse_semver
+except ModuleNotFoundError:
+    # Supports importing as ci.update_lance_dependency from tests or ad hoc checks.
+    from ci.check_lance_release import parse_semver  # type: ignore
+
+
+def normalize_version(raw: str) -> str:
+    value = raw.strip()
+    value = value.removeprefix("refs/tags/")
+    value = value.removeprefix("v")
+    try:
+        parse_semver(value)
+    except ValueError as exc:
+        raise ValueError(f"Unsupported Lance version or tag: {raw}") from exc
+    return value
+
+
+def normalized_tag(version: str) -> str:
+    return f"v{version}"
+
+
+def branch_name(version: str) -> str:
+    suffix = re.sub(r"[^a-zA-Z0-9]+", "-", version).strip("-")
+    suffix = re.sub(r"-+", "-", suffix)
+    return f"codex/update-lance-{suffix}"
+
+
+def commit_type(version: str) -> str:
+    prerelease = version.split("-", maxsplit=1)[1] if "-" in version else ""
+    return "chore" if "beta" in prerelease or "rc" in prerelease else "feat"
+
+
+def metadata_for(version: str) -> dict[str, str]:
+    kind = commit_type(version)
+    message = f"{kind}: update lance dependency to v{version}"
+    return {
+        "version": version,
+        "tag": normalized_tag(version),
+        "branch_name": branch_name(version),
+        "commit_type": kind,
+        "commit_message": message,
+        "pr_title": message,
+    }
+
+
+def run_command(cmd: Sequence[str], *, cwd: Path) -> None:
+    subprocess.run(cmd, cwd=cwd, check=True)
+
+
+def update_java_lance_core_version(repo_root: Path, version: str) -> None:
+    pom_path = repo_root / "java" / "pom.xml"
+    contents = pom_path.read_text(encoding="utf-8")
+    updated, count = re.subn(
+        r"(<lance-core\.version>)[^<]+(</lance-core\.version>)",
+        rf"\g<1>{version}\g<2>",
+        contents,
+        count=1,
+    )
+    if count != 1:
+        raise RuntimeError(
+            "Expected exactly one <lance-core.version> entry in java/pom.xml"
+        )
+    pom_path.write_text(updated, encoding="utf-8")
+
+
+def write_github_outputs(path: str | None, payload: dict[str, str]) -> None:
+    if not path:
+        return
+    with open(path, "a", encoding="utf-8") as output:
+        for key, value in payload.items():
+            output.write(f"{key}={value}\n")
+
+
+def main(argv: Sequence[str] | None = None) -> int:
+    parser = argparse.ArgumentParser(description=__doc__)
+    parser.add_argument(
+        "tag_or_version",
+        help="Lance tag or version, for example refs/tags/v7.2.0-beta.1 or 7.2.0",
+    )
+    parser.add_argument(
+        "--repo-root",
+        type=Path,
+        default=Path(__file__).resolve().parents[1],
+        help="Path to the lancedb repository root",
+    )
+    parser.add_argument(
+        "--github-output",
+        default=None,
+        help="Optional GitHub Actions output file to receive metadata fields",
+    )
+    parser.add_argument(
+        "--metadata-only",
+        action="store_true",
+        help="Only print derived metadata; do not modify dependency files",
+    )
+    args = parser.parse_args(argv)
+
+    repo_root = args.repo_root.resolve()
+    version = normalize_version(args.tag_or_version)
+    payload = metadata_for(version)
+
+    if not args.metadata_only:
+        run_command([sys.executable, "ci/set_lance_version.py", version], cwd=repo_root)
+        update_java_lance_core_version(repo_root, version)
+
+    write_github_outputs(args.github_output, payload)
+    print(json.dumps(payload, sort_keys=True))
+    return 0
+
+
+if __name__ == "__main__":
+    sys.exit(main())
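To make the metadata derivation in `ci/update_lance_dependency.py` easier to review, here is a condensed, standalone sketch of the same string transformations. It deliberately skips the `parse_semver` validation (that function lives in `ci/check_lance_release.py` and is not shown in this diff), and `derive` is a hypothetical name used only here:

```python
import re


def branch_name(version: str) -> str:
    # Runs of non-alphanumeric characters collapse to a single "-".
    suffix = re.sub(r"[^a-zA-Z0-9]+", "-", version).strip("-")
    return f"codex/update-lance-{suffix}"


def commit_type(version: str) -> str:
    # Prereleases containing "beta" or "rc" are "chore"; all else is "feat".
    prerelease = version.split("-", maxsplit=1)[1] if "-" in version else ""
    return "chore" if "beta" in prerelease or "rc" in prerelease else "feat"


def derive(raw: str) -> dict[str, str]:
    # Hypothetical helper: normalizes the input without semver validation.
    version = raw.strip().removeprefix("refs/tags/").removeprefix("v")
    kind = commit_type(version)
    return {
        "version": version,
        "tag": f"v{version}",
        "branch_name": branch_name(version),
        "commit_message": f"{kind}: update lance dependency to v{version}",
    }


if __name__ == "__main__":
    print(derive("refs/tags/v8.0.0-beta.12"))
    print(derive("8.0.0"))
```

The first input reproduces the commit message used for this change; a stable version such as `8.0.0` would be labeled `feat` instead.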
--- a/deny.toml
+++ b/deny.toml
@@ -147,6 +147,14 @@ allow = [
    "CDLA-Permissive-2.0",
 ]
 confidence-threshold = 0.8
+# Per-crate license exceptions: allow a license for a specific crate only,
+# rather than globally via the `allow` list above.
+exceptions = [
+    # CDDL-1.0 (copyleft) is pulled in only as a dev/profiling dependency via
+    # `inferno` -> `pprof` -> `lance-testing`; it is a test dependency that we
+    # do not distribute, so scope the allowance to `inferno` alone.
+    { allow = ["CDDL-1.0"], crate = "inferno" },
+]
 # Crates whose license cannot be determined from Cargo metadata but whose
 # license we've manually confirmed from upstream. Keep this list minimal.
 [[licenses.clarify]]
--- a/docs/src/java/java.md
+++ b/docs/src/java/java.md
@@ -14,7 +14,7 @@ Add the following dependency to your `pom.xml`:
 <dependency>
    <groupId>com.lancedb</groupId>
    <artifactId>lancedb-core</artifactId>
-    <version>0.29.1-beta.0</version>
+    <version>0.30.1-beta.2</version>
 </dependency>
 ```

--- a/docs/src/js/classes/BranchContents.md
+++ b/docs/src/js/classes/BranchContents.md
@@ -0,0 +1,43 @@
+[**@lancedb/lancedb**](../README.md) • **Docs**
+
+***
+
+[@lancedb/lancedb](../globals.md) / BranchContents
+
+# Class: BranchContents
+
+## Constructors
+
+### new BranchContents()
+
+```ts
+new BranchContents(): BranchContents
+```
+
+#### Returns
+
+[`BranchContents`](BranchContents.md)
+
+## Properties
+
+### manifestSize
+
+```ts
+manifestSize: number;
+```
+
+***
+
+### parentBranch?
+
+```ts
+optional parentBranch: string;
+```
+
+***
+
+### parentVersion
+
+```ts
+parentVersion: number;
+```
--- a/docs/src/js/classes/Branches.md
+++ b/docs/src/js/classes/Branches.md
@@ -0,0 +1,96 @@
+[**@lancedb/lancedb**](../README.md) • **Docs**
+
+***
+
+[@lancedb/lancedb](../globals.md) / Branches
+
+# Class: Branches
+
+Branch manager for a [Table](Table.md).
+
+Unlike tags, `create` and `checkout` return a new [Table](Table.md) handle scoped
+to the branch; writes on it do not affect `main`.
+
+## Methods
+
+### checkout()
+
+```ts
+checkout(name, version?): Promise<Table>
+```
+
+Check out an existing branch and return a handle scoped to it.
+
+With `version` set, the returned handle is pinned to that version of the
+branch (a read-only, detached view); otherwise it tracks the branch's
+latest and stays writable.
+
+#### Parameters
+
+* **name**: `string`
+
+* **version?**: `number`
+
+#### Returns
+
+`Promise`&lt;[`Table`](Table.md)&gt;
+
+***
+
+### create()
+
+```ts
+create(
+   name,
+   fromRef?,
+   fromVersion?): Promise<Table>
+```
+
+Create a branch and return a handle scoped to it.
+
+#### Parameters
+
+* **name**: `string`
+    Name of the new branch.
+
+* **fromRef?**: `string`
+    Source branch to fork from. Defaults to `main`.
+
+* **fromVersion?**: `number`
+    A specific version on `fromRef`. Defaults to latest.
+
+#### Returns
+
+`Promise`&lt;[`Table`](Table.md)&gt;
+
+***
+
+### delete()
+
+```ts
+delete(name): Promise<void>
+```
+
+Delete a branch.
+
+#### Parameters
+
+* **name**: `string`
+
+#### Returns
+
+`Promise`&lt;`void`&gt;
+
+***
+
+### list()
+
+```ts
+list(): Promise<Record<string, BranchContents>>
+```
+
+List all branches, mapping name to branch metadata.
+
+#### Returns
+
+`Promise`&lt;`Record`&lt;`string`, [`BranchContents`](BranchContents.md)&gt;&gt;
--- a/docs/src/js/classes/Connection.md
+++ b/docs/src/js/classes/Connection.md
@@ -441,18 +441,28 @@ Open a table in the database.

 ```ts
 abstract renameTable(
-   oldName,
+   currentName,
   newName,
-   namespacePath?): Promise<void>
+   options?): Promise<void>
 ```

+Rename a table.
+
+Currently only supported by LanceDB Cloud. Local OSS connections and
+namespace-backed connections (via [connectNamespace](../functions/connectNamespace.md)) reject with
+a "not supported" error.
+
 #### Parameters

-* **oldName**: `string`
+* **currentName**: `string`
+    The current name of the table.

 * **newName**: `string`
+    The new name for the table.

-* **namespacePath?**: `string`[]
+* **options?**: [`RenameTableOptions`](../interfaces/RenameTableOptions.md)
+    Optional namespace paths. When
+    `newNamespacePath` is omitted the table stays in `namespacePath`.

 #### Returns

--- a/docs/src/js/classes/Index.md
+++ b/docs/src/js/classes/Index.md
@@ -57,6 +57,24 @@ block size may be added in the future.

 ***

+### fm()
+
+```ts
+static fm(): Index
+```
+
+Create an FM-Index.
+
+An FM-Index is a scalar index on string or binary columns that accelerates
+substring search, i.e. `contains(col, 'needle')`. Unlike the tokenized
+full-text-search index, it matches arbitrary substrings of the raw bytes.
+
+#### Returns
+
+[`Index`](Index.md)
+
+***
+
 ### fts()

 ```ts
--- a/docs/src/js/classes/MergeInsertBuilder.md
+++ b/docs/src/js/classes/MergeInsertBuilder.md
@@ -76,6 +76,57 @@ the query optimizer chooses a suboptimal path.

 ***

+### useLsmWrite()
+
+```ts
+useLsmWrite(useLsmWrite): MergeInsertBuilder
+```
+
+Controls whether the merge uses the MemWAL LSM write path.
+
+By default (unset), a `mergeInsert` on a table with an LSM write spec is
+routed through Lance's MemWAL shard writer, and a table without one uses
+the standard path. Pass `false` to force the standard path even when a
+spec is set. Pass `true` to require a spec — `mergeInsert` rejects if none
+is installed.
+
+#### Parameters
+
+* **useLsmWrite**: `boolean`
+    Whether to use the LSM write path.
+
+#### Returns
+
+[`MergeInsertBuilder`](MergeInsertBuilder.md)
+
+***
+
+### validateSingleShard()
+
+```ts
+validateSingleShard(validateSingleShard): MergeInsertBuilder
+```
+
+Controls how an LSM merge checks that its input targets a single shard.
+
+When a table has an LSM write spec, every row in a `mergeInsert` call must
+route to the same shard. When `true` (the default), every row is inspected
+to verify this. When `false`, only the first row is inspected and the
+shard it routes to is used for the whole input — a faster path for callers
+that have already pre-sharded their input. Has no effect on tables without
+an LSM write spec.
+
+#### Parameters
+
+* **validateSingleShard**: `boolean`
+    Whether to check every row routes to one shard. Defaults to `true`.
+
+#### Returns
+
+[`MergeInsertBuilder`](MergeInsertBuilder.md)
+
+***
+
 ### whenMatchedUpdateAll()

 ```ts
--- a/docs/src/js/classes/Table.md
+++ b/docs/src/js/classes/Table.md
@@ -110,6 +110,23 @@ containing the new version number of the table after altering the columns.

 ***

+### branches()
+
+```ts
+abstract branches(): Promise<Branches>
+```
+
+Get the branch manager for this table.
+
+Branches are isolated, writable lines of history forked from another
+branch (or version). Writes on a branch do not affect `main`.
+
+#### Returns
+
+`Promise`&lt;[`Branches`](Branches.md)&gt;
+
+***
+
 ### checkout()

 ```ts
@@ -187,6 +204,25 @@ Any attempt to use the table after it is closed will result in an error.

 ***

+### closeLsmWriters()
+
+```ts
+abstract closeLsmWriters(): Promise<void>
+```
+
+Drain and close any cached MemWAL shard writers held for this table.
+
+When an [LsmWriteSpec](../interfaces/LsmWriteSpec.md) is installed, `mergeInsert` opens MemWAL
+shard writers and caches them for reuse across calls. This closes them,
+flushing pending data; writers reopen lazily on the next `mergeInsert`.
+It is a no-op when no writers are cached.
+
+#### Returns
+
+`Promise`&lt;`void`&gt;
+
+***
+
 ### countRows()

 ```ts
@@ -259,6 +295,23 @@ await table.createIndex("my_float_col");

 ***

+### currentBranch()
+
+```ts
+abstract currentBranch(): null | string
+```
+
+The branch this table handle is scoped to, or `null` for the main branch.
+
+A handle returned by [Branches.create](Branches.md#create) or [Branches.checkout](Branches.md#checkout)
+reports the branch it targets; a handle opened normally reports `null`.
+
+#### Returns
+
+`null` \| `string`
+
+***
+
 ### delete()

 ```ts
@@ -975,6 +1028,29 @@ based on the row being updated (e.g. "my_col + 1")

 ***

+### updateFieldMetadata()
+
+```ts
+abstract updateFieldMetadata(updates): Promise<UpdateFieldMetadataResult>
+```
+
+Update per-field (column) metadata.
+
+#### Parameters
+
+* **updates**: [`FieldMetadataUpdate`](../interfaces/FieldMetadataUpdate.md)[]
+    One or more per-field updates. Each
+    update's metadata is merged into the field's existing metadata by default;
+    a value of `null` deletes that key, and `replace: true` swaps the whole map.
+
+#### Returns
+
+`Promise`&lt;[`UpdateFieldMetadataResult`](../interfaces/UpdateFieldMetadataResult.md)&gt;
+
+Resolves to a result whose `version` is the new table version.
+
+***
+
 ### vectorSearch()

 ```ts
--- a/docs/src/js/globals.md
+++ b/docs/src/js/globals.md
@@ -12,7 +12,6 @@
 ## Enumerations

 - [FullTextQueryType](enumerations/FullTextQueryType.md)
- [OAuthFlowType](enumerations/OAuthFlowType.md)
 - [Occur](enumerations/Occur.md)
 - [Operator](enumerations/Operator.md)

@@ -20,6 +19,8 @@

 - [BooleanQuery](classes/BooleanQuery.md)
 - [BoostQuery](classes/BoostQuery.md)
+- [BranchContents](classes/BranchContents.md)
+- [Branches](classes/Branches.md)
 - [Connection](classes/Connection.md)
 - [HeaderProvider](classes/HeaderProvider.md)
 - [Index](classes/Index.md)
@@ -66,6 +67,7 @@
 - [DropNamespaceOptions](interfaces/DropNamespaceOptions.md)
 - [DropNamespaceResponse](interfaces/DropNamespaceResponse.md)
 - [ExecutableQuery](interfaces/ExecutableQuery.md)
+- [FieldMetadataUpdate](interfaces/FieldMetadataUpdate.md)
 - [FragmentStatistics](interfaces/FragmentStatistics.md)
 - [FragmentSummaryStats](interfaces/FragmentSummaryStats.md)
 - [FtsOptions](interfaces/FtsOptions.md)
@@ -83,13 +85,12 @@
 - [ListNamespacesResponse](interfaces/ListNamespacesResponse.md)
 - [LsmWriteSpec](interfaces/LsmWriteSpec.md)
 - [MergeResult](interfaces/MergeResult.md)
- [NativeOAuthConfig](interfaces/NativeOAuthConfig.md)
- [OAuthConfig](interfaces/OAuthConfig.md)
 - [OpenTableOptions](interfaces/OpenTableOptions.md)
 - [OptimizeOptions](interfaces/OptimizeOptions.md)
 - [OptimizeStats](interfaces/OptimizeStats.md)
 - [QueryExecutionOptions](interfaces/QueryExecutionOptions.md)
 - [RemovalStats](interfaces/RemovalStats.md)
+- [RenameTableOptions](interfaces/RenameTableOptions.md)
 - [RestNamespaceConfig](interfaces/RestNamespaceConfig.md)
 - [RetryConfig](interfaces/RetryConfig.md)
 - [ScannableOptions](interfaces/ScannableOptions.md)
@@ -103,10 +104,12 @@
 - [TimeoutConfig](interfaces/TimeoutConfig.md)
 - [TlsConfig](interfaces/TlsConfig.md)
 - [TokenResponse](interfaces/TokenResponse.md)
+- [UpdateFieldMetadataResult](interfaces/UpdateFieldMetadataResult.md)
 - [UpdateOptions](interfaces/UpdateOptions.md)
 - [UpdateResult](interfaces/UpdateResult.md)
 - [Version](interfaces/Version.md)
 - [WriteExecutionOptions](interfaces/WriteExecutionOptions.md)
+- [WriteProgress](interfaces/WriteProgress.md)

 ## Type Aliases

--- a/docs/src/js/interfaces/AddDataOptions.md
+++ b/docs/src/js/interfaces/AddDataOptions.md
@@ -19,3 +19,39 @@ mode: "append" | "overwrite";
 If "append" (the default) then the new data will be added to the table

 If "overwrite" then the new data will replace the existing data in the table.
+
+***
+
+### progress()
+
+```ts
+progress: (progress) => void;
+```
+
+Optional callback invoked periodically with write progress.
+
+The callback is fired once per batch written and once more with
+`done: true` when the write completes. Calls are dispatched
+asynchronously to the JS event loop and never block the write — a slow
+callback will queue events rather than back-pressure the writer.
+
+Errors thrown from the callback are logged with `console.warn` and
+swallowed — they do not abort the write.
+
+#### Parameters
+
+* **progress**: [`WriteProgress`](WriteProgress.md)
+
+#### Returns
+
+`void`
+
+#### Example
+
+```ts
+await table.add(data, {
+  progress: (p) => {
+    console.log(`${p.outputRows}/${p.totalRows ?? "?"} rows`);
+  },
+});
+```
--- a/docs/src/js/interfaces/ConnectionOptions.md
+++ b/docs/src/js/interfaces/ConnectionOptions.md
@@ -64,34 +64,26 @@ client used by manifest-enabled native connections.

 ***

-### oauthConfig?
-
-```ts
-optional oauthConfig: NativeOAuthConfig;
-```
-
-(For LanceDB cloud only): OAuth configuration for IdP-based
-authentication (e.g., Azure Entra ID). When set, token acquisition
-and refresh are handled entirely in Rust.
-
-***
-
 ### readConsistencyInterval?

 ```ts
 optional readConsistencyInterval: number;
 ```

-(For LanceDB OSS only): The interval, in seconds, at which to check for
-updates to the table from other processes. If None, then consistency is not
-checked. For performance reasons, this is the default. For strong
-consistency, set this to zero seconds. Then every read will check for
-updates from other processes. As a compromise, you can set this to a
-non-zero value for eventual consistency. If more than that interval
-has passed since the last check, then the table will be checked for updates.
-Note: this consistency only applies to read operations. Write operations are
+The interval, in seconds, at which to check for updates to the table
+from other processes. If unset, then consistency is not checked. For
+performance reasons, this is the default. For strong consistency, set
+this to zero seconds. Then every read will check for updates from other
+processes. As a compromise, you can set this to a non-zero value for
+eventual consistency. If more than that interval has passed since the
+last check, then the table will be checked for updates. Note: this
+consistency only applies to read operations. Write operations are
 always consistent.

+Stronger consistency is not free. The smaller the interval, the more
+often each read pays the cost of checking for updates against object
+storage, raising per-read latency and cost.
+
 ***

 ### region?
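The `readConsistencyInterval` text above describes three regimes: unset, zero, and a positive interval. A small Python model of that decision, written as a sketch under assumptions rather than the SDK's implementation; the function name and the choice of `>=` at the exact boundary are not taken from the source:

```python
from typing import Optional


def should_check_for_updates(
    interval_s: Optional[float], seconds_since_last_check: float
) -> bool:
    """Illustrative model of the documented read-consistency behavior."""
    if interval_s is None:
        # Unset (the default): reads never check for updates.
        return False
    # Zero means every read checks (strong consistency); a positive value
    # means a check happens only once the interval has elapsed.
    return seconds_since_last_check >= interval_s


if __name__ == "__main__":
    print(should_check_for_updates(None, 3600.0))
    print(should_check_for_updates(0, 0.0))
    print(should_check_for_updates(5, 3.0))
    print(should_check_for_updates(5, 6.0))
```

The cost trade-off in the new paragraph follows directly: the smaller the interval, the more reads take the branch that checks object storage.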
--- a/docs/src/js/interfaces/FieldMetadataUpdate.md
+++ b/docs/src/js/interfaces/FieldMetadataUpdate.md
@@ -0,0 +1,41 @@
+[**@lancedb/lancedb**](../README.md) • **Docs**
+
+***
+
+[@lancedb/lancedb](../globals.md) / FieldMetadataUpdate
+
+# Interface: FieldMetadataUpdate
+
+A per-field metadata update, addressed by dot-path.
+
+## Properties
+
+### metadata
+
+```ts
+metadata: Record<string, null | string>;
+```
+
+Metadata key/value pairs. Merged into the field's existing metadata by
+default; a value of `null` deletes that key.
+
+***
+
+### path
+
+```ts
+path: string;
+```
+
+Dot-separated path to the field. For a top-level column this is just its
+name; for a nested field it's the path, e.g. "a.b.c".
+
+***
+
+### replace?
+
+```ts
+optional replace: boolean;
+```
+
+If true, replace the field's entire metadata map instead of merging.
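The merge, delete, and replace rules for `FieldMetadataUpdate` can be modeled in a few lines. This is a Python sketch of the documented semantics only, not the binding's code; `apply_update` is a hypothetical name, and dropping `null` values when `replace` is true is an assumption the docs do not spell out:

```python
from typing import Dict, Optional


def apply_update(
    existing: Dict[str, str],
    metadata: Dict[str, Optional[str]],
    replace: bool = False,
) -> Dict[str, str]:
    # replace=True starts from an empty map; otherwise merge into a copy.
    result: Dict[str, str] = {} if replace else dict(existing)
    for key, value in metadata.items():
        if value is None:
            # A null value deletes the key (assumed to be a no-op on replace).
            result.pop(key, None)
        else:
            result[key] = value
    return result


if __name__ == "__main__":
    current = {"unit": "ms", "owner": "search"}
    print(apply_update(current, {"unit": "s", "owner": None}))
    print(apply_update(current, {"unit": "s"}, replace=True))
```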
--- a/docs/src/js/interfaces/IndexConfig.md
+++ b/docs/src/js/interfaces/IndexConfig.md
@@ -23,6 +23,31 @@ be more columns to represent composite indices.

 ***

+### createdAt?
+
+```ts
+optional createdAt: Date;
+```
+
+When the index was created.
+
+`undefined` for remote tables or indices created before timestamps were tracked.
+
+***
+
+### indexDetails?
+
+```ts
+optional indexDetails: any;
+```
+
+Index-type-specific details parsed as a JavaScript object.
+
+Falls back to a raw string if JSON parsing fails. `undefined` for
+remote tables or when details are unavailable.
+
+***
+
 ### indexType

 ```ts
@@ -33,6 +58,30 @@ The type of the index

 ***

+### indexUuid?
+
+```ts
+optional indexUuid: string;
+```
+
+The UUID of the first segment of the index.
+
+`undefined` for remote tables, which do not yet surface this.
+
+***
+
+### indexVersion?
+
+```ts
+optional indexVersion: number;
+```
+
+The on-disk index format version.
+
+`undefined` for remote tables.
+
+***
+
 ### name

 ```ts
@@ -40,3 +89,63 @@ name: string;
 ```

 The name of the index
+
+***
+
+### numIndexedRows?
+
+```ts
+optional numIndexedRows: number;
+```
+
+The number of rows indexed, across all segments.
+
+`undefined` for remote tables.
+
+***
+
+### numSegments?
+
+```ts
+optional numSegments: number;
+```
+
+The number of segments that make up the index.
+
+`undefined` for remote tables.
+
+***
+
+### numUnindexedRows?
+
+```ts
+optional numUnindexedRows: number;
+```
+
+The number of rows not yet covered by this index.
+
+`undefined` for remote tables.
+
+***
+
+### sizeBytes?
+
+```ts
+optional sizeBytes: number;
+```
+
+The total size in bytes of all index files across all segments.
+
+`undefined` for remote tables or indices without size tracking.
+
+***
+
+### typeUrl?
+
+```ts
+optional typeUrl: string;
+```
+
+The protobuf type URL, a precise type identifier for the index.
+
+`undefined` for remote tables.
--- a/docs/src/js/interfaces/IndexStatistics.md
+++ b/docs/src/js/interfaces/IndexStatistics.md
@@ -30,17 +30,6 @@ The type of the index

 ***

-### loss?
-
-```ts
-optional loss: number;
-```
-
-The KMeans loss value of the index,
-it is only present for vector indices.
-
-***
-
 ### numIndexedRows

 ```ts
--- a/docs/src/js/interfaces/LsmWriteSpec.md
+++ b/docs/src/js/interfaces/LsmWriteSpec.md
@@ -11,7 +11,10 @@ Specification selecting Lance's MemWAL LSM-style write path for

 `specType` is `"bucket"`, `"identity"`, or `"unsharded"`. For `"bucket"`,
 `column` and `numBuckets` are required; for `"identity"`, `column` is
-required.
+required and must be a deterministic function of the unenforced primary
+key (every row with a given primary key must always produce the same
+`column` value, or upserts of that key can land in different shards and a
+stale version can win).

 ## Properties

--- a/docs/src/js/interfaces/MergeResult.md
+++ b/docs/src/js/interfaces/MergeResult.md
@@ -32,6 +32,14 @@ numInsertedRows: number;

 ***

+### numRows
+
+```ts
+numRows: number;
+```
+
+***
+
 ### numUpdatedRows

 ```ts
--- a/docs/src/js/interfaces/OpenTableOptions.md
+++ b/docs/src/js/interfaces/OpenTableOptions.md
@@ -8,6 +8,18 @@

 ## Properties

+### branch?
+
+```ts
+optional branch: string;
+```
+
+Open the table scoped to this branch instead of the default branch.
+
+Reads and writes on the returned table operate in the branch's context.
+
+***
+
 ### ~~indexCacheSize?~~

 ```ts
@@ -43,3 +55,17 @@ Options already set on the connection will be inherited by the table,
 but can be overridden here.

 The available options are described at https://docs.lancedb.com/storage/
+
+***
+
+### version?
+
+```ts
+optional version: number;
+```
+
+Open the table pinned to this version, producing a read-only view.
+
+Composes with [OpenTableOptions.branch](OpenTableOptions.md#branch): when both are set, opens
+that branch at the version; otherwise opens `main` at the version. Call
+`checkoutLatest` to return to a writable state.
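How `branch` and `version` compose in `OpenTableOptions` can be summarized with a small model. This is a Python sketch, not the SDK; `OpenTarget` and `resolve_open_target` are hypothetical names, and treating `branch: "main"` the same as no branch is inferred from the remote test in this change, where `currentBranch()` is `null` in both cases:

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class OpenTarget:
    branch: Optional[str]   # None stands for main
    version: Optional[int]  # None stands for the latest version
    writable: bool


def resolve_open_target(
    branch: Optional[str] = None, version: Optional[int] = None
) -> OpenTarget:
    # "main" and an omitted branch both report no branch scope.
    scoped = None if branch in (None, "main") else branch
    # Pinning a version yields a read-only view; latest stays writable.
    return OpenTarget(branch=scoped, version=version, writable=version is None)


if __name__ == "__main__":
    print(resolve_open_target(version=2))
    print(resolve_open_target(branch="main", version=2))
    print(resolve_open_target(branch="exp"))
```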
--- a/docs/src/js/interfaces/RenameTableOptions.md
+++ b/docs/src/js/interfaces/RenameTableOptions.md
@@ -0,0 +1,29 @@
+[**@lancedb/lancedb**](../README.md) • **Docs**
+
+***
+
+[@lancedb/lancedb](../globals.md) / RenameTableOptions
+
+# Interface: RenameTableOptions
+
+## Properties
+
+### namespacePath?
+
+```ts
+optional namespacePath: string[];
+```
+
+The namespace path of the table being renamed. Defaults to the root
+namespace (`[]`) when omitted.
+
+***
+
+### newNamespacePath?
+
+```ts
+optional newNamespacePath: string[];
+```
+
+The namespace path to move the table to as part of the rename. When
+omitted the table stays in `namespacePath`.
--- a/docs/src/js/interfaces/UpdateFieldMetadataResult.md
+++ b/docs/src/js/interfaces/UpdateFieldMetadataResult.md
@@ -0,0 +1,15 @@
+[**@lancedb/lancedb**](../README.md) • **Docs**
+
+***
+
+[@lancedb/lancedb](../globals.md) / UpdateFieldMetadataResult
+
+# Interface: UpdateFieldMetadataResult
+
+## Properties
+
+### version
+
+```ts
+version: number;
+```
--- a/docs/src/js/interfaces/WriteProgress.md
+++ b/docs/src/js/interfaces/WriteProgress.md
@@ -0,0 +1,84 @@
+[**@lancedb/lancedb**](../README.md) • **Docs**
+
+***
+
+[@lancedb/lancedb](../globals.md) / WriteProgress
+
+# Interface: WriteProgress
+
+Progress snapshot for a write operation, delivered to the `progress`
+callback passed to [Table.add](../classes/Table.md#add).
+
+## Properties
+
+### activeTasks
+
+```ts
+activeTasks: number;
+```
+
+Number of parallel write tasks currently in flight.
+
+***
+
+### done
+
+```ts
+done: boolean;
+```
+
+`true` for the final callback; `false` otherwise.
+
+***
+
+### elapsedSeconds
+
+```ts
+elapsedSeconds: number;
+```
+
+Wall-clock seconds since the write started.
+
+***
+
+### outputBytes
+
+```ts
+outputBytes: number;
+```
+
+Number of bytes written so far.
+
+***
+
+### outputRows
+
+```ts
+outputRows: number;
+```
+
+Number of rows written so far.
+
+***
+
+### totalRows?
+
+```ts
+optional totalRows: number;
+```
+
+Total rows expected, when the input source reports it.
+
+Always set on the final callback (the one with `done: true`), falling
+back to the actual number of rows written when the source could not
+report a row count up front.
+
+***
+
+### totalTasks
+
+```ts
+totalTasks: number;
+```
+
+Total number of parallel write tasks (the write parallelism).
--- a/docs/src/python/python.md
+++ b/docs/src/python/python.md
@@ -166,6 +166,12 @@ lists the indices that LanceDb supports.

 ::: lancedb.index.IvfFlat

+::: lancedb.index.IvfSq
+
+::: lancedb.index.IvfRq
+
+::: lancedb.index.HnswFlat
+
 ::: lancedb.table.IndexStatistics

 ## Querying (Asynchronous)
--- a/java/lancedb-core/pom.xml
+++ b/java/lancedb-core/pom.xml
@@ -8,7 +8,7 @@
    <parent>
      <groupId>com.lancedb</groupId>
      <artifactId>lancedb-parent</artifactId>
-      <version>0.29.1-beta.0</version>
+      <version>0.30.1-beta.2</version>
      <relativePath>../pom.xml</relativePath>
    </parent>

--- a/java/pom.xml
+++ b/java/pom.xml
@@ -6,7 +6,7 @@

    <groupId>com.lancedb</groupId>
    <artifactId>lancedb-parent</artifactId>
-    <version>0.29.1-beta.0</version>
+    <version>0.30.1-beta.2</version>
    <packaging>pom</packaging>
    <name>${project.artifactId}</name>
    <description>LanceDB Java SDK Parent POM</description>
@@ -28,7 +28,7 @@
    <properties>
        <project.build.sourceEncoding>UTF-8</project.build.sourceEncoding>
        <arrow.version>15.0.0</arrow.version>
-        <lance-core.version>7.0.0-beta.13</lance-core.version>
+        <lance-core.version>8.0.0-beta.12</lance-core.version>
        <spotless.skip>false</spotless.skip>
        <spotless.version>2.30.0</spotless.version>
        <spotless.java.googlejavaformat.version>1.7</spotless.java.googlejavaformat.version>
--- a/nodejs/Cargo.toml
+++ b/nodejs/Cargo.toml
@@ -1,7 +1,7 @@
 [package]
 name = "lancedb-nodejs"
 edition.workspace = true
-version = "0.29.1-beta.0"
+version = "0.30.1-beta.2"
 publish = false
 license.workspace = true
 description.workspace = true
@@ -25,8 +25,12 @@ lancedb = { path = "../rust/lancedb", default-features = false }
 lance-namespace.workspace = true
 napi = { version = "3.8.3", default-features = false, features = [
    "napi9",
-    "async"
+    "async",
+    "chrono_date",
+    "serde-json",
 ] }
+chrono = { version = "0.4", default-features = false, features = ["clock"] }
+serde_json = "1"
 napi-derive = "3.5.2"
 # Prevent dynamic linking of lzma, which comes from datafusion
 lzma-sys = { version = "0.1", features = ["static"] }
--- a/nodejs/test/connection.test.ts
+++ b/nodejs/test/connection.test.ts
@@ -47,6 +47,14 @@ describe("given a connection", () => {
    await db.close();
    expect(db.isOpen()).toBe(false);
    await expect(db.tableNames()).rejects.toThrow("Connection is closed");
+    await expect(db.renameTable("a", "b")).rejects.toThrow(
+      "Connection is closed",
+    );
+  });
+
+  it("should report renameTable as unsupported on an OSS connection", async () => {
+    await db.createTable("a", [{ id: 1 }]);
+    await expect(db.renameTable("a", "b")).rejects.toThrow(/not supported/);
  });
  it("should be able to create a table from an object arg `createTable(options)`, or args `createTable(name, data, options)`", async () => {
    let tbl = await db.createTable("test", [{ id: 1 }, { id: 2 }]);
@@ -81,16 +89,6 @@ describe("given a connection", () => {
    await db.createTable("test4", [{ id: 1 }, { id: 2 }]);
  });

-  it("should expose renameTable and reject on OSS listing DB", async () => {
-    await db.createTable("old_name", [{ id: 1 }]);
-
-    await expect(db.renameTable("old_name", "new_name")).rejects.toThrow(
-      "rename_table is not supported in LanceDB OSS",
-    );
-
-    await expect(db.tableNames()).resolves.toEqual(["old_name"]);
-  });
-
  it("should fail if creating table twice, unless overwrite is true", async () => {
    let tbl = await db.createTable("test", [{ id: 1 }, { id: 2 }]);
    await expect(tbl.countRows()).resolves.toBe(2);
@@ -173,18 +171,22 @@ describe("given a connection", () => {

    let manifestDir =
      tmpDir.name + "/test_manifest_paths_v2_empty.lance/_versions";
-    readdirSync(manifestDir).forEach((file) => {
-      expect(file).toMatch(/^\d{20}\.manifest$/);
-    });
+    readdirSync(manifestDir)
+      .filter((f) => f.endsWith(".manifest"))
+      .forEach((file) => {
+        expect(file).toMatch(/^\d{20}\.manifest$/);
+      });

    table = (await db.createTable("test_manifest_paths_v2", [{ id: 1 }], {
      enableV2ManifestPaths: true,
    })) as LocalTable;
    expect(await table.usesV2ManifestPaths()).toBe(true);
    manifestDir = tmpDir.name + "/test_manifest_paths_v2.lance/_versions";
-    readdirSync(manifestDir).forEach((file) => {
-      expect(file).toMatch(/^\d{20}\.manifest$/);
-    });
+    readdirSync(manifestDir)
+      .filter((f) => f.endsWith(".manifest"))
+      .forEach((file) => {
+        expect(file).toMatch(/^\d{20}\.manifest$/);
+      });
  });

  it("should be able to migrate tables to the V2 manifest paths", async () => {
@@ -201,16 +203,20 @@ describe("given a connection", () => {

    const manifestDir =
      tmpDir.name + "/test_manifest_path_migration.lance/_versions";
-    readdirSync(manifestDir).forEach((file) => {
-      expect(file).toMatch(/^\d\.manifest$/);
-    });
+    readdirSync(manifestDir)
+      .filter((f) => f.endsWith(".manifest"))
+      .forEach((file) => {
+        expect(file).toMatch(/^\d\.manifest$/);
+      });

    await table.migrateManifestPathsV2();
    expect(await table.usesV2ManifestPaths()).toBe(true);

-    readdirSync(manifestDir).forEach((file) => {
-      expect(file).toMatch(/^\d{20}\.manifest$/);
-    });
+    readdirSync(manifestDir)
+      .filter((f) => f.endsWith(".manifest"))
+      .forEach((file) => {
+        expect(file).toMatch(/^\d{20}\.manifest$/);
+      });
  });
 });
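
Reviewer note: the hunks above narrow each `readdirSync(manifestDir)` listing to `.manifest` files before asserting the naming pattern, presumably so that other entries in `_versions` no longer fail the check. A minimal sketch of that filter-then-assert idea follows. The helper names `manifestFiles` and `allUseV2Naming` are hypothetical and not part of LanceDB, and the example file names are invented for illustration.

```typescript
// Hypothetical helpers, not part of the LanceDB API.
// V2 manifest paths are asserted in the tests as 20 digits + ".manifest".
const V2_MANIFEST = /^\d{20}\.manifest$/;

// Keep only manifest files; any other directory entry is ignored.
function manifestFiles(entries: string[]): string[] {
  return entries.filter((f) => f.endsWith(".manifest"));
}

// True when every manifest file in the listing uses the V2 naming scheme.
function allUseV2Naming(entries: string[]): boolean {
  return manifestFiles(entries).every((f) => V2_MANIFEST.test(f));
}
```

With this shape, a non-manifest entry in the directory is skipped, while a short V1-style name such as `1.manifest` still fails the V2 check.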

--- a/nodejs/test/remote.test.ts
+++ b/nodejs/test/remote.test.ts
@@ -191,6 +191,40 @@ describe("remote connection", () => {
    );
  });

+  it("supports version time-travel and branches on remote", async () => {
+    await withMockDatabase(
+      (req, res) => {
+        const body = req.url?.includes("/branches/list")
+          ? JSON.stringify({
+              branches: {
+                exp: { parentVersion: 1, createAt: 1, manifestSize: 1 },
+              },
+            })
+          : JSON.stringify({ name: "t", version: 2, schema: { fields: [] } });
+        res.writeHead(200, { "Content-Type": "application/json" }).end(body);
+      },
+      async (db) => {
+        // version-only (and "main" + version) time-travel the main chain
+        const v2 = await db.openTable("t", undefined, { version: 2 });
+        expect(v2.currentBranch()).toBeNull();
+        const mainV2 = await db.openTable("t", undefined, {
+          branch: "main",
+          version: 2,
+        });
+        expect(mainV2.currentBranch()).toBeNull();
+
+        // a non-main branch opens a handle scoped to that branch
+        const exp = await db.openTable("t", undefined, { branch: "exp" });
+        expect(exp.currentBranch()).toBe("exp");
+        const expV2 = await db.openTable("t", undefined, {
+          branch: "exp",
+          version: 2,
+        });
+        expect(expV2.currentBranch()).toBe("exp");
+      },
+    );
+  });
+
  describe("TlsConfig", () => {
    it("should create TlsConfig with all fields", () => {
      const tlsConfig: TlsConfig = {
@@ -617,4 +651,68 @@ describe("remote connection", () => {
      );
    });
  });
+
+  describe("renameTable", () => {
+    async function captureRenameRequest(
+      call: (db: Connection) => Promise<void>,
+    ): Promise<{ url: string; body: Record<string, unknown> }> {
+      let captured: { url: string; body: Record<string, unknown> } | undefined;
+      await withMockDatabase((req, res) => {
+        let raw = "";
+        req.on("data", (chunk) => {
+          raw += chunk;
+        });
+        req.on("end", () => {
+          captured = {
+            url: req.url ?? "",
+            body: raw ? JSON.parse(raw) : {},
+          };
+          res.writeHead(200, { "Content-Type": "application/json" }).end("");
+        });
+      }, call);
+      if (!captured) {
+        throw new Error("mock server never saw a request");
+      }
+      return captured;
+    }
+
+    it("sends rename request for a table in the root namespace", async () => {
+      const { url, body } = await captureRenameRequest(async (db) => {
+        await db.renameTable("table1", "table2");
+      });
+      expect(url).toBe("/v1/table/table1/rename/");
+      // biome-ignore lint/style/useNamingConvention: snake_case mandated by the server wire format
+      expect(body).toEqual({ new_table_name: "table2" });
+    });
+
+    it("omits new_namespace when only the current namespace is supplied", async () => {
+      // Safe-default check: passing namespacePath alone must not send
+      // `new_namespace`, so the server keeps the table in its current
+      // namespace instead of silently moving it to root.
+      const { url, body } = await captureRenameRequest(async (db) => {
+        await db.renameTable("table1", "table2", {
+          namespacePath: ["ns1"],
+        });
+      });
+      expect(url).toBe("/v1/table/ns1$table1/rename/");
+      // biome-ignore lint/style/useNamingConvention: snake_case mandated by the server wire format
+      expect(body).toEqual({ new_table_name: "table2" });
+    });
+
+    it("includes new_namespace in the body for a cross-namespace rename", async () => {
+      const { url, body } = await captureRenameRequest(async (db) => {
+        await db.renameTable("table1", "table2", {
+          namespacePath: ["ns1"],
+          newNamespacePath: ["ns2"],
+        });
+      });
+      expect(url).toBe("/v1/table/ns1$table1/rename/");
+      expect(body).toEqual({
+        // biome-ignore lint/style/useNamingConvention: snake_case mandated by the server wire format
+        new_table_name: "table2",
+        // biome-ignore lint/style/useNamingConvention: snake_case mandated by the server wire format
+        new_namespace: ["ns2"],
+      });
+    });
+  });
 });
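
Reviewer note: the three `renameTable` tests above pin down a wire format: the namespace path is folded into the URL, and `new_namespace` is sent only for a cross-namespace rename. The sketch below reproduces just that mapping as a pure function. `buildRenameRequest` is a hypothetical name, and joining path segments with `$` is an assumption inferred from the expected URLs in the tests, not a confirmed description of the client internals.

```typescript
interface RenameOptions {
  namespacePath?: string[];
  newNamespacePath?: string[];
}

interface RenameRequest {
  url: string;
  body: Record<string, unknown>;
}

// Hypothetical helper mirroring what the tests expect on the wire.
function buildRenameRequest(
  currentName: string,
  newName: string,
  options: RenameOptions = {},
): RenameRequest {
  // Assumption: namespace segments and the table name are joined with "$".
  const id = [...(options.namespacePath ?? []), currentName].join("$");
  const body: Record<string, unknown> = { new_table_name: newName };
  // Only send new_namespace when the caller asked for a move, so the
  // server keeps the table in its current namespace by default.
  if (options.newNamespacePath !== undefined) {
    body.new_namespace = options.newNamespacePath;
  }
  return { url: `/v1/table/${id}/rename/`, body };
}
```

Omitting `new_namespace` rather than sending an empty array is the safe default the second test checks: an empty array could be read as a request to move the table to the root namespace.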
--- a/nodejs/test/table.test.ts
+++ b/nodejs/test/table.test.ts
@@ -28,6 +28,7 @@ import {
  List,
  Schema,
  SchemaLike,
+  Struct,
  Type,
  Uint8,
  Utf8,
@@ -84,6 +85,140 @@ describe.each([arrow15, arrow16, arrow17, arrow18])(
      await expect(table.countRows()).resolves.toBe(3);
    });

+    it("should support branches", async () => {
+      await table.add([{ id: 1 }]);
+      expect(await table.countRows()).toBe(1);
+
+      expect(table.currentBranch()).toBeNull();
+
+      // fork an isolated, writable branch from main
+      const branch = await (await table.branches()).create("exp");
+      expect(branch.currentBranch()).toBe("exp");
+      expect(await branch.countRows()).toBe(1);
+      await branch.add([{ id: 2 }]);
+      expect(await branch.countRows()).toBe(2);
+      // main is untouched by branch writes
+      expect(await table.countRows()).toBe(1);
+
+      // listed, with main (null) as the parent
+      const list = await (await table.branches()).list();
+      expect(Object.keys(list)).toContain("exp");
+      expect(list["exp"].parentBranch).toBeNull();
+
+      // fromRef="main" is equivalent to the default
+      await (await table.branches()).create("exp2", "main");
+      const list2 = await (await table.branches()).list();
+      expect(list2["exp2"].parentBranch).toBeNull();
+
+      // checkout returns a handle scoped to the branch's latest
+      const checkedOut = await (await table.branches()).checkout("exp");
+      expect(checkedOut.currentBranch()).toBe("exp");
+      expect(await checkedOut.countRows()).toBe(2);
+
+      // delete removes it
+      await (await table.branches()).delete("exp");
+      await (await table.branches()).delete("exp2");
+      const after = await (await table.branches()).list();
+      expect(Object.keys(after)).not.toContain("exp");
+    });
+
+    it("should open a branch via openTable", async () => {
+      const db = await connect(tmpDir.name);
+      await table.add([{ id: 1 }]);
+      const branch = await (await table.branches()).create("exp");
+      await branch.add([{ id: 2 }]);
+
+      // openTable(..., { branch }) returns a handle scoped to the branch
+      const opened = await db.openTable("some_table", undefined, {
+        branch: "exp",
+      });
+      expect(await opened.countRows()).toBe(2);
+      // opening without branch still tracks main
+      expect(await (await db.openTable("some_table")).countRows()).toBe(1);
+    });
+
+    it("should open a branch at a version isolated from main and HEAD", async () => {
+      const db = await connect(tmpDir.name);
+      // main: a single fork-point row
+      const t = await db.createTable("bv_table", [{ id: 0 }]);
+      const mainV1 = await t.version();
+
+      // fork "exp", then advance exp AND main independently past the fork so
+      // they diverge while sharing version numbers
+      const exp = await (await t.branches()).create("exp");
+      await exp.add([{ id: 1 }]); // exp: {0, 1}
+      const expV2 = await exp.version();
+      await exp.add([{ id: 2 }]); // exp HEAD: {0, 1, 2}
+      await t.add([{ id: 100 }, { id: 101 }, { id: 102 }]); // main HEAD: {0,100,101,102}
+      expect(await t.version()).toBe(expV2);
+
+      // open exp at the shared version: the data must be exp's, not main's.
+      // count alone cannot prove this (main@v2 also exists), so assert
+      // provenance by content.
+      const pinned = await db.openTable("bv_table", undefined, {
+        branch: "exp",
+        version: expV2,
+      });
+      expect(await pinned.countRows()).toBe(2); // not exp HEAD (3), not main@v2 (4)
+      expect(await pinned.countRows("id = 1")).toBe(1); // exp's post-fork row
+      expect(await pinned.countRows("id = 100")).toBe(0); // main's rows invisible
+
+      // the same coordinate is reachable directly via branches().checkout(name, version)
+      const pinnedDirect = await (await t.branches()).checkout("exp", expV2);
+      expect(await pinnedDirect.countRows()).toBe(2);
+
+      // the HEADs are unaffected
+      expect(
+        await (
+          await db.openTable("bv_table", undefined, { branch: "exp" })
+        ).countRows(),
+      ).toBe(3);
+      expect(await (await db.openTable("bv_table")).countRows()).toBe(4);
+
+      // version-only (no branch) time-travels main itself: its fork-point
+      // version holds only main's first row, and the shared version number
+      // resolves to main's data, not the branch's ("opens main at the version")
+      const oldMain = await db.openTable("bv_table", undefined, {
+        version: mainV1,
+      });
+      expect(await oldMain.countRows()).toBe(1);
+      const sharedOnMain = await db.openTable("bv_table", undefined, {
+        version: expV2,
+      });
+      expect(await sharedOnMain.countRows()).toBe(4); // main@v2, not exp@v2 (2)
+
+      // detached head: writing to a pinned version is rejected
+      await expect(pinned.add([{ id: 9 }])).rejects.toThrow(
+        /cannot be modified/,
+      );
+
+      // a nonexistent version is rejected -- on main, and on a branch (a
+      // distinct resolution path, on the branch's manifests)
+      await expect(
+        db.openTable("bv_table", undefined, { version: 9999 }),
+      ).rejects.toThrow();
+      await expect(
+        db.openTable("bv_table", undefined, { branch: "exp", version: 9999 }),
+      ).rejects.toThrow();
+
+      // checkoutLatest re-attaches the pinned handle to the BRANCH's HEAD
+      // (writable again), not main's HEAD (4), and not staying pinned (2)
+      await pinned.checkoutLatest();
+      expect(await pinned.countRows()).toBe(3); // exp HEAD
+      await pinned.add([{ id: 3 }]);
+      expect(await pinned.countRows()).toBe(4); // writable again
+    });
+
+    it("rejects invalid branch inputs", async () => {
+      const branches = await table.branches();
+      await expect(branches.create("")).rejects.toThrow("non-empty");
+      await expect(branches.checkout("")).rejects.toThrow("non-empty");
+      await expect(branches.delete("")).rejects.toThrow("non-empty");
+      await expect(branches.create("bad", "main", -1)).rejects.toThrow(
+        "non-negative",
+      );
+    });
+
    it("should show table stats", async () => {
      await table.add([{ id: 1 }, { id: 2 }]);
      await table.add([{ id: 1 }]);
@@ -115,6 +250,48 @@ describe.each([arrow15, arrow16, arrow17, arrow18])(
      await expect(table.countRows()).resolves.toBe(1);
    });

+    it("should invoke the progress callback", async () => {
+      const events: import("../lancedb").WriteProgress[] = [];
+      await table.add([{ id: 1 }, { id: 2 }, { id: 3 }], {
+        progress: (p) => events.push(p),
+      });
+
+      expect(events.length).toBeGreaterThan(0);
+      const last = events[events.length - 1];
+      expect(last.done).toBe(true);
+      // Earlier callbacks must have done=false.
+      for (const ev of events.slice(0, -1)) {
+        expect(ev.done).toBe(false);
+      }
+      // outputRows reflects the rows added in this call, not table size.
+      expect(last.outputRows).toBe(3);
+      // The input source (an array) reports a row count, so totalRows is set.
+      expect(last.totalRows).toBe(3);
+      // outputRows is monotonic.
+      for (let i = 1; i < events.length; i++) {
+        expect(events[i].outputRows).toBeGreaterThanOrEqual(
+          events[i - 1].outputRows,
+        );
+      }
+    });
+
+    it("should swallow errors thrown from the progress callback", async () => {
+      const warn = jest
+        .spyOn(console, "warn")
+        .mockImplementation(() => undefined);
+      try {
+        const res = await table.add([{ id: 1 }, { id: 2 }], {
+          progress: () => {
+            throw new Error("callback bomb");
+          },
+        });
+        expect(res.version).toBeGreaterThan(0);
+        expect(warn).toHaveBeenCalled();
+      } finally {
+        warn.mockRestore();
+      }
+    });
+
    it("should let me close the table", async () => {
      expect(table.isOpen()).toBe(true);
      table.close();
@@ -672,13 +849,15 @@ describe("When creating an index", () => {
    expect(fs.readdirSync(indexDir)).toHaveLength(1);
    const indices = await tbl.listIndices();
    expect(indices.length).toBe(1);
-    expect(indices[0]).toEqual({
-      name: "vec_idx",
-      indexType: "IvfPq",
-      columns: ["vec"],
-    });
+    expect(indices[0]).toEqual(
+      expect.objectContaining({
+        name: "vec_idx",
+        indexType: "IvfPq",
+        columns: ["vec"],
+      }),
+    );
    const stats = await tbl.indexStats("vec_idx");
-    expect(stats?.loss).toBeDefined();
+    expect(stats).toBeDefined();

    // Search without specifying the column
    let rst = await tbl
@@ -738,6 +917,274 @@ describe("When creating an index", () => {
    expect(indices2.length).toBe(0);
  });

+  it("should preserve canonical nested field paths across index lifecycle", async () => {
+    const db = await connect(tmpDir.name);
+    const nestedSchema = new Schema([
+      new Field("rowId", new Int32(), true),
+      new Field("row-id", new Int32(), true),
+      new Field("userId", new Int32(), true),
+      new Field(
+        "metadata",
+        new Struct([new Field("user_id", new Int32(), true)]),
+        true,
+      ),
+      new Field(
+        "MetaData",
+        new Struct([new Field("userId", new Int32(), true)]),
+        true,
+      ),
+      new Field(
+        "image",
+        new Struct([
+          new Field(
+            "embedding",
+            new FixedSizeList(2, new Field("item", new Float32(), true)),
+            true,
+          ),
+        ]),
+        true,
+      ),
+      new Field(
+        "payload",
+        new Struct([new Field("text", new Utf8(), true)]),
+        true,
+      ),
+      new Field(
+        "meta-data",
+        new Struct([new Field("user-id", new Int32(), true)]),
+        true,
+      ),
+      new Field(
+        "literal",
+        new Struct([new Field("a.b", new Int32(), true)]),
+        true,
+      ),
+    ]);
+    const nestedTable = await db.createTable(
+      "nested_field_index_lifecycle",
+      makeArrowTable(
+        Array.from({ length: 300 }, (_, rowId) => ({
+          rowId,
+          "row-id": rowId,
+          userId: rowId,
+          metadata: { ["user_id"]: rowId },
+          ["MetaData"]: { userId: rowId },
+          image: { embedding: [rowId, rowId + 1] },
+          payload: { text: `document ${rowId}` },
+          "meta-data": { "user-id": rowId },
+          literal: { "a.b": rowId },
+        })),
+        { schema: nestedSchema },
+      ),
+    );
+
+    await nestedTable.createIndex("rowId", {
+      config: Index.btree(),
+      name: "row_id_idx",
+    });
+    await nestedTable.createIndex("`row-id`", {
+      config: Index.btree(),
+      name: "row_dash_id_idx",
+    });
+    await nestedTable.createIndex("userId", {
+      config: Index.btree(),
+      name: "top_user_id_idx",
+    });
+    await nestedTable.createIndex("metadata.user_id", {
+      config: Index.btree(),
+      name: "nested_user_id_idx",
+    });
+    await nestedTable.createIndex("MetaData.userId", {
+      config: Index.btree(),
+      name: "mixed_case_metadata_user_id_idx",
+    });
+    await nestedTable.createIndex("`meta-data`.`user-id`", {
+      config: Index.btree(),
+      name: "escaped_names_idx",
+    });
+    await nestedTable.createIndex("literal.`a.b`", {
+      config: Index.btree(),
+      name: "literal_dot_idx",
+    });
+    await nestedTable.createIndex("image.embedding", {
+      name: "image_embedding_idx",
+    });
+    await nestedTable.createIndex("payload.text", {
+      config: Index.fts({ withPosition: false }),
+      name: "payload_text_idx",
+    });
+
+    const indices = await nestedTable.listIndices();
+    expect(indices).toEqual(
+      expect.arrayContaining([
+        expect.objectContaining({
+          name: "row_id_idx",
+          indexType: "BTree",
+          columns: ["rowId"],
+        }),
+        expect.objectContaining({
+          name: "row_dash_id_idx",
+          indexType: "BTree",
+          columns: ["`row-id`"],
+        }),
+        expect.objectContaining({
+          name: "top_user_id_idx",
+          indexType: "BTree",
+          columns: ["userId"],
+        }),
+        expect.objectContaining({
+          name: "nested_user_id_idx",
+          indexType: "BTree",
+          columns: ["metadata.user_id"],
+        }),
+        expect.objectContaining({
+          name: "mixed_case_metadata_user_id_idx",
+          indexType: "BTree",
+          columns: ["MetaData.userId"],
+        }),
+        expect.objectContaining({
+          name: "escaped_names_idx",
+          indexType: "BTree",
+          columns: ["`meta-data`.`user-id`"],
+        }),
+        expect.objectContaining({
+          name: "literal_dot_idx",
+          indexType: "BTree",
+          columns: ["literal.`a.b`"],
+        }),
+        expect.objectContaining({
+          name: "image_embedding_idx",
+          indexType: "IvfPq",
+          columns: ["image.embedding"],
+        }),
+        expect.objectContaining({
+          name: "payload_text_idx",
+          indexType: "FTS",
+          columns: ["payload.text"],
+        }),
+      ]),
+    );
+
+    const stats = await nestedTable.indexStats(
+      "mixed_case_metadata_user_id_idx",
+    );
+    expect(stats?.numIndexedRows).toEqual(300);
+    expect(stats?.indexType).toEqual("BTREE");
+
+    const filtered = await nestedTable
+      .query()
+      .where("MetaData.userId = 42")
+      .limit(1)
+      .toArray();
+    expect(filtered[0].MetaData.userId).toEqual(42);
+
+    const escapedFiltered = await nestedTable
+      .query()
+      .where("`row-id` = 43")
+      .limit(1)
+      .toArray();
+    expect(escapedFiltered[0]["row-id"]).toEqual(43);
+
+    const explicit = await nestedTable
+      .query()
+      .nearestTo([0.0, 1.0])
+      .column("image.embedding")
+      .limit(1)
+      .toArray();
+    const inferred = await nestedTable
+      .query()
+      .nearestTo([0.0, 1.0])
+      .limit(1)
+      .toArray();
+    expect(inferred[0].rowId).toEqual(explicit[0].rowId);
+
+    await nestedTable.add([
+      {
+        rowId: 300,
+        "row-id": 300,
+        userId: 300,
+        metadata: { ["user_id"]: 300 },
+        ["MetaData"]: { userId: 300 },
+        image: { embedding: [300.0, 301.0] },
+        payload: { text: "document 300" },
+        "meta-data": { "user-id": 300 },
+        literal: { "a.b": 300 },
+      },
+    ]);
+    await nestedTable.optimize();
+    const indicesAfterOptimize = await nestedTable.listIndices();
+    expect(indicesAfterOptimize).toEqual(
+      expect.arrayContaining([
+        expect.objectContaining({
+          name: "mixed_case_metadata_user_id_idx",
+          indexType: "BTree",
+          columns: ["MetaData.userId"],
+        }),
+        expect.objectContaining({
+          name: "image_embedding_idx",
+          indexType: "IvfPq",
+          columns: ["image.embedding"],
+        }),
+      ]),
+    );
+  });
+
+  it("should report multiple nested vector candidates", async () => {
+    const db = await connect(tmpDir.name);
+    const nestedSchema = new Schema([
+      new Field(
+        "image",
+        new Struct([
+          new Field(
+            "embedding",
+            new FixedSizeList(2, new Field("item", new Float32(), true)),
+            true,
+          ),
+        ]),
+        true,
+      ),
+      new Field(
+        "text",
+        new Struct([
+          new Field(
+            "embedding",
+            new FixedSizeList(2, new Field("item", new Float32(), true)),
+            true,
+          ),
+        ]),
+        true,
+      ),
+    ]);
+    const nestedTable = await db.createTable(
+      "multiple_nested_vectors",
+      makeArrowTable(
+        [
+          {
+            image: { embedding: [0.0, 1.0] },
+            text: { embedding: [2.0, 3.0] },
+          },
+        ],
+        { schema: nestedSchema },
+      ),
+    );
+
+    await expect(
+      nestedTable.query().nearestTo([0.0, 1.0]).limit(1).toArray(),
+    ).rejects.toThrow(/image\.embedding.*text\.embedding/);
+  });
+
+  it("should report when no default vector column exists", async () => {
+    const db = await connect(tmpDir.name);
+    const noVectorTable = await db.createTable(
+      "no_vector",
+      makeArrowTable([{ id: 0, label: "cat" }]),
+    );
+
+    await expect(
+      noVectorTable.query().nearestTo([0.0, 1.0]).limit(1).toArray(),
+    ).rejects.toThrow(/No vector column/);
+  });
+
  it("should wait for index readiness", async () => {
    // Create an index and then wait for it to be ready
    await tbl.createIndex("vec");
@@ -813,11 +1260,13 @@ describe("When creating an index", () => {
    expect(fs.readdirSync(indexDir)).toHaveLength(1);
    const indices = await tbl.listIndices();
    expect(indices.length).toBe(1);
-    expect(indices[0]).toEqual({
-      name: "vec_idx",
-      indexType: "IvfHnswSq",
-      columns: ["vec"],
-    });
+    expect(indices[0]).toEqual(
+      expect.objectContaining({
+        name: "vec_idx",
+        indexType: "IvfHnswSq",
+        columns: ["vec"],
+      }),
+    );

    // Search without specifying the column
    let rst = await tbl
@@ -990,6 +1439,20 @@ describe("When creating an index", () => {
    expect(fs.readdirSync(indexDir)).toHaveLength(1);
  });

+  test("create an FM index", async () => {
+    // FM-Index accelerates substring search on a string/binary column.
+    const db = await connect(tmpDir.name);
+    const fmTbl = await db.createTable("fm_table", [
+      { id: 0, text: "hello world" },
+      { id: 1, text: "foo bar" },
+    ]);
+    await fmTbl.createIndex("text", {
+      config: Index.fm(),
+    });
+    const indexDir = path.join(tmpDir.name, "fm_table.lance", "_indices");
+    expect(fs.readdirSync(indexDir)).toHaveLength(1);
+  });
+
  test("should be able to get index stats", async () => {
    await tbl.createIndex("id");

@@ -1000,7 +1463,6 @@ describe("When creating an index", () => {
    expect(stats?.distanceType).toBeUndefined();
    expect(stats?.indexType).toEqual("BTREE");
    expect(stats?.numIndices).toEqual(1);
-    expect(stats?.loss).toBeUndefined();
  });

  test("when getting stats on non-existent index", async () => {
@@ -1150,6 +1612,35 @@ describe("When creating an index", () => {
    expect(rst64Query.toString()).toEqual(rst64Search.toString());
    expect(rst64Query.numRows).toBe(2);
  });
+
+  it("should expose rich metadata fields on IndexConfig", async () => {
+    await tbl.createIndex("id", { config: Index.btree() });
+    await tbl.createIndex("vec");
+
+    const indicesByName = Object.fromEntries(
+      (await tbl.listIndices()).map((idx) => [idx.name, idx]),
+    );
+
+    const scalarIdx = indicesByName["id_idx"];
+    expect(scalarIdx).toBeDefined();
+    expect(typeof scalarIdx.indexUuid).toBe("string");
+    expect(scalarIdx.numIndexedRows).toBe(300);
+    expect(scalarIdx.numUnindexedRows).toBe(0);
+    expect(scalarIdx.numSegments).toBeGreaterThanOrEqual(1);
+    expect(scalarIdx.sizeBytes).toBeGreaterThan(0);
+    // Use toString check to avoid cross-realm instanceof failures with native Date objects
+    expect(Object.prototype.toString.call(scalarIdx.createdAt)).toBe(
+      "[object Date]",
+    );
+    expect((scalarIdx.createdAt as Date).getTime()).toBeGreaterThan(0);
+    expect(typeof scalarIdx.indexDetails).toBe("object");
+
+    const vectorIdx = indicesByName["vec_idx"];
+    expect(vectorIdx).toBeDefined();
+    expect(typeof vectorIdx.indexUuid).toBe("string");
+    expect(vectorIdx.numIndexedRows).toBe(300);
+    expect(typeof vectorIdx.indexDetails).toBe("object");
+  });
 });

 describe("When querying a table", () => {
@@ -1421,6 +1912,33 @@ describe("schema evolution", function () {
    expect(await table.schema()).toEqual(expectedSchema3);
  });

+  it("can update field metadata", async function () {
+    const con = await connect(tmpDir.name);
+    const table = await con.createTable("fm", [
+      { id: 1, category: "a" },
+      { id: 2, category: "b" },
+    ]);
+
+    const res = await table.updateFieldMetadata([
+      { path: "category", metadata: { unit: "label", pii: "false" } },
+    ]);
+    expect(res).toHaveProperty("version");
+    expect(res.version).toBe(2);
+
+    let cat = (await table.schema()).fields.find((f) => f.name === "category");
+    expect(cat?.metadata.get("unit")).toBe("label");
+    expect(cat?.metadata.get("pii")).toBe("false");
+
+    // merge: add a key, delete one via null, keep the rest
+    await table.updateFieldMetadata([
+      { path: "category", metadata: { source: "import", pii: null } },
+    ]);
+    cat = (await table.schema()).fields.find((f) => f.name === "category");
+    expect(cat?.metadata.get("unit")).toBe("label"); // preserved
+    expect(cat?.metadata.get("source")).toBe("import"); // added
+    expect(cat?.metadata.has("pii")).toBe(false); // deleted
+  });
+
  it("can cast to various types", async function () {
    const con = await connect(tmpDir.name);

@@ -2475,3 +2993,97 @@ describe("setLsmWriteSpec / unsetLsmWriteSpec", () => {
    ).rejects.toThrow();
  });
 });
+
+describe("LSM merge insert", () => {
+  let tmpDir: tmp.DirResult;
+
+  beforeEach(() => {
+    tmpDir = tmp.dirSync({ unsafeCleanup: true });
+  });
+  afterEach(() => tmpDir.removeCallback());
+
+  async function bucketTable(conn: Connection): Promise<Table> {
+    // The primary key column must be non-nullable.
+    const table = await conn.createEmptyTable(
+      "t",
+      new arrow.Schema([
+        new arrow.Field("id", new arrow.Utf8(), false),
+        new arrow.Field("value", new arrow.Float64(), true),
+      ]),
+    );
+    await table.add([
+      { id: "a", value: 1 },
+      { id: "b", value: 2 },
+    ]);
+    await table.setUnenforcedPrimaryKey("id");
+    // numBuckets = 1: every row routes to the single bucket.
+    await table.setLsmWriteSpec({
+      specType: "bucket",
+      column: "id",
+      numBuckets: 1,
+    });
+    return table;
+  }
+
+  it("routes merge_insert through the shard writer", async () => {
+    const conn = await connect(tmpDir.name);
+    const table = await bucketTable(conn);
+
+    const res = await table
+      .mergeInsert("id")
+      .whenMatchedUpdateAll()
+      .whenNotMatchedInsertAll()
+      .execute([
+        { id: "c", value: 3 },
+        { id: "d", value: 4 },
+      ]);
+    // LSM path: rows go to the MemWAL, so only numRows is populated.
+    expect(res.numRows).toBe(2);
+    expect(res.version).toBe(0);
+    expect(res.numInsertedRows).toBe(0);
+
+    await table.closeLsmWriters();
+  });
+
+  it("falls back to the standard path with useLsmWrite(false)", async () => {
+    const conn = await connect(tmpDir.name);
+    const table = await bucketTable(conn);
+
+    const res = await table
+      .mergeInsert("id")
+      .whenNotMatchedInsertAll()
+      .useLsmWrite(false)
+      .execute([
+        { id: "b", value: 9 },
+        { id: "e", value: 5 },
+      ]);
+    // Standard path commits: id="e" inserted ("b" already exists).
+    expect(res.numInsertedRows).toBe(1);
+    expect(await table.countRows()).toBe(3);
+  });
+
+  it("supports validateSingleShard(false)", async () => {
+    const conn = await connect(tmpDir.name);
+    const table = await bucketTable(conn);
+
+    const res = await table
+      .mergeInsert("id")
+      .whenMatchedUpdateAll()
+      .whenNotMatchedInsertAll()
+      .validateSingleShard(false)
+      .execute([{ id: "f", value: 6 }]);
+    expect(res.numRows).toBe(1);
+  });
+
+  it("rejects a non-upsert merge under an LSM spec", async () => {
+    const conn = await connect(tmpDir.name);
+    const table = await bucketTable(conn);
+
+    await expect(
+      table
+        .mergeInsert("id")
+        .whenNotMatchedInsertAll()
+        .execute([{ id: "g", value: 7 }]),
+    ).rejects.toThrow();
+  });
+});
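
Reviewer note: the `can update field metadata` test in this file describes merge semantics for `updateFieldMetadata`: new keys are added, a `null` value deletes a key, and untouched keys are preserved. The sketch below models only that merge rule on a plain `Map`. `mergeFieldMetadata` and `MetadataPatch` are hypothetical names for illustration, not the binding's implementation.

```typescript
// A patch maps metadata keys to a new value, or to null to delete the key.
type MetadataPatch = Record<string, string | null>;

// Hypothetical model of the merge rule exercised by the test.
function mergeFieldMetadata(
  existing: Map<string, string>,
  patch: MetadataPatch,
): Map<string, string> {
  const merged = new Map(existing); // copy, so the input is not mutated
  for (const key of Object.keys(patch)) {
    const value = patch[key];
    if (value === null) {
      merged.delete(key); // null means "remove this key"
    } else {
      merged.set(key, value); // add or overwrite
    }
  }
  return merged;
}
```

Applying `{ unit: "label", pii: "false" }` and then `{ source: "import", pii: null }` under this rule leaves `unit` and `source` set and `pii` absent, which matches the assertions in the test.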
--- a/nodejs/lancedb/connection.ts
+++ b/nodejs/lancedb/connection.ts
@@ -84,6 +84,20 @@ export interface CreateTableOptions {
 }

 export interface OpenTableOptions {
+  /**
+   * Open the table scoped to this branch instead of the default branch.
+   *
+   * Reads and writes on the returned table operate in the branch's context.
+   */
+  branch?: string;
+  /**
+   * Open the table pinned to this version, producing a read-only view.
+   *
+   * Composes with {@link OpenTableOptions.branch}: when both are set, opens
+   * that branch at the version; otherwise opens `main` at the version. Call
+   * `checkoutLatest` to return to a writable state.
+   */
+  version?: number;
  /**
   * Configuration for object storage.
   *
@@ -144,6 +158,19 @@ export interface DropNamespaceOptions {
  behavior?: "restrict" | "cascade";
 }

+export interface RenameTableOptions {
+  /**
+   * The namespace path of the table being renamed. Defaults to the root
+   * namespace (`[]`) when omitted.
+   */
+  namespacePath?: string[];
+  /**
+   * The namespace path to move the table to as part of the rename. When
+   * omitted the table stays in `namespacePath`.
+   */
+  newNamespacePath?: string[];
+}
+
 /**
 * A LanceDB Connection that allows you to open tables and create new ones.
 *
@@ -296,12 +323,6 @@ export abstract class Connection {
   */
  abstract dropTable(name: string, namespacePath?: string[]): Promise<void>;

-  abstract renameTable(
-    oldName: string,
-    newName: string,
-    namespacePath?: string[],
-  ): Promise<void>;
-
  /**
   * Drop all tables in the database.
   * @param {string[]} namespacePath The namespace path to drop tables from (defaults to root namespace).
@@ -397,6 +418,24 @@ export abstract class Connection {
      isShallow?: boolean;
    },
  ): Promise<Table>;
+
+  /**
+   * Rename a table.
+   *
+   * Currently only supported by LanceDB Cloud. Local OSS connections and
+   * namespace-backed connections (via {@link connectNamespace}) reject with
+   * a "not supported" error.
+   *
+   * @param {string} currentName - The current name of the table.
+   * @param {string} newName - The new name for the table.
+   * @param {RenameTableOptions} options - Optional namespace paths. When
+   *   `newNamespacePath` is omitted the table stays in `namespacePath`.
+   */
+  abstract renameTable(
+    currentName: string,
+    newName: string,
+    options?: RenameTableOptions,
+  ): Promise<void>;
 }

 /** @hideconstructor */
@@ -458,7 +497,20 @@ export class LocalConnection extends Connection {
      options?.indexCacheSize,
    );

-    return new LocalTable(innerTable);
+    let table: Table = new LocalTable(innerTable);
+    // "main" is the default branch, so treat it as no branch. On a real branch,
+    // scope and pin in one step (yielding "version V of branch B"); otherwise
+    // pin the version, if any, against main.
+    const branch =
+      options?.branch != null && options.branch !== "main"
+        ? options.branch
+        : undefined;
+    if (branch != null) {
+      table = await (await table.branches()).checkout(branch, options?.version);
+    } else if (options?.version != null) {
+      await table.checkout(options.version);
+    }
+    return table;
  }

  async cloneTable(
@@ -615,14 +667,6 @@ export class LocalConnection extends Connection {
    return this.inner.dropTable(name, namespacePath ?? []);
  }

-  async renameTable(
-    oldName: string,
-    newName: string,
-    namespacePath?: string[],
-  ): Promise<void> {
-    return this.inner.renameTable(oldName, newName, namespacePath ?? []);
-  }
-
  async dropAllTables(namespacePath?: string[]): Promise<void> {
    return this.inner.dropAllTables(namespacePath ?? []);
  }
@@ -665,6 +709,19 @@ export class LocalConnection extends Connection {
      options?.behavior,
    );
  }
+
+  async renameTable(
+    currentName: string,
+    newName: string,
+    options?: RenameTableOptions,
+  ): Promise<void> {
+    return this.inner.renameTable(
+      currentName,
+      newName,
+      options?.namespacePath ?? [],
+      options?.newNamespacePath ?? options?.namespacePath ?? [],
+    );
+  }
 }

 /**
--- a/nodejs/lancedb/index.ts
+++ b/nodejs/lancedb/index.ts
@@ -38,10 +38,12 @@ export {
  FragmentSummaryStats,
  Tags,
  TagContents,
+  BranchContents,
  MergeResult,
  AddResult,
  AddColumnsResult,
  AlterColumnsResult,
+  UpdateFieldMetadataResult,
  DeleteResult,
  DropColumnsResult,
  UpdateResult,
@@ -50,7 +52,6 @@ export {
  SplitHashOptions,
  SplitSequentialOptions,
  ShuffleOptions,
-  OAuthConfig as NativeOAuthConfig,
 } from "./native.js";

 export {
@@ -72,6 +73,7 @@ export {
  CreateNamespaceResponse,
  DropNamespaceResponse,
  DescribeNamespaceResponse,
+  RenameTableOptions,
 } from "./connection";

 export { Session } from "./native.js";
@@ -110,12 +112,15 @@ export {

 export {
  Table,
+  Branches,
  AddDataOptions,
  UpdateOptions,
  OptimizeOptions,
  Version,
+  WriteProgress,
  LsmWriteSpec,
  ColumnAlteration,
+  FieldMetadataUpdate,
 } from "./table";

 export {
@@ -125,8 +130,6 @@ export {
  TokenResponse,
 } from "./header";

-export { OAuthConfig, OAuthFlowType } from "./oauth";
-
 export { MergeInsertBuilder, WriteExecutionOptions } from "./merge";

 export * as embedding from "./embedding";
--- a/nodejs/lancedb/indices.ts
+++ b/nodejs/lancedb/indices.ts
@@ -702,6 +702,17 @@ export class Index {
    return new Index(LanceDbIndex.labelList());
  }

+  /**
+   * Create an FM-Index.
+   *
+   * An FM-Index is a scalar index on string or binary columns that accelerates
+   * substring search, i.e. `contains(col, 'needle')`. Unlike the tokenized
+   * full-text-search index, it matches arbitrary substrings of the raw bytes.
+   */
+  static fm() {
+    return new Index(LanceDbIndex.fm());
+  }
+
  /**
   * Create a full text search index
   *
--- a/nodejs/lancedb/merge.ts
+++ b/nodejs/lancedb/merge.ts
@@ -87,6 +87,41 @@ export class MergeInsertBuilder {
      this.#schema,
    );
  }
+  /**
+   * Controls whether the merge uses the MemWAL LSM write path.
+   *
+   * By default (unset), a `mergeInsert` on a table with an LSM write spec is
+   * routed through Lance's MemWAL shard writer, and a table without one uses
+   * the standard path. Pass `false` to force the standard path even when a
+   * spec is set. Pass `true` to require a spec — `mergeInsert` rejects if none
+   * is installed.
+   *
+   * @param useLsmWrite - Whether to use the LSM write path.
+   */
+  useLsmWrite(useLsmWrite: boolean): MergeInsertBuilder {
+    return new MergeInsertBuilder(
+      this.#native.useLsmWrite(useLsmWrite),
+      this.#schema,
+    );
+  }
+  /**
+   * Controls how an LSM merge checks that its input targets a single shard.
+   *
+   * When a table has an LSM write spec, every row in a `mergeInsert` call must
+   * route to the same shard. When `true` (the default), every row is inspected
+   * to verify this. When `false`, only the first row is inspected and the
+   * shard it routes to is used for the whole input — a faster path for callers
+   * that have already pre-sharded their input. Has no effect on tables without
+   * an LSM write spec.
+   *
+   * @param validateSingleShard - Whether to check every row routes to one shard. Defaults to `true`.
+   */
+  validateSingleShard(validateSingleShard: boolean): MergeInsertBuilder {
+    return new MergeInsertBuilder(
+      this.#native.validateSingleShard(validateSingleShard),
+      this.#schema,
+    );
+  }
  /**
   * Executes the merge insert operation
   *
--- a/nodejs/lancedb/oauth.ts
+++ b/nodejs/lancedb/oauth.ts
@@ -1,82 +0,0 @@
-// SPDX-License-Identifier: Apache-2.0
-// SPDX-FileCopyrightText: Copyright The LanceDB Authors
-
-/**
- * OAuth authentication flow types.
- */
-export enum OAuthFlowType {
-  /** Client Credentials grant (service-to-service / M2M). */
-  ClientCredentials = "client_credentials",
-  /** Authorization Code with PKCE (interactive browser-based auth). */
-  AuthorizationCodePKCE = "authorization_code_pkce",
-  /** Device Code grant (CLI / headless environments). */
-  DeviceCode = "device_code",
-  /** Azure Managed Identity via IMDS. */
-  AzureManagedIdentity = "azure_managed_identity",
-  /** Workload Identity Federation (K8s, GitHub Actions). */
-  WorkloadIdentity = "workload_identity",
-}
-
-/**
- * OAuth configuration for LanceDB authentication.
- *
- * All token acquisition and refresh is handled in the Rust layer.
- * This config is passed through to Rust via napi-rs.
- *
- * @example Client Credentials (service-to-service):
- * ```typescript
- * const config: OAuthConfig = {
- *   issuerUrl: "https://login.microsoftonline.com/{tenant}/v2.0",
- *   clientId: "app-id",
- *   clientSecret: "secret",
- *   scopes: ["api://lancedb-api/.default"],
- * };
- * ```
- *
- * @example Azure Managed Identity:
- * ```typescript
- * const config: OAuthConfig = {
- *   issuerUrl: "https://login.microsoftonline.com/{tenant}/v2.0",
- *   clientId: "app-id",
- *   scopes: ["api://lancedb-api/.default"],
- *   flow: OAuthFlowType.AzureManagedIdentity,
- * };
- * ```
- */
-export interface OAuthConfig {
-  /**
-   * OIDC issuer URL or OAuth authority URL.
-   * For Azure: `https://login.microsoftonline.com/{tenant_id}/v2.0`
-   */
-  issuerUrl: string;
-
-  /** Application / Client ID. */
-  clientId: string;
-
-  /**
-   * OAuth scopes to request.
-   * For Azure: `["api://{app_id}/.default"]`
-   */
-  scopes: string[];
-
-  /** Authentication flow (default: ClientCredentials). */
-  flow?: OAuthFlowType;
-
-  /** Client secret (required for ClientCredentials). */
-  clientSecret?: string;
-
-  /** Redirect URI (AuthorizationCodePKCE flow). */
-  redirectUri?: string;
-
-  /** Port for local callback server (AuthorizationCodePKCE, default: 8400). */
-  callbackPort?: number;
-
-  /** Client ID for user-assigned managed identity (AzureManagedIdentity). */
-  managedIdentityClientId?: string;
-
-  /** Path to federated token file (WorkloadIdentity). */
-  tokenFile?: string;
-
-  /** Seconds before expiry to trigger proactive refresh (default: 300). */
-  refreshBufferSecs?: number;
-}
--- a/nodejs/lancedb/table.ts
+++ b/nodejs/lancedb/table.ts
@@ -25,13 +25,16 @@ import {
  AddColumnsSql,
  AddResult,
  AlterColumnsResult,
+  BranchContents,
  DeleteResult,
  DropColumnsResult,
  IndexConfig,
  IndexStatistics,
+  Branches as NativeBranches,
  OptimizeStats,
  TableStatistics,
  Tags,
+  UpdateFieldMetadataResult,
  UpdateResult,
  Table as _NativeTable,
 } from "./native";
@@ -46,6 +49,33 @@ import { sanitizeType } from "./sanitize";
 import { IntoSql, toSQL } from "./util";
 export { IndexConfig } from "./native";

+/**
+ * Progress snapshot for a write operation, delivered to the `progress`
+ * callback passed to {@link Table.add}.
+ */
+export interface WriteProgress {
+  /** Number of rows written so far. */
+  outputRows: number;
+  /** Number of bytes written so far. */
+  outputBytes: number;
+  /**
+   * Total rows expected, when the input source reports it.
+   *
+   * Always set on the final callback (the one with `done: true`), falling
+   * back to the actual number of rows written when the source could not
+   * report a row count up front.
+   */
+  totalRows?: number;
+  /** Wall-clock seconds since the write started. */
+  elapsedSeconds: number;
+  /** Number of parallel write tasks currently in flight. */
+  activeTasks: number;
+  /** Total number of parallel write tasks (the write parallelism). */
+  totalTasks: number;
+  /** `true` for the final callback; `false` otherwise. */
+  done: boolean;
+}
+
 /**
 * Options for adding data to a table.
 */
@@ -56,6 +86,28 @@ export interface AddDataOptions {
   * If "overwrite" then the new data will replace the existing data in the table.
   */
  mode: "append" | "overwrite";
+
+  /**
+   * Optional callback invoked periodically with write progress.
+   *
+   * The callback is fired once per batch written and once more with
+   * `done: true` when the write completes. Calls are dispatched
+   * asynchronously to the JS event loop and never block the write — a slow
+   * callback will queue events rather than back-pressure the writer.
+   *
+   * Errors thrown from the callback are logged with `console.warn` and
+   * swallowed — they do not abort the write.
+   *
+   * @example
+   * ```ts
+   * await table.add(data, {
+   *   progress: (p) => {
+   *     console.log(`${p.outputRows}/${p.totalRows ?? "?"} rows`);
+   *   },
+   * });
+   * ```
+   */
+  progress?: (progress: WriteProgress) => void;
 }

 export interface UpdateOptions {
@@ -112,7 +164,10 @@ export interface Version {
 *
 * `specType` is `"bucket"`, `"identity"`, or `"unsharded"`. For `"bucket"`,
 * `column` and `numBuckets` are required; for `"identity"`, `column` is
- * required.
+ * required and must be a deterministic function of the unenforced primary
+ * key (every row with a given primary key must always produce the same
+ * `column` value, or upserts of that key can land in different shards and a
+ * stale version can win).
 */
 export interface LsmWriteSpec {
  /** One of `"bucket"`, `"identity"`, or `"unsharded"`. */
@@ -456,6 +511,18 @@ export abstract class Table {
  abstract alterColumns(
    columnAlterations: ColumnAlteration[],
  ): Promise<AlterColumnsResult>;
+
+  /**
+   * Update per-field (column) metadata.
+   * @param {FieldMetadataUpdate[]} updates One or more per-field updates. Each
+   * update's metadata is merged into the field's existing metadata by default;
+   * a value of `null` deletes that key, and `replace: true` swaps the whole map.
+   * @returns {Promise<UpdateFieldMetadataResult>} resolves to the new table version.
+   */
+  abstract updateFieldMetadata(
+    updates: FieldMetadataUpdate[],
+  ): Promise<UpdateFieldMetadataResult>;
+
  /**
   * Drop one or more columns from the dataset
   *
@@ -518,6 +585,16 @@ export abstract class Table {
   * @returns {Promise<void>}
   */
  abstract unsetLsmWriteSpec(): Promise<void>;
+  /**
+   * Drain and close any cached MemWAL shard writers held for this table.
+   *
+   * When an {@link LsmWriteSpec} is installed, `mergeInsert` opens MemWAL
+   * shard writers and caches them for reuse across calls. This closes them,
+   * flushing pending data; writers reopen lazily on the next `mergeInsert`.
+   * It is a no-op when no writers are cached.
+   * @returns {Promise<void>}
+   */
+  abstract closeLsmWriters(): Promise<void>;
  /** Retrieve the version of the table */

  abstract version(): Promise<number>;
@@ -578,6 +655,22 @@ export abstract class Table {
   */
  abstract tags(): Promise<Tags>;

+  /**
+   * Get the branch manager for this table.
+   *
+   * Branches are isolated, writable lines of history forked from another
+   * branch (or version). Writes on a branch do not affect `main`.
+   */
+  abstract branches(): Promise<Branches>;
+
+  /**
+   * The branch this table handle is scoped to, or `null` for the main branch.
+   *
+   * A handle returned by {@link Branches.create} or {@link Branches.checkout}
+   * reports the branch it targets; a handle opened normally reports `null`.
+   */
+  abstract currentBranch(): string | null;
+
  /**
   * Restore the table to the currently checked out version
   *
@@ -705,7 +798,20 @@ export class LocalTable extends Table {
    const schema = await this.schema();

    const buffer = await fromDataToBuffer(data, undefined, schema);
-    return await this.inner.add(buffer, mode);
+    // Wrap the user callback so a thrown error doesn't surface as an
+    // unhandled exception (the callback fires from a napi threadsafe
+    // function — exceptions there crash the process).
+    const userProgress = options?.progress;
+    const progress = userProgress
+      ? (p: WriteProgress) => {
+          try {
+            userProgress(p);
+          } catch (e) {
+            console.warn("Table.add progress callback threw:", e);
+          }
+        }
+      : undefined;
+    return await this.inner.add(buffer, mode, progress);
  }

  async update(
@@ -962,6 +1068,12 @@ export class LocalTable extends Table {
    return await this.inner.alterColumns(processedAlterations);
  }

+  async updateFieldMetadata(
+    updates: FieldMetadataUpdate[],
+  ): Promise<UpdateFieldMetadataResult> {
+    return await this.inner.updateFieldMetadata(updates);
+  }
+
  async dropColumns(columnNames: string[]): Promise<DropColumnsResult> {
    return await this.inner.dropColumns(columnNames);
  }
@@ -979,6 +1091,10 @@ export class LocalTable extends Table {
    return await this.inner.unsetLsmWriteSpec();
  }

+  async closeLsmWriters(): Promise<void> {
+    return await this.inner.closeLsmWriters();
+  }
+
  async version(): Promise<number> {
    return await this.inner.version();
  }
@@ -1010,6 +1126,14 @@ export class LocalTable extends Table {
    return await this.inner.tags();
  }

+  async branches(): Promise<Branches> {
+    return new Branches(await this.inner.branches());
+  }
+
+  currentBranch(): string | null {
+    return this.inner.currentBranch() ?? null;
+  }
+
  async optimize(options?: Partial<OptimizeOptions>): Promise<OptimizeStats> {
    let cleanupOlderThanMs;
    if (
@@ -1124,3 +1248,73 @@ export interface ColumnAlteration {
  /** Set the new nullability. Note that a nullable column cannot be made non-nullable. */
  nullable?: boolean;
 }
+
+/** A per-field metadata update, addressed by dot-path. */
+export interface FieldMetadataUpdate {
+  /**
+   * Dot-separated path to the field. For a top-level column this is just its
+   * name; for a nested field it's the path, e.g. "a.b.c".
+   */
+  path: string;
+  /**
+   * Metadata key/value pairs. Merged into the field's existing metadata by
+   * default; a value of `null` deletes that key.
+   */
+  metadata: Record<string, string | null>;
+  /** If true, replace the field's entire metadata map instead of merging. */
+  replace?: boolean;
+}
+
+/**
+ * Branch manager for a {@link Table}.
+ *
+ * Unlike tags, `create` and `checkout` return a new {@link Table} handle scoped
+ * to the branch; writes on it do not affect `main`.
+ */
+export class Branches {
+  #inner: NativeBranches;
+
+  /**
+   * Construct a Branches manager. Internal use only.
+   * @hidden
+   */
+  constructor(inner: NativeBranches) {
+    this.#inner = inner;
+  }
+
+  /** List all branches, mapping name to branch metadata. */
+  async list(): Promise<Record<string, BranchContents>> {
+    return await this.#inner.list();
+  }
+
+  /**
+   * Create a branch and return a handle scoped to it.
+   *
+   * @param name Name of the new branch.
+   * @param fromRef Source branch to fork from. Defaults to `main`.
+   * @param fromVersion A specific version on `fromRef`. Defaults to latest.
+   */
+  async create(
+    name: string,
+    fromRef?: string,
+    fromVersion?: number,
+  ): Promise<Table> {
+    return new LocalTable(await this.#inner.create(name, fromRef, fromVersion));
+  }
+
+  /**
+   * Check out an existing branch and return a handle scoped to it.
+   *
+   * With `version` set, the returned handle is pinned to that version of the
+   * branch (a read-only, detached view); otherwise it tracks the branch's
+   * latest and stays writable.
+   */
+  async checkout(name: string, version?: number): Promise<Table> {
+    return new LocalTable(await this.#inner.checkout(name, version));
+  }
+
+  /** Delete a branch. */
+  async delete(name: string): Promise<void> {
+    return await this.#inner.delete(name);
+  }
+}
--- a/nodejs/npm/darwin-arm64/package.json
+++ b/nodejs/npm/darwin-arm64/package.json
@@ -1,6 +1,6 @@
 {
 	"name": "@lancedb/lancedb-darwin-arm64",
-	"version": "0.29.1-beta.0",
+	"version": "0.30.1-beta.2",
 	"os": ["darwin"],
 	"cpu": ["arm64"],
 	"main": "lancedb.darwin-arm64.node",
--- a/nodejs/npm/linux-arm64-gnu/package.json
+++ b/nodejs/npm/linux-arm64-gnu/package.json
@@ -1,6 +1,6 @@
 {
 	"name": "@lancedb/lancedb-linux-arm64-gnu",
-	"version": "0.29.1-beta.0",
+	"version": "0.30.1-beta.2",
 	"os": ["linux"],
 	"cpu": ["arm64"],
 	"main": "lancedb.linux-arm64-gnu.node",
--- a/nodejs/npm/linux-arm64-musl/package.json
+++ b/nodejs/npm/linux-arm64-musl/package.json
@@ -1,6 +1,6 @@
 {
 	"name": "@lancedb/lancedb-linux-arm64-musl",
-	"version": "0.29.1-beta.0",
+	"version": "0.30.1-beta.2",
 	"os": ["linux"],
 	"cpu": ["arm64"],
 	"main": "lancedb.linux-arm64-musl.node",
--- a/nodejs/npm/linux-x64-gnu/package.json
+++ b/nodejs/npm/linux-x64-gnu/package.json
@@ -1,6 +1,6 @@
 {
 	"name": "@lancedb/lancedb-linux-x64-gnu",
-	"version": "0.29.1-beta.0",
+	"version": "0.30.1-beta.2",
 	"os": ["linux"],
 	"cpu": ["x64"],
 	"main": "lancedb.linux-x64-gnu.node",
--- a/nodejs/npm/linux-x64-musl/package.json
+++ b/nodejs/npm/linux-x64-musl/package.json
@@ -1,6 +1,6 @@
 {
 	"name": "@lancedb/lancedb-linux-x64-musl",
-	"version": "0.29.1-beta.0",
+	"version": "0.30.1-beta.2",
 	"os": ["linux"],
 	"cpu": ["x64"],
 	"main": "lancedb.linux-x64-musl.node",
--- a/nodejs/npm/win32-arm64-msvc/package.json
+++ b/nodejs/npm/win32-arm64-msvc/package.json
@@ -1,6 +1,6 @@
 {
  "name": "@lancedb/lancedb-win32-arm64-msvc",
-  "version": "0.29.1-beta.0",
+  "version": "0.30.1-beta.2",
  "os": [
    "win32"
  ],
--- a/nodejs/npm/win32-x64-msvc/package.json
+++ b/nodejs/npm/win32-x64-msvc/package.json
@@ -1,6 +1,6 @@
 {
 	"name": "@lancedb/lancedb-win32-x64-msvc",
-	"version": "0.29.1-beta.0",
+	"version": "0.30.1-beta.2",
 	"os": ["win32"],
 	"cpu": ["x64"],
 	"main": "lancedb.win32-x64-msvc.node",
--- a/nodejs/package-lock.json
+++ b/nodejs/package-lock.json
@@ -1,12 +1,12 @@
 {
  "name": "@lancedb/lancedb",
-  "version": "0.29.1-beta.0",
+  "version": "0.30.1-beta.2",
  "lockfileVersion": 3,
  "requires": true,
  "packages": {
    "": {
      "name": "@lancedb/lancedb",
-      "version": "0.29.1-beta.0",
+      "version": "0.30.1-beta.2",
      "cpu": [
        "x64",
        "arm64"
@@ -26,7 +26,7 @@
        "@aws-sdk/client-s3": "3.1003.0",
        "@biomejs/biome": "^1.7.3",
        "@jest/globals": "^29.7.0",
-        "@napi-rs/cli": "3.5.1",
+        "@napi-rs/cli": "3.7.0",
        "@types/axios": "^0.14.0",
        "@types/jest": "^29.1.2",
        "@types/node": "22.7.4",
@@ -2942,9 +2942,9 @@
      }
    },
    "node_modules/@napi-rs/cli": {
-      "version": "3.5.1",
-      "resolved": "https://registry.npmjs.org/@napi-rs/cli/-/cli-3.5.1.tgz",
-      "integrity": "sha512-XBfLQRDcB3qhu6bazdMJsecWW55kR85l5/k0af9BIBELXQSsCFU0fzug7PX8eQp6vVdm7W/U3z6uP5WmITB2Gw==",
+      "version": "3.7.0",
+      "resolved": "https://registry.npmjs.org/@napi-rs/cli/-/cli-3.7.0.tgz",
+      "integrity": "sha512-3d3+rmxlOIV/G1zPWeX4PCxuYnhcCQM2BvY9rtimC8RO0dFR9gtYP+Grov+WoduZtfWRj5N1XvytWeRxxCk5zw==",
      "dev": true,
      "license": "MIT",
      "dependencies": {
@@ -2954,7 +2954,7 @@
        "@octokit/rest": "^22.0.1",
        "clipanion": "^4.0.0-rc.4",
        "colorette": "^2.0.20",
-        "emnapi": "^1.7.1",
+        "emnapi": "^1.10.0",
        "es-toolkit": "^1.41.0",
        "js-yaml": "^4.1.0",
        "obug": "^2.0.0",
--- a/nodejs/package.json
+++ b/nodejs/package.json
@@ -11,7 +11,7 @@
    "ann"
  ],
  "private": false,
-  "version": "0.29.1-beta.0",
+  "version": "0.30.1-beta.2",
  "main": "dist/index.js",
  "exports": {
    ".": "./dist/index.js",
@@ -43,7 +43,7 @@
    "@aws-sdk/client-s3": "3.1003.0",
    "@biomejs/biome": "^1.7.3",
    "@jest/globals": "^29.7.0",
-    "@napi-rs/cli": "3.5.1",
+    "@napi-rs/cli": "3.7.0",
    "@types/axios": "^0.14.0",
    "@types/jest": "^29.1.2",
    "@types/node": "22.7.4",
--- a/nodejs/pnpm-lock.yaml
+++ b/nodejs/pnpm-lock.yaml
@@ -31,8 +31,8 @@ importers:
        specifier: ^29.7.0
        version: 29.7.0
      '@napi-rs/cli':
-        specifier: 3.5.1
-        version: 3.5.1(@emnapi/core@1.10.0)(@emnapi/runtime@1.10.0)(@types/node@22.7.4)
+        specifier: 3.7.0
+        version: 3.7.0(@emnapi/core@1.10.0)(@emnapi/runtime@1.10.0)(@types/node@22.7.4)
      '@types/axios':
        specifier: ^0.14.0
        version: 0.14.4
@@ -887,8 +887,8 @@ packages:
  '@jridgewell/trace-mapping@0.3.31':
    resolution: {integrity: sha512-zzNR+SdQSDJzc8joaeP8QQoCQr8NuYx2dIIytl1QeBEZHJ9uW6hebsrYgbz8hJwUQao3TWCMtmfV8Nu1twOLAw==}

-  '@napi-rs/cli@3.5.1':
-    resolution: {integrity: sha512-XBfLQRDcB3qhu6bazdMJsecWW55kR85l5/k0af9BIBELXQSsCFU0fzug7PX8eQp6vVdm7W/U3z6uP5WmITB2Gw==}
+  '@napi-rs/cli@3.7.0':
+    resolution: {integrity: sha512-3d3+rmxlOIV/G1zPWeX4PCxuYnhcCQM2BvY9rtimC8RO0dFR9gtYP+Grov+WoduZtfWRj5N1XvytWeRxxCk5zw==}
    engines: {node: '>= 16'}
    hasBin: true
    peerDependencies:
@@ -4582,7 +4582,7 @@ snapshots:
      '@jridgewell/resolve-uri': 3.1.2
      '@jridgewell/sourcemap-codec': 1.5.5

-  '@napi-rs/cli@3.5.1(@emnapi/core@1.10.0)(@emnapi/runtime@1.10.0)(@types/node@22.7.4)':
+  '@napi-rs/cli@3.7.0(@emnapi/core@1.10.0)(@emnapi/runtime@1.10.0)(@types/node@22.7.4)':
    dependencies:
      '@inquirer/prompts': 8.4.3(@types/node@22.7.4)
      '@napi-rs/cross-toolchain': 1.0.3(@emnapi/core@1.10.0)(@emnapi/runtime@1.10.0)
--- a/nodejs/src/connection.rs
+++ b/nodejs/src/connection.rs
@@ -112,11 +112,6 @@ impl Connection {

        builder = builder.client_config(rust_config);

-        if let Some(oauth_config) = options.oauth_config {
-            let config: lancedb::remote::oauth::OAuthConfig = oauth_config.into();
-            builder = builder.oauth_config(config);
-        }
-
        if let Some(api_key) = options.api_key {
            builder = builder.api_key(&api_key);
        }
@@ -333,20 +328,6 @@ impl Connection {
            .default_error()
    }

-    #[napi(catch_unwind)]
-    pub async fn rename_table(
-        &self,
-        old_name: String,
-        new_name: String,
-        namespace_path: Option<Vec<String>>,
-    ) -> napi::Result<()> {
-        let ns = namespace_path.unwrap_or_default();
-        self.get_inner()?
-            .rename_table(&old_name, &new_name, &ns, &ns)
-            .await
-            .default_error()
-    }
-
    #[napi(catch_unwind)]
    pub async fn drop_all_tables(&self, namespace_path: Option<Vec<String>>) -> napi::Result<()> {
        let ns = namespace_path.unwrap_or_default();
@@ -478,4 +459,23 @@ impl Connection {
            transaction_id: resp.transaction_id,
        })
    }
+
+    /// Rename a table. `current_namespace_path` and `new_namespace_path` default to
+    /// the root namespace when omitted; the caller is expected to either pass both
+    /// or pass neither.
+    #[napi(catch_unwind)]
+    pub async fn rename_table(
+        &self,
+        current_name: String,
+        new_name: String,
+        current_namespace_path: Option<Vec<String>>,
+        new_namespace_path: Option<Vec<String>>,
+    ) -> napi::Result<()> {
+        let cur_ns = current_namespace_path.unwrap_or_default();
+        let new_ns = new_namespace_path.unwrap_or_default();
+        self.get_inner()?
+            .rename_table(&current_name, &new_name, &cur_ns, &new_ns)
+            .await
+            .default_error()
+    }
 }
--- a/nodejs/src/index.rs
+++ b/nodejs/src/index.rs
@@ -4,7 +4,7 @@
 use std::sync::Mutex;

 use lancedb::index::Index as LanceDbIndex;
-use lancedb::index::scalar::{BTreeIndexBuilder, FtsIndexBuilder};
+use lancedb::index::scalar::{BTreeIndexBuilder, FmIndexBuilder, FtsIndexBuilder};
 use lancedb::index::vector::{
    IvfFlatIndexBuilder, IvfHnswPqIndexBuilder, IvfHnswSqIndexBuilder, IvfPqIndexBuilder,
    IvfRqIndexBuilder,
@@ -143,6 +143,13 @@ impl Index {
        }
    }

+    #[napi(factory)]
+    pub fn fm() -> Self {
+        Self {
+            inner: Mutex::new(Some(LanceDbIndex::Fm(FmIndexBuilder::default()))),
+        }
+    }
+
    #[napi(factory)]
    #[allow(clippy::too_many_arguments)]
    pub fn fts(
--- a/nodejs/src/lib.rs
+++ b/nodejs/src/lib.rs
@@ -24,15 +24,19 @@ mod util;
 #[napi(object)]
 #[derive(Debug)]
 pub struct ConnectionOptions {
-    /// (For LanceDB OSS only): The interval, in seconds, at which to check for
-    /// updates to the table from other processes. If None, then consistency is not
-    /// checked. For performance reasons, this is the default. For strong
-    /// consistency, set this to zero seconds. Then every read will check for
-    /// updates from other processes. As a compromise, you can set this to a
-    /// non-zero value for eventual consistency. If more than that interval
-    /// has passed since the last check, then the table will be checked for updates.
-    /// Note: this consistency only applies to read operations. Write operations are
+    /// The interval, in seconds, at which to check for updates to the table
+    /// from other processes. If None, then consistency is not checked. For
+    /// performance reasons, this is the default. For strong consistency, set
+    /// this to zero seconds. Then every read will check for updates from other
+    /// processes. As a compromise, you can set this to a non-zero value for
+    /// eventual consistency. If more than that interval has passed since the
+    /// last check, then the table will be checked for updates. Note: this
+    /// consistency only applies to read operations. Write operations are
    /// always consistent.
+    ///
+    /// Stronger consistency is not free. The smaller the interval, the more
+    /// often each read pays the cost of checking for updates against object
+    /// storage, raising per-read latency and cost.
    pub read_consistency_interval: Option<f64>,
    /// (For LanceDB OSS only): configuration for object storage.
    ///
@@ -61,10 +65,6 @@ pub struct ConnectionOptions {
    /// (For LanceDB cloud only): the host to use for LanceDB cloud. Used
    /// for testing purposes.
    pub host_override: Option<String>,
-    /// (For LanceDB cloud only): OAuth configuration for IdP-based
-    /// authentication (e.g., Azure Entra ID). When set, token acquisition
-    /// and refresh are handled entirely in Rust.
-    pub oauth_config: Option<remote::OAuthConfig>,
 }

 #[napi(object)]
--- a/nodejs/src/merge.rs
+++ b/nodejs/src/merge.rs
@@ -50,6 +50,20 @@ impl NativeMergeInsertBuilder {
        this
    }

+    #[napi]
+    pub fn use_lsm_write(&self, use_lsm_write: bool) -> Self {
+        let mut this = self.clone();
+        this.inner.use_lsm_write(use_lsm_write);
+        this
+    }
+
+    #[napi]
+    pub fn validate_single_shard(&self, validate_single_shard: bool) -> Self {
+        let mut this = self.clone();
+        this.inner.validate_single_shard(validate_single_shard);
+        this
+    }
+
    #[napi(catch_unwind)]
    pub async fn execute(&self, buf: Buffer) -> napi::Result<MergeResult> {
        let data = ipc_file_to_batches(buf.to_vec())
--- a/nodejs/src/remote.rs
+++ b/nodejs/src/remote.rs
@@ -140,67 +140,6 @@ impl From<TlsConfig> for lancedb::remote::TlsConfig {
    }
 }

-/// OAuth configuration for LanceDB authentication.
-/// All token acquisition and refresh is handled in the Rust layer.
-#[napi(object)]
-#[derive(Debug, Clone)]
-pub struct OAuthConfig {
-    /// OIDC issuer URL or OAuth authority URL.
-    /// For Azure: `https://login.microsoftonline.com/{tenant_id}/v2.0`
-    pub issuer_url: String,
-    /// Application / Client ID.
-    pub client_id: String,
-    /// OAuth scopes to request. For Azure: `["api://{app_id}/.default"]`
-    pub scopes: Vec<String>,
-    /// Authentication flow: "client_credentials", "authorization_code_pkce",
-    /// "device_code", "azure_managed_identity", "workload_identity"
-    pub flow: Option<String>,
-    /// Client secret (required for client_credentials).
-    pub client_secret: Option<String>,
-    /// Redirect URI (authorization_code_pkce flow).
-    pub redirect_uri: Option<String>,
-    /// Port for local callback server (authorization_code_pkce, default: 8400).
-    pub callback_port: Option<u16>,
-    /// Client ID for user-assigned managed identity (azure_managed_identity).
-    pub managed_identity_client_id: Option<String>,
-    /// Path to federated token file (workload_identity).
-    pub token_file: Option<String>,
-    /// Seconds before expiry to trigger proactive refresh (default: 300).
-    pub refresh_buffer_secs: Option<u32>,
-}
-
-impl From<OAuthConfig> for lancedb::remote::oauth::OAuthConfig {
-    fn from(config: OAuthConfig) -> Self {
-        use lancedb::remote::oauth::OAuthFlow;
-
-        let flow = match config.flow.as_deref().unwrap_or("client_credentials") {
-            "authorization_code_pkce" => OAuthFlow::AuthorizationCodePKCE {
-                redirect_uri: config.redirect_uri,
-                callback_port: config.callback_port,
-            },
-            "device_code" => OAuthFlow::DeviceCode,
-            "azure_managed_identity" => OAuthFlow::AzureManagedIdentity {
-                client_id: config.managed_identity_client_id,
-            },
-            "workload_identity" => OAuthFlow::WorkloadIdentity {
-                token_file: config
-                    .token_file
-                    .expect("tokenFile is required for workload_identity flow"),
-            },
-            other => panic!("Unknown OAuth flow type: {other}"),
-        };
-
-        Self {
-            issuer_url: config.issuer_url,
-            client_id: config.client_id,
-            client_secret: config.client_secret,
-            scopes: config.scopes,
-            flow,
-            refresh_buffer_secs: config.refresh_buffer_secs.map(|v| v as u64),
-        }
-    }
-}
-
 impl From<ClientConfig> for lancedb::remote::ClientConfig {
    fn from(config: ClientConfig) -> Self {
        Self {
--- a/nodejs/src/table.rs
+++ b/nodejs/src/table.rs
@@ -3,12 +3,16 @@

 use std::collections::HashMap;

+use chrono::{DateTime, Utc};
+
 use lancedb::ipc::{ipc_file_to_batches, ipc_file_to_schema};
 use lancedb::table::{
-    AddDataMode, ColumnAlteration as LanceColumnAlteration, Duration, NewColumnTransform,
-    OptimizeAction, OptimizeOptions, Table as LanceDbTable,
+    AddDataMode, ColumnAlteration as LanceColumnAlteration, Duration,
+    FieldMetadataUpdate as LanceFieldMetadataUpdate, NewColumnTransform, OptimizeAction,
+    OptimizeOptions, Ref, Table as LanceDbTable,
 };
 use napi::bindgen_prelude::*;
+use napi::threadsafe_function::{ThreadsafeFunction, ThreadsafeFunctionCallMode};
 use napi_derive::napi;

 use crate::error::NapiErrorExt;
@@ -67,8 +71,16 @@ impl Table {
        schema_to_buffer(&schema)
    }

-    #[napi(catch_unwind)]
-    pub async fn add(&self, buf: Buffer, mode: String) -> napi::Result<AddResult> {
+    #[napi(
+        catch_unwind,
+        ts_args_type = "buf: Buffer, mode: string, progressCallback?: (progress: WriteProgressInfo) => void"
+    )]
+    pub async fn add(
+        &self,
+        buf: Buffer,
+        mode: String,
+        progress_callback: Option<ProgressFn>,
+    ) -> napi::Result<AddResult> {
        let batches = ipc_file_to_batches(buf.to_vec())
            .map_err(|e| napi::Error::from_reason(format!("Failed to read IPC file: {}", e)))?;
        let batches = batches
@@ -92,6 +104,19 @@ impl Table {
            return Err(napi::Error::from_reason(format!("Invalid mode: {}", mode)));
        };

+        if let Some(tsfn) = progress_callback {
+            op = op.progress(move |p| {
+                // NonBlocking: dispatch onto the JS event loop without
+                // blocking the writer thread.  With napi-rs's default
+                // unbounded queue, events are not dropped — a slow JS
+                // callback will just queue them.
+                tsfn.call(
+                    WriteProgressInfo::from(p),
+                    ThreadsafeFunctionCallMode::NonBlocking,
+                );
+            });
+        }
+
        let res = op.execute().await.default_error()?;
        Ok(res.into())
    }
@@ -333,6 +358,23 @@ impl Table {
        Ok(res.into())
    }

+    #[napi(catch_unwind)]
+    pub async fn update_field_metadata(
+        &self,
+        updates: Vec<FieldMetadataUpdate>,
+    ) -> napi::Result<UpdateFieldMetadataResult> {
+        let updates = updates
+            .into_iter()
+            .map(LanceFieldMetadataUpdate::from)
+            .collect::<Vec<_>>();
+        let res = self
+            .inner_ref()?
+            .update_field_metadata(&updates)
+            .await
+            .default_error()?;
+        Ok(res.into())
+    }
+
    #[napi(catch_unwind)]
    pub async fn drop_columns(&self, columns: Vec<String>) -> napi::Result<DropColumnsResult> {
        let col_refs = columns.iter().map(String::as_str).collect::<Vec<_>>();
@@ -369,6 +411,11 @@ impl Table {
            .default_error()
    }

+    #[napi(catch_unwind)]
+    pub async fn close_lsm_writers(&self) -> napi::Result<()> {
+        self.inner_ref()?.close_lsm_writers().await.default_error()
+    }
+
    #[napi(catch_unwind)]
    pub async fn version(&self) -> napi::Result<i64> {
        self.inner_ref()?
@@ -433,6 +480,19 @@ impl Table {
        })
    }

+    #[napi(catch_unwind)]
+    pub async fn branches(&self) -> napi::Result<Branches> {
+        Ok(Branches {
+            inner: self.inner_ref()?.clone(),
+        })
+    }
+
+    /// The branch this handle is scoped to, or `null` for the main branch.
+    #[napi]
+    pub fn current_branch(&self) -> napi::Result<Option<String>> {
+        Ok(self.inner_ref()?.current_branch())
+    }
+
    #[napi(catch_unwind)]
    pub async fn optimize(
        &self,
@@ -550,6 +610,43 @@ pub struct IndexConfig {
    /// Currently this is always an array of size 1. In the future there may
    /// be more columns to represent composite indices.
    pub columns: Vec<String>,
+    /// The UUID of the first segment of the index.
+    ///
+    /// `undefined` for remote tables, which do not yet surface this.
+    pub index_uuid: Option<String>,
+    /// The protobuf type URL, a precise type identifier for the index.
+    ///
+    /// `undefined` for remote tables.
+    pub type_url: Option<String>,
+    /// When the index was created.
+    ///
+    /// `undefined` for remote tables or indices created before timestamps were tracked.
+    pub created_at: Option<DateTime<Utc>>,
+    /// The number of rows indexed, across all segments.
+    ///
+    /// `undefined` for remote tables.
+    pub num_indexed_rows: Option<i64>,
+    /// The number of rows not yet covered by this index.
+    ///
+    /// `undefined` for remote tables.
+    pub num_unindexed_rows: Option<i64>,
+    /// The total size in bytes of all index files across all segments.
+    ///
+    /// `undefined` for remote tables or indices without size tracking.
+    pub size_bytes: Option<i64>,
+    /// The number of segments that make up the index.
+    ///
+    /// `undefined` for remote tables.
+    pub num_segments: Option<i32>,
+    /// The on-disk index format version.
+    ///
+    /// `undefined` for remote tables.
+    pub index_version: Option<i32>,
+    /// Index-type-specific details parsed as a JavaScript object.
+    ///
+    /// `undefined` for remote tables, when details are unavailable, or when
+    /// the details string is not valid JSON.
+    pub index_details: Option<serde_json::Value>,
 }

 impl From<lancedb::index::IndexConfig> for IndexConfig {
@@ -559,6 +656,17 @@ impl From<lancedb::index::IndexConfig> for IndexConfig {
            index_type,
            columns: value.columns,
            name: value.name,
+            index_uuid: value.index_uuid,
+            type_url: value.type_url,
+            created_at: value.created_at,
+            num_indexed_rows: value.num_indexed_rows.map(|n| n as i64),
+            num_unindexed_rows: value.num_unindexed_rows.map(|n| n as i64),
+            size_bytes: value.size_bytes.map(|n| n as i64),
+            num_segments: value.num_segments.map(|n| n as i32),
+            index_version: value.index_version,
+            index_details: value
+                .index_details
+                .and_then(|s| serde_json::from_str(&s).ok()),
        }
    }
 }
@@ -654,6 +762,44 @@ pub struct OptimizeStats {
    pub prune: RemovalStats,
 }

+/// Progress snapshot for a write operation, delivered to the JS callback
+/// passed to `Table.add`.
+#[napi(object)]
+#[derive(Clone, Debug)]
+pub struct WriteProgressInfo {
+    /// Number of rows written so far.
+    pub output_rows: i64,
+    /// Number of bytes written so far.
+    pub output_bytes: i64,
+    /// Total rows expected, if the input source reports it.
+    /// Always set on the final callback (where `done` is `true`).
+    pub total_rows: Option<i64>,
+    /// Wall-clock seconds since monitoring started.
+    pub elapsed_seconds: f64,
+    /// Number of parallel write tasks currently in flight.
+    pub active_tasks: i64,
+    /// Total number of parallel write tasks (the write parallelism).
+    pub total_tasks: i64,
+    /// `true` for the final callback; `false` otherwise.
+    pub done: bool,
+}
+
+impl From<&lancedb::table::write_progress::WriteProgress> for WriteProgressInfo {
+    fn from(p: &lancedb::table::write_progress::WriteProgress) -> Self {
+        Self {
+            output_rows: p.output_rows() as i64,
+            output_bytes: p.output_bytes() as i64,
+            total_rows: p.total_rows().map(|n| n as i64),
+            elapsed_seconds: p.elapsed().as_secs_f64(),
+            active_tasks: p.active_tasks() as i64,
+            total_tasks: p.total_tasks() as i64,
+            done: p.done(),
+        }
+    }
+}
+
+type ProgressFn = ThreadsafeFunction<WriteProgressInfo, (), WriteProgressInfo, Status, false>;
+
 ///  A definition of a column alteration. The alteration changes the column at
 /// `path` to have the new name `name`, to be nullable if `nullable` is true,
 /// and to have the data type `data_type`. At least one of `rename` or `nullable`
@@ -682,6 +828,29 @@ pub struct ColumnAlteration {
    pub nullable: Option<bool>,
 }

+/// A per-field metadata update, addressed by dot-path. Merges into the field's
+/// existing metadata by default; a `null` value deletes a key, and `replace`
+/// swaps the field's entire metadata map.
+#[napi(object)]
+pub struct FieldMetadataUpdate {
+    /// Dot-separated path to the field (e.g. "embedding" or "a.b.c").
+    pub path: String,
+    /// Metadata keys to set; a `null` value deletes that key.
+    pub metadata: HashMap<String, Option<String>>,
+    /// If true, replace the field's entire metadata map instead of merging.
+    pub replace: Option<bool>,
+}
+
+impl From<FieldMetadataUpdate> for LanceFieldMetadataUpdate {
+    fn from(js: FieldMetadataUpdate) -> Self {
+        Self {
+            path: js.path,
+            metadata: js.metadata,
+            replace: js.replace.unwrap_or(false),
+        }
+    }
+}
+
 impl TryFrom<ColumnAlteration> for LanceColumnAlteration {
    type Error = String;
    fn try_from(js: ColumnAlteration) -> std::result::Result<Self, Self::Error> {
@@ -732,9 +901,6 @@ pub struct IndexStatistics {
    pub distance_type: Option<String>,
    /// The number of parts this index is split into.
    pub num_indices: Option<u32>,
-    /// The KMeans loss value of the index,
-    /// it is only present for vector indices.
-    pub loss: Option<f64>,
 }
 impl From<lancedb::index::IndexStatistics> for IndexStatistics {
    fn from(value: lancedb::index::IndexStatistics) -> Self {
@@ -744,7 +910,6 @@ impl From<lancedb::index::IndexStatistics> for IndexStatistics {
            index_type: value.index_type.to_string(),
            distance_type: value.distance_type.map(|d| d.to_string()),
            num_indices: value.num_indices,
-            loss: value.loss,
        }
    }
 }
@@ -880,6 +1045,7 @@ pub struct MergeResult {
    pub num_updated_rows: i64,
    pub num_deleted_rows: i64,
    pub num_attempts: i64,
+    pub num_rows: i64,
 }

 impl From<lancedb::table::MergeResult> for MergeResult {
@@ -890,6 +1056,7 @@ impl From<lancedb::table::MergeResult> for MergeResult {
            num_updated_rows: value.num_updated_rows as i64,
            num_deleted_rows: value.num_deleted_rows as i64,
            num_attempts: value.num_attempts as i64,
+            num_rows: value.num_rows as i64,
        }
    }
 }
@@ -920,6 +1087,19 @@ impl From<lancedb::table::AlterColumnsResult> for AlterColumnsResult {
    }
 }

+#[napi(object)]
+pub struct UpdateFieldMetadataResult {
+    pub version: i64,
+}
+
+impl From<lancedb::table::UpdateFieldMetadataResult> for UpdateFieldMetadataResult {
+    fn from(value: lancedb::table::UpdateFieldMetadataResult) -> Self {
+        Self {
+            version: value.version as i64,
+        }
+    }
+}
+
 #[napi(object)]
 pub struct DropColumnsResult {
    pub version: i64,
@@ -939,6 +1119,13 @@ pub struct TagContents {
    pub manifest_size: i64,
 }

+#[napi]
+pub struct BranchContents {
+    pub parent_branch: Option<String>,
+    pub parent_version: i64,
+    pub manifest_size: i64,
+}
+
 #[napi]
 pub struct Tags {
    inner: LanceDbTable,
@@ -1007,3 +1194,75 @@ impl Tags {
            .default_error()
    }
 }
+
+#[napi]
+pub struct Branches {
+    inner: LanceDbTable,
+}
+
+#[napi]
+impl Branches {
+    #[napi]
+    pub async fn list(&self) -> napi::Result<HashMap<String, BranchContents>> {
+        let branches = self.inner.list_branches().await.default_error()?;
+        let result = branches
+            .into_iter()
+            .map(|(k, v)| {
+                (
+                    k,
+                    BranchContents {
+                        parent_branch: v.parent_branch,
+                        parent_version: v.parent_version as i64,
+                        manifest_size: v.manifest_size as i64,
+                    },
+                )
+            })
+            .collect();
+        Ok(result)
+    }
+
+    #[napi]
+    pub async fn create(
+        &self,
+        name: String,
+        from_ref: Option<String>,
+        from_version: Option<i64>,
+    ) -> napi::Result<Table> {
+        let from_ref = from_ref.filter(|b| b != "main");
+        let from_version = from_version
+            .map(|v| {
+                u64::try_from(v).map_err(|_| {
+                    napi::Error::from_reason("from_version must be a non-negative integer")
+                })
+            })
+            .transpose()?;
+        let from = Ref::Version(from_ref, from_version);
+        let table = self
+            .inner
+            .create_branch(&name, from)
+            .await
+            .default_error()?;
+        Ok(Table::new(table))
+    }
+
+    #[napi]
+    pub async fn checkout(&self, name: String, version: Option<i64>) -> napi::Result<Table> {
+        let version = version
+            .map(|v| {
+                u64::try_from(v)
+                    .map_err(|_| napi::Error::from_reason("version must be a non-negative integer"))
+            })
+            .transpose()?;
+        let table = self
+            .inner
+            .checkout_branch(&name, version)
+            .await
+            .default_error()?;
+        Ok(Table::new(table))
+    }
+
+    #[napi]
+    pub async fn delete(&self, name: String) -> napi::Result<()> {
+        self.inner.delete_branch(&name).await.default_error()
+    }
+}
--- a/python/.bumpversion.toml
+++ b/python/.bumpversion.toml
@@ -1,5 +1,5 @@
 [tool.bumpversion]
-current_version = "0.32.1-beta.0"
+current_version = "0.33.1-beta.2"
 parse = """(?x)
    (?P<major>0|[1-9]\\d*)\\.
    (?P<minor>0|[1-9]\\d*)\\.
--- a/python/AGENTS.md
+++ b/python/AGENTS.md
@@ -4,16 +4,26 @@ code is in the `src/` directory and the Python bindings are in the `lancedb/` di

 Common commands:

+* Bootstrap dev env: `uv run --extra tests --extra dev maturin develop --extras tests,dev`
 * Build: `make develop`
 * Format: `make format`
 * Lint: `make check`
 * Fix lints: `make fix`
-* Test: `make test`
-* Doc test: `make doctest`
+* Test: `uv run --extra tests pytest python/tests -vv --durations=10 -m "not slow and not s3_test"`
+* Run specific test: `uv run --extra tests pytest python/tests/<test_file>.py::<test_name> -q`
+* Doc test: `uv run --extra tests pytest --doctest-modules python/lancedb`
+
+Use the uv-managed environment declared by `uv.lock` for Python validation. Do
+not treat system `python`, global `pytest`, or missing editable-install errors
+as final blockers; bootstrap or enter the uv environment instead. `make test`
+and `make doctest` assume the development environment is already prepared.

 Before committing changes, run lints and then formatting.

-When you change the Rust code, you will need to recompile the Python bindings: `make develop`.
+When you change the Rust code, PyO3 binding code, or see a missing/stale
+`lancedb._lancedb`, recompile the Python bindings with
+`uv run --extra tests --extra dev maturin develop --extras tests,dev` before
+running tests.

 When you export new types from Rust to Python, you must manually update `python/lancedb/_lancedb.pyi`
 with the corresponding type hints. You can run `pyright` to check for type errors in the Python code.
--- a/python/Cargo.toml
+++ b/python/Cargo.toml
@@ -1,6 +1,6 @@
 [package]
 name = "lancedb-python"
-version = "0.32.1-beta.0"
+version = "0.33.1-beta.2"
 publish = false
 edition.workspace = true
 description = "Python bindings for LanceDB"
@@ -26,7 +26,8 @@ lance-namespace-impls.workspace = true
 lance-io.workspace = true
 env_logger.workspace = true
 log.workspace = true
-pyo3 = { version = "0.28", features = ["extension-module", "abi3-py39"] }
+pyo3 = { version = "0.28", features = ["extension-module", "abi3-py39", "chrono"] }
+chrono = { version = "0.4", default-features = false, features = ["clock"] }
 pyo3-async-runtimes = { version = "0.28", features = [
    "attributes",
    "tokio-runtime",
--- a/python/python/lancedb/__init__.py
+++ b/python/python/lancedb/__init__.py
@@ -94,7 +94,6 @@ def connect(
    host_override: str, optional
        The override url for LanceDB Cloud.
    read_consistency_interval: timedelta, default None
-        (For LanceDB OSS only)
        The interval at which to check for updates to the table from other
        processes. If None, then consistency is not checked. For performance
        reasons, this is the default. For strong consistency, set this to
@@ -104,6 +103,10 @@ def connect(
        the last check, then the table will be checked for updates. Note: this
        consistency only applies to read operations. Write operations are
        always consistent.
+
+        Stronger consistency is not free. The smaller the interval, the more
+        often each read pays the cost of checking for updates against object
+        storage, raising per-read latency and cost.
    client_config: ClientConfig or dict, optional
        Configuration options for the LanceDB Cloud HTTP client. If a dict, then
        the keys are the attributes of the ClientConfig class. If None, then the
@@ -147,6 +150,13 @@ def connect(
    >>> db = lancedb.connect("s3://my-bucket/lancedb",
    ...                      storage_options={"aws_access_key_id": "***"})

+    For tests and temporary data, use an in-memory database:
+
+    >>> db = lancedb.connect("memory://")
+
+    In-memory databases are not persisted. Tables are dropped when the last
+    connection or table handle referencing them is closed.
+
    Connect to LanceDB cloud:

    >>> db = lancedb.connect("db://my_database", api_key="ldb_...",
@@ -210,6 +220,7 @@ def connect(
            request_thread_pool=request_thread_pool,
            client_config=client_config,
            storage_options=storage_options,
+            read_consistency_interval=read_consistency_interval,
            **kwargs,
        )
    _check_s3_bucket_with_dots(str(uri), storage_options)
@@ -304,6 +315,15 @@ def deserialize_conn(
            manifest_enabled=parsed.get("manifest_enabled", False),
            namespace_client_properties=parsed.get("namespace_client_properties"),
        )
+    elif connection_type == "remote":
+        return RemoteDBConnection(
+            parsed["db_url"],
+            parsed["api_key"],
+            parsed.get("region", "us-east-1"),
+            host_override=parsed.get("host_override"),
+            client_config=parsed.get("client_config"),
+            storage_options=storage_options,
+        )
    else:
        raise ValueError(f"Unknown connection_type: {connection_type}")

@@ -320,7 +340,6 @@ async def connect_async(
    session: Optional[Session] = None,
    manifest_enabled: bool = False,
    namespace_client_properties: Optional[Dict[str, str]] = None,
-    oauth_config=None,
 ) -> AsyncConnection:
    """Connect to a LanceDB database.

@@ -337,7 +356,6 @@ async def connect_async(
    host_override: str, optional
        The override url for LanceDB Cloud.
    read_consistency_interval: timedelta, default None
-        (For LanceDB OSS only)
        The interval at which to check for updates to the table from other
        processes. If None, then consistency is not checked. For performance
        reasons, this is the default. For strong consistency, set this to
@@ -347,6 +365,10 @@ async def connect_async(
        the last check, then the table will be checked for updates. Note: this
        consistency only applies to read operations. Write operations are
        always consistent.
+
+        Stronger consistency is not free. The smaller the interval, the more
+        often each read pays the cost of checking for updates against object
+        storage, raising per-read latency and cost.
    client_config: ClientConfig or dict, optional
        Configuration options for the LanceDB Cloud HTTP client. If a dict, then
        the keys are the attributes of the ClientConfig class. If None, then the
@@ -379,6 +401,8 @@ async def connect_async(
    ...     db = await lancedb.connect_async("s3://my-bucket/lancedb",
    ...                                      storage_options={
    ...                                          "aws_access_key_id": "***"})
+    ...     # For tests and temporary data, use an in-memory database
+    ...     db = await lancedb.connect_async("memory://")
    ...     # Connect to LanceDB cloud
    ...     db = await lancedb.connect_async("db://my_database", api_key="ldb_...",
    ...                                      client_config={
@@ -411,7 +435,6 @@ async def connect_async(
            session,
            manifest_enabled,
            namespace_client_properties,
-            oauth_config,
        )
    )

--- a/python/python/lancedb/_lancedb.pyi
+++ b/python/python/lancedb/_lancedb.pyi
@@ -1,4 +1,4 @@
-from datetime import timedelta
+from datetime import datetime, timedelta
 from typing import Dict, List, Optional, Tuple, Any, TypedDict, Union, Literal

 import pyarrow as pa
@@ -10,6 +10,7 @@ from .index import (
    IvfSq,
    Bitmap,
    LabelList,
+    Fm,
    HnswPq,
    HnswSq,
    HnswFlat,
@@ -47,6 +48,7 @@ class PyExpr:
    def lower(self) -> "PyExpr": ...
    def upper(self) -> "PyExpr": ...
    def contains(self, substr: "PyExpr") -> "PyExpr": ...
+    def isin(self, values: List["PyExpr"]) -> "PyExpr": ...
    def cast(self, data_type: pa.DataType) -> "PyExpr": ...
    def to_sql(self) -> str: ...

@@ -186,6 +188,7 @@ class Table:
            BTree,
            Bitmap,
            LabelList,
+            Fm,
            FTS,
        ],
        replace: Optional[bool],
@@ -202,12 +205,15 @@ class Table:
    async def prewarm_index(self, index_name: str) -> None: ...
    async def prewarm_data(self, columns: Optional[List[str]] = None) -> None: ...
    async def list_indices(self) -> list[IndexConfig]: ...
-    async def delete(self, filter: str) -> DeleteResult: ...
+    async def delete(self, filter: Union[str, PyExpr]) -> DeleteResult: ...
    async def add_columns(self, columns: list[tuple[str, str]]) -> AddColumnsResult: ...
    async def add_columns_with_schema(self, schema: pa.Schema) -> AddColumnsResult: ...
    async def alter_columns(
        self, columns: list[dict[str, Any]]
    ) -> AlterColumnsResult: ...
+    async def update_field_metadata(
+        self, updates: list[dict[str, Any]]
+    ) -> UpdateFieldMetadataResult: ...
    async def optimize(
        self,
        *,
@@ -220,8 +226,12 @@ class Table:
    async def set_unenforced_primary_key(self, columns: List[str]) -> None: ...
    async def set_lsm_write_spec(self, spec: LsmWriteSpec) -> None: ...
    async def unset_lsm_write_spec(self) -> None: ...
+    async def close_lsm_writers(self) -> None: ...
    @property
    def tags(self) -> Tags: ...
+    @property
+    def branches(self) -> Branches: ...
+    def current_branch(self) -> Optional[str]: ...
    def query(self) -> Query: ...
    def take_offsets(self, offsets: list[int]) -> TakeQuery: ...
    def take_row_ids(self, row_ids: list[int]) -> TakeQuery: ...
@@ -234,10 +244,30 @@ class Tags:
    async def delete(self, tag: str): ...
    async def update(self, tag: str, version: int): ...

+class Branches:
+    async def list(self) -> Dict[str, Any]: ...
+    async def create(
+        self,
+        name: str,
+        from_ref: Optional[str] = None,
+        from_version: Optional[int] = None,
+    ) -> Table: ...
+    async def checkout(self, name: str, version: Optional[int] = None) -> Table: ...
+    async def delete(self, name: str) -> None: ...
+
 class IndexConfig:
    name: str
    index_type: str
    columns: List[str]
+    index_uuid: Optional[str]
+    type_url: Optional[str]
+    created_at: Optional[datetime]
+    num_indexed_rows: Optional[int]
+    num_unindexed_rows: Optional[int]
+    size_bytes: Optional[int]
+    num_segments: Optional[int]
+    index_version: Optional[int]
+    index_details: Optional[Any]

 async def connect(
    uri: str,
@@ -250,7 +280,6 @@ async def connect(
    session: Optional[Session],
    manifest_enabled: bool = False,
    namespace_client_properties: Optional[Dict[str, str]] = None,
-    oauth_config: Optional[Any] = None,
 ) -> Connection: ...

 class RecordBatchStream:
@@ -421,6 +450,7 @@ class MergeResult:
    num_inserted_rows: int
    num_deleted_rows: int
    num_attempts: int
+    num_rows: int

 class LsmWriteSpec:
    """Specification selecting Lance's MemWAL LSM-style write path for
@@ -459,6 +489,9 @@ class AddColumnsResult:
 class AlterColumnsResult:
    version: int

+class UpdateFieldMetadataResult:
+    version: int
+
 class DropColumnsResult:
    version: int

--- a/python/python/lancedb/background_loop.py
+++ b/python/python/lancedb/background_loop.py
@@ -2,6 +2,7 @@
 # SPDX-FileCopyrightText: Copyright The LanceDB Authors

 import asyncio
+import concurrent.futures
 import os
 import threading
 import warnings
@@ -37,6 +38,24 @@ class BackgroundEventLoop:

 LOOP = BackgroundEventLoop()

+
+def _new_embedding_executor() -> concurrent.futures.ThreadPoolExecutor:
+    return concurrent.futures.ThreadPoolExecutor(thread_name_prefix="lancedb-embedding")
+
+
+# Embedding functions can block for a long time -- a heavy local model or an
+# HTTP request to a remote embeddings API. Running them on asyncio's default
+# executor lets them starve the unrelated blocking I/O that shares that pool,
+# so they get a dedicated one. See
+# https://github.com/lancedb/lancedb/issues/3310.
+_EMBEDDING_EXECUTOR = _new_embedding_executor()
+
+
+def embedding_executor() -> concurrent.futures.ThreadPoolExecutor:
+    """Return the executor dedicated to running blocking embedding calls."""
+    return _EMBEDDING_EXECUTOR
+
+
 _FORK_WARNED = False


@@ -47,6 +66,12 @@ def _reset_after_fork():
    # the new state. The Rust-side tokio runtime is reset analogously by a
    # pthread_atfork hook installed in the _lancedb extension.
    LOOP._start()
+    # The embedding executor's worker threads are dead in the child as well.
+    # Replace it with a fresh pool (threads are spawned lazily, so this is
+    # cheap); we don't shut down the old one, since joining its dead workers
+    # could hang.
+    global _EMBEDDING_EXECUTOR
+    _EMBEDDING_EXECUTOR = _new_embedding_executor()
    global _FORK_WARNED
    if not _FORK_WARNED:
        _FORK_WARNED = True
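
Note on the `background_loop.py` change: the dedicated-executor idea can be illustrated in isolation with the standard library alone. The sketch below is not the LanceDB implementation. `slow_embed` and `embed_async` are hypothetical names introduced only for illustration, and the thread-name prefix is copied from the diff. It shows that passing an explicit executor to `run_in_executor` keeps blocking work off asyncio's default pool.

```python
import asyncio
import concurrent.futures
import threading
from typing import Tuple

# Hypothetical stand-in for the module-level pool; the prefix mirrors the diff.
_EXECUTOR = concurrent.futures.ThreadPoolExecutor(
    thread_name_prefix="lancedb-embedding"
)


def slow_embed(text: str) -> Tuple[str, int]:
    """Hypothetical blocking embedding call; reports which thread ran it."""
    return threading.current_thread().name, len(text)


async def embed_async(text: str) -> Tuple[str, int]:
    # An explicit executor keeps this work off asyncio's default pool,
    # so unrelated blocking I/O on that pool is not starved.
    loop = asyncio.get_running_loop()
    return await loop.run_in_executor(_EXECUTOR, slow_embed, text)


thread_name, length = asyncio.run(embed_async("hello"))
```

Here `thread_name` carries the `lancedb-embedding` prefix, which is a cheap way to confirm in a test that the call ran on the dedicated pool rather than the default one.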
--- a/python/python/lancedb/db.py
+++ b/python/python/lancedb/db.py
@@ -8,7 +8,17 @@ from abc import abstractmethod
 from datetime import timedelta
 from pathlib import Path
 import sys
-from typing import TYPE_CHECKING, Any, Dict, Iterable, List, Literal, Optional, Union
+from typing import (
+    TYPE_CHECKING,
+    Any,
+    Dict,
+    Generator,
+    Iterable,
+    List,
+    Literal,
+    Optional,
+    Union,
+)

 if sys.version_info >= (3, 12):
    from typing import override
@@ -313,7 +323,7 @@ class DBConnection(EnforceOverrides):
        >>> data = [{"vector": [1.1, 1.2], "lat": 45.5, "long": -122.7},
        ...         {"vector": [0.2, 1.8], "lat": 40.1, "long":  -74.1}]
        >>> db.create_table("my_table", data)
-        LanceTable(name='my_table', version=1, ...)
+        LanceTable(name='my_table', ...)
        >>> db["my_table"].head()
        pyarrow.Table
        vector: fixed_size_list<item: float>[2]
@@ -334,7 +344,7 @@ class DBConnection(EnforceOverrides):
        ...    "long": [-122.7, -74.1]
        ... })
        >>> db.create_table("table2", data)
-        LanceTable(name='table2', version=1, ...)
+        LanceTable(name='table2', ...)
        >>> db["table2"].head()
        pyarrow.Table
        vector: fixed_size_list<item: float>[2]
@@ -357,7 +367,7 @@ class DBConnection(EnforceOverrides):
        ...   pa.field("long", pa.float32())
        ... ])
        >>> db.create_table("table3", data, schema = custom_schema)
-        LanceTable(name='table3', version=1, ...)
+        LanceTable(name='table3', ...)
        >>> db["table3"].head()
        pyarrow.Table
        vector: fixed_size_list<item: float>[2]
@@ -391,7 +401,7 @@ class DBConnection(EnforceOverrides):
        ...     pa.field("price", pa.float32()),
        ... ])
        >>> db.create_table("table4", make_batches(), schema=schema)
-        LanceTable(name='table4', version=1, ...)
+        LanceTable(name='table4', ...)

        """
        raise NotImplementedError
@@ -406,6 +416,8 @@ class DBConnection(EnforceOverrides):
        namespace_path: Optional[List[str]] = None,
        storage_options: Optional[Dict[str, str]] = None,
        index_cache_size: Optional[int] = None,
+        branch: Optional[str] = None,
+        version: Optional[int] = None,
    ) -> Table:
        """Open a Lance Table in the database.

@@ -434,6 +446,14 @@ class DBConnection(EnforceOverrides):
            connection will be inherited by the table, but can be overridden here.
            See available options at
            <https://docs.lancedb.com/storage/>
+        branch: str, optional
+            If provided, open a handle scoped to this branch instead of the
+            default branch. Reads and writes operate in the branch's context.
+        version: int, optional
+            If provided, open the table pinned to this version, producing a
+            read-only handle. Composes with ``branch``: when both are given,
+            opens that branch at the version; otherwise opens ``main`` at the
+            version. Call ``checkout_latest`` to return to a writable state.

        Returns
        -------
@@ -568,15 +588,15 @@ class LanceDBConnection(DBConnection):
    >>> db = lancedb.connect("./.lancedb")
    >>> db.create_table("my_table", data=[{"vector": [1.1, 1.2], "b": 2},
    ...                                   {"vector": [0.5, 1.3], "b": 4}])
-    LanceTable(name='my_table', version=1, ...)
+    LanceTable(name='my_table', ...)
    >>> db.create_table("another_table", data=[{"vector": [0.4, 0.4], "b": 6}])
-    LanceTable(name='another_table', version=1, ...)
+    LanceTable(name='another_table', ...)
    >>> sorted(db.table_names())
    ['another_table', 'my_table']
    >>> len(db)
    2
    >>> db["my_table"]
-    LanceTable(name='my_table', version=1, ...)
+    LanceTable(name='my_table', ...)
    >>> "my_table" in db
    True
    >>> db.drop_table("my_table")
@@ -847,11 +867,20 @@ class LanceDBConnection(DBConnection):
            )
        )

+    def _all_table_names(self) -> Generator[str, None, None]:
+        page_token = None
+        while True:
+            response = self.list_tables(page_token=page_token)
+            yield from response.tables
+            page_token = response.page_token
+            if not page_token:
+                return
+
    def __len__(self) -> int:
-        return len(self.table_names())
+        return sum(1 for _ in self._all_table_names())

    def __contains__(self, name: str) -> bool:
-        return name in self.table_names()
+        return name in self._all_table_names()

    @override
    def create_table(
@@ -939,6 +968,8 @@ class LanceDBConnection(DBConnection):
        namespace_path: Optional[List[str]] = None,
        storage_options: Optional[Dict[str, str]] = None,
        index_cache_size: Optional[int] = None,
+        branch: Optional[str] = None,
+        version: Optional[int] = None,
    ) -> LanceTable:
        """Open a table in the database.

@@ -949,6 +980,14 @@ class LanceDBConnection(DBConnection):
        namespace_path: List[str], optional
            The namespace to open the table from.  When non-empty, the
            table is resolved through the directory namespace client.
+        branch: str, optional
+            If provided, open a handle scoped to this branch instead of the
+            default branch. Reads and writes operate in the branch's context.
+        version: int, optional
+            If provided, open the table pinned to this version, producing a
+            read-only handle. Composes with ``branch``: when both are given,
+            opens that branch at the version; otherwise opens ``main`` at the
+            version. Call ``checkout_latest`` to return to a writable state.

        Returns
        -------
@@ -968,20 +1007,26 @@ class LanceDBConnection(DBConnection):
            )

        if namespace_path:
-            return self._namespace_conn().open_table(
+            tbl = self._namespace_conn().open_table(
+                name,
+                namespace_path=namespace_path,
+                storage_options=storage_options,
+                index_cache_size=index_cache_size,
+            )
+        else:
+            tbl = LanceTable.open(
+                self,
                name,
                namespace_path=namespace_path,
                storage_options=storage_options,
                index_cache_size=index_cache_size,
            )

-        return LanceTable.open(
-            self,
-            name,
-            namespace_path=namespace_path,
-            storage_options=storage_options,
-            index_cache_size=index_cache_size,
-        )
+        if branch is not None:
+            tbl = tbl.branches.checkout(branch, version)
+        elif version is not None:
+            tbl.checkout(version)
+        return tbl

    def clone_table(
        self,
@@ -1622,6 +1667,8 @@ class AsyncConnection(object):
        location: Optional[str] = None,
        namespace_client: Optional[Any] = None,
        managed_versioning: Optional[bool] = None,
+        branch: Optional[str] = None,
+        version: Optional[int] = None,
    ) -> AsyncTable:
        """Open a Lance Table in the database.

@@ -1657,6 +1704,14 @@ class AsyncConnection(object):
        managed_versioning: bool, optional
            Whether managed versioning is enabled for this table. If provided,
            avoids a redundant describe_table call when namespace_client is set.
+        branch: str, optional
+            If provided, open a handle scoped to this branch instead of the
+            default branch. Reads and writes operate in the branch's context.
+        version: int, optional
+            If provided, open the table pinned to this version, producing a
+            read-only handle. Composes with ``branch``: when both are given,
+            opens that branch at the version; otherwise opens ``main`` at the
+            version. Call ``checkout_latest`` to return to a writable state.

        Returns
        -------
@@ -1673,7 +1728,14 @@ class AsyncConnection(object):
            namespace_client=namespace_client,
            managed_versioning=managed_versioning,
        )
-        return AsyncTable(table)
+        tbl = AsyncTable(table)
+        # "main" is the default branch, so treat it as no branch: remote rejects
+        # every branch checkout (even "main"), and the version still applies.
+        if branch is not None and branch != "main":
+            tbl = await tbl.branches.checkout(branch, version)
+        elif version is not None:
+            await tbl.checkout(version)
+        return tbl

    async def clone_table(
        self,
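The branch/version dispatch added to the `open_table` variants above can be summarized in a small stand-alone sketch. `plan_open` and its `main_is_default` flag are hypothetical names used only for illustration, not part of LanceDB. The sketch mirrors the `if`/`elif` ordering in this diff: the async paths treat `"main"` as no branch, while the sync paths pass it through to the branch checkout.

```python
from typing import Optional, Tuple

# (action, branch, version); the action names are illustrative labels only.
Plan = Tuple[str, Optional[str], Optional[int]]


def plan_open(
    branch: Optional[str], version: Optional[int], *, main_is_default: bool
) -> Plan:
    """Describe which checkout an open_table call would perform."""
    # Assumption: with main_is_default=True (async paths in this diff),
    # naming "main" is equivalent to passing no branch at all.
    if main_is_default and branch == "main":
        branch = None
    if branch is not None:
        # A branch checkout carries the optional version along with it.
        return ("branch_checkout", branch, version)
    if version is not None:
        # No branch: pin the default branch to the requested version.
        return ("version_checkout", None, version)
    # Neither given: the handle stays on the latest version.
    return ("latest", None, None)
```

Under these assumptions, `plan_open("main", 3, main_is_default=True)` still pins version 3, which matches the "the version still applies" comment in the async code path.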
--- a/python/python/lancedb/expr.py
+++ b/python/python/lancedb/expr.py
@@ -19,7 +19,7 @@ operators::

 from __future__ import annotations

-from typing import Union
+from typing import Iterable, Union

 import pyarrow as pa

@@ -174,6 +174,11 @@ class Expr:
        """Return True where the string contains *substr*."""
        return Expr(self._inner.contains(_coerce(substr)._inner))

+    def isin(self, values: "Iterable[ExprLike]") -> "Expr":
+        """Return True where the value is one of *values* (SQL ``IN``)."""
+        inner = [_coerce(v)._inner for v in values]
+        return Expr(self._inner.isin(inner))
+
    # ── type cast ────────────────────────────────────────────────────────────

    def cast(self, data_type: Union[str, "pa.DataType"]) -> "Expr":
--- a/python/python/lancedb/index.py
+++ b/python/python/lancedb/index.py
@@ -93,6 +93,20 @@ class LabelList:
    pass


+@dataclass
+class Fm:
+    """Describe an FM-Index configuration.
+
+    `Fm` is a scalar index on string or binary columns that accelerates
+    substring search, i.e. `contains(col, 'needle')`. Unlike the tokenized
+    `FTS` index, it matches arbitrary substrings of the raw bytes.
+
+    For example, it works with `url`, `path`, `content`, etc.
+    """
+
+    pass
+
+
 @dataclass
 class FTS:
    """Describe a FTS index configuration.
@@ -281,6 +295,9 @@ class HnswPq:
    m: int = 20
    ef_construction: int = 300
    target_partition_size: Optional[int] = None
+    # Name of the accelerator (e.g. "cuda") to use for IVF training. When set,
+    # create_index() dispatches to pylance to build the index on the accelerator.
+    accelerator: Optional[str] = None


 @dataclass
@@ -386,6 +403,9 @@ class HnswSq:
    m: int = 20
    ef_construction: int = 300
    target_partition_size: Optional[int] = None
+    # Name of the accelerator (e.g. "cuda") to use for IVF training. When set,
+    # create_index() dispatches to pylance to build the index on the accelerator.
+    accelerator: Optional[str] = None


 @dataclass
@@ -579,6 +599,9 @@ class IvfFlat:
    max_iterations: int = 50
    sample_rate: int = 256
    target_partition_size: Optional[int] = None
+    # Name of the accelerator (e.g. "cuda") to use for IVF training. When set,
+    # create_index() dispatches to pylance to build the index on the accelerator.
+    accelerator: Optional[str] = None


 @dataclass
@@ -609,6 +632,9 @@ class IvfSq:
    max_iterations: int = 50
    sample_rate: int = 256
    target_partition_size: Optional[int] = None
+    # Name of the accelerator (e.g. "cuda") to use for IVF training. When set,
+    # create_index() dispatches to pylance to build the index on the accelerator.
+    accelerator: Optional[str] = None


 @dataclass
@@ -739,6 +765,9 @@ class IvfPq:
    max_iterations: int = 50
    sample_rate: int = 256
    target_partition_size: Optional[int] = None
+    # Name of the accelerator (e.g. "cuda") to use for IVF training. When set,
+    # create_index() dispatches to pylance to build the index on the accelerator.
+    accelerator: Optional[str] = None


 @dataclass
@@ -792,6 +821,9 @@ class IvfRq:
    max_iterations: int = 50
    sample_rate: int = 256
    target_partition_size: Optional[int] = None
+    # Name of the accelerator (e.g. "cuda") to use for IVF training. When set,
+    # create_index() dispatches to pylance to build the index on the accelerator.
+    accelerator: Optional[str] = None


 __all__ = [
@@ -810,4 +842,5 @@ __all__ = [
    "FTS",
    "Bitmap",
    "LabelList",
+    "Fm",
 ]
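The `Fm` docstring above contrasts substring search with the tokenized `FTS` index. The toy comparison below illustrates that difference only; it is not how either index is implemented, and `token_match` and `substring_match` are hypothetical helpers with a deliberately crude tokenizer.

```python
import re


def token_match(text: str, term: str) -> bool:
    """Toy tokenized lookup: the term must equal a whole word-like token."""
    tokens = re.findall(r"\w+", text.lower())
    return term.lower() in tokens


def substring_match(text: str, needle: str) -> bool:
    """Toy substring lookup over the raw UTF-8 bytes."""
    return needle.encode("utf-8") in text.encode("utf-8")


url = "https://example.com/docs/index.html"
```

With this `url`, `"docs"` is a whole token and matches either way, while `"ocs/ind"` spans token boundaries and is found only by the byte-level substring check.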
--- a/python/python/lancedb/merge.py
+++ b/python/python/lancedb/merge.py
@@ -5,7 +5,9 @@
 from __future__ import annotations

 from datetime import timedelta
-from typing import TYPE_CHECKING, List, Optional
+from typing import TYPE_CHECKING, List, Optional, Union
+
+from .expr import Expr

 if TYPE_CHECKING:
    from .common import DATA
@@ -32,8 +34,11 @@ class LanceMergeInsertBuilder(object):
        self._when_not_matched_insert_all = False
        self._when_not_matched_by_source_delete = False
        self._when_not_matched_by_source_condition = None
+        self._when_not_matched_by_source_condition_expr = None
        self._timeout = None
        self._use_index = True
+        self._use_lsm_write = None
+        self._validate_single_shard = None

    def when_matched_update_all(
        self, *, where: Optional[str] = None
@@ -60,7 +65,7 @@ class LanceMergeInsertBuilder(object):
        return self

    def when_not_matched_by_source_delete(
-        self, condition: Optional[str] = None
+        self, condition: Union[str, Expr, None] = None
    ) -> LanceMergeInsertBuilder:
        """
        Rows that exist only in the target table (old data) will be
@@ -69,13 +74,16 @@ class LanceMergeInsertBuilder(object):

        Parameters
        ----------
-        condition: Optional[str], default None
+        condition: str or :class:`~lancedb.expr.Expr` or None, default None
            If None then all such rows will be deleted.  Otherwise the
-            condition will be used as an SQL filter to limit what rows
-            are deleted.
+            condition will be used as a filter to limit what rows are deleted.
+            Can be a SQL string or a type-safe :class:`~lancedb.expr.Expr`
+            built with :func:`~lancedb.expr.col` and :func:`~lancedb.expr.lit`.
        """
        self._when_not_matched_by_source_delete = True
-        if condition is not None:
+        if isinstance(condition, Expr):
+            self._when_not_matched_by_source_condition_expr = condition._inner
+        elif condition is not None:
            self._when_not_matched_by_source_condition = condition
        return self

@@ -96,6 +104,46 @@ class LanceMergeInsertBuilder(object):
        self._use_index = use_index
        return self

+    def use_lsm_write(self, use_lsm_write: bool) -> LanceMergeInsertBuilder:
+        """
+        Controls whether the merge uses the MemWAL LSM write path.
+
+        By default (unset), a `merge_insert` on a table with an LSM write spec
+        is routed through Lance's MemWAL shard writer, and a table without one
+        uses the standard path. Pass `False` to force the standard path even
+        when a spec is set. Pass `True` to require a spec — `merge_insert`
+        raises an error if none is installed.
+
+        Parameters
+        ----------
+        use_lsm_write: bool
+            Whether to use the LSM write path.
+        """
+        self._use_lsm_write = use_lsm_write
+        return self
+
+    def validate_single_shard(
+        self, validate_single_shard: bool
+    ) -> LanceMergeInsertBuilder:
+        """
+        Controls how an LSM merge checks that its input targets a single shard.
+
+        When a table has an LSM write spec, every row in a `merge_insert` call
+        must route to the same shard. When `True` (the default), every row is
+        inspected to verify this. When `False`, only the first row is inspected
+        and the shard it routes to is used for the whole input — a faster path
+        for callers that have already pre-sharded their input.
+
+        Has no effect on tables without an LSM write spec.
+
+        Parameters
+        ----------
+        validate_single_shard: bool
+            Whether to check that every row routes to one shard. Defaults to `True`.
+        """
+        self._validate_single_shard = validate_single_shard
+        return self
+
    def execute(
        self,
        new_data: DATA,
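The two modes described in the `validate_single_shard` docstring can be sketched without any LanceDB dependency. Everything here is an assumption made for illustration: `resolve_single_shard`, the `route` callback, the parity-based router, and the choice to raise `ValueError` on empty or mixed input are not taken from the real MemWAL shard writer.

```python
from typing import Callable, Dict, Hashable, Iterable

Row = Dict[str, int]


def route_by_parity(row: Row) -> int:
    """Hypothetical router: even ids go to shard 0, odd ids to shard 1."""
    return row["id"] % 2


def resolve_single_shard(
    rows: Iterable[Row],
    route: Callable[[Row], Hashable],
    validate_single_shard: bool = True,
) -> Hashable:
    """Return the shard the whole input targets."""
    rows = list(rows)
    if not rows:
        # Assumption: empty input is rejected; the source does not say.
        raise ValueError("merge input is empty")
    shard = route(rows[0])
    if validate_single_shard:
        # Full check: every remaining row must route to the first row's shard.
        for index, row in enumerate(rows[1:], start=1):
            if route(row) != shard:
                raise ValueError(f"row {index} routes to a different shard")
    # With validation off, only the first row was inspected.
    return shard
```

Disabling validation trades safety for speed: a mixed input such as ids `2` and `3` is accepted and attributed to shard 0, so it only makes sense for callers that have already pre-sharded their input.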
--- a/python/python/lancedb/namespace.py
+++ b/python/python/lancedb/namespace.py
@@ -58,6 +58,7 @@ from lance_namespace import (
    ListTablesRequest,
    DescribeNamespaceRequest,
    DropTableRequest,
+    RenameTableRequest,
    ListNamespacesRequest,
    CreateNamespaceRequest,
    DropNamespaceRequest,
@@ -144,7 +145,12 @@ def _query_to_namespace_request(
    if query.postfilter is not None:
        prefilter = not query.postfilter

-    k = query.limit if query.limit is not None else 10
+    if query.limit is not None:
+        k = query.limit
+    elif query.vector is None and query.full_text_query is None:
+        k = sys.maxsize
+    else:
+        k = 10

    # Build request kwargs, only including non-None values for optional fields
    # that Pydantic doesn't accept as None
@@ -544,6 +550,8 @@ class LanceNamespaceDBConnection(DBConnection):
        namespace_path: Optional[List[str]] = None,
        storage_options: Optional[Dict[str, str]] = None,
        index_cache_size: Optional[int] = None,
+        branch: Optional[str] = None,
+        version: Optional[int] = None,
    ) -> Table:
        if namespace_path is None:
            namespace_path = []
@@ -562,7 +570,7 @@ class LanceNamespaceDBConnection(DBConnection):
                raise TableNotFoundError(f"Table not found: {'$'.join(table_id)}")
            raise

-        return LanceTable(
+        tbl = LanceTable(
            self,
            name,
            namespace_path=namespace_path,
@@ -570,6 +578,11 @@ class LanceNamespaceDBConnection(DBConnection):
            pushdown_operations=self._namespace_client_pushdown_operations,
            _async=async_table,
        )
+        if branch is not None:
+            tbl = tbl.branches.checkout(branch, version)
+        elif version is not None:
+            tbl.checkout(version)
+        return tbl

    @override
    def drop_table(self, name: str, namespace_path: Optional[List[str]] = None):
@@ -592,9 +605,14 @@ class LanceNamespaceDBConnection(DBConnection):
            cur_namespace_path = []
        if new_namespace_path is None:
            new_namespace_path = []
-        raise NotImplementedError(
-            "rename_table is not supported for namespace connections"
+        cur_table_id = cur_namespace_path + [cur_name]
+        new_namespace_id = new_namespace_path if new_namespace_path else None
+        request = RenameTableRequest(
+            id=cur_table_id,
+            new_table_name=new_name,
+            new_namespace_id=new_namespace_id,
        )
+        self._namespace_client.rename_table(request)

    @override
    def drop_database(self):
@@ -954,7 +972,7 @@ class AsyncLanceNamespaceDBConnection:
        if mode.lower() not in ["create", "overwrite"]:
            raise ValueError("mode must be either 'create' or 'overwrite'")
        validate_table_name(name)
-        return await self._inner.create_table(
+        table = await self._inner.create_table(
            name,
            data,
            schema=schema,
@@ -966,6 +984,11 @@ class AsyncLanceNamespaceDBConnection:
            embedding_functions=embedding_functions,
            storage_options=storage_options,
        )
+        return table._set_namespace_context(
+            namespace_path=namespace_path,
+            namespace_client=self._namespace_client,
+            pushdown_operations=self._namespace_client_pushdown_operations,
+        )

    async def open_table(
        self,
@@ -974,12 +997,14 @@ class AsyncLanceNamespaceDBConnection:
        namespace_path: Optional[List[str]] = None,
        storage_options: Optional[Dict[str, str]] = None,
        index_cache_size: Optional[int] = None,
+        branch: Optional[str] = None,
+        version: Optional[int] = None,
    ) -> AsyncTable:
        """Open an existing table from the namespace."""
        if namespace_path is None:
            namespace_path = []
        try:
-            return await self._inner.open_table(
+            table = await self._inner.open_table(
                name,
                namespace_path=namespace_path,
                storage_options=storage_options,
@@ -990,6 +1015,17 @@ class AsyncLanceNamespaceDBConnection:
                table_id = namespace_path + [name]
                raise TableNotFoundError(f"Table not found: {'$'.join(table_id)}")
            raise
+        # "main" is the default branch, so treat it as no branch (mirrors the
+        # sync remote path); the version still applies.
+        if branch is not None and branch != "main":
+            table = await table.branches.checkout(branch, version)
+        elif version is not None:
+            await table.checkout(version)
+        return table._set_namespace_context(
+            namespace_path=namespace_path,
+            namespace_client=self._namespace_client,
+            pushdown_operations=self._namespace_client_pushdown_operations,
+        )

    async def drop_table(self, name: str, namespace_path: Optional[List[str]] = None):
        """Drop a table from the namespace."""
@@ -1006,14 +1042,19 @@ class AsyncLanceNamespaceDBConnection:
        cur_namespace_path: Optional[List[str]] = None,
        new_namespace_path: Optional[List[str]] = None,
    ):
-        """Rename is not supported for namespace connections."""
+        """Rename a table in the namespace."""
        if cur_namespace_path is None:
            cur_namespace_path = []
        if new_namespace_path is None:
            new_namespace_path = []
-        raise NotImplementedError(
-            "rename_table is not supported for namespace connections"
+        cur_table_id = cur_namespace_path + [cur_name]
+        new_namespace_id = new_namespace_path if new_namespace_path else None
+        request = RenameTableRequest(
+            id=cur_table_id,
+            new_table_name=new_name,
+            new_namespace_id=new_namespace_id,
        )
+        self._namespace_client.rename_table(request)

    async def drop_database(self):
        """Deprecated method."""
--- a/python/python/lancedb/permutation.py
+++ b/python/python/lancedb/permutation.py
@@ -3,12 +3,13 @@

 import copy
 import json
+import os

 from deprecation import deprecated
 import pyarrow as pa

 from ._lancedb import async_permutation_builder, PermutationReader
-from .table import LanceTable
+from .table import LanceTable, Table
 from .background_loop import LOOP
 from .util import batch_to_tensor, batch_to_tensor_rows
 from typing import Any, Callable, Iterator, Literal, Optional, TYPE_CHECKING, Union
@@ -354,6 +355,49 @@ class Transforms:
 DEFAULT_BATCH_SIZE = 100


+def _table_to_pickle_state(table: Table) -> dict[str, Any]:
+    from .remote.table import RemoteTable
+
+    if isinstance(table, RemoteTable):
+        return {
+            "kind": "remote",
+            "table": table,
+        }
+
+    if not isinstance(table, LanceTable):
+        raise ValueError(f"Cannot pickle table of type {type(table)!r}")
+
+    base_uri = table._conn.uri
+    if base_uri.startswith("memory://"):
+        return {
+            "kind": "memory",
+            "name": table.name,
+            "data": table.to_arrow(),
+        }
+
+    return {
+        "kind": "local",
+        "name": table.name,
+        "uri": base_uri,
+        "namespace": table._namespace_path,
+        "storage_options": table._conn.storage_options,
+    }
+
+
+def _table_from_pickle_state(state: dict[str, Any]) -> Table:
+    from . import connect
+
+    kind = state["kind"]
+    if kind == "remote":
+        return state["table"]
+    if kind == "memory":
+        return connect("memory://").create_table(state["name"], state["data"])
+    if kind == "local":
+        db = connect(state["uri"], storage_options=state["storage_options"])
+        return db.open_table(state["name"], namespace_path=state["namespace"] or None)
+    raise ValueError(f"Unknown table pickle state kind: {kind}")
+
+
 class Permutation:
    """
    A Permutation is a view of a dataset that can be used as input to model training
@@ -369,15 +413,15 @@ class Permutation:

    def __init__(
        self,
-        base_table: LanceTable,
-        permutation_table: Optional[LanceTable],
+        base_table: Table,
+        permutation_table: Optional[Table],
        split: int,
        selection: dict[str, str],
        batch_size: int,
        transform_fn: Callable[pa.RecordBatch, Any],
        offset: Optional[int] = None,
        limit: Optional[int] = None,
-        connection_factory: Optional[Callable[[str], LanceTable]] = None,
+        connection_factory: Optional[Callable[[str], Table]] = None,
        _reader: Optional[PermutationReader] = None,
    ):
        """
@@ -397,6 +441,7 @@ class Permutation:
        if _reader is None:
            _reader = LOOP.run(self._build_reader())
        self.reader: PermutationReader = _reader
+        self._pid = os.getpid()

    async def _build_reader(self) -> PermutationReader:
        reader = await PermutationReader.from_tables(
@@ -428,29 +473,25 @@ class Permutation:
        return new

    def with_connection_factory(
-        self, connection_factory: Callable[[str], LanceTable]
+        self, connection_factory: Callable[[str], Table]
    ) -> "Permutation":
        """
        Creates a new permutation that will use ``connection_factory`` to reopen
        the base table when this permutation is unpickled in a worker process.

-        The factory is a callable that takes a single argument — the base table
-        name — and returns a [LanceTable]. It must be picklable; the worker
+        The factory is a callable that takes a single argument, the base table
+        name, and returns a LanceDB table. It must be picklable; the worker
        will pickle it via standard ``pickle`` and call it to recover the base
        table. Picklable callables in practice means top-level (module-level)
        functions, ``functools.partial`` of such functions, or instances of
        picklable classes implementing ``__call__``. Lambdas and closures over
        local variables don't pickle with the default protocol.

-        Setting a factory is necessary when the URI alone is not enough to
-        re-open the connection — most importantly for LanceDB Cloud (``db://``)
-        connections, where ``api_key`` and ``region`` aren't recoverable from
-        the connection object after construction.
-
-        For local file or cloud-storage paths the factory is optional: if not
-        set, ``__getstate__`` falls back to capturing
-        ``(uri, storage_options, namespace_path)`` and re-opening via
-        ``lancedb.connect(uri, storage_options=...)``.
+        A factory is optional for normal local and remote LanceDB connections:
+        if not set, ``__getstate__`` captures the table's own picklable reopen
+        state. Use a factory when that default state is not enough, for example
+        when credentials should be loaded from the worker environment instead
+        of being embedded in the pickle.

        Examples
        --------
@@ -508,7 +549,7 @@ class Permutation:
        return new

    @classmethod
-    def identity(cls, table: LanceTable) -> "Permutation":
+    def identity(cls, table: Table) -> "Permutation":
        """
        Creates an identity permutation for the given table.
        """
@@ -517,8 +558,8 @@ class Permutation:
    @classmethod
    def from_tables(
        cls,
-        base_table: LanceTable,
-        permutation_table: Optional[LanceTable] = None,
+        base_table: Table,
+        permutation_table: Optional[Table] = None,
        split: Optional[Union[str, int]] = None,
    ) -> "Permutation":
        """
@@ -594,11 +635,10 @@ class Permutation:

        The base table is captured either via a user-supplied
        ``connection_factory`` (see [with_connection_factory]) or, as a
-        fallback, by introspecting ``(uri, storage_options, namespace_path)``
-        on the connection. The permutation table — always an in-memory
-        LanceDB table — is captured as a pyarrow Table (which pickles via
-        Arrow IPC natively). The reader is dropped from the wire format;
-        ``__setstate__`` rebuilds it from the restored tables.
+        fallback, by the table's own picklable reopen state. The permutation
+        table is captured as a pyarrow Table (which pickles via Arrow IPC
+        natively). The reader is dropped from the wire format and rebuilt
+        lazily on first use.
        """
        permutation_data: Optional[pa.Table] = None
        if self.permutation_table is not None:
@@ -622,39 +662,9 @@ class Permutation:
            # namespace from the existing connection.
            return common

-        # URI-introspection fallback: only viable for native (OSS) connections
-        # where (uri, storage_options) is enough to reopen. Remote / cloud
-        # connections don't expose recoverable api_key / region — those users
-        # must call with_connection_factory().
-        try:
-            base_uri = self.base_table._conn.uri
-            storage_options = self.base_table._conn.storage_options
-        except AttributeError as e:
-            raise ValueError(
-                "Cannot pickle this Permutation: the base table's connection "
-                "does not expose a uri/storage_options, which usually means it "
-                "is a remote (LanceDB Cloud) connection. Call "
-                "Permutation.with_connection_factory(...) first to provide a "
-                "picklable callable that re-opens the base table from a worker "
-                "process."
-            ) from e
-
-        if base_uri.startswith("memory://"):
-            # In-memory base tables don't exist in any worker process by
-            # default, so dump the entire base table into the pickle. This
-            # can be expensive for large datasets — users with large
-            # in-memory base tables should either persist them or set a
-            # connection_factory.
-            return {
-                **common,
-                "base_table_data": self.base_table.to_arrow(),
-            }
-
        return {
            **common,
-            "base_table_uri": base_uri,
-            "base_table_namespace": self.base_table._namespace_path,
-            "base_table_storage_options": storage_options,
+            "base_table_state": _table_to_pickle_state(self.base_table),
        }

    def __setstate__(self, state: dict[str, Any]) -> None:
@@ -663,6 +673,8 @@ class Permutation:
        connection_factory = state["connection_factory"]
        if connection_factory is not None:
            base_table = connection_factory(state["base_table_name"])
+        elif "base_table_state" in state:
+            base_table = _table_from_pickle_state(state["base_table_state"])
        elif "base_table_data" in state:
            # In-memory base table inlined into the pickle; rebuild the same
            # way we rebuild the in-memory permutation table.
@@ -680,7 +692,7 @@ class Permutation:
                namespace_path=state["base_table_namespace"] or None,
            )

-        permutation_table: Optional[LanceTable] = None
+        permutation_table: Optional[Table] = None
        if state["permutation_data"] is not None:
            mem_db = connect("memory://")
            permutation_table = mem_db.create_table(
@@ -696,10 +708,28 @@ class Permutation:
        self.offset = state["offset"]
        self.limit = state["limit"]
        self.connection_factory = connection_factory
+        self.reader = None
+        self._pid = None
+
+    def _ensure_open(self) -> None:
+        pid = os.getpid()
+        if self.reader is not None and getattr(self, "_pid", None) == pid:
+            return
+        # The reader owns Rust-side table handles. Rebuild it after unpickle or
+        # fork even though the Python table wrappers reopen themselves.
+        if hasattr(self.base_table, "_ensure_open"):
+            self.base_table._ensure_open()
+        if self.permutation_table is not None and hasattr(
+            self.permutation_table, "_ensure_open"
+        ):
+            self.permutation_table._ensure_open()
        self.reader = LOOP.run(self._build_reader())
+        self._pid = pid

    @property
    def schema(self) -> pa.Schema:
+        self._ensure_open()
+
        async def do_output_schema():
            return await self.reader.output_schema(self.selection)

@@ -717,6 +747,7 @@ class Permutation:
        """
        The number of rows in the permutation
        """
+        self._ensure_open()
        return self.reader.count_rows()

    @property
@@ -875,6 +906,7 @@ class Permutation:
        If skip_last_batch is True, the last batch will be skipped if it is not a
        multiple of batch_size.
        """
+        self._ensure_open()

        async def get_iter():
            return await self.reader.read(self.selection, batch_size=batch_size)
@@ -976,6 +1008,7 @@ class Permutation:
        so `with_format` and `with_transform` affect this method in the same way
        they affect iteration.
        """
+        self._ensure_open()

        async def do_take_offsets():
            return await self.reader.take_offsets(offsets, selection=self.selection)
@@ -1011,9 +1044,11 @@ class Permutation:
        """
        Skip the first `skip` rows of the permutation
        """
+        self._ensure_open()
        new = copy.copy(self)
        new.offset = skip
        new.reader = LOOP.run(new._build_reader())
+        new._pid = os.getpid()
        return new

    @deprecated(details="Use with_take instead")
@@ -1032,9 +1067,11 @@ class Permutation:
        """
        Limit the permutation to `limit` rows (following any `skip`)
        """
+        self._ensure_open()
        new = copy.copy(self)
        new.limit = limit
        new.reader = LOOP.run(new._build_reader())
+        new._pid = os.getpid()
        return new

    @deprecated(details="Use with_repeat instead")
--- a/python/python/lancedb/query.py
+++ b/python/python/lancedb/query.py
@@ -3,12 +3,14 @@

 from __future__ import annotations

+import asyncio
 from abc import ABC, abstractmethod
 from concurrent.futures import ThreadPoolExecutor
-from enum import Enum
 from datetime import timedelta
+from enum import Enum
 from typing import (
    TYPE_CHECKING,
+    Any,
    Dict,
    List,
    Literal,
@@ -17,44 +19,51 @@ from typing import (
    Type,
    TypeVar,
    Union,
-    Any,
 )

-import asyncio
 import deprecation
 import numpy as np
 import pyarrow as pa
 import pyarrow.compute as pc
 import pydantic
+from typing_extensions import Annotated

-from lancedb.pydantic import PYDANTIC_VERSION
+from lancedb._lancedb import fts_query_to_json
 from lancedb.background_loop import LOOP
+from lancedb.pydantic import PYDANTIC_VERSION

 from . import __version__
 from .arrow import AsyncRecordBatchReader
 from .dependencies import pandas as pd
+from .expr import Expr
 from .rerankers.base import Reranker
 from .rerankers.rrf import RRFReranker
 from .rerankers.util import check_reranker_result
 from .util import flatten_columns
-from .expr import Expr
-from lancedb._lancedb import fts_query_to_json
-from typing_extensions import Annotated
+
+BlobMode = Literal["lazy", "bytes", "descriptions"]
+
+_BLOB_MODE_TO_HANDLING = {
+    "lazy": "blobs_descriptions",
+    "bytes": "all_binary",
+    "descriptions": "blobs_descriptions",
+}

 if TYPE_CHECKING:
    import sys
+
    import PIL
    import polars as pl

-    from ._lancedb import Query as LanceQuery
    from ._lancedb import FTSQuery as LanceFTSQuery
    from ._lancedb import HybridQuery as LanceHybridQuery
-    from ._lancedb import VectorQuery as LanceVectorQuery
-    from ._lancedb import TakeQuery as LanceTakeQuery
    from ._lancedb import PyQueryRequest
+    from ._lancedb import Query as LanceQuery
+    from ._lancedb import TakeQuery as LanceTakeQuery
+    from ._lancedb import VectorQuery as LanceVectorQuery
    from .common import VEC
    from .pydantic import LanceModel
-    from .table import Table
+    from .table import AsyncTable, Table

    if sys.version_info >= (3, 11):
        from typing import Self
@@ -64,6 +73,179 @@ if TYPE_CHECKING:
 T = TypeVar("T", bound="LanceModel")


+def _validate_blob_mode(blob_mode: BlobMode) -> None:
+    if blob_mode not in _BLOB_MODE_TO_HANDLING:
+        modes = ", ".join(repr(mode) for mode in _BLOB_MODE_TO_HANDLING)
+        raise ValueError(f"blob_mode must be one of {modes}, got {blob_mode!r}")
+
+
+def _field_is_blob(field: pa.Field) -> bool:
+    metadata = field.metadata or {}
+    return metadata.get(b"lance-encoding:blob") == b"true" or (
+        metadata.get("lance-encoding:blob") == "true"
+    )
+
+
+def _schema_has_blob_field(schema: pa.Schema) -> bool:
+    return any(_field_is_blob(field) for field in schema)
+
+
+def _blob_mode_requires_native_pandas(blob_mode: BlobMode, schema: pa.Schema) -> bool:
+    return blob_mode in _BLOB_MODE_TO_HANDLING and _schema_has_blob_field(schema)
+
+
+def _unsupported_blob_pandas_error(reason: str) -> RuntimeError:
+    return RuntimeError(
+        "blob columns require Lance native scanner conversion for query "
+        f"to_pandas(), but {reason}. Use a plain scan query or remove blob "
+        "columns from the projection."
+    )
+
+
+def _query_is_plain_scan(query: Query) -> bool:
+    return (
+        query.vector is None
+        and query.full_text_query is None
+        and not query.postfilter
+        and not query.order_by
+    )
+
+
+def _filter_to_sql(filter: Optional[Union[str, Expr]]) -> Optional[str]:
+    if filter is None:
+        return None
+    if isinstance(filter, Expr):
+        return filter.to_sql()
+    return filter
+
+
+def _projection_to_scanner_kwargs(
+    columns: Optional[
+        Union[
+            List[str], List[Tuple[str, Union[str, Expr]]], Dict[str, Union[str, Expr]]
+        ]
+    ],
+) -> Dict[str, Any]:
+    if columns is None:
+        return {}
+    if isinstance(columns, list):
+        if all(isinstance(column, str) for column in columns):
+            return {"columns": columns}
+        if all(isinstance(column, tuple) and len(column) == 2 for column in columns):
+            return {
+                "columns": {
+                    name: expr.to_sql() if isinstance(expr, Expr) else expr
+                    for name, expr in columns
+                }
+            }
+        # Let Lance raise the detailed projection validation error.
+        return {"columns": columns}
+
+    projection = {}
+    for name, expr in columns.items():
+        if isinstance(expr, Expr):
+            expr = expr.to_sql()
+        projection[name] = expr
+    return {"columns": projection}
+
+
+def _scanner_kwargs_for_query(
+    query: Query, blob_mode: BlobMode, dataset: Optional[Any] = None
+) -> Dict[str, Any]:
+    fragments = _scanner_fragments_for_query(query, dataset)
+    kwargs = {
+        **_projection_to_scanner_kwargs(query.columns),
+        "filter": _filter_to_sql(query.filter),
+        "limit": query.limit,
+        "offset": query.offset,
+        "with_row_id": query.with_row_id,
+        "with_row_address": query.with_row_address,
+        "fast_search": query.fast_search,
+        "blob_handling": _BLOB_MODE_TO_HANDLING[blob_mode],
+        "fragments": fragments,
+    }
+    return {key: value for key, value in kwargs.items() if value is not None}
+
+
+def _scanner_fragments_for_query(query: Query, dataset: Optional[Any]) -> Optional[Any]:
+    if query.fragments is not None and query.fragment_ids is not None:
+        raise ValueError("fragments and fragment_ids cannot both be set")
+    if query.fragments is not None:
+        return query.fragments
+    if query.fragment_ids is None:
+        return None
+    if dataset is None:
+        raise ValueError("fragment_ids require a Lance dataset")
+
+    requested = set(query.fragment_ids)
+    fragments = [
+        fragment
+        for fragment in dataset.get_fragments()
+        if fragment.fragment_id in requested
+    ]
+    found = {fragment.fragment_id for fragment in fragments}
+    missing = requested - found
+    if missing:
+        missing_ids = ", ".join(str(fragment_id) for fragment_id in sorted(missing))
+        raise ValueError(f"fragment_ids not found in dataset: {missing_ids}")
+    return fragments
+
+
+def _ensure_lazy_blob_frame(
+    df: "pd.DataFrame", schema: pa.Schema, blob_mode: BlobMode
+) -> "pd.DataFrame":
+    if blob_mode != "lazy" or not _schema_has_blob_field(schema) or len(df) == 0:
+        return df
+
+    for field in schema:
+        if not _field_is_blob(field) or field.name not in df.columns:
+            continue
+        value = df[field.name].iloc[0]
+        if value is not None and not hasattr(value, "readall"):
+            raise _unsupported_blob_pandas_error(
+                "the Lance scanner did not return lazy blob files"
+            )
+    return df
+
+
+def _scanner_to_table(scanner: Any) -> pa.Table:
+    if hasattr(scanner, "to_pyarrow"):
+        reader = scanner.to_pyarrow()
+        return reader.read_all()
+    if hasattr(scanner, "to_table"):
+        return scanner.to_table()
+    reader = scanner.to_reader()
+    return reader.read_all()
+
+
+def _scanner_to_pandas(scanner: Any, blob_mode: BlobMode, **kwargs) -> "pd.DataFrame":
+    schema = getattr(scanner, "projected_schema", None)
+    if schema is None:
+        schema = getattr(scanner, "schema", None)
+    if schema is None:
+        schema = getattr(scanner, "dataset_schema", None)
+    if callable(schema):
+        schema = schema()
+    if hasattr(scanner, "to_pandas"):
+        try:
+            df = scanner.to_pandas(blob_mode=blob_mode, **kwargs)
+        except TypeError as err:
+            message = str(err)
+            if "blob_mode" not in message and "unexpected keyword" not in message:
+                raise
+            df = scanner.to_pandas(**kwargs)
+        if schema is not None:
+            return _ensure_lazy_blob_frame(df, schema, blob_mode)
+        return df
+
+    tbl = _scanner_to_table(scanner)
+    if blob_mode == "lazy" and _schema_has_blob_field(tbl.schema):
+        raise _unsupported_blob_pandas_error(
+            "the Lance scanner does not expose to_pandas"
+        )
+    return tbl.to_pandas(**kwargs)
+
+
 # Pydantic validation function for vector queries
 def ensure_vector_query(
    val: Any,
@@ -498,6 +680,13 @@ class Query(pydantic.BaseModel):
    # if true, include the row id in the results
    with_row_id: Optional[bool] = None

+    # if true, include the row address in the results
+    with_row_address: Optional[bool] = None
+
+    # Lance fragments or fragment ids to scan on scanner-backed plain queries
+    fragments: Optional[Any] = None
+    fragment_ids: Optional[List[int]] = None
+
    # offset to start fetching results from
    offset: Optional[int] = None

@@ -690,6 +879,9 @@ class LanceQueryBuilder(ABC):
        self._where = None
        self._postfilter = None
        self._with_row_id = None
+        self._with_row_address = None
+        self._fragments = None
+        self._fragment_ids = None
        self._vector = None
        self._text = None
        self._ef = None
@@ -717,7 +909,9 @@ class LanceQueryBuilder(ABC):
        self,
        flatten: Optional[Union[int, bool]] = None,
        *,
+        blob_mode: BlobMode = "lazy",
        timeout: Optional[timedelta] = None,
+        **kwargs,
    ) -> "pd.DataFrame":
        """
        Execute the query and return the results as a pandas DataFrame.
@@ -735,9 +929,42 @@ class LanceQueryBuilder(ABC):
        timeout: Optional[timedelta]
            The maximum time to wait for the query to complete.
            If None, wait indefinitely.
+        blob_mode: str, default "lazy"
+            Controls how blob columns are returned for plain scan queries.
+            Vector, FTS, hybrid, and other non-native query shapes keep the
+            existing Arrow conversion path and only support blob descriptions.
+        **kwargs
+            Forwarded to pyarrow.Table.to_pandas after query execution and
+            optional flattening.
        """
+        _validate_blob_mode(blob_mode)
+        output_schema = getattr(self, "output_schema", None)
+        if output_schema is not None:
+            schema = output_schema()
+            if _blob_mode_requires_native_pandas(blob_mode, schema):
+                native_error = None
+                if (flatten is None or blob_mode == "descriptions") and timeout is None:
+                    try:
+                        df = self._plain_scan_to_pandas(
+                            blob_mode, flatten=flatten, **kwargs
+                        )
+                        if df is not None:
+                            return df
+                    except Exception as err:
+                        native_error = err
+                reason = (
+                    "this query shape cannot use Lance native pandas conversion"
+                    if native_error is None
+                    else str(native_error)
+                )
+                raise _unsupported_blob_pandas_error(reason) from native_error
+
        tbl = flatten_columns(self.to_arrow(timeout=timeout), flatten)
-        return tbl.to_pandas()
+        if _blob_mode_requires_native_pandas(blob_mode, tbl.schema):
+            raise _unsupported_blob_pandas_error(
+                "this query shape cannot use Lance native pandas conversion"
+            )
+        return tbl.to_pandas(**kwargs)

    @abstractmethod
    def to_arrow(self, *, timeout: Optional[timedelta] = None) -> pa.Table:
@@ -942,6 +1169,32 @@ class LanceQueryBuilder(ABC):
        self._with_row_id = with_row_id
        return self

+    def with_row_address(self, with_row_address: bool = True) -> Self:
+        """Set whether to return row addresses.
+
+        Parameters
+        ----------
+        with_row_address: bool, default True
+            If True, return the _rowaddr column in the results.
+
+        Returns
+        -------
+        LanceQueryBuilder
+            The LanceQueryBuilder object.
+        """
+        self._with_row_address = with_row_address
+        return self
+
+    def with_fragments(self, fragments: Any) -> Self:
+        """Set the Lance fragments to scan for plain scanner-backed queries."""
+        self._fragments = fragments
+        return self
+
+    def fragment_ids(self, fragment_ids: List[int]) -> Self:
+        """Set the Lance fragment ids to scan for plain scanner-backed queries."""
+        self._fragment_ids = fragment_ids
+        return self
+
    def explain_plan(self, verbose: Optional[bool] = False) -> str:
        """Return the execution plan for this query.

@@ -1081,6 +1334,25 @@ class LanceQueryBuilder(ABC):
        """
        raise NotImplementedError

+    def _plain_scan_to_pandas(
+        self,
+        blob_mode: BlobMode,
+        flatten: Optional[Union[int, bool]] = None,
+        **kwargs,
+    ) -> Optional["pd.DataFrame"]:
+        query = self.to_query_object()
+        if not _query_is_plain_scan(query):
+            return None
+
+        dataset = self._table.to_lance()
+        scanner = dataset.scanner(
+            **_scanner_kwargs_for_query(query, blob_mode, dataset)
+        )
+        if flatten is not None:
+            tbl = flatten_columns(_scanner_to_table(scanner), flatten)
+            return tbl.to_pandas(**kwargs)
+        return _scanner_to_pandas(scanner, blob_mode, **kwargs)
+
    @abstractmethod
    def to_query_object(self) -> Query:
        """Return a serializable representation of the query
@@ -1352,6 +1624,9 @@ class LanceVectorQueryBuilder(LanceQueryBuilder):
            refine_factor=self._refine_factor,
            vector_column=self._vector_column,
            with_row_id=self._with_row_id,
+            with_row_address=self._with_row_address,
+            fragments=self._fragments,
+            fragment_ids=self._fragment_ids,
            offset=self._offset,
            fast_search=self._fast_search,
            ef=self._ef,
@@ -1554,6 +1829,9 @@ class LanceFtsQueryBuilder(LanceQueryBuilder):
            limit=self._limit,
            postfilter=self._postfilter,
            with_row_id=self._with_row_id,
+            with_row_address=self._with_row_address,
+            fragments=self._fragments,
+            fragment_ids=self._fragment_ids,
            full_text_query=FullTextSearchQuery(
                query=self._query, columns=self._fts_columns
            ),
@@ -1624,6 +1902,9 @@ class LanceEmptyQueryBuilder(LanceQueryBuilder):
            filter=self._where,
            limit=self._limit,
            with_row_id=self._with_row_id,
+            with_row_address=self._with_row_address,
+            fragments=self._fragments,
+            fragment_ids=self._fragment_ids,
            offset=self._offset,
            order_by=self._order_by,
        )
@@ -2202,7 +2483,11 @@ class AsyncQueryBase(object):
    Base class for all async queries (take, scan, vector, fts, hybrid)
    """

-    def __init__(self, inner: Union[LanceQuery, LanceVectorQuery, LanceTakeQuery]):
+    def __init__(
+        self,
+        inner: Union[LanceQuery, LanceVectorQuery, LanceTakeQuery],
+        table: Optional["AsyncTable"] = None,
+    ):
        """
        Construct an AsyncQueryBase

@@ -2210,6 +2495,10 @@ class AsyncQueryBase(object):
        [AsyncTable.query][lancedb.table.AsyncTable.query] method to create a query.
        """
        self._inner = inner
+        self._table = table
+        self._with_row_address = None
+        self._fragments = None
+        self._fragment_ids = None

    def to_query_object(self) -> Query:
        """
@@ -2218,7 +2507,11 @@ class AsyncQueryBase(object):
        This is currently experimental but can be useful as the query object is pure
        python and more easily serializable.
        """
-        return Query.from_inner(self._inner.to_query_request())
+        query = Query.from_inner(self._inner.to_query_request())
+        query.with_row_address = self._with_row_address
+        query.fragments = self._fragments
+        query.fragment_ids = self._fragment_ids
+        return query

    def select(self, columns: Union[List[str], dict[str, str]]) -> Self:
        """
@@ -2275,6 +2568,27 @@ class AsyncQueryBase(object):
        self._inner.with_row_id()
        return self

+    def with_row_address(self, with_row_address: bool = True) -> Self:
+        """
+        Include the _rowaddr column in scanner-backed plain query results.
+        """
+        self._with_row_address = with_row_address
+        return self
+
+    def with_fragments(self, fragments: Any) -> Self:
+        """
+        Restrict scanner-backed plain query results to the given Lance fragments.
+        """
+        self._fragments = fragments
+        return self
+
+    def fragment_ids(self, fragment_ids: List[int]) -> Self:
+        """
+        Restrict scanner-backed plain query results to the given Lance fragment ids.
+        """
+        self._fragment_ids = fragment_ids
+        return self
+
    async def to_batches(
        self,
        *,
@@ -2352,6 +2666,9 @@ class AsyncQueryBase(object):
        self,
        flatten: Optional[Union[int, bool]] = None,
        timeout: Optional[timedelta] = None,
+        *,
+        blob_mode: BlobMode = "lazy",
+        **kwargs,
    ) -> "pd.DataFrame":
        """
        Execute the query and collect the results into a pandas DataFrame.
@@ -2384,10 +2701,63 @@ class AsyncQueryBase(object):
            The maximum time to wait for the query to complete.
            If not specified, no timeout is applied. If the query does not
            complete within the specified time, an error will be raised.
+        blob_mode: str, default "lazy"
+            Controls how blob columns are returned for plain scan queries.
+            Vector, FTS, hybrid, and other non-native query shapes keep the
+            existing Arrow conversion path and only support blob descriptions.
+        **kwargs
+            Forwarded to pyarrow.Table.to_pandas after query execution and
+            optional flattening.
        """
-        return (
-            flatten_columns(await self.to_arrow(timeout=timeout), flatten)
-        ).to_pandas()
+        _validate_blob_mode(blob_mode)
+        if hasattr(self._inner, "output_schema"):
+            schema = await self.output_schema()
+            if _blob_mode_requires_native_pandas(blob_mode, schema):
+                native_error = None
+                if (flatten is None or blob_mode == "descriptions") and timeout is None:
+                    try:
+                        df = await self._plain_scan_to_pandas(
+                            blob_mode, flatten=flatten, **kwargs
+                        )
+                        if df is not None:
+                            return df
+                    except Exception as err:
+                        native_error = err
+                reason = (
+                    "this query shape cannot use Lance native pandas conversion"
+                    if native_error is None
+                    else str(native_error)
+                )
+                raise _unsupported_blob_pandas_error(reason) from native_error
+
+        tbl = flatten_columns(await self.to_arrow(timeout=timeout), flatten)
+        if _blob_mode_requires_native_pandas(blob_mode, tbl.schema):
+            raise _unsupported_blob_pandas_error(
+                "this query shape cannot use Lance native pandas conversion"
+            )
+        return tbl.to_pandas(**kwargs)
+
+    async def _plain_scan_to_pandas(
+        self,
+        blob_mode: BlobMode,
+        flatten: Optional[Union[int, bool]] = None,
+        **kwargs,
+    ) -> Optional["pd.DataFrame"]:
+        if self._table is None:
+            return None
+
+        query = self.to_query_object()
+        if not _query_is_plain_scan(query):
+            return None
+
+        dataset = await self._table._to_lance()
+        scanner = dataset.scanner(
+            **_scanner_kwargs_for_query(query, blob_mode, dataset)
+        )
+        if flatten is not None:
+            tbl = flatten_columns(_scanner_to_table(scanner), flatten)
+            return tbl.to_pandas(**kwargs)
+        return _scanner_to_pandas(scanner, blob_mode, **kwargs)

    async def to_polars(
        self,
@@ -2494,14 +2864,18 @@ class AsyncStandardQuery(AsyncQueryBase):
    Base class for "standard" async queries (all but take currently)
    """

-    def __init__(self, inner: Union[LanceQuery, LanceVectorQuery]):
+    def __init__(
+        self,
+        inner: Union[LanceQuery, LanceVectorQuery],
+        table: Optional["AsyncTable"] = None,
+    ):
        """
        Construct an AsyncStandardQuery

        This method is not intended to be called directly.  Instead, use the
        [AsyncTable.query][lancedb.table.AsyncTable.query] method to create a query.
        """
-        super().__init__(inner)
+        super().__init__(inner, table)

    def where(self, predicate: Union[str, Expr]) -> Self:
        """
@@ -2607,14 +2981,14 @@ class AsyncStandardQuery(AsyncQueryBase):


 class AsyncQuery(AsyncStandardQuery):
-    def __init__(self, inner: LanceQuery):
+    def __init__(self, inner: LanceQuery, table: Optional["AsyncTable"] = None):
        """
        Construct an AsyncQuery

        This method is not intended to be called directly.  Instead, use the
        [AsyncTable.query][lancedb.table.AsyncTable.query] method to create a query.
        """
-        super().__init__(inner)
+        super().__init__(inner, table)
        self._inner = inner

    @classmethod
@@ -2698,10 +3072,11 @@ class AsyncQuery(AsyncStandardQuery):
            new_self = self._inner.nearest_to(query_vectors[0])
            for v in query_vectors[1:]:
                new_self.add_query_vector(v)
-            return AsyncVectorQuery(new_self)
+            return AsyncVectorQuery(new_self, self._table)
        else:
            return AsyncVectorQuery(
-                self._inner.nearest_to(AsyncQuery._query_vec_to_array(query_vector))
+                self._inner.nearest_to(AsyncQuery._query_vec_to_array(query_vector)),
+                self._table,
            )

    def nearest_to_text(
@@ -2734,17 +3109,18 @@ class AsyncQuery(AsyncStandardQuery):

        if isinstance(query, str):
            return AsyncFTSQuery(
-                self._inner.nearest_to_text({"query": query, "columns": columns})
+                self._inner.nearest_to_text({"query": query, "columns": columns}),
+                self._table,
            )
        # FullTextQuery object
-        return AsyncFTSQuery(self._inner.nearest_to_text({"query": query}))
+        return AsyncFTSQuery(self._inner.nearest_to_text({"query": query}), self._table)


 class AsyncFTSQuery(AsyncStandardQuery):
    """A query for full text search for LanceDB."""

-    def __init__(self, inner: LanceFTSQuery):
-        super().__init__(inner)
+    def __init__(self, inner: LanceFTSQuery, table: Optional["AsyncTable"] = None):
+        super().__init__(inner, table)
        self._inner = inner
        self._reranker = None

@@ -2826,10 +3202,11 @@ class AsyncFTSQuery(AsyncStandardQuery):
            new_self = self._inner.nearest_to(query_vectors[0])
            for v in query_vectors[1:]:
                new_self.add_query_vector(v)
-            return AsyncHybridQuery(new_self)
+            return AsyncHybridQuery(new_self, self._table)
        else:
            return AsyncHybridQuery(
-                self._inner.nearest_to(AsyncQuery._query_vec_to_array(query_vector))
+                self._inner.nearest_to(AsyncQuery._query_vec_to_array(query_vector)),
+                self._table,
            )

    async def to_batches(
@@ -3020,7 +3397,7 @@ class AsyncVectorQueryBase:


 class AsyncVectorQuery(AsyncStandardQuery, AsyncVectorQueryBase):
-    def __init__(self, inner: LanceVectorQuery):
+    def __init__(self, inner: LanceVectorQuery, table: Optional["AsyncTable"] = None):
        """
        Construct an AsyncVectorQuery

@@ -3030,7 +3407,7 @@ class AsyncVectorQuery(AsyncStandardQuery, AsyncVectorQueryBase):
        a vector query.  Or you can use
        [AsyncTable.vector_search][lancedb.table.AsyncTable.vector_search]
        """
-        super().__init__(inner)
+        super().__init__(inner, table)
        self._inner = inner
        self._reranker = None
        self._query_string = None
@@ -3084,10 +3461,13 @@ class AsyncVectorQuery(AsyncStandardQuery, AsyncVectorQueryBase):

        if isinstance(query, str):
            return AsyncHybridQuery(
-                self._inner.nearest_to_text({"query": query, "columns": columns})
+                self._inner.nearest_to_text({"query": query, "columns": columns}),
+                self._table,
            )
        # FullTextQuery object
-        return AsyncHybridQuery(self._inner.nearest_to_text({"query": query}))
+        return AsyncHybridQuery(
+            self._inner.nearest_to_text({"query": query}), self._table
+        )

    async def to_batches(
        self,
@@ -3114,8 +3494,8 @@ class AsyncHybridQuery(AsyncStandardQuery, AsyncVectorQueryBase):
    in the `rerank` method to convert the scores to ranks and then normalize them.
    """

-    def __init__(self, inner: LanceHybridQuery):
-        super().__init__(inner)
+    def __init__(self, inner: LanceHybridQuery, table: Optional["AsyncTable"] = None):
+        super().__init__(inner, table)
        self._inner = inner
        self._norm = "score"
        self._reranker = RRFReranker()
@@ -3156,8 +3536,8 @@ class AsyncHybridQuery(AsyncStandardQuery, AsyncVectorQueryBase):
        max_batch_length: Optional[int] = None,
        timeout: Optional[timedelta] = None,
    ) -> AsyncRecordBatchReader:
-        fts_query = AsyncFTSQuery(self._inner.to_fts_query())
-        vec_query = AsyncVectorQuery(self._inner.to_vector_query())
+        fts_query = AsyncFTSQuery(self._inner.to_fts_query(), self._table)
+        vec_query = AsyncVectorQuery(self._inner.to_vector_query(), self._table)

        # save the row ID choice that was made on the query builder and force it
        # to actually fetch the row ids because we need this for reranking
@@ -3257,8 +3637,16 @@ class AsyncTakeQuery(AsyncQueryBase):
    Builder for parameterizing and executing take queries.
    """

-    def __init__(self, inner: LanceTakeQuery):
-        super().__init__(inner)
+    def __init__(self, inner: LanceTakeQuery, table: Optional["AsyncTable"] = None):
+        super().__init__(inner, table)
+
+    async def _plain_scan_to_pandas(
+        self,
+        blob_mode: BlobMode,
+        flatten: Optional[Union[int, bool]] = None,
+        **kwargs,
+    ) -> Optional["pd.DataFrame"]:
+        return None


 class BaseQueryBuilder(object):
@@ -3310,6 +3698,27 @@ class BaseQueryBuilder(object):
        self._inner.with_row_id()
        return self

+    def with_row_address(self, with_row_address: bool = True) -> Self:
+        """
+        Include the _rowaddr column in scanner-backed plain query results.
+        """
+        self._inner.with_row_address(with_row_address)
+        return self
+
+    def with_fragments(self, fragments: Any) -> Self:
+        """
+        Restrict scanner-backed plain query results to the given Lance fragments.
+        """
+        self._inner.with_fragments(fragments)
+        return self
+
+    def fragment_ids(self, fragment_ids: List[int]) -> Self:
+        """
+        Restrict scanner-backed plain query results to the given Lance fragment ids.
+        """
+        self._inner.fragment_ids(fragment_ids)
+        return self
+
    def output_schema(self) -> pa.Schema:
        """
        Return the output schema for the query
@@ -3340,16 +3749,18 @@ class BaseQueryBuilder(object):
            If not specified, no timeout is applied. If the query does not
            complete within the specified time, an error will be raised.
        """
-        async_iter = LOOP.run(self._inner.execute(max_batch_length, timeout))
+        async_reader = LOOP.run(
+            self._inner.to_batches(max_batch_length=max_batch_length, timeout=timeout)
+        )

        def iter_sync():
            try:
                while True:
-                    yield LOOP.run(async_iter.__anext__())
+                    yield LOOP.run(async_reader.__anext__())
            except StopAsyncIteration:
                return

-        return pa.RecordBatchReader.from_batches(async_iter.schema, iter_sync())
+        return pa.RecordBatchReader.from_batches(async_reader.schema, iter_sync())

    def to_arrow(self, timeout: Optional[timedelta] = None) -> pa.Table:
        """
@@ -3389,6 +3800,9 @@ class BaseQueryBuilder(object):
        self,
        flatten: Optional[Union[int, bool]] = None,
        timeout: Optional[timedelta] = None,
+        *,
+        blob_mode: BlobMode = "lazy",
+        **kwargs,
    ) -> "pd.DataFrame":
        """
        Execute the query and collect the results into a pandas DataFrame.
@@ -3421,8 +3835,15 @@ class BaseQueryBuilder(object):
            The maximum time to wait for the query to complete.
            If not specified, no timeout is applied. If the query does not
            complete within the specified time, an error will be raised.
+        blob_mode: str, default "lazy"
+            Controls how blob columns are returned for plain scan queries.
+        **kwargs
+            Forwarded to pyarrow.Table.to_pandas after query execution and
+            optional flattening.
        """
-        return LOOP.run(self._inner.to_pandas(flatten, timeout))
+        return LOOP.run(
+            self._inner.to_pandas(flatten, timeout, blob_mode=blob_mode, **kwargs)
+        )

    def to_polars(
        self,
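Note on the projection handling in this file's changes: the scanner path accepts a projection as a list of column names, a list of `(name, expression)` pairs, or a mapping, and turns expression objects into SQL strings before they reach the scanner. The following is a minimal standalone sketch of that normalization, not the code from the diff. `FakeExpr` and `projection_to_kwargs` are hypothetical names used only for illustration, and the sketch assumes an expression object exposes `to_sql()`.

```python
from typing import Any, Dict, List, Optional, Tuple, Union


class FakeExpr:
    """Hypothetical stand-in for an expression object exposing to_sql()."""

    def __init__(self, sql: str) -> None:
        self._sql = sql

    def to_sql(self) -> str:
        return self._sql


Projection = Union[List[str], List[Tuple[str, Any]], Dict[str, Any]]


def projection_to_kwargs(columns: Optional[Projection]) -> Dict[str, Any]:
    """Normalize a projection into a {"columns": ...} keyword dictionary."""
    if columns is None:
        # No projection requested: contribute no keyword at all.
        return {}
    if isinstance(columns, dict):
        pairs = list(columns.items())
    elif columns and all(isinstance(c, tuple) and len(c) == 2 for c in columns):
        pairs = list(columns)
    else:
        # Plain name lists (and anything unrecognized) pass through unchanged,
        # leaving detailed validation to whatever consumes the kwargs.
        return {"columns": columns}
    return {
        "columns": {
            name: expr.to_sql() if isinstance(expr, FakeExpr) else expr
            for name, expr in pairs
        }
    }
```

Calling `projection_to_kwargs([("x", FakeExpr("a + 1")), ("y", "b * 2")])` yields a mapping of output names to SQL strings, while a plain list of names is returned as given.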
--- a/python/python/lancedb/remote/__init__.py
+++ b/python/python/lancedb/remote/__init__.py
@@ -9,7 +9,6 @@ from typing import List, Optional
 from lancedb import __version__

 from .header import HeaderProvider
-from .oauth import OAuthConfig, OAuthFlowType

 __all__ = [
    "TimeoutConfig",
@@ -17,8 +16,6 @@ __all__ = [
    "TlsConfig",
    "ClientConfig",
    "HeaderProvider",
-    "OAuthConfig",
-    "OAuthFlowType",
 ]


--- a/python/python/lancedb/remote/db.py
+++ b/python/python/lancedb/remote/db.py
@@ -3,6 +3,7 @@


 from datetime import timedelta
+import json
 import logging
 from concurrent.futures import ThreadPoolExecutor
 import sys
@@ -17,7 +18,7 @@ else:

 # Remove this import to fix circular dependency
 # from lancedb import connect_async
-from lancedb.remote import ClientConfig
+from lancedb.remote import ClientConfig, RetryConfig, TimeoutConfig, TlsConfig
 import pyarrow as pa

 from ..common import DATA
@@ -36,6 +37,64 @@ from ..table import Table
 from ..util import validate_table_name


+def _duration_seconds(value: Optional[timedelta]) -> Optional[float]:
+    return value.total_seconds() if value is not None else None
+
+
+def _timeout_config_to_dict(
+    config: Optional[TimeoutConfig],
+) -> Optional[dict[str, Any]]:
+    if config is None:
+        return None
+    return {
+        "timeout": _duration_seconds(config.timeout),
+        "connect_timeout": _duration_seconds(config.connect_timeout),
+        "read_timeout": _duration_seconds(config.read_timeout),
+        "pool_idle_timeout": _duration_seconds(config.pool_idle_timeout),
+    }
+
+
+def _retry_config_to_dict(config: RetryConfig) -> dict[str, Any]:
+    return {
+        "retries": config.retries,
+        "connect_retries": config.connect_retries,
+        "read_retries": config.read_retries,
+        "backoff_factor": config.backoff_factor,
+        "backoff_jitter": config.backoff_jitter,
+        "statuses": config.statuses,
+    }
+
+
+def _tls_config_to_dict(config: Optional[TlsConfig]) -> Optional[dict[str, Any]]:
+    if config is None:
+        return None
+    return {
+        "cert_file": config.cert_file,
+        "key_file": config.key_file,
+        "ssl_ca_cert": config.ssl_ca_cert,
+        "assert_hostname": config.assert_hostname,
+    }
+
+
+def _client_config_to_dict(config: ClientConfig) -> dict[str, Any]:
+    if config.header_provider is not None:
+        raise ValueError(
+            "Cannot serialize a remote connection with a header_provider. "
+            "Use static api_key/extra_headers or provide a worker-side "
+            "connection factory instead."
+        )
+    return {
+        "user_agent": config.user_agent,
+        "retry_config": _retry_config_to_dict(config.retry_config),
+        "timeout_config": _timeout_config_to_dict(config.timeout_config),
+        "extra_headers": config.extra_headers,
+        "id_delimiter": config.id_delimiter,
+        "tls_config": _tls_config_to_dict(config.tls_config),
+        "header_provider": None,
+        "user_id": config.user_id,
+    }
+
+
 class RemoteDBConnection(DBConnection):
    """A connection to a remote LanceDB database."""

@@ -50,6 +109,7 @@ class RemoteDBConnection(DBConnection):
        connection_timeout: Optional[float] = None,
        read_timeout: Optional[float] = None,
        storage_options: Optional[Dict[str, str]] = None,
+        read_consistency_interval: Optional[timedelta] = None,
    ):
        """Connect to a remote LanceDB database."""
        if isinstance(client_config, dict):
@@ -88,6 +148,11 @@ class RemoteDBConnection(DBConnection):
        parsed = urlparse(db_url)
        if parsed.scheme != "db":
            raise ValueError(f"Invalid scheme: {parsed.scheme}, only accepts db://")
+        self.db_url = db_url
+        self.api_key = api_key
+        self.region = region
+        self.host_override = host_override
+        self.storage_options = storage_options
        self.db_name = parsed.netloc

        self.client_config = client_config
@@ -103,12 +168,27 @@ class RemoteDBConnection(DBConnection):
                host_override=host_override,
                client_config=client_config,
                storage_options=storage_options,
+                read_consistency_interval=read_consistency_interval,
            )
        )

    def __repr__(self) -> str:
        return f"RemoteConnect(name={self.db_name})"

+    @override
+    def serialize(self) -> str:
+        return json.dumps(
+            {
+                "connection_type": "remote",
+                "db_url": self.db_url,
+                "api_key": self.api_key,
+                "region": self.region,
+                "host_override": self.host_override,
+                "client_config": _client_config_to_dict(self.client_config),
+                "storage_options": self.storage_options,
+            }
+        )
+
    @override
    def list_namespaces(
        self,
@@ -303,6 +383,8 @@ class RemoteDBConnection(DBConnection):
        namespace_path: Optional[List[str]] = None,
        storage_options: Optional[Dict[str, str]] = None,
        index_cache_size: Optional[int] = None,
+        branch: Optional[str] = None,
+        version: Optional[int] = None,
    ) -> Table:
        """Open a Lance Table in the database.

@@ -313,6 +395,14 @@ class RemoteDBConnection(DBConnection):
        namespace_path: List[str], optional
            The namespace to open the table from.
            None or empty list represents root namespace.
+        branch: str, optional
+            If provided, open a handle scoped to this branch instead of the
+            default branch. Reads and writes operate in the branch's context.
+        version: int, optional
+            If provided, open the table pinned to this version, producing a
+            read-only handle. Composes with ``branch``: when both are given,
+            opens that branch at the version; otherwise opens ``main`` at the
+            version. Call ``checkout_latest`` to return to a writable state.

        Returns
        -------
@@ -329,7 +419,17 @@ class RemoteDBConnection(DBConnection):
            )

        table = LOOP.run(self._conn.open_table(name, namespace_path=namespace_path))
-        return RemoteTable(table, self.db_name)
+        tbl = RemoteTable(
+            table,
+            self.db_name,
+            connection_state=self.serialize,
+            namespace_path=namespace_path,
+        )
+        if branch is not None:
+            tbl = tbl.branches.checkout(branch, version)
+        elif version is not None:
+            tbl.checkout(version)
+        return tbl

    def clone_table(
        self,
@@ -378,7 +478,12 @@ class RemoteDBConnection(DBConnection):
                is_shallow=is_shallow,
            )
        )
-        return RemoteTable(table, self.db_name)
+        return RemoteTable(
+            table,
+            self.db_name,
+            connection_state=self.serialize,
+            namespace_path=target_namespace_path,
+        )

    @override
    def create_table(
@@ -523,7 +628,12 @@ class RemoteDBConnection(DBConnection):
                fill_value=fill_value,
            )
        )
-        return RemoteTable(table, self.db_name)
+        return RemoteTable(
+            table,
+            self.db_name,
+            connection_state=self.serialize,
+            namespace_path=namespace_path,
+        )

    @override
    def drop_table(self, name: str, namespace_path: Optional[List[str]] = None):
--- a/python/python/lancedb/remote/errors.py
+++ b/python/python/lancedb/remote/errors.py
@@ -27,6 +27,9 @@ class LanceDBClientError(RuntimeError):
        self.request_id = request_id
        self.status_code = status_code

+    def __reduce__(self) -> tuple[type, tuple]:
+        return (self.__class__, (str(self), self.request_id, self.status_code))
+

 class HttpError(LanceDBClientError):
    """An error that occurred during an HTTP request.
@@ -101,3 +104,19 @@ class RetryError(LanceDBClientError):
        self.max_request_failures = max_request_failures
        self.max_connect_failures = max_connect_failures
        self.max_read_failures = max_read_failures
+
+    def __reduce__(self) -> tuple[type, tuple]:
+        return (
+            self.__class__,
+            (
+                str(self),
+                self.request_id,
+                self.request_failures,
+                self.connect_failures,
+                self.read_failures,
+                self.max_request_failures,
+                self.max_connect_failures,
+                self.max_read_failures,
+                self.status_code,
+            ),
+        )
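Note on the `__reduce__` overrides above: by default, Python pickles an exception using only `self.args`, which holds whatever was passed to `super().__init__`. An exception whose constructor needs extra arguments can therefore fail to unpickle. The sketch below shows the same pattern on a hypothetical `ClientError` class (not the real `LanceDBClientError`, whose constructor signature is only assumed here).

```python
import pickle


class ClientError(RuntimeError):
    # Hypothetical stand-in for an error type with extra constructor arguments.
    def __init__(self, message, request_id, status_code=None):
        super().__init__(message)
        self.request_id = request_id
        self.status_code = status_code

    def __reduce__(self):
        # Rebuild through the constructor, passing every argument it needs.
        return (self.__class__, (str(self), self.request_id, self.status_code))


# Round-trip the error as a worker process or multiprocessing queue would.
err = pickle.loads(pickle.dumps(ClientError("boom", "req-1", 503)))
```

Removing `__reduce__` from this sketch makes `pickle.loads` raise a `TypeError`, because the constructor would be called with the message alone.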
--- a/python/python/lancedb/remote/oauth.py
+++ b/python/python/lancedb/remote/oauth.py
@@ -1,90 +0,0 @@
-# SPDX-License-Identifier: Apache-2.0
-# SPDX-FileCopyrightText: Copyright The LanceDB Authors
-
-from dataclasses import dataclass
-from enum import Enum
-from typing import List, Optional
-
-
-class OAuthFlowType(str, Enum):
-    """OAuth authentication flow types."""
-
-    CLIENT_CREDENTIALS = "client_credentials"
-    """Client Credentials grant (service-to-service / M2M)."""
-
-    AUTHORIZATION_CODE_PKCE = "authorization_code_pkce"
-    """Authorization Code with PKCE (interactive browser-based auth)."""
-
-    DEVICE_CODE = "device_code"
-    """Device Code grant (CLI / headless environments)."""
-
-    AZURE_MANAGED_IDENTITY = "azure_managed_identity"
-    """Azure Managed Identity via IMDS."""
-
-    WORKLOAD_IDENTITY = "workload_identity"
-    """Workload Identity Federation (K8s, GitHub Actions)."""
-
-
-@dataclass
-class OAuthConfig:
-    """OAuth configuration for LanceDB authentication.
-
-    All token acquisition and refresh is handled in the Rust layer.
-    This config is passed through to Rust via PyO3.
-
-    Parameters
-    ----------
-    issuer_url : str
-        OIDC issuer URL or OAuth authority URL.
-        For Azure: ``https://login.microsoftonline.com/{tenant_id}/v2.0``
-    client_id : str
-        Application / Client ID.
-    scopes : List[str]
-        OAuth scopes to request.
-        For Azure: ``["api://{app_id}/.default"]``
-    flow : OAuthFlowType
-        Authentication flow to use. Default: CLIENT_CREDENTIALS.
-    client_secret : Optional[str]
-        Client secret (required for CLIENT_CREDENTIALS).
-    redirect_uri : Optional[str]
-        Redirect URI for AUTHORIZATION_CODE_PKCE flow.
-    callback_port : Optional[int]
-        Port for local HTTP callback server (AUTHORIZATION_CODE_PKCE, default: 8400).
-    managed_identity_client_id : Optional[str]
-        Client ID for user-assigned managed identity (AZURE_MANAGED_IDENTITY).
-    token_file : Optional[str]
-        Path to federated token file (WORKLOAD_IDENTITY).
-    refresh_buffer_secs : Optional[int]
-        Seconds before expiry to trigger proactive refresh (default: 300).
-
-    Examples
-    --------
-    Client Credentials (service-to-service):
-
-    >>> config = OAuthConfig(
-    ...     issuer_url="https://login.microsoftonline.com/{tenant}/v2.0",
-    ...     client_id="app-id",
-    ...     client_secret="secret",
-    ...     scopes=["api://lancedb-api/.default"],
-    ... )
-
-    Azure Managed Identity:
-
-    >>> config = OAuthConfig(
-    ...     issuer_url="https://login.microsoftonline.com/{tenant}/v2.0",
-    ...     client_id="app-id",
-    ...     scopes=["api://lancedb-api/.default"],
-    ...     flow=OAuthFlowType.AZURE_MANAGED_IDENTITY,
-    ... )
-    """
-
-    issuer_url: str
-    client_id: str
-    scopes: List[str]
-    flow: OAuthFlowType = OAuthFlowType.CLIENT_CREDENTIALS
-    client_secret: Optional[str] = None
-    redirect_uri: Optional[str] = None
-    callback_port: Optional[int] = None
-    managed_identity_client_id: Optional[str] = None
-    token_file: Optional[str] = None
-    refresh_buffer_secs: Optional[int] = None
--- a/python/python/lancedb/remote/table.py
+++ b/python/python/lancedb/remote/table.py
@@ -2,15 +2,30 @@
 # SPDX-FileCopyrightText: Copyright The LanceDB Authors

 from datetime import timedelta
+import deprecation
 import logging
 from functools import cached_property
-from typing import Any, Callable, Dict, Iterable, List, Optional, Union, Literal
+import os
+from typing import (
+    Any,
+    Callable,
+    Dict,
+    Iterable,
+    List,
+    Optional,
+    Union,
+    Literal,
+    overload,
+)
 import warnings

+from lancedb import __version__
+
 from lancedb._lancedb import (
    AddColumnsResult,
    AddResult,
    AlterColumnsResult,
+    UpdateFieldMetadataResult,
    DeleteResult,
    DropColumnsResult,
    IndexConfig,
@@ -32,6 +47,7 @@ from lancedb.index import (
    LabelList,
 )
 from lancedb.remote.db import LOOP
+from lancedb.table import IndexConfigType, KNOWN_METRICS
 import pyarrow as pa

 from lancedb.common import DATA, VEC, VECTOR_COLUMN_NAME
@@ -40,7 +56,7 @@ from lancedb.embeddings import EmbeddingFunctionRegistry
 from lancedb.table import _normalize_progress

 from ..query import LanceVectorQueryBuilder, LanceQueryBuilder, LanceTakeQueryBuilder
-from ..table import AsyncTable, IndexStatistics, Query, Table, Tags
+from ..table import AsyncTable, BlobMode, Branches, IndexStatistics, Query, Table, Tags
 from ..types import BaseTokenizerType


@@ -49,14 +65,90 @@ class RemoteTable(Table):
        self,
        table: AsyncTable,
        db_name: str,
+        *,
+        connection_state: Optional[Union[str, Callable[[], str]]] = None,
+        namespace_path: Optional[List[str]] = None,
    ):
-        self._table = table
+        self._table_handle = table
+        self._name = table.name
        self.db_name = db_name
+        self._connection_state = connection_state
+        self._namespace_path = list(namespace_path or [])
+        self._checkout_version: Optional[int] = None
+        # The branch this handle is scoped to (None == main). Persisted so a
+        # fork/pickle reopen restores the branch instead of reverting to main.
+        self._branch: Optional[str] = None
+        self._pid = os.getpid()
+
+    def _serialized_connection_state(self) -> str:
+        if self._connection_state is None:
+            raise RuntimeError(
+                "Cannot reopen this remote table because it does not carry "
+                "serialized connection state"
+            )
+        if callable(self._connection_state):
+            self._connection_state = self._connection_state()
+        return self._connection_state
+
+    @property
+    def _table(self) -> AsyncTable:
+        self._ensure_open()
+        assert self._table_handle is not None
+        return self._table_handle
+
+    @_table.setter
+    def _table(self, table: AsyncTable) -> None:
+        self._table_handle = table
+        self._name = table.name
+        self._pid = os.getpid()
+
+    def _ensure_open(self) -> None:
+        pid = os.getpid()
+        if self._table_handle is not None and self._pid == pid:
+            return
+
+        # Pickle clears the handle; fork inherits a handle created in the
+        # parent process. In both cases reopen before touching the Rust client.
+        from lancedb import deserialize_conn
+
+        db = deserialize_conn(self._serialized_connection_state(), for_worker=True)
+        # Reopen on the same branch and pinned version (branch=None / version=None
+        # reproduce the plain main-latest open).
+        table = db.open_table(
+            self._name,
+            namespace_path=self._namespace_path,
+            branch=self._branch,
+            version=self._checkout_version,
+        )
+
+        self._table_handle = table._table
+        self.db_name = table.db_name
+        self._pid = pid
+
+    def __getstate__(self) -> dict:
+        return {
+            "connection_state": self._serialized_connection_state(),
+            "db_name": self.db_name,
+            "name": self.name,
+            "namespace_path": self._namespace_path,
+            "checkout_version": self._checkout_version,
+            "branch": self._branch,
+        }
+
+    def __setstate__(self, state: dict) -> None:
+        self._table_handle = None
+        self._name = state["name"]
+        self.db_name = state["db_name"]
+        self._connection_state = state["connection_state"]
+        self._namespace_path = state["namespace_path"]
+        self._checkout_version = state["checkout_version"]
+        self._branch = state.get("branch")
+        self._pid = None

    @property
    def name(self) -> str:
        """The name of the table"""
-        return self._table.name
+        return self._name

    def __repr__(self) -> str:
        return f"RemoteTable({self.db_name}.{self.name})"
@@ -78,6 +170,34 @@ class RemoteTable(Table):
    def tags(self) -> Tags:
        return Tags(self._table)

+    @property
+    def branches(self) -> Branches:
+        """Branch management for the table.
+
+        ``create``/``checkout`` return a new table handle scoped to the branch;
+        writes on it do not affect ``main``.
+        """
+        return Branches(self)
+
+    def current_branch(self) -> Optional[str]:
+        """The branch this table handle is scoped to, or ``None`` for ``main``."""
+        return self._table.current_branch()
+
+    def _wrap_branch_handle(
+        self, async_table: AsyncTable, version: Optional[int] = None
+    ) -> "RemoteTable":
+        # A branch handle stays a RemoteTable with the same connection context.
+        # Record the branch and version pin so a fork/pickle reopen restores both.
+        handle = RemoteTable(
+            async_table,
+            self.db_name,
+            connection_state=self._connection_state,
+            namespace_path=self._namespace_path,
+        )
+        handle._branch = async_table.current_branch()
+        handle._checkout_version = version
+        return handle
+
    @cached_property
    def embedding_functions(self) -> Dict[str, EmbeddingFunctionConfig]:
        """
@@ -101,18 +221,24 @@ class RemoteTable(Table):
        """to_arrow() is not yet supported on LanceDB cloud."""
        raise NotImplementedError("to_arrow() is not yet supported on LanceDB cloud.")

-    def to_pandas(self):
+    def to_pandas(self, blob_mode: BlobMode = "lazy", **kwargs):
        """to_pandas() is not yet supported on LanceDB cloud."""
        raise NotImplementedError("to_pandas() is not yet supported on LanceDB cloud.")

    def checkout(self, version: Union[int, str]):
-        return LOOP.run(self._table.checkout(version))
+        result = LOOP.run(self._table.checkout(version))
+        self._checkout_version = self.version
+        return result

    def checkout_latest(self):
-        return LOOP.run(self._table.checkout_latest())
+        result = LOOP.run(self._table.checkout_latest())
+        self._checkout_version = None
+        return result

    def restore(self, version: Optional[Union[int, str]] = None):
-        return LOOP.run(self._table.restore(version))
+        result = LOOP.run(self._table.restore(version))
+        self._checkout_version = None
+        return result

    def list_indices(self) -> Iterable[IndexConfig]:
        """List all the indices on the table"""
@@ -122,6 +248,11 @@ class RemoteTable(Table):
        """List all the stats of a specified index"""
        return LOOP.run(self._table.index_stats(index_uuid))

+    @deprecation.deprecated(
+        deprecated_in="0.25.0",
+        current_version=__version__,
+        details="Use create_index() with config=BTree()/Bitmap()/LabelList() instead.",
+    )
    def create_scalar_index(
        self,
        column: str,
@@ -131,7 +262,12 @@ class RemoteTable(Table):
        wait_timeout: Optional[timedelta] = None,
        name: Optional[str] = None,
    ):
-        """Creates a scalar index
+        """Creates a scalar index.
+
+        .. deprecated:: 0.25.0
+            Use :meth:`create_index` with a BTree, Bitmap, or LabelList config instead.
+            Example: ``table.create_index("column", config=BTree())``
+
        Parameters
        ----------
        column : str
@@ -162,6 +298,11 @@ class RemoteTable(Table):
            )
        )

+    @deprecation.deprecated(
+        deprecated_in="0.25.0",
+        current_version=__version__,
+        details="Use create_index() with config=FTS() instead.",
+    )
    def create_fts_index(
        self,
        column: str,
@@ -182,6 +323,12 @@ class RemoteTable(Table):
        prefix_only: bool = False,
        name: Optional[str] = None,
    ):
+        """Create a full-text search index on a column.
+
+        .. deprecated:: 0.25.0
+            Use :meth:`create_index` with an FTS config instead.
+            Example: ``table.create_index("text_column", config=FTS())``
+        """
        config = FTS(
            with_position=with_position,
            base_tokenizer=base_tokenizer,
@@ -205,9 +352,43 @@ class RemoteTable(Table):
            )
        )

+    # New unified API overload
+    @overload
    def create_index(
        self,
-        metric="l2",
+        column: str,
+        /,
+        *,
+        config: IndexConfigType,
+        wait_timeout: Optional[timedelta] = ...,
+        name: Optional[str] = ...,
+        train: bool = ...,
+    ) -> None: ...
+
+    # Legacy API overload (deprecated)
+    @overload
+    def create_index(
+        self,
+        metric: Literal["l2", "cosine", "dot", "hamming"] = ...,
+        vector_column_name: str = ...,
+        index_cache_size: Optional[int] = ...,
+        num_partitions: Optional[int] = ...,
+        num_sub_vectors: Optional[int] = ...,
+        replace: Optional[bool] = ...,
+        accelerator: Optional[str] = ...,
+        index_type: Literal[
+            "VECTOR", "IVF_FLAT", "IVF_SQ", "IVF_PQ", "IVF_HNSW_SQ", "IVF_HNSW_PQ"
+        ] = ...,
+        wait_timeout: Optional[timedelta] = ...,
+        *,
+        num_bits: int = ...,
+        name: Optional[str] = ...,
+        train: bool = ...,
+    ) -> None: ...
+
+    def create_index(
+        self,
+        metric: str = "l2",
        vector_column_name: str = VECTOR_COLUMN_NAME,
        index_cache_size: Optional[int] = None,
        num_partitions: Optional[int] = None,
@@ -218,89 +399,113 @@ class RemoteTable(Table):
        wait_timeout: Optional[timedelta] = None,
        *,
        num_bits: int = 8,
+        config: Optional[IndexConfigType] = None,
        name: Optional[str] = None,
        train: bool = True,
    ):
-        """Create an index on the table.
+        """Create an index on a column.

-        Parameters
-        ----------
-        metric : str
-            The metric to use for the index. Default is "l2".
-        vector_column_name : str
-            The name of the vector column. Default is "vector".
+        This method supports both the new unified API and the legacy API
+        for backwards compatibility. The new API takes the column name as the
+        first positional argument and an index configuration object via
+        ``config``; the legacy API takes the distance metric as the first
+        argument plus separate ``vector_column_name`` / ``num_partitions`` /
+        etc. parameters, and emits a ``DeprecationWarning``.

        Examples
        --------
-        >>> import lancedb
-        >>> import uuid
-        >>> from lancedb.schema import vector
-        >>> db = lancedb.connect("db://...", api_key="...", # doctest: +SKIP
-        ...                      region="...") # doctest: +SKIP
-        >>> table_name = uuid.uuid4().hex
-        >>> schema = pa.schema(
-        ...     [
-        ...             pa.field("id", pa.uint32(), False),
-        ...            pa.field("vector", vector(128), False),
-        ...             pa.field("s", pa.string(), False),
-        ...     ]
+        New API (recommended):
+
+        >>> table.create_index(  # doctest: +SKIP
+        ...     "vector", config=IvfPq(distance_type="l2")
        ... )
-        >>> table = db.create_table( # doctest: +SKIP
-        ...     table_name, # doctest: +SKIP
-        ...     schema=schema, # doctest: +SKIP
+        >>> table.create_index("category", config=BTree())  # doctest: +SKIP
+        >>> table.create_index("content", config=FTS())  # doctest: +SKIP
+
+        Legacy API (deprecated):
+
+        >>> table.create_index(  # doctest: +SKIP
+        ...     "l2", vector_column_name="vector"
        ... )
-        >>> table.create_index("l2", "vector") # doctest: +SKIP
        """
+        # Detect whether this is a legacy API call
+        is_legacy = self._is_legacy_create_index_call(
+            metric,
+            config,
+            num_partitions,
+            num_sub_vectors,
+            vector_column_name,
+            accelerator,
+            index_cache_size,
+            replace,
+        )

-        if accelerator is not None:
-            logging.warning(
-                "GPU accelerator is not yet supported on LanceDB cloud."
-                "If you have 100M+ vectors to index,"
-                "please contact us at contact@lancedb.com"
-            )
-        if replace is not None:
-            logging.warning(
-                "replace is not supported on LanceDB cloud."
-                "Existing indexes will always be replaced."
+        if is_legacy:
+            warnings.warn(
+                "The create_index() API with metric/num_partitions parameters is "
+                "deprecated and will be removed in a future version. "
+                "Please migrate to the new unified API:\n"
+                "  # Old (deprecated):\n"
+                "  table.create_index('l2', vector_column_name='my_vector')\n"
+                "  # New (recommended):\n"
+                "  table.create_index('my_vector', config=IvfPq(distance_type='l2'))",
+                DeprecationWarning,
+                stacklevel=2,
            )

-        index_type = index_type.upper()
-        if index_type == "VECTOR" or index_type == "IVF_PQ":
-            config = IvfPq(
-                distance_type=metric,
-                num_partitions=num_partitions,
-                num_sub_vectors=num_sub_vectors,
-                num_bits=num_bits,
-            )
-        elif index_type == "IVF_RQ":
-            config = IvfRq(
-                distance_type=metric,
-                num_partitions=num_partitions,
-                num_bits=num_bits,
-            )
-        elif index_type == "IVF_SQ":
-            config = IvfSq(distance_type=metric, num_partitions=num_partitions)
-        elif index_type == "IVF_HNSW_PQ":
-            raise ValueError(
-                "IVF_HNSW_PQ is not supported on LanceDB cloud."
-                "Please use IVF_HNSW_SQ instead."
-            )
-        elif index_type == "IVF_HNSW_SQ":
-            config = HnswSq(distance_type=metric, num_partitions=num_partitions)
-        elif index_type == "IVF_HNSW_FLAT":
-            config = HnswFlat(distance_type=metric, num_partitions=num_partitions)
-        elif index_type == "IVF_FLAT":
-            config = IvfFlat(distance_type=metric, num_partitions=num_partitions)
+            column = vector_column_name
+
+            if accelerator is not None:
+                logging.warning(
+                    "GPU accelerator is not yet supported on LanceDB cloud. "
+                    "If you have 100M+ vectors to index, "
+                    "please contact us at contact@lancedb.com"
+                )
+            if replace is not None:
+                logging.warning(
+                    "replace is not supported on LanceDB cloud. "
+                    "Existing indexes will always be replaced."
+                )
+
+            idx_type = index_type.upper()
+            if idx_type == "VECTOR" or idx_type == "IVF_PQ":
+                config = IvfPq(
+                    distance_type=metric,
+                    num_partitions=num_partitions,
+                    num_sub_vectors=num_sub_vectors,
+                    num_bits=num_bits,
+                )
+            elif idx_type == "IVF_RQ":
+                config = IvfRq(
+                    distance_type=metric,
+                    num_partitions=num_partitions,
+                    num_bits=num_bits,
+                )
+            elif idx_type == "IVF_SQ":
+                config = IvfSq(distance_type=metric, num_partitions=num_partitions)
+            elif idx_type == "IVF_HNSW_PQ":
+                raise ValueError(
+                    "IVF_HNSW_PQ is not supported on LanceDB cloud. "
+                    "Please use IVF_HNSW_SQ instead."
+                )
+            elif idx_type == "IVF_HNSW_SQ":
+                config = HnswSq(distance_type=metric, num_partitions=num_partitions)
+            elif idx_type == "IVF_HNSW_FLAT":
+                config = HnswFlat(distance_type=metric, num_partitions=num_partitions)
+            elif idx_type == "IVF_FLAT":
+                config = IvfFlat(distance_type=metric, num_partitions=num_partitions)
+            else:
+                raise ValueError(
+                    f"Unknown vector index type: {idx_type}. Valid options are"
+                    " 'IVF_FLAT', 'IVF_PQ', 'IVF_RQ', 'IVF_SQ',"
+                    " 'IVF_HNSW_PQ', 'IVF_HNSW_SQ', 'IVF_HNSW_FLAT'"
+                )
        else:
-            raise ValueError(
-                f"Unknown vector index type: {index_type}. Valid options are"
-                " 'IVF_FLAT', 'IVF_PQ', 'IVF_RQ', 'IVF_SQ',"
-                " 'IVF_HNSW_PQ', 'IVF_HNSW_SQ', 'IVF_HNSW_FLAT'"
-            )
+            column = metric

        LOOP.run(
            self._table.create_index(
-                vector_column_name,
+                column,
                config=config,
                wait_timeout=wait_timeout,
                name=name,
@@ -308,6 +513,37 @@ class RemoteTable(Table):
            )
        )

+    def _is_legacy_create_index_call(
+        self,
+        first_arg: str,
+        config: Optional[IndexConfigType],
+        num_partitions: Optional[int],
+        num_sub_vectors: Optional[int],
+        vector_column_name: str,
+        accelerator: Optional[str],
+        index_cache_size: Optional[int],
+        replace: Optional[bool],
+    ) -> bool:
+        """Detect if this is a legacy create_index call."""
+        if config is not None:
+            return False
+        if any(
+            x is not None
+            for x in (
+                num_partitions,
+                num_sub_vectors,
+                accelerator,
+                index_cache_size,
+                replace,
+            )
+        ):
+            return True
+        if vector_column_name != VECTOR_COLUMN_NAME:
+            return True
+        if first_arg.lower() in KNOWN_METRICS:
+            return True
+        return False
+
    def add(
        self,
        data: DATA,
@@ -653,6 +889,11 @@ class RemoteTable(Table):
    ) -> AlterColumnsResult:
        return LOOP.run(self._table.alter_columns(*alterations))

+    def update_field_metadata(
+        self, *updates: dict[str, Any]
+    ) -> UpdateFieldMetadataResult:
+        return LOOP.run(self._table.update_field_metadata(*updates))
+
    def drop_columns(self, columns: Iterable[str]) -> DropColumnsResult:
        return LOOP.run(self._table.drop_columns(columns))

@@ -668,6 +909,10 @@ class RemoteTable(Table):
        """Not supported on LanceDB Cloud."""
        return LOOP.run(self._table.unset_lsm_write_spec())

+    def close_lsm_writers(self) -> None:
+        """No-op on LanceDB Cloud (no local shard writers)."""
+        return LOOP.run(self._table.close_lsm_writers())
+
    def drop_index(self, index_name: str):
        return LOOP.run(self._table.drop_index(index_name))
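Note on the reopen logic in `RemoteTable` above: the handle records the process ID that created it, drops the live handle when pickled, and reopens lazily on first use. A minimal sketch of that pattern follows. `LazyHandle` and its `_open` method are hypothetical stand-ins; the real code reconnects through `deserialize_conn` and `open_table`.

```python
import os
import pickle


class LazyHandle:
    """Hypothetical wrapper that reopens its resource after pickle or fork."""

    def __init__(self, name):
        self._name = name
        self._resource = self._open()
        self._pid = os.getpid()

    def _open(self):
        # Stand-in for reconnecting to a remote service.
        return {"name": self._name, "opened_in": os.getpid()}

    @property
    def resource(self):
        # Reopen when the handle was dropped (pickle) or inherited (fork).
        if self._resource is None or self._pid != os.getpid():
            self._resource = self._open()
            self._pid = os.getpid()
        return self._resource

    def __getstate__(self):
        # Only plain data crosses the process boundary, never the live handle.
        return {"name": self._name}

    def __setstate__(self, state):
        self._name = state["name"]
        self._resource = None
        self._pid = None


clone = pickle.loads(pickle.dumps(LazyHandle("events")))
```

The unpickled `clone` holds no resource until `clone.resource` is first read, at which point it is opened in the current process.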

--- a/python/python/lancedb/rerankers/linear_combination.py
+++ b/python/python/lancedb/rerankers/linear_combination.py
@@ -102,8 +102,15 @@ class LinearCombinationReranker(Reranker):

        combined_list = []
        for row_id, result in results.items():
+            # Convert vector distance to a relevance score in [0, 1] where
+            # higher is better.  Missing vector entries are penalised with
+            # `_invert_score(fill)` = 1 - fill (= 0.0 for the default fill=1).
            vector_score = self._invert_score(result.get("_distance", fill))
-            fts_score = result.get("_score", fill)
+            # FTS scores (BM25) are already in a "higher = more relevant" space.
+            # Missing FTS entries are penalised symmetrically: we use
+            # `1 - fill` so that the same `fill` value drives both missing-vector
+            # and missing-FTS penalties in the same direction.
+            fts_score = result.get("_score", 1 - fill)
            result["_relevance_score"] = self._combine_score(vector_score, fts_score)
            combined_list.append(result)

@@ -123,8 +130,12 @@ class LinearCombinationReranker(Reranker):
        return tbl

    def _combine_score(self, vector_score, fts_score):
-        # these scores represent distance
-        return 1 - (self.weight * vector_score + (1 - self.weight) * fts_score)
+        # Both vector_score (inverted distance) and fts_score are in a
+        # "higher = more relevant" space.  A straight weighted average gives
+        # higher _relevance_score to better matches, as expected.
+        # Previously this returned `1 - (...)` which inverted the final
+        # ranking so that the *least* relevant document ranked first.
+        return self.weight * vector_score + (1 - self.weight) * fts_score

    def _invert_score(self, dist: float):
        # Invert the score between relevance and distance
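Note on the scoring change above: once both inputs are in a "higher is more relevant" space, a plain weighted average ranks better matches first. The sketch below works through three cases with standalone functions. The weight of 0.7 and the sample scores are illustrative assumptions, and `invert_score` follows the `1 - fill` behavior described in the diff comment, not the full method body.

```python
def invert_score(dist: float) -> float:
    # Turn a distance (lower is better) into a relevance score (higher is better).
    return 1 - dist


def combine_score(vector_score: float, fts_score: float, weight: float = 0.7) -> float:
    # Weighted average of two "higher is better" scores.
    return weight * vector_score + (1 - weight) * fts_score


fill = 1.0  # penalty value for a result missing from one of the two searches

# Row found by both searches: close in vector space and a strong text match.
both = combine_score(invert_score(0.1), 0.9)
# Row found only by vector search: the missing FTS score defaults to 1 - fill.
vector_only = combine_score(invert_score(0.1), 1 - fill)
# Row found only by text search: the missing distance defaults to fill.
fts_only = combine_score(invert_score(fill), 0.9)
```

With these inputs the row matched by both searches scores highest. Under the previous `1 - (...)` formula the order would have been reversed.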
--- a/python/python/lancedb/rerankers/mrr.py
+++ b/python/python/lancedb/rerankers/mrr.py
@@ -125,6 +125,9 @@ class MRRReranker(Reranker):
        This cannot reuse rerank_hybrid because MRR semantics require treating
        each vector result as a separate ranking system.
        """
+        if not vector_results:
+            raise ValueError("vector_results must not be empty")
+
        if not all(isinstance(v, type(vector_results[0])) for v in vector_results):
            raise ValueError(
                "All elements in vector_results should be of the same type"
--- a/python/python/lancedb/rerankers/rrf.py
+++ b/python/python/lancedb/rerankers/rrf.py
@@ -82,6 +82,9 @@ class RRFReranker(Reranker):
        results from multiple vector searches as it doesn't support reranking
        vector results individually.
        """
+        if not vector_results:
+            raise ValueError("vector_results must not be empty")
+
        # Make sure all elements are of the same type
        if not all(isinstance(v, type(vector_results[0])) for v in vector_results):
            raise ValueError(
--- a/python/python/lancedb/table.py
+++ b/python/python/lancedb/table.py
--- a/python/python/lancedb/util.py
+++ b/python/python/lancedb/util.py
@@ -10,7 +10,7 @@ import pathlib
 import warnings
 from datetime import date, datetime
 from functools import singledispatch
-from typing import Tuple, Union, Optional, Any
+from typing import Tuple, Union, Optional, Any, List
 from urllib.parse import urlparse

 import numpy as np
@@ -189,7 +189,33 @@ def flatten_columns(tbl: pa.Table, flatten: Optional[Union[int, bool]] = None):
    return tbl


-def inf_vector_column_query(schema: pa.Schema) -> str:
+def _format_field_path(path: List[str]) -> str:
+    def format_segment(segment: str) -> str:
+        if all(char.isalnum() or char == "_" for char in segment):
+            return segment
+        return f"`{segment.replace('`', '``')}`"
+
+    return ".".join(format_segment(segment) for segment in path)
+
+
+def _iter_vector_columns(
+    field: pa.Field, path: List[str], dim: Optional[int] = None
+) -> List[str]:
+    field_path = [*path, field.name]
+    if is_vector_column(field.type):
+        vector_dim = infer_vector_column_dim(field.type)
+        if dim is None or vector_dim == dim:
+            return [_format_field_path(field_path)]
+        return []
+    if pa.types.is_struct(field.type):
+        columns = []
+        for idx in range(field.type.num_fields):
+            columns.extend(_iter_vector_columns(field.type.field(idx), field_path, dim))
+        return columns
+    return []
+
+
+def inf_vector_column_query(schema: pa.Schema, dim: Optional[int] = None) -> str:
    """
    Get the vector column name

@@ -202,26 +228,21 @@ def inf_vector_column_query(schema: pa.Schema) -> str:
    -------
    str: the vector column name.
    """
-    vector_col_name = ""
-    vector_col_count = 0
-    for field_name in schema.names:
-        field = schema.field(field_name)
-        if is_vector_column(field.type):
-            vector_col_count += 1
-            if vector_col_count > 1:
-                raise ValueError(
-                    "Schema has more than one vector column. "
-                    "Please specify the vector column name "
-                    "for vector search"
-                )
-            elif vector_col_count == 1:
-                vector_col_name = field_name
-    if vector_col_count == 0:
+    vector_col_names = []
+    for field in schema:
+        vector_col_names.extend(_iter_vector_columns(field, [], dim))
+    if len(vector_col_names) > 1:
+        raise ValueError(
+            "Schema has more than one vector column. "
+            "Please specify the vector column name "
+            f"for vector search. Candidates: {vector_col_names}"
+        )
+    if len(vector_col_names) == 0:
        raise ValueError(
            "There is no vector column in the data. "
            "Please specify the vector column name for vector search"
        )
-    return vector_col_name
+    return vector_col_names[0]


 def is_vector_column(data_type: pa.DataType) -> bool:
@@ -247,6 +268,29 @@ def is_vector_column(data_type: pa.DataType) -> bool:
    return False


+def infer_vector_column_dim(data_type: pa.DataType) -> Optional[int]:
+    if pa.types.is_fixed_size_list(data_type):
+        return data_type.list_size
+    if pa.types.is_list(data_type):
+        return infer_vector_column_dim(data_type.value_type)
+    return None
+
+
+def _query_vector_dim(query: Optional[Any]) -> Optional[int]:
+    if query is None:
+        return None
+    if isinstance(query, np.ndarray):
+        if query.ndim == 0:
+            return None
+        return query.shape[-1]
+    if isinstance(query, list) and query:
+        first = query[0]
+        if isinstance(first, (list, tuple, np.ndarray)):
+            return len(first)
+        return len(query)
+    return None
+
+
 def infer_vector_column_name(
    schema: pa.Schema,
    query_type: str,
@@ -262,7 +306,9 @@ def infer_vector_column_name(

    if query is not None or query_type == "hybrid":
        try:
-            vector_column_name = inf_vector_column_query(schema)
+            vector_column_name = inf_vector_column_query(
+                schema, dim=_query_vector_dim(query)
+            )
        except Exception as e:
            raise e

@@ -339,6 +385,21 @@ def _(value: np.ndarray):
    return value_to_sql(value.tolist())


+@value_to_sql.register(np.bool_)
+def _(value: np.bool_):
+    return value_to_sql(bool(value))
+
+
+@value_to_sql.register(np.integer)
+def _(value: np.integer):
+    return value_to_sql(int(value))
+
+
+@value_to_sql.register(np.floating)
+def _(value: np.floating):
+    return value_to_sql(float(value))
+
+
 def deprecated(func):
    """This is a decorator which can be used to mark functions
    as deprecated. It will result in a warning being emitted
--- a/python/python/tests/docs/test_merge_insert.py
+++ b/python/python/tests/docs/test_merge_insert.py
@@ -57,7 +57,7 @@ async def test_upsert_async(mem_db_async):
    await table.count_rows()  # 3
    res
    # MergeResult(version=2, num_updated_rows=1,
-    # num_inserted_rows=1, num_deleted_rows=0)
+    # num_inserted_rows=1, num_deleted_rows=0, num_rows=2)
    # --8<-- [end:upsert_basic_async]
    assert await table.count_rows() == 3
    assert res.version == 2
@@ -86,7 +86,7 @@ def test_insert_if_not_exists(mem_db):
    table.count_rows()  # 3
    res
    # MergeResult(version=2, num_updated_rows=0,
-    # num_inserted_rows=1, num_deleted_rows=0)
+    # num_inserted_rows=1, num_deleted_rows=0, num_rows=1)
    # --8<-- [end:insert_if_not_exists]
    assert table.count_rows() == 3
    assert res.version == 2
@@ -116,7 +116,7 @@ async def test_insert_if_not_exists_async(mem_db_async):
    await table.count_rows()  # 3
    res
    # MergeResult(version=2, num_updated_rows=0,
-    # num_inserted_rows=1, num_deleted_rows=0)
+    # num_inserted_rows=1, num_deleted_rows=0, num_rows=1)
    # --8<-- [end:insert_if_not_exists]
    assert await table.count_rows() == 3
    assert res.version == 2
@@ -150,7 +150,7 @@ def test_replace_range(mem_db):
    table.count_rows("doc_id = 1")  # 1
    res
    # MergeResult(version=2, num_updated_rows=1,
-    # num_inserted_rows=0, num_deleted_rows=1)
+    # num_inserted_rows=0, num_deleted_rows=1, num_rows=1)
    # --8<-- [end:insert_if_not_exists]
    assert table.count_rows("doc_id = 1") == 1
    assert res.version == 2
@@ -185,7 +185,7 @@ async def test_replace_range_async(mem_db_async):
    await table.count_rows("doc_id = 1")  # 1
    res
    # MergeResult(version=2, num_updated_rows=1,
-    # num_inserted_rows=0, num_deleted_rows=1)
+    # num_inserted_rows=0, num_deleted_rows=1, num_rows=1)
    # --8<-- [end:insert_if_not_exists]
    assert await table.count_rows("doc_id = 1") == 1
    assert res.version == 2
--- a/python/python/tests/test_db.py
+++ b/python/python/tests/test_db.py
@@ -6,6 +6,7 @@ import re
 import sys
 from datetime import timedelta
 import os
+from types import SimpleNamespace

 import lancedb
 import numpy as np
@@ -188,6 +189,43 @@ def test_table_names(tmp_db: lancedb.DBConnection):
    assert len(result) == 3


+def test_db_contains_and_len_include_all_table_name_pages(tmp_db: lancedb.DBConnection):
+    for idx in range(20):
+        tmp_db.create_table(f"table_{idx}", data=[{"id": idx}])
+
+    assert len(tmp_db) == 20
+    for idx in range(20):
+        assert f"table_{idx}" in tmp_db
+    assert "does_not_exist" not in tmp_db
+
+
+def test_db_contains_stops_after_matching_table_page(
+    tmp_db: lancedb.DBConnection, monkeypatch
+):
+    calls = []
+    pages = {
+        None: SimpleNamespace(tables=["table_0", "table_1"], page_token="next"),
+        "next": SimpleNamespace(tables=["table_2"], page_token=None),
+    }
+
+    def list_tables(*, page_token=None, **_kwargs):
+        calls.append(page_token)
+        return pages[page_token]
+
+    monkeypatch.setattr(tmp_db, "list_tables", list_tables)
+
+    assert "table_1" in tmp_db
+    assert calls == [None]
+
+    calls.clear()
+    assert "table_2" in tmp_db
+    assert calls == [None, "next"]
+
+    calls.clear()
+    assert len(tmp_db) == 3
+    assert calls == [None, "next"]
+
+
@pytest.mark.asyncio
 async def test_table_names_async(tmp_path):
    db = lancedb.connect(tmp_path)
@@ -428,7 +466,8 @@ async def test_create_table_v2_manifest_paths_async(tmp_path):
    assert await tbl.uses_v2_manifest_paths()
    manifests_dir = tmp_path / "test_v2_manifest_paths.lance" / "_versions"
    for manifest in os.listdir(manifests_dir):
-        assert re.match(r"\d{20}\.manifest", manifest)
+        if manifest.endswith(".manifest"):
+            assert re.match(r"\d{20}\.manifest", manifest)

    # Start a table in V1 mode then migrate
    tbl = await db_no_v2_paths.create_table(
@@ -438,13 +477,15 @@ async def test_create_table_v2_manifest_paths_async(tmp_path):
    assert not await tbl.uses_v2_manifest_paths()
    manifests_dir = tmp_path / "test_v2_migration.lance" / "_versions"
    for manifest in os.listdir(manifests_dir):
-        assert re.match(r"\d\.manifest", manifest)
+        if manifest.endswith(".manifest"):
+            assert re.match(r"\d\.manifest", manifest)

    await tbl.migrate_manifest_paths_v2()
    assert await tbl.uses_v2_manifest_paths()

    for manifest in os.listdir(manifests_dir):
-        assert re.match(r"\d{20}\.manifest", manifest)
+        if manifest.endswith(".manifest"):
+            assert re.match(r"\d{20}\.manifest", manifest)


@pytest.mark.asyncio
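Note on the `test_db.py` pagination tests above: `test_db_contains_stops_after_matching_table_page` expects a membership check to stop at the first page containing the name, while `len()` walks every page. The following is a minimal sketch of one way to get that behavior, not the actual `DBConnection.__contains__` / `__len__` implementation, which is not shown in this part of the diff. The helper names `iter_table_pages`, `contains_table`, `count_tables`, and `fake_list_tables` are hypothetical; the `tables` / `page_token` response shape is copied from the test's `SimpleNamespace` fixtures, and a `page_token` of `None` is assumed to mark the last page.

```python
from types import SimpleNamespace
from typing import Callable, Iterator, List, Optional


def iter_table_pages(
    list_tables: Callable[..., SimpleNamespace],
) -> Iterator[List[str]]:
    """Yield one page of table names at a time, fetching pages lazily."""
    page_token: Optional[str] = None
    while True:
        page = list_tables(page_token=page_token)
        yield page.tables
        page_token = page.page_token
        if page_token is None:  # assumption: None marks the last page
            return


def contains_table(list_tables: Callable[..., SimpleNamespace], name: str) -> bool:
    # any() stops pulling from the generator at the first matching page,
    # so later pages are never requested.
    return any(name in tables for tables in iter_table_pages(list_tables))


def count_tables(list_tables: Callable[..., SimpleNamespace]) -> int:
    # Counting has no early exit: every page must be fetched.
    return sum(len(tables) for tables in iter_table_pages(list_tables))


# Fake two-page listing that records which page tokens were requested.
calls: List[Optional[str]] = []
pages = {
    None: SimpleNamespace(tables=["table_0", "table_1"], page_token="next"),
    "next": SimpleNamespace(tables=["table_2"], page_token=None),
}


def fake_list_tables(*, page_token: Optional[str] = None) -> SimpleNamespace:
    calls.append(page_token)
    return pages[page_token]
```

Because the page iterator is a generator, the early exit comes from `any()` and needs no explicit `break`. `count_tables` reuses the same iterator and always reaches the final page.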
--- a/python/python/tests/test_errors.py
+++ b/python/python/tests/test_errors.py
@@ -0,0 +1,56 @@
+# SPDX-License-Identifier: Apache-2.0
+# SPDX-FileCopyrightText: Copyright The LanceDB Authors
+
+import pickle
+
+from lancedb.remote.errors import HttpError, LanceDBClientError, RetryError
+
+
+def test_pickle_lancedb_client_error():
+    err = LanceDBClientError("something went wrong", "req-123", 400)
+    restored = pickle.loads(pickle.dumps(err))
+    assert str(restored) == "something went wrong"
+    assert restored.request_id == "req-123"
+    assert restored.status_code == 400
+
+
+def test_pickle_lancedb_client_error_no_status_code():
+    err = LanceDBClientError("fail", "req-456")
+    restored = pickle.loads(pickle.dumps(err))
+    assert str(restored) == "fail"
+    assert restored.request_id == "req-456"
+    assert restored.status_code is None
+
+
+def test_pickle_http_error():
+    err = HttpError("not found", "req-789", 404)
+    restored = pickle.loads(pickle.dumps(err))
+    assert isinstance(restored, HttpError)
+    assert str(restored) == "not found"
+    assert restored.request_id == "req-789"
+    assert restored.status_code == 404
+
+
+def test_pickle_retry_error():
+    err = RetryError(
+        "max retries exceeded",
+        "req-abc",
+        request_failures=3,
+        connect_failures=1,
+        read_failures=2,
+        max_request_failures=5,
+        max_connect_failures=3,
+        max_read_failures=3,
+        status_code=503,
+    )
+    restored = pickle.loads(pickle.dumps(err))
+    assert isinstance(restored, RetryError)
+    assert str(restored) == "max retries exceeded"
+    assert restored.request_id == "req-abc"
+    assert restored.request_failures == 3
+    assert restored.connect_failures == 1
+    assert restored.read_failures == 2
+    assert restored.max_request_failures == 5
+    assert restored.max_connect_failures == 3
+    assert restored.max_read_failures == 3
+    assert restored.status_code == 503
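Note on `test_errors.py` above: exceptions whose `__init__` takes more arguments than it forwards to `Exception.__init__` do not survive pickling by default, because the default reduction rebuilds the object from `self.args` alone. The sketch below shows one common fix, overriding `__reduce__`. It is only an illustration: `ClientError`, `NotFoundError`, and `roundtrip` are hypothetical stand-ins, and this chunk of the diff does not show how `lancedb.remote.errors` actually achieves picklability.

```python
import pickle
from typing import Optional


class ClientError(Exception):
    """Hypothetical error type that carries request metadata."""

    def __init__(
        self,
        message: str,
        request_id: str,
        status_code: Optional[int] = None,
    ):
        # Only the message is forwarded, so self.args == (message,).
        super().__init__(message)
        self.request_id = request_id
        self.status_code = status_code

    def __reduce__(self):
        # Without this override, unpickling would call cls(message) and
        # raise TypeError because request_id is a required argument.
        # Returning the full constructor arguments makes the round trip work.
        return (
            self.__class__,
            (self.args[0], self.request_id, self.status_code),
        )


class NotFoundError(ClientError):
    """Subclasses inherit __reduce__ and are restored as their own type."""


def roundtrip(err: Exception) -> Exception:
    """Serialize and deserialize an exception, as a worker process would."""
    return pickle.loads(pickle.dumps(err))
```

A subclass that adds further constructor arguments (as `RetryError` does with its failure counters) would need its own `__reduce__` returning those arguments as well.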
--- a/python/python/tests/test_fts.py
+++ b/python/python/tests/test_fts.py
@@ -215,11 +215,12 @@ def test_reject_legacy_tantivy_index(table):

@pytest.mark.parametrize("with_position", [True, False])
 def test_create_inverted_index(table, with_position):
-    table.create_fts_index(
-        "text",
-        with_position=with_position,
-        name="custom_fts_index",
-    )
+    with pytest.warns(DeprecationWarning, match="create_fts_index"):
+        table.create_fts_index(
+            "text",
+            with_position=with_position,
+            name="custom_fts_index",
+        )
    indices = table.list_indices()
    fts_indices = [i for i in indices if i.index_type == "FTS"]
    assert any(i.name == "custom_fts_index" for i in fts_indices)
@@ -563,8 +564,111 @@ def test_create_index_multiple_columns(tmp_path, table):


 def test_nested_schema(tmp_path, table):
-    with pytest.raises(ValueError, match="top-level fields"):
-        table.create_fts_index("nested.text")
+    table.create_fts_index("nested.text", with_position=True)
+    indices = table.list_indices()
+    assert len(indices) == 1
+    assert indices[0].index_type == "FTS"
+    assert indices[0].columns == ["nested.text"]
+
+    results = (
+        table.search("puppy", query_type="fts", fts_columns="nested.text")
+        .limit(5)
+        .to_list()
+    )
+    assert len(results) > 0
+    assert all("puppy" in row["nested"]["text"] for row in results)
+
+    results = table.search(MatchQuery("puppy", "nested.text")).limit(5).to_list()
+    assert len(results) > 0
+    assert all("puppy" in row["nested"]["text"] for row in results)
+
+    phrase_results = (
+        table.search(PhraseQuery("puppy runs", "nested.text")).limit(5).to_list()
+    )
+    assert len(phrase_results) > 0
+    assert all("puppy runs" in row["nested"]["text"] for row in phrase_results)
+
+    hybrid_results = (
+        table.search(query_type="hybrid", fts_columns="nested.text")
+        .vector([0 for _ in range(128)])
+        .text("puppy")
+        .limit(5)
+        .to_list()
+    )
+    assert len(hybrid_results) > 0
+
+
+@pytest.mark.asyncio
+async def test_nested_schema_async(async_table):
+    await async_table.create_index("nested.text", config=FTS(with_position=True))
+    indices = await async_table.list_indices()
+    assert len(indices) == 1
+    assert indices[0].index_type == "FTS"
+    assert indices[0].columns == ["nested.text"]
+
+    results = await (
+        async_table.query()
+        .nearest_to_text("puppy", columns="nested.text")
+        .limit(5)
+        .to_list()
+    )
+    assert len(results) > 0
+    assert all("puppy" in row["nested"]["text"] for row in results)
+
+    results = await (
+        async_table.query()
+        .nearest_to_text(MatchQuery("puppy", "nested.text"))
+        .limit(5)
+        .to_list()
+    )
+    assert len(results) > 0
+    assert all("puppy" in row["nested"]["text"] for row in results)
+
+    phrase_results = await (
+        async_table.query()
+        .nearest_to_text(PhraseQuery("puppy runs", "nested.text"))
+        .limit(5)
+        .to_list()
+    )
+    assert len(phrase_results) > 0
+    assert all("puppy runs" in row["nested"]["text"] for row in phrase_results)
+
+    hybrid_results = await (
+        async_table.query()
+        .nearest_to([0 for _ in range(128)])
+        .nearest_to_text("puppy", columns="nested.text")
+        .limit(5)
+        .to_list()
+    )
+    assert len(hybrid_results) > 0
+
+
+def test_nested_schema_rejects_invalid_fts_fields(tmp_path):
+    db = ldb.connect(tmp_path)
+    data = pa.table(
+        {
+            "payload": pa.array(
+                [
+                    {"text": "puppy runs", "count": 1},
+                    {"text": "car drives", "count": 2},
+                ]
+            ),
+            "vector": pa.array(
+                [[0.1, 0.1], [0.2, 0.2]],
+                type=pa.list_(pa.float32(), list_size=2),
+            ),
+        }
+    )
+    table = db.create_table("test", data=data)
+
+    with pytest.raises(ValueError, match="FTS index cannot be created.*payload"):
+        table.create_fts_index("payload")
+
+    with pytest.raises(ValueError, match="FTS index cannot be created.*count"):
+        table.create_fts_index("payload.count")
+
+    with pytest.raises(ValueError, match="Field path `payload.missing` not found"):
+        table.create_fts_index("payload.missing")


 def test_search_index_with_filter(table):
--- a/python/python/tests/test_index.py
+++ b/python/python/tests/test_index.py
@@ -20,6 +20,7 @@ from lancedb.index import (
    IvfRq,
    Bitmap,
    LabelList,
+    Fm,
    HnswPq,
    HnswSq,
    HnswFlat,
@@ -105,6 +106,94 @@ async def test_create_scalar_index(some_table: AsyncTable):
    assert len(indices) == 0


+@pytest.mark.asyncio
+async def test_create_nested_scalar_index_lists_canonical_paths(db_async):
+    metadata_type = pa.struct(
+        [
+            pa.field("user_id", pa.int32()),
+            pa.field("user.id", pa.int32()),
+        ]
+    )
+    mixed_case_metadata_type = pa.struct([pa.field("userId", pa.int32())])
+    escaped_metadata_type = pa.struct([pa.field("user-id", pa.int32())])
+    literal_type = pa.struct([pa.field("a.b", pa.int32())])
+    data = pa.Table.from_arrays(
+        [
+            pa.array([1, 2, 3], type=pa.int32()),
+            pa.array([1, 2, 3], type=pa.int32()),
+            pa.array([1, 2, 3], type=pa.int32()),
+            pa.array([1, 2, 3], type=pa.int32()),
+            pa.array(
+                [
+                    {"user_id": 10, "user.id": 100},
+                    {"user_id": 20, "user.id": 200},
+                    {"user_id": 30, "user.id": 300},
+                ],
+                type=metadata_type,
+            ),
+            pa.array(
+                [{"userId": 10}, {"userId": 20}, {"userId": 30}],
+                type=mixed_case_metadata_type,
+            ),
+            pa.array(
+                [{"user-id": 10}, {"user-id": 20}, {"user-id": 30}],
+                type=escaped_metadata_type,
+            ),
+            pa.array(
+                [{"a.b": 10}, {"a.b": 20}, {"a.b": 30}],
+                type=literal_type,
+            ),
+        ],
+        names=[
+            "rowId",
+            "row-id",
+            "userId",
+            "user_id",
+            "metadata",
+            "MetaData",
+            "meta-data",
+            "literal",
+        ],
+    )
+    table = await db_async.create_table("nested_scalar_index", data)
+
+    await table.create_index("rowId", config=BTree(), name="row_id_idx")
+    await table.create_index("`row-id`", config=BTree(), name="row_dash_id_idx")
+    await table.create_index("userId", config=BTree(), name="top_user_id_idx")
+    await table.create_index("user_id", config=BTree(), name="top_snake_user_id_idx")
+    await table.create_index(
+        "metadata.user_id", config=BTree(), name="nested_user_id_idx"
+    )
+    await table.create_index(
+        "metadata.`user.id`", config=BTree(), name="escaped_user_id_idx"
+    )
+    await table.create_index(
+        "MetaData.userId", config=BTree(), name="mixed_case_metadata_user_id_idx"
+    )
+    await table.create_index(
+        "`meta-data`.`user-id`", config=BTree(), name="escaped_names_idx"
+    )
+    await table.create_index("literal.`a.b`", config=BTree(), name="literal_dot_idx")
+
+    columns_by_name = {
+        index.name: index.columns for index in await table.list_indices()
+    }
+    assert columns_by_name["row_id_idx"] == ["rowId"]
+    assert columns_by_name["row_dash_id_idx"] == ["`row-id`"]
+    assert columns_by_name["top_user_id_idx"] == ["userId"]
+    assert columns_by_name["top_snake_user_id_idx"] == ["user_id"]
+    assert columns_by_name["nested_user_id_idx"] == ["metadata.user_id"]
+    assert columns_by_name["escaped_user_id_idx"] == ["metadata.`user.id`"]
+    assert columns_by_name["mixed_case_metadata_user_id_idx"] == ["MetaData.userId"]
+    assert columns_by_name["escaped_names_idx"] == ["`meta-data`.`user-id`"]
+    assert columns_by_name["literal_dot_idx"] == ["literal.`a.b`"]
+
+    for index_name in columns_by_name:
+        stats = await table.index_stats(index_name)
+        assert stats is not None
+        assert stats.num_indexed_rows == 3
+
+
@pytest.mark.asyncio
 async def test_create_fixed_size_binary_index(some_table: AsyncTable):
    await some_table.create_index("fsb", config=BTree())
@@ -115,6 +204,16 @@ async def test_create_fixed_size_binary_index(some_table: AsyncTable):
    assert indices[0].columns == ["fsb"]


+@pytest.mark.asyncio
+async def test_create_fm_index(some_table: AsyncTable):
+    # FM-Index accelerates substring search on string/binary columns.
+    await some_table.create_index("data", config=Fm())
+    indices = await some_table.list_indices()
+    assert len(indices) == 1
+    assert indices[0].index_type == "Fm"
+    assert indices[0].columns == ["data"]
+
+
@pytest.mark.asyncio
 async def test_create_bitmap_index(some_table: AsyncTable):
    await some_table.create_index("id", config=Bitmap())
@@ -122,12 +221,13 @@ async def test_create_bitmap_index(some_table: AsyncTable):
    await some_table.create_index("data", config=Bitmap())
    indices = await some_table.list_indices()
    assert len(indices) == 3
+    # list_indices returns indices in alphabetical order by name
    assert indices[0].index_type == "Bitmap"
-    assert indices[0].columns == ["id"]
+    assert indices[0].columns == ["data"]
    assert indices[1].index_type == "Bitmap"
-    assert indices[1].columns == ["is_active"]
+    assert indices[1].columns == ["id"]
    assert indices[2].index_type == "Bitmap"
-    assert indices[2].columns == ["data"]
+    assert indices[2].columns == ["is_active"]

    index_name = indices[0].name
    stats = await some_table.index_stats(index_name)
@@ -148,6 +248,51 @@ async def test_create_label_list_index(some_table: AsyncTable):
    await some_table.create_index("tags", config=LabelList())
    indices = await some_table.list_indices()
    assert str(indices) == '[Index(LabelList, columns=["tags"], name="tags_idx")]'
+    plan = await some_table.query().where("array_has(tags, 'tag0')").explain_plan()
+    assert "ScalarIndexQuery" in plan
+
+
+@pytest.mark.asyncio
+async def test_create_large_list_label_list_index(db_async):
+    data = pa.Table.from_pydict(
+        {"tags": [[f"tag{i % 2}", "shared"] for i in range(16)]},
+        schema=pa.schema([pa.field("tags", pa.large_list(pa.string()))]),
+    )
+    table = await db_async.create_table("large_list_label_list_index", data)
+
+    await table.create_index("tags", config=LabelList())
+    indices = await table.list_indices()
+    assert str(indices) == '[Index(LabelList, columns=["tags"], name="tags_idx")]'
+    plan = await table.query().where("array_has(tags, 'shared')").explain_plan()
+    assert "ScalarIndexQuery" in plan
+
+
+@pytest.mark.asyncio
+async def test_create_label_list_index_rejects_list_struct(db_async):
+    item_type = pa.struct(
+        [
+            pa.field("tag", pa.string()),
+            pa.field(
+                "metadata",
+                pa.struct([pa.field("userId", pa.string())]),
+            ),
+        ]
+    )
+    data = pa.Table.from_pylist(
+        [
+            {
+                "items": [
+                    {"tag": "tag0", "metadata": {"userId": "user0"}},
+                    {"tag": "shared", "metadata": {"userId": "user1"}},
+                ]
+            }
+        ],
+        schema=pa.schema([pa.field("items", pa.list_(item_type))]),
+    )
+    table = await db_async.create_table("list_struct_label_list_index", data)
+
+    with pytest.raises(Exception, match="LabelList index cannot be created"):
+        await table.create_index("items", config=LabelList())


@pytest.mark.asyncio
@@ -185,7 +330,6 @@ async def test_create_vector_index(some_table: AsyncTable):
    assert stats.num_indexed_rows == await some_table.count_rows()
    assert stats.num_unindexed_rows == 0
    assert stats.num_indices == 1
-    assert stats.loss >= 0.0


@pytest.mark.asyncio
@@ -209,7 +353,6 @@ async def test_create_4bit_ivfpq_index(some_table: AsyncTable):
    assert stats.num_indexed_rows == await some_table.count_rows()
    assert stats.num_unindexed_rows == 0
    assert stats.num_indices == 1
-    assert stats.loss >= 0.0


@pytest.mark.asyncio
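Note on `test_create_nested_scalar_index_lists_canonical_paths` above: the test relies on backticks to escape path segments that contain `.` or `-`, so `metadata.user_id` and ``metadata.`user.id` `` name different fields. The sketch below illustrates that splitting rule under simplifying assumptions. `split_field_path` is a hypothetical helper, not Lance's parser, and it does not handle a literal backtick inside a field name.

```python
from typing import List


def split_field_path(path: str) -> List[str]:
    """Split a dotted field path into segments, honoring backtick escapes."""
    parts: List[str] = []
    current: List[str] = []
    in_backticks = False
    for ch in path:
        if ch == "`":
            # Toggle escape mode; the backtick is not part of the name.
            in_backticks = not in_backticks
        elif ch == "." and not in_backticks:
            # An unescaped dot ends the current segment.
            parts.append("".join(current))
            current = []
        else:
            current.append(ch)
    if in_backticks:
        raise ValueError(f"unterminated backtick in field path: {path!r}")
    parts.append("".join(current))
    return parts
```

With this rule, ``literal.`a.b` `` resolves to the two segments `literal` and `a.b`, whereas an unescaped `literal.a.b` would resolve to three.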
--- a/python/python/tests/test_lsm_write_spec.py
+++ b/python/python/tests/test_lsm_write_spec.py
@@ -40,16 +40,6 @@ def _make_table(tmp_path):
 def test_set_lsm_write_spec_validates(tmp_path):
    _db, table = _make_table(tmp_path)

-    # No PK set yet.
-    with pytest.raises(Exception, match="primary key"):
-        table.set_lsm_write_spec(LsmWriteSpec.bucket("id", 4))
-
-    table.set_unenforced_primary_key("id")
-
-    # Column mismatch.
-    with pytest.raises(Exception, match="match"):
-        table.set_lsm_write_spec(LsmWriteSpec.bucket("v", 4))
-
    # Out-of-range num_buckets.
    with pytest.raises(Exception, match="num_buckets"):
        table.set_lsm_write_spec(LsmWriteSpec.bucket("id", 0))
@@ -70,7 +60,6 @@ def test_unset_lsm_write_spec(tmp_path):
        table.unset_lsm_write_spec()

    # Install a spec, then remove it; afterwards a fresh spec can be set.
-    table.set_unenforced_primary_key("id")
    table.set_lsm_write_spec(LsmWriteSpec.bucket("id", 4))
    table.unset_lsm_write_spec()
    # A second unset errors — there is no spec left to remove.
--- a/python/python/tests/test_merge_insert_lsm.py
+++ b/python/python/tests/test_merge_insert_lsm.py
@@ -0,0 +1,196 @@
+# SPDX-License-Identifier: Apache-2.0
+# SPDX-FileCopyrightText: Copyright The LanceDB Authors
+
+"""Tests for the MemWAL LSM ``merge_insert`` dispatch."""
+
+from datetime import timedelta
+
+import lancedb
+import pyarrow as pa
+import pytest
+from lancedb._lancedb import LsmWriteSpec
+
+SCHEMA = pa.schema(
+    [
+        pa.field("id", pa.int64(), nullable=False),
+        pa.field("value", pa.int64(), nullable=False),
+    ]
+)
+
+REGION_SCHEMA = pa.schema(
+    [
+        pa.field("id", pa.int64(), nullable=False),
+        pa.field("region", pa.utf8(), nullable=False),
+    ]
+)
+
+
+def _reader(ids):
+    batch = pa.RecordBatch.from_arrays(
+        [
+            pa.array(ids, type=pa.int64()),
+            pa.array(list(range(len(ids))), type=pa.int64()),
+        ],
+        schema=SCHEMA,
+    )
+    return pa.RecordBatchReader.from_batches(SCHEMA, [batch])
+
+
+def _region_reader(rows):
+    batch = pa.RecordBatch.from_arrays(
+        [
+            pa.array([row[0] for row in rows], type=pa.int64()),
+            pa.array([row[1] for row in rows], type=pa.utf8()),
+        ],
+        schema=REGION_SCHEMA,
+    )
+    return pa.RecordBatchReader.from_batches(REGION_SCHEMA, [batch])
+
+
+def _bucket_table(tmp_path):
+    """A table with ``id`` as the primary key and a single-bucket LSM spec."""
+    db = lancedb.connect(tmp_path, read_consistency_interval=timedelta(seconds=0))
+    table = db.create_table("t", _reader([1, 2, 3]))
+    table.set_unenforced_primary_key("id")
+    # num_buckets = 1: every row routes to the single bucket.
+    table.set_lsm_write_spec(LsmWriteSpec.bucket("id", 1))
+    return table
+
+
+def test_lsm_merge_insert_bucket(tmp_path):
+    table = _bucket_table(tmp_path)
+    # Empty `on` defaults to the primary key.
+    result = (
+        table.merge_insert([])
+        .when_matched_update_all()
+        .when_not_matched_insert_all()
+        .execute(_reader([3, 4, 5]))
+    )
+    # LSM path: rows go to the MemWAL, so only num_rows is populated.
+    assert result.num_rows == 3
+    assert result.version == 0
+    assert result.num_inserted_rows == 0
+    assert result.num_updated_rows == 0
+
+
+def test_lsm_merge_insert_unsharded(tmp_path):
+    db = lancedb.connect(tmp_path, read_consistency_interval=timedelta(seconds=0))
+    table = db.create_table("t", _reader([1, 2, 3]))
+    table.set_unenforced_primary_key("id")
+    table.set_lsm_write_spec(LsmWriteSpec.unsharded())
+    result = (
+        table.merge_insert("id")
+        .when_matched_update_all()
+        .when_not_matched_insert_all()
+        .execute(_reader([10, 11, 12, 13]))
+    )
+    assert result.num_rows == 4
+
+
+def test_lsm_merge_insert_identity(tmp_path):
+    db = lancedb.connect(tmp_path, read_consistency_interval=timedelta(seconds=0))
+    table = db.create_table("t", _region_reader([(1, "us"), (2, "us")]))
+    table.set_unenforced_primary_key("id")
+    table.set_lsm_write_spec(LsmWriteSpec.identity("region"))
+    # All rows share one identity value, so they route to one shard.
+    result = (
+        table.merge_insert([])
+        .when_matched_update_all()
+        .when_not_matched_insert_all()
+        .execute(_region_reader([(3, "us"), (4, "us")]))
+    )
+    assert result.num_rows == 2
+
+
+def test_lsm_merge_insert_use_lsm_write_false(tmp_path):
+    table = _bucket_table(tmp_path)  # rows id = 1, 2, 3
+    # use_lsm_write(False) opts out: the standard path runs and commits.
+    result = (
+        table.merge_insert("id")
+        .when_not_matched_insert_all()
+        .use_lsm_write(False)
+        .execute(_reader([3, 4, 5]))
+    )
+    assert result.num_inserted_rows == 2
+    assert table.count_rows() == 5
+
+
+def test_lsm_merge_insert_validate_single_shard_off(tmp_path):
+    table = _bucket_table(tmp_path)
+    result = (
+        table.merge_insert([])
+        .when_matched_update_all()
+        .when_not_matched_insert_all()
+        .validate_single_shard(False)
+        .execute(_reader([6, 7, 8]))
+    )
+    assert result.num_rows == 3
+
+
+def test_lsm_merge_insert_use_lsm_write_true_requires_spec(tmp_path):
+    # A table with a primary key but no LSM write spec installed.
+    db = lancedb.connect(tmp_path, read_consistency_interval=timedelta(seconds=0))
+    table = db.create_table("t", _reader([1, 2, 3]))
+    table.set_unenforced_primary_key("id")
+    with pytest.raises(Exception, match="use_lsm_write"):
+        (
+            table.merge_insert("id")
+            .when_matched_update_all()
+            .when_not_matched_insert_all()
+            .use_lsm_write(True)
+            .execute(_reader([4]))
+        )
+
+
+def test_lsm_merge_insert_rejects_on_not_primary_key(tmp_path):
+    table = _bucket_table(tmp_path)
+    with pytest.raises(Exception, match="primary key"):
+        (
+            table.merge_insert("value")
+            .when_matched_update_all()
+            .when_not_matched_insert_all()
+            .execute(_reader([1]))
+        )
+
+
+def test_lsm_merge_insert_rejects_non_upsert(tmp_path):
+    table = _bucket_table(tmp_path)
+    # Insert-only (no when_matched_update_all) is not the upsert shape.
+    with pytest.raises(Exception, match="upsert"):
+        table.merge_insert([]).when_not_matched_insert_all().execute(_reader([4]))
+
+
+def test_lsm_close_writers(tmp_path):
+    table = _bucket_table(tmp_path)
+    (
+        table.merge_insert([])
+        .when_matched_update_all()
+        .when_not_matched_insert_all()
+        .execute(_reader([7, 8]))
+    )
+    table.close_lsm_writers()
+    # The writer reopens lazily on the next merge_insert.
+    result = (
+        table.merge_insert([])
+        .when_matched_update_all()
+        .when_not_matched_insert_all()
+        .execute(_reader([9]))
+    )
+    assert result.num_rows == 1
+
+
+@pytest.mark.asyncio
+async def test_async_lsm_merge_insert(tmp_path):
+    db = await lancedb.connect_async(
+        tmp_path, read_consistency_interval=timedelta(seconds=0)
+    )
+    table = await db.create_table("t", _reader([1, 2, 3]))
+    await table.set_unenforced_primary_key("id")
+    await table.set_lsm_write_spec(LsmWriteSpec.bucket("id", 1))
+
+    builder = (
+        table.merge_insert([]).when_matched_update_all().when_not_matched_insert_all()
+    )
+    result = await builder.execute(_reader([3, 4, 5]))
+    assert result.num_rows == 3
+    await table.close_lsm_writers()
--- a/python/python/tests/test_namespace.py
+++ b/python/python/tests/test_namespace.py
@@ -5,10 +5,67 @@

 import tempfile
 import shutil
+import sys
 import pytest
 import pyarrow as pa
 import lancedb
 from lance_namespace.errors import NamespaceNotEmptyError, TableNotFoundError
+from lancedb.table import AsyncTable, LanceTable
+
+
+PUSHDOWN_DATA = pa.table(
+    {"id": list(range(12)), "text": [f"row-{idx}" for idx in range(12)]}
+)
+
+
+def _ipc_file(table: pa.Table = PUSHDOWN_DATA) -> bytes:
+    sink = pa.BufferOutputStream()
+    with pa.ipc.new_file(sink, table.schema) as writer:
+        writer.write_table(table)
+    return sink.getvalue().to_pybytes()
+
+
+class _FailingSyncInner:
+    name = "hist"
+
+    def current_branch(self):
+        # The pushdown gate only routes server-side when on the default branch.
+        return None
+
+    async def schema(self):
+        return PUSHDOWN_DATA.schema
+
+    async def to_arrow(self):
+        raise RuntimeError("direct table to_arrow should not be used")
+
+
+class _FailingAsyncInner:
+    def name(self):
+        return "hist"
+
+    async def schema(self):
+        return PUSHDOWN_DATA.schema
+
+    def query(self):
+        raise AssertionError("direct async query should not be used")
+
+
+class _NamespaceClient:
+    def __init__(self):
+        self.requests = []
+
+    def query_table(self, request):
+        self.requests.append(request)
+        return _ipc_file()
+
+
+def _namespace_lance_table(namespace_client: _NamespaceClient) -> LanceTable:
+    table = LanceTable.__new__(LanceTable)
+    table._table = _FailingSyncInner()
+    table._namespace_path = ["geneva"]
+    table._namespace_client = namespace_client
+    table._pushdown_operations = {"QueryTable"}
+    return table


 class TestNamespaceConnection:
@@ -76,6 +133,35 @@ class TestNamespaceConnection:
        assert len(result) == 0
        assert list(result.columns) == ["id", "vector", "text"]

+    def test_table_to_pandas_blob_lazy_through_namespace(self):
+        """Namespace-backed tables should use Lance blob-aware pandas conversion."""
+        pytest.importorskip("lance")
+        db = lancedb.connect_namespace("dir", {"root": self.temp_dir})
+        db.create_namespace(["test_ns"])
+        data = pa.table(
+            {
+                "id": pa.array([1, 2], pa.int64()),
+                "blob": pa.array([b"hello", b"world"], pa.large_binary()),
+            },
+            schema=pa.schema(
+                [
+                    pa.field("id", pa.int64()),
+                    pa.field(
+                        "blob",
+                        pa.large_binary(),
+                        metadata={"lance-encoding:blob": "true"},
+                    ),
+                ]
+            ),
+        )
+
+        table = db.create_table("blob_table", data, namespace_path=["test_ns"])
+        df = table.to_pandas(blob_mode="lazy").sort_values("id")
+
+        blob = df["blob"].iloc[0]
+        assert hasattr(blob, "readall")
+        assert blob.readall() == b"hello"
+
    def test_open_table_through_namespace(self):
        """Test opening an existing table through namespace."""
        db = lancedb.connect_namespace("dir", {"root": self.temp_dir})
@@ -171,8 +257,15 @@ class TestNamespaceConnection:
        assert table_schema.field("id").type == pa.int64()
        assert table_schema.field("text").type == pa.string()

-    def test_rename_table_not_supported(self):
-        """Test that rename_table raises NotImplementedError."""
+    def test_rename_table(self):
+        """Test that rename_table renames a table in the namespace.
+
+        The `dir` namespace implementation in lance-namespace-impls does not
+        implement `rename_table` yet (only the `rest` backend does), so it
+        currently falls back to the default trait method which raises
+        NotSupported. This is expected to start passing once the `dir`
+        backend gains rename_table support upstream.
+        """
        db = lancedb.connect_namespace("dir", {"root": self.temp_dir})

        # Create a child namespace first
@@ -187,9 +280,14 @@ class TestNamespaceConnection:
        )
        db.create_table("old_name", schema=schema, namespace_path=["test_ns"])

-        # Rename should raise NotImplementedError
-        with pytest.raises(NotImplementedError, match="rename_table is not supported"):
-            db.rename_table("old_name", "new_name")
+        # Rename within the same namespace; the `dir` backend raises for now
+        with pytest.raises(NotImplementedError, match="rename_table not implemented"):
+            db.rename_table(
+                "old_name",
+                "new_name",
+                cur_namespace_path=["test_ns"],
+                new_namespace_path=["test_ns"],
+            )

    def test_drop_all_tables(self):
        """Test dropping all tables through namespace."""
@@ -707,6 +805,22 @@ class TestPushdownOperations:
        db = lancedb.connect_namespace("dir", {"root": self.temp_dir})
        assert len(db._namespace_client_pushdown_operations) == 0

+    def test_lance_table_to_arrow_uses_query_pushdown(self):
+        namespace_client = _NamespaceClient()
+        table = _namespace_lance_table(namespace_client)
+
+        assert table.to_arrow().equals(PUSHDOWN_DATA)
+        assert table.to_pandas()["id"].tolist() == list(range(12))
+        assert len(namespace_client.requests) == 2
+        assert [request.id for request in namespace_client.requests] == [
+            ["geneva", "hist"],
+            ["geneva", "hist"],
+        ]
+        assert [request.k for request in namespace_client.requests] == [
+            sys.maxsize,
+            sys.maxsize,
+        ]
+

@pytest.mark.asyncio
 class TestAsyncPushdownOperations:
@@ -742,3 +856,39 @@ class TestAsyncPushdownOperations:
        """Test that pushdown operations default to empty on async connection."""
        db = lancedb.connect_namespace_async("dir", {"root": self.temp_dir})
        assert len(db._namespace_client_pushdown_operations) == 0
+
+    async def test_async_table_to_arrow_uses_query_pushdown(self):
+        namespace_client = _NamespaceClient()
+
+        table = AsyncTable(
+            _FailingAsyncInner(),
+            namespace_path=["geneva"],
+            namespace_client=namespace_client,
+            pushdown_operations={"QueryTable"},
+        )
+
+        assert (await table.to_arrow()).equals(PUSHDOWN_DATA)
+        assert (await table.to_pandas())["id"].tolist() == list(range(12))
+        assert len(namespace_client.requests) == 2
+        assert [request.id for request in namespace_client.requests] == [
+            ["geneva", "hist"],
+            ["geneva", "hist"],
+        ]
+        assert [request.k for request in namespace_client.requests] == [
+            sys.maxsize,
+            sys.maxsize,
+        ]
+
+
+def test_local_table_to_arrow_and_to_pandas_are_unchanged(tmp_path):
+    db = lancedb.connect(str(tmp_path / "db"))
+    table = db.create_table(
+        "local",
+        data=[
+            {"id": 1, "vector": [1.0, 2.0]},
+            {"id": 2, "vector": [3.0, 4.0]},
+        ],
+    )
+
+    assert table.to_arrow().column("id").to_pylist() == [1, 2]
+    assert table.to_pandas()["id"].tolist() == [1, 2]
--- a/python/python/tests/test_nested_fields.py
+++ b/python/python/tests/test_nested_fields.py
@@ -0,0 +1,686 @@
+# SPDX-License-Identifier: Apache-2.0
+# SPDX-FileCopyrightText: Copyright The LanceDB Authors
+
+"""Regression matrix for nested field support across LanceDB Python APIs.
+
+Covers the lifecycle described in lancedb/lancedb#3406:
+  - Nested scalar, vector, and FTS index creation with full dotted paths
+  - list_indices / index_stats return canonical full paths (not leaf names)
+  - search, filter, append, optimize behaviour
+  - Field-name edge cases: mixed case, literal-dot field names, same-name leaves
+  - Both sync and async Python table APIs
+
+The matrix uses the following field-name variants from the acceptance criteria:
+  - rowId              (camelCase top-level)
+  - `row-id`           (hyphenated top-level, escaped)
+  - parent.`leaf.name` (struct leaf whose name contains a literal dot)
+  - MetaData.userId    (mixed-case nested path)
+  - `meta-data`.`user-id`  (hyphenated struct with hyphenated leaf)
+
+Note: Lance forbids top-level field names that contain a '.', so the literal-dot
+edge case is exercised via a struct leaf field (parent.`leaf.name`) instead.
+"""
+
+from datetime import timedelta
+
+import pyarrow as pa
+import pytest
+import pytest_asyncio
+
+import lancedb
+from lancedb.db import AsyncConnection, DBConnection
+from lancedb.index import BTree, FTS, IvfPq
+
+
+# ---------------------------------------------------------------------------
+# Constants
+# ---------------------------------------------------------------------------
+
+DIM = 8
+# IvfPq requires at least num_partitions * 256 rows by default; to keep the
+# row count small, num_sub_vectors and num_partitions are set very low.
+NROWS = 256
+
+
+def _vec(row: int) -> list:
+    return [float((row * DIM + i) % 256) for i in range(DIM)]
+
+
+# ---------------------------------------------------------------------------
+# Fixtures
+# ---------------------------------------------------------------------------
+
+
+@pytest.fixture
+def sync_db(tmp_path) -> DBConnection:
+    return lancedb.connect(tmp_path)
+
+
+@pytest_asyncio.fixture
+async def async_db(tmp_path) -> AsyncConnection:
+    return await lancedb.connect_async(
+        tmp_path, read_consistency_interval=timedelta(seconds=0)
+    )
+
+
+# ---------------------------------------------------------------------------
+# Schema / data builders
+# ---------------------------------------------------------------------------
+
+
+def _nested_scalar_schema() -> pa.Schema:
+    """Schema with nested scalar fields covering the acceptance-criteria names.
+
+    Top-level columns:
+      - rowId       int32  (camelCase top-level)
+      - row-id      int32  (hyphenated top-level name)
+      - MetaData    struct{userId int32}   (mixed-case nested path)
+      - meta-data   struct{user-id int32}  (hyphenated struct + hyphenated leaf)
+
+    Lance disallows top-level field names that contain '.' (e.g. a field
+    literally named 'a.b'), so that edge case is tested separately using
+    _literal_dot_schema() below.
+    """
+    return pa.schema(
+        [
+            pa.field("rowId", pa.int32()),
+            pa.field("row-id", pa.int32()),
+            pa.field(
+                "MetaData",
+                pa.struct([pa.field("userId", pa.int32())]),
+            ),
+            pa.field(
+                "meta-data",
+                pa.struct([pa.field("user-id", pa.int32())]),
+            ),
+        ]
+    )
+
+
+def _nested_scalar_data(nrows: int = NROWS) -> pa.Table:
+    schema = _nested_scalar_schema()
+    return pa.table(
+        {
+            "rowId": pa.array(list(range(nrows)), pa.int32()),
+            "row-id": pa.array(list(range(nrows)), pa.int32()),
+            "MetaData": pa.array(
+                [{"userId": i} for i in range(nrows)],
+                type=pa.struct([pa.field("userId", pa.int32())]),
+            ),
+            "meta-data": pa.array(
+                [{"user-id": i} for i in range(nrows)],
+                type=pa.struct([pa.field("user-id", pa.int32())]),
+            ),
+        },
+        schema=schema,
+    )
+
+
+def _literal_dot_schema() -> pa.Schema:
+    """Schema where a struct *leaf* field is named with a literal dot.
+
+    The path used in the index API is ``parent.`leaf.name` ``.
+    """
+    return pa.schema(
+        [
+            pa.field("id", pa.int32()),
+            pa.field(
+                "parent",
+                pa.struct([pa.field("leaf.name", pa.int32())]),
+            ),
+        ]
+    )
+
+
+def _literal_dot_data(nrows: int = NROWS) -> pa.Table:
+    parent_type = pa.struct([pa.field("leaf.name", pa.int32())])
+    return pa.table(
+        {
+            "id": pa.array(list(range(nrows)), pa.int32()),
+            "parent": pa.array(
+                [{"leaf.name": i} for i in range(nrows)],
+                type=parent_type,
+            ),
+        },
+        schema=_literal_dot_schema(),
+    )
+
+
+def _same_leaf_schema() -> pa.Schema:
+    return pa.schema(
+        [
+            pa.field("StructA", pa.struct([pa.field("userId", pa.int32())])),
+            pa.field("StructB", pa.struct([pa.field("userId", pa.int32())])),
+        ]
+    )
+
+
+def _same_leaf_data(nrows: int = NROWS) -> pa.Table:
+    t = pa.struct([pa.field("userId", pa.int32())])
+    return pa.table(
+        {
+            "StructA": pa.array([{"userId": i} for i in range(nrows)], type=t),
+            "StructB": pa.array([{"userId": i * 10} for i in range(nrows)], type=t),
+        },
+        schema=_same_leaf_schema(),
+    )
+
+
+def _nested_vector_schema() -> pa.Schema:
+    return pa.schema(
+        [
+            pa.field("id", pa.int32()),
+            pa.field(
+                "image",
+                pa.struct([pa.field("embedding", pa.list_(pa.float32(), DIM))]),
+            ),
+            pa.field(
+                "MetaData",
+                pa.struct([pa.field("userId", pa.int32())]),
+            ),
+        ]
+    )
+
+
+def _nested_vector_data(nrows: int = NROWS) -> pa.Table:
+    embedding_type = pa.list_(pa.float32(), DIM)
+    image_type = pa.struct([pa.field("embedding", embedding_type)])
+    meta_type = pa.struct([pa.field("userId", pa.int32())])
+    return pa.table(
+        {
+            "id": pa.array(list(range(nrows)), pa.int32()),
+            "image": pa.array(
+                [{"embedding": _vec(i)} for i in range(nrows)],
+                type=image_type,
+            ),
+            "MetaData": pa.array(
+                [{"userId": i} for i in range(nrows)],
+                type=meta_type,
+            ),
+        },
+        schema=_nested_vector_schema(),
+    )
+
+
+def _nested_fts_schema() -> pa.Schema:
+    return pa.schema(
+        [
+            pa.field("id", pa.int32()),
+            pa.field(
+                "payload",
+                pa.struct([pa.field("text", pa.utf8())]),
+            ),
+            pa.field(
+                "MetaData",
+                pa.struct([pa.field("userId", pa.int32())]),
+            ),
+        ]
+    )
+
+
+def _nested_fts_data(nrows: int = NROWS) -> pa.Table:
+    words = ["alpha", "bravo", "charlie", "delta", "echo"]
+    payload_type = pa.struct([pa.field("text", pa.utf8())])
+    meta_type = pa.struct([pa.field("userId", pa.int32())])
+    return pa.table(
+        {
+            "id": pa.array(list(range(nrows)), pa.int32()),
+            "payload": pa.array(
+                [{"text": words[i % len(words)]} for i in range(nrows)],
+                type=payload_type,
+            ),
+            "MetaData": pa.array(
+                [{"userId": i} for i in range(nrows)],
+                type=meta_type,
+            ),
+        },
+        schema=_nested_fts_schema(),
+    )
+
+
+# ---------------------------------------------------------------------------
+# Helpers
+# ---------------------------------------------------------------------------
+
+
+def _columns_by_name_sync(tbl) -> dict:
+    return {idx.name: idx.columns for idx in tbl.list_indices()}
+
+
+async def _columns_by_name_async(tbl) -> dict:
+    return {idx.name: idx.columns for idx in await tbl.list_indices()}
+
+
+# ===========================================================================
+# SYNC TESTS
+# ===========================================================================
+#
+# The sync LanceTable API uses:
+#   - create_scalar_index(column, ...)  for scalar (BTree/Bitmap/LabelList) indices
+#   - create_fts_index(column, ...)     for full-text-search indices
+#   - create_index(...)                 for vector indices (older positional API)
+# ===========================================================================
+
+
+class TestNestedScalarIndexSync:
+    """Sync regression matrix for nested scalar (BTree) indices."""
+
+    def test_top_level_camelcase_field(self, sync_db):
+        """list_indices must return the full camelCase field name."""
+        tbl = sync_db.create_table("t", _nested_scalar_data())
+        tbl.create_scalar_index("rowId", index_type="BTREE", name="rowid_idx")
+        col_map = _columns_by_name_sync(tbl)
+        assert col_map["rowid_idx"] == ["rowId"], (
+            "list_indices must return 'rowId', not a truncated leaf name"
+        )
+
+    def test_top_level_hyphenated_field_escaped(self, sync_db):
+        """Top-level field 'row-id' (hyphenated) accessed via escaped path."""
+        tbl = sync_db.create_table("t", _nested_scalar_data())
+        tbl.create_scalar_index("`row-id`", index_type="BTREE", name="rowid_hyph_idx")
+        col_map = _columns_by_name_sync(tbl)
+        assert col_map["rowid_hyph_idx"] == ["`row-id`"], (
+            "list_indices must return escaped path '`row-id`'"
+        )
+
+    def test_struct_leaf_literal_dot_field_escaped(self, sync_db):
+        """Struct leaf with a literal-dot name: parent.`leaf.name`.
+
+        The index listing must use the full escaped path, not just the leaf.
+        """
+        tbl = sync_db.create_table("t", _literal_dot_data())
+        tbl.create_scalar_index(
+            "parent.`leaf.name`", index_type="BTREE", name="leaf_dot_idx"
+        )
+        col_map = _columns_by_name_sync(tbl)
+        assert col_map["leaf_dot_idx"] == ["parent.`leaf.name`"], (
+            "list_indices must return 'parent.`leaf.name`', not just '`leaf.name`'"
+        )
+
+    def test_nested_mixed_case_path(self, sync_db):
+        """Nested path MetaData.userId (mixed case) must appear as full path."""
+        tbl = sync_db.create_table("t", _nested_scalar_data())
+        tbl.create_scalar_index(
+            "MetaData.userId", index_type="BTREE", name="metadata_userid_idx"
+        )
+        col_map = _columns_by_name_sync(tbl)
+        assert col_map["metadata_userid_idx"] == ["MetaData.userId"], (
+            "list_indices must return 'MetaData.userId', not leaf 'userId'"
+        )
+
+    def test_nested_hyphenated_path_escaped(self, sync_db):
+        """`meta-data`.`user-id` path with both parts escaped."""
+        tbl = sync_db.create_table("t", _nested_scalar_data())
+        tbl.create_scalar_index(
+            "`meta-data`.`user-id`", index_type="BTREE", name="metauid_idx"
+        )
+        col_map = _columns_by_name_sync(tbl)
+        assert col_map["metauid_idx"] == ["`meta-data`.`user-id`"], (
+            "list_indices must return '`meta-data`.`user-id`', not 'user-id'"
+        )
+
+    def test_filter_on_nested_mixed_case(self, sync_db):
+        """WHERE filter on a nested dotted path works after index creation."""
+        tbl = sync_db.create_table("t", _nested_scalar_data())
+        tbl.create_scalar_index(
+            "MetaData.userId", index_type="BTREE", name="metadata_userid_idx"
+        )
+        rows = tbl.search().where("MetaData.userId = 5").to_list()
+        assert len(rows) == 1
+        assert rows[0]["MetaData"]["userId"] == 5
+
+    def test_append_and_list_indices_stable(self, sync_db):
+        """After appending rows the index listing must remain unchanged."""
+        tbl = sync_db.create_table("t", _nested_scalar_data())
+        tbl.create_scalar_index(
+            "MetaData.userId", index_type="BTREE", name="meta_uid_idx"
+        )
+        tbl.add(_nested_scalar_data(nrows=4))
+        col_map = _columns_by_name_sync(tbl)
+        assert col_map["meta_uid_idx"] == ["MetaData.userId"]
+
+    def test_optimize_and_list_indices_stable(self, tmp_path):
+        """After optimize the index listing must still show full paths."""
+        db = lancedb.connect(tmp_path / "opt_db")
+        tbl = db.create_table("t", _nested_scalar_data())
+        tbl.create_scalar_index(
+            "MetaData.userId", index_type="BTREE", name="meta_uid_idx"
+        )
+        tbl.add(_nested_scalar_data(nrows=4))
+        tbl.optimize()
+        col_map = _columns_by_name_sync(tbl)
+        assert col_map["meta_uid_idx"] == ["MetaData.userId"]
+
+    def test_same_name_leaves_are_distinct(self, sync_db):
+        """Two structs sharing a leaf name must produce distinct index paths."""
+        tbl = sync_db.create_table("same_leaf", _same_leaf_data())
+        tbl.create_scalar_index(
+            "StructA.userId", index_type="BTREE", name="a_userid_idx"
+        )
+        tbl.create_scalar_index(
+            "StructB.userId", index_type="BTREE", name="b_userid_idx"
+        )
+        col_map = _columns_by_name_sync(tbl)
+        assert col_map["a_userid_idx"] == ["StructA.userId"]
+        assert col_map["b_userid_idx"] == ["StructB.userId"]
+
+    def test_index_stats_canonical_path(self, sync_db):
+        """index_stats round-trip: create on nested field, verify row count."""
+        tbl = sync_db.create_table("t", _nested_scalar_data())
+        tbl.create_scalar_index(
+            "MetaData.userId", index_type="BTREE", name="meta_uid_idx"
+        )
+        stats = tbl.index_stats("meta_uid_idx")
+        assert stats is not None
+        assert stats.index_type == "BTREE"
+        assert stats.num_indexed_rows == NROWS
+
+
+class TestNestedVectorIndexSync:
+    """Sync regression matrix for nested vector (IvfPq) indices."""
+
+    def test_nested_vector_index_full_path(self, sync_db):
+        """Listing after vector index creation must use the full dotted path."""
+        tbl = sync_db.create_table("vt", _nested_vector_data())
+        tbl.create_index(
+            num_partitions=2,
+            num_sub_vectors=2,
+            vector_column_name="image.embedding",
+            name="image_emb_idx",
+        )
+        col_map = _columns_by_name_sync(tbl)
+        assert col_map["image_emb_idx"] == ["image.embedding"], (
+            "list_indices must return 'image.embedding', not leaf 'embedding'"
+        )
+
+    def test_nested_vector_search(self, sync_db):
+        """Vector search on nested embedding field must return results."""
+        tbl = sync_db.create_table("vt", _nested_vector_data())
+        tbl.create_index(
+            num_partitions=2,
+            num_sub_vectors=2,
+            vector_column_name="image.embedding",
+            name="image_emb_idx",
+        )
+        results = (
+            tbl.search(_vec(0), vector_column_name="image.embedding").limit(5).to_list()
+        )
+        assert len(results) > 0
+
+    def test_nested_vector_index_stats(self, sync_db):
+        """index_stats for a nested vector index must reflect correct row count."""
+        tbl = sync_db.create_table("vt", _nested_vector_data())
+        tbl.create_index(
+            num_partitions=2,
+            num_sub_vectors=2,
+            vector_column_name="image.embedding",
+            name="image_emb_idx",
+        )
+        stats = tbl.index_stats("image_emb_idx")
+        assert stats is not None
+        assert stats.num_indexed_rows == NROWS
+
+    def test_nested_vector_append_optimize(self, tmp_path):
+        """After append and optimize the vector index listing must be stable."""
+        db = lancedb.connect(tmp_path / "vec_opt_db")
+        tbl = db.create_table("vt", _nested_vector_data())
+        tbl.create_index(
+            num_partitions=2,
+            num_sub_vectors=2,
+            vector_column_name="image.embedding",
+            name="image_emb_idx",
+        )
+        tbl.add(_nested_vector_data(nrows=4))
+        tbl.optimize()
+        col_map = _columns_by_name_sync(tbl)
+        assert col_map["image_emb_idx"] == ["image.embedding"]
+
+
+class TestNestedFTSIndexSync:
+    """Sync regression matrix for nested FTS indices."""
+
+    def test_nested_fts_index_full_path(self, sync_db):
+        """FTS index on payload.text must be listed with the full path."""
+        tbl = sync_db.create_table("ft", _nested_fts_data())
+        tbl.create_fts_index("payload.text", name="payload_text_idx")
+        col_map = _columns_by_name_sync(tbl)
+        assert col_map["payload_text_idx"] == ["payload.text"], (
+            "list_indices must return 'payload.text', not leaf 'text'"
+        )
+
+    def test_nested_fts_search(self, sync_db):
+        """FTS search on a nested text field must return correct results."""
+        tbl = sync_db.create_table("ft", _nested_fts_data())
+        tbl.create_fts_index("payload.text", name="payload_text_idx")
+        results = (
+            tbl.search("alpha", query_type="fts", fts_columns="payload.text")
+            .limit(10)
+            .to_list()
+        )
+        assert len(results) > 0
+        assert all(row["payload"]["text"] == "alpha" for row in results)
+
+    def test_nested_fts_append_optimize(self, tmp_path):
+        """After append and optimize the FTS index listing must be stable."""
+        db = lancedb.connect(tmp_path / "fts_opt_db")
+        tbl = db.create_table("ft", _nested_fts_data())
+        tbl.create_fts_index("payload.text", name="payload_text_idx")
+        tbl.add(_nested_fts_data(nrows=4))
+        tbl.optimize()
+        col_map = _columns_by_name_sync(tbl)
+        assert col_map["payload_text_idx"] == ["payload.text"]
+
+
+# ===========================================================================
+# ASYNC TESTS
+# ===========================================================================
+#
+# The async AsyncTable API uses create_index(column, config=...) uniformly
+# for scalar, vector, and FTS indices.
+# ===========================================================================
+
+
+class TestNestedScalarIndexAsync:
+    """Async regression matrix for nested scalar (BTree) indices."""
+
+    @pytest.mark.asyncio
+    async def test_top_level_camelcase_field(self, async_db):
+        """list_indices must return the full camelCase field name."""
+        tbl = await async_db.create_table("t", _nested_scalar_data())
+        await tbl.create_index("rowId", config=BTree(), name="rowid_idx")
+        col_map = await _columns_by_name_async(tbl)
+        assert col_map["rowid_idx"] == ["rowId"]
+
+    @pytest.mark.asyncio
+    async def test_top_level_hyphenated_field_escaped(self, async_db):
+        """Hyphenated top-level field accessed via escaped path."""
+        tbl = await async_db.create_table("t", _nested_scalar_data())
+        await tbl.create_index("`row-id`", config=BTree(), name="rowid_hyph_idx")
+        col_map = await _columns_by_name_async(tbl)
+        assert col_map["rowid_hyph_idx"] == ["`row-id`"]
+
+    @pytest.mark.asyncio
+    async def test_struct_leaf_literal_dot_field_escaped(self, async_db):
+        """Struct leaf with a literal-dot name: parent.`leaf.name`."""
+        tbl = await async_db.create_table("t", _literal_dot_data())
+        await tbl.create_index(
+            "parent.`leaf.name`", config=BTree(), name="leaf_dot_idx"
+        )
+        col_map = await _columns_by_name_async(tbl)
+        assert col_map["leaf_dot_idx"] == ["parent.`leaf.name`"]
+
+    @pytest.mark.asyncio
+    async def test_nested_mixed_case_path(self, async_db):
+        """Mixed-case nested path MetaData.userId must appear as full path."""
+        tbl = await async_db.create_table("t", _nested_scalar_data())
+        await tbl.create_index(
+            "MetaData.userId", config=BTree(), name="metadata_userid_idx"
+        )
+        col_map = await _columns_by_name_async(tbl)
+        assert col_map["metadata_userid_idx"] == ["MetaData.userId"]
+
+    @pytest.mark.asyncio
+    async def test_nested_hyphenated_path_escaped(self, async_db):
+        """`meta-data`.`user-id` path with both parts escaped."""
+        tbl = await async_db.create_table("t", _nested_scalar_data())
+        await tbl.create_index(
+            "`meta-data`.`user-id`", config=BTree(), name="metauid_idx"
+        )
+        col_map = await _columns_by_name_async(tbl)
+        assert col_map["metauid_idx"] == ["`meta-data`.`user-id`"]
+
+    @pytest.mark.asyncio
+    async def test_filter_on_nested_mixed_case(self, async_db):
+        """WHERE filter on a nested dotted path works after index creation."""
+        tbl = await async_db.create_table("t", _nested_scalar_data())
+        await tbl.create_index(
+            "MetaData.userId", config=BTree(), name="metadata_userid_idx"
+        )
+        rows = await tbl.query().where("MetaData.userId = 5").to_list()
+        assert len(rows) == 1
+        assert rows[0]["MetaData"]["userId"] == 5
+
+    @pytest.mark.asyncio
+    async def test_index_stats_canonical_path(self, async_db):
+        """index_stats round-trip: create on nested field, verify stats."""
+        tbl = await async_db.create_table("t", _nested_scalar_data())
+        await tbl.create_index("MetaData.userId", config=BTree(), name="meta_uid_idx")
+        stats = await tbl.index_stats("meta_uid_idx")
+        assert stats is not None
+        assert stats.index_type == "BTREE"
+        assert stats.num_indexed_rows == NROWS
+
+    @pytest.mark.asyncio
+    async def test_append_and_list_indices_stable(self, async_db):
+        """After appending rows the index listing must remain unchanged."""
+        tbl = await async_db.create_table("t", _nested_scalar_data())
+        await tbl.create_index("MetaData.userId", config=BTree(), name="meta_uid_idx")
+        await tbl.add(_nested_scalar_data(nrows=4))
+        col_map = await _columns_by_name_async(tbl)
+        assert col_map["meta_uid_idx"] == ["MetaData.userId"]
+
+    @pytest.mark.asyncio
+    async def test_optimize_and_list_indices_stable(self, tmp_path):
+        """After optimize the index listing must still show full paths."""
+        db = await lancedb.connect_async(
+            tmp_path / "opt_db", read_consistency_interval=timedelta(seconds=0)
+        )
+        tbl = await db.create_table("t", _nested_scalar_data())
+        await tbl.create_index("MetaData.userId", config=BTree(), name="meta_uid_idx")
+        await tbl.add(_nested_scalar_data(nrows=4))
+        await tbl.optimize()
+        col_map = await _columns_by_name_async(tbl)
+        assert col_map["meta_uid_idx"] == ["MetaData.userId"]
+
+    @pytest.mark.asyncio
+    async def test_same_name_leaves_are_distinct(self, async_db):
+        """Two structs sharing a leaf name must produce distinct index paths."""
+        tbl = await async_db.create_table("same_leaf", _same_leaf_data())
+        await tbl.create_index("StructA.userId", config=BTree(), name="a_userid_idx")
+        await tbl.create_index("StructB.userId", config=BTree(), name="b_userid_idx")
+        col_map = await _columns_by_name_async(tbl)
+        assert col_map["a_userid_idx"] == ["StructA.userId"]
+        assert col_map["b_userid_idx"] == ["StructB.userId"]
+
+
+class TestNestedVectorIndexAsync:
+    """Async regression matrix for nested vector (IvfPq) indices."""
+
+    @pytest.mark.asyncio
+    async def test_nested_vector_index_full_path(self, async_db):
+        """Listing after vector index creation must use the full dotted path."""
+        tbl = await async_db.create_table("vt", _nested_vector_data())
+        await tbl.create_index(
+            "image.embedding",
+            config=IvfPq(num_partitions=2, num_sub_vectors=2),
+            name="image_emb_idx",
+        )
+        col_map = await _columns_by_name_async(tbl)
+        assert col_map["image_emb_idx"] == ["image.embedding"]
+
+    @pytest.mark.asyncio
+    async def test_nested_vector_search(self, async_db):
+        """Vector search on nested embedding field must return results."""
+        tbl = await async_db.create_table("vt", _nested_vector_data())
+        await tbl.create_index(
+            "image.embedding",
+            config=IvfPq(num_partitions=2, num_sub_vectors=2),
+            name="image_emb_idx",
+        )
+        results = (
+            await tbl.query()
+            .nearest_to(_vec(0))
+            .column("image.embedding")
+            .limit(5)
+            .to_list()
+        )
+        assert len(results) > 0
+
+    @pytest.mark.asyncio
+    async def test_nested_vector_index_stats(self, async_db):
+        """index_stats for a nested vector index must reflect correct row count."""
+        tbl = await async_db.create_table("vt", _nested_vector_data())
+        await tbl.create_index(
+            "image.embedding",
+            config=IvfPq(num_partitions=2, num_sub_vectors=2),
+            name="image_emb_idx",
+        )
+        stats = await tbl.index_stats("image_emb_idx")
+        assert stats is not None
+        assert stats.num_indexed_rows == NROWS
+
+    @pytest.mark.asyncio
+    async def test_nested_vector_append_optimize(self, tmp_path):
+        """After append and optimize the vector index listing must be stable."""
+        db = await lancedb.connect_async(
+            tmp_path / "vec_opt_db", read_consistency_interval=timedelta(seconds=0)
+        )
+        tbl = await db.create_table("vt", _nested_vector_data())
+        await tbl.create_index(
+            "image.embedding",
+            config=IvfPq(num_partitions=2, num_sub_vectors=2),
+            name="image_emb_idx",
+        )
+        await tbl.add(_nested_vector_data(nrows=4))
+        await tbl.optimize()
+        col_map = await _columns_by_name_async(tbl)
+        assert col_map["image_emb_idx"] == ["image.embedding"]
+
+
+class TestNestedFTSIndexAsync:
+    """Async regression matrix for nested FTS indices."""
+
+    @pytest.mark.asyncio
+    async def test_nested_fts_index_full_path(self, async_db):
+        """FTS index on payload.text must be listed with the full path."""
+        tbl = await async_db.create_table("ft", _nested_fts_data())
+        await tbl.create_index("payload.text", config=FTS(), name="payload_text_idx")
+        col_map = await _columns_by_name_async(tbl)
+        assert col_map["payload_text_idx"] == ["payload.text"]
+
+    @pytest.mark.asyncio
+    async def test_nested_fts_search(self, async_db):
+        """FTS search on a nested text field must return correct results."""
+        tbl = await async_db.create_table("ft", _nested_fts_data())
+        await tbl.create_index("payload.text", config=FTS(), name="payload_text_idx")
+        results = (
+            await tbl.query()
+            .nearest_to_text("alpha", columns="payload.text")
+            .limit(10)
+            .to_list()
+        )
+        assert len(results) > 0
+        assert all(row["payload"]["text"] == "alpha" for row in results)
+
+    @pytest.mark.asyncio
+    async def test_nested_fts_append_optimize(self, tmp_path):
+        """After append and optimize the FTS index listing must be stable."""
+        db = await lancedb.connect_async(
+            tmp_path / "fts_opt_db", read_consistency_interval=timedelta(seconds=0)
+        )
+        tbl = await db.create_table("ft", _nested_fts_data())
+        await tbl.create_index("payload.text", config=FTS(), name="payload_text_idx")
+        await tbl.add(_nested_fts_data(nrows=4))
+        await tbl.optimize()
+        col_map = await _columns_by_name_async(tbl)
+        assert col_map["payload_text_idx"] == ["payload.text"]