chore: update lance dependency to v8.0.0-beta.11

feat: adds isin support to the 'Expr' builder (#3523)
The `Expr` builder already includes a lot of useful filtering options (`eq, ne, gt/gte, lt/lte, and_, or_, contains, cast`), but it was missing a membership test like `isin`. This PR adds that support, as minimally as possible, allowing easy filtering for membership in a list without needing a series of `where` expressions. I didn't see anything in CONTRIBUTING.md about needing a feature request or issue first, so I just made the change. My apologies if I missed that somewhere. Thanks for the vector store, we're using it now in paperless-ngx.
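To illustrate the semantics only: the sketch below uses plain Python rows rather than the LanceDB `Expr` API, whose exact `isin` signature is not shown in this log. The helper names `matches_any_eq` and `matches_isin` and the sample rows are hypothetical. It shows that a single membership test selects the same rows as a chain of equality checks OR-ed together.

```python
# Hypothetical illustration of membership filtering; this does not call LanceDB.
rows = [
    {"id": 1, "tag": "invoice"},
    {"id": 2, "tag": "receipt"},
    {"id": 3, "tag": "letter"},
]


def matches_any_eq(row, column, values):
    """Chain of equality checks, one per value, OR-ed together."""
    result = False
    for value in values:
        result = result or (row[column] == value)
    return result


def matches_isin(row, column, values):
    """Single membership check against the whole list of values."""
    return row[column] in set(values)


wanted = ["invoice", "letter"]
by_eq = [r["id"] for r in rows if matches_any_eq(r, "tag", wanted)]
by_isin = [r["id"] for r in rows if matches_isin(r, "tag", wanted)]
```

Both lists contain the same row ids; the membership form simply avoids writing one comparison per value.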
2026-06-11 00:00:43 +00:00 · 2026-06-10 22:52:01 +00:00 · 2026-06-10 15:28:19 -07:00 · 2026-06-10 15:27:16 -07:00 · 2026-06-10 12:28:20 -07:00 · 2026-06-10 10:10:11 -07:00
91 changed files with 8745 additions and 2617 deletions
--- a/.agents/skills/README.md
+++ b/.agents/skills/README.md
@@ -0,0 +1,7 @@
+# Agent Skills
+
+This directory contains repo-scoped code agent skills for the LanceDB project.
+
+Each skill is a folder that contains a required `SKILL.md` and optional bundled resources.
+
+Codex discovers skills from `.agents/skills` in the current working directory and parent directories.
--- a/.agents/skills/lancedb-update-lance-dependency/SKILL.md
+++ b/.agents/skills/lancedb-update-lance-dependency/SKILL.md
@@ -0,0 +1,98 @@
+---
+name: lancedb-update-lance-dependency
+description: Update LanceDB to a specific Lance release or tag. Use when bumping Lance dependencies in the lancedb repository, including Rust workspace Lance crates, Java lance-core, validation, branch creation, commit, push, and PR creation when requested.
+---
+
+# LanceDB Update Lance Dependency
+
+## Scope
+
+Use this skill in the `lancedb/lancedb` repository when updating the Lance dependency to a specific Lance version or tag.
+
+Inputs can be a version (`7.2.0-beta.1`), a tag (`v7.2.0-beta.1`), a tag ref (`refs/tags/v7.2.0-beta.1`), or `latest`.
+
+## Workflow
+
+1. Confirm the worktree status with `git status --short`.
+2. Resolve the target Lance version:
+
+   - If the input is `latest`, empty, or omitted, run:
+
+     ```bash
+     python3 ci/check_lance_release.py
+     ```
+
+     Parse the JSON output. If `needs_update` is not `true`, stop without creating a PR. Otherwise use `latest_tag`.
+
+   - If the input is explicit, use it directly.
+
+3. Compute update metadata without changing files:
+
+   ```bash
+   python3 ci/update_lance_dependency.py "$TAG_OR_VERSION" --metadata-only
+   ```
+
+   Before making changes, check for an existing open PR with the emitted `pr_title`:
+
+   ```bash
+   gh pr list --search "\"$PR_TITLE\" in:title" --state open --limit 1 --json number,url,title
+   ```
+
+   If a matching open PR exists, stop and report it instead of creating a duplicate.
+
+4. Run the deterministic update entrypoint:
+
+   ```bash
+   python3 ci/update_lance_dependency.py "$TAG_OR_VERSION"
+   ```
+
+   This updates the Rust workspace Lance dependencies through `ci/set_lance_version.py`, updates `java/pom.xml`, refreshes Cargo metadata, and prints JSON metadata containing `branch_name`, `commit_message`, and `pr_title`.
+
+5. Run validation:
+
+   ```bash
+   cargo clippy --quiet --workspace --tests --all-features -- -D warnings
+   cargo fmt --all --quiet
+   ```
+
+   Fix real diagnostics and rerun clippy until it succeeds. Do not skip warnings.
+
+6. Inspect `git status --short` and `git diff` to ensure only the Lance dependency update and required compatibility fixes are present.
+
+7. If the task only asks to prepare local changes, stop here and report the changed files and validation result.
+
+8. If the task asks to publish the update, create a branch using the printed `branch_name`, stage all relevant files, and commit using the printed `commit_message`. Do not amend or rewrite existing commits.
+
+9. Push to `origin`. Before creating the PR, check that the current token has push permission:
+
+   ```bash
+   gh api repos/lancedb/lancedb --jq .permissions.push
+   ```
+
+   If the remote branch already exists for the same generated branch name, delete the remote ref with `gh api -X DELETE repos/lancedb/lancedb/git/refs/heads/$BRANCH_NAME`, then push. Do not force-push.
+
+10. Create a PR targeting `main` with the printed `pr_title`. If there is no PR template, keep the body to two or three concise sentences: state the Lance dependency bump, note any required compatibility fixes, and link the triggering Lance tag or release.
+
+11. Read back the remote PR title after creation. If it is not a Conventional Commit title, fix it immediately.
+
+12. When running in GitHub Actions after creating the LanceDB PR, trigger the Sophon dependency update:
+
+    ```bash
+    gh workflow run codex-bump-lancedb-lance.yml \
+      --repo lancedb/sophon \
+      -f lance_ref="$LANCE_TAG" \
+      -f lancedb_ref="$BRANCH_NAME"
+    gh run list --repo lancedb/sophon --workflow codex-bump-lancedb-lance.yml --limit 1 --json databaseId,url,displayTitle
+    ```
+
+    Use the emitted metadata `tag` value as `LANCE_TAG`. Do this only after a new LanceDB PR has been created. If the update was skipped because no update is needed or an open PR already exists, do not trigger Sophon.
+
+## GitHub Actions
+
+When this skill is used from GitHub Actions, `TAG`, `GH_TOKEN`, and `GITHUB_TOKEN` may already be set. Resolve `latest` first when `TAG` is empty. Once an explicit tag or version is known, use:
+
+```bash
+python3 ci/update_lance_dependency.py "$TAG" --github-output "$GITHUB_OUTPUT"
+```
+
+Then use the emitted `branch_name`, `commit_message`, and `pr_title` values for branch, commit, and PR creation.
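The target-resolution rule in workflow step 2 of the skill can be sketched as follows. This is a minimal sketch, not part of the repository: the function name `resolve_target` is hypothetical, and it assumes the checker prints a JSON object whose `needs_update` field is a JSON boolean and whose `latest_tag` field is a string, as the skill text suggests.

```python
import json


def resolve_target(user_input, check_output):
    """Return the Lance tag or version to update to, or None to stop.

    `user_input` is the requested tag/version (may be empty or "latest").
    `check_output` is the JSON text assumed to be printed by
    ci/check_lance_release.py; it is only consulted for "latest".
    """
    requested = (user_input or "").strip()
    if requested and requested != "latest":
        # Explicit input is used directly.
        return requested
    report = json.loads(check_output)
    if report.get("needs_update") is not True:
        # Nothing to do: stop without creating a PR.
        return None
    return report["latest_tag"]


# Made-up sample payloads, for illustration only.
explicit = resolve_target("v7.2.0-beta.1", "{}")
latest = resolve_target(
    "latest", '{"needs_update": true, "latest_tag": "v8.0.0-beta.11"}'
)
skipped = resolve_target("", '{"needs_update": false}')
```

Here `explicit` is the input unchanged, `latest` is the tag taken from the checker report, and `skipped` is `None` because no update is needed.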
--- a/.bumpversion.toml
+++ b/.bumpversion.toml
@@ -1,5 +1,5 @@
 [tool.bumpversion]
-current_version = "0.30.0-beta.1"
+current_version = "0.30.1-beta.2"
 parse = """(?x)
    (?P<major>0|[1-9]\\d*)\\.
    (?P<minor>0|[1-9]\\d*)\\.
--- a/.github/dependabot.yml
+++ b/.github/dependabot.yml
@@ -21,3 +21,14 @@ updates:
        update-types:
          - minor
          - patch
+
+  - package-ecosystem: pip
+    directory: /python
+    schedule:
+      interval: weekly
+    # Only update uv.lock, never widen version requirements in pyproject.toml.
+    versioning-strategy: lockfile-only
+    groups:
+      python-deps:
+        patterns:
+          - "*"
--- a/.github/workflows/codex-update-lance-dependency.yml
+++ b/.github/workflows/codex-update-lance-dependency.yml
@@ -4,14 +4,16 @@ on:
  workflow_call:
    inputs:
      tag:
-        description: "Tag name from Lance"
-        required: true
+        description: "Tag name from Lance. If omitted, the skill will use the latest Lance release that needs an update."
+        required: false
+        default: ""
        type: string
  workflow_dispatch:
    inputs:
      tag:
-        description: "Tag name from Lance"
-        required: true
+        description: "Tag name from Lance. Leave empty to use the latest Lance release that needs an update."
+        required: false
+        default: ""
        type: string

 permissions:
@@ -25,7 +27,7 @@ jobs:
    steps:
      - name: Show inputs
        run: |
-          echo "tag = ${{ inputs.tag }}"
+          echo "tag = ${{ inputs.tag || 'latest' }}"

      - name: Checkout Repo LanceDB
        uses: actions/checkout@v4
@@ -71,65 +73,21 @@ jobs:
          OPENAI_API_KEY: ${{ secrets.CODEX_TOKEN }}
        run: |
          set -euo pipefail
-          VERSION="${TAG#refs/tags/}"
-          VERSION="${VERSION#v}"
-          BRANCH_NAME="codex/update-lance-${VERSION//[^a-zA-Z0-9]/-}"
-
-          # Use "chore" for beta/rc versions, "feat" for stable releases
-          if [[ "${VERSION}" == *beta* ]] || [[ "${VERSION}" == *rc* ]]; then
-            COMMIT_TYPE="chore"
-          else
-            COMMIT_TYPE="feat"
-          fi
+          TARGET_TAG="${TAG:-latest}"

          cat <<EOF >/tmp/codex-prompt.txt
-          You are running inside the lancedb repository on a GitHub Actions runner. Update the Lance dependency to version ${VERSION} and prepare a pull request for maintainers to review.
+          You are running inside the lancedb repository on a GitHub Actions runner.

-          Follow these steps exactly:
-          1. Use script "ci/set_lance_version.py" to update Lance Rust dependencies. The script already refreshes Cargo metadata, so allow it to finish even if it takes time.
-          2. Update the Java lance-core dependency version in "java/pom.xml": change the "<lance-core.version>...</lance-core.version>" property to "${VERSION}".
-          3. Run "cargo clippy --workspace --tests --all-features -- -D warnings". If diagnostics appear, fix them yourself and rerun clippy until it exits cleanly. Do not skip any warnings.
-          4. After clippy succeeds, run "cargo fmt --all" to format the workspace.
-          5. Ensure the repository is clean except for intentional changes. Inspect "git status --short" and "git diff" to confirm the dependency update and any required fixes.
-          6. Create and switch to a new branch named "${BRANCH_NAME}" (replace any duplicated hyphens if necessary).
-          7. Stage all relevant files with "git add -A". Commit using the message "${COMMIT_TYPE}: update lance dependency to v${VERSION}".
-          8. Push the branch to origin. If the remote branch already exists, delete it first with "gh api -X DELETE repos/lancedb/lancedb/git/refs/heads/${BRANCH_NAME}" then push with "git push origin ${BRANCH_NAME}". Do NOT use "git push --force" or "git push -f".
-          9. env "GH_TOKEN" is available, use "gh" tools for github related operations like creating pull request.
-          10. Create a pull request targeting "main" with title "${COMMIT_TYPE}: update lance dependency to v${VERSION}". First, write the PR body to /tmp/pr-body.md using a heredoc (cat <<'EOF' > /tmp/pr-body.md). The body should summarize the dependency bump, clippy/fmt verification, and link the triggering tag (${TAG}). Then run "gh pr create --body-file /tmp/pr-body.md".
-          11. After creating the PR, display the PR URL, "git status --short", and a concise summary of the commands run and their results.
+          Use \$lancedb-update-lance-dependency with target "${TARGET_TAG}".

          Constraints:
-          - Use bash commands; avoid modifying GitHub workflow files other than through the scripted task above.
-          - Do not merge the PR.
-          - If any command fails, diagnose and fix the issue instead of aborting.
+          - Use env "GH_TOKEN" for GitHub operations.
+          - Do not merge the pull request.
+          - Do not force-push.
+          - Do not create a duplicate pull request if an open PR already exists for the target Lance version.
+          - If any command fails, diagnose and fix the root cause instead of aborting.
+          - After creating the PR, display the PR URL, "git status --short", and a concise summary of the commands run and their results.
          EOF

          printenv OPENAI_API_KEY | codex login --with-api-key
          codex --config shell_environment_policy.ignore_default_excludes=true exec --dangerously-bypass-approvals-and-sandbox "$(cat /tmp/codex-prompt.txt)"
-
-      - name: Trigger sophon dependency update
-        env:
-          TAG: ${{ inputs.tag }}
-          GH_TOKEN: ${{ secrets.ROBOT_TOKEN }}
-        run: |
-          set -euo pipefail
-          VERSION="${TAG#refs/tags/}"
-          VERSION="${VERSION#v}"
-          LANCEDB_BRANCH="codex/update-lance-${VERSION//[^a-zA-Z0-9]/-}"
-
-          echo "Triggering sophon workflow with:"
-          echo "  lance_ref: ${TAG#refs/tags/}"
-          echo "  lancedb_ref: ${LANCEDB_BRANCH}"
-
-          gh workflow run codex-bump-lancedb-lance.yml \
-            --repo lancedb/sophon \
-            -f lance_ref="${TAG#refs/tags/}" \
-            -f lancedb_ref="${LANCEDB_BRANCH}"
-
-      - name: Show latest sophon workflow run
-        env:
-          GH_TOKEN: ${{ secrets.ROBOT_TOKEN }}
-        run: |
-          set -euo pipefail
-          echo "Latest sophon workflow run:"
-          gh run list --repo lancedb/sophon --workflow codex-bump-lancedb-lance.yml --limit 1 --json databaseId,url,displayTitle
--- a/.github/workflows/lance-release-timer.yml
+++ b/.github/workflows/lance-release-timer.yml
@@ -1,62 +0,0 @@
-name: Lance Release Timer
-
-on:
-  schedule:
-    - cron: "*/10 * * * *"
-  workflow_dispatch:
-
-permissions:
-  contents: read
-  actions: write
-
-concurrency:
-  group: lance-release-timer
-  cancel-in-progress: false
-
-jobs:
-  trigger-update:
-    runs-on: ubuntu-latest
-    steps:
-      - name: Checkout repository
-        uses: actions/checkout@v4
-
-      - name: Check for new Lance tag
-        id: check
-        env:
-          GH_TOKEN: ${{ secrets.ROBOT_TOKEN }}
-        run: |
-          python3 ci/check_lance_release.py --github-output "$GITHUB_OUTPUT"
-
-      - name: Look for existing PR
-        if: steps.check.outputs.needs_update == 'true'
-        id: pr
-        env:
-          GH_TOKEN: ${{ secrets.ROBOT_TOKEN }}
-        run: |
-          set -euo pipefail
-          TITLE="chore: update lance dependency to v${{ steps.check.outputs.latest_version }}"
-          COUNT=$(gh pr list --search "\"$TITLE\" in:title" --state open --limit 1 --json number --jq 'length')
-          if [ "$COUNT" -gt 0 ]; then
-            echo "Open PR already exists for $TITLE"
-            echo "pr_exists=true" >> "$GITHUB_OUTPUT"
-          else
-            echo "No existing PR for $TITLE"
-            echo "pr_exists=false" >> "$GITHUB_OUTPUT"
-          fi
-
-      - name: Trigger codex update workflow
-        if: steps.check.outputs.needs_update == 'true' && steps.pr.outputs.pr_exists != 'true'
-        env:
-          GH_TOKEN: ${{ secrets.ROBOT_TOKEN }}
-        run: |
-          set -euo pipefail
-          TAG=${{ steps.check.outputs.latest_tag }}
-          gh workflow run codex-update-lance-dependency.yml -f tag=refs/tags/$TAG
-
-      - name: Show latest codex workflow run
-        if: steps.check.outputs.needs_update == 'true' && steps.pr.outputs.pr_exists != 'true'
-        env:
-          GH_TOKEN: ${{ secrets.ROBOT_TOKEN }}
-        run: |
-          set -euo pipefail
-          gh run list --workflow codex-update-lance-dependency.yml --limit 1 --json databaseId,url,displayTitle
--- a/.gitignore
+++ b/.gitignore
@@ -27,6 +27,7 @@ python/dist
 *.so
 *.dylib
 *.dll
+*.pdb

 ## Javascript
 *.node
--- a/Cargo.lock
+++ b/Cargo.lock
--- a/Cargo.toml
+++ b/Cargo.toml
@@ -13,20 +13,20 @@ categories = ["database-implementations"]
 rust-version = "1.91.0"

 [workspace.dependencies]
-lance = { "version" = "=7.2.0-beta.3", default-features = false, "tag" = "v7.2.0-beta.3", "git" = "https://github.com/lance-format/lance.git" }
-lance-core = { "version" = "=7.2.0-beta.3", "tag" = "v7.2.0-beta.3", "git" = "https://github.com/lance-format/lance.git" }
-lance-datagen = { "version" = "=7.2.0-beta.3", "tag" = "v7.2.0-beta.3", "git" = "https://github.com/lance-format/lance.git" }
-lance-file = { "version" = "=7.2.0-beta.3", "tag" = "v7.2.0-beta.3", "git" = "https://github.com/lance-format/lance.git" }
-lance-io = { "version" = "=7.2.0-beta.3", default-features = false, "tag" = "v7.2.0-beta.3", "git" = "https://github.com/lance-format/lance.git" }
-lance-index = { "version" = "=7.2.0-beta.3", "tag" = "v7.2.0-beta.3", "git" = "https://github.com/lance-format/lance.git" }
-lance-linalg = { "version" = "=7.2.0-beta.3", "tag" = "v7.2.0-beta.3", "git" = "https://github.com/lance-format/lance.git" }
-lance-namespace = { "version" = "=7.2.0-beta.3", "tag" = "v7.2.0-beta.3", "git" = "https://github.com/lance-format/lance.git" }
-lance-namespace-impls = { "version" = "=7.2.0-beta.3", default-features = false, "tag" = "v7.2.0-beta.3", "git" = "https://github.com/lance-format/lance.git" }
-lance-table = { "version" = "=7.2.0-beta.3", "tag" = "v7.2.0-beta.3", "git" = "https://github.com/lance-format/lance.git" }
-lance-testing = { "version" = "=7.2.0-beta.3", "tag" = "v7.2.0-beta.3", "git" = "https://github.com/lance-format/lance.git" }
-lance-datafusion = { "version" = "=7.2.0-beta.3", "tag" = "v7.2.0-beta.3", "git" = "https://github.com/lance-format/lance.git" }
-lance-encoding = { "version" = "=7.2.0-beta.3", "tag" = "v7.2.0-beta.3", "git" = "https://github.com/lance-format/lance.git" }
-lance-arrow = { "version" = "=7.2.0-beta.3", "tag" = "v7.2.0-beta.3", "git" = "https://github.com/lance-format/lance.git" }
+lance = { "version" = "=8.0.0-beta.11", default-features = false, "tag" = "v8.0.0-beta.11", "git" = "https://github.com/lance-format/lance.git" }
+lance-core = { "version" = "=8.0.0-beta.11", "tag" = "v8.0.0-beta.11", "git" = "https://github.com/lance-format/lance.git" }
+lance-datagen = { "version" = "=8.0.0-beta.11", "tag" = "v8.0.0-beta.11", "git" = "https://github.com/lance-format/lance.git" }
+lance-file = { "version" = "=8.0.0-beta.11", "tag" = "v8.0.0-beta.11", "git" = "https://github.com/lance-format/lance.git" }
+lance-io = { "version" = "=8.0.0-beta.11", default-features = false, "tag" = "v8.0.0-beta.11", "git" = "https://github.com/lance-format/lance.git" }
+lance-index = { "version" = "=8.0.0-beta.11", "tag" = "v8.0.0-beta.11", "git" = "https://github.com/lance-format/lance.git" }
+lance-linalg = { "version" = "=8.0.0-beta.11", "tag" = "v8.0.0-beta.11", "git" = "https://github.com/lance-format/lance.git" }
+lance-namespace = { "version" = "=8.0.0-beta.11", "tag" = "v8.0.0-beta.11", "git" = "https://github.com/lance-format/lance.git" }
+lance-namespace-impls = { "version" = "=8.0.0-beta.11", default-features = false, "tag" = "v8.0.0-beta.11", "git" = "https://github.com/lance-format/lance.git" }
+lance-table = { "version" = "=8.0.0-beta.11", "tag" = "v8.0.0-beta.11", "git" = "https://github.com/lance-format/lance.git" }
+lance-testing = { "version" = "=8.0.0-beta.11", "tag" = "v8.0.0-beta.11", "git" = "https://github.com/lance-format/lance.git" }
+lance-datafusion = { "version" = "=8.0.0-beta.11", "tag" = "v8.0.0-beta.11", "git" = "https://github.com/lance-format/lance.git" }
+lance-encoding = { "version" = "=8.0.0-beta.11", "tag" = "v8.0.0-beta.11", "git" = "https://github.com/lance-format/lance.git" }
+lance-arrow = { "version" = "=8.0.0-beta.11", "tag" = "v8.0.0-beta.11", "git" = "https://github.com/lance-format/lance.git" }
 ahash = "0.8"
 # Note that this one does not include pyarrow
 arrow = { version = "58.0.0", optional = false }
--- a/REVIEW.md
+++ b/REVIEW.md
@@ -0,0 +1,26 @@
+# Code review guidelines
+
+Repo-specific guidance for automated PR reviews.
+
+## Cross-SDK parity
+
+LanceDB exposes the same core (`rust/lancedb`) through Python, TypeScript (`nodejs`),
+and Java bindings. Behavioral drift between SDKs is a recurring problem, so watch for
+parity gaps when reviewing — but only flag real ones:
+
+* If the change adds or modifies user-facing API or behavior in the shared core
+  (`rust/lancedb`), check whether each binding that should expose it (`python`,
+  `nodejs`) does. A core change with no corresponding binding update is worth a note.
+* If the change adds or modifies a public API in one SDK but not the other, open the
+  sibling SDK's corresponding module and state whether an equivalent exists. If not,
+  note it as a possible parity gap and suggest a follow-up issue.
+* For bug fixes, first read the sibling SDK's analogous code path to check whether the
+  same bug exists there. Only raise parity if it actually does. Do not ask to "port" a
+  fix for a bug that only ever existed in one binding.
+* Stay silent on internal-only refactors, tests, docs, and changes with no cross-SDK
+  surface.
+* Parity expectations apply to the Python and TypeScript (`nodejs`) SDKs. Java currently
+  implements only the remote table, not the local/embedded backend, so it is expected to
+  be partial — do not flag Java for missing local-only functionality.
+* Keep parity feedback to a short, clearly-labeled note (e.g. "Possible SDK parity
+  gap: …"). It is advisory, not a merge blocker.
--- a/ci/update_lance_dependency.py
+++ b/ci/update_lance_dependency.py
@@ -0,0 +1,126 @@
+#!/usr/bin/env python3
+"""Prepare a Lance dependency update for LanceDB."""
+
+from __future__ import annotations
+
+import argparse
+import json
+import re
+import subprocess
+import sys
+from pathlib import Path
+from typing import Sequence
+
+try:
+    from check_lance_release import parse_semver
+except ModuleNotFoundError:
+    # Supports importing as ci.update_lance_dependency from tests or ad hoc checks.
+    from ci.check_lance_release import parse_semver  # type: ignore
+
+
+def normalize_version(raw: str) -> str:
+    value = raw.strip()
+    value = value.removeprefix("refs/tags/")
+    value = value.removeprefix("v")
+    try:
+        parse_semver(value)
+    except ValueError as exc:
+        raise ValueError(f"Unsupported Lance version or tag: {raw}") from exc
+    return value
+
+
+def normalized_tag(version: str) -> str:
+    return f"v{version}"
+
+
+def branch_name(version: str) -> str:
+    suffix = re.sub(r"[^a-zA-Z0-9]+", "-", version).strip("-")
+    suffix = re.sub(r"-+", "-", suffix)
+    return f"codex/update-lance-{suffix}"
+
+
+def commit_type(version: str) -> str:
+    prerelease = version.split("-", maxsplit=1)[1] if "-" in version else ""
+    return "chore" if "beta" in prerelease or "rc" in prerelease else "feat"
+
+
+def metadata_for(version: str) -> dict[str, str]:
+    kind = commit_type(version)
+    message = f"{kind}: update lance dependency to v{version}"
+    return {
+        "version": version,
+        "tag": normalized_tag(version),
+        "branch_name": branch_name(version),
+        "commit_type": kind,
+        "commit_message": message,
+        "pr_title": message,
+    }
+
+
+def run_command(cmd: Sequence[str], *, cwd: Path) -> None:
+    subprocess.run(cmd, cwd=cwd, check=True)
+
+
+def update_java_lance_core_version(repo_root: Path, version: str) -> None:
+    pom_path = repo_root / "java" / "pom.xml"
+    contents = pom_path.read_text(encoding="utf-8")
+    updated, count = re.subn(
+        r"(<lance-core\.version>)[^<]+(</lance-core\.version>)",
+        rf"\g<1>{version}\g<2>",
+        contents,
+        count=1,
+    )
+    if count != 1:
+        raise RuntimeError(
+            "Expected exactly one <lance-core.version> entry in java/pom.xml"
+        )
+    pom_path.write_text(updated, encoding="utf-8")
+
+
+def write_github_outputs(path: str | None, payload: dict[str, str]) -> None:
+    if not path:
+        return
+    with open(path, "a", encoding="utf-8") as output:
+        for key, value in payload.items():
+            output.write(f"{key}={value}\n")
+
+
+def main(argv: Sequence[str] | None = None) -> int:
+    parser = argparse.ArgumentParser(description=__doc__)
+    parser.add_argument(
+        "tag_or_version",
+        help="Lance tag or version, for example refs/tags/v7.2.0-beta.1 or 7.2.0",
+    )
+    parser.add_argument(
+        "--repo-root",
+        type=Path,
+        default=Path(__file__).resolve().parents[1],
+        help="Path to the lancedb repository root",
+    )
+    parser.add_argument(
+        "--github-output",
+        default=None,
+        help="Optional GitHub Actions output file to receive metadata fields",
+    )
+    parser.add_argument(
+        "--metadata-only",
+        action="store_true",
+        help="Only print derived metadata; do not modify dependency files",
+    )
+    args = parser.parse_args(argv)
+
+    repo_root = args.repo_root.resolve()
+    version = normalize_version(args.tag_or_version)
+    payload = metadata_for(version)
+
+    if not args.metadata_only:
+        run_command([sys.executable, "ci/set_lance_version.py", version], cwd=repo_root)
+        update_java_lance_core_version(repo_root, version)
+
+    write_github_outputs(args.github_output, payload)
+    print(json.dumps(payload, sort_keys=True))
+    return 0
+
+
+if __name__ == "__main__":
+    sys.exit(main())
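As a quick check of the derivation rules in the script above, the sketch below copies `branch_name` and `commit_type` so that it runs standalone and applies them to the version from this update and to a hypothetical stable version `8.0.0`. The helper `strip_tag_prefix` is an illustrative stand-in for `normalize_version` that omits the semver validation; it needs Python 3.9+ for `str.removeprefix`.

```python
import re


def strip_tag_prefix(raw):
    """Illustrative stand-in for normalize_version, without semver validation."""
    return raw.strip().removeprefix("refs/tags/").removeprefix("v")


def branch_name(version):
    # Collapse every run of non-alphanumeric characters into a single hyphen.
    suffix = re.sub(r"[^a-zA-Z0-9]+", "-", version).strip("-")
    suffix = re.sub(r"-+", "-", suffix)
    return f"codex/update-lance-{suffix}"


def commit_type(version):
    # Pre-releases (beta/rc) are "chore"; stable releases are "feat".
    prerelease = version.split("-", maxsplit=1)[1] if "-" in version else ""
    return "chore" if "beta" in prerelease or "rc" in prerelease else "feat"


version = strip_tag_prefix("refs/tags/v8.0.0-beta.11")
branch = branch_name(version)
title = f"{commit_type(version)}: update lance dependency to v{version}"
stable_kind = commit_type("8.0.0")  # hypothetical stable version
```

For this update the derived title matches the commit subject at the top of this log, and the stable version maps to `feat`.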
--- a/deny.toml
+++ b/deny.toml
@@ -147,6 +147,14 @@ allow = [
    "CDLA-Permissive-2.0",
 ]
 confidence-threshold = 0.8
+# Per-crate license exceptions: allow a license for a specific crate only,
+# rather than globally via the `allow` list above.
+exceptions = [
+    # CDDL-1.0 (copyleft) is pulled in only as a dev/profiling dependency via
+    # `inferno` -> `pprof` -> `lance-testing`; it is a test dependency that we
+    # do not distribute, so scope the allowance to `inferno` alone.
+    { allow = ["CDDL-1.0"], crate = "inferno" },
+]
 # Crates whose license cannot be determined from Cargo metadata but whose
 # license we've manually confirmed from upstream. Keep this list minimal.
 [[licenses.clarify]]
--- a/docs/src/java/java.md
+++ b/docs/src/java/java.md
@@ -14,7 +14,7 @@ Add the following dependency to your `pom.xml`:
 <dependency>
    <groupId>com.lancedb</groupId>
    <artifactId>lancedb-core</artifactId>
-    <version>0.30.0-beta.1</version>
+    <version>0.30.1-beta.2</version>
 </dependency>
 ```

--- a/docs/src/js/classes/BranchContents.md
+++ b/docs/src/js/classes/BranchContents.md
@@ -0,0 +1,43 @@
+[**@lancedb/lancedb**](../README.md) • **Docs**
+
+***
+
+[@lancedb/lancedb](../globals.md) / BranchContents
+
+# Class: BranchContents
+
+## Constructors
+
+### new BranchContents()
+
+```ts
+new BranchContents(): BranchContents
+```
+
+#### Returns
+
+[`BranchContents`](BranchContents.md)
+
+## Properties
+
+### manifestSize
+
+```ts
+manifestSize: number;
+```
+
+***
+
+### parentBranch?
+
+```ts
+optional parentBranch: string;
+```
+
+***
+
+### parentVersion
+
+```ts
+parentVersion: number;
+```
--- a/docs/src/js/classes/Branches.md
+++ b/docs/src/js/classes/Branches.md
@@ -0,0 +1,96 @@
+[**@lancedb/lancedb**](../README.md) • **Docs**
+
+***
+
+[@lancedb/lancedb](../globals.md) / Branches
+
+# Class: Branches
+
+Branch manager for a [Table](Table.md).
+
+Unlike tags, `create` and `checkout` return a new [Table](Table.md) handle scoped
+to the branch; writes on it do not affect `main`.
+
+## Methods
+
+### checkout()
+
+```ts
+checkout(name, version?): Promise<Table>
+```
+
+Check out an existing branch and return a handle scoped to it.
+
+With `version` set, the returned handle is pinned to that version of the
+branch (a read-only, detached view); otherwise it tracks the branch's
+latest and stays writable.
+
+#### Parameters
+
+* **name**: `string`
+
+* **version?**: `number`
+
+#### Returns
+
+`Promise`&lt;[`Table`](Table.md)&gt;
+
+***
+
+### create()
+
+```ts
+create(
+   name,
+   fromRef?,
+   fromVersion?): Promise<Table>
+```
+
+Create a branch and return a handle scoped to it.
+
+#### Parameters
+
+* **name**: `string`
+    Name of the new branch.
+
+* **fromRef?**: `string`
+    Source branch to fork from. Defaults to `main`.
+
+* **fromVersion?**: `number`
+    A specific version on `fromRef`. Defaults to latest.
+
+#### Returns
+
+`Promise`&lt;[`Table`](Table.md)&gt;
+
+***
+
+### delete()
+
+```ts
+delete(name): Promise<void>
+```
+
+Delete a branch.
+
+#### Parameters
+
+* **name**: `string`
+
+#### Returns
+
+`Promise`&lt;`void`&gt;
+
+***
+
+### list()
+
+```ts
+list(): Promise<Record<string, BranchContents>>
+```
+
+List all branches, mapping name to branch metadata.
+
+#### Returns
+
+`Promise`&lt;`Record`&lt;`string`, [`BranchContents`](BranchContents.md)&gt;&gt;
--- a/docs/src/js/classes/Index.md
+++ b/docs/src/js/classes/Index.md
@@ -57,6 +57,24 @@ block size may be added in the future.

 ***

+### fm()
+
+```ts
+static fm(): Index
+```
+
+Create an FM-Index.
+
+An FM-Index is a scalar index on string or binary columns that accelerates
+substring search, i.e. `contains(col, 'needle')`. Unlike the tokenized
+full-text-search index, it matches arbitrary substrings of the raw bytes.
+
+#### Returns
+
+[`Index`](Index.md)
+
+***
+
 ### fts()

 ```ts
--- a/docs/src/js/classes/Table.md
+++ b/docs/src/js/classes/Table.md
@@ -110,6 +110,23 @@ containing the new version number of the table after altering the columns.

 ***

+### branches()
+
+```ts
+abstract branches(): Promise<Branches>
+```
+
+Get the branch manager for this table.
+
+Branches are isolated, writable lines of history forked from another
+branch (or version). Writes on a branch do not affect `main`.
+
+#### Returns
+
+`Promise`&lt;[`Branches`](Branches.md)&gt;
+
+***
+
 ### checkout()

 ```ts
@@ -994,6 +1011,29 @@ based on the row being updated (e.g. "my_col + 1")

 ***

+### updateFieldMetadata()
+
+```ts
+abstract updateFieldMetadata(updates): Promise<UpdateFieldMetadataResult>
+```
+
+Update per-field (column) metadata.
+
+#### Parameters
+
+* **updates**: [`FieldMetadataUpdate`](../interfaces/FieldMetadataUpdate.md)[]
+    One or more per-field updates. Each
+    update's metadata is merged into the field's existing metadata by default;
+    a value of `null` deletes that key, and `replace: true` swaps the whole map.
+
+#### Returns
+
+`Promise`&lt;[`UpdateFieldMetadataResult`](../interfaces/UpdateFieldMetadataResult.md)&gt;
+
+resolves to the new table version.
+
+***
+
 ### vectorSearch()

 ```ts
--- a/docs/src/js/globals.md
+++ b/docs/src/js/globals.md
@@ -19,6 +19,8 @@

 - [BooleanQuery](classes/BooleanQuery.md)
 - [BoostQuery](classes/BoostQuery.md)
+- [BranchContents](classes/BranchContents.md)
+- [Branches](classes/Branches.md)
 - [Connection](classes/Connection.md)
 - [HeaderProvider](classes/HeaderProvider.md)
 - [Index](classes/Index.md)
@@ -65,6 +67,7 @@
 - [DropNamespaceOptions](interfaces/DropNamespaceOptions.md)
 - [DropNamespaceResponse](interfaces/DropNamespaceResponse.md)
 - [ExecutableQuery](interfaces/ExecutableQuery.md)
+- [FieldMetadataUpdate](interfaces/FieldMetadataUpdate.md)
 - [FragmentStatistics](interfaces/FragmentStatistics.md)
 - [FragmentSummaryStats](interfaces/FragmentSummaryStats.md)
 - [FtsOptions](interfaces/FtsOptions.md)
@@ -101,6 +104,7 @@
 - [TimeoutConfig](interfaces/TimeoutConfig.md)
 - [TlsConfig](interfaces/TlsConfig.md)
 - [TokenResponse](interfaces/TokenResponse.md)
+- [UpdateFieldMetadataResult](interfaces/UpdateFieldMetadataResult.md)
 - [UpdateOptions](interfaces/UpdateOptions.md)
 - [UpdateResult](interfaces/UpdateResult.md)
 - [Version](interfaces/Version.md)
--- a/docs/src/js/interfaces/FieldMetadataUpdate.md
+++ b/docs/src/js/interfaces/FieldMetadataUpdate.md
@@ -0,0 +1,41 @@
+[**@lancedb/lancedb**](../README.md) • **Docs**
+
+***
+
+[@lancedb/lancedb](../globals.md) / FieldMetadataUpdate
+
+# Interface: FieldMetadataUpdate
+
+A per-field metadata update, addressed by dot-path.
+
+## Properties
+
+### metadata
+
+```ts
+metadata: Record<string, null | string>;
+```
+
+Metadata key/value pairs. Merged into the field's existing metadata by
+default; a value of `null` deletes that key.
+
+***
+
+### path
+
+```ts
+path: string;
+```
+
+Dot-separated path to the field. For a top-level column this is just its
+name; for a nested field it's the path, e.g. "a.b.c".
+
+***
+
+### replace?
+
+```ts
+optional replace: boolean;
+```
+
+If true, replace the field's entire metadata map instead of merging.
--- a/docs/src/js/interfaces/IndexStatistics.md
+++ b/docs/src/js/interfaces/IndexStatistics.md
@@ -30,17 +30,6 @@ The type of the index

 ***

-### loss?
-
-```ts
-optional loss: number;
-```
-
-The KMeans loss value of the index,
-it is only present for vector indices.
-
-***
-
 ### numIndexedRows

 ```ts
--- a/docs/src/js/interfaces/OpenTableOptions.md
+++ b/docs/src/js/interfaces/OpenTableOptions.md
@@ -8,6 +8,18 @@

 ## Properties

+### branch?
+
+```ts
+optional branch: string;
+```
+
+Open the table scoped to this branch instead of the default branch.
+
+Reads and writes on the returned table operate in the branch's context.
+
+***
+
 ### ~~indexCacheSize?~~

 ```ts
@@ -43,3 +55,17 @@ Options already set on the connection will be inherited by the table,
 but can be overridden here.

 The available options are described at https://docs.lancedb.com/storage/
+
+***
+
+### version?
+
+```ts
+optional version: number;
+```
+
+Open the table pinned to this version, producing a read-only view.
+
+Composes with [OpenTableOptions.branch](OpenTableOptions.md#branch): when both are set, opens
+that branch at the version; otherwise opens `main` at the version. Call
+`checkoutLatest` to return to a writable state.
--- a/docs/src/js/interfaces/UpdateFieldMetadataResult.md
+++ b/docs/src/js/interfaces/UpdateFieldMetadataResult.md
@@ -0,0 +1,15 @@
+[**@lancedb/lancedb**](../README.md) • **Docs**
+
+***
+
+[@lancedb/lancedb](../globals.md) / UpdateFieldMetadataResult
+
+# Interface: UpdateFieldMetadataResult
+
+## Properties
+
+### version
+
+```ts
+version: number;
+```
--- a/java/lancedb-core/pom.xml
+++ b/java/lancedb-core/pom.xml
@@ -8,7 +8,7 @@
    <parent>
      <groupId>com.lancedb</groupId>
      <artifactId>lancedb-parent</artifactId>
-      <version>0.30.0-beta.1</version>
+      <version>0.30.1-beta.2</version>
      <relativePath>../pom.xml</relativePath>
    </parent>

--- a/java/pom.xml
+++ b/java/pom.xml
@@ -6,7 +6,7 @@

    <groupId>com.lancedb</groupId>
    <artifactId>lancedb-parent</artifactId>
-    <version>0.30.0-beta.1</version>
+    <version>0.30.1-beta.2</version>
    <packaging>pom</packaging>
    <name>${project.artifactId}</name>
    <description>LanceDB Java SDK Parent POM</description>
@@ -28,7 +28,7 @@
    <properties>
        <project.build.sourceEncoding>UTF-8</project.build.sourceEncoding>
        <arrow.version>15.0.0</arrow.version>
-        <lance-core.version>7.2.0-beta.1</lance-core.version>
+        <lance-core.version>8.0.0-beta.11</lance-core.version>
        <spotless.skip>false</spotless.skip>
        <spotless.version>2.30.0</spotless.version>
        <spotless.java.googlejavaformat.version>1.7</spotless.java.googlejavaformat.version>
--- a/nodejs/Cargo.toml
+++ b/nodejs/Cargo.toml
@@ -1,7 +1,7 @@
 [package]
 name = "lancedb-nodejs"
 edition.workspace = true
-version = "0.30.0-beta.1"
+version = "0.30.1-beta.2"
 publish = false
 license.workspace = true
 description.workspace = true
--- a/nodejs/test/remote.test.ts
+++ b/nodejs/test/remote.test.ts
@@ -191,6 +191,34 @@ describe("remote connection", () => {
    );
  });

+  it("allows version on remote but rejects a non-main branch", async () => {
+    await withMockDatabase(
+      (_req, res) => {
+        // describe (table open + version validation) always succeeds
+        const body = JSON.stringify({
+          name: "t",
+          version: 2,
+          schema: { fields: [] },
+        });
+        res.writeHead(200, { "Content-Type": "application/json" }).end(body);
+      },
+      async (db) => {
+        // version-only (and "main" + version) is allowed: remote supports
+        // version time-travel even though it has no branches
+        await db.openTable("t", undefined, { version: 2 });
+        await db.openTable("t", undefined, { branch: "main", version: 2 });
+
+        // a non-main branch is rejected, with or without a version
+        await expect(
+          db.openTable("t", undefined, { branch: "exp" }),
+        ).rejects.toThrow(/branching/);
+        await expect(
+          db.openTable("t", undefined, { branch: "exp", version: 2 }),
+        ).rejects.toThrow(/branching/);
+      },
+    );
+  });
+
  describe("TlsConfig", () => {
    it("should create TlsConfig with all fields", () => {
      const tlsConfig: TlsConfig = {
--- a/nodejs/test/table.test.ts
+++ b/nodejs/test/table.test.ts
@@ -85,6 +85,136 @@ describe.each([arrow15, arrow16, arrow17, arrow18])(
      await expect(table.countRows()).resolves.toBe(3);
    });

+    it("should support branches", async () => {
+      await table.add([{ id: 1 }]);
+      expect(await table.countRows()).toBe(1);
+
+      // fork an isolated, writable branch from main
+      const branch = await (await table.branches()).create("exp");
+      expect(await branch.countRows()).toBe(1);
+      await branch.add([{ id: 2 }]);
+      expect(await branch.countRows()).toBe(2);
+      // main is untouched by branch writes
+      expect(await table.countRows()).toBe(1);
+
+      // listed, with main (null) as the parent
+      const list = await (await table.branches()).list();
+      expect(Object.keys(list)).toContain("exp");
+      expect(list["exp"].parentBranch).toBeNull();
+
+      // fromRef="main" is equivalent to the default
+      await (await table.branches()).create("exp2", "main");
+      const list2 = await (await table.branches()).list();
+      expect(list2["exp2"].parentBranch).toBeNull();
+
+      // checkout returns a handle scoped to the branch's latest
+      const checkedOut = await (await table.branches()).checkout("exp");
+      expect(await checkedOut.countRows()).toBe(2);
+
+      // delete removes it
+      await (await table.branches()).delete("exp");
+      await (await table.branches()).delete("exp2");
+      const after = await (await table.branches()).list();
+      expect(Object.keys(after)).not.toContain("exp");
+    });
+
+    it("should open a branch via openTable", async () => {
+      const db = await connect(tmpDir.name);
+      await table.add([{ id: 1 }]);
+      const branch = await (await table.branches()).create("exp");
+      await branch.add([{ id: 2 }]);
+
+      // openTable(..., { branch }) returns a handle scoped to the branch
+      const opened = await db.openTable("some_table", undefined, {
+        branch: "exp",
+      });
+      expect(await opened.countRows()).toBe(2);
+      // opening without branch still tracks main
+      expect(await (await db.openTable("some_table")).countRows()).toBe(1);
+    });
+
+    it("should open a branch at a version isolated from main and HEAD", async () => {
+      const db = await connect(tmpDir.name);
+      // main: a single fork-point row
+      const t = await db.createTable("bv_table", [{ id: 0 }]);
+      const mainV1 = await t.version();
+
+      // fork "exp", then advance exp AND main independently past the fork so
+      // they diverge while sharing version numbers
+      const exp = await (await t.branches()).create("exp");
+      await exp.add([{ id: 1 }]); // exp: {0, 1}
+      const expV2 = await exp.version();
+      await exp.add([{ id: 2 }]); // exp HEAD: {0, 1, 2}
+      await t.add([{ id: 100 }, { id: 101 }, { id: 102 }]); // main HEAD: {0,100,101,102}
+      expect(await t.version()).toBe(expV2);
+
+      // open exp at the shared version: the data must be exp's, not main's.
+      // count alone cannot prove this (main@v2 also exists), so assert
+      // provenance by content.
+      const pinned = await db.openTable("bv_table", undefined, {
+        branch: "exp",
+        version: expV2,
+      });
+      expect(await pinned.countRows()).toBe(2); // not exp HEAD (3), not main@v2 (4)
+      expect(await pinned.countRows("id = 1")).toBe(1); // exp's post-fork row
+      expect(await pinned.countRows("id = 100")).toBe(0); // main's rows invisible
+
+      // the same coordinate is reachable directly via branches().checkout(name, version)
+      const pinnedDirect = await (await t.branches()).checkout("exp", expV2);
+      expect(await pinnedDirect.countRows()).toBe(2);
+
+      // the HEADs are unaffected
+      expect(
+        await (
+          await db.openTable("bv_table", undefined, { branch: "exp" })
+        ).countRows(),
+      ).toBe(3);
+      expect(await (await db.openTable("bv_table")).countRows()).toBe(4);
+
+      // version-only (no branch) time-travels main itself: its fork-point
+      // version holds only main's first row, and the shared version number
+      // resolves to main's data, not the branch's ("opens main at the version")
+      const oldMain = await db.openTable("bv_table", undefined, {
+        version: mainV1,
+      });
+      expect(await oldMain.countRows()).toBe(1);
+      const sharedOnMain = await db.openTable("bv_table", undefined, {
+        version: expV2,
+      });
+      expect(await sharedOnMain.countRows()).toBe(4); // main@v2, not exp@v2 (2)
+
+      // detached head: writing to a pinned version is rejected
+      await expect(pinned.add([{ id: 9 }])).rejects.toThrow(
+        /cannot be modified/,
+      );
+
+      // a nonexistent version is rejected -- on main, and on a branch (a
+      // distinct resolution path, on the branch's manifests)
+      await expect(
+        db.openTable("bv_table", undefined, { version: 9999 }),
+      ).rejects.toThrow();
+      await expect(
+        db.openTable("bv_table", undefined, { branch: "exp", version: 9999 }),
+      ).rejects.toThrow();
+
+      // checkoutLatest re-attaches the pinned handle to the BRANCH's HEAD
+      // (writable again), not main's HEAD (4), and not staying pinned (2)
+      await pinned.checkoutLatest();
+      expect(await pinned.countRows()).toBe(3); // exp HEAD
+      await pinned.add([{ id: 3 }]);
+      expect(await pinned.countRows()).toBe(4); // writable again
+    });
+
+    it("rejects invalid branch inputs", async () => {
+      const branches = await table.branches();
+      await expect(branches.create("")).rejects.toThrow("non-empty");
+      await expect(branches.checkout("")).rejects.toThrow("non-empty");
+      await expect(branches.delete("")).rejects.toThrow("non-empty");
+      await expect(branches.create("bad", "main", -1)).rejects.toThrow(
+        "non-negative",
+      );
+    });
+
    it("should show table stats", async () => {
      await table.add([{ id: 1 }, { id: 2 }]);
      await table.add([{ id: 1 }]);
@@ -721,7 +851,7 @@ describe("When creating an index", () => {
      columns: ["vec"],
    });
    const stats = await tbl.indexStats("vec_idx");
-    expect(stats?.loss).toBeDefined();
+    expect(stats).toBeDefined();

    // Search without specifying the column
    let rst = await tbl
@@ -781,10 +911,22 @@ describe("When creating an index", () => {
    expect(indices2.length).toBe(0);
  });

-  it("should create and search a nested vector index", async () => {
+  it("should preserve canonical nested field paths across index lifecycle", async () => {
    const db = await connect(tmpDir.name);
    const nestedSchema = new Schema([
-      new Field("id", new Int32(), true),
+      new Field("rowId", new Int32(), true),
+      new Field("row-id", new Int32(), true),
+      new Field("userId", new Int32(), true),
+      new Field(
+        "metadata",
+        new Struct([new Field("user_id", new Int32(), true)]),
+        true,
+      ),
+      new Field(
+        "MetaData",
+        new Struct([new Field("userId", new Int32(), true)]),
+        true,
+      ),
      new Field(
        "image",
        new Struct([
@@ -796,28 +938,147 @@ describe("When creating an index", () => {
        ]),
        true,
      ),
+      new Field(
+        "payload",
+        new Struct([new Field("text", new Utf8(), true)]),
+        true,
+      ),
+      new Field(
+        "meta-data",
+        new Struct([new Field("user-id", new Int32(), true)]),
+        true,
+      ),
+      new Field(
+        "literal",
+        new Struct([new Field("a.b", new Int32(), true)]),
+        true,
+      ),
    ]);
    const nestedTable = await db.createTable(
-      "nested_vector",
+      "nested_field_index_lifecycle",
      makeArrowTable(
-        Array.from({ length: 300 }, (_, id) => ({
-          id,
-          image: { embedding: [id, id + 1] },
+        Array.from({ length: 300 }, (_, rowId) => ({
+          rowId,
+          "row-id": rowId,
+          userId: rowId,
+          metadata: { ["user_id"]: rowId },
+          ["MetaData"]: { userId: rowId },
+          image: { embedding: [rowId, rowId + 1] },
+          payload: { text: `document ${rowId}` },
+          "meta-data": { "user-id": rowId },
+          literal: { "a.b": rowId },
        })),
        { schema: nestedSchema },
      ),
    );

+    await nestedTable.createIndex("rowId", {
+      config: Index.btree(),
+      name: "row_id_idx",
+    });
+    await nestedTable.createIndex("`row-id`", {
+      config: Index.btree(),
+      name: "row_dash_id_idx",
+    });
+    await nestedTable.createIndex("userId", {
+      config: Index.btree(),
+      name: "top_user_id_idx",
+    });
+    await nestedTable.createIndex("metadata.user_id", {
+      config: Index.btree(),
+      name: "nested_user_id_idx",
+    });
+    await nestedTable.createIndex("MetaData.userId", {
+      config: Index.btree(),
+      name: "mixed_case_metadata_user_id_idx",
+    });
+    await nestedTable.createIndex("`meta-data`.`user-id`", {
+      config: Index.btree(),
+      name: "escaped_names_idx",
+    });
+    await nestedTable.createIndex("literal.`a.b`", {
+      config: Index.btree(),
+      name: "literal_dot_idx",
+    });
    await nestedTable.createIndex("image.embedding", {
      name: "image_embedding_idx",
    });
-    const indices = await nestedTable.listIndices();
-    expect(indices).toContainEqual({
-      name: "image_embedding_idx",
-      indexType: "IvfPq",
-      columns: ["image.embedding"],
+    await nestedTable.createIndex("payload.text", {
+      config: Index.fts({ withPosition: false }),
+      name: "payload_text_idx",
    });

+    const indices = await nestedTable.listIndices();
+    expect(indices).toEqual(
+      expect.arrayContaining([
+        {
+          name: "row_id_idx",
+          indexType: "BTree",
+          columns: ["rowId"],
+        },
+        {
+          name: "row_dash_id_idx",
+          indexType: "BTree",
+          columns: ["`row-id`"],
+        },
+        {
+          name: "top_user_id_idx",
+          indexType: "BTree",
+          columns: ["userId"],
+        },
+        {
+          name: "nested_user_id_idx",
+          indexType: "BTree",
+          columns: ["metadata.user_id"],
+        },
+        {
+          name: "mixed_case_metadata_user_id_idx",
+          indexType: "BTree",
+          columns: ["MetaData.userId"],
+        },
+        {
+          name: "escaped_names_idx",
+          indexType: "BTree",
+          columns: ["`meta-data`.`user-id`"],
+        },
+        {
+          name: "literal_dot_idx",
+          indexType: "BTree",
+          columns: ["literal.`a.b`"],
+        },
+        {
+          name: "image_embedding_idx",
+          indexType: "IvfPq",
+          columns: ["image.embedding"],
+        },
+        {
+          name: "payload_text_idx",
+          indexType: "FTS",
+          columns: ["payload.text"],
+        },
+      ]),
+    );
+
+    const stats = await nestedTable.indexStats(
+      "mixed_case_metadata_user_id_idx",
+    );
+    expect(stats?.numIndexedRows).toEqual(300);
+    expect(stats?.indexType).toEqual("BTREE");
+
+    const filtered = await nestedTable
+      .query()
+      .where("MetaData.userId = 42")
+      .limit(1)
+      .toArray();
+    expect(filtered[0].MetaData.userId).toEqual(42);
+
+    const escapedFiltered = await nestedTable
+      .query()
+      .where("`row-id` = 43")
+      .limit(1)
+      .toArray();
+    expect(escapedFiltered[0]["row-id"]).toEqual(43);
+
    const explicit = await nestedTable
      .query()
      .nearestTo([0.0, 1.0])
@@ -829,7 +1090,37 @@ describe("When creating an index", () => {
      .nearestTo([0.0, 1.0])
      .limit(1)
      .toArray();
-    expect(inferred[0].id).toEqual(explicit[0].id);
+    expect(inferred[0].rowId).toEqual(explicit[0].rowId);
+
+    await nestedTable.add([
+      {
+        rowId: 300,
+        "row-id": 300,
+        userId: 300,
+        metadata: { ["user_id"]: 300 },
+        ["MetaData"]: { userId: 300 },
+        image: { embedding: [300.0, 301.0] },
+        payload: { text: "document 300" },
+        "meta-data": { "user-id": 300 },
+        literal: { "a.b": 300 },
+      },
+    ]);
+    await nestedTable.optimize();
+    const indicesAfterOptimize = await nestedTable.listIndices();
+    expect(indicesAfterOptimize).toEqual(
+      expect.arrayContaining([
+        {
+          name: "mixed_case_metadata_user_id_idx",
+          indexType: "BTree",
+          columns: ["MetaData.userId"],
+        },
+        {
+          name: "image_embedding_idx",
+          indexType: "IvfPq",
+          columns: ["image.embedding"],
+        },
+      ]),
+    );
  });

  it("should report multiple nested vector candidates", async () => {
@@ -1140,6 +1431,20 @@ describe("When creating an index", () => {
    expect(fs.readdirSync(indexDir)).toHaveLength(1);
  });

+  test("create an FM index", async () => {
+    // FM-Index accelerates substring search on a string/binary column.
+    const db = await connect(tmpDir.name);
+    const fmTbl = await db.createTable("fm_table", [
+      { id: 0, text: "hello world" },
+      { id: 1, text: "foo bar" },
+    ]);
+    await fmTbl.createIndex("text", {
+      config: Index.fm(),
+    });
+    const indexDir = path.join(tmpDir.name, "fm_table.lance", "_indices");
+    expect(fs.readdirSync(indexDir)).toHaveLength(1);
+  });
+
  test("should be able to get index stats", async () => {
    await tbl.createIndex("id");

@@ -1150,7 +1455,6 @@ describe("When creating an index", () => {
    expect(stats?.distanceType).toBeUndefined();
    expect(stats?.indexType).toEqual("BTREE");
    expect(stats?.numIndices).toEqual(1);
-    expect(stats?.loss).toBeUndefined();
  });

  test("when getting stats on non-existent index", async () => {
@@ -1571,6 +1875,33 @@ describe("schema evolution", function () {
    expect(await table.schema()).toEqual(expectedSchema3);
  });

+  it("can update field metadata", async function () {
+    const con = await connect(tmpDir.name);
+    const table = await con.createTable("fm", [
+      { id: 1, category: "a" },
+      { id: 2, category: "b" },
+    ]);
+
+    const res = await table.updateFieldMetadata([
+      { path: "category", metadata: { unit: "label", pii: "false" } },
+    ]);
+    expect(res).toHaveProperty("version");
+    expect(res.version).toBe(2);
+
+    let cat = (await table.schema()).fields.find((f) => f.name === "category");
+    expect(cat?.metadata.get("unit")).toBe("label");
+    expect(cat?.metadata.get("pii")).toBe("false");
+
+    // merge: add a key, delete one via null, keep the rest
+    await table.updateFieldMetadata([
+      { path: "category", metadata: { source: "import", pii: null } },
+    ]);
+    cat = (await table.schema()).fields.find((f) => f.name === "category");
+    expect(cat?.metadata.get("unit")).toBe("label"); // preserved
+    expect(cat?.metadata.get("source")).toBe("import"); // added
+    expect(cat?.metadata.has("pii")).toBe(false); // deleted
+  });
+
  it("can cast to various types", async function () {
    const con = await connect(tmpDir.name);

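Review note: the index lifecycle test above addresses columns with paths such as ``literal.`a.b` `` and `` `meta-data`.`user-id` ``, where backticks quote a segment that contains a dot or a dash. A rough sketch of how such a path could be split into field names follows. `splitFieldPath` is a hypothetical helper written for this note; how Lance actually parses and canonicalizes paths is not shown in this diff, and escaped backticks inside a segment are not handled here.

```typescript
// Hypothetical splitter for backtick-quoted field paths; an illustration only.
function splitFieldPath(path: string): string[] {
  const segments: string[] = [];
  let current = "";
  let quoted = false;
  for (const ch of path) {
    if (ch === "`") {
      // Toggle quoting; the backticks themselves are not part of the name.
      quoted = !quoted;
    } else if (ch === "." && !quoted) {
      // An unquoted dot ends the current segment.
      segments.push(current);
      current = "";
    } else {
      current += ch;
    }
  }
  if (quoted) {
    throw new Error("unterminated backtick in path: " + path);
  }
  segments.push(current);
  return segments;
}

const literalDot = splitFieldPath("literal.`a.b`");
const escapedNames = splitFieldPath("`meta-data`.`user-id`");
const plainNested = splitFieldPath("MetaData.userId");
```

Under this reading, ``literal.`a.b` `` names the child `a.b` of the struct `literal`, rather than a three-level path.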
--- a/nodejs/lancedb/connection.ts
+++ b/nodejs/lancedb/connection.ts
@@ -84,6 +84,20 @@ export interface CreateTableOptions {
 }

 export interface OpenTableOptions {
+  /**
+   * Open the table scoped to this branch instead of the default branch.
+   *
+   * Reads and writes on the returned table operate in the branch's context.
+   */
+  branch?: string;
+  /**
+   * Open the table pinned to this version, producing a read-only view.
+   *
+   * Composes with {@link OpenTableOptions.branch}: when both are set, opens
+   * that branch at the version; otherwise opens `main` at the version. Call
+   * `checkoutLatest` to return to a writable state.
+   */
+  version?: number;
  /**
   * Configuration for object storage.
   *
@@ -483,7 +497,20 @@ export class LocalConnection extends Connection {
      options?.indexCacheSize,
    );

-    return new LocalTable(innerTable);
+    let table: Table = new LocalTable(innerTable);
+    // "main" is the default branch, so treat it as no branch. On a real branch,
+    // scope and pin in one step (yielding "version V of branch B"); otherwise
+    // pin the version, if any, against main.
+    const branch =
+      options?.branch != null && options.branch !== "main"
+        ? options.branch
+        : undefined;
+    if (branch != null) {
+      table = await (await table.branches()).checkout(branch, options?.version);
+    } else if (options?.version != null) {
+      await table.checkout(options.version);
+    }
+    return table;
  }

  async cloneTable(
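Review note: the branch and version handling added to `LocalConnection.openTable` reduces to a small decision table. The sketch below restates it as a pure function so the cases are easy to check by eye. `resolveOpenTarget` and `OpenTarget` are hypothetical names introduced for this note, and the `readOnly` flag is an assumption drawn from the doc comment stating that a pinned version produces a read-only view.

```typescript
// Hypothetical restatement of the openTable branch/version logic; names are invented.
interface OpenTarget {
  branch: string | undefined; // undefined means the default branch, main
  version: number | undefined; // undefined means the branch's latest
  readOnly: boolean; // assumed: pinned versions are read-only
}

function resolveOpenTarget(options?: {
  branch?: string;
  version?: number;
}): OpenTarget {
  // "main" is the default branch, so it is treated as no branch at all.
  const branch =
    options?.branch != null && options.branch !== "main"
      ? options.branch
      : undefined;
  const version = options?.version ?? undefined;
  return { branch, version, readOnly: version != null };
}

const latestMain = resolveOpenTarget();
const pinnedMain = resolveOpenTarget({ branch: "main", version: 2 });
const branchHead = resolveOpenTarget({ branch: "exp" });
const pinnedBranch = resolveOpenTarget({ branch: "exp", version: 2 });
```

In the real code path a non-`main` branch goes through `branches().checkout(branch, version)` in one step, while a version without a branch goes through `table.checkout(version)`.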
--- a/nodejs/lancedb/index.ts
+++ b/nodejs/lancedb/index.ts
@@ -38,10 +38,12 @@ export {
  FragmentSummaryStats,
  Tags,
  TagContents,
+  BranchContents,
  MergeResult,
  AddResult,
  AddColumnsResult,
  AlterColumnsResult,
+  UpdateFieldMetadataResult,
  DeleteResult,
  DropColumnsResult,
  UpdateResult,
@@ -110,6 +112,7 @@ export {

 export {
  Table,
+  Branches,
  AddDataOptions,
  UpdateOptions,
  OptimizeOptions,
@@ -117,6 +120,7 @@ export {
  WriteProgress,
  LsmWriteSpec,
  ColumnAlteration,
+  FieldMetadataUpdate,
 } from "./table";

 export {
--- a/nodejs/lancedb/indices.ts
+++ b/nodejs/lancedb/indices.ts
@@ -702,6 +702,17 @@ export class Index {
    return new Index(LanceDbIndex.labelList());
  }

+  /**
+   * Create an FM-Index.
+   *
+   * An FM-Index is a scalar index on string or binary columns that accelerates
+   * substring search, i.e. `contains(col, 'needle')`. Unlike the tokenized
+   * full-text-search index, it matches arbitrary substrings of the raw bytes.
+   */
+  static fm() {
+    return new Index(LanceDbIndex.fm());
+  }
+
  /**
   * Create a full text search index
   *
--- a/nodejs/lancedb/table.ts
+++ b/nodejs/lancedb/table.ts
@@ -25,13 +25,16 @@ import {
  AddColumnsSql,
  AddResult,
  AlterColumnsResult,
+  BranchContents,
  DeleteResult,
  DropColumnsResult,
  IndexConfig,
  IndexStatistics,
+  Branches as NativeBranches,
  OptimizeStats,
  TableStatistics,
  Tags,
+  UpdateFieldMetadataResult,
  UpdateResult,
  Table as _NativeTable,
 } from "./native";
@@ -508,6 +511,18 @@ export abstract class Table {
  abstract alterColumns(
    columnAlterations: ColumnAlteration[],
  ): Promise<AlterColumnsResult>;
+
+  /**
+   * Update per-field (column) metadata.
+   * @param {FieldMetadataUpdate[]} updates One or more per-field updates. Each
+   * update's metadata is merged into the field's existing metadata by default;
+   * a value of `null` deletes that key, and `replace: true` swaps the whole map.
+   * @returns {Promise<UpdateFieldMetadataResult>} resolves to a result whose `version` is the new table version.
+   */
+  abstract updateFieldMetadata(
+    updates: FieldMetadataUpdate[],
+  ): Promise<UpdateFieldMetadataResult>;
+
  /**
   * Drop one or more columns from the dataset
   *
@@ -640,6 +655,14 @@ export abstract class Table {
   */
  abstract tags(): Promise<Tags>;

+  /**
+   * Get the branch manager for this table.
+   *
+   * Branches are isolated, writable lines of history forked from another
+   * branch (or version). Writes on a branch do not affect `main`.
+   */
+  abstract branches(): Promise<Branches>;
+
  /**
   * Restore the table to the currently checked out version
   *
@@ -1037,6 +1060,12 @@ export class LocalTable extends Table {
    return await this.inner.alterColumns(processedAlterations);
  }

+  async updateFieldMetadata(
+    updates: FieldMetadataUpdate[],
+  ): Promise<UpdateFieldMetadataResult> {
+    return await this.inner.updateFieldMetadata(updates);
+  }
+
  async dropColumns(columnNames: string[]): Promise<DropColumnsResult> {
    return await this.inner.dropColumns(columnNames);
  }
@@ -1089,6 +1118,10 @@ export class LocalTable extends Table {
    return await this.inner.tags();
  }

+  async branches(): Promise<Branches> {
+    return new Branches(await this.inner.branches());
+  }
+
  async optimize(options?: Partial<OptimizeOptions>): Promise<OptimizeStats> {
    let cleanupOlderThanMs;
    if (
@@ -1203,3 +1236,73 @@ export interface ColumnAlteration {
  /** Set the new nullability. Note that a nullable column cannot be made non-nullable. */
  nullable?: boolean;
 }
+
+/** A per-field metadata update, addressed by dot-path. */
+export interface FieldMetadataUpdate {
+  /**
+   * Dot-separated path to the field. For a top-level column this is just its
+   * name; for a nested field it's the names from the outermost field down to it, joined by dots, e.g. "a.b.c".
+   */
+  path: string;
+  /**
+   * Metadata key/value pairs. Merged into the field's existing metadata by
+   * default; a value of `null` deletes that key.
+   */
+  metadata: Record<string, string | null>;
+  /** If true, replace the field's entire metadata map instead of merging. */
+  replace?: boolean;
+}
+
+/**
+ * Branch manager for a {@link Table}.
+ *
+ * Unlike tags, `create` and `checkout` return a new {@link Table} handle scoped
+ * to the branch; writes on it do not affect `main`.
+ */
+export class Branches {
+  #inner: NativeBranches;
+
+  /**
+   * Construct a Branches manager. Internal use only.
+   * @hidden
+   */
+  constructor(inner: NativeBranches) {
+    this.#inner = inner;
+  }
+
+  /** List all branches, mapping name to branch metadata. */
+  async list(): Promise<Record<string, BranchContents>> {
+    return await this.#inner.list();
+  }
+
+  /**
+   * Create a branch and return a handle scoped to it.
+   *
+   * @param name Name of the new branch.
+   * @param fromRef Source branch to fork from. Defaults to `main`.
+   * @param fromVersion A specific version on `fromRef`. Defaults to latest.
+   */
+  async create(
+    name: string,
+    fromRef?: string,
+    fromVersion?: number,
+  ): Promise<Table> {
+    return new LocalTable(await this.#inner.create(name, fromRef, fromVersion));
+  }
+
+  /**
+   * Check out an existing branch and return a handle scoped to it.
+   *
+   * With `version` set, the returned handle is pinned to that version of the
+   * branch (a read-only, detached view); otherwise it tracks the branch's
+   * latest and stays writable.
+   */
+  async checkout(name: string, version?: number): Promise<Table> {
+    return new LocalTable(await this.#inner.checkout(name, version));
+  }
+
+  /** Delete a branch. */
+  async delete(name: string): Promise<void> {
+    return await this.#inner.delete(name);
+  }
+}
--- a/nodejs/npm/darwin-arm64/package.json
+++ b/nodejs/npm/darwin-arm64/package.json
@@ -1,6 +1,6 @@
 {
 	"name": "@lancedb/lancedb-darwin-arm64",
-	"version": "0.30.0-beta.1",
+	"version": "0.30.1-beta.2",
 	"os": ["darwin"],
 	"cpu": ["arm64"],
 	"main": "lancedb.darwin-arm64.node",
--- a/nodejs/npm/linux-arm64-gnu/package.json
+++ b/nodejs/npm/linux-arm64-gnu/package.json
@@ -1,6 +1,6 @@
 {
 	"name": "@lancedb/lancedb-linux-arm64-gnu",
-	"version": "0.30.0-beta.1",
+	"version": "0.30.1-beta.2",
 	"os": ["linux"],
 	"cpu": ["arm64"],
 	"main": "lancedb.linux-arm64-gnu.node",
--- a/nodejs/npm/linux-arm64-musl/package.json
+++ b/nodejs/npm/linux-arm64-musl/package.json
@@ -1,6 +1,6 @@
 {
 	"name": "@lancedb/lancedb-linux-arm64-musl",
-	"version": "0.30.0-beta.1",
+	"version": "0.30.1-beta.2",
 	"os": ["linux"],
 	"cpu": ["arm64"],
 	"main": "lancedb.linux-arm64-musl.node",
--- a/nodejs/npm/linux-x64-gnu/package.json
+++ b/nodejs/npm/linux-x64-gnu/package.json
@@ -1,6 +1,6 @@
 {
 	"name": "@lancedb/lancedb-linux-x64-gnu",
-	"version": "0.30.0-beta.1",
+	"version": "0.30.1-beta.2",
 	"os": ["linux"],
 	"cpu": ["x64"],
 	"main": "lancedb.linux-x64-gnu.node",
--- a/nodejs/npm/linux-x64-musl/package.json
+++ b/nodejs/npm/linux-x64-musl/package.json
@@ -1,6 +1,6 @@
 {
 	"name": "@lancedb/lancedb-linux-x64-musl",
-	"version": "0.30.0-beta.1",
+	"version": "0.30.1-beta.2",
 	"os": ["linux"],
 	"cpu": ["x64"],
 	"main": "lancedb.linux-x64-musl.node",
--- a/nodejs/npm/win32-arm64-msvc/package.json
+++ b/nodejs/npm/win32-arm64-msvc/package.json
@@ -1,6 +1,6 @@
 {
  "name": "@lancedb/lancedb-win32-arm64-msvc",
-  "version": "0.30.0-beta.1",
+  "version": "0.30.1-beta.2",
  "os": [
    "win32"
  ],
--- a/nodejs/npm/win32-x64-msvc/package.json
+++ b/nodejs/npm/win32-x64-msvc/package.json
@@ -1,6 +1,6 @@
 {
 	"name": "@lancedb/lancedb-win32-x64-msvc",
-	"version": "0.30.0-beta.1",
+	"version": "0.30.1-beta.2",
 	"os": ["win32"],
 	"cpu": ["x64"],
 	"main": "lancedb.win32-x64-msvc.node",
--- a/nodejs/package-lock.json
+++ b/nodejs/package-lock.json
@@ -1,12 +1,12 @@
 {
  "name": "@lancedb/lancedb",
-  "version": "0.30.0-beta.1",
+  "version": "0.30.1-beta.2",
  "lockfileVersion": 3,
  "requires": true,
  "packages": {
    "": {
      "name": "@lancedb/lancedb",
-      "version": "0.30.0-beta.1",
+      "version": "0.30.1-beta.2",
      "cpu": [
        "x64",
        "arm64"
@@ -26,7 +26,7 @@
        "@aws-sdk/client-s3": "3.1003.0",
        "@biomejs/biome": "^1.7.3",
        "@jest/globals": "^29.7.0",
-        "@napi-rs/cli": "3.5.1",
+        "@napi-rs/cli": "3.7.0",
        "@types/axios": "^0.14.0",
        "@types/jest": "^29.1.2",
        "@types/node": "22.7.4",
@@ -2942,9 +2942,9 @@
      }
    },
    "node_modules/@napi-rs/cli": {
-      "version": "3.5.1",
-      "resolved": "https://registry.npmjs.org/@napi-rs/cli/-/cli-3.5.1.tgz",
-      "integrity": "sha512-XBfLQRDcB3qhu6bazdMJsecWW55kR85l5/k0af9BIBELXQSsCFU0fzug7PX8eQp6vVdm7W/U3z6uP5WmITB2Gw==",
+      "version": "3.7.0",
+      "resolved": "https://registry.npmjs.org/@napi-rs/cli/-/cli-3.7.0.tgz",
+      "integrity": "sha512-3d3+rmxlOIV/G1zPWeX4PCxuYnhcCQM2BvY9rtimC8RO0dFR9gtYP+Grov+WoduZtfWRj5N1XvytWeRxxCk5zw==",
      "dev": true,
      "license": "MIT",
      "dependencies": {
@@ -2954,7 +2954,7 @@
        "@octokit/rest": "^22.0.1",
        "clipanion": "^4.0.0-rc.4",
        "colorette": "^2.0.20",
-        "emnapi": "^1.7.1",
+        "emnapi": "^1.10.0",
        "es-toolkit": "^1.41.0",
        "js-yaml": "^4.1.0",
        "obug": "^2.0.0",
--- a/nodejs/package.json
+++ b/nodejs/package.json
@@ -11,7 +11,7 @@
    "ann"
  ],
  "private": false,
-  "version": "0.30.0-beta.1",
+  "version": "0.30.1-beta.2",
  "main": "dist/index.js",
  "exports": {
    ".": "./dist/index.js",
@@ -43,7 +43,7 @@
    "@aws-sdk/client-s3": "3.1003.0",
    "@biomejs/biome": "^1.7.3",
    "@jest/globals": "^29.7.0",
-    "@napi-rs/cli": "3.5.1",
+    "@napi-rs/cli": "3.7.0",
    "@types/axios": "^0.14.0",
    "@types/jest": "^29.1.2",
    "@types/node": "22.7.4",
--- a/nodejs/pnpm-lock.yaml
+++ b/nodejs/pnpm-lock.yaml
@@ -31,8 +31,8 @@ importers:
        specifier: ^29.7.0
        version: 29.7.0
      '@napi-rs/cli':
-        specifier: 3.5.1
-        version: 3.5.1(@emnapi/core@1.10.0)(@emnapi/runtime@1.10.0)(@types/node@22.7.4)
+        specifier: 3.7.0
+        version: 3.7.0(@emnapi/core@1.10.0)(@emnapi/runtime@1.10.0)(@types/node@22.7.4)
      '@types/axios':
        specifier: ^0.14.0
        version: 0.14.4
@@ -887,8 +887,8 @@ packages:
  '@jridgewell/trace-mapping@0.3.31':
    resolution: {integrity: sha512-zzNR+SdQSDJzc8joaeP8QQoCQr8NuYx2dIIytl1QeBEZHJ9uW6hebsrYgbz8hJwUQao3TWCMtmfV8Nu1twOLAw==}

-  '@napi-rs/cli@3.5.1':
-    resolution: {integrity: sha512-XBfLQRDcB3qhu6bazdMJsecWW55kR85l5/k0af9BIBELXQSsCFU0fzug7PX8eQp6vVdm7W/U3z6uP5WmITB2Gw==}
+  '@napi-rs/cli@3.7.0':
+    resolution: {integrity: sha512-3d3+rmxlOIV/G1zPWeX4PCxuYnhcCQM2BvY9rtimC8RO0dFR9gtYP+Grov+WoduZtfWRj5N1XvytWeRxxCk5zw==}
    engines: {node: '>= 16'}
    hasBin: true
    peerDependencies:
@@ -4582,7 +4582,7 @@ snapshots:
      '@jridgewell/resolve-uri': 3.1.2
      '@jridgewell/sourcemap-codec': 1.5.5

-  '@napi-rs/cli@3.5.1(@emnapi/core@1.10.0)(@emnapi/runtime@1.10.0)(@types/node@22.7.4)':
+  '@napi-rs/cli@3.7.0(@emnapi/core@1.10.0)(@emnapi/runtime@1.10.0)(@types/node@22.7.4)':
    dependencies:
      '@inquirer/prompts': 8.4.3(@types/node@22.7.4)
      '@napi-rs/cross-toolchain': 1.0.3(@emnapi/core@1.10.0)(@emnapi/runtime@1.10.0)
--- a/nodejs/src/index.rs
+++ b/nodejs/src/index.rs
@@ -4,7 +4,7 @@
 use std::sync::Mutex;

 use lancedb::index::Index as LanceDbIndex;
-use lancedb::index::scalar::{BTreeIndexBuilder, FtsIndexBuilder};
+use lancedb::index::scalar::{BTreeIndexBuilder, FmIndexBuilder, FtsIndexBuilder};
 use lancedb::index::vector::{
    IvfFlatIndexBuilder, IvfHnswPqIndexBuilder, IvfHnswSqIndexBuilder, IvfPqIndexBuilder,
    IvfRqIndexBuilder,
@@ -143,6 +143,13 @@ impl Index {
        }
    }

+    #[napi(factory)]
+    pub fn fm() -> Self {
+        Self {
+            inner: Mutex::new(Some(LanceDbIndex::Fm(FmIndexBuilder::default()))),
+        }
+    }
+
    #[napi(factory)]
    #[allow(clippy::too_many_arguments)]
    pub fn fts(
--- a/nodejs/src/table.rs
+++ b/nodejs/src/table.rs
@@ -5,8 +5,9 @@ use std::collections::HashMap;

 use lancedb::ipc::{ipc_file_to_batches, ipc_file_to_schema};
 use lancedb::table::{
-    AddDataMode, ColumnAlteration as LanceColumnAlteration, Duration, NewColumnTransform,
-    OptimizeAction, OptimizeOptions, Table as LanceDbTable,
+    AddDataMode, ColumnAlteration as LanceColumnAlteration, Duration,
+    FieldMetadataUpdate as LanceFieldMetadataUpdate, NewColumnTransform, OptimizeAction,
+    OptimizeOptions, Ref, Table as LanceDbTable,
 };
 use napi::bindgen_prelude::*;
 use napi::threadsafe_function::{ThreadsafeFunction, ThreadsafeFunctionCallMode};
@@ -355,6 +356,23 @@ impl Table {
        Ok(res.into())
    }

+    #[napi(catch_unwind)]
+    pub async fn update_field_metadata(
+        &self,
+        updates: Vec<FieldMetadataUpdate>,
+    ) -> napi::Result<UpdateFieldMetadataResult> {
+        let updates = updates
+            .into_iter()
+            .map(LanceFieldMetadataUpdate::from)
+            .collect::<Vec<_>>();
+        let res = self
+            .inner_ref()?
+            .update_field_metadata(&updates)
+            .await
+            .default_error()?;
+        Ok(res.into())
+    }
+
    #[napi(catch_unwind)]
    pub async fn drop_columns(&self, columns: Vec<String>) -> napi::Result<DropColumnsResult> {
        let col_refs = columns.iter().map(String::as_str).collect::<Vec<_>>();
@@ -460,6 +478,13 @@ impl Table {
        })
    }

+    #[napi(catch_unwind)]
+    pub async fn branches(&self) -> napi::Result<Branches> {
+        Ok(Branches {
+            inner: self.inner_ref()?.clone(),
+        })
+    }
+
    #[napi(catch_unwind)]
    pub async fn optimize(
        &self,
@@ -747,6 +772,29 @@ pub struct ColumnAlteration {
    pub nullable: Option<bool>,
 }

+/// A per-field metadata update, addressed by dot-path. Merges into the field's
+/// existing metadata by default; a `null` value deletes a key, and `replace`
+/// swaps the field's entire metadata map.
+#[napi(object)]
+pub struct FieldMetadataUpdate {
+    /// Dot-separated path to the field (e.g. "embedding" or "a.b.c").
+    pub path: String,
+    /// Metadata keys to set; a `null` value deletes that key.
+    pub metadata: HashMap<String, Option<String>>,
+    /// If true, replace the field's entire metadata map instead of merging.
+    pub replace: Option<bool>,
+}
+
+impl From<FieldMetadataUpdate> for LanceFieldMetadataUpdate {
+    fn from(js: FieldMetadataUpdate) -> Self {
+        Self {
+            path: js.path,
+            metadata: js.metadata,
+            replace: js.replace.unwrap_or(false),
+        }
+    }
+}
+
 impl TryFrom<ColumnAlteration> for LanceColumnAlteration {
    type Error = String;
    fn try_from(js: ColumnAlteration) -> std::result::Result<Self, Self::Error> {
@@ -797,9 +845,6 @@ pub struct IndexStatistics {
    pub distance_type: Option<String>,
    /// The number of parts this index is split into.
    pub num_indices: Option<u32>,
-    /// The KMeans loss value of the index,
-    /// it is only present for vector indices.
-    pub loss: Option<f64>,
 }
 impl From<lancedb::index::IndexStatistics> for IndexStatistics {
    fn from(value: lancedb::index::IndexStatistics) -> Self {
@@ -809,7 +854,6 @@ impl From<lancedb::index::IndexStatistics> for IndexStatistics {
            index_type: value.index_type.to_string(),
            distance_type: value.distance_type.map(|d| d.to_string()),
            num_indices: value.num_indices,
-            loss: value.loss,
        }
    }
 }
@@ -987,6 +1031,19 @@ impl From<lancedb::table::AlterColumnsResult> for AlterColumnsResult {
    }
 }

+#[napi(object)]
+pub struct UpdateFieldMetadataResult {
+    pub version: i64,
+}
+
+impl From<lancedb::table::UpdateFieldMetadataResult> for UpdateFieldMetadataResult {
+    fn from(value: lancedb::table::UpdateFieldMetadataResult) -> Self {
+        Self {
+            version: value.version as i64,
+        }
+    }
+}
+
 #[napi(object)]
 pub struct DropColumnsResult {
    pub version: i64,
@@ -1006,6 +1063,13 @@ pub struct TagContents {
    pub manifest_size: i64,
 }

+#[napi]
+pub struct BranchContents {
+    pub parent_branch: Option<String>,
+    pub parent_version: i64,
+    pub manifest_size: i64,
+}
+
 #[napi]
 pub struct Tags {
    inner: LanceDbTable,
@@ -1074,3 +1138,75 @@ impl Tags {
            .default_error()
    }
 }
+
+#[napi]
+pub struct Branches {
+    inner: LanceDbTable,
+}
+
+#[napi]
+impl Branches {
+    #[napi]
+    pub async fn list(&self) -> napi::Result<HashMap<String, BranchContents>> {
+        let branches = self.inner.list_branches().await.default_error()?;
+        let result = branches
+            .into_iter()
+            .map(|(k, v)| {
+                (
+                    k,
+                    BranchContents {
+                        parent_branch: v.parent_branch,
+                        parent_version: v.parent_version as i64,
+                        manifest_size: v.manifest_size as i64,
+                    },
+                )
+            })
+            .collect();
+        Ok(result)
+    }
+
+    #[napi]
+    pub async fn create(
+        &self,
+        name: String,
+        from_ref: Option<String>,
+        from_version: Option<i64>,
+    ) -> napi::Result<Table> {
+        let from_ref = from_ref.filter(|b| b != "main");
+        let from_version = from_version
+            .map(|v| {
+                u64::try_from(v).map_err(|_| {
+                    napi::Error::from_reason("from_version must be a non-negative integer")
+                })
+            })
+            .transpose()?;
+        let from = Ref::Version(from_ref, from_version);
+        let table = self
+            .inner
+            .create_branch(&name, from)
+            .await
+            .default_error()?;
+        Ok(Table::new(table))
+    }
+
+    #[napi]
+    pub async fn checkout(&self, name: String, version: Option<i64>) -> napi::Result<Table> {
+        let version = version
+            .map(|v| {
+                u64::try_from(v)
+                    .map_err(|_| napi::Error::from_reason("version must be a non-negative integer"))
+            })
+            .transpose()?;
+        let table = self
+            .inner
+            .checkout_branch(&name, version)
+            .await
+            .default_error()?;
+        Ok(Table::new(table))
+    }
+
+    #[napi]
+    pub async fn delete(&self, name: String) -> napi::Result<()> {
+        self.inner.delete_branch(&name).await.default_error()
+    }
+}
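Review note: the doc comment on `FieldMetadataUpdate` above describes three behaviors (merge by default, `null` deletes a key, `replace` swaps the whole map). The sketch below models those semantics in plain Python for one field. `apply_field_metadata_update` is a hypothetical helper, not part of LanceDB or Lance, and dropping `null` entries when `replace` is set is an assumption the diff does not confirm.

```python
from typing import Dict, Optional


def apply_field_metadata_update(
    existing: Dict[str, str],
    metadata: Dict[str, Optional[str]],
    replace: bool = False,
) -> Dict[str, str]:
    """Hypothetical model of one per-field update; not the Lance implementation."""
    # replace=True starts from an empty map; otherwise merge into a copy.
    result: Dict[str, str] = {} if replace else dict(existing)
    for key, value in metadata.items():
        if value is None:
            # A null value deletes the key (a no-op if the key is absent).
            result.pop(key, None)
        else:
            result[key] = value
    return result


current = {"model": "clip", "dim": "512"}
# Default merge: overwrite "dim", delete "model".
merged = apply_field_metadata_update(current, {"dim": "768", "model": None})
# Replace: the old keys are discarded entirely.
replaced = apply_field_metadata_update(current, {"source": "v2"}, replace=True)
```

The helper copies its input, so the caller's map is left untouched.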
--- a/python/.bumpversion.toml
+++ b/python/.bumpversion.toml
@@ -1,5 +1,5 @@
 [tool.bumpversion]
-current_version = "0.33.1-beta.0"
+current_version = "0.33.1-beta.2"
 parse = """(?x)
    (?P<major>0|[1-9]\\d*)\\.
    (?P<minor>0|[1-9]\\d*)\\.
--- a/python/Cargo.toml
+++ b/python/Cargo.toml
@@ -1,6 +1,6 @@
 [package]
 name = "lancedb-python"
-version = "0.33.1-beta.0"
+version = "0.33.1-beta.2"
 publish = false
 edition.workspace = true
 description = "Python bindings for LanceDB"
--- a/python/python/lancedb/__init__.py
+++ b/python/python/lancedb/__init__.py
@@ -315,6 +315,15 @@ def deserialize_conn(
            manifest_enabled=parsed.get("manifest_enabled", False),
            namespace_client_properties=parsed.get("namespace_client_properties"),
        )
+    elif connection_type == "remote":
+        return RemoteDBConnection(
+            parsed["db_url"],
+            parsed["api_key"],
+            parsed.get("region", "us-east-1"),
+            host_override=parsed.get("host_override"),
+            client_config=parsed.get("client_config"),
+            storage_options=storage_options,
+        )
    else:
        raise ValueError(f"Unknown connection_type: {connection_type}")

--- a/python/python/lancedb/_lancedb.pyi
+++ b/python/python/lancedb/_lancedb.pyi
@@ -10,6 +10,7 @@ from .index import (
    IvfSq,
    Bitmap,
    LabelList,
+    Fm,
    HnswPq,
    HnswSq,
    HnswFlat,
@@ -47,6 +48,7 @@ class PyExpr:
    def lower(self) -> "PyExpr": ...
    def upper(self) -> "PyExpr": ...
    def contains(self, substr: "PyExpr") -> "PyExpr": ...
+    def isin(self, values: List["PyExpr"]) -> "PyExpr": ...
    def cast(self, data_type: pa.DataType) -> "PyExpr": ...
    def to_sql(self) -> str: ...

@@ -186,6 +188,7 @@ class Table:
            BTree,
            Bitmap,
            LabelList,
+            Fm,
            FTS,
        ],
        replace: Optional[bool],
@@ -208,6 +211,9 @@ class Table:
    async def alter_columns(
        self, columns: list[dict[str, Any]]
    ) -> AlterColumnsResult: ...
+    async def update_field_metadata(
+        self, updates: list[dict[str, Any]]
+    ) -> UpdateFieldMetadataResult: ...
    async def optimize(
        self,
        *,
@@ -223,6 +229,9 @@ class Table:
    async def close_lsm_writers(self) -> None: ...
    @property
    def tags(self) -> Tags: ...
+    @property
+    def branches(self) -> Branches: ...
+    def current_branch(self) -> Optional[str]: ...
    def query(self) -> Query: ...
    def take_offsets(self, offsets: list[int]) -> TakeQuery: ...
    def take_row_ids(self, row_ids: list[int]) -> TakeQuery: ...
@@ -235,6 +244,17 @@ class Tags:
    async def delete(self, tag: str): ...
    async def update(self, tag: str, version: int): ...

+class Branches:
+    async def list(self) -> Dict[str, Any]: ...
+    async def create(
+        self,
+        name: str,
+        from_ref: Optional[str] = None,
+        from_version: Optional[int] = None,
+    ) -> Table: ...
+    async def checkout(self, name: str, version: Optional[int] = None) -> Table: ...
+    async def delete(self, name: str) -> None: ...
+
 class IndexConfig:
    name: str
    index_type: str
@@ -460,6 +480,9 @@ class AddColumnsResult:
 class AlterColumnsResult:
    version: int

+class UpdateFieldMetadataResult:
+    version: int
+
 class DropColumnsResult:
    version: int

--- a/python/python/lancedb/background_loop.py
+++ b/python/python/lancedb/background_loop.py
@@ -2,6 +2,7 @@
 # SPDX-FileCopyrightText: Copyright The LanceDB Authors

 import asyncio
+import concurrent.futures
 import os
 import threading
 import warnings
@@ -37,6 +38,24 @@ class BackgroundEventLoop:

 LOOP = BackgroundEventLoop()

+
+def _new_embedding_executor() -> concurrent.futures.ThreadPoolExecutor:
+    return concurrent.futures.ThreadPoolExecutor(thread_name_prefix="lancedb-embedding")
+
+
+# Embedding functions can block for a long time -- a heavy local model or an
+# HTTP request to a remote embeddings API. Running them on asyncio's default
+# executor lets them starve the unrelated blocking I/O that shares that pool,
+# so they get a dedicated one. See
+# https://github.com/lancedb/lancedb/issues/3310.
+_EMBEDDING_EXECUTOR = _new_embedding_executor()
+
+
+def embedding_executor() -> concurrent.futures.ThreadPoolExecutor:
+    """Return the executor dedicated to running blocking embedding calls."""
+    return _EMBEDDING_EXECUTOR
+
+
 _FORK_WARNED = False


@@ -47,6 +66,12 @@ def _reset_after_fork():
    # the new state. The Rust-side tokio runtime is reset analogously by a
    # pthread_atfork hook installed in the _lancedb extension.
    LOOP._start()
+    # The embedding executor's worker threads are dead in the child as well.
+    # Replace it with a fresh pool (threads are spawned lazily, so this is
+    # cheap); we don't shut down the old one, since joining its dead workers
+    # could hang.
+    global _EMBEDDING_EXECUTOR
+    _EMBEDDING_EXECUTOR = _new_embedding_executor()
    global _FORK_WARNED
    if not _FORK_WARNED:
        _FORK_WARNED = True
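Review note: the comment in `background_loop.py` above explains why embedding calls get their own pool instead of asyncio's default executor. A minimal sketch of that pattern, using only the standard library, might look like the following. `fake_embed` and `embed_async` are hypothetical stand-ins. Only the `thread_name_prefix` value is taken from the diff.

```python
import asyncio
import concurrent.futures
import threading
from typing import List, Tuple

# Dedicated pool; worker threads are spawned lazily on first submit.
_executor = concurrent.futures.ThreadPoolExecutor(
    thread_name_prefix="lancedb-embedding"
)


def fake_embed(texts: List[str]) -> Tuple[str, List[List[float]]]:
    # Stand-in for a blocking embedding call (local model or HTTP request).
    # Returns the worker thread name so the caller can see where it ran.
    name = threading.current_thread().name
    return name, [[float(len(t))] for t in texts]


async def embed_async(texts: List[str]) -> Tuple[str, List[List[float]]]:
    loop = asyncio.get_running_loop()
    # Passing an explicit executor keeps this work off asyncio's default pool,
    # so unrelated blocking I/O scheduled there is not starved.
    return await loop.run_in_executor(_executor, fake_embed, texts)


thread_name, vectors = asyncio.run(embed_async(["a", "bcd"]))
```

The same executor can be swapped for a fresh one after a fork, as `_reset_after_fork` does, because nothing outside the module holds a reference to the old pool's threads.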
--- a/python/python/lancedb/db.py
+++ b/python/python/lancedb/db.py
@@ -416,6 +416,8 @@ class DBConnection(EnforceOverrides):
        namespace_path: Optional[List[str]] = None,
        storage_options: Optional[Dict[str, str]] = None,
        index_cache_size: Optional[int] = None,
+        branch: Optional[str] = None,
+        version: Optional[int] = None,
    ) -> Table:
        """Open a Lance Table in the database.

@@ -444,6 +446,14 @@ class DBConnection(EnforceOverrides):
            connection will be inherited by the table, but can be overridden here.
            See available options at
            <https://docs.lancedb.com/storage/>
+        branch: str, optional
+            If provided, open a handle scoped to this branch instead of the
+            default branch. Reads and writes operate in the branch's context.
+        version: int, optional
+            If provided, open the table pinned to this version, producing a
+            read-only handle. Composes with ``branch``: when both are given,
+            opens that branch at the version; otherwise opens ``main`` at the
+            version. Call ``checkout_latest`` to return to a writable state.

        Returns
        -------
@@ -958,6 +968,8 @@ class LanceDBConnection(DBConnection):
        namespace_path: Optional[List[str]] = None,
        storage_options: Optional[Dict[str, str]] = None,
        index_cache_size: Optional[int] = None,
+        branch: Optional[str] = None,
+        version: Optional[int] = None,
    ) -> LanceTable:
        """Open a table in the database.

@@ -968,6 +980,14 @@ class LanceDBConnection(DBConnection):
        namespace_path: List[str], optional
            The namespace to open the table from.  When non-empty, the
            table is resolved through the directory namespace client.
+        branch: str, optional
+            If provided, open a handle scoped to this branch instead of the
+            default branch. Reads and writes operate in the branch's context.
+        version: int, optional
+            If provided, open the table pinned to this version, producing a
+            read-only handle. Composes with ``branch``: when both are given,
+            opens that branch at the version; otherwise opens ``main`` at the
+            version. Call ``checkout_latest`` to return to a writable state.

        Returns
        -------
@@ -987,20 +1007,26 @@ class LanceDBConnection(DBConnection):
            )

        if namespace_path:
-            return self._namespace_conn().open_table(
+            tbl = self._namespace_conn().open_table(
+                name,
+                namespace_path=namespace_path,
+                storage_options=storage_options,
+                index_cache_size=index_cache_size,
+            )
+        else:
+            tbl = LanceTable.open(
+                self,
                name,
                namespace_path=namespace_path,
                storage_options=storage_options,
                index_cache_size=index_cache_size,
            )

-        return LanceTable.open(
-            self,
-            name,
-            namespace_path=namespace_path,
-            storage_options=storage_options,
-            index_cache_size=index_cache_size,
-        )
+        if branch is not None:
+            tbl = tbl.branches.checkout(branch, version)
+        elif version is not None:
+            tbl.checkout(version)
+        return tbl

    def clone_table(
        self,
@@ -1641,6 +1667,8 @@ class AsyncConnection(object):
        location: Optional[str] = None,
        namespace_client: Optional[Any] = None,
        managed_versioning: Optional[bool] = None,
+        branch: Optional[str] = None,
+        version: Optional[int] = None,
    ) -> AsyncTable:
        """Open a Lance Table in the database.

@@ -1676,6 +1704,14 @@ class AsyncConnection(object):
        managed_versioning: bool, optional
            Whether managed versioning is enabled for this table. If provided,
            avoids a redundant describe_table call when namespace_client is set.
+        branch: str, optional
+            If provided, open a handle scoped to this branch instead of the
+            default branch. Reads and writes operate in the branch's context.
+        version: int, optional
+            If provided, open the table pinned to this version, producing a
+            read-only handle. Composes with ``branch``: when both are given,
+            opens that branch at the version; otherwise opens ``main`` at the
+            version. Call ``checkout_latest`` to return to a writable state.

        Returns
        -------
@@ -1692,7 +1728,14 @@ class AsyncConnection(object):
            namespace_client=namespace_client,
            managed_versioning=managed_versioning,
        )
-        return AsyncTable(table)
+        tbl = AsyncTable(table)
+        # "main" is the default branch, so treat it as no branch: remote rejects
+        # every branch checkout (even "main"), and the version still applies.
+        if branch is not None and branch != "main":
+            tbl = await tbl.branches.checkout(branch, version)
+        elif version is not None:
+            await tbl.checkout(version)
+        return tbl

    async def clone_table(
        self,
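Review note: the `branch` / `version` docstrings above describe how the two arguments compose. The sketch below restates the async path's decision table as a pure function so the cases are easy to check. `resolve_open_target` and its return labels are hypothetical and exist only for illustration. The sync `LanceDBConnection.open_table` path in this diff does not special-case `"main"`, so the sketch models `AsyncConnection.open_table` only.

```python
from typing import Optional, Tuple

Target = Tuple[str, Optional[str], Optional[int]]


def resolve_open_target(branch: Optional[str], version: Optional[int]) -> Target:
    """Hypothetical model of the async open_table branch/version handling."""
    # "main" is the default branch, so it is treated the same as no branch.
    if branch is not None and branch != "main":
        # Named branch: check it out, optionally pinned to a version.
        return ("checkout_branch", branch, version)
    if version is not None:
        # No branch (or "main"): pin the default branch to the version.
        return ("checkout_version", None, version)
    # Neither given: stay on the latest version of the default branch.
    return ("latest", None, None)


cases = {
    "neither": resolve_open_target(None, None),
    "version_only": resolve_open_target(None, 7),
    "main_and_version": resolve_open_target("main", 7),
    "branch_only": resolve_open_target("exp", None),
    "branch_and_version": resolve_open_target("exp", 3),
}
```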
--- a/python/python/lancedb/expr.py
+++ b/python/python/lancedb/expr.py
@@ -19,7 +19,7 @@ operators::

 from __future__ import annotations

-from typing import Union
+from typing import Iterable, Union

 import pyarrow as pa

@@ -174,6 +174,11 @@ class Expr:
        """Return True where the string contains *substr*."""
        return Expr(self._inner.contains(_coerce(substr)._inner))

+    def isin(self, values: "Iterable[ExprLike]") -> "Expr":
+        """Return True where the value is one of *values* (SQL ``IN``)."""
+        inner = [_coerce(v)._inner for v in values]
+        return Expr(self._inner.isin(inner))
+
    # ── type cast ────────────────────────────────────────────────────────────

    def cast(self, data_type: Union[str, "pa.DataType"]) -> "Expr":
--- a/python/python/lancedb/index.py
+++ b/python/python/lancedb/index.py
@@ -93,6 +93,20 @@ class LabelList:
    pass


+@dataclass
+class Fm:
+    """Describe an FM-Index configuration.
+
+    `Fm` is a scalar index on string or binary columns that accelerates
+    substring search, i.e. `contains(col, 'needle')`. Unlike the tokenized
+    `FTS` index, it matches arbitrary substrings of the raw bytes.
+
+    Typical candidates are columns such as `url`, `path`, or `content`.
+    """
+
+    pass
+
+
 @dataclass
 class FTS:
    """Describe a FTS index configuration.
@@ -828,4 +842,5 @@ __all__ = [
    "FTS",
    "Bitmap",
    "LabelList",
+    "Fm",
 ]
--- a/python/python/lancedb/namespace.py
+++ b/python/python/lancedb/namespace.py
@@ -144,7 +144,12 @@ def _query_to_namespace_request(
    if query.postfilter is not None:
        prefilter = not query.postfilter

-    k = query.limit if query.limit is not None else 10
+    if query.limit is not None:
+        k = query.limit
+    elif query.vector is None and query.full_text_query is None:
+        k = sys.maxsize
+    else:
+        k = 10

    # Build request kwargs, only including non-None values for optional fields
    # that Pydantic doesn't accept as None
@@ -544,6 +549,8 @@ class LanceNamespaceDBConnection(DBConnection):
        namespace_path: Optional[List[str]] = None,
        storage_options: Optional[Dict[str, str]] = None,
        index_cache_size: Optional[int] = None,
+        branch: Optional[str] = None,
+        version: Optional[int] = None,
    ) -> Table:
        if namespace_path is None:
            namespace_path = []
@@ -562,7 +569,7 @@ class LanceNamespaceDBConnection(DBConnection):
                raise TableNotFoundError(f"Table not found: {'$'.join(table_id)}")
            raise

-        return LanceTable(
+        tbl = LanceTable(
            self,
            name,
            namespace_path=namespace_path,
@@ -570,6 +577,11 @@ class LanceNamespaceDBConnection(DBConnection):
            pushdown_operations=self._namespace_client_pushdown_operations,
            _async=async_table,
        )
+        if branch is not None:
+            tbl = tbl.branches.checkout(branch, version)
+        elif version is not None:
+            tbl.checkout(version)
+        return tbl

    @override
    def drop_table(self, name: str, namespace_path: Optional[List[str]] = None):
@@ -954,7 +966,7 @@ class AsyncLanceNamespaceDBConnection:
        if mode.lower() not in ["create", "overwrite"]:
            raise ValueError("mode must be either 'create' or 'overwrite'")
        validate_table_name(name)
-        return await self._inner.create_table(
+        table = await self._inner.create_table(
            name,
            data,
            schema=schema,
@@ -966,6 +978,11 @@ class AsyncLanceNamespaceDBConnection:
            embedding_functions=embedding_functions,
            storage_options=storage_options,
        )
+        return table._set_namespace_context(
+            namespace_path=namespace_path,
+            namespace_client=self._namespace_client,
+            pushdown_operations=self._namespace_client_pushdown_operations,
+        )

    async def open_table(
        self,
@@ -974,12 +991,14 @@ class AsyncLanceNamespaceDBConnection:
        namespace_path: Optional[List[str]] = None,
        storage_options: Optional[Dict[str, str]] = None,
        index_cache_size: Optional[int] = None,
+        branch: Optional[str] = None,
+        version: Optional[int] = None,
    ) -> AsyncTable:
        """Open an existing table from the namespace."""
        if namespace_path is None:
            namespace_path = []
        try:
-            return await self._inner.open_table(
+            table = await self._inner.open_table(
                name,
                namespace_path=namespace_path,
                storage_options=storage_options,
@@ -990,6 +1009,17 @@ class AsyncLanceNamespaceDBConnection:
                table_id = namespace_path + [name]
                raise TableNotFoundError(f"Table not found: {'$'.join(table_id)}")
            raise
+        # "main" is the default branch, so treat it as no branch (mirrors the
+        # sync remote path); the version still applies.
+        if branch is not None and branch != "main":
+            table = await table.branches.checkout(branch, version)
+        elif version is not None:
+            await table.checkout(version)
+        return table._set_namespace_context(
+            namespace_path=namespace_path,
+            namespace_client=self._namespace_client,
+            pushdown_operations=self._namespace_client_pushdown_operations,
+        )

    async def drop_table(self, name: str, namespace_path: Optional[List[str]] = None):
        """Drop a table from the namespace."""
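Review note: the first hunk of `namespace.py` above changes how `k` is chosen when the query has no explicit limit. Restated as a standalone function (`default_k` is a hypothetical name and the boolean parameters stand in for `query.vector is None` / `query.full_text_query is None`), the rule is: an explicit limit wins, a plain scan is unbounded, and a vector or full-text search keeps the old default of 10.

```python
import sys
from typing import Optional


def default_k(limit: Optional[int], has_vector: bool, has_fts: bool) -> int:
    """Hypothetical restatement of the limit selection in the diff."""
    if limit is not None:
        # An explicit limit always takes precedence.
        return limit
    if not has_vector and not has_fts:
        # Plain scan: effectively unbounded.
        return sys.maxsize
    # Vector or full-text search without a limit keeps the default top-10.
    return 10


plain_scan = default_k(None, has_vector=False, has_fts=False)
vector_search = default_k(None, has_vector=True, has_fts=False)
explicit = default_k(25, has_vector=True, has_fts=True)
```

Before this change the plain-scan case also returned 10, which silently truncated unfiltered queries.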
--- a/python/python/lancedb/permutation.py
+++ b/python/python/lancedb/permutation.py
@@ -3,12 +3,13 @@

 import copy
 import json
+import os

 from deprecation import deprecated
 import pyarrow as pa

 from ._lancedb import async_permutation_builder, PermutationReader
-from .table import LanceTable
+from .table import LanceTable, Table
 from .background_loop import LOOP
 from .util import batch_to_tensor, batch_to_tensor_rows
 from typing import Any, Callable, Iterator, Literal, Optional, TYPE_CHECKING, Union
@@ -354,6 +355,49 @@ class Transforms:
 DEFAULT_BATCH_SIZE = 100


+def _table_to_pickle_state(table: Table) -> dict[str, Any]:
+    from .remote.table import RemoteTable
+
+    if isinstance(table, RemoteTable):
+        return {
+            "kind": "remote",
+            "table": table,
+        }
+
+    if not isinstance(table, LanceTable):
+        raise ValueError(f"Cannot pickle table of type {type(table)!r}")
+
+    base_uri = table._conn.uri
+    if base_uri.startswith("memory://"):
+        return {
+            "kind": "memory",
+            "name": table.name,
+            "data": table.to_arrow(),
+        }
+
+    return {
+        "kind": "local",
+        "name": table.name,
+        "uri": base_uri,
+        "namespace": table._namespace_path,
+        "storage_options": table._conn.storage_options,
+    }
+
+
+def _table_from_pickle_state(state: dict[str, Any]) -> Table:
+    from . import connect
+
+    kind = state["kind"]
+    if kind == "remote":
+        return state["table"]
+    if kind == "memory":
+        return connect("memory://").create_table(state["name"], state["data"])
+    if kind == "local":
+        db = connect(state["uri"], storage_options=state["storage_options"])
+        return db.open_table(state["name"], namespace_path=state["namespace"] or None)
+    raise ValueError(f"Unknown table pickle state kind: {kind}")
+
+
 class Permutation:
    """
    A Permutation is a view of a dataset that can be used as input to model training
@@ -369,15 +413,15 @@ class Permutation:

    def __init__(
        self,
-        base_table: LanceTable,
-        permutation_table: Optional[LanceTable],
+        base_table: Table,
+        permutation_table: Optional[Table],
        split: int,
        selection: dict[str, str],
        batch_size: int,
        transform_fn: Callable[pa.RecordBatch, Any],
        offset: Optional[int] = None,
        limit: Optional[int] = None,
-        connection_factory: Optional[Callable[[str], LanceTable]] = None,
+        connection_factory: Optional[Callable[[str], Table]] = None,
        _reader: Optional[PermutationReader] = None,
    ):
        """
@@ -397,6 +441,7 @@ class Permutation:
        if _reader is None:
            _reader = LOOP.run(self._build_reader())
        self.reader: PermutationReader = _reader
+        self._pid = os.getpid()

    async def _build_reader(self) -> PermutationReader:
        reader = await PermutationReader.from_tables(
@@ -428,29 +473,25 @@ class Permutation:
        return new

    def with_connection_factory(
-        self, connection_factory: Callable[[str], LanceTable]
+        self, connection_factory: Callable[[str], Table]
    ) -> "Permutation":
        """
        Creates a new permutation that will use ``connection_factory`` to reopen
        the base table when this permutation is unpickled in a worker process.

-        The factory is a callable that takes a single argument — the base table
-        name — and returns a [LanceTable]. It must be picklable; the worker
+        The factory is a callable that takes a single argument, the base table
+        name, and returns a LanceDB table. It must be picklable; the worker
        will pickle it via standard ``pickle`` and call it to recover the base
        table. Picklable callables in practice means top-level (module-level)
        functions, ``functools.partial`` of such functions, or instances of
        picklable classes implementing ``__call__``. Lambdas and closures over
        local variables don't pickle with the default protocol.

-        Setting a factory is necessary when the URI alone is not enough to
-        re-open the connection — most importantly for LanceDB Cloud (``db://``)
-        connections, where ``api_key`` and ``region`` aren't recoverable from
-        the connection object after construction.
-
-        For local file or cloud-storage paths the factory is optional: if not
-        set, ``__getstate__`` falls back to capturing
-        ``(uri, storage_options, namespace_path)`` and re-opening via
-        ``lancedb.connect(uri, storage_options=...)``.
+        A factory is optional for normal local and remote LanceDB connections:
+        if not set, ``__getstate__`` captures the table's own picklable reopen
+        state. Use a factory when that default state is not enough, for example
+        when credentials should be loaded from the worker environment instead
+        of being embedded in the pickle.

        Examples
        --------
@@ -508,7 +549,7 @@ class Permutation:
        return new

    @classmethod
-    def identity(cls, table: LanceTable) -> "Permutation":
+    def identity(cls, table: Table) -> "Permutation":
        """
        Creates an identity permutation for the given table.
        """
@@ -517,8 +558,8 @@ class Permutation:
    @classmethod
    def from_tables(
        cls,
-        base_table: LanceTable,
-        permutation_table: Optional[LanceTable] = None,
+        base_table: Table,
+        permutation_table: Optional[Table] = None,
        split: Optional[Union[str, int]] = None,
    ) -> "Permutation":
        """
@@ -594,11 +635,10 @@ class Permutation:

        The base table is captured either via a user-supplied
        ``connection_factory`` (see [with_connection_factory]) or, as a
-        fallback, by introspecting ``(uri, storage_options, namespace_path)``
-        on the connection. The permutation table — always an in-memory
-        LanceDB table — is captured as a pyarrow Table (which pickles via
-        Arrow IPC natively). The reader is dropped from the wire format;
-        ``__setstate__`` rebuilds it from the restored tables.
+        fallback, by the table's own picklable reopen state. The permutation
+        table is captured as a pyarrow Table (which pickles via Arrow IPC
+        natively). The reader is dropped from the wire format and rebuilt
+        lazily on first use.
        """
        permutation_data: Optional[pa.Table] = None
        if self.permutation_table is not None:
@@ -622,39 +662,9 @@ class Permutation:
            # namespace from the existing connection.
            return common

-        # URI-introspection fallback: only viable for native (OSS) connections
-        # where (uri, storage_options) is enough to reopen. Remote / cloud
-        # connections don't expose recoverable api_key / region — those users
-        # must call with_connection_factory().
-        try:
-            base_uri = self.base_table._conn.uri
-            storage_options = self.base_table._conn.storage_options
-        except AttributeError as e:
-            raise ValueError(
-                "Cannot pickle this Permutation: the base table's connection "
-                "does not expose a uri/storage_options, which usually means it "
-                "is a remote (LanceDB Cloud) connection. Call "
-                "Permutation.with_connection_factory(...) first to provide a "
-                "picklable callable that re-opens the base table from a worker "
-                "process."
-            ) from e
-
-        if base_uri.startswith("memory://"):
-            # In-memory base tables don't exist in any worker process by
-            # default, so dump the entire base table into the pickle. This
-            # can be expensive for large datasets — users with large
-            # in-memory base tables should either persist them or set a
-            # connection_factory.
-            return {
-                **common,
-                "base_table_data": self.base_table.to_arrow(),
-            }
-
        return {
            **common,
-            "base_table_uri": base_uri,
-            "base_table_namespace": self.base_table._namespace_path,
-            "base_table_storage_options": storage_options,
+            "base_table_state": _table_to_pickle_state(self.base_table),
        }

    def __setstate__(self, state: dict[str, Any]) -> None:
@@ -663,6 +673,8 @@ class Permutation:
        connection_factory = state["connection_factory"]
        if connection_factory is not None:
            base_table = connection_factory(state["base_table_name"])
+        elif "base_table_state" in state:
+            base_table = _table_from_pickle_state(state["base_table_state"])
        elif "base_table_data" in state:
            # In-memory base table inlined into the pickle; rebuild the same
            # way we rebuild the in-memory permutation table.
@@ -680,7 +692,7 @@ class Permutation:
                namespace_path=state["base_table_namespace"] or None,
            )

-        permutation_table: Optional[LanceTable] = None
+        permutation_table: Optional[Table] = None
        if state["permutation_data"] is not None:
            mem_db = connect("memory://")
            permutation_table = mem_db.create_table(
@@ -696,10 +708,28 @@ class Permutation:
        self.offset = state["offset"]
        self.limit = state["limit"]
        self.connection_factory = connection_factory
+        self.reader = None
+        self._pid = None
+
+    def _ensure_open(self) -> None:
+        pid = os.getpid()
+        if self.reader is not None and getattr(self, "_pid", None) == pid:
+            return
+        # The reader owns Rust-side table handles. Rebuild it after unpickle or
+        # fork even though the Python table wrappers reopen themselves.
+        if hasattr(self.base_table, "_ensure_open"):
+            self.base_table._ensure_open()
+        if self.permutation_table is not None and hasattr(
+            self.permutation_table, "_ensure_open"
+        ):
+            self.permutation_table._ensure_open()
        self.reader = LOOP.run(self._build_reader())
+        self._pid = pid

    @property
    def schema(self) -> pa.Schema:
+        self._ensure_open()
+
        async def do_output_schema():
            return await self.reader.output_schema(self.selection)

@@ -717,6 +747,7 @@ class Permutation:
        """
        The number of rows in the permutation
        """
+        self._ensure_open()
        return self.reader.count_rows()

    @property
@@ -875,6 +906,7 @@ class Permutation:
        If skip_last_batch is True, the last batch will be skipped if it is not a
        multiple of batch_size.
        """
+        self._ensure_open()

        async def get_iter():
            return await self.reader.read(self.selection, batch_size=batch_size)
@@ -976,6 +1008,7 @@ class Permutation:
        so `with_format` and `with_transform` affect this method in the same way
        they affect iteration.
        """
+        self._ensure_open()

        async def do_take_offsets():
            return await self.reader.take_offsets(offsets, selection=self.selection)
@@ -1011,9 +1044,11 @@ class Permutation:
        """
        Skip the first `skip` rows of the permutation
        """
+        self._ensure_open()
        new = copy.copy(self)
        new.offset = skip
        new.reader = LOOP.run(new._build_reader())
+        new._pid = os.getpid()
        return new

    @deprecated(details="Use with_take instead")
@@ -1032,9 +1067,11 @@ class Permutation:
        """
        Limit the permutation to `limit` rows (following any `skip`)
        """
+        self._ensure_open()
        new = copy.copy(self)
        new.limit = limit
        new.reader = LOOP.run(new._build_reader())
+        new._pid = os.getpid()
        return new

    @deprecated(details="Use with_repeat instead")
--- a/python/python/lancedb/query.py
+++ b/python/python/lancedb/query.py
@@ -41,6 +41,14 @@ from .rerankers.rrf import RRFReranker
 from .rerankers.util import check_reranker_result
 from .util import flatten_columns

+BlobMode = Literal["lazy", "bytes", "descriptions"]
+
+_BLOB_MODE_TO_HANDLING = {
+    "lazy": "blobs_descriptions",
+    "bytes": "all_binary",
+    "descriptions": "blobs_descriptions",
+}
+
 if TYPE_CHECKING:
    import sys

@@ -55,7 +63,7 @@ if TYPE_CHECKING:
    from ._lancedb import VectorQuery as LanceVectorQuery
    from .common import VEC
    from .pydantic import LanceModel
-    from .table import Table
+    from .table import AsyncTable, Table

    if sys.version_info >= (3, 11):
        from typing import Self
@@ -65,6 +73,179 @@ if TYPE_CHECKING:
 T = TypeVar("T", bound="LanceModel")


+def _validate_blob_mode(blob_mode: BlobMode) -> None:
+    if blob_mode not in _BLOB_MODE_TO_HANDLING:
+        modes = ", ".join(repr(mode) for mode in _BLOB_MODE_TO_HANDLING)
+        raise ValueError(f"blob_mode must be one of {modes}, got {blob_mode!r}")
+
+
+def _field_is_blob(field: pa.Field) -> bool:
+    metadata = field.metadata or {}
+    return metadata.get(b"lance-encoding:blob") == b"true" or (
+        metadata.get("lance-encoding:blob") == "true"
+    )
+
+
+def _schema_has_blob_field(schema: pa.Schema) -> bool:
+    return any(_field_is_blob(field) for field in schema)
+
+
+def _blob_mode_requires_native_pandas(blob_mode: BlobMode, schema: pa.Schema) -> bool:
+    return blob_mode in _BLOB_MODE_TO_HANDLING and _schema_has_blob_field(schema)
+
+
+def _unsupported_blob_pandas_error(reason: str) -> RuntimeError:
+    return RuntimeError(
+        "blob columns require Lance native scanner conversion for query "
+        f"to_pandas(), but {reason}. Use a plain scan query or remove blob "
+        "columns from the projection."
+    )
+
+
+def _query_is_plain_scan(query: Query) -> bool:
+    return (
+        query.vector is None
+        and query.full_text_query is None
+        and not query.postfilter
+        and not query.order_by
+    )
+
+
+def _filter_to_sql(filter: Optional[Union[str, Expr]]) -> Optional[str]:
+    if filter is None:
+        return None
+    if isinstance(filter, Expr):
+        return filter.to_sql()
+    return filter
+
+
+def _projection_to_scanner_kwargs(
+    columns: Optional[
+        Union[
+            List[str], List[Tuple[str, Union[str, Expr]]], Dict[str, Union[str, Expr]]
+        ]
+    ],
+) -> Dict[str, Any]:
+    if columns is None:
+        return {}
+    if isinstance(columns, list):
+        if all(isinstance(column, str) for column in columns):
+            return {"columns": columns}
+        if all(isinstance(column, tuple) and len(column) == 2 for column in columns):
+            return {
+                "columns": {
+                    name: expr.to_sql() if isinstance(expr, Expr) else expr
+                    for name, expr in columns
+                }
+            }
+        # Let Lance raise the detailed projection validation error.
+        return {"columns": columns}
+
+    projection = {}
+    for name, expr in columns.items():
+        if isinstance(expr, Expr):
+            expr = expr.to_sql()
+        projection[name] = expr
+    return {"columns": projection}
+
+
+def _scanner_kwargs_for_query(
+    query: Query, blob_mode: BlobMode, dataset: Optional[Any] = None
+) -> Dict[str, Any]:
+    fragments = _scanner_fragments_for_query(query, dataset)
+    kwargs = {
+        **_projection_to_scanner_kwargs(query.columns),
+        "filter": _filter_to_sql(query.filter),
+        "limit": query.limit,
+        "offset": query.offset,
+        "with_row_id": query.with_row_id,
+        "with_row_address": query.with_row_address,
+        "fast_search": query.fast_search,
+        "blob_handling": _BLOB_MODE_TO_HANDLING[blob_mode],
+        "fragments": fragments,
+    }
+    return {key: value for key, value in kwargs.items() if value is not None}
+
+
+def _scanner_fragments_for_query(query: Query, dataset: Optional[Any]) -> Optional[Any]:
+    if query.fragments is not None and query.fragment_ids is not None:
+        raise ValueError("fragments and fragment_ids cannot both be set")
+    if query.fragments is not None:
+        return query.fragments
+    if query.fragment_ids is None:
+        return None
+    if dataset is None:
+        raise ValueError("fragment_ids require a Lance dataset")
+
+    requested = set(query.fragment_ids)
+    fragments = [
+        fragment
+        for fragment in dataset.get_fragments()
+        if fragment.fragment_id in requested
+    ]
+    found = {fragment.fragment_id for fragment in fragments}
+    missing = requested - found
+    if missing:
+        missing_ids = ", ".join(str(fragment_id) for fragment_id in sorted(missing))
+        raise ValueError(f"fragment_ids not found in dataset: {missing_ids}")
+    return fragments
+
+
+def _ensure_lazy_blob_frame(
+    df: "pd.DataFrame", schema: pa.Schema, blob_mode: BlobMode
+) -> "pd.DataFrame":
+    if blob_mode != "lazy" or not _schema_has_blob_field(schema) or len(df) == 0:
+        return df
+
+    for field in schema:
+        if not _field_is_blob(field) or field.name not in df.columns:
+            continue
+        value = df[field.name].iloc[0]
+        if value is not None and not hasattr(value, "readall"):
+            raise _unsupported_blob_pandas_error(
+                "the Lance scanner did not return lazy blob files"
+            )
+    return df
+
+
+def _scanner_to_table(scanner: Any) -> pa.Table:
+    if hasattr(scanner, "to_pyarrow"):
+        reader = scanner.to_pyarrow()
+        return reader.read_all()
+    if hasattr(scanner, "to_table"):
+        return scanner.to_table()
+    reader = scanner.to_reader()
+    return reader.read_all()
+
+
+def _scanner_to_pandas(scanner: Any, blob_mode: BlobMode, **kwargs) -> "pd.DataFrame":
+    schema = getattr(scanner, "projected_schema", None)
+    if schema is None:
+        schema = getattr(scanner, "schema", None)
+    if schema is None:
+        schema = getattr(scanner, "dataset_schema", None)
+    if callable(schema):
+        schema = schema()
+    if hasattr(scanner, "to_pandas"):
+        try:
+            df = scanner.to_pandas(blob_mode=blob_mode, **kwargs)
+        except TypeError as err:
+            message = str(err)
+            if "blob_mode" not in message and "unexpected keyword" not in message:
+                raise
+            df = scanner.to_pandas(**kwargs)
+        if schema is not None:
+            return _ensure_lazy_blob_frame(df, schema, blob_mode)
+        return df
+
+    tbl = _scanner_to_table(scanner)
+    if blob_mode == "lazy" and _schema_has_blob_field(tbl.schema):
+        raise _unsupported_blob_pandas_error(
+            "the Lance scanner does not expose to_pandas"
+        )
+    return tbl.to_pandas(**kwargs)
+
+
 # Pydantic validation function for vector queries
 def ensure_vector_query(
    val: Any,
@@ -499,6 +680,13 @@ class Query(pydantic.BaseModel):
    # if true, include the row id in the results
    with_row_id: Optional[bool] = None

+    # if true, include the row address in the results
+    with_row_address: Optional[bool] = None
+
+    # Lance fragments or fragment ids to scan on scanner-backed plain queries
+    fragments: Optional[Any] = None
+    fragment_ids: Optional[List[int]] = None
+
    # offset to start fetching results from
    offset: Optional[int] = None

@@ -691,6 +879,9 @@ class LanceQueryBuilder(ABC):
        self._where = None
        self._postfilter = None
        self._with_row_id = None
+        self._with_row_address = None
+        self._fragments = None
+        self._fragment_ids = None
        self._vector = None
        self._text = None
        self._ef = None
@@ -718,6 +909,7 @@ class LanceQueryBuilder(ABC):
        self,
        flatten: Optional[Union[int, bool]] = None,
        *,
+        blob_mode: BlobMode = "lazy",
        timeout: Optional[timedelta] = None,
        **kwargs,
    ) -> "pd.DataFrame":
@@ -737,11 +929,41 @@ class LanceQueryBuilder(ABC):
        timeout: Optional[timedelta]
            The maximum time to wait for the query to complete.
            If None, wait indefinitely.
+        blob_mode: {"lazy", "bytes", "descriptions"}, default "lazy"
+            Controls how blob columns are returned for plain scan queries.
+            Vector, FTS, hybrid, and other non-native query shapes keep the
+            existing Arrow conversion path and only support blob descriptions.
        **kwargs
            Forwarded to pyarrow.Table.to_pandas after query execution and
            optional flattening.
        """
+        _validate_blob_mode(blob_mode)
+        output_schema = getattr(self, "output_schema", None)
+        if output_schema is not None:
+            schema = output_schema()
+            if _blob_mode_requires_native_pandas(blob_mode, schema):
+                native_error = None
+                if (flatten is None or blob_mode == "descriptions") and timeout is None:
+                    try:
+                        df = self._plain_scan_to_pandas(
+                            blob_mode, flatten=flatten, **kwargs
+                        )
+                        if df is not None:
+                            return df
+                    except Exception as err:
+                        native_error = err
+                reason = (
+                    "this query shape cannot use Lance native pandas conversion"
+                    if native_error is None
+                    else str(native_error)
+                )
+                raise _unsupported_blob_pandas_error(reason) from native_error
+
        tbl = flatten_columns(self.to_arrow(timeout=timeout), flatten)
+        if _blob_mode_requires_native_pandas(blob_mode, tbl.schema):
+            raise _unsupported_blob_pandas_error(
+                "this query shape cannot use Lance native pandas conversion"
+            )
        return tbl.to_pandas(**kwargs)

    @abstractmethod
@@ -947,6 +1169,32 @@ class LanceQueryBuilder(ABC):
        self._with_row_id = with_row_id
        return self

+    def with_row_address(self, with_row_address: bool = True) -> Self:
+        """Set whether to return row addresses.
+
+        Parameters
+        ----------
+        with_row_address: bool, default True
+            If True, return the _rowaddr column in the results.
+
+        Returns
+        -------
+        LanceQueryBuilder
+            The LanceQueryBuilder object.
+        """
+        self._with_row_address = with_row_address
+        return self
+
+    def with_fragments(self, fragments: Any) -> Self:
+        """Set the Lance fragments to scan for plain scanner-backed queries."""
+        self._fragments = fragments
+        return self
+
+    def fragment_ids(self, fragment_ids: List[int]) -> Self:
+        """Set the Lance fragment ids to scan for plain scanner-backed queries."""
+        self._fragment_ids = fragment_ids
+        return self
+
    def explain_plan(self, verbose: Optional[bool] = False) -> str:
        """Return the execution plan for this query.

@@ -1086,6 +1334,25 @@ class LanceQueryBuilder(ABC):
        """
        raise NotImplementedError

+    def _plain_scan_to_pandas(
+        self,
+        blob_mode: BlobMode,
+        flatten: Optional[Union[int, bool]] = None,
+        **kwargs,
+    ) -> Optional["pd.DataFrame"]:
+        query = self.to_query_object()
+        if not _query_is_plain_scan(query):
+            return None
+
+        dataset = self._table.to_lance()
+        scanner = dataset.scanner(
+            **_scanner_kwargs_for_query(query, blob_mode, dataset)
+        )
+        if flatten is not None:
+            tbl = flatten_columns(_scanner_to_table(scanner), flatten)
+            return tbl.to_pandas(**kwargs)
+        return _scanner_to_pandas(scanner, blob_mode, **kwargs)
+
    @abstractmethod
    def to_query_object(self) -> Query:
        """Return a serializable representation of the query
@@ -1357,6 +1624,9 @@ class LanceVectorQueryBuilder(LanceQueryBuilder):
            refine_factor=self._refine_factor,
            vector_column=self._vector_column,
            with_row_id=self._with_row_id,
+            with_row_address=self._with_row_address,
+            fragments=self._fragments,
+            fragment_ids=self._fragment_ids,
            offset=self._offset,
            fast_search=self._fast_search,
            ef=self._ef,
@@ -1559,6 +1829,9 @@ class LanceFtsQueryBuilder(LanceQueryBuilder):
            limit=self._limit,
            postfilter=self._postfilter,
            with_row_id=self._with_row_id,
+            with_row_address=self._with_row_address,
+            fragments=self._fragments,
+            fragment_ids=self._fragment_ids,
            full_text_query=FullTextSearchQuery(
                query=self._query, columns=self._fts_columns
            ),
@@ -1629,6 +1902,9 @@ class LanceEmptyQueryBuilder(LanceQueryBuilder):
            filter=self._where,
            limit=self._limit,
            with_row_id=self._with_row_id,
+            with_row_address=self._with_row_address,
+            fragments=self._fragments,
+            fragment_ids=self._fragment_ids,
            offset=self._offset,
            order_by=self._order_by,
        )
@@ -2207,7 +2483,11 @@ class AsyncQueryBase(object):
    Base class for all async queries (take, scan, vector, fts, hybrid)
    """

-    def __init__(self, inner: Union[LanceQuery, LanceVectorQuery, LanceTakeQuery]):
+    def __init__(
+        self,
+        inner: Union[LanceQuery, LanceVectorQuery, LanceTakeQuery],
+        table: Optional["AsyncTable"] = None,
+    ):
        """
        Construct an AsyncQueryBase

@@ -2215,6 +2495,10 @@ class AsyncQueryBase(object):
        [AsyncTable.query][lancedb.table.AsyncTable.query] method to create a query.
        """
        self._inner = inner
+        self._table = table
+        self._with_row_address = None
+        self._fragments = None
+        self._fragment_ids = None

    def to_query_object(self) -> Query:
        """
@@ -2223,7 +2507,11 @@ class AsyncQueryBase(object):
        This is currently experimental but can be useful as the query object is pure
        python and more easily serializable.
        """
-        return Query.from_inner(self._inner.to_query_request())
+        query = Query.from_inner(self._inner.to_query_request())
+        query.with_row_address = self._with_row_address
+        query.fragments = self._fragments
+        query.fragment_ids = self._fragment_ids
+        return query

    def select(self, columns: Union[List[str], dict[str, str]]) -> Self:
        """
@@ -2280,6 +2568,27 @@ class AsyncQueryBase(object):
        self._inner.with_row_id()
        return self

+    def with_row_address(self, with_row_address: bool = True) -> Self:
+        """
+        Include the _rowaddr column in scanner-backed plain query results.
+        """
+        self._with_row_address = with_row_address
+        return self
+
+    def with_fragments(self, fragments: Any) -> Self:
+        """
+        Restrict scanner-backed plain query results to the given Lance fragments.
+        """
+        self._fragments = fragments
+        return self
+
+    def fragment_ids(self, fragment_ids: List[int]) -> Self:
+        """
+        Restrict scanner-backed plain query results to the given Lance fragment ids.
+        """
+        self._fragment_ids = fragment_ids
+        return self
+
    async def to_batches(
        self,
        *,
@@ -2357,6 +2666,8 @@ class AsyncQueryBase(object):
        self,
        flatten: Optional[Union[int, bool]] = None,
        timeout: Optional[timedelta] = None,
+        *,
+        blob_mode: BlobMode = "lazy",
        **kwargs,
    ) -> "pd.DataFrame":
        """
@@ -2390,13 +2701,63 @@ class AsyncQueryBase(object):
            The maximum time to wait for the query to complete.
            If not specified, no timeout is applied. If the query does not
            complete within the specified time, an error will be raised.
+        blob_mode: {"lazy", "bytes", "descriptions"}, default "lazy"
+            Controls how blob columns are returned for plain scan queries.
+            Vector, FTS, hybrid, and other non-native query shapes keep the
+            existing Arrow conversion path and only support blob descriptions.
        **kwargs
            Forwarded to pyarrow.Table.to_pandas after query execution and
            optional flattening.
        """
-        return (
-            flatten_columns(await self.to_arrow(timeout=timeout), flatten)
-        ).to_pandas(**kwargs)
+        _validate_blob_mode(blob_mode)
+        if hasattr(self._inner, "output_schema"):
+            schema = await self.output_schema()
+            if _blob_mode_requires_native_pandas(blob_mode, schema):
+                native_error = None
+                if (flatten is None or blob_mode == "descriptions") and timeout is None:
+                    try:
+                        df = await self._plain_scan_to_pandas(
+                            blob_mode, flatten=flatten, **kwargs
+                        )
+                        if df is not None:
+                            return df
+                    except Exception as err:
+                        native_error = err
+                reason = (
+                    "this query shape cannot use Lance native pandas conversion"
+                    if native_error is None
+                    else str(native_error)
+                )
+                raise _unsupported_blob_pandas_error(reason) from native_error
+
+        tbl = flatten_columns(await self.to_arrow(timeout=timeout), flatten)
+        if _blob_mode_requires_native_pandas(blob_mode, tbl.schema):
+            raise _unsupported_blob_pandas_error(
+                "this query shape cannot use Lance native pandas conversion"
+            )
+        return tbl.to_pandas(**kwargs)
+
+    async def _plain_scan_to_pandas(
+        self,
+        blob_mode: BlobMode,
+        flatten: Optional[Union[int, bool]] = None,
+        **kwargs,
+    ) -> Optional["pd.DataFrame"]:
+        if self._table is None:
+            return None
+
+        query = self.to_query_object()
+        if not _query_is_plain_scan(query):
+            return None
+
+        dataset = await self._table._to_lance()
+        scanner = dataset.scanner(
+            **_scanner_kwargs_for_query(query, blob_mode, dataset)
+        )
+        if flatten is not None:
+            tbl = flatten_columns(_scanner_to_table(scanner), flatten)
+            return tbl.to_pandas(**kwargs)
+        return _scanner_to_pandas(scanner, blob_mode, **kwargs)

    async def to_polars(
        self,
@@ -2503,14 +2864,18 @@ class AsyncStandardQuery(AsyncQueryBase):
    Base class for "standard" async queries (all but take currently)
    """

-    def __init__(self, inner: Union[LanceQuery, LanceVectorQuery]):
+    def __init__(
+        self,
+        inner: Union[LanceQuery, LanceVectorQuery],
+        table: Optional["AsyncTable"] = None,
+    ):
        """
        Construct an AsyncStandardQuery

        This method is not intended to be called directly.  Instead, use the
        [AsyncTable.query][lancedb.table.AsyncTable.query] method to create a query.
        """
-        super().__init__(inner)
+        super().__init__(inner, table)

    def where(self, predicate: Union[str, Expr]) -> Self:
        """
@@ -2616,14 +2981,14 @@ class AsyncStandardQuery(AsyncQueryBase):


 class AsyncQuery(AsyncStandardQuery):
-    def __init__(self, inner: LanceQuery):
+    def __init__(self, inner: LanceQuery, table: Optional["AsyncTable"] = None):
        """
        Construct an AsyncQuery

        This method is not intended to be called directly.  Instead, use the
        [AsyncTable.query][lancedb.table.AsyncTable.query] method to create a query.
        """
-        super().__init__(inner)
+        super().__init__(inner, table)
        self._inner = inner

    @classmethod
@@ -2707,10 +3072,11 @@ class AsyncQuery(AsyncStandardQuery):
            new_self = self._inner.nearest_to(query_vectors[0])
            for v in query_vectors[1:]:
                new_self.add_query_vector(v)
-            return AsyncVectorQuery(new_self)
+            return AsyncVectorQuery(new_self, self._table)
        else:
            return AsyncVectorQuery(
-                self._inner.nearest_to(AsyncQuery._query_vec_to_array(query_vector))
+                self._inner.nearest_to(AsyncQuery._query_vec_to_array(query_vector)),
+                self._table,
            )

    def nearest_to_text(
@@ -2743,17 +3109,18 @@ class AsyncQuery(AsyncStandardQuery):

        if isinstance(query, str):
            return AsyncFTSQuery(
-                self._inner.nearest_to_text({"query": query, "columns": columns})
+                self._inner.nearest_to_text({"query": query, "columns": columns}),
+                self._table,
            )
        # FullTextQuery object
-        return AsyncFTSQuery(self._inner.nearest_to_text({"query": query}))
+        return AsyncFTSQuery(self._inner.nearest_to_text({"query": query}), self._table)


 class AsyncFTSQuery(AsyncStandardQuery):
    """A query for full text search for LanceDB."""

-    def __init__(self, inner: LanceFTSQuery):
-        super().__init__(inner)
+    def __init__(self, inner: LanceFTSQuery, table: Optional["AsyncTable"] = None):
+        super().__init__(inner, table)
        self._inner = inner
        self._reranker = None

@@ -2835,10 +3202,11 @@ class AsyncFTSQuery(AsyncStandardQuery):
            new_self = self._inner.nearest_to(query_vectors[0])
            for v in query_vectors[1:]:
                new_self.add_query_vector(v)
-            return AsyncHybridQuery(new_self)
+            return AsyncHybridQuery(new_self, self._table)
        else:
            return AsyncHybridQuery(
-                self._inner.nearest_to(AsyncQuery._query_vec_to_array(query_vector))
+                self._inner.nearest_to(AsyncQuery._query_vec_to_array(query_vector)),
+                self._table,
            )

    async def to_batches(
@@ -3029,7 +3397,7 @@ class AsyncVectorQueryBase:


 class AsyncVectorQuery(AsyncStandardQuery, AsyncVectorQueryBase):
-    def __init__(self, inner: LanceVectorQuery):
+    def __init__(self, inner: LanceVectorQuery, table: Optional["AsyncTable"] = None):
        """
        Construct an AsyncVectorQuery

@@ -3039,7 +3407,7 @@ class AsyncVectorQuery(AsyncStandardQuery, AsyncVectorQueryBase):
        a vector query.  Or you can use
        [AsyncTable.vector_search][lancedb.table.AsyncTable.vector_search]
        """
-        super().__init__(inner)
+        super().__init__(inner, table)
        self._inner = inner
        self._reranker = None
        self._query_string = None
@@ -3093,10 +3461,13 @@ class AsyncVectorQuery(AsyncStandardQuery, AsyncVectorQueryBase):

        if isinstance(query, str):
            return AsyncHybridQuery(
-                self._inner.nearest_to_text({"query": query, "columns": columns})
+                self._inner.nearest_to_text({"query": query, "columns": columns}),
+                self._table,
            )
        # FullTextQuery object
-        return AsyncHybridQuery(self._inner.nearest_to_text({"query": query}))
+        return AsyncHybridQuery(
+            self._inner.nearest_to_text({"query": query}), self._table
+        )

    async def to_batches(
        self,
@@ -3123,8 +3494,8 @@ class AsyncHybridQuery(AsyncStandardQuery, AsyncVectorQueryBase):
    in the `rerank` method to convert the scores to ranks and then normalize them.
    """

-    def __init__(self, inner: LanceHybridQuery):
-        super().__init__(inner)
+    def __init__(self, inner: LanceHybridQuery, table: Optional["AsyncTable"] = None):
+        super().__init__(inner, table)
        self._inner = inner
        self._norm = "score"
        self._reranker = RRFReranker()
@@ -3165,8 +3536,8 @@ class AsyncHybridQuery(AsyncStandardQuery, AsyncVectorQueryBase):
        max_batch_length: Optional[int] = None,
        timeout: Optional[timedelta] = None,
    ) -> AsyncRecordBatchReader:
-        fts_query = AsyncFTSQuery(self._inner.to_fts_query())
-        vec_query = AsyncVectorQuery(self._inner.to_vector_query())
+        fts_query = AsyncFTSQuery(self._inner.to_fts_query(), self._table)
+        vec_query = AsyncVectorQuery(self._inner.to_vector_query(), self._table)

        # save the row ID choice that was made on the query builder and force it
        # to actually fetch the row ids because we need this for reranking
@@ -3266,8 +3637,16 @@ class AsyncTakeQuery(AsyncQueryBase):
    Builder for parameterizing and executing take queries.
    """

-    def __init__(self, inner: LanceTakeQuery):
-        super().__init__(inner)
+    def __init__(self, inner: LanceTakeQuery, table: Optional["AsyncTable"] = None):
+        super().__init__(inner, table)
+
+    async def _plain_scan_to_pandas(
+        self,
+        blob_mode: BlobMode,
+        flatten: Optional[Union[int, bool]] = None,
+        **kwargs,
+    ) -> Optional["pd.DataFrame"]:
+        return None


 class BaseQueryBuilder(object):
@@ -3319,6 +3698,27 @@ class BaseQueryBuilder(object):
        self._inner.with_row_id()
        return self

+    def with_row_address(self, with_row_address: bool = True) -> Self:
+        """
+        Include the _rowaddr column in scanner-backed plain query results.
+        """
+        self._inner.with_row_address(with_row_address)
+        return self
+
+    def with_fragments(self, fragments: Any) -> Self:
+        """
+        Restrict scanner-backed plain query results to the given Lance fragments.
+        """
+        self._inner.with_fragments(fragments)
+        return self
+
+    def fragment_ids(self, fragment_ids: List[int]) -> Self:
+        """
+        Restrict scanner-backed plain query results to the given Lance fragment ids.
+        """
+        self._inner.fragment_ids(fragment_ids)
+        return self
+
    def output_schema(self) -> pa.Schema:
        """
        Return the output schema for the query
@@ -3400,6 +3800,8 @@ class BaseQueryBuilder(object):
        self,
        flatten: Optional[Union[int, bool]] = None,
        timeout: Optional[timedelta] = None,
+        *,
+        blob_mode: BlobMode = "lazy",
        **kwargs,
    ) -> "pd.DataFrame":
        """
@@ -3433,11 +3835,15 @@ class BaseQueryBuilder(object):
            The maximum time to wait for the query to complete.
            If not specified, no timeout is applied. If the query does not
            complete within the specified time, an error will be raised.
+        blob_mode: {"lazy", "bytes", "descriptions"}, default "lazy"
+            Controls how blob columns are returned for plain scan queries.
        **kwargs
            Forwarded to pyarrow.Table.to_pandas after query execution and
            optional flattening.
        """
-        return LOOP.run(self._inner.to_pandas(flatten, timeout, **kwargs))
+        return LOOP.run(
+            self._inner.to_pandas(flatten, timeout, blob_mode=blob_mode, **kwargs)
+        )

    def to_polars(
        self,
--- a/python/python/lancedb/remote/db.py
+++ b/python/python/lancedb/remote/db.py
@@ -3,6 +3,7 @@


 from datetime import timedelta
+import json
 import logging
 from concurrent.futures import ThreadPoolExecutor
 import sys
@@ -17,7 +18,7 @@ else:

 # Remove this import to fix circular dependency
 # from lancedb import connect_async
-from lancedb.remote import ClientConfig
+from lancedb.remote import ClientConfig, RetryConfig, TimeoutConfig, TlsConfig
 import pyarrow as pa

 from ..common import DATA
@@ -36,6 +37,64 @@ from ..table import Table
 from ..util import validate_table_name


+def _duration_seconds(value: Optional[timedelta]) -> Optional[float]:
+    return value.total_seconds() if value is not None else None
+
+
+def _timeout_config_to_dict(
+    config: Optional[TimeoutConfig],
+) -> Optional[dict[str, Any]]:
+    if config is None:
+        return None
+    return {
+        "timeout": _duration_seconds(config.timeout),
+        "connect_timeout": _duration_seconds(config.connect_timeout),
+        "read_timeout": _duration_seconds(config.read_timeout),
+        "pool_idle_timeout": _duration_seconds(config.pool_idle_timeout),
+    }
+
+
+def _retry_config_to_dict(config: RetryConfig) -> dict[str, Any]:
+    return {
+        "retries": config.retries,
+        "connect_retries": config.connect_retries,
+        "read_retries": config.read_retries,
+        "backoff_factor": config.backoff_factor,
+        "backoff_jitter": config.backoff_jitter,
+        "statuses": config.statuses,
+    }
+
+
+def _tls_config_to_dict(config: Optional[TlsConfig]) -> Optional[dict[str, Any]]:
+    if config is None:
+        return None
+    return {
+        "cert_file": config.cert_file,
+        "key_file": config.key_file,
+        "ssl_ca_cert": config.ssl_ca_cert,
+        "assert_hostname": config.assert_hostname,
+    }
+
+
+def _client_config_to_dict(config: ClientConfig) -> dict[str, Any]:
+    if config.header_provider is not None:
+        raise ValueError(
+            "Cannot serialize a remote connection with a header_provider. "
+            "Use static api_key/extra_headers or provide a worker-side "
+            "connection factory instead."
+        )
+    return {
+        "user_agent": config.user_agent,
+        "retry_config": _retry_config_to_dict(config.retry_config),
+        "timeout_config": _timeout_config_to_dict(config.timeout_config),
+        "extra_headers": config.extra_headers,
+        "id_delimiter": config.id_delimiter,
+        "tls_config": _tls_config_to_dict(config.tls_config),
+        "header_provider": None,
+        "user_id": config.user_id,
+    }
+
+
 class RemoteDBConnection(DBConnection):
    """A connection to a remote LanceDB database."""

@@ -89,6 +148,11 @@ class RemoteDBConnection(DBConnection):
        parsed = urlparse(db_url)
        if parsed.scheme != "db":
            raise ValueError(f"Invalid scheme: {parsed.scheme}, only accepts db://")
+        self.db_url = db_url
+        self.api_key = api_key
+        self.region = region
+        self.host_override = host_override
+        self.storage_options = storage_options
        self.db_name = parsed.netloc

        self.client_config = client_config
@@ -111,6 +175,20 @@ class RemoteDBConnection(DBConnection):
    def __repr__(self) -> str:
        return f"RemoteConnect(name={self.db_name})"

+    @override
+    def serialize(self) -> str:
+        return json.dumps(
+            {
+                "connection_type": "remote",
+                "db_url": self.db_url,
+                "api_key": self.api_key,
+                "region": self.region,
+                "host_override": self.host_override,
+                "client_config": _client_config_to_dict(self.client_config),
+                "storage_options": self.storage_options,
+            }
+        )
+
    @override
    def list_namespaces(
        self,
@@ -305,6 +383,8 @@ class RemoteDBConnection(DBConnection):
        namespace_path: Optional[List[str]] = None,
        storage_options: Optional[Dict[str, str]] = None,
        index_cache_size: Optional[int] = None,
+        branch: Optional[str] = None,
+        version: Optional[int] = None,
    ) -> Table:
        """Open a Lance Table in the database.

@@ -315,6 +395,14 @@ class RemoteDBConnection(DBConnection):
        namespace_path: List[str], optional
            The namespace to open the table from.
            None or empty list represents root namespace.
+        branch: str, optional
+            Branching is not yet supported on remote tables, so only the
+            default branch is accepted (``None`` or ``"main"``); any other
+            value raises ``NotImplementedError``.
+        version: int, optional
+            If provided, open the table pinned to this version, producing a
+            read-only handle. Call ``checkout_latest`` to return to a writable
+            state.

        Returns
        -------
@@ -322,6 +410,11 @@ class RemoteDBConnection(DBConnection):
        """
        from .table import RemoteTable

+        # Remote supports version time-travel but not branches: reject a non-main
+        # branch, but allow a version-only open (or "main").
+        if branch is not None and branch != "main":
+            raise NotImplementedError("branching is not yet supported on remote tables")
+
        if namespace_path is None:
            namespace_path = []
        if index_cache_size is not None:
@@ -331,7 +424,15 @@ class RemoteDBConnection(DBConnection):
            )

        table = LOOP.run(self._conn.open_table(name, namespace_path=namespace_path))
-        return RemoteTable(table, self.db_name)
+        tbl = RemoteTable(
+            table,
+            self.db_name,
+            connection_state=self.serialize,
+            namespace_path=namespace_path,
+        )
+        if version is not None:
+            tbl.checkout(version)
+        return tbl

    def clone_table(
        self,
@@ -380,7 +481,12 @@ class RemoteDBConnection(DBConnection):
                is_shallow=is_shallow,
            )
        )
-        return RemoteTable(table, self.db_name)
+        return RemoteTable(
+            table,
+            self.db_name,
+            connection_state=self.serialize,
+            namespace_path=target_namespace_path,
+        )

    @override
    def create_table(
@@ -525,7 +631,12 @@ class RemoteDBConnection(DBConnection):
                fill_value=fill_value,
            )
        )
-        return RemoteTable(table, self.db_name)
+        return RemoteTable(
+            table,
+            self.db_name,
+            connection_state=self.serialize,
+            namespace_path=namespace_path,
+        )

    @override
    def drop_table(self, name: str, namespace_path: Optional[List[str]] = None):
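Note on the `_*_to_dict` helpers above: `json.dumps` cannot encode `timedelta`, which is presumably why `_duration_seconds` flattens durations to float seconds before `serialize()` runs. A minimal sketch of that round trip follows. `TimeoutSketch`, `timeout_to_dict`, and `timeout_from_dict` are hypothetical stand-ins, not lancedb names, and the from-dict direction is an assumption because this diff only shows the to-dict side.

```python
import json
from dataclasses import dataclass
from datetime import timedelta
from typing import Any, Optional


@dataclass
class TimeoutSketch:
    """Hypothetical stand-in for a timeout config; not the lancedb class."""

    timeout: Optional[timedelta] = None
    connect_timeout: Optional[timedelta] = None


def duration_seconds(value: Optional[timedelta]) -> Optional[float]:
    # json.dumps cannot encode timedelta, so flatten it to float seconds.
    return value.total_seconds() if value is not None else None


def timeout_to_dict(config: Optional[TimeoutSketch]) -> Optional[dict[str, Any]]:
    if config is None:
        return None
    return {
        "timeout": duration_seconds(config.timeout),
        "connect_timeout": duration_seconds(config.connect_timeout),
    }


def timeout_from_dict(data: Optional[dict[str, Any]]) -> Optional[TimeoutSketch]:
    # Assumed inverse: rebuild timedeltas from the stored seconds.
    if data is None:
        return None
    return TimeoutSketch(
        **{
            key: timedelta(seconds=value) if value is not None else None
            for key, value in data.items()
        }
    )


original = TimeoutSketch(timeout=timedelta(seconds=30))
payload = json.dumps(timeout_to_dict(original))
restored = timeout_from_dict(json.loads(payload))
```

Unset fields stay `None` in both directions, so the worker can still tell "not configured" apart from a zero timeout.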
--- a/python/python/lancedb/remote/errors.py
+++ b/python/python/lancedb/remote/errors.py
@@ -27,6 +27,9 @@ class LanceDBClientError(RuntimeError):
        self.request_id = request_id
        self.status_code = status_code

+    def __reduce__(self) -> tuple[type, tuple]:
+        return (self.__class__, (str(self), self.request_id, self.status_code))
+

 class HttpError(LanceDBClientError):
    """An error that occurred during an HTTP request.
@@ -101,3 +104,19 @@ class RetryError(LanceDBClientError):
        self.max_request_failures = max_request_failures
        self.max_connect_failures = max_connect_failures
        self.max_read_failures = max_read_failures
+
+    def __reduce__(self) -> tuple[type, tuple]:
+        return (
+            self.__class__,
+            (
+                str(self),
+                self.request_id,
+                self.request_failures,
+                self.connect_failures,
+                self.read_failures,
+                self.max_request_failures,
+                self.max_connect_failures,
+                self.max_read_failures,
+                self.status_code,
+            ),
+        )
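Note on the `__reduce__` overrides above: by default an exception is re-created from `self.args`, which only holds what was passed to `super().__init__()`. A subclass whose constructor needs more positional arguments than that therefore fails on unpickle. The sketch below reproduces the failure and the fix with hypothetical classes; `ClientErrorSketch` is not the lancedb class, and its signature is an assumption.

```python
import pickle
from typing import Optional


class ClientErrorSketch(RuntimeError):
    """Hypothetical error type with a required extra constructor argument."""

    def __init__(self, message: str, request_id: str, status_code: Optional[int] = None):
        super().__init__(message)  # self.args becomes (message,)
        self.request_id = request_id
        self.status_code = status_code


class PicklableClientErrorSketch(ClientErrorSketch):
    def __reduce__(self) -> tuple:
        # Tell pickle to rebuild the object with the full constructor arguments.
        return (self.__class__, (str(self), self.request_id, self.status_code))


def roundtrip(exc: BaseException) -> BaseException:
    return pickle.loads(pickle.dumps(exc))


try:
    roundtrip(ClientErrorSketch("boom", "req-1", 503))
    default_pickling_failed = False
except TypeError:
    # Unpickling called ClientErrorSketch("boom") and request_id was missing.
    default_pickling_failed = True

restored = roundtrip(PicklableClientErrorSketch("boom", "req-1", 503))
```

This matters whenever an error raised in a worker process has to travel back to the parent, for example through `concurrent.futures.ProcessPoolExecutor`.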
--- a/python/python/lancedb/remote/table.py
+++ b/python/python/lancedb/remote/table.py
@@ -5,6 +5,7 @@ from datetime import timedelta
 import deprecation
 import logging
 from functools import cached_property
+import os
 from typing import (
    Any,
    Callable,
@@ -24,6 +25,7 @@ from lancedb._lancedb import (
    AddColumnsResult,
    AddResult,
    AlterColumnsResult,
+    UpdateFieldMetadataResult,
    DeleteResult,
    DropColumnsResult,
    IndexConfig,
@@ -63,14 +65,80 @@ class RemoteTable(Table):
        self,
        table: AsyncTable,
        db_name: str,
+        *,
+        connection_state: Optional[Union[str, Callable[[], str]]] = None,
+        namespace_path: Optional[List[str]] = None,
    ):
-        self._table = table
+        self._table_handle = table
+        self._name = table.name
        self.db_name = db_name
+        self._connection_state = connection_state
+        self._namespace_path = list(namespace_path or [])
+        self._checkout_version: Optional[int] = None
+        self._pid = os.getpid()
+
+    def _serialized_connection_state(self) -> str:
+        if self._connection_state is None:
+            raise RuntimeError(
+                "Cannot reopen this remote table because it does not carry "
+                "serialized connection state"
+            )
+        if callable(self._connection_state):
+            self._connection_state = self._connection_state()
+        return self._connection_state
+
+    @property
+    def _table(self) -> AsyncTable:
+        self._ensure_open()
+        assert self._table_handle is not None
+        return self._table_handle
+
+    @_table.setter
+    def _table(self, table: AsyncTable) -> None:
+        self._table_handle = table
+        self._name = table.name
+        self._pid = os.getpid()
+
+    def _ensure_open(self) -> None:
+        pid = os.getpid()
+        if self._table_handle is not None and self._pid == pid:
+            return
+
+        # Pickle clears the handle; fork inherits a handle created in the
+        # parent process. In both cases reopen before touching the Rust client.
+        from lancedb import deserialize_conn
+
+        db = deserialize_conn(self._serialized_connection_state(), for_worker=True)
+        table = db.open_table(self._name, namespace_path=self._namespace_path)
+        if self._checkout_version is not None:
+            table.checkout(self._checkout_version)
+
+        self._table_handle = table._table
+        self.db_name = table.db_name
+        self._pid = pid
+
+    def __getstate__(self) -> dict:
+        return {
+            "connection_state": self._serialized_connection_state(),
+            "db_name": self.db_name,
+            "name": self.name,
+            "namespace_path": self._namespace_path,
+            "checkout_version": self._checkout_version,
+        }
+
+    def __setstate__(self, state: dict) -> None:
+        self._table_handle = None
+        self._name = state["name"]
+        self.db_name = state["db_name"]
+        self._connection_state = state["connection_state"]
+        self._namespace_path = state["namespace_path"]
+        self._checkout_version = state["checkout_version"]
+        self._pid = None

    @property
    def name(self) -> str:
        """The name of the table"""
-        return self._table.name
+        return self._name

    def __repr__(self) -> str:
        return f"RemoteTable({self.db_name}.{self.name})"
@@ -120,13 +188,19 @@ class RemoteTable(Table):
        raise NotImplementedError("to_pandas() is not yet supported on LanceDB cloud.")

    def checkout(self, version: Union[int, str]):
-        return LOOP.run(self._table.checkout(version))
+        result = LOOP.run(self._table.checkout(version))
+        self._checkout_version = self.version
+        return result

    def checkout_latest(self):
-        return LOOP.run(self._table.checkout_latest())
+        result = LOOP.run(self._table.checkout_latest())
+        self._checkout_version = None
+        return result

    def restore(self, version: Optional[Union[int, str]] = None):
-        return LOOP.run(self._table.restore(version))
+        result = LOOP.run(self._table.restore(version))
+        self._checkout_version = None
+        return result

    def list_indices(self) -> Iterable[IndexConfig]:
        """List all the indices on the table"""
@@ -777,6 +851,11 @@ class RemoteTable(Table):
    ) -> AlterColumnsResult:
        return LOOP.run(self._table.alter_columns(*alterations))

+    def update_field_metadata(
+        self, *updates: dict[str, Any]
+    ) -> UpdateFieldMetadataResult:
+        return LOOP.run(self._table.update_field_metadata(*updates))
+
    def drop_columns(self, columns: Iterable[str]) -> DropColumnsResult:
        return LOOP.run(self._table.drop_columns(columns))

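Note on the `RemoteTable` changes above: the handle is dropped on pickle and treated as stale after a fork, then reopened on first use. The sketch below shows that pattern in isolation. `FakeHandle`, `open_handle`, and `LazyTableSketch` are hypothetical; the diff reopens through `deserialize_conn` and `open_table` rather than a local factory.

```python
import os
import pickle
from typing import Optional


class FakeHandle:
    """Hypothetical stand-in for a process-bound client handle."""

    def __init__(self, name: str):
        self.name = name


def open_handle(name: str) -> FakeHandle:
    # Hypothetical factory standing in for the real reopen path.
    return FakeHandle(name)


class LazyTableSketch:
    def __init__(self, name: str):
        self._name = name
        self._handle: Optional[FakeHandle] = open_handle(name)
        self._pid: Optional[int] = os.getpid()
        self.reopen_count = 0

    @property
    def handle(self) -> FakeHandle:
        # Reopen when the handle is missing (unpickled) or was created in
        # another process (inherited across fork).
        if self._handle is None or self._pid != os.getpid():
            self._handle = open_handle(self._name)
            self._pid = os.getpid()
            self.reopen_count += 1
        return self._handle

    def __getstate__(self) -> dict:
        # Only plain data crosses the process boundary, never the handle.
        return {"name": self._name}

    def __setstate__(self, state: dict) -> None:
        self._name = state["name"]
        self._handle = None
        self._pid = None
        self.reopen_count = 0


table = LazyTableSketch("items")
clone = pickle.loads(pickle.dumps(table))
handle_missing_after_unpickle = clone._handle is None
reopened_name = clone.handle.name  # first access reopens lazily
```

Comparing the stored pid with `os.getpid()` is what covers the fork case, where `__setstate__` never runs and the child would otherwise reuse the parent's handle.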
--- a/python/python/lancedb/rerankers/mrr.py
+++ b/python/python/lancedb/rerankers/mrr.py
@@ -125,6 +125,9 @@ class MRRReranker(Reranker):
        This cannot reuse rerank_hybrid because MRR semantics require treating
        each vector result as a separate ranking system.
        """
+        if not vector_results:
+            raise ValueError("vector_results must not be empty")
+
        if not all(isinstance(v, type(vector_results[0])) for v in vector_results):
            raise ValueError(
                "All elements in vector_results should be of the same type"
--- a/python/python/lancedb/rerankers/rrf.py
+++ b/python/python/lancedb/rerankers/rrf.py
@@ -82,6 +82,9 @@ class RRFReranker(Reranker):
        results from multiple vector searches as it doesn't support reranking
        vector results individually.
        """
+        if not vector_results:
+            raise ValueError("vector_results must not be empty")
+
        # Make sure all elements are of the same type
        if not all(isinstance(v, type(vector_results[0])) for v in vector_results):
            raise ValueError(
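Note on the emptiness guards added to both rerankers: the existing same-type check does not catch an empty list by itself. `all()` over an empty iterable is `True`, and the generator body containing `vector_results[0]` never runs, so an empty list would slip past to whatever code follows. A small sketch, with `check_same_type` as a hypothetical helper that mirrors the guard order in the diff:

```python
def check_same_type(vector_results: list) -> None:
    # Emptiness first, then type uniformity, as in the diff.
    if not vector_results:
        raise ValueError("vector_results must not be empty")
    if not all(isinstance(v, type(vector_results[0])) for v in vector_results):
        raise ValueError("All elements in vector_results should be of the same type")


# Without the emptiness guard the uniformity check passes vacuously:
# the generator body never runs, so empty[0] is never evaluated.
empty: list = []
uniformity_check_passes_on_empty = all(isinstance(v, type(empty[0])) for v in empty)

try:
    check_same_type([])
    error_message = None
except ValueError as exc:
    error_message = str(exc)
```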
--- a/python/python/lancedb/table.py
+++ b/python/python/lancedb/table.py
@@ -30,7 +30,7 @@ from lancedb.scannable import _register_optional_converters, to_scannable

 from . import __version__
 from lancedb.arrow import peek_reader
-from lancedb.background_loop import LOOP
+from lancedb.background_loop import LOOP, embedding_executor
 from .dependencies import (
    _check_for_hugging_face,
    _check_for_lance,
@@ -55,6 +55,7 @@ from .index import (
    Bitmap,
    IvfRq,
    LabelList,
+    Fm,
    HnswPq,
    HnswSq,
    HnswFlat,
@@ -89,6 +90,32 @@ from .index import lang_mapping

 BlobMode = Literal["lazy", "bytes", "descriptions"]

+_VALID_BLOB_MODES = ("lazy", "bytes", "descriptions")
+
+
+def _should_push_down_query_table(
+    namespace_client: Optional[Any], pushdown_operations: set
+) -> bool:
+    return namespace_client is not None and "QueryTable" in pushdown_operations
+
+
+def _validate_blob_mode(blob_mode: BlobMode) -> None:
+    if blob_mode not in _VALID_BLOB_MODES:
+        modes = ", ".join(repr(mode) for mode in _VALID_BLOB_MODES)
+        raise ValueError(f"blob_mode must be one of {modes}, got {blob_mode!r}")
+
+
+def _field_is_blob(field: pa.Field) -> bool:
+    metadata = field.metadata or {}
+    return metadata.get(b"lance-encoding:blob") == b"true" or (
+        metadata.get("lance-encoding:blob") == "true"
+    )
+
+
+def _schema_has_blob_field(schema: pa.Schema) -> bool:
+    return any(_field_is_blob(field) for field in schema)
+
+
 _MODEL_BACKED_TOKENIZER_PREFIXES = ("jieba", "lindera")
 _MODEL_BACKED_TOKENIZER_ERRORS = (
    "unknown base tokenizer",
@@ -154,6 +181,7 @@ if TYPE_CHECKING:
        AddColumnsResult,
        AddResult,
        AlterColumnsResult,
+        UpdateFieldMetadataResult,
        DeleteResult,
        DropColumnsResult,
        LsmWriteSpec,
@@ -186,6 +214,7 @@ IndexConfigType = Union[
    BTree,
    Bitmap,
    LabelList,
+    Fm,
    FTS,
 ]

@@ -757,6 +786,15 @@ class Table(ABC):
        """
        raise NotImplementedError

+    @property
+    def branches(self) -> "Branches":
+        """Branch management for the table.
+
+        Branches are isolated, writable lines of history forked from another
+        branch (or version). Writes on a branch do not affect ``main``.
+        """
+        raise NotImplementedError
+
    def __len__(self) -> int:
        """The number of rows in this Table"""
        return self.count_rows(None)
@@ -902,7 +940,7 @@ class Table(ABC):
        config : IndexConfigType, optional
            The index configuration object. If provided, uses the new unified API.
            Can be one of: IvfFlat, IvfPq, IvfSq, IvfRq, HnswPq, HnswSq,
-            BTree, Bitmap, LabelList, FTS.
+            BTree, Bitmap, LabelList, Fm, FTS.
        replace : bool, default True
            Whether to replace an existing index on this column.
        wait_timeout : timedelta, optional
@@ -1799,6 +1837,29 @@ class Table(ABC):
            version: the new version number of the table after the alteration.
        """

+    @abstractmethod
+    def update_field_metadata(
+        self, *updates: dict[str, Any]
+    ) -> UpdateFieldMetadataResult:
+        """
+        Update per-field (column) metadata.
+
+        Parameters
+        ----------
+        updates : dict
+            One or more dicts, each with:
+            - "path": str — dot-path to the field (e.g. "embedding" or "a.b.c").
+            - "metadata": dict[str, str | None] — keys to set; a value of ``None``
+              deletes that key.
+            - "replace": bool, optional — replace the field's whole metadata map
+              instead of merging (default False).
+
+        Returns
+        -------
+        UpdateFieldMetadataResult
+            version: the new table version after the update.
+        """
+
    @abstractmethod
    def drop_columns(self, columns: Iterable[str]) -> DropColumnsResult:
        """
@@ -2062,22 +2123,27 @@ class LanceTable(Table):
                "Please install with `pip install pylance`."
            )

+        branch = self.current_branch()
+        version = None if branch is not None else self.version
        if self._namespace_client is not None:
            table_id = self._namespace_path + [self.name]
-            return lance.dataset(
-                version=self.version,
+            ds = lance.dataset(
+                version=version,
                storage_options=self._conn.storage_options,
                namespace_client=self._namespace_client,
                table_id=table_id,
                **kwargs,
            )
-
-        return lance.dataset(
-            self._dataset_path,
-            version=self.version,
-            storage_options=self._conn.storage_options,
-            **kwargs,
-        )
+        else:
+            ds = lance.dataset(
+                self._dataset_path,
+                version=version,
+                storage_options=self._conn.storage_options,
+                **kwargs,
+            )
+        if branch is not None:
+            ds = ds.checkout_version((branch, self.version))
+        return ds

    @property
    def schema(self) -> pa.Schema:
@@ -2143,6 +2209,19 @@ class LanceTable(Table):
        """
        return Tags(self._table)

+    @property
+    def branches(self) -> "Branches":
+        """Branch management for the table.
+
+        ``create``/``checkout`` return a new table handle scoped to the branch;
+        writes on it do not affect ``main``.
+        """
+        return Branches(self)
+
+    def current_branch(self) -> Optional[str]:
+        """The branch this table handle is scoped to, or ``None`` for ``main``."""
+        return self._table.current_branch()
+
    def checkout(self, version: Union[int, str]):
        """Checkout a version of the table. This is an in-place operation.

@@ -2270,9 +2349,14 @@ class LanceTable(Table):
        -------
        pd.DataFrame
        """
-        if blob_mode == "lazy" and (
-            self._namespace_client is not None
-            or get_uri_scheme(self._dataset_path) == "memory"
+        _validate_blob_mode(blob_mode)
+        if blob_mode == "descriptions" or not _schema_has_blob_field(self.schema):
+            return self.to_arrow().to_pandas(**kwargs)
+
+        if (
+            blob_mode == "lazy"
+            and self._namespace_client is None
+            and get_uri_scheme(self._dataset_path) == "memory"
        ):
            return self.to_arrow().to_pandas(**kwargs)

@@ -2284,6 +2368,11 @@ class LanceTable(Table):
        Returns
        -------
        pa.Table"""
+        if _should_push_down_query_table(
+            self._namespace_client, self._pushdown_operations
+        ):
+            return self._execute_query(Query()).read_all()
+
        return LOOP.run(self._table.to_arrow())

    def to_polars(self, batch_size=None) -> "pl.LazyFrame":
@@ -2400,7 +2489,7 @@ class LanceTable(Table):
        config : IndexConfigType, optional
            The index configuration object. If provided, uses the new unified API.
            Can be one of: IvfFlat, IvfPq, IvfSq, IvfRq, HnswPq, HnswSq,
-            BTree, Bitmap, LabelList, FTS.
+            BTree, Bitmap, LabelList, Fm, FTS.
        replace : bool, default True
            Whether to replace an existing index on this column.
        wait_timeout : timedelta, optional
@@ -3397,9 +3486,14 @@ class LanceTable(Table):
        batch_size: Optional[int] = None,
        timeout: Optional[timedelta] = None,
    ) -> pa.RecordBatchReader:
+        # Branch queries run locally: the server-side query protocol can't
+        # carry a branch yet.
+        # TODO: push down server-side once it can (with remote table support).
        if (
-            "QueryTable" in self._pushdown_operations
-            and self._namespace_client is not None
+            _should_push_down_query_table(
+                self._namespace_client, self._pushdown_operations
+            )
+            and self.current_branch() is None
        ):
            from lancedb.namespace import _execute_server_side_query

@@ -3583,6 +3677,11 @@ class LanceTable(Table):
    ) -> AlterColumnsResult:
        return LOOP.run(self._table.alter_columns(*alterations))

+    def update_field_metadata(
+        self, *updates: dict[str, Any]
+    ) -> UpdateFieldMetadataResult:
+        return LOOP.run(self._table.update_field_metadata(*updates))
+
    def drop_columns(self, columns: Iterable[str]) -> DropColumnsResult:
        return LOOP.run(self._table.drop_columns(columns))

@@ -3637,10 +3736,18 @@ class LanceTable(Table):
        """
        LOOP.run(self._table.migrate_v2_manifest_paths())

+    @deprecation.deprecated(
+        deprecated_in="0.33.1",
+        current_version=__version__,
+        details="Use update_field_metadata() instead.",
+    )
    def replace_field_metadata(self, field_name: str, new_metadata: Dict[str, str]):
        """
        Replace the metadata of a field in the schema

+        .. deprecated:: 0.33.1
+            Use :meth:`update_field_metadata` instead.
+
        Parameters
        ----------
        field_name: str
@@ -4120,7 +4227,14 @@ class AsyncTable:
    [AsyncTable.create_index][lancedb.table.AsyncTable.create_index].
    """

-    def __init__(self, table: LanceDBTable):
+    def __init__(
+        self,
+        table: LanceDBTable,
+        *,
+        namespace_path: Optional[List[str]] = None,
+        namespace_client: Optional[Any] = None,
+        pushdown_operations: Optional[set] = None,
+    ):
        """Create a new AsyncTable object.

        You should not create AsyncTable objects directly.
@@ -4129,6 +4243,21 @@ class AsyncTable:
        [AsyncConnection.open_table][lancedb.AsyncConnection.open_table] to obtain
        Table objects."""
        self._inner = table
+        self._namespace_path = namespace_path or []
+        self._namespace_client = namespace_client
+        self._pushdown_operations = pushdown_operations or set()
+
+    def _set_namespace_context(
+        self,
+        *,
+        namespace_path: Optional[List[str]] = None,
+        namespace_client: Optional[Any] = None,
+        pushdown_operations: Optional[set] = None,
+    ) -> "AsyncTable":
+        self._namespace_path = namespace_path or []
+        self._namespace_client = namespace_client
+        self._pushdown_operations = pushdown_operations or set()
+        return self

    def __repr__(self):
        return self._inner.__repr__()
@@ -4280,7 +4409,7 @@ class AsyncTable:
        can be executed with methods like [to_arrow][lancedb.query.AsyncQuery.to_arrow],
        [to_pandas][lancedb.query.AsyncQuery.to_pandas] and more.
        """
-        return AsyncQuery(self._inner.query())
+        return AsyncQuery(self._inner.query(), self)

    async def _to_lance(self, **kwargs) -> lance.LanceDataset:
        try:
@@ -4291,12 +4420,20 @@ class AsyncTable:
                "Please install with `pip install pylance`."
            )

-        return lance.dataset(
+        # lance.dataset() can't open a branch directly, so open the base table
+        # and check out the branch ref (a None branch resolves to main).
+        branch = self.current_branch()
+        table_version = await self.version()
+        version = None if branch is not None else table_version
+        ds = lance.dataset(
            await self.uri(),
-            version=await self.version(),
+            version=version,
            storage_options=await self.latest_storage_options(),
            **kwargs,
        )
+        if branch is not None:
+            ds = ds.checkout_version((branch, table_version))
+        return ds

    async def to_pandas(self, blob_mode: BlobMode = "lazy", **kwargs) -> "pd.DataFrame":
        """Return the table as a pandas DataFrame.
@@ -4312,7 +4449,13 @@ class AsyncTable:
        -------
        pd.DataFrame
        """
-        if blob_mode == "lazy":
+        _validate_blob_mode(blob_mode)
+        if blob_mode == "descriptions" or not _schema_has_blob_field(
+            await self.schema()
+        ):
+            return (await self.to_arrow()).to_pandas(**kwargs)
+
+        if blob_mode == "lazy" and get_uri_scheme(await self.uri()) == "memory":
            return (await self.to_arrow()).to_pandas(**kwargs)
        return (await self._to_lance()).to_pandas(blob_mode=blob_mode, **kwargs)

@@ -4323,6 +4466,11 @@ class AsyncTable:
        -------
        pa.Table
        """
+        if _should_push_down_query_table(
+            self._namespace_client, self._pushdown_operations
+        ):
+            return (await self._execute_query(Query())).read_all()
+
        return await self.query().to_arrow()

    async def create_index(
@@ -4341,6 +4489,7 @@ class AsyncTable:
                BTree,
                Bitmap,
                LabelList,
+                Fm,
                FTS,
            ]
        ] = None,
@@ -4393,12 +4542,14 @@ class AsyncTable:
                    BTree,
                    Bitmap,
                    LabelList,
+                    Fm,
                    FTS,
                ),
            ):
                raise TypeError(
                    "config must be an instance of IvfSq, IvfPq, IvfRq, HnswPq, HnswSq,"
-                    " BTree, Bitmap, LabelList, or FTS, but got " + str(type(config))
+                    " BTree, Bitmap, LabelList, Fm, or FTS, but got "
+                    + str(type(config))
                )
        try:
            await self._inner.create_index(
@@ -4840,10 +4991,13 @@ class AsyncTable:
            if embedding is not None:
                loop = asyncio.get_running_loop()
                # This function is likely to block, since it either calls an expensive
-                # function or makes an HTTP request to an embeddings REST API.
+                # function or makes an HTTP request to an embeddings REST API. Run it
+                # on a dedicated executor so it can't starve the default executor that
+                # other blocking I/O shares. See
+                # https://github.com/lancedb/lancedb/issues/3310.
                return (
                    await loop.run_in_executor(
-                        None,
+                        embedding_executor(),
                        embedding.function.compute_query_embeddings_with_retry,
                        query,
                    )
@@ -4997,6 +5151,14 @@ class AsyncTable:
        batch_size: Optional[int] = None,
        timeout: Optional[timedelta] = None,
    ) -> pa.RecordBatchReader:
+        if _should_push_down_query_table(
+            self._namespace_client, self._pushdown_operations
+        ):
+            from lancedb.namespace import _execute_server_side_query
+
+            table_id = self._namespace_path + [self.name]
+            return _execute_server_side_query(self._namespace_client, table_id, query)
+
        # The sync table calls into this method, so we need to map the
        # query to the async version of the query and run that here. This is only
        # used for that code path right now.
@@ -5234,6 +5396,13 @@ class AsyncTable:
        """
        return await self._inner.alter_columns(alterations)

+    async def update_field_metadata(
+        self, *updates: dict[str, Any]
+    ) -> UpdateFieldMetadataResult:
+        """Update per-field metadata. See
+        [`Table.update_field_metadata`][lancedb.table.Table.update_field_metadata]."""
+        return await self._inner.update_field_metadata(updates)
+
    async def drop_columns(self, columns: Iterable[str]):
        """
        Drop columns from the table.
@@ -5349,7 +5518,7 @@ class AsyncTable:
        pa.RecordBatch
            A record batch containing the rows at the given offsets.
        """
-        return AsyncTakeQuery(self._inner.take_offsets(offsets))
+        return AsyncTakeQuery(self._inner.take_offsets(offsets), self)

    def take_row_ids(self, row_ids: list[int]) -> AsyncTakeQuery:
        """
@@ -5378,7 +5547,7 @@ class AsyncTable:
        AsyncTakeQuery
            A query object that can be executed to get the rows.
        """
-        return AsyncTakeQuery(self._inner.take_row_ids(row_ids))
+        return AsyncTakeQuery(self._inner.take_row_ids(row_ids), self)

    @property
    def tags(self) -> AsyncTags:
@@ -5398,6 +5567,19 @@ class AsyncTable:
        """
        return AsyncTags(self._inner)

+    @property
+    def branches(self) -> AsyncBranches:
+        """Branch management for the table.
+
+        Branches are isolated, writable lines of history forked from another
+        branch (or version). Writes on a branch do not affect ``main``.
+        """
+        return AsyncBranches(self._inner)
+
+    def current_branch(self) -> Optional[str]:
+        """The branch this table handle is scoped to, or ``None`` for ``main``."""
+        return self._inner.current_branch()
+
    async def optimize(
        self,
        *,
@@ -5518,12 +5700,20 @@ class AsyncTable:
        """
        await self._inner.migrate_manifest_paths_v2()

+    @deprecation.deprecated(
+        deprecated_in="0.33.1",
+        current_version=__version__,
+        details="Use update_field_metadata() instead.",
+    )
    async def replace_field_metadata(
        self, field_name: str, new_metadata: dict[str, str]
    ):
        """
        Replace the metadata of a field in the schema

+        .. deprecated:: 0.33.1
+            Use :meth:`update_field_metadata` instead.
+
        Parameters
        ----------
        field_name: str
@@ -5551,8 +5741,6 @@ class IndexStatistics:
        The distance type used by the index.
    num_indices: Optional[int]
        The number of parts the index is split into.
-    loss: Optional[float]
-        The KMeans loss for the index, for only vector indices.
    """

    num_indexed_rows: int
@@ -5572,7 +5760,6 @@ class IndexStatistics:
    ]
    distance_type: Optional[Literal["l2", "cosine", "dot"]] = None
    num_indices: Optional[int] = None
-    loss: Optional[float] = None

    # This exists for backwards compatibility with an older API, which returned
    # a dictionary instead of a class.
@@ -5725,6 +5912,75 @@ class Tags:
        LOOP.run(self._table.tags.update(tag, version))


+class Branches:
+    """
+    Table branch manager.
+    """
+
+    def __init__(self, parent: "LanceTable"):
+        self._parent = parent
+        self._table = parent._table
+
+    def list(self) -> Dict[str, Any]:
+        """List all branches, mapping name to branch metadata."""
+        return LOOP.run(self._table.branches.list())
+
+    def create(
+        self,
+        name: str,
+        from_ref: Optional[str] = None,
+        from_version: Optional[int] = None,
+    ) -> "LanceTable":
+        """Create a branch and return a handle scoped to it.
+
+        Parameters
+        ----------
+        name: str
+            Name of the new branch.
+        from_ref: str, optional
+            Source branch to fork from. Defaults to ``main``.
+        from_version: int, optional
+            A specific version on ``from_ref`` to fork from. Defaults to latest.
+        """
+        async_table = LOOP.run(
+            self._table.branches.create(name, from_ref, from_version)
+        )
+        return self._wrap(async_table)
+
+    def checkout(self, name: str, version: Optional[int] = None) -> "LanceTable":
+        """Check out an existing branch and return a handle scoped to it.
+
+        Parameters
+        ----------
+        name: str
+            Name of the branch to check out.
+        version: int, optional
+            A specific version on the branch to pin. When set, the returned
+            handle is a read-only view of that version; when omitted it tracks
+            the branch's latest and stays writable.
+        """
+        async_table = LOOP.run(self._table.branches.checkout(name, version))
+        return self._wrap(async_table)
+
+    def delete(self, name: str) -> None:
+        """Delete a branch."""
+        LOOP.run(self._table.branches.delete(name))
+
+    def _wrap(self, async_table: "AsyncTable") -> "LanceTable":
+        # Reuse the parent's connection + namespace context; from_inner would drop
+        # it and break identity/query routing for namespace-backed tables.
+        parent = self._parent
+        return LanceTable(
+            parent._conn,
+            async_table.name,
+            namespace_path=parent._namespace_path,
+            namespace_client=parent._namespace_client,
+            pushdown_operations=parent._pushdown_operations,
+            location=parent._location,
+            _async=async_table,
+        )
+
+
 class AsyncTags:
    """
    Async table tag manager.
@@ -5792,3 +6048,56 @@ class AsyncTags:
            The new table version to tag.
        """
        await self._table.tags.update(tag, version)
+
+
+class AsyncBranches:
+    """Async table branch manager."""
+
+    def __init__(self, table):
+        self._table = table
+
+    async def list(self) -> Dict[str, Any]:
+        """List all branches, mapping name to branch metadata."""
+        return await self._table.branches.list()
+
+    async def create(
+        self,
+        name: str,
+        from_ref: Optional[str] = None,
+        from_version: Optional[int] = None,
+    ) -> "AsyncTable":
+        """Create a branch and return a handle scoped to it.
+
+        Parameters
+        ----------
+        name: str
+            Name of the new branch.
+        from_ref: str, optional
+            Source branch to fork from. Defaults to ``main``.
+        from_version: int, optional
+            A specific version on ``from_ref`` to fork from. Defaults to latest.
+        """
+        # "main" and None are two spellings of the root branch in Lance; normalize
+        # so that from_ref="main" behaves identically to the default.
+        if from_ref == "main":
+            from_ref = None
+        inner = await self._table.branches.create(name, from_ref, from_version)
+        return AsyncTable(inner)
+
+    async def checkout(self, name: str, version: Optional[int] = None) -> "AsyncTable":
+        """Check out an existing branch and return a handle scoped to it.
+
+        Parameters
+        ----------
+        name: str
+            Name of the branch to check out.
+        version: int, optional
+            A specific version on the branch to pin. When set, the returned
+            handle is a read-only view of that version; when omitted, it tracks
+            the branch's latest version and stays writable.
+        """
+        return AsyncTable(await self._table.branches.checkout(name, version))
+
+    async def delete(self, name: str) -> None:
+        """Delete a branch."""
+        await self._table.branches.delete(name)
--- a/python/python/lancedb/util.py
+++ b/python/python/lancedb/util.py
@@ -385,6 +385,21 @@ def _(value: np.ndarray):
    return value_to_sql(value.tolist())


+@value_to_sql.register(np.bool_)
+def _(value: np.bool_):
+    return value_to_sql(bool(value))
+
+
+@value_to_sql.register(np.integer)
+def _(value: np.integer):
+    return value_to_sql(int(value))
+
+
+@value_to_sql.register(np.floating)
+def _(value: np.floating):
+    return value_to_sql(float(value))
+
+
 def deprecated(func):
    """This is a decorator which can be used to mark functions
    as deprecated. It will result in a warning being emitted
--- a/python/python/tests/test_errors.py
+++ b/python/python/tests/test_errors.py
@@ -0,0 +1,56 @@
+# SPDX-License-Identifier: Apache-2.0
+# SPDX-FileCopyrightText: Copyright The LanceDB Authors
+
+import pickle
+
+from lancedb.remote.errors import HttpError, LanceDBClientError, RetryError
+
+
+def test_pickle_lancedb_client_error():
+    err = LanceDBClientError("something went wrong", "req-123", 400)
+    restored = pickle.loads(pickle.dumps(err))
+    assert str(restored) == "something went wrong"
+    assert restored.request_id == "req-123"
+    assert restored.status_code == 400
+
+
+def test_pickle_lancedb_client_error_no_status_code():
+    err = LanceDBClientError("fail", "req-456")
+    restored = pickle.loads(pickle.dumps(err))
+    assert str(restored) == "fail"
+    assert restored.request_id == "req-456"
+    assert restored.status_code is None
+
+
+def test_pickle_http_error():
+    err = HttpError("not found", "req-789", 404)
+    restored = pickle.loads(pickle.dumps(err))
+    assert isinstance(restored, HttpError)
+    assert str(restored) == "not found"
+    assert restored.request_id == "req-789"
+    assert restored.status_code == 404
+
+
+def test_pickle_retry_error():
+    err = RetryError(
+        "max retries exceeded",
+        "req-abc",
+        request_failures=3,
+        connect_failures=1,
+        read_failures=2,
+        max_request_failures=5,
+        max_connect_failures=3,
+        max_read_failures=3,
+        status_code=503,
+    )
+    restored = pickle.loads(pickle.dumps(err))
+    assert isinstance(restored, RetryError)
+    assert str(restored) == "max retries exceeded"
+    assert restored.request_id == "req-abc"
+    assert restored.request_failures == 3
+    assert restored.connect_failures == 1
+    assert restored.read_failures == 2
+    assert restored.max_request_failures == 5
+    assert restored.max_connect_failures == 3
+    assert restored.max_read_failures == 3
+    assert restored.status_code == 503
--- a/python/python/tests/test_index.py
+++ b/python/python/tests/test_index.py
@@ -20,6 +20,7 @@ from lancedb.index import (
    IvfRq,
    Bitmap,
    LabelList,
+    Fm,
    HnswPq,
    HnswSq,
    HnswFlat,
@@ -113,8 +114,14 @@ async def test_create_nested_scalar_index_lists_canonical_paths(db_async):
            pa.field("user.id", pa.int32()),
        ]
    )
+    mixed_case_metadata_type = pa.struct([pa.field("userId", pa.int32())])
+    escaped_metadata_type = pa.struct([pa.field("user-id", pa.int32())])
+    literal_type = pa.struct([pa.field("a.b", pa.int32())])
    data = pa.Table.from_arrays(
        [
+            pa.array([1, 2, 3], type=pa.int32()),
+            pa.array([1, 2, 3], type=pa.int32()),
+            pa.array([1, 2, 3], type=pa.int32()),
            pa.array([1, 2, 3], type=pa.int32()),
            pa.array(
                [
@@ -124,25 +131,67 @@ async def test_create_nested_scalar_index_lists_canonical_paths(db_async):
                ],
                type=metadata_type,
            ),
+            pa.array(
+                [{"userId": 10}, {"userId": 20}, {"userId": 30}],
+                type=mixed_case_metadata_type,
+            ),
+            pa.array(
+                [{"user-id": 10}, {"user-id": 20}, {"user-id": 30}],
+                type=escaped_metadata_type,
+            ),
+            pa.array(
+                [{"a.b": 10}, {"a.b": 20}, {"a.b": 30}],
+                type=literal_type,
+            ),
+        ],
+        names=[
+            "rowId",
+            "row-id",
+            "userId",
+            "user_id",
+            "metadata",
+            "MetaData",
+            "meta-data",
+            "literal",
        ],
-        names=["user_id", "metadata"],
    )
    table = await db_async.create_table("nested_scalar_index", data)

-    await table.create_index("user_id", config=BTree(), name="top_user_id_idx")
+    await table.create_index("rowId", config=BTree(), name="row_id_idx")
+    await table.create_index("`row-id`", config=BTree(), name="row_dash_id_idx")
+    await table.create_index("userId", config=BTree(), name="top_user_id_idx")
+    await table.create_index("user_id", config=BTree(), name="top_snake_user_id_idx")
    await table.create_index(
        "metadata.user_id", config=BTree(), name="nested_user_id_idx"
    )
    await table.create_index(
        "metadata.`user.id`", config=BTree(), name="escaped_user_id_idx"
    )
+    await table.create_index(
+        "MetaData.userId", config=BTree(), name="mixed_case_metadata_user_id_idx"
+    )
+    await table.create_index(
+        "`meta-data`.`user-id`", config=BTree(), name="escaped_names_idx"
+    )
+    await table.create_index("literal.`a.b`", config=BTree(), name="literal_dot_idx")

    columns_by_name = {
        index.name: index.columns for index in await table.list_indices()
    }
-    assert columns_by_name["top_user_id_idx"] == ["user_id"]
+    assert columns_by_name["row_id_idx"] == ["rowId"]
+    assert columns_by_name["row_dash_id_idx"] == ["`row-id`"]
+    assert columns_by_name["top_user_id_idx"] == ["userId"]
+    assert columns_by_name["top_snake_user_id_idx"] == ["user_id"]
    assert columns_by_name["nested_user_id_idx"] == ["metadata.user_id"]
    assert columns_by_name["escaped_user_id_idx"] == ["metadata.`user.id`"]
+    assert columns_by_name["mixed_case_metadata_user_id_idx"] == ["MetaData.userId"]
+    assert columns_by_name["escaped_names_idx"] == ["`meta-data`.`user-id`"]
+    assert columns_by_name["literal_dot_idx"] == ["literal.`a.b`"]
+
+    for index_name in columns_by_name:
+        stats = await table.index_stats(index_name)
+        assert stats is not None
+        assert stats.num_indexed_rows == 3


@pytest.mark.asyncio
@@ -155,6 +204,16 @@ async def test_create_fixed_size_binary_index(some_table: AsyncTable):
    assert indices[0].columns == ["fsb"]


+@pytest.mark.asyncio
+async def test_create_fm_index(some_table: AsyncTable):
+    # FM-Index accelerates substring search on string/binary columns.
+    await some_table.create_index("data", config=Fm())
+    indices = await some_table.list_indices()
+    assert len(indices) == 1
+    assert indices[0].index_type == "Fm"
+    assert indices[0].columns == ["data"]
+
+
@pytest.mark.asyncio
 async def test_create_bitmap_index(some_table: AsyncTable):
    await some_table.create_index("id", config=Bitmap())
@@ -189,6 +248,51 @@ async def test_create_label_list_index(some_table: AsyncTable):
    await some_table.create_index("tags", config=LabelList())
    indices = await some_table.list_indices()
    assert str(indices) == '[Index(LabelList, columns=["tags"], name="tags_idx")]'
+    plan = await some_table.query().where("array_has(tags, 'tag0')").explain_plan()
+    assert "ScalarIndexQuery" in plan
+
+
+@pytest.mark.asyncio
+async def test_create_large_list_label_list_index(db_async):
+    data = pa.Table.from_pydict(
+        {"tags": [[f"tag{i % 2}", "shared"] for i in range(16)]},
+        schema=pa.schema([pa.field("tags", pa.large_list(pa.string()))]),
+    )
+    table = await db_async.create_table("large_list_label_list_index", data)
+
+    await table.create_index("tags", config=LabelList())
+    indices = await table.list_indices()
+    assert str(indices) == '[Index(LabelList, columns=["tags"], name="tags_idx")]'
+    plan = await table.query().where("array_has(tags, 'shared')").explain_plan()
+    assert "ScalarIndexQuery" in plan
+
+
+@pytest.mark.asyncio
+async def test_create_label_list_index_rejects_list_struct(db_async):
+    item_type = pa.struct(
+        [
+            pa.field("tag", pa.string()),
+            pa.field(
+                "metadata",
+                pa.struct([pa.field("userId", pa.string())]),
+            ),
+        ]
+    )
+    data = pa.Table.from_pylist(
+        [
+            {
+                "items": [
+                    {"tag": "tag0", "metadata": {"userId": "user0"}},
+                    {"tag": "shared", "metadata": {"userId": "user1"}},
+                ]
+            }
+        ],
+        schema=pa.schema([pa.field("items", pa.list_(item_type))]),
+    )
+    table = await db_async.create_table("list_struct_label_list_index", data)
+
+    with pytest.raises(Exception, match="LabelList index cannot be created"):
+        await table.create_index("items", config=LabelList())


@pytest.mark.asyncio
@@ -226,7 +330,6 @@ async def test_create_vector_index(some_table: AsyncTable):
    assert stats.num_indexed_rows == await some_table.count_rows()
    assert stats.num_unindexed_rows == 0
    assert stats.num_indices == 1
-    assert stats.loss >= 0.0


@pytest.mark.asyncio
@@ -250,7 +353,6 @@ async def test_create_4bit_ivfpq_index(some_table: AsyncTable):
    assert stats.num_indexed_rows == await some_table.count_rows()
    assert stats.num_unindexed_rows == 0
    assert stats.num_indices == 1
-    assert stats.loss >= 0.0


@pytest.mark.asyncio
--- a/python/python/tests/test_namespace.py
+++ b/python/python/tests/test_namespace.py
@@ -5,10 +5,67 @@

 import tempfile
 import shutil
+import sys
 import pytest
 import pyarrow as pa
 import lancedb
 from lance_namespace.errors import NamespaceNotEmptyError, TableNotFoundError
+from lancedb.table import AsyncTable, LanceTable
+
+
+PUSHDOWN_DATA = pa.table(
+    {"id": list(range(12)), "text": [f"row-{idx}" for idx in range(12)]}
+)
+
+
+def _ipc_file(table: pa.Table = PUSHDOWN_DATA) -> bytes:
+    sink = pa.BufferOutputStream()
+    with pa.ipc.new_file(sink, table.schema) as writer:
+        writer.write_table(table)
+    return sink.getvalue().to_pybytes()
+
+
+class _FailingSyncInner:
+    name = "hist"
+
+    def current_branch(self):
+        # The pushdown gate only routes server-side when on the default branch.
+        return None
+
+    async def schema(self):
+        return PUSHDOWN_DATA.schema
+
+    async def to_arrow(self):
+        raise RuntimeError("direct table to_arrow should not be used")
+
+
+class _FailingAsyncInner:
+    def name(self):
+        return "hist"
+
+    async def schema(self):
+        return PUSHDOWN_DATA.schema
+
+    def query(self):
+        raise AssertionError("direct async query should not be used")
+
+
+class _NamespaceClient:
+    def __init__(self):
+        self.requests = []
+
+    def query_table(self, request):
+        self.requests.append(request)
+        return _ipc_file()
+
+
+def _namespace_lance_table(namespace_client: _NamespaceClient) -> LanceTable:
+    table = LanceTable.__new__(LanceTable)
+    table._table = _FailingSyncInner()
+    table._namespace_path = ["geneva"]
+    table._namespace_client = namespace_client
+    table._pushdown_operations = {"QueryTable"}
+    return table


 class TestNamespaceConnection:
@@ -76,6 +133,35 @@ class TestNamespaceConnection:
        assert len(result) == 0
        assert list(result.columns) == ["id", "vector", "text"]

+    def test_table_to_pandas_blob_lazy_through_namespace(self):
+        """Namespace-backed tables should use Lance blob-aware pandas conversion."""
+        pytest.importorskip("lance")
+        db = lancedb.connect_namespace("dir", {"root": self.temp_dir})
+        db.create_namespace(["test_ns"])
+        data = pa.table(
+            {
+                "id": pa.array([1, 2], pa.int64()),
+                "blob": pa.array([b"hello", b"world"], pa.large_binary()),
+            },
+            schema=pa.schema(
+                [
+                    pa.field("id", pa.int64()),
+                    pa.field(
+                        "blob",
+                        pa.large_binary(),
+                        metadata={"lance-encoding:blob": "true"},
+                    ),
+                ]
+            ),
+        )
+
+        table = db.create_table("blob_table", data, namespace_path=["test_ns"])
+        df = table.to_pandas(blob_mode="lazy").sort_values("id")
+
+        blob = df["blob"].iloc[0]
+        assert hasattr(blob, "readall")
+        assert blob.readall() == b"hello"
+
    def test_open_table_through_namespace(self):
        """Test opening an existing table through namespace."""
        db = lancedb.connect_namespace("dir", {"root": self.temp_dir})
@@ -707,6 +793,22 @@ class TestPushdownOperations:
        db = lancedb.connect_namespace("dir", {"root": self.temp_dir})
        assert len(db._namespace_client_pushdown_operations) == 0

+    def test_lance_table_to_arrow_uses_query_pushdown(self):
+        namespace_client = _NamespaceClient()
+        table = _namespace_lance_table(namespace_client)
+
+        assert table.to_arrow().equals(PUSHDOWN_DATA)
+        assert table.to_pandas()["id"].tolist() == list(range(12))
+        assert len(namespace_client.requests) == 2
+        assert [request.id for request in namespace_client.requests] == [
+            ["geneva", "hist"],
+            ["geneva", "hist"],
+        ]
+        assert [request.k for request in namespace_client.requests] == [
+            sys.maxsize,
+            sys.maxsize,
+        ]
+

@pytest.mark.asyncio
 class TestAsyncPushdownOperations:
@@ -742,3 +844,39 @@ class TestAsyncPushdownOperations:
        """Test that pushdown operations default to empty on async connection."""
        db = lancedb.connect_namespace_async("dir", {"root": self.temp_dir})
        assert len(db._namespace_client_pushdown_operations) == 0
+
+    async def test_async_table_to_arrow_uses_query_pushdown(self):
+        namespace_client = _NamespaceClient()
+
+        table = AsyncTable(
+            _FailingAsyncInner(),
+            namespace_path=["geneva"],
+            namespace_client=namespace_client,
+            pushdown_operations={"QueryTable"},
+        )
+
+        assert (await table.to_arrow()).equals(PUSHDOWN_DATA)
+        assert (await table.to_pandas())["id"].tolist() == list(range(12))
+        assert len(namespace_client.requests) == 2
+        assert [request.id for request in namespace_client.requests] == [
+            ["geneva", "hist"],
+            ["geneva", "hist"],
+        ]
+        assert [request.k for request in namespace_client.requests] == [
+            sys.maxsize,
+            sys.maxsize,
+        ]
+
+
+def test_local_table_to_arrow_and_to_pandas_are_unchanged(tmp_path):
+    db = lancedb.connect(str(tmp_path / "db"))
+    table = db.create_table(
+        "local",
+        data=[
+            {"id": 1, "vector": [1.0, 2.0]},
+            {"id": 2, "vector": [3.0, 4.0]},
+        ],
+    )
+
+    assert table.to_arrow().column("id").to_pylist() == [1, 2]
+    assert table.to_pandas()["id"].tolist() == [1, 2]
--- a/python/python/tests/test_query.py
+++ b/python/python/tests/test_query.py
@@ -39,6 +39,35 @@ from utils import exception_output
 from importlib.util import find_spec


+def _blob_query_data():
+    return pa.table(
+        {
+            "id": pa.array([1, 2, 3, 4], pa.int64()),
+            "tag": pa.array(["drop", "keep", "keep", "keep"], pa.utf8()),
+            "vector": pa.array(
+                [[1.0, 0.0], [2.0, 0.0], [3.0, 0.0], [4.0, 0.0]],
+                type=pa.list_(pa.float32(), list_size=2),
+            ),
+            "blob": pa.array([b"one", b"two", b"three", b"four"], pa.large_binary()),
+        },
+        schema=pa.schema(
+            [
+                pa.field("id", pa.int64()),
+                pa.field("tag", pa.utf8()),
+                pa.field("vector", pa.list_(pa.float32(), list_size=2)),
+                pa.field(
+                    "blob", pa.large_binary(), metadata={"lance-encoding:blob": "true"}
+                ),
+            ]
+        ),
+    )
+
+
+def _assert_lazy_blob(value, expected: bytes):
+    assert hasattr(value, "readall")
+    assert value.readall() == expected
+
+
@pytest.fixture(scope="module")
 def table(tmpdir_factory) -> lancedb.table.Table:
    tmp_path = str(tmpdir_factory.mktemp("data"))
@@ -181,6 +210,216 @@ async def test_query_to_pandas_kwargs(table, table_async):
    assert async_df["id"].tolist() == [1, 2]


+@pytest.mark.parametrize("blob_mode", ["lazy", "bytes", "descriptions"])
+def test_plain_scan_query_to_pandas_blob_modes(tmp_db, blob_mode):
+    pytest.importorskip("lance")
+    table = tmp_db.create_table(
+        f"test_query_to_pandas_blob_{blob_mode}", _blob_query_data()
+    )
+
+    df = (
+        table.search()
+        .select(["id", "blob"])
+        .where("id = 1")
+        .to_pandas(blob_mode=blob_mode)
+    )
+
+    assert df["id"].tolist() == [1]
+    if blob_mode == "lazy":
+        _assert_lazy_blob(df["blob"].iloc[0], b"one")
+    elif blob_mode == "bytes":
+        assert df["blob"].tolist() == [b"one"]
+    else:
+        first = df["blob"].iloc[0]
+        assert first != b"one"
+        assert not hasattr(first, "readall")
+
+
+def test_plain_scan_query_to_pandas_blob_projection(tmp_db):
+    pytest.importorskip("lance")
+    table = tmp_db.create_table(
+        "test_query_to_pandas_blob_projection", _blob_query_data()
+    )
+
+    df = (
+        table.search()
+        .where("id >= 2")
+        .select({"id_alias": "id", "payload": "blob", "double_id": "id * 2"})
+        .limit(2)
+        .offset(1)
+        .to_pandas(blob_mode="bytes")
+    )
+
+    assert df["id_alias"].tolist() == [3, 4]
+    assert df["payload"].tolist() == [b"three", b"four"]
+    assert df["double_id"].tolist() == [6, 8]
+
+
+@pytest.mark.parametrize("blob_mode", ["bytes", "descriptions"])
+def test_plain_scan_query_to_pandas_blob_mode_does_not_collect_arrow(
+    tmp_db, monkeypatch, blob_mode
+):
+    pytest.importorskip("lance")
+    table = tmp_db.create_table(
+        "test_query_to_pandas_blob_no_arrow_collect", _blob_query_data()
+    )
+    query = table.search().where("id = 1").select(["id", "blob"])
+
+    def fail_to_arrow(*args, **kwargs):
+        raise AssertionError("to_arrow should not be called before native pandas")
+
+    monkeypatch.setattr(query, "to_arrow", fail_to_arrow)
+
+    df = query.to_pandas(blob_mode=blob_mode)
+
+    assert df["id"].tolist() == [1]
+    if blob_mode == "bytes":
+        assert df["blob"].tolist() == [b"one"]
+    else:
+        first = df["blob"].iloc[0]
+        assert first != b"one"
+        assert not hasattr(first, "readall")
+
+
+def test_plain_scan_query_to_pandas_blob_descriptions_flatten_uses_scanner(
+    tmp_db, monkeypatch
+):
+    pytest.importorskip("lance")
+    table = tmp_db.create_table(
+        "test_query_to_pandas_blob_desc_flatten", _blob_query_data()
+    )
+    query = table.search().where("id = 1").select(["id", "blob"])
+
+    def fail_to_arrow(*args, **kwargs):
+        raise AssertionError("to_arrow should not be called before scanner pandas")
+
+    monkeypatch.setattr(query, "to_arrow", fail_to_arrow)
+
+    df = query.to_pandas(blob_mode="descriptions", flatten=True)
+
+    assert df["id"].tolist() == [1]
+    assert any(column == "blob" or column.startswith("blob.") for column in df.columns)
+
+
+def test_plain_scan_query_to_pandas_scanner_state(tmp_db):
+    pytest.importorskip("lance")
+    data = _blob_query_data()
+    table = tmp_db.create_table("test_query_to_pandas_scanner_state", data.slice(0, 2))
+    table.add(data.slice(2, 2))
+
+    fragments = table.to_lance().get_fragments()
+    assert len(fragments) == 2
+
+    query = (
+        table.search()
+        .select(["id", "blob"])
+        .with_row_address()
+        .fragment_ids([fragments[1].fragment_id])
+    )
+    query_obj = query.to_query_object()
+    assert query_obj.with_row_address is True
+    assert query_obj.fragment_ids == [fragments[1].fragment_id]
+
+    df = query.to_pandas(blob_mode="descriptions")
+
+    assert df["id"].tolist() == [3, 4]
+    assert "_rowaddr" in df.columns
+    assert {rowaddr >> 32 for rowaddr in df["_rowaddr"]} == {fragments[1].fragment_id}
+
+    df_by_fragment = (
+        table.search()
+        .select(["id", "blob"])
+        .with_fragments([fragments[0]])
+        .to_pandas(blob_mode="descriptions")
+    )
+    assert df_by_fragment["id"].tolist() == [1, 2]
+
+
+@pytest.mark.asyncio
+async def test_async_plain_scan_query_to_pandas_blob_projection(tmp_db_async):
+    pytest.importorskip("lance")
+    table = await tmp_db_async.create_table(
+        "test_async_query_to_pandas_blob_projection", _blob_query_data()
+    )
+
+    lazy_df = await (
+        table.query().where("id = 1").select(["id", "blob"]).to_pandas(blob_mode="lazy")
+    )
+    assert lazy_df["id"].tolist() == [1]
+    _assert_lazy_blob(lazy_df["blob"].iloc[0], b"one")
+
+    bytes_df = await (
+        table.query()
+        .where("id >= 2")
+        .select({"id_alias": "id", "payload": "blob", "double_id": "id * 2"})
+        .limit(2)
+        .offset(1)
+        .to_pandas(blob_mode="bytes")
+    )
+    assert bytes_df["id_alias"].tolist() == [3, 4]
+    assert bytes_df["payload"].tolist() == [b"three", b"four"]
+    assert bytes_df["double_id"].tolist() == [6, 8]
+
+    desc_df = await (
+        table.query()
+        .where("id = 1")
+        .select(["blob"])
+        .to_pandas(blob_mode="descriptions")
+    )
+    first = desc_df["blob"].iloc[0]
+    assert first != b"one"
+    assert not hasattr(first, "readall")
+
+
+@pytest.mark.asyncio
+@pytest.mark.parametrize("blob_mode", ["bytes", "descriptions"])
+async def test_async_plain_scan_query_to_pandas_blob_mode_does_not_collect_arrow(
+    tmp_db_async, monkeypatch, blob_mode
+):
+    pytest.importorskip("lance")
+    table = await tmp_db_async.create_table(
+        "test_async_query_to_pandas_blob_no_arrow_collect", _blob_query_data()
+    )
+    query = table.query().where("id = 1").select(["id", "blob"])
+
+    async def fail_to_arrow(*args, **kwargs):
+        raise AssertionError("to_arrow should not be called before native pandas")
+
+    monkeypatch.setattr(query, "to_arrow", fail_to_arrow)
+
+    df = await query.to_pandas(blob_mode=blob_mode)
+
+    assert df["id"].tolist() == [1]
+    if blob_mode == "bytes":
+        assert df["blob"].tolist() == [b"one"]
+    else:
+        first = df["blob"].iloc[0]
+        assert first != b"one"
+        assert not hasattr(first, "readall")
+
+
+def test_vector_query_to_pandas_blob_mode_requires_native_path(tmp_db):
+    pytest.importorskip("lance")
+    table = tmp_db.create_table("test_vector_query_blob_mode", _blob_query_data())
+
+    with pytest.raises(RuntimeError, match="Lance native pandas conversion"):
+        table.search([1.0, 0.0]).select(["blob", "vector"]).limit(1).to_pandas(
+            blob_mode="lazy"
+        )
+
+
+def test_vector_query_to_pandas_blob_descriptions_requires_plain_scan(tmp_db):
+    pytest.importorskip("lance")
+    table = tmp_db.create_table(
+        "test_vector_query_blob_descriptions", _blob_query_data()
+    )
+
+    with pytest.raises(RuntimeError, match="plain scan query"):
+        table.search([1.0, 0.0]).select(["blob", "vector"]).limit(1).to_pandas(
+            blob_mode="descriptions"
+        )
+
+
 def test_order_by_plain_query(mem_db):
    table = mem_db.create_table(
        "test_order_by",
--- a/python/python/tests/test_remote_db.py
+++ b/python/python/tests/test_remote_db.py
@@ -1,12 +1,13 @@
 # SPDX-License-Identifier: Apache-2.0
 # SPDX-FileCopyrightText: Copyright The LanceDB Authors
-import re
 from concurrent.futures import ThreadPoolExecutor
 import contextlib
 from datetime import timedelta
 import http.server
 import json
 import multiprocessing as mp
+import pickle
+import re
 import sys
 import threading
 import time
@@ -153,6 +154,52 @@ async def test_async_checkout():
        assert await table.count_rows() == 300


+def test_remote_open_table_branch_and_version():
+    def handler(request):
+        # describe (table open + version validation) always succeeds
+        request.send_response(200)
+        request.send_header("Content-Type", "application/json")
+        request.end_headers()
+        request.wfile.write(
+            json.dumps({"version": 2, "schema": {"fields": []}}).encode()
+        )
+
+    with mock_lancedb_connection(handler) as db:
+        # version-only (and "main" + version) is allowed: remote supports
+        # version time-travel even though it has no branches
+        assert db.open_table("test", version=2) is not None
+        assert db.open_table("test", branch="main", version=2) is not None
+
+        # a non-main branch is rejected, with or without a version
+        with pytest.raises(NotImplementedError, match="branching"):
+            db.open_table("test", branch="exp")
+        with pytest.raises(NotImplementedError, match="branching"):
+            db.open_table("test", branch="exp", version=2)
+
+
+@pytest.mark.asyncio
+async def test_async_remote_open_table_branch_and_version():
+    def handler(request):
+        request.send_response(200)
+        request.send_header("Content-Type", "application/json")
+        request.end_headers()
+        request.wfile.write(
+            json.dumps({"version": 2, "schema": {"fields": []}}).encode()
+        )
+
+    async with mock_lancedb_connection_async(handler) as db:
+        # version-only (and "main" + version) is allowed: "main" is the default
+        # branch, so it must not hit the unsupported remote branch path
+        assert await db.open_table("test", version=2) is not None
+        assert await db.open_table("test", branch="main", version=2) is not None
+
+        # a non-main branch is rejected, with or without a version
+        with pytest.raises(NotImplementedError, match="branching"):
+            await db.open_table("test", branch="exp")
+        with pytest.raises(NotImplementedError, match="branching"):
+            await db.open_table("test", branch="exp", version=2)
+
+
 def test_table_len_sync():
    def handler(request):
        if request.path == "/v1/table/test/create/?mode=create":
@@ -171,6 +218,155 @@ def test_table_len_sync():
        assert len(table) == 1


+def test_remote_connection_serializes():
+    def handler(request):
+        request.send_response(200)
+        request.send_header("Content-Type", "application/json")
+        request.end_headers()
+        request.wfile.write(b'{"tables": []}')
+
+    with mock_lancedb_connection(handler) as db:
+        serialized = json.loads(db.serialize())
+        assert isinstance(serialized["client_config"], dict)
+        restored = lancedb.deserialize_conn(db.serialize())
+        assert restored.table_names() == []
+
+
+def test_remote_table_is_picklable():
+    def handler(request):
+        request.close_connection = True
+        if request.path == "/v1/table/test/describe/":
+            request.send_response(200)
+            request.send_header("Content-Type", "application/json")
+            request.end_headers()
+            payload = json.dumps(
+                {
+                    "version": 1,
+                    "schema": {
+                        "fields": [
+                            {"name": "id", "type": {"type": "int64"}, "nullable": False}
+                        ]
+                    },
+                }
+            )
+            request.wfile.write(payload.encode())
+        elif request.path == "/v1/table/test/count_rows/":
+            request.send_response(200)
+            request.send_header("Content-Type", "application/json")
+            request.end_headers()
+            request.wfile.write(b"3")
+        else:
+            request.send_response(404)
+            request.end_headers()
+
+    with mock_lancedb_connection(handler) as db:
+        table = db.open_table("test")
+        restored = pickle.loads(pickle.dumps(table))
+        assert restored.count_rows() == 3
+
+
+def test_remote_table_open_does_not_require_picklable_client_config():
+    from lancedb.remote import HeaderProvider
+
+    class LocalHeaderProvider(HeaderProvider):
+        def get_headers(self):
+            return {"X-Test-Header": "present"}
+
+    def handler(request):
+        request.close_connection = True
+        assert request.headers.get("X-Test-Header") == "present"
+        if request.path == "/v1/table/test/describe/":
+            request.send_response(200)
+            request.send_header("Content-Type", "application/json")
+            request.end_headers()
+            request.wfile.write(b'{"version": 1, "schema": {"fields": []}}')
+        elif request.path == "/v1/table/test/count_rows/":
+            request.send_response(200)
+            request.send_header("Content-Type", "application/json")
+            request.end_headers()
+            request.wfile.write(b"3")
+        else:
+            request.send_response(404)
+            request.end_headers()
+
+    with http.server.HTTPServer(
+        ("localhost", 0), make_mock_http_handler(handler)
+    ) as server:
+        port = server.server_address[1]
+        handle = threading.Thread(target=server.serve_forever)
+        handle.start()
+        try:
+            db = lancedb.connect(
+                "db://dev",
+                api_key="fake",
+                host_override=f"http://localhost:{port}",
+                client_config={
+                    "retry_config": {"retries": 0},
+                    "timeout_config": {"connect_timeout": 2, "read_timeout": 2},
+                    "header_provider": LocalHeaderProvider(),
+                },
+            )
+            table = db.open_table("test")
+            assert table.count_rows() == 3
+            with pytest.raises(ValueError, match="header_provider"):
+                pickle.dumps(table)
+        finally:
+            server.shutdown()
+            handle.join()
+
+
+def test_remote_permutation_is_picklable():
+    from lancedb.permutation import Permutation
+
+    rows = list(range(10))
+
+    def handler(request):
+        request.close_connection = True
+        if request.path == "/v1/table/test/describe/":
+            request.send_response(200)
+            request.send_header("Content-Type", "application/json")
+            request.end_headers()
+            payload = json.dumps(
+                {
+                    "version": 1,
+                    "schema": {
+                        "fields": [
+                            {"name": "a", "type": {"type": "int64"}, "nullable": False}
+                        ]
+                    },
+                }
+            )
+            request.wfile.write(payload.encode())
+        elif request.path == "/v1/table/test/count_rows/":
+            request.send_response(200)
+            request.send_header("Content-Type", "application/json")
+            request.end_headers()
+            request.wfile.write(str(len(rows)).encode())
+        elif request.path == "/v1/table/test/query/":
+            content_len = int(request.headers.get("Content-Length"))
+            body = json.loads(request.rfile.read(content_len))
+            if "filter" in body:
+                match = re.search(r"_rowoffset in \((.*?)\)", body["filter"])
+                offsets = [int(offset.strip()) for offset in match.group(1).split(",")]
+            else:
+                offsets = rows
+            table = pa.table({"a": [rows[offset] for offset in offsets]})
+
+            request.send_response(200)
+            request.send_header("Content-Type", "application/vnd.apache.arrow.file")
+            request.end_headers()
+            with pa.ipc.new_file(request.wfile, schema=table.schema) as writer:
+                writer.write_table(table)
+        else:
+            request.send_response(404)
+            request.end_headers()
+
+    with mock_lancedb_connection(handler) as db:
+        permutation = Permutation.identity(db.open_table("test"))
+        restored = pickle.loads(pickle.dumps(permutation))
+        assert restored.__getitems__([0, 2, 4]) == [{"a": 0}, {"a": 2}, {"a": 4}]
+
+
 def test_create_table_exist_ok():
    def handler(request):
        if request.path == "/v1/table/test/create/?mode=exist_ok":
@@ -1400,6 +1596,10 @@ def _remote_fork_child(port: int, queue) -> None:
    queue.put(db.table_names())


+def _remote_table_fork_child(table, queue) -> None:
+    queue.put(table.count_rows())
+
+
@pytest.mark.skipif(
    sys.platform != "linux",
    reason=(
@@ -1462,3 +1662,65 @@ def test_remote_connection_after_fork():
    finally:
        server.shutdown()
        server_thread.join()
+
+
+@pytest.mark.skipif(
+    sys.platform != "linux",
+    reason=(
+        "fork() is unavailable on Windows and unsafe on macOS "
+        "(Apple frameworks/TLS are not fork-safe)"
+    ),
+)
+def test_inherited_remote_table_reopens_after_fork():
+    def handler(request):
+        if request.path == "/v1/table/test/describe/":
+            request.send_response(200)
+            request.send_header("Content-Type", "application/json")
+            request.end_headers()
+            request.wfile.write(b'{"version": 1, "schema": {"fields": []}}')
+        elif request.path == "/v1/table/test/count_rows/":
+            request.send_response(200)
+            request.send_header("Content-Type", "application/json")
+            request.end_headers()
+            request.wfile.write(b"7")
+        else:
+            request.send_response(404)
+            request.end_headers()
+
+    server = http.server.HTTPServer(("localhost", 0), make_mock_http_handler(handler))
+    port = server.server_address[1]
+    server_thread = threading.Thread(target=server.serve_forever)
+    server_thread.start()
+    try:
+        db = lancedb.connect(
+            "db://dev",
+            api_key="fake",
+            host_override=f"http://localhost:{port}",
+            client_config={
+                "retry_config": {"retries": 0},
+                "timeout_config": {"connect_timeout": 2, "read_timeout": 2},
+            },
+        )
+        table = db.open_table("test")
+        assert table.count_rows() == 7
+
+        ctx = mp.get_context("fork")
+        queue = ctx.Queue()
+        proc = ctx.Process(target=_remote_table_fork_child, args=(table, queue))
+        proc.start()
+        proc.join(timeout=15)
+
+        if proc.is_alive():
+            proc.terminate()
+            proc.join(timeout=5)
+            if proc.is_alive():
+                proc.kill()
+                proc.join()
+            pytest.fail("Remote table hung after fork")
+
+        assert proc.exitcode == 0, f"child exited with code {proc.exitcode}"
+        assert not queue.empty(), "child produced no result"
+        assert queue.get() == 7
+    finally:
+        server.shutdown()
+        server_thread.join()
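
The remote-table tests above check three behaviours: a table handle survives a pickle round trip, pickling fails with a `ValueError` naming `header_provider` when the client config holds a custom provider, and a handle still works when it is passed to a forked child. The standard-library-only sketch below shows one way such behaviour can be built: serialize only the connection parameters and rebuild the live client on load. It is not LanceDB's implementation. `FakeClient`, `RemoteTableHandle`, and the error message are hypothetical names chosen for illustration.

```python
import pickle
import threading


class FakeClient:
    """Hypothetical stand-in for a live network client.

    It holds a lock, so the standard pickle module cannot serialize it.
    """

    def __init__(self, url):
        self.url = url
        self._lock = threading.Lock()

    def count_rows(self, name):
        # A real client would issue an HTTP request here.
        return 3


class RemoteTableHandle:
    """Hypothetical table handle that reconnects when unpickled."""

    def __init__(self, url, name, header_provider=None):
        self.url = url
        self.name = name
        self.header_provider = header_provider
        self._client = FakeClient(url)

    def count_rows(self):
        return self._client.count_rows(self.name)

    def __getstate__(self):
        # Refuse to pickle when the config carries an arbitrary object.
        if self.header_provider is not None:
            raise ValueError("header_provider cannot be pickled")
        # Serialize only what is needed to reconnect, never the live client.
        return {"url": self.url, "name": self.name}

    def __setstate__(self, state):
        self.url = state["url"]
        self.name = state["name"]
        self.header_provider = None
        # Build a fresh client in the process that loads the pickle.
        self._client = FakeClient(self.url)


table = RemoteTableHandle("http://localhost:1234", "test")
restored = pickle.loads(pickle.dumps(table))
```

In this sketch the restored handle owns a new client rather than a copy of the old one. The same reconnect-on-load idea applies when a handle is pickled to a child process.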
--- a/python/python/tests/test_rerankers.py
+++ b/python/python/tests/test_rerankers.py
@@ -344,6 +344,12 @@ def test_mrr_reranker(tmp_path):
    assert len(result_deduped) == len(result)


+def test_mrr_reranker_empty_input():
+    reranker = MRRReranker()
+    with pytest.raises(ValueError, match="must not be empty"):
+        reranker.rerank_multivector([])
+
+
 def test_rrf_reranker_distance():
    data = pa.table(
        {
--- a/python/python/tests/test_table.py
+++ b/python/python/tests/test_table.py
@@ -4,6 +4,7 @@

 import os
 import sys
+import threading
 import warnings
 from datetime import date, datetime, timedelta
 from time import sleep
@@ -26,6 +27,28 @@ from lancedb.table import LanceTable
 from pydantic import BaseModel


+def _blob_test_data():
+    return pa.table(
+        {
+            "id": pa.array([1, 2], pa.int64()),
+            "blob": pa.array([b"hello", b"world"], pa.large_binary()),
+        },
+        schema=pa.schema(
+            [
+                pa.field("id", pa.int64()),
+                pa.field(
+                    "blob", pa.large_binary(), metadata={"lance-encoding:blob": "true"}
+                ),
+            ]
+        ),
+    )
+
+
+def _assert_lazy_blob(value, expected: bytes):
+    assert hasattr(value, "readall")
+    assert value.readall() == expected
+
+
 def test_basic(mem_db: DBConnection):
    data = [
        {"vector": [3.1, 4.1], "item": "foo", "price": 10.0},
@@ -57,27 +80,30 @@ def test_table_to_pandas_default_matches_arrow(tmp_db: DBConnection):
    pd.testing.assert_frame_equal(table.to_pandas(), expected)


-def test_table_to_pandas_blob_bytes(tmp_db: DBConnection):
+def test_table_to_pandas_invalid_blob_mode_non_blob_table(tmp_db: DBConnection):
+    data = pa.table({"id": [1, 2], "text": ["one", "two"]})
+    table = tmp_db.create_table("test_to_pandas_invalid_blob_mode", data=data)
+
+    with pytest.raises(ValueError, match="blob_mode must be one of"):
+        table.to_pandas(blob_mode="invalid")
+
+
+@pytest.mark.parametrize("blob_mode", ["lazy", "bytes", "descriptions"])
+def test_table_to_pandas_blob_modes(tmp_db: DBConnection, blob_mode):
    pytest.importorskip("lance")
-    data = pa.table(
-        {
-            "id": pa.array([1, 2], pa.int64()),
-            "blob": pa.array([b"hello", b"world"], pa.large_binary()),
-        },
-        schema=pa.schema(
-            [
-                pa.field("id", pa.int64()),
-                pa.field(
-                    "blob", pa.large_binary(), metadata={"lance-encoding:blob": "true"}
-                ),
-            ]
-        ),
-    )
-    table = tmp_db.create_table("test_to_pandas_blob_bytes", data=data)
+    table = tmp_db.create_table(f"test_to_pandas_blob_{blob_mode}", _blob_test_data())

-    df = table.to_pandas(blob_mode="bytes")
+    df = table.to_pandas(blob_mode=blob_mode)

-    assert df["blob"].tolist() == [b"hello", b"world"]
+    if blob_mode == "lazy":
+        _assert_lazy_blob(df["blob"].iloc[0], b"hello")
+        _assert_lazy_blob(df["blob"].iloc[1], b"world")
+    elif blob_mode == "bytes":
+        assert df["blob"].tolist() == [b"hello", b"world"]
+    else:
+        first = df["blob"].iloc[0]
+        assert first != b"hello"
+        assert not hasattr(first, "readall")


 def test_table_to_pandas_kwargs(tmp_db: DBConnection):
@@ -93,22 +119,8 @@ def test_table_to_pandas_kwargs(tmp_db: DBConnection):
@pytest.mark.asyncio
 async def test_async_table_to_pandas_blob_bytes(tmp_db_async: AsyncConnection):
    pytest.importorskip("lance")
-    data = pa.table(
-        {
-            "id": pa.array([1, 2], pa.int64()),
-            "blob": pa.array([b"hello", b"world"], pa.large_binary()),
-        },
-        schema=pa.schema(
-            [
-                pa.field("id", pa.int64()),
-                pa.field(
-                    "blob", pa.large_binary(), metadata={"lance-encoding:blob": "true"}
-                ),
-            ]
-        ),
-    )
    table = await tmp_db_async.create_table(
-        "test_async_to_pandas_blob_bytes", data=data
+        "test_async_to_pandas_blob_bytes", data=_blob_test_data()
    )

    df = await table.to_pandas(blob_mode="bytes")
@@ -116,6 +128,19 @@ async def test_async_table_to_pandas_blob_bytes(tmp_db_async: AsyncConnection):
    assert df["blob"].tolist() == [b"hello", b"world"]


+@pytest.mark.asyncio
+async def test_async_table_to_pandas_invalid_blob_mode_non_blob_table(
+    tmp_db_async: AsyncConnection,
+):
+    table = await tmp_db_async.create_table(
+        "test_async_to_pandas_invalid_blob_mode",
+        data=pa.table({"id": [1, 2], "text": ["one", "two"]}),
+    )
+
+    with pytest.raises(ValueError, match="blob_mode must be one of"):
+        await table.to_pandas(blob_mode="invalid")
+
+
@pytest.mark.asyncio
 async def test_async_table_to_pandas_kwargs(tmp_db_async: AsyncConnection):
    pd = pytest.importorskip("pandas")
@@ -903,6 +928,346 @@ async def test_async_tags(mem_db_async: AsyncConnection):
    )


+def test_branches(tmp_path):
+    db = lancedb.connect(tmp_path, read_consistency_interval=timedelta(0))
+    table = db.create_table(
+        "test",
+        data=[
+            {"vector": [3.1, 4.1], "item": "foo", "price": 10.0},
+            {"vector": [5.9, 26.5], "item": "bar", "price": 20.0},
+        ],
+    )
+    assert table.count_rows() == 2
+
+    # fork an isolated, writable branch from main
+    branch = table.branches.create("exp")
+    assert branch.count_rows() == 2
+    branch.add(data=[{"vector": [10.0, 11.0], "item": "baz", "price": 30.0}])
+
+    # writes on the branch do not touch main
+    assert branch.count_rows() == 3
+    assert table.count_rows() == 2
+
+    # the branch is listed, with main (None) as its parent
+    branches = table.branches.list()
+    assert "exp" in branches
+    assert branches["exp"]["parent_branch"] is None
+
+    # from_ref="main" is equivalent to the default
+    table.branches.create("exp2", from_ref="main")
+    assert table.branches.list()["exp2"]["parent_branch"] is None
+
+    # checkout returns a handle scoped to the branch's latest version
+    checked_out = table.branches.checkout("exp")
+    assert checked_out.count_rows() == 3
+
+    # delete removes it
+    table.branches.delete("exp")
+    table.branches.delete("exp2")
+    assert "exp" not in table.branches.list()
+
+
+def test_branch_handle_tracks_concurrent_writes(tmp_path):
+    db = lancedb.connect(tmp_path, read_consistency_interval=timedelta(0))
+    table = db.create_table("t", [{"id": 1}])
+
+    # two independent handles on the same branch
+    writer = table.branches.create("exp")
+    reader = db.open_table("t", branch="exp")
+    assert reader.count_rows() == 1
+
+    # a concurrent write on the branch is visible to the other handle
+    writer.add([{"id": 2}])
+    assert reader.count_rows() == 2
+    # main is unaffected
+    assert table.count_rows() == 1
+
+
+def test_branch_name_validation(tmp_path):
+    db = lancedb.connect(tmp_path)
+    table = db.create_table("t", [{"id": 1}])
+
+    with pytest.raises(ValueError, match="non-empty"):
+        table.branches.create("")
+    with pytest.raises(ValueError, match="non-empty"):
+        table.branches.checkout("")
+    with pytest.raises(ValueError, match="non-empty"):
+        table.branches.delete("")
+
+
+def test_branches_preserve_namespace(tmp_path):
+    pytest.importorskip(
+        "lance"
+    )  # namespace_path routes through lance's DirectoryNamespace
+    db = lancedb.connect(tmp_path)
+    table = db.create_table("t", [{"id": 1}], namespace_path=["ns1"])
+    assert table.namespace == ["ns1"]
+
+    branch = table.branches.create("exp")
+    assert branch.namespace == ["ns1"]
+    assert branch.id == table.id
+
+    # opening the branch directly also preserves namespace identity
+    opened = db.open_table("t", namespace_path=["ns1"], branch="exp")
+    assert opened.namespace == ["ns1"]
+
+
+def test_open_table_with_branch(tmp_path):
+    db = lancedb.connect(tmp_path)
+    table = db.create_table("t", [{"i": 1}])
+    table.branches.create("exp").add([{"i": 2}])
+
+    # open_table(branch=...) returns a handle scoped to the branch
+    assert db.open_table("t", branch="exp").count_rows() == 2
+    # opening without branch still tracks main
+    assert db.open_table("t").count_rows() == 1
+
+
+def test_open_table_with_branch_version(tmp_path):
+    db = lancedb.connect(tmp_path, read_consistency_interval=timedelta(0))
+
+    # main: a single fork-point row
+    t = db.create_table("t", [{"i": 0}])
+    main_v1 = t.version
+
+    # fork "exp", then advance exp AND main independently past the fork so they
+    # diverge while sharing version numbers
+    exp = t.branches.create("exp")
+    exp.add([{"i": 1}])  # exp: {0, 1}
+    exp_v2 = exp.version
+    exp.add([{"i": 2}])  # exp HEAD: {0, 1, 2}
+    t.add([{"i": 100}, {"i": 101}, {"i": 102}])  # main HEAD: {0, 100, 101, 102}
+    assert exp_v2 == t.version, "branch and main must share the version number"
+
+    # open exp at the shared version: the data must be exp's, not main's. count
+    # alone cannot prove this (main@v2 also exists), so assert provenance by
+    # content.
+    pinned = db.open_table("t", branch="exp", version=exp_v2)
+    assert pinned.current_branch() == "exp"
+    assert pinned.count_rows() == 2  # not exp HEAD (3), not main@v2 (4)
+    assert pinned.count_rows("i = 1") == 1  # exp's post-fork row is visible
+    assert pinned.count_rows("i = 100") == 0  # main's divergent rows are invisible
+
+    # the same coordinate is reachable directly via branches.checkout(name, version)
+    pinned_direct = t.branches.checkout("exp", exp_v2)
+    assert pinned_direct.current_branch() == "exp"
+    assert pinned_direct.count_rows() == 2
+
+    # the HEADs are unaffected
+    assert db.open_table("t", branch="exp").count_rows() == 3
+    assert db.open_table("t").count_rows() == 4
+
+    # version-only (no branch) time-travels main itself: its fork-point version
+    # holds only main's first row, and the shared version number resolves to
+    # main's data, not the branch's ("opens main at the version")
+    old_main = db.open_table("t", version=main_v1)
+    assert old_main.current_branch() is None
+    assert old_main.count_rows() == 1
+    shared_on_main = db.open_table("t", version=exp_v2)
+    assert shared_on_main.current_branch() is None
+    assert shared_on_main.count_rows() == 4
+
+    # detached head: writing to a pinned version is rejected
+    with pytest.raises((ValueError, RuntimeError), match="cannot be modified"):
+        pinned.add([{"i": 9}])
+
+    # a nonexistent version is rejected -- on main, and on a branch (a distinct
+    # resolution path, on the branch's manifests)
+    with pytest.raises((ValueError, RuntimeError)):
+        db.open_table("t", version=9999)
+    with pytest.raises((ValueError, RuntimeError)):
+        db.open_table("t", branch="exp", version=9999)
+
+    # checkout_latest re-attaches the pinned handle to the BRANCH's HEAD
+    # (writable again), not main's HEAD, and not staying pinned
+    pinned.checkout_latest()
+    assert pinned.current_branch() == "exp"
+    assert pinned.count_rows() == 3  # exp HEAD, not main's 4
+    pinned.add([{"i": 3}])
+    assert pinned.count_rows() == 4  # writable again
+
+
+@pytest.mark.asyncio
+async def test_async_namespace_open_table_with_branch(tmp_path):
+    pytest.importorskip("lance")  # "dir" impl is lance.namespace.DirectoryNamespace
+    db = lancedb.connect_namespace_async("dir", {"root": str(tmp_path)})
+    await db.create_namespace(["ns1"])
+    table = await db.create_table("t", [{"id": 1}], namespace_path=["ns1"])
+    branch = await table.branches.create("exp")
+    await branch.add([{"id": 2}])
+
+    # open_table(branch=...) on the async namespace connection must work
+    opened = await db.open_table("t", namespace_path=["ns1"], branch="exp")
+    assert await opened.count_rows() == 2
+
+
+def test_namespace_open_table_with_branch_version(tmp_path):
+    pytest.importorskip("lance")  # "dir" impl is lance.namespace.DirectoryNamespace
+    db = lancedb.connect_namespace("dir", {"root": str(tmp_path)})
+    db.create_namespace(["ns1"])
+    t = db.create_table("t", [{"i": 0}], namespace_path=["ns1"])
+
+    # fork "exp", then advance exp AND main past the fork so they diverge while
+    # sharing version numbers
+    exp = t.branches.create("exp")
+    exp.add([{"i": 1}])
+    exp_v2 = exp.version
+    exp.add([{"i": 2}])
+    t.add([{"i": 100}, {"i": 101}, {"i": 102}])
+    assert exp_v2 == t.version, "branch and main must share the version number"
+
+    # open_table(branch=, version=) on the namespace connection reads the
+    # branch's data at that version, not main's
+    pinned = db.open_table("t", namespace_path=["ns1"], branch="exp", version=exp_v2)
+    assert pinned.current_branch() == "exp"
+    assert pinned.count_rows() == 2  # not exp HEAD (3), not main@v2 (4)
+    assert pinned.count_rows("i = 1") == 1  # exp's post-fork row is visible
+    assert pinned.count_rows("i = 100") == 0  # main's divergent rows are invisible
+    assert db.open_table("t", namespace_path=["ns1"], branch="exp").count_rows() == 3
+
+
+@pytest.mark.asyncio
+async def test_async_namespace_open_table_with_branch_version(tmp_path):
+    pytest.importorskip("lance")  # "dir" impl is lance.namespace.DirectoryNamespace
+    db = lancedb.connect_namespace_async("dir", {"root": str(tmp_path)})
+    await db.create_namespace(["ns1"])
+    t = await db.create_table("t", [{"i": 0}], namespace_path=["ns1"])
+
+    # fork "exp", then advance exp AND main past the fork so they diverge while
+    # sharing version numbers
+    exp = await t.branches.create("exp")
+    await exp.add([{"i": 1}])
+    exp_v2 = await exp.version()
+    await exp.add([{"i": 2}])
+    await t.add([{"i": 100}, {"i": 101}, {"i": 102}])
+    assert exp_v2 == await t.version(), "branch and main must share the version number"
+
+    # open_table(branch=, version=) on the async namespace connection reads the
+    # branch's data at that version, not main's
+    pinned = await db.open_table(
+        "t", namespace_path=["ns1"], branch="exp", version=exp_v2
+    )
+    assert pinned.current_branch() == "exp"
+    assert await pinned.count_rows() == 2  # not exp HEAD (3), not main@v2 (4)
+    assert await pinned.count_rows("i = 1") == 1  # exp's post-fork row is visible
+    assert await pinned.count_rows("i = 100") == 0  # main's rows are invisible
+    assert (
+        await (
+            await db.open_table("t", namespace_path=["ns1"], branch="exp")
+        ).count_rows()
+        == 3
+    )
+
+
+def test_branch_to_lance_targets_branch(tmp_path):
+    pytest.importorskip("lance")
+    db = lancedb.connect(tmp_path)
+    table = db.create_table("t", [{"i": 1}])
+    branch = table.branches.create("exp")
+    branch.add([{"i": 2}])  # branch: 2 rows, main: 1 row
+
+    assert branch.to_lance().count_rows() == 2
+    assert table.to_lance().count_rows() == 1
+
+
+@pytest.mark.asyncio
+async def test_async_branches(tmp_path):
+    db = await lancedb.connect_async(tmp_path)
+    table = await db.create_table(
+        "test",
+        data=[
+            {"vector": [3.1, 4.1], "item": "foo", "price": 10.0},
+            {"vector": [5.9, 26.5], "item": "bar", "price": 20.0},
+        ],
+    )
+    assert await table.count_rows() == 2
+
+    branch = await table.branches.create("exp")
+    assert await branch.count_rows() == 2
+    await branch.add(data=[{"vector": [10.0, 11.0], "item": "baz", "price": 30.0}])
+
+    assert await branch.count_rows() == 3
+    assert await table.count_rows() == 2
+
+    branches = await table.branches.list()
+    assert "exp" in branches
+    assert branches["exp"]["parent_branch"] is None
+
+    await table.branches.create("exp2", from_ref="main")
+    assert (await table.branches.list())["exp2"]["parent_branch"] is None
+
+    checked_out = await table.branches.checkout("exp")
+    assert await checked_out.count_rows() == 3
+
+    await table.branches.delete("exp")
+    await table.branches.delete("exp2")
+    assert "exp" not in await table.branches.list()
+
+
+@pytest.mark.asyncio
+async def test_async_open_table_with_branch_version(tmp_path):
+    db = await lancedb.connect_async(tmp_path, read_consistency_interval=timedelta(0))
+
+    # main: a single fork-point row
+    t = await db.create_table("t", [{"i": 0}])
+    main_v1 = await t.version()
+
+    # fork "exp", then advance exp AND main independently past the fork so they
+    # diverge while sharing version numbers
+    exp = await t.branches.create("exp")
+    await exp.add([{"i": 1}])  # exp: {0, 1}
+    exp_v2 = await exp.version()
+    await exp.add([{"i": 2}])  # exp HEAD: {0, 1, 2}
+    await t.add([{"i": 100}, {"i": 101}, {"i": 102}])  # main HEAD: {0, 100, 101, 102}
+    assert exp_v2 == await t.version(), "branch and main must share the version number"
+
+    # open exp at the shared version: the data must be exp's, not main's. count
+    # alone cannot prove this (main@v2 also exists), so assert provenance by
+    # content.
+    pinned = await db.open_table("t", branch="exp", version=exp_v2)
+    assert pinned.current_branch() == "exp"
+    assert await pinned.count_rows() == 2  # not exp HEAD (3), not main@v2 (4)
+    assert await pinned.count_rows("i = 1") == 1  # exp's post-fork row is visible
+    assert await pinned.count_rows("i = 100") == 0  # main's rows are invisible
+
+    # the same coordinate is reachable directly via branches.checkout(name, version)
+    pinned_direct = await t.branches.checkout("exp", exp_v2)
+    assert pinned_direct.current_branch() == "exp"
+    assert await pinned_direct.count_rows() == 2
+
+    # the HEADs are unaffected
+    assert await (await db.open_table("t", branch="exp")).count_rows() == 3
+    assert await (await db.open_table("t")).count_rows() == 4
+
+    # version-only (no branch) time-travels main itself: its fork-point version
+    # holds only main's first row, and the shared version number resolves to
+    # main's data, not the branch's ("opens main at the version")
+    old_main = await db.open_table("t", version=main_v1)
+    assert old_main.current_branch() is None
+    assert await old_main.count_rows() == 1
+    shared_on_main = await db.open_table("t", version=exp_v2)
+    assert shared_on_main.current_branch() is None
+    assert await shared_on_main.count_rows() == 4
+
+    # detached head: writing to a pinned version is rejected
+    with pytest.raises((ValueError, RuntimeError), match="cannot be modified"):
+        await pinned.add([{"i": 9}])
+
+    # a nonexistent version is rejected -- on main, and on a branch
+    with pytest.raises((ValueError, RuntimeError)):
+        await db.open_table("t", version=9999)
+    with pytest.raises((ValueError, RuntimeError)):
+        await db.open_table("t", branch="exp", version=9999)
+
+    # checkout_latest re-attaches the pinned handle to the BRANCH's HEAD
+    # (writable again), not main's HEAD, and not staying pinned
+    await pinned.checkout_latest()
+    assert pinned.current_branch() == "exp"
+    assert await pinned.count_rows() == 3  # exp HEAD, not main's 4
+    await pinned.add([{"i": 3}])
+    assert await pinned.count_rows() == 4  # writable again
+
+
@patch("lancedb.table.AsyncTable.create_index")
 def test_create_index_method(mock_create_index, mem_db: DBConnection):
    table = mem_db.create_table(
@@ -1264,6 +1629,45 @@ def test_add_with_empty_fixed_size_list_drops_bad_rows(mem_db: DBConnection):
    assert np.allclose(data["embedding"].to_pylist()[0], np.array([0.1] * 16))


+def test_add_nullable_struct_with_none(mem_db: DBConnection):
+    """Regression test for issue #2654: a nullable struct column whose
+    first batch contains only None values must not crash in
+    _align_field_types with AttributeError: 'pyarrow.lib.DataType'
+    object has no attribute 'fields'.
+
+    PyArrow infers an all-None struct column as `null` (not `struct`),
+    so the type-alignment path needs to handle the case where the
+    source field type is null and use the target type directly.
+    """
+    # Use the v2.1 file format so that nullable structs are supported.
+    table = mem_db.create_table(
+        "test_nullable_struct",
+        schema=pa.schema(
+            [
+                pa.field("id", pa.string()),
+                pa.field(
+                    "data",
+                    pa.struct([pa.field("x", pa.float32())]),
+                    nullable=True,
+                ),
+            ]
+        ),
+        storage_options=dict(new_table_data_storage_version="2.1"),
+    )
+
+    # Adding a row with a non-null struct should work.
+    table.add([{"id": "1", "data": {"x": 1.0}}])
+
+    # Adding a row with None for the nullable struct field should also
+    # work — this is what used to crash.
+    table.add([{"id": "2", "data": None}])
+
+    result = table.to_arrow()
+    assert result.num_rows == 2
+    assert result.column("id").to_pylist() == ["1", "2"]
+    assert result.column("data").to_pylist() == [{"x": 1.0}, None]
+
+
 def test_add_with_integer_embeddings_preserves_casting(mem_db: DBConnection):
    class Schema(LanceModel):
        text: str
@@ -1995,18 +2399,32 @@ def test_create_scalar_index(mem_db: DBConnection):
 def test_create_index_nested_field_paths(mem_db: DBConnection):
    schema = pa.schema(
        [
+            pa.field("rowId", pa.int32()),
+            pa.field("row-id", pa.int32()),
+            pa.field("userId", pa.int32()),
            pa.field("metadata", pa.struct([pa.field("user_id", pa.int32())])),
+            pa.field("MetaData", pa.struct([pa.field("userId", pa.int32())])),
            pa.field(
                "image",
                pa.struct([pa.field("embedding", pa.list_(pa.float32(), 2))]),
            ),
+            pa.field("payload", pa.struct([pa.field("text", pa.string())])),
+            pa.field("meta-data", pa.struct([pa.field("user-id", pa.int32())])),
+            pa.field("literal", pa.struct([pa.field("a.b", pa.int32())])),
        ]
    )
    data = pa.Table.from_pylist(
        [
            {
+                "rowId": i,
+                "row-id": i,
+                "userId": i,
                "metadata": {"user_id": i},
+                "MetaData": {"userId": i},
                "image": {"embedding": [float(i), float(i + 1)]},
+                "payload": {"text": f"document {i}"},
+                "meta-data": {"user-id": i},
+                "literal": {"a.b": i},
            }
            for i in range(256)
        ],
@@ -2014,19 +2432,37 @@ def test_create_index_nested_field_paths(mem_db: DBConnection):
    )
    table = mem_db.create_table("nested_index_paths", data=data)

+    table.create_scalar_index("rowId", name="row_id_idx")
+    table.create_scalar_index("`row-id`", name="row_dash_id_idx")
+    table.create_scalar_index("userId", name="top_user_id_idx")
    table.create_scalar_index("metadata.user_id", name="metadata_user_id_idx")
+    table.create_scalar_index("MetaData.userId", name="mixed_case_metadata_user_id_idx")
+    table.create_scalar_index("`meta-data`.`user-id`", name="escaped_names_idx")
+    table.create_scalar_index("literal.`a.b`", name="literal_dot_idx")
    table.create_index(
        vector_column_name="image.embedding",
        num_partitions=1,
        num_sub_vectors=1,
        name="image_embedding_idx",
    )
+    table.create_fts_index("payload.text", with_position=False, name="payload_text_idx")

    indices = sorted(table.list_indices(), key=lambda idx: idx.name)
    assert [(idx.name, idx.index_type, idx.columns) for idx in indices] == [
+        ("escaped_names_idx", "BTree", ["`meta-data`.`user-id`"]),
        ("image_embedding_idx", "IvfPq", ["image.embedding"]),
+        ("literal_dot_idx", "BTree", ["literal.`a.b`"]),
        ("metadata_user_id_idx", "BTree", ["metadata.user_id"]),
+        ("mixed_case_metadata_user_id_idx", "BTree", ["MetaData.userId"]),
+        ("payload_text_idx", "FTS", ["payload.text"]),
+        ("row_dash_id_idx", "BTree", ["`row-id`"]),
+        ("row_id_idx", "BTree", ["rowId"]),
+        ("top_user_id_idx", "BTree", ["userId"]),
    ]
+    for index in indices:
+        stats = table.index_stats(index.name)
+        assert stats is not None
+        assert stats.num_indexed_rows == 256

    vector_results = (
        table.search([0.0, 1.0], vector_column_name="image.embedding")
@@ -2044,6 +2480,14 @@ def test_create_index_nested_field_paths(mem_db: DBConnection):
    assert len(filtered_results) == 1
    assert filtered_results[0]["metadata"]["user_id"] == 42

+    escaped_results = table.search().where("`row-id` = 43").limit(1).to_list()
+    assert len(escaped_results) == 1
+    assert escaped_results[0]["row-id"] == 43
+
+    fts_results = table.search("document 44", query_type="fts").limit(1).to_list()
+    assert len(fts_results) == 1
+    assert fts_results[0]["payload"]["text"] == "document 44"
+

 def test_empty_query(mem_db: DBConnection):
    table = mem_db.create_table(
@@ -2472,6 +2916,30 @@ def test_alter_columns(mem_db: DBConnection):
    assert table.to_arrow().column_names == ["new_id"]


+def test_update_field_metadata(mem_db: DBConnection):
+    data = pa.table({"id": [0, 1], "category": ["a", "b"]})
+    table = mem_db.create_table("my_table", data=data)
+
+    res = table.update_field_metadata(
+        {"path": "category", "metadata": {"unit": "label", "pii": "false"}}
+    )
+    assert res.version == 2
+    # Arrow field metadata is bytes-keyed
+    assert table.schema.field("category").metadata == {
+        b"unit": b"label",
+        b"pii": b"false",
+    }
+
+    # merge: add a key, delete one via None, keep the rest
+    table.update_field_metadata(
+        {"path": "category", "metadata": {"source": "import", "pii": None}}
+    )
+    assert table.schema.field("category").metadata == {
+        b"unit": b"label",
+        b"source": b"import",
+    }
+
+
@pytest.mark.asyncio
 async def test_alter_columns_async(mem_db_async: AsyncConnection):
    data = pa.table({"id": [0, 1]})
@@ -2750,3 +3218,38 @@ def test_sanitize_data_metadata_not_stripped():
    assert result_schema.metadata is not None
    assert result_schema.metadata[b"existing_key"] == b"existing_value"
    assert result_schema.metadata[b"new_key"] == b"new_value"
+
+
+@pytest.mark.asyncio
+async def test_async_search_runs_embedding_on_dedicated_executor(
+    mem_db_async: AsyncConnection,
+):
+    # Regression test for #3310: AsyncTable.search() must run the (potentially
+    # blocking) query-embedding call on the dedicated embedding executor, not
+    # asyncio's default executor -- which is shared with other blocking I/O and
+    # can be starved by a slow embedding call under concurrent load.
+    func = MockTextEmbeddingFunction.create()
+
+    class Schema(LanceModel):
+        text: str = func.SourceField()
+        vector: Vector(func.ndims()) = func.VectorField()
+
+    table = await mem_db_async.create_table("embed_executor", schema=Schema)
+    await table.add([{"text": "hello world"}])
+
+    captured_threads: List[str] = []
+    original = MockTextEmbeddingFunction.generate_embeddings
+
+    def record_thread(self, texts):
+        captured_threads.append(threading.current_thread().name)
+        return original(self, texts)
+
+    # Patch only around the search so we capture the query-embedding call, not
+    # the add-time source-embedding call.
+    with patch.object(MockTextEmbeddingFunction, "generate_embeddings", record_thread):
+        await (await table.search("a query string")).limit(1).to_list()
+
+    assert captured_threads, "search did not invoke the embedding function"
+    assert all(name.startswith("lancedb-embedding") for name in captured_threads), (
+        f"embedding ran off the dedicated executor: {captured_threads}"
+    )
--- a/python/python/tests/test_torch.py
+++ b/python/python/tests/test_torch.py
@@ -1,10 +1,15 @@
 # SPDX-License-Identifier: Apache-2.0
 # SPDX-FileCopyrightText: Copyright The LanceDB Authors

+import contextlib
 import functools
+import http.server
+import json
 import multiprocessing as mp
 import pickle
+import re
 import sys
+import threading

 import lancedb
 import pyarrow as pa
@@ -15,6 +20,107 @@ from lancedb.util import tbl_to_tensor
 torch = pytest.importorskip("torch")


+REMOTE_ROWS = list(range(100))
+
+
+def _make_mock_http_handler(handler):
+    class MockLanceDBHandler(http.server.BaseHTTPRequestHandler):
+        def do_GET(self):
+            handler(self)
+
+        def do_POST(self):
+            handler(self)
+
+    return MockLanceDBHandler
+
+
+def _remote_schema_payload():
+    return {
+        "version": 1,
+        "schema": {
+            "fields": [
+                {"name": "a", "type": {"type": "int64"}, "nullable": False},
+            ]
+        },
+    }
+
+
+def _offsets_from_filter(filter_sql: str | None) -> list[int]:
+    if filter_sql is None:
+        return REMOTE_ROWS
+    match = re.search(r"_rowoffset in \((.*?)\)", filter_sql)
+    if match is None:
+        return REMOTE_ROWS
+    raw_offsets = match.group(1).strip()
+    if raw_offsets == "":
+        return []
+    return [int(offset.strip()) for offset in raw_offsets.split(",")]
+
+
+def _remote_dataset_handler(request):
+    request.close_connection = True
+    if request.path == "/v1/table/test/describe/":
+        request.send_response(200)
+        request.send_header("Content-Type", "application/json")
+        request.end_headers()
+        request.wfile.write(json.dumps(_remote_schema_payload()).encode())
+    elif request.path == "/v1/table/test/count_rows/":
+        request.send_response(200)
+        request.send_header("Content-Type", "application/json")
+        request.end_headers()
+        request.wfile.write(str(len(REMOTE_ROWS)).encode())
+    elif request.path == "/v1/table/test/query/":
+        content_len = int(request.headers.get("Content-Length"))
+        body = json.loads(request.rfile.read(content_len))
+        offsets = _offsets_from_filter(body.get("filter"))
+        requested_columns = body.get("columns") or ["a"]
+        if isinstance(requested_columns, dict):
+            requested_columns = list(requested_columns)
+
+        data = {}
+        for column in requested_columns:
+            if column == "a":
+                data[column] = [REMOTE_ROWS[offset] for offset in offsets]
+            elif column == "_rowoffset":
+                data[column] = offsets
+            elif column == "_rowid":
+                data[column] = offsets
+
+        table = pa.table(data)
+        request.send_response(200)
+        request.send_header("Content-Type", "application/vnd.apache.arrow.file")
+        request.end_headers()
+        with pa.ipc.new_file(request.wfile, schema=table.schema) as writer:
+            writer.write_table(table)
+    else:
+        request.send_response(404)
+        request.end_headers()
+
+
+@contextlib.contextmanager
+def _remote_dataset_table():
+    with http.server.ThreadingHTTPServer(
+        ("localhost", 0), _make_mock_http_handler(_remote_dataset_handler)
+    ) as server:
+        port = server.server_address[1]
+        handle = threading.Thread(target=server.serve_forever)
+        handle.start()
+        try:
+            db = lancedb.connect(
+                "db://dev",
+                api_key="fake",
+                host_override=f"http://localhost:{port}",
+                client_config={
+                    "retry_config": {"retries": 0},
+                    "timeout_config": {"connect_timeout": 2, "read_timeout": 2},
+                },
+            )
+            yield db.open_table("test")
+        finally:
+            server.shutdown()
+            handle.join()
+
+
 def _open_native_table(uri: str, table_name: str):
    """Top-level connection factory used by the explicit-factory pickle test.

@@ -107,6 +213,39 @@ def test_permutation_dataloader_multiprocessing(tmp_db):
    assert seen == 1000


+def test_remote_table_dataloader_multiprocessing():
+    with _remote_dataset_table() as table:
+        dataloader = torch.utils.data.DataLoader(
+            table,
+            collate_fn=tbl_to_tensor,
+            batch_size=10,
+            num_workers=2,
+            multiprocessing_context="spawn",
+        )
+        seen = 0
+        for batch in dataloader:
+            assert batch.size(0) == 1
+            assert batch.size(1) == 10
+            seen += batch.size(1)
+        assert seen == len(REMOTE_ROWS)
+
+
+def test_remote_permutation_dataloader_multiprocessing():
+    with _remote_dataset_table() as table:
+        permutation = Permutation.identity(table)
+        dataloader = torch.utils.data.DataLoader(
+            permutation,
+            batch_size=10,
+            num_workers=2,
+            multiprocessing_context="spawn",
+        )
+        seen = 0
+        for batch in dataloader:
+            assert batch["a"].size(0) == 10
+            seen += batch["a"].size(0)
+        assert seen == len(REMOTE_ROWS)
+
+
 def test_permutation_pickle_with_connection_factory(tmp_path):
    """When the user provides a connection_factory, pickling should round-trip
    through that factory rather than introspecting the connection URI. Useful
@@ -171,6 +310,35 @@ def _multiworker_dataloader_target(db_uri: str, result_queue):
    result_queue.put(count)


+def _remote_multiworker_dataloader_target(port: int, result_queue):
+    import lancedb
+    from lancedb.permutation import Permutation
+
+    db = lancedb.connect(
+        "db://dev",
+        api_key="fake",
+        host_override=f"http://localhost:{port}",
+        client_config={
+            "retry_config": {"retries": 0},
+            "timeout_config": {"connect_timeout": 2, "read_timeout": 2},
+        },
+    )
+    table = db.open_table("test")
+    permutation = Permutation.identity(table)
+
+    dataloader = torch.utils.data.DataLoader(
+        permutation,
+        batch_size=10,
+        num_workers=2,
+        multiprocessing_context="fork",
+    )
+    count = 0
+    for batch in dataloader:
+        assert batch["a"].size(0) == 10
+        count += 1
+    result_queue.put(count)
+
+
 @pytest.mark.skipif(
    sys.platform != "linux",
    reason=(
@@ -208,3 +376,46 @@ def test_permutation_dataloader_fork_workers(tmp_path):
    assert proc.exitcode == 0, f"child exited with code {proc.exitcode}"
    assert not queue.empty(), "child produced no batches"
    assert queue.get() == 100
+
+
+@pytest.mark.skipif(
+    sys.platform != "linux",
+    reason=(
+        "fork() is unavailable on Windows and unsafe on macOS "
+        "(Apple frameworks/TLS are not fork-safe)"
+    ),
+)
+def test_remote_permutation_dataloader_fork_workers():
+    with http.server.ThreadingHTTPServer(
+        ("localhost", 0), _make_mock_http_handler(_remote_dataset_handler)
+    ) as server:
+        port = server.server_address[1]
+        handle = threading.Thread(target=server.serve_forever)
+        handle.start()
+        try:
+            ctx = mp.get_context("spawn")
+            queue = ctx.Queue()
+            proc = ctx.Process(
+                target=_remote_multiworker_dataloader_target,
+                args=(port, queue),
+            )
+            proc.start()
+            proc.join(timeout=30)
+
+            if proc.is_alive():
+                proc.terminate()
+                proc.join(timeout=5)
+                if proc.is_alive():
+                    proc.kill()
+                    proc.join()
+                pytest.fail(
+                    "Remote permutation hung when iterated in a fork-based "
+                    "DataLoader worker"
+                )
+
+            assert proc.exitcode == 0, f"child exited with code {proc.exitcode}"
+            assert not queue.empty(), "child produced no batches"
+            assert queue.get() == 10
+        finally:
+            server.shutdown()
+            handle.join()
--- a/python/python/tests/test_util.py
+++ b/python/python/tests/test_util.py
@@ -149,6 +149,21 @@ def test_value_to_sql_dict():
    assert value_to_sql({}) == "named_struct()"


+def test_value_to_sql_numpy_scalars():
+    # numpy scalars (e.g. pulled from an ndarray or a pandas column) must
+    # convert the same way as their native Python counterparts. np.float64
+    # already worked by virtue of subclassing float, but the integer / bool
+    # / float32 scalars previously raised NotImplementedError.
+    import numpy as np
+
+    assert value_to_sql(np.int32(5)) == "5"
+    assert value_to_sql(np.int64(5)) == "5"
+    assert value_to_sql(np.float32(1.5)) == "1.5"
+    assert value_to_sql(np.float64(1.5)) == "1.5"
+    assert value_to_sql(np.bool_(True)) == "TRUE"
+    assert value_to_sql(np.bool_(False)) == "FALSE"
+
+
 def test_append_vector_columns():
    registry = EmbeddingFunctionRegistry.get_instance()
    registry.register("test")(MockTextEmbeddingFunction)
--- a/python/src/expr.rs
+++ b/python/src/expr.rs
@@ -9,7 +9,9 @@

 use arrow::{datatypes::DataType, pyarrow::PyArrowType};
 use datafusion_common::ScalarValue;
-use lancedb::expr::{DfExpr, col as ldb_col, contains, expr_cast, lit as df_lit, lower, upper};
+use lancedb::expr::{
+    DfExpr, col as ldb_col, contains, expr_cast, is_in, lit as df_lit, lower, upper,
+};
 use pyo3::types::PyBytes;
 use pyo3::{Bound, PyAny, PyResult, exceptions::PyValueError, prelude::*, pyfunction};

@@ -105,6 +107,14 @@ impl PyExpr {
        Self(contains(self.0.clone(), substr.0.clone()))
    }

+    // ── membership ───────────────────────────────────────────────────────────
+
+    /// Return true where the value is one of the given expressions (SQL ``IN``).
+    fn isin(&self, list: Vec<Self>) -> Self {
+        let items: Vec<DfExpr> = list.into_iter().map(|e| e.0).collect();
+        Self(is_in(self.0.clone(), items))
+    }
+
    // ── type cast ────────────────────────────────────────────────────────────

    /// Cast the expression to `data_type`.
--- a/python/src/index.rs
+++ b/python/src/index.rs
@@ -7,7 +7,7 @@ use lancedb::index::vector::{
 };
 use lancedb::index::{
    Index as LanceDbIndex,
-    scalar::{BTreeIndexBuilder, FtsIndexBuilder},
+    scalar::{BTreeIndexBuilder, FmIndexBuilder, FtsIndexBuilder},
 };
 use pyo3::IntoPyObject;
 use pyo3::types::PyStringMethods;
@@ -38,6 +38,7 @@ pub fn extract_index_params(source: &Option<Bound<'_, PyAny>>) -> PyResult<Lance
            "BTree" => Ok(LanceDbIndex::BTree(BTreeIndexBuilder::default())),
            "Bitmap" => Ok(LanceDbIndex::Bitmap(Default::default())),
            "LabelList" => Ok(LanceDbIndex::LabelList(Default::default())),
+            "Fm" => Ok(LanceDbIndex::Fm(FmIndexBuilder::default())),
            "FTS" => {
                let params = source.extract::<FtsParams>()?;
                let inner_opts = FtsIndexBuilder::default()
@@ -183,7 +184,7 @@ pub fn extract_index_params(source: &Option<Bound<'_, PyAny>>) -> PyResult<Lance
                Ok(LanceDbIndex::IvfHnswFlat(hnsw_flat_builder))
            }
            not_supported => Err(PyValueError::new_err(format!(
-                "Invalid index type '{}'.  Must be one of BTree, Bitmap, LabelList, FTS, IvfPq, IvfSq, IvfHnswPq, IvfHnswSq, or IvfHnswFlat",
+                "Invalid index type '{}'.  Must be one of BTree, Bitmap, LabelList, Fm, FTS, IvfPq, IvfSq, IvfHnswPq, IvfHnswSq, or IvfHnswFlat",
                not_supported
            ))),
        }
--- a/python/src/lib.rs
+++ b/python/src/lib.rs
@@ -16,7 +16,7 @@ use query::{FTSQuery, HybridQuery, Query, VectorQuery};
 use session::Session;
 use table::{
    AddColumnsResult, AddResult, AlterColumnsResult, DeleteResult, DropColumnsResult, LsmWriteSpec,
-    MergeResult, Table, UpdateResult,
+    MergeResult, Table, UpdateFieldMetadataResult, UpdateResult,
 };

 pub mod arrow;
@@ -50,6 +50,7 @@ pub fn _lancedb(_py: Python, m: &Bound<'_, PyModule>) -> PyResult<()> {
    m.add_class::<RecordBatchStream>()?;
    m.add_class::<AddColumnsResult>()?;
    m.add_class::<AlterColumnsResult>()?;
+    m.add_class::<UpdateFieldMetadataResult>()?;
    m.add_class::<AddResult>()?;
    m.add_class::<MergeResult>()?;
    m.add_class::<LsmWriteSpec>()?;
--- a/python/src/table.rs
+++ b/python/src/table.rs
@@ -16,12 +16,12 @@ use arrow::{
    pyarrow::{FromPyArrow, PyArrowType, ToPyArrow},
 };
 use lancedb::table::{
-    AddDataMode, ColumnAlteration, Duration, NewColumnTransform, OptimizeAction, OptimizeOptions,
-    Table as LanceDbTable,
+    AddDataMode, ColumnAlteration, Duration, FieldMetadataUpdate, NewColumnTransform,
+    OptimizeAction, OptimizeOptions, Ref, Table as LanceDbTable,
 };
 use pyo3::{
    Bound, FromPyObject, Py, PyAny, PyRef, PyResult, Python,
-    exceptions::{PyKeyError, PyRuntimeError, PyValueError},
+    exceptions::{PyRuntimeError, PyValueError},
    pyclass, pymethods,
    types::{IntoPyDict, PyAnyMethods, PyDict, PyDictMethods},
 };
@@ -357,6 +357,27 @@ impl From<lancedb::table::AlterColumnsResult> for AlterColumnsResult {
    }
 }

+#[pyclass(get_all, from_py_object)]
+#[derive(Clone, Debug)]
+pub struct UpdateFieldMetadataResult {
+    pub version: u64,
+}
+
+#[pymethods]
+impl UpdateFieldMetadataResult {
+    pub fn __repr__(&self) -> String {
+        format!("UpdateFieldMetadataResult(version={})", self.version)
+    }
+}
+
+impl From<lancedb::table::UpdateFieldMetadataResult> for UpdateFieldMetadataResult {
+    fn from(result: lancedb::table::UpdateFieldMetadataResult) -> Self {
+        Self {
+            version: result.version,
+        }
+    }
+}
+
 #[pyclass(get_all, from_py_object)]
 #[derive(Clone, Debug)]
 pub struct DropColumnsResult {
@@ -690,10 +711,6 @@ impl Table {
                        dict.set_item("num_indices", num_indices)?;
                    }

-                    if let Some(loss) = stats.loss {
-                        dict.set_item("loss", loss)?;
-                    }
-
                    Ok(Some(dict.unbind()))
                })
            } else {
@@ -843,6 +860,15 @@ impl Table {
        Ok(Tags::new(self.inner_ref()?.clone()))
    }

+    pub fn current_branch(&self) -> PyResult<Option<String>> {
+        Ok(self.inner_ref()?.current_branch())
+    }
+
+    #[getter]
+    pub fn branches(&self) -> PyResult<Branches> {
+        Ok(Branches::new(self.inner_ref()?.clone()))
+    }
+
    #[pyo3(signature = (offsets))]
    pub fn take_offsets(self_: PyRef<'_, Self>, offsets: Vec<u64>) -> PyResult<TakeQuery> {
        Ok(TakeQuery::new(
@@ -1102,31 +1128,57 @@ impl Table {
        field_name: String,
        metadata: &Bound<'_, PyDict>,
    ) -> PyResult<Bound<'a, PyAny>> {
-        let mut new_metadata = HashMap::<String, String>::new();
-        for (column_name, value) in metadata.into_iter() {
-            let key: String = column_name.extract()?;
-            let value: String = value.extract()?;
-            new_metadata.insert(key, value);
+        // Deprecated: forwards to the update_field_metadata path (replace mode).
+        let mut update = FieldMetadataUpdate::new(field_name).replace();
+        for (key, value) in metadata.into_iter() {
+            update = update.set(key.extract::<String>()?, value.extract::<String>()?);
        }

        let inner = self_.inner_ref()?.clone();
        future_into_py(self_.py(), async move {
-            let native_tbl = inner
-                .as_native()
-                .ok_or_else(|| PyValueError::new_err("This cannot be run on a remote table"))?;
-            let schema = native_tbl.manifest().await.infer_error()?.schema;
-            let field = schema
-                .field(&field_name)
-                .ok_or_else(|| PyKeyError::new_err(format!("Field {} not found", field_name)))?;
-
-            native_tbl
-                .replace_field_metadata(vec![(field.id as u32, new_metadata)])
-                .await
-                .infer_error()?;
-
+            inner.update_field_metadata(&[update]).await.infer_error()?;
            Ok(())
        })
    }
+
+    pub fn update_field_metadata<'a>(
+        self_: PyRef<'a, Self>,
+        updates: Vec<Bound<PyDict>>,
+    ) -> PyResult<Bound<'a, PyAny>> {
+        let updates = updates
+            .iter()
+            .map(|update| {
+                let path: String = update
+                    .get_item("path")?
+                    .ok_or_else(|| PyValueError::new_err("Missing path"))?
+                    .extract()?;
+                let mut field_update = FieldMetadataUpdate::new(path);
+                if let Some(metadata) = update.get_item("metadata")? {
+                    let metadata_dict = metadata.cast::<PyDict>()?;
+                    for (key, value) in metadata_dict.iter() {
+                        let key: String = key.extract()?;
+                        if value.is_none() {
+                            field_update = field_update.remove(key);
+                        } else {
+                            field_update = field_update.set(key, value.extract::<String>()?);
+                        }
+                    }
+                }
+                if let Some(replace) = update.get_item("replace")?
+                    && replace.extract::<bool>()?
+                {
+                    field_update = field_update.replace();
+                }
+                Ok(field_update)
+            })
+            .collect::<PyResult<Vec<_>>>()?;
+
+        let inner = self_.inner_ref()?.clone();
+        future_into_py(self_.py(), async move {
+            let result = inner.update_field_metadata(&updates).await.infer_error()?;
+            Ok(UpdateFieldMetadataResult::from(result))
+        })
+    }
 }

 #[derive(FromPyObject)]
@@ -1218,3 +1270,71 @@ impl Tags {
        })
    }
 }
+
+#[pyclass]
+pub struct Branches {
+    inner: LanceDbTable,
+}
+
+impl Branches {
+    pub fn new(table: LanceDbTable) -> Self {
+        Self { inner: table }
+    }
+}
+
+#[pymethods]
+impl Branches {
+    pub fn list(self_: PyRef<'_, Self>) -> PyResult<Bound<'_, PyAny>> {
+        let inner = self_.inner.clone();
+        future_into_py(self_.py(), async move {
+            let res = inner.list_branches().await.infer_error()?;
+            Python::attach(|py| {
+                let py_dict = PyDict::new(py);
+                for (name, contents) in res {
+                    let value = PyDict::new(py);
+                    value.set_item("parent_branch", contents.parent_branch)?;
+                    value.set_item("parent_version", contents.parent_version)?;
+                    value.set_item("manifest_size", contents.manifest_size)?;
+                    py_dict.set_item(name, value)?;
+                }
+                Ok(py_dict.unbind())
+            })
+        })
+    }
+
+    #[pyo3(signature = (name, from_ref=None, from_version=None))]
+    pub fn create(
+        self_: PyRef<'_, Self>,
+        name: String,
+        from_ref: Option<String>,
+        from_version: Option<u64>,
+    ) -> PyResult<Bound<'_, PyAny>> {
+        let inner = self_.inner.clone();
+        future_into_py(self_.py(), async move {
+            let from = Ref::Version(from_ref, from_version);
+            let table = inner.create_branch(&name, from).await.infer_error()?;
+            Ok(Table::new(table))
+        })
+    }
+
+    #[pyo3(signature = (name, version=None))]
+    pub fn checkout(
+        self_: PyRef<'_, Self>,
+        name: String,
+        version: Option<u64>,
+    ) -> PyResult<Bound<'_, PyAny>> {
+        let inner = self_.inner.clone();
+        future_into_py(self_.py(), async move {
+            let table = inner.checkout_branch(&name, version).await.infer_error()?;
+            Ok(Table::new(table))
+        })
+    }
+
+    pub fn delete(self_: PyRef<'_, Self>, name: String) -> PyResult<Bound<'_, PyAny>> {
+        let inner = self_.inner.clone();
+        future_into_py(self_.py(), async move {
+            inner.delete_branch(&name).await.infer_error()?;
+            Ok(())
+        })
+    }
+}
--- a/python/tests/test_expr.py
+++ b/python/tests/test_expr.py
@@ -450,6 +450,27 @@ def binary_table(tmp_path):
    return db.create_table("binary_test", data)


+class TestExprIsin:
+    def test_isin_ints(self):
+        assert col("id").isin([1, 2, 3]).to_sql() == "id IN (1, 2, 3)"
+
+    def test_isin_strs(self):
+        assert (
+            col("status").isin(["active", "pending"]).to_sql()
+            == "status IN ('active', 'pending')"
+        )
+
+    def test_isin_coerces_and_mixes(self):
+        assert col("id").isin([lit(1), 2]).to_sql() == "id IN (1, 2)"
+
+    def test_isin_empty(self):
+        assert col("id").isin([]).to_sql() == "id IN ()"
+
+    def test_isin_filter(self, simple_table):
+        result = simple_table.search().where(col("id").isin([1, 3, 5])).to_arrow()
+        assert result.num_rows == 3
+
+
 class TestExprBytesIntegration:
    def test_binary_equality_filter(self, binary_table):
        result = (
--- a/python/uv.lock
+++ b/python/uv.lock
--- a/rust/lancedb/Cargo.toml
+++ b/rust/lancedb/Cargo.toml
@@ -1,6 +1,6 @@
 [package]
 name = "lancedb"
-version = "0.30.0-beta.1"
+version = "0.30.1-beta.2"
 edition.workspace = true
 description = "LanceDB: A serverless, low-latency vector database for AI applications"
 license.workspace = true
--- a/rust/lancedb/src/connection.rs
+++ b/rust/lancedb/src/connection.rs
@@ -9,6 +9,7 @@ use std::sync::Arc;
 use arrow_array::RecordBatch;
 use arrow_schema::SchemaRef;
 use lance::dataset::ReadParams;
+use lance::dataset::refs::MAIN_BRANCH;
 use lance_namespace::models::{
    CreateNamespaceRequest, CreateNamespaceResponse, DescribeNamespaceRequest,
    DescribeNamespaceResponse, DropNamespaceRequest, DropNamespaceResponse, ListNamespacesRequest,
@@ -119,6 +120,8 @@ pub struct OpenTableBuilder {
    parent: Arc<dyn Database>,
    request: OpenTableRequest,
    embedding_registry: Arc<dyn EmbeddingRegistry>,
+    branch: Option<String>,
+    version: Option<u64>,
 }

 impl OpenTableBuilder {
@@ -139,6 +142,8 @@ impl OpenTableBuilder {
                managed_versioning: None,
            },
            embedding_registry,
+            branch: None,
+            version: None,
        }
    }

@@ -259,14 +264,48 @@ impl OpenTableBuilder {
        self
    }

+    /// Open the table scoped to the given branch instead of the default branch.
+    ///
+    /// Reads and writes on the returned table operate in the branch's context.
+    pub fn branch(mut self, branch: impl Into<String>) -> Self {
+        self.branch = Some(branch.into());
+        self
+    }
+
+    /// Open the table pinned to a specific version, producing a read-only "view".
+    ///
+    /// Composes with [`Self::branch`]: when a branch is also set, this opens that
+    /// branch at the given version; otherwise it opens `main` at that version.
+    /// The returned table is a detached head, so operations that modify the table
+    /// will fail until [`Table::checkout_latest`] is called.
+    ///
+    /// ```
+    /// # use lancedb::Connection;
+    /// # async fn f(conn: &Connection) -> Result<(), Box<dyn std::error::Error>> {
+    /// let table = conn.open_table("t").branch("exp").version(3).execute().await?;
+    /// # Ok(())
+    /// # }
+    /// ```
+    pub fn version(mut self, version: u64) -> Self {
+        self.version = Some(version);
+        self
+    }
+
    /// Open the table
    pub async fn execute(self) -> Result<Table> {
        let table = self.parent.open_table(self.request).await?;
-        Ok(Table::new_with_embedding_registry(
-            table,
-            self.parent,
-            self.embedding_registry,
-        ))
+        let table = Table::new_with_embedding_registry(table, self.parent, self.embedding_registry);
+        // "main" is the default branch, so treat it as no branch.
+        let branch = self.branch.filter(|b| b.as_str() != MAIN_BRANCH);
+        match branch {
+            Some(branch) => table.checkout_branch(&branch, self.version).await,
+            None => {
+                if let Some(version) = self.version {
+                    table.checkout(version).await?;
+                }
+                Ok(table)
+            }
+        }
    }
 }

--- a/rust/lancedb/src/database/namespace.rs
+++ b/rust/lancedb/src/database/namespace.rs
@@ -437,8 +437,11 @@ impl Database for LanceNamespaceDatabase {

        // Set up commit handler when managed_versioning is enabled
        if managed_versioning == Some(true) {
-            let external_store =
-                LanceNamespaceExternalManifestStore::new(self.namespace.clone(), table_id.clone());
+            let external_store = LanceNamespaceExternalManifestStore::for_table_uri(
+                self.namespace.clone(),
+                table_id.clone(),
+                &location,
+            )?;
            let commit_handler: Arc<dyn CommitHandler> = Arc::new(ExternalManifestCommitHandler {
                external_manifest_store: Arc::new(external_store),
            });
@@ -740,6 +743,64 @@ mod tests {
        assert!(table_names.contains(&"test_table".to_string()));
    }

+    #[tokio::test]
+    async fn test_namespace_branch_query_under_pushdown_stays_local() {
+        // With QueryTable pushdown enabled, a query on the main branch routes to
+        // the namespace server, but a branch handle must run locally: the
+        // server-side request carries no branch and would return main's rows.
+        let tmp_dir = tempdir().unwrap();
+        let root_path = tmp_dir.path().to_str().unwrap().to_string();
+
+        let mut properties = HashMap::new();
+        properties.insert("root".to_string(), root_path);
+
+        let conn = connect_namespace("dir", properties)
+            .pushdown_operation(NamespaceClientPushdownOperation::QueryTable)
+            .execute()
+            .await
+            .expect("Failed to connect to namespace");
+
+        conn.create_namespace(CreateNamespaceRequest {
+            id: Some(vec!["test_ns".into()]),
+            ..Default::default()
+        })
+        .await
+        .expect("Failed to create namespace");
+
+        // main has 5 rows
+        let table = conn
+            .create_table("ref_test", create_test_data())
+            .namespace(vec!["test_ns".into()])
+            .execute()
+            .await
+            .expect("Failed to create table");
+        let main_version = table.version().await.unwrap();
+
+        // fork a branch off main, then add 5 more rows so it differs from main
+        let branch = table
+            .create_branch("exp", main_version)
+            .await
+            .expect("Failed to create branch");
+        branch
+            .add(create_test_data())
+            .execute()
+            .await
+            .expect("Failed to append to branch");
+
+        // the branch query must run locally and see the branch's 10 rows --
+        // not get routed to the server (which carries no branch) and see main's 5
+        let results = branch
+            .query()
+            .execute()
+            .await
+            .expect("Failed to query branch")
+            .try_collect::<Vec<_>>()
+            .await
+            .expect("Failed to collect results");
+        let count: usize = results.iter().map(|b| b.num_rows()).sum();
+        assert_eq!(count, 10);
+    }
+
    #[tokio::test]
    async fn test_namespace_describe_table() {
        // Setup: Create a temporary directory for the namespace
--- a/rust/lancedb/src/dataloader/permutation/shuffle.rs
+++ b/rust/lancedb/src/dataloader/permutation/shuffle.rs
@@ -203,11 +203,11 @@ impl Shuffler {

        // Finish writing files
        for (file_idx, mut writer) in file_writers.into_iter().enumerate() {
-            let num_written = writer.finish().await?;
+            let write_summary = writer.finish().await?;
            log::debug!(
                "Shuffle job {}: wrote {} rows to file {}",
                self.id,
-                num_written,
+                write_summary.num_rows,
                file_idx
            );
        }
--- a/rust/lancedb/src/expr.rs
+++ b/rust/lancedb/src/expr.rs
@@ -57,6 +57,10 @@ pub fn expr_cast(expr: Expr, data_type: DataType) -> Expr {
    cast(expr, data_type)
 }

+pub fn is_in(expr: Expr, list: Vec<Expr>) -> Expr {
+    expr.in_list(list, false)
+}
+
 lazy_static::lazy_static! {
    static ref FUNC_REGISTRY: std::sync::RwLock<std::collections::HashMap<String, Arc<ScalarUDF>>> = {
        let mut m = std::collections::HashMap::new();
@@ -194,6 +198,13 @@ mod tests {
        assert_eq!(sql, "NOT (data = X'ABCD')");
    }

+    #[test]
+    fn test_is_in() {
+        let expr = is_in(col("id"), vec![lit(1i64), lit(2i64), lit(3i64)]);
+        let sql = expr_to_sql_string(&expr).unwrap();
+        assert!(sql.contains("IN"), "expected IN in: {}", sql);
+    }
+
    #[test]
    fn test_multiple_binary_literals() {
        use datafusion_common::ScalarValue;
--- a/rust/lancedb/src/index.rs
+++ b/rust/lancedb/src/index.rs
@@ -12,7 +12,7 @@ use crate::index::vector::IvfRqIndexBuilder;
 use crate::{DistanceType, Error, Result, table::BaseTable};

 use self::{
-    scalar::{BTreeIndexBuilder, BitmapIndexBuilder, LabelListIndexBuilder},
+    scalar::{BTreeIndexBuilder, BitmapIndexBuilder, FmIndexBuilder, LabelListIndexBuilder},
    vector::{
        IvfHnswFlatIndexBuilder, IvfHnswPqIndexBuilder, IvfHnswSqIndexBuilder, IvfPqIndexBuilder,
        IvfSqIndexBuilder,
@@ -48,6 +48,11 @@ pub enum Index {
    /// using an underlying bitmap index.
    LabelList(LabelListIndexBuilder),

+    /// An `FM` index is a scalar index on string/binary columns that accelerates
+    /// substring search (`contains(col, 'needle')`). It matches arbitrary
+    /// substrings of the raw bytes, unlike the tokenized [`Index::FTS`] index.
+    Fm(FmIndexBuilder),
+
    /// Full text search index using bm25.
    FTS(FtsIndexBuilder),

@@ -306,6 +311,8 @@ pub enum IndexType {
    Bitmap,
    #[serde(alias = "LABEL_LIST")]
    LabelList,
+    #[serde(alias = "FM", alias = "FMINDEX", alias = "FMIndex")]
+    Fm,
    // FTS
    #[serde(alias = "INVERTED", alias = "Inverted")]
    FTS,
@@ -324,6 +331,7 @@ impl std::fmt::Display for IndexType {
            Self::BTree => write!(f, "BTREE"),
            Self::Bitmap => write!(f, "BITMAP"),
            Self::LabelList => write!(f, "LABEL_LIST"),
+            Self::Fm => write!(f, "FM"),
            Self::FTS => write!(f, "FTS"),
        }
    }
@@ -337,6 +345,7 @@ impl std::str::FromStr for IndexType {
            "BTREE" => Ok(Self::BTree),
            "BITMAP" => Ok(Self::Bitmap),
            "LABEL_LIST" | "LABELLIST" => Ok(Self::LabelList),
+            "FM" | "FMINDEX" => Ok(Self::Fm),
            "FTS" | "INVERTED" => Ok(Self::FTS),
            "IVF_FLAT" => Ok(Self::IvfFlat),
            "IVF_SQ" => Ok(Self::IvfSq),
@@ -372,7 +381,6 @@ pub(crate) struct IndexMetadata {
    pub metric_type: Option<DistanceType>,
    // Sometimes the index type is provided at this level.
    pub index_type: Option<IndexType>,
-    pub loss: Option<f64>,
 }

 // This struct is used to deserialize the JSON data returned from the Lance API
@@ -404,6 +412,4 @@ pub struct IndexStatistics {
    pub distance_type: Option<DistanceType>,
    /// The number of parts this index is split into.
    pub num_indices: Option<u32>,
-    /// The loss value used by the index.
-    pub loss: Option<f64>,
 }
--- a/rust/lancedb/src/index/scalar.rs
+++ b/rust/lancedb/src/index/scalar.rs
@@ -51,6 +51,15 @@ pub struct BitmapIndexBuilder {}
 #[derive(Debug, Clone, Default, serde::Serialize)]
 pub struct LabelListIndexBuilder {}

+/// Builder for an FM-Index.
+///
+/// An FM-Index (Ferragina–Manzini) is a scalar index over string/binary columns
+/// that accelerates substring search, i.e. `contains(col, 'needle')`. Unlike an
+/// inverted (FTS) index, it matches arbitrary substrings of the raw bytes rather
+/// than tokenized words.
+#[derive(Debug, Clone, Default, serde::Serialize)]
+pub struct FmIndexBuilder {}
+
 pub use lance_index::scalar::FullTextSearchQuery;
 pub use lance_index::scalar::InvertedIndexParams as FtsIndexBuilder;
 pub use lance_index::scalar::InvertedIndexParams;
--- a/rust/lancedb/src/remote/db.rs
+++ b/rust/lancedb/src/remote/db.rs
@@ -983,6 +983,49 @@ mod tests {
        assert_eq!(table.name(), "table1");
    }

+    #[tokio::test]
+    async fn test_open_table_branch_and_version() {
+        // Remote supports version time-travel but not branches. A version-only
+        // open (or one on the default "main" branch) must succeed; a non-main
+        // branch must be rejected, with or without a version.
+        let conn = Connection::new_with_handler(|request| {
+            assert_eq!(request.url().path(), "/v1/table/t/describe/");
+            http::Response::builder()
+                .status(200)
+                .body(
+                    r#"{"table": "t", "version": 2, "schema": {"fields": [
+                        {"name": "a", "type": { "type": "int32" }, "nullable": false}
+                    ]}}"#,
+                )
+                .unwrap()
+        });
+
+        // version-only: allowed (open + checkout(version) both round-trip)
+        conn.open_table("t").version(2).execute().await.unwrap();
+
+        // "main" is the default branch, so it counts as no branch
+        conn.open_table("t")
+            .branch("main")
+            .version(2)
+            .execute()
+            .await
+            .unwrap();
+
+        // a non-main branch is rejected, with or without a version
+        assert!(matches!(
+            conn.open_table("t").branch("exp").execute().await,
+            Err(Error::NotSupported { .. })
+        ));
+        assert!(matches!(
+            conn.open_table("t")
+                .branch("exp")
+                .version(2)
+                .execute()
+                .await,
+            Err(Error::NotSupported { .. })
+        ));
+    }
+
    #[tokio::test]
    async fn test_open_table_not_found() {
        let conn = Connection::new_with_handler(|_| {
--- a/rust/lancedb/src/remote/table.rs
+++ b/rust/lancedb/src/remote/table.rs
@@ -18,13 +18,14 @@ use crate::index::waiter::wait_for_index;
 use crate::query::{QueryFilter, QueryRequest, Select, VectorQueryRequest};
 use crate::table::AddColumnsResult;
 use crate::table::AddResult;
-use crate::table::AlterColumnsResult;
 use crate::table::DeleteResult;
 use crate::table::DropColumnsResult;
 use crate::table::MergeResult;
 use crate::table::Tags;
 use crate::table::UpdateResult;
+use crate::table::merge::MergeFilter;
 use crate::table::query::create_multi_vector_plan;
+use crate::table::{AlterColumnsResult, FieldMetadataUpdate, UpdateFieldMetadataResult};
 use crate::table::{AnyQuery, Filter, Predicate, PreprocessingOutput, TableStatistics};
 use crate::utils::background_cache::BackgroundCache;
 use crate::utils::{
@@ -1383,6 +1384,38 @@ impl<S: HttpSend> BaseTable for RemoteTable<S> {
            .map_err(unwrap_shared_error)
    }

+    async fn create_branch(
+        &self,
+        _name: &str,
+        _from: lance::dataset::refs::Ref,
+    ) -> Result<Arc<dyn BaseTable>> {
+        Err(Error::NotSupported {
+            message: "branching is not yet supported on remote tables".into(),
+        })
+    }
+
+    async fn checkout_branch(&self, _name: &str) -> Result<Arc<dyn BaseTable>> {
+        Err(Error::NotSupported {
+            message: "branching is not yet supported on remote tables".into(),
+        })
+    }
+
+    async fn list_branches(&self) -> Result<HashMap<String, lance::dataset::refs::BranchContents>> {
+        Err(Error::NotSupported {
+            message: "branching is not yet supported on remote tables".into(),
+        })
+    }
+
+    async fn delete_branch(&self, _name: &str) -> Result<()> {
+        Err(Error::NotSupported {
+            message: "branching is not yet supported on remote tables".into(),
+        })
+    }
+
+    fn current_branch(&self) -> Option<String> {
+        None
+    }
+
    async fn count_rows(&self, filter: Option<Filter>) -> Result<usize> {
        let mut request = self.post_read(&format!("/v1/table/{}/count_rows/", self.identifier));

@@ -1717,6 +1750,7 @@ impl<S: HttpSend> BaseTable for RemoteTable<S> {
            Index::BTree(p) => ("BTREE", Some(to_json(p)?)),
            Index::Bitmap(p) => ("BITMAP", Some(to_json(p)?)),
            Index::LabelList(p) => ("LABEL_LIST", Some(to_json(p)?)),
+            Index::Fm(p) => ("FM", Some(to_json(p)?)),
            Index::FTS(p) => ("FTS", Some(to_json(p)?)),
            Index::Auto => {
                if supported_vector_data_type(field.data_type()) {
@@ -1826,16 +1860,57 @@ impl<S: HttpSend> BaseTable for RemoteTable<S> {
        })
    }

-    async fn set_lsm_write_spec(&self, _spec: crate::table::LsmWriteSpec) -> Result<()> {
-        Err(Error::NotSupported {
-            message: "set_lsm_write_spec is not supported on LanceDB cloud.".into(),
-        })
+    async fn set_lsm_write_spec(&self, spec: crate::table::LsmWriteSpec) -> Result<()> {
+        use crate::table::LsmWriteSpec;
+        self.check_mutable().await?;
+
+        // Map the spec onto the server's request DTO. `sharding` is internally
+        // tagged on `mode` to mirror sophon's `Sharding` enum; `maintained_indexes`
+        // and `writer_config_defaults` are sent verbatim (an empty list means "no
+        // maintained indexes", not "default to all").
+        let sharding = match &spec {
+            LsmWriteSpec::Bucket {
+                column,
+                num_buckets,
+                ..
+            } => serde_json::json!({
+                "mode": "bucket",
+                "column": column,
+                "num_buckets": num_buckets,
+            }),
+            LsmWriteSpec::Identity { column, .. } => serde_json::json!({
+                "mode": "identity",
+                "column": column,
+            }),
+            LsmWriteSpec::Unsharded { .. } => serde_json::json!({ "mode": "unsharded" }),
+        };
+        let body = serde_json::json!({
+            "sharding": sharding,
+            "maintained_indexes": spec.maintained_indexes(),
+            "writer_config_defaults": spec.writer_config_defaults(),
+        });
+
+        let request = self
+            .client
+            .post(&format!(
+                "/v1/table/{}/set_lsm_write_spec/",
+                self.identifier
+            ))
+            .json(&body);
+        let (request_id, response) = self.send(request, true).await?;
+        self.check_table_response(&request_id, response).await?;
+        Ok(())
    }

    async fn unset_lsm_write_spec(&self) -> Result<()> {
-        Err(Error::NotSupported {
-            message: "unset_lsm_write_spec is not supported on LanceDB cloud.".into(),
-        })
+        self.check_mutable().await?;
+        let request = self.client.post(&format!(
+            "/v1/table/{}/unset_lsm_write_spec/",
+            self.identifier
+        ));
+        let (request_id, response) = self.send(request, true).await?;
+        self.check_table_response(&request_id, response).await?;
+        Ok(())
    }

    async fn tags(&self) -> Result<Box<dyn Tags + '_>> {
@@ -1968,6 +2043,35 @@ impl<S: HttpSend> BaseTable for RemoteTable<S> {
        Ok(result)
    }

+    async fn update_field_metadata(
+        &self,
+        updates: &[FieldMetadataUpdate],
+    ) -> Result<UpdateFieldMetadataResult> {
+        self.check_mutable().await?;
+        let body = serde_json::json!({ "updates": updates });
+        let request = self
+            .client
+            .post(&format!(
+                "/v1/table/{}/update_field_metadata/",
+                self.identifier
+            ))
+            .json(&body);
+        let (request_id, response) = self.send(request, true).await?;
+        let response = self.check_table_response(&request_id, response).await?;
+        let body = response.text().await.err_to_http(request_id.clone())?;
+
+        let result: UpdateFieldMetadataResult =
+            serde_json::from_str(&body).map_err(|e| Error::Http {
+                source: format!("Failed to parse update_field_metadata response: {}", e).into(),
+                request_id,
+                status_code: None,
+            })?;
+
+        self.invalidate_schema_cache();
+        self.track_write_version(result.version);
+        Ok(result)
+    }
+
    async fn drop_columns(&self, columns: &[&str]) -> Result<DropColumnsResult> {
        self.check_mutable().await?;
        let body = serde_json::json!({ "columns": columns });
@@ -2237,13 +2341,34 @@ impl TryFrom<MergeInsertBuilder> for MergeInsertRequest {
        }
        let on = value.on[0].clone();

+        let when_matched_update_all_filt = match value.when_matched_update_all_filt {
+            Some(MergeFilter::Sql(sql)) => Some(sql),
+            Some(MergeFilter::Expr(_)) => {
+                return Err(Error::NotSupported {
+                    message: "DataFusion expressions are not supported on remote tables".into(),
+                });
+            }
+            None => None,
+        };
+
+        let when_not_matched_by_source_delete_filt =
+            match value.when_not_matched_by_source_delete_filt {
+                Some(MergeFilter::Sql(sql)) => Some(sql),
+                Some(MergeFilter::Expr(_)) => {
+                    return Err(Error::NotSupported {
+                        message: "DataFusion expressions are not supported on remote tables".into(),
+                    });
+                }
+                None => None,
+            };
+
        Ok(Self {
            on,
            when_matched_update_all: value.when_matched_update_all,
-            when_matched_update_all_filt: value.when_matched_update_all_filt,
+            when_matched_update_all_filt,
            when_not_matched_insert_all: value.when_not_matched_insert_all,
            when_not_matched_by_source_delete: value.when_not_matched_by_source_delete,
-            when_not_matched_by_source_delete_filt: value.when_not_matched_by_source_delete_filt,
+            when_not_matched_by_source_delete_filt,
            // Only serialize use_index when it's false for backwards compatibility
            use_index: value.use_index,
        })
@@ -2261,6 +2386,7 @@ mod tests {

    use crate::remote::client::{ClientConfig, RetryConfig};
    use crate::table::AddDataMode;
+    use crate::table::FieldMetadataUpdate;

    use arrow::{array::AsArray, compute::concat_batches, datatypes::Int32Type};
    use arrow_array::{Int32Array, RecordBatch, RecordBatchIterator, record_batch};
@@ -2491,11 +2617,19 @@ mod tests {
        let vector_type =
            DataType::FixedSizeList(Arc::new(Field::new("item", DataType::Float32, true)), 8);
        Schema::new(vec![
+            Field::new("rowId", DataType::Int32, false),
+            Field::new("row-id", DataType::Int32, false),
+            Field::new("userId", DataType::Int32, false),
            Field::new(
                "metadata",
                DataType::Struct(vec![Field::new("user_id", DataType::Int32, false)].into()),
                false,
            ),
+            Field::new(
+                "MetaData",
+                DataType::Struct(vec![Field::new("userId", DataType::Int32, false)].into()),
+                false,
+            ),
            Field::new(
                "image",
                DataType::Struct(vec![Field::new("embedding", vector_type, false)].into()),
@@ -3789,6 +3923,22 @@ mod tests {
    async fn test_create_index_nested_field_paths() {
        let schema = nested_index_schema();
        let expected_requests = Arc::new(vec![
+            json!({
+                "column": "rowId",
+                "index_type": "BTREE",
+            }),
+            json!({
+                "column": "`row-id`",
+                "index_type": "BTREE",
+            }),
+            json!({
+                "column": "userId",
+                "index_type": "BTREE",
+            }),
+            json!({
+                "column": "MetaData.userId",
+                "index_type": "BTREE",
+            }),
            json!({
                "column": "metadata.user_id",
                "index_type": "BTREE",
@@ -3844,6 +3994,26 @@ mod tests {
            }
        });

+        table
+            .create_index(&["rowId"], Index::BTree(Default::default()))
+            .execute()
+            .await
+            .unwrap();
+        table
+            .create_index(&["`ROW-ID`"], Index::BTree(Default::default()))
+            .execute()
+            .await
+            .unwrap();
+        table
+            .create_index(&["userId"], Index::BTree(Default::default()))
+            .execute()
+            .await
+            .unwrap();
+        table
+            .create_index(&["MetaData.userId"], Index::BTree(Default::default()))
+            .execute()
+            .await
+            .unwrap();
        table
            .create_index(&["Metadata.USER_ID"], Index::BTree(Default::default()))
            .execute()
@@ -3954,6 +4124,166 @@ mod tests {
        assert_eq!(indices, expected);
    }

+    #[tokio::test]
+    async fn test_list_indices_nested_field_paths() {
+        let schema = nested_index_schema();
+        let table = Table::new_with_handler("my_table", move |request| {
+            assert_eq!(request.method(), "POST");
+
+            let response_body = match request.url().path() {
+                "/v1/table/my_table/describe/" => {
+                    return http::Response::builder()
+                        .status(200)
+                        .body(describe_response(&schema))
+                        .unwrap();
+                }
+                "/v1/table/my_table/index/list/" => {
+                    serde_json::json!({
+                        "indexes": [
+                            {
+                                "index_name": "row_id_idx",
+                                "index_uuid": "00000000-0000-0000-0000-000000000001",
+                                "columns": ["rowId"],
+                                "index_status": "done",
+                            },
+                            {
+                                "index_name": "row_dash_id_idx",
+                                "index_uuid": "00000000-0000-0000-0000-000000000002",
+                                "columns": ["`ROW-ID`"],
+                                "index_status": "done",
+                            },
+                            {
+                                "index_name": "user_id_idx",
+                                "index_uuid": "00000000-0000-0000-0000-000000000003",
+                                "columns": ["userId"],
+                                "index_status": "done",
+                            },
+                            {
+                                "index_name": "mixed_case_metadata_user_id_idx",
+                                "index_uuid": "00000000-0000-0000-0000-000000000004",
+                                "columns": ["MetaData.userId"],
+                                "index_status": "done",
+                            },
+                            {
+                                "index_name": "metadata_user_id_idx",
+                                "index_uuid": "00000000-0000-0000-0000-000000000005",
+                                "columns": ["Metadata.USER_ID"],
+                                "index_status": "done",
+                            },
+                            {
+                                "index_name": "image_embedding_idx",
+                                "index_uuid": "00000000-0000-0000-0000-000000000006",
+                                "columns": ["Image.Embedding"],
+                                "index_status": "done",
+                            },
+                            {
+                                "index_name": "payload_text_idx",
+                                "index_uuid": "00000000-0000-0000-0000-000000000007",
+                                "columns": ["Payload.Text"],
+                                "index_status": "done",
+                            },
+                            {
+                                "index_name": "meta_data_user_id_idx",
+                                "index_uuid": "00000000-0000-0000-0000-000000000008",
+                                "columns": ["`META-DATA`.`USER-ID`"],
+                                "index_status": "done",
+                            },
+                            {
+                                "index_name": "literal_dot_idx",
+                                "index_uuid": "00000000-0000-0000-0000-000000000009",
+                                "columns": ["literal.`A.B`"],
+                                "index_status": "done",
+                            },
+                        ]
+                    })
+                }
+                "/v1/table/my_table/index/row_id_idx/stats/"
+                | "/v1/table/my_table/index/row_dash_id_idx/stats/"
+                | "/v1/table/my_table/index/user_id_idx/stats/"
+                | "/v1/table/my_table/index/mixed_case_metadata_user_id_idx/stats/"
+                | "/v1/table/my_table/index/metadata_user_id_idx/stats/"
+                | "/v1/table/my_table/index/meta_data_user_id_idx/stats/"
+                | "/v1/table/my_table/index/literal_dot_idx/stats/" => {
+                    serde_json::json!({
+                        "num_indexed_rows": 100000,
+                        "num_unindexed_rows": 0,
+                        "index_type": "BTREE"
+                    })
+                }
+                "/v1/table/my_table/index/image_embedding_idx/stats/" => {
+                    serde_json::json!({
+                        "num_indexed_rows": 100000,
+                        "num_unindexed_rows": 0,
+                        "index_type": "IVF_PQ",
+                        "distance_type": "l2"
+                    })
+                }
+                "/v1/table/my_table/index/payload_text_idx/stats/" => {
+                    serde_json::json!({
+                        "num_indexed_rows": 100000,
+                        "num_unindexed_rows": 0,
+                        "index_type": "FTS"
+                    })
+                }
+                path => panic!("Unexpected path: {}", path),
+            };
+            http::Response::builder()
+                .status(200)
+                .body(serde_json::to_string(&response_body).unwrap())
+                .unwrap()
+        });
+
+        let indices = table.list_indices().await.unwrap();
+        let expected = vec![
+            IndexConfig {
+                name: "row_id_idx".into(),
+                index_type: IndexType::BTree,
+                columns: vec!["rowId".into()],
+            },
+            IndexConfig {
+                name: "row_dash_id_idx".into(),
+                index_type: IndexType::BTree,
+                columns: vec!["`row-id`".into()],
+            },
+            IndexConfig {
+                name: "user_id_idx".into(),
+                index_type: IndexType::BTree,
+                columns: vec!["userId".into()],
+            },
+            IndexConfig {
+                name: "mixed_case_metadata_user_id_idx".into(),
+                index_type: IndexType::BTree,
+                columns: vec!["MetaData.userId".into()],
+            },
+            IndexConfig {
+                name: "metadata_user_id_idx".into(),
+                index_type: IndexType::BTree,
+                columns: vec!["metadata.user_id".into()],
+            },
+            IndexConfig {
+                name: "image_embedding_idx".into(),
+                index_type: IndexType::IvfPq,
+                columns: vec!["image.embedding".into()],
+            },
+            IndexConfig {
+                name: "payload_text_idx".into(),
+                index_type: IndexType::FTS,
+                columns: vec!["payload.text".into()],
+            },
+            IndexConfig {
+                name: "meta_data_user_id_idx".into(),
+                index_type: IndexType::BTree,
+                columns: vec!["`meta-data`.`user-id`".into()],
+            },
+            IndexConfig {
+                name: "literal_dot_idx".into(),
+                index_type: IndexType::BTree,
+                columns: vec!["literal.`a.b`".into()],
+            },
+        ];
+        assert_eq!(indices, expected);
+    }
+
    #[tokio::test]
    async fn test_list_versions() {
        let table = Table::new_with_handler("my_table", |request| {
@@ -4028,7 +4358,6 @@ mod tests {
            index_type: IndexType::IvfPq,
            distance_type: Some(DistanceType::L2),
            num_indices: None,
-            loss: None,
        };
        assert_eq!(indices, expected);

@@ -4376,6 +4705,91 @@ mod tests {
        assert!(matches!(e, Error::IndexNotFound { .. }));
    }

+    #[tokio::test]
+    async fn test_set_lsm_write_spec_unsharded() {
+        let table = Table::new_with_handler("my_table", |request| {
+            assert_eq!(request.method(), "POST");
+            assert_eq!(
+                request.url().path(),
+                "/v1/table/my_table/set_lsm_write_spec/"
+            );
+            let body = request.body().unwrap().as_bytes().unwrap();
+            let body: serde_json::Value = serde_json::from_slice(body).unwrap();
+            assert_eq!(body["sharding"], serde_json::json!({ "mode": "unsharded" }));
+            assert_eq!(body["maintained_indexes"], serde_json::json!(["id_idx"]));
+            assert_eq!(
+                body["writer_config_defaults"],
+                serde_json::json!({ "max_memtable_rows": "1000" })
+            );
+            http::Response::builder()
+                .status(200)
+                .body(r#"{"maintained_indexes":["id_idx"]}"#)
+                .unwrap()
+        });
+        let spec = crate::table::LsmWriteSpec::unsharded()
+            .with_maintained_indexes(["id_idx"])
+            .with_writer_config_defaults([("max_memtable_rows", "1000")]);
+        table.set_lsm_write_spec(spec).await.unwrap();
+    }
+
+    #[tokio::test]
+    async fn test_set_lsm_write_spec_bucket() {
+        let table = Table::new_with_handler("my_table", |request| {
+            assert_eq!(request.method(), "POST");
+            assert_eq!(
+                request.url().path(),
+                "/v1/table/my_table/set_lsm_write_spec/"
+            );
+            let body = request.body().unwrap().as_bytes().unwrap();
+            let body: serde_json::Value = serde_json::from_slice(body).unwrap();
+            assert_eq!(
+                body["sharding"],
+                serde_json::json!({ "mode": "bucket", "column": "id", "num_buckets": 16 })
+            );
+            assert_eq!(body["maintained_indexes"], serde_json::json!([]));
+            http::Response::builder().status(200).body("{}").unwrap()
+        });
+        table
+            .set_lsm_write_spec(crate::table::LsmWriteSpec::bucket("id", 16))
+            .await
+            .unwrap();
+    }
+
+    #[tokio::test]
+    async fn test_set_lsm_write_spec_identity() {
+        let table = Table::new_with_handler("my_table", |request| {
+            assert_eq!(request.method(), "POST");
+            assert_eq!(
+                request.url().path(),
+                "/v1/table/my_table/set_lsm_write_spec/"
+            );
+            let body = request.body().unwrap().as_bytes().unwrap();
+            let body: serde_json::Value = serde_json::from_slice(body).unwrap();
+            assert_eq!(
+                body["sharding"],
+                serde_json::json!({ "mode": "identity", "column": "tenant" })
+            );
+            http::Response::builder().status(200).body("{}").unwrap()
+        });
+        table
+            .set_lsm_write_spec(crate::table::LsmWriteSpec::identity("tenant"))
+            .await
+            .unwrap();
+    }
+
+    #[tokio::test]
+    async fn test_unset_lsm_write_spec() {
+        let table = Table::new_with_handler("my_table", |request| {
+            assert_eq!(request.method(), "POST");
+            assert_eq!(
+                request.url().path(),
+                "/v1/table/my_table/unset_lsm_write_spec/"
+            );
+            http::Response::builder().status(200).body("{}").unwrap()
+        });
+        table.unset_lsm_write_spec().await.unwrap();
+    }
+
    #[tokio::test]
    async fn test_wait_for_index() {
        let table = _make_table_with_indices(0);
@@ -6460,4 +6874,25 @@ mod tests {
        assert!(!headers.contains_key("x-lancedb-min-version"));
        assert!(!headers.contains_key("x-lancedb-min-timestamp"));
    }
+
+    #[tokio::test]
+    async fn test_update_field_metadata() {
+        let table = Table::new_with_handler("my_table", |request| {
+            assert_eq!(request.method(), "POST");
+            assert_eq!(
+                request.url().path(),
+                "/v1/table/my_table/update_field_metadata/"
+            );
+            http::Response::builder()
+                .status(200)
+                .body(r#"{"version": 7, "fields": {"category": {"unit": "label"}}}"#)
+                .unwrap()
+        });
+
+        let result = table
+            .update_field_metadata(&[FieldMetadataUpdate::new("category").set("unit", "label")])
+            .await
+            .unwrap();
+        assert_eq!(result.version, 7);
+    }
 }
--- a/rust/lancedb/src/table.rs
+++ b/rust/lancedb/src/table.rs
@@ -62,7 +62,8 @@ use crate::query::{IntoQueryVector, Query, QueryExecutionOptions, TakeQuery, Vec
 use crate::table::datafusion::insert::InsertExec;
 use crate::utils::{
    PatchReadParam, PatchWriteParam, supported_bitmap_data_type, supported_btree_data_type,
-    supported_fts_data_type, supported_label_list_data_type, supported_vector_data_type,
+    supported_fm_data_type, supported_fts_data_type, supported_label_list_data_type,
+    supported_vector_data_type,
 };

 use self::dataset::DatasetConsistencyWrapper;
@@ -86,12 +87,15 @@ pub use add_data::{AddDataBuilder, AddDataMode, AddResult, NaNVectorBehavior};
 pub use chrono::Duration;
 pub use delete::DeleteResult;
 use futures::future::join_all;
-pub use lance::dataset::refs::{TagContents, Tags as LanceTags};
+pub use lance::dataset::refs::{BranchContents, Ref, TagContents, Tags as LanceTags};
 pub use lance::dataset::scanner::DatasetRecordBatchStream;
 use lance::dataset::statistics::DatasetStatisticsExt;
 pub use lance_index::optimize::OptimizeOptions;
 pub use optimize::{CompactionOptions, OptimizeAction, OptimizeStats};
-pub use schema_evolution::{AddColumnsResult, AlterColumnsResult, DropColumnsResult};
+pub use schema_evolution::{
+    AddColumnsResult, AlterColumnsResult, DropColumnsResult, FieldMetadataUpdate,
+    UpdateFieldMetadataResult,
+};
 use serde_with::skip_serializing_none;
 pub use update::{UpdateBuilder, UpdateResult};

@@ -622,6 +626,37 @@ pub trait BaseTable: std::fmt::Display + std::fmt::Debug + Send + Sync {
    async fn restore(&self) -> Result<()>;
    /// List the versions of the table.
    async fn list_versions(&self) -> Result<Vec<Version>>;
+    /// Create a new branch from `from` and return a handle scoped to it.
+    async fn create_branch(
+        &self,
+        name: &str,
+        from: lance::dataset::refs::Ref,
+    ) -> Result<Arc<dyn BaseTable>>;
+    /// Check out an existing branch and return a handle scoped to it.
+    async fn checkout_branch(&self, name: &str) -> Result<Arc<dyn BaseTable>>;
+    /// Check out an existing branch at an optional version, returning a handle.
+    ///
+    /// `None` tracks the branch's latest version; `Some(v)` pins the handle to version `v`
+    /// (read-only). The default implementation composes [`Self::checkout_branch`]
+    /// and [`Self::checkout`]; implementations may override it to resolve the
+    /// `(branch, version)` coordinate in a single manifest read.
+    async fn checkout_branch_version(
+        &self,
+        name: &str,
+        version: Option<u64>,
+    ) -> Result<Arc<dyn BaseTable>> {
+        let branch = self.checkout_branch(name).await?;
+        if let Some(version) = version {
+            branch.checkout(version).await?;
+        }
+        Ok(branch)
+    }
+    /// List the branches of the table.
+    async fn list_branches(&self) -> Result<HashMap<String, BranchContents>>;
+    /// Delete a branch.
+    async fn delete_branch(&self, name: &str) -> Result<()>;
+    /// The branch this handle is scoped to, or `None` for `main`.
+    fn current_branch(&self) -> Option<String>;
    /// Get the table definition.
    async fn table_definition(&self) -> Result<TableDefinition>;
    /// Get the table URI (storage location)
@@ -660,6 +695,19 @@ pub trait BaseTable: std::fmt::Display + std::fmt::Debug + Send + Sync {
            message: "create_insert_exec not implemented".to_string(),
        })
    }
+    /// Update per-field metadata. Merges into existing metadata by default;
+    /// [`FieldMetadataUpdate::remove`] deletes a key and
+    /// [`FieldMetadataUpdate::replace`] swaps the field's whole map.
+    ///
+    /// The default returns `NotSupported`; Lance-backed and remote tables override it.
+    async fn update_field_metadata(
+        &self,
+        _updates: &[FieldMetadataUpdate],
+    ) -> Result<UpdateFieldMetadataResult> {
+        Err(Error::NotSupported {
+            message: "update_field_metadata is not supported on this table type".into(),
+        })
+    }
 }

 /// A Table is a collection of strong typed Rows.
@@ -1340,6 +1388,14 @@ impl Table {
        self.inner.alter_columns(alterations).await
    }

+    /// Update per-field metadata (merges by default).
+    pub async fn update_field_metadata(
+        &self,
+        updates: &[FieldMetadataUpdate],
+    ) -> Result<UpdateFieldMetadataResult> {
+        self.inner.update_field_metadata(updates).await
+    }
+
    /// Remove columns from the table.
    pub async fn drop_columns(&self, columns: &[&str]) -> Result<DropColumnsResult> {
        self.inner.drop_columns(columns).await
@@ -1601,6 +1657,57 @@ impl Table {
        self.inner.tags().await
    }

+    /// Create a new branch from `from` (a version, tag, or branch).
+    pub async fn create_branch(
+        &self,
+        name: &str,
+        from: impl Into<lance::dataset::refs::Ref>,
+    ) -> Result<Self> {
+        let inner = self.inner.create_branch(name, from.into()).await?;
+        Ok(Self {
+            inner,
+            database: self.database.clone(),
+            embedding_registry: self.embedding_registry.clone(),
+        })
+    }
+
+    /// Check out an existing branch and return a handle scoped to it.
+    ///
+    /// With `version` set, the returned handle is pinned to that version of the
+    /// branch: a read-only, detached view (as with [`Self::checkout`]). With
+    /// `version` as `None`, it tracks the branch's latest version and stays writable.
+    ///
+    /// ```
+    /// # use lancedb::Table;
+    /// # async fn f(table: &Table) -> Result<(), Box<dyn std::error::Error>> {
+    /// let exp_at_v3 = table.checkout_branch("exp", Some(3)).await?;
+    /// # Ok(())
+    /// # }
+    /// ```
+    pub async fn checkout_branch(&self, name: &str, version: Option<u64>) -> Result<Self> {
+        let inner = self.inner.checkout_branch_version(name, version).await?;
+        Ok(Self {
+            inner,
+            database: self.database.clone(),
+            embedding_registry: self.embedding_registry.clone(),
+        })
+    }
+
+    /// List the branches of the table.
+    pub async fn list_branches(&self) -> Result<HashMap<String, BranchContents>> {
+        self.inner.list_branches().await
+    }
+
+    /// Delete a branch.
+    pub async fn delete_branch(&self, name: &str) -> Result<()> {
+        self.inner.delete_branch(name).await
+    }
+
+    /// The branch this handle is scoped to, or `None` for `main`.
+    pub fn current_branch(&self) -> Option<String> {
+        self.inner.current_branch()
+    }
+
    /// Retrieve statistics on the table
    pub async fn stats(&self) -> Result<TableStatistics> {
        self.inner.stats().await
@@ -1798,8 +1905,11 @@ impl NativeTable {

        // Set up commit handler when managed_versioning is enabled
        if managed_versioning && let Some(ref ns_client) = namespace_client {
-            let external_store =
-                LanceNamespaceExternalManifestStore::new(ns_client.clone(), table_id.clone());
+            let external_store = LanceNamespaceExternalManifestStore::for_table_uri(
+                ns_client.clone(),
+                table_id.clone(),
+                uri,
+            )?;
            let commit_handler: Arc<dyn CommitHandler> = Arc::new(ExternalManifestCommitHandler {
                external_manifest_store: Arc::new(external_store),
            });
@@ -1837,6 +1947,30 @@ impl NativeTable {
        self
    }

+    /// Build a sibling `NativeTable` with the same identity but a different,
+    /// independent dataset wrapper. Used to hand out branch-scoped handles.
+    fn with_dataset(&self, dataset: dataset::DatasetConsistencyWrapper) -> Self {
+        Self {
+            name: self.name.clone(),
+            namespace: self.namespace.clone(),
+            id: self.id.clone(),
+            uri: self.uri.clone(),
+            dataset,
+            read_consistency_interval: self.read_consistency_interval,
+            namespace_client: self.namespace_client.clone(),
+            pushdown_operations: self.pushdown_operations.clone(),
+        }
+    }
+
+    fn validate_branch_name(name: &str, field: &str) -> Result<()> {
+        if name.is_empty() {
+            return Err(Error::InvalidInput {
+                message: format!("{field} must be a non-empty string"),
+            });
+        }
+        Ok(())
+    }
+
    /// Opens an existing Table using a namespace client.
    ///
    /// This method uses `DatasetBuilder::from_namespace` to open the table, which
@@ -2330,6 +2464,12 @@ impl NativeTable {
                    BuiltinIndexType::LabelList,
                )))
            }
+            Index::Fm(_) => {
+                Self::validate_index_type(field, "FM", supported_fm_data_type)?;
+                Ok(Box::new(ScalarIndexParams::for_builtin(
+                    BuiltinIndexType::Fm,
+                )))
+            }
            Index::FTS(fts_opts) => {
                Self::validate_index_type(field, "FTS", supported_fts_data_type)?;
                Ok(Box::new(fts_opts))
@@ -2488,6 +2628,7 @@ impl NativeTable {
            Index::BTree(_) => IndexType::BTree,
            Index::Bitmap(_) => IndexType::Bitmap,
            Index::LabelList(_) => IndexType::LabelList,
+            Index::Fm(_) => IndexType::Fm,
            Index::FTS(_) => IndexType::Inverted,
            Index::IvfFlat(_)
            | Index::IvfSq(_)
@@ -2580,6 +2721,7 @@ impl NativeTable {
    ///   field id and the second element is a hashmap of metadata key-value
    ///   pairs.
    ///
+    #[deprecated(since = "0.33.1", note = "Use `update_field_metadata` instead")]
    pub async fn replace_field_metadata(
        &self,
        new_values: impl IntoIterator<Item = (u32, HashMap<String, String>)>,
@@ -2627,6 +2769,72 @@ impl BaseTable for NativeTable {
        self.dataset.reload().await
    }

+    async fn create_branch(
+        &self,
+        name: &str,
+        from: lance::dataset::refs::Ref,
+    ) -> Result<Arc<dyn BaseTable>> {
+        Self::validate_branch_name(name, "branch name")?;
+        if let lance::dataset::refs::Ref::Version(Some(from_branch), _) = &from {
+            Self::validate_branch_name(from_branch, "from_ref")?;
+        }
+        let mut ds = (*self.dataset.get().await?).clone();
+        let branch_ds = ds.create_branch(name, from, None).await?;
+        let dataset = dataset::DatasetConsistencyWrapper::new_latest(
+            branch_ds,
+            self.read_consistency_interval,
+        );
+        Ok(Arc::new(self.with_dataset(dataset)))
+    }
+
+    async fn checkout_branch(&self, name: &str) -> Result<Arc<dyn BaseTable>> {
+        Self::validate_branch_name(name, "branch name")?;
+        let branch_ds = self.dataset.get().await?.checkout_branch(name).await?;
+        let dataset = dataset::DatasetConsistencyWrapper::new_latest(
+            branch_ds,
+            self.read_consistency_interval,
+        );
+        Ok(Arc::new(self.with_dataset(dataset)))
+    }
+
+    async fn checkout_branch_version(
+        &self,
+        name: &str,
+        version: Option<u64>,
+    ) -> Result<Arc<dyn BaseTable>> {
+        let Some(version) = version else {
+            return self.checkout_branch(name).await;
+        };
+        Self::validate_branch_name(name, "branch name")?;
+        // Resolve (branch, version) in a single manifest read.
+        let branch_ds = self
+            .dataset
+            .get()
+            .await?
+            .checkout_version((name, version))
+            .await?;
+        let dataset = dataset::DatasetConsistencyWrapper::new_time_travel(
+            branch_ds,
+            self.read_consistency_interval,
+        );
+        Ok(Arc::new(self.with_dataset(dataset)))
+    }
+
+    async fn list_branches(&self) -> Result<HashMap<String, BranchContents>> {
+        Ok(self.dataset.get().await?.list_branches().await?)
+    }
+
+    async fn delete_branch(&self, name: &str) -> Result<()> {
+        Self::validate_branch_name(name, "branch name")?;
+        let mut ds = (*self.dataset.get().await?).clone();
+        ds.delete_branch(name).await?;
+        Ok(())
+    }
+
+    fn current_branch(&self) -> Option<String> {
+        self.dataset.current_branch()
+    }
+
    async fn list_versions(&self) -> Result<Vec<Version>> {
        Ok(self.dataset.get().await?.versions().await?)
    }
@@ -2886,6 +3094,13 @@ impl BaseTable for NativeTable {
        schema_evolution::execute_alter_columns(self, alterations).await
    }

+    async fn update_field_metadata(
+        &self,
+        updates: &[FieldMetadataUpdate],
+    ) -> Result<UpdateFieldMetadataResult> {
+        schema_evolution::execute_update_field_metadata(self, updates).await
+    }
+
    async fn drop_columns(&self, columns: &[&str]) -> Result<DropColumnsResult> {
        schema_evolution::execute_drop_columns(self, columns).await
    }
@@ -2987,20 +3202,12 @@ impl BaseTable for NativeTable {
                .ok_or_else(|| Error::InvalidInput {
                    message: "index statistics was missing index type".to_string(),
                })?;
-        let loss = stats
-            .indices
-            .iter()
-            .map(|index| index.loss.unwrap_or_default())
-            .sum::<f64>();
-
-        let loss = first_index.loss.map(|first_loss| first_loss + loss);
        Ok(Some(IndexStatistics {
            num_indexed_rows: stats.num_indexed_rows,
            num_unindexed_rows: stats.num_unindexed_rows,
            index_type,
            distance_type: first_index.metric_type,
            num_indices: stats.num_indices,
-            loss,
        }))
    }

@@ -3136,7 +3343,6 @@ pub struct FragmentSummaryStats {
 #[cfg(test)]
 #[allow(deprecated)]
 mod tests {
-    use std::collections::HashMap;
    use std::sync::Arc;
    use std::sync::atomic::{AtomicBool, Ordering};
    use std::time::Duration;
@@ -3144,7 +3350,7 @@ mod tests {
    use arrow_array::{
        Array, ArrayRef, BooleanArray, FixedSizeListArray, Int32Array, LargeStringArray,
        RecordBatch, RecordBatchIterator, RecordBatchReader, StringArray, StructArray,
-        builder::{ListBuilder, StringBuilder},
+        builder::{LargeListBuilder, ListBuilder, StringBuilder},
    };
    use arrow_array::{BinaryArray, LargeBinaryArray};
    use arrow_data::ArrayDataBuilder;
@@ -3158,7 +3364,7 @@ mod tests {
    use super::*;
    use crate::connect;
    use crate::connection::ConnectBuilder;
-    use crate::index::scalar::{BTreeIndexBuilder, BitmapIndexBuilder};
+    use crate::index::scalar::{BTreeIndexBuilder, BitmapIndexBuilder, FmIndexBuilder};
    use crate::index::vector::{IvfHnswPqIndexBuilder, IvfHnswSqIndexBuilder};
    use crate::query::Select;
    use crate::query::{ExecutableQuery, QueryBase};
@@ -3347,6 +3553,351 @@ mod tests {
        assert_eq!(table.version().await.unwrap(), 4);
    }

+    #[tokio::test]
+    async fn test_branches() {
+        let tmp_dir = tempdir().unwrap();
+        let uri = tmp_dir.path().to_str().unwrap();
+
+        let conn = ConnectBuilder::new(uri)
+            .read_consistency_interval(Duration::from_secs(0))
+            .execute()
+            .await
+            .unwrap();
+
+        // main: one row at v1
+        let table = conn
+            .create_table("my_table", some_sample_data())
+            .execute()
+            .await
+            .unwrap();
+        assert_eq!(table.count_rows(None).await.unwrap(), 1);
+        assert_eq!(table.current_branch(), None);
+        let main_version = table.version().await.unwrap();
+
+        // branch off main's current version; it starts with main's data
+        let branch = table.create_branch("exp", main_version).await.unwrap();
+        assert_eq!(branch.current_branch().as_deref(), Some("exp"));
+        assert_eq!(branch.count_rows(None).await.unwrap(), 1);
+
+        // writes on the branch are isolated from main
+        branch.add(some_sample_data()).execute().await.unwrap();
+        assert_eq!(branch.count_rows(None).await.unwrap(), 2);
+        assert_eq!(
+            table.count_rows(None).await.unwrap(),
+            1,
+            "main must be untouched by branch writes"
+        );
+
+        // the branch shows up in the listing
+        let branches = table.list_branches().await.unwrap();
+        assert!(branches.contains_key("exp"));
+
+        // checking out the branch from the main handle sees the branch's latest data
+        let checked_out = table.checkout_branch("exp", None).await.unwrap();
+        assert_eq!(checked_out.current_branch().as_deref(), Some("exp"));
+        assert_eq!(checked_out.count_rows(None).await.unwrap(), 2);
+
+        // open_table(...).branch(...) opens directly onto the branch
+        let opened = conn
+            .open_table("my_table")
+            .branch("exp")
+            .execute()
+            .await
+            .unwrap();
+        assert_eq!(opened.current_branch().as_deref(), Some("exp"));
+        assert_eq!(opened.count_rows(None).await.unwrap(), 2);
+
+        // delete removes it from the listing
+        table.delete_branch("exp").await.unwrap();
+        let branches = table.list_branches().await.unwrap();
+        assert!(!branches.contains_key("exp"));
+    }
+
+    #[tokio::test]
+    async fn test_branch_version_checkout() {
+        let tmp_dir = tempdir().unwrap();
+        let uri = tmp_dir.path().to_str().unwrap();
+
+        let conn = ConnectBuilder::new(uri)
+            .read_consistency_interval(Duration::from_secs(0))
+            .execute()
+            .await
+            .unwrap();
+
+        // main: a single fork-point row (i = 0)
+        let table = conn
+            .create_table("my_table", sample_rows(vec![0]))
+            .execute()
+            .await
+            .unwrap();
+        let fork_point = table.version().await.unwrap();
+
+        // Fork "exp", then advance exp AND main independently past the fork so
+        // they diverge while sharing version numbers.
+        let branch = table.create_branch("exp", fork_point).await.unwrap();
+        let exp_fork = branch.version().await.unwrap(); // exp's shallow-clone version
+        branch.add(sample_rows(vec![1])).execute().await.unwrap(); // exp: {0, 1}
+        let exp_v2 = branch.version().await.unwrap();
+        branch.add(sample_rows(vec![2])).execute().await.unwrap(); // exp HEAD: {0, 1, 2}
+
+        // main's own commit reaches the SAME version number with different data
+        table
+            .add(sample_rows(vec![100, 101, 102]))
+            .execute()
+            .await
+            .unwrap(); // main HEAD: {0, 100, 101, 102}
+        let main_v2 = table.version().await.unwrap();
+        assert_eq!(
+            exp_v2, main_v2,
+            "branch and main must share the version number for this test to mean anything"
+        );
+
+        // Open exp at the shared version. The data must be exp's, not main's.
+        // A row count alone is weak evidence of that, so also assert provenance
+        // by content.
+        let pinned = conn
+            .open_table("my_table")
+            .branch("exp")
+            .version(exp_v2)
+            .execute()
+            .await
+            .unwrap();
+        assert_eq!(pinned.current_branch().as_deref(), Some("exp"));
+        // isolated from exp's HEAD (3 rows) and from main@v2 (4 rows)
+        assert_eq!(pinned.count_rows(None).await.unwrap(), 2);
+        // exp's post-fork row is visible; main's divergent rows are not
+        assert_eq!(
+            pinned.count_rows(Some("i = 1".to_string())).await.unwrap(),
+            1
+        );
+        assert_eq!(
+            pinned
+                .count_rows(Some("i = 100".to_string()))
+                .await
+                .unwrap(),
+            0
+        );
+
+        // the same coordinate is reachable directly via checkout_branch(name, version)
+        let pinned_direct = table.checkout_branch("exp", Some(exp_v2)).await.unwrap();
+        assert_eq!(pinned_direct.current_branch().as_deref(), Some("exp"));
+        assert_eq!(pinned_direct.count_rows(None).await.unwrap(), 2);
+
+        // the HEADs are unaffected
+        let head = conn
+            .open_table("my_table")
+            .branch("exp")
+            .execute()
+            .await
+            .unwrap();
+        assert_eq!(head.count_rows(None).await.unwrap(), 3);
+        assert_eq!(table.count_rows(None).await.unwrap(), 4);
+
+        // a pinned version is a detached head: writes are rejected
+        assert!(pinned.add(sample_rows(vec![9])).execute().await.is_err());
+
+        // version-only (no branch) time-travels main itself: its fork-point
+        // version holds only main's first row, and the shared version number
+        // resolves to main's data, not the branch's ("opens main at the version")
+        let old_main = conn
+            .open_table("my_table")
+            .version(fork_point)
+            .execute()
+            .await
+            .unwrap();
+        assert_eq!(old_main.current_branch(), None);
+        assert_eq!(old_main.count_rows(None).await.unwrap(), 1);
+        let shared_on_main = conn
+            .open_table("my_table")
+            .version(exp_v2)
+            .execute()
+            .await
+            .unwrap();
+        assert_eq!(shared_on_main.current_branch(), None);
+        assert_eq!(shared_on_main.count_rows(None).await.unwrap(), 4);
+
+        // a nonexistent version is rejected
+        assert!(
+            conn.open_table("my_table")
+                .version(9999)
+                .execute()
+                .await
+                .is_err()
+        );
+
+        // a nonexistent version on a branch is rejected too: this resolves on
+        // the branch's path, a distinct miss from the main lookup above
+        assert!(
+            conn.open_table("my_table")
+                .branch("exp")
+                .version(9999)
+                .execute()
+                .await
+                .is_err()
+        );
+
+        // opening the branch at its fork point (the shallow-clone manifest)
+        // shows just the cloned state: main's fork-point row
+        let exp_at_fork = conn
+            .open_table("my_table")
+            .branch("exp")
+            .version(exp_fork)
+            .execute()
+            .await
+            .unwrap();
+        assert_eq!(exp_at_fork.current_branch().as_deref(), Some("exp"));
+        assert_eq!(exp_at_fork.count_rows(None).await.unwrap(), 1);
+
+        // checkout_latest re-attaches the pinned handle to the BRANCH's HEAD
+        // (writable again), not main's HEAD, and not staying pinned
+        pinned.checkout_latest().await.unwrap();
+        assert_eq!(pinned.current_branch().as_deref(), Some("exp"));
+        assert_eq!(pinned.count_rows(None).await.unwrap(), 3); // exp HEAD, not main's 4
+        pinned.add(sample_rows(vec![3])).execute().await.unwrap();
+        assert_eq!(pinned.count_rows(None).await.unwrap(), 4); // writable again
+    }
+
+    #[tokio::test]
+    async fn test_branch_version_two_branches() {
+        let tmp_dir = tempdir().unwrap();
+        let uri = tmp_dir.path().to_str().unwrap();
+        let conn = ConnectBuilder::new(uri)
+            .read_consistency_interval(Duration::from_secs(0))
+            .execute()
+            .await
+            .unwrap();
+
+        let table = conn
+            .create_table("my_table", sample_rows(vec![0]))
+            .execute()
+            .await
+            .unwrap();
+        let fork_point = table.version().await.unwrap();
+
+        // two branches off the same point, each advanced once so they reach the
+        // SAME version number with divergent data
+        let exp1 = table.create_branch("exp1", fork_point).await.unwrap();
+        let exp2 = table.create_branch("exp2", fork_point).await.unwrap();
+        exp1.add(sample_rows(vec![10])).execute().await.unwrap();
+        exp2.add(sample_rows(vec![20])).execute().await.unwrap();
+        let v1 = exp1.version().await.unwrap();
+        let v2 = exp2.version().await.unwrap();
+        assert_eq!(v1, v2, "both branches must reach the same version number");
+
+        // that shared version number resolves to each branch's own data
+        let at1 = table.checkout_branch("exp1", Some(v1)).await.unwrap();
+        assert_eq!(at1.count_rows(Some("i = 10".to_string())).await.unwrap(), 1);
+        assert_eq!(at1.count_rows(Some("i = 20".to_string())).await.unwrap(), 0);
+        let at2 = table.checkout_branch("exp2", Some(v2)).await.unwrap();
+        assert_eq!(at2.count_rows(Some("i = 20".to_string())).await.unwrap(), 1);
+        assert_eq!(at2.count_rows(Some("i = 10".to_string())).await.unwrap(), 0);
+    }
+
+    #[tokio::test]
+    async fn test_branch_name_validation() {
+        let tmp_dir = tempdir().unwrap();
+        let uri = tmp_dir.path().to_str().unwrap();
+        let conn = ConnectBuilder::new(uri).execute().await.unwrap();
+        let table = conn
+            .create_table("my_table", some_sample_data())
+            .execute()
+            .await
+            .unwrap();
+
+        // every entry point rejects an empty name instead of passing it down
+        assert!(matches!(
+            table.create_branch("", 1u64).await,
+            Err(Error::InvalidInput { .. })
+        ));
+        assert!(matches!(
+            table.checkout_branch("", None).await,
+            Err(Error::InvalidInput { .. })
+        ));
+        assert!(matches!(
+            table.delete_branch("").await,
+            Err(Error::InvalidInput { .. })
+        ));
+        // an empty source branch is rejected too
+        assert!(matches!(
+            table
+                .create_branch(
+                    "ok",
+                    lance::dataset::refs::Ref::Version(Some(String::new()), None)
+                )
+                .await,
+            Err(Error::InvalidInput { .. })
+        ));
+    }
+
+    #[tokio::test]
+    async fn test_branch_handle_tracks_concurrent_writes() {
+        let tmp_dir = tempdir().unwrap();
+        let uri = tmp_dir.path().to_str().unwrap();
+
+        // interval = 0 so every read checks storage for new commits
+        let conn = ConnectBuilder::new(uri)
+            .read_consistency_interval(Duration::from_secs(0))
+            .execute()
+            .await
+            .unwrap();
+        let table = conn
+            .create_table("my_table", some_sample_data())
+            .execute()
+            .await
+            .unwrap();
+        let v1 = table.version().await.unwrap();
+
+        // two independent handles on the same branch
+        let writer = table.create_branch("exp", v1).await.unwrap();
+        let reader = conn
+            .open_table("my_table")
+            .branch("exp")
+            .execute()
+            .await
+            .unwrap();
+        assert_eq!(reader.count_rows(None).await.unwrap(), 1);
+
+        // a concurrent write on the branch is visible to the other handle, which
+        // tracks the branch's HEAD (not main's)
+        writer.add(some_sample_data()).execute().await.unwrap();
+        assert_eq!(reader.count_rows(None).await.unwrap(), 2);
+        // main is untouched
+        assert_eq!(table.count_rows(None).await.unwrap(), 1);
+    }
+
+    #[tokio::test]
+    async fn test_branch_handle_without_consistency_interval_is_pinned() {
+        let tmp_dir = tempdir().unwrap();
+        let uri = tmp_dir.path().to_str().unwrap();
+
+        // default interval (None): handles do not auto-refresh
+        let conn = ConnectBuilder::new(uri).execute().await.unwrap();
+        let table = conn
+            .create_table("my_table", some_sample_data())
+            .execute()
+            .await
+            .unwrap();
+        let v1 = table.version().await.unwrap();
+
+        let writer = table.create_branch("exp", v1).await.unwrap();
+        let reader = conn
+            .open_table("my_table")
+            .branch("exp")
+            .execute()
+            .await
+            .unwrap();
+        assert_eq!(reader.count_rows(None).await.unwrap(), 1);
+
+        // without a consistency interval the reader stays on the version it
+        // opened, exactly like a main-branch handle...
+        writer.add(some_sample_data()).execute().await.unwrap();
+        assert_eq!(reader.count_rows(None).await.unwrap(), 1);
+
+        // ...until it explicitly refreshes
+        reader.checkout_latest().await.unwrap();
+        assert_eq!(reader.count_rows(None).await.unwrap(), 2);
+    }
+
    #[tokio::test]
    async fn test_create_index() {
        use arrow_array::RecordBatch;
@@ -3404,7 +3955,6 @@ mod tests {
        assert_eq!(stats.num_unindexed_rows, 0);
        assert_eq!(stats.index_type, crate::index::IndexType::IvfPq);
        assert_eq!(stats.distance_type, Some(crate::DistanceType::L2));
-        assert!(stats.loss.is_some());

        table.drop_index(index_name).await.unwrap();
        assert_eq!(table.list_indices().await.unwrap().len(), 0);
@@ -3482,7 +4032,7 @@ mod tests {
        use lance_index::vector::VectorIndex as LanceVectorIndex;

        let indices = native_table.load_indices().await.unwrap();
-        let index_uuid = indices[0].index_uuid.clone();
+        let index_uuid = uuid::Uuid::parse_str(&indices[0].index_uuid).unwrap();

        let dataset_guard = native_table.dataset.get().await.unwrap();
        let dataset = (*dataset_guard).clone();
@@ -3698,6 +4248,19 @@ mod tests {
        Box::new(RecordBatchIterator::new(vec![batch], schema))
    }

+    /// A single-batch reader holding the given `i` (Int32) values. Lets a test
+    /// write distinguishable rows so it can assert data provenance, not just row count.
+    fn sample_rows(values: Vec<i32>) -> Box<dyn arrow_array::RecordBatchReader + Send> {
+        let batch = RecordBatch::try_new(
+            Arc::new(Schema::new(vec![Field::new("i", DataType::Int32, false)])),
+            vec![Arc::new(Int32Array::from(values))],
+        )
+        .unwrap();
+        let schema = batch.schema().clone();
+
+        Box::new(RecordBatchIterator::new(vec![Ok(batch)], schema))
+    }
+
    #[tokio::test]
    async fn test_create_scalar_index() {
        let tmp_dir = tempdir().unwrap();
@@ -3751,6 +4314,56 @@ mod tests {
        assert_eq!(stats.num_unindexed_rows, 0);
    }

+    #[tokio::test]
+    async fn test_create_fm_index() {
+        let tmp_dir = tempdir().unwrap();
+        let uri = tmp_dir.path().to_str().unwrap();
+
+        // FM-Index accelerates substring search, so it applies to a string column.
+        let batch = RecordBatch::try_new(
+            Arc::new(Schema::new(vec![Field::new("text", DataType::Utf8, false)])),
+            vec![Arc::new(StringArray::from(vec!["hello world"]))],
+        )
+        .unwrap();
+        let conn = ConnectBuilder::new(uri).execute().await.unwrap();
+        let table = conn
+            .create_table("my_table", batch.clone())
+            .execute()
+            .await
+            .unwrap();
+
+        table
+            .create_index(&["text"], Index::Fm(FmIndexBuilder::default()))
+            .execute()
+            .await
+            .unwrap();
+        table
+            .wait_for_index(&["text_idx"], Duration::from_millis(10))
+            .await
+            .unwrap();
+
+        let index_configs = table.list_indices().await.unwrap();
+        assert_eq!(index_configs.len(), 1);
+        let index = index_configs.into_iter().next().unwrap();
+        assert_eq!(index.index_type, crate::index::IndexType::Fm);
+        assert_eq!(index.columns, vec!["text".to_string()]);
+
+        // The committed FM-Index must answer a substring `contains` query.
+        let count = table
+            .query()
+            .only_if("contains(text, 'world')")
+            .execute()
+            .await
+            .unwrap()
+            .try_collect::<Vec<_>>()
+            .await
+            .unwrap()
+            .iter()
+            .map(|b| b.num_rows())
+            .sum::<usize>();
+        assert_eq!(count, 1);
+    }
+
    #[tokio::test]
    async fn test_create_index_nested_field_paths() {
        let tmp_dir = tempdir().unwrap();
@@ -3760,11 +4373,20 @@ mod tests {
        let num_rows = 512;
        let dimension = 8;

+        let row_id = Arc::new(Int32Array::from_iter_values(0..num_rows)) as ArrayRef;
+        let row_dash_id = Arc::new(Int32Array::from_iter_values(0..num_rows)) as ArrayRef;
+        let top_user_id = Arc::new(Int32Array::from_iter_values(0..num_rows)) as ArrayRef;
+
        let metadata = Arc::new(StructArray::from(vec![(
            Arc::new(Field::new("user_id", DataType::Int32, false)),
            Arc::new(Int32Array::from_iter_values(0..num_rows)) as ArrayRef,
        )]));

+        let mixed_case_metadata = Arc::new(StructArray::from(vec![(
+            Arc::new(Field::new("userId", DataType::Int32, false)),
+            Arc::new(Int32Array::from_iter_values(0..num_rows)) as ArrayRef,
+        )]));
+
        let vector_values = arrow_array::Float32Array::from_iter_values(
            (0..num_rows * dimension).map(|v| v as f32),
        );
@@ -3797,15 +4419,31 @@ mod tests {
        )]));

        let schema = Arc::new(Schema::new(vec![
+            Field::new("rowId", DataType::Int32, false),
+            Field::new("row-id", DataType::Int32, false),
+            Field::new("userId", DataType::Int32, false),
            Field::new("metadata", metadata.data_type().clone(), false),
+            Field::new("MetaData", mixed_case_metadata.data_type().clone(), false),
            Field::new("image", image.data_type().clone(), false),
            Field::new("payload", payload.data_type().clone(), false),
            Field::new("meta-data", meta_data.data_type().clone(), false),
            Field::new("literal", literal.data_type().clone(), false),
        ]));
-        let batch =
-            RecordBatch::try_new(schema, vec![metadata, image, payload, meta_data, literal])
-                .unwrap();
+        let batch = RecordBatch::try_new(
+            schema,
+            vec![
+                row_id,
+                row_dash_id,
+                top_user_id,
+                metadata,
+                mixed_case_metadata,
+                image,
+                payload,
+                meta_data,
+                literal,
+            ],
+        )
+        .unwrap();

        let table = conn
            .create_table("nested_index_paths", batch)
@@ -3822,6 +4460,33 @@ mod tests {
            .execute()
            .await
            .unwrap();
+        table
+            .create_index(&["rowId"], Index::BTree(BTreeIndexBuilder::default()))
+            .name("row_id_idx".to_string())
+            .execute()
+            .await
+            .unwrap();
+        table
+            .create_index(&["`row-id`"], Index::BTree(BTreeIndexBuilder::default()))
+            .name("row_dash_id_idx".to_string())
+            .execute()
+            .await
+            .unwrap();
+        table
+            .create_index(&["userId"], Index::BTree(BTreeIndexBuilder::default()))
+            .name("top_user_id_idx".to_string())
+            .execute()
+            .await
+            .unwrap();
+        table
+            .create_index(
+                &["MetaData.userId"],
+                Index::BTree(BTreeIndexBuilder::default()),
+            )
+            .name("mixed_case_metadata_user_id_idx".to_string())
+            .execute()
+            .await
+            .unwrap();
        table
            .create_index(&["image.embedding"], Index::Auto)
            .name("image_embedding_idx".to_string())
@@ -3889,11 +4554,31 @@ mod tests {
                    &["metadata.user_id".to_string()][..],
                    crate::index::IndexType::BTree,
                ),
+                (
+                    "mixed_case_metadata_user_id_idx",
+                    &["MetaData.userId".to_string()][..],
+                    crate::index::IndexType::BTree,
+                ),
                (
                    "payload_text_idx",
                    &["payload.text".to_string()][..],
                    crate::index::IndexType::FTS,
                ),
+                (
+                    "row_dash_id_idx",
+                    &["`row-id`".to_string()][..],
+                    crate::index::IndexType::BTree,
+                ),
+                (
+                    "row_id_idx",
+                    &["rowId".to_string()][..],
+                    crate::index::IndexType::BTree,
+                ),
+                (
+                    "top_user_id_idx",
+                    &["userId".to_string()][..],
+                    crate::index::IndexType::BTree,
+                ),
            ]
        );

@@ -4143,6 +4828,52 @@ mod tests {
        assert_eq!(index.columns, vec!["tags".to_string()]);
    }

+    #[tokio::test]
+    async fn test_create_label_list_index_on_large_list() {
+        let tmp_dir = tempdir().unwrap();
+        let uri = tmp_dir.path().to_str().unwrap();
+
+        let conn = ConnectBuilder::new(uri).execute().await.unwrap();
+
+        let schema = Arc::new(Schema::new(vec![Field::new(
+            "tags",
+            DataType::LargeList(Field::new("item", DataType::Utf8, true).into()),
+            true,
+        )]));
+
+        const TAGS: [&str; 3] = ["cat", "dog", "fish"];
+
+        let values_builder = StringBuilder::new();
+        let mut builder = LargeListBuilder::new(values_builder);
+        for i in 0..120 {
+            builder.values().append_value(TAGS[i % 3]);
+            if i % 3 == 0 {
+                builder.append(true);
+            }
+        }
+        let tags = Arc::new(builder.finish());
+
+        let batch = RecordBatch::try_new(schema, vec![tags]).unwrap();
+
+        let table = conn
+            .create_table("test_large_list_label_list", batch)
+            .execute()
+            .await
+            .unwrap();
+
+        table
+            .create_index(&["tags"], Index::LabelList(Default::default()))
+            .execute()
+            .await
+            .unwrap();
+
+        let index_configs = table.list_indices().await.unwrap();
+        assert_eq!(index_configs.len(), 1);
+        let index = index_configs.into_iter().next().unwrap();
+        assert_eq!(index.index_type, crate::index::IndexType::LabelList);
+        assert_eq!(index.columns, vec!["tags".to_string()]);
+    }
+
    #[tokio::test]
    async fn test_create_inverted_index() {
        let tmp_dir = tempdir().unwrap();
@@ -4449,10 +5180,10 @@ mod tests {
            Some(&"test_val2_update".to_string())
        );

-        let mut new_field_metadata = HashMap::<String, String>::new();
-        new_field_metadata.insert("test_field_key1".into(), "test_field_val1".into());
        native_tbl
-            .replace_field_metadata(vec![(field.id as u32, new_field_metadata)])
+            .update_field_metadata(&[
+                FieldMetadataUpdate::new("i").set("test_field_key1", "test_field_val1")
+            ])
            .await
            .unwrap();

--- a/rust/lancedb/src/table/dataset.rs
+++ b/rust/lancedb/src/table/dataset.rs
@@ -76,6 +76,23 @@ impl DatasetConsistencyWrapper {
        }
    }

+    /// Create a new wrapper pinned to the dataset's current version.
+    ///
+    /// `dataset` must already be checked out at the desired version; this pins
+    /// to `dataset.version()` without re-resolving. The wrapper is read-only
+    /// (time-travel) until [`as_latest`](Self::as_latest) re-attaches it to the
+    /// latest version.
+    pub fn new_time_travel(dataset: Dataset, read_consistency_interval: Option<Duration>) -> Self {
+        let version = dataset.version().version;
+        let wrapper = Self::new_latest(dataset, read_consistency_interval);
+        wrapper
+            .state
+            .lock()
+            .unwrap_or_else(|e| e.into_inner())
+            .pinned_version = Some(version);
+        wrapper
+    }
+
    /// The MemWAL `ShardWriter` cache co-located with this dataset.
    pub(crate) fn shard_writer(&self) -> &Arc<ShardWriterCache> {
        &self.shard_writer
@@ -144,8 +161,19 @@ impl DatasetConsistencyWrapper {
    }

    /// Checkout a branch and track its HEAD for new versions.
-    pub async fn as_branch(&self, _branch: impl Into<String>) -> Result<()> {
-        todo!("Branch support not yet implemented")
+    pub async fn as_branch(&self, branch: impl Into<String>) -> Result<()> {
+        let branch = branch.into();
+        let dataset = { self.state.lock()?.dataset.clone() };
+        let new_dataset = dataset.checkout_branch(&branch).await?;
+
+        let mut state = self.state.lock()?;
+        state.dataset = Arc::new(new_dataset);
+        state.pinned_version = None;
+        drop(state);
+        if let ConsistencyMode::Eventual(bg_cache) = &self.consistency {
+            bg_cache.invalidate();
+        }
+        Ok(())
    }

    /// Check that the dataset is in a mutable mode (Latest).
@@ -161,6 +189,17 @@ impl DatasetConsistencyWrapper {
        }
    }

+    /// The branch this wrapper is currently tracking, or `None` for `main`.
+    pub fn current_branch(&self) -> Option<String> {
+        self.state
+            .lock()
+            .unwrap_or_else(|e| e.into_inner())
+            .dataset
+            .manifest()
+            .branch
+            .clone()
+    }
+
    /// Returns the version, if in time travel mode, or None otherwise.
    pub fn time_travel_version(&self) -> Option<u64> {
        self.state
@@ -737,4 +776,31 @@ mod tests {
        let result = wrapper.reload().await;
        assert!(result.is_err());
    }
+
+    #[tokio::test]
+    async fn test_as_branch_is_writable_and_tracked() {
+        let dir = tempfile::tempdir().unwrap();
+        let uri = dir.path().to_str().unwrap();
+
+        // v1 on main, then shallow-clone a branch off it
+        let mut ds = create_test_dataset(uri).await;
+        let v1 = ds.version().version;
+        ds.create_branch("exp", v1, None).await.unwrap();
+
+        // wrapper starts on main: latest, writable, no branch
+        let wrapper = DatasetConsistencyWrapper::new_latest(ds, None);
+        assert_eq!(wrapper.current_branch(), None);
+
+        // switch to the branch
+        wrapper.as_branch("exp").await.unwrap();
+        assert_eq!(wrapper.current_branch().as_deref(), Some("exp"));
+
+        // a branch is writable (unlike a pinned/time-travel checkout)
+        wrapper.ensure_mutable().unwrap();
+        assert_eq!(wrapper.time_travel_version(), None);
+
+        // get() returns the branch dataset
+        let on_branch = wrapper.get().await.unwrap();
+        assert_eq!(on_branch.manifest().branch.as_deref(), Some("exp"));
+    }
 }
--- a/rust/lancedb/src/table/merge.rs
+++ b/rust/lancedb/src/table/merge.rs
@@ -53,6 +53,12 @@ pub struct MergeResult {
    pub num_rows: u64,
 }

+#[derive(Debug, Clone)]
+pub enum MergeFilter {
+    Sql(String),
+    Expr(datafusion_expr::Expr),
+}
+
 /// A builder used to create and run a merge insert operation
 ///
 /// See [`super::Table::merge_insert`] for more context
@@ -61,10 +67,10 @@ pub struct MergeInsertBuilder {
    table: Arc<dyn BaseTable>,
    pub(crate) on: Vec<String>,
    pub(crate) when_matched_update_all: bool,
-    pub(crate) when_matched_update_all_filt: Option<String>,
+    pub(crate) when_matched_update_all_filt: Option<MergeFilter>,
    pub(crate) when_not_matched_insert_all: bool,
    pub(crate) when_not_matched_by_source_delete: bool,
-    pub(crate) when_not_matched_by_source_delete_filt: Option<String>,
+    pub(crate) when_not_matched_by_source_delete_filt: Option<MergeFilter>,
    pub(crate) timeout: Option<Duration>,
    pub(crate) use_index: bool,
    pub(crate) use_lsm_write: Option<bool>,
@@ -110,7 +116,14 @@ impl MergeInsertBuilder {
    /// For example, "target.last_update < source.last_update"
    pub fn when_matched_update_all(&mut self, condition: Option<String>) -> &mut Self {
        self.when_matched_update_all = true;
-        self.when_matched_update_all_filt = condition;
+        self.when_matched_update_all_filt = condition.map(MergeFilter::Sql);
+        self
+    }
+
+    /// Similar to [`Self::when_matched_update_all`] but accepts a DataFusion logical expression directly.
+    pub fn when_matched_update_all_expr(&mut self, condition: datafusion_expr::Expr) -> &mut Self {
+        self.when_matched_update_all = true;
+        self.when_matched_update_all_filt = Some(MergeFilter::Expr(condition));
        self
    }

@@ -132,7 +145,17 @@ impl MergeInsertBuilder {
    ///   limit what rows are deleted.
    pub fn when_not_matched_by_source_delete(&mut self, filter: Option<String>) -> &mut Self {
        self.when_not_matched_by_source_delete = true;
-        self.when_not_matched_by_source_delete_filt = filter;
+        self.when_not_matched_by_source_delete_filt = filter.map(MergeFilter::Sql);
+        self
+    }
+
+    /// Similar to [`Self::when_not_matched_by_source_delete`] but accepts a DataFusion logical expression directly.
+    pub fn when_not_matched_by_source_delete_expr(
+        &mut self,
+        filter: datafusion_expr::Expr,
+    ) -> &mut Self {
+        self.when_not_matched_by_source_delete = true;
+        self.when_not_matched_by_source_delete_filt = Some(MergeFilter::Expr(filter));
        self
    }

@@ -234,7 +257,12 @@ pub(crate) async fn execute_merge_insert(
    ) {
        (false, _) => builder.when_matched(WhenMatched::DoNothing),
        (true, None) => builder.when_matched(WhenMatched::UpdateAll),
-        (true, Some(filt)) => builder.when_matched(WhenMatched::update_if(&dataset, &filt)?),
+        (true, Some(MergeFilter::Sql(filt))) => {
+            builder.when_matched(WhenMatched::update_if(&dataset, &filt)?)
+        }
+        (true, Some(MergeFilter::Expr(expr))) => {
+            builder.when_matched(WhenMatched::update_if_expr(expr))
+        }
    };
    if params.when_not_matched_insert_all {
        builder.when_not_matched(lance::dataset::WhenNotMatched::InsertAll);
@@ -242,10 +270,12 @@ pub(crate) async fn execute_merge_insert(
        builder.when_not_matched(lance::dataset::WhenNotMatched::DoNothing);
    }
    if params.when_not_matched_by_source_delete {
-        let behavior = if let Some(filter) = params.when_not_matched_by_source_delete_filt {
-            WhenNotMatchedBySource::delete_if(dataset.as_ref(), &filter)?
-        } else {
-            WhenNotMatchedBySource::Delete
+        let behavior = match params.when_not_matched_by_source_delete_filt {
+            Some(MergeFilter::Sql(filter)) => {
+                WhenNotMatchedBySource::delete_if(dataset.as_ref(), &filter)?
+            }
+            Some(MergeFilter::Expr(expr)) => WhenNotMatchedBySource::DeleteIf(expr),
+            None => WhenNotMatchedBySource::Delete,
        };
        builder.when_not_matched_by_source(behavior);
    } else {
@@ -386,6 +416,45 @@ mod tests {
        merge_insert_builder.execute(new_batches).await.unwrap();
        assert_eq!(table.count_rows(None).await.unwrap(), 25);
    }
+
+    #[tokio::test]
+    async fn test_merge_insert_expr() {
+        use datafusion_expr::{col, lit};
+
+        let conn = connect("memory://").execute().await.unwrap();
+
+        // Create a dataset with i=0..10
+        let batches = merge_insert_test_batches(0, 0);
+        let table = conn
+            .create_table("my_table_expr", batches)
+            .execute()
+            .await
+            .unwrap();
+        assert_eq!(table.count_rows(None).await.unwrap(), 10);
+
+        // Conditional update that only replaces the age=0 data
+        let new_batches = merge_insert_test_batches(5, 3);
+        let mut merge_insert_builder = table.merge_insert(&["i"]);
+        // use expression: target.age = 0
+        let expr = col("target.age").eq(lit(0));
+        merge_insert_builder.when_matched_update_all_expr(expr);
+        merge_insert_builder.execute(new_batches).await.unwrap();
+        assert_eq!(
+            table.count_rows(Some("age = 3".to_string())).await.unwrap(),
+            5
+        );
+
+        // Delete with expression
+        // Create new batches with i=10..20 (so target rows i=0..10 are not matched by source)
+        let new_batches = merge_insert_test_batches(10, 0); // won't insert or update since we don't enable matched/unmatched actions
+        let mut merge_insert_builder = table.merge_insert(&["i"]);
+        // delete if target.age = 3
+        let delete_expr = col("target.age").eq(lit(3));
+        merge_insert_builder.when_not_matched_by_source_delete_expr(delete_expr);
+        let result = merge_insert_builder.execute(new_batches).await.unwrap();
+        assert_eq!(result.num_deleted_rows, 5);
+        assert_eq!(table.count_rows(None).await.unwrap(), 5);
+    }
 }

 #[cfg(test)]
--- a/rust/lancedb/src/table/query.rs
+++ b/rust/lancedb/src/table/query.rs
@@ -41,11 +41,14 @@ pub async fn execute_query(
    query: &AnyQuery,
    options: QueryExecutionOptions,
 ) -> Result<DatasetRecordBatchStream> {
-    // If QueryTable pushdown is enabled and namespace client is configured, use server-side query execution
+    // QueryTable pushdown runs the query server-side, but only on the main
+    // branch: the namespace request carries no branch yet, so a branch handle
+    // must fall through to local execution.
    if table
        .pushdown_operations
        .contains(&NamespaceClientPushdownOperation::QueryTable)
        && let Some(ref namespace_client) = table.namespace_client
+        && table.dataset.current_branch().is_none()
    {
        return execute_namespace_query(table, namespace_client.clone(), query, options).await;
    }
--- a/rust/lancedb/src/table/schema_evolution.rs
+++ b/rust/lancedb/src/table/schema_evolution.rs
@@ -10,6 +10,7 @@

 use lance::dataset::{ColumnAlteration, NewColumnTransform};
 use serde::{Deserialize, Serialize};
+use std::collections::HashMap;

 use super::NativeTable;
 use crate::Result;
@@ -44,6 +45,52 @@ pub struct DropColumnsResult {
    pub version: u64,
 }

+/// A single field's metadata update, addressed by dot-path.
+///
+/// Merges into the field's existing metadata by default. Use [`Self::remove`] to
+/// delete a key, or [`Self::replace`] to swap the field's entire metadata map.
+#[derive(Debug, Clone, PartialEq, Eq, Default, Serialize)]
+pub struct FieldMetadataUpdate {
+    /// Dot-separated path to the field (e.g. `"embedding"` or `"address.zip"`).
+    pub path: String,
+    /// Keys to set (`Some`) or delete (`None`).
+    pub metadata: HashMap<String, Option<String>>,
+    /// If `true`, replace the field's entire metadata map instead of merging.
+    pub replace: bool,
+}
+
+impl FieldMetadataUpdate {
+    pub fn new(path: impl Into<String>) -> Self {
+        Self {
+            path: path.into(),
+            metadata: HashMap::new(),
+            replace: false,
+        }
+    }
+
+    pub fn set(mut self, key: impl Into<String>, value: impl Into<String>) -> Self {
+        self.metadata.insert(key.into(), Some(value.into()));
+        self
+    }
+
+    pub fn remove(mut self, key: impl Into<String>) -> Self {
+        self.metadata.insert(key.into(), None);
+        self
+    }
+
+    pub fn replace(mut self) -> Self {
+        self.replace = true;
+        self
+    }
+}
+
+#[derive(Debug, Clone, PartialEq, Eq, Serialize, Deserialize, Default)]
+pub struct UpdateFieldMetadataResult {
+    /// The commit version associated with the operation.
+    #[serde(default)]
+    pub version: u64,
+}
+
 /// Internal implementation of the add columns logic.
 ///
 /// Adds new columns to the table using the provided transforms.
@@ -90,6 +137,32 @@ pub(crate) async fn execute_drop_columns(
    Ok(DropColumnsResult { version })
 }

+/// Internal implementation of the update field metadata logic.
+///
+/// Merges or replaces per-field metadata, addressing fields by dot-path.
+pub(crate) async fn execute_update_field_metadata(
+    table: &NativeTable,
+    updates: &[FieldMetadataUpdate],
+) -> Result<UpdateFieldMetadataResult> {
+    table.dataset.ensure_mutable()?;
+    let mut dataset = (*table.dataset.get().await?).clone();
+
+    let mut builder = dataset.update_field_metadata();
+    for update in updates {
+        let entries = update.metadata.iter().map(|(k, v)| (k.clone(), v.clone()));
+        builder = if update.replace {
+            builder.replace(&update.path, entries)?
+        } else {
+            builder.update(&update.path, entries)?
+        };
+    }
+    builder.await?;
+
+    let version = dataset.version().version;
+    table.dataset.update(dataset);
+    Ok(UpdateFieldMetadataResult { version })
+}
+
 #[cfg(test)]
 mod tests {
    use arrow_array::{Int32Array, StringArray, record_batch};
@@ -97,6 +170,7 @@ mod tests {
    use futures::TryStreamExt;
    use lance::dataset::ColumnAlteration;

+    use super::FieldMetadataUpdate;
    use crate::connect;
    use crate::query::{ExecutableQuery, QueryBase, Select};
    use crate::table::NewColumnTransform;
@@ -610,4 +684,46 @@ mod tests {
        let v4 = table.version().await.unwrap();
        assert_eq!(drop_result.version, v4);
    }
+
+    #[tokio::test]
+    async fn test_update_field_metadata() {
+        let conn = connect("memory://").execute().await.unwrap();
+        let batch = record_batch!(
+            ("id", Int32, [1, 2, 3]),
+            ("category", Utf8, ["A", "B", "C"])
+        )
+        .unwrap();
+        let table = conn
+            .create_table("test_update_field_metadata", batch)
+            .execute()
+            .await
+            .unwrap();
+
+        // Set metadata on a field.
+        table
+            .update_field_metadata(&[FieldMetadataUpdate::new("category")
+                .set("unit", "label")
+                .set("pii", "false")])
+            .await
+            .unwrap();
+        let schema = table.schema().await.unwrap();
+        let field = schema.field_with_name("category").unwrap();
+        assert_eq!(
+            field.metadata().get("unit").map(String::as_str),
+            Some("label")
+        );
+
+        // Merge: add a key, delete one, keep the rest.
+        table
+            .update_field_metadata(&[FieldMetadataUpdate::new("category")
+                .set("source", "import")
+                .remove("pii")])
+            .await
+            .unwrap();
+        let schema = table.schema().await.unwrap();
+        let md = schema.field_with_name("category").unwrap().metadata();
+        assert_eq!(md.get("unit").map(String::as_str), Some("label")); // preserved
+        assert_eq!(md.get("source").map(String::as_str), Some("import")); // added
+        assert!(!md.contains_key("pii")); // deleted
+    }
 }
--- a/rust/lancedb/src/utils/mod.rs
+++ b/rust/lancedb/src/utils/mod.rs
@@ -257,7 +257,9 @@ pub fn supported_bitmap_data_type(dtype: &DataType) -> bool {

 pub fn supported_label_list_data_type(dtype: &DataType) -> bool {
    match dtype {
-        DataType::List(field) => supported_bitmap_data_type(field.data_type()),
+        DataType::List(field) | DataType::LargeList(field) => {
+            supported_bitmap_data_type(field.data_type())
+        }
        DataType::FixedSizeList(field, _) => supported_bitmap_data_type(field.data_type()),
        _ => false,
    }
@@ -277,6 +279,15 @@ fn supported_fts_data_type_impl(dtype: &DataType, in_list: bool) -> bool {
    }
 }

+/// FM-Index accelerates substring (`contains`) search over raw bytes, so it
+/// applies to string and binary columns.
+pub fn supported_fm_data_type(dtype: &DataType) -> bool {
+    matches!(
+        dtype,
+        DataType::Utf8 | DataType::LargeUtf8 | DataType::Binary | DataType::LargeBinary
+    )
+}
+
 pub fn supported_vector_data_type(dtype: &DataType) -> bool {
    match dtype {
        DataType::FixedSizeList(field, _) => {