chore: point lance dependency at jack branch

fix(udf): JobHandle.wait() terminates on failed jobs

wait() (sync + async) only stopped on finished/stale/committed, so a job
the server already reported as state=failed was polled until the (default
3600s) timeout, then raised a misleading TimeoutError instead of the real
cause. A doomed backfill -- e.g. a multi-column REFRESH COLUMN of a scalar
UDF -- hung the client even though get_job surfaced the failure within ~3s.

Add a terminal failed branch that raises JobFailedError carrying the server
error, exported from the package.

Verified end-to-end against the cluster: raises in 3.6s instead of hanging.
Unit-tested with a mock conn (sync+async, failure + success + committed
paths).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
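
For reviewers, the behavior change described above can be sketched roughly as follows. This is a minimal illustration, not the code in this PR: `wait_for_job`, its parameters, the `get_job` callable, and the dict-shaped status with "state" / "error" keys are all assumptions made for the example, and the `JobFailedError` class here is a local stand-in for the one exported from the package.

```python
import time


class JobFailedError(RuntimeError):
    """Local stand-in for the exception added by this commit (shape assumed)."""

    def __init__(self, job_id, error):
        super().__init__(f"job {job_id} failed: {error}")
        self.job_id = job_id
        self.error = error


# States the commit message says wait() already stopped on.
DONE_STATES = {"finished", "stale", "committed"}


def wait_for_job(get_job, job_id, timeout=3600.0, poll_interval=1.0):
    """Poll get_job(job_id) until the job reaches a terminal state.

    get_job is a hypothetical callable returning a dict with a "state" key
    (and "error" on failure), or None if the job is not visible yet.
    """
    deadline = time.monotonic() + timeout
    while True:
        status = get_job(job_id)
        state = status.get("state") if status else None
        if state in DONE_STATES:
            return status
        if state == "failed":
            # The new terminal branch: surface the server error right away
            # instead of polling until the timeout expires.
            raise JobFailedError(job_id, status.get("error"))
        if time.monotonic() >= deadline:
            raise TimeoutError(f"job {job_id} did not finish within {timeout}s")
        time.sleep(poll_interval)
```

Without the `failed` branch, a failed job falls through to the sleep on every poll and only surfaces as a TimeoutError once the deadline passes, which is the hang described in the message.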
2026-06-30 17:40:40 +00:00 · 2026-06-29 22:46:07 -07:00 · 2026-06-29 22:35:35 -07:00 · 2026-06-29 22:35:35 -07:00 · 2026-06-29 22:35:35 -07:00 · 2026-06-29 22:35:34 -07:00
23 changed files with 3834 additions and 355 deletions
--- a/.agents/skills/lancedb-branch-ops/SKILL.md
+++ b/.agents/skills/lancedb-branch-ops/SKILL.md
@@ -1,137 +0,0 @@
----
-name: lancedb-branch-ops
-description: Branch management for LanceDB tables via the REST API. Use this skill whenever someone wants to create, delete, list, or switch branches on a LanceDB table — or needs to make sure a write (metadata update, index build, etc.) lands on a specific branch instead of main. Invoke it even without the word "branch" if context makes clear they want an experimental copy of a table, want to isolate changes, or want to confirm a mutation didn't touch main. Covers: branches/list, branches/create, branches/delete, and passing "branch" in describe/update_field_metadata/create_index to target a non-main version.
----
-
-## Goal
-
-Manage branches on a LanceDB table: list what exists, create new ones, delete stale ones, and direct read/write operations at a specific branch without touching main.
-
-## Step 0: Establish the connection
-
-Use the `lancedb-connect` skill to resolve the base URL and auth headers (`x-api-key`, `x-lancedb-database`). Skip this only if the connection is already known from the current conversation.
-
-All examples below use `{base_url}` — substitute the resolved endpoint and include the auth headers on every request.
-
-## The branch model (important)
-
-LanceDB branches are named snapshots that diverge from the table's current state at creation time. There is **no checkout command** — you never switch the whole table to a branch. Instead, you **pass `"branch": "<name>"` in the request body** of any operation to target that branch. Omitting the key (or sending an empty body) always targets main.
-
-`branches/list` returns only non-main branches. Main always exists and is not listed.
-
-## List branches
-
-```http
-POST {base_url}/v1/table/{table_id}/branches/list
-Content-Type: application/json
-
-{}
-```
-
-Response:
-```json
-{
-  "branches": {
-    "experiment-reindex": {"parentVersion": 1, "createAt": 1782506085, "manifestSize": 1029}
-  }
-}
-```
-
-If `branches` is `{}`, the table has no branches besides main.
-
-## Create a branch
-
-```http
-POST {base_url}/v1/table/{table_id}/branches/create
-Content-Type: application/json
-
-{"name": "experiment-reindex"}
-```
-
-HTTP 200 with `{}` body = success. The branch is created off the table's current state on main.
-
-Verify by calling `branches/list` and confirming the new name appears.
-
-## Delete a branch
-
-```http
-POST {base_url}/v1/table/{table_id}/branches/delete
-Content-Type: application/json
-
-{"name": "stale-2024"}
-```
-
-HTTP 200 with `{}` body = success. Only the branch pointer is removed — main and all row data remain intact.
-
-Verify by calling `branches/list` (name gone) and `describe` with no branch param (main still responds).
-
-## Operate on a specific branch
-
-Pass `"branch": "<name>"` in the body of any operation to scope it to that branch:
-
-**Read schema on a branch:**
-```http
-POST {base_url}/v1/table/{table_id}/describe
-Content-Type: application/json
-
-{"branch": "wip-branch"}
-```
-
-**Write metadata to a branch (not main):**
-```http
-POST {base_url}/v1/table/{table_id}/update_field_metadata
-Content-Type: application/json
-
-{
-  "branch": "wip-branch",
-  "updates": [
-    {
-      "path": "category",
-      "metadata": {"lancedb:description": "Product category label."},
-      "replace": false
-    }
-  ]
-}
-```
-
-**Build an index on a branch:**
-```http
-POST {base_url}/v1/table/{table_id}/create_index
-Content-Type: application/json
-
-{
-  "branch": "wip-branch",
-  "column": "category",
-  "index_type": "BTREE"
-}
-```
-
-## Verifying isolation
-
-After writing to a branch, always confirm the change did NOT land on main:
-
-```bash
-# Should show the new metadata
-curl -s -X POST {base_url}/v1/table/{table_id}/describe \
-  -H "x-api-key: <key>" -H "x-lancedb-database: <db>" \
-  -H "content-type: application/json" \
-  -d '{"branch": "wip-branch"}'
-
-# Should NOT show the new metadata
-curl -s -X POST {base_url}/v1/table/{table_id}/describe \
-  -H "x-api-key: <key>" -H "x-lancedb-database: <db>" \
-  -H "content-type: application/json" \
-  -d '{}'
-```
-
-## Quick reference
-
-| Goal | Endpoint | Body |
-|------|----------|------|
-| List all branches | `branches/list` | `{}` |
-| Create a branch | `branches/create` | `{"name": "..."}` |
-| Delete a branch | `branches/delete` | `{"name": "..."}` |
-| Read schema on branch | `describe` | `{"branch": "..."}` |
-| Write metadata on branch | `update_field_metadata` | `{"branch": "...", "updates": [...]}` |
-| Build index on branch | `create_index` | `{"branch": "...", "column": ..., "index_type": ...}` |
-| Target main (default) | any endpoint | omit `"branch"` key |
--- a/Cargo.lock
+++ b/Cargo.lock
@@ -1750,7 +1750,7 @@ version = "3.1.1"
 source = "registry+https://github.com/rust-lang/crates.io-index"
 checksum = "faf9468729b8cbcea668e36183cb69d317348c2e08e994829fb56ebfdfbaac34"
 dependencies = [
- "windows-sys 0.61.2",
+ "windows-sys 0.59.0",
 ]

 [[package]]
@@ -3014,7 +3014,7 @@ dependencies = [
 "libc",
 "option-ext",
 "redox_users",
- "windows-sys 0.61.2",
+ "windows-sys 0.59.0",
 ]

 [[package]]
@@ -3231,7 +3231,7 @@ source = "registry+https://github.com/rust-lang/crates.io-index"
 checksum = "39cab71617ae0d63f51a36d69f866391735b51691dbda63cf6f96d042b63efeb"
 dependencies = [
 "libc",
- "windows-sys 0.61.2",
+ "windows-sys 0.59.0",
 ]

 [[package]]
@@ -3424,7 +3424,7 @@ checksum = "42703706b716c37f96a77aea830392ad231f44c9e9a67872fa5548707e11b11c"
 [[package]]
 name = "fsst"
 version = "9.0.0-beta.10"
-source = "git+https://github.com/lance-format/lance.git?tag=v9.0.0-beta.10#e25b71e74b89d10c57b412d111bde087117383f3"
+source = "git+https://github.com/jackye1995/lance.git?branch=jack%2Fsophon-pr-6325#1c5b5061c60934b4c18dbe86c5e91b4961105989"
 dependencies = [
 "arrow-array",
 "rand 0.9.4",
@@ -4466,7 +4466,7 @@ checksum = "3640c1c38b8e4e43584d8df18be5fc6b0aa314ce6ebf51b53313d4306cca8e46"
 dependencies = [
 "hermit-abi",
 "libc",
- "windows-sys 0.61.2",
+ "windows-sys 0.59.0",
 ]

 [[package]]
@@ -4560,7 +4560,7 @@ dependencies = [
 "portable-atomic-util",
 "serde_core",
 "wasm-bindgen",
- "windows-sys 0.61.2",
+ "windows-sys 0.59.0",
 ]

 [[package]]
@@ -4727,7 +4727,7 @@ checksum = "e037a2e1d8d5fdbd49b16a4ea09d5d6401c1f29eca5ff29d03d3824dba16256a"
 [[package]]
 name = "lance"
 version = "9.0.0-beta.10"
-source = "git+https://github.com/lance-format/lance.git?tag=v9.0.0-beta.10#e25b71e74b89d10c57b412d111bde087117383f3"
+source = "git+https://github.com/jackye1995/lance.git?branch=jack%2Fsophon-pr-6325#1c5b5061c60934b4c18dbe86c5e91b4961105989"
 dependencies = [
 "arc-swap",
 "arrow",
@@ -4802,7 +4802,7 @@ dependencies = [
 [[package]]
 name = "lance-arrow"
 version = "9.0.0-beta.10"
-source = "git+https://github.com/lance-format/lance.git?tag=v9.0.0-beta.10#e25b71e74b89d10c57b412d111bde087117383f3"
+source = "git+https://github.com/jackye1995/lance.git?branch=jack%2Fsophon-pr-6325#1c5b5061c60934b4c18dbe86c5e91b4961105989"
 dependencies = [
 "arrow-array",
 "arrow-buffer",
@@ -4823,7 +4823,7 @@ dependencies = [
 [[package]]
 name = "lance-arrow-scalar"
 version = "58.0.0"
-source = "git+https://github.com/lance-format/lance.git?tag=v9.0.0-beta.10#e25b71e74b89d10c57b412d111bde087117383f3"
+source = "git+https://github.com/jackye1995/lance.git?branch=jack%2Fsophon-pr-6325#1c5b5061c60934b4c18dbe86c5e91b4961105989"
 dependencies = [
 "arrow-array",
 "arrow-buffer",
@@ -4837,7 +4837,7 @@ dependencies = [
 [[package]]
 name = "lance-arrow-stats"
 version = "58.0.0"
-source = "git+https://github.com/lance-format/lance.git?tag=v9.0.0-beta.10#e25b71e74b89d10c57b412d111bde087117383f3"
+source = "git+https://github.com/jackye1995/lance.git?branch=jack%2Fsophon-pr-6325#1c5b5061c60934b4c18dbe86c5e91b4961105989"
 dependencies = [
 "arrow-array",
 "arrow-schema",
@@ -4847,7 +4847,7 @@ dependencies = [
 [[package]]
 name = "lance-bitpacking"
 version = "9.0.0-beta.10"
-source = "git+https://github.com/lance-format/lance.git?tag=v9.0.0-beta.10#e25b71e74b89d10c57b412d111bde087117383f3"
+source = "git+https://github.com/jackye1995/lance.git?branch=jack%2Fsophon-pr-6325#1c5b5061c60934b4c18dbe86c5e91b4961105989"
 dependencies = [
 "arrayref",
 "crunchy",
@@ -4858,7 +4858,7 @@ dependencies = [
 [[package]]
 name = "lance-core"
 version = "9.0.0-beta.10"
-source = "git+https://github.com/lance-format/lance.git?tag=v9.0.0-beta.10#e25b71e74b89d10c57b412d111bde087117383f3"
+source = "git+https://github.com/jackye1995/lance.git?branch=jack%2Fsophon-pr-6325#1c5b5061c60934b4c18dbe86c5e91b4961105989"
 dependencies = [
 "arrow-array",
 "arrow-buffer",
@@ -4897,7 +4897,7 @@ dependencies = [
 [[package]]
 name = "lance-datafusion"
 version = "9.0.0-beta.10"
-source = "git+https://github.com/lance-format/lance.git?tag=v9.0.0-beta.10#e25b71e74b89d10c57b412d111bde087117383f3"
+source = "git+https://github.com/jackye1995/lance.git?branch=jack%2Fsophon-pr-6325#1c5b5061c60934b4c18dbe86c5e91b4961105989"
 dependencies = [
 "arrow",
 "arrow-array",
@@ -4928,7 +4928,7 @@ dependencies = [
 [[package]]
 name = "lance-datagen"
 version = "9.0.0-beta.10"
-source = "git+https://github.com/lance-format/lance.git?tag=v9.0.0-beta.10#e25b71e74b89d10c57b412d111bde087117383f3"
+source = "git+https://github.com/jackye1995/lance.git?branch=jack%2Fsophon-pr-6325#1c5b5061c60934b4c18dbe86c5e91b4961105989"
 dependencies = [
 "arrow",
 "arrow-array",
@@ -4946,7 +4946,7 @@ dependencies = [
 [[package]]
 name = "lance-derive"
 version = "9.0.0-beta.10"
-source = "git+https://github.com/lance-format/lance.git?tag=v9.0.0-beta.10#e25b71e74b89d10c57b412d111bde087117383f3"
+source = "git+https://github.com/jackye1995/lance.git?branch=jack%2Fsophon-pr-6325#1c5b5061c60934b4c18dbe86c5e91b4961105989"
 dependencies = [
 "proc-macro2",
 "quote",
@@ -4956,7 +4956,7 @@ dependencies = [
 [[package]]
 name = "lance-encoding"
 version = "9.0.0-beta.10"
-source = "git+https://github.com/lance-format/lance.git?tag=v9.0.0-beta.10#e25b71e74b89d10c57b412d111bde087117383f3"
+source = "git+https://github.com/jackye1995/lance.git?branch=jack%2Fsophon-pr-6325#1c5b5061c60934b4c18dbe86c5e91b4961105989"
 dependencies = [
 "arrow-arith",
 "arrow-array",
@@ -4992,7 +4992,7 @@ dependencies = [
 [[package]]
 name = "lance-file"
 version = "9.0.0-beta.10"
-source = "git+https://github.com/lance-format/lance.git?tag=v9.0.0-beta.10#e25b71e74b89d10c57b412d111bde087117383f3"
+source = "git+https://github.com/jackye1995/lance.git?branch=jack%2Fsophon-pr-6325#1c5b5061c60934b4c18dbe86c5e91b4961105989"
 dependencies = [
 "arrow-arith",
 "arrow-array",
@@ -5023,7 +5023,7 @@ dependencies = [
 [[package]]
 name = "lance-index"
 version = "9.0.0-beta.10"
-source = "git+https://github.com/lance-format/lance.git?tag=v9.0.0-beta.10#e25b71e74b89d10c57b412d111bde087117383f3"
+source = "git+https://github.com/jackye1995/lance.git?branch=jack%2Fsophon-pr-6325#1c5b5061c60934b4c18dbe86c5e91b4961105989"
 dependencies = [
 "arc-swap",
 "arrow",
@@ -5089,7 +5089,7 @@ dependencies = [
 [[package]]
 name = "lance-io"
 version = "9.0.0-beta.10"
-source = "git+https://github.com/lance-format/lance.git?tag=v9.0.0-beta.10#e25b71e74b89d10c57b412d111bde087117383f3"
+source = "git+https://github.com/jackye1995/lance.git?branch=jack%2Fsophon-pr-6325#1c5b5061c60934b4c18dbe86c5e91b4961105989"
 dependencies = [
 "arrow",
 "arrow-arith",
@@ -5131,7 +5131,7 @@ dependencies = [
 [[package]]
 name = "lance-linalg"
 version = "9.0.0-beta.10"
-source = "git+https://github.com/lance-format/lance.git?tag=v9.0.0-beta.10#e25b71e74b89d10c57b412d111bde087117383f3"
+source = "git+https://github.com/jackye1995/lance.git?branch=jack%2Fsophon-pr-6325#1c5b5061c60934b4c18dbe86c5e91b4961105989"
 dependencies = [
 "arrow-array",
 "arrow-buffer",
@@ -5148,7 +5148,7 @@ dependencies = [
 [[package]]
 name = "lance-namespace"
 version = "9.0.0-beta.10"
-source = "git+https://github.com/lance-format/lance.git?tag=v9.0.0-beta.10#e25b71e74b89d10c57b412d111bde087117383f3"
+source = "git+https://github.com/jackye1995/lance.git?branch=jack%2Fsophon-pr-6325#1c5b5061c60934b4c18dbe86c5e91b4961105989"
 dependencies = [
 "arrow",
 "async-trait",
@@ -5161,7 +5161,7 @@ dependencies = [
 [[package]]
 name = "lance-namespace-impls"
 version = "9.0.0-beta.10"
-source = "git+https://github.com/lance-format/lance.git?tag=v9.0.0-beta.10#e25b71e74b89d10c57b412d111bde087117383f3"
+source = "git+https://github.com/jackye1995/lance.git?branch=jack%2Fsophon-pr-6325#1c5b5061c60934b4c18dbe86c5e91b4961105989"
 dependencies = [
 "arrow",
 "arrow-ipc",
@@ -5216,7 +5216,7 @@ dependencies = [
 [[package]]
 name = "lance-select"
 version = "9.0.0-beta.10"
-source = "git+https://github.com/lance-format/lance.git?tag=v9.0.0-beta.10#e25b71e74b89d10c57b412d111bde087117383f3"
+source = "git+https://github.com/jackye1995/lance.git?branch=jack%2Fsophon-pr-6325#1c5b5061c60934b4c18dbe86c5e91b4961105989"
 dependencies = [
 "arrow-array",
 "arrow-buffer",
@@ -5232,7 +5232,7 @@ dependencies = [
 [[package]]
 name = "lance-table"
 version = "9.0.0-beta.10"
-source = "git+https://github.com/lance-format/lance.git?tag=v9.0.0-beta.10#e25b71e74b89d10c57b412d111bde087117383f3"
+source = "git+https://github.com/jackye1995/lance.git?branch=jack%2Fsophon-pr-6325#1c5b5061c60934b4c18dbe86c5e91b4961105989"
 dependencies = [
 "arrow",
 "arrow-array",
@@ -5272,7 +5272,7 @@ dependencies = [
 [[package]]
 name = "lance-testing"
 version = "9.0.0-beta.10"
-source = "git+https://github.com/lance-format/lance.git?tag=v9.0.0-beta.10#e25b71e74b89d10c57b412d111bde087117383f3"
+source = "git+https://github.com/jackye1995/lance.git?branch=jack%2Fsophon-pr-6325#1c5b5061c60934b4c18dbe86c5e91b4961105989"
 dependencies = [
 "arrow-array",
 "arrow-schema",
@@ -5286,7 +5286,7 @@ dependencies = [
 [[package]]
 name = "lance-tokenizer"
 version = "9.0.0-beta.10"
-source = "git+https://github.com/lance-format/lance.git?tag=v9.0.0-beta.10#e25b71e74b89d10c57b412d111bde087117383f3"
+source = "git+https://github.com/jackye1995/lance.git?branch=jack%2Fsophon-pr-6325#1c5b5061c60934b4c18dbe86c5e91b4961105989"
 dependencies = [
 "icu_segmenter",
 "jieba-rs",
@@ -6085,7 +6085,7 @@ version = "0.50.3"
 source = "registry+https://github.com/rust-lang/crates.io-index"
 checksum = "7957b9740744892f114936ab4a57b3f487491bbeafaf8083688b16841a4240e5"
 dependencies = [
- "windows-sys 0.61.2",
+ "windows-sys 0.59.0",
 ]

 [[package]]
@@ -7400,8 +7400,8 @@ version = "0.14.3"
 source = "registry+https://github.com/rust-lang/crates.io-index"
 checksum = "343d3bd7056eda839b03204e68deff7d1b13aba7af2b2fd16890697274262ee7"
 dependencies = [
- "heck 0.5.0",
- "itertools 0.14.0",
+ "heck 0.4.1",
+ "itertools 0.11.0",
 "log",
 "multimap",
 "petgraph",
@@ -7420,7 +7420,7 @@ source = "registry+https://github.com/rust-lang/crates.io-index"
 checksum = "27c6023962132f4b30eb4c172c91ce92d933da334c59c23cddee82358ddafb0b"
 dependencies = [
 "anyhow",
- "itertools 0.14.0",
+ "itertools 0.11.0",
 "proc-macro2",
 "quote",
 "syn 2.0.117",
@@ -7654,7 +7654,7 @@ dependencies = [
 "once_cell",
 "socket2 0.6.3",
 "tracing",
- "windows-sys 0.60.2",
+ "windows-sys 0.59.0",
 ]

 [[package]]
@@ -8394,7 +8394,7 @@ dependencies = [
 "errno",
 "libc",
 "linux-raw-sys",
- "windows-sys 0.61.2",
+ "windows-sys 0.59.0",
 ]

 [[package]]
@@ -8465,7 +8465,7 @@ dependencies = [
 "security-framework",
 "security-framework-sys",
 "webpki-root-certs",
- "windows-sys 0.61.2",
+ "windows-sys 0.59.0",
 ]

 [[package]]
@@ -9027,7 +9027,7 @@ version = "0.8.9"
 source = "registry+https://github.com/rust-lang/crates.io-index"
 checksum = "c1c97747dbf44bb1ca44a561ece23508e99cb592e862f22222dcf42f51d1e451"
 dependencies = [
- "heck 0.5.0",
+ "heck 0.4.1",
 "proc-macro2",
 "quote",
 "syn 2.0.117",
@@ -9039,7 +9039,7 @@ version = "0.9.0"
 source = "registry+https://github.com/rust-lang/crates.io-index"
 checksum = "54254b8531cafa275c5e096f62d48c81435d1015405a91198ddb11e967301d40"
 dependencies = [
- "heck 0.5.0",
+ "heck 0.4.1",
 "proc-macro2",
 "quote",
 "syn 2.0.117",
@@ -9472,7 +9472,7 @@ dependencies = [
 "getrandom 0.4.2",
 "once_cell",
 "rustix",
- "windows-sys 0.61.2",
+ "windows-sys 0.59.0",
 ]

 [[package]]
@@ -10407,7 +10407,7 @@ version = "0.1.11"
 source = "registry+https://github.com/rust-lang/crates.io-index"
 checksum = "c2a7b1c03c876122aa43f3020e6c3c3ee5c05081c9a00739faf7503aeba10d22"
 dependencies = [
- "windows-sys 0.61.2",
+ "windows-sys 0.59.0",
 ]

 [[package]]
--- a/Cargo.toml
+++ b/Cargo.toml
@@ -13,20 +13,20 @@ categories = ["database-implementations"]
 rust-version = "1.91.0"

 [workspace.dependencies]
-lance = { "version" = "=9.0.0-beta.10", default-features = false, "tag" = "v9.0.0-beta.10", "git" = "https://github.com/lance-format/lance.git" }
-lance-core = { "version" = "=9.0.0-beta.10", "tag" = "v9.0.0-beta.10", "git" = "https://github.com/lance-format/lance.git" }
-lance-datagen = { "version" = "=9.0.0-beta.10", "tag" = "v9.0.0-beta.10", "git" = "https://github.com/lance-format/lance.git" }
-lance-file = { "version" = "=9.0.0-beta.10", "tag" = "v9.0.0-beta.10", "git" = "https://github.com/lance-format/lance.git" }
-lance-io = { "version" = "=9.0.0-beta.10", default-features = false, "tag" = "v9.0.0-beta.10", "git" = "https://github.com/lance-format/lance.git" }
-lance-index = { "version" = "=9.0.0-beta.10", "tag" = "v9.0.0-beta.10", "git" = "https://github.com/lance-format/lance.git" }
-lance-linalg = { "version" = "=9.0.0-beta.10", "tag" = "v9.0.0-beta.10", "git" = "https://github.com/lance-format/lance.git" }
-lance-namespace = { "version" = "=9.0.0-beta.10", "tag" = "v9.0.0-beta.10", "git" = "https://github.com/lance-format/lance.git" }
-lance-namespace-impls = { "version" = "=9.0.0-beta.10", default-features = false, "tag" = "v9.0.0-beta.10", "git" = "https://github.com/lance-format/lance.git" }
-lance-table = { "version" = "=9.0.0-beta.10", "tag" = "v9.0.0-beta.10", "git" = "https://github.com/lance-format/lance.git" }
-lance-testing = { "version" = "=9.0.0-beta.10", "tag" = "v9.0.0-beta.10", "git" = "https://github.com/lance-format/lance.git" }
-lance-datafusion = { "version" = "=9.0.0-beta.10", "tag" = "v9.0.0-beta.10", "git" = "https://github.com/lance-format/lance.git" }
-lance-encoding = { "version" = "=9.0.0-beta.10", "tag" = "v9.0.0-beta.10", "git" = "https://github.com/lance-format/lance.git" }
-lance-arrow = { "version" = "=9.0.0-beta.10", "tag" = "v9.0.0-beta.10", "git" = "https://github.com/lance-format/lance.git" }
+lance = { "version" = "=9.0.0-beta.10", default-features = false, "branch" = "jack/sophon-pr-6325", "git" = "https://github.com/jackye1995/lance.git" }
+lance-core = { "version" = "=9.0.0-beta.10", "branch" = "jack/sophon-pr-6325", "git" = "https://github.com/jackye1995/lance.git" }
+lance-datagen = { "version" = "=9.0.0-beta.10", "branch" = "jack/sophon-pr-6325", "git" = "https://github.com/jackye1995/lance.git" }
+lance-file = { "version" = "=9.0.0-beta.10", "branch" = "jack/sophon-pr-6325", "git" = "https://github.com/jackye1995/lance.git" }
+lance-io = { "version" = "=9.0.0-beta.10", default-features = false, "branch" = "jack/sophon-pr-6325", "git" = "https://github.com/jackye1995/lance.git" }
+lance-index = { "version" = "=9.0.0-beta.10", "branch" = "jack/sophon-pr-6325", "git" = "https://github.com/jackye1995/lance.git" }
+lance-linalg = { "version" = "=9.0.0-beta.10", "branch" = "jack/sophon-pr-6325", "git" = "https://github.com/jackye1995/lance.git" }
+lance-namespace = { "version" = "=9.0.0-beta.10", "branch" = "jack/sophon-pr-6325", "git" = "https://github.com/jackye1995/lance.git" }
+lance-namespace-impls = { "version" = "=9.0.0-beta.10", default-features = false, "branch" = "jack/sophon-pr-6325", "git" = "https://github.com/jackye1995/lance.git" }
+lance-table = { "version" = "=9.0.0-beta.10", "branch" = "jack/sophon-pr-6325", "git" = "https://github.com/jackye1995/lance.git" }
+lance-testing = { "version" = "=9.0.0-beta.10", "branch" = "jack/sophon-pr-6325", "git" = "https://github.com/jackye1995/lance.git" }
+lance-datafusion = { "version" = "=9.0.0-beta.10", "branch" = "jack/sophon-pr-6325", "git" = "https://github.com/jackye1995/lance.git" }
+lance-encoding = { "version" = "=9.0.0-beta.10", "branch" = "jack/sophon-pr-6325", "git" = "https://github.com/jackye1995/lance.git" }
+lance-arrow = { "version" = "=9.0.0-beta.10", "branch" = "jack/sophon-pr-6325", "git" = "https://github.com/jackye1995/lance.git" }
 ahash = "0.8"
 # Note that this one does not include pyarrow
 arrow = { version = "58.0.0", optional = false }
--- a/nodejs/src/merge.rs
+++ b/nodejs/src/merge.rs
@@ -3,7 +3,7 @@

 use std::time::Duration;

-use lancedb::{ipc::ipc_file_to_batches, table::merge::MergeInsertBuilder};
+use lancedb::{arrow::IntoArrow, ipc::ipc_file_to_batches, table::merge::MergeInsertBuilder};
 use napi::bindgen_prelude::*;
 use napi_derive::napi;

@@ -66,9 +66,11 @@ impl NativeMergeInsertBuilder {

    #[napi(catch_unwind)]
    pub async fn execute(&self, buf: Buffer) -> napi::Result<MergeResult> {
-        let data = ipc_file_to_batches(buf.to_vec()).map_err(|e| {
-            napi::Error::from_reason(format!("Failed to read IPC file: {}", convert_error(&e)))
-        })?;
+        let data = ipc_file_to_batches(buf.to_vec())
+            .and_then(IntoArrow::into_arrow)
+            .map_err(|e| {
+                napi::Error::from_reason(format!("Failed to read IPC file: {}", convert_error(&e)))
+            })?;

        let this = self.clone();

--- a/python/python/lancedb/__init__.py
+++ b/python/python/lancedb/__init__.py
@@ -17,6 +17,17 @@ from .db import AsyncConnection, DBConnection, LanceDBConnection
 from .remote import ClientConfig
 from .remote.db import RemoteDBConnection
 from .expr import Expr, col, lit, func
+from .udf import (
+    udf,
+    table_udf,
+    Udf,
+    JobHandle,
+    JobFailedError,
+    MaterializedView,
+    AsyncJobHandle,
+    AsyncMaterializedView,
+)
+from .lineage import Lineage, Node, Edge, FunctionRef
 from .schema import vector
 from .table import AsyncTable, Table
 from ._lancedb import Session
@@ -448,6 +459,18 @@ async def connect_async(


 __all__ = [
+    "udf",
+    "table_udf",
+    "Udf",
+    "JobHandle",
+    "JobFailedError",
+    "MaterializedView",
+    "AsyncJobHandle",
+    "AsyncMaterializedView",
+    "Lineage",
+    "Node",
+    "Edge",
+    "FunctionRef",
    "connect",
    "connect_async",
    "connect_namespace",
--- a/python/python/lancedb/db.py
+++ b/python/python/lancedb/db.py
@@ -65,6 +65,7 @@ if TYPE_CHECKING:
    from .common import DATA, URI
    from .embeddings import EmbeddingFunctionConfig
    from ._lancedb import Session
+    from .udf import MaterializedView, AsyncMaterializedView

 from .namespace_utils import (
    _normalize_create_namespace_mode,
@@ -562,6 +563,259 @@ class DBConnection(EnforceOverrides):
        """
        raise NotImplementedError("serialize is not supported for this connection type")

+    # -- Derived compute: functions, materialized views, jobs -------------
+    # Server-backed features (LanceDB Enterprise / Cloud); local
+    # connections raise NotImplementedError for now.
+
+    def create_function(
+        self,
+        name,
+        language: str = "python",
+        return_type: Optional[str] = None,
+        body: Optional[str] = None,
+        options: Optional[Dict[str, str]] = None,
+        *,
+        replace: bool = False,
+    ):
+        """Register a UDF (CREATE FUNCTION).
+
+        Pass a ``@udf`` / ``@table_udf``-decorated function (preferred):
+
+            db.create_function(embed)
+
+        or the explicit fields:
+
+        Parameters
+        ----------
+        name: str or Udf
+            A decorated UDF object, or the function name.
+        language: str
+            Implementation language (currently "python").
+        return_type: str
+            SQL return type, e.g. "FLOAT", "FLOAT[1536]",
+            "STRUCT(a FLOAT, b VARCHAR)", "TABLE(chunk VARCHAR, idx INT)".
+        body: str
+            Function body: source text, or base64 cloudpickle bytes when
+            options["body_format"] == "cloudpickle".
+        options: dict, optional
+            input_columns, pip, num_gpus, batch_size, timeout,
+            error_policy, docker_image, body_format, ...
+        replace: bool
+            Drop an existing function of the same name first.
+        """
+        from .udf import Udf
+
+        if isinstance(name, Udf):
+            req = name.create_request()
+            name, language, return_type, body, options = (
+                req["name"],
+                req["language"],
+                req["return_type"],
+                req["body"],
+                req["options"],
+            )
+        if replace:
+            try:
+                self.drop_function(name)
+            except Exception:
+                pass
+        LOOP.run(self._conn.create_function(name, language, return_type, body, options))
+
+    def list_functions(self):
+        """List registered functions (SHOW FUNCTIONS)."""
+        return LOOP.run(self._conn.list_functions())
+
+    def drop_function(self, name: str):
+        """Drop a registered function (DROP FUNCTION)."""
+        LOOP.run(self._conn.drop_function(name))
+
+    def create_materialized_view(
+        self,
+        name: str,
+        source=None,
+        select=None,
+        *,
+        query: Optional[str] = None,
+        where: Optional[str] = None,
+        auto_refresh: bool = False,
+        with_no_data: bool = False,
+        replace: bool = False,
+        partition_by: Optional[str] = None,
+    ) -> "MaterializedView":
+        """Create a materialized view (CREATE MATERIALIZED VIEW); returns a
+        `MaterializedView` handle (``.wait()`` blocks until it is populated).
+
+        Two ways to specify the view body:
+
+        - ergonomic: pass ``source`` (a table name or table) and ``select``
+          items -- column names, expression strings ("embed(body)"),
+          (alias, expression) tuples, or ``@udf`` / ``@table_udf`` objects.
+          The SELECT is assembled and parsed server-side (one parser, shared
+          with SQL).
+        - raw: pass ``query=`` with a full SELECT, e.g.
+          "SELECT id, embed(body) AS vec FROM articles WHERE id > 1".
+
+        `partition_by` partitions the view's (single) table function on a source
+        column. If that column has an IVF vector index the server partitions by
+        its index clusters (image-dedup style); otherwise it groups by distinct
+        value. (Geneva's `partition_by` and `partition_by_indexed_column` unify
+        here -- the engine picks the strategy from the column.)
+        """
+        from .udf import build_view_query, MaterializedView
+
+        if query is None:
+            if source is None or select is None:
+                raise ValueError(
+                    "create_materialized_view needs either query= or both "
+                    "source and select"
+                )
+            query = build_view_query(source, select)
+        if where:
+            query += f" WHERE {where}"
+        if replace:
+            self._drop_view_if_exists(name)
+        job_id = LOOP.run(
+            self._conn.create_materialized_view(
+                name,
+                query=query,
+                auto_refresh=auto_refresh,
+                with_no_data=with_no_data,
+                partition_by=partition_by,
+            )
+        )
+        return MaterializedView(self, name, job_id=job_id)
+
+    def _drop_view_if_exists(self, name: str) -> None:
+        # `replace=True` is "drop if present"; only a not-found error is
+        # benign here. Anything else (perms, server fault) must surface rather
+        # than be masked by a later create failure.
+        try:
+            self.drop_materialized_view(name)
+        except Exception as e:
+            msg = str(e).lower()
+            if "not found" not in msg and "does not exist" not in msg:
+                raise
+
+    def job(self, job_id: str):
+        """A `JobHandle` for reconnecting to an inflight job by id -- e.g. an
+        id you stored, or one returned from the SQL / REST surface. Submit
+        methods (`refresh_column`, `MaterializedView.refresh`) already return a
+        handle directly, so you do not need this to wait on a fresh submission."""
+        from .udf import JobHandle
+
+        return JobHandle(self, job_id)
+
+    def lineage(
+        self,
+        table: str,
+        column: Optional[str] = None,
+        *,
+        direction: Optional[str] = None,
+        depth: Optional[int] = None,
+    ):
+        """Derived-compute lineage of a table/view, or one of its columns:
+        upstream sources, downstream dependents, and the function version +
+        location that produced each derived column (with a drift flag). Returns
+        a `Lineage`. `direction` is "upstream" | "downstream" | "both" (server
+        default both); `depth` limits column-hops (transitive when omitted)."""
+        # `self._conn` is the AsyncConnection; drive its async `lineage`
+        # (which parses the JSON) on the loop, mirroring create_materialized_view.
+        return LOOP.run(
+            self._conn.lineage(table, column, direction=direction, depth=depth)
+        )
+
+    def _refresh_materialized_view(
+        self,
+        name: str,
+        *,
+        full: bool = False,
+        src_version: Optional[int] = None,
+        num_workers: Optional[int] = None,
+        max_workers: Optional[int] = None,
+    ) -> str:
+        """Internal: submit a materialized-view refresh, return the job id.
+        The public surface is ``MaterializedView.refresh()`` (which returns a
+        `JobHandle`); this stays private so refresh is only reached through the
+        handle.
+
+        ``full=True`` forces a full rebuild (recompute and replace every row)
+        instead of the default incremental refresh.
+        """
+        return LOOP.run(
+            self._conn._refresh_materialized_view(
+                name,
+                full=full,
+                src_version=src_version,
+                num_workers=num_workers,
+                max_workers=max_workers,
+            )
+        )
+
+    def explain_refresh_materialized_view(
+        self,
+        name: str,
+        *,
+        full: bool = False,
+        src_version: Optional[int] = None,
+    ):
+        """Plan a refresh without running it (EXPLAIN REFRESH). Returns a
+        plan with .has_work / .source_version / .last_refreshed_version /
+        .full_refresh / .rebuild / .units_total. `full=True` plans a full
+        rebuild (incremental planning needs stable row IDs on the source)."""
+        return LOOP.run(
+            self._conn.explain_refresh_materialized_view(
+                name, full=full, src_version=src_version
+            )
+        )
+
+    def alter_materialized_view(self, name: str, *, auto_refresh: bool):
+        """Update a materialized view's options (ALTER MATERIALIZED VIEW)."""
+        LOOP.run(self._conn.alter_materialized_view(name, auto_refresh=auto_refresh))
+
+    def drop_materialized_view(self, name: str):
+        """Drop a materialized view definition (DROP MATERIALIZED VIEW)."""
+        LOOP.run(self._conn.drop_materialized_view(name))
+
+    def list_materialized_views(self):
+        """List registered materialized view definitions."""
+        return LOOP.run(self._conn.list_materialized_views())
+
+    def list_jobs(self):
+        """List inflight server-side jobs across the database's tables."""
+        return LOOP.run(self._conn.list_jobs())
+
+    def get_job(self, job_id: str, table: "str | None" = None):
+        """Look up one server-side job by id (the wait()/status poll path).
+
+        Passing ``table`` (the job's table) lets the server answer with an O(1)
+        single-node read instead of scanning the database's active jobs.
+        Returns the job's status, or None if it's unknown or no longer active.
+        """
+        return LOOP.run(self._conn.get_job(job_id, table))
+
+    def cancel_job(self, job_id: str) -> bool:
+        """Cancel an inflight server-side job by id (CANCEL JOB).
+
+        Returns True if a matching inflight job was found and flagged for
+        cancellation, False if none was inflight (already finished or
+        unknown id) -- cancellation is best-effort.
+        """
+        return LOOP.run(self._conn.cancel_job(job_id))
+
+    def job_history(self, job_id: "str | None" = None):
+        """Durable history of completed server-side jobs (SHOW JOB HISTORY).
+
+        Pass ``job_id`` to narrow to a single job. Unlike :meth:`list_jobs`
+        (live, inflight) these are the terminal records.
+        """
+        return LOOP.run(self._conn.job_history(job_id))
+
+    def errors(self, job_id: "str | None" = None, table: "str | None" = None):
+        """Per-row UDF errors recorded by ``error_policy=skip`` (SHOW ERRORS),
+        optionally filtered by ``job_id`` and/or ``table``.
+        """
+        return LOOP.run(self._conn.errors(job_id, table))
+

 class LanceDBConnection(DBConnection):
    """
@@ -1787,6 +2041,200 @@ class AsyncConnection(object):
        )
        return AsyncTable(table)

+    # -- Derived compute: functions, materialized views, jobs -------------
+    # Server-backed features (LanceDB Enterprise / Cloud); local
+    # connections raise NotImplementedError for now.
+
+    async def create_function(
+        self,
+        name,
+        language: str = "python",
+        return_type: Optional[str] = None,
+        body: Optional[str] = None,
+        options: Optional[Dict[str, str]] = None,
+        *,
+        replace: bool = False,
+    ):
+        """Register a UDF (CREATE FUNCTION). Accepts a ``@udf``/``@table_udf``
+        object (preferred) or explicit ``name``, ``language``, ``return_type``,
+        ``body``, and ``options`` arguments."""
+        from .udf import Udf
+
+        if isinstance(name, Udf):
+            req = name.create_request()
+            name, language, return_type, body, options = (
+                req["name"],
+                req["language"],
+                req["return_type"],
+                req["body"],
+                req["options"],
+            )
+        if replace:
+            try:
+                await self.drop_function(name)
+            except Exception:
+                pass
+        await self._inner.create_function(name, language, return_type, body, options)
+
+    async def list_functions(self):
+        """List registered functions (SHOW FUNCTIONS)."""
+        return await self._inner.list_functions()
+
+    async def drop_function(self, name: str):
+        """Drop a registered function (DROP FUNCTION)."""
+        await self._inner.drop_function(name)
+
+    async def create_materialized_view(
+        self,
+        name: str,
+        source=None,
+        select=None,
+        *,
+        query: Optional[str] = None,
+        where: Optional[str] = None,
+        auto_refresh: bool = False,
+        with_no_data: bool = False,
+        replace: bool = False,
+        partition_by: Optional[str] = None,
+    ) -> "AsyncMaterializedView":
+        """Create a materialized view; returns an `AsyncMaterializedView`
+        handle (``.wait()`` blocks until populated). Pass either ``query=`` (a
+        full SELECT) or ``source`` + ``select`` items; `partition_by`
+        partitions the view's table function on a source column (index-cluster
+        if the column is IVF-indexed, else distinct-value). See the sync
+        method for the select grammar."""
+        from .udf import build_view_query, AsyncMaterializedView
+
+        if query is None:
+            if source is None or select is None:
+                raise ValueError(
+                    "create_materialized_view needs either query= or both "
+                    "source and select"
+                )
+            query = build_view_query(source, select)
+        if where:
+            query += f" WHERE {where}"
+        if replace:
+            try:
+                await self.drop_materialized_view(name)
+            except Exception as e:
+                msg = str(e).lower()
+                if "not found" not in msg and "does not exist" not in msg:
+                    raise
+        job_id = await self._inner.create_materialized_view(
+            name,
+            query,
+            auto_refresh=auto_refresh,
+            with_no_data=with_no_data,
+            partition_by=partition_by,
+        )
+        return AsyncMaterializedView(self, name, job_id=job_id)
+
+    def job(self, job_id: str):
+        """An `AsyncJobHandle` for reconnecting to an inflight job by id (a
+        stored id, or one from the SQL / REST surface). Submit methods already
+        return a handle, so this is only needed to re-attach to an existing
+        job."""
+        from .udf import AsyncJobHandle
+
+        return AsyncJobHandle(self, job_id)
+
+    async def lineage(
+        self,
+        table: str,
+        column: Optional[str] = None,
+        *,
+        direction: Optional[str] = None,
+        depth: Optional[int] = None,
+    ):
+        """Derived-compute lineage of a table/view (or column). See the sync
+        `Connection.lineage`. Returns a `Lineage`."""
+        from .lineage import Lineage
+
+        raw = await self._inner.table_lineage(table, column, direction, depth)
+        return Lineage.from_json(raw)
+
+    async def _refresh_materialized_view(
+        self,
+        name: str,
+        *,
+        full: bool = False,
+        src_version: Optional[int] = None,
+        num_workers: Optional[int] = None,
+        max_workers: Optional[int] = None,
+    ) -> str:
+        """Internal: submit a refresh, return the job id. The public surface is
+        ``AsyncMaterializedView.refresh()`` (returns an `AsyncJobHandle`).
+
+        ``full=True`` forces a full rebuild (recompute and replace every row)
+        instead of the default incremental refresh.
+        """
+        return await self._inner.refresh_materialized_view(
+            name,
+            full=full,
+            src_version=src_version,
+            num_workers=num_workers,
+            max_workers=max_workers,
+        )
+
+    async def explain_refresh_materialized_view(
+        self,
+        name: str,
+        *,
+        full: bool = False,
+        src_version: Optional[int] = None,
+    ):
+        """Plan a refresh without running it (EXPLAIN REFRESH)."""
+        return await self._inner.explain_refresh_materialized_view(
+            name, full=full, src_version=src_version
+        )
+
+    async def alter_materialized_view(self, name: str, *, auto_refresh: bool):
+        """Update a materialized view's options."""
+        await self._inner.alter_materialized_view(name, auto_refresh)
+
+    async def drop_materialized_view(self, name: str):
+        """Drop a materialized view definition."""
+        await self._inner.drop_materialized_view(name)
+
+    async def list_materialized_views(self):
+        """List registered materialized view definitions."""
+        return await self._inner.list_materialized_views()
+
+    async def list_jobs(self):
+        """List inflight server-side jobs across the database's tables."""
+        return await self._inner.list_jobs()
+
+    async def get_job(self, job_id: str, table: "str | None" = None):
+        """Look up one server-side job by id (the wait()/status poll path).
+        ``table`` (the job's table) enables an O(1) server-side lookup.
+        Returns the job's status, or None if unknown / no longer active."""
+        return await self._inner.get_job(job_id, table)
+
+    async def cancel_job(self, job_id: str) -> bool:
+        """Cancel an inflight server-side job by id (CANCEL JOB).
+
+        Returns True if a matching inflight job was found and flagged for
+        cancellation, False otherwise (best-effort).
+        """
+        return await self._inner.cancel_job(job_id)
+
+    async def job_history(self, job_id: "str | None" = None):
+        """Durable history of completed server-side jobs (SHOW JOB HISTORY).
+
+        Reads each table's durable job-history store. Pass ``job_id`` to narrow
+        to a single job. Unlike :meth:`list_jobs` (live, inflight) these are the
+        terminal records, with created/updated/completed timestamps.
+        """
+        return await self._inner.job_history(job_id)
+
+    async def errors(self, job_id: "str | None" = None, table: "str | None" = None):
+        """Per-row UDF errors recorded by ``error_policy=skip`` (SHOW ERRORS).
+
+        Optionally filtered by ``job_id`` and/or ``table``.
+        """
+        return await self._inner.errors(job_id, table)
+
    async def rename_table(
        self,
        cur_name: str,
--- a/python/python/lancedb/lineage.py
+++ b/python/python/lancedb/lineage.py
@@ -0,0 +1,177 @@
+# SPDX-License-Identifier: Apache-2.0
+# SPDX-FileCopyrightText: Copyright The LanceDB Authors
+"""Client-side model of derived-compute lineage.
+
+`Connection.lineage()` / `Table.lineage()` / `MaterializedView.lineage()` return
+a `Lineage`: the graph of what a column or materialized view derives from
+(upstream), what derives from it (downstream), and -- for each derived column --
+the function that produced it, the version it was produced with, and whether
+that is stale relative to the function the registry now holds.
+
+The server returns this as JSON (the wire contract); these classes deserialize
+it. Nothing here talks to the server.
+"""
+
+from __future__ import annotations
+
+import json
+from dataclasses import dataclass, field
+from typing import List, Optional, Union
+
+
+@dataclass
+class FunctionRef:
+    """The function that produced a derived column, with version + location."""
+
+    name: str
+    #: Version that produced the data (stamped at compute time), if known.
+    as_computed_version: Optional[str] = None
+    #: Version the registry currently holds for this function name.
+    current_version: Optional[str] = None
+    #: True when the column was produced by an older function than the registry
+    #: now holds -- i.e. silently stale; re-refresh to catch up.
+    stale_vs_current: bool = False
+    language: Optional[str] = None
+    docker_image: Optional[str] = None
+    env_digest: Optional[str] = None
+    code_uri: Optional[str] = None
+
+    @classmethod
+    def _from(cls, d: dict) -> "FunctionRef":
+        return cls(
+            name=d["name"],
+            as_computed_version=d.get("as_computed_version"),
+            current_version=d.get("current_version"),
+            stale_vs_current=d.get("stale_vs_current", False),
+            language=d.get("language"),
+            docker_image=d.get("docker_image"),
+            env_digest=d.get("env_digest"),
+            code_uri=d.get("code_uri"),
+        )
+
+
+@dataclass
+class Node:
+    """A lineage node: a table, view, column, or function."""
+
+    kind: str  # "table" | "view" | "column" | "function"
+    id: str  # "table", "table.column", or "fn:name@version"
+    table: Optional[str] = None
+    function: Optional[FunctionRef] = None
+
+    @classmethod
+    def _from(cls, d: dict) -> "Node":
+        fn = d.get("function")
+        return cls(
+            kind=d["kind"],
+            id=d["id"],
+            table=d.get("table"),
+            function=FunctionRef._from(fn) if fn else None,
+        )
+
+
+@dataclass
+class Edge:
+    """`downstream` depends on `upstream`, produced by `via` (a function name,
+    or None for a passthrough)."""
+
+    downstream: str
+    upstream: str
+    via: Optional[str] = None
+
+    @classmethod
+    def _from(cls, d: dict) -> "Edge":
+        return cls(downstream=d["downstream"], upstream=d["upstream"], via=d.get("via"))
+
+
+@dataclass
+class Lineage:
+    """A derived-compute lineage graph (nodes + labeled edges)."""
+
+    target: str
+    nodes: List[Node] = field(default_factory=list)
+    edges: List[Edge] = field(default_factory=list)
+
+    @classmethod
+    def from_json(cls, raw: Union[str, bytes, dict]) -> "Lineage":
+        d = json.loads(raw) if isinstance(raw, (str, bytes)) else raw
+        return cls(
+            target=d.get("target", ""),
+            nodes=[Node._from(n) for n in d.get("nodes", [])],
+            edges=[Edge._from(e) for e in d.get("edges", [])],
+        )
+
+    def functions(self) -> List[FunctionRef]:
+        """The function nodes in the graph."""
+        return [n.function for n in self.nodes if n.function is not None]
+
+    def stale(self) -> List[FunctionRef]:
+        """Functions whose as-computed version is behind the current registry
+        version -- the columns they produced are silently out of date."""
+        return [f for f in self.functions() if f.stale_vs_current]
+
+    def to_dict(self) -> dict:
+        def prune(d: dict) -> dict:
+            return {k: v for k, v in d.items() if v is not None}
+
+        return {
+            "target": self.target,
+            "nodes": [
+                prune(
+                    {
+                        "kind": n.kind,
+                        "id": n.id,
+                        "table": n.table,
+                        "function": prune(vars(n.function)) if n.function else None,
+                    }
+                )
+                for n in self.nodes
+            ],
+            "edges": [prune(vars(e)) for e in self.edges],
+        }
+
+    def to_graphviz(self) -> str:
+        """Graphviz DOT for the lineage DAG: columns/tables as nodes, function
+        names on edges, drift edges dashed + red."""
+        stale_names = {f.name for f in self.stale()}
+        out = [
+            "digraph lineage {",
+            "  rankdir=LR;",
+            '  node [fontname="monospace"];',
+        ]
+        for n in self.nodes:
+            if n.kind == "function":
+                continue
+            shape = "ellipse" if n.kind in ("table", "view") else "box"
+            out.append(f'  "{n.id}" [shape={shape}];')
+        for e in self.edges:
+            attrs = ""
+            if e.via:
+                if e.via in stale_names:
+                    attrs = f' [label="{e.via}" color=red style=dashed]'
+                else:
+                    attrs = f' [label="{e.via}"]'
+            out.append(f'  "{e.upstream}" -> "{e.downstream}"{attrs};')
+        out.append("}")
+        return "\n".join(out)
+
+    def _repr_html_(self) -> str:
+        warn = ""
+        drift = self.stale()
+        if drift:
+            names = ", ".join(sorted({f.name for f in drift}))
+            warn = (
+                f'<p style="color:#b00000"><b>stale vs current:</b> {names} '
+                "(re-refresh to catch up)</p>"
+            )
+        rows = "".join(
+            f"<tr><td><code>{e.downstream}</code></td>"
+            f"<td>&larr; {e.via or ''}</td>"
+            f"<td><code>{e.upstream}</code></td></tr>"
+            for e in self.edges
+        )
+        return (
+            f"<b>lineage: <code>{self.target}</code></b>{warn}"
+            "<table><tr><th>derived</th><th>via</th><th>from</th></tr>"
+            f"{rows}</table>"
+        )
--- a/python/python/lancedb/remote/table.py
+++ b/python/python/lancedb/remote/table.py
@@ -13,10 +13,14 @@ from typing import (
    Iterable,
    List,
    Optional,
+    TYPE_CHECKING,
    Union,
    Literal,
    overload,
 )
+
+if TYPE_CHECKING:
+    from ..udf import JobHandle
 import warnings

 from lancedb import __version__
@@ -884,8 +888,142 @@ class RemoteTable(Table):
    def count_rows(self, filter: Optional[str] = None) -> int:
        return LOOP.run(self._table.count_rows(filter))

-    def add_columns(self, transforms: Dict[str, str]) -> AddColumnsResult:
-        return LOOP.run(self._table.add_columns(transforms))
+    def add_columns(
+        self,
+        transforms: Optional[Dict[str, str]] = None,
+        *,
+        computed: Optional[Dict] = None,
+    ) -> Optional[AddColumnsResult]:
+        result = None
+        if transforms is not None:
+            result = LOOP.run(self._table.add_columns(transforms))
+        if computed:
+            LOOP.run(self._table.add_columns(computed=computed))
+        return result
+
+    def refresh_column(
+        self,
+        columns,
+        *,
+        where: Optional[str] = None,
+        num_workers: Optional[int] = None,
+        max_workers: Optional[int] = None,
+        batch_size: Optional[int] = None,
+        priority: Optional[str] = None,
+    ) -> "JobHandle":
+        """Trigger recompute of computed columns (REFRESH COLUMN).
+
+        The expression is resolved server-side from each column's stored
+        binding; columns bound to the same struct-returning function
+        refresh together. Returns a `JobHandle` to wait on, poll, or cancel
+        (``tbl.refresh_column("c").wait()``). Server-backed feature
+        (LanceDB Enterprise / Cloud).
+
+        num_workers / max_workers / batch_size / priority are per-refresh
+        scheduling knobs (how to run THIS refresh) and override any default
+        the function carries. `priority` is a Kueue tier
+        (training | interactive | backfill).
+        """
+        from ..udf import JobHandle
+
+        if isinstance(columns, str):
+            columns = [columns]
+        job_id = LOOP.run(
+            self._table.refresh_column(
+                list(columns),
+                where=where,
+                num_workers=num_workers,
+                max_workers=max_workers,
+                batch_size=batch_size,
+                priority=priority,
+            )
+        )
+        return JobHandle(self._job_conn(), job_id, table=self._name)
+
+    def lineage(self, column=None, *, direction=None, depth=None):
+        """Derived-compute lineage of this table, or one of its columns:
+        upstream sources, downstream dependents, and the function version +
+        location that produced each derived column (with a drift flag). Returns
+        a `Lineage`. See `Connection.lineage`."""
+        return self._job_conn().lineage(
+            self._name, column, direction=direction, depth=depth
+        )
+
+    def _job_conn(self):
+        """A client connection for polling jobs this table spawns. Built lazily
+        from the table's serialized connection state and cached (not pickled --
+        a forked/unpickled table rebuilds it on next use)."""
+        from lancedb import deserialize_conn
+
+        conn = getattr(self, "_job_conn_cache", None)
+        if conn is None:
+            conn = deserialize_conn(self._serialized_connection_state())
+            self._job_conn_cache = conn
+        return conn
+
+    def load_columns(
+        self,
+        source: Union[str, Iterable[str]],
+        pk: str,
+        columns: Union[Iterable[str], Dict[str, str]],
+        *,
+        source_format: str = "parquet",
+        source_pk: Optional[str] = None,
+        on_missing: str = "carry",
+        source_storage_options: Optional[Dict[str, str]] = None,
+        num_workers: Optional[int] = None,
+        max_workers: Optional[int] = None,
+        batch_size: Optional[int] = None,
+        commit_granularity: Optional[int] = None,
+        priority: Optional[str] = None,
+    ) -> str:
+        """Fill existing columns from an external source by primary-key join.
+
+        The distributed-job equivalent of Geneva's ``Table.load_columns()``:
+        imports precomputed values (e.g. embeddings) from Parquet/Lance/IPC into
+        this table, matching on a primary key. Returns the load job id.
+        Server-backed feature (LanceDB Enterprise / Cloud).
+
+        Parameters
+        ----------
+        source: str | list[str]
+            One source URI or a list of URIs.
+        pk: str
+            Destination primary-key column. Also the source key unless
+            ``source_pk`` is given.
+        columns: list[str] | dict[str, str]
+            Value columns to load. A list loads same-named columns; a dict maps
+            ``{target: source}``.
+        source_format: str
+            ``"parquet"`` (default), ``"lance"``, or ``"ipc"``.
+        source_pk: str, optional
+            Source primary-key column when it differs from ``pk``.
+        on_missing: str
+            Behavior for destination rows with no source match:
+            ``"carry"`` (default, keep existing), ``"null"``, or ``"error"``.
+        """
+        if isinstance(source, str):
+            source = [source]
+        if isinstance(columns, dict):
+            mappings = list(columns.items())  # (target, source) pairs
+        else:
+            mappings = [(c, None) for c in columns]
+        return LOOP.run(
+            self._table.load_columns(
+                list(source),
+                source_format,
+                pk,
+                mappings,
+                source_key=source_pk,
+                source_storage_options=source_storage_options,
+                on_missing=on_missing,
+                num_workers=num_workers,
+                max_workers=max_workers,
+                batch_size=batch_size,
+                commit_granularity=commit_granularity,
+                priority=priority,
+            )
+        )

    def alter_columns(
        self, *alterations: Iterable[Dict[str, str]]
--- a/python/python/lancedb/table.py
+++ b/python/python/lancedb/table.py
@@ -702,6 +702,24 @@ def _normalize_progress(progress):
    return progress, False


+def _computed_groups(computed):
+    """Group computed columns by expression, preserving declaration order
+    (struct-returning functions need their columns adjacent so schema order
+    matches field order). Accepts the ergonomic forms -- `fn("col")` values
+    and tuple keys for struct fan-out -- via `_normalize_computed`."""
+    from .udf import _normalize_computed
+
+    groups = []
+    for name, (sql_type, expression) in _normalize_computed(computed).items():
+        for expr, cols in groups:
+            if expr == expression:
+                cols.append((name, sql_type))
+                break
+        else:
+            groups.append((expression, [(name, sql_type)]))
+    return groups
+
+
 class Table(ABC):
    """
    A Table is a collection of Records in a LanceDB Database.
@@ -807,6 +825,59 @@ class Table(ABC):
        """The number of rows in this Table"""
        return self.count_rows(None)

+    def add_computed_column(
+        self,
+        columns,
+        fn,
+        args: Optional[List[str]] = None,
+        types=None,
+    ) -> None:
+        """Declare computed column(s) bound to a UDF -- no compute happens
+        here (the agent fills them lazily, or refresh_column() triggers a run).
+
+        .. deprecated::
+            A computed column is an expression over a registered function, so
+            bind it as one: ``add_columns(computed={"vec": embed("data")})``.
+            ``embed("data")`` applies the function to the `data` column and
+            infers the type from the function's return signature -- the
+            function never couples to a particular column. Prefer that form.
+        """
+        import warnings
+
+        warnings.warn(
+            "add_computed_column is deprecated; use add_columns(computed="
+            '{"vec": embed("data")}).',
+            DeprecationWarning,
+            stacklevel=2,
+        )
+        from .udf import Udf, struct_field_types
+
+        multi = isinstance(columns, (tuple, list))
+        if isinstance(fn, Udf):
+            expr = fn.expression(*(args or []))
+            if types is None:
+                if multi:
+                    if not fn.returns.upper().startswith("STRUCT"):
+                        raise ValueError(
+                            "several columns need a STRUCT-returning function"
+                        )
+                    types = struct_field_types(fn.returns)
+                else:
+                    types = fn.returns
+        else:
+            if types is None:
+                raise ValueError("pass types= when fn is a name string")
+            expr = f"{fn}({', '.join(args or [])})"
+        if multi:
+            if len(types) != len(columns):
+                raise ValueError(
+                    f"{len(columns)} columns but {len(types)} output types"
+                )
+            computed = {c: (t, expr) for c, t in zip(columns, types)}
+        else:
+            computed = {columns: (types, expr)}
+        self.add_columns(computed=computed)
+
    @property
    @abstractmethod
    def embedding_functions(self) -> Dict[str, EmbeddingFunctionConfig]:
@@ -3710,9 +3781,68 @@ class LanceTable(Table):
        return LOOP.run(self._table.index_stats(index_name))

    def add_columns(
-        self, transforms: Dict[str, str] | pa.field | List[pa.field] | pa.Schema
-    ) -> AddColumnsResult:
-        return LOOP.run(self._table.add_columns(transforms))
+        self,
+        transforms: Dict[str, str]
+        | pa.field
+        | List[pa.field]
+        | pa.Schema
+        | None = None,
+        *,
+        computed: Optional[Dict] = None,
+    ) -> Optional[AddColumnsResult]:
+        result = None
+        if transforms is not None:
+            result = LOOP.run(self._table.add_columns(transforms))
+        if computed:
+            # computed binds an expression over a registered function to a
+            # column: {col: fn("input_col")} -- fn("input_col") yields the
+            # expression and carries the inferred type; a tuple key fans a
+            # STRUCT return out to several columns. Declares the binding only;
+            # the server fills the values (server-backed). The legacy
+            # {col: (sql_type, expression)} tuple form is still accepted.
+            # The computed path yields no AddColumnsResult; discard its return.
+            LOOP.run(self._table.add_columns(computed=computed))
+        return result
+
+    def refresh_column(
+        self,
+        columns,
+        *,
+        where: Optional[str] = None,
+        num_workers: Optional[int] = None,
+        max_workers: Optional[int] = None,
+        batch_size: Optional[int] = None,
+        priority: Optional[str] = None,
+    ) -> "JobHandle":
+        """Trigger recompute of computed columns (REFRESH COLUMN).
+
+        The expression is resolved server-side from each column's stored
+        binding; columns bound to the same struct-returning function
+        refresh together. Returns a `JobHandle` to wait on, poll, or cancel
+        (``tbl.refresh_column("col").wait()``) -- mirrors
+        `MaterializedView.refresh()`. Server-backed feature (LanceDB
+        Enterprise / Cloud).
+
+        num_workers / max_workers / batch_size / priority are per-refresh
+        scheduling knobs (how to run THIS refresh) and override any default
+        the function carries. `priority` is a Kueue tier
+        (training | interactive | backfill).
+        """
+        from .udf import JobHandle
+
+        if isinstance(columns, str):
+            columns = [columns]
+        job_id = LOOP.run(
+            self._table.refresh_column(
+                list(columns),
+                where=where,
+                num_workers=num_workers,
+                max_workers=max_workers,
+                batch_size=batch_size,
+                priority=priority,
+            )
+        )
+        return JobHandle(self._conn, job_id, table=self.name)

    def alter_columns(
        self, *alterations: Iterable[Dict[str, str]]
@@ -5390,9 +5520,44 @@ class AsyncTable:

        return await self._inner.update(updates_sql, where)

+    async def refresh_column(
+        self,
+        columns,
+        *,
+        where: Optional[str] = None,
+        num_workers: Optional[int] = None,
+        max_workers: Optional[int] = None,
+        batch_size: Optional[int] = None,
+        priority: Optional[str] = None,
+    ) -> str:
+        """Trigger recompute of computed columns (REFRESH COLUMN).
+        Returns the refresh job id. Server-backed feature.
+
+        num_workers / max_workers / batch_size / priority are per-refresh
+        scheduling knobs (how to run THIS refresh); they override any default
+        the function carries. `priority` is a Kueue tier
+        (training | interactive | backfill)."""
+        if isinstance(columns, str):
+            columns = [columns]
+        return await self._inner.refresh_column(
+            list(columns),
+            where_clause=where,
+            num_workers=num_workers,
+            max_workers=max_workers,
+            batch_size=batch_size,
+            priority=priority,
+        )
+
    async def add_columns(
-        self, transforms: dict[str, str] | pa.field | List[pa.field] | pa.Schema
-    ) -> AddColumnsResult:
+        self,
+        transforms: dict[str, str]
+        | pa.field
+        | List[pa.field]
+        | pa.Schema
+        | None = None,
+        *,
+        computed: Optional[Dict] = None,
+    ) -> Optional[AddColumnsResult]:
        """
        Add new columns with defined values.

@@ -5411,6 +5576,7 @@ class AsyncTable:
            version: the new version number of the table after adding columns.

        """
+        result = None
        if isinstance(transforms, pa.Field):
            transforms = [transforms]
        if isinstance(transforms, list) and all(
@@ -5418,9 +5584,69 @@ class AsyncTable:
        ):
            transforms = pa.schema(transforms)
        if isinstance(transforms, pa.Schema):
-            return await self._inner.add_columns_with_schema(transforms)
+            result = await self._inner.add_columns_with_schema(transforms)
+        elif transforms is not None:
+            result = await self._inner.add_columns(list(transforms.items()))
+        if computed:
+            # computed binds an expression over a registered function to a
+            # column: {col: fn("input_col")} -- fn("input_col") yields the
+            # expression and carries the inferred type; a tuple key fans a
+            # STRUCT return out to several columns. Declares the binding only;
+            # the server fills the values (server-backed). The legacy
+            # {col: (sql_type, expression)} tuple form is still accepted.
+            for expression, cols in _computed_groups(computed):
+                await self._inner.add_computed_columns(cols, expression)
+        return result
+
+    async def add_computed_column(
+        self,
+        columns,
+        fn,
+        args: Optional[List[str]] = None,
+        types=None,
+    ) -> None:
+        """Declare computed column(s) bound to a UDF (async).
+
+        .. deprecated::
+            Use ``add_columns(computed={"col": fn("input_col")})`` -- a computed
+            column is an expression over a registered function, so bind it that
+            way instead of coupling the UDF to the column here.
+        """
+        import warnings
+
+        warnings.warn(
+            "add_computed_column is deprecated; use add_columns(computed="
+            '{"col": fn("input_col")}).',
+            DeprecationWarning,
+            stacklevel=2,
+        )
+        from .udf import Udf, struct_field_types
+
+        multi = isinstance(columns, (tuple, list))
+        if isinstance(fn, Udf):
+            expr = fn.expression(*(args or []))
+            if types is None:
+                if multi:
+                    if not fn.returns.upper().startswith("STRUCT"):
+                        raise ValueError(
+                            "several columns need a STRUCT-returning function"
+                        )
+                    types = struct_field_types(fn.returns)
+                else:
+                    types = fn.returns
        else:
-            return await self._inner.add_columns(list(transforms.items()))
+            if types is None:
+                raise ValueError("pass types= when fn is a name string")
+            expr = f"{fn}({', '.join(args or [])})"
+        if multi:
+            if len(types) != len(columns):
+                raise ValueError(
+                    f"{len(columns)} columns but {len(types)} output types"
+                )
+            computed = {c: (t, expr) for c, t in zip(columns, types)}
+        else:
+            computed = {columns: (types, expr)}
+        await self.add_columns(computed=computed)

    async def alter_columns(
        self, *alterations: Iterable[dict[str, Any]]
--- a/python/python/lancedb/udf.py
+++ b/python/python/lancedb/udf.py
@@ -0,0 +1,753 @@
+# SPDX-License-Identifier: Apache-2.0
+# SPDX-FileCopyrightText: Copyright The LanceDB Authors
+"""UDF authoring for LanceDB derived compute (server-backed).
+
+`@udf` / `@table_udf` turn a plain Python function into a registrable
+server-side UDF: a cloudpickled (or source) body, a SQL signature inferred
+from type hints, and the runtime options (pip deps, GPUs, batching, ...).
+Register and use them through the existing connection/table API:
+
+    import lancedb
+    from lancedb import udf, table_udf
+
+    db = lancedb.connect("db://my_db", api_key="...", host_override="...")
+
+    @udf(pip=["torch>=2.0"], num_gpus=1)
+    def embed(text: str) -> list[float]:
+        return model.encode(text).tolist()
+
+    db.create_function(embed)                 # CREATE FUNCTION (once)
+    tbl = db.open_table("docs")
+    tbl.add_columns(computed={"vec": embed("text")})  # bind embed(text) -> vec
+    tbl.refresh_column("vec").wait()          # materialize (returns a JobHandle)
+    view = db.create_materialized_view("chunks", tbl, ["id", chunk_fn])
+
+`embed("text")` applies the registered function to the `text` column and yields
+the expression `embed(text)`; the function itself stays decoupled from any
+column, so the same `embed` works on any column or table.
+
+These operations are server-backed (LanceDB Enterprise / Cloud). The
+decorator itself works locally (define + call); only registration needs a
+remote connection.
+"""
+
+from __future__ import annotations
+
+import asyncio
+import base64
+import dataclasses
+import functools
+import inspect
+import re
+import sys
+import textwrap
+import time
+import typing
+
+# -- type hints -> SQL type strings -------------------------------------
+
+_SCALARS = {
+    int: "BIGINT",
+    # Pragmatic default for ML workloads: python float maps to FLOAT
+    # (Float32). Use an explicit `returns=` for DOUBLE.
+    float: "FLOAT",
+    str: "VARCHAR",
+    bool: "BOOLEAN",
+    bytes: "BLOB",
+}
+
+
+class TypeInferenceError(TypeError):
+    pass
+
+
+def sql_type(hint) -> str:
+    """SQL type string for a python type hint."""
+    if hint in _SCALARS:
+        return _SCALARS[hint]
+    origin = typing.get_origin(hint)
+    if origin in (list, typing.List):
+        (item,) = typing.get_args(hint) or (None,)
+        if item in _SCALARS:
+            return f"{_SCALARS[item]}[]"
+        raise TypeInferenceError(
+            f"unsupported list item type {item!r}; use an explicit returns="
+        )
+    fields = _struct_fields(hint)
+    if fields is not None:
+        inner = ", ".join(f"{name} {sql_type(h)}" for name, h in fields)
+        return f"STRUCT({inner})"
+    raise TypeInferenceError(
+        f"cannot infer a SQL type for {hint!r}; pass an explicit type string"
+    )
+
+
+def _struct_fields(hint):
+    """(name, hint) pairs for a TypedDict or dataclass, else None."""
+    if dataclasses.is_dataclass(hint):
+        return [(f.name, f.type) for f in dataclasses.fields(hint)]
+    # TypedDict detection: a dict subclass with __annotations__.
+    if (
+        isinstance(hint, type)
+        and issubclass(hint, dict)
+        and typing.get_type_hints(hint)
+    ):
+        return list(typing.get_type_hints(hint).items())
+    return None
+
+
+def return_type(fn, override: "str | None", table: bool) -> str:
+    """SQL return type for a function: explicit override wins, else the
+    return annotation. Table functions render as TABLE(...) and accept
+    struct-shaped hints (TypedDict/dataclass, optionally list-wrapped)."""
+    if override is not None:
+        s = override.strip()
+        if table and not s.upper().startswith("TABLE"):
+            if s.upper().startswith("STRUCT"):
+                return "TABLE" + s[len("STRUCT") :]
+            raise TypeInferenceError(
+                "a table function's returns= must be TABLE(...) or STRUCT(...)"
+            )
+        return s
+
+    hints = typing.get_type_hints(fn)
+    ret = hints.get("return")
+    if ret is None:
+        raise TypeInferenceError(
+            f"function {fn.__name__!r} needs a return annotation or returns="
+        )
+    if table:
+        # Accept list[Row] / Row where Row is a TypedDict or dataclass.
+        if typing.get_origin(ret) in (list, typing.List):
+            (ret,) = typing.get_args(ret)
+        fields = _struct_fields(ret)
+        if fields is None:
+            raise TypeInferenceError(
+                "a table function must return rows shaped as a TypedDict or "
+                "dataclass (optionally list-wrapped); or pass returns=..."
+            )
+        inner = ", ".join(f"{name} {sql_type(h)}" for name, h in fields)
+        return f"TABLE({inner})"
+    return sql_type(ret)
+
+
+def param_types(fn) -> "list[tuple[str, str]]":
+    """(name, sql type) per parameter, from annotations. Each UDF
+    parameter binds to a source column of the same name by default."""
+    hints = typing.get_type_hints(fn)
+    out = []
+    for name, p in inspect.signature(fn).parameters.items():
+        if p.kind in (p.VAR_POSITIONAL, p.VAR_KEYWORD):
+            raise TypeInferenceError("*args/**kwargs are not supported in UDFs")
+        hint = hints.get(name)
+        if hint is None:
+            raise TypeInferenceError(
+                f"parameter {name!r} of {fn.__name__!r} needs a type annotation"
+            )
+        out.append((name, sql_type(hint)))
+    return out
+
+
+# -- column expressions -------------------------------------------------
+
+
+class ColumnExpr(str):
+    """A computed-column expression produced by applying a registered
+    function to column names, e.g. ``embed("data") -> "embed(data)"``.
+
+    It IS the expression string everywhere a string is expected (views, SQL,
+    logging), and additionally carries the function's declared return type so
+    ``add_columns(computed=...)`` can declare the column without a hand-written
+    type. ``field_types`` holds the per-field SQL types of a STRUCT return, for
+    fanning one expression out to several columns.
+    """
+
+    data_type: "str | None"
+    field_types: "list[str] | None"
+
+    def __new__(cls, expr: str, data_type=None, field_types=None):
+        obj = super().__new__(cls, expr)
+        obj.data_type = data_type
+        obj.field_types = field_types
+        return obj
+
+
+def _normalize_computed(computed: dict) -> dict:
+    """Normalize the user-facing ``computed=`` mapping to the canonical
+    ``{name: (sql_type, expression)}`` form.
+
+    Accepts, per entry:
+      - value is a `ColumnExpr` (from ``fn("col")``): the column's SQL type
+        comes from the function's return type -- no hand-written type needed. A
+        tuple key (``("chunk", "idx")``) fans a STRUCT return out to one
+        (type, expression) entry per field, in declared order.
+      - value is a legacy ``(sql_type, expression)`` tuple: passed through (the
+        escape hatch, e.g. bare-name function strings).
+    """
+    out: dict = {}
+    for key, val in computed.items():
+        if isinstance(val, ColumnExpr):
+            expr = str(val)
+            if isinstance(key, (tuple, list)):
+                if not val.field_types:
+                    raise ValueError(
+                        f"columns {tuple(key)} need a STRUCT-returning function; "
+                        f"{expr} returns a single value"
+                    )
+                if len(val.field_types) != len(key):
+                    raise ValueError(
+                        f"{len(key)} columns but {len(val.field_types)} struct fields "
+                        f"in {expr}"
+                    )
+                for name, t in zip(key, val.field_types):
+                    out[name] = (t, expr)
+            else:
+                if val.data_type is None:
+                    raise ValueError(f"no return type for {expr}; use (sql_type, expr)")
+                out[key] = (val.data_type, expr)
+        else:
+            out[key] = val
+    return out
+
+
+# -- the @udf / @table_udf decorators -----------------------------------
+
+
+class Udf:
+    def __init__(
+        self,
+        fn,
+        *,
+        returns: "str | None" = None,
+        table: bool = False,
+        name: "str | None" = None,
+        pip: "list[str] | None" = None,
+        pip_index_url: "str | None" = None,
+        pip_extra_index_urls: "list[str] | None" = None,
+        find_links: "list[str] | None" = None,
+        requirements: "str | list[str] | None" = None,
+        conda: "list[str] | None" = None,
+        conda_channels: "list[str] | None" = None,
+        env: "dict[str, str] | list[str] | None" = None,
+        num_cpus: "int | None" = None,
+        num_gpus: "int | None" = None,
+        batch_size: "int | None" = None,
+        timeout: "float | None" = None,
+        error_policy: "str | None" = None,
+        max_skip_ratio: "float | None" = None,
+        retries: "int | None" = None,
+        docker_image: "str | None" = None,
+        description: "str | None" = None,
+        prefer_source: bool = False,
+    ):
+        functools.update_wrapper(self, fn)
+        self.fn = fn
+        self.name = name or fn.__name__
+        self.table = table
+        self.params = param_types(fn)
+        self.returns = return_type(fn, returns, table)
+        self.prefer_source = prefer_source
+        self.options: "dict[str, str]" = {}
+        if conda and (pip or requirements):
+            raise ValueError("pass conda or pip/requirements, not both")
+        if conda_channels and not conda:
+            raise ValueError("conda_channels requires conda")
+        if pip:
+            self.options["pip"] = ",".join(pip)
+        if pip_extra_index_urls:
+            self.options["pip_extra_index_urls"] = ",".join(pip_extra_index_urls)
+        if find_links:
+            self.options["find_links"] = ",".join(find_links)
+        if requirements:
+            self.options["requirements"] = _format_requirements(requirements)
+        if conda:
+            self.options["conda"] = ",".join(conda)
+        if conda_channels:
+            self.options["conda_channels"] = ",".join(conda_channels)
+        if env:
+            self.options["env"] = _format_env(env)
+        for key, val in [
+            ("pip_index_url", pip_index_url),
+            ("num_cpus", num_cpus),
+            ("num_gpus", num_gpus),
+            ("batch_size", batch_size),
+            ("timeout", timeout),
+            ("error_policy", error_policy),
+            ("max_skip_ratio", max_skip_ratio),
+            ("retries", retries),
+            ("docker_image", docker_image),
+        ]:
+            if val is not None:
+                self.options[key] = str(val)
+        # Keep the source in the description (when available) so the
+        # catalog stays inspectable even for pickled bodies.
+        if description is not None:
+            self.options["description"] = description
+        else:
+            try:
+                self.options["description"] = textwrap.dedent(inspect.getsource(fn))
+            except (OSError, TypeError):
+                pass
+
+    def __call__(self, *args, **kwargs):
+        """Call with real values to run locally; call with column-name
+        strings to build an expression for backfills and views, e.g.
+        ``embed("data")`` -> the expression ``embed(data)`` (a `ColumnExpr`
+        carrying the function's return type for `add_columns(computed=...)`)."""
+        if args and all(isinstance(a, str) for a in args) and not kwargs:
+            return self.expression(*args)
+        return self.fn(*args, **kwargs)
+
+    def expression(self, *columns: str) -> ColumnExpr:
+        """The expression applying this function to `columns` (default: the
+        function's own parameter names). Returns a `ColumnExpr` -- a string
+        that also carries the declared return type (and struct field types)."""
+        cols = columns or [p for p, _ in self.params]
+        expr = f"{self.name}({', '.join(cols)})"
+        field_types = None
+        if self.returns.upper().startswith("STRUCT"):
+            field_types = struct_field_types(self.returns)
+        return ColumnExpr(expr, data_type=self.returns, field_types=field_types)
+
+    def _body(self) -> "tuple[str, str]":
+        """(body literal, body_format). Source when requested and
+        retrievable; cloudpickle otherwise (handles closures)."""
+        if self.prefer_source:
+            try:
+                src = textwrap.dedent(inspect.getsource(self.fn))
+                # Strip the decorator line(s) so the stored body is a
+                # plain function definition.
+                lines = src.splitlines(keepends=True)
+                while lines and lines[0].lstrip().startswith("@"):
+                    lines.pop(0)
+                return "".join(lines), "source"
+            except (OSError, TypeError):
+                pass
+        import cloudpickle
+
+        raw = cloudpickle.dumps(self.fn)
+        return base64.b64encode(raw).decode("ascii"), "cloudpickle"
+
+    def _body_and_options(self) -> "tuple[str, dict[str, str]]":
+        """The body literal plus the finalized options (body_format /
+        python_version / cloudpickle-pip bookkeeping for a non-source
+        body)."""
+        body, body_format = self._body()
+        options = dict(self.options)
+        if body_format != "source":
+            options["body_format"] = body_format
+            # Pickled code objects only load under the same interpreter
+            # minor version; record ours so the worker can fail with a
+            # clear message instead of a bytecode error.
+            options["python_version"] = self.pickle_environment()
+            # The worker deserializes the body with cloudpickle; make sure
+            # the job's pip environment provides it. Conda bakes inject
+            # cloudpickle server-side, so do not create an invalid pip+conda
+            # declaration here.
+            if "conda" not in options:
+                pip = [d for d in options.get("pip", "").split(",") if d]
+                if not any(d.startswith("cloudpickle") for d in pip):
+                    pip.append("cloudpickle")
+                options["pip"] = ",".join(pip)
+        return body, options
+
+    def create_request(self) -> dict:
+        """Keyword arguments for `connection.create_function`."""
+        body, options = self._body_and_options()
+        return {
+            "name": self.name,
+            "language": "python",
+            "return_type": self.returns,
+            "body": body,
+            "options": options,
+        }
+
+    def create_statement(self) -> str:
+        """The equivalent `CREATE FUNCTION` SQL (for SQL-surface callers)."""
+        params = ", ".join(f"{n} {t}" for n, t in self.params)
+        body, options = self._body_and_options()
+        with_clause = ""
+        if options:
+            rendered = ", ".join(
+                f"{k} = '{_escape(v)}'" for k, v in sorted(options.items())
+            )
+            with_clause = f" WITH ({rendered})"
+        return (
+            f"CREATE FUNCTION {self.name}({params}) RETURNS {self.returns} "
+            f"LANGUAGE python AS '{_escape_body(body)}'{with_clause}"
+        )
+
+    def pickle_environment(self) -> str:
+        """Python version the body pickles under -- workers should match
+        the minor version for cloudpickle compatibility."""
+        return f"{sys.version_info.major}.{sys.version_info.minor}"
+
+
+def _escape(s: str) -> str:
+    return str(s).replace("'", "''")
+
+
+def _format_requirements(requirements: "str | list[str]") -> str:
+    if isinstance(requirements, str):
+        return requirements
+    return "\n".join(str(req) for req in requirements)
+
+
+def _format_env(env: "dict[str, str] | list[str]") -> str:
+    if isinstance(env, dict):
+        return "; ".join(f"{key}={value}" for key, value in env.items())
+    return "; ".join(str(entry) for entry in env)
+
+
+def _escape_body(body: str) -> str:
+    # The server unescapes \n / \t in single-quoted bodies; encode real
+    # newlines accordingly and escape quotes.
+    return (
+        body.replace("\\", "\\\\")
+        .replace("'", "''")
+        .replace("\n", "\\n")
+        .replace("\t", "\\t")
+    )
+
+
+def udf(fn=None, **kwargs):
+    """Decorate a function as a scalar (or struct-returning) UDF.
+
+    @udf
+    def doubled(val: int) -> float: ...
+
+    @udf(pip=["torch>=2"], num_gpus=1)
+    def embed(body: str) -> list[float]: ...
+    """
+    if fn is not None:
+        return Udf(fn, **kwargs)
+    return lambda f: Udf(f, **kwargs)
+
+
+def table_udf(fn=None, **kwargs):
+    """Decorate a table function (UDTF): each input row may emit zero or
+    more output rows. Only usable in materialized views.
+
+        class Chunk(TypedDict):
+            chunk: str
+            chunk_idx: int
+
+        @table_udf
+        def chunker(body: str) -> list[Chunk]: ...
+    """
+    kwargs["table"] = True
+    if fn is not None:
+        return Udf(fn, **kwargs)
+    return lambda f: Udf(f, **kwargs)
+
+
+# -- view / job handles (thin references over a connection) -------------
+
+
+def struct_field_types(returns: str) -> "list[str]":
+    """Field type strings of a STRUCT(...) SQL type, in declared order."""
+    inner = returns.strip()[len("STRUCT(") : -1]
+    fields, depth, start = [], 0, 0
+    for i, c in enumerate(inner):
+        if c in "([":
+            depth += 1
+        elif c in ")]":
+            depth -= 1
+        elif c == "," and depth == 0:
+            fields.append(inner[start:i].strip())
+            start = i + 1
+    fields.append(inner[start:].strip())
+    # Each field is "name TYPE"; drop the name.
+    return [f.split(None, 1)[1] for f in fields]
+
+
+def build_view_query(source, select) -> str:
+    """Assemble a view SELECT from a source (name or table) and select
+    items: a column name, an expression string, a (alias, expression)
+    tuple, or a @udf/@table_udf object."""
+    src = source.name if hasattr(source, "name") else source
+    items = []
+    for item in select:
+        if isinstance(item, Udf):
+            items.append(item.expression())
+        elif isinstance(item, tuple):
+            alias, expr = item
+            expr = expr.expression() if isinstance(expr, Udf) else expr
+            items.append(f"{expr} AS {alias}")
+        else:
+            items.append(item)
+    return f"SELECT {', '.join(items)} FROM {src}"
+
+
+def _job_id_matches(handle_id: str, listed_id: str) -> bool:
+    # The refresh/backfill endpoints return the submission id (a uuid), but
+    # the agent names the manifest job "<table>-<type>-<first 8 of the
+    # submission id>" -- which is what list_jobs and cancel report. Match the
+    # canonical id directly, or by that submission prefix.
+    if listed_id == handle_id:
+        return True
+    prefix = handle_id[:8]
+    return len(prefix) >= 4 and prefix in listed_id
+
+
+class MaterializedView:
+    """A reference to a materialized view (name + connection). Operations are
+    server-backed connection calls bound to the name.
+
+    ``create_materialized_view`` returns one of these; ``job_id`` is the
+    initial-population job (None when the view was created with no data), so
+    ``db.create_materialized_view(...).wait()`` blocks until it is populated.
+    """
+
+    def __init__(self, conn, name: str, job_id: "str | None" = None):
+        self.conn = conn
+        self.name = name
+        #: initial-population job id from create, or None (with_no_data).
+        self.job_id = job_id
+
+    def wait(self, timeout: float = 3600.0, poll: float = 2.0) -> str:
+        """Block until the initial-population job (from create) finishes.
+        A no-op when the view was created with no data."""
+        if self.job_id is None:
+            return "finished"
+        return JobHandle(self.conn, self.job_id, table=self.name).wait(
+            timeout=timeout, poll=poll
+        )
+
+    def refresh(self, full: bool = False) -> "JobHandle":
+        """Refresh the materialized view; returns a `JobHandle` to wait on,
+        poll, or cancel (``view.refresh().wait()``).
+
+        ``full=True`` forces a full rebuild (recompute and replace every row)
+        instead of the default incremental refresh. A full rebuild preserves
+        the view's indexes -- they are reindexed by the distributed indexer.
+        """
+        job_id = self.conn._refresh_materialized_view(self.name, full=full)
+        return JobHandle(self.conn, job_id, table=self.name)
+
+    def explain_refresh(self, full: bool = False):
+        """Plan a refresh without running it (EXPLAIN REFRESH)."""
+        return self.conn.explain_refresh_materialized_view(self.name, full=full)
+
+    def alter(self, auto_refresh: bool) -> None:
+        self.conn.alter_materialized_view(self.name, auto_refresh=auto_refresh)
+
+    def drop(self) -> None:
+        self.conn.drop_materialized_view(self.name)
+
+    # A materialized view is a first-class table: it can be indexed and
+    # searched like any other. These open the materialized dataset by name and
+    # delegate. Indexes declared this way are recorded against the view, so the
+    # engine re-applies them after a full refresh rebuilds the dataset (a full
+    # refresh overwrites the dataset, which would otherwise drop its indices).
+    def _table(self):
+        return self.conn.open_table(self.name)
+
+    def create_index(self, *args, **kwargs):
+        """Build an index on the materialized view (see Table.create_index)."""
+        return self._table().create_index(*args, **kwargs)
+
+    def create_scalar_index(self, *args, **kwargs):
+        """Build a scalar index on the materialized view."""
+        return self._table().create_scalar_index(*args, **kwargs)
+
+    def create_fts_index(self, *args, **kwargs):
+        """Build a full-text-search index on the materialized view."""
+        return self._table().create_fts_index(*args, **kwargs)
+
+    def search(self, *args, **kwargs):
+        """Search the materialized view (vector / FTS / hybrid)."""
+        return self._table().search(*args, **kwargs)
+
+    def lineage(self, column=None, *, direction=None, depth=None):
+        """Lineage of the materialized view (or one of its columns). Delegates
+        to the backing table; the server already includes the view's sources
+        and downstream dependents. Returns a `Lineage`."""
+        return self._table().lineage(column, direction=direction, depth=depth)
+
+
+_PROGRESS = re.compile(r"(\d+)/(\d+)")
+
+
+class JobFailedError(RuntimeError):
+    """Raised by ``JobHandle.wait()`` when the server reports the job ``failed``.
+
+    Carries the server-side error so a doomed backfill (e.g. a multi-column
+    ``REFRESH COLUMN`` of a scalar UDF) surfaces its real cause promptly,
+    instead of the caller blocking until ``wait()``'s timeout.
+    """
+
+    def __init__(self, job_id: str, error: "str | None"):
+        self.job_id = job_id
+        self.error = error
+        super().__init__(f"job {job_id} failed: {error or 'unknown error'}")
+
+
+class JobHandle:
+    """A reference to an inflight server-side job, with polling helpers."""
+
+    #: How long an unseen job is treated as still materializing (submission
+    #: -> agent cycle -> manifest write is async).
+    GRACE_SECONDS = 20.0
+
+    def __init__(self, conn, job_id: str, table: "str | None" = None):
+        self.conn = conn
+        self.id = job_id
+        #: The job's table, when known (refresh_column / MV refresh). Lets the
+        #: server resolve this job with an O(1) single-node read; without it the
+        #: lookup scans the database's active jobs (still correct).
+        self.table = table
+        self._created = time.monotonic()
+        self._seen = False
+
+    def _job(self):
+        # Poll by id (one job), not list_jobs (every active job): the server
+        # matches the submission/manifest id and reads just this table's node.
+        return self.conn.get_job(self.id, self.table)
+
+    def status(self) -> str:
+        """pending / running / cancelling / stale / failed, or 'finished' once
+        the job has left the inflight listing."""
+        job = self._job()
+        if job is not None:
+            self._seen = True
+            return job.state
+        if not self._seen and time.monotonic() - self._created < self.GRACE_SECONDS:
+            return "pending"
+        return "finished"
+
+    def progress(self) -> "tuple[int, int] | None":
+        """(units_done, units_total) while running, else None."""
+        job = self._job()
+        if job is not None and job.units_total is not None:
+            return job.units_done or 0, job.units_total
+        return None
+
+    def wait(self, timeout: float = 3600.0, poll: float = 2.0) -> str:
+        deadline = time.monotonic() + timeout
+        while time.monotonic() < deadline:
+            state = self.status()
+            if state in ("finished", "stale"):
+                return state
+            if state == "failed":
+                # Terminal failure -- surface the server error now, don't block
+                # until `timeout`. `finalize` wrote it to the job's status node.
+                job = self._job()
+                raise JobFailedError(self.id, job.error if job is not None else None)
+            if state == "pending":
+                time.sleep(min(poll, 0.5))
+                continue
+            job = self._job()
+            if job is not None and job.committed:
+                return "finished"
+            time.sleep(poll)
+        raise TimeoutError(f"job {self.id} still {self.status()} after {timeout}s")
+
+    def cancel(self) -> None:
+        # Cancel by the canonical manifest id (what cancel matches), found
+        # via the submission prefix; fall back to the raw id.
+        job = self._job()
+        self.conn.cancel_job(job.job_id if job is not None else self.id)
+
+
+class AsyncMaterializedView:
+    """Async reference to a materialized view (name + async connection)."""
+
+    def __init__(self, conn, name: str, job_id: "str | None" = None):
+        self.conn = conn
+        self.name = name
+        #: initial-population job id from create, or None (with_no_data).
+        self.job_id = job_id
+
+    async def wait(self, timeout: float = 3600.0, poll: float = 2.0) -> str:
+        """Block until the initial-population job (from create) finishes.
+        A no-op when the view was created with no data."""
+        if self.job_id is None:
+            return "finished"
+        return await AsyncJobHandle(self.conn, self.job_id, table=self.name).wait(
+            timeout=timeout, poll=poll
+        )
+
+    async def refresh(self, full: bool = False) -> "AsyncJobHandle":
+        """Refresh the materialized view; returns an `AsyncJobHandle` to wait
+        on, poll, or cancel.
+
+        ``full=True`` forces a full rebuild instead of an incremental refresh
+        (indexes are preserved and reindexed by the distributed indexer).
+        """
+        job_id = await self.conn._refresh_materialized_view(self.name, full=full)
+        return AsyncJobHandle(self.conn, job_id, table=self.name)
+
+    async def explain_refresh(self, full: bool = False):
+        return await self.conn.explain_refresh_materialized_view(self.name, full=full)
+
+    async def alter(self, auto_refresh: bool) -> None:
+        await self.conn.alter_materialized_view(self.name, auto_refresh=auto_refresh)
+
+    async def drop(self) -> None:
+        await self.conn.drop_materialized_view(self.name)
+
+    async def lineage(self, column=None, *, direction=None, depth=None):
+        """Lineage of the materialized view (or column). Returns a `Lineage`."""
+        return await self.conn.lineage(
+            self.name, column, direction=direction, depth=depth
+        )
+
+
+class AsyncJobHandle:
+    """Async reference to an inflight server-side job, with polling helpers."""
+
+    GRACE_SECONDS = 20.0
+
+    def __init__(self, conn, job_id: str, table: "str | None" = None):
+        self.conn = conn
+        self.id = job_id
+        #: See JobHandle.table -- enables an O(1) by-id lookup when known.
+        self.table = table
+        self._created = time.monotonic()
+        self._seen = False
+
+    async def _job(self):
+        # Poll by id, not list_jobs (see JobHandle._job).
+        return await self.conn.get_job(self.id, self.table)
+
+    async def status(self) -> str:
+        job = await self._job()
+        if job is not None:
+            self._seen = True
+            return job.state
+        if not self._seen and time.monotonic() - self._created < self.GRACE_SECONDS:
+            return "pending"
+        return "finished"
+
+    async def progress(self) -> "tuple[int, int] | None":
+        job = await self._job()
+        if job is not None and job.units_total is not None:
+            return job.units_done or 0, job.units_total
+        return None
+
+    async def wait(self, timeout: float = 3600.0, poll: float = 2.0) -> str:
+        deadline = time.monotonic() + timeout
+        while time.monotonic() < deadline:
+            state = await self.status()
+            if state in ("finished", "stale"):
+                return state
+            if state == "failed":
+                # Terminal failure -- surface the server error now, don't block
+                # until `timeout`. `finalize` wrote it to the job's status node.
+                job = await self._job()
+                raise JobFailedError(self.id, job.error if job is not None else None)
+            if state == "pending":
+                await asyncio.sleep(min(poll, 0.5))
+                continue
+            job = await self._job()
+            if job is not None and job.committed:
+                return "finished"
+            await asyncio.sleep(poll)
+        raise TimeoutError(
+            f"job {self.id} still {await self.status()} after {timeout}s"
+        )
+
+    async def cancel(self) -> None:
+        job = await self._job()
+        await self.conn.cancel_job(job.job_id if job is not None else self.id)
--- a/python/python/tests/test_job_handle.py
+++ b/python/python/tests/test_job_handle.py
@@ -0,0 +1,92 @@
+# SPDX-License-Identifier: Apache-2.0
+# SPDX-FileCopyrightText: Copyright The LanceDB Authors
+"""JobHandle.wait() terminal-state handling.
+
+Regression coverage for the cluster backfill-failure hang: the server reports a
+doomed job as ``state="failed"`` within seconds, but ``wait()`` used to ignore
+``failed`` and block until its (default 3600s) timeout. These tests pin that a
+``failed`` job raises ``JobFailedError`` promptly, carrying the server error.
+"""
+
+import asyncio
+import time
+
+import pytest
+
+from lancedb.udf import JobHandle, AsyncJobHandle, JobFailedError
+
+
+class FakeJobInfo:
+    """Mirror of the pyo3 builtins.JobInfo fields wait()/status() read."""
+
+    def __init__(self, state, error=None, committed=False, units_total=None):
+        self.state = state
+        self.error = error
+        self.committed = committed
+        self.units_total = units_total
+        self.units_done = None
+        self.job_id = "job-1"
+
+
+class FakeConn:
+    """get_job() walks a scripted list of JobInfo (or None) snapshots, holding
+    the last one once exhausted, so wait() polls a deterministic timeline."""
+
+    def __init__(self, snapshots):
+        self._snaps = list(snapshots)
+        self.calls = 0
+
+    def get_job(self, job_id, table=None):
+        snap = self._snaps[min(self.calls, len(self._snaps) - 1)]
+        self.calls += 1
+        return snap
+
+
+class AsyncFakeConn(FakeConn):
+    async def get_job(self, job_id, table=None):
+        return FakeConn.get_job(self, job_id, table)
+
+
+def test_wait_raises_on_failed_promptly():
+    # pending -> failed: wait() must raise the server error, not TimeoutError.
+    conn = FakeConn(
+        [None, FakeJobInfo("failed", error="multi-column backfill needs a STRUCT")]
+    )
+    jh = JobHandle(conn, "job-1", table="t")
+    t0 = time.monotonic()
+    with pytest.raises(JobFailedError) as exc:
+        jh.wait(timeout=30, poll=0.01)
+    assert time.monotonic() - t0 < 5  # prompt, nowhere near the 30s timeout
+    assert "STRUCT" in str(exc.value)
+    assert exc.value.error == "multi-column backfill needs a STRUCT"
+    assert exc.value.job_id == "job-1"
+
+
+def test_wait_returns_finished_on_success():
+    # running -> finished (job left the inflight listing) returns normally.
+    conn = FakeConn([FakeJobInfo("running", units_total=2), None])
+    jh = JobHandle(conn, "job-1", table="t")
+    jh._seen = True  # already observed, so a None now means "finished" not grace
+    assert jh.wait(timeout=30, poll=0.01) == "finished"
+
+
+def test_wait_returns_finished_on_committed():
+    # A committed job that is still listed resolves to finished.
+    conn = FakeConn([FakeJobInfo("running", committed=True, units_total=2)])
+    jh = JobHandle(conn, "job-1", table="t")
+    jh._seen = True
+    assert jh.wait(timeout=30, poll=0.01) == "finished"
+
+
+def test_async_wait_raises_on_failed_promptly():
+    conn = AsyncFakeConn([None, FakeJobInfo("failed", error="boom")])
+    jh = AsyncJobHandle(conn, "job-1", table="t")
+
+    async def run():
+        t0 = time.monotonic()
+        with pytest.raises(JobFailedError) as exc:
+            await jh.wait(timeout=30, poll=0.01)
+        assert time.monotonic() - t0 < 5
+        assert exc.value.error == "boom"
+
+    asyncio.run(run())
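
Reviewer sketch (not part of the patch): the `jh._seen = True` lines in the success tests depend on how `status()` interprets a missing job. The pure function below restates that decision under stated assumptions: `resolve_status` and its parameters are hypothetical names, and `grace_seconds` stands in for `GRACE_SECONDS`, whose actual value is not shown in this diff. Unlike the real method, it does not update `_seen` itself.

```python
from typing import Optional


def resolve_status(
    job_state: Optional[str],
    seen: bool,
    age_seconds: float,
    grace_seconds: float,
) -> str:
    """Map one poll result to a status string.

    job_state is the server-reported state, or None when the job is not listed.
    seen says whether any earlier poll found the job.
    age_seconds is the time since the handle was created.
    """
    if job_state is not None:
        # The server knows the job: report its state unchanged.
        return job_state
    if not seen and age_seconds < grace_seconds:
        # Never observed and still young: it may not be registered yet.
        return "pending"
    # Previously observed, or past the grace window: treat as done.
    return "finished"


if __name__ == "__main__":
    # grace_seconds=5.0 is an illustrative value, not the library's constant.
    for args in [(None, False, 0.1), (None, True, 0.1), (None, False, 10.0)]:
        print(args, "->", resolve_status(*args, grace_seconds=5.0))
```

This is why the success tests set `_seen` before calling `wait()`: without it, the `None` snapshot would fall inside the grace window and read as "pending".
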
--- a/python/src/connection.rs
+++ b/python/src/connection.rs
@@ -18,7 +18,10 @@ use lancedb::{
    connection::Connection as LanceConnection,
    connection::NamespaceClientPushdownOperation,
    database::namespace::LanceNamespaceDatabase,
-    database::{CreateTableMode, Database, ReadConsistency},
+    database::{
+        CreateFunctionRequest, CreateMaterializedViewRequest, CreateTableMode, Database,
+        ReadConsistency, RefreshMaterializedViewRequest, TableLineageRequest,
+    },
 };
 use pyo3::{
    Bound, FromPyObject, Py, PyAny, PyRef, PyResult, Python,
@@ -27,6 +30,92 @@ use pyo3::{
    types::{PyDict, PyDictMethods},
 };

+/// A registered function, as returned by `list_functions`.
+#[pyclass(get_all)]
+#[derive(Clone)]
+pub struct FunctionInfo {
+    pub name: String,
+    pub language: String,
+    pub return_type: String,
+    pub description: String,
+}
+
+/// A registered materialized view definition.
+#[pyclass(get_all)]
+#[derive(Clone)]
+pub struct MaterializedViewInfo {
+    pub name: String,
+    pub source_table: String,
+    pub projection: Vec<String>,
+    pub udf_columns: Vec<String>,
+    pub filter: Option<String>,
+    pub auto_refresh: bool,
+}
+
+/// One inflight server-side job.
+#[pyclass(get_all)]
+#[derive(Clone)]
+pub struct JobInfo {
+    pub table: String,
+    pub job_id: String,
+    pub job_type: String,
+    pub state: String,
+    pub column: Option<String>,
+    pub age_seconds: Option<i64>,
+    pub command: Option<String>,
+    pub units_done: Option<i64>,
+    pub units_total: Option<i64>,
+    pub committed: bool,
+    pub rows_skipped: u64,
+    pub error: Option<String>,
+}
+
+/// One durable, completed/terminal server-side job record (SHOW JOB HISTORY).
+#[pyclass(get_all)]
+#[derive(Clone)]
+pub struct JobHistoryEntry {
+    pub table: String,
+    pub job_id: String,
+    pub job_type: String,
+    pub state: String,
+    pub column: Option<String>,
+    pub created_ms: i64,
+    pub updated_ms: i64,
+    pub completed_ms: Option<i64>,
+    pub rows_processed: Option<i64>,
+    pub rows_skipped: Option<i64>,
+    pub error: Option<String>,
+    pub events: Option<String>,
+}
+
+/// One per-row UDF error recorded by `error_policy=skip` (SHOW ERRORS).
+#[pyclass(get_all)]
+#[derive(Clone)]
+pub struct JobErrorEntry {
+    pub job_id: String,
+    pub table: String,
+    pub column: String,
+    pub error_type: String,
+    pub error_message: String,
+    pub fragment_id: Option<i64>,
+    pub source_row_id: Option<i64>,
+    pub table_version: Option<i64>,
+    pub age_seconds: Option<i64>,
+}
+
+/// The plan a REFRESH MATERIALIZED VIEW would execute (EXPLAIN REFRESH).
+#[pyclass(get_all)]
+#[derive(Clone)]
+pub struct MvRefreshPlan {
+    pub table_name: String,
+    pub has_work: bool,
+    pub source_version: u64,
+    pub last_refreshed_version: Option<u64>,
+    pub full_refresh: bool,
+    pub rebuild: bool,
+    pub units_total: u64,
+}
+
 #[pyclass]
 pub struct Connection {
    inner: Option<LanceConnection>,
@@ -310,6 +399,308 @@ impl Connection {
        })
    }

+    #[pyo3(signature = (name, language, return_type, body, options=None))]
+    pub fn create_function(
+        self_: PyRef<'_, Self>,
+        name: String,
+        language: String,
+        return_type: String,
+        body: String,
+        options: Option<HashMap<String, String>>,
+    ) -> PyResult<Bound<'_, PyAny>> {
+        let inner = self_.get_inner()?.clone();
+        future_into_py(self_.py(), async move {
+            inner
+                .create_function(CreateFunctionRequest {
+                    name,
+                    language,
+                    return_type,
+                    body,
+                    options: options.unwrap_or_default(),
+                })
+                .await
+                .infer_error()
+        })
+    }
+
+    pub fn list_functions(self_: PyRef<'_, Self>) -> PyResult<Bound<'_, PyAny>> {
+        let inner = self_.get_inner()?.clone();
+        future_into_py(self_.py(), async move {
+            let functions = inner.list_functions().await.infer_error()?;
+            Ok(functions
+                .into_iter()
+                .map(|f| FunctionInfo {
+                    name: f.name,
+                    language: f.language,
+                    return_type: f.return_type,
+                    description: f.description,
+                })
+                .collect::<Vec<_>>())
+        })
+    }
+
+    pub fn drop_function(self_: PyRef<'_, Self>, name: String) -> PyResult<Bound<'_, PyAny>> {
+        let inner = self_.get_inner()?.clone();
+        future_into_py(self_.py(), async move {
+            inner.drop_function(&name).await.infer_error()
+        })
+    }
+
+    #[pyo3(signature = (name, query, auto_refresh=false, with_no_data=false, partition_by=None))]
+    pub fn create_materialized_view(
+        self_: PyRef<'_, Self>,
+        name: String,
+        query: String,
+        auto_refresh: bool,
+        with_no_data: bool,
+        partition_by: Option<String>,
+    ) -> PyResult<Bound<'_, PyAny>> {
+        let inner = self_.get_inner()?.clone();
+        future_into_py(self_.py(), async move {
+            inner
+                .create_materialized_view(CreateMaterializedViewRequest {
+                    name,
+                    query,
+                    auto_refresh,
+                    with_no_data,
+                    partition_by,
+                })
+                .await
+                .infer_error()
+        })
+    }
+
+    #[pyo3(signature = (name, full=false, src_version=None, num_workers=None, max_workers=None))]
+    pub fn refresh_materialized_view(
+        self_: PyRef<'_, Self>,
+        name: String,
+        full: bool,
+        src_version: Option<u64>,
+        num_workers: Option<u32>,
+        max_workers: Option<u32>,
+    ) -> PyResult<Bound<'_, PyAny>> {
+        let inner = self_.get_inner()?.clone();
+        future_into_py(self_.py(), async move {
+            inner
+                .refresh_materialized_view(RefreshMaterializedViewRequest {
+                    name,
+                    full,
+                    src_version,
+                    num_workers,
+                    max_workers,
+                })
+                .await
+                .infer_error()
+        })
+    }
+
+    /// Derived-compute lineage of a table/view (or column), returned as the
+    /// server's lineage JSON string (the Python layer parses it).
+    pub fn table_lineage(
+        self_: PyRef<'_, Self>,
+        name: String,
+        column: Option<String>,
+        direction: Option<String>,
+        depth: Option<u32>,
+    ) -> PyResult<Bound<'_, PyAny>> {
+        let inner = self_.get_inner()?.clone();
+        future_into_py(self_.py(), async move {
+            inner
+                .table_lineage(TableLineageRequest {
+                    name,
+                    column,
+                    direction,
+                    depth,
+                })
+                .await
+                .infer_error()
+        })
+    }
+
+    #[pyo3(signature = (name, full=false, src_version=None))]
+    pub fn explain_refresh_materialized_view(
+        self_: PyRef<'_, Self>,
+        name: String,
+        full: bool,
+        src_version: Option<u64>,
+    ) -> PyResult<Bound<'_, PyAny>> {
+        let inner = self_.get_inner()?.clone();
+        future_into_py(self_.py(), async move {
+            let p = inner
+                .explain_refresh_materialized_view(&name, full, src_version)
+                .await
+                .infer_error()?;
+            Ok(MvRefreshPlan {
+                table_name: p.table_name,
+                has_work: p.has_work,
+                source_version: p.source_version,
+                last_refreshed_version: p.last_refreshed_version,
+                full_refresh: p.full_refresh,
+                rebuild: p.rebuild,
+                units_total: p.units_total,
+            })
+        })
+    }
+
+    pub fn alter_materialized_view(
+        self_: PyRef<'_, Self>,
+        name: String,
+        auto_refresh: bool,
+    ) -> PyResult<Bound<'_, PyAny>> {
+        let inner = self_.get_inner()?.clone();
+        future_into_py(self_.py(), async move {
+            inner
+                .alter_materialized_view(&name, auto_refresh)
+                .await
+                .infer_error()
+        })
+    }
+
+    pub fn drop_materialized_view(
+        self_: PyRef<'_, Self>,
+        name: String,
+    ) -> PyResult<Bound<'_, PyAny>> {
+        let inner = self_.get_inner()?.clone();
+        future_into_py(self_.py(), async move {
+            inner.drop_materialized_view(&name).await.infer_error()
+        })
+    }
+
+    pub fn list_materialized_views(self_: PyRef<'_, Self>) -> PyResult<Bound<'_, PyAny>> {
+        let inner = self_.get_inner()?.clone();
+        future_into_py(self_.py(), async move {
+            let views = inner.list_materialized_views().await.infer_error()?;
+            Ok(views
+                .into_iter()
+                .map(|v| MaterializedViewInfo {
+                    name: v.name,
+                    source_table: v.source_table,
+                    projection: v.projection,
+                    udf_columns: v.udf_columns,
+                    filter: v.filter,
+                    auto_refresh: v.auto_refresh,
+                })
+                .collect::<Vec<_>>())
+        })
+    }
+
+    pub fn list_jobs(self_: PyRef<'_, Self>) -> PyResult<Bound<'_, PyAny>> {
+        let inner = self_.get_inner()?.clone();
+        future_into_py(self_.py(), async move {
+            let jobs = inner.list_jobs().await.infer_error()?;
+            Ok(jobs
+                .into_iter()
+                .map(|j| JobInfo {
+                    table: j.table,
+                    job_id: j.job_id,
+                    job_type: j.job_type,
+                    state: j.state,
+                    column: j.column,
+                    age_seconds: j.age_seconds,
+                    command: j.command,
+                    units_done: j.units_done,
+                    units_total: j.units_total,
+                    committed: j.committed,
+                    rows_skipped: j.rows_skipped,
+                    error: j.error,
+                })
+                .collect::<Vec<_>>())
+        })
+    }
+
+    pub fn cancel_job(self_: PyRef<'_, Self>, job_id: String) -> PyResult<Bound<'_, PyAny>> {
+        let inner = self_.get_inner()?.clone();
+        future_into_py(self_.py(), async move {
+            inner.cancel_job(&job_id).await.infer_error()
+        })
+    }
+
+    #[pyo3(signature = (job_id, table=None))]
+    pub fn get_job(
+        self_: PyRef<'_, Self>,
+        job_id: String,
+        table: Option<String>,
+    ) -> PyResult<Bound<'_, PyAny>> {
+        let inner = self_.get_inner()?.clone();
+        future_into_py(self_.py(), async move {
+            let job = inner
+                .get_job(&job_id, table.as_deref())
+                .await
+                .infer_error()?;
+            Ok(job.map(|j| JobInfo {
+                table: j.table,
+                job_id: j.job_id,
+                job_type: j.job_type,
+                state: j.state,
+                column: j.column,
+                age_seconds: j.age_seconds,
+                command: j.command,
+                units_done: j.units_done,
+                units_total: j.units_total,
+                committed: j.committed,
+                rows_skipped: j.rows_skipped,
+                error: j.error,
+            }))
+        })
+    }
+
+    #[pyo3(signature = (job_id=None))]
+    pub fn job_history(
+        self_: PyRef<'_, Self>,
+        job_id: Option<String>,
+    ) -> PyResult<Bound<'_, PyAny>> {
+        let inner = self_.get_inner()?.clone();
+        future_into_py(self_.py(), async move {
+            let rows = inner.job_history(job_id.as_deref()).await.infer_error()?;
+            Ok(rows
+                .into_iter()
+                .map(|r| JobHistoryEntry {
+                    table: r.table,
+                    job_id: r.job_id,
+                    job_type: r.job_type,
+                    state: r.state,
+                    column: r.column,
+                    created_ms: r.created_ms,
+                    updated_ms: r.updated_ms,
+                    completed_ms: r.completed_ms,
+                    rows_processed: r.rows_processed,
+                    rows_skipped: r.rows_skipped,
+                    error: r.error,
+                    events: r.events,
+                })
+                .collect::<Vec<_>>())
+        })
+    }
+
+    #[pyo3(signature = (job_id=None, table=None))]
+    pub fn errors(
+        self_: PyRef<'_, Self>,
+        job_id: Option<String>,
+        table: Option<String>,
+    ) -> PyResult<Bound<'_, PyAny>> {
+        let inner = self_.get_inner()?.clone();
+        future_into_py(self_.py(), async move {
+            let rows = inner
+                .errors(job_id.as_deref(), table.as_deref())
+                .await
+                .infer_error()?;
+            Ok(rows
+                .into_iter()
+                .map(|e| JobErrorEntry {
+                    job_id: e.job_id,
+                    table: e.table,
+                    column: e.column,
+                    error_type: e.error_type,
+                    error_message: e.error_message,
+                    fragment_id: e.fragment_id,
+                    source_row_id: e.source_row_id,
+                    table_version: e.table_version,
+                    age_seconds: e.age_seconds,
+                })
+                .collect::<Vec<_>>())
+        })
+    }
+
    #[pyo3(signature = (cur_name, new_name, cur_namespace_path=None, new_namespace_path=None))]
    pub fn rename_table(
        self_: PyRef<'_, Self>,
--- a/python/src/lib.rs
+++ b/python/src/lib.rs
@@ -41,6 +41,11 @@ pub fn _lancedb(_py: Python, m: &Bound<'_, PyModule>) -> PyResult<()> {
        .write_style("LANCEDB_LOG_STYLE");
    env_logger::init_from_env(env);
    m.add_class::<Connection>()?;
+    m.add_class::<connection::FunctionInfo>()?;
+    m.add_class::<connection::MaterializedViewInfo>()?;
+    m.add_class::<connection::JobInfo>()?;
+    m.add_class::<connection::JobHistoryEntry>()?;
+    m.add_class::<connection::JobErrorEntry>()?;
    m.add_class::<Session>()?;
    m.add_class::<Table>()?;
    m.add_class::<IndexConfig>()?;
--- a/python/src/table.rs
+++ b/python/src/table.rs
@@ -17,8 +17,8 @@ use arrow::{
    pyarrow::{FromPyArrow, PyArrowType, ToPyArrow},
 };
 use lancedb::table::{
-    AddDataMode, ColumnAlteration, Duration, FieldMetadataUpdate, NewColumnTransform,
-    OptimizeAction, OptimizeOptions, Ref, Table as LanceDbTable,
+    AddDataMode, ColumnAlteration, Duration, FieldMetadataUpdate, LoadColumnsRequest,
+    NewColumnTransform, OptimizeAction, OptimizeOptions, Ref, Table as LanceDbTable,
 };
 use pyo3::{
    Bound, FromPyObject, Py, PyAny, PyRef, PyResult, Python,
@@ -1060,6 +1060,83 @@ impl Table {
        })
    }

+    pub fn add_computed_columns(
+        self_: PyRef<'_, Self>,
+        columns: Vec<(String, String)>,
+        expression: String,
+    ) -> PyResult<Bound<'_, PyAny>> {
+        let inner = self_.inner_ref()?.clone();
+        future_into_py(self_.py(), async move {
+            inner
+                .add_computed_columns(&columns, &expression)
+                .await
+                .infer_error()
+        })
+    }
+
+    #[pyo3(signature = (columns, where_clause=None, num_workers=None, max_workers=None, batch_size=None, priority=None))]
+    pub fn refresh_column(
+        self_: PyRef<'_, Self>,
+        columns: Vec<String>,
+        where_clause: Option<String>,
+        num_workers: Option<u32>,
+        max_workers: Option<u32>,
+        batch_size: Option<u32>,
+        priority: Option<String>,
+    ) -> PyResult<Bound<'_, PyAny>> {
+        let inner = self_.inner_ref()?.clone();
+        future_into_py(self_.py(), async move {
+            inner
+                .refresh_column(
+                    &columns,
+                    where_clause,
+                    num_workers,
+                    max_workers,
+                    batch_size,
+                    priority,
+                )
+                .await
+                .infer_error()
+        })
+    }
+
+    #[allow(clippy::too_many_arguments)]
+    #[pyo3(signature = (source_uris, source_format, target_key, columns, source_key=None, source_storage_options=None, on_missing=None, num_workers=None, max_workers=None, batch_size=None, commit_granularity=None, priority=None))]
+    pub fn load_columns(
+        self_: PyRef<'_, Self>,
+        source_uris: Vec<String>,
+        source_format: String,
+        target_key: String,
+        columns: Vec<(String, Option<String>)>,
+        source_key: Option<String>,
+        source_storage_options: Option<std::collections::HashMap<String, String>>,
+        on_missing: Option<String>,
+        num_workers: Option<u32>,
+        max_workers: Option<u32>,
+        batch_size: Option<u32>,
+        commit_granularity: Option<u32>,
+        priority: Option<String>,
+    ) -> PyResult<Bound<'_, PyAny>> {
+        let inner = self_.inner_ref()?.clone();
+        let request = LoadColumnsRequest {
+            source_uris,
+            source_format,
+            source_storage_options,
+            target_key,
+            source_key,
+            columns,
+            on_missing,
+            num_workers,
+            max_workers,
+            batch_size,
+            commit_granularity,
+            priority,
+        };
+        future_into_py(self_.py(), async move {
+            inner.load_columns(request).await.infer_error()
+        })
+    }
+
    pub fn add_columns(
        self_: PyRef<'_, Self>,
        definitions: Vec<(String, String)>,
--- a/rust/lancedb/Cargo.toml
+++ b/rust/lancedb/Cargo.toml
@@ -166,10 +166,6 @@ required-features = ["bedrock"]
 [[example]]
 name = "simple"

-[[example]]
-name = "polars"
-required-features = ["polars"]
-
 [[example]]
 name = "full_text_search"

--- a/rust/lancedb/examples/polars.rs
+++ b/rust/lancedb/examples/polars.rs
@@ -1,47 +0,0 @@
-// SPDX-License-Identifier: Apache-2.0
-// SPDX-FileCopyrightText: Copyright The LanceDB Authors
-
-//! This example demonstrates ingesting a Polars DataFrame into LanceDB and
-//! reading it back out as a Polars DataFrame.
-
-use lancedb::arrow::IntoPolars;
-use lancedb::query::ExecutableQuery;
-use lancedb::{Result, connect};
-use polars::prelude::{DataFrame, NamedFrom, Series};
-
-fn make_dataframe() -> DataFrame {
-    let ids = Series::new("id", &[1i32, 2, 3, 4, 5]);
-    let names = Series::new("name", &["Alice", "Bob", "Carol", "Dave", "Eve"]);
-    let scores = Series::new("score", &[9.5f64, 8.1, 7.3, 9.0, 6.5]);
-    DataFrame::new(vec![ids, names, scores]).unwrap()
-}
-
-#[tokio::main]
-async fn main() -> Result<()> {
-    let tmp = tempfile::tempdir().unwrap();
-    let db = connect(tmp.path().to_str().unwrap()).execute().await?;
-
-    // Ingest a Polars DataFrame directly — DataFrame now implements Scannable.
-    let df = make_dataframe();
-    println!("Input DataFrame:\n{df}");
-
-    let table = db.create_table("people", df).execute().await?;
-
-    // Append more rows.
-    let more = DataFrame::new(vec![
-        Series::new("id", &[6i32, 7]),
-        Series::new("name", &["Frank", "Grace"]),
-        Series::new("score", &[7.8f64, 8.9]),
-    ])
-    .unwrap();
-    table.add(more).execute().await?;
-
-    // Read back as a Polars DataFrame.
-    let result_df = table.query().execute().await?.into_polars().await?;
-
-    println!(
-        "\nRound-tripped DataFrame ({} rows):\n{result_df}",
-        result_df.height()
-    );
-    Ok(())
-}
--- a/rust/lancedb/src/arrow.rs
+++ b/rust/lancedb/src/arrow.rs
@@ -112,14 +112,54 @@ impl<S: Stream<Item = Result<arrow_array::RecordBatch>>> RecordBatchStream

 /// A trait for converting incoming data to Arrow
 ///
+/// Integrations should implement this trait to allow data to be
+/// imported directly from the integration. For example, implementing
+/// this trait for `Vec<Vec<...>>` would allow the `Vec` to be directly
+/// used in methods like [`crate::connection::Connection::create_table`]
+/// or [`crate::table::Table::add`].
+pub trait IntoArrow {
+    /// Convert the data into an iterator of Arrow batches
+    fn into_arrow(self) -> Result<Box<dyn arrow_array::RecordBatchReader + Send>>;
+}
+
 pub type BoxedRecordBatchReader = Box<dyn arrow_array::RecordBatchReader + Send>;

+impl<T: arrow_array::RecordBatchReader + Send + 'static> IntoArrow for T {
+    fn into_arrow(self) -> Result<Box<dyn arrow_array::RecordBatchReader + Send>> {
+        Ok(Box::new(self))
+    }
+}
+
+/// A trait for converting incoming data to Arrow asynchronously
+///
+/// Serves the same purpose as [`IntoArrow`], but for asynchronous data.
+///
+/// Note: Arrow has no async `RecordBatchReader`, so a [`SendableRecordBatchStream`] is used.
+pub trait IntoArrowStream {
+    /// Convert the data into a stream of Arrow batches
+    fn into_arrow(self) -> Result<SendableRecordBatchStream>;
+}
+
 impl<S: Stream<Item = Result<arrow_array::RecordBatch>>> SimpleRecordBatchStream<S> {
    pub fn new(stream: S, schema: Arc<arrow_schema::Schema>) -> Self {
        Self { schema, stream }
    }
 }

+impl IntoArrowStream for SendableRecordBatchStream {
+    fn into_arrow(self) -> Result<SendableRecordBatchStream> {
+        Ok(self)
+    }
+}
+
+impl IntoArrowStream for datafusion_physical_plan::SendableRecordBatchStream {
+    fn into_arrow(self) -> Result<SendableRecordBatchStream> {
+        let schema = self.schema();
+        let stream = self.map_err(|df_err| df_err.into());
+        Ok(Box::pin(SimpleRecordBatchStream::new(stream, schema)))
+    }
+}
+
 pub trait LanceDbDatagenExt {
    fn into_ldb_stream(
        self,
@@ -224,7 +264,9 @@ impl IntoPolars for SendableRecordBatchStream {
 #[cfg(all(test, feature = "polars"))]
 mod tests {
    use super::SendableRecordBatchStream;
-    use crate::arrow::{IntoPolars, PolarsDataFrameRecordBatchReader, SimpleRecordBatchStream};
+    use crate::arrow::{
+        IntoArrow, IntoPolars, PolarsDataFrameRecordBatchReader, SimpleRecordBatchStream,
+    };
    use polars::prelude::{DataFrame, NamedFrom, Series};

    fn get_record_batch_reader_from_polars() -> Box<dyn arrow_array::RecordBatchReader + Send> {
@@ -238,7 +280,10 @@ mod tests {
        float_series = Series::new("float", &[2.0]);
        let df2 = DataFrame::new(vec![string_series, int_series, float_series]).unwrap();

-        Box::new(PolarsDataFrameRecordBatchReader::new(df1.vstack(&df2).unwrap()).unwrap())
+        PolarsDataFrameRecordBatchReader::new(df1.vstack(&df2).unwrap())
+            .unwrap()
+            .into_arrow()
+            .unwrap()
    }

    #[test]
--- a/rust/lancedb/src/connection.rs
+++ b/rust/lancedb/src/connection.rs
@@ -23,8 +23,10 @@ use crate::connection::create_table::CreateTableBuilder;
 use crate::data::scannable::Scannable;
 use crate::database::listing::ListingDatabase;
 use crate::database::{
-    CloneTableRequest, Database, DatabaseOptions, OpenTableRequest, ReadConsistency,
-    TableNamesRequest,
+    CloneTableRequest, CreateFunctionRequest, CreateMaterializedViewRequest, Database,
+    DatabaseOptions, FunctionInfo, JobErrorInfo, JobHistoryInfo, JobInfo, MaterializedViewInfo,
+    MvRefreshPlan, OpenTableRequest, ReadConsistency, RefreshMaterializedViewRequest,
+    TableLineageRequest, TableNamesRequest,
 };
 use crate::embeddings::{EmbeddingRegistry, MemoryRegistry};
 use crate::error::{Error, Result};
@@ -488,6 +490,113 @@ impl Connection {
        )
    }

+    // -- Derived compute: functions, materialized views, jobs -------------
+    // Server-backed features (LanceDB Enterprise / Cloud); local
+    // databases return NotSupported for now.
+
+    /// Register a UDF (CREATE FUNCTION).
+    pub async fn create_function(&self, request: CreateFunctionRequest) -> Result<()> {
+        self.internal.create_function(request).await
+    }
+
+    /// List registered functions (SHOW FUNCTIONS).
+    pub async fn list_functions(&self) -> Result<Vec<FunctionInfo>> {
+        self.internal.list_functions().await
+    }
+
+    /// Drop a registered function (DROP FUNCTION).
+    pub async fn drop_function(&self, name: &str) -> Result<()> {
+        self.internal.drop_function(name).await
+    }
+
+    /// Create a materialized view (CREATE MATERIALIZED VIEW). Returns
+    /// the initial-population job id, absent when `with_no_data`.
+    pub async fn create_materialized_view(
+        &self,
+        request: CreateMaterializedViewRequest,
+    ) -> Result<Option<String>> {
+        self.internal.create_materialized_view(request).await
+    }
+
+    /// Refresh a materialized view; returns the refresh job id.
+    pub async fn refresh_materialized_view(
+        &self,
+        request: RefreshMaterializedViewRequest,
+    ) -> Result<String> {
+        self.internal.refresh_materialized_view(request).await
+    }
+
+    /// Derived-compute lineage of a table/view (or column), as server-defined
+    /// JSON. Read-only.
+    pub async fn table_lineage(&self, request: TableLineageRequest) -> Result<String> {
+        self.internal.table_lineage(request).await
+    }
+
+    /// Plan a materialized-view refresh without submitting work
+    /// (EXPLAIN REFRESH).
+    pub async fn explain_refresh_materialized_view(
+        &self,
+        name: &str,
+        full: bool,
+        src_version: Option<u64>,
+    ) -> Result<MvRefreshPlan> {
+        self.internal
+            .explain_refresh_materialized_view(name, full, src_version)
+            .await
+    }
+
+    /// Update a materialized view's options (ALTER MATERIALIZED VIEW).
+    pub async fn alter_materialized_view(&self, name: &str, auto_refresh: bool) -> Result<()> {
+        self.internal
+            .alter_materialized_view(name, auto_refresh)
+            .await
+    }
+
+    /// Drop a materialized view definition (DROP MATERIALIZED VIEW).
+    pub async fn drop_materialized_view(&self, name: &str) -> Result<()> {
+        self.internal.drop_materialized_view(name).await
+    }
+
+    /// List registered materialized view definitions.
+    pub async fn list_materialized_views(&self) -> Result<Vec<MaterializedViewInfo>> {
+        self.internal.list_materialized_views().await
+    }
+
+    /// List inflight server-side jobs across the database's tables.
+    pub async fn list_jobs(&self) -> Result<Vec<JobInfo>> {
+        self.internal.list_jobs().await
+    }
+
+    /// Cancel an inflight server-side job by id. Returns true if a
+    /// matching inflight job was flagged for cancellation.
+    pub async fn cancel_job(&self, job_id: &str) -> Result<bool> {
+        self.internal.cancel_job(job_id).await
+    }
+
+    /// Look up a single server-side job by id -- the `wait()`/status poll path.
+    /// `table_hint` (the job's table) enables an O(1) server-side lookup; `None`
+    /// scans the database's active jobs. A `None` result means unknown / not
+    /// active.
+    pub async fn get_job(&self, job_id: &str, table_hint: Option<&str>) -> Result<Option<JobInfo>> {
+        self.internal.get_job(job_id, table_hint).await
+    }
+
+    /// Durable job history (SHOW JOB HISTORY) across the database's tables.
+    /// Pass `job_id` to narrow to a single job.
+    pub async fn job_history(&self, job_id: Option<&str>) -> Result<Vec<JobHistoryInfo>> {
+        self.internal.job_history(job_id).await
+    }
+
+    /// Per-row UDF errors (SHOW ERRORS) across the database's tables, optionally
+    /// filtered by `job_id` and/or `table`.
+    pub async fn errors(
+        &self,
+        job_id: Option<&str>,
+        table: Option<&str>,
+    ) -> Result<Vec<JobErrorInfo>> {
+        self.internal.errors(job_id, table).await
+    }
+
    /// Rename a table in the database.
    ///
    /// This is only supported in LanceDB Cloud.
--- a/rust/lancedb/src/data/scannable.rs
+++ b/rust/lancedb/src/data/scannable.rs
@@ -185,43 +185,6 @@ impl Scannable for SendableRecordBatchStream {
    }
 }

-#[cfg(feature = "polars")]
-impl Scannable for polars::frame::DataFrame {
-    fn schema(&self) -> SchemaRef {
-        crate::polars_arrow_convertors::convert_polars_df_schema_to_arrow_rb_schema(
-            self.schema().clone(),
-        )
-        .expect("failed to convert Polars DataFrame schema to Arrow schema")
-    }
-
-    fn scan_as_stream(&mut self) -> SendableRecordBatchStream {
-        let schema = Scannable::schema(self);
-        let batches: crate::Result<Vec<RecordBatch>> =
-            match crate::arrow::PolarsDataFrameRecordBatchReader::new(self.clone()) {
-                Err(e) => Err(e),
-                Ok(reader) => reader.map(|b| b.map_err(Into::into)).collect(),
-            };
-        match batches {
-            Err(e) => Box::pin(SimpleRecordBatchStream {
-                schema,
-                stream: once(async move { Err(e) }),
-            }),
-            Ok(batches) => {
-                let stream = futures::stream::iter(batches.into_iter().map(Ok));
-                Box::pin(SimpleRecordBatchStream { schema, stream })
-            }
-        }
-    }
-
-    fn num_rows(&self) -> Option<usize> {
-        Some(self.height())
-    }
-
-    fn rescannable(&self) -> bool {
-        true
-    }
-}
-
 #[async_trait]
 impl StreamingWriteSource for Box<dyn Scannable> {
    fn arrow_schema(&self) -> SchemaRef {
@@ -1126,60 +1089,4 @@ mod tests {
            );
        }
    }
-
-    #[cfg(feature = "polars")]
-    mod polars_tests {
-        use super::*;
-        use crate::arrow::IntoPolars;
-        use crate::query::ExecutableQuery;
-        use polars::prelude::{DataFrame, NamedFrom, Series};
-
-        fn make_df() -> DataFrame {
-            DataFrame::new(vec![
-                Series::new("id", &[1i32, 2, 3]),
-                Series::new("val", &[1.1f64, 2.2, 3.3]),
-            ])
-            .unwrap()
-        }
-
-        #[tokio::test]
-        async fn test_dataframe_scannable_round_trip() {
-            let tmp = tempfile::tempdir().unwrap();
-            let db = crate::connect(tmp.path().to_str().unwrap())
-                .execute()
-                .await
-                .unwrap();
-
-            let df = make_df();
-            let table = db.create_table("t", df.clone()).execute().await.unwrap();
-
-            // Append the same rows again.
-            table.add(df.clone()).execute().await.unwrap();
-
-            let result = table
-                .query()
-                .execute()
-                .await
-                .unwrap()
-                .into_polars()
-                .await
-                .unwrap();
-
-            assert_eq!(result.height(), df.height() * 2);
-            assert_eq!(result.schema(), df.schema());
-        }
-
-        #[tokio::test]
-        async fn test_dataframe_scannable_rescannable() {
-            let mut df = make_df();
-            assert!(df.rescannable());
-
-            let batches1: Vec<RecordBatch> = df.scan_as_stream().try_collect().await.unwrap();
-            assert_eq!(batches1.iter().map(|b| b.num_rows()).sum::<usize>(), 3);
-
-            // Can be scanned again.
-            let batches2: Vec<RecordBatch> = df.scan_as_stream().try_collect().await.unwrap();
-            assert_eq!(batches2.iter().map(|b| b.num_rows()).sum::<usize>(), 3);
-        }
-    }
 }
--- a/rust/lancedb/src/database.rs
+++ b/rust/lancedb/src/database.rs
@@ -27,7 +27,7 @@ use lance_namespace::models::{
 };

 use crate::data::scannable::Scannable;
-use crate::error::Result;
+use crate::error::{Error, Result};
 use crate::table::{BaseTable, WriteOptions};

 pub mod listing;
@@ -200,6 +200,205 @@ pub enum ReadConsistency {
    Strong,
 }

+/// A request to register a UDF (CREATE FUNCTION).
+///
+/// Functions are first-class database objects, decoupled from any
+/// column; computed columns and materialized views reference them by
+/// name. Server-backed feature (LanceDB Enterprise / Cloud).
+#[derive(Debug, Clone)]
+pub struct CreateFunctionRequest {
+    /// Function name.
+    pub name: String,
+    /// Implementation language (currently "python").
+    pub language: String,
+    /// SQL return type, e.g. `FLOAT`, `FLOAT[1536]`,
+    /// `STRUCT(a FLOAT, b VARCHAR)`, `TABLE(chunk VARCHAR, idx INT)`.
+    pub return_type: String,
+    /// Function body: source text, or base64 cloudpickle bytes when
+    /// `options["body_format"] = "cloudpickle"`.
+    pub body: String,
+    /// Options: input_columns, pip, num_gpus, batch_size, timeout,
+    /// error_policy, docker_image, body_format, ...
+    pub options: HashMap<String, String>,
+}
+
+/// A registered function, as returned by `list_functions`.
+#[derive(Debug, Clone)]
+pub struct FunctionInfo {
+    pub name: String,
+    pub language: String,
+    pub return_type: String,
+    pub description: String,
+}
+
+/// A request to create a materialized view (CREATE MATERIALIZED VIEW).
+#[derive(Debug, Clone)]
+pub struct CreateMaterializedViewRequest {
+    /// View name.
+    pub name: String,
+    /// The view's SELECT statement, e.g.
+    /// `SELECT id, embed(body) AS vec FROM articles WHERE id > 1`.
+    /// Bare columns project through; function-call columns compute via
+    /// registered UDFs (a RETURNS TABLE function makes a row-expanding
+    /// chunker view).
+    pub query: String,
+    /// Refresh automatically when the source table changes.
+    pub auto_refresh: bool,
+    /// Register the definition only; skip the initial population.
+    pub with_no_data: bool,
+    /// Optional source column to partition the view's table function on. If the
+    /// column has an IVF vector index the server partitions by its clusters
+    /// (image-dedup style); otherwise it groups by distinct value.
+    pub partition_by: Option<String>,
+}
+
+impl CreateMaterializedViewRequest {
+    pub fn new(name: impl Into<String>, query: impl Into<String>) -> Self {
+        Self {
+            name: name.into(),
+            query: query.into(),
+            auto_refresh: false,
+            with_no_data: false,
+            partition_by: None,
+        }
+    }
+}
+
+/// A request to refresh a materialized view.
+#[derive(Debug, Clone)]
+pub struct RefreshMaterializedViewRequest {
+    /// View name.
+    pub name: String,
+    /// Force a full rebuild (recompute and replace every row) instead of the
+    /// default incremental refresh.
+    pub full: bool,
+    /// Pin the refresh to a source-table version; latest when absent.
+    pub src_version: Option<u64>,
+    /// Initial worker count.
+    pub num_workers: Option<u32>,
+    /// Elastic worker ceiling.
+    pub max_workers: Option<u32>,
+}
+
+/// A request for the derived-compute lineage of a table/view (or one of its
+/// columns). The response is server-defined lineage JSON, returned opaque so
+/// this client need not model the server's lineage schema.
+#[derive(Debug, Clone, Default)]
+pub struct TableLineageRequest {
+    /// Table or view name.
+    pub name: String,
+    /// Column for column-level lineage; whole table/view when absent.
+    pub column: Option<String>,
+    /// "upstream" | "downstream" | "both" (server default when absent).
+    pub direction: Option<String>,
+    /// Column-hops to walk; transitive when absent.
+    pub depth: Option<u32>,
+}
+
+impl RefreshMaterializedViewRequest {
+    pub fn new(name: impl Into<String>) -> Self {
+        Self {
+            name: name.into(),
+            full: false,
+            src_version: None,
+            num_workers: None,
+            max_workers: None,
+        }
+    }
+}
+
+/// A registered materialized view definition, as returned by
+/// `list_materialized_views`.
+#[derive(Debug, Clone)]
+pub struct MaterializedViewInfo {
+    pub name: String,
+    pub source_table: String,
+    /// Source columns projected through.
+    pub projection: Vec<String>,
+    /// `alias=expression` per UDF-computed column.
+    pub udf_columns: Vec<String>,
+    pub filter: Option<String>,
+    pub auto_refresh: bool,
+}
+
+/// A row from `list_jobs`: one inflight server-side job (index build,
+/// compaction, column refresh, view refresh, ...).
+#[derive(Debug, Clone)]
+pub struct JobInfo {
+    pub table: String,
+    pub job_id: String,
+    pub job_type: String,
+    /// Lifecycle state, e.g. "running", "cancelling", "stale", "finished", or "failed".
+    pub state: String,
+    pub column: Option<String>,
+    pub age_seconds: Option<i64>,
+    pub command: Option<String>,
+    pub units_done: Option<i64>,
+    pub units_total: Option<i64>,
+    /// Whether the job's final commit has completed (output visible).
+    pub committed: bool,
+    pub rows_skipped: u64,
+    pub error: Option<String>,
+}
+
+/// A row from `job_history`: one durable, completed/terminal server-side job
+/// record (SHOW JOB HISTORY), read from a table's `_job_history` store. Unlike
+/// `JobInfo` (live, inflight jobs) this carries created/updated/completed
+/// timestamps and the lifecycle event log.
+#[derive(Debug, Clone)]
+pub struct JobHistoryInfo {
+    pub table: String,
+    pub job_id: String,
+    pub job_type: String,
+    pub state: String,
+    pub column: Option<String>,
+    pub created_ms: i64,
+    pub updated_ms: i64,
+    pub completed_ms: Option<i64>,
+    pub rows_processed: Option<i64>,
+    pub rows_skipped: Option<i64>,
+    pub error: Option<String>,
+    /// Newline-joined lifecycle event log, oldest first.
+    pub events: Option<String>,
+}
+
+/// A row from `errors`: one per-row UDF failure recorded by `error_policy=skip`
+/// (SHOW ERRORS).
+#[derive(Debug, Clone)]
+pub struct JobErrorInfo {
+    pub job_id: String,
+    pub table: String,
+    pub column: String,
+    pub error_type: String,
+    pub error_message: String,
+    pub fragment_id: Option<i64>,
+    pub source_row_id: Option<i64>,
+    pub table_version: Option<i64>,
+    pub age_seconds: Option<i64>,
+}
+
+/// The plan a `REFRESH MATERIALIZED VIEW` would execute, as returned by
+/// `explain_refresh_materialized_view` (EXPLAIN REFRESH). No work is run.
+#[derive(Debug, Clone)]
+pub struct MvRefreshPlan {
+    pub table_name: String,
+    /// Whether a refresh would do anything (rebuild or non-empty units).
+    pub has_work: bool,
+    pub source_version: u64,
+    pub last_refreshed_version: Option<u64>,
+    pub full_refresh: bool,
+    /// Set when the source changed non-append-only since the last refresh, forcing a rebuild.
+    pub rebuild: bool,
+    /// Number of row-range work units the refresh would process.
+    pub units_total: u64,
+}
+
+fn not_supported<T>(what: &str) -> Result<T> {
+    Err(Error::NotSupported {
+        message: format!("{what} is not supported by this database"),
+    })
+}
+
 /// The `Database` trait defines the interface for database implementations.
 ///
 /// A database is responsible for managing tables and their metadata.
@@ -245,6 +444,99 @@ pub trait Database:
    ///
    /// See [`CloneTableRequest`] for detailed documentation and examples.
    async fn clone_table(&self, request: CloneTableRequest) -> Result<Arc<dyn BaseTable>>;
+
+    // -- Derived compute: functions, materialized views, jobs -------------
+    //
+    // Server-backed features (LanceDB Enterprise / Cloud). The defaults
+    // return NotSupported; the remote database overrides them. Local
+    // single-node implementations are planned.
+
+    /// Register a UDF (CREATE FUNCTION).
+    async fn create_function(&self, _request: CreateFunctionRequest) -> Result<()> {
+        not_supported("create_function")
+    }
+    /// List registered functions (SHOW FUNCTIONS).
+    async fn list_functions(&self) -> Result<Vec<FunctionInfo>> {
+        not_supported("list_functions")
+    }
+    /// Drop a registered function (DROP FUNCTION).
+    async fn drop_function(&self, _name: &str) -> Result<()> {
+        not_supported("drop_function")
+    }
+    /// Create a materialized view (CREATE MATERIALIZED VIEW). Returns
+    /// the initial-population job id, absent when `with_no_data`.
+    async fn create_materialized_view(
+        &self,
+        _request: CreateMaterializedViewRequest,
+    ) -> Result<Option<String>> {
+        not_supported("create_materialized_view")
+    }
+    /// Refresh a materialized view; returns the refresh job id.
+    async fn refresh_materialized_view(
+        &self,
+        _request: RefreshMaterializedViewRequest,
+    ) -> Result<String> {
+        not_supported("refresh_materialized_view")
+    }
+    /// Derived-compute lineage of a table/view (or column), as server-defined
+    /// JSON. Read-only.
+    async fn table_lineage(&self, _request: TableLineageRequest) -> Result<String> {
+        not_supported("table_lineage")
+    }
+    /// Plan a materialized-view refresh without submitting work
+    /// (EXPLAIN REFRESH). `full` plans a full rebuild (incremental
+    /// planning requires stable row IDs on the source).
+    async fn explain_refresh_materialized_view(
+        &self,
+        _name: &str,
+        _full: bool,
+        _src_version: Option<u64>,
+    ) -> Result<MvRefreshPlan> {
+        not_supported("explain_refresh_materialized_view")
+    }
+    /// Update a materialized view's options (ALTER MATERIALIZED VIEW).
+    async fn alter_materialized_view(&self, _name: &str, _auto_refresh: bool) -> Result<()> {
+        not_supported("alter_materialized_view")
+    }
+    /// Drop a materialized view definition (DROP MATERIALIZED VIEW).
+    async fn drop_materialized_view(&self, _name: &str) -> Result<()> {
+        not_supported("drop_materialized_view")
+    }
+    /// List registered materialized view definitions.
+    async fn list_materialized_views(&self) -> Result<Vec<MaterializedViewInfo>> {
+        not_supported("list_materialized_views")
+    }
+    /// List inflight server-side jobs across the database's tables.
+    async fn list_jobs(&self) -> Result<Vec<JobInfo>> {
+        not_supported("list_jobs")
+    }
+    /// Cancel an inflight server-side job by id. Returns true if a
+    /// matching inflight job was found and flagged for cancellation,
+    /// false if none was inflight (best-effort, like SQL `CANCEL JOB`).
+    async fn cancel_job(&self, _job_id: &str) -> Result<bool> {
+        not_supported("cancel_job")
+    }
+    /// Point lookup of a single job by id -- the `wait()`/status poll path.
+    /// `table_hint` (the job's table, which `wait()` callers know) enables an
+    /// O(1) server-side lookup. Returns `None` if the job is unknown or not active.
+    async fn get_job(&self, _job_id: &str, _table_hint: Option<&str>) -> Result<Option<JobInfo>> {
+        not_supported("get_job")
+    }
+    /// Durable job history (SHOW JOB HISTORY) across the database's tables,
+    /// optionally narrowed to a single `job_id`.
+    async fn job_history(&self, _job_id: Option<&str>) -> Result<Vec<JobHistoryInfo>> {
+        not_supported("job_history")
+    }
+    /// Per-row UDF errors (SHOW ERRORS) recorded by `error_policy=skip` across
+    /// the database's tables, optionally filtered by `job_id` and/or `table`.
+    async fn errors(
+        &self,
+        _job_id: Option<&str>,
+        _table: Option<&str>,
+    ) -> Result<Vec<JobErrorInfo>> {
+        not_supported("errors")
+    }
+
    /// Open a table in the database
    async fn open_table(&self, request: OpenTableRequest) -> Result<Arc<dyn BaseTable>>;
    /// Rename a table in the database
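Reviewer sketch (not part of the patch): the default-method pattern used by the new `Database` methods, reduced to a synchronous, self-contained form. `Error`, `LocalDb`, and `RemoteDb` here are hypothetical stand-ins, not the crate's types, and the `cancel_job` body is invented purely to show an override.

```rust
/// Hypothetical stand-in for the crate's error type.
#[derive(Debug, PartialEq)]
enum Error {
    NotSupported { message: String },
}

type Result<T> = std::result::Result<T, Error>;

/// Same shape as the helper in the patch: build a NotSupported error for `what`.
fn not_supported<T>(what: &str) -> Result<T> {
    Err(Error::NotSupported {
        message: format!("{what} is not supported by this database"),
    })
}

/// Trimmed, synchronous stand-in for the `Database` trait.
trait Database {
    /// Default: the backend does not support functions.
    fn list_functions(&self) -> Result<Vec<String>> {
        not_supported("list_functions")
    }
    /// Default: the backend does not support jobs.
    fn cancel_job(&self, _job_id: &str) -> Result<bool> {
        not_supported("cancel_job")
    }
}

/// A backend that overrides nothing inherits every NotSupported default.
struct LocalDb;
impl Database for LocalDb {}

/// A backend that overrides only what it can serve.
struct RemoteDb {
    inflight: Vec<String>,
}

impl Database for RemoteDb {
    fn cancel_job(&self, job_id: &str) -> Result<bool> {
        // Illustrative only: report whether a matching inflight job exists.
        Ok(self.inflight.iter().any(|id| id == job_id))
    }
}
```

With this shape, adding a method to the trait does not break existing implementations; callers on an unsupported backend get a descriptive error instead of a compile failure.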
--- a/rust/lancedb/src/remote/db.rs
+++ b/rust/lancedb/src/remote/db.rs
@@ -19,8 +19,10 @@ use lance_namespace::models::{

 use crate::Error;
 use crate::database::{
-    CloneTableRequest, CreateTableMode, CreateTableRequest, Database, DatabaseOptions,
-    OpenTableRequest, ReadConsistency, TableNamesRequest,
+    CloneTableRequest, CreateFunctionRequest, CreateMaterializedViewRequest, CreateTableMode,
+    CreateTableRequest, Database, DatabaseOptions, FunctionInfo, JobErrorInfo, JobHistoryInfo,
+    JobInfo, MaterializedViewInfo, MvRefreshPlan, OpenTableRequest, ReadConsistency,
+    RefreshMaterializedViewRequest, TableLineageRequest, TableNamesRequest,
 };
 use crate::error::Result;
 use crate::remote::util::stream_as_body;
@@ -33,6 +35,248 @@ use super::client::{
 use super::table::RemoteTable;
 use super::util::parse_server_version;

+// Wire types for the derived-compute routes (functions, materialized
+// views, jobs). Field shapes mirror the server's REST contract.
+#[derive(serde::Serialize)]
+struct RemoteCreateFunctionRequest {
+    language: String,
+    return_type: String,
+    body: String,
+    options: std::collections::HashMap<String, String>,
+}
+
+#[derive(serde::Deserialize)]
+struct RemoteFunctionEntry {
+    name: String,
+    language: String,
+    return_type: String,
+    #[serde(default)]
+    description: String,
+}
+
+#[derive(serde::Deserialize)]
+struct RemoteListFunctionsResponse {
+    functions: Vec<RemoteFunctionEntry>,
+}
+
+#[derive(serde::Serialize)]
+struct RemoteCreateMaterializedViewRequest {
+    query: String,
+    auto_refresh: bool,
+    with_no_data: bool,
+    #[serde(skip_serializing_if = "Option::is_none")]
+    partition_by: Option<String>,
+}
+
+#[derive(serde::Deserialize)]
+struct RemoteCreateMaterializedViewResponse {
+    #[serde(default)]
+    job_id: Option<String>,
+}
+
+#[derive(serde::Serialize)]
+struct RemoteRefreshMaterializedViewRequest {
+    #[serde(skip_serializing_if = "std::ops::Not::not")]
+    full: bool,
+    #[serde(skip_serializing_if = "Option::is_none")]
+    src_version: Option<u64>,
+    #[serde(skip_serializing_if = "Option::is_none")]
+    num_workers: Option<u32>,
+    #[serde(skip_serializing_if = "Option::is_none")]
+    max_workers: Option<u32>,
+}
+
+#[derive(serde::Deserialize)]
+struct RemoteRefreshMaterializedViewResponse {
+    job_id: String,
+}
+
+#[derive(serde::Serialize)]
+struct RemoteExplainRefreshRequest {
+    #[serde(skip_serializing_if = "Option::is_none")]
+    full: Option<bool>,
+    #[serde(skip_serializing_if = "Option::is_none")]
+    src_version: Option<u64>,
+}
+
+#[derive(serde::Deserialize)]
+struct RemoteExplainRefreshResponse {
+    table_name: String,
+    has_work: bool,
+    source_version: u64,
+    last_refreshed_version: Option<u64>,
+    full_refresh: bool,
+    rebuild: bool,
+    units_total: u64,
+}
+
+#[derive(serde::Serialize)]
+struct RemoteAlterMaterializedViewRequest {
+    auto_refresh: bool,
+}
+
+#[derive(serde::Deserialize)]
+struct RemoteMaterializedViewEntry {
+    name: String,
+    source_table: String,
+    #[serde(default)]
+    projection: Vec<String>,
+    #[serde(default)]
+    udf_columns: Vec<String>,
+    #[serde(default)]
+    filter: Option<String>,
+    #[serde(default)]
+    auto_refresh: bool,
+}
+
+#[derive(serde::Deserialize)]
+struct RemoteListMaterializedViewsResponse {
+    views: Vec<RemoteMaterializedViewEntry>,
+}
+
+#[derive(serde::Deserialize)]
+struct RemoteJobEntry {
+    table: String,
+    job_id: String,
+    job_type: String,
+    state: String,
+    #[serde(default)]
+    column: Option<String>,
+    #[serde(default)]
+    age_seconds: Option<i64>,
+    #[serde(default)]
+    command: Option<String>,
+    #[serde(default)]
+    units_done: Option<i64>,
+    #[serde(default)]
+    units_total: Option<i64>,
+    #[serde(default)]
+    committed: bool,
+    #[serde(default)]
+    rows_skipped: u64,
+    #[serde(default)]
+    error: Option<String>,
+}
+
+#[derive(serde::Deserialize)]
+struct RemoteListJobsResponse {
+    jobs: Vec<RemoteJobEntry>,
+}
+
+#[derive(serde::Deserialize)]
+struct RemoteGetJobResponse {
+    #[serde(default)]
+    job: Option<RemoteJobEntry>,
+}
+
+#[derive(serde::Deserialize)]
+struct RemoteCancelJobResponse {
+    cancelled: bool,
+}
+
+impl From<RemoteJobEntry> for JobInfo {
+    fn from(j: RemoteJobEntry) -> Self {
+        JobInfo {
+            table: j.table,
+            job_id: j.job_id,
+            job_type: j.job_type,
+            state: j.state,
+            column: j.column,
+            age_seconds: j.age_seconds,
+            command: j.command,
+            units_done: j.units_done,
+            units_total: j.units_total,
+            committed: j.committed,
+            rows_skipped: j.rows_skipped,
+            error: j.error,
+        }
+    }
+}
+
+#[derive(serde::Deserialize)]
+struct RemoteJobHistoryEntry {
+    table: String,
+    job_id: String,
+    job_type: String,
+    state: String,
+    #[serde(default)]
+    column: Option<String>,
+    created_ms: i64,
+    updated_ms: i64,
+    #[serde(default)]
+    completed_ms: Option<i64>,
+    #[serde(default)]
+    rows_processed: Option<i64>,
+    #[serde(default)]
+    rows_skipped: Option<i64>,
+    #[serde(default)]
+    error: Option<String>,
+    #[serde(default)]
+    events: Option<String>,
+}
+
+#[derive(serde::Deserialize)]
+struct RemoteJobHistoryResponse {
+    jobs: Vec<RemoteJobHistoryEntry>,
+}
+
+impl From<RemoteJobHistoryEntry> for JobHistoryInfo {
+    fn from(j: RemoteJobHistoryEntry) -> Self {
+        JobHistoryInfo {
+            table: j.table,
+            job_id: j.job_id,
+            job_type: j.job_type,
+            state: j.state,
+            column: j.column,
+            created_ms: j.created_ms,
+            updated_ms: j.updated_ms,
+            completed_ms: j.completed_ms,
+            rows_processed: j.rows_processed,
+            rows_skipped: j.rows_skipped,
+            error: j.error,
+            events: j.events,
+        }
+    }
+}
+
+#[derive(serde::Deserialize)]
+struct RemoteErrorEntry {
+    job_id: String,
+    table: String,
+    column: String,
+    error_type: String,
+    error_message: String,
+    #[serde(default)]
+    fragment_id: Option<i64>,
+    #[serde(default)]
+    source_row_id: Option<i64>,
+    #[serde(default)]
+    table_version: Option<i64>,
+    #[serde(default)]
+    age_seconds: Option<i64>,
+}
+
+#[derive(serde::Deserialize)]
+struct RemoteErrorsResponse {
+    errors: Vec<RemoteErrorEntry>,
+}
+
+impl From<RemoteErrorEntry> for JobErrorInfo {
+    fn from(e: RemoteErrorEntry) -> Self {
+        JobErrorInfo {
+            job_id: e.job_id,
+            table: e.table,
+            column: e.column,
+            error_type: e.error_type,
+            error_message: e.error_message,
+            fragment_id: e.fragment_id,
+            source_row_id: e.source_row_id,
+            table_version: e.table_version,
+            age_seconds: e.age_seconds,
+        }
+    }
+}
+
 // Request structure for the remote clone table API
 #[derive(serde::Serialize)]
 struct RemoteCloneTableRequest {
@@ -641,6 +885,228 @@ impl<S: HttpSend> Database for RemoteDatabase<S> {
        Ok(table)
    }

+    async fn create_function(&self, request: CreateFunctionRequest) -> Result<()> {
+        let body = RemoteCreateFunctionRequest {
+            language: request.language,
+            return_type: request.return_type,
+            body: request.body,
+            options: request.options,
+        };
+        let req = self
+            .client
+            .post(&format!("/v1/function/{}/create", request.name))
+            .json(&body);
+        let (request_id, rsp) = self.client.send(req).await?;
+        self.client.check_response(&request_id, rsp).await?;
+        Ok(())
+    }
+
+    async fn list_functions(&self) -> Result<Vec<FunctionInfo>> {
+        let req = self.client.get("/v1/function/list");
+        let (request_id, rsp) = self.client.send(req).await?;
+        let rsp = self.client.check_response(&request_id, rsp).await?;
+        let body: RemoteListFunctionsResponse = rsp.json().await.err_to_http(request_id)?;
+        Ok(body
+            .functions
+            .into_iter()
+            .map(|f| FunctionInfo {
+                name: f.name,
+                language: f.language,
+                return_type: f.return_type,
+                description: f.description,
+            })
+            .collect())
+    }
+
+    async fn drop_function(&self, name: &str) -> Result<()> {
+        let req = self.client.post(&format!("/v1/function/{}/drop", name));
+        let (request_id, rsp) = self.client.send(req).await?;
+        self.client.check_response(&request_id, rsp).await?;
+        Ok(())
+    }
+
+    async fn create_materialized_view(
+        &self,
+        request: CreateMaterializedViewRequest,
+    ) -> Result<Option<String>> {
+        let body = RemoteCreateMaterializedViewRequest {
+            query: request.query,
+            auto_refresh: request.auto_refresh,
+            with_no_data: request.with_no_data,
+            partition_by: request.partition_by,
+        };
+        let req = self
+            .client
+            .post(&format!("/v1/materialized_view/{}/create", request.name))
+            .json(&body);
+        let (request_id, rsp) = self.client.send(req).await?;
+        let rsp = self.client.check_response(&request_id, rsp).await?;
+        let body: RemoteCreateMaterializedViewResponse =
+            rsp.json().await.err_to_http(request_id)?;
+        Ok(body.job_id)
+    }
+
+    async fn refresh_materialized_view(
+        &self,
+        request: RefreshMaterializedViewRequest,
+    ) -> Result<String> {
+        let body = RemoteRefreshMaterializedViewRequest {
+            full: request.full,
+            src_version: request.src_version,
+            num_workers: request.num_workers,
+            max_workers: request.max_workers,
+        };
+        let req = self
+            .client
+            .post(&format!("/v1/materialized_view/{}/refresh", request.name))
+            .json(&body);
+        let (request_id, rsp) = self.client.send(req).await?;
+        let rsp = self.client.check_response(&request_id, rsp).await?;
+        let body: RemoteRefreshMaterializedViewResponse =
+            rsp.json().await.err_to_http(request_id)?;
+        Ok(body.job_id)
+    }
+
+    async fn table_lineage(&self, request: TableLineageRequest) -> Result<String> {
+        let mut req = self
+            .client
+            .get(&format!("/v1/table/{}/lineage", request.name));
+        if let Some(column) = &request.column {
+            req = req.query(&[("column", column)]);
+        }
+        if let Some(direction) = &request.direction {
+            req = req.query(&[("direction", direction)]);
+        }
+        if let Some(depth) = request.depth {
+            req = req.query(&[("depth", depth.to_string())]);
+        }
+        let (request_id, rsp) = self.client.send(req).await?;
+        let rsp = self.client.check_response(&request_id, rsp).await?;
+        // Server-defined lineage JSON, returned opaque (the client does not
+        // model the lineage schema; the Python layer deserializes it).
+        rsp.text().await.err_to_http(request_id)
+    }
+
+    async fn explain_refresh_materialized_view(
+        &self,
+        name: &str,
+        full: bool,
+        src_version: Option<u64>,
+    ) -> Result<MvRefreshPlan> {
+        let body = RemoteExplainRefreshRequest {
+            full: Some(full),
+            src_version,
+        };
+        let req = self
+            .client
+            .post(&format!("/v1/materialized_view/{}/explain_refresh", name))
+            .json(&body);
+        let (request_id, rsp) = self.client.send(req).await?;
+        let rsp = self.client.check_response(&request_id, rsp).await?;
+        let body: RemoteExplainRefreshResponse = rsp.json().await.err_to_http(request_id)?;
+        Ok(MvRefreshPlan {
+            table_name: body.table_name,
+            has_work: body.has_work,
+            source_version: body.source_version,
+            last_refreshed_version: body.last_refreshed_version,
+            full_refresh: body.full_refresh,
+            rebuild: body.rebuild,
+            units_total: body.units_total,
+        })
+    }
+
+    async fn alter_materialized_view(&self, name: &str, auto_refresh: bool) -> Result<()> {
+        let req = self
+            .client
+            .post(&format!("/v1/materialized_view/{}/alter", name))
+            .json(&RemoteAlterMaterializedViewRequest { auto_refresh });
+        let (request_id, rsp) = self.client.send(req).await?;
+        self.client.check_response(&request_id, rsp).await?;
+        Ok(())
+    }
+
+    async fn drop_materialized_view(&self, name: &str) -> Result<()> {
+        let req = self
+            .client
+            .post(&format!("/v1/materialized_view/{}/drop", name));
+        let (request_id, rsp) = self.client.send(req).await?;
+        self.client.check_response(&request_id, rsp).await?;
+        Ok(())
+    }
+
+    async fn list_materialized_views(&self) -> Result<Vec<MaterializedViewInfo>> {
+        let req = self.client.get("/v1/materialized_view/list");
+        let (request_id, rsp) = self.client.send(req).await?;
+        let rsp = self.client.check_response(&request_id, rsp).await?;
+        let body: RemoteListMaterializedViewsResponse = rsp.json().await.err_to_http(request_id)?;
+        Ok(body
+            .views
+            .into_iter()
+            .map(|v| MaterializedViewInfo {
+                name: v.name,
+                source_table: v.source_table,
+                projection: v.projection,
+                udf_columns: v.udf_columns,
+                filter: v.filter,
+                auto_refresh: v.auto_refresh,
+            })
+            .collect())
+    }
+
+    async fn list_jobs(&self) -> Result<Vec<JobInfo>> {
+        let req = self.client.get("/v1/job/list");
+        let (request_id, rsp) = self.client.send(req).await?;
+        let rsp = self.client.check_response(&request_id, rsp).await?;
+        let body: RemoteListJobsResponse = rsp.json().await.err_to_http(request_id)?;
+        Ok(body.jobs.into_iter().map(JobInfo::from).collect())
+    }
+
+    async fn get_job(&self, job_id: &str, table: Option<&str>) -> Result<Option<JobInfo>> {
+        // Point-access poll path: GET /v1/job/{id}, with the table as the O(1)
+        // hint when known. `query` handles URL-encoding the table name.
+        let mut req = self.client.get(&format!("/v1/job/{job_id}"));
+        if let Some(t) = table {
+            req = req.query(&[("table", t)]);
+        }
+        let (request_id, rsp) = self.client.send(req).await?;
+        let rsp = self.client.check_response(&request_id, rsp).await?;
+        let body: RemoteGetJobResponse = rsp.json().await.err_to_http(request_id)?;
+        Ok(body.job.map(JobInfo::from))
+    }
+
+    async fn cancel_job(&self, job_id: &str) -> Result<bool> {
+        let req = self.client.post(&format!("/v1/job/{}/cancel", job_id));
+        let (request_id, rsp) = self.client.send(req).await?;
+        let rsp = self.client.check_response(&request_id, rsp).await?;
+        let body: RemoteCancelJobResponse = rsp.json().await.err_to_http(request_id)?;
+        Ok(body.cancelled)
+    }
+
+    async fn job_history(&self, job_id: Option<&str>) -> Result<Vec<JobHistoryInfo>> {
+        let mut req = self.client.get("/v1/job/history");
+        if let Some(j) = job_id {
+            req = req.query(&[("job", j)]);
+        }
+        let (request_id, rsp) = self.client.send(req).await?;
+        let rsp = self.client.check_response(&request_id, rsp).await?;
+        let body: RemoteJobHistoryResponse = rsp.json().await.err_to_http(request_id)?;
+        Ok(body.jobs.into_iter().map(JobHistoryInfo::from).collect())
+    }
+
+    async fn errors(&self, job_id: Option<&str>, table: Option<&str>) -> Result<Vec<JobErrorInfo>> {
+        let mut req = self.client.get("/v1/job/errors");
+        if let Some(j) = job_id {
+            req = req.query(&[("job", j)]);
+        }
+        if let Some(t) = table {
+            req = req.query(&[("table", t)]);
+        }
+        let (request_id, rsp) = self.client.send(req).await?;
+        let rsp = self.client.check_response(&request_id, rsp).await?;
+        let body: RemoteErrorsResponse = rsp.json().await.err_to_http(request_id)?;
+        Ok(body.errors.into_iter().map(JobErrorInfo::from).collect())
+    }
+
    async fn open_table(&self, request: OpenTableRequest) -> Result<Arc<dyn BaseTable>> {
        let identifier = build_table_identifier(
            &request.name,
@@ -1580,6 +2046,223 @@ mod tests {
        }
    }

+    #[tokio::test]
+    async fn test_derived_compute_routes() {
+        // create_function
+        let conn = Connection::new_with_handler(|request| {
+            assert_eq!(request.method(), &reqwest::Method::POST);
+            assert_eq!(request.url().path(), "/v1/function/embed/create");
+            let body: serde_json::Value =
+                serde_json::from_slice(request.body().unwrap().as_bytes().unwrap()).unwrap();
+            assert_eq!(body["language"], "python");
+            assert_eq!(body["return_type"], "FLOAT[4]");
+            assert_eq!(body["body"], "def embed(x): ...");
+            assert_eq!(body["options"]["pip"], "torch");
+            http::Response::builder()
+                .status(200)
+                .body(r#"{"name":"embed","status":"OK"}"#)
+                .unwrap()
+        });
+        conn.create_function(crate::database::CreateFunctionRequest {
+            name: "embed".into(),
+            language: "python".into(),
+            return_type: "FLOAT[4]".into(),
+            body: "def embed(x): ...".into(),
+            options: [("pip".to_string(), "torch".to_string())].into(),
+        })
+        .await
+        .unwrap();
+
+        // list_functions
+        let conn = Connection::new_with_handler(|request| {
+            assert_eq!(request.method(), &reqwest::Method::GET);
+            assert_eq!(request.url().path(), "/v1/function/list");
+            http::Response::builder()
+                .status(200)
+                .body(
+                    r#"{"functions":[{"name":"embed","language":"python","return_type":"Float32","description":""}]}"#,
+                )
+                .unwrap()
+        });
+        let functions = conn.list_functions().await.unwrap();
+        assert_eq!(functions.len(), 1);
+        assert_eq!(functions[0].name, "embed");
+
+        // drop_function
+        let conn = Connection::new_with_handler(|request| {
+            assert_eq!(request.method(), &reqwest::Method::POST);
+            assert_eq!(request.url().path(), "/v1/function/embed/drop");
+            http::Response::builder()
+                .status(200)
+                .body(r#"{"name":"embed","status":"OK"}"#)
+                .unwrap()
+        });
+        conn.drop_function("embed").await.unwrap();
+
+        // create_materialized_view
+        let conn = Connection::new_with_handler(|request| {
+            assert_eq!(request.method(), &reqwest::Method::POST);
+            assert_eq!(request.url().path(), "/v1/materialized_view/mv1/create");
+            let body: serde_json::Value =
+                serde_json::from_slice(request.body().unwrap().as_bytes().unwrap()).unwrap();
+            assert_eq!(body["query"], "SELECT id, embed(body) AS vec FROM docs");
+            assert_eq!(body["auto_refresh"], true);
+            assert_eq!(body["with_no_data"], false);
+            http::Response::builder()
+                .status(200)
+                .body(r#"{"name":"mv1","job_id":"j-1"}"#)
+                .unwrap()
+        });
+        let mut request = crate::database::CreateMaterializedViewRequest::new(
+            "mv1",
+            "SELECT id, embed(body) AS vec FROM docs",
+        );
+        request.auto_refresh = true;
+        let job_id = conn.create_materialized_view(request).await.unwrap();
+        assert_eq!(job_id.as_deref(), Some("j-1"));
+
+        // refresh_materialized_view
+        let conn = Connection::new_with_handler(|request| {
+            assert_eq!(request.method(), &reqwest::Method::POST);
+            assert_eq!(request.url().path(), "/v1/materialized_view/mv1/refresh");
+            let body: serde_json::Value =
+                serde_json::from_slice(request.body().unwrap().as_bytes().unwrap()).unwrap();
+            assert_eq!(body["num_workers"], 2);
+            assert!(body.get("src_version").is_none());
+            http::Response::builder()
+                .status(202)
+                .body(r#"{"job_id":"j-2"}"#)
+                .unwrap()
+        });
+        let mut request = crate::database::RefreshMaterializedViewRequest::new("mv1");
+        request.num_workers = Some(2);
+        let job_id = conn.refresh_materialized_view(request).await.unwrap();
+        assert_eq!(job_id, "j-2");
+
+        // alter_materialized_view
+        let conn = Connection::new_with_handler(|request| {
+            assert_eq!(request.method(), &reqwest::Method::POST);
+            assert_eq!(request.url().path(), "/v1/materialized_view/mv1/alter");
+            let body: serde_json::Value =
+                serde_json::from_slice(request.body().unwrap().as_bytes().unwrap()).unwrap();
+            assert_eq!(body["auto_refresh"], false);
+            http::Response::builder()
+                .status(200)
+                .body(r#"{"name":"mv1","status":"OK"}"#)
+                .unwrap()
+        });
+        conn.alter_materialized_view("mv1", false).await.unwrap();
+
+        // drop_materialized_view
+        let conn = Connection::new_with_handler(|request| {
+            assert_eq!(request.url().path(), "/v1/materialized_view/mv1/drop");
+            http::Response::builder()
+                .status(200)
+                .body(r#"{"name":"mv1","status":"OK"}"#)
+                .unwrap()
+        });
+        conn.drop_materialized_view("mv1").await.unwrap();
+
+        // list_materialized_views
+        let conn = Connection::new_with_handler(|request| {
+            assert_eq!(request.method(), &reqwest::Method::GET);
+            assert_eq!(request.url().path(), "/v1/materialized_view/list");
+            http::Response::builder()
+                .status(200)
+                .body(
+                    r#"{"views":[{"name":"mv1","source_table":"docs","projection":["id"],"udf_columns":["vec=embed(body)"],"filter":null,"auto_refresh":true}]}"#,
+                )
+                .unwrap()
+        });
+        let views = conn.list_materialized_views().await.unwrap();
+        assert_eq!(views.len(), 1);
+        assert_eq!(views[0].source_table, "docs");
+        assert!(views[0].auto_refresh);
+
+        // list_jobs
+        let conn = Connection::new_with_handler(|request| {
+            assert_eq!(request.method(), &reqwest::Method::GET);
+            assert_eq!(request.url().path(), "/v1/job/list");
+            http::Response::builder()
+                .status(200)
+                .body(
+                    r#"{"jobs":[{"table":"docs","job_id":"j-3","job_type":"udf_virtual_column_backfill","state":"running","column":"vec","age_seconds":4,"command":null,"units_done":1,"units_total":2,"committed":false,"rows_skipped":0,"error":null}]}"#,
+                )
+                .unwrap()
+        });
+        let jobs = conn.list_jobs().await.unwrap();
+        assert_eq!(jobs.len(), 1);
+        assert_eq!(jobs[0].state, "running");
+        assert_eq!(jobs[0].units_total, Some(2));
+
+        // cancel_job
+        let conn = Connection::new_with_handler(|request| {
+            assert_eq!(request.method(), &reqwest::Method::POST);
+            assert_eq!(request.url().path(), "/v1/job/j-3/cancel");
+            http::Response::builder()
+                .status(200)
+                .body(r#"{"cancelled":true}"#)
+                .unwrap()
+        });
+        assert!(conn.cancel_job("j-3").await.unwrap());
+
+        // cancel_job: no such inflight job -> false, not an error
+        let conn = Connection::new_with_handler(|request| {
+            assert_eq!(request.url().path(), "/v1/job/gone/cancel");
+            http::Response::builder()
+                .status(200)
+                .body(r#"{"cancelled":false}"#)
+                .unwrap()
+        });
+        assert!(!conn.cancel_job("gone").await.unwrap());
+
+        // job_history: GET /v1/job/history, no filter
+        let conn = Connection::new_with_handler(|request| {
+            assert_eq!(request.method(), &reqwest::Method::GET);
+            assert_eq!(request.url().path(), "/v1/job/history");
+            assert!(request.url().query().is_none());
+            http::Response::builder()
+                .status(200)
+                .body(
+                    r#"{"jobs":[{"table":"docs","job_id":"j-1","job_type":"udf_virtual_column_backfill","state":"done","column":"vec","created_ms":1000,"updated_ms":2000,"completed_ms":2000,"rows_processed":42,"rows_skipped":3,"error":null,"events":"created\ndone"}]}"#,
+                )
+                .unwrap()
+        });
+        let hist = conn.job_history(None).await.unwrap();
+        assert_eq!(hist.len(), 1);
+        assert_eq!(hist[0].state, "done");
+        assert_eq!(hist[0].rows_processed, Some(42));
+        assert_eq!(hist[0].events.as_deref(), Some("created\ndone"));
+
+        // job_history: ?job= narrows to one job
+        let conn = Connection::new_with_handler(|request| {
+            assert_eq!(request.url().path(), "/v1/job/history");
+            assert_eq!(request.url().query(), Some("job=j-1"));
+            http::Response::builder()
+                .status(200)
+                .body(r#"{"jobs":[]}"#)
+                .unwrap()
+        });
+        assert!(conn.job_history(Some("j-1")).await.unwrap().is_empty());
+
+        // errors: GET /v1/job/errors with job + table filters
+        let conn = Connection::new_with_handler(|request| {
+            assert_eq!(request.method(), &reqwest::Method::GET);
+            assert_eq!(request.url().path(), "/v1/job/errors");
+            assert_eq!(request.url().query(), Some("job=j-1&table=docs"));
+            http::Response::builder()
+                .status(200)
+                .body(
+                    r#"{"errors":[{"job_id":"j-1","table":"docs","column":"vec","error_type":"ValueError","error_message":"boom","fragment_id":0,"source_row_id":42,"table_version":7,"age_seconds":5}]}"#,
+                )
+                .unwrap()
+        });
+        let errs = conn.errors(Some("j-1"), Some("docs")).await.unwrap();
+        assert_eq!(errs.len(), 1);
+        assert_eq!(errs[0].error_type, "ValueError");
+        assert_eq!(errs[0].source_row_id, Some(42));
+    }
+
    #[tokio::test]
    async fn test_clone_table() {
        let conn = Connection::new_with_handler(|request| {
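
The handler tests above pin down how the optional `job` and `table` filters reach the server: no query string when both are `None`, and `job=j-1&table=docs` when both are set. A minimal std-only sketch of that rule follows. The helper names `encode_query_value` and `job_errors_url` are hypothetical and not part of the client, which delegates query encoding to its HTTP library (whose escaping of special characters may differ from the strict percent-encoding assumed here).

```rust
/// Percent-encode a query value, leaving only RFC 3986 unreserved
/// characters untouched. (Assumption: the real client's HTTP library
/// performs its own encoding, which may differ in detail.)
fn encode_query_value(value: &str) -> String {
    let mut out = String::with_capacity(value.len());
    for byte in value.bytes() {
        match byte {
            b'A'..=b'Z' | b'a'..=b'z' | b'0'..=b'9' | b'-' | b'.' | b'_' | b'~' => {
                out.push(byte as char)
            }
            other => out.push_str(&format!("%{:02X}", other)),
        }
    }
    out
}

/// Hypothetical helper: build the path and query for `GET /v1/job/errors`.
/// Each filter is appended only when it is `Some`, in job-then-table order,
/// matching the order the test asserts on.
fn job_errors_url(job_id: Option<&str>, table: Option<&str>) -> String {
    let mut params: Vec<String> = Vec::new();
    if let Some(j) = job_id {
        params.push(format!("job={}", encode_query_value(j)));
    }
    if let Some(t) = table {
        params.push(format!("table={}", encode_query_value(t)));
    }
    if params.is_empty() {
        // No filters: the URL carries no query component at all.
        "/v1/job/errors".to_string()
    } else {
        format!("/v1/job/errors?{}", params.join("&"))
    }
}
```

Calling `job_errors_url(Some("j-1"), Some("docs"))` reproduces the query string the `errors` test expects; passing `None` for both mirrors the unfiltered `job_history(None)` case.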
--- a/rust/lancedb/src/remote/table.rs
+++ b/rust/lancedb/src/remote/table.rs
@@ -2309,6 +2309,126 @@ impl<S: HttpSend> BaseTable for RemoteTable<S> {
            message: "optimize is not supported on LanceDB cloud.".into(),
        })
    }
+    async fn add_computed_columns(
+        &self,
+        columns: &[(String, String)],
+        expression: &str,
+    ) -> Result<()> {
+        let new_columns: Vec<serde_json::Value> = columns
+            .iter()
+            .map(|(name, data_type)| {
+                serde_json::json!({
+                    "name": name,
+                    "computed": { "data_type": data_type, "expression": expression },
+                })
+            })
+            .collect();
+        let request = self
+            .client
+            .post(&format!("/v1/table/{}/add_columns/", self.identifier))
+            .json(&serde_json::json!({ "new_columns": new_columns }));
+        let (request_id, response) = self.send(request, true).await?;
+        self.check_table_response(&request_id, response).await?;
+        Ok(())
+    }
+
+    async fn refresh_column(
+        &self,
+        columns: &[String],
+        where_clause: Option<String>,
+        num_workers: Option<u32>,
+        max_workers: Option<u32>,
+        batch_size: Option<u32>,
+        priority: Option<String>,
+    ) -> Result<String> {
+        let mut body = serde_json::json!({ "columns": columns });
+        if let Some(w) = where_clause {
+            body["where_clause"] = serde_json::Value::String(w);
+        }
+        if let Some(n) = num_workers {
+            body["num_workers"] = n.into();
+        }
+        if let Some(n) = max_workers {
+            body["max_workers"] = n.into();
+        }
+        if let Some(n) = batch_size {
+            body["batch_size"] = n.into();
+        }
+        if let Some(p) = priority {
+            body["priority"] = serde_json::Value::String(p);
+        }
+        let request = self
+            .client
+            .post(&format!("/v1/table/{}/refresh_column", self.identifier))
+            .json(&body);
+        let (request_id, response) = self.send(request, true).await?;
+        let response = self.check_table_response(&request_id, response).await?;
+        #[derive(serde::Deserialize)]
+        struct RefreshColumnResponse {
+            job_id: String,
+        }
+        let body: RefreshColumnResponse = response.json().await.err_to_http(request_id)?;
+        Ok(body.job_id)
+    }
+
+    async fn load_columns(&self, request: crate::table::LoadColumnsRequest) -> Result<String> {
+        let columns: Vec<serde_json::Value> = request
+            .columns
+            .iter()
+            .map(|(target, source)| {
+                serde_json::json!({
+                    "target": target,
+                    "source": source.clone().unwrap_or_else(|| target.clone()),
+                })
+            })
+            .collect();
+        let mut source = serde_json::json!({
+            "uris": request.source_uris,
+            "format": request.source_format,
+        });
+        if let Some(opts) = request.source_storage_options {
+            source["storage_options"] = serde_json::to_value(opts).unwrap_or_default();
+        }
+        let mut body = serde_json::json!({
+            "columns": columns,
+            "source": source,
+            "target_key": request.target_key,
+        });
+        if let Some(k) = request.source_key {
+            body["source_key"] = serde_json::Value::String(k);
+        }
+        if let Some(m) = request.on_missing {
+            body["on_missing"] = serde_json::Value::String(m);
+        }
+        if let Some(n) = request.num_workers {
+            body["num_workers"] = n.into();
+        }
+        if let Some(n) = request.max_workers {
+            body["max_workers"] = n.into();
+        }
+        if let Some(n) = request.batch_size {
+            body["batch_size"] = n.into();
+        }
+        if let Some(n) = request.commit_granularity {
+            body["commit_granularity"] = n.into();
+        }
+        if let Some(p) = request.priority {
+            body["priority"] = serde_json::Value::String(p);
+        }
+        let http_request = self
+            .client
+            .post(&format!("/v1/table/{}/load_columns", self.identifier))
+            .json(&body);
+        let (request_id, response) = self.send(http_request, true).await?;
+        let response = self.check_table_response(&request_id, response).await?;
+        #[derive(serde::Deserialize)]
+        struct LoadColumnsResponse {
+            job_id: String,
+        }
+        let body: LoadColumnsResponse = response.json().await.err_to_http(request_id)?;
+        Ok(body.job_id)
+    }
+
    async fn add_columns(
        &self,
        transforms: NewColumnTransform,
@@ -2801,6 +2921,75 @@ mod tests {
        }
    }

+    #[tokio::test]
+    async fn test_refresh_column() {
+        let table = Table::new_with_handler("my_table", |request| {
+            assert_eq!(request.method(), "POST");
+            assert_eq!(request.url().path(), "/v1/table/my_table/refresh_column");
+            let body: serde_json::Value =
+                serde_json::from_slice(request.body().unwrap().as_bytes().unwrap()).unwrap();
+            assert_eq!(body["columns"], serde_json::json!(["vec"]));
+            assert_eq!(body["num_workers"], 2);
+            assert!(body.get("where_clause").is_none());
+
+            http::Response::builder()
+                .status(202)
+                .body(r#"{"job_id":"j-9"}"#)
+                .unwrap()
+        });
+
+        let job_id = table
+            .refresh_column(&["vec".to_string()], None, Some(2), None, None, None)
+            .await
+            .unwrap();
+        assert_eq!(job_id, "j-9");
+    }
+
+    #[tokio::test]
+    async fn test_load_columns() {
+        let table = Table::new_with_handler("my_table", |request| {
+            assert_eq!(request.method(), "POST");
+            assert_eq!(request.url().path(), "/v1/table/my_table/load_columns");
+            let body: serde_json::Value =
+                serde_json::from_slice(request.body().unwrap().as_bytes().unwrap()).unwrap();
+            assert_eq!(
+                body["columns"],
+                serde_json::json!([{"target": "embedding", "source": "emb"}])
+            );
+            assert_eq!(body["source"]["format"], "parquet");
+            assert_eq!(
+                body["source"]["uris"],
+                serde_json::json!(["s3://b/x.parquet"])
+            );
+            assert_eq!(body["target_key"], "document_id");
+            assert_eq!(body["source_key"], "doc_id");
+            assert_eq!(body["on_missing"], "null");
+            assert_eq!(body["num_workers"], 4);
+
+            http::Response::builder()
+                .status(202)
+                .body(r#"{"job_id":"lc-7"}"#)
+                .unwrap()
+        });
+
+        let request = crate::table::LoadColumnsRequest {
+            source_uris: vec!["s3://b/x.parquet".to_string()],
+            source_format: "parquet".to_string(),
+            source_storage_options: None,
+            target_key: "document_id".to_string(),
+            source_key: Some("doc_id".to_string()),
+            columns: vec![("embedding".to_string(), Some("emb".to_string()))],
+            on_missing: Some("null".to_string()),
+            num_workers: Some(4),
+            max_workers: None,
+            batch_size: None,
+            commit_granularity: None,
+            priority: None,
+        };
+        let job_id = table.load_columns(request).await.unwrap();
+        assert_eq!(job_id, "lc-7");
+    }
+
    #[tokio::test]
    async fn test_version() {
        let table = Table::new_with_handler("my_table", |request| {
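
In the remote `load_columns` implementation above, each `(target, source)` mapping falls back to the target name when no source column is given. The rule can be shown in isolation with a small std-only sketch. `resolve_column_mappings` is a hypothetical name used for illustration only; the actual code builds JSON objects inline rather than returning pairs.

```rust
/// Hypothetical helper mirroring the `(target, source)` defaulting rule:
/// a `None` source means "read the source column with the same name as
/// the target column".
fn resolve_column_mappings(columns: &[(String, Option<String>)]) -> Vec<(String, String)> {
    columns
        .iter()
        .map(|(target, source)| {
            // Fall back to the target name when no explicit source is set.
            let source = source.clone().unwrap_or_else(|| target.clone());
            (target.clone(), source)
        })
        .collect()
}
```

With the mapping from `test_load_columns`, `("embedding", Some("emb"))` resolves to the pair `embedding`/`emb`, while a mapping such as `("title", None)` would resolve to `title`/`title`.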
--- a/rust/lancedb/src/table.rs
+++ b/rust/lancedb/src/table.rs
@@ -471,6 +471,33 @@ impl LsmWriteSpec {
    }
 }

+/// Request to fill existing table columns from an external source by
+/// primary-key join (Geneva `Table.load_columns()` parity). Server-backed
+/// feature (LanceDB Enterprise / Cloud).
+#[derive(Debug, Clone)]
+pub struct LoadColumnsRequest {
+    /// External source URIs.
+    pub source_uris: Vec<String>,
+    /// Source format: "parquet" | "lance" | "ipc".
+    pub source_format: String,
+    /// Source-only storage options (e.g. cloud credentials).
+    pub source_storage_options: Option<HashMap<String, String>>,
+    /// Destination primary-key column.
+    pub target_key: String,
+    /// Source primary-key column. Defaults to `target_key` when None.
+    pub source_key: Option<String>,
+    /// Value column mappings as `(target, source)`; a None source defaults to
+    /// the target name.
+    pub columns: Vec<(String, Option<String>)>,
+    /// Missing-row policy: "carry" (default) | "null" | "error".
+    pub on_missing: Option<String>,
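+    // Optional per-job tuning knobs; the remote client omits a field
+    // from the request body when it is `None`.
+    pub num_workers: Option<u32>,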
+    pub max_workers: Option<u32>,
+    pub batch_size: Option<u32>,
+    pub commit_granularity: Option<u32>,
+    pub priority: Option<String>,
+}
+
 /// A trait for anything "table-like".  This is used for both native tables (which target
 /// Lance datasets) and remote tables (which target LanceDB cloud)
 ///
@@ -620,6 +647,47 @@ pub trait BaseTable: std::fmt::Display + std::fmt::Debug + Send + Sync {
        transforms: NewColumnTransform,
        read_columns: Option<Vec<String>>,
    ) -> Result<AddColumnsResult>;
+    /// Declare computed columns bound to a registered function: each
+    /// `(name, sql_type)` is added all-null with the expression stored
+    /// as its binding; no compute happens here (the server's lazy
+    /// detector or refresh_column fills them). Several columns map a
+    /// struct-returning function's fields positionally. Server-backed
+    /// feature; the default returns NotSupported.
+    async fn add_computed_columns(
+        &self,
+        _columns: &[(String, String)],
+        _expression: &str,
+    ) -> Result<()> {
+        Err(Error::NotSupported {
+            message: "computed columns are not supported by this table".into(),
+        })
+    }
+    /// Trigger recompute of computed columns. The expression is
+    /// resolved server-side from each column's stored binding; columns
+    /// bound to the same struct-returning function refresh together.
+    /// Returns the refresh job id. Server-backed feature (LanceDB
+    /// Enterprise / Cloud); the default returns NotSupported.
+    async fn refresh_column(
+        &self,
+        _columns: &[String],
+        _where_clause: Option<String>,
+        _num_workers: Option<u32>,
+        _max_workers: Option<u32>,
+        _batch_size: Option<u32>,
+        _priority: Option<String>,
+    ) -> Result<String> {
+        Err(Error::NotSupported {
+            message: "refresh_column is not supported by this table".into(),
+        })
+    }
+    /// Fill existing columns from an external source by primary-key join
+    /// (Geneva `load_columns`). Returns the load job id. Server-backed feature;
+    /// the default returns NotSupported.
+    async fn load_columns(&self, _request: LoadColumnsRequest) -> Result<String> {
+        Err(Error::NotSupported {
+            message: "load_columns is not supported by this table".into(),
+        })
+    }
    /// Alter columns in the table.
    async fn alter_columns(&self, alterations: &[ColumnAlteration]) -> Result<AlterColumnsResult>;
    /// Drop columns from the table.
@@ -1461,6 +1529,48 @@ impl Table {
        self.inner.add_columns(transforms, read_columns).await
    }

+    /// Declare computed columns bound to a registered function
+    /// (`(name, sql_type)` pairs plus an `f(args)` expression). No compute
+    /// happens here. Server-backed feature.
+    pub async fn add_computed_columns(
+        &self,
+        columns: &[(String, String)],
+        expression: &str,
+    ) -> Result<()> {
+        self.inner.add_computed_columns(columns, expression).await
+    }
+
+    /// Trigger recompute of computed columns (REFRESH COLUMN). The
+    /// expression comes from each column's stored binding; columns
+    /// bound to the same struct-returning function refresh together.
+    /// Returns the refresh job id. Server-backed feature.
+    pub async fn refresh_column(
+        &self,
+        columns: &[String],
+        where_clause: Option<String>,
+        num_workers: Option<u32>,
+        max_workers: Option<u32>,
+        batch_size: Option<u32>,
+        priority: Option<String>,
+    ) -> Result<String> {
+        self.inner
+            .refresh_column(
+                columns,
+                where_clause,
+                num_workers,
+                max_workers,
+                batch_size,
+                priority,
+            )
+            .await
+    }
+
+    /// Fill existing columns from an external Parquet/Lance/IPC source by
+    /// primary-key join (Geneva `Table.load_columns()`). Returns the job id.
+    pub async fn load_columns(&self, request: LoadColumnsRequest) -> Result<String> {
+        self.inner.load_columns(request).await
+    }
+
    /// Change a column's name or nullability.
    pub async fn alter_columns(
        &self,
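
The `BaseTable` changes in this file follow one pattern: each server-backed method gets a default body that returns `NotSupported`, and only the remote table overrides it. The sketch below shows that pattern in a simplified, synchronous form. All names in it (`SketchError`, `SketchTable`, `LocalSketch`, `RemoteSketch`) are hypothetical stand-ins, and the returned job id is a placeholder rather than anything the server produces.

```rust
/// Simplified stand-in for the crate's error type (assumption: only the
/// NotSupported variant matters for this illustration).
#[derive(Debug, PartialEq)]
enum SketchError {
    NotSupported { message: String },
}

/// Simplified, synchronous stand-in for the table trait.
trait SketchTable {
    /// Default body: backends without server-side compute reject the call.
    fn refresh_column(&self, _columns: &[String]) -> Result<String, SketchError> {
        Err(SketchError::NotSupported {
            message: "refresh_column is not supported by this table".into(),
        })
    }
}

/// A backend that keeps the default and therefore reports NotSupported.
struct LocalSketch;
impl SketchTable for LocalSketch {}

/// A backend that overrides the default. The real remote table POSTs to
/// the server; here a placeholder job id is fabricated locally instead.
struct RemoteSketch;
impl SketchTable for RemoteSketch {
    fn refresh_column(&self, columns: &[String]) -> Result<String, SketchError> {
        Ok(format!("job-for-{}", columns.join(",")))
    }
}
```

The design keeps the trait backward compatible: existing implementors compile unchanged and surface a clear error, while callers holding any table can still invoke `refresh_column` and branch on the result.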
Author	SHA1	Message	Date
Jack Ye	bff911a65d	chore: point lance dependency at jack branch	2026-06-29 22:46:07 -07:00
Wyatt Alt	3a4cdb7aff	fix(udf): JobHandle.wait() terminates on failed jobs wait() (sync + async) only stopped on finished/stale/committed, so a job the server already reported as state=failed was polled until the (default 3600s) timeout, then raised a misleading TimeoutError instead of the real cause. A doomed backfill -- e.g. a multi-column REFRESH COLUMN of a scalar UDF -- hung the client even though get_job surfaced the failure within ~3s. Add a terminal failed branch that raises JobFailedError carrying the server error, exported from the package. Verified end-to-end against the cluster: raises in 3.6s instead of hanging. Unit-tested with a mock conn (sync+async, failure + success + committed paths). Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>	2026-06-29 22:35:35 -07:00
Wyatt Alt	142ac835d3	client: job_history() and errors() over REST (SHOW JOB HISTORY / SHOW ERRORS) The client exposed list_jobs/get_job/cancel_job but not the durable job history or the per-row UDF errors, so those SQL/REST surfaces had no SDK equivalent. Add job_history(job_id=None) and errors(job_id=None, table=None) through every layer: - Database trait + Connection API (JobHistoryInfo, JobErrorInfo types). - Remote REST impl: GET /v1/job/history (?job=) and GET /v1/job/errors (?job=&table=), with serde response types + From mappings. - pyo3 bindings + pyclasses JobHistoryEntry / JobErrorEntry, registered. - Python sync + async db.py wrappers. Mirrors the existing list_jobs plumbing exactly. Remote-handler test asserts the GET paths, query filters, and response parsing for both. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>	2026-06-29 22:35:35 -07:00
Wyatt Alt	3f44f93e92	job wait(): poll by id via get_job (point access) instead of list_jobs JobHandle/AsyncJobHandle now poll conn.get_job(id, table) -- one job -- instead of list_jobs() + client-side filter over every active job. The job's table is threaded in from refresh_column / MV refresh as an O(1) lookup hint. Plumbs get_job through the Database trait (default not_supported), RemoteDatabase (GET /v1/job/{id}?table=...), the Connection wrapper, and the pyo3 binding + db.py. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>	2026-06-29 22:35:35 -07:00
Wyatt Alt	9dfa43a9de	fix: sync Connection.lineage delegates to AsyncConnection.lineage self._conn on a remote sync connection is an AsyncConnection (python), which exposes `lineage` (parses the JSON), not the pyo3 `table_lineage`. The sync wrapper was calling self._conn.table_lineage -> AttributeError. Drive self._conn.lineage on the loop instead, mirroring create_materialized_view. (table_lineage stays the pyo3 method the async path calls via self._inner.) Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>	2026-06-29 22:35:34 -07:00
Wyatt Alt	03e895fa5c	Merge integration submodule (refresh_column JobHandle / create_mv fix) into lineage	2026-06-29 22:35:34 -07:00
Wyatt Alt	c31e53088e	client: slice 4 -- Python lineage surface - new lineage.py: Lineage / Node / Edge / FunctionRef dataclasses that parse the server's lineage JSON, with to_dict(), to_graphviz() (drift edges dashed+red), and _repr_html_(); plus .functions() / .stale() helpers. - Connection.lineage(table, column=, direction=, depth=) (sync + async) calls the pyo3 table_lineage binding and deserializes into Lineage. - Table.lineage(column=, ...) via the table's job connection; MaterializedView / AsyncMaterializedView .lineage() delegate to the backing table (the server already includes the view's sources + downstream dependents). - export the new types. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>	2026-06-29 22:35:34 -07:00
Wyatt Alt	434a5be187	feat(client): Table.refresh_column returns a JobHandle (like MaterializedView.refresh) refresh_column returned the bare job-id str, so callers had to wrap it: db.job(tbl.refresh_column("c")).wait(). Mirror MaterializedView.refresh() and return a JobHandle directly, so tbl.refresh_column("c").wait() / .status() / .id work without the wrapper. (db.job(job_id) stays for reconnecting by a stored id.) Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>	2026-06-29 22:35:34 -07:00
Wyatt Alt	78aa005093	client: slice 3 -- thread table_lineage through the remote client + pyo3 A new Database::table_lineage(TableLineageRequest) -> Result<String> threaded end to end: default not_supported in the trait; the remote impl issues GET /v1/table/{name}/lineage with column/direction/depth query params and returns the body verbatim; connection.rs exposes a pub wrapper; the pyo3 binding hands the JSON string to Python. The lineage payload is carried as opaque JSON on purpose: the open-source lancedb client must not depend on the sophon-internal derived_jobs crate that defines the lineage schema, so the wire format is the contract and the Python layer deserializes it. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>	2026-06-29 22:35:34 -07:00
Wyatt Alt	6191542cfe	fix(mv): MaterializedView.refresh calls the async _refresh (underscore) The sync _refresh_materialized_view called self._conn.refresh_materialized_view (no underscore); the async method is _refresh_materialized_view, so MaterializedView.refresh() raised AttributeError. Add the underscore. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>	2026-06-29 22:35:34 -07:00
Wyatt Alt	6af3088b91	client: make refresh_materialized_view private (reach it via the handle) Refresh is a submit-a-job verb, so its only public surface should be MaterializedView.refresh() / AsyncMaterializedView.refresh() (which return a job handle). Rename the connection methods to _refresh_materialized_view and have the handles call that, so the raw by-name refresh is no longer advertised on the connection. The pyo3 native binding is unchanged. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>	2026-06-29 22:35:34 -07:00
Wyatt Alt	e73d4618d8	fix(mv): create_materialized_view passes query as keyword, not positional The sync RemoteDBConnection.create_materialized_view assembled the SELECT but called the async create_materialized_view with the query as the 2nd positional arg, which binds to `source=` (query= is keyword-only). Every call then failed the "needs either query= or both source and select" validation. Pass query=query. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>	2026-06-29 22:35:34 -07:00
Wyatt Alt	3d92106394	client: split create_view into create_materialized_view; return job handles - create_materialized_view now takes either query= or source+select (folds in the old create_view builder) and returns a MaterializedView handle whose .wait() blocks on initial population. create_view is removed -- it was misnamed (it built a materialized view, while CREATE VIEW means the plain non-materialized view the engine also supports). - MaterializedView.refresh() and the remote Table.refresh_column() now return a JobHandle directly, so tbl.refresh_column("c").wait() needs no db.job(...) wrapper. db.job(id) is narrowed to reconnect-by-id (stored id / SQL / REST). - rename View/AsyncView -> MaterializedView/AsyncMaterializedView (+ exports). - tighten the replace path: only a not-found error on the pre-drop is benign; real failures (perms/server) now surface instead of being swallowed. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>	2026-06-29 22:35:34 -07:00
Wyatt Alt	5810974b37	feat(client): Table.load_columns() REST client for LOAD COLUMNS Geneva Table.load_columns() parity on the REST-only client. Fills existing columns from an external Parquet/Lance/IPC source by primary-key join. - BaseTable::load_columns default (NotSupported) + public Table::load_columns, taking a LoadColumnsRequest (source uris/format/storage_options, target/source key, (target, source?) column mappings, on_missing, worker/batch/commit knobs). - Remote impl POSTs to /v1/table/{id}/load_columns with the matching body; mock test asserts the request shape. - PyO3 binding + Python remote Table.load_columns(source, pk, columns, *, source_format, source_pk, on_missing, ...) accepting a column list or {target: source} dict. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>	2026-06-29 22:35:34 -07:00
Wyatt Alt	8b38500b07	feat(view): full=True force-rebuild on refresh_materialized_view View.refresh(full=True) (sync + async) now works -- it previously raised NotImplementedError. Thread the flag through the client: RefreshMaterializedViewRequest.full -> the REST body (RemoteRefreshMaterializedViewRequest.full); pyo3 refresh_materialized_view(full=...); Connection.refresh_materialized_view(name, full=) sync + async. A full refresh forces a recompute-and-replace and preserves the view's indexes (reindexed by the distributed indexer). Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>	2026-06-29 22:35:34 -07:00
Wyatt Alt	fd0a3b97d0	feat(view): materialized views are first-class indexable + searchable Add View.create_index / create_scalar_index / create_fts_index / search as pass-throughs to open_table(name). A materialized view is a real Lance dataset; these let it be indexed and searched like any other table, closing the parity gap with Geneva (whose create_materialized_view returns a first-class Table). The server-side create_index handler records indexes declared on a view so they survive a full refresh (which overwrites the dataset, dropping its indices); that re-apply is wired in the sophon engine. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>	2026-06-29 22:35:34 -07:00
Wyatt Alt	b9f33ba1c9	feat(refresh): priority as a per-refresh knob; fix batch_size on RemoteTable Thread priority (Kueue tier) through refresh_column at every layer (Python sync+async + RemoteTable -> pyo3 -> Rust client trait/public/remote -> REST body), mirroring num_workers/batch_size. The function keeps its priority as a default; the per-refresh value overrides. Also adds the previously-missed batch_size to RemoteTable.refresh_column (the REST sync path). cargo check (lancedb --features remote --tests, lancedb-python) + ruff clean. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>	2026-06-29 22:35:34 -07:00
Wyatt Alt	d4f4fef3ba	feat(refresh): batch_size is a per-refresh knob (refresh_column), not a function-only option batch_size / num_workers / max_workers are invocation concerns (how to schedule THIS refresh), so expose batch_size on refresh_column through every layer (Python sync+async -> pyo3 -> Rust client -> the REST RefreshColumnRequest.batch_size, which the handler already forwards into the backfill). num_workers/max_workers were already invocation-placed; batch_size was the gap. The function may still carry a default; the refresh override wins (extends the batch_size_override model). Both crates cargo-check clean. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>	2026-06-29 22:35:34 -07:00
Wyatt Alt	fbe6a5a3fd	feat(udf): computed columns as expressions -- add_columns(computed={col: fn("input")}) A computed column is an expression over a registered function applied to input columns, not a UDF coupled to a column. fn("data") already returned the expression string "fn(data)"; make it a ColumnExpr (a str subclass) that also carries the function's return type, so add_columns(computed={"vec": embed("data")}) declares the column with no hand-written type. _normalize_computed handles the new form (and tuple keys for STRUCT fan-out) and keeps the legacy {col: (sql_type, expression)} tuple. add_computed_column is deprecated (delegates, with a DeprecationWarning). The function stays decoupled from columns -- register once, apply anywhere. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>	2026-06-29 22:35:34 -07:00
Wyatt Alt	127054069a	feat(mv): partition_by option on create_materialized_view / create_view Thread an optional partition_by through the client: CreateMaterializedViewRequest -> REST body -> pyo3 binding -> Python create_materialized_view/create_view kwarg (sync + async). The server partitions the view's table function by the named source column -- by IVF index clusters if the column is indexed (image-dedup), else by distinct value. Unifies Geneva's partition_by + partition_by_indexed_column into one knob. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>	2026-06-29 22:35:34 -07:00
Wyatt Alt	b20931b8f7	feat: async UDF client ergonomics (AsyncConnection/AsyncTable + AsyncView/AsyncJobHandle) Mirrors the sync ergonomics on the async surface: AsyncConnection create_function(udf, replace=)/create_view/job; AsyncTable.add_computed_column; AsyncView + AsyncJobHandle (await + asyncio.sleep; shared submission-prefix matcher with the sync JobHandle). Decorator + REST routes are shared/already validated; this is the async wrapper layer. Exported from the package root. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>	2026-06-29 22:35:34 -07:00
Wyatt Alt	396d68e490	fix: JobHandle resolves the manifest job id from the submission id db.job(id) gets the submission id the refresh/backfill endpoints return, but list_jobs / cancel report the agent's manifest id (<table>-<type>-<first 8 of submission id>). JobHandle now matches that (exact id or submission prefix) so wait()/progress() truly track, and cancel() cancels by the resolved canonical id instead of the unusable submission uuid. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>	2026-06-29 22:35:34 -07:00
Wyatt Alt	ad37f87387	feat: fold UDF authoring into lancedb (udf module + connection/table ergonomics) Brings the @udf/@table_udf decorator + type inference into lancedb as lancedb.udf (Apache-2.0), and adds the ergonomic glue to the existing connection/table so there's no separate object model: - create_function() accepts a Udf (and a replace= flag) - Table.add_computed_column(column, udf) - create_view(name, source, select, ...) -> View (assembles the SELECT) - Connection.job(job_id) -> JobHandle - View / JobHandle are thin references over a connection Exports udf/table_udf/Udf/JobHandle/View from the package root. The operations stay the existing remote-only methods (enterprise/cloud); the decorator works locally. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>	2026-06-29 22:35:34 -07:00
Wyatt Alt	e93476f0e0	feat: explain_refresh_materialized_view over REST (EXPLAIN REFRESH SDK) Database trait gains explain_refresh_materialized_view (default NotSupported) returning an MvRefreshPlan; RemoteDatabase POSTs /v1/materialized_view/{name}/explain_refresh; Connection method; pyo3 MvRefreshPlan pyclass + binding; sync+async python wrappers. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>	2026-06-29 22:35:34 -07:00
Wyatt Alt	2b41fce033	feat: cancel_job over REST (Database::cancel_job + remote impl + pyo3 + python) Exposes the existing server-side CANCEL JOB (CoordinatorCatalog::cancel_job) as a REST-backed SDK method: Database trait default NotSupported, RemoteDatabase POSTs /v1/job/{id}/cancel, pyo3 binding, sync+async python wrappers. Best-effort: a missing job returns false, not an error. Mock-HTTP unit test in test_derived_compute_routes. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>	2026-06-29 22:35:34 -07:00
Wyatt Alt	04948fc4f6	feat: computed columns as a param on add_columns Per the interface design: computed columns are parameters on the existing add_columns operation, not a separate method. - BaseTable::add_computed_columns((name, sql_type) pairs + a f(args) expression) -- default NotSupported; RemoteTable posts 'computed' entries to the existing /v1/table/{id}/add_columns route. - python add_columns gains computed= on LanceTable, RemoteTable, and AsyncTable: tbl.add_columns(computed={'doubled': ('FLOAT', 'double_it(val)')}); grouped by expression so struct-returning functions' columns land adjacently.	2026-06-29 22:35:34 -07:00
Wyatt Alt	ff3c7111b9	feat: SDK surface for functions, materialized views, jobs, refresh_column Adds the derived-compute interface to the SDK: - Database trait: create/list/drop_function, create/refresh/alter/drop/list_materialized_view, list_jobs -- default implementations return Error::NotSupported (NotImplementedError in python), so existing Database impls are unaffected; local single-node implementations are planned. BaseTable gains refresh_column with the same default. - RemoteDatabase/RemoteTable implement them against the server REST routes (/v1/function/, /v1/materialized_view/, /v1/job/list, /v1/table/{id}/refresh_column), with mock-HTTP unit tests. - Connection/Table public methods, pyo3 bindings (FunctionInfo, MaterializedViewInfo, JobInfo pyclasses), and python wrappers: sync on the DBConnection base (shared by local and remote connections), async on AsyncConnection; refresh_column on LanceTable, RemoteTable, and AsyncTable.	2026-06-29 22:35:34 -07:00
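
Commit 396d68e490 above describes how JobHandle reconciles two id forms: the submission id returned by the refresh/backfill endpoints, and the manifest id reported by list_jobs / cancel, which has the shape `<table>-<type>-<first 8 of submission id>`. The following is a minimal sketch of that "exact id or submission prefix" match, not the code from the commit. The function names `matches_job` and `resolve_job_id` and the example job type strings are hypothetical, chosen only for illustration.

```python
from typing import Iterable, Optional


def matches_job(listed_id: str, submission_id: str) -> bool:
    """Return True if a listed job id refers to the given id.

    Assumption (from the commit message): a manifest id ends with
    "-" followed by the first 8 characters of the submission id.
    """
    # Case 1: the caller already holds the canonical manifest id.
    if listed_id == submission_id:
        return True
    # Case 2: the caller holds a submission id; compare its 8-char prefix
    # against the trailing segment of the manifest id.
    prefix = submission_id[:8]
    return bool(prefix) and listed_id.endswith("-" + prefix)


def resolve_job_id(listed_ids: Iterable[str], submission_id: str) -> Optional[str]:
    """Return the first listed id that matches, or None if nothing matches."""
    for listed_id in listed_ids:
        if matches_job(listed_id, submission_id):
            return listed_id
    return None


# Hypothetical ids: "images" as the table, "refresh"/"backfill" as job types.
listed = ["images-backfill-deadbeef", "images-refresh-3f2a9c1e"]
submission = "3f2a9c1e-7b4d-4e2a-9c1e-0123456789ab"
resolved = resolve_job_id(listed, submission)
```

Resolving to the listed id first is what lets a cancel call use the canonical manifest id instead of the submission uuid, which the commit message describes as unusable for that purpose.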