feat(rust): re-export arrow and datafusion crates from lancedb (#3576)

lancedb's public API forces downstream crates to construct foreign types:
`RecordBatch`/arrays/builders for `Table::add(...)` (arrow), and
`datafusion_expr::Expr` for `only_if_expr`/`expr_projection`/merge
filters. The versions they use must exactly match lancedb's internal
arrow/datafusion line, but nothing on the API surface makes that
visible. Version drift surfaces only as confusing trait/type errors:

```text
error[E0277]: the trait bound `RecordBatch: Scannable` is not satisfied
  = note: there are multiple different versions of crate `arrow_array` in the dependency graph
```

This re-exports the crates lancedb already pins, so consumers can rely
on a single, guaranteed-matching line via a discoverable import path
instead of declaring their own (potentially mismatched) direct
dependency.

- `lancedb::arrow::{arrow, arrow_array, arrow_buffer, arrow_cast,
  arrow_data, arrow_ipc, arrow_ord, arrow_schema, arrow_select}`:
  previously only `arrow_schema` was re-exported. `arrow-buffer` is
  promoted from a transitive to a direct dependency.
- `lancedb::datafusion`: re-exported because `Expr` is a first-class part
  of the query and merge APIs (`only_if_expr`, `expr_projection`,
  `QueryFilter::Datafusion`, `when_matched_update_all_expr`), and
  `ExecutionPlan` is returned from `create_plan`.

This follows DataFusion's own precedent of re-exporting `arrow`. The
coupling already exists via the trait/impl bounds — this surfaces it
rather than hiding it behind an `E0277`.
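
A minimal, self-contained sketch of that coupling, using only the standard library. The modules `arrow_array_v_a` and `arrow_array_v_b`, the `library` module, and its `add` function are hypothetical stand-ins for two incompatible `arrow_array` versions and for lancedb; they are not lancedb's actual API. It illustrates why a type built through the re-export satisfies the trait bound while a same-named type from another version does not:

```rust
// Hypothetical stand-ins: two modules play the role of two incompatible
// versions of the same crate ending up in one dependency graph.
pub mod arrow_array_v_a {
    pub struct RecordBatch {
        pub num_rows: usize,
    }
}

pub mod arrow_array_v_b {
    pub struct RecordBatch {
        pub num_rows: usize,
    }
}

// Stand-in for the library: it pins one version, implements its trait only
// for that version's type, and re-exports the version it pins.
pub mod library {
    pub use super::arrow_array_v_a as arrow_array;

    pub trait Scannable {
        fn rows(&self) -> usize;
    }

    impl Scannable for arrow_array::RecordBatch {
        fn rows(&self) -> usize {
            self.num_rows
        }
    }

    // Accepts anything that satisfies the trait bound.
    pub fn add<T: Scannable>(data: &T) -> usize {
        data.rows()
    }
}

fn main() {
    // Built through the re-export: this is the type the trait is implemented for.
    let batch = library::arrow_array::RecordBatch { num_rows: 3 };
    println!("added {} rows", library::add(&batch));

    // Built from the other "version": same name and fields, but a distinct type.
    let mismatched = arrow_array_v_b::RecordBatch { num_rows: 3 };
    // Uncommenting the next line fails to compile with E0277, because
    // `Scannable` is implemented only for the pinned version's type.
    // library::add(&mismatched);
    println!("other version holds {} rows", mismatched.num_rows);
}
```

With the re-export in this commit, the analogous downstream path would be `lancedb::arrow::arrow_array::RecordBatch`.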

Closes #3575

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-authored-by: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

Author: Will Jones
Date: 2026-07-01 10:10:55 -07:00
Committer: GitHub
parent f94673ae5e
commit 8a37f2ad77
5 changed files with 21 additions and 0 deletions

Cargo.lock generated

@@ -5305,6 +5305,7 @@ dependencies = [
"anyhow",
"arrow",
"arrow-array",
"arrow-buffer",
"arrow-cast",
"arrow-data",
"arrow-ipc",

@@ -31,6 +31,7 @@ ahash = "0.8"
# Note that this one does not include pyarrow
arrow = { version = "58.0.0", optional = false }
arrow-array = "58.0.0"
arrow-buffer = "58.0.0"
arrow-data = "58.0.0"
arrow-ipc = "58.0.0"
arrow-ord = "58.0.0"

@@ -14,6 +14,7 @@ rust-version.workspace = true
ahash = { workspace = true }
arrow = { workspace = true }
arrow-array = { workspace = true }
arrow-buffer = { workspace = true }
arrow-data = { workspace = true }
arrow-schema = { workspace = true }
arrow-select = { workspace = true }

@@ -3,7 +3,19 @@
use std::{pin::Pin, sync::Arc};
// Re-export the arrow crates we depend on so downstream consumers can build
// `RecordBatch`/arrays/builders against the exact same arrow line lancedb was
// compiled against, instead of declaring their own (potentially mismatched)
// direct arrow dependencies. See https://github.com/lancedb/lancedb/issues/3575.
pub use arrow;
pub use arrow_array;
pub use arrow_buffer;
pub use arrow_cast;
pub use arrow_data;
pub use arrow_ipc;
pub use arrow_ord;
pub use arrow_schema;
pub use arrow_select;
use datafusion_common::DataFusionError;
use datafusion_physical_plan::stream::RecordBatchStreamAdapter;
use futures::{Stream, StreamExt, TryStreamExt};

@@ -342,3 +342,9 @@ pub use connection::connect_namespace;
/// Re-export Lance Session and ObjectStoreRegistry for custom session creation
pub use lance::session::Session;
pub use lance_io::object_store::ObjectStoreRegistry;
/// Re-export DataFusion so consumers can build the `Expr` values that public
/// query/merge APIs (e.g. [`query::QueryBase::only_if_expr`]) accept without
/// declaring their own (potentially mismatched) direct `datafusion` dependency.
/// See <https://github.com/lancedb/lancedb/issues/3575>.
pub use datafusion;