mirror of
https://github.com/lancedb/lancedb.git
synced 2026-05-30 18:30:40 +00:00
When a table is created with `pa.json_()` (PyArrow's JSON extension type), it is stored internally as `lance.json` (LargeBinary with `lance.json` extension metadata). Calling `table.add()` with `pa.json_()` data failed with:

```
RuntimeError: lance error: Append with different schema: `data` should have type json but type was large_binary
```

`build_field_exprs` in `rust/lancedb/src/table/datafusion/cast.rs` saw that the input field (`Utf8` with `arrow.json` metadata) differed from the table field (`LargeBinary` with `lance.json` metadata). Since `can_cast_types(Utf8, LargeBinary)` is true, it inserted a DataFusion `Utf8 → LargeBinary` cast. That cast preserved the input field's `arrow.json` extension metadata instead of adopting the table's `lance.json` metadata, so lance-core detected a schema mismatch and rejected the append.

This adds a special case in `build_field_exprs`: when the input is `arrow.json` and the table field is `lance.json`, the expression is passed through unchanged. Lance-core's write path already handles the `arrow.json → lance.json` conversion (including JSONB encoding), so no DataFusion cast is needed.

Fixes #3144

Continues #3291 from a fork (the original author's branch could not be pushed to). The original commits are preserved; an additional commit fixes the CI failures on that PR: formatting, a missing trait import, and read-back assertions that assumed binary storage when a `lance.json` column is read back as `Utf8`.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

---------

Co-authored-by: yunju.lly <yunju.lly@antgroup.com>
Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
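
For readers who want the decision logic from the description in a compact form, here is a minimal, library-free sketch in Python. It is not the code in `cast.rs`: the `Field` class, the `CASTABLE` set, and `plan_field` are hypothetical stand-ins invented for illustration, and only the extension names `arrow.json` and `lance.json` and the `Utf8`/`LargeBinary` pairing come from the text above.

```python
from dataclasses import dataclass
from typing import Optional

# Extension names as given in the description.
ARROW_JSON = "arrow.json"
LANCE_JSON = "lance.json"


@dataclass(frozen=True)
class Field:
    """Hypothetical stand-in for an Arrow field: storage type plus extension name."""
    name: str
    storage: str                     # e.g. "utf8", "large_binary"
    extension: Optional[str] = None  # e.g. "arrow.json", "lance.json"


# Hypothetical, tiny subset of castable (from, to) storage pairs. It stands in
# for a real castability check and is NOT a complete list.
CASTABLE = {("utf8", "large_binary"), ("int32", "int64")}


def plan_field(input_field: Field, table_field: Field) -> str:
    """Return "passthrough", "cast", or "error" for a single column."""
    # Special case modeled on the description: arrow.json input going into a
    # lance.json column is left untouched so the write path can convert it.
    if input_field.extension == ARROW_JSON and table_field.extension == LANCE_JSON:
        return "passthrough"
    # Identical fields need no expression at all.
    if (input_field.storage, input_field.extension) == (
        table_field.storage,
        table_field.extension,
    ):
        return "passthrough"
    # Otherwise fall back to a cast when the storage types allow one.
    if (input_field.storage, table_field.storage) in CASTABLE:
        return "cast"
    return "error"
```

In this model, the ordering is the point: the JSON check runs before the generic castability check, so a `Utf8`/`arrow.json` input never reaches the branch that would insert a `Utf8 → LargeBinary` cast.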