Files
lancedb/rust
Will Jones a9f49c8150 fix: allow appending arrow.json data into lance.json tables (#3429)
When a table is created with `pa.json_()` (PyArrow's JSON extension
type),
it is stored internally as `lance.json` (LargeBinary with `lance.json`
extension metadata). Calling `table.add()` with `pa.json_()` data failed
with:

```
RuntimeError: lance error: Append with different schema:
  `data` should have type json but type was large_binary
```

`build_field_exprs` in `rust/lancedb/src/table/datafusion/cast.rs` saw
that
the input field (`Utf8` with `arrow.json` metadata) differed from the
table
field (`LargeBinary` with `lance.json` metadata). Since
`can_cast_types(Utf8, LargeBinary)` is true, it inserted a DataFusion
`Utf8 → LargeBinary` cast. That cast preserved the input field's
`arrow.json`
extension metadata instead of adopting the table's `lance.json`
metadata, so
lance-core detected a schema mismatch and rejected the append.

This adds a special case in `build_field_exprs`: when the input is
`arrow.json` and the table field is `lance.json`, the expression is
passed
through unchanged. Lance-core's write path already handles the
`arrow.json → lance.json` conversion (including JSONB encoding), so no
DataFusion cast is needed.

Fixes #3144

Continues #3291 from a fork (the original author's branch could not be
pushed to). The original commits are preserved; an additional commit
fixes
the CI failures on that PR — formatting, a missing trait import, and
read-back assertions that assumed binary storage when a lance.json
column
is read back as `Utf8`.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

---------

Co-authored-by: yunju.lly <yunju.lly@antgroup.com>
Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-26 19:24:28 -07:00
..