lancedb

mirror of https://github.com/lancedb/lancedb.git synced 2026-07-03 11:00:40 +00:00

Files

Will Jones 0e486511fa feat: hook up new writer for insert (#3029)

This hooks up a new writer implementation for the `add()` method. The
main immediate benefit is that it allows streaming requests to remote
tables while still allowing retries for most inputs.

In Node.js, we always convert the data to `Vec<RecordBatch>`, so it is
always retry-able.

For Python, all input types are retry-able except `Iterator` and
`pa.RecordBatchReader`, which can only be consumed once. Some, like
`pa.dataset.Dataset`, are retry-able *and* streaming.
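The difference between retry-able and single-use inputs can be illustrated with a minimal standard-library sketch. This is not LanceDB's implementation: `is_replayable` and `upload_with_retry` are hypothetical names, and the network failure is simulated.

```python
from typing import Iterable, List


def is_replayable(source: Iterable) -> bool:
    """Return True if the source can be iterated more than once.

    Iterators return themselves from iter(), so they are single-use;
    containers such as lists return a fresh iterator each time.
    """
    return iter(source) is not source


def upload_with_retry(source: Iterable[int]) -> List[int]:
    """Hypothetical upload whose first attempt fails partway through."""
    attempts = 0
    while True:
        attempts += 1
        received: List[int] = []
        try:
            for i, item in enumerate(source):
                # Simulate a network failure on the second item of attempt 1.
                if attempts == 1 and i == 1:
                    raise ConnectionError("simulated network failure")
                received.append(item)
            return received
        except ConnectionError:
            # A partially consumed iterator cannot be read again from the start.
            if not is_replayable(source):
                raise RuntimeError("input was partially consumed; cannot retry")
```

A list passed to `upload_with_retry` succeeds on the second attempt, while an iterator over the same list fails because the items read during the first attempt are gone.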

A lot of the changes here are to make the new DataFusion write pipeline
maintain the same behavior as the existing Python-based preprocessing,
such as:

* casting input data to target schema
* rejecting NaN values if `on_bad_vectors="error"`
* applying embedding functions.
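As a rough illustration of the NaN check listed above, the following sketch validates vectors in plain Python. It is an assumption-laden simplification, not the DataFusion pipeline: `validate_vectors` is a hypothetical helper, and only the `"error"` and `"drop"` modes are modeled here.

```python
import math
from typing import List, Sequence


def validate_vectors(
    vectors: Sequence[Sequence[float]], on_bad_vectors: str = "error"
) -> List[List[float]]:
    """Hypothetical helper: reject or drop vectors that contain NaN."""
    cleaned: List[List[float]] = []
    for row, vec in enumerate(vectors):
        if not any(math.isnan(x) for x in vec):
            cleaned.append(list(vec))
        elif on_bad_vectors == "error":
            raise ValueError(f"vector at row {row} contains NaN")
        elif on_bad_vectors == "drop":
            continue  # skip the bad row and keep going
        else:
            raise ValueError(f"unsupported mode in this sketch: {on_bad_vectors!r}")
    return cleaned
```

With `on_bad_vectors="error"` the first bad row aborts the write, which matches the behavior the commit message says must be preserved.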

In future PRs, we'll enhance these by moving the embedding calls into
DataFusion and making sure we parallelize them. See:
https://github.com/lancedb/lancedb/issues/3048

---------

Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>

2026-02-23 14:43:31 -08:00

table

feat: hook up new writer for insert (#3029)

2026-02-23 14:43:31 -08:00

arrow.rs

chore: update lance dependency to v2.0.0-rc.4 (#2972)

2026-02-03 14:38:39 -08:00

connection.rs

feat(rust)!: accept RecordBatch, Vec<RecordBatch> in create_table() and Table.add() (#2948)

2026-02-13 14:18:36 -08:00

error.rs

chore: update lance dependency to v2.0.0-rc.4 (#2972)