mirror of https://github.com/lancedb/lancedb.git, synced 2026-01-02 18:02:58 +00:00
perf: re-use table instance during write (#1909)
Previously, whenever `Table.add()` was called, we would write and re-open the underlying dataset. This was bad for performance, as it reset the table cache and initiated a lot of IO. It also could be the source of bugs, since we didn't necessarily pass all the necessary connection options down when re-opening the table.

Closes #1655
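The difference between re-opening a dataset on every write and inserting through the handle the table already holds can be illustrated with a toy model. This is only a sketch under stated assumptions: `ToyDataset`, `ToyTable`, `add_by_reopening`, and `add_in_place` are hypothetical names made up for illustration. They are not LanceDB or Lance classes, and the "cache" here is a plain dict standing in for whatever state a real handle keeps.

```python
from dataclasses import dataclass, field


@dataclass
class ToyDataset:
    """Hypothetical stand-in for an opened dataset handle (not a Lance class)."""

    rows: list = field(default_factory=list)
    cache: dict = field(default_factory=dict)

    def insert(self, data, mode="append"):
        # Mutate the existing handle in place, so the cache survives the write.
        if mode == "overwrite":
            self.rows = list(data)
        else:
            self.rows.extend(data)


class ToyTable:
    """Hypothetical table wrapper that counts how often a dataset is opened."""

    def __init__(self):
        self.opens = 0
        self.dataset = self._open([])

    def _open(self, rows):
        # Opening always builds a fresh handle with an empty cache.
        self.opens += 1
        return ToyDataset(rows=list(rows))

    def add_by_reopening(self, data):
        # Old pattern: write, then swap in a freshly opened handle.
        self.dataset = self._open(self.dataset.rows + list(data))

    def add_in_place(self, data):
        # New pattern: insert through the handle the table already holds.
        self.dataset.insert(data, mode="append")


old_style = ToyTable()
old_style.dataset.cache["stats"] = "warm"
old_style.add_by_reopening([1, 2])

new_style = ToyTable()
new_style.dataset.cache["stats"] = "warm"
new_style.add_in_place([1, 2])

print(old_style.opens, old_style.dataset.cache)
print(new_style.opens, new_style.dataset.cache)
```

In this toy model both tables end up with the same rows, but the re-opening variant performs an extra open and loses the cached entry, which mirrors the performance concern described in the commit message.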
@@ -1624,15 +1624,7 @@ class LanceTable(Table):
             on_bad_vectors=on_bad_vectors,
             fill_value=fill_value,
         )
-        # Access the dataset_mut property to ensure that the dataset is mutable.
-        self._ref.dataset_mut
-        self._ref.dataset = lance.write_dataset(
-            data,
-            self._dataset_uri,
-            schema=self.schema,
-            mode=mode,
-            storage_options=self._ref.storage_options,
-        )
+        self._ref.dataset_mut.insert(data, mode=mode, schema=self.schema)
 
     def merge(
         self,