Files
lancedb/python/tests
Will Jones a975cc0a94 fix: prevent duplicate data in FTS index (#728)
This forces the user to replace the whole FTS directory when re-creating
the index, prevent duplicate data from being created. Previously, the
whole dataset was re-added to the existing index, duplicating existing
rows in the index.

This (in combination with lancedb/lance#1707) caused #726, since the
duplicate data emitted duplicate indices for `take()` and an upstream
issue caused those queries to fail.

This solution isn't ideal, since it makes the FTS index temporarily
unavailable while the index is built. In the future, we should have
multiple FTS index directories, which would allow atomic commits of new
indexes (as well as multiple indexes for different columns).

Fixes #498.
Fixes #726.

---------

Co-authored-by: Chang She <759245+changhiskhan@users.noreply.github.com>
2024-04-05 16:24:30 -07:00
..
2024-04-05 16:24:30 -07:00
2023-05-04 11:41:05 -07:00