fix(python): Few fts patches (#1039)

1. filtering with fts mutated the schema, which caused schema mistmatch
problems with hybrid search as it combines fts and vector search tables.
2. fts with filter failed with `with_row_id`. This was because row_id
was calculated before filtering which caused size mismatch on attaching
it after.
3. The fix for 1 meant that now row_id is attached before filtering but
passing a filter to `to_lance` on a dataset that already contains
`_rowid` raises a panic from lance. So temporarily, in case where fts is
used with a filter AND `with_row_id`, we just force user to using the
duckdb pathway.

---------

Co-authored-by: Chang She <759245+changhiskhan@users.noreply.github.com>
This commit is contained in:
Ayush Chaurasia
2024-03-05 06:11:59 +05:30
committed by Weston Pace
parent c60a193767
commit b5326d31e9
3 changed files with 33 additions and 8 deletions

View File

@@ -893,8 +893,17 @@ def test_hybrid_search(db, tmp_path):
result3 = table.search(
"Our father who art in heaven", query_type="hybrid"
).to_pydantic(MyTable)
assert result1 == result3
# with post filters
result = (
table.search("Arrrrggghhhhhhh", query_type="hybrid")
.where("text='Arrrrggghhhhhhh'")
.to_list()
)
len(result) == 1
@pytest.mark.parametrize(
"consistency_interval", [None, timedelta(seconds=0), timedelta(seconds=0.1)]