fix(python): fix multiple embedding functions bug (#597)

Closes #594

The embedding functions are pydantic models so multiple instances with
the same parameters are considered ==, which means that if you have
multiple embedding columns it's possible for the embeddings to get
overwritten. Instead we use `is` instead of == to avoid this problem.

testing: modified unit test to include this case
This commit is contained in:
Chang She
2023-10-24 13:05:05 -04:00
committed by Weston Pace
parent 0036ca9de7
commit 2861f33982
2 changed files with 26 additions and 2 deletions

View File

@@ -327,7 +327,12 @@ class LanceModel(pydantic.BaseModel):
for vec, func in vec_and_function:
for source, field_info in cls.safe_get_fields().items():
src_func = get_extras(field_info, "source_column_for")
if src_func == func:
if src_func is func:
# note we can't use == here since the function is a pydantic
# model so two instances of the same function are ==, so if you
# have multiple vector columns from multiple sources, both will
# be mapped to the same source column
# GH594
configs.append(
EmbeddingFunctionConfig(
source_column=source, vector_column=vec, function=func