feat: adds isin support to the 'Expr' builder (#3523)

The `Expr` builder already includes a lot of useful filtering options,
`eq, ne, gt/gte, lt/lte, and_, or_, contains, cast`, but it was missing
a membership test like `isin`. This PR adds that support, as minimally as
possible, allowing easy filtering for membership in a list without
needing a series of `where` expressions.
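
As a rough illustration of why a membership test is handy, here is a toy, self-contained model of an expression builder. It is not the lancedb API: the `Expr` enum, the `col`/`lit` helpers, `to_sql`, and the `tag_id` column are all made up for this sketch. It only shows how one `isin` call stands in for a chain of equality checks.

```rust
// Toy expression builder (hypothetical; NOT the lancedb `Expr` type).
#[derive(Debug, Clone)]
enum Expr {
    Col(String),
    Lit(i64),
    Eq(Box<Expr>, Box<Expr>),
    Or(Box<Expr>, Box<Expr>),
    InList(Box<Expr>, Vec<Expr>),
}

// Hypothetical helpers for building column references and integer literals.
fn col(name: &str) -> Expr {
    Expr::Col(name.to_string())
}

fn lit(value: i64) -> Expr {
    Expr::Lit(value)
}

impl Expr {
    fn eq(self, other: Expr) -> Expr {
        Expr::Eq(Box::new(self), Box::new(other))
    }

    fn or_(self, other: Expr) -> Expr {
        Expr::Or(Box::new(self), Box::new(other))
    }

    // Membership test: true where `self` equals any item in `items`.
    fn isin(self, items: Vec<Expr>) -> Expr {
        Expr::InList(Box::new(self), items)
    }

    // Render the expression as a SQL-like filter string.
    fn to_sql(&self) -> String {
        match self {
            Expr::Col(name) => name.clone(),
            Expr::Lit(value) => value.to_string(),
            Expr::Eq(l, r) => format!("({} = {})", l.to_sql(), r.to_sql()),
            Expr::Or(l, r) => format!("({} OR {})", l.to_sql(), r.to_sql()),
            Expr::InList(expr, items) => {
                let rendered: Vec<String> = items.iter().map(|i| i.to_sql()).collect();
                format!("{} IN ({})", expr.to_sql(), rendered.join(", "))
            }
        }
    }
}

// Build the same filter two ways: chained equality checks vs. one `isin`.
fn demo() -> (String, String) {
    let chained = col("tag_id")
        .eq(lit(1))
        .or_(col("tag_id").eq(lit(2)))
        .or_(col("tag_id").eq(lit(3)));
    let member = col("tag_id").isin(vec![lit(1), lit(2), lit(3)]);
    (chained.to_sql(), member.to_sql())
}
```

In this sketch, `demo()` returns `(((tag_id = 1) OR (tag_id = 2)) OR (tag_id = 3))` for the chained form and `tag_id IN (1, 2, 3)` for the `isin` form. The real builder works on DataFusion expressions rather than strings.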

I didn't see anything in CONTRIBUTING.md about needing a feature request
or issue first, so I just made the change. My apologies if I missed that
somewhere.

Thanks for the vector store, we're using it now in paperless-ngx.
Trenton H
2026-06-10 15:28:19 -07:00
committed by GitHub
parent d786e39fdc
commit 85d9c1ce63
6 changed files with 51 additions and 2 deletions


@@ -9,7 +9,9 @@
use arrow::{datatypes::DataType, pyarrow::PyArrowType};
use datafusion_common::ScalarValue;
-use lancedb::expr::{DfExpr, col as ldb_col, contains, expr_cast, lit as df_lit, lower, upper};
+use lancedb::expr::{
+    DfExpr, col as ldb_col, contains, expr_cast, is_in, lit as df_lit, lower, upper,
+};
use pyo3::types::PyBytes;
use pyo3::{Bound, PyAny, PyResult, exceptions::PyValueError, prelude::*, pyfunction};
@@ -105,6 +107,14 @@ impl PyExpr {
Self(contains(self.0.clone(), substr.0.clone()))
}
+    // ── membership ───────────────────────────────────────────────────────────
+    /// Return true where the value is one of the given expressions (SQL ``IN``).
+    fn isin(&self, list: Vec<Self>) -> Self {
+        let items: Vec<DfExpr> = list.into_iter().map(|e| e.0).collect();
+        Self(is_in(self.0.clone(), items))
+    }
// ── type cast ────────────────────────────────────────────────────────────
/// Cast the expression to `data_type`.