Commit Graph

818 Commits

Author SHA1 Message Date
BubbleCal
544382df5e fix: handle batch quires in single request (#2139) 2025-02-21 13:23:39 +08:00
Lance Release
a33a0670f6 Bump version: 0.19.1-beta.2 → 0.19.1-beta.3 2025-02-20 03:37:27 +00:00
Lei Xu
1865f7decf fix: support optional nested pydantic model (#2130)
Closes #2129
2025-02-17 20:43:13 -08:00
BubbleCal
a608621476 test: query with dist range and new rows (#2126)
we found a bug that flat KNN plan node's stats is not in right order as
fields in schema, it would cause an error if querying with distance
range and new unindexed rows.

we've fixed this in lance so add this test for verifying it works

Signed-off-by: BubbleCal <bubble-cal@outlook.com>
2025-02-17 12:57:45 +08:00
Lance Release
40f0dbb64d Bump version: 0.19.1-beta.1 → 0.19.1-beta.2 2025-02-13 04:39:19 +00:00
Will Jones
78a17ad54c chore: improve dev instructions for Python (#2088)
Closes #2042
2025-02-12 14:08:52 -08:00
Lance Release
d18d63c69d Bump version: 0.19.1-beta.0 → 0.19.1-beta.1 2025-02-11 20:55:23 +00:00
Lance Release
e64712cfa5 Bump version: 0.19.0 → 0.19.1-beta.0 2025-02-07 19:27:07 +00:00
Lance Release
998cd43fe6 Bump version: 0.19.0-beta.0 → 0.19.0 2025-02-07 17:32:26 +00:00
Lance Release
4bc7eebe61 Bump version: 0.18.1-beta.4 → 0.19.0-beta.0 2025-02-07 17:32:26 +00:00
Will Jones
e7574698eb feat: upgrade Lance to 0.23.0 (#2101)
Upstream changelog:
https://github.com/lancedb/lance/releases/tag/v0.23.0
2025-02-07 07:58:07 -08:00
Will Jones
801a9e5f6f feat(python): streaming larger-than-memory writes (#2094)
Makes our preprocessing pipeline do transforms in streaming fashion, so
users can do larger-then-memory writes.

Closes #2082
2025-02-06 16:37:30 -08:00
Weston Pace
1a449fa49e refactor: rename drop_db / drop_database to drop_all_tables, expose database from connection (#2098)
If we start supporting external catalogs then "drop database" may be
misleading (and not possible). We should be more clear that this is a
utility method to drop all tables. This is also a nice chance for some
consistency cleanup as it was `drop_db` in rust, `drop_database` in
python, and non-existent in typescript.

This PR also adds a public accessor to get the database trait from a
connection.

BREAKING CHANGE: the `drop_database` / `drop_db` methods are now
deprecated.
2025-02-06 13:22:28 -08:00
Weston Pace
6bf742c759 feat: expose table trait (#2097)
Similar to
c269524b2f
this PR reworks and exposes an internal trait (this time
`TableInternal`) to be a public trait. These two PRs together should
make it possible for others to integrate LanceDB on top of other
catalogs.

This PR also adds a basic `TableProvider` implementation for tables,
although some work still needs to be done here (pushdown not yet
enabled).
2025-02-05 18:13:51 -08:00
Ryan Green
ef3093bc23 feat: drop_index() remote implementation (#2093)
Support drop_index operation in remote table.
2025-02-05 10:06:19 -03:30
Will Jones
16851389ea feat: extra headers parameter in client options (#2091)
Closes #1106

Unfortunately, these need to be set at the connection level. I
investigated whether if we let users provide a callback they could use
`AsyncLocalStorage` to access their context. However, it doesn't seem
like NAPI supports this right now. I filed an issue:
https://github.com/napi-rs/napi-rs/issues/2456
2025-02-04 17:26:45 -08:00
Weston Pace
c269524b2f feat!: refactor ConnectionInternal into a Database trait (#2067)
This opens up the door for more custom database implementations than the
two we have today. The biggest change should be inivisble:
`ConnectionInternal` has been renamed to `Database`, made public, and
refactored

However, there are a few breaking changes. `data_storage_version` and
`enable_v2_manifest_paths` have been moved from options on
`create_table` to options for the database which are now set via
`storage_options`.

Before:
```
db = connect(uri)
tbl = db.create_table("my_table", data, data_storage_version="legacy", enable_v2_manifest_paths=True)
```

After:
```
db = connect(uri, storage_options={
  "new_table_enable_v2_manifest_paths": "true",
  "new_table_data_storage_version": "legacy"
})
tbl = db.create_table("my_table", data)
```

BREAKING CHANGE: the data_storage_version, enable_v2_manifest_paths
options have moved from options to create_table to storage_options.
BREAKING CHANGE: the use_legacy_format option has been removed,
data_storage_version has replaced it for some time now
2025-02-04 14:35:14 -08:00
Lance Release
f6eef14313 Bump version: 0.18.1-beta.3 → 0.18.1-beta.4 2025-02-04 17:25:52 +00:00
Rob Meng
32716adaa3 chore: bump lance version (#2092) 2025-02-04 12:25:05 -05:00
Lance Release
482f1ee1d3 Bump version: 0.18.1-beta.2 → 0.18.1-beta.3 2025-02-01 01:20:49 +00:00
Will Jones
2f39274a66 feat: upgrade lance to 0.23.0-beta.4 (#2089)
Upstream changelog:
https://github.com/lancedb/lance/releases/tag/v0.23.0-beta.4
2025-01-31 17:20:15 -08:00
Will Jones
2fc174f532 docs: add sync/async tabs to quickstart (#2087)
Closes #2033
2025-01-31 15:43:54 -08:00
Will Jones
dba85f4d6f docs: user guide for merge insert (#2083)
Closes #2062
2025-01-31 10:03:21 -08:00
Will Jones
15f8f4d627 ci: check license headers (#2076)
Based on the same workflow in Lance.
2025-01-29 08:27:07 -08:00
Lance Release
a9897d9d85 Bump version: 0.18.1-beta.1 → 0.18.1-beta.2 2025-01-28 22:31:14 +00:00
Will Jones
acda7a4589 feat: upgrade lance to v0.23.0-beta.3 (#2074)
This includes several bugfixes for `merge_insert` and null handling in
vector search.

https://github.com/lancedb/lance/releases/tag/v0.23.0-beta.3
2025-01-28 14:00:06 -08:00
Vaibhav
dac0857745 feat: add distance_type() parameter to python sync query builders and metric() as an alias (#2073)
This PR aims to fix #2047 by doing the following things:
- Add a distance_type parameter to the sync query builders of Python
SDK.
- Make metric an alias to distance_type.
2025-01-28 13:59:53 -08:00
Lance Release
e5f42a850e Bump version: 0.18.1-beta.0 → 0.18.1-beta.1 2025-01-23 23:01:13 +00:00
Will Jones
28e1b70e4b fix(python): preserve original distance and score in hybrid queries (#2061)
Fixes #2031

When we do hybrid search, we normalize the scores. We do this
calculation in-place, because the Rerankers expect the `_distance` and
`_score` columns to be the normalized ones. So I've changed the logic so
that we restore the original distance and scores by matching on row ids.
2025-01-23 13:54:26 -08:00
Will Jones
52b79d2b1e feat: upgrade lance to v0.23.0-beta.2 (#2063)
Fixes https://github.com/lancedb/lancedb/issues/2043
2025-01-23 13:51:30 -08:00
Will Jones
bcfc93cc88 fix(python): various fixes for async query builders (#2048)
This includes several improvements and fixes to the Python Async query
builders:

1. The API reference docs show all the methods for each builder
2. The hybrid query builder now has all the same setter methods as the
vector search one, so you can now set things like `.distance_type()` on
a hybrid query.
3. Re-rankers are now properly hooked up and tested for FTS and vector
search. Previously the re-rankers were accidentally bypassed in unit
tests, because the builders overrode `.to_arrow()`, but the unit test
called `.to_batches()` which was only defined in the base class. Now all
builders implement `.to_batches()` and leave `.to_arrow()` to the base
class.
4. The `AsyncQueryBase` and `AsyncVectoryQueryBase` setter methods now
return `Self`, which provides the appropriate subclass as the type hint
return value. Previously, `AsyncQueryBase` had them all hard-coded to
`AsyncQuery`, which was unfortunate. (This required bringing in
`typing-extensions` for older Python version, but I think it's worth
it.)
2025-01-20 16:14:34 -08:00
BubbleCal
214d0debf5 docs: claim LanceDB supports float16/float32/float64 for multivector (#2040) 2025-01-21 07:04:15 +08:00
Will Jones
f059372137 feat: add drop_index() method (#2039)
Closes #1665
2025-01-20 10:08:51 -08:00
Lance Release
3dc1803c07 Bump version: 0.18.0 → 0.18.1-beta.0 2025-01-17 04:37:23 +00:00
BubbleCal
d0501f65f1 fix: linear reranker applies wrong score to combine (#2035)
related to #2014 
this fixes:
- linear reranker may lost some results if the merging consumes all
vector results earlier than fts results
- linear reranker inverts the fts score but only vector distance can be
inverted

---------

Signed-off-by: BubbleCal <bubble-cal@outlook.com>
2025-01-17 11:33:48 +08:00
Bert
4703cc6894 chore: upgrade lance to v0.22.1-beta.3 (#2038) 2025-01-16 12:42:42 -05:00
BubbleCal
493f9ce467 fix: can't infer the vector column for multivector (#2026)
Signed-off-by: BubbleCal <bubble-cal@outlook.com>
2025-01-16 14:08:04 +08:00
Weston Pace
5c759505b8 feat: upgrade lance 0.22.1b1 (#2029)
Now the version actually exists :)
2025-01-15 07:37:37 -08:00
BubbleCal
d57bed90e5 docs: add missing example code (#2025) 2025-01-14 21:17:05 -08:00
BubbleCal
648327e90c docs: show how to pack bits for binary vector (#2020)
Signed-off-by: BubbleCal <bubble-cal@outlook.com>
2025-01-14 09:00:57 -08:00
Lance Release
995bd9bf37 Bump version: 0.18.0-beta.1 → 0.18.0 2025-01-14 01:02:26 +00:00
Lance Release
36cc06697f Bump version: 0.18.0-beta.0 → 0.18.0-beta.1 2025-01-14 01:02:25 +00:00
Will Jones
92dcf24b0c feat: upgrade Lance to v0.22.0 (#2017)
Upstream changelog:
https://github.com/lancedb/lance/releases/tag/v0.22.0
2025-01-13 15:06:01 -08:00
BubbleCal
66cbf6b6c5 feat: support multivector type (#2005)
Signed-off-by: BubbleCal <bubble-cal@outlook.com>
2025-01-13 14:10:40 -08:00
Prashant Dixit
b66cd943a7 fix: broken voyageai embedding API (#2013)
This PR fixes the broken Embedding API for Voyageai.
2025-01-13 08:52:38 -08:00
Weston Pace
d8d11f48e7 feat: upgrade to lance 0.22.0b1 (#2011) 2025-01-10 12:51:52 -08:00
Lance Release
fbffe532a8 Bump version: 0.17.2-beta.2 → 0.18.0-beta.0 2025-01-10 19:01:20 +00:00
Will Jones
6eacae18c4 test: fix test failure from merge (#2007) 2025-01-09 11:27:24 -08:00
Bert
f4afe456e8 feat!: change default from postfiltering to prefiltering for sync python (#2000)
BREAKING CHANGE: prefiltering is now the default in the synchronous
python SDK

resolves: #1872
2025-01-08 19:13:58 -05:00
Renato Marroquin
ea5c2266b8 feat(python): support .rerank() on non-hybrid queries in Async API (WIP) (#1972)
Fixes https://github.com/lancedb/lancedb/issues/1950

---------

Co-authored-by: Renato Marroquin <renato.marroquin@oracle.com>
2025-01-08 16:42:47 -05:00