Commit Graph

1525 Commits

Author SHA1 Message Date
Will Jones
bcfc93cc88 fix(python): various fixes for async query builders (#2048)
This includes several improvements and fixes to the Python Async query
builders:

1. The API reference docs show all the methods for each builder
2. The hybrid query builder now has all the same setter methods as the
vector search one, so you can now set things like `.distance_type()` on
a hybrid query.
3. Re-rankers are now properly hooked up and tested for FTS and vector
search. Previously the re-rankers were accidentally bypassed in unit
tests, because the builders overrode `.to_arrow()`, but the unit test
called `.to_batches()` which was only defined in the base class. Now all
builders implement `.to_batches()` and leave `.to_arrow()` to the base
class.
4. The `AsyncQueryBase` and `AsyncVectorQueryBase` setter methods now
return `Self`, which provides the appropriate subclass as the type hint
return value. Previously, `AsyncQueryBase` had them all hard-coded to
`AsyncQuery`, which was unfortunate. (This required bringing in
`typing-extensions` for older Python versions, but I think it's worth
it.)
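
As a rough illustration of points 3 and 4, the sketch below uses hypothetical classes (`QueryBase`, `RerankedQuery`) rather than LanceDB's actual builders. It assumes the pattern described above: subclasses override only `to_batches()`, the base class derives its other outputs from it (`to_list()` stands in for `.to_arrow()` here), and setters are annotated as returning `Self`.

```python
from __future__ import annotations

from typing import Iterator, List

try:  # Python 3.11+
    from typing import Self
except ImportError:  # older Pythons need the typing-extensions backport
    from typing_extensions import Self


class QueryBase:
    """Hypothetical base builder; not LanceDB's actual class."""

    def __init__(self) -> None:
        self._limit = 10

    def limit(self, n: int) -> Self:
        # Annotating the return as Self lets type checkers infer the
        # subclass type when calls are chained.
        self._limit = n
        return self

    def to_batches(self) -> Iterator[List[int]]:
        # The single hook that subclasses override.
        yield list(range(self._limit))

    def to_list(self) -> List[int]:
        # Defined once in the base class, in terms of to_batches(),
        # so every output path picks up subclass behaviour.
        return [row for batch in self.to_batches() for row in batch]


class RerankedQuery(QueryBase):
    def to_batches(self) -> Iterator[List[int]]:
        for batch in super().to_batches():
            # Stand-in for a reranker: reorder each batch.
            yield sorted(batch, reverse=True)


query = RerankedQuery().limit(3)  # still a RerankedQuery after chaining
result = query.to_list()
```

Because `to_list()` is built on `to_batches()`, a caller cannot bypass the reordering step by choosing a different output method.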
2025-01-20 16:14:34 -08:00
BubbleCal
214d0debf5 docs: claim LanceDB supports float16/float32/float64 for multivector (#2040) 2025-01-21 07:04:15 +08:00
Will Jones
f059372137 feat: add drop_index() method (#2039)
Closes #1665
2025-01-20 10:08:51 -08:00
Lance Release
3dc1803c07 Bump version: 0.18.0 → 0.18.1-beta.0 python-v0.18.1-beta.0 2025-01-17 04:37:23 +00:00
BubbleCal
d0501f65f1 fix: linear reranker applies wrong score to combine (#2035)
Related to #2014.
This fixes two bugs:
- The linear reranker could lose some results if the merge consumed all
vector results before the FTS results.
- The linear reranker inverted the FTS score, but only the vector
distance should be inverted.
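
A minimal sketch of the intended behaviour, not the actual reranker code: the function name `linear_combine`, the `weight` and `fill` parameters, and the assumption that distances are already normalized into [0, 1] are all illustrative.

```python
from typing import Dict


def linear_combine(
    vector_distances: Dict[str, float],
    fts_scores: Dict[str, float],
    weight: float = 0.7,  # hypothetical default
    fill: float = 0.0,    # score used when a row is missing from one side
) -> Dict[str, float]:
    """Combine vector and FTS results into one relevance score per row id."""
    combined: Dict[str, float] = {}
    # Walk the union of ids so that no result is dropped when one result
    # list is exhausted before the other.
    for row_id in vector_distances.keys() | fts_scores.keys():
        if row_id in vector_distances:
            # A distance is "smaller is better", so invert it into a score.
            vector_score = 1.0 - vector_distances[row_id]
        else:
            vector_score = fill
        # An FTS score is already "larger is better"; use it unchanged.
        fts_score = fts_scores.get(row_id, fill)
        combined[row_id] = weight * vector_score + (1.0 - weight) * fts_score
    return combined


scores = linear_combine({"a": 0.1, "b": 0.4}, {"b": 0.9, "c": 0.5}, weight=0.5)
ranked = sorted(scores, key=scores.get, reverse=True)
```

Row `c` appears only in the FTS results and row `a` only in the vector results, yet both survive the merge.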

---------

Signed-off-by: BubbleCal <bubble-cal@outlook.com>
2025-01-17 11:33:48 +08:00
Bert
4703cc6894 chore: upgrade lance to v0.22.1-beta.3 (#2038) 2025-01-16 12:42:42 -05:00
BubbleCal
493f9ce467 fix: can't infer the vector column for multivector (#2026)
Signed-off-by: BubbleCal <bubble-cal@outlook.com>
2025-01-16 14:08:04 +08:00
Weston Pace
5c759505b8 feat: upgrade lance 0.22.1b1 (#2029)
Now the version actually exists :)
2025-01-15 07:37:37 -08:00
BubbleCal
bb6a39727e fix: missing distance type for auto index on RemoteTable (#2027)
Signed-off-by: BubbleCal <bubble-cal@outlook.com>
2025-01-15 20:28:55 +08:00
BubbleCal
d57bed90e5 docs: add missing example code (#2025) 2025-01-14 21:17:05 -08:00
BubbleCal
648327e90c docs: show how to pack bits for binary vector (#2020)
Signed-off-by: BubbleCal <bubble-cal@outlook.com>
2025-01-14 09:00:57 -08:00
Lance Release
6c7e81ee57 Updating package-lock.json 2025-01-14 02:14:37 +00:00
Lance Release
905e9d4738 Updating package-lock.json 2025-01-14 01:03:49 +00:00
Lance Release
38642e349c Updating package-lock.json 2025-01-14 01:03:33 +00:00
Lance Release
6879861ea8 Bump version: 0.15.0-beta.1 → 0.15.0 v0.15.0 2025-01-14 01:03:04 +00:00
Lance Release
88325e488e Bump version: 0.15.0-beta.0 → 0.15.0-beta.1 2025-01-14 01:02:59 +00:00
Lance Release
995bd9bf37 Bump version: 0.18.0-beta.1 → 0.18.0 python-v0.18.0 2025-01-14 01:02:26 +00:00
Lance Release
36cc06697f Bump version: 0.18.0-beta.0 → 0.18.0-beta.1 2025-01-14 01:02:25 +00:00
Will Jones
35da464591 ci: fix stable check (#2019) 2025-01-13 17:01:54 -08:00
Will Jones
31f9c30ffb chore: fix test of error message (#2018)
Addresses failure on `main`:
https://github.com/lancedb/lancedb/actions/runs/12757756657/job/35558683317
2025-01-13 15:36:46 -08:00
Will Jones
92dcf24b0c feat: upgrade Lance to v0.22.0 (#2017)
Upstream changelog:
https://github.com/lancedb/lance/releases/tag/v0.22.0
2025-01-13 15:06:01 -08:00
Will Jones
6b0adba2d9 chore: add deprecation warning to vectordb (#2003) 2025-01-13 14:53:12 -08:00
BubbleCal
66cbf6b6c5 feat: support multivector type (#2005)
Signed-off-by: BubbleCal <bubble-cal@outlook.com>
2025-01-13 14:10:40 -08:00
Keming
ce9506db71 docs(hnsw): fix markdown list style (#2015) 2025-01-13 08:53:13 -08:00
Prashant Dixit
b66cd943a7 fix: broken voyageai embedding API (#2013)
This PR fixes the broken embedding API for VoyageAI.
2025-01-13 08:52:38 -08:00
Weston Pace
d8d11f48e7 feat: upgrade to lance 0.22.0b1 (#2011) 2025-01-10 12:51:52 -08:00
Lance Release
7ec5df3022 Updating package-lock.json 2025-01-10 19:58:10 +00:00
Lance Release
b17304172c Updating package-lock.json 2025-01-10 19:02:31 +00:00
Lance Release
fbe5408434 Updating package-lock.json 2025-01-10 19:02:15 +00:00
Lance Release
3f3f845c5a Bump version: 0.14.2-beta.0 → 0.15.0-beta.0 v0.15.0-beta.0 2025-01-10 19:01:47 +00:00
Lance Release
fbffe532a8 Bump version: 0.17.2-beta.2 → 0.18.0-beta.0 python-v0.18.0-beta.0 2025-01-10 19:01:20 +00:00
Josef Gugglberger
55ffc96e56 docs: update storage.md, fix Azure Sync connect example (#2010)
The sync code example mistakenly included an `await`.


![image](https://github.com/user-attachments/assets/4e1a1bd9-f2fb-4dbe-a9a6-1384ab63edbb)
2025-01-10 09:01:19 -08:00
Mr. Doge
998c5f3f74 ci: add dbghelp.lib to sysroot-aarch64-pc-windows-msvc.sh (#1975) (#2008)
successful runs:
https://github.com/FuPeiJiang/lancedb/actions/runs/12698662005
2025-01-09 14:24:09 -08:00
Will Jones
6eacae18c4 test: fix test failure from merge (#2007) 2025-01-09 11:27:24 -08:00
Bert
d3ea75cc2b feat: expose dataset config (#2004)
Expose methods on NativeTable for updating schema metadata and dataset
config & getting the dataset config via the manifest.
2025-01-08 21:13:18 -05:00
Bert
f4afe456e8 feat!: change default from postfiltering to prefiltering for sync python (#2000)
BREAKING CHANGE: prefiltering is now the default in the synchronous
Python SDK

resolves: #1872
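
To show why the default matters, here is a plain-Python sketch of the two strategies. It does not use the LanceDB API; the `search` helper and the row layout are hypothetical.

```python
from typing import Callable, List, Tuple

Row = Tuple[str, float, str]  # (id, distance to the query, category)


def search(
    rows: List[Row],
    k: int,
    predicate: Callable[[Row], bool],
    prefilter: bool,
) -> List[Row]:
    if prefilter:
        # Prefiltering: apply the filter first, then take the k nearest
        # of the rows that passed.
        candidates = [r for r in rows if predicate(r)]
        return sorted(candidates, key=lambda r: r[1])[:k]
    # Postfiltering: take the k nearest first, then drop rows that fail
    # the filter, which may leave fewer than k results.
    nearest = sorted(rows, key=lambda r: r[1])[:k]
    return [r for r in nearest if predicate(r)]


rows = [("a", 0.1, "cat"), ("b", 0.2, "cat"), ("c", 0.3, "dog"), ("d", 0.4, "dog")]


def is_dog(row: Row) -> bool:
    return row[2] == "dog"


pre = search(rows, 2, is_dog, prefilter=True)
post = search(rows, 2, is_dog, prefilter=False)
```

With this data, prefiltering returns the two nearest `dog` rows, while postfiltering returns nothing because both of the two nearest rows are `cat`.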
2025-01-08 19:13:58 -05:00
Renato Marroquin
ea5c2266b8 feat(python): support .rerank() on non-hybrid queries in Async API (WIP) (#1972)
Fixes https://github.com/lancedb/lancedb/issues/1950

---------

Co-authored-by: Renato Marroquin <renato.marroquin@oracle.com>
2025-01-08 16:42:47 -05:00
Will Jones
c557e77f09 feat(python)!: support inserting and upserting subschemas (#1965)
BREAKING CHANGE: For a field named "vector", a list of integers will now
be converted to a binary (uint8) vector instead of an f32 vector. Use
float values instead for f32 vectors.

* Adds proper support for inserting and upserting subsets of the full
schema. I thought I had previously implemented this in #1827, but it
turns out I had not tested carefully enough.
* Refactors `_sanitize_data` and other utility functions to be simpler
and not require `numpy` or `combine_chunks()`.
* Adds a new suite of unit tests to validate sanitization utilities.

## Examples

```python
import pandas as pd
import lancedb

db = lancedb.connect("memory://demo")
initial_data = pd.DataFrame({
    "a": [1, 2, 3],
    "b": [4, 5, 6],
    "c": [7, 8, 9]
})
table = db.create_table("demo", initial_data)

# Insert a subschema
new_data = pd.DataFrame({"a": [10, 11]})
table.add(new_data)
table.to_pandas()
```
```
    a    b    c
0   1  4.0  7.0
1   2  5.0  8.0
2   3  6.0  9.0
3  10  NaN  NaN
4  11  NaN  NaN
```


```python
# Upsert a subschema
upsert_data = pd.DataFrame({
    "a": [3, 10, 15],
    "b": [6, 7, 8],
})
table.merge_insert(on="a").when_matched_update_all().when_not_matched_insert_all().execute(upsert_data)
table.to_pandas()
```
```
    a    b    c
0   1  4.0  7.0
1   2  5.0  8.0
2   3  6.0  9.0
3  10  7.0  NaN
4  11  NaN  NaN
5  15  8.0  NaN
```
2025-01-08 10:11:10 -08:00
BubbleCal
3c0a64be8f feat: support distance range in queries (#1999)
this also updates the docs

---------

Signed-off-by: BubbleCal <bubble-cal@outlook.com>
2025-01-08 11:03:27 +08:00
Will Jones
0e496ed3b5 docs: contributing guide (#1970)
* Adds basic contributing guides.
* Simplifies Python development with a Makefile.
2025-01-07 15:11:16 -08:00
QianZhu
17c9e9afea docs: add async examples to doc (#1941)
- added sync and async tabs for python examples
- moved python code to tests/docs

---------

Co-authored-by: Will Jones <willjones127@gmail.com>
2025-01-07 15:10:25 -08:00
Wyatt Alt
0b45ef93c0 docs: assorted copyedits (#1998)
This includes a handful of minor edits I made while reading the docs. In
addition to a few spelling fixes,
* standardize on "rerank" over "re-rank" in prose
* terminate sentences with periods or colons as appropriate
* replace some usage of dashes with colons, such as in "Try it yourself
- <link>"

All changes are surface-level. No changes to semantics or structure.

---------

Co-authored-by: Will Jones <willjones127@gmail.com>
2025-01-06 15:04:48 -08:00
Gagan Bhullar
b474f98049 feat(python): flatten in AsyncQuery (#1967)
PR fixes #1949

---------

Co-authored-by: Will Jones <willjones127@gmail.com>
2025-01-06 10:52:03 -08:00
Takahiro Ebato
2c05ffed52 feat(python): add to_polars to AsyncQueryBase (#1986)
Fixes https://github.com/lancedb/lancedb/issues/1952

Added `to_polars` method to `AsyncQueryBase`.
2025-01-06 09:35:28 -08:00
Will Jones
8b31540b21 ci: prevent stable release with preview lance (#1995)
We accidentally referenced a preview release of Lance in our stable
release of LanceDB. This adds a CI check to prevent that.
2025-01-06 08:54:14 -08:00
Lance Release
ba844318f8 Updating package-lock.json 2025-01-06 06:26:41 +00:00
Lance Release
f007b76153 Updating package-lock.json 2025-01-06 05:35:28 +00:00
Lance Release
5d8d258f59 Updating package-lock.json 2025-01-06 05:35:13 +00:00
Lance Release
4172140f74 Bump version: 0.14.1 → 0.14.2-beta.0 v0.14.2-beta.0 2025-01-06 05:34:52 +00:00
Lance Release
a27c5cf12b Bump version: 0.17.2-beta.1 → 0.17.2-beta.2 python-v0.17.2-beta.2 2025-01-06 05:34:27 +00:00