lancedb

mirror of https://github.com/lancedb/lancedb.git synced 2025-12-25 14:29:56 +00:00

Author	SHA1	Message	Date
Lance Release	482f1ee1d3	Bump version: 0.18.1-beta.2 → 0.18.1-beta.3	2025-02-01 01:20:49 +00:00
Will Jones	2f39274a66	feat: upgrade lance to 0.23.0-beta.4 (#2089 ) Upstream changelog: https://github.com/lancedb/lance/releases/tag/v0.23.0-beta.4	2025-01-31 17:20:15 -08:00
Will Jones	2fc174f532	docs: add sync/async tabs to quickstart (#2087 ) Closes #2033	2025-01-31 15:43:54 -08:00
Will Jones	dba85f4d6f	docs: user guide for merge insert (#2083 ) Closes #2062	2025-01-31 10:03:21 -08:00
Will Jones	15f8f4d627	ci: check license headers (#2076 ) Based on the same workflow in Lance.	2025-01-29 08:27:07 -08:00
Lance Release	a9897d9d85	Bump version: 0.18.1-beta.1 → 0.18.1-beta.2	2025-01-28 22:31:14 +00:00
Will Jones	acda7a4589	feat: upgrade lance to v0.23.0-beta.3 (#2074 ) This includes several bugfixes for `merge_insert` and null handling in vector search. https://github.com/lancedb/lance/releases/tag/v0.23.0-beta.3	2025-01-28 14:00:06 -08:00
Vaibhav	dac0857745	feat: add `distance_type()` parameter to python sync query builders and `metric()` as an alias (#2073 ) This PR aims to fix #2047 by doing the following things: - Add a distance_type parameter to the sync query builders of Python SDK. - Make metric an alias to distance_type.	2025-01-28 13:59:53 -08:00
Lance Release	e5f42a850e	Bump version: 0.18.1-beta.0 → 0.18.1-beta.1	2025-01-23 23:01:13 +00:00
Will Jones	28e1b70e4b	fix(python): preserve original distance and score in hybrid queries (#2061 ) Fixes #2031 When we do hybrid search, we normalize the scores. We do this calculation in-place, because the Rerankers expect the `_distance` and `_score` columns to be the normalized ones. So I've changed the logic so that we restore the original distance and scores by matching on row ids.	2025-01-23 13:54:26 -08:00
Will Jones	52b79d2b1e	feat: upgrade lance to v0.23.0-beta.2 (#2063 ) Fixes https://github.com/lancedb/lancedb/issues/2043	2025-01-23 13:51:30 -08:00
Will Jones	bcfc93cc88	fix(python): various fixes for async query builders (#2048 ) This includes several improvements and fixes to the Python Async query builders: 1. The API reference docs show all the methods for each builder 2. The hybrid query builder now has all the same setter methods as the vector search one, so you can now set things like `.distance_type()` on a hybrid query. 3. Re-rankers are now properly hooked up and tested for FTS and vector search. Previously the re-rankers were accidentally bypassed in unit tests, because the builders overrode `.to_arrow()`, but the unit test called `.to_batches()` which was only defined in the base class. Now all builders implement `.to_batches()` and leave `.to_arrow()` to the base class. 4. The `AsyncQueryBase` and `AsyncVectorQueryBase` setter methods now return `Self`, which provides the appropriate subclass as the type hint return value. Previously, `AsyncQueryBase` had them all hard-coded to `AsyncQuery`, which was unfortunate. (This required bringing in `typing-extensions` for older Python versions, but I think it's worth it.)	2025-01-20 16:14:34 -08:00
BubbleCal	214d0debf5	docs: claim LanceDB supports float16/float32/float64 for multivector (#2040 )	2025-01-21 07:04:15 +08:00
Will Jones	f059372137	feat: add `drop_index()` method (#2039 ) Closes #1665	2025-01-20 10:08:51 -08:00
Lance Release	3dc1803c07	Bump version: 0.18.0 → 0.18.1-beta.0	2025-01-17 04:37:23 +00:00
BubbleCal	d0501f65f1	fix: linear reranker applies wrong score to combine (#2035 ) related to #2014 this fixes: - linear reranker may lose some results if the merging consumes all vector results earlier than fts results - linear reranker inverts the fts score but only vector distance can be inverted --------- Signed-off-by: BubbleCal <bubble-cal@outlook.com>	2025-01-17 11:33:48 +08:00
Bert	4703cc6894	chore: upgrade lance to v0.22.1-beta.3 (#2038 )	2025-01-16 12:42:42 -05:00
BubbleCal	493f9ce467	fix: can't infer the vector column for multivector (#2026 ) Signed-off-by: BubbleCal <bubble-cal@outlook.com>	2025-01-16 14:08:04 +08:00
Weston Pace	5c759505b8	feat: upgrade lance 0.22.1b1 (#2029 ) Now the version actually exists :)	2025-01-15 07:37:37 -08:00
BubbleCal	d57bed90e5	docs: add missing example code (#2025 )	2025-01-14 21:17:05 -08:00
BubbleCal	648327e90c	docs: show how to pack bits for binary vector (#2020 ) Signed-off-by: BubbleCal <bubble-cal@outlook.com>	2025-01-14 09:00:57 -08:00
Lance Release	995bd9bf37	Bump version: 0.18.0-beta.1 → 0.18.0	2025-01-14 01:02:26 +00:00
Lance Release	36cc06697f	Bump version: 0.18.0-beta.0 → 0.18.0-beta.1	2025-01-14 01:02:25 +00:00
Will Jones	92dcf24b0c	feat: upgrade Lance to v0.22.0 (#2017 ) Upstream changelog: https://github.com/lancedb/lance/releases/tag/v0.22.0	2025-01-13 15:06:01 -08:00
BubbleCal	66cbf6b6c5	feat: support multivector type (#2005 ) Signed-off-by: BubbleCal <bubble-cal@outlook.com>	2025-01-13 14:10:40 -08:00
Prashant Dixit	b66cd943a7	fix: broken voyageai embedding API (#2013 ) This PR fixes the broken Embedding API for Voyageai.	2025-01-13 08:52:38 -08:00
Weston Pace	d8d11f48e7	feat: upgrade to lance 0.22.0b1 (#2011 )	2025-01-10 12:51:52 -08:00
Lance Release	fbffe532a8	Bump version: 0.17.2-beta.2 → 0.18.0-beta.0	2025-01-10 19:01:20 +00:00
Will Jones	6eacae18c4	test: fix test failure from merge (#2007 )	2025-01-09 11:27:24 -08:00
Bert	f4afe456e8	feat!: change default from postfiltering to prefiltering for sync python (#2000 ) BREAKING CHANGE: prefiltering is now the default in the synchronous python SDK resolves: #1872	2025-01-08 19:13:58 -05:00
Renato Marroquin	ea5c2266b8	feat(python): support .rerank() on non-hybrid queries in Async API (WIP) (#1972 ) Fixes https://github.com/lancedb/lancedb/issues/1950 --------- Co-authored-by: Renato Marroquin <renato.marroquin@oracle.com>	2025-01-08 16:42:47 -05:00
Will Jones	c557e77f09	feat(python)!: support inserting and upserting subschemas (#1965 ) BREAKING CHANGE: For a field "vector", list of integers will now be converted to binary (uint8) vectors instead of f32 vectors. Use float values instead for f32 vectors. * Adds proper support for inserting and upserting subsets of the full schema. I thought I had previously implemented this in #1827, but it turns out I had not tested carefully enough. * Refactors `_sanitize_data` and other utility functions to be simpler and not require `numpy` or `combine_chunks()`. * Added a new suite of unit tests to validate sanitization utilities. ## Examples ```python import pandas as pd import lancedb db = lancedb.connect("memory://demo") initial_data = pd.DataFrame({ "a": [1, 2, 3], "b": [4, 5, 6], "c": [7, 8, 9] }) table = db.create_table("demo", initial_data) # Insert a subschema new_data = pd.DataFrame({"a": [10, 11]}) table.add(new_data) table.to_pandas() ``` ``` a b c 0 1 4.0 7.0 1 2 5.0 8.0 2 3 6.0 9.0 3 10 NaN NaN 4 11 NaN NaN ``` ```python # Upsert a subschema upsert_data = pd.DataFrame({ "a": [3, 10, 15], "b": [6, 7, 8], }) table.merge_insert(on="a").when_matched_update_all().when_not_matched_insert_all().execute(upsert_data) table.to_pandas() ``` ``` a b c 0 1 4.0 7.0 1 2 5.0 8.0 2 3 6.0 9.0 3 10 7.0 NaN 4 11 NaN NaN 5 15 8.0 NaN ```	2025-01-08 10:11:10 -08:00
BubbleCal	3c0a64be8f	feat: support distance range in queries (#1999 ) this also updates the docs --------- Signed-off-by: BubbleCal <bubble-cal@outlook.com>	2025-01-08 11:03:27 +08:00
Will Jones	0e496ed3b5	docs: contributing guide (#1970 ) * Adds basic contributing guides. * Simplifies Python development with a Makefile.	2025-01-07 15:11:16 -08:00
QianZhu	17c9e9afea	docs: add async examples to doc (#1941 ) - added sync and async tabs for python examples - moved python code to tests/docs --------- Co-authored-by: Will Jones <willjones127@gmail.com>	2025-01-07 15:10:25 -08:00
Gagan Bhullar	b474f98049	feat(python): `flatten` in `AsyncQuery` (#1967 ) PR fixes #1949 --------- Co-authored-by: Will Jones <willjones127@gmail.com>	2025-01-06 10:52:03 -08:00
Takahiro Ebato	2c05ffed52	feat(python): add `to_polars` to `AsyncQueryBase` (#1986 ) Fixes https://github.com/lancedb/lancedb/issues/1952 Added `to_polars` method to `AsyncQueryBase`.	2025-01-06 09:35:28 -08:00
Lance Release	a27c5cf12b	Bump version: 0.17.2-beta.1 → 0.17.2-beta.2	2025-01-06 05:34:27 +00:00
BubbleCal	f4dea72cc5	feat: support vector search with distance thresholds (#1993 ) Signed-off-by: BubbleCal <bubble-cal@outlook.com>	2025-01-06 13:23:39 +08:00
Lei Xu	f76c4a5ce1	chore: add pyright static type checking and fix some of the table interface (#1996 ) * Enable `pyright` in the project * Fixed some pyright typing errors in `table.py`	2025-01-04 15:24:58 -08:00
BubbleCal	445a312667	fix: selecting columns failed on FTS and hybrid search (#1991 ) It reported the error `AttributeError: 'builtins.FTSQuery' object has no attribute 'select_columns'` because the `select_columns` method was missing in Rust. Signed-off-by: BubbleCal <bubble-cal@outlook.com>	2025-01-03 13:08:12 +08:00
Lance Release	92d845fa72	Bump version: 0.17.2-beta.0 → 0.17.2-beta.1	2024-12-31 23:36:18 +00:00
Lei Xu	397813f6a4	chore: bump pylance to 0.21.1b1 (#1989 )	2024-12-31 15:34:27 -08:00
Lei Xu	50c30c5d34	chore(python): fix typo of the synchronized checkout API (#1988 )	2024-12-30 18:54:31 -08:00
Renato Marroquin	0cb6da6b7e	docs: add new indexes to python docs (#1945 ) closes issue #1855 Co-authored-by: Renato Marroquin <renato.marroquin@oracle.com>	2024-12-28 15:35:10 -08:00
BubbleCal	aec8332eb5	chore: add `dynamic = ["version"]` to pass build check (#1977 ) Signed-off-by: BubbleCal <bubble-cal@outlook.com>	2024-12-28 10:45:23 -08:00
Lance Release	dae8334d0b	Bump version: 0.17.1 → 0.17.2-beta.0	2024-12-25 08:28:59 +00:00
BubbleCal	16cf2990f3	feat: create IVF_FLAT on remote table (#1978 ) Signed-off-by: BubbleCal <bubble-cal@outlook.com>	2024-12-25 14:57:07 +08:00
Lance Release	27404c8623	Bump version: 0.17.1-beta.7 → 0.17.1	2024-12-24 18:37:28 +00:00
Lance Release	f181c7e77f	Bump version: 0.17.1-beta.6 → 0.17.1-beta.7	2024-12-24 18:37:27 +00:00

699 Commits