lancedb

mirror of https://github.com/lancedb/lancedb.git synced 2026-07-06 20:40:41 +00:00

Author	SHA1	Message	Date
Lance Release	4bc7eebe61	Bump version: 0.18.1-beta.4 → 0.19.0-beta.0	2025-02-07 17:32:26 +00:00
Will Jones	2e3b34e79b	feat(node): support inserting and upserting subschemas (#2100 ) Fixes #2095 Closes #1832	2025-02-07 09:30:18 -08:00
Will Jones	e7574698eb	feat: upgrade Lance to 0.23.0 (#2101 ) Upstream changelog: https://github.com/lancedb/lance/releases/tag/v0.23.0	2025-02-07 07:58:07 -08:00
Will Jones	801a9e5f6f	feat(python): streaming larger-than-memory writes (#2094 ) Makes our preprocessing pipeline do transforms in streaming fashion, so users can do larger-than-memory writes. Closes #2082	2025-02-06 16:37:30 -08:00
Weston Pace	4e5fbe6c99	fix: ensure metadata erased from schema call in table provider (#2099 ) This also adds a basic unit test for the table provider	2025-02-06 15:30:20 -08:00
Weston Pace	1a449fa49e	refactor: rename drop_db / drop_database to drop_all_tables, expose database from connection (#2098 ) If we start supporting external catalogs then "drop database" may be misleading (and not possible). We should be more clear that this is a utility method to drop all tables. This is also a nice chance for some consistency cleanup as it was `drop_db` in rust, `drop_database` in python, and non-existent in typescript. This PR also adds a public accessor to get the database trait from a connection. BREAKING CHANGE: the `drop_database` / `drop_db` methods are now deprecated.	2025-02-06 13:22:28 -08:00
Weston Pace	6bf742c759	feat: expose table trait (#2097 ) Similar to `c269524b2f` this PR reworks and exposes an internal trait (this time `TableInternal`) to be a public trait. These two PRs together should make it possible for others to integrate LanceDB on top of other catalogs. This PR also adds a basic `TableProvider` implementation for tables, although some work still needs to be done here (pushdown not yet enabled).	2025-02-05 18:13:51 -08:00
Ryan Green	ef3093bc23	feat: drop_index() remote implementation (#2093 ) Support drop_index operation in remote table.	2025-02-05 10:06:19 -03:30
Will Jones	16851389ea	feat: extra headers parameter in client options (#2091 ) Closes #1106 Unfortunately, these need to be set at the connection level. I investigated whether if we let users provide a callback they could use `AsyncLocalStorage` to access their context. However, it doesn't seem like NAPI supports this right now. I filed an issue: https://github.com/napi-rs/napi-rs/issues/2456	2025-02-04 17:26:45 -08:00
Weston Pace	c269524b2f	feat!: refactor ConnectionInternal into a Database trait (#2067 ) This opens up the door for more custom database implementations than the two we have today. The biggest change should be invisible: `ConnectionInternal` has been renamed to `Database`, made public, and refactored. However, there are a few breaking changes. `data_storage_version` and `enable_v2_manifest_paths` have been moved from options on `create_table` to options for the database which are now set via `storage_options`. Before: ``` db = connect(uri) tbl = db.create_table("my_table", data, data_storage_version="legacy", enable_v2_manifest_paths=True) ``` After: ``` db = connect(uri, storage_options={ "new_table_enable_v2_manifest_paths": "true", "new_table_data_storage_version": "legacy" }) tbl = db.create_table("my_table", data) ``` BREAKING CHANGE: the data_storage_version, enable_v2_manifest_paths options have moved from options to create_table to storage_options. BREAKING CHANGE: the use_legacy_format option has been removed, data_storage_version has replaced it for some time now	2025-02-04 14:35:14 -08:00
Lance Release	f6eef14313	Bump version: 0.18.1-beta.3 → 0.18.1-beta.4 python-v0.18.1-beta.4	2025-02-04 17:25:52 +00:00
Rob Meng	32716adaa3	chore: bump lance version (#2092 )	2025-02-04 12:25:05 -05:00
Lance Release	5e98b7f4c0	Updating package-lock.json	2025-02-01 02:27:43 +00:00
Lance Release	3f2589c11f	Updating package-lock.json	2025-02-01 01:22:22 +00:00
Lance Release	e3b99694d6	Updating package-lock.json	2025-02-01 01:22:05 +00:00
Lance Release	9d42dc349c	Bump version: 0.15.1-beta.2 → 0.15.1-beta.3 v0.15.1-beta.3	2025-02-01 01:21:28 +00:00
Lance Release	482f1ee1d3	Bump version: 0.18.1-beta.2 → 0.18.1-beta.3 python-v0.18.1-beta.3	2025-02-01 01:20:49 +00:00
Will Jones	2f39274a66	feat: upgrade lance to 0.23.0-beta.4 (#2089 ) Upstream changelog: https://github.com/lancedb/lance/releases/tag/v0.23.0-beta.4	2025-01-31 17:20:15 -08:00
Will Jones	2fc174f532	docs: add sync/async tabs to quickstart (#2087 ) Closes #2033	2025-01-31 15:43:54 -08:00
Will Jones	dba85f4d6f	docs: user guide for merge insert (#2083 ) Closes #2062	2025-01-31 10:03:21 -08:00
Jeff Simpson	555fa26147	fix(rust): add embedding_registry on open_table (#2086 ) # Description Fix for: https://github.com/lancedb/lancedb/issues/1581 This is the same implementation as https://github.com/lancedb/lancedb/pull/1781 but with the addition of a unit test and rustfmt.	2025-01-31 08:48:02 -08:00
Will Jones	e05c0cd87e	ci(node): check docs in CI (#2084 ) * Make `npm run docs` fail if there are any warnings. This will catch items missing from the API reference. * Add a check in our CI to make sure `npm run docs` runs without warnings and doesn't generate any new files (indicating it might be out-of-date). * Hide constructors that aren't user facing. * Remove unused enum `WriteMode`. Closes #2068	2025-01-30 16:06:06 -08:00
Lance Release	25c17ebf4e	Updating package-lock.json	2025-01-30 18:24:59 +00:00
Lance Release	87b12b57dc	Updating package-lock.json	2025-01-30 17:33:15 +00:00
Lance Release	3dc9b71914	Updating package-lock.json	2025-01-30 17:32:59 +00:00
Lance Release	2622f34d1a	Bump version: 0.15.1-beta.1 → 0.15.1-beta.2 v0.15.1-beta.2	2025-01-30 17:32:33 +00:00
Will Jones	a677a4b651	ci: fix arm64 windows cross compile build (#2081 ) * Adds a CI job to check the cross compiled Windows ARM build. * Didn't replace the test build because we need native build to run tests. But for some reason (I forget why) we need cross compiled for nodejs. * Pinned crunchy to workaround https://github.com/eira-fransham/crunchy/issues/13 This is needed to fix failure from https://github.com/lancedb/lancedb/actions/runs/13020773184/job/36320719331	2025-01-30 09:24:20 -08:00
Weston Pace	e6b4f14c1f	docs: clarify upper case characters in column names need to be escaped (#2079 )	2025-01-29 09:34:43 -08:00
Will Jones	15f8f4d627	ci: check license headers (#2076 ) Based on the same workflow in Lance.	2025-01-29 08:27:07 -08:00
Will Jones	6526d6c3b1	ci(rust): caching improvements (up to 2.8x faster builds) (#2075 ) Some Rust jobs (such as [Rust/linux](https://github.com/lancedb/lancedb/actions/runs/13019232960/job/36315830779)) take almost minutes. This can be a bit of a bottleneck. * Two fixes to make caches more effective * Check in `Cargo.lock` so that dependencies don't change much between runs * Added a new CI job to validate we can build without a lockfile * Altered build commands so they don't have contradictory features and therefore don't trigger multiple builds Sadly, I don't think there's much to be done for windows-arm64, as much of the compile time is because the base image is so bare we need to install the build tools ourselves.	2025-01-29 08:26:45 -08:00
Lance Release	da4d7e3ca7	Updating package-lock.json	2025-01-28 22:32:20 +00:00
Lance Release	8fbadca9aa	Updating package-lock.json	2025-01-28 22:32:05 +00:00
Lance Release	29120219cf	Bump version: 0.15.1-beta.0 → 0.15.1-beta.1 v0.15.1-beta.1	2025-01-28 22:31:39 +00:00
Lance Release	a9897d9d85	Bump version: 0.18.1-beta.1 → 0.18.1-beta.2 python-v0.18.1-beta.2	2025-01-28 22:31:14 +00:00
Will Jones	acda7a4589	feat: upgrade lance to v0.23.0-beta.3 (#2074 ) This includes several bugfixes for `merge_insert` and null handling in vector search. https://github.com/lancedb/lance/releases/tag/v0.23.0-beta.3	2025-01-28 14:00:06 -08:00
Vaibhav	dac0857745	feat: add `distance_type()` parameter to python sync query builders and `metric()` as an alias (#2073 ) This PR aims to fix #2047 by doing the following things: - Add a distance_type parameter to the sync query builders of Python SDK. - Make metric an alias to distance_type.	2025-01-28 13:59:53 -08:00
Will Jones	0a9e1eab75	fix(node): `createTable()` should save embeddings, and `mergeInsert` should use them (#2065 ) * `createTable()` now saves embeddings in the schema metadata. Previously, it would drop them. (`createEmptyTable()` was already tested and worked.) * `mergeInsert()` now uses embeddings. Fixes #2066	2025-01-28 12:38:50 -08:00
V	d999d72c8d	docs: pandas example (#2044 ) Fix example for section ## From pandas DataFrame	2025-01-24 11:37:47 -08:00
Lance Release	de4720993e	Updating package-lock.json	2025-01-23 23:02:20 +00:00
Lance Release	6c14a307e2	Updating package-lock.json	2025-01-23 23:02:03 +00:00
Lance Release	43747278c8	Bump version: 0.15.0 → 0.15.1-beta.0 v0.15.1-beta.0	2025-01-23 23:01:40 +00:00
Lance Release	e5f42a850e	Bump version: 0.18.1-beta.0 → 0.18.1-beta.1 python-v0.18.1-beta.1	2025-01-23 23:01:13 +00:00
Will Jones	7920ecf66e	ci(python): stop using deprecated 2_24 manylinux for arm (#2064 ) Based on changes made in Lance: * https://github.com/lancedb/lance/pull/3409 * https://github.com/lancedb/lance/pull/3411	2025-01-23 15:00:34 -08:00
Will Jones	28e1b70e4b	fix(python): preserve original distance and score in hybrid queries (#2061 ) Fixes #2031 When we do hybrid search, we normalize the scores. We do this calculation in-place, because the Rerankers expect the `_distance` and `_score` columns to be the normalized ones. So I've changed the logic so that we restore the original distance and scores by matching on row ids.	2025-01-23 13:54:26 -08:00
Will Jones	52b79d2b1e	feat: upgrade lance to v0.23.0-beta.2 (#2063 ) Fixes https://github.com/lancedb/lancedb/issues/2043	2025-01-23 13:51:30 -08:00
Bert	c05d45150d	docs: clarify the arguments for `replace_field_metadata` (#2053 ) When calling `replace_field_metadata` we pass in an iter of tuples `(u32, HashMap<String, String>)`. That `u32` needs to be the field id from the lance schema `7f60aa0a87/rust/lance-core/src/datatypes/field.rs (L123)` This can sometimes be different than the index of the field in the arrow schema (e.g. if fields have been dropped). This PR adds docs that try to clarify what that argument should be, as well as corrects the usage in the test (which was improperly passing the index of the arrow schema).	2025-01-23 08:52:27 -05:00
BubbleCal	48ed3bb544	chore: replace the util to lance's (#2052 ) Signed-off-by: BubbleCal <bubble-cal@outlook.com>	2025-01-23 11:04:37 +08:00
Will Jones	bcfc93cc88	fix(python): various fixes for async query builders (#2048 ) This includes several improvements and fixes to the Python Async query builders: 1. The API reference docs show all the methods for each builder 2. The hybrid query builder now has all the same setter methods as the vector search one, so you can now set things like `.distance_type()` on a hybrid query. 3. Re-rankers are now properly hooked up and tested for FTS and vector search. Previously the re-rankers were accidentally bypassed in unit tests, because the builders overrode `.to_arrow()`, but the unit test called `.to_batches()` which was only defined in the base class. Now all builders implement `.to_batches()` and leave `.to_arrow()` to the base class. 4. The `AsyncQueryBase` and `AsyncVectoryQueryBase` setter methods now return `Self`, which provides the appropriate subclass as the type hint return value. Previously, `AsyncQueryBase` had them all hard-coded to `AsyncQuery`, which was unfortunate. (This required bringing in `typing-extensions` for older Python versions, but I think it's worth it.)	2025-01-20 16:14:34 -08:00
BubbleCal	214d0debf5	docs: claim LanceDB supports float16/float32/float64 for multivector (#2040 )	2025-01-21 07:04:15 +08:00
Will Jones	f059372137	feat: add `drop_index()` method (#2039 ) Closes #1665	2025-01-20 10:08:51 -08:00
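
Commit 28e1b70e4b in the log above describes normalizing hybrid-search scores in place and then restoring the original `_distance` and `_score` values by matching on row ids. The following is only a minimal sketch of that idea using plain Python dicts, not LanceDB's actual implementation; the function names, the `_rowid` key, and the min-max normalization are assumptions made for illustration.

```python
def normalize_scores(rows, key):
    """Min-max normalize rows[i][key] in place (hypothetical helper)."""
    values = [r[key] for r in rows]
    lo, hi = min(values), max(values)
    span = hi - lo
    for r in rows:
        r[key] = 0.0 if span == 0 else (r[key] - lo) / span


def restore_original(rows, originals, key, id_key="_rowid"):
    """Overwrite rows[i][key] with the saved value that has the same row id."""
    by_id = {o[id_key]: o[key] for o in originals}
    for r in rows:
        r[key] = by_id[r[id_key]]


results = [
    {"_rowid": 7, "_distance": 0.2},
    {"_rowid": 3, "_distance": 0.8},
    {"_rowid": 5, "_distance": 0.5},
]

# Keep a copy of the raw values before normalizing in place.
originals = [dict(r) for r in results]
normalize_scores(results, "_distance")

# Stand-in for a reranker that reorders rows using the normalized values.
results.sort(key=lambda r: r["_distance"], reverse=True)

# Row order may have changed, so restore by row id rather than by position.
restore_original(results, originals, "_distance")
```

Matching on row id instead of list position is what keeps the restored values correct after the rows have been reordered.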


1572 Commits