I'm working on a lancedb version of pytorch data loading (and hopefully
addressing https://github.com/lancedb/lance/issues/3727).
However, rather than rely on pytorch for everything I'm moving some of
the things that pytorch does into rust. This gives us more control over
data loading (e.g. using shards or a hash-based split) and it allows
permutations to be persistent. In particular I hope to be able to:
* Create a persistent permutation
* This permutation can handle splits, filtering, shuffling, and sharding
* Create a rust data loader that can read a permutation (one or more
splits), or a subset of a permutation (for DDP)
* Create a python data loader that delegates to the rust data loader
* Eventually, create integrations for other data loading libraries, including Rust & Node
Fixes an error when converting a federated database operation to a listing database operation: the namespace parameter is no longer correct in that context and should be dropped.
Note that with the testing infra we have today, we don't have a good way
to test these changes. I will do a quick follow up on
https://github.com/lancedb/lancedb/issues/2701, but it would be great to get this in first to resolve the related issues.
Add a new test feature which allows for running the lancedb tests
against a remote server. Convert over a few tests in src/connection.rs
as a proof of concept.
To make local development easier, the remote tests can be run locally
from a Makefile. This file can also be used to run the feature tests,
with a single invocation of 'make'. (The feature tests require bringing
up a docker compose environment.)
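For reference, the pattern looks roughly like the sketch below; the feature name and environment variable are placeholders, not necessarily what this PR uses.

```rust
// Sketch of the pattern (feature name and env var are hypothetical): the test
// only compiles when the remote-test feature is enabled, so a plain
// `cargo test` stays hermetic and `make` opts in against a running server.
#[cfg(feature = "remote-tests")]
#[tokio::test]
async fn lists_tables_on_remote_server() {
    let uri = std::env::var("LANCEDB_REMOTE_URI").expect("set by the Makefile");
    let conn = lancedb::connect(&uri).execute().await.unwrap();
    // Any cheap round trip proves connectivity; listing tables is enough here.
    conn.table_names().execute().await.unwrap();
}
```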
This PR adds support for namespace-backed databases through
lance-namespace integration, enabling centralized table management
through namespace APIs.
---------
Co-authored-by: Claude <noreply@anthropic.com>
Support shallow cloning a dataset at a specific location to create a new dataset, using the shallow_clone feature in Lance. Also introduce a `clone` API for remote tables to expose this functionality.
We had previously prototyped a `Catalog` trait anticipating a
three-tiered Catalog-Database-Table structure. Now that we have
namespaces in the `Database` we can support any tiering scheme and the
`Catalog` trait is no longer needed.
## Summary
This PR introduces a `HeaderProvider` which is called for all remote
HTTP calls to get the latest headers to inject. This is useful for
features like auth token injection, where the header provider can auto-refresh tokens internally so that each request always carries the refreshed token.
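As a rough illustration of the use case (not the actual trait definition, which lives in the crate), a token-refreshing provider could look like this:

```rust
use reqwest::header::{HeaderMap, HeaderValue, AUTHORIZATION};

// Illustration of the auto-refresh use case; the real `HeaderProvider` trait
// may have a different shape. The point is that headers are computed per
// request, so a freshly refreshed token is always the one that gets sent.
struct RefreshingAuthHeaders {
    token: std::sync::Mutex<String>,
}

impl RefreshingAuthHeaders {
    fn headers_for_request(&self) -> HeaderMap {
        // A real implementation would refresh the token here when it nears expiry.
        let token = self.token.lock().unwrap().clone();
        let mut headers = HeaderMap::new();
        headers.insert(
            AUTHORIZATION,
            HeaderValue::from_str(&format!("Bearer {token}")).expect("valid header value"),
        );
        headers
    }
}
```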
---------
Co-authored-by: Claude <noreply@anthropic.com>
Updates lance to 0.35.0-beta4, which also incurs a datafusion update.
This brings in a fix for a memory leak in index caching, resulting from
a cyclical reference.
This PR adds mTLS (mutual TLS) configuration support for the LanceDB
remote HTTP client, allowing users to authenticate with client
certificates and configure custom CA certificates for server
verification.
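For readers unfamiliar with mTLS, this is what the configuration boils down to at the HTTP-client level, shown here with raw reqwest as a sketch; the LanceDB client config wires up the equivalent certificate options rather than requiring you to build the client yourself.

```rust
use reqwest::{Certificate, Client, ClientBuilder, Identity};

// Underlying mechanics of the feature described above, shown with reqwest
// directly (requires a TLS backend feature such as `rustls-tls`).
fn build_mtls_client(client_pem: &[u8], ca_pem: &[u8]) -> reqwest::Result<Client> {
    let identity = Identity::from_pem(client_pem)?; // client certificate + private key (PEM)
    let ca = Certificate::from_pem(ca_pem)?;        // custom CA used to verify the server
    ClientBuilder::new()
        .identity(identity)
        .add_root_certificate(ca)
        .build()
}
```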
---------
Co-authored-by: Claude <noreply@anthropic.com>
Some of the DataFusion optimizers make decisions based on data statistics (e.g. total bytes, number of rows). If those statistics are not supplied, those optimizations cannot be applied.
One example is the anti hash join rewrite from LeftAnti (left: big table, right: small table) to RightAnti (left: small table, right: big table). LeftAnti requires reading both the big and the small table in full, while RightAnti only requires reading the small table in full and supports limit pushdown, so only part of the big table needs to be read.
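An illustrative query shape (not taken from the test suite) where this matters:

```rust
// Illustrative only: an anti join between a big table and a small table.
// With row-count/byte statistics available, the planner can keep the small
// table on the build side (RightAnti after swapping inputs) and push the
// LIMIT down so only part of the big table needs to be scanned.
let sql = "SELECT b.* \
           FROM big b \
           WHERE NOT EXISTS (SELECT 1 FROM small s WHERE s.id = b.id) \
           LIMIT 100";
```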
This PR adds support for multi-level namespaces in a LanceDB database, according to the Lance Namespace spec.
This allows users to create namespaces inside a database connection and perform create, drop, list, and list_tables operations within a namespace (other operations like update and describe will follow in a later PR).
The 3 types of database connections behave as follows:
1. Local database connections will continue to have just a flat list of tables for backwards compatibility.
2. Remote database connections will make REST API calls according to the APIs in the Lance Namespace spec.
3. Lance Namespace connections will invoke the corresponding operations against the specific namespace implementation, which may behave differently for these APIs.
All the table APIs now take an identifier instead of a name; for example, `/v1/table/{name}/create` is now `/v1/table/{id}/create`. If a table is directly in the root namespace, the API call is identical. If the table is in a namespace, the full table ID should be used, with `$` as the default delimiter (`.` is a special character and creates issues with URL parsing, so `$` is used instead), for example `/v1/table/ns1$table1/create`. If a different delimiter is needed, the user can configure `id_delimiter` in the client config and it is passed as a query parameter, for example `/v1/table/ns1__table1/create?delimiter=__`.
The Python and Typescript APIs are kept backwards compatible, but the
following Rust APIs are not:
1. `Connection::drop_table(&self, name: impl AsRef<str>) -> Result<()>`
is now `Connection::drop_table(&self, name: impl AsRef<str>, namespace:
&[String]) -> Result<()>`
2. `Connection::drop_all_tables(&self) -> Result<()>` is now
`Connection::drop_all_tables(&self, name: impl AsRef<str>) ->
Result<()>`
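A minimal migration sketch for the first change (assuming the connection methods remain async as elsewhere in the crate, and that an empty slice denotes the root namespace, matching the old flat-table behavior):

```rust
// Before (flat table list):
//   conn.drop_table("my_table").await?;
// After: the namespace path is explicit.
conn.drop_table("my_table", &[]).await?;                  // table in the root namespace
conn.drop_table("my_table", &["ns1".to_string()]).await?; // table inside namespace `ns1`
```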
This shrinks the size of a local embedded build that can disable all the
default features. When combined with
https://github.com/lancedb/lance/pull/4362 and the dependencies are
updated to point to the fix, this fully resolves #2567.
Verified by patching the workspace to redirect to my clone of lance with
the PR applied.
```
cargo tree -p lancedb -e no-build -e no-dev --no-default-features -i aws-config | less
```
The reason that lance itself needs to change too is that many crates within that project depend on lance-io/default, and lancedb depends on them, which transitively ends up enabling the cloud features regardless. The PR in lance removes the dependency on lance-io/default from all sibling crates.
---------
Co-authored-by: Will Jones <willjones127@gmail.com>
Enables two new parameters when building indices:
* `name`: Allows explicitly setting a name on the index. Default is
`{col_name}_idx`.
* `train` (default `True`): When set to `False`, an empty (untrained) index is created immediately.
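A rough sketch of how the two options could be used from the Rust builder (the builder-method names here are assumptions; check the bindings for the exact spelling):

```rust
use lancedb::index::Index;

// Hypothetical spelling of the two new options; the method names may differ
// in the released API.
table
    .create_index(&["embedding"], Index::Auto)
    .name("embedding_ann_idx") // override the default `{col_name}_idx`
    .train(false)              // create an empty (untrained) index right away
    .execute()
    .await?;
```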
The upgrade of Lance means there are also additional behaviors from
cd76a993b8:
* When a scalar index is created on a Table, it will be kept around even
if all rows are deleted or updated.
* Scalar indices can be created on empty tables. They will default to
`train=False` if the table is empty.
---------
Co-authored-by: Weston Pace <weston.pace@gmail.com>
All 3 examples now run with:
```
cargo run --example simple
cargo run --example full_text_search
cargo run --example ivf_pq
```
Signed-off-by: Yuval Lifshitz <ylifshit@ibm.com>
Co-authored-by: Weston Pace <weston.pace@gmail.com>
Adds tests to ensure that users can paginate through simple scan, FTS,
and vector search results using `limit` and `offset`.
Tests upstream work: https://github.com/lancedb/lance/pull/4318
Closes #2459
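For context, paginating a plain scan from the Rust API looks roughly like this sketch; the same `limit`/`offset` pair is what the new tests exercise for FTS and vector search as well.

```rust
use futures::TryStreamExt;
use lancedb::query::{ExecutableQuery, QueryBase};

// Fetch the second page of 10 rows from a plain scan.
let page2 = table
    .query()
    .offset(10) // skip the first page
    .limit(10)  // page size
    .execute()
    .await?
    .try_collect::<Vec<_>>() // collect the Arrow record batches
    .await?;
```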
These operations have existed in lance for a long while and many users
need to drop down to lance for this capability. This PR adds the API and
implements it using filters (e.g. `_rowid IN (...)`) so that it doesn't
currently add any load to `BaseTable`. I'm not sure that is sustainable
as base table implementations may want to specialize how they handle
this method. However, I figure it is a good starting point.
In addition, unlike Lance, this API does not currently guarantee
anything about the order of the take results. This is necessary for the
fallback filter approach to work (SQL filters cannot guarantee result
order).
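A sketch of that fallback expressed against the Rust query builder (using the existing `only_if` filter method; the surface of the new take API itself may differ):

```rust
use futures::TryStreamExt;
use lancedb::query::{ExecutableQuery, QueryBase};

// The fallback described above: fetch specific rows with a `_rowid IN (...)`
// filter. Note that result order is not guaranteed to match `row_ids`.
let row_ids = [7_u64, 42, 12];
let predicate = format!(
    "_rowid IN ({})",
    row_ids
        .iter()
        .map(u64::to_string)
        .collect::<Vec<_>>()
        .join(", ")
);
let rows = table
    .query()
    .only_if(predicate)
    .execute()
    .await?
    .try_collect::<Vec<_>>()
    .await?;
```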