lancedb

mirror of https://github.com/lancedb/lancedb.git synced 2026-06-02 20:00:46 +00:00

Author	SHA1	Message	Date
Weston Pace	02783bf440	feat: add a getitems implementation for the permutation (#3013 )	2026-02-12 05:36:11 -08:00
Abhishek	3c1162612e	refactor: extract optimize logic from table.rs into submodule (#2979 ) ## Summary Continues the modularization effort of table operations as outlined in #2949. - Extracts optimization operations (`OptimizeAction`, `OptimizeStats`, `execute_optimize`, `compact_files_impl`, `cleanup_old_versions`, `optimize_indices`) from `table.rs` into `table/optimize.rs` - Public API remains unchanged via re-exports - Adds comprehensive tests including error cases with message assertions ## Test plan - [x] All new optimization tests pass - [x] All existing tests pass - [x] `cargo clippy` passes with no warnings - [x] `cargo fmt --check` passes --------- Co-authored-by: Will Jones <willjones127@gmail.com>	2026-02-09 16:22:57 -08:00
Lance Release	de4f77800d	Bump version: 0.26.2-beta.0 → 0.26.2	2026-02-09 06:06:22 +00:00
Lance Release	b6ab721cf7	Bump version: 0.26.1 → 0.26.2-beta.0	2026-02-09 06:06:03 +00:00
Lance Release	9fac56252e	Bump version: 0.26.1-beta.0 → 0.26.1	2026-02-07 00:33:18 +00:00
Lance Release	c55ca20c1b	Bump version: 0.26.0 → 0.26.1-beta.0	2026-02-07 00:33:02 +00:00
Abhishek	6dde379d44	refactor: extract schema evolution logic from table.rs into submodule (#2973 ) Continues the modularization effort of schema evolution operations as outlined in #2949 ## Summary - Extracts schema evolution operations (add_columns, alter_columns, drop_columns) from `table.rs` into `table/schema_evolution.rs` - Public API remains unchanged via re-exports ## Test plan - [x] All new schema evolution tests pass - [x] All existing tests pass - [x] `cargo clippy` passes with no warnings - [x] `cargo fmt --check` passes	2026-02-06 11:33:18 -08:00
Lance Release	55f09ef1cd	Bump version: 0.26.0-beta.0 → 0.26.0	2026-02-06 18:08:30 +00:00
Lance Release	e9d8651d18	Bump version: 0.25.0-beta.0 → 0.26.0-beta.0	2026-02-06 18:08:08 +00:00
Jack Ye	0859312b83	feat: add initial and latest storage options apis (#2966 ) Expose `initial_storage_options()` and `latest_storage_options()` in lance Dataset, in lancedb rust, python and typescript SDKs. --------- Co-authored-by: Claude Opus 4.5 <noreply@anthropic.com>	2026-02-04 10:31:39 -08:00
Jack Ye	bd2c6d0763	chore: update lance dependency to v2.0.0-rc.4 (#2972 )	2026-02-03 14:38:39 -08:00
Will Jones	fbf4a53475	feat(rust): implement `TableProvider::insert_into()` for LanceDB tables (#2939 ) Implements `InsertExec` and `RemoteInsertExec` to support running inserts in DataFusion. ## Context In https://github.com/lancedb/lancedb/pull/2929, I've prototyped moving the insert pipeline into DataFusion. This will enable parallelism at two levels: 1. Running preprocessing, such as casting the input schema or computing embeddings 2. Writing out files This PR is just the first part of running the actual writes. In the end, the plans might look like: ``` InsertExec RepartitionExec num_partitions=<write_parallelism> ProjectionExec vector=compute_embedding() RepartitionExec num_partitions=<num_cpus> DataSourceExec ``` where `num_cpus` is used to take advantage of all cores, while `write_parallelism` might be less than `num_cpus` if there are too few rows to want to split writes across `num_cpus` files. Later PRs will move the preprocessing steps into DataFusion, and then hook this up to the `Table::add()` implementations. ## Relation to future SQL work We eventually plan on having the Remote SDK go through a FlightSQL endpoint. Then for most queries we will send just the SQL string to the server, and not run any sort of DataFusion plan on the client. However, I think writes will be a little special, especially bulk writes where we need to upload large streams of data and likely want parallelism. So we'll have different code paths for writes, and I think using DataFusion makes sense, especially as long as we are doing the pre-processing on the client side still.	2026-02-03 10:38:02 -08:00
ChinmayGowda71	9c017d8348	refactor: extract update logic to src/table/update.rs (#2964 ) References #2949 Part 2 of table.rs refactor. Moved UpdateResult, UpdateBuilder, and execution logic to src/table/update.rs. No functional changes; the API remains identical. --------- Co-authored-by: Will Jones <willjones127@gmail.com>	2026-02-03 09:54:19 -08:00
Lance Release	571295b0d9	Bump version: 0.24.1 → 0.25.0-beta.0	2026-02-03 04:48:34 +00:00
Will Jones	131024839f	fix: include _rowid in hash and calculated split projections (#2965 ) ## Summary - PR #2957 changed the permutation builder to only select `_rowid` from the base table, but `Splitter::project()` for hash and calculated splits replaced the selection entirely, dropping `_rowid`. - Include `_rowid` in the column selections for hash and calculated split projections. - Fix a Python test that queried the permutation table for base table columns no longer materialized. Fixes the `test_split_hash`, `test_split_hash_with_discard`, `test_split_calculated`, `test_shuffle_combined_with_splits`, and `test_filter_with_splits` failures in `test_permutation.py`. ## Test plan - [x] `cargo test -p lancedb -- permutation` (22 passed) - [x] `pytest python/tests/test_permutation.py` (46 passed) - [x] `npm test __test__/permutation.test.ts` (20 passed) 🤖 Generated with [Claude Code](https://claude.com/claude-code) --------- Co-authored-by: Claude Opus 4.5 <noreply@anthropic.com>	2026-02-02 16:27:58 -08:00
ChinmayGowda71	3c7ddf4d0c	refactor: modularize table.rs and extract delete logic (#2952 ) References #2949 Moved DeleteResult and delete() implementation to src/table/delete.rs. No functional changes. Added a test delete which works. Will work on refactoring update next.	2026-02-02 11:54:49 -08:00
Mesut-Doner	3755064e93	fix(rust): support embeddings in create_empty_table (#2961 ) Fixes the Rust SDK's `create_empty_table` to properly support embedding column definitions, bringing it to parity with the Python SDK. ## Problem The Rust SDK's `Connection::create_empty_table` did not support setting embedding columns. When using `.add_embedding()` on the builder, the embedding column definitions were lost because `TableDefinition::new_from_schema(schema)` marks all columns as physical only, without embedding metadata. The Python SDK worked around this by creating an empty record batch with proper schema metadata rather than using `create_empty_table` directly. ## Solution Modified `CreateTableBuilder<false>` to handle embeddings Closes #2759	2026-01-30 15:44:18 -08:00
Weston Pace	9be28448f5	fix: don't store all columns in the permutation table (#2957 ) The permutation table was always intended to be a small table of row id pointers (and split id). However, it was accidentally doing a full materialization of the base table 🤦 This PR changes the permutation builder to only store row id and split id.	2026-01-29 16:06:36 -08:00
Weston Pace	e9e904783c	feat: allow the permutation builder memory limit to be configured by env var (#2946 ) Running into issues with DF sorting again. This will at least allow the memory limit to be set large to bypass problems.	2026-01-28 09:02:59 +05:30
Lance Release	8500b16eca	Bump version: 0.24.1-beta.0 → 0.24.1	2026-01-26 23:39:18 +00:00
Lance Release	57e7282342	Bump version: 0.24.0 → 0.24.1-beta.0	2026-01-26 23:38:50 +00:00
Jack Ye	7bf020b3d5	chore: fix clippy when remote flag is not set (#2943 ) Also add a step in CI to ensure this does not happen in the future	2026-01-26 13:59:31 -08:00
Jack Ye	e4552e577a	chore(revert): revert update lance dependency to v2.0.0-rc.1 (#2936 ) (#2941 ) This reverts commit `bd84bba14d`, so that we can bump version to 1.0.4-rc.1	2026-01-26 11:13:59 -08:00
Will Jones	f979a902ad	ci(rust): fix MSRV check (#2940 ) Realized our MSRV check was inert because `rust-toolchain.toml` was overriding the Rust version. We set the `RUSTUP_TOOLCHAIN` environment variable, which overrides that. Also needed to update to MSRV 1.88 (due to dependencies like Lance and DataFusion) and fix some clippy warnings.	2026-01-23 15:57:09 -08:00
Colin Patrick McCabe	5a7a8da567	feat: check AZURE_STORAGE_ACCOUNT_NAME in remote conns (#2918 ) Unlike in Amazon S3, in Azure bucket names are not globally unique. Instead, the combination of (storage_account_name, bucket_name) is unique. Therefore, when using Azure blob store, we always need a way to configure the storage account name. One way is to use the storage_options hash map and set azure_storage_account_name. Another way is to set an environment variable, AZURE_STORAGE_ACCOUNT_NAME. Prior to this PR, the second way (environment variable) did not work with remote connections. This is because the existing code that checks for these environment variables happens inside the Azure object store implementation itself, which does not run locally when using remote connections. This PR addresses that situation by adding a check of the environment variable. This functions as a default if the relevant storage option is not set in the storage_options hash map.	2026-01-22 13:36:05 -08:00
Jack Ye	0db8176445	test: fix failing remote doctest reference to aws feature (#2935 ) Closes https://github.com/lancedb/lancedb/issues/2933	2026-01-22 13:17:03 -08:00
LanceDB Robot	bd84bba14d	chore: update lance dependency to v2.0.0-rc.1 (#2936 ) ## Summary - bump Lance dependencies to v2.0.0-rc.1 (git tag) - align Arrow/DataFusion/PyO3 versions for the new Lance release - update Python bindings for PyO3 0.26 (attach API + Py<PyAny>) ## Verification - `cargo clippy --workspace --tests --all-features -- -D warnings` - `cargo fmt --all` ## Reference - https://github.com/lance-format/lance/releases/tag/v2.0.0-rc.1 --------- Co-authored-by: Jack Ye <yezhaoqin@gmail.com> Co-authored-by: Will Jones <willjones127@gmail.com> Co-authored-by: Claude Opus 4.5 <noreply@anthropic.com> Co-authored-by: BubbleCal <bubble_cal@outlook.com>	2026-01-22 13:14:38 -08:00
Lance Release	ac07f8068c	Bump version: 0.24.0-beta.1 → 0.24.0	2026-01-22 01:10:15 +00:00
Lance Release	bba362d372	Bump version: 0.24.0-beta.0 → 0.24.0-beta.1	2026-01-22 01:09:53 +00:00
Jack Ye	4e65748abf	chore: update lance dependency to v1.0.3-rc.1 (#2927 ) Supersedes https://github.com/lancedb/lancedb/pull/2925 We accidentally upgraded lance to 2.0.0-beta.8. This PR reverts that first and then bumps to 1.0.3-rc.1 --------- Co-authored-by: Claude Opus 4.5 <noreply@anthropic.com>	2026-01-21 11:52:07 -08:00
Colin Patrick McCabe	e897f3edab	test: assert remote behavior of drop_table (#2926 ) Add support for testing remote connections in drop_table in `rust/lancedb/src/connection.rs`.	2026-01-21 08:42:40 -08:00
Lance Release	790ba7115b	Bump version: 0.23.1 → 0.24.0-beta.0	2026-01-21 12:21:53 +00:00
Ryan Green	cd5f91bb7d	feat: expose table uri (#2922 ) * Expose `table.uri` property for all tables, including remote tables * Fix bug in path calculation on windows file systems	2026-01-20 19:56:46 -03:30
LanceDB Robot	4da01a0e65	chore: update lance dependency to v2.0.0-beta.8 (#2907 ) ## Summary - bump Lance crates to v2.0.0-beta.8 and align arrow/datafusion/regex/half and PyO3 dependencies - update Rust/Python bindings for upstream API changes (namespace/table requests, query select columns, storage option providers) - verified with cargo clippy --workspace --tests --all-features -D warnings and cargo fmt --all Triggered by refs/tags/v2.0.0-beta.8. --------- Signed-off-by: BubbleCal <bubble-cal@outlook.com> Co-authored-by: BubbleCal <bubble-cal@outlook.com>	2026-01-16 01:46:52 +08:00
Will Jones	1840aa7edc	feat(rust)!: remove default features (#2912 ) BREAKING CHANGE: removes `aws`, `dynamodb`, `azure`, `gcs`, `oss`, `huggingface` from default Rust features. They can be enabled by users as needed. They are still enabled for Python and NodeJS, since those users don't control the compilation of artifacts. Closes #2911	2026-01-13 11:23:14 -08:00
Xuanwo	489c91c5d6	feat: enable huggingface feature by default (#2910 )	2026-01-13 20:42:11 +05:30
Qichao Chu	4494eb9e56	feat: parallelize embedding computations (#2896 ) Implement parallel execution of multiple embedding functions using std::thread::scope to improve performance when a table has multiple embedding columns. Key changes: - Add compute_embeddings_parallel() helper method to WithEmbeddings - Use fast path for single embeddings (no threading overhead) - Use scoped threads for parallel execution of multiple embeddings - Add comprehensive tests including parallelization timing verification - Update WithEmbeddings documentation Performance improvements: - I/O-bound embeddings (OpenAI, Bedrock): High benefit from concurrent API calls - CPU-bound embeddings (sentence-transformers): Medium benefit from core utilization - Single embedding: No overhead (fast path) Closes TODO on line 266 in rust/lancedb/src/embeddings.rs	2026-01-06 14:35:56 -08:00
LuQQiu	d67a8743ba	feat: support remote ivf rq (#2863 )	2026-01-02 15:35:33 -08:00
Colin Patrick McCabe	ac164c352b	test: convert test_table_names to test both remote and local (#2888 ) Convert test_table_names to test both remote and local connections. This PR also includes some miscellaneous improvements in src/test_utils/connection.rs. It starts a thread to drain stdout from the server process. It adds the PRINT_LANCEDB_TEST_CONNECTION_SCRIPT_OUTPUT environment variable, which optionally displays server stdout. Fix a bash conditional in run_with_test_connection.sh.	2026-01-02 15:08:44 -08:00
Lance Release	8bcac7e372	Bump version: 0.23.1-beta.2 → 0.23.1	2026-01-02 17:39:19 +00:00
Lance Release	e496184ab2	Bump version: 0.23.1-beta.1 → 0.23.1-beta.2	2026-01-02 17:38:54 +00:00
Lance Release	0667fa38d4	Bump version: 0.23.1-beta.0 → 0.23.1-beta.1	2025-12-17 06:59:29 +00:00
Jack Ye	1628f7e3f3	fix: pass namespace storage options provider into native table (#2873 ) Previously the native table is created with static credentials and could not auto-refresh credentials when expired.	2025-12-16 22:58:04 -08:00
Lance Release	2fd712312f	Bump version: 0.23.0 → 0.23.1-beta.0	2025-12-17 03:30:51 +00:00
Jack Ye	9e60fda0ec	fix: use post for describe_namespace and allow access to underlying client (#2871 ) Issues found during integration tests: 1. describe_namespace should use POST 2. service needs to access the underlying namespace to be able to do operations like create_empty_table directly, or get credentials in isolated paths like a remote take	2025-12-16 19:29:27 -08:00
Lance Release	94bdffe13c	Bump version: 0.23.0-beta.2 → 0.23.0	2025-12-16 16:58:35 +00:00
Lance Release	b93ea3a388	Bump version: 0.23.0-beta.1 → 0.23.0-beta.2	2025-12-16 16:57:55 +00:00
Lance Release	0960e19559	Bump version: 0.23.0-beta.0 → 0.23.0-beta.1	2025-12-05 00:36:39 +00:00
Lance Release	6f79770248	Bump version: 0.22.4-beta.3 → 0.23.0-beta.0	2025-12-04 19:33:37 +00:00
BubbleCal	a61461331c	feat: add IVF SQ index support and HNSW aliases (#2832 ) Adds IVF_SQ index config through Rust core and Python bindings, plus alias names IvfHnswSq/Pq for backward compatibility. Updates remote/table helpers and types to accept the new index type. Includes tests covering IVF SQ creation and alias usage.	2025-12-04 00:25:44 +08:00
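
Commit 4494eb9e56 in the list above describes running multiple embedding functions in parallel with `std::thread::scope`, with a fast path when there is only one. The following is a minimal sketch of that general pattern using only the Rust standard library. It is not the LanceDB implementation: `EmbedFn` and `compute_all` are hypothetical names invented for illustration, and real embedding functions would operate on Arrow arrays rather than plain string slices.

```rust
use std::thread;

/// Hypothetical stand-in for an embedding function (not a LanceDB type):
/// maps a batch of input texts to one vector per text.
type EmbedFn = Box<dyn Fn(&[String]) -> Vec<Vec<f32>> + Send + Sync>;

/// Runs every embedding function over the same inputs and returns the
/// results in the same order as `funcs`.
fn compute_all(funcs: &[EmbedFn], inputs: &[String]) -> Vec<Vec<Vec<f32>>> {
    // Fast path: with zero or one function, run on the calling thread
    // and avoid any thread-spawning overhead.
    if funcs.len() <= 1 {
        return funcs.iter().map(|f| f(inputs)).collect();
    }

    // Scoped threads may borrow `funcs` and `inputs` directly, because
    // the scope guarantees all threads finish before it returns.
    thread::scope(|s| {
        let handles: Vec<_> = funcs
            .iter()
            .map(|f| s.spawn(move || f(inputs)))
            .collect();

        // Join in spawn order so output order matches `funcs`.
        handles
            .into_iter()
            .map(|h| h.join().expect("embedding thread panicked"))
            .collect()
    })
}
```

Scoped threads fit this case because the inputs are only borrowed for the duration of the call, so no `Arc` or cloning is needed. As the commit message notes, the benefit is likely largest when each function spends most of its time waiting on I/O.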

733 Commits