lancedb

mirror of https://github.com/lancedb/lancedb.git synced 2026-05-18 20:40:41 +00:00

Author	SHA1	Message	Date
Will Jones	3d1f102087	feat: allow Python and Typescript users to create `Session`s (#2530 ) ## Summary - Exposes `Session` in Python and Typescript so users can set the `index_cache_size_bytes` and `metadata_cache_size_bytes` * The `Session` is attached to the `Connection`, and thus shared across all tables in that connection. - Adds deprecation warnings for table-level cache configuration 🤖 Generated with [Claude Code](https://claude.ai/code) --------- Co-authored-by: Claude <noreply@anthropic.com>	2025-07-24 12:06:29 -07:00
Weston Pace	94fb9f364a	feat: update lance version to 0.32.0-b2 (#2525 )	2025-07-23 12:23:10 -07:00
Lance Release	cceaf27d79	Bump version: 0.21.2-beta.0 → 0.21.2-beta.1	2025-07-22 15:41:13 +00:00
Will Jones	0579303602	feat: allow setting custom Session on ListingDatabase (#2526 ) ## Summary Add support for providing a custom `Session` when connecting to a `ListingDatabase`. This allows users to configure object store registries, caching, and other session-related settings while maintaining full backward compatibility. ## Usage Example ```rust use std::sync::Arc; use lancedb::connect; let custom_session = Arc::new(lance::session::Session::default()); let db = connect("/path/to/database") .session(custom_session) .execute() .await?; ``` 🤖 Generated with [Claude Code](https://claude.ai/code) Co-authored-by: Claude <noreply@anthropic.com>	2025-07-21 16:28:39 -07:00
Lance Release	b3a637fdeb	Bump version: 0.21.1 → 0.21.2-beta.0	2025-07-18 16:03:28 +00:00
Wyatt Alt	ab8cbe62dd	fix: excessive object storage handle creation in create_table (#2505) This fixes two bugs with create_table storage handle reuse. First, the database object did not previously carry a session that create_table operations could reuse. Second, the inheritance logic for create_table and open_table was causing empty storage options (i.e. Some({})) to be sent instead of None. Lance handles these differently: * When None is set, the object store held in the session's storage registry that was created at "connect" is used. This value stays in the cache long-term (probably as long as the db reference is held). * When Some({}) is sent, LanceDB will create a new connection and cache it for an empty key. However, that cached value will remain valid only as long as the client holds a reference to the table. After that, the cache is poisoned and the next create_table with the same key will create a new connection. This confounds reuse if, e.g., Python garbage-collects the table object before another table is created. My feeling is that the second path, if intentional, is probably meant to serve cases where tables are overriding settings and the cached connection is assumed not to be generally applicable. The bug is that we were engaging that mechanism for all tables.	2025-07-17 16:27:23 -07:00
Lance Release	6d23d32ab5	Bump version: 0.21.1-beta.2 → 0.21.1	2025-07-10 21:36:59 +00:00
Lance Release	704cec34e1	Bump version: 0.21.1-beta.1 → 0.21.1-beta.2	2025-07-10 21:36:26 +00:00
Lance Release	2bffbcefa5	Bump version: 0.21.1-beta.0 → 0.21.1-beta.1	2025-07-09 05:54:20 +00:00
BubbleCal	cab36d94b2	feat: support to specify num_partitions and num_bits (#2488 )	2025-07-09 11:36:09 +08:00
Lance Release	6fc006072c	Bump version: 0.21.0 → 0.21.1-beta.0	2025-07-07 21:01:30 +00:00
BubbleCal	dbccd9e4f1	chore: upgrade lance to 0.31.1-beta.1 (#2486 ) this also upgrades: - datafusion 47.0 -> 48.0 - half 2.5.0 -> 2.6.0 to be consistent with lance --------- Signed-off-by: BubbleCal <bubble-cal@outlook.com>	2025-07-07 22:16:43 +08:00
Will Jones	b12ebfed4c	fix: only monotonically update dataset (#2479 ) Make sure we only update the latest version if it's actually newer. This is important if there are concurrent queries, as they can take different amounts of time.	2025-07-01 08:29:37 -07:00
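The monotonic-update rule described in #2479 can be sketched in a few lines. This is a minimal illustration, not LanceDB's implementation: `LatestVersion` and its `update` method are hypothetical names, and versions are modeled as plain integers.

```python
import threading


class LatestVersion:
    """Tracks the newest dataset version seen so far (hypothetical helper)."""

    def __init__(self, version: int = 0) -> None:
        self._version = version
        self._lock = threading.Lock()

    def update(self, candidate: int) -> bool:
        """Store `candidate` only if it is newer; return True when stored."""
        with self._lock:
            if candidate > self._version:
                self._version = candidate
                return True
            return False

    @property
    def version(self) -> int:
        with self._lock:
            return self._version


tracker = LatestVersion(version=3)
tracker.update(5)  # a fast query observed version 5
tracker.update(4)  # a slower concurrent query finishes later with version 4
print(tracker.version)  # prints 5: the stale result did not overwrite the newer one
```

Comparing before assigning is what keeps a slow query from rolling the cached version backwards when queries complete out of order.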
Weston Pace	1dadb2aefa	feat: upgrade to lance 0.31.0-beta.1 (#2469 ) <!-- This is an auto-generated comment: release notes by coderabbit.ai --> ## Summary by CodeRabbit * Chores * Updated dependencies to newer versions for improved compatibility and stability. * Refactor * Improved internal handling of data ranges and stream lifetimes for enhanced performance and reliability. * Simplified code style for Python query object conversions without affecting functionality. <!-- end of auto-generated comment: release notes by coderabbit.ai -->	2025-06-30 11:10:53 -07:00
Kilerd Chan	ba755626cc	fix: expose parsing error coming from invalid object store uri (#2475) Expose the error from `ListingCatalog::open_path`, which previously unwrapped the `Result` returned by `ObjectStore::from_uri`, so that an invalid URI no longer causes a panic.	2025-06-30 10:33:18 +08:00
Lance Release	a00b8595d1	Bump version: 0.21.0-beta.0 → 0.21.0	2025-06-20 05:47:06 +00:00
Lance Release	9c8314b4fd	Bump version: 0.20.1-beta.2 → 0.21.0-beta.0	2025-06-20 05:46:27 +00:00
Lance Release	c72f6770fd	Bump version: 0.20.1-beta.1 → 0.20.1-beta.2	2025-06-18 23:33:57 +00:00
Lance Release	b77314168d	Bump version: 0.20.1-beta.0 → 0.20.1-beta.1	2025-06-17 23:22:50 +00:00
Wyatt Alt	627ca4c810	chore: update lance to v0.29.1-beta.2 (#2442 ) <!-- This is an auto-generated comment: release notes by coderabbit.ai --> ## Summary by CodeRabbit - Chores - Updated internal dependencies to use a newer version of the Lance library. - New Features - Added support for a new query occurrence type labeled "MUST NOT" in search filters. <!-- end of auto-generated comment: release notes by coderabbit.ai -->	2025-06-17 14:02:13 -07:00
Lance Release	f8dae4ffe9	Bump version: 0.20.0 → 0.20.1-beta.0	2025-06-16 16:30:14 +00:00
Weston Pace	59b57e30ed	feat: add maximum and minimum nprobes properties (#2430 ) This exposes the maximum_nprobes and minimum_nprobes feature that was added in https://github.com/lancedb/lance/pull/3903 <!-- This is an auto-generated comment: release notes by coderabbit.ai --> ## Summary by CodeRabbit - New Features - Added support for specifying minimum and maximum probe counts in vector search queries, allowing finer control over search behavior. - Users can now independently set minimum and maximum probes for vector and hybrid queries via new methods and parameters in Python, Node.js, and Rust APIs. - Bug Fixes - Improved parameter validation to ensure correct usage of minimum and maximum probe values. - Tests - Expanded test coverage to validate correct handling, serialization, and error cases for the new probe parameters. <!-- end of auto-generated comment: release notes by coderabbit.ai -->	2025-06-13 15:18:29 -07:00
Lance Release	d5f2eca754	Bump version: 0.20.0-beta.3 → 0.20.0	2025-06-04 21:08:31 +00:00
Lance Release	7fa455a8a5	Bump version: 0.20.0-beta.2 → 0.20.0-beta.3	2025-06-04 21:07:59 +00:00
Lance Release	7acece493d	Bump version: 0.20.0-beta.1 → 0.20.0-beta.2	2025-06-04 07:14:39 +00:00
Jack Ye	74e578b3c8	feat: upgrade lance to v0.29.0-beta.2 (#2419 ) <!-- This is an auto-generated comment: release notes by coderabbit.ai --> ## Summary by CodeRabbit - Chores - Updated various internal dependencies to newer versions for improved stability and compatibility. - Increased the version number for the Python package. <!-- end of auto-generated comment: release notes by coderabbit.ai -->	2025-06-03 15:16:26 -07:00
Lance Release	316b406265	Bump version: 0.20.0-beta.0 → 0.20.0-beta.1	2025-06-03 16:27:53 +00:00
Lance Release	d7afa600b8	Bump version: 0.19.2-beta.0 → 0.20.0-beta.0	2025-05-31 03:47:37 +00:00
Will Jones	5895ef4039	ci: revert unnecessary version bump (#2415 ) <!-- This is an auto-generated comment: release notes by coderabbit.ai --> ## Summary by CodeRabbit - Chores - Downgraded version numbers for the Node.js, Python, and Rust packages. No other user-facing changes were made. <!-- end of auto-generated comment: release notes by coderabbit.ai -->	2025-05-30 16:51:14 -07:00
Jack Ye	0528cd858a	fix: avoid failing list_indices for any unknown index (#2413 ) Closes #2412 <!-- This is an auto-generated comment: release notes by coderabbit.ai --> ## Summary by CodeRabbit - Bug Fixes - Improved the reliability of listing indices by logging warnings for errors and skipping problematic entries, ensuring successful results are returned. - Internal indices used for optimization are now excluded from the visible list of indices. <!-- end of auto-generated comment: release notes by coderabbit.ai -->	2025-05-30 14:43:12 -07:00
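The "log a warning and skip" behaviour described for #2413 might look roughly like the sketch below. Everything here is an assumption made for illustration: the raw index dictionaries, the `describe_index` helper, and the set of recognised type names are hypothetical and do not reflect LanceDB's internal data structures.

```python
import logging
from typing import Iterable, List

logger = logging.getLogger("index_listing_sketch")

# Hypothetical set of index types this sketch knows how to describe.
KNOWN_TYPES = {"BTREE", "IVF_PQ"}


def describe_index(raw: dict) -> dict:
    """Convert one raw index record, raising if its type is not recognised."""
    if raw.get("type") not in KNOWN_TYPES:
        raise ValueError(f"unknown index type: {raw.get('type')!r}")
    return {"name": raw["name"], "index_type": raw["type"]}


def list_indices(raw_indices: Iterable[dict]) -> List[dict]:
    """Return every index that can be described; warn about and skip the rest."""
    described = []
    for raw in raw_indices:
        try:
            described.append(describe_index(raw))
        except ValueError as err:
            # One bad entry should not fail the whole listing.
            logger.warning("skipping index %s: %s", raw.get("name"), err)
    return described


indices = list_indices(
    [
        {"name": "id_idx", "type": "BTREE"},
        {"name": "mystery_idx", "type": "FUTURE_TYPE"},
        {"name": "vec_idx", "type": "IVF_PQ"},
    ]
)
print([idx["name"] for idx in indices])
```

The sketch does not model the second change in that commit, the exclusion of internal indices from the visible list.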
BubbleCal	5c7f63388d	feat!: upgrade lance to v0.28.0 (#2404 ) this introduces some breaking changes in terms of rust API of creating FTS index, and the default index params changed Signed-off-by: BubbleCal <bubble-cal@outlook.com> <!-- This is an auto-generated comment: release notes by coderabbit.ai --> ## Summary by CodeRabbit - New Features - Updated default settings for full-text search (FTS) index creation: stemming, stop word removal, and ASCII folding are now enabled by default, while token position storage is disabled by default. - Refactor - Simplified and streamlined the configuration and handling of FTS index parameters for improved maintainability and consistency across interfaces. - Enhanced serialization and request construction for FTS index parameters to reduce manual handling and improve code clarity. - Improved test coverage by explicitly enabling positional indexing in FTS tests to support phrase queries. - Chores - Upgraded all internal dependencies related to FTS indexing to the latest version for enhanced compatibility and performance. - Updated package versions for Node.js, Python, and Rust components to the latest beta releases. - Improved CI workflows by adding Rust toolchain setup with formatting and linting tools. <!-- end of auto-generated comment: release notes by coderabbit.ai --> --------- Signed-off-by: BubbleCal <bubble-cal@outlook.com> Co-authored-by: Will Jones <willjones127@gmail.com>	2025-05-29 15:19:24 -07:00
Lance Release	07bc1c5397	Bump version: 0.19.1 → 0.19.2-beta.0	2025-05-23 21:58:31 +00:00
Lance Release	51561e31a0	Bump version: 0.19.1-beta.6 → 0.19.1	2025-05-22 05:58:05 +00:00
Lance Release	7b19120578	Bump version: 0.19.1-beta.5 → 0.19.1-beta.6	2025-05-22 05:58:00 +00:00
Lance Release	198f0f80c6	Bump version: 0.19.1-beta.4 → 0.19.1-beta.5	2025-05-15 23:43:32 +00:00
Lance Release	543dec9ff0	Bump version: 0.19.1-beta.3 → 0.19.1-beta.4	2025-05-08 20:19:17 +00:00
LuQQiu	19e896ff69	chore: add default for result structs (#2377 ) add default for result structs, when values are not provided, will go with the default values <!-- This is an auto-generated comment: release notes by coderabbit.ai --> ## Summary by CodeRabbit - Chores - Improved internal handling of table operation results to support default values. No changes to user-facing features or functionality. <!-- end of auto-generated comment: release notes by coderabbit.ai -->	2025-05-08 13:09:11 -07:00
Will Jones	272e4103b2	feat: provide timeout parameter for merge_insert (#2378 ) Provides the ability to set a timeout for merge insert. The default underlying timeout is however long the first attempt takes, or if there are multiple attempts, 30 seconds. This has two use cases: 1. Make the timeout shorter, when you want to fail if it takes too long. 2. Allow taking more time to do retries. <!-- This is an auto-generated comment: release notes by coderabbit.ai --> ## Summary by CodeRabbit - New Features - Added support for specifying a timeout when performing merge insert operations in Python, Node.js, and Rust APIs. - Introduced a new option to control the maximum allowed execution time for merge inserts, including retry timeout handling. - Documentation - Updated and added documentation to describe the new timeout option and its usage in APIs. - Tests - Added and updated tests to verify correct timeout behavior during merge insert operations. <!-- end of auto-generated comment: release notes by coderabbit.ai -->	2025-05-08 13:07:05 -07:00
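The default timeout behaviour described in #2378 (the first attempt runs for as long as it needs, and retries stop after 30 seconds) can be modeled with a small retry loop. This is a sketch under stated assumptions, not the library's code: `ConflictError`, `run_with_retries`, and the injectable `clock` are hypothetical, and the sketch does not model how an explicit shorter timeout would interrupt an attempt that is already in flight.

```python
import time
from typing import Callable


class ConflictError(Exception):
    """Hypothetical error raised when an attempt loses a commit race."""


def run_with_retries(
    attempt: Callable[[], str],
    timeout_s: float = 30.0,
    clock: Callable[[], float] = time.monotonic,
) -> str:
    """Run `attempt`; retry on conflict until `timeout_s` has elapsed."""
    start = clock()
    while True:
        try:
            # The first attempt always runs to completion, however long it takes.
            return attempt()
        except ConflictError:
            # The timeout only decides whether another attempt is allowed.
            if clock() - start >= timeout_s:
                raise TimeoutError(f"gave up after {timeout_s} s of retries")


calls = {"n": 0}


def flaky() -> str:
    """Fails twice with a conflict, then succeeds."""
    calls["n"] += 1
    if calls["n"] < 3:
        raise ConflictError("concurrent write")
    return "committed"


result = run_with_retries(flaky)
print(result, calls["n"])
```

Raising the timeout gives the loop more room to retry under contention; lowering it makes the call fail sooner. These are the two use cases the commit message lists.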
Wyatt Alt	75c257ebb6	fix: return IndexNotExist on remote drop index 404 (#2380 ) Prior to this commit, attempting to drop an index that did not exist would return a TableNotFound error stating that the target table does not exist -- even when it did exist. Instead, we now return an IndexNotFound error. <!-- This is an auto-generated comment: release notes by coderabbit.ai --> ## Summary by CodeRabbit - Bug Fixes - Improved error handling when attempting to drop a non-existent index, providing a more accurate error message. - Tests - Added a test to verify correct error reporting when dropping an index that does not exist. <!-- end of auto-generated comment: release notes by coderabbit.ai -->	2025-05-07 17:24:05 -07:00
Lance Release	d83424d6b4	Bump version: 0.19.1-beta.2 → 0.19.1-beta.3	2025-05-06 02:45:06 +00:00
LuQQiu	b2160b2304	fix: fix backward compatibility with the add API (#2375 ) add API originally returns a struct with request_id, add backward compatibility for that <!-- This is an auto-generated comment: release notes by coderabbit.ai --> ## Summary by CodeRabbit - Bug Fixes - Improved handling of empty server responses for various data operations to ensure consistent behavior across server versions. - Added default values to version and numeric fields to prevent errors when response data is incomplete. - Tests - Expanded tests to cover multiple server response scenarios, validating correct version handling in data operations. <!-- end of auto-generated comment: release notes by coderabbit.ai -->	2025-05-05 19:26:27 -07:00
Lance Release	dc8054e90d	Bump version: 0.19.1-beta.1 → 0.19.1-beta.2	2025-05-06 00:08:55 +00:00
LuQQiu	695813463c	chore: reduce unneeded API calls for return version for write operations and improve test (#2373 ) Reduce the duplicate code for remote write operation testing. Avoid double call to remote to get version info, just return 0 instead of suddenly adding extra API calls for end users when they are using old servers. <!-- This is an auto-generated comment: release notes by coderabbit.ai --> ## Summary by CodeRabbit - New Features - Added version tracking to table operation results, allowing users to see the commit version associated with add, update, delete, merge, and column modification operations. - Bug Fixes - Improved compatibility with legacy servers by standardizing version information as zero when the server does not return a version. - Documentation - Clarified the meaning of the version field in operation results, especially for cases involving legacy server responses. - Tests - Enhanced test coverage to verify correct behavior with both legacy and modern server responses. <!-- end of auto-generated comment: release notes by coderabbit.ai -->	2025-05-05 16:47:19 -07:00
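The fallback to version 0 for legacy servers, as described in #2373, can be illustrated as follows. This is a hedged sketch: the `AddResult` dataclass here is a stand-in defined for the example, `parse_add_response` is a hypothetical helper, and the JSON response shape is an assumption based only on the commit messages (older servers return an empty body or a struct with just `request_id`).

```python
import json
from dataclasses import dataclass


@dataclass
class AddResult:
    """Stand-in result type; version 0 means the server did not report one."""

    version: int = 0


def parse_add_response(body: str) -> AddResult:
    """Build a result from a response body, defaulting the version to 0."""
    if not body.strip():
        # Legacy server: empty body, so no extra API call is made to look it up.
        return AddResult()
    payload = json.loads(body)
    return AddResult(version=int(payload.get("version", 0)))


print(parse_add_response(""))                       # empty legacy response
print(parse_add_response('{"request_id": "abc"}'))  # legacy struct without a version
print(parse_add_response('{"version": 7}'))         # modern response
```

Returning a default instead of issuing a follow-up request keeps the number of API calls unchanged for users on older servers.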
LuQQiu	ed594b0f76	feat: return version for all write operations (#2368 ) return version info for all write operations (add, update, merge_insert and column modification operations) <!-- This is an auto-generated comment: release notes by coderabbit.ai --> ## Summary by CodeRabbit - New Features - Table modification operations (add, update, delete, merge, add/alter/drop columns) now return detailed result objects including version numbers and operation statistics. - Result objects provide clearer feedback such as rows affected and new table version after each operation. - Documentation - Updated documentation to describe new result objects and their fields for all relevant table operations. - Added documentation for new result interfaces and updated method return types in Node.js and Python APIs. - Tests - Enhanced test coverage to assert correctness of returned versioning and operation metadata after table modifications. <!-- end of auto-generated comment: release notes by coderabbit.ai -->	2025-05-05 14:25:34 -07:00
Alex Pilon	f315f9665a	feat: implement bindings to return merge stats (#2367)

Based on this comment: https://github.com/lancedb/lancedb/issues/2228#issuecomment-2730463075 and https://github.com/lancedb/lance/pull/2357

Here is my attempt at implementing bindings for returning merge stats from a `merge_insert.execute` call for lancedb.

Note: I have almost no idea what I am doing in Rust but tried to follow existing code patterns and pay attention to compiler hints.

- The change in the nodejs binding appeared to be necessary to get compilation to work; presumably this could actually work properly by returning some kind of NAPI JS object of the stats data?
- I am unsure of what to do with the remote/table.rs changes, which were necessary for compilation to work. I assume this is related to LanceDB cloud, but I am unsure of the best way to handle that at this point.

Proof of function:

```python
import pandas as pd
import lancedb

db = lancedb.connect("/tmp/test.db")

test_data = pd.DataFrame(
    {
        "title": ["Hello", "Test Document", "Example", "Data Sample", "Last One"],
        "id": [1, 2, 3, 4, 5],
        "content": [
            "World",
            "This is a test",
            "Another example",
            "More test data",
            "Final entry",
        ],
    }
)

table = db.create_table("documents", data=test_data, exist_ok=True, mode="overwrite")

update_data = pd.DataFrame(
    {
        "title": [
            "Hello, World",
            "Test Document, it's good",
            "Example",
            "Data Sample",
            "Last One",
            "New One",
        ],
        "id": [1, 2, 3, 4, 5, 6],
        "content": [
            "World",
            "This is a test",
            "Another example",
            "More test data",
            "Final entry",
            "New content",
        ],
    }
)

# Upsert on "id": update rows that match, insert rows that do not.
stats = (
    table.merge_insert(on="id")
    .when_matched_update_all()
    .when_not_matched_insert_all()
    .execute(update_data)
)

print(stats)
```

returns

```
{'num_inserted_rows': 1, 'num_updated_rows': 5, 'num_deleted_rows': 0}
```

<!-- This is an auto-generated comment: release notes by coderabbit.ai -->

## Summary by CodeRabbit

- New Features
  - Merge-insert operations now return detailed statistics, including counts of inserted, updated, and deleted rows.
- Bug Fixes
  - Tests updated to validate returned merge-insert statistics for accuracy.
- Documentation
  - Method documentation improved to reflect new return values and clarify merge operation results.
  - Added documentation for the new `MergeStats` interface detailing operation statistics.

<!-- end of auto-generated comment: release notes by coderabbit.ai -->

---------

Co-authored-by: Will Jones <willjones127@gmail.com>	2025-05-01 10:00:20 -07:00
Lance Release	508e621f3d	Bump version: 0.19.1-beta.0 → 0.19.1-beta.1	2025-04-29 22:19:14 +00:00
Ryan Green	af54e0ce06	feat: add table stats API (#2363 ) * Add a new "table stats" API to expose basic table and fragment statistics with local and remote table implementations ### Questions * This is using `calculate_data_stats` to determine total bytes in the table. This seems like a potentially expensive operation - are there any concerns about performance for large datasets? ### Notes * bytes_on_disk seems to be stored at the column level but there does not seem to be a way to easily calculate total bytes per fragment. This may need to be added in lance before we can support fragment size (bytes) statistics. <!-- This is an auto-generated comment: release notes by coderabbit.ai --> ## Summary by CodeRabbit - New Features - Added a method to retrieve comprehensive table statistics, including total rows, index counts, storage size, and detailed fragment size metrics such as minimum, maximum, mean, and percentiles. - Enabled fetching of table statistics from remote sources through asynchronous requests. - Extended table interfaces across Python, Rust, and Node.js to support synchronous and asynchronous retrieval of table statistics. - Tests - Introduced tests to verify the accuracy of the new table statistics feature for both populated and empty tables. <!-- end of auto-generated comment: release notes by coderabbit.ai -->	2025-04-29 15:19:08 -02:30
Lance Release	e9f25f6a12	Bump version: 0.19.0 → 0.19.1-beta.0	2025-04-28 17:20:26 +00:00
LuQQiu	a9311c4dc0	feat: add list/create/delete/update/checkout tag API (#2353) Add the tag-related API to list existing tags, attach a tag to a version, update a tag's version, delete a tag, get the version of a tag, and check out the version that a tag is bound to. <!-- This is an auto-generated comment: release notes by coderabbit.ai --> ## Summary by CodeRabbit - New Features - Introduced table version tagging, allowing users to create, update, delete, and list human-readable tags for specific table versions. - Enabled checking out a table by either version number or tag name. - Added new interfaces for tag management in both Python and Node.js APIs, supporting synchronous and asynchronous workflows. - Bug Fixes - None. - Documentation - Updated documentation to describe the new tagging features, including usage examples. - Tests - Added comprehensive tests for tag creation, updating, deletion, listing, and version checkout by tag in both Python and Node.js environments. <!-- end of auto-generated comment: release notes by coderabbit.ai -->	2025-04-28 10:04:46 -07:00
Lance Release	726d629b9b	Bump version: 0.19.0-beta.12 → 0.19.0	2025-04-25 21:16:30 +00:00


492 Commits