lancedb

mirror of https://github.com/lancedb/lancedb.git synced 2025-12-25 22:29:58 +00:00

Author	SHA1	Message	Date
Lance Release	38d11291da	Updating package-lock.json	2025-05-31 03:48:11 +00:00
Lance Release	d7afa600b8	Bump version: 0.19.2-beta.0 → 0.20.0-beta.0	2025-05-31 03:47:37 +00:00
Will Jones	5895ef4039	ci: revert unnecessary version bump (#2415) ## Summary by CodeRabbit - Chores - Downgraded version numbers for the Node.js, Python, and Rust packages. No other user-facing changes were made.	2025-05-30 16:51:14 -07:00
BubbleCal	5c7f63388d	feat!: upgrade lance to v0.28.0 (#2404) This introduces breaking changes to the Rust API for creating FTS indexes, and changes the default index parameters. ## Summary by CodeRabbit - New Features - Updated default settings for full-text search (FTS) index creation: stemming, stop word removal, and ASCII folding are now enabled by default, while token position storage is disabled by default. - Refactor - Simplified and streamlined the configuration and handling of FTS index parameters for improved maintainability and consistency across interfaces. - Enhanced serialization and request construction for FTS index parameters to reduce manual handling and improve code clarity. - Improved test coverage by explicitly enabling positional indexing in FTS tests to support phrase queries. - Chores - Upgraded all internal dependencies related to FTS indexing to the latest version for enhanced compatibility and performance. - Updated package versions for Node.js, Python, and Rust components to the latest beta releases. - Improved CI workflows by adding Rust toolchain setup with formatting and linting tools. --------- Signed-off-by: BubbleCal <bubble-cal@outlook.com> Co-authored-by: Will Jones <willjones127@gmail.com>	2025-05-29 15:19:24 -07:00
Lance Release	23ee132546	Updating package-lock.json	2025-05-23 21:58:58 +00:00
Lance Release	07bc1c5397	Bump version: 0.19.1 → 0.19.2-beta.0	2025-05-23 21:58:31 +00:00
Lance Release	875ed7ae6f	Updating package-lock.json	2025-05-22 05:58:59 +00:00
Lance Release	51561e31a0	Bump version: 0.19.1-beta.6 → 0.19.1	2025-05-22 05:58:05 +00:00
Lance Release	7b19120578	Bump version: 0.19.1-beta.5 → 0.19.1-beta.6	2025-05-22 05:58:00 +00:00
Lance Release	05a85cfc2a	Updating package-lock.json	2025-05-15 23:44:27 +00:00
Lance Release	198f0f80c6	Bump version: 0.19.1-beta.4 → 0.19.1-beta.5	2025-05-15 23:43:32 +00:00
Lance Release	a5fbbf0d66	Updating package-lock.json	2025-05-08 20:20:18 +00:00
Lance Release	543dec9ff0	Bump version: 0.19.1-beta.3 → 0.19.1-beta.4	2025-05-08 20:19:17 +00:00
Will Jones	272e4103b2	feat: provide timeout parameter for merge_insert (#2378) Provides the ability to set a timeout for merge insert. The default underlying timeout is however long the first attempt takes or, if there are multiple attempts, 30 seconds. This has two use cases: 1. Make the timeout shorter, when you want to fail if it takes too long. 2. Allow more time for retries. ## Summary by CodeRabbit - New Features - Added support for specifying a timeout when performing merge insert operations in Python, Node.js, and Rust APIs. - Introduced a new option to control the maximum allowed execution time for merge inserts, including retry timeout handling. - Documentation - Updated and added documentation to describe the new timeout option and its usage in APIs. - Tests - Added and updated tests to verify correct timeout behavior during merge insert operations.	2025-05-08 13:07:05 -07:00
LuQQiu	c9ae1b1737	fix: add restore with tag in python and nodejs API (#2374) Add a restore-with-tag API to the Python and Node.js APIs, with tests to guard it. ## Summary by CodeRabbit - New Features - The restore functionality now supports using version tags in addition to numeric version identifiers, allowing you to revert tables to a state marked by a tag. - Bug Fixes - Restoring with an unknown tag now properly raises an error. - Documentation - Updated documentation and examples to clarify that restore accepts both version numbers and tags. - Tests - Added new tests to verify restore behavior with version tags and error handling for unknown tags. - Added tests for checkout and restore operations involving tags.	2025-05-06 16:12:58 -07:00
Lance Release	529e774bbb	Updating package-lock.json	2025-05-06 02:45:45 +00:00
Lance Release	d83424d6b4	Bump version: 0.19.1-beta.2 → 0.19.1-beta.3	2025-05-06 02:45:06 +00:00
Lance Release	e4eee38b3c	Updating package-lock.json	2025-05-06 00:09:39 +00:00
Lance Release	dc8054e90d	Bump version: 0.19.1-beta.1 → 0.19.1-beta.2	2025-05-06 00:08:55 +00:00
LuQQiu	ed594b0f76	feat: return version for all write operations (#2368) Return version info for all write operations (add, update, merge_insert, and column modification operations). ## Summary by CodeRabbit - New Features - Table modification operations (add, update, delete, merge, add/alter/drop columns) now return detailed result objects including version numbers and operation statistics. - Result objects provide clearer feedback such as rows affected and new table version after each operation. - Documentation - Updated documentation to describe new result objects and their fields for all relevant table operations. - Added documentation for new result interfaces and updated method return types in Node.js and Python APIs. - Tests - Enhanced test coverage to assert correctness of returned versioning and operation metadata after table modifications.	2025-05-05 14:25:34 -07:00
Alex Pilon	f315f9665a	feat: implement bindings to return merge stats (#2367) Based on this comment: https://github.com/lancedb/lancedb/issues/2228#issuecomment-2730463075 and https://github.com/lancedb/lance/pull/2357 Here is my attempt at implementing bindings for returning merge stats from a `merge_insert.execute` call for lancedb. Note: I have almost no idea what I am doing in Rust but tried to follow existing code patterns and pay attention to compiler hints. - The change in the nodejs binding appeared to be necessary to get compilation to work; presumably this could actually work properly by returning some kind of NAPI JS object of the stats data? - I am unsure of what to do with the remote/table.rs changes, which were necessary for compilation to work; I assume this is related to LanceDB cloud, but I am unsure of the best way to handle that at this point. Proof of function:

```python
import pandas as pd
import lancedb

db = lancedb.connect("/tmp/test.db")

# Initial table contents: five rows keyed by "id".
test_data = pd.DataFrame(
    {
        "title": ["Hello", "Test Document", "Example", "Data Sample", "Last One"],
        "id": [1, 2, 3, 4, 5],
        "content": [
            "World",
            "This is a test",
            "Another example",
            "More test data",
            "Final entry",
        ],
    }
)

table = db.create_table("documents", data=test_data, exist_ok=True, mode="overwrite")

# Five rows that match existing ids plus one new row (id 6).
update_data = pd.DataFrame(
    {
        "title": [
            "Hello, World",
            "Test Document, it's good",
            "Example",
            "Data Sample",
            "Last One",
            "New One",
        ],
        "id": [1, 2, 3, 4, 5, 6],
        "content": [
            "World",
            "This is a test",
            "Another example",
            "More test data",
            "Final entry",
            "New content",
        ],
    }
)

# Upsert on "id": update matched rows, insert unmatched ones.
stats = (
    table.merge_insert(on="id")
    .when_matched_update_all()
    .when_not_matched_insert_all()
    .execute(update_data)
)

print(stats)
```

returns

```
{'num_inserted_rows': 1, 'num_updated_rows': 5, 'num_deleted_rows': 0}
```

## Summary by CodeRabbit - New Features - Merge-insert operations now return detailed statistics, including counts of inserted, updated, and deleted rows. - Bug Fixes - Tests updated to validate returned merge-insert statistics for accuracy. - Documentation - Method documentation improved to reflect new return values and clarify merge operation results. - Added documentation for the new `MergeStats` interface detailing operation statistics. --------- Co-authored-by: Will Jones <willjones127@gmail.com>	2025-05-01 10:00:20 -07:00
Andrew C. Oliver	5deb26bc8b	fix: prevent embedded objects from returning null in all of their fields (#2355) metadata{filename=xyz} filename would be there structurally, but ALWAYS null. I didn't include this as a file but it may be useful for understanding the problem for people searching on this issue so I'm including it here as documentation. Before this patch any field that is more than 1 deep is accepted but returns null values for subfields when queried.

```js
const lancedb = require('@lancedb/lancedb');

// Debug logger
function debug(message, data) {
  console.log(`[TEST] ${message}`, data !== undefined ? data : '');
}

// Log when our unwrapArrowObject is called
const kParent = Symbol.for("parent");
const kRowIndex = Symbol.for("rowIndex");

// Override console.log for our test
const originalConsoleLog = console.log;
console.log = function() {
  // Filter out noisy logs
  if (arguments[0] && typeof arguments[0] === 'string' && arguments[0].includes('[INFO] [LanceDB]')) {
    originalConsoleLog.apply(console, arguments);
  }
  originalConsoleLog.apply(console, arguments);
};

async function main() {
  debug('Starting test...');

  // Connect to the database
  debug('Connecting to database...');
  const db = await lancedb.connect('./.lancedb');

  // Try to open an existing table, or create a new one if it doesn't exist
  let table;
  try {
    table = await db.openTable('test_nested_fields');
    debug('Opened existing table');
  } catch (e) {
    debug('Creating new table...');

    // Create test data with nested metadata structure
    const data = [
      {
        id: 'test1',
        vector: [1, 2, 3],
        metadata: {
          filePath: "/path/to/file1.ts",
          startLine: 10,
          endLine: 20,
          text: "function test() { return true; }"
        }
      },
      {
        id: 'test2',
        vector: [4, 5, 6],
        metadata: {
          filePath: "/path/to/file2.ts",
          startLine: 30,
          endLine: 40,
          text: "function test2() { return false; }"
        }
      }
    ];
    debug('Data to be inserted:', JSON.stringify(data, null, 2));

    // Create the table
    table = await db.createTable('test_nested_fields', data);
    debug('Table created successfully');
  }

  // Query the table and get results
  debug('Querying table...');
  const results = await table.search([1, 2, 3]).limit(10).toArray();

  // Log the results
  debug('Number of results:', results.length);
  if (results.length > 0) {
    const firstResult = results[0];
    debug('First result properties:', Object.keys(firstResult));

    // Check if metadata is accessible and what properties it has
    if (firstResult.metadata) {
      debug('Metadata properties:', Object.keys(firstResult.metadata));
      debug('Metadata filePath:', firstResult.metadata.filePath);
      debug('Metadata startLine:', firstResult.metadata.startLine);

      // Destructure to see if that helps
      const { filePath, startLine, endLine, text } = firstResult.metadata;
      debug('Destructured values:', { filePath, startLine, endLine, text });

      // Check if it's a proxy object
      debug('Result is proxy?', Object.getPrototypeOf(firstResult) === Object.prototype ? false : true);
      debug('Metadata is proxy?', Object.getPrototypeOf(firstResult.metadata) === Object.prototype ? false : true);
    } else {
      debug('Metadata is not accessible!');
    }
  }

  // Close the database
  await db.close();
}

main().catch(e => {
  console.error('Error:', e);
});
```

## Summary by CodeRabbit - Bug Fixes - Improved handling of nested struct fields to ensure accurate preservation of values during serialization and deserialization. - Enhanced robustness when accessing nested object properties, reducing errors with missing or null values. - Tests - Added tests to verify correct handling of nested struct fields through serialization and deserialization. --------- Co-authored-by: Will Jones <willjones127@gmail.com>	2025-05-01 09:38:55 -07:00
Lance Release	4ade3e31e2	Updating package-lock.json	2025-04-29 22:19:46 +00:00
Lance Release	508e621f3d	Bump version: 0.19.1-beta.0 → 0.19.1-beta.1	2025-04-29 22:19:14 +00:00
Ryan Green	af54e0ce06	feat: add table stats API (#2363) * Add a new "table stats" API to expose basic table and fragment statistics with local and remote table implementations ### Questions * This is using `calculate_data_stats` to determine total bytes in the table. This seems like a potentially expensive operation - are there any concerns about performance for large datasets? ### Notes * bytes_on_disk seems to be stored at the column level but there does not seem to be a way to easily calculate total bytes per fragment. This may need to be added in lance before we can support fragment size (bytes) statistics. ## Summary by CodeRabbit - New Features - Added a method to retrieve comprehensive table statistics, including total rows, index counts, storage size, and detailed fragment size metrics such as minimum, maximum, mean, and percentiles. - Enabled fetching of table statistics from remote sources through asynchronous requests. - Extended table interfaces across Python, Rust, and Node.js to support synchronous and asynchronous retrieval of table statistics. - Tests - Introduced tests to verify the accuracy of the new table statistics feature for both populated and empty tables.	2025-04-29 15:19:08 -02:30
Lance Release	554939e5d2	Updating package-lock.json	2025-04-28 17:20:58 +00:00
Lance Release	e9f25f6a12	Bump version: 0.19.0 → 0.19.1-beta.0	2025-04-28 17:20:26 +00:00
LuQQiu	a9311c4dc0	feat: add list/create/delete/update/checkout tag API (#2353) Add the tag-related APIs to list existing tags, attach a tag to a version, update the version a tag points to, delete a tag, get the version of a tag, and check out the version that a tag is bound to. ## Summary by CodeRabbit - New Features - Introduced table version tagging, allowing users to create, update, delete, and list human-readable tags for specific table versions. - Enabled checking out a table by either version number or tag name. - Added new interfaces for tag management in both Python and Node.js APIs, supporting synchronous and asynchronous workflows. - Bug Fixes - None. - Documentation - Updated documentation to describe the new tagging features, including usage examples. - Tests - Added comprehensive tests for tag creation, updating, deletion, listing, and version checkout by tag in both Python and Node.js environments.	2025-04-28 10:04:46 -07:00
Lance Release	e8c0c52315	Updating package-lock.json	2025-04-25 21:17:03 +00:00
Lance Release	726d629b9b	Bump version: 0.19.0-beta.12 → 0.19.0	2025-04-25 21:16:30 +00:00
Lance Release	b493f56dee	Bump version: 0.19.0-beta.11 → 0.19.0-beta.12	2025-04-25 21:16:25 +00:00
Will Jones	d8517117f1	feat: upgrade Lance to v0.26.0 (#2359) Upstream changelog: https://github.com/lancedb/lance/releases/tag/v0.26.0 ## Summary by CodeRabbit - Chores - Updated dependency management to use published crate versions for improved reliability and maintainability. - Added a temporary workaround for build issues by pinning a specific version of a dependency. - Refactor - Improved resource management and concurrency by updating internal ownership models for object storage components.	2025-04-25 13:59:12 -07:00
Lance Release	cbb9a7877c	Updating package-lock.json	2025-04-25 05:02:47 +00:00
Lance Release	1fdaf7a1a4	Bump version: 0.19.0-beta.10 → 0.19.0-beta.11	2025-04-25 05:02:16 +00:00
Lance Release	c19bdd9a24	Updating package-lock.json	2025-04-22 18:24:16 +00:00
Lance Release	a705621067	Bump version: 0.19.0-beta.9 → 0.19.0-beta.10	2025-04-22 18:23:39 +00:00
Lance Release	db853c4041	Updating package-lock.json	2025-04-21 22:50:56 +00:00
Lance Release	d8746c61c6	Bump version: 0.19.0-beta.8 → 0.19.0-beta.9	2025-04-21 22:50:20 +00:00
Ryan Green	3ae90dde80	feat: add new table API to wait for async indexing (#2338) * Add new wait_for_index() table operation that polls until indices are created/fully indexed * Add an optional wait timeout parameter to all create_index operations * Python and NodeJS interfaces ## Summary by CodeRabbit - New Features - Added optional waiting for index creation completion with configurable timeout. - Introduced methods to poll and wait for indices to be fully built across sync and async tables. - Extended index creation APIs to accept a wait timeout parameter. - Bug Fixes - Added a new timeout error variant for improved error reporting on index operations. - Tests - Added tests covering successful index readiness waiting, timeout scenarios, and missing index cases.	2025-04-21 08:41:21 -02:30
Lance Release	edc4e40a7b	Updating package-lock.json	2025-04-17 22:16:36 +00:00
Lance Release	35cff12e31	Bump version: 0.19.0-beta.7 → 0.19.0-beta.8	2025-04-17 22:16:02 +00:00
Weston Pace	26080ee4c1	feat: add prewarm_index function (#2342) ## Summary by CodeRabbit - New Features - Added the ability to prewarm (load into memory) table indexes via new methods in Python, Node.js, and Rust APIs, potentially reducing cold-start query latency. - Bug Fixes - Ensured prewarming an index does not interfere with subsequent search operations. - Tests - Introduced new test cases to verify full-text search index creation, prewarming, and search functionalities in both Python and Node.js. - Chores - Updated dependencies for improved compatibility and performance. --------- Co-authored-by: Lu Qiu <luqiujob@gmail.com>	2025-04-17 15:14:36 -07:00
Lance Release	8a50944061	Updating package-lock.json	2025-04-15 04:11:16 +00:00
Lance Release	b3ad105fa0	Bump version: 0.19.0-beta.6 → 0.19.0-beta.7	2025-04-15 04:10:43 +00:00
BubbleCal	2248aa9508	fix: bugs for new FTS APIs (#2314) ## Summary by CodeRabbit - New Features - Enhanced full-text search capabilities with support for phrase queries, fuzzy matching, boosting, and multi-column matching. - Search methods now accept full-text query objects directly, improving query flexibility and precision. - Python and JavaScript SDKs updated to handle full-text queries seamlessly, including async search support. - Tests - Added comprehensive tests covering fuzzy search, phrase search, and boosted queries to ensure robust full-text search functionality. - Documentation - Updated query class documentation to reflect new constructor options and removal of deprecated methods for clarity and simplicity. --------- Signed-off-by: BubbleCal <bubble-cal@outlook.com>	2025-04-15 11:51:35 +08:00
Will Jones	b3a4efd587	fix: revert change default read_consistency_interval=5s (#2327) This reverts commit `a547c523c2` or #2281 The current implementation can cause panics and performance degradation. I will bring this back with more testing in https://github.com/lancedb/lancedb/pull/2311 ## Summary by CodeRabbit - Documentation - Enhanced clarity on read consistency settings with updated descriptions and default behavior. - Removed outdated warnings about eventual consistency from the troubleshooting guide. - Refactor - Streamlined the handling of the read consistency interval across integrations, now defaulting to "None" for improved performance. - Simplified internal logic to offer a more consistent experience. - Tests - Updated test expectations to reflect the new default representation for the read consistency interval. - Removed redundant tests related to "no consistency" settings for streamlined testing. --------- Co-authored-by: coderabbitai[bot] <136622811+coderabbitai[bot]@users.noreply.github.com>	2025-04-14 08:48:15 -07:00
Lance Release	f23aa0a793	Updating package-lock.json	2025-04-08 06:17:03 +00:00
Lance Release	56aa133ee6	Bump version: 0.19.0-beta.5 → 0.19.0-beta.6	2025-04-08 06:16:30 +00:00
BubbleCal	ec8271931f	feat: support to create FTS index on list of strings (#2317) ## Summary by CodeRabbit - Chores - Updated internal library dependencies to the latest beta version for improved system stability. - Tests - Added automated tests to validate full-text search functionality on list-based text fields. - Refactor - Enhanced the search processing logic to provide robust support for list-type text data, ensuring more reliable results. --------- Signed-off-by: BubbleCal <bubble-cal@outlook.com>	2025-04-08 14:12:35 +08:00
Lance Release	2e170c3c7b	Updating package-lock.json	2025-04-04 21:50:28 +00:00

310 Commits