lancedb

mirror of https://github.com/lancedb/lancedb.git synced 2025-12-27 07:09:57 +00:00

Author	SHA1	Message	Date
Will Jones	b3a4efd587	fix: revert change default read_consistency_interval=5s (#2327 ) This reverts commit `a547c523c2` or #2281. The current implementation can cause panics and performance degradation. I will bring this back with more testing in https://github.com/lancedb/lancedb/pull/2311 ## Summary by CodeRabbit - Documentation - Enhanced clarity on read consistency settings with updated descriptions and default behavior. - Removed outdated warnings about eventual consistency from the troubleshooting guide. - Refactor - Streamlined the handling of the read consistency interval across integrations, now defaulting to "None" for improved performance. - Simplified internal logic to offer a more consistent experience. - Tests - Updated test expectations to reflect the new default representation for the read consistency interval. - Removed redundant tests related to "no consistency" settings for streamlined testing. --------- Co-authored-by: coderabbitai[bot] <136622811+coderabbitai[bot]@users.noreply.github.com>	2025-04-14 08:48:15 -07:00
Will Jones	a547c523c2	feat!: change default read_consistency_interval=5s (#2281 ) Previously, when we loaded the next version of the table, we would block all reads with a write lock. Now, we only do that if `read_consistency_interval=0`. Otherwise, we load the next version asynchronously in the background. This should mean that `read_consistency_interval > 0` won't have a meaningful impact on latency. Along with this change, I felt it was safe to change the default consistency interval to 5 seconds. The current default is `None`, which means we will never check for a new version by default. I think that default is contrary to most users' expectations.	2025-03-28 11:04:31 -07:00
Gagan Bhullar	14677d7c18	fix: metric type inconsistency (#2122 ) PR fixes #2113 --------- Co-authored-by: Will Jones <willjones127@gmail.com>	2025-03-12 10:28:37 -07:00
Ryan Green	ef3093bc23	feat: drop_index() remote implementation (#2093 ) Support drop_index operation in remote table.	2025-02-05 10:06:19 -03:30
Bert	4e8c7b0adf	fix: serialize vectordb client errors as json (#1795 )	2024-11-05 14:16:25 -05:00
Will Jones	48f46d4751	docs(node): update `indexStats` signature and regenerate docs (#1742 ) `indexStats` still referenced UUID even though in https://github.com/lancedb/lancedb/pull/1702 we changed it to take name instead.	2024-10-18 10:53:28 -07:00
Will Jones	aff25e3bf9	fix(node): add native packages to bump version (#1738 ) We weren't bumping the version, so when users downloaded our package from npm, they were getting the old binaries.	2024-10-08 23:03:53 -06:00
Will Jones	f958f4d2e8	feat: remote index stats (#1702 ) BREAKING CHANGE: the return value of `index_stats` method has changed and all `index_stats` APIs now take index name instead of UUID. Also several deprecated index statistics methods were removed. * Removes deprecated methods for individual index statistics * Aligns public `IndexStatistics` struct with API response from LanceDB Cloud. * Implements `index_stats` for remote Rust SDK and Python async API.	2024-09-27 12:10:00 -07:00
QianZhu	f00b21c98c	fix: metric type for python/node search api (#1689 )	2024-09-24 16:10:29 -07:00
Bert	3152ccd13c	fix: re-add hostOverride arg to ConnectionOptions (#1694 ) Fixes issue where hostOverride was no longer passed through to RemoteConnection	2024-09-24 13:29:26 -03:00
Bert	d5021356b4	feat: add fast_search to vectordb (#1693 )	2024-09-24 13:28:54 -03:00
LuQQiu	7ed86cadfb	feat(node): let NODE API region default to us-east-1 (#1631 ) Fixes #1622 To sync with python API	2024-09-13 11:48:57 -07:00
Gagan Bhullar	a76186ee83	fix(node): read consistency level fix (#1567 ) PR fixes #1565	2024-08-27 17:03:42 -07:00
Bert	1f4a051070	feat: make timeout configurable for vectordb node SDK (#1443 )	2024-07-15 13:23:13 -02:30
Ryan Green	8e348ab4bd	fix: use JS naming convention in new index stats fields (#1377 ) Changes new index stats fields in node client from snake case to camel case.	2024-06-10 16:41:31 -02:30
QianZhu	b9e3cfbdca	fix: add status to remote listIndices return (#1364 ) expose `status` returned by remote listIndices	2024-06-08 09:52:35 -07:00
QianZhu	1dbb4cd1e2	fix: error msg when query vector dim is wrong (#1339 ) - changed the error msg for table.search with wrong query vector dim - added missing fields for listIndices and indexStats to be consistent with Python API - will make changes in node integ test	2024-05-31 10:18:06 -07:00
Cory Grinstead	6eaaee59f8	fix: remove accidental console.log (#1307 ) I accidentally left a console.log when doing https://github.com/lancedb/lancedb/pull/1290	2024-05-15 16:07:46 -05:00
Cory Grinstead	bc582bb702	fix(nodejs): add better error handling when missing embedding functions (#1290 ) note: running the default lint command `npm run lint -- --fix` seems to have made a lot of unrelated changes.	2024-05-14 08:43:39 -05:00
Will Jones	a6babfa651	fix(node/vectordb): parse value not key (#1276 )	2024-05-07 10:16:05 -07:00
Bert	08d62550bb	fix: passing data to createTable as option (#1242 ) Fixes issue where we would throw `Either data or schema needs to defined` when passing `data` to `createTable` as a property of the first argument (an object). ```ts await db.createTable({ name: 'table1', data, schema }) ```	2024-04-26 15:26:08 -04:00
Weston Pace	c7fbc4aaee	docs: fix minor typo (#1220 )	2024-04-14 03:32:57 +05:30
Will Jones	1d23af213b	feat: expose storage options in LanceDB (#1204 ) Exposes `storage_options` in LanceDB. This is provided for Python async, Node `lancedb`, and Node `vectordb` (and Rust of course). Python synchronous is omitted because it's not compatible with the PyArrow filesystems we use there currently. In the future, we will move the sync API to wrap the async one, and then it will get support for `storage_options`. 1. Fixes #1168 2. Closes #1165 3. Closes #1082 4. Closes #439 5. Closes #897 6. Closes #642 7. Closes #281 8. Closes #114 9. Closes #990 10. Deprecating `awsCredentials` and `awsRegion`. Users are encouraged to use `storageOptions` instead.	2024-04-10 10:12:04 -07:00
Bert	25dea4e859	BREAKING CHANGE: Check if remote table exists when opening (with caching) (#1214 ) - make open table behaviour consistent: - remote tables will check if the table exists by calling /describe and throwing an error if the call doesn't succeed - this is similar to the behaviour for local tables where we will raise an exception when opening the table if the local dataset doesn't exist - The table names are cached in the client with a TTL - Also fixes a small bug where if the remote error response was deserialized from JSON as an object, we'd print it resulting in the unhelpful error message: `Error: Server Error, status: 404, message: Not Found: [object Object]`	2024-04-10 11:54:47 -04:00
QianZhu	871500db70	add a default value for search.limit to be consistent with python sdk (#1191 ) Changed the default value for search.limit to be 10	2024-04-05 16:34:50 -07:00
Bert	a900bc0827	ensure table names are uri encoded for tables (#1189 ) This prevents an issue where users can do something like: ```js db.createTable('my-table#123123') ``` The server has logic to determine that '#' character is not allowed in the table name, but currently this is being returned as 404 error because it routes to `/v1/my-table#123123/create` and `#123123/create` will not be parsed as part of path	2024-04-05 16:34:50 -07:00
QianZhu	44d799ebb8	bug: fix the return value of countRows (#1186 )	2024-04-05 16:34:50 -07:00
Bert	ff45f25cf2	fix error decoding in nodejs client (#1184 ) fixes: #1183	2024-04-05 16:34:50 -07:00
QianZhu	2f89fc26f1	feat: add filterable countRows to remote API (#1169 )	2024-04-05 16:34:46 -07:00
Bert	1e41232f28	Node SDK Client middleware for HTTP Requests (#1130 ) Adds client-side middleware to LanceDB Node SDK to instrument HTTP Requests Example - adding `x-request-id` request header: ```js class HttpMiddleware { constructor({ requestId }) { this.requestId = requestId } onRemoteRequest(req, next) { req.headers['x-request-id'] = this.requestId return next(req) } } const db = await lancedb.connect({ uri: 'db://remote-123', apiKey: 'sk_...', }) let tables = await db.withMiddleware(new HttpMiddleware({ requestId: '123' })).tableNames(); ``` --------- Co-authored-by: Weston Pace <weston.pace@gmail.com>	2024-04-05 16:33:37 -07:00
Weston Pace	4180b44472	feat: refactor the query API and add query support to the python async API (#1113 ) In addition, there are also a number of changes in nodejs to the docstrings of existing methods because this PR adds a jsdoc linter.	2024-04-05 16:32:47 -07:00
Will Jones	f0c5f5ba62	fix: handle uri in object (#1091 ) Fixes #1078	2024-04-05 16:32:15 -07:00
Weston Pace	f822255683	feat: add create_index to the async python API (#1052 ) This also refactors the rust lancedb index builder API (and, correspondingly, the nodejs API)	2024-04-05 16:32:14 -07:00
Will Jones	90af5cf028	fix: propagate filter validation errors (#1092 ) In Rust and Node, we have been swallowing filter validation errors. If there was an error in parsing the filter, then the filter was silently ignored, returning unfiltered results. Fixes #1081	2024-04-05 16:31:53 -07:00
Weston Pace	c60a193767	fix: sanitize foreign schemas (#1058 ) Arrow-js uses brittle `instanceof` checks throughout the code base. These fail unless the library instance that produced the object matches exactly the same instance the vectordb is using. At a minimum, this means that a user using arrow version 15 (or any version that doesn't match exactly the version that vectordb is using) will get strange errors when they try and use vectordb. However, there are even cases where the versions can be perfectly identical, and the instanceof check still fails. One such example is when using `vite` (e.g. https://github.com/vitejs/vite/issues/3910) This PR solves the problem in a rather brute force, but workable, fashion. If we encounter a schema that does not pass the `instanceof` check then we will attempt to sanitize that schema by traversing the object and, if it has all the correct properties, constructing an appropriate `Schema` instance via deep cloning.	2024-04-05 16:31:42 -07:00
QianZhu	b32b69c993	Add create scalar index to sdk (#1033 )	2024-04-05 16:31:36 -07:00
Rob Meng	f3de3d990d	chore: upgrade to lance 0.10.1 (#1034 ) upgrade to lance 0.10.1 and update doc string to reflect dynamic projection options	2024-04-05 16:31:36 -07:00
Will Jones	464a36ad38	feat: `{add\|alter\|drop}_columns` APIs (#1015 ) Initial work for #959. This exposes the basic functionality for each in all of the APIs. Will add user guide documentation in a later PR.	2024-04-05 16:30:47 -07:00
Will Jones	c5b0934bfb	feat(node): add `read_consistency_interval` to Node and Rust (#1002 ) This PR adds the same consistency semantics as was added in #828. It does not add the same lazy-loading of tables, since that breaks some existing tests. This closes #998. --------- Co-authored-by: Weston Pace <weston.pace@gmail.com>	2024-04-05 16:30:40 -07:00
Weston Pace	9241f47f0e	feat: make it easier to create empty tables (#942 ) This PR also reworks the table creation utilities significantly so that they are more consistent, built on top of each other, and thoroughly documented.	2024-04-05 16:30:30 -07:00
Will Jones	68115f1369	fix: wrap in BigInt to avoid upstream bug (#962 ) Closes #960	2024-04-05 16:30:30 -07:00
Weston Pace	41ccb48160	feat: add support for filter during merge insert when matched (#948 ) Closes #940	2024-04-05 16:29:58 -07:00
Weston Pace	138fc3f66b	feat: add a filterable count_rows to all the lancedb APIs (#913 ) A `count_rows` method that takes a filter was recently added to `LanceTable`. This PR adds it everywhere else except `RemoteTable` (that will come soon).	2024-04-05 16:29:58 -07:00
Weston Pace	18f7bad3dd	feat: add merge_insert to the node and rust APIs (#915 )	2024-04-05 16:29:05 -07:00
Lei Xu	8e139012e2	fix(node): pass AWS credentials to db level operations (#908 ) Passed the following tests ```ts const keyId = process.env.AWS_ACCESS_KEY_ID; const secretKey = process.env.AWS_SECRET_ACCESS_KEY; const sessionToken = process.env.AWS_SESSION_TOKEN; const region = process.env.AWS_REGION; const db = await lancedb.connect({ uri: "s3://bucket/path", awsCredentials: { accessKeyId: keyId, secretKey: secretKey, sessionToken: sessionToken, }, awsRegion: region, } as lancedb.ConnectionOptions); console.log(await db.createTable("test", [{ vector: [1, 2, 3] }])); console.log(await db.tableNames()); console.log(await db.dropTable("test")) ```	2024-04-05 16:28:56 -07:00
Lei Xu	e7fdb931de	chore: convert all js doc test to use snippet. (#881 )	2024-04-05 16:28:56 -07:00
Lei Xu	a192c1a9b1	chore(rust): simplified version of optimize (#869 ) Consolidate various optimize() into one method, similar to Postgres VACUUM, in the process of preparing the Rust API for public use	2024-04-05 16:28:18 -07:00
Lei Xu	65c1d8bc4c	feat: change create table to accept Arrow table (#845 )	2024-04-05 16:27:50 -07:00
Bert	a409000c6f	allow passing api key as env var (#841 ) Allow passing API key as env var: ```shell export LANCEDB_API_KEY=sh_123... ``` With this set, the apiKey argument can be omitted from `connect` ```js const db = await vectordb.connect({ uri: "db://test-proj-01-ae8343", region: "us-east-1", }) ``` ```py db = lancedb.connect( uri="db://test-proj-01-ae8343", region="us-east-1", ) ```	2024-04-05 16:27:42 -07:00
Lei Xu	d8befeeea2	feat(js): add helper function to create Arrow Table with schema (#838 ) Support to make Apache Arrow Table from an array of javascript Records, with optionally provided Schema.	2024-04-05 16:27:42 -07:00
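
Commit a900bc0827 in the list above explains that an unencoded `#` in a table name is treated as the start of a URL fragment, so the server never sees the full path. A minimal sketch of that fix follows. `tableCreatePath` is a hypothetical helper written for illustration, not the SDK's actual function, and the `/v1/<name>/create` path shape is copied from the commit message rather than from the client source.

```typescript
// Hypothetical helper (not part of the LanceDB SDK): builds the request path
// for creating a table. encodeURIComponent percent-encodes reserved
// characters such as '#', so the whole name stays inside one path segment.
function tableCreatePath(tableName: string): string {
  return `/v1/${encodeURIComponent(tableName)}/create`;
}

// Unencoded: a URL parser treats everything from '#' onward as a fragment,
// so only "/v1/my-table" would be routed as the path.
const unencodedPath: string = "/v1/my-table#123123/create";
const routedPart: string = unencodedPath.split("#")[0];

// Encoded: '#' becomes "%23" and the path reaches the server intact, where
// the table-name validation can reject it with a proper error.
const encodedPath: string = tableCreatePath("my-table#123123");
```

With the encoded form, the server-side name validation mentioned in the commit can run instead of the request falling through to a 404.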

135 Commits