Commit Graph

135 Commits

Author SHA1 Message Date
Will Jones
b3a4efd587 fix: revert change default read_consistency_interval=5s (#2327)
This reverts commit a547c523c2 or #2281

The current implementation can cause panics and performance degradation.
I will bring this back with more testing in
https://github.com/lancedb/lancedb/pull/2311

<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->
## Summary by CodeRabbit

- **Documentation**
- Enhanced clarity on read consistency settings with updated
descriptions and default behavior.
- Removed outdated warnings about eventual consistency from the
troubleshooting guide.

- **Refactor**
- Streamlined the handling of the read consistency interval across
integrations, now defaulting to "None" for improved performance.
  - Simplified internal logic to offer a more consistent experience.

- **Tests**
- Updated test expectations to reflect the new default representation
for the read consistency interval.
- Removed redundant tests related to "no consistency" settings for
streamlined testing.
<!-- end of auto-generated comment: release notes by coderabbit.ai -->

---------

Co-authored-by: coderabbitai[bot] <136622811+coderabbitai[bot]@users.noreply.github.com>
2025-04-14 08:48:15 -07:00
Will Jones
a547c523c2 feat!: change default read_consistency_interval=5s (#2281)
Previously, when we loaded the next version of the table, we would block
all reads with a write lock. Now, we only do that if
`read_consistency_interval=0`. Otherwise, we load the next version
asynchronously in the background. This should mean that
`read_consistency_interval > 0` won't have a meaningful impact on
latency.

Along with this change, I felt it was safe to change the default
consistency interval to 5 seconds. The current default is `None`, which
means we will **never** check for a new version by default. I think that
default is contrary to most users expectations.
2025-03-28 11:04:31 -07:00
Gagan Bhullar
14677d7c18 fix: metric type inconsistency (#2122)
PR fixes #2113

---------

Co-authored-by: Will Jones <willjones127@gmail.com>
2025-03-12 10:28:37 -07:00
Ryan Green
ef3093bc23 feat: drop_index() remote implementation (#2093)
Support drop_index operation in remote table.
2025-02-05 10:06:19 -03:30
Bert
4e8c7b0adf fix: serialize vectordb client errors as json (#1795) 2024-11-05 14:16:25 -05:00
Will Jones
48f46d4751 docs(node): update indexStats signature and regenerate docs (#1742)
`indexStats` still referenced UUID even though in
https://github.com/lancedb/lancedb/pull/1702 we changed it to take name
instead.
2024-10-18 10:53:28 -07:00
Will Jones
aff25e3bf9 fix(node): add native packages to bump version (#1738)
We weren't bumping the version, so when users downloaded our package
from npm, they were getting the old binaries.
2024-10-08 23:03:53 -06:00
Will Jones
f958f4d2e8 feat: remote index stats (#1702)
BREAKING CHANGE: the return value of `index_stats` method has changed
and all `index_stats` APIs now take index name instead of UUID. Also
several deprecated index statistics methods were removed.

* Removes deprecated methods for individual index statistics
* Aligns public `IndexStatistics` struct with API response from LanceDB
Cloud.
* Implements `index_stats` for remote Rust SDK and Python async API.
2024-09-27 12:10:00 -07:00
QianZhu
f00b21c98c fix: metric type for python/node search api (#1689) 2024-09-24 16:10:29 -07:00
Bert
3152ccd13c fix: re-add hostOverride arg to ConnectionOptions (#1694)
Fixes issue where hostOverride was no-longer passed through to
RemoteConnection
2024-09-24 13:29:26 -03:00
Bert
d5021356b4 feat: add fast_search to vectordb (#1693) 2024-09-24 13:28:54 -03:00
LuQQiu
7ed86cadfb feat(node): let NODE API region default to us-east-1 (#1631)
Fixes #1622 
To sync with python API
2024-09-13 11:48:57 -07:00
Gagan Bhullar
a76186ee83 fix(node): read consistency level fix (#1567)
PR fixes #1565
2024-08-27 17:03:42 -07:00
Bert
1f4a051070 feat: make timeout configurable for vectordb node SDK (#1443) 2024-07-15 13:23:13 -02:30
Ryan Green
8e348ab4bd fix: use JS naming convention in new index stats fields (#1377)
Changes new index stats fields in node client from snake case to camel
case.
2024-06-10 16:41:31 -02:30
QianZhu
b9e3cfbdca fix: add status to remote listIndices return (#1364)
expose `status` returned by remote listIndices
2024-06-08 09:52:35 -07:00
QianZhu
1dbb4cd1e2 fix: error msg when query vector dim is wrong (#1339)
- changed the error msg for table.search with wrong query vector dim 
- added missing fields for listIndices and indexStats to be consistent
with Python API - will make changes in node integ test
2024-05-31 10:18:06 -07:00
Cory Grinstead
6eaaee59f8 fix: remove accidental console.log (#1307)
i accidentally left a console.log when doing
https://github.com/lancedb/lancedb/pull/1290
2024-05-15 16:07:46 -05:00
Cory Grinstead
bc582bb702 fix(nodejs): add better error handling when missing embedding functions (#1290)
note: 
running the default lint command `npm run lint -- --fix` seems to have
made a lot of unrelated changes.
2024-05-14 08:43:39 -05:00
Will Jones
a6babfa651 fix(node/vectordb): parse value not key (#1276) 2024-05-07 10:16:05 -07:00
Bert
08d62550bb fix: passing data to createTable as option (#1242)
Fixes issue where we would throw `Either data or schema needs to
defined` when passing `data` to `createTable` as a property of the first
argument (an object).

```ts
await db.createTable({
  name: 'table1',
  data,
  schema
})
```
2024-04-26 15:26:08 -04:00
Weston Pace
c7fbc4aaee docs: fix minor typo (#1220) 2024-04-14 03:32:57 +05:30
Will Jones
1d23af213b feat: expose storage options in LanceDB (#1204)
Exposes `storage_options` in LanceDB. This is provided for Python async,
Node `lancedb`, and Node `vectordb` (and Rust of course). Python
synchronous is omitted because it's not compatible with the PyArrow
filesystems we use there currently. In the future, we will move the sync
API to wrap the async one, and then it will get support for
`storage_options`.

1. Fixes #1168
2. Closes #1165
3. Closes #1082
4. Closes #439
5. Closes #897
6. Closes #642
7. Closes #281
8. Closes #114
9. Closes #990
10. Deprecating `awsCredentials` and `awsRegion`. Users are encouraged
to use `storageOptions` instead.
2024-04-10 10:12:04 -07:00
Bert
25dea4e859 BREAKING CHANGE: Check if remote table exists when opening (with caching) (#1214)
- make open table behaviour consistent:
- remote tables will check if the table exists by calling /describe and
throwing an error if the call doesn't succeed
- this is similar to the behaviour for local tables where we will raise
an exception when opening the table if the local dataset doesn't exist
- The table names are cached in the client with a TTL
- Also fixes a small bug where if the remote error response was
deserialized from JSON as an object, we'd print it resulting in the
unhelpful error message: `Error: Server Error, status: 404, message: Not
Found: [object Object]`
2024-04-10 11:54:47 -04:00
QianZhu
871500db70 add a default value for search.limit to be consistent with python sdk (#1191)
Changed the default value for search.limit to be 10
2024-04-05 16:34:50 -07:00
Bert
a900bc0827 ensure table names are uri encoded for tables (#1189)
This prevents an issue where users can do something like:
```js
db.createTable('my-table#123123')
```
The server has logic to determine that '#' character is not allowed in
the table name, but currently this is being returned as 404 error
because it routes to `/v1/my-table#123123/create` and `#123123/create`
will not be parsed as part of path
2024-04-05 16:34:50 -07:00
QianZhu
44d799ebb8 bug: fix the return value of countRows (#1186) 2024-04-05 16:34:50 -07:00
Bert
ff45f25cf2 fix error decoding in nodejs client (#1184)
fixes: #1183
2024-04-05 16:34:50 -07:00
QianZhu
2f89fc26f1 feat: add filterable countRows to remote API (#1169) 2024-04-05 16:34:46 -07:00
Bert
1e41232f28 Node SDK Client middleware for HTTP Requests (#1130)
Adds client-side middleware to LanceDB Node SDK to instrument HTTP
Requests

Example - adding `x-request-id` request header:
```js
class HttpMiddleware {
    constructor({ requestId }) {
        this.requestId = requestId
    }

    onRemoteRequest(req, next) {
        req.headers['x-request-id'] = this.requestId
        return next(req)
    }
}

const db = await lancedb.connect({
  uri: 'db://remote-123',
  apiKey: 'sk_...',
})

let tables = await db.withMiddleware(new HttpMiddleware({ requestId: '123' })).tableNames();

```

---------

Co-authored-by: Weston Pace <weston.pace@gmail.com>
2024-04-05 16:33:37 -07:00
Weston Pace
4180b44472 feat: refactor the query API and add query support to the python async API (#1113)
In addition, there are also a number of changes in nodejs to the
docstrings of existing methods because this PR adds a jsdoc linter.
2024-04-05 16:32:47 -07:00
Will Jones
f0c5f5ba62 fix: handle uri in object (#1091)
Fixes #1078
2024-04-05 16:32:15 -07:00
Weston Pace
f822255683 feat: add create_index to the async python API (#1052)
This also refactors the rust lancedb index builder API (and,
correspondingly, the nodejs API)
2024-04-05 16:32:14 -07:00
Will Jones
90af5cf028 fix: propagate filter validation errors (#1092)
In Rust and Node, we have been swallowing filter validation errors. If
there was an error in parsing the filter, then the filter was silently
ignored, returning unfiltered results.

Fixes #1081
2024-04-05 16:31:53 -07:00
Weston Pace
c60a193767 fix: sanitize foreign schemas (#1058)
Arrow-js uses brittle `instanceof` checks throughout the code base.
These fail unless the library instance that produced the object matches
exactly the same instance the vectordb is using. At a minimum, this
means that a user using arrow version 15 (or any version that doesn't
match exactly the version that vectordb is using) will get strange
errors when they try and use vectordb.

However, there are even cases where the versions can be perfectly
identical, and the instanceof check still fails. One such example is
when using `vite` (e.g. https://github.com/vitejs/vite/issues/3910)

This PR solves the problem in a rather brute force, but workable,
fashion. If we encounter a schema that does not pass the `instanceof`
check then we will attempt to sanitize that schema by traversing the
object and, if it has all the correct properties, constructing an
appropriate `Schema` instance via deep cloning.
2024-04-05 16:31:42 -07:00
QianZhu
b32b69c993 Add create scalar index to sdk (#1033) 2024-04-05 16:31:36 -07:00
Rob Meng
f3de3d990d chore: upgrade to lance 0.10.1 (#1034)
upgrade to lance 0.10.1 and update doc string to reflect dynamic
projection options
2024-04-05 16:31:36 -07:00
Will Jones
464a36ad38 feat: {add|alter|drop}_columns APIs (#1015)
Initial work for #959. This exposes the basic functionality for each in
all of the APIs. Will add user guide documentation in a later PR.
2024-04-05 16:30:47 -07:00
Will Jones
c5b0934bfb feat(node): add read_consistency_interval to Node and Rust (#1002)
This PR adds the same consistency semantics as was added in #828. It
*does not* add the same lazy-loading of tables, since that breaks some
existing tests.

This closes #998.

---------

Co-authored-by: Weston Pace <weston.pace@gmail.com>
2024-04-05 16:30:40 -07:00
Weston Pace
9241f47f0e feat: make it easier to create empty tables (#942)
This PR also reworks the table creation utilities significantly so that
they are more consistent, built on top of each other, and thoroughly
documented.
2024-04-05 16:30:30 -07:00
Will Jones
68115f1369 fix: wrap in BigInt to avoid upstream bug (#962)
Closes #960
2024-04-05 16:30:30 -07:00
Weston Pace
41ccb48160 feat: add support for filter during merge insert when matched (#948)
Closes #940
2024-04-05 16:29:58 -07:00
Weston Pace
138fc3f66b feat: add a filterable count_rows to all the lancedb APIs (#913)
A `count_rows` method that takes a filter was recently added to
`LanceTable`. This PR adds it everywhere else except `RemoteTable` (that
will come soon).
2024-04-05 16:29:58 -07:00
Weston Pace
18f7bad3dd feat: add merge_insert to the node and rust APIs (#915) 2024-04-05 16:29:05 -07:00
Lei Xu
8e139012e2 fix(node): pass AWS credentials to db level operations (#908)
Passed the following tests

```ts
const keyId = process.env.AWS_ACCESS_KEY_ID;
const secretKey = process.env.AWS_SECRET_ACCESS_KEY;
const sessionToken = process.env.AWS_SESSION_TOKEN;
const region = process.env.AWS_REGION;

const db = await lancedb.connect({
  uri: "s3://bucket/path",
  awsCredentials: {
    accessKeyId: keyId,
    secretKey: secretKey,
    sessionToken: sessionToken,
  },
  awsRegion: region,
} as lancedb.ConnectionOptions);

  console.log(await db.createTable("test", [{ vector: [1, 2, 3] }]));
  console.log(await db.tableNames());
  console.log(await db.dropTable("test"))
```
2024-04-05 16:28:56 -07:00
Lei Xu
e7fdb931de chore: convert all js doc test to use snippet. (#881) 2024-04-05 16:28:56 -07:00
Lei Xu
a192c1a9b1 chore(rust): simplified version of optimize (#869)
Consolidate various optimize() into one method, similar to postgres
VACCUM in the process of preparing Rust API for public use
2024-04-05 16:28:18 -07:00
Lei Xu
65c1d8bc4c feat: change create table to accept Arrow table (#845) 2024-04-05 16:27:50 -07:00
Bert
a409000c6f allow passing api key as env var (#841)
Allow passing API key as env var:
```shell
export LANCEDB_API_KEY=sh_123...
```

with this set, apiKey argument can omitted from `connect`
```js
    const db = await vectordb.connect({
        uri: "db://test-proj-01-ae8343",
        region: "us-east-1",
  })
```
```py
    db = lancedb.connect(
        uri="db://test-proj-01-ae8343",
        region="us-east-1",
    )
```
2024-04-05 16:27:42 -07:00
Lei Xu
d8befeeea2 feat(js): add helper function to create Arrow Table with schema (#838)
Support to make Apache Arrow Table from an array of javascript Records,
with optionally provided Schema.
2024-04-05 16:27:42 -07:00