Commit Graph

114 Commits

Author SHA1 Message Date
Weston Pace
c7fbc4aaee docs: fix minor typo (#1220) 2024-04-14 03:32:57 +05:30
Will Jones
1d23af213b feat: expose storage options in LanceDB (#1204)
Exposes `storage_options` in LanceDB. This is provided for Python async,
Node `lancedb`, and Node `vectordb` (and Rust of course). Python
synchronous is omitted because it's not compatible with the PyArrow
filesystems we use there currently. In the future, we will move the sync
API to wrap the async one, and then it will get support for
`storage_options`.

1. Fixes #1168
2. Closes #1165
3. Closes #1082
4. Closes #439
5. Closes #897
6. Closes #642
7. Closes #281
8. Closes #114
9. Closes #990
10. Deprecating `awsCredentials` and `awsRegion`. Users are encouraged
to use `storageOptions` instead.
2024-04-10 10:12:04 -07:00
Bert
25dea4e859 BREAKING CHANGE: Check if remote table exists when opening (with caching) (#1214)
- make open table behaviour consistent:
- remote tables will check if the table exists by calling /describe and
throwing an error if the call doesn't succeed
- this is similar to the behaviour for local tables where we will raise
an exception when opening the table if the local dataset doesn't exist
- The table names are cached in the client with a TTL
- Also fixes a small bug where if the remote error response was
deserialized from JSON as an object, we'd print it resulting in the
unhelpful error message: `Error: Server Error, status: 404, message: Not
Found: [object Object]`
2024-04-10 11:54:47 -04:00
QianZhu
871500db70 add a default value for search.limit to be consistent with python sdk (#1191)
Changed the default value for search.limit to be 10
2024-04-05 16:34:50 -07:00
Bert
a900bc0827 ensure table names are uri encoded for tables (#1189)
This prevents an issue where users can do something like:
```js
db.createTable('my-table#123123')
```
The server has logic to determine that '#' character is not allowed in
the table name, but currently this is being returned as 404 error
because it routes to `/v1/my-table#123123/create` and `#123123/create`
will not be parsed as part of path
2024-04-05 16:34:50 -07:00
QianZhu
44d799ebb8 bug: fix the return value of countRows (#1186) 2024-04-05 16:34:50 -07:00
Bert
ff45f25cf2 fix error decoding in nodejs client (#1184)
fixes: #1183
2024-04-05 16:34:50 -07:00
QianZhu
2f89fc26f1 feat: add filterable countRows to remote API (#1169) 2024-04-05 16:34:46 -07:00
Bert
1e41232f28 Node SDK Client middleware for HTTP Requests (#1130)
Adds client-side middleware to LanceDB Node SDK to instrument HTTP
Requests

Example - adding `x-request-id` request header:
```js
class HttpMiddleware {
    constructor({ requestId }) {
        this.requestId = requestId
    }

    onRemoteRequest(req, next) {
        req.headers['x-request-id'] = this.requestId
        return next(req)
    }
}

const db = await lancedb.connect({
  uri: 'db://remote-123',
  apiKey: 'sk_...',
})

let tables = await db.withMiddleware(new HttpMiddleware({ requestId: '123' })).tableNames();

```

---------

Co-authored-by: Weston Pace <weston.pace@gmail.com>
2024-04-05 16:33:37 -07:00
Weston Pace
4180b44472 feat: refactor the query API and add query support to the python async API (#1113)
In addition, there are also a number of changes in nodejs to the
docstrings of existing methods because this PR adds a jsdoc linter.
2024-04-05 16:32:47 -07:00
Will Jones
f0c5f5ba62 fix: handle uri in object (#1091)
Fixes #1078
2024-04-05 16:32:15 -07:00
Weston Pace
f822255683 feat: add create_index to the async python API (#1052)
This also refactors the rust lancedb index builder API (and,
correspondingly, the nodejs API)
2024-04-05 16:32:14 -07:00
Will Jones
90af5cf028 fix: propagate filter validation errors (#1092)
In Rust and Node, we have been swallowing filter validation errors. If
there was an error in parsing the filter, then the filter was silently
ignored, returning unfiltered results.

Fixes #1081
2024-04-05 16:31:53 -07:00
Weston Pace
c60a193767 fix: sanitize foreign schemas (#1058)
Arrow-js uses brittle `instanceof` checks throughout the code base.
These fail unless the library instance that produced the object matches
exactly the same instance the vectordb is using. At a minimum, this
means that a user using arrow version 15 (or any version that doesn't
match exactly the version that vectordb is using) will get strange
errors when they try and use vectordb.

However, there are even cases where the versions can be perfectly
identical, and the instanceof check still fails. One such example is
when using `vite` (e.g. https://github.com/vitejs/vite/issues/3910)

This PR solves the problem in a rather brute force, but workable,
fashion. If we encounter a schema that does not pass the `instanceof`
check then we will attempt to sanitize that schema by traversing the
object and, if it has all the correct properties, constructing an
appropriate `Schema` instance via deep cloning.
2024-04-05 16:31:42 -07:00
QianZhu
b32b69c993 Add create scalar index to sdk (#1033) 2024-04-05 16:31:36 -07:00
Rob Meng
f3de3d990d chore: upgrade to lance 0.10.1 (#1034)
upgrade to lance 0.10.1 and update doc string to reflect dynamic
projection options
2024-04-05 16:31:36 -07:00
Will Jones
464a36ad38 feat: {add|alter|drop}_columns APIs (#1015)
Initial work for #959. This exposes the basic functionality for each in
all of the APIs. Will add user guide documentation in a later PR.
2024-04-05 16:30:47 -07:00
Will Jones
c5b0934bfb feat(node): add read_consistency_interval to Node and Rust (#1002)
This PR adds the same consistency semantics as was added in #828. It
*does not* add the same lazy-loading of tables, since that breaks some
existing tests.

This closes #998.

---------

Co-authored-by: Weston Pace <weston.pace@gmail.com>
2024-04-05 16:30:40 -07:00
Weston Pace
9241f47f0e feat: make it easier to create empty tables (#942)
This PR also reworks the table creation utilities significantly so that
they are more consistent, built on top of each other, and thoroughly
documented.
2024-04-05 16:30:30 -07:00
Will Jones
68115f1369 fix: wrap in BigInt to avoid upstream bug (#962)
Closes #960
2024-04-05 16:30:30 -07:00
Weston Pace
41ccb48160 feat: add support for filter during merge insert when matched (#948)
Closes #940
2024-04-05 16:29:58 -07:00
Weston Pace
138fc3f66b feat: add a filterable count_rows to all the lancedb APIs (#913)
A `count_rows` method that takes a filter was recently added to
`LanceTable`. This PR adds it everywhere else except `RemoteTable` (that
will come soon).
2024-04-05 16:29:58 -07:00
Weston Pace
18f7bad3dd feat: add merge_insert to the node and rust APIs (#915) 2024-04-05 16:29:05 -07:00
Lei Xu
8e139012e2 fix(node): pass AWS credentials to db level operations (#908)
Passed the following tests

```ts
const keyId = process.env.AWS_ACCESS_KEY_ID;
const secretKey = process.env.AWS_SECRET_ACCESS_KEY;
const sessionToken = process.env.AWS_SESSION_TOKEN;
const region = process.env.AWS_REGION;

const db = await lancedb.connect({
  uri: "s3://bucket/path",
  awsCredentials: {
    accessKeyId: keyId,
    secretKey: secretKey,
    sessionToken: sessionToken,
  },
  awsRegion: region,
} as lancedb.ConnectionOptions);

  console.log(await db.createTable("test", [{ vector: [1, 2, 3] }]));
  console.log(await db.tableNames());
  console.log(await db.dropTable("test"))
```
2024-04-05 16:28:56 -07:00
Lei Xu
e7fdb931de chore: convert all js doc test to use snippet. (#881) 2024-04-05 16:28:56 -07:00
Lei Xu
a192c1a9b1 chore(rust): simplified version of optimize (#869)
Consolidate various optimize() into one method, similar to postgres
VACCUM in the process of preparing Rust API for public use
2024-04-05 16:28:18 -07:00
Lei Xu
65c1d8bc4c feat: change create table to accept Arrow table (#845) 2024-04-05 16:27:50 -07:00
Bert
a409000c6f allow passing api key as env var (#841)
Allow passing API key as env var:
```shell
export LANCEDB_API_KEY=sh_123...
```

with this set, apiKey argument can omitted from `connect`
```js
    const db = await vectordb.connect({
        uri: "db://test-proj-01-ae8343",
        region: "us-east-1",
  })
```
```py
    db = lancedb.connect(
        uri="db://test-proj-01-ae8343",
        region="us-east-1",
    )
```
2024-04-05 16:27:42 -07:00
Lei Xu
d8befeeea2 feat(js): add helper function to create Arrow Table with schema (#838)
Support to make Apache Arrow Table from an array of javascript Records,
with optionally provided Schema.
2024-04-05 16:27:42 -07:00
Chang She
b699b5c42b chore(js): remove errant console.log (#834) 2024-04-05 16:27:42 -07:00
Chang She
d19bf80375 Merge branch 'tecmie/embeddings-openai' of github.com:tecmie/lancedb into tecmie-tecmie/embeddings-openai 2024-04-05 16:27:41 -07:00
Lei Xu
5b2c602fb3 doc: improve docs for nodejs connect functions (#833)
* improve the docstring for NodeJS connect functions and
`ConnectOptions` parameters.
* Simplify `npm run build` steps.
2024-04-05 16:27:32 -07:00
Andrew Miracle
8daed93a91 eslint fix 2024-04-05 16:25:52 -07:00
Andrew Miracle
fa13fb9392 rebase from lancedb/main 2024-04-05 16:25:14 -07:00
Chang She
118a11c9b3 feat(node): align incoming data to table schema (#802) 2024-04-05 16:25:14 -07:00
Chang She
a758876a65 feat(node): support table.schema for LocalTable (#789)
Close #773 

we pass an empty table over IPC so we don't need to manually deal with
serde. Then we just return the schema attribute from the empty table.

---------

Co-authored-by: albertlockett <albert.lockett@gmail.com>
2024-04-05 16:25:14 -07:00
Chang She
81487f10fe feat(js): support list of string input (#755)
Add support for adding lists of string input (e.g., list of categorical
labels)

Follow-up items: #757 #758
2024-04-05 16:25:02 -07:00
Aidan
a76b5755ff fix: createIndex index cache size (#741) 2024-04-05 16:25:02 -07:00
Andrew Miracle
5948f11641 eslint fix 2024-04-05 16:25:02 -07:00
Andrew Miracle
9efc3fa6d8 remove console logs 2024-04-05 16:25:02 -07:00
Andrew Miracle
453bf113ae add support for openai SDK version ^4.24.1 2024-04-05 16:25:02 -07:00
Chang She
4b243c5ff8 feat(node): align incoming data to table schema (#802) 2024-04-05 16:25:01 -07:00
Chang She
175ad9223b feat(node): support table.schema for LocalTable (#789)
Close #773 

we pass an empty table over IPC so we don't need to manually deal with
serde. Then we just return the schema attribute from the empty table.

---------

Co-authored-by: albertlockett <albert.lockett@gmail.com>
2024-04-05 16:25:01 -07:00
Bert
5bb128a24d docs: fix JS api docs for update method (#738) 2024-04-05 16:24:47 -07:00
Weston Pace
94e81ff84b feat: add the ability to create scalar indices (#679)
This is a pretty direct binding to the underlying lance capability
2024-04-05 16:24:47 -07:00
Aidan
b4ae3f3097 feat: node list tables pagination (#733) 2024-04-05 16:24:47 -07:00
Chang She
cd791a366b feat(js): support list of string input (#755)
Add support for adding lists of string input (e.g., list of categorical
labels)

Follow-up items: #757 #758
2024-04-05 16:24:47 -07:00
Aidan
e74c203e6f fix: createIndex index cache size (#741) 2024-04-05 16:24:47 -07:00
Aidan
3232b55218 feat: Node create index API (#720) 2024-04-05 16:24:30 -07:00
Aidan
ee2034db23 feat: Node Schema API (#717) 2024-04-05 16:24:30 -07:00