Commit Graph

1061 Commits

Author SHA1 Message Date
BubbleCal
12b3c87964 feat: support to create more vector index types (#1407)
Signed-off-by: BubbleCal <bubble-cal@outlook.com>
2024-07-02 10:53:03 -02:30
Lei Xu
020a437230 docs: add merge insert, create index and create scalar index to public rest api doc (#1420)
Publicly documented three APIs:
- `merge_insert`
- `create_index`
- `create_scalar_index`

---------

Co-authored-by: Weston Pace <weston.pace@gmail.com>
2024-07-01 12:52:27 -07:00
Cory Grinstead
34f1aeb84c chore(nodejs): make openai optional, and apache-arrow a peer dep (#1417)
fyi, this should have no breaking changes as npm is opt-out instead of
opt-in when resolving dependencies

all peer and optional dependencies get installed by default, so users
need to manually opt out.

`npm i --omit optional --omit peer`
2024-07-01 12:50:01 -05:00
Cory Grinstead
5c3a88b6b2 feat(nodejs): add better typehints for registry (#1408)
previously the `registry` would return `undefined | EmbeddingFunction`
even for built in functions such as "openai"

now it'll return the correct type for `getRegistry().get("openai")`

as well as pass in the correct options type to `create`

### before
```ts
// `as const` keeps the literal type of `model`
const options = {model: 'not-a-real-model'} as const
// this'd compile just fine, but result in a runtime error
const openai: EmbeddingFunction | undefined = getRegistry().get("openai").create(options)
// this'd also compile fine
const openai2: EmbeddingFunction | undefined = getRegistry().get("openai").create({MODEL: ''})
```
### after
```ts
// `as const` keeps the literal type of `model`
const options = {model: 'not-a-real-model'} as const

const openai: OpenAIEmbeddingFunction = getRegistry().get("openai").create(options)
// Type '"not-a-real-model"' is not assignable to type '"text-embedding-ada-002" | "text-embedding-3-large" | "text-embedding-3-small" | undefined'
```
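
As a rough sketch of how this kind of typing can be achieved, the registry below maps each built-in name to a factory with a precise options type. Every name here (`RegistrySketch`, `BuiltinFactories`, `OpenAIOptionsSketch`, `OpenAIEmbeddingSketch`) is hypothetical and is not taken from the actual `@lancedb/lancedb` definitions; only the three model names come from the error message above.

```ts
// Hypothetical types for illustration; not the real @lancedb/lancedb definitions.
type OpenAIModel =
  | "text-embedding-ada-002"
  | "text-embedding-3-large"
  | "text-embedding-3-small";

interface OpenAIOptionsSketch {
  model?: OpenAIModel;
}

class OpenAIEmbeddingSketch {
  constructor(readonly options: OpenAIOptionsSketch) {}
}

// Maps each built-in name to a factory with a precise options type.
interface BuiltinFactories {
  openai: { create(options?: OpenAIOptionsSketch): OpenAIEmbeddingSketch };
}

class RegistrySketch {
  private readonly builtins: BuiltinFactories = {
    openai: { create: (options = {}) => new OpenAIEmbeddingSketch(options) },
  };

  // Known names resolve to their exact factory type, with no `undefined` in the result.
  get<K extends keyof BuiltinFactories>(name: K): BuiltinFactories[K] {
    return this.builtins[name];
  }
}

const embedder = new RegistrySketch()
  .get("openai")
  .create({ model: "text-embedding-3-small" });
```

With this shape, passing `{ model: "not-a-real-model" }` to `create` is rejected at compile time rather than at runtime.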
2024-07-01 12:49:42 -05:00
Lei Xu
e780b2f51c ci: fix nodejs doc test (#1419)
Fixed nodejs doctest failures due to compiling JNI node.
2024-07-01 10:21:41 -07:00
Cory Grinstead
b8a1719174 feat(nodejs): catch unwinds in node bindings (#1414)
this bumps napi version to 2.16 which contains a few bug fixes.
Additionally, it adds `catch_unwind` to any method that may
unintentionally panic.

`catch_unwind` catches an unwinding panic and returns a regular JS error
instead of letting the panic propagate.
2024-07-01 09:28:10 -05:00
Ayush Chaurasia
ccded130ed docs: add reranking example (#1416) 2024-07-01 19:42:38 +05:30
Sidharth Rajaram
48f8d1b3b7 docs: addresses typos in HF embedding example docs (#1415)
* `table.add` requires `data` parameter on the docs page regarding use
of embedding models from HF
* also changed the name of example class from `TextModel` to `Words`
since that is what is used as parameter in the `db.create_table` call
* Per
https://lancedb.github.io/lancedb/python/python/#lancedb.table.Table.add
2024-07-01 12:14:17 +05:30
Will Jones
865ed99881 feat: dynamodb commit store support (#1410)
This allows users to specify URIs like:

```
s3+ddb://my_bucket/path?ddbTableName=myCommitTable
```

and it will support concurrent writes in S3.
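
A minimal sketch of assembling such a URI is shown below. The helper `buildDdbUri` is hypothetical and not part of LanceDB; only the `s3+ddb` scheme and the `ddbTableName` query parameter come from the example above.

```ts
// Hypothetical helper, not part of the LanceDB API: builds a URI of the
// form s3+ddb://<bucket>/<path>?ddbTableName=<table>.
function buildDdbUri(bucket: string, path: string, ddbTableName: string): string {
  // Drop leading slashes so the bucket and path are joined by exactly one "/".
  const cleanPath = path.replace(/^\/+/, "");
  // Percent-encode the table name so it is safe inside a query string.
  const table = encodeURIComponent(ddbTableName);
  return `s3+ddb://${bucket}/${cleanPath}?ddbTableName=${table}`;
}

const commitStoreUri = buildDdbUri("my_bucket", "path", "myCommitTable");
```

The resulting string would then be passed wherever a database URI is accepted.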

* [x] Add dynamodb integration tests
* [x] Add modifications to get it working in Python sync API
* [x] Added section in documentation describing how to configure.

Closes #534

---------

Co-authored-by: universalmind303 <cory.grinstead@gmail.com>
2024-06-28 09:30:36 -07:00
Lei Xu
d6485f1215 docs: add openapi rest api page (#1413) 2024-06-27 21:32:34 -07:00
Cory Grinstead
79a1667753 feat(nodejs): feature parity [6/N] - make public interface work with multiple arrow versions (#1392)
previously we didn't have great compatibility with other versions of
apache arrow. This should bridge that gap a bit.


depends on https://github.com/lancedb/lancedb/pull/1391
see actual diff here
https://github.com/universalmind303/lancedb/compare/query-filter...universalmind303:arrow-compatibility
2024-06-25 11:10:08 -05:00
Thomas J. Fan
a866b78a31 docs: fixes polars formatting in docs (#1400)
Currently, the whole polars section is formatted as a code block:
https://lancedb.github.io/lancedb/guides/tables/#from-a-polars-dataframe

This PR fixes the formatting.
2024-06-25 08:46:16 -07:00
Will Jones
c7d37b3e6e docs: add tip about lzma linking (#1397)
Similar to https://github.com/lancedb/lance/pull/2505
2024-06-25 08:20:31 -07:00
Lance Release
4b71552b73 Updating package-lock.json 2024-06-25 00:26:08 +00:00
Lance Release
5ce5f64da3 Bump version: 0.6.0-beta.0 → 0.6.0 v0.6.0 2024-06-25 00:25:45 +00:00
Lance Release
c582b0fc63 Bump version: 0.5.2 → 0.6.0-beta.0 2024-06-25 00:25:45 +00:00
Lance Release
bc0814767b Bump version: 0.9.0-beta.0 → 0.9.0 python-v0.9.0 2024-06-25 00:25:27 +00:00
Lance Release
8960a8e535 Bump version: 0.8.2 → 0.9.0-beta.0 2024-06-25 00:25:27 +00:00
Weston Pace
a8568ddc72 feat: upgrade to lance 0.13.0 (#1404) 2024-06-24 17:22:57 -07:00
Cory Grinstead
55f88346d0 feat(nodejs): table.indexStats (#1361)
closes https://github.com/lancedb/lancedb/issues/1359
2024-06-21 17:06:52 -05:00
Will Jones
dfb9a28795 ci(node): add description and keywords for lancedb package (#1398) 2024-06-21 14:43:35 -07:00
Cory Grinstead
a797f5fe59 feat(nodejs): feature parity [5/N] - add query.filter() alias (#1391)
to make the transition from `vectordb` to `@lancedb/lancedb` as seamless
as possible, this adds `query.filter` with a deprecated tag.
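
A deprecated alias of this kind could look roughly like the sketch below. `QuerySketch` is a hypothetical stand-in and not the real query class, and it assumes the preferred method is named `where`.

```ts
// Hypothetical stand-in for a query builder; not the real @lancedb/lancedb class.
class QuerySketch {
  private predicate: string | undefined;

  // Preferred method (name assumed): stores a SQL-like filter expression.
  where(predicate: string): this {
    this.predicate = predicate;
    return this;
  }

  /**
   * Alias kept for code migrating from the older package.
   * @deprecated Use `where` instead.
   */
  filter(predicate: string): this {
    return this.where(predicate);
  }

  // Exposes the stored predicate so the example can be inspected.
  describe(): string {
    return this.predicate ?? "<no filter>";
  }
}

const viaWhere = new QuerySketch().where("price > 10").describe();
const viaFilter = new QuerySketch().filter("price > 10").describe();
```

Both calls store the same predicate; editors that understand the `@deprecated` JSDoc tag will flag the `filter` call.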


depends on https://github.com/lancedb/lancedb/pull/1390
see actual diff here
https://github.com/universalmind303/lancedb/compare/list-indices-name...universalmind303:query-filter
2024-06-21 16:03:58 -05:00
Cory Grinstead
3cd84c9375 feat(nodejs): feature parity [4/N] - add 'name' to 'IndexConfig' for 'listIndices' (#1390)
depends on https://github.com/lancedb/lancedb/pull/1386

see actual diff here
https://github.com/universalmind303/lancedb/compare/create-table-args...universalmind303:list-indices-name
2024-06-21 15:45:02 -05:00
Cory Grinstead
5ca83fdc99 fix(node): node build (#1396)
I have no idea why this fixes the build.
2024-06-21 15:42:22 -05:00
Cory Grinstead
33cc9b682f feat(nodejs): feature parity [3/N] - createTable({name, data, ...options}) (#1386)
adds support for the `vectordb` syntax of `createTable({name, data,
...options})`.
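
One way to accept both call styles is with function overloads, sketched below. All types here (`CreateTableArgs`, `CreateTableOptions`, `TableSketch`, `RowSketch`) and the `mode` option are illustrative assumptions, not the library's actual signatures.

```ts
// Hypothetical types and function; the real signatures may differ.
type RowSketch = Record<string, unknown>;

interface CreateTableOptions {
  mode?: "create" | "overwrite"; // assumed option, for illustration only
}

interface CreateTableArgs extends CreateTableOptions {
  name: string;
  data: RowSketch[];
}

interface TableSketch {
  name: string;
  rowCount: number;
  mode: "create" | "overwrite";
}

// Overloads: positional form and single-object form.
function createTable(name: string, data: RowSketch[], options?: CreateTableOptions): TableSketch;
function createTable(args: CreateTableArgs): TableSketch;
function createTable(
  nameOrArgs: string | CreateTableArgs,
  data?: RowSketch[],
  options?: CreateTableOptions,
): TableSketch {
  // Normalize both call styles into one argument object.
  const args: CreateTableArgs =
    typeof nameOrArgs === "string"
      ? { name: nameOrArgs, data: data ?? [], ...options }
      : nameOrArgs;
  return { name: args.name, rowCount: args.data.length, mode: args.mode ?? "create" };
}

const positional = createTable("items", [{ id: 1 }], { mode: "overwrite" });
const objectStyle = createTable({ name: "items", data: [{ id: 1 }], mode: "overwrite" });
```

Both calls normalize to the same internal argument object, so the rest of the implementation does not need to know which style was used.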


depends on https://github.com/lancedb/lancedb/pull/1380
see actual diff here
https://github.com/universalmind303/lancedb/compare/table-name...universalmind303:create-table-args
2024-06-21 12:17:39 -05:00
Cory Grinstead
b3e5ac6d2a feat(nodejs): feature parity [2/N] - add table.name and lancedb.connect({args}) (#1380)
depends on https://github.com/lancedb/lancedb/pull/1378

see proper diff here
https://github.com/universalmind303/lancedb/compare/remote-table-node...universalmind303:lancedb:table-name
2024-06-21 11:38:26 -05:00
josca42
0fe844034d feat: enable stemming (#1356)
Added the ability to specify `tokenizer_name` when creating a full text
search index using tantivy. This enables the use of language-specific
stemming.

Also updated the [guide on full text
search](https://lancedb.github.io/lancedb/fts/) with a short section on
choosing a tokenizer.

Fixes #1315
2024-06-20 14:23:55 -07:00
Cory Grinstead
f41eb899dc chore(rust): lock toolchain & fix clippy (#1389)
- fix some clippy errors from ci running a different toolchain.
- add some safety notes about some unsafe blocks.
- lock the toolchain so that it is consistent across dev and CI.
2024-06-20 12:13:03 -05:00
Cory Grinstead
e7022b990e feat(nodejs): feature parity [1/N] - remote table (#1378)
closes https://github.com/lancedb/lancedb/issues/1362
2024-06-17 15:23:27 -05:00
Weston Pace
ea86dad4b7 feat: upgrade lance to 0.12.2-beta.2 (#1381) 2024-06-14 05:43:26 -07:00
harsha-mangena
a45656b8b6 docs: remove code-block:: python from docs (#1366)
- refer #1264
- fixed minor documentation issue
2024-06-11 13:13:02 -07:00
Cory Grinstead
bc19a75f65 feat(nodejs): merge insert (#1351)
closes https://github.com/lancedb/lancedb/issues/1349
2024-06-11 15:05:15 -05:00
Ryan Green
8e348ab4bd fix: use JS naming convention in new index stats fields (#1377)
Changes new index stats fields in node client from snake case to camel
case.
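
A conversion of this kind can be sketched as below. The helpers `snakeToCamel` and `camelizeKeys` are hypothetical, and the field names in the usage line are illustrative rather than the actual index stats fields.

```ts
// Hypothetical helper: converts one snake_case identifier to camelCase.
function snakeToCamel(key: string): string {
  return key.replace(/_([a-z0-9])/g, (_match: string, ch: string) => ch.toUpperCase());
}

// Returns a shallow copy of the object with every top-level key renamed.
function camelizeKeys(obj: Record<string, unknown>): Record<string, unknown> {
  const out: Record<string, unknown> = {};
  for (const key of Object.keys(obj)) {
    out[snakeToCamel(key)] = obj[key];
  }
  return out;
}

// Field names here are illustrative only.
const indexStats = camelizeKeys({ num_indexed_rows: 10, num_unindexed_rows: 2 });
```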
2024-06-10 16:41:31 -02:30
Raghav Dixit
96914a619b docs: llama-index integration (#1347)
Updated API reference and usage for llama index integration.
2024-06-09 23:52:18 +05:30
Beinan
3c62806b6a fix(java): the JVM crash when using jdk 8 (#1372)
`Optional::isEmpty` does not exist in Java 8, so we should use
`isPresent` instead
2024-06-08 22:43:41 -07:00
Ayush Chaurasia
72f339a0b3 docs: add note about embedding api not being available on cloud (#1371) 2024-06-09 03:57:23 +05:30
QianZhu
b9e3cfbdca fix: add status to remote listIndices return (#1364)
expose `status` returned by remote listIndices
2024-06-08 09:52:35 -07:00
Ayush Chaurasia
5e30648f45 docs: fix example path (#1367) 2024-06-07 19:40:50 -07:00
Ayush Chaurasia
76fc16c7a1 docs: add retriever guide, address minor onboarding feedbacks & enhancement (#1326)
- Tried to address some of the onboarding feedback listed in
https://github.com/lancedb/lancedb/issues/1224
- Improve visibility of the pydantic integration and embedding API. (Based
on onboarding feedback: there are many ways of ingesting data and defining a
schema, but it is unclear which to use in a specific use case.)
- Add a guide that takes users through testing and improving retriever
performance using built-in utilities like hybrid-search and reranking
- Add some benchmarks for the above
- Add missing cohere docs

---------

Co-authored-by: Weston Pace <weston.pace@gmail.com>
2024-06-08 06:25:31 +05:30
Weston Pace
007f9c1af8 chore: change build machine for linux arm (#1360) 2024-06-06 13:22:58 -07:00
Lance Release
27e4ad3f11 Updating package-lock.json 2024-06-05 13:47:44 +00:00
Lance Release
df42943ccf Bump version: 0.5.2-beta.0 → 0.5.2 v0.5.2 2024-06-05 13:47:28 +00:00
Lance Release
3eec9ea740 Bump version: 0.5.1 → 0.5.2-beta.0 2024-06-05 13:47:27 +00:00
Lance Release
11fcdb1194 Bump version: 0.8.2-beta.0 → 0.8.2 python-v0.8.2 2024-06-05 13:47:16 +00:00
Lance Release
95a5a0d713 Bump version: 0.8.1 → 0.8.2-beta.0 2024-06-05 13:47:16 +00:00
Weston Pace
c3043a54c6 feat: bump lance dependency to 0.12.1 (#1357) 2024-06-05 06:07:11 -07:00
Weston Pace
d5586c9c32 feat: make it possible to opt in to using the v2 format (#1352)
This also exposed the max_batch_length configuration option in
python/node (it was needed to verify if we are actually in v2 mode or
not)
2024-06-04 21:52:14 -07:00
Rob Meng
d39e7d23f4 feat: fast path for checkout_latest (#1355)
similar to https://github.com/lancedb/lancedb/pull/1354
do locked IO less frequently
2024-06-04 23:01:28 -04:00
Rob Meng
ddceda4ff7 feat: add fast path to dataset reload (#1354)
most of the time we don't need to reload. Locking the write lock and
performing IO is not an ideal pattern.

This PR tries to make the critical section of `.write()` happen less
frequently.

This isn't the ideal solution. The ideal solution would not
lock until the new dataset has been loaded, but that would require too
much refactoring.
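
The general shape of such a fast path can be sketched as follows. This is not the code from the PR: `DatasetCacheSketch`, the interval-based staleness rule, and the counter that stands in for taking the write lock are all assumptions made for illustration.

```ts
// Hypothetical sketch; names and the staleness rule are assumptions.
class DatasetCacheSketch {
  writeLockAcquisitions = 0; // stands in for taking the write lock
  reloads = 0; // counts how often the expensive reload would run
  private lastCheckedMs = 0;

  constructor(private readonly intervalMs: number) {}

  // Cheap, read-only check that needs no exclusive access.
  private isStale(nowMs: number): boolean {
    return nowMs - this.lastCheckedMs >= this.intervalMs;
  }

  ensureFresh(nowMs: number): void {
    // Fast path: still fresh, so skip the critical section entirely.
    if (!this.isStale(nowMs)) return;

    // Slow path: enter the (simulated) critical section.
    this.writeLockAcquisitions += 1;
    // Re-check inside, since another caller may have reloaded already.
    if (this.isStale(nowMs)) {
      this.reloads += 1; // the expensive IO would happen here
      this.lastCheckedMs = nowMs;
    }
  }
}

const datasetCache = new DatasetCacheSketch(1000);
datasetCache.ensureFresh(1000); // stale: takes the slow path
datasetCache.ensureFresh(1200); // fresh: fast path
datasetCache.ensureFresh(1500); // fresh: fast path
datasetCache.ensureFresh(2500); // stale again: takes the slow path
```

Of the four calls, only two enter the simulated critical section; the other two return after the cheap check.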
2024-06-04 19:03:53 -04:00
Cory Grinstead
70f92f19a6 feat(nodejs): table.search functionality (#1341)
closes https://github.com/lancedb/lancedb/issues/1256
2024-06-04 14:04:03 -05:00