Commit Graph

1163 Commits

Author SHA1 Message Date
Rithik Kumar
758c82858f docs: add AI agent example (#1553)
before:
![Screenshot 2024-08-21
225014](https://github.com/user-attachments/assets/e5b05586-87c5-4739-a4df-2d6cd0704ba5)

After:
![Screenshot 2024-08-21
225029](https://github.com/user-attachments/assets/504959db-f560-49b2-9492-557e9846a793)

---------

Co-authored-by: Ayush Chaurasia <ayush.chaurarsia@gmail.com>
2024-08-22 00:54:05 +05:30
Rithik Kumar
0cbc9cd551 docs: add evaluation example (#1552)
before:
![Screenshot 2024-08-21
194228](https://github.com/user-attachments/assets/68d96658-7579-4934-85af-e8c898b64660)

After:
![Screenshot 2024-08-21
195258](https://github.com/user-attachments/assets/81ddb9cd-cb93-47fc-a121-ff82701fd11f)

---------

Co-authored-by: Ayush Chaurasia <ayush.chaurarsia@gmail.com>
2024-08-21 20:37:04 +05:30
Ayush Chaurasia
7d65dd97cf chore(python): update Colbert architecture and minor improvements (#1547)
- Update ColBertReranker architecture: The current implementation
doesn't use the right architecture. This PR uses the implementation from
the Rerankers library. Fixes https://github.com/lancedb/lancedb/issues/1546
Benchmark diff (hit rate):
Hybrid - 91 vs 87
reranked vector - 85 vs 80

- Reranking in FTS is basically disabled in main after last week's FTS
updates. I think there's no blocker in supporting that?
- Allow overriding accelerators: Most transformer-based rerankers and
embedding functions automatically select a device. This PR allows
overriding that selection by passing `device`. Fixes:
https://github.com/lancedb/lancedb/issues/1487

---------

Co-authored-by: BubbleCal <bubble-cal@outlook.com>
2024-08-21 12:26:52 +05:30
Ayush Chaurasia
85bb7e54e4 docs: missing griffe dependency for mkdocs deployment (#1545) 2024-08-19 07:48:23 +05:30
Rithik Kumar
21014cab45 docs: add chatbot example and improve quality of other examples (#1544) 2024-08-17 12:35:33 +05:30
Lei Xu
5857cb4c6e docs: add a section to describe scalar index (#1495) 2024-08-16 18:48:29 -07:00
Rithik Kumar
09ce6c5bb5 docs: add vector search example (#1543) 2024-08-16 21:30:45 +05:30
BubbleCal
0fa50775d6 feat: support to query/index FTS on RemoteTable/AsyncTable (#1537)
Signed-off-by: BubbleCal <bubble-cal@outlook.com>
2024-08-16 12:01:05 +08:00
Gagan Bhullar
20faa4424b feat(python): add delete unverified parameter (#1542)
PR fixes #1527
2024-08-15 09:01:32 -07:00
BubbleCal
b624fc59eb docs: add create_fts_index doc in Python API Reference (#1533)
resolve #1313

---------

Signed-off-by: BubbleCal <bubble-cal@outlook.com>
2024-08-15 11:35:16 +08:00
Gagan Bhullar
d2caa5e202 feat(nodejs): add delete unverified (#1530)
PR fixes part of #1527
2024-08-14 08:53:53 -07:00
BubbleCal
501817cfac chore: bump the required python version to 3.9 (#1541)
Signed-off-by: BubbleCal <bubble-cal@outlook.com>
2024-08-14 08:44:31 -07:00
Ryan Green
b3daa25f46 feat: allow new scalar index types to be created in remote table (#1538) 2024-08-13 16:05:42 -02:30
Matt Basta
6008a8257b fix: remove native.d.ts from .npmignore (#1531)
This removes the type definitions for a number of important TypeScript
interfaces from `.npmignore` so that the package is not incorrectly
typed `any` in a number of places.

---

Presently the `opts` argument to `lancedb.connect` is typed `any`, even
though it shouldn't be.

<img width="560" alt="image"
src="https://github.com/user-attachments/assets/5c974ce8-5a59-44a1-935d-cbb808f0ea24">

Clicking into the type definitions for the published package, it has the
correct type signature:

<img width="831" alt="image"
src="https://github.com/user-attachments/assets/6e39a519-13ff-4ca8-95ae-85538ac59d5d">

However, `ConnectionOptions` is imported from `native.js` (along with a
number of other imports a bit further down):

<img width="384" alt="image"
src="https://github.com/user-attachments/assets/10c1b055-ae78-4088-922e-2816af64c23c">

This is not otherwise an issue, except that the type definitions for
`native.js` are not included in the published package:

<img width="217" alt="image"
src="https://github.com/user-attachments/assets/f15cd3b6-a8de-4011-9fa2-391858da20ec">

I haven't compiled the Rust code and run the build script, but I
strongly suspect that excluding the type definitions via `.npmignore`
is ultimately the root cause here.
2024-08-13 10:06:15 -07:00
Lance Release
aaff43d304 Updating package-lock.json 2024-08-12 19:48:18 +00:00
Lance Release
d4c3a8ca87 Bump version: 0.9.0 → 0.10.0-beta.0 v0.10.0-beta.0 2024-08-12 19:48:02 +00:00
Lance Release
ff5bbfdd4c Bump version: 0.12.0 → 0.13.0-beta.0 python-v0.13.0-beta.0 2024-08-12 19:47:57 +00:00
Lei Xu
694ca30c7c feat(nodejs): add bitmap and label list index types in nodejs (#1532) 2024-08-11 12:06:02 -07:00
Lei Xu
b2317c904d feat: create bitmap and label list scalar index using python async api (#1529)
* Expose `bitmap` and `LabelList` scalar index types via the Rust and
async Python APIs
* Add documentation
2024-08-11 09:16:11 -07:00
BubbleCal
613f3063b9 chore: upgrade lance to 0.16.1 (#1524)
Signed-off-by: BubbleCal <bubble-cal@outlook.com>
2024-08-09 19:18:05 +08:00
BubbleCal
5d2cd7fb2e chore: upgrade object_store to 0.10.2 (#1523)
To use the same version as lance

Signed-off-by: BubbleCal <bubble-cal@outlook.com>
2024-08-09 12:03:46 +08:00
Ayush Chaurasia
a88e9bb134 docs: add lancedb embedding fcn on cloud docs (#1521) 2024-08-09 07:21:04 +05:30
Gagan Bhullar
9c1adff426 feat(python): add to_list to async api (#1520)
PR fixes #1517
2024-08-08 11:45:20 -07:00
BubbleCal
f9d5fa88a1 feat!: migrate FTS from tantivy to lance-index (#1483)
Lance now supports FTS, so this adds it to the lancedb Python,
TypeScript and Rust SDKs.

For Python, we still use tantivy-based FTS by default because the lance
FTS index currently lacks some features of tantivy.

For Python:
- Support creating a lance-based FTS index
- Support specifying columns for full-text search (only available for
the lance-based FTS index)

For TypeScript:
- Change the search method so that it can accept both string and vector
- Support full text search

For Rust:
- Support full text search

The others:
- Update the FTS doc

BREAKING CHANGE: 
- for Python, this renames the score column attached to FTS results from
"score" to "_score"; this could be a breaking change for users that rely
on the scores
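
Code that has to work on both sides of this rename could read the score defensively. The following is a minimal sketch, assuming each result row is available as a plain dict; `fts_score` is a hypothetical helper written for this illustration and is not part of LanceDB.

```python
def fts_score(row: dict) -> float:
    """Return the FTS score from a result row (hypothetical helper).

    Prefers the new "_score" column and falls back to the legacy
    "score" column that was used before the rename.
    """
    if "_score" in row:
        return row["_score"]
    if "score" in row:
        return row["score"]
    raise KeyError("row has neither '_score' nor 'score'")


# Example rows in the new and the legacy shape (made-up values).
new_row = {"text": "hello", "_score": 1.5}
old_row = {"text": "hello", "score": 0.7}
```

Calling `fts_score` on either row returns its score, so downstream code does not need to know which column name the installed version produces.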

---------

Signed-off-by: BubbleCal <bubble-cal@outlook.com>
2024-08-08 15:33:15 +08:00
Lance Release
4db554eea5 Updating package-lock.json 2024-08-07 20:56:12 +00:00
Lance Release
101066788d Bump version: 0.9.0-beta.0 → 0.9.0 v0.9.0 2024-08-07 20:55:53 +00:00
Lance Release
c4135d9d30 Bump version: 0.8.0 → 0.9.0-beta.0 2024-08-07 20:55:52 +00:00
Lance Release
ec39d98571 Bump version: 0.12.0-beta.0 → 0.12.0 python-v0.12.0 2024-08-07 20:55:40 +00:00
Lance Release
0cb37f0e5e Bump version: 0.11.0 → 0.12.0-beta.0 2024-08-07 20:55:39 +00:00
Gagan Bhullar
24e3507ee2 fix(node): export optimize options (#1518)
PR fixes #1514
2024-08-07 13:15:51 -07:00
Lei Xu
2bdf0a02f9 feat!: upgrade lance to 0.16 (#1519) 2024-08-07 13:15:22 -07:00
Gagan Bhullar
32123713fd feat(python): optimize stats repr method (#1510)
PR fixes #1507
2024-08-07 08:47:52 -07:00
Gagan Bhullar
d5a01ffe7b feat(python): index config repr method (#1509)
PR fixes #1506
2024-08-07 08:46:46 -07:00
Ayush Chaurasia
e01045692c feat(python): support embedding functions in remote table (#1405) 2024-08-07 20:22:43 +05:30
Rithik Kumar
a62f661d90 docs: revamp example docs (#1512)
Before: 
![Screenshot 2024-08-07
015834](https://github.com/user-attachments/assets/b817f846-78b3-4d6f-b4a0-dfa3f4d6be87)

After:
![Screenshot 2024-08-07
015852](https://github.com/user-attachments/assets/53370301-8c40-45f8-abe3-32f9d051597e)
![Screenshot 2024-08-07
015934](https://github.com/user-attachments/assets/63cdd038-32bb-4b3e-b9c4-1389d2754014)
![Screenshot 2024-08-07
015941](https://github.com/user-attachments/assets/70388680-9c2b-49ef-ba00-2bb015988214)
![Screenshot 2024-08-07
015949](https://github.com/user-attachments/assets/76335a33-bb6f-473c-896f-447320abcc25)

---------

Co-authored-by: Ayush Chaurasia <ayush.chaurarsia@gmail.com>
2024-08-07 03:56:59 +05:30
Ayush Chaurasia
4769d8eb76 feat(python): multi-vector reranking support (#1481)
Currently targeting the following usage:
```
from lancedb.rerankers import CrossEncoderReranker

reranker = CrossEncoderReranker()

query = "hello"

res1 = table.search(query, vector_column_name="vector").limit(3)
res2 = table.search(query, vector_column_name="text_vector").limit(3)
res3 = table.search(query, vector_column_name="meta_vector").limit(3)

reranked = reranker.rerank_multivector(
    [res1, res2, res3],
    deduplicate=True,
    query=query,  # some reranker models need the query
)
```
- This implements the `rerank_multivector` function in the base reranker
so that all rerankers that implement `rerank_vector` automatically get
multi-vector reranking support
- Special case: the RRF reranker just reuses its existing
`rerank_hybrid` function for multi-vector reranking.
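
As a rough illustration of what fusing several result sets with deduplication can look like, here is a minimal reciprocal rank fusion (RRF) sketch in plain Python. It is not the implementation from this PR: `rrf_merge`, the `id` key, the `rrf_score` field, and the constant `k=60` are all assumptions made for the example.

```python
from collections import defaultdict
from typing import Dict, Hashable, List


def rrf_merge(result_sets: List[List[dict]], k: int = 60, id_key: str = "id") -> List[dict]:
    """Fuse ranked result lists with reciprocal rank fusion (illustrative only).

    Each row's fused score is the sum of 1 / (k + rank) over every list it
    appears in, so rows found by several searches rise to the top. Rows that
    share the same id are merged into a single entry (deduplication).
    """
    scores: Dict[Hashable, float] = defaultdict(float)
    first_seen: Dict[Hashable, dict] = {}
    for results in result_sets:
        for rank, row in enumerate(results, start=1):
            key = row[id_key]
            scores[key] += 1.0 / (k + rank)
            # Keep the first copy of a row; later duplicates only add score.
            first_seen.setdefault(key, row)
    ordered = sorted(scores, key=lambda key: scores[key], reverse=True)
    return [{**first_seen[key], "rrf_score": scores[key]} for key in ordered]


# Two made-up result lists that overlap on row "b".
res1 = [{"id": "a"}, {"id": "b"}]
res2 = [{"id": "b"}, {"id": "c"}]
merged = rrf_merge([res1, res2])
```

Here `merged` contains three rows, with `"b"` ranked first because it appears in both input lists.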

---------

Co-authored-by: Weston Pace <weston.pace@gmail.com>
2024-08-07 01:45:46 +05:30
Ayush Chaurasia
d07d7a5980 chore: update polars version range (#1508) 2024-08-06 23:43:15 +05:30
Robby
8d2ff7b210 feat(python): add watsonx embeddings to registry (#1486)
Related issue: https://github.com/lancedb/lancedb/issues/1412

---------

Co-authored-by: Robby <h0rv@users.noreply.github.com>
2024-08-06 10:58:33 +05:30
Will Jones
61c05b51a0 fix(nodejs): address import issues in lancedb npm module (#1503)
Fixes [#1496](https://github.com/lancedb/lancedb/issues/1496)
2024-08-05 16:30:27 -07:00
Will Jones
7801ab9b8b ci: fix release by upgrading to Node 18 (#1494)
Building with Node 16 produced this error:

```
npm ERR! code ENOENT
npm ERR! syscall chmod
npm ERR! path /io/nodejs/node_modules/apache-arrow-15/bin/arrow2csv.cjs
npm ERR! errno -2
npm ERR! enoent ENOENT: no such file or directory, chmod '/io/nodejs/node_modules/apache-arrow-15/bin/arrow2csv.cjs'
npm ERR! enoent This is related to npm not being able to find a file.
npm ERR! enoent 
```

[CI
Failure](https://github.com/lancedb/lancedb/actions/runs/10117131772/job/27981475770).
This looks like it is https://github.com/apache/arrow/issues/43341

Upgrading to Node 18 makes this go away. Since Node 18 requires
glibc >= 2.28, we had to upgrade the manylinux version we are using.
This is fine since we already state a minimum Node version of 18.

This also upgrades the openssl version we bundle, as well as
consolidates the build files.
2024-08-05 14:08:42 -07:00
Rithik Kumar
d297da5a7e docs: update examples docs (#1488)
Testing Workflow with my first PR.
Before:
![Screenshot 2024-08-01
183326](https://github.com/user-attachments/assets/83d22101-8bbf-4b18-81e4-f740e605727a)

After:
![Screenshot 2024-08-01
183333](https://github.com/user-attachments/assets/a5e4cd2c-c524-4009-81d5-75b2b0361f83)
2024-08-01 18:54:45 +05:30
Ryan Green
6af69b57ad fix: return LanceMergeInsertBuilder in overridden merge_insert method on remote table (#1484) 2024-07-31 12:25:16 -02:30
Cory Grinstead
a062a92f6b docs: custom embedding function for ts (#1479) 2024-07-30 18:19:55 -05:00
Gagan Bhullar
277b753fd8 fix: run java stages in parallel (#1472)
This PR is for issue - https://github.com/lancedb/lancedb/issues/1331
2024-07-27 12:04:32 -07:00
Lance Release
f78b7863f6 Updating package-lock.json 2024-07-26 20:18:55 +00:00
Lance Release
e7d824af2b Bump version: 0.8.0-beta.0 → 0.8.0 v0.8.0 2024-07-26 20:18:37 +00:00
Lance Release
02f1ec775f Bump version: 0.7.2 → 0.8.0-beta.0 2024-07-26 20:18:36 +00:00
Lance Release
7b6d3f943b Bump version: 0.11.0-beta.0 → 0.11.0 python-v0.11.0 2024-07-26 20:18:31 +00:00
Lance Release
676876f4d5 Bump version: 0.10.2 → 0.11.0-beta.0 2024-07-26 20:18:30 +00:00
Cory Grinstead
fbfe2444a8 feat(nodejs): huggingface compatible transformers (#1462) 2024-07-26 12:54:15 -07:00