1147 Commits

Author SHA1 Message Date
Lance Release
ff5bbfdd4c Bump version: 0.12.0 → 0.13.0-beta.0 python-v0.13.0-beta.0 2024-08-12 19:47:57 +00:00
Lei Xu
694ca30c7c feat(nodejs): add bitmap and label list index types in nodejs (#1532) 2024-08-11 12:06:02 -07:00
Lei Xu
b2317c904d feat: create bitmap and label list scalar index using python async api (#1529)
* Expose `bitmap` and `LabelList` scalar index type via Rust and Async
Python API
* Add documents
2024-08-11 09:16:11 -07:00
BubbleCal
613f3063b9 chore: upgrade lance to 0.16.1 (#1524)
Signed-off-by: BubbleCal <bubble-cal@outlook.com>
2024-08-09 19:18:05 +08:00
BubbleCal
5d2cd7fb2e chore: upgrade object_store to 0.10.2 (#1523)
To use the same version as lance

Signed-off-by: BubbleCal <bubble-cal@outlook.com>
2024-08-09 12:03:46 +08:00
Ayush Chaurasia
a88e9bb134 docs: add lancedb embedding fcn on cloud docs (#1521) 2024-08-09 07:21:04 +05:30
Gagan Bhullar
9c1adff426 feat(python): add to_list to async api (#1520)
PR fixes #1517
2024-08-08 11:45:20 -07:00
BubbleCal
f9d5fa88a1 feat!: migrate FTS from tantivy to lance-index (#1483)
Lance now supports FTS, so add it to the lancedb Python, TypeScript and
Rust SDKs.

For Python, we still use tantivy-based FTS by default because the lance
FTS index still lacks some tantivy features.

For Python:
- Support creating a lance-based FTS index
- Support specifying columns for full text search (only available for
lance-based FTS index)

For TypeScript:
- Change the search method so that it can accept both string and vector
- Support full text search

For Rust:
- Support full text search

Other changes:
- Update the FTS doc

BREAKING CHANGE:
- for Python, this renames the FTS score column from "score" to
"_score", which can break code that relies on the scores

---------

Signed-off-by: BubbleCal <bubble-cal@outlook.com>
2024-08-08 15:33:15 +08:00
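
The score-column rename described in the breaking change above can be absorbed on the caller side. The following is a minimal sketch, not part of lancedb: it assumes result rows are available as plain Python dicts, and `fts_score` is a hypothetical helper name introduced only for illustration.

```python
from typing import Any, Dict


def fts_score(row: Dict[str, Any]) -> float:
    """Return the FTS relevance score from a result row.

    Hypothetical helper: prefers the new "_score" key and falls back to
    the old "score" key, so the same code works before and after the rename.
    """
    if "_score" in row:
        return float(row["_score"])
    if "score" in row:
        return float(row["score"])
    raise KeyError("row has neither '_score' nor 'score'")


# Rows shaped as an assumption: one dict per search hit.
row_after_rename = {"text": "hello world", "_score": 1.5}
row_before_rename = {"text": "hello world", "score": 1.5}

print(fts_score(row_after_rename), fts_score(row_before_rename))
```

Checking for the key rather than catching an exception keeps the fallback explicit and makes it easy to delete once the old column name no longer needs support.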
Lance Release
4db554eea5 Updating package-lock.json 2024-08-07 20:56:12 +00:00
Lance Release
101066788d Bump version: 0.9.0-beta.0 → 0.9.0 v0.9.0 2024-08-07 20:55:53 +00:00
Lance Release
c4135d9d30 Bump version: 0.8.0 → 0.9.0-beta.0 2024-08-07 20:55:52 +00:00
Lance Release
ec39d98571 Bump version: 0.12.0-beta.0 → 0.12.0 python-v0.12.0 2024-08-07 20:55:40 +00:00
Lance Release
0cb37f0e5e Bump version: 0.11.0 → 0.12.0-beta.0 2024-08-07 20:55:39 +00:00
Gagan Bhullar
24e3507ee2 fix(node): export optimize options (#1518)
PR fixes #1514
2024-08-07 13:15:51 -07:00
Lei Xu
2bdf0a02f9 feat!: upgrade lance to 0.16 (#1519) 2024-08-07 13:15:22 -07:00
Gagan Bhullar
32123713fd feat(python): optimize stats repr method (#1510)
PR fixes #1507
2024-08-07 08:47:52 -07:00
Gagan Bhullar
d5a01ffe7b feat(python): index config repr method (#1509)
PR fixes #1506
2024-08-07 08:46:46 -07:00
Ayush Chaurasia
e01045692c feat(python): support embedding functions in remote table (#1405) 2024-08-07 20:22:43 +05:30
Rithik Kumar
a62f661d90 docs: revamp example docs (#1512)
Before: 
![Screenshot 2024-08-07
015834](https://github.com/user-attachments/assets/b817f846-78b3-4d6f-b4a0-dfa3f4d6be87)

After:
![Screenshot 2024-08-07
015852](https://github.com/user-attachments/assets/53370301-8c40-45f8-abe3-32f9d051597e)
![Screenshot 2024-08-07
015934](https://github.com/user-attachments/assets/63cdd038-32bb-4b3e-b9c4-1389d2754014)
![Screenshot 2024-08-07
015941](https://github.com/user-attachments/assets/70388680-9c2b-49ef-ba00-2bb015988214)
![Screenshot 2024-08-07
015949](https://github.com/user-attachments/assets/76335a33-bb6f-473c-896f-447320abcc25)

---------

Co-authored-by: Ayush Chaurasia <ayush.chaurarsia@gmail.com>
2024-08-07 03:56:59 +05:30
Ayush Chaurasia
4769d8eb76 feat(python): multi-vector reranking support (#1481)
Currently targeting the following usage:
```
from lancedb.rerankers import CrossEncoderReranker

reranker = CrossEncoderReranker()

query = "hello"

res1 = table.search(query, vector_column_name="vector").limit(3)
res2 = table.search(query, vector_column_name="text_vector").limit(3)
res3 = table.search(query, vector_column_name="meta_vector").limit(3)

reranked = reranker.rerank_multivector(
    [res1, res2, res3],
    deduplicate=True,
    query=query,  # some reranker models need query
)
```
- This implements the rerank_multivector function in the base reranker,
so all rerankers that implement rerank_vector automatically get
multi-vector reranking support
- The RRF reranker is a special case: it reuses its existing
rerank_hybrid function for multi-vector reranking.

---------

Co-authored-by: Weston Pace <weston.pace@gmail.com>
2024-08-07 01:45:46 +05:30
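
The idea behind `deduplicate=True` in the usage above can be sketched independently of the library: a row found through more than one vector column should be handed to the reranker only once. This is an illustrative sketch only, not the code from the commit. It assumes each result set is a list of plain dicts, and both the `id` key and the `merge_deduplicated` name are hypothetical.

```python
from typing import Any, Dict, Hashable, List, Sequence, Set


def merge_deduplicated(
    result_sets: Sequence[Sequence[Dict[str, Any]]], key: str = "id"
) -> List[Dict[str, Any]]:
    """Concatenate several result sets, keeping the first copy of each row.

    `key` names the (hypothetical) column that identifies a row.
    """
    seen: Set[Hashable] = set()
    merged: List[Dict[str, Any]] = []
    for results in result_sets:
        for row in results:
            if row[key] in seen:
                continue  # already found via another vector column
            seen.add(row[key])
            merged.append(row)
    return merged


res1 = [{"id": 1, "text": "a"}, {"id": 2, "text": "b"}]
res2 = [{"id": 2, "text": "b"}, {"id": 3, "text": "c"}]
merged = merge_deduplicated([res1, res2])
print([row["id"] for row in merged])
```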
Ayush Chaurasia
d07d7a5980 chore: update polars version range (#1508) 2024-08-06 23:43:15 +05:30
Robby
8d2ff7b210 feat(python): add watsonx embeddings to registry (#1486)
Related issue: https://github.com/lancedb/lancedb/issues/1412

---------

Co-authored-by: Robby <h0rv@users.noreply.github.com>
2024-08-06 10:58:33 +05:30
Will Jones
61c05b51a0 fix(nodejs): address import issues in lancedb npm module (#1503)
Fixes [#1496](https://github.com/lancedb/lancedb/issues/1496)
2024-08-05 16:30:27 -07:00
Will Jones
7801ab9b8b ci: fix release by upgrading to Node 18 (#1494)
Building with Node 16 produced this error:

```
npm ERR! code ENOENT
npm ERR! syscall chmod
npm ERR! path /io/nodejs/node_modules/apache-arrow-15/bin/arrow2csv.cjs
npm ERR! errno -2
npm ERR! enoent ENOENT: no such file or directory, chmod '/io/nodejs/node_modules/apache-arrow-15/bin/arrow2csv.cjs'
npm ERR! enoent This is related to npm not being able to find a file.
npm ERR! enoent 
```

[CI
Failure](https://github.com/lancedb/lancedb/actions/runs/10117131772/job/27981475770).
This looks like it is https://github.com/apache/arrow/issues/43341

Upgrading to Node 18 makes this go away. Since Node 18 requires glibc
>= 2.28, we had to upgrade the manylinux version we are using. This is
fine since we already state a minimum Node version of 18.

This also upgrades the openssl version we bundle, as well as
consolidates the build files.
2024-08-05 14:08:42 -07:00
Rithik Kumar
d297da5a7e docs: update examples docs (#1488)
Testing Workflow with my first PR.
Before:
![Screenshot 2024-08-01
183326](https://github.com/user-attachments/assets/83d22101-8bbf-4b18-81e4-f740e605727a)

After:
![Screenshot 2024-08-01
183333](https://github.com/user-attachments/assets/a5e4cd2c-c524-4009-81d5-75b2b0361f83)
2024-08-01 18:54:45 +05:30
Ryan Green
6af69b57ad fix: return LanceMergeInsertBuilder in overridden merge_insert method on remote table (#1484) 2024-07-31 12:25:16 -02:30
Cory Grinstead
a062a92f6b docs: custom embedding function for ts (#1479) 2024-07-30 18:19:55 -05:00
Gagan Bhullar
277b753fd8 fix: run java stages in parallel (#1472)
This PR is for issue - https://github.com/lancedb/lancedb/issues/1331
2024-07-27 12:04:32 -07:00
Lance Release
f78b7863f6 Updating package-lock.json 2024-07-26 20:18:55 +00:00
Lance Release
e7d824af2b Bump version: 0.8.0-beta.0 → 0.8.0 v0.8.0 2024-07-26 20:18:37 +00:00
Lance Release
02f1ec775f Bump version: 0.7.2 → 0.8.0-beta.0 2024-07-26 20:18:36 +00:00
Lance Release
7b6d3f943b Bump version: 0.11.0-beta.0 → 0.11.0 python-v0.11.0 2024-07-26 20:18:31 +00:00
Lance Release
676876f4d5 Bump version: 0.10.2 → 0.11.0-beta.0 2024-07-26 20:18:30 +00:00
Cory Grinstead
fbfe2444a8 feat(nodejs): huggingface compatible transformers (#1462) 2024-07-26 12:54:15 -07:00
Will Jones
9555efacf9 feat: upgrade lance to 0.15.0 (#1477)
Changelog: https://github.com/lancedb/lance/releases/tag/v0.15.0

* Fixes #1466
* Closes #1475
* Fixes #1446
2024-07-26 09:13:49 -07:00
Ayush Chaurasia
513926960d docs: add rrf docs and update reranking notebook with Jina reranker results (#1474)
- RRF reranker
- Jina Reranker results

---------

Co-authored-by: Weston Pace <weston.pace@gmail.com>
2024-07-25 22:29:46 +05:30
inn-0
cc507ca766 docs: add missing whitespace before markdown table to fix rendering issue (#1471)
### Fix markdown table rendering issue

This PR adds a missing whitespace before a markdown table in the
documentation. This issue causes the table to not render properly in
mkdocs, while it does render properly in GitHub's markdown viewer.

#### Change Details:
- Added a single line of whitespace before the markdown table to ensure
proper rendering in mkdocs.

#### Note:
- I wasn't able to test this fix in the mkdocs environment, but it
should be safe as it only involves adding whitespace which won't break
anything.


---


Cohere supports the following input types:

| Input Type              | Description                                                                |
|-------------------------|----------------------------------------------------------------------------|
| "`search_document`"     | Used for embeddings stored in a vector database for search use-cases.      |
| "`search_query`"        | Used for embeddings of search queries run against a vector DB              |
| "`semantic_similarity`" | Specifies the given text will be used for Semantic Textual Similarity (STS) |
| "`classification`"      | Used for embeddings passed through a text classifier.                      |
| "`clustering`"          | Used for the embeddings run through a clustering algorithm                 |

Usage Example:
2024-07-24 22:26:28 +05:30
Cory Grinstead
492d0328fe chore: update readme to point to lancedb package (#1470) 2024-07-23 13:46:32 -07:00
Chang She
374c1e7aba fix: infer schema from huggingface dataset (#1444)
Closes #1383

When creating a table from a HuggingFace dataset, infer the arrow schema
directly
2024-07-23 13:12:34 -07:00
Gagan Bhullar
30047a5566 fix: remove source .ts code from published npm package (#1467)
This PR is for issue - https://github.com/lancedb/lancedb/issues/1358
2024-07-23 13:11:54 -07:00
Bert
85ccf9e22b feat!: correct timeout argument lancedb nodejs sdk (#1468)
Correct the timeout argument to `connect` in the @lancedb/lancedb node SDK.
`RemoteConnectionOptions` specified two fields, `connectionTimeout` and
`readTimeout`, probably to be consistent with the python SDK, but only
`connectionTimeout` was being used, and it was passed to axios in such a
way that it covered the entire remote request (connect + read). This
change adds a single parameter `timeout`, which makes the args to
`connect` consistent with the legacy vectordb sdk.

BREAKING CHANGE: This is a breaking change because users who previously
passed `connectionTimeout` are now expected to pass `timeout`.
2024-07-23 14:02:46 -03:00
Ayush Chaurasia
0255221086 feat: add reciprocal rank fusion reranker (#1456)
Implements https://plg.uwaterloo.ca/~gvcormac/cormacksigir09-rrf.pdf

Refactors the hybrid-search-only reranker tests to avoid repetition.
2024-07-23 21:37:17 +05:30
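
For readers unfamiliar with the technique named in the commit above, reciprocal rank fusion scores each document as the sum of `1 / (k + rank)` over every ranked list it appears in, with `k = 60` as the constant suggested in the linked paper. The following is a minimal standalone sketch of that formula, not the reranker added by this commit. The function name and the use of bare document ids are assumptions for illustration.

```python
from collections import defaultdict
from typing import Dict, Hashable, List, Sequence, Tuple


def reciprocal_rank_fusion(
    rankings: Sequence[Sequence[Hashable]], k: int = 60
) -> List[Tuple[Hashable, float]]:
    """Fuse several ranked lists of document ids into one ranking.

    Each list is ordered best-first; ranks start at 1.
    """
    scores: Dict[Hashable, float] = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Highest fused score first.
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


vector_hits = ["a", "b", "c"]  # e.g. ids from a vector search
fts_hits = ["b", "d", "a"]     # e.g. ids from a full text search
fused = reciprocal_rank_fusion([vector_hits, fts_hits])
print([doc_id for doc_id, _ in fused])  # prints ['b', 'a', 'd', 'c']
```

Because only ranks are used, the two input lists do not need comparable score scales, which is what makes the method convenient for combining vector and full text results.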
Lance Release
4ee229490c Updating package-lock.json 2024-07-23 13:49:13 +00:00
Lance Release
93e24f23af Bump version: 0.7.2-beta.0 → 0.7.2 v0.7.2 2024-07-23 13:48:58 +00:00
Lance Release
8f141e1e33 Bump version: 0.7.1 → 0.7.2-beta.0 2024-07-23 13:48:58 +00:00
Lance Release
1d5da1d069 Bump version: 0.10.2-beta.0 → 0.10.2 python-v0.10.2 2024-07-23 13:48:48 +00:00
Lance Release
0c0ec1c404 Bump version: 0.10.1 → 0.10.2-beta.0 2024-07-23 13:48:47 +00:00
Weston Pace
d4aad82aec fix: don't use v2 by default on empty table (#1469) 2024-07-23 06:47:49 -07:00
Will Jones
4f601a2d4c fix: handle camelCase column names in select (#1460)
Fixes #1385
2024-07-22 12:53:17 -07:00
Cory Grinstead
391fa26175 feat(rust): huggingface sentence-transformers (#1447)
Co-authored-by: Will Jones <willjones127@gmail.com>
2024-07-22 13:47:57 -05:00