Commit Graph

764 Commits

Author SHA1 Message Date
Lance Release
a9088224c5 [python] Bump version: 0.5.2 → 0.5.3 python-v0.5.3 2024-02-03 03:04:04 +00:00
Ayush Chaurasia
688c57a0d8 fix: revert safe_import_pandas usage (#921) 2024-02-02 18:57:13 -08:00
Lance Release
12a98deded Updating package-lock.json 2024-02-02 22:37:23 +00:00
Lance Release
e4bb042918 Updating package-lock.json 2024-02-02 21:57:07 +00:00
Lance Release
04e1662681 Bump version: 0.4.7 → 0.4.8 v0.4.8 2024-02-02 21:56:57 +00:00
Lance Release
ce2242e06d [python] Bump version: 0.5.1 → 0.5.2 python-v0.5.2 2024-02-02 21:33:02 +00:00
Weston Pace
778339388a chore: bump pylance version to latest in pyproject.toml (#918) 2024-02-02 13:32:12 -08:00
Weston Pace
7f8637a0b4 feat: add merge_insert to the node and rust APIs (#915) 2024-02-02 13:16:51 -08:00
QianZhu
09cd08222d make it explicit about the vector column data type (#916)

---------

Co-authored-by: Weston Pace <weston.pace@gmail.com>
2024-02-02 09:02:02 -08:00
Bert
a248d7feec fix: add request retry to python client (#917)
Adds capability to the remote python SDK to retry requests (fixes #911)

This can be configured through environment variables:
- `LANCE_CLIENT_MAX_RETRIES` = total number of retries. Set to 0 to
disable retries. Default = 3
- `LANCE_CLIENT_CONNECT_RETRIES` = number of times to retry a request in
case of TCP connect failure. Default = 3
- `LANCE_CLIENT_READ_RETRIES` = number of times to retry a request in case
of HTTP request failure. Default = 3
- `LANCE_CLIENT_RETRY_STATUSES` = HTTP statuses for which the request
will be retried, passed as a comma-separated list of ints. Default =
`500, 502, 503`
- `LANCE_CLIENT_RETRY_BACKOFF_FACTOR` = controls the time between retry
requests. See
[here](23f2287eb5/src/urllib3/util/retry.py (L141-L146)).
Default = 0.25
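
A minimal sketch of how these variables could be read, using only the standard library. The `RetryConfig` class and `load_retry_config` function are hypothetical names for illustration and are not the SDK's actual code; only the variable names and defaults come from the list above.

```python
import os
from dataclasses import dataclass


@dataclass(frozen=True)
class RetryConfig:
    # Hypothetical container; defaults mirror the values listed above.
    max_retries: int = 3
    connect_retries: int = 3
    read_retries: int = 3
    retry_statuses: tuple = (500, 502, 503)
    backoff_factor: float = 0.25


def load_retry_config(env=None):
    """Build a RetryConfig from a mapping (defaults to os.environ)."""
    env = os.environ if env is None else env
    statuses = env.get("LANCE_CLIENT_RETRY_STATUSES", "500, 502, 503")
    return RetryConfig(
        max_retries=int(env.get("LANCE_CLIENT_MAX_RETRIES", "3")),
        connect_retries=int(env.get("LANCE_CLIENT_CONNECT_RETRIES", "3")),
        read_retries=int(env.get("LANCE_CLIENT_READ_RETRIES", "3")),
        # int() tolerates surrounding whitespace, so "500, 502" parses fine.
        retry_statuses=tuple(int(s) for s in statuses.split(",") if s.strip()),
        backoff_factor=float(env.get("LANCE_CLIENT_RETRY_BACKOFF_FACTOR", "0.25")),
    )
```

Passing a plain dict instead of `os.environ` keeps the parsing easy to test, e.g. `load_retry_config({"LANCE_CLIENT_MAX_RETRIES": "0"})` to disable retries.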

Only read requests will be retried:
- list table names
- query
- describe table
- list table indices

This does not add retry capabilities for writes, as a retried write
could cause issues if it isn't idempotent. For example, if the LB
times out the request but the server completes it anyway, we might
not want to blindly retry an insert request.
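
The read-only policy above can be sketched as follows. This is an illustration under stated assumptions, not the client's implementation: `call_with_retry`, `TransientError`, and the operation-name strings are hypothetical, and backoff sleeping is omitted for brevity.

```python
# Hypothetical names for the read operations listed above.
READ_OPERATIONS = frozenset(
    {"list_table_names", "query", "describe_table", "list_table_indices"}
)


class TransientError(Exception):
    """Stand-in for a retryable failure such as a timeout or a 503."""


def call_with_retry(operation, fn, max_retries=3):
    """Call fn(), retrying on TransientError only for read operations."""
    # Writes get exactly one attempt because they may not be idempotent.
    attempts = 1 + (max_retries if operation in READ_OPERATIONS else 0)
    for attempt in range(attempts):
        try:
            return fn()
        except TransientError:
            if attempt == attempts - 1:
                raise  # out of attempts: surface the error to the caller
```

With this shape, a flaky `query` is retried up to `max_retries` times, while a failed `insert` raises immediately.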
2024-02-02 11:27:29 -05:00
Weston Pace
cc9473a94a docs: add cleanup_old_versions and compact_files to Table for documentation purposes (#900)
Closes #819
2024-02-01 15:06:00 -08:00
Weston Pace
d77e95a4f4 feat: upgrade to lance 0.9.11 and expose merge_insert (#906)
This adds the Python bindings requested in #870. The JavaScript/Rust
bindings will be added in a future PR.
2024-02-01 11:36:29 -08:00
Lei Xu
62f053ac92 ci: bump to new version of python action to use node 20 GitHub action runtime (#909)
GitHub Actions is deprecating the old node-16 runtime.
2024-02-01 11:36:03 -08:00
JacobLinCool
34e10caad2 fix the repo link on npm, add links for homepage and bug report (#910)
- fix the repo link on npm
- add links for homepage and bug report
2024-01-31 21:07:11 -08:00
QianZhu
f5726e2d0c arrow table/f16 example (#907) 2024-01-31 14:41:28 -08:00
Lance Release
12b4fb42fc Updating package-lock.json 2024-01-31 21:18:24 +00:00
Lance Release
1328cd46f1 Updating package-lock.json 2024-01-31 20:29:38 +00:00
Lance Release
0c940ed9f8 Bump version: 0.4.6 → 0.4.7 v0.4.7 2024-01-31 20:29:28 +00:00
Lei Xu
5f59e51583 fix(node): pass AWS credentials to db level operations (#908)
Passed the following tests

```ts
const keyId = process.env.AWS_ACCESS_KEY_ID;
const secretKey = process.env.AWS_SECRET_ACCESS_KEY;
const sessionToken = process.env.AWS_SESSION_TOKEN;
const region = process.env.AWS_REGION;

const db = await lancedb.connect({
  uri: "s3://bucket/path",
  awsCredentials: {
    accessKeyId: keyId,
    secretKey: secretKey,
    sessionToken: sessionToken,
  },
  awsRegion: region,
} as lancedb.ConnectionOptions);

console.log(await db.createTable("test", [{ vector: [1, 2, 3] }]));
console.log(await db.tableNames());
console.log(await db.dropTable("test"));
```
2024-01-31 12:05:01 -08:00
Will Jones
8d0ea29f89 docs: provide AWS S3 cleanup and permissions advice (#903)
Adding some more quick advice for how to set up AWS S3 with LanceDB.

---------

Co-authored-by: Prashanth Rao <35005448+prrao87@users.noreply.github.com>
2024-01-31 09:24:54 -08:00
Abraham Lopez
b9468bb980 chore: update JS/TS example in README (#898)
- The JS/TS library actually expects named parameters via an object in
`.createTable()` rather than individual arguments
- Added example on how to search rows by criteria without a vector
search. TS type of `.search()` currently has the `query` parameter as
non-optional so we have to pass undefined for now.
2024-01-30 11:09:45 -08:00
Lei Xu
a42df158a3 ci: change apple silicon runner to free OSS macos-14 target (#901) 2024-01-30 11:05:42 -08:00
Raghav Dixit
9df6905d86 chore(python): GTE embedding function model name update (#902)
Co-authored-by: Ayush Chaurasia <ayush.chaurarsia@gmail.com>
2024-01-30 23:56:29 +05:30
Ayush Chaurasia
3ffed89793 feat(python): Hybrid search & Reranker API (#824)
Based on https://github.com/lancedb/lancedb/pull/713
- The Reranker API can be plugged into vector-only or FTS-only search,
but this PR doesn't do that (see example -
https://txt.cohere.com/rerank/)


### Default reranker -- `LinearCombinationReranker(weight=0.7, fill=1.0)`

```
table.search("hello", query_type="hybrid").rerank(normalize="score").to_pandas()
```
### Available rerankers
LinearCombinationReranker
```
from lancedb.rerankers import LinearCombinationReranker

# Same as default 
table.search("hello", query_type="hybrid").rerank(
                                      normalize="score", 
                                      reranker=LinearCombinationReranker()
                                     ).to_pandas()

# with custom params
reranker = LinearCombinationReranker(weight=0.3, fill=1.0)
table.search("hello", query_type="hybrid").rerank(
                                      normalize="score", 
                                      reranker=reranker
                                     ).to_pandas()
```
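
To give a rough idea of what a weighted linear combination does, here is a standalone sketch. It is not the library's implementation: `linear_combination` is a hypothetical helper, and it assumes both score maps are already normalized so that higher is better, which may not match the conventions `LinearCombinationReranker` actually uses for `weight` and `fill`.

```python
def linear_combination(vector_scores, fts_scores, weight=0.7, fill=1.0):
    """Blend two {row_id: score} maps into one ranked list.

    Assumption: `weight` applies to the vector score, `1 - weight` to the
    full-text score, and `fill` replaces a score missing from either side.
    """
    combined = {}
    for row_id in set(vector_scores) | set(fts_scores):
        v = vector_scores.get(row_id, fill)
        f = fts_scores.get(row_id, fill)
        combined[row_id] = weight * v + (1 - weight) * f
    # Highest combined score first.
    return sorted(combined.items(), key=lambda item: item[1], reverse=True)
```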

Cohere Reranker
```
from lancedb.rerankers import CohereReranker

# default model.. English and multi-lingual supported. See docstring for available custom params
table.search("hello", query_type="hybrid").rerank(
                                      normalize="rank",  # score or rank
                                      reranker=CohereReranker()
                                     ).to_pandas()

```

CrossEncoderReranker

```
from lancedb.rerankers import CrossEncoderReranker

table.search("hello", query_type="hybrid").rerank(
                                      normalize="rank", 
                                      reranker=CrossEncoderReranker()
                                     ).to_pandas()

```

### Using a custom reranker
```
from lancedb.rerankers import Reranker

class CustomReranker(Reranker):
    def rerank_hybrid(self, vector_results, fts_results):
        # Merge both result sets, or use custom combination logic
        combined_res = self.merge_results(vector_results, fts_results)
        # Custom rerank logic here

        return combined_res
```

- [x] Expand testing
- [x] Make sure usage makes sense
- [x] Run simple benchmarks for correctness (Seeing weird result from
cohere reranker in the toy example)
- Support diverse rerankers by default:
  - [x] Cross encoding
  - [x] Cohere
  - [x] Reciprocal Rank Fusion
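
For reference, Reciprocal Rank Fusion can be sketched in a few lines. This shows the general technique, not the code added in this PR; `reciprocal_rank_fusion` is a hypothetical helper and `k=60` is the commonly used constant, assumed here.

```python
def reciprocal_rank_fusion(rankings, k=60):
    """Fuse several ranked lists of row ids into a single ranking.

    Each row scores sum(1 / (k + rank)) over the lists it appears in,
    with rank starting at 1; rows absent from a list contribute nothing.
    """
    scores = {}
    for ranking in rankings:
        for rank, row_id in enumerate(ranking, start=1):
            scores[row_id] = scores.get(row_id, 0.0) + 1.0 / (k + rank)
    # Highest fused score first.
    return sorted(scores, key=lambda row_id: scores[row_id], reverse=True)
```

Because only ranks are used, the vector and full-text scores do not need to be on a comparable scale.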

---------

Co-authored-by: Chang She <759245+changhiskhan@users.noreply.github.com>
Co-authored-by: Prashanth Rao <35005448+prrao87@users.noreply.github.com>
2024-01-30 19:10:33 +05:30
Prashanth Rao
f150768739 Fix image bgcolor (#891)
Minor fix to change the background color for an image in the docs. It's
now readable in both light and dark modes (earlier version made it
impossible to read in dark mode).
2024-01-30 16:50:29 +05:30
Ayush Chaurasia
b432ecf2f6 doc: Add documentation chatbot for LanceDB (#890)
2024-01-30 11:24:57 +05:30
Raghav Dixit
d1a7257810 feat(python): Embedding fn support for gte-mlx/gte-large (#873)
Added tests and an example in the docstring. A separate PR in the
recipes repo will add a RAG example.

---------

Co-authored-by: Ayush Chaurasia <ayush.chaurarsia@gmail.com>
2024-01-30 11:21:57 +05:30
Ayush Chaurasia
5c5e23bbb9 chore(python): Temporarily extend remote connection timeout (#888)
Context - https://etoai.slack.com/archives/C05NC5YSW5V/p1706371205883149
2024-01-29 17:34:33 +05:30
Lei Xu
e5796a4836 doc: fix js example of create index (#886) 2024-01-28 17:02:36 -08:00
Lei Xu
b9c5323265 doc: use snippet for rust code example and make sure rust examples run through CI (#885) 2024-01-28 14:30:30 -08:00
Lei Xu
e41a52863a fix: fix doc build to include the source snippet correctly (#883) 2024-01-28 11:55:58 -08:00
Chang She
13acc8a480 doc(rust): minor fixes for Rust quick start. (#878) 2024-01-28 11:40:52 -08:00
Lei Xu
22b9eceb12 chore: convert all js doc test to use snippet. (#881) 2024-01-28 11:39:25 -08:00
Lei Xu
5f62302614 doc: use code snippet for typescript examples (#880)
The TypeScript code lives in a fully functional file that is run via CI.
2024-01-27 22:52:37 -08:00
Ayush Chaurasia
d84e0d1db8 feat(python): AWS Bedrock embeddings integration (#822)
Supports Amazon Titan, Cohere English, and Cohere multilingual base
models.
2024-01-28 02:04:15 +05:30
Lei Xu
ac94b2a420 chore: upgrade lance, pylance and datafusion (#879) 2024-01-27 12:31:38 -08:00
Lei Xu
b49bc113c4 chore: add one rust SDK e2e example (#876)
Co-authored-by: Chang She <759245+changhiskhan@users.noreply.github.com>
2024-01-26 22:41:20 -08:00
Lei Xu
77b5b1cf0e doc: update quick start for full rust example (#872) 2024-01-26 16:19:43 -08:00
Lei Xu
e910809de0 chore: bump github actions to v4 due to GHA warnings of node version deprecation (#874) 2024-01-26 15:52:47 -08:00
Lance Release
90b5b55126 Updating package-lock.json 2024-01-26 23:35:58 +00:00
Lance Release
488e4f8452 Updating package-lock.json 2024-01-26 22:40:46 +00:00
Lance Release
ba6f949515 Bump version: 0.4.5 → 0.4.6 v0.4.6 2024-01-26 22:40:36 +00:00
Lei Xu
3dd8522bc9 feat(rust): provide connect and connect_with_options in Rust SDK (#871)
* Bring the Rust connect methods to feature parity with the other SDKs.
* Add a global connect method that can connect to local and remote / cloud
tables, the same as in JS/Python today.
2024-01-26 11:40:11 -08:00
Lei Xu
e01ef63488 chore(rust): simplified version of optimize (#869)
Consolidate the various optimize() methods into one, similar to Postgres
VACUUM, as part of preparing the Rust API for public use.
2024-01-26 11:36:04 -08:00
Lei Xu
a6cf24b359 feat(napi): Issue queries as node SDK (#868)
* Query as a fluent API and `AsyncIterator<RecordBatch>`
* Much more docs
* Add tests for auto infer vector search columns with different
dimensions.
2024-01-25 22:14:14 -08:00
Lance Release
9a07c9aad8 Updating package-lock.json 2024-01-25 21:49:36 +00:00
Lance Release
d405798952 Updating package-lock.json 2024-01-25 20:54:55 +00:00
Lance Release
e8a8b92b2a Bump version: 0.4.4 → 0.4.5 v0.4.5 2024-01-25 20:54:44 +00:00
Lei Xu
66362c6506 fix: release build for node sdk (#861) 2024-01-25 12:51:32 -08:00
Lance Release
5228ca4b6b Updating package-lock.json 2024-01-25 19:53:05 +00:00