Adds capability to the remote python SDK to retry requests (fixes#911)
This can be configured through environment:
- `LANCE_CLIENT_MAX_RETRIES`= total number of retries. Set to 0 to
disable retries. default = 3
- `LANCE_CLIENT_CONNECT_RETRIES` = number of times to retry request in
case of TCP connect failure. default = 3
- `LANCE_CLIENT_READ_RETRIES` = number of times to retry request in case
of HTTP request failure. default = 3
- `LANCE_CLIENT_RETRY_STATUSES` = http statuses for which the request
will be retried. passed as comma separated list of ints. default `500,
502, 503`
- `LANCE_CLIENT_RETRY_BACKOFF_FACTOR` = controls time between retry
requests. see
[here](23f2287eb5/src/urllib3/util/retry.py (L141-L146)).
default = 0.25
Only read requests will be retried:
- list table names
- query
- describe table
- list table indices
This does not add retry capabilities for writes as it could possibly
cause issues in the case where the retried write isn't idempotent. For
example, in the case where the LB times-out the request but the server
completes the request anyway, we might not want to blindly retry an
insert request.
- The JS/TS library actually expects named parameters via an object in
`.createTable()` rather than individual arguments
- Added example on how to search rows by criteria without a vector
search. TS type of `.search()` currently has the `query` parameter as
non-optional so we have to pass undefined for now.
based on https://github.com/lancedb/lancedb/pull/713
- The Reranker api can be plugged into vector only or fts only search
but this PR doesn't do that (see example -
https://txt.cohere.com/rerank/)
### Default reranker -- `LinearCombinationReranker(weight=0.7,
fill=1.0)`
```
table.search("hello", query_type="hybrid").rerank(normalize="score").to_pandas()
```
### Available rerankers
LinearCombinationReranker
```
from lancedb.rerankers import LinearCombinationReranker
# Same as default
table.search("hello", query_type="hybrid").rerank(
normalize="score",
reranker=LinearCombinationReranker()
).to_pandas()
# with custom params
reranker = LinearCombinationReranker(weight=0.3, fill=1.0)
table.search("hello", query_type="hybrid").rerank(
normalize="score",
reranker=reranker
).to_pandas()
```
Cohere Reranker
```
from lancedb.rerankers import CohereReranker
# default model.. English and multi-lingual supported. See docstring for available custom params
table.search("hello", query_type="hybrid").rerank(
normalize="rank", # score or rank
reranker=CohereReranker()
).to_pandas()
```
CrossEncoderReranker
```
from lancedb.rerankers import CrossEncoderReranker
table.search("hello", query_type="hybrid").rerank(
normalize="rank",
reranker=CrossEncoderReranker()
).to_pandas()
```
## Using custom Reranker
```
from lancedb.reranker import Reranker
class CustomReranker(Reranker):
def rerank_hybrid(self, vector_result, fts_result):
combined_res = self.merge_results(vector_results, fts_results) # or use custom combination logic
# Custom rerank logic here
return combined_res
```
- [x] Expand testing
- [x] Make sure usage makes sense
- [x] Run simple benchmarks for correctness (Seeing weird result from
cohere reranker in the toy example)
- Support diverse rerankers by default:
- [x] Cross encoding
- [x] Cohere
- [x] Reciprocal Rank Fusion
---------
Co-authored-by: Chang She <759245+changhiskhan@users.noreply.github.com>
Co-authored-by: Prashanth Rao <35005448+prrao87@users.noreply.github.com>
Minor fix to change the background color for an image in the docs. It's
now readable in both light and dark modes (earlier version made it
impossible to read in dark mode).
have added testing and an example in the docstring, will be pushing a
separate PR in recipe repo for rag example
---------
Co-authored-by: Ayush Chaurasia <ayush.chaurarsia@gmail.com>
* Bring the feature parity of Rust connect methods.
* A global connect method that can connect to local and remote / cloud
table, as the same as in js/python today.
* Easy to type
* Handle `String, &str, [String] and [&str]` well without manual
conversion
* Fix function name to be verb
* Improve docstring of Rust.
* Promote `query` and `search()` to public `Table` trait