This includes a handful of minor edits I made while reading the docs. In
addition to a few spelling fixes, this PR makes the following changes:
* standardize on "rerank" over "re-rank" in prose
* terminate sentences with periods or colons as appropriate
* replace some usage of dashes with colons, such as in
  "Try it yourself - <link>"
All changes are surface-level. No changes to semantics or structure.
---------
Co-authored-by: Will Jones <willjones127@gmail.com>
The code to support VoyageAI embedding and rerank models was added in
the https://github.com/lancedb/lancedb/pull/1799 PR.
Some of the documentation changes were also made there; this PR adds the
VoyageAI embedding doc link to the index page.
These are my first PRs in lancedb, and while I checked the
documentation/code structure, I might have missed something important. Please
let me know if any changes are required!
- Enforce that all rerankers always return _relevance_score. This was already
loosely done in tests before, but based on user feedback it's better to
always have _relevance_score present in all reranked results.
- Deprecate LinearCombinationReranker in docs, and fix a case where
it would not return _relevance_score if one result set was missing.
- Both LinearCombination (the current default) and RRF are pretty fast
compared to model-based rerankers. RRF is slightly faster.
- In our tests RRF has also been slightly more accurate.
This PR:
- Makes RRF the default reranker
- Removes duplicate docs for rerankers
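
For reviewers less familiar with RRF (Reciprocal Rank Fusion), here is a minimal sketch of the idea in plain Python. It is not the LanceDB implementation: the function name `rrf_fuse`, the `id` field, and the constant `k=60` (a commonly used value) are assumptions made for illustration. Each row's score is the sum of `1 / (k + rank)` over every result list it appears in, and every fused row carries a `_relevance_score`, even when one of the result sets is empty.

```python
from collections import defaultdict


def rrf_fuse(result_lists, k=60):
    """Fuse ranked lists of row ids with Reciprocal Rank Fusion.

    Hypothetical helper for illustration only; `k=60` is an assumed default.
    """
    scores = defaultdict(float)
    for ranked_ids in result_lists:
        # Ranks start at 1, so the top hit of each list contributes 1 / (k + 1).
        for rank, row_id in enumerate(ranked_ids, start=1):
            scores[row_id] += 1.0 / (k + rank)
    # Highest fused score first.
    fused = sorted(scores.items(), key=lambda item: item[1], reverse=True)
    return [{"id": row_id, "_relevance_score": score} for row_id, score in fused]


vector_hits = ["a", "b", "c"]  # ids ordered by vector search
fts_hits = ["b", "d"]          # ids ordered by full-text search
fused = rrf_fuse([vector_hits, fts_hits])
```

Because RRF only looks at ranks, it needs no score normalization and no model inference, which is consistent with it being cheap compared to model-based rerankers.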
This PR:
- Adds missing license headers
- Integrates with the answerdotai Rerankers package
- Updates ColbertReranker to subclass the answerdotai-based reranker. This is
done to keep backwards compatibility, as some users might be used to importing
ColbertReranker directly
- Sets `trust_remote_code` to `True` by default in CrossEncoder and
sentence-transformer-based rerankers
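
The backwards-compatibility point can be sketched roughly as follows. This is an illustration of the pattern only, not the code in this PR: `GenericReranker` is a hypothetical stand-in for the wrapper around the answerdotai package, and the constructor parameters and default model name are assumptions.

```python
class GenericReranker:
    """Hypothetical stand-in for a wrapper around a multi-model reranking package."""

    def __init__(self, model_type, model_name, column="text"):
        self.model_type = model_type
        self.model_name = model_name
        self.column = column


class ColbertReranker(GenericReranker):
    """Thin subclass kept so existing `ColbertReranker(...)` call sites keep working."""

    def __init__(self, model_name="colbert-ir/colbertv2.0", column="text"):
        # Pin the model type; everything else is delegated to the generic wrapper.
        super().__init__(model_type="colbert", model_name=model_name, column=column)


reranker = ColbertReranker()
```

Users who import `ColbertReranker` directly keep the same entry point, while the shared logic lives in one place.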