mirror of
https://github.com/lancedb/lancedb.git
synced 2026-07-03 11:00:40 +00:00
## What
`MRRReranker.rerank_multivector` averages each document's reciprocal
ranks over the wrong denominator. It divides by the number of rankings
the document *happens to appear in*, instead of the total number of
rankings being fused.
```python
# python/python/lancedb/rerankers/mrr.py
for result_id, reciprocal_ranks in mrr_score_map.items():
mean_rr = np.mean(reciprocal_ranks) # divides by len(present systems)
```
`mrr_score_map[doc]` only accumulates a reciprocal rank for the systems
in which the document was returned, so `np.mean` never accounts for the
systems that missed it.
## Why it's wrong
Mean Reciprocal Rank fusion treats a system that didn't return a
document as a reciprocal rank of `0` and averages across **all**
systems. That's the exact mechanism by which it rewards cross-system
consensus. Dividing by the appearance count removes that, so a document
liked by a single ranking can beat one ranked highly by every ranking.
Concretely, fusing 3 vector rankings:
| Doc | Ranks | Current score | Correct score |
|-----|-------|---------------|---------------|
| A | #1 in 1 system only | `mean([1.0]) = 1.000` | `1.0 / 3 = 0.333` |
| B | #1, #1, #2 across all 3 | `mean([1, 1, .5]) = 0.833` | `2.5 / 3 =
0.833` |
The current code ranks **A above B** - a document two of three rankings
ignored outranks one all three ranked at or near the top.
This also makes `rerank_multivector` inconsistent with `rerank_hybrid`
in the same file, which already treats a missing system as `0`
(`vector_rr = 0.0` / `fts_rr = 0.0`), and with the class docstring
("average of reciprocal ranks across different search results").
## Fix
Divide the summed reciprocal ranks by the total number of rankings:
```python
num_systems = len(vector_results)
...
mean_rr = float(np.sum(reciprocal_ranks)) / num_systems
```
## Tests
Adds `test_mrr_multivector_rewards_consensus`, which asserts the exact
MRR scores and that the consensus document ranks first. It fails on
`main` and passes with this change. Existing reranker tests are
unaffected.