From ca713521323836eb24b9c45087e7abebc565c94b Mon Sep 17 00:00:00 2001
From: Prashanth Rao <35005448+prrao87@users.noreply.github.com>
Date: Thu, 15 Feb 2024 12:06:05 -0500
Subject: [PATCH] Revert "docs: Minimal reranking evaluation benchmarks (#977)"

This reverts commit f0298d8372cf67637784e3d51c63b3f919df20f6.
---
 docs/mkdocs.yml                |  2 --
 docs/src/hybrid_search/eval.md | 49 ----------------------------------
 2 files changed, 51 deletions(-)
 delete mode 100644 docs/src/hybrid_search/eval.md

diff --git a/docs/mkdocs.yml b/docs/mkdocs.yml
index 16ae3478..5d8f7196 100644
--- a/docs/mkdocs.yml
+++ b/docs/mkdocs.yml
@@ -92,7 +92,6 @@ nav:
       - Full-text search: fts.md
       - Hybrid search:
         - Overview: hybrid_search/hybrid_search.md
-        - Comparing Rerankers: hybrid_search/eval.md
         - Airbnb financial data example: notebooks/hybrid_search.ipynb
       - Filtering: sql.md
       - Versioning & Reproducibility: notebooks/reproducibility.ipynb
@@ -157,7 +156,6 @@ nav:
     - Full-text search: fts.md
     - Hybrid search:
       - Overview: hybrid_search/hybrid_search.md
-      - Comparing Rerankers: hybrid_search/eval.md
       - Airbnb financial data example: notebooks/hybrid_search.ipynb
     - Filtering: sql.md
     - Versioning & Reproducibility: notebooks/reproducibility.ipynb
diff --git a/docs/src/hybrid_search/eval.md b/docs/src/hybrid_search/eval.md
deleted file mode 100644
index 496e62c7..00000000
--- a/docs/src/hybrid_search/eval.md
+++ /dev/null
@@ -1,49 +0,0 @@
-# Hybrid Search
-
-Hybrid Search is a broad (often misused) term. It can mean anything from combining multiple methods for searching, to applying ranking methods to better sort the results. In this blog, we use the definition of "hybrid search" to mean using a combination of keyword-based and vector search.
-
-## The challenge of (re)ranking search results
-Once you have a group of the most relevant search results from multiple search sources, you'd likely standardize the score and rank them accordingly. This process can also be seen as another independent step - reranking.
-There are two approaches for reranking search results from multiple sources.
-* Score-based: Calculate final relevance scores based on a weighted linear combination of individual search algorithm scores. Example - Weighted linear combination of semantic search & keyword-based search results.
-* Relevance-based: Discards the existing scores and calculates the relevance of each search result - query pair. Example - Cross Encoder models
-
-Even though there are many strategies for reranking search results, none works for all cases. Moreover, evaluating them itself is a challenge. Also, reranking can be dataset, application specific so it's hard to generalize.
-
-### Example evaluation of hybrid search with Reranking
-
-Here's some evaluation numbers from experiment comparing these re-rankers on about 800 queries. It is modified version of an evaluation script from [llama-index](https://github.com/run-llama/finetune-embedding/blob/main/evaluate.ipynb) that measures hit-rate at top-k.
-
- With OpenAI ada2 embedding
-
-Vector Search baseline - `0.64`
-
-| Reranker | Top-3 | Top-5 | Top-10 |
-| --- | --- | --- | --- |
-| Linear Combination | `0.73` | `0.74` | `0.85` |
-| Cross Encoder | `0.71` | `0.70` | `0.77` |
-| Cohere | `0.81` | `0.81` | `0.85` |
-| ColBERT | `0.68` | `0.68` | `0.73` |
-
-
-
-
-
-