docs: assorted copyedits (#1998)

This includes a handful of minor edits I made while reading the docs. In
addition to a few spelling fixes, these edits:
* standardize on "rerank" over "re-rank" in prose
* terminate sentences with periods or colons as appropriate
* replace some dashes with colons, such as in "Try it yourself - <link>"

All changes are surface-level. No changes to semantics or structure.

---------

Co-authored-by: Will Jones <willjones127@gmail.com>
Wyatt Alt
2025-01-06 15:04:48 -08:00
committed by GitHub
parent b474f98049
commit 0b45ef93c0
31 changed files with 161 additions and 164 deletions

View File

@@ -207,7 +207,7 @@
"cell_type": "markdown",
"source": [
"## The dataset\n",
"The dataset we'll use is a synthetic QA dataset generated from LLama2 review paper. The paper was divided into chunks, with each chunk being a unique context. An LLM was prompted to ask questions relevant to the context for testing a retreiver.\n",
"The dataset we'll use is a synthetic QA dataset generated from LLama2 review paper. The paper was divided into chunks, with each chunk being a unique context. An LLM was prompted to ask questions relevant to the context for testing a retriever.\n",
"The exact code and other utility functions for this can be found in [this](https://github.com/lancedb/ragged) repo\n"
],
"metadata": {

View File

@@ -477,7 +477,7 @@
"source": [
"## Vector Search\n",
"\n",
"avg latency - `3.48 ms ± 71.6 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)`"
"Average latency: `3.48 ms ± 71.6 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)`"
]
},
{
@@ -597,7 +597,7 @@
"`LinearCombinationReranker(weight=0.7)` is used as the default reranker for reranking the hybrid search results if the reranker isn't specified explicitly.\n",
"The `weight` param controls the weightage provided to vector search score. The weight of `1-weight` is applied to FTS scores when reranking.\n",
"\n",
"Latency - `71 ms ± 25.4 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)`"
"Latency: `71 ms ± 25.4 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)`"
]
},
{
@@ -675,9 +675,9 @@
},
"source": [
"### Cohere Reranker\n",
"This uses Cohere's Reranking API to re-rank the results. It accepts the reranking model name as a parameter. By Default it uses the english-v3 model but you can easily switch to a multi-lingual model.\n",
"This uses Cohere's Reranking API to re-rank the results. It accepts the reranking model name as a parameter. By default it uses the english-v3 model but you can easily switch to a multi-lingual model.\n",
"\n",
"latency - `605 ms ± 78.1 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)`"
"Latency: `605 ms ± 78.1 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)`"
]
},
{
@@ -1165,7 +1165,7 @@
},
"source": [
"### ColBERT Reranker\n",
"Colber Reranker is powered by ColBERT model. It runs locally using the huggingface implementation.\n",
"Colbert Reranker is powered by ColBERT model. It runs locally using the huggingface implementation.\n",
"\n",
"Latency - `950 ms ± 5.78 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)`\n",
"\n",
@@ -1489,9 +1489,9 @@
},
"source": [
"### Cross Encoder Reranker\n",
"Uses cross encoder models are rerankers. Uses sentence transformer implemntation locally\n",
"Uses cross encoder models are rerankers. Uses sentence transformer implementation locally\n",
"\n",
"Latency - `1.38 s ± 64.6 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)`"
"Latency: `1.38 s ± 64.6 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)`"
]
},
{
@@ -1771,10 +1771,10 @@
"source": [
"### (Experimental) OpenAI Reranker\n",
"\n",
"This prompts chat model to rerank results which is not a dedicated reranker model. This should be treated as experimental. You might run out of token limit so set the search limits based on your token limit.\n",
"NOTE: It is recommended to use `gpt-4-turbo-preview`, older models might lead to bad behaviour\n",
"This prompts a chat model to rerank results and is not a dedicated reranker model. This should be treated as experimental. You might exceed the token limit so set the search limits based on your token limit.\n",
"NOTE: It is recommended to use `gpt-4-turbo-preview` as older models might lead to bad behaviour\n",
"\n",
"Latency - `Can take 10s of seconds if using GPT-4 model`"
"Latency: `Can take 10s of seconds if using GPT-4 model`"
]
},
{
@@ -1817,7 +1817,7 @@
},
"source": [
"## Use your custom Reranker\n",
"Hybrid search in LanceDB is designed to be very flexible. You can easily plug in your own Re-reranking logic. To do so, you simply need to implement the base Reranker class"
"Hybrid search in LanceDB is designed to be very flexible. You can easily plug in your own Re-reranking logic. To do so, you simply need to implement the base Reranker class:"
]
},
{
@@ -1849,9 +1849,9 @@
"source": [
"### Custom Reranker based on CohereReranker\n",
"\n",
"For the sake of simplicity let's build custom reranker that just enchances the Cohere Reranker by accepting a filter query, and accept other CohereReranker params as kwags.\n",
"For the sake of simplicity let's build a custom reranker that enhances the Cohere Reranker by accepting a filter query, and accepts other CohereReranker params as kwargs.\n",
"\n",
"For this toy example let's say we want to get rid of docs that represent a table of contents, appendix etc. as these are semantically close of representing costs but this isn't something we are interested in because they don't represent the specific reasons why operating costs were high. They simply represent the costs."
"For this toy example let's say we want to get rid of docs that represent a table of contents or appendix, as these are semantically close to representing costs but don't represent the specific reasons why operating costs were high."
]
},
{
@@ -1969,7 +1969,7 @@
"id": "b3b5464a-7252-4eab-aaac-9b0eae37496f"
},
"source": [
"As you can see the document containing the Table of contetnts of spending no longer shows up"
"As you can see, the document containing the table of contents no longer shows up."
]
}
],

View File

@@ -49,7 +49,7 @@
},
"source": [
"## What is a retriever\n",
"VectorDBs are used as retreivers in recommender or chatbot-based systems for retrieving relevant data based on user queries. For example, retriever is a critical component of Retrieval Augmented Generation (RAG) acrhitectures. In this section, we will discuss how to improve the performance of retrievers.\n",
"VectorDBs are used as retrievers in recommender or chatbot-based systems for retrieving relevant data based on user queries. For example, retriever is a critical component of Retrieval Augmented Generation (RAG) acrhitectures. In this section, we will discuss how to improve the performance of retrievers.\n",
"\n",
"<img src=\"https://llmstack.ai/assets/images/rag-f517f1f834bdbb94a87765e0edd40ff2.png\" />\n",
"\n",
@@ -64,7 +64,7 @@
"- Fine-tuning the embedding models\n",
"- Using different embedding models\n",
"\n",
"Obviously, the above list is not exhaustive. There are other subtler ways that can improve retrieval performance like experimenting chunking algorithms, using different distance/similarity metrics etc. But for brevity, we'll only cover high level and more impactful techniques here.\n",
"Obviously, the above list is not exhaustive. There are other subtler ways that can improve retrieval performance like alternative chunking algorithms, using different distance/similarity metrics, and more. For brevity, we'll only cover high level and more impactful techniques here.\n",
"\n"
]
},
@@ -77,7 +77,7 @@
"# LanceDB\n",
"- Multimodal DB for AI\n",
"- Powered by an innovative & open-source in-house file format\n",
"- 0 Setup\n",
"- Zero setup\n",
"- Scales up on disk storage\n",
"- Native support for vector, full-text(BM25) and hybrid search\n",
"\n",
@@ -92,8 +92,8 @@
},
"source": [
"## The dataset\n",
"The dataset we'll use is a synthetic QA dataset generated from LLama2 review paper. The paper was divided into chunks, with each chunk being a unique context. An LLM was prompted to ask questions relevant to the context for testing a retreiver.\n",
"The exact code and other utility functions for this can be found in [this](https://github.com/lancedb/ragged) repo\n"
"The dataset we'll use is a synthetic QA dataset generated from LLama2 review paper. The paper was divided into chunks, with each chunk being a unique context. An LLM was prompted to ask questions relevant to the context for testing a retriever.\n",
"The exact code and other utility functions for this can be found in [this](https://github.com/lancedb/ragged) repo.\n"
]
},
{
@@ -594,10 +594,10 @@
},
"source": [
"## Ingestion\n",
"Let us now ingest the contexts in LanceDB\n",
"Let us now ingest the contexts in LanceDB. The steps will be:\n",
"\n",
"- Create a schema (Pydantic or Pyarrow)\n",
"- Select an embedding model from LanceDB Embedding API (Allows automatic vectorization of data)\n",
"- Select an embedding model from LanceDB Embedding API (to allow automatic vectorization of data)\n",
"- Ingest the contexts\n"
]
},
@@ -841,7 +841,7 @@
},
"source": [
"## Different Query types in LanceDB\n",
"LanceDB allows switching query types with by setting `query_type` argument, which defaults to `vector` when using Embedding API. In this example we'll use `JinaReranker` which is one of many rerankers supported by LanceDB\n",
"LanceDB allows switching query types with by setting `query_type` argument, which defaults to `vector` when using Embedding API. In this example we'll use `JinaReranker` which is one of many rerankers supported by LanceDB.\n",
"\n",
"### Vector search:\n",
"Vector search\n",
@@ -1446,11 +1446,11 @@
"source": [
"## Takeaways & Tradeoffs\n",
"\n",
"* **Easiest method to significantly improve accuracy** Using Hybrid search and/or rerankers can significantly improve retrieval performance without spending any additional time or effort on tuning embedding models, generators, or dissecting the dataset.\n",
"* **Rerankers significantly improve accuracy at little cost.** Using Hybrid search and/or rerankers can significantly improve retrieval performance without spending any additional time or effort on tuning embedding models, generators, or dissecting the dataset.\n",
"\n",
"* **Reranking is an expensive operation.** Depending on the type of reranker you choose, they can incur significant latecy to query times. Although some API-based rerankers can be significantly faster.\n",
"\n",
"* When using models locally, having a warmed-up GPU environment will significantly reduce latency. This is specially useful if the application doesn't need to be strcitly realtime. The tradeoff being GPU resources."
"* **Pre-warmed GPU environments reduce latency.** When using models locally, having a warmed-up GPU environment will significantly reduce latency. This is especially useful if the application doesn't need to be strictly realtime. Pre-warming comes at the expense of GPU resources."
]
},
{
@@ -1504,4 +1504,4 @@
},
"nbformat": 4,
"nbformat_minor": 0
}
}

View File

@@ -188,7 +188,7 @@
"id": "4ba9ffac-c779-49e3-91a7-f1c00f3fda41",
"metadata": {},
"source": [
"Creating a LanceDB table from a pandas dataframe is straightforward using `create_table`"
"Creating a LanceDB table from a pandas dataframe is straightforward using `create_table`:"
]
},
{
@@ -457,7 +457,7 @@
"metadata": {},
"source": [
"Ok so this is a vector database, so we need actual vectors.\n",
"We'll use sentence transformers here to avoid having to deal with api keys and all that."
"We'll use sentence transformers here to avoid having to deal with API keys."
]
},
{
@@ -465,7 +465,7 @@
"id": "85db4ed9-8f80-4b56-9867-1381fa1c4c7d",
"metadata": {},
"source": [
"Let's create a basic model using the \"all-MiniLM-L6-v2\" model and embed the quotes"
"Let's create a basic model using the \"all-MiniLM-L6-v2\" model and embed the quotes:"
]
},
{
@@ -498,7 +498,7 @@
"source": [
"We can then convert the vectors into a pyarrow Table and merge it to the LanceDB Table.\n",
"\n",
"For the merge to work successfully, we need to have an overlapping column. Here the natural choice is to use the id column"
"For the merge to work successfully, we need to have an overlapping column. Here the natural choice is to use the id column:"
]
},
{
@@ -599,7 +599,7 @@
"id": "518da48d-6481-4c1e-8ba4-800d5e0542cf",
"metadata": {},
"source": [
"And now we'll use the `LanceTable.merge` function to add the vector column into the LanceTable."
"And now we'll use the `LanceTable.merge` function to add the vector column into the LanceTable:"
]
},
{
@@ -706,7 +706,7 @@
"id": "f590fec8-0ed0-4148-b940-c81abe7b421c",
"metadata": {},
"source": [
"If we look at the schema, we see that `all-MiniLM-L6-v2` produces 384-dimensional vectors"
"If we look at the schema, we see that `all-MiniLM-L6-v2` produces 384-dimensional vectors:"
]
},
{
@@ -945,7 +945,7 @@
"source": [
"### Switching Models\n",
"\n",
"Now we'll switch to the `all-mpnet-base-v2` model and add the vectors to the restored dataset again"
"Now we'll switch to the `all-mpnet-base-v2` model and add the vectors to the restored dataset again:"
]
},
{
@@ -1018,7 +1018,7 @@
"## Deletion\n",
"\n",
"What if the whole show was just Rick-isms? \n",
"Let's delete any quote not said by Rick"
"Let's delete any quote not said by Rick:"
]
},
{
@@ -1161,7 +1161,7 @@
"id": "97a1cf79-b46b-40cd-ada0-54edef358627",
"metadata": {},
"source": [
"We never had to explicitly manage the versioning. And we never had to create expensive and slow snapshots. LanceDB automatically tracks the full history of operations I created and supports fast rollbacks. In production this is critical for debugging issues and minimizing downtime by rolling back to a previously successful state in seconds."
"We never had to explicitly manage the versioning. And we never had to create expensive and slow snapshots. LanceDB automatically tracks the full history of operations and supports fast rollbacks. In production this is critical for debugging issues and minimizing downtime by rolling back to a previously successful state in seconds."
]
}
],