docs: update integration docs & fixed links (#1423)

1. Updated langchain docs. 2. Minor update to llamaindex doc. 3. Added notebook examples and linked them correctly
2026-01-05 19:32:56 +00:00 · 2024-07-03 21:50:33 +05:30
parent b8ccea9f71
commit a5ff623443
5 changed files with 1225 additions and 8 deletions
--- a/docs/src/integrations/langchain.md
+++ b/docs/src/integrations/langchain.md
@@ -2,7 +2,7 @@
 ![Illustration](../assets/langchain.png)

 ## Quick Start
-You can load your document data using langchain's loaders, for this example we are using `TextLoader` and `OpenAIEmbeddings` as the embedding model.
+You can load your document data using LangChain's loaders. This example uses `TextLoader`, with `OpenAIEmbeddings` as the embedding model. Check out the complete example here: [LangChain demo](../notebooks/langchain_example.ipynb)
 ```python
 import os
 from langchain.document_loaders import TextLoader
@@ -38,6 +38,8 @@ The exhaustive list of parameters for `LanceDB` vector store are :
 - `api_key`: (Optional) API key to use for LanceDB cloud database. Defaults to `None`.  
 - `region`: (Optional) Region to use for LanceDB cloud database. Only for LanceDB Cloud, defaults to `None`.  
 - `mode`: (Optional) Mode to use for adding data to the table. Defaults to `'overwrite'`.  
+- `reranker`: (Optional) The reranker to use for LanceDB.
+- `relevance_score_fn`: (Optional[Callable[[float], float]]) Langchain relevance score function to be used. Defaults to `None`. 

 ```python
 db_url = "db://lang_test" # url of db you created
@@ -54,12 +56,14 @@ vector_store = LanceDB(
 ```

 ### Methods 
-To add texts and store respective embeddings automatically:   
+
 ##### add_texts()
 - `texts`: `Iterable` of strings to add to the vectorstore.
 - `metadatas`: Optional `list[dict()]` of metadatas associated with the texts.
 - `ids`: Optional `list` of ids to associate with the texts. 
+- `kwargs`: `Any`

+This method adds texts and stores their respective embeddings automatically.

 ```python
 vector_store.add_texts(texts = ['test_123'], metadatas =[{'source' :'wiki'}]) 
@@ -74,7 +78,6 @@ pd_df.to_csv("docsearch.csv", index=False)
 # you can also create a new vector store object using an older connection object:
 vector_store = LanceDB(connection=tbl, embedding=embeddings)
 ```
-For index creation make sure your table has enough data in it. An ANN index is ususally not needed for datasets ~100K vectors. For large-scale (>1M) or higher dimension vectors, it is beneficial to create an ANN index.
 ##### create_index() 
 - `col_name`: `Optional[str] = None`
 - `vector_col`: `Optional[str] = None`
@@ -82,6 +85,8 @@ For index creation make sure your table has enough data in it. An ANN index is u
 - `num_sub_vectors`: `Optional[int] = 96`
 - `index_cache_size`: `Optional[int] = None`

+This method creates an index for the vector store. Before creating an index, make sure your table has enough data in it. An ANN index is usually not needed for datasets of ~100K vectors. For large-scale (>1M) or higher-dimensional vectors, it is beneficial to create an ANN index.
+
 ```python
 # for creating vector index
 vector_store.create_index(vector_col='vector', metric = 'cosine')
@@ -89,4 +94,108 @@ vector_store.create_index(vector_col='vector', metric = 'cosine')
 # for creating scalar index(for non-vector columns)
 vector_store.create_index(col_name='text')

-```
+```
+
+##### similarity_search()
+- `query`: `str`
+- `k`: `Optional[int] = None`
+- `filter`: `Optional[Dict[str, str]] = None`
+- `fts`: `Optional[bool] = False`
+- `name`: `Optional[str] = None`
+- `kwargs`: `Any`
+
+Returns documents most similar to the query, without relevance scores.
+
+```python
+docs = docsearch.similarity_search(query)
+print(docs[0].page_content)
+```
+
+##### similarity_search_by_vector()
+- `embedding`: `List[float]`
+- `k`: `Optional[int] = None`
+- `filter`: `Optional[Dict[str, str]] = None`
+- `name`: `Optional[str] = None`
+- `kwargs`: `Any`
+
+Returns documents most similar to the query vector.
+
+```python
+docs = docsearch.similarity_search_by_vector(embeddings.embed_query(query))
+print(docs[0].page_content)
+```
+
+##### similarity_search_with_score()
+- `query`: `str`
+- `k`: `Optional[int] = None`
+- `filter`: `Optional[Dict[str, str]] = None`
+- `kwargs`: `Any`
+
+Returns documents most similar to the query string, along with relevance scores. It is called by the base class's `similarity_search_with_relevance_scores`, which selects the relevance score function through our `_select_relevance_score_fn`.
+
+```python
+docs = docsearch.similarity_search_with_relevance_scores(query)
+print("relevance score - ", docs[0][1])
+print("text- ", docs[0][0].page_content[:1000])
+```
+
+##### similarity_search_by_vector_with_relevance_scores()
+- `embedding`: `List[float]`
+- `k`: `Optional[int] = None`
+- `filter`: `Optional[Dict[str, str]] = None`
+- `name`: `Optional[str] = None`
+- `kwargs`: `Any`
+
+Returns documents most similar to the query vector, along with relevance scores.
+Relevance score 
+
+```python
+docs = docsearch.similarity_search_by_vector_with_relevance_scores(query_embedding)
+print("relevance score - ", docs[0][1])
+print("text- ", docs[0][0].page_content[:1000])
+```
+
+##### max_marginal_relevance_search()
+- `query`: `str`
+- `k`: `Optional[int] = None`
+- `fetch_k`: Number of documents to fetch and pass to the MMR algorithm, `Optional[int] = None`
+- `lambda_mult`: Number between 0 and 1 that determines the degree
+                        of diversity among the results with 0 corresponding
+                        to maximum diversity and 1 to minimum diversity.
+                        Defaults to 0.5. `float = 0.5`
+- `filter`: `Optional[Dict[str, str]] = None`
+- `kwargs`: `Any`
+
+Returns docs selected using maximal marginal relevance (MMR).
+MMR optimizes for both similarity to the query and diversity among the selected documents.
+
+Similarly, the `max_marginal_relevance_search_by_vector()` function uses MMR to return docs most similar to the embedding passed to it. Instead of a string query, you pass the embedding to search for.
+
+```python
+result = docsearch.max_marginal_relevance_search(
+        query="text"
+    )
+result_texts = [doc.page_content for doc in result]
+print(result_texts)
+
+# search by vector:
+result = docsearch.max_marginal_relevance_search_by_vector(
+        embeddings.embed_query("text")
+    )
+result_texts = [doc.page_content for doc in result]
+print(result_texts)
+```
+
+##### add_images()
+- `uris` : File paths to the images. `List[str]`.
+- `metadatas` : Optional list of metadatas. `(Optional[List[dict]], optional)`
+- `ids` : Optional list of IDs. `(Optional[List[str]], optional)`
+
+Adds images to the vectorstore, automatically creating their embeddings.
+
+```python
+vec_store.add_images(uris=image_uris) 
+# here image_uris are local fs paths to the images.
+```
+
+
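Note on the `lambda_mult` parameter documented in the hunk above: the trade-off between relevance and diversity may be easier to follow with a small standalone sketch of greedy MMR selection. This is an illustration only, not the code used by LangChain or LanceDB. The names `cosine` and `mmr_select` and the toy vectors are hypothetical, and cosine similarity is assumed as the similarity measure.

```python
import math
from typing import List, Sequence


def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def mmr_select(
    query_vec: Sequence[float],
    doc_vecs: Sequence[Sequence[float]],
    k: int = 2,
    lambda_mult: float = 0.5,
) -> List[int]:
    """Return indices of up to k documents chosen greedily by MMR (hypothetical helper)."""
    selected: List[int] = []
    candidates = list(range(len(doc_vecs)))

    def mmr_score(i: int) -> float:
        # Relevance: similarity of the candidate to the query.
        relevance = cosine(query_vec, doc_vecs[i])
        # Redundancy: highest similarity to any document already selected.
        redundancy = max(
            (cosine(doc_vecs[i], doc_vecs[j]) for j in selected), default=0.0
        )
        # lambda_mult = 1 ignores redundancy; lower values penalize it more.
        return lambda_mult * relevance - (1.0 - lambda_mult) * redundancy

    while candidates and len(selected) < k:
        best = max(candidates, key=mmr_score)
        selected.append(best)
        candidates.remove(best)
    return selected


# Toy data (assumed for illustration): docs 0 and 1 are near-duplicates.
query = [1.0, 1.0]
docs = [[1.0, 0.9], [1.0, 0.8], [0.0, 1.0]]

print(mmr_select(query, docs, k=2, lambda_mult=1.0))  # [0, 1]
print(mmr_select(query, docs, k=2, lambda_mult=0.5))  # [0, 2]
```

With `lambda_mult=1.0` the two near-duplicate documents are both returned, while `lambda_mult=0.5` swaps the second one for a less similar but more diverse document.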
--- a/docs/src/integrations/llamaIndex.md
+++ b/docs/src/integrations/llamaIndex.md
@@ -2,7 +2,8 @@
 ![Illustration](../assets/llama-index.jpg)

 ## Quick start
-You would need to install the integration via `pip install llama-index-vector-stores-lancedb` in order to use it. You can run the below script to try it out :
+You need to install the integration via `pip install llama-index-vector-stores-lancedb` to use it. 
+You can run the script below to try it out:
 ```python
 import logging
 import sys
@@ -43,6 +44,8 @@ retriever = index.as_retriever(vector_store_kwargs={"where": lance_filter})
 response = retriever.retrieve("What did the author do growing up?")
 ```

+Check out the complete example here: [LlamaIndex demo](../notebooks/LlamaIndex_example.ipynb)
+
 ### Filtering
 For metadata filtering, you can use a Lance SQL-like string filter as demonstrated in the example above. Additionally, you can also filter using the `MetadataFilters` class from LlamaIndex:
 ```python