mirror of
https://github.com/lancedb/lancedb.git
synced 2026-01-02 18:02:58 +00:00
BREAKING CHANGE: embedding function implementations in Node need to now call `resolveVariables()` in their constructors and should **not** implement `toJSON()`. This tries to address the handling of secrets. In Node, they are currently lost. In Python, they are currently leaked into the table schema metadata. This PR introduces an in-memory variable store on the function registry. It also allows embedding function definitions to label certain config values as "sensitive", and the preprocessing logic will raise an error if users try to pass in hard-coded values. Closes #2110 Closes #521 --------- Co-authored-by: Weston Pace <weston.pace@gmail.com>
392 lines
18 KiB
YAML
392 lines
18 KiB
YAML
site_name: LanceDB
|
|
site_url: https://lancedb.github.io/lancedb/
|
|
repo_url: https://github.com/lancedb/lancedb
|
|
edit_uri: https://github.com/lancedb/lancedb/tree/main/docs/src
|
|
repo_name: lancedb/lancedb
|
|
docs_dir: src
|
|
watch:
|
|
- src
|
|
- ../python/python
|
|
|
|
theme:
|
|
name: "material"
|
|
logo: assets/logo.png
|
|
favicon: assets/logo.png
|
|
palette:
|
|
# Palette toggle for light mode
|
|
- scheme: lancedb
|
|
primary: custom
|
|
toggle:
|
|
icon: material/weather-night
|
|
name: Switch to dark mode
|
|
# Palette toggle for dark mode
|
|
- scheme: slate
|
|
primary: custom
|
|
toggle:
|
|
icon: material/weather-sunny
|
|
name: Switch to light mode
|
|
features:
|
|
- content.code.copy
|
|
- content.tabs.link
|
|
- content.action.edit
|
|
- content.tooltips
|
|
- toc.follow
|
|
- navigation.top
|
|
- navigation.tabs
|
|
- navigation.tabs.sticky
|
|
- navigation.footer
|
|
- navigation.tracking
|
|
- navigation.instant
|
|
- content.footnote.tooltips
|
|
icon:
|
|
repo: fontawesome/brands/github
|
|
annotation: material/arrow-right-circle
|
|
custom_dir: overrides
|
|
|
|
plugins:
|
|
- search
|
|
- autorefs
|
|
- mkdocstrings:
|
|
handlers:
|
|
python:
|
|
paths: [../python]
|
|
options:
|
|
docstring_style: numpy
|
|
heading_level: 3
|
|
show_source: true
|
|
show_symbol_type_in_heading: true
|
|
show_signature_annotations: true
|
|
show_root_heading: true
|
|
members_order: source
|
|
docstring_section_style: list
|
|
signature_crossrefs: true
|
|
separate_signature: true
|
|
import:
|
|
# for cross references
|
|
- https://arrow.apache.org/docs/objects.inv
|
|
- https://pandas.pydata.org/docs/objects.inv
|
|
- https://lancedb.github.io/lance/objects.inv
|
|
- https://docs.pydantic.dev/latest/objects.inv
|
|
- mkdocs-jupyter
|
|
- render_swagger:
|
|
allow_arbitrary_locations: true
|
|
|
|
markdown_extensions:
|
|
- admonition
|
|
- footnotes
|
|
- pymdownx.critic
|
|
- pymdownx.caret
|
|
- pymdownx.keys
|
|
- pymdownx.mark
|
|
- pymdownx.tilde
|
|
- pymdownx.details
|
|
- pymdownx.highlight:
|
|
anchor_linenums: true
|
|
line_spans: __span
|
|
pygments_lang_class: true
|
|
- pymdownx.inlinehilite
|
|
- pymdownx.snippets:
|
|
base_path: ..
|
|
dedent_subsections: true
|
|
- pymdownx.superfences
|
|
- pymdownx.tabbed:
|
|
alternate_style: true
|
|
- md_in_html
|
|
- abbr
|
|
- attr_list
|
|
- pymdownx.snippets
|
|
- pymdownx.emoji:
|
|
emoji_index: !!python/name:material.extensions.emoji.twemoji
|
|
emoji_generator: !!python/name:material.extensions.emoji.to_svg
|
|
- markdown.extensions.toc:
|
|
baselevel: 1
|
|
permalink: ""
|
|
|
|
nav:
|
|
- Home:
|
|
- LanceDB: index.md
|
|
- 🏃🏼♂️ Quick start: basic.md
|
|
- 📚 Concepts:
|
|
- Vector search: concepts/vector_search.md
|
|
- Indexing:
|
|
- IVFPQ: concepts/index_ivfpq.md
|
|
- HNSW: concepts/index_hnsw.md
|
|
- Storage: concepts/storage.md
|
|
- Data management: concepts/data_management.md
|
|
- 🔨 Guides:
|
|
- Working with tables: guides/tables.md
|
|
- Building a vector index: ann_indexes.md
|
|
- Vector Search: search.md
|
|
- Full-text search (native): fts.md
|
|
- Full-text search (tantivy-based): fts_tantivy.md
|
|
- Building a scalar index: guides/scalar_index.md
|
|
- Hybrid search:
|
|
- Overview: hybrid_search/hybrid_search.md
|
|
- Comparing Rerankers: hybrid_search/eval.md
|
|
- Airbnb financial data example: notebooks/hybrid_search.ipynb
|
|
- RAG:
|
|
- Vanilla RAG: rag/vanilla_rag.md
|
|
- Multi-head RAG: rag/multi_head_rag.md
|
|
- Corrective RAG: rag/corrective_rag.md
|
|
- Agentic RAG: rag/agentic_rag.md
|
|
- Graph RAG: rag/graph_rag.md
|
|
- Self RAG: rag/self_rag.md
|
|
- Adaptive RAG: rag/adaptive_rag.md
|
|
- SFR RAG: rag/sfr_rag.md
|
|
- Advanced Techniques:
|
|
- HyDE: rag/advanced_techniques/hyde.md
|
|
- FLARE: rag/advanced_techniques/flare.md
|
|
- Reranking:
|
|
- Quickstart: reranking/index.md
|
|
- Cohere Reranker: reranking/cohere.md
|
|
- Linear Combination Reranker: reranking/linear_combination.md
|
|
- Reciprocal Rank Fusion Reranker: reranking/rrf.md
|
|
- Cross Encoder Reranker: reranking/cross_encoder.md
|
|
- ColBERT Reranker: reranking/colbert.md
|
|
- Jina Reranker: reranking/jina.md
|
|
- OpenAI Reranker: reranking/openai.md
|
|
- AnswerDotAi Rerankers: reranking/answerdotai.md
|
|
- Voyage AI Rerankers: reranking/voyageai.md
|
|
- Building Custom Rerankers: reranking/custom_reranker.md
|
|
- Example: notebooks/lancedb_reranking.ipynb
|
|
- Filtering: sql.md
|
|
- Versioning & Reproducibility:
|
|
- sync API: notebooks/reproducibility.ipynb
|
|
- async API: notebooks/reproducibility_async.ipynb
|
|
- Configuring Storage: guides/storage.md
|
|
- Migration Guide: migration.md
|
|
- Tuning retrieval performance:
|
|
- Choosing right query type: guides/tuning_retrievers/1_query_types.md
|
|
- Reranking: guides/tuning_retrievers/2_reranking.md
|
|
- Embedding fine-tuning: guides/tuning_retrievers/3_embed_tuning.md
|
|
- 🧬 Managing embeddings:
|
|
- Understand Embeddings: embeddings/understanding_embeddings.md
|
|
- Get Started: embeddings/index.md
|
|
- Embedding functions: embeddings/embedding_functions.md
|
|
- Available models:
|
|
- Overview: embeddings/default_embedding_functions.md
|
|
- Text Embedding Functions:
|
|
- Sentence Transformers: embeddings/available_embedding_models/text_embedding_functions/sentence_transformers.md
|
|
- Huggingface Embedding Models: embeddings/available_embedding_models/text_embedding_functions/huggingface_embedding.md
|
|
- Ollama Embeddings: embeddings/available_embedding_models/text_embedding_functions/ollama_embedding.md
|
|
- OpenAI Embeddings: embeddings/available_embedding_models/text_embedding_functions/openai_embedding.md
|
|
- Instructor Embeddings: embeddings/available_embedding_models/text_embedding_functions/instructor_embedding.md
|
|
- Gemini Embeddings: embeddings/available_embedding_models/text_embedding_functions/gemini_embedding.md
|
|
- Cohere Embeddings: embeddings/available_embedding_models/text_embedding_functions/cohere_embedding.md
|
|
- Jina Embeddings: embeddings/available_embedding_models/text_embedding_functions/jina_embedding.md
|
|
- AWS Bedrock Text Embedding Functions: embeddings/available_embedding_models/text_embedding_functions/aws_bedrock_embedding.md
|
|
- IBM watsonx.ai Embeddings: embeddings/available_embedding_models/text_embedding_functions/ibm_watsonx_ai_embedding.md
|
|
- Voyage AI Embeddings: embeddings/available_embedding_models/text_embedding_functions/voyageai_embedding.md
|
|
- Multimodal Embedding Functions:
|
|
- OpenClip embeddings: embeddings/available_embedding_models/multimodal_embedding_functions/openclip_embedding.md
|
|
- Imagebind embeddings: embeddings/available_embedding_models/multimodal_embedding_functions/imagebind_embedding.md
|
|
- Jina Embeddings: embeddings/available_embedding_models/multimodal_embedding_functions/jina_multimodal_embedding.md
|
|
- User-defined embedding functions: embeddings/custom_embedding_function.md
|
|
- Variables and secrets: embeddings/variables_and_secrets.md
|
|
- "Example: Multi-lingual semantic search": notebooks/multi_lingual_example.ipynb
|
|
- "Example: MultiModal CLIP Embeddings": notebooks/DisappearingEmbeddingFunction.ipynb
|
|
- 🔌 Integrations:
|
|
- Tools and data formats: integrations/index.md
|
|
- Pandas and PyArrow: python/pandas_and_pyarrow.md
|
|
- Polars: python/polars_arrow.md
|
|
- DuckDB: python/duckdb.md
|
|
- LangChain:
|
|
- LangChain 🔗: integrations/langchain.md
|
|
- LangChain demo: notebooks/langchain_demo.ipynb
|
|
- LangChain JS/TS 🔗: https://js.langchain.com/docs/integrations/vectorstores/lancedb
|
|
- LlamaIndex 🦙:
|
|
- LlamaIndex docs: integrations/llamaIndex.md
|
|
- LlamaIndex demo: notebooks/llamaIndex_demo.ipynb
|
|
- Pydantic: python/pydantic.md
|
|
- Voxel51: integrations/voxel51.md
|
|
- PromptTools: integrations/prompttools.md
|
|
- dlt: integrations/dlt.md
|
|
- phidata: integrations/phidata.md
|
|
- 🎯 Examples:
|
|
- Overview: examples/index.md
|
|
- 🐍 Python:
|
|
- Overview: examples/examples_python.md
|
|
- Build From Scratch: examples/python_examples/build_from_scratch.md
|
|
- Multimodal: examples/python_examples/multimodal.md
|
|
- Rag: examples/python_examples/rag.md
|
|
- Vector Search: examples/python_examples/vector_search.md
|
|
- Chatbot: examples/python_examples/chatbot.md
|
|
- Evaluation: examples/python_examples/evaluations.md
|
|
- AI Agent: examples/python_examples/aiagent.md
|
|
- Recommender System: examples/python_examples/recommendersystem.md
|
|
- Miscellaneous:
|
|
- Serverless QA Bot with S3 and Lambda: examples/serverless_lancedb_with_s3_and_lambda.md
|
|
- Serverless QA Bot with Modal: examples/serverless_qa_bot_with_modal_and_langchain.md
|
|
- 👾 JavaScript:
|
|
- Overview: examples/examples_js.md
|
|
- Serverless Website Chatbot: examples/serverless_website_chatbot.md
|
|
- YouTube Transcript Search: examples/youtube_transcript_bot_with_nodejs.md
|
|
- TransformersJS Embedding Search: examples/transformerjs_embedding_search_nodejs.md
|
|
- 🦀 Rust:
|
|
- Overview: examples/examples_rust.md
|
|
- 📓 Studies:
|
|
- ↗Improve retrievers with hybrid search and reranking: https://blog.lancedb.com/hybrid-search-and-reranking-report/
|
|
- 💭 FAQs: faq.md
|
|
- 🔍 Troubleshooting: troubleshooting.md
|
|
- ⚙️ API reference:
|
|
- 🐍 Python: python/python.md
|
|
- 👾 JavaScript (vectordb): javascript/modules.md
|
|
- 👾 JavaScript (lancedb): js/globals.md
|
|
- 🦀 Rust: https://docs.rs/lancedb/latest/lancedb/
|
|
- ☁️ LanceDB Cloud:
|
|
- Overview: cloud/index.md
|
|
- API reference:
|
|
- 🐍 Python: python/saas-python.md
|
|
- 👾 JavaScript: javascript/modules.md
|
|
- REST API: cloud/rest.md
|
|
- FAQs: cloud/cloud_faq.md
|
|
|
|
- Quick start: basic.md
|
|
- Concepts:
|
|
- Vector search: concepts/vector_search.md
|
|
- Indexing:
|
|
- IVFPQ: concepts/index_ivfpq.md
|
|
- HNSW: concepts/index_hnsw.md
|
|
- Storage: concepts/storage.md
|
|
- Data management: concepts/data_management.md
|
|
- Guides:
|
|
- Working with tables: guides/tables.md
|
|
- Building an ANN index: ann_indexes.md
|
|
- Vector Search: search.md
|
|
- Full-text search (native): fts.md
|
|
- Full-text search (tantivy-based): fts_tantivy.md
|
|
- Building a scalar index: guides/scalar_index.md
|
|
- Hybrid search:
|
|
- Overview: hybrid_search/hybrid_search.md
|
|
- Comparing Rerankers: hybrid_search/eval.md
|
|
- Airbnb financial data example: notebooks/hybrid_search.ipynb
|
|
- RAG:
|
|
- Vanilla RAG: rag/vanilla_rag.md
|
|
- Multi-head RAG: rag/multi_head_rag.md
|
|
- Corrective RAG: rag/corrective_rag.md
|
|
- Agentic RAG: rag/agentic_rag.md
|
|
- Graph RAG: rag/graph_rag.md
|
|
- Self RAG: rag/self_rag.md
|
|
- Adaptive RAG: rag/adaptive_rag.md
|
|
- SFR RAG: rag/sfr_rag.md
|
|
- Advanced Techniques:
|
|
- HyDE: rag/advanced_techniques/hyde.md
|
|
- FLARE: rag/advanced_techniques/flare.md
|
|
- Reranking:
|
|
- Quickstart: reranking/index.md
|
|
- Cohere Reranker: reranking/cohere.md
|
|
- Linear Combination Reranker: reranking/linear_combination.md
|
|
- Reciprocal Rank Fusion Reranker: reranking/rrf.md
|
|
- Cross Encoder Reranker: reranking/cross_encoder.md
|
|
- ColBERT Reranker: reranking/colbert.md
|
|
- Jina Reranker: reranking/jina.md
|
|
- OpenAI Reranker: reranking/openai.md
|
|
- AnswerDotAi Rerankers: reranking/answerdotai.md
|
|
- Building Custom Rerankers: reranking/custom_reranker.md
|
|
- Example: notebooks/lancedb_reranking.ipynb
|
|
- Filtering: sql.md
|
|
- Versioning & Reproducibility:
|
|
- sync API: notebooks/reproducibility.ipynb
|
|
- async API: notebooks/reproducibility_async.ipynb
|
|
- Configuring Storage: guides/storage.md
|
|
- Migration Guide: migration.md
|
|
- Tuning retrieval performance:
|
|
- Choosing right query type: guides/tuning_retrievers/1_query_types.md
|
|
- Reranking: guides/tuning_retrievers/2_reranking.md
|
|
- Embedding fine-tuning: guides/tuning_retrievers/3_embed_tuning.md
|
|
- Managing Embeddings:
|
|
- Understand Embeddings: embeddings/understanding_embeddings.md
|
|
- Get Started: embeddings/index.md
|
|
- Embedding functions: embeddings/embedding_functions.md
|
|
- Available models:
|
|
- Overview: embeddings/default_embedding_functions.md
|
|
- Text Embedding Functions:
|
|
- Sentence Transformers: embeddings/available_embedding_models/text_embedding_functions/sentence_transformers.md
|
|
- Huggingface Embedding Models: embeddings/available_embedding_models/text_embedding_functions/huggingface_embedding.md
|
|
- Ollama Embeddings: embeddings/available_embedding_models/text_embedding_functions/ollama_embedding.md
|
|
- OpenAI Embeddings: embeddings/available_embedding_models/text_embedding_functions/openai_embedding.md
|
|
- Instructor Embeddings: embeddings/available_embedding_models/text_embedding_functions/instructor_embedding.md
|
|
- Gemini Embeddings: embeddings/available_embedding_models/text_embedding_functions/gemini_embedding.md
|
|
- Cohere Embeddings: embeddings/available_embedding_models/text_embedding_functions/cohere_embedding.md
|
|
- Jina Embeddings: embeddings/available_embedding_models/text_embedding_functions/jina_embedding.md
|
|
- AWS Bedrock Text Embedding Functions: embeddings/available_embedding_models/text_embedding_functions/aws_bedrock_embedding.md
|
|
- IBM watsonx.ai Embeddings: embeddings/available_embedding_models/text_embedding_functions/ibm_watsonx_ai_embedding.md
|
|
- Multimodal Embedding Functions:
|
|
- OpenClip embeddings: embeddings/available_embedding_models/multimodal_embedding_functions/openclip_embedding.md
|
|
- Imagebind embeddings: embeddings/available_embedding_models/multimodal_embedding_functions/imagebind_embedding.md
|
|
- Jina Embeddings: embeddings/available_embedding_models/multimodal_embedding_functions/jina_multimodal_embedding.md
|
|
- User-defined embedding functions: embeddings/custom_embedding_function.md
|
|
- Variables and secrets: embeddings/variables_and_secrets.md
|
|
- "Example: Multi-lingual semantic search": notebooks/multi_lingual_example.ipynb
|
|
- "Example: MultiModal CLIP Embeddings": notebooks/DisappearingEmbeddingFunction.ipynb
|
|
- Integrations:
|
|
- Overview: integrations/index.md
|
|
- Pandas and PyArrow: python/pandas_and_pyarrow.md
|
|
- Polars: python/polars_arrow.md
|
|
- DuckDB: python/duckdb.md
|
|
- LangChain 🦜️🔗↗: integrations/langchain.md
|
|
- LangChain.js 🦜️🔗↗: https://js.langchain.com/docs/integrations/vectorstores/lancedb
|
|
- LlamaIndex 🦙↗: integrations/llamaIndex.md
|
|
- Pydantic: python/pydantic.md
|
|
- Voxel51: integrations/voxel51.md
|
|
- PromptTools: integrations/prompttools.md
|
|
- dlt: integrations/dlt.md
|
|
- phidata: integrations/phidata.md
|
|
- Examples:
|
|
- examples/index.md
|
|
- 🐍 Python:
|
|
- Overview: examples/examples_python.md
|
|
- Build From Scratch: examples/python_examples/build_from_scratch.md
|
|
- Multimodal: examples/python_examples/multimodal.md
|
|
- Rag: examples/python_examples/rag.md
|
|
- Vector Search: examples/python_examples/vector_search.md
|
|
- Chatbot: examples/python_examples/chatbot.md
|
|
- Evaluation: examples/python_examples/evaluations.md
|
|
- AI Agent: examples/python_examples/aiagent.md
|
|
- Recommender System: examples/python_examples/recommendersystem.md
|
|
- Miscellaneous:
|
|
- Serverless QA Bot with S3 and Lambda: examples/serverless_lancedb_with_s3_and_lambda.md
|
|
- Serverless QA Bot with Modal: examples/serverless_qa_bot_with_modal_and_langchain.md
|
|
- 👾 JavaScript:
|
|
- Overview: examples/examples_js.md
|
|
- Serverless Website Chatbot: examples/serverless_website_chatbot.md
|
|
- YouTube Transcript Search: examples/youtube_transcript_bot_with_nodejs.md
|
|
- TransformersJS Embedding Search: examples/transformerjs_embedding_search_nodejs.md
|
|
- 🦀 Rust:
|
|
- Overview: examples/examples_rust.md
|
|
- Studies:
|
|
- studies/overview.md
|
|
- ↗Improve retrievers with hybrid search and reranking: https://blog.lancedb.com/hybrid-search-and-reranking-report/
|
|
- API reference:
|
|
- Overview: api_reference.md
|
|
- Python: python/python.md
|
|
- Javascript (vectordb): javascript/modules.md
|
|
- Javascript (lancedb): js/globals.md
|
|
- Rust: https://docs.rs/lancedb/latest/lancedb/index.html
|
|
- LanceDB Cloud:
|
|
- Overview: cloud/index.md
|
|
- API reference:
|
|
- 🐍 Python: python/saas-python.md
|
|
- 👾 JavaScript: javascript/modules.md
|
|
- REST API: cloud/rest.md
|
|
- FAQs: cloud/cloud_faq.md
|
|
|
|
extra_css:
|
|
- styles/global.css
|
|
- styles/extra.css
|
|
|
|
extra_javascript:
|
|
- "extra_js/init_ask_ai_widget.js"
|
|
|
|
extra:
|
|
analytics:
|
|
provider: google
|
|
property: G-B7NFM40W74
|
|
social:
|
|
- icon: fontawesome/brands/github
|
|
link: https://github.com/lancedb/lancedb
|
|
- icon: fontawesome/brands/x-twitter
|
|
link: https://twitter.com/lancedb
|
|
- icon: fontawesome/brands/linkedin
|
|
link: https://www.linkedin.com/company/lancedb
|