lancedb

mirror of https://github.com/lancedb/lancedb.git synced 2026-07-04 11:30:46 +00:00

Author	SHA1	Message	Date
Raghav Dixit	765569425c	doc updates (#1085) closes #1084	2024-04-05 16:32:15 -07:00
Ivan Leo	89ce417452	Update default_embedding_functions.md (#1073) Added a small bit of documentation for the `dim` feature which is provided by the new `text-embedding-3` model series that allows users to shorten an embedding. Happy to discuss a bit on the phrasing but I struggled quite a bit with getting it to work so wanted to help others who might want to use the newer model too	2024-04-05 16:31:53 -07:00
Louis Guitton	7f9ef0d329	Fix default_embedding_functions.md (#1043) typo and broken table	2024-04-05 16:31:36 -07:00
Chang She	484a121866	doc: improve embedding functions documentation (#983) Got some user feedback that the `implicit` / `explicit` distinction is confusing. Instead I was thinking we would just deprecate the `with_embeddings` API and then organize working with embeddings into 3 buckets: 1. manually generate embeddings 2. use a provided embedding function 3. define your own custom embedding function	2024-04-05 16:30:40 -07:00
Ayush Chaurasia	510e8378bc	feat(python): hybrid search updates, examples, & latency benchmarks (#964) - Rename safe_import -> attempt_import_or_raise (closes https://github.com/lancedb/lancedb/pull/923) - Update docs - Add Notebook example (@changhiskhan you can use it for the talk. Comes with "open in colab" button) - Latency benchmark & results comparison, sanity check on real-world data - Updates the default openai model to gpt-4	2024-04-05 16:30:30 -07:00
Ayush Chaurasia	545a03d7f9	feat(python): Aws Bedrock embeddings integration (#822) Supports amazon titan, cohere english & cohere multi-lingual base models.	2024-04-05 16:28:56 -07:00
Prashanth Rao	4d5d748acd	docs: Updates and refactor (#683) This PR makes incremental changes to the documentation. * Closes #697 * Closes #698 - [x] Add dark mode - [x] Fix headers in navbar - [x] Add `extra.css` to customize navbar styles - [x] Customize fonts for prose/code blocks, navbar and admonitions - [x] Inspect all admonition boxes (remove redundant dropdowns) and improve clarity and readability - [x] Ensure that all images in the docs have white background (not transparent) to be viewable in dark mode - [x] Improve code formatting in code blocks to make them consistent with autoformatters (eslint/ruff) - [x] Add bolder weight to h1 headers - [x] Add diagram showing the difference between embedded (OSS) and serverless (Cloud) - [x] Fix [Creating an empty table](https://lancedb.github.io/lancedb/guides/tables/#creating-empty-table) section: right now, the subheaders are not clickable. - [x] In critical data ingestion methods like `table.add` (among others), the type signature often does not match the actual code - [x] Proof-read each documentation section and rewrite as necessary to provide more context, use cases, and explanations so it reads less like reference documentation. This is especially important for CRUD and search sections since those are so central to the user experience. - [x] The section for [Adding data](https://lancedb.github.io/lancedb/guides/tables/#adding-to-a-table) only shows examples for pandas and iterables. We should include pydantic models, arrow tables, etc. - [x] Add conceptual tutorial for IVF-PQ index - [x] Clearly separate vector search, FTS and filtering sections so that these are easier to find - [x] Add docs on refine factor to explain its importance for recall. Closes #716 - [x] Add an FAQ page showing answers to commonly asked questions about LanceDB. Closes #746 - [x] Add simple polars example to the integrations section. Closes #756 and closes #153 - [ ] Add basic docs for the Rust API (more detailed API docs can come later). Closes #781 - [x] Add a section on the various storage options on local vs. cloud (S3, EBS, EFS, local disk, etc.) and the tradeoffs involved. Closes #782 - [x] Revamp filtering docs: add pre-filtering examples and redo headers and update content for SQL filters. Closes #783 and closes #784. - [x] Add docs for data management: compaction, cleaning up old versions and incremental indexing. Closes #785 - [ ] Add a benchmark section that also discusses some best practices. Closes #787 --------- Co-authored-by: Ayush Chaurasia <ayush.chaurarsia@gmail.com> Co-authored-by: Will Jones <willjones127@gmail.com>	2024-04-05 16:27:12 -07:00
Ayush Chaurasia	2f72d5138e	feat(python): Add gemini text embedding function (#806) Named it Gemini-text for now. Not sure how complicated it will be to support both text and multimodal embeddings under the same class "gemini". But it's not something to worry about for now I guess.	2024-04-05 16:25:52 -07:00
Chris	6698376f02	Minor Fixes to Ingest Embedding Functions Docs (#777) Addressed minor typos and grammatical issues to improve readability --------- Co-authored-by: Christopher Correa <chris.correa@gmail.com>	2024-04-05 16:24:47 -07:00
Vladimir Varankin	2fd829296e	Minor corrections for docs of embedding_functions (#780) In addition to #777, this pull request fixes more typos in the documentation for "Ingest Embedding Functions".	2024-04-05 16:24:47 -07:00
Bengsoon Chuah	e3ba5b2402	Add relevant imports for each step (#764) I found that it was quite incoherent to have to read through the documentation and having to search which submodule that each class should be imported from. For example, it is cumbersome to have to navigate to another documentation page to find out that `EmbeddingFunctionRegistry` is from `lancedb.embeddings`	2024-04-05 16:24:47 -07:00
elliottRobinson	3ab4b335c3	Update default_embedding_functions.md (#744) Modify some grammar, punctuation, and spelling errors.	2024-04-05 16:24:47 -07:00
Ayush Chaurasia	088792c821	[Docs]: Add Instructor embeddings and rate limit handler docs (#651)	2024-04-05 16:23:49 -07:00
Ayush Chaurasia	1c42894918	[DOCS][PYTHON] Update embeddings API docs & Example (#516) This PR adds an overview of embeddings docs: - 2 ways to vectorize your data using lancedb - explicit & implicit - explicit - manually vectorize your data using `with_embeddings` function - Implicit - automatically vectorize your data as it comes by ingesting your embedding function details as table metadata - Multi-modal example w/ disappearing embedding function	2024-04-05 16:22:59 -07:00

14 Commits