Commit Graph

86 Commits

Author SHA1 Message Date
Rithik Kumar
2bc7dca3ca docs: add changes to Embeddings-> Available models-> overview page (#1596)
adding features and improvements to - Manage Embeddings page

Before:
![Screenshot 2024-09-04
223743](https://github.com/user-attachments/assets/f1e116b5-6ebb-4d59-9d29-b20084998cd0)

After:



![Screenshot 2024-09-05
214214](https://github.com/user-attachments/assets/8c94318e-68af-447e-97e1-8153860a2914)

![Screenshot 2024-09-05
213623](https://github.com/user-attachments/assets/55c82770-6df9-4bab-9c5c-1ea1552138de)

![Screenshot 2024-09-05
215931](https://github.com/user-attachments/assets/9bfac7d4-16a6-454e-801e-50789ff75261)
2024-09-05 22:19:08 +05:30
Ayush Chaurasia
51966a84f5 docs: add multi-vector reranking, answerdotai and studies section (#1579) 2024-08-31 04:09:14 +05:30
Ayush Chaurasia
bfe8fccfab docs: add hnsw docs (#1570) 2024-08-29 15:16:27 +05:30
Rithik Kumar
6f6eb170a9 docs: revamp Python example: Overview page and remove redundant examples and notebooks (#1574)
before:
![Screenshot 2024-08-29
131656](https://github.com/user-attachments/assets/81cb5d70-5dff-4e57-8bbe-3461327aed7d)

After:
![Screenshot 2024-08-29
131715](https://github.com/user-attachments/assets/62109a37-7f66-4fd4-90ed-906a85472117)

---------

Co-authored-by: Ayush Chaurasia <ayush.chaurarsia@gmail.com>
2024-08-29 13:48:10 +05:30
Rithik Kumar
dd1c16bbaf docs: fix links, convert backslash to forward slash in mkdocs.yml (#1571)
Co-authored-by: Ayush Chaurasia <ayush.chaurarsia@gmail.com>
2024-08-28 16:07:57 +05:30
Rithik Kumar
ae85008714 docs: revamp embedding models (#1568)
before:
![Screenshot 2024-08-27
151525](https://github.com/user-attachments/assets/d4f8f2b9-37e6-4a31-b144-01b804019e11)

After:
![Screenshot 2024-08-27
151550](https://github.com/user-attachments/assets/79fe7d27-8f14-4d80-9b41-a1e91f8c708f)

---------

Co-authored-by: Ayush Chaurasia <ayush.chaurarsia@gmail.com>
2024-08-27 17:14:35 +05:30
Rithik Kumar
632007d0e2 docs: add recommender system example (#1561)
before:
![Screenshot 2024-08-24
230216](https://github.com/user-attachments/assets/cc8a810a-b032-45d7-b086-b2ef0720dc16)

After:
![Screenshot 2024-08-24
230228](https://github.com/user-attachments/assets/eaa1dc31-ac7f-4b81-aa79-b4cf94f0cbd5)

---------

Co-authored-by: Ayush Chaurasia <ayush.chaurarsia@gmail.com>
2024-08-25 12:30:30 +05:30
rahuljo
6ad5553eca docs: add dlt-lancedb integration page (#1551)
Co-authored-by: Akela Drissner-Schmid <32450038+akelad@users.noreply.github.com>
2024-08-22 15:18:49 +05:30
Rithik Kumar
758c82858f docs: add AI agent example (#1553)
before:
![Screenshot 2024-08-21
225014](https://github.com/user-attachments/assets/e5b05586-87c5-4739-a4df-2d6cd0704ba5)

After:
![Screenshot 2024-08-21
225029](https://github.com/user-attachments/assets/504959db-f560-49b2-9492-557e9846a793)

---------

Co-authored-by: Ayush Chaurasia <ayush.chaurarsia@gmail.com>
2024-08-22 00:54:05 +05:30
Rithik Kumar
0cbc9cd551 docs: add evaluation example (#1552)
before:
![Screenshot 2024-08-21
194228](https://github.com/user-attachments/assets/68d96658-7579-4934-85af-e8c898b64660)

After:
![Screenshot 2024-08-21
195258](https://github.com/user-attachments/assets/81ddb9cd-cb93-47fc-a121-ff82701fd11f)

---------

Co-authored-by: Ayush Chaurasia <ayush.chaurarsia@gmail.com>
2024-08-21 20:37:04 +05:30
Rithik Kumar
21014cab45 docs: add chatbot example and improve quality of other examples (#1544) 2024-08-17 12:35:33 +05:30
Lei Xu
5857cb4c6e docs: add a section to describe scalar index (#1495) 2024-08-16 18:48:29 -07:00
Rithik Kumar
09ce6c5bb5 docs: add vector search example (#1543) 2024-08-16 21:30:45 +05:30
Rithik Kumar
a62f661d90 docs: revamp example docs (#1512)
Before: 
![Screenshot 2024-08-07
015834](https://github.com/user-attachments/assets/b817f846-78b3-4d6f-b4a0-dfa3f4d6be87)

After:
![Screenshot 2024-08-07
015852](https://github.com/user-attachments/assets/53370301-8c40-45f8-abe3-32f9d051597e)
![Screenshot 2024-08-07
015934](https://github.com/user-attachments/assets/63cdd038-32bb-4b3e-b9c4-1389d2754014)
![Screenshot 2024-08-07
015941](https://github.com/user-attachments/assets/70388680-9c2b-49ef-ba00-2bb015988214)
![Screenshot 2024-08-07
015949](https://github.com/user-attachments/assets/76335a33-bb6f-473c-896f-447320abcc25)

---------

Co-authored-by: Ayush Chaurasia <ayush.chaurarsia@gmail.com>
2024-08-07 03:56:59 +05:30
Ayush Chaurasia
513926960d docs: add rrf docs and update reranking notebook with Jina reranker results (#1474)
- RRF reranker
- Jina Reranker results

---------

Co-authored-by: Weston Pace <weston.pace@gmail.com>
2024-07-25 22:29:46 +05:30
Cory Grinstead
69295548cc docs: minor updates for js migration guides (#1451)
Co-authored-by: Will Jones <willjones127@gmail.com>
2024-07-22 10:26:49 -07:00
Cory Grinstead
7ae327242b docs: update migration.md (#1445) 2024-07-15 18:20:23 -05:00
Ayush Chaurasia
bb2e624ff0 docs: add fine tuning section in retriever guide and minor fixes (#1438) 2024-07-11 17:34:29 +05:30
Joan Fontanals
cef24801f4 docs: add jina reranker to index (#1427)
PR to add JinaReranker documentation page to the rerankers index
2024-07-09 14:39:35 +05:30
Raghav Dixit
a5ff623443 docs: update lntegration docs & fixed links (#1423)
1. Updated langchain docs. 
2. Minor update to llamaindex doc.
3. Added notebook examples and linked them correctly
2024-07-03 21:50:33 +05:30
Lei Xu
d6485f1215 docs: add openapi rest api page (#1413) 2024-06-27 21:32:34 -07:00
Raghav Dixit
96914a619b docs: llama-index integration (#1347)
Updated api refrence and usage for llama index integration.
2024-06-09 23:52:18 +05:30
Ayush Chaurasia
76fc16c7a1 docs: add retriever guide, address minor onboarding feedbacks & enhancement (#1326)
- Tried to address some onboarding feedbacks listed in
https://github.com/lancedb/lancedb/issues/1224
- Improve visibility of pydantic integration and embedding API. (Based
on onboarding feedback - Many ways of ingesting data, defining schema
but not sure what to use in a specific use-case)
- Add a guide that takes users through testing and improving retriever
performance using built-in utilities like hybrid-search and reranking
- Add some benchmarks for the above
- Add missing cohere docs

---------

Co-authored-by: Weston Pace <weston.pace@gmail.com>
2024-06-08 06:25:31 +05:30
Raghav Dixit
0bd6ac945e Documentation : Langchain doc bug fix (#1301)
nav bar update
2024-05-13 20:56:34 +05:30
Weston Pace
deb947ddbd doc: fix typo, broken links (#1218) 2024-04-11 14:58:51 -07:00
Prashanth Rao
d155e82723 [docs] Fix broken links and clarify language in integrations docs (#1209)
This PR does the following:

- Fixes broken/outdated URLs
- Adds clarity to the way DuckDB/LanceDB integration works via Arrow
2024-04-11 15:32:08 +05:30
Ayush Chaurasia
44c03ebef3 docs : Update Reranking docs (#1213) 2024-04-11 15:20:00 +05:30
Lei Xu
7e5a54b76a chore: add social link footer (#1177) 2024-04-05 16:34:50 -07:00
Weston Pace
e21b56293c docs: add a reference to @lancedb/lance in the docs (#1166)
We aren't yet ready to switch over the examples since almost all JS
examples rely on embeddings and we haven't yet ported those over.
However, this makes it possible for those that are interested to start
using `@lancedb/lancedb`
2024-04-05 16:34:39 -07:00
Weston Pace
f97c7dad8c docs: add the async python API to the docs (#1156) 2024-04-05 16:34:37 -07:00
Weston Pace
0fe0976a0e docs: add links to rust SDK docs, remove references to rust SDK being unstable / experimental (#1131) 2024-04-05 16:33:05 -07:00
Ayush Chaurasia
538d0320f7 Docs: add meta tags (#1006) 2024-04-05 16:30:40 -07:00
Ayush Chaurasia
bc871169f0 docs: Add meta tag for image preview (#988)
I think this should work. Need to deploy it to be sure as it can be
tested locally. Can be tested here.

2 things about this solution:
* All pages have a same meta tag, i.e, lancedb banner
* If needed, we can automatically use the first image of each page and
generate meta tags using the ultralytics mkdocs plugin that we did for
this purpose - https://github.com/ultralytics/mkdocs
2024-04-05 16:30:40 -07:00
Chang She
3fc835e124 doc: update navigation links for embedding functions (#986) 2024-04-05 16:30:40 -07:00
Ayush Chaurasia
aecafa6479 docs: Minimal reranking evaluation benchmarks (#977) 2024-04-05 16:30:40 -07:00
Prashanth Rao
b014c24e66 [docs]: Fix typos and clarity in hybrid search docs (#966)
- Fixed typos and added some clarity to the hybrid search docs
- Changed "Airbnb" case to be as per the [official company
name](https://en.wikipedia.org/wiki/Airbnb) (the "bnb" shouldn't be
capitalized", and the text in the document aligns with this
- Fixed headers in nav bar
2024-04-05 16:30:30 -07:00
Ayush Chaurasia
510e8378bc feat(python): hybrid search updates, examples, & latency benchmarks (#964)
- Rename safe_import -> attempt_import_or_raise (closes
https://github.com/lancedb/lancedb/pull/923)
- Update docs
- Add Notebook example (@changhiskhan you can use it for the talk. Comes
with "open in colab" button)
- Latency benchmark & results comparison, sanity check on real-world
data
- Updates the default openai model to gpt-4
2024-04-05 16:30:30 -07:00
Ayush Chaurasia
a41f7be88d feat(python): Hybrid search & Reranker API (#824)
based on https://github.com/lancedb/lancedb/pull/713
- The Reranker api can be plugged into vector only or fts only search
but this PR doesn't do that (see example -
https://txt.cohere.com/rerank/)


### Default reranker -- `LinearCombinationReranker(weight=0.7,
fill=1.0)`

```
table.search("hello", query_type="hybrid").rerank(normalize="score").to_pandas()
```
### Available rerankers
LinearCombinationReranker
```
from lancedb.rerankers import LinearCombinationReranker

# Same as default 
table.search("hello", query_type="hybrid").rerank(
                                      normalize="score", 
                                      reranker=LinearCombinationReranker()
                                     ).to_pandas()

# with custom params
reranker = LinearCombinationReranker(weight=0.3, fill=1.0)
table.search("hello", query_type="hybrid").rerank(
                                      normalize="score", 
                                      reranker=reranker
                                     ).to_pandas()
```

Cohere Reranker
```
from lancedb.rerankers import CohereReranker

# default model.. English and multi-lingual supported. See docstring for available custom params
table.search("hello", query_type="hybrid").rerank(
                                      normalize="rank",  # score or rank
                                      reranker=CohereReranker()
                                     ).to_pandas()

```

CrossEncoderReranker

```
from lancedb.rerankers import CrossEncoderReranker

table.search("hello", query_type="hybrid").rerank(
                                      normalize="rank", 
                                      reranker=CrossEncoderReranker()
                                     ).to_pandas()

```

## Using custom Reranker
```
from lancedb.reranker import Reranker

class CustomReranker(Reranker):
    def rerank_hybrid(self, vector_result, fts_result):
           combined_res = self.merge_results(vector_results, fts_results) # or use custom combination logic
           # Custom rerank logic here
           
           return combined_res
```

- [x] Expand testing
- [x] Make sure usage makes sense
- [x] Run simple benchmarks for correctness (Seeing weird result from
cohere reranker in the toy example)
- Support diverse rerankers by default:
- [x] Cross encoding
- [x] Cohere
- [x] Reciprocal Rank Fusion

---------

Co-authored-by: Chang She <759245+changhiskhan@users.noreply.github.com>
Co-authored-by: Prashanth Rao <35005448+prrao87@users.noreply.github.com>
2024-04-05 16:28:56 -07:00
Ayush Chaurasia
b326bf2ef6 doc: Add documentation chatbot for LanceDB (#890)
<img width="1258" alt="Screenshot 2024-01-29 at 10 05 52 PM"
src="https://github.com/lancedb/lancedb/assets/15766192/7c108fde-e993-415c-ad01-72010fd5fe31">
2024-04-05 16:28:56 -07:00
Lei Xu
12e776821a doc: use snippet for rust code example and make sure rust examples run through CI (#885) 2024-04-05 16:28:56 -07:00
Lei Xu
d811b89de2 doc: use code snippet for typescript examples (#880)
The typescript code is in a fully function file, that will be run via the CI.
2024-04-05 16:28:56 -07:00
Lei Xu
fd2fd94862 doc: update quick start for full rust example (#872) 2024-04-05 16:28:56 -07:00
Will Jones
7d82e56f76 docs: document basics of configuring object storage (#832)
Created based on upstream PR https://github.com/lancedb/lance/pull/1849

Closes #681

---------

Co-authored-by: Prashanth Rao <35005448+prrao87@users.noreply.github.com>
2024-04-05 16:27:51 -07:00
Prashanth Rao
4d5d748acd docs: Updates and refactor (#683)
This PR makes incremental changes to the documentation.

* Closes #697
* Closes #698

- [x] Add dark mode
- [x] Fix headers in navbar
- [x] Add `extra.css` to customize navbar styles
- [x] Customize fonts for prose/code blocks, navbar and admonitions
- [x] Inspect all admonition boxes (remove redundant dropdowns) and
improve clarity and readability
- [x] Ensure that all images in the docs have white background (not
transparent) to be viewable in dark mode
- [x] Improve code formatting in code blocks to make them consistent
with autoformatters (eslint/ruff)
- [x] Add bolder weight to h1 headers
- [x] Add diagram showing the difference between embedded (OSS) and
serverless (Cloud)
- [x] Fix [Creating an empty
table](https://lancedb.github.io/lancedb/guides/tables/#creating-empty-table)
section: right now, the subheaders are not clickable.
- [x] In critical data ingestion methods like `table.add` (among
others), the type signature often does not match the actual code
- [x] Proof-read each documentation section and rewrite as necessary to
provide more context, use cases, and explanations so it reads less like
reference documentation. This is especially important for CRUD and
search sections since those are so central to the user experience.

- [x] The section for [Adding
data](https://lancedb.github.io/lancedb/guides/tables/#adding-to-a-table)
only shows examples for pandas and iterables. We should include pydantic
models, arrow tables, etc.
- [x] Add conceptual tutorial for IVF-PQ index
- [x] Clearly separate vector search, FTS and filtering sections so that
these are easier to find
- [x] Add docs on refine factor to explain its importance for recall.
Closes #716
- [x] Add an FAQ page showing answers to commonly asked questions about
LanceDB. Closes #746
- [x] Add simple polars example to the integrations section. Closes #756
and closes #153
- [ ] Add basic docs for the Rust API (more detailed API docs can come
later). Closes #781
- [x] Add a section on the various storage options on local vs. cloud
(S3, EBS, EFS, local disk, etc.) and the tradeoffs involved. Closes #782
- [x] Revamp filtering docs: add pre-filtering examples and redo headers
and update content for SQL filters. Closes #783 and closes #784.
- [x] Add docs for data management: compaction, cleaning up old versions
and incremental indexing. Closes #785
- [ ] Add a benchmark section that also discusses some best practices.
Closes #787

---------

Co-authored-by: Ayush Chaurasia <ayush.chaurarsia@gmail.com>
Co-authored-by: Will Jones <willjones127@gmail.com>
2024-04-05 16:27:12 -07:00
QianZhu
25d1c62c3f SaaS JS API sdk doc (#740)
Co-authored-by: Aidan <64613310+aidangomar@users.noreply.github.com>
2024-04-05 16:24:47 -07:00
Ayush Chaurasia
91ff324c70 docs: Update roboflow tutorial position (#666) 2024-04-05 16:23:49 -07:00
QianZhu
bda0135cfc saas python sdk doc (#692)
<img width="256" alt="Screenshot 2023-12-07 at 11 55 41 AM"
src="https://github.com/lancedb/lancedb/assets/1305083/259bf234-9b3b-4c5d-af45-c7f3fada2cc7">
2024-04-05 16:23:49 -07:00
James
a94a033553 (docs):Add CLIP image embedding example (#660)
In this PR, I add a guide that lets you use Roboflow Inference to
calculate CLIP embeddings for use in LanceDB. This post was reviewed by
@AyushExel.
2024-04-05 16:23:49 -07:00
Ayush Chaurasia
955c2a751a [Docs][SEO] Add sitemap and robots.txt (#645)
Sitemap improves SEO by ranking pages and tracking updates.
2024-04-05 16:23:49 -07:00
Ayush Chaurasia
07ab4cd14c Disable posthog on docs & reduce sentry trace factor (#607)
- posthog charges per event and docs events are registered very
frequently. We can keep tracking them on GA
- Reduced sentry trace factor
2024-04-05 16:23:13 -07:00