lancedb

mirror of https://github.com/lancedb/lancedb.git synced 2025-12-22 21:09:58 +00:00

Author	SHA1	Message	Date
Bill Chambers	9c25998110	docs: update serverless_lancedb_with_s3_and_lambda.md (#1559 )	2024-08-26 14:55:28 +05:30
Ayush Chaurasia	549ca51a8a	feat: add answerdotai rerankers support and minor improvements (#1560 ) This PR: - Adds missing license headers - Integrates with answerdotai Rerankers package - Updates ColbertReranker to subclass answerdotai package. This is done to keep backwards compatibility as some users might be used to importing ColbertReranker directly - Set `trust_remote_code` to ` True` by default in CrossEncoder and sentence-transformer based rerankers	2024-08-26 13:25:10 +05:30
Rithik Kumar	632007d0e2	docs: add recommender system example (#1561 ) before: ![Screenshot 2024-08-24 230216](https://github.com/user-attachments/assets/cc8a810a-b032-45d7-b086-b2ef0720dc16) After: ![Screenshot 2024-08-24 230228](https://github.com/user-attachments/assets/eaa1dc31-ac7f-4b81-aa79-b4cf94f0cbd5) --------- Co-authored-by: Ayush Chaurasia <ayush.chaurarsia@gmail.com>	2024-08-25 12:30:30 +05:30
rahuljo	6ad5553eca	docs: add dlt-lancedb integration page (#1551 ) Co-authored-by: Akela Drissner-Schmid <32450038+akelad@users.noreply.github.com>	2024-08-22 15:18:49 +05:30
Rithik Kumar	758c82858f	docs: add AI agent example (#1553 ) before: ![Screenshot 2024-08-21 225014](https://github.com/user-attachments/assets/e5b05586-87c5-4739-a4df-2d6cd0704ba5) After: ![Screenshot 2024-08-21 225029](https://github.com/user-attachments/assets/504959db-f560-49b2-9492-557e9846a793) --------- Co-authored-by: Ayush Chaurasia <ayush.chaurarsia@gmail.com>	2024-08-22 00:54:05 +05:30
Rithik Kumar	0cbc9cd551	docs: add evaluation example (#1552 ) before: ![Screenshot 2024-08-21 194228](https://github.com/user-attachments/assets/68d96658-7579-4934-85af-e8c898b64660) After: ![Screenshot 2024-08-21 195258](https://github.com/user-attachments/assets/81ddb9cd-cb93-47fc-a121-ff82701fd11f) --------- Co-authored-by: Ayush Chaurasia <ayush.chaurarsia@gmail.com>	2024-08-21 20:37:04 +05:30
Ayush Chaurasia	85bb7e54e4	docs: missing griffe dependency for mkdocs deployment (#1545 )	2024-08-19 07:48:23 +05:30
Rithik Kumar	21014cab45	docs: add chatbot example and improve quality of other examples (#1544 )	2024-08-17 12:35:33 +05:30
Lei Xu	5857cb4c6e	docs: add a section to describe scalar index (#1495 )	2024-08-16 18:48:29 -07:00
Rithik Kumar	09ce6c5bb5	docs: add vector search example (#1543 )	2024-08-16 21:30:45 +05:30
Lei Xu	b2317c904d	feat: create bitmap and label list scalar index using python async api (#1529 ) * Expose `bitmap` and `LabelList` scalar index type via Rust and Async Python API * Add documents	2024-08-11 09:16:11 -07:00
Ayush Chaurasia	a88e9bb134	docs: add lancedb embedding fcn on cloud docs (#1521 )	2024-08-09 07:21:04 +05:30
BubbleCal	f9d5fa88a1	feat!: migrate FTS from tantivy to lance-index (#1483 ) Lance now supports FTS, so add it into lancedb Python, TypeScript and Rust SDKs. For Python, we still use tantivy based FTS by default because the lance FTS index now misses some features of tantivy. For Python: - Support to create lance based FTS index - Support to specify columns for full text search (only available for lance based FTS index) For TypeScript: - Change the search method so that it can accept both string and vector - Support full text search For Rust - Support full text search The others: - Update the FTS doc BREAKING CHANGE: - for Python, this renames the attached score column of FTS from "score" to "_score", this could be a breaking change for users that rely on the scores --------- Signed-off-by: BubbleCal <bubble-cal@outlook.com>	2024-08-08 15:33:15 +08:00
Rithik Kumar	a62f661d90	docs: revamp example docs (#1512 ) Before: ![Screenshot 2024-08-07 015834](https://github.com/user-attachments/assets/b817f846-78b3-4d6f-b4a0-dfa3f4d6be87) After: ![Screenshot 2024-08-07 015852](https://github.com/user-attachments/assets/53370301-8c40-45f8-abe3-32f9d051597e) ![Screenshot 2024-08-07 015934](https://github.com/user-attachments/assets/63cdd038-32bb-4b3e-b9c4-1389d2754014) ![Screenshot 2024-08-07 015941](https://github.com/user-attachments/assets/70388680-9c2b-49ef-ba00-2bb015988214) ![Screenshot 2024-08-07 015949](https://github.com/user-attachments/assets/76335a33-bb6f-473c-896f-447320abcc25) --------- Co-authored-by: Ayush Chaurasia <ayush.chaurarsia@gmail.com>	2024-08-07 03:56:59 +05:30
Robby	8d2ff7b210	feat(python): add watsonx embeddings to registry (#1486 ) Related issue: https://github.com/lancedb/lancedb/issues/1412 --------- Co-authored-by: Robby <h0rv@users.noreply.github.com>	2024-08-06 10:58:33 +05:30
Rithik Kumar	d297da5a7e	docs: update examples docs (#1488 ) Testing Workflow with my first PR. Before: ![Screenshot 2024-08-01 183326](https://github.com/user-attachments/assets/83d22101-8bbf-4b18-81e4-f740e605727a) After: ![Screenshot 2024-08-01 183333](https://github.com/user-attachments/assets/a5e4cd2c-c524-4009-81d5-75b2b0361f83)	2024-08-01 18:54:45 +05:30
Cory Grinstead	a062a92f6b	docs: custom embedding function for ts (#1479 )	2024-07-30 18:19:55 -05:00
Ayush Chaurasia	513926960d	docs: add rrf docs and update reranking notebook with Jina reranker results (#1474 ) - RRF reranker - Jina Reranker results --------- Co-authored-by: Weston Pace <weston.pace@gmail.com>	2024-07-25 22:29:46 +05:30
inn-0	cc507ca766	docs: add missing whitespace before markdown table to fix rendering issue (#1471 ) ### Fix markdown table rendering issue This PR adds a missing whitespace before a markdown table in the documentation. This issue causes the table to not render properly in mkdocs, while it does render properly in GitHub's markdown viewer. #### Change Details: - Added a single line of whitespace before the markdown table to ensure proper rendering in mkdocs. #### Note: - I wasn't able to test this fix in the mkdocs environment, but it should be safe as it only involves adding whitespace which won't break anything. --- Cohere supports following input types: \| Input Type \| Description \| \|-------------------------\|---------------------------------------\| \| "`search_document`" \| Used for embeddings stored in a vector\| \| \| database for search use-cases. \| \| "`search_query`" \| Used for embeddings of search queries \| \| \| run against a vector DB \| \| "`semantic_similarity`" \| Specifies the given text will be used \| \| \| for Semantic Textual Similarity (STS) \| \| "`classification`" \| Used for embeddings passed through a \| \| \| text classifier. \| \| "`clustering`" \| Used for the embeddings run through a \| \| \| clustering algorithm \| Usage Example:	2024-07-24 22:26:28 +05:30
Lei Xu	c9c61eb060	docs: expose merge_insert doc for remote python SDK (#1464 ) `merge_insert` API is not shown up on [`RemoteTable`](https://lancedb.github.io/lancedb/python/saas-python/#lancedb.remote.table.RemoteTable) today * Also bump `ruff` version as well	2024-07-22 10:48:16 -07:00
Cory Grinstead	69295548cc	docs: minor updates for js migration guides (#1451 ) Co-authored-by: Will Jones <willjones127@gmail.com>	2024-07-22 10:26:49 -07:00
Cory Grinstead	2276b114c5	docs: add installation note about yarn (#1459 ) I noticed that setting up a simple project with [Yarn](https://yarnpkg.com/) failed because unlike others [npm, pnpm, bun], yarn does not automatically resolve peer dependencies, so i added a quick note about it in the installation guide.	2024-07-19 18:48:24 -05:00
Magnus	dc609a337d	fix: added support for trust_remote_code (#1454 ) Closes #1285 Added trust_remote_code to the SentenceTransformerEmbeddings class. Defaults to `False`	2024-07-18 19:37:52 +05:30
Cory Grinstead	7ae327242b	docs: update `migration.md` (#1445 )	2024-07-15 18:20:23 -05:00
Ayush Chaurasia	bb2e624ff0	docs: add fine tuning section in retriever guide and minor fixes (#1438 )	2024-07-11 17:34:29 +05:30
Cory Grinstead	31be9212da	docs(nodejs): add @lancedb/lancedb examples everywhere (#1411 ) Co-authored-by: Will Jones <willjones127@gmail.com>	2024-07-10 13:29:03 -05:00
Joan Fontanals	cef24801f4	docs: add jina reranker to index (#1427 ) PR to add JinaReranker documentation page to the rerankers index	2024-07-09 14:39:35 +05:30
Lei Xu	58c2cd01a5	docs: add fast search to openapi.yml (#1435 )	2024-07-08 11:55:45 -07:00
Joan Fontanals	08d25c5a80	feat: add Jina integration in Python for Embedding and Reranker (#1424 ) Integration of Jina Embeddings and Rerankers through its API	2024-07-05 01:34:43 +05:30
Raghav Dixit	a5ff623443	docs: update integration docs & fixed links (#1423 ) 1. Updated langchain docs. 2. Minor update to llamaindex doc. 3. Added notebook examples and linked them correctly	2024-07-03 21:50:33 +05:30
Lei Xu	020a437230	docs: add merge insert, create index and create scalar index to public rest api doc (#1420 ) Added 3 APIs doc publicly: - `merge_insert` - `create_index` - `create_scalar_index` --------- Co-authored-by: Weston Pace <weston.pace@gmail.com>	2024-07-01 12:52:27 -07:00
Ayush Chaurasia	ccded130ed	docs: add reranking example (#1416 )	2024-07-01 19:42:38 +05:30
Sidharth Rajaram	48f8d1b3b7	docs: addresses typos in HF embedding example docs (#1415 ) * `table.add` requires `data` parameter on the docs page regarding use of embedding models from HF * also changed the name of example class from `TextModel` to `Words` since that is what is used as parameter in the `db.create_table` call * Per https://lancedb.github.io/lancedb/python/python/#lancedb.table.Table.add	2024-07-01 12:14:17 +05:30
Will Jones	865ed99881	feat: dynamodb commit store support (#1410 ) This allows users to specify URIs like: ``` s3+ddb://my_bucket/path?ddbTableName=myCommitTable ``` and it will support concurrent writes in S3. * [x] Add dynamodb integration tests * [x] Add modifications to get it working in Python sync API * [x] Added section in documentation describing how to configure. Closes #534 --------- Co-authored-by: universalmind303 <cory.grinstead@gmail.com>	2024-06-28 09:30:36 -07:00
Lei Xu	d6485f1215	docs: add openapi rest api page (#1413 )	2024-06-27 21:32:34 -07:00
Thomas J. Fan	a866b78a31	docs: fixes polars formatting in docs (#1400 ) Currently, the whole polars section is formatted as a code block: https://lancedb.github.io/lancedb/guides/tables/#from-a-polars-dataframe This PR fixes the formatting.	2024-06-25 08:46:16 -07:00
josca42	0fe844034d	feat: enable stemming (#1356 ) Added the ability to specify tokenizer_name, when creating a full text search index using tantivy. This enables the use of language specific stemming. Also updated the [guide on full text search](https://lancedb.github.io/lancedb/fts/) with a short section on choosing tokenizer. Fixes #1315	2024-06-20 14:23:55 -07:00
Raghav Dixit	96914a619b	docs: llama-index integration (#1347 ) Updated API reference and usage for llama index integration.	2024-06-09 23:52:18 +05:30
Ayush Chaurasia	72f339a0b3	docs: add note about embedding api not being available on cloud (#1371 )	2024-06-09 03:57:23 +05:30
Ayush Chaurasia	5e30648f45	docs: fix example path (#1367 )	2024-06-07 19:40:50 -07:00
Ayush Chaurasia	76fc16c7a1	docs: add retriever guide, address minor onboarding feedbacks & enhancement (#1326 ) - Tried to address some onboarding feedbacks listed in https://github.com/lancedb/lancedb/issues/1224 - Improve visibility of pydantic integration and embedding API. (Based on onboarding feedback - Many ways of ingesting data, defining schema but not sure what to use in a specific use-case) - Add a guide that takes users through testing and improving retriever performance using built-in utilities like hybrid-search and reranking - Add some benchmarks for the above - Add missing cohere docs --------- Co-authored-by: Weston Pace <weston.pace@gmail.com>	2024-06-08 06:25:31 +05:30
Will Jones	5349e8b1db	ci: make preview releases (#1302 ) This PR changes the release process. Some parts are more complex, and other parts I've simplified. ## Simplifications * Combined `Create Release Commit` and `Create Python Release Commit` into a single workflow. By default, it does a release of all packages, but you can still choose to make just a Python or just Node/Rust release through the arguments. This will make it rarer that we create a Node release but forget about Python or vice-versa. * Releases are automatically generated once a tag is pushed. This eliminates the manual step of creating the release. * Release notes are automatically generated and changes are categorized based on the PR labels. * Removed the use of `LANCEDB_RELEASE_TOKEN` in favor of just using `GITHUB_TOKEN` where it wasn't necessary. In the one place it is necessary, I left a comment as to why it is. * Reused the version in `python/Cargo.toml` so we don't have two different versions in Python LanceDB. ## New changes * We now can create `preview` / `beta` releases. By default `Create Release Commit` will create a preview release, but you can select a "stable" release type and it will create a full stable release. * For Python, pre-releases go to fury.io instead of PyPI * `bump2version` was deprecated, so upgraded to `bump-my-version`. This also seems to better support semantic versioning with pre-releases. * `ci` changes will now be shown in the changelog, allowing changes like this to be visible to users. `chore` is still hidden. ## Versioning NOTE: unlike how it is in lance repo right now, the version in main is the last one released, including beta versions. --------- Co-authored-by: Lance Release <lance-dev@lancedb.com> Co-authored-by: Weston Pace <weston.pace@gmail.com>	2024-05-17 11:24:38 -07:00
Raghav Dixit	0bd6ac945e	Documentation : Langchain doc bug fix (#1301 ) nav bar update	2024-05-13 20:56:34 +05:30
Raghav Dixit	c9d5475333	Documentation: Langchain Integration (#1297 ) Integration doc update	2024-05-13 10:19:33 -04:00
asmith26	3850d5fb35	Add ollama embeddings function (#1263 ) Following the docs [here](https://lancedb.github.io/lancedb/python/python/#lancedb.embeddings.openai.OpenAIEmbeddings) I've been trying to use ollama embedding via the OpenAI API interface, but unfortunately I couldn't get it to work (possibly related to https://github.com/ollama/ollama/issues/2416) Given the popularity of ollama I thought it could be helpful to have a dedicated Ollama Embedding function in lancedb. Very much welcome any thought on this or my code etc. Thanks!	2024-05-13 13:09:19 +05:30
Will Jones	becd649130	docs: add tip about using allow_http on local servers (#1277 ) Based on user question https://discord.com/channels/1030247538198061086/1197630499926057021/1237350091191222293	2024-05-07 10:15:26 -07:00
Nehil Jain	e933de003d	fix: Docs for embed_func fixed in youtube transcript search notebook (#1269 ) Fixes issue https://github.com/lancedb/lancedb/issues/1268	2024-05-06 11:48:25 +05:30
asmith26	df48454b70	Update embedding_functions.md (#1250 ) `clip.ndims` seems to be a function (I installed with `pip install open_clip_torch`).	2024-05-01 09:33:42 -07:00
Alex Kohler	c1a7d65473	chore: fix get_registry call in baai embeddings example (#1230 )	2024-04-20 07:25:16 +05:30
Weston Pace	deb947ddbd	doc: fix typo, broken links (#1218 )	2024-04-11 14:58:51 -07:00


417 Commits