lancedb

mirror of https://github.com/lancedb/lancedb.git synced 2025-12-25 22:29:58 +00:00

Author	SHA1	Message	Date
BubbleCal	8dcd328dce	feat: support to create table from record batch iterator (#1593 )	2024-09-06 10:41:38 +08:00
Gagan Bhullar	b24810a011	feat(python, rust): expose offset in query (#1556 ) PR is part of #1555	2024-09-05 08:33:07 -07:00
Ayush Chaurasia	03ef1dc081	feat: update default reranker to RRF (#1580 ) - Both LinearCombination (the current default) and RRF are pretty fast compared to model based rerankers. RRF is slightly faster. - In our tests RRF has also been slightly more accurate. This PR: - Makes RRF the default reranker - Removed duplicate docs for rerankers	2024-09-03 14:00:13 +05:30
Ayush Chaurasia	dc72ece847	feat!: better api for manual hybrid queries (#1575 ) Currently, the only documented way of performing hybrid search is by using the embedding API and passing string queries that get automatically embedded. There are use cases where users might like to pass vectors and text manually instead. This ticket contains more information and historical context - https://github.com/lancedb/lancedb/issues/937 This breaks an undocumented pathway that allowed passing (vector, text) tuple queries which was intended to be temporary, so this is marked as a breaking change. For all practical purposes, this should not really impact most users. ### usage ``` results = table.search(query_type="hybrid") .vector(vector_query) .text(text_query) .limit(5) .to_pandas() ```	2024-08-30 17:37:58 +05:30
BubbleCal	1521435193	fix: specify column to search for FTS (#1572 ) Before this we ignored the `fts_columns` parameter. For now we support searching on only one column, so ignoring the parameter could lead to an error if multiple columns are indexed for FTS --------- Signed-off-by: BubbleCal <bubble-cal@outlook.com>	2024-08-29 23:43:46 +08:00
Gagan Bhullar	a85f039352	fix(bug): limit fix (#1548 ) PR fixes #1151	2024-08-26 14:25:14 -07:00
Ayush Chaurasia	549ca51a8a	feat: add answerdotai rerankers support and minor improvements (#1560 ) This PR: - Adds missing license headers - Integrates with answerdotai Rerankers package - Updates ColbertReranker to subclass answerdotai package. This is done to keep backwards compatibility as some users might be used to importing ColbertReranker directly - Set `trust_remote_code` to `True` by default in CrossEncoder and sentence-transformer based rerankers	2024-08-26 13:25:10 +05:30
Lance Release	89bcc1b2e7	Bump version: 0.13.0-beta.0 → 0.13.0-beta.1	2024-08-23 13:56:30 +00:00
Gagan Bhullar	6eb7ccfdee	fix: rerank attribute unknown (#1554 ) PR fixes #1550	2024-08-22 11:46:36 +05:30
Ayush Chaurasia	7d65dd97cf	chore(python): update Colbert architecture and minor improvements (#1547 ) - Update ColBertReranker architecture: The current implementation doesn't use the right arch. This PR uses the implementation in Rerankers library. Fixes https://github.com/lancedb/lancedb/issues/1546 Benchmark diff (hit rate): Hybrid - 91 vs 87 reranked vector - 85 vs 80 - Reranking in FTS is basically disabled in main after last week's FTS updates. I think there's no blocker in supporting that? - Allow overriding accelerators: Most transformer based Rerankers and Embedding automatically select device. This PR allows overriding those settings by passing `device`. Fixes: https://github.com/lancedb/lancedb/issues/1487 --------- Co-authored-by: BubbleCal <bubble-cal@outlook.com>	2024-08-21 12:26:52 +05:30
Lei Xu	5857cb4c6e	docs: add a section to describe scalar index (#1495 )	2024-08-16 18:48:29 -07:00
BubbleCal	0fa50775d6	feat: support to query/index FTS on RemoteTable/AsyncTable (#1537 ) Signed-off-by: BubbleCal <bubble-cal@outlook.com>	2024-08-16 12:01:05 +08:00
Gagan Bhullar	20faa4424b	feat(python): add delete unverified parameter (#1542 ) PR fixes #1527	2024-08-15 09:01:32 -07:00
BubbleCal	b624fc59eb	docs: add `create_fts_index` doc in Python API Reference (#1533 ) resolve #1313 --------- Signed-off-by: BubbleCal <bubble-cal@outlook.com>	2024-08-15 11:35:16 +08:00
BubbleCal	501817cfac	chore: bump the required python version to 3.9 (#1541 ) Signed-off-by: BubbleCal <bubble-cal@outlook.com>	2024-08-14 08:44:31 -07:00
Ryan Green	b3daa25f46	feat: allow new scalar index types to be created in remote table (#1538 )	2024-08-13 16:05:42 -02:30
Lance Release	ff5bbfdd4c	Bump version: 0.12.0 → 0.13.0-beta.0	2024-08-12 19:47:57 +00:00
Lei Xu	b2317c904d	feat: create bitmap and label list scalar index using python async api (#1529 ) * Expose `bitmap` and `LabelList` scalar index type via Rust and Async Python API * Add documents	2024-08-11 09:16:11 -07:00
BubbleCal	613f3063b9	chore: upgrade lance to 0.16.1 (#1524 ) Signed-off-by: BubbleCal <bubble-cal@outlook.com>	2024-08-09 19:18:05 +08:00
Gagan Bhullar	9c1adff426	feat(python): add to_list to async api (#1520 ) PR fixes #1517	2024-08-08 11:45:20 -07:00
BubbleCal	f9d5fa88a1	feat!: migrate FTS from tantivy to lance-index (#1483 ) Lance now supports FTS, so add it into lancedb Python, TypeScript and Rust SDKs. For Python, we still use tantivy based FTS by default because the lance FTS index currently lacks some features of tantivy. For Python: - Support creating a lance based FTS index - Support specifying columns for full text search (only available for lance based FTS index) For TypeScript: - Change the search method so that it can accept both string and vector - Support full text search For Rust - Support full text search The others: - Update the FTS doc BREAKING CHANGE: - for Python, this renames the attached score column of FTS from "score" to "_score", this could be a breaking change for users that rely on the scores --------- Signed-off-by: BubbleCal <bubble-cal@outlook.com>	2024-08-08 15:33:15 +08:00
Lance Release	ec39d98571	Bump version: 0.12.0-beta.0 → 0.12.0	2024-08-07 20:55:40 +00:00
Lance Release	0cb37f0e5e	Bump version: 0.11.0 → 0.12.0-beta.0	2024-08-07 20:55:39 +00:00
Lei Xu	2bdf0a02f9	feat!: upgrade lance to 0.16 (#1519 )	2024-08-07 13:15:22 -07:00
Gagan Bhullar	32123713fd	feat(python): optimize stats repr method (#1510 ) PR fixes #1507	2024-08-07 08:47:52 -07:00
Gagan Bhullar	d5a01ffe7b	feat(python): index config repr method (#1509 ) PR fixes #1506	2024-08-07 08:46:46 -07:00
Ayush Chaurasia	e01045692c	feat(python): support embedding functions in remote table (#1405 )	2024-08-07 20:22:43 +05:30
Ayush Chaurasia	4769d8eb76	feat(python): multi-vector reranking support (#1481 ) Currently targeting the following usage: ``` from lancedb.rerankers import CrossEncoderReranker reranker = CrossEncoderReranker() query = "hello" res1 = table.search(query, vector_column_name="vector").limit(3) res2 = table.search(query, vector_column_name="text_vector").limit(3) res3 = table.search(query, vector_column_name="meta_vector").limit(3) reranked = reranker.rerank_multivector( [res1, res2, res3], deduplicate=True, query=query # some reranker models need query ) ``` - This implements rerank_multivector function in the base reranker so that all rerankers that implement rerank_vector will automatically have multivector reranking support - Special case for RRF reranker that just uses its existing rerank_hybrid fcn to multi-vector reranking. --------- Co-authored-by: Weston Pace <weston.pace@gmail.com>	2024-08-07 01:45:46 +05:30
Ayush Chaurasia	d07d7a5980	chore: update polars version range (#1508 )	2024-08-06 23:43:15 +05:30
Robby	8d2ff7b210	feat(python): add watsonx embeddings to registry (#1486 ) Related issue: https://github.com/lancedb/lancedb/issues/1412 --------- Co-authored-by: Robby <h0rv@users.noreply.github.com>	2024-08-06 10:58:33 +05:30
Ryan Green	6af69b57ad	fix: return LanceMergeInsertBuilder in overridden merge_insert method on remote table (#1484 )	2024-07-31 12:25:16 -02:30
Lance Release	7b6d3f943b	Bump version: 0.11.0-beta.0 → 0.11.0	2024-07-26 20:18:31 +00:00
Lance Release	676876f4d5	Bump version: 0.10.2 → 0.11.0-beta.0	2024-07-26 20:18:30 +00:00
Will Jones	9555efacf9	feat: upgrade lance to 0.15.0 (#1477 ) Changelog: https://github.com/lancedb/lance/releases/tag/v0.15.0 * Fixes #1466 * Closes #1475 * Fixes #1446	2024-07-26 09:13:49 -07:00
Chang She	374c1e7aba	fix: infer schema from huggingface dataset (#1444 ) Closes #1383 When creating a table from a HuggingFace dataset, infer the arrow schema directly	2024-07-23 13:12:34 -07:00
Ayush Chaurasia	0255221086	feat: add reciprocal rank fusion reranker (#1456 ) Implements https://plg.uwaterloo.ca/~gvcormac/cormacksigir09-rrf.pdf Refactors the hybrid search only rerankers test to avoid repetition.	2024-07-23 21:37:17 +05:30
Lance Release	1d5da1d069	Bump version: 0.10.2-beta.0 → 0.10.2	2024-07-23 13:48:48 +00:00
Lance Release	0c0ec1c404	Bump version: 0.10.1 → 0.10.2-beta.0	2024-07-23 13:48:47 +00:00
Weston Pace	d4aad82aec	fix: don't use v2 by default on empty table (#1469 )	2024-07-23 06:47:49 -07:00
Will Jones	4f601a2d4c	fix: handle camelCase column names in select (#1460 ) Fixes #1385	2024-07-22 12:53:17 -07:00
Lei Xu	c9c61eb060	docs: expose merge_insert doc for remote python SDK (#1464 ) `merge_insert` API is not shown up on [`RemoteTable`](https://lancedb.github.io/lancedb/python/saas-python/#lancedb.remote.table.RemoteTable) today * Also bump `ruff` version as well	2024-07-22 10:48:16 -07:00
Ayush Chaurasia	ed7bd45c17	chore: choose appropriate args for concat_table based on pyarrow version & refactor reranker tests (#1455 )	2024-07-18 21:04:59 +05:30
Magnus	dc609a337d	fix: added support for trust_remote_code (#1454 ) Closes #1285 Added trust_remote_code to the SentenceTransformerEmbeddings class. Defaults to `False`	2024-07-18 19:37:52 +05:30
Lance Release	2c36767f20	Bump version: 0.10.1-beta.0 → 0.10.1	2024-07-17 14:04:40 +00:00
Lance Release	1fa7e96aa1	Bump version: 0.10.0 → 0.10.1-beta.0	2024-07-17 14:04:39 +00:00
Lance Release	11959cc5d6	Bump version: 0.10.0-beta.0 → 0.10.0	2024-07-13 08:55:22 +00:00
Lance Release	7c65cec8d7	Bump version: 0.9.0 → 0.10.0-beta.0	2024-07-13 08:55:22 +00:00
Adam Azzam	82621d5b13	chore: typing for lance.connect (#1441 ) Feel free to close if this is a distraction, but untyped keywords in lance.connect are throwing pylance errors in strict mode.	2024-07-12 10:39:28 -07:00
Lei Xu	0708428357	feat: support update over binary field (#1440 )	2024-07-12 09:22:00 -07:00
BubbleCal	137d86d3c5	chore: bump lance to 0.14.1 (#1442 ) Signed-off-by: BubbleCal <bubble-cal@outlook.com>	2024-07-12 21:41:59 +08:00

434 Commits