1239 Commits

Author SHA1 Message Date
Rob Meng
ee6c18f207 feat: expose underlying dataset uri of the table (#1704) 2024-09-27 10:20:02 -04:00
rjrobben
e606a455df fix(EmbeddingFunction): modify safe_model_dump to explicitly exclude class fields with underscore (#1688)
Resolve issue #1681

---------

Co-authored-by: rjrobben <rjrobben123@gmail.com>
2024-09-25 11:53:49 -07:00
Gagan Bhullar
8f0eb34109 fix: hnsw default partitions (#1667)
PR fixes #1662

---------

Co-authored-by: Will Jones <willjones127@gmail.com>
2024-09-25 09:16:03 -07:00
Ayush Chaurasia
2f2721e242 feat(python): allow explicit hybrid search query pattern in SaaS (feat parity) (#1698)
-  fixes https://github.com/lancedb/lancedb/issues/1697.
- unifies vector column inference logic for remote and local table to
prevent future disparities.
- Updates docstring in RemoteTable to specify empty queries are not
supported
2024-09-25 21:04:00 +05:30
QianZhu
f00b21c98c fix: metric type for python/node search api (#1689) 2024-09-24 16:10:29 -07:00
Lance Release
962b3afd17 Updating package-lock.json 2024-09-24 16:51:37 +00:00
Lance Release
b72ac073ab Bump version: 0.11.0-beta.0 → 0.11.0-beta.1 v0.11.0-beta.1 2024-09-24 16:51:16 +00:00
Bert
3152ccd13c fix: re-add hostOverride arg to ConnectionOptions (#1694)
Fixes an issue where hostOverride was no longer passed through to
RemoteConnection
2024-09-24 13:29:26 -03:00
Bert
d5021356b4 feat: add fast_search to vectordb (#1693) 2024-09-24 13:28:54 -03:00
Will Jones
e82f63b40a fix(node): pass no const enum (#1690)
Apparently this is a no-no for libraries.
https://ncjamieson.com/dont-export-const-enums/

Fixes [#1664](https://github.com/lancedb/lancedb/issues/1664)
2024-09-24 07:41:42 -07:00
Ayush Chaurasia
f81ce68e41 fix(python): force deduce vector column name if running explicit hybrid query (#1692)
Right now, when vector and query are passed explicitly for hybrid search,
vector_column_name is not deduced
(https://lancedb.github.io/lancedb/hybrid_search/hybrid_search/#hybrid-search-in-lancedb),
because vector and query can both be None when initialising the
QueryBuilder in this case. This PR forces deduction of the vector column
name when the query type is set to "hybrid".
2024-09-24 19:02:56 +05:30
Will Jones
f5c25b6fff ci: run clippy on tests (#1659) 2024-09-23 07:33:47 -07:00
Ayush Chaurasia
86978e7588 feat!: enforce all rerankers always return relevance score & deprecate linear combination fixes (#1687)
- Enforce that all rerankers always return _relevance_score. This was already
loosely done in tests, but based on user feedback it's better to
always have _relevance_score present in all reranked results.
- Deprecate LinearCombinationReranker in docs, and fix a case where
it would not return _relevance_score if one result set was missing.
2024-09-23 12:12:02 +05:30
Lei Xu
7c314d61cc chore: add error handling for openai embedding generation (#1680) 2024-09-23 12:10:56 +05:30
Lei Xu
7a8d2f37c4 feat(rust): add with_row_id to rust SDK (#1683) 2024-09-21 21:26:19 -07:00
Rithik Kumar
11072b9edc docs: phidata integration page (#1678)
Added a new integration page for phidata:

![image](https://github.com/user-attachments/assets/8cd9b420-f249-4eac-ac13-ae53983822be)
2024-09-21 00:40:47 +05:30
Lei Xu
915d828cee feat!: set embeddings to Null if embedding function return invalid results (#1674) 2024-09-19 23:16:20 -07:00
Lance Release
d9a72adc58 Updating package-lock.json 2024-09-19 17:53:19 +00:00
Lance Release
d6cf2dafc6 Bump version: 0.10.0 → 0.11.0-beta.0 v0.11.0-beta.0 2024-09-19 17:53:00 +00:00
Lance Release
38f0031d0b Bump version: 0.13.0 → 0.14.0-beta.0 python-v0.14.0-beta.0 2024-09-19 17:52:38 +00:00
LuQQiu
e118c37228 ci: enable java auto release (#1602)
Enable bumping Java pom.xml versions
Enable automatic Java release when a stable GitHub release is detected
2024-09-19 10:51:03 -07:00
LuQQiu
abeaae3d80 feat!: upgrade Lance to 0.18.0 (#1657)
BREAKING CHANGE: default file format changed to Lance v2.0.

Upgrade Lance to 0.18.0

Change notes: https://github.com/lancedb/lance/releases/tag/v0.18.0
2024-09-19 10:50:26 -07:00
Gagan Bhullar
b3c0227065 docs: hnsw documentation (#1640)
PR closes #1627

---------

Co-authored-by: Will Jones <willjones127@gmail.com>
2024-09-19 10:32:46 -07:00
Will Jones
521e665f57 feat(rust): remote client write data endpoint (#1645)
* Implements:
  * Add
  * Update
  * Delete
  * Merge-Insert

---------

Co-authored-by: Weston Pace <weston.pace@gmail.com>
2024-09-18 15:02:56 -07:00
Will Jones
ffb28dd4fc feat(rust): remote endpoints for schema, version, count_rows (#1644)
A handful of additional endpoints.
2024-09-16 08:19:25 -07:00
Lei Xu
32af962c0c feat: fix creating empty table and creating table by a list of RecordBatch for remote python sdk (#1650)
Closes #1637
2024-09-14 11:33:34 -07:00
Ayush Chaurasia
18484d0b6c fix: allow pass optional args in colbert reranker (#1649)
Fixes https://github.com/lancedb/lancedb/issues/1641
2024-09-14 11:18:09 -07:00
Lei Xu
c02ee3c80c chore: make remote client a context manager (#1648)
Allow `RemoteLanceDBClient` to be used as context manager
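
As a rough illustration of what context-manager support usually involves, here is a minimal sketch. `RemoteClientSketch` and its `closed` attribute are hypothetical stand-ins, not the actual `RemoteLanceDBClient` implementation; the sketch only assumes the client has some `close()`-style cleanup.

```python
class RemoteClientSketch:
    """Hypothetical stand-in for a remote client; not the lancedb class."""

    def __init__(self):
        self.closed = False

    def close(self):
        # Release the underlying session or connection pool (assumed cleanup).
        self.closed = True

    def __enter__(self):
        # Hand the client itself to the `with` block.
        return self

    def __exit__(self, exc_type, exc, tb):
        # Always clean up, then let any exception propagate.
        self.close()
        return False


with RemoteClientSketch() as client:
    pass  # use the client here; it is closed when the block exits
```

With this pattern the caller does not need to remember to call `close()`, even when the block raises.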
2024-09-13 22:08:48 -07:00
Rithik Kumar
dcd5f51036 docs: add understand embeddings v1 (#1643)
Before getting started with **managing embeddings**, let's **understand
embeddings** (the LanceDB way).

![Screenshot 2024-09-14
012144](https://github.com/user-attachments/assets/7c5435dc-5316-47e9-8d7d-9994ab13b93d)
2024-09-14 02:07:00 +05:30
Sayandip Dutta
9b8472850e fix: unterminated string literal on table update (#1573)
resolves #1429 
(python)

```python
-    return f"'{value}'"
+    return f'"{value}"'
```
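
To see why the old quoting can produce an unterminated literal, here is a small standalone sketch. The helper names `quote_single` and `quote_double` are hypothetical and only mirror the two lines of the diff above; they are not the lancedb functions themselves.

```python
def quote_single(value: str) -> str:
    # Old behavior from the diff: wrap in single quotes, no escaping.
    return f"'{value}'"


def quote_double(value: str) -> str:
    # New behavior from the diff: wrap in double quotes instead.
    return f'"{value}"'


value = "it's"
broken = quote_single(value)  # contains an odd number of single quotes
fixed = quote_double(value)   # the apostrophe no longer ends the literal
print(broken, fixed)
```

Note that under this sketch a value containing a double quote would hit the same problem with the new form; whether the real code handles that case is not stated in the commit message.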

---------

Co-authored-by: Will Jones <willjones127@gmail.com>
2024-09-13 12:32:59 -07:00
Sayandip Dutta
36d05ea641 fix: add appropriate QueryBuilder overloads to LanceTable.search (#1558)
- Add overloads to Table.search, to preserve the return information
of different types of QueryBuilder objects for LanceTable
- Fix the fts_column type annotation by making it `Optional`

resolves #1550
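
A minimal sketch of the overload technique, for readers unfamiliar with `typing.overload`. All class names and the exact `search` signature below are hypothetical; only the idea (one overload per query type, so type checkers can infer the builder returned) comes from the description above.

```python
from typing import List, Optional, Union, overload


class VectorQueryBuilderSketch:
    """Hypothetical builder returned for vector queries."""


class FtsQueryBuilderSketch:
    """Hypothetical builder returned for text queries."""


class TableSketch:
    @overload
    def search(
        self, query: List[float], fts_column: Optional[str] = None
    ) -> VectorQueryBuilderSketch: ...

    @overload
    def search(
        self, query: str, fts_column: Optional[str] = None
    ) -> FtsQueryBuilderSketch: ...

    def search(
        self, query: Union[List[float], str], fts_column: Optional[str] = None
    ) -> Union[VectorQueryBuilderSketch, FtsQueryBuilderSketch]:
        # Single runtime implementation; the overloads above only guide
        # static type checkers.
        if isinstance(query, str):
            return FtsQueryBuilderSketch()
        return VectorQueryBuilderSketch()
```

A type checker can then tell that `TableSketch().search("text")` yields the text builder, so builder-specific methods autocomplete correctly.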

---------

Co-authored-by: sayandip-dutta <sayandip.dutta@nevaehtech.com>
Co-authored-by: Will Jones <willjones127@gmail.com>
2024-09-13 12:32:30 -07:00
LuQQiu
7ed86cadfb feat(node): let NODE API region default to us-east-1 (#1631)
Fixes #1622 
To sync with python API
2024-09-13 11:48:57 -07:00
Will Jones
1c123b58d8 feat: implement Remote connection for LanceDB Rust (#1639)
* Adding a simple test facility, which allows you to mock a single
endpoint at a time with a closure.
* Implementing all the database-level endpoints

Table-level APIs will be done in a follow up PR.

---------

Co-authored-by: Weston Pace <weston.pace@gmail.com>
2024-09-13 10:53:27 -07:00
BubbleCal
bf7d2d6fb0 docs: update FTS docs for JS SDK (#1634)
Signed-off-by: BubbleCal <bubble-cal@outlook.com>
2024-09-13 05:48:29 -07:00
LuQQiu
c7732585bf fix: support pyarrow input types (#1628)
fixes #1625 
Support pa.RecordBatch, pa.dataset.Dataset, pa.dataset.Scanner, and
pa.RecordBatchReader
2024-09-12 10:59:18 -07:00
Prashant Dixit
b3bf6386c3 docs: rag section in guide (#1619)
This PR adds the RAG section to the Guides. It includes all the RAGs
with code snippets and some advanced techniques that improve RAG.
2024-09-11 21:13:55 +05:30
BubbleCal
4b79db72bf docs: improve the docs and API param name (#1629)
Signed-off-by: BubbleCal <bubble-cal@outlook.com>
2024-09-11 10:18:29 +08:00
Lance Release
622a2922e2 Updating package-lock.json 2024-09-10 20:12:54 +00:00
Lance Release
c91221d710 Bump version: 0.10.0-beta.2 → 0.10.0 v0.10.0 2024-09-10 20:12:41 +00:00
Lance Release
56da5ebd13 Bump version: 0.10.0-beta.1 → 0.10.0-beta.2 2024-09-10 20:12:40 +00:00
Lance Release
64eb43229d Bump version: 0.13.0-beta.2 → 0.13.0 python-v0.13.0 2024-09-10 20:12:35 +00:00
Lance Release
c31c92122f Bump version: 0.13.0-beta.1 → 0.13.0-beta.2 2024-09-10 20:12:35 +00:00
Gagan Bhullar
205fc530cf feat: expose hnsw indices (#1595)
PR closes #1522

---------

Co-authored-by: Will Jones <willjones127@gmail.com>
2024-09-10 11:08:13 -07:00
BubbleCal
2bde5401eb feat: support to build FTS without positions (#1621) 2024-09-10 22:51:32 +08:00
Antonio Molner Domenech
a405847f9b fix(python): remove unmaintained ratelimiter dependency (#1603)
The `ratelimiter` package hasn't been updated in ages and is no longer
maintained. This PR removes the dependency on `ratelimiter` and replaces
it with a custom rate limiter implementation.
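
For context, a custom rate limiter can be quite small. The sketch below is one possible sliding-window design and is not the implementation added in this PR; the class name, parameters, and the injectable `clock`/`sleep` hooks are all assumptions made for illustration.

```python
import time


class SimpleRateLimiter:
    """Hypothetical limiter: at most max_calls per period (in seconds)."""

    def __init__(self, max_calls, period, clock=time.monotonic, sleep=time.sleep):
        self.max_calls = max_calls
        self.period = period
        self._clock = clock  # injectable so the limiter can be tested
        self._sleep = sleep
        self._calls = []     # timestamps of calls inside the current window

    def _prune(self, now):
        # Keep only timestamps that are still inside the window.
        self._calls = [t for t in self._calls if now - t < self.period]

    def acquire(self):
        now = self._clock()
        self._prune(now)
        if len(self._calls) >= self.max_calls:
            # Wait until the oldest call leaves the window.
            self._sleep(self.period - (now - self._calls[0]))
            now = self._clock()
            self._prune(now)
        self._calls.append(now)


limiter = SimpleRateLimiter(max_calls=5, period=1.0)
limiter.acquire()  # call before each rate-limited request
```

The sketch is single-threaded; a shared limiter would need a lock around `acquire`.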

---------

Co-authored-by: Will Jones <willjones127@gmail.com>
2024-09-09 12:35:53 -07:00
Gagan Bhullar
bcc19665ce feat(nodejs): expose offset (#1620)
PR closes #1555
2024-09-09 11:54:40 -07:00
Will Jones
2a6586d6fb feat: add flag to enable faster manifest paths (#1612)
The new V2 manifest path scheme makes discovering the latest version of
a table constant time on object stores, regardless of the number of
versions in the table. See benchmarks in the PR here:
https://github.com/lancedb/lance/pull/2798

Closes #1583
2024-09-09 11:34:36 -07:00
James Wu
029b01bbbf feat: enable phrase_query(bool) for hybrid search queries (#1578)
first off, apologies for any folly since i'm new to contributing to
lancedb. this PR is the continuation of [a discord
thread](https://discord.com/channels/1030247538198061086/1030247538667827251/1278844345713299599):

## user story

here's the lance db search query i'd like to run:

```python
def search(phrase):
    logger.info(f'Searching for phrase: {phrase}')
    phrase_embedding = get_embedding(phrase)
    df = (table.search((phrase_embedding, phrase), query_type='hybrid')
        .limit(10).to_list())
    logger.info(f'Success search with row count: {len(df)}')

search('howdy (howdy)')
search('howdy(howdy)')
```

the second search fails due to `ValueError: Syntax Error: howdy(howdy)`

i saw on the
[docs](https://lancedb.github.io/lancedb/fts/#phrase-queries-vs-terms-queries)
that i can use `phrase_query()` to [enable a
flag](https://github.com/lancedb/lancedb/blob/main/python/python/lancedb/query.py#L790-L792)
to wrap the query in double quotes (as well as sanitize single quotes)
prior to sending the query to search. this works for [normal
FTS](https://lancedb.github.io/lancedb/fts/), but the command is
unavailable on [hybrid
search](https://lancedb.github.io/lancedb/hybrid_search/hybrid_search/).

## changes

i added `phrase_query()` function to `LanceHybridQueryBuilder` by
propagating the call down to its `self._fts_query` object. i'm not too
familiar with the codebase and am not sure if this is the best way to
implement the functionality. feel free to riff on this PR or discard
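
the delegation described above might look roughly like the sketch below. the class names are hypothetical stand-ins (not the real `LanceHybridQueryBuilder`), and the quoting is an assumption: it simply wraps the query in double quotes and omits any handling of quotes inside the query.

```python
class FtsQuerySketch:
    """Hypothetical full-text query builder."""

    def __init__(self, query):
        self._query = query
        self._phrase_query = False

    def phrase_query(self, phrase_query=True):
        self._phrase_query = phrase_query
        return self

    def text(self):
        # Assumption: phrase mode wraps the query in double quotes.
        return f'"{self._query}"' if self._phrase_query else self._query


class HybridQuerySketch:
    """Hypothetical hybrid builder that owns an FTS sub-query."""

    def __init__(self, query):
        self._fts_query = FtsQuerySketch(query)

    def phrase_query(self, phrase_query=True):
        # Propagate the flag down to the FTS sub-query.
        self._fts_query.phrase_query(phrase_query)
        return self


hybrid = HybridQuerySketch("howdy(howdy)").phrase_query()
print(hybrid._fts_query.text())
```

returning `self` from `phrase_query()` keeps the builder chainable, so it can sit before `.limit(10)` in the search call shown earlier.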


## tests

```
(lancedb) JamesMPB:python james$ pwd
/Users/james/src/lancedb/python
(lancedb) JamesMPB:python james$ pytest python/tests/test_table.py 
python/tests/test_table.py .......................................                                                                   [100%]
====================================================== 39 passed, 1 warning in 2.23s =======================================================
```
2024-09-07 08:58:05 +05:30
Will Jones
cd32944e54 feat: upgrade lance to v0.17.0 (#1608)
Changelog: https://github.com/lancedb/lance/releases/tag/v0.17.0

Highlights:

* You can do "phrase queries" by adding double quotes around phrases
(multiple tokens) in FTS.

Added follow ups in: https://github.com/lancedb/lancedb/issues/1611
2024-09-06 14:10:02 -07:00
Jon X
7eb3b52297 docs: added a blank line between a paragraph and a list block (#1604)
Although the markdown renders well on GitHub (GFM style?), it
seems that a blank line is required between a paragraph and
a list block for it to render well with `mkdocs`.

see also the web page:
https://lancedb.github.io/lancedb/concepts/index_hnsw/
2024-09-06 09:38:19 +05:30