Commit Graph

1320 Commits

Author SHA1 Message Date
Ayush Chaurasia
90e9c52d0a docs: update hybrid search example to latest langchain (#1824)
Co-authored-by: qzhu <qian@lancedb.com>
2024-11-12 20:06:25 -08:00
Will Jones
68974a4e06 ci: add index URL to fix failing docs build (#1823) 2024-11-12 16:54:22 -08:00
Lei Xu
4c9bab0d92 fix: use pandas with pydantic embedding column (#1818)
* Make Pandas `DataFrame` works with embedding function + Subset of
columns
* Make `lancedb.create_table()` work with embedding function
2024-11-11 14:48:56 -08:00
QianZhu
5117aecc38 docs: search param explanation for OSS doc (#1815)
![Screenshot 2024-11-09 at 11 09
14 AM](https://github.com/user-attachments/assets/2aeba016-aeff-4658-85c6-8640285ba0c9)
2024-11-11 11:57:17 -08:00
Umut Hope YILDIRIM
729718cb09 fix: arm64 runner proto already installed bug (#1810)
https://github.com/lancedb/lancedb/actions/runs/11748512661/job/32732745458
2024-11-08 14:49:37 -08:00
Umut Hope YILDIRIM
b1c84e0bda feat: added lancedb and vectordb release ci for win32-arm64-msvc npmjs only (#1805) 2024-11-08 11:40:57 -08:00
fzowl
cbbc07d0f5 feat: voyageai support (#1799)
Adding VoyageAI embedding and rerank support
2024-11-09 00:51:20 +05:30
Kursat Aktas
21021f94ca docs: introducing LanceDB Guru on Gurubase.io (#1797)
Hello team,

I'm the maintainer of [Anteon](https://github.com/getanteon/anteon). We
have created Gurubase.io with the mission of building a centralized,
open-source tool-focused knowledge base. Essentially, each "guru" is
equipped with custom knowledge to answer user questions based on
collected data related to that tool.

I wanted to update you that I've manually added the [LanceDB
Guru](https://gurubase.io/g/lancedb) to Gurubase. LanceDB Guru uses the
data from this repo and data from the
[docs](https://lancedb.github.io/lancedb/) to answer questions by
leveraging the LLM.

In this PR, I showcased the "LanceDB Guru", which highlights that
LanceDB now has an AI assistant available to help users with their
questions. Please let me know your thoughts on this contribution.

Additionally, if you want me to disable LanceDB Guru in Gurubase, just
let me know that's totally fine.

Signed-off-by: Kursat Aktas <kursat.ce@gmail.com>
2024-11-08 10:55:22 -08:00
BubbleCal
0ed77fa990 chore: impl Debug & Clone for Index params (#1808)
we don't really need these trait in lancedb, but all fields in `Index`
implement the 2 traits, so do it for possibility to use `Index`
somewhere

Signed-off-by: BubbleCal <bubble-cal@outlook.com>
2024-11-09 01:07:43 +08:00
BubbleCal
4372c231cd feat: support optimize indices in sync API (#1769)
Signed-off-by: BubbleCal <bubble-cal@outlook.com>
2024-11-08 08:48:07 -08:00
Umut Hope YILDIRIM
fa9ca8f7a6 ci: arm64 windows build support (#1770)
Adds support for 'aarch64-pc-windows-msvc'.
2024-11-06 15:34:23 -08:00
Lance Release
2a35d24ee6 Updating package-lock.json 2024-11-06 17:26:36 +00:00
Lance Release
dd9ce337e2 Bump version: 0.13.0-beta.0 → 0.13.0-beta.1 v0.13.0-beta.1 2024-11-06 17:26:17 +00:00
Will Jones
b9921d56cc fix(node): update default log level to warn (#1801)
🤦
2024-11-06 09:13:53 -08:00
Lance Release
0cfd9ed18e Updating package-lock.json 2024-11-05 23:21:50 +00:00
Lance Release
975398c3a8 Bump version: 0.12.0 → 0.13.0-beta.0 v0.13.0-beta.0 2024-11-05 23:21:32 +00:00
Lance Release
08d5f93f34 Bump version: 0.15.0 → 0.16.0-beta.0 python-v0.16.0-beta.0 2024-11-05 23:21:13 +00:00
Will Jones
91cab3b556 feat(python): transition Python remote sdk to use Rust implementation (#1701)
* Replaces Python implementation of Remote SDK with Rust one.
* Drops dependency on `attrs` and `cachetools`. Makes `requests` an
optional dependency used only for embeddings feature.
* Adds dependency on `nest-asyncio`. This was required to get hybrid
search working.
* Deprecate `request_thread_pool` parameter. We now use the tokio
threadpool.
* Stop caching the `schema` on a remote table. Schema is mutable and
there's no mechanism in place to invalidate the cache.
* Removed the client-side resolution of the vector column. We should
already be resolving this server-side.
2024-11-05 13:44:39 -08:00
Will Jones
c61bfc3af8 chore: update package locks (#1798) 2024-11-05 13:28:59 -08:00
Bert
4e8c7b0adf fix: serialize vectordb client errors as json (#1795) 2024-11-05 14:16:25 -05:00
Weston Pace
26f4a80e10 feat: upgrade to lance 0.19.2-beta.3 (#1794) 2024-11-05 06:43:41 -08:00
Will Jones
3604d20ad3 feat(python,node): support with_row_id in Python and remote (#1784)
Needed to support hybrid search in Remote SDK.
2024-11-04 11:25:45 -08:00
Gagan Bhullar
9708d829a9 fix: explain plan options (#1776)
PR fixes #1768
2024-11-04 10:25:34 -08:00
Will Jones
059c9794b5 fix(rust): fix update, open_table, fts search in remote client (#1785)
* `open_table` uses `POST` not `GET`
* `update` uses `predicate` key not `only_if`
* For FTS search, vector cannot be omitted. It must be passed as empty.
* Added logging of JSON request bodies to debug level logging.
2024-11-04 08:27:55 -08:00
Will Jones
15ed7f75a0 feat(python): support post filter on FTS (#1783) 2024-11-01 10:05:05 -07:00
Will Jones
96181ab421 feat: fast_search in Python and Node (#1623)
Sometimes it is acceptable to users to only search indexed data and skip
and new un-indexed data. For example, if un-indexed data will be shortly
indexed and they don't mind the delay. In these cases, we can save a lot
of CPU time in search, and provide better latency. Users can activate
this on queries using `fast_search()`.
2024-11-01 09:29:09 -07:00
Will Jones
f3fc339ef6 fix(rust): fix delete, update, query in remote SDK (#1782)
Fixes several minor issues with Rust remote SDK:

* Delete uses `predicate` not `filter` as parameter
* Update does not return the row value in remote SDK
* Update takes tuples
* Content type returned by query node is wrong, so we shouldn't validate
it. https://github.com/lancedb/sophon/issues/2742
* Data returned by query endpoint is actually an Arrow IPC file, not IPC
stream.
2024-10-31 15:22:09 -07:00
Will Jones
113cd6995b fix: index_stats works for FTS indices (#1780)
When running `index_stats()` for an FTS index, users would get the
deserialization error:

```
InvalidInput { message: "error deserializing index statistics: unknown variant `Inverted`, expected one of `IvfPq`, `IvfHnswPq`, `IvfHnswSq`, `BTree`, `Bitmap`, `LabelList`, `FTS` at line 1 column 24" }
```
2024-10-30 11:33:49 -07:00
Lance Release
02535bdc88 Updating package-lock.json 2024-10-29 22:16:51 +00:00
Lance Release
facc7d61c0 Bump version: 0.12.0-beta.0 → 0.12.0 v0.12.0 2024-10-29 22:16:32 +00:00
Lance Release
f947259f16 Bump version: 0.11.1-beta.1 → 0.12.0-beta.0 2024-10-29 22:16:27 +00:00
Lance Release
e291212ecf Bump version: 0.15.0-beta.0 → 0.15.0 python-v0.15.0 2024-10-29 22:16:05 +00:00
Lance Release
edc6445f6f Bump version: 0.14.1-beta.1 → 0.15.0-beta.0 2024-10-29 22:16:05 +00:00
Will Jones
a324f4ad7a feat(node): enable logging and show full errors (#1775)
This exposes the `LANCEDB_LOG` environment variable in node, so that
users can now turn on logging.

In addition, fixes a bug where only the top-level error from Rust was
being shown. This PR makes sure the full error chain is included in the
error message. In the future, will improve this so the error chain is
set on the [cause](https://nodejs.org/api/errors.html#errorcause)
property of JS errors https://github.com/lancedb/lancedb/issues/1779

Fixes #1774
2024-10-29 15:13:34 -07:00
Weston Pace
55104c5bae feat: allow distance type (metric) to be specified during hybrid search (#1777) 2024-10-29 13:51:18 -07:00
Rithik Kumar
d71df4572e docs: revamp langchain integration page (#1773)
Before - 
<img width="1030" alt="Screenshot 2024-10-28 132932"
src="https://github.com/user-attachments/assets/63f78bfa-949e-473e-ab22-0c692577fa3e">


After - 
<img width="1037" alt="Screenshot 2024-10-28 132727"
src="https://github.com/user-attachments/assets/85a12f6c-74f0-49ba-9f1a-fe77ad125704">
2024-10-29 22:55:50 +05:30
Rithik Kumar
aa269199ad docs: fix archived examples links (#1751) 2024-10-29 22:55:27 +05:30
BubbleCal
32fdcf97db feat!: upgrade lance to 0.19.1 (#1762)
BREAKING CHANGE: default tokenizer no longer does stemming or stop-word
removal. Users should explicitly turn that option on in the future.

- upgrade lance to 0.19.1
- update the FTS docs
- update the FTS API

Upstream change notes:
https://github.com/lancedb/lance/releases/tag/v0.19.1

---------

Signed-off-by: BubbleCal <bubble-cal@outlook.com>
Co-authored-by: Will Jones <willjones127@gmail.com>
2024-10-29 09:03:52 -07:00
Ryan Green
b9802a0d23 Revert "fix: error during deserialization of "INVERTED" index type"
This reverts commit 2ea5939f85.
2024-10-25 14:46:47 -02:30
Ryan Green
2ea5939f85 fix: error during deserialization of "INVERTED" index type 2024-10-25 14:40:14 -02:30
Lance Release
04e1f1ee4c Updating package-lock.json 2024-10-23 00:34:22 +00:00
Lance Release
bbc588e27d Bump version: 0.11.1-beta.0 → 0.11.1-beta.1 v0.11.1-beta.1 2024-10-23 00:34:01 +00:00
Lance Release
5517e102c3 Bump version: 0.14.1-beta.0 → 0.14.1-beta.1 python-v0.14.1-beta.1 2024-10-23 00:33:40 +00:00
Will Jones
82197c54e4 perf: eliminate iop in refresh (#1760)
Closes #1741

If we checkout a version, we need to make a `HEAD` request to get the
size of the manifest. The new `checkout_latest()` code path can skip
this IOP. This makes the refresh slightly faster.
2024-10-18 13:40:24 -07:00
Will Jones
48f46d4751 docs(node): update indexStats signature and regenerate docs (#1742)
`indexStats` still referenced UUID even though in
https://github.com/lancedb/lancedb/pull/1702 we changed it to take name
instead.
2024-10-18 10:53:28 -07:00
Lance Release
437316cbbc Updating package-lock.json 2024-10-17 18:59:18 +00:00
Lance Release
d406eab2c8 Bump version: 0.11.0 → 0.11.1-beta.0 v0.11.1-beta.0 2024-10-17 18:59:01 +00:00
Lance Release
1f41101897 Bump version: 0.14.0 → 0.14.1-beta.0 python-v0.14.1-beta.0 2024-10-17 18:58:45 +00:00
Will Jones
99e4db0d6a feat(rust): allow add_embedding on create_empty_table (#1754)
Fixes https://github.com/lancedb/lancedb/issues/1750
2024-10-17 11:58:15 -07:00
Will Jones
46486d4d22 fix: list_indices can handle fts indexes (#1753)
Fixes #1752
2024-10-16 10:39:40 -07:00