Commit Graph

363 Commits

Author SHA1 Message Date
Varun Chawla
2802764092 fix(embeddings): stop retrying OpenAI 401 authentication errors (#2995)
## Summary
Fixes #1679

This PR prevents the OpenAI embedding function from retrying when
receiving a 401 Unauthorized error. Authentication errors are permanent
failures that won't be fixed by retrying, yet the current implementation
retries all exceptions up to 7 times by default.

## Changes
- Modified `retry_with_exponential_backoff` in `utils.py` to check for
non-retryable errors before retrying
- Added `_is_non_retryable_error` helper function that detects:
  - Exceptions with name `AuthenticationError` (OpenAI's 401 error)
  - Exceptions with `status_code` attribute of 401 or 403
- Enhanced OpenAI embeddings to explicitly catch and re-raise
`AuthenticationError` with better logging
- Added unit test `test_openai_no_retry_on_401` to verify authentication
errors don't trigger retries

## Test Plan
- Added test that verifies:
  1. A function raising `AuthenticationError` is only called once
  2. No retry delays occur (sleep is never called)
- Existing tests continue to pass
- Formatting applied via `make format`

## Example Behavior

**Before**: With an invalid API key, users would see 7 retry attempts
over ~2 minutes:
```
WARNING:root:Error occurred: Error code: 401 - {'error': {'message': 'Incorrect API key provided...'}}
 Retrying in 3.97 seconds (retry 1 of 7)
WARNING:root:Error occurred: Error code: 401...
 Retrying in 7.94 seconds (retry 2 of 7)
...
```

**After**: With an invalid API key, the error is raised immediately:
```
ERROR:root:Authentication failed: Invalid API key provided
AuthenticationError: Error code: 401 - {'error': {'message': 'Incorrect API key provided...'}}
```

This provides better UX and prevents unnecessary API calls that would
fail anyway.

---------

Co-authored-by: Will Jones <willjones127@gmail.com>
2026-02-19 09:20:54 -08:00
Prashanth Rao
155ec16161 fix: deprecate outdated files for embedding registry (#3037)
There are old and outdated files in our embedding registry that can
confuse coding agents. This PR deprecates the following files that have
newer, more modern methods to generate such embeddings.

- Deprecate `embeddings/siglip.py` 
- Deprecate `embeddings/gte.py` 

## Why this change?

Per a discussion with @AyushExel, the [embedding registry directory
](1840aa7edc/python/python/lancedb/embeddings)
in the LanceDB repo has a number of outdated files that need to be
deprecated.

See https://github.com/lancedb/docs/issues/85 for the docs gaps that
identified this.
- Add note in `openclip` docs that it can be used for SigLip embeddings,
which it now supports
- Add note in the `sentence-transformers` page that ALL text embedding
models on Hugging Face can be used
2026-02-18 12:04:39 -05:00
Omair Afzal
715b81c86b fix(python): graceful handling of empty result sets in hybrid search (#3030)
## Problem

When applying hard filters that result in zero matches, hybrid search
crashes with `IndexError: list index out of range` during reranking.
This happens because empty result tables are passed through the full
reranker pipeline, which expects at least one result.

Traceback from the issue:
```
lancedb/query.py: in _combine_hybrid_results
    results = reranker.rerank_hybrid(fts_query, vector_results, fts_results)
lancedb/rerankers/answerdotai.py: in rerank_hybrid
    combined_results = self._rerank(combined_results, query)
...
IndexError: list index out of range
```

## Fix

Added an early return in `_combine_hybrid_results` when both vector and
FTS results are empty. Instead of passing empty tables through
normalization, reranking, and score restoration (which can fail in
various ways), we now build a properly-typed empty result table with the
`_relevance_score` column and return it directly.

## Test

Added `test_empty_hybrid_result_reranker` that exercises
`_combine_hybrid_results` directly with empty vector and FTS tables,
verifying:
- Returns empty table with correct schema  
- Includes `_relevance_score` column
- Respects `with_row_ids` flag

Closes #2425
2026-02-17 11:37:10 -08:00
Weston Pace
70cbee6293 feat: improve Permutation pytorch integration (#3016)
This changes around the output format of `Permutation` in some breaking
ways but I think the API is still new enough to be considered
experimental.

1. In order to align with both huggingface's dataset and torch's
expectations the default output format is now a list of dicts
(row-major) instead of a dict of lists (column-major). I've added a
python_col option which will return the dict of lists.
2. In order to align with pytorch's expectation the `torch` format is
now a list of tensors (row-major) instead of a 2D tensor (column-major).
I've added a torch_col option which will return the 2D tensor instead.

Added tests for torch integration with Permutation

~~Leaving draft until https://github.com/lancedb/lancedb/pull/3013
merges as this is built on top of that~~
2026-02-12 13:41:14 -08:00
Weston Pace
02783bf440 feat: add a getitems implementation for the permutation (#3013) 2026-02-12 05:36:11 -08:00
Dhruv
4323ca0147 feat: show reranker info in hybrid search explain plan (#3006)
Closes #3000

The hybrid search `explain_plan` now shows the reranker as the top-level
node with
the vector and FTS sub-plans indented underneath, instead of just
listing them
separately with no reranker context.

**Before:**
```
Vector Search Plan:
ProjectionExec: ...
FTS Search Plan:
ProjectionExec: ...
```

**After:**
```
RRFReranker(K=60)
  Vector Search Plan:
  ProjectionExec: ...
  FTS Search Plan:
  ProjectionExec: ...
```

Other rerankers display similarly ; e.g.
`LinearCombinationReranker(weight=0.7, fill=1.0)`,
`MRRReranker(weight_vector=0.5, weight_fts=0.5)`,
`CohereReranker(model_name=name)`.

---------

Signed-off-by: dask-58 <googldhruv@gmail.com>
Co-authored-by: Will Jones <willjones127@gmail.com>
2026-02-10 11:45:39 -08:00
Dhruv
bd3dd6a8e5 fix: improve error message for multi-field FTS index creation (#3005)
Fixes #2999

The error message previously said `"field_names must be a string when
use_tantivy=False"` implying they should use the to be deprecated
tantivy backend #2998.

Updated the error message and docstring to instead guide users to create
a separate FTS index for each field

Signed-off-by: dask-58 <googldhruv@gmail.com>
2026-02-09 16:28:50 -08:00
Jack Ye
0859312b83 feat: add initial and latest storage options apis (#2966)
Expose `initial_storage_options()` and `latest_storage_options()` in
lance Dataset, in lancedb rust, python and typescript SDKs.

---------

Co-authored-by: Claude Opus 4.5 <noreply@anthropic.com>
2026-02-04 10:31:39 -08:00
Jack Ye
bd2c6d0763 chore: update lance dependency to v2.0.0-rc.4 (#2972) 2026-02-03 14:38:39 -08:00
Rashid Ul Islam
c3cc2530b7 feat(python): expose fast_search in synchronous API (Fixes #2612) (#2962)
Fixes #2612

This PR exposes the private _fast_search attribute via a public
fast_search() method in the synchronous LanceVectorQueryBuilder.

Previously, enabling fast search in the sync API required accessing a
private member (query._fast_search = True). This change aligns the
synchronous API with the Async and Remote APIs, allowing for cleaner,
more Pythonic method chaining.

Changes:
Added fast_search() method to LanceVectorQueryBuilder in
python/python/lancedb/query.py.
Added a unit test verifying the flag works with high-dimensional data
(2560 dims) and chaining.
Example Usage:

Before:

```
query = table.search(vector)
query._fast_search = True  # Private attribute usage
results = query.limit(10).to_pandas()
```

After:

```
results = (
    table.search(vector)
    .fast_search()
    .limit(10)
    .to_pandas()
)
```

Verification:
I have added a test case (test_fast_search_high_dimension) that
replicates the scenario described in the issue (2560 dimensions, cosine
distance) to ensure the pipeline constructs the query correctly without
errors.

Checklist:

- [ ]  I have added tests to cover my changes.
- [ ]  All new and existing tests passed.
- [ ]  Documentation has been updated (inline docstrings).

Signed-off-by: Rashidul Islam <rasidulislam71@gmail.com>
2026-02-03 09:17:27 -08:00
Will Jones
131024839f fix: include _rowid in hash and calculated split projections (#2965)
## Summary

- PR #2957 changed the permutation builder to only select `_rowid` from
the base table, but `Splitter::project()` for hash and calculated splits
replaced the selection entirely, dropping `_rowid`.
- Include `_rowid` in the column selections for hash and calculated
split projections.
- Fix a Python test that queried the permutation table for base table
columns no longer materialized.

Fixes the `test_split_hash`, `test_split_hash_with_discard`,
`test_split_calculated`, `test_shuffle_combined_with_splits`, and
`test_filter_with_splits` failures in `test_permutation.py`.

## Test plan

- [x] `cargo test -p lancedb -- permutation` (22 passed)
- [x] `pytest python/tests/test_permutation.py` (46 passed)
- [x] `npm test __test__/permutation.test.ts` (20 passed)

🤖 Generated with [Claude Code](https://claude.com/claude-code)

---------

Co-authored-by: Claude Opus 4.5 <noreply@anthropic.com>
2026-02-02 16:27:58 -08:00
Aman Harsh
3b8996bb69 fix(python): cancel remote queries on sync API interruption (#2913)
Fixes #2898 

Problem:
Sync API cancellations didn’t stop remote query coroutines, so requests
could continue after interrupt.

Changes:
- Cancel run_coroutine_threadsafe futures on any BaseException in the
sync background loop
- Update cancellation test to avoid starting a real background thread
and cover GeneratorExit
2026-01-30 15:47:18 -08:00
Xin Sun
8773b865a9 fix(python): uses PIL incorrectly and may raise AttributeError (#2954)
Importing `PIL` alone does not guarantee that the `Image` submodule is
loaded. In a clean environment where no other code has imported
`PIL.Image` before, `PIL.Image` does not exist on the `PIL` package,
which leads to the AttributeError.
2026-01-30 15:33:10 -08:00
fzowl
1ee29675b3 feat(python): adding VoyageAI v4 models (#2959)
Adding VoyageAI v4 models
 - with these, i added unit tests
 - added example code (tested!)
2026-01-30 15:16:03 -08:00
Lei Xu
357197bacc chore!: change support python version from 3.10 to 3.13 (#2955)
Python 3.9 is EOL since Oct 2025. and last two pyarrow builts were
against python3.10-3.13.

* This PR is contributed by codex-gpt5.2
2026-01-30 01:47:50 +08:00
Lei Xu
ad51e2dd1f fix: support pydantic list of structs or optional struct (#2953)
Closes #2950

*This code is generated by codex-gpt5.2*
2026-01-28 21:08:18 -08:00
Jack Ye
e4552e577a chore(revert): revert update lance dependency to v2.0.0-rc.1 (#2936) (#2941)
This reverts commit bd84bba14d, so that we
can bump version to 1.0.4-rc.1
2026-01-26 11:13:59 -08:00
LanceDB Robot
bd84bba14d chore: update lance dependency to v2.0.0-rc.1 (#2936)
## Summary
- bump Lance dependencies to v2.0.0-rc.1 (git tag)
- align Arrow/DataFusion/PyO3 versions for the new Lance release
- update Python bindings for PyO3 0.26 (attach API + Py<PyAny>)

## Verification
- `cargo clippy --workspace --tests --all-features -- -D warnings`
- `cargo fmt --all`

## Reference
- https://github.com/lance-format/lance/releases/tag/v2.0.0-rc.1

---------

Co-authored-by: Jack Ye <yezhaoqin@gmail.com>
Co-authored-by: Will Jones <willjones127@gmail.com>
Co-authored-by: Claude Opus 4.5 <noreply@anthropic.com>
Co-authored-by: BubbleCal <bubble_cal@outlook.com>
2026-01-22 13:14:38 -08:00
Jack Ye
f124c9d8d2 test: string type conversion in pandas 3.0+ (#2928)
Pandas 3.0+ string now converts to Arrow large_utf8. This PR mainly
makes sure our test accounts for the difference across the pandas
versions when constructing schema.
2026-01-21 13:40:48 -08:00
Jack Ye
4e65748abf chore: update lance dependency to v1.0.3-rc.1 (#2927)
Supercedes https://github.com/lancedb/lancedb/pull/2925

We accidentally upgraded lance to 2.0.0-beta.8. This PR reverts that
first and then bump to 1.0.3-rc.1

---------

Co-authored-by: Claude Opus 4.5 <noreply@anthropic.com>
2026-01-21 11:52:07 -08:00
Ryan Green
cd5f91bb7d feat: expose table uri (#2922)
* Expose `table.uri` property for all tables, including remote tables
* Fix bug in path calculation on windows file systems
2026-01-20 19:56:46 -03:30
LanceDB Robot
4da01a0e65 chore: update lance dependency to v2.0.0-beta.8 (#2907)
## Summary
- bump Lance crates to v2.0.0-beta.8 and align
arrow/datafusion/regex/half and PyO3 dependencies
- update Rust/Python bindings for upstream API changes (namespace/table
requests, query select columns, storage option providers)
- verified with cargo clippy --workspace --tests --all-features -D
warnings and cargo fmt --all

Triggered by refs/tags/v2.0.0-beta.8.

---------

Signed-off-by: BubbleCal <bubble-cal@outlook.com>
Co-authored-by: BubbleCal <bubble-cal@outlook.com>
2026-01-16 01:46:52 +08:00
Colin Patrick McCabe
2f6d525802 fix: support exist_ok in RemoteDBConnection.create_table (#2901)
RemoteDBConnection should support passing exist_ok to create_table, just
like LanceDBConnection (the non-remote form) does. It can support this
by passing 'exist_ok' as the mode parameter.
2026-01-07 12:29:45 -08:00
LuQQiu
d67a8743ba feat: support remote ivf rq (#2863) 2026-01-02 15:35:33 -08:00
Chenghao Lyu
46fcbbc1e3 fix(python): require explicit region for S3 buckets with dots (#2892)
When region is not specific in the s3 path, `resolve_s3_region` from
"lance-format" project (see [here][1]) will resolve the region by
calling `resolve_bucket_region`, which is a function from the
"arrow-rs-object-store" project expecting [virtual-hosted-style
URLs][1]. When there are dot (".") in the virtual-hosted-style URLs, it
breaks automatic region detection. See more details in the issue
description:
https://github.com/lancedb/lancedb/issues/1898#issuecomment-3690142427

This PR add early validation in connect() and connect_async() to raise a
clear error with instructions when the region is not specified for such
buckets.


[1]:
https://github.com/lance-format/lance/blob/v2.0.0-beta.4/rust/lance-io/src/object_store/providers/aws.rs#L197
[2]:
eedbf3d7d8/src/aws/resolve.rs (L52C5-L52C65)
[3]:
https://docs.aws.amazon.com/AmazonS3/latest/userguide/VirtualHosting.html#virtual-hosted-style-access

Fixes #1898
2026-01-02 15:35:22 -08:00
fzowl
2adb10e6a8 feat: voyage-multimodal-3.5 (#2887)
voyage-multimodal-3.5 support (text, image and video embeddings)
2026-01-02 15:14:52 -08:00
Jonathan Hsieh
1cf7b4b678 docs: remove incorrect "LanceDb Cloud only" from table_names params (#2893)
The page_token and limit parameters for table_names() are supported by
both local storage and LanceDB Cloud, not just Cloud as the docstring
incorrectly stated.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-authored-by: Claude Opus 4.5 <noreply@anthropic.com>
2025-12-29 09:08:04 -08:00
Prashanth Rao
8ae4f42fbe fix: add to_lance() and to_polars() stub methods for type-checkers (#2876)
Adds `Table.to_lance()` and `Table.to_polars()` methods (non-abstract
methods, defaulting to `NotImplementedError`) so type checkers like
mypy, pyright and ty don’t flag them as unknown attributes on `Table`.
Not making these abstract methods should keep existing remote/other
`Table` implementations instantiable.

This is non-breaking change to existing functionality and is purely for
the purpose of pleasing static type-checkers like mypy, ty and pyright.

<img width="626" height="134" alt="image"
src="https://github.com/user-attachments/assets/f4619bca-a882-432b-bd23-ae8f189ff9e3"
/>
2025-12-18 12:55:07 -05:00
BubbleCal
39a18baf59 feat: infer vector type to float32 if integers are out of uint8 range (#2856)
## Summary
- infer integer vector columns as float32 when any value exceeds uint8
range or is negative
- keep uint8 for integer vectors within range and nulls only
- add sync/async tests covering large integer vector inference

## Testing
- ./.venv/bin/pytest python/python/tests/test_table.py -k
"large_int_vectors"
2025-12-08 17:10:25 +08:00
BubbleCal
a61461331c feat: add IVF SQ index support and HNSW aliases (#2832)
Adds IVF_SQ index config through Rust core and Python bindings, plus
alias names IvfHnswSq/Pq for backward compatibility. Updates
remote/table helpers and types to accept the new index type. Includes
tests covering IVF SQ creation and alias usage.
2025-12-04 00:25:44 +08:00
Jack Ye
d1efc6ad8a refactor!: use namespace models directly for namespace operations (#2806)
1. Use generated models in lance-namespace for request response models
to avoid multiple layers of conversions
2. Make sure the API is consistent with the namespace spec
3. Deprecate the table_names API in favor of the list_tables API in
namespace that allows full pagination support without the need to have
sorted table names
4. Add describe_namespace API which was a miss in the original
implementation
2025-12-02 22:41:04 -08:00
Jonathan Hsieh
44878dd9a5 feat: support stable row IDs via storage_options (#2831)
Add support for enabling stable row IDs when creating tables via the
`new_table_enable_stable_row_ids` storage option.

Stable row IDs ensure that row identifiers remain constant after
compaction, update, delete, and merge operations. This is useful for
materialized views and other use cases that need to track source rows
across these operations.

The option can be set at two levels:
- Connection level: applies to all tables created with that connection
- Table level: per-table override via create_table storage_options

🤖 Generated with [Claude Code](https://claude.com/claude-code)

---------

Co-authored-by: Claude <noreply@anthropic.com>
2025-12-02 13:57:00 -08:00
LanceDB Robot
4b5bb2d76c chore: update lance dependency to v1.0.0-beta.16 (#2835)
## Summary
- bump all Lance crates to v1.0.0-beta.16 via ci/set_lance_version.py
- refresh Cargo.lock (reqwest/opendal/etc.) to satisfy the new release

## Verification
- cargo clippy --workspace --tests --all-features -- -D warnings
- cargo fmt --all

Triggered by
[refs/tags/v1.0.0-beta.16](https://github.com/lance-format/lance/releases/tag/v1.0.0-beta.16)

---------

Co-authored-by: Jack Ye <yezhaoqin@gmail.com>
2025-12-01 23:07:03 -08:00
Prashanth Rao
a250d8e7df docs: improve docstring for RabitQ in Python (#2808)
This PR improves the docstring for `IVF_RQ` (RabitQ) in Python. The
earlier version referred to it as "residual quantization", which is
confusing to future readers of the code.

In contrast, the TypeScript and Rust codebases defined `IVF_RQ` as
RabitQ. So now the three languages use comments that are consistent with
one another.

---------

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
2025-11-24 13:35:19 +08:00
Jack Ye
0baf807be0 ci: use larger runner for doctest and fix failing tests (#2801)
Currently test would fail after installing to around pytorch
2025-11-20 19:44:31 -08:00
Prashanth Rao
135dfdc7ec docs: 404 and outdated URLs should now work (#2800)
Did a full scan of all URLs that used to point to the old mkdocs pages,
and now links to the appropriate pages on lancedb.com/docs or lance.org
docs.

---------

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
2025-11-20 11:14:20 -08:00
Jackson Hew
bb6b0bea0c fix: .phrase_query() not working (#2781)
The `self._query` value was not set when wrapping its copy `query` with
quotation marks.

The test for phrase queries has been updated to test the
`.phrase_query()` method as well, which will catch this bug.

---------

Co-authored-by: Will Jones <willjones127@gmail.com>
2025-11-20 10:32:37 -08:00
Jack Ye
0084eb238b fix: use None default for namespace (#2797)
Realized that using [] is an anti-pattern in python for defaults:
https://docs.python-guide.org/writing/gotchas/
2025-11-20 10:23:41 -08:00
Colin Patrick McCabe
7d3f5348a7 feat: implement head() for remote tables (#2793)
Implemnent the head() function for RemoteTable.
2025-11-19 12:49:34 -08:00
Jack Ye
1b78ccedaf feat: support async namespace connection (#2788)
Also fix 2 bugs:
1. make storage options provider serializable in ray
2. fix table.to_table() uri is wrong for namespace-backed tables
2025-11-19 12:23:50 -08:00
Mykola Skrynnyk
ca8d118f78 feat(python): support to_pydantic in async (#2438)
This request improves support for `pydantic` integration by adding
`to_pydantic` method to asynchronous queries and handling models that
use `alias` in field definitions. Fixes #2436 and closes #2437 .

<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->
## Summary by CodeRabbit

- **New Features**
- Added support for converting asynchronous query results to Pydantic
models.
- **Bug Fixes**
- Simplified conversion of query results to Pydantic models for improved
reliability.
- Improved handling of field aliases and computed fields when mapping
query results to Pydantic models.
- **Tests**
- Added tests to verify correct mapping of aliased and computed fields
in both synchronous and asynchronous scenarios.
<!-- end of auto-generated comment: release notes by coderabbit.ai -->
2025-11-19 11:20:14 -08:00
Wyatt Alt
386fc9e466 feat: add num_attempts to merge insert result (#2795)
This pipes the num_attempts field from lance's merge insert result
through lancedb. This allows callers of merge_insert to get a better
idea of whether transaction conflicts are occurring.
2025-11-19 09:32:57 -08:00
Will Jones
1cf3917a87 ci: make rust ci faster, get ci green (#2782)
* Add `ci` profile for smaller build caches. This had a meaningful
impact in Lance, and I expect a similar impact here.
https://github.com/lancedb/lance/pull/5236
* Get caching working in Rust. Previously was not working due to
`workspaces: rust`.
* Get caching working in NodeJs lint job. Previously wasn't working
because we installed the toolchain **after** we called `- uses:
Swatinem/rust-cache@v2`, which invalidates the cache locally.
* Fix broken pytest from async io transition
(`pytest.PytestRemovedIn9Warning`)
* Altered `get_num_sub_vectors` to handle bug in case of 4-bit PQ. This
was cause of `rust future panicked: unknown error`. Raised an issue
upstream to change panic to error:
https://github.com/lancedb/lance/issues/5257
* Call `npm run docs` to fix doc issue.
* Disable flakey Windows test for consistency. It's just an OS-specific
timer issue, not our fault.
* Fix Windows absolute path handling in namespaces. Was causing CI
failure `OSError: [WinError 123] The filename, directory name, or volume
label syntax is incorrect: `
2025-11-18 09:04:56 -08:00
Ryan Green
92dbec1f95 fix: convert schema metadata to strings for JsonArrowSchema (#2786)
Fixes pydantic validation errors when creating materialized views with
namespace.

```
>       return JsonArrowSchema(fields=fields, metadata=schema.metadata)
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
E       pydantic_core._pydantic_core.ValidationError: 4 validation errors for JsonArrowSchema
E       metadata.b'geneva::view::query'
E         Input should be a valid string [type=string_type, input_value=b'{"base":{"vector_column...t-image:latest\\"}"}}]}', input_type=bytes]
E           For further information visit https://errors.pydantic.dev/2.12/v/string_type
```
2025-11-17 13:18:20 -03:30
Jack Ye
e47f552a86 feat: support namespace credentials vending (#2778)
Based on https://github.com/lancedb/lance/pull/4984

1. Bump to 1.0.0-beta.2
2. Use DirectoryNamespace in lance to perform all testing in python and
rust for much better coverage
3. Refactor `ListingDatabase` to be able to accept location and
namespace. This is because we have to leverage listing database (local
lancedb connection) for using namespace, namespace only resolves the
location and storage options but we don't want to bind all the way to
rust since user will plug-in namespace from python side. And thus
`ListingDatabase` needs to be able to accept location and namespace that
are created from namespace connection.
4. For credentials vending, we also pass storage options provider all
the way to rust layer, and the rust layer calls back to the python
function to fetch next storage option. This is exactly the same thing we
did in pylance.
2025-11-17 00:42:24 -08:00
Colin Patrick McCabe
1ff594a6a4 feat: bump lance version to 0.40-0-beta.2 (#2772)
Bump the bump lance version to 0.40-0-beta.2.
2025-11-10 14:36:37 -08:00
Prashanth Rao
8e06b8bfe1 feat: pare down docs to only show API refs (#2770)
This PR does the following: 
- Pare down the docs to only what's needed (Python, JS/TS API docs and a
pointer to Rust docs)
- Styling changes to be more in line with the main website theme

The relative URLs remain unchanged, so assuming CI passes, there should
be no breaking changes from the main docs site that points back here.
2025-11-10 12:04:57 -05:00
Weston Pace
aeac9c7644 feat: add python Permutation class to mimic hugging face dataset and provide pytorch dataloader (#2725) 2025-11-06 16:15:33 -08:00
LuQQiu
8b94308cf2 feat: add fts udtf in sql (#2755)
Support FTS feature parity in SQL to match current Python API
capability.
Add `.to_json()` method to FTS query classes to enable usage with SQL
`fts()` UDTF.
Related: https://github.com/lancedb/blog-lancedb/pull/147

query = MatchQuery("puppy", "text", fuzziness=2)
result = client.execute(f"SELECT * FROM fts('table',
'{query.to_json()}')")

---------

Co-authored-by: Claude <noreply@anthropic.com>
2025-10-31 10:06:19 -07:00
fzowl
93c2cf2f59 feat(voyageai): update voyage integration (#2713)
Adding multimodal usage guide
VoyageAI integration changes:
 - Adding voyage-3.5 and voyage-3.5-lite models
 - Adding voyage-context-3 model
 - Adding rerank-2.5 and rerank-2.5-lite models
2025-10-29 16:49:07 +05:30