Commit Graph

830 Commits

Author SHA1 Message Date
Lance Release
e612686fdb Bump version: 0.25.0 → 0.25.1-beta.0 2025-09-10 14:24:07 +00:00
Jack Ye
9391ad1450 feat: support mTLS for remote database (#2638)
This PR adds mTLS (mutual TLS) configuration support for the LanceDB
remote HTTP client, allowing users to authenticate with client
certificates and configure custom CA certificates for server
verification.
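
To illustrate what mutual TLS means on the client side, here is a minimal sketch built on Python's standard `ssl` module rather than on the LanceDB client. The helper name `build_mtls_context` and its parameters are hypothetical; they are not the configuration options added by this PR.

```python
import ssl
from typing import Optional


def build_mtls_context(
    ca_file: Optional[str] = None,
    cert_file: Optional[str] = None,
    key_file: Optional[str] = None,
) -> ssl.SSLContext:
    """Hypothetical helper: build a client-side TLS context for mTLS."""
    # Verify the server against a custom CA bundle if one is given,
    # otherwise against the system trust store.
    ctx = ssl.create_default_context(ssl.Purpose.SERVER_AUTH, cafile=ca_file)
    # Presenting a client certificate is what makes the handshake "mutual".
    if cert_file is not None:
        ctx.load_cert_chain(certfile=cert_file, keyfile=key_file)
    return ctx
```

Called without arguments, the helper returns a context that still verifies the server certificate and hostname; the client certificate is only loaded when a path is supplied.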

---------

Co-authored-by: Claude <noreply@anthropic.com>
2025-09-09 21:04:46 -07:00
Lance Release
f744b785f8 Bump version: 0.25.0-beta.2 → 0.25.0 2025-09-04 08:32:44 +00:00
Lance Release
2e3f745820 Bump version: 0.25.0-beta.1 → 0.25.0-beta.2 2025-09-04 08:32:43 +00:00
Lance Release
4dd399ca29 Bump version: 0.25.0-beta.0 → 0.25.0-beta.1 2025-09-03 17:50:41 +00:00
Wyatt Alt
a9ea785b15 fix: remote python sdk namespace typing (#2620)
This changes the default values for some namespace parameters in the
remote python SDK from None to [], to match the underlying code it
calls.

Prior to this commit, failing to supply "namespace" with the remote SDK
would cause an error because the underlying code it dispatches to does
not consider None to be valid input.
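
The failure mode can be pictured with a small sketch. It shows one way to treat a missing namespace as the root namespace; the helper `normalize_namespace` is hypothetical and is not part of the SDK, which instead changes the default values themselves.

```python
from typing import List, Optional


def normalize_namespace(namespace: Optional[List[str]] = None) -> List[str]:
    """Hypothetical helper: map a missing namespace to the root namespace."""
    # An empty list stands for the root namespace. None is mapped to it so
    # that code further down never has to handle None.
    return [] if namespace is None else list(namespace)
```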
2025-09-02 16:32:32 -07:00
Colin Patrick McCabe
cc38453391 fix!: fix doctest in query.py (#2622)
Fix doctest in query.py to include cumulative_cpu, now that lance
includes that.
2025-09-02 15:47:32 -07:00
Lance Release
0847e666a0 Bump version: 0.24.4-beta.1 → 0.25.0-beta.0 2025-08-29 21:19:51 +00:00
Will Jones
f6846004ca feat: add name parameter to remaining Python create index calls (#2617)
## Summary
This PR adds the missing `name` parameter to `create_scalar_index` and
`create_fts_index` methods in the Python SDK, which was inadvertently
omitted when it was added to `create_index` in PR #2586.

## Changes
- Add `name: Optional[str] = None` parameter to abstract
`Table.create_scalar_index` and `Table.create_fts_index` methods
- Update `LanceTable` implementation to accept and pass the `name`
parameter to the underlying Rust layer
- Update `RemoteTable` implementation to accept and pass the `name`
parameter
- Enhanced tests to verify custom index names work correctly for both
scalar and FTS indices
- When `name` is not provided, default names are generated (e.g.,
`{column}_idx`)
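
The naming fallback described in the last bullet could look roughly like the sketch below. The function `resolve_index_name` is a hypothetical illustration of the `{column}_idx` convention, not the code in the SDK.

```python
from typing import Optional


def resolve_index_name(column: str, name: Optional[str] = None) -> str:
    """Hypothetical helper: pick the index name for a column."""
    # Use the caller's name when given, else the `{column}_idx` convention.
    return name if name is not None else f"{column}_idx"
```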

## Test plan
- [x] Added test cases for custom names in scalar index creation
- [x] Added test cases for custom names in FTS index creation  
- [x] Verified existing tests continue to pass
- [x] Code formatting and linting checks pass

This ensures API consistency across all index creation methods in the
LanceDB Python SDK.

Fixes #2616

🤖 Generated with [Claude Code](https://claude.ai/code)

---------

Co-authored-by: Claude <noreply@anthropic.com>
2025-08-27 14:02:48 -07:00
Jack Ye
faf8973624 feat!: support multi-level namespace (#2603)
This PR adds support for multi-level namespaces in a LanceDB database,
according to the Lance Namespace spec.

This allows users to create namespaces inside a database connection and
to perform create, drop, list, and list_tables operations in a
namespace. Other operations, such as update and describe, will follow in
a separate PR.

The 3 types of database connections behave like the following:
1. Local database connections will continue to have just a flat list of
tables for backwards compatibility.
2. Remote database connections will make REST API calls according to the
APIs in the Lance Namespace spec.
3. Lance Namespace connections will invoke the corresponding operations
against the specific namespace implementation which could have different
behaviors regarding these APIs.

All the table APIs now take identifier instead of name, for example
`/v1/table/{name}/create` is now `/v1/table/{id}/create`. If a table is
directly in the root namespace, the API call is identical. If the table
is in a namespace, then the full table ID should be used, with `$` as
the default delimiter (`.` is a special character and creates issues
with URL parsing so `$` is used), for example
`/v1/table/ns1$table1/create`. If a different delimiter is needed, users
can configure `id_delimiter` in the client config, and it is then sent
as a query parameter, for example
`/v1/table/ns1__table1/create?delimiter=__`
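
The path construction described above can be sketched as follows. The function `table_create_path` is a hypothetical illustration that reproduces the three example paths from this description; it is not the client's code and it skips URL escaping for brevity.

```python
from typing import Sequence


def table_create_path(
    namespace: Sequence[str], table: str, delimiter: str = "$"
) -> str:
    """Hypothetical helper: build the create-table path for a table ID."""
    # The table ID is the namespace levels plus the table name, joined by
    # the delimiter. A table in the root namespace keeps its plain name.
    table_id = delimiter.join([*namespace, table])
    path = f"/v1/table/{table_id}/create"
    # A non-default delimiter is passed along as a query parameter.
    if delimiter != "$":
        path += f"?delimiter={delimiter}"
    return path
```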

The Python and Typescript APIs are kept backwards compatible, but the
following Rust APIs are not:
1. `Connection::drop_table(&self, name: impl AsRef<str>) -> Result<()>`
is now `Connection::drop_table(&self, name: impl AsRef<str>, namespace:
&[String]) -> Result<()>`
2. `Connection::drop_all_tables(&self) -> Result<()>` is now
`Connection::drop_all_tables(&self, name: impl AsRef<str>) ->
Result<()>`
2025-08-27 12:07:55 -07:00
Weston Pace
fabe37274f feat: add __getitems__ method impl for torch integration (#2596)
This allows a lancedb Table to act as a torch dataset.
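
As a rough picture of the protocol, recent PyTorch versions let a map-style dataset define `__getitems__` so that a `DataLoader` can fetch a whole batch of indices in one call. The class below is a hypothetical in-memory stand-in, not the LanceDB `Table` implementation, and it does not import torch.

```python
from typing import Any, Dict, List


class RowsDataset:
    """Hypothetical map-style dataset backed by an in-memory list of rows."""

    def __init__(self, rows: List[Dict[str, Any]]) -> None:
        self._rows = rows

    def __len__(self) -> int:
        return len(self._rows)

    def __getitem__(self, index: int) -> Dict[str, Any]:
        # Single-row access, one call per index.
        return self._rows[index]

    def __getitems__(self, indices: List[int]) -> List[Dict[str, Any]]:
        # Batched access: one call returns every requested row, which lets
        # a storage-backed dataset issue a single read per batch.
        return [self._rows[i] for i in indices]
```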
2025-08-25 13:23:22 -07:00
Lance Release
b88422e515 Bump version: 0.24.4-beta.0 → 0.24.4-beta.1 2025-08-22 03:54:34 +00:00
Jack Ye
04285a4a4e feat(python): integrate with lance namespace (#2599)
This PR integrates `lancedb` with `lance-namespace` so that users can
use the LanceDB client to access Lance tables in any catalog service. In
general, we expect most of the logic to be delegated to the existing
`LanceDBConnection` and `LanceTable`, but the namespace implementation
will control how a table is created and dropped, and will describe where
the table is stored along with any related storage options such as
access credentials.

The implementation currently only supports a single-level namespace that
directly contains tables. We will introduce nested namespace support in
a separate PR.

Users are expected to use it in the following way:

```python
>>> import lancedb
>>> import pyarrow as pa
>>> # Connect using GlueNamespace
>>> db = lancedb.connect_namespace("glue", {"catalog_id": "123456789012"})
>>> # Create a table with schema
>>> schema = pa.schema([
...     pa.field("id", pa.int64()),
...     pa.field("vector", pa.list_(pa.float32(), 2))
... ])
>>> table = db.create_table("my_table", schema=schema)
>>> # List tables
>>> db.table_names()
['my_table']
```
2025-08-20 15:46:16 -07:00
Lance Release
adc3daa462 Bump version: 0.24.3 → 0.24.4-beta.0 2025-08-19 22:56:05 +00:00
Vitali Lovich
d602e9f98c fix: make cloud features optional (#2567) (#2568)
This shrinks the size of a local embedded build that can disable all the
default features. When combined with
https://github.com/lancedb/lance/pull/4362 and the dependencies are
updated to point to the fix, this resolves #2567 fully.

Verified by patching the workspace to redirect to my clone of lance with
the PR applied.
```
cargo tree -p lancedb -e no-build -e no-dev --no-default-features -i aws-config | less
```

The reason that lance itself needs to change too is that many
dependencies within that project depend on lance-io/default and lancedb
depends on them which transitively ends up enabling the cloud
regardless. The PR in lance removes the dependency on lance-io/default
from all sibling crates.

---------

Co-authored-by: Will Jones <willjones127@gmail.com>
2025-08-15 16:46:52 -07:00
Will Jones
ad09234d59 feat: allow setting train=False and name on indices (#2586)
Enables two new parameters when building indices:

* `name`: Allows explicitly setting a name on the index. Default is
`{col_name}_idx`.
* `train` (default `True`): When set to `False`, an empty index will be
immediately created.

The upgrade of Lance means there are also additional behaviors from
cd76a993b8:

* When a scalar index is created on a Table, it will be kept around even
if all rows are deleted or updated.
* Scalar indices can be created on empty tables. They will default to
`train=False` if the table is empty.

---------

Co-authored-by: Weston Pace <weston.pace@gmail.com>
2025-08-15 14:00:26 -07:00
Lance Release
bb809abd4b Bump version: 0.24.3-beta.0 → 0.24.3 2025-08-15 18:02:04 +00:00
Lance Release
c87530f7a3 Bump version: 0.24.2 → 0.24.3-beta.0 2025-08-15 18:02:04 +00:00
Weston Pace
ed640a76d9 feat: add take_offsets and take_row_ids (#2584)
These operations have existed in lance for a long while and many users
need to drop down to lance for this capability. This PR adds the API and
implements it using filters (e.g. `_rowid IN (...)`) so that it doesn't
currently add any load to `BaseTable`. I'm not sure that is sustainable
as base table implementations may want to specialize how they handle
this method. However, I figure it is a good starting point.

In addition, unlike Lance, this API does not currently guarantee
anything about the order of the take results. This is necessary for the
fallback filter approach to work (SQL filters cannot guarantee result
order)
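
The filter-based fallback can be sketched as below. Both helpers are hypothetical and only illustrate the idea: build an `IN` filter from the requested row IDs, and, if a caller needs the results in request order, restore that order afterwards, since the filter gives no such guarantee.

```python
from typing import Any, Dict, List, Sequence


def row_id_filter(row_ids: Sequence[int]) -> str:
    """Hypothetical helper: build a `_rowid IN (...)` filter string."""
    if not row_ids:
        raise ValueError("at least one row id is required")
    return "_rowid IN (" + ", ".join(str(int(r)) for r in row_ids) + ")"


def reorder_by_row_id(
    rows: List[Dict[str, Any]], row_ids: Sequence[int]
) -> List[Dict[str, Any]]:
    """Hypothetical helper: put filtered rows back into request order."""
    by_id = {row["_rowid"]: row for row in rows}
    # Row IDs that matched nothing are skipped.
    return [by_id[r] for r in row_ids if r in by_id]
```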
2025-08-15 06:48:24 -07:00
Weston Pace
16beaaa656 ci: fix broken CI checks (#2585) 2025-08-13 10:05:57 -07:00
Will Jones
9d683e4f0b feat: infer vector columns when name contains 'vector' or 'embedding' (#2547)
## Summary

- Enhanced vector column detection to use substring matching instead of
exact matching
- Now detects columns with names containing "vector" or "embedding"
(case-insensitive)
- Added integer vector support to Node.js implementation (matching
Python)
- Comprehensive test coverage for both float and integer vector types

## Changes

### Python (`python/python/lancedb/table.py`)
- Updated `_infer_target_schema()` to use substring matching with helper
function `_is_vector_column()`
- Preserved original field names instead of forcing "vector"
- Consolidated duplicate logic for better maintainability

### Node.js (`nodejs/lancedb/arrow.ts`)
- Enhanced type inference with `nameSuggestsVectorColumn()` helper
function
- Added `isAllIntegers()` function with performance optimization (checks
first 10 elements)
- Implemented integer vector support using `Uint8` type (matching
Python)
- Improved type safety by removing `any` usage

### Tests
- **Python**: Added
`test_infer_target_schema_with_vector_embedding_names()` in
`test_util.py`
- **Node.js**: Added comprehensive test case in `arrow.test.ts`
- Both test suites cover various naming patterns and integer/float
vector types

## Examples of newly supported column names:
- `user_vector`, `text_embedding`, `doc_embeddings`
- `my_vector_field`, `embedding_model`
- `VECTOR_COL`, `Vector_Mixed` (case-insensitive)
- Both float and integer arrays are properly converted to fixed-size
lists
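
A minimal sketch of the two checks described above, written in Python for illustration. The function names mirror the helpers mentioned in this description, but the bodies are assumptions and not the code in `table.py` or `arrow.ts`.

```python
from typing import Sequence


def name_suggests_vector_column(name: str) -> bool:
    """Case-insensitive substring match on 'vector' or 'embedding'."""
    lowered = name.lower()
    return "vector" in lowered or "embedding" in lowered


def is_all_integers(values: Sequence[float], sample_size: int = 10) -> bool:
    """Check only the first few elements, trading certainty for speed."""
    return all(float(v).is_integer() for v in values[:sample_size])
```

Sampling only the first elements keeps the check cheap for long vectors, at the cost of misclassifying an array whose first ten values happen to be whole numbers.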

## Test plan
- [x] All existing tests pass (backward compatibility maintained)
- [x] New tests pass for both Python and Node.js implementations
- [x] Integer vector detection works correctly in Node.js
- [x] Code passes linting and formatting checks
- [x] Performance optimized for large vector arrays

Fixes #2546

🤖 Generated with [Claude Code](https://claude.ai/code)

---------

Co-authored-by: Claude <noreply@anthropic.com>
2025-08-04 15:36:49 -07:00
Poornachandra.A.N
7d0127b376 feat(embeddings): add siglip embedding support to lancedb (#2499)
### Summary

This PR adds **SigLIP** (Sigmoid Loss Image Pretraining) as a new
embedding model in the LanceDB embedding registry. SigLIP improves
image-text alignment performance using sigmoid-based contrastive loss
and offers robust zero-shot generalization.

Fixes #2498 

### What’s Implemented

#### 1. `SigLIP` Embedding Class

* Added `SigLIP` support under `python/lancedb/embeddings/siglip.py`
* Implements:

  * `compute_source_embeddings`
  * `_batch_generate_embeddings`
  * Normalization logic
  * Batch-wise progress logging for image embedding

#### 2. Registry Integration

* Registered `SigLIP` in `embeddings/__init__.py`
* `SigLIP` now usable via `connect(..., embedding="siglip")`

#### 3. Evaluation Benchmark Support

* Added SigLIP to `test_embeddings_slow.py` for side-by-side
benchmarking with OpenCLIP and ImageBind


### New Test Methods

#### `test_siglip`

* End-to-end test to verify embeddings table creation and vector shape
for SigLIP


#### `test_siglip_vs_openclip_vs_imagebind_benchmark_full`

* Benchmarks:

  * **Recall@1 / 5 / 10**
  * **mAP (Mean Average Precision)**
  * **Embedding & Search Latency**
  * Dimensionality reporting


### Notes

* SigLIP outputs 768D embeddings (vs 512D for OpenCLIP)
* Benchmark shows competitive performance despite higher dimensionality
* I'm still new to contributing to open-source and learning as I go.
Please feel free to suggest any improvements — I'm happy to make
changes!
2025-08-04 11:42:39 -07:00
Will Jones
02595dc475 feat: add overall timeout parameter to remote client (#2550)
## Summary
- Adds an overall `timeout` parameter to `TimeoutConfig` that limits the
total time for the entire request
- Can be set via config or `LANCE_CLIENT_TIMEOUT` environment variable
- Exposed in Python and Node.js bindings
- Includes comprehensive tests
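
To show what an overall limit adds on top of per-attempt limits, here is a small sketch of a time budget shared across attempts. The `Deadline` class is hypothetical and is not how `TimeoutConfig` is implemented; it only illustrates capping each attempt by the time left for the whole request.

```python
import time
from typing import Callable, Optional


class Deadline:
    """Hypothetical helper tracking an overall time budget for one request."""

    def __init__(
        self,
        timeout: Optional[float],
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._clock = clock
        # None means no overall limit.
        self._expires_at = None if timeout is None else clock() + timeout

    def remaining(self) -> Optional[float]:
        if self._expires_at is None:
            return None
        return max(0.0, self._expires_at - self._clock())

    def attempt_timeout(self, per_attempt: float) -> float:
        # Each attempt keeps its own limit, capped by what is left overall.
        remaining = self.remaining()
        return per_attempt if remaining is None else min(per_attempt, remaining)
```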

## Test plan
- [x] Unit tests for Rust TimeoutConfig
- [x] Integration tests for Python bindings  
- [x] Integration tests for Node.js bindings
- [x] All existing tests pass

🤖 Generated with [Claude Code](https://claude.ai/code)

Co-authored-by: Claude <noreply@anthropic.com>
2025-08-04 10:06:55 -07:00
Mark McCaskey
fe76496a59 fix: .nprobes method in python bindings, improve error messages (#2556)
`nprobes` with a value greater than 20 fails with the minimum error:

```
self = <lancedb.query.AsyncVectorQuery object at 0x10b749720>, minimum_nprobes = 30

    def minimum_nprobes(self, minimum_nprobes: int) -> Self:
        """Set the minimum number of probes to use.

        See `nprobes` for more details.

        These partitions will be searched on every indexed vector query and will
        increase recall at the expense of latency.
        """
>       self._inner.minimum_nprobes(minimum_nprobes)
E       ValueError: Invalid input, minimum_nprobes must be less than or equal to maximum_nprobes

python/lancedb/query.py:2744: ValueError
```

Putting the max set before the min seems reasonable but it causes this
reasonable case to fail:
```
def test_nprobes_min_max_works_sync(table):
    LanceVectorQueryBuilder(table, [0, 0], "vector").minimum_nprobes(2).maximum_nprobes(4).to_list()
```

with

```
self = <lancedb.query.AsyncVectorQuery object at 0x1203f1c90>, maximum_nprobes = 4

    def maximum_nprobes(self, maximum_nprobes: int) -> Self:
        """Set the maximum number of probes to use.

        See `nprobes` for more details.

        If this value is greater than `minimum_nprobes` then the excess partitions
        will be searched only if we have not found enough results.

        This can be useful when there is a narrow filter to allow these queries to
        spend more time searching and avoid potential false negatives.

        If this value is 0 then no limit will be applied and all partitions could be
        searched if needed to satisfy the limit.
        """
>       self._inner.maximum_nprobes(maximum_nprobes)
E       ValueError: Invalid input, maximum_nprobes must be greater than or equal to minimum_nprobes

python/lancedb/query.py:2761: ValueError
```

The case I care about is where min == max, but this solution handles it
even if they're not. If both min and max exist, we set both to the
minimum and then set the max. This isn't 100% the same, since the minimum
setter checks for 0 on the min and `.nprobes` does not do any sanity
checking at all. But I figured this was the most reasonable and general
solution without touching more of this code.

As part of this I noticed the error messages were a bit ambiguous so I
made them symmetric and clarified them while I was here.
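
The ordering problem can be reproduced with a small stand-in. The class below is hypothetical: it assumes defaults of 20 for both bounds and validating setters like the ones in the tracebacks above, and it picks the setter order based on the current maximum. That is one possible approach and not necessarily the one taken in this PR.

```python
class ProbeSettings:
    """Hypothetical stand-in for a query builder's nprobes bounds."""

    def __init__(self) -> None:
        # Assumed defaults for this sketch.
        self.minimum = 20
        self.maximum = 20

    def set_minimum(self, value: int) -> None:
        # A maximum of 0 means "no limit", so any minimum is acceptable.
        if self.maximum != 0 and value > self.maximum:
            raise ValueError("minimum_nprobes must be <= maximum_nprobes")
        self.minimum = value

    def set_maximum(self, value: int) -> None:
        if value != 0 and value < self.minimum:
            raise ValueError("maximum_nprobes must be >= minimum_nprobes")
        self.maximum = value

    def set_nprobes(self, value: int) -> None:
        # Raise the maximum first when growing, lower the minimum first
        # when shrinking, so neither setter sees an inverted range.
        if self.maximum != 0 and value > self.maximum:
            self.set_maximum(value)
            self.set_minimum(value)
        else:
            self.set_minimum(value)
            self.set_maximum(value)
```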
2025-07-30 09:23:25 -07:00
Lance Release
f79295c697 Bump version: 0.24.2-beta.2 → 0.24.2 2025-07-25 20:31:15 +00:00
Lance Release
381fad9b65 Bump version: 0.24.2-beta.1 → 0.24.2-beta.2 2025-07-25 20:31:15 +00:00
Tristan Zajonc
055bf91d3e fix: handle empty list with schema in table creation (#2548)
## Summary
Fixes IndexError when creating tables with empty list data and a
provided schema. Previously, `_into_pyarrow_reader()` would attempt to
access `data[0]` on empty lists, causing an IndexError. Now properly
handles empty lists by using the provided schema.

Also adds regression tests for GitHub issues #1968 and #303 to prevent
future regressions with empty table scenarios.

## Changes
- Fix IndexError in `_into_pyarrow_reader()` for empty list + schema
case
- Add Optional[pa.Schema] parameter to handle empty data gracefully  
- Add `test_create_table_empty_list_with_schema` for the IndexError fix
- Add `test_create_empty_then_add_data` for issue #1968
- Add `test_search_empty_table` for issue #303
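
The guard described in the first bullet can be sketched without pyarrow. The function `resolve_schema` and the dict-based schema are hypothetical simplifications; the point is only that a provided schema must be used before `data[0]` is ever touched.

```python
from typing import Any, Dict, List, Optional

# Simplified stand-in for a real schema: column name -> type name.
Schema = Dict[str, str]


def resolve_schema(
    data: List[Dict[str, Any]], schema: Optional[Schema] = None
) -> Schema:
    """Hypothetical helper: prefer the given schema, else infer from row 0."""
    if schema is not None:
        return schema
    if not data:
        # Without this check, data[0] below would raise IndexError.
        raise ValueError("cannot infer a schema from empty data; pass a schema")
    return {key: type(value).__name__ for key, value in data[0].items()}
```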

## Test plan
- [x] All new regression tests pass
- [x] Existing tests continue to pass
- [x] Code formatted with `make format`
2025-07-25 10:23:43 +08:00
Tristan Zajonc
10fa23e0d6 fix(python): expose register function in embeddings module (#2544)
## Summary
Fixes #2541

**Problem**: The `register` function was not accessible via `from
lancedb.embeddings import register` as documented, causing ImportError
for users trying to create custom embedding functions.

**Solution**: Added `register` to the exports in
`python/lancedb/embeddings/__init__.py` to match the documented API and
follow the same pattern as other registry functions (`get_registry`,
`EmbeddingFunctionRegistry`).

**Root Cause**: The function existed in `lancedb.embeddings.registry`
but wasn't exposed through the main embeddings module interface.

## Changes
- Add `register` to imports in
`/python/python/lancedb/embeddings/__init__.py`

## Test Plan
- [x] Verified `from lancedb.embeddings import register` works as
documented
- [x] Confirmed existing embedding tests pass
- [x] Checked that the fix follows existing patterns (same as
`get_registry`)
- [x] Validated linting and formatting passes

## References
Fixes #2541
2025-07-24 15:30:06 -07:00
yihong
43d9fc28b0 fix: can not build on python3.9 for dev (#2477)
This patch fixes a build failure on Python 3.9 for the dev setup.

The reason is that the minimum supported version for ibm-watsonx-ai is
Python 3.10.

More details are available on `pyoven`: https://pyoven.org/package/ibm-watsonx-ai/

It also fixes a small Markdown lint issue.

---------

Signed-off-by: yihong0618 <zouzou0208@gmail.com>
2025-07-24 12:39:04 -07:00
aniaan
f45f0d0431 fix(python): correct type annotations in EmbeddingFunctionRegistry (#2478)
- Fix register() method's alias parameter type from 'str = None' to
'Optional[str] = None'
- Add return type annotation 'Type[EmbeddingFunction]' to get() method
- Import Type from typing module for proper type hints
2025-07-24 12:31:49 -07:00
Tristan Zajonc
b9e3c36d82 fix: replace broken documentation URLs in error messages (#2533)
Replaces broken 404 URL and unhelpful documentation links in type error
messages with working URL and inline list of supported data types.

**Before**: Points to
https://lancedb.github.io/lance/read_and_write.html (404 error)
**After**: Lists supported types inline and points to
https://lancedb.github.io/lancedb/guides/tables/
2025-07-24 12:30:27 -07:00
Chen Chongchen
3cd7dd3375 fix: to_pydantic typing (#2517)
Currently, `to_pydantic` always returns `LanceModel`. If type checking
is enabled in my project, I have to use `cast(List[RealModelType],
data)` to resolve the type error. This PR uses generics to solve this
problem.
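
The generic approach can be illustrated with a `TypeVar` that ties the return type to the model class passed in. The function `rows_to_models` and the `Item` dataclass are hypothetical and stand in for `to_pydantic` and a real model.

```python
from dataclasses import dataclass
from typing import Any, Dict, Iterable, List, Type, TypeVar

T = TypeVar("T")


def rows_to_models(rows: Iterable[Dict[str, Any]], model: Type[T]) -> List[T]:
    """Hypothetical helper: build one `model` instance per row."""
    # The return type follows `model`, so a type checker infers List[Item]
    # below and no cast is needed at the call site.
    return [model(**row) for row in rows]


@dataclass
class Item:
    id: int
    text: str


items = rows_to_models([{"id": 1, "text": "hello"}], Item)
```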
2025-07-24 12:30:15 -07:00
Will Jones
3d1f102087 feat: allow Python and Typescript users to create Sessions (#2530)
## Summary
- Exposes `Session` in Python and Typescript so users can set the
`index_cache_size_bytes` and `metadata_cache_size_bytes`
* The `Session` is attached to the `Connection`, and thus shared across
all tables in that connection.
- Adds deprecation warnings for table-level cache configuration


🤖 Generated with [Claude Code](https://claude.ai/code)

---------

Co-authored-by: Claude <noreply@anthropic.com>
2025-07-24 12:06:29 -07:00
Tristan Zajonc
81afd8a42f fix: use local random state in FTS test fixtures to prevent flaky failures (#2532)
## Summary
Fixes intermittent CI failures in `test_search_fts[False]` where boolean
FTS queries were returning fewer results than expected due to
non-deterministic test data generation.

## Problem
The test was using global `random` and `np.random` without seeding,
causing the boolean query `MatchQuery("puppy", "text") &
MatchQuery("runs", "text")` to sometimes return only 3 results instead
of the expected 5, leading to `AssertionError: assert 3 == 5`.

## Solution
- Replace global random calls with local `random.Random(42)` and
`np.random.RandomState(42)` objects in test fixtures
- Ensures deterministic test data while maintaining test isolation
- No impact on other tests since random state is scoped to fixtures only
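
A minimal sketch of the pattern, using only the standard library. The function `make_fixture_words` and its vocabulary are hypothetical; what matters is that the generator is local and seeded, so the data is reproducible and the global `random` state is left alone.

```python
import random
from typing import List


def make_fixture_words(count: int, seed: int = 42) -> List[str]:
    """Hypothetical fixture helper producing deterministic test data."""
    # A local generator: seeding it does not touch the module-level state.
    rng = random.Random(seed)
    vocabulary = ["puppy", "runs", "kitten", "sleeps"]
    return [rng.choice(vocabulary) for _ in range(count)]
```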

## Test Results
- `test_search_fts[False]` now passes consistently
- All other FTS tests continue to pass
- No regression in other test suites (verified with `test_basic`)
- Maintains existing test behavior and coverage
2025-07-24 11:30:02 -07:00
Tristan Zajonc
c2aa03615a fix: correct grammar in LanceDB cloud connection error message (#2537)
## Summary

Fixed a minor grammar error in the error message for missing API key
when connecting to LanceDB cloud.

## Changes

- Changed 'api_key is required to connected LanceDB cloud' to 'api_key
is required to connect to LanceDB cloud'
- Location: `python/python/lancedb/__init__.py:95`

## Test plan

- Error message formatting is correct and grammatical
- No functional changes to existing behavior
2025-07-24 09:56:06 -07:00
Tristan Zajonc
d2c6759e7f fix: use import stubs to prevent MLX doctest collection failures (#2536)
## Summary
- Add `create_import_stub()` helper to `embeddings/utils.py` for
handling optional dependencies
- Fix MLX doctest collection failures by using import stubs in
`gte_mlx_model.py`
- Module now imports successfully for doctest collection even when MLX
is not installed

## Changes
- **New utility function**: `create_import_stub()` creates placeholder
objects that allow class inheritance but raise helpful errors when used
- **Updated MLX model**: Uses import stubs instead of direct imports
that fail immediately
- **Graceful degradation**: Clear error messages when MLX functionality
is accessed without MLX installed
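
The stub idea can be sketched as follows. `make_import_stub` is a hypothetical re-creation for illustration and not the code of `create_import_stub()`: any attribute of the stub is a base class that can be subclassed at import time but raises `ImportError` when instantiated.

```python
from typing import Any


def make_import_stub(module_name: str) -> Any:
    """Hypothetical stand-in for a module that may not be installed."""
    message = f"{module_name} is required for this feature; please install it"

    class _StubBase:
        def __init__(self, *args: Any, **kwargs: Any) -> None:
            # Fail only when the functionality is actually used.
            raise ImportError(message)

    class _StubModule:
        def __getattr__(self, attribute: str) -> type:
            # Every missing attribute looks like a subclassable class.
            return _StubBase

    return _StubModule()


nn = make_import_stub("mlx")


class Encoder(nn.Module):  # defining the subclass works without mlx
    pass
```

A subclass that overrides `__init__` without calling `super().__init__()` would bypass the error in this sketch, so a fuller version would need to guard other entry points as well.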

## Test Results
- `pytest --doctest-modules python/lancedb` now passes (with and
without MLX installed)
- All existing tests continue to pass
- MLX functionality works normally when MLX is installed
- Helpful error messages when MLX functionality is used without MLX
installed

Fixes #2538

---------

Co-authored-by: Will Jones <willjones127@gmail.com>
2025-07-23 16:25:33 -07:00
Will Jones
fbff244ed8 chore: add claude md files (#2531)
Gives basic context to Claude about how to do common tasks in the repo.
2025-07-23 12:20:36 -07:00
Lance Release
7a15337e03 Bump version: 0.24.2-beta.0 → 0.24.2-beta.1 2025-07-22 15:40:17 +00:00
Lance Release
ce24457531 Bump version: 0.24.1 → 0.24.2-beta.0 2025-07-18 16:02:37 +00:00
BubbleCal
087fe6343d test: fix random data may break test case (#2514)
This test adds a new vector and then performs a vector search with a
distance range. It may fail if the new vector becomes the closest one to
the query vector.

Signed-off-by: BubbleCal <bubble-cal@outlook.com>
2025-07-18 16:15:06 +08:00
Ayush Chaurasia
f076bb41f4 feat: add support for returning all scores with rerankers (#2509)
Previously, `return_score="all"` was supported only for the default
reranker (RRF) and not for the model-based rerankers.
This adds support for keeping all scores in the base reranker so that
all model-based rerankers can use it. It's a slower path than keeping
just the relevance score, but it can be useful for debugging.
2025-07-15 21:03:03 +05:30
BubbleCal
03b62599d7 feat: support ngram tokenizer (#2507)
Signed-off-by: BubbleCal <bubble-cal@outlook.com>
2025-07-15 16:36:08 +08:00
Lance Release
a300a238db Bump version: 0.24.1-beta.2 → 0.24.1 2025-07-10 21:36:02 +00:00
Lance Release
a41ff1df0a Bump version: 0.24.1-beta.1 → 0.24.1-beta.2 2025-07-10 21:36:02 +00:00
CyrusAttoun
167fccc427 fix: change 'return' to 'raise' for unimplemented remote table function (#2484)
I just noticed that we were doing a `return` instead of a `raise` while
trying to get remote functionality working for my project. I went ahead
and implemented tests for both of the unimplemented functions
(`to_pandas` and `to_arrow`) while I was in there.
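
The difference between the two statements is easy to miss, so here is a short sketch. The function names and the message are hypothetical; they only contrast returning an exception object with raising it.

```python
def to_pandas_returning():
    # Bug pattern: the caller receives an exception *object* as a normal
    # return value and carries on as if the call had succeeded.
    return NotImplementedError("to_pandas is not supported here")


def to_pandas_raising():
    # Intended behavior: the call fails immediately.
    raise NotImplementedError("to_pandas is not supported here")
```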

---------

Co-authored-by: Cyrus Attoun <jattoun1@gmail.com>
2025-07-09 14:27:08 -07:00
Lance Release
905552f993 Bump version: 0.24.1-beta.0 → 0.24.1-beta.1 2025-07-09 05:53:28 +00:00
BubbleCal
cab36d94b2 feat: support to specify num_partitions and num_bits (#2488) 2025-07-09 11:36:09 +08:00
Lance Release
d4bb59b542 Bump version: 0.24.0 → 0.24.1-beta.0 2025-07-07 21:00:38 +00:00
Weston Pace
1dadb2aefa feat: upgrade to lance 0.31.0-beta.1 (#2469)
## Summary by CodeRabbit

* **Chores**
* Updated dependencies to newer versions for improved compatibility and
stability.

* **Refactor**
* Improved internal handling of data ranges and stream lifetimes for
enhanced performance and reliability.
* Simplified code style for Python query object conversions without
affecting functionality.
2025-06-30 11:10:53 -07:00
Haoyu Weng
eb9784d7f2 feat(python): batch Ollama embed calls (#2453)
Other embedding integrations such as Cohere and OpenAI already send
requests in batches. We should do that for Ollama too to improve
throughput.

The Ollama [`.embed`
API](63ca747622/ollama/_client.py (L359-L378))
was added in version 0.3.0 (almost a year ago) so I updated the version
requirement in pyproject.
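
Batching of this kind can be sketched independently of any client library. In the sketch below, `embed_batch` is a hypothetical callable that takes a list of texts and returns one vector per text; the batch size of 32 is likewise an assumption and not a value taken from this PR.

```python
from typing import Callable, Iterator, List, Sequence

Vector = List[float]


def batched(items: Sequence[str], batch_size: int) -> Iterator[Sequence[str]]:
    """Yield consecutive slices of at most `batch_size` items."""
    for start in range(0, len(items), batch_size):
        yield items[start : start + batch_size]


def embed_all(
    texts: Sequence[str],
    embed_batch: Callable[[Sequence[str]], List[Vector]],
    batch_size: int = 32,
) -> List[Vector]:
    """One request per batch instead of one request per text."""
    vectors: List[Vector] = []
    for batch in batched(texts, batch_size):
        vectors.extend(embed_batch(batch))
    return vectors
```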

## Summary by CodeRabbit

- **Bug Fixes**
- Improved compatibility with newer versions of the "ollama" package by
requiring version 0.3.0 or higher.
- Enhanced embedding generation to process batches of texts more
efficiently and reliably.
- **Refactor**
	- Improved type consistency and clarity for embedding-related methods.

2025-06-30 08:28:14 -07:00