Commit Graph

2158 Commits

Author SHA1 Message Date
Lance Release
05a4ea646a Bump version: 0.22.1-beta.2 → 0.22.1-beta.3 2025-09-22 04:49:00 +00:00
Lance Release
ebbeeff4e0 Bump version: 0.25.1-beta.2 → 0.25.1-beta.3 python-v0.25.1-beta.3 2025-09-22 04:47:42 +00:00
Jack Ye
407ca53f92 chore: increase pypi publish timeout and use warp runner for arm64 (#2670)
Fix failures like:
https://github.com/lancedb/lancedb/actions/runs/17840462235/job/50748940233

ARM64 build cannot succeed within 1 hour, x86-64 build sometimes cannot
succeed within 1 hour.
2025-09-21 21:42:44 -07:00
Jack Ye
ff71d7e552 feat: support shallow clone (#2653)
Support shallow cloning a dataset at a specific location to create a new
dataset, using the shallow_clone feature in Lance. Also introduce remote
`clone` API for remote tables for this functionality.
2025-09-21 21:28:40 -07:00
Neha Prasad
2261eb95a0 fix(node): handle undefined vector fields with embedding functions (#2655)
- Fixes issue where passing `{ vector: undefined }` with an embedding
function threw "Found field not in schema" error instead of calling the
embedding function like `null` or omitted fields.

**Changes:**
- Modified `rowPathsAndValues` to skip undefined values during schema
inference
- Added test case verifying undefined, null, and omitted vector fields
all work correctly

**Before:** `{ vector: undefined }` → Error
**After:** `{ vector: undefined }` → Calls embedding function

Closes #2647
2025-09-19 09:17:28 -07:00
Jack Ye
5b397e410b chore: fix out of date tests with new namespace validation (#2663)
Failure:
https://github.com/lancedb/lancedb/actions/runs/17820044478/job/50660516344
2025-09-18 13:29:47 -07:00
Lance Release
b5a39bffec Bump version: 0.22.1-beta.1 → 0.22.1-beta.2 2025-09-18 20:22:35 +00:00
Lance Release
5e1e9add07 Bump version: 0.25.1-beta.1 → 0.25.1-beta.2 python-v0.25.1-beta.2 2025-09-18 20:21:33 +00:00
Jack Ye
97e9938dfe fix: add missing validations to namespace operations (#2659) 2025-09-17 23:27:04 -07:00
Weston Pace
1d4b92e01e refactor: remove catalog implementation now that we have namespaces in database (#2662)
We had previously prototyped a `Catalog` trait anticipating a
three-tiered Catalog-Database-Table structure. Now that we have
namespaces in the `Database` we can support any tiering scheme and the
`Catalog` trait is no longer needed.
2025-09-17 08:40:20 -07:00
Le Duc Manh
4c9fc3044b fix: use create to resolve variables (#2640)
# What
- Use `create` to resolve variables values

# Reference
Fixes #2181

---------

Co-authored-by: Will Jones <willjones127@gmail.com>
2025-09-12 13:07:32 -07:00
Jack Ye
0ebc8d45a8 chore: fix no lock build warnings and CI timeouts (#2650)
Example CI failures:
- publish build timeout:
https://github.com/lancedb/lancedb/actions/runs/17626482881/job/50084552906
- doc test build timeout:
https://github.com/lancedb/lancedb/actions/runs/17627058590/job/50086456818
2025-09-11 15:30:35 -07:00
BubbleCal
f7d78c3420 feat: add 'target_partition_size' param (#2642)
this exposes the param `target_partition_size` from lance

---------

Signed-off-by: BubbleCal <bubble-cal@outlook.com>
2025-09-11 22:56:16 +08:00
Lance Release
6ea6884260 Bump version: 0.22.1-beta.0 → 0.22.1-beta.1 2025-09-10 20:49:43 +00:00
Lance Release
b1d791a299 Bump version: 0.25.1-beta.0 → 0.25.1-beta.1 python-v0.25.1-beta.1 2025-09-10 20:48:56 +00:00
Jack Ye
8da74dcb37 feat: support per-request header override (#2631)
## Summary

This PR introduces a `HeaderProvider` which is called for all remote
HTTP calls to get the latest headers to inject. This is useful for
features like adding the latest auth tokens where the header provider
can auto-refresh tokens internally and each request always set the
refreshed token.

---------

Co-authored-by: Claude <noreply@anthropic.com>
2025-09-10 13:44:00 -07:00
Lance Release
3c7419b392 Bump version: 0.22.0 → 0.22.1-beta.0 2025-09-10 14:24:58 +00:00
Lance Release
e612686fdb Bump version: 0.25.0 → 0.25.1-beta.0 python-v0.25.1-beta.0 2025-09-10 14:24:07 +00:00
Wyatt Alt
e77d57a5b6 chore: update lance to 0.35.0-beta4 (#2639)
Updates lance to 0.35.0-beta4, which also incurs a datafusion update.
This brings in a fix for a memory leak in index caching, resulting from
a cyclical reference.
2025-09-10 06:19:35 -07:00
Jack Ye
9391ad1450 feat: support mTLS for remote database (#2638)
This PR adds mTLS (mutual TLS) configuration support for the LanceDB
remote HTTP client, allowing users to authenticate with client
certificates and configure custom CA certificates for server
verification.

---------

Co-authored-by: Claude <noreply@anthropic.com>
2025-09-09 21:04:46 -07:00
LuQQiu
79960b254e fix: add partition statistics to MetadataEraser (#2637)
Some of the data fusion optimizers optimize based on data statistics
(e.g. total bytes, number of rows).
If those statistics are not supplied, optimizers cannot optimize on top.
One example is Anti Hash Join which can optimize from LeftAnti (Left:
big table, Right: small table) to RightAnti (Left: small table, Right:
big table). Left Anti requires reading the whole big & small table while
RightAnti only requires reading the whole left table and supports limit
push down to only read partial of big table
2025-09-09 09:13:22 -07:00
Xuanwo
d19c64e29b chore: bump version for JSON support (#2633)
Bump version of lance to latest beta for JSON support.

Signed-off-by: Xuanwo <github@xuanwo.io>
2025-09-05 12:26:28 -07:00
Lance Release
06d5612443 Bump version: 0.22.0-beta.2 → 0.22.0 2025-09-04 08:33:40 +00:00
Lance Release
45f96f4151 Bump version: 0.22.0-beta.1 → 0.22.0-beta.2 2025-09-04 08:33:09 +00:00
Lance Release
f744b785f8 Bump version: 0.25.0-beta.2 → 0.25.0 python-v0.25.0 2025-09-04 08:32:44 +00:00
Lance Release
2e3f745820 Bump version: 0.25.0-beta.1 → 0.25.0-beta.2 2025-09-04 08:32:43 +00:00
Jack Ye
683aaed716 chore: upgrade lance to 0.35.0 (#2625) 2025-09-04 01:31:13 -07:00
Lance Release
48f7b20daa Bump version: 0.22.0-beta.0 → 0.22.0-beta.1 2025-09-03 17:51:36 +00:00
Lance Release
4dd399ca29 Bump version: 0.25.0-beta.0 → 0.25.0-beta.1 python-v0.25.0-beta.1 2025-09-03 17:50:41 +00:00
Jack Ye
e6f1da31dc chore: upgrade lance to 0.34.0-beta.4 (#2621) 2025-09-02 21:33:55 -07:00
Wyatt Alt
a9ea785b15 fix: remote python sdk namespace typing (#2620)
This changes the default values for some namespace parameters in the
remote python SDK from None to [], to match the underlying code it
calls.

Prior to this commit, failing to supply "namespace" with the remote SDK
would cause an error because the underlying code it dispatches to does
not consider None to be valid input.
2025-09-02 16:32:32 -07:00
Colin Patrick McCabe
cc38453391 fix!: fix doctest in query.py (#2622)
Fix doctest in query.py to include cumulative_cpu, now that lance
includes that.
2025-09-02 15:47:32 -07:00
Lance Release
47747287b6 Bump version: 0.21.4-beta.1 → 0.22.0-beta.0 2025-08-29 21:20:57 +00:00
Lance Release
0847e666a0 Bump version: 0.24.4-beta.1 → 0.25.0-beta.0 python-v0.25.0-beta.0 2025-08-29 21:19:51 +00:00
Wyatt Alt
981f8427e6 chore: update lance (#2610)
Adds storage_options to object_store wrap() to adhere to upstream lance
change.
2025-08-29 13:41:02 -07:00
Will Jones
f6846004ca feat: add name parameter to remaining Python create index calls (#2617)
## Summary
This PR adds the missing `name` parameter to `create_scalar_index` and
`create_fts_index` methods in the Python SDK, which was inadvertently
omitted when it was added to `create_index` in PR #2586.

## Changes
- Add `name: Optional[str] = None` parameter to abstract
`Table.create_scalar_index` and `Table.create_fts_index` methods
- Update `LanceTable` implementation to accept and pass the `name`
parameter to the underlying Rust layer
- Update `RemoteTable` implementation to accept and pass the `name`
parameter
- Enhanced tests to verify custom index names work correctly for both
scalar and FTS indices
- When `name` is not provided, default names are generated (e.g.,
`{column}_idx`)

## Test plan
- [x] Added test cases for custom names in scalar index creation
- [x] Added test cases for custom names in FTS index creation  
- [x] Verified existing tests continue to pass
- [x] Code formatting and linting checks pass

This ensures API consistency across all index creation methods in the
LanceDB Python SDK.

Fixes #2616

🤖 Generated with [Claude Code](https://claude.ai/code)

---------

Co-authored-by: Claude <noreply@anthropic.com>
2025-08-27 14:02:48 -07:00
Jack Ye
faf8973624 feat!: support multi-level namespace (#2603)
This PR adds support of multi-level namespace in a LanceDB database,
according to the Lance Namespace spec.

This allows users to create namespace inside a database connection,
perform create, drop, list, list_tables in a namespace. (other
operations like update, describe will be in a follow-up PR)

The 3 types of database connections behave like the following:
1 Local database connections will continue to have just a flat list of
tables for backwards compatibility.
2. Remote database connections will make REST API calls according to the
APIs in the Lance Namespace spec.
3. Lance Namespace connections will invoke the corresponding operations
against the specific namespace implementation which could have different
behaviors regarding these APIs.

All the table APIs now take identifier instead of name, for example
`/v1/table/{name}/create` is now `/v1/table/{id}/create`. If a table is
directly in the root namespace, the API call is identical. If the table
is in a namespace, then the full table ID should be used, with `$` as
the default delimiter (`.` is a special character and creates issues
with URL parsing so `$` is used), for example
`/v1/table/ns1$table1/create`. If a different parameter needs to be
passed in, user can configure the `id_delimiter` in client config and
that becomes a query parameter, for example
`/v1/table/ns1__table1/create?delimiter=__`

The Python and Typescript APIs are kept backwards compatible, but the
following Rust APIs are not:
1. `Connection::drop_table(&self, name: impl AsRef<str>) -> Result<()>`
is now `Connection::drop_table(&self, name: impl AsRef<str>, namespace:
&[String]) -> Result<()>`
2. `Connection::drop_all_tables(&self) -> Result<()>` is now
`Connection::drop_all_tables(&self, name: impl AsRef<str>) ->
Result<()>`
2025-08-27 12:07:55 -07:00
Weston Pace
fabe37274f feat: add __getitems__ method impl for torch integration (#2596)
This allows a lancedb Table to act as a torch dataset.
2025-08-25 13:23:22 -07:00
Lance Release
6839ac3509 Bump version: 0.21.4-beta.0 → 0.21.4-beta.1 2025-08-22 03:55:22 +00:00
Lance Release
b88422e515 Bump version: 0.24.4-beta.0 → 0.24.4-beta.1 python-v0.24.4-beta.1 2025-08-22 03:54:34 +00:00
BubbleCal
8d60685ede chore: upgrade lance to 0.33.0-beta.4 (#2604)
detials:
https://github.com/lancedb/lance/releases/tag/untagged-5191abd48c1fbe76f746

Signed-off-by: BubbleCal <bubble-cal@outlook.com>
2025-08-21 21:18:48 +08:00
Jack Ye
04285a4a4e feat(python): integrate with lance namespace (#2599)
This PR integrates `lancedb` with `lance-namespace` so that users can
use LanceDB client to access Lance tables in any catalog services. In
general, we expect most of the logic to be delegated to the existing
`LanceDBConnection` and `LanceTable`, but the namespace implemenation
will control how table is created, dropped, and describe where the table
is stored with any related storage options like access credentials.

The implementation currently only supports a 1 level namespace that
directly contains tables. We will introduce nested namespace support in
a separated PR.

Users are expected to use it in the following way:

```python
>>> import lancedb
>>> import pyarrow as pa
>>> # Connect using GlueNamespace
>>> db = lancedb.connect_namespace("glue", {"catalog_id": "123456789012"})
>>> # Create a table with schema
>>> schema = pa.schema([
...     pa.field("id", pa.int64()),
...     pa.field("vector", pa.list_(pa.float32(), 2))
... ])
>>> table = db.create_table("my_table", schema=schema)
>>> # List tables
>>> db.table_names()
['my_table']
```
2025-08-20 15:46:16 -07:00
Lance Release
d4a41b5663 Bump version: 0.21.3 → 0.21.4-beta.0 2025-08-19 22:56:52 +00:00
Lance Release
adc3daa462 Bump version: 0.24.3 → 0.24.4-beta.0 python-v0.24.4-beta.0 2025-08-19 22:56:05 +00:00
Will Jones
acbfa6c012 feat: upgrade lance to 0.33.0-beta.3 (#2598)
Change logs:
*
[v0.33.0-beta.3](https://github.com/lancedb/lance/releases/tag/v0.33.0-beta.3)
*
[v0.33.0-beta.2](https://github.com/lancedb/lance/releases/tag/v0.33.0-beta.2)
*
[v0.33.0-beta.1](https://github.com/lancedb/lance/releases/tag/v0.33.0-beta.1)

Important changes:

* Row-level conflict resolution for delete operations
* Fixes #2593
* Fix for keeping tombstones fields around, preventing cleanup of
dropped columns.
2025-08-19 13:45:15 -07:00
Vitali Lovich
d602e9f98c fix: make cloud features optional (#2567) (#2568)
This shrinks the size of a local embedded build that can disable all the
default features. When combined with
https://github.com/lancedb/lance/pull/4362 and the dependencies are
updated to point to the fix, this resolves #2567 fully.

Verified by patching the workspace to redirect to my clone of lance with
the PR applied.
```
cargo tree -p lancedb -e no-build -e no-dev --no-default-features -i aws-config | less
```

The reason that lance itself needs to change too is that many
dependencies within that project depend on lance-io/default and lancedb
depends on them which transitively ends up enabling the cloud
regardless. The PR in lance removes the dependency on lance-io/default
from all sibling crates.

---------

Co-authored-by: Will Jones <willjones127@gmail.com>
2025-08-15 16:46:52 -07:00
Will Jones
ad09234d59 feat: allow setting train=False and name on indices (#2586)
Enables two new parameters when building indices:

* `name`: Allows explicitly setting a name on the index. Default is
`{col_name}_idx`.
* `train` (default `True`): When set to `False`, an empty index will be
immediately created.

The upgrade of Lance means there are also additional behaviors from
cd76a993b8:

* When a scalar index is created on a Table, it will be kept around even
if all rows are deleted or updated.
* Scalar indices can be created on empty tables. They will default to
`train=False` if the table is empty.

---------

Co-authored-by: Weston Pace <weston.pace@gmail.com>
2025-08-15 14:00:26 -07:00
Lance Release
0c34ffb252 Bump version: 0.21.3-beta.0 → 0.21.3 2025-08-15 18:03:26 +00:00
Lance Release
d9f333d828 Bump version: 0.21.2 → 0.21.3-beta.0 2025-08-15 18:02:43 +00:00
Lance Release
bb809abd4b Bump version: 0.24.3-beta.0 → 0.24.3 python-v0.24.3 2025-08-15 18:02:04 +00:00