Commit Graph

338 Commits

Author SHA1 Message Date
fzowl
2adb10e6a8 feat: voyage-multimodal-3.5 (#2887)
voyage-multimodal-3.5 support (text, image and video embeddings)
2026-01-02 15:14:52 -08:00
Jonathan Hsieh
1cf7b4b678 docs: remove incorrect "LanceDb Cloud only" from table_names params (#2893)
The page_token and limit parameters for table_names() are supported by
both local storage and LanceDB Cloud, not just Cloud as the docstring
incorrectly stated.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-authored-by: Claude Opus 4.5 <noreply@anthropic.com>
2025-12-29 09:08:04 -08:00
Prashanth Rao
8ae4f42fbe fix: add to_lance() and to_polars() stub methods for type-checkers (#2876)
Adds `Table.to_lance()` and `Table.to_polars()` methods (non-abstract
methods, defaulting to `NotImplementedError`) so type checkers like
mypy, pyright and ty don’t flag them as unknown attributes on `Table`.
Not making these abstract methods should keep existing remote/other
`Table` implementations instantiable.

This is a non-breaking change to existing functionality and exists purely
to satisfy static type-checkers like mypy, ty and pyright.
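
The pattern described above can be sketched roughly as follows. This is a minimal illustration, not LanceDB's actual `Table` class: `count_rows` and `MinimalTable` are hypothetical names chosen for the example.

```python
from abc import ABC, abstractmethod


class Table(ABC):
    """Simplified stand-in for an abstract table base class."""

    @abstractmethod
    def count_rows(self) -> int:
        """Subclasses must implement this."""

    # Non-abstract stubs: visible to type checkers, optional for subclasses.
    def to_lance(self):
        raise NotImplementedError("to_lance() is not supported by this table type")

    def to_polars(self):
        raise NotImplementedError("to_polars() is not supported by this table type")


class MinimalTable(Table):
    """Hypothetical subclass that implements only the abstract method."""

    def count_rows(self) -> int:
        return 0


table = MinimalTable()  # still instantiable because the stubs are not abstract
```

Had the stubs been declared with `@abstractmethod`, `MinimalTable()` would raise `TypeError` until it implemented them, which is the breakage the commit avoids.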

2025-12-18 12:55:07 -05:00
BubbleCal
39a18baf59 feat: infer vector type to float32 if integers are out of uint8 range (#2856)
## Summary
- infer integer vector columns as float32 when any value exceeds uint8
range or is negative
- keep uint8 for integer vectors within range and nulls only
- add sync/async tests covering large integer vector inference
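
A rough sketch of the inference rule summarized above, assuming the input is a plain Python list of integer vectors. `infer_int_vector_dtype` is a hypothetical helper for illustration, not the function changed in this commit.

```python
from typing import Iterable, Optional, Sequence

UINT8_MIN, UINT8_MAX = 0, 255


def infer_int_vector_dtype(
    vectors: Iterable[Optional[Sequence[Optional[int]]]],
) -> str:
    """Return "uint8" if every non-null value fits in uint8, else "float32"."""
    for vector in vectors:
        if vector is None:
            continue  # null vectors do not affect the decision
        for value in vector:
            if value is None:
                continue  # null elements do not affect the decision
            if value < UINT8_MIN or value > UINT8_MAX:
                return "float32"
    return "uint8"
```

With this rule, `[[0, 255]]` stays `uint8`, while `[[1, 256]]` or `[[-1, 2]]` is promoted to `float32`.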

## Testing
- ./.venv/bin/pytest python/python/tests/test_table.py -k
"large_int_vectors"
2025-12-08 17:10:25 +08:00
BubbleCal
a61461331c feat: add IVF SQ index support and HNSW aliases (#2832)
Adds IVF_SQ index config through Rust core and Python bindings, plus
alias names IvfHnswSq/Pq for backward compatibility. Updates
remote/table helpers and types to accept the new index type. Includes
tests covering IVF SQ creation and alias usage.
2025-12-04 00:25:44 +08:00
Jack Ye
d1efc6ad8a refactor!: use namespace models directly for namespace operations (#2806)
1. Use generated models in lance-namespace for request response models
to avoid multiple layers of conversions
2. Make sure the API is consistent with the namespace spec
3. Deprecate the table_names API in favor of the list_tables API in
namespace that allows full pagination support without the need to have
sorted table names
4. Add describe_namespace API, which was missing from the original
implementation
2025-12-02 22:41:04 -08:00
Jonathan Hsieh
44878dd9a5 feat: support stable row IDs via storage_options (#2831)
Add support for enabling stable row IDs when creating tables via the
`new_table_enable_stable_row_ids` storage option.

Stable row IDs ensure that row identifiers remain constant after
compaction, update, delete, and merge operations. This is useful for
materialized views and other use cases that need to track source rows
across these operations.

The option can be set at two levels:
- Connection level: applies to all tables created with that connection
- Table level: per-table override via create_table storage_options
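
The two levels can be pictured as two option dictionaries, with the table-level one taking precedence. This is only a sketch: the string values `"true"`/`"false"` and the merge semantics shown here are assumptions for illustration, not confirmed behavior.

```python
# Connection level: would apply to every table created with the connection.
# Assumed value format: storage options are commonly string-valued.
connection_storage_options = {"new_table_enable_stable_row_ids": "true"}

# Table level: hypothetical per-table override passed via
# create_table(..., storage_options=...).
table_storage_options = {"new_table_enable_stable_row_ids": "false"}

# In this sketch, table-level keys win over connection-level keys.
effective_options = {**connection_storage_options, **table_storage_options}
```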

🤖 Generated with [Claude Code](https://claude.com/claude-code)

---------

Co-authored-by: Claude <noreply@anthropic.com>
2025-12-02 13:57:00 -08:00
LanceDB Robot
4b5bb2d76c chore: update lance dependency to v1.0.0-beta.16 (#2835)
## Summary
- bump all Lance crates to v1.0.0-beta.16 via ci/set_lance_version.py
- refresh Cargo.lock (reqwest/opendal/etc.) to satisfy the new release

## Verification
- cargo clippy --workspace --tests --all-features -- -D warnings
- cargo fmt --all

Triggered by
[refs/tags/v1.0.0-beta.16](https://github.com/lance-format/lance/releases/tag/v1.0.0-beta.16)

---------

Co-authored-by: Jack Ye <yezhaoqin@gmail.com>
2025-12-01 23:07:03 -08:00
Prashanth Rao
a250d8e7df docs: improve docstring for RabitQ in Python (#2808)
This PR improves the docstring for `IVF_RQ` (RabitQ) in Python. The
earlier version referred to it as "residual quantization", which is
confusing to future readers of the code.

In contrast, the TypeScript and Rust codebases defined `IVF_RQ` as
RabitQ. So now the three languages use comments that are consistent with
one another.

---------

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
2025-11-24 13:35:19 +08:00
Jack Ye
0baf807be0 ci: use larger runner for doctest and fix failing tests (#2801)
Currently test would fail after installing to around pytorch
2025-11-20 19:44:31 -08:00
Prashanth Rao
135dfdc7ec docs: 404 and outdated URLs should now work (#2800)
Did a full scan of all URLs that used to point to the old mkdocs pages;
they now link to the appropriate pages on lancedb.com/docs or the
lance.org docs.

---------

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
2025-11-20 11:14:20 -08:00
Jackson Hew
bb6b0bea0c fix: .phrase_query() not working (#2781)
The `self._query` value was not set when wrapping its copy `query` with
quotation marks.

The test for phrase queries has been updated to test the
`.phrase_query()` method as well, which will catch this bug.

---------

Co-authored-by: Will Jones <willjones127@gmail.com>
2025-11-20 10:32:37 -08:00
Jack Ye
0084eb238b fix: use None default for namespace (#2797)
Realized that using [] is an anti-pattern in python for defaults:
https://docs.python-guide.org/writing/gotchas/
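
For readers unfamiliar with the gotcha, here is a small standalone illustration. The function names are made up for the example and are not taken from the LanceDB code.

```python
from typing import List, Optional


def add_bad(name: str, namespace: List[str] = []) -> List[str]:
    # The default list is created once, at function definition time,
    # so every call without `namespace` mutates the same object.
    namespace.append(name)
    return namespace


def add_good(name: str, namespace: Optional[List[str]] = None) -> List[str]:
    # A fresh list is created on each call when no namespace is given.
    if namespace is None:
        namespace = []
    namespace.append(name)
    return namespace


first_bad = add_bad("a")
second_bad = add_bad("b")    # shares state with the first call
first_good = add_good("a")
second_good = add_good("b")  # independent of the first call
```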
2025-11-20 10:23:41 -08:00
Colin Patrick McCabe
7d3f5348a7 feat: implement head() for remote tables (#2793)
Implement the head() function for RemoteTable.
2025-11-19 12:49:34 -08:00
Jack Ye
1b78ccedaf feat: support async namespace connection (#2788)
Also fix 2 bugs:
1. make storage options provider serializable in ray
2. fix table.to_table() uri is wrong for namespace-backed tables
2025-11-19 12:23:50 -08:00
Mykola Skrynnyk
ca8d118f78 feat(python): support to_pydantic in async (#2438)
This request improves support for `pydantic` integration by adding a
`to_pydantic` method to asynchronous queries and handling models that
use `alias` in field definitions. Fixes #2436 and closes #2437.

## Summary by CodeRabbit

- **New Features**
- Added support for converting asynchronous query results to Pydantic
models.
- **Bug Fixes**
- Simplified conversion of query results to Pydantic models for improved
reliability.
- Improved handling of field aliases and computed fields when mapping
query results to Pydantic models.
- **Tests**
- Added tests to verify correct mapping of aliased and computed fields
in both synchronous and asynchronous scenarios.
2025-11-19 11:20:14 -08:00
Wyatt Alt
386fc9e466 feat: add num_attempts to merge insert result (#2795)
This pipes the num_attempts field from lance's merge insert result
through lancedb. This allows callers of merge_insert to get a better
idea of whether transaction conflicts are occurring.
2025-11-19 09:32:57 -08:00
Will Jones
1cf3917a87 ci: make rust ci faster, get ci green (#2782)
* Add `ci` profile for smaller build caches. This had a meaningful
impact in Lance, and I expect a similar impact here.
https://github.com/lancedb/lance/pull/5236
* Get caching working in Rust. Previously was not working due to
`workspaces: rust`.
* Get caching working in NodeJs lint job. Previously wasn't working
because we installed the toolchain **after** we called `- uses:
Swatinem/rust-cache@v2`, which invalidates the cache locally.
* Fix broken pytest from async io transition
(`pytest.PytestRemovedIn9Warning`)
* Altered `get_num_sub_vectors` to handle bug in case of 4-bit PQ. This
was cause of `rust future panicked: unknown error`. Raised an issue
upstream to change panic to error:
https://github.com/lancedb/lance/issues/5257
* Call `npm run docs` to fix doc issue.
* Disable flakey Windows test for consistency. It's just an OS-specific
timer issue, not our fault.
* Fix Windows absolute path handling in namespaces. Was causing CI
failure `OSError: [WinError 123] The filename, directory name, or volume
label syntax is incorrect: `
2025-11-18 09:04:56 -08:00
Ryan Green
92dbec1f95 fix: convert schema metadata to strings for JsonArrowSchema (#2786)
Fixes pydantic validation errors when creating materialized views with
namespace.

```
>       return JsonArrowSchema(fields=fields, metadata=schema.metadata)
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
E       pydantic_core._pydantic_core.ValidationError: 4 validation errors for JsonArrowSchema
E       metadata.b'geneva::view::query'
E         Input should be a valid string [type=string_type, input_value=b'{"base":{"vector_column...t-image:latest\\"}"}}]}', input_type=bytes]
E           For further information visit https://errors.pydantic.dev/2.12/v/string_type
```
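
The traceback shows bytes keys and values reaching a model that expects strings (Arrow schema metadata is stored as bytes). A minimal sketch of the kind of conversion the title describes, assuming UTF-8 encoded metadata; `metadata_to_strings` is a hypothetical helper, not the code from this commit.

```python
from typing import Dict, Optional, Union

BytesOrStr = Union[bytes, str]


def metadata_to_strings(
    metadata: Optional[Dict[BytesOrStr, BytesOrStr]],
) -> Optional[Dict[str, str]]:
    """Decode bytes keys and values as UTF-8; pass str through unchanged."""
    if metadata is None:
        return None

    def to_str(item: BytesOrStr) -> str:
        return item.decode("utf-8") if isinstance(item, bytes) else item

    return {to_str(key): to_str(value) for key, value in metadata.items()}


converted = metadata_to_strings({b"geneva::view::query": b'{"base": {}}'})
```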
2025-11-17 13:18:20 -03:30
Jack Ye
e47f552a86 feat: support namespace credentials vending (#2778)
Based on https://github.com/lancedb/lance/pull/4984

1. Bump to 1.0.0-beta.2
2. Use DirectoryNamespace in lance to perform all testing in python and
rust for much better coverage
3. Refactor `ListingDatabase` to accept a location and namespace.
Namespace connections rely on the listing database (the local lancedb
connection): the namespace only resolves the location and storage
options, and we don't want to bind it all the way into rust because
users plug in the namespace from the python side. `ListingDatabase`
therefore needs to accept a location and namespace that are created
from the namespace connection.
4. For credentials vending, we also pass storage options provider all
the way to rust layer, and the rust layer calls back to the python
function to fetch next storage option. This is exactly the same thing we
did in pylance.
2025-11-17 00:42:24 -08:00
Colin Patrick McCabe
1ff594a6a4 feat: bump lance version to 0.40-0-beta.2 (#2772)
Bump the lance version to 0.40-0-beta.2.
2025-11-10 14:36:37 -08:00
Prashanth Rao
8e06b8bfe1 feat: pare down docs to only show API refs (#2770)
This PR does the following: 
- Pare down the docs to only what's needed (Python, JS/TS API docs and a
pointer to Rust docs)
- Styling changes to be more in line with the main website theme

The relative URLs remain unchanged, so assuming CI passes, there should
be no breaking changes from the main docs site that points back here.
2025-11-10 12:04:57 -05:00
Weston Pace
aeac9c7644 feat: add python Permutation class to mimic hugging face dataset and provide pytorch dataloader (#2725)
2025-11-06 16:15:33 -08:00
LuQQiu
8b94308cf2 feat: add fts udtf in sql (#2755)
Support FTS feature parity in SQL to match current Python API
capability.
Add `.to_json()` method to FTS query classes to enable usage with SQL
`fts()` UDTF.
Related: https://github.com/lancedb/blog-lancedb/pull/147

query = MatchQuery("puppy", "text", fuzziness=2)
result = client.execute(f"SELECT * FROM fts('table', '{query.to_json()}')")

---------

Co-authored-by: Claude <noreply@anthropic.com>
2025-10-31 10:06:19 -07:00
fzowl
93c2cf2f59 feat(voyageai): update voyage integration (#2713)
Adding multimodal usage guide
VoyageAI integration changes:
 - Adding voyage-3.5 and voyage-3.5-lite models
 - Adding voyage-context-3 model
 - Adding rerank-2.5 and rerank-2.5-lite models
2025-10-29 16:49:07 +05:30
LuQQiu
199904ab35 chore: update lance dependency to v0.38.3-beta.11 (#2749)
## Summary

- Updated all Lance dependencies from v0.38.3-beta.9 to v0.38.3-beta.11
- Migrated `lance-namespace-impls` to use new granular cloud provider
features (`dir-aws`, `dir-gcp`, `dir-azure`, `dir-oss`) instead of
deprecated `dir` feature
- Updated namespace connection API to use `ConnectBuilder` instead of
deprecated `connect()` function

## API Changes

The Lance team refactored the `lance-namespace-impls` package in
v0.38.3-beta.11:

1. **Feature flags**: The single `dir` feature was split into cloud
provider-specific features:
   - `dir-aws` for AWS S3 support
   - `dir-gcp` for Google Cloud Storage support
   - `dir-azure` for Azure Blob Storage support
   - `dir-oss` for Alibaba Cloud OSS support

2. **Connection API**: The `connect()` function was replaced with a
`ConnectBuilder` pattern for more flexibility

## Testing

-  Ran `cargo clippy --workspace --tests --all-features -- -D warnings`
- no warnings
-  Ran `cargo fmt --all` - code formatted
-  All changes verified and committed

## Related

This update was triggered by the Lance release:
https://github.com/lancedb/lance/releases/tag/v0.38.3-beta.11

🤖 Generated with [Claude Code](https://claude.com/claude-code)

---------

Co-authored-by: Claude <noreply@anthropic.com>
2025-10-27 19:10:26 -07:00
Weston Pace
4cfcd95320 feat: add a permutation reader that can read a permutation view (#2712)
This adds a rust permutation builder. In the next PR I will have python
bindings and integration with pytorch.
2025-10-17 05:00:23 -07:00
Ayush Chaurasia
3f2e3986e9 feat: expand support for multivector colpali models and enchancements (#2719)
2025-10-17 14:36:32 +05:30
Weston Pace
8f8e06a2da feat: add output_schema method to queries (#2717)
This is a helper utility I need for some of my data loader work. It
makes it easy to see the output schema even when a `select` has been
applied.
2025-10-14 05:13:28 -07:00
Weston Pace
5a19cf15a6 feat: a utility for creating "permutation views" (#2552)
I'm working on a lancedb version of pytorch data loading (and hopefully
addressing https://github.com/lancedb/lance/issues/3727).

However, rather than rely on pytorch for everything I'm moving some of
the things that pytorch does into rust. This gives us more control over
data loading (e.g. using shards or a hash-based split) and it allows
permutations to be persistent. In particular I hope to be able to:

* Create a persistent permutation
* This permutation can handle splits, filtering, shuffling, and sharding
* Create a rust data loader that can read a permutation (one or more
splits), or a subset of a permutation (for DDP)
* Create a python data loader that delegates to the rust data loader

Eventually create integrations for other data loading libraries,
including rust & node
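
To make the idea of a reproducible permutation with splits and shards concrete, here is a toy sketch using only the standard library. It is not the Rust implementation discussed above; `build_permutation` and `shard` are hypothetical names, and a seeded shuffle stands in for persistence.

```python
import random
from typing import Dict, List


def build_permutation(
    num_rows: int, seed: int, train_fraction: float
) -> Dict[str, List[int]]:
    """Shuffle row ids deterministically and divide them into two splits."""
    row_ids = list(range(num_rows))
    random.Random(seed).shuffle(row_ids)  # same seed gives the same order
    cut = int(num_rows * train_fraction)
    return {"train": row_ids[:cut], "test": row_ids[cut:]}


def shard(row_ids: List[int], rank: int, world_size: int) -> List[int]:
    """Give each worker every world_size-th row, starting at its rank."""
    return row_ids[rank::world_size]


permutation = build_permutation(num_rows=10, seed=42, train_fraction=0.8)
worker_0 = shard(permutation["train"], rank=0, world_size=2)
worker_1 = shard(permutation["train"], rank=1, world_size=2)
```

Each worker sees a disjoint subset of the train split, which is the property a DDP-style loader needs.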
2025-10-09 18:07:31 -07:00
BubbleCal
b59d1007d3 feat(index): add IVF_RQ index type (#2687)
This exposes the IVF_RQ (RabitQ quantization) index type to lancedb.

---------

Signed-off-by: BubbleCal <bubble-cal@outlook.com>
2025-10-09 15:46:18 +08:00
Wyatt Alt
3594538509 fix: add name to index config and fix create_index typing (#2660)
Co-authored-by: Mark McCaskey <markm@harvey.ai>
2025-10-08 04:41:30 -07:00
Ed Rogers
d0ce489b21 fix: use stdlib override when possible (#2699)
## Description of changes

Fixes #2698  

This PR uses
[`typing.override`](https://docs.python.org/3/library/typing.html#typing.override)
in favor of the [`overrides`](https://pypi.org/project/overrides/)
dependency when possible. As of Python 3.12, the standard library offers
`typing.override` to perform a static check on overridden methods.

### Motivation

Currently, `overrides` is incompatible with Python 3.14. As a result,
any package that attempts to import `overrides` using Python 3.14+ will
raise an `AttributeError`. An
[issue](https://github.com/mkorpela/overrides/issues/127) has been
raised and a [pull
request](https://github.com/mkorpela/overrides/pull/133) has been
submitted to the GitHub repo for the `overrides` project. But the
maintainer has been unresponsive.

To ensure readiness for Python 3.14, this package (and any other package
directly depending on `overrides`) should consider using
`typing.override` instead.

### Impact

The standard library added `typing.override` as of 3.12. As a result,
this change will affect only users of Python 3.12+. Previous versions
will continue to rely on `overrides`. Notably, the standard library
implementation is slightly different than that of `overrides`. A
thorough discussion of those differences is shown in [PEP
698](https://peps.python.org/pep-0698/), and it is also summarized
nicely by the maintainer of `overrides`
[here](https://github.com/mkorpela/overrides/issues/126#issuecomment-2401327116).

There are 2 main ways that switching from `overrides` to
`typing.override` will have an impact on developers of this repo.
1. `typing.override` does not implement any runtime checking. Instead,
it provides information to type checkers.
2. The stdlib does not provide a mixin class to enforce override
decorators on child classes. (Their reasoning for this is explained in
[the PEP](https://peps.python.org/pep-0698/).) This PR disables that
behavior entirely by replacing the `EnforceOverrides`.
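
A version-gated import along these lines illustrates the approach. This is a sketch only: to stay stdlib-only, the fallback branch below uses a no-op stand-in, whereas the PR describes falling back to the `overrides` package on older interpreters.

```python
import sys

if sys.version_info >= (3, 12):
    from typing import override
else:
    # Stand-in for older interpreters so the sketch stays stdlib-only;
    # the commit describes falling back to the `overrides` package instead.
    def override(method):
        return method


class Base:
    def describe(self) -> str:
        return "base"


class Child(Base):
    @override  # checked statically by type checkers, not at runtime
    def describe(self) -> str:
        return "child"
```

Because `typing.override` performs no runtime check, a misspelled method name would only be reported by a type checker, which matches point 1 above.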
2025-10-06 11:23:20 -07:00
Weston Pace
e07389a36c feat: allow bitmap indexes on large-string, binary, large-binary, and bitmap (#2678)
The underlying `pylance` already supported this; it was just blocked by
an over-eager validation function.

Closes #1981
2025-09-25 09:46:42 -07:00
Will Jones
d617cdef4a feat: add use_index parameter to merge insert operations (#2674)
## Summary

Exposes `use_index` Merge Insert parameter, which was created upstream
in https://github.com/lancedb/lance/pull/4688.

## API Examples

### Python
```python
# Force table scan
table.merge_insert(["id"]) \
    .when_not_matched_insert_all() \
    .use_index(False) \
    .execute(data)
```

### Node.js/TypeScript
```typescript
// Force table scan  
await table.mergeInsert("id")
    .whenNotMatchedInsertAll()
    .useIndex(false)
    .execute(data);
```

### Rust
```rust
// Force table scan
let mut builder = table.merge_insert(&["id"]);
builder.when_not_matched_insert_all()
       .use_index(false);
builder.execute(data).await?;
```

🤖 Generated with [Claude Code](https://claude.ai/code)

Co-authored-by: Claude <noreply@anthropic.com>
2025-09-24 12:50:21 -07:00
Will Jones
1ab60fae7f feat: upgrade Lance to v0.37.0 (#2672)
Change logs:

* https://github.com/lancedb/lance/releases/tag/v0.37.0
* https://github.com/lancedb/lance/releases/tag/v0.36.0
2025-09-23 13:41:47 -07:00
Ayush Chaurasia
e921c90c1b feat: support mean reciprocal rank reranker (#2671)
The basic idea of MRR is described here:
https://www.evidentlyai.com/ranking-metrics/mean-reciprocal-rank-mrr
I've implemented a weighted version that lets the user set the relative
weight of vector and FTS results.

The gist is something like this 

### Scenario A: Document at rank 1 in one set, absent from another

```
# Assuming equal weights: weight_vector = 0.5, weight_fts = 0.5
vector_rr = 1.0  # rank 1 → 1/1 = 1.0
fts_rr = 0.0     # absent → 0.0

weighted_mrr = 0.5 × 1.0 + 0.5 × 0.0 = 0.5
```
### Scenario B: Document at rank 1 in one set, rank 2 in another
```
# Same weights: weight_vector = 0.5, weight_fts = 0.5
vector_rr = 1.0  # rank 1 → 1/1 = 1.0
fts_rr = 0.5     # rank 2 → 1/2 = 0.5

weighted_mrr = 0.5 × 1.0 + 0.5 × 0.5 = 0.5 + 0.25 = 0.75
```
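
The arithmetic in the two scenarios can be reproduced with a few lines of Python. This is a sketch of the formula as written above, not the reranker's actual code; the function names are hypothetical.

```python
from typing import Optional


def reciprocal_rank(rank: Optional[int]) -> float:
    """Return 1/rank for a 1-based rank, or 0.0 when the document is absent."""
    return 0.0 if rank is None else 1.0 / rank


def weighted_mrr(
    vector_rank: Optional[int],
    fts_rank: Optional[int],
    weight_vector: float = 0.5,
    weight_fts: float = 0.5,
) -> float:
    """Combine the two reciprocal ranks using the given weights."""
    return (
        weight_vector * reciprocal_rank(vector_rank)
        + weight_fts * reciprocal_rank(fts_rank)
    )


scenario_a = weighted_mrr(vector_rank=1, fts_rank=None)  # 0.5
scenario_b = weighted_mrr(vector_rank=1, fts_rank=2)     # 0.75
```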

And so with `return_score="all"` the result looks something like this
(this is from the reranker tests).
Because this is a weighted, rank-based reranker, some results might have
the same score.
```
                                                 text                                             vector     _distance      _rowid     _score  _relevance_score
0                                    I am your father  [-0.010703234, 0.069315575, 0.030076642, 0.002...  8.149148e-13  8589934598  10.978719          1.000000
1                          the ground beneath my feet  [-0.09500901, 0.00092102867, 0.0755851, 0.0372...  1.376896e+00  8589934604        NaN          0.250000
2                I find your lack of faith disturbing  [0.07525753, -0.0100010475, 0.09990541, 0.0209...           NaN  8589934595   3.483394          0.250000
3                               but I don't wanna die  [0.033476487, -0.011235877, -0.057625435, -0.0...  1.538222e+00  8589934610   1.130355          0.238095
4   if you strike me down I shall become more powe...  [0.00432201, 0.030120496, 5.3317923e-05, 0.033...  1.381086e+00  8589934594   0.715157          0.216667
5           I see a salty message written in the eves  [-0.04213107, 0.0016004723, 0.061052393, -0.02...  1.638301e+00  8589934603   1.043785          0.133333
6                              but his son was mortal  [0.012462767, 0.049041674, -0.057339743, -0.04...  1.421566e+00  8589934620        NaN          0.125000
7                   I've got a bad feeling about this  [-0.06973199, -0.029960092, 0.02641632, -0.031...           NaN  8589934596   1.043785          0.125000
8    now that's a name I haven't heard in a long time  [-0.014374257, -0.013588792, -0.07487557, 0.03...  1.597573e+00  8589934593   0.848772          0.118056
9                                        he was a god  [-0.0258895, 0.11925236, -0.029397793, 0.05888...  1.423147e+00  8589934618        NaN          0.100000
10                 I wish they would make another one  [-0.14737535, -0.015304729, 0.04318139, -0.061...           NaN  8589934622   1.043785          0.100000
11                                   Kratos had a son  [-0.057455737, 0.13734367, -0.03537109, -0.000...  1.488075e+00  8589934617        NaN          0.083333
12                       I don't wanna live like this  [-0.0028891307, 0.015214227, 0.025183653, 0.08...           NaN  8589934609   1.043785          0.071429
13             I see a mansard roof through the trees  [0.052383978, 0.087759204, 0.014739997, 0.0239...           NaN  8589934602   1.043785          0.062500
14                          great kid don't get cocky  [-0.047043696, 0.054648954, -0.008509666, -0.0...  1.618125e+00  8589934592        NaN          0.055556
```
2025-09-23 18:25:18 +05:30
Jack Ye
ff71d7e552 feat: support shallow clone (#2653)
Support shallow cloning a dataset at a specific location to create a new
dataset, using the shallow_clone feature in Lance. Also introduce remote
`clone` API for remote tables for this functionality.
2025-09-21 21:28:40 -07:00
Jack Ye
5b397e410b chore: fix out of date tests with new namespace validation (#2663)
Failure:
https://github.com/lancedb/lancedb/actions/runs/17820044478/job/50660516344
2025-09-18 13:29:47 -07:00
Le Duc Manh
4c9fc3044b fix: use create to resolve variables (#2640)
# What
- Use `create` to resolve variables values

# Reference
Fixes #2181

---------

Co-authored-by: Will Jones <willjones127@gmail.com>
2025-09-12 13:07:32 -07:00
BubbleCal
f7d78c3420 feat: add 'target_partition_size' param (#2642)
This exposes the `target_partition_size` parameter from lance.

---------

Signed-off-by: BubbleCal <bubble-cal@outlook.com>
2025-09-11 22:56:16 +08:00
Jack Ye
8da74dcb37 feat: support per-request header override (#2631)
## Summary

This PR introduces a `HeaderProvider` which is called for all remote
HTTP calls to get the latest headers to inject. This is useful for
features like adding the latest auth tokens where the header provider
can auto-refresh tokens internally and each request always set the
refreshed token.
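
As an illustration of the auto-refresh idea, a provider might cache a token and renew it once it expires. The class below is a hypothetical sketch: its name, the `get_headers` method, and the `fetch_token` callback are assumptions for the example, not the interface added by this PR.

```python
import time
from typing import Callable, Dict, Tuple


class RefreshingTokenHeaders:
    """Hypothetical provider that refreshes its bearer token when stale."""

    def __init__(
        self,
        fetch_token: Callable[[], Tuple[str, float]],
        clock: Callable[[], float] = time.monotonic,
    ):
        self._fetch_token = fetch_token  # returns (token, lifetime_seconds)
        self._clock = clock
        self._token = ""
        self._expires_at = float("-inf")  # forces a fetch on first use

    def get_headers(self) -> Dict[str, str]:
        # Called before every request, so each request sees a valid token.
        if self._clock() >= self._expires_at:
            self._token, lifetime = self._fetch_token()
            self._expires_at = self._clock() + lifetime
        return {"Authorization": f"Bearer {self._token}"}
```

Injecting the clock keeps the refresh logic testable without sleeping.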

---------

Co-authored-by: Claude <noreply@anthropic.com>
2025-09-10 13:44:00 -07:00
Jack Ye
9391ad1450 feat: support mTLS for remote database (#2638)
This PR adds mTLS (mutual TLS) configuration support for the LanceDB
remote HTTP client, allowing users to authenticate with client
certificates and configure custom CA certificates for server
verification.

---------

Co-authored-by: Claude <noreply@anthropic.com>
2025-09-09 21:04:46 -07:00
Wyatt Alt
a9ea785b15 fix: remote python sdk namespace typing (#2620)
This changes the default values for some namespace parameters in the
remote python SDK from None to [], to match the underlying code it
calls.

Prior to this commit, failing to supply "namespace" with the remote SDK
would cause an error because the underlying code it dispatches to does
not consider None to be valid input.
2025-09-02 16:32:32 -07:00
Colin Patrick McCabe
cc38453391 fix!: fix doctest in query.py (#2622)
Fix doctest in query.py to include cumulative_cpu, now that lance
includes that.
2025-09-02 15:47:32 -07:00
Will Jones
f6846004ca feat: add name parameter to remaining Python create index calls (#2617)
## Summary
This PR adds the missing `name` parameter to `create_scalar_index` and
`create_fts_index` methods in the Python SDK, which was inadvertently
omitted when it was added to `create_index` in PR #2586.

## Changes
- Add `name: Optional[str] = None` parameter to abstract
`Table.create_scalar_index` and `Table.create_fts_index` methods
- Update `LanceTable` implementation to accept and pass the `name`
parameter to the underlying Rust layer
- Update `RemoteTable` implementation to accept and pass the `name`
parameter
- Enhanced tests to verify custom index names work correctly for both
scalar and FTS indices
- When `name` is not provided, default names are generated (e.g.,
`{column}_idx`)

## Test plan
- [x] Added test cases for custom names in scalar index creation
- [x] Added test cases for custom names in FTS index creation  
- [x] Verified existing tests continue to pass
- [x] Code formatting and linting checks pass

This ensures API consistency across all index creation methods in the
LanceDB Python SDK.

Fixes #2616

🤖 Generated with [Claude Code](https://claude.ai/code)

---------

Co-authored-by: Claude <noreply@anthropic.com>
2025-08-27 14:02:48 -07:00
Jack Ye
faf8973624 feat!: support multi-level namespace (#2603)
This PR adds support of multi-level namespace in a LanceDB database,
according to the Lance Namespace spec.

This allows users to create namespaces inside a database connection and
to perform create, drop, list, and list_tables operations in a namespace.
(Other operations like update and describe will be in a follow-up PR.)

The 3 types of database connections behave like the following:
1 Local database connections will continue to have just a flat list of
tables for backwards compatibility.
2. Remote database connections will make REST API calls according to the
APIs in the Lance Namespace spec.
3. Lance Namespace connections will invoke the corresponding operations
against the specific namespace implementation which could have different
behaviors regarding these APIs.

All the table APIs now take identifier instead of name, for example
`/v1/table/{name}/create` is now `/v1/table/{id}/create`. If a table is
directly in the root namespace, the API call is identical. If the table
is in a namespace, then the full table ID should be used, with `$` as
the default delimiter (`.` is a special character and creates issues
with URL parsing so `$` is used), for example
`/v1/table/ns1$table1/create`. If a different parameter needs to be
passed in, user can configure the `id_delimiter` in client config and
that becomes a query parameter, for example
`/v1/table/ns1__table1/create?delimiter=__`
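
The routing rule above can be sketched as a small helper. `table_create_path` is a hypothetical function written for illustration; it does no URL encoding and is not the client's actual implementation.

```python
from typing import Sequence

DEFAULT_DELIMITER = "$"


def table_create_path(
    namespace: Sequence[str], table: str, delimiter: str = DEFAULT_DELIMITER
) -> str:
    """Join namespace levels and table name into an ID and build the route."""
    table_id = delimiter.join([*namespace, table])
    path = f"/v1/table/{table_id}/create"
    if delimiter != DEFAULT_DELIMITER:
        # A non-default delimiter is passed along as a query parameter.
        path += f"?delimiter={delimiter}"
    return path
```

For a table in the root namespace (an empty `namespace`), the ID is just the table name, so the path is identical to the previous `/v1/table/{name}/create` form.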

The Python and Typescript APIs are kept backwards compatible, but the
following Rust APIs are not:
1. `Connection::drop_table(&self, name: impl AsRef<str>) -> Result<()>`
is now `Connection::drop_table(&self, name: impl AsRef<str>, namespace:
&[String]) -> Result<()>`
2. `Connection::drop_all_tables(&self) -> Result<()>` is now
`Connection::drop_all_tables(&self, name: impl AsRef<str>) ->
Result<()>`
2025-08-27 12:07:55 -07:00
Weston Pace
fabe37274f feat: add __getitems__ method impl for torch integration (#2596)
This allows a lancedb Table to act as a torch dataset.
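
For context, a map-style dataset that defines `__getitems__` can serve a whole batch of indices in one call, which, to my understanding, PyTorch's data loader uses when the method is present. The class below is a hypothetical in-memory stand-in, not the LanceDB `Table`.

```python
from typing import Dict, List


class RowDataset:
    """Hypothetical map-style dataset over in-memory rows."""

    def __init__(self, rows: List[Dict[str, int]]):
        self._rows = rows

    def __len__(self) -> int:
        return len(self._rows)

    def __getitem__(self, index: int) -> Dict[str, int]:
        return self._rows[index]

    def __getitems__(self, indices: List[int]) -> List[Dict[str, int]]:
        # One call fetches the whole batch instead of one call per index.
        return [self._rows[i] for i in indices]


dataset = RowDataset([{"id": i} for i in range(5)])
batch = dataset.__getitems__([3, 0, 4])
```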
2025-08-25 13:23:22 -07:00
Jack Ye
04285a4a4e feat(python): integrate with lance namespace (#2599)
This PR integrates `lancedb` with `lance-namespace` so that users can
use the LanceDB client to access Lance tables in any catalog service. In
general, we expect most of the logic to be delegated to the existing
`LanceDBConnection` and `LanceTable`, but the namespace implementation
will control how a table is created and dropped, and will describe where
the table is stored along with any related storage options like access
credentials.

The implementation currently only supports a single-level namespace that
directly contains tables. We will introduce nested namespace support in
a separate PR.

Users are expected to use it in the following way:

```python
>>> import lancedb
>>> import pyarrow as pa
>>> # Connect using GlueNamespace
>>> db = lancedb.connect_namespace("glue", {"catalog_id": "123456789012"})
>>> # Create a table with schema
>>> schema = pa.schema([
...     pa.field("id", pa.int64()),
...     pa.field("vector", pa.list_(pa.float32(), 2))
... ])
>>> table = db.create_table("my_table", schema=schema)
>>> # List tables
>>> db.table_names()
['my_table']
```
2025-08-20 15:46:16 -07:00
Will Jones
ad09234d59 feat: allow setting train=False and name on indices (#2586)
Enables two new parameters when building indices:

* `name`: Allows explicitly setting a name on the index. Default is
`{col_name}_idx`.
* `train` (default `True`): When set to `False`, an empty index will be
immediately created.

The upgrade of Lance means there are also additional behaviors from
cd76a993b8:

* When a scalar index is created on a Table, it will be kept around even
if all rows are deleted or updated.
* Scalar indices can be created on empty tables. They will default to
`train=False` if the table is empty.

---------

Co-authored-by: Weston Pace <weston.pace@gmail.com>
2025-08-15 14:00:26 -07:00