Compare commits

21 Commits

Author SHA1 Message Date
lancedb automation
2ae26b4ad7 chore: update lance dependency to v1.0.0-beta.9 2025-11-25 05:13:12 +00:00
LanceDB Robot
d6daa08b54 chore: update lance dependency to v1.0.0-beta.8 (#2813)
## Summary
- bump all Lance crates to v1.0.0-beta.8 via ci/set_lance_version.py
- verified 
- ran 

Trigger: refs/tags/v1.0.0-beta.8
2025-11-24 14:58:42 -08:00
Wyatt Alt
17b71de22e feat: update codex url key (#2812)
This previously threw an unknown-key error for `htmlUrl` and indicated that `url` is a
valid field.
2025-11-24 13:13:18 -08:00
Prashanth Rao
a250d8e7df docs: improve docstring for RabitQ in Python (#2808)
This PR improves the docstring for `IVF_RQ` (RabitQ) in Python. The
earlier version referred to it as "residual quantization", which is
confusing to future readers of the code.

In contrast, the TypeScript and Rust codebases defined `IVF_RQ` as
RabitQ. So now the three languages use comments that are consistent with
one another.

---------

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
2025-11-24 13:35:19 +08:00
LanceDB Robot
5a2b33581e chore: update lance dependency to v1.0.0-beta.7 (#2807)
## Summary
- bump Lance dependencies to v1.0.0-beta.7 via `ci/set_lance_version.py`
- verified workspace with `cargo clippy --workspace --tests
--all-features -- -D warnings`
- formatted with `cargo fmt --all`

Trigger: refs/tags/v1.0.0-beta.7
2025-11-21 21:42:09 -08:00
Jack Ye
3d254f61b0 ci: trigger downstream verification after version bump (#2809) 2025-11-21 09:50:23 -08:00
Jack Ye
d15e380be1 ci: add support for lance-format fury index for downloading pylance (#2804)
Set the lance-format fury repo for most places that are downloading. For
uploading, it is kept unchanged since lancedb is published to lancedb
fury.
2025-11-20 23:29:36 -08:00
Jack Ye
0baf807be0 ci: use larger runner for doctest and fix failing tests (#2801)
Currently, the test fails partway through installation, around the point where pytorch is installed.
2025-11-20 19:44:31 -08:00
Prashanth Rao
76bcc78910 docs: nodejs failing CI is fixed (#2802)
Fixes the breaking CI for nodejs, related to the documentation of the
new Permutation API in typescript.

- Expanded the generated typings in `nodejs/lancedb/native.d.ts` to
include `SplitCalculatedOptions`, `splitNames` fields, and the
persist/options-based `splitCalculated` methods so the permutation
exports match the native API.
- The previous block comment had an inconsistency.
`splitCalculated` takes an options object (`SplitCalculatedOptions`) in
our bindings, not a bare string. The previous example showed
`builder.splitCalculated("user_id % 3");`, which doesn’t match the
actual signature and would fail TS typecheck. I updated the comment to
`builder.splitCalculated({ calculation: "user_id % 3" });` so the
example is now correct.
- Updated the `splitCalculated` example in
`nodejs/lancedb/permutation.ts` to use the options object.
- Ran `npm run docs` to ensure the docs build correctly.

> [!NOTE]
> **Disclaimer**: I used GPT-5.1-Codex-Max to make these updates, but I
have read the code and run `npm run docs` to verify that they work and
are correct to the best of my knowledge.
2025-11-20 16:16:38 -08:00
Prashanth Rao
135dfdc7ec docs: 404 and outdated URLs should now work (#2800)
Did a full scan of all URLs that used to point to the old mkdocs pages;
they now link to the appropriate pages on lancedb.com/docs or the lance.org
docs.

---------

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
2025-11-20 11:14:20 -08:00
Will Jones
6f39108857 docs: add some missing classes (#2450)
<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->

## Summary by CodeRabbit

- **Documentation**
- Expanded Python API reference with new entries for table metadata,
tagging, remote client configuration, and index statistics.
- Added documentation for new classes and modules in both synchronous
and asynchronous sections, including `FragmentStatistics`,
`FragmentSummaryStats`, `Tags`, `AsyncTags`, `IndexStatistics`, and
remote configuration options.

<!-- end of auto-generated comment: release notes by coderabbit.ai -->
2025-11-20 11:04:16 -08:00
Jackson Hew
bb6b0bea0c fix: .phrase_query() not working (#2781)
The `self._query` value was not set when wrapping its copy `query` with
quotation marks.

The test for phrase queries has been updated to test the
`.phrase_query()` method as well, which will catch this bug.

---------

Co-authored-by: Will Jones <willjones127@gmail.com>
2025-11-20 10:32:37 -08:00
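
The class of bug described in this commit can be sketched in isolation. The class and method names below are hypothetical stand-ins, not the actual LanceDB query builder; the sketch only assumes what the message states, namely that a local copy `query` was wrapped in quotation marks without being written back to `self._query`.

```python
class PhraseQuerySketch:
    """Hypothetical stand-in for a full-text query builder (not the LanceDB class)."""

    def __init__(self, query: str) -> None:
        self._query = query

    def phrase_query_buggy(self, phrase_query: bool = True) -> "PhraseQuerySketch":
        query = self._query
        if phrase_query:
            query = f'"{query}"'  # only the local copy is quoted
        # Bug: `query` is never written back, so self._query stays unquoted.
        return self

    def phrase_query_fixed(self, phrase_query: bool = True) -> "PhraseQuerySketch":
        query = self._query
        if phrase_query:
            query = f'"{query}"'
        self._query = query  # the fix: store the wrapped value
        return self


buggy = PhraseQuerySketch("puppy runs").phrase_query_buggy()
fixed = PhraseQuerySketch("puppy runs").phrase_query_fixed()
```

A test that only checks the return type would pass for both variants, which is presumably why the updated test exercises the method's effect on the query itself.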
Jack Ye
0084eb238b fix: use None default for namespace (#2797)
Realized that using `[]` as a default argument value is an anti-pattern in Python:
https://docs.python-guide.org/writing/gotchas/
2025-11-20 10:23:41 -08:00
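
A minimal sketch of the gotcha referenced in this commit, using hypothetical helper names (the real `namespace` parameter lives on LanceDB functions that are not shown here). A default list is created once at function definition time and shared across calls, whereas a `None` default lets each call build a fresh list.

```python
from typing import List, Optional


def resolve_buggy(name: str, namespace: List[str] = []) -> List[str]:
    """Hypothetical helper: the default list is shared between calls."""
    namespace.append(name)
    return namespace


def resolve_fixed(name: str, namespace: Optional[List[str]] = None) -> List[str]:
    """Hypothetical helper: a new list is created on every call."""
    if namespace is None:
        namespace = []
    namespace.append(name)
    return namespace


first_buggy = resolve_buggy("a")
second_buggy = resolve_buggy("b")  # same list object as first_buggy
first_fixed = resolve_fixed("a")
second_fixed = resolve_fixed("b")
```

With the buggy variant, both results refer to one list that accumulates every name passed so far; with the `None` default, the two calls stay independent.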
LanceDB Robot
28ab29a3f0 chore: update lance dependency to v1.0.0-beta.5 (#2798)
## Summary
- bump all Lance workspace dependencies to v1.0.0-beta.5
- verified `cargo clippy --workspace --tests --all-features -- -D
warnings`
- ran `cargo fmt --all`

Triggered by refs/tags/v1.0.0-beta.5
2025-11-20 17:43:24 +08:00
Colin Patrick McCabe
7d3f5348a7 feat: implement head() for remote tables (#2793)
Implement the head() function for RemoteTable.
2025-11-19 12:49:34 -08:00
Lance Release
3531393523 Bump version: 0.22.4-beta.1 → 0.22.4-beta.2 2025-11-19 20:25:41 +00:00
Lance Release
93b8ac8e3e Bump version: 0.25.4-beta.1 → 0.25.4-beta.2 2025-11-19 20:24:46 +00:00
Jack Ye
1b78ccedaf feat: support async namespace connection (#2788)
Also fixes 2 bugs:
1. make the storage options provider serializable in Ray
2. fix the incorrect URI in table.to_table() for namespace-backed tables
2025-11-19 12:23:50 -08:00
Mykola Skrynnyk
ca8d118f78 feat(python): support to_pydantic in async (#2438)
This request improves the `pydantic` integration by adding a
`to_pydantic` method to asynchronous queries and by handling models that
use `alias` in field definitions. Fixes #2436 and closes #2437.

<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->
## Summary by CodeRabbit

- **New Features**
- Added support for converting asynchronous query results to Pydantic
models.
- **Bug Fixes**
- Simplified conversion of query results to Pydantic models for improved
reliability.
- Improved handling of field aliases and computed fields when mapping
query results to Pydantic models.
- **Tests**
- Added tests to verify correct mapping of aliased and computed fields
in both synchronous and asynchronous scenarios.
<!-- end of auto-generated comment: release notes by coderabbit.ai -->
2025-11-19 11:20:14 -08:00
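
The alias handling mentioned above can be illustrated without Pydantic. The sketch below is a standard-library stand-in: it uses `dataclasses` field metadata in place of Pydantic's `alias`, and the model, column names, and `rows_to_models` helper are all hypothetical. It shows only the general idea of looking up each field by its alias when mapping result rows to model instances, not how `to_pydantic` is implemented.

```python
from dataclasses import dataclass, field, fields
from typing import Any, Dict, List


@dataclass
class Item:
    # Hypothetical model: the stored column is "item-id", the attribute is item_id.
    item_id: int = field(metadata={"alias": "item-id"})
    text: str = ""


def rows_to_models(rows: List[Dict[str, Any]], model: type) -> list:
    """Build one model instance per row, reading each field by its alias
    and falling back to the attribute name when no alias is declared."""
    out = []
    for row in rows:
        kwargs = {}
        for f in fields(model):
            column = f.metadata.get("alias", f.name)
            if column in row:
                kwargs[f.name] = row[column]
        out.append(model(**kwargs))
    return out


rows = [{"item-id": 1, "text": "foo"}, {"item-id": 2, "text": "bar"}]
items = rows_to_models(rows, Item)
```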
Wyatt Alt
386fc9e466 feat: add num_attempts to merge insert result (#2795)
This pipes the num_attempts field from lance's merge insert result
through lancedb. This allows callers of merge_insert to get a better
idea of whether transaction conflicts are occurring.
2025-11-19 09:32:57 -08:00
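
As a rough sketch of how a caller might use the new field: the dataclass below is a hypothetical mirror of the merge insert result (field names taken from the `MergeResult` struct in this diff), and `retries` is an illustrative helper, not a LanceDB API. It assumes the first attempt is counted as 1, so any value above 1 suggests the operation was retried.

```python
from dataclasses import dataclass


@dataclass
class MergeResultSketch:
    """Hypothetical mirror of the merge insert result fields named in this change."""
    num_inserted_rows: int
    num_updated_rows: int
    num_deleted_rows: int
    num_attempts: int


def retries(result: MergeResultSketch) -> int:
    """Attempts beyond the first (assumes the first attempt is counted as 1)."""
    return max(result.num_attempts - 1, 0)


clean = MergeResultSketch(10, 2, 0, num_attempts=1)
contended = MergeResultSketch(10, 2, 0, num_attempts=3)
```

Logging `retries(result)` per write would be one way to notice when concurrent writers start conflicting.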
Lance Release
ce1bafec1a Bump version: 0.22.4-beta.0 → 0.22.4-beta.1 2025-11-19 12:58:59 +00:00
68 changed files with 1069 additions and 204 deletions

View File

@@ -1,5 +1,5 @@
[tool.bumpversion]
current_version = "0.22.4-beta.0"
current_version = "0.22.4-beta.2"
parse = """(?x)
(?P<major>0|[1-9]\\d*)\\.
(?P<minor>0|[1-9]\\d*)\\.

View File

@@ -18,6 +18,6 @@ body:
label: Link
description: >
Provide a link to the existing documentation, if applicable.
placeholder: ex. https://lancedb.github.io/lancedb/guides/tables/...
placeholder: ex. https://lancedb.com/docs/tables/...
validations:
required: false

View File

@@ -31,7 +31,7 @@ runs:
with:
command: build
working-directory: python
docker-options: "-e PIP_EXTRA_INDEX_URL=https://pypi.fury.io/lancedb/"
docker-options: "-e PIP_EXTRA_INDEX_URL='https://pypi.fury.io/lance-format/ https://pypi.fury.io/lancedb/'"
target: x86_64-unknown-linux-gnu
manylinux: ${{ inputs.manylinux }}
args: ${{ inputs.args }}
@@ -46,7 +46,7 @@ runs:
with:
command: build
working-directory: python
docker-options: "-e PIP_EXTRA_INDEX_URL=https://pypi.fury.io/lancedb/"
docker-options: "-e PIP_EXTRA_INDEX_URL='https://pypi.fury.io/lance-format/ https://pypi.fury.io/lancedb/'"
target: aarch64-unknown-linux-gnu
manylinux: ${{ inputs.manylinux }}
args: ${{ inputs.args }}

View File

@@ -22,5 +22,5 @@ runs:
command: build
# TODO: pass through interpreter
args: ${{ inputs.args }}
docker-options: "-e PIP_EXTRA_INDEX_URL=https://pypi.fury.io/lancedb/"
docker-options: "-e PIP_EXTRA_INDEX_URL='https://pypi.fury.io/lance-format/ https://pypi.fury.io/lancedb/'"
working-directory: python

View File

@@ -26,7 +26,7 @@ runs:
with:
command: build
args: ${{ inputs.args }}
docker-options: "-e PIP_EXTRA_INDEX_URL=https://pypi.fury.io/lancedb/"
docker-options: "-e PIP_EXTRA_INDEX_URL='https://pypi.fury.io/lance-format/ https://pypi.fury.io/lancedb/'"
working-directory: python
- uses: actions/upload-artifact@v4
with:

View File

@@ -98,3 +98,30 @@ jobs:
printenv OPENAI_API_KEY | codex login --with-api-key
codex --config shell_environment_policy.ignore_default_excludes=true exec --dangerously-bypass-approvals-and-sandbox "$(cat /tmp/codex-prompt.txt)"
- name: Trigger sophon dependency update
env:
TAG: ${{ inputs.tag }}
GH_TOKEN: ${{ secrets.ROBOT_TOKEN }}
run: |
set -euo pipefail
VERSION="${TAG#refs/tags/}"
VERSION="${VERSION#v}"
LANCEDB_BRANCH="codex/update-lance-${VERSION//[^a-zA-Z0-9]/-}"
echo "Triggering sophon workflow with:"
echo " lance_ref: ${TAG#refs/tags/}"
echo " lancedb_ref: ${LANCEDB_BRANCH}"
gh workflow run codex-bump-lancedb-lance.yml \
--repo lancedb/sophon \
-f lance_ref="${TAG#refs/tags/}" \
-f lancedb_ref="${LANCEDB_BRANCH}"
- name: Show latest sophon workflow run
env:
GH_TOKEN: ${{ secrets.ROBOT_TOKEN }}
run: |
set -euo pipefail
echo "Latest sophon workflow run:"
gh run list --repo lancedb/sophon --workflow codex-bump-lancedb-lance.yml --limit 1 --json databaseId,url,displayTitle
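
The shell parameter expansions in the step above can be mirrored in Python to see what values are passed to the downstream workflow. This is only an illustrative sketch; `strip_prefix` and `sophon_inputs` are hypothetical names, and the sample tag is taken from the commit list.

```python
import re
from typing import Dict


def strip_prefix(value: str, prefix: str) -> str:
    """Equivalent of ${value#prefix} for a literal prefix."""
    return value[len(prefix):] if value.startswith(prefix) else value


def sophon_inputs(tag: str) -> Dict[str, str]:
    lance_ref = strip_prefix(tag, "refs/tags/")      # ${TAG#refs/tags/}
    version = strip_prefix(lance_ref, "v")           # ${VERSION#v}
    safe = re.sub(r"[^a-zA-Z0-9]", "-", version)     # ${VERSION//[^a-zA-Z0-9]/-}
    return {
        "lance_ref": lance_ref,
        "lancedb_ref": "codex/update-lance-" + safe,
    }


inputs = sophon_inputs("refs/tags/v1.0.0-beta.9")
```

Under these rules, every character outside `[a-zA-Z0-9]` in the version becomes a hyphen, so dots and the pre-release hyphen are treated alike in the branch name.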

View File

@@ -24,7 +24,7 @@ env:
# according to: https://matklad.github.io/2021/09/04/fast-rust-builds.html
# CI builds are faster with incremental disabled.
CARGO_INCREMENTAL: "0"
PIP_EXTRA_INDEX_URL: "https://pypi.fury.io/lancedb/"
PIP_EXTRA_INDEX_URL: "https://pypi.fury.io/lance-format/ https://pypi.fury.io/lancedb/"
jobs:
# Single deploy job since we're just deploying
@@ -50,8 +50,8 @@ jobs:
- name: Build Python
working-directory: python
run: |
python -m pip install --extra-index-url https://pypi.fury.io/lancedb/ -e .
python -m pip install --extra-index-url https://pypi.fury.io/lancedb/ -r ../docs/requirements.txt
python -m pip install --extra-index-url https://pypi.fury.io/lance-format/ --extra-index-url https://pypi.fury.io/lancedb/ -e .
python -m pip install --extra-index-url https://pypi.fury.io/lance-format/ --extra-index-url https://pypi.fury.io/lancedb/ -r ../docs/requirements.txt
- name: Set up node
uses: actions/setup-node@v3
with:
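
This hunk uses two spellings of the same configuration: a space-separated `PIP_EXTRA_INDEX_URL` value and repeated `--extra-index-url` flags. The sketch below shows the correspondence, assuming pip's documented behaviour of splitting such environment values on whitespace; `extra_index_flags` is a hypothetical helper, not part of pip.

```python
from typing import List


def extra_index_flags(env_value: str) -> List[str]:
    """Expand a space-separated PIP_EXTRA_INDEX_URL value into explicit flags."""
    flags: List[str] = []
    for url in env_value.split():
        flags += ["--extra-index-url", url]
    return flags


env_value = "https://pypi.fury.io/lance-format/ https://pypi.fury.io/lancedb/"
flags = extra_index_flags(env_value)
```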

View File

@@ -59,4 +59,4 @@ jobs:
GH_TOKEN: ${{ secrets.ROBOT_TOKEN }}
run: |
set -euo pipefail
gh run list --workflow codex-update-lance-dependency.yml --limit 1 --json databaseId,htmlUrl,displayTitle
gh run list --workflow codex-update-lance-dependency.yml --limit 1 --json databaseId,url,displayTitle

View File

@@ -11,7 +11,7 @@ on:
- Cargo.toml # Change in dependency frequently breaks builds
env:
PIP_EXTRA_INDEX_URL: "https://pypi.fury.io/lancedb/"
PIP_EXTRA_INDEX_URL: "https://pypi.fury.io/lance-format/ https://pypi.fury.io/lancedb/"
jobs:
linux:

View File

@@ -18,7 +18,7 @@ env:
# Color output for pytest is off by default.
PYTEST_ADDOPTS: "--color=yes"
FORCE_COLOR: "1"
PIP_EXTRA_INDEX_URL: "https://pypi.fury.io/lancedb/"
PIP_EXTRA_INDEX_URL: "https://pypi.fury.io/lance-format/ https://pypi.fury.io/lancedb/"
RUST_BACKTRACE: "1"
jobs:
@@ -79,7 +79,7 @@ jobs:
doctest:
name: "Doctest"
timeout-minutes: 30
runs-on: "ubuntu-24.04"
runs-on: ubuntu-2404-8x-x64
defaults:
run:
shell: bash
@@ -100,7 +100,7 @@ jobs:
sudo apt install -y protobuf-compiler
- name: Install
run: |
pip install --extra-index-url https://pypi.fury.io/lancedb/ -e .[tests,dev,embeddings]
pip install --extra-index-url https://pypi.fury.io/lance-format/ --extra-index-url https://pypi.fury.io/lancedb/ -e .[tests,dev,embeddings]
pip install tantivy
pip install mlx
- name: Doctest
@@ -226,7 +226,7 @@ jobs:
run: |
pip install "pydantic<2"
pip install pyarrow==16
pip install --extra-index-url https://pypi.fury.io/lancedb/ -e .[tests]
pip install --extra-index-url https://pypi.fury.io/lance-format/ --extra-index-url https://pypi.fury.io/lancedb/ -e .[tests]
pip install tantivy
- name: Run tests
run: pytest -m "not slow and not s3_test" -x -v --durations=30 python/tests

View File

@@ -15,7 +15,7 @@ runs:
- name: Install lancedb
shell: bash
run: |
pip3 install --extra-index-url https://pypi.fury.io/lancedb/ $(ls target/wheels/lancedb-*.whl)[tests,dev]
pip3 install --extra-index-url https://pypi.fury.io/lance-format/ --extra-index-url https://pypi.fury.io/lancedb/ $(ls target/wheels/lancedb-*.whl)[tests,dev]
- name: Setup localstack for integration tests
if: ${{ inputs.integration == 'true' }}
shell: bash

70
Cargo.lock generated
View File

@@ -3032,8 +3032,8 @@ checksum = "42703706b716c37f96a77aea830392ad231f44c9e9a67872fa5548707e11b11c"
[[package]]
name = "fsst"
version = "1.0.0-beta.3"
source = "git+https://github.com/lance-format/lance.git?tag=v1.0.0-beta.3#95ff7f6e684c1911fc00c9c2811cce1a61c06ff5"
version = "1.0.0-beta.9"
source = "git+https://github.com/lance-format/lance.git?tag=v1.0.0-beta.9#40dd02c7ab04e82ac774cefa337db3f2626dead7"
dependencies = [
"arrow-array",
"rand 0.9.2",
@@ -4217,8 +4217,8 @@ dependencies = [
[[package]]
name = "lance"
version = "1.0.0-beta.3"
source = "git+https://github.com/lance-format/lance.git?tag=v1.0.0-beta.3#95ff7f6e684c1911fc00c9c2811cce1a61c06ff5"
version = "1.0.0-beta.9"
source = "git+https://github.com/lance-format/lance.git?tag=v1.0.0-beta.9#40dd02c7ab04e82ac774cefa337db3f2626dead7"
dependencies = [
"arrow",
"arrow-arith",
@@ -4282,8 +4282,8 @@ dependencies = [
[[package]]
name = "lance-arrow"
version = "1.0.0-beta.3"
source = "git+https://github.com/lance-format/lance.git?tag=v1.0.0-beta.3#95ff7f6e684c1911fc00c9c2811cce1a61c06ff5"
version = "1.0.0-beta.9"
source = "git+https://github.com/lance-format/lance.git?tag=v1.0.0-beta.9#40dd02c7ab04e82ac774cefa337db3f2626dead7"
dependencies = [
"arrow-array",
"arrow-buffer",
@@ -4301,8 +4301,8 @@ dependencies = [
[[package]]
name = "lance-bitpacking"
version = "1.0.0-beta.3"
source = "git+https://github.com/lance-format/lance.git?tag=v1.0.0-beta.3#95ff7f6e684c1911fc00c9c2811cce1a61c06ff5"
version = "1.0.0-beta.9"
source = "git+https://github.com/lance-format/lance.git?tag=v1.0.0-beta.9#40dd02c7ab04e82ac774cefa337db3f2626dead7"
dependencies = [
"arrayref",
"paste",
@@ -4311,8 +4311,8 @@ dependencies = [
[[package]]
name = "lance-core"
version = "1.0.0-beta.3"
source = "git+https://github.com/lance-format/lance.git?tag=v1.0.0-beta.3#95ff7f6e684c1911fc00c9c2811cce1a61c06ff5"
version = "1.0.0-beta.9"
source = "git+https://github.com/lance-format/lance.git?tag=v1.0.0-beta.9#40dd02c7ab04e82ac774cefa337db3f2626dead7"
dependencies = [
"arrow-array",
"arrow-buffer",
@@ -4348,8 +4348,8 @@ dependencies = [
[[package]]
name = "lance-datafusion"
version = "1.0.0-beta.3"
source = "git+https://github.com/lance-format/lance.git?tag=v1.0.0-beta.3#95ff7f6e684c1911fc00c9c2811cce1a61c06ff5"
version = "1.0.0-beta.9"
source = "git+https://github.com/lance-format/lance.git?tag=v1.0.0-beta.9#40dd02c7ab04e82ac774cefa337db3f2626dead7"
dependencies = [
"arrow",
"arrow-array",
@@ -4378,8 +4378,8 @@ dependencies = [
[[package]]
name = "lance-datagen"
version = "1.0.0-beta.3"
source = "git+https://github.com/lance-format/lance.git?tag=v1.0.0-beta.3#95ff7f6e684c1911fc00c9c2811cce1a61c06ff5"
version = "1.0.0-beta.9"
source = "git+https://github.com/lance-format/lance.git?tag=v1.0.0-beta.9#40dd02c7ab04e82ac774cefa337db3f2626dead7"
dependencies = [
"arrow",
"arrow-array",
@@ -4396,8 +4396,8 @@ dependencies = [
[[package]]
name = "lance-encoding"
version = "1.0.0-beta.3"
source = "git+https://github.com/lance-format/lance.git?tag=v1.0.0-beta.3#95ff7f6e684c1911fc00c9c2811cce1a61c06ff5"
version = "1.0.0-beta.9"
source = "git+https://github.com/lance-format/lance.git?tag=v1.0.0-beta.9#40dd02c7ab04e82ac774cefa337db3f2626dead7"
dependencies = [
"arrow-arith",
"arrow-array",
@@ -4434,8 +4434,8 @@ dependencies = [
[[package]]
name = "lance-file"
version = "1.0.0-beta.3"
source = "git+https://github.com/lance-format/lance.git?tag=v1.0.0-beta.3#95ff7f6e684c1911fc00c9c2811cce1a61c06ff5"
version = "1.0.0-beta.9"
source = "git+https://github.com/lance-format/lance.git?tag=v1.0.0-beta.9#40dd02c7ab04e82ac774cefa337db3f2626dead7"
dependencies = [
"arrow-arith",
"arrow-array",
@@ -4467,8 +4467,8 @@ dependencies = [
[[package]]
name = "lance-index"
version = "1.0.0-beta.3"
source = "git+https://github.com/lance-format/lance.git?tag=v1.0.0-beta.3#95ff7f6e684c1911fc00c9c2811cce1a61c06ff5"
version = "1.0.0-beta.9"
source = "git+https://github.com/lance-format/lance.git?tag=v1.0.0-beta.9#40dd02c7ab04e82ac774cefa337db3f2626dead7"
dependencies = [
"arrow",
"arrow-arith",
@@ -4529,8 +4529,8 @@ dependencies = [
[[package]]
name = "lance-io"
version = "1.0.0-beta.3"
source = "git+https://github.com/lance-format/lance.git?tag=v1.0.0-beta.3#95ff7f6e684c1911fc00c9c2811cce1a61c06ff5"
version = "1.0.0-beta.9"
source = "git+https://github.com/lance-format/lance.git?tag=v1.0.0-beta.9#40dd02c7ab04e82ac774cefa337db3f2626dead7"
dependencies = [
"arrow",
"arrow-arith",
@@ -4570,8 +4570,8 @@ dependencies = [
[[package]]
name = "lance-linalg"
version = "1.0.0-beta.3"
source = "git+https://github.com/lance-format/lance.git?tag=v1.0.0-beta.3#95ff7f6e684c1911fc00c9c2811cce1a61c06ff5"
version = "1.0.0-beta.9"
source = "git+https://github.com/lance-format/lance.git?tag=v1.0.0-beta.9#40dd02c7ab04e82ac774cefa337db3f2626dead7"
dependencies = [
"arrow-array",
"arrow-buffer",
@@ -4587,8 +4587,8 @@ dependencies = [
[[package]]
name = "lance-namespace"
version = "1.0.0-beta.3"
source = "git+https://github.com/lance-format/lance.git?tag=v1.0.0-beta.3#95ff7f6e684c1911fc00c9c2811cce1a61c06ff5"
version = "1.0.0-beta.9"
source = "git+https://github.com/lance-format/lance.git?tag=v1.0.0-beta.9#40dd02c7ab04e82ac774cefa337db3f2626dead7"
dependencies = [
"arrow",
"async-trait",
@@ -4600,8 +4600,8 @@ dependencies = [
[[package]]
name = "lance-namespace-impls"
version = "1.0.0-beta.3"
source = "git+https://github.com/lance-format/lance.git?tag=v1.0.0-beta.3#95ff7f6e684c1911fc00c9c2811cce1a61c06ff5"
version = "1.0.0-beta.9"
source = "git+https://github.com/lance-format/lance.git?tag=v1.0.0-beta.9#40dd02c7ab04e82ac774cefa337db3f2626dead7"
dependencies = [
"arrow",
"arrow-ipc",
@@ -4639,8 +4639,8 @@ dependencies = [
[[package]]
name = "lance-table"
version = "1.0.0-beta.3"
source = "git+https://github.com/lance-format/lance.git?tag=v1.0.0-beta.3#95ff7f6e684c1911fc00c9c2811cce1a61c06ff5"
version = "1.0.0-beta.9"
source = "git+https://github.com/lance-format/lance.git?tag=v1.0.0-beta.9#40dd02c7ab04e82ac774cefa337db3f2626dead7"
dependencies = [
"arrow",
"arrow-array",
@@ -4679,8 +4679,8 @@ dependencies = [
[[package]]
name = "lance-testing"
version = "1.0.0-beta.3"
source = "git+https://github.com/lance-format/lance.git?tag=v1.0.0-beta.3#95ff7f6e684c1911fc00c9c2811cce1a61c06ff5"
version = "1.0.0-beta.9"
source = "git+https://github.com/lance-format/lance.git?tag=v1.0.0-beta.9#40dd02c7ab04e82ac774cefa337db3f2626dead7"
dependencies = [
"arrow-array",
"arrow-schema",
@@ -4691,7 +4691,7 @@ dependencies = [
[[package]]
name = "lancedb"
version = "0.22.4-beta.0"
version = "0.22.4-beta.2"
dependencies = [
"ahash",
"anyhow",
@@ -4786,7 +4786,7 @@ dependencies = [
[[package]]
name = "lancedb-nodejs"
version = "0.22.4-beta.0"
version = "0.22.4-beta.2"
dependencies = [
"arrow-array",
"arrow-ipc",
@@ -4806,7 +4806,7 @@ dependencies = [
[[package]]
name = "lancedb-python"
version = "0.25.4-beta.0"
version = "0.25.4-beta.2"
dependencies = [
"arrow",
"async-trait",

View File

@@ -15,20 +15,20 @@ categories = ["database-implementations"]
rust-version = "1.78.0"
[workspace.dependencies]
lance = { "version" = "=1.0.0-beta.3", default-features = false, "tag" = "v1.0.0-beta.3", "git" = "https://github.com/lance-format/lance.git" }
lance-core = { "version" = "=1.0.0-beta.3", "tag" = "v1.0.0-beta.3", "git" = "https://github.com/lance-format/lance.git" }
lance-datagen = { "version" = "=1.0.0-beta.3", "tag" = "v1.0.0-beta.3", "git" = "https://github.com/lance-format/lance.git" }
lance-file = { "version" = "=1.0.0-beta.3", "tag" = "v1.0.0-beta.3", "git" = "https://github.com/lance-format/lance.git" }
lance-io = { "version" = "=1.0.0-beta.3", default-features = false, "tag" = "v1.0.0-beta.3", "git" = "https://github.com/lance-format/lance.git" }
lance-index = { "version" = "=1.0.0-beta.3", "tag" = "v1.0.0-beta.3", "git" = "https://github.com/lance-format/lance.git" }
lance-linalg = { "version" = "=1.0.0-beta.3", "tag" = "v1.0.0-beta.3", "git" = "https://github.com/lance-format/lance.git" }
lance-namespace = { "version" = "=1.0.0-beta.3", "tag" = "v1.0.0-beta.3", "git" = "https://github.com/lance-format/lance.git" }
lance-namespace-impls = { "version" = "=1.0.0-beta.3", "features" = ["dir-aws", "dir-gcp", "dir-azure", "dir-oss", "rest"], "tag" = "v1.0.0-beta.3", "git" = "https://github.com/lance-format/lance.git" }
lance-table = { "version" = "=1.0.0-beta.3", "tag" = "v1.0.0-beta.3", "git" = "https://github.com/lance-format/lance.git" }
lance-testing = { "version" = "=1.0.0-beta.3", "tag" = "v1.0.0-beta.3", "git" = "https://github.com/lance-format/lance.git" }
lance-datafusion = { "version" = "=1.0.0-beta.3", "tag" = "v1.0.0-beta.3", "git" = "https://github.com/lance-format/lance.git" }
lance-encoding = { "version" = "=1.0.0-beta.3", "tag" = "v1.0.0-beta.3", "git" = "https://github.com/lance-format/lance.git" }
lance-arrow = { "version" = "=1.0.0-beta.3", "tag" = "v1.0.0-beta.3", "git" = "https://github.com/lance-format/lance.git" }
lance = { "version" = "=1.0.0-beta.9", default-features = false, "tag" = "v1.0.0-beta.9", "git" = "https://github.com/lance-format/lance.git" }
lance-core = { "version" = "=1.0.0-beta.9", "tag" = "v1.0.0-beta.9", "git" = "https://github.com/lance-format/lance.git" }
lance-datagen = { "version" = "=1.0.0-beta.9", "tag" = "v1.0.0-beta.9", "git" = "https://github.com/lance-format/lance.git" }
lance-file = { "version" = "=1.0.0-beta.9", "tag" = "v1.0.0-beta.9", "git" = "https://github.com/lance-format/lance.git" }
lance-io = { "version" = "=1.0.0-beta.9", default-features = false, "tag" = "v1.0.0-beta.9", "git" = "https://github.com/lance-format/lance.git" }
lance-index = { "version" = "=1.0.0-beta.9", "tag" = "v1.0.0-beta.9", "git" = "https://github.com/lance-format/lance.git" }
lance-linalg = { "version" = "=1.0.0-beta.9", "tag" = "v1.0.0-beta.9", "git" = "https://github.com/lance-format/lance.git" }
lance-namespace = { "version" = "=1.0.0-beta.9", "tag" = "v1.0.0-beta.9", "git" = "https://github.com/lance-format/lance.git" }
lance-namespace-impls = { "version" = "=1.0.0-beta.9", "features" = ["dir-aws", "dir-gcp", "dir-azure", "dir-oss", "rest"], "tag" = "v1.0.0-beta.9", "git" = "https://github.com/lance-format/lance.git" }
lance-table = { "version" = "=1.0.0-beta.9", "tag" = "v1.0.0-beta.9", "git" = "https://github.com/lance-format/lance.git" }
lance-testing = { "version" = "=1.0.0-beta.9", "tag" = "v1.0.0-beta.9", "git" = "https://github.com/lance-format/lance.git" }
lance-datafusion = { "version" = "=1.0.0-beta.9", "tag" = "v1.0.0-beta.9", "git" = "https://github.com/lance-format/lance.git" }
lance-encoding = { "version" = "=1.0.0-beta.9", "tag" = "v1.0.0-beta.9", "git" = "https://github.com/lance-format/lance.git" }
lance-arrow = { "version" = "=1.0.0-beta.9", "tag" = "v1.0.0-beta.9", "git" = "https://github.com/lance-format/lance.git" }
ahash = "0.8"
# Note that this one does not include pyarrow
arrow = { version = "56.2", optional = false }

View File

@@ -15,7 +15,7 @@
# **The Multimodal AI Lakehouse**
[**How to Install** ](#how-to-install) ✦ [**Detailed Documentation**](https://lancedb.github.io/lancedb/) ✦ [**Tutorials and Recipes**](https://github.com/lancedb/vectordb-recipes/tree/main) ✦ [**Contributors**](#contributors)
[**How to Install** ](#how-to-install) ✦ [**Detailed Documentation**](https://lancedb.com/docs) ✦ [**Tutorials and Recipes**](https://github.com/lancedb/vectordb-recipes/tree/main) ✦ [**Contributors**](#contributors)
**The ultimate multimodal data platform for AI/ML applications.**

View File

@@ -1,8 +1,8 @@
# LanceDB Documentation
LanceDB docs are deployed to https://lancedb.github.io/lancedb/.
LanceDB docs are available at [lancedb.com/docs](https://lancedb.com/docs).
Docs is built and deployed automatically by [Github Actions](../.github/workflows/docs.yml)
The SDK docs are built and deployed automatically by [Github Actions](../.github/workflows/docs.yml)
whenever a commit is pushed to the `main` branch. So it is possible for the docs to show
unreleased features.

View File

@@ -34,7 +34,7 @@ const results = await table.vectorSearch([0.1, 0.3]).limit(20).toArray();
console.log(results);
```
The [quickstart](https://lancedb.github.io/lancedb/basic/) contains a more complete example.
The [quickstart](https://lancedb.com/docs/quickstart/basic-usage/) contains more complete examples.
## Development

View File

@@ -147,7 +147,7 @@ A new PermutationBuilder instance
#### Example
```ts
builder.splitCalculated("user_id % 3");
builder.splitCalculated({ calculation: "user_id % 3" });
```
***

View File

@@ -89,4 +89,4 @@ optional storageOptions: Record<string, string>;
(For LanceDB OSS only): configuration for object storage.
The available options are described at https://lancedb.github.io/lancedb/guides/storage/
The available options are described at https://lancedb.com/docs/storage/

View File

@@ -97,4 +97,4 @@ Configuration for object storage.
Options already set on the connection will be inherited by the table,
but can be overridden here.
The available options are described at https://lancedb.github.io/lancedb/guides/storage/
The available options are described at https://lancedb.com/docs/storage/

View File

@@ -8,6 +8,14 @@
## Properties
### numAttempts
```ts
numAttempts: number;
```
***
### numDeletedRows
```ts

View File

@@ -42,4 +42,4 @@ Configuration for object storage.
Options already set on the connection will be inherited by the table,
but can be overridden here.
The available options are described at https://lancedb.github.io/lancedb/guides/storage/
The available options are described at https://lancedb.com/docs/storage/

View File

@@ -30,6 +30,12 @@ is also an [asynchronous API client](#connections-asynchronous).
::: lancedb.table.Table
::: lancedb.table.FragmentStatistics
::: lancedb.table.FragmentSummaryStats
::: lancedb.table.Tags
## Querying (Synchronous)
::: lancedb.query.Query
@@ -58,6 +64,14 @@ is also an [asynchronous API client](#connections-asynchronous).
::: lancedb.embeddings.open_clip.OpenClipEmbeddings
## Remote configuration
::: lancedb.remote.ClientConfig
::: lancedb.remote.TimeoutConfig
::: lancedb.remote.RetryConfig
## Context
::: lancedb.context.contextualize
@@ -115,6 +129,8 @@ Table hold your actual data as a collection of records / rows.
::: lancedb.table.AsyncTable
::: lancedb.table.AsyncTags
## Indices (Asynchronous)
Indices can be created on a table to speed up queries. This section
@@ -136,6 +152,8 @@ lists the indices that LanceDb supports.
::: lancedb.index.IvfFlat
::: lancedb.table.IndexStatistics
## Querying (Asynchronous)
Queries allow you to return data from your database. Basic queries can be

View File

@@ -8,7 +8,7 @@
<parent>
<groupId>com.lancedb</groupId>
<artifactId>lancedb-parent</artifactId>
<version>0.22.4-beta.0</version>
<version>0.22.4-beta.2</version>
<relativePath>../pom.xml</relativePath>
</parent>

View File

@@ -8,7 +8,7 @@
<parent>
<groupId>com.lancedb</groupId>
<artifactId>lancedb-parent</artifactId>
<version>0.22.4-beta.0</version>
<version>0.22.4-beta.2</version>
<relativePath>../pom.xml</relativePath>
</parent>

View File

@@ -6,7 +6,7 @@
<groupId>com.lancedb</groupId>
<artifactId>lancedb-parent</artifactId>
<version>0.22.4-beta.0</version>
<version>0.22.4-beta.2</version>
<packaging>pom</packaging>
<name>${project.artifactId}</name>
<description>LanceDB Java SDK Parent POM</description>

View File

@@ -1,7 +1,7 @@
[package]
name = "lancedb-nodejs"
edition.workspace = true
version = "0.22.4-beta.0"
version = "0.22.4-beta.2"
license.workspace = true
description.workspace = true
repository.workspace = true

View File

@@ -30,7 +30,7 @@ const results = await table.vectorSearch([0.1, 0.3]).limit(20).toArray();
console.log(results);
```
The [quickstart](https://lancedb.github.io/lancedb/basic/) contains a more complete example.
The [quickstart](https://lancedb.com/docs/quickstart/basic-usage/) contains more complete examples.
## Development

View File

@@ -42,7 +42,7 @@ export interface CreateTableOptions {
* Options already set on the connection will be inherited by the table,
* but can be overridden here.
*
* The available options are described at https://lancedb.github.io/lancedb/guides/storage/
* The available options are described at https://lancedb.com/docs/storage/
*/
storageOptions?: Record<string, string>;
@@ -78,7 +78,7 @@ export interface OpenTableOptions {
* Options already set on the connection will be inherited by the table,
* but can be overridden here.
*
* The available options are described at https://lancedb.github.io/lancedb/guides/storage/
* The available options are described at https://lancedb.com/docs/storage/
*/
storageOptions?: Record<string, string>;
/**

View File

@@ -118,7 +118,7 @@ export class PermutationBuilder {
* @returns A new PermutationBuilder instance
* @example
* ```ts
* builder.splitCalculated("user_id % 3");
* builder.splitCalculated({ calculation: "user_id % 3" });
* ```
*/
splitCalculated(options: SplitCalculatedOptions): PermutationBuilder {

View File

@@ -1,6 +1,6 @@
{
"name": "@lancedb/lancedb-darwin-arm64",
"version": "0.22.4-beta.0",
"version": "0.22.4-beta.2",
"os": ["darwin"],
"cpu": ["arm64"],
"main": "lancedb.darwin-arm64.node",

View File

@@ -1,6 +1,6 @@
{
"name": "@lancedb/lancedb-darwin-x64",
"version": "0.22.4-beta.0",
"version": "0.22.4-beta.2",
"os": ["darwin"],
"cpu": ["x64"],
"main": "lancedb.darwin-x64.node",

View File

@@ -1,6 +1,6 @@
{
"name": "@lancedb/lancedb-linux-arm64-gnu",
"version": "0.22.4-beta.0",
"version": "0.22.4-beta.2",
"os": ["linux"],
"cpu": ["arm64"],
"main": "lancedb.linux-arm64-gnu.node",

View File

@@ -1,6 +1,6 @@
{
"name": "@lancedb/lancedb-linux-arm64-musl",
"version": "0.22.4-beta.0",
"version": "0.22.4-beta.2",
"os": ["linux"],
"cpu": ["arm64"],
"main": "lancedb.linux-arm64-musl.node",

View File

@@ -1,6 +1,6 @@
{
"name": "@lancedb/lancedb-linux-x64-gnu",
"version": "0.22.4-beta.0",
"version": "0.22.4-beta.2",
"os": ["linux"],
"cpu": ["x64"],
"main": "lancedb.linux-x64-gnu.node",

View File

@@ -1,6 +1,6 @@
{
"name": "@lancedb/lancedb-linux-x64-musl",
"version": "0.22.4-beta.0",
"version": "0.22.4-beta.2",
"os": ["linux"],
"cpu": ["x64"],
"main": "lancedb.linux-x64-musl.node",

View File

@@ -1,6 +1,6 @@
{
"name": "@lancedb/lancedb-win32-arm64-msvc",
"version": "0.22.4-beta.0",
"version": "0.22.4-beta.2",
"os": [
"win32"
],

View File

@@ -1,6 +1,6 @@
{
"name": "@lancedb/lancedb-win32-x64-msvc",
"version": "0.22.4-beta.0",
"version": "0.22.4-beta.2",
"os": ["win32"],
"cpu": ["x64"],
"main": "lancedb.win32-x64-msvc.node",

View File

@@ -1,12 +1,12 @@
{
"name": "@lancedb/lancedb",
"version": "0.22.4-beta.0",
"version": "0.22.4-beta.2",
"lockfileVersion": 3,
"requires": true,
"packages": {
"": {
"name": "@lancedb/lancedb",
"version": "0.22.4-beta.0",
"version": "0.22.4-beta.2",
"cpu": [
"x64",
"arm64"

View File

@@ -11,7 +11,7 @@
"ann"
],
"private": false,
"version": "0.22.4-beta.0",
"version": "0.22.4-beta.2",
"main": "dist/index.js",
"exports": {
".": "./dist/index.js",

View File

@@ -35,7 +35,7 @@ pub struct ConnectionOptions {
pub read_consistency_interval: Option<f64>,
/// (For LanceDB OSS only): configuration for object storage.
///
/// The available options are described at https://lancedb.github.io/lancedb/guides/storage/
/// The available options are described at https://lancedb.com/docs/storage/
pub storage_options: Option<HashMap<String, String>>,
/// (For LanceDB OSS only): the session to use for this connection. Holds
/// shared caches and other session-specific state.

@@ -740,6 +740,7 @@ pub struct MergeResult {
pub num_inserted_rows: i64,
pub num_updated_rows: i64,
pub num_deleted_rows: i64,
pub num_attempts: i64,
}
impl From<lancedb::table::MergeResult> for MergeResult {
@@ -749,6 +750,7 @@ impl From<lancedb::table::MergeResult> for MergeResult {
num_inserted_rows: value.num_inserted_rows as i64,
num_updated_rows: value.num_updated_rows as i64,
num_deleted_rows: value.num_deleted_rows as i64,
num_attempts: value.num_attempts as i64,
}
}
}

@@ -1,5 +1,5 @@
[tool.bumpversion]
current_version = "0.25.4-beta.1"
current_version = "0.25.4-beta.2"
parse = """(?x)
(?P<major>0|[1-9]\\d*)\\.
(?P<minor>0|[1-9]\\d*)\\.

@@ -1,6 +1,6 @@
[package]
name = "lancedb-python"
version = "0.25.4-beta.1"
version = "0.25.4-beta.2"
edition.workspace = true
description = "Python bindings for LanceDB"
license.workspace = true

@@ -1,11 +1,11 @@
PIP_EXTRA_INDEX_URL ?= https://pypi.fury.io/lancedb/
PIP_EXTRA_INDEX_URL ?= https://pypi.fury.io/lance-format/ https://pypi.fury.io/lancedb/
help: ## Show this help.
@sed -ne '/@sed/!s/## //p' $(MAKEFILE_LIST)
.PHONY: develop
develop: ## Install the package in development mode.
PIP_EXTRA_INDEX_URL=$(PIP_EXTRA_INDEX_URL) maturin develop --extras tests,dev,embeddings
PIP_EXTRA_INDEX_URL="$(PIP_EXTRA_INDEX_URL)" maturin develop --extras tests,dev,embeddings
.PHONY: format
format: ## Format the code.

@@ -59,7 +59,7 @@ tests = [
"polars>=0.19, <=1.3.0",
"tantivy",
"pyarrow-stubs",
"pylance>=1.0.0b2",
"pylance>=1.0.0b4",
"requests",
"datafusion",
]

@@ -20,7 +20,12 @@ from .remote.db import RemoteDBConnection
from .schema import vector
from .table import AsyncTable, Table
from ._lancedb import Session
from .namespace import connect_namespace, LanceNamespaceDBConnection
from .namespace import (
connect_namespace,
connect_namespace_async,
LanceNamespaceDBConnection,
AsyncLanceNamespaceDBConnection,
)
def connect(
@@ -36,7 +41,7 @@ def connect(
session: Optional[Session] = None,
**kwargs: Any,
) -> DBConnection:
"""Connect to a LanceDB database. YAY!
"""Connect to a LanceDB database.
Parameters
----------
@@ -67,7 +72,7 @@ def connect(
default configuration is used.
storage_options: dict, optional
Additional options for the storage backend. See available options at
<https://lancedb.github.io/lancedb/guides/storage/>
<https://lancedb.com/docs/storage/>
session: Session, optional
(For LanceDB OSS only)
A session to use for this connection. Sessions allow you to configure
@@ -169,7 +174,7 @@ async def connect_async(
default configuration is used.
storage_options: dict, optional
Additional options for the storage backend. See available options at
<https://lancedb.github.io/lancedb/guides/storage/>
<https://lancedb.com/docs/storage/>
session: Session, optional
(For LanceDB OSS only)
A session to use for this connection. Sessions allow you to configure
@@ -224,7 +229,9 @@ __all__ = [
"connect",
"connect_async",
"connect_namespace",
"connect_namespace_async",
"AsyncConnection",
"AsyncLanceNamespaceDBConnection",
"AsyncTable",
"URI",
"sanitize_uri",

@@ -26,7 +26,7 @@ class Connection(object):
async def close(self): ...
async def list_namespaces(
self,
namespace: List[str],
namespace: Optional[List[str]],
page_token: Optional[str],
limit: Optional[int],
) -> List[str]: ...
@@ -34,7 +34,7 @@ class Connection(object):
async def drop_namespace(self, namespace: List[str]) -> None: ...
async def table_names(
self,
namespace: List[str],
namespace: Optional[List[str]],
start_after: Optional[str],
limit: Optional[int],
) -> list[str]: ...
@@ -43,7 +43,7 @@ class Connection(object):
name: str,
mode: str,
data: pa.RecordBatchReader,
namespace: List[str] = [],
namespace: Optional[List[str]] = None,
storage_options: Optional[Dict[str, str]] = None,
storage_options_provider: Optional[StorageOptionsProvider] = None,
location: Optional[str] = None,
@@ -53,7 +53,7 @@ class Connection(object):
name: str,
mode: str,
schema: pa.Schema,
namespace: List[str] = [],
namespace: Optional[List[str]] = None,
storage_options: Optional[Dict[str, str]] = None,
storage_options_provider: Optional[StorageOptionsProvider] = None,
location: Optional[str] = None,
@@ -61,7 +61,7 @@ class Connection(object):
async def open_table(
self,
name: str,
namespace: List[str] = [],
namespace: Optional[List[str]] = None,
storage_options: Optional[Dict[str, str]] = None,
storage_options_provider: Optional[StorageOptionsProvider] = None,
index_cache_size: Optional[int] = None,
@@ -71,7 +71,7 @@ class Connection(object):
self,
target_table_name: str,
source_uri: str,
target_namespace: List[str] = [],
target_namespace: Optional[List[str]] = None,
source_version: Optional[int] = None,
source_tag: Optional[str] = None,
is_shallow: bool = True,
@@ -80,11 +80,13 @@ class Connection(object):
self,
cur_name: str,
new_name: str,
cur_namespace: List[str] = [],
new_namespace: List[str] = [],
cur_namespace: Optional[List[str]] = None,
new_namespace: Optional[List[str]] = None,
) -> None: ...
async def drop_table(self, name: str, namespace: List[str] = []) -> None: ...
async def drop_all_tables(self, namespace: List[str] = []) -> None: ...
async def drop_table(
self, name: str, namespace: Optional[List[str]] = None
) -> None: ...
async def drop_all_tables(self, namespace: Optional[List[str]] = None) -> None: ...
class Table:
def name(self) -> str: ...
@@ -306,6 +308,7 @@ class MergeResult:
num_updated_rows: int
num_inserted_rows: int
num_deleted_rows: int
num_attempts: int
class AddColumnsResult:
version: int

@@ -96,7 +96,7 @@ def data_to_reader(
f"Unknown data type {type(data)}. "
"Supported types: list of dicts, pandas DataFrame, polars DataFrame, "
"pyarrow Table/RecordBatch, or Pydantic models. "
"See https://lancedb.github.io/lancedb/guides/tables/ for examples."
"See https://lancedb.com/docs/tables/ for examples."
)

@@ -54,7 +54,7 @@ class DBConnection(EnforceOverrides):
def list_namespaces(
self,
namespace: List[str] = [],
namespace: Optional[List[str]] = None,
page_token: Optional[str] = None,
limit: int = 10,
) -> Iterable[str]:
@@ -75,6 +75,8 @@ class DBConnection(EnforceOverrides):
Iterable of str
List of immediate child namespace names
"""
if namespace is None:
namespace = []
return []
def create_namespace(self, namespace: List[str]) -> None:
@@ -107,7 +109,7 @@ class DBConnection(EnforceOverrides):
page_token: Optional[str] = None,
limit: int = 10,
*,
namespace: List[str] = [],
namespace: Optional[List[str]] = None,
) -> Iterable[str]:
"""List all tables in this database, in sorted order
@@ -142,7 +144,7 @@ class DBConnection(EnforceOverrides):
fill_value: float = 0.0,
embedding_functions: Optional[List[EmbeddingFunctionConfig]] = None,
*,
namespace: List[str] = [],
namespace: Optional[List[str]] = None,
storage_options: Optional[Dict[str, str]] = None,
storage_options_provider: Optional["StorageOptionsProvider"] = None,
data_storage_version: Optional[str] = None,
@@ -191,7 +193,7 @@ class DBConnection(EnforceOverrides):
Additional options for the storage backend. Options already set on the
connection will be inherited by the table, but can be overridden here.
See available options at
<https://lancedb.github.io/lancedb/guides/storage/>
<https://lancedb.com/docs/storage/>
data_storage_version: optional, str, default "stable"
Deprecated. Set `storage_options` when connecting to the database and set
`new_table_data_storage_version` in the options.
@@ -308,7 +310,7 @@ class DBConnection(EnforceOverrides):
self,
name: str,
*,
namespace: List[str] = [],
namespace: Optional[List[str]] = None,
storage_options: Optional[Dict[str, str]] = None,
storage_options_provider: Optional["StorageOptionsProvider"] = None,
index_cache_size: Optional[int] = None,
@@ -339,7 +341,7 @@ class DBConnection(EnforceOverrides):
Additional options for the storage backend. Options already set on the
connection will be inherited by the table, but can be overridden here.
See available options at
<https://lancedb.github.io/lancedb/guides/storage/>
<https://lancedb.com/docs/storage/>
Returns
-------
@@ -347,7 +349,7 @@ class DBConnection(EnforceOverrides):
"""
raise NotImplementedError
def drop_table(self, name: str, namespace: List[str] = []):
def drop_table(self, name: str, namespace: Optional[List[str]] = None):
"""Drop a table from the database.
Parameters
@@ -358,14 +360,16 @@ class DBConnection(EnforceOverrides):
The namespace to drop the table from.
Empty list represents root namespace.
"""
if namespace is None:
namespace = []
raise NotImplementedError
def rename_table(
self,
cur_name: str,
new_name: str,
cur_namespace: List[str] = [],
new_namespace: List[str] = [],
cur_namespace: Optional[List[str]] = None,
new_namespace: Optional[List[str]] = None,
):
"""Rename a table in the database.
@@ -382,6 +386,10 @@ class DBConnection(EnforceOverrides):
The namespace to move the table to.
If not specified, defaults to the same as cur_namespace.
"""
if cur_namespace is None:
cur_namespace = []
if new_namespace is None:
new_namespace = []
raise NotImplementedError
def drop_database(self):
@@ -391,7 +399,7 @@ class DBConnection(EnforceOverrides):
"""
raise NotImplementedError
def drop_all_tables(self, namespace: List[str] = []):
def drop_all_tables(self, namespace: Optional[List[str]] = None):
"""
Drop all tables from the database
@@ -401,6 +409,8 @@ class DBConnection(EnforceOverrides):
The namespace to drop all tables from.
None or empty list represents root namespace.
"""
if namespace is None:
namespace = []
raise NotImplementedError
@property
@@ -541,7 +551,7 @@ class LanceDBConnection(DBConnection):
@override
def list_namespaces(
self,
namespace: List[str] = [],
namespace: Optional[List[str]] = None,
page_token: Optional[str] = None,
limit: int = 10,
) -> Iterable[str]:
@@ -562,6 +572,8 @@ class LanceDBConnection(DBConnection):
Iterable of str
List of immediate child namespace names
"""
if namespace is None:
namespace = []
return LOOP.run(
self._conn.list_namespaces(
namespace=namespace, page_token=page_token, limit=limit
@@ -596,7 +608,7 @@ class LanceDBConnection(DBConnection):
page_token: Optional[str] = None,
limit: int = 10,
*,
namespace: List[str] = [],
namespace: Optional[List[str]] = None,
) -> Iterable[str]:
"""Get the names of all tables in the database. The names are sorted.
@@ -614,6 +626,8 @@ class LanceDBConnection(DBConnection):
Iterator of str.
A list of table names.
"""
if namespace is None:
namespace = []
return LOOP.run(
self._conn.table_names(
namespace=namespace, start_after=page_token, limit=limit
@@ -638,7 +652,7 @@ class LanceDBConnection(DBConnection):
fill_value: float = 0.0,
embedding_functions: Optional[List[EmbeddingFunctionConfig]] = None,
*,
namespace: List[str] = [],
namespace: Optional[List[str]] = None,
storage_options: Optional[Dict[str, str]] = None,
storage_options_provider: Optional["StorageOptionsProvider"] = None,
data_storage_version: Optional[str] = None,
@@ -655,6 +669,8 @@ class LanceDBConnection(DBConnection):
---
DBConnection.create_table
"""
if namespace is None:
namespace = []
if mode.lower() not in ["create", "overwrite"]:
raise ValueError("mode must be either 'create' or 'overwrite'")
validate_table_name(name)
@@ -680,7 +696,7 @@ class LanceDBConnection(DBConnection):
self,
name: str,
*,
namespace: List[str] = [],
namespace: Optional[List[str]] = None,
storage_options: Optional[Dict[str, str]] = None,
storage_options_provider: Optional["StorageOptionsProvider"] = None,
index_cache_size: Optional[int] = None,
@@ -698,6 +714,8 @@ class LanceDBConnection(DBConnection):
-------
A LanceTable object representing the table.
"""
if namespace is None:
namespace = []
if index_cache_size is not None:
import warnings
@@ -723,7 +741,7 @@ class LanceDBConnection(DBConnection):
target_table_name: str,
source_uri: str,
*,
target_namespace: List[str] = [],
target_namespace: Optional[List[str]] = None,
source_version: Optional[int] = None,
source_tag: Optional[str] = None,
is_shallow: bool = True,
@@ -756,6 +774,8 @@ class LanceDBConnection(DBConnection):
-------
A LanceTable object representing the cloned table.
"""
if target_namespace is None:
target_namespace = []
LOOP.run(
self._conn.clone_table(
target_table_name,
@@ -776,7 +796,7 @@ class LanceDBConnection(DBConnection):
def drop_table(
self,
name: str,
namespace: List[str] = [],
namespace: Optional[List[str]] = None,
ignore_missing: bool = False,
):
"""Drop a table from the database.
@@ -790,6 +810,8 @@ class LanceDBConnection(DBConnection):
ignore_missing: bool, default False
If True, ignore if the table does not exist.
"""
if namespace is None:
namespace = []
LOOP.run(
self._conn.drop_table(
name, namespace=namespace, ignore_missing=ignore_missing
@@ -797,7 +819,9 @@ class LanceDBConnection(DBConnection):
)
@override
def drop_all_tables(self, namespace: List[str] = []):
def drop_all_tables(self, namespace: Optional[List[str]] = None):
if namespace is None:
namespace = []
LOOP.run(self._conn.drop_all_tables(namespace=namespace))
@override
@@ -805,8 +829,8 @@ class LanceDBConnection(DBConnection):
self,
cur_name: str,
new_name: str,
cur_namespace: List[str] = [],
new_namespace: List[str] = [],
cur_namespace: Optional[List[str]] = None,
new_namespace: Optional[List[str]] = None,
):
"""Rename a table in the database.
@@ -821,6 +845,10 @@ class LanceDBConnection(DBConnection):
new_namespace: List[str], optional
The namespace to move the table to.
"""
if cur_namespace is None:
cur_namespace = []
if new_namespace is None:
new_namespace = []
LOOP.run(
self._conn.rename_table(
cur_name,
@@ -910,7 +938,7 @@ class AsyncConnection(object):
async def list_namespaces(
self,
namespace: List[str] = [],
namespace: Optional[List[str]] = None,
page_token: Optional[str] = None,
limit: int = 10,
) -> Iterable[str]:
@@ -931,6 +959,8 @@ class AsyncConnection(object):
Iterable of str
List of immediate child namespace names (not full paths)
"""
if namespace is None:
namespace = []
return await self._inner.list_namespaces(
namespace=namespace, page_token=page_token, limit=limit
)
@@ -958,7 +988,7 @@ class AsyncConnection(object):
async def table_names(
self,
*,
namespace: List[str] = [],
namespace: Optional[List[str]] = None,
start_after: Optional[str] = None,
limit: Optional[int] = None,
) -> Iterable[str]:
@@ -982,6 +1012,8 @@ class AsyncConnection(object):
-------
Iterable of str
"""
if namespace is None:
namespace = []
return await self._inner.table_names(
namespace=namespace, start_after=start_after, limit=limit
)
@@ -998,7 +1030,7 @@ class AsyncConnection(object):
storage_options: Optional[Dict[str, str]] = None,
storage_options_provider: Optional["StorageOptionsProvider"] = None,
*,
namespace: List[str] = [],
namespace: Optional[List[str]] = None,
embedding_functions: Optional[List[EmbeddingFunctionConfig]] = None,
location: Optional[str] = None,
) -> AsyncTable:
@@ -1045,7 +1077,7 @@ class AsyncConnection(object):
Additional options for the storage backend. Options already set on the
connection will be inherited by the table, but can be overridden here.
See available options at
<https://lancedb.github.io/lancedb/guides/storage/>
<https://lancedb.com/docs/storage/>
Returns
-------
@@ -1155,6 +1187,8 @@ class AsyncConnection(object):
... await db.create_table("table4", make_batches(), schema=schema)
>>> asyncio.run(iterable_example())
"""
if namespace is None:
namespace = []
metadata = None
if embedding_functions is not None:
@@ -1212,7 +1246,7 @@ class AsyncConnection(object):
self,
name: str,
*,
namespace: List[str] = [],
namespace: Optional[List[str]] = None,
storage_options: Optional[Dict[str, str]] = None,
storage_options_provider: Optional["StorageOptionsProvider"] = None,
index_cache_size: Optional[int] = None,
@@ -1231,7 +1265,7 @@ class AsyncConnection(object):
Additional options for the storage backend. Options already set on the
connection will be inherited by the table, but can be overridden here.
See available options at
<https://lancedb.github.io/lancedb/guides/storage/>
<https://lancedb.com/docs/storage/>
index_cache_size: int, default 256
**Deprecated**: Use session-level cache configuration instead.
Create a Session with custom cache sizes and pass it to lancedb.connect().
@@ -1254,6 +1288,8 @@ class AsyncConnection(object):
-------
A LanceTable object representing the table.
"""
if namespace is None:
namespace = []
table = await self._inner.open_table(
name,
namespace=namespace,
@@ -1269,7 +1305,7 @@ class AsyncConnection(object):
target_table_name: str,
source_uri: str,
*,
target_namespace: List[str] = [],
target_namespace: Optional[List[str]] = None,
source_version: Optional[int] = None,
source_tag: Optional[str] = None,
is_shallow: bool = True,
@@ -1302,6 +1338,8 @@ class AsyncConnection(object):
-------
An AsyncTable object representing the cloned table.
"""
if target_namespace is None:
target_namespace = []
table = await self._inner.clone_table(
target_table_name,
source_uri,
@@ -1316,8 +1354,8 @@ class AsyncConnection(object):
self,
cur_name: str,
new_name: str,
cur_namespace: List[str] = [],
new_namespace: List[str] = [],
cur_namespace: Optional[List[str]] = None,
new_namespace: Optional[List[str]] = None,
):
"""Rename a table in the database.
@@ -1334,6 +1372,10 @@ class AsyncConnection(object):
The namespace to move the table to.
If not specified, defaults to the same as cur_namespace.
"""
if cur_namespace is None:
cur_namespace = []
if new_namespace is None:
new_namespace = []
await self._inner.rename_table(
cur_name, new_name, cur_namespace=cur_namespace, new_namespace=new_namespace
)
@@ -1342,7 +1384,7 @@ class AsyncConnection(object):
self,
name: str,
*,
namespace: List[str] = [],
namespace: Optional[List[str]] = None,
ignore_missing: bool = False,
):
"""Drop a table from the database.
@@ -1357,6 +1399,8 @@ class AsyncConnection(object):
ignore_missing: bool, default False
If True, ignore if the table does not exist.
"""
if namespace is None:
namespace = []
try:
await self._inner.drop_table(name, namespace=namespace)
except ValueError as e:
@@ -1365,7 +1409,7 @@ class AsyncConnection(object):
if f"Table '{name}' was not found" not in str(e):
raise e
async def drop_all_tables(self, namespace: List[str] = []):
async def drop_all_tables(self, namespace: Optional[List[str]] = None):
"""Drop all tables from the database.
Parameters
@@ -1374,6 +1418,8 @@ class AsyncConnection(object):
The namespace to drop all tables from.
None or empty list represents root namespace.
"""
if namespace is None:
namespace = []
await self._inner.drop_all_tables(namespace=namespace)
@deprecation.deprecated(
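
The recurring change in this file, replacing `namespace: List[str] = []` with `Optional[List[str]] = None` plus an `if namespace is None` guard, matches the usual Python idiom for avoiding shared mutable defaults. Nothing in the diff shows the old defaults being mutated, so the change is presumably defensive. A minimal sketch of the pitfall it guards against, using hypothetical helper names (`table_id_shared`, `table_id_safe`) that are not part of LanceDB:

```python
from typing import List, Optional


def table_id_shared(name: str, namespace: List[str] = []) -> List[str]:
    # The default list is created once, when the function is defined,
    # and the same object is reused on every call that omits `namespace`.
    namespace.append(name)  # mutates the shared default
    return namespace


def table_id_safe(name: str, namespace: Optional[List[str]] = None) -> List[str]:
    # None acts as a sentinel; a fresh list is built on every call.
    if namespace is None:
        namespace = []
    return namespace + [name]


first = table_id_shared("a")
second = table_id_shared("b")   # same list object as `first`
shared_result = list(second)    # state leaked from the first call

safe_first = table_id_safe("a")
safe_second = table_id_safe("b")  # independent of the first call
```

With the `None` sentinel, each call starts from an empty namespace, which is what "empty list represents root namespace" in the docstrings implies.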

@@ -609,9 +609,19 @@ class IvfPq:
class IvfRq:
"""Describes an IVF RQ Index
IVF-RQ (Residual Quantization) stores a compressed copy of each vector using
residual quantization and organizes them into IVF partitions. Parameters
largely mirror IVF-PQ for consistency.
IVF-RQ (RabitQ Quantization) compresses vectors using RabitQ quantization
and organizes them into IVF partitions.
The compression scheme is called RabitQ. Each dimension is
quantized into a small number of bits. The `num_bits` parameter
controls this process, providing a tradeoff between
index size (and thus search speed) and index accuracy.
The partitioning process is called IVF and the `num_partitions` parameter
controls how many groups to create.
Note that training an IVF RQ index on a large dataset is slow
and is currently also memory intensive.
Attributes
----------
@@ -628,7 +638,7 @@ class IvfRq:
Number of IVF partitions to create.
num_bits: int, default 1
Number of bits to encode each dimension.
Number of bits to encode each dimension in the RabitQ codebook.
max_iterations: int, default 50
Max iterations to train kmeans when computing IVF partitions.
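
As a rough illustration of the size side of the tradeoff described in this docstring: if each dimension is encoded with `num_bits` bits, the quantized code for one vector takes about `dim * num_bits / 8` bytes. The helpers below are a back-of-the-envelope sketch, not LanceDB code; the function names are hypothetical, and the estimate ignores any per-vector correction factors or partition metadata a real index may store.

```python
import math


def rq_code_bytes(dim: int, num_bits: int = 1) -> int:
    # Bits per vector, rounded up to whole bytes.
    return math.ceil(dim * num_bits / 8)


def float32_bytes(dim: int) -> int:
    # Uncompressed float32 vector: 4 bytes per dimension.
    return dim * 4


dim = 768
code = rq_code_bytes(dim)   # quantized code size at the default num_bits=1
raw = float32_bytes(dim)    # original vector size
ratio = raw / code          # approximate compression factor
```

Raising `num_bits` increases the code size linearly, which is the cost paid for better accuracy.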

@@ -10,6 +10,7 @@ through a namespace abstraction.
from __future__ import annotations
import asyncio
import sys
from typing import Dict, Iterable, List, Optional, Union
@@ -23,7 +24,7 @@ import pyarrow as pa
from lancedb.db import DBConnection, LanceDBConnection
from lancedb.io import StorageOptionsProvider
from lancedb.table import LanceTable, Table
from lancedb.table import AsyncTable, LanceTable, Table
from lancedb.util import validate_table_name
from lancedb.common import DATA
from lancedb.pydantic import LanceModel
@@ -126,13 +127,17 @@ class LanceNamespaceStorageOptionsProvider(StorageOptionsProvider):
Examples
--------
>>> from lance_namespace import connect as namespace_connect
>>> namespace = namespace_connect("rest", {"url": "https://..."})
>>> provider = LanceNamespaceStorageOptionsProvider(
... namespace=namespace,
... table_id=["my_namespace", "my_table"]
... )
>>> options = provider.fetch_storage_options()
Create a provider and fetch storage options::
from lance_namespace import connect as namespace_connect
# Connect to namespace (requires a running namespace server)
namespace = namespace_connect("rest", {"uri": "https://..."})
provider = LanceNamespaceStorageOptionsProvider(
namespace=namespace,
table_id=["my_namespace", "my_table"]
)
options = provider.fetch_storage_options()
"""
def __init__(self, namespace: LanceNamespace, table_id: List[str]):
@@ -234,8 +239,10 @@ class LanceNamespaceDBConnection(DBConnection):
page_token: Optional[str] = None,
limit: int = 10,
*,
namespace: List[str] = [],
namespace: Optional[List[str]] = None,
) -> Iterable[str]:
if namespace is None:
namespace = []
request = ListTablesRequest(id=namespace, page_token=page_token, limit=limit)
response = self._ns.list_tables(request)
return response.tables if response.tables else []
@@ -252,12 +259,14 @@ class LanceNamespaceDBConnection(DBConnection):
fill_value: float = 0.0,
embedding_functions: Optional[List[EmbeddingFunctionConfig]] = None,
*,
namespace: List[str] = [],
namespace: Optional[List[str]] = None,
storage_options: Optional[Dict[str, str]] = None,
storage_options_provider: Optional[StorageOptionsProvider] = None,
data_storage_version: Optional[str] = None,
enable_v2_manifest_paths: Optional[bool] = None,
) -> Table:
if namespace is None:
namespace = []
if mode.lower() not in ["create", "overwrite"]:
raise ValueError("mode must be either 'create' or 'overwrite'")
validate_table_name(name)
@@ -346,11 +355,13 @@ class LanceNamespaceDBConnection(DBConnection):
self,
name: str,
*,
namespace: List[str] = [],
namespace: Optional[List[str]] = None,
storage_options: Optional[Dict[str, str]] = None,
storage_options_provider: Optional[StorageOptionsProvider] = None,
index_cache_size: Optional[int] = None,
) -> Table:
if namespace is None:
namespace = []
table_id = namespace + [name]
request = DescribeTableRequest(id=table_id)
response = self._ns.describe_table(request)
@@ -380,8 +391,10 @@ class LanceNamespaceDBConnection(DBConnection):
)
@override
def drop_table(self, name: str, namespace: List[str] = []):
def drop_table(self, name: str, namespace: Optional[List[str]] = None):
# Use namespace drop_table directly
if namespace is None:
namespace = []
table_id = namespace + [name]
request = DropTableRequest(id=table_id)
self._ns.drop_table(request)
@@ -391,9 +404,13 @@ class LanceNamespaceDBConnection(DBConnection):
self,
cur_name: str,
new_name: str,
cur_namespace: List[str] = [],
new_namespace: List[str] = [],
cur_namespace: Optional[List[str]] = None,
new_namespace: Optional[List[str]] = None,
):
if cur_namespace is None:
cur_namespace = []
if new_namespace is None:
new_namespace = []
raise NotImplementedError(
"rename_table is not supported for namespace connections"
)
@@ -405,14 +422,16 @@ class LanceNamespaceDBConnection(DBConnection):
)
@override
def drop_all_tables(self, namespace: List[str] = []):
def drop_all_tables(self, namespace: Optional[List[str]] = None):
if namespace is None:
namespace = []
for table_name in self.table_names(namespace=namespace):
self.drop_table(table_name, namespace=namespace)
@override
def list_namespaces(
self,
namespace: List[str] = [],
namespace: Optional[List[str]] = None,
page_token: Optional[str] = None,
limit: int = 10,
) -> Iterable[str]:
@@ -434,6 +453,8 @@ class LanceNamespaceDBConnection(DBConnection):
Iterable[str]
Names of child namespaces.
"""
if namespace is None:
namespace = []
request = ListNamespacesRequest(
id=namespace, page_token=page_token, limit=limit
)
@@ -471,13 +492,15 @@ class LanceNamespaceDBConnection(DBConnection):
name: str,
table_uri: str,
*,
namespace: List[str] = [],
namespace: Optional[List[str]] = None,
storage_options: Optional[Dict[str, str]] = None,
storage_options_provider: Optional[StorageOptionsProvider] = None,
index_cache_size: Optional[int] = None,
) -> LanceTable:
# Open a table directly from a URI using the location parameter
# Note: storage_options should already be merged by the caller
if namespace is None:
namespace = []
temp_conn = LanceDBConnection(
table_uri, # Use the table location as the connection URI
read_consistency_interval=self.read_consistency_interval,
@@ -497,6 +520,310 @@ class LanceNamespaceDBConnection(DBConnection):
)
class AsyncLanceNamespaceDBConnection:
"""
An async LanceDB connection that uses a namespace for table management.
This connection delegates table URI resolution to a lance_namespace instance,
while providing async methods for all operations.
"""
def __init__(
self,
namespace: LanceNamespace,
*,
read_consistency_interval: Optional[timedelta] = None,
storage_options: Optional[Dict[str, str]] = None,
session: Optional[Session] = None,
):
"""
Initialize an async namespace-based LanceDB connection.
Parameters
----------
namespace : LanceNamespace
The namespace instance to use for table management
read_consistency_interval : Optional[timedelta]
The interval at which to check for updates to the table from other
processes. If None, then consistency is not checked.
storage_options : Optional[Dict[str, str]]
Additional options for the storage backend
session : Optional[Session]
A session to use for this connection
"""
self._ns = namespace
self.read_consistency_interval = read_consistency_interval
self.storage_options = storage_options or {}
self.session = session
async def table_names(
self,
page_token: Optional[str] = None,
limit: int = 10,
*,
namespace: Optional[List[str]] = None,
) -> Iterable[str]:
"""List table names in the namespace."""
if namespace is None:
namespace = []
request = ListTablesRequest(id=namespace, page_token=page_token, limit=limit)
response = self._ns.list_tables(request)
return response.tables if response.tables else []
async def create_table(
self,
name: str,
data: Optional[DATA] = None,
schema: Optional[Union[pa.Schema, LanceModel]] = None,
mode: str = "create",
exist_ok: bool = False,
on_bad_vectors: str = "error",
fill_value: float = 0.0,
embedding_functions: Optional[List[EmbeddingFunctionConfig]] = None,
*,
namespace: Optional[List[str]] = None,
storage_options: Optional[Dict[str, str]] = None,
storage_options_provider: Optional[StorageOptionsProvider] = None,
data_storage_version: Optional[str] = None,
enable_v2_manifest_paths: Optional[bool] = None,
) -> AsyncTable:
"""Create a new table in the namespace."""
if namespace is None:
namespace = []
if mode.lower() not in ["create", "overwrite"]:
raise ValueError("mode must be either 'create' or 'overwrite'")
validate_table_name(name)
# Get location from namespace
table_id = namespace + [name]
# Step 1: Get the table location and storage options from namespace
location = None
namespace_storage_options = None
if mode.lower() == "overwrite":
# Try to describe the table first to see if it exists
try:
describe_request = DescribeTableRequest(id=table_id)
describe_response = self._ns.describe_table(describe_request)
location = describe_response.location
namespace_storage_options = describe_response.storage_options
except Exception:
# Table doesn't exist, will create a new one below
pass
if location is None:
# Table doesn't exist or mode is "create", reserve a new location
create_empty_request = CreateEmptyTableRequest(
id=table_id,
location=None,
properties=self.storage_options if self.storage_options else None,
)
create_empty_response = self._ns.create_empty_table(create_empty_request)
if not create_empty_response.location:
raise ValueError(
"Table location is missing from create_empty_table response"
)
location = create_empty_response.location
namespace_storage_options = create_empty_response.storage_options
# Merge storage options: self.storage_options < user options < namespace options
merged_storage_options = dict(self.storage_options)
if storage_options:
merged_storage_options.update(storage_options)
if namespace_storage_options:
merged_storage_options.update(namespace_storage_options)
# Step 2: Create table using LanceTable.create with the location
# Run the sync operation in a thread
def _create_table():
temp_conn = LanceDBConnection(
location,
read_consistency_interval=self.read_consistency_interval,
storage_options=merged_storage_options,
session=self.session,
)
# Create a storage options provider if not provided by user
if (
storage_options_provider is None
and namespace_storage_options is not None
):
provider = LanceNamespaceStorageOptionsProvider(
namespace=self._ns,
table_id=table_id,
)
else:
provider = storage_options_provider
return LanceTable.create(
temp_conn,
name,
data,
schema,
mode=mode,
exist_ok=exist_ok,
on_bad_vectors=on_bad_vectors,
fill_value=fill_value,
embedding_functions=embedding_functions,
namespace=namespace,
storage_options=merged_storage_options,
storage_options_provider=provider,
location=location,
)
lance_table = await asyncio.to_thread(_create_table)
# Get the underlying async table from LanceTable
return lance_table._table
async def open_table(
self,
name: str,
*,
namespace: Optional[List[str]] = None,
storage_options: Optional[Dict[str, str]] = None,
storage_options_provider: Optional[StorageOptionsProvider] = None,
index_cache_size: Optional[int] = None,
) -> AsyncTable:
"""Open an existing table from the namespace."""
if namespace is None:
namespace = []
table_id = namespace + [name]
request = DescribeTableRequest(id=table_id)
response = self._ns.describe_table(request)
# Merge storage options: self.storage_options < user options < namespace options
merged_storage_options = dict(self.storage_options)
if storage_options:
merged_storage_options.update(storage_options)
if response.storage_options:
merged_storage_options.update(response.storage_options)
# Create a storage options provider if not provided by user
if storage_options_provider is None and response.storage_options is not None:
storage_options_provider = LanceNamespaceStorageOptionsProvider(
namespace=self._ns,
table_id=table_id,
)
# Open table in a thread
def _open_table():
temp_conn = LanceDBConnection(
response.location,
read_consistency_interval=self.read_consistency_interval,
storage_options=merged_storage_options,
session=self.session,
)
return LanceTable.open(
temp_conn,
name,
namespace=namespace,
storage_options=merged_storage_options,
storage_options_provider=storage_options_provider,
index_cache_size=index_cache_size,
location=response.location,
)
lance_table = await asyncio.to_thread(_open_table)
return lance_table._table
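
The comment `self.storage_options < user options < namespace options` in `create_table` and `open_table` describes a precedence order implemented with successive `dict.update` calls, where later sources overwrite earlier ones. A minimal standalone sketch of that merge, with a hypothetical function name and made-up option keys:

```python
from typing import Dict, Optional


def merge_storage_options(
    connection_options: Dict[str, str],
    user_options: Optional[Dict[str, str]] = None,
    namespace_options: Optional[Dict[str, str]] = None,
) -> Dict[str, str]:
    # Start from a copy so the connection-level dict is never mutated.
    merged = dict(connection_options)
    if user_options:
        merged.update(user_options)        # per-call options override connection options
    if namespace_options:
        merged.update(namespace_options)   # namespace-provided options win last
    return merged


connection_options = {"region": "us-east-1", "timeout": "30s"}
merged = merge_storage_options(
    connection_options,
    user_options={"region": "eu-west-1"},
    namespace_options={"access_token": "temporary"},
)
```

Because the namespace response is applied last, any key it returns replaces the value passed by the caller.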
async def drop_table(self, name: str, namespace: Optional[List[str]] = None):
"""Drop a table from the namespace."""
if namespace is None:
namespace = []
table_id = namespace + [name]
request = DropTableRequest(id=table_id)
self._ns.drop_table(request)
async def rename_table(
self,
cur_name: str,
new_name: str,
cur_namespace: Optional[List[str]] = None,
new_namespace: Optional[List[str]] = None,
):
"""Rename is not supported for namespace connections."""
if cur_namespace is None:
cur_namespace = []
if new_namespace is None:
new_namespace = []
raise NotImplementedError(
"rename_table is not supported for namespace connections"
)
async def drop_database(self):
"""Deprecated method."""
raise NotImplementedError(
"drop_database is deprecated, use drop_all_tables instead"
)
async def drop_all_tables(self, namespace: Optional[List[str]] = None):
"""Drop all tables in the namespace."""
if namespace is None:
namespace = []
table_names = await self.table_names(namespace=namespace)
for table_name in table_names:
await self.drop_table(table_name, namespace=namespace)
async def list_namespaces(
self,
namespace: Optional[List[str]] = None,
page_token: Optional[str] = None,
limit: int = 10,
) -> Iterable[str]:
"""
List child namespaces under the given namespace.
Parameters
----------
namespace : Optional[List[str]]
The parent namespace to list children from.
If None, lists root-level namespaces.
page_token : Optional[str]
Pagination token for listing results.
limit : int
Maximum number of namespaces to return.
Returns
-------
Iterable[str]
Names of child namespaces.
"""
if namespace is None:
namespace = []
request = ListNamespacesRequest(
id=namespace, page_token=page_token, limit=limit
)
response = self._ns.list_namespaces(request)
return response.namespaces if response.namespaces else []
async def create_namespace(self, namespace: List[str]) -> None:
"""
Create a new namespace.
Parameters
----------
namespace : List[str]
The namespace path to create.
"""
request = CreateNamespaceRequest(id=namespace)
self._ns.create_namespace(request)
async def drop_namespace(self, namespace: List[str]) -> None:
"""
Drop a namespace.
Parameters
----------
namespace : List[str]
The namespace path to drop.
"""
request = DropNamespaceRequest(id=namespace)
self._ns.drop_namespace(request)
def connect_namespace(
impl: str,
properties: Dict[str, str],
@@ -541,3 +868,62 @@ def connect_namespace(
storage_options=storage_options,
session=session,
)
def connect_namespace_async(
impl: str,
properties: Dict[str, str],
*,
read_consistency_interval: Optional[timedelta] = None,
storage_options: Optional[Dict[str, str]] = None,
session: Optional[Session] = None,
) -> AsyncLanceNamespaceDBConnection:
"""
Connect to a LanceDB database through a namespace (returns async connection).
This function is synchronous but returns an AsyncLanceNamespaceDBConnection
that provides async methods for all database operations.
Parameters
----------
impl : str
The namespace implementation to use. For example:
- "dir" for DirectoryNamespace
- "rest" for REST-based namespace
- Full module path for custom implementations
properties : Dict[str, str]
Configuration properties for the namespace implementation.
Different namespace implementations have different configuration properties.
For example, use DirectoryNamespace with {"root": "/path/to/directory"}
read_consistency_interval : Optional[timedelta]
The interval at which to check for updates to the table from other
processes. If None, then consistency is not checked.
storage_options : Optional[Dict[str, str]]
Additional options for the storage backend
session : Optional[Session]
A session to use for this connection
Returns
-------
AsyncLanceNamespaceDBConnection
An async namespace-based connection to LanceDB
Examples
--------
>>> import lancedb
>>> # This function is sync, but returns an async connection
>>> db = lancedb.connect_namespace_async("dir", {"root": "/path/to/db"})
>>> # Use async methods on the connection
>>> async def use_db():
... tables = await db.table_names()
... table = await db.create_table("my_table", schema=schema)
"""
namespace = namespace_connect(impl, properties)
# Return the async namespace-based connection
return AsyncLanceNamespaceDBConnection(
namespace,
read_consistency_interval=read_consistency_interval,
storage_options=storage_options,
session=session,
)


@@ -14,6 +14,7 @@ from typing import (
Literal,
Optional,
Tuple,
Type,
TypeVar,
Union,
Any,
@@ -786,10 +787,7 @@ class LanceQueryBuilder(ABC):
-------
List[LanceModel]
"""
return [
model(**{k: v for k, v in row.items() if k in model.field_names()})
for row in self.to_arrow(timeout=timeout).to_pylist()
]
return [model(**row) for row in self.to_arrow(timeout=timeout).to_pylist()]
def to_polars(self, *, timeout: Optional[timedelta] = None) -> "pl.DataFrame":
"""
@@ -885,7 +883,7 @@ class LanceQueryBuilder(ABC):
----------
where: str
The where clause which is a valid SQL where clause. See
`Lance filter pushdown <https://lancedb.github.io/lance/read_and_write.html#filter-push-down>`_
`Lance filter pushdown <https://lance.org/guide/read_and_write#filter-push-down>`_
for valid SQL expressions.
prefilter: bool, default True
If True, apply the filter before vector search, otherwise the
@@ -1358,7 +1356,7 @@ class LanceVectorQueryBuilder(LanceQueryBuilder):
----------
where: str
The where clause which is a valid SQL where clause. See
`Lance filter pushdown <https://lancedb.github.io/lance/read_and_write.html#filter-push-down>`_
`Lance filter pushdown <https://lance.org/guide/read_and_write#filter-push-down>`_
for valid SQL expressions.
prefilter: bool, default True
If True, apply the filter before vector search, otherwise the
@@ -1497,7 +1495,7 @@ class LanceFtsQueryBuilder(LanceQueryBuilder):
if self._phrase_query:
if isinstance(query, str):
if not query.startswith('"') or not query.endswith('"'):
query = f'"{query}"'
self._query = f'"{query}"'
elif isinstance(query, FullTextQuery) and not isinstance(
query, PhraseQuery
):
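The hunk above replaces an assignment to the local name `query` with an assignment to `self._query`. A plausible reading is that rebinding the local never reached the stored query. The sketch below illustrates that difference with a hypothetical class, `PhraseQuerySketch`, which is not LanceDB's query builder.

```python
class PhraseQuerySketch:
    """Hypothetical stand-in for a query builder that stores its query string."""

    def __init__(self, query: str):
        self._query = query

    def quote_via_local(self) -> "PhraseQuerySketch":
        query = self._query
        if not query.startswith('"') or not query.endswith('"'):
            query = f'"{query}"'  # rebinds only the local name; self._query is unchanged
        return self

    def quote_via_attribute(self) -> "PhraseQuerySketch":
        query = self._query
        if not query.startswith('"') or not query.endswith('"'):
            self._query = f'"{query}"'  # the quoted string is stored on the instance
        return self


unchanged = PhraseQuerySketch("puppy runs").quote_via_local()._query
quoted = PhraseQuerySketch("puppy runs").quote_via_attribute()._query
already_quoted = PhraseQuerySketch('"puppy runs"').quote_via_attribute()._query
print(unchanged, quoted, already_quoted)
```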
@@ -2400,6 +2398,28 @@ class AsyncQueryBase(object):
return pl.from_arrow(await self.to_arrow(timeout=timeout))
async def to_pydantic(
self, model: Type[LanceModel], *, timeout: Optional[timedelta] = None
) -> List[LanceModel]:
"""
Convert results to a list of pydantic models.
Parameters
----------
model : Type[LanceModel]
The pydantic model to use.
timeout : timedelta, optional
The maximum time to wait for the query to complete.
If None, wait indefinitely.
Returns
-------
list[LanceModel]
"""
return [
model(**row) for row in (await self.to_arrow(timeout=timeout)).to_pylist()
]
async def explain_plan(self, verbose: Optional[bool] = False):
"""Return the execution plan for this query.


@@ -104,7 +104,7 @@ class RemoteDBConnection(DBConnection):
@override
def list_namespaces(
self,
namespace: List[str] = [],
namespace: Optional[List[str]] = None,
page_token: Optional[str] = None,
limit: int = 10,
) -> Iterable[str]:
@@ -125,6 +125,8 @@ class RemoteDBConnection(DBConnection):
Iterable of str
List of immediate child namespace names
"""
if namespace is None:
namespace = []
return LOOP.run(
self._conn.list_namespaces(
namespace=namespace, page_token=page_token, limit=limit
@@ -159,7 +161,7 @@ class RemoteDBConnection(DBConnection):
page_token: Optional[str] = None,
limit: int = 10,
*,
namespace: List[str] = [],
namespace: Optional[List[str]] = None,
) -> Iterable[str]:
"""List the names of all tables in the database.
@@ -177,6 +179,8 @@ class RemoteDBConnection(DBConnection):
-------
An iterator of table names.
"""
if namespace is None:
namespace = []
return LOOP.run(
self._conn.table_names(
namespace=namespace, start_after=page_token, limit=limit
@@ -188,7 +192,7 @@ class RemoteDBConnection(DBConnection):
self,
name: str,
*,
namespace: List[str] = [],
namespace: Optional[List[str]] = None,
storage_options: Optional[Dict[str, str]] = None,
index_cache_size: Optional[int] = None,
) -> Table:
@@ -208,6 +212,8 @@ class RemoteDBConnection(DBConnection):
"""
from .table import RemoteTable
if namespace is None:
namespace = []
if index_cache_size is not None:
logging.info(
"index_cache_size is ignored in LanceDB Cloud"
@@ -222,7 +228,7 @@ class RemoteDBConnection(DBConnection):
target_table_name: str,
source_uri: str,
*,
target_namespace: List[str] = [],
target_namespace: Optional[List[str]] = None,
source_version: Optional[int] = None,
source_tag: Optional[str] = None,
is_shallow: bool = True,
@@ -252,6 +258,8 @@ class RemoteDBConnection(DBConnection):
"""
from .table import RemoteTable
if target_namespace is None:
target_namespace = []
table = LOOP.run(
self._conn.clone_table(
target_table_name,
@@ -275,7 +283,7 @@ class RemoteDBConnection(DBConnection):
mode: Optional[str] = None,
embedding_functions: Optional[List[EmbeddingFunctionConfig]] = None,
*,
namespace: List[str] = [],
namespace: Optional[List[str]] = None,
) -> Table:
"""Create a [Table][lancedb.table.Table] in the database.
@@ -372,6 +380,8 @@ class RemoteDBConnection(DBConnection):
LanceTable(table4)
"""
if namespace is None:
namespace = []
validate_table_name(name)
if embedding_functions is not None:
logging.warning(
@@ -396,7 +406,7 @@ class RemoteDBConnection(DBConnection):
return RemoteTable(table, self.db_name)
@override
def drop_table(self, name: str, namespace: List[str] = []):
def drop_table(self, name: str, namespace: Optional[List[str]] = None):
"""Drop a table from the database.
Parameters
@@ -407,6 +417,8 @@ class RemoteDBConnection(DBConnection):
The namespace to drop the table from.
None or empty list represents root namespace.
"""
if namespace is None:
namespace = []
LOOP.run(self._conn.drop_table(name, namespace=namespace))
@override
@@ -414,8 +426,8 @@ class RemoteDBConnection(DBConnection):
self,
cur_name: str,
new_name: str,
cur_namespace: List[str] = [],
new_namespace: List[str] = [],
cur_namespace: Optional[List[str]] = None,
new_namespace: Optional[List[str]] = None,
):
"""Rename a table in the database.
@@ -426,6 +438,10 @@ class RemoteDBConnection(DBConnection):
new_name: str
The new name of the table.
"""
if cur_namespace is None:
cur_namespace = []
if new_namespace is None:
new_namespace = []
LOOP.run(
self._conn.rename_table(
cur_name,
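These hunks replace `namespace: List[str] = []` with `Optional[List[str]] = None` plus an explicit `None` check. The general Python pitfall behind that pattern is that a default list is created once and shared across calls. The sketch below uses hypothetical functions to show it. The methods in this diff may never mutate their argument, so treat this as background rather than a description of an observed bug.

```python
from typing import List, Optional


def table_id_shared_default(name: str, namespace: List[str] = []) -> List[str]:
    namespace.append(name)  # mutates the single default list shared by every call
    return namespace


def table_id_none_default(
    name: str, namespace: Optional[List[str]] = None
) -> List[str]:
    if namespace is None:
        namespace = []  # a fresh list on every call
    namespace.append(name)
    return namespace


first = table_id_shared_default("a")
second = table_id_shared_default("b")  # sees the "a" left over from the first call
safe_first = table_id_none_default("a")
safe_second = table_id_none_default("b")
print(second, safe_second)
```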


@@ -652,6 +652,17 @@ class RemoteTable(Table):
"migrate_v2_manifest_paths() is not supported on the LanceDB Cloud"
)
def head(self, n=5) -> pa.Table:
"""
Return the first `n` rows of the table.
Parameters
----------
n: int, default 5
The number of rows to return.
"""
return LOOP.run(self._table.query().limit(n).to_arrow())
def add_index(tbl: pa.Table, i: int) -> pa.Table:
return tbl.add_column(


@@ -178,7 +178,7 @@ def _into_pyarrow_reader(
f"Unknown data type {type(data)}. "
"Supported types: list of dicts, pandas DataFrame, polars DataFrame, "
"pyarrow Table/RecordBatch, or Pydantic models. "
"See https://lancedb.github.io/lancedb/guides/tables/ for examples."
"See https://lancedb.com/docs/tables/ for examples."
)
@@ -1018,7 +1018,7 @@ class Table(ABC):
... .when_not_matched_insert_all() \\
... .execute(new_data)
>>> res
MergeResult(version=2, num_updated_rows=2, num_inserted_rows=1, num_deleted_rows=0)
MergeResult(version=2, num_updated_rows=2, num_inserted_rows=1, num_deleted_rows=0, num_attempts=1)
>>> # The order of new rows is non-deterministic since we use
>>> # a hash-join as part of this operation and so we sort here
>>> table.to_arrow().sort_by("a").to_pandas()
@@ -1708,15 +1708,18 @@ class LanceTable(Table):
connection: "LanceDBConnection",
name: str,
*,
namespace: List[str] = [],
namespace: Optional[List[str]] = None,
storage_options: Optional[Dict[str, str]] = None,
storage_options_provider: Optional["StorageOptionsProvider"] = None,
index_cache_size: Optional[int] = None,
location: Optional[str] = None,
_async: AsyncTable = None,
):
if namespace is None:
namespace = []
self._conn = connection
self._namespace = namespace
self._location = location # Store location for use in _dataset_path
if _async is not None:
self._table = _async
else:
@@ -1765,12 +1768,14 @@ class LanceTable(Table):
db,
name,
*,
namespace: List[str] = [],
namespace: Optional[List[str]] = None,
storage_options: Optional[Dict[str, str]] = None,
storage_options_provider: Optional["StorageOptionsProvider"] = None,
index_cache_size: Optional[int] = None,
location: Optional[str] = None,
):
if namespace is None:
namespace = []
tbl = cls(
db,
name,
@@ -1794,6 +1799,10 @@ class LanceTable(Table):
@cached_property
def _dataset_path(self) -> str:
# Cacheable since it's deterministic
# If table was opened with explicit location (e.g., from namespace),
# use that location directly instead of constructing from base URI
if self._location is not None:
return self._location
return _table_path(self._conn.uri, self.name)
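The `_dataset_path` change prefers an explicit location over a path derived from the connection URI. A minimal sketch of that fallback follows, using a hypothetical `TableSketch` class. The `<base>/<name>.lance` layout here is an assumption for illustration, not a statement of what `_table_path` returns.

```python
from functools import cached_property
from typing import Optional


class TableSketch:
    """Hypothetical stand-in for a table that may carry an explicit location."""

    def __init__(self, base_uri: str, name: str, location: Optional[str] = None):
        self._base_uri = base_uri
        self._name = name
        self._location = location

    @cached_property
    def dataset_path(self) -> str:
        # An explicit location (for example, one reported by a namespace) wins.
        if self._location is not None:
            return self._location
        # Assumed layout for illustration only.
        return f"{self._base_uri}/{self._name}.lance"


derived = TableSketch("/data/db", "items").dataset_path
explicit = TableSketch("/data/db", "items", location="s3://bucket/ns/items").dataset_path
print(derived, explicit)
```

Because `cached_property` stores the first result, the location must be set before the property is first read, which matches the diff assigning `self._location` in the constructor.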
def to_lance(self, **kwargs) -> lance.LanceDataset:
@@ -2618,7 +2627,7 @@ class LanceTable(Table):
fill_value: float = 0.0,
embedding_functions: Optional[List[EmbeddingFunctionConfig]] = None,
*,
namespace: List[str] = [],
namespace: Optional[List[str]] = None,
storage_options: Optional[Dict[str, str | bool]] = None,
storage_options_provider: Optional["StorageOptionsProvider"] = None,
data_storage_version: Optional[str] = None,
@@ -2678,9 +2687,12 @@ class LanceTable(Table):
Deprecated. Set `storage_options` when connecting to the database and set
`new_table_enable_v2_manifest_paths` in the options.
"""
if namespace is None:
namespace = []
self = cls.__new__(cls)
self._conn = db
self._namespace = namespace
self._location = location
if data_storage_version is not None:
warnings.warn(
@@ -3622,7 +3634,7 @@ class AsyncTable:
... .when_not_matched_insert_all() \\
... .execute(new_data)
>>> res
MergeResult(version=2, num_updated_rows=2, num_inserted_rows=1, num_deleted_rows=0)
MergeResult(version=2, num_updated_rows=2, num_inserted_rows=1, num_deleted_rows=0, num_attempts=1)
>>> # The order of new rows is non-deterministic since we use
>>> # a hash-join as part of this operation and so we sort here
>>> table.to_arrow().sort_by("a").to_pandas()


@@ -325,11 +325,18 @@ def test_search_fts_phrase_query(table):
pass
table.create_fts_index("text", use_tantivy=False, with_position=True, replace=True)
results = table.search("puppy").limit(100).to_list()
# Test with quotation marks
phrase_results = table.search('"puppy runs"').limit(100).to_list()
assert len(results) > len(phrase_results)
assert len(phrase_results) > 0
# Test with a query
# Test with .phrase_query()
phrase_results = table.search("puppy runs").phrase_query().limit(100).to_list()
assert len(results) > len(phrase_results)
assert len(phrase_results) > 0
# Test with PhraseQuery()
phrase_results = (
table.search(PhraseQuery("puppy runs", "text")).limit(100).to_list()
)


@@ -423,3 +423,218 @@ class TestNamespaceConnection:
db.drop_table("same_name_table", namespace=["namespace_b"])
db.drop_namespace(["namespace_a"])
db.drop_namespace(["namespace_b"])
@pytest.mark.asyncio
class TestAsyncNamespaceConnection:
"""Test async namespace-based LanceDB connection using DirectoryNamespace."""
def setup_method(self):
"""Set up test fixtures."""
self.temp_dir = tempfile.mkdtemp()
def teardown_method(self):
"""Clean up test fixtures."""
shutil.rmtree(self.temp_dir, ignore_errors=True)
async def test_connect_namespace_async(self):
"""Test connecting to LanceDB through DirectoryNamespace asynchronously."""
db = lancedb.connect_namespace_async("dir", {"root": self.temp_dir})
# Should be an AsyncLanceNamespaceDBConnection
assert isinstance(db, lancedb.AsyncLanceNamespaceDBConnection)
# Initially no tables in root
table_names = await db.table_names()
assert len(list(table_names)) == 0
async def test_create_table_async(self):
"""Test creating a table asynchronously through namespace."""
db = lancedb.connect_namespace_async("dir", {"root": self.temp_dir})
# Create a child namespace first
await db.create_namespace(["test_ns"])
# Define schema for empty table
schema = pa.schema(
[
pa.field("id", pa.int64()),
pa.field("vector", pa.list_(pa.float32(), 2)),
pa.field("text", pa.string()),
]
)
# Create empty table in child namespace
table = await db.create_table(
"test_table", schema=schema, namespace=["test_ns"]
)
assert table is not None
assert isinstance(table, lancedb.AsyncTable)
# Table should appear in child namespace
table_names = await db.table_names(namespace=["test_ns"])
assert "test_table" in list(table_names)
async def test_open_table_async(self):
"""Test opening an existing table asynchronously through namespace."""
db = lancedb.connect_namespace_async("dir", {"root": self.temp_dir})
# Create a child namespace first
await db.create_namespace(["test_ns"])
# Create a table with schema in child namespace
schema = pa.schema(
[
pa.field("id", pa.int64()),
pa.field("vector", pa.list_(pa.float32(), 2)),
]
)
await db.create_table("test_table", schema=schema, namespace=["test_ns"])
# Open the table
table = await db.open_table("test_table", namespace=["test_ns"])
assert table is not None
assert isinstance(table, lancedb.AsyncTable)
# Test write operation - add data to the table
test_data = [
{"id": 1, "vector": [1.0, 2.0]},
{"id": 2, "vector": [3.0, 4.0]},
{"id": 3, "vector": [5.0, 6.0]},
]
await table.add(test_data)
# Test read operation - query the table
result = await table.to_arrow()
assert len(result) == 3
assert result.schema.field("id").type == pa.int64()
assert result.schema.field("vector").type == pa.list_(pa.float32(), 2)
# Verify data content
result_df = result.to_pandas()
assert result_df["id"].tolist() == [1, 2, 3]
assert [v.tolist() for v in result_df["vector"]] == [
[1.0, 2.0],
[3.0, 4.0],
[5.0, 6.0],
]
# Test update operation
await table.update({"id": 20}, where="id = 2")
result = await table.to_arrow()
result_df = result.to_pandas().sort_values("id").reset_index(drop=True)
assert result_df["id"].tolist() == [1, 3, 20]
# Test delete operation
await table.delete("id = 1")
result = await table.to_arrow()
assert len(result) == 2
result_df = result.to_pandas().sort_values("id").reset_index(drop=True)
assert result_df["id"].tolist() == [3, 20]
async def test_drop_table_async(self):
"""Test dropping a table asynchronously through namespace."""
db = lancedb.connect_namespace_async("dir", {"root": self.temp_dir})
# Create a child namespace first
await db.create_namespace(["test_ns"])
# Create tables in child namespace
schema = pa.schema(
[
pa.field("id", pa.int64()),
pa.field("vector", pa.list_(pa.float32(), 2)),
]
)
await db.create_table("table1", schema=schema, namespace=["test_ns"])
await db.create_table("table2", schema=schema, namespace=["test_ns"])
# Verify both tables exist in child namespace
table_names = list(await db.table_names(namespace=["test_ns"]))
assert "table1" in table_names
assert "table2" in table_names
assert len(table_names) == 2
# Drop one table
await db.drop_table("table1", namespace=["test_ns"])
# Verify only table2 remains
table_names = list(await db.table_names(namespace=["test_ns"]))
assert "table1" not in table_names
assert "table2" in table_names
assert len(table_names) == 1
async def test_namespace_operations_async(self):
"""Test namespace management operations asynchronously."""
db = lancedb.connect_namespace_async("dir", {"root": self.temp_dir})
# Initially no namespaces
namespaces = await db.list_namespaces()
assert len(list(namespaces)) == 0
# Create a namespace
await db.create_namespace(["test_namespace"])
# Verify namespace exists
namespaces = list(await db.list_namespaces())
assert "test_namespace" in namespaces
assert len(namespaces) == 1
# Create table in namespace
schema = pa.schema(
[
pa.field("id", pa.int64()),
pa.field("vector", pa.list_(pa.float32(), 2)),
]
)
table = await db.create_table(
"test_table", schema=schema, namespace=["test_namespace"]
)
assert table is not None
# Verify table exists in namespace
tables_in_namespace = list(await db.table_names(namespace=["test_namespace"]))
assert "test_table" in tables_in_namespace
assert len(tables_in_namespace) == 1
# Drop table from namespace
await db.drop_table("test_table", namespace=["test_namespace"])
# Verify table no longer exists in namespace
tables_in_namespace = list(await db.table_names(namespace=["test_namespace"]))
assert len(tables_in_namespace) == 0
# Drop namespace
await db.drop_namespace(["test_namespace"])
# Verify namespace no longer exists
namespaces = list(await db.list_namespaces())
assert len(namespaces) == 0
async def test_drop_all_tables_async(self):
"""Test dropping all tables asynchronously through namespace."""
db = lancedb.connect_namespace_async("dir", {"root": self.temp_dir})
# Create a child namespace first
await db.create_namespace(["test_ns"])
# Create multiple tables in child namespace
schema = pa.schema(
[
pa.field("id", pa.int64()),
pa.field("vector", pa.list_(pa.float32(), 2)),
]
)
for i in range(3):
await db.create_table(f"table{i}", schema=schema, namespace=["test_ns"])
# Verify tables exist in child namespace
table_names = await db.table_names(namespace=["test_ns"])
assert len(list(table_names)) == 3
# Drop all tables in child namespace
await db.drop_all_tables(namespace=["test_ns"])
# Verify all tables are gone from child namespace
table_names = await db.table_names(namespace=["test_ns"])
assert len(list(table_names)) == 0


@@ -412,3 +412,50 @@ def test_multi_vector_in_lance_model():
t = TestModel(id=1)
assert t.vectors == [[0.0] * 16]
def test_aliases_in_lance_model(mem_db):
data = [
{"vector": [3.1, 4.1], "item": "foo", "price": 10.0},
{"vector": [5.9, 6.5], "item": "bar", "price": 20.0},
]
tbl = mem_db.create_table("items", data=data)
class TestModel(LanceModel):
name: str = Field(alias="item")
price: float
distance: float = Field(alias="_distance")
model = (
tbl.search([5.9, 6.5])
.distance_type("cosine")
.limit(1)
.to_pydantic(TestModel)[0]
)
assert hasattr(model, "name")
assert hasattr(model, "distance")
assert model.distance < 0.01
@pytest.mark.asyncio
async def test_aliases_in_lance_model_async(mem_db_async):
data = [
{"vector": [8.3, 2.5], "item": "foo", "price": 12.0},
{"vector": [7.7, 3.9], "item": "bar", "price": 11.2},
]
tbl = await mem_db_async.create_table("items", data=data)
class TestModel(LanceModel):
name: str = Field(alias="item")
price: float
distance: float = Field(alias="_distance")
model = (
await tbl.vector_search([7.7, 3.9])
.distance_type("cosine")
.limit(1)
.to_pydantic(TestModel)
)[0]
assert hasattr(model, "name")
assert hasattr(model, "distance")
assert model.distance < 0.01


@@ -546,6 +546,22 @@ def query_test_table(query_handler, *, server_version=Version("0.1.0")):
yield table
def test_head():
def handler(body):
assert body == {
"k": 5,
"prefilter": True,
"vector": [],
"version": None,
}
return pa.table({"id": [1, 2, 3]})
with query_test_table(handler) as table:
data = table.head(5)
assert data == pa.table({"id": [1, 2, 3]})
def test_query_sync_minimal():
def handler(body):
assert body == {


@@ -1487,7 +1487,7 @@ def setup_hybrid_search_table(db: DBConnection, embedding_func):
table.add([{"text": p} for p in phrases])
# Create a fts index
table.create_fts_index("text")
table.create_fts_index("text", with_position=True)
return table, MyTable, emb


@@ -134,17 +134,19 @@ pub struct MergeResult {
pub num_updated_rows: u64,
pub num_inserted_rows: u64,
pub num_deleted_rows: u64,
pub num_attempts: u32,
}
#[pymethods]
impl MergeResult {
pub fn __repr__(&self) -> String {
format!(
"MergeResult(version={}, num_updated_rows={}, num_inserted_rows={}, num_deleted_rows={})",
"MergeResult(version={}, num_updated_rows={}, num_inserted_rows={}, num_deleted_rows={}, num_attempts={})",
self.version,
self.num_updated_rows,
self.num_inserted_rows,
self.num_deleted_rows
self.num_deleted_rows,
self.num_attempts
)
}
}
@@ -156,6 +158,7 @@ impl From<lancedb::table::MergeResult> for MergeResult {
num_updated_rows: result.num_updated_rows,
num_inserted_rows: result.num_inserted_rows,
num_deleted_rows: result.num_deleted_rows,
num_attempts: result.num_attempts,
}
}
}


@@ -1,6 +1,6 @@
[package]
name = "lancedb"
version = "0.22.4-beta.0"
version = "0.22.4-beta.2"
edition.workspace = true
description = "LanceDB: A serverless, low-latency vector database for AI applications"
license.workspace = true


@@ -239,7 +239,7 @@ impl<const HAS_DATA: bool> CreateTableBuilder<HAS_DATA> {
/// Options already set on the connection will be inherited by the table,
/// but can be overridden here.
///
/// See available options at <https://lancedb.github.io/lancedb/guides/storage/>
/// See available options at <https://lancedb.com/docs/storage/>
pub fn storage_option(mut self, key: impl Into<String>, value: impl Into<String>) -> Self {
let store_options = self
.request
@@ -259,7 +259,7 @@ impl<const HAS_DATA: bool> CreateTableBuilder<HAS_DATA> {
/// Options already set on the connection will be inherited by the table,
/// but can be overridden here.
///
/// See available options at <https://lancedb.github.io/lancedb/guides/storage/>
/// See available options at <https://lancedb.com/docs/storage/>
pub fn storage_options(
mut self,
pairs: impl IntoIterator<Item = (impl Into<String>, impl Into<String>)>,
@@ -442,7 +442,7 @@ impl OpenTableBuilder {
/// Options already set on the connection will be inherited by the table,
/// but can be overridden here.
///
/// See available options at <https://lancedb.github.io/lancedb/guides/storage/>
/// See available options at <https://lancedb.com/docs/storage/>
pub fn storage_option(mut self, key: impl Into<String>, value: impl Into<String>) -> Self {
let storage_options = self
.request
@@ -461,7 +461,7 @@ impl OpenTableBuilder {
/// Options already set on the connection will be inherited by the table,
/// but can be overridden here.
///
/// See available options at <https://lancedb.github.io/lancedb/guides/storage/>
/// See available options at <https://lancedb.com/docs/storage/>
pub fn storage_options(
mut self,
pairs: impl IntoIterator<Item = (impl Into<String>, impl Into<String>)>,
@@ -959,7 +959,7 @@ impl ConnectBuilder {
/// Set an option for the storage layer.
///
/// See available options at <https://lancedb.github.io/lancedb/guides/storage/>
/// See available options at <https://lancedb.com/docs/storage/>
pub fn storage_option(mut self, key: impl Into<String>, value: impl Into<String>) -> Self {
self.request.options.insert(key.into(), value.into());
self
@@ -967,7 +967,7 @@ impl ConnectBuilder {
/// Set multiple options for the storage layer.
///
/// See available options at <https://lancedb.github.io/lancedb/guides/storage/>
/// See available options at <https://lancedb.com/docs/storage/>
pub fn storage_options(
mut self,
pairs: impl IntoIterator<Item = (impl Into<String>, impl Into<String>)>,
@@ -1102,7 +1102,7 @@ impl ConnectNamespaceBuilder {
/// Set an option for the storage layer.
///
/// See available options at <https://lancedb.github.io/lancedb/guides/storage/>
/// See available options at <https://lancedb.com/docs/storage/>
pub fn storage_option(mut self, key: impl Into<String>, value: impl Into<String>) -> Self {
self.storage_options.insert(key.into(), value.into());
self
@@ -1110,7 +1110,7 @@ impl ConnectNamespaceBuilder {
/// Set multiple options for the storage layer.
///
/// See available options at <https://lancedb.github.io/lancedb/guides/storage/>
/// See available options at <https://lancedb.com/docs/storage/>
pub fn storage_options(
mut self,
pairs: impl IntoIterator<Item = (impl Into<String>, impl Into<String>)>,


@@ -60,7 +60,7 @@ pub struct ListingDatabaseOptions {
/// These are used to create/list tables and they are inherited by all tables
/// opened by this database.
///
/// See available options at <https://lancedb.github.io/lancedb/guides/storage/>
/// See available options at <https://lancedb.com/docs/storage/>
pub storage_options: HashMap<String, String>,
}
@@ -157,7 +157,7 @@ impl ListingDatabaseOptionsBuilder {
/// Set an option for the storage layer.
///
/// See available options at <https://lancedb.github.io/lancedb/guides/storage/>
/// See available options at <https://lancedb.com/docs/storage/>
pub fn storage_option(mut self, key: impl Into<String>, value: impl Into<String>) -> Self {
self.options
.storage_options
@@ -167,7 +167,7 @@ impl ListingDatabaseOptionsBuilder {
/// Set multiple options for the storage layer.
///
/// See available options at <https://lancedb.github.io/lancedb/guides/storage/>
/// See available options at <https://lancedb.com/docs/storage/>
pub fn storage_options(
mut self,
pairs: impl IntoIterator<Item = (impl Into<String>, impl Into<String>)>,


@@ -71,7 +71,7 @@
//! It treats [`FixedSizeList<Float16/Float32>`](https://docs.rs/arrow/latest/arrow/array/struct.FixedSizeListArray.html)
//! columns as vector columns.
//!
//! For more details, please refer to [LanceDB documentation](https://lancedb.github.io/lancedb/).
//! For more details, please refer to the [LanceDB documentation](https://lancedb.com/docs).
//!
//! #### Create a table
//!


@@ -90,7 +90,7 @@ pub struct RemoteDatabaseOptions {
pub host_override: Option<String>,
/// Storage options configure the storage layer (e.g. S3, GCS, Azure, etc.)
///
/// See available options at <https://lancedb.github.io/lancedb/guides/storage/>
/// See available options at <https://lancedb.com/docs/storage/>
///
/// These options are only used for LanceDB Enterprise and only a subset of options
/// are supported.


@@ -1183,6 +1183,7 @@ impl<S: HttpSend> BaseTable for RemoteTable<S> {
num_deleted_rows: 0,
num_inserted_rows: 0,
num_updated_rows: 0,
num_attempts: 0,
});
}


@@ -467,6 +467,11 @@ pub struct MergeResult {
/// However those rows are not shared with the user.
#[serde(default)]
pub num_deleted_rows: u64,
/// Number of attempts performed during the merge operation.
/// This includes the initial attempt plus any retries due to transaction conflicts.
/// A value of 1 means the operation succeeded on the first try.
#[serde(default)]
pub num_attempts: u32,
}
#[derive(Debug, Clone, PartialEq, Eq, Serialize, Deserialize, Default)]
@@ -2531,6 +2536,7 @@ impl BaseTable for NativeTable {
num_updated_rows: stats.num_updated_rows,
num_inserted_rows: stats.num_inserted_rows,
num_deleted_rows: stats.num_deleted_rows,
num_attempts: stats.num_attempts,
})
}
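The doc comment on `num_attempts` describes it as the initial attempt plus any retries caused by transaction conflicts, with 1 meaning success on the first try. The Python sketch below shows that counting convention. `ConflictError`, `MergeOutcome`, and `run_with_retries` are hypothetical names, and the retry policy is an assumption rather than Lance's actual implementation.

```python
from dataclasses import dataclass
from typing import Callable


class ConflictError(Exception):
    """Hypothetical stand-in for a transaction conflict."""


@dataclass
class MergeOutcome:
    value: str
    num_attempts: int  # initial attempt plus retries


def run_with_retries(operation: Callable[[], str], max_retries: int = 3) -> MergeOutcome:
    attempts = 0
    while True:
        attempts += 1  # count the attempt before running it
        try:
            return MergeOutcome(operation(), attempts)
        except ConflictError:
            if attempts > max_retries:
                raise  # retry budget exhausted, surface the conflict


calls = {"n": 0}


def flaky() -> str:
    # Conflicts on the first two calls, then succeeds.
    calls["n"] += 1
    if calls["n"] < 3:
        raise ConflictError()
    return "committed"


outcome = run_with_retries(flaky)
first_try = run_with_retries(lambda: "committed")
print(outcome.num_attempts, first_try.num_attempts)
```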
@@ -2990,9 +2996,13 @@ mod tests {
// Perform a "insert if not exists"
let mut merge_insert_builder = table.merge_insert(&["i"]);
merge_insert_builder.when_not_matched_insert_all();
merge_insert_builder.execute(new_batches).await.unwrap();
let result = merge_insert_builder.execute(new_batches).await.unwrap();
// Only 5 rows should actually be inserted
assert_eq!(table.count_rows(None).await.unwrap(), 15);
assert_eq!(result.num_inserted_rows, 5);
assert_eq!(result.num_updated_rows, 0);
assert_eq!(result.num_deleted_rows, 0);
assert_eq!(result.num_attempts, 1);
// Create new data with i=15..25 (no id matches)
let new_batches = Box::new(merge_insert_test_batches(15, 2));