lancedb

mirror of https://github.com/lancedb/lancedb.git synced 2026-01-13 23:32:57 +00:00

Ryan Green 96d534d4bc feat: add retries to remote client for requests with stream bodies (#2349)

Closes https://github.com/lancedb/lancedb/issues/2307
* Adds retries to remote operations with stream bodies (add,
merge_insert)
* Change default retryable status codes to 409, 429, 500, 502, 503, 504
* Don't retry add or merge_insert operations on 5xx responses

Notes:
* Supporting retries on stream bodies means we have to buffer the body
into memory so it can be cloned on retry. This will impact memory use
patterns for the remote client. This buffering can be disabled by
disabling retries (i.e. setting retries to 0 in RetryConfig)
* It does not seem that retry config can be specified by env vars as the
documentation suggests. I added a follow-up issue
[here](https://github.com/lancedb/lancedb/issues/2350)
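The trade-off described in the notes above can be illustrated with a minimal sketch. This is not the remote client's actual code (which lives in the Rust crate); `send_with_retries`, `backoff_delay`, the `retry_5xx` flag, and the delay parameters are hypothetical names chosen for illustration. Only the status codes and the "retries set to 0 disables buffering" behavior come from the commit message.

```python
import random
import time
from typing import Callable, Iterable, Tuple

# Default retryable status codes listed in the commit message.
RETRYABLE_STATUSES = {409, 429, 500, 502, 503, 504}


def backoff_delay(attempt: int, base: float = 0.25, jitter: float = 0.25) -> float:
    """Exponential backoff plus random jitter (base and jitter are assumed values)."""
    return base * (2 ** attempt) + random.uniform(0.0, jitter)


def send_with_retries(
    send: Callable[[Iterable[bytes]], int],
    body_stream: Iterable[bytes],
    retries: int = 3,
    retry_5xx: bool = True,
    sleep: Callable[[float], None] = time.sleep,
) -> Tuple[int, int]:
    """Send a streamed body and return (final_status, attempts_made)."""
    if retries == 0:
        # No retries: pass the stream straight through, nothing is buffered.
        return send(body_stream), 1

    # A stream can only be consumed once, so it is buffered in memory
    # in order to replay it on each retry.
    buffered = list(body_stream)

    attempt = 0
    while True:
        status = send(iter(buffered))
        attempt += 1
        retryable = status in RETRYABLE_STATUSES and (retry_5xx or status < 500)
        if not retryable or attempt > retries:
            return status, attempt
        sleep(backoff_delay(attempt - 1))
```

Under these assumptions, an `add` or `merge_insert` style call would pass `retry_5xx=False`, so a 409 or 429 is retried with the buffered body while a 5xx response is returned to the caller immediately.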



## Summary by CodeRabbit

- **New Features**
  - Enhanced retry support for remote requests with configurable limits and exponential backoff with jitter.
  - Added robust retry logic for streaming data uploads, enabling retries with buffered data to ensure reliability.

- **Bug Fixes**
  - Improved error handling and retry behavior for HTTP status codes 409 and 504.

- **Refactor**
  - Centralized and modularized HTTP request sending and retry logic across remote database and table operations.
  - Streamlined request ID management for improved traceability.
  - Simplified error message construction in index waiting functionality.

- **Tests**
  - Added a test verifying merge-insert retries on HTTP 409 responses.

2025-04-22 15:40:44 -02:30

| Name | Last commit | Date |
| --- | --- | --- |
| .cargo | ci: refactor node releases (#2223) | 2025-03-21 10:56:29 -07:00 |
| .github | fix(python): make sure pandas is optional (#2346) | 2025-04-21 13:42:13 -07:00 |
| | ci: refactor node releases (#2223) | 2025-03-21 10:56:29 -07:00 |
| dockerfiles | A simple base usage that install the dependencies necessary to use FT… (#1036) | 2024-04-05 16:31:36 -07:00 |
| docs | docs: make table.update() nodejs guide consistent with API documentation (#2334) | 2025-04-21 08:38:16 -07:00 |
| java | Bump version: 0.19.0-beta.8 → 0.19.0-beta.9 | 2025-04-21 22:50:20 +00:00 |
| node | Updating package-lock.json | 2025-04-21 23:55:43 +00:00 |
| nodejs | Updating package-lock.json | 2025-04-21 22:50:56 +00:00 |
| python | Bump version: 0.22.0-beta.8 → 0.22.0-beta.9 | 2025-04-21 22:49:59 +00:00 |
| rust | feat: add retries to remote client for requests with stream bodies (#2349) | 2025-04-22 15:40:44 -02:30 |
| .bumpversion.toml | Bump version: 0.19.0-beta.8 → 0.19.0-beta.9 | 2025-04-21 22:50:20 +00:00 |
| .gitignore | ci(rust): caching improvements (up to 2.8x faster builds) (#2075) | 2025-01-29 08:26:45 -08:00 |
| .pre-commit-config.yaml | fix(python): typing (#2167) | 2025-03-10 09:01:23 -07:00 |
| Cargo.lock | feat: add retries to remote client for requests with stream bodies (#2349) | 2025-04-22 15:40:44 -02:30 |
| Cargo.toml | feat: add prewarm_index function (#2342) | 2025-04-17 15:14:36 -07:00 |
| CONTRIBUTING.md | docs: contributing guide (#1970) | 2025-01-07 15:11:16 -08:00 |
| docker-compose.yml | feat: expose storage options in LanceDB (#1204) | 2024-04-10 10:12:04 -07:00 |
| LICENSE | initial commit | 2023-03-17 18:15:19 -07:00 |
| pyright_report.csv | fix(python): typing (#2167) | 2025-03-10 09:01:23 -07:00 |
| README.md | fix: handle light and dark mode logo (#2265) | 2025-03-22 10:21:05 -07:00 |
| release_process.md | ci: enable java auto release (#1602) | 2024-09-19 10:51:03 -07:00 |
| rust-toolchain.toml | ci(rust): check MSRV and upgrade toolchain (#1960) | 2024-12-19 08:43:25 -08:00 |

README.md

Search More, Manage Less

LanceDB is an open-source database for vector search built with persistent storage, which greatly simplifies retrieval, filtering, and management of embeddings.

The key features of LanceDB include:

- Production-scale vector search with no servers to manage.
- Store, query, and filter vectors, metadata, and multi-modal data (text, images, videos, point clouds, and more).
- Support for vector similarity search, full-text search, and SQL.
- Native Python and JavaScript/TypeScript support.
- Zero-copy, automatic versioning: manage versions of your data without needing extra infrastructure.
- GPU support for building vector indexes(*).
- Ecosystem integrations with LangChain 🦜️🔗, LlamaIndex 🦙, Apache Arrow, Pandas, Polars, DuckDB, and more on the way.

LanceDB's core is written in Rust 🦀 and is built using Lance, an open-source columnar format designed for performant ML workloads.

Quick Start

JavaScript

npm install @lancedb/lancedb

import * as lancedb from "@lancedb/lancedb";

// Open (or create) a local database directory.
const db = await lancedb.connect("data/sample-lancedb");
// Create the table, replacing any existing table with the same name.
const table = await db.createTable("vectors", [
  { id: 1, vector: [0.1, 0.2], item: "foo", price: 10 },
  { id: 2, vector: [1.1, 1.2], item: "bar", price: 50 },
], { mode: "overwrite" });

// Find the 2 rows whose vectors are nearest to the query vector.
const query = table.vectorSearch([0.1, 0.3]).limit(2);
const results = await query.toArray();

// You can also search for rows by specific criteria without involving a vector search.
const rowsByCriteria = await table.query().where("price >= 10").toArray();

Python

pip install lancedb

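import lancedb

# Open (or create) a local database directory.
uri = "data/sample-lancedb"
db = lancedb.connect(uri)
# Create a table from a list of rows; each row carries a vector plus metadata.
table = db.create_table("my_table",
                         data=[{"vector": [3.1, 4.1], "item": "foo", "price": 10.0},
                               {"vector": [5.9, 26.5], "item": "bar", "price": 20.0}])
# Return the 2 rows nearest to the query vector as a pandas DataFrame.
result = table.search([100, 100]).limit(2).to_pandas()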
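To make the search call above more concrete, here is a rough sketch of what a nearest-neighbor query computes, using only the standard library and the same two sample rows. It assumes Euclidean (L2) distance and an exhaustive scan; `brute_force_search` is a hypothetical helper for illustration, not part of the LanceDB API, and a real vector index would avoid comparing against every row.

```python
import math

rows = [
    {"vector": [3.1, 4.1], "item": "foo", "price": 10.0},
    {"vector": [5.9, 26.5], "item": "bar", "price": 20.0},
]


def brute_force_search(rows, query, limit):
    """Return the `limit` rows closest to `query` by L2 distance, nearest first."""
    return sorted(rows, key=lambda r: math.dist(r["vector"], query))[:limit]


nearest = brute_force_search(rows, [100, 100], limit=2)
print([r["item"] for r in nearest])  # prints ['bar', 'foo']
```

With the query `[100, 100]`, "bar" ranks first because its vector lies closer to the query point than "foo" does.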

Blogs, Tutorials & Videos

Languages

Rust 42.8%

Python 41.9%

TypeScript 14.2%

Shell 0.6%

Java 0.3%
