Closes #1741
If we check out a specific version, we need to make a `HEAD` request to
get the size of the manifest. The new `checkout_latest()` code path can
skip this IOP, which makes the refresh slightly faster.
## user story
fixes https://github.com/lancedb/lancedb/issues/1480

https://github.com/invl/retry has not had an update in 8 years, and one of
its sub-dependencies via requirements.txt
(https://github.com/pytest-dev/py) is no longer maintained and has a
high severity vulnerability (CVE-2022-42969).
`retry` is only used by a single function in the Python codebase: the
helper `with_embeddings`, which was created for an older tutorial
(https://github.com/lancedb/lancedb/pull/12) [but is now
deprecated](https://lancedb.github.io/lancedb/embeddings/legacy/).
## changes
I backported a limited subset of the `@retry()` decorator's
functionality directly into lancedb so that we no longer depend on the
`retry` package.
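For illustration, a minimal sketch of what such a backported decorator could look like. This is not the code from this PR: the parameter names (`tries`, `delay`, `backoff`, `exceptions`) and the injectable `sleep` argument are assumptions chosen for the example.

```python
import functools
import time


def retry(tries=3, delay=1.0, backoff=2.0, exceptions=(Exception,), sleep=time.sleep):
    """Retry the wrapped function up to `tries` times.

    Hypothetical signature for illustration only. The wait starts at
    `delay` seconds and is multiplied by `backoff` after every failure.
    `sleep` is injectable so the behaviour can be tested without waiting.
    """

    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            wait = delay
            for attempt in range(1, tries + 1):
                try:
                    return func(*args, **kwargs)
                except exceptions:
                    if attempt == tries:
                        raise  # out of attempts: surface the last error
                    sleep(wait)
                    wait *= backoff

        return wrapper

    return decorator
```

Keeping only this small surface (fixed attempt count, exponential wait, exception filter) is enough for a single call site and avoids pulling in a third-party package.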
## tests
```
/Users/james/src/lancedb/python $ ruff check .
All checks passed!
/Users/james/src/lancedb/python $ pytest python/tests/test_embeddings.py
python/tests/test_embeddings.py .......s.... [100%]
================================================================ 11 passed, 1 skipped, 2 warnings in 7.08s ================================================================
```
* Adds nicer errors to the remote SDK that expose useful properties like
`request_id` and `status_code`.
* Makes sure the Python tracebacks print nicely by mapping the `source`
field from a Rust error to the `__cause__` field.
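As a rough sketch of the idea behind the two points above, the following shows an error type that carries `request_id` and `status_code` and is chained to its underlying cause so the traceback prints both. The class name `RemoteError` and the helper `raise_with_cause` are hypothetical and are not taken from the SDK.

```python
class RemoteError(Exception):
    """Hypothetical error type; the SDK's real class names may differ."""

    def __init__(self, message, request_id, status_code=None):
        super().__init__(message)
        self.request_id = request_id
        self.status_code = status_code


def raise_with_cause(message, request_id, status_code, source):
    """Raise a RemoteError whose `__cause__` is the lower-level error.

    `raise ... from source` sets `__cause__`, which is what makes Python
    print "The above exception was the direct cause of ..." in tracebacks.
    """
    raise RemoteError(message, request_id, status_code) from source
```

A caller can then branch on the structured fields, for example by logging `err.request_id` for support requests or retrying when `err.status_code` is a 5xx value.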
A few bugs uncovered by integration tests:
* We didn't prepend `/v1` to the Table endpoint URLs
* `/create_index` takes `metric_type` not `distance_type`. (This is also
an error in the OpenAPI docs.)
* `/create_index` expects the `metric_type` parameter to always be
lowercase.
* We were writing an IPC file message when we were supposed to send an
IPC stream message.
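The first three fixes can be sketched roughly as follows. Only the `/v1` prefix, the `metric_type` key, and the lowercase requirement come from the list above; the rest of the path layout, the `column` field, and both function names are assumptions for illustration.

```python
def table_url(base_url, table_name, action):
    """Build a Table endpoint URL with the `/v1` prefix.

    The `/table/<name>/<action>/` layout is an assumption for this sketch.
    """
    return f"{base_url.rstrip('/')}/v1/table/{table_name}/{action}/"


def create_index_body(column, metric):
    """Build a `/create_index` request body.

    The server expects `metric_type` (not `distance_type`), always lowercase.
    The `column` key is a hypothetical field used only for this example.
    """
    return {"column": column, "metric_type": metric.lower()}
```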
This PR ports over advanced client configuration present in the Python
`RestfulLanceDBClient` to the Rust one. The goal is to have feature
parity so we can replace the implementation.
* [x] Request timeout
* [x] Retries with backoff
* [x] Request id generation
* [x] User agent (with default tied to library version ✨ )
* [x] Table existence cache
* [ ] Deferred: ~Request id customization (should this just pick up OTEL
trace ids?)~
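To make the checklist concrete, here is a small sketch of how these options might fit together. It is written in Python for readability and is not the Rust implementation from this PR: the field names, default values, the `x-request-id` header name, and the placeholder user agent string are all assumptions.

```python
import uuid
from dataclasses import dataclass


@dataclass(frozen=True)
class ClientConfig:
    """Hypothetical client settings; names and defaults are illustrative."""

    timeout_s: float = 30.0
    max_retries: int = 3
    backoff_factor: float = 0.25
    user_agent: str = "LanceDB-Client/0.0.0"  # placeholder version string


def backoff_delays(config):
    """Exponential backoff: factor, 2x factor, 4x factor, ... per retry."""
    return [config.backoff_factor * (2 ** i) for i in range(config.max_retries)]


def request_headers(config, request_id=None):
    """Attach the user agent and a request id, generating one if absent."""
    return {
        "user-agent": config.user_agent,
        "x-request-id": request_id or str(uuid.uuid4()),
    }
```

Generating the request id on the client means each retry of the same logical call can either reuse the id or get a fresh one, which is a design choice the sketch leaves to the caller.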
Fixes #1684
Resolves #1709. Adds `trust_remote_code` as a parameter to the
`TransformersEmbeddingFunction` class with a default of `False`. Updates
the relevant documentation to match.
BREAKING CHANGE: the return value of the `index_stats` method has
changed, and all `index_stats` APIs now take the index name instead of
the UUID. Several deprecated index statistics methods were also removed.
* Removes deprecated methods for individual index statistics
* Aligns public `IndexStatistics` struct with API response from LanceDB
Cloud.
* Implements `index_stats` for remote Rust SDK and Python async API.
- fixes https://github.com/lancedb/lancedb/issues/1697.
- unifies vector column inference logic for remote and local tables to
prevent future disparities.
- updates the docstring in `RemoteTable` to specify that empty queries
are not supported.