## Summary
- update all lance crates to v1.0.0 using the helper script (fallbacks
to the v1.0.0 tag)
- refresh Cargo.lock to pull the new release
- add script fallback to retry with the git tag when a crates.io release
is unavailable
## Testing
- cargo clippy --workspace --tests --all-features -- -D warnings
- cargo fmt --all
Tag: https://github.com/lance-format/lance/releases/tag/v1.0.0
---------
Co-authored-by: Jack Ye <yezhaoqin@gmail.com>
This PR will add a timely lance release check for lancedb which will
auto bump lance while new tags released.
---
**This PR was primarily authored with Codex using GPT-5-Codex and then
hand-reviewed by me. I AM responsible for every change made in this PR.
I aimed to keep it aligned with our goals, though I may have missed
minor issues. Please flag anything that feels off, I'll fix it
quickly.**
---------
Signed-off-by: Xuanwo <github@xuanwo.io>
Fixed the issue on lance-namespace side to avoid pinning to a specific
lance version. This should fix the issue of the increased release
artifact size and build time.
Add a new test feature which allows for running the lancedb tests
against a remote server. Convert over a few tests in src/connection.rs
as a proof of concept.
To make local development easier, the remote tests can be run locally
from a Makefile. This file can also be used to run the feature tests,
with a single invocation of 'make'. (The feature tests require bringing
up a docker compose environment.)
This shrinks the size of a local embedded build that can disable all the
default features. When combined with
https://github.com/lancedb/lance/pull/4362 and the dependencies are
updated to point to the fix, this resolves#2567 fully.
Verified by patching the workspace to redirect to my clone of lance with
the PR applied.
```
cargo tree -p lancedb -e no-build -e no-dev --no-default-features -i aws-config | less
```
The reason that lance itself needs to change too is that many
dependencies within that project depend on lance-io/default and lancedb
depends on them which transitively ends up enabling the cloud
regardless. The PR in lance removes the dependency on lance-io/default
from all sibling crates.
---------
Co-authored-by: Will Jones <willjones127@gmail.com>
Adds a script to change the lance dependency easily. To make this
change, I just had to run:
```bash
python ci/set_lance_version.py stable
```
<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->
## Summary by CodeRabbit
- **New Features**
- Added a script to automate updating the Lance package version in
project dependencies.
- **Chores**
- Updated workflows to improve lockfile management and automate updates
during releases and publishing.
- Switched Lance dependencies from git-based references to fixed version
numbers for improved stability.
- Enhanced lockfile update script with an option to amend commits and
quieter output.
<!-- end of auto-generated comment: release notes by coderabbit.ai -->
---------
Co-authored-by: coderabbitai[bot] <136622811+coderabbitai[bot]@users.noreply.github.com>
Follow up to #2416
Forgot to do `git add`.
Also need to delete old actions updating package lock.
<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->
## Summary by CodeRabbit
- **Chores**
- Removed legacy workflows related to updating package lock files.
- Improved the update lockfiles script to ensure updated lockfiles are
always included in amended commits.
<!-- end of auto-generated comment: release notes by coderabbit.ai -->
<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->
## Summary by CodeRabbit
- **Chores**
- Updated workflow to ignore changes in the `Cargo.lock` file during
documentation checks, reducing unnecessary workflow failures.
- Enhanced release process by adding automated lockfile updates for
Node.js and Rust components.
- Removed an obsolete package-lock update job from the publishing
workflow to streamline releases.
<!-- end of auto-generated comment: release notes by coderabbit.ai -->
This PR fixes build issues associated with `aws-lc-rs`, while
simplifying the build process. Previously, we used custom scripts for
the musl and Windows ARM builds. These were complicated and prone to
breaking. This PR switches to a setup that mirrors
https://github.com/napi-rs/package-template/blob/main/.github/workflows/CI.yml.
* linux glibc and musl builds now use the Docker images provided by the
napi project
* Windows ARM build now just cross compiles from Windows x64, which
turns out to work quite well.
@wjones127 is there a standard way you guys setup your virtualenv? I can
either relist all the dependencies in the pyright precommit section, or
specify a venv, or the user has to be in the virtual environment when
they run git commit. If the venv location was standardized or a python
manager like `uv` was used it would be easier to avoid duplicating the
pyright dependency list.
Per your suggestion, in `pyproject.toml` I added in all the passing
files to the `includes` section.
For ruff I upgraded the version and removed "TCH" which doesn't exist as
an option.
I added a `pyright_report.csv` which contains a list of all files sorted
by pyright errors ascending as a todo list to work on.
I fixed about 30 issues in `table.py` stemming from str's being passed
into methods that required a string within a set of string Literals by
extracting them into `types.py`
Can you verify in the rust bridge that the schema should be a property
and not a method here? If it's a method, then there's another place in
the code where `inner.schema` should be `inner.schema()`
``` python
class RecordBatchStream:
@property
def schema(self) -> pa.Schema: ...
```
Also unless the `_lancedb.pyi` file is wrong, then there is no
`__anext__` here for `__inner` when it's not an `AsyncGenerator` and
only `next` is defined:
``` python
async def __anext__(self) -> pa.RecordBatch:
return await self._inner.__anext__()
if isinstance(self._inner, AsyncGenerator):
batch = await self._inner.__anext__()
else:
batch = await self._inner.next()
if batch is None:
raise StopAsyncIteration
return batch
```
in the else statement, `_inner` is a `RecordBatchStream`
```python
class RecordBatchStream:
@property
def schema(self) -> pa.Schema: ...
async def next(self) -> Optional[pa.RecordBatch]: ...
```
---------
Co-authored-by: Will Jones <willjones127@gmail.com>
fixes that all `neon pack-build` packs are named
`vectordb-linux-x64-musl-*.tgz` even when cross-compiling
adds 2nd param:
`TARGET_TRIPLE=${2:-x86_64-unknown-linux-gnu}`
`npm run pack-build -- -t $TARGET_TRIPLE`
This is done as setup for a PR that will fix the OpenAI dependency
issue.
* [x] FTS examples
* [x] Setup mock openai
* [x] Ran `npm audit fix`
* [x] sentences embeddings test
* [x] Double check formatting of docs examples
Building with Node 16 produced this error:
```
npm ERR! code ENOENT
npm ERR! syscall chmod
npm ERR! path /io/nodejs/node_modules/apache-arrow-15/bin/arrow2csv.cjs
npm ERR! errno -2
npm ERR! enoent ENOENT: no such file or directory, chmod '/io/nodejs/node_modules/apache-arrow-15/bin/arrow2csv.cjs'
npm ERR! enoent This is related to npm not being able to find a file.
npm ERR! enoent
```
[CI
Failure](https://github.com/lancedb/lancedb/actions/runs/10117131772/job/27981475770).
This looks like it is https://github.com/apache/arrow/issues/43341
Upgrading to Node 18 makes this goes away. Since Node 18 requires glibc
>= 2_28, we had to upgrade the manylinux version we are using. This is
fine since we already state a minimum Node version of 18.
This also upgrades the openssl version we bundle, as well as
consolidates the build files.
* Labelled jobs `vectordb` and `lancedb` so it's clear which package
they are for
* Fix permission issue in aarch64 Linux `vectordb` build that has been
blocking release for two months.
* Added Slack notifications for failure of these publish jobs.
I tried to do a stable release and it failed with:
```
Traceback (most recent call last):
File "/home/runner/work/lancedb/lancedb/ci/check_breaking_changes.py", line 20, in <module>
commits = repo.compare(args.base, args.head).commits
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/opt/hostedtoolcache/Python/3.11.9/x64/lib/python3.11/site-packages/github/Repository.py", line 1133, in compare
headers, data = self._requester.requestJsonAndCheck("GET", f"{self.url}/compare/{base}...{head}", params)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/opt/hostedtoolcache/Python/3.11.9/x64/lib/python3.11/site-packages/github/Requester.py", line 548, in requestJsonAndCheck
return self.__check(*self.requestJson(verb, url, parameters, headers, input, self.__customConnection(url)))
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/opt/hostedtoolcache/Python/3.11.9/x64/lib/python3.11/site-packages/github/Requester.py", line 609, in __check
raise self.createException(status, responseHeaders, data)
github.GithubException.UnknownObjectException: 404 {"message": "Not Found", "documentation_url": "https://docs.github.com/rest/commits/commits#compare-two-commits"}
```
I believe the problem is that we are calculating the
`LAST_STABLE_RELEASE` after we have run bump version and so the newly
created tag is in the list of tags we search and it is the most recent
one and so it gets included as `LAST_STABLE_RELEASE`. Then, the call to
github fails because we haven't pushed the tag yet. This changes the
logic to grab `LAST_STABLE_RELEASE` before we create any new tags.
This PR changes the release process. Some parts are more complex, and
other parts I've simplified.
## Simplifications
* Combined `Create Release Commit` and `Create Python Release Commit`
into a single workflow. By default, it does a release of all packages,
but you can still choose to make just a Python or just Node/Rust release
through the arguments. This will make it rarer that we create a Node
release but forget about Python or vice-versa.
* Releases are automatically generated once a tag is pushed. This
eliminates the manual step of creating the release.
* Release notes are automatically generated and changes are categorized
based on the PR labels.
* Removed the use of `LANCEDB_RELEASE_TOKEN` in favor of just using
`GITHUB_TOKEN` where it wasn't necessary. In the one place it is
necessary, I left a comment as to why it is.
* Reused the version in `python/Cargo.toml` so we don't have two
different versions in Python LanceDB.
## New changes
* We now can create `preview` / `beta` releases. By default `Create
Release Commit` will create a preview release, but you can select a
"stable" release type and it will create a full stable release.
* For Python, pre-releases go to fury.io instead of PyPI
* `bump2version` was deprecated, so upgraded to `bump-my-version`. This
also seems to better support semantic versioning with pre-releases.
* `ci` changes will now be shown in the changelog, allowing changes like
this to be visible to users. `chore` is still hidden.
## Versioning
**NOTE**: unlike how it is in lance repo right now, the version in main
is the last one released, including beta versions.
---------
Co-authored-by: Lance Release <lance-dev@lancedb.com>
Co-authored-by: Weston Pace <weston.pace@gmail.com>
When we turned on fat LTO builds, we made the release build job **much**
more compute and memory intensive. The ARM runners have particularly low
memory per core, which makes them susceptible to OOM errors. To avoid
issues, I have enabled memory swap on ARM and bumped the side of the
runner.
there's build failure for the rust artifact but the macos arm64 build
for npm publish still passed. So we had a silent failure for 2 releases.
By setting error to immediate this should cause fail immediately.
* Refactors the Node module to load the shared library from a separate
package. When a user does `npm install vectordb`, the correct optional
dependency is automatically downloaded by npm.
* Add scripts and instructions to build Linux and MacOS node artifacts
locally.
* Add instructions for publishing the npm module and crates.
Co-authored-by: Will Jones <willjones127@gmail.com>
Changes:
* Refactors the Node module to load the shared library from a separate
package. When a user does `npm install vectordb`, the correct optional
dependency is automatically downloaded by npm.
* Brings Rust and Node versions in alignment at 0.1.2.
* Add scripts and instructions to build Linux and MacOS node artifacts
locally.
* Add instructions for publishing the npm module and crates.