while adding some more docs & examples for the new js sdk, i ran across
a few compatibility issues when using different arrow versions. This
should fix those issues.
- add `return` for `__enter__`
The buggy code didn't return the object, therefore it will always return
None within a context manager:
```python
with await lancedb.connect_async("./.lancedb") as db:
# db is always None
```
(BTW, why not to design an async context manager?)
- add a unit test for Async connection context manager
- update return type of `AsyncConnection.open_table` to `AsyncTable`
Although type annotation doesn't affect the functionality, it is helpful
for IDEs.
Attempting to create a new minor version failed with:
```
Specified version (0.4.21-beta.0) does not match last tagged version (0.4.20)
```
It seems the last release commit for rust/node was made without the new
process and did not adjust bumpversion.toml correctly (or maybe
bumpversion.toml did not exist at that time)
I tried to do a stable release and it failed with:
```
Traceback (most recent call last):
File "/home/runner/work/lancedb/lancedb/ci/check_breaking_changes.py", line 20, in <module>
commits = repo.compare(args.base, args.head).commits
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/opt/hostedtoolcache/Python/3.11.9/x64/lib/python3.11/site-packages/github/Repository.py", line 1133, in compare
headers, data = self._requester.requestJsonAndCheck("GET", f"{self.url}/compare/{base}...{head}", params)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/opt/hostedtoolcache/Python/3.11.9/x64/lib/python3.11/site-packages/github/Requester.py", line 548, in requestJsonAndCheck
return self.__check(*self.requestJson(verb, url, parameters, headers, input, self.__customConnection(url)))
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/opt/hostedtoolcache/Python/3.11.9/x64/lib/python3.11/site-packages/github/Requester.py", line 609, in __check
raise self.createException(status, responseHeaders, data)
github.GithubException.UnknownObjectException: 404 {"message": "Not Found", "documentation_url": "https://docs.github.com/rest/commits/commits#compare-two-commits"}
```
I believe the problem is that we are calculating the
`LAST_STABLE_RELEASE` after we have run bump version and so the newly
created tag is in the list of tags we search and it is the most recent
one and so it gets included as `LAST_STABLE_RELEASE`. Then, the call to
github fails because we haven't pushed the tag yet. This changes the
logic to grab `LAST_STABLE_RELEASE` before we create any new tags.
The optimize function is pretty crucial for getting good performance
when building a large scale dataset but it was only exposed in rust
(many sync python users are probably doing this via to_lance today)
This PR adds the optimize function to nodejs and to python.
I left the function marked experimental because I think there will
likely be changes to optimization (e.g. if we add features like
"optimize on write"). I also only exposed the `cleanup_older_than`
configuration parameter since this one is very commonly used and the
rest have sensible defaults and we don't really know why we would
recommend different values for these defaults anyways.
This PR changes the release process. Some parts are more complex, and
other parts I've simplified.
## Simplifications
* Combined `Create Release Commit` and `Create Python Release Commit`
into a single workflow. By default, it does a release of all packages,
but you can still choose to make just a Python or just Node/Rust release
through the arguments. This will make it rarer that we create a Node
release but forget about Python or vice-versa.
* Releases are automatically generated once a tag is pushed. This
eliminates the manual step of creating the release.
* Release notes are automatically generated and changes are categorized
based on the PR labels.
* Removed the use of `LANCEDB_RELEASE_TOKEN` in favor of just using
`GITHUB_TOKEN` where it wasn't necessary. In the one place it is
necessary, I left a comment as to why it is.
* Reused the version in `python/Cargo.toml` so we don't have two
different versions in Python LanceDB.
## New changes
* We now can create `preview` / `beta` releases. By default `Create
Release Commit` will create a preview release, but you can select a
"stable" release type and it will create a full stable release.
* For Python, pre-releases go to fury.io instead of PyPI
* `bump2version` was deprecated, so upgraded to `bump-my-version`. This
also seems to better support semantic versioning with pre-releases.
* `ci` changes will now be shown in the changelog, allowing changes like
this to be visible to users. `chore` is still hidden.
## Versioning
**NOTE**: unlike how it is in lance repo right now, the version in main
is the last one released, including beta versions.
---------
Co-authored-by: Lance Release <lance-dev@lancedb.com>
Co-authored-by: Weston Pace <weston.pace@gmail.com>
I've been noticing a lot of friction with the current toolchain for
'/nodejs'. Particularly with the usage of eslint and prettier.
[Biome](https://biomejs.dev/) is an all in one formatter & linter that
replaces the need for two different ones that can potentially clash with
one another.
I've been using it in the
[nodejs-polars](https://github.com/pola-rs/nodejs-polars) repo for quite
some time & have found it much more pleasant to work with.
---
One other small change included in this PR:
use [ts-jest](https://www.npmjs.com/package/ts-jest) so we can run our
tests without having to rebuild typescript code first
* Enforce conventional commit PR titles
* Add automatic labelling of PRs
* Write down breaking change policy.
Left for another PR:
* Validation of breaking change version bumps. (This is complicated due
to separate releases for Python and other package.)