Commit Graph

161 Commits

Author SHA1 Message Date
Will Jones
cee2b5ea42 chore: upgrade pyarrow pin (#2192)
Closes #2191


<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->

## Summary by CodeRabbit

- **Chores**
- Updated the required version of the pyarrow package to version 16 or
higher.
- Adjusted automated testing workflows to install pyarrow version 16 for
compatibility checks.

<!-- end of auto-generated comment: release notes by coderabbit.ai -->
2025-05-05 11:23:13 -07:00
Will Jones
acc3b03004 ci: fix docs deploy (#2351)
<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->

## Summary by CodeRabbit

- **Chores**
- Improved CI workflow for documentation builds by optimizing Rust build
settings and updating the runner environment.
- Fixed a typo in a workflow step name.
- Streamlined caching steps to reduce redundancy and improve efficiency.

<!-- end of auto-generated comment: release notes by coderabbit.ai -->
2025-04-22 13:55:34 -07:00
Will Jones
92f0b16e46 fix(python): make sure pandas is optional (#2346)
Fixes #2344


<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->

## Summary by CodeRabbit

- **Tests**
- Updated tests to use PyArrow Tables instead of pandas DataFrames where
possible, reducing reliance on pandas.
- Tests that require pandas are now automatically skipped if pandas is
not installed.
- **Chores**
- Improved workflow to uninstall both pylance and pandas in a specific
test step.

<!-- end of auto-generated comment: release notes by coderabbit.ai -->
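
A minimal sketch of skipping pandas-dependent tests when pandas is absent. It uses only the standard library (`importlib.util.find_spec` and `unittest.skipUnless`). The test class and test names are hypothetical, and the project's own tests may use a different mechanism.

```python
import importlib.util
import unittest

# True only when pandas can be found in this environment.
HAS_PANDAS = importlib.util.find_spec("pandas") is not None


class OptionalPandasTest(unittest.TestCase):
    """Hypothetical tests: one needs pandas, one does not."""

    @unittest.skipUnless(HAS_PANDAS, "pandas is not installed")
    def test_with_pandas(self):
        import pandas as pd  # imported lazily so test collection never fails

        frame = pd.DataFrame({"id": [1, 2, 3]})
        self.assertEqual(len(frame), 3)

    def test_without_pandas(self):
        # Plain Python data keeps this test independent of pandas.
        rows = [{"id": 1}, {"id": 2}, {"id": 3}]
        self.assertEqual(len(rows), 3)


def run_suite() -> unittest.TestResult:
    """Run the tests above and return the result object."""
    suite = unittest.defaultTestLoader.loadTestsFromTestCase(OptionalPandasTest)
    return unittest.TextTestRunner(verbosity=0).run(suite)
```

Calling `run_suite()` reports the pandas test as skipped, not failed, when pandas is missing.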
2025-04-21 13:42:13 -07:00
Will Jones
4a2cdbf299 ci: provide token for deprecate call (#2309)
This should prevent the failures we are seeing in Node release.

<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->
## Summary by CodeRabbit

- **Chore**
- Enhanced the package deprecation process with improved security
measures, ensuring smoother and more reliable updates during package
deprecation.
<!-- end of auto-generated comment: release notes by coderabbit.ai -->
2025-04-04 14:44:58 -07:00
Will Jones
647dee4e94 ci: check release builds when we change dependencies (#2299)
The issue we fixed in https://github.com/lancedb/lancedb/pull/2296 was
caused by an upgrade in dependencies. This could have been caught if we
had run these CI jobs when we did the dependency change.

<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->

## Summary by CodeRabbit

- **Chores**
- Updated our automated pipeline to trigger additional stability checks
when dependency configurations change, ensuring smoother build and
release processes.

<!-- end of auto-generated comment: release notes by coderabbit.ai -->
2025-04-03 16:19:00 -07:00
Will Jones
f091f57594 ci: fix lancedb musl builds (#2296)
Fixes #2255


<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->
## Summary by CodeRabbit

- **Chores**
- Enhanced the build process to improve performance and reliability
across Linux platforms.
- Updated environment settings for more accurate compiler integration.
- Activated previously inactive build configurations to support advanced
feature support.
- Added support for the x86_64 architecture on Linux systems utilizing
the musl C library.
<!-- end of auto-generated comment: release notes by coderabbit.ai -->
2025-04-01 14:44:27 -07:00
Will Jones
e59f9382a0 ci: deprecate vectordb each release (#2292)
I realized that each time we published, the new package was no longer
deprecated. This re-deprecates the package after each new publish.
2025-03-31 12:03:04 -07:00
vinoyang
c321cccc12 chore(java): make rust release to be a switch option (#2277) 2025-03-28 11:26:24 +08:00
Will Jones
93a82fd371 ci: allow dry run on PR to Python release (#2245)
This just makes it easier to test in the future.
2025-03-21 16:14:32 -07:00
Will Jones
0d379e6ffa ci(node): setup URL so auth token is picked up (#2257)
Should fix failure seen here:
https://github.com/lancedb/lancedb/actions/runs/13999958170/job/39207039825
2025-03-21 16:14:24 -07:00
Will Jones
b2a38ac366 fix: make pylance optional again (#2209)
The two remaining blockers were:

* A method `with_embeddings` that was deprecated a year ago
* A typecheck for `LanceDataset`
2025-03-21 11:26:32 -07:00
Will Jones
2bfdef2624 ci: refactor node releases (#2223)
This PR fixes build issues associated with `aws-lc-rs`, while
simplifying the build process. Previously, we used custom scripts for
the musl and Windows ARM builds. These were complicated and prone to
breaking. This PR switches to a setup that mirrors
https://github.com/napi-rs/package-template/blob/main/.github/workflows/CI.yml.

* linux glibc and musl builds now use the Docker images provided by the
napi project
* Windows ARM build now just cross compiles from Windows x64, which
turns out to work quite well.
2025-03-21 10:56:29 -07:00
Will Jones
2a1d6d8abf ci: simplify windows builds (#2243)
We soon won't rely on cross compiling from Linux to Windows, so we can
remove this check. Instead, check that we can cross compile from Windows
between architectures.
2025-03-20 08:06:56 -07:00
Will Jones
440a466a13 ci: remove OpenSSL as dependency in favor of rustls (#2242)
`object_store` already hard codes `rustls` as the TLS implementation, so
we have been shipping a mix of `rustls` and `openssl`. For simplicity of
builds, we should consolidate to one, and that has to be `rustls`.
2025-03-20 08:06:45 -07:00
Will Jones
a6b6f6a806 ci: drop vectordb support for musl, windows ARM (#2241)
vectordb is deprecated, and these platforms are particularly difficult
to maintain. Removing now to prevent further headaches.

We will keep these platforms supported on `@lancedb/lancedb`.
2025-03-19 12:23:46 -07:00
msu-reevo
cc81f3e1a5 fix(python): typing (#2167)
@wjones127 is there a standard way you guys set up your virtualenv? I can
either relist all the dependencies in the pyright precommit section, or
specify a venv, or the user has to be in the virtual environment when
they run git commit. If the venv location was standardized or a python
manager like `uv` was used it would be easier to avoid duplicating the
pyright dependency list.

Per your suggestion, in `pyproject.toml` I added in all the passing
files to the `includes` section.

For ruff I upgraded the version and removed "TCH" which doesn't exist as
an option.

I added a `pyright_report.csv` which contains a list of all files sorted
by pyright errors ascending as a todo list to work on.

I fixed about 30 issues in `table.py` stemming from plain `str` values
being passed into methods that require one of a fixed set of string
`Literal`s, by extracting those types into `types.py`.
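
As an illustration of the pattern, here is a minimal sketch of a shared `Literal` alias and a helper that narrows a plain `str` to it. The alias name `DistanceType`, its allowed values, and the helper are hypothetical and are not taken from `types.py`.

```python
from typing import Literal, cast, get_args

# Hypothetical alias; the allowed strings are illustrative only.
DistanceType = Literal["l2", "cosine", "dot"]


def as_distance_type(value: str) -> DistanceType:
    """Narrow a plain str to the Literal type, or raise ValueError."""
    allowed = get_args(DistanceType)
    if value not in allowed:
        raise ValueError(f"expected one of {allowed}, got {value!r}")
    # The membership check above justifies the cast for type checkers.
    return cast(DistanceType, value)
```

Defining the alias once lets every method signature reuse it, so a type checker flags callers that pass an arbitrary `str`.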

Can you verify in the rust bridge that the schema should be a property
and not a method here? If it's a method, then there's another place in
the code where `inner.schema` should be `inner.schema()`
``` python
class RecordBatchStream:
    @property
    def schema(self) -> pa.Schema: ...
```

Also, unless the `_lancedb.pyi` file is wrong, there is no `__anext__`
on `_inner` when it's not an `AsyncGenerator`; only `next` is
defined:
``` python
    async def __anext__(self) -> pa.RecordBatch:
        # Only an AsyncGenerator has __anext__; a RecordBatchStream exposes next().
        if isinstance(self._inner, AsyncGenerator):
            batch = await self._inner.__anext__()
        else:
            batch = await self._inner.next()
        if batch is None:
            raise StopAsyncIteration
        return batch
```
In the `else` branch, `_inner` is a `RecordBatchStream`:
```python
class RecordBatchStream:
    @property
    def schema(self) -> pa.Schema: ...
    async def next(self) -> Optional[pa.RecordBatch]: ...
```
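
A self-contained sketch of the adapter pattern discussed above, assuming a stream that only exposes `async next()` returning `None` when exhausted. `FakeStream`, `AsyncBatchReader`, `sample_generator`, and `collect` are hypothetical names used for illustration; integers stand in for record batches.

```python
import asyncio
from collections.abc import AsyncGenerator
from typing import Optional


class FakeStream:
    """Hypothetical stand-in for a stream exposing only `async next()`."""

    def __init__(self, items):
        self._items = list(items)

    async def next(self) -> Optional[int]:
        # Return None once exhausted, mirroring the Optional return in the stub.
        return self._items.pop(0) if self._items else None


class AsyncBatchReader:
    """Adapts either an AsyncGenerator or a `next()`-style stream."""

    def __init__(self, inner):
        self._inner = inner

    def __aiter__(self):
        return self

    async def __anext__(self):
        if isinstance(self._inner, AsyncGenerator):
            # An exhausted generator raises StopAsyncIteration by itself.
            return await self._inner.__anext__()
        batch = await self._inner.next()
        if batch is None:
            raise StopAsyncIteration
        return batch


async def sample_generator():
    yield 10
    yield 20


async def collect(reader):
    """Drain an async iterator into a list."""
    return [item async for item in reader]
```

Both `asyncio.run(collect(AsyncBatchReader(FakeStream([1, 2, 3]))))` and `asyncio.run(collect(AsyncBatchReader(sample_generator())))` go through the same `__anext__`, taking the `next()` path and the generator path respectively.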

---------

Co-authored-by: Will Jones <willjones127@gmail.com>
2025-03-10 09:01:23 -07:00
vinoyang
374fe0ad95 feat(rust): introduce Catalog trait and implement ListingCatalog (#2148)
Co-authored-by: Weston Pace <weston.pace@gmail.com>
2025-03-03 20:22:24 -08:00
Lei Xu
7c12d497b0 ci: bump python to 3.12 in GHA (#2169) 2025-03-01 17:24:02 -08:00
Weston Pace
d6b3ccb37b feat: upgrade lance to 0.23.2 (#2152)
This also changes the pylance pin from `==0.23.2` to `~=0.23.2` which
should allow the pylance dependency to float a little. The pylance
dependency is actually not used for much anymore and so it should be
tolerant of patch changes.
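
To make the difference between the two pins concrete: under PEP 440, `~=0.23.2` is equivalent to `>=0.23.2, ==0.23.*`. The sketch below approximates that rule for plain dotted versions only (no pre-release or local tags). The helper names are hypothetical, and real tooling should use a proper version parser.

```python
def parse(version: str) -> tuple[int, ...]:
    """Parse a plain dotted release such as '0.23.2' (no pre-release tags)."""
    return tuple(int(part) for part in version.split("."))


def compatible_release(candidate: str, pin: str) -> bool:
    """Approximate `~=pin`: at least `pin`, with all but the last segment equal."""
    cand, base = parse(candidate), parse(pin)
    if len(base) < 2:
        raise ValueError("`~=` needs at least two release segments")
    prefix = base[:-1]
    return cand >= base and cand[: len(prefix)] == prefix


def exact(candidate: str, pin: str) -> bool:
    """Approximate `==pin` for plain dotted versions."""
    return parse(candidate) == parse(pin)
```

With these helpers, `0.23.5` satisfies `~=0.23.2` but not `==0.23.2`, while `0.24.0` satisfies neither.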
2025-02-26 09:02:51 -08:00
Will Jones
e05c0cd87e ci(node): check docs in CI (#2084)
* Make `npm run docs` fail if there are any warnings. This will catch
items missing from the API reference.
* Add a check in our CI to make sure `npm run docs` runs without warnings
and doesn't generate any new files (indicating it might be out-of-date).
* Hide constructors that aren't user facing.
* Remove unused enum `WriteMode`.

Closes #2068
2025-01-30 16:06:06 -08:00
Will Jones
a677a4b651 ci: fix arm64 windows cross compile build (#2081)
* Adds a CI job to check the cross compiled Windows ARM build.
* Didn't replace the test build because we need a native build to run
tests. But for some reason (I forget why) we need the cross-compiled
build for nodejs.
* Pinned crunchy to work around
https://github.com/eira-fransham/crunchy/issues/13

This is needed to fix failure from
https://github.com/lancedb/lancedb/actions/runs/13020773184/job/36320719331
2025-01-30 09:24:20 -08:00
Will Jones
15f8f4d627 ci: check license headers (#2076)
Based on the same workflow in Lance.
2025-01-29 08:27:07 -08:00
Will Jones
6526d6c3b1 ci(rust): caching improvements (up to 2.8x faster builds) (#2075)
Some Rust jobs (such as
[Rust/linux](https://github.com/lancedb/lancedb/actions/runs/13019232960/job/36315830779))
take almost minutes. This can be a bit of a bottleneck.

* Two fixes to make caches more effective:
  * Check in `Cargo.lock` so that dependencies don't change much between
    runs
    * Added a new CI job to validate we can build without a lockfile
  * Altered build commands so they don't have contradictory features and
    therefore don't trigger multiple builds

Sadly, I don't think there's much to be done for windows-arm64, as much
of the compile time is because the base image is so bare we need to
install the build tools ourselves.
2025-01-29 08:26:45 -08:00
Will Jones
7920ecf66e ci(python): stop using deprecated 2_24 manylinux for arm (#2064)
Based on changes made in Lance:

* https://github.com/lancedb/lance/pull/3409
* https://github.com/lancedb/lance/pull/3411
2025-01-23 15:00:34 -08:00
Mr. Doge
998c5f3f74 ci: add dbghelp.lib to sysroot-aarch64-pc-windows-msvc.sh (#1975) (#2008)
successful runs:
https://github.com/FuPeiJiang/lancedb/actions/runs/12698662005
2025-01-09 14:24:09 -08:00
Will Jones
6eacae18c4 test: fix test failure from merge (#2007) 2025-01-09 11:27:24 -08:00
Will Jones
8b31540b21 ci: prevent stable release with preview lance (#1995)
Accidentally referenced a preview release in our stable release of
LanceDB. This adds a CI check to prevent that.
2025-01-06 08:54:14 -08:00
Lei Xu
f76c4a5ce1 chore: add pyright static type checking and fix some of the table interface (#1996)
* Enable `pyright` in the project
* Fixed some pyright typing errors in `table.py`
2025-01-04 15:24:58 -08:00
Will Jones
0a0f667bbd chore: fix typos (#1976) 2024-12-24 12:50:54 -08:00
Will Jones
03753fd84b ci(node): remove hardcoded toolchain from typescript release build (#1974)
We upgraded the toolchain in #1960, but didn't realize we hardcoded it
in `npm-publish.yml`. I found if I just removed the hard-coded
toolchain, it selects the correct one.

This didn't fully fix Windows Arm, so I created a follow-up issue here:
https://github.com/lancedb/lancedb/issues/1975
2024-12-24 12:48:41 -08:00
Will Jones
27ef0bb0a2 ci(rust): check MSRV and upgrade toolchain (#1960)
* Upgrades our toolchain file to v1.83.0, since many dependencies now
have MSRV of 1.81.0
* Reverts Rust changes from #1946 that were working around this in a
dumb way
* Adding an MSRV check
* Reduce MSRV back to 1.78.0
2024-12-19 08:43:25 -08:00
Will Jones
25402ba6ec chore: update lockfiles (#1946) 2024-12-18 08:43:33 -08:00
Will Jones
d11b2a6975 ci: fix python beta release to publish to fury (#1937)
We have been publishing all releases--even preview ones--to PyPI. This
was because of a faulty bash if statement. This PR fixes that
conditional.
2024-12-13 14:19:14 -08:00
Lei Xu
c78a9849b4 ci: upgrade version of upload-pages-artifact and deploy-pages (#1917)
For
https://github.blog/changelog/2024-12-05-deprecation-notice-github-pages-actions-to-require-artifacts-actions-v4-on-github-com/
2024-12-06 10:45:24 -05:00
Will Jones
8b628854d5 ci: fix nodejs release jobs (#1912)
* Clean up old commented out jobs
* Fix runner issue that caused these failures:
https://github.com/lancedb/lancedb/actions/runs/12186754094
2024-12-05 14:45:10 -08:00
Mr. Doge
c7d424b2f3 ci: aarch64-pc-windows-msvc (#1890)
`npm run pack-build -- -t $TARGET_TRIPLE`
was needed instead of
`npm run pack-build -t $TARGET_TRIPLE`
https://github.com/lancedb/lancedb/pull/1889

some documentation about `*-pc-windows-msvc` cross-compilation (from
alpine):
https://github.com/lancedb/lancedb/pull/1831#issuecomment-2497156918

only `arm64` in `matrix` config is used
since `x86_64` built by `runs-on: windows-2022` is working
2024-12-02 11:17:37 -08:00
Bert
1efb9914ee ci: fix failing python release (#1896)
Fix failing python release for windows:
https://github.com/lancedb/lancedb/actions/runs/12019637086/job/33506642964

Also updates pkginfo to fix twine build as suggested here:
https://github.com/pypi/warehouse/issues/15611
failing release:
https://github.com/lancedb/lancedb/actions/runs/12091344173/job/33719622146
2024-12-02 11:05:29 -08:00
Mr. Doge
d496ab13a0 ci: linux: specify target triple for neon pack-build (vectordb) (#1889)
Fixes an issue where all `neon pack-build` packs were named
`vectordb-linux-x64-musl-*.tgz`, even when cross-compiling.

adds 2nd param:
`TARGET_TRIPLE=${2:-x86_64-unknown-linux-gnu}`
`npm run pack-build -- -t $TARGET_TRIPLE`
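
For readers less familiar with shell parameter expansion: `${2:-x86_64-unknown-linux-gnu}` yields the second positional argument, or the default when that argument is unset or empty. A rough Python equivalent follows; the function names are hypothetical.

```python
DEFAULT_TARGET_TRIPLE = "x86_64-unknown-linux-gnu"


def target_triple(args: list[str]) -> str:
    """Return the second positional argument, or the default if unset or empty."""
    # Bash's `$2` corresponds to args[1], because args excludes the script name.
    second = args[1] if len(args) > 1 else ""
    return second or DEFAULT_TARGET_TRIPLE


def pack_build_command(args: list[str]) -> list[str]:
    """Build the npm invocation; `--` forwards `-t` to the script itself."""
    return ["npm", "run", "pack-build", "--", "-t", target_triple(args)]
```

The `--` matters because npm only forwards the arguments after it to the underlying script.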
2024-11-26 10:57:17 -08:00
Will Jones
9fa08bfa93 ci: use correct runner for vectordb (#1881)
We already do this for `gnu` builds, we should do this also for `musl`
builds.
2024-11-25 16:17:10 -08:00
Mr. Doge
53d1535de1 ci: musl x64,arm64 (#1853)
untested 4 artifacts at:
https://github.com/FuPeiJiang/lancedb/actions/runs/11926579058
node-native-linux-aarch64-musl 22.6 MB
node-native-linux-x86_64-musl 23.6 MB
nodejs-native-linux-aarch64-musl 26.7 MB
nodejs-native-linux-x86_64-musl 27 MB

this follows the same process as:
https://github.com/lancedb/lancedb/pull/1816#issuecomment-2484816669

Closes #1388
Closes #1107

---------

Co-authored-by: Will Jones <willjones127@gmail.com>
2024-11-20 10:53:19 -08:00
Will Jones
97d6210c33 ci: remove invalid references (#1834)
Fix release job
2024-11-18 11:32:44 -08:00
Will Jones
587c0824af feat: flexible null handling and insert subschemas in Python (#1827)
* Test that we can insert subschemas (omit nullable columns) in Python.
* More work is needed to support this in Node. See:
https://github.com/lancedb/lancedb/issues/1832
* Test that we can insert data with a nullable schema, but no nulls, into a
non-nullable schema.
* Add `"null"` option for `on_bad_vectors` where we fill with null if
the vector is bad.
* Make null values not considered bad if the field itself is nullable.
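
The intended semantics can be sketched in plain Python. This is not LanceDB's implementation. The definition of a "bad" vector used here (wrong length or containing NaN) and the option names other than `"null"` are assumptions, and `sanitize` and `is_bad` are hypothetical helpers.

```python
import math
from typing import Optional

Vector = Optional[list[float]]


def is_bad(vector: Vector, dim: int, nullable: bool) -> bool:
    """Assumed rule: a null is bad only if the field is not nullable."""
    if vector is None:
        return not nullable
    return len(vector) != dim or any(math.isnan(x) for x in vector)


def sanitize(vectors, dim, on_bad_vectors="error", fill_value=0.0, nullable=True):
    """Apply an on_bad_vectors-style policy to a list of vectors."""
    out = []
    for vector in vectors:
        if not is_bad(vector, dim, nullable):
            out.append(vector)
        elif on_bad_vectors == "drop":
            continue
        elif on_bad_vectors == "fill":
            out.append([fill_value] * dim)
        elif on_bad_vectors == "null":
            out.append(None)  # replace the bad vector with a null
        else:
            raise ValueError(f"bad vector: {vector!r}")
    return out
```

With a nullable field, an existing `None` passes through untouched under every policy, while a wrong-length vector becomes `None` only under `"null"`.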
2024-11-15 11:33:00 -08:00
Will Jones
119d88b9db ci: disable Windows Arm64 until the release builds work (#1833)
Started to actually fix this, but it was taking too long:
https://github.com/lancedb/lancedb/pull/1831
2024-11-14 15:04:23 -08:00
Will Jones
0fd8a50bd7 ci(node): run examples in CI (#1796)
This is done as setup for a PR that will fix the OpenAI dependency
issue.

 * [x] FTS examples
 * [x] Setup mock openai
 * [x] Ran `npm audit fix`
 * [x] sentences embeddings test
 * [x] Double check formatting of docs examples
2024-11-13 11:10:56 -08:00
Umut Hope YILDIRIM
9f228feb0e ci: remove cache to fix build issues on windows arm runner (#1820) 2024-11-13 09:27:10 -08:00
Will Jones
68974a4e06 ci: add index URL to fix failing docs build (#1823) 2024-11-12 16:54:22 -08:00
Lei Xu
4c9bab0d92 fix: use pandas with pydantic embedding column (#1818)
* Make Pandas `DataFrame` work with an embedding function and a subset of
columns
* Make `lancedb.create_table()` work with embedding function
2024-11-11 14:48:56 -08:00
Umut Hope YILDIRIM
729718cb09 fix: arm64 runner proto already installed bug (#1810)
https://github.com/lancedb/lancedb/actions/runs/11748512661/job/32732745458
2024-11-08 14:49:37 -08:00
Umut Hope YILDIRIM
b1c84e0bda feat: added lancedb and vectordb release ci for win32-arm64-msvc npmjs only (#1805) 2024-11-08 11:40:57 -08:00
Umut Hope YILDIRIM
fa9ca8f7a6 ci: arm64 windows build support (#1770)
Adds support for 'aarch64-pc-windows-msvc'.
2024-11-06 15:34:23 -08:00