@wjones127 is there a standard way you guys setup your virtualenv? I can
either relist all the dependencies in the pyright precommit section, or
specify a venv, or the user has to be in the virtual environment when
they run git commit. If the venv location was standardized or a python
manager like `uv` was used it would be easier to avoid duplicating the
pyright dependency list.
Per your suggestion, in `pyproject.toml` I added in all the passing
files to the `includes` section.
For ruff I upgraded the version and removed "TCH" which doesn't exist as
an option.
I added a `pyright_report.csv` which contains a list of all files sorted
by pyright errors ascending as a todo list to work on.
I fixed about 30 issues in `table.py` stemming from str's being passed
into methods that required a string within a set of string Literals by
extracting them into `types.py`
Can you verify in the rust bridge that the schema should be a property
and not a method here? If it's a method, then there's another place in
the code where `inner.schema` should be `inner.schema()`
``` python
class RecordBatchStream:
@property
def schema(self) -> pa.Schema: ...
```
Also unless the `_lancedb.pyi` file is wrong, then there is no
`__anext__` here for `__inner` when it's not an `AsyncGenerator` and
only `next` is defined:
``` python
async def __anext__(self) -> pa.RecordBatch:
return await self._inner.__anext__()
if isinstance(self._inner, AsyncGenerator):
batch = await self._inner.__anext__()
else:
batch = await self._inner.next()
if batch is None:
raise StopAsyncIteration
return batch
```
in the else statement, `_inner` is a `RecordBatchStream`
```python
class RecordBatchStream:
@property
def schema(self) -> pa.Schema: ...
async def next(self) -> Optional[pa.RecordBatch]: ...
```
---------
Co-authored-by: Will Jones <willjones127@gmail.com>
This also changes the pylance pin from `==0.23.2` to `~=0.23.2` which
should allow the pylance dependency to float a little. The pylance
dependency is actually not used for much anymore and so it should be
tolerant of patch changes.
* Make `npm run docs` fail if there are any warnings. This will catch
items missing from the API reference.
* Add a check in our CI to make sure `npm run dos` runs without warnings
and doesn't generate any new files (indicating it might be out-of-date.
* Hide constructors that aren't user facing.
* Remove unused enum `WriteMode`.
Closes#2068
Some Rust jobs (such as
[Rust/linux](https://github.com/lancedb/lancedb/actions/runs/13019232960/job/36315830779))
take almost minutes. This can be a bit of a bottleneck.
* Two fixes to make caches more effective
* Check in `Cargo.lock` so that dependencies don't change much between
runs
* Added a new CI job to validate we can build without a lockfile
* Altered build commands so they don't have contradictory features and
therefore don't trigger multiple builds
Sadly, I don't think there's much to be done for windows-arm64, as much
of the compile time is because the base image is so bare we need to
install the build tools ourselves.
We upgraded the toolchain in #1960, but didn't realize we hardcoded it
in `npm-publish.yml`. I found if I just removed the hard-coded
toolchain, it selects the correct one.
This didn't fully fix Windows Arm, so I created a follow-up issue here:
https://github.com/lancedb/lancedb/issues/1975
* Upgrades our toolchain file to v1.83.0, since many dependencies now
have MSRV of 1.81.0
* Reverts Rust changes from #1946 that were working around this in a
dumb way
* Adding an MSRV check
* Reduce MSRV back to 1.78.0
fixes that all `neon pack-build` packs are named
`vectordb-linux-x64-musl-*.tgz` even when cross-compiling
adds 2nd param:
`TARGET_TRIPLE=${2:-x86_64-unknown-linux-gnu}`
`npm run pack-build -- -t $TARGET_TRIPLE`
* Test that we can insert subschemas (omit nullable columns) in Python.
* More work is needed to support this in Node. See:
https://github.com/lancedb/lancedb/issues/1832
* Test that we can insert data with nullable schema but no nulls in
non-nullable schema.
* Add `"null"` option for `on_bad_vectors` where we fill with null if
the vector is bad.
* Make null values not considered bad if the field itself is nullable.
This is done as setup for a PR that will fix the OpenAI dependency
issue.
* [x] FTS examples
* [x] Setup mock openai
* [x] Ran `npm audit fix`
* [x] sentences embeddings test
* [x] Double check formatting of docs examples
* Labelled jobs `vectordb` and `lancedb` so it's clear which package
they are for
* Fix permission issue in aarch64 Linux `vectordb` build that has been
blocking release for two months.
* Added Slack notifications for failure of these publish jobs.
This doesn't actually block a python-only release since this step runs
after the version bump has been pushed but it still would be nice for
the git job to finish successfully.