The eslint rules specify some formatting requirements that are rather
strict and conflict with vscode's default formatter. I was unable to get
auto-formatting to setup correctly. Also, eslint has quite recently
[given up on
formatting](https://eslint.org/blog/2023/10/deprecating-formatting-rules/)
and recommends using a 3rd party formatter.
This PR adds prettier as the formatter. It restores the eslint rules to
their defaults. This does mean we now have the "no explicit any" check
back on. I know that rule is pedantic but it did help me catch a few
corner cases in type testing that weren't covered in the current code.
Leaving in draft as this is dependent on other PRs.
In order to add support for `add` we needed to migrate the rust `Table`
trait to a `Table` struct and `TableInternal` trait (similar to the way
the connection is designed).
While doing this we also cleaned up some inconsistencies between the
SDKs:
* Python and Node are garbage collected languages and it can be
difficult to trigger something to be freed. The convention for these
languages is to have some kind of close method. I added a close method
to both the table and connection which will drop the underlying rust
object.
* We made significant improvements to table creation in
cc5f2136a6
for the `node` SDK. I copied these changes to the `nodejs` SDK.
* The nodejs tables were using fs to create tmp directories and these
were not getting cleaned up. This is mostly harmless but annoying and so
I changed it up a bit to ensure we cleanup tmp directories.
* ~~countRows in the node SDK was returning `bigint`. I changed it to
return `number`~~ (this actually happened in a previous PR)
* Tables and connections now implement `std::fmt::Display` which is
hooked into python's `__repr__`. Node has no concept of a regular "to
string" function and so I added a `display` method.
* Python method signatures are changing so that optional parameters are
always `Optional[foo] = None` instead of something like `foo = False`.
This is because we want those defaults to be in rust whenever possible
(though we still need to mention the default in documentation).
* I changed the python `AsyncConnection/AsyncTable` classes from
abstract classes with a single implementation to just classes because we
no longer have the remote implementation in python.
Note: this does NOT add the `add` function to the remote table. This PR
was already large enough, and the remote implementation is unique
enough, that I am going to do all the remote stuff at a later date (we
should have the structure in place and correct so there shouldn't be any
refactor concerns)
---------
Co-authored-by: Will Jones <willjones127@gmail.com>
This changes `lancedb` from a "pure python" setuptools project to a
maturin project and adds a rust lancedb dependency.
The async python client is extremely minimal (only `connect` and
`Connection.table_names` are supported). The purpose of this PR is to
get the infrastructure in place for building out the rest of the async
client.
Although this is not technically a breaking change (no APIs are
changing) it is still a considerable change in the way the wheels are
built because they now include the native shared library.
This also renames the new experimental node package to lancedb. The
classic node package remains named vectordb.
The goal here is to avoid introducing piecemeal breaking changes to the
vectordb crate. Instead, once the new API is stabilized, we will
officially release the lancedb crate and deprecate the vectordb crate.
The same pattern will eventually happen with the npm package vectordb.
When we turned on fat LTO builds, we made the release build job **much**
more compute and memory intensive. The ARM runners have particularly low
memory per core, which makes them susceptible to OOM errors. To avoid
issues, I have enabled memory swap on ARM and bumped the side of the
runner.
@eddyxu added instructions for linting here:
7af213801a/python/README.md (L45-L50)
However, we had a lot of failures and weren't checking this in CI. This
PR fixes all lints and adds a check to CI to keep us in compliance with
the lints.
Use pathlib for local paths so that pathlib
can handle the correct separator on windows.
Closes#703
---------
Co-authored-by: Will Jones <willjones127@gmail.com>
This PR adds issue templates, which help two recurring issues:
* Users forget to tell us whether they are using the Node or Python SDK
* Issues don't get appropriate tags
This doesn't force the use of the templates. Because we set
`blank_issues_enabled: true`, users can still create a custom issue.
Most recent release failed because `release` depends on `node-macos`,
but we renamed `node-macos` to `node-macos-{x86,arm64}`. This fixes that
by consolidating them back to a single `node-macos` job, which also has
the side effect of making the file shorter.
We had some build issues with npm publish for cross-compiling arm64
macos on an x86 macos runner. Switching to m1 runner for now until
someone has time to deal with the feature flags.
follow-up tracked here: #688
aws integration tests are flaky because we didn't wait for the services
to become healthy. (we only waited for the localstack service, this PR
adds wait for sub services)
# WARNING: specifying engine is NOT a publicly supported feature in
lancedb yet. THE API WILL CHANGE.
This PR exposes dynamodb based commit to `vectordb` and JS SDK (will do
python in another PR since it's on a different release track)
This PR also added aws integration test using `localstack`
## What?
This PR adds uri parameters to DB connection string. User may specify
`engine` in the connection string to let LanceDB know that the user
wants to use an external store when reading and writing a table. User
may also pass any parameters required by the commitStore in the
connection string, these parameters will be propagated to lance.
e.g.
```
vectordb.connect("s3://my-db-bucket?engine=ddb&ddbTableName=my-commit-table")
```
will automatically convert table path to
```
s3+ddb://my-db-bucket/my_table.lance?&ddbTableName=my-commit-table
```