Commit Graph

11 Commits

Author SHA1 Message Date
Weston Pace
c269524b2f feat!: refactor ConnectionInternal into a Database trait (#2067)
This opens up the door for more custom database implementations than the
two we have today. The biggest change should be inivisble:
`ConnectionInternal` has been renamed to `Database`, made public, and
refactored

However, there are a few breaking changes. `data_storage_version` and
`enable_v2_manifest_paths` have been moved from options on
`create_table` to options for the database which are now set via
`storage_options`.

Before:
```
db = connect(uri)
tbl = db.create_table("my_table", data, data_storage_version="legacy", enable_v2_manifest_paths=True)
```

After:
```
db = connect(uri, storage_options={
  "new_table_enable_v2_manifest_paths": "true",
  "new_table_data_storage_version": "legacy"
})
tbl = db.create_table("my_table", data)
```

BREAKING CHANGE: the data_storage_version, enable_v2_manifest_paths
options have moved from options to create_table to storage_options.
BREAKING CHANGE: the use_legacy_format option has been removed,
data_storage_version has replaced it for some time now
2025-02-04 14:35:14 -08:00
Cory Grinstead
31be9212da docs(nodejs): add @lancedb/lancedb examples everywhere (#1411)
Co-authored-by: Will Jones <willjones127@gmail.com>
2024-07-10 13:29:03 -05:00
Will Jones
865ed99881 feat: dynamodb commit store support (#1410)
This allows users to specify URIs like:

```
s3+ddb://my_bucket/path?ddbTableName=myCommitTable
```

and it will support concurrent writes in S3.

* [x] Add dynamodb integration tests
* [x] Add modifications to get it working in Python sync API
* [x] Added section in documentation describing how to configure.

Closes #534

---------

Co-authored-by: universalmind303 <cory.grinstead@gmail.com>
2024-06-28 09:30:36 -07:00
Cory Grinstead
e7022b990e feat(nodejs): feature parity [1/N] - remote table (#1378)
closes https://github.com/lancedb/lancedb/issues/1362
2024-06-17 15:23:27 -05:00
Cory Grinstead
dbea3a7544 feat: js embedding registry (#1308)
---------

Co-authored-by: Will Jones <willjones127@gmail.com>
2024-05-29 13:12:19 -05:00
Cory Grinstead
055efdcdb6 refactor(nodejs): use biomejs instead of eslint & prettier (#1304)
I've been noticing a lot of friction with the current toolchain for
'/nodejs'. Particularly with the usage of eslint and prettier.

[Biome](https://biomejs.dev/) is an all in one formatter & linter that
replaces the need for two different ones that can potentially clash with
one another.

I've been using it in the
[nodejs-polars](https://github.com/pola-rs/nodejs-polars) repo for quite
some time & have found it much more pleasant to work with.

---

One other small change included in this PR:

use [ts-jest](https://www.npmjs.com/package/ts-jest) so we can run our
tests without having to rebuild typescript code first
2024-05-14 11:11:18 -05:00
Weston Pace
73c69a6b9a feat: page_token / limit to native table_names function. Use async table_names function from sync table_names function (#1059)
The synchronous table_names function in python lancedb relies on arrow's
filesystem which behaves slightly differently than object_store. As a
result, the function would not work properly in GCS.

However, the async table_names function uses object_store directly and
thus is accurate. In most cases we can fallback to using the async
table_names function and so this PR does so. The one case we cannot is
if the user is already in an async context (we can't start a new async
event loop). Soon, we can just redirect those users to use the async API
instead of the sync API and so that case will eventually go away. For
now, we fallback to the old behavior.
2024-04-05 16:31:45 -07:00
Weston Pace
785ecfa037 feat: reconfigure typescript linter / formatter for nodejs (#1042)
The eslint rules specify some formatting requirements that are rather
strict and conflict with vscode's default formatter. I was unable to get
auto-formatting to setup correctly. Also, eslint has quite recently
[given up on
formatting](https://eslint.org/blog/2023/10/deprecating-formatting-rules/)
and recommends using a 3rd party formatter.

This PR adds prettier as the formatter. It restores the eslint rules to
their defaults. This does mean we now have the "no explicit any" check
back on. I know that rule is pedantic but it did help me catch a few
corner cases in type testing that weren't covered in the current code.
Leaving in draft as this is dependent on other PRs.
2024-04-05 16:31:36 -07:00
Weston Pace
629c622d15 feat: Initial remote table implementation for rust (#1024)
This will eventually replace the remote table implementations in python
and node.
2024-04-05 16:31:36 -07:00
Chang She
3c46d7f268 Handle NaN input data (#241)
Sometimes LangChain would insert a single `[np.nan]` as a placeholder if
the embedding function failed. This causes a problem for Lance format
because then the array can't be stored as a FixedSizedListArray.

Instead:
1. By default we remove rows with embedding lengths less than the
maximum length in the batch
2. If `strict=True` kwargs is set to True, then a `ValueError` is raised
if the embeddings aren't all the same length

---------

Co-authored-by: Chang She <chang@lancedb.com>
2023-07-04 20:00:46 -07:00
Chang She
01db9417fa add ruff and black pre-commit hook 2023-03-22 17:59:15 -07:00