Commit Graph

404 Commits

Author SHA1 Message Date
Prashanth Rao
76bcc78910 docs: nodejs failing CI is fixed (#2802)
Fixes the breaking CI for nodejs, related to the documentation of the
new Permutation API in typescript.

- Expanded the generated typings in `nodejs/lancedb/native.d.ts` to
include `SplitCalculatedOptions`, `splitNames` fields, and the
persist/options-based `splitCalculated` methods so the permutation
exports match the native API.
- The previous block comment block had an inconsistency.
`splitCalculated` takes an options object (`SplitCalculatedOptions`) in
our bindings, not a bare string. The previous example showed
`builder.splitCalculated("user_id % 3");`, which doesn’t match the
actual signature and would fail TS typecheck. I updated the comment to
`builder.splitCalculated({ calculation: "user_id % 3" });` so the
example is now correct.
- Updated the `splitCalculated` example in
`nodejs/lancedb/permutation.ts` to use the options object.
- Ran `npm docs` to ensure docs build correctly.

> [!NOTE]
> **Disclaimer**: I used GPT-5.1-Codex-Max to make these updates, but I
have read the code and run `npm run docs` to verify that they work and
are correct to the best of my knowledge.
2025-11-20 16:16:38 -08:00
Prashanth Rao
135dfdc7ec docs: 404 and outdated URLs should now work (#2800)
Did a full scan of all URLs that used to point to the old mkdocs pages,
and now links to the appropriate pages on lancedb.com/docs or lance.org
docs.

---------

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
2025-11-20 11:14:20 -08:00
Lance Release
3531393523 Bump version: 0.22.4-beta.1 → 0.22.4-beta.2 2025-11-19 20:25:41 +00:00
Wyatt Alt
386fc9e466 feat: add num_attempts to merge insert result (#2795)
This pipes the num_attempts field from lance's merge insert result
through lancedb. This allows callers of merge_insert to get a better
idea of whether transaction conflicts are occurring.
2025-11-19 09:32:57 -08:00
Lance Release
ce1bafec1a Bump version: 0.22.4-beta.0 → 0.22.4-beta.1 2025-11-19 12:58:59 +00:00
Will Jones
1cf3917a87 ci: make rust ci faster, get ci green (#2782)
* Add `ci` profile for smaller build caches. This had a meaningful
impact in Lance, and I expect a similar impact here.
https://github.com/lancedb/lance/pull/5236
* Get caching working in Rust. Previously was not working due to
`workspaces: rust`.
* Get caching working in NodeJs lint job. Previously wasn't working
because we installed the toolchain **after** we called `- uses:
Swatinem/rust-cache@v2`, which invalidates the cache locally.
* Fix broken pytest from async io transition
(`pytest.PytestRemovedIn9Warning`)
* Altered `get_num_sub_vectors` to handle bug in case of 4-bit PQ. This
was cause of `rust future panicked: unknown error`. Raised an issue
upstream to change panic to error:
https://github.com/lancedb/lance/issues/5257
* Call `npm run docs` to fix doc issue.
* Disable flakey Windows test for consistency. It's just an OS-specific
timer issue, not our fault.
* Fix Windows absolute path handling in namespaces. Was causing CI
failure `OSError: [WinError 123] The filename, directory name, or volume
label syntax is incorrect: `
2025-11-18 09:04:56 -08:00
Lance Release
e2d7640021 Bump version: 0.22.3 → 0.22.4-beta.0 2025-11-17 08:43:51 +00:00
Lance Release
e34f51713a Bump version: 0.22.3-beta.6 → 0.22.3 2025-11-07 04:59:18 +00:00
Lance Release
abaf5ac27f Bump version: 0.22.3-beta.5 → 0.22.3-beta.6 2025-11-07 04:58:38 +00:00
Weston Pace
aeac9c7644 feat: add python Permutation class to mimic hugging face dataset and provide pytorch dataloader (#2725) 2025-11-06 16:15:33 -08:00
Lance Release
aed4a7c98e Bump version: 0.22.3-beta.4 → 0.22.3-beta.5 2025-10-31 17:08:56 +00:00
Lance Release
0b7b27481e Bump version: 0.22.3-beta.3 → 0.22.3-beta.4 2025-10-31 01:14:39 +00:00
S.A.N
20bec61ecb refactor(node): async generator for RecordBatchIterator (#2744)
JS native Async Generator, more efficient asynchronous iteration, fewer
synthetic promises, and the ability to handle `catch` or `break` of
parent loop in `finally` block
2025-10-30 14:36:24 -07:00
Will Jones
45255be42c ci: add agents and add reviewing instructions (#2754) 2025-10-29 17:28:26 -07:00
Lance Release
2a6143b5bd Bump version: 0.22.3-beta.2 → 0.22.3-beta.3 2025-10-28 02:12:20 +00:00
Lance Release
1fa888615f Bump version: 0.22.3-beta.1 → 0.22.3-beta.2 2025-10-21 20:14:20 +00:00
Lance Release
d96404c635 Bump version: 0.22.3-beta.0 → 0.22.3-beta.1 2025-10-19 23:41:46 +00:00
Weston Pace
4cfcd95320 feat: add a permutation reader that can read a permutation view (#2712)
This adds a rust permutation builder. In the next PR I will have python
bindings and integration with pytorch.
2025-10-17 05:00:23 -07:00
Weston Pace
8f8e06a2da feat: add output_schema method to queries (#2717)
This is a helper utility I need for some of my data loader work. It
makes it easy to see the output schema even when a `select` has been
applied.
2025-10-14 05:13:28 -07:00
Lance Release
03eab0f091 Bump version: 0.22.2 → 0.22.3-beta.0 2025-10-14 02:25:58 +00:00
Weston Pace
5a19cf15a6 feat: a utility for creating "permutation views" (#2552)
I'm working on a lancedb version of pytorch data loading (and hopefully
addressing https://github.com/lancedb/lance/issues/3727).

However, rather than rely on pytorch for everything I'm moving some of
the things that pytorch does into rust. This gives us more control over
data loading (e.g. using shards or a hash-based split) and it allows
permutations to be persistent. In particular I hope to be able to:

* Create a persistent permutation
* This permutation can handle splits, filtering, shuffling, and sharding
* Create a rust data loader that can read a permutation (one or more
splits), or a subset of a permutation (for DDP)
* Create a python data loader that delegates to the rust data loader

Eventually create integrations for other data loading libraries,
including rust & node
2025-10-09 18:07:31 -07:00
BubbleCal
b59d1007d3 feat(index): add IVF_RQ index type (#2687)
this expose IVF_RQ (RabitQ quantization) index type to lancedb

---------

Signed-off-by: BubbleCal <bubble-cal@outlook.com>
2025-10-09 15:46:18 +08:00
Lance Release
56a16b1728 Bump version: 0.22.2-beta.3 → 0.22.2 2025-10-08 18:13:08 +00:00
Lance Release
b7afed9beb Bump version: 0.22.2-beta.2 → 0.22.2-beta.3 2025-10-08 18:12:23 +00:00
Tom LaMarre
917aabd077 fix(node): support specifying arrow field types by name (#2704)
The [`FieldLike` type in
arrow.ts](5ec12c9971/nodejs/lancedb/arrow.ts (L71-L78))
can have a `type: string` property, but before this change, actually
trying to create a table that has a schema that specifies field types by
name results in an error:

```
Error: Expected a Type but object was null/undefined
```

This change adds support for mapping some type name strings to arrow
`DataType`s, so that passing `FieldLike`s with a `type: string` property
to `sanitizeField` does not throw an error.

The type names that can be passed are upper/lowercase variations of the
keys of the `constructorsByTypeName` object. This does not support
mapping types that need parameters, such as timestamps which need
timezones.

With this, it is possible to create empty tables from `SchemaLike`
objects without instantiating arrow types, e.g.:

```
    import { SchemaLike } from "../lancedb/arrow"
    // ...
    const schemaLike = {
      fields: [
        {
          name: "id",
          type: "int64",
          nullable: true,
        },
        {
          name: "vector",
          type: "float64",
          nullable: true,
        },
      ],
    // ...
    } satisfies SchemaLike;
    const table = await con.createEmptyTable("test", schemaLike);
 ```

This change also makes `FieldLike.nullable` required since the `sanitizeField` function throws if it is undefined.
2025-10-08 04:40:06 -07:00
Lance Release
d7e02c8181 Bump version: 0.22.2-beta.1 → 0.22.2-beta.2 2025-10-06 18:10:40 +00:00
Neha Prasad
9e2a68541e fix(node): allow undefined/omitted values for nullable vector fields (#2656)
**Problem**: When a vector field is marked as nullable, users should be
able to omit it or pass `undefined`, but this was throwing an error:
"Table has embeddings: 'vector', but no embedding function was provided"

fixes: #2646

**Solution**: Modified `validateSchemaEmbeddings` to check
`field.nullable` before treating `undefined` values as missing embedding
fields.

**Changes**:
- Fixed validation logic in `nodejs/lancedb/arrow.ts`
- Enabled previously skipped test for nullable fields
- Added reproduction test case

**Behavior**:
-  `{ vector: undefined }` now works for nullable fields
-  `{}` (omitted field) now works for nullable fields  
-  `{ vector: null }` still works (unchanged)
-  Non-nullable fields still properly throw errors (unchanged)

---------

Co-authored-by: Will Jones <willjones127@gmail.com>
Co-authored-by: neha <neha@posthog.com>
2025-10-02 10:53:05 -07:00
Lance Release
fec2a05629 Bump version: 0.22.2-beta.0 → 0.22.2-beta.1 2025-09-30 19:31:44 +00:00
Lance Release
e7e9e80b1d Bump version: 0.22.1 → 0.22.2-beta.0 2025-09-24 22:54:54 +00:00
Will Jones
d617cdef4a feat: add use_index parameter to merge insert operations (#2674)
## Summary

Exposes `use_index` Merge Insert parameter, which was created upstream
in https://github.com/lancedb/lance/pull/4688.

## API Examples

### Python
```python
# Force table scan
table.merge_insert(["id"]) \
    .when_not_matched_insert_all() \
    .use_index(False) \
    .execute(data)
```

### Node.js/TypeScript
```typescript
// Force table scan  
await table.mergeInsert("id")
    .whenNotMatchedInsertAll()
    .useIndex(false)
    .execute(data);
```

### Rust
```rust
// Force table scan
let mut builder = table.merge_insert(&["id"]);
builder.when_not_matched_insert_all()
       .use_index(false);
builder.execute(data).await?;
```

🤖 Generated with [Claude Code](https://claude.ai/code)

Co-authored-by: Claude <noreply@anthropic.com>
2025-09-24 12:50:21 -07:00
Will Jones
356d7046fd ci: fix test failure on main (#2677)
Test was in wrong position.
2025-09-24 09:46:04 -07:00
Will Jones
48e5caabda ci(nodejs): lint for unused imports (#2673) 2025-09-23 18:49:42 -07:00
Lance Release
d6cc68f671 Bump version: 0.22.1-beta.4 → 0.22.1 2025-09-23 22:07:31 +00:00
Lance Release
55eacfa685 Bump version: 0.22.1-beta.3 → 0.22.1-beta.4 2025-09-23 22:06:45 +00:00
Neha Prasad
b0800b4b71 fix: undefined values should become null in nullable fields (#2658)
### Bug Fix: Undefined Values in Nullable Fields

**Issue**: When inserting data with `undefined` values into nullable
fields, LanceDB was incorrectly coercing them to default values (`false`
for booleans, `NaN` for numbers, `""` for strings) instead of `null`.

**Fix**: Modified the `makeVector()` function in `arrow.ts` to properly
convert `undefined` values to `null` for nullable fields before passing
data to Apache Arrow.

fixes: #2645

**Result**: Now `{ text: undefined, number: undefined, bool: undefined
}` correctly becomes `{ text: null, number: null, bool: null }` when
fields are marked as nullable in the schema.

**Files Changed**: 
- `nodejs/lancedb/arrow.ts` (core fix)
- `nodejs/__test__/arrow.test.ts` (test coverage)

- This ensures proper null handling for nullable fields as expected by
users.

---------

Co-authored-by: Will Jones <willjones127@gmail.com>
2025-09-23 14:29:52 -07:00
Neha Prasad
1befebf614 fix(node): handle null values in nullable boolean fields (#2657)
### Solution
Added special handling in `makeVector` function for boolean arrays where
all values are null. The fix creates a proper null bitmap using
`makeData` and `arrowMakeVector` instead of relying on Apache Arrow's
`vectorFromArray` which doesn't handle this edge case correctly.

fixes: #2644

### Changes
- Added null value detection for boolean types in `makeVector` function
- Creates proper Arrow data structure with null bitmap when all boolean
values are null
- Preserves existing behavior for non-null boolean values and other data
types

- Fixes the boolean null value bug while maintaining backward
compatibility.

---------

Co-authored-by: Will Jones <willjones127@gmail.com>
2025-09-23 14:07:00 -07:00
Lance Release
05a4ea646a Bump version: 0.22.1-beta.2 → 0.22.1-beta.3 2025-09-22 04:49:00 +00:00
Jack Ye
ff71d7e552 feat: support shallow clone (#2653)
Support shallow cloning a dataset at a specific location to create a new
dataset, using the shallow_clone feature in Lance. Also introduce remote
`clone` API for remote tables for this functionality.
2025-09-21 21:28:40 -07:00
Neha Prasad
2261eb95a0 fix(node): handle undefined vector fields with embedding functions (#2655)
- Fixes issue where passing `{ vector: undefined }` with an embedding
function threw "Found field not in schema" error instead of calling the
embedding function like `null` or omitted fields.

**Changes:**
- Modified `rowPathsAndValues` to skip undefined values during schema
inference
- Added test case verifying undefined, null, and omitted vector fields
all work correctly

**Before:** `{ vector: undefined }` → Error
**After:** `{ vector: undefined }` → Calls embedding function

Closes #2647
2025-09-19 09:17:28 -07:00
Lance Release
b5a39bffec Bump version: 0.22.1-beta.1 → 0.22.1-beta.2 2025-09-18 20:22:35 +00:00
Lance Release
6ea6884260 Bump version: 0.22.1-beta.0 → 0.22.1-beta.1 2025-09-10 20:49:43 +00:00
Jack Ye
8da74dcb37 feat: support per-request header override (#2631)
## Summary

This PR introduces a `HeaderProvider` which is called for all remote
HTTP calls to get the latest headers to inject. This is useful for
features like adding the latest auth tokens where the header provider
can auto-refresh tokens internally and each request always set the
refreshed token.

---------

Co-authored-by: Claude <noreply@anthropic.com>
2025-09-10 13:44:00 -07:00
Lance Release
3c7419b392 Bump version: 0.22.0 → 0.22.1-beta.0 2025-09-10 14:24:58 +00:00
Jack Ye
9391ad1450 feat: support mTLS for remote database (#2638)
This PR adds mTLS (mutual TLS) configuration support for the LanceDB
remote HTTP client, allowing users to authenticate with client
certificates and configure custom CA certificates for server
verification.

---------

Co-authored-by: Claude <noreply@anthropic.com>
2025-09-09 21:04:46 -07:00
Lance Release
06d5612443 Bump version: 0.22.0-beta.2 → 0.22.0 2025-09-04 08:33:40 +00:00
Lance Release
45f96f4151 Bump version: 0.22.0-beta.1 → 0.22.0-beta.2 2025-09-04 08:33:09 +00:00
Lance Release
48f7b20daa Bump version: 0.22.0-beta.0 → 0.22.0-beta.1 2025-09-03 17:51:36 +00:00
Jack Ye
e6f1da31dc chore: upgrade lance to 0.34.0-beta.4 (#2621) 2025-09-02 21:33:55 -07:00
Lance Release
47747287b6 Bump version: 0.21.4-beta.1 → 0.22.0-beta.0 2025-08-29 21:20:57 +00:00
Wyatt Alt
981f8427e6 chore: update lance (#2610)
Adds storage_options to object_store wrap() to adhere to upstream lance
change.
2025-08-29 13:41:02 -07:00