lancedb

mirror of https://github.com/lancedb/lancedb.git synced 2026-01-08 12:52:58 +00:00

Author	SHA1	Message	Date
Lance Release	8bcac7e372	Bump version: 0.23.1-beta.2 → 0.23.1	2026-01-02 17:39:19 +00:00
Lance Release	e496184ab2	Bump version: 0.23.1-beta.1 → 0.23.1-beta.2	2026-01-02 17:38:54 +00:00
Lance Release	0667fa38d4	Bump version: 0.23.1-beta.0 → 0.23.1-beta.1	2025-12-17 06:59:29 +00:00
Lance Release	2fd712312f	Bump version: 0.23.0 → 0.23.1-beta.0	2025-12-17 03:30:51 +00:00
Lance Release	94bdffe13c	Bump version: 0.23.0-beta.2 → 0.23.0	2025-12-16 16:58:35 +00:00
Lance Release	b93ea3a388	Bump version: 0.23.0-beta.1 → 0.23.0-beta.2	2025-12-16 16:57:55 +00:00
Lance Release	0960e19559	Bump version: 0.23.0-beta.0 → 0.23.0-beta.1	2025-12-05 00:36:39 +00:00
Lance Release	6f79770248	Bump version: 0.22.4-beta.3 → 0.23.0-beta.0	2025-12-04 19:33:37 +00:00
Lance Release	b2d06a3a73	Bump version: 0.22.4-beta.2 → 0.22.4-beta.3	2025-12-02 22:01:59 +00:00
Prashanth Rao	76bcc78910	docs: nodejs failing CI is fixed (#2802) Fixes the breaking CI for nodejs, related to the documentation of the new Permutation API in typescript. - Expanded the generated typings in `nodejs/lancedb/native.d.ts` to include `SplitCalculatedOptions`, `splitNames` fields, and the persist/options-based `splitCalculated` methods so the permutation exports match the native API. - The previous block comment had an inconsistency. `splitCalculated` takes an options object (`SplitCalculatedOptions`) in our bindings, not a bare string. The previous example showed `builder.splitCalculated("user_id % 3");`, which doesn’t match the actual signature and would fail TS typecheck. I updated the comment to `builder.splitCalculated({ calculation: "user_id % 3" });` so the example is now correct. - Updated the `splitCalculated` example in `nodejs/lancedb/permutation.ts` to use the options object. - Ran `npm run docs` to ensure docs build correctly. > [!NOTE] > Disclaimer: I used GPT-5.1-Codex-Max to make these updates, but I have read the code and run `npm run docs` to verify that they work and are correct to the best of my knowledge.	2025-11-20 16:16:38 -08:00
Prashanth Rao	135dfdc7ec	docs: 404 and outdated URLs should now work (#2800) Did a full scan of all URLs that used to point to the old mkdocs pages and updated them to link to the appropriate pages on lancedb.com/docs or the lance.org docs. --------- Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>	2025-11-20 11:14:20 -08:00
Lance Release	3531393523	Bump version: 0.22.4-beta.1 → 0.22.4-beta.2	2025-11-19 20:25:41 +00:00
Wyatt Alt	386fc9e466	feat: add num_attempts to merge insert result (#2795 ) This pipes the num_attempts field from lance's merge insert result through lancedb. This allows callers of merge_insert to get a better idea of whether transaction conflicts are occurring.	2025-11-19 09:32:57 -08:00
Lance Release	ce1bafec1a	Bump version: 0.22.4-beta.0 → 0.22.4-beta.1	2025-11-19 12:58:59 +00:00
Will Jones	1cf3917a87	ci: make rust ci faster, get ci green (#2782) * Add `ci` profile for smaller build caches. This had a meaningful impact in Lance, and I expect a similar impact here. https://github.com/lancedb/lance/pull/5236 * Get caching working in Rust. Previously it was not working due to `workspaces: rust`. * Get caching working in the NodeJs lint job. Previously it wasn't working because we installed the toolchain after we called `- uses: Swatinem/rust-cache@v2`, which invalidates the cache locally. * Fix broken pytest from async io transition (`pytest.PytestRemovedIn9Warning`) * Altered `get_num_sub_vectors` to handle a bug in the case of 4-bit PQ. This was the cause of `rust future panicked: unknown error`. Raised an issue upstream to change the panic to an error: https://github.com/lancedb/lance/issues/5257 * Call `npm run docs` to fix doc issue. * Disable flaky Windows test for consistency. It's just an OS-specific timer issue, not our fault. * Fix Windows absolute path handling in namespaces. Was causing CI failure `OSError: [WinError 123] The filename, directory name, or volume label syntax is incorrect: `	2025-11-18 09:04:56 -08:00
Lance Release	e2d7640021	Bump version: 0.22.3 → 0.22.4-beta.0	2025-11-17 08:43:51 +00:00
Lance Release	e34f51713a	Bump version: 0.22.3-beta.6 → 0.22.3	2025-11-07 04:59:18 +00:00
Lance Release	abaf5ac27f	Bump version: 0.22.3-beta.5 → 0.22.3-beta.6	2025-11-07 04:58:38 +00:00
Weston Pace	aeac9c7644	feat: add python Permutation class to mimic hugging face dataset and provide pytorch dataloader (#2725)	2025-11-06 16:15:33 -08:00
Lance Release	aed4a7c98e	Bump version: 0.22.3-beta.4 → 0.22.3-beta.5	2025-10-31 17:08:56 +00:00
Lance Release	0b7b27481e	Bump version: 0.22.3-beta.3 → 0.22.3-beta.4	2025-10-31 01:14:39 +00:00
S.A.N	20bec61ecb	refactor(node): async generator for RecordBatchIterator (#2744) Uses a native JS async generator, which gives more efficient asynchronous iteration, fewer synthetic promises, and the ability to handle a `catch` or `break` of the parent loop in a `finally` block.	2025-10-30 14:36:24 -07:00
Will Jones	45255be42c	ci: add agents and add reviewing instructions (#2754)	2025-10-29 17:28:26 -07:00
Lance Release	2a6143b5bd	Bump version: 0.22.3-beta.2 → 0.22.3-beta.3	2025-10-28 02:12:20 +00:00
Lance Release	1fa888615f	Bump version: 0.22.3-beta.1 → 0.22.3-beta.2	2025-10-21 20:14:20 +00:00
Lance Release	d96404c635	Bump version: 0.22.3-beta.0 → 0.22.3-beta.1	2025-10-19 23:41:46 +00:00
Weston Pace	4cfcd95320	feat: add a permutation reader that can read a permutation view (#2712 ) This adds a rust permutation builder. In the next PR I will have python bindings and integration with pytorch.	2025-10-17 05:00:23 -07:00
Weston Pace	8f8e06a2da	feat: add output_schema method to queries (#2717 ) This is a helper utility I need for some of my data loader work. It makes it easy to see the output schema even when a `select` has been applied.	2025-10-14 05:13:28 -07:00
Lance Release	03eab0f091	Bump version: 0.22.2 → 0.22.3-beta.0	2025-10-14 02:25:58 +00:00
Weston Pace	5a19cf15a6	feat: a utility for creating "permutation views" (#2552 ) I'm working on a lancedb version of pytorch data loading (and hopefully addressing https://github.com/lancedb/lance/issues/3727). However, rather than rely on pytorch for everything I'm moving some of the things that pytorch does into rust. This gives us more control over data loading (e.g. using shards or a hash-based split) and it allows permutations to be persistent. In particular I hope to be able to: * Create a persistent permutation * This permutation can handle splits, filtering, shuffling, and sharding * Create a rust data loader that can read a permutation (one or more splits), or a subset of a permutation (for DDP) * Create a python data loader that delegates to the rust data loader Eventually create integrations for other data loading libraries, including rust & node	2025-10-09 18:07:31 -07:00
BubbleCal	b59d1007d3	feat(index): add IVF_RQ index type (#2687) This exposes the IVF_RQ (RabitQ quantization) index type in lancedb. --------- Signed-off-by: BubbleCal <bubble-cal@outlook.com>	2025-10-09 15:46:18 +08:00
Lance Release	56a16b1728	Bump version: 0.22.2-beta.3 → 0.22.2	2025-10-08 18:13:08 +00:00
Lance Release	b7afed9beb	Bump version: 0.22.2-beta.2 → 0.22.2-beta.3	2025-10-08 18:12:23 +00:00
Tom LaMarre	917aabd077	fix(node): support specifying arrow field types by name (#2704 ) The [`FieldLike` type in arrow.ts](`5ec12c9971/nodejs/lancedb/arrow.ts (L71-L78)`) can have a `type: string` property, but before this change, actually trying to create a table that has a schema that specifies field types by name results in an error: ``` Error: Expected a Type but object was null/undefined ``` This change adds support for mapping some type name strings to arrow `DataType`s, so that passing `FieldLike`s with a `type: string` property to `sanitizeField` does not throw an error. The type names that can be passed are upper/lowercase variations of the keys of the `constructorsByTypeName` object. This does not support mapping types that need parameters, such as timestamps which need timezones. With this, it is possible to create empty tables from `SchemaLike` objects without instantiating arrow types, e.g.: ``` import { SchemaLike } from "../lancedb/arrow" // ... const schemaLike = { fields: [ { name: "id", type: "int64", nullable: true, }, { name: "vector", type: "float64", nullable: true, }, ], // ... } satisfies SchemaLike; const table = await con.createEmptyTable("test", schemaLike); ``` This change also makes `FieldLike.nullable` required since the `sanitizeField` function throws if it is undefined.	2025-10-08 04:40:06 -07:00
Lance Release	d7e02c8181	Bump version: 0.22.2-beta.1 → 0.22.2-beta.2	2025-10-06 18:10:40 +00:00
Neha Prasad	9e2a68541e	fix(node): allow undefined/omitted values for nullable vector fields (#2656 ) Problem: When a vector field is marked as nullable, users should be able to omit it or pass `undefined`, but this was throwing an error: "Table has embeddings: 'vector', but no embedding function was provided" fixes: #2646 Solution: Modified `validateSchemaEmbeddings` to check `field.nullable` before treating `undefined` values as missing embedding fields. Changes: - Fixed validation logic in `nodejs/lancedb/arrow.ts` - Enabled previously skipped test for nullable fields - Added reproduction test case Behavior: - ✅ `{ vector: undefined }` now works for nullable fields - ✅ `{}` (omitted field) now works for nullable fields - ✅ `{ vector: null }` still works (unchanged) - ✅ Non-nullable fields still properly throw errors (unchanged) --------- Co-authored-by: Will Jones <willjones127@gmail.com> Co-authored-by: neha <neha@posthog.com>	2025-10-02 10:53:05 -07:00
Lance Release	fec2a05629	Bump version: 0.22.2-beta.0 → 0.22.2-beta.1	2025-09-30 19:31:44 +00:00
Lance Release	e7e9e80b1d	Bump version: 0.22.1 → 0.22.2-beta.0	2025-09-24 22:54:54 +00:00
Will Jones	d617cdef4a	feat: add use_index parameter to merge insert operations (#2674 ) ## Summary Exposes `use_index` Merge Insert parameter, which was created upstream in https://github.com/lancedb/lance/pull/4688. ## API Examples ### Python ```python # Force table scan table.merge_insert(["id"]) \ .when_not_matched_insert_all() \ .use_index(False) \ .execute(data) ``` ### Node.js/TypeScript ```typescript // Force table scan await table.mergeInsert("id") .whenNotMatchedInsertAll() .useIndex(false) .execute(data); ``` ### Rust ```rust // Force table scan let mut builder = table.merge_insert(&["id"]); builder.when_not_matched_insert_all() .use_index(false); builder.execute(data).await?; ``` 🤖 Generated with [Claude Code](https://claude.ai/code) Co-authored-by: Claude <noreply@anthropic.com>	2025-09-24 12:50:21 -07:00
Will Jones	356d7046fd	ci: fix test failure on main (#2677) Test was in wrong position.	2025-09-24 09:46:04 -07:00
Will Jones	48e5caabda	ci(nodejs): lint for unused imports (#2673)	2025-09-23 18:49:42 -07:00
Lance Release	d6cc68f671	Bump version: 0.22.1-beta.4 → 0.22.1	2025-09-23 22:07:31 +00:00
Lance Release	55eacfa685	Bump version: 0.22.1-beta.3 → 0.22.1-beta.4	2025-09-23 22:06:45 +00:00
Neha Prasad	b0800b4b71	fix: undefined values should become null in nullable fields (#2658 ) ### Bug Fix: Undefined Values in Nullable Fields Issue: When inserting data with `undefined` values into nullable fields, LanceDB was incorrectly coercing them to default values (`false` for booleans, `NaN` for numbers, `""` for strings) instead of `null`. Fix: Modified the `makeVector()` function in `arrow.ts` to properly convert `undefined` values to `null` for nullable fields before passing data to Apache Arrow. fixes: #2645 Result: Now `{ text: undefined, number: undefined, bool: undefined }` correctly becomes `{ text: null, number: null, bool: null }` when fields are marked as nullable in the schema. Files Changed: - `nodejs/lancedb/arrow.ts` (core fix) - `nodejs/__test__/arrow.test.ts` (test coverage) - This ensures proper null handling for nullable fields as expected by users. --------- Co-authored-by: Will Jones <willjones127@gmail.com>	2025-09-23 14:29:52 -07:00
Neha Prasad	1befebf614	fix(node): handle null values in nullable boolean fields (#2657 ) ### Solution Added special handling in `makeVector` function for boolean arrays where all values are null. The fix creates a proper null bitmap using `makeData` and `arrowMakeVector` instead of relying on Apache Arrow's `vectorFromArray` which doesn't handle this edge case correctly. fixes: #2644 ### Changes - Added null value detection for boolean types in `makeVector` function - Creates proper Arrow data structure with null bitmap when all boolean values are null - Preserves existing behavior for non-null boolean values and other data types - Fixes the boolean null value bug while maintaining backward compatibility. --------- Co-authored-by: Will Jones <willjones127@gmail.com>	2025-09-23 14:07:00 -07:00
Lance Release	05a4ea646a	Bump version: 0.22.1-beta.2 → 0.22.1-beta.3	2025-09-22 04:49:00 +00:00
Jack Ye	ff71d7e552	feat: support shallow clone (#2653 ) Support shallow cloning a dataset at a specific location to create a new dataset, using the shallow_clone feature in Lance. Also introduce remote `clone` API for remote tables for this functionality.	2025-09-21 21:28:40 -07:00
Neha Prasad	2261eb95a0	fix(node): handle undefined vector fields with embedding functions (#2655 ) - Fixes issue where passing `{ vector: undefined }` with an embedding function threw "Found field not in schema" error instead of calling the embedding function like `null` or omitted fields. Changes: - Modified `rowPathsAndValues` to skip undefined values during schema inference - Added test case verifying undefined, null, and omitted vector fields all work correctly Before: `{ vector: undefined }` → Error After: `{ vector: undefined }` → Calls embedding function Closes #2647	2025-09-19 09:17:28 -07:00
Lance Release	b5a39bffec	Bump version: 0.22.1-beta.1 → 0.22.1-beta.2	2025-09-18 20:22:35 +00:00
Lance Release	6ea6884260	Bump version: 0.22.1-beta.0 → 0.22.1-beta.1	2025-09-10 20:49:43 +00:00
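
The nullable-field fixes listed above (#2656, #2658) share one rule: for a nullable field, `undefined` and an omitted key are treated like `null`, while a non-nullable field still raises an error. The sketch below illustrates that rule only. `FieldSpec`, `Row`, and `normalizeRow` are hypothetical names made up for this example, and the code is not taken from `nodejs/lancedb/arrow.ts`.

```typescript
// Hypothetical, minimal schema description; not LanceDB's real types.
interface FieldSpec {
  name: string;
  nullable: boolean;
}

type Row = Record<string, unknown>;

// Return a copy of `row` in which every schema field is present.
// Missing or undefined values become null when the field is nullable,
// and cause an error when it is not.
function normalizeRow(row: Row, fields: FieldSpec[]): Row {
  const out: Row = {};
  for (const field of fields) {
    const value = row[field.name];
    if (value === undefined || value === null) {
      if (!field.nullable) {
        throw new Error(
          `Field '${field.name}' is not nullable but no value was provided`,
        );
      }
      out[field.name] = null;
    } else {
      out[field.name] = value;
    }
  }
  return out;
}

const fields: FieldSpec[] = [
  { name: "text", nullable: true },
  { name: "number", nullable: true },
  { name: "bool", nullable: true },
];

// `text` and `number` are explicitly undefined, `bool` is omitted.
const normalized = normalizeRow({ text: undefined, number: undefined }, fields);
console.log(normalized);
```

In this sketch all three fields come out as `null` rather than as type defaults such as `false`, `NaN`, or `""`, which is the behavior the commit messages describe.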


413 Commits
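
The async generator refactor (#2744) mentions handling a `break` of the parent loop in a `finally` block. A minimal sketch of that general JavaScript pattern follows. `BatchSource`, `ArrayBatchSource`, `iterateBatches`, and `firstBatchOnly` are hypothetical stand-ins invented for illustration, not the actual `RecordBatchIterator` implementation.

```typescript
// Hypothetical batch source standing in for a native record batch stream.
interface BatchSource {
  next(): Promise<number[] | null>;
  close(): void;
}

// In-memory source that records whether close() was called.
class ArrayBatchSource implements BatchSource {
  closed = false;
  private index = 0;
  private readonly batches: number[][];

  constructor(batches: number[][]) {
    this.batches = batches;
  }

  async next(): Promise<number[] | null> {
    const batch = this.batches[this.index];
    if (batch === undefined) return null;
    this.index += 1;
    return batch;
  }

  close(): void {
    this.closed = true;
  }
}

// Yield batches until the source is exhausted. The finally block also runs
// when the consumer leaves its loop early (break, return, or throw).
async function* iterateBatches(source: BatchSource): AsyncGenerator<number[]> {
  try {
    for (;;) {
      const batch = await source.next();
      if (batch === null) return;
      yield batch;
    }
  } finally {
    source.close();
  }
}

// Read only the first batch, then report whether the source was closed.
async function firstBatchOnly(): Promise<boolean> {
  const source = new ArrayBatchSource([[1, 2], [3, 4], [5, 6]]);
  for await (const batch of iterateBatches(source)) {
    console.log(batch);
    break; // triggers the generator's finally block
  }
  return source.closed;
}
```

Leaving the `for await` loop early calls the generator's `return()` method, so the cleanup in `finally` runs without the consumer having to close the source explicitly.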