feat: add python Permutation class to mimic hugging face dataset and provide pytorch dataloader (#2725)

This commit is contained in:
Weston Pace
2025-11-06 16:15:33 -08:00
committed by GitHub
parent 6ddd271627
commit aeac9c7644
24 changed files with 2071 additions and 126 deletions

View File

@@ -64,6 +64,36 @@ builder.filter("age > 18 AND status = 'active'");
***
### persist()
```ts
persist(connection, tableName): PermutationBuilder
```
Configure the permutation to be persisted.
#### Parameters
* **connection**: [`Connection`](Connection.md)
The connection to persist the permutation to
* **tableName**: `string`
The name of the table to create
#### Returns
[`PermutationBuilder`](PermutationBuilder.md)
A new PermutationBuilder instance
#### Example
```ts
builder.persist(connection, "permutation_table");
```
***
### shuffle()
```ts
@@ -98,15 +128,15 @@ builder.shuffle({ seed: 42, clumpSize: 10 });
### splitCalculated()
```ts
splitCalculated(calculation): PermutationBuilder
splitCalculated(options): PermutationBuilder
```
Configure calculated splits for the permutation.
#### Parameters
* **calculation**: `string`
SQL expression for calculating splits
* **options**: [`SplitCalculatedOptions`](../interfaces/SplitCalculatedOptions.md)
Configuration for calculated splitting
#### Returns

View File

@@ -77,6 +77,7 @@
- [RemovalStats](interfaces/RemovalStats.md)
- [RetryConfig](interfaces/RetryConfig.md)
- [ShuffleOptions](interfaces/ShuffleOptions.md)
- [SplitCalculatedOptions](interfaces/SplitCalculatedOptions.md)
- [SplitHashOptions](interfaces/SplitHashOptions.md)
- [SplitRandomOptions](interfaces/SplitRandomOptions.md)
- [SplitSequentialOptions](interfaces/SplitSequentialOptions.md)

View File

@@ -0,0 +1,23 @@
[**@lancedb/lancedb**](../README.md) • **Docs**
***
[@lancedb/lancedb](../globals.md) / SplitCalculatedOptions
# Interface: SplitCalculatedOptions
## Properties
### calculation
```ts
calculation: string;
```
***
### splitNames?
```ts
optional splitNames: string[];
```

View File

@@ -24,6 +24,14 @@ optional discardWeight: number;
***
### splitNames?
```ts
optional splitNames: string[];
```
***
### splitWeights
```ts

View File

@@ -37,3 +37,11 @@ optional ratios: number[];
```ts
optional seed: number;
```
***
### splitNames?
```ts
optional splitNames: string[];
```

View File

@@ -29,3 +29,11 @@ optional fixed: number;
```ts
optional ratios: number[];
```
***
### splitNames?
```ts
optional splitNames: string[];
```