lancedb/docs/src/js/namespaces/embedding/classes/TextEmbeddingFunction.md
Will Jones 7ac5f74c80 feat!: add variable store to embeddings registry (#2112)
BREAKING CHANGE: embedding function implementations in Node need to now
call `resolveVariables()` in their constructors and should **not**
implement `toJSON()`.

This tries to address the handling of secrets. In Node, they are
currently lost. In Python, they are currently leaked into the table
schema metadata.

This PR introduces an in-memory variable store on the function registry.
It also allows embedding function definitions to label certain config
values as "sensitive", and the preprocessing logic will raise an error
if users try to pass in hard-coded values.

Closes #2110
Closes #521

---------

Co-authored-by: Weston Pace <weston.pace@gmail.com>
2025-02-24 15:52:19 -08:00

@lancedb/lancedb / embedding / TextEmbeddingFunction

Class: abstract TextEmbeddingFunction<M>

An abstract class for implementing embedding functions that take text as input.

Extends

  • EmbeddingFunction<string, M>

Type Parameters

M extends FunctionOptions = FunctionOptions

Constructors

new TextEmbeddingFunction()

new TextEmbeddingFunction<M>(): TextEmbeddingFunction<M>

Returns

TextEmbeddingFunction<M>

Inherited from

EmbeddingFunction.constructor

Methods

computeQueryEmbeddings()

computeQueryEmbeddings(data): Promise<number[] | Float32Array | Float64Array>

Compute the embedding for a single query string.

Parameters

  • data: string

Returns

Promise<number[] | Float32Array | Float64Array>

Overrides

EmbeddingFunction.computeQueryEmbeddings


computeSourceEmbeddings()

computeSourceEmbeddings(data): Promise<number[][] | Float32Array[] | Float64Array[]>

Creates a vector representation for the given values.

Parameters

  • data: string[]

Returns

Promise<number[][] | Float32Array[] | Float64Array[]>

Overrides

EmbeddingFunction.computeSourceEmbeddings


embeddingDataType()

embeddingDataType(): Float<Floats>

The data type of the embeddings.

Returns

Float<Floats>

Overrides

EmbeddingFunction.embeddingDataType


generateEmbeddings()

abstract generateEmbeddings(texts, ...args): Promise<number[][] | Float32Array[] | Float64Array[]>

Parameters

  • texts: string[]

  • ...args: any[]

Returns

Promise<number[][] | Float32Array[] | Float64Array[]>
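
The listing above suggests a division of labour: generateEmbeddings() is the only abstract method, while computeSourceEmbeddings() and computeQueryEmbeddings() are overridden by this class. The following is a minimal sketch of that shape, assuming the two overrides simply delegate to generateEmbeddings() (this page does not state how they are implemented). TextEmbedderSketch and ToyEmbedder are hypothetical stand-ins written for illustration, not code from @lancedb/lancedb.

```typescript
// Stand-in mirroring the documented method signatures; NOT the library source.
type Embeddings = number[][] | Float32Array[] | Float64Array[];

abstract class TextEmbedderSketch {
  // The one method a concrete subclass has to supply.
  abstract generateEmbeddings(texts: string[], ...args: any[]): Promise<Embeddings>;

  // Assumption: a batch of source strings is passed straight through.
  computeSourceEmbeddings(data: string[]): Promise<Embeddings> {
    return this.generateEmbeddings(data);
  }

  // Assumption: a single query is embedded as a batch of one.
  async computeQueryEmbeddings(
    data: string,
  ): Promise<number[] | Float32Array | Float64Array> {
    const result = await this.generateEmbeddings([data]);
    return result[0];
  }
}

// Toy embedder: each text becomes [character count, vowel count].
// Purely illustrative; a real subclass would call a model or an API here.
class ToyEmbedder extends TextEmbedderSketch {
  async generateEmbeddings(texts: string[]): Promise<number[][]> {
    return texts.map((t) => [t.length, (t.match(/[aeiou]/gi) ?? []).length]);
  }
}
```

With this shape, a subclass author only writes the batch method, and both the source path and the query path reuse it.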


getSensitiveKeys()

protected getSensitiveKeys(): string[]

Provide a list of keys in the function options that should be treated as sensitive. If users pass raw values for these keys, they will be rejected.

Returns

string[]

Inherited from

EmbeddingFunction.getSensitiveKeys
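
To illustrate what "rejected" could mean in practice, here is a minimal sketch of a check that accepts variable references for sensitive keys and throws on raw values. The helper assertNoRawSecrets and the "$var:" reference prefix are assumptions made for this example; neither is documented on this page.

```typescript
// Hypothetical helper, not part of @lancedb/lancedb.
const SENSITIVE_VAR_PREFIX = "$var:"; // assumed reference syntax

function assertNoRawSecrets(
  config: Record<string, unknown>,
  sensitiveKeys: string[],
): void {
  for (const key of sensitiveKeys) {
    const value = config[key];
    // A sensitive key that is simply absent is fine.
    if (value === undefined) continue;
    // Anything other than a variable reference counts as a hard-coded secret.
    if (typeof value !== "string" || !value.startsWith(SENSITIVE_VAR_PREFIX)) {
      throw new Error(
        `"${key}" is sensitive; pass a variable reference instead of a raw value`,
      );
    }
  }
}

// A subclass would report its sensitive keys, e.g. ["apiKey"].
assertNoRawSecrets({ model: "toy-v1", apiKey: "$var:api_key" }, ["apiKey"]); // accepted
```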


init()?

optional init(): Promise<void>

Optionally load any resources needed for the embedding function.

This method is called after the embedding function has been initialized but before any embeddings are computed. It is useful for loading local models or other resources that are needed for the embedding function to work.

Returns

Promise<void>

Inherited from

EmbeddingFunction.init
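
The optional init() hook can be pictured as follows. This is a sketch under stated assumptions: InitializableSketch, LocalModelEmbedderSketch, and prepareSketch are hypothetical names, and the "model" is a two-number array standing in for whatever a real implementation would load.

```typescript
// Hypothetical interface capturing only the optional init() hook.
interface InitializableSketch {
  init?(): Promise<void>;
}

class LocalModelEmbedderSketch implements InitializableSketch {
  private weights?: Float32Array;

  // Stand-in for loading a local model or another heavy resource.
  async init(): Promise<void> {
    this.weights = new Float32Array([0.5, 0.25]);
  }

  embed(text: string): number[] {
    if (!this.weights) {
      throw new Error("init() must be awaited before computing embeddings");
    }
    // Toy computation: scale each weight by the text length.
    return Array.from(this.weights, (w) => w * text.length);
  }
}

// Call init() only when the function defines it, then hand the function back.
async function prepareSketch<T extends InitializableSketch>(fn: T): Promise<T> {
  await fn.init?.();
  return fn;
}
```

Keeping the loading step out of the constructor lets construction stay synchronous while the expensive work happens once, before the first embedding is requested.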


ndims()

ndims(): undefined | number

The number of dimensions of the embeddings.

Returns

undefined | number

Inherited from

EmbeddingFunction.ndims


resolveVariables()

protected resolveVariables(config): Partial<M>

Apply variables to the config, substituting values held in the function registry's in-memory variable store.

Parameters

  • config: Partial<M>

Returns

Partial<M>

Inherited from

EmbeddingFunction.resolveVariables
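
As a rough picture of what applying variables to a config could look like, the sketch below replaces string references with values from an in-memory store. VariableStoreSketch, resolveVariablesSketch, and the "$var:<name>" reference syntax are assumptions made for this example and are not taken from this page.

```typescript
// Hypothetical in-memory store; a stand-in for the registry's variable store.
class VariableStoreSketch {
  private readonly vars = new Map<string, string>();

  set(name: string, value: string): void {
    this.vars.set(name, value);
  }

  get(name: string): string | undefined {
    return this.vars.get(name);
  }
}

const VAR_REF_PREFIX = "$var:"; // assumed reference syntax

// Return a copy of `config` in which every "$var:<name>" string is replaced
// by the stored value. The input object is left untouched.
function resolveVariablesSketch(
  config: Record<string, unknown>,
  store: VariableStoreSketch,
): Record<string, unknown> {
  const resolved: Record<string, unknown> = { ...config };
  for (const [key, value] of Object.entries(config)) {
    if (typeof value === "string" && value.startsWith(VAR_REF_PREFIX)) {
      const varName = value.slice(VAR_REF_PREFIX.length);
      const storedValue = store.get(varName);
      if (storedValue === undefined) {
        throw new Error(`Variable "${varName}" is not set`);
      }
      resolved[key] = storedValue;
    }
  }
  return resolved;
}
```

Because the secret lives only in the store, the config that callers write down contains the reference rather than the value.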


sourceField()

sourceField(): [DataType<Type, any>, Map<string, EmbeddingFunction<any, FunctionOptions>>]

sourceField is used in combination with LanceSchema to provide a declarative data model.

Returns

[DataType<Type, any>, Map<string, EmbeddingFunction<any, FunctionOptions>>]

See

LanceSchema

Overrides

EmbeddingFunction.sourceField


toJSON()

toJSON(): Record<string, any>

Get the original arguments to the constructor, to serialize them so they can be used to recreate the embedding function later.

Returns

Record<string, any>

Inherited from

EmbeddingFunction.toJSON
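
The round trip described here can be sketched as follows. SerializableEmbedderSketch and ToyOptions are hypothetical, and the "$var:api_key" value is an assumed variable-reference syntax. The only behaviour relied on is standard JavaScript: JSON.stringify() calls an object's toJSON() method when one exists.

```typescript
// Hypothetical option bag; in the real class the options type is M.
interface ToyOptions {
  model?: string;
  apiKey?: string;
}

class SerializableEmbedderSketch {
  private readonly original: ToyOptions;

  constructor(options: ToyOptions = {}) {
    // Keep an untouched copy of the constructor arguments for serialization.
    this.original = { ...options };
  }

  // Hand back the original arguments, not any values derived from them.
  toJSON(): Record<string, any> {
    return { ...this.original };
  }
}

const embedderToStore = new SerializableEmbedderSketch({
  model: "toy-v1",
  apiKey: "$var:api_key", // assumed reference syntax; no raw secret is stored
});

// JSON.stringify picks up toJSON() automatically.
const storedArgs: string = JSON.stringify(embedderToStore);

// Later, the stored arguments are enough to construct an equivalent function.
const revivedEmbedder = new SerializableEmbedderSketch(JSON.parse(storedArgs));
```

Serializing the original arguments rather than resolved values means that whatever is persisted contains only what the caller originally passed in.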


vectorField()

vectorField(optionsOrDatatype?): [DataType<Type, any>, Map<string, EmbeddingFunction<any, FunctionOptions>>]

vectorField is used in combination with LanceSchema to provide a declarative data model.

Parameters

  • optionsOrDatatype?: DataType<Type, any> | Partial<FieldOptions<DataType<Type, any>>>

    The options for the field.

Returns

[DataType<Type, any>, Map<string, EmbeddingFunction<any, FunctionOptions>>]

See

LanceSchema

Inherited from

EmbeddingFunction.vectorField