mirror of https://github.com/lancedb/lancedb.git
synced 2025-12-23 05:19:58 +00:00
Compare commits: 382 Commits, python-v0. ... v0.22.0-be
| Author | SHA1 | Date | |
|---|---|---|---|
| | 4877bb6072 | | |
| | 4dd399ca29 | | |
| | e6f1da31dc | | |
| | a9ea785b15 | | |
| | cc38453391 | | |
| | 47747287b6 | | |
| | 0847e666a0 | | |
| | 981f8427e6 | | |
| | f6846004ca | | |
| | faf8973624 | | |
| | fabe37274f | | |
| | 6839ac3509 | | |
| | b88422e515 | | |
| | 8d60685ede | | |
| | 04285a4a4e | | |
| | d4a41b5663 | | |
| | adc3daa462 | | |
| | acbfa6c012 | | |
| | d602e9f98c | | |
| | ad09234d59 | | |
| | 0c34ffb252 | | |
| | d9f333d828 | | |
| | bb809abd4b | | |
| | c87530f7a3 | | |
| | 1eb1beecd6 | | |
| | ce550e6c45 | | |
| | d3bae1f3a3 | | |
| | dcf53c4506 | | |
| | 941eada703 | | |
| | ed640a76d9 | | |
| | 296205ef96 | | |
| | 16beaaa656 | | |
| | 4ff87b1f4a | | |
| | 0532ef2358 | | |
| | dcf7334c1f | | |
| | 8ffe992a6f | | |
| | 9d683e4f0b | | |
| | 0a1ea1858d | | |
| | 7d0127b376 | | |
| | 02595dc475 | | |
| | f23327af79 | | |
| | c7afa724dd | | |
| | c359cec504 | | |
| | fe76496a59 | | |
| | 67ec1fe75c | | |
| | 70d9b04ba5 | | |
| | b0d4a79c35 | | |
| | f79295c697 | | |
| | 381fad9b65 | | |
| | 055bf91d3e | | |
| | 050f0086b8 | | |
| | 10fa23e0d6 | | |
| | 43d9fc28b0 | | |
| | f45f0d0431 | | |
| | b9e3c36d82 | | |
| | 3cd7dd3375 | | |
| | 12d4ce4cfe | | |
| | 3d1f102087 | | |
| | 81afd8a42f | | |
| | c2aa03615a | | |
| | d2c6759e7f | | |
| | 94fb9f364a | | |
| | fbff244ed8 | | |
| | 7e7466d224 | | |
| | cceaf27d79 | | |
| | 7a15337e03 | | |
| | 96c66fd087 | | |
| | 0579303602 | | |
| | 75edb8756c | | |
| | 88283110f4 | | |
| | b3a637fdeb | | |
| | ce24457531 | | |
| | 087fe6343d | | |
| | ab8cbe62dd | | |
| | f076bb41f4 | | |
| | 902fb83d54 | | |
| | 779118339f | | |
| | 03b62599d7 | | |
| | 4c999fb651 | | |
| | 6d23d32ab5 | | |
| | 704cec34e1 | | |
| | a300a238db | | |
| | a41ff1df0a | | |
| | 77b005d849 | | |
| | 167fccc427 | | |
| | 2bffbcefa5 | | |
| | 905552f993 | | |
| | e4898c9313 | | |
| | cab36d94b2 | | |
| | b64252d4fd | | |
| | 6fc006072c | | |
| | d4bb59b542 | | |
| | 6b2dd6de51 | | |
| | dbccd9e4f1 | | |
| | b12ebfed4c | | |
| | 1dadb2aefa | | |
| | eb9784d7f2 | | |
| | ba755626cc | | |
| | 7760799cb8 | | |
| | 4beb2d2877 | | |
| | a00b8595d1 | | |
| | 9c8314b4fd | | |
| | c625b6f2b2 | | |
| | bec8fe6547 | | |
| | dc1150c011 | | |
| | afaefc6264 | | |
| | cb70ff8cee | | |
| | cbb5a841b1 | | |
| | c72f6770fd | | |
| | e5a80a5e86 | | |
| | 8d0a7fad1f | | |
| | b80d4d0134 | | |
| | 9645fe52c2 | | |
| | b77314168d | | |
| | e08d45e090 | | |
| | 2e3ddb8382 | | |
| | 627ca4c810 | | |
| | f8dae4ffe9 | | |
| | 9eb6119468 | | |
| | 59b57e30ed | | |
| | fec8d58f06 | | |
| | 84ded9d678 | | |
| | 65696d9713 | | |
| | e2f2ea32e4 | | |
| | d5f2eca754 | | |
| | 7fa455a8a5 | | |
| | 8f42b5874e | | |
| | 274f19f560 | | |
| | fbcbc75b5b | | |
| | 008f389bd0 | | |
| | 91af6518d9 | | |
| | af6819762c | | |
| | 7acece493d | | |
| | 20e017fedc | | |
| | 74e578b3c8 | | |
| | d92d9eb3d2 | | |
| | b6cdce7bc9 | | |
| | 316b406265 | | |
| | 8825c7c1dd | | |
| | 81c85ff702 | | |
| | 570f2154d5 | | |
| | 0525c055fc | | |
| | 38d11291da | | |
| | 258e682574 | | |
| | d7afa600b8 | | |
| | 5c7303ab2e | | |
| | 5895ef4039 | | |
| | 0528cd858a | | |
| | 6582f43422 | | |
| | 5c7f63388d | | |
| | d0bc671cac | | |
| | d37e17593d | | |
| | cb726d370e | | |
| | 23ee132546 | | |
| | 7fa090d330 | | |
| | 07bc1c5397 | | |
| | d7a9dbb9fc | | |
| | 00487afc7d | | |
| | 1902d65aad | | |
| | c4fbb65b8e | | |
| | 875ed7ae6f | | |
| | 95a46a57ba | | |
| | 51561e31a0 | | |
| | 7b19120578 | | |
| | 745c34a6a9 | | |
| | db8fa2454d | | |
| | a67a7b4b42 | | |
| | 496846e532 | | |
| | dadcfebf8e | | |
| | 67033dbd7f | | |
| | 05a85cfc2a | | |
| | 40c5d3d72b | | |
| | 198f0f80c6 | | |
| | e3f2fd3892 | | |
| | f401ccc599 | | |
| | 81b59139f8 | | |
| | 1026781ab6 | | |
| | 9c699b8cd9 | | |
| | 34bec59bc3 | | |
| | a5fbbf0d66 | | |
| | b42721167b | | |
| | 543dec9ff0 | | |
| | 04f962f6b0 | | |
| | 19e896ff69 | | |
| | 272e4103b2 | | |
| | 75c257ebb6 | | |
| | 9ee152eb42 | | |
| | c9ae1b1737 | | |
| | 89dc80c42a | | |
| | 7b020ac799 | | |
| | 529e774bbb | | |
| | 7c12239305 | | |
| | d83424d6b4 | | |
| | 8bf89f887c | | |
| | b2160b2304 | | |
| | 1bb82597be | | |
| | e4eee38b3c | | |
| | 64fc2be503 | | |
| | dc8054e90d | | |
| | 1684940946 | | |
| | 695813463c | | |
| | ed594b0f76 | | |
| | cee2b5ea42 | | |
| | f315f9665a | | |
| | 5deb26bc8b | | |
| | 3cc670ac38 | | |
| | 4ade3e31e2 | | |
| | a222d2cd91 | | |
| | 508e621f3d | | |
| | a1a0472f3f | | |
| | 3425a6d339 | | |
| | af54e0ce06 | | |
| | 089905fe8f | | |
| | 554939e5d2 | | |
| | 7a13814922 | | |
| | e9f25f6a12 | | |
| | 419a433244 | | |
| | a9311c4dc0 | | |
| | 178bcf9c90 | | |
| | b9be092cb1 | | |
| | e8c0c52315 | | |
| | a60fa0d3b7 | | |
| | 726d629b9b | | |
| | b493f56dee | | |
| | a8b5ad7e74 | | |
| | f8f6264883 | | |
| | d8517117f1 | | |
| | ab66dd5ed2 | | |
| | cbb9a7877c | | |
| | b7fc223535 | | |
| | 1fdaf7a1a4 | | |
| | d11819c90c | | |
| | 9b902272f1 | | |
| | 8c0622fa2c | | |
| | 2191f948c3 | | |
| | acc3b03004 | | |
| | 7f091b8c8e | | |
| | c19bdd9a24 | | |
| | dad0ff5cd2 | | |
| | a705621067 | | |
| | 39614fdb7d | | |
| | 96d534d4bc | | |
| | 5051d30d09 | | |
| | db853c4041 | | |
| | 76d1d22bdc | | |
| | d8746c61c6 | | |
| | 1a66df2627 | | |
| | 44670076c1 | | |
| | 92f0b16e46 | | |
| | 1620ba3508 | | |
| | 3ae90dde80 | | |
| | 4f07fea6df | | |
| | 3d7d82cf86 | | |
| | edc4e40a7b | | |
| | ca3806a02f | | |
| | 35cff12e31 | | |
| | c6c20cb2bd | | |
| | 26080ee4c1 | | |
| | ef3a2b5357 | | |
| | c42a201389 | | |
| | 24e42ccd4d | | |
| | 8a50944061 | | |
| | 40e066bc7c | | |
| | b3ad105fa0 | | |
| | 6e701d3e1b | | |
| | 2248aa9508 | | |
| | a6fa69ab89 | | |
| | b3a4efd587 | | |
| | 4708b60bb1 | | |
| | 080ea2f9a4 | | |
| | 32fdde23f8 | | |
| | c44e5c046c | | |
| | f23aa0a793 | | |
| | 83fc2b1851 | | |
| | 56aa133ee6 | | |
| | 27d9e5c596 | | |
| | ec8271931f | | |
| | 6c6966600c | | |
| | 2e170c3c7b | | |
| | fd92e651d1 | | |
| | c298482ee1 | | |
| | d59f64b5a3 | | |
| | 30ed8c4c43 | | |
| | 4a2cdbf299 | | |
| | 657843d9e9 | | |
| | 1cd76b8498 | | |
| | a38f784081 | | |
| | 647dee4e94 | | |
| | 0844c2dd64 | | |
| | fd2692295c | | |
| | d4ea50fba1 | | |
| | 0d42297cf8 | | |
| | a6d4125cbf | | |
| | 5c32a99e61 | | |
| | cefaa75b24 | | |
| | bd62c2384f | | |
| | f0bc08c0d7 | | |
| | e52ac79c69 | | |
| | f091f57594 | | |
| | a997fd4108 | | |
| | 1486514ccc | | |
| | a505bc3965 | | |
| | c1738250a3 | | |
| | 1ee63984f5 | | |
| | 2eb2c8862a | | |
| | 4ea8e178d3 | | |
| | e4485a630e | | |
| | fb95f9b3bd | | |
| | 625bab3f21 | | |
| | e59f9382a0 | | |
| | fdee7ba477 | | |
| | c44fa3abc4 | | |
| | fc43aac0ed | | |
| | e67cd0baf9 | | |
| | 26dab93f2a | | |
| | b9bdb8d937 | | |
| | a1d1833a40 | | |
| | a547c523c2 | | |
| | dc8b75feab | | |
| | c1600cdc06 | | |
| | f5dee46970 | | |
| | 346cbf8bf7 | | |
| | 3c7dfe9f28 | | |
| | f52d05d3fa | | |
| | c321cccc12 | | |
| | cba14a5743 | | |
| | 72057b743d | | |
| | 698f329598 | | |
| | 79fa745130 | | |
| | 2ad71bdeca | | |
| | 7c13615096 | | |
| | f882f5b69a | | |
| | a68311a893 | | |
| | 846a5cea33 | | |
| | e3dec647b5 | | |
| | c58104cecc | | |
| | b3b5362632 | | |
| | abe06fee3d | | |
| | 93a82fd371 | | |
| | 0d379e6ffa | | |
| | e1388bdfdd | | |
| | 315a24c2bc | | |
| | 6dd4cf6038 | | |
| | f97e751b3c | | |
| | e803a626a1 | | |
| | 9403254442 | | |
| | b2a38ac366 | | |
| | bdb6c09c3b | | |
| | 2bfdef2624 | | |
| | 7982d5c082 | | |
| | 7ff6ec7fe3 | | |
| | ba1ded933a | | |
| | b595d8a579 | | |
| | 2a1d6d8abf | | |
| | 440a466a13 | | |
| | b9afd9c860 | | |
| | a6b6f6a806 | | |
| | ae1548b507 | | |
| | 4e03ee82bc | | |
| | 46a6846d07 | | |
| | a207213358 | | |
| | 6c321c694a | | |
| | 5c00b2904c | | |
| | 14677d7c18 | | |
| | dd22a379b2 | | |
| | 7747c9bcbf | | |
| | c9d6fc43a6 | | |
| | 581bcfbb88 | | |
| | 3750639b5f | | |
| | e744d54460 | | |
| | 9d1ce4b5a5 | | |
| | 729ce5e542 | | |
| | de6739e7ec | | |
| | 495216efdb | | |
| | a3b45a4d00 | | |
| | c316c2f532 | | |
| | 3966b16b63 | | |
| | 5661cc15ac | | |
| | 4e7220400f | | |
| | ae4928fe77 | | |
| | e80a405dee | | |
| | a53e19e386 | | |
@@ -1,5 +1,5 @@
[tool.bumpversion]
current_version = "0.18.0-beta.0"
current_version = "0.22.0-beta.1"
parse = """(?x)
(?P<major>0|[1-9]\\d*)\\.
(?P<minor>0|[1-9]\\d*)\\.
@@ -50,11 +50,6 @@ pre_commit_hooks = [
optional_value = "final"
values = ["beta", "final"]

[[tool.bumpversion.files]]
filename = "node/package.json"
replace = "\"version\": \"{new_version}\","
search = "\"version\": \"{current_version}\","

[[tool.bumpversion.files]]
filename = "nodejs/package.json"
replace = "\"version\": \"{new_version}\","
@@ -66,54 +61,8 @@ glob = "nodejs/npm/*/package.json"
replace = "\"version\": \"{new_version}\","
search = "\"version\": \"{current_version}\","

# vectodb node binary packages
[[tool.bumpversion.files]]
glob = "node/package.json"
replace = "\"@lancedb/vectordb-darwin-arm64\": \"{new_version}\""
search = "\"@lancedb/vectordb-darwin-arm64\": \"{current_version}\""

[[tool.bumpversion.files]]
glob = "node/package.json"
replace = "\"@lancedb/vectordb-darwin-x64\": \"{new_version}\""
search = "\"@lancedb/vectordb-darwin-x64\": \"{current_version}\""

[[tool.bumpversion.files]]
glob = "node/package.json"
replace = "\"@lancedb/vectordb-linux-arm64-gnu\": \"{new_version}\""
search = "\"@lancedb/vectordb-linux-arm64-gnu\": \"{current_version}\""

[[tool.bumpversion.files]]
glob = "node/package.json"
replace = "\"@lancedb/vectordb-linux-x64-gnu\": \"{new_version}\""
search = "\"@lancedb/vectordb-linux-x64-gnu\": \"{current_version}\""

[[tool.bumpversion.files]]
glob = "node/package.json"
replace = "\"@lancedb/vectordb-linux-arm64-musl\": \"{new_version}\""
search = "\"@lancedb/vectordb-linux-arm64-musl\": \"{current_version}\""

[[tool.bumpversion.files]]
glob = "node/package.json"
replace = "\"@lancedb/vectordb-linux-x64-musl\": \"{new_version}\""
search = "\"@lancedb/vectordb-linux-x64-musl\": \"{current_version}\""

[[tool.bumpversion.files]]
glob = "node/package.json"
replace = "\"@lancedb/vectordb-win32-x64-msvc\": \"{new_version}\""
search = "\"@lancedb/vectordb-win32-x64-msvc\": \"{current_version}\""

[[tool.bumpversion.files]]
glob = "node/package.json"
replace = "\"@lancedb/vectordb-win32-arm64-msvc\": \"{new_version}\""
search = "\"@lancedb/vectordb-win32-arm64-msvc\": \"{current_version}\""

# Cargo files
# ------------
[[tool.bumpversion.files]]
filename = "rust/ffi/node/Cargo.toml"
replace = "\nversion = \"{new_version}\""
search = "\nversion = \"{current_version}\""

[[tool.bumpversion.files]]
filename = "rust/lancedb/Cargo.toml"
replace = "\nversion = \"{new_version}\""
@@ -34,6 +34,10 @@ rustflags = ["-C", "target-cpu=haswell", "-C", "target-feature=+avx2,+fma,+f16c"
[target.x86_64-unknown-linux-musl]
rustflags = ["-C", "target-cpu=haswell", "-C", "target-feature=-crt-static,+avx2,+fma,+f16c"]

[target.aarch64-unknown-linux-musl]
linker = "aarch64-linux-musl-gcc"
rustflags = ["-C", "target-feature=-crt-static"]

[target.aarch64-apple-darwin]
rustflags = ["-C", "target-cpu=apple-m1", "-C", "target-feature=+neon,+fp16,+fhm,+dotprod"]

@@ -44,4 +48,4 @@ rustflags = ["-Ctarget-feature=+crt-static"]

# Experimental target for Arm64 Windows
[target.aarch64-pc-windows-msvc]
rustflags = ["-Ctarget-feature=+crt-static"]
rustflags = ["-Ctarget-feature=+crt-static"]
@@ -36,8 +36,7 @@ runs:
args: ${{ inputs.args }}
before-script-linux: |
set -e
yum install -y openssl-devel \
&& curl -L https://github.com/protocolbuffers/protobuf/releases/download/v24.4/protoc-24.4-linux-$(uname -m).zip > /tmp/protoc.zip \
curl -L https://github.com/protocolbuffers/protobuf/releases/download/v24.4/protoc-24.4-linux-$(uname -m).zip > /tmp/protoc.zip \
&& unzip /tmp/protoc.zip -d /usr/local \
&& rm /tmp/protoc.zip
- name: Build Arm Manylinux Wheel
@@ -52,7 +51,7 @@ runs:
args: ${{ inputs.args }}
before-script-linux: |
set -e
yum install -y openssl-devel clang \
yum install -y clang \
&& curl -L https://github.com/protocolbuffers/protobuf/releases/download/v24.4/protoc-24.4-linux-aarch_64.zip > /tmp/protoc.zip \
&& unzip /tmp/protoc.zip -d /usr/local \
&& rm /tmp/protoc.zip
10
.github/workflows/cargo-publish.yml
vendored
@@ -5,8 +5,8 @@ on:
tags-ignore:
# We don't publish pre-releases for Rust. Crates.io is just a source
# distribution, so we don't need to publish pre-releases.
- 'v*-beta*'
- '*-v*' # for example, python-vX.Y.Z
- "v*-beta*"
- "*-v*" # for example, python-vX.Y.Z

env:
# This env var is used by Swatinem/rust-cache@v2 for the cache
@@ -19,6 +19,8 @@ env:
jobs:
build:
runs-on: ubuntu-22.04
permissions:
id-token: write
timeout-minutes: 30
# Only runs on tags that matches the make-release action
if: startsWith(github.ref, 'refs/tags/v')
@@ -31,6 +33,8 @@ jobs:
run: |
sudo apt update
sudo apt install -y protobuf-compiler libssl-dev
- uses: rust-lang/crates-io-auth-action@v1
id: auth
- name: Publish the package
run: |
cargo publish -p lancedb --all-features --token ${{ secrets.CARGO_REGISTRY_TOKEN }}
cargo publish -p lancedb --all-features --token ${{ steps.auth.outputs.token }}
24
.github/workflows/docs.yml
vendored
@@ -18,17 +18,24 @@ concurrency:
group: "pages"
cancel-in-progress: true

env:
# This reduces the disk space needed for the build
RUSTFLAGS: "-C debuginfo=0"
# according to: https://matklad.github.io/2021/09/04/fast-rust-builds.html
# CI builds are faster with incremental disabled.
CARGO_INCREMENTAL: "0"

jobs:
# Single deploy job since we're just deploying
build:
environment:
name: github-pages
url: ${{ steps.deployment.outputs.page_url }}
runs-on: buildjet-8vcpu-ubuntu-2204
runs-on: ubuntu-24.04
steps:
- name: Checkout
uses: actions/checkout@v4
- name: Install dependecies needed for ubuntu
- name: Install dependencies needed for ubuntu
run: |
sudo apt install -y protobuf-compiler libssl-dev
rustup update && rustup default
@@ -38,6 +45,7 @@ jobs:
python-version: "3.10"
cache: "pip"
cache-dependency-path: "docs/requirements.txt"
- uses: Swatinem/rust-cache@v2
- name: Build Python
working-directory: python
run: |
@@ -48,23 +56,11 @@ jobs:
with:
node-version: 20
cache: 'npm'
cache-dependency-path: node/package-lock.json
- uses: Swatinem/rust-cache@v2
- name: Install node dependencies
working-directory: node
run: |
sudo apt update
sudo apt install -y protobuf-compiler libssl-dev
- name: Build node
working-directory: node
run: |
npm ci
npm run build
npm run tsc
- name: Create markdown files
working-directory: node
run: |
npx typedoc --plugin typedoc-plugin-markdown --out ../docs/src/javascript src/index.ts
- name: Build docs
working-directory: docs
run: |
48
.github/workflows/docs_test.yml
vendored
@@ -58,51 +58,3 @@ jobs:
run: |
cd docs/test/python
for d in *; do cd "$d"; echo "$d".py; python "$d".py; cd ..; done
test-node:
name: Test doc nodejs code
runs-on: ubuntu-24.04
timeout-minutes: 60
strategy:
fail-fast: false
steps:
- name: Checkout
uses: actions/checkout@v4
with:
fetch-depth: 0
lfs: true
- name: Print CPU capabilities
run: cat /proc/cpuinfo
- name: Set up Node
uses: actions/setup-node@v4
with:
node-version: 20
- name: Install protobuf
run: |
sudo apt update
sudo apt install -y protobuf-compiler
- name: Install dependecies needed for ubuntu
run: |
sudo apt install -y libssl-dev
rustup update && rustup default
- name: Rust cache
uses: swatinem/rust-cache@v2
- name: Install node dependencies
run: |
sudo swapoff -a
sudo fallocate -l 8G /swapfile
sudo chmod 600 /swapfile
sudo mkswap /swapfile
sudo swapon /swapfile
sudo swapon --show
cd node
npm ci
npm run build-release
cd ../docs
npm install
- name: Test
env:
LANCEDB_URI: ${{ secrets.LANCEDB_URI }}
LANCEDB_DEV_API_KEY: ${{ secrets.LANCEDB_DEV_API_KEY }}
run: |
cd docs
npm t
6
.github/workflows/java-publish.yml
vendored
@@ -43,7 +43,7 @@ jobs:
- uses: Swatinem/rust-cache@v2
- uses: actions-rust-lang/setup-rust-toolchain@v1
with:
toolchain: "1.79.0"
toolchain: "1.81.0"
cache-workspaces: "./java/core/lancedb-jni"
# Disable full debug symbol generation to speed up CI build and keep memory down
# "1" means line tables only, which is useful for panic tracebacks.
@@ -97,7 +97,7 @@ jobs:
- name: Dry run
if: github.event_name == 'pull_request'
run: |
mvn --batch-mode -DskipTests package
mvn --batch-mode -DskipTests -Drust.release.build=true package
- name: Set github
run: |
git config --global user.email "LanceDB Github Runner"
@@ -108,7 +108,7 @@ jobs:
echo "use-agent" >> ~/.gnupg/gpg.conf
echo "pinentry-mode loopback" >> ~/.gnupg/gpg.conf
export GPG_TTY=$(tty)
mvn --batch-mode -DskipTests -DpushChanges=false -Dgpg.passphrase=${{ secrets.GPG_PASSPHRASE }} deploy -P deploy-to-ossrh
mvn --batch-mode -DskipTests -Drust.release.build=true -DpushChanges=false -Dgpg.passphrase=${{ secrets.GPG_PASSPHRASE }} deploy -P deploy-to-ossrh
env:
SONATYPE_USER: ${{ secrets.SONATYPE_USER }}
SONATYPE_TOKEN: ${{ secrets.SONATYPE_TOKEN }}
7
.github/workflows/java.yml
vendored
@@ -35,6 +35,9 @@ jobs:
- uses: Swatinem/rust-cache@v2
with:
workspaces: java/core/lancedb-jni
- uses: actions-rust-lang/setup-rust-toolchain@v1
with:
components: rustfmt
- name: Run cargo fmt
run: cargo fmt --check
working-directory: ./java/core/lancedb-jni
@@ -68,6 +71,9 @@ jobs:
- uses: Swatinem/rust-cache@v2
with:
workspaces: java/core/lancedb-jni
- uses: actions-rust-lang/setup-rust-toolchain@v1
with:
components: rustfmt
- name: Run cargo fmt
run: cargo fmt --check
working-directory: ./java/core/lancedb-jni
@@ -110,4 +116,3 @@ jobs:
-Djdk.reflect.useDirectMethodHandle=false \
-Dio.netty.tryReflectionSetAccessible=true"
JAVA_HOME=$JAVA_17 mvn clean test

9
.github/workflows/make-release-commit.yml
vendored
@@ -84,6 +84,7 @@ jobs:
run: |
pip install bump-my-version PyGithub packaging
bash ci/bump_version.sh ${{ inputs.type }} ${{ inputs.bump-minor }} v $COMMIT_BEFORE_BUMP
bash ci/update_lockfiles.sh --amend
- name: Push new version tag
if: ${{ !inputs.dry_run }}
uses: ad-m/github-push-action@master
@@ -92,11 +93,3 @@ jobs:
github_token: ${{ secrets.LANCEDB_RELEASE_TOKEN }}
branch: ${{ github.ref }}
tags: true
- uses: ./.github/workflows/update_package_lock
if: ${{ !inputs.dry_run && inputs.other }}
with:
github_token: ${{ secrets.GITHUB_TOKEN }}
- uses: ./.github/workflows/update_package_lock_nodejs
if: ${{ !inputs.dry_run && inputs.other }}
with:
github_token: ${{ secrets.GITHUB_TOKEN }}
147
.github/workflows/node.yml
vendored
@@ -1,147 +0,0 @@
name: Node

on:
push:
branches:
- main
pull_request:
paths:
- node/**
- rust/ffi/node/**
- .github/workflows/node.yml
- docker-compose.yml

concurrency:
group: ${{ github.workflow }}-${{ github.event.pull_request.number || github.ref }}
cancel-in-progress: true

env:
# Disable full debug symbol generation to speed up CI build and keep memory down
# "1" means line tables only, which is useful for panic tracebacks.
#
# Use native CPU to accelerate tests if possible, especially for f16
# target-cpu=haswell fixes failing ci build
RUSTFLAGS: "-C debuginfo=1 -C target-cpu=haswell -C target-feature=+f16c,+avx2,+fma"
RUST_BACKTRACE: "1"

jobs:
linux:
name: Linux (Node ${{ matrix.node-version }})
timeout-minutes: 30
strategy:
matrix:
node-version: [ "18", "20" ]
runs-on: "ubuntu-22.04"
defaults:
run:
shell: bash
working-directory: node
steps:
- uses: actions/checkout@v4
with:
fetch-depth: 0
lfs: true
- uses: actions/setup-node@v3
with:
node-version: ${{ matrix.node-version }}
cache: 'npm'
cache-dependency-path: node/package-lock.json
- uses: Swatinem/rust-cache@v2
- name: Install dependencies
run: |
sudo apt update
sudo apt install -y protobuf-compiler libssl-dev
- name: Build
run: |
npm ci
npm run build
npm run pack-build
npm install --no-save ./dist/lancedb-vectordb-*.tgz
# Remove index.node to test with dependency installed
rm index.node
- name: Test
run: npm run test
macos:
timeout-minutes: 30
runs-on: "macos-13"
defaults:
run:
shell: bash
working-directory: node
steps:
- uses: actions/checkout@v4
with:
fetch-depth: 0
lfs: true
- uses: actions/setup-node@v3
with:
node-version: 20
cache: 'npm'
cache-dependency-path: node/package-lock.json
- uses: Swatinem/rust-cache@v2
- name: Install dependencies
run: brew install protobuf
- name: Build
run: |
npm ci
npm run build
npm run pack-build
npm install --no-save ./dist/lancedb-vectordb-*.tgz
# Remove index.node to test with dependency installed
rm index.node
- name: Test
run: |
npm run test
aws-integtest:
timeout-minutes: 45
runs-on: "ubuntu-22.04"
defaults:
run:
shell: bash
working-directory: node
env:
AWS_ACCESS_KEY_ID: ACCESSKEY
AWS_SECRET_ACCESS_KEY: SECRETKEY
AWS_DEFAULT_REGION: us-west-2
# this one is for s3
AWS_ENDPOINT: http://localhost:4566
# this one is for dynamodb
DYNAMODB_ENDPOINT: http://localhost:4566
ALLOW_HTTP: true
steps:
- uses: actions/checkout@v4
with:
fetch-depth: 0
lfs: true
- uses: actions/setup-node@v3
with:
node-version: 20
cache: 'npm'
cache-dependency-path: node/package-lock.json
- name: start local stack
run: docker compose -f ../docker-compose.yml up -d --wait
- name: create s3
run: aws s3 mb s3://lancedb-integtest --endpoint $AWS_ENDPOINT
- name: create ddb
run: |
aws dynamodb create-table \
--table-name lancedb-integtest \
--attribute-definitions '[{"AttributeName": "base_uri", "AttributeType": "S"}, {"AttributeName": "version", "AttributeType": "N"}]' \
--key-schema '[{"AttributeName": "base_uri", "KeyType": "HASH"}, {"AttributeName": "version", "KeyType": "RANGE"}]' \
--provisioned-throughput '{"ReadCapacityUnits": 10, "WriteCapacityUnits": 10}' \
--endpoint-url $DYNAMODB_ENDPOINT
- uses: Swatinem/rust-cache@v2
- name: Install dependencies
run: |
sudo apt update
sudo apt install -y protobuf-compiler libssl-dev
- name: Build
run: |
npm ci
npm run build
npm run pack-build
npm install --no-save ./dist/lancedb-vectordb-*.tgz
# Remove index.node to test with dependency installed
rm index.node
- name: Test
run: npm run integration-test
9
.github/workflows/nodejs.yml
vendored
@@ -47,6 +47,9 @@ jobs:
run: |
sudo apt update
sudo apt install -y protobuf-compiler libssl-dev
- uses: actions-rust-lang/setup-rust-toolchain@v1
with:
components: rustfmt, clippy
- name: Lint
run: |
cargo fmt --all -- --check
@@ -76,7 +79,7 @@ jobs:
with:
node-version: ${{ matrix.node-version }}
cache: 'npm'
cache-dependency-path: node/package-lock.json
cache-dependency-path: nodejs/package-lock.json
- uses: Swatinem/rust-cache@v2
- name: Install dependencies
run: |
@@ -113,7 +116,7 @@ jobs:
set -e
npm ci
npm run docs
if ! git diff --exit-code; then
if ! git diff --exit-code -- . ':(exclude)Cargo.lock'; then
echo "Docs need to be updated"
echo "Run 'npm run docs', fix any warnings, and commit the changes."
exit 1
@@ -134,7 +137,7 @@ jobs:
with:
node-version: 20
cache: 'npm'
cache-dependency-path: node/package-lock.json
cache-dependency-path: nodejs/package-lock.json
- uses: Swatinem/rust-cache@v2
- name: Install dependencies
run: |
883
.github/workflows/npm-publish.yml
vendored
@@ -1,601 +1,30 @@
name: NPM Publish

env:
MACOSX_DEPLOYMENT_TARGET: '10.13'
CARGO_INCREMENTAL: '0'

permissions:
contents: write
id-token: write

on:
push:
branches:
- main
tags:
- "v*"
pull_request:
# This should trigger a dry run (we skip the final publish step)
paths:
- .github/workflows/npm-publish.yml
- Cargo.toml # Change in dependency frequently breaks builds

concurrency:
group: ${{ github.workflow }}-${{ github.ref }}
cancel-in-progress: true

jobs:
node:
name: vectordb Typescript
runs-on: ubuntu-latest
# Only runs on tags that matches the make-release action
if: startsWith(github.ref, 'refs/tags/v')
defaults:
run:
shell: bash
working-directory: node
steps:
- name: Checkout
uses: actions/checkout@v4
- uses: actions/setup-node@v3
with:
node-version: 20
cache: "npm"
cache-dependency-path: node/package-lock.json
- name: Install dependencies
run: |
sudo apt update
sudo apt install -y protobuf-compiler libssl-dev
- name: Build
run: |
npm ci
npm run tsc
npm pack
- name: Upload Linux Artifacts
uses: actions/upload-artifact@v4
with:
name: node-package
path: |
node/vectordb-*.tgz

node-macos:
name: vectordb ${{ matrix.config.arch }}
strategy:
matrix:
config:
- arch: x86_64-apple-darwin
runner: macos-13
- arch: aarch64-apple-darwin
# xlarge is implicitly arm64.
runner: macos-14
runs-on: ${{ matrix.config.runner }}
# Only runs on tags that matches the make-release action
if: startsWith(github.ref, 'refs/tags/v')
steps:
- name: Checkout
uses: actions/checkout@v4
- name: Install system dependencies
run: brew install protobuf
- name: Install npm dependencies
run: |
cd node
npm ci
- name: Build MacOS native node modules
run: bash ci/build_macos_artifacts.sh ${{ matrix.config.arch }}
- name: Upload Darwin Artifacts
uses: actions/upload-artifact@v4
with:
name: node-native-darwin-${{ matrix.config.arch }}
path: |
node/dist/lancedb-vectordb-darwin*.tgz

nodejs-macos:
name: lancedb ${{ matrix.config.arch }}
strategy:
matrix:
config:
- arch: x86_64-apple-darwin
runner: macos-13
- arch: aarch64-apple-darwin
# xlarge is implicitly arm64.
runner: macos-14
runs-on: ${{ matrix.config.runner }}
# Only runs on tags that matches the make-release action
if: startsWith(github.ref, 'refs/tags/v')
steps:
- name: Checkout
uses: actions/checkout@v4
- name: Install system dependencies
run: brew install protobuf
- name: Install npm dependencies
run: |
cd nodejs
npm ci
- name: Build MacOS native nodejs modules
run: bash ci/build_macos_artifacts_nodejs.sh ${{ matrix.config.arch }}
- name: Upload Darwin Artifacts
uses: actions/upload-artifact@v4
with:
name: nodejs-native-darwin-${{ matrix.config.arch }}
path: |
nodejs/dist/*.node

node-linux-gnu:
name: vectordb (${{ matrix.config.arch}}-unknown-linux-gnu)
runs-on: ${{ matrix.config.runner }}
# Only runs on tags that matches the make-release action
if: startsWith(github.ref, 'refs/tags/v')
strategy:
fail-fast: false
matrix:
config:
- arch: x86_64
runner: ubuntu-latest
- arch: aarch64
# For successful fat LTO builds, we need a large runner to avoid OOM errors.
runner: warp-ubuntu-latest-arm64-4x
steps:
- name: Checkout
uses: actions/checkout@v4
# To avoid OOM errors on ARM, we create a swap file.
- name: Configure aarch64 build
if: ${{ matrix.config.arch == 'aarch64' }}
run: |
free -h
sudo fallocate -l 16G /swapfile
sudo chmod 600 /swapfile
sudo mkswap /swapfile
sudo swapon /swapfile
echo "/swapfile swap swap defaults 0 0" >> sudo /etc/fstab
# print info
swapon --show
free -h
- name: Build Linux Artifacts
run: |
bash ci/build_linux_artifacts.sh ${{ matrix.config.arch }} ${{ matrix.config.arch }}-unknown-linux-gnu
- name: Upload Linux Artifacts
uses: actions/upload-artifact@v4
with:
name: node-native-linux-${{ matrix.config.arch }}-gnu
path: |
node/dist/lancedb-vectordb-linux*.tgz

node-linux-musl:
name: vectordb (${{ matrix.config.arch}}-unknown-linux-musl)
runs-on: ubuntu-latest
container: alpine:edge
# Only runs on tags that matches the make-release action
if: startsWith(github.ref, 'refs/tags/v')
strategy:
fail-fast: false
matrix:
config:
- arch: x86_64
- arch: aarch64
steps:
- name: Checkout
uses: actions/checkout@v4
- name: Install common dependencies
run: |
apk add protobuf-dev curl clang mold grep npm bash
curl --proto '=https' --tlsv1.3 -sSf https://raw.githubusercontent.com/rust-lang/rustup/refs/heads/master/rustup-init.sh | sh -s -- -y
echo "source $HOME/.cargo/env" >> saved_env
echo "export CC=clang" >> saved_env
echo "export RUSTFLAGS='-Ctarget-cpu=haswell -Ctarget-feature=-crt-static,+avx2,+fma,+f16c -Clinker=clang -Clink-arg=-fuse-ld=mold'" >> saved_env
- name: Configure aarch64 build
if: ${{ matrix.config.arch == 'aarch64' }}
run: |
source "$HOME/.cargo/env"
rustup target add aarch64-unknown-linux-musl
crt=$(realpath $(dirname $(rustup which rustc))/../lib/rustlib/aarch64-unknown-linux-musl/lib/self-contained)
sysroot_lib=/usr/aarch64-unknown-linux-musl/usr/lib
apk_url=https://dl-cdn.alpinelinux.org/alpine/latest-stable/main/aarch64/
curl -sSf $apk_url > apk_list
for pkg in gcc libgcc musl; do curl -sSf $apk_url$(cat apk_list | grep -oP '(?<=")'$pkg'-\d.*?(?=")') | tar zxf -; done
mkdir -p $sysroot_lib
echo 'GROUP ( libgcc_s.so.1 -lgcc )' > $sysroot_lib/libgcc_s.so
cp usr/lib/libgcc_s.so.1 $sysroot_lib
cp usr/lib/gcc/aarch64-alpine-linux-musl/*/libgcc.a $sysroot_lib
cp lib/ld-musl-aarch64.so.1 $sysroot_lib/libc.so
echo '!<arch>' > $sysroot_lib/libdl.a
(cd $crt && cp crti.o crtbeginS.o crtendS.o crtn.o -t $sysroot_lib)
echo "export CARGO_BUILD_TARGET=aarch64-unknown-linux-musl" >> saved_env
echo "export RUSTFLAGS='-Ctarget-cpu=apple-m1 -Ctarget-feature=-crt-static,+neon,+fp16,+fhm,+dotprod -Clinker=clang -Clink-arg=-fuse-ld=mold -Clink-arg=--target=aarch64-unknown-linux-musl -Clink-arg=--sysroot=/usr/aarch64-unknown-linux-musl -Clink-arg=-lc'" >> saved_env
- name: Build Linux Artifacts
run: |
source ./saved_env
bash ci/manylinux_node/build_vectordb.sh ${{ matrix.config.arch }} ${{ matrix.config.arch }}-unknown-linux-musl
- name: Upload Linux Artifacts
uses: actions/upload-artifact@v4
|
||||
with:
|
||||
name: node-native-linux-${{ matrix.config.arch }}-musl
|
||||
path: |
|
||||
node/dist/lancedb-vectordb-linux*.tgz
|
||||
|
||||
nodejs-linux-gnu:
|
||||
name: lancedb (${{ matrix.config.arch}}-unknown-linux-gnu)
|
||||
runs-on: ${{ matrix.config.runner }}
|
||||
# Only runs on tags that match the make-release action
|
||||
if: startsWith(github.ref, 'refs/tags/v')
|
||||
strategy:
|
||||
fail-fast: false
|
||||
matrix:
|
||||
config:
|
||||
- arch: x86_64
|
||||
runner: ubuntu-latest
|
||||
- arch: aarch64
|
||||
# For successful fat LTO builds, we need a large runner to avoid OOM errors.
|
||||
runner: buildjet-16vcpu-ubuntu-2204-arm
|
||||
steps:
|
||||
- name: Checkout
|
||||
uses: actions/checkout@v4
|
||||
# Buildjet aarch64 runners have only 1.5 GB RAM per core, vs 3.5 GB per core for
|
||||
# x86_64 runners. To avoid OOM errors on ARM, we create a swap file.
|
||||
- name: Configure aarch64 build
|
||||
if: ${{ matrix.config.arch == 'aarch64' }}
|
||||
run: |
|
||||
free -h
|
||||
sudo fallocate -l 16G /swapfile
|
||||
sudo chmod 600 /swapfile
|
||||
sudo mkswap /swapfile
|
||||
sudo swapon /swapfile
|
||||
echo "/swapfile swap swap defaults 0 0" | sudo tee -a /etc/fstab
|
||||
# print info
|
||||
swapon --show
|
||||
free -h
|
||||
- name: Build Linux Artifacts
|
||||
run: |
|
||||
bash ci/build_linux_artifacts_nodejs.sh ${{ matrix.config.arch }}
|
||||
- name: Upload Linux Artifacts
|
||||
uses: actions/upload-artifact@v4
|
||||
with:
|
||||
name: nodejs-native-linux-${{ matrix.config.arch }}-gnu
|
||||
path: |
|
||||
nodejs/dist/*.node
|
||||
# The generic files are the same in all distros so we just pick
|
||||
# one to do the upload.
|
||||
- name: Upload Generic Artifacts
|
||||
if: ${{ matrix.config.arch == 'x86_64' }}
|
||||
uses: actions/upload-artifact@v4
|
||||
with:
|
||||
name: nodejs-dist
|
||||
path: |
|
||||
nodejs/dist/*
|
||||
!nodejs/dist/*.node
|
||||
|
||||
nodejs-linux-musl:
|
||||
name: lancedb (${{ matrix.config.arch}}-unknown-linux-musl)
|
||||
runs-on: ubuntu-latest
|
||||
container: alpine:edge
|
||||
# Only runs on tags that match the make-release action
|
||||
if: startsWith(github.ref, 'refs/tags/v')
|
||||
strategy:
|
||||
fail-fast: false
|
||||
matrix:
|
||||
config:
|
||||
- arch: x86_64
|
||||
- arch: aarch64
|
||||
steps:
|
||||
- name: Checkout
|
||||
uses: actions/checkout@v4
|
||||
- name: Install common dependencies
|
||||
run: |
|
||||
apk add protobuf-dev curl clang mold grep npm bash openssl-dev openssl-libs-static
|
||||
curl --proto '=https' --tlsv1.3 -sSf https://raw.githubusercontent.com/rust-lang/rustup/refs/heads/master/rustup-init.sh | sh -s -- -y
|
||||
echo "source $HOME/.cargo/env" >> saved_env
|
||||
echo "export CC=clang" >> saved_env
|
||||
echo "export RUSTFLAGS='-Ctarget-cpu=haswell -Ctarget-feature=-crt-static,+avx2,+fma,+f16c -Clinker=clang -Clink-arg=-fuse-ld=mold'" >> saved_env
|
||||
echo "export X86_64_UNKNOWN_LINUX_MUSL_OPENSSL_INCLUDE_DIR=/usr/include" >> saved_env
|
||||
echo "export X86_64_UNKNOWN_LINUX_MUSL_OPENSSL_LIB_DIR=/usr/lib" >> saved_env
|
||||
- name: Configure aarch64 build
|
||||
if: ${{ matrix.config.arch == 'aarch64' }}
|
||||
run: |
|
||||
source "$HOME/.cargo/env"
|
||||
rustup target add aarch64-unknown-linux-musl
|
||||
crt=$(realpath $(dirname $(rustup which rustc))/../lib/rustlib/aarch64-unknown-linux-musl/lib/self-contained)
|
||||
sysroot_lib=/usr/aarch64-unknown-linux-musl/usr/lib
|
||||
apk_url=https://dl-cdn.alpinelinux.org/alpine/latest-stable/main/aarch64/
|
||||
curl -sSf $apk_url > apk_list
|
||||
for pkg in gcc libgcc musl openssl-dev openssl-libs-static; do curl -sSf $apk_url$(cat apk_list | grep -oP '(?<=")'$pkg'-\d.*?(?=")') | tar zxf -; done
|
||||
mkdir -p $sysroot_lib
|
||||
echo 'GROUP ( libgcc_s.so.1 -lgcc )' > $sysroot_lib/libgcc_s.so
|
||||
cp usr/lib/libgcc_s.so.1 $sysroot_lib
|
||||
cp usr/lib/gcc/aarch64-alpine-linux-musl/*/libgcc.a $sysroot_lib
|
||||
cp lib/ld-musl-aarch64.so.1 $sysroot_lib/libc.so
|
||||
echo '!<arch>' > $sysroot_lib/libdl.a
|
||||
(cd $crt && cp crti.o crtbeginS.o crtendS.o crtn.o -t $sysroot_lib)
|
||||
echo "export CARGO_BUILD_TARGET=aarch64-unknown-linux-musl" >> saved_env
|
||||
echo "export RUSTFLAGS='-Ctarget-feature=-crt-static,+neon,+fp16,+fhm,+dotprod -Clinker=clang -Clink-arg=-fuse-ld=mold -Clink-arg=--target=aarch64-unknown-linux-musl -Clink-arg=--sysroot=/usr/aarch64-unknown-linux-musl -Clink-arg=-lc'" >> saved_env
|
||||
echo "export AARCH64_UNKNOWN_LINUX_MUSL_OPENSSL_INCLUDE_DIR=$(realpath usr/include)" >> saved_env
|
||||
echo "export AARCH64_UNKNOWN_LINUX_MUSL_OPENSSL_LIB_DIR=$(realpath usr/lib)" >> saved_env
|
||||
- name: Build Linux Artifacts
|
||||
run: |
|
||||
source ./saved_env
|
||||
bash ci/manylinux_node/build_lancedb.sh ${{ matrix.config.arch }}
|
||||
- name: Upload Linux Artifacts
|
||||
uses: actions/upload-artifact@v4
|
||||
with:
|
||||
name: nodejs-native-linux-${{ matrix.config.arch }}-musl
|
||||
path: |
|
||||
nodejs/dist/*.node
|
||||
|
||||
node-windows:
|
||||
name: vectordb ${{ matrix.target }}
|
||||
runs-on: windows-2022
|
||||
# Only runs on tags that match the make-release action
|
||||
if: startsWith(github.ref, 'refs/tags/v')
|
||||
strategy:
|
||||
fail-fast: false
|
||||
matrix:
|
||||
target: [x86_64-pc-windows-msvc]
|
||||
steps:
|
||||
- name: Checkout
|
||||
uses: actions/checkout@v4
|
||||
- name: Install Protoc v21.12
|
||||
working-directory: C:\
|
||||
run: |
|
||||
New-Item -Path 'C:\protoc' -ItemType Directory
|
||||
Set-Location C:\protoc
|
||||
Invoke-WebRequest https://github.com/protocolbuffers/protobuf/releases/download/v21.12/protoc-21.12-win64.zip -OutFile C:\protoc\protoc.zip
|
||||
7z x protoc.zip
|
||||
Add-Content $env:GITHUB_PATH "C:\protoc\bin"
|
||||
shell: powershell
|
||||
- name: Install npm dependencies
|
||||
run: |
|
||||
cd node
|
||||
npm ci
|
||||
- name: Build Windows native node modules
|
||||
run: .\ci\build_windows_artifacts.ps1 ${{ matrix.target }}
|
||||
- name: Upload Windows Artifacts
|
||||
uses: actions/upload-artifact@v4
|
||||
with:
|
||||
name: node-native-windows
|
||||
path: |
|
||||
node/dist/lancedb-vectordb-win32*.tgz
|
||||
|
||||
node-windows-arm64:
|
||||
name: vectordb ${{ matrix.config.arch }}-pc-windows-msvc
|
||||
# if: startsWith(github.ref, 'refs/tags/v')
|
||||
runs-on: ubuntu-latest
|
||||
container: alpine:edge
|
||||
strategy:
|
||||
fail-fast: false
|
||||
matrix:
|
||||
config:
|
||||
# - arch: x86_64
|
||||
- arch: aarch64
|
||||
steps:
|
||||
- name: Checkout
|
||||
uses: actions/checkout@v4
|
||||
- name: Install dependencies
|
||||
run: |
|
||||
apk add protobuf-dev curl clang lld llvm19 grep npm bash msitools sed
|
||||
curl --proto '=https' --tlsv1.3 -sSf https://raw.githubusercontent.com/rust-lang/rustup/refs/heads/master/rustup-init.sh | sh -s -- -y
|
||||
echo "source $HOME/.cargo/env" >> saved_env
|
||||
echo "export CC=clang" >> saved_env
|
||||
echo "export AR=llvm-ar" >> saved_env
|
||||
source "$HOME/.cargo/env"
|
||||
rustup target add ${{ matrix.config.arch }}-pc-windows-msvc
|
||||
(mkdir -p sysroot && cd sysroot && sh ../ci/sysroot-${{ matrix.config.arch }}-pc-windows-msvc.sh)
|
||||
echo "export C_INCLUDE_PATH=/usr/${{ matrix.config.arch }}-pc-windows-msvc/usr/include" >> saved_env
|
||||
echo "export CARGO_BUILD_TARGET=${{ matrix.config.arch }}-pc-windows-msvc" >> saved_env
|
||||
- name: Configure x86_64 build
|
||||
if: ${{ matrix.config.arch == 'x86_64' }}
|
||||
run: |
|
||||
echo "export RUSTFLAGS='-Ctarget-cpu=haswell -Ctarget-feature=+crt-static,+avx2,+fma,+f16c -Clinker=lld -Clink-arg=/LIBPATH:/usr/x86_64-pc-windows-msvc/usr/lib'" >> saved_env
|
||||
- name: Configure aarch64 build
|
||||
if: ${{ matrix.config.arch == 'aarch64' }}
|
||||
run: |
|
||||
echo "export RUSTFLAGS='-Ctarget-feature=+crt-static,+neon,+fp16,+fhm,+dotprod -Clinker=lld -Clink-arg=/LIBPATH:/usr/aarch64-pc-windows-msvc/usr/lib -Clink-arg=arm64rt.lib'" >> saved_env
|
||||
- name: Build Windows Artifacts
|
||||
run: |
|
||||
source ./saved_env
|
||||
bash ci/manylinux_node/build_vectordb.sh ${{ matrix.config.arch }} ${{ matrix.config.arch }}-pc-windows-msvc
|
||||
- name: Upload Windows Artifacts
|
||||
uses: actions/upload-artifact@v4
|
||||
with:
|
||||
name: node-native-windows-${{ matrix.config.arch }}
|
||||
path: |
|
||||
node/dist/lancedb-vectordb-win32*.tgz
|
||||
|
||||
nodejs-windows:
|
||||
name: lancedb ${{ matrix.target }}
|
||||
runs-on: windows-2022
|
||||
# Only runs on tags that match the make-release action
|
||||
if: startsWith(github.ref, 'refs/tags/v')
|
||||
strategy:
|
||||
fail-fast: false
|
||||
matrix:
|
||||
target: [x86_64-pc-windows-msvc]
|
||||
steps:
|
||||
- name: Checkout
|
||||
uses: actions/checkout@v4
|
||||
- name: Install Protoc v21.12
|
||||
working-directory: C:\
|
||||
run: |
|
||||
New-Item -Path 'C:\protoc' -ItemType Directory
|
||||
Set-Location C:\protoc
|
||||
Invoke-WebRequest https://github.com/protocolbuffers/protobuf/releases/download/v21.12/protoc-21.12-win64.zip -OutFile C:\protoc\protoc.zip
|
||||
7z x protoc.zip
|
||||
Add-Content $env:GITHUB_PATH "C:\protoc\bin"
|
||||
shell: powershell
|
||||
- name: Install npm dependencies
|
||||
run: |
|
||||
cd nodejs
|
||||
npm ci
|
||||
- name: Build Windows native node modules
|
||||
run: .\ci\build_windows_artifacts_nodejs.ps1 ${{ matrix.target }}
|
||||
- name: Upload Windows Artifacts
|
||||
uses: actions/upload-artifact@v4
|
||||
with:
|
||||
name: nodejs-native-windows
|
||||
path: |
|
||||
nodejs/dist/*.node
|
||||
|
||||
nodejs-windows-arm64:
|
||||
name: lancedb ${{ matrix.config.arch }}-pc-windows-msvc
|
||||
# Only runs on tags that match the make-release action
|
||||
# if: startsWith(github.ref, 'refs/tags/v')
|
||||
runs-on: ubuntu-latest
|
||||
container: alpine:edge
|
||||
strategy:
|
||||
fail-fast: false
|
||||
matrix:
|
||||
config:
|
||||
# - arch: x86_64
|
||||
- arch: aarch64
|
||||
steps:
|
||||
- name: Checkout
|
||||
uses: actions/checkout@v4
|
||||
- name: Install dependencies
|
||||
run: |
|
||||
apk add protobuf-dev curl clang lld llvm19 grep npm bash msitools sed
|
||||
curl --proto '=https' --tlsv1.3 -sSf https://raw.githubusercontent.com/rust-lang/rustup/refs/heads/master/rustup-init.sh | sh -s -- -y
|
||||
echo "source $HOME/.cargo/env" >> saved_env
|
||||
echo "export CC=clang" >> saved_env
|
||||
echo "export AR=llvm-ar" >> saved_env
|
||||
source "$HOME/.cargo/env"
|
||||
rustup target add ${{ matrix.config.arch }}-pc-windows-msvc
|
||||
(mkdir -p sysroot && cd sysroot && sh ../ci/sysroot-${{ matrix.config.arch }}-pc-windows-msvc.sh)
|
||||
echo "export C_INCLUDE_PATH=/usr/${{ matrix.config.arch }}-pc-windows-msvc/usr/include" >> saved_env
|
||||
echo "export CARGO_BUILD_TARGET=${{ matrix.config.arch }}-pc-windows-msvc" >> saved_env
|
||||
printf '#!/bin/sh\ncargo "$@"' > $HOME/.cargo/bin/cargo-xwin
|
||||
chmod u+x $HOME/.cargo/bin/cargo-xwin
|
||||
- name: Configure x86_64 build
|
||||
if: ${{ matrix.config.arch == 'x86_64' }}
|
||||
run: |
|
||||
echo "export RUSTFLAGS='-Ctarget-cpu=haswell -Ctarget-feature=+crt-static,+avx2,+fma,+f16c -Clinker=lld -Clink-arg=/LIBPATH:/usr/x86_64-pc-windows-msvc/usr/lib'" >> saved_env
|
||||
- name: Configure aarch64 build
|
||||
if: ${{ matrix.config.arch == 'aarch64' }}
|
||||
run: |
|
||||
echo "export RUSTFLAGS='-Ctarget-feature=+crt-static,+neon,+fp16,+fhm,+dotprod -Clinker=lld -Clink-arg=/LIBPATH:/usr/aarch64-pc-windows-msvc/usr/lib -Clink-arg=arm64rt.lib'" >> saved_env
|
||||
- name: Build Windows Artifacts
|
||||
run: |
|
||||
source ./saved_env
|
||||
bash ci/manylinux_node/build_lancedb.sh ${{ matrix.config.arch }}
|
||||
- name: Upload Windows Artifacts
|
||||
uses: actions/upload-artifact@v4
|
||||
with:
|
||||
name: nodejs-native-windows-${{ matrix.config.arch }}
|
||||
path: |
|
||||
nodejs/dist/*.node
|
||||
|
||||
release:
|
||||
name: vectordb NPM Publish
|
||||
needs: [node, node-macos, node-linux-gnu, node-linux-musl, node-windows, node-windows-arm64]
|
||||
runs-on: ubuntu-latest
|
||||
# Only runs on tags that match the make-release action
|
||||
if: startsWith(github.ref, 'refs/tags/v')
|
||||
steps:
|
||||
- uses: actions/download-artifact@v4
|
||||
with:
|
||||
pattern: node-*
|
||||
- name: Display structure of downloaded files
|
||||
run: ls -R
|
||||
- uses: actions/setup-node@v3
|
||||
with:
|
||||
node-version: 20
|
||||
registry-url: "https://registry.npmjs.org"
|
||||
- name: Publish to NPM
|
||||
env:
|
||||
NODE_AUTH_TOKEN: ${{ secrets.LANCEDB_NPM_REGISTRY_TOKEN }}
|
||||
run: |
|
||||
# Tag beta as "preview" instead of default "latest". See lancedb
|
||||
# npm publish step for more info.
|
||||
if [[ $GITHUB_REF =~ refs/tags/v(.*)-beta.* ]]; then
|
||||
PUBLISH_ARGS="--tag preview"
|
||||
fi
|
||||
|
||||
mv */*.tgz .
|
||||
for filename in *.tgz; do
|
||||
npm publish $PUBLISH_ARGS $filename
|
||||
done
|
||||
- name: Notify Slack Action
|
||||
uses: ravsamhq/notify-slack-action@2.3.0
|
||||
if: ${{ always() }}
|
||||
with:
|
||||
status: ${{ job.status }}
|
||||
notify_when: "failure"
|
||||
notification_title: "{workflow} is failing"
|
||||
env:
|
||||
SLACK_WEBHOOK_URL: ${{ secrets.ACTION_MONITORING_SLACK }}
|
||||
|
||||
release-nodejs:
|
||||
name: lancedb NPM Publish
|
||||
needs: [nodejs-macos, nodejs-linux-gnu, nodejs-linux-musl, nodejs-windows, nodejs-windows-arm64]
|
||||
runs-on: ubuntu-latest
|
||||
# Only runs on tags that match the make-release action
|
||||
if: startsWith(github.ref, 'refs/tags/v')
|
||||
defaults:
|
||||
run:
|
||||
shell: bash
|
||||
working-directory: nodejs
|
||||
steps:
|
||||
- name: Checkout
|
||||
uses: actions/checkout@v4
|
||||
- uses: actions/download-artifact@v4
|
||||
with:
|
||||
name: nodejs-dist
|
||||
path: nodejs/dist
|
||||
- uses: actions/download-artifact@v4
|
||||
name: Download arch-specific binaries
|
||||
with:
|
||||
pattern: nodejs-*
|
||||
path: nodejs/nodejs-artifacts
|
||||
merge-multiple: true
|
||||
- name: Display structure of downloaded files
|
||||
run: find .
|
||||
- uses: actions/setup-node@v3
|
||||
with:
|
||||
node-version: 20
|
||||
registry-url: "https://registry.npmjs.org"
|
||||
- name: Install napi-rs
|
||||
run: npm install -g @napi-rs/cli
|
||||
- name: Prepare artifacts
|
||||
run: npx napi artifacts -d nodejs-artifacts
|
||||
- name: Display structure of staged files
|
||||
run: find npm
|
||||
- name: Publish to NPM
|
||||
env:
|
||||
NODE_AUTH_TOKEN: ${{ secrets.LANCEDB_NPM_REGISTRY_TOKEN }}
|
||||
# By default, things are published to the latest tag. This is what is
|
||||
# installed by default if the user does not specify a version. This is
|
||||
# good for stable releases, but for pre-releases, we want to publish to
|
||||
# the "preview" tag so they can install with `npm install lancedb@preview`.
|
||||
# See: https://medium.com/@mbostock/prereleases-and-npm-e778fc5e2420
|
||||
run: |
|
||||
if [[ $GITHUB_REF =~ refs/tags/v(.*)-beta.* ]]; then
|
||||
npm publish --access public --tag preview
|
||||
else
|
||||
npm publish --access public
|
||||
fi
|
||||
- name: Notify Slack Action
|
||||
uses: ravsamhq/notify-slack-action@2.3.0
|
||||
if: ${{ always() }}
|
||||
with:
|
||||
status: ${{ job.status }}
|
||||
notify_when: "failure"
|
||||
notification_title: "{workflow} is failing"
|
||||
env:
|
||||
SLACK_WEBHOOK_URL: ${{ secrets.ACTION_MONITORING_SLACK }}
|
||||
|
||||
update-package-lock:
|
||||
if: startsWith(github.ref, 'refs/tags/v')
|
||||
needs: [release]
|
||||
runs-on: ubuntu-latest
|
||||
permissions:
|
||||
contents: write
|
||||
steps:
|
||||
- name: Checkout
|
||||
uses: actions/checkout@v4
|
||||
with:
|
||||
ref: main
|
||||
token: ${{ secrets.LANCEDB_RELEASE_TOKEN }}
|
||||
fetch-depth: 0
|
||||
lfs: true
|
||||
- uses: ./.github/workflows/update_package_lock
|
||||
with:
|
||||
github_token: ${{ secrets.GITHUB_TOKEN }}
|
||||
|
||||
update-package-lock-nodejs:
|
||||
if: startsWith(github.ref, 'refs/tags/v')
|
||||
needs: [release-nodejs]
|
||||
runs-on: ubuntu-latest
|
||||
permissions:
|
||||
contents: write
|
||||
steps:
|
||||
- name: Checkout
|
||||
uses: actions/checkout@v4
|
||||
with:
|
||||
ref: main
|
||||
token: ${{ secrets.LANCEDB_RELEASE_TOKEN }}
|
||||
fetch-depth: 0
|
||||
lfs: true
|
||||
- uses: ./.github/workflows/update_package_lock_nodejs
|
||||
with:
|
||||
github_token: ${{ secrets.GITHUB_TOKEN }}
|
||||
|
||||
gh-release:
|
||||
if: startsWith(github.ref, 'refs/tags/v')
|
||||
runs-on: ubuntu-latest
|
||||
@@ -662,3 +91,277 @@ jobs:
|
||||
generate_release_notes: false
|
||||
name: Node/Rust LanceDB v${{ steps.extract_version.outputs.version }}
|
||||
body: ${{ steps.release_notes.outputs.changelog }}
|
||||
|
||||
build-lancedb:
|
||||
strategy:
|
||||
fail-fast: false
|
||||
matrix:
|
||||
settings:
|
||||
- target: x86_64-apple-darwin
|
||||
host: macos-latest
|
||||
features: ","
|
||||
pre_build: |-
|
||||
brew install protobuf
|
||||
rustup target add x86_64-apple-darwin
|
||||
- target: aarch64-apple-darwin
|
||||
host: macos-latest
|
||||
features: fp16kernels
|
||||
pre_build: brew install protobuf
|
||||
- target: x86_64-pc-windows-msvc
|
||||
host: windows-latest
|
||||
features: ","
|
||||
pre_build: |-
|
||||
choco install --no-progress protoc ninja nasm
|
||||
tail -n 1000 /c/ProgramData/chocolatey/logs/chocolatey.log
|
||||
# There is an issue where choco doesn't add nasm to the path
|
||||
export PATH="$PATH:/c/Program Files/NASM"
|
||||
nasm -v
|
||||
- target: aarch64-pc-windows-msvc
|
||||
host: windows-latest
|
||||
features: ","
|
||||
pre_build: |-
|
||||
choco install --no-progress protoc
|
||||
rustup target add aarch64-pc-windows-msvc
|
||||
- target: x86_64-unknown-linux-gnu
|
||||
host: ubuntu-latest
|
||||
features: fp16kernels
|
||||
# https://github.com/napi-rs/napi-rs/blob/main/debian.Dockerfile
|
||||
docker: ghcr.io/napi-rs/napi-rs/nodejs-rust:lts-debian
|
||||
pre_build: |-
|
||||
set -e &&
|
||||
apt-get update &&
|
||||
apt-get install -y protobuf-compiler pkg-config
|
||||
- target: x86_64-unknown-linux-musl
|
||||
# This one seems to need some extra memory
|
||||
host: ubuntu-2404-8x-x64
|
||||
# https://github.com/napi-rs/napi-rs/blob/main/alpine.Dockerfile
|
||||
docker: ghcr.io/napi-rs/napi-rs/nodejs-rust:lts-alpine
|
||||
features: fp16kernels
|
||||
pre_build: |-
|
||||
set -e &&
|
||||
apk add protobuf-dev curl &&
|
||||
ln -s /usr/lib/gcc/x86_64-alpine-linux-musl/14.2.0/crtbeginS.o /usr/lib/crtbeginS.o &&
|
||||
ln -s /usr/lib/libgcc_s.so /usr/lib/libgcc.so &&
|
||||
CC=gcc &&
|
||||
CXX=g++
|
||||
- target: aarch64-unknown-linux-gnu
|
||||
host: ubuntu-2404-8x-x64
|
||||
# https://github.com/napi-rs/napi-rs/blob/main/debian-aarch64.Dockerfile
|
||||
docker: ghcr.io/napi-rs/napi-rs/nodejs-rust:lts-debian-aarch64
|
||||
features: "fp16kernels"
|
||||
pre_build: |-
|
||||
set -e &&
|
||||
apt-get update &&
|
||||
apt-get install -y protobuf-compiler pkg-config &&
|
||||
# https://github.com/aws/aws-lc-rs/issues/737#issuecomment-2725918627
|
||||
ln -s /usr/aarch64-unknown-linux-gnu/lib/gcc/aarch64-unknown-linux-gnu/4.8.5/crtbeginS.o /usr/aarch64-unknown-linux-gnu/aarch64-unknown-linux-gnu/sysroot/usr/lib/crtbeginS.o &&
|
||||
ln -s /usr/aarch64-unknown-linux-gnu/lib/gcc /usr/aarch64-unknown-linux-gnu/aarch64-unknown-linux-gnu/sysroot/usr/lib/gcc &&
|
||||
rustup target add aarch64-unknown-linux-gnu
|
||||
- target: aarch64-unknown-linux-musl
|
||||
host: ubuntu-2404-8x-x64
|
||||
# https://github.com/napi-rs/napi-rs/blob/main/alpine.Dockerfile
|
||||
docker: ghcr.io/napi-rs/napi-rs/nodejs-rust:lts-alpine
|
||||
features: ","
|
||||
pre_build: |-
|
||||
set -e &&
|
||||
apk add protobuf-dev &&
|
||||
rustup target add aarch64-unknown-linux-musl &&
|
||||
export CC_aarch64_unknown_linux_musl=aarch64-linux-musl-gcc &&
|
||||
export CXX_aarch64_unknown_linux_musl=aarch64-linux-musl-g++
|
||||
name: build - ${{ matrix.settings.target }}
|
||||
runs-on: ${{ matrix.settings.host }}
|
||||
defaults:
|
||||
run:
|
||||
working-directory: nodejs
|
||||
steps:
|
||||
- uses: actions/checkout@v4
|
||||
- name: Setup node
|
||||
uses: actions/setup-node@v4
|
||||
if: ${{ !matrix.settings.docker }}
|
||||
with:
|
||||
node-version: 20
|
||||
cache: npm
|
||||
cache-dependency-path: nodejs/package-lock.json
|
||||
- name: Install
|
||||
uses: dtolnay/rust-toolchain@stable
|
||||
if: ${{ !matrix.settings.docker }}
|
||||
with:
|
||||
toolchain: stable
|
||||
targets: ${{ matrix.settings.target }}
|
||||
- name: Cache cargo
|
||||
uses: actions/cache@v4
|
||||
with:
|
||||
path: |
|
||||
~/.cargo/registry/index/
|
||||
~/.cargo/registry/cache/
|
||||
~/.cargo/git/db/
|
||||
.cargo-cache
|
||||
target/
|
||||
key: nodejs-${{ matrix.settings.target }}-cargo-${{ matrix.settings.host }}
|
||||
- name: Setup toolchain
|
||||
run: ${{ matrix.settings.setup }}
|
||||
if: ${{ matrix.settings.setup }}
|
||||
shell: bash
|
||||
- name: Install dependencies
|
||||
run: npm ci
|
||||
- name: Build in docker
|
||||
uses: addnab/docker-run-action@v3
|
||||
if: ${{ matrix.settings.docker }}
|
||||
with:
|
||||
image: ${{ matrix.settings.docker }}
|
||||
options: "--user 0:0 -v ${{ github.workspace }}/.cargo-cache/git/db:/usr/local/cargo/git/db \
|
||||
-v ${{ github.workspace }}/.cargo/registry/cache:/usr/local/cargo/registry/cache \
|
||||
-v ${{ github.workspace }}/.cargo/registry/index:/usr/local/cargo/registry/index \
|
||||
-v ${{ github.workspace }}:/build -w /build/nodejs"
|
||||
run: |
|
||||
set -e
|
||||
${{ matrix.settings.pre_build }}
|
||||
npx napi build --platform --release --no-const-enum \
|
||||
--features ${{ matrix.settings.features }} \
|
||||
--target ${{ matrix.settings.target }} \
|
||||
--dts ../lancedb/native.d.ts \
|
||||
--js ../lancedb/native.js \
|
||||
--strip \
|
||||
dist/
|
||||
- name: Build
|
||||
run: |
|
||||
${{ matrix.settings.pre_build }}
|
||||
npx napi build --platform --release --no-const-enum \
|
||||
--features ${{ matrix.settings.features }} \
|
||||
--target ${{ matrix.settings.target }} \
|
||||
--dts ../lancedb/native.d.ts \
|
||||
--js ../lancedb/native.js \
|
||||
--strip \
|
||||
$EXTRA_ARGS \
|
||||
dist/
|
||||
if: ${{ !matrix.settings.docker }}
|
||||
shell: bash
|
||||
- name: Upload artifact
|
||||
uses: actions/upload-artifact@v4
|
||||
with:
|
||||
name: lancedb-${{ matrix.settings.target }}
|
||||
path: nodejs/dist/*.node
|
||||
if-no-files-found: error
|
||||
# The generic files are the same in all distros so we just pick
|
||||
# one to do the upload.
|
||||
- name: Make generic artifacts
|
||||
if: ${{ matrix.settings.target == 'aarch64-apple-darwin' }}
|
||||
run: npm run tsc
|
||||
- name: Upload Generic Artifacts
|
||||
if: ${{ matrix.settings.target == 'aarch64-apple-darwin' }}
|
||||
uses: actions/upload-artifact@v4
|
||||
with:
|
||||
name: nodejs-dist
|
||||
path: |
|
||||
nodejs/dist/*
|
||||
!nodejs/dist/*.node
|
||||
test-lancedb:
|
||||
name: "Test: ${{ matrix.settings.target }} - node@${{ matrix.node }}"
|
||||
needs:
|
||||
- build-lancedb
|
||||
strategy:
|
||||
fail-fast: false
|
||||
matrix:
|
||||
settings:
|
||||
# TODO: Get tests passing on Windows (failing from test tmpdir issue)
|
||||
# - host: windows-latest
|
||||
# target: x86_64-pc-windows-msvc
|
||||
- host: macos-latest
|
||||
target: aarch64-apple-darwin
|
||||
- target: x86_64-unknown-linux-gnu
|
||||
host: ubuntu-latest
|
||||
- target: aarch64-unknown-linux-gnu
|
||||
host: buildjet-16vcpu-ubuntu-2204-arm
|
||||
node:
|
||||
- '20'
|
||||
runs-on: ${{ matrix.settings.host }}
|
||||
defaults:
|
||||
run:
|
||||
shell: bash
|
||||
working-directory: nodejs
|
||||
steps:
|
||||
- uses: actions/checkout@v4
|
||||
- name: Setup node
|
||||
uses: actions/setup-node@v4
|
||||
with:
|
||||
node-version: ${{ matrix.node }}
|
||||
cache: npm
|
||||
cache-dependency-path: nodejs/package-lock.json
|
||||
- name: Install dependencies
|
||||
run: npm ci
|
||||
- name: Download artifacts
|
||||
uses: actions/download-artifact@v4
|
||||
with:
|
||||
name: lancedb-${{ matrix.settings.target }}
|
||||
path: nodejs/dist/
|
||||
# For testing purposes:
|
||||
# run-id: 13982782871
|
||||
# github-token: ${{ secrets.GITHUB_TOKEN }} # token with actions:read permissions on target repo
|
||||
- uses: actions/download-artifact@v4
|
||||
with:
|
||||
name: nodejs-dist
|
||||
path: nodejs/dist
|
||||
# For testing purposes:
|
||||
# github-token: ${{ secrets.GITHUB_TOKEN }} # token with actions:read permissions on target repo
|
||||
# run-id: 13982782871
|
||||
- name: List packages
|
||||
run: ls -R dist
|
||||
- name: Move built files
|
||||
run: cp dist/native.d.ts dist/native.js dist/*.node lancedb/
|
||||
- name: Test bindings
|
||||
run: npm test
|
||||
publish:
|
||||
name: Publish
|
||||
runs-on: ubuntu-latest
|
||||
defaults:
|
||||
run:
|
||||
shell: bash
|
||||
working-directory: nodejs
|
||||
needs:
|
||||
- test-lancedb
|
||||
steps:
|
||||
- uses: actions/checkout@v4
|
||||
- name: Setup node
|
||||
uses: actions/setup-node@v4
|
||||
with:
|
||||
node-version: 20
|
||||
cache: npm
|
||||
cache-dependency-path: nodejs/package-lock.json
|
||||
registry-url: "https://registry.npmjs.org"
|
||||
- name: Install dependencies
|
||||
run: npm ci
|
||||
- uses: actions/download-artifact@v4
|
||||
with:
|
||||
name: nodejs-dist
|
||||
path: nodejs/dist
|
||||
# For testing purposes:
|
||||
# run-id: 13982782871
|
||||
# github-token: ${{ secrets.GITHUB_TOKEN }} # token with actions:read permissions on target repo
|
||||
- uses: actions/download-artifact@v4
|
||||
name: Download arch-specific binaries
|
||||
with:
|
||||
pattern: lancedb-*
|
||||
path: nodejs/nodejs-artifacts
|
||||
merge-multiple: true
|
||||
# For testing purposes:
|
||||
# run-id: 13982782871
|
||||
# github-token: ${{ secrets.GITHUB_TOKEN }} # token with actions:read permissions on target repo
|
||||
- name: Display structure of downloaded files
|
||||
run: find dist && find nodejs-artifacts
|
||||
- name: Move artifacts
|
||||
run: npx napi artifacts -d nodejs-artifacts
|
||||
- name: List packages
|
||||
run: find npm
|
||||
- name: Publish
|
||||
env:
|
||||
NODE_AUTH_TOKEN: ${{ secrets.LANCEDB_NPM_REGISTRY_TOKEN }}
|
||||
DRY_RUN: ${{ !startsWith(github.ref, 'refs/tags/v') }}
|
||||
run: |
|
||||
ARGS="--access public"
|
||||
if [[ $DRY_RUN == "true" ]]; then
|
||||
ARGS="$ARGS --dry-run"
|
||||
fi
|
||||
if [[ $GITHUB_REF =~ refs/tags/v(.*)-beta.* ]]; then
|
||||
ARGS="$ARGS --tag preview"
|
||||
fi
|
||||
npm publish $ARGS
|
||||
|
||||
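The `Publish` step above builds its `npm publish` arguments from two inputs: whether the run is a dry run (any ref that is not a `v*` tag) and whether the tag carries a `-beta` suffix, in which case the `preview` dist-tag is used. The following is a minimal sketch of that decision as a standalone shell function, so the logic can be tried locally without touching the registry. The function name `publish_args` and its two positional parameters are hypothetical, and a POSIX `case` glob stands in for the workflow's bash `[[ ... =~ ... ]]` regex match.

```shell
#!/bin/sh
# Hypothetical helper (not part of the workflow): prints the arguments the
# Publish step would pass to `npm publish`.
#   $1 - git ref, e.g. refs/tags/v0.22.0-beta.1
#   $2 - "true" for a dry run, anything else for a real publish
publish_args() {
  ref=$1
  dry_run=$2
  args="--access public"
  if [ "$dry_run" = "true" ]; then
    args="$args --dry-run"
  fi
  # Glob equivalent of the unanchored regex refs/tags/v(.*)-beta.*
  case "$ref" in
    *refs/tags/v*-beta*) args="$args --tag preview" ;;
  esac
  printf '%s\n' "$args"
}

publish_args "refs/tags/v0.22.0-beta.1" "false"
publish_args "refs/pull/123/merge" "true"
```

In the workflow itself the second input comes from `DRY_RUN: ${{ !startsWith(github.ref, 'refs/tags/v') }}`, so pull-request runs exercise the whole pipeline but stop short of a real publish.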
9
.github/workflows/pypi-publish.yml
vendored
@@ -4,6 +4,11 @@ on:
|
||||
push:
|
||||
tags:
|
||||
- 'python-v*'
|
||||
pull_request:
|
||||
# This should trigger a dry run (we skip the final publish step)
|
||||
paths:
|
||||
- .github/workflows/pypi-publish.yml
|
||||
- Cargo.toml # Changes in dependencies frequently break builds
|
||||
|
||||
jobs:
|
||||
linux:
|
||||
@@ -46,6 +51,7 @@ jobs:
|
||||
arm-build: ${{ matrix.config.platform == 'aarch64' }}
|
||||
manylinux: ${{ matrix.config.manylinux }}
|
||||
- uses: ./.github/workflows/upload_wheel
|
||||
if: startsWith(github.ref, 'refs/tags/python-v')
|
||||
with:
|
||||
pypi_token: ${{ secrets.LANCEDB_PYPI_API_TOKEN }}
|
||||
fury_token: ${{ secrets.FURY_TOKEN }}
|
||||
@@ -75,6 +81,7 @@ jobs:
|
||||
python-minor-version: 8
|
||||
args: "--release --strip --target ${{ matrix.config.target }} --features fp16kernels"
|
||||
- uses: ./.github/workflows/upload_wheel
|
||||
if: startsWith(github.ref, 'refs/tags/python-v')
|
||||
with:
|
||||
pypi_token: ${{ secrets.LANCEDB_PYPI_API_TOKEN }}
|
||||
fury_token: ${{ secrets.FURY_TOKEN }}
|
||||
@@ -96,10 +103,12 @@ jobs:
|
||||
args: "--release --strip"
|
||||
vcpkg_token: ${{ secrets.VCPKG_GITHUB_PACKAGES }}
|
||||
- uses: ./.github/workflows/upload_wheel
|
||||
if: startsWith(github.ref, 'refs/tags/python-v')
|
||||
with:
|
||||
pypi_token: ${{ secrets.LANCEDB_PYPI_API_TOKEN }}
|
||||
fury_token: ${{ secrets.FURY_TOKEN }}
|
||||
gh-release:
|
||||
if: startsWith(github.ref, 'refs/tags/python-v')
|
||||
runs-on: ubuntu-latest
|
||||
permissions:
|
||||
contents: write
|
||||
|
||||
10
.github/workflows/python.yml
vendored
@@ -13,6 +13,11 @@ concurrency:
|
||||
group: ${{ github.workflow }}-${{ github.event.pull_request.number || github.ref }}
|
||||
cancel-in-progress: true
|
||||
|
||||
env:
|
||||
# Color output for pytest is off by default.
|
||||
PYTEST_ADDOPTS: "--color=yes"
|
||||
FORCE_COLOR: "1"
|
||||
|
||||
jobs:
|
||||
lint:
|
||||
name: "Lint"
|
||||
@@ -131,6 +136,10 @@ jobs:
|
||||
- uses: ./.github/workflows/run_tests
|
||||
with:
|
||||
integration: true
|
||||
- name: Test without pylance or pandas
|
||||
run: |
|
||||
pip uninstall -y pylance pandas
|
||||
pytest -vv python/tests/test_table.py
|
||||
# Make sure wheels are not included in the Rust cache
|
||||
- name: Delete wheels
|
||||
run: rm -rf target/wheels
|
||||
@@ -219,6 +228,7 @@ jobs:
|
||||
- name: Install lancedb
|
||||
run: |
|
||||
pip install "pydantic<2"
|
||||
pip install pyarrow==16
|
||||
pip install --extra-index-url https://pypi.fury.io/lancedb/ -e .[tests]
|
||||
pip install tantivy
|
||||
- name: Run tests
|
||||
|
||||
4
.github/workflows/run_tests/action.yml
vendored
@@ -24,8 +24,8 @@ runs:
|
||||
- name: pytest (with integration)
|
||||
shell: bash
|
||||
if: ${{ inputs.integration == 'true' }}
|
||||
run: pytest -m "not slow" -x -v --durations=30 python/python/tests
|
||||
run: pytest -m "not slow" -vv --durations=30 python/python/tests
|
||||
- name: pytest (no integration tests)
|
||||
shell: bash
|
||||
if: ${{ inputs.integration != 'true' }}
|
||||
run: pytest -m "not slow and not s3_test" -x -v --durations=30 python/python/tests
|
||||
run: pytest -m "not slow and not s3_test" -vv --durations=30 python/python/tests
|
||||
|
||||
153
.github/workflows/rust.yml
vendored
@@ -40,6 +40,9 @@ jobs:
|
||||
with:
|
||||
fetch-depth: 0
|
||||
lfs: true
|
||||
- uses: actions-rust-lang/setup-rust-toolchain@v1
|
||||
with:
|
||||
components: rustfmt, clippy
|
||||
- uses: Swatinem/rust-cache@v2
|
||||
with:
|
||||
workspaces: rust
|
||||
@@ -157,153 +160,33 @@ jobs:
|
||||
|
||||
windows:
|
||||
runs-on: windows-2022
|
||||
strategy:
|
||||
matrix:
|
||||
target:
|
||||
- x86_64-pc-windows-msvc
|
||||
- aarch64-pc-windows-msvc
|
||||
defaults:
|
||||
run:
|
||||
working-directory: rust/lancedb
|
||||
steps:
|
||||
- uses: actions/checkout@v4
|
||||
- uses: Swatinem/rust-cache@v2
|
||||
with:
|
||||
workspaces: rust
|
||||
- name: Install Protoc v21.12
|
||||
working-directory: C:\
|
||||
run: choco install --no-progress protoc
|
||||
- name: Build
|
||||
run: |
|
||||
New-Item -Path 'C:\protoc' -ItemType Directory
|
||||
Set-Location C:\protoc
|
||||
Invoke-WebRequest https://github.com/protocolbuffers/protobuf/releases/download/v21.12/protoc-21.12-win64.zip -OutFile C:\protoc\protoc.zip
|
||||
7z x protoc.zip
|
||||
Add-Content $env:GITHUB_PATH "C:\protoc\bin"
|
||||
shell: powershell
|
||||
rustup target add ${{ matrix.target }}
|
||||
$env:VCPKG_ROOT = $env:VCPKG_INSTALLATION_ROOT
|
||||
cargo build --features remote --tests --locked --target ${{ matrix.target }}
|
||||
- name: Run tests
|
||||
# Can only run tests when target matches host
|
||||
if: ${{ matrix.target == 'x86_64-pc-windows-msvc' }}
|
||||
run: |
|
||||
$env:VCPKG_ROOT = $env:VCPKG_INSTALLATION_ROOT
|
||||
cargo test --features remote --locked
|
||||
|
||||
windows-arm64-cross:
|
||||
# We cross compile in Node releases, so we want to make sure
|
||||
# this can run successfully.
|
||||
runs-on: ubuntu-latest
|
||||
container: alpine:edge
|
||||
steps:
|
||||
- name: Checkout
|
||||
uses: actions/checkout@v4
|
||||
- name: Install dependencies (part 1)
|
||||
run: |
|
||||
set -e
|
||||
apk add protobuf-dev curl clang lld llvm19 grep npm bash msitools sed
|
||||
- name: Install rust
|
||||
uses: actions-rust-lang/setup-rust-toolchain@v1
|
||||
with:
|
||||
target: aarch64-pc-windows-msvc
|
||||
- name: Install dependencies (part 2)
|
||||
run: |
|
||||
set -e
|
||||
mkdir -p sysroot
|
||||
cd sysroot
|
||||
sh ../ci/sysroot-aarch64-pc-windows-msvc.sh
|
||||
- name: Check
|
||||
env:
|
||||
CC: clang
|
||||
AR: llvm-ar
|
||||
C_INCLUDE_PATH: /usr/aarch64-pc-windows-msvc/usr/include
|
||||
CARGO_BUILD_TARGET: aarch64-pc-windows-msvc
|
||||
RUSTFLAGS: -Ctarget-feature=+crt-static,+neon,+fp16,+fhm,+dotprod -Clinker=lld -Clink-arg=/LIBPATH:/usr/aarch64-pc-windows-msvc/usr/lib -Clink-arg=arm64rt.lib
|
||||
run: |
|
||||
source $HOME/.cargo/env
|
||||
cargo check --features remote --locked
|
||||
|
||||
windows-arm64:
|
||||
runs-on: windows-4x-arm
|
||||
steps:
|
||||
- name: Install Git
|
||||
run: |
|
||||
Invoke-WebRequest -Uri "https://github.com/git-for-windows/git/releases/download/v2.44.0.windows.1/Git-2.44.0-64-bit.exe" -OutFile "git-installer.exe"
|
||||
Start-Process -FilePath "git-installer.exe" -ArgumentList "/VERYSILENT", "/NORESTART" -Wait
|
||||
shell: powershell
|
||||
- name: Add Git to PATH
|
||||
run: |
|
||||
Add-Content $env:GITHUB_PATH "C:\Program Files\Git\bin"
|
||||
$env:Path = [System.Environment]::GetEnvironmentVariable("Path","Machine") + ";" + [System.Environment]::GetEnvironmentVariable("Path","User")
|
||||
shell: powershell
|
||||
- name: Configure Git symlinks
|
||||
run: git config --global core.symlinks true
|
||||
- uses: actions/checkout@v4
|
||||
- uses: actions/setup-python@v5
|
||||
with:
|
||||
python-version: "3.13"
|
||||
- name: Install Visual Studio Build Tools
|
||||
run: |
|
||||
Invoke-WebRequest -Uri "https://aka.ms/vs/17/release/vs_buildtools.exe" -OutFile "vs_buildtools.exe"
|
||||
Start-Process -FilePath "vs_buildtools.exe" -ArgumentList "--quiet", "--wait", "--norestart", "--nocache", `
|
||||
"--installPath", "C:\BuildTools", `
|
||||
"--add", "Microsoft.VisualStudio.Component.VC.Tools.ARM64", `
|
||||
"--add", "Microsoft.VisualStudio.Component.VC.Tools.x86.x64", `
|
||||
"--add", "Microsoft.VisualStudio.Component.Windows11SDK.22621", `
|
||||
"--add", "Microsoft.VisualStudio.Component.VC.ATL", `
|
||||
"--add", "Microsoft.VisualStudio.Component.VC.ATLMFC", `
|
||||
"--add", "Microsoft.VisualStudio.Component.VC.Llvm.Clang" -Wait
|
||||
shell: powershell
|
||||
- name: Add Visual Studio Build Tools to PATH
|
||||
run: |
|
||||
$vsPath = "C:\BuildTools\VC\Tools\MSVC"
|
||||
$latestVersion = (Get-ChildItem $vsPath | Sort-Object {[version]$_.Name} -Descending)[0].Name
|
||||
Add-Content $env:GITHUB_PATH "C:\BuildTools\VC\Tools\MSVC\$latestVersion\bin\Hostx64\arm64"
|
||||
Add-Content $env:GITHUB_PATH "C:\BuildTools\VC\Tools\MSVC\$latestVersion\bin\Hostx64\x64"
|
||||
Add-Content $env:GITHUB_PATH "C:\Program Files (x86)\Windows Kits\10\bin\10.0.22621.0\arm64"
|
||||
Add-Content $env:GITHUB_PATH "C:\Program Files (x86)\Windows Kits\10\bin\10.0.22621.0\x64"
|
||||
Add-Content $env:GITHUB_PATH "C:\BuildTools\VC\Tools\Llvm\x64\bin"
|
||||
|
||||
# Add MSVC runtime libraries to LIB
|
||||
$env:LIB = "C:\BuildTools\VC\Tools\MSVC\$latestVersion\lib\arm64;" +
|
||||
"C:\Program Files (x86)\Windows Kits\10\Lib\10.0.22621.0\um\arm64;" +
|
||||
"C:\Program Files (x86)\Windows Kits\10\Lib\10.0.22621.0\ucrt\arm64"
|
||||
Add-Content $env:GITHUB_ENV "LIB=$env:LIB"
|
||||
|
||||
# Add INCLUDE paths
|
||||
$env:INCLUDE = "C:\BuildTools\VC\Tools\MSVC\$latestVersion\include;" +
|
||||
"C:\Program Files (x86)\Windows Kits\10\Include\10.0.22621.0\ucrt;" +
|
||||
"C:\Program Files (x86)\Windows Kits\10\Include\10.0.22621.0\um;" +
|
||||
"C:\Program Files (x86)\Windows Kits\10\Include\10.0.22621.0\shared"
|
||||
Add-Content $env:GITHUB_ENV "INCLUDE=$env:INCLUDE"
|
||||
shell: powershell
|
||||
- name: Install Rust
|
||||
run: |
|
||||
Invoke-WebRequest https://win.rustup.rs/x86_64 -OutFile rustup-init.exe
|
||||
.\rustup-init.exe -y --default-host aarch64-pc-windows-msvc --default-toolchain 1.83.0
|
||||
shell: powershell
|
||||
- name: Add Rust to PATH
|
||||
run: |
|
||||
Add-Content $env:GITHUB_PATH "$env:USERPROFILE\.cargo\bin"
|
||||
shell: powershell
|
||||
- uses: Swatinem/rust-cache@v2
|
||||
with:
|
||||
workspaces: rust
|
||||
- name: Install 7-Zip ARM
|
||||
run: |
|
||||
New-Item -Path 'C:\7zip' -ItemType Directory
|
||||
Invoke-WebRequest https://7-zip.org/a/7z2408-arm64.exe -OutFile C:\7zip\7z-installer.exe
|
||||
Start-Process -FilePath C:\7zip\7z-installer.exe -ArgumentList '/S' -Wait
|
||||
shell: powershell
|
||||
- name: Add 7-Zip to PATH
|
||||
run: Add-Content $env:GITHUB_PATH "C:\Program Files\7-Zip"
|
||||
shell: powershell
|
||||
- name: Install Protoc v21.12
|
||||
working-directory: C:\
|
||||
run: |
|
||||
if (Test-Path 'C:\protoc') {
|
||||
Write-Host "Protoc directory exists, skipping installation"
|
||||
return
|
||||
}
|
||||
New-Item -Path 'C:\protoc' -ItemType Directory
|
||||
Set-Location C:\protoc
|
||||
Invoke-WebRequest https://github.com/protocolbuffers/protobuf/releases/download/v21.12/protoc-21.12-win64.zip -OutFile C:\protoc\protoc.zip
|
||||
& 'C:\Program Files\7-Zip\7z.exe' x protoc.zip
|
||||
shell: powershell
|
||||
- name: Add Protoc to PATH
|
||||
run: Add-Content $env:GITHUB_PATH "C:\protoc\bin"
|
||||
shell: powershell
|
||||
- name: Run tests
|
||||
run: |
|
||||
$env:VCPKG_ROOT = $env:VCPKG_INSTALLATION_ROOT
|
||||
cargo test --target aarch64-pc-windows-msvc --features remote --locked
|
||||
|
||||
msrv:
|
||||
# Check the minimum supported Rust version
|
||||
name: MSRV Check - Rust v${{ matrix.msrv }}
|
||||
|
||||
33
.github/workflows/update_package_lock/action.yml
vendored
@@ -1,33 +0,0 @@
|
||||
name: update_package_lock
|
||||
description: "Update node's package.lock"
|
||||
|
||||
inputs:
|
||||
github_token:
|
||||
required: true
|
||||
description: "github token for the repo"
|
||||
|
||||
runs:
|
||||
using: "composite"
|
||||
steps:
|
||||
- uses: actions/setup-node@v3
|
||||
with:
|
||||
node-version: 20
|
||||
- name: Set git configs
|
||||
shell: bash
|
||||
run: |
|
||||
git config user.name 'Lance Release'
|
||||
git config user.email 'lance-dev@lancedb.com'
|
||||
- name: Update package-lock.json file
|
||||
working-directory: ./node
|
||||
run: |
|
||||
npm install
|
||||
git add package-lock.json
|
||||
git commit -m "Updating package-lock.json"
|
||||
shell: bash
|
||||
- name: Push changes
|
||||
if: ${{ inputs.dry_run }} == "false"
|
||||
uses: ad-m/github-push-action@master
|
||||
with:
|
||||
github_token: ${{ inputs.github_token }}
|
||||
branch: main
|
||||
tags: true
|
||||
@@ -1,33 +0,0 @@
|
||||
name: update_package_lock_nodejs
|
||||
description: "Update nodejs's package.lock"
|
||||
|
||||
inputs:
|
||||
github_token:
|
||||
required: true
|
||||
description: "github token for the repo"
|
||||
|
||||
runs:
|
||||
using: "composite"
|
||||
steps:
|
||||
- uses: actions/setup-node@v3
|
||||
with:
|
||||
node-version: 20
|
||||
- name: Set git configs
|
||||
shell: bash
|
||||
run: |
|
||||
git config user.name 'Lance Release'
|
||||
git config user.email 'lance-dev@lancedb.com'
|
||||
- name: Update package-lock.json file
|
||||
working-directory: ./nodejs
|
||||
run: |
|
||||
npm install
|
||||
git add package-lock.json
|
||||
git commit -m "Updating package-lock.json"
|
||||
shell: bash
|
||||
- name: Push changes
|
||||
if: ${{ inputs.dry_run }} == "false"
|
||||
uses: ad-m/github-push-action@master
|
||||
with:
|
||||
github_token: ${{ inputs.github_token }}
|
||||
branch: main
|
||||
tags: true
|
||||
3
.gitignore
vendored
@@ -31,9 +31,6 @@ python/dist
|
||||
*.node
|
||||
**/node_modules
|
||||
**/.DS_Store
|
||||
node/dist
|
||||
node/examples/**/package-lock.json
|
||||
node/examples/**/dist
|
||||
nodejs/lancedb/native*
|
||||
dist
|
||||
|
||||
|
||||
80
CLAUDE.md
Normal file
@@ -0,0 +1,80 @@
LanceDB is a database designed for retrieval, including vector, full-text, and hybrid search.
It is a wrapper around Lance. There are two backends: local (in-process like SQLite) and
remote (against LanceDB Cloud).

The core of LanceDB is written in Rust. There are bindings in Python, Typescript, and Java.

Project layout:

* `rust/lancedb`: The LanceDB core Rust implementation.
* `python`: The Python bindings, using PyO3.
* `nodejs`: The Typescript bindings, using napi-rs
* `java`: The Java bindings

Common commands:

* Check for compiler errors: `cargo check --quiet --features remote --tests --examples`
* Run tests: `cargo test --quiet --features remote --tests`
* Run specific test: `cargo test --quiet --features remote -p <package_name> --test <test_name>`
* Lint: `cargo clippy --quiet --features remote --tests --examples`
* Format: `cargo fmt --all`

Before committing changes, run formatting.

## Coding tips

* When writing Rust doctests for things that require a connection or table reference,
  write them as a function instead of a fully executable test. This allows type checking
  to run but avoids needing a full test environment. For example:
  ```rust
  /// ```
  /// use lance_index::scalar::FullTextSearchQuery;
  /// use lancedb::query::{QueryBase, ExecutableQuery};
  ///
  /// # use lancedb::Table;
  /// # async fn query(table: &Table) -> Result<(), Box<dyn std::error::Error>> {
  /// let results = table.query()
  ///     .full_text_search(FullTextSearchQuery::new("hello world".into()))
  ///     .execute()
  ///     .await?;
  /// # Ok(())
  /// # }
  /// ```
  ```

## Example plan: adding a new method on Table

Adding a new method involves first adding it to the Rust core, then exposing it
in the Python and TypeScript bindings. There are both local and remote tables.
Remote tables are implemented via a HTTP API and require the `remote` cargo
feature flag to be enabled. Python has both sync and async methods.

Rust core changes:

1. Add method on `Table` struct in `rust/lancedb/src/table.rs` (calls `BaseTable` trait).
2. Add method to `BaseTable` trait in `rust/lancedb/src/table.rs`.
3. Implement new trait method on `NativeTable` in `rust/lancedb/src/table.rs`.
   * Test with unit test in `rust/lancedb/src/table.rs`.
4. Implement new trait method on `RemoteTable` in `rust/lancedb/src/remote/table.rs`.
   * Test with unit test in `rust/lancedb/src/remote/table.rs` against mocked endpoint.

Python bindings changes:

1. Add PyO3 method binding in `python/src/table.rs`. Run `make develop` to compile bindings.
2. Add types for PyO3 method in `python/python/lancedb/_lancedb.pyi`.
3. Add method to `AsyncTable` class in `python/python/lancedb/table.py`.
4. Add abstract method to `Table` abstract base class in `python/python/lancedb/table.py`.
5. Add concrete sync method to `LanceTable` class in `python/python/lancedb/table.py`.
   * Should use `LOOP.run()` to call the corresponding `AsyncTable` method.
6. Add concrete sync method to `RemoteTable` class in `python/python/lancedb/remote/table.py`.
7. Add unit test in `python/tests/test_table.py`.

TypeScript bindings changes:

1. Add napi-rs method binding on `Table` in `nodejs/src/table.rs`.
2. Run `npm run build` to generate TypeScript definitions.
3. Add typescript method on abstract class `Table` in `nodejs/src/table.ts`.
4. Add concrete method on `LocalTable` class in `nodejs/src/native_table.ts`.
   * Note: despite the name, this class is also used for remote tables.
5. Add test in `nodejs/__test__/table.test.ts`.
6. Run `npm run docs` to generate TypeScript documentation.
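Note (not part of the diff): step 5 of the Python bindings changes asks the sync method to call its `AsyncTable` counterpart through `LOOP.run()`. The sketch below illustrates that delegation pattern only. It assumes `LOOP` is an object that owns an event loop on a background thread. `BackgroundEventLoop`, `count_matching`, and the in-memory stand-in classes are hypothetical names made up for this illustration, not LanceDB's actual code.

```python
import asyncio
import threading


class BackgroundEventLoop:
    """Hypothetical helper: owns an event loop running on a daemon thread."""

    def __init__(self) -> None:
        self.loop = asyncio.new_event_loop()
        self.thread = threading.Thread(target=self.loop.run_forever, daemon=True)
        self.thread.start()

    def run(self, coro):
        # Schedule the coroutine on the background loop and block until it finishes.
        return asyncio.run_coroutine_threadsafe(coro, self.loop).result()


LOOP = BackgroundEventLoop()


class AsyncTable:
    """Stand-in for the async table; `count_matching` is a hypothetical new method."""

    def __init__(self, rows) -> None:
        self._rows = rows

    async def count_matching(self, min_price: float) -> int:
        await asyncio.sleep(0)  # placeholder for awaiting the native binding
        return sum(1 for row in self._rows if row["price"] >= min_price)


class LanceTable:
    """Stand-in for the sync table, which only delegates to the async one."""

    def __init__(self, async_table: AsyncTable) -> None:
        self._table = async_table

    def count_matching(self, min_price: float) -> int:
        # The sync method holds no logic of its own; it blocks on the async call.
        return LOOP.run(self._table.count_matching(min_price))


table = LanceTable(AsyncTable([{"price": 10.0}, {"price": 50.0}]))
print(table.count_matching(20.0))
```

Keeping the logic in the async method and making the sync method a thin wrapper means the two variants cannot drift apart, which is presumably why the plan lists the `AsyncTable` change before the `LanceTable` one.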
3978
Cargo.lock
generated
File diff suppressed because it is too large
67
Cargo.toml
@@ -1,11 +1,5 @@
|
||||
[workspace]
|
||||
members = [
|
||||
"rust/ffi/node",
|
||||
"rust/lancedb",
|
||||
"nodejs",
|
||||
"python",
|
||||
"java/core/lancedb-jni",
|
||||
]
|
||||
members = ["rust/lancedb", "nodejs", "python", "java/core/lancedb-jni"]
|
||||
# Python package needs to be built by maturin.
|
||||
exclude = ["python"]
|
||||
resolver = "2"
|
||||
@@ -21,52 +15,51 @@ categories = ["database-implementations"]
|
||||
rust-version = "1.78.0"
|
||||
|
||||
[workspace.dependencies]
|
||||
lance = { "version" = "=0.24.1", "features" = ["dynamodb"] }
|
||||
lance-io = { version = "=0.24.1" }
|
||||
lance-index = { version = "=0.24.1" }
|
||||
lance-linalg = { version = "=0.24.1" }
|
||||
lance-table = { version = "=0.24.1" }
|
||||
lance-testing = { version = "=0.24.1" }
|
||||
lance-datafusion = { version = "=0.24.1" }
|
||||
lance-encoding = { version = "=0.24.1" }
|
||||
lance = { "version" = "=0.34.0", default-features = false, "features" = ["dynamodb"], "tag" = "v0.34.0-beta.4", "git" = "https://github.com/lancedb/lance.git" }
|
||||
lance-io = { "version" = "=0.34.0", default-features = false, "tag" = "v0.34.0-beta.4", "git" = "https://github.com/lancedb/lance.git" }
|
||||
lance-index = { "version" = "=0.34.0", "tag" = "v0.34.0-beta.4", "git" = "https://github.com/lancedb/lance.git" }
|
||||
lance-linalg = { "version" = "=0.34.0", "tag" = "v0.34.0-beta.4", "git" = "https://github.com/lancedb/lance.git" }
|
||||
lance-table = { "version" = "=0.34.0", "tag" = "v0.34.0-beta.4", "git" = "https://github.com/lancedb/lance.git" }
|
||||
lance-testing = { "version" = "=0.34.0", "tag" = "v0.34.0-beta.4", "git" = "https://github.com/lancedb/lance.git" }
|
||||
lance-datafusion = { "version" = "=0.34.0", "tag" = "v0.34.0-beta.4", "git" = "https://github.com/lancedb/lance.git" }
|
||||
lance-encoding = { "version" = "=0.34.0", "tag" = "v0.34.0-beta.4", "git" = "https://github.com/lancedb/lance.git" }
|
||||
# Note that this one does not include pyarrow
|
||||
arrow = { version = "54.1", optional = false }
|
||||
arrow-array = "54.1"
|
||||
arrow-data = "54.1"
|
||||
arrow-ipc = "54.1"
|
||||
arrow-ord = "54.1"
|
||||
arrow-schema = "54.1"
|
||||
arrow-arith = "54.1"
|
||||
arrow-cast = "54.1"
|
||||
arrow = { version = "55.1", optional = false }
|
||||
arrow-array = "55.1"
|
||||
arrow-data = "55.1"
|
||||
arrow-ipc = "55.1"
|
||||
arrow-ord = "55.1"
|
||||
arrow-schema = "55.1"
|
||||
arrow-arith = "55.1"
|
||||
arrow-cast = "55.1"
|
||||
async-trait = "0"
|
||||
datafusion = { version = "45.0", default-features = false }
|
||||
datafusion-catalog = "45.0"
|
||||
datafusion-common = { version = "45.0", default-features = false }
|
||||
datafusion-execution = "45.0"
|
||||
datafusion-expr = "45.0"
|
||||
datafusion-physical-plan = "45.0"
|
||||
datafusion = { version = "48.0", default-features = false }
|
||||
datafusion-catalog = "48.0"
|
||||
datafusion-common = { version = "48.0", default-features = false }
|
||||
datafusion-execution = "48.0"
|
||||
datafusion-expr = "48.0"
|
||||
datafusion-physical-plan = "48.0"
|
||||
env_logger = "0.11"
|
||||
half = { "version" = "=2.4.1", default-features = false, features = [
|
||||
half = { "version" = "2.6.0", default-features = false, features = [
|
||||
"num-traits",
|
||||
] }
|
||||
futures = "0"
|
||||
log = "0.4"
|
||||
moka = { version = "0.12", features = ["future"] }
|
||||
object_store = "0.11.0"
|
||||
object_store = "0.12.0"
|
||||
pin-project = "1.0.7"
|
||||
snafu = "0.8"
|
||||
url = "2"
|
||||
num-traits = "0.2"
|
||||
rand = "0.8"
|
||||
rand = "0.9"
|
||||
regex = "1.10"
|
||||
lazy_static = "1"
|
||||
semver = "1.0.25"
|
||||
|
||||
crunchy = "0.2.4"
|
||||
# Temporary pins to work around downstream issues
|
||||
# https://github.com/apache/arrow-rs/commit/2fddf85afcd20110ce783ed5b4cdeb82293da30b
|
||||
chrono = "=0.4.39"
|
||||
chrono = "=0.4.41"
|
||||
# https://github.com/RustCrypto/formats/issues/1684
|
||||
base64ct = "=1.6.0"
|
||||
|
||||
# Workaround for: https://github.com/eira-fransham/crunchy/issues/13
|
||||
crunchy = "=0.2.2"
|
||||
# Workaround for: https://github.com/Lokathor/bytemuck/issues/306
|
||||
bytemuck_derive = ">=1.8.1, <1.9.0"
|
||||
|
||||
127
README.md
@@ -1,86 +1,97 @@
|
||||
<a href="https://cloud.lancedb.com" target="_blank">
|
||||
<img src="https://github.com/user-attachments/assets/92dad0a2-2a37-4ce1-b783-0d1b4f30a00c" alt="LanceDB Cloud Public Beta" width="100%" style="max-width: 100%;">
|
||||
</a>
|
||||
<div align="center">
|
||||
<p align="center">
|
||||
|
||||
<img width="275" alt="LanceDB Logo" src="https://github.com/lancedb/lancedb/assets/5846846/37d7c7ad-c2fd-4f56-9f16-fffb0d17c73a">
|
||||
[](https://lancedb.com)
|
||||
[](https://lancedb.com/)
|
||||
[](https://blog.lancedb.com/)
|
||||
[](https://discord.gg/zMM32dvNtd)
|
||||
[](https://twitter.com/lancedb)
|
||||
[](https://www.linkedin.com/company/lancedb/)
|
||||
|
||||
**Developer-friendly, database for multimodal AI**
|
||||
|
||||
<a href='https://github.com/lancedb/vectordb-recipes/tree/main' target="_blank"><img alt='LanceDB' src='https://img.shields.io/badge/VectorDB_Recipes-100000?style=for-the-badge&logo=LanceDB&logoColor=white&labelColor=645cfb&color=645cfb'/></a>
|
||||
<a href='https://lancedb.github.io/lancedb/' target="_blank"><img alt='lancdb' src='https://img.shields.io/badge/DOCS-100000?style=for-the-badge&logo=lancdb&logoColor=white&labelColor=645cfb&color=645cfb'/></a>
|
||||
[](https://blog.lancedb.com/)
|
||||
[](https://discord.gg/zMM32dvNtd)
|
||||
[](https://twitter.com/lancedb)
|
||||
[](https://gurubase.io/g/lancedb)
|
||||
<img src="docs/src/assets/lancedb.png" alt="LanceDB" width="50%">
|
||||
|
||||
</p>
|
||||
# **The Multimodal AI Lakehouse**
|
||||
|
||||
<img max-width="750px" alt="LanceDB Multimodal Search" src="https://github.com/lancedb/lancedb/assets/917119/09c5afc5-7816-4687-bae4-f2ca194426ec">
|
||||
[**How to Install** ](#how-to-install) ✦ [**Detailed Documentation**](https://lancedb.github.io/lancedb/) ✦ [**Tutorials and Recipes**](https://github.com/lancedb/vectordb-recipes/tree/main) ✦ [**Contributors**](#contributors)
|
||||
|
||||
**The ultimate multimodal data platform for AI/ML applications.**
|
||||
|
||||
LanceDB is designed for fast, scalable, and production-ready vector search. It is built on top of the Lance columnar format. You can store, index, and search over petabytes of multimodal data and vectors with ease.
|
||||
LanceDB is a central location where developers can build, train and analyze their AI workloads.
|
||||
|
||||
</p>
|
||||
</div>
|
||||
|
||||
<hr />
|
||||
<br>
|
||||
|
||||
LanceDB is an open-source database for vector-search built with persistent storage, which greatly simplifies retrieval, filtering and management of embeddings.
|
||||
## **Demo: Multimodal Search by Keyword, Vector or with SQL**
|
||||
<img max-width="750px" alt="LanceDB Multimodal Search" src="https://github.com/lancedb/lancedb/assets/917119/09c5afc5-7816-4687-bae4-f2ca194426ec">
|
||||
|
||||
The key features of LanceDB include:
|
||||
## **Star LanceDB to get updates!**
|
||||
|
||||
* Production-scale vector search with no servers to manage.
|
||||
<details>
|
||||
<summary>⭐ Click here ⭐ to see how fast we're growing!</summary>
|
||||
<picture>
|
||||
<source media="(prefers-color-scheme: dark)" srcset="https://api.star-history.com/svg?repos=lancedb/lancedb&theme=dark&type=Date">
|
||||
<img width="100%" src="https://api.star-history.com/svg?repos=lancedb/lancedb&theme=dark&type=Date">
|
||||
</picture>
|
||||
</details>
|
||||
|
||||
* Store, query and filter vectors, metadata and multi-modal data (text, images, videos, point clouds, and more).
|
||||
## **Key Features**:
|
||||
|
||||
* Support for vector similarity search, full-text search and SQL.
|
||||
- **Fast Vector Search**: Search billions of vectors in milliseconds with state-of-the-art indexing.
|
||||
- **Comprehensive Search**: Support for vector similarity search, full-text search and SQL.
|
||||
- **Multimodal Support**: Store, query and filter vectors, metadata and multimodal data (text, images, videos, point clouds, and more).
|
||||
- **Advanced Features**: Zero-copy, automatic versioning, manage versions of your data without needing extra infrastructure. GPU support in building vector index.
|
||||
|
||||
* Native Python and Javascript/Typescript support.
|
||||
### **Products**:
|
||||
- **Open Source & Local**: 100% open source, runs locally or in your cloud. No vendor lock-in.
|
||||
- **Cloud and Enterprise**: Production-scale vector search with no servers to manage. Complete data sovereignty and security.
|
||||
|
||||
* Zero-copy, automatic versioning, manage versions of your data without needing extra infrastructure.
|
||||
### **Ecosystem**:
|
||||
- **Columnar Storage**: Built on the Lance columnar format for efficient storage and analytics.
|
||||
- **Seamless Integration**: Python, Node.js, Rust, and REST APIs for easy integration. Native Python and Javascript/Typescript support.
|
||||
- **Rich Ecosystem**: Integrations with [**LangChain** 🦜️🔗](https://python.langchain.com/docs/integrations/vectorstores/lancedb/), [**LlamaIndex** 🦙](https://gpt-index.readthedocs.io/en/latest/examples/vector_stores/LanceDBIndexDemo.html), Apache-Arrow, Pandas, Polars, DuckDB and more on the way.
|
||||
|
||||
* GPU support in building vector index(*).
|
||||
## **How to Install**:
|
||||
|
||||
* Ecosystem integrations with [LangChain 🦜️🔗](https://python.langchain.com/docs/integrations/vectorstores/lancedb/), [LlamaIndex 🦙](https://gpt-index.readthedocs.io/en/latest/examples/vector_stores/LanceDBIndexDemo.html), Apache-Arrow, Pandas, Polars, DuckDB and more on the way.
|
||||
Follow the [Quickstart](https://lancedb.github.io/lancedb/basic/) doc to set up LanceDB locally.
|
||||
|
||||
LanceDB's core is written in Rust 🦀 and is built using <a href="https://github.com/lancedb/lance">Lance</a>, an open-source columnar format designed for performant ML workloads.
|
||||
**API & SDK:** We also support Python, Typescript and Rust SDKs
|
||||
|
||||
## Quick Start
|
||||
| Interface | Documentation |
|
||||
|-----------|---------------|
|
||||
| Python SDK | https://lancedb.github.io/lancedb/python/python/ |
|
||||
| Typescript SDK | https://lancedb.github.io/lancedb/js/globals/ |
|
||||
| Rust SDK | https://docs.rs/lancedb/latest/lancedb/index.html |
|
||||
| REST API | https://docs.lancedb.com/api-reference/introduction |
|
||||
|
||||
**Javascript**
|
||||
```shell
|
||||
npm install @lancedb/lancedb
|
||||
```
|
||||
## **Join Us and Contribute**
|
||||
|
||||
```javascript
|
||||
import * as lancedb from "@lancedb/lancedb";
|
||||
We welcome contributions from everyone! Whether you're a developer, researcher, or just someone who wants to help out.
|
||||
|
||||
const db = await lancedb.connect("data/sample-lancedb");
|
||||
const table = await db.createTable("vectors", [
|
||||
{ id: 1, vector: [0.1, 0.2], item: "foo", price: 10 },
|
||||
{ id: 2, vector: [1.1, 1.2], item: "bar", price: 50 },
|
||||
], {mode: 'overwrite'});
|
||||
If you have any suggestions or feature requests, please feel free to open an issue on GitHub or discuss it on our [**Discord**](https://discord.gg/G5DcmnZWKB) server.
|
||||
|
||||
[**Check out the GitHub Issues**](https://github.com/lancedb/lancedb/issues) if you would like to work on the features that are planned for the future. If you have any suggestions or feature requests, please feel free to open an issue on GitHub.
|
||||
|
||||
## **Contributors**
|
||||
|
||||
<a href="https://github.com/lancedb/lancedb/graphs/contributors">
|
||||
<img src="https://contrib.rocks/image?repo=lancedb/lancedb" />
|
||||
</a>
|
||||
|
||||
|
||||
const query = table.vectorSearch([0.1, 0.3]).limit(2);
|
||||
const results = await query.toArray();
|
||||
## **Stay in Touch With Us**
|
||||
<div align="center">
|
||||
|
||||
// You can also search for rows by specific criteria without involving a vector search.
|
||||
const rowsByCriteria = await table.query().where("price >= 10").toArray();
|
||||
```
|
||||
</br>
|
||||
|
||||
**Python**
|
||||
```shell
|
||||
pip install lancedb
|
||||
```
|
||||
[](https://lancedb.com/)
|
||||
[](https://blog.lancedb.com/)
|
||||
[](https://discord.gg/zMM32dvNtd)
|
||||
[](https://twitter.com/lancedb)
|
||||
[](https://www.linkedin.com/company/lancedb/)
|
||||
|
||||
```python
|
||||
import lancedb
|
||||
|
||||
uri = "data/sample-lancedb"
|
||||
db = lancedb.connect(uri)
|
||||
table = db.create_table("my_table",
|
||||
data=[{"vector": [3.1, 4.1], "item": "foo", "price": 10.0},
|
||||
{"vector": [5.9, 26.5], "item": "bar", "price": 20.0}])
|
||||
result = table.search([100, 100]).limit(2).to_pandas()
|
||||
```
|
||||
|
||||
## Blogs, Tutorials & Videos
|
||||
* 📈 <a href="https://blog.lancedb.com/benchmarking-random-access-in-lance/">2000x better performance with Lance over Parquet</a>
|
||||
* 🤖 <a href="https://github.com/lancedb/vectordb-recipes/tree/main/examples/Youtube-Search-QA-Bot">Build a question and answer bot with LanceDB</a>
|
||||
</div>
|
||||
|
||||
@@ -1,22 +0,0 @@
|
||||
#!/bin/bash
|
||||
set -e
|
||||
ARCH=${1:-x86_64}
|
||||
TARGET_TRIPLE=${2:-x86_64-unknown-linux-gnu}
|
||||
|
||||
# We pass down the current user so that when we later mount the local files
|
||||
# into the container, the files are accessible by the current user.
|
||||
pushd ci/manylinux_node
|
||||
docker build \
|
||||
-t lancedb-node-manylinux \
|
||||
--build-arg="ARCH=$ARCH" \
|
||||
--build-arg="DOCKER_USER=$(id -u)" \
|
||||
--progress=plain \
|
||||
.
|
||||
popd
|
||||
|
||||
# We turn on memory swap to avoid OOM killer
|
||||
docker run \
|
||||
-v $(pwd):/io -w /io \
|
||||
--memory-swap=-1 \
|
||||
lancedb-node-manylinux \
|
||||
bash ci/manylinux_node/build_vectordb.sh $ARCH $TARGET_TRIPLE
|
||||
@@ -1,21 +0,0 @@
|
||||
#!/bin/bash
|
||||
set -e
|
||||
ARCH=${1:-x86_64}
|
||||
|
||||
# We pass down the current user so that when we later mount the local files
|
||||
# into the container, the files are accessible by the current user.
|
||||
pushd ci/manylinux_node
|
||||
docker build \
|
||||
-t lancedb-node-manylinux-$ARCH \
|
||||
--build-arg="ARCH=$ARCH" \
|
||||
--build-arg="DOCKER_USER=$(id -u)" \
|
||||
--progress=plain \
|
||||
.
|
||||
popd
|
||||
|
||||
# We turn on memory swap to avoid OOM killer
|
||||
docker run \
|
||||
-v $(pwd):/io -w /io \
|
||||
--memory-swap=-1 \
|
||||
lancedb-node-manylinux-$ARCH \
|
||||
bash ci/manylinux_node/build_lancedb.sh $ARCH
|
||||
@@ -1,34 +0,0 @@
|
||||
# Builds the macOS artifacts (node binaries).
|
||||
# Usage: ./ci/build_macos_artifacts.sh [target]
|
||||
# Targets supported: x86_64-apple-darwin aarch64-apple-darwin
|
||||
set -e
|
||||
|
||||
prebuild_rust() {
|
||||
# Building here for the sake of easier debugging.
|
||||
pushd rust/ffi/node
|
||||
echo "Building rust library for $1"
|
||||
export RUST_BACKTRACE=1
|
||||
cargo build --release --target $1
|
||||
popd
|
||||
}
|
||||
|
||||
build_node_binaries() {
|
||||
pushd node
|
||||
echo "Building node library for $1"
|
||||
npm run build-release -- --target $1
|
||||
npm run pack-build -- --target $1
|
||||
popd
|
||||
}
|
||||
|
||||
if [ -n "$1" ]; then
|
||||
targets=$1
|
||||
else
|
||||
targets="x86_64-apple-darwin aarch64-apple-darwin"
|
||||
fi
|
||||
|
||||
echo "Building artifacts for targets: $targets"
|
||||
for target in $targets
|
||||
do
|
||||
prebuild_rust $target
|
||||
build_node_binaries $target
|
||||
done
|
||||
@@ -1,34 +0,0 @@
|
||||
# Builds the macOS artifacts (nodejs binaries).
|
||||
# Usage: ./ci/build_macos_artifacts_nodejs.sh [target]
|
||||
# Targets supported: x86_64-apple-darwin aarch64-apple-darwin
|
||||
set -e
|
||||
|
||||
prebuild_rust() {
|
||||
# Building here for the sake of easier debugging.
|
||||
pushd rust/lancedb
|
||||
echo "Building rust library for $1"
|
||||
export RUST_BACKTRACE=1
|
||||
cargo build --release --target $1
|
||||
popd
|
||||
}
|
||||
|
||||
build_node_binaries() {
|
||||
pushd nodejs
|
||||
echo "Building nodejs library for $1"
|
||||
export RUST_TARGET=$1
|
||||
npm run build-release
|
||||
popd
|
||||
}
|
||||
|
||||
if [ -n "$1" ]; then
|
||||
targets=$1
|
||||
else
|
||||
targets="x86_64-apple-darwin aarch64-apple-darwin"
|
||||
fi
|
||||
|
||||
echo "Building artifacts for targets: $targets"
|
||||
for target in $targets
|
||||
do
|
||||
prebuild_rust $target
|
||||
build_node_binaries $target
|
||||
done
|
||||
@@ -1,42 +0,0 @@
# Builds the Windows artifacts (node binaries).
# Usage: .\ci\build_windows_artifacts.ps1 [target]
# Targets supported:
# - x86_64-pc-windows-msvc
# - i686-pc-windows-msvc
# - aarch64-pc-windows-msvc

function Prebuild-Rust {
    param (
        [string]$target
    )

    # Building here for the sake of easier debugging.
    Push-Location -Path "rust/ffi/node"
    Write-Host "Building rust library for $target"
    $env:RUST_BACKTRACE=1
    cargo build --release --target $target
    Pop-Location
}

function Build-NodeBinaries {
    param (
        [string]$target
    )

    Push-Location -Path "node"
    Write-Host "Building node library for $target"
    npm run build-release -- --target $target
    npm run pack-build -- --target $target
    Pop-Location
}

$targets = $args[0]
if (-not $targets) {
    $targets = "x86_64-pc-windows-msvc", "aarch64-pc-windows-msvc"
}

Write-Host "Building artifacts for targets: $targets"
foreach ($target in $targets) {
    Prebuild-Rust $target
    Build-NodeBinaries $target
}
@@ -1,42 +0,0 @@
# Builds the Windows artifacts (nodejs binaries).
# Usage: .\ci\build_windows_artifacts_nodejs.ps1 [target]
# Targets supported:
# - x86_64-pc-windows-msvc
# - i686-pc-windows-msvc
# - aarch64-pc-windows-msvc

function Prebuild-Rust {
    param (
        [string]$target
    )

    # Building here for the sake of easier debugging.
    Push-Location -Path "rust/lancedb"
    Write-Host "Building rust library for $target"
    $env:RUST_BACKTRACE=1
    cargo build --release --target $target
    Pop-Location
}

function Build-NodeBinaries {
    param (
        [string]$target
    )

    Push-Location -Path "nodejs"
    Write-Host "Building nodejs library for $target"
    $env:RUST_TARGET=$target
    npm run build-release
    Pop-Location
}

$targets = $args[0]
if (-not $targets) {
    $targets = "x86_64-pc-windows-msvc", "aarch64-pc-windows-msvc"
}

Write-Host "Building artifacts for targets: $targets"
foreach ($target in $targets) {
    Prebuild-Rust $target
    Build-NodeBinaries $target
}
@@ -1,31 +0,0 @@
# Many linux dockerfile with Rust, Node, and Lance dependencies installed.
# This container allows building the node modules native libraries in an
# environment with a very old glibc, so that we are compatible with a wide
# range of linux distributions.
ARG ARCH=x86_64

FROM quay.io/pypa/manylinux_2_28_${ARCH}

ARG ARCH=x86_64
ARG DOCKER_USER=default_user

# Install static openssl
COPY install_openssl.sh install_openssl.sh
RUN ./install_openssl.sh ${ARCH} > /dev/null

# Protobuf is also installed as root.
COPY install_protobuf.sh install_protobuf.sh
RUN ./install_protobuf.sh ${ARCH}

ENV DOCKER_USER=${DOCKER_USER}
# Create a group and user, but only if it doesn't exist
RUN echo ${ARCH} && id -u ${DOCKER_USER} >/dev/null 2>&1 || adduser --user-group --create-home --uid ${DOCKER_USER} build_user

# We switch to the user to install Rust and Node, since those like to be
# installed at the user level.
USER ${DOCKER_USER}

COPY prepare_manylinux_node.sh prepare_manylinux_node.sh
RUN cp /prepare_manylinux_node.sh $HOME/ && \
    cd $HOME && \
    ./prepare_manylinux_node.sh ${ARCH}
@@ -1,19 +0,0 @@
#!/bin/bash
# Builds the nodejs module for manylinux. Invoked by ci/build_linux_artifacts_nodejs.sh.
set -e
ARCH=${1:-x86_64}

if [ "$ARCH" = "x86_64" ]; then
    export OPENSSL_LIB_DIR=/usr/local/lib64/
else
    export OPENSSL_LIB_DIR=/usr/local/lib/
fi
export OPENSSL_STATIC=1
export OPENSSL_INCLUDE_DIR=/usr/local/include/openssl

#Alpine doesn't have .bashrc
FILE=$HOME/.bashrc && test -f $FILE && source $FILE

cd nodejs
npm ci
npm run build-release
@@ -1,21 +0,0 @@
#!/bin/bash
# Builds the node module for manylinux. Invoked by ci/build_linux_artifacts.sh.
set -e
ARCH=${1:-x86_64}
TARGET_TRIPLE=${2:-x86_64-unknown-linux-gnu}

if [ "$ARCH" = "x86_64" ]; then
    export OPENSSL_LIB_DIR=/usr/local/lib64/
else
    export OPENSSL_LIB_DIR=/usr/local/lib/
fi
export OPENSSL_STATIC=1
export OPENSSL_INCLUDE_DIR=/usr/local/include/openssl

#Alpine doesn't have .bashrc
FILE=$HOME/.bashrc && test -f $FILE && source $FILE

cd node
npm ci
npm run build-release
npm run pack-build -- -t $TARGET_TRIPLE
@@ -1,26 +0,0 @@
#!/bin/bash
# Builds openssl from source so we can statically link to it

# this is to avoid the error we get with the system installation:
# /usr/bin/ld: <library>: version node not found for symbol SSLeay@@OPENSSL_1.0.1
# /usr/bin/ld: failed to set dynamic section sizes: Bad value
set -e

git clone -b OpenSSL_1_1_1v \
    --single-branch \
    https://github.com/openssl/openssl.git

pushd openssl

if [[ $1 == x86_64* ]]; then
    ARCH=linux-x86_64
else
    # gnu target
    ARCH=linux-aarch64
fi

./Configure no-shared $ARCH

make

make install
@@ -1,15 +0,0 @@
#!/bin/bash
# Installs protobuf compiler. Should be run as root.
set -e

if [[ $1 == x86_64* ]]; then
    ARCH=x86_64
else
    # gnu target
    ARCH=aarch_64
fi

PB_REL=https://github.com/protocolbuffers/protobuf/releases
PB_VERSION=23.1
curl -LO $PB_REL/download/v$PB_VERSION/protoc-$PB_VERSION-linux-$ARCH.zip
unzip protoc-$PB_VERSION-linux-$ARCH.zip -d /usr/local
@@ -1,21 +0,0 @@
#!/bin/bash
set -e

install_node() {
    echo "Installing node..."

    curl -o- https://raw.githubusercontent.com/nvm-sh/nvm/v0.34.0/install.sh | bash

    source "$HOME"/.bashrc

    nvm install --no-progress 18
}

install_rust() {
    echo "Installing rust..."
    curl https://sh.rustup.rs -sSf | bash -s -- -y
    export PATH="$PATH:/root/.cargo/bin"
}

install_node
install_rust
265
ci/set_lance_version.py
Normal file
@@ -0,0 +1,265 @@
import argparse
import sys
import json


def run_command(command: str) -> str:
    """
    Run a shell command and return stdout as a string.
    If exit code is not 0, raise an exception with the stderr output.
    """
    import subprocess

    result = subprocess.run(command, shell=True, capture_output=True, text=True)
    if result.returncode != 0:
        raise Exception(f"Command failed with error: {result.stderr.strip()}")
    return result.stdout.strip()


def get_latest_stable_version() -> str:
    version_line = run_command("cargo info lance | grep '^version:'")
    version = version_line.split(" ")[1].strip()
    return version


def get_latest_preview_version() -> str:
    lance_tags = run_command(
        "git ls-remote --tags https://github.com/lancedb/lance.git | grep 'refs/tags/v[0-9beta.-]\\+$'"
    ).splitlines()
    lance_tags = (
        tag.split("refs/tags/")[1]
        for tag in lance_tags
        if "refs/tags/" in tag and "beta" in tag
    )
    from packaging.version import Version

    latest = max(
        (tag[1:] for tag in lance_tags if tag.startswith("v")), key=lambda t: Version(t)
    )
    return str(latest)


def extract_features(line: str) -> list:
    """
    Extracts the features from a line in Cargo.toml.
    Example: 'lance = { "version" = "=0.29.0", "features" = ["dynamodb"] }'
    Returns: ['dynamodb']
    """
    import re

    match = re.search(r'"features"\s*=\s*\[\s*(.*?)\s*\]', line, re.DOTALL)
    if match:
        features_str = match.group(1)
        return [f.strip('"') for f in features_str.split(",") if len(f) > 0]
    return []


def extract_default_features(line: str) -> bool:
    """
    Checks if default-features = false is present in a line in Cargo.toml.
    Example: 'lance = { "version" = "=0.29.0", default-features = false, "features" = ["dynamodb"] }'
    Returns: True if default-features = false is present, False otherwise
    """
    import re

    match = re.search(r'default-features\s*=\s*false', line)
    return match is not None


def dict_to_toml_line(package_name: str, config: dict) -> str:
    """
    Converts a configuration dictionary to a TOML dependency line.
    Dictionary insertion order is preserved (Python 3.7+), so the caller
    controls the order of fields in the output.

    Args:
        package_name: The name of the package (e.g., "lance", "lance-io")
        config: Dictionary with keys like "version", "path", "git", "tag", "features", "default-features"
                The order of keys in this dict determines the order in the output.

    Returns:
        A properly formatted TOML line with a trailing newline
    """
    # If only version is specified, use simple format
    if len(config) == 1 and "version" in config:
        return f'{package_name} = "{config["version"]}"\n'

    # Otherwise, use inline table format
    parts = []
    for key, value in config.items():
        if key == "default-features" and not value:
            parts.append("default-features = false")
        elif key == "features":
            parts.append(f'"features" = {json.dumps(value)}')
        elif isinstance(value, str):
            parts.append(f'"{key}" = "{value}"')
        else:
            # This shouldn't happen with our current usage
            parts.append(f'"{key}" = {json.dumps(value)}')

    return f'{package_name} = {{ {", ".join(parts)} }}\n'


def update_cargo_toml(line_updater):
    """
    Updates the Cargo.toml file by applying the line_updater function to each line.
    The line_updater function should take a line as input and return the updated line.
    """
    with open("Cargo.toml", "r") as f:
        lines = f.readlines()

    new_lines = []
    lance_line = ""
    is_parsing_lance_line = False
    for line in lines:
        if line.startswith("lance"):
            # Check if this is a single-line or multi-line entry
            # Single-line entries either:
            # 1. End with } (complete inline table)
            # 2. End with " (simple version string)
            # Multi-line entries start with { but don't end with }
            if line.strip().endswith("}") or line.strip().endswith('"'):
                # Single-line entry - process immediately
                new_lines.append(line_updater(line))
            elif "{" in line and not line.strip().endswith("}"):
                # Multi-line entry - start accumulating
                lance_line = line
                is_parsing_lance_line = True
            else:
                # Single-line entry without quotes or braces (shouldn't happen but handle it)
                new_lines.append(line_updater(line))
        elif is_parsing_lance_line:
            lance_line += line
            if line.strip().endswith("}"):
                new_lines.append(line_updater(lance_line))
                lance_line = ""
                is_parsing_lance_line = False
        else:
            # Keep the line unchanged
            new_lines.append(line)

    with open("Cargo.toml", "w") as f:
        f.writelines(new_lines)


def set_stable_version(version: str):
    """
    Sets lines to
    lance = { "version" = "=0.29.0", default-features = false, "features" = ["dynamodb"] }
    lance-io = { "version" = "=0.29.0", default-features = false }
    ...
    """

    def line_updater(line: str) -> str:
        package_name = line.split("=", maxsplit=1)[0].strip()

        # Build config in desired order: version, default-features, features
        config = {"version": f"={version}"}

        if extract_default_features(line):
            config["default-features"] = False

        features = extract_features(line)
        if features:
            config["features"] = features

        return dict_to_toml_line(package_name, config)

    update_cargo_toml(line_updater)


def set_preview_version(version: str):
    """
    Sets lines to
    lance = { "version" = "=0.29.0", default-features = false, "features" = ["dynamodb"], "tag" = "v0.29.0-beta.2", "git" = "https://github.com/lancedb/lance.git" }
    lance-io = { "version" = "=0.29.0", default-features = false, "tag" = "v0.29.0-beta.2", "git" = "https://github.com/lancedb/lance.git" }
    ...
    """

    def line_updater(line: str) -> str:
        package_name = line.split("=", maxsplit=1)[0].strip()
        base_version = version.split("-")[0]  # Get the base version without beta suffix

        # Build config in desired order: version, default-features, features, tag, git
        config = {"version": f"={base_version}"}

        if extract_default_features(line):
            config["default-features"] = False

        features = extract_features(line)
        if features:
            config["features"] = features

        config["tag"] = f"v{version}"
        config["git"] = "https://github.com/lancedb/lance.git"

        return dict_to_toml_line(package_name, config)

    update_cargo_toml(line_updater)


def set_local_version():
    """
    Sets lines to
    lance = { "path" = "../lance/rust/lance", default-features = false, "features" = ["dynamodb"] }
    lance-io = { "path" = "../lance/rust/lance-io", default-features = false }
    ...
    """

    def line_updater(line: str) -> str:
        package_name = line.split("=", maxsplit=1)[0].strip()

        # Build config in desired order: path, default-features, features
        config = {"path": f"../lance/rust/{package_name}"}

        if extract_default_features(line):
            config["default-features"] = False

        features = extract_features(line)
        if features:
            config["features"] = features

        return dict_to_toml_line(package_name, config)

    update_cargo_toml(line_updater)


parser = argparse.ArgumentParser(description="Set the version of the Lance package.")
parser.add_argument(
    "version",
    type=str,
    help="The version to set for the Lance package. Use 'stable' for the latest stable version, 'preview' for latest preview version, or a specific version number (e.g., '0.1.0'). You can also specify 'local' to use a local path.",
)
args = parser.parse_args()

if args.version == "stable":
    latest_stable_version = get_latest_stable_version()
    print(
        f"Found latest stable version: \033[1mv{latest_stable_version}\033[0m",
        file=sys.stderr,
    )
    set_stable_version(latest_stable_version)
elif args.version == "preview":
    latest_preview_version = get_latest_preview_version()
    print(
        f"Found latest preview version: \033[1mv{latest_preview_version}\033[0m",
        file=sys.stderr,
    )
    set_preview_version(latest_preview_version)
elif args.version == "local":
    set_local_version()
else:
    # Parse the version number.
    version = args.version
    # Ignore initial v if present.
    if version.startswith("v"):
        version = version[1:]

    if "beta" in version:
        set_preview_version(version)
    else:
        set_stable_version(version)

print("Updating lockfiles...", file=sys.stderr, end="")
run_command("cargo metadata > /dev/null")
print(" done.", file=sys.stderr)
27
ci/update_lockfiles.sh
Executable file
@@ -0,0 +1,27 @@
#!/usr/bin/env bash
set -euo pipefail

AMEND=false

for arg in "$@"; do
  if [[ "$arg" == "--amend" ]]; then
    AMEND=true
  fi
done

# This updates the lockfile without building
cargo metadata --quiet > /dev/null

pushd nodejs || exit 1
npm install --package-lock-only --silent
popd

if git diff --quiet --exit-code; then
  echo "No lockfile changes to commit; skipping amend."
elif $AMEND; then
  git add Cargo.lock nodejs/package-lock.json
  git commit --amend --no-edit
else
  git add Cargo.lock nodejs/package-lock.json
  git commit -m "Update lockfiles"
fi
@@ -2,7 +2,7 @@
|
||||
|
||||
LanceDB docs are deployed to https://lancedb.github.io/lancedb/.
|
||||
|
||||
Docs is built and deployed automatically by [Github Actions](.github/workflows/docs.yml)
|
||||
Docs is built and deployed automatically by [Github Actions](../.github/workflows/docs.yml)
|
||||
whenever a commit is pushed to the `main` branch. So it is possible for the docs to show
|
||||
unreleased features.
|
||||
|
||||
|
||||
@@ -124,6 +124,9 @@ nav:
|
||||
- Overview: hybrid_search/hybrid_search.md
|
||||
- Comparing Rerankers: hybrid_search/eval.md
|
||||
- Airbnb financial data example: notebooks/hybrid_search.ipynb
|
||||
- Late interaction with MultiVector search:
|
||||
- Overview: guides/multi-vector.md
|
||||
- Example: notebooks/Multivector_on_LanceDB.ipynb
|
||||
- RAG:
|
||||
- Vanilla RAG: rag/vanilla_rag.md
|
||||
- Multi-head RAG: rag/multi_head_rag.md
|
||||
@@ -190,6 +193,7 @@ nav:
|
||||
- Pandas and PyArrow: python/pandas_and_pyarrow.md
|
||||
- Polars: python/polars_arrow.md
|
||||
- DuckDB: python/duckdb.md
|
||||
- Datafusion: python/datafusion.md
|
||||
- LangChain:
|
||||
- LangChain 🔗: integrations/langchain.md
|
||||
- LangChain demo: notebooks/langchain_demo.ipynb
|
||||
@@ -202,6 +206,7 @@ nav:
|
||||
- PromptTools: integrations/prompttools.md
|
||||
- dlt: integrations/dlt.md
|
||||
- phidata: integrations/phidata.md
|
||||
- Genkit: integrations/genkit.md
|
||||
- 🎯 Examples:
|
||||
- Overview: examples/index.md
|
||||
- 🐍 Python:
|
||||
@@ -233,13 +238,6 @@ nav:
|
||||
- 👾 JavaScript (vectordb): javascript/modules.md
|
||||
- 👾 JavaScript (lancedb): js/globals.md
|
||||
- 🦀 Rust: https://docs.rs/lancedb/latest/lancedb/
|
||||
- ☁️ LanceDB Cloud:
|
||||
- Overview: cloud/index.md
|
||||
- API reference:
|
||||
- 🐍 Python: python/saas-python.md
|
||||
- 👾 JavaScript: javascript/modules.md
|
||||
- REST API: cloud/rest.md
|
||||
- FAQs: cloud/cloud_faq.md
|
||||
|
||||
- Quick start: basic.md
|
||||
- Concepts:
|
||||
@@ -251,6 +249,7 @@ nav:
|
||||
- Data management: concepts/data_management.md
|
||||
- Guides:
|
||||
- Working with tables: guides/tables.md
|
||||
- Working with SQL: guides/sql_querying.md
|
||||
- Building an ANN index: ann_indexes.md
|
||||
- Vector Search: search.md
|
||||
- Full-text search (native): fts.md
|
||||
@@ -260,6 +259,9 @@ nav:
|
||||
- Overview: hybrid_search/hybrid_search.md
|
||||
- Comparing Rerankers: hybrid_search/eval.md
|
||||
- Airbnb financial data example: notebooks/hybrid_search.ipynb
|
||||
- Late interaction with MultiVector search:
|
||||
- Overview: guides/multi-vector.md
|
||||
- Document search Example: notebooks/Multivector_on_LanceDB.ipynb
|
||||
- RAG:
|
||||
- Vanilla RAG: rag/vanilla_rag.md
|
||||
- Multi-head RAG: rag/multi_head_rag.md
|
||||
@@ -324,6 +326,7 @@ nav:
|
||||
- Pandas and PyArrow: python/pandas_and_pyarrow.md
|
||||
- Polars: python/polars_arrow.md
|
||||
- DuckDB: python/duckdb.md
|
||||
- Datafusion: python/datafusion.md
|
||||
- LangChain 🦜️🔗↗: integrations/langchain.md
|
||||
- LangChain.js 🦜️🔗↗: https://js.langchain.com/docs/integrations/vectorstores/lancedb
|
||||
- LlamaIndex 🦙↗: integrations/llamaIndex.md
|
||||
@@ -332,6 +335,7 @@ nav:
|
||||
- PromptTools: integrations/prompttools.md
|
||||
- dlt: integrations/dlt.md
|
||||
- phidata: integrations/phidata.md
|
||||
- Genkit: integrations/genkit.md
|
||||
- Examples:
|
||||
- examples/index.md
|
||||
- 🐍 Python:
|
||||
@@ -363,13 +367,6 @@ nav:
|
||||
- Javascript (vectordb): javascript/modules.md
|
||||
- Javascript (lancedb): js/globals.md
|
||||
- Rust: https://docs.rs/lancedb/latest/lancedb/index.html
|
||||
- LanceDB Cloud:
|
||||
- Overview: cloud/index.md
|
||||
- API reference:
|
||||
- 🐍 Python: python/saas-python.md
|
||||
- 👾 JavaScript: javascript/modules.md
|
||||
- REST API: cloud/rest.md
|
||||
- FAQs: cloud/cloud_faq.md
|
||||
|
||||
extra_css:
|
||||
- styles/global.css
|
||||
|
||||
@@ -171,7 +171,7 @@ paths:
|
||||
distance_type:
|
||||
type: string
|
||||
description: |
|
||||
The distance metric to use for search. L2, Cosine, Dot and Hamming are supported. Default is L2.
|
||||
The distance metric to use for search. l2, Cosine, Dot and Hamming are supported. Default is l2.
|
||||
bypass_vector_index:
|
||||
type: boolean
|
||||
description: |
|
||||
@@ -450,7 +450,7 @@ paths:
|
||||
type: string
|
||||
nullable: false
|
||||
description: |
|
||||
The metric type to use for the index. L2, Cosine, Dot are supported.
|
||||
The metric type to use for the index. l2, Cosine, Dot are supported.
|
||||
index_type:
|
||||
type: string
|
||||
responses:
|
||||
|
||||
5
docs/overrides/partials/main.html
Normal file
@@ -0,0 +1,5 @@
|
||||
{% extends "base.html" %}
|
||||
|
||||
{% block announce %}
|
||||
📚 Starting June 1st, 2025, please use <a href="https://lancedb.github.io/documentation" target="_blank" rel="noopener noreferrer">lancedb.github.io/documentation</a> for the latest docs.
|
||||
{% endblock %}
|
||||
12
docs/package-lock.json
generated
@@ -19,7 +19,7 @@
|
||||
},
|
||||
"../node": {
|
||||
"name": "vectordb",
|
||||
"version": "0.12.0",
|
||||
"version": "0.21.2-beta.0",
|
||||
"cpu": [
|
||||
"x64",
|
||||
"arm64"
|
||||
@@ -65,11 +65,11 @@
|
||||
"uuid": "^9.0.0"
|
||||
},
|
||||
"optionalDependencies": {
|
||||
"@lancedb/vectordb-darwin-arm64": "0.12.0",
|
||||
"@lancedb/vectordb-darwin-x64": "0.12.0",
|
||||
"@lancedb/vectordb-linux-arm64-gnu": "0.12.0",
|
||||
"@lancedb/vectordb-linux-x64-gnu": "0.12.0",
|
||||
"@lancedb/vectordb-win32-x64-msvc": "0.12.0"
|
||||
"@lancedb/vectordb-darwin-arm64": "0.21.2-beta.0",
|
||||
"@lancedb/vectordb-darwin-x64": "0.21.2-beta.0",
|
||||
"@lancedb/vectordb-linux-arm64-gnu": "0.21.2-beta.0",
|
||||
"@lancedb/vectordb-linux-x64-gnu": "0.21.2-beta.0",
|
||||
"@lancedb/vectordb-win32-x64-msvc": "0.21.2-beta.0"
|
||||
},
|
||||
"peerDependencies": {
|
||||
"@apache-arrow/ts": "^14.0.2",
|
||||
|
||||
@@ -69,7 +69,7 @@ Lance supports `IVF_PQ` index type by default.
|
||||
|
||||
The following IVF_PQ parameters can be specified:
|
||||
|
||||
- **distance_type**: The distance metric to use. By default it uses euclidean distance "`L2`".
|
||||
- **distance_type**: The distance metric to use. By default it uses euclidean distance "`l2`".
|
||||
We also support "cosine" and "dot" distance.
|
||||
- **num_partitions**: The number of partitions in the index. The default is the square root
|
||||
of the number of rows.
|
||||
@@ -291,7 +291,7 @@ Product quantization can lead to approximately `16 * sizeof(float32) / 1 = 64` t
|
||||
|
||||
`num_partitions` is used to decide how many partitions the first level `IVF` index uses.
|
||||
Higher number of partitions could lead to more efficient I/O during queries and better accuracy, but it takes much more time to train.
|
||||
On `SIFT-1M` dataset, our benchmark shows that keeping each partition 1K-4K rows lead to a good latency / recall.
|
||||
On the `SIFT-1M` dataset, our benchmark shows that keeping each partition at 4K-8K rows leads to a good latency/recall trade-off.
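As a rough illustration of the rows-per-partition guidance, the sketch below turns a row count into a range of `num_partitions` values. The helper name `partition_range` and its default bounds are assumptions made for this example, taken from the 4K-8K figure above; they are not part of the LanceDB API, and the right value still depends on your data.

```python
import math


def partition_range(num_rows, min_rows=4_000, max_rows=8_000):
    """Return (fewest, most) partitions that keep each partition
    between min_rows and max_rows rows. Hypothetical helper."""
    fewest = max(1, math.ceil(num_rows / max_rows))
    most = max(1, num_rows // min_rows)
    return fewest, most


# One million rows at 4K-8K rows per partition.
fewest, most = partition_range(1_000_000)
print(fewest, most)
```

Any `num_partitions` inside the returned range keeps partitions within the suggested size; treat it as a starting point for tuning rather than a rule.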
|
||||
|
||||
`num_sub_vectors` specifies how many Product Quantization (PQ) short codes to generate on each vector. The number should be a factor of the vector dimension. Because
|
||||
PQ is a lossy compression of the original vector, a higher `num_sub_vectors` usually results in
|
||||
|
||||
BIN
docs/src/assets/hero-header.png
Normal file
Binary file not shown.
|
After Width: | Height: | Size: 1.7 MiB |
BIN
docs/src/assets/lancedb.png
Normal file
Binary file not shown.
|
After Width: | Height: | Size: 40 KiB |
@@ -2,7 +2,7 @@
|
||||
|
||||
LanceDB Cloud is a SaaS (software-as-a-service) solution that runs serverless in the cloud, clearly separating storage from compute. It's designed to be highly scalable without breaking the bank. LanceDB Cloud is currently in private beta with general availability coming soon, but you can apply for early access with the private beta release by signing up below.
|
||||
|
||||
[Try out LanceDB Cloud](https://noteforms.com/forms/lancedb-mailing-list-cloud-kty1o5?notionforms=1&utm_source=notionforms){ .md-button .md-button--primary }
|
||||
[Try out LanceDB Cloud (Public Beta)](https://cloud.lancedb.com){ .md-button .md-button--primary }
|
||||
|
||||
## Architecture
|
||||
|
||||
|
||||
@@ -13,7 +13,7 @@ The following concepts are important to keep in mind:
|
||||
- Data is versioned, with each insert operation creating a new version of the dataset and an update to the manifest that tracks versions via metadata
|
||||
|
||||
!!! note
|
||||
1. First, each version contains metadata and just the new/updated data in your transaction. So if you have 100 versions, they aren't 100 duplicates of the same data. However, they do have 100x the metadata overhead of a single version, which can result in slower queries.
|
||||
1. First, each version contains metadata and just the new/updated data in your transaction. So if you have 100 versions, they aren't 100 duplicates of the same data. However, they do have 100x the metadata overhead of a single version, which can result in slower queries.
|
||||
2. Second, these versions exist to keep LanceDB scalable and consistent. We do not immediately blow away old versions when creating new ones because other clients might be in the middle of querying the old version. It's important to retain older versions for as long as they might be queried.
|
||||
|
||||
## What are fragments?
|
||||
@@ -37,6 +37,10 @@ Depending on the use case and dataset, optimal compaction will have different re
|
||||
- It’s always better to use *batch* inserts rather than adding 1 row at a time (to avoid too small fragments). If single-row inserts are unavoidable, run compaction on a regular basis to merge them into larger fragments.
|
||||
- Keep the number of fragments under 100, which is suitable for most use cases (for *really* large datasets of >500M rows, more fragments might be needed)
|
||||
|
||||
!!! note
|
||||
|
||||
LanceDB Cloud/Enterprise supports [auto-compaction](https://docs.lancedb.com/enterprise/architecture/architecture#write-path) which automatically optimizes fragments in the background as data changes.
|
||||
|
||||
## Deletion
|
||||
|
||||
Although Lance allows you to delete rows from a dataset, it does not actually delete the data immediately. It simply marks the row as deleted in the `DataFile` that represents a fragment. For a given version of the dataset, each fragment can have up to one deletion file (if no rows were ever deleted from that fragment, it will not have a deletion file). This is important to keep in mind because it means that the data is still there, and can be recovered if needed, as long as that version still exists based on your backup policy.
|
||||
@@ -50,13 +54,9 @@ Reindexing is the process of updating the index to account for new data, keeping
|
||||
|
||||
Both LanceDB OSS and Cloud support reindexing, but the process (at least for now) is different for each, depending on the type of index.
|
||||
|
||||
When a reindex job is triggered in the background, the entire data is reindexed, but in the interim as new queries come in, LanceDB will combine results from the existing index with exhaustive kNN search on the new data. This is done to ensure that you're still searching on all your data, but it does come at a performance cost. The more data that you add without reindexing, the impact on latency (due to exhaustive search) can be noticeable.
|
||||
In LanceDB OSS, re-indexing happens synchronously when you call either `create_index` or `optimize` on a table. In LanceDB Cloud, re-indexing happens asynchronously as you add and update data in your table.
|
||||
|
||||
### Vector reindex
|
||||
By default, queries will search new data even if it has yet to be indexed. This is done using brute-force methods, such as kNN for vector search, and combined with the fast index search results. This is done to ensure that you're always searching over all your data, but it does come at a performance cost. Without reindexing, adding more data to a table will make queries slower and more expensive. This behavior can be disabled by setting the [fast_search](https://lancedb.github.io/lancedb/python/python/#lancedb.query.AsyncQuery.fast_search) parameter which will instruct the query to ignore un-indexed data.
|
||||
|
||||
* LanceDB Cloud supports incremental reindexing, where a background process will trigger a new index build for you automatically when new data is added to a dataset
|
||||
* LanceDB Cloud/Enterprise supports [automatic incremental reindexing](https://docs.lancedb.com/core#vector-index) for vector, scalar, and FTS indices, where a background process will trigger a new index build for you automatically when new data is added or modified in a dataset
|
||||
* LanceDB OSS requires you to manually trigger a reindex operation -- we are working on adding incremental reindexing to LanceDB OSS as well
|
||||
|
||||
### FTS reindex
|
||||
|
||||
FTS reindexing is supported in both LanceDB OSS and Cloud, but requires that it's manually rebuilt once you have a significant enough amount of new data added that needs to be reindexed. We [updated](https://github.com/lancedb/lancedb/pull/762) Tantivy's default heap size from 128MB to 1GB in LanceDB to make it much faster to reindex, by up to 10x from the default settings.
|
||||
|
||||
@@ -59,7 +59,7 @@ Then the greedy search routine operates as follows:
|
||||
|
||||
There are three key parameters to set when constructing an HNSW index:
|
||||
|
||||
* `metric`: Use an `L2` euclidean distance metric. We also support `dot` and `cosine` distance.
|
||||
* `metric`: Use an `l2` euclidean distance metric. We also support `dot` and `cosine` distance.
|
||||
* `m`: The number of neighbors to select for each vector in the HNSW graph.
|
||||
* `ef_construction`: The number of candidates to evaluate during the construction of the HNSW graph.
|
||||
|
||||
|
||||
@@ -47,7 +47,7 @@ We can combine the above concepts to understand how to build and query an IVF-PQ
|
||||
|
||||
There are three key parameters to set when constructing an IVF-PQ index:
|
||||
|
||||
* `metric`: Use an `L2` euclidean distance metric. We also support `dot` and `cosine` distance.
|
||||
* `metric`: Use an `l2` euclidean distance metric. We also support `dot` and `cosine` distance.
|
||||
* `num_partitions`: The number of partitions in the IVF portion of the index.
|
||||
* `num_sub_vectors`: The number of sub-vectors that will be created during Product Quantization (PQ).
|
||||
|
||||
@@ -56,7 +56,7 @@ In Python, the index can be created as follows:
|
||||
```python
|
||||
# Create and train the index for a 1536-dimensional vector
|
||||
# Make sure you have enough data in the table for an effective training step
|
||||
tbl.create_index(metric="L2", num_partitions=256, num_sub_vectors=96)
|
||||
tbl.create_index(metric="l2", num_partitions=256, num_sub_vectors=96)
|
||||
```
|
||||
!!! note
|
||||
`num_partitions`=256 and `num_sub_vectors`=96 do not work for every dataset. Those values need to be adjusted for your particular dataset.
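When adjusting `num_sub_vectors`, one constraint stated in the ANN index guide is that the value should be a factor of the vector dimension. A minimal sketch of that check follows; `valid_num_sub_vectors` is a hypothetical helper written for this example, not a LanceDB function.

```python
def valid_num_sub_vectors(dimension):
    """List every value that divides the vector dimension evenly.
    Hypothetical helper for illustration only."""
    return [n for n in range(1, dimension + 1) if dimension % n == 0]


dim = 1536
candidates = valid_num_sub_vectors(dim)

# 96 is a valid choice for 1536-dimensional vectors, and each
# sub-vector then covers 1536 // 96 dimensions.
print(96 in candidates, dim // 96)
```

Divisibility is only a validity check: which of the candidates gives the best recall and latency still has to be measured on your own dataset.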
|
||||
|
||||
@@ -54,7 +54,7 @@ As mentioned, after creating embedding, each data point is represented as a vect
|
||||
|
||||
Points that are close to each other in vector space are considered similar (or appear in similar contexts), and points that are far away are considered dissimilar. To quantify this closeness, we use a distance metric, which can be measured in the following ways:
|
||||
|
||||
1. **Euclidean Distance (L2)**: It calculates the straight-line distance between two points (vectors) in a multidimensional space.
|
||||
1. **Euclidean Distance (l2)**: It calculates the straight-line distance between two points (vectors) in a multidimensional space.
|
||||
2. **Cosine Similarity**: It measures the cosine of the angle between two vectors, providing a normalized measure of similarity based on their direction.
|
||||
3. **Dot product**: It is calculated as the sum of the products of their corresponding components. To measure relatedness it considers both the magnitude and direction of the vectors.
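The three measures above can be sketched in plain Python. This is a minimal illustration using only the standard library; the function names are chosen for this example and are not LanceDB APIs.

```python
import math


def dot_product(a, b):
    """Sum of the products of corresponding components."""
    return sum(x * y for x, y in zip(a, b))


def l2_distance(a, b):
    """Straight-line (Euclidean) distance between two vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def cosine_similarity(a, b):
    """Cosine of the angle between two vectors (ignores magnitude)."""
    norm_a = math.sqrt(dot_product(a, a))
    norm_b = math.sqrt(dot_product(b, b))
    return dot_product(a, b) / (norm_a * norm_b)


a = [3.0, 4.0]
b = [6.0, 8.0]  # same direction as a, twice the magnitude

print(l2_distance(a, b))        # 5.0
print(cosine_similarity(a, b))  # 1.0
print(dot_product(a, b))        # 50.0
```

The two vectors point in the same direction, so cosine similarity reports a perfect match, while the L2 distance and dot product still reflect the difference in magnitude.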
|
||||
|
||||
|
||||
@@ -8,15 +8,5 @@ LanceDB provides language APIs, allowing you to embed a database in your languag
|
||||
* 👾 [JavaScript](examples_js.md) examples
|
||||
* 🦀 Rust examples (coming soon)
|
||||
|
||||
## Python Applications powered by LanceDB
|
||||
|
||||
| Project Name | Description |
|
||||
| --- | --- |
|
||||
| **Ultralytics Explorer 🚀**<br>[](https://docs.ultralytics.com/datasets/explorer/)<br>[](https://colab.research.google.com/github/ultralytics/ultralytics/blob/main/docs/en/datasets/explorer/explorer.ipynb) | - 🔍 **Explore CV Datasets**: Semantic search, SQL queries, vector similarity, natural language.<br>- 🖥️ **GUI & Python API**: Seamless dataset interaction.<br>- ⚡ **Efficient & Scalable**: Leverages LanceDB for large datasets.<br>- 📊 **Detailed Analysis**: Easily analyze data patterns.<br>- 🌐 **Browser GUI Demo**: Create embeddings, search images, run queries. |
|
||||
| **Website Chatbot🤖**<br>[](https://github.com/lancedb/lancedb-vercel-chatbot)<br>[](https://vercel.com/new/clone?repository-url=https%3A%2F%2Fgithub.com%2Flancedb%2Flancedb-vercel-chatbot&env=OPENAI_API_KEY&envDescription=OpenAI%20API%20Key%20for%20chat%20completion.&project-name=lancedb-vercel-chatbot&repository-name=lancedb-vercel-chatbot&demo-title=LanceDB%20Chatbot%20Demo&demo-description=Demo%20website%20chatbot%20with%20LanceDB.&demo-url=https%3A%2F%2Flancedb.vercel.app&demo-image=https%3A%2F%2Fi.imgur.com%2FazVJtvr.png) | - 🌐 **Chatbot from Sitemap/Docs**: Create a chatbot using site or document context.<br>- 🚀 **Embed LanceDB in Next.js**: Lightweight, on-prem storage.<br>- 🧠 **AI-Powered Context Retrieval**: Efficiently access relevant data.<br>- 🔧 **Serverless & Native JS**: Seamless integration with Next.js.<br>- ⚡ **One-Click Deploy on Vercel**: Quick and easy setup.. |
|
||||
|
||||
## Nodejs Applications powered by LanceDB
|
||||
|
||||
| Project Name | Description |
|
||||
| --- | --- |
|
||||
| **Langchain Writing Assistant✍️ **<br>[](https://github.com/lancedb/vectordb-recipes/tree/main/applications/node/lanchain_writing_assistant) | - **📂 Data Source Integration**: Use your own data by specifying data source file, and the app instantly processes it to provide insights. <br>- **🧠 Intelligent Suggestions**: Powered by LangChain.js and LanceDB, it improves writing productivity and accuracy. <br>- **💡 Enhanced Writing Experience**: It delivers real-time contextual insights and factual suggestions while the user writes. |
|
||||
!!! tip "Hosted LanceDB"
|
||||
If you want S3 cost-efficiency and local performance via a simple serverless API, check out **LanceDB Cloud**. For private deployments, high performance at extreme scale, or if you have strict security requirements, talk to us about **LanceDB Enterprise**. [Learn more](https://docs.lancedb.com/)
|
||||
85
docs/src/guides/multi-vector.md
Normal file
@@ -0,0 +1,85 @@
|
||||
# Late interaction & MultiVector embedding type
|
||||
Late interaction is a technique used in retrieval that calculates the relevance of a query to a document by comparing their multi-vector representations. The key difference between late interaction and other popular methods:
|
||||
|
||||

|
||||
|
||||
|
||||
[ Illustration from https://jina.ai/news/what-is-colbert-and-late-interaction-and-why-they-matter-in-search/]
|
||||
|
||||
<b>No interaction:</b> The query and document are embedded independently, and the resulting embeddings are compared to calculate similarity without any interaction between them. This is typically used in vector search operations.
|
||||
|
||||
<b>Partial interaction</b> Refers to a specific approach where the similarity computation happens primarily between query vectors and document vectors, without extensive interaction between individual components of each. An example of this is dual-encoder models like BERT.
|
||||
|
||||
<b>Early full interaction</b> Refers to techniques like cross-encoders that process query and docs in pairs with full interaction across various stages of encoding. This is a powerful, but relatively slower technique. Because it requires processing query and docs in pairs, doc embeddings can't be pre-computed for fast retrieval. This is why cross encoders are typically used as reranking models combined with vector search. Learn more about [LanceDB Reranking support](https://lancedb.github.io/lancedb/reranking/).
|
||||
|
||||
<b>Late interaction</b> Late interaction is a technique that encodes the doc and query independently; the interaction between their token-level embeddings is then evaluated during the retrieval process. This is typically used in retrieval models like ColBERT. Unlike early interaction, it speeds up retrieval without compromising the depth of semantic analysis.
|
||||
|
||||
## Internals of ColBERT
|
||||
Let's take a look at the steps involved in performing late interaction based retrieval using ColBERT:
|
||||
|
||||
• ColBERT employs BERT-based encoders for both queries `(fQ)` and documents `(fD)`
|
||||
• A single BERT model is shared between query and document encoders and special tokens distinguish input types: `[Q]` for queries and `[D]` for documents
|
||||
|
||||
**Query Encoder (fQ):**
|
||||
• Query q is tokenized into WordPiece tokens: `q1, q2, ..., ql`. `[Q]` token is prepended right after BERT's `[CLS]` token
|
||||
• If query length < Nq, it's padded with [MASK] tokens up to Nq.
|
||||
• The padded sequence goes through BERT's transformer architecture
|
||||
• Final embeddings are L2-normalized.
|
||||
|
||||
**Document Encoder (fD):**
|
||||
• Document d is tokenized into tokens `d1, d2, ..., dm`. `[D]` token is prepended after `[CLS]` token
|
||||
• Unlike queries, documents are NOT padded with `[MASK]` tokens
|
||||
• Document tokens are processed through BERT and the same linear layer
|
||||
|
||||
**Late Interaction:**
|
||||
• Late interaction estimates relevance score `S(q,d)` using embedding `Eq` and `Ed`. Late interaction happens after independent encoding
|
||||
• For each query embedding, maximum similarity is computed against all document embeddings
|
||||
• The similarity measure can be cosine similarity or squared L2 distance
|
||||
|
||||
**MaxSim Calculation:**
|
||||
```
|
||||
S(q,d) := Σ_{i∈|Eq|} max_{j∈|Ed|} (Eqi ⋅ Edj^T)
|
||||
```
|
||||
• This finds the best matching document embedding for each query embedding
|
||||
• Captures relevance based on strongest local matches between contextual embeddings
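The MaxSim score described above can be sketched in a few lines of plain Python. This is an illustrative toy, not ColBERT's or LanceDB's implementation; it assumes the embeddings are already normalized, so the dot product acts as cosine similarity.

```python
def dot(u, v):
    """Dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(u, v))


def maxsim(query_embeddings, doc_embeddings):
    """For each query embedding, take its best match among the document
    embeddings, then sum those maxima over all query embeddings."""
    return sum(
        max(dot(q, d) for d in doc_embeddings)
        for q in query_embeddings
    )


# Two query token embeddings and three document token embeddings (unit length).
Eq = [[1.0, 0.0], [0.0, 1.0]]
Ed = [[1.0, 0.0], [0.6, 0.8], [0.0, -1.0]]

# Best match for the first query token is 1.0, for the second 0.8.
print(maxsim(Eq, Ed))  # approximately 1.8
```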
|
||||
|
||||
## LanceDB MultiVector type
|
||||
LanceDB supports a multivector type, which is useful when you have multiple vectors for a single item (e.g. with ColBERT and ColPali).
|
||||
|
||||
You can index a column with the multivector type and search on it; the query can be a single vector or multiple vectors. For now, only the cosine metric is supported for multivector search. The vector value type can be float16, float32 or float64. LanceDB integrates [ConteXtualized Token Retriever (XTR)](https://arxiv.org/abs/2304.01982), which introduces a simple, yet novel, objective function that encourages the model to retrieve the most important document tokens first.
|
||||
|
||||
```python
|
||||
import lancedb
|
||||
import numpy as np
|
||||
import pyarrow as pa
|
||||
|
||||
db = lancedb.connect("data/multivector_demo")
|
||||
schema = pa.schema(
|
||||
[
|
||||
pa.field("id", pa.int64()),
|
||||
# float16, float32, and float64 are supported
|
||||
pa.field("vector", pa.list_(pa.list_(pa.float32(), 256))),
|
||||
]
|
||||
)
|
||||
data = [
|
||||
{
|
||||
"id": i,
|
||||
"vector": np.random.random(size=(2, 256)).tolist(),
|
||||
}
|
||||
for i in range(1024)
|
||||
]
|
||||
tbl = db.create_table("my_table", data=data, schema=schema)
|
||||
|
||||
# only cosine similarity is supported for multi-vectors
|
||||
tbl.create_index(metric="cosine")
|
||||
|
||||
# query with single vector
|
||||
query = np.random.random(256).astype(np.float16)
|
||||
tbl.search(query).to_arrow()
|
||||
|
||||
# query with multiple vectors
|
||||
query = np.random.random(size=(2, 256))
|
||||
tbl.search(query).to_arrow()
|
||||
```
|
||||
Find more about vector search in LanceDB [here](https://lancedb.github.io/lancedb/search/#multivector-type).
|
||||
60
docs/src/guides/sql_querying.md
Normal file
@@ -0,0 +1,60 @@
|
||||
# SQL Querying
|
||||
|
||||
You can use DuckDB and Apache Datafusion to query your LanceDB tables using SQL.
|
||||
This guide shows how to query Lance tables using both.
|
||||
|
||||
We will re-use the dataset [created previously](./tables.md):
|
||||
|
||||
```python
|
||||
import lancedb
|
||||
|
||||
db = lancedb.connect("data/sample-lancedb")
|
||||
data = [
|
||||
{"vector": [3.1, 4.1], "item": "foo", "price": 10.0},
|
||||
{"vector": [5.9, 26.5], "item": "bar", "price": 20.0}
|
||||
]
|
||||
table = db.create_table("pd_table", data=data)
|
||||
```
|
||||
|
||||
## Querying a LanceDB Table with DuckDB
|
||||
|
||||
The `to_lance` method converts the LanceDB table to a `LanceDataset`, which is accessible to DuckDB through the Arrow compatibility layer.
|
||||
To query the resulting Lance dataset in DuckDB, all you need to do is reference the dataset by the same name in your SQL query.
|
||||
|
||||
```python
|
||||
import duckdb
|
||||
|
||||
arrow_table = table.to_lance()
|
||||
|
||||
duckdb.query("SELECT * FROM arrow_table")
|
||||
```
|
||||
|
||||
| vector | item | price |
|
||||
| ----------- | ---- | ----- |
|
||||
| [3.1, 4.1] | foo | 10.0 |
|
||||
| [5.9, 26.5] | bar | 20.0 |
|
||||
|
||||
## Querying a LanceDB Table with Apache Datafusion
|
||||
|
||||
Have the required imports before doing any querying.
|
||||
|
||||
=== "Python"
|
||||
|
||||
```python
|
||||
--8<-- "python/python/tests/docs/test_guide_tables.py:import-lancedb"
|
||||
--8<-- "python/python/tests/docs/test_guide_tables.py:import-session-context"
|
||||
--8<-- "python/python/tests/docs/test_guide_tables.py:import-ffi-dataset"
|
||||
```
|
||||
|
||||
Register the table created with the Datafusion session context.
|
||||
|
||||
=== "Python"
|
||||
|
||||
```python
|
||||
--8<-- "python/python/tests/docs/test_guide_tables.py:lance_sql_basic"
|
||||
```
|
||||
|
||||
| vector | item | price |
|
||||
| ----------- | ---- | ----- |
|
||||
| [3.1, 4.1] | foo | 10.0 |
|
||||
| [5.9, 26.5] | bar | 20.0 |
|
||||
@@ -342,7 +342,7 @@ For **read and write access**, LanceDB will need a policy such as:
|
||||
"Action": [
|
||||
"s3:PutObject",
|
||||
"s3:GetObject",
|
||||
"s3:DeleteObject",
|
||||
"s3:DeleteObject"
|
||||
],
|
||||
"Resource": "arn:aws:s3:::<bucket>/<prefix>/*"
|
||||
},
|
||||
@@ -374,7 +374,7 @@ For **read-only access**, LanceDB will need a policy such as:
|
||||
{
|
||||
"Effect": "Allow",
|
||||
"Action": [
|
||||
"s3:GetObject",
|
||||
"s3:GetObject"
|
||||
],
|
||||
"Resource": "arn:aws:s3:::<bucket>/<prefix>/*"
|
||||
},
|
||||
|
||||
@@ -765,7 +765,10 @@ This can be used to update zero to all rows depending on how many rows match the
|
||||
];
|
||||
const tbl = await db.createTable("my_table", data)
|
||||
|
||||
await tbl.update({vector: [10, 10]}, { where: "x = 2"})
|
||||
await tbl.update({
|
||||
values: { vector: [10, 10] },
|
||||
where: "x = 2"
|
||||
});
|
||||
```
|
||||
|
||||
=== "vectordb (deprecated)"
|
||||
@@ -784,7 +787,10 @@ This can be used to update zero to all rows depending on how many rows match the
|
||||
];
|
||||
const tbl = await db.createTable("my_table", data)
|
||||
|
||||
await tbl.update({ where: "x = 2", values: {vector: [10, 10]} })
|
||||
await tbl.update({
|
||||
where: "x = 2",
|
||||
values: { vector: [10, 10] }
|
||||
});
|
||||
```
|
||||
|
||||
#### Updating using a sql query
|
||||
|
||||
@@ -4,6 +4,9 @@ LanceDB is an open-source vector database for AI that's designed to store, manag
|
||||
|
||||
Both the database and the underlying data format are designed from the ground up to be **easy-to-use**, **scalable** and **cost-effective**.
|
||||
|
||||
!!! tip "Hosted LanceDB"
|
||||
If you want S3 cost-efficiency and local performance via a simple serverless API, check out **LanceDB Cloud**. For private deployments, high performance at extreme scale, or if you have strict security requirements, talk to us about **LanceDB Enterprise**. [Learn more](https://docs.lancedb.com/)
|
||||
|
||||

|
||||
|
||||
## Truly multi-modal
|
||||
@@ -20,7 +23,7 @@ LanceDB **OSS** is an **open-source**, batteries-included embedded vector databa
|
||||
|
||||
LanceDB **Cloud** is a SaaS (software-as-a-service) solution that runs serverless in the cloud, making the storage clearly separated from compute. It's designed to be cost-effective and highly scalable without breaking the bank. LanceDB Cloud is currently in private beta with general availability coming soon, but you can apply for early access with the private beta release by signing up below.
|
||||
|
||||
[Try out LanceDB Cloud](https://noteforms.com/forms/lancedb-mailing-list-cloud-kty1o5?notionforms=1&utm_source=notionforms){ .md-button .md-button--primary }
|
||||
[Try out LanceDB Cloud (Public Beta) Now](https://cloud.lancedb.com){ .md-button .md-button--primary }
|
||||
|
||||
## Why use LanceDB?
|
||||
|
||||
|
||||
183
docs/src/integrations/genkit.md
Normal file
@@ -0,0 +1,183 @@
|
||||
### genkitx-lancedb
|
||||
This is a LanceDB plugin for the Genkit framework. It allows you to use LanceDB for ingesting and retrieving data with Genkit.
|
||||
|
||||

|
||||
|
||||
### Installation
|
||||
```bash
|
||||
pnpm install genkitx-lancedb
|
||||
```
|
||||
|
||||
### Usage
|
||||
|
||||
Adding LanceDB plugin to your genkit instance.
|
||||
|
||||
```ts
|
||||
import { lancedbIndexerRef, lancedb, lancedbRetrieverRef, WriteMode } from 'genkitx-lancedb';
|
||||
import { textEmbedding004, vertexAI } from '@genkit-ai/vertexai';
|
||||
import { gemini } from '@genkit-ai/vertexai';
|
||||
import { z, genkit } from 'genkit';
|
||||
import { Document } from 'genkit/retriever';
|
||||
import { chunk } from 'llm-chunk';
|
||||
import { readFile } from 'fs/promises';
|
||||
import path from 'path';
|
||||
import pdf from 'pdf-parse/lib/pdf-parse';
|
||||
|
||||
const ai = genkit({
|
||||
plugins: [
|
||||
// vertexAI provides the textEmbedding004 embedder
|
||||
vertexAI(),
|
||||
|
||||
// the local vector store requires an embedder to translate from text to vector
|
||||
lancedb([
|
||||
{
|
||||
dbUri: '.db', // optional lancedb uri, default to .db
|
||||
tableName: 'table', // optional table name, default to table
|
||||
embedder: textEmbedding004,
|
||||
},
|
||||
]),
|
||||
],
|
||||
});
|
||||
```
|
||||
|
||||
You can run this app with the following command:
|
||||
```bash
|
||||
genkit start -- tsx --watch src/index.ts
|
||||
```
|
||||
|
||||
This'll add LanceDB as a retriever and indexer to the genkit instance. You can see it in the GUI view
|
||||
<img width="1710" alt="Screenshot 2025-05-11 at 7 21 05 PM" src="https://github.com/user-attachments/assets/e752f7f4-785b-4797-a11e-72ab06a531b7" />
|
||||
|
||||
**Testing retrieval on a sample table**
|
||||
Let's see the raw retrieval results
|
||||
|
||||
<img width="1710" alt="Screenshot 2025-05-11 at 7 21 05 PM" src="https://github.com/user-attachments/assets/b8d356ed-8421-4790-8fc0-d6af563b9657" />
|
||||
On running this query, you'll see 5 results fetched from the LanceDB table, where each result looks something like this:
|
||||
<img width="1417" alt="Screenshot 2025-05-11 at 7 21 18 PM" src="https://github.com/user-attachments/assets/77429525-36e2-4da6-a694-e58c1cf9eb83" />
|
||||
|
||||
|
||||
|
||||
## Creating a custom RAG flow
|
||||
|
||||
Now that we've seen how you can use LanceDB in a genkit pipeline, let's refine the flow and create a RAG. A RAG flow consists of an indexer and a retriever, with the retriever's outputs postprocessed and fed into an LLM for the final response.
|
||||
|
||||
### Creating custom indexer flows
|
||||
You can also create custom indexer flows, utilizing more options and features provided by LanceDB.
|
||||
|
||||
```ts
|
||||
export const menuPdfIndexer = lancedbIndexerRef({
|
||||
// Using all defaults, for dbUri, tableName, and embedder, etc
|
||||
});
|
||||
|
||||
const chunkingConfig = {
|
||||
minLength: 1000,
|
||||
maxLength: 2000,
|
||||
splitter: 'sentence',
|
||||
overlap: 100,
|
||||
delimiters: '',
|
||||
} as any;
|
||||
|
||||
|
||||
async function extractTextFromPdf(filePath: string) {
|
||||
const pdfFile = path.resolve(filePath);
|
||||
const dataBuffer = await readFile(pdfFile);
|
||||
const data = await pdf(dataBuffer);
|
||||
return data.text;
|
||||
}
|
||||
|
||||
export const indexMenu = ai.defineFlow(
|
||||
{
|
||||
name: 'indexMenu',
|
||||
inputSchema: z.string().describe('PDF file path'),
|
||||
outputSchema: z.void(),
|
||||
},
|
||||
async (filePath: string) => {
|
||||
filePath = path.resolve(filePath);
|
||||
|
||||
// Read the pdf.
|
||||
const pdfTxt = await ai.run('extract-text', () =>
|
||||
extractTextFromPdf(filePath)
|
||||
);
|
||||
|
||||
// Divide the pdf text into segments.
|
||||
const chunks = await ai.run('chunk-it', async () =>
|
||||
chunk(pdfTxt, chunkingConfig)
|
||||
);
|
||||
|
||||
// Convert chunks of text into documents to store in the index.
|
||||
const documents = chunks.map((text) => {
|
||||
return Document.fromText(text, { filePath });
|
||||
});
|
||||
|
||||
// Add documents to the index.
|
||||
await ai.index({
|
||||
indexer: menuPdfIndexer,
|
||||
documents,
|
||||
options: {
|
||||
writeMode: WriteMode.Overwrite,
|
||||
} as any
|
||||
});
|
||||
}
|
||||
);
|
||||
```
|
||||
|
||||
<img width="1316" alt="Screenshot 2025-05-11 at 8 35 56 PM" src="https://github.com/user-attachments/assets/e2a20ce4-d1d0-4fa2-9a84-f2cc26e3a29f" />
|
||||
|
||||
In your console, you can see the logs
|
||||
|
||||
<img width="511" alt="Screenshot 2025-05-11 at 7 19 14 PM" src="https://github.com/user-attachments/assets/243f26c5-ed38-40b6-b661-002f40f0423a" />
|
||||
|
||||
### Creating custom retriever flows
|
||||
You can also create custom retriever flows, utilizing more options and features provided by LanceDB.
|
||||
```ts
|
||||
export const menuRetriever = lancedbRetrieverRef({
|
||||
tableName: "table", // Use the same table name as the indexer.
|
||||
displayName: "Menu", // Use a custom display name.
});
|
||||
|
||||
export const menuQAFlow = ai.defineFlow(
|
||||
{ name: "Menu", inputSchema: z.string(), outputSchema: z.string() },
|
||||
async (input: string) => {
|
||||
// retrieve relevant documents
|
||||
const docs = await ai.retrieve({
|
||||
retriever: menuRetriever,
|
||||
query: input,
|
||||
options: {
|
||||
k: 3,
|
||||
},
|
||||
});
|
||||
|
||||
const extractedContent = docs.map(doc => {
|
||||
if (doc.content && Array.isArray(doc.content) && doc.content.length > 0) {
|
||||
if (doc.content[0].media && doc.content[0].media.url) {
|
||||
return doc.content[0].media.url;
|
||||
}
|
||||
}
|
||||
return "No content found";
|
||||
});
|
||||
|
||||
console.log("Extracted content:", extractedContent);
|
||||
|
||||
const { text } = await ai.generate({
|
||||
model: gemini('gemini-2.0-flash'),
|
||||
prompt: `
|
||||
You are acting as a helpful AI assistant that can answer
|
||||
questions about the food available on the menu at Genkit Grub Pub.
|
||||
|
||||
Use only the context provided to answer the question.
|
||||
If you don't know, do not make up an answer.
|
||||
Do not add or change items on the menu.
|
||||
|
||||
Context:
|
||||
${extractedContent.join('\n\n')}
|
||||
|
||||
Question: ${input}`,
|
||||
docs,
|
||||
});
|
||||
|
||||
return text;
|
||||
}
|
||||
);
|
||||
```
|
||||
Now, using our retrieval flow, we can ask questions about the ingested PDF:
|
||||
<img width="1306" alt="Screenshot 2025-05-11 at 7 18 45 PM" src="https://github.com/user-attachments/assets/86c66b13-7c12-4d5f-9d81-ae36bfb1c346" />
|
||||
|
||||
@@ -108,7 +108,7 @@ This method creates a scalar(for non-vector cols) or a vector index on a table.
|
||||
|:---|:---|:---|:---|
|
||||
|`vector_col`|`Optional[str]`| Provide if you want to create index on a vector column. |`None`|
|
||||
|`col_name`|`Optional[str]`| Provide if you want to create index on a non-vector column. |`None`|
|
||||
|`metric`|`Optional[str]` |Provide the metric to use for vector index. choice of metrics: 'L2', 'dot', 'cosine'. |`L2`|
|
||||
|`metric`|`Optional[str]` |Provide the metric to use for vector index. choice of metrics: 'l2', 'dot', 'cosine'. |`l2`|
|
||||
|`num_partitions`|`Optional[int]`|Number of partitions to use for the index.|`256`|
|
||||
|`num_sub_vectors`|`Optional[int]` |Number of sub-vectors to use for the index.|`96`|
|
||||
|`index_cache_size`|`Optional[int]` |Size of the index cache.|`None`|
|
||||
|
||||
@@ -125,7 +125,7 @@ The exhaustive list of parameters for `LanceDBVectorStore` vector store are :
|
||||
```
|
||||
- **_table_exists(self, tbl_name: `Optional[str]` = `None`) -> `bool`** : Returns `True` if `tbl_name` exists in database.
|
||||
- __create_index(
|
||||
self, scalar: `Optional[bool]` = False, col_name: `Optional[str]` = None, num_partitions: `Optional[int]` = 256, num_sub_vectors: `Optional[int]` = 96, index_cache_size: `Optional[int]` = None, metric: `Optional[str]` = "L2",
|
||||
self, scalar: `Optional[bool]` = False, col_name: `Optional[str]` = None, num_partitions: `Optional[int]` = 256, num_sub_vectors: `Optional[int]` = 96, index_cache_size: `Optional[int]` = None, metric: `Optional[str]` = "l2",
|
||||
) -> `None`__ : Creates a scalar(for non-vector cols) or a vector index on a table.
|
||||
Make sure your vector column has enough data before creating an index on it.
|
||||
|
||||
|
||||
@@ -10,7 +10,7 @@ Distance metrics type.
|
||||
|
||||
- [Cosine](MetricType.md#cosine)
|
||||
- [Dot](MetricType.md#dot)
|
||||
- [L2](MetricType.md#l2)
|
||||
- [l2](MetricType.md#l2)
|
||||
|
||||
## Enumeration Members
|
||||
|
||||
|
||||
@@ -85,7 +85,7 @@ ___
|
||||
|
||||
• `Optional` **metric\_type**: [`MetricType`](../enums/MetricType.md)
|
||||
|
||||
Metric type, L2 or Cosine
|
||||
Metric type, l2 or Cosine
|
||||
|
||||
#### Defined in
|
||||
|
||||
|
||||
@@ -15,11 +15,9 @@ npm install @lancedb/lancedb
|
||||
This will download the appropriate native library for your platform. We currently
|
||||
support:
|
||||
|
||||
- Linux (x86_64 and aarch64)
|
||||
- Linux (x86_64 and aarch64 on glibc and musl)
|
||||
- MacOS (Intel and ARM/M1/M2)
|
||||
- Windows (x86_64 only)
|
||||
|
||||
We do not yet support musl-based Linux (such as Alpine Linux) or aarch64 Windows.
|
||||
- Windows (x86_64 and aarch64)
|
||||
|
||||
## Usage
|
||||
|
||||
|
||||
53
docs/src/js/classes/BooleanQuery.md
Normal file
@@ -0,0 +1,53 @@
|
||||
[**@lancedb/lancedb**](../README.md) • **Docs**
|
||||
|
||||
***
|
||||
|
||||
[@lancedb/lancedb](../globals.md) / BooleanQuery
|
||||
|
||||
# Class: BooleanQuery
|
||||
|
||||
Represents a full-text query interface.
|
||||
This interface defines the structure and behavior for full-text queries,
|
||||
including methods to retrieve the query type and convert the query to a dictionary format.
|
||||
|
||||
## Implements
|
||||
|
||||
- [`FullTextQuery`](../interfaces/FullTextQuery.md)
|
||||
|
||||
## Constructors
|
||||
|
||||
### new BooleanQuery()
|
||||
|
||||
```ts
|
||||
new BooleanQuery(queries): BooleanQuery
|
||||
```
|
||||
|
||||
Creates an instance of BooleanQuery.
|
||||
|
||||
#### Parameters
|
||||
|
||||
* **queries**: [[`Occur`](../enumerations/Occur.md), [`FullTextQuery`](../interfaces/FullTextQuery.md)][]
|
||||
An array of (Occur, FullTextQuery objects) to combine.
|
||||
Occur specifies whether the query must match, or should match.
|
||||
|
||||
#### Returns
|
||||
|
||||
[`BooleanQuery`](BooleanQuery.md)
|
||||
|
||||
## Methods
|
||||
|
||||
### queryType()
|
||||
|
||||
```ts
|
||||
queryType(): FullTextQueryType
|
||||
```
|
||||
|
||||
The type of the full-text query.
|
||||
|
||||
#### Returns
|
||||
|
||||
[`FullTextQueryType`](../enumerations/FullTextQueryType.md)
|
||||
|
||||
#### Implementation of
|
||||
|
||||
[`FullTextQuery`](../interfaces/FullTextQuery.md).[`queryType`](../interfaces/FullTextQuery.md#querytype)
|
||||
67
docs/src/js/classes/BoostQuery.md
Normal file
@@ -0,0 +1,67 @@
|
||||
[**@lancedb/lancedb**](../README.md) • **Docs**
|
||||
|
||||
***
|
||||
|
||||
[@lancedb/lancedb](../globals.md) / BoostQuery
|
||||
|
||||
# Class: BoostQuery
|
||||
|
||||
Represents a full-text query interface.
|
||||
This interface defines the structure and behavior for full-text queries,
|
||||
including methods to retrieve the query type and convert the query to a dictionary format.
|
||||
|
||||
## Implements
|
||||
|
||||
- [`FullTextQuery`](../interfaces/FullTextQuery.md)
|
||||
|
||||
## Constructors
|
||||
|
||||
### new BoostQuery()
|
||||
|
||||
```ts
|
||||
new BoostQuery(
|
||||
positive,
|
||||
negative,
|
||||
options?): BoostQuery
|
||||
```
|
||||
|
||||
Creates an instance of BoostQuery.
|
||||
The boost returns documents that match the positive query,
|
||||
but penalizes those that match the negative query.
|
||||
The penalty is controlled by the `negativeBoost` parameter.
|
||||
|
||||
#### Parameters
|
||||
|
||||
* **positive**: [`FullTextQuery`](../interfaces/FullTextQuery.md)
|
||||
The positive query that boosts the relevance score.
|
||||
|
||||
* **negative**: [`FullTextQuery`](../interfaces/FullTextQuery.md)
|
||||
The negative query that reduces the relevance score.
|
||||
|
||||
* **options?**
|
||||
Optional parameters for the boost query.
|
||||
- `negativeBoost`: The boost factor for the negative query (default is 0.0).
|
||||
|
||||
* **options.negativeBoost?**: `number`
|
||||
|
||||
#### Returns
|
||||
|
||||
[`BoostQuery`](BoostQuery.md)
|
||||
|
||||
## Methods
|
||||
|
||||
### queryType()
|
||||
|
||||
```ts
|
||||
queryType(): FullTextQueryType
|
||||
```
|
||||
|
||||
The type of the full-text query.
|
||||
|
||||
#### Returns
|
||||
|
||||
[`FullTextQueryType`](../enumerations/FullTextQueryType.md)
|
||||
|
||||
#### Implementation of
|
||||
|
||||
[`FullTextQuery`](../interfaces/FullTextQuery.md).[`queryType`](../interfaces/FullTextQuery.md#querytype)
|
||||
@@ -126,6 +126,37 @@ the vectors.
|
||||
|
||||
***
|
||||
|
||||
### ivfFlat()
|
||||
|
||||
```ts
|
||||
static ivfFlat(options?): Index
|
||||
```
|
||||
|
||||
Create an IvfFlat index
|
||||
|
||||
This index groups vectors into partitions of similar vectors. Each partition keeps track of
|
||||
a centroid which is the average value of all vectors in the group.
|
||||
|
||||
During a query the centroids are compared with the query vector to find the closest
|
||||
partitions. The vectors in these partitions are then searched to find
|
||||
the closest vectors.
|
||||
|
||||
The partitioning process is called IVF and the `num_partitions` parameter controls how
|
||||
many groups to create.
|
||||
|
||||
Note that training an IVF FLAT index on a large dataset is a slow operation and
|
||||
currently is also a memory intensive operation.
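The partition-then-search idea described above can be sketched in plain Python. This is a toy illustration, not the library's implementation: the partitions are supplied by hand instead of being trained, and the names `ivf_flat_search`, `nprobes` and `k` are chosen for this example.

```python
import math


def l2(a, b):
    """Euclidean distance between two vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def centroid(vectors):
    """Average value of all vectors in a partition."""
    n = len(vectors)
    return [sum(column) / n for column in zip(*vectors)]


def ivf_flat_search(query, partitions, nprobes=1, k=1):
    """Pick the `nprobes` partitions whose centroids are closest to the
    query, then scan only the vectors in those partitions for the top `k`."""
    centroids = [centroid(p) for p in partitions]
    order = sorted(range(len(partitions)), key=lambda i: l2(query, centroids[i]))
    candidates = [v for i in order[:nprobes] for v in partitions[i]]
    return sorted(candidates, key=lambda v: l2(query, v))[:k]


# Two hand-built partitions; a real index would learn these during training.
partitions = [
    [[0.0, 0.0], [1.0, 1.0]],
    [[10.0, 10.0], [11.0, 9.0]],
]

print(ivf_flat_search([9.0, 9.0], partitions, nprobes=1, k=1))  # [[10.0, 10.0]]
```

Probing fewer partitions makes the search faster but can miss neighbors that fall in a partition that was not probed.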
|
||||
|
||||
#### Parameters
|
||||
|
||||
* **options?**: `Partial`<[`IvfFlatOptions`](../interfaces/IvfFlatOptions.md)>
|
||||
|
||||
#### Returns
|
||||
|
||||
[`Index`](Index.md)
|
||||
|
||||
***
|
||||
|
||||
### ivfPq()
|
||||
|
||||
```ts
|
||||
|
||||
76
docs/src/js/classes/MatchQuery.md
Normal file
@@ -0,0 +1,76 @@
|
||||
[**@lancedb/lancedb**](../README.md) • **Docs**
|
||||
|
||||
***
|
||||
|
||||
[@lancedb/lancedb](../globals.md) / MatchQuery
|
||||
|
||||
# Class: MatchQuery
|
||||
|
||||
Represents a full-text query interface.
|
||||
This interface defines the structure and behavior for full-text queries,
|
||||
including methods to retrieve the query type and convert the query to a dictionary format.
|
||||
|
||||
## Implements
|
||||
|
||||
- [`FullTextQuery`](../interfaces/FullTextQuery.md)
|
||||
|
||||
## Constructors
|
||||
|
||||
### new MatchQuery()
|
||||
|
||||
```ts
|
||||
new MatchQuery(
|
||||
query,
|
||||
column,
|
||||
options?): MatchQuery
|
||||
```
|
||||
|
||||
Creates an instance of MatchQuery.
|
||||
|
||||
#### Parameters
|
||||
|
||||
* **query**: `string`
|
||||
The text query to search for.
|
||||
|
||||
* **column**: `string`
|
||||
The name of the column to search within.
|
||||
|
||||
* **options?**
|
||||
Optional parameters for the match query.
|
||||
- `boost`: The boost factor for the query (default is 1.0).
|
||||
- `fuzziness`: The fuzziness level for the query (default is 0).
|
||||
- `maxExpansions`: The maximum number of terms to consider for fuzzy matching (default is 50).
|
||||
- `operator`: The logical operator to use for combining terms in the query (default is "OR").
|
||||
- `prefixLength`: The number of beginning characters being unchanged for fuzzy matching.
|
||||
|
||||
* **options.boost?**: `number`
|
||||
|
||||
* **options.fuzziness?**: `number`
|
||||
|
||||
* **options.maxExpansions?**: `number`
|
||||
|
||||
* **options.operator?**: [`Operator`](../enumerations/Operator.md)
|
||||
|
||||
* **options.prefixLength?**: `number`
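To give an intuition for `fuzziness` and `prefixLength`, here is a conceptual sketch in Python. It assumes that fuzziness means a maximum edit (Levenshtein) distance per term and that the first `prefix_length` characters must match exactly. The functions `edit_distance` and `fuzzy_match` are hypothetical helpers for illustration and do not reflect how the library implements matching.

```python
def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance: minimum number of single-character insertions,
    deletions, or substitutions needed to turn `a` into `b`."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1, curr[j - 1] + 1, prev[j - 1] + cost))
        prev = curr
    return prev[-1]


def fuzzy_match(term: str, candidate: str, fuzziness: int = 0, prefix_length: int = 0) -> bool:
    """True if `candidate` keeps the first `prefix_length` characters of `term`
    unchanged and is within `fuzziness` edits of it (assumed semantics)."""
    if term[:prefix_length] != candidate[:prefix_length]:
        return False
    return edit_distance(term, candidate) <= fuzziness


print(fuzzy_match("lance", "lanse", fuzziness=1))                   # True
print(fuzzy_match("lance", "dance", fuzziness=1, prefix_length=1))  # False
print(fuzzy_match("lance", "lanse", fuzziness=0))                   # False
```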
|
||||
|
||||
#### Returns
|
||||
|
||||
[`MatchQuery`](MatchQuery.md)
|
||||
|
||||
## Methods
|
||||
|
||||
### queryType()
|
||||
|
||||
```ts
|
||||
queryType(): FullTextQueryType
|
||||
```
|
||||
|
||||
The type of the full-text query.
|
||||
|
||||
#### Returns
|
||||
|
||||
[`FullTextQueryType`](../enumerations/FullTextQueryType.md)
|
||||
|
||||
#### Implementation of
|
||||
|
||||
[`FullTextQuery`](../interfaces/FullTextQuery.md).[`queryType`](../interfaces/FullTextQuery.md#querytype)
|
||||
@@ -33,20 +33,22 @@ Construct a MergeInsertBuilder. __Internal use only.__
|
||||
### execute()
|
||||
|
||||
```ts
|
||||
execute(data): Promise<void>
|
||||
execute(data, execOptions?): Promise<MergeResult>
|
||||
```
|
||||
|
||||
Executes the merge insert operation
|
||||
|
||||
Nothing is returned but the `Table` is updated
|
||||
|
||||
#### Parameters
|
||||
|
||||
* **data**: [`Data`](../type-aliases/Data.md)
|
||||
|
||||
* **execOptions?**: `Partial`<[`WriteExecutionOptions`](../interfaces/WriteExecutionOptions.md)>
|
||||
|
||||
#### Returns
|
||||
|
||||
`Promise`<`void`>
|
||||
`Promise`<[`MergeResult`](../interfaces/MergeResult.md)>
|
||||
|
||||
the merge result
|
||||
|
||||
***
|
||||
|
||||
|
||||
67
docs/src/js/classes/MultiMatchQuery.md
Normal file
@@ -0,0 +1,67 @@
|
||||
[**@lancedb/lancedb**](../README.md) • **Docs**
|
||||
|
||||
***
|
||||
|
||||
[@lancedb/lancedb](../globals.md) / MultiMatchQuery
|
||||
|
||||
# Class: MultiMatchQuery
|
||||
|
||||
Represents a full-text query interface.
|
||||
This interface defines the structure and behavior for full-text queries,
|
||||
including methods to retrieve the query type and convert the query to a dictionary format.
|
||||
|
||||
## Implements
|
||||
|
||||
- [`FullTextQuery`](../interfaces/FullTextQuery.md)
|
||||
|
||||
## Constructors
|
||||
|
||||
### new MultiMatchQuery()
|
||||
|
||||
```ts
|
||||
new MultiMatchQuery(
|
||||
query,
|
||||
columns,
|
||||
options?): MultiMatchQuery
|
||||
```
|
||||
|
||||
Creates an instance of MultiMatchQuery.
|
||||
|
||||
#### Parameters
|
||||
|
||||
* **query**: `string`
|
||||
The text query to search for across multiple columns.
|
||||
|
||||
* **columns**: `string`[]
|
||||
An array of column names to search within.
|
||||
|
||||
* **options?**
|
||||
Optional parameters for the multi-match query.
|
||||
- `boosts`: An array of boost factors for each column (default is 1.0 for all).
|
||||
- `operator`: The logical operator to use for combining terms in the query (default is "OR").
|
||||
|
||||
* **options.boosts?**: `number`[]
|
||||
|
||||
* **options.operator?**: [`Operator`](../enumerations/Operator.md)
|
||||
|
||||
#### Returns
|
||||
|
||||
[`MultiMatchQuery`](MultiMatchQuery.md)
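
Since `boosts` is described as one factor per column, it can help to check that the two arrays line up before building the query. The sketch below is a minimal illustration under that assumption: `pairColumnsWithBoosts` and `ColumnBoost` are hypothetical names, not part of `@lancedb/lancedb`, and the length check is an assumed precaution rather than documented library behavior.

```ts
// Hypothetical helper (not part of @lancedb/lancedb): pairs each column with
// its boost so that a length mismatch is caught before the query is built.
interface ColumnBoost {
  column: string;
  boost: number;
}

function pairColumnsWithBoosts(columns: string[], boosts?: number[]): ColumnBoost[] {
  if (boosts !== undefined && boosts.length !== columns.length) {
    throw new Error("expected " + columns.length + " boosts, got " + boosts.length);
  }
  const pairs: ColumnBoost[] = [];
  for (let i = 0; i < columns.length; i++) {
    // With no boosts given, every column keeps the documented default of 1.0.
    pairs.push({ column: columns[i], boost: boosts === undefined ? 1.0 : boosts[i] });
  }
  return pairs;
}

const weighted = pairColumnsWithBoosts(["title", "body"], [2.0, 1.0]);
const unweighted = pairColumnsWithBoosts(["title", "body"]);
console.log(weighted, unweighted);

// The validated arrays would then be passed on unchanged, for example:
// new MultiMatchQuery("vector database", ["title", "body"], { boosts: [2.0, 1.0] })
```

Keeping the check separate from the constructor call means a mismatch fails with a readable message instead of surfacing later in the search.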
|
||||
|
||||
## Methods
|
||||
|
||||
### queryType()
|
||||
|
||||
```ts
|
||||
queryType(): FullTextQueryType
|
||||
```
|
||||
|
||||
The type of the full-text query.
|
||||
|
||||
#### Returns
|
||||
|
||||
[`FullTextQueryType`](../enumerations/FullTextQueryType.md)
|
||||
|
||||
#### Implementation of
|
||||
|
||||
[`FullTextQuery`](../interfaces/FullTextQuery.md).[`queryType`](../interfaces/FullTextQuery.md#querytype)
|
||||
64
docs/src/js/classes/PhraseQuery.md
Normal file
@@ -0,0 +1,64 @@
|
||||
[**@lancedb/lancedb**](../README.md) • **Docs**
|
||||
|
||||
***
|
||||
|
||||
[@lancedb/lancedb](../globals.md) / PhraseQuery
|
||||
|
||||
# Class: PhraseQuery
|
||||
|
||||
Represents a full-text query interface.
|
||||
This interface defines the structure and behavior for full-text queries,
|
||||
including methods to retrieve the query type and convert the query to a dictionary format.
|
||||
|
||||
## Implements
|
||||
|
||||
- [`FullTextQuery`](../interfaces/FullTextQuery.md)
|
||||
|
||||
## Constructors
|
||||
|
||||
### new PhraseQuery()
|
||||
|
||||
```ts
|
||||
new PhraseQuery(
|
||||
query,
|
||||
column,
|
||||
options?): PhraseQuery
|
||||
```
|
||||
|
||||
Creates an instance of `PhraseQuery`.
|
||||
|
||||
#### Parameters
|
||||
|
||||
* **query**: `string`
|
||||
The phrase to search for in the specified column.
|
||||
|
||||
* **column**: `string`
|
||||
The name of the column to search within.
|
||||
|
||||
* **options?**
|
||||
Optional parameters for the phrase query.
|
||||
- `slop`: The maximum number of intervening unmatched positions allowed between words in the phrase (default is 0).
|
||||
|
||||
* **options.slop?**: `number`
|
||||
|
||||
#### Returns
|
||||
|
||||
[`PhraseQuery`](PhraseQuery.md)
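
To build intuition for `slop`, the following toy model checks whether a phrase occurs in a token list in order, with at most `slop` unmatched positions in total between its words. It is a simplified sketch only: `matchesWithSlop` is a hypothetical function, whitespace tokenization is assumed, and the real index may apply different rules (for example around reordered words).

```ts
// Toy model only: returns true when `phrase` occurs in `tokens` in order with
// at most `slop` unmatched positions in total between its words.
function matchesWithSlop(tokens: string[], phrase: string[], slop: number): boolean {
  if (phrase.length === 0) return true;
  for (let start = 0; start < tokens.length; start++) {
    if (tokens[start] !== phrase[0]) continue;
    let lastPos = start;
    let matched = 1;
    // Greedily take the earliest next occurrence of each remaining word;
    // for a fixed start position this minimises the total gap.
    for (let i = start + 1; i < tokens.length && matched < phrase.length; i++) {
      if (tokens[i] === phrase[matched]) {
        lastPos = i;
        matched++;
      }
    }
    if (matched === phrase.length) {
      const gaps = lastPos - start - (phrase.length - 1);
      if (gaps <= slop) return true;
    }
  }
  return false;
}

const docTokens = "the quick brown fox".split(" ");
console.log(matchesWithSlop(docTokens, ["quick", "fox"], 0)); // false: "brown" sits in between
console.log(matchesWithSlop(docTokens, ["quick", "fox"], 1)); // true: one intervening position allowed
```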
|
||||
|
||||
## Methods
|
||||
|
||||
### queryType()
|
||||
|
||||
```ts
|
||||
queryType(): FullTextQueryType
|
||||
```
|
||||
|
||||
The type of the full-text query.
|
||||
|
||||
#### Returns
|
||||
|
||||
[`FullTextQueryType`](../enumerations/FullTextQueryType.md)
|
||||
|
||||
#### Implementation of
|
||||
|
||||
[`FullTextQuery`](../interfaces/FullTextQuery.md).[`queryType`](../interfaces/FullTextQuery.md#querytype)
|
||||
@@ -14,7 +14,7 @@ A builder for LanceDB queries.
|
||||
|
||||
## Extends
|
||||
|
||||
- [`QueryBase`](QueryBase.md)<`NativeQuery`>
|
||||
- `StandardQueryBase`<`NativeQuery`>
|
||||
|
||||
## Properties
|
||||
|
||||
@@ -26,10 +26,57 @@ protected inner: Query | Promise<Query>;
|
||||
|
||||
#### Inherited from
|
||||
|
||||
[`QueryBase`](QueryBase.md).[`inner`](QueryBase.md#inner)
|
||||
`StandardQueryBase.inner`
|
||||
|
||||
## Methods
|
||||
|
||||
### analyzePlan()
|
||||
|
||||
```ts
|
||||
analyzePlan(): Promise<string>
|
||||
```
|
||||
|
||||
Executes the query and returns the physical query plan annotated with runtime metrics.
|
||||
|
||||
This is useful for debugging and performance analysis, as it shows how the query was executed
|
||||
and includes metrics such as elapsed time, rows processed, and I/O statistics.
|
||||
|
||||
#### Returns
|
||||
|
||||
`Promise`<`string`>
|
||||
|
||||
A query execution plan with runtime metrics for each step.
|
||||
|
||||
#### Example
|
||||
|
||||
```ts
|
||||
import * as lancedb from "@lancedb/lancedb"
|
||||
|
||||
const db = await lancedb.connect("./.lancedb");
|
||||
const table = await db.createTable("my_table", [
|
||||
{ vector: [1.1, 0.9], id: "1" },
|
||||
]);
|
||||
|
||||
const plan = await table.query().nearestTo([0.5, 0.2]).analyzePlan();
|
||||
|
||||
Example output (with runtime metrics inlined):
|
||||
AnalyzeExec verbose=true, metrics=[]
|
||||
ProjectionExec: expr=[id@3 as id, vector@0 as vector, _distance@2 as _distance], metrics=[output_rows=1, elapsed_compute=3.292µs]
|
||||
Take: columns="vector, _rowid, _distance, (id)", metrics=[output_rows=1, elapsed_compute=66.001µs, batches_processed=1, bytes_read=8, iops=1, requests=1]
|
||||
CoalesceBatchesExec: target_batch_size=1024, metrics=[output_rows=1, elapsed_compute=3.333µs]
|
||||
GlobalLimitExec: skip=0, fetch=10, metrics=[output_rows=1, elapsed_compute=167ns]
|
||||
FilterExec: _distance@2 IS NOT NULL, metrics=[output_rows=1, elapsed_compute=8.542µs]
|
||||
SortExec: TopK(fetch=10), expr=[_distance@2 ASC NULLS LAST], metrics=[output_rows=1, elapsed_compute=63.25µs, row_replacements=1]
|
||||
KNNVectorDistance: metric=l2, metrics=[output_rows=1, elapsed_compute=114.333µs, output_batches=1]
|
||||
LanceScan: uri=/path/to/data, projection=[vector], row_id=true, row_addr=false, ordered=false, metrics=[output_rows=1, elapsed_compute=103.626µs, bytes_read=549, iops=2, requests=2]
|
||||
```
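
Because `analyzePlan()` returns plain text, it can be convenient to pull the per-operator metrics into a structure for logging or comparison. The sketch below assumes each step is printed on one line ending in `metrics=[k=v, ...]`, as in the sample output above; that layout is not a stable contract, and `parsePlanMetrics` and `PlanStep` are hypothetical names.

```ts
interface PlanStep {
  operator: string;
  metrics: { [key: string]: string };
}

// Parses lines shaped like "Name: details, metrics=[k=v, k=v]".
// Lines without a trailing metrics list are ignored.
function parsePlanMetrics(plan: string): PlanStep[] {
  const steps: PlanStep[] = [];
  const lines = plan.split("\n");
  for (let i = 0; i < lines.length; i++) {
    const match = /^\s*([A-Za-z]+).*metrics=\[(.*)\]\s*$/.exec(lines[i]);
    if (match === null) continue;
    const metrics: { [key: string]: string } = {};
    const pairs = match[2].split(",");
    for (let j = 0; j < pairs.length; j++) {
      const eq = pairs[j].indexOf("=");
      if (eq < 0) continue; // also covers the empty "metrics=[]" case
      metrics[pairs[j].slice(0, eq).trim()] = pairs[j].slice(eq + 1).trim();
    }
    steps.push({ operator: match[1], metrics: metrics });
  }
  return steps;
}

// Abridged sample taken from the example output.
const samplePlan = [
  "AnalyzeExec verbose=true, metrics=[]",
  "GlobalLimitExec: skip=0, fetch=10, metrics=[output_rows=1, elapsed_compute=167ns]",
  "FilterExec: _distance@2 IS NOT NULL, metrics=[output_rows=1, elapsed_compute=8.542µs]",
].join("\n");

const planSteps = parsePlanMetrics(samplePlan);
for (let i = 0; i < planSteps.length; i++) {
  console.log(planSteps[i].operator, planSteps[i].metrics["elapsed_compute"]);
}
```

Metric values are kept as strings because units vary between steps (`ns`, `µs`), so any numeric comparison would need a unit-aware conversion first.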
|
||||
|
||||
#### Inherited from
|
||||
|
||||
`StandardQueryBase.analyzePlan`
|
||||
|
||||
***
|
||||
|
||||
### execute()
|
||||
|
||||
```ts
|
||||
@@ -60,7 +107,7 @@ single query)
|
||||
|
||||
#### Inherited from
|
||||
|
||||
[`QueryBase`](QueryBase.md).[`execute`](QueryBase.md#execute)
|
||||
`StandardQueryBase.execute`
|
||||
|
||||
***
|
||||
|
||||
@@ -96,7 +143,7 @@ const plan = await table.query().nearestTo([0.5, 0.2]).explainPlan();
|
||||
|
||||
#### Inherited from
|
||||
|
||||
[`QueryBase`](QueryBase.md).[`explainPlan`](QueryBase.md#explainplan)
|
||||
`StandardQueryBase.explainPlan`
|
||||
|
||||
***
|
||||
|
||||
@@ -117,7 +164,7 @@ Use [Table#optimize](Table.md#optimize) to index all un-indexed data.
|
||||
|
||||
#### Inherited from
|
||||
|
||||
[`QueryBase`](QueryBase.md).[`fastSearch`](QueryBase.md#fastsearch)
|
||||
`StandardQueryBase.fastSearch`
|
||||
|
||||
***
|
||||
|
||||
@@ -147,7 +194,7 @@ Use `where` instead
|
||||
|
||||
#### Inherited from
|
||||
|
||||
[`QueryBase`](QueryBase.md).[`filter`](QueryBase.md#filter)
|
||||
`StandardQueryBase.filter`
|
||||
|
||||
***
|
||||
|
||||
@@ -159,7 +206,7 @@ fullTextSearch(query, options?): this
|
||||
|
||||
#### Parameters
|
||||
|
||||
* **query**: `string`
|
||||
* **query**: `string` \| [`FullTextQuery`](../interfaces/FullTextQuery.md)
|
||||
|
||||
* **options?**: `Partial`<[`FullTextSearchOptions`](../interfaces/FullTextSearchOptions.md)>
|
||||
|
||||
@@ -169,7 +216,7 @@ fullTextSearch(query, options?): this
|
||||
|
||||
#### Inherited from
|
||||
|
||||
[`QueryBase`](QueryBase.md).[`fullTextSearch`](QueryBase.md#fulltextsearch)
|
||||
`StandardQueryBase.fullTextSearch`
|
||||
|
||||
***
|
||||
|
||||
@@ -194,7 +241,7 @@ called then every valid row from the table will be returned.
|
||||
|
||||
#### Inherited from
|
||||
|
||||
[`QueryBase`](QueryBase.md).[`limit`](QueryBase.md#limit)
|
||||
`StandardQueryBase.limit`
|
||||
|
||||
***
|
||||
|
||||
@@ -262,7 +309,7 @@ nearestToText(query, columns?): Query
|
||||
|
||||
#### Parameters
|
||||
|
||||
* **query**: `string`
|
||||
* **query**: `string` \| [`FullTextQuery`](../interfaces/FullTextQuery.md)
|
||||
|
||||
* **columns?**: `string`[]
|
||||
|
||||
@@ -278,6 +325,10 @@ nearestToText(query, columns?): Query
|
||||
offset(offset): this
|
||||
```
|
||||
|
||||
Set the number of rows to skip before returning results.
|
||||
|
||||
This is useful for pagination.
|
||||
|
||||
#### Parameters
|
||||
|
||||
* **offset**: `number`
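
For pagination, `offset` is typically combined with `limit`. The sketch below shows the arithmetic for a zero-based page index; `pageWindow` and `PageWindow` are hypothetical names introduced for illustration and are not part of the library.

```ts
interface PageWindow {
  offset: number;
  limit: number;
}

// Hypothetical helper: converts a zero-based page index into offset/limit values.
function pageWindow(pageIndex: number, pageSize: number): PageWindow {
  if (pageIndex < 0 || pageSize <= 0) {
    throw new Error("pageIndex must be >= 0 and pageSize must be > 0");
  }
  return { offset: pageIndex * pageSize, limit: pageSize };
}

const thirdPage = pageWindow(2, 25);
console.log(thirdPage); // { offset: 50, limit: 25 }

// The values would then feed the query builder, for example:
// await table.query().offset(thirdPage.offset).limit(thirdPage.limit).toArray();
```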
|
||||
@@ -288,7 +339,7 @@ offset(offset): this
|
||||
|
||||
#### Inherited from
|
||||
|
||||
[`QueryBase`](QueryBase.md).[`offset`](QueryBase.md#offset)
|
||||
`StandardQueryBase.offset`
|
||||
|
||||
***
|
||||
|
||||
@@ -341,7 +392,7 @@ object insertion order is easy to get wrong and `Map` is more foolproof.
|
||||
|
||||
#### Inherited from
|
||||
|
||||
[`QueryBase`](QueryBase.md).[`select`](QueryBase.md#select)
|
||||
`StandardQueryBase.select`
|
||||
|
||||
***
|
||||
|
||||
@@ -363,7 +414,7 @@ Collect the results as an array of objects.
|
||||
|
||||
#### Inherited from
|
||||
|
||||
[`QueryBase`](QueryBase.md).[`toArray`](QueryBase.md#toarray)
|
||||
`StandardQueryBase.toArray`
|
||||
|
||||
***
|
||||
|
||||
@@ -389,7 +440,7 @@ ArrowTable.
|
||||
|
||||
#### Inherited from
|
||||
|
||||
[`QueryBase`](QueryBase.md).[`toArrow`](QueryBase.md#toarrow)
|
||||
`StandardQueryBase.toArrow`
|
||||
|
||||
***
|
||||
|
||||
@@ -424,7 +475,7 @@ on the filter column(s).
|
||||
|
||||
#### Inherited from
|
||||
|
||||
[`QueryBase`](QueryBase.md).[`where`](QueryBase.md#where)
|
||||
`StandardQueryBase.where`
|
||||
|
||||
***
|
||||
|
||||
@@ -446,4 +497,4 @@ order to perform hybrid search.
|
||||
|
||||
#### Inherited from
|
||||
|
||||
[`QueryBase`](QueryBase.md).[`withRowId`](QueryBase.md#withrowid)
|
||||
`StandardQueryBase.withRowId`
|
||||
|
||||
@@ -15,12 +15,11 @@ Common methods supported by all query types
|
||||
|
||||
## Extended by
|
||||
|
||||
- [`Query`](Query.md)
|
||||
- [`VectorQuery`](VectorQuery.md)
|
||||
- [`TakeQuery`](TakeQuery.md)
|
||||
|
||||
## Type Parameters
|
||||
|
||||
• **NativeQueryType** *extends* `NativeQuery` \| `NativeVectorQuery`
|
||||
• **NativeQueryType** *extends* `NativeQuery` \| `NativeVectorQuery` \| `NativeTakeQuery`
|
||||
|
||||
## Implements
|
||||
|
||||
@@ -36,6 +35,49 @@ protected inner: NativeQueryType | Promise<NativeQueryType>;
|
||||
|
||||
## Methods
|
||||
|
||||
### analyzePlan()
|
||||
|
||||
```ts
|
||||
analyzePlan(): Promise<string>
|
||||
```
|
||||
|
||||
Executes the query and returns the physical query plan annotated with runtime metrics.
|
||||
|
||||
This is useful for debugging and performance analysis, as it shows how the query was executed
|
||||
and includes metrics such as elapsed time, rows processed, and I/O statistics.
|
||||
|
||||
#### Returns
|
||||
|
||||
`Promise`<`string`>
|
||||
|
||||
A query execution plan with runtime metrics for each step.
|
||||
|
||||
#### Example
|
||||
|
||||
```ts
|
||||
import * as lancedb from "@lancedb/lancedb"
|
||||
|
||||
const db = await lancedb.connect("./.lancedb");
|
||||
const table = await db.createTable("my_table", [
|
||||
{ vector: [1.1, 0.9], id: "1" },
|
||||
]);
|
||||
|
||||
const plan = await table.query().nearestTo([0.5, 0.2]).analyzePlan();
|
||||
|
||||
Example output (with runtime metrics inlined):
|
||||
AnalyzeExec verbose=true, metrics=[]
|
||||
ProjectionExec: expr=[id@3 as id, vector@0 as vector, _distance@2 as _distance], metrics=[output_rows=1, elapsed_compute=3.292µs]
|
||||
Take: columns="vector, _rowid, _distance, (id)", metrics=[output_rows=1, elapsed_compute=66.001µs, batches_processed=1, bytes_read=8, iops=1, requests=1]
|
||||
CoalesceBatchesExec: target_batch_size=1024, metrics=[output_rows=1, elapsed_compute=3.333µs]
|
||||
GlobalLimitExec: skip=0, fetch=10, metrics=[output_rows=1, elapsed_compute=167ns]
|
||||
FilterExec: _distance@2 IS NOT NULL, metrics=[output_rows=1, elapsed_compute=8.542µs]
|
||||
SortExec: TopK(fetch=10), expr=[_distance@2 ASC NULLS LAST], metrics=[output_rows=1, elapsed_compute=63.25µs, row_replacements=1]
|
||||
KNNVectorDistance: metric=l2, metrics=[output_rows=1, elapsed_compute=114.333µs, output_batches=1]
|
||||
LanceScan: uri=/path/to/data, projection=[vector], row_id=true, row_addr=false, ordered=false, metrics=[output_rows=1, elapsed_compute=103.626µs, bytes_read=549, iops=2, requests=2]
|
||||
```
|
||||
|
||||
***
|
||||
|
||||
### execute()
|
||||
|
||||
```ts
|
||||
@@ -98,104 +140,6 @@ const plan = await table.query().nearestTo([0.5, 0.2]).explainPlan();
|
||||
|
||||
***
|
||||
|
||||
### fastSearch()
|
||||
|
||||
```ts
|
||||
fastSearch(): this
|
||||
```
|
||||
|
||||
Skip searching un-indexed data. This can make search faster, but will miss
|
||||
any data that is not yet indexed.
|
||||
|
||||
Use [Table#optimize](Table.md#optimize) to index all un-indexed data.
|
||||
|
||||
#### Returns
|
||||
|
||||
`this`
|
||||
|
||||
***
|
||||
|
||||
### ~~filter()~~
|
||||
|
||||
```ts
|
||||
filter(predicate): this
|
||||
```
|
||||
|
||||
A filter statement to be applied to this query.
|
||||
|
||||
#### Parameters
|
||||
|
||||
* **predicate**: `string`
|
||||
|
||||
#### Returns
|
||||
|
||||
`this`
|
||||
|
||||
#### See
|
||||
|
||||
where
|
||||
|
||||
#### Deprecated
|
||||
|
||||
Use `where` instead
|
||||
|
||||
***
|
||||
|
||||
### fullTextSearch()
|
||||
|
||||
```ts
|
||||
fullTextSearch(query, options?): this
|
||||
```
|
||||
|
||||
#### Parameters
|
||||
|
||||
* **query**: `string`
|
||||
|
||||
* **options?**: `Partial`<[`FullTextSearchOptions`](../interfaces/FullTextSearchOptions.md)>
|
||||
|
||||
#### Returns
|
||||
|
||||
`this`
|
||||
|
||||
***
|
||||
|
||||
### limit()
|
||||
|
||||
```ts
|
||||
limit(limit): this
|
||||
```
|
||||
|
||||
Set the maximum number of results to return.
|
||||
|
||||
By default, a plain search has no limit. If this method is not
|
||||
called then every valid row from the table will be returned.
|
||||
|
||||
#### Parameters
|
||||
|
||||
* **limit**: `number`
|
||||
|
||||
#### Returns
|
||||
|
||||
`this`
|
||||
|
||||
***
|
||||
|
||||
### offset()
|
||||
|
||||
```ts
|
||||
offset(offset): this
|
||||
```
|
||||
|
||||
#### Parameters
|
||||
|
||||
* **offset**: `number`
|
||||
|
||||
#### Returns
|
||||
|
||||
`this`
|
||||
|
||||
***
|
||||
|
||||
### select()
|
||||
|
||||
```ts
|
||||
@@ -285,37 +229,6 @@ ArrowTable.
|
||||
|
||||
***
|
||||
|
||||
### where()
|
||||
|
||||
```ts
|
||||
where(predicate): this
|
||||
```
|
||||
|
||||
A filter statement to be applied to this query.
|
||||
|
||||
The filter should be supplied as an SQL query string. For example:
|
||||
|
||||
#### Parameters
|
||||
|
||||
* **predicate**: `string`
|
||||
|
||||
#### Returns
|
||||
|
||||
`this`
|
||||
|
||||
#### Example
|
||||
|
||||
```ts
|
||||
x > 10
|
||||
y > 0 AND y < 100
|
||||
x > 5 OR y = 'test'
|
||||
|
||||
Filtering performance can often be improved by creating a scalar index
|
||||
on the filter column(s).
|
||||
```
|
||||
|
||||
***
|
||||
|
||||
### withRowId()
|
||||
|
||||
```ts
|
||||
|
||||
88
docs/src/js/classes/Session.md
Normal file
@@ -0,0 +1,88 @@
|
||||
[**@lancedb/lancedb**](../README.md) • **Docs**
|
||||
|
||||
***
|
||||
|
||||
[@lancedb/lancedb](../globals.md) / Session
|
||||
|
||||
# Class: Session
|
||||
|
||||
A session for managing caches and object stores across LanceDB operations.
|
||||
|
||||
Sessions allow you to configure cache sizes for index and metadata caches,
|
||||
which can significantly impact memory use and performance. They can
|
||||
also be re-used across multiple connections to share the same cache state.
|
||||
|
||||
## Constructors
|
||||
|
||||
### new Session()
|
||||
|
||||
```ts
|
||||
new Session(indexCacheSizeBytes?, metadataCacheSizeBytes?): Session
|
||||
```
|
||||
|
||||
Create a new session with custom cache sizes.
|
||||
|
||||
Cache size arguments:
|
||||
|
||||
- `indexCacheSizeBytes`: The size of the index cache in bytes.
|
||||
Index data is stored in memory in this cache to speed up queries.
|
||||
Defaults to 6GB if not specified.
|
||||
- `metadataCacheSizeBytes`: The size of the metadata cache in bytes.
|
||||
The metadata cache stores file metadata and schema information in memory.
|
||||
This cache improves scan and write performance.
|
||||
Defaults to 1GB if not specified.
|
||||
|
||||
#### Parameters
|
||||
|
||||
* **indexCacheSizeBytes?**: `null` \| `bigint`
|
||||
|
||||
* **metadataCacheSizeBytes?**: `null` \| `bigint`
|
||||
|
||||
#### Returns
|
||||
|
||||
[`Session`](Session.md)
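
Both constructor arguments are byte counts, so a small conversion helper keeps call sites readable. The sketch below assumes binary units (1 GiB = 1024³ bytes); whether the documented "GB" defaults are binary or decimal is not stated here. `gibToBytes` is a hypothetical helper, not a library function.

```ts
const BYTES_PER_GIB = 1024 * 1024 * 1024;

// Hypothetical helper: converts a size in GiB to a whole number of bytes.
function gibToBytes(gib: number): number {
  return Math.round(gib * BYTES_PER_GIB);
}

const indexCacheBytes = gibToBytes(2);      // 2147483648
const metadataCacheBytes = gibToBytes(0.5); // 536870912
console.log(indexCacheBytes, metadataCacheBytes);

// The constructor takes bigint values, so the numbers would be converted first:
// const session = new Session(BigInt(indexCacheBytes), BigInt(metadataCacheBytes));
```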
|
||||
|
||||
## Methods
|
||||
|
||||
### approxNumItems()
|
||||
|
||||
```ts
|
||||
approxNumItems(): number
|
||||
```
|
||||
|
||||
Get the approximate number of items cached in the session.
|
||||
|
||||
#### Returns
|
||||
|
||||
`number`
|
||||
|
||||
***
|
||||
|
||||
### sizeBytes()
|
||||
|
||||
```ts
|
||||
sizeBytes(): bigint
|
||||
```
|
||||
|
||||
Get the current size of the session caches in bytes.
|
||||
|
||||
#### Returns
|
||||
|
||||
`bigint`
|
||||
|
||||
***
|
||||
|
||||
### default()
|
||||
|
||||
```ts
|
||||
static default(): Session
|
||||
```
|
||||
|
||||
Create a session with default cache sizes.
|
||||
|
||||
This is equivalent to creating a session with 6GB index cache
|
||||
and 1GB metadata cache.
|
||||
|
||||
#### Returns
|
||||
|
||||
[`Session`](Session.md)
|
||||
@@ -40,7 +40,7 @@ Returns the name of the table
|
||||
### add()
|
||||
|
||||
```ts
|
||||
abstract add(data, options?): Promise<void>
|
||||
abstract add(data, options?): Promise<AddResult>
|
||||
```
|
||||
|
||||
Insert records into this Table.
|
||||
@@ -54,14 +54,17 @@ Insert records into this Table.
|
||||
|
||||
#### Returns
|
||||
|
||||
`Promise`<`void`>
|
||||
`Promise`<[`AddResult`](../interfaces/AddResult.md)>
|
||||
|
||||
A promise that resolves to an object
|
||||
containing the new version number of the table
|
||||
|
||||
***
|
||||
|
||||
### addColumns()
|
||||
|
||||
```ts
|
||||
abstract addColumns(newColumnTransforms): Promise<void>
|
||||
abstract addColumns(newColumnTransforms): Promise<AddColumnsResult>
|
||||
```
|
||||
|
||||
Add new columns with defined values.
|
||||
@@ -76,14 +79,17 @@ Add new columns with defined values.
|
||||
|
||||
#### Returns
|
||||
|
||||
`Promise`<`void`>
|
||||
`Promise`<[`AddColumnsResult`](../interfaces/AddColumnsResult.md)>
|
||||
|
||||
A promise that resolves to an object
|
||||
containing the new version number of the table after adding the columns.
|
||||
|
||||
***
|
||||
|
||||
### alterColumns()
|
||||
|
||||
```ts
|
||||
abstract alterColumns(columnAlterations): Promise<void>
|
||||
abstract alterColumns(columnAlterations): Promise<AlterColumnsResult>
|
||||
```
|
||||
|
||||
Alter the name or nullability of columns.
|
||||
@@ -96,7 +102,10 @@ Alter the name or nullability of columns.
|
||||
|
||||
#### Returns
|
||||
|
||||
`Promise`<`void`>
|
||||
`Promise`<[`AlterColumnsResult`](../interfaces/AlterColumnsResult.md)>
|
||||
|
||||
A promise that resolves to an object
|
||||
containing the new version number of the table after altering the columns.
|
||||
|
||||
***
|
||||
|
||||
@@ -117,8 +126,8 @@ wish to return to standard mode, call `checkoutLatest`.
|
||||
|
||||
#### Parameters
|
||||
|
||||
* **version**: `number`
|
||||
The version to checkout
|
||||
* **version**: `string` \| `number`
|
||||
The version to check out: either a version number or a tag name
|
||||
|
||||
#### Returns
|
||||
|
||||
@@ -252,7 +261,7 @@ await table.createIndex("my_float_col");
|
||||
### delete()
|
||||
|
||||
```ts
|
||||
abstract delete(predicate): Promise<void>
|
||||
abstract delete(predicate): Promise<DeleteResult>
|
||||
```
|
||||
|
||||
Delete the rows that satisfy the predicate.
|
||||
@@ -263,7 +272,10 @@ Delete the rows that satisfy the predicate.
|
||||
|
||||
#### Returns
|
||||
|
||||
`Promise`<`void`>
|
||||
`Promise`<[`DeleteResult`](../interfaces/DeleteResult.md)>
|
||||
|
||||
A promise that resolves to an object
|
||||
containing the new version number of the table
|
||||
|
||||
***
|
||||
|
||||
@@ -284,7 +296,7 @@ Return a brief description of the table
|
||||
### dropColumns()
|
||||
|
||||
```ts
|
||||
abstract dropColumns(columnNames): Promise<void>
|
||||
abstract dropColumns(columnNames): Promise<DropColumnsResult>
|
||||
```
|
||||
|
||||
Drop one or more columns from the dataset
|
||||
@@ -303,7 +315,10 @@ then call ``cleanup_files`` to remove the old files.
|
||||
|
||||
#### Returns
|
||||
|
||||
`Promise`<`void`>
|
||||
`Promise`<[`DropColumnsResult`](../interfaces/DropColumnsResult.md)>
|
||||
|
||||
A promise that resolves to an object
|
||||
containing the new version number of the table after dropping the columns.
|
||||
|
||||
***
|
||||
|
||||
@@ -454,6 +469,28 @@ Modeled after ``VACUUM`` in PostgreSQL.
|
||||
|
||||
***
|
||||
|
||||
### prewarmIndex()
|
||||
|
||||
```ts
|
||||
abstract prewarmIndex(name): Promise<void>
|
||||
```
|
||||
|
||||
Prewarm an index in the table.
|
||||
|
||||
#### Parameters
|
||||
|
||||
* **name**: `string`
|
||||
The name of the index.
|
||||
This will load the index into memory. This may reduce the cold-start time for
|
||||
future queries. If the index does not fit in the cache then this call may be
|
||||
wasteful.
|
||||
|
||||
#### Returns
|
||||
|
||||
`Promise`<`void`>
|
||||
|
||||
***
|
||||
|
||||
### query()
|
||||
|
||||
```ts
|
||||
@@ -575,7 +612,7 @@ of the given query
|
||||
|
||||
#### Parameters
|
||||
|
||||
* **query**: `string` \| [`IntoVector`](../type-aliases/IntoVector.md)
|
||||
* **query**: `string` \| [`IntoVector`](../type-aliases/IntoVector.md) \| [`MultiVector`](../type-aliases/MultiVector.md) \| [`FullTextQuery`](../interfaces/FullTextQuery.md)
|
||||
the query, a vector or string
|
||||
|
||||
* **queryType?**: `string`
|
||||
@@ -593,6 +630,92 @@ of the given query
|
||||
|
||||
***
|
||||
|
||||
### stats()
|
||||
|
||||
```ts
|
||||
abstract stats(): Promise<TableStatistics>
|
||||
```
|
||||
|
||||
Returns table and fragment statistics
|
||||
|
||||
#### Returns
|
||||
|
||||
`Promise`<[`TableStatistics`](../interfaces/TableStatistics.md)>
|
||||
|
||||
The table and fragment statistics
|
||||
|
||||
***
|
||||
|
||||
### tags()
|
||||
|
||||
```ts
|
||||
abstract tags(): Promise<Tags>
|
||||
```
|
||||
|
||||
Get a tags manager for this table.
|
||||
|
||||
Tags allow you to label specific versions of a table with a human-readable name.
|
||||
The returned tags manager can be used to list, create, update, or delete tags.
|
||||
|
||||
#### Returns
|
||||
|
||||
`Promise`<[`Tags`](Tags.md)>
|
||||
|
||||
A tags manager for this table
|
||||
|
||||
#### Example
|
||||
|
||||
```typescript
|
||||
const tagsManager = await table.tags();
|
||||
await tagsManager.create("v1", 1);
|
||||
const tags = await tagsManager.list();
|
||||
console.log(tags); // { "v1": { version: 1, manifestSize: ... } }
|
||||
```
|
||||
|
||||
***
|
||||
|
||||
### takeOffsets()
|
||||
|
||||
```ts
|
||||
abstract takeOffsets(offsets): TakeQuery
|
||||
```
|
||||
|
||||
Create a query that returns a subset of the rows in the table.
|
||||
|
||||
#### Parameters
|
||||
|
||||
* **offsets**: `number`[]
|
||||
The offsets of the rows to return.
|
||||
|
||||
#### Returns
|
||||
|
||||
[`TakeQuery`](TakeQuery.md)
|
||||
|
||||
A builder that can be used to parameterize the query.
|
||||
|
||||
***
|
||||
|
||||
### takeRowIds()
|
||||
|
||||
```ts
|
||||
abstract takeRowIds(rowIds): TakeQuery
|
||||
```
|
||||
|
||||
Create a query that returns a subset of the rows in the table.
|
||||
|
||||
#### Parameters
|
||||
|
||||
* **rowIds**: `number`[]
|
||||
The row ids of the rows to return.
|
||||
|
||||
#### Returns
|
||||
|
||||
[`TakeQuery`](TakeQuery.md)
|
||||
|
||||
A builder that can be used to parameterize the query.
|
||||
|
||||
***
|
||||
|
||||
### toArrow()
|
||||
|
||||
```ts
|
||||
@@ -612,7 +735,7 @@ Return the table as an arrow table
|
||||
#### update(opts)
|
||||
|
||||
```ts
|
||||
abstract update(opts): Promise<void>
|
||||
abstract update(opts): Promise<UpdateResult>
|
||||
```
|
||||
|
||||
Update existing records in the Table
|
||||
@@ -623,7 +746,10 @@ Update existing records in the Table
|
||||
|
||||
##### Returns
|
||||
|
||||
`Promise`<`void`>
|
||||
`Promise`<[`UpdateResult`](../interfaces/UpdateResult.md)>
|
||||
|
||||
A promise that resolves to an object containing
|
||||
the number of rows updated and the new version number
|
||||
|
||||
##### Example
|
||||
|
||||
@@ -634,7 +760,7 @@ table.update({where:"x = 2", values:{"vector": [10, 10]}})
|
||||
#### update(opts)
|
||||
|
||||
```ts
|
||||
abstract update(opts): Promise<void>
|
||||
abstract update(opts): Promise<UpdateResult>
|
||||
```
|
||||
|
||||
Update existing records in the Table
|
||||
@@ -645,7 +771,10 @@ Update existing records in the Table
|
||||
|
||||
##### Returns
|
||||
|
||||
`Promise`<`void`>
|
||||
`Promise`<[`UpdateResult`](../interfaces/UpdateResult.md)>
|
||||
|
||||
A promise that resolves to an object containing
|
||||
the number of rows updated and the new version number
|
||||
|
||||
##### Example
|
||||
|
||||
@@ -656,7 +785,7 @@ table.update({where:"x = 2", valuesSql:{"x": "x + 1"}})
|
||||
#### update(updates, options)
|
||||
|
||||
```ts
|
||||
abstract update(updates, options?): Promise<void>
|
||||
abstract update(updates, options?): Promise<UpdateResult>
|
||||
```
|
||||
|
||||
Update existing records in the Table
|
||||
@@ -679,10 +808,6 @@ repeatedly calling this method.
|
||||
* **updates**: `Record`<`string`, `string`> \| `Map`<`string`, `string`>
|
||||
the
|
||||
columns to update
|
||||
Keys in the map should specify the name of the column to update.
|
||||
Values in the map provide the new value of the column. These can
|
||||
be SQL literal strings (e.g. "7" or "'foo'") or they can be expressions
|
||||
based on the row being updated (e.g. "my_col + 1")
|
||||
|
||||
* **options?**: `Partial`<[`UpdateOptions`](../interfaces/UpdateOptions.md)>
|
||||
additional options to control
|
||||
@@ -690,7 +815,15 @@ repeatedly calilng this method.
|
||||
|
||||
##### Returns
|
||||
|
||||
`Promise`<`void`>
|
||||
`Promise`<[`UpdateResult`](../interfaces/UpdateResult.md)>
|
||||
|
||||
A promise that resolves to an object
|
||||
containing the number of rows updated and the new version number
|
||||
|
||||
Keys in the map should specify the name of the column to update.
|
||||
Values in the map provide the new value of the column. These can
|
||||
be SQL literal strings (e.g. "7" or "'foo'") or they can be expressions
|
||||
based on the row being updated (e.g. "my_col + 1")
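
Since every value in the map is a SQL string, plain JavaScript values have to be rendered as SQL literals first. The sketch below shows one way to do that; `toSqlLiteral` is a hypothetical helper, and doubling embedded single quotes follows standard SQL string escaping, which is assumed to apply here.

```ts
// Hypothetical helper: renders a JavaScript value as a SQL literal string.
function toSqlLiteral(value: string | number | boolean | null): string {
  if (value === null) return "NULL";
  if (typeof value === "number") {
    if (!isFinite(value)) throw new Error("cannot render a non-finite number");
    return String(value);
  }
  if (typeof value === "boolean") return value ? "TRUE" : "FALSE";
  // Strings are single-quoted; embedded single quotes are doubled.
  return "'" + value.replace(/'/g, "''") + "'";
}

const updateValues: { [column: string]: string } = {
  count: toSqlLiteral(7),      // "7"
  label: toSqlLiteral("it's"), // "'it''s'"
  views: "views + 1",          // an expression is passed through as written
};
console.log(updateValues);

// The map would then be passed as the `updates` argument, for example:
// await table.update(updateValues, { where: "id = 1" });
```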
|
||||
|
||||
***
|
||||
|
||||
@@ -708,7 +841,7 @@ by `query`.
|
||||
|
||||
#### Parameters
|
||||
|
||||
* **vector**: [`IntoVector`](../type-aliases/IntoVector.md)
|
||||
* **vector**: [`IntoVector`](../type-aliases/IntoVector.md) \| [`MultiVector`](../type-aliases/MultiVector.md)
|
||||
|
||||
#### Returns
|
||||
|
||||
@@ -731,3 +864,26 @@ Retrieve the version of the table
|
||||
#### Returns
|
||||
|
||||
`Promise`<`number`>
|
||||
|
||||
***
|
||||
|
||||
### waitForIndex()
|
||||
|
||||
```ts
|
||||
abstract waitForIndex(indexNames, timeoutSeconds): Promise<void>
|
||||
```
|
||||
|
||||
Waits for asynchronous indexing to complete on the table.
|
||||
|
||||
#### Parameters
|
||||
|
||||
* **indexNames**: `string`[]
|
||||
The names of the indices to wait for
|
||||
|
||||
* **timeoutSeconds**: `number`
|
||||
The number of seconds to wait before timing out
|
||||
This will raise an error if the indices are not created and fully indexed within the timeout.
|
||||
|
||||
#### Returns
|
||||
|
||||
`Promise`<`void`>
|
||||
|
||||
35
docs/src/js/classes/TagContents.md
Normal file
@@ -0,0 +1,35 @@
|
||||
[**@lancedb/lancedb**](../README.md) • **Docs**
|
||||
|
||||
***
|
||||
|
||||
[@lancedb/lancedb](../globals.md) / TagContents
|
||||
|
||||
# Class: TagContents
|
||||
|
||||
## Constructors
|
||||
|
||||
### new TagContents()
|
||||
|
||||
```ts
|
||||
new TagContents(): TagContents
|
||||
```
|
||||
|
||||
#### Returns
|
||||
|
||||
[`TagContents`](TagContents.md)
|
||||
|
||||
## Properties
|
||||
|
||||
### manifestSize
|
||||
|
||||
```ts
|
||||
manifestSize: number;
|
||||
```
|
||||
|
||||
***
|
||||
|
||||
### version
|
||||
|
||||
```ts
|
||||
version: number;
|
||||
```
|
||||
99
docs/src/js/classes/Tags.md
Normal file
@@ -0,0 +1,99 @@
|
||||
[**@lancedb/lancedb**](../README.md) • **Docs**
|
||||
|
||||
***
|
||||
|
||||
[@lancedb/lancedb](../globals.md) / Tags
|
||||
|
||||
# Class: Tags
|
||||
|
||||
## Constructors
|
||||
|
||||
### new Tags()
|
||||
|
||||
```ts
|
||||
new Tags(): Tags
|
||||
```
|
||||
|
||||
#### Returns
|
||||
|
||||
[`Tags`](Tags.md)
|
||||
|
||||
## Methods
|
||||
|
||||
### create()
|
||||
|
||||
```ts
|
||||
create(tag, version): Promise<void>
|
||||
```
|
||||
|
||||
#### Parameters
|
||||
|
||||
* **tag**: `string`
|
||||
|
||||
* **version**: `number`
|
||||
|
||||
#### Returns
|
||||
|
||||
`Promise`<`void`>
|
||||
|
||||
***
|
||||
|
||||
### delete()
|
||||
|
||||
```ts
|
||||
delete(tag): Promise<void>
|
||||
```
|
||||
|
||||
#### Parameters
|
||||
|
||||
* **tag**: `string`
|
||||
|
||||
#### Returns
|
||||
|
||||
`Promise`<`void`>
|
||||
|
||||
***
|
||||
|
||||
### getVersion()
|
||||
|
||||
```ts
|
||||
getVersion(tag): Promise<number>
|
||||
```
|
||||
|
||||
#### Parameters
|
||||
|
||||
* **tag**: `string`
|
||||
|
||||
#### Returns
|
||||
|
||||
`Promise`<`number`>
|
||||
|
||||
***
|
||||
|
||||
### list()
|
||||
|
||||
```ts
|
||||
list(): Promise<Record<string, TagContents>>
|
||||
```
|
||||
|
||||
#### Returns
|
||||
|
||||
`Promise`<`Record`<`string`, [`TagContents`](TagContents.md)>>
|
||||
|
||||
***
|
||||
|
||||
### update()
|
||||
|
||||
```ts
|
||||
update(tag, version): Promise<void>
|
||||
```
|
||||
|
||||
#### Parameters
|
||||
|
||||
* **tag**: `string`
|
||||
|
||||
* **version**: `number`
|
||||
|
||||
#### Returns
|
||||
|
||||
`Promise`<`void`>
|
||||
265
docs/src/js/classes/TakeQuery.md
Normal file
@@ -0,0 +1,265 @@
|
||||
[**@lancedb/lancedb**](../README.md) • **Docs**
|
||||
|
||||
***
|
||||
|
||||
[@lancedb/lancedb](../globals.md) / TakeQuery
|
||||
|
||||
# Class: TakeQuery
|
||||
|
||||
A query that returns a subset of the rows in the table.
|
||||
|
||||
## Extends
|
||||
|
||||
- [`QueryBase`](QueryBase.md)<`NativeTakeQuery`>
|
||||
|
||||
## Properties
|
||||
|
||||
### inner
|
||||
|
||||
```ts
|
||||
protected inner: TakeQuery | Promise<TakeQuery>;
|
||||
```
|
||||
|
||||
#### Inherited from
|
||||
|
||||
[`QueryBase`](QueryBase.md).[`inner`](QueryBase.md#inner)
|
||||
|
||||
## Methods
|
||||
|
||||
### analyzePlan()
|
||||
|
||||
```ts
|
||||
analyzePlan(): Promise<string>
|
||||
```
|
||||
|
||||
Executes the query and returns the physical query plan annotated with runtime metrics.
|
||||
|
||||
This is useful for debugging and performance analysis, as it shows how the query was executed
|
||||
and includes metrics such as elapsed time, rows processed, and I/O statistics.
|
||||
|
||||
#### Returns
|
||||
|
||||
`Promise`<`string`>
|
||||
|
||||
A query execution plan with runtime metrics for each step.
|
||||
|
||||
#### Example
|
||||
|
||||
```ts
import * as lancedb from "@lancedb/lancedb"

const db = await lancedb.connect("./.lancedb");
const table = await db.createTable("my_table", [
  { vector: [1.1, 0.9], id: "1" },
]);

const plan = await table.query().nearestTo([0.5, 0.2]).analyzePlan();

// Example output (with runtime metrics inlined):
// AnalyzeExec verbose=true, metrics=[]
// ProjectionExec: expr=[id@3 as id, vector@0 as vector, _distance@2 as _distance], metrics=[output_rows=1, elapsed_compute=3.292µs]
// Take: columns="vector, _rowid, _distance, (id)", metrics=[output_rows=1, elapsed_compute=66.001µs, batches_processed=1, bytes_read=8, iops=1, requests=1]
// CoalesceBatchesExec: target_batch_size=1024, metrics=[output_rows=1, elapsed_compute=3.333µs]
// GlobalLimitExec: skip=0, fetch=10, metrics=[output_rows=1, elapsed_compute=167ns]
// FilterExec: _distance@2 IS NOT NULL, metrics=[output_rows=1, elapsed_compute=8.542µs]
// SortExec: TopK(fetch=10), expr=[_distance@2 ASC NULLS LAST], metrics=[output_rows=1, elapsed_compute=63.25µs, row_replacements=1]
// KNNVectorDistance: metric=l2, metrics=[output_rows=1, elapsed_compute=114.333µs, output_batches=1]
// LanceScan: uri=/path/to/data, projection=[vector], row_id=true, row_addr=false, ordered=false, metrics=[output_rows=1, elapsed_compute=103.626µs, bytes_read=549, iops=2, requests=2]
```
|
||||
|
||||
#### Inherited from
|
||||
|
||||
[`QueryBase`](QueryBase.md).[`analyzePlan`](QueryBase.md#analyzeplan)
|
||||
|
||||
***
|
||||
|
||||
### execute()
|
||||
|
||||
```ts
|
||||
protected execute(options?): RecordBatchIterator
|
||||
```
|
||||
|
||||
Execute the query and return the results as an `AsyncIterator` of `RecordBatch`.
|
||||
|
||||
#### Parameters
|
||||
|
||||
* **options?**: `Partial`<[`QueryExecutionOptions`](../interfaces/QueryExecutionOptions.md)>
|
||||
|
||||
#### Returns
|
||||
|
||||
[`RecordBatchIterator`](RecordBatchIterator.md)
|
||||
|
||||
#### See
|
||||
|
||||
- AsyncIterator
|
||||
of
|
||||
- RecordBatch.
|
||||
|
||||
By default, LanceDB will use many threads to calculate results and, when
the result set is large, multiple batches will be processed at one time.
This readahead is limited, however, and backpressure will be applied if this
stream is consumed slowly (this constrains the maximum memory used by a
single query).
|
||||
|
||||
#### Inherited from
|
||||
|
||||
[`QueryBase`](QueryBase.md).[`execute`](QueryBase.md#execute)
|
||||
|
||||
***
|
||||
|
||||
### explainPlan()
|
||||
|
||||
```ts
|
||||
explainPlan(verbose): Promise<string>
|
||||
```
|
||||
|
||||
Generates an explanation of the query execution plan.
|
||||
|
||||
#### Parameters
|
||||
|
||||
* **verbose**: `boolean` = `false`
|
||||
If true, provides a more detailed explanation. Defaults to false.
|
||||
|
||||
#### Returns
|
||||
|
||||
`Promise`<`string`>
|
||||
|
||||
A Promise that resolves to a string containing the query execution plan explanation.
|
||||
|
||||
#### Example
|
||||
|
||||
```ts
|
||||
import * as lancedb from "@lancedb/lancedb"
|
||||
const db = await lancedb.connect("./.lancedb");
|
||||
const table = await db.createTable("my_table", [
|
||||
{ vector: [1.1, 0.9], id: "1" },
|
||||
]);
|
||||
const plan = await table.query().nearestTo([0.5, 0.2]).explainPlan();
|
||||
```
|
||||
|
||||
#### Inherited from
|
||||
|
||||
[`QueryBase`](QueryBase.md).[`explainPlan`](QueryBase.md#explainplan)
|
||||
|
||||
***
|
||||
|
||||
### select()
|
||||
|
||||
```ts
|
||||
select(columns): this
|
||||
```
|
||||
|
||||
Return only the specified columns.
|
||||
|
||||
By default a query will return all columns from the table. However, this can have
|
||||
a very significant impact on latency. LanceDB stores data in a columnar fashion. This
|
||||
means we can finely tune our I/O to select exactly the columns we need.
|
||||
|
||||
As a best practice you should always limit queries to the columns that you need. If you
|
||||
pass in an array of column names then only those columns will be returned.
|
||||
|
||||
You can also use this method to create new "dynamic" columns based on your existing columns.
|
||||
For example, you may not care about "a" or "b" but instead simply want "a + b". This is often
|
||||
seen in the SELECT clause of an SQL query (e.g. `SELECT a+b FROM my_table`).
|
||||
|
||||
To create dynamic columns you can pass in a Map<string, string>. A column will be returned
|
||||
for each entry in the map. The key provides the name of the column. The value is
|
||||
an SQL string used to specify how the column is calculated.
|
||||
|
||||
For example, an SQL query might state `SELECT a + b AS combined, c`. The equivalent
|
||||
input to this method would be:
|
||||
|
||||
#### Parameters
|
||||
|
||||
* **columns**: `string` \| `string`[] \| `Record`<`string`, `string`> \| `Map`<`string`, `string`>
|
||||
|
||||
#### Returns
|
||||
|
||||
`this`
|
||||
|
||||
#### Example
|
||||
|
||||
```ts
new Map([["combined", "a + b"], ["c", "c"]])
```

Columns will always be returned in the order given, even if that order is different than
the order used when adding the data.

Note that you can pass in a `Record<string, string>` (e.g. an object literal). This method
uses `Object.entries` which should preserve the insertion order of the object. However,
object insertion order is easy to get wrong and `Map` is more foolproof.
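
The ordering caveat for `Record` versus `Map` inputs can be seen in plain TypeScript, without touching LanceDB. The sketch below is only an illustration: `toColumnPairs` is a hypothetical helper (not part of `@lancedb/lancedb`) that turns either input into ordered name/expression pairs, and the column named `"2"` is invented to trigger the JavaScript rule that integer-like object keys are enumerated before other string keys.

```ts
// Hypothetical helper (not a LanceDB API): normalizes a column mapping into
// ordered [name, sqlExpression] pairs.
function toColumnPairs(
  columns: Map<string, string> | Record<string, string>,
): [string, string][] {
  return columns instanceof Map
    ? Array.from(columns.entries())
    : Object.entries(columns);
}

// A Map always iterates in insertion order.
const fromMap = toColumnPairs(
  new Map([
    ["combined", "a + b"],
    ["2", "c"],
  ]),
);

// An object literal enumerates the integer-like key "2" before "combined",
// regardless of the order in which the keys were written.
const fromRecord = toColumnPairs({ combined: "a + b", "2": "c" });

console.log(fromMap.map(([name]) => name));
console.log(fromRecord.map(([name]) => name));
```

With ordinary column names the two inputs behave the same; the difference only shows up for keys that look like array indices, which is one reason `Map` is the safer choice.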
|
||||
|
||||
#### Inherited from
|
||||
|
||||
[`QueryBase`](QueryBase.md).[`select`](QueryBase.md#select)
|
||||
|
||||
***
|
||||
|
||||
### toArray()
|
||||
|
||||
```ts
|
||||
toArray(options?): Promise<any[]>
|
||||
```
|
||||
|
||||
Collect the results as an array of objects.
|
||||
|
||||
#### Parameters
|
||||
|
||||
* **options?**: `Partial`<[`QueryExecutionOptions`](../interfaces/QueryExecutionOptions.md)>
|
||||
|
||||
#### Returns
|
||||
|
||||
`Promise`<`any`[]>
|
||||
|
||||
#### Inherited from
|
||||
|
||||
[`QueryBase`](QueryBase.md).[`toArray`](QueryBase.md#toarray)
|
||||
|
||||
***
|
||||
|
||||
### toArrow()
|
||||
|
||||
```ts
|
||||
toArrow(options?): Promise<Table<any>>
|
||||
```
|
||||
|
||||
Collect the results as an Arrow Table.
|
||||
|
||||
#### Parameters
|
||||
|
||||
* **options?**: `Partial`<[`QueryExecutionOptions`](../interfaces/QueryExecutionOptions.md)>
|
||||
|
||||
#### Returns
|
||||
|
||||
`Promise`<`Table`<`any`>>
|
||||
|
||||
#### See
|
||||
|
||||
ArrowTable.
|
||||
|
||||
#### Inherited from
|
||||
|
||||
[`QueryBase`](QueryBase.md).[`toArrow`](QueryBase.md#toarrow)
|
||||
|
||||
***
|
||||
|
||||
### withRowId()
|
||||
|
||||
```ts
|
||||
withRowId(): this
|
||||
```
|
||||
|
||||
Whether to return the row id in the results.
|
||||
|
||||
This column can be used to match results between different queries. For
|
||||
example, to match results from a full text search and a vector search in
|
||||
order to perform hybrid search.
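
As a rough illustration of matching results between two queries, the sketch below intersects two result lists on their row id. Everything here is an assumption made for the example: the `Row` shape, the sample rows, and the `matchByRowId` helper are hypothetical, and `_rowid` is typed as `number` for brevity even though the real column type may differ. Only the `_rowid` column name is taken from the plan output shown earlier.

```ts
// Hypothetical row shape; a real result row carries the table's own columns.
interface Row {
  _rowid: number;
  text: string;
}

// Pretend these came from a full text search and a vector search, each run
// with withRowId() so that every row includes its row id.
const ftsResults: Row[] = [
  { _rowid: 1, text: "alpha" },
  { _rowid: 2, text: "beta" },
  { _rowid: 3, text: "gamma" },
];
const vectorResults: Row[] = [
  { _rowid: 3, text: "gamma" },
  { _rowid: 4, text: "delta" },
  { _rowid: 1, text: "alpha" },
];

// Keep the rows of `a` whose row id also appears in `b`, preserving a's order.
function matchByRowId(a: Row[], b: Row[]): Row[] {
  const ids = new Set(b.map((row) => row._rowid));
  return a.filter((row) => ids.has(row._rowid));
}

const inBoth = matchByRowId(ftsResults, vectorResults);
console.log(inBoth.map((row) => row._rowid));
```

A real hybrid search would usually also combine the two scores; this sketch only shows the matching step.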
|
||||
|
||||
#### Returns
|
||||
|
||||
`this`
|
||||
|
||||
#### Inherited from
|
||||
|
||||
[`QueryBase`](QueryBase.md).[`withRowId`](QueryBase.md#withrowid)
|
||||
@@ -16,7 +16,7 @@ This builder can be reused to execute the query many times.
|
||||
|
||||
## Extends
|
||||
|
||||
- [`QueryBase`](QueryBase.md)<`NativeVectorQuery`>
|
||||
- `StandardQueryBase`<`NativeVectorQuery`>
|
||||
|
||||
## Properties
|
||||
|
||||
@@ -28,7 +28,7 @@ protected inner: VectorQuery | Promise<VectorQuery>;
|
||||
|
||||
#### Inherited from
|
||||
|
||||
[`QueryBase`](QueryBase.md).[`inner`](QueryBase.md#inner)
|
||||
`StandardQueryBase.inner`
|
||||
|
||||
## Methods
|
||||
|
||||
@@ -48,6 +48,53 @@ addQueryVector(vector): VectorQuery
|
||||
|
||||
***
|
||||
|
||||
### analyzePlan()
|
||||
|
||||
```ts
|
||||
analyzePlan(): Promise<string>
|
||||
```
|
||||
|
||||
Executes the query and returns the physical query plan annotated with runtime metrics.
|
||||
|
||||
This is useful for debugging and performance analysis, as it shows how the query was executed
|
||||
and includes metrics such as elapsed time, rows processed, and I/O statistics.
|
||||
|
||||
#### Returns
|
||||
|
||||
`Promise`<`string`>
|
||||
|
||||
A query execution plan with runtime metrics for each step.
|
||||
|
||||
#### Example
|
||||
|
||||
```ts
import * as lancedb from "@lancedb/lancedb"

const db = await lancedb.connect("./.lancedb");
const table = await db.createTable("my_table", [
  { vector: [1.1, 0.9], id: "1" },
]);

const plan = await table.query().nearestTo([0.5, 0.2]).analyzePlan();

// Example output (with runtime metrics inlined):
// AnalyzeExec verbose=true, metrics=[]
// ProjectionExec: expr=[id@3 as id, vector@0 as vector, _distance@2 as _distance], metrics=[output_rows=1, elapsed_compute=3.292µs]
// Take: columns="vector, _rowid, _distance, (id)", metrics=[output_rows=1, elapsed_compute=66.001µs, batches_processed=1, bytes_read=8, iops=1, requests=1]
// CoalesceBatchesExec: target_batch_size=1024, metrics=[output_rows=1, elapsed_compute=3.333µs]
// GlobalLimitExec: skip=0, fetch=10, metrics=[output_rows=1, elapsed_compute=167ns]
// FilterExec: _distance@2 IS NOT NULL, metrics=[output_rows=1, elapsed_compute=8.542µs]
// SortExec: TopK(fetch=10), expr=[_distance@2 ASC NULLS LAST], metrics=[output_rows=1, elapsed_compute=63.25µs, row_replacements=1]
// KNNVectorDistance: metric=l2, metrics=[output_rows=1, elapsed_compute=114.333µs, output_batches=1]
// LanceScan: uri=/path/to/data, projection=[vector], row_id=true, row_addr=false, ordered=false, metrics=[output_rows=1, elapsed_compute=103.626µs, bytes_read=549, iops=2, requests=2]
```
|
||||
|
||||
#### Inherited from
|
||||
|
||||
`StandardQueryBase.analyzePlan`
|
||||
|
||||
***
|
||||
|
||||
### bypassVectorIndex()
|
||||
|
||||
```ts
|
||||
@@ -201,7 +248,7 @@ single query)
|
||||
|
||||
#### Inherited from
|
||||
|
||||
[`QueryBase`](QueryBase.md).[`execute`](QueryBase.md#execute)
|
||||
`StandardQueryBase.execute`
|
||||
|
||||
***
|
||||
|
||||
@@ -237,7 +284,7 @@ const plan = await table.query().nearestTo([0.5, 0.2]).explainPlan();
|
||||
|
||||
#### Inherited from
|
||||
|
||||
[`QueryBase`](QueryBase.md).[`explainPlan`](QueryBase.md#explainplan)
|
||||
`StandardQueryBase.explainPlan`
|
||||
|
||||
***
|
||||
|
||||
@@ -258,7 +305,7 @@ Use [Table#optimize](Table.md#optimize) to index all un-indexed data.
|
||||
|
||||
#### Inherited from
|
||||
|
||||
[`QueryBase`](QueryBase.md).[`fastSearch`](QueryBase.md#fastsearch)
|
||||
`StandardQueryBase.fastSearch`
|
||||
|
||||
***
|
||||
|
||||
@@ -288,7 +335,7 @@ Use `where` instead
|
||||
|
||||
#### Inherited from
|
||||
|
||||
[`QueryBase`](QueryBase.md).[`filter`](QueryBase.md#filter)
|
||||
`StandardQueryBase.filter`
|
||||
|
||||
***
|
||||
|
||||
@@ -300,7 +347,7 @@ fullTextSearch(query, options?): this
|
||||
|
||||
#### Parameters
|
||||
|
||||
* **query**: `string`
|
||||
* **query**: `string` \| [`FullTextQuery`](../interfaces/FullTextQuery.md)
|
||||
|
||||
* **options?**: `Partial`<[`FullTextSearchOptions`](../interfaces/FullTextSearchOptions.md)>
|
||||
|
||||
@@ -310,7 +357,7 @@ fullTextSearch(query, options?): this
|
||||
|
||||
#### Inherited from
|
||||
|
||||
[`QueryBase`](QueryBase.md).[`fullTextSearch`](QueryBase.md#fulltextsearch)
|
||||
`StandardQueryBase.fullTextSearch`
|
||||
|
||||
***
|
||||
|
||||
@@ -335,7 +382,54 @@ called then every valid row from the table will be returned.
|
||||
|
||||
#### Inherited from
|
||||
|
||||
[`QueryBase`](QueryBase.md).[`limit`](QueryBase.md#limit)
|
||||
`StandardQueryBase.limit`
|
||||
|
||||
***
|
||||
|
||||
### maximumNprobes()
|
||||
|
||||
```ts
|
||||
maximumNprobes(maximumNprobes): VectorQuery
|
||||
```
|
||||
|
||||
Set the maximum number of probes used.
|
||||
|
||||
This controls the maximum number of partitions that will be searched. If this
|
||||
number is greater than minimumNprobes then the excess partitions will _only_ be
|
||||
searched if we have not found enough results. This can be useful when there is
|
||||
a narrow filter to allow these queries to spend more time searching and avoid
|
||||
potential false negatives.
|
||||
|
||||
#### Parameters
|
||||
|
||||
* **maximumNprobes**: `number`
|
||||
|
||||
#### Returns
|
||||
|
||||
[`VectorQuery`](VectorQuery.md)
|
||||
|
||||
***
|
||||
|
||||
### minimumNprobes()
|
||||
|
||||
```ts
|
||||
minimumNprobes(minimumNprobes): VectorQuery
|
||||
```
|
||||
|
||||
Set the minimum number of probes used.
|
||||
|
||||
This controls the minimum number of partitions that will be searched. This
|
||||
parameter will impact every query against a vector index, regardless of the
|
||||
filter. See `nprobes` for more details. Higher values will increase recall
|
||||
but will also increase latency.
|
||||
|
||||
#### Parameters
|
||||
|
||||
* **minimumNprobes**: `number`
|
||||
|
||||
#### Returns
|
||||
|
||||
[`VectorQuery`](VectorQuery.md)
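
The interaction between the minimum and maximum described above can be modelled with a toy loop. This is a sketch of the documented behaviour only, not LanceDB's implementation: `probePartitions` is a hypothetical function, and each partition is represented simply as the list of row ids in it that already pass the filter.

```ts
// Toy model (not LanceDB code): search partitions in order, always probing at
// least `minimumNprobes`, and going on up to `maximumNprobes` only while fewer
// than `k` results have been found.
function probePartitions(
  partitions: number[][],
  k: number,
  minimumNprobes: number,
  maximumNprobes: number,
): { results: number[]; probed: number } {
  const results: number[] = [];
  let probed = 0;
  for (const partition of partitions) {
    if (probed >= maximumNprobes) break;
    if (probed >= minimumNprobes && results.length >= k) break;
    results.push(...partition);
    probed += 1;
  }
  return { results: results.slice(0, k), probed };
}

// Narrow filter: few rows survive in each partition.
const narrow = [[1], [], [2], [3, 4], [5]];
// Broad filter: the first partitions already hold enough rows.
const broad = [[1, 2], [3, 4], [5]];

const narrowWide = probePartitions(narrow, 3, 2, 4); // extra probes are used
const narrowFixed = probePartitions(narrow, 3, 2, 2); // min == max, like nprobes(2)
const broadWide = probePartitions(broad, 3, 2, 4); // stops at the minimum

console.log(narrowWide, narrowFixed, broadWide);
```

In this model the extra partitions cost time only when the filter is narrow, which matches the intent stated for `maximumNprobes`.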
|
||||
|
||||
***
|
||||
|
||||
@@ -366,6 +460,10 @@ For best results we recommend tuning this parameter with a benchmark against
|
||||
your actual data to find the smallest possible value that will still give
|
||||
you the desired recall.
|
||||
|
||||
For more fine-grained control over behavior when you have a very narrow filter
|
||||
you can use `minimumNprobes` and `maximumNprobes`. This method sets both
|
||||
the minimum and maximum to the same value.
|
||||
|
||||
#### Parameters
|
||||
|
||||
* **nprobes**: `number`
|
||||
@@ -382,6 +480,10 @@ you the desired recall.
|
||||
offset(offset): this
|
||||
```
|
||||
|
||||
Set the number of rows to skip before returning results.
|
||||
|
||||
This is useful for pagination.
|
||||
|
||||
#### Parameters
|
||||
|
||||
* **offset**: `number`
|
||||
@@ -392,7 +494,7 @@ offset(offset): this
|
||||
|
||||
#### Inherited from
|
||||
|
||||
[`QueryBase`](QueryBase.md).[`offset`](QueryBase.md#offset)
|
||||
`StandardQueryBase.offset`
|
||||
|
||||
***
|
||||
|
||||
@@ -539,7 +641,7 @@ object insertion order is easy to get wrong and `Map` is more foolproof.
|
||||
|
||||
#### Inherited from
|
||||
|
||||
[`QueryBase`](QueryBase.md).[`select`](QueryBase.md#select)
|
||||
`StandardQueryBase.select`
|
||||
|
||||
***
|
||||
|
||||
@@ -561,7 +663,7 @@ Collect the results as an array of objects.
|
||||
|
||||
#### Inherited from
|
||||
|
||||
[`QueryBase`](QueryBase.md).[`toArray`](QueryBase.md#toarray)
|
||||
`StandardQueryBase.toArray`
|
||||
|
||||
***
|
||||
|
||||
@@ -587,7 +689,7 @@ ArrowTable.
|
||||
|
||||
#### Inherited from
|
||||
|
||||
[`QueryBase`](QueryBase.md).[`toArrow`](QueryBase.md#toarrow)
|
||||
`StandardQueryBase.toArrow`
|
||||
|
||||
***
|
||||
|
||||
@@ -622,7 +724,7 @@ on the filter column(s).
|
||||
|
||||
#### Inherited from
|
||||
|
||||
[`QueryBase`](QueryBase.md).[`where`](QueryBase.md#where)
|
||||
`StandardQueryBase.where`
|
||||
|
||||
***
|
||||
|
||||
@@ -644,4 +746,4 @@ order to perform hybrid search.
|
||||
|
||||
#### Inherited from
|
||||
|
||||
[`QueryBase`](QueryBase.md).[`withRowId`](QueryBase.md#withrowid)
|
||||
`StandardQueryBase.withRowId`
|
||||
|
||||
54
docs/src/js/enumerations/FullTextQueryType.md
Normal file
@@ -0,0 +1,54 @@
|
||||
[**@lancedb/lancedb**](../README.md) • **Docs**
|
||||
|
||||
***
|
||||
|
||||
[@lancedb/lancedb](../globals.md) / FullTextQueryType
|
||||
|
||||
# Enumeration: FullTextQueryType
|
||||
|
||||
Enum representing the types of full-text queries supported.
|
||||
|
||||
- `Match`: Performs a full-text search for terms in the query string.
|
||||
- `MatchPhrase`: Searches for an exact phrase match in the text.
|
||||
- `Boost`: Boosts the relevance score of specific terms in the query.
|
||||
- `MultiMatch`: Searches across multiple fields for the query terms.
- `Boolean`: Combines multiple sub-queries using boolean logic.
|
||||
|
||||
## Enumeration Members
|
||||
|
||||
### Boolean
|
||||
|
||||
```ts
|
||||
Boolean: "boolean";
|
||||
```
|
||||
|
||||
***
|
||||
|
||||
### Boost
|
||||
|
||||
```ts
|
||||
Boost: "boost";
|
||||
```
|
||||
|
||||
***
|
||||
|
||||
### Match
|
||||
|
||||
```ts
|
||||
Match: "match";
|
||||
```
|
||||
|
||||
***
|
||||
|
||||
### MatchPhrase
|
||||
|
||||
```ts
|
||||
MatchPhrase: "match_phrase";
|
||||
```
|
||||
|
||||
***
|
||||
|
||||
### MultiMatch
|
||||
|
||||
```ts
|
||||
MultiMatch: "multi_match";
|
||||
```
|
||||
37
docs/src/js/enumerations/Occur.md
Normal file
@@ -0,0 +1,37 @@
|
||||
[**@lancedb/lancedb**](../README.md) • **Docs**
|
||||
|
||||
***
|
||||
|
||||
[@lancedb/lancedb](../globals.md) / Occur
|
||||
|
||||
# Enumeration: Occur
|
||||
|
||||
Enum representing the occurrence of terms in full-text queries.
|
||||
|
||||
- `Must`: The term must be present in the document.
|
||||
- `Should`: The term should contribute to the document score, but is not required.
|
||||
- `MustNot`: The term must not be present in the document.
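
The three occurrence kinds can be sketched as a filter plus a count. The code below follows the descriptions above literally and is not LanceDB's query engine: `Clause`, `isCandidate`, and `shouldMatches` are hypothetical names, and documents are reduced to sets of terms. Only the string values `"MUST"`, `"SHOULD"`, and `"MUST_NOT"` come from the enumeration itself.

```ts
type Occur = "MUST" | "SHOULD" | "MUST_NOT";

// Hypothetical clause shape for this illustration.
interface Clause {
  occur: Occur;
  term: string;
}

// A document is a candidate if every MUST term is present and no MUST_NOT term
// is present. SHOULD clauses never exclude a document.
function isCandidate(docTerms: Set<string>, clauses: Clause[]): boolean {
  return clauses.every((clause) => {
    if (clause.occur === "MUST") return docTerms.has(clause.term);
    if (clause.occur === "MUST_NOT") return !docTerms.has(clause.term);
    return true;
  });
}

// Number of SHOULD terms found; a real engine would feed this into scoring.
function shouldMatches(docTerms: Set<string>, clauses: Clause[]): number {
  return clauses.filter(
    (clause) => clause.occur === "SHOULD" && docTerms.has(clause.term),
  ).length;
}

const clauses: Clause[] = [
  { occur: "MUST", term: "vector" },
  { occur: "SHOULD", term: "fast" },
  { occur: "MUST_NOT", term: "deprecated" },
];

const docA = new Set(["vector", "fast"]);
const docB = new Set(["vector"]);
const docC = new Set(["vector", "deprecated"]);

console.log(isCandidate(docA, clauses), shouldMatches(docA, clauses));
console.log(isCandidate(docB, clauses), shouldMatches(docB, clauses));
console.log(isCandidate(docC, clauses));
```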
|
||||
|
||||
## Enumeration Members
|
||||
|
||||
### Must
|
||||
|
||||
```ts
|
||||
Must: "MUST";
|
||||
```
|
||||
|
||||
***
|
||||
|
||||
### MustNot
|
||||
|
||||
```ts
|
||||
MustNot: "MUST_NOT";
|
||||
```
|
||||
|
||||
***
|
||||
|
||||
### Should
|
||||
|
||||
```ts
|
||||
Should: "SHOULD";
|
||||
```
|
||||
28
docs/src/js/enumerations/Operator.md
Normal file
@@ -0,0 +1,28 @@
|
||||
[**@lancedb/lancedb**](../README.md) • **Docs**
|
||||
|
||||
***
|
||||
|
||||
[@lancedb/lancedb](../globals.md) / Operator
|
||||
|
||||
# Enumeration: Operator
|
||||
|
||||
Enum representing the logical operators used in full-text queries.
|
||||
|
||||
- `And`: All terms must match.
|
||||
- `Or`: At least one term must match.
|
||||
|
||||
## Enumeration Members
|
||||
|
||||
### And
|
||||
|
||||
```ts
|
||||
And: "AND";
|
||||
```
|
||||
|
||||
***
|
||||
|
||||
### Or
|
||||
|
||||
```ts
|
||||
Or: "OR";
|
||||
```
|
||||
@@ -6,10 +6,13 @@
|
||||
|
||||
# Function: connect()
|
||||
|
||||
## connect(uri, options)
|
||||
## connect(uri, options, session)
|
||||
|
||||
```ts
|
||||
function connect(uri, options?): Promise<Connection>
|
||||
function connect(
|
||||
uri,
|
||||
options?,
|
||||
session?): Promise<Connection>
|
||||
```
|
||||
|
||||
Connect to a LanceDB instance at the given URI.
|
||||
@@ -29,6 +32,8 @@ Accepted formats:
|
||||
* **options?**: `Partial`<[`ConnectionOptions`](../interfaces/ConnectionOptions.md)>
|
||||
The options to use when connecting to the database
|
||||
|
||||
* **session?**: [`Session`](../classes/Session.md)
|
||||
|
||||
### Returns
|
||||
|
||||
`Promise`<[`Connection`](../classes/Connection.md)>
|
||||
@@ -77,7 +82,7 @@ Accepted formats:
|
||||
|
||||
[ConnectionOptions](../interfaces/ConnectionOptions.md) for more details on the URI format.
|
||||
|
||||
### Example
|
||||
### Examples
|
||||
|
||||
```ts
|
||||
const conn = await connect({
|
||||
@@ -85,3 +90,11 @@ const conn = await connect({
|
||||
storageOptions: {timeout: "60s"}
|
||||
});
|
||||
```
|
||||
|
||||
```ts
|
||||
const session = Session.default();
|
||||
const conn = await connect({
|
||||
uri: "/path/to/database",
|
||||
session: session
|
||||
});
|
||||
```
|
||||
|
||||
19
docs/src/js/functions/packBits.md
Normal file
@@ -0,0 +1,19 @@
|
||||
[**@lancedb/lancedb**](../README.md) • **Docs**
|
||||
|
||||
***
|
||||
|
||||
[@lancedb/lancedb](../globals.md) / packBits
|
||||
|
||||
# Function: packBits()
|
||||
|
||||
```ts
|
||||
function packBits(data): number[]
|
||||
```
|
||||
|
||||
## Parameters
|
||||
|
||||
* **data**: `number`[]
|
||||
|
||||
## Returns
|
||||
|
||||
`number`[]
|
||||
@@ -9,37 +9,62 @@
|
||||
- [embedding](namespaces/embedding/README.md)
|
||||
- [rerankers](namespaces/rerankers/README.md)
|
||||
|
||||
## Enumerations
|
||||
|
||||
- [FullTextQueryType](enumerations/FullTextQueryType.md)
|
||||
- [Occur](enumerations/Occur.md)
|
||||
- [Operator](enumerations/Operator.md)
|
||||
|
||||
## Classes
|
||||
|
||||
- [BooleanQuery](classes/BooleanQuery.md)
|
||||
- [BoostQuery](classes/BoostQuery.md)
|
||||
- [Connection](classes/Connection.md)
|
||||
- [Index](classes/Index.md)
|
||||
- [MakeArrowTableOptions](classes/MakeArrowTableOptions.md)
|
||||
- [MatchQuery](classes/MatchQuery.md)
|
||||
- [MergeInsertBuilder](classes/MergeInsertBuilder.md)
|
||||
- [MultiMatchQuery](classes/MultiMatchQuery.md)
|
||||
- [PhraseQuery](classes/PhraseQuery.md)
|
||||
- [Query](classes/Query.md)
|
||||
- [QueryBase](classes/QueryBase.md)
|
||||
- [RecordBatchIterator](classes/RecordBatchIterator.md)
|
||||
- [Session](classes/Session.md)
|
||||
- [Table](classes/Table.md)
|
||||
- [TagContents](classes/TagContents.md)
|
||||
- [Tags](classes/Tags.md)
|
||||
- [TakeQuery](classes/TakeQuery.md)
|
||||
- [VectorColumnOptions](classes/VectorColumnOptions.md)
|
||||
- [VectorQuery](classes/VectorQuery.md)
|
||||
|
||||
## Interfaces
|
||||
|
||||
- [AddColumnsResult](interfaces/AddColumnsResult.md)
|
||||
- [AddColumnsSql](interfaces/AddColumnsSql.md)
|
||||
- [AddDataOptions](interfaces/AddDataOptions.md)
|
||||
- [AddResult](interfaces/AddResult.md)
|
||||
- [AlterColumnsResult](interfaces/AlterColumnsResult.md)
|
||||
- [ClientConfig](interfaces/ClientConfig.md)
|
||||
- [ColumnAlteration](interfaces/ColumnAlteration.md)
|
||||
- [CompactionStats](interfaces/CompactionStats.md)
|
||||
- [ConnectionOptions](interfaces/ConnectionOptions.md)
|
||||
- [CreateTableOptions](interfaces/CreateTableOptions.md)
|
||||
- [DeleteResult](interfaces/DeleteResult.md)
|
||||
- [DropColumnsResult](interfaces/DropColumnsResult.md)
|
||||
- [ExecutableQuery](interfaces/ExecutableQuery.md)
|
||||
- [FragmentStatistics](interfaces/FragmentStatistics.md)
|
||||
- [FragmentSummaryStats](interfaces/FragmentSummaryStats.md)
|
||||
- [FtsOptions](interfaces/FtsOptions.md)
|
||||
- [FullTextQuery](interfaces/FullTextQuery.md)
|
||||
- [FullTextSearchOptions](interfaces/FullTextSearchOptions.md)
|
||||
- [HnswPqOptions](interfaces/HnswPqOptions.md)
|
||||
- [HnswSqOptions](interfaces/HnswSqOptions.md)
|
||||
- [IndexConfig](interfaces/IndexConfig.md)
|
||||
- [IndexOptions](interfaces/IndexOptions.md)
|
||||
- [IndexStatistics](interfaces/IndexStatistics.md)
|
||||
- [IvfFlatOptions](interfaces/IvfFlatOptions.md)
|
||||
- [IvfPqOptions](interfaces/IvfPqOptions.md)
|
||||
- [MergeResult](interfaces/MergeResult.md)
|
||||
- [OpenTableOptions](interfaces/OpenTableOptions.md)
|
||||
- [OptimizeOptions](interfaces/OptimizeOptions.md)
|
||||
- [OptimizeStats](interfaces/OptimizeStats.md)
|
||||
@@ -47,9 +72,12 @@
|
||||
- [RemovalStats](interfaces/RemovalStats.md)
|
||||
- [RetryConfig](interfaces/RetryConfig.md)
|
||||
- [TableNamesOptions](interfaces/TableNamesOptions.md)
|
||||
- [TableStatistics](interfaces/TableStatistics.md)
|
||||
- [TimeoutConfig](interfaces/TimeoutConfig.md)
|
||||
- [UpdateOptions](interfaces/UpdateOptions.md)
|
||||
- [UpdateResult](interfaces/UpdateResult.md)
|
||||
- [Version](interfaces/Version.md)
|
||||
- [WriteExecutionOptions](interfaces/WriteExecutionOptions.md)
|
||||
|
||||
## Type Aliases
|
||||
|
||||
@@ -58,6 +86,7 @@
|
||||
- [FieldLike](type-aliases/FieldLike.md)
|
||||
- [IntoSql](type-aliases/IntoSql.md)
|
||||
- [IntoVector](type-aliases/IntoVector.md)
|
||||
- [MultiVector](type-aliases/MultiVector.md)
|
||||
- [RecordBatchLike](type-aliases/RecordBatchLike.md)
|
||||
- [SchemaLike](type-aliases/SchemaLike.md)
|
||||
- [TableLike](type-aliases/TableLike.md)
|
||||
@@ -66,3 +95,4 @@
|
||||
|
||||
- [connect](functions/connect.md)
|
||||
- [makeArrowTable](functions/makeArrowTable.md)
|
||||
- [packBits](functions/packBits.md)
|
||||
|
||||
15
docs/src/js/interfaces/AddColumnsResult.md
Normal file
@@ -0,0 +1,15 @@
|
||||
[**@lancedb/lancedb**](../README.md) • **Docs**
|
||||
|
||||
***
|
||||
|
||||
[@lancedb/lancedb](../globals.md) / AddColumnsResult
|
||||
|
||||
# Interface: AddColumnsResult
|
||||
|
||||
## Properties
|
||||
|
||||
### version
|
||||
|
||||
```ts
|
||||
version: number;
|
||||
```
|
||||
15
docs/src/js/interfaces/AddResult.md
Normal file
@@ -0,0 +1,15 @@
|
||||
[**@lancedb/lancedb**](../README.md) • **Docs**
|
||||
|
||||
***
|
||||
|
||||
[@lancedb/lancedb](../globals.md) / AddResult
|
||||
|
||||
# Interface: AddResult
|
||||
|
||||
## Properties
|
||||
|
||||
### version
|
||||
|
||||
```ts
|
||||
version: number;
|
||||
```
|
||||
15
docs/src/js/interfaces/AlterColumnsResult.md
Normal file
@@ -0,0 +1,15 @@
|
||||
[**@lancedb/lancedb**](../README.md) • **Docs**
|
||||
|
||||
***
|
||||
|
||||
[@lancedb/lancedb](../globals.md) / AlterColumnsResult
|
||||
|
||||
# Interface: AlterColumnsResult
|
||||
|
||||
## Properties
|
||||
|
||||
### version
|
||||
|
||||
```ts
|
||||
version: number;
|
||||
```
|
||||
@@ -16,7 +16,7 @@ must be provided.
|
||||
### dataType?
|
||||
|
||||
```ts
|
||||
optional dataType: string;
|
||||
optional dataType: string | DataType<Type, any>;
|
||||
```
|
||||
|
||||
A new data type for the column. If not provided then the data type will not be changed.
|
||||
|
||||
@@ -70,6 +70,17 @@ Defaults to 'us-east-1'.
|
||||
|
||||
***
|
||||
|
||||
### session?
|
||||
|
||||
```ts
|
||||
optional session: Session;
|
||||
```
|
||||
|
||||
(For LanceDB OSS only): the session to use for this connection. Holds
|
||||
shared caches and other session-specific state.
|
||||
|
||||
***
|
||||
|
||||
### storageOptions?
|
||||
|
||||
```ts
|
||||
|
||||
15
docs/src/js/interfaces/DeleteResult.md
Normal file
@@ -0,0 +1,15 @@
|
||||
[**@lancedb/lancedb**](../README.md) • **Docs**
|
||||
|
||||
***
|
||||
|
||||
[@lancedb/lancedb](../globals.md) / DeleteResult
|
||||
|
||||
# Interface: DeleteResult
|
||||
|
||||
## Properties
|
||||
|
||||
### version
|
||||
|
||||
```ts
|
||||
version: number;
|
||||
```
|
||||
15
docs/src/js/interfaces/DropColumnsResult.md
Normal file
@@ -0,0 +1,15 @@
|
||||
[**@lancedb/lancedb**](../README.md) • **Docs**
|
||||
|
||||
***
|
||||
|
||||
[@lancedb/lancedb](../globals.md) / DropColumnsResult
|
||||
|
||||
# Interface: DropColumnsResult
|
||||
|
||||
## Properties
|
||||
|
||||
### version
|
||||
|
||||
```ts
|
||||
version: number;
|
||||
```
|
||||
37
docs/src/js/interfaces/FragmentStatistics.md
Normal file
@@ -0,0 +1,37 @@
|
||||
[**@lancedb/lancedb**](../README.md) • **Docs**
|
||||
|
||||
***
|
||||
|
||||
[@lancedb/lancedb](../globals.md) / FragmentStatistics
|
||||
|
||||
# Interface: FragmentStatistics
|
||||
|
||||
## Properties
|
||||
|
||||
### lengths
|
||||
|
||||
```ts
|
||||
lengths: FragmentSummaryStats;
|
||||
```
|
||||
|
||||
Statistics on the number of rows in the table fragments
|
||||
|
||||
***
|
||||
|
||||
### numFragments
|
||||
|
||||
```ts
|
||||
numFragments: number;
|
||||
```
|
||||
|
||||
The number of fragments in the table
|
||||
|
||||
***
|
||||
|
||||
### numSmallFragments
|
||||
|
||||
```ts
|
||||
numSmallFragments: number;
|
||||
```
|
||||
|
||||
The number of uncompacted fragments in the table
|
||||
77
docs/src/js/interfaces/FragmentSummaryStats.md
Normal file
@@ -0,0 +1,77 @@
|
||||
[**@lancedb/lancedb**](../README.md) • **Docs**
|
||||
|
||||
***
|
||||
|
||||
[@lancedb/lancedb](../globals.md) / FragmentSummaryStats
|
||||
|
||||
# Interface: FragmentSummaryStats
|
||||
|
||||
## Properties
|
||||
|
||||
### max
|
||||
|
||||
```ts
|
||||
max: number;
|
||||
```
|
||||
|
||||
The number of rows in the fragment with the most rows
|
||||
|
||||
***
|
||||
|
||||
### mean
|
||||
|
||||
```ts
|
||||
mean: number;
|
||||
```
|
||||
|
||||
The mean number of rows in the fragments
|
||||
|
||||
***
|
||||
|
||||
### min
|
||||
|
||||
```ts
|
||||
min: number;
|
||||
```
|
||||
|
||||
The number of rows in the fragment with the fewest rows
|
||||
|
||||
***
|
||||
|
||||
### p25
|
||||
|
||||
```ts
|
||||
p25: number;
|
||||
```
|
||||
|
||||
The 25th percentile of number of rows in the fragments
|
||||
|
||||
***
|
||||
|
||||
### p50
|
||||
|
||||
```ts
|
||||
p50: number;
|
||||
```
|
||||
|
||||
The 50th percentile of number of rows in the fragments
|
||||
|
||||
***
|
||||
|
||||
### p75
|
||||
|
||||
```ts
|
||||
p75: number;
|
||||
```
|
||||
|
||||
The 75th percentile of number of rows in the fragments
|
||||
|
||||
***
|
||||
|
||||
### p99
|
||||
|
||||
```ts
|
||||
p99: number;
|
||||
```
|
||||
|
||||
The 99th percentile of number of rows in the fragments
|
||||
@@ -23,7 +23,7 @@ whether to remove punctuation
|
||||
### baseTokenizer?
|
||||
|
||||
```ts
|
||||
optional baseTokenizer: "raw" | "simple" | "whitespace";
|
||||
optional baseTokenizer: "raw" | "simple" | "whitespace" | "ngram";
|
||||
```
|
||||
|
||||
The tokenizer to use when building the index.
|
||||
@@ -71,6 +71,36 @@ tokens longer than this length will be ignored
|
||||
|
||||
***
|
||||
|
||||
### ngramMaxLength?
|
||||
|
||||
```ts
|
||||
optional ngramMaxLength: number;
|
||||
```
|
||||
|
||||
ngram max length
|
||||
|
||||
***
|
||||
|
||||
### ngramMinLength?
|
||||
|
||||
```ts
|
||||
optional ngramMinLength: number;
|
||||
```
|
||||
|
||||
ngram min length
|
||||
|
||||
***
|
||||
|
||||
### prefixOnly?
|
||||
|
||||
```ts
|
||||
optional prefixOnly: boolean;
|
||||
```
|
||||
|
||||
whether to only index the prefix of the token for ngram tokenizer
|
||||
|
||||
***
|
||||
|
||||
### removeStopWords?
|
||||
|
||||
```ts
|
||||
|
||||
25
docs/src/js/interfaces/FullTextQuery.md
Normal file
@@ -0,0 +1,25 @@
|
||||
[**@lancedb/lancedb**](../README.md) • **Docs**
|
||||
|
||||
***
|
||||
|
||||
[@lancedb/lancedb](../globals.md) / FullTextQuery
|
||||
|
||||
# Interface: FullTextQuery
|
||||
|
||||
Represents a full-text query interface.
|
||||
This interface defines the structure and behavior for full-text queries,
|
||||
including methods to retrieve the query type and convert the query to a dictionary format.
|
||||
|
||||
## Methods
|
||||
|
||||
### queryType()
|
||||
|
||||
```ts
|
||||
queryType(): FullTextQueryType
|
||||
```
|
||||
|
||||
The type of the full-text query.
|
||||
|
||||
#### Returns
|
||||
|
||||
[`FullTextQueryType`](../enumerations/FullTextQueryType.md)
|
||||
@@ -24,18 +24,18 @@ The following distance types are available:
|
||||
|
||||
"l2" - Euclidean distance. This is a very common distance metric that
|
||||
accounts for both magnitude and direction when determining the distance
|
||||
between vectors. L2 distance has a range of [0, ∞).
|
||||
between vectors. l2 distance has a range of [0, ∞).
|
||||
|
||||
"cosine" - Cosine distance. Cosine distance is a distance metric
|
||||
calculated from the cosine similarity between two vectors. Cosine
|
||||
similarity is a measure of similarity between two non-zero vectors of an
|
||||
inner product space. It is defined to equal the cosine of the angle
|
||||
between them. Unlike L2, the cosine distance is not affected by the
|
||||
between them. Unlike l2, the cosine distance is not affected by the
|
||||
magnitude of the vectors. Cosine distance has a range of [0, 2].
|
||||
|
||||
"dot" - Dot product. Dot distance is the dot product of two vectors. Dot
|
||||
distance has a range of (-∞, ∞). If the vectors are normalized (i.e. their
|
||||
L2 norm is 1), then dot distance is equivalent to the cosine distance.
|
||||
l2 norm is 1), then dot distance is equivalent to the cosine distance.
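
To make the three definitions concrete, here is a small sketch that computes them directly from the descriptions above. The helper names (`dot`, `norm`, `l2`, `cosineDistance`) are hypothetical and written for this example; they are not LanceDB functions, and the library's internal formulas may differ in detail. The last line shows the relationship for unit vectors: cosine distance equals one minus the dot product, so both order results the same way.

```ts
type Vec = number[];

// Dot product of two equal-length vectors.
function dot(a: Vec, b: Vec): number {
  return a.reduce((sum, x, i) => sum + x * b[i], 0);
}

// Euclidean (l2) norm of a vector.
function norm(a: Vec): number {
  return Math.sqrt(dot(a, a));
}

// Euclidean distance: sensitive to both magnitude and direction.
function l2(a: Vec, b: Vec): number {
  return norm(a.map((x, i) => x - b[i]));
}

// Cosine distance: 1 - cosine similarity, so it ignores magnitude.
function cosineDistance(a: Vec, b: Vec): number {
  return 1 - dot(a, b) / (norm(a) * norm(b));
}

const u: Vec = [0.6, 0.8]; // unit length
const v: Vec = [1, 0]; // unit length

console.log(l2([0, 0], [3, 4])); // distance between the origin and (3, 4)
console.log(cosineDistance([1, 0], [5, 0])); // same direction, different magnitude
console.log(cosineDistance([1, 0], [-1, 0])); // opposite directions
console.log(cosineDistance(u, v), 1 - dot(u, v)); // agree for unit vectors
```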
|
||||
|
||||
***
|
||||
|
||||
|
||||
@@ -24,18 +24,18 @@ The following distance types are available:

"l2" - Euclidean distance. This is a very common distance metric that
accounts for both magnitude and direction when determining the distance
-between vectors. L2 distance has a range of [0, ∞).
+between vectors. l2 distance has a range of [0, ∞).

"cosine" - Cosine distance. Cosine distance is a distance metric
calculated from the cosine similarity between two vectors. Cosine
similarity is a measure of similarity between two non-zero vectors of an
inner product space. It is defined to equal the cosine of the angle
-between them. Unlike L2, the cosine distance is not affected by the
+between them. Unlike l2, the cosine distance is not affected by the
magnitude of the vectors. Cosine distance has a range of [0, 2].

"dot" - Dot product. Dot distance is the dot product of two vectors. Dot
distance has a range of (-∞, ∞). If the vectors are normalized (i.e. their
-L2 norm is 1), then dot distance is equivalent to the cosine distance.
+l2 norm is 1), then dot distance is equivalent to the cosine distance.

***

@@ -26,6 +26,18 @@ will be used to determine the most useful kind of index to create.

***

### name?

```ts
optional name: string;
```

Optional custom name for the index.

If not provided, a default name will be generated based on the column name.

***

### replace?

```ts
@@ -39,3 +51,30 @@ and the same name, then an error will be returned. This is true even if
that index is out of date.

The default is true

***

### train?

```ts
optional train: boolean;
```

Whether to train the index with existing data.

If true (default), the index will be trained with existing data in the table.
If false, the index will be created empty and populated as new data is added.

Note: This option is only supported for scalar indices. Vector indices always train.

***

### waitTimeoutSeconds?

```ts
optional waitTimeoutSeconds: number;
```

Timeout in seconds to wait for index creation to complete.

If not specified, the method will return immediately after starting the index creation.
@@ -30,6 +30,17 @@ The type of the index

***

### loss?

```ts
optional loss: number;
```

The KMeans loss value of the index,
it is only present for vector indices.

***

### numIndexedRows

```ts
112
docs/src/js/interfaces/IvfFlatOptions.md
Normal file
@@ -0,0 +1,112 @@
[**@lancedb/lancedb**](../README.md) • **Docs**

***

[@lancedb/lancedb](../globals.md) / IvfFlatOptions

# Interface: IvfFlatOptions

Options to create an `IVF_FLAT` index

## Properties

### distanceType?

```ts
optional distanceType: "l2" | "cosine" | "dot" | "hamming";
```

Distance type to use to build the index.

Default value is "l2".

This is used when training the index to calculate the IVF partitions
(vectors are grouped in partitions with similar vectors according to this
distance type).

The distance type used to train an index MUST match the distance type used
to search the index. Failure to do so will yield inaccurate results.

The following distance types are available:

"l2" - Euclidean distance. This is a very common distance metric that
accounts for both magnitude and direction when determining the distance
between vectors. l2 distance has a range of [0, ∞).

"cosine" - Cosine distance. Cosine distance is a distance metric
calculated from the cosine similarity between two vectors. Cosine
similarity is a measure of similarity between two non-zero vectors of an
inner product space. It is defined to equal the cosine of the angle
between them. Unlike l2, the cosine distance is not affected by the
magnitude of the vectors. Cosine distance has a range of [0, 2].

Note: the cosine distance is undefined when one (or both) of the vectors
are all zeros (there is no direction). These vectors are invalid and may
never be returned from a vector search.

"dot" - Dot product. Dot distance is the dot product of two vectors. Dot
distance has a range of (-∞, ∞). If the vectors are normalized (i.e. their
l2 norm is 1), then dot distance is equivalent to the cosine distance.

"hamming" - Hamming distance. Hamming distance is a distance metric
calculated from the number of bits that are different between two vectors.
Hamming distance has a range of [0, dimension]. Note that the hamming distance
is only valid for binary vectors.
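
As a rough illustration of the four distance types described above, the following sketch computes each one from its textbook definition. The helper names (`dotProduct`, `l2Distance`, `cosineDistance`, `hammingDistance`) are hypothetical and are not part of the `@lancedb/lancedb` API, and representing binary vectors as packed bytes in a `Uint8Array` is an assumption made here for illustration. The library's own computation may differ in detail.

```ts
// Illustrative helpers only; these are NOT part of the @lancedb/lancedb API.

// Dot product of two equal-length vectors.
function dotProduct(a: number[], b: number[]): number {
  return a.reduce((sum, x, i) => sum + x * b[i], 0);
}

// Euclidean (l2) distance: sensitive to both magnitude and direction.
function l2Distance(a: number[], b: number[]): number {
  return Math.sqrt(a.reduce((sum, x, i) => sum + (x - b[i]) ** 2, 0));
}

// Cosine distance = 1 - cosine similarity, so it lies in [0, 2].
// Returns NaN when either vector is all zeros (no direction).
function cosineDistance(a: number[], b: number[]): number {
  const normA = Math.sqrt(dotProduct(a, a));
  const normB = Math.sqrt(dotProduct(b, b));
  if (normA === 0 || normB === 0) return NaN;
  return 1 - dotProduct(a, b) / (normA * normB);
}

// Hamming distance: number of differing bits.
// Assumption: binary vectors are stored as packed bytes.
function hammingDistance(a: Uint8Array, b: Uint8Array): number {
  let bits = 0;
  for (let i = 0; i < a.length; i++) {
    let x = a[i] ^ b[i]; // bits set where the two bytes differ
    while (x !== 0) {
      bits += x & 1;
      x >>= 1;
    }
  }
  return bits;
}
```

For vectors whose l2 norm is 1, `1 - dotProduct(a, b)` gives the same value as `cosineDistance(a, b)`, which is one way to read the statement that the two metrics are equivalent for normalized vectors.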

***

### maxIterations?

```ts
optional maxIterations: number;
```

Max iteration to train IVF kmeans.

When training an IVF FLAT index we use kmeans to calculate the partitions. This parameter
controls how many iterations of kmeans to run.

Increasing this might improve the quality of the index but in most cases these extra
iterations have diminishing returns.

The default value is 50.

***

### numPartitions?

```ts
optional numPartitions: number;
```

The number of IVF partitions to create.

This value should generally scale with the number of rows in the dataset.
By default the number of partitions is the square root of the number of
rows.

If this value is too large then the first part of the search (picking the
right partition) will be slow. If this value is too small then the second
part of the search (searching within a partition) will be slow.
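
The trade-off described above can be sketched with a simple cost model. Both helpers below are hypothetical and not part of the `@lancedb/lancedb` API. The rounding rule in `defaultNumPartitions` is an assumption (the text only says "square root of the number of rows"), and `comparisonsPerQuery` assumes evenly sized partitions and a search that probes a single partition.

```ts
// Hypothetical helpers for illustration; NOT part of the @lancedb/lancedb API.

// Assumption: the square root is rounded to the nearest integer, minimum 1.
function defaultNumPartitions(numRows: number): number {
  return Math.max(1, Math.round(Math.sqrt(numRows)));
}

// Simplified cost of one query: compare against every partition centroid,
// then scan every row of one average-sized partition.
function comparisonsPerQuery(numRows: number, numPartitions: number): number {
  return numPartitions + numRows / numPartitions;
}

const rows = 1_000_000;
console.log(defaultNumPartitions(rows)); // 1000
console.log(comparisonsPerQuery(rows, 1000)); // 2000
console.log(comparisonsPerQuery(rows, 10)); // 100010 (partitions too large)
console.log(comparisonsPerQuery(rows, 100_000)); // 100010 (too many partitions)
```

Under these assumptions the square-root choice balances the two phases of the search, which is consistent with it being the default.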

***

### sampleRate?

```ts
optional sampleRate: number;
```

The number of vectors, per partition, to sample when training IVF kmeans.

When an IVF FLAT index is trained, we need to calculate partitions. These are groups
of vectors that are similar to each other. To do this we use an algorithm called kmeans.

Running kmeans on a large dataset can be slow. To speed this up we run kmeans on a
random sample of the data. This parameter controls the size of the sample. The total
number of vectors used to train the index is `sample_rate * num_partitions`.

Increasing this value might improve the quality of the index but in most cases the
default should be sufficient.

The default value is 256.
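
To make the sample-size formula above concrete, here is a minimal sketch. `trainingSampleSize` is a hypothetical helper, not part of the `@lancedb/lancedb` API, and capping the sample at the number of rows in the table is an assumption added here for illustration.

```ts
// Hypothetical helper for illustration; NOT part of the @lancedb/lancedb API.
function trainingSampleSize(
  sampleRate: number,
  numPartitions: number,
  numRows: number,
): number {
  // Assumption: the sample cannot be larger than the table itself.
  return Math.min(numRows, sampleRate * numPartitions);
}

// Default sampleRate (256) with 1000 partitions over one million rows:
console.log(trainingSampleSize(256, 1000, 1_000_000)); // 256000

// A small table: the requested sample exceeds the available rows.
console.log(trainingSampleSize(256, 100, 10_000)); // 10000
```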
@@ -31,13 +31,13 @@ The following distance types are available:

"l2" - Euclidean distance. This is a very common distance metric that
accounts for both magnitude and direction when determining the distance
-between vectors. L2 distance has a range of [0, ∞).
+between vectors. l2 distance has a range of [0, ∞).

"cosine" - Cosine distance. Cosine distance is a distance metric
calculated from the cosine similarity between two vectors. Cosine
similarity is a measure of similarity between two non-zero vectors of an
inner product space. It is defined to equal the cosine of the angle
-between them. Unlike L2, the cosine distance is not affected by the
+between them. Unlike l2, the cosine distance is not affected by the
magnitude of the vectors. Cosine distance has a range of [0, 2].

Note: the cosine distance is undefined when one (or both) of the vectors
@@ -46,7 +46,7 @@ never be returned from a vector search.

"dot" - Dot product. Dot distance is the dot product of two vectors. Dot
distance has a range of (-∞, ∞). If the vectors are normalized (i.e. their
-L2 norm is 1), then dot distance is equivalent to the cosine distance.
+l2 norm is 1), then dot distance is equivalent to the cosine distance.

***

Some files were not shown because too many files have changed in this diff.