mirror of
https://github.com/lancedb/lancedb.git
synced 2026-01-05 19:32:56 +00:00
Merge branch 'main' into changhiskhan/improve-index-docs
This commit is contained in:
@@ -34,6 +34,7 @@ tbl.create_index(num_partitions=256, num_sub_vectors=96)
|
||||
Since `create_index` has a training step, it can take a few minutes to finish for large tables. You can control the index
|
||||
creation by providing the following parameters:
|
||||
|
||||
- **metric** (default: "L2"): The distance metric to use. By default we use euclidean distance. We also support cosine distance.
|
||||
- **num_partitions** (default: 256): The number of partitions of the index. The number of partitions should be configured so each partition has 3-5K vectors. For example, a table
|
||||
with ~1M vectors should use 256 partitions. You can specify arbitrary number of partitions but powers of 2 is most conventional.
|
||||
A higher number leads to faster queries, but it makes index generation slower.
|
||||
@@ -56,6 +57,7 @@ There are a couple of parameters that can be used to fine-tune the search:
|
||||
e.g., for 1M vectors divided into 256 partitions, if you're looking for top 20, then refine_factor=200 reranks the whole partition.<br/>
|
||||
Note: refine_factor is only applicable if an ANN index is present. If specified on a table without an ANN index, it is ignored.
|
||||
|
||||
|
||||
```python
|
||||
tbl.search(np.random.random((768))) \
|
||||
.limit(2) \
|
||||
|
||||
Reference in New Issue
Block a user