From 3eac75e61ae5e7ce40d652a0189c5365508d758b Mon Sep 17 00:00:00 2001
From: gsilvestrin <gc@eto.ai>
Date: Wed, 19 Apr 2023 20:23:18 -0700
Subject: [PATCH] review comments

---
 docs/src/ann_indexes.md | 19 ++++++++++---------
 1 file changed, 10 insertions(+), 9 deletions(-)

diff --git a/docs/src/ann_indexes.md b/docs/src/ann_indexes.md
index e9fd5682..6d81d33e 100644
--- a/docs/src/ann_indexes.md
+++ b/docs/src/ann_indexes.md
@@ -1,8 +1,8 @@
 # ANN (Approximate Nearest Neighbor) Indexes
 
-You can create an index over your vector data to make search faster. Vector indexes are faster but less 
- accurate than exhaustive search. LanceDB provides many parameters to fine-tune the index's size, the speed of 
-queries, and the accuracy of results.
+You can create an index over your vector data to make search faster. Vector indexes are faster but less accurate than exhaustive search. LanceDB provides many parameters to fine-tune the index's size, the speed of queries, and the accuracy of results.
+
+LanceDB does not currently create the ANN index automatically. In the future, we aim to improve this experience by automating index creation and configuration.
 
 ## Creating an ANN Index
 
@@ -28,9 +28,10 @@ tbl.create_index(num_partitions=256, num_sub_vectors=96)
 Since `create_index` has a training step, it can take a few minutes to finish for large tables. You can control the index
 creation by providing the following parameters:
 
-- **num_partitions**: The number of partitions of the index. A higher number leads to faster queries, but it makes index 
-generation slower.
-- **num_sub_vectors**: The number of subvectors (M) that will be created during Product Quantization (PQ). A larger number makes
+- **num_partitions** (default: 256): The number of partitions of the index. Choose it so that each partition holds roughly 3-5K vectors. For example, a table 
+with ~1M vectors should use 256 partitions (about 3.9K vectors each). You can specify an arbitrary number of partitions, but powers of 2 are the most conventional. 
+A higher number leads to faster queries, but it makes index generation slower. 
+- **num_sub_vectors** (default: 96): The number of subvectors (M) that will be created during Product Quantization (PQ). A larger number makes
 search more accurate, but also makes the index larger and slower to build. 
 
 ## Querying an ANN Index
@@ -39,9 +40,9 @@ Querying vector indexes is done via the [search](https://lancedb.github.io/lance
 
 There are a couple of parameters that can be used to fine-tune the search:
 
-- **limit**: The amount of results that will be returned
-- **nprobes**: The number of probes used. A higher number makes search more accurate but also slower.
-- **refine_factor**: Refine the results by reading extra elements and re-ranking them in memory. A higher number makes 
+- **limit** (default: 10): The number of results to return.
+- **nprobes** (default: 20): The number of partitions probed during search. A higher number makes search more accurate but also slower.
+- **refine_factor** (default: None): Refine the results by reading extra elements and re-ranking them in memory. A higher number makes 
 search more accurate but also slower.
 
 ```python