mirror of
https://github.com/lancedb/lancedb.git
synced 2026-01-07 04:12:59 +00:00
feat(python): Set heap size to get faster fts indexing performance (#762)
By default, tantivy-py uses a 128MB heap size. We change the default to 1GB and allow the user to customize it. Locally, this makes `test_fts.py` run 10x faster.
@@ -75,6 +75,18 @@ applied on top of the full text search results. This can be invoked via the fami
table.search("puppy").limit(10).where("meta='foo'").to_list()
```

## Configurations

By default, LanceDB configures a 1GB heap size limit for creating the index. You can
reduce this if running on a smaller node, or increase this for faster performance while
indexing a larger corpus.

```python
# configure a 512MB heap size
heap = 1024 * 1024 * 512
table.create_fts_index(["text1", "text2"], writer_heap_size=heap, replace=True)
```
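
Since `writer_heap_size` is given in bytes in the example above, a small conversion helper can make the arithmetic harder to get wrong. The following is a minimal sketch only: `heap_size_bytes` is a hypothetical helper, not part of LanceDB, and `table` is assumed to be an existing LanceDB table.

```python
def heap_size_bytes(mebibytes: int) -> int:
    """Hypothetical helper (not part of LanceDB): convert MiB to bytes."""
    if mebibytes <= 0:
        raise ValueError("heap size must be positive")
    return mebibytes * 1024 * 1024


# 512MB, the same value as `1024 * 1024 * 512`
heap = heap_size_bytes(512)

# the 1GB default, written the same way
default_heap = heap_size_bytes(1024)

# Assumes `table` already exists, as in the example above:
# table.create_fts_index(["text1", "text2"], writer_heap_size=heap, replace=True)
```

Rejecting non-positive sizes up front is a choice made for this sketch; how LanceDB or tantivy-py handles such values is not covered here.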

## Current limitations

1. Incremental writes are not yet supported.