trinity-1686a
a3f001360f
add support for warming up range of terms ( #2042 )
...
* add support for warming up range of terms
* simplify handling of limit
2023-05-22 14:29:35 +02:00
PSeitz
04562c0318
add fastfield tokenizer to IndexBuilder ( #2046 )
2023-05-18 04:33:42 +02:00
Yuri Astrakhan
74275b76a6
Inline format arguments where it makes sense ( #2038 )
...
Applied this command to the code, making it a bit shorter and slightly
more readable.
```
cargo +nightly clippy --all-features --benches --tests --workspace --fix -- -A clippy::all -W clippy::uninlined_format_args
cargo +nightly fmt --all
```
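For illustration, a minimal sketch of the kind of rewrite the `clippy::uninlined_format_args` lint performs. The function names and the message text are hypothetical and not taken from the tantivy code base; the sketch assumes Rust 1.58 or later, where format strings can capture variables.

```rust
/// Formats a message with positional arguments (the style the lint flags).
fn describe_uninlined(field: &str, count: usize) -> String {
    format!("field {} has {} terms", field, count)
}

/// Same output, with the variables captured directly in the format string.
fn describe_inlined(field: &str, count: usize) -> String {
    format!("field {field} has {count} terms")
}

fn main() {
    let before = describe_uninlined("title", 3);
    let after = describe_inlined("title", 3);
    // Both forms produce the same string; only the source gets shorter.
    assert_eq!(before, after);
    println!("{after}");
}
```

Only plain identifiers can be captured this way; expressions such as `self.len()` still need a positional or named argument.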
2023-05-10 18:03:59 +09:00
PSeitz
4ee1b5cda0
add separate tokenizer manager for fast fields ( #2019 )
...
* add separate tokenizer manager for fast fields
* rename
2023-05-08 11:22:31 +02:00
tottoto
73452284ae
Remove unused crates from dependencies ( #2018 )
...
* Remove unused crates from dependencies
* Revert rand to columnar
* Revert criterion to stacker
2023-05-02 12:34:20 +02:00
trinity-1686a
9c93bfeb51
optimise warmup code path ( #2007 )
...
* optimise warmup code path
* better function naming
2023-04-21 11:23:09 +02:00
PSeitz
74f9eafefc
refactor Term ( #2006 )
...
* refactor Term
add ValueBytes for serialized term values
add missing debug for ip
skip unnecessary json path validation
remove code duplication
add DATE_TIME_PRECISION_INDEXED constant
add missing Term clarification
remove weird value_bytes_mut() API
* fix naming
2023-04-20 15:31:43 +02:00
Paul Masurel
fbda511a1a
Making more things public for quickwit. ( #2005 )
2023-04-20 11:37:45 +09:00
Paul Masurel
f853bf204b
Align the numerical type priority order with columnar. ( #1978 )
...
Closes #1956
2023-04-07 10:07:54 +09:00
PSeitz
9e2faecf5b
add memory limit for aggregations ( #1942 )
...
* add memory limit for aggregations
introduce AggregationLimits to set memory consumption limit and bucket limits
memory limit is checked during aggregation, bucket limit is checked before returning the aggregation request.
* Apply suggestions from code review
Co-authored-by: Paul Masurel <paul@quickwit.io>
* add ByteCount with human readable format
---------
Co-authored-by: Paul Masurel <paul@quickwit.io>
2023-03-16 06:21:07 +01:00
Paul Masurel
7fae4d98d7
Adapting for quickwit2 ( #1912 )
...
* Adapting tantivy to make it possible to be plugged to quickwit.
* Apply suggestions from code review
Co-authored-by: PSeitz <PSeitz@users.noreply.github.com>
* Added unit test
---------
Co-authored-by: PSeitz <PSeitz@users.noreply.github.com>
2023-03-01 16:27:46 +09:00
Paul Masurel
66ff53b0f4
Various minor code cleanup ( #1909 )
2023-02-27 13:48:34 +09:00
Paul Masurel
789cc8703e
Adding unit test testing docfreq after merge ( #1895 )
2023-02-22 11:05:34 +09:00
Paul Masurel
e5098d9fe8
Moving tests around, re-enabling tests that were disabled. ( #1894 )
2023-02-22 10:31:52 +09:00
Alex Cole
f2f38c43ce
Make BM25 scoring more flexible ( #1855 )
...
* Introduce Bm25StatisticsProvider to inject statistics
* fix formatting I accidentally changed
2023-02-16 19:14:12 +09:00
PSeitz
36c6138e7f
fix: auto downgrade index record option, instead of vint error ( #1857 )
...
Prev: thread 'main' panicked at 'called `Result::unwrap()` on an `Err` value: IoError(Custom { kind: InvalidData, error: "Reach end of buffer while reading VInt" })', src/main.rs:46:14
Now: Automatic downgrade to next available level
2023-02-10 13:45:23 +01:00
Paul Masurel
405e2cf4d9
Merge with main
2023-02-09 14:28:57 +01:00
Paul Masurel
bd5eea9852
Integrated columnar work.
2023-02-09 13:14:31 +01:00
PSeitz
0f20787917
fix doc store cache docs ( #1821 )
...
* fix doc store cache docs
addresses an issue reported in #1820
* rename doc_store_cache_size
2023-01-23 07:06:49 +01:00
PSeitz
f687b3a5aa
start migrate Field to &str ( #1772 )
...
start migrate Field to &str in preparation of columnar
return Result for get_field
2023-01-18 16:12:07 +09:00
Shikhar Bhushan
2650111b76
EnableScoring::Disabled - optional Searcher ( #1780 )
2023-01-12 09:26:50 -05:00
Paul Masurel
bb48c3e488
Refactoring to prepare for the addition of dynamic fast field ( #1730 )
...
* Refactoring to prepare for the addition of dynamic fast field
- Exposing insert_key / insert_value
- Renamed SSTable::{Reader/Writer} -> SSTable::{ValueReader/ValueWriter}
- Added a generic Dictionary object in the sstable crate
- Removing the TermDictionary wrapper from tantivy, relying directly on
an alias of the generic Dictionary object.
- dropped the use of byteorder in sstable.
- Stopped scanning / reading the entire dictionary when streaming a range.
* Added a benchmark for streaming sstable ranges.
* CR comments.
Rename deserialize_u64 -> deserialize_vint_u64
* Removed needless allocation, split serialize into serialize and clear.
2022-12-22 12:25:46 +09:00
Paul Masurel
32cb1d22da
Removed AsyncIoResult. ( #1728 )
2022-12-21 16:01:17 +09:00
PSeitz
f9171a3981
fix clippy ( #1725 )
...
* fix clippy
* fix clippy fastfield codecs
* fix clippy bitpacker
* fix clippy common
* fix clippy stacker
* fix clippy sstable
* fmt
2022-12-20 07:30:06 +01:00
PSeitz
0281b22b77
update create_in_ram docs ( #1695 )
2022-11-24 17:30:09 +01:00
trinity-1686a
5765c261aa
allow warming up of the full posting list ( #1673 )
...
* allow warming up of the full posting list
* cargo fmt
2022-11-14 10:27:56 +09:00
Paul Masurel
3edf0a2724
Using the manual reload policy in IndexWriter. ( #1667 )
2022-11-09 11:20:41 +01:00
Pascal Seitz
38ad46e580
fix clippy
2022-11-07 16:09:55 +08:00
Bruce Mitchener
b3bf9a5716
Documentation improvements.
2022-10-05 14:18:10 +07:00
Pascal Seitz
0e94213af0
validate index settings on create
2022-09-29 18:58:09 +08:00
Bruce Mitchener
cb252a42af
docs: "associated to" -> "associated with" ( #1557 )
...
This reads better this way.
2022-09-26 20:23:37 +09:00
Bruce Mitchener
ea8e6d7b1d
Tidy up clippy config. ( #1547 )
...
* Checking cfg_attr is no longer necessary.
* Don't need multiple `clippy::` prefixes on a name.
2022-09-26 09:37:55 +09:00
Bruce Mitchener
d231671fe2
clippy: Remove borrows that the compiler will do.
...
This started showing up with clippy in rust 1.64.
2022-09-22 22:38:23 +07:00
Bruce Mitchener
cf02e32578
Improvements to doc linking, grammar, etc.
2022-09-19 18:10:22 +07:00
Bruce Mitchener
6a88ac3fe3
Documentation improvements.
...
Fix some linking, some grammar, some typos, etc.
2022-09-18 18:05:37 +07:00
Paul Masurel
817225edfb
Allow for a same-thread doc compressor. ( #1510 )
...
In addition, it isolates the doc compressor logic,
better reports io::Result.
In the case of the same-thread doc compressor,
the blocks are also not copied.
2022-09-13 15:32:48 +09:00
Paul Masurel
4d634d61ff
Expose memory usage in SingleSegmentIndexWriter ( #1508 )
2022-09-07 18:33:52 +09:00
Paul Masurel
08c4412d73
Adding dragon API to build index without any thread. ( #1496 )
...
Closes #1487
2022-09-01 10:32:36 +09:00
Paul Masurel
a451f6d60d
Minor refactoring. ( #1495 )
2022-08-31 12:00:58 +09:00
PSeitz
bb01e99e05
Fixes race condition in Searcher ( #1464 )
...
Fixes a race condition in Searcher, by avoiding repeated calls to open_segment_readers and passing them instead as argument
Closes #1461
2022-08-24 21:17:37 +09:00
Kian-Meng Ang
625bcb4877
Fix typos and markdowns
...
Found via these commands:
codespell -L crate,ser,panting,beauti,hart,ue,atleast,childs,ond,pris,hel,mot
markdownlint *.md doc/src/*.md --disable MD013 MD025 MD033 MD001 MD024 MD036 MD041 MD003
2022-08-13 18:25:47 +08:00
Evance Soumaoro
fad3faefe2
added InvertedIndexReader::doc_freq_async and SnippetGenerator::new methods
2022-08-12 06:39:10 +00:00
boraarslan
d4b2b7de8b
Expose inner file slice
2022-08-04 18:13:17 +03:00
PSeitz
23fe73a6c0
remove searcher pool and make Searcher cloneable ( #1411 )
...
* remove searcher pool and make Searcher cloneable
closes #1410
* use SearcherInner in InnerIndexReader
2022-07-12 18:07:48 +09:00
Pascal Seitz
5750224d4c
set docstore cache size at construction
2022-07-04 14:27:55 +08:00
Pascal Seitz
9db2f0e82b
expose doc store cache size
...
expose lru doc store cache size
optimize doc store cache size
2022-07-04 13:54:41 +08:00
PSeitz
de178a1901
Merge pull request #1395 from PSeitz/fix_clippy
...
fix clippy
2022-06-21 16:30:59 +08:00
Antoine G
11e4225f23
doc fix ( #1391 )
...
Documentation fix.
2022-06-21 15:53:33 +09:00
Pascal Seitz
1440f3243b
fix clippy
2022-06-21 14:47:01 +08:00
Kanji Yomoda
83d0c13fb0
Fix outdated variable naming and comments to alive bitset ( #1387 )
...
* Fix outdated variables and comments for alive bitset
* Fix expired link to delete bitset
2022-06-14 15:59:15 +09:00