Paul Masurel
ae0a983ccf
blop
2020-12-03 14:19:54 +09:00
Paul Masurel
570dcb67cf
Small refactoring
2020-12-03 14:10:35 +09:00
Paul Masurel
654c400a0b
TermDictionary.finish does not flush
2020-12-03 13:36:25 +09:00
Paul Masurel
80a99539ce
Several TermDict operations now return an io::Result
2020-12-03 13:13:11 +09:00
Paul Masurel
4b1c770e5e
Simplified counting writer and removed flush
2020-12-03 11:24:39 +09:00
Paul Masurel
3491645e69
Moved the term merger
2020-12-03 10:24:04 +09:00
Paul Masurel
b4b3bc7acd
Cargo fmt
2020-12-03 10:08:38 +09:00
Paul Masurel
521c7b271b
Isolated fst impl of termdictionary in a specific module.
2020-12-02 21:18:33 +09:00
Adrien Guillo
3ab1ba0b2f
Fix clippy warning
2020-12-01 12:07:53 -08:00
Paul Masurel
b344c0ac05
Merge pull request #949 from tantivy-search/docset_is_send
...
DocSet is send
2020-12-01 19:12:51 +09:00
Paul Masurel
1741619c7f
DocSet is send
2020-12-01 19:11:21 +09:00
Paul Masurel
067ba3dff0
Merge pull request #946 from tantivy-search/issue/test-bugfix-atomicwrite
...
Attempt to fix bug surfacing sometimes in test.
2020-12-01 15:29:51 +09:00
Paul Masurel
f79250f665
Fix perf regression in the benchmark for the Count collector.
...
In order to reduce IO, we introduced a way to instantiate a dummy
constant FieldnormReader which worked by allocating a buffer with
as many bytes as there are docs in the segments.
This allocation is not negligible by any means.
This PR works by offering two implementations for the
FieldnormReader.
The const field norm reader simply returns the same value all of the
time, while the array based one does the same as the current one.
2020-12-01 08:51:32 +09:00
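The two-implementation idea described in f79250f665 can be illustrated with a minimal sketch. This is not tantivy's actual code: the type name `FieldNormReaderSketch`, its variants, and the `fieldnorm` method are hypothetical, and the array variant returns the stored byte directly, which is a simplifying assumption about how per-document values are stored.

```rust
/// Hypothetical sketch of a reader with two backing implementations.
/// Names and layout are assumptions for illustration only.
enum FieldNormReaderSketch {
    /// Returns the same value for every document.
    /// No buffer proportional to the number of docs is allocated.
    Const(u32),
    /// One byte per document, indexed by doc id.
    Array(Vec<u8>),
}

impl FieldNormReaderSketch {
    /// Returns the field norm for `doc`.
    /// Panics in the array case if `doc` is out of bounds.
    fn fieldnorm(&self, doc: u32) -> u32 {
        match self {
            FieldNormReaderSketch::Const(value) => *value,
            FieldNormReaderSketch::Array(data) => u32::from(data[doc as usize]),
        }
    }
}
```

With this shape, a collector that does not need real field norms can be handed the `Const` variant, which avoids the per-segment allocation that the commit message identifies as the source of the regression.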
Paul Masurel
5a33b8d533
Merge pull request #942 from barrotsteindev/filter-collector
...
added initial implementation for filter_collector
2020-11-30 11:26:28 +09:00
Paul Masurel
d165655fb1
Added specialized implementation for count/count_including... in &mut DocSet
2020-11-30 11:24:13 +09:00
barrotsteindev
c805871b92
better test
2020-11-25 14:25:49 +02:00
barrotsteindev
bc44543d8f
added TPredicate generic param and updated tests
2020-11-25 14:08:24 +02:00
Paul Masurel
db514208a7
Removed the SegmentCollector type from the Generics of the
...
FilterCollector
2020-11-25 14:08:24 +02:00
barrotsteindev
b6ff29e020
simplified FilterCollector#for_segment
2020-11-25 14:08:24 +02:00
barrotsteindev
7c94dfdc15
fmt
2020-11-25 14:08:24 +02:00
barrotsteindev
8782c0eada
updated docs
2020-11-25 14:08:24 +02:00
barrotsteindev
fea0ba1042
removed unnecessary static lifetimes
2020-11-25 14:08:24 +02:00
barrotsteindev
027555c75f
added initial implementation for filter_collector
2020-11-25 14:08:24 +02:00
Paul Masurel
b478ed747a
Attempt to fix bug surfacing sometimes in test.
...
Recently, `test_index_manual_policy_mmap` has been failing on Windows.
The idea addressed by this patch is that we forget to sync the parent
directory with the current implementation of atomic writes.
This was done correctly when we were relying on the atomicwrites crate.
*crossing fingers*
2020-11-25 18:00:05 +09:00
Paul Masurel
e9aa27dace
Avoid computing the BM25 weight if scoring is disabled
2020-11-25 14:35:49 +09:00
Paul Masurel
30c5f7c5f0
Applied CR comments
2020-11-25 13:56:05 +09:00
Adrien Guillo
6f26871c0f
Replace some Arc<Box<dyn... with Arc<dyn...
2020-11-24 19:54:53 -08:00
Paul Masurel
f93cc5b5e3
Merge pull request #944 from tantivy-search/no-file-len-problem
...
No filelen problem.
2020-11-25 11:54:44 +09:00
Paul Masurel
5a25c8dfd3
No filelen problem.
2020-11-25 11:51:58 +09:00
Adrien Guillo
1cfdce3437
Add helper methods for reading u8 and u64 to OwnedBytes
2020-11-23 10:45:46 -08:00
Paul Masurel
8d0e049261
Revert "Move SegmentUpdater::list_files() to Index"
2020-11-20 13:53:50 +09:00
Adrien Guillo
267e920a80
Move SegmentUpdater::list_files() to Index
...
... and make the method public
2020-11-17 17:54:18 -08:00
Paul Masurel
7f0e61b173
Refactoring of the skip index.
...
The skip index now identifies both the start and the end offset
of blocks. Checkpoints are compressed in blocks, reaching better
compression.
2020-11-17 16:05:11 +09:00
Paul Masurel
ce4c50446b
Merge pull request #937 from tantivy-search/guilload--cache-store-reader-blocks
...
Cache store reader blocks in an LRU fashion
2020-11-17 13:45:10 +09:00
Adrien Guillo
9ab25d2575
Cache store reader blocks in an LRU fashion
2020-11-16 19:09:10 -08:00
Paul Masurel
6d4b982417
Marked blockwand test as ignored.
...
- Using impl trait for iterating `matching_segments` in the termdict
merger
2020-11-16 13:44:14 +09:00
Paul Masurel
650eca271f
Merge pull request #932 from tantivy-search/fix-unit-test-file-watcher
...
Fixing unit test.
2020-11-13 11:47:15 +09:00
Paul Masurel
8ee55aef6d
Fixing unit test.
2020-11-13 09:01:45 +09:00
Paul Masurel
40d41c7dcb
Merge pull request #929 from tantivy-search/api-public-term-merger
...
Make field TermMerger API public
2020-11-12 14:11:53 +09:00
Paul Masurel
eef348004e
Closes #930 Minor bug.
...
Watch callback could be called if the last watch handle was dropped
shortly before meta.json is called.
2020-11-11 15:51:23 +09:00
Paul Masurel
e784bbc40f
Update src/core/searcher.rs
...
Co-authored-by: Adrien Guillo <adrien.guillo@gmail.com>
2020-11-11 12:37:52 +09:00
Paul Masurel
b8118d439f
Make field TermMerger API public
2020-11-11 11:59:09 +09:00
Paul Masurel
a49e59053c
Making block wand test more robust
2020-11-10 18:01:38 +09:00
Paul Masurel
41bb2bd58b
Merge pull request #926 from tantivy-search/guilload--directory-exists
...
Modified `Directory::exists` API to return `Result<bool, OpenReadError>`
2020-11-10 17:59:45 +09:00
Adrien Guillo
7fd6054145
Modified Directory::exists API to return Result<bool, OpenReadError>
2020-11-09 18:00:14 -08:00
Paul Masurel
6abf4e97b5
Merge pull request #925 from tantivy-search/postings-end-offset
...
Adding post stop offset to TermInfo.
2020-11-09 15:58:04 +09:00
Paul Masurel
d23aee76c9
Avoid loading fieldnorms when not necessary
2020-11-09 15:50:16 +09:00
Paul Masurel
726d32eac5
Merge pull request #924 from tantivy-search/guilload--implement-poll-watcher
...
Implement FileWatcher
2020-11-06 22:41:26 +09:00
Paul Masurel
b5f3dcdc8b
TermInfo contains the end_offset of the postings.
...
We slice the ReadOnlySource tightly.
2020-11-06 15:18:51 +09:00
Adrien Guillo
2875deb4b1
Implement FileWatcher
2020-11-05 20:08:15 -08:00