Commit Graph

1392 Commits

Author SHA1 Message Date
Paul Masurel
ae0a983ccf blop 2020-12-03 14:19:54 +09:00
Paul Masurel
570dcb67cf Small refactoring 2020-12-03 14:10:35 +09:00
Paul Masurel
654c400a0b TermDictionary.finish does not flush 2020-12-03 13:36:25 +09:00
Paul Masurel
80a99539ce Several TermDict operation now returns an io::Result 2020-12-03 13:13:11 +09:00
Paul Masurel
4b1c770e5e Simplified counting writer and removed flush 2020-12-03 11:24:39 +09:00
Paul Masurel
3491645e69 Moved the term merger 2020-12-03 10:24:04 +09:00
Paul Masurel
b4b3bc7acd Cargo fmt 2020-12-03 10:08:38 +09:00
Paul Masurel
521c7b271b Isolated fst impl of termdictionary in a specific module. 2020-12-02 21:18:33 +09:00
Adrien Guillo
3ab1ba0b2f Fix clippy warning 2020-12-01 12:07:53 -08:00
Paul Masurel
b344c0ac05 Merge pull request #949 from tantivy-search/docset_is_send
DocSet is send
2020-12-01 19:12:51 +09:00
Paul Masurel
1741619c7f DocSet is send 2020-12-01 19:11:21 +09:00
Paul Masurel
067ba3dff0 Merge pull request #946 from tantivy-search/issue/test-bugfix-atomicwrite
Attempt to fix bug surfacing sometimes in test.
2020-12-01 15:29:51 +09:00
Paul Masurel
f79250f665 Fix perf regression in the benchmark for the Count collector.
In order to reduce IO, we introduced a way to instanciate a dummy
constant FieldnormReader which worked by allocating a buffer with
as many bytes as there are docs in the segments.

This allocation is not a negligible by any mean.

This PR works by offering two implementation for the
FieldnormReader.
The const field norm reader simply returns the same value all of the
time, while the array based one does the same as the current one.
2020-12-01 08:51:32 +09:00
Paul Masurel
5a33b8d533 Merge pull request #942 from barrotsteindev/filter-collector
added initial implementation for filter_collector
2020-11-30 11:26:28 +09:00
Paul Masurel
d165655fb1 Added specialized implementation for count/count_including... in &mut DocSet 2020-11-30 11:24:13 +09:00
barrotsteindev
c805871b92 better test 2020-11-25 14:25:49 +02:00
barrotsteindev
bc44543d8f added TPredicate generic param and updated tests 2020-11-25 14:08:24 +02:00
Paul Masurel
db514208a7 Removed the SegmentCollector type from the Generics of the
FilterCollector
2020-11-25 14:08:24 +02:00
barrotsteindev
b6ff29e020 simplified FilterCollector#for_segment 2020-11-25 14:08:24 +02:00
barrotsteindev
7c94dfdc15 fmt 2020-11-25 14:08:24 +02:00
barrotsteindev
8782c0eada updated docs 2020-11-25 14:08:24 +02:00
barrotsteindev
fea0ba1042 removed unnecessary static liftimes 2020-11-25 14:08:24 +02:00
barrotsteindev
027555c75f added initial implementation for filter_collector 2020-11-25 14:08:24 +02:00
Paul Masurel
b478ed747a Attempt to fix bug surfacing sometimes in test.
Recently, `test_index_manual_policy_mmap` has been failing on Windows.

The idea addressed by this patch is that we forget to sync the parent
directory with the current implementation of atomic writes.
This was done correctly when we were relying the atomicwrites crate.

*crossing fingers*
2020-11-25 18:00:05 +09:00
Paul Masurel
e9aa27dace Avoid computing the BM25 weight if scoring is disabled 2020-11-25 14:35:49 +09:00
Paul Masurel
30c5f7c5f0 Applied CR comments 2020-11-25 13:56:05 +09:00
Adrien Guillo
6f26871c0f Replace some Arc<Box<dyn... with Arc<dyn... 2020-11-24 19:54:53 -08:00
Paul Masurel
f93cc5b5e3 Merge pull request #944 from tantivy-search/no-file-len-problem
No filelen problem.
2020-11-25 11:54:44 +09:00
Paul Masurel
5a25c8dfd3 No filelen problem. 2020-11-25 11:51:58 +09:00
Adrien Guillo
1cfdce3437 Add helper methods for reading u8 and u64 to OwnedBytes 2020-11-23 10:45:46 -08:00
Paul Masurel
8d0e049261 Revert "Move SegmentUpdater::list_files() to Index" 2020-11-20 13:53:50 +09:00
Adrien Guillo
267e920a80 Move SegmentUpdater::list_files() to Index
... and make the method public
2020-11-17 17:54:18 -08:00
Paul Masurel
7f0e61b173 Refactoring of the skip index.
The skip index now identifies both the start and the end offset
of blocks. Checkpoints are compressed in blocks, reaching better
compression.
2020-11-17 16:05:11 +09:00
Paul Masurel
ce4c50446b Merge pull request #937 from tantivy-search/guilload--cache-store-reader-blocks
Cache store reader blocks in an LRU fashion
2020-11-17 13:45:10 +09:00
Adrien Guillo
9ab25d2575 Cache store reader blocks in an LRU fashion 2020-11-16 19:09:10 -08:00
Paul Masurel
6d4b982417 Marked blockwand test as ignored.
- Using impl trait for iterating `matching_segments` in the termdict
merger
2020-11-16 13:44:14 +09:00
Paul Masurel
650eca271f Merge pull request #932 from tantivy-search/fix-unit-test-file-watcher
Fixing unit test.
2020-11-13 11:47:15 +09:00
Paul Masurel
8ee55aef6d Fixing unit test. 2020-11-13 09:01:45 +09:00
Paul Masurel
40d41c7dcb Merge pull request #929 from tantivy-search/api-public-term-merger
Make field TermMerger API public
2020-11-12 14:11:53 +09:00
Paul Masurel
eef348004e Closes #930 Minor bug.
Watch callback could be callback if the last watch handle was dropped
shortly before meta.json is called.
2020-11-11 15:51:23 +09:00
Paul Masurel
e784bbc40f Update src/core/searcher.rs
Co-authored-by: Adrien Guillo <adrien.guillo@gmail.com>
2020-11-11 12:37:52 +09:00
Paul Masurel
b8118d439f Make field TermMerger API public 2020-11-11 11:59:09 +09:00
Paul Masurel
a49e59053c Making block wand test more robusts 2020-11-10 18:01:38 +09:00
Paul Masurel
41bb2bd58b Merge pull request #926 from tantivy-search/guilload--directory-exists
Modified `Directory::exists` API to return `Result<bool, OpenReadError>`
2020-11-10 17:59:45 +09:00
Adrien Guillo
7fd6054145 Modified Directory::exists API to return Result<bool, OpenReadError> 2020-11-09 18:00:14 -08:00
Paul Masurel
6abf4e97b5 Merge pull request #925 from tantivy-search/postings-end-offset
Adding post stop offset to TermInfo.
2020-11-09 15:58:04 +09:00
Paul Masurel
d23aee76c9 Avoid loading fieldnorms when not necessary 2020-11-09 15:50:16 +09:00
Paul Masurel
726d32eac5 Merge pull request #924 from tantivy-search/guilload--implement-poll-watcher
Implement FileWatcher
2020-11-06 22:41:26 +09:00
Paul Masurel
b5f3dcdc8b TermInfo contain the end_offset of the postings.
We slice the ReadOnlySource tightly.
2020-11-06 15:18:51 +09:00
Adrien Guillo
2875deb4b1 Implement FileWatcher 2020-11-05 20:08:15 -08:00