Commit Graph

2059 Commits

Author SHA1 Message Date
Pascal Seitz
5ee5037934 create and use ReadSerializedBitSet 2021-09-24 12:53:33 +08:00
Pascal Seitz
c217bfed1e cargo fmt 2021-09-23 21:02:19 +08:00
Pascal Seitz
c27ccd3e24 improve naming 2021-09-23 21:02:09 +08:00
Pascal Seitz
d7a6a409a1 renames 2021-09-23 20:33:11 +08:00
Pascal Seitz
a1f5cead96 AliveBitSet instead of DeleteBitSet 2021-09-23 20:03:57 +08:00
Pascal Seitz
4583fa270b fixes 2021-09-23 10:39:53 +08:00
Pascal Seitz
beb3a5bd73 fix len 2021-09-18 17:58:15 +08:00
Pascal Seitz
93cbd52bf0 move code to biset, add inline, add benchmark 2021-09-18 17:35:22 +08:00
Pascal Seitz
c22177a005 add iterator 2021-09-17 15:29:27 +08:00
Pascal Seitz
4da71273e1 add de/serialization for bitset
remove len footgun
2021-09-17 10:28:12 +08:00
Pascal Seitz
4ae1d87632 add DeleteBitSet iterator 2021-09-15 23:10:04 +08:00
Paul Masurel
46b86a7976 Bounced version and edited changelog 0.16.1 2021-09-10 23:05:09 +09:00
PSeitz
3bc177e69d fix #1151 (#1152)
* fix #1151

Fixes a off by one error in the stats for the index fast field in the multi value fast field.
When retrieving the data range for a docid, `get(doc)..get(docid+1)` is requested. On creation
the num_vals statistic was set to doc instead of docid + 1. In the multivaluelinearinterpol fast
field the last value was therefore not serialized (and would return 0 instead in most cases).
So the last document get(lastdoc)..get(lastdoc + 1) would return the invalid range `value..0`.

This PR adds a proptest to cover this scenario. A combination of a large number values, since multilinear
interpolation is only active for more than 5_000 values, and a merge is required.
2021-09-10 23:00:37 +09:00
PSeitz
319609e9c1 test cargo-llvm-cov (#1149) 2021-09-03 22:00:43 +09:00
Kanji Yomoda
9d87b89718 Fix incorrect comment for Index::create_in_dir (#1148)
* Fix incorrect comment for Index::create_in_dir
2021-09-03 10:37:16 +09:00
Tomoko Uchida
dd81e38e53 Add WhitespaceTokenizer (#1147)
* Add WhitespaceTokenizer.
2021-08-29 18:20:49 +09:00
Paul Masurel
9f32b22602 Preparing for release. 2021-08-26 09:07:08 +09:00
sigaloid
096ce7488e Resolve some clippys, format (#1144)
* cargo +nightly clippy --fix -Z unstable-options
2021-08-26 08:46:00 +09:00
PSeitz
a1782dd172 Update index_sorting.md 2021-08-25 07:55:50 +01:00
PSeitz
000d76b11a Update index_sorting.md 2021-08-24 19:28:06 +01:00
PSeitz
abd29f6646 Update index_sorting.md 2021-08-24 19:26:19 +01:00
PSeitz
b4ecf0ab2f Merge pull request #1146 from tantivy-search/sorting_doc
add sorting to book
2021-08-23 17:37:54 +01:00
Pascal Seitz
798f7dbf67 add sorting to book 2021-08-23 17:36:41 +01:00
PSeitz
06a2e47c8d Merge pull request #1145 from tantivy-search/blub2
cargo fmt
2021-08-21 18:52:50 +01:00
Pascal Seitz
e0b83eb291 cargo fmt 2021-08-21 18:52:10 +01:00
PSeitz
13401f46ea add wildcard mention 2021-08-21 18:10:33 +01:00
PSeitz
1a45b030dc Merge pull request #1141 from tantivy-search/tantivy_common
dissolve common module
2021-08-20 08:03:37 +01:00
Pascal Seitz
62052bcc2d add missing test function
closes #1139
2021-08-20 07:26:22 +01:00
Pascal Seitz
3265f7bec3 dissolve common module 2021-08-19 23:26:34 +01:00
Pascal Seitz
ee0881712a move bitset to common crate, move composite file to directory 2021-08-19 17:45:09 +01:00
PSeitz
483e0336b6 Merge pull request #1140 from tantivy-search/tantivy_common
rename common to tantivy-common
2021-08-19 13:02:54 +01:00
Pascal Seitz
3e8f267e33 rename common to tantivy-common 2021-08-19 10:27:20 +01:00
Paul Masurel
3b247fd968 Version bump 2021-08-19 10:12:30 +09:00
Paul Masurel
750f6e6479 Removed obsolete unit test (#1138) 2021-08-19 10:07:49 +09:00
Evance Soumaoro
5b475e6603 Checksum validation using active files (#1130)
* now validate checksum uses segment files not managed files
2021-08-19 10:03:20 +09:00
PSeitz
0ca7f73dc5 add docs badge, fix build badge 2021-08-13 19:40:33 +01:00
PSeitz
47ed18845e Merge pull request #1136 from tantivy-search/minor_fixes
more docs detail
2021-08-13 18:11:47 +01:00
Pascal Seitz
dc141cdb29 more docs detail
remove code duplicate
2021-08-13 17:40:13 +01:00
PSeitz
f6cf6e889b Merge pull request #1133 from tantivy-search/merge_overflow
test doc_freq and term_freq in sorted index
2021-08-05 07:53:46 +01:00
Pascal Seitz
f379a80233 test doc_freq and term_freq in sorted index 2021-08-03 11:38:05 +01:00
PSeitz
4a320fd1ff fix delta position in merge and index sorting (#1132)
fixes #1125
2021-08-03 18:06:36 +09:00
PSeitz
85d23e8e3b Merge pull request #1129 from tantivy-search/merge_overflow
add long running test in ci
2021-08-02 15:54:31 +01:00
Pascal Seitz
022ab9d298 don't run as pr 2021-08-02 15:44:00 +01:00
Pascal Seitz
605e8603dc add positions to long running test 2021-08-02 15:29:49 +01:00
Pascal Seitz
70f160b329 add long running test in ci 2021-08-02 11:35:39 +01:00
PSeitz
6d265e6bed fix gh action name 2021-08-02 10:38:01 +01:00
PSeitz
fdc512391b Merge pull request #1128 from tantivy-search/merge_overflow
add sort to functional test, add env for iterations
2021-08-02 10:29:16 +01:00
Pascal Seitz
108714c934 add sort to functional test, add env for iterations 2021-08-02 10:11:17 +01:00
Paul Masurel
44e8cf98a5 Cargo fmt 2021-07-30 15:30:01 +09:00
Paul Masurel
f0ee69d9e9 Remove the complicated block search logic for a simpler branchless (#1124)
binary search

The code is simpler and faster.

Before
test postings::bench::bench_segment_intersection                                                                         ... bench:   2,093,697 ns/iter (+/- 115,509)
test postings::bench::bench_skip_next_p01                                                                                ... bench:      58,585 ns/iter (+/- 796)
test postings::bench::bench_skip_next_p1                                                                                 ... bench:     160,872 ns/iter (+/- 5,164)
test postings::bench::bench_skip_next_p10                                                                                ... bench:     615,229 ns/iter (+/- 25,108)
test postings::bench::bench_skip_next_p90                                                                                ... bench:   1,120,509 ns/iter (+/- 22,271)

After
test postings::bench::bench_segment_intersection                                                                         ... bench:   1,747,726 ns/iter (+/- 52,867)
test postings::bench::bench_skip_next_p01                                                                                ... bench:      55,205 ns/iter (+/- 714)
test postings::bench::bench_skip_next_p1                                                                                 ... bench:     131,433 ns/iter (+/- 2,814)
test postings::bench::bench_skip_next_p10                                                                                ... bench:     478,830 ns/iter (+/- 12,794)
test postings::bench::bench_skip_next_p90                                                                                ... bench:     931,082 ns/iter (+/- 31,468)
2021-07-30 14:38:42 +09:00