Commit Graph

2585 Commits

Author SHA1 Message Date
PSeitz
f8686ab1ec improve comments
Co-authored-by: Paul Masurel <paul@quickwit.io>
2022-09-30 10:06:34 +08:00
Pascal Seitz
8f647b817f fix docstore settings for temp docstore
fixes #1565
2022-09-28 17:53:59 +08:00
trinity-1686a
a86b0df6f4 Add query matching terms in a set (#1539) 2022-09-28 09:43:18 +02:00
Bruce Mitchener
f842da758c Move ArcBytes,WeakArcBytes to mmap_directory. (#1555)
When building without default features (so without mmap, etc),
there are some warnings about unused things. This fixes the
ones related to `ArcBytes` and `WeakArcBytes`, which are only
used with the `mmap_directory` code.
2022-09-27 09:57:28 +09:00
Bruce Mitchener
97ccd6d712 Avoid slicing a string in DocParsingError. (#1559)
Fixes #1339.
2022-09-26 20:27:15 +09:00
Bruce Mitchener
cb252a42af docs: "associated to" -> "associated with" (#1557)
This reads better this way.
2022-09-26 20:23:37 +09:00
Bruce Mitchener
d9609dd6b6 POLLING_INTERVAL needn't be pub. (#1556)
This is only used within the file watcher and is const, so it
can't be configured.
2022-09-26 20:22:55 +09:00
Bruce Mitchener
f03667d967 Remove references to /cpp directory. (#1560)
This was removed in 2018, so these should be fine to remove now.
2022-09-26 20:22:28 +09:00
PSeitz
10f10a322f Merge pull request #1554 from quickwit-oss/prepare_ip_field
prepare for ip field
2022-09-26 16:34:24 +08:00
Pascal Seitz
f757471077 prepare for ip field 2022-09-26 16:27:35 +08:00
PSeitz
21e0adefda use binary search instead of linear for get_val in merge (#1548)
* use binary search instead of linear for get_val in merge

* use partition_point
2022-09-26 09:42:33 +09:00
Bruce Mitchener
ea8e6d7b1d Tidy up clippy config. (#1547)
* Checking cfg_attr is no longer necessary.
* Don't need multiple `clippy::` prefixes on a name.
2022-09-26 09:37:55 +09:00
PSeitz
dac7da780e Merge pull request #1545 from waywardmonkeys/remove-some-refs
clippy: Remove borrows that the compiler will do.
2022-09-23 15:33:23 +08:00
PSeitz
20c87903b2 fix multivalue ff index creation regression (#1543)
fixes multivalue ff regression by avoiding using `get_val`. Line::train calls repeatedly get_val, but get_val implementation on Column for multivalues is very slow. The fix is to use the iterator instead. Longterm fix should be to remove get_val access in serialization.

Old Code

test fastfield::bench::bench_multi_value_ff_merge_few_segments                                                           ... bench:  46,103,960 ns/iter (+/- 2,066,083)
test fastfield::bench::bench_multi_value_ff_merge_many_segments                                                          ... bench:  83,073,036 ns/iter (+/- 4,373,615)
est fastfield::bench::bench_multi_value_ff_merge_many_segments_log_merge                                                ... bench:  64,178,576 ns/iter (+/- 1,466,700)

Current

running 3 tests
test fastfield::multivalued::bench::bench_multi_value_ff_merge_few_segments                                              ... bench:  57,379,523 ns/iter (+/- 3,220,787)
test fastfield::multivalued::bench::bench_multi_value_ff_merge_many_segments                                             ... bench:  90,831,688 ns/iter (+/- 1,445,486)
test fastfield::multivalued::bench::bench_multi_value_ff_merge_many_segments_log_merge                                   ... bench: 158,313,264 ns/iter (+/- 28,823,250)

With Fix

running 3 tests
test fastfield::multivalued::bench::bench_multi_value_ff_merge_few_segments                                              ... bench:  57,635,671 ns/iter (+/- 2,707,361)
test fastfield::multivalued::bench::bench_multi_value_ff_merge_many_segments                                             ... bench:  91,468,712 ns/iter (+/- 11,393,581)
test fastfield::multivalued::bench::bench_multi_value_ff_merge_many_segments_log_merge                                   ... bench:  73,909,138 ns/iter (+/- 15,846,097)
2022-09-23 15:36:29 +09:00
PSeitz
f9c3947803 Merge pull request #1546 from waywardmonkeys/use-ux-from-bool
Use u8::from(bool), u64::from(bool).
2022-09-23 09:06:24 +08:00
Bruce Mitchener
e9a384bb15 Use u8::from(bool), u64::from(bool). 2022-09-22 22:44:53 +07:00
Bruce Mitchener
d231671fe2 clippy: Remove borrows that the compiler will do.
This started showing up with clippy in rust 1.64.
2022-09-22 22:38:23 +07:00
trinity-1686a
fa3d786a2f Add support for deleting all documents matching query (#1535)
* add support for deleting all documents matching query

#1494
2022-09-22 21:26:09 +09:00
Paul Masurel
75aafeeb9b Added a function to deep clone RamDirectory. (#1544) 2022-09-22 12:04:02 +02:00
PSeitz
6f066c7f65 Merge pull request #1541 from quickwit-oss/add_bench
add benchmarks for multivalued fastfield merge
2022-09-22 15:28:00 +08:00
Pascal Seitz
22e56aaee3 add benchmarks for multivalued fastfield merge 2022-09-22 11:25:41 +08:00
Paul Masurel
d641979127 Minor refactor of fast fields (#1538) 2022-09-21 12:55:03 +09:00
Paul Masurel
1998111521 Minor refactoring fast fields (#1537) 2022-09-21 12:46:11 +09:00
PSeitz
acb2e2e282 Merge pull request #1532 from quickwit-oss/refactor_ff
remove fast_field_cardinality from FastValue
2022-09-21 04:00:35 +02:00
Pascal Seitz
1ff5da5eb4 remove fast_field_cardinality from FastValue
unused and at the wrong placed
2022-09-21 09:38:46 +08:00
Bruce Mitchener
c3b25710ad doc: Improve directory::Lock docs. (#1534)
Update the docs to reflect the lack of LockParams, correct an error,
and improve cross-linking.
2022-09-20 18:03:35 +09:00
PSeitz
8492010d43 Merge pull request #1531 from waywardmonkeys/improve-docs-more
Improvements to doc linking, grammar, etc.
2022-09-20 15:37:07 +08:00
Bruce Mitchener
cf02e32578 Improvements to doc linking, grammar, etc. 2022-09-19 18:10:22 +07:00
PSeitz
8cca1014c9 Merge pull request #1527 from waywardmonkeys/remove-stream_field-reference
docs: Remove mentions of stream_field method.
2022-09-19 17:16:46 +08:00
PSeitz
938f884e32 Merge pull request #1525 from waywardmonkeys/fix-etsy-logo-alt-text-readme
README: Fix Etsy logo and alt text.
2022-09-19 16:55:08 +08:00
PSeitz
ed68afb698 Merge pull request #1528 from quickwit-oss/ff_refact
fix benches
2022-09-19 11:37:08 +08:00
PSeitz
8a7962dc22 Merge pull request #1524 from waywardmonkeys/improve-docs-1
Documentation improvements.
2022-09-19 11:15:42 +08:00
Pascal Seitz
a06039dea8 fix benches
move some benches to lib.rs to test unexported items
2022-09-19 11:07:20 +08:00
Bruce Mitchener
68b6254b09 docs: Remove mentions of stream_field method.
This method doesn't exist, so no need to mention it.
2022-09-18 23:13:41 +07:00
Bruce Mitchener
6a88ac3fe3 Documentation improvements.
Fix some linking, some grammar, some typos, etc.
2022-09-18 18:05:37 +07:00
Bruce Mitchener
191b934650 README: Fix Etsy logo and alt text. 2022-09-18 15:02:35 +07:00
PSeitz
1a2ba7025a Merge pull request #1513 from quickwit-oss/ip_codec
add ip codec
2022-09-16 18:53:08 +08:00
Pascal Seitz
02599ebeb7 remove ip_to_u128 2022-09-16 18:16:16 +08:00
Pascal Seitz
a16b466460 merge ColumnExt with Column trait 2022-09-16 18:15:18 +08:00
Pascal Seitz
b8d8fdeb6e move benches, improve bench data 2022-09-16 16:42:23 +08:00
Pascal Seitz
12856d80fa change bench, update numbers 2022-09-16 16:41:01 +08:00
Pascal Seitz
e75472ec9a add serialize_u128, open_u128, refactor 2022-09-16 16:40:59 +08:00
Pascal Seitz
e2e6c94ba8 remove ColumnV2 2022-09-16 16:40:06 +08:00
Pascal Seitz
9f610b25af fix benches, add benches 2022-09-16 16:38:48 +08:00
Pascal Seitz
237b64025e take ColumnV2 as parameter
improve algorithm
stricter assertions
improve names
2022-09-16 16:38:48 +08:00
Pascal Seitz
592caeefa0 renames 2022-09-16 16:38:48 +08:00
Pascal Seitz
570009b5b1 move to mod.rs 2022-09-16 16:38:48 +08:00
Pascal Seitz
61b5110db7 use 0 as null in compact space 2022-09-16 16:38:48 +08:00
PSeitz
58af1235e4 Apply suggestions from code review
Co-authored-by: Paul Masurel <paul@quickwit.io>
2022-09-16 16:38:48 +08:00
Pascal Seitz
d3e7c41a1f refactor to range_mapping 2022-09-16 16:38:48 +08:00