Commit Graph

2943 Commits

Author SHA1 Message Date
Pascal Seitz
b7d0dd154a fmt 2022-11-14 14:49:15 +08:00
PSeitz
ce10fab20f Apply suggestions from code review
Co-authored-by: Paul Masurel <paul@quickwit.io>
2022-11-14 14:21:53 +08:00
Pascal Seitz
e034328a8b Improve position_to_docid, refactor, add tests 2022-11-14 14:21:53 +08:00
Pascal Seitz
f811d1616b add support for ip range query on multivalue fastfields 2022-11-14 14:21:52 +08:00
PSeitz
c665b16ff0 Merge pull request #1672 from quickwit-oss/allow_range_without_indexed
Allow range query on fastfield without INDEXED
2022-11-14 12:45:12 +08:00
PSeitz
3b5f810051 Merge pull request #1677 from quickwit-oss/switch_to_u32
switch total_num_val to u32
2022-11-14 12:01:40 +08:00
trinity-1686a
5765c261aa allow warming up of the full posting list (#1673)
* allow warming up of the full posting list

* cargo fmt
2022-11-14 10:27:56 +09:00
Pascal Seitz
fb9f03118d switch total_num_val to u32 2022-11-11 17:35:52 +08:00
PSeitz
55a9d808d4 Merge pull request #1674 from quickwit-oss/u128_codec_header
add header with codec type for u128
2022-11-11 13:47:51 +08:00
Pascal Seitz
32166682b3 add header deser test 2022-11-11 13:28:12 +08:00
Pascal Seitz
e6acf8f76d add header with codec type for u128 2022-11-11 11:52:17 +08:00
Pascal Seitz
9e8a0c2cca Allow range query on fastfield without INDEXED 2022-11-10 15:56:08 +08:00
Paul Masurel
3edf0a2724 Using the manual reload policy in IndexWriter. (#1667) 2022-11-09 11:20:41 +01:00
Paul Masurel
8ca12a5683 Added stop word filter to CHANGELOG.md 2022-11-09 17:00:45 +09:00
Adam Reichold
a4b759d2fe Include stop word lists from Lucene and the Snowball project (#1666) 2022-11-09 16:57:35 +09:00
PSeitz
3e9c806890 Merge pull request #1665 from quickwit-oss/fix_num_vals
fix num_vals on u128 value index after merge
2022-11-07 21:46:02 +08:00
Pascal Seitz
c69a873dd3 fix num_vals on value index after merge 2022-11-07 21:05:21 +08:00
PSeitz
666afcf641 Merge pull request #1663 from PSeitz/fix_clippy
fix clippy
2022-11-07 18:11:20 +08:00
Pascal Seitz
38ad46e580 fix clippy 2022-11-07 16:09:55 +08:00
PSeitz
e948889f4c Merge pull request #1662 from quickwit-oss/fix_num_vals
fix num_vals in multivalue index after merge
2022-11-07 15:57:32 +08:00
Pascal Seitz
6e636c9cea fix num_vals in multivalue index after merge 2022-11-07 15:00:52 +08:00
PSeitz
5a610efbc1 Merge pull request #1661 from quickwit-oss/upgrade_criterion
update criterion to 0.4
2022-11-04 14:45:34 +08:00
Pascal Seitz
500a0d5e48 update criterion to 0.4 2022-11-04 13:26:29 +08:00
PSeitz
509a265659 add docstore version (#1652)
* add docstore version

closes #1589

* assert for docstore version
2022-11-04 10:19:16 +09:00
PSeitz
5b2cea1b97 Merge pull request #1656 from quickwit-oss/multival_offset_index
move multivalue index to own file
2022-11-02 14:03:06 +08:00
PSeitz
a5a80ffaea Update fastfield_codecs/src/column.rs
Co-authored-by: Paul Masurel <paul@quickwit.io>
2022-11-02 06:37:27 +01:00
PSeitz
0f98d91a39 Merge pull request #1646 from quickwit-oss/no_score_calls
No score calls if score is not requested
2022-11-01 20:09:32 +08:00
PSeitz
2af6b01c17 Update src/query/boolean_query/boolean_weight.rs
Co-authored-by: Paul Masurel <paul@quickwit.io>
2022-11-01 16:13:00 +08:00
Adam Reichold
c32ab66bbd Small improvements to StopWorldFilter (#1657)
* Do not copy the whole set of stop words for each stream

* Make construction of StopWordFilter more flexible.
2022-11-01 16:47:34 +09:00
PSeitz
3f3a6f9990 Merge pull request #1653 from quickwit-oss/faster_hash
switch to fx hashmap
2022-11-01 14:53:18 +08:00
Pascal Seitz
83325d8f3f move multivalue index to own file
start_doc parameter in positions to docids
2022-11-01 10:36:13 +08:00
PSeitz
4e46f4f8c4 Merge pull request #1649 from adamreichold/split-compound-words
RFC: Add dictionary-based SplitCompoundWords token filter.
2022-10-27 17:12:48 +08:00
Pascal Seitz
43df356010 rename to docset 2022-10-27 16:53:38 +08:00
PSeitz
6647362464 Merge pull request #1648 from adamreichold/stemmer-todo-alloc
Avoid unconditional allocation in StemmerTokenStream.
2022-10-27 16:50:41 +08:00
Pascal Seitz
279b1b28d3 switch to fx hashmap 2022-10-27 16:19:59 +08:00
PSeitz
7a80851e36 Merge pull request #1645 from quickwit-oss/ip_field_range_query
add ip range query benchmark, add seek behaviour
2022-10-27 16:13:52 +08:00
Adam Reichold
cd952429d2 Add dictionary-based SplitCompoundWords token filter. 2022-10-27 08:30:33 +02:00
PSeitz
d777c964da Merge pull request #1650 from adamreichold/fnv-rustc-hash
Replace FNV by rustc-hash
2022-10-27 12:11:26 +08:00
Adam Reichold
bbb058d976 Replace FNV by rustc-hash
Both construction have similar goals but rustc-hash ist better suited for
contemporary CPU as it works one word at a time instead of byte per byte.
2022-10-27 00:35:09 +02:00
Adam Reichold
5f7d027a52 Avoid unconditional allocation in StemmerTokenStream.
This fixes the TODO in two ways: If the stemmer already yields an owned string,
it is used directly as the new text of the token. Otherwise, a temporary buffer
is used to copy the stemmed text (just as before) and then swapping it into the
token to reuse its existing buffer.
2022-10-26 18:11:15 +02:00
Pascal Seitz
dfab201191 for_each_docset to iterate without score 2022-10-26 17:25:05 +08:00
PSeitz
0c2bd36fe3 Panic on duplicate field names (#1647)
fixes #1601
2022-10-26 16:17:33 +09:00
Pascal Seitz
af839753e0 No score calls if score is not requested 2022-10-26 12:18:35 +08:00
Pascal Seitz
fec2b63571 improve bench by adding more blanks in compact space 2022-10-25 22:09:01 +08:00
Pascal Seitz
6213ea476a pass positions parameter 2022-10-25 17:44:51 +08:00
Pascal Seitz
5e159c26bf add ip range query benchmark, add seek behaviour 2022-10-25 15:57:19 +08:00
PSeitz
a5e59ab598 Merge pull request #1644 from quickwit-oss/get_val_u32
switch get_val() to u32
2022-10-24 19:30:03 +08:00
Pascal Seitz
e772d3170d switch get_val() to u32
Fixes #1638
2022-10-24 19:05:57 +08:00
PSeitz
8c2ba7bd55 Merge pull request #1637 from quickwit-oss/ip_field_range_query
add range query via ip fast field
2022-10-24 18:10:47 +08:00
Pascal Seitz
02328b0151 fix proptest 2022-10-24 17:46:06 +08:00