tantivy

mirror of https://github.com/quickwit-oss/tantivy.git synced 2026-01-07 17:42:55 +00:00

Author	SHA1	Message	Date
Pascal Seitz	791350091c	switch num_vals() to u32 fixes #1630	2022-10-20 19:44:28 +08:00
Paul Masurel	483b1d13d4	Added unit test for long tokens (#1635) * Bugfix on long tokens and multivalue text fields. Fixes a minor bug for the strong edge case in which a tokenizer would emit tokens where the last token does not cover the last position. More importantly, this adds unit tests. Closes #1634 * Update src/indexer/segment_writer.rs Co-authored-by: PSeitz <PSeitz@users.noreply.github.com>	2022-10-20 15:05:37 +09:00
PSeitz	8de7fa9d95	Merge pull request #1631 from quickwit-oss/high_positions add test for phrase search on multi text field	2022-10-20 10:26:00 +08:00
Paul Masurel	94313b62f8	Hotfix issue/1629 - position broken (#1633) * Bugfix position broken. For a Field with several FieldValues, with a value that contained no token at all, the token position was reinitialized to 0. As a result, PhraseQueries can show some false positives. In addition, after the computation of the position delta, we can underflow u32 and end up with a gigantic delta. We haven't been able to actually explain the bug in 1629, but it is assumed that in some corner cases these deltas can cause a panic. Closes #1629	2022-10-20 11:03:55 +09:00
Pascal Seitz	f2b2628feb	add test for phrase search on multi text field	2022-10-19 16:29:56 +08:00
PSeitz	449f595832	Merge pull request #1628 from quickwit-oss/skip_index_deser faster skipindex deserialization, larger blocksize on sort	2022-10-19 11:05:20 +08:00
Pascal Seitz	a4485f7611	faster skipindex deserialization, larger blocksize on sort	2022-10-18 19:32:23 +08:00
Pascal Seitz	1082ff60f9	add range query handling for ip via term dictionary since IPs are mapped monotonically we can use the term dictionary for range queries	2022-10-18 13:08:27 +08:00
PSeitz	491854155c	Merge pull request #1625 from quickwit-oss/index_ip_field index ip field	2022-10-18 11:18:17 +08:00
Christoph Herzog	96c3d54ac7	fix: Fix power of two computation on 32bit architectures (#1624) The current `compute_previous_power_of_two()` implementation used for TermHashmap takes and returns `usize`, but actually only works correctly on 64 bit architectures (aka usize == u64). On other architectures the leading_zeros computation is run on the wrong type (must be u64), and leads to overflows. Fixed by simply computing the leading_zeros based on a u64 value.	2022-10-18 11:55:02 +09:00
Pascal Seitz	6800fdec9d	add indexing for ip field Closes #1595	2022-10-18 10:07:48 +08:00
Pascal Seitz	024e53a99c	remove truncate	2022-10-17 12:14:35 +08:00
Pascal Seitz	8d75e451bd	fix truncate, remove mutable access from term	2022-10-17 12:14:35 +08:00
Pascal Seitz	fcfd76ec55	refactor Term fixes some issues with Term Remove duplicate calls to truncate or resize Replace Magic Number 5 with constant Enforce minimum size of 5 for metadata Fix broken truncate docs use constructor instead new + set calls normalize constructor stack replace assert on internal behavior fixes #1585	2022-10-17 12:14:34 +08:00
Pascal Seitz	129f7422f5	remove unused buffer	2022-10-14 20:01:10 +08:00
PSeitz	f39cce2c8b	Merge pull request #1622 from quickwit-oss/term_aggregation add term aggregation clarification	2022-10-14 18:09:18 +08:00
Pascal Seitz	952b048341	add term aggregation clarification	2022-10-14 16:12:19 +08:00
PSeitz	80f9596ec8	Merge pull request #1611 from quickwit-oss/remove_token_stream_alloc remove tokenstream vec alloc	2022-10-14 15:12:30 +08:00
PSeitz	a602c248fb	Merge pull request #1590 from waywardmonkeys/fix-doc-warnings-quickwit Fix missing doc warnings when enabling feature "quickwit".	2022-10-14 14:09:25 +08:00
PSeitz	4b9d1fe828	Merge pull request #1620 from quickwit-oss/fix_fieldnorms_indexing Fix missing fieldnorm indexing	2022-10-14 13:41:38 +08:00
Pascal Seitz	63bc390b02	Fix missing fieldnorm indexing Fixes broken search (no results) with BM25 for u64, i64, f64, bool, bytes and date after deletion and merge. There were no fieldnorms recorded for those fields. After merge InvertedIndexReader::total_num_tokens returns 0 (Sum over the fieldnorms is 0). BM25 does not work when total_num_tokens is 0. Fixes #1617	2022-10-14 12:44:40 +08:00
Paul Masurel	07393c2fa0	Attempt to fix race condition in test. (#1619) Close #1550	2022-10-14 10:56:37 +09:00
PSeitz	77a415cbe4	rename NothingRecorder to DocIdRecorder (#1615)	2022-10-13 15:43:40 +09:00
Pascal Seitz	9cb8cfbea8	return Error instead panic in fastfields fixes #1572	2022-10-11 14:15:22 +08:00
PSeitz	8b69aab0fc	avoid prepare_doc allocation (#1610) avoid prepare_doc allocation, ~10% more throughput best case	2022-10-11 14:15:55 +09:00
PSeitz	3650d1f36a	Merge pull request #1553 from quickwit-oss/ip_field ip field	2022-10-11 13:09:47 +08:00
Pascal Seitz	2efebdb1bb	remove tokenstream vec alloc	2022-10-11 10:30:56 +08:00
François Massot	e443ca63aa	Merge pull request #1608 from quickwit-oss/nigel/serialise-bytes-as-b64-#2042 Serialise bytes as base64 strings instead of arrays.	2022-10-10 11:51:23 +02:00
Pascal Seitz	5c9cbee29d	handle IpV4 serialization case	2022-10-07 19:52:00 +08:00
Pascal Seitz	b2ca83a93c	switch to ipv6, add monotonic_mapping tests	2022-10-07 18:47:55 +08:00
Nigel Andrews	3b189080d4	Use raw string literals in tests	2022-10-07 12:28:25 +02:00
Nigel Andrews	00a6586efe	Replaced String::serialize for serializer.serialize_str	2022-10-07 11:55:05 +02:00
PSeitz	534b1d33c3	use ipv6 Co-authored-by: Paul Masurel <paul@quickwit.io>	2022-10-07 16:56:00 +08:00
PSeitz	f465173872	Apply suggestions from code review Co-authored-by: Paul Masurel <paul@quickwit.io>	2022-10-07 16:55:53 +08:00
Pascal Seitz	96315df20d	use idx part only for positions_to_docid	2022-10-07 16:54:04 +08:00
Pascal Seitz	9a1609d364	add test	2022-10-07 16:25:01 +08:00
Pascal Seitz	2864bf7123	use serializer for u128	2022-10-07 16:25:01 +08:00
Pascal Seitz	5171ff611b	serialize ip as u128, add test for positions_to_docid	2022-10-07 16:25:01 +08:00
Pascal Seitz	e50e74acf8	remove u128 type	2022-10-07 16:25:01 +08:00
Pascal Seitz	0b86658389	rename ip addr, use buffer	2022-10-07 16:25:01 +08:00
Pascal Seitz	5d6602a8d9	mark null handling TODO	2022-10-07 16:25:01 +08:00
Pascal Seitz	4d29ff4d01	finalize ip addr rename	2022-10-07 16:25:01 +08:00
Pascal Seitz	cdc8e3a8be	group monotonic mapping and inverse fix mapping inverse remove ip indexing add get_between_vals test	2022-10-07 16:25:01 +08:00
Pascal Seitz	787a37bacf	expect instead of unwrap	2022-10-07 16:25:01 +08:00
Pascal Seitz	f5039f1846	remove roaring	2022-10-07 16:25:01 +08:00
Pascal Seitz	eeb1f19093	rename to iter_gen	2022-10-07 16:25:01 +08:00
Pascal Seitz	087beaf328	remove null handling	2022-10-07 16:25:01 +08:00
Pascal Seitz	309449dba3	rename to IpAddr	2022-10-07 16:25:01 +08:00
Pascal Seitz	c8713a01ed	use iter api	2022-10-07 16:25:01 +08:00
Pascal Seitz	6113e0408c	remove comment	2022-10-07 16:25:01 +08:00
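
Commits 1082ff60f9, b2ca83a93c and 5171ff611b above describe mapping IP addresses monotonically to u128 so that range queries can be answered from a sorted term dictionary. The following is a minimal sketch of that idea, not tantivy's code: the function names are hypothetical, a `BTreeSet` stands in for the term dictionary, and representing IPv4 addresses as IPv4-mapped IPv6 is an assumption.

```rust
use std::collections::BTreeSet;
use std::net::{IpAddr, Ipv6Addr};

/// Map an IP address to a u128 (hypothetical helper).
/// Assumption: IPv4 addresses are stored as IPv4-mapped IPv6 addresses.
fn ip_to_u128(ip: IpAddr) -> u128 {
    let v6 = match ip {
        IpAddr::V4(v4) => v4.to_ipv6_mapped(),
        IpAddr::V6(v6) => v6,
    };
    u128::from(v6)
}

/// Inverse of the mapping, always returning an IPv6 address.
fn u128_to_ip(val: u128) -> Ipv6Addr {
    Ipv6Addr::from(val)
}

/// Big-endian bytes sort lexicographically in the same order as the
/// numeric value, so they can serve as keys in a sorted dictionary.
fn term_key(ip: IpAddr) -> [u8; 16] {
    ip_to_u128(ip).to_be_bytes()
}

/// Inclusive range lookup over the sorted keys.
/// Panics if `lo` maps to a larger key than `hi`.
fn range_query(dict: &BTreeSet<[u8; 16]>, lo: IpAddr, hi: IpAddr) -> Vec<Ipv6Addr> {
    dict.range(term_key(lo)..=term_key(hi))
        .map(|key| u128_to_ip(u128::from_be_bytes(*key)))
        .collect()
}
```

Because the mapping preserves order, a range over addresses becomes a contiguous range over keys, and no per-document scan is needed to find the matching terms.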

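Commit 96c3d54ac7 above explains that the power-of-two computation must run `leading_zeros` on a u64 rather than on `usize`. A minimal sketch of that fix follows. The function name `previous_power_of_two` is hypothetical and the body is an illustration of the described approach, not the code from the commit.

```rust
/// Largest power of two that is less than or equal to `n`.
/// Panics if `n` is zero.
fn previous_power_of_two(n: usize) -> usize {
    assert!(n > 0, "n must be non-zero");
    // Widen to u64 first so the bit width is 64 on every target.
    // Calling leading_zeros on a 32-bit usize would count at most 32
    // zeros, which does not match the constant 63 used below.
    let msb = 63u32 - (n as u64).leading_zeros();
    // msb is the index of the highest set bit, so it is always smaller
    // than the bit width of usize and the shift cannot overflow.
    1usize << msb
}
```

An alternative that avoids the hard-coded width is to subtract from `usize::BITS - 1` instead; the widening approach is shown here because it is the one the commit message describes.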

2208 Commits