tantivy

mirror of https://github.com/quickwit-oss/tantivy.git synced 2025-12-27 12:32:55 +00:00

Author	SHA1	Message	Date
PSeitz	4c52499622	clippy (#2549 )	2024-11-29 16:08:21 +08:00
PSeitz	21d057059e	clippy (#2527 ) * clippy * clippy * clippy * clippy * convert allow to expect and remove unused * cargo fmt * cleanup * export sample * clippy	2024-10-22 09:26:54 +08:00
Bruce Mitchener	c17e513377	Reduce typo count. (#2510 )	2024-10-10 09:55:37 +08:00
trinity-1686a	85395d942a	fix clippy lints from 1.80-1.81 (#2488 ) * fix some clippy lints * fix clippy::doc_lazy_continuation * fix some lints for 1.82	2024-09-05 14:33:05 +02:00
PSeitz	232f37126e	fix coverage (#2448 )	2024-07-05 12:04:18 +08:00
PSeitz	74940e9345	clippy (#2349 ) * fix clippy * fix clippy * fix duplicate imports	2024-04-09 07:54:44 +02:00
PSeitz	7ce950f141	add method to fetch block of first vals in columnar (#2330 ) * add method to fetch block of first vals in columnar add method to fetch block of first vals in columnar (this is way faster than single calls for full columns) add benchmark fix import warnings ``` test bench_get_block_first_on_full_column ... bench: 56 ns/iter (+/- 26) test bench_get_block_first_on_full_column_single_calls ... bench: 311 ns/iter (+/- 6) test bench_get_block_first_on_multi_column ... bench: 378 ns/iter (+/- 15) test bench_get_block_first_on_multi_column_single_calls ... bench: 546 ns/iter (+/- 13) test bench_get_block_first_on_optional_column ... bench: 291 ns/iter (+/- 6) test bench_get_block_first_on_optional_column_single_calls ... bench: 362 ns/iter (+/- 8) ``` * use remainder	2024-03-15 08:01:47 +01:00
PSeitz	b0e65560a1	handle ip addresses in term aggregation (#2319) * handle ip addresses in term aggregation Stores IP addresses during the segment term aggregation via their u64 representation and converts them to u128 (IPv6 address) via downcast when converting to intermediate results. Enable downcasting on `ColumnValues`. Expose u64 variant for u128 encoded data via `open_u64_lenient` method. Remove lifetime in VecColumn, to avoid 'static lifetime requirement coming from downcast trait. * rename method	2024-03-14 09:41:18 +01:00
Paul Masurel	014328e378	Fix bug that can cause `get_docids_for_value_range` to panic. (#2295 ) * Fix bug that can cause `get_docids_for_value_range` to panic. When `selected_docid_range.end == num_rows`, we would get a panic as we try to access a non-existing blockmeta. This PR accepts calls to rank with any value. For any value above num_rows we simply return non_null_rows. Fixes #2293 * add tests, merge variables --------- Co-authored-by: Pascal Seitz <pascal.seitz@gmail.com>	2024-01-09 14:52:20 +01:00
Adam Reichold	42acd334f4	Fixes the new deny-by-default incorrect_partial_ord_impl_on_ord_type Clippy lint (#2131 )	2023-07-21 11:36:17 +09:00
Paul Masurel	910b0b0c61	Cargo fmt	2023-07-03 22:03:31 +09:00
PSeitz	17186ca9c9	improve docs (#2105 )	2023-06-27 13:37:14 +08:00
PSeitz	ba309e18a1	switch to nanosecond precision (#2016 )	2023-05-01 03:32:20 +02:00
Paul Masurel	694a056255	Faster range (#1954) * Faster range queries This PR makes several changes: - ip compact space now uses u32 - the bitunpacker now gets a get_batch function - we push down range filtering, removing GCD / shift in the bitpacking codec - we rely on an AVX2 routine to do the filtering. * Apply suggestions from code review * Apply suggestions from code review * CR comments	2023-03-27 14:56:32 +09:00
PSeitz	835f228bfa	fix cardinality when merging empty columns (#1960 ) fixes #1958	2023-03-25 15:58:15 +09:00
Paul Masurel	2b6a4da640	Exposing empty column builder. (#1959 )	2023-03-24 16:34:41 +09:00
PSeitz	5504cfd012	remove IterColumn (#1955 ) fixes #1658	2023-03-23 06:43:17 +01:00
PSeitz	6a7a1106d6	work in batches of docs (#1937 ) * work in batches of docs * add fill_buffer test	2023-03-21 06:57:44 +01:00
PSeitz	8459efa32c	split term collection count and sub_agg (#1921 ) use unrolled ColumnValues::get_vals	2023-03-13 04:37:41 +01:00
PSeitz	a42a96f470	fix panic in dict column merge (#1930 ) * fix panic in dict column merge * Bugfix and added unit test --------- Co-authored-by: Paul Masurel <paul@quickwit.io>	2023-03-08 22:04:37 +09:00
Paul Masurel	364e321415	Clippy fix (#1926 )	2023-03-06 10:37:17 +09:00
PSeitz	bc36458334	move buffer in front of dynamic dispatch (#1915 ) dynamic dispatch seems to be really expensive, move the buffer in front of the dynamic dispatch, to reduce the number of calls into the dynamic dispatched collector.	2023-02-28 13:07:50 +08:00
Paul Masurel	f537334e4f	Adding a write schema to columnar's merge operations. (#1884 ) * Adding a write schema to columnar's merge operations. * Added unit test checking min/max when columns are empty. * CR comment * Rename to value_type_to_column_type	2023-02-21 18:25:16 +09:00
Paul Masurel	02bebf4ff5	Cargo fmt	2023-02-20 09:40:04 +09:00
Paul Masurel	0274c982d5	Refactoring. (#1881) `ColumnValues`, wrongly located in column_values/column.rs for historical reasons, moves to column_values/mod.rs. u128 stuff gets its own directory, like u64 stuff.	2023-02-17 21:57:14 +09:00
PSeitz	111f25a8f7	clippy (#1879 ) * fix clippy * fix clippy * fmt	2023-02-17 11:34:21 +01:00
PSeitz	71f43ace1d	fix dynamic dispatch regression for range queries (#1871 )	2023-02-14 16:56:40 +01:00
Paul Masurel	097fd6138d	Fix clippy comments (#1872 )	2023-02-14 23:12:45 +09:00
PSeitz	1cfb9ce59a	improve range query performance (#1864 ) fix RowId vs DocId naming fixes #1863	2023-02-14 13:25:39 +09:00
trinity-1686a	539ff08a79	move DateTime to tantivy_common (#1861 ) * move DateTime to tantivy_common * resolve imports of columnar::DateTime as import of common::DateTime	2023-02-11 17:03:06 +01:00
PSeitz	dab93df94e	fix benchmarks (#1862 )	2023-02-11 15:44:47 +09:00
Paul Masurel	bd5eea9852	Integrated columnar work.	2023-02-09 13:14:31 +01:00
PSeitz	b31fd389d8	collect columns for merge (#1812 ) * collect columns for merge * return column_type from, fix visibility * fix Co-authored-by: Paul Masurel <paul@quickwit.io>	2023-01-20 07:58:29 +01:00
Paul Masurel	89cec79813	Make it possible to force a column type and intricate bugfix. (#1815 )	2023-01-20 14:30:56 +09:00
Paul Masurel	08919a2900	Improvement on the scalar / random bitpacker code. (#1781) * Improvement on the scalar / random bitpacker code. Added proptesting. Added simple benchmark. Added assert and comments on the very non-trivial hidden contract. Removed the need for an extra padding. The last point introduces a small performance regression (~10%). * Fixing unit tests	2023-01-19 18:09:13 +09:00
Paul Masurel	e3d504d833	Minor code cleanup (#1810 )	2023-01-19 17:47:26 +09:00
Paul Masurel	5a42c5aae9	Add support for multivalues (#1809 )	2023-01-19 16:55:01 +09:00
PSeitz	f9abd256b7	add ip addr to columnar (#1805 )	2023-01-19 05:36:06 +01:00
Paul Masurel	25bad784ad	Integrated fastfield codecs into columnar. (#1782) Introduced asymmetric OptionalCodec / SerializableOptionalCodec. Removed cardinality from the columnar sstable. Added DynamicColumn. Reorganized all files. Changed DenseCodec serialization logic. Renamed methods to rank/select. Moved versioning footer to the columnar level.	2023-01-16 17:24:49 +09:00

39 Commits