PSeitz
5379c99ea2
update edition to 2024 ( #2620 )
...
* update common to edition 2024
* update bitpacker to edition 2024
* update stacker to edition 2024
* update query-grammar to edition 2024
* update sstable to edition 2024 + fmt
* fmt
* update columnar to edition 2024
* cargo fmt
* use None instead of _
2025-04-18 04:56:31 +02:00
PSeitz
21d057059e
clippy ( #2527 )
...
* clippy
* clippy
* clippy
* clippy
* convert allow to expect and remove unused
* cargo fmt
* cleanup
* export sample
* clippy
2024-10-22 09:26:54 +08:00
PSeitz
927b4432c9
Perf: use term hashmap in fastfield ( #2243 )
...
* add shared arena hashmap
* bench fastfield indexing
* use shared arena hashmap in columnar
lower minimum resize in hashtable
* clippy
* add comments
2023-11-09 13:44:02 +01:00
PSeitz
27f202083c
Improve Termmap Indexing Performance +~30% ( #2058 )
...
* update benchmark
* Improve Termmap Indexing Performance +~30%
This contains many small changes to improve Termmap performance.
Most notably:
* Specialized byte compare and equality versions, instead of glibc calls.
* ExpUnrolledLinkedList to not contain inline items.
Allow compare hash only via a feature flag compare_hash_only:
64bits should be enough with a good hash function to compare strings by
their hashes instead of comparing the strings. Disabled by default
CreateHashMap/alice/174693
time: [642.23 µs 643.80 µs 645.24 µs]
thrpt: [258.20 MiB/s 258.78 MiB/s 259.41 MiB/s]
change:
time: [-14.429% -13.303% -12.348%] (p = 0.00 < 0.05)
thrpt: [+14.088% +15.344% +16.862%]
Performance has improved.
CreateHashMap/alice_expull/174693
time: [877.03 µs 880.44 µs 884.67 µs]
thrpt: [188.32 MiB/s 189.22 MiB/s 189.96 MiB/s]
change:
time: [-26.460% -26.274% -26.091%] (p = 0.00 < 0.05)
thrpt: [+35.301% +35.637% +35.981%]
Performance has improved.
CreateHashMap/numbers_zipf/8000000
time: [9.1198 ms 9.1573 ms 9.1961 ms]
thrpt: [829.64 MiB/s 833.15 MiB/s 836.57 MiB/s]
change:
time: [-35.229% -34.828% -34.384%] (p = 0.00 < 0.05)
thrpt: [+52.403% +53.440% +54.390%]
Performance has improved.
* clippy
* add bench for ids
* inline(always) to inline whole block with bounds checks
* cleanup
2023-06-08 11:13:52 +02:00
PSeitz
d1988be8e9
fix and extend benchmark ( #2030 )
...
* add benchmark, add missing inlines
* fix stacker bench
* add wiki benchmark
* move line split out of bench
2023-05-10 13:01:56 +02:00
PSeitz
e83abbfe4a
perf: faster term hash map ( #1940 )
...
* add term hashmap benchmark
* refactor arena hashmap
add inlines
remove occupied array and use table_entry.is_empty instead (saves 4 bytes per entry)
reduce saturation threshold from 1/3 to 1/2 to reduce memory
use u32 for UnorderedId (we have the 4billion limit anyways on the Columnar stuff)
fix naming LinearProbing
remove byteorder dependency
memory consumption went down from 2Gb to 1.8GB on indexing wikipedia dataset in tantivy
* Update stacker/src/arena_hashmap.rs
Co-authored-by: Paul Masurel <paul@quickwit.io >
---------
Co-authored-by: Paul Masurel <paul@quickwit.io >
2023-04-17 09:07:33 +02:00
Paul Masurel
136a8f4124
Isolating sstable and stacker in independant crates. ( #1718 )
...
Both crate will be used in the new (optional + dynamic) fastfield work.
2022-12-13 11:44:17 +09:00