PSeitz-dd
5ba0031f7d
move rand_distr to dev_dep (#2772)
2025-12-11 18:23:50 +08:00
PSeitz
33794a114c
chore: Release (#2686)
Co-authored-by: Pascal Seitz <pascal.seitz@datadoghq.com>
2025-08-20 18:29:37 +08:00
PSeitz
4a6123d3ff
release tantivy: bump versions (#2625)
* chore: Release
* chore: Release
---------
Co-authored-by: Pascal Seitz <pascal.seitz@datadoghq.com>
2025-06-10 15:34:39 +02:00
PSeitz
5379c99ea2
update edition to 2024 (#2620)
* update common to edition 2024
* update bitpacker to edition 2024
* update stacker to edition 2024
* update query-grammar to edition 2024
* update sstable to edition 2024 + fmt
* fmt
* update columnar to edition 2024
* cargo fmt
* use None instead of _
2025-04-18 04:56:31 +02:00
dependabot[bot]
8edb439440
Update rustc-hash requirement from 1.1.0 to 2.1.0 (#2551)
---
updated-dependencies:
- dependency-name: rustc-hash
  dependency-type: direct:production
...
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-12-26 10:25:05 +01:00
dependabot[bot]
c66af2c0a9
Update binggan requirement from 0.12.0 to 0.14.0 (#2530)
* Update binggan requirement from 0.12.0 to 0.14.0
---
updated-dependencies:
- dependency-name: binggan
  dependency-type: direct:production
...
Signed-off-by: dependabot[bot] <support@github.com>
* fix build
---------
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: Pascal Seitz <pascal.seitz@gmail.com>
2024-10-24 09:41:35 +08:00
PSeitz
7b65ad922d
use binggan for stacker bench (#2492)
* use binggan for stacker bench
```
alice (num terms: 174693)
hashmap Memory: 1.3 MB Avg: 367.19 MiB/s (-1.34%) Median: 368.10 MiB/s (-1.34%) [378.75 MiB/s .. 352.81 MiB/s]
hasmap with postings Memory: 2.4 MB Avg: 237.29 MiB/s (-2.19%) Median: 240.22 MiB/s (-1.61%) [248.26 MiB/s .. 210.66 MiB/s]
fxhashmap ref postings Memory: 2.9 MB Avg: 171.94 MiB/s (-3.22%) Median: 174.13 MiB/s (-2.69%) [185.94 MiB/s .. 152.43 MiB/s]
fxhasmap owned postings Memory: 3.5 MB Avg: 96.993 MiB/s (-4.20%) Median: 97.410 MiB/s (-4.48%) [102.78 MiB/s .. 82.745 MiB/s]
numbers unique 100k
hashmap Memory: 5.2 MB Avg: 334.17 MiB/s (-3.06%) Median: 352.61 MiB/s (+0.77%) [362.60 MiB/s .. 213.03 MiB/s]
hasmap with postings Memory: 6.3 MB Avg: 316.96 MiB/s (-0.02%) Median: 325.16 MiB/s (-0.04%) [338.36 MiB/s .. 218.60 MiB/s]
zipfs numbers 100k
hashmap Memory: 1.3 MB Avg: 1.2342 GiB/s (+2.87%) Median: 1.2677 GiB/s (+4.66%) [1.3130 GiB/s .. 915.93 MiB/s]
hasmap with postings Memory: 2.4 MB Avg: 485.16 MiB/s (+2.68%) Median: 494.70 MiB/s (+4.42%) [505.31 MiB/s .. 413.14 MiB/s]
numbers unique 1mio
hashmap Memory: 35.7 MB Avg: 169.68 MiB/s (-1.08%) Median: 166.80 MiB/s (-3.87%) [201.33 MiB/s .. 154.26 MiB/s]
hasmap with postings Memory: 39.8 MB Avg: 149.49 MiB/s (-3.07%) Median: 150.85 MiB/s (-1.45%) [160.76 MiB/s .. 130.94 MiB/s]
zipfs numbers 1mio
hashmap Memory: 1.3 MB Avg: 1.2185 GiB/s (-2.33%) Median: 1.2291 GiB/s (-2.33%) [1.2905 GiB/s .. 1.0742 GiB/s]
hasmap with postings Memory: 5.5 MB Avg: 358.43 MiB/s (-11.63%) Median: 356.95 MiB/s (-12.85%) [444.94 MiB/s .. 302.46 MiB/s]
numbers unique 2mio
hashmap Memory: 70.3 MB Avg: 163.65 MiB/s (+8.37%) Median: 162.83 MiB/s (+8.80%) [190.20 MiB/s .. 144.70 MiB/s]
hasmap with postings Memory: 78.6 MB Avg: 148.00 MiB/s (+7.75%) Median: 151.53 MiB/s (+9.11%) [166.92 MiB/s .. 120.09 MiB/s]
zipfs numbers 2mio
hashmap Memory: 1.3 MB Avg: 1.2535 GiB/s (+2.59%) Median: 1.2654 GiB/s (+0.36%) [1.2938 GiB/s .. 1.0592 GiB/s]
hasmap with postings Memory: 9.7 MB Avg: 377.96 MiB/s (-4.94%) Median: 381.82 MiB/s (-3.67%) [426.14 MiB/s .. 335.66 MiB/s]
numbers unique 5mio
hashmap Memory: 277.9 MB Avg: 121.30 MiB/s (+2.00%) Median: 121.99 MiB/s (+2.99%) [132.51 MiB/s .. 110.32 MiB/s]
hasmap with postings Memory: 295.7 MB Avg: 114.23 MiB/s (+2.13%) Median: 115.26 MiB/s (+2.94%) [124.08 MiB/s .. 103.38 MiB/s]
zipfs numbers 5mio
hashmap Memory: 1.3 MB Avg: 1.2326 GiB/s (+0.63%) Median: 1.2400 GiB/s (+0.71%) [1.2755 GiB/s .. 1.0923 GiB/s]
hasmap with postings Memory: 25.4 MB Avg: 360.49 MiB/s (+1.07%) Median: 363.44 MiB/s (+1.27%) [404.88 MiB/s .. 300.38 MiB/s]
```
* rename bench
* update binggan
* rename to HASHMAP_CAPACITY
2024-10-16 11:41:33 +08:00
PSeitz
17d5869ad6
update CHANGELOG, use github API in cliff (#2354)
* update CHANGELOG, use github API in cliff
* reset version to 0.21.1, before release
* chore: Release
* remove unreleased from CHANGELOG
2024-04-15 10:07:20 +02:00
PSeitz
1e9fc51535
update ahash (#2344)
2024-04-09 06:35:39 +02:00
PSeitz
1223a87eb2
add fuzz test for hashmap (#2310)
2024-01-31 10:30:21 +01:00
PSeitz
49448b31c6
chore: Release (#2168)
* chore: Release
* update CHANGELOG
2023-09-01 13:58:58 +02:00
PSeitz
e3eacb4388
release tantivy (#2083)
* prerelease
* chore: Release
2023-06-09 10:47:46 +02:00
PSeitz
27f202083c
Improve Termmap Indexing Performance +~30% (#2058)
* update benchmark
* Improve Termmap Indexing Performance +~30%
This contains many small changes to improve Termmap performance.
Most notably:
* Specialized byte compare and equality versions, instead of glibc calls.
* ExpUnrolledLinkedList no longer contains inline items.
Allow comparing by hash only via the feature flag compare_hash_only:
with a good hash function, 64 bits should be enough to compare strings by
their hashes instead of comparing the strings themselves. Disabled by default.
CreateHashMap/alice/174693
time: [642.23 µs 643.80 µs 645.24 µs]
thrpt: [258.20 MiB/s 258.78 MiB/s 259.41 MiB/s]
change:
time: [-14.429% -13.303% -12.348%] (p = 0.00 < 0.05)
thrpt: [+14.088% +15.344% +16.862%]
Performance has improved.
CreateHashMap/alice_expull/174693
time: [877.03 µs 880.44 µs 884.67 µs]
thrpt: [188.32 MiB/s 189.22 MiB/s 189.96 MiB/s]
change:
time: [-26.460% -26.274% -26.091%] (p = 0.00 < 0.05)
thrpt: [+35.301% +35.637% +35.981%]
Performance has improved.
CreateHashMap/numbers_zipf/8000000
time: [9.1198 ms 9.1573 ms 9.1961 ms]
thrpt: [829.64 MiB/s 833.15 MiB/s 836.57 MiB/s]
change:
time: [-35.229% -34.828% -34.384%] (p = 0.00 < 0.05)
thrpt: [+52.403% +53.440% +54.390%]
Performance has improved.
* clippy
* add bench for ids
* inline(always) to inline whole block with bounds checks
* cleanup
2023-06-08 11:13:52 +02:00
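The compare_hash_only idea described in 27f202083c can be illustrated with a minimal sketch. This is not the stacker implementation: `Entry`, `hash_bytes`, and `entry_matches` are hypothetical names, the flag is modelled as a runtime `bool` rather than a Cargo feature, and the standard library's `DefaultHasher` stands in for whatever hash function the crate actually uses.

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

/// Hypothetical stored entry: the 64-bit hash plus the key bytes.
struct Entry {
    hash: u64,
    key: Vec<u8>,
}

/// Hashes a byte string to 64 bits (DefaultHasher is an assumption here).
fn hash_bytes(key: &[u8]) -> u64 {
    let mut hasher = DefaultHasher::new();
    key.hash(&mut hasher);
    hasher.finish()
}

/// Returns true if `entry` should be treated as holding `key`.
/// The hashes are always compared first. The byte comparison is skipped
/// when `compare_hash_only` is set, trading a tiny collision risk for speed.
fn entry_matches(entry: &Entry, key: &[u8], hash: u64, compare_hash_only: bool) -> bool {
    if entry.hash != hash {
        return false;
    }
    compare_hash_only || entry.key == key
}

fn main() {
    let hash = hash_bytes(b"alice");
    let stored = Entry { hash, key: b"alice".to_vec() };

    // Same key: both modes agree.
    assert!(entry_matches(&stored, b"alice", hash, false));
    assert!(entry_matches(&stored, b"alice", hash, true));

    // Simulated collision: a different key presented with the same hash.
    // Only the full comparison tells the two keys apart.
    assert!(!entry_matches(&stored, b"bob", hash, false));
    assert!(entry_matches(&stored, b"bob", hash, true));
}
```

The simulated collision in `main` shows the trade-off: hash-only comparison is correct only as long as two distinct keys never share a 64-bit hash, which is presumably why the commit leaves it disabled by default.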
dependabot[bot]
4be6f83b0a
Update criterion requirement from 0.4 to 0.5 (#2056)
Updates the requirements on [criterion](https://github.com/bheisler/criterion.rs) to permit the latest version.
- [Changelog](https://github.com/bheisler/criterion.rs/blob/master/CHANGELOG.md)
- [Commits](https://github.com/bheisler/criterion.rs/compare/0.4.0...0.5.0)
---
updated-dependencies:
- dependency-name: criterion
  dependency-type: direct:production
...
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2023-05-24 15:59:51 +09:00
PSeitz
00c5df610c
update termmap benchmark (#2040)
2023-05-12 07:35:06 +02:00
PSeitz
d1988be8e9
fix and extend benchmark (#2030)
* add benchmark, add missing inlines
* fix stacker bench
* add wiki benchmark
* move line split out of bench
2023-05-10 13:01:56 +02:00
tottoto
73452284ae
Remove unused crates from dependencies (#2018)
* Remove unused crates from dependencies
* Revert rand to columnar
* Revert criterion to stacker
2023-05-02 12:34:20 +02:00
PSeitz
e83abbfe4a
perf: faster term hash map (#1940)
* add term hashmap benchmark
* refactor arena hashmap
add inlines
remove occupied array and use table_entry.is_empty instead (saves 4 bytes per entry)
raise saturation threshold from 1/3 to 1/2 to reduce memory
use u32 for UnorderedId (the Columnar code already has a 4 billion limit anyway)
fix naming LinearProbing
remove byteorder dependency
memory consumption went down from 2 GB to 1.8 GB when indexing the wikipedia dataset in tantivy
* Update stacker/src/arena_hashmap.rs
Co-authored-by: Paul Masurel <paul@quickwit.io>
---------
Co-authored-by: Paul Masurel <paul@quickwit.io>
2023-04-17 09:07:33 +02:00
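The arena hashmap changes listed in e83abbfe4a (no separate occupied array, u32 ids, resizing at 1/2 saturation, linear probing) can be sketched roughly as follows. This is an illustration under stated assumptions, not the code in stacker/src/arena_hashmap.rs: `TableEntry`, `Table`, and `get_or_insert` are hypothetical, the empty-slot sentinel `u32::MAX` is an assumption, and keys are identified by their 32-bit hash alone to keep the example short.

```rust
/// Hypothetical table entry. An id of `u32::MAX` marks an empty slot,
/// so no separate "occupied" array is needed.
#[derive(Clone, Copy)]
struct TableEntry {
    hash: u32,
    unordered_id: u32,
}

impl TableEntry {
    const EMPTY: TableEntry = TableEntry { hash: 0, unordered_id: u32::MAX };

    fn is_empty(&self) -> bool {
        self.unordered_id == u32::MAX
    }
}

struct Table {
    entries: Vec<TableEntry>,
    mask: usize,
    len: usize,
}

impl Table {
    fn with_capacity_pow2(capacity: usize) -> Self {
        assert!(capacity.is_power_of_two());
        Table { entries: vec![TableEntry::EMPTY; capacity], mask: capacity - 1, len: 0 }
    }

    /// The table counts as saturated once more than half of the slots are used.
    fn is_saturated(&self) -> bool {
        self.len * 2 > self.entries.len()
    }

    /// Doubles the table and re-inserts every occupied entry by linear probing.
    fn resize(&mut self) {
        let new_capacity = self.entries.len() * 2;
        let old = std::mem::replace(&mut self.entries, vec![TableEntry::EMPTY; new_capacity]);
        self.mask = new_capacity - 1;
        for entry in old.into_iter().filter(|e| !e.is_empty()) {
            let mut slot = (entry.hash as usize) & self.mask;
            while !self.entries[slot].is_empty() {
                slot = (slot + 1) & self.mask;
            }
            self.entries[slot] = entry;
        }
    }

    /// Returns the id stored for `hash`, assigning the next id if it is new.
    fn get_or_insert(&mut self, hash: u32) -> u32 {
        if self.is_saturated() {
            self.resize();
        }
        let mut slot = (hash as usize) & self.mask;
        loop {
            let entry = self.entries[slot];
            if entry.is_empty() {
                let id = self.len as u32;
                self.entries[slot] = TableEntry { hash, unordered_id: id };
                self.len += 1;
                return id;
            }
            if entry.hash == hash {
                return entry.unordered_id;
            }
            // Linear probing: try the next slot, wrapping around.
            slot = (slot + 1) & self.mask;
        }
    }
}

fn main() {
    let mut table = Table::with_capacity_pow2(4);
    // 1, 5, 9 and 13 all map to slot 1 in a 4-slot table, forcing probing.
    for hash in [1u32, 5, 9, 13] {
        table.get_or_insert(hash);
    }
    assert_eq!(table.get_or_insert(5), 1);
    assert_eq!(table.entries.len(), 8);
}
```

A higher saturation threshold means fewer empty slots are kept around, which matches the commit's stated goal of reducing memory, at the cost of longer probe sequences on average.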
Paul Masurel
ed5a3b3172
Bumped murmurhash version
2023-03-03 21:24:32 +09:00
Paul Masurel
2a6d1eaf78
Added missing license.
2022-12-22 12:47:43 +09:00
Paul Masurel
f39165e1e7
Moving FileSlice to tantivy-common (#1729)
2022-12-21 16:35:11 +09:00
Paul Masurel
136a8f4124
Isolating sstable and stacker in independent crates. (#1718)
Both crates will be used in the new (optional + dynamic) fastfield work.
2022-12-13 11:44:17 +09:00