tantivy

mirror of https://github.com/quickwit-oss/tantivy.git synced 2026-07-07 17:50:42 +00:00

Files

Paul Masurel 04beab3b29 Performance improvement for nested cardinality aggregation

When a string cardinality aggregation is nested, it ends up being applied to different buckets.
Dictionary encoding relies on a different dictionary for each segment.

As a result, during segment collection, we only collect term ordinals in a HashSet, and decode them
with the term dictionary at the end of collection.

Before this PR, this decoding phase was done once for each bucket, causing the same work to be done over and over. This PR introduces a coupon cache. The HLL sketch relies on a hash of the string values.

We populate the cache before bucket collection, and get our values from it.
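
The idea can be illustrated with a minimal, self-contained sketch. It is not the code from this PR: `TermDict`, `build_coupon_cache` and `bucket_coupons` are hypothetical names, the cache is assumed to be a plain map from term ordinal to hash, and a `HashSet` of hashes stands in for the HLL sketch. The point is only that each ordinal is decoded and hashed once per segment, however many buckets reference it.

```rust
use std::cell::Cell;
use std::collections::hash_map::DefaultHasher;
use std::collections::{HashMap, HashSet};
use std::hash::{Hash, Hasher};

/// Hypothetical stand-in for one segment's term dictionary (ordinal -> term).
struct TermDict {
    terms: Vec<String>,
    /// Counts dictionary lookups so the amount of decoding work is observable.
    lookups: Cell<usize>,
}

impl TermDict {
    fn new(terms: &[&str]) -> Self {
        TermDict {
            terms: terms.iter().map(|t| t.to_string()).collect(),
            lookups: Cell::new(0),
        }
    }

    /// Decodes a term ordinal back into its string value.
    fn term(&self, ord: u64) -> &str {
        self.lookups.set(self.lookups.get() + 1);
        &self.terms[ord as usize]
    }
}

/// Hash of the string value; this is what the sketch consumes.
fn hash_term(term: &str) -> u64 {
    let mut hasher = DefaultHasher::new();
    term.hash(&mut hasher);
    hasher.finish()
}

/// Assumed cache layout: term ordinal -> hash of the decoded term.
/// Each distinct ordinal is decoded at most once, even if it repeats.
fn build_coupon_cache(
    dict: &TermDict,
    ordinals: impl Iterator<Item = u64>,
) -> HashMap<u64, u64> {
    let mut cache = HashMap::new();
    for ord in ordinals {
        cache.entry(ord).or_insert_with(|| hash_term(dict.term(ord)));
    }
    cache
}

/// Resolves one bucket's ordinals through the cache, without touching
/// the dictionary. Panics if an ordinal is missing from the cache.
fn bucket_coupons(cache: &HashMap<u64, u64>, bucket: &HashSet<u64>) -> HashSet<u64> {
    bucket.iter().map(|ord| cache[ord]).collect()
}
```

With three buckets that share ordinals, building the cache from the union of their ordinals performs one dictionary lookup per distinct ordinal, and `bucket_coupons` then serves every bucket from the cache.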

This PR also renames "caching" to "buffering" in aggregation (it was never caching), and does several cleanups.

2026-04-10 14:51:00 +02:00

column

fix Column.first method parameter type (#2792)

2026-01-05 10:03:01 +01:00

column_index

clippy (#2700)

2025-09-19 18:04:25 +02:00

column_values

Composite agg merge (#2856)

2026-03-18 17:28:59 +01:00

columnar

clippy (#2700)