tantivy/src/aggregation
PSeitz 5b7cca13e5 lower contention on AggregationLimits (#2394)
PR https://github.com/quickwit-oss/quickwit/pull/4962 fixes an issue
where the AggregationLimits are not passed correctly. Now that the
AggregationLimits are shared properly, we run into contention issues.

This PR includes some straightforward improvements to reduce contention,
by only calling if the memory changed and avoiding the second read.

We probably need some sharding with multiple counters or local caching before updating the
global after some threshold.
2024-05-15 12:25:40 +02:00

Contributing

When adding a new bucket aggregation, make sure to extend the "test_aggregation_flushing" test for at least 2 levels.

Code Organization

Tantivy's aggregations are designed to mimic the aggregations of Elasticsearch.

The code is organized in submodules:

bucket

Contains all bucket aggregations, such as the range aggregation. Bucket aggregations group documents into buckets and can contain sub-aggregations.

metric

Contains all metric aggregations, such as the average aggregation. Metric aggregations do not have sub-aggregations.
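To make the bucket/metric distinction concrete, here is a minimal, self-contained sketch of a range bucket aggregation with an average metric nested inside each bucket. All type and function names (`Range`, `AvgMetric`, `Bucket`, `range_with_avg`) are hypothetical and exist only for illustration; they are not tantivy's actual types, and the half-open `[from, to)` convention is an assumption of this sketch.

```rust
/// Hypothetical half-open range `[from, to)`; not tantivy's actual type.
#[derive(Debug, Clone, Copy)]
pub struct Range {
    pub from: f64,
    pub to: f64,
}

/// Metric state for an average. A metric is a leaf: it holds no sub-aggregations.
#[derive(Debug, Default, Clone, PartialEq)]
pub struct AvgMetric {
    pub sum: f64,
    pub count: u64,
}

impl AvgMetric {
    /// Feeds one value into the metric.
    pub fn collect(&mut self, value: f64) {
        self.sum += value;
        self.count += 1;
    }

    /// Returns the average, or `None` if no value was collected.
    pub fn value(&self) -> Option<f64> {
        if self.count == 0 {
            None
        } else {
            Some(self.sum / self.count as f64)
        }
    }
}

/// One bucket: a document count plus a nested metric sub-aggregation.
#[derive(Debug, Clone)]
pub struct Bucket {
    pub range: Range,
    pub doc_count: u64,
    pub avg: AvgMetric,
}

/// Groups `values` into `ranges` and forwards each value to the
/// sub-aggregation of every bucket whose range contains it.
pub fn range_with_avg(ranges: &[Range], values: &[f64]) -> Vec<Bucket> {
    let mut buckets: Vec<Bucket> = ranges
        .iter()
        .map(|&range| Bucket { range, doc_count: 0, avg: AvgMetric::default() })
        .collect();
    for &v in values {
        for bucket in buckets.iter_mut() {
            if v >= bucket.range.from && v < bucket.range.to {
                bucket.doc_count += 1;
                bucket.avg.collect(v);
            }
        }
    }
    buckets
}
```

In this sketch the bucket owns its sub-aggregation state, while the metric only accumulates numbers, which mirrors the split between the two submodules described above.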

agg_req

agg_req contains the user's aggregation request. Deserialization from JSON is compatible with Elasticsearch aggregation requests.
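As an illustration of what such a request can look like, the sketch below holds a request body in Elasticsearch's aggregation syntax: a range bucket aggregation with a nested average metric. The aggregation names (`price_ranges`, `avg_price`) and the `price` field are hypothetical. The deserialization call in the trailing comment is an assumption about how the crate is typically used and is not compiled as part of this snippet.

```rust
/// Example aggregation request in Elasticsearch-style JSON.
/// The aggregation names and the `price` field are made up for illustration.
pub const AGG_REQUEST_JSON: &str = r#"{
  "price_ranges": {
    "range": {
      "field": "price",
      "ranges": [
        { "to": 10.0 },
        { "from": 10.0, "to": 20.0 },
        { "from": 20.0 }
      ]
    },
    "aggs": {
      "avg_price": { "avg": { "field": "price" } }
    }
  }
}"#;

// Assumption, not compiled here: with the `tantivy` and `serde_json` crates
// available, a request like this would be deserialized along these lines:
//
//     let agg_req: tantivy::aggregation::agg_req::Aggregations =
//         serde_json::from_str(AGG_REQUEST_JSON)?;
```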

agg_req_with_accessor

agg_req_with_accessor contains the user's aggregation request enriched with fast field accessors etc., which are used during collection.

segment_agg_result

segment_agg_result contains the aggregation result tree that is used while collecting a single segment. The tree from agg_req_with_accessor is passed in during collection.

intermediate_agg_result

intermediate_agg_result contains the aggregation tree for merging with other trees.

agg_result

agg_result contains the final aggregation tree.
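The three result stages above (segment, intermediate, final) can be sketched for a single average metric. This is a simplified illustration, not tantivy's implementation: the types `SegmentAvg`, `IntermediateAvg`, and `FinalAvg` and the function `aggregate_segments` are hypothetical stand-ins. The point it tries to show is one plausible reason for a separate mergeable stage: an average cannot be merged exactly from per-segment averages, but it can be from per-segment sums and counts.

```rust
/// Per-segment collection state (hypothetical stand-in for `segment_agg_result`).
#[derive(Debug, Default)]
pub struct SegmentAvg {
    sum: f64,
    count: u64,
}

impl SegmentAvg {
    /// Collects one value from a document in this segment.
    pub fn collect(&mut self, value: f64) {
        self.sum += value;
        self.count += 1;
    }

    /// Converts the segment state into a mergeable intermediate result.
    pub fn into_intermediate(self) -> IntermediateAvg {
        IntermediateAvg { sum: self.sum, count: self.count }
    }
}

/// Mergeable state (hypothetical stand-in for `intermediate_agg_result`).
/// It keeps sum and count instead of the average so that merging stays exact.
#[derive(Debug, Default, Clone, PartialEq)]
pub struct IntermediateAvg {
    pub sum: f64,
    pub count: u64,
}

impl IntermediateAvg {
    /// Merges another intermediate result into this one.
    pub fn merge(&mut self, other: IntermediateAvg) {
        self.sum += other.sum;
        self.count += other.count;
    }

    /// Produces the final, user-facing result.
    pub fn into_final(self) -> FinalAvg {
        let value = if self.count == 0 {
            None
        } else {
            Some(self.sum / self.count as f64)
        };
        FinalAvg { value }
    }
}

/// Final result (hypothetical stand-in for `agg_result`).
#[derive(Debug, PartialEq)]
pub struct FinalAvg {
    pub value: Option<f64>,
}

/// Collects each segment separately, merges the intermediate results,
/// and converts the merged state into the final result.
pub fn aggregate_segments(segments: &[Vec<f64>]) -> FinalAvg {
    let mut merged = IntermediateAvg::default();
    for segment in segments {
        let mut state = SegmentAvg::default();
        for &value in segment {
            state.collect(value);
        }
        merged.merge(state.into_intermediate());
    }
    merged.into_final()
}
```

For example, with one segment holding `[10.0]` and another holding `[1.0, 1.0, 1.0]`, merging sums and counts gives 13.0 / 4 = 3.25, whereas averaging the two per-segment averages would give 5.5.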