mirror of https://github.com/quickwit-oss/tantivy.git synced 2026-05-18 17:20:41 +00:00

README.md

Contributing

When adding a new bucket aggregation, make sure to extend the "test_aggregation_flushing" test to cover at least 2 levels.

Code Organization

Tantivy's aggregations are designed to mimic the aggregations of Elasticsearch.

The code is organized in submodules:

bucket

Contains all bucket aggregations, such as the range aggregation. Bucket aggregations group documents into buckets and can contain sub-aggregations.
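To make the idea of a bucket with a sub-aggregation concrete, here is a minimal, self-contained sketch. It does not use Tantivy's types: `RangeBucket` and `collect_into_ranges` are hypothetical names, and the half-open `[from, to)` interval follows the Elasticsearch range convention.

```rust
/// One range bucket: a half-open interval [from, to) plus the state of a
/// nested average sub-aggregation. Hypothetical layout, not Tantivy's types.
#[derive(Debug, Clone, PartialEq)]
struct RangeBucket {
    from: f64,
    to: f64,
    doc_count: u64,
    // State of the sub-aggregation: sum of the values that fell in this bucket.
    sub_sum: f64,
}

impl RangeBucket {
    fn new(from: f64, to: f64) -> Self {
        RangeBucket { from, to, doc_count: 0, sub_sum: 0.0 }
    }

    /// Result of the nested average, or None for an empty bucket.
    fn sub_avg(&self) -> Option<f64> {
        if self.doc_count == 0 {
            None
        } else {
            Some(self.sub_sum / self.doc_count as f64)
        }
    }
}

/// Assigns every value to the first bucket whose range contains it and
/// updates that bucket's sub-aggregation. Values outside all ranges are skipped.
fn collect_into_ranges(values: &[f64], bounds: &[(f64, f64)]) -> Vec<RangeBucket> {
    let mut buckets: Vec<RangeBucket> = bounds
        .iter()
        .map(|&(from, to)| RangeBucket::new(from, to))
        .collect();
    for &value in values {
        if let Some(bucket) = buckets
            .iter_mut()
            .find(|b| value >= b.from && value < b.to)
        {
            bucket.doc_count += 1;
            bucket.sub_sum += value;
        }
    }
    buckets
}
```

Each bucket carries both its document count and the state of its nested aggregation, which is why a bucket aggregation forms a tree node rather than a single number.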

metric

Contains all metric aggregations, such as the average aggregation. Metric aggregations do not have sub-aggregations.
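A metric aggregation reduces the values it sees to a few numbers and is always a leaf of the tree. The following is a rough sketch of a single-pass stats metric; `Stats` and `stats_of` are hypothetical names chosen for illustration, not Tantivy's implementation.

```rust
/// Running state of a simple stats metric. Hypothetical type for illustration.
#[derive(Debug, Clone, PartialEq)]
struct Stats {
    count: u64,
    sum: f64,
    min: f64,
    max: f64,
}

impl Stats {
    fn new() -> Self {
        // min/max start at the identity elements for f64::min and f64::max.
        Stats { count: 0, sum: 0.0, min: f64::INFINITY, max: f64::NEG_INFINITY }
    }

    /// Folds one value into the running state.
    fn collect(&mut self, value: f64) {
        self.count += 1;
        self.sum += value;
        self.min = self.min.min(value);
        self.max = self.max.max(value);
    }

    /// The average, or None if no value was collected.
    fn avg(&self) -> Option<f64> {
        if self.count == 0 {
            None
        } else {
            Some(self.sum / self.count as f64)
        }
    }
}

/// Computes the stats of a slice in a single pass.
fn stats_of(values: &[f64]) -> Stats {
    let mut stats = Stats::new();
    for &value in values {
        stats.collect(value);
    }
    stats
}
```

Because the state is a flat set of numbers with no children, there is nowhere to attach a sub-aggregation.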

agg_req

agg_req contains the user's aggregation request. Deserialization from JSON is compatible with Elasticsearch aggregation requests.
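As an illustration, a request in the Elasticsearch style might look like the JSON embedded below: a range aggregation with a nested average. The aggregation names ("price_ranges", "avg_price") and the field name "price" are assumptions made up for this example.

```rust
/// An Elasticsearch-style aggregation request as a JSON string.
/// "price_ranges", "avg_price" and the field "price" are hypothetical names.
const AGG_REQUEST_JSON: &str = r#"{
  "price_ranges": {
    "range": {
      "field": "price",
      "ranges": [
        { "to": 10.0 },
        { "from": 10.0, "to": 20.0 },
        { "from": 20.0 }
      ]
    },
    "aggs": {
      "avg_price": { "avg": { "field": "price" } }
    }
  }
}"#;
```

In practice such a string would be deserialized into the request types with a JSON library; that step is left out here to keep the sketch free of dependencies.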

agg_req_with_accessor

agg_req_with_accessor contains the user's aggregation request enriched with fast field accessors and similar data used during collection.
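The point of enriching the request is that field lookups happen once, before collection, instead of once per document. The sketch below shows that pattern under simplifying assumptions: a plain slice of optional values stands in for a fast field accessor, and `AvgRequest`, `AvgRequestWithAccessor`, `resolve` and `collect_avg` are hypothetical names.

```rust
use std::collections::HashMap;

/// What the user asked for: an average over a named field. Hypothetical type.
#[derive(Debug, Clone)]
struct AvgRequest {
    field: String,
}

/// The request paired with the resolved column, ready for collection.
struct AvgRequestWithAccessor<'a> {
    request: AvgRequest,
    // Stand-in for a fast field accessor: one optional value per document id.
    column: &'a [Option<f64>],
}

/// Looks up the column once, before collection starts.
/// Returns None if no column exists for the requested field.
fn resolve<'a>(
    request: AvgRequest,
    columns: &'a HashMap<String, Vec<Option<f64>>>,
) -> Option<AvgRequestWithAccessor<'a>> {
    let column = columns.get(&request.field)?.as_slice();
    Some(AvgRequestWithAccessor { request, column })
}

/// Averages the field over the given document ids, reading through the accessor.
fn collect_avg(agg: &AvgRequestWithAccessor, doc_ids: &[usize]) -> Option<f64> {
    let mut sum = 0.0_f64;
    let mut count = 0u64;
    for &doc in doc_ids {
        // Documents without a value, or ids past the end of the column, are skipped.
        if let Some(Some(value)) = agg.column.get(doc) {
            sum += *value;
            count += 1;
        }
    }
    if count == 0 {
        None
    } else {
        Some(sum / count as f64)
    }
}
```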

segment_agg_result

segment_agg_result contains the aggregation result tree used while collecting a single segment. The tree from agg_req_with_accessor is passed in during collection.

intermediate_agg_result

intermediate_agg_result contains the aggregation tree for merging with other trees.

agg_result

agg_result contains the final aggregation tree.
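The three result stages above (segment, intermediate, final) can be sketched for a single average metric as follows. This is a simplified illustration rather than Tantivy's code: all type and function names are hypothetical, and a real tree would hold many nested aggregations instead of one.

```rust
/// Per-segment state, filled while collecting one segment. Hypothetical type.
#[derive(Debug, Clone, Copy, PartialEq)]
struct SegmentAvg {
    sum: f64,
    count: u64,
}

/// Mergeable state. It keeps sum and count separately, because averages
/// themselves cannot be merged correctly.
#[derive(Debug, Clone, Copy, PartialEq)]
struct IntermediateAvg {
    sum: f64,
    count: u64,
}

/// Final, user-facing result.
#[derive(Debug, Clone, Copy, PartialEq)]
struct FinalAvg {
    value: Option<f64>,
}

/// Stage 1: collect the values of a single segment.
fn collect_segment(values: &[f64]) -> SegmentAvg {
    SegmentAvg { sum: values.iter().sum(), count: values.len() as u64 }
}

impl SegmentAvg {
    /// Converts the segment state into its mergeable form.
    fn into_intermediate(self) -> IntermediateAvg {
        IntermediateAvg { sum: self.sum, count: self.count }
    }
}

impl IntermediateAvg {
    /// Stage 2: merge the state of another tree into this one.
    fn merge(&mut self, other: IntermediateAvg) {
        self.sum += other.sum;
        self.count += other.count;
    }

    /// Stage 3: produce the final result once everything is merged.
    fn into_final(self) -> FinalAvg {
        let value = if self.count == 0 {
            None
        } else {
            Some(self.sum / self.count as f64)
        };
        FinalAvg { value }
    }
}

/// Runs all three stages over a list of segments.
fn aggregate(segments: &[Vec<f64>]) -> FinalAvg {
    let mut merged = IntermediateAvg { sum: 0.0, count: 0 };
    for segment in segments {
        merged.merge(collect_segment(segment).into_intermediate());
    }
    merged.into_final()
}
```

The intermediate form is what makes merging sound: for the segments `[1.0, 2.0]` and `[6.0]`, merging sums and counts gives 9.0 / 3 = 3.0, whereas averaging the two per-segment averages would give 3.75.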