mirror of https://github.com/quickwit-oss/tantivy.git synced 2026-05-28 14:10:42 +00:00

Files

PSeitz 5f23bb7e65 switch to sparse collection for histogram (#1898)

* switch to sparse collection for histogram

Replaces histogram vec collection with a hashmap. This approach works much better for sparse data and enables use cases like drill downs (filter + small interval).
It is slower for dense cases (1.3x-2x slower). This can be alleviated with a specialized hashmap in the future.
closes #1704
closes #1370

* refactor, clippy

* fix bucket_pos overflow issue

2023-02-23 07:02:58 +01:00
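
The trade-off described in the commit message above can be illustrated with a minimal sketch. This is not the code from the commit: the function names (`bucket_pos`, `collect_dense`, `collect_sparse`) and signatures are hypothetical, and only the standard library is used. It assumes a bucket position is computed as `floor(value / interval)`.

```rust
use std::collections::HashMap;

/// Hypothetical helper: maps a value to its histogram bucket position.
fn bucket_pos(val: f64, interval: f64) -> i64 {
    (val / interval).floor() as i64
}

/// Dense collection: allocates one counter for every bucket between
/// `min` and `max`, whether or not any value falls into it.
fn collect_dense(values: &[f64], interval: f64, min: f64, max: f64) -> Vec<u64> {
    let first = bucket_pos(min, interval);
    let last = bucket_pos(max, interval);
    let mut counts = vec![0u64; (last - first + 1) as usize];
    for &v in values {
        counts[(bucket_pos(v, interval) - first) as usize] += 1;
    }
    counts
}

/// Sparse collection: allocates an entry only for buckets that
/// actually receive a value.
fn collect_sparse(values: &[f64], interval: f64) -> HashMap<i64, u64> {
    let mut counts = HashMap::new();
    for &v in values {
        *counts.entry(bucket_pos(v, interval)).or_insert(0) += 1;
    }
    counts
}
```

For the values `[1.0, 2.5, 1_000_000.0]` with an interval of `1.0`, the dense variant allocates one million counters while the sparse variant allocates three entries. The price is a hash lookup per value, which is consistent with the slowdown the commit message reports for dense data.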

| Name | Last commit | Date |
|---|---|---|
| bucket | switch to sparse collection for histogram (#1898) | 2023-02-23 07:02:58 +01:00 |
| metric | remove schema in aggs (#1888) | 2023-02-22 04:50:28 +01:00 |
| agg_req_with_accessor.rs | remove schema in aggs (#1888) | 2023-02-22 04:50:28 +01:00 |
| agg_req.rs | Remove standard deviation from stats aggregation | 2023-01-16 22:58:23 -05:00 |
| agg_result.rs | remove schema in aggs (#1888) | 2023-02-22 04:50:28 +01:00 |
| agg_tests.rs | remove schema in aggs (#1888) | 2023-02-22 04:50:28 +01:00 |
| buf_collector.rs | refactor aggregations (#1875) | 2023-02-16 13:15:16 +01:00 |
| collector.rs | remove schema in aggs (#1888) | 2023-02-22 04:50:28 +01:00 |
| date.rs | add aggregation support for date type (#1693) | 2022-11-28 09:12:08 +09:00 |
| intermediate_agg_result.rs | remove schema in aggs (#1888) | 2023-02-22 04:50:28 +01:00 |
| mod.rs | remove schema in aggs (#1888) | 2023-02-22 04:50:28 +01:00 |
| README.md | Improve Docs Readability (#1380) | 2022-06-02 09:32:57 +09:00 |
| segment_agg_result.rs | switch to sparse collection for histogram (#1898) | 2023-02-23 07:02:58 +01:00 |

README.md

Contributing

When adding a new bucket aggregation, make sure to extend the "test_aggregation_flushing" test for at least 2 levels.

Code Organization

Tantivy's aggregations have been designed to mimic the aggregations of Elasticsearch.

The code is organized in submodules:

bucket

Contains all bucket aggregations, like range aggregation. These bucket aggregations group documents into buckets and can contain sub-aggregations.

metric

Contains all metric aggregations, like average aggregation. Metric aggregations do not have sub-aggregations.
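
To make the difference between the two kinds concrete, here is a minimal, standard-library-only sketch of a range bucket aggregation that carries an average metric as its sub-aggregation. The type names (`AvgMetric`, `RangeBucket`, `RangeAgg`) are hypothetical and do not correspond to Tantivy's actual types. The sketch assumes half-open ranges `[from, to)`.

```rust
/// Hypothetical metric aggregation: computes a single value, has no children.
#[derive(Debug, Default, Clone, PartialEq)]
struct AvgMetric {
    sum: f64,
    count: u64,
}

impl AvgMetric {
    fn collect(&mut self, val: f64) {
        self.sum += val;
        self.count += 1;
    }

    /// Returns `None` when no value was collected.
    fn value(&self) -> Option<f64> {
        if self.count == 0 {
            None
        } else {
            Some(self.sum / self.count as f64)
        }
    }
}

/// One bucket of a hypothetical range aggregation over `[from, to)`.
/// Each bucket owns its own sub-aggregation state.
#[derive(Debug, Clone, PartialEq)]
struct RangeBucket {
    from: f64,
    to: f64,
    doc_count: u64,
    sub_avg: AvgMetric,
}

/// Hypothetical bucket aggregation: groups documents into buckets.
struct RangeAgg {
    buckets: Vec<RangeBucket>,
}

impl RangeAgg {
    fn new(ranges: &[(f64, f64)]) -> Self {
        let buckets = ranges
            .iter()
            .map(|&(from, to)| RangeBucket {
                from,
                to,
                doc_count: 0,
                sub_avg: AvgMetric::default(),
            })
            .collect();
        RangeAgg { buckets }
    }

    /// Routes one document to every matching bucket and forwards the
    /// metric value to that bucket's sub-aggregation.
    fn collect(&mut self, bucket_val: f64, metric_val: f64) {
        for bucket in self.buckets.iter_mut() {
            if bucket_val >= bucket.from && bucket_val < bucket.to {
                bucket.doc_count += 1;
                bucket.sub_avg.collect(metric_val);
            }
        }
    }
}
```

The bucket aggregation decides where a document goes, and the metric only sees the documents that landed in its bucket. This is why a bucket can nest further aggregations while a metric is always a leaf.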

agg_req

agg_req contains the user's aggregation request. Deserialization from JSON is compatible with Elasticsearch aggregation requests.

agg_req_with_accessor

agg_req_with_accessor contains the user's aggregation request enriched with fast field accessors etc., which are used during collection.

segment_agg_result

segment_agg_result contains the aggregation result tree, which is used for collection of a segment. The tree from agg_req_with_accessor is passed during collection.

intermediate_agg_result

intermediate_agg_result contains the aggregation tree for merging with other trees.

agg_result

agg_result contains the final aggregation tree.
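
The three result stages described above (per-segment, intermediate, final) can be sketched for a single average metric. This is an illustration only: the type names (`SegmentAvg`, `IntermediateAvg`, `FinalAvg`) and the `aggregate` helper are hypothetical, and the real modules hold whole aggregation trees rather than one metric.

```rust
/// Hypothetical per-segment state, filled while collecting one segment.
#[derive(Debug, Default)]
struct SegmentAvg {
    sum: f64,
    count: u64,
}

impl SegmentAvg {
    fn collect(&mut self, val: f64) {
        self.sum += val;
        self.count += 1;
    }

    fn into_intermediate(self) -> IntermediateAvg {
        IntermediateAvg { sum: self.sum, count: self.count }
    }
}

/// Hypothetical intermediate state: keeps sum and count separately so
/// that results from several segments can still be merged exactly.
#[derive(Debug, Default, Clone, Copy)]
struct IntermediateAvg {
    sum: f64,
    count: u64,
}

impl IntermediateAvg {
    fn merge(&mut self, other: IntermediateAvg) {
        self.sum += other.sum;
        self.count += other.count;
    }

    fn into_final(self) -> FinalAvg {
        let value = if self.count == 0 {
            None
        } else {
            Some(self.sum / self.count as f64)
        };
        FinalAvg { value }
    }
}

/// Hypothetical final result: only the value returned to the user.
#[derive(Debug, PartialEq)]
struct FinalAvg {
    value: Option<f64>,
}

/// Runs all three stages over a list of segments, each given as its values.
fn aggregate(segments: &[Vec<f64>]) -> FinalAvg {
    let mut merged = IntermediateAvg::default();
    for segment in segments {
        let mut state = SegmentAvg::default();
        for &val in segment {
            state.collect(val);
        }
        merged.merge(state.into_intermediate());
    }
    merged.into_final()
}
```

The intermediate form exists because a finished average cannot be merged correctly. For two segments holding `[1.0, 2.0]` and `[6.0]`, averaging the per-segment averages would give 3.75, whereas merging sums and counts first gives the correct 3.0.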