PSeitz
e522163a1c
use json in agg tests ( #1998 )
...
* switch to JSON in tests, add flat aggregation types
* use method
* clippy
* remove commented file
2023-04-17 14:08:48 +02:00
PSeitz
41af70799d
add percentiles aggregations ( #1984 )
...
* add percentiles aggregations
add percentiles aggregation
fix disabled agg benchmark
* Update src/aggregation/metric/percentiles.rs
Co-authored-by: Paul Masurel <paul@quickwit.io >
* Apply suggestions from code review
Co-authored-by: Paul Masurel <paul@quickwit.io >
* fix import
* fix import
---------
Co-authored-by: Paul Masurel <paul@quickwit.io >
2023-04-07 07:18:28 +02:00
PSeitz
5c4ea6a708
tokenizer option on text fastfield ( #1945 )
...
* tokenizer option on text fastfield
allow to set tokenizer option on text fastfield (fixes #1901 )
handle PreTokenized strings in fast field
* change visibility
* remove custom de/serialization
2023-03-31 10:03:38 +02:00
PSeitz
8f7f1d6be4
add Display for ByteCount ( #1949 )
...
* add Display for ByteCount
* export missing AggregationLimits
2023-03-21 08:02:35 +01:00
PSeitz
9e2faecf5b
add memory limit for aggregations ( #1942 )
...
* add memory limit for aggregations
introduce AggregationLimits to set memory consumption limit and bucket limits
memory limit is checked during aggregation, bucket limit is checked before returning the aggregation request.
* Apply suggestions from code review
Co-authored-by: Paul Masurel <paul@quickwit.io >
* add ByteCount with human readable format
---------
Co-authored-by: Paul Masurel <paul@quickwit.io >
2023-03-16 06:21:07 +01:00
PSeitz
2fb3740cb0
handle missing column for aggs ( #1920 )
...
* handle missing column for aggs
add empty column fallback for missing column in aggs.
Fix sort for term agg on sub-agg with missing value (null is smallest)
* add error when field is not fast
2023-03-15 06:09:59 +01:00
PSeitz
61cfd8dc57
fix clippy ( #1927 )
2023-03-13 03:12:02 +01:00
PSeitz
ca20bfa776
add date_histogram ( #1900 )
...
* add date_histogram
* add return result
2023-03-02 05:17:35 +01:00
Paul Masurel
7fae4d98d7
Adapting for quickwit2 ( #1912 )
...
* Adapting tantivy to make it possible to be plugged to quickwit.
* Apply suggestions from code review
Co-authored-by: PSeitz <PSeitz@users.noreply.github.com >
* Added unit test
---------
Co-authored-by: PSeitz <PSeitz@users.noreply.github.com >
2023-03-01 16:27:46 +09:00
PSeitz
e510f699c8
feat: add support for u64,i64,f64 fields in term aggregation ( #1883 )
...
* feat: add support for u64,i64,f64 fields in term aggregation
* hash enum values
* fix build
* Apply suggestions from code review
Co-authored-by: Paul Masurel <paul@quickwit.io >
---------
Co-authored-by: Paul Masurel <paul@quickwit.io >
2023-02-27 15:04:41 +08:00
PSeitz
c7278b3258
remove schema in aggs ( #1888 )
...
* switch to ColumnType, move tests
* remove Schema dependency in agg
2023-02-22 04:50:28 +01:00
PSeitz
74bf60b4f7
implement SegmentAggregationCollector on bucket aggs ( #1878 )
2023-02-17 12:53:29 +01:00
PSeitz
019db10e8e
refactor aggregations ( #1875 )
...
* add specialized version for full cardinality
Pre Columnar
test aggregation::tests::bench::bench_aggregation_average_u64 ... bench: 6,681,850 ns/iter (+/- 1,217,385)
test aggregation::tests::bench::bench_aggregation_average_u64_and_f64 ... bench: 10,576,327 ns/iter (+/- 494,380)
Current
test aggregation::tests::bench::bench_aggregation_average_u64 ... bench: 11,562,084 ns/iter (+/- 3,678,682)
test aggregation::tests::bench::bench_aggregation_average_u64_and_f64 ... bench: 18,925,790 ns/iter (+/- 17,616,771)
Post Change
test aggregation::tests::bench::bench_aggregation_average_u64 ... bench: 9,123,811 ns/iter (+/- 399,720)
test aggregation::tests::bench::bench_aggregation_average_u64_and_f64 ... bench: 13,111,825 ns/iter (+/- 273,547)
* refactor aggregation collection
* add buffering collector
2023-02-16 13:15:16 +01:00
PSeitz
347614c841
test error for avg agg on ip field ( #1873 )
...
closes #1835
2023-02-14 23:22:56 +08:00
PSeitz
7a9befd18d
fix sort order test for term aggregation ( #1858 )
...
fix sort order test for term aggregation
fix invalid request test
2023-02-10 10:26:58 +01:00
Paul Masurel
bd5eea9852
Integrated columnar work.
2023-02-09 13:14:31 +01:00
PSeitz
a2ca12995e
update aggregation docs ( #1807 )
2023-01-19 09:52:47 +01:00
Adrien Guillo
f2dad194ea
Add count, min, max, and sum aggregations
2023-01-16 12:22:20 -05:00
Adam Reichold
8312c882a5
More cosmetic fixes for upcoming Clippy lints. ( #1771 )
2023-01-10 10:32:45 +01:00
PSeitz
f9171a3981
fix clippy ( #1725 )
...
* fix clippy
* fix clippy fastfield codecs
* fix clippy bitpacker
* fix clippy common
* fix clippy stacker
* fix clippy sstable
* fmt
2022-12-20 07:30:06 +01:00
PSeitz
ee1f2c1f28
add aggregation support for date type ( #1693 )
...
* add aggregation support for date type
fixes #1332
* serialize key_as_string as rfc3339 in date histogram
* update docs
* enable date for range aggregation
2022-11-28 09:12:08 +09:00
Pascal Seitz
952b048341
add term aggregation clarification
2022-10-14 16:12:19 +08:00
Bruce Mitchener
cf02e32578
Improvements to doc linking, grammar, etc.
2022-09-19 18:10:22 +07:00
PSeitz
45924711fd
improve docs ( #1514 )
...
fix link alias after https://github.com/rust-lang/rustfmt/pull/5262 has been merged and released.
fix dead links
2022-09-08 22:33:59 +09:00
Paul Masurel
26876d41d7
Moving the serialization logic to the fastfield_codecs crate.
2022-09-03 00:29:52 +09:00
k-yomo
704d0a8d8b
Refactor range aggregation tests
2022-07-28 06:31:25 +09:00
k-yomo
9b6b60cc2b
Remove unnecessary keyed parameter setting
2022-07-27 18:43:52 +09:00
k-yomo
6444516a82
Use serde default for the keyed params
2022-07-27 01:12:56 +09:00
k-yomo
a9b0d1a0ab
Fix aggregation examples
2022-07-26 18:54:27 +09:00
k-yomo
2b333ca635
Fix keyed param type in the comment
2022-07-26 18:35:01 +09:00
k-yomo
5ab5f070ed
Fix to use bool directly for the keyed parameter
2022-07-26 18:18:38 +09:00
k-yomo
5b564916f0
Add support for keyed parameter in range and histogram aggregations
2022-07-26 04:28:21 +09:00
PSeitz
db1836691e
fix visibility ( #1398 )
2022-06-28 16:21:39 +09:00
Pascal Seitz
44ea7313ca
set max bucket size as parameter
2022-05-13 13:21:52 +08:00
Pascal Seitz
11ac451250
abort aggregation when too many buckets are created
...
Validation happens on different phases depending on the aggregation
Term: During segment collection
Histogram: At the end, when converting into intermediate buckets (we preallocate empty buckets for the range). Revisit after #1370
Range: When validating the request
update CHANGELOG
2022-05-12 12:26:43 +08:00
Pascal Seitz
3f88718f38
refactor aggregations
2022-05-12 12:26:43 +08:00
Pascal Seitz
d11a8cce26
minor docs fix
2022-05-06 17:52:36 +08:00
Pascal Seitz
bc607a921b
add alias shard_size split_size for quickwit
...
improve some docs
2022-05-06 17:52:36 +08:00
Pascal Seitz
c45eb9a9fa
improve readability, add json test
2022-04-26 11:22:34 +08:00
Pascal Seitz
1be6c6111c
support order property on term aggregations
...
support order property on term aggregations
order can be by doc_count, key, or a metric sub_aggregation
2022-04-20 00:34:38 +08:00
Pascal Seitz
ec69875d15
fix collecting term_dict field names
...
fix collecting term_dict field names for sub_aggregations, minor refactoring
2022-04-15 17:49:20 +08:00
Pascal Seitz
ab6b532cc4
add comments
2022-04-14 12:06:36 +08:00
Pascal Seitz
902d05ebec
refactor getffreader function
2022-04-13 19:51:18 +08:00
Pascal Seitz
46724b4a05
add segment_size, add get term dict fields, add tests
2022-04-13 19:51:18 +08:00
Pascal Seitz
24432bf523
add term aggregation
2022-04-13 19:51:18 +08:00
Pascal Seitz
8807bfd13d
fast field on string
...
enables FAST on string fields, which creates a fastfield containing the term ordinals
2022-03-29 12:40:10 +08:00
Paul Masurel
46d5de920d
Removes all usage of block_on, and use a oneshot channel instead. ( #1315 )
...
* Removes all usage of block_on, and use a oneshot channel instead.
Calling `block_on` panics in certain context.
For instance, it panics when it is called in the context of another
call to block.
Using it in tantivy is unnecessary. We replace it by a thin wrapper
around a oneshot channel that supports both async/sync.
* Removing needless uses of async in the API.
Co-authored-by: PSeitz <PSeitz@users.noreply.github.com >
2022-03-18 16:54:58 +09:00
PSeitz
141b9aa245
Merge pull request #1306 from PSeitz/histogram
...
add Histogram aggregation
2022-03-18 05:03:46 +01:00
Pascal Seitz
aa391bf843
refactor parameters
2022-03-17 16:28:37 +08:00
Pascal Seitz
47dcbdbeae
handle empty results, empty indices, add tests
2022-03-17 10:24:34 +08:00