Compare commits

...

34 Commits

Author SHA1 Message Date
cong.xie
e9c6383b8c fix: adapt composite aggregation for Quickwit compatibility
Fixes on top of PR #2714 ("Add composite aggregation") to make it work
with Quickwit's current codebase and Postcard serialization:

- Rewrite SegmentCompositeCollector to match current
  SegmentAggregationCollector trait signatures (collect,
  add_intermediate_aggregation_result, prepare_max_bucket)
- Remove Clone derive from CompositeBucketCollector (incompatible
  with dyn SegmentAggregationCollector)
- Add custom serde for FxHashMap entries in
  IntermediateCompositeBucketResult (Postcard requires known sequence
  length)
- Rewrite AfterKey Serialize/Deserialize to output raw values instead
  of internal "type:value" format, matching Elasticsearch wire format
- Remove unused imports and tracing::warn calls

Made-with: Cursor
2026-03-13 15:38:25 -04:00
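The custom serde mentioned in the commit above can be illustrated with a short sketch: Postcard is not self-describing, so map-like data has to be written as a length-prefixed sequence. The `Entry` struct and `serialize_buckets` helper below are hypothetical stand-ins (with a plain `HashMap` in place of `FxHashMap`), not the actual `IntermediateCompositeBucketResult` code.

```rust
use std::collections::HashMap;

use serde::ser::{SerializeSeq, Serializer};
use serde::Serialize;

// Hypothetical bucket entry; the real intermediate result carries more state.
#[derive(Serialize)]
struct Entry<'a> {
    key: &'a str,
    doc_count: u64,
}

fn serialize_buckets<S: Serializer>(
    buckets: &HashMap<String, u64>,
    serializer: S,
) -> Result<S::Ok, S::Error> {
    // Declare the exact number of entries up front so that non-self-describing
    // formats like Postcard know how many elements to read back.
    let mut seq = serializer.serialize_seq(Some(buckets.len()))?;
    for (key, doc_count) in buckets {
        seq.serialize_element(&Entry {
            key: key.as_str(),
            doc_count: *doc_count,
        })?;
    }
    seq.end()
}
```

Deserialization mirrors this with a `SeqAccess` visitor that rebuilds the map entry by entry.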
Remi Dettai
d662415b81 Add composite aggregation 2026-03-13 10:32:41 -04:00
cong.xie
2dc4e9ef78 fix: resolve remaining clippy errors in ddsketch
- Replace approximate PI/E constants with non-famous value in test
- Fix reversed empty range (2048..0) → (0..2048).rev() in store test

Co-authored-by: Cursor <cursoragent@cursor.com>
2026-02-18 15:54:27 -05:00
cong.xie
aeea65f61d refactor: rewrite encoding.rs with idiomatic Rust
- Replace bare constants with FlagType and BinEncodingMode enums
- Use const fn for flag byte construction instead of raw bit ops
- Replace if-else chain with nested match in decode_from_java_bytes
- Use split_first() in read_byte for idiomatic slice consumption
- Use split_at in read_f64_le to avoid TryInto on edition 2018
- Use u64::from(next) instead of `next as u64` casts
- Extract assert_golden, assert_quantiles_match, bytes_to_hex helpers
  to reduce duplication across golden byte tests
- Fix edition-2018 assert! format string compatibility
- Clean up is_valid_flag_byte with let-else and match

Co-authored-by: Cursor <cursoragent@cursor.com>
2026-02-18 15:49:12 -05:00
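The slice-reading style described in this commit looks roughly like the sketch below; the `read_byte`/`read_f64_le` names follow the bullet points, but the exact signatures in `encoding.rs` are assumptions.

```rust
// Consume one byte from the front of the slice without manual indexing.
fn read_byte(input: &mut &[u8]) -> Option<u8> {
    let (first, rest) = input.split_first()?;
    *input = rest;
    Some(*first)
}

// Read a little-endian f64; split_at + copy_from_slice sidesteps TryInto,
// which keeps the code compatible with edition 2018.
fn read_f64_le(input: &mut &[u8]) -> Option<f64> {
    if input.len() < 8 {
        return None;
    }
    let (bytes, rest) = input.split_at(8);
    let mut buf = [0u8; 8];
    buf.copy_from_slice(bytes);
    *input = rest;
    Some(f64::from_le_bytes(buf))
}

fn main() {
    // 0x3FF0000000000000 is 1.0f64; little-endian byte order below.
    let mut data: &[u8] = &[0x2A, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0xF0, 0x3F];
    assert_eq!(read_byte(&mut data), Some(0x2A));
    assert_eq!(read_f64_le(&mut data), Some(1.0));
}
```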
cong.xie
4211d5a1ed fix: resolve clippy warnings in vendored sketches-ddsketch
- manual_range_contains: use !(0.0..=1.0).contains(&q)
- identity_op: simplify (0 << 2) | FLAG_TYPE to just FLAG_TYPE
- manual_clamp: use .clamp(0, 8) instead of .max(0).min(8)
- manual_repeat_n: use repeat_n() instead of repeat().take()
- cast_abs_to_unsigned: use .unsigned_abs() instead of .abs() as usize

Co-authored-by: Cursor <cursoragent@cursor.com>
2026-02-18 13:36:06 -05:00
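Before/after forms of those lints, as a standalone sketch; the constant and variable names are placeholders, not the vendored ddsketch code.

```rust
const FLAG_TYPE: u8 = 0b01;

fn clippy_examples(q: f64, v: i64) {
    // manual_range_contains
    let _before = q < 0.0 || q > 1.0;
    let _after = !(0.0..=1.0).contains(&q);

    // identity_op
    let _before = (0 << 2) | FLAG_TYPE;
    let _after = FLAG_TYPE;

    // manual_clamp
    let _before = v.max(0).min(8);
    let _after = v.clamp(0, 8);

    // manual_repeat_n
    let _before: Vec<u8> = std::iter::repeat(0u8).take(4).collect();
    let _after: Vec<u8> = std::iter::repeat_n(0u8, 4).collect();

    // cast_abs_to_unsigned
    let _before = v.abs() as u64;
    let _after = v.unsigned_abs();
}
```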
cong.xie
d50c7a1daf Add Java source links for cross-language alignment comments
Reference the exact Java source files in DataDog/sketches-java for
Config::new(), Config::key(), Config::value(), Config::from_gamma(),
and Store::add_count() so readers can verify the alignment.

Co-authored-by: Cursor <cursoragent@cursor.com>
2026-02-18 13:25:12 -05:00
cong.xie
cf760fd5b6 fix: remove internal reference from code comment
Co-authored-by: Cursor <cursoragent@cursor.com>
2026-02-18 12:59:25 -05:00
cong.xie
df04c7d8f1 fix: rustfmt nightly formatting for vendored sketches-ddsketch
Co-authored-by: Cursor <cursoragent@cursor.com>
2026-02-18 12:53:01 -05:00
cong.xie
68626bf3a1 Vendor sketches-ddsketch with Java-compatible binary encoding
Fork sketches-ddsketch as a workspace member to add native Java binary
serialization (to_java_bytes/from_java_bytes) for DDSketch. This enables
pomsky to return raw DDSketch bytes that event-query can deserialize via
DDSketchWithExactSummaryStatistics.decode().

Key changes:
- Vendor sketches-ddsketch crate with encoding.rs implementing VarEncoding,
  flag bytes, and INDEX_DELTAS_AND_COUNTS store format
- Align Config::key() to floor-based indexing matching Java's LogarithmicMapping
- Add PercentilesCollector::to_sketch_bytes() for pomsky integration
- Cross-language golden byte tests verified byte-identical with Java output

Co-authored-by: Cursor <cursoragent@cursor.com>
2026-02-18 11:36:21 -05:00
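A hypothetical usage sketch of the vendored crate's Java round-trip: only the method names to_java_bytes/from_java_bytes come from the commit message; the return types and error handling below are assumptions.

```rust
use sketches_ddsketch::{Config, DDSketch};

fn java_roundtrip() {
    let mut sketch = DDSketch::new(Config::defaults());
    for v in [0.5, 1.0, 2.5] {
        sketch.add(v);
    }
    // Bytes in the flag-byte / INDEX_DELTAS_AND_COUNTS format that the Java
    // DDSketchWithExactSummaryStatistics.decode() understands (assumed API).
    let bytes = sketch.to_java_bytes();
    let decoded = DDSketch::from_java_bytes(&bytes).expect("well-formed encoding");
    assert_eq!(decoded.count(), sketch.count());
}
```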
cong.xie
7eca33143e Remove Datadog-specific references from comments
This is an open-source repo — replace references to Datadog's event query
with generic cross-language compatibility descriptions.
2026-02-12 11:44:42 -05:00
cong.xie
698f073f88 fix fmt 2026-02-11 15:52:39 -05:00
cong.xie
cdd24b7ee5 Replace hyperloglogplus with Apache DataSketches HLL (lg_k=11)
Switch tantivy's cardinality aggregation from the hyperloglogplus crate
(HyperLogLog++ with p=16) to the official Apache DataSketches HLL
implementation (datasketches crate v0.2.0 with lg_k=11, Hll4).

This enables returning raw HLL sketch bytes from pomsky to Datadog's
event query, where they can be properly deserialized and merged using
the same DataSketches library (Java). The previous implementation
required pomsky to fabricate fake HLL sketches from scalar cardinality
estimates, which produced incorrect results when merged.

Changes:
- Cargo.toml: hyperloglogplus 0.4.1 -> datasketches 0.2.0
- CardinalityCollector: HyperLogLogPlus<u64, BuildSaltedHasher> -> HllSketch
- Custom Serde impl using HllSketch binary format (cross-shard compat)
- New to_sketch_bytes() for external consumers (pomsky)
- Salt preserved via (salt, value) tuple hashing for column type disambiguation
- Removed BuildSaltedHasher struct
- Added 4 new unit tests (serde roundtrip, merge, binary compat, salt)
2026-02-11 08:49:46 -05:00
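The `(salt, value)` tuple hashing mentioned above can be sketched with the standard library alone; the hasher choice and how the resulting 64-bit hash is fed into the HLL sketch are assumptions, not the actual CardinalityCollector code.

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

// Hash a (salt, value) pair so the same numeric value indexed under
// different column types lands in different HLL entries.
fn salted_hash(salt: u8, value: u64) -> u64 {
    let mut hasher = DefaultHasher::new();
    (salt, value).hash(&mut hasher);
    hasher.finish()
}

fn main() {
    // Same value, different salts -> (almost surely) different hashes.
    assert_ne!(salted_hash(0, 42), salted_hash(1, 42));
}
```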
trinity-1686a
5562ce6037 Merge pull request #2818 from Darkheir/fix/query_grammar_regex_between_parentheses 2026-02-11 11:39:58 +01:00
Metin Dumandag
09b6ececa7 Export fields of the PercentileValuesVecEntry (#2833)
Otherwise, there is no way to access these fields when not using the
json serialized form of the aggregation results.

This simple data struct is part of the public api,
so its fields should be accessible as well.
2026-02-11 11:31:07 +01:00
Moe
8018016e46 feat: add fast field support for Bytes type (#100) (#2830)
## What

Enable range queries and TopN sorting on `Bytes` fast fields, bringing them to parity with `Str` fields.

## Why

`BytesColumn` uses the same dictionary encoding as `StrColumn` internally, but range queries and TopN sorting were explicitly disabled for `Bytes`. This prevented use cases like storing lexicographically sortable binary data (e.g., arbitrary-precision decimals) that need efficient range filtering.

## How

1. **Enable range queries for Bytes** - Changed `is_type_valid_for_fastfield_range_query()` to return `true` for `Type::Bytes`
2. **Add BytesColumn handling in scorer** - Added a branch in `FastFieldRangeWeight::scorer()` to handle bytes fields using dictionary ordinal lookup (mirrors the existing `StrColumn` logic)
3. **Add SortByBytes** - New sort key computer for TopN queries on bytes columns

## Tests

- `test_bytes_field_ff_range_query` - Tests inclusive/exclusive bounds and unbounded ranges
- `test_sort_by_bytes_asc` / `test_sort_by_bytes_desc` - Tests lexicographic ordering in both directions
2026-02-11 11:26:18 +01:00
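The sort of query this change enables can be sketched as follows; only the query construction is shown, the field setup is assumed, and this mirrors the existing `Str` path rather than quoting the new test code.

```rust
use std::ops::Bound;

use tantivy::query::RangeQuery;
use tantivy::schema::Field;
use tantivy::Term;

// Build an inclusive range query over a Bytes fast field, e.g. for
// lexicographically sortable binary-encoded decimals.
fn bytes_range_query(field: Field) -> RangeQuery {
    let lower = Term::from_field_bytes(field, &[0x10, 0x00]);
    let upper = Term::from_field_bytes(field, &[0x20, 0xFF]);
    RangeQuery::new(Bound::Included(lower), Bound::Included(upper))
}
```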
trinity-1686a
6bf185dc3f Merge pull request #2829 from quickwit-oss/cong.xie/add-intermediate-accessors 2026-02-10 17:07:24 +01:00
cong.xie
bb141abe22 feat(aggregation): add keys() accessor to IntermediateAggregationResults 2026-02-09 15:38:35 -05:00
cong.xie
f1c29ba972 resolve conflict 2026-02-06 14:23:11 -05:00
cong.xie
ae0554a6a5 feat(aggregation): add public accessors for intermediate aggregation results
Add accessor methods to allow external crates to read intermediate
aggregation results without accessing pub(crate) fields:

- IntermediateAggregationResults: get(), remove()
- IntermediateTermBucketResult: entries(), sum_other_doc_count(), doc_count_error_upper_bound()
- IntermediateAverage: stats()
- IntermediateStats: count(), sum()
- IntermediateKey: Display impl for string conversion
2026-02-06 11:12:20 -05:00
cong.xie
0d7abe5d23 feat(aggregation): add public accessors for intermediate aggregation results
Add accessor methods to allow external crates to read intermediate
aggregation results without accessing pub(crate) fields:

- IntermediateAggregationResults: get(), get_mut(), remove()
- IntermediateTermBucketResult: entries(), sum_other_doc_count(), doc_count_error_upper_bound()
- IntermediateAverage: stats()
- IntermediateStats: count(), sum()
- IntermediateKey: Display impl for string conversion
2026-02-06 10:28:59 -05:00
PSeitz
28db952131 Add regex search and merge segments benchmark (#2826)
* add merge_segments benchmark

* add regex search bench
2026-02-02 17:28:02 +01:00
PSeitz
98ebbf922d faster exclude queries (#2825)
* faster exclude queries

Faster exclude queries with multiple terms.

Changes `Exclude` to be able to exclude multiple DocSets, instead of
putting the docsets into a union.
Use `seek_danger` in `Exclude`.

closes #2822

* replace unwrap with match
2026-01-30 17:06:41 +01:00
Paul Masurel
4a89e74597 Fix rfc3339 typos and add Claude Code skills (#2823)
Closes #2817
2026-01-30 12:00:28 +01:00
Alex Lazar
4d99e51e50 Bump oneshot to 0.1.13 per dependabot (#2821) 2026-01-30 11:42:01 +01:00
Darkheir
a55e4069e4 feat(query-grammar): Apply PR review suggestions
Signed-off-by: Darkheir <raphael.cohen@sekoia.io>
2026-01-28 14:13:55 +01:00
Darkheir
1fd30c62be fix(query-grammar): Fix regexes between parentheses
Signed-off-by: Darkheir <raphael.cohen@sekoia.io>
2026-01-28 10:37:51 +01:00
trinity-1686a
9b619998bd Merge pull request #2816 from evance-br/fix-closing-paren-elastic-range 2026-01-27 17:00:08 +01:00
Evance Soumaoro
765c448945 uncomment commented code when testing 2026-01-27 13:19:41 +00:00
Evance Soumaoro
943594ebaa uncomment commented code when testing 2026-01-27 13:08:38 +00:00
Evance Soumaoro
df17daae0d fix closing parenthesis error on elastic range queries for lenient parser 2026-01-27 13:01:14 +00:00
Paul Masurel
0ae94baef5 Remove temp file (#2815)
Co-authored-by: Paul Masurel <paul.masurel@datadoghq.com>
2026-01-27 09:22:11 +01:00
Paul Masurel
3f448ecf79 Bugfix on intersection. (#2812)
The intersection algorithm made it possible to call .seek(..) with values
lower than the current doc id, breaking the DocSet contract.

The fix removes the optimization that caused left.seek(..) to be replaced
by a simpler left.advance(..).

Simply doing so led to a performance regression.
I therefore integrated that idea within SegmentPostings.seek.

We now attempt to check the next doc systematically on seek,
PROVIDED the block is already loaded.

Closes #2811

Co-authored-by: Paul Masurel <paul.masurel@datadoghq.com>
2026-01-27 09:21:09 +01:00
Paul Masurel
b86caeefe2 Major bugfix in intersection
A bug was added with the `seek_into_the_danger_zone()` optimization

(Spotted and fixed by Stu)

The contract says seek_into_the_danger_zone returns true if the target doc is part of the docset.

The blanket implementation goes like this.

```
let current_doc = self.doc();
if current_doc < target {
     self.seek(target);
}
self.doc() == target
```

So it will return true if target is TERMINATED, even though TERMINATED does not actually belong to the docset.


The fix tries to clarify the contracts and fixes the intersection algorithm.
We observe a small but across-the-board improvement in intersection performance.
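A minimal sketch of the guard implied by the clarified contract (illustrative only, not the actual fix, assuming TERMINATED is the sentinel doc id):

```
let current_doc = self.doc();
if current_doc < target {
     self.seek(target);
}
// Only a real document counts as a member; TERMINATED never does.
self.doc() == target && target != TERMINATED
```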

---------

Co-authored-by: Stu Hood <stuhood@gmail.com>
Co-authored-by: Paul Masurel <paul.masurel@datadoghq.com>
2026-01-23 18:44:10 +01:00
ChangRui-Ryan
abf1e64f4d add benchmark for string search and get (#2795) 2026-01-19 11:50:41 +01:00
74 changed files with 8888 additions and 277 deletions

View File

@@ -0,0 +1,125 @@
---
name: rationalize-deps
description: Analyze Cargo.toml dependencies and attempt to remove unused features to reduce compile times and binary size
---
# Rationalize Dependencies
This skill analyzes Cargo.toml dependencies to identify and remove unused features.
## Overview
Many crates enable features by default that may not be needed. This skill:
1. Identifies dependencies with default features enabled
2. Tests if `default-features = false` works
3. Identifies which specific features are actually needed
4. Verifies compilation after changes
## Step 1: Identify the target
Ask the user which crate(s) to analyze:
- A specific crate name (e.g., "tokio", "serde")
- A specific workspace member (e.g., "quickwit-search")
- "all" to scan the entire workspace
## Step 2: Analyze current dependencies
For the workspace Cargo.toml (`quickwit/Cargo.toml`), list dependencies that:
- Do NOT have `default-features = false`
- Have default features that might be unnecessary
Run: `cargo tree -p <crate> -f "{p} {f}" --edges features` to see what features are actually used.
## Step 3: For each candidate dependency
### 3a: Check the crate's default features
Look up the crate on crates.io or check its Cargo.toml to understand:
- What features are enabled by default
- What each feature provides
Use: `cargo metadata --format-version=1 | jq '.packages[] | select(.name == "<crate>") | .features'`
### 3b: Try disabling default features
Modify the dependency in `quickwit/Cargo.toml`:
From:
```toml
some-crate = { version = "1.0" }
```
To:
```toml
some-crate = { version = "1.0", default-features = false }
```
### 3c: Run cargo check
Run: `cargo check --workspace` (or target specific packages for faster feedback)
If compilation fails:
1. Read the error messages to identify which features are needed
2. Add only the required features explicitly:
```toml
some-crate = { version = "1.0", default-features = false, features = ["needed-feature"] }
```
3. Re-run cargo check
### 3d: Binary search for minimal features
If there are many default features, use binary search:
1. Start with no features
2. If it fails, add half the default features
3. Continue until you find the minimal set
## Step 4: Document findings
For each dependency analyzed, report:
- Original configuration
- New configuration (if changed)
- Features that were removed
- Any features that are required
## Step 5: Verify full build
After all changes, run:
```bash
cargo check --workspace --all-targets
cargo test --workspace --no-run
```
## Common Patterns
### Serde
Often only needs `derive`:
```toml
serde = { version = "1.0", default-features = false, features = ["derive", "std"] }
```
### Tokio
Identify which runtime features are actually used:
```toml
tokio = { version = "1.0", default-features = false, features = ["rt-multi-thread", "macros", "sync"] }
```
### Reqwest
Often doesn't need all TLS backends:
```toml
reqwest = { version = "0.11", default-features = false, features = ["rustls-tls", "json"] }
```
## Rollback
If changes cause issues:
```bash
git checkout quickwit/Cargo.toml
cargo check --workspace
```
## Tips
- Start with large crates that have many default features (tokio, reqwest, hyper)
- Use `cargo bloat --crates` to identify large dependencies
- Check `cargo tree -d` for duplicate dependencies that might indicate feature conflicts
- Some features are needed only for tests - consider using `[dev-dependencies]` features

View File

@@ -0,0 +1,60 @@
---
name: simple-pr
description: Create a simple PR from staged changes with an auto-generated commit message
disable-model-invocation: true
---
# Simple PR
Follow these steps to create a simple PR from staged changes:
## Step 1: Check workspace state
Run: `git status`
Verify that all changes have been staged (no unstaged changes). If there are unstaged changes, abort and ask the user to stage their changes first with `git add`.
Also verify that we are on the `main` branch. If not, abort and ask the user to switch to main first.
## Step 2: Ensure main is up to date
Run: `git pull origin main`
This ensures we're working from the latest code.
## Step 3: Review staged changes
Run: `git diff --cached`
Review the staged changes to understand what the PR will contain.
## Step 4: Generate commit message
Based on the staged changes, generate a concise commit message (1-2 sentences) that describes the "why" rather than the "what".
Display the proposed commit message to the user and ask for confirmation before proceeding.
## Step 5: Create a new branch
Get the git username: `git config user.name | tr ' ' '-' | tr '[:upper:]' '[:lower:]'`
Create a short, descriptive branch name based on the changes (e.g., `fix-typo-in-readme`, `add-retry-logic`, `update-deps`).
Create and checkout the branch: `git checkout -b {username}/{short-descriptive-name}`
## Step 6: Commit changes
Commit with the message from step 4:
```
git commit -m "{commit-message}"
```
## Step 7: Push and open a PR
Push the branch and open a PR:
```
git push -u origin {branch-name}
gh pr create --title "{commit-message-title}" --body "{longer-description-if-needed}"
```
Report the PR URL to the user when complete.

View File

@@ -15,7 +15,7 @@ rust-version = "1.85"
exclude = ["benches/*.json", "benches/*.txt"]
[dependencies]
oneshot = "0.1.7"
oneshot = "0.1.13"
base64 = "0.22.0"
byteorder = "1.4.3"
crc32fast = "1.3.2"
@@ -64,8 +64,8 @@ query-grammar = { version = "0.25.0", path = "./query-grammar", package = "tanti
tantivy-bitpacker = { version = "0.9", path = "./bitpacker" }
common = { version = "0.10", path = "./common/", package = "tantivy-common" }
tokenizer-api = { version = "0.6", path = "./tokenizer-api", package = "tantivy-tokenizer-api" }
sketches-ddsketch = { version = "0.3.0", features = ["use_serde"] }
hyperloglogplus = { version = "0.4.1", features = ["const-loop"] }
sketches-ddsketch = { path = "./sketches-ddsketch", features = ["use_serde"] }
datasketches = "0.2.0"
futures-util = { version = "0.3.28", optional = true }
futures-channel = { version = "0.3.28", optional = true }
fnv = "1.0.7"
@@ -144,6 +144,7 @@ members = [
"sstable",
"tokenizer-api",
"columnar",
"sketches-ddsketch",
]
# Following the "fail" crate best practises, we isolate
@@ -189,3 +190,16 @@ harness = false
[[bench]]
name = "bool_queries_with_range"
harness = false
[[bench]]
name = "str_search_and_get"
harness = false
[[bench]]
name = "merge_segments"
harness = false
[[bench]]
name = "regex_all_terms"
harness = false

View File

@@ -1,5 +1,6 @@
use binggan::plugins::PeakMemAllocPlugin;
use binggan::{black_box, InputGroup, PeakMemAlloc, INSTRUMENTED_SYSTEM};
use common::DateTime;
use rand::distr::weighted::WeightedIndex;
use rand::rngs::StdRng;
use rand::seq::IndexedRandom;
@@ -70,6 +71,12 @@ fn bench_agg(mut group: InputGroup<Index>) {
register!(group, terms_many_json_mixed_type_with_avg_sub_agg);
register!(group, composite_term_many_page_1000);
register!(group, composite_term_many_page_1000_with_avg_sub_agg);
register!(group, composite_term_few);
register!(group, composite_histogram);
register!(group, composite_histogram_calendar);
register!(group, cardinality_agg);
register!(group, terms_status_with_cardinality_agg);
@@ -313,6 +320,75 @@ fn terms_many_json_mixed_type_with_avg_sub_agg(index: &Index) {
});
execute_agg(index, agg_req);
}
fn composite_term_few(index: &Index) {
let agg_req = json!({
"my_ctf": {
"composite": {
"sources": [
{ "text_few_terms": { "terms": { "field": "text_few_terms" } } }
],
"size": 1000
}
},
});
execute_agg(index, agg_req);
}
fn composite_term_many_page_1000(index: &Index) {
let agg_req = json!({
"my_ctmp1000": {
"composite": {
"sources": [
{ "text_many_terms": { "terms": { "field": "text_many_terms" } } }
],
"size": 1000
}
},
});
execute_agg(index, agg_req);
}
fn composite_term_many_page_1000_with_avg_sub_agg(index: &Index) {
let agg_req = json!({
"my_ctmp1000wasa": {
"composite": {
"sources": [
{ "text_many_terms": { "terms": { "field": "text_many_terms" } } }
],
"size": 1000,
},
"aggs": {
"average_f64": { "avg": { "field": "score_f64" } }
}
},
});
execute_agg(index, agg_req);
}
fn composite_histogram(index: &Index) {
let agg_req = json!({
"my_ch": {
"composite": {
"sources": [
{ "f64_histogram": { "histogram": { "field": "score_f64", "interval": 1 } } }
],
"size": 1000
}
},
});
execute_agg(index, agg_req);
}
fn composite_histogram_calendar(index: &Index) {
let agg_req = json!({
"my_chc": {
"composite": {
"sources": [
{ "time_histogram": { "date_histogram": { "field": "timestamp", "calendar_interval": "month" } } }
],
"size": 1000
}
},
});
execute_agg(index, agg_req);
}
fn execute_agg(index: &Index, agg_req: serde_json::Value) {
let agg_req: Aggregations = serde_json::from_value(agg_req).unwrap();
@@ -504,6 +580,7 @@ fn get_test_index_bench(cardinality: Cardinality) -> tantivy::Result<Index> {
let score_field = schema_builder.add_u64_field("score", score_fieldtype.clone());
let score_field_f64 = schema_builder.add_f64_field("score_f64", score_fieldtype.clone());
let score_field_i64 = schema_builder.add_i64_field("score_i64", score_fieldtype);
let date_field = schema_builder.add_date_field("timestamp", FAST);
// use tmp dir
let index = if reuse_index {
Index::create_in_dir("agg_bench", schema_builder.build())?
@@ -593,6 +670,7 @@ fn get_test_index_bench(cardinality: Cardinality) -> tantivy::Result<Index> {
score_field => val as u64,
score_field_f64 => lg_norm.sample(&mut rng),
score_field_i64 => val as i64,
date_field => DateTime::from_timestamp_millis((val * 1_000_000.) as i64),
))?;
if cardinality == Cardinality::OptionalSparse {
for _ in 0..20 {

benches/merge_segments.rs Normal file
View File

@@ -0,0 +1,224 @@
// Benchmarks segment merging
//
// Notes:
// - Input segments are kept intact (no deletes / no IndexWriter merge).
// - Output is written to a `NullDirectory` that discards all files except
// fieldnorms (needed for merging).
use std::collections::HashMap;
use std::io::{self, Write};
use std::path::{Path, PathBuf};
use std::sync::{Arc, RwLock};
use binggan::{black_box, BenchRunner};
use rand::prelude::*;
use rand::rngs::StdRng;
use rand::SeedableRng;
use tantivy::directory::error::{DeleteError, OpenReadError, OpenWriteError};
use tantivy::directory::{
AntiCallToken, Directory, FileHandle, OwnedBytes, TerminatingWrite, WatchCallback, WatchHandle,
WritePtr,
};
use tantivy::indexer::{merge_filtered_segments, NoMergePolicy};
use tantivy::schema::{Schema, TEXT};
use tantivy::{doc, HasLen, Index, IndexSettings, Segment};
#[derive(Clone, Default, Debug)]
struct NullDirectory {
blobs: Arc<RwLock<HashMap<PathBuf, OwnedBytes>>>,
}
struct NullWriter;
impl Write for NullWriter {
fn write(&mut self, buf: &[u8]) -> io::Result<usize> {
Ok(buf.len())
}
fn flush(&mut self) -> io::Result<()> {
Ok(())
}
}
impl TerminatingWrite for NullWriter {
fn terminate_ref(&mut self, _token: AntiCallToken) -> io::Result<()> {
Ok(())
}
}
struct InMemoryWriter {
path: PathBuf,
buffer: Vec<u8>,
blobs: Arc<RwLock<HashMap<PathBuf, OwnedBytes>>>,
}
impl Write for InMemoryWriter {
fn write(&mut self, buf: &[u8]) -> io::Result<usize> {
self.buffer.extend_from_slice(buf);
Ok(buf.len())
}
fn flush(&mut self) -> io::Result<()> {
Ok(())
}
}
impl TerminatingWrite for InMemoryWriter {
fn terminate_ref(&mut self, _token: AntiCallToken) -> io::Result<()> {
let bytes = OwnedBytes::new(std::mem::take(&mut self.buffer));
self.blobs.write().unwrap().insert(self.path.clone(), bytes);
Ok(())
}
}
#[derive(Debug, Default)]
struct NullFileHandle;
impl HasLen for NullFileHandle {
fn len(&self) -> usize {
0
}
}
impl FileHandle for NullFileHandle {
fn read_bytes(&self, _range: std::ops::Range<usize>) -> io::Result<OwnedBytes> {
unimplemented!()
}
}
impl Directory for NullDirectory {
fn get_file_handle(&self, path: &Path) -> Result<Arc<dyn FileHandle>, OpenReadError> {
if let Some(bytes) = self.blobs.read().unwrap().get(path) {
return Ok(Arc::new(bytes.clone()));
}
Ok(Arc::new(NullFileHandle))
}
fn delete(&self, _path: &Path) -> Result<(), DeleteError> {
Ok(())
}
fn exists(&self, _path: &Path) -> Result<bool, OpenReadError> {
Ok(true)
}
fn open_write(&self, path: &Path) -> Result<WritePtr, OpenWriteError> {
let path_buf = path.to_path_buf();
if path.to_string_lossy().ends_with(".fieldnorm") {
let writer = InMemoryWriter {
path: path_buf,
buffer: Vec::new(),
blobs: Arc::clone(&self.blobs),
};
Ok(io::BufWriter::new(Box::new(writer)))
} else {
Ok(io::BufWriter::new(Box::new(NullWriter)))
}
}
fn atomic_read(&self, path: &Path) -> Result<Vec<u8>, OpenReadError> {
if let Some(bytes) = self.blobs.read().unwrap().get(path) {
return Ok(bytes.as_slice().to_vec());
}
Err(OpenReadError::FileDoesNotExist(path.to_path_buf()))
}
fn atomic_write(&self, _path: &Path, _data: &[u8]) -> io::Result<()> {
Ok(())
}
fn sync_directory(&self) -> io::Result<()> {
Ok(())
}
fn watch(&self, _watch_callback: WatchCallback) -> tantivy::Result<WatchHandle> {
Ok(WatchHandle::empty())
}
}
struct MergeScenario {
#[allow(dead_code)]
index: Index,
segments: Vec<Segment>,
settings: IndexSettings,
label: String,
}
fn build_index(
num_segments: usize,
docs_per_segment: usize,
tokens_per_doc: usize,
vocab_size: usize,
) -> MergeScenario {
let mut schema_builder = Schema::builder();
let body = schema_builder.add_text_field("body", TEXT);
let schema = schema_builder.build();
let index = Index::create_in_ram(schema.clone());
assert!(vocab_size > 0);
let total_tokens = num_segments * docs_per_segment * tokens_per_doc;
let use_unique_terms = vocab_size >= total_tokens;
let mut rng = StdRng::from_seed([7u8; 32]);
let mut next_token_id: u64 = 0;
{
let mut writer = index.writer_with_num_threads(1, 256_000_000).unwrap();
writer.set_merge_policy(Box::new(NoMergePolicy));
for _ in 0..num_segments {
for _ in 0..docs_per_segment {
let mut tokens = Vec::with_capacity(tokens_per_doc);
for _ in 0..tokens_per_doc {
let token_id = if use_unique_terms {
let id = next_token_id;
next_token_id += 1;
id
} else {
rng.random_range(0..vocab_size as u64)
};
tokens.push(format!("term_{token_id}"));
}
writer.add_document(doc!(body => tokens.join(" "))).unwrap();
}
writer.commit().unwrap();
}
}
let segments = index.searchable_segments().unwrap();
let settings = index.settings().clone();
let label = format!(
"segments={}, docs/seg={}, tokens/doc={}, vocab={}",
num_segments, docs_per_segment, tokens_per_doc, vocab_size
);
MergeScenario {
index,
segments,
settings,
label,
}
}
fn main() {
let scenarios = vec![
build_index(8, 50_000, 12, 8),
build_index(16, 50_000, 12, 8),
build_index(16, 100_000, 12, 8),
build_index(8, 50_000, 8, 8 * 50_000 * 8),
];
let mut runner = BenchRunner::new();
for scenario in scenarios {
let mut group = runner.new_group();
group.set_name(format!("merge_segments inv_index — {}", scenario.label));
let segments = scenario.segments.clone();
let settings = scenario.settings.clone();
group.register("merge", move |_| {
let output_dir = NullDirectory::default();
let filter_doc_ids = vec![None; segments.len()];
let merged_index =
merge_filtered_segments(&segments, settings.clone(), filter_doc_ids, output_dir)
.unwrap();
black_box(merged_index);
});
group.run();
}
}

benches/regex_all_terms.rs Normal file
View File

@@ -0,0 +1,113 @@
// Benchmarks regex query that matches all terms in a synthetic index.
//
// Corpus model:
// - N unique terms: t000000, t000001, ...
// - M docs
// - K tokens per doc: doc i gets terms derived from (i, token_index)
//
// Query:
// - Regex "t.*" to match all terms
//
// Run with:
// - cargo bench --bench regex_all_terms
//
use std::fmt::Write;
use binggan::{black_box, BenchRunner};
use tantivy::collector::Count;
use tantivy::query::RegexQuery;
use tantivy::schema::{Schema, TEXT};
use tantivy::{doc, Index, ReloadPolicy};
const HEAP_SIZE_BYTES: usize = 200_000_000;
#[derive(Clone, Copy)]
struct BenchConfig {
num_terms: usize,
num_docs: usize,
tokens_per_doc: usize,
}
fn main() {
let configs = default_configs();
let mut runner = BenchRunner::new();
for config in configs {
let (index, text_field) = build_index(config, HEAP_SIZE_BYTES);
let reader = index
.reader_builder()
.reload_policy(ReloadPolicy::Manual)
.try_into()
.expect("reader");
let searcher = reader.searcher();
let query = RegexQuery::from_pattern("t.*", text_field).expect("regex query");
let mut group = runner.new_group();
group.set_name(format!(
"regex_all_terms_t{}_d{}_k{}",
config.num_terms, config.num_docs, config.tokens_per_doc
));
group.register("regex_count", move |_| {
let count = searcher.search(&query, &Count).expect("search");
black_box(count);
});
group.run();
}
}
fn default_configs() -> Vec<BenchConfig> {
vec![
BenchConfig {
num_terms: 10_000,
num_docs: 100_000,
tokens_per_doc: 1,
},
BenchConfig {
num_terms: 10_000,
num_docs: 100_000,
tokens_per_doc: 8,
},
BenchConfig {
num_terms: 100_000,
num_docs: 100_000,
tokens_per_doc: 1,
},
BenchConfig {
num_terms: 100_000,
num_docs: 100_000,
tokens_per_doc: 8,
},
]
}
fn build_index(config: BenchConfig, heap_size_bytes: usize) -> (Index, tantivy::schema::Field) {
let mut schema_builder = Schema::builder();
let text_field = schema_builder.add_text_field("text", TEXT);
let schema = schema_builder.build();
let index = Index::create_in_ram(schema);
let term_width = config.num_terms.to_string().len();
{
let mut writer = index
.writer_with_num_threads(1, heap_size_bytes)
.expect("writer");
let mut buffer = String::new();
for doc_id in 0..config.num_docs {
buffer.clear();
for token_idx in 0..config.tokens_per_doc {
if token_idx > 0 {
buffer.push(' ');
}
let term_id = (doc_id * config.tokens_per_doc + token_idx) % config.num_terms;
write!(&mut buffer, "t{term_id:0term_width$}").expect("write token");
}
writer
.add_document(doc!(text_field => buffer.as_str()))
.expect("add_document");
}
writer.commit().expect("commit");
}
(index, text_field)
}

View File

@@ -0,0 +1,421 @@
// This benchmark compares different approaches for retrieving string values:
//
// 1. Fast Field Approach: retrieves string values via term_ords() and ord_to_str()
//
// 2. Doc Store Approach: retrieves string values via searcher.doc() and field extraction
//
// The benchmark includes various data distributions:
// - Dense Sequential: Sequential document IDs with dense data
// - Dense Random: Random document IDs with dense data
// - Sparse Sequential: Sequential document IDs with sparse data
// - Sparse Random: Random document IDs with sparse data
use std::ops::Bound;
use binggan::{black_box, BenchGroup, BenchRunner};
use rand::prelude::*;
use rand::rngs::StdRng;
use rand::SeedableRng;
use tantivy::collector::{Count, DocSetCollector};
use tantivy::query::RangeQuery;
use tantivy::schema::document::TantivyDocument;
use tantivy::schema::{Schema, Value, FAST, STORED, STRING};
use tantivy::{doc, Index, ReloadPolicy, Searcher, Term};
#[derive(Clone)]
struct BenchIndex {
#[allow(dead_code)]
index: Index,
searcher: Searcher,
}
fn build_shared_indices(num_docs: usize, distribution: &str) -> BenchIndex {
// Schema with string fast field and stored field for doc access
let mut schema_builder = Schema::builder();
let f_str_fast = schema_builder.add_text_field("str_fast", STRING | STORED | FAST);
let f_str_stored = schema_builder.add_text_field("str_stored", STRING | STORED);
let schema = schema_builder.build();
let index = Index::create_in_ram(schema.clone());
// Populate index with stable RNG for reproducibility.
let mut rng = StdRng::from_seed([7u8; 32]);
{
let mut writer = index.writer_with_num_threads(1, 4_000_000_000).unwrap();
match distribution {
"dense_random" => {
for _doc_id in 0..num_docs {
let suffix = rng.gen_range(0u64..1000u64);
let str_val = format!("str_{:03}", suffix);
writer
.add_document(doc!(
f_str_fast=>str_val.clone(),
f_str_stored=>str_val,
))
.unwrap();
}
}
"dense_sequential" => {
for doc_id in 0..num_docs {
let suffix = doc_id as u64 % 1000;
let str_val = format!("str_{:03}", suffix);
writer
.add_document(doc!(
f_str_fast=>str_val.clone(),
f_str_stored=>str_val,
))
.unwrap();
}
}
"sparse_random" => {
for _doc_id in 0..num_docs {
let suffix = rng.gen_range(0u64..1000000u64);
let str_val = format!("str_{:07}", suffix);
writer
.add_document(doc!(
f_str_fast=>str_val.clone(),
f_str_stored=>str_val,
))
.unwrap();
}
}
"sparse_sequential" => {
for doc_id in 0..num_docs {
let suffix = doc_id as u64;
let str_val = format!("str_{:07}", suffix);
writer
.add_document(doc!(
f_str_fast=>str_val.clone(),
f_str_stored=>str_val,
))
.unwrap();
}
}
_ => {
panic!("Unsupported distribution type");
}
}
writer.commit().unwrap();
}
// Prepare reader/searcher once.
let reader = index
.reader_builder()
.reload_policy(ReloadPolicy::Manual)
.try_into()
.unwrap();
let searcher = reader.searcher();
BenchIndex { index, searcher }
}
fn main() {
// Prepare corpora with varying scenarios
let scenarios = vec![
(
"dense_random_search_low_range".to_string(),
1_000_000,
"dense_random",
0,
9,
),
(
"dense_random_search_high_range".to_string(),
1_000_000,
"dense_random",
990,
999,
),
(
"dense_sequential_search_low_range".to_string(),
1_000_000,
"dense_sequential",
0,
9,
),
(
"dense_sequential_search_high_range".to_string(),
1_000_000,
"dense_sequential",
990,
999,
),
(
"sparse_random_search_low_range".to_string(),
1_000_000,
"sparse_random",
0,
9999,
),
(
"sparse_random_search_high_range".to_string(),
1_000_000,
"sparse_random",
990_000,
999_999,
),
(
"sparse_sequential_search_low_range".to_string(),
1_000_000,
"sparse_sequential",
0,
9999,
),
(
"sparse_sequential_search_high_range".to_string(),
1_000_000,
"sparse_sequential",
990_000,
999_999,
),
];
let mut runner = BenchRunner::new();
for (scenario_id, n, distribution, range_low, range_high) in scenarios {
let bench_index = build_shared_indices(n, distribution);
let mut group = runner.new_group();
group.set_name(scenario_id);
let field = bench_index.searcher.schema().get_field("str_fast").unwrap();
let (lower_str, upper_str) =
if distribution == "dense_sequential" || distribution == "dense_random" {
(
format!("str_{:03}", range_low),
format!("str_{:03}", range_high),
)
} else {
(
format!("str_{:07}", range_low),
format!("str_{:07}", range_high),
)
};
let lower_term = Term::from_field_text(field, &lower_str);
let upper_term = Term::from_field_text(field, &upper_str);
let query = RangeQuery::new(Bound::Included(lower_term), Bound::Included(upper_term));
run_benchmark_tasks(&mut group, &bench_index, query, range_low, range_high);
group.run();
}
}
/// Run all benchmark tasks for a given range query
fn run_benchmark_tasks(
bench_group: &mut BenchGroup,
bench_index: &BenchIndex,
query: RangeQuery,
range_low: u64,
range_high: u64,
) {
// Test count of matching documents
add_bench_task_count(
bench_group,
bench_index,
query.clone(),
range_low,
range_high,
);
// Test fetching all DocIds of matching documents
add_bench_task_docset(
bench_group,
bench_index,
query.clone(),
range_low,
range_high,
);
// Test fetching all string fast field values of matching documents
add_bench_task_fetch_all_strings(
bench_group,
bench_index,
query.clone(),
range_low,
range_high,
);
// Test fetching all string values of matching documents through doc() method
add_bench_task_fetch_all_strings_from_doc(
bench_group,
bench_index,
query,
range_low,
range_high,
);
}
fn add_bench_task_count(
bench_group: &mut BenchGroup,
bench_index: &BenchIndex,
query: RangeQuery,
range_low: u64,
range_high: u64,
) {
let task_name = format!("string_search_count_[{}-{}]", range_low, range_high);
let search_task = CountSearchTask {
searcher: bench_index.searcher.clone(),
query,
};
bench_group.register(task_name, move |_| black_box(search_task.run()));
}
fn add_bench_task_docset(
bench_group: &mut BenchGroup,
bench_index: &BenchIndex,
query: RangeQuery,
range_low: u64,
range_high: u64,
) {
let task_name = format!("string_fetch_all_docset_[{}-{}]", range_low, range_high);
let search_task = DocSetSearchTask {
searcher: bench_index.searcher.clone(),
query,
};
bench_group.register(task_name, move |_| black_box(search_task.run()));
}
fn add_bench_task_fetch_all_strings(
bench_group: &mut BenchGroup,
bench_index: &BenchIndex,
query: RangeQuery,
range_low: u64,
range_high: u64,
) {
let task_name = format!(
"string_fastfield_fetch_all_strings_[{}-{}]",
range_low, range_high
);
let search_task = FetchAllStringsSearchTask {
searcher: bench_index.searcher.clone(),
query,
};
bench_group.register(task_name, move |_| {
let result = black_box(search_task.run());
result.len()
});
}
fn add_bench_task_fetch_all_strings_from_doc(
bench_group: &mut BenchGroup,
bench_index: &BenchIndex,
query: RangeQuery,
range_low: u64,
range_high: u64,
) {
let task_name = format!(
"string_doc_fetch_all_strings_[{}-{}]",
range_low, range_high
);
let search_task = FetchAllStringsFromDocTask {
searcher: bench_index.searcher.clone(),
query,
};
bench_group.register(task_name, move |_| {
let result = black_box(search_task.run());
result.len()
});
}
struct CountSearchTask {
searcher: Searcher,
query: RangeQuery,
}
impl CountSearchTask {
#[inline(never)]
pub fn run(&self) -> usize {
self.searcher.search(&self.query, &Count).unwrap()
}
}
struct DocSetSearchTask {
searcher: Searcher,
query: RangeQuery,
}
impl DocSetSearchTask {
#[inline(never)]
pub fn run(&self) -> usize {
let result = self.searcher.search(&self.query, &DocSetCollector).unwrap();
result.len()
}
}
struct FetchAllStringsSearchTask {
searcher: Searcher,
query: RangeQuery,
}
impl FetchAllStringsSearchTask {
#[inline(never)]
pub fn run(&self) -> Vec<String> {
let doc_addresses = self.searcher.search(&self.query, &DocSetCollector).unwrap();
let mut docs = doc_addresses.into_iter().collect::<Vec<_>>();
docs.sort();
let mut strings = Vec::with_capacity(docs.len());
for doc_address in docs {
let segment_reader = &self.searcher.segment_readers()[doc_address.segment_ord as usize];
let str_column_opt = segment_reader.fast_fields().str("str_fast");
if let Ok(Some(str_column)) = str_column_opt {
let doc_id = doc_address.doc_id;
let term_ord = str_column.term_ords(doc_id).next().unwrap();
let mut str_buffer = String::new();
if str_column.ord_to_str(term_ord, &mut str_buffer).is_ok() {
strings.push(str_buffer);
}
}
}
strings
}
}
struct FetchAllStringsFromDocTask {
searcher: Searcher,
query: RangeQuery,
}
impl FetchAllStringsFromDocTask {
#[inline(never)]
pub fn run(&self) -> Vec<String> {
let doc_addresses = self.searcher.search(&self.query, &DocSetCollector).unwrap();
let mut docs = doc_addresses.into_iter().collect::<Vec<_>>();
docs.sort();
let mut strings = Vec::with_capacity(docs.len());
let str_stored_field = self
.searcher
.schema()
.get_field("str_stored")
.expect("str_stored field should exist");
for doc_address in docs {
// Get the document from the doc store (row store access)
if let Ok(doc) = self.searcher.doc::<TantivyDocument>(doc_address) {
// Extract string values from the stored field
if let Some(field_value) = doc.get_first(str_stored_field) {
if let Some(text) = field_value.as_value().as_str() {
strings.push(text.to_string());
}
}
}
}
strings
}
}

View File

@@ -31,7 +31,7 @@ pub use u64_based::{
serialize_and_load_u64_based_column_values, serialize_u64_based_column_values,
};
pub use u128_based::{
CompactSpaceU64Accessor, open_u128_as_compact_u64, open_u128_mapped,
CompactHit, CompactSpaceU64Accessor, open_u128_as_compact_u64, open_u128_mapped,
serialize_column_values_u128,
};
pub use vec_column::VecColumn;

View File

@@ -292,6 +292,19 @@ impl BinarySerializable for IPCodecParams {
}
}
/// Represents the result of looking up a u128 value in the compact space.
///
/// If a value is outside the compact space, the next compact value is returned.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum CompactHit {
/// The value exists in the compact space
Exact(u32),
/// The value does not exist in the compact space, but the next higher value does
Next(u32),
/// The value is greater than the maximum compact value
AfterLast,
}
/// Exposes the compact space compressed values as u64.
///
/// This allows faster access to the values, as u64 is faster to work with than u128.
@@ -309,6 +322,11 @@ impl CompactSpaceU64Accessor {
pub fn compact_to_u128(&self, compact: u32) -> u128 {
self.0.compact_to_u128(compact)
}
/// Finds the next compact space value for a given u128 value.
pub fn u128_to_next_compact(&self, value: u128) -> CompactHit {
self.0.u128_to_next_compact(value)
}
}
impl ColumnValues<u64> for CompactSpaceU64Accessor {
@@ -430,6 +448,26 @@ impl CompactSpaceDecompressor {
Ok(decompressor)
}
/// Finds the next compact space value for a given u128 value
pub fn u128_to_next_compact(&self, value: u128) -> CompactHit {
// Try to convert to compact space
match self.u128_to_compact(value) {
// Value is in compact space, return its compact representation
Ok(compact) => CompactHit::Exact(compact),
// Value is not in compact space
Err(pos) => {
if pos >= self.params.compact_space.ranges_mapping.len() {
// Value is beyond all ranges, no next value exists
CompactHit::AfterLast
} else {
// Get the next range and return its start compact value
let next_range = &self.params.compact_space.ranges_mapping[pos];
CompactHit::Next(next_range.compact_start)
}
}
}
}
/// Converting to compact space for the decompressor is more complex, since we may get values
/// which are outside the compact space. e.g. if we map
/// 1000 => 5
@@ -823,6 +861,41 @@ mod tests {
let _data = test_aux_vals(vals);
}
#[test]
fn test_u128_to_next_compact() {
let vals = &[100u128, 200u128, 1_000_000_000u128, 1_000_000_100u128];
let mut data = test_aux_vals(vals);
let _header = U128Header::deserialize(&mut data);
let decomp = CompactSpaceDecompressor::open(data).unwrap();
// Test value that's already in a range
let compact_100 = decomp.u128_to_compact(100).unwrap();
assert_eq!(
decomp.u128_to_next_compact(100),
CompactHit::Exact(compact_100)
);
// Test value between two ranges
let compact_million = decomp.u128_to_compact(1_000_000_000).unwrap();
assert_eq!(
decomp.u128_to_next_compact(250),
CompactHit::Next(compact_million)
);
// Test value before the first range
assert_eq!(
decomp.u128_to_next_compact(50),
CompactHit::Next(compact_100)
);
// Test value after the last range
assert_eq!(
decomp.u128_to_next_compact(10_000_000_000),
CompactHit::AfterLast
);
}
use proptest::prelude::*;
fn num_strategy() -> impl Strategy<Value = u128> {

View File

@@ -7,7 +7,7 @@ mod compact_space;
use common::{BinarySerializable, OwnedBytes, VInt};
pub use compact_space::{
CompactSpaceCompressor, CompactSpaceDecompressor, CompactSpaceU64Accessor,
CompactHit, CompactSpaceCompressor, CompactSpaceDecompressor, CompactSpaceU64Accessor,
};
use crate::column_values::monotonic_map_column;

View File

@@ -59,7 +59,7 @@ pub struct RowAddr {
pub row_id: RowId,
}
pub use sstable::Dictionary;
pub use sstable::{Dictionary, TermOrdHit};
pub type Streamer<'a> = sstable::Streamer<'a, VoidSSTable>;
pub use common::DateTime;

View File

@@ -60,7 +60,7 @@ At indexing, tantivy will try to interpret number and strings as different type
priority order.
Numbers will be interpreted as u64, i64 and f64 in that order.
Strings will be interpreted as rfc3999 dates or simple strings.
Strings will be interpreted as rfc3339 dates or simple strings.
The first working type is picked and is the only term that is emitted for indexing.
Note this interpretation happens on a per-document basis, and there is no effort to try to sniff
@@ -81,7 +81,7 @@ Will be interpreted as
(my_path.my_segment, String, 233) or (my_path.my_segment, u64, 233)
```
Likewise, we need to emit two tokens if the query contains an rfc3999 date.
Likewise, we need to emit two tokens if the query contains an rfc3339 date.
Indeed the date could have been actually a single token inside the text of a document at ingestion time. Generally speaking, we will always at least emit a string token in query parsing, and sometimes more.
If one more json field is defined, things get even more complicated.

View File

@@ -560,7 +560,7 @@ fn range_infallible(inp: &str) -> JResult<&str, UserInputLeaf> {
(
(
value((), tag(">=")),
map(word_infallible("", false), |(bound, err)| {
map(word_infallible(")", false), |(bound, err)| {
(
(
bound
@@ -574,7 +574,7 @@ fn range_infallible(inp: &str) -> JResult<&str, UserInputLeaf> {
),
(
value((), tag("<=")),
map(word_infallible("", false), |(bound, err)| {
map(word_infallible(")", false), |(bound, err)| {
(
(
UserInputBound::Unbounded,
@@ -588,7 +588,7 @@ fn range_infallible(inp: &str) -> JResult<&str, UserInputLeaf> {
),
(
value((), tag(">")),
map(word_infallible("", false), |(bound, err)| {
map(word_infallible(")", false), |(bound, err)| {
(
(
bound
@@ -602,7 +602,7 @@ fn range_infallible(inp: &str) -> JResult<&str, UserInputLeaf> {
),
(
value((), tag("<")),
map(word_infallible("", false), |(bound, err)| {
map(word_infallible(")", false), |(bound, err)| {
(
(
UserInputBound::Unbounded,
@@ -704,7 +704,11 @@ fn regex(inp: &str) -> IResult<&str, UserInputLeaf> {
many1(alt((preceded(char('\\'), char('/')), none_of("/")))),
char('/'),
),
peek(alt((multispace1, eof))),
peek(alt((
value((), multispace1),
value((), char(')')),
value((), eof),
))),
),
|elements| UserInputLeaf::Regex {
field: None,
@@ -721,8 +725,12 @@ fn regex_infallible(inp: &str) -> JResult<&str, UserInputLeaf> {
opt_i_err(char('/'), "missing delimiter /"),
),
opt_i_err(
peek(alt((multispace1, eof))),
"expected whitespace or end of input",
peek(alt((
value((), multispace1),
value((), char(')')),
value((), eof),
))),
"expected whitespace, closing parenthesis, or end of input",
),
)(inp)
{
@@ -1323,6 +1331,14 @@ mod test {
test_parse_query_to_ast_helper("<a", "{\"*\" TO \"a\"}");
test_parse_query_to_ast_helper("<=a", "{\"*\" TO \"a\"]");
test_parse_query_to_ast_helper("<=bsd", "{\"*\" TO \"bsd\"]");
test_parse_query_to_ast_helper("(<=42)", "{\"*\" TO \"42\"]");
test_parse_query_to_ast_helper("(<=42 )", "{\"*\" TO \"42\"]");
test_parse_query_to_ast_helper("(age:>5)", "\"age\":{\"5\" TO \"*\"}");
test_parse_query_to_ast_helper(
"(title:bar AND age:>12)",
"(+\"title\":bar +\"age\":{\"12\" TO \"*\"})",
);
}
#[test]
@@ -1699,6 +1715,10 @@ mod test {
test_parse_query_to_ast_helper("foo:(A OR B)", "(?\"foo\":A ?\"foo\":B)");
test_parse_query_to_ast_helper("foo:(A* OR B*)", "(?\"foo\":A* ?\"foo\":B*)");
test_parse_query_to_ast_helper("foo:(*A OR *B)", "(?\"foo\":*A ?\"foo\":*B)");
// Regexes between parentheses
test_parse_query_to_ast_helper("foo:(/A.*/)", "\"foo\":/A.*/");
test_parse_query_to_ast_helper("foo:(/A.*/ OR /B.*/)", "(?\"foo\":/A.*/ ?\"foo\":/B.*/)");
}
#[test]

View File

@@ -66,6 +66,7 @@ impl UserInputLeaf {
}
UserInputLeaf::Range { field, .. } if field.is_none() => *field = Some(default_field),
UserInputLeaf::Set { field, .. } if field.is_none() => *field = Some(default_field),
UserInputLeaf::Regex { field, .. } if field.is_none() => *field = Some(default_field),
_ => (), // field was already set, do nothing
}
}

View File

@@ -0,0 +1,27 @@
[package]
name = "sketches-ddsketch"
version = "0.3.0"
authors = ["Mike Heffner <mikeh@fesnel.com>"]
edition = "2018"
license = "Apache-2.0"
readme = "README.md"
repository = "https://github.com/mheffner/rust-sketches-ddsketch"
homepage = "https://github.com/mheffner/rust-sketches-ddsketch"
description = """
A direct port of the Golang DDSketch implementation.
"""
exclude = [".gitignore"]
# See more keys and their definitions at https://doc.rust-lang.org/cargo/reference/manifest.html
[dependencies]
serde = { package = "serde", version = "1.0", optional = true, features = ["derive", "serde_derive"] }
[dev-dependencies]
approx = "0.5.1"
rand = "0.8.5"
rand_distr = "0.4.3"
[features]
use_serde = ["serde", "serde/derive"]

sketches-ddsketch/LICENSE Normal file
View File

@@ -0,0 +1,201 @@
Apache License
Version 2.0, January 2004
http://www.apache.org/licenses/
TERMS AND CONDITIONS FOR USE, REPRODUCTION, AND DISTRIBUTION
1. Definitions.
"License" shall mean the terms and conditions for use, reproduction,
and distribution as defined by Sections 1 through 9 of this document.
"Licensor" shall mean the copyright owner or entity authorized by
the copyright owner that is granting the License.
"Legal Entity" shall mean the union of the acting entity and all
other entities that control, are controlled by, or are under common
control with that entity. For the purposes of this definition,
"control" means (i) the power, direct or indirect, to cause the
direction or management of such entity, whether by contract or
otherwise, or (ii) ownership of fifty percent (50%) or more of the
outstanding shares, or (iii) beneficial ownership of such entity.
"You" (or "Your") shall mean an individual or Legal Entity
exercising permissions granted by this License.
"Source" form shall mean the preferred form for making modifications,
including but not limited to software source code, documentation
source, and configuration files.
"Object" form shall mean any form resulting from mechanical
transformation or translation of a Source form, including but
not limited to compiled object code, generated documentation,
and conversions to other media types.
"Work" shall mean the work of authorship, whether in Source or
Object form, made available under the License, as indicated by a
copyright notice that is included in or attached to the work
(an example is provided in the Appendix below).
"Derivative Works" shall mean any work, whether in Source or Object
form, that is based on (or derived from) the Work and for which the
editorial revisions, annotations, elaborations, or other modifications
represent, as a whole, an original work of authorship. For the purposes
of this License, Derivative Works shall not include works that remain
separable from, or merely link (or bind by name) to the interfaces of,
the Work and Derivative Works thereof.
"Contribution" shall mean any work of authorship, including
the original version of the Work and any modifications or additions
to that Work or Derivative Works thereof, that is intentionally
submitted to Licensor for inclusion in the Work by the copyright owner
or by an individual or Legal Entity authorized to submit on behalf of
the copyright owner. For the purposes of this definition, "submitted"
means any form of electronic, verbal, or written communication sent
to the Licensor or its representatives, including but not limited to
communication on electronic mailing lists, source code control systems,
and issue tracking systems that are managed by, or on behalf of, the
Licensor for the purpose of discussing and improving the Work, but
excluding communication that is conspicuously marked or otherwise
designated in writing by the copyright owner as "Not a Contribution."
"Contributor" shall mean Licensor and any individual or Legal Entity
on behalf of whom a Contribution has been received by Licensor and
subsequently incorporated within the Work.
2. Grant of Copyright License. Subject to the terms and conditions of
this License, each Contributor hereby grants to You a perpetual,
worldwide, non-exclusive, no-charge, royalty-free, irrevocable
copyright license to reproduce, prepare Derivative Works of,
publicly display, publicly perform, sublicense, and distribute the
Work and such Derivative Works in Source or Object form.
3. Grant of Patent License. Subject to the terms and conditions of
this License, each Contributor hereby grants to You a perpetual,
worldwide, non-exclusive, no-charge, royalty-free, irrevocable
(except as stated in this section) patent license to make, have made,
use, offer to sell, sell, import, and otherwise transfer the Work,
where such license applies only to those patent claims licensable
by such Contributor that are necessarily infringed by their
Contribution(s) alone or by combination of their Contribution(s)
with the Work to which such Contribution(s) was submitted. If You
institute patent litigation against any entity (including a
cross-claim or counterclaim in a lawsuit) alleging that the Work
or a Contribution incorporated within the Work constitutes direct
or contributory patent infringement, then any patent licenses
granted to You under this License for that Work shall terminate
as of the date such litigation is filed.
4. Redistribution. You may reproduce and distribute copies of the
Work or Derivative Works thereof in any medium, with or without
modifications, and in Source or Object form, provided that You
meet the following conditions:
(a) You must give any other recipients of the Work or
Derivative Works a copy of this License; and
(b) You must cause any modified files to carry prominent notices
stating that You changed the files; and
(c) You must retain, in the Source form of any Derivative Works
that You distribute, all copyright, patent, trademark, and
attribution notices from the Source form of the Work,
excluding those notices that do not pertain to any part of
the Derivative Works; and
(d) If the Work includes a "NOTICE" text file as part of its
distribution, then any Derivative Works that You distribute must
include a readable copy of the attribution notices contained
within such NOTICE file, excluding those notices that do not
pertain to any part of the Derivative Works, in at least one
of the following places: within a NOTICE text file distributed
as part of the Derivative Works; within the Source form or
documentation, if provided along with the Derivative Works; or,
within a display generated by the Derivative Works, if and
wherever such third-party notices normally appear. The contents
of the NOTICE file are for informational purposes only and
do not modify the License. You may add Your own attribution
notices within Derivative Works that You distribute, alongside
or as an addendum to the NOTICE text from the Work, provided
that such additional attribution notices cannot be construed
as modifying the License.
You may add Your own copyright statement to Your modifications and
may provide additional or different license terms and conditions
for use, reproduction, or distribution of Your modifications, or
for any such Derivative Works as a whole, provided Your use,
reproduction, and distribution of the Work otherwise complies with
the conditions stated in this License.
5. Submission of Contributions. Unless You explicitly state otherwise,
any Contribution intentionally submitted for inclusion in the Work
by You to the Licensor shall be under the terms and conditions of
this License, without any additional terms or conditions.
Notwithstanding the above, nothing herein shall supersede or modify
the terms of any separate license agreement you may have executed
with Licensor regarding such Contributions.
6. Trademarks. This License does not grant permission to use the trade
names, trademarks, service marks, or product names of the Licensor,
except as required for reasonable and customary use in describing the
origin of the Work and reproducing the content of the NOTICE file.
7. Disclaimer of Warranty. Unless required by applicable law or
agreed to in writing, Licensor provides the Work (and each
Contributor provides its Contributions) on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or
implied, including, without limitation, any warranties or conditions
of TITLE, NON-INFRINGEMENT, MERCHANTABILITY, or FITNESS FOR A
PARTICULAR PURPOSE. You are solely responsible for determining the
appropriateness of using or redistributing the Work and assume any
risks associated with Your exercise of permissions under this License.
8. Limitation of Liability. In no event and under no legal theory,
whether in tort (including negligence), contract, or otherwise,
unless required by applicable law (such as deliberate and grossly
negligent acts) or agreed to in writing, shall any Contributor be
liable to You for damages, including any direct, indirect, special,
incidental, or consequential damages of any character arising as a
result of this License or out of the use or inability to use the
Work (including but not limited to damages for loss of goodwill,
work stoppage, computer failure or malfunction, or any and all
other commercial damages or losses), even if such Contributor
has been advised of the possibility of such damages.
9. Accepting Warranty or Additional Liability. While redistributing
the Work or Derivative Works thereof, You may choose to offer,
and charge a fee for, acceptance of support, warranty, indemnity,
or other liability obligations and/or rights consistent with this
License. However, in accepting such obligations, You may act only
on Your own behalf and on Your sole responsibility, not on behalf
of any other Contributor, and only if You agree to indemnify,
defend, and hold each Contributor harmless for any liability
incurred by, or claims asserted against, such Contributor by reason
of your accepting any such warranty or additional liability.
END OF TERMS AND CONDITIONS
APPENDIX: How to apply the Apache License to your work.
To apply the Apache License to your work, attach the following
boilerplate notice, with the fields enclosed by brackets "[]"
replaced with your own identifying information. (Don't include
the brackets!) The text should be enclosed in the appropriate
comment syntax for the file format. We also recommend that a
file or class name and description of purpose be included on the
same "printed page" as the copyright notice for easier
identification within third-party archives.
Copyright [2019] [Mike Heffner]
Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.

View File

@@ -0,0 +1,11 @@
clean:
cargo clean
test:
cargo test
test_logs:
cargo test -- --nocapture
test_performance:
cargo test --release --jobs 1 test_performance -- --ignored --nocapture

View File

@@ -0,0 +1,37 @@
# sketches-ddsketch
This is a direct port of the [Golang](https://github.com/DataDog/sketches-go)
[DDSketch](https://arxiv.org/pdf/1908.10693.pdf) quantile sketch implementation
to Rust. DDSketch is a fully-mergeable quantile sketch with relative-error
guarantees and is extremely fast.
# DDSketch
* Sketch size automatically grows as needed, starting with 128 bins.
* Extremely fast sample insertion and sketch merges.
## Usage
```rust
use sketches_ddsketch::{Config, DDSketch};
let config = Config::defaults();
let mut sketch = DDSketch::new(c);
sketch.add(1.0);
sketch.add(1.0);
sketch.add(1.0);
// Get p=50%
let quantile = sketch.quantile(0.5).unwrap();
// The result is within the configured relative error (alpha = 1%) of 1.0
assert!(quantile > Some(0.98) && quantile < Some(1.02));
```
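Sketches created with the same `Config` can also be merged; a minimal sketch of that flow (it mirrors the example in the crate-level docs):
```rust
use sketches_ddsketch::{Config, DDSketch};
let config = Config::defaults();
let mut d1 = DDSketch::new(config);
let mut d2 = DDSketch::new(config);
d1.add(1.0);
d2.add(2.0);
d2.add(2.0);
d1.merge(&d2).unwrap();
assert_eq!(d1.count(), 3);
```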
## Performance
No performance tuning has been done on this port, so we would expect a profile
similar to the original implementation.
Out of the box we can achieve over 70M sample inserts/sec and 350K sketch
merges/sec. All tests were run on a single core of an Intel i7 processor with a
4.2GHz max clock.

View File

@@ -0,0 +1,98 @@
#[cfg(feature = "use_serde")]
use serde::{Deserialize, Serialize};
const DEFAULT_MAX_BINS: u32 = 2048;
const DEFAULT_ALPHA: f64 = 0.01;
const DEFAULT_MIN_VALUE: f64 = 1.0e-9;
/// The configuration struct for constructing a `DDSketch`
#[derive(Copy, Clone, Debug, PartialEq)]
#[cfg_attr(feature = "use_serde", derive(Serialize, Deserialize))]
pub struct Config {
pub max_num_bins: u32,
pub gamma: f64,
pub(crate) gamma_ln: f64,
pub(crate) min_value: f64,
pub offset: i32,
}
fn log_gamma(value: f64, gamma_ln: f64) -> f64 {
value.ln() / gamma_ln
}
impl Config {
/// Construct a new `Config` struct with specific parameters. If you are unsure of how to
/// configure this, the `defaults` method constructs a `Config` with built-in defaults.
///
/// `max_num_bins` is the max number of bins the DDSketch will grow to, in steps of 128 bins.
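///
/// A small illustrative doc-test (parameter values chosen arbitrarily):
///
/// ```rust
/// use sketches_ddsketch::Config;
/// let c = Config::new(0.01, 2048, 1.0e-9);
/// // gamma = (1 + alpha) / (1 - alpha) is slightly above 1 for any alpha in (0, 1)
/// assert!(c.gamma > 1.0);
/// ```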
pub fn new(alpha: f64, max_num_bins: u32, min_value: f64) -> Self {
// Aligned with Java's LogarithmicMapping / LogLikeIndexMapping:
// gamma = (1 + alpha) / (1 - alpha) (correctingFactor=1 for LogarithmicMapping)
// gamma_ln = gamma.ln() (not ln_1p, to match Java's Math.log(gamma))
// See: https://github.com/DataDog/sketches-java/blob/master/src/main/java/com/datadoghq/sketch/ddsketch/mapping/LogLikeIndexMapping.java (gamma() static method)
// See: https://github.com/DataDog/sketches-java/blob/master/src/main/java/com/datadoghq/sketch/ddsketch/mapping/LogarithmicMapping.java (constructor, correctingFactor()=1)
let gamma = (1.0 + alpha) / (1.0 - alpha);
let gamma_ln = gamma.ln();
Config {
max_num_bins,
gamma,
gamma_ln,
min_value,
offset: 1 - (log_gamma(min_value, gamma_ln) as i32),
}
}
/// Return a `Config` using built-in default settings
pub fn defaults() -> Self {
Self::new(DEFAULT_ALPHA, DEFAULT_MAX_BINS, DEFAULT_MIN_VALUE)
}
pub fn key(&self, v: f64) -> i32 {
// Aligned with Java's LogLikeIndexMapping.index(): floor-based indexing.
// Java uses `(int) index` / `(int) index - 1` which is equivalent to floor().
// See: https://github.com/DataDog/sketches-java/blob/master/src/main/java/com/datadoghq/sketch/ddsketch/mapping/LogLikeIndexMapping.java (index() method)
self.log_gamma(v).floor() as i32
}
pub fn value(&self, key: i32) -> f64 {
// Aligned with Java's LogLikeIndexMapping.value():
// lowerBound(index) * (1 + relativeAccuracy)
// = logInverse((index - indexOffset) / multiplier) * (1 + relativeAccuracy)
// = gamma^key * 2*gamma/(gamma+1)
// See: https://github.com/DataDog/sketches-java/blob/master/src/main/java/com/datadoghq/sketch/ddsketch/mapping/LogLikeIndexMapping.java (value() and lowerBound() methods)
self.pow_gamma(key) * (2.0 * self.gamma / (1.0 + self.gamma))
}
pub fn log_gamma(&self, value: f64) -> f64 {
log_gamma(value, self.gamma_ln)
}
pub fn pow_gamma(&self, key: i32) -> f64 {
((key as f64) * self.gamma_ln).exp()
}
pub fn min_possible(&self) -> f64 {
self.min_value
}
/// Reconstruct a Config from a gamma value (as decoded from the binary format).
/// Uses default max_num_bins and min_value.
/// See Java: https://github.com/DataDog/sketches-java/blob/master/src/main/java/com/datadoghq/sketch/ddsketch/mapping/LogarithmicMapping.java (LogarithmicMapping(double gamma, double indexOffset) constructor)
pub(crate) fn from_gamma(gamma: f64) -> Self {
let gamma_ln = gamma.ln();
Config {
max_num_bins: DEFAULT_MAX_BINS,
gamma,
gamma_ln,
min_value: DEFAULT_MIN_VALUE,
offset: 1 - (log_gamma(DEFAULT_MIN_VALUE, gamma_ln) as i32),
}
}
}
impl Default for Config {
fn default() -> Self {
Self::new(DEFAULT_ALPHA, DEFAULT_MAX_BINS, DEFAULT_MIN_VALUE)
}
}

View File

@@ -0,0 +1,385 @@
use std::{error, fmt};
#[cfg(feature = "use_serde")]
use serde::{Deserialize, Serialize};
use crate::config::Config;
use crate::store::Store;
type Result<T> = std::result::Result<T, DDSketchError>;
/// General error type for DDSketch, represents either an invalid quantile or an
/// incompatible merge operation.
#[derive(Debug, Clone)]
pub enum DDSketchError {
Quantile,
Merge,
}
impl fmt::Display for DDSketchError {
fn fmt(&self, f: &mut fmt::Formatter) -> fmt::Result {
match self {
DDSketchError::Quantile => {
write!(f, "Invalid quantile, must be between 0 and 1 (inclusive)")
}
DDSketchError::Merge => write!(f, "Can not merge sketches with different configs"),
}
}
}
impl error::Error for DDSketchError {
fn source(&self) -> Option<&(dyn error::Error + 'static)> {
// Generic
None
}
}
/// This struct represents a [DDSketch](https://arxiv.org/pdf/1908.10693.pdf)
#[derive(Clone)]
#[cfg_attr(feature = "use_serde", derive(Serialize, Deserialize))]
pub struct DDSketch {
pub(crate) config: Config,
pub(crate) store: Store,
pub(crate) negative_store: Store,
pub(crate) min: f64,
pub(crate) max: f64,
pub(crate) sum: f64,
pub(crate) zero_count: u64,
}
impl Default for DDSketch {
fn default() -> Self {
Self::new(Default::default())
}
}
// XXX: functions should return Option<> in the case of empty
impl DDSketch {
/// Construct a `DDSketch`. Requires a `Config` specifying the parameters of the sketch
pub fn new(config: Config) -> Self {
DDSketch {
config,
store: Store::new(config.max_num_bins as usize),
negative_store: Store::new(config.max_num_bins as usize),
min: f64::INFINITY,
max: f64::NEG_INFINITY,
sum: 0.0,
zero_count: 0,
}
}
/// Add the sample to the sketch
pub fn add(&mut self, v: f64) {
if v > self.config.min_possible() {
let key = self.config.key(v);
self.store.add(key);
} else if v < -self.config.min_possible() {
let key = self.config.key(-v);
self.negative_store.add(key);
} else {
self.zero_count += 1;
}
if v < self.min {
self.min = v;
}
if self.max < v {
self.max = v;
}
self.sum += v;
}
/// Return the quantile value for quantiles between 0.0 and 1.0. Result is an error, represented
/// as DDSketchError::Quantile if the requested quantile is outside of that range.
///
/// If the sketch is empty the result is None, else Some(v) for the quantile value.
pub fn quantile(&self, q: f64) -> Result<Option<f64>> {
if !(0.0..=1.0).contains(&q) {
return Err(DDSketchError::Quantile);
}
if self.empty() {
return Ok(None);
}
if q == 0.0 {
return Ok(Some(self.min));
} else if q == 1.0 {
return Ok(Some(self.max));
}
let rank = (q * (self.count() as f64 - 1.0)) as u64;
let quantile;
if rank < self.negative_store.count() {
let reversed_rank = self.negative_store.count() - rank - 1;
let key = self.negative_store.key_at_rank(reversed_rank);
quantile = -self.config.value(key);
} else if rank < self.zero_count + self.negative_store.count() {
quantile = 0.0;
} else {
let key = self
.store
.key_at_rank(rank - self.zero_count - self.negative_store.count());
quantile = self.config.value(key);
}
Ok(Some(quantile))
}
/// Returns the minimum value seen, or None if sketch is empty
pub fn min(&self) -> Option<f64> {
if self.empty() {
None
} else {
Some(self.min)
}
}
/// Returns the maximum value seen, or None if sketch is empty
pub fn max(&self) -> Option<f64> {
if self.empty() {
None
} else {
Some(self.max)
}
}
/// Returns the sum of values seen, or None if sketch is empty
pub fn sum(&self) -> Option<f64> {
if self.empty() {
None
} else {
Some(self.sum)
}
}
/// Returns the number of values added to the sketch
pub fn count(&self) -> usize {
(self.store.count() + self.zero_count + self.negative_store.count()) as usize
}
/// Returns the length of the underlying `Store`. This is mainly only useful for understanding
/// how much the sketch has grown given the inserted values.
pub fn length(&self) -> usize {
self.store.length() as usize + self.negative_store.length() as usize
}
/// Merge the contents of another sketch into this one. The sketch that is merged into this one
/// is unchanged after the merge.
pub fn merge(&mut self, o: &DDSketch) -> Result<()> {
if self.config != o.config {
return Err(DDSketchError::Merge);
}
let was_empty = self.store.count() == 0;
// Merge the stores
self.store.merge(&o.store);
self.negative_store.merge(&o.negative_store);
self.zero_count += o.zero_count;
// Need to ensure we don't override min/max with initializers
// if either store were empty
if was_empty {
self.min = o.min;
self.max = o.max;
} else if o.store.count() > 0 {
if o.min < self.min {
self.min = o.min
}
if o.max > self.max {
self.max = o.max;
}
}
self.sum += o.sum;
Ok(())
}
fn empty(&self) -> bool {
self.count() == 0
}
/// Encode this sketch into the Java-compatible binary format used by
/// `com.datadoghq.sketch.ddsketch.DDSketchWithExactSummaryStatistics`.
pub fn to_java_bytes(&self) -> Vec<u8> {
crate::encoding::encode_to_java_bytes(self)
}
/// Decode a sketch from the Java-compatible binary format.
/// Accepts bytes produced by Java's `DDSketchWithExactSummaryStatistics.encode()`
/// with or without the `0x02` version prefix.
pub fn from_java_bytes(
bytes: &[u8],
) -> std::result::Result<Self, crate::encoding::DecodeError> {
crate::encoding::decode_from_java_bytes(bytes)
}
}
#[cfg(test)]
mod tests {
use approx::assert_relative_eq;
use crate::{Config, DDSketch};
#[test]
fn test_add_zero() {
let alpha = 0.01;
let c = Config::new(alpha, 2048, 10e-9);
let mut dd = DDSketch::new(c);
dd.add(0.0);
}
#[test]
fn test_quartiles() {
let alpha = 0.01;
let c = Config::new(alpha, 2048, 10e-9);
let mut dd = DDSketch::new(c);
// Initialize sketch with {1.0, 2.0, 3.0, 4.0}
for i in 1..5 {
dd.add(i as f64);
}
// We expect the following mappings from quantile to value:
// [0, 0.33] -> 1.0, [0.34, 0.66] -> 2.0, [0.67, 0.99] -> 3.0, 1.0 -> 4.0
let test_cases = vec![
(0.0, 1.0),
(0.25, 1.0),
(0.33, 1.0),
(0.34, 2.0),
(0.5, 2.0),
(0.66, 2.0),
(0.67, 3.0),
(0.75, 3.0),
(0.99, 3.0),
(1.0, 4.0),
];
for (q, val) in test_cases {
assert_relative_eq!(dd.quantile(q).unwrap().unwrap(), val, max_relative = alpha);
}
}
#[test]
fn test_neg_quartiles() {
let alpha = 0.01;
let c = Config::new(alpha, 2048, 10e-9);
let mut dd = DDSketch::new(c);
// Initialize sketch with {1.0, 2.0, 3.0, 4.0}
for i in 1..5 {
dd.add(-i as f64);
}
let test_cases = vec![
(0.0, -4.0),
(0.25, -4.0),
(0.5, -3.0),
(0.75, -2.0),
(1.0, -1.0),
];
for (q, val) in test_cases {
assert_relative_eq!(dd.quantile(q).unwrap().unwrap(), val, max_relative = alpha);
}
}
#[test]
fn test_simple_quantile() {
let c = Config::defaults();
let mut dd = DDSketch::new(c);
for i in 1..101 {
dd.add(i as f64);
}
assert_eq!(dd.quantile(0.95).unwrap().unwrap().ceil(), 95.0);
assert!(dd.quantile(-1.01).is_err());
assert!(dd.quantile(1.01).is_err());
}
#[test]
fn test_empty_sketch() {
let c = Config::defaults();
let dd = DDSketch::new(c);
assert_eq!(dd.quantile(0.98).unwrap(), None);
assert_eq!(dd.max(), None);
assert_eq!(dd.min(), None);
assert_eq!(dd.sum(), None);
assert_eq!(dd.count(), 0);
assert!(dd.quantile(1.01).is_err());
}
#[test]
fn test_basic_histogram_data() {
let values = &[
0.754225035,
0.752900282,
0.752812246,
0.752602367,
0.754310155,
0.753525981,
0.752981082,
0.752715536,
0.751667941,
0.755079054,
0.753528150,
0.755188464,
0.752508723,
0.750064549,
0.753960428,
0.751139298,
0.752523560,
0.753253428,
0.753498342,
0.751858358,
0.752104636,
0.753841300,
0.754467374,
0.753814334,
0.750881719,
0.753182556,
0.752576884,
0.753945708,
0.753571911,
0.752314573,
0.752586651,
];
let c = Config::defaults();
let mut dd = DDSketch::new(c);
for value in values {
dd.add(*value);
}
assert_eq!(dd.max(), Some(0.755188464));
assert_eq!(dd.min(), Some(0.750064549));
assert_eq!(dd.count(), 31);
assert_eq!(dd.sum(), Some(23.343630625000003));
assert!(dd.quantile(0.25).unwrap().is_some());
assert!(dd.quantile(0.5).unwrap().is_some());
assert!(dd.quantile(0.75).unwrap().is_some());
}
#[test]
fn test_length() {
let mut dd = DDSketch::default();
assert_eq!(dd.length(), 0);
dd.add(1.0);
assert_eq!(dd.length(), 128);
dd.add(2.0);
dd.add(3.0);
assert_eq!(dd.length(), 128);
dd.add(-1.0);
assert_eq!(dd.length(), 256);
dd.add(-2.0);
dd.add(-3.0);
assert_eq!(dd.length(), 256);
}
}

View File

@@ -0,0 +1,813 @@
//! Java-compatible binary encoding/decoding for DDSketch.
//!
//! This module implements the binary format used by the Java
//! `com.datadoghq.sketch.ddsketch.DDSketchWithExactSummaryStatistics` class
//! from the DataDog/sketches-java library. It enables cross-language
//! serialization so that sketches produced in Rust can be deserialized
//! and merged by Java consumers.
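//!
//! A minimal round-trip sketch (illustrative; the same flow is exercised by the
//! tests at the bottom of this module):
//!
//! ```rust
//! use sketches_ddsketch::{Config, DDSketch};
//! let mut sketch = DDSketch::new(Config::defaults());
//! sketch.add(42.0);
//! let bytes = sketch.to_java_bytes();
//! let decoded = DDSketch::from_java_bytes(&bytes).unwrap();
//! assert_eq!(decoded.count(), 1);
//! ```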
use std::fmt;
use crate::config::Config;
use crate::ddsketch::DDSketch;
use crate::store::Store;
// ---------------------------------------------------------------------------
// Flag byte layout
//
// Each flag byte packs a 2-bit type ordinal in the low bits and a 6-bit
// subflag in the upper bits: (subflag << 2) | type_ordinal
// See: https://github.com/DataDog/sketches-java/blob/master/src/main/java/com/datadoghq/sketch/ddsketch/encoding/Flag.java
// ---------------------------------------------------------------------------
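// Worked example: the SUM feature uses subflag 0x21 with type SketchFeatures (0),
// so its flag byte is (0x21 << 2) | 0 = 0x84 (see FLAG_SUM below).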
/// The 2-bit type field occupying the low bits of every flag byte.
#[repr(u8)]
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum FlagType {
SketchFeatures = 0,
PositiveStore = 1,
IndexMapping = 2,
NegativeStore = 3,
}
impl FlagType {
fn from_byte(b: u8) -> Option<Self> {
match b & 0x03 {
0 => Some(Self::SketchFeatures),
1 => Some(Self::PositiveStore),
2 => Some(Self::IndexMapping),
3 => Some(Self::NegativeStore),
_ => None,
}
}
}
/// Construct a flag byte from a subflag and a type.
const fn flag(subflag: u8, flag_type: FlagType) -> u8 {
(subflag << 2) | (flag_type as u8)
}
// Pre-computed flag bytes for the sketch features we encode/decode.
const FLAG_INDEX_MAPPING_LOG: u8 = flag(0, FlagType::IndexMapping); // 0x02
const FLAG_ZERO_COUNT: u8 = flag(1, FlagType::SketchFeatures); // 0x04
const FLAG_COUNT: u8 = flag(0x28, FlagType::SketchFeatures); // 0xA0
const FLAG_SUM: u8 = flag(0x21, FlagType::SketchFeatures); // 0x84
const FLAG_MIN: u8 = flag(0x22, FlagType::SketchFeatures); // 0x88
const FLAG_MAX: u8 = flag(0x23, FlagType::SketchFeatures); // 0x8C
/// BinEncodingMode subflags for store flag bytes.
/// See: https://github.com/DataDog/sketches-java/blob/master/src/main/java/com/datadoghq/sketch/ddsketch/encoding/BinEncodingMode.java
#[repr(u8)]
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum BinEncodingMode {
IndexDeltasAndCounts = 1,
IndexDeltas = 2,
ContiguousCounts = 3,
}
impl BinEncodingMode {
fn from_subflag(subflag: u8) -> Option<Self> {
match subflag {
1 => Some(Self::IndexDeltasAndCounts),
2 => Some(Self::IndexDeltas),
3 => Some(Self::ContiguousCounts),
_ => None,
}
}
}
const VAR_DOUBLE_ROTATE_DISTANCE: u32 = 6;
const MAX_VAR_LEN_64: usize = 9;
const DEFAULT_MAX_BINS: u32 = 2048;
// ---------------------------------------------------------------------------
// Error type
// ---------------------------------------------------------------------------
#[derive(Debug, Clone)]
pub enum DecodeError {
UnexpectedEof,
InvalidFlag(u8),
InvalidData(String),
}
impl fmt::Display for DecodeError {
fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
match self {
Self::UnexpectedEof => write!(f, "unexpected end of input"),
Self::InvalidFlag(b) => write!(f, "invalid flag byte: 0x{b:02X}"),
Self::InvalidData(msg) => write!(f, "invalid data: {msg}"),
}
}
}
impl std::error::Error for DecodeError {}
// ---------------------------------------------------------------------------
// VarEncoding — bit-exact port of Java VarEncodingHelper
// See: https://github.com/DataDog/sketches-java/blob/master/src/main/java/com/datadoghq/sketch/ddsketch/encoding/VarEncodingHelper.java
// ---------------------------------------------------------------------------
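// Worked example (also covered by the unit tests below): 0 encodes as [0x00],
// 1 as [0x01], and 128 as [0x80, 0x01] (low 7 bits with a continuation bit, then 1).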
fn encode_unsigned_var_long(out: &mut Vec<u8>, mut value: u64) {
let length = ((63 - value.leading_zeros() as i32) / 7).clamp(0, 8);
for _ in 0..length {
out.push((value as u8) | 0x80);
value >>= 7;
}
out.push(value as u8);
}
fn decode_unsigned_var_long(input: &mut &[u8]) -> Result<u64, DecodeError> {
let mut value: u64 = 0;
let mut shift: u32 = 0;
loop {
let next = read_byte(input)?;
if next < 0x80 || shift == 56 {
return Ok(value | (u64::from(next) << shift));
}
value |= (u64::from(next) & 0x7F) << shift;
shift += 7;
}
}
/// ZigZag encode then var-long encode.
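/// ZigZag maps small-magnitude signed values to small unsigned values:
/// 0 -> 0, -1 -> 1, 1 -> 2, -2 -> 3, 2 -> 4, ...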
fn encode_signed_var_long(out: &mut Vec<u8>, value: i64) {
let encoded = ((value >> 63) ^ (value << 1)) as u64;
encode_unsigned_var_long(out, encoded);
}
fn decode_signed_var_long(input: &mut &[u8]) -> Result<i64, DecodeError> {
let encoded = decode_unsigned_var_long(input)?;
Ok(((encoded >> 1) as i64) ^ -((encoded & 1) as i64))
}
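// Bias by 1.0 and rotate left by 6 so that small non-negative values keep only a
// few significant high bits, letting them encode in one or two bytes
// (see test_var_double_small_integers below).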
fn double_to_var_bits(value: f64) -> u64 {
let bits = f64::to_bits(value + 1.0).wrapping_sub(f64::to_bits(1.0));
bits.rotate_left(VAR_DOUBLE_ROTATE_DISTANCE)
}
fn var_bits_to_double(bits: u64) -> f64 {
f64::from_bits(
bits.rotate_right(VAR_DOUBLE_ROTATE_DISTANCE)
.wrapping_add(f64::to_bits(1.0)),
) - 1.0
}
fn encode_var_double(out: &mut Vec<u8>, value: f64) {
let mut bits = double_to_var_bits(value);
for _ in 0..MAX_VAR_LEN_64 - 1 {
let next = (bits >> 57) as u8;
bits <<= 7;
if bits == 0 {
out.push(next);
return;
}
out.push(next | 0x80);
}
out.push((bits >> 56) as u8);
}
fn decode_var_double(input: &mut &[u8]) -> Result<f64, DecodeError> {
let mut bits: u64 = 0;
let mut shift: i32 = 57; // 8*8 - 7
loop {
let next = read_byte(input)?;
if shift == 1 {
bits |= u64::from(next);
break;
}
if next < 0x80 {
bits |= u64::from(next) << shift;
break;
}
bits |= (u64::from(next) & 0x7F) << shift;
shift -= 7;
}
Ok(var_bits_to_double(bits))
}
// ---------------------------------------------------------------------------
// Byte-level helpers
// ---------------------------------------------------------------------------
fn read_byte(input: &mut &[u8]) -> Result<u8, DecodeError> {
match input.split_first() {
Some((&byte, rest)) => {
*input = rest;
Ok(byte)
}
None => Err(DecodeError::UnexpectedEof),
}
}
fn write_f64_le(out: &mut Vec<u8>, value: f64) {
out.extend_from_slice(&value.to_le_bytes());
}
fn read_f64_le(input: &mut &[u8]) -> Result<f64, DecodeError> {
if input.len() < 8 {
return Err(DecodeError::UnexpectedEof);
}
let (bytes, rest) = input.split_at(8);
*input = rest;
// bytes is guaranteed to be length 8 by the split_at above.
let arr = [
bytes[0], bytes[1], bytes[2], bytes[3], bytes[4], bytes[5], bytes[6], bytes[7],
];
Ok(f64::from_le_bytes(arr))
}
// ---------------------------------------------------------------------------
// Store encoding/decoding
// See: https://github.com/DataDog/sketches-java/blob/master/src/main/java/com/datadoghq/sketch/ddsketch/store/DenseStore.java (encode/decode methods)
// ---------------------------------------------------------------------------
/// Collect non-zero bins in the store as (absolute_index, count) pairs.
///
/// Allocation is acceptable here: this runs once per encode and the Vec
/// has at most `max_num_bins` entries.
fn collect_non_zero_bins(store: &Store) -> Vec<(i32, u64)> {
if store.count == 0 {
return Vec::new();
}
let start = (store.min_key - store.offset) as usize;
let end = ((store.max_key - store.offset + 1) as usize).min(store.bins.len());
store.bins[start..end]
.iter()
.enumerate()
.filter(|&(_, &count)| count > 0)
.map(|(i, &count)| (start as i32 + i as i32 + store.offset, count))
.collect()
}
fn encode_store(out: &mut Vec<u8>, store: &Store, flag_type: FlagType) {
let bins = collect_non_zero_bins(store);
if bins.is_empty() {
return;
}
out.push(flag(BinEncodingMode::IndexDeltasAndCounts as u8, flag_type));
encode_unsigned_var_long(out, bins.len() as u64);
let mut prev_index: i64 = 0;
for &(index, count) in &bins {
encode_signed_var_long(out, i64::from(index) - prev_index);
encode_var_double(out, count as f64);
prev_index = i64::from(index);
}
}
fn decode_store(input: &mut &[u8], subflag: u8, bin_limit: usize) -> Result<Store, DecodeError> {
let mode = BinEncodingMode::from_subflag(subflag).ok_or_else(|| {
DecodeError::InvalidData(format!("unknown bin encoding mode subflag: {subflag}"))
})?;
let num_bins = decode_unsigned_var_long(input)? as usize;
let mut store = Store::new(bin_limit);
match mode {
BinEncodingMode::IndexDeltasAndCounts => {
let mut index: i64 = 0;
for _ in 0..num_bins {
index += decode_signed_var_long(input)?;
let count = decode_var_double(input)?;
store.add_count(index as i32, count as u64);
}
}
BinEncodingMode::IndexDeltas => {
let mut index: i64 = 0;
for _ in 0..num_bins {
index += decode_signed_var_long(input)?;
store.add_count(index as i32, 1);
}
}
BinEncodingMode::ContiguousCounts => {
let start_index = decode_signed_var_long(input)?;
let index_delta = decode_signed_var_long(input)?;
let mut index = start_index;
for _ in 0..num_bins {
let count = decode_var_double(input)?;
store.add_count(index as i32, count as u64);
index += index_delta;
}
}
}
Ok(store)
}
// ---------------------------------------------------------------------------
// Top-level encode / decode
// ---------------------------------------------------------------------------
/// Encode a DDSketch into the Java-compatible binary format.
///
/// The output follows the encoding order of
/// `DDSketchWithExactSummaryStatistics.encode()` then `DDSketch.encode()`:
///
/// 1. Summary statistics: COUNT, MIN, MAX (if count > 0)
/// 2. SUM (if sum != 0)
/// 3. Index mapping (LOG layout): gamma, indexOffset
/// 4. Zero count (if > 0)
/// 5. Positive store bins
/// 6. Negative store bins
pub fn encode_to_java_bytes(sketch: &DDSketch) -> Vec<u8> {
let mut out = Vec::new();
let count = sketch.count() as f64;
// Summary statistics (DDSketchWithExactSummaryStatistics.encode)
if count != 0.0 {
out.push(FLAG_COUNT);
encode_var_double(&mut out, count);
out.push(FLAG_MIN);
write_f64_le(&mut out, sketch.min);
out.push(FLAG_MAX);
write_f64_le(&mut out, sketch.max);
}
if sketch.sum != 0.0 {
out.push(FLAG_SUM);
write_f64_le(&mut out, sketch.sum);
}
// DDSketch.encode: index mapping + zero count + stores
out.push(FLAG_INDEX_MAPPING_LOG);
write_f64_le(&mut out, sketch.config.gamma);
write_f64_le(&mut out, 0.0_f64);
if sketch.zero_count != 0 {
out.push(FLAG_ZERO_COUNT);
encode_var_double(&mut out, sketch.zero_count as f64);
}
encode_store(&mut out, &sketch.store, FlagType::PositiveStore);
encode_store(&mut out, &sketch.negative_store, FlagType::NegativeStore);
out
}
/// Decode a DDSketch from the Java-compatible binary format.
///
/// Accepts bytes with or without a `0x02` version prefix.
pub fn decode_from_java_bytes(bytes: &[u8]) -> Result<DDSketch, DecodeError> {
if bytes.is_empty() {
return Err(DecodeError::UnexpectedEof);
}
let mut input = bytes;
// Skip optional version prefix (0x02 followed by a valid flag byte).
if input.len() >= 2 && input[0] == 0x02 && is_valid_flag_byte(input[1]) {
input = &input[1..];
}
let mut gamma: Option<f64> = None;
let mut zero_count: f64 = 0.0;
let mut sum: f64 = 0.0;
let mut min: f64 = f64::INFINITY;
let mut max: f64 = f64::NEG_INFINITY;
let mut positive_store: Option<Store> = None;
let mut negative_store: Option<Store> = None;
while !input.is_empty() {
let flag_byte = read_byte(&mut input)?;
let flag_type =
FlagType::from_byte(flag_byte).ok_or(DecodeError::InvalidFlag(flag_byte))?;
let subflag = flag_byte >> 2;
match flag_type {
FlagType::IndexMapping => {
gamma = Some(read_f64_le(&mut input)?);
let _index_offset = read_f64_le(&mut input)?;
}
FlagType::SketchFeatures => match flag_byte {
FLAG_ZERO_COUNT => zero_count += decode_var_double(&mut input)?,
FLAG_COUNT => {
let _count = decode_var_double(&mut input)?;
}
FLAG_SUM => sum = read_f64_le(&mut input)?,
FLAG_MIN => min = read_f64_le(&mut input)?,
FLAG_MAX => max = read_f64_le(&mut input)?,
_ => return Err(DecodeError::InvalidFlag(flag_byte)),
},
FlagType::PositiveStore => {
positive_store = Some(decode_store(
&mut input,
subflag,
DEFAULT_MAX_BINS as usize,
)?);
}
FlagType::NegativeStore => {
negative_store = Some(decode_store(
&mut input,
subflag,
DEFAULT_MAX_BINS as usize,
)?);
}
}
}
let g = gamma.unwrap_or_else(|| Config::defaults().gamma);
let config = Config::from_gamma(g);
let store = positive_store.unwrap_or_else(|| Store::new(config.max_num_bins as usize));
let neg = negative_store.unwrap_or_else(|| Store::new(config.max_num_bins as usize));
Ok(DDSketch {
config,
store,
negative_store: neg,
min,
max,
sum,
zero_count: zero_count as u64,
})
}
/// Check whether a byte is a valid flag byte for the DDSketch binary format.
fn is_valid_flag_byte(b: u8) -> bool {
// Known sketch-feature flags
if matches!(
b,
FLAG_ZERO_COUNT | FLAG_COUNT | FLAG_SUM | FLAG_MIN | FLAG_MAX | FLAG_INDEX_MAPPING_LOG
) {
return true;
}
let Some(flag_type) = FlagType::from_byte(b) else {
return false;
};
let subflag = b >> 2;
match flag_type {
FlagType::PositiveStore | FlagType::NegativeStore => (1..=3).contains(&subflag),
FlagType::IndexMapping => subflag <= 4, // LOG=0, LOG_LINEAR=1 .. LOG_QUARTIC=4
_ => false,
}
}
// ---------------------------------------------------------------------------
// Tests
// ---------------------------------------------------------------------------
#[cfg(test)]
mod tests {
use super::*;
use crate::{Config, DDSketch};
// --- VarEncoding unit tests ---
#[test]
fn test_unsigned_var_long_zero() {
let mut buf = Vec::new();
encode_unsigned_var_long(&mut buf, 0);
assert_eq!(buf, [0x00]);
let mut input = buf.as_slice();
assert_eq!(decode_unsigned_var_long(&mut input).unwrap(), 0);
assert!(input.is_empty());
}
#[test]
fn test_unsigned_var_long_small() {
let mut buf = Vec::new();
encode_unsigned_var_long(&mut buf, 1);
assert_eq!(buf, [0x01]);
let mut input = buf.as_slice();
assert_eq!(decode_unsigned_var_long(&mut input).unwrap(), 1);
}
#[test]
fn test_unsigned_var_long_128() {
let mut buf = Vec::new();
encode_unsigned_var_long(&mut buf, 128);
assert_eq!(buf, [0x80, 0x01]);
let mut input = buf.as_slice();
assert_eq!(decode_unsigned_var_long(&mut input).unwrap(), 128);
}
#[test]
fn test_unsigned_var_long_roundtrip() {
for v in [0u64, 1, 127, 128, 255, 256, 16383, 16384, u64::MAX] {
let mut buf = Vec::new();
encode_unsigned_var_long(&mut buf, v);
let mut input = buf.as_slice();
let decoded = decode_unsigned_var_long(&mut input).unwrap();
assert_eq!(decoded, v, "roundtrip failed for {}", v);
assert!(input.is_empty());
}
}
#[test]
fn test_signed_var_long_roundtrip() {
for v in [0i64, 1, -1, 63, -64, 64, -65, i64::MAX, i64::MIN] {
let mut buf = Vec::new();
encode_signed_var_long(&mut buf, v);
let mut input = buf.as_slice();
let decoded = decode_signed_var_long(&mut input).unwrap();
assert_eq!(decoded, v, "roundtrip failed for {}", v);
assert!(input.is_empty());
}
}
#[test]
fn test_var_double_roundtrip() {
for v in [0.0, 1.0, 2.0, 5.0, 15.0, 42.0, 100.0, 1e-9, 1e15, 0.5, 7.77] {
let mut buf = Vec::new();
encode_var_double(&mut buf, v);
let mut input = buf.as_slice();
let decoded = decode_var_double(&mut input).unwrap();
assert!(
(decoded - v).abs() < 1e-15 || decoded == v,
"roundtrip failed for {}: got {}",
v,
decoded,
);
assert!(input.is_empty());
}
}
#[test]
fn test_var_double_small_integers() {
let mut buf = Vec::new();
encode_var_double(&mut buf, 1.0);
assert_eq!(buf.len(), 1, "VarDouble(1.0) should be 1 byte");
buf.clear();
encode_var_double(&mut buf, 5.0);
assert_eq!(buf.len(), 1, "VarDouble(5.0) should be 1 byte");
}
// --- DDSketch encode/decode roundtrip tests ---
#[test]
fn test_encode_empty_sketch() {
let sketch = DDSketch::new(Config::defaults());
let bytes = sketch.to_java_bytes();
assert!(!bytes.is_empty());
let decoded = DDSketch::from_java_bytes(&bytes).unwrap();
assert_eq!(decoded.count(), 0);
assert_eq!(decoded.min(), None);
assert_eq!(decoded.max(), None);
assert_eq!(decoded.sum(), None);
}
#[test]
fn test_encode_simple_sketch() {
let mut sketch = DDSketch::new(Config::defaults());
for v in [1.0, 2.0, 3.0, 4.0, 5.0] {
sketch.add(v);
}
let bytes = sketch.to_java_bytes();
let decoded = DDSketch::from_java_bytes(&bytes).unwrap();
assert_eq!(decoded.count(), 5);
assert_eq!(decoded.min(), Some(1.0));
assert_eq!(decoded.max(), Some(5.0));
assert_eq!(decoded.sum(), Some(15.0));
assert_quantiles_match(&sketch, &decoded, &[0.5, 0.9, 0.95, 0.99]);
}
#[test]
fn test_encode_single_value() {
let mut sketch = DDSketch::new(Config::defaults());
sketch.add(42.0);
let bytes = sketch.to_java_bytes();
let decoded = DDSketch::from_java_bytes(&bytes).unwrap();
assert_eq!(decoded.count(), 1);
assert_eq!(decoded.min(), Some(42.0));
assert_eq!(decoded.max(), Some(42.0));
assert_eq!(decoded.sum(), Some(42.0));
}
#[test]
fn test_encode_negative_values() {
let mut sketch = DDSketch::new(Config::defaults());
for v in [-3.0, -1.0, 2.0, 5.0] {
sketch.add(v);
}
let bytes = sketch.to_java_bytes();
let decoded = DDSketch::from_java_bytes(&bytes).unwrap();
assert_eq!(decoded.count(), 4);
assert_eq!(decoded.min(), Some(-3.0));
assert_eq!(decoded.max(), Some(5.0));
assert_eq!(decoded.sum(), Some(3.0));
assert_quantiles_match(&sketch, &decoded, &[0.0, 0.25, 0.5, 0.75, 1.0]);
}
#[test]
fn test_encode_with_zero_value() {
let mut sketch = DDSketch::new(Config::defaults());
for v in [0.0, 1.0, 2.0] {
sketch.add(v);
}
let bytes = sketch.to_java_bytes();
let decoded = DDSketch::from_java_bytes(&bytes).unwrap();
assert_eq!(decoded.count(), 3);
assert_eq!(decoded.min(), Some(0.0));
assert_eq!(decoded.max(), Some(2.0));
assert_eq!(decoded.sum(), Some(3.0));
assert_eq!(decoded.zero_count, 1);
}
#[test]
fn test_encode_large_range() {
let mut sketch = DDSketch::new(Config::defaults());
sketch.add(0.001);
sketch.add(1_000_000.0);
let bytes = sketch.to_java_bytes();
let decoded = DDSketch::from_java_bytes(&bytes).unwrap();
assert_eq!(decoded.count(), 2);
assert_eq!(decoded.min(), Some(0.001));
assert_eq!(decoded.max(), Some(1_000_000.0));
}
#[test]
fn test_encode_with_version_prefix() {
let mut sketch = DDSketch::new(Config::defaults());
for v in [1.0, 2.0, 3.0] {
sketch.add(v);
}
let bytes = sketch.to_java_bytes();
// Simulate Java's toByteArrayV2: prepend 0x02
let mut v2_bytes = vec![0x02];
v2_bytes.extend_from_slice(&bytes);
let decoded = DDSketch::from_java_bytes(&v2_bytes).unwrap();
assert_eq!(decoded.count(), 3);
assert_eq!(decoded.min(), Some(1.0));
assert_eq!(decoded.max(), Some(3.0));
}
#[test]
fn test_byte_level_encoding() {
let mut sketch = DDSketch::new(Config::defaults());
sketch.add(1.0);
let bytes = sketch.to_java_bytes();
assert_eq!(bytes[0], FLAG_COUNT, "first byte should be COUNT flag");
assert!(
bytes.contains(&FLAG_INDEX_MAPPING_LOG),
"should contain index mapping flag"
);
}
// --- Cross-language golden byte tests ---
//
// Golden bytes generated by Java's DDSketchWithExactSummaryStatistics.encode()
// using LogarithmicMapping(0.01) + CollapsingLowestDenseStore(2048).
const GOLDEN_SIMPLE: &str = "a00588000000000000f03f8c0000000000001440840000000000002e4002fd4a815abf52f03f000000000000000005050002440228021e021602";
const GOLDEN_SINGLE: &str = "a0028800000000000045408c000000000000454084000000000000454002fd4a815abf52f03f00000000000000000501f40202";
const GOLDEN_NEGATIVE: &str = "a084408800000000000008c08c000000000000144084000000000000084002fd4a815abf52f03f0000000000000000050244025c02070200026c02";
const GOLDEN_ZERO: &str = "a0048800000000000000008c000000000000004084000000000000084002fd4a815abf52f03f00000000000000000402050200024402";
const GOLDEN_EMPTY: &str = "02fd4a815abf52f03f0000000000000000";
const GOLDEN_MANY: &str = "a08d1488000000000000f03f8c0000000000005940840000000000bab34002fd4a815abf52f03f000000000000000005550002440228021e021602120210020c020c020c0208020a020802060208020602060206020602040206020402040204020402040204020402040204020202040202020402020204020202020204020202020202020402020202020202020202020202020202020202020202020202020202020203020202020202020302020202020302020202020302020203020202030202020302030202020302030203020202030203020302030202";
fn hex_to_bytes(hex: &str) -> Vec<u8> {
(0..hex.len())
.step_by(2)
.map(|i| u8::from_str_radix(&hex[i..i + 2], 16).unwrap())
.collect()
}
fn bytes_to_hex(bytes: &[u8]) -> String {
bytes.iter().map(|b| format!("{b:02x}")).collect()
}
fn assert_golden(label: &str, sketch: &DDSketch, golden_hex: &str) {
let bytes = sketch.to_java_bytes();
let expected = hex_to_bytes(golden_hex);
assert_eq!(
bytes,
expected,
"Rust encoding doesn't match Java golden bytes for {}.\nRust: {}\nJava: {}",
label,
bytes_to_hex(&bytes),
golden_hex,
);
}
fn assert_quantiles_match(a: &DDSketch, b: &DDSketch, quantiles: &[f64]) {
for &q in quantiles {
let va = a.quantile(q).unwrap().unwrap();
let vb = b.quantile(q).unwrap().unwrap();
assert!(
(va - vb).abs() / va.abs().max(1e-15) < 1e-12,
"quantile({}) mismatch: {} vs {}",
q,
va,
vb,
);
}
}
#[test]
fn test_cross_language_simple() {
let mut sketch = DDSketch::new(Config::defaults());
for v in [1.0, 2.0, 3.0, 4.0, 5.0] {
sketch.add(v);
}
assert_golden("SIMPLE", &sketch, GOLDEN_SIMPLE);
}
#[test]
fn test_cross_language_single() {
let mut sketch = DDSketch::new(Config::defaults());
sketch.add(42.0);
assert_golden("SINGLE", &sketch, GOLDEN_SINGLE);
}
#[test]
fn test_cross_language_negative() {
let mut sketch = DDSketch::new(Config::defaults());
for v in [-3.0, -1.0, 2.0, 5.0] {
sketch.add(v);
}
assert_golden("NEGATIVE", &sketch, GOLDEN_NEGATIVE);
}
#[test]
fn test_cross_language_zero() {
let mut sketch = DDSketch::new(Config::defaults());
for v in [0.0, 1.0, 2.0] {
sketch.add(v);
}
assert_golden("ZERO", &sketch, GOLDEN_ZERO);
}
#[test]
fn test_cross_language_empty() {
let sketch = DDSketch::new(Config::defaults());
assert_golden("EMPTY", &sketch, GOLDEN_EMPTY);
}
#[test]
fn test_cross_language_many() {
let mut sketch = DDSketch::new(Config::defaults());
for i in 1..=100 {
sketch.add(i as f64);
}
assert_golden("MANY", &sketch, GOLDEN_MANY);
}
#[test]
fn test_decode_java_golden_bytes() {
for (name, hex) in [
("SIMPLE", GOLDEN_SIMPLE),
("SINGLE", GOLDEN_SINGLE),
("NEGATIVE", GOLDEN_NEGATIVE),
("ZERO", GOLDEN_ZERO),
("EMPTY", GOLDEN_EMPTY),
("MANY", GOLDEN_MANY),
] {
let bytes = hex_to_bytes(hex);
let result = DDSketch::from_java_bytes(&bytes);
assert!(
result.is_ok(),
"failed to decode {}: {:?}",
name,
result.err()
);
}
}
#[test]
fn test_encode_decode_many_values() {
let mut sketch = DDSketch::new(Config::defaults());
for i in 1..=100 {
sketch.add(i as f64);
}
let bytes = sketch.to_java_bytes();
let decoded = DDSketch::from_java_bytes(&bytes).unwrap();
assert_eq!(decoded.count(), 100);
assert_eq!(decoded.min(), Some(1.0));
assert_eq!(decoded.max(), Some(100.0));
assert_eq!(decoded.sum(), Some(5050.0));
let alpha = 0.01;
let orig_p95 = sketch.quantile(0.95).unwrap().unwrap();
let dec_p95 = decoded.quantile(0.95).unwrap().unwrap();
assert!(
(orig_p95 - dec_p95).abs() / orig_p95 < alpha,
"p95 mismatch: {} vs {}",
orig_p95,
dec_p95,
);
}
}

View File

@@ -0,0 +1,52 @@
//! This crate provides a direct port of the [Golang](https://github.com/DataDog/sketches-go)
//! [DDSketch](https://arxiv.org/pdf/1908.10693.pdf) implementation to Rust. All efforts
//! have been made to keep this as close to the original implementation as possible, with a few
//! tweaks to get closer to idiomatic Rust.
//!
//! # Usage
//!
//! Add multiple samples to a DDSketch and invoke the `quantile` method to pull any quantile from
//! *0.0* to *1.0*.
//!
//! ```rust
//! use sketches_ddsketch::{Config, DDSketch};
//!
//! let c = Config::defaults();
//! let mut d = DDSketch::new(c);
//!
//! d.add(1.0);
//! d.add(1.0);
//! d.add(1.0);
//!
//! let q = d.quantile(0.50).unwrap();
//!
//! assert!(q < Some(1.02));
//! assert!(q > Some(0.98));
//! ```
//!
//! Sketches can also be merged.
//!
//! ```rust
//! use sketches_ddsketch::{Config, DDSketch};
//!
//! let c = Config::defaults();
//! let mut d1 = DDSketch::new(c);
//! let mut d2 = DDSketch::new(c);
//!
//! d1.add(1.0);
//! d2.add(2.0);
//! d2.add(2.0);
//!
//! d1.merge(&d2);
//!
//! assert_eq!(d1.count(), 3);
//! ```
pub use self::config::Config;
pub use self::ddsketch::{DDSketch, DDSketchError};
pub use self::encoding::DecodeError;
mod config;
mod ddsketch;
pub mod encoding;
mod store;

View File

@@ -0,0 +1,252 @@
#[cfg(feature = "use_serde")]
use serde::{Deserialize, Serialize};
const CHUNK_SIZE: i32 = 128;
// Divide the `dividend` by the `divisor`, rounding towards positive infinity.
//
// Similar to the nightly only `std::i32::div_ceil`.
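//
// For example, div_ceil(5, 128) == 1, div_ceil(128, 128) == 1, and
// div_ceil(129, 128) == 2.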
fn div_ceil(dividend: i32, divisor: i32) -> i32 {
(dividend + divisor - 1) / divisor
}
/// CollapsingLowestDenseStore
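///
/// A dense bin store that grows in 128-bin chunks up to `bin_limit`; once the
/// limit is reached, the lowest-index bins are collapsed into the lowest kept
/// bin (see `adjust`), trading accuracy on the smallest keys for bounded memory.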
#[derive(Clone, Debug)]
#[cfg_attr(feature = "use_serde", derive(Serialize, Deserialize))]
pub struct Store {
pub(crate) bins: Vec<u64>,
pub(crate) count: u64,
pub(crate) min_key: i32,
pub(crate) max_key: i32,
pub(crate) offset: i32,
pub(crate) bin_limit: usize,
is_collapsed: bool,
}
impl Store {
pub fn new(bin_limit: usize) -> Self {
Store {
bins: Vec::new(),
count: 0,
min_key: i32::MAX,
max_key: i32::MIN,
offset: 0,
bin_limit,
is_collapsed: false,
}
}
/// Return the number of bins.
pub fn length(&self) -> i32 {
self.bins.len() as i32
}
pub fn is_empty(&self) -> bool {
self.bins.is_empty()
}
pub fn add(&mut self, key: i32) {
let idx = self.get_index(key);
self.bins[idx] += 1;
self.count += 1;
}
/// See Java: https://github.com/DataDog/sketches-java/blob/master/src/main/java/com/datadoghq/sketch/ddsketch/store/DenseStore.java (add(int index, double count) method)
pub(crate) fn add_count(&mut self, key: i32, count: u64) {
let idx = self.get_index(key);
self.bins[idx] += count;
self.count += count;
}
fn get_index(&mut self, key: i32) -> usize {
if key < self.min_key {
if self.is_collapsed {
return 0;
}
self.extend_range(key, None);
if self.is_collapsed {
return 0;
}
} else if key > self.max_key {
self.extend_range(key, None);
}
(key - self.offset) as usize
}
fn extend_range(&mut self, key: i32, second_key: Option<i32>) {
let second_key = second_key.unwrap_or(key);
let new_min_key = i32::min(key, i32::min(second_key, self.min_key));
let new_max_key = i32::max(key, i32::max(second_key, self.max_key));
if self.is_empty() {
let new_len = self.get_new_length(new_min_key, new_max_key);
self.bins.resize(new_len, 0);
self.offset = new_min_key;
self.adjust(new_min_key, new_max_key);
} else if new_min_key >= self.min_key && new_max_key < self.offset + self.length() {
self.min_key = new_min_key;
self.max_key = new_max_key;
} else {
// Grow bins
let new_length = self.get_new_length(new_min_key, new_max_key);
if new_length > self.length() as usize {
self.bins.resize(new_length, 0);
}
self.adjust(new_min_key, new_max_key);
}
}
fn get_new_length(&self, new_min_key: i32, new_max_key: i32) -> usize {
let desired_length = new_max_key - new_min_key + 1;
usize::min(
(CHUNK_SIZE * div_ceil(desired_length, CHUNK_SIZE)) as usize,
self.bin_limit,
)
}
fn adjust(&mut self, new_min_key: i32, new_max_key: i32) {
if new_max_key - new_min_key + 1 > self.length() {
let new_min_key = new_max_key - self.length() + 1;
if new_min_key >= self.max_key {
// Put everything in the first bin.
self.offset = new_min_key;
self.min_key = new_min_key;
self.bins.fill(0);
self.bins[0] = self.count;
} else {
let shift = self.offset - new_min_key;
if shift < 0 {
let collapse_start_index = (self.min_key - self.offset) as usize;
let collapse_end_index = (new_min_key - self.offset) as usize;
let collapsed_count: u64 = self.bins[collapse_start_index..collapse_end_index]
.iter()
.sum();
let zero_len = (new_min_key - self.min_key) as usize;
self.bins.splice(
collapse_start_index..collapse_end_index,
std::iter::repeat_n(0, zero_len),
);
self.bins[collapse_end_index] += collapsed_count;
}
self.min_key = new_min_key;
self.shift_bins(shift);
}
self.max_key = new_max_key;
self.is_collapsed = true;
} else {
self.center_bins(new_min_key, new_max_key);
self.min_key = new_min_key;
self.max_key = new_max_key;
}
}
fn shift_bins(&mut self, shift: i32) {
if shift > 0 {
let shift = shift as usize;
self.bins.rotate_right(shift);
for idx in 0..shift {
self.bins[idx] = 0;
}
} else {
let shift = shift.unsigned_abs() as usize;
for idx in 0..shift {
self.bins[idx] = 0;
}
self.bins.rotate_left(shift);
}
self.offset -= shift;
}
fn center_bins(&mut self, new_min_key: i32, new_max_key: i32) {
let middle_key = new_min_key + (new_max_key - new_min_key + 1) / 2;
let shift = self.offset + self.length() / 2 - middle_key;
self.shift_bins(shift)
}
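/// Return the key of the bin that contains the value at the given 0-based
/// `rank`, walking the bins in ascending key order and accumulating counts.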
pub fn key_at_rank(&self, rank: u64) -> i32 {
let mut n = 0;
for (i, bin) in self.bins.iter().enumerate() {
n += *bin;
if n > rank {
return i as i32 + self.offset;
}
}
self.max_key
}
pub fn count(&self) -> u64 {
self.count
}
pub fn merge(&mut self, other: &Store) {
if other.count == 0 {
return;
}
if self.count == 0 {
self.copy(other);
return;
}
if other.min_key < self.min_key || other.max_key > self.max_key {
self.extend_range(other.min_key, Some(other.max_key));
}
let collapse_start_index = other.min_key - other.offset;
let mut collapse_end_index = i32::min(self.min_key, other.max_key + 1) - other.offset;
if collapse_end_index > collapse_start_index {
let collapsed_count: u64 = self.bins
[collapse_start_index as usize..collapse_end_index as usize]
.iter()
.sum();
self.bins[0] += collapsed_count;
} else {
collapse_end_index = collapse_start_index;
}
for key in (collapse_end_index + other.offset)..(other.max_key + 1) {
self.bins[(key - self.offset) as usize] += other.bins[(key - other.offset) as usize]
}
self.count += other.count;
}
fn copy(&mut self, o: &Store) {
self.bins = o.bins.clone();
self.count = o.count;
self.min_key = o.min_key;
self.max_key = o.max_key;
self.offset = o.offset;
self.bin_limit = o.bin_limit;
self.is_collapsed = o.is_collapsed;
}
}
#[cfg(test)]
mod tests {
use crate::store::Store;
#[test]
fn test_simple_store() {
let mut s = Store::new(2048);
for i in 0..2048 {
s.add(i);
}
}
#[test]
fn test_simple_store_rev() {
let mut s = Store::new(2048);
for i in (0..2048).rev() {
s.add(i);
}
}
}

View File

@@ -0,0 +1,88 @@
use std::cmp::Ordering;
pub struct Dataset {
values: Vec<f64>,
sum: f64,
sorted: bool,
}
fn cmp_f64(a: &f64, b: &f64) -> Ordering {
assert!(!a.is_nan() && !b.is_nan());
// Safe to unwrap: NaN values are rejected by the assert above.
a.partial_cmp(b).unwrap()
}
impl Dataset {
pub fn new() -> Self {
Dataset {
values: Vec::new(),
sum: 0.0,
sorted: false,
}
}
pub fn add(&mut self, value: f64) {
self.values.push(value);
self.sum += value;
self.sorted = false;
}
// pub fn quantile(&mut self, q: f64) -> f64 {
// self.lower_quantile(q)
// }
pub fn lower_quantile(&mut self, q: f64) -> f64 {
if !(0.0..=1.0).contains(&q) || self.values.is_empty() {
return f64::NAN;
}
self.sort();
let rank = q * (self.values.len() - 1) as f64;
self.values[rank.floor() as usize]
}
pub fn upper_quantile(&mut self, q: f64) -> f64 {
if !(0.0..=1.0).contains(&q) || self.values.is_empty() {
return f64::NAN;
}
self.sort();
let rank = q * (self.values.len() - 1) as f64;
self.values[rank.ceil() as usize]
}
pub fn min(&mut self) -> f64 {
self.sort();
self.values[0]
}
pub fn max(&mut self) -> f64 {
self.sort();
self.values[self.values.len() - 1]
}
pub fn sum(&self) -> f64 {
self.sum
}
pub fn count(&self) -> usize {
self.values.len()
}
fn sort(&mut self) {
if self.sorted {
return;
}
self.values.sort_by(cmp_f64);
self.sorted = true;
}
}

View File

@@ -0,0 +1,100 @@
extern crate rand;
extern crate rand_distr;
use rand::prelude::*;
pub trait Generator {
fn generate(&mut self) -> f64;
}
// Constant generator
//
pub struct Constant {
value: f64,
}
impl Constant {
pub fn new(value: f64) -> Self {
Constant { value }
}
}
impl Generator for Constant {
fn generate(&mut self) -> f64 {
self.value
}
}
// Linear generator
//
pub struct Linear {
current_value: f64,
step: f64,
}
impl Linear {
pub fn new(start_value: f64, step: f64) -> Self {
Linear {
current_value: start_value,
step,
}
}
}
impl Generator for Linear {
fn generate(&mut self) -> f64 {
let value = self.current_value;
self.current_value += self.step;
value
}
}
// Normal distribution generator
//
pub struct Normal {
distr: rand_distr::Normal<f64>,
}
impl Normal {
pub fn new(mean: f64, stddev: f64) -> Self {
Normal {
distr: rand_distr::Normal::new(mean, stddev).unwrap(),
}
}
}
impl Generator for Normal {
fn generate(&mut self) -> f64 {
self.distr.sample(&mut rand::thread_rng())
}
}
// Lognormal distribution generator
//
pub struct Lognormal {
distr: rand_distr::LogNormal<f64>,
}
impl Lognormal {
pub fn new(mean: f64, stddev: f64) -> Self {
Lognormal {
distr: rand_distr::LogNormal::new(mean, stddev).unwrap(),
}
}
}
impl Generator for Lognormal {
fn generate(&mut self) -> f64 {
self.distr.sample(&mut rand::thread_rng())
}
}
// Exponential distribution generator
//
pub struct Exponential {
distr: rand_distr::Exp<f64>,
}
impl Exponential {
pub fn new(lambda: f64) -> Self {
Exponential {
distr: rand_distr::Exp::new(lambda).unwrap(),
}
}
}
impl Generator for Exponential {
fn generate(&mut self) -> f64 {
self.distr.sample(&mut rand::thread_rng())
}
}

View File

@@ -0,0 +1,2 @@
pub mod dataset;
pub mod generator;

View File

@@ -0,0 +1,316 @@
mod common;
use std::time::Instant;
use common::dataset::Dataset;
use common::generator;
use common::generator::Generator;
use sketches_ddsketch::{Config, DDSketch};
const TEST_ALPHA: f64 = 0.01;
const TEST_MAX_BINS: u32 = 1024;
const TEST_MIN_VALUE: f64 = 1.0e-9;
// Used for float equality
const TEST_ERROR_THRESH: f64 = 1.0e-9;
const TEST_SIZES: [usize; 5] = [3, 5, 10, 100, 1000];
const TEST_QUANTILES: [f64; 10] = [0.0, 0.1, 0.25, 0.5, 0.75, 0.9, 0.95, 0.99, 0.999, 1.0];
#[test]
fn test_constant() {
evaluate_sketches(|| Box::new(generator::Constant::new(42.0)));
}
#[test]
fn test_linear() {
evaluate_sketches(|| Box::new(generator::Linear::new(0.0, 1.0)));
}
#[test]
fn test_normal() {
evaluate_sketches(|| Box::new(generator::Normal::new(35.0, 1.0)));
}
#[test]
fn test_lognormal() {
evaluate_sketches(|| Box::new(generator::Lognormal::new(0.0, 2.0)));
}
#[test]
fn test_exponential() {
evaluate_sketches(|| Box::new(generator::Exponential::new(2.0)));
}
fn evaluate_test_sizes(f: impl Fn(usize)) {
for sz in &TEST_SIZES {
f(*sz);
}
}
fn evaluate_sketches(gen_factory: impl Fn() -> Box<dyn generator::Generator>) {
evaluate_test_sizes(|sz: usize| {
let mut generator = gen_factory();
evaluate_sketch(sz, &mut generator);
});
}
fn new_config() -> Config {
Config::new(TEST_ALPHA, TEST_MAX_BINS, TEST_MIN_VALUE)
}
fn assert_float_eq(a: f64, b: f64) {
assert!((a - b).abs() < TEST_ERROR_THRESH, "{} != {}", a, b);
}
fn evaluate_sketch(count: usize, generator: &mut Box<dyn generator::Generator>) {
let c = new_config();
let mut g = DDSketch::new(c);
let mut d = Dataset::new();
for _i in 0..count {
let value = generator.generate();
g.add(value);
d.add(value);
}
compare_sketches(&mut d, &g);
}
fn compare_sketches(d: &mut Dataset, g: &DDSketch) {
for q in &TEST_QUANTILES {
let lower = d.lower_quantile(*q);
let upper = d.upper_quantile(*q);
let min_expected = if lower < 0.0 {
lower * (1.0 + TEST_ALPHA)
} else {
lower * (1.0 - TEST_ALPHA)
};
let max_expected = if upper > 0.0 {
upper * (1.0 + TEST_ALPHA)
} else {
upper * (1.0 - TEST_ALPHA)
};
let quantile = g.quantile(*q).unwrap().unwrap();
assert!(
min_expected <= quantile,
"Lower than min, quantile: {}, wanted {} <= {}",
*q,
min_expected,
quantile
);
assert!(
quantile <= max_expected,
"Higher than max, quantile: {}, wanted {} <= {}",
*q,
quantile,
max_expected
);
// Verify that querying again returns the same result (quantile takes &self, so it cannot mutate the sketch)
let quantile2 = g.quantile(*q).unwrap().unwrap();
assert_eq!(quantile, quantile2);
}
assert_eq!(g.min().unwrap(), d.min());
assert_eq!(g.max().unwrap(), d.max());
assert_float_eq(g.sum().unwrap(), d.sum());
assert_eq!(g.count(), d.count());
}
#[test]
fn test_merge_normal() {
evaluate_test_sizes(|sz: usize| {
let c = new_config();
let mut d = Dataset::new();
let mut g1 = DDSketch::new(c);
let mut generator1 = generator::Normal::new(35.0, 1.0);
for _ in (0..sz).step_by(3) {
let value = generator1.generate();
g1.add(value);
d.add(value);
}
let mut g2 = DDSketch::new(c);
let mut generator2 = generator::Normal::new(50.0, 2.0);
for _ in (1..sz).step_by(3) {
let value = generator2.generate();
g2.add(value);
d.add(value);
}
g1.merge(&g2).unwrap();
let mut g3 = DDSketch::new(c);
let mut generator3 = generator::Normal::new(40.0, 0.5);
for _ in (2..sz).step_by(3) {
let value = generator3.generate();
g3.add(value);
d.add(value);
}
g1.merge(&g3).unwrap();
compare_sketches(&mut d, &g1);
});
}
#[test]
fn test_merge_empty() {
evaluate_test_sizes(|sz: usize| {
let c = new_config();
let mut d = Dataset::new();
let mut g1 = DDSketch::new(c);
let mut g2 = DDSketch::new(c);
let mut generator = generator::Exponential::new(5.0);
for _ in 0..sz {
let value = generator.generate();
g2.add(value);
d.add(value);
}
g1.merge(&g2).unwrap();
compare_sketches(&mut d, &g1);
let g3 = DDSketch::new(c);
g2.merge(&g3).unwrap();
compare_sketches(&mut d, &g2);
});
}
#[test]
fn test_merge_mixed() {
evaluate_test_sizes(|sz: usize| {
let c = new_config();
let mut d = Dataset::new();
let mut g1 = DDSketch::new(c);
let mut generator1 = generator::Normal::new(100.0, 1.0);
for _ in (0..sz).step_by(3) {
let value = generator1.generate();
g1.add(value);
d.add(value);
}
let mut g2 = DDSketch::new(c);
let mut generator2 = generator::Exponential::new(5.0);
for _ in (1..sz).step_by(3) {
let value = generator2.generate();
g2.add(value);
d.add(value);
}
g1.merge(&g2).unwrap();
let mut g3 = DDSketch::new(c);
let mut generator3 = generator::Exponential::new(0.1);
for _ in (2..sz).step_by(3) {
let value = generator3.generate();
g3.add(value);
d.add(value);
}
g1.merge(&g3).unwrap();
compare_sketches(&mut d, &g1);
})
}
#[test]
fn test_merge_incompatible() {
let c1 = Config::new(TEST_ALPHA, TEST_MAX_BINS, TEST_MIN_VALUE);
let c2 = Config::new(TEST_ALPHA * 2.0, TEST_MAX_BINS, TEST_MIN_VALUE);
let mut d1 = DDSketch::new(c1);
let d2 = DDSketch::new(c2);
assert!(d1.merge(&d2).is_err());
let c3 = Config::new(TEST_ALPHA, TEST_MAX_BINS, TEST_MIN_VALUE * 10.0);
let d3 = DDSketch::new(c3);
assert!(d1.merge(&d3).is_err());
let c4 = Config::new(TEST_ALPHA, TEST_MAX_BINS * 2, TEST_MIN_VALUE);
let d4 = DDSketch::new(c4);
assert!(d1.merge(&d4).is_err());
// the same should work
let c5 = Config::new(TEST_ALPHA, TEST_MAX_BINS, TEST_MIN_VALUE);
let dsame = DDSketch::new(c5);
assert!(d1.merge(&dsame).is_ok());
}
#[test]
#[ignore]
fn test_performance_insert() {
let c = Config::defaults();
let mut g = DDSketch::new(c);
let mut gen = generator::Normal::new(1000.0, 500.0);
let count = 300_000_000;
let mut values = Vec::new();
for _ in 0..count {
values.push(gen.generate());
}
let start_time = Instant::now();
for value in values {
g.add(value);
}
// This simply ensures the inserts above don't get optimized away
let quantile = g.quantile(0.50).unwrap().unwrap();
let elapsed = start_time.elapsed().as_micros() as f64;
let elapsed = elapsed / 1_000_000.0;
println!(
"RESULT: p50={:.2} => Added {}M samples in {:2} secs ({:.2}M samples/sec)",
quantile,
count / 1_000_000,
elapsed,
(count as f64) / 1_000_000.0 / elapsed
);
}
#[test]
#[ignore]
fn test_performance_merge() {
let c = Config::defaults();
let mut gen = generator::Normal::new(1000.0, 500.0);
let merge_count = 500_000;
let sample_count = 1_000;
let mut sketches = Vec::new();
for _ in 0..merge_count {
let mut d = DDSketch::new(c);
for _ in 0..sample_count {
d.add(gen.generate());
}
sketches.push(d);
}
let mut base = DDSketch::new(c);
let start_time = Instant::now();
for sketch in &sketches {
base.merge(sketch).unwrap();
}
let elapsed = start_time.elapsed().as_micros() as f64;
let elapsed = elapsed / 1_000_000.0;
println!(
"RESULT: Merged {} sketches in {:2} secs ({:.2} merges/sec)",
merge_count,
elapsed,
(merge_count as f64) / elapsed
);
}

View File

@@ -95,11 +95,21 @@ pub(crate) fn get_all_ff_reader_or_empty(
allowed_column_types: Option<&[ColumnType]>,
fallback_type: ColumnType,
) -> crate::Result<Vec<(columnar::Column<u64>, ColumnType)>> {
let ff_fields = reader.fast_fields();
let mut ff_field_with_type =
ff_fields.u64_lenient_for_type_all(allowed_column_types, field_name)?;
let mut ff_field_with_type = get_all_ff_readers(reader, field_name, allowed_column_types)?;
if ff_field_with_type.is_empty() {
ff_field_with_type.push((Column::build_empty_column(reader.num_docs()), fallback_type));
}
Ok(ff_field_with_type)
}
/// Get all fast field readers.
pub(crate) fn get_all_ff_readers(
reader: &SegmentReader,
field_name: &str,
allowed_column_types: Option<&[ColumnType]>,
) -> crate::Result<Vec<(columnar::Column<u64>, ColumnType)>> {
let ff_fields = reader.fast_fields();
let ff_field_with_type =
ff_fields.u64_lenient_for_type_all(allowed_column_types, field_name)?;
Ok(ff_field_with_type)
}

View File

@@ -9,11 +9,12 @@ use crate::aggregation::accessor_helpers::{
get_numeric_or_date_column_types,
};
use crate::aggregation::agg_req::{Aggregation, AggregationVariants, Aggregations};
pub use crate::aggregation::bucket::{CompositeAggReqData, CompositeSourceAccessors};
use crate::aggregation::bucket::{
build_segment_filter_collector, build_segment_range_collector, FilterAggReqData,
HistogramAggReqData, HistogramBounds, IncludeExcludeParam, MissingTermAggReqData,
RangeAggReqData, SegmentHistogramCollector, TermMissingAgg, TermsAggReqData, TermsAggregation,
TermsAggregationInternal,
build_segment_filter_collector, build_segment_range_collector, CompositeAggregation,
FilterAggReqData, HistogramAggReqData, HistogramBounds, IncludeExcludeParam,
MissingTermAggReqData, RangeAggReqData, SegmentCompositeCollector, SegmentHistogramCollector,
TermMissingAgg, TermsAggReqData, TermsAggregation, TermsAggregationInternal,
};
use crate::aggregation::metric::{
build_segment_stats_collector, AverageAggregation, CardinalityAggReqData,
@@ -73,6 +74,12 @@ impl AggregationsSegmentCtx {
self.per_request.filter_req_data.push(Some(Box::new(data)));
self.per_request.filter_req_data.len() - 1
}
pub(crate) fn push_composite_req_data(&mut self, data: CompositeAggReqData) -> usize {
self.per_request
.composite_req_data
.push(Some(Box::new(data)));
self.per_request.composite_req_data.len() - 1
}
#[inline]
pub(crate) fn get_term_req_data(&self, idx: usize) -> &TermsAggReqData {
@@ -108,6 +115,12 @@ impl AggregationsSegmentCtx {
.as_deref()
.expect("range_req_data slot is empty (taken)")
}
#[inline]
pub(crate) fn get_composite_req_data(&self, idx: usize) -> &CompositeAggReqData {
self.per_request.composite_req_data[idx]
.as_deref()
.expect("composite_req_data slot is empty (taken)")
}
// ---------- mutable getters ----------
@@ -130,8 +143,14 @@ impl AggregationsSegmentCtx {
.as_deref_mut()
.expect("histogram_req_data slot is empty (taken)")
}
#[inline]
pub(crate) fn get_composite_req_data_mut(&mut self, idx: usize) -> &mut CompositeAggReqData {
self.per_request.composite_req_data[idx]
.as_deref_mut()
.expect("composite_req_data slot is empty (taken)")
}
// ---------- take / put (terms, histogram, range) ----------
// ---------- take / put (terms, histogram, range, composite) ----------
/// Move out the boxed Histogram request at `idx`, leaving `None`.
#[inline]
@@ -181,6 +200,25 @@ impl AggregationsSegmentCtx {
debug_assert!(self.per_request.filter_req_data[idx].is_none());
self.per_request.filter_req_data[idx] = Some(value);
}
/// Move out the Composite request at `idx`.
#[inline]
pub(crate) fn take_composite_req_data(&mut self, idx: usize) -> Box<CompositeAggReqData> {
self.per_request.composite_req_data[idx]
.take()
.expect("composite_req_data slot is empty (taken)")
}
/// Put back a Composite request into an empty slot at `idx`.
#[inline]
pub(crate) fn put_back_composite_req_data(
&mut self,
idx: usize,
value: Box<CompositeAggReqData>,
) {
debug_assert!(self.per_request.composite_req_data[idx].is_none());
self.per_request.composite_req_data[idx] = Some(value);
}
}
/// Each type of aggregation has its own request data struct. This struct holds
@@ -200,6 +238,8 @@ pub struct PerRequestAggSegCtx {
pub range_req_data: Vec<Option<Box<RangeAggReqData>>>,
/// FilterAggReqData contains the request data for a filter aggregation.
pub filter_req_data: Vec<Option<Box<FilterAggReqData>>>,
/// CompositeAggReqData contains the request data for a composite aggregation.
pub composite_req_data: Vec<Option<Box<CompositeAggReqData>>>,
/// Shared by avg, min, max, sum, stats, extended_stats, count
pub stats_metric_req_data: Vec<MetricAggReqData>,
/// CardinalityAggReqData contains the request data for a cardinality aggregation.
@@ -255,6 +295,11 @@ impl PerRequestAggSegCtx {
.iter()
.map(|t| t.get_memory_consumption())
.sum::<usize>()
+ self
.composite_req_data
.iter()
.map(|t| t.as_ref().unwrap().get_memory_consumption())
.sum::<usize>()
+ self.agg_tree.len() * std::mem::size_of::<AggRefNode>()
}
@@ -291,6 +336,11 @@ impl PerRequestAggSegCtx {
.expect("filter_req_data slot is empty (taken)")
.name
.as_str(),
AggKind::Composite => &self.composite_req_data[idx]
.as_deref()
.expect("composite_req_data slot is empty (taken)")
.name
.as_str(),
}
}
@@ -417,6 +467,9 @@ pub(crate) fn build_segment_agg_collector(
)?)),
AggKind::Range => Ok(build_segment_range_collector(req, node)?),
AggKind::Filter => build_segment_filter_collector(req, node),
AggKind::Composite => Ok(Box::new(SegmentCompositeCollector::from_req_and_validate(
req, node,
)?)),
}
}
@@ -447,6 +500,7 @@ pub enum AggKind {
DateHistogram,
Range,
Filter,
Composite,
}
impl AggKind {
@@ -462,6 +516,7 @@ impl AggKind {
AggKind::DateHistogram => "DateHistogram",
AggKind::Range => "Range",
AggKind::Filter => "Filter",
AggKind::Composite => "Composite",
}
}
}
@@ -740,6 +795,14 @@ fn build_nodes(
children,
}])
}
AggregationVariants::Composite(composite_req) => Ok(vec![build_composite_node(
agg_name,
reader,
segment_ordinal,
data,
&req.sub_aggregation,
composite_req,
)?]),
}
}
@@ -935,6 +998,35 @@ fn build_terms_or_cardinality_nodes(
Ok(nodes)
}
fn build_composite_node(
agg_name: &str,
reader: &SegmentReader,
segment_ordinal: SegmentOrdinal,
data: &mut AggregationsSegmentCtx,
sub_aggs: &Aggregations,
req: &CompositeAggregation,
) -> crate::Result<AggRefNode> {
let mut composite_accessors = Vec::with_capacity(req.sources.len());
for source in &req.sources {
let source_after_key_opt = req.after.get(source.name()).map(|k| &k.0);
let source_accessor =
CompositeSourceAccessors::build_for_source(reader, source, source_after_key_opt)?;
composite_accessors.push(source_accessor);
}
let agg = CompositeAggReqData {
name: agg_name.to_string(),
req: req.clone(),
composite_accessors,
};
let idx = data.push_composite_req_data(agg);
let children = build_children(sub_aggs, reader, segment_ordinal, data)?;
Ok(AggRefNode {
kind: AggKind::Composite,
idx_in_req_data: idx,
children,
})
}
/// Builds a single BitSet of allowed term ordinals for a string dictionary column according to
/// include/exclude parameters.
fn build_allowed_term_ids_for_str(

View File

@@ -40,6 +40,7 @@ use super::metric::{
MaxAggregation, MinAggregation, PercentilesAggregationReq, StatsAggregation, SumAggregation,
TopHitsAggregationReq,
};
use crate::aggregation::bucket::CompositeAggregation;
/// The top-level aggregation request structure, which contains [`Aggregation`] and their user
/// defined names. It is also used in buckets aggregations to define sub-aggregations.
@@ -134,6 +135,9 @@ pub enum AggregationVariants {
/// Filter documents into a single bucket.
#[serde(rename = "filter")]
Filter(FilterAggregation),
/// Put data into multi-level, paginated buckets.
#[serde(rename = "composite")]
Composite(CompositeAggregation),
// Metric aggregation types
/// Computes the average of the extracted values.
@@ -180,6 +184,11 @@ impl AggregationVariants {
AggregationVariants::Histogram(histogram) => vec![histogram.field.as_str()],
AggregationVariants::DateHistogram(histogram) => vec![histogram.field.as_str()],
AggregationVariants::Filter(filter) => filter.get_fast_field_names(),
AggregationVariants::Composite(composite) => composite
.sources
.iter()
.map(|source_map| source_map.field())
.collect(),
AggregationVariants::Average(avg) => vec![avg.field_name()],
AggregationVariants::Count(count) => vec![count.field_name()],
AggregationVariants::Max(max) => vec![max.field_name()],
@@ -214,6 +223,12 @@ impl AggregationVariants {
_ => None,
}
}
pub(crate) fn as_composite(&self) -> Option<&CompositeAggregation> {
match &self {
AggregationVariants::Composite(composite) => Some(composite),
_ => None,
}
}
pub(crate) fn as_percentile(&self) -> Option<&PercentilesAggregationReq> {
match &self {
AggregationVariants::Percentiles(percentile_req) => Some(percentile_req),

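For orientation (not part of the diff): thanks to #[serde(rename = "composite")], a request like the following should deserialize into the new variant. The body mirrors Elasticsearch's composite syntax; the per-source encoding ("terms", "date_histogram", "calendar_interval") is an assumption, since the serde layout of `CompositeAggregationSource` is not shown in this hunk. Module-internal sketch:

fn parse_composite_request() -> serde_json::Result<Aggregations> {
    serde_json::from_str(
        r#"{
            "my_composite": {
                "composite": {
                    "size": 10,
                    "sources": [
                        { "product": { "terms": { "field": "product" } } },
                        { "date": { "date_histogram": { "field": "timestamp", "calendar_interval": "month" } } }
                    ]
                }
            }
        }"#,
    )
}
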
View File

@@ -13,6 +13,8 @@ use super::metric::{
ExtendedStats, PercentilesMetricResult, SingleMetricResult, Stats, TopHitsMetricResult,
};
use super::{AggregationError, Key};
use crate::aggregation::bucket::AfterKey;
use crate::aggregation::intermediate_agg_result::CompositeIntermediateKey;
use crate::TantivyError;
#[derive(Clone, Default, Debug, PartialEq, Serialize, Deserialize)]
@@ -158,6 +160,16 @@ pub enum BucketResult {
},
/// This is the filter result - a single bucket with sub-aggregations
Filter(FilterBucketResult),
/// This is the composite aggregation result
Composite {
/// The buckets
///
/// See [`CompositeAggregation`](super::bucket::CompositeAggregation)
buckets: Vec<CompositeBucketEntry>,
/// The key to start after when paginating
#[serde(skip_serializing_if = "FxHashMap::is_empty")]
after_key: FxHashMap<String, AfterKey>,
},
}
impl BucketResult {
@@ -179,6 +191,9 @@ impl BucketResult {
// Only count sub-aggregation buckets
filter_result.sub_aggregations.get_bucket_count()
}
BucketResult::Composite { buckets, .. } => {
buckets.iter().map(|bucket| bucket.get_bucket_count()).sum()
}
}
}
}
@@ -337,3 +352,130 @@ pub struct FilterBucketResult {
#[serde(flatten)]
pub sub_aggregations: AggregationResults,
}
/// The JSON-mappable key identifying a composite bucket.
///
/// This is similar to `Key`, but composite keys can also be boolean and null.
///
/// Note that type information is lost compared to `CompositeIntermediateKey`.
/// Pagination is performed using `AfterKey`, which encodes type information.
#[derive(Clone, Debug, Serialize, Deserialize)]
#[serde(untagged)]
pub enum CompositeKey {
/// Boolean key
Bool(bool),
/// String key
Str(String),
/// `i64` key
I64(i64),
/// `u64` key
U64(u64),
/// `f64` key
F64(f64),
/// Null key
Null,
}
impl Eq for CompositeKey {}
impl std::hash::Hash for CompositeKey {
fn hash<H: std::hash::Hasher>(&self, state: &mut H) {
core::mem::discriminant(self).hash(state);
match self {
Self::Bool(val) => val.hash(state),
Self::Str(text) => text.hash(state),
Self::F64(val) => val.to_bits().hash(state),
Self::U64(val) => val.hash(state),
Self::I64(val) => val.hash(state),
Self::Null => {}
}
}
}
impl PartialEq for CompositeKey {
fn eq(&self, other: &Self) -> bool {
match (self, other) {
(Self::Bool(l), Self::Bool(r)) => l == r,
(Self::Str(l), Self::Str(r)) => l == r,
(Self::F64(l), Self::F64(r)) => l.to_bits() == r.to_bits(),
(Self::I64(l), Self::I64(r)) => l == r,
(Self::U64(l), Self::U64(r)) => l == r,
(Self::Null, Self::Null) => true,
(
Self::Bool(_)
| Self::Str(_)
| Self::F64(_)
| Self::I64(_)
| Self::U64(_)
| Self::Null,
_,
) => false,
}
}
}
impl From<CompositeIntermediateKey> for CompositeKey {
fn from(value: CompositeIntermediateKey) -> Self {
match value {
CompositeIntermediateKey::Str(s) => Self::Str(s),
CompositeIntermediateKey::IpAddr(s) => {
// Prefer to use the IPv4 representation if possible
if let Some(ip) = s.to_ipv4_mapped() {
Self::Str(ip.to_string())
} else {
Self::Str(s.to_string())
}
}
CompositeIntermediateKey::F64(f) => Self::F64(f),
CompositeIntermediateKey::Bool(f) => Self::Bool(f),
CompositeIntermediateKey::U64(f) => Self::U64(f),
CompositeIntermediateKey::I64(f) => Self::I64(f),
CompositeIntermediateKey::DateTime(f) => Self::I64(f / 1_000_000), // Convert ns to ms
CompositeIntermediateKey::Null => Self::Null,
}
}
}
/// This is the default entry for a bucket, which contains a composite key, count, and optionally
/// sub-aggregations.
///
/// # JSON Format
/// ```json
/// {
/// ...
/// "my_composite": {
/// "buckets": [
/// {
/// "key": {
/// "date": 1494201600000,
/// "product": "rocky"
/// },
/// "doc_count": 5
/// },
/// {
/// "key": {
/// "date": 1494201600000,
/// "product": "balboa"
/// },
/// "doc_count": 2
/// },
/// {
/// "key": {
/// "date": 1494201700000,
/// "product": "john"
/// },
/// "doc_count": 3
/// }
/// ]
/// }
/// ...
/// }
/// ```
#[derive(Clone, Debug, PartialEq, Serialize, Deserialize)]
pub struct CompositeBucketEntry {
/// The identifier of the bucket.
pub key: FxHashMap<String, CompositeKey>,
/// Number of documents in the bucket.
pub doc_count: u64,
#[serde(flatten)]
/// Sub-aggregations in this bucket.
pub sub_aggregation: AggregationResults,
}
impl CompositeBucketEntry {
pub(crate) fn get_bucket_count(&self) -> u64 {
1 + self.sub_aggregation.get_bucket_count()
}
}
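
Not part of the diff: a small sketch of why `CompositeKey` hand-rolls `Hash`/`PartialEq` over `f64::to_bits`. Float keys must be usable in hash maps, so equality follows the bit pattern rather than IEEE semantics; only the impls shown above are assumed.

fn composite_key_float_semantics() {
    // 0.0 and -0.0 have different bit patterns, so they are distinct keys.
    assert_ne!(CompositeKey::F64(0.0), CompositeKey::F64(-0.0));
    // The f64::NAN constant has a fixed bit pattern, so these keys compare (and hash)
    // equal, unlike plain f64 where NaN != NaN.
    assert_eq!(CompositeKey::F64(f64::NAN), CompositeKey::F64(f64::NAN));
}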

View File

@@ -0,0 +1,515 @@
use std::fmt::Debug;
use std::net::Ipv6Addr;
use columnar::column_values::{CompactHit, CompactSpaceU64Accessor};
use columnar::{Column, ColumnType, MonotonicallyMappableToU64, StrColumn, TermOrdHit};
use crate::aggregation::accessor_helpers::{get_all_ff_readers, get_numeric_or_date_column_types};
use crate::aggregation::bucket::composite::numeric_types::num_proj;
use crate::aggregation::bucket::composite::numeric_types::num_proj::ProjectedNumber;
use crate::aggregation::bucket::composite::ToTypePaginationOrder;
use crate::aggregation::bucket::{
parse_into_milliseconds, CalendarInterval, CompositeAggregation, CompositeAggregationSource,
MissingOrder, Order,
};
use crate::aggregation::intermediate_agg_result::CompositeIntermediateKey;
use crate::{SegmentReader, TantivyError};
/// Contains all information required by the SegmentCompositeCollector to perform the
/// composite aggregation on a segment.
pub struct CompositeAggReqData {
/// The name of the aggregation.
pub name: String,
/// The composite aggregation request.
pub req: CompositeAggregation,
/// Accessors for each source, each source can have multiple accessors (columns).
pub composite_accessors: Vec<CompositeSourceAccessors>,
}
impl CompositeAggReqData {
/// Estimate the memory consumption of this struct in bytes.
pub fn get_memory_consumption(&self) -> usize {
std::mem::size_of::<Self>()
+ self.composite_accessors.len() * std::mem::size_of::<CompositeSourceAccessors>()
}
}
/// Accessors for a single column in a composite source.
pub struct CompositeAccessor {
/// The fast field column
pub column: Column<u64>,
/// The column type
pub column_type: ColumnType,
/// Term dictionary if the column type is Str
///
/// Only used by term sources
pub str_dict_column: Option<StrColumn>,
/// Parsed date interval for date histogram sources
pub date_histogram_interval: PrecomputedDateInterval,
}
/// Accessors to all the columns that belong to the field of a composite source.
pub struct CompositeSourceAccessors {
/// The accessors for this source
pub accessors: Vec<CompositeAccessor>,
/// The key after which to start collecting results. Applies to the column
/// at `after_key_accessor_idx`.
pub after_key: PrecomputedAfterKey,
/// The column index the after_key applies to. The after_key only applies to
/// one column. Columns before should be skipped. Columns after should be
/// kept without comparison to the after_key.
pub after_key_accessor_idx: usize,
/// Whether to skip missing values because of the after_key. Skipping only
/// applies if the values of the previous sources were exactly equal to the
/// corresponding after keys (`is_on_after_key`).
pub skip_missing: bool,
/// The after key was set to null to indicate that the last collected key
/// was a missing value.
pub is_after_key_explicit_missing: bool,
}
impl CompositeSourceAccessors {
/// Creates a new set of accessors for the composite source.
///
/// Precomputes some values to make collection faster.
pub fn build_for_source(
reader: &SegmentReader,
source: &CompositeAggregationSource,
// None when no after key was set in the query; contains
// `CompositeIntermediateKey::Null` when the after key was set but its
// value for this source was `null`
source_after_key_opt: Option<&CompositeIntermediateKey>,
) -> crate::Result<Self> {
let is_after_key_explicit_missing = source_after_key_opt
.map(|after_key| matches!(after_key, CompositeIntermediateKey::Null))
.unwrap_or(false);
let mut skip_missing = false;
if let Some(CompositeIntermediateKey::Null) = source_after_key_opt {
if !source.missing_bucket() {
return Err(TantivyError::InvalidArgument(
"the 'after' key for a source cannot be null when 'missing_bucket' is false"
.to_string(),
));
}
} else if source_after_key_opt.is_some() {
// if missing buckets come first and we have a non null after key, we skip missing
if MissingOrder::First == source.missing_order() {
skip_missing = true;
}
if MissingOrder::Default == source.missing_order() && Order::Asc == source.order() {
skip_missing = true;
}
};
match source {
CompositeAggregationSource::Terms(source) => {
let allowed_column_types = [
ColumnType::I64,
ColumnType::U64,
ColumnType::F64,
ColumnType::Str,
ColumnType::DateTime,
ColumnType::Bool,
ColumnType::IpAddr,
// ColumnType::Bytes Unsupported
];
let mut columns_and_types =
get_all_ff_readers(reader, &source.field, Some(&allowed_column_types))?;
// Sort columns by their pagination order and determine which to skip
columns_and_types.sort_by_key(|(_, col_type)| col_type.column_pagination_order());
if source.order == Order::Desc {
columns_and_types.reverse();
}
let after_key_accessor_idx = find_first_column_to_collect(
&columns_and_types,
source_after_key_opt,
source.missing_order,
source.order,
)?;
let source_collectors: Vec<CompositeAccessor> = columns_and_types
.into_iter()
.map(|(column, column_type)| {
Ok(CompositeAccessor {
column,
column_type,
str_dict_column: reader.fast_fields().str(&source.field)?,
date_histogram_interval: PrecomputedDateInterval::NotApplicable,
})
})
.collect::<crate::Result<_>>()?;
let after_key = if let Some(first_col) =
source_collectors.get(after_key_accessor_idx)
{
match source_after_key_opt {
Some(after_key) => PrecomputedAfterKey::precompute(
&first_col,
after_key,
&source.field,
source.missing_order,
source.order,
)?,
None => {
precompute_missing_after_key(false, source.missing_order, source.order)
}
}
} else {
// if no columns, we don't care about the after_key
PrecomputedAfterKey::Next(0)
};
Ok(CompositeSourceAccessors {
accessors: source_collectors,
is_after_key_explicit_missing,
skip_missing,
after_key,
after_key_accessor_idx,
})
}
CompositeAggregationSource::Histogram(source) => {
let column_and_types: Vec<(Column, ColumnType)> = get_all_ff_readers(
reader,
&source.field,
Some(get_numeric_or_date_column_types()),
)?;
let source_collectors: Vec<CompositeAccessor> = column_and_types
.into_iter()
.map(|(column, column_type)| {
Ok(CompositeAccessor {
column,
column_type,
str_dict_column: None,
date_histogram_interval: PrecomputedDateInterval::NotApplicable,
})
})
.collect::<crate::Result<_>>()?;
let after_key = match source_after_key_opt {
Some(CompositeIntermediateKey::F64(key)) => {
let normalized_key = *key / source.interval;
num_proj::f64_to_i64(normalized_key).into()
}
Some(CompositeIntermediateKey::Null) => {
precompute_missing_after_key(true, source.missing_order, source.order)
}
None => precompute_missing_after_key(true, source.missing_order, source.order),
_ => {
return Err(crate::TantivyError::InvalidArgument(
"After key type invalid for interval composite source".to_string(),
));
}
};
Ok(CompositeSourceAccessors {
accessors: source_collectors,
is_after_key_explicit_missing,
skip_missing,
after_key,
after_key_accessor_idx: 0,
})
}
CompositeAggregationSource::DateHistogram(source) => {
let column_and_types =
get_all_ff_readers(reader, &source.field, Some(&[ColumnType::DateTime]))?;
let date_histogram_interval =
PrecomputedDateInterval::from_date_histogram_source_intervals(
&source.fixed_interval,
source.calendar_interval,
)?;
let source_collectors: Vec<CompositeAccessor> = column_and_types
.into_iter()
.map(|(column, column_type)| {
Ok(CompositeAccessor {
column,
column_type,
str_dict_column: None,
date_histogram_interval,
})
})
.collect::<crate::Result<_>>()?;
let after_key = match source_after_key_opt {
Some(CompositeIntermediateKey::DateTime(key)) => {
PrecomputedAfterKey::Exact(key.to_u64())
}
Some(CompositeIntermediateKey::Null) => {
precompute_missing_after_key(true, source.missing_order, source.order)
}
None => precompute_missing_after_key(true, source.missing_order, source.order),
_ => {
return Err(crate::TantivyError::InvalidArgument(
"After key type invalid for interval composite source".to_string(),
));
}
};
Ok(CompositeSourceAccessors {
accessors: source_collectors,
is_after_key_explicit_missing,
skip_missing,
after_key,
after_key_accessor_idx: 0,
})
}
}
}
}
/// Finds the index of the first column we should start collecting from to
/// resume the pagination from the after_key.
fn find_first_column_to_collect<T>(
sorted_columns: &[(T, ColumnType)],
after_key_opt: Option<&CompositeIntermediateKey>,
missing_order: MissingOrder,
order: Order,
) -> crate::Result<usize> {
let after_key = match after_key_opt {
None => return Ok(0), // No pagination, start from beginning
Some(key) => key,
};
// Handle null after_key (we were on a missing value last time)
if matches!(after_key, CompositeIntermediateKey::Null) {
return match (missing_order, order) {
// Missing values come first, so all columns remain
(MissingOrder::First, _) | (MissingOrder::Default, Order::Asc) => Ok(0),
// Missing values come last, so all columns are done
(MissingOrder::Last, _) | (MissingOrder::Default, Order::Desc) => {
Ok(sorted_columns.len())
}
};
}
// Find the first column whose type order matches or follows the after_key's
// type in the pagination sequence
let after_key_column_order = after_key.column_pagination_order();
for (idx, (_, col_type)) in sorted_columns.iter().enumerate() {
let col_order = col_type.column_pagination_order();
let is_first_to_collect = match order {
Order::Asc => col_order >= after_key_column_order,
Order::Desc => col_order <= after_key_column_order,
};
if is_first_to_collect {
return Ok(idx);
}
}
// All columns are before the after_key, nothing left to collect
Ok(sorted_columns.len())
}
fn precompute_missing_after_key(
is_after_key_explicit_missing: bool,
missing_order: MissingOrder,
order: Order,
) -> PrecomputedAfterKey {
let after_last = PrecomputedAfterKey::AfterLast;
let before_first = PrecomputedAfterKey::Next(0);
match (is_after_key_explicit_missing, missing_order, order) {
(true, MissingOrder::First, Order::Asc) => before_first,
(true, MissingOrder::First, Order::Desc) => after_last,
(true, MissingOrder::Last, Order::Asc) => after_last,
(true, MissingOrder::Last, Order::Desc) => before_first,
(true, MissingOrder::Default, Order::Asc) => before_first,
(true, MissingOrder::Default, Order::Desc) => after_last,
(false, _, Order::Asc) => before_first,
(false, _, Order::Desc) => after_last,
}
}
/// A parsed representation of the date interval for date histogram sources
#[derive(Clone, Copy, Debug)]
pub enum PrecomputedDateInterval {
/// This is not a date histogram source
NotApplicable,
/// Source was configured with a fixed interval
FixedNanoseconds(i64),
/// Source was configured with a calendar interval
Calendar(CalendarInterval),
}
impl PrecomputedDateInterval {
/// Validates the date histogram source interval fields and parses a date interval from them.
pub fn from_date_histogram_source_intervals(
fixed_interval: &Option<String>,
calendar_interval: Option<CalendarInterval>,
) -> crate::Result<Self> {
match (fixed_interval, calendar_interval) {
(Some(_), Some(_)) | (None, None) => Err(TantivyError::InvalidArgument(
"date histogram source must one and only one of fixed_interval or \
calendar_interval set"
.to_string(),
)),
(Some(fixed_interval), None) => {
let fixed_interval_ms = parse_into_milliseconds(&fixed_interval)?;
Ok(PrecomputedDateInterval::FixedNanoseconds(
fixed_interval_ms * 1_000_000,
))
}
(None, Some(calendar_interval)) => {
Ok(PrecomputedDateInterval::Calendar(calendar_interval))
}
}
}
}
/// The after key projected to the u64 column space
///
/// Some column types (term, IP) might not have an exact representation of the
/// specified after key
#[derive(Debug)]
pub enum PrecomputedAfterKey {
/// The after key could be exactly represented in the column space.
Exact(u64),
/// The after key could not be represented exactly, so this is the next
/// closest value ordered after it.
Next(u64),
/// The after key could not be represented in the column space; it is
/// greater than all values.
AfterLast,
}
impl From<TermOrdHit> for PrecomputedAfterKey {
fn from(hit: TermOrdHit) -> Self {
match hit {
TermOrdHit::Exact(ord) => PrecomputedAfterKey::Exact(ord),
// TermOrdHit represents AfterLast as Next(u64::MAX), we keep it as is
TermOrdHit::Next(ord) => PrecomputedAfterKey::Next(ord),
}
}
}
impl From<CompactHit> for PrecomputedAfterKey {
fn from(hit: CompactHit) -> Self {
match hit {
CompactHit::Exact(ord) => PrecomputedAfterKey::Exact(ord as u64),
CompactHit::Next(ord) => PrecomputedAfterKey::Next(ord as u64),
CompactHit::AfterLast => PrecomputedAfterKey::AfterLast,
}
}
}
impl<T: MonotonicallyMappableToU64> From<ProjectedNumber<T>> for PrecomputedAfterKey {
fn from(num: ProjectedNumber<T>) -> Self {
match num {
ProjectedNumber::Exact(number) => PrecomputedAfterKey::Exact(number.to_u64()),
ProjectedNumber::Next(number) => PrecomputedAfterKey::Next(number.to_u64()),
ProjectedNumber::AfterLast => PrecomputedAfterKey::AfterLast,
}
}
}
// /!\ These operators only make sense if both values are in the same column space
impl PrecomputedAfterKey {
pub fn equals(&self, column_value: u64) -> bool {
match self {
PrecomputedAfterKey::Exact(v) => *v == column_value,
PrecomputedAfterKey::Next(_) => false,
PrecomputedAfterKey::AfterLast => false,
}
}
pub fn gt(&self, column_value: u64) -> bool {
match self {
PrecomputedAfterKey::Exact(v) => *v > column_value,
PrecomputedAfterKey::Next(v) => *v > column_value,
PrecomputedAfterKey::AfterLast => true,
}
}
pub fn lt(&self, column_value: u64) -> bool {
match self {
PrecomputedAfterKey::Exact(v) => *v < column_value,
// a value equal to the next is greater than the after key
PrecomputedAfterKey::Next(v) => *v <= column_value,
PrecomputedAfterKey::AfterLast => false,
}
}
fn precompute_ip_addr(column: &Column<u64>, key: &Ipv6Addr) -> crate::Result<Self> {
let compact_space_accessor = column
.values
.clone()
.downcast_arc::<CompactSpaceU64Accessor>()
.map_err(|_| {
TantivyError::AggregationError(crate::aggregation::AggregationError::InternalError(
"type mismatch: could not downcast to CompactSpaceU64Accessor".to_string(),
))
})?;
let ip_u128 = key.to_bits();
let ip_next_compact = compact_space_accessor.u128_to_next_compact(ip_u128);
Ok(ip_next_compact.into())
}
fn precompute_term_ord(
str_dict_column: &Option<StrColumn>,
key: &str,
field: &str,
) -> crate::Result<Self> {
let dict = str_dict_column
.as_ref()
.expect("dictionary missing for str accessor")
.dictionary();
let next_ord = dict.term_ord_or_next(key).map_err(|_| {
TantivyError::InvalidArgument(format!(
"failed to lookup after_key '{}' for field '{}'",
key, field
))
})?;
Ok(next_ord.into())
}
/// Projects the after key into the column space of the given accessor.
///
/// The computed after key does not take care of skipping entire columns
/// when the after key type is ordered after the accessor's type; that
/// should be handled earlier.
pub fn precompute(
composite_accessor: &CompositeAccessor,
source_after_key: &CompositeIntermediateKey,
field: &str,
missing_order: MissingOrder,
order: Order,
) -> crate::Result<Self> {
use CompositeIntermediateKey as CIKey;
let precomputed_key = match (composite_accessor.column_type, source_after_key) {
(ColumnType::Bytes, _) => panic!("unsupported"),
// null after key
(_, CIKey::Null) => precompute_missing_after_key(false, missing_order, order),
// numerical
(ColumnType::I64, CIKey::I64(k)) => PrecomputedAfterKey::Exact(k.to_u64()),
(ColumnType::I64, CIKey::U64(k)) => num_proj::u64_to_i64(*k).into(),
(ColumnType::I64, CIKey::F64(k)) => num_proj::f64_to_i64(*k).into(),
(ColumnType::U64, CIKey::I64(k)) => num_proj::i64_to_u64(*k).into(),
(ColumnType::U64, CIKey::U64(k)) => PrecomputedAfterKey::Exact(*k),
(ColumnType::U64, CIKey::F64(k)) => num_proj::f64_to_u64(*k).into(),
(ColumnType::F64, CIKey::I64(k)) => num_proj::i64_to_f64(*k).into(),
(ColumnType::F64, CIKey::U64(k)) => num_proj::u64_to_f64(*k).into(),
(ColumnType::F64, CIKey::F64(k)) => PrecomputedAfterKey::Exact(k.to_u64()),
// boolean
(ColumnType::Bool, CIKey::Bool(key)) => PrecomputedAfterKey::Exact(key.to_u64()),
// string
(ColumnType::Str, CIKey::Str(key)) => PrecomputedAfterKey::precompute_term_ord(
&composite_accessor.str_dict_column,
key,
field,
)?,
// date time
(ColumnType::DateTime, CIKey::DateTime(key)) => {
PrecomputedAfterKey::Exact(key.to_u64())
}
// ip address
(ColumnType::IpAddr, CIKey::IpAddr(key)) => {
PrecomputedAfterKey::precompute_ip_addr(&composite_accessor.column, key)?
}
// assume the column's type is ordered after the after_key's type
_ => PrecomputedAfterKey::keep_all(order),
};
Ok(precomputed_key)
}
fn keep_all(order: Order) -> Self {
match order {
Order::Asc => PrecomputedAfterKey::Next(0),
Order::Desc => PrecomputedAfterKey::Next(u64::MAX),
}
}
}
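
Not part of the diff: a worked example of the comparison semantics above. `Exact(v)` means the after key maps exactly to column value `v`; `Next(v)` means the exact after key is absent and `v` is the first value ordered after it, so `v` itself already lies past the after key.

fn after_key_comparison_sketch() {
    let exact = PrecomputedAfterKey::Exact(10);
    assert!(exact.equals(10));
    assert!(exact.gt(9)); // column value 9 still precedes the after key
    assert!(!exact.lt(10)); // 10 is not past an after key of exactly 10

    let next = PrecomputedAfterKey::Next(10);
    assert!(!next.equals(10)); // never an exact match
    assert!(next.lt(10)); // 10 is already ordered after the (absent) after key
    assert!(next.gt(9));
}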

View File

@@ -0,0 +1,140 @@
use time::convert::{Day, Nanosecond};
use time::{Time, UtcDateTime};
const NS_IN_DAY: i64 = Nanosecond::per_t::<i128>(Day) as i64;
/// Computes the timestamp in nanoseconds corresponding to the beginning of the
/// year (January 1st at midnight UTC).
pub(super) fn try_year_bucket(timestamp_ns: i64) -> crate::Result<i64> {
year_bucket_using_time_crate(timestamp_ns).map_err(|e| {
crate::TantivyError::InvalidArgument(format!(
"Failed to compute year bucket for timestamp {}: {}",
timestamp_ns,
e.to_string()
))
})
}
/// Computes the timestamp in nanoseconds corresponding to the beginning of the
/// month (1st at midnight UTC).
pub(super) fn try_month_bucket(timestamp_ns: i64) -> crate::Result<i64> {
month_bucket_using_time_crate(timestamp_ns).map_err(|e| {
crate::TantivyError::InvalidArgument(format!(
"Failed to compute month bucket for timestamp {}: {}",
timestamp_ns,
e.to_string()
))
})
}
/// Computes the timestamp in nanoseconds corresponding to the beginning of the
/// week (Monday at midnight UTC).
pub(super) fn week_bucket(timestamp_ns: i64) -> i64 {
// 1970-01-01 (day 0) was a Thursday, so shifting by 3 makes Monday map to 0
let days_since_epoch = timestamp_ns.div_euclid(NS_IN_DAY);
// Find the weekday: 0=Monday, ..., 6=Sunday
let weekday = (days_since_epoch + 3).rem_euclid(7);
let monday_days_since_epoch = days_since_epoch - weekday;
monday_days_since_epoch * NS_IN_DAY
}
fn year_bucket_using_time_crate(timestamp_ns: i64) -> Result<i64, time::Error> {
let timestamp_ns = UtcDateTime::from_unix_timestamp_nanos(timestamp_ns as i128)?
.replace_ordinal(1)?
.replace_time(Time::MIDNIGHT)
.unix_timestamp_nanos();
Ok(timestamp_ns as i64)
}
fn month_bucket_using_time_crate(timestamp_ns: i64) -> Result<i64, time::Error> {
let timestamp_ns = UtcDateTime::from_unix_timestamp_nanos(timestamp_ns as i128)?
.replace_day(1)?
.replace_time(Time::MIDNIGHT)
.unix_timestamp_nanos();
Ok(timestamp_ns as i64)
}
#[cfg(test)]
mod tests {
use std::i64;
use time::format_description::well_known::Iso8601;
use time::UtcDateTime;
use super::*;
fn ts_ns(iso: &str) -> i64 {
UtcDateTime::parse(iso, &Iso8601::DEFAULT)
.unwrap()
.unix_timestamp_nanos() as i64
}
#[test]
fn test_year_bucket() {
let ts = ts_ns("1970-01-01T00:00:00Z");
let res = try_year_bucket(ts).unwrap();
assert_eq!(res, ts_ns("1970-01-01T00:00:00Z"));
let ts = ts_ns("1970-06-01T10:00:01.010Z");
let res = try_year_bucket(ts).unwrap();
assert_eq!(res, ts_ns("1970-01-01T00:00:00Z"));
let ts = ts_ns("2008-12-31T23:59:59.999999999Z"); // leap year
let res = try_year_bucket(ts).unwrap();
assert_eq!(res, ts_ns("2008-01-01T00:00:00Z"));
let ts = ts_ns("2008-01-01T00:00:00Z"); // leap year
let res = try_year_bucket(ts).unwrap();
assert_eq!(res, ts_ns("2008-01-01T00:00:00Z"));
let ts = ts_ns("2010-12-31T23:59:59.999999999Z");
let res = try_year_bucket(ts).unwrap();
assert_eq!(res, ts_ns("2010-01-01T00:00:00Z"));
let ts = ts_ns("1972-06-01T00:10:00Z");
let res = try_year_bucket(ts).unwrap();
assert_eq!(res, ts_ns("1972-01-01T00:00:00Z"));
}
#[test]
fn test_month_bucket() {
let ts = ts_ns("1970-01-15T00:00:00Z");
let res = try_month_bucket(ts).unwrap();
assert_eq!(res, ts_ns("1970-01-01T00:00:00Z"));
let ts = ts_ns("1970-02-01T00:00:00Z");
let res = try_month_bucket(ts).unwrap();
assert_eq!(res, ts_ns("1970-02-01T00:00:00Z"));
let ts = ts_ns("2000-01-31T23:59:59.999999999Z");
let res = try_month_bucket(ts).unwrap();
assert_eq!(res, ts_ns("2000-01-01T00:00:00Z"));
}
#[test]
fn test_week_bucket() {
let ts = ts_ns("1970-01-05T00:00:00Z"); // Monday
let res = week_bucket(ts);
assert_eq!(res, ts_ns("1970-01-05T00:00:00Z"));
let ts = ts_ns("1970-01-05T23:59:59Z"); // Monday
let res = week_bucket(ts);
assert_eq!(res, ts_ns("1970-01-05T00:00:00Z"));
let ts = ts_ns("1970-01-07T01:13:00Z"); // Wednesday
let res = week_bucket(ts);
assert_eq!(res, ts_ns("1970-01-05T00:00:00Z"));
let ts = ts_ns("1970-01-11T23:59:59.999999999Z"); // Sunday
let res = week_bucket(ts);
assert_eq!(res, ts_ns("1970-01-05T00:00:00Z"));
let ts = ts_ns("2025-10-16T10:41:59.010Z"); // Thursday
let res = week_bucket(ts);
assert_eq!(res, ts_ns("2025-10-13T00:00:00Z"));
let ts = ts_ns("1970-01-01T00:00:00Z"); // Thursday
let res = week_bucket(ts);
assert_eq!(res, ts_ns("1969-12-29T00:00:00Z")); // Negative
}
}
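
Not part of the diff: a worked trace of the `week_bucket` arithmetic above (module-internal, since `NS_IN_DAY` is private).

fn week_bucket_worked_example() {
    // 1970-01-07 is a Wednesday, day 6 since the epoch.
    let ts = 6 * NS_IN_DAY + 42; // some instant during that Wednesday
    let days_since_epoch = ts.div_euclid(NS_IN_DAY); // 6
    let weekday = (days_since_epoch + 3).rem_euclid(7); // 2, i.e. Wednesday with Monday = 0
    // The bucket is the preceding Monday, 1970-01-05 (day 4 since the epoch).
    assert_eq!(week_bucket(ts), (days_since_epoch - weekday) * NS_IN_DAY);
}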

View File

@@ -0,0 +1,595 @@
use std::fmt::Debug;
use std::net::Ipv6Addr;
use columnar::column_values::CompactSpaceU64Accessor;
use columnar::{
Column, ColumnType, Dictionary, MonotonicallyMappableToU128, MonotonicallyMappableToU64,
NumericalValue, StrColumn,
};
use rustc_hash::FxHashMap;
use smallvec::SmallVec;
use crate::aggregation::agg_data::{
build_segment_agg_collectors, AggRefNode, AggregationsSegmentCtx,
};
use crate::aggregation::bucket::composite::accessors::{
CompositeAccessor, CompositeAggReqData, PrecomputedDateInterval,
};
use crate::aggregation::bucket::composite::calendar_interval;
use crate::aggregation::bucket::composite::map::{DynArrayHeapMap, MAX_DYN_ARRAY_SIZE};
use crate::aggregation::bucket::{
CalendarInterval, CompositeAggregationSource, MissingOrder, Order,
};
use crate::aggregation::intermediate_agg_result::{
CompositeIntermediateKey, IntermediateAggregationResult, IntermediateAggregationResults,
IntermediateBucketResult, IntermediateCompositeBucketEntry, IntermediateCompositeBucketResult,
};
use crate::aggregation::segment_agg_result::SegmentAggregationCollector;
use crate::aggregation::BucketId;
use crate::TantivyError;
#[derive(Debug)]
struct CompositeBucketCollector {
count: u32,
}
impl CompositeBucketCollector {
fn new() -> Self {
CompositeBucketCollector { count: 0 }
}
#[inline]
fn collect(&mut self) {
self.count += 1;
}
}
/// The value is represented as a tuple of:
/// - the column index or missing value sentinel
/// - if the value is present, store the accessor index + 1
/// - if the value is missing, store 0 (for missing first) or u8::MAX (for missing last)
/// - the fast field value u64 representation
/// - 0 if the field is missing
/// - regular u64 repr if the ordering is ascending
/// - bitwise NOT of the u64 repr if the ordering is descending
#[derive(Clone, Copy, Debug, PartialEq, Eq, PartialOrd, Ord, Default, Hash)]
struct InternalValueRepr(u8, u64);
impl InternalValueRepr {
#[inline]
fn new_term(raw: u64, accessor_idx: u8, order: Order) -> Self {
match order {
Order::Asc => InternalValueRepr(accessor_idx + 1, raw),
Order::Desc => InternalValueRepr(accessor_idx + 1, !raw),
}
}
/// For histogram, the source column does not matter
#[inline]
fn new_histogram(raw: u64, order: Order) -> Self {
match order {
Order::Asc => InternalValueRepr(1, raw),
Order::Desc => InternalValueRepr(1, !raw),
}
}
#[inline]
fn new_missing(order: Order, missing_order: MissingOrder) -> Self {
let column_idx = match (missing_order, order) {
(MissingOrder::First, _) => 0,
(MissingOrder::Last, _) => u8::MAX,
(MissingOrder::Default, Order::Asc) => 0,
(MissingOrder::Default, Order::Desc) => u8::MAX,
};
InternalValueRepr(column_idx, 0)
}
#[inline]
fn decode(self, order: Order) -> Option<(u8, u64)> {
if self.0 == u8::MAX || self.0 == 0 {
return None;
}
match order {
Order::Asc => Some((self.0 - 1, self.1)),
Order::Desc => Some((self.0 - 1, !self.1)),
}
}
}
/// The collector puts values from the fast field into the correct buckets and
/// does a conversion to the correct datatype.
#[derive(Debug)]
pub struct SegmentCompositeCollector {
buckets: DynArrayHeapMap<InternalValueRepr, CompositeBucketCollector>,
accessor_idx: usize,
}
impl SegmentAggregationCollector for SegmentCompositeCollector {
fn add_intermediate_aggregation_result(
&mut self,
agg_data: &AggregationsSegmentCtx,
results: &mut IntermediateAggregationResults,
_parent_bucket_id: BucketId,
) -> crate::Result<()> {
let name = agg_data
.get_composite_req_data(self.accessor_idx)
.name
.clone();
let buckets = self.into_intermediate_bucket_result(agg_data)?;
results.push(
name,
IntermediateAggregationResult::Bucket(IntermediateBucketResult::Composite { buckets }),
)?;
Ok(())
}
#[inline]
fn collect(
&mut self,
_parent_bucket_id: BucketId,
docs: &[crate::DocId],
agg_data: &mut AggregationsSegmentCtx,
) -> crate::Result<()> {
let mem_pre = self.get_memory_consumption();
let composite_agg_data = agg_data.take_composite_req_data(self.accessor_idx);
for doc in docs {
let mut sub_level_values = SmallVec::new();
recursive_key_visitor(
*doc,
agg_data,
&composite_agg_data,
0,
&mut sub_level_values,
&mut self.buckets,
true,
)?;
}
agg_data.put_back_composite_req_data(self.accessor_idx, composite_agg_data);
let mem_delta = self.get_memory_consumption() - mem_pre;
if mem_delta > 0 {
agg_data.context.limits.add_memory_consumed(mem_delta)?;
}
Ok(())
}
fn prepare_max_bucket(
&mut self,
_max_bucket: BucketId,
_agg_data: &AggregationsSegmentCtx,
) -> crate::Result<()> {
Ok(())
}
fn flush(&mut self, _agg_data: &mut AggregationsSegmentCtx) -> crate::Result<()> {
Ok(())
}
}
impl SegmentCompositeCollector {
fn get_memory_consumption(&self) -> u64 {
self.buckets.memory_consumption()
}
pub(crate) fn from_req_and_validate(
req_data: &mut AggregationsSegmentCtx,
node: &AggRefNode,
) -> crate::Result<Self> {
validate_req(req_data, node.idx_in_req_data)?;
if !node.children.is_empty() {
let _sub_aggregation = build_segment_agg_collectors(req_data, &node.children)?;
}
let composite_req_data = req_data.get_composite_req_data(node.idx_in_req_data);
Ok(SegmentCompositeCollector {
buckets: DynArrayHeapMap::try_new(composite_req_data.req.sources.len())?,
accessor_idx: node.idx_in_req_data,
})
}
#[inline]
pub(crate) fn into_intermediate_bucket_result(
&mut self,
agg_data: &AggregationsSegmentCtx,
) -> crate::Result<IntermediateCompositeBucketResult> {
let mut dict: FxHashMap<Vec<CompositeIntermediateKey>, IntermediateCompositeBucketEntry> =
Default::default();
dict.reserve(self.buckets.size());
let composite_data = agg_data.get_composite_req_data(self.accessor_idx);
let buckets = std::mem::replace(
&mut self.buckets,
DynArrayHeapMap::try_new(composite_data.req.sources.len())
.expect("already validated source count"),
);
for (key_internal_repr, agg) in buckets.into_iter() {
let key = resolve_key(&key_internal_repr, composite_data)?;
let sub_aggregation_res = IntermediateAggregationResults::default();
dict.insert(
key,
IntermediateCompositeBucketEntry {
doc_count: agg.count,
sub_aggregation: sub_aggregation_res,
},
);
}
Ok(IntermediateCompositeBucketResult {
entries: dict,
target_size: composite_data.req.size,
orders: composite_data
.req
.sources
.iter()
.map(|source| match source {
CompositeAggregationSource::Terms(t) => (t.order, t.missing_order),
CompositeAggregationSource::Histogram(h) => (h.order, h.missing_order),
CompositeAggregationSource::DateHistogram(d) => (d.order, d.missing_order),
})
.collect(),
})
}
}
fn validate_req(req_data: &mut AggregationsSegmentCtx, accessor_idx: usize) -> crate::Result<()> {
let composite_data = req_data.get_composite_req_data(accessor_idx);
let req = &composite_data.req;
if req.sources.is_empty() {
return Err(TantivyError::InvalidArgument(
"composite aggregation must have at least one source".to_string(),
));
}
if req.size == 0 {
return Err(TantivyError::InvalidArgument(
"composite aggregation 'size' must be > 0".to_string(),
));
}
let column_types_for_sources = composite_data.composite_accessors.iter().map(|item| {
item.accessors
.iter()
.map(|a| a.column_type)
.collect::<Vec<_>>()
});
for column_types in column_types_for_sources {
if column_types.len() > MAX_DYN_ARRAY_SIZE {
return Err(TantivyError::InvalidArgument(format!(
"composite aggregation source supports maximum {MAX_DYN_ARRAY_SIZE} sources",
)));
}
if column_types.contains(&ColumnType::Bytes) {
return Err(TantivyError::InvalidArgument(
"composite aggregation does not support 'bytes' field type".to_string(),
));
}
}
Ok(())
}
fn collect_bucket_with_limit(
agg_data: &mut AggregationsSegmentCtx,
composite_agg_data: &CompositeAggReqData,
buckets: &mut DynArrayHeapMap<InternalValueRepr, CompositeBucketCollector>,
key: &[InternalValueRepr],
) -> crate::Result<()> {
if (buckets.size() as u32) < composite_agg_data.req.size {
buckets
.get_or_insert_with(key, CompositeBucketCollector::new)
.collect();
return Ok(());
}
if let Some(entry) = buckets.get_mut(key) {
entry.collect();
return Ok(());
}
if let Some(highest_key) = buckets.peek_highest() {
if key < highest_key {
buckets.evict_highest();
buckets
.get_or_insert_with(key, CompositeBucketCollector::new)
.collect();
}
}
let _ = agg_data;
Ok(())
}
/// Converts the composite key from its internal column space representation
/// (segment specific) into its intermediate form.
fn resolve_key(
internal_key: &[InternalValueRepr],
agg_data: &CompositeAggReqData,
) -> crate::Result<Vec<CompositeIntermediateKey>> {
internal_key
.into_iter()
.enumerate()
.map(|(idx, val)| {
resolve_internal_value_repr(
*val,
&agg_data.req.sources[idx],
&agg_data.composite_accessors[idx].accessors,
)
})
.collect()
}
fn resolve_internal_value_repr(
internal_value_repr: InternalValueRepr,
source: &CompositeAggregationSource,
composite_accessors: &[CompositeAccessor],
) -> crate::Result<CompositeIntermediateKey> {
let decoded_value_opt = match source {
CompositeAggregationSource::Terms(source) => internal_value_repr.decode(source.order),
CompositeAggregationSource::Histogram(source) => internal_value_repr.decode(source.order),
CompositeAggregationSource::DateHistogram(source) => {
internal_value_repr.decode(source.order)
}
};
let Some((decoded_accessor_idx, val)) = decoded_value_opt else {
return Ok(CompositeIntermediateKey::Null);
};
let key = match source {
CompositeAggregationSource::Terms(_) => {
let CompositeAccessor {
column_type,
str_dict_column,
column,
..
} = &composite_accessors[decoded_accessor_idx as usize];
resolve_term(val, column_type, str_dict_column, column)?
}
CompositeAggregationSource::Histogram(source) => {
CompositeIntermediateKey::F64(i64::from_u64(val) as f64 * source.interval)
}
CompositeAggregationSource::DateHistogram(_) => {
CompositeIntermediateKey::DateTime(i64::from_u64(val))
}
};
Ok(key)
}
fn resolve_term(
val: u64,
column_type: &ColumnType,
str_dict_column: &Option<StrColumn>,
column: &Column,
) -> crate::Result<CompositeIntermediateKey> {
let key = if *column_type == ColumnType::Str {
let fallback_dict = Dictionary::empty();
let term_dict = str_dict_column
.as_ref()
.map(|el| el.dictionary())
.unwrap_or_else(|| &fallback_dict);
let mut buffer = Vec::new();
term_dict.ord_to_term(val, &mut buffer)?;
CompositeIntermediateKey::Str(
String::from_utf8(buffer.to_vec()).expect("could not convert to String"),
)
} else if *column_type == ColumnType::DateTime {
let val = i64::from_u64(val);
CompositeIntermediateKey::DateTime(val)
} else if *column_type == ColumnType::Bool {
let val = bool::from_u64(val);
CompositeIntermediateKey::Bool(val)
} else if *column_type == ColumnType::IpAddr {
let compact_space_accessor = column
.values
.clone()
.downcast_arc::<CompactSpaceU64Accessor>()
.map_err(|_| {
TantivyError::AggregationError(crate::aggregation::AggregationError::InternalError(
"Type mismatch: Could not downcast to CompactSpaceU64Accessor".to_string(),
))
})?;
let val: u128 = compact_space_accessor.compact_to_u128(val as u32);
let val = Ipv6Addr::from_u128(val);
CompositeIntermediateKey::IpAddr(val)
} else {
if *column_type == ColumnType::U64 {
CompositeIntermediateKey::U64(val)
} else if *column_type == ColumnType::I64 {
CompositeIntermediateKey::I64(i64::from_u64(val))
} else {
let val = f64::from_u64(val);
let val: NumericalValue = val.into();
match val.normalize() {
NumericalValue::U64(val) => CompositeIntermediateKey::U64(val),
NumericalValue::I64(val) => CompositeIntermediateKey::I64(val),
NumericalValue::F64(val) => CompositeIntermediateKey::F64(val),
}
}
};
Ok(key)
}
/// Depth-first walk of the accessors to build the composite key combinations
/// and update the buckets.
fn recursive_key_visitor(
doc_id: crate::DocId,
agg_data: &mut AggregationsSegmentCtx,
composite_agg_data: &CompositeAggReqData,
source_idx_for_recursion: usize,
sub_level_values: &mut SmallVec<[InternalValueRepr; MAX_DYN_ARRAY_SIZE]>,
buckets: &mut DynArrayHeapMap<InternalValueRepr, CompositeBucketCollector>,
is_on_after_key: bool,
) -> crate::Result<()> {
if source_idx_for_recursion == composite_agg_data.req.sources.len() {
if !is_on_after_key {
collect_bucket_with_limit(
agg_data,
composite_agg_data,
buckets,
sub_level_values,
)?;
}
return Ok(());
}
let current_level_accessors = &composite_agg_data.composite_accessors[source_idx_for_recursion];
let current_level_source = &composite_agg_data.req.sources[source_idx_for_recursion];
let mut missing = true;
for (accessor_idx, accessor) in current_level_accessors.accessors.iter().enumerate() {
let values = accessor.column.values_for_doc(doc_id);
for value in values {
missing = false;
match current_level_source {
CompositeAggregationSource::Terms(_) => {
let precedes_after_key_type =
accessor_idx < current_level_accessors.after_key_accessor_idx;
if is_on_after_key && precedes_after_key_type {
break;
}
let matches_after_key_type =
accessor_idx == current_level_accessors.after_key_accessor_idx;
if matches_after_key_type && is_on_after_key {
let should_skip = match current_level_source.order() {
Order::Asc => current_level_accessors.after_key.gt(value),
Order::Desc => current_level_accessors.after_key.lt(value),
};
if should_skip {
continue;
}
}
sub_level_values.push(InternalValueRepr::new_term(
value,
accessor_idx as u8,
current_level_source.order(),
));
let still_on_after_key =
matches_after_key_type && current_level_accessors.after_key.equals(value);
recursive_key_visitor(
doc_id,
agg_data,
composite_agg_data,
source_idx_for_recursion + 1,
sub_level_values,
buckets,
is_on_after_key && still_on_after_key,
)?;
sub_level_values.pop();
}
CompositeAggregationSource::Histogram(source) => {
let float_value = match accessor.column_type {
ColumnType::U64 => value as f64,
ColumnType::I64 => i64::from_u64(value) as f64,
ColumnType::DateTime => i64::from_u64(value) as f64 / 1_000_000.,
ColumnType::F64 => f64::from_u64(value),
_ => {
panic!(
"unexpected type {:?}. This should not happen",
accessor.column_type
)
}
};
let bucket_index = (float_value / source.interval).floor() as i64;
let bucket_value = i64::to_u64(bucket_index);
if is_on_after_key {
let should_skip = match current_level_source.order() {
Order::Asc => current_level_accessors.after_key.gt(bucket_value),
Order::Desc => current_level_accessors.after_key.lt(bucket_value),
};
if should_skip {
continue;
}
}
sub_level_values.push(InternalValueRepr::new_histogram(
bucket_value,
current_level_source.order(),
));
let still_on_after_key = current_level_accessors.after_key.equals(bucket_value);
recursive_key_visitor(
doc_id,
agg_data,
composite_agg_data,
source_idx_for_recursion + 1,
sub_level_values,
buckets,
is_on_after_key && still_on_after_key,
)?;
sub_level_values.pop();
}
CompositeAggregationSource::DateHistogram(_) => {
let value_ns = match accessor.column_type {
ColumnType::DateTime => i64::from_u64(value),
_ => {
panic!(
"unexpected type {:?}. This should not happen",
accessor.column_type
)
}
};
let bucket_index = match accessor.date_histogram_interval {
PrecomputedDateInterval::FixedNanoseconds(fixed_interval_ns) => {
(value_ns / fixed_interval_ns) * fixed_interval_ns
}
PrecomputedDateInterval::Calendar(CalendarInterval::Year) => {
calendar_interval::try_year_bucket(value_ns)?
}
PrecomputedDateInterval::Calendar(CalendarInterval::Month) => {
calendar_interval::try_month_bucket(value_ns)?
}
PrecomputedDateInterval::Calendar(CalendarInterval::Week) => {
calendar_interval::week_bucket(value_ns)
}
PrecomputedDateInterval::NotApplicable => {
panic!("interval not precomputed for date histogram source")
}
};
let bucket_value = i64::to_u64(bucket_index);
if is_on_after_key {
let should_skip = match current_level_source.order() {
Order::Asc => current_level_accessors.after_key.gt(bucket_value),
Order::Desc => current_level_accessors.after_key.lt(bucket_value),
};
if should_skip {
continue;
}
}
sub_level_values.push(InternalValueRepr::new_histogram(
bucket_value,
current_level_source.order(),
));
let still_on_after_key = current_level_accessors.after_key.equals(bucket_value);
recursive_key_visitor(
doc_id,
agg_data,
composite_agg_data,
source_idx_for_recursion + 1,
sub_level_values,
buckets,
is_on_after_key && still_on_after_key,
)?;
sub_level_values.pop();
}
};
}
}
if missing && current_level_source.missing_bucket() {
if is_on_after_key && current_level_accessors.skip_missing {
return Ok(());
}
sub_level_values.push(InternalValueRepr::new_missing(
current_level_source.order(),
current_level_source.missing_order(),
));
recursive_key_visitor(
doc_id,
agg_data,
composite_agg_data,
source_idx_for_recursion + 1,
sub_level_values,
buckets,
is_on_after_key && current_level_accessors.is_after_key_explicit_missing,
)?;
sub_level_values.pop();
}
Ok(())
}
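
Not part of the diff: a module-internal sketch of the bucket-limit policy implemented by `collect_bucket_with_limit`, expressed directly against `DynArrayHeapMap`. Once `size` distinct keys exist, a new key only enters the map if it is smaller than the current highest key, which is then evicted.

fn bucket_limit_sketch() -> crate::Result<()> {
    // size = 2: keep at most the two smallest keys seen so far
    let mut buckets = DynArrayHeapMap::<u32, u32>::try_new(1)?;
    buckets.get_or_insert_with(&[5], || 0);
    buckets.get_or_insert_with(&[3], || 0);
    // The map is "full"; key [7] would be ignored, but [2] beats the current highest [5].
    let new_key = [2u32];
    if let Some(highest) = buckets.peek_highest() {
        if new_key.as_slice() < highest {
            buckets.evict_highest(); // drops [5]
            buckets.get_or_insert_with(&new_key, || 0);
        }
    }
    assert_eq!(buckets.size(), 2); // keys [2] and [3] remain
    Ok(())
}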

View File

@@ -0,0 +1,364 @@
use std::collections::BinaryHeap;
use std::fmt::Debug;
use std::hash::Hash;
use rustc_hash::FxHashMap;
use smallvec::SmallVec;
use crate::TantivyError;
/// Map backed by a hash map for fast access and a binary heap to track the
/// highest key. The key is an array of fixed size S.
#[derive(Clone, Debug)]
struct ArrayHeapMap<K: Ord, V, const S: usize> {
pub(crate) buckets: FxHashMap<[K; S], V>,
pub(crate) heap: BinaryHeap<[K; S]>,
}
impl<K: Ord, V, const S: usize> Default for ArrayHeapMap<K, V, S> {
fn default() -> Self {
ArrayHeapMap {
buckets: FxHashMap::default(),
heap: BinaryHeap::default(),
}
}
}
impl<K: Eq + Hash + Clone + Ord, V, const S: usize> ArrayHeapMap<K, V, S> {
/// Panics if the length of `key` is not S.
fn get_or_insert_with<F: FnOnce() -> V>(&mut self, key: &[K], f: F) -> &mut V {
let key_array: &[K; S] = key.try_into().expect("Key length mismatch");
self.buckets.entry(key_array.clone()).or_insert_with(|| {
self.heap.push(key_array.clone());
f()
})
}
/// Panics if the length of `key` is not S.
fn get_mut(&mut self, key: &[K]) -> Option<&mut V> {
let key_array: &[K; S] = key.try_into().expect("Key length mismatch");
self.buckets.get_mut(key_array)
}
fn peek_highest(&self) -> Option<&[K]> {
self.heap.peek().map(|k_array| k_array.as_slice())
}
fn evict_highest(&mut self) {
if let Some(highest) = self.heap.pop() {
self.buckets.remove(&highest);
}
}
fn memory_consumption(&self) -> u64 {
let key_size = std::mem::size_of::<[K; S]>();
let map_size = (key_size + std::mem::size_of::<V>()) * self.buckets.capacity();
let heap_size = key_size * self.heap.capacity();
(map_size + heap_size) as u64
}
}
impl<K: Copy + Ord + Clone + 'static, V: 'static, const S: usize> ArrayHeapMap<K, V, S> {
fn into_iter(self) -> Box<dyn Iterator<Item = (SmallVec<[K; MAX_DYN_ARRAY_SIZE]>, V)>> {
Box::new(
self.buckets
.into_iter()
.map(|(k, v)| (SmallVec::from_slice(&k), v)),
)
}
fn values_mut<'a>(&'a mut self) -> Box<dyn Iterator<Item = &'a mut V> + 'a> {
Box::new(self.buckets.values_mut())
}
}
pub(super) const MAX_DYN_ARRAY_SIZE: usize = 16;
const MAX_DYN_ARRAY_SIZE_PLUS_ONE: usize = MAX_DYN_ARRAY_SIZE + 1;
/// A map optimized for memory footprint, fast access and efficient eviction of
/// the highest key.
///
/// Keys are inlined arrays of size 1 to [MAX_DYN_ARRAY_SIZE], but for a given
/// instance the key size is fixed. This avoids heap allocations for the keys.
#[derive(Clone, Debug)]
pub(super) struct DynArrayHeapMap<K: Ord, V>(DynArrayHeapMapInner<K, V>);
/// Wrapper around ArrayHeapMap to dynamically dispatch on the array size.
#[derive(Clone, Debug)]
enum DynArrayHeapMapInner<K: Ord, V> {
Dim1(ArrayHeapMap<K, V, 1>),
Dim2(ArrayHeapMap<K, V, 2>),
Dim3(ArrayHeapMap<K, V, 3>),
Dim4(ArrayHeapMap<K, V, 4>),
Dim5(ArrayHeapMap<K, V, 5>),
Dim6(ArrayHeapMap<K, V, 6>),
Dim7(ArrayHeapMap<K, V, 7>),
Dim8(ArrayHeapMap<K, V, 8>),
Dim9(ArrayHeapMap<K, V, 9>),
Dim10(ArrayHeapMap<K, V, 10>),
Dim11(ArrayHeapMap<K, V, 11>),
Dim12(ArrayHeapMap<K, V, 12>),
Dim13(ArrayHeapMap<K, V, 13>),
Dim14(ArrayHeapMap<K, V, 14>),
Dim15(ArrayHeapMap<K, V, 15>),
Dim16(ArrayHeapMap<K, V, 16>),
}
impl<K: Ord, V> DynArrayHeapMap<K, V> {
/// Creates a new heap map with dynamic array keys of size `key_dimension`.
pub(super) fn try_new(key_dimension: usize) -> crate::Result<Self> {
let inner = match key_dimension {
0 => {
return Err(TantivyError::InvalidArgument(
"DynArrayHeapMap dimension must be at least 1".to_string(),
))
}
1 => DynArrayHeapMapInner::Dim1(ArrayHeapMap::default()),
2 => DynArrayHeapMapInner::Dim2(ArrayHeapMap::default()),
3 => DynArrayHeapMapInner::Dim3(ArrayHeapMap::default()),
4 => DynArrayHeapMapInner::Dim4(ArrayHeapMap::default()),
5 => DynArrayHeapMapInner::Dim5(ArrayHeapMap::default()),
6 => DynArrayHeapMapInner::Dim6(ArrayHeapMap::default()),
7 => DynArrayHeapMapInner::Dim7(ArrayHeapMap::default()),
8 => DynArrayHeapMapInner::Dim8(ArrayHeapMap::default()),
9 => DynArrayHeapMapInner::Dim9(ArrayHeapMap::default()),
10 => DynArrayHeapMapInner::Dim10(ArrayHeapMap::default()),
11 => DynArrayHeapMapInner::Dim11(ArrayHeapMap::default()),
12 => DynArrayHeapMapInner::Dim12(ArrayHeapMap::default()),
13 => DynArrayHeapMapInner::Dim13(ArrayHeapMap::default()),
14 => DynArrayHeapMapInner::Dim14(ArrayHeapMap::default()),
15 => DynArrayHeapMapInner::Dim15(ArrayHeapMap::default()),
16 => DynArrayHeapMapInner::Dim16(ArrayHeapMap::default()),
MAX_DYN_ARRAY_SIZE_PLUS_ONE.. => {
return Err(TantivyError::InvalidArgument(format!(
"DynArrayHeapMap supports maximum {MAX_DYN_ARRAY_SIZE} dimensions, got \
{key_dimension}",
)))
}
};
Ok(DynArrayHeapMap(inner))
}
/// Number of elements in the map. This is not the dimension of the keys.
pub(super) fn size(&self) -> usize {
match &self.0 {
DynArrayHeapMapInner::Dim1(map) => map.buckets.len(),
DynArrayHeapMapInner::Dim2(map) => map.buckets.len(),
DynArrayHeapMapInner::Dim3(map) => map.buckets.len(),
DynArrayHeapMapInner::Dim4(map) => map.buckets.len(),
DynArrayHeapMapInner::Dim5(map) => map.buckets.len(),
DynArrayHeapMapInner::Dim6(map) => map.buckets.len(),
DynArrayHeapMapInner::Dim7(map) => map.buckets.len(),
DynArrayHeapMapInner::Dim8(map) => map.buckets.len(),
DynArrayHeapMapInner::Dim9(map) => map.buckets.len(),
DynArrayHeapMapInner::Dim10(map) => map.buckets.len(),
DynArrayHeapMapInner::Dim11(map) => map.buckets.len(),
DynArrayHeapMapInner::Dim12(map) => map.buckets.len(),
DynArrayHeapMapInner::Dim13(map) => map.buckets.len(),
DynArrayHeapMapInner::Dim14(map) => map.buckets.len(),
DynArrayHeapMapInner::Dim15(map) => map.buckets.len(),
DynArrayHeapMapInner::Dim16(map) => map.buckets.len(),
}
}
}
impl<K: Ord + Hash + Clone, V> DynArrayHeapMap<K, V> {
/// Get a mutable reference to the value corresponding to `key` or inserts a new
/// value created by calling `f`.
///
/// Panics if the length of `key` does not match the key dimension of the map.
pub(super) fn get_or_insert_with<F: FnOnce() -> V>(&mut self, key: &[K], f: F) -> &mut V {
match &mut self.0 {
DynArrayHeapMapInner::Dim1(map) => map.get_or_insert_with(key, f),
DynArrayHeapMapInner::Dim2(map) => map.get_or_insert_with(key, f),
DynArrayHeapMapInner::Dim3(map) => map.get_or_insert_with(key, f),
DynArrayHeapMapInner::Dim4(map) => map.get_or_insert_with(key, f),
DynArrayHeapMapInner::Dim5(map) => map.get_or_insert_with(key, f),
DynArrayHeapMapInner::Dim6(map) => map.get_or_insert_with(key, f),
DynArrayHeapMapInner::Dim7(map) => map.get_or_insert_with(key, f),
DynArrayHeapMapInner::Dim8(map) => map.get_or_insert_with(key, f),
DynArrayHeapMapInner::Dim9(map) => map.get_or_insert_with(key, f),
DynArrayHeapMapInner::Dim10(map) => map.get_or_insert_with(key, f),
DynArrayHeapMapInner::Dim11(map) => map.get_or_insert_with(key, f),
DynArrayHeapMapInner::Dim12(map) => map.get_or_insert_with(key, f),
DynArrayHeapMapInner::Dim13(map) => map.get_or_insert_with(key, f),
DynArrayHeapMapInner::Dim14(map) => map.get_or_insert_with(key, f),
DynArrayHeapMapInner::Dim15(map) => map.get_or_insert_with(key, f),
DynArrayHeapMapInner::Dim16(map) => map.get_or_insert_with(key, f),
}
}
/// Returns a mutable reference to the value corresponding to `key`.
///
/// Panics if the length of `key` does not match the key dimension of the map.
pub fn get_mut(&mut self, key: &[K]) -> Option<&mut V> {
match &mut self.0 {
DynArrayHeapMapInner::Dim1(map) => map.get_mut(key),
DynArrayHeapMapInner::Dim2(map) => map.get_mut(key),
DynArrayHeapMapInner::Dim3(map) => map.get_mut(key),
DynArrayHeapMapInner::Dim4(map) => map.get_mut(key),
DynArrayHeapMapInner::Dim5(map) => map.get_mut(key),
DynArrayHeapMapInner::Dim6(map) => map.get_mut(key),
DynArrayHeapMapInner::Dim7(map) => map.get_mut(key),
DynArrayHeapMapInner::Dim8(map) => map.get_mut(key),
DynArrayHeapMapInner::Dim9(map) => map.get_mut(key),
DynArrayHeapMapInner::Dim10(map) => map.get_mut(key),
DynArrayHeapMapInner::Dim11(map) => map.get_mut(key),
DynArrayHeapMapInner::Dim12(map) => map.get_mut(key),
DynArrayHeapMapInner::Dim13(map) => map.get_mut(key),
DynArrayHeapMapInner::Dim14(map) => map.get_mut(key),
DynArrayHeapMapInner::Dim15(map) => map.get_mut(key),
DynArrayHeapMapInner::Dim16(map) => map.get_mut(key),
}
}
/// Returns a reference to the highest key in the map.
pub(super) fn peek_highest(&self) -> Option<&[K]> {
match &self.0 {
DynArrayHeapMapInner::Dim1(map) => map.peek_highest(),
DynArrayHeapMapInner::Dim2(map) => map.peek_highest(),
DynArrayHeapMapInner::Dim3(map) => map.peek_highest(),
DynArrayHeapMapInner::Dim4(map) => map.peek_highest(),
DynArrayHeapMapInner::Dim5(map) => map.peek_highest(),
DynArrayHeapMapInner::Dim6(map) => map.peek_highest(),
DynArrayHeapMapInner::Dim7(map) => map.peek_highest(),
DynArrayHeapMapInner::Dim8(map) => map.peek_highest(),
DynArrayHeapMapInner::Dim9(map) => map.peek_highest(),
DynArrayHeapMapInner::Dim10(map) => map.peek_highest(),
DynArrayHeapMapInner::Dim11(map) => map.peek_highest(),
DynArrayHeapMapInner::Dim12(map) => map.peek_highest(),
DynArrayHeapMapInner::Dim13(map) => map.peek_highest(),
DynArrayHeapMapInner::Dim14(map) => map.peek_highest(),
DynArrayHeapMapInner::Dim15(map) => map.peek_highest(),
DynArrayHeapMapInner::Dim16(map) => map.peek_highest(),
}
}
/// Removes the entry with the highest key from the map.
pub(super) fn evict_highest(&mut self) {
match &mut self.0 {
DynArrayHeapMapInner::Dim1(map) => map.evict_highest(),
DynArrayHeapMapInner::Dim2(map) => map.evict_highest(),
DynArrayHeapMapInner::Dim3(map) => map.evict_highest(),
DynArrayHeapMapInner::Dim4(map) => map.evict_highest(),
DynArrayHeapMapInner::Dim5(map) => map.evict_highest(),
DynArrayHeapMapInner::Dim6(map) => map.evict_highest(),
DynArrayHeapMapInner::Dim7(map) => map.evict_highest(),
DynArrayHeapMapInner::Dim8(map) => map.evict_highest(),
DynArrayHeapMapInner::Dim9(map) => map.evict_highest(),
DynArrayHeapMapInner::Dim10(map) => map.evict_highest(),
DynArrayHeapMapInner::Dim11(map) => map.evict_highest(),
DynArrayHeapMapInner::Dim12(map) => map.evict_highest(),
DynArrayHeapMapInner::Dim13(map) => map.evict_highest(),
DynArrayHeapMapInner::Dim14(map) => map.evict_highest(),
DynArrayHeapMapInner::Dim15(map) => map.evict_highest(),
DynArrayHeapMapInner::Dim16(map) => map.evict_highest(),
}
}
pub(crate) fn memory_consumption(&self) -> u64 {
match &self.0 {
DynArrayHeapMapInner::Dim1(map) => map.memory_consumption(),
DynArrayHeapMapInner::Dim2(map) => map.memory_consumption(),
DynArrayHeapMapInner::Dim3(map) => map.memory_consumption(),
DynArrayHeapMapInner::Dim4(map) => map.memory_consumption(),
DynArrayHeapMapInner::Dim5(map) => map.memory_consumption(),
DynArrayHeapMapInner::Dim6(map) => map.memory_consumption(),
DynArrayHeapMapInner::Dim7(map) => map.memory_consumption(),
DynArrayHeapMapInner::Dim8(map) => map.memory_consumption(),
DynArrayHeapMapInner::Dim9(map) => map.memory_consumption(),
DynArrayHeapMapInner::Dim10(map) => map.memory_consumption(),
DynArrayHeapMapInner::Dim11(map) => map.memory_consumption(),
DynArrayHeapMapInner::Dim12(map) => map.memory_consumption(),
DynArrayHeapMapInner::Dim13(map) => map.memory_consumption(),
DynArrayHeapMapInner::Dim14(map) => map.memory_consumption(),
DynArrayHeapMapInner::Dim15(map) => map.memory_consumption(),
DynArrayHeapMapInner::Dim16(map) => map.memory_consumption(),
}
}
}
impl<K: Ord + Clone + Copy + 'static, V: 'static> DynArrayHeapMap<K, V> {
/// Turns this map into an iterator over key-value pairs.
pub fn into_iter(self) -> impl Iterator<Item = (SmallVec<[K; MAX_DYN_ARRAY_SIZE]>, V)> {
match self.0 {
DynArrayHeapMapInner::Dim1(map) => map.into_iter(),
DynArrayHeapMapInner::Dim2(map) => map.into_iter(),
DynArrayHeapMapInner::Dim3(map) => map.into_iter(),
DynArrayHeapMapInner::Dim4(map) => map.into_iter(),
DynArrayHeapMapInner::Dim5(map) => map.into_iter(),
DynArrayHeapMapInner::Dim6(map) => map.into_iter(),
DynArrayHeapMapInner::Dim7(map) => map.into_iter(),
DynArrayHeapMapInner::Dim8(map) => map.into_iter(),
DynArrayHeapMapInner::Dim9(map) => map.into_iter(),
DynArrayHeapMapInner::Dim10(map) => map.into_iter(),
DynArrayHeapMapInner::Dim11(map) => map.into_iter(),
DynArrayHeapMapInner::Dim12(map) => map.into_iter(),
DynArrayHeapMapInner::Dim13(map) => map.into_iter(),
DynArrayHeapMapInner::Dim14(map) => map.into_iter(),
DynArrayHeapMapInner::Dim15(map) => map.into_iter(),
DynArrayHeapMapInner::Dim16(map) => map.into_iter(),
}
}
/// Returns an iterator over mutable references to the values in the map.
pub(super) fn values_mut(&mut self) -> impl Iterator<Item = &mut V> {
match &mut self.0 {
DynArrayHeapMapInner::Dim1(map) => map.values_mut(),
DynArrayHeapMapInner::Dim2(map) => map.values_mut(),
DynArrayHeapMapInner::Dim3(map) => map.values_mut(),
DynArrayHeapMapInner::Dim4(map) => map.values_mut(),
DynArrayHeapMapInner::Dim5(map) => map.values_mut(),
DynArrayHeapMapInner::Dim6(map) => map.values_mut(),
DynArrayHeapMapInner::Dim7(map) => map.values_mut(),
DynArrayHeapMapInner::Dim8(map) => map.values_mut(),
DynArrayHeapMapInner::Dim9(map) => map.values_mut(),
DynArrayHeapMapInner::Dim10(map) => map.values_mut(),
DynArrayHeapMapInner::Dim11(map) => map.values_mut(),
DynArrayHeapMapInner::Dim12(map) => map.values_mut(),
DynArrayHeapMapInner::Dim13(map) => map.values_mut(),
DynArrayHeapMapInner::Dim14(map) => map.values_mut(),
DynArrayHeapMapInner::Dim15(map) => map.values_mut(),
DynArrayHeapMapInner::Dim16(map) => map.values_mut(),
}
}
}
#[cfg(test)]
mod tests {
use super::*;
#[test]
fn test_dyn_array_heap_map() {
let mut map = DynArrayHeapMap::<u32, &str>::try_new(2).unwrap();
// insert
let key1 = [1u32, 2u32];
let key2 = [2u32, 1u32];
map.get_or_insert_with(&key1, || "a");
map.get_or_insert_with(&key2, || "b");
assert_eq!(map.size(), 2);
// evict highest
assert_eq!(map.peek_highest(), Some(&key2[..]));
map.evict_highest();
assert_eq!(map.size(), 1);
assert_eq!(map.peek_highest(), Some(&key1[..]));
// mutable iterator
{
let mut mut_iter = map.values_mut();
let v = mut_iter.next().unwrap();
assert_eq!(*v, "a");
*v = "c";
assert_eq!(mut_iter.next(), None);
}
// into_iter
let mut iter = map.into_iter();
let (k, v) = iter.next().unwrap();
assert_eq!(k.as_slice(), &key1);
assert_eq!(v, "c");
assert_eq!(iter.next(), None);
}
}

File diff suppressed because it is too large

View File

@@ -0,0 +1,460 @@
/// This module helps compare numerical values of different types (i64, u64
/// and f64).
pub(super) mod num_cmp {
use std::cmp::Ordering;
use crate::TantivyError;
pub fn cmp_i64_f64(left_i: i64, right_f: f64) -> crate::Result<Ordering> {
if right_f.is_nan() {
return Err(TantivyError::InvalidArgument(
"NaN comparison is not supported".to_string(),
));
}
// If right_f is < i64::MIN then left_i > right_f (i64::MIN=-2^63 can be
// exactly represented as f64)
if right_f < i64::MIN as f64 {
return Ok(Ordering::Greater);
}
// If right_f is >= i64::MAX then left_i < right_f (i64::MAX=2^63-1 cannot
// be exactly represented as f64)
if right_f >= i64::MAX as f64 {
return Ok(Ordering::Less);
}
// Now right_f is in (i64::MIN, i64::MAX), so `right_f as i64` is
// well-defined (truncation toward 0)
let right_as_i = right_f as i64;
let result = match left_i.cmp(&right_as_i) {
Ordering::Less => Ordering::Less,
Ordering::Greater => Ordering::Greater,
Ordering::Equal => {
// they have the same integer part, compare the fraction
let rem = right_f - (right_as_i as f64);
if rem == 0.0 {
Ordering::Equal
} else if right_f > 0.0 {
Ordering::Less
} else {
Ordering::Greater
}
}
};
Ok(result)
}
pub fn cmp_u64_f64(left_u: u64, right_f: f64) -> crate::Result<Ordering> {
if right_f.is_nan() {
return Err(TantivyError::InvalidArgument(
"NaN comparison is not supported".to_string(),
));
}
// Negative floats are always less than any u64 >= 0
if right_f < 0.0 {
return Ok(Ordering::Greater);
}
// If right_f is >= u64::MAX as f64 then left_u < right_f (u64::MAX=2^64-1 cannot be
// exactly represented as f64; it rounds up to 2^64)
let max_as_f = u64::MAX as f64;
if right_f >= max_as_f {
return Ok(Ordering::Less);
}
// Now right_f is in (0, u64::MAX), so `right_f as u64` is well-defined
// (truncation toward 0)
let right_as_u = right_f as u64;
let result = match left_u.cmp(&right_as_u) {
Ordering::Less => Ordering::Less,
Ordering::Greater => Ordering::Greater,
Ordering::Equal => {
// they have the same integer part, compare the fraction
let rem = right_f - (right_as_u as f64);
if rem == 0.0 {
Ordering::Equal
} else {
Ordering::Less
}
}
};
Ok(result)
}
pub fn cmp_i64_u64(left_i: i64, right_u: u64) -> Ordering {
if left_i < 0 {
Ordering::Less
} else {
let left_as_u = left_i as u64;
left_as_u.cmp(&right_u)
}
}
}
/// This module helps project numerical values to other numerical types.
/// When the target value space cannot exactly represent the source value, the
/// next representable value is returned (or AfterLast if the source value is
/// larger than the largest representable value).
///
/// All functions in this module assume that f64 values are not NaN.
pub(super) mod num_proj {
#[derive(Debug, PartialEq)]
pub enum ProjectedNumber<T> {
Exact(T),
Next(T),
AfterLast,
}
pub fn i64_to_u64(value: i64) -> ProjectedNumber<u64> {
if value < 0 {
ProjectedNumber::Next(0)
} else {
ProjectedNumber::Exact(value as u64)
}
}
pub fn u64_to_i64(value: u64) -> ProjectedNumber<i64> {
if value > i64::MAX as u64 {
ProjectedNumber::AfterLast
} else {
ProjectedNumber::Exact(value as i64)
}
}
pub fn f64_to_u64(value: f64) -> ProjectedNumber<u64> {
if value < 0.0 {
ProjectedNumber::Next(0)
// `u64::MAX as f64` rounds up to 2^64, so any value >= it exceeds the u64 range
} else if value >= u64::MAX as f64 {
ProjectedNumber::AfterLast
} else if value.fract() == 0.0 {
ProjectedNumber::Exact(value as u64)
} else {
// casting f64 to u64 truncates toward zero
ProjectedNumber::Next(value as u64 + 1)
}
}
pub fn f64_to_i64(value: f64) -> ProjectedNumber<i64> {
if value < (i64::MIN as f64) {
ProjectedNumber::Next(i64::MIN)
} else if value >= (i64::MAX as f64) {
ProjectedNumber::AfterLast
} else if value.fract() == 0.0 {
ProjectedNumber::Exact(value as i64)
} else if value > 0.0 {
// casting f64 to i64 truncates toward zero
ProjectedNumber::Next(value as i64 + 1)
} else {
ProjectedNumber::Next(value as i64)
}
}
pub fn i64_to_f64(value: i64) -> ProjectedNumber<f64> {
let value_f = value as f64;
let k_roundtrip = value_f as i64;
if k_roundtrip == value {
// between -2^53 and 2^53 all i64 are exactly represented as f64
ProjectedNumber::Exact(value_f)
} else {
// for very large/small i64 values, it is approximated to the closest f64
if k_roundtrip > value {
ProjectedNumber::Next(value_f)
} else {
ProjectedNumber::Next(value_f.next_up())
}
}
}
pub fn u64_to_f64(value: u64) -> ProjectedNumber<f64> {
let value_f = value as f64;
let k_roundtrip = value_f as u64;
if k_roundtrip == value {
// between 0 and 2^53 all u64 are exactly represented as f64
ProjectedNumber::Exact(value_f)
} else if k_roundtrip > value {
ProjectedNumber::Next(value_f)
} else {
ProjectedNumber::Next(value_f.next_up())
}
}
}
#[cfg(test)]
mod num_cmp_tests {
use std::cmp::Ordering;
use super::num_cmp::*;
#[test]
fn test_cmp_u64_f64() {
// Basic comparisons
assert_eq!(cmp_u64_f64(5, 5.0).unwrap(), Ordering::Equal);
assert_eq!(cmp_u64_f64(5, 6.0).unwrap(), Ordering::Less);
assert_eq!(cmp_u64_f64(6, 5.0).unwrap(), Ordering::Greater);
assert_eq!(cmp_u64_f64(0, 0.0).unwrap(), Ordering::Equal);
assert_eq!(cmp_u64_f64(0, 0.1).unwrap(), Ordering::Less);
// Negative float values should always be less than any u64
assert_eq!(cmp_u64_f64(0, -0.1).unwrap(), Ordering::Greater);
assert_eq!(cmp_u64_f64(5, -5.0).unwrap(), Ordering::Greater);
assert_eq!(cmp_u64_f64(u64::MAX, -1e20).unwrap(), Ordering::Greater);
// Tests with extreme values
assert_eq!(cmp_u64_f64(u64::MAX, 1e20).unwrap(), Ordering::Less);
// Precision edge cases: large u64 that loses precision when converted to f64
// => 2^54, exactly represented as f64
let large_f64 = 18_014_398_509_481_984.0;
let large_u64 = 18_014_398_509_481_984;
// prove that large_u64 is exactly represented as f64
assert_eq!(large_u64 as f64, large_f64);
assert_eq!(cmp_u64_f64(large_u64, large_f64).unwrap(), Ordering::Equal);
// => (2^54 + 1) cannot be exactly represented in f64
let large_u64_plus_1 = 18_014_398_509_481_985;
// prove that it is represented as f64 by large_f64
assert_eq!(large_u64_plus_1 as f64, large_f64);
assert_eq!(
cmp_u64_f64(large_u64_plus_1, large_f64).unwrap(),
Ordering::Greater
);
// => (2^54 - 1) cannot be exactly represented in f64
let large_u64_minus_1 = 18_014_398_509_481_983;
// prove that it is also represented as f64 by large_f64
assert_eq!(large_u64_minus_1 as f64, large_f64);
assert_eq!(
cmp_u64_f64(large_u64_minus_1, large_f64).unwrap(),
Ordering::Less
);
// NaN comparison results in an error
assert!(cmp_u64_f64(0, f64::NAN).is_err());
}
#[test]
fn test_cmp_i64_f64() {
// Basic comparisons
assert_eq!(cmp_i64_f64(5, 5.0).unwrap(), Ordering::Equal);
assert_eq!(cmp_i64_f64(5, 6.0).unwrap(), Ordering::Less);
assert_eq!(cmp_i64_f64(6, 5.0).unwrap(), Ordering::Greater);
assert_eq!(cmp_i64_f64(-5, -5.0).unwrap(), Ordering::Equal);
assert_eq!(cmp_i64_f64(-5, -4.0).unwrap(), Ordering::Less);
assert_eq!(cmp_i64_f64(-4, -5.0).unwrap(), Ordering::Greater);
assert_eq!(cmp_i64_f64(-5, 5.0).unwrap(), Ordering::Less);
assert_eq!(cmp_i64_f64(5, -5.0).unwrap(), Ordering::Greater);
assert_eq!(cmp_i64_f64(0, -0.1).unwrap(), Ordering::Greater);
assert_eq!(cmp_i64_f64(0, 0.1).unwrap(), Ordering::Less);
assert_eq!(cmp_i64_f64(-1, -0.5).unwrap(), Ordering::Less);
assert_eq!(cmp_i64_f64(-1, 0.0).unwrap(), Ordering::Less);
assert_eq!(cmp_i64_f64(0, 0.0).unwrap(), Ordering::Equal);
// Tests with extreme values
assert_eq!(cmp_i64_f64(i64::MAX, 1e20).unwrap(), Ordering::Less);
assert_eq!(cmp_i64_f64(i64::MIN, -1e20).unwrap(), Ordering::Greater);
// Precision edge cases: large i64 that loses precision when converted to f64
// => 2^54, exactly represented as f64
let large_f64 = 18_014_398_509_481_984.0;
let large_i64 = 18_014_398_509_481_984;
// prove that large_i64 is exactly represented as f64
assert_eq!(large_i64 as f64, large_f64);
assert_eq!(cmp_i64_f64(large_i64, large_f64).unwrap(), Ordering::Equal);
// => (1_i64 << 54) + 1 cannot be exactly represented in f64
let large_i64_plus_1 = 18_014_398_509_481_985;
// prove that it is represented as f64 by large_f64
assert_eq!(large_i64_plus_1 as f64, large_f64);
assert_eq!(
cmp_i64_f64(large_i64_plus_1, large_f64).unwrap(),
Ordering::Greater
);
// => (1_i64 << 54) - 1 cannot be exactly represented in f64
let large_i64_minus_1 = 18_014_398_509_481_983;
// prove that it is also represented as f64 by large_f64
assert_eq!(large_i64_minus_1 as f64, large_f64);
assert_eq!(
cmp_i64_f64(large_i64_minus_1, large_f64).unwrap(),
Ordering::Less
);
// Same precision edge case but with negative values
// => -2^54, exactly represented as f64
let large_neg_f64 = -18_014_398_509_481_984.0;
let large_neg_i64 = -18_014_398_509_481_984;
// prove that large_neg_i64 is exactly represented as f64
assert_eq!(large_neg_i64 as f64, large_neg_f64);
assert_eq!(
cmp_i64_f64(large_neg_i64, large_neg_f64).unwrap(),
Ordering::Equal
);
// => (-2^54 + 1) cannot be exactly represented in f64
let large_neg_i64_plus_1 = -18_014_398_509_481_985;
// prove that it is represented as f64 by large_neg_f64
assert_eq!(large_neg_i64_plus_1 as f64, large_neg_f64);
assert_eq!(
cmp_i64_f64(large_neg_i64_plus_1, large_neg_f64).unwrap(),
Ordering::Less
);
// => (-2^54 - 1) cannot be exactly represented in f64
let large_neg_i64_minus_1 = -18_014_398_509_481_983;
// prove that it is also represented as f64 by large_neg_f64
assert_eq!(large_neg_i64_minus_1 as f64, large_neg_f64);
assert_eq!(
cmp_i64_f64(large_neg_i64_minus_1, large_neg_f64).unwrap(),
Ordering::Greater
);
// NaN comparison results in an error
assert!(cmp_i64_f64(0, f64::NAN).is_err());
}
#[test]
fn test_cmp_i64_u64() {
// Test with negative i64 values (should always be less than any u64)
assert_eq!(cmp_i64_u64(-1, 0), Ordering::Less);
assert_eq!(cmp_i64_u64(i64::MIN, 0), Ordering::Less);
assert_eq!(cmp_i64_u64(i64::MIN, u64::MAX), Ordering::Less);
// Test with positive i64 values
assert_eq!(cmp_i64_u64(0, 0), Ordering::Equal);
assert_eq!(cmp_i64_u64(1, 0), Ordering::Greater);
assert_eq!(cmp_i64_u64(1, 1), Ordering::Equal);
assert_eq!(cmp_i64_u64(0, 1), Ordering::Less);
assert_eq!(cmp_i64_u64(5, 10), Ordering::Less);
assert_eq!(cmp_i64_u64(10, 5), Ordering::Greater);
// Test with values near i64::MAX and u64 conversion
assert_eq!(cmp_i64_u64(i64::MAX, i64::MAX as u64), Ordering::Equal);
assert_eq!(cmp_i64_u64(i64::MAX, (i64::MAX as u64) + 1), Ordering::Less);
assert_eq!(cmp_i64_u64(i64::MAX, u64::MAX), Ordering::Less);
}
}
#[cfg(test)]
mod num_proj_tests {
use super::num_proj::{self, ProjectedNumber};
#[test]
fn test_i64_to_u64() {
assert_eq!(num_proj::i64_to_u64(-1), ProjectedNumber::Next(0));
assert_eq!(num_proj::i64_to_u64(i64::MIN), ProjectedNumber::Next(0));
assert_eq!(num_proj::i64_to_u64(0), ProjectedNumber::Exact(0));
assert_eq!(num_proj::i64_to_u64(42), ProjectedNumber::Exact(42));
assert_eq!(
num_proj::i64_to_u64(i64::MAX),
ProjectedNumber::Exact(i64::MAX as u64)
);
}
#[test]
fn test_u64_to_i64() {
assert_eq!(num_proj::u64_to_i64(0), ProjectedNumber::Exact(0));
assert_eq!(num_proj::u64_to_i64(42), ProjectedNumber::Exact(42));
assert_eq!(
num_proj::u64_to_i64(i64::MAX as u64),
ProjectedNumber::Exact(i64::MAX)
);
assert_eq!(
num_proj::u64_to_i64((i64::MAX as u64) + 1),
ProjectedNumber::AfterLast
);
assert_eq!(num_proj::u64_to_i64(u64::MAX), ProjectedNumber::AfterLast);
}
#[test]
fn test_f64_to_u64() {
assert_eq!(num_proj::f64_to_u64(-1e25), ProjectedNumber::Next(0));
assert_eq!(num_proj::f64_to_u64(-0.1), ProjectedNumber::Next(0));
assert_eq!(num_proj::f64_to_u64(1e20), ProjectedNumber::AfterLast);
assert_eq!(
num_proj::f64_to_u64(f64::INFINITY),
ProjectedNumber::AfterLast
);
assert_eq!(num_proj::f64_to_u64(0.0), ProjectedNumber::Exact(0));
assert_eq!(num_proj::f64_to_u64(42.0), ProjectedNumber::Exact(42));
assert_eq!(num_proj::f64_to_u64(0.5), ProjectedNumber::Next(1));
assert_eq!(num_proj::f64_to_u64(42.1), ProjectedNumber::Next(43));
}
#[test]
fn test_f64_to_i64() {
assert_eq!(num_proj::f64_to_i64(-1e20), ProjectedNumber::Next(i64::MIN));
assert_eq!(
num_proj::f64_to_i64(f64::NEG_INFINITY),
ProjectedNumber::Next(i64::MIN)
);
assert_eq!(num_proj::f64_to_i64(1e20), ProjectedNumber::AfterLast);
assert_eq!(
num_proj::f64_to_i64(f64::INFINITY),
ProjectedNumber::AfterLast
);
assert_eq!(num_proj::f64_to_i64(0.0), ProjectedNumber::Exact(0));
assert_eq!(num_proj::f64_to_i64(42.0), ProjectedNumber::Exact(42));
assert_eq!(num_proj::f64_to_i64(-42.0), ProjectedNumber::Exact(-42));
assert_eq!(num_proj::f64_to_i64(0.5), ProjectedNumber::Next(1));
assert_eq!(num_proj::f64_to_i64(42.1), ProjectedNumber::Next(43));
assert_eq!(num_proj::f64_to_i64(-0.5), ProjectedNumber::Next(0));
assert_eq!(num_proj::f64_to_i64(-42.1), ProjectedNumber::Next(-42));
}
#[test]
fn test_i64_to_f64() {
assert_eq!(num_proj::i64_to_f64(0), ProjectedNumber::Exact(0.0));
assert_eq!(num_proj::i64_to_f64(42), ProjectedNumber::Exact(42.0));
assert_eq!(num_proj::i64_to_f64(-42), ProjectedNumber::Exact(-42.0));
let max_exact = 9_007_199_254_740_992; // 2^53
assert_eq!(
num_proj::i64_to_f64(max_exact),
ProjectedNumber::Exact(max_exact as f64)
);
// Test values that cannot be exactly represented as f64 (integers above 2^53)
let large_i64 = 9_007_199_254_740_993; // 2^53 + 1
let closest_f64 = 9_007_199_254_740_992.0;
assert_eq!(large_i64 as f64, closest_f64);
if let ProjectedNumber::Next(val) = num_proj::i64_to_f64(large_i64) {
// Verify that the returned float is different from the direct cast
assert!(val > closest_f64);
assert!(val - closest_f64 < 2. * f64::EPSILON * closest_f64);
} else {
panic!("Expected ProjectedNumber::Next for large_i64");
}
// Test with very large negative value
let large_neg_i64 = -9_007_199_254_740_993; // -(2^53 + 1)
let closest_neg_f64 = -9_007_199_254_740_992.0;
assert_eq!(large_neg_i64 as f64, closest_neg_f64);
if let ProjectedNumber::Next(val) = num_proj::i64_to_f64(large_neg_i64) {
// Verify that the returned float is the closest representable f64
assert_eq!(val, closest_neg_f64);
} else {
panic!("Expected ProjectedNumber::Next for large_neg_i64");
}
}
#[test]
fn test_u64_to_f64() {
assert_eq!(num_proj::u64_to_f64(0), ProjectedNumber::Exact(0.0));
assert_eq!(num_proj::u64_to_f64(42), ProjectedNumber::Exact(42.0));
// Test the largest u64 value that can be exactly represented as f64 (2^53)
let max_exact = 9_007_199_254_740_992; // 2^53
assert_eq!(
num_proj::u64_to_f64(max_exact),
ProjectedNumber::Exact(max_exact as f64)
);
// Test values that cannot be exactly represented as f64 (integers above 2^53)
let large_u64 = 9_007_199_254_740_993; // 2^53 + 1
let closest_f64 = 9_007_199_254_740_992.0;
assert_eq!(large_u64 as f64, closest_f64);
if let ProjectedNumber::Next(val) = num_proj::u64_to_f64(large_u64) {
// Verify that the returned float is different from the direct cast
assert!(val > closest_f64);
assert!(val - closest_f64 < 2. * f64::EPSILON * closest_f64);
} else {
panic!("Expected ProjectedNumber::Next for large_u64");
}
}
}
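// Illustrative sketch only, not part of this diff: one plausible way a caller
// could consume `ProjectedNumber` when projecting an f64 bound onto the u64
// domain. The "lower bound" reading is an assumption; the module above only
// specifies that `Next` is the next representable value and `AfterLast` means
// the source exceeds the largest representable target value.
#[cfg(test)]
mod num_proj_usage_sketch {
    use super::num_proj::{f64_to_u64, ProjectedNumber};

    /// Smallest u64 that is >= `bound`, or `None` if no u64 satisfies it.
    fn effective_u64_lower_bound(bound: f64) -> Option<u64> {
        match f64_to_u64(bound) {
            ProjectedNumber::Exact(v) | ProjectedNumber::Next(v) => Some(v),
            ProjectedNumber::AfterLast => None,
        }
    }

    #[test]
    fn lower_bound_examples() {
        assert_eq!(effective_u64_lower_bound(42.0), Some(42));
        assert_eq!(effective_u64_lower_bound(42.1), Some(43));
        assert_eq!(effective_u64_lower_bound(-0.5), Some(0));
        assert_eq!(effective_u64_lower_bound(1e20), None);
    }
}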

View File

@@ -207,7 +207,7 @@ fn parse_offset_into_milliseconds(input: &str) -> Result<i64, AggregationError>
}
}
fn parse_into_milliseconds(input: &str) -> Result<i64, AggregationError> {
pub(crate) fn parse_into_milliseconds(input: &str) -> Result<i64, AggregationError> {
let split_boundary = input
.as_bytes()
.iter()

View File

@@ -22,6 +22,7 @@
//! - [Range](RangeAggregation)
//! - [Terms](TermsAggregation)
mod composite;
mod filter;
mod histogram;
mod range;
@@ -31,6 +32,7 @@ mod term_missing_agg;
use std::collections::HashMap;
use std::fmt;
pub use composite::*;
pub use filter::*;
pub use histogram::*;
pub use range::*;

View File

@@ -25,9 +25,12 @@ use super::metric::{
use super::segment_agg_result::AggregationLimitsGuard;
use super::{format_date, AggregationError, Key, SerializedKey};
use crate::aggregation::agg_result::{
AggregationResults, BucketEntries, BucketEntry, FilterBucketResult,
AggregationResults, BucketEntries, BucketEntry, CompositeBucketEntry, FilterBucketResult,
};
use crate::aggregation::bucket::{
composite_intermediate_key_ordering, CompositeAggregation, MissingOrder,
TermsAggregationInternal,
};
use crate::aggregation::bucket::TermsAggregationInternal;
use crate::aggregation::metric::CardinalityCollector;
use crate::TantivyError;
@@ -90,6 +93,19 @@ impl From<IntermediateKey> for Key {
impl Eq for IntermediateKey {}
impl std::fmt::Display for IntermediateKey {
fn fmt(&self, f: &mut std::fmt::Formatter<'_>) -> std::fmt::Result {
match self {
IntermediateKey::Str(val) => f.write_str(val),
IntermediateKey::F64(val) => f.write_str(&val.to_string()),
IntermediateKey::U64(val) => f.write_str(&val.to_string()),
IntermediateKey::I64(val) => f.write_str(&val.to_string()),
IntermediateKey::Bool(val) => f.write_str(&val.to_string()),
IntermediateKey::IpAddr(val) => f.write_str(&val.to_string()),
}
}
}
impl std::hash::Hash for IntermediateKey {
fn hash<H: std::hash::Hasher>(&self, state: &mut H) {
core::mem::discriminant(self).hash(state);
@@ -105,6 +121,21 @@ impl std::hash::Hash for IntermediateKey {
}
impl IntermediateAggregationResults {
/// Returns a reference to the intermediate aggregation result for the given key.
pub fn get(&self, key: &str) -> Option<&IntermediateAggregationResult> {
self.aggs_res.get(key)
}
/// Removes and returns the intermediate aggregation result for the given key.
pub fn remove(&mut self, key: &str) -> Option<IntermediateAggregationResult> {
self.aggs_res.remove(key)
}
/// Returns an iterator over the keys in the intermediate aggregation results.
pub fn keys(&self) -> impl Iterator<Item = &String> {
self.aggs_res.keys()
}
/// Add a result
pub fn push(&mut self, key: String, value: IntermediateAggregationResult) -> crate::Result<()> {
let entry = self.aggs_res.entry(key);
@@ -218,6 +249,11 @@ pub(crate) fn empty_from_req(req: &Aggregation) -> IntermediateAggregationResult
is_date_agg: true,
})
}
Composite(_) => {
IntermediateAggregationResult::Bucket(IntermediateBucketResult::Composite {
buckets: Default::default(),
})
}
Average(_) => IntermediateAggregationResult::Metric(IntermediateMetricResult::Average(
IntermediateAverage::default(),
)),
@@ -445,6 +481,11 @@ pub enum IntermediateBucketResult {
/// Sub-aggregation results
sub_aggregations: IntermediateAggregationResults,
},
/// Composite aggregation
Composite {
/// The composite buckets
buckets: IntermediateCompositeBucketResult,
},
}
impl IntermediateBucketResult {
@@ -540,6 +581,13 @@ impl IntermediateBucketResult {
sub_aggregations: final_sub_aggregations,
}))
}
IntermediateBucketResult::Composite { buckets } => buckets.into_final_result(
req.agg
.as_composite()
.expect("unexpected aggregation, expected composite aggregation"),
req.sub_aggregation(),
limits,
),
}
}
@@ -606,6 +654,16 @@ impl IntermediateBucketResult {
*doc_count_left += doc_count_right;
sub_aggs_left.merge_fruits(sub_aggs_right)?;
}
(
IntermediateBucketResult::Composite {
buckets: buckets_left,
},
IntermediateBucketResult::Composite {
buckets: buckets_right,
},
) => {
buckets_left.merge_fruits(buckets_right)?;
}
(IntermediateBucketResult::Range(_), _) => {
panic!("try merge on different types")
}
@@ -618,6 +676,9 @@ impl IntermediateBucketResult {
(IntermediateBucketResult::Filter { .. }, _) => {
panic!("try merge on different types")
}
(IntermediateBucketResult::Composite { .. }, _) => {
panic!("try merge on different types")
}
}
Ok(())
}
@@ -639,6 +700,21 @@ pub struct IntermediateTermBucketResult {
}
impl IntermediateTermBucketResult {
/// Returns a reference to the map of bucket entries keyed by [`IntermediateKey`].
pub fn entries(&self) -> &FxHashMap<IntermediateKey, IntermediateTermBucketEntry> {
&self.entries
}
/// Returns the count of documents not included in the returned buckets.
pub fn sum_other_doc_count(&self) -> u64 {
self.sum_other_doc_count
}
/// Returns the upper bound of the error on document counts in the returned buckets.
pub fn doc_count_error_upper_bound(&self) -> u64 {
self.doc_count_error_upper_bound
}
pub(crate) fn into_final_result(
self,
req: &TermsAggregation,
@@ -820,7 +896,7 @@ impl IntermediateRangeBucketEntry {
};
// If we have a date type on the histogram buckets, we add the `key_as_string` field as
// rfc339
// rfc3339
if column_type == Some(ColumnType::DateTime) {
if let Some(val) = range_bucket_entry.to {
let key_as_string = format_date(val as i64)?;
@@ -846,6 +922,212 @@ pub struct IntermediateTermBucketEntry {
pub sub_aggregation: IntermediateAggregationResults,
}
/// Entry for the composite bucket.
pub type IntermediateCompositeBucketEntry = IntermediateTermBucketEntry;
/// The fully typed key for composite aggregation
#[derive(Clone, Debug, PartialEq, Serialize, Deserialize)]
pub enum CompositeIntermediateKey {
/// Bool key
Bool(bool),
/// String key
Str(String),
/// Float key
F64(f64),
/// Signed integer key
I64(i64),
/// Unsigned integer key
U64(u64),
/// DateTime key, nanoseconds since epoch
DateTime(i64),
/// IP Address key
IpAddr(Ipv6Addr),
/// Missing value key
Null,
}
impl Eq for CompositeIntermediateKey {}
impl std::hash::Hash for CompositeIntermediateKey {
fn hash<H: std::hash::Hasher>(&self, state: &mut H) {
core::mem::discriminant(self).hash(state);
match self {
CompositeIntermediateKey::Bool(val) => val.hash(state),
CompositeIntermediateKey::Str(text) => text.hash(state),
CompositeIntermediateKey::F64(val) => val.to_bits().hash(state),
CompositeIntermediateKey::U64(val) => val.hash(state),
CompositeIntermediateKey::I64(val) => val.hash(state),
CompositeIntermediateKey::DateTime(val) => val.hash(state),
CompositeIntermediateKey::IpAddr(val) => val.hash(state),
CompositeIntermediateKey::Null => {}
}
}
}
/// Composite aggregation page.
#[derive(Default, Clone, Debug, PartialEq, Serialize, Deserialize)]
pub struct IntermediateCompositeBucketResult {
#[serde(
serialize_with = "serialize_composite_entries",
deserialize_with = "deserialize_composite_entries"
)]
pub(crate) entries: FxHashMap<Vec<CompositeIntermediateKey>, IntermediateCompositeBucketEntry>,
pub(crate) target_size: u32,
pub(crate) orders: Vec<(Order, MissingOrder)>,
}
fn serialize_composite_entries<S>(
entries: &FxHashMap<Vec<CompositeIntermediateKey>, IntermediateCompositeBucketEntry>,
serializer: S,
) -> Result<S::Ok, S::Error>
where
S: serde::Serializer,
{
use serde::ser::SerializeSeq;
let mut seq = serializer.serialize_seq(Some(entries.len()))?;
for (k, v) in entries {
seq.serialize_element(&(k, v))?;
}
seq.end()
}
fn deserialize_composite_entries<'de, D>(
deserializer: D,
) -> Result<FxHashMap<Vec<CompositeIntermediateKey>, IntermediateCompositeBucketEntry>, D::Error>
where
D: serde::Deserializer<'de>,
{
let vec: Vec<(Vec<CompositeIntermediateKey>, IntermediateCompositeBucketEntry)> =
serde::Deserialize::deserialize(deserializer)?;
Ok(vec.into_iter().collect())
}
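// Illustrative sketch only, not part of this diff: the seq-based
// (de)serializers above emit the entries map as a length-prefixed sequence of
// (key, value) pairs, which keeps the struct usable with binary formats that
// need the element count up front while remaining a plain roundtrip for
// self-describing formats. A hypothetical roundtrip test could look like:
#[cfg(test)]
mod composite_entries_serde_sketch {
    use super::IntermediateCompositeBucketResult;

    #[test]
    fn default_result_roundtrips_through_json() {
        let result = IntermediateCompositeBucketResult::default();
        let bytes = serde_json::to_vec(&result).unwrap();
        let back: IntermediateCompositeBucketResult = serde_json::from_slice(&bytes).unwrap();
        assert_eq!(result, back);
    }
}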
impl IntermediateCompositeBucketResult {
pub(crate) fn into_final_result(
self,
req: &CompositeAggregation,
sub_aggregation_req: &Aggregations,
limits: &mut AggregationLimitsGuard,
) -> crate::Result<BucketResult> {
let trimmed_entry_vec =
trim_composite_buckets(self.entries, &self.orders, self.target_size)?;
let after_key = if trimmed_entry_vec.len() == req.size as usize {
trimmed_entry_vec
.last()
.map(|bucket| {
let (intermediate_key, _entry) = bucket;
intermediate_key
.iter()
.enumerate()
.map(|(idx, intermediate_key)| {
let source = &req.sources[idx];
(source.name().to_string(), intermediate_key.clone().into())
})
.collect()
})
.unwrap()
} else {
FxHashMap::default()
};
let buckets = trimmed_entry_vec
.into_iter()
.map(|(intermediate_key, entry)| {
let key = intermediate_key
.into_iter()
.enumerate()
.map(|(idx, intermediate_key)| {
let source = &req.sources[idx];
(source.name().to_string(), intermediate_key.into())
})
.collect();
Ok(CompositeBucketEntry {
key,
doc_count: entry.doc_count as u64,
sub_aggregation: entry
.sub_aggregation
.into_final_result_internal(sub_aggregation_req, limits)?,
})
})
.collect::<crate::Result<Vec<_>>>()?;
Ok(BucketResult::Composite { after_key, buckets })
}
fn merge_fruits(&mut self, other: IntermediateCompositeBucketResult) -> crate::Result<()> {
merge_maps(&mut self.entries, other.entries)?;
if self.entries.len() as u32 > 2 * self.target_size {
// 2x factor used to avoid trimming too often (expensive operation)
// an optimal threshold could probably be figured out
self.trim()?;
}
Ok(())
}
/// Trim the composite buckets to the target size, according to the ordering.
///
/// Returns an error if the ordering comparison fails.
pub(crate) fn trim(&mut self) -> crate::Result<()> {
if self.entries.len() as u32 <= self.target_size {
return Ok(());
}
let sorted_entries = trim_composite_buckets(
std::mem::take(&mut self.entries),
&self.orders,
self.target_size,
)?;
self.entries = sorted_entries.into_iter().collect();
Ok(())
}
}
fn trim_composite_buckets(
entries: FxHashMap<Vec<CompositeIntermediateKey>, IntermediateCompositeBucketEntry>,
orders: &[(Order, MissingOrder)],
target_size: u32,
) -> crate::Result<
Vec<(
Vec<CompositeIntermediateKey>,
IntermediateCompositeBucketEntry,
)>,
> {
let mut entries: Vec<_> = entries.into_iter().collect();
let mut sort_error: Option<TantivyError> = None;
entries.sort_by(|(left_key, _), (right_key, _)| {
// Only attempt sorting if we haven't encountered an error yet
if sort_error.is_some() {
return Ordering::Equal; // Return a default, we'll handle the error after sorting
}
for i in 0..orders.len() {
match composite_intermediate_key_ordering(
&left_key[i],
&right_key[i],
orders[i].0,
orders[i].1,
) {
Ok(ordering) if ordering != Ordering::Equal => return ordering,
Ok(_) => continue, // Equal, try next key
Err(err) => {
sort_error = Some(err);
break;
}
}
}
Ordering::Equal
});
// If we encountered an error during sorting, return it now
if let Some(err) = sort_error {
return Err(err);
}
entries.truncate(target_size as usize);
Ok(entries)
}
impl MergeFruits for IntermediateTermBucketEntry {
fn merge_fruits(&mut self, other: IntermediateTermBucketEntry) -> crate::Result<()> {
self.doc_count += other.doc_count;

View File

@@ -55,6 +55,12 @@ impl IntermediateAverage {
pub(crate) fn from_stats(stats: IntermediateStats) -> Self {
Self { stats }
}
/// Returns a reference to the underlying [`IntermediateStats`].
pub fn stats(&self) -> &IntermediateStats {
&self.stats
}
/// Merges the other intermediate result into self.
pub fn merge_fruits(&mut self, other: IntermediateAverage) {
self.stats.merge_fruits(other.stats);

View File

@@ -1,12 +1,11 @@
use std::collections::hash_map::DefaultHasher;
use std::hash::{BuildHasher, Hasher};
use std::hash::Hash;
use columnar::column_values::CompactSpaceU64Accessor;
use columnar::{Column, ColumnType, Dictionary, StrColumn};
use common::f64_to_u64;
use hyperloglogplus::{HyperLogLog, HyperLogLogPlus};
use datasketches::hll::{HllSketch, HllType, HllUnion};
use rustc_hash::FxHashSet;
use serde::{Deserialize, Serialize};
use serde::{Deserialize, Deserializer, Serialize, Serializer};
use crate::aggregation::agg_data::AggregationsSegmentCtx;
use crate::aggregation::intermediate_agg_result::{
@@ -16,29 +15,17 @@ use crate::aggregation::segment_agg_result::SegmentAggregationCollector;
use crate::aggregation::*;
use crate::TantivyError;
#[derive(Clone, Debug, Serialize, Deserialize)]
struct BuildSaltedHasher {
salt: u8,
}
impl BuildHasher for BuildSaltedHasher {
type Hasher = DefaultHasher;
fn build_hasher(&self) -> Self::Hasher {
let mut hasher = DefaultHasher::new();
hasher.write_u8(self.salt);
hasher
}
}
/// Log2 of the number of registers for the HLL sketch.
/// 2^11 = 2048 registers, giving ~2.3% relative error and ~1KB per sketch (Hll4).
const LG_K: u8 = 11;
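// Worked arithmetic behind the figures above (classic HLL estimates; treated
// here as an approximation, not a guarantee of the datasketches crate):
//   registers        = 2^LG_K              = 2^11          = 2048
//   relative error   ≈ 1.04 / sqrt(2^LG_K) = 1.04 / 45.25  ≈ 2.3%
//   Hll4 storage     ≈ 4 bits per register = 2048 * 0.5 B  ≈ 1 KiB (plus header)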
/// # Cardinality
///
/// The cardinality aggregation allows for computing an estimate
/// of the number of different values in a data set based on the
/// HyperLogLog++ algorithm. This is particularly useful for understanding the
/// uniqueness of values in a large dataset where counting each unique value
/// individually would be computationally expensive.
/// Apache DataSketches HyperLogLog algorithm. This is particularly useful for
/// understanding the uniqueness of values in a large dataset where counting
/// each unique value individually would be computationally expensive.
///
/// For example, you might use a cardinality aggregation to estimate the number
/// of unique visitors to a website by aggregating on a field that contains
@@ -184,7 +171,7 @@ impl SegmentCardinalityCollectorBucket {
term_ids.sort_unstable();
dict.sorted_ords_to_term_cb(term_ids.iter().map(|term| *term as u64), |term| {
self.cardinality.sketch.insert_any(&term);
self.cardinality.insert(term);
Ok(())
})?;
if has_missing {
@@ -195,17 +182,17 @@ impl SegmentCardinalityCollectorBucket {
);
match missing_key {
Key::Str(missing) => {
self.cardinality.sketch.insert_any(&missing);
self.cardinality.insert(missing.as_str());
}
Key::F64(val) => {
let val = f64_to_u64(*val);
self.cardinality.sketch.insert_any(&val);
self.cardinality.insert(val);
}
Key::U64(val) => {
self.cardinality.sketch.insert_any(&val);
self.cardinality.insert(*val);
}
Key::I64(val) => {
self.cardinality.sketch.insert_any(&val);
self.cardinality.insert(*val);
}
}
}
@@ -296,11 +283,11 @@ impl SegmentAggregationCollector for SegmentCardinalityCollector {
})?;
for val in col_block_accessor.iter_vals() {
let val: u128 = compact_space_accessor.compact_to_u128(val as u32);
bucket.cardinality.sketch.insert_any(&val);
bucket.cardinality.insert(val);
}
} else {
for val in col_block_accessor.iter_vals() {
bucket.cardinality.sketch.insert_any(&val);
bucket.cardinality.insert(val);
}
}
@@ -321,11 +308,18 @@ impl SegmentAggregationCollector for SegmentCardinalityCollector {
}
}
#[derive(Clone, Debug, Serialize, Deserialize)]
/// The percentiles collector used during segment collection and for merging results.
#[derive(Clone, Debug)]
/// The cardinality collector used during segment collection and for merging results.
/// Uses Apache DataSketches HLL (lg_k=11, Hll4) for compact binary serialization
/// and cross-language compatibility (e.g. Java `datasketches` library).
pub struct CardinalityCollector {
sketch: HyperLogLogPlus<u64, BuildSaltedHasher>,
sketch: HllSketch,
/// Salt derived from `ColumnType`, used to differentiate values of different column types
/// that map to the same u64 (e.g. bool `false` = 0 vs i64 `0`).
/// Not serialized — only needed during insertion, not after sketch registers are populated.
salt: u8,
}
impl Default for CardinalityCollector {
fn default() -> Self {
Self::new(0)
@@ -338,25 +332,52 @@ impl PartialEq for CardinalityCollector {
}
}
impl CardinalityCollector {
/// Compute the final cardinality estimate.
pub fn finalize(self) -> Option<f64> {
Some(self.sketch.clone().count().trunc())
impl Serialize for CardinalityCollector {
fn serialize<S: Serializer>(&self, serializer: S) -> Result<S::Ok, S::Error> {
let bytes = self.sketch.serialize();
serializer.serialize_bytes(&bytes)
}
}
impl<'de> Deserialize<'de> for CardinalityCollector {
fn deserialize<D: Deserializer<'de>>(deserializer: D) -> Result<Self, D::Error> {
let bytes: Vec<u8> = Deserialize::deserialize(deserializer)?;
let sketch = HllSketch::deserialize(&bytes).map_err(serde::de::Error::custom)?;
Ok(Self { sketch, salt: 0 })
}
}
impl CardinalityCollector {
fn new(salt: u8) -> Self {
Self {
sketch: HyperLogLogPlus::new(16, BuildSaltedHasher { salt }).unwrap(),
sketch: HllSketch::new(LG_K, HllType::Hll4),
salt,
}
}
pub(crate) fn merge_fruits(&mut self, right: CardinalityCollector) -> crate::Result<()> {
self.sketch.merge(&right.sketch).map_err(|err| {
TantivyError::AggregationError(AggregationError::InternalError(format!(
"Error while merging cardinality {err:?}"
)))
})?;
/// Insert a value into the HLL sketch, salted by the column type.
/// The salt ensures that identical u64 values from different column types
/// (e.g. bool `false` vs i64 `0`) are counted as distinct.
pub(crate) fn insert<T: Hash>(&mut self, value: T) {
self.sketch.update((self.salt, value));
}
/// Compute the final cardinality estimate.
pub fn finalize(self) -> Option<f64> {
Some(self.sketch.estimate().trunc())
}
/// Serialize the HLL sketch to its compact binary representation.
/// The format is cross-language compatible with Apache DataSketches (Java, C++, Python).
pub fn to_sketch_bytes(&self) -> Vec<u8> {
self.sketch.serialize()
}
pub(crate) fn merge_fruits(&mut self, right: CardinalityCollector) -> crate::Result<()> {
let mut union = HllUnion::new(LG_K);
union.update(&self.sketch);
union.update(&right.sketch);
self.sketch = union.get_result(HllType::Hll4);
Ok(())
}
}
@@ -518,4 +539,75 @@ mod tests {
Ok(())
}
#[test]
fn cardinality_collector_serde_roundtrip() {
use super::CardinalityCollector;
let mut collector = CardinalityCollector::default();
collector.insert("hello");
collector.insert("world");
collector.insert("hello"); // duplicate
let serialized = serde_json::to_vec(&collector).unwrap();
let deserialized: CardinalityCollector = serde_json::from_slice(&serialized).unwrap();
let original_estimate = collector.finalize().unwrap();
let roundtrip_estimate = deserialized.finalize().unwrap();
assert_eq!(original_estimate, roundtrip_estimate);
assert_eq!(original_estimate, 2.0);
}
#[test]
fn cardinality_collector_merge() {
use super::CardinalityCollector;
let mut left = CardinalityCollector::default();
left.insert("a");
left.insert("b");
let mut right = CardinalityCollector::default();
right.insert("b");
right.insert("c");
left.merge_fruits(right).unwrap();
let estimate = left.finalize().unwrap();
assert_eq!(estimate, 3.0);
}
#[test]
fn cardinality_collector_serialize_deserialize_binary() {
use datasketches::hll::HllSketch;
use super::CardinalityCollector;
let mut collector = CardinalityCollector::default();
collector.insert("apple");
collector.insert("banana");
collector.insert("cherry");
let bytes = collector.to_sketch_bytes();
let deserialized = HllSketch::deserialize(&bytes).unwrap();
assert!((deserialized.estimate() - 3.0).abs() < 0.01);
}
#[test]
fn cardinality_collector_salt_differentiates_types() {
use super::CardinalityCollector;
// Without salt, same u64 value from different column types would collide
let mut collector_bool = CardinalityCollector::new(5); // e.g. ColumnType::Bool
collector_bool.insert(0u64); // false
collector_bool.insert(1u64); // true
let mut collector_i64 = CardinalityCollector::new(2); // e.g. ColumnType::I64
collector_i64.insert(0u64);
collector_i64.insert(1u64);
// Merge them
collector_bool.merge_fruits(collector_i64).unwrap();
let estimate = collector_bool.finalize().unwrap();
// Should be 4 because salt makes (5, 0) != (2, 0) and (5, 1) != (2, 1)
assert_eq!(estimate, 4.0);
}
}

View File

@@ -107,8 +107,11 @@ pub enum PercentileValues {
#[derive(Clone, Debug, PartialEq, Serialize, Deserialize)]
/// The entry when requesting percentiles with keyed: false
pub struct PercentileValuesVecEntry {
key: f64,
value: f64,
/// Percentile
pub key: f64,
/// Value at the percentile
pub value: f64,
}
/// Single-metric aggregations use this common result structure.

View File

@@ -222,6 +222,12 @@ impl PercentilesCollector {
self.sketch.add(val);
}
/// Encode the underlying DDSketch to Java-compatible binary format
/// for cross-language serialization with Java consumers.
pub fn to_sketch_bytes(&self) -> Vec<u8> {
self.sketch.to_java_bytes()
}
pub(crate) fn merge_fruits(&mut self, right: PercentilesCollector) -> crate::Result<()> {
self.sketch.merge(&right.sketch).map_err(|err| {
TantivyError::AggregationError(AggregationError::InternalError(format!(
@@ -610,11 +616,11 @@ mod tests {
assert_eq!(
res["range_with_stats"]["buckets"][0]["percentiles"]["values"]["1.0"],
5.0028295751107414
5.002829575110705
);
assert_eq!(
res["range_with_stats"]["buckets"][0]["percentiles"]["values"]["99.0"],
10.07469668951144
10.07469668951133
);
Ok(())
@@ -659,8 +665,8 @@ mod tests {
let res = exec_request_with_query(agg_req, &index, None)?;
assert_eq!(res["percentiles"]["values"]["1.0"], 5.0028295751107414);
assert_eq!(res["percentiles"]["values"]["99.0"], 10.07469668951144);
assert_eq!(res["percentiles"]["values"]["1.0"], 5.002829575110705);
assert_eq!(res["percentiles"]["values"]["99.0"], 10.07469668951133);
Ok(())
}

View File

@@ -110,6 +110,16 @@ impl Default for IntermediateStats {
}
impl IntermediateStats {
/// Returns the number of values collected.
pub fn count(&self) -> u64 {
self.count
}
/// Returns the sum of all values collected.
pub fn sum(&self) -> f64 {
self.sum
}
/// Merges the other stats intermediate result into self.
pub fn merge_fruits(&mut self, other: IntermediateStats) {
self.count += other.count;

View File

@@ -1,4 +1,5 @@
mod order;
mod sort_by_bytes;
mod sort_by_erased_type;
mod sort_by_score;
mod sort_by_static_fast_value;
@@ -6,6 +7,7 @@ mod sort_by_string;
mod sort_key_computer;
pub use order::*;
pub use sort_by_bytes::SortByBytes;
pub use sort_by_erased_type::SortByErasedType;
pub use sort_by_score::SortBySimilarityScore;
pub use sort_by_static_fast_value::SortByStaticFastValue;

View File

@@ -0,0 +1,168 @@
use columnar::BytesColumn;
use crate::collector::sort_key::NaturalComparator;
use crate::collector::{SegmentSortKeyComputer, SortKeyComputer};
use crate::termdict::TermOrdinal;
use crate::{DocId, Score};
/// Sort by the first value of a bytes column.
///
/// If the field is multivalued, only the first value is considered.
///
/// Documents that do not have a value for this field are still considered.
/// Their sort key will simply be `None`.
#[derive(Debug, Clone)]
pub struct SortByBytes {
column_name: String,
}
impl SortByBytes {
/// Creates a new sort by bytes sort key computer.
pub fn for_field(column_name: impl ToString) -> Self {
SortByBytes {
column_name: column_name.to_string(),
}
}
}
impl SortKeyComputer for SortByBytes {
type SortKey = Option<Vec<u8>>;
type Child = ByBytesColumnSegmentSortKeyComputer;
type Comparator = NaturalComparator;
fn segment_sort_key_computer(
&self,
segment_reader: &crate::SegmentReader,
) -> crate::Result<Self::Child> {
let bytes_column_opt = segment_reader.fast_fields().bytes(&self.column_name)?;
Ok(ByBytesColumnSegmentSortKeyComputer { bytes_column_opt })
}
}
/// Segment-level sort key computer for bytes columns.
pub struct ByBytesColumnSegmentSortKeyComputer {
bytes_column_opt: Option<BytesColumn>,
}
impl SegmentSortKeyComputer for ByBytesColumnSegmentSortKeyComputer {
type SortKey = Option<Vec<u8>>;
type SegmentSortKey = Option<TermOrdinal>;
type SegmentComparator = NaturalComparator;
#[inline(always)]
fn segment_sort_key(&mut self, doc: DocId, _score: Score) -> Option<TermOrdinal> {
let bytes_column = self.bytes_column_opt.as_ref()?;
bytes_column.ords().first(doc)
}
fn convert_segment_sort_key(&self, term_ord_opt: Option<TermOrdinal>) -> Option<Vec<u8>> {
// TODO: Individual lookups to the dictionary like this are very likely to repeatedly
// decompress the same blocks. See https://github.com/quickwit-oss/tantivy/issues/2776
let term_ord = term_ord_opt?;
let bytes_column = self.bytes_column_opt.as_ref()?;
let mut bytes = Vec::new();
bytes_column
.dictionary()
.ord_to_term(term_ord, &mut bytes)
.ok()?;
Some(bytes)
}
}
#[cfg(test)]
mod tests {
use super::SortByBytes;
use crate::collector::TopDocs;
use crate::query::AllQuery;
use crate::schema::{BytesOptions, Schema, FAST, INDEXED};
use crate::{Index, IndexWriter, Order, TantivyDocument};
#[test]
fn test_sort_by_bytes_asc() -> crate::Result<()> {
let mut schema_builder = Schema::builder();
let bytes_field = schema_builder
.add_bytes_field("data", BytesOptions::default().set_fast().set_indexed());
let id_field = schema_builder.add_u64_field("id", FAST | INDEXED);
let schema = schema_builder.build();
let index = Index::create_in_ram(schema);
let mut index_writer: IndexWriter = index.writer_for_tests()?;
// Insert documents with byte values in non-sorted order
let test_data: Vec<(u64, Vec<u8>)> = vec![
(1, vec![0x02, 0x00]),
(2, vec![0x00, 0x10]),
(3, vec![0x01, 0x00]),
(4, vec![0x00, 0x20]),
];
for (id, bytes) in &test_data {
let mut doc = TantivyDocument::new();
doc.add_u64(id_field, *id);
doc.add_bytes(bytes_field, bytes);
index_writer.add_document(doc)?;
}
index_writer.commit()?;
let reader = index.reader()?;
let searcher = reader.searcher();
// Sort ascending by bytes
let top_docs =
TopDocs::with_limit(10).order_by((SortByBytes::for_field("data"), Order::Asc));
let results: Vec<(Option<Vec<u8>>, _)> = searcher.search(&AllQuery, &top_docs)?;
// Expected order: [0x00,0x10], [0x00,0x20], [0x01,0x00], [0x02,0x00]
let sorted_bytes: Vec<Option<Vec<u8>>> = results.into_iter().map(|(b, _)| b).collect();
assert_eq!(
sorted_bytes,
vec![
Some(vec![0x00, 0x10]),
Some(vec![0x00, 0x20]),
Some(vec![0x01, 0x00]),
Some(vec![0x02, 0x00]),
]
);
Ok(())
}
#[test]
fn test_sort_by_bytes_desc() -> crate::Result<()> {
let mut schema_builder = Schema::builder();
let bytes_field = schema_builder
.add_bytes_field("data", BytesOptions::default().set_fast().set_indexed());
let schema = schema_builder.build();
let index = Index::create_in_ram(schema);
let mut index_writer: IndexWriter = index.writer_for_tests()?;
let test_data: Vec<Vec<u8>> = vec![vec![0x00, 0x10], vec![0x02, 0x00], vec![0x01, 0x00]];
for bytes in &test_data {
let mut doc = TantivyDocument::new();
doc.add_bytes(bytes_field, bytes);
index_writer.add_document(doc)?;
}
index_writer.commit()?;
let reader = index.reader()?;
let searcher = reader.searcher();
// Sort descending by bytes
let top_docs =
TopDocs::with_limit(10).order_by((SortByBytes::for_field("data"), Order::Desc));
let results: Vec<(Option<Vec<u8>>, _)> = searcher.search(&AllQuery, &top_docs)?;
// Expected order (descending): [0x02,0x00], [0x01,0x00], [0x00,0x10]
let sorted_bytes: Vec<Option<Vec<u8>>> = results.into_iter().map(|(b, _)| b).collect();
assert_eq!(
sorted_bytes,
vec![
Some(vec![0x02, 0x00]),
Some(vec![0x01, 0x00]),
Some(vec![0x00, 0x10]),
]
);
Ok(())
}
}

View File

@@ -1,7 +1,7 @@
use columnar::{ColumnType, MonotonicallyMappableToU64};
use crate::collector::sort_key::{
NaturalComparator, SortBySimilarityScore, SortByStaticFastValue, SortByString,
NaturalComparator, SortByBytes, SortBySimilarityScore, SortByStaticFastValue, SortByString,
};
use crate::collector::{SegmentSortKeyComputer, SortKeyComputer};
use crate::fastfield::FastFieldNotAvailableError;
@@ -114,6 +114,16 @@ impl SortKeyComputer for SortByErasedType {
},
})
}
ColumnType::Bytes => {
let computer = SortByBytes::for_field(column_name);
let inner = computer.segment_sort_key_computer(segment_reader)?;
Box::new(ErasedSegmentSortKeyComputerWrapper {
inner,
converter: |val: Option<Vec<u8>>| {
val.map(OwnedValue::Bytes).unwrap_or(OwnedValue::Null)
},
})
}
ColumnType::U64 => {
let computer = SortByStaticFastValue::<u64>::for_field(column_name);
let inner = computer.segment_sort_key_computer(segment_reader)?;
@@ -281,6 +291,65 @@ mod tests {
);
}
#[test]
fn test_sort_by_owned_bytes() {
let mut schema_builder = Schema::builder();
let data_field = schema_builder.add_bytes_field("data", FAST);
let schema = schema_builder.build();
let index = Index::create_in_ram(schema);
let mut writer = index.writer_for_tests().unwrap();
writer
.add_document(doc!(data_field => vec![0x03u8, 0x00]))
.unwrap();
writer
.add_document(doc!(data_field => vec![0x01u8, 0x00]))
.unwrap();
writer
.add_document(doc!(data_field => vec![0x02u8, 0x00]))
.unwrap();
writer.add_document(doc!()).unwrap();
writer.commit().unwrap();
let reader = index.reader().unwrap();
let searcher = reader.searcher();
// Sort descending (Natural - highest first)
let collector = TopDocs::with_limit(10)
.order_by((SortByErasedType::for_field("data"), ComparatorEnum::Natural));
let top_docs = searcher.search(&AllQuery, &collector).unwrap();
let values: Vec<OwnedValue> = top_docs.into_iter().map(|(key, _)| key).collect();
assert_eq!(
values,
vec![
OwnedValue::Bytes(vec![0x03, 0x00]),
OwnedValue::Bytes(vec![0x02, 0x00]),
OwnedValue::Bytes(vec![0x01, 0x00]),
OwnedValue::Null
]
);
// Sort ascending (ReverseNoneLower - lowest first, nulls last)
let collector = TopDocs::with_limit(10).order_by((
SortByErasedType::for_field("data"),
ComparatorEnum::ReverseNoneLower,
));
let top_docs = searcher.search(&AllQuery, &collector).unwrap();
let values: Vec<OwnedValue> = top_docs.into_iter().map(|(key, _)| key).collect();
assert_eq!(
values,
vec![
OwnedValue::Bytes(vec![0x01, 0x00]),
OwnedValue::Bytes(vec![0x02, 0x00]),
OwnedValue::Bytes(vec![0x03, 0x00]),
OwnedValue::Null
]
);
}
#[test]
fn test_sort_by_owned_reverse() {
let mut schema_builder = Schema::builder();

View File

@@ -676,7 +676,7 @@ mod tests {
let num_segments = reader.searcher().segment_readers().len();
assert!(num_segments <= 4);
let num_components_except_deletes_and_tempstore =
crate::index::SegmentComponent::iterator().len() - 2;
crate::index::SegmentComponent::iterator().len() - 1;
let max_num_mmapped = num_components_except_deletes_and_tempstore * num_segments;
assert_eventually(|| {
let num_mmapped = mmap_directory.get_cache_info().mmapped.len();

View File

@@ -51,31 +51,55 @@ pub trait DocSet: Send {
doc
}
/// Seeks to the target if possible and returns true if the target is in the DocSet.
/// !!!Dragons ahead!!!
/// In spirit, this is an approximate and dangerous version of `seek`.
///
/// It can leave the DocSet in an `invalid` state and might return a
/// lower bound of what the result of `seek` would have been.
///
/// More accurately it returns either:
/// - Found if the target is in the docset. In that case, the DocSet is left in a valid state.
/// - SeekLowerBound(seek_lower_bound) if the target is not in the docset. In that case, the
/// DocSet can be left in an invalid state. The DocSet should then only receive calls to
/// `seek_danger(..)` until it returns `Found`, at which point it is back in a valid state.
///
/// `seek_lower_bound` can be any `DocId` (in the docset or not) as long as it is in
/// `(target .. seek_result] U {TERMINATED}` where `seek_result` is the first document in the
/// docset greater than `target`.
///
/// `seek_danger` may return `SeekLowerBound(TERMINATED)`.
///
/// Calling `seek_danger` with TERMINATED as a target is allowed,
/// and should always return `SeekLowerBound(TERMINATED)`, as TERMINATED is NOT in
/// the DocSet.
///
/// DocSets that already have an efficient `seek` method don't need to implement
/// `seek_into_the_danger_zone`. All wrapper DocSets should forward
/// `seek_into_the_danger_zone` to the underlying DocSet.
/// `seek_danger`.
///
/// ## API Behaviour
/// If `seek_into_the_danger_zone` is returning true, a call to `doc()` has to return target.
/// If `seek_into_the_danger_zone` is returning false, a call to `doc()` may return any doc
/// between the last doc that matched and target or a doc that is a valid next hit after
/// target. The DocSet is considered to be in an invalid state until
/// `seek_into_the_danger_zone` returns true again.
///
/// `target` needs to be equal or larger than `doc` when in a valid state.
///
/// Consecutive calls are not allowed to have decreasing `target` values.
///
/// # Warning
/// This is an advanced API used by intersection. The API contract is tricky, avoid using it.
fn seek_into_the_danger_zone(&mut self, target: DocId) -> bool {
let current_doc = self.doc();
if current_doc < target {
self.seek(target);
/// Consecutive calls to seek_danger are guaranteed to have strictly increasing `target`
/// values.
fn seek_danger(&mut self, target: DocId) -> SeekDangerResult {
if target >= TERMINATED {
debug_assert!(target == TERMINATED);
// No need to advance.
return SeekDangerResult::SeekLowerBound(target);
}
// The default implementation does not include any
// `danger zone` behavior.
//
// It does not leave the scorer in an invalid state.
// For this reason, we can safely call `self.doc()`.
let mut doc = self.doc();
if doc < target {
doc = self.seek(target);
}
if doc == target {
SeekDangerResult::Found
} else {
SeekDangerResult::SeekLowerBound(doc)
}
self.doc() == target
}
/// Fills a given mutable buffer with the next doc ids from the
@@ -166,6 +190,17 @@ pub trait DocSet: Send {
}
}
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
/// The result of a `seek_danger` call.
pub enum SeekDangerResult {
/// The target was found in the DocSet.
Found,
/// The target was not found in the DocSet.
/// The returned lower bound can be any DocId that is less than or equal to the first
/// document in the docset after the target.
SeekLowerBound(DocId),
}
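// Illustrative caller sketch only, not part of this diff: how an
// intersection-style consumer could drive `seek_danger` while honoring the
// contract above. A docset that reports `SeekLowerBound` may be left in an
// invalid state and therefore only receives further `seek_danger` calls, with
// strictly increasing targets, until it reports `Found`.
#[cfg(test)]
#[allow(dead_code)]
fn next_common_doc(docsets: &mut [Box<dyn DocSet>], mut candidate: DocId) -> DocId {
    'outer: while candidate != TERMINATED {
        for docset in docsets.iter_mut() {
            match docset.seek_danger(candidate) {
                SeekDangerResult::Found => {}
                SeekDangerResult::SeekLowerBound(lower_bound) => {
                    // The first doc of this docset past `candidate` is >= `lower_bound`,
                    // so no doc below `lower_bound` can be a common match.
                    candidate = lower_bound;
                    continue 'outer;
                }
            }
        }
        // Every docset reported `Found`: all are in a valid state, positioned on `candidate`.
        return candidate;
    }
    TERMINATED
}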
impl DocSet for &mut dyn DocSet {
fn advance(&mut self) -> u32 {
(**self).advance()
@@ -175,8 +210,8 @@ impl DocSet for &mut dyn DocSet {
(**self).seek(target)
}
fn seek_into_the_danger_zone(&mut self, target: DocId) -> bool {
(**self).seek_into_the_danger_zone(target)
fn seek_danger(&mut self, target: DocId) -> SeekDangerResult {
(**self).seek_danger(target)
}
fn doc(&self) -> u32 {
@@ -211,9 +246,9 @@ impl<TDocSet: DocSet + ?Sized> DocSet for Box<TDocSet> {
unboxed.seek(target)
}
fn seek_into_the_danger_zone(&mut self, target: DocId) -> bool {
fn seek_danger(&mut self, target: DocId) -> SeekDangerResult {
let unboxed: &mut TDocSet = self.borrow_mut();
unboxed.seek_into_the_danger_zone(target)
unboxed.seek_danger(target)
}
fn fill_buffer(&mut self, buffer: &mut [DocId; COLLECT_BLOCK_BUFFER_LEN]) -> usize {

View File

@@ -1,8 +1,6 @@
use std::collections::HashSet;
use std::fmt;
use std::path::PathBuf;
use std::sync::atomic::AtomicBool;
use std::sync::Arc;
use serde::{Deserialize, Serialize};
@@ -37,7 +35,6 @@ impl SegmentMetaInventory {
let inner = InnerSegmentMeta {
segment_id,
max_doc,
include_temp_doc_store: Arc::new(AtomicBool::new(true)),
deletes: None,
};
SegmentMeta::from(self.inventory.track(inner))
@@ -85,15 +82,6 @@ impl SegmentMeta {
self.tracked.segment_id
}
/// Removes the Component::TempStore from the alive list and
/// therefore marks the temp docstore file to be deleted by
/// the garbage collection.
pub fn untrack_temp_docstore(&self) {
self.tracked
.include_temp_doc_store
.store(false, std::sync::atomic::Ordering::Relaxed);
}
/// Returns the number of deleted documents.
pub fn num_deleted_docs(&self) -> u32 {
self.tracked
@@ -111,20 +99,9 @@ impl SegmentMeta {
/// is by removing all files that have been created by tantivy
/// and are not used by any segment anymore.
pub fn list_files(&self) -> HashSet<PathBuf> {
if self
.tracked
.include_temp_doc_store
.load(std::sync::atomic::Ordering::Relaxed)
{
SegmentComponent::iterator()
.map(|component| self.relative_path(*component))
.collect::<HashSet<PathBuf>>()
} else {
SegmentComponent::iterator()
.filter(|comp| *comp != &SegmentComponent::TempStore)
.map(|component| self.relative_path(*component))
.collect::<HashSet<PathBuf>>()
}
SegmentComponent::iterator()
.map(|component| self.relative_path(*component))
.collect::<HashSet<PathBuf>>()
}
/// Returns the relative path of a component of our segment.
@@ -138,7 +115,6 @@ impl SegmentMeta {
SegmentComponent::Positions => ".pos".to_string(),
SegmentComponent::Terms => ".term".to_string(),
SegmentComponent::Store => ".store".to_string(),
SegmentComponent::TempStore => ".store.temp".to_string(),
SegmentComponent::FastFields => ".fast".to_string(),
SegmentComponent::FieldNorms => ".fieldnorm".to_string(),
SegmentComponent::Delete => format!(".{}.del", self.delete_opstamp().unwrap_or(0)),
@@ -183,7 +159,6 @@ impl SegmentMeta {
segment_id: inner_meta.segment_id,
max_doc,
deletes: None,
include_temp_doc_store: Arc::new(AtomicBool::new(true)),
});
SegmentMeta { tracked }
}
@@ -202,7 +177,6 @@ impl SegmentMeta {
let tracked = self.tracked.map(move |inner_meta| InnerSegmentMeta {
segment_id: inner_meta.segment_id,
max_doc: inner_meta.max_doc,
include_temp_doc_store: Arc::new(AtomicBool::new(true)),
deletes: Some(delete_meta),
});
SegmentMeta { tracked }
@@ -214,14 +188,6 @@ struct InnerSegmentMeta {
segment_id: SegmentId,
max_doc: u32,
pub deletes: Option<DeleteMeta>,
/// If you want to avoid the SegmentComponent::TempStore file to be covered by
/// garbage collection and deleted, set this to true. This is used during merge.
#[serde(skip)]
#[serde(default = "default_temp_store")]
pub(crate) include_temp_doc_store: Arc<AtomicBool>,
}
fn default_temp_store() -> Arc<AtomicBool> {
Arc::new(AtomicBool::new(false))
}
impl InnerSegmentMeta {

View File

@@ -23,8 +23,6 @@ pub enum SegmentComponent {
/// Accessing a document from the store is relatively slow, as it
/// requires to decompress the entire block it belongs to.
Store,
/// Temporary storage of the documents, before streamed to `Store`.
TempStore,
/// Bitset describing which document of the segment is alive.
/// (It was representing deleted docs but changed to represent alive docs from v0.17)
Delete,
@@ -33,14 +31,13 @@ pub enum SegmentComponent {
impl SegmentComponent {
/// Iterates through the components.
pub fn iterator() -> slice::Iter<'static, SegmentComponent> {
static SEGMENT_COMPONENTS: [SegmentComponent; 8] = [
static SEGMENT_COMPONENTS: [SegmentComponent; 7] = [
SegmentComponent::Postings,
SegmentComponent::Positions,
SegmentComponent::FastFields,
SegmentComponent::FieldNorms,
SegmentComponent::Terms,
SegmentComponent::Store,
SegmentComponent::TempStore,
SegmentComponent::Delete,
];
SEGMENT_COMPONENTS.iter()

View File

@@ -218,7 +218,7 @@ fn index_documents<D: Document>(
let alive_bitset_opt = apply_deletes(&segment_with_max_doc, &mut delete_cursor, &doc_opstamps)?;
let meta = segment_with_max_doc.meta().clone();
meta.untrack_temp_docstore();
// update segment_updater inventory to remove tempstore
let segment_entry = SegmentEntry::new(meta, delete_cursor, alive_bitset_opt);
segment_updater.schedule_add_segment(segment_entry).wait()?;

View File

@@ -303,10 +303,10 @@ impl BlockSegmentPostings {
}
pub(crate) fn load_block(&mut self) {
let offset = self.skip_reader.byte_offset();
if self.block_is_loaded() {
return;
}
let offset = self.skip_reader.byte_offset();
match self.skip_reader.block_info() {
BlockInfo::BitPacked {
doc_num_bits,

View File

@@ -168,12 +168,20 @@ impl DocSet for SegmentPostings {
self.doc()
}
#[inline]
fn seek(&mut self, target: DocId) -> DocId {
debug_assert!(self.doc() <= target);
if self.doc() >= target {
return self.doc();
}
// As an optimization, if the block is already loaded, we can
// cheaply check the next doc.
self.cur = (self.cur + 1).min(COMPRESSION_BLOCK_SIZE - 1);
if self.doc() >= target {
return self.doc();
}
// Delegate block-local search to BlockSegmentPostings::seek, which returns
// the in-block index of the first doc >= target.
self.cur = self.block_cursor.seek(target);
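
The fast path above leans on the standard DocSet::seek contract: position the docset on the first document >= target and return it, staying put if it is already there. A minimal test-style sketch of that contract, using the crate-internal VecDocSet the way the tests elsewhere in this diff do (the test name is illustrative only):

    use crate::docset::{DocSet, TERMINATED};
    use crate::query::VecDocSet;

    #[test]
    fn seek_contract_sketch() {
        let mut docs = VecDocSet::from(vec![1, 3, 7, 9]);
        assert_eq!(docs.doc(), 1); // positioned on the first doc right after construction
        assert_eq!(docs.seek(4), 7); // first doc >= 4
        assert_eq!(docs.seek(7), 7); // seeking to the current doc leaves it in place
        assert_eq!(docs.seek(10), TERMINATED); // no doc >= 10 left
    }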

View File

@@ -291,18 +291,6 @@ impl<TScoreCombiner: ScoreCombiner> BooleanWeight<TScoreCombiner> {
}
};
let exclude_scorer_opt: Option<Box<dyn Scorer>> = if exclude_scorers.is_empty() {
None
} else {
let exclude_specialized_scorer: SpecializedScorer =
scorer_union(exclude_scorers, DoNothingCombiner::default, num_docs);
Some(into_box_scorer(
exclude_specialized_scorer,
DoNothingCombiner::default,
num_docs,
))
};
let include_scorer = match (should_scorers, must_scorers) {
(ShouldScorersCombinationMethod::Ignored, must_scorers) => {
// No SHOULD clauses (or they were absorbed into MUST).
@@ -380,16 +368,23 @@ impl<TScoreCombiner: ScoreCombiner> BooleanWeight<TScoreCombiner> {
}
}
};
if let Some(exclude_scorer) = exclude_scorer_opt {
let include_scorer_boxed =
into_box_scorer(include_scorer, &score_combiner_fn, num_docs);
Ok(SpecializedScorer::Other(Box::new(Exclude::new(
include_scorer_boxed,
exclude_scorer,
))))
} else {
Ok(include_scorer)
if exclude_scorers.is_empty() {
return Ok(include_scorer);
}
let include_scorer_boxed = into_box_scorer(include_scorer, &score_combiner_fn, num_docs);
let scorer: Box<dyn Scorer> = if exclude_scorers.len() == 1 {
let exclude_scorer = exclude_scorers.pop().unwrap();
match exclude_scorer.downcast::<TermScorer>() {
// Cast to TermScorer succeeded
Ok(exclude_scorer) => Box::new(Exclude::new(include_scorer_boxed, *exclude_scorer)),
// We get back the original Box<dyn Scorer>
Err(exclude_scorer) => Box::new(Exclude::new(include_scorer_boxed, exclude_scorer)),
}
} else {
Box::new(Exclude::new(include_scorer_boxed, exclude_scorers))
};
Ok(SpecializedScorer::Other(scorer))
}
}
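
The single-exclusion branch above relies on the Ok/Err shape of a boxed downcast, as the inline comments note: success yields the concrete scorer, failure hands the original Box<dyn Scorer> back so it can still be used as a trait object. A std-only sketch of that pattern with Box<dyn Any> (not tantivy's Scorer machinery, just the same shape):

    use std::any::Any;

    fn describe(boxed: Box<dyn Any>) -> String {
        match boxed.downcast::<u32>() {
            // Cast succeeded: we now own the concrete value.
            Ok(n) => format!("a u32: {}", n),
            // Cast failed: the original box is handed back and can be tried again.
            Err(other) => match other.downcast::<String>() {
                Ok(s) => format!("a String: {}", s),
                Err(_) => "something else".to_string(),
            },
        }
    }

    #[test]
    fn downcast_returns_the_original_box_on_failure() {
        assert_eq!(describe(Box::new(7u32)), "a u32: 7");
        assert_eq!(describe(Box::new("hi".to_string())), "a String: hi");
        assert_eq!(describe(Box::new(1.5f64)), "something else");
    }

Special-casing a single TermScorer this way lets Exclude be instantiated with a concrete exclusion type instead of a one-element Vec of boxed scorers.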

View File

@@ -1,6 +1,6 @@
use std::fmt;
use crate::docset::COLLECT_BLOCK_BUFFER_LEN;
use crate::docset::{SeekDangerResult, COLLECT_BLOCK_BUFFER_LEN};
use crate::fastfield::AliveBitSet;
use crate::query::{EnableScoring, Explanation, Query, Scorer, Weight};
use crate::{DocId, DocSet, Score, SegmentReader, Term};
@@ -104,8 +104,8 @@ impl<S: Scorer> DocSet for BoostScorer<S> {
fn seek(&mut self, target: DocId) -> DocId {
self.underlying.seek(target)
}
fn seek_into_the_danger_zone(&mut self, target: DocId) -> bool {
self.underlying.seek_into_the_danger_zone(target)
fn seek_danger(&mut self, target: DocId) -> SeekDangerResult {
self.underlying.seek_danger(target)
}
fn fill_buffer(&mut self, buffer: &mut [DocId; COLLECT_BLOCK_BUFFER_LEN]) -> usize {

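seek_danger replaces the bool-returning seek_into_the_danger_zone throughout this changeset. The call sites in these hunks only ever match two variants, so the enum's shape can be read off the diff; the following is a hedged reconstruction (the actual definition lives in crate::docset and its derives and docs may differ):

    use crate::DocId;

    #[derive(Clone, Copy, Debug, PartialEq, Eq)]
    pub enum SeekDangerResult {
        /// `target` is in the docset, which is now positioned on it.
        Found,
        /// `target` is not in the docset; the next possible match is at or after this
        /// doc id (TERMINATED when the docset is exhausted).
        SeekLowerBound(DocId),
    }

Unlike the old bool, the SeekLowerBound payload tells the caller how far it can skip, which is exactly what the rewritten Intersection::advance further down uses to avoid probing candidates that cannot match.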
View File

@@ -1,6 +1,7 @@
use std::cmp::Ordering;
use std::collections::BinaryHeap;
use crate::docset::SeekDangerResult;
use crate::query::score_combiner::DoNothingCombiner;
use crate::query::{ScoreCombiner, Scorer};
use crate::{DocId, DocSet, Score, TERMINATED};
@@ -67,10 +68,12 @@ impl<T: Scorer> DocSet for ScorerWrapper<T> {
self.current_doc = doc_id;
doc_id
}
fn seek_into_the_danger_zone(&mut self, target: DocId) -> bool {
let found = self.scorer.seek_into_the_danger_zone(target);
self.current_doc = self.scorer.doc();
found
fn seek_danger(&mut self, target: DocId) -> SeekDangerResult {
let result = self.scorer.seek_danger(target);
if result == SeekDangerResult::Found {
self.current_doc = target;
}
result
}
fn doc(&self) -> DocId {

View File

@@ -1,48 +1,71 @@
use crate::docset::{DocSet, TERMINATED};
use crate::docset::{DocSet, SeekDangerResult, TERMINATED};
use crate::query::Scorer;
use crate::{DocId, Score};
#[inline]
fn is_within<TDocSetExclude: DocSet>(docset: &mut TDocSetExclude, doc: DocId) -> bool {
docset.doc() <= doc && docset.seek(doc) == doc
}
/// Filters a given `DocSet` by removing the docs from a given `DocSet`.
/// An exclusion set is a set of documents
/// that should be excluded from a given DocSet.
///
/// The excluding docset has no impact on scoring.
pub struct Exclude<TDocSet, TDocSetExclude> {
underlying_docset: TDocSet,
excluding_docset: TDocSetExclude,
/// It can be a single DocSet, or a Vec of DocSets.
pub trait ExclusionSet: Send {
/// Returns `true` if the given `doc` is in the exclusion set.
fn contains(&mut self, doc: DocId) -> bool;
}
impl<TDocSet, TDocSetExclude> Exclude<TDocSet, TDocSetExclude>
impl<TDocSet: DocSet> ExclusionSet for TDocSet {
#[inline]
fn contains(&mut self, doc: DocId) -> bool {
self.seek_danger(doc) == SeekDangerResult::Found
}
}
impl<TDocSet: DocSet> ExclusionSet for Vec<TDocSet> {
#[inline]
fn contains(&mut self, doc: DocId) -> bool {
for docset in self.iter_mut() {
if docset.seek_danger(doc) == SeekDangerResult::Found {
return true;
}
}
false
}
}
/// Filters a given `DocSet` by removing the docs from an exclusion set.
///
/// The excluding docsets have no impact on scoring.
pub struct Exclude<TDocSet, TExclusionSet> {
underlying_docset: TDocSet,
exclusion_set: TExclusionSet,
}
impl<TDocSet, TExclusionSet> Exclude<TDocSet, TExclusionSet>
where
TDocSet: DocSet,
TDocSetExclude: DocSet,
TExclusionSet: ExclusionSet,
{
/// Creates a new `Exclude`.
pub fn new(
mut underlying_docset: TDocSet,
mut excluding_docset: TDocSetExclude,
) -> Exclude<TDocSet, TDocSetExclude> {
mut exclusion_set: TExclusionSet,
) -> Exclude<TDocSet, TExclusionSet> {
while underlying_docset.doc() != TERMINATED {
let target = underlying_docset.doc();
if !is_within(&mut excluding_docset, target) {
if !exclusion_set.contains(target) {
break;
}
underlying_docset.advance();
}
Exclude {
underlying_docset,
excluding_docset,
exclusion_set,
}
}
}
impl<TDocSet, TDocSetExclude> DocSet for Exclude<TDocSet, TDocSetExclude>
impl<TDocSet, TExclusionSet> DocSet for Exclude<TDocSet, TExclusionSet>
where
TDocSet: DocSet,
TDocSetExclude: DocSet,
TExclusionSet: ExclusionSet,
{
fn advance(&mut self) -> DocId {
loop {
@@ -50,7 +73,7 @@ where
if candidate == TERMINATED {
return TERMINATED;
}
if !is_within(&mut self.excluding_docset, candidate) {
if !self.exclusion_set.contains(candidate) {
return candidate;
}
}
@@ -61,7 +84,7 @@ where
if candidate == TERMINATED {
return TERMINATED;
}
if !is_within(&mut self.excluding_docset, candidate) {
if !self.exclusion_set.contains(candidate) {
return candidate;
}
self.advance()
@@ -79,10 +102,10 @@ where
}
}
impl<TScorer, TDocSetExclude> Scorer for Exclude<TScorer, TDocSetExclude>
impl<TScorer, TExclusionSet> Scorer for Exclude<TScorer, TExclusionSet>
where
TScorer: Scorer,
TDocSetExclude: DocSet + 'static,
TExclusionSet: ExclusionSet + 'static,
{
#[inline]
fn score(&mut self) -> Score {

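With ExclusionSet implemented for both a single DocSet and a Vec of DocSets, Exclude can now take several excluding docsets without unioning them first. A minimal test-style sketch of the Vec form (crate-internal, using VecDocSet like the other tests in this diff; the test name is illustrative):

    use crate::docset::{DocSet, TERMINATED};
    use crate::query::{Exclude, VecDocSet};

    #[test]
    fn exclude_accepts_a_vec_of_docsets() {
        // Keep 1..=5, dropping any doc present in either exclusion docset.
        let mut excluded = Exclude::new(
            VecDocSet::from(vec![1, 2, 3, 4, 5]),
            vec![VecDocSet::from(vec![2, 5]), VecDocSet::from(vec![4])],
        );
        let mut kept = Vec::new();
        while excluded.doc() != TERMINATED {
            kept.push(excluded.doc());
            excluded.advance();
        }
        assert_eq!(kept, vec![1, 3]);
    }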
View File

@@ -1,5 +1,5 @@
use super::size_hint::estimate_intersection;
use crate::docset::{DocSet, TERMINATED};
use crate::docset::{DocSet, SeekDangerResult, TERMINATED};
use crate::query::term_query::TermScorer;
use crate::query::{EmptyScorer, Scorer};
use crate::{DocId, Score};
@@ -84,6 +84,14 @@ impl<TDocSet: DocSet> Intersection<TDocSet, TDocSet> {
docsets.sort_by_key(|docset| docset.cost());
go_to_first_doc(&mut docsets);
let left = docsets.remove(0);
debug_assert!({
let doc = left.doc();
if doc == TERMINATED {
true
} else {
docsets.iter().all(|docset| docset.doc() == doc)
}
});
let right = docsets.remove(0);
Intersection {
left,
@@ -108,46 +116,61 @@ impl<TDocSet: DocSet, TOtherDocSet: DocSet> DocSet for Intersection<TDocSet, TOt
#[inline]
fn advance(&mut self) -> DocId {
let (left, right) = (&mut self.left, &mut self.right);
let mut candidate = left.advance();
if candidate == TERMINATED {
return TERMINATED;
}
loop {
// In the first part we look for a document in the intersection
// of the two rarest `DocSet` in the intersection.
// Invariant:
// - candidate is always <= the next document in the intersection.
// - candidate strictly increases at every iteration of the loop.
let mut candidate = left.doc() + 1;
loop {
if right.seek_into_the_danger_zone(candidate) {
break;
}
let right_doc = right.doc();
// TODO: Think about which value would make sense here
// It depends on the DocSet implementation, when a seek would outweigh an advance.
if right_doc > candidate.wrapping_add(100) {
candidate = left.seek(right_doc);
} else {
candidate = left.advance();
}
if candidate == TERMINATED {
return TERMINATED;
}
}
// Termination: candidate strictly increases.
'outer: while candidate < TERMINATED {
// As we enter the loop, we should always have candidate < next_doc.
debug_assert_eq!(left.doc(), right.doc());
// test the remaining scorers
if self
.others
.iter_mut()
.all(|docset| docset.seek_into_the_danger_zone(candidate))
candidate = left.seek(candidate);
// Left is positioned on `candidate`.
debug_assert_eq!(left.doc(), candidate);
if let SeekDangerResult::SeekLowerBound(seek_lower_bound) = right.seek_danger(candidate)
{
debug_assert_eq!(candidate, self.left.doc());
debug_assert_eq!(candidate, self.right.doc());
debug_assert!(self.others.iter().all(|docset| docset.doc() == candidate));
return candidate;
debug_assert!(
seek_lower_bound == TERMINATED || seek_lower_bound > candidate,
"seek_lower_bound {seek_lower_bound} must be greater than candidate \
{candidate}"
);
candidate = seek_lower_bound;
continue;
}
candidate = left.advance();
// Left and right are positioned on `candidate`.
debug_assert_eq!(right.doc(), candidate);
for other in &mut self.others {
if let SeekDangerResult::SeekLowerBound(seek_lower_bound) =
other.seek_danger(candidate)
{
// One of the scorers does not match; restart at the top of the loop.
debug_assert!(
seek_lower_bound == TERMINATED || seek_lower_bound > candidate,
"seek_lower_bound {seek_lower_bound} must be greater than candidate \
{candidate}"
);
candidate = seek_lower_bound;
continue 'outer;
}
}
// At this point all scorers are in a valid state, aligned on the next document in the
// intersection.
debug_assert!(self.others.iter().all(|docset| docset.doc() == candidate));
return candidate;
}
// We make sure our docset is in a valid state.
// In particular, we want .doc() to return TERMINATED.
left.seek(TERMINATED);
TERMINATED
}
fn seek(&mut self, target: DocId) -> DocId {
@@ -166,13 +189,19 @@ impl<TDocSet: DocSet, TOtherDocSet: DocSet> DocSet for Intersection<TDocSet, TOt
///
/// Some implementations may choose to advance past the target if beneficial for performance.
/// The return value is `true` if the target is in the docset, and `false` otherwise.
fn seek_into_the_danger_zone(&mut self, target: DocId) -> bool {
self.left.seek_into_the_danger_zone(target)
&& self.right.seek_into_the_danger_zone(target)
&& self
.others
.iter_mut()
.all(|docset| docset.seek_into_the_danger_zone(target))
fn seek_danger(&mut self, target: DocId) -> SeekDangerResult {
if let SeekDangerResult::SeekLowerBound(new_target) = self.left.seek_danger(target) {
return SeekDangerResult::SeekLowerBound(new_target);
}
if let SeekDangerResult::SeekLowerBound(new_target) = self.right.seek_danger(target) {
return SeekDangerResult::SeekLowerBound(new_target);
}
for docset in &mut self.others {
if let SeekDangerResult::SeekLowerBound(new_target) = docset.seek_danger(target) {
return SeekDangerResult::SeekLowerBound(new_target);
}
}
SeekDangerResult::Found
}
#[inline]
@@ -215,9 +244,12 @@ mod tests {
use proptest::prelude::*;
use super::Intersection;
use crate::collector::Count;
use crate::docset::{DocSet, TERMINATED};
use crate::postings::tests::test_skip_against_unoptimized;
use crate::query::VecDocSet;
use crate::query::{QueryParser, VecDocSet};
use crate::schema::{Schema, TEXT};
use crate::Index;
#[test]
fn test_intersection() {
@@ -304,6 +336,58 @@ mod tests {
assert_eq!(intersection.doc(), TERMINATED);
}
#[test]
fn test_intersection_abc() {
let a = VecDocSet::from(vec![2, 3, 6]);
let b = VecDocSet::from(vec![1, 3, 5]);
let c = VecDocSet::from(vec![1, 3, 5]);
let mut intersection = Intersection::new(vec![c, b, a], 10);
let mut docs = Vec::new();
use crate::DocSet;
while intersection.doc() != TERMINATED {
docs.push(intersection.doc());
intersection.advance();
}
assert_eq!(&docs, &[3]);
}
#[test]
fn test_intersection_termination() {
use crate::query::score_combiner::DoNothingCombiner;
use crate::query::{BufferedUnionScorer, ConstScorer, VecDocSet};
let a1 = ConstScorer::new(VecDocSet::from(vec![0u32, 10000]), 1.0);
let a2 = ConstScorer::new(VecDocSet::from(vec![0u32, 10000]), 1.0);
let mut b_scorers = vec![];
for _ in 0..2 {
// Union matches 0 and 10000.
b_scorers.push(ConstScorer::new(VecDocSet::from(vec![0, 10000]), 1.0));
}
// That's the union of two scorers matching 0 and 10_000.
let union = BufferedUnionScorer::build(b_scorers, DoNothingCombiner::default, 30000);
// Mismatching scorer: matches 0 and 20000. We then append more docs at the end to ensure it
// is last.
let mut m_docs = vec![0, 20000];
for i in 30000..30100 {
m_docs.push(i);
}
let m = ConstScorer::new(VecDocSet::from(m_docs), 1.0);
// Costs: A1=2, A2=2, Union=4, M=102.
// Sorted: A1, A2, Union, M.
// Left=A1, Right=A2, Others=[Union, M].
let mut intersection = crate::query::intersect_scorers(
vec![Box::new(a1), Box::new(a2), Box::new(union), Box::new(m)],
40000,
);
while intersection.doc() != TERMINATED {
intersection.advance();
}
}
// Strategy to generate sorted and deduplicated vectors of u32 document IDs
fn sorted_deduped_vec(max_val: u32, max_size: usize) -> impl Strategy<Value = Vec<u32>> {
prop::collection::vec(0..max_val, 0..max_size).prop_map(|mut vec| {
@@ -335,6 +419,30 @@ mod tests {
}
assert_eq!(intersection.doc(), TERMINATED);
}
}
#[test]
fn test_bug_2811_intersection_candidate_should_increase() {
let mut schema_builder = Schema::builder();
let text_field = schema_builder.add_text_field("text", TEXT);
let schema = schema_builder.build();
let index = Index::create_in_ram(schema);
let mut writer = index.writer_for_tests().unwrap();
writer
.add_document(doc!(text_field=>"hello happy tax"))
.unwrap();
writer.add_document(doc!(text_field=>"hello")).unwrap();
writer.add_document(doc!(text_field=>"hello")).unwrap();
writer.add_document(doc!(text_field=>"happy tax")).unwrap();
writer.commit().unwrap();
let query_parser = QueryParser::for_index(&index, Vec::new());
let query = query_parser
.parse_query(r#"+text:hello +text:"happy tax""#)
.unwrap();
let searcher = index.reader().unwrap().searcher();
let c = searcher.search(&*query, &Count).unwrap();
assert_eq!(c, 1);
}
}
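
The rewritten advance loop drives everything through seek_danger: the cheapest docset (left) proposes candidates, and any SeekLowerBound reported by the other docsets immediately becomes the next candidate, so the loop never revisits documents it already knows cannot match. A small test-style sketch in the spirit of test_intersection_abc above, with the same in-module imports assumed:

    use super::Intersection;
    use crate::docset::{DocSet, TERMINATED};
    use crate::query::VecDocSet;

    #[test]
    fn intersection_two_docsets_sketch() {
        let a = VecDocSet::from(vec![2, 3, 4, 6]);
        let b = VecDocSet::from(vec![1, 3, 6, 8]);
        let mut intersection = Intersection::new(vec![a, b], 10);
        assert_eq!(intersection.doc(), 3); // first doc common to both
        assert_eq!(intersection.advance(), 6); // next common doc
        assert_eq!(intersection.advance(), TERMINATED);
        assert_eq!(intersection.doc(), TERMINATED); // stays in a terminated state
    }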

View File

@@ -43,7 +43,7 @@ pub use self::boost_query::{BoostQuery, BoostWeight};
pub use self::const_score_query::{ConstScoreQuery, ConstScorer};
pub use self::disjunction_max_query::DisjunctionMaxQuery;
pub use self::empty_query::{EmptyQuery, EmptyScorer, EmptyWeight};
pub use self::exclude::Exclude;
pub use self::exclude::{Exclude, ExclusionSet};
pub use self::exist_query::ExistsQuery;
pub use self::explanation::Explanation;
#[cfg(test)]

View File

@@ -1,4 +1,4 @@
use crate::docset::{DocSet, TERMINATED};
use crate::docset::{DocSet, SeekDangerResult, TERMINATED};
use crate::fieldnorm::FieldNormReader;
use crate::postings::Postings;
use crate::query::bm25::Bm25Weight;
@@ -194,11 +194,16 @@ impl<TPostings: Postings> DocSet for PhrasePrefixScorer<TPostings> {
self.advance()
}
fn seek_into_the_danger_zone(&mut self, target: DocId) -> bool {
if self.phrase_scorer.seek_into_the_danger_zone(target) {
self.matches_prefix()
fn seek_danger(&mut self, target: DocId) -> SeekDangerResult {
let seek_res = self.phrase_scorer.seek_danger(target);
if seek_res != SeekDangerResult::Found {
return seek_res;
}
// The intersection matched. Now let's see if we match the prefix.
if self.matches_prefix() {
SeekDangerResult::Found
} else {
false
SeekDangerResult::SeekLowerBound(target + 1)
}
}

View File

@@ -1,6 +1,6 @@
use std::cmp::Ordering;
use crate::docset::{DocSet, TERMINATED};
use crate::docset::{DocSet, SeekDangerResult, TERMINATED};
use crate::fieldnorm::FieldNormReader;
use crate::postings::Postings;
use crate::query::bm25::Bm25Weight;
@@ -530,12 +530,23 @@ impl<TPostings: Postings> DocSet for PhraseScorer<TPostings> {
self.advance()
}
fn seek_into_the_danger_zone(&mut self, target: DocId) -> bool {
debug_assert!(target >= self.doc());
if self.intersection_docset.seek_into_the_danger_zone(target) && self.phrase_match() {
return true;
fn seek_danger(&mut self, target: DocId) -> SeekDangerResult {
debug_assert!(
target >= self.doc(),
"target ({}) should be greater than or equal to doc ({})",
target,
self.doc()
);
let seek_res = self.intersection_docset.seek_danger(target);
if seek_res != SeekDangerResult::Found {
return seek_res;
}
// The intersection matched. Now let's see if we match the phrase.
if self.phrase_match() {
SeekDangerResult::Found
} else {
SeekDangerResult::SeekLowerBound(target + 1)
}
false
}
fn doc(&self) -> DocId {

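Both phrase scorers above follow the same post-filter pattern: delegate to the wrapped intersection, and when the intersection matches but the positional check fails, report SeekLowerBound(target + 1), which is valid because the document at target has just been ruled out. A hedged, generic sketch of that shape (filter_fn stands in for phrase_match() / matches_prefix(); this helper is illustrative, not a tantivy API):

    use crate::docset::{DocSet, SeekDangerResult};
    use crate::DocId;

    fn filtered_seek_danger<D: DocSet>(
        inner: &mut D,
        target: DocId,
        filter_fn: impl FnOnce() -> bool,
    ) -> SeekDangerResult {
        match inner.seek_danger(target) {
            SeekDangerResult::Found => {
                if filter_fn() {
                    // The wrapped docset matched and the extra check passed: a real hit.
                    SeekDangerResult::Found
                } else {
                    // `target` itself is ruled out, so the next candidate is strictly above it.
                    SeekDangerResult::SeekLowerBound(target + 1)
                }
            }
            // The wrapped docset missed: forward its lower bound unchanged.
            miss => miss,
        }
    }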
View File

@@ -2068,6 +2068,16 @@ mod test {
format!("Regex(Field(0), {:#?})", expected_regex).as_str(),
false,
);
let expected_regex2 = tantivy_fst::Regex::new(r".*a").unwrap();
test_parse_query_to_logical_ast_helper(
"title:(/.*b/ OR /.*a/)",
format!(
"(Regex(Field(0), {:#?}) Regex(Field(0), {:#?}))",
expected_regex, expected_regex2
)
.as_str(),
false,
);
// Invalid field
let err = parse_query_to_logical_ast("float:/.*b/", false).unwrap_err();

View File

@@ -19,7 +19,8 @@ pub(crate) fn is_type_valid_for_fastfield_range_query(typ: Type) -> bool {
| Type::Bool
| Type::Date
| Type::Json
| Type::IpAddr => true,
Type::Facet | Type::Bytes => false,
| Type::IpAddr
| Type::Bytes => true,
Type::Facet => false,
}
}

View File

@@ -6,8 +6,8 @@ use std::net::Ipv6Addr;
use std::ops::{Bound, RangeInclusive};
use columnar::{
Cardinality, Column, ColumnType, MonotonicallyMappableToU128, MonotonicallyMappableToU64,
NumericalType, StrColumn,
BytesColumn, Cardinality, Column, ColumnType, MonotonicallyMappableToU128,
MonotonicallyMappableToU64, NumericalType, StrColumn,
};
use common::bounds::{BoundsRange, TransformBound};
@@ -163,6 +163,25 @@ impl Weight for FastFieldRangeWeight {
};
let dict = str_dict_column.dictionary();
let bounds = self.bounds.map_bound(get_value_bytes);
// Get term ids for terms
let (lower_bound, upper_bound) =
dict.term_bounds_to_ord(bounds.lower_bound, bounds.upper_bound)?;
let fast_field_reader = reader.fast_fields();
let Some((column, _col_type)) =
fast_field_reader.u64_lenient_for_type(None, &field_name)?
else {
return Ok(Box::new(EmptyScorer));
};
search_on_u64_ff(column, boost, BoundsRange::new(lower_bound, upper_bound))
} else if field_type.is_bytes() {
let Some(bytes_column): Option<BytesColumn> =
reader.fast_fields().bytes(&field_name)?
else {
return Ok(Box::new(EmptyScorer));
};
let dict = bytes_column.dictionary();
let bounds = self.bounds.map_bound(get_value_bytes);
// Get term ids for terms
let (lower_bound, upper_bound) =
@@ -1402,6 +1421,66 @@ mod tests {
Ok(())
}
#[test]
fn test_bytes_field_ff_range_query() -> crate::Result<()> {
use crate::schema::BytesOptions;
let mut schema_builder = Schema::builder();
let bytes_field = schema_builder
.add_bytes_field("data", BytesOptions::default().set_fast().set_indexed());
let schema = schema_builder.build();
let index = Index::create_in_ram(schema.clone());
let mut index_writer: IndexWriter = index.writer_for_tests()?;
// Insert documents with lexicographically sortable byte values
// Using simple byte sequences that have clear ordering
let values: Vec<Vec<u8>> = vec![
vec![0x00, 0x10],
vec![0x00, 0x20],
vec![0x00, 0x30],
vec![0x01, 0x00],
vec![0x01, 0x10],
vec![0x02, 0x00],
];
for value in &values {
let mut doc = TantivyDocument::new();
doc.add_bytes(bytes_field, value);
index_writer.add_document(doc)?;
}
index_writer.commit()?;
let reader = index.reader()?;
let searcher = reader.searcher();
// Test: Range query [0x00, 0x20] to [0x01, 0x00] (inclusive)
// Should match: [0x00, 0x20], [0x00, 0x30], [0x01, 0x00]
let lower = Term::from_field_bytes(bytes_field, &[0x00, 0x20]);
let upper = Term::from_field_bytes(bytes_field, &[0x01, 0x00]);
let range_query = RangeQuery::new(Bound::Included(lower), Bound::Included(upper));
let count = searcher.search(&range_query, &Count)?;
assert_eq!(
count, 3,
"Expected 3 documents in range [0x00,0x20] to [0x01,0x00]"
);
// Test: Range query > [0x01, 0x00] (exclusive lower bound)
// Should match: [0x01, 0x10], [0x02, 0x00]
let lower = Term::from_field_bytes(bytes_field, &[0x01, 0x00]);
let range_query = RangeQuery::new(Bound::Excluded(lower), Bound::Unbounded);
let count = searcher.search(&range_query, &Count)?;
assert_eq!(count, 2, "Expected 2 documents > [0x01,0x00]");
// Test: Range query < [0x00, 0x30] (exclusive upper bound)
// Should match: [0x00, 0x10], [0x00, 0x20]
let upper = Term::from_field_bytes(bytes_field, &[0x00, 0x30]);
let range_query = RangeQuery::new(Bound::Unbounded, Bound::Excluded(upper));
let count = searcher.search(&range_query, &Count)?;
assert_eq!(count, 2, "Expected 2 documents < [0x00,0x30]");
Ok(())
}
}
#[cfg(test)]

View File

@@ -1,6 +1,6 @@
use std::marker::PhantomData;
use crate::docset::DocSet;
use crate::docset::{DocSet, SeekDangerResult};
use crate::query::score_combiner::ScoreCombiner;
use crate::query::Scorer;
use crate::{DocId, Score};
@@ -56,9 +56,9 @@ where
self.req_scorer.seek(target)
}
fn seek_into_the_danger_zone(&mut self, target: DocId) -> bool {
fn seek_danger(&mut self, target: DocId) -> SeekDangerResult {
self.score_cache = None;
self.req_scorer.seek_into_the_danger_zone(target)
self.req_scorer.seek_danger(target)
}
fn doc(&self) -> DocId {

View File

@@ -105,6 +105,7 @@ impl DocSet for TermScorer {
#[inline]
fn seek(&mut self, target: DocId) -> DocId {
debug_assert!(target >= self.doc());
self.postings.seek(target)
}

View File

@@ -1,6 +1,6 @@
use common::TinySet;
use crate::docset::{DocSet, TERMINATED};
use crate::docset::{DocSet, SeekDangerResult, TERMINATED};
use crate::query::score_combiner::{DoNothingCombiner, ScoreCombiner};
use crate::query::size_hint::estimate_union;
use crate::query::Scorer;
@@ -225,25 +225,47 @@ where
}
}
fn seek_into_the_danger_zone(&mut self, target: DocId) -> bool {
fn seek_danger(&mut self, target: DocId) -> SeekDangerResult {
if target >= TERMINATED {
return SeekDangerResult::SeekLowerBound(TERMINATED);
}
if self.is_in_horizon(target) {
// Our value is within the buffered horizon and the docset may already have been
// processed and removed, so we need to use seek, which uses the regular advance.
self.seek(target) == target
} else {
// The docsets are not in the buffered range, so we can use seek_into_the_danger_zone
// of the underlying docsets
let is_hit = self
.docsets
.iter_mut()
.any(|docset| docset.seek_into_the_danger_zone(target));
let seek_doc = self.seek(target);
if seek_doc == target {
return SeekDangerResult::Found;
} else {
return SeekDangerResult::SeekLowerBound(seek_doc);
};
}
// The API requires the DocSet to be in a valid state when `seek_into_the_danger_zone`
// returns true.
if is_hit {
self.seek(target);
// The docsets are not in the buffered range, so we can use seek_danger
// on the underlying docsets directly.
let mut is_hit = false;
let mut min_new_target = TERMINATED;
for docset in self.docsets.iter_mut() {
match docset.seek_danger(target) {
SeekDangerResult::Found => {
is_hit = true;
break;
}
SeekDangerResult::SeekLowerBound(new_target) => {
min_new_target = min_new_target.min(new_target);
}
}
is_hit
}
// The API requires the DocSet to be in a valid state when `seek_danger`
// returns Found.
if is_hit {
// The doc is found. Let's make sure we position the union on the target
// to bring it back to a valid state.
self.seek(target);
SeekDangerResult::Found
} else {
SeekDangerResult::SeekLowerBound(min_new_target)
}
}
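
When the target is outside the buffered horizon, the union above asks each sub-docset through seek_danger and, on a miss, reports the minimum of their lower bounds; inside the horizon it falls back to a regular seek. Either way, a miss on a non-TERMINATED target comes back as a lower bound strictly above the target. A test-style sketch mirroring the imports of the test below (the exact bound is deliberately not asserted, since it depends on which path is taken):

    use crate::docset::{DocSet, SeekDangerResult};
    use crate::query::score_combiner::DoNothingCombiner;
    use crate::query::{BufferedUnionScorer, ConstScorer, VecDocSet};

    #[test]
    fn union_seek_danger_miss_reports_a_lower_bound() {
        let s1 = ConstScorer::new(VecDocSet::from(vec![1, 7]), 1.0);
        let s2 = ConstScorer::new(VecDocSet::from(vec![2, 9]), 1.0);
        let mut union_scorer =
            BufferedUnionScorer::build(vec![s1, s2], DoNothingCombiner::default, 100);
        // 5 is in neither docset, so the union cannot report Found for it.
        match union_scorer.seek_danger(5) {
            SeekDangerResult::SeekLowerBound(lower_bound) => assert!(lower_bound > 5),
            SeekDangerResult::Found => panic!("5 is not in the union"),
        }
    }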

View File

@@ -14,7 +14,7 @@ mod tests {
use common::BitSet;
use super::{SimpleUnion, *};
use crate::docset::{DocSet, TERMINATED};
use crate::docset::{DocSet, SeekDangerResult, TERMINATED};
use crate::postings::tests::test_skip_against_unoptimized;
use crate::query::score_combiner::DoNothingCombiner;
use crate::query::union::bitset_union::BitSetPostingUnion;
@@ -254,6 +254,27 @@ mod tests {
vec![1, 2, 3, 7, 8, 9, 99, 100, 101, 500, 20000],
);
}
#[test]
fn test_buffered_union_seek_into_danger_zone_terminated() {
let scorer1 = ConstScorer::new(VecDocSet::from(vec![1, 2]), 1.0);
let scorer2 = ConstScorer::new(VecDocSet::from(vec![2, 3]), 1.0);
let mut union_scorer =
BufferedUnionScorer::build(vec![scorer1, scorer2], DoNothingCombiner::default, 100);
// Advance to end
while union_scorer.doc() != TERMINATED {
union_scorer.advance();
}
assert_eq!(union_scorer.doc(), TERMINATED);
assert_eq!(
union_scorer.seek_danger(TERMINATED),
SeekDangerResult::SeekLowerBound(TERMINATED)
);
}
}
#[cfg(all(test, feature = "unstable"))]

View File

@@ -17,6 +17,9 @@ pub struct VecDocSet {
impl From<Vec<DocId>> for VecDocSet {
fn from(doc_ids: Vec<DocId>) -> VecDocSet {
// We do not use `slice::is_sorted`, as we want to check that the doc ids are strictly
// sorted (increasing, with no duplicates).
assert!(doc_ids.windows(2).all(|w| w[0] < w[1]));
VecDocSet { doc_ids, cursor: 0 }
}
}
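
The new assertion makes the strictly-increasing requirement explicit: out-of-order ids and duplicates are both rejected at construction time. A small test-style sketch of what now panics (the test name is illustrative):

    use crate::query::VecDocSet;

    #[test]
    #[should_panic]
    fn vec_docset_rejects_duplicate_doc_ids() {
        // 3 is not strictly less than 3, so the windows(2) check above fails.
        let _ = VecDocSet::from(vec![1, 3, 3]);
    }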

View File

@@ -223,6 +223,11 @@ impl FieldType {
matches!(self, FieldType::Str(_))
}
/// returns true if this is a bytes field
pub fn is_bytes(&self) -> bool {
matches!(self, FieldType::Bytes(_))
}
/// returns true if this is a date field
pub fn is_date(&self) -> bool {
matches!(self, FieldType::Date(_))

View File

@@ -124,7 +124,6 @@ impl SegmentSpaceUsage {
FieldNorms => PerField(self.fieldnorms().clone()),
Terms => PerField(self.termdict().clone()),
SegmentComponent::Store => ComponentSpaceUsage::Store(self.store().clone()),
SegmentComponent::TempStore => ComponentSpaceUsage::Store(self.store().clone()),
Delete => Basic(self.deletes()),
}
}

View File

@@ -51,7 +51,7 @@ mod sstable_index_v3;
pub use sstable_index_v3::{BlockAddr, SSTableIndex, SSTableIndexBuilder, SSTableIndexV3};
mod sstable_index_v2;
pub(crate) mod vint;
pub use dictionary::Dictionary;
pub use dictionary::{Dictionary, TermOrdHit};
pub use streamer::{Streamer, StreamerBuilder};
mod block_reader;