mirror of
https://github.com/GreptimeTeam/greptimedb.git
synced 2026-03-23 10:30:37 +00:00
Scanbench Usage
scanbench benchmarks region scans directly from storage through:
greptime datanode scanbench ...
Build
cargo build -p cmd --bin greptime
Command
./target/debug/greptime datanode scanbench \
--config <CONFIG_TOML> \
--region-id <REGION_ID> \
--table-dir <TABLE_DIR> \
[--scanner <seq|unordered|series>] \
[--scan-config <SCAN_CONFIG_JSON>] \
[--parallelism <N>] \
[--iterations <N>] \
[--path-type <bare|data|metadata>] \
[--force-flat-format] \
[--enable-wal] \
[--pprof-file <FLAMEGRAPH_SVG>] \
[--pprof-after-warmup] \
[--verbose]
Required Arguments
- --config: Datanode/standalone TOML config.
- --region-id: Region ID in one of:
  - <u64> (example: 4398046511104)
  - <table_id>:<region_number> (example: 1024:0)
- --table-dir: Table directory used in open request (example: greptime/public/1024).
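The two --region-id examples appear to denote the same region: 4398046511104 equals 1024 shifted left by 32 bits. That suggests the u64 form packs the table ID into the high 32 bits and the region number into the low 32 bits. A minimal sketch of the conversion under that assumption (the helper names region_id_to_u64 and region_id_from_u64 are hypothetical and not part of greptime):

```shell
#!/usr/bin/env bash
# Assumption: region id = (table_id << 32) | region_number, inferred from the
# documented examples (1024:0 and 4398046511104). Needs 64-bit shell arithmetic.

# Convert <table_id>:<region_number> to the u64 form.
region_id_to_u64() {
  local table_id="${1%%:*}" region_number="${1##*:}"
  echo $(( (table_id << 32) | region_number ))
}

# Convert the u64 form back to <table_id>:<region_number>.
region_id_from_u64() {
  echo "$(( $1 >> 32 )):$(( $1 & 0xFFFFFFFF ))"
}

region_id_to_u64 1024:0           # prints 4398046511104
region_id_from_u64 4398046511104  # prints 1024:0
```

Bash arithmetic is signed, so this sketch is only reliable for table IDs below 2^31.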
Optional Arguments
- --scanner: Scan strategy. Default: seq.
  - seq: default scan
  - unordered: time-windowed distribution
  - series: per-series distribution
- --scan-config: JSON file to tune scan request.
- --parallelism: Simulated scan parallelism. Default: 1.
- --iterations: Benchmark iterations. Default: 1.
- --path-type: Region path type (bare, data, metadata). Default: bare.
- --force-flat-format: Force reading the region in flat format. Default: disabled.
- --enable-wal: Enable WAL replay when opening the region. Default: disabled. When enabled, scanbench uses the log store configured in the [wal] section of the config TOML (raft-engine or Kafka). When disabled or when no WAL is configured, a NoopLogStore is used.
- --pprof-file: Output flamegraph path (Unix only).
- --pprof-after-warmup: Start profiling after the first iteration, using it as a warmup. Requires --pprof-file. Default: disabled.
- --verbose / -v: Enable verbose output.
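Because --pprof-after-warmup treats the first iteration as a warmup and requires --pprof-file, it is presumably only useful with --iterations greater than 1; that reading is an assumption, not something stated above. A minimal sketch that assembles such an invocation as an array and prints it for review before running (paths, the region ID, and the iteration count are placeholders):

```shell
#!/usr/bin/env bash
# Build the scanbench argument list as an array so it can be inspected first.
# All paths and IDs below are placeholders.
scanbench_args=(
  datanode scanbench
  --config /path/to/config.toml
  --region-id 1024:0
  --table-dir greptime/public/1024
  --iterations 5
  --pprof-file /tmp/scanbench.svg
  --pprof-after-warmup
)

# Dry run: print the full command on one line.
printf '%s ' ./target/debug/greptime "${scanbench_args[@]}"; echo

# To actually run it:
# ./target/debug/greptime "${scanbench_args[@]}"
```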
Scan Config JSON
{
"projection": [0, 1, 2],
"projection_names": ["host", "cpu"],
"filters": ["host = 'web-1'", "cpu > 80"],
"series_row_selector": "last_row"
}
Notes:
- All fields are optional.
- Use either projection (indexes) or projection_names (column names), not both.
- projection_names uses exact (case-sensitive) column name matching.
- filters is a list of SQL expressions (not full SQL statements), e.g. "host = 'web-1'".
- series_row_selector currently supports only "last_row".
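As an illustration of the notes above, the following sketch writes a scan config that selects columns by name only and then runs a simple pre-flight check. The check_scan_config helper is hypothetical (not part of scanbench) and only does a textual grep, not real JSON parsing; the column names and filter values are placeholders.

```shell
#!/usr/bin/env bash
# Write a scan config that uses projection_names and omits "projection".
scan_config="$(mktemp)"
cat > "$scan_config" <<'EOF'
{
  "projection_names": ["host", "cpu"],
  "filters": ["host = 'web-1'", "cpu > 80"]
}
EOF

# Hypothetical pre-flight check: fail if both projection keys are present.
# '"projection"' includes the closing quote, so it does not match
# "projection_names".
check_scan_config() {
  if grep -q '"projection"' "$1" && grep -q '"projection_names"' "$1"; then
    echo "error: use either projection or projection_names, not both" >&2
    return 1
  fi
}

check_scan_config "$scan_config" && echo "ok: $scan_config"
```

The resulting file path can then be passed to --scan-config.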
Examples
Default sequential scan:
./target/debug/greptime datanode scanbench \
--config /path/to/config.toml \
--region-id 1024:0 \
--table-dir greptime/public/1024
Unordered scan with parallelism:
./target/debug/greptime datanode scanbench \
--config /path/to/config.toml \
--region-id 1024:0 \
--table-dir greptime/public/1024 \
--scanner unordered \
--parallelism 8 \
--iterations 5
Series scan with scan config and flamegraph:
./target/debug/greptime datanode scanbench \
--config /path/to/config.toml \
--region-id 1024:0 \
--table-dir greptime/public/1024 \
--scanner series \
--scan-config /path/to/scan-config.json \
--pprof-file /tmp/scanbench.svg
Force flat-format read:
./target/debug/greptime datanode scanbench \
--config /path/to/config.toml \
--region-id 1024:0 \
--table-dir greptime/public/1024 \
--force-flat-format
Scan with WAL replay enabled (uses [wal] config from TOML):
./target/debug/greptime datanode scanbench \
--config /path/to/config.toml \
--region-id 1024:0 \
--table-dir greptime/public/1024 \
--enable-wal