# Fuzz Test for GreptimeDB
## Setup
1. Install the [cargo-fuzz](https://rust-fuzz.github.io/book/cargo-fuzz/setup.html) CLI first.

```bash
cargo install cargo-fuzz
```
2. Start GreptimeDB.

3. Copy the `.env.example` file at the project root to `.env` and adjust the values as needed.
### For stable fuzz tests
Set the GreptimeDB MySQL address:

```
GT_MYSQL_ADDR = localhost:4002
```
### For unstable fuzz tests
Set the path of the GreptimeDB binary:

```
GT_FUZZ_BINARY_PATH = /path/to/
```
Optionally, change the instance root directory (default: `/tmp/unstable_greptime/`):

```
GT_FUZZ_INSTANCE_ROOT_DIR = /path/to/
```
## Run
1. List all fuzz targets:

```bash
cargo fuzz list --fuzz-dir tests-fuzz
```
2. Run a fuzz target.
|
|
```bash
|
|
cargo fuzz run fuzz_create_table --fuzz-dir tests-fuzz -D -s none
|
|
```
## Crash Reproduction
To reproduce a crash, first obtain the Base64-encoded input, which usually appears at the end of a crash report, and store it in a file:

```bash
echo "Base64" > .crash
```

If you already have the crash file, you can skip this step.
Print the `std::fmt::Debug` output for an input:

```bash
cargo fuzz fmt fuzz_target .crash --fuzz-dir tests-fuzz -D -s none
```
Then rerun the fuzz test with the input.

You can override the fuzz input with environment variables. For example, given a fuzz input like:

```
FuzzInput {
    seed: 6666,
    actions: 175
}
```

you can run with `GT_FUZZ_OVERRIDE_SEED=6666` and `GT_FUZZ_OVERRIDE_ACTIONS=175`:

```bash
GT_FUZZ_OVERRIDE_SEED=6666 GT_FUZZ_OVERRIDE_ACTIONS=175 cargo fuzz run fuzz_target .crash --fuzz-dir tests-fuzz -D -s none
```
For more details, visit [cargo fuzz](https://rust-fuzz.github.io/book/cargo-fuzz/tutorial.html) or run the command `cargo fuzz --help`.
## Repartition Metric Dump Artifacts
For `fuzz_repartition_metric_table`, dump artifacts are written under one run directory:

- Table data snapshots: `<logical_table>.table-data.csv`
- SQL traces per logical table: `<logical_table>.trace.sql`
- Seed metadata: `seed.meta`
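As a sketch, a run directory for two logical tables might look like the following (the table names are hypothetical):

```
<run_dir>/
├── seed.meta
├── logical_table_0.table-data.csv
├── logical_table_0.trace.sql
├── logical_table_1.table-data.csv
└── logical_table_1.trace.sql
```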
SQL trace behavior:

- Insert SQL is appended after successful execution, with comment fields including `started_at_ms` and `elapsed_ms`.
- Repartition events are broadcast to all logical-table trace files, with comment fields including `action_idx`, `started_at_ms`, `elapsed_ms`, and the SQL text.
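To illustrate, a per-table trace file might contain entries like the following. The exact comment layout is an assumption; only the fields listed above are documented:

```sql
-- started_at_ms=1717000000123 elapsed_ms=4
INSERT INTO logical_table_0 VALUES (/* ... */);
-- action_idx=2 started_at_ms=1717000000456 elapsed_ms=87 sql=ALTER TABLE ...
```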
Run directory lifecycle:

- On success, the run directory is cleaned up.
- On failure, the run directory is retained for CI/local diffing.