greptimedb

mirror of https://github.com/GreptimeTeam/greptimedb.git synced 2026-05-14 03:50:39 +00:00


Weny Xu 20f38d8a6a test(fuzz): add metric table repartition fuzz target (#7754)

* test: add fuzz_repartition_metric_table target scaffold

Signed-off-by: WenyXu <wenymedia@gmail.com>

* test: add metric logical lifecycle in repartition fuzz target

Signed-off-by: WenyXu <wenymedia@gmail.com>

* test: support partitioned metric tables in repartition fuzz

Signed-off-by: WenyXu <wenymedia@gmail.com>

* test: add repartition loop and partition assertions for metric target

Signed-off-by: WenyXu <wenymedia@gmail.com>

* test: use shared timestamp clock in metric repartition writes

Signed-off-by: WenyXu <wenymedia@gmail.com>

* refactor: unify string value and bound generation for fuzzing

Signed-off-by: WenyXu <wenymedia@gmail.com>

* test: use fixed physical table name in metric repartition fuzz

Signed-off-by: WenyXu <wenymedia@gmail.com>

* chore: fmt

Signed-off-by: WenyXu <wenymedia@gmail.com>

* ci: update ci config

Signed-off-by: WenyXu <wenymedia@gmail.com>

* refactor: use btreemap

Signed-off-by: WenyXu <wenymedia@gmail.com>

* print count result

Signed-off-by: WenyXu <wenymedia@gmail.com>

* test: add csv translator for insert expr

Introduce a dedicated top-level csv translator so fuzz insert expressions can be converted into writer-ready records through a structured path instead of ad-hoc formatting in targets.

Signed-off-by: WenyXu <wenymedia@gmail.com>

* test: add csv dump session utilities

Introduce CSV dump env helpers and a session writer that creates run directories, emits seed metadata, and flushes staged CSV records for fuzz workflows.

Signed-off-by: WenyXu <wenymedia@gmail.com>

* test: bound csv dump buffer with auto flush

Parse readable buffer sizes from env and flush staged CSV records automatically when the in-memory threshold is reached to prevent unbounded growth during long fuzz runs.

Signed-off-by: WenyXu <wenymedia@gmail.com>
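
The bounded buffer described above could be sketched roughly as follows. This is an illustration only, not the code from tests-fuzz: the names `parse_readable_size` and `StagedCsv`, the accepted size suffixes, and the `FUZZ_CSV_DUMP_BUFFER` variable are all hypothetical.

```rust
use std::io::{self, Write};

/// Parses a readable size such as "512", "64KiB", or "4 MB" into bytes.
/// The accepted suffixes are an assumption made for this sketch.
fn parse_readable_size(input: &str) -> Option<usize> {
    let s = input.trim();
    let split = s.find(|c: char| !c.is_ascii_digit()).unwrap_or(s.len());
    let (digits, unit) = s.split_at(split);
    let value: usize = digits.parse().ok()?;
    let multiplier: usize = match unit.trim().to_ascii_lowercase().as_str() {
        "" | "b" => 1,
        "k" | "kb" | "kib" => 1024,
        "m" | "mb" | "mib" => 1024 * 1024,
        "g" | "gb" | "gib" => 1024 * 1024 * 1024,
        _ => return None,
    };
    value.checked_mul(multiplier)
}

/// Stages CSV lines in memory and writes them out once `max_bytes` is reached.
struct StagedCsv<W: Write> {
    sink: W,
    staged: Vec<String>,
    staged_bytes: usize,
    max_bytes: usize,
    flushes: usize,
}

impl<W: Write> StagedCsv<W> {
    fn new(sink: W, max_bytes: usize) -> Self {
        Self { sink, staged: Vec::new(), staged_bytes: 0, max_bytes, flushes: 0 }
    }

    /// Stages one record and flushes automatically when the threshold is hit.
    fn append(&mut self, record: String) -> io::Result<()> {
        self.staged_bytes += record.len() + 1; // +1 for the trailing newline
        self.staged.push(record);
        if self.staged_bytes >= self.max_bytes {
            self.flush()?;
        }
        Ok(())
    }

    /// Writes all staged records to the sink and resets the byte counter.
    fn flush(&mut self) -> io::Result<()> {
        if self.staged.is_empty() {
            return Ok(());
        }
        for record in self.staged.drain(..) {
            writeln!(self.sink, "{record}")?;
        }
        self.staged_bytes = 0;
        self.flushes += 1;
        self.sink.flush()
    }
}

fn main() -> io::Result<()> {
    // Hypothetical variable name; falls back to 32 bytes when unset or invalid.
    let raw = std::env::var("FUZZ_CSV_DUMP_BUFFER").unwrap_or_else(|_| "32".to_string());
    let max_bytes = parse_readable_size(&raw).unwrap_or(32);
    let mut dump = StagedCsv::new(Vec::new(), max_bytes);
    for i in 0..10 {
        dump.append(format!("{i},host-{i},42"))?;
    }
    dump.flush()?; // final flush for whatever is still staged
    println!("flushed {} times, wrote {} bytes", dump.flushes, dump.sink.len());
    Ok(())
}
```

Counting bytes rather than records keeps memory use predictable when row widths vary, which is presumably the point of bounding the buffer during long fuzz runs.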

* test: flush csv dump before repartition validation

Wire csv dump session into the metric repartition fuzz flow so successful inserts are translated from insert expressions into CSV records during write loops and flushed to disk right before row validation.

Signed-off-by: WenyXu <wenymedia@gmail.com>

* test: keep csv dumps on failure and cleanup on pass

Capture run outcomes in metric repartition fuzz, remove dump directories only after successful validation, and retain dump paths on failures so CI and local investigations can use the same artifacts.

Signed-off-by: WenyXu <wenymedia@gmail.com>

* test: align partial csv records with table headers

Keep append payload compact by storing partial insert-expression columns, then expand to full table-context headers at flush time and fill missing values with empty strings.

Signed-off-by: WenyXu <wenymedia@gmail.com>
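
A minimal sketch of the header alignment described above, assuming each staged record is a map from column name to value. The names `PartialRecord`, `expand_record`, and `to_csv_line` are hypothetical, and the quoting rule is a simplified CSV convention rather than whatever the real writer does.

```rust
use std::collections::BTreeMap;

/// One staged record: only the columns named in the insert expression.
type PartialRecord = BTreeMap<String, String>;

/// Expands a partial record to the full header order, using an empty
/// string for every column the insert expression did not provide.
fn expand_record(headers: &[&str], record: &PartialRecord) -> Vec<String> {
    headers
        .iter()
        .map(|header| record.get(*header).cloned().unwrap_or_default())
        .collect()
}

/// Joins fields into one CSV line, quoting fields that contain a comma,
/// a double quote, or a newline.
fn to_csv_line(fields: &[String]) -> String {
    fields
        .iter()
        .map(|field| {
            if field.contains(',') || field.contains('"') || field.contains('\n') {
                format!("\"{}\"", field.replace('"', "\"\""))
            } else {
                field.clone()
            }
        })
        .collect::<Vec<_>>()
        .join(",")
}

fn main() {
    // Column names are made up for the example.
    let headers = ["ts", "host", "region", "val"];
    let mut record = PartialRecord::new();
    record.insert("ts".to_string(), "1700000000000".to_string());
    record.insert("host".to_string(), "web-1".to_string());

    println!("{}", headers.join(","));
    println!("{}", to_csv_line(&expand_record(&headers, &record)));
}
```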

* chore: add logs

Signed-off-by: WenyXu <wenymedia@gmail.com>

* dump csv

Signed-off-by: WenyXu <wenymedia@gmail.com>

* ci: dump csv

Signed-off-by: WenyXu <wenymedia@gmail.com>

* refactor

Signed-off-by: WenyXu <wenymedia@gmail.com>

* test: add table-scoped sql dump writer primitives

Signed-off-by: WenyXu <wenymedia@gmail.com>

* test: capture table-scoped sql traces after execution

Record insert and repartition SQL only after successful execution, include started_at_ms and elapsed_ms in trace comments, and broadcast repartition events into every logical-table trace file for consistent debugging context.

Signed-off-by: WenyXu <wenymedia@gmail.com>

* test: harden sql trace comments and include create sql

Normalize multiline trace comments into valid SQL comment lines and append logical-table CREATE SQL to per-table traces for better timeline reconstruction during repartition debugging.

Signed-off-by: WenyXu <wenymedia@gmail.com>
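
The comment normalization can be illustrated with a short sketch. The `started_at_ms` and `elapsed_ms` fields come from the commit text above; the function names and the exact layout of a trace entry are assumptions.

```rust
/// Turns arbitrary (possibly multiline) text into valid SQL line comments,
/// so an embedded newline can never leak out of the comment.
fn to_sql_comment(text: &str) -> String {
    text.lines()
        .map(|line| {
            let line = line.trim_end();
            if line.is_empty() {
                "--".to_string()
            } else {
                format!("-- {line}")
            }
        })
        .collect::<Vec<_>>()
        .join("\n")
}

/// Builds one trace entry: timing and note as comments, then the statement.
fn trace_entry(sql: &str, note: &str, started_at_ms: u64, elapsed_ms: u64) -> String {
    let header = format!("started_at_ms={started_at_ms} elapsed_ms={elapsed_ms}\n{note}");
    let statement = sql.trim_end().trim_end_matches(';');
    format!("{}\n{};\n", to_sql_comment(&header), statement)
}

fn main() {
    // Table name and note are made up for the example.
    let entry = trace_entry(
        "INSERT INTO metric_a VALUES (1, 'web-1')",
        "repartition epoch 2\nwriter resumed",
        1_700_000_000_000,
        12,
    );
    print!("{entry}");
}
```

Because every comment line starts with `--`, a trace file built this way should stay replayable as plain SQL.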

* test: dump physical create and repartition SQL traces

Signed-off-by: WenyXu <wenymedia@gmail.com>

* dump repartition sql

Signed-off-by: WenyXu <wenymedia@gmail.com>

* test: scaffold writer control channel for barrier flow

Add Barrier/Resume/Stop control skeleton and channel wiring in write_loop to prepare per-repartition validation barriers. Also align SQL dump tests with broadcast SQL payload behavior.

Signed-off-by: WenyXu <wenymedia@gmail.com>

* test: implement writer barrier pause and resume control

Make writer control messages effective by pausing writes on barrier, resuming on resume, and stopping via channel signaling so the next commit can enforce deterministic per-repartition validation boundaries.

Signed-off-by: WenyXu <wenymedia@gmail.com>

* test: validate rows after each repartition barrier

Add per-action barrier/ack synchronization with timeout, run immediate logical-table row validation after each repartition, and resume writer only after validation completes to improve minimal failure localization.

Signed-off-by: WenyXu <wenymedia@gmail.com>
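
The barrier, ack, and resume flow from the last three entries might look like the sketch below. It uses `std::sync::mpsc` and plain threads as a stand-in. The actual fuzz target may well use async channels, and everything except the `Barrier`/`Resume`/`Stop` message names and the `write_loop` name is an assumption.

```rust
use std::sync::mpsc::{self, Receiver, Sender, TryRecvError};
use std::thread;
use std::time::Duration;

enum Control {
    Barrier,
    Resume,
    Stop,
}

/// Writes rows until stopped. On `Barrier` it acks with the number of rows
/// written so far and then blocks until `Resume` (or `Stop`) arrives.
fn write_loop(control: Receiver<Control>, ack: Sender<u64>) -> u64 {
    let mut written = 0u64;
    loop {
        match control.try_recv() {
            Ok(Control::Barrier) => {
                if ack.send(written).is_err() {
                    return written;
                }
                loop {
                    match control.recv() {
                        Ok(Control::Resume) => break,
                        Ok(Control::Barrier) => {
                            let _ = ack.send(written);
                        }
                        Ok(Control::Stop) | Err(_) => return written,
                    }
                }
            }
            Ok(Control::Resume) | Err(TryRecvError::Empty) => {}
            Ok(Control::Stop) | Err(TryRecvError::Disconnected) => return written,
        }
        written += 1; // stand-in for one successful insert
        thread::sleep(Duration::from_millis(1));
    }
}

/// Requests a barrier and waits for the writer's ack, failing on timeout.
fn barrier(control: &Sender<Control>, ack: &Receiver<u64>, timeout: Duration) -> Result<u64, String> {
    control.send(Control::Barrier).map_err(|e| e.to_string())?;
    ack.recv_timeout(timeout)
        .map_err(|e| format!("barrier ack failed: {e}"))
}

fn main() {
    let (control_tx, control_rx) = mpsc::channel();
    let (ack_tx, ack_rx) = mpsc::channel();
    let writer = thread::spawn(move || write_loop(control_rx, ack_tx));

    for epoch in 0..3 {
        thread::sleep(Duration::from_millis(10)); // stand-in for one repartition
        let rows = barrier(&control_tx, &ack_rx, Duration::from_secs(5))
            .expect("writer did not ack");
        // The writer is paused here, so the row count is stable during validation.
        println!("epoch {epoch}: validating {rows} rows");
        control_tx.send(Control::Resume).expect("writer exited early");
    }

    control_tx.send(Control::Stop).expect("writer exited early");
    let total = writer.join().expect("writer panicked");
    println!("writer stopped after {total} rows");
}
```

Acking with the row count gives the validator a fixed expectation for that epoch, which is one way to get the failure localization the commit mentions.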

* test: flush dump sessions before per-epoch validation

Extract a shared flush-and-snapshot helper and call it before each immediate row validation so CSV/SQL artifacts are persisted at the same epoch boundary being validated.

Signed-off-by: WenyXu <wenymedia@gmail.com>

* fix: fix unit tests

Signed-off-by: WenyXu <wenymedia@gmail.com>

* chore: add retry

Signed-off-by: WenyXu <wenymedia@gmail.com>

* chore: apply suggestions from CR

Signed-off-by: WenyXu <wenymedia@gmail.com>

---------

Signed-off-by: WenyXu <wenymedia@gmail.com>

2026-03-13 08:00:09 +00:00

.cargo

feat: put sqlness into a separated dir (#6911)

2025-09-05 01:39:29 +00:00

.config

build: on windows (#2054)

2023-08-10 08:08:37 +00:00

.github

test(fuzz): add metric table repartition fuzz target (#7754)

2026-03-13 08:00:09 +00:00

config

fix: gc update repart map properly (#7606)

2026-01-28 04:31:19 +00:00

cyborg

ci: handle prerelease version (#7492)

2025-12-29 08:21:05 +00:00

docker

ci: upgrade GCC in centos dev-builder for cxx crate compatibility (#7643)

2026-01-30 11:14:25 +00:00

docs

feat: improve filter support for scanbench (#7736)

2026-03-03 09:00:41 +00:00

grafana

chore: add grafana dashboard about trigger (#7536)

2026-01-08 06:47:46 +00:00

scripts

feat: add TLS support for mysql backend (#6979)

2025-09-16 13:46:37 +00:00

src

fix: allow empty string for env values (#7803)

2026-03-13 03:57:08 +00:00

tests

fix: rm useless analyzer (#7797)

2026-03-12 10:53:47 +00:00

tests-fuzz

test(fuzz): add metric table repartition fuzz target (#7754)

2026-03-13 08:00:09 +00:00

tests-integration

feat: introduce APIs for storing perses dashboard definition (#7791)

2026-03-13 03:40:04 +00:00

.dockerignore

fix: docker build (#1822)

2023-06-25 11:05:46 +08:00

.editorconfig

feat: to_timezone function (#3470)

2024-03-12 01:46:19 +00:00

.env.example

feat: add GcsConfig credential field (#4568)

2024-08-16 03:11:20 +00:00

.gitignore

feat: refine the MemoryGuard (#7466)

2025-12-25 04:09:32 +00:00

.pre-commit-config.yaml

chore: check for redundant pre-commit hooks (#7506)

2026-01-07 13:46:42 +00:00

AUTHOR.md

chore: members and committers update (#7341)

2025-12-04 04:08:43 +00:00

Cargo.lock

chore(deps): bump quinn-proto from 0.11.12 to 0.11.14 (#7805)

2026-03-13 06:28:58 +00:00

Cargo.toml

chore: bump version rc.2 (#7788)

2026-03-11 06:57:07 +00:00

cliff.toml

ci: update breaking change title level (#7497)

2025-12-30 06:17:51 +00:00

codecov.yml

refactor: refactor TableRouteManager (#3392)

2024-02-28 06:18:09 +00:00

CONTRIBUTING.md

fix: typo in AI-assisted contributions policy (#7472)

2025-12-25 03:03:14 +00:00

Cross.toml

fix: cross compiling for aarch64 targets and allow customizing page size (#5487)

2025-02-07 11:21:16 +00:00

flake.lock

fix: use full DDL of flow in information_schema.flows.flow_definition (#7704)

2026-02-12 00:09:40 +00:00

flake.nix

chore: upgrade DataFusion family, again (#7578)

2026-03-03 07:36:39 +00:00

LICENSE

chore: multiple licenses fixes (#2714)

2023-11-09 10:38:12 +00:00

licenserc.toml

feat: trigger alter parse (#6553)

2025-07-29 11:07:31 +00:00

Makefile

chore: mount cargo git cache in docker builds (#7484)

2025-12-26 01:56:11 +00:00

README.md

docs: update year to 2026 (#7787)

2026-03-11 07:29:35 +00:00

rust-toolchain.toml

chore: update rust to nightly 2025-10-01 (#7069)

2025-10-11 07:30:52 +00:00

rustfmt.toml

chore: specify import style in rustfmt (#460)

2022-11-15 15:58:54 +08:00

SECURITY.md

feat: Create SECURITY.md (#1270)

2023-03-28 19:14:29 +08:00

taplo.toml

chore: skip reorder workspace tables in taplo (#3388)

2024-02-26 08:57:49 +00:00

typos.toml

chore: adjust manifest cache log level (#7655)

2026-02-03 07:08:52 +00:00

README.md

One database for metrics, logs, and traces
replacing Prometheus, Loki, and Elasticsearch

The unified OpenTelemetry backend — with SQL + PromQL on object storage.

User Guide | API Docs | Roadmap 2026

Introduction
Features
How GreptimeDB Compares
Architecture
Try GreptimeDB
Getting Started
Build From Source
Tools & Extensions
Project Status
Community
License
Commercial Support
Contributing
Acknowledgement

Introduction

GreptimeDB is an open-source observability database built for Observability 2.0 — treating metrics, logs, and traces as one unified data model (wide events) instead of three separate pillars.

Use it as the single OpenTelemetry backend — replacing Prometheus, Loki, and Elasticsearch with one database built on object storage. Query with SQL and PromQL, scale without pain, cut costs up to 50x.

Features

Feature	Description
Drop-in replacement	PromQL, Prometheus remote write, Jaeger, and OpenTelemetry native. Use as your single backend for all three signals, or migrate one at a time.
50x lower cost	Object storage (S3, GCS, Azure Blob, etc.) as primary storage. Compute-storage separation scales without pain.
SQL + PromQL	Monitor with PromQL, analyze with SQL. One database replaces Prometheus + your data warehouse.
Sub-second at PB-EB scale	Columnar engine with fulltext, inverted, and skipping indexes. Written in Rust.

✅ Perfect for:

Replacing Prometheus + Loki + Elasticsearch with one database
Scaling past Prometheus — high cardinality, long-term storage, no Thanos/Mimir overhead
Cutting observability costs with object storage (up to 50x savings on traces, 30% on logs)
AI/LLM observability — store and analyze high-volume conversation data, agent traces, and token metrics via OpenTelemetry GenAI conventions
Edge-to-cloud observability with unified APIs on resource-constrained devices

Why Observability 2.0? The three-pillar model (separate databases for metrics, logs, traces) creates data silos and operational complexity. GreptimeDB treats all observability data as timestamped wide events in a single columnar engine — enabling cross-signal SQL JOINs, eliminating redundant infrastructure, and naturally supporting emerging workloads like AI agent observability. Read more: Observability 2.0 and the Database for It.

Learn more in Why GreptimeDB.

How GreptimeDB Compares

Feature	GreptimeDB	Prometheus / Thanos / Mimir	Grafana Loki	Elasticsearch
Data types	Metrics, logs, traces	Metrics only	Logs only	Logs, traces
Query language	SQL + PromQL	PromQL	LogQL	Query DSL
Storage	Native object storage (S3, etc.)	Local disk + object storage (Thanos/Mimir)	Object storage (chunks)	Local disk
Scaling	Compute-storage separation, stateless nodes	Federation / Thanos / Mimir — multi-component, ops heavy	Stateless + object storage	Shard-based, ops heavy
Cost efficiency	Up to 50x lower storage	High at scale	Moderate	High (inverted index overhead)
OpenTelemetry	Native (metrics + logs + traces)	Partial (metrics only)	Partial (logs only)	Via instrumentation

Benchmarks:

Architecture

GreptimeDB can run in two modes:

Standalone Mode - Single binary for development and small deployments
Distributed Mode - Separate components for production scale:
- Frontend: Query processing and protocol handling
- Datanode: Data storage and retrieval
- Metasrv: Metadata management and coordination

Read the architecture document. DeepWiki provides an in-depth look at GreptimeDB.

Try GreptimeDB

docker pull greptime/greptimedb

docker run -p 127.0.0.1:4000-4003:4000-4003 \
  -v "$(pwd)/greptimedb_data:/greptimedb_data" \
  --name greptime --rm \
  greptime/greptimedb:latest standalone start \
  --http-addr 0.0.0.0:4000 \
  --rpc-bind-addr 0.0.0.0:4001 \
  --mysql-addr 0.0.0.0:4002 \
  --postgres-addr 0.0.0.0:4003

Dashboard: http://localhost:4000/dashboard
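
Once the container is up, a quick way to confirm the HTTP port answers queries is to post a SQL statement to it. The sketch below uses only the Rust standard library and assumes the HTTP SQL endpoint is `/v1/sql` and accepts a form-encoded `sql` field. Check the User Guide for the authoritative API. The helper names `form_encode`, `build_sql_request`, and `query` are hypothetical.

```rust
use std::io::{Read, Write};
use std::net::TcpStream;

/// Percent-encodes a value for an application/x-www-form-urlencoded body.
fn form_encode(value: &str) -> String {
    let mut out = String::new();
    for byte in value.bytes() {
        match byte {
            b'A'..=b'Z' | b'a'..=b'z' | b'0'..=b'9' | b'-' | b'_' | b'.' | b'~' => {
                out.push(byte as char)
            }
            b' ' => out.push('+'),
            _ => out.push_str(&format!("%{byte:02X}")),
        }
    }
    out
}

/// Builds a raw HTTP/1.1 POST request carrying the SQL statement.
/// The `/v1/sql` path is an assumption about the server's HTTP API.
fn build_sql_request(host: &str, sql: &str) -> String {
    let body = format!("sql={}", form_encode(sql));
    format!(
        "POST /v1/sql HTTP/1.1\r\nHost: {host}\r\nContent-Type: application/x-www-form-urlencoded\r\nContent-Length: {}\r\nConnection: close\r\n\r\n{body}",
        body.len()
    )
}

/// Sends the request and returns the raw HTTP response (headers and body).
fn query(addr: &str, sql: &str) -> std::io::Result<String> {
    let mut stream = TcpStream::connect(addr)?;
    stream.write_all(build_sql_request(addr, sql).as_bytes())?;
    let mut response = String::new();
    stream.read_to_string(&mut response)?;
    Ok(response)
}

fn main() {
    // Matches the port mapping in the docker command above.
    let addr = "127.0.0.1:4000";
    match query(addr, "SELECT 1") {
        Ok(response) => println!("{response}"),
        Err(err) => eprintln!("could not reach {addr}: {err}"),
    }
}
```

In practice an HTTP client crate or the MySQL/PostgreSQL ports exposed above would be more convenient. The raw socket is used here only to keep the example dependency-free.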

Getting Started

Build From Source

Prerequisites:

Rust toolchain (nightly)
Protobuf compiler (>= 3.15)
C/C++ build essentials, including gcc, g++, autoconf, and the glibc library (e.g. libc6-dev on Ubuntu and glibc-devel on Fedora)
Python toolchain (optional): required only for some test scripts.

Build and Run:

make
cargo run -- standalone start

Tools & Extensions

Kubernetes: GreptimeDB Operator
Helm Charts: Greptime Helm Charts
Dashboard: Web UI
gRPC Ingester: Go, Java, C++, Erlang, Rust, .NET
Grafana Data Source: GreptimeDB Grafana data source plugin
Grafana Dashboard: Official Dashboard for monitoring

Project Status

Status: RC, marching toward v1.0 GA!
GA (v1.0): March 2026

Deployed in production handling billions of data points daily
Stable APIs, actively maintained, with regular releases (version info)

GreptimeDB v1.0 represents a major milestone toward maturity — marking stable APIs, production readiness, and proven performance.

Roadmap: see the v1.0 highlights and release plan, and the 2026 roadmap.

For production use, we recommend using the latest stable release.

If you find this project useful, a ⭐ would mean a lot to us!

Community

We invite you to engage and contribute!

License

GreptimeDB is licensed under the Apache License 2.0.

Commercial Support

Running GreptimeDB in your organization? We offer enterprise add-ons, services, training, and consulting. Contact us for details.

Contributing

Read our Contribution Guidelines.
Explore Internal Concepts and DeepWiki.
Pick up a good first issue and join the #contributors Slack channel.

Acknowledgement

Special thanks to all contributors! See AUTHOR.md.

Uses Apache Arrow™ (memory model)
Apache Parquet™ (file storage)
Apache DataFusion™ (query engine)
Apache OpenDAL™ (data access abstraction)

Description

Open-source, cloud-native, unified observability database for metrics, logs and traces, supporting SQL/PromQL/Streaming.

analytics cloud-native database distributed greptimedb logs metrics monitoring observability observability-database observability-datalake promql rust rust-database sql time-series traces tsdb

Readme Apache-2.0 791 MiB
