Weny Xu 8ad2d2414c chore: pick fixes and bump version to v1.1.2 (#8404)
* fix: improve Grafana metrics dashboards (#8298)

* chore: initial changes

Signed-off-by: evenyag <realevenyag@gmail.com>

* feat: improve troubleshooting dashboard

Signed-off-by: evenyag <realevenyag@gmail.com>

* chore: rm troubleshooting-dashboard.md

Signed-off-by: evenyag <realevenyag@gmail.com>

* chore: optimize metrics dashboard

Signed-off-by: evenyag <realevenyag@gmail.com>

* docs: move troubleshooting-dashboard.md

Signed-off-by: evenyag <realevenyag@gmail.com>

* chore: move mito gc duration panel

Signed-off-by: evenyag <realevenyag@gmail.com>

* chore: cleanup the dashboard

- Overview trend panels are now aggregate-only:
    - Total Ingestion Rate Trend
    - Total Query Rate Trend

- Protocol breakdowns remain in Ingestion and Queries.
- Mito Backpressure and Failures no longer duplicates scan/GC signals.
- Removed Write Stall per Instance.
- Split Object Store and WAL into collapsed Object Store and collapsed
  WAL.
- Moved WAL/logstore panels out of Storage into WAL.
- Normalized OpenDAL “other request” matchers.
- Normalized trigger elapsed p99/p75/avg aggregation.
- Regenerated standalone JSON and dashboard YAML/Markdown.
- Updated docs/troubleshooting-dashboard.md.

Signed-off-by: evenyag <realevenyag@gmail.com>

* fix: rearrange metasrv dashboard panels

Signed-off-by: evenyag <realevenyag@gmail.com>

* feat: improve troubleshooting dashboard layout

Signed-off-by: evenyag <realevenyag@gmail.com>

* docs: remove obsolete troubleshooting dashboard doc

Signed-off-by: evenyag <realevenyag@gmail.com>

* fix: correct cluster dashboard panel queries (missing _bucket, raw counters, rate normalization)

Signed-off-by: evenyag <realevenyag@gmail.com>

* fix: correct trigger panel datasource, collapse flush/compaction, split request latency panels

Signed-off-by: evenyag <realevenyag@gmail.com>

* fix: update grafana metrics dashboard panels

Signed-off-by: evenyag <realevenyag@gmail.com>

* fix: correct Grafana dashboard units

Signed-off-by: evenyag <realevenyag@gmail.com>

* chore: regenerate Grafana dashboards

Signed-off-by: evenyag <realevenyag@gmail.com>

* fix: use throughput unit for index IO bytes

Signed-off-by: evenyag <realevenyag@gmail.com>

---------

Signed-off-by: evenyag <realevenyag@gmail.com>
Signed-off-by: WenyXu <wenymedia@gmail.com>

* fix: redact Kafka SASL password in debug output (#8337)

## Summary
- Mask `KafkaClientSaslConfig` password fields in debug output while keeping usernames visible.
- Cover metasrv WAL debug output with a regression test.

## Files
- `src/common/wal/src/config/kafka/common.rs`
- `src/common/wal/src/config.rs`
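
The redaction described in the summary can be illustrated with a hand-written `Debug` implementation. This is a minimal sketch, not the code from the listed files: `SaslCredentials` and the `<redacted>` placeholder are hypothetical stand-ins, and the real `KafkaClientSaslConfig` may be structured differently.

```rust
use std::fmt;

/// Hypothetical stand-in for a SASL credential pair; this is NOT the
/// real `KafkaClientSaslConfig` definition.
struct SaslCredentials {
    username: String,
    password: String,
}

impl fmt::Debug for SaslCredentials {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        f.debug_struct("SaslCredentials")
            // The username stays visible to keep logs useful.
            .field("username", &self.username)
            // A fixed placeholder is printed instead of the secret.
            .field("password", &"<redacted>")
            .finish()
    }
}
```

Formatting a value with `{:?}` then shows the username but never the password, which is the property a regression test would typically assert on.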

Signed-off-by: Lei, HUANG <mrsatangel@gmail.com>
Signed-off-by: WenyXu <wenymedia@gmail.com>

* fix(query): run optimizer rules before MergeScan (#8339)

* fix(query): push down join filters before MergeScan

Signed-off-by: discord9 <discord9@163.com>

* fix(query): run optimizer before MergeScan pushdown

Signed-off-by: discord9 <discord9@163.com>

* fix(query): narrow pre-MergeScan filter pushdown

Signed-off-by: discord9 <discord9@163.com>

* fix(query): refine pre-MergeScan optimizer prepass

Signed-off-by: discord9 <discord9@163.com>

* fix(query): satisfy predicate extractor clippy

Signed-off-by: discord9 <discord9@163.com>

* test(query): cover pre-MergeScan optimizer edges

Signed-off-by: discord9 <discord9@163.com>

* test(query): cover set comparison prepass

Signed-off-by: discord9 <discord9@163.com>

* fix(query): guard remote scan filter pushdown

Signed-off-by: discord9 <discord9@163.com>

* fix(query): preserve subquery planning errors

Signed-off-by: discord9 <discord9@163.com>

* fix(query): preserve usable scan predicates

Signed-off-by: discord9 <discord9@163.com>

* fix(query): simplify scan predicate extraction

Signed-off-by: discord9 <discord9@163.com>

* fix(query): keep scan filter extraction scoped

Signed-off-by: discord9 <discord9@163.com>

* docs(query): explain pre-MergeScan optimizer

Signed-off-by: discord9 <discord9@163.com>

---------

Signed-off-by: discord9 <discord9@163.com>
Signed-off-by: WenyXu <wenymedia@gmail.com>

* fix: preserve bulk write grpc error details (#8349)

Signed-off-by: jeremyhi <fengjiachun@gmail.com>
Signed-off-by: WenyXu <wenymedia@gmail.com>

* fix: include index files in GC listing (#8327)

* fix: include index files in GC listing

Signed-off-by: discord9 <discord9@163.com>

* chore: filter GC index listing to puffins

Signed-off-by: discord9 <discord9@163.com>

* chore: simplify GC index listing stream

Signed-off-by: discord9 <discord9@163.com>

---------

Signed-off-by: discord9 <discord9@163.com>
Signed-off-by: WenyXu <wenymedia@gmail.com>

* fix: stream tables for prometheus label discovery (#8341)

Signed-off-by: Ritwij Aryan Parmar <ritwij.aryan.parmar@gmail.com>
Signed-off-by: WenyXu <wenymedia@gmail.com>

* fix: account parquet metadata cache size (#8368)

* fix: account parquet metadata cache size

Use Parquet metadata memory sizing for SST metadata cache weight and add regression coverage for byte-array page-index buffers.

Signed-off-by: Lei, HUANG <mrsatangel@gmail.com>

* fix: saturate sst meta cache weight

Signed-off-by: Lei, HUANG <mrsatangel@gmail.com>
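
As an illustration of what "saturating" a cache weight means, here is a minimal sketch. It assumes the cache expects a `u32` weight while the estimated metadata size is a `usize`; the function name `saturating_weight` is hypothetical and is not taken from the patch.

```rust
/// Converts an estimated in-memory size in bytes into a `u32` cache weight.
/// Sizes above `u32::MAX` are clamped to `u32::MAX` rather than wrapping.
/// (Hypothetical helper for illustration only.)
fn saturating_weight(size_bytes: usize) -> u32 {
    size_bytes.min(u32::MAX as usize) as u32
}
```

A plain `size_bytes as u32` cast would silently truncate large values, so an oversized entry could be accounted as a very small one; clamping avoids that.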

---------

Signed-off-by: Lei, HUANG <mrsatangel@gmail.com>
Signed-off-by: WenyXu <wenymedia@gmail.com>

* fix: respect gc mailbox timeout for admin gc (#8363)

Signed-off-by: discord9 <discord9@163.com>
Signed-off-by: WenyXu <wenymedia@gmail.com>

* fix: record catalog and schema in slow queries (#8387)

* fix: record catalog and schema in slow queries

Add catalog and schema context to slow query records while appending the new columns after existing fields to preserve column order.

- `src/common/frontend/src/slow_query_event.rs`: extend `SlowQueryEvent` schema and rows with `catalog_name` and `schema_name`, and cover append-only ordering.
- `src/catalog/src/process_manager.rs`: carry catalog and schema through `SlowQueryTimer`.
- `src/frontend/src/instance.rs`: capture context for SQL, plan, and PromQL slow query timers.
- `tests-integration/tests/sql.rs`: assert MySQL and PostgreSQL slow query records include catalog and schema.

Signed-off-by: Lei, HUANG <ratuthomm@gmail.com>

* fix: address slow query review comment

Use `String::clone` when writing slow query catalog and schema values.

Signed-off-by: Lei, HUANG <ratuthomm@gmail.com>

* fix: keep slow query schema only

Remove the slow query `catalog_name` column and keep `schema_name` as a non-null tag dimension.

- `src/common/frontend/src/slow_query_event.rs`: expose only `schema_name` in `SlowQueryEvent` rows and mark it as a tag.

- `src/catalog/src/process_manager.rs`: stop carrying catalog context in `SlowQueryTimer`.

- `src/frontend/src/instance.rs`: pass only schema context to slow query timers.

- `tests-integration/tests/sql.rs`: assert slow query records include `schema_name` without `catalog_name`.

Signed-off-by: Lei, HUANG <ratuthomm@gmail.com>

* fix: schema name semantic should be field

Signed-off-by: Lei, HUANG <ratuthomm@gmail.com>

* fix: typo

Signed-off-by: Lei, HUANG <ratuthomm@gmail.com>

---------

Signed-off-by: Lei, HUANG <ratuthomm@gmail.com>
Signed-off-by: WenyXu <wenymedia@gmail.com>

* fix: invalidate comment DDL cache and lock by object ID (#8390)

* fix: invalidate comment ddl cache locally

Signed-off-by: WenyXu <wenymedia@gmail.com>

* fix: fix typos

Signed-off-by: WenyXu <wenymedia@gmail.com>

* chore: apply suggestions

Signed-off-by: WenyXu <wenymedia@gmail.com>

---------

Signed-off-by: WenyXu <wenymedia@gmail.com>

* chore: client_ip error logs skip internal API (#8362)

* chore: client_ip error logs skip internal API

Signed-off-by: shuiyisong <xixing.sys@gmail.com>

* fix: fmt

Signed-off-by: shuiyisong <xixing.sys@gmail.com>

* chore: use const

Signed-off-by: shuiyisong <xixing.sys@gmail.com>

* chore: use const

Signed-off-by: shuiyisong <xixing.sys@gmail.com>

---------

Signed-off-by: shuiyisong <xixing.sys@gmail.com>
Signed-off-by: WenyXu <wenymedia@gmail.com>

* feat: update dashboard to v0.13.6 (#8369)

Signed-off-by: WenyXu <wenymedia@gmail.com>

* chore: use ENV for building dashboard (#8384)

Signed-off-by: shuiyisong <xixing.sys@gmail.com>
Signed-off-by: WenyXu <wenymedia@gmail.com>

* fix: handle PromQL time binary aggregation (#8398)

Signed-off-by: jeremyhi <fengjiachun@gmail.com>
Signed-off-by: WenyXu <wenymedia@gmail.com>

* perf(mito): prune files by manifest time range (#8352)

* perf(mito): prune files by manifest time range

Signed-off-by: discord9 <discord9@163.com>
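
The general idea behind pruning files by time range can be sketched as an interval-overlap filter. This is an assumption-laden illustration, not the mito implementation: `FileMeta`, its fields, and `prune_by_time_range` are hypothetical names, and both ranges are treated as inclusive.

```rust
/// Hypothetical per-file metadata: the inclusive timestamp range
/// recorded for a file. Not the real manifest type.
#[derive(Debug, Clone, PartialEq)]
struct FileMeta {
    name: String,
    min_ts: i64,
    max_ts: i64,
}

/// Keeps only the files whose `[min_ts, max_ts]` range overlaps the
/// inclusive query range `[start, end]`. Files entirely before or
/// after the query range are skipped without being opened.
fn prune_by_time_range(files: &[FileMeta], start: i64, end: i64) -> Vec<&FileMeta> {
    files
        .iter()
        .filter(|f| f.min_ts <= end && f.max_ts >= start)
        .collect()
}
```

Because the check only reads metadata that is already in memory, it can discard files before any I/O is issued for them.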

* chore(mito): address file pruning review

Signed-off-by: discord9 <discord9@163.com>

* chore(mito): remove verbose file pruning log

Signed-off-by: discord9 <discord9@163.com>

* chore(mito): expose file pruning metric

Signed-off-by: discord9 <discord9@163.com>

* chore(mito): shorten file pruning metric

Signed-off-by: discord9 <discord9@163.com>

* test(mito): cover file pruning edge cases

Signed-off-by: discord9 <discord9@163.com>

---------

Signed-off-by: discord9 <discord9@163.com>
Signed-off-by: WenyXu <wenymedia@gmail.com>

* perf(mito): skip manifest-pruned file ranges (#8366)

* perf(mito): skip manifest-pruned file ranges

Signed-off-by: discord9 <discord9@163.com>

* test(mito): allow empty prune benchmark output

Signed-off-by: discord9 <discord9@163.com>

* fix(mito): avoid caching stale pruned builders

Signed-off-by: discord9 <discord9@163.com>

* chore(mito): address pruner clippy

Signed-off-by: discord9 <discord9@163.com>

* fix(mito): account worker pruner builder metrics

Signed-off-by: discord9 <discord9@163.com>

* test(mito): keep empty prune benchmark local

Signed-off-by: discord9 <discord9@163.com>

* refactor(mito): share manifest-pruned range skip

Signed-off-by: discord9 <discord9@163.com>

* chore(mito): shorten prune cache comment

Signed-off-by: discord9 <discord9@163.com>

* fix(mito): keep manifest prune state in pruner

Signed-off-by: discord9 <discord9@163.com>

* test(mito): cover manifest prune fast skip edge cases

Signed-off-by: discord9 <discord9@163.com>

* chore: fix typo in logical table alter

Signed-off-by: discord9 <discord9@163.com>

* chore(mito): address pruner review comments

Signed-off-by: discord9 <discord9@163.com>

---------

Signed-off-by: discord9 <discord9@163.com>
Signed-off-by: WenyXu <wenymedia@gmail.com>

* chore: bump version to v1.1.2

Signed-off-by: WenyXu <wenymedia@gmail.com>

---------

Signed-off-by: evenyag <realevenyag@gmail.com>
Signed-off-by: WenyXu <wenymedia@gmail.com>
Signed-off-by: Lei, HUANG <mrsatangel@gmail.com>
Signed-off-by: discord9 <discord9@163.com>
Signed-off-by: jeremyhi <fengjiachun@gmail.com>
Signed-off-by: Ritwij Aryan Parmar <ritwij.aryan.parmar@gmail.com>
Signed-off-by: Lei, HUANG <ratuthomm@gmail.com>
Signed-off-by: shuiyisong <xixing.sys@gmail.com>
Co-authored-by: Yingwen <realevenyag@gmail.com>
Co-authored-by: Lei, HUANG <6406592+v0y4g3r@users.noreply.github.com>
Co-authored-by: discord9 <discord9@163.com>
Co-authored-by: jeremyhi <jiachun_feng@proton.me>
Co-authored-by: Ritwij Aryan Parmar <88580521+RitwijParmar@users.noreply.github.com>
Co-authored-by: shuiyisong <113876041+shuiyisong@users.noreply.github.com>
Co-authored-by: sun <sunchang_long@163.com>
2026-07-02 21:21:19 +08:00

GreptimeDB Logo

One database for metrics, logs, and traces
replacing Prometheus, Loki, and Elasticsearch

The unified OpenTelemetry backend — with SQL + PromQL on object storage.

Introduction

GreptimeDB is an open-source observability database built for Observability 2.0 — treating metrics, logs, and traces as one unified data model (wide events) instead of three separate pillars.

Use it as the single OpenTelemetry backend — replacing Prometheus, Loki, and Elasticsearch with one database built on object storage. Query with SQL and PromQL, scale without pain, cut costs up to 50×.

Overview

A quick overview of what GreptimeDB ingests, how it connects to other systems, and what its distributed engine lets you do.

GreptimeDB Overview

Features

| Feature | Description |
| --- | --- |
| Observability 2.0 native | Logs, metrics, and traces in one engine with SQL + PromQL. Native OpenTelemetry, Prometheus remote write, and Jaeger. Migrate one signal at a time, or use as a single backend. |
| Elastic compute-storage separation | Scale reads independently with horizontal replicas. Serve high-concurrency workloads from dashboards, alerting, and AI agents, without resharding or data migration. |
| Sub-second on PB- to EB-scale data | Columnar engine with fulltext, inverted, and skipping indexes. Written in Rust. Designed for high-concurrency point queries, not just analytical scans. |
| 50× lower cost | Object storage (S3, GCS, Azure Blob) as primary storage, with a tiered cache (memory + local disk) to keep writes and queries fast. |

Perfect for:

  • Replacing Prometheus + Loki + Elasticsearch with a single observability backend
  • Scaling past Prometheus — high cardinality, long-term storage, no Thanos/Mimir overhead
  • AI/agent workloads — store GenAI telemetry (OTel GenAI conventions), and serve high-concurrency reads from SRE/developer agents via horizontal read replicas
  • Cutting observability costs with object storage (up to 50× savings on traces, 30% on logs)
  • Edge-to-cloud observability with unified APIs on resource-constrained devices

Why Observability 2.0? Three separate databases for metrics, logs, and traces means three storage layers, three query languages, and three sets of dashboards. GreptimeDB stores all three as timestamped wide events in one columnar engine — JOIN across signals in SQL, run one stack instead of three, and ingest AI agent telemetry the same way. Read more: Observability 2.0 and the Database for It.

Learn more in Why GreptimeDB.

How GreptimeDB Compares

| Capability | GreptimeDB | Prometheus / Thanos / Mimir | Grafana Loki | Elasticsearch |
| --- | --- | --- | --- | --- |
| Data types | Metrics, logs, traces | Metrics only | Logs only | Logs, traces |
| Query language | SQL + PromQL | PromQL | LogQL | Query DSL |
| Storage | Native object storage (S3, etc.) | Local disk + object storage (Thanos/Mimir) | Object storage (chunks) | Local disk |
| Scaling | Compute-storage separation, stateless nodes | Federation / Thanos / Mimir: multi-component, ops heavy | Stateless + object storage | Shard-based, ops heavy |
| Cost efficiency | Up to 50× lower storage cost | High at scale | Moderate | High (inverted index overhead) |
| OpenTelemetry | Native (metrics + logs + traces) | Partial (metrics only) | Partial (logs only) | Via instrumentation |

Benchmarks:

Architecture

GreptimeDB can run in two modes:

  • Standalone — single binary for development and small deployments.
  • Distributed — four components, each independently scalable:
    • Frontend — protocol entry (OTel, Prometheus, MySQL/PostgreSQL, gRPC, ingestion APIs for Elasticsearch/InfluxDB/Loki) and the distributed query engine. Stateless, scales horizontally.
    • Datanode — region engine with WAL, memtable, SST, cache, compaction, and indexes. Persists data to object storage. Elastic.
    • Metasrv — metadata, routing, repartitioning, autopilot, and security. Backed by a pluggable KV layer (etcd or RDS).
    • Flownode (optional) — continuous flow computation (streaming and materialized views).

For deeper coverage, see the architecture doc or DeepWiki.

GreptimeDB System Overview

Try GreptimeDB

For AI agents — paste this prompt into your agent:

Read https://docs.greptime.com/SKILL.md and follow the instructions
to deploy, configure, ingest, and query GreptimeDB.

Or start a standalone instance with Docker:

docker run -p 127.0.0.1:4000-4003:4000-4003 \
  -v "$(pwd)/greptimedb_data:/greptimedb_data" \
  --name greptime --rm \
  greptime/greptimedb:latest standalone start \
  --http-addr 0.0.0.0:4000 \
  --rpc-bind-addr 0.0.0.0:4001 \
  --mysql-addr 0.0.0.0:4002 \
  --postgres-addr 0.0.0.0:4003

Dashboard: http://localhost:4000/dashboard

Read more in the full Install Guide.

Troubleshooting:

  • Cannot connect to the database? Ensure that ports 4000, 4001, 4002, and 4003 are not blocked by a firewall or used by other services.
  • Failed to start? Check the container logs with docker logs greptime for further details.
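
For the connectivity check above, a small probe can report which of the published ports accept TCP connections. This is a rough sketch using only the Rust standard library; the helper names are hypothetical, and it only tests whether something is listening on `127.0.0.1`, not whether that listener is GreptimeDB.

```rust
use std::net::{SocketAddr, TcpStream};
use std::time::Duration;

/// Ports published by the `docker run` command above, in the order of
/// its flags: HTTP, RPC, MySQL, and PostgreSQL.
const GREPTIME_PORTS: [u16; 4] = [4000, 4001, 4002, 4003];

/// Returns true if a TCP connection to 127.0.0.1:`port` succeeds
/// within `timeout`.
fn port_reachable(port: u16, timeout: Duration) -> bool {
    let addr = SocketAddr::from(([127, 0, 0, 1], port));
    TcpStream::connect_timeout(&addr, timeout).is_ok()
}

/// Returns the subset of `ports` that did not accept a connection.
fn unreachable_ports(ports: &[u16], timeout: Duration) -> Vec<u16> {
    ports
        .iter()
        .copied()
        .filter(|&p| !port_reachable(p, timeout))
        .collect()
}
```

Calling `unreachable_ports(&GREPTIME_PORTS, Duration::from_millis(500))` while the container is running should return an empty list; any port it does return is a candidate for a firewall rule or a conflict with another service.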

Getting Started

Build From Source

Prerequisites:

  • Rust toolchain — nightly, pinned by rust-toolchain.toml
  • Protobuf compiler (>= 3.15)
  • C/C++ building essentials: gcc / g++ / autoconf and the glibc dev package (libc6-dev on Ubuntu, glibc-devel on Fedora)
  • Python toolchain (optional, only for some test scripts)

Build and run:

make                          # build greptime binary
cargo run -- standalone start # start in standalone mode

Common dev commands:

make fmt            # format Rust code
make clippy         # lint (fails on warnings)
make test           # unit + integration tests (uses cargo-nextest)
make sqlness-test   # SQL regression tests

See the Contribution Guidelines for the full developer workflow.

Tools & Extensions

Project Status

GreptimeDB is at v1.0 GA with stable APIs and regular releases. It runs in production at scale — OceanBase Cloud operates 80+ GreptimeDB clusters managing 300 TB of logs, cutting log storage cost by 60% after migrating from Grafana Loki. See more in case studies.

Read the v1.0 highlights and 2026 roadmap, or browse the version reference.

If GreptimeDB is useful to you, please star the repo.

Star History Chart

Known Users

Community

We invite you to engage and contribute!

License

GreptimeDB is licensed under the Apache License 2.0.

Commercial Support

Running GreptimeDB in your organization? We offer enterprise add-ons, services, training, and consulting. Contact us for details.

Contributing

Acknowledgement

Special thanks to all contributors! See AUTHOR.md.


All trademarks, logos, and brand names referenced in this README and in the Overview diagram are the property of their respective owners. Their use is for identification purposes only and does not imply endorsement or affiliation.
