Set up tests for multiple storage backends
To run the integration tests, copy .env.example to .env in the project root folder and change the values as needed.
Take S3 as an example. You need to set your S3 bucket, region, access key ID, and secret access key:
# Settings for s3 test
GT_S3_BUCKET=S3 bucket
GT_S3_REGION=S3 region
GT_S3_ACCESS_KEY_ID=S3 access key id
GT_S3_ACCESS_KEY=S3 secret access key
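Before running the S3 tests, it can help to confirm that all four settings above are actually filled in. The following is a minimal sketch, not part of the repository: the function name `missing_s3_vars` is hypothetical, and it assumes the env file uses plain `NAME=value` lines (optionally with spaces around `=`).

```shell
# Report which S3 settings are missing or empty in an env file.
# Prints one variable name per line; prints nothing when all four are set.
# If the file does not exist, every variable is reported as missing.
# Note: env_file and name are ordinary (global) shell variables here.
missing_s3_vars() {
    env_file="$1"
    for name in GT_S3_BUCKET GT_S3_REGION GT_S3_ACCESS_KEY_ID GT_S3_ACCESS_KEY; do
        # Match "NAME=value" with optional spaces around "=" and a non-empty value.
        if ! grep -Eq "^[[:space:]]*${name}[[:space:]]*=[[:space:]]*[^[:space:]]" "$env_file" 2>/dev/null; then
            echo "$name"
        fi
    done
}
```

Calling `missing_s3_vars .env` from the project root lists any setting that still needs a value. The check only looks at whether a value is present, not whether the credentials are valid.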
Run
Execute the following command in the project root folder:
cargo test integration
Test s3 storage:
cargo test s3
Test oss storage:
cargo test oss
Test azblob storage:
cargo test azblob
Set up tests with Kafka WAL
To run the integration tests, copy .env.example to .env in the project root folder and change the values as needed.
GT_KAFKA_ENDPOINTS = localhost:9092
Set up standalone Kafka
cd tests-integration/fixtures
docker compose -f docker-compose.yml up kafka
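The Kafka container may take a few seconds to accept connections after `docker compose` returns, so a small wait loop before running the tests can avoid spurious failures. The sketch below is illustrative only: `split_endpoints` and `wait_for_kafka` are hypothetical helper names, it assumes `GT_KAFKA_ENDPOINTS` holds one or more comma-separated `host:port` entries, and it assumes `nc` (netcat) with the `-z` flag is installed.

```shell
# Print one trimmed host:port entry per line from a comma-separated list.
split_endpoints() {
    printf '%s\n' "$1" | tr ',' '\n' \
        | sed 's/^[[:space:]]*//; s/[[:space:]]*$//; /^$/d'
}

# Poll each endpoint until it accepts TCP connections.
# $1: endpoint list, $2: maximum attempts per endpoint (default 30, one second apart).
# Returns non-zero as soon as one endpoint runs out of attempts.
wait_for_kafka() {
    attempts="${2:-30}"
    for endpoint in $(split_endpoints "$1"); do
        host="${endpoint%:*}"    # text before the last ":"
        port="${endpoint##*:}"   # text after the last ":"
        n=0
        until nc -z "$host" "$port" 2>/dev/null; do
            n=$((n + 1))
            if [ "$n" -ge "$attempts" ]; then
                echo "timed out waiting for $endpoint" >&2
                return 1
            fi
            sleep 1
        done
        echo "$endpoint is reachable"
    done
}
```

For the setting shown above, `wait_for_kafka "localhost:9092"` would block until the broker port is open or about 30 seconds have passed. An open port only shows that something is listening; it does not prove the broker is fully ready.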
Set up tests with etcd TLS
This guide explains how to set up and test TLS-enabled etcd connections in GreptimeDB integration tests.
Quick Start
TLS certificates are already at tests-integration/fixtures/etcd-tls-certs/.
- Start TLS-enabled etcd:
  cd tests-integration/fixtures
  docker compose up etcd-tls -d
- Start all services (including etcd-tls):
  cd tests-integration/fixtures
  docker compose up -d --wait
Certificate Details
The checked-in certificates include:
- ca.crt - Certificate Authority certificate
- server.crt / server-key.pem - Server certificate for the etcd-tls service
- client.crt / client-key.pem - Client certificate for connecting to etcd-tls
The server certificate includes SANs for localhost, etcd-tls, 127.0.0.1, and ::1.
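A quick sanity check that the certificate directory contains the five files listed above can catch a partial checkout or an interrupted regeneration. This is a minimal sketch under stated assumptions: the function name `check_etcd_tls_certs` is hypothetical, and the default path assumes the command is run from the project root.

```shell
# Verify that a directory holds the five etcd TLS certificate files.
# Prints each missing or empty file to stderr and returns non-zero if any is absent.
check_etcd_tls_certs() {
    cert_dir="${1:-tests-integration/fixtures/etcd-tls-certs}"
    status=0
    for file in ca.crt server.crt server-key.pem client.crt client-key.pem; do
        # -s is true only when the file exists and is not empty.
        if [ ! -s "$cert_dir/$file" ]; then
            echo "missing or empty: $cert_dir/$file" >&2
            status=1
        fi
    done
    return "$status"
}
```

This only checks that the files are present. To inspect the SANs themselves, `openssl x509 -in tests-integration/fixtures/etcd-tls-certs/server.crt -noout -text` prints the certificate contents, including the Subject Alternative Name extension.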
Regenerating Certificates (Optional)
If you need to regenerate the etcd certificates:
# Regenerate certificates (overwrites existing ones)
./scripts/generate-etcd-tls-certs.sh
# Or generate in custom location
./scripts/generate-etcd-tls-certs.sh /path/to/cert/directory
If you need to regenerate the MySQL and PostgreSQL certificates:
# Regenerate certificates (overwrites existing ones)
./scripts/generate_certs.sh
# Or generate in custom location
./scripts/generate_certs.sh /path/to/cert/directory
Note: The checked-in certificates are for testing purposes only and should never be used in production.