Set up tests for multiple storage backends

To run the integration tests, copy .env.example to .env in the project root folder and adjust the values as needed.

Take S3 as an example. You need to set your S3 bucket, region, access key ID, and secret access key:

# Settings for s3 test
GT_S3_BUCKET=S3 bucket
GT_S3_REGION=S3 region
GT_S3_ACCESS_KEY_ID=S3 access key id
GT_S3_ACCESS_KEY=S3 secret access key
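Before running the S3 tests it can help to confirm that all four variables are actually filled in. The following is a minimal sketch, not part of the repository: `missing_s3_vars` is a hypothetical helper, and it assumes the .env file uses plain `KEY=value` lines without quoting or an `export` prefix.

```shell
# Hypothetical helper: list the S3 test variables that are missing or empty
# in a dotenv file. Assumes plain KEY=value lines.
missing_s3_vars() {
  env_file="$1"
  for key in GT_S3_BUCKET GT_S3_REGION GT_S3_ACCESS_KEY_ID GT_S3_ACCESS_KEY; do
    # A key counts as set only if at least one character follows "=".
    grep -Eq "^${key}=.+" "$env_file" || echo "$key"
  done
}
```

Running `missing_s3_vars .env` from the project root prints one variable name per line for anything still unset, and prints nothing when all four are present.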

Run

Execute the following command in the project root folder:

cargo test integration

Test S3 storage:

cargo test s3

Test OSS storage:

cargo test oss

Test Azure Blob (azblob) storage:

cargo test azblob

Set up tests with Kafka WAL

To run the integration tests, copy .env.example to .env in the project root folder and adjust the values as needed. For Kafka, set the broker endpoints:

GT_KAFKA_ENDPOINTS = localhost:9092

Set up a standalone Kafka

cd tests-integration/fixtures

docker compose -f docker-compose.yml up kafka
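The broker may take a few seconds to start accepting connections after the container comes up. The sketch below polls the endpoint before the tests are launched. `wait_for_endpoint` is a hypothetical helper, not something shipped in the repository, and it assumes bash is installed because the probe relies on bash's `/dev/tcp` redirection. A successful TCP connection only shows that the port is open, not that Kafka has finished initializing.

```shell
# Hypothetical helper: poll host:port once per second until it accepts a TCP
# connection or the attempt budget (second argument, default 30) runs out.
wait_for_endpoint() {
  endpoint="$1"
  attempts="${2:-30}"
  host="${endpoint%:*}"   # text before the last ":"
  port="${endpoint##*:}"  # text after the last ":"
  tried=0
  while [ "$tried" -lt "$attempts" ]; do
    # bash opens the socket on fd 3 and exits, which closes it again.
    if bash -c "exec 3<>/dev/tcp/${host}/${port}" 2>/dev/null; then
      echo "ready: ${host}:${port}"
      return 0
    fi
    tried=$((tried + 1))
    sleep 1
  done
  echo "timed out waiting for ${host}:${port}" >&2
  return 1
}
```

For example, `wait_for_endpoint localhost:9092 && cargo test integration` starts the tests only once the port configured in GT_KAFKA_ENDPOINTS is reachable.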

Set up tests with etcd TLS

This guide explains how to set up and test TLS-enabled etcd connections in GreptimeDB integration tests.

Quick Start

The TLS certificates are already checked in at tests-integration/fixtures/etcd-tls-certs/.

  1. Start TLS-enabled etcd:

    cd tests-integration/fixtures
    docker compose up etcd-tls -d
    
  2. Or start all services (including etcd-tls):

    cd tests-integration/fixtures
    docker compose up -d --wait
    

Certificate Details

The checked-in certificates include:

  • ca.crt - Certificate Authority certificate
  • server.crt / server-key.pem - Server certificate for etcd-tls service
  • client.crt / client-key.pem - Client certificate for connecting to etcd-tls

The server certificate includes SANs for localhost, etcd-tls, 127.0.0.1, and ::1.
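To confirm which SANs a certificate actually carries, for instance after regenerating it, the certificate can be inspected with OpenSSL. This is a minimal sketch: `cert_sans` is a hypothetical helper name, and it assumes the `openssl` CLI is installed. It parses the `-text` output rather than using `-ext`, so it should also work with older OpenSSL releases.

```shell
# Hypothetical helper: print the Subject Alternative Name entries of a PEM
# certificate on a single line.
cert_sans() {
  openssl x509 -in "$1" -noout -text |
    grep -A1 'Subject Alternative Name' |  # header line plus the entries line
    tail -n 1 |                            # keep only the entries line
    sed 's/^ *//'                          # strip the leading indentation
}
```

From the project root, `cert_sans tests-integration/fixtures/etcd-tls-certs/server.crt` should list the DNS names and IP addresses mentioned above.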

Regenerating Certificates (Optional)

If you need to regenerate the etcd certificates:

# Regenerate certificates (overwrites existing ones)
./scripts/generate-etcd-tls-certs.sh

# Or generate in custom location
./scripts/generate-etcd-tls-certs.sh /path/to/cert/directory

If you need to regenerate the mysql and postgres certificates:

# Regenerate certificates (overwrites existing ones)
./scripts/generate_certs.sh

# Or generate in custom location
./scripts/generate_certs.sh /path/to/cert/directory

Note: The checked-in certificates are for testing purposes only and should never be used in production.