* feat: table semantic layer information_schema view (Phase 3)
Add `information_schema.table_semantics`, a queryable view over the table
semantic layer. One row per table that carries at least one
`greptime.semantic.*` option: the signal-agnostic keys
(signal_type/source/pipeline/metadata_quality) are promoted to columns and
the remaining signal-specific keys are folded into a `semantic_options`
JSON string. Tables with no semantic key are excluded.
Stacked on Phase 2.
Signed-off-by: Dennis Zhuang <killme2008@gmail.com>
* chore: address PR review on table_semantics
- fold JSON serialization failure into None instead of unwrap/panic
- drop per-row Vec allocation in predicate eval; use a fixed array
- align RFC view name with the shipped `table_semantics`
Signed-off-by: Dennis Zhuang <killme2008@gmail.com>
* chore: update results
Signed-off-by: Dennis Zhuang <killme2008@gmail.com>
---------
Signed-off-by: Dennis Zhuang <killme2008@gmail.com>
* feat: table semantic layer per-table enrichment (Phase 2)
Phase 2 of the table semantic layer, plus a vocabulary trim so the layer only
records what a machine consumer cannot cheaply recover on its own.
Per-table metric enrichment (OTLP), via an internal per-table channel:
- A `SemanticIndex` accumulator records, per emitted table, the declared metric
keys: type / unit / temporality / metadata_quality=declared / original_name.
Conflicting single-valued keys collapse to `mixed`/`unknown`.
- Recording happens at the `encode_metrics` level where the base name, metric
type, and proto fields are all in scope, so histogram/summary fan-out gets the
correct per-subtable type (`_bucket`=histogram, `_sum`/`_count`=counter)
without threading state through every encoder.
- The index is serialized onto the `greptime.internal.semantic.per_table_index`
context extension; `apply_per_table_semantic_options` folds each table's keys
into its options at auto-create time.
- `trace.conventions` is refined from the request's resource/scope `schema_url`s
(concrete when uniform, else `mixed`/`unknown`).
Vocabulary trimmed to only meaningful keys. Kept: signal_type, source, pipeline,
trace.conventions, metric.{type,unit,temporality,metadata_quality,original_name}.
Dropped: metric.monotonic (a function of type), trace.has_events/has_links
(constant + derivable from columns), log.severity_scheme/body_format (constant /
derivable, and body_format cost an O(rows) scan), resource/scope lineage
(restates columns / collector-config concern), source_version (no cheap
non-constant value today). Prometheus carries type/unit in the metric name by
convention, so it gets identity only — no inferred enrichment.
Identity (signal_type + source) extended to the remaining ingest protocols so
the discovery view is complete: InfluxDB and OpenTSDB (metric), Loki and
Elasticsearch (log). These protocols carry no type/unit metadata, so identity is
all that applies.
Tests: unit coverage for the accumulator, per-metric-type fan-out, and trace
conventions; integration goldens updated for the OTLP metric/trace SHOW CREATE
output and the new Loki identity.
Signed-off-by: Dennis Zhuang <killme2008@gmail.com>
* chore: validate the option value
Signed-off-by: Dennis Zhuang <killme2008@gmail.com>
---------
Signed-off-by: Dennis Zhuang <killme2008@gmail.com>
* feat: Runner executes procedure
* feat: Add rollback key type to ParsedKey
* feat: Write rollback key when procedure is unable to execute
* feat: Use loaded step to re-submit subprocedure
* feat: Track subprocedures in ProcedureMeta
* feat: Clean message cache after the root procedure is done
* feat: Runner returns execution result
* fix: Fix tests
* test: Test Runner
* test: Test procedures_in_tree
* chore: Refine test and comments
* feat: Remove support of lock inheritance
A deadlock happens if a subprocedure acquires the same lock key as
its parent.
The main concern is if the subprocedure directly inherits its parent's
lock, then how should we behave when multiple subprocedures acquire
this same lock? Each procedure may assume it has unique access to the
same object but it actually shares the resource with others.
Now subprocedures need to use different keys to lock objects, which is
reasonable. For example:
- A parent procedure wants to create a table so it locks the table with
a key like `catalog.schema.table`
- Subprocedures create regions for the table so they lock the regions
with keys `catalog.schema.table.region-0 ~ catalog.schema.table.region-n`
* style: Fix clippy
* feat: insert_procedure returns false on duplicate procedure
Also rename this method to try_insert_procedure
* chore: Address CR comments
* docs: Add procedure framework RFC
* docs: Add dump, rollback and locking to procedure framework
* docs: Change ProcedureBuilder to ProcedureLoader
* docs: Add sub-procedures section
* docs: Add a link to explain idempotent
* docs: Add link to the tracking issue
* docs: Fix ProcedureLoader type alias
* docs: Update procedure API
* docs: Address CR comments
* docs: Update path and make the docs more clear