* feat: report region read load in heartbeat
Signed-off-by: WenyXu <wenymedia@gmail.com>
* feat: expose region query stats in information schema
Signed-off-by: WenyXu <wenymedia@gmail.com>
* chore: update sqlness result
Signed-off-by: WenyXu <wenymedia@gmail.com>
* fix: record region query stats on stream drop
Signed-off-by: WenyXu <wenymedia@gmail.com>
* fix: keep region query cpu stats in nanoseconds
Signed-off-by: WenyXu <wenymedia@gmail.com>
---------
Signed-off-by: WenyXu <wenymedia@gmail.com>
* fix: record catalog and schema in slow queries
Add catalog and schema context to slow query records, appending the new columns after the existing fields to preserve column order.
- `src/common/frontend/src/slow_query_event.rs`: extend `SlowQueryEvent` schema and rows with `catalog_name` and `schema_name`, and cover append-only ordering.
- `src/catalog/src/process_manager.rs`: carry catalog and schema through `SlowQueryTimer`.
- `src/frontend/src/instance.rs`: capture context for SQL, plan, and PromQL slow query timers.
- `tests-integration/tests/sql.rs`: assert MySQL and PostgreSQL slow query records include catalog and schema.
Signed-off-by: Lei, HUANG <ratuthomm@gmail.com>
* fix: address slow query review comment
Use `String::clone` when writing slow query catalog and schema values.
Signed-off-by: Lei, HUANG <ratuthomm@gmail.com>
* fix: keep slow query schema only
Remove the slow query `catalog_name` column and keep `schema_name` as a non-null tag dimension.
- `src/common/frontend/src/slow_query_event.rs`: expose only `schema_name` in `SlowQueryEvent` rows and mark it as a tag.
- `src/catalog/src/process_manager.rs`: stop carrying catalog context in `SlowQueryTimer`.
- `src/frontend/src/instance.rs`: pass only schema context to slow query timers.
- `tests-integration/tests/sql.rs`: assert slow query records include `schema_name` without `catalog_name`.
Signed-off-by: Lei, HUANG <ratuthomm@gmail.com>
* fix: schema name semantic type should be field
Signed-off-by: Lei, HUANG <ratuthomm@gmail.com>
* fix: typo
Signed-off-by: Lei, HUANG <ratuthomm@gmail.com>
---------
Signed-off-by: Lei, HUANG <ratuthomm@gmail.com>
* feat: disable non object json write
* remove unused code
* fix: sqlness test
* simplify code
* add comment
* remove unused unit test
* fix: cargo fmt
* fix: unit test
* fix: unit test
* refactor: extract region hook function for extension
* refactor: add a warning for unmatched vector size
* feat: reconstruct sst_info from file_meta
* chore: update comments about the hook
Make auto_flush_interval configurable per table (region) in addition to
the global Mito engine config, mirroring how ttl supports a global default
with a per-table override.
- Add auto_flush_interval to RegionOptions, parsed from the table option
with the same humantime format as the global config; reject a
non-positive value.
- The periodic flush logic resolves the effective interval per region,
falling back to the global config when unset, using a saturating
conversion to avoid overflow on extreme values.
- Accept the option key in is_mito_engine_option_key so it can be set at
CREATE TABLE via WITH ('auto_flush_interval' = '5m').
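
The resolution described above can be sketched as follows. This is a minimal illustration, not the actual Mito code: the `RegionOptions` struct is reduced to one field, and `validate_interval`, `effective_flush_interval`, and `interval_millis_saturating` are hypothetical names. It assumes "non-positive" means a zero duration and that the saturating conversion targets a signed millisecond value.

```rust
use std::time::Duration;

/// Hypothetical stand-in for the per-region options; only the field
/// discussed in this commit is modeled.
struct RegionOptions {
    auto_flush_interval: Option<Duration>,
}

/// Rejects a non-positive interval (assumed here to mean zero).
fn validate_interval(interval: Duration) -> Result<Duration, String> {
    if interval.is_zero() {
        Err("auto_flush_interval must be positive".to_string())
    } else {
        Ok(interval)
    }
}

/// Uses the per-region value when set, otherwise the global default.
fn effective_flush_interval(region: &RegionOptions, global: Duration) -> Duration {
    region.auto_flush_interval.unwrap_or(global)
}

/// Converts to milliseconds, clamping to `i64::MAX` instead of overflowing.
fn interval_millis_saturating(interval: Duration) -> i64 {
    i64::try_from(interval.as_millis()).unwrap_or(i64::MAX)
}
```

Keeping the option as `Option<Duration>` lets an unset table option fall through to the global config without a sentinel value, which is the same shape the message describes for `ttl`.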
Ref #8340
Signed-off-by: raphaelroshan <raphaelroshan@gmail.com>
* feat(json2): type hint
* test(datatypes): add JsonSettings serde round-trip test
* minor refactor
This reverts commit 7ff5a5249a09be5396536284fe822b5761ef4e6a.
* fix: code review
* feat: accept x-greptime-pipeline-name header on /events/logs
The /events/logs (and /logs/ingest) endpoint previously only read the
pipeline name from the `pipeline_name` query parameter, while the
OTLP/Elasticsearch/Splunk log ingestion endpoints already accept it via
the `x-greptime-pipeline-name` header. This inconsistency is unfriendly
to users.
Make `log_ingester` resolve the pipeline name from the
`x-greptime-pipeline-name` header (and the deprecated
`x-greptime-log-pipeline-name`), falling back to the query parameter.
The header takes precedence, consistent with how other pipeline options
(e.g. `x-greptime-pipeline-params`) outrank their query-parameter
counterparts.
Closes #6095
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Signed-off-by: BootstrapperSBL <yvanwww01@gmail.com>
* address review: prefer non-deprecated pipeline-name header
When both pipeline-name headers are present, resolve the non-deprecated
`x-greptime-pipeline-name` before the deprecated
`x-greptime-log-pipeline-name`, and cover the precedence with a test.
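
The resulting precedence can be sketched as below. This is an illustration only: `resolve_pipeline_name` is a hypothetical helper, and a plain `HashMap` with lowercase keys stands in for the real HTTP header map used by `log_ingester`.

```rust
use std::collections::HashMap;

const PIPELINE_NAME_HEADER: &str = "x-greptime-pipeline-name";
const DEPRECATED_PIPELINE_NAME_HEADER: &str = "x-greptime-log-pipeline-name";

/// Resolves the pipeline name in this order:
/// 1. the non-deprecated header,
/// 2. the deprecated header,
/// 3. the `pipeline_name` query parameter.
///
/// `headers` is a simplified stand-in with lowercase keys (an assumption).
fn resolve_pipeline_name<'a>(
    headers: &'a HashMap<String, String>,
    query_param: Option<&'a str>,
) -> Option<&'a str> {
    headers
        .get(PIPELINE_NAME_HEADER)
        .or_else(|| headers.get(DEPRECATED_PIPELINE_NAME_HEADER))
        .map(String::as_str)
        .or(query_param)
}
```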
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Signed-off-by: BootstrapperSBL <yvanwww01@gmail.com>
---------
Signed-off-by: BootstrapperSBL <yvanwww01@gmail.com>
Co-authored-by: BootstrapperSBL <yvanwww01@gmail.com>
Co-authored-by: Claude Opus 4.8 <noreply@anthropic.com>
* fix(json2): failed to compact memtable
* fix: cargo clippy
* refactor: align schema with json2 field in flush
* chore: add unit test for json aligner
* chore: add json2 integration test
* fix: address code review from codex
* fix: use parquet schema for encoded JSON2 memtable parts
* Use is_structured_json_field to determine whether the field is of JSON2 type.
* fix: cargo clippy
* fix: only align structured json fields
* chore: assert bulk JSON2 aligner input schemas in debug
* chore: declare GreptimeDB Enterprise License for enterprise-gated sources
The `enterprise`-feature-gated sources (triggers, mito2 extension) were
excluded from the Apache-2.0 header check but carried no license of their
own. Declare a separate GreptimeDB Enterprise License and enforce it.
- Add LICENSE-ENTERPRISE (open-core split; core stays Apache-2.0).
- Add an Enterprise License header to each enterprise source file.
- Add licenserc-enterprise.toml and a second hawkeye step in CI to enforce
the Enterprise header on exactly those files.
- Cross-reference the two complementary file lists; document the layout in
licenses/README.md and the README License section.
Signed-off-by: Dennis Zhuang <killme2008@gmail.com>
* chore: reference per-customer Enterprise Agreement instead of a terms URL
There is no public enterprise-terms page; each customer signs an individually
negotiated agreement. Point the license at a "separate written commercial
agreement with GrepTime Inc." and direct readers to the existing contact page.
Signed-off-by: Dennis Zhuang <killme2008@gmail.com>
---------
Signed-off-by: Dennis Zhuang <killme2008@gmail.com>
* docs: add project-level AGENTS.md as the shared agent/contributor guide
Promote the agent guidance that was previously a personal, gitignored root file
into a committed, contributor-facing AGENTS.md, and tie together the existing
.agents/ resources (architecture invariants, generated-files list, skills) and
the per-crate AGENTS.md guides under one auto-discoverable root entry point.
- Add root AGENTS.md: core commands, repo map, must-read rules, high-signal
entry points, and pre-PR checks. Worded for any coding agent, not just one.
- CLAUDE.md becomes a symlink to AGENTS.md for Claude Code compatibility.
- .gitignore: stop ignoring root AGENTS.md/CLAUDE.md; ignore /.local/ instead,
which is where personal/local agent config now lives.
Signed-off-by: Dennis Zhuang <killme2008@gmail.com>
* fix(metric-engine): report query load under physical region id
Propagate the physical region ID through the scanner, record batch
stream, and query engine so that query-load metrics (CPU time, scanned
bytes) are attributed to the correct physical region rather than always
to the logical region.
- `src/store-api/src/region_engine.rs` — add `query_load_region_id` to
`ScannerProperties` and `set_query_load_region_id` to `RegionScanner`
trait
- `src/mito2/src/read/seq_scan.rs`,
`src/mito2/src/read/series_scan.rs`,
`src/mito2/src/read/unordered_scan.rs` — implement the new trait
method on each scanner
- `src/common/recordbatch/src/adapter.rs` — carry region id through
`RecordBatchStreamAdapter` into `RecordBatchMetrics`
- `src/table/src/table/scan.rs` — expose region id on `RegionScanExec`
- `src/query/src/datafusion.rs` — extract region id from the physical
plan and set it on the output stream
- `src/query/src/dist_plan/merge_scan.rs` — use metrics-contained region
id in query-load reporting, falling back to the logical region id
- `src/metric-engine/src/engine/read.rs` — set region id on the metric
engine scanner
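
The attribution rule in this commit can be sketched as follows. All names are hypothetical (`StreamMetrics`, `RegionLoad`, `record_query_load`), region ids are simplified to `u64`, and the sketch covers only the fallback to the logical region id described here, not any later adjustment.

```rust
use std::collections::HashMap;

/// Hypothetical, simplified view of the metrics carried by a record batch stream.
struct StreamMetrics {
    /// Physical region id set by the scanner, if any (assumed optional).
    query_load_region_id: Option<u64>,
    cpu_time_nanos: u64,
    scanned_bytes: u64,
}

/// Accumulated query load for one region.
#[derive(Debug, Default, PartialEq)]
struct RegionLoad {
    cpu_time_nanos: u64,
    scanned_bytes: u64,
}

/// Attributes the load to the region id carried in the metrics,
/// falling back to the logical region id when none is present.
fn record_query_load(
    loads: &mut HashMap<u64, RegionLoad>,
    logical_region_id: u64,
    metrics: &StreamMetrics,
) {
    let region_id = metrics.query_load_region_id.unwrap_or(logical_region_id);
    let load = loads.entry(region_id).or_default();
    load.cpu_time_nanos = load.cpu_time_nanos.saturating_add(metrics.cpu_time_nanos);
    load.scanned_bytes = load.scanned_bytes.saturating_add(metrics.scanned_bytes);
}
```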
Signed-off-by: Lei, HUANG <ratuthomm@gmail.com>
* fix(query): satisfy clippy for query load region id
Signed-off-by: Lei, HUANG <ratuthomm@gmail.com>
* fix(query): ignore missing query load region ids
Signed-off-by: Lei, HUANG <ratuthomm@gmail.com>
---------
Signed-off-by: Lei, HUANG <ratuthomm@gmail.com>
Co-authored-by: Lei, HUANG <ratuthomm@gmail.com>
* docs(agents): add per-crate guides, architecture invariants, and generated-files list
Add agent/contributor navigation docs modeled on the AGENTS.md convention:
- Per-crate AGENTS.md for hot crates (mito2, metric-engine, flow, frontend,
meta-srv): module map, read/write paths, change-coupling points, test
commands, and gotchas.
- .agents/architecture-invariants.md: repo-wide rules that clippy and the
style guide do not cover (format compatibility, crate layering, async
runtimes, error handling, experimental gating, the DataFusion fork).
- .agents/generated-files.md: tool-generated artifacts that must not be
hand-edited (sqlness .result, config.md, dashboards, build.rs output, proto).
- Anchor the .gitignore CLAUDE.md/AGENTS.md rules to the repo root so per-crate
AGENTS.md files are tracked while root-level personal config stays ignored.
Signed-off-by: Dennis Zhuang <killme2008@gmail.com>
* chore: update crate AGENTS.md and fix config.md path
Signed-off-by: Dennis Zhuang <killme2008@gmail.com>
* docs(agents): fix DataFusion patch layout and SQL query lifecycle order
Address review feedback on #8346:
- architecture-invariants: the DataFusion sub-crates pin an exact crates.io
version in [workspace.dependencies] and are redirected to the fork rev in
[patch.crates-io]; the two sections hold different forms, not the same rev.
- frontend: the SQL query lifecycle runs the pre_parsing/post_parsing
interceptors around parsing, before the per-statement permission check.
Signed-off-by: Dennis Zhuang <killme2008@gmail.com>
---------
Signed-off-by: Dennis Zhuang <killme2008@gmail.com>