mirror of
https://github.com/GreptimeTeam/greptimedb.git
synced 2026-05-14 20:10:37 +00:00
* fix(metric-engine): validate column types and require time index in verify_rows

The remote-write path into the metric engine previously bypassed schema validation. When a row's time index column carried a non-timestamp datatype (e.g. a string), the request reached mito's ValueBuilder::push for the timestamp builder and panicked instead of surfacing a typed error.

Cache the (column_id, data_type, semantic_type) tuple for each physical column on PhysicalRegionState and use it in verify_rows to:
- reject columns whose datatype or semantic type disagrees with the physical region's schema (mirrors mito's WriteRequest::check_schema)
- reject requests that omit the time index column entirely

Field columns stay optional; tag completeness needs per-logical-region metadata that verify_rows doesn't have and is left to a follow-up.

Fixes #7990.

Signed-off-by: BootstrapperSBL <yvanwww01@gmail.com>

* refactor(metric-engine): simplify PhysicalColumnInfo construction

- Add From<ColumnMetadata> and From<&ColumnMetadata> for PhysicalColumnInfo so call sites can use metadata.into() instead of repeating the field list.
- Replace the four struct-literal constructions in create.rs, open.rs and alter.rs with the conversion.
- In verify_rows, pass &col.column_name to ColumnNotFoundSnafu instead of cloning it explicitly (snafu's context handles the conversion).

Signed-off-by: BootstrapperSBL <yvanwww01@gmail.com>

* perf(metric-engine): cache time index column name in PhysicalRegionState

verify_rows previously scanned every physical column on each row batch to find the timestamp column. Since the time index is fixed at region creation and never changes, stash its name on PhysicalRegionState when the region is first registered and read it directly from there.

add_physical_columns carries a debug_assert to document the invariant that alter never introduces a new time index.

Signed-off-by: BootstrapperSBL <yvanwww01@gmail.com>

* perf(metric-engine): borrow physical column names when building name_to_id

On the row-write path we built a HashMap<String, ColumnId> by cloning every column name out of the physical region's cached state. The map is scoped to the block that holds the state's read guard, so there's no need to own the keys.

Switch the map to HashMap<&str, ColumnId> and widen RowsIter::new / IterIndex::new to accept any key type that borrows as str. Existing test helpers that pass HashMap<String, ColumnId> keep working through the Borrow<str> bound.

Signed-off-by: BootstrapperSBL <yvanwww01@gmail.com>

* fix: validate metric rows against physical schema

Cache physical column metadata in the metric engine state so row validation and row modification can use the same source of truth for column IDs, data types, and semantic types.

Validate incoming metric rows against the physical schema before writes. Put requests now require the time index and the expected field column, while delete requests keep accepting primary-key-plus-timestamp payloads by skipping the field completeness check.

Pass physical column metadata directly into RowsIter instead of rebuilding a name-to-column-id map at each call site, and cover the new validation paths with tests for missing time indexes, missing fields, and duplicate field columns.

Signed-off-by: evenyag <realevenyag@gmail.com>

* fix: do not allow adding a new field

Signed-off-by: evenyag <realevenyag@gmail.com>

* fix: fill default value for fields

Signed-off-by: evenyag <realevenyag@gmail.com>

* fix: fill default for nullable fields

Signed-off-by: evenyag <realevenyag@gmail.com>

---------

Signed-off-by: BootstrapperSBL <yvanwww01@gmail.com>
Signed-off-by: evenyag <realevenyag@gmail.com>
Co-authored-by: BootstrapperSBL <yvanwww01@gmail.com>
Co-authored-by: evenyag <realevenyag@gmail.com>
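
For readers who want a concrete picture of the validation described in these commits, here is a minimal, self-contained Rust sketch. It is not the code from the change: the type definitions, the VerifyError enum, the helper functions col and sample_state, and the duplicate-column check inside verify_rows are simplified assumptions for illustration, and only the names PhysicalRegionState, PhysicalColumnInfo, ColumnId, and verify_rows are borrowed from the message above. The put/delete field completeness check is left out.

```rust
use std::collections::{HashMap, HashSet};

// Hypothetical stand-ins for the engine's real types.
type ColumnId = u32;

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum DataType { TimestampMillisecond, Float64, String }

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum SemanticType { Tag, Field, Timestamp }

// Cached (column_id, data_type, semantic_type) tuple per physical column.
#[derive(Debug, Clone, Copy)]
struct PhysicalColumnInfo {
    column_id: ColumnId,
    data_type: DataType,
    semantic_type: SemanticType,
}

// Cached state: column metadata by name plus the time index name,
// which is fixed when the region is created.
struct PhysicalRegionState {
    columns: HashMap<String, PhysicalColumnInfo>,
    time_index: String,
}

// One column of an incoming row batch's schema.
struct ColumnSchema {
    name: String,
    data_type: DataType,
    semantic_type: SemanticType,
}

#[derive(Debug, PartialEq, Eq)]
enum VerifyError {
    ColumnNotFound(String),
    DuplicateColumn(String),
    TypeMismatch(String),
    MissingTimeIndex,
}

// Checks each request column against the cached physical schema and
// returns the resolved column IDs in request order.
fn verify_rows(
    state: &PhysicalRegionState,
    schema: &[ColumnSchema],
) -> Result<Vec<ColumnId>, VerifyError> {
    let mut seen: HashSet<&str> = HashSet::new();
    let mut ids = Vec::with_capacity(schema.len());
    for col in schema {
        let info = state
            .columns
            .get(col.name.as_str())
            .ok_or_else(|| VerifyError::ColumnNotFound(col.name.clone()))?;
        // Reject a datatype or semantic type that disagrees with the region.
        if info.data_type != col.data_type || info.semantic_type != col.semantic_type {
            return Err(VerifyError::TypeMismatch(col.name.clone()));
        }
        // Assumption: the same column may not appear twice in one request.
        if !seen.insert(col.name.as_str()) {
            return Err(VerifyError::DuplicateColumn(col.name.clone()));
        }
        ids.push(info.column_id);
    }
    // The time index name is read from the cache, not found by scanning.
    if !seen.contains(state.time_index.as_str()) {
        return Err(VerifyError::MissingTimeIndex);
    }
    Ok(ids)
}

// Hypothetical helper: builds one request column.
fn col(name: &str, data_type: DataType, semantic_type: SemanticType) -> ColumnSchema {
    ColumnSchema { name: name.to_string(), data_type, semantic_type }
}

// Hypothetical helper: a region with one tag, the time index, and one field.
fn sample_state() -> PhysicalRegionState {
    let mut columns = HashMap::new();
    let specs = [
        ("host", DataType::String, SemanticType::Tag),
        ("ts", DataType::TimestampMillisecond, SemanticType::Timestamp),
        ("val", DataType::Float64, SemanticType::Field),
    ];
    for (id, (name, data_type, semantic_type)) in specs.into_iter().enumerate() {
        let info = PhysicalColumnInfo { column_id: id as ColumnId, data_type, semantic_type };
        columns.insert(name.to_string(), info);
    }
    PhysicalRegionState { columns, time_index: "ts".to_string() }
}
```

In this sketch, a request whose ts column is declared as a string fails with TypeMismatch before any value reaches a timestamp builder, which is the class of panic the first commit describes. Returning the resolved column IDs from the same lookup reflects the stated goal of using one source of truth for both validation and row modification.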