mirror of https://github.com/GreptimeTeam/greptimedb.git synced 2026-05-14 12:00:40 +00:00

Files

Yvan Wang d1873ca31d fix(metric-engine): validate column types and require time index in verify_rows (#8018)

* fix(metric-engine): validate column types and require time index in verify_rows

The remote-write path into the metric engine previously bypassed schema
validation. When a row's time index column carried a non-timestamp
datatype (e.g. a string), the request reached mito's ValueBuilder::push
for the timestamp builder and panicked instead of surfacing a typed
error.

Cache the (column_id, data_type, semantic_type) tuple for each physical
column on PhysicalRegionState and use it in verify_rows to:

- reject columns whose datatype or semantic type disagrees with the
  physical region's schema (mirrors mito's WriteRequest::check_schema)
- reject requests that omit the time index column entirely

Field columns stay optional; tag completeness needs per-logical-region
metadata that verify_rows doesn't have and is left to a follow-up.
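
The check described above can be sketched roughly as follows. This is a simplified illustration, not the engine's actual code: the `DataType`, `SemanticType`, `RequestColumn` and `VerifyError` definitions, the `verify_columns` function and the column names used with it are all hypothetical stand-ins, and only the idea of caching (column_id, data_type, semantic_type) per physical column comes from the commit message.

```rust
use std::collections::HashMap;

// Hypothetical, simplified stand-ins for the engine's real types.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum DataType {
    TimestampMillisecond,
    Float64,
    Utf8,
}

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum SemanticType {
    Tag,
    Field,
    Timestamp,
}

/// Cached description of one physical column (assumed shape).
#[allow(dead_code)]
#[derive(Debug, Clone, Copy)]
struct PhysicalColumnInfo {
    column_id: u32,
    data_type: DataType,
    semantic_type: SemanticType,
}

/// One column of an incoming request's schema (assumed shape).
struct RequestColumn<'a> {
    name: &'a str,
    data_type: DataType,
    semantic_type: SemanticType,
}

#[derive(Debug, PartialEq, Eq)]
enum VerifyError {
    ColumnNotFound(String),
    TypeMismatch(String),
    MissingTimeIndex,
}

/// Rejects unknown columns, columns whose data type or semantic type
/// disagrees with the cached physical schema, and requests that omit
/// the time index column.
fn verify_columns(
    physical: &HashMap<&str, PhysicalColumnInfo>,
    time_index: &str,
    request: &[RequestColumn<'_>],
) -> Result<(), VerifyError> {
    let mut has_time_index = false;
    for col in request {
        let info = physical
            .get(col.name)
            .ok_or_else(|| VerifyError::ColumnNotFound(col.name.to_string()))?;
        if info.data_type != col.data_type || info.semantic_type != col.semantic_type {
            return Err(VerifyError::TypeMismatch(col.name.to_string()));
        }
        has_time_index |= col.name == time_index;
    }
    if has_time_index {
        Ok(())
    } else {
        Err(VerifyError::MissingTimeIndex)
    }
}
```

In this sketch a string-typed time index column is returned as a `TypeMismatch` error to the caller rather than reaching the timestamp builder.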

Fixes #7990.

Signed-off-by: BootstrapperSBL <yvanwww01@gmail.com>

* refactor(metric-engine): simplify PhysicalColumnInfo construction

- Add From<ColumnMetadata> and From<&ColumnMetadata> for PhysicalColumnInfo
  so call sites can use metadata.into() instead of repeating the field list.
- Replace the four struct-literal constructions in create.rs, open.rs and
  alter.rs with the conversion.
- In verify_rows, pass &col.column_name to ColumnNotFoundSnafu instead of
  cloning it explicitly (snafu's context handles the conversion).

Signed-off-by: BootstrapperSBL <yvanwww01@gmail.com>

* perf(metric-engine): cache time index column name in PhysicalRegionState

verify_rows previously scanned every physical column on each row batch to
find the timestamp column. Since the time index is fixed at region
creation and never changes, stash its name on PhysicalRegionState when
the region is first registered and read it directly from there.

add_physical_columns carries a debug_assert to document the invariant
that alter never introduces a new time index.

Signed-off-by: BootstrapperSBL <yvanwww01@gmail.com>

* perf(metric-engine): borrow physical column names when building name_to_id

On the row-write path we built a HashMap<String, ColumnId> by cloning
every column name out of the physical region's cached state. The map is
scoped to the block that holds the state's read guard, so there's no
need to own the keys.

Switch the map to HashMap<&str, ColumnId> and widen RowsIter::new /
IterIndex::new to accept any key type that borrows as str. Existing
test helpers that pass HashMap<String, ColumnId> keep working through
the Borrow<str> bound.
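
A minimal sketch of the widening described above, assuming a hypothetical `resolve_ids` helper in place of the real RowsIter::new / IterIndex::new signatures (which are not shown here) and assuming `ColumnId` is a `u32`. It only illustrates how a `Borrow<str>` bound lets one function accept both owned and borrowed keys:

```rust
use std::borrow::Borrow;
use std::collections::HashMap;
use std::hash::Hash;

// Assumed alias for illustration; the real ColumnId type lives in the engine.
type ColumnId = u32;

/// Looks up the ID of each requested column name.
/// Works for any key type that borrows as `str`, so both
/// `HashMap<String, ColumnId>` and `HashMap<&str, ColumnId>` are accepted.
fn resolve_ids<K>(name_to_id: &HashMap<K, ColumnId>, names: &[&str]) -> Vec<Option<ColumnId>>
where
    K: Borrow<str> + Hash + Eq,
{
    names
        .iter()
        .map(|name| name_to_id.get(*name).copied())
        .collect()
}
```

Callers holding a read guard can build the borrowed map without cloning any names, while helpers that already own a `HashMap<String, ColumnId>` can pass it unchanged.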

Signed-off-by: BootstrapperSBL <yvanwww01@gmail.com>

* fix: validate metric rows against physical schema

Cache physical column metadata in the metric engine state so row validation and row modification can use the same source of truth for column IDs, data types, and semantic types.

Validate incoming metric rows against the physical schema before writes. Put requests now require the time index and the expected field column, while delete requests keep accepting primary-key-plus-timestamp payloads by skipping the field completeness check.

Pass physical column metadata directly into RowsIter instead of rebuilding a name-to-column-id map at each call site, and cover the new validation paths with tests for missing time indexes, missing fields, and duplicate field columns.

Signed-off-by: evenyag <realevenyag@gmail.com>

* fix: do not allow adding a new field

Signed-off-by: evenyag <realevenyag@gmail.com>

* fix: fill default value for fields

Signed-off-by: evenyag <realevenyag@gmail.com>

* fix: fill default for nullable fields

Signed-off-by: evenyag <realevenyag@gmail.com>

---------

Signed-off-by: BootstrapperSBL <yvanwww01@gmail.com>
Signed-off-by: evenyag <realevenyag@gmail.com>
Co-authored-by: BootstrapperSBL <yvanwww01@gmail.com>
Co-authored-by: evenyag <realevenyag@gmail.com>

2026-05-07 12:41:07 +00:00

cases

fix(metric-engine): validate column types and require time index in verify_rows (#8018)

2026-05-07 12:41:07 +00:00

compat

refactor: restructure sqlness to support multiple envs and extract common utils (#7066)

2025-10-11 06:34:17 +00:00

conf

feat!: switch default sst format to flat (#7909)

2026-04-03 04:14:02 +00:00

data

fix: add map datatype conversion in copy_table_from (#6185) (#6422)

2025-07-28 03:53:10 +00:00

runner

chore: update the opendal to 0.56 rc2 (#8003)

2026-04-26 09:59:48 +00:00

upgrade-compat

feat!: improve mysql/pg compatibility (#7315)

2025-12-01 20:41:14 +00:00

README.md

refactor: restructure sqlness to support multiple envs and extract common utils (#7066)

2025-10-11 06:34:17 +00:00

Sqlness Test

Sqlness manual

Case file

Sqlness uses two types of files:

- .sql: test input, SQL only
- .result: expected test output, the SQL and its results

.result is the output (execution result) file. If you see that a .result file has changed, the test produced a different result and has failed. Check the change logs to find out what caused the difference.

You only need to write the test SQL in a .sql file and run the test.

Case organization

The root directory of the input cases is tests/cases. It contains several subdirectories that stand for different test modes. E.g., standalone/ contains all the tests to run under the greptimedb standalone start mode.

Under a first-level subdirectory (e.g. cases/standalone), you can organize your cases as you like. Sqlness walks through every file recursively and runs them.
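
To illustrate the recursive walk and the .sql/.result pairing, here is a small sketch. It is not sqlness's actual implementation: the function names `collect_sql_cases` and `expected_result_path` are hypothetical, and the sketch only assumes that each case `foo.sql` has its expected output next to it as `foo.result`.

```rust
use std::fs;
use std::io;
use std::path::{Path, PathBuf};

/// Recursively collects every `.sql` file under `root`, in sorted order.
fn collect_sql_cases(root: &Path) -> io::Result<Vec<PathBuf>> {
    let mut cases = Vec::new();
    // Directories still to be visited; avoids recursion in the function itself.
    let mut pending = vec![root.to_path_buf()];
    while let Some(dir) = pending.pop() {
        for entry in fs::read_dir(&dir)? {
            let path = entry?.path();
            if path.is_dir() {
                pending.push(path);
            } else if path.extension().map_or(false, |ext| ext == "sql") {
                cases.push(path);
            }
        }
    }
    cases.sort();
    Ok(cases)
}

/// Maps a case file such as `a/b/foo.sql` to its result file `a/b/foo.result`.
fn expected_result_path(case: &Path) -> PathBuf {
    case.with_extension("result")
}
```

Because the walk is recursive, the layout below a mode directory such as standalone/ does not affect which cases are found.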

Kafka WAL

Sqlness supports Kafka WAL. You can either provide a Kafka cluster or let sqlness start one for you.

To run tests with Kafka, pass the option -w kafka. If no other options are provided, sqlness will use conf/kafka-cluster.yml to start a Kafka cluster. This requires the docker and docker-compose commands to be available in your environment.

Alternatively, you can pass your existing Kafka environment to sqlness with the -k option. E.g.:

cargo sqlness bare -w kafka -k localhost:9092

In this case, sqlness will not start its own Kafka cluster and will use the one you provided instead.

Run the test

Unlike other tests, this harness is a binary target. You can run it with:

cargo sqlness bare

It automatically performs the following steps: compile GreptimeDB, start it, collect the tests and feed them to the server, then collect and compare the results. You only need to check whether the .result files have changed. If not, congratulations, the test has passed 🥳!