mirror of https://github.com/GreptimeTeam/greptimedb.git synced 2026-01-16 18:22:55 +00:00


README.md

Sqlness Test

Sqlness manual

Case file

Sqlness has two types of files:

- .sql: test input, SQL only
- .result: expected test output, SQL and its results

A .result file holds the execution output. If a .result file has changed after a run, the test produced a different result than expected, which indicates a failure. Check the change logs to find the cause.

You only need to write the test SQL in a .sql file and run the test.

Case organization

The root directory for input cases is tests/cases. It contains several subdirectories, one per test mode. For example, standalone/ contains all the tests that run against GreptimeDB started in standalone mode.

Below the first level of subdirectories (e.g. cases/standalone), you can organize your cases as you like. Sqlness walks the directory tree recursively and runs every case file it finds.
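
To see which case files a recursive walk would pick up, the layout can be inspected with standard tools. This is a minimal sketch, assuming the directory layout described above; the helper name `list_cases` is hypothetical and is not part of sqlness.

```shell
#!/bin/sh
# list_cases MODE_DIR
# Print every .sql case file under MODE_DIR, recursively, in sorted order.
# Hypothetical helper for illustration only; sqlness does its own walking.
list_cases() {
    find "$1" -type f -name '*.sql' | sort
}

# Example: list the standalone cases when run from a checkout that has them.
if [ -d tests/cases/standalone ]; then
    list_cases tests/cases/standalone
fi
```

Because the walk is recursive, nesting cases in deeper subdirectories changes only how they are grouped, not whether they are found.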

Kafka WAL

Sqlness supports Kafka WAL. You can either provide a Kafka cluster or let sqlness start one for you.

To run the tests with Kafka, pass the option -w kafka. If no other options are provided, sqlness uses conf/kafka-cluster.yml to start a Kafka cluster. This requires the docker and docker-compose commands in your environment.

Alternatively, you can point sqlness at your existing Kafka environment with the -k option. E.g.:

cargo sqlness bare -w kafka -k localhost:9092

In this case, sqlness does not start its own Kafka cluster and uses the one you provided instead.
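
The two Kafka modes differ only in whether -k is passed, so a small wrapper can choose between them. This is a sketch under stated assumptions: the environment variable `KAFKA_ENDPOINTS` and the function `sqlness_kafka_args` are hypothetical names introduced here, and only the -w and -k options come from the text above.

```shell
#!/bin/sh
# sqlness_kafka_args
# Print the arguments for a Kafka WAL run.
# KAFKA_ENDPOINTS is a hypothetical variable used only in this sketch:
#   set   -> reuse the existing cluster via -k
#   unset -> let sqlness start its own cluster
sqlness_kafka_args() {
    if [ -n "${KAFKA_ENDPOINTS:-}" ]; then
        printf '%s\n' "bare -w kafka -k $KAFKA_ENDPOINTS"
    else
        printf '%s\n' "bare -w kafka"
    fi
}

# Usage (left as a comment so that sourcing this file does not start a build):
#   cargo sqlness $(sqlness_kafka_args)
```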

Run the test

Unlike other tests, this harness is a binary target. You can run it with:

cargo sqlness bare

It automatically performs the following steps: compile GreptimeDB, start it, collect the tests and feed them to the server, then collect and compare the results. You only need to check whether the .result files have changed. If not, congratulations, the test has passed 🥳!
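
One way to check for changed .result files after a run is to ask git. This is a minimal sketch, assuming the cases live in a git working tree; the helper name `changed_results` is hypothetical, and the pathspec '*.result' is just one reasonable way to filter.

```shell
#!/bin/sh
# changed_results
# Print modified or untracked .result files in the current git work tree.
# Empty output means no .result file differs from what git has recorded.
# Hypothetical helper for illustration only.
changed_results() {
    git status --porcelain -- '*.result'
}

# Example: report the outcome when run inside a git work tree.
if git rev-parse --is-inside-work-tree >/dev/null 2>&1; then
    if [ -z "$(changed_results)" ]; then
        echo "no .result files changed"
    else
        echo "changed .result files:"
        changed_results
    fi
fi
```

Running `git diff -- '*.result'` afterwards shows the differences for tracked files, which helps when deciding whether a change is a regression or an expected update.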