* fix: window sort off by one precision TimeRange&better alias track (#8019)
* fix: window sort track alias&off by one precision TimeRange
Signed-off-by: discord9 <discord9@163.com>
* chore: more test
Signed-off-by: discord9 <discord9@163.com>
* refactor: clear helper
Signed-off-by: discord9 <discord9@163.com>
* dedup a bit
Signed-off-by: discord9 <discord9@163.com>
* feat: even more guard
Signed-off-by: discord9 <discord9@163.com>
* fix: case insensitive
Signed-off-by: discord9 <discord9@163.com>
---------
Signed-off-by: discord9 <discord9@163.com>
(cherry picked from commit 9fafd879ed)
Signed-off-by: evenyag <realevenyag@gmail.com>
* fix(server): describe EXPLAIN statements so bind parameters work (#8035)
* fix(server): describe EXPLAIN statements so bind parameters work
`do_describe_inner` only planned `Insert`/`Query`/`Delete`, so
`EXPLAIN` and `EXPLAIN ANALYZE` fell through to the non-plan branch
and had no parameter-type inference. At Bind time the Postgres
handler then reported `unsupported_parameter_type` even though the
inner query would have worked on its own.
Recurse one level into `Statement::Explain` so that an EXPLAIN
wrapping a plannable statement goes through the same describe path.
Adds a tokio-postgres integration test that exercises
`EXPLAIN`/`EXPLAIN ANALYZE` over the extended query protocol.
Fixes#8029
Signed-off-by: BootstrapperSBL <yvanwww@gmail.com>
* refactor(server): extract plannable-inner check into closure
Reduce duplication between the direct match and the EXPLAIN inner match
by factoring out is_inner_plannable. Behaviour unchanged.
Signed-off-by: BootstrapperSBL <yvanwww@gmail.com>
---------
Signed-off-by: BootstrapperSBL <yvanwww@gmail.com>
(cherry picked from commit 793545d8e6)
Signed-off-by: evenyag <realevenyag@gmail.com>
* fix: windows windowed sort ci (#8039)
* fix: windows windowed sort ci
Signed-off-by: discord9 <discord9@163.com>
* chore
Signed-off-by: discord9 <discord9@163.com>
* c
Signed-off-by: discord9 <discord9@163.com>
---------
Signed-off-by: discord9 <discord9@163.com>
(cherry picked from commit 760581b2a0)
Signed-off-by: evenyag <realevenyag@gmail.com>
* fix: batched prometheus ingest row metric (#8054)
* fix: count batched prometheus ingest rows
Signed-off-by: Lei, HUANG <mrsatangel@gmail.com>
* fix: align batched ingest metrics
Use actual affected rows when updating `DIST_INGEST_ROW_COUNT` and cache the flush database label to avoid repeated `get_db_string` allocation.
Files: `src/servers/src/pending_rows_batcher.rs`
Signed-off-by: Lei, HUANG <mrsatangel@gmail.com>
---------
Signed-off-by: Lei, HUANG <mrsatangel@gmail.com>
(cherry picked from commit f0b3ee4830)
Signed-off-by: evenyag <realevenyag@gmail.com>
* fix: preserve case in database name from connection string (#8062)
`parse_optional_catalog_and_schema_from_db_string` unconditionally
lowercased database/schema names, causing quoted database names (e.g.
`CREATE DATABASE "TestQuery"`) to be stored with preserved case but
looked up as lowercase on connection, resulting in "Database not found".
Fixes#8059
Signed-off-by: Lei, HUANG <mrsatangel@gmail.com>
(cherry picked from commit f5c1d5d9bc)
Signed-off-by: evenyag <realevenyag@gmail.com>
* fix(metric-engine): validate column types and require time index in verify_rows (#8018)
* fix(metric-engine): validate column types and require time index in verify_rows
The remote-write path into the metric engine previously bypassed schema
validation. When a row's time index column carried a non-timestamp
datatype (e.g. a string), the request reached mito's ValueBuilder::push
for the timestamp builder and panicked instead of surfacing a typed
error.
Cache the (column_id, data_type, semantic_type) tuple for each physical
column on PhysicalRegionState and use it in verify_rows to:
- reject columns whose datatype or semantic type disagrees with the
physical region's schema (mirrors mito's WriteRequest::check_schema)
- reject requests that omit the time index column entirely
Field columns stay optional; tag completeness needs per-logical-region
metadata that verify_rows doesn't have and is left to a follow-up.
Fixes#7990.
Signed-off-by: BootstrapperSBL <yvanwww01@gmail.com>
* refactor(metric-engine): simplify PhysicalColumnInfo construction
- Add From<ColumnMetadata> and From<&ColumnMetadata> for PhysicalColumnInfo
so call sites can use metadata.into() instead of repeating the field list.
- Replace the four struct-literal constructions in create.rs, open.rs and
alter.rs with the conversion.
- In verify_rows, pass &col.column_name to ColumnNotFoundSnafu instead of
cloning it explicitly (snafu's context handles the conversion).
Signed-off-by: BootstrapperSBL <yvanwww01@gmail.com>
* perf(metric-engine): cache time index column name in PhysicalRegionState
verify_rows previously scanned every physical column on each row batch to
find the timestamp column. Since the time index is fixed at region
creation and never changes, stash its name on PhysicalRegionState when
the region is first registered and read it directly from there.
add_physical_columns carries a debug_assert to document the invariant
that alter never introduces a new time index.
Signed-off-by: BootstrapperSBL <yvanwww01@gmail.com>
* perf(metric-engine): borrow physical column names when building name_to_id
On the row-write path we built a HashMap<String, ColumnId> by cloning
every column name out of the physical region's cached state. The map is
scoped to the block that holds the state's read guard, so there's no
need to own the keys.
Switch the map to HashMap<&str, ColumnId> and widen RowsIter::new /
IterIndex::new to accept any key type that borrows as str. Existing
test helpers that pass HashMap<String, ColumnId> keep working through
the Borrow<str> bound.
Signed-off-by: BootstrapperSBL <yvanwww01@gmail.com>
* fix: validate metric rows against physical schema
Cache physical column metadata in the metric engine state so row validation and row modification can use the same source of truth for column IDs, data types, and semantic types.
Validate incoming metric rows against the physical schema before writes. Put requests now require the time index and the expected field column, while delete requests keep accepting primary-key-plus-timestamp payloads by skipping the field completeness check.
Pass physical column metadata directly into RowsIter instead of rebuilding a name-to-column-id map at each call site, and cover the new validation paths with tests for missing time indexes, missing fields, and duplicate field columns.
Signed-off-by: evenyag <realevenyag@gmail.com>
* fix: do not allow adding a new field
Signed-off-by: evenyag <realevenyag@gmail.com>
* fix: fill default value for fields
Signed-off-by: evenyag <realevenyag@gmail.com>
* fix: fill default for nullable fields
Signed-off-by: evenyag <realevenyag@gmail.com>
---------
Signed-off-by: BootstrapperSBL <yvanwww01@gmail.com>
Signed-off-by: evenyag <realevenyag@gmail.com>
Co-authored-by: BootstrapperSBL <yvanwww01@gmail.com>
Co-authored-by: evenyag <realevenyag@gmail.com>
(cherry picked from commit d1873ca31d)
Signed-off-by: evenyag <realevenyag@gmail.com>
* fix: type inference for sql rewrite (#8052)
fix: type inference for rewrited sql
(cherry picked from commit 5b47ec24ec)
Signed-off-by: evenyag <realevenyag@gmail.com>
* fix: infer time index from column meta on derived table (#8013)
* rough fix
Signed-off-by: Ruihang Xia <waynestxia@gmail.com>
* reorganize
Signed-off-by: Ruihang Xia <waynestxia@gmail.com>
* simplification
Signed-off-by: Ruihang Xia <waynestxia@gmail.com>
* fix format
Signed-off-by: Ruihang Xia <waynestxia@gmail.com>
* add comment
Signed-off-by: Ruihang Xia <waynestxia@gmail.com>
* enhance default by infer
Signed-off-by: Ruihang Xia <waynestxia@gmail.com>
* supply comments
Signed-off-by: Ruihang Xia <waynestxia@gmail.com>
* update sqlness result
Signed-off-by: Ruihang Xia <waynestxia@gmail.com>
---------
Signed-off-by: Ruihang Xia <waynestxia@gmail.com>
(cherry picked from commit 0d90f7407c)
Signed-off-by: evenyag <realevenyag@gmail.com>
* feat: pre-cast constants (#7926)
* init impl
Signed-off-by: Ruihang Xia <waynestxia@gmail.com>
* handle no cast
Signed-off-by: Ruihang Xia <waynestxia@gmail.com>
* refactor using common-expr
Signed-off-by: Ruihang Xia <waynestxia@gmail.com>
* extend matching pattern
Signed-off-by: Ruihang Xia <waynestxia@gmail.com>
* more tests
Signed-off-by: Ruihang Xia <waynestxia@gmail.com>
* simplification
Signed-off-by: Ruihang Xia <waynestxia@gmail.com>
* fix zero timestamp
Signed-off-by: Ruihang Xia <waynestxia@gmail.com>
* fix: normalize sqlness partition count output
Signed-off-by: Ruihang Xia <waynestxia@gmail.com>
* fix: normalize remaining sqlness plan output
Signed-off-by: Ruihang Xia <waynestxia@gmail.com>
* fix: normalize sqlness repartition details in tql explain
Signed-off-by: Ruihang Xia <waynestxia@gmail.com>
* fix: tighten const normalization casts
* test: normalize standalone tql explain repartition output
* resolve cr comments
Signed-off-by: Ruihang Xia <waynestxia@gmail.com>
* simplify
Signed-off-by: Ruihang Xia <waynestxia@gmail.com>
---------
Signed-off-by: Ruihang Xia <waynestxia@gmail.com>
(cherry picked from commit 9133d0464f)
Signed-off-by: evenyag <realevenyag@gmail.com>
* fix(mito): ignore compaction override in enum option validation (#8094)
* fix(mito): ignore compaction override in enum option validation
Signed-off-by: QuakeWang <wangfuzheng0814@foxmail.com>
* test: cover compaction override without compaction type
Signed-off-by: QuakeWang <wangfuzheng0814@foxmail.com>
* fix(mito): short-circuit enum option validation
Signed-off-by: QuakeWang <wangfuzheng0814@foxmail.com>
---------
Signed-off-by: QuakeWang <wangfuzheng0814@foxmail.com>
(cherry picked from commit 73c267e641)
Signed-off-by: evenyag <realevenyag@gmail.com>
* fix(mito2): drop unsound time-filter cache-key stripping (#8105)
* fix(mito2): drop unsound time-filter cache-key stripping
Signed-off-by: evenyag <realevenyag@gmail.com>
* chore: update comments and test
Signed-off-by: evenyag <realevenyag@gmail.com>
---------
Signed-off-by: evenyag <realevenyag@gmail.com>
(cherry picked from commit 5e468190a5)
Signed-off-by: evenyag <realevenyag@gmail.com>
* fix: remap batch table route addresses (#8109)
(cherry picked from commit a04fa52486)
Signed-off-by: evenyag <realevenyag@gmail.com>
* chore: bump version to v1.0.2
Signed-off-by: evenyag <realevenyag@gmail.com>
* fix: avoid stale route update during repartition allocation (#8115)
Signed-off-by: WenyXu <wenymedia@gmail.com>
Signed-off-by: evenyag <realevenyag@gmail.com>
* test: update sqlness result
Signed-off-by: evenyag <realevenyag@gmail.com>
---------
Signed-off-by: discord9 <discord9@163.com>
Signed-off-by: evenyag <realevenyag@gmail.com>
Signed-off-by: BootstrapperSBL <yvanwww@gmail.com>
Signed-off-by: Lei, HUANG <mrsatangel@gmail.com>
Signed-off-by: BootstrapperSBL <yvanwww01@gmail.com>
Signed-off-by: Ruihang Xia <waynestxia@gmail.com>
Signed-off-by: QuakeWang <wangfuzheng0814@foxmail.com>
Signed-off-by: WenyXu <wenymedia@gmail.com>
Co-authored-by: discord9 <55937128+discord9@users.noreply.github.com>
Co-authored-by: Yvan Wang <131545713+BootstrapperSBL@users.noreply.github.com>
Co-authored-by: Lei, HUANG <6406592+v0y4g3r@users.noreply.github.com>
Co-authored-by: BootstrapperSBL <yvanwww01@gmail.com>
Co-authored-by: Ning Sun <sunng@protonmail.com>
Co-authored-by: Ruihang Xia <waynestxia@gmail.com>
Co-authored-by: QuakeWang <45645138+QuakeWang@users.noreply.github.com>
Co-authored-by: Weny Xu <wenymedia@gmail.com>
* feat: support alter from primary_key to flat
Signed-off-by: evenyag <realevenyag@gmail.com>
* chore: alter flat to primary_key
Signed-off-by: evenyag <realevenyag@gmail.com>
* feat: change default_experimental_flat_format to true
Signed-off-by: evenyag <realevenyag@gmail.com>
* feat: compute channel size from splitted batch size
Signed-off-by: evenyag <realevenyag@gmail.com>
* test: add tests for split and channel size
Signed-off-by: evenyag <realevenyag@gmail.com>
* fix: always set sst_format from manifest on region open
sanitize_region_options did not set options.sst_format when the
default (PrimaryKey) matched the manifest value, leaving it as None
after reopen. This caused the alter format change to appear lost.
Signed-off-by: evenyag <realevenyag@gmail.com>
* test: fix tests
Signed-off-by: evenyag <realevenyag@gmail.com>
* test: show create table after alteration
Signed-off-by: evenyag <realevenyag@gmail.com>
* refactor!: rename default_experimental_flat_format to default_flat_format
The flat format is no longer experimental. Remove "experimental" from
the config field name, doc comments, and all references.
Signed-off-by: evenyag <realevenyag@gmail.com>
* chore: fix clippy
Signed-off-by: evenyag <realevenyag@gmail.com>
---------
Signed-off-by: evenyag <realevenyag@gmail.com>
* fix: resolve optimization issue for extended query
* fix: type cast from subquery
* chore: update error information in sqlness
* chore: switch to released pgwire
* refactor: remove optimize function completely
* chore: add more tests
* test: attempt to fix the fuzz issue
* fix: try to resolve the test issue
* feat: initial function rewriter for json_get
* feat: make sure rewrite rule is applied
* feat: keep analyzer's default rules
* feat: implement rewriter for arrow_cast
* test: add unit test for tht rewriter
* chore: format
* refactor: extract some more functions
* Apply suggestion from @waynexia
Co-authored-by: Ruihang Xia <waynestxia@gmail.com>
---------
Co-authored-by: Ruihang Xia <waynestxia@gmail.com>
* feat: update pgwire
* feat: add special parser for copy to stdout
* feat: implement copy to stdout
* fix: improve code
* fix: expect optional with
* fix: lint
* feat: correct encoder using and refactor
* chore: fmt
* refactor: update api
* chore: use released dependencies
* fix: update datafusion-pg-catalog to support schema query
* fix: support for double quoted identifier
* feat: update datafusion-postgres to support schema.table
* refactor: use pgsqlparser container
* refactor: remove unquote which is no longer needed
* fix: correctly handle invalid query
* fix: correct handle null in nano timestamp
* test: add a new test for additional close )
* refactor: remove the `RawTableMeta` and `RawTableInfo` to make codes more concise
Signed-off-by: luofucong <luofc@foxmail.com>
* fix ci
Signed-off-by: luofucong <luofc@foxmail.com>
* fix ci
Signed-off-by: luofucong <luofc@foxmail.com>
---------
Signed-off-by: luofucong <luofc@foxmail.com>
* feat(copy_to_json): add `date_format`/`timestamp_format`/`time_format` for JSON format.
Signed-off-by: Yihai Lin <yihai-lin@foxmail.com>
* Update src/common/datasource/src/file_format/json.rs
Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>
Signed-off-by: Yihai Lin <yihai-lin@foxmail.com>
* chore: Use predefined constants as the time format.
Signed-off-by: Yihai Lin <yihai-lin@foxmail.com>
---------
Signed-off-by: Yihai Lin <yihai-lin@foxmail.com>
Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>
* feat: flow last non null support
Signed-off-by: discord9 <discord9@163.com>
* clippy
Signed-off-by: discord9 <discord9@163.com>
* test: report error when sink is not last non null
Signed-off-by: discord9 <discord9@163.com>
* fix: error if query column matches nothing if partial
Signed-off-by: discord9 <discord9@163.com>
---------
Signed-off-by: discord9 <discord9@163.com>
* feat: add return_field_from_args
* feat: add JsonGetWithType
* port json_get_float and json_get_bool to new implementation, add
json_get with third argument accepting a scalar value for type.
* fix: lint fix
* chore: add sqlness tests
* chore: update tests
* feat: add sync region instruction for repartition procedure
This commit introduces a new sync region instruction and integrates it
into the repartition procedure flow, specifically for metric engine tables.
Changes:
- Add SyncRegion instruction type and SyncRegionsReply in instruction.rs
- Implement SyncRegionHandler in datanode to handle sync region requests
- Add SyncRegion state in repartition procedure to sync newly allocated regions
- Integrate sync region step after enter_staging_region for metric engine tables
- Add sync_region flag and allocated_region_ids to PersistentContext
- Make SyncRegionFromRequest serializable for instruction transmission
- Add test utilities and mock support for sync region operations
The sync region step is conditionally executed based on the table engine type,
ensuring that newly allocated regions in metric engine tables are properly
synced from their source regions before proceeding with manifest remapping.
Signed-off-by: WenyXu <wenymedia@gmail.com>
* chore: add logs
Signed-off-by: WenyXu <wenymedia@gmail.com>
* feat(repartition): improve staging region handling and support metric engine repartition
- Reorder sync region flow: move SyncRegion from EnterStagingRegion to RepartitionStart to sync before applying staging
- Add ExitStaging metadata update state to properly clear staging leader info after repartition completes
- Update build_template_from_raw_table_info to optionally skip metric engine internal columns when creating region requests
- Fix region state transition: set_dropping now expects specific state (Staging or Writable) for proper validation
- Adjust region drop and copy handlers to handle staging regions correctly
- Add comprehensive test cases for metric engine SPLIT/MERGE partition operations on physical tables with logical tables
- Improve logging for table route updates, region drops, and repartition operations
Signed-off-by: WenyXu <wenymedia@gmail.com>
* refactor: removes code duplication
Signed-off-by: WenyXu <wenymedia@gmail.com>
* fix: update result
Signed-off-by: WenyXu <wenymedia@gmail.com>
* chore: refine comments
Signed-off-by: WenyXu <wenymedia@gmail.com>
* feat: add error strategy support for flush region and flush pending deallocate regions
- **Add `ErrorStrategy` enum** in `procedure/utils.rs`:
- Supports `Ignore` and `Retry` strategies for error handling
- Refactor `flush_region` to accept `error_strategy` parameter
- Extract `handle_flush_region_reply` helper function for better code organization
- **Add pending deallocate region support**:
- Add `pending_deallocate_region_ids` field to `PersistentContext`
- Implement `flush_pending_deallocate_regions` in `EnterStagingRegion` state
- Flush pending deallocate regions before entering staging regions to ensure data consistency
- **Update error handling**:
- `flush_leader_region`: Use `ErrorStrategy::Ignore` to skip unreachable datanodes
- `sync_region`: Use `ErrorStrategy::Retry` for critical operations
- `enter_staging_region`: Use `ErrorStrategy::Retry` when flushing pending deallocate regions
This change improves the robustness of the repartition procedure by:
1. Providing flexible error handling strategies for flush operations
2. Ensuring pending deallocate regions are properly flushed before repartitioning
3. Preventing data inconsistency during region migration
Signed-off-by: WenyXu <wenymedia@gmail.com>
* chore: apply suggestions from CR
Signed-off-by: WenyXu <wenymedia@gmail.com>
* fix: compile
Signed-off-by: WenyXu <wenymedia@gmail.com>
---------
Signed-off-by: WenyXu <wenymedia@gmail.com>