greptimedb

mirror of https://github.com/GreptimeTeam/greptimedb.git synced 2026-07-04 04:50:37 +00:00

Author	SHA1	Message	Date
dennis zhuang	31c2c1f6db	feat: table semantic layer per-table enrichment (Phase 2) (#8218 ) * feat: table semantic layer per-table enrichment (Phase 2) Phase 2 of the table semantic layer, plus a vocabulary trim so the layer only records what a machine consumer cannot cheaply recover on its own. Per-table metric enrichment (OTLP), via an internal per-table channel: - A `SemanticIndex` accumulator records, per emitted table, the declared metric keys: type / unit / temporality / metadata_quality=declared / original_name. Conflicting single-valued keys collapse to `mixed`/`unknown`. - Recording happens at the `encode_metrics` level where the base name, metric type, and proto fields are all in scope, so histogram/summary fan-out gets the correct per-subtable type (`_bucket`=histogram, `_sum`/`_count`=counter) without threading state through every encoder. - The index is serialized onto the `greptime.internal.semantic.per_table_index` context extension; `apply_per_table_semantic_options` folds each table's keys into its options at auto-create time. - `trace.conventions` is refined from the request's resource/scope `schema_url`s (concrete when uniform, else `mixed`/`unknown`). Vocabulary trimmed to only meaningful keys. Kept: signal_type, source, pipeline, trace.conventions, metric.{type,unit,temporality,metadata_quality,original_name}. Dropped: metric.monotonic (a function of type), trace.has_events/has_links (constant + derivable from columns), log.severity_scheme/body_format (constant / derivable, and body_format cost an O(rows) scan), resource/scope lineage (restates columns / collector-config concern), source_version (no cheap non-constant value today). Prometheus carries type/unit in the metric name by convention, so it gets identity only — no inferred enrichment. Identity (signal_type + source) extended to the remaining ingest protocols so the discovery view is complete: InfluxDB and OpenTSDB (metric), Loki and Elasticsearch (log). These protocols carry no type/unit metadata, so identity is all that applies. Tests: unit coverage for the accumulator, per-metric-type fan-out, and trace conventions; integration goldens updated for the OTLP metric/trace SHOW CREATE output and the new Loki identity. Signed-off-by: Dennis Zhuang <killme2008@gmail.com> * chore: validate the option value Signed-off-by: Dennis Zhuang <killme2008@gmail.com> --------- Signed-off-by: Dennis Zhuang <killme2008@gmail.com>	2026-06-04 12:52:00 +00:00
QuakeWang	6ed67167bb	feat: support headerless CSV copy from (#8233 ) * feat: support headerless CSV copy from Signed-off-by: QuakeWang <wangfuzheng0814@foxmail.com> * fix: update csv copy sqlness result Signed-off-by: QuakeWang <wangfuzheng0814@foxmail.com> * test: cover headerless CSV copy from Signed-off-by: QuakeWang <wangfuzheng0814@foxmail.com> * test: cover headerless CSV column count mismatch Signed-off-by: QuakeWang <wangfuzheng0814@foxmail.com> --------- Signed-off-by: QuakeWang <wangfuzheng0814@foxmail.com>	2026-06-04 12:04:15 +00:00
Lei, HUANG	da9d314fa4	fix: align bulk insert schema handling (#8222 ) * fix: align bulk insert schema handling Carry `aligned_schema_version` through bulk insert requests so datanode can recheck stale schemas only when needed. Use the physical data region schema version for metric logical bulk inserts, while external bulk insert callers leave alignment unknown. Related files: - `Cargo.toml` - `Cargo.lock` - `src/store-api/src/region_request.rs` - `src/metric-engine/src/engine/bulk_insert.rs` - `src/mito2/src/request.rs` - `src/mito2/src/worker/handle_bulk_insert.rs` - `src/mito2/src/worker/handle_write.rs` - `src/operator/src/bulk_insert.rs` - `src/servers/src/pending_rows_batcher.rs` - `src/mito2/src/engine/edit_region_test.rs` Signed-off-by: Lei, HUANG <mrsatangel@gmail.com> * test: cover bulk insert missing columns Add Flight bulk insert coverage for batches that omit nullable columns and verify the database fills the missing values with nulls. Related files: - `tests-integration/src/grpc/flight.rs` Signed-off-by: Lei, HUANG <mrsatangel@gmail.com> * chore: update proto to main branch commit Signed-off-by: Lei, HUANG <mrsatangel@gmail.com> * fix: align physical metric bulk insert schema Pass the physical schema version through metric bulk insert requests so physical and logical region writes share the same schema alignment behavior. Files: - \`src/metric-engine/src/engine/bulk_insert.rs\` — updates \`bulk_insert_physical_region\` and reuses \`physical_schema_version\` for bulk insert schema alignment. Signed-off-by: Lei, HUANG <mrsatangel@gmail.com> * chore: add some doc Signed-off-by: Lei, HUANG <mrsatangel@gmail.com> --------- Signed-off-by: Lei, HUANG <mrsatangel@gmail.com>	2026-06-04 02:55:09 +00:00
QuakeWang	ca07a53deb	feat: support CSV copy skip bad records (#8198 ) * feat: support CSV copy headers and skip bad records Signed-off-by: QuakeWang <wangfuzheng0814@foxmail.com> * refactor: reuse CSV skippable error helper Signed-off-by: QuakeWang <wangfuzheng0814@foxmail.com> * fix: address CSV skip bad records review Signed-off-by: QuakeWang <wangfuzheng0814@foxmail.com> * docs: clarify CSV skip bad records performance tradeoff Signed-off-by: QuakeWang <wangfuzheng0814@foxmail.com> --------- Signed-off-by: QuakeWang <wangfuzheng0814@foxmail.com>	2026-06-03 12:11:02 +00:00
dennis zhuang	f25ab67af1	feat: table semantic layer identity (Phase 1) (#8210 ) * feat: table semantic layer identity (Phase 1) Attach a thin layer of semantic metadata to ingested tables via the existing `table_options` slot, so machine consumers (LLM agents, alert/dashboard builders, MCP servers, ETL) can align a table with the observability concept it stands for without guessing from column names. See docs/rfcs/2026-05-28-table-semantic-layer.md. Phase 1 (identity) only: - New `table::requests::semantic` module: the `greptime.semantic.` vocabulary (signal/source/source_version/pipeline + trace/metric/log/resource-scope keys, defined now, populated by later phases), value constants, the internal `greptime.internal.semantic.per_table_index` transport key (reserved for Phase 2, deliberately outside the public namespace), and `is_semantic_option_key`. - `validate_table_option` accepts the `greptime.semantic.` prefix, so the keys are valid both on the auto-create path and on explicit `CREATE TABLE ... WITH (...)`. - `fill_table_options_for_create` copies every semantic ctx extension into the new table's options (prefix passthrough alongside the fixed allowlist). - Frontend stamps identity on the context at each ingest entry: OTLP metrics (metric/opentelemetry), traces (+pipeline, has_events/has_links/conventions for the v1 model), logs (log/opentelemetry), and Prometheus remote write (metric/prometheus, metadata_quality=inferred). OTLP metric metadata_quality is left for Phase 2 (declared). - Trace identity is stamped only on the main span table; the derived `_services` / `_operations` lookup tables keep the unstamped context and carry no semantic identity (cross-table relationships are out of scope). Semantic options appear in SHOW CREATE TABLE (like table_data_model / otlp_metric_compat) and in information_schema, so an LLM inspecting a table sees its semantics directly. Tests: unit (validation prefix + internal-key rejection, ctx passthrough) and integration assertions that the common keys land for OTLP metrics (metric-engine logical table), traces, logs, and Prometheus remote write; SHOW CREATE goldens updated. Signed-off-by: Dennis Zhuang <killme2008@gmail.com> * fix: prom batcher not cover and white list for semantic keys/values Signed-off-by: Dennis Zhuang <killme2008@gmail.com> * fix: typo Signed-off-by: Dennis Zhuang <killme2008@gmail.com> --------- Signed-off-by: Dennis Zhuang <killme2008@gmail.com>	2026-06-02 01:56:29 +00:00
discord9	28fd796f4e	fix(flow): harden incremental read correctness (#8196 ) * fix(flow): harden incremental read correctness Signed-off-by: discord9 <discord9@163.com> * fix(flow): propagate dirty window options Signed-off-by: discord9 <discord9@163.com> * test: more Signed-off-by: discord9 <discord9@163.com> * chore: test config api Signed-off-by: discord9 <discord9@163.com> * refactor: split gen Signed-off-by: discord9 <discord9@163.com> * chore: per review Signed-off-by: discord9 <discord9@163.com> * fix: allowlist key Signed-off-by: discord9 <discord9@163.com> --------- Signed-off-by: discord9 <discord9@163.com>	2026-06-01 02:48:00 +00:00
dennis zhuang	ed9312f8e3	feat: global switch for creating tables automatically (#8203 ) * feat: global switch for creating table automatically Signed-off-by: Dennis Zhuang <killme2008@gmail.com> * chore: make auto_create_table as comment by default Signed-off-by: Dennis Zhuang <killme2008@gmail.com> * feat: respect gloabl switch for metric engine Signed-off-by: Dennis Zhuang <killme2008@gmail.com> --------- Signed-off-by: Dennis Zhuang <killme2008@gmail.com>	2026-05-31 23:51:14 +00:00
discord9	ba15a9c056	feat: support pending flow metadata with defer_on_missing_source (#8124 ) * feat: support defer_on_missing_source for pending flow creation Add `defer_on_missing_source` flow option that allows creating flows even when source tables do not yet exist. The flow enters a pending state and is automatically activated when source tables become available. Key changes: - New `FlowStatus::PendingSources` and fields in `FlowInfoValue` for unresolved source table names and last activation error - `defer_on_missing_source` create-time-only option: stripped from runtime/flownode `CreateRequest` but preserved in metadata for SQL round-trip (`SHOW CREATE FLOW`, `information_schema.flows`) - `CreateFlowProcedure` creates pending metadata when sources are missing and `defer_on_missing_source=true`; falls back to `FlowType::Batching` for missing-source flows - `PendingFlowReconcileManager` in meta-srv periodically checks pending flows and activates them when source tables resolve - `ActivatePendingFlowProcedure` handles activation: allocates peers, creates flows on flownodes, updates metadata, invalidates cache - `OR REPLACE` properly handles pending<->active transitions, including peer allocation and flownode flow teardown - `FlowMetadataAllocator::alloc_peers` for peer allocation at activation time - Validated flow options: only `defer_on_missing_source` allowed; unknown options rejected - Known issue: standalone mode does not support flownodes, so pending flow flush/sink behavior covered only in distributed sqlness; operator and meta unit tests cover activation logic Tests: - operator `determine_flow_type_for_source_state` (3 passed) - common-meta `create_flow` (19 passed) including replacement - common-meta `activate_flow` (4 passed) - meta-srv `flow` (11 passed) - sqlness: `flow_pending` covers create/replace/round-trip Signed-off-by: discord9 <discord9@163.com> * chore: simplify pending flow PR scope Reduce PR #8124 to the metadata-only MVP after complexity review. Changes: - Remove automatic activation procedure and meta-srv reconcile wiring - Remove activation tests and activation-only metadata fields - Reject cross-state pending<->active `OR REPLACE` transitions for now - Keep pending metadata creation and SQL round-trip behavior - Allow `DROP FLOW` for pending flows without routes - Reduce flow_pending sqlness to metadata/round-trip/drop coverage only Deferred follow-ups are documented locally in `.tmp/tasks/pending-defer-semantics/deferred-followups.md` and intentionally not committed. Tests: - `cargo test -p operator determine_flow_type_for_source_state` - `cargo test -p common-meta create_flow` - `cargo test -p common-meta drop_flow` - `cargo sqlness bare --test-filter flow_pending --bins-dir /mnt/nvme_rust/rust-targets/pending_defer/debug` Signed-off-by: discord9 <discord9@163.com> * test: cover pending flow metadata edge cases Signed-off-by: discord9 <discord9@163.com> * test: fix pending flow metadata test lint Signed-off-by: discord9 <discord9@163.com> * docs: document pending flow metadata fields Signed-off-by: discord9 <discord9@163.com> * chore: more sleep when test Signed-off-by: discord9 <discord9@163.com> --------- Signed-off-by: discord9 <discord9@163.com>	2026-05-29 06:59:21 +00:00
Weny Xu	f513b77ccc	feat: support alter table partition syntax (#8177 ) * feat(sql): support alter table partition syntax Signed-off-by: WenyXu <wenymedia@gmail.com> * feat: support repartition source proto Signed-off-by: WenyXu <wenymedia@gmail.com> * chore: update greptime-proto Signed-off-by: WenyXu <wenymedia@gmail.com> --------- Signed-off-by: WenyXu <wenymedia@gmail.com>	2026-05-26 15:06:14 +00:00
Lei, HUANG	f8df016623	feat: add InfluxDB default merge mode config (#8134 ) * feat/influxdb-default-merge-mode: add InfluxDB merge mode config - `influxdb` config: add `default_merge_mode` parsing and defaults in `src/frontend/src/service_config/influxdb.rs` and `src/frontend/src/service_config.rs` - auto-create behavior: apply configured `merge_mode` for InfluxDB ingestion in `src/frontend/src/instance.rs`, `src/frontend/src/instance/builder.rs`, `src/frontend/src/instance/influxdb.rs`, and `src/operator/src/insert.rs` - config docs: document `influxdb.default_merge_mode` in `config/frontend.example.toml`, `config/standalone.example.toml`, and `config/config.md` Signed-off-by: Lei, HUANG <mrsatangel@gmail.com> * feat/influxdb-default-merge-mode: derive merge mode default - `influxdb` config: derive `Default` for `InfluxdbMergeMode` in `src/frontend/src/service_config/influxdb.rs` Signed-off-by: Lei, HUANG <mrsatangel@gmail.com> * feat/influxdb-default-merge-mode: update config API snapshot - `config API`: include `default_merge_mode` in `tests-integration/tests/http.rs` Signed-off-by: Lei, HUANG <mrsatangel@gmail.com> * feat/influxdb-default-merge-mode: avoid default context clone - `InfluxDB merge mode`: avoid cloning `QueryContext` for default `last_non_null` in `src/frontend/src/instance/influxdb.rs` - `InfluxDB merge mode`: cover default, configured, and explicit `MERGE_MODE_KEY` paths in `src/frontend/src/instance/influxdb.rs` Signed-off-by: Lei, HUANG <mrsatangel@gmail.com> --------- Signed-off-by: Lei, HUANG <mrsatangel@gmail.com>	2026-05-19 16:54:36 +00:00
Lei, HUANG	2fdbe6c8c3	feat: expose node info for placement selectors (#8095 ) * feat: expose node info for placement selectors Return `NodeInfo` from `PeerDiscovery` methods and keep OSS selectors mapping back to `Peer`. Carry `__greptime_origin_frontend.addr` from frontend create-table DDLs into selector `extensions`, and thread `PeerAllocContext` through table-route allocation. Persist datanode `NodeInfo` when heartbeat stats are absent so collected env vars remain available after restart. Signed-off-by: Lei, HUANG <mrsatangel@gmail.com> * fix: skip datanode node info without stats Signed-off-by: Lei, HUANG <mrsatangel@gmail.com> * fix: avoid unnecessary workload clones Skip workload cloning for inactive nodes and for active node-info lookups without workload filters. Files: `src/meta-srv/src/discovery/utils.rs` Signed-off-by: Lei, HUANG <mrsatangel@gmail.com> * fix: require frontend origin address Require `StatementExecutor` to carry a concrete frontend origin address and always attach it to meta DDL query contexts. Files: `src/operator/src/statement.rs`, `src/operator/src/statement/ddl.rs`, `src/operator/src/utils.rs`, `src/frontend/src/instance/builder.rs`, `src/frontend/src/heartbeat.rs`, `src/flow/src/server.rs`, `src/cmd/src/standalone.rs`, `src/cmd/src/flownode.rs`. Signed-off-by: Lei, HUANG <mrsatangel@gmail.com> * refactor: reuse resolved frontend address Resolve the frontend peer address once in the frontend builder, store it on the instance, and reuse it for heartbeat and flow invoker origins. Files: `src/frontend/src/instance/builder.rs`, `src/frontend/src/instance.rs`, `src/frontend/src/heartbeat.rs`, `src/cmd/src/frontend.rs`, `src/cmd/src/standalone.rs`, `src/frontend/src/frontend.rs`, `src/frontend/src/heartbeat/tests.rs`. Signed-off-by: Lei, HUANG <mrsatangel@gmail.com> * fix: preserve datanode lease liveness Filter active datanode node infos through lease timestamps and workloads while preserving node info fields such as reported env vars. Files: `src/meta-srv/src/discovery/utils.rs`, `src/meta-srv/src/discovery/lease.rs`. Signed-off-by: Lei, HUANG <mrsatangel@gmail.com> * Remove stale datanode lease helper - `discovery`: remove the obsolete `alive_datanodes` helper and related tests in `src/meta-srv/src/discovery/utils.rs` and `src/meta-srv/src/discovery/lease.rs` - `integration`: update cluster and standalone setup paths in `tests-integration/src/cluster.rs` and `tests-integration/src/standalone.rs` Signed-off-by: Lei, HUANG <mrsatangel@gmail.com> * feat/env-based-region-selector-oss: simplify lease discovery - `lease-discovery`: simplify logic and remove unused utilities in `src/meta-srv/src/discovery/lease.rs` and `src/meta-srv/src/discovery/utils.rs` Signed-off-by: Lei, HUANG <mrsatangel@gmail.com> --------- Signed-off-by: Lei, HUANG <mrsatangel@gmail.com>	2026-05-15 07:35:18 +00:00
Lei, HUANG	796aae3d9f	feat(operator): allow last_row merge mode with append mode (#8065 ) * feat(operator): allow last_row merge_mode when append_mode is enabled - Update RegionOptions::validate to allow last_row merge_mode with append_mode. - Update fill_table_options_for_create to automatically set merge_mode to last_row when append_mode is enabled for LastNonNull table type. - Add unit tests in mito2 and operator to verify options validation and table creation. - Add integration test for InfluxDB write with append mode hint. Signed-off-by: Lei, HUANG <mrsatangel@gmail.com> * fix(operator): simplify append mode options Group `LastNonNull` auto-create options in a single append-mode branch. Files: - `src/operator/src/insert.rs` Signed-off-by: Lei, HUANG <mrsatangel@gmail.com> * fix: sqlness Signed-off-by: Lei, HUANG <mrsatangel@gmail.com> --------- Signed-off-by: Lei, HUANG <mrsatangel@gmail.com>	2026-05-07 07:21:37 +00:00
discord9	d2d256909f	feat(flow): parse defer on miss src table (#7980 ) * feat: parse create flow with Signed-off-by: discord9 <discord9@163.com> * feat: validate after parse Signed-off-by: discord9 <discord9@163.com> * pcr Signed-off-by: discord9 <discord9@163.com> * chore: sqlness Signed-off-by: discord9 <discord9@163.com> --------- Signed-off-by: discord9 <discord9@163.com>	2026-04-27 03:02:13 +00:00
shuiyisong	0effc30778	chore: update the opendal to 0.56 rc2 (#8003 ) * chore: update opendal version Signed-off-by: shuiyisong <xixing.sys@gmail.com> * chore: update opendal version Signed-off-by: shuiyisong <xixing.sys@gmail.com> * chore: fix test Signed-off-by: shuiyisong <xixing.sys@gmail.com> * fix: grpc init Signed-off-by: shuiyisong <xixing.sys@gmail.com> * fix: dep versions Signed-off-by: shuiyisong <xixing.sys@gmail.com> * fix: remove aws-lc-rs in reqwest Signed-off-by: shuiyisong <xixing.sys@gmail.com> * chore: rebase main and fix compile Signed-off-by: shuiyisong <xixing.sys@gmail.com> * fix: remove unused deps Signed-off-by: shuiyisong <xixing.sys@gmail.com> * Revert "fix: remove aws-lc-rs in reqwest" This reverts commit 90bfafca06f877befb36f3e54bb72fcfc4c56778. * chore: remove aws-lc-sys from blacklist Signed-off-by: shuiyisong <xixing.sys@gmail.com> * chore: fix sqlness Signed-off-by: shuiyisong <xixing.sys@gmail.com> * chore: add tls deps Signed-off-by: shuiyisong <xixing.sys@gmail.com> * fix: idemptent install in rds Signed-off-by: shuiyisong <xixing.sys@gmail.com> * fix: test Signed-off-by: shuiyisong <xixing.sys@gmail.com> * chore: use aws-lc-sys as possible Signed-off-by: shuiyisong <xixing.sys@gmail.com> * fix: lint Signed-off-by: shuiyisong <xixing.sys@gmail.com> * fix: address comments Signed-off-by: shuiyisong <xixing.sys@gmail.com> * chore: address CR issue Signed-off-by: shuiyisong <xixing.sys@gmail.com> Signed-off-by: evenyag <realevenyag@gmail.com> * fix: sync opendal compat adapter with upstream Signed-off-by: evenyag <realevenyag@gmail.com> * fix: address compat clippy warnings Signed-off-by: evenyag <realevenyag@gmail.com> --------- Signed-off-by: shuiyisong <xixing.sys@gmail.com> Signed-off-by: evenyag <realevenyag@gmail.com> Co-authored-by: evenyag <realevenyag@gmail.com>	2026-04-26 09:59:48 +00:00
LFC	c7427fa6d2	feat: json2 insert (#7981 ) * feat: json2 insert Signed-off-by: luofucong <luofc@foxmail.com> * resolve PR comments Signed-off-by: luofucong <luofc@foxmail.com> * use main greptime-proto Signed-off-by: luofucong <luofc@foxmail.com> --------- Signed-off-by: luofucong <luofc@foxmail.com>	2026-04-20 09:45:32 +00:00
yxrxy	6c924efc2f	fix: remove redundant error messages in admin functions (#7953 ) Closes #7938 Signed-off-by: yxrxy <yxrxytrigger@gmail.com>	2026-04-17 08:21:22 +00:00
fys	62013217c7	fix: cargo check -p common-meta (#7964 ) fix: moka feature	2026-04-14 08:27:22 +00:00
Ning Sun	32a2990802	feat: allow customizing trace table partitions (#7944 ) * feat: allow customizing trace table partitions * feat: add hint * feat: return error on invalid partition number * feat: add full failure/partial success/full success * chore: format * fix: address review suggestion	2026-04-13 14:10:55 +00:00
Lei, HUANG	2b4e12c358	feat: auto-align Prometheus schemas in pending rows batching (#7877 ) * feat/auto-schema-align: - Error Handling Improvements: - Removed `CatalogSnafu` context from various `.await` calls in `dashboard.rs`, `influxdb.rs`, `jaeger.rs`, `prometheus.rs`, `event.rs`, and `pipeline.rs` to streamline error handling. - Prometheus Store Enhancements: - Added support for auto-creating tables and adding missing Prometheus tag columns in `prom_store.rs` and `pending_rows_batcher.rs`. - Introduced `PendingRowsSchemaAlterer` trait for schema alterations in `pending_rows_batcher.rs`. - Test Additions: - Added tests for new Prometheus store functionalities in `prom_store.rs` and `pending_rows_batcher.rs`. - Error Message Improvements: - Enhanced error messages for catalog access in `error.rs`. - Server Configuration Updates: - Updated server configuration to include Prometheus store options in `server.rs`. Signed-off-by: Lei, HUANG <mrsatangel@gmail.com> * reformat Signed-off-by: Lei, HUANG <mrsatangel@gmail.com> * feat/auto-schema-align: ### Add DataTypes Error Handling and Column Renaming Logic - `error.rs`: Introduced a new `DataTypes` error variant to handle errors from `datatypes::error::Error`. Updated `ErrorExt` implementation to include `DataTypes`. - `pending_rows_batcher.rs`: Added functions `find_prom_special_column_names` and `rename_prom_special_columns_for_existing_schema` to handle renaming of special Prometheus columns. Updated `build_prom_create_table_schema` to simplify error handling with `ConcreteDataType`. - Tests: Added a test case `test_rename_prom_special_columns_for_existing_schema` to verify the renaming logic for Prometheus special columns. Signed-off-by: Lei, HUANG <mrsatangel@gmail.com> * feat/auto-schema-align: - Refactored `PendingRowsBatcher` to accommodate Prometheus record batches: - Introduced `accommodate_record_batch_for_target_schema` to normalize incoming record batches against existing table schemas. - Removed `collect_missing_prom_tag_columns` and `rename_prom_special_columns_for_existing_schema` in favor of the new function. - Added `unzip_logical_region_schema` to extract schema components. - Updated tests in `pending_rows_batcher.rs`: - Added tests for `accommodate_record_batch_for_target_schema` to verify handling of missing tag columns and renaming of special columns. - Ensured error handling for missing timestamp and field columns in target schema. Signed-off-by: Lei, HUANG <mrsatangel@gmail.com> * feat/auto-schema-align: ### Commit Summary - Enhancement in Table Creation Logic: Updated `prom_store.rs` to modify the handling of `table_options` during table creation. Specifically, `table_options` are now extended differently based on the `AutoCreateTableType`. For `Physical` tables, enforced `sst_format=flat` to optimize pending-rows writes by leveraging bulk memtables. Signed-off-by: Lei, HUANG <mrsatangel@gmail.com> * feat/auto-schema-align: Enhance Performance Monitoring in `pending_rows_batcher.rs` - Added performance monitoring timers to various stages of the `PendingRowsBatcher` process, including schema cache checks, table resolution, schema creation, and record batch alignment. - Improved schema handling by adding timers around schema alteration and missing column addition processes. Signed-off-by: Lei, HUANG <mrsatangel@gmail.com> * feat/auto-schema-align: - Enhance Concurrent Write Handling: Introduced `FlushRegionWrite` and `FlushWriteResult` structs to manage region writes and their results. Added `flush_region_writes_concurrently` function to handle concurrent flushing of region writes based on `should_dispatch_concurrently` logic in `pending_rows_batcher.rs`. - Testing Enhancements: Added tests for concurrent dispatching of region writes and the logic for determining concurrent dispatch in `pending_rows_batcher.rs`. Signed-off-by: Lei, HUANG <mrsatangel@gmail.com> * feat/auto-schema-align: ### Add Histogram for Flush Stage Elapsed Time - `metrics.rs`: Introduced a new `HistogramVec` named `PENDING_ROWS_BATCH_FLUSH_STAGE_ELAPSED` to track the elapsed time of pending rows batch flush stages. - `pending_rows_batcher.rs`: Replaced instances of `PENDING_ROWS_BATCH_INGEST_STAGE_ELAPSED` with `PENDING_ROWS_BATCH_FLUSH_STAGE_ELAPSED` to measure the elapsed time for various flush stages, including `flush_write_region`, `flush_concat_table_batches`, `flush_resolve_table`, `flush_fetch_partition_rule`, `flush_split_record_batch`, `flush_filter_record_batch`, `flush_resolve_region_leader`, and `flush_encode_ipc`. Signed-off-by: Lei, HUANG <mrsatangel@gmail.com> * Add design doc for physical table batching in PendingRowsBatcher Signed-off-by: Lei, HUANG <mrsatangel@gmail.com> * Add implementation plan for physical table batching in PendingRowsBatcher * feat/auto-schema-align: ### Commit Message Enhance Metric Engine with Physical Batch Processing - Add `metric-engine` Dependency: Updated `Cargo.lock` and `Cargo.toml` to include `metric-engine` as a workspace dependency. - Expose Batch Modifier Functions: Changed visibility of `TagColumnInfo`, `compute_tsid_array`, and `modify_batch_sparse` in `batch_modifier.rs` to public, and made `batch_modifier` a public module in `lib.rs`. - Implement Physical Batch Processing: - Added functions `bulk_insert_physical_region` and `bulk_insert_logical_region` in `bulk_insert.rs` to handle physical and logical batch insertions. - Updated `pending_rows_batcher.rs` to attempt physical batch processing before falling back to logical processing, including new functions `flush_batch_physical` and `flush_batch_per_logical_table`. - Enhance Testing: - Added tests for physical region passthrough and empty batch handling in `bulk_insert.rs`. - Introduced `with_mito_config` in `test_util.rs` for customized test environments. Signed-off-by: Lei, HUANG <mrsatangel@gmail.com> * feat/auto-schema-align: ### Enhance Batch Processing for Table Creation and Alteration - `prom_store.rs`: - Added `create_tables_if_missing_batch` and `add_missing_prom_tag_columns_batch` methods to handle batch creation of tables and batch alteration to add missing tag columns. - Implemented logic to determine missing tables and columns, and perform batch operations accordingly. - `pending_rows_batcher.rs`: - Updated `PendingRowsBatcher` to utilize batch methods for creating tables an adding missing columns. - Enhanced logic to resolve table schemas and accommodate record batches after batch operations. Signed-off-by: Lei, HUANG <mrsatangel@gmail.com> * perf: concurrent catalog lookups and eliminate redundant concat_batches on ingest path Replace sequential catalog_manager.table() calls with concurrent futures::future::join_all in align_table_batches_to_region_schema. This affects all three lookup loops: initial table resolution, post-create resolution, and post-alter schema refresh. Reduces O(N) sequential RPC latency to O(1) wall-clock time for requests with many distinct logical tables (e.g. Prometheus remote_write). Remove the per-logical-table concat_batches in flush_batch_physical. Instead of merging all chunks of a table into one RecordBatch before calling modify_batch_sparse, apply modify_batch_sparse directly to each chunk and collect all modified chunks for a single final concat. This eliminates one full data copy per logical table on the flush path. * refactor: extract Prometheus schema alignment helpers into prom_row_builder module Move six functions and their eight unit tests from pending_rows_batcher.rs (~2386 lines) into a new prom_row_builder.rs module (~776 lines), leaving the batcher at ~1665 lines focused on flush/worker machinery. Extracted functions: - accommodate_record_batch_for_target_schema (normalize incoming batch against existing table schema) - unzip_logical_region_schema (extract ts/field/tag columns) - build_prom_create_table_schema (build ColumnSchema vec for table creation) - align_record_batch_to_schema (reorder/fill/cast columns to target schema) - rows_to_record_batch (convert proto Rows to Arrow RecordBatch) - build_arrow_array (build Arrow arrays from proto values) Cleaned up 12 now-unused imports from pending_rows_batcher.rs. * feat/auto-schema-align: ### Enhance `PendingRowsBatcher` and `prom_row_builder` for Efficient Schema Handling - `pending_rows_batcher.rs`: - Refactored `submit` method to integrate table batch building and alignment into a single method `build_and_align_table_batches`. - Removed intermediate `RecordBatch` creation, optimizing the process by directly converting proto `RowInsertRequests` into aligned `RecordBatch`es. - Enhanced schema handling by identifying missing columns directly from proto schemas. - `prom_row_builder.rs`: - Introduced `rows_to_aligned_record_batch` for direct conversion of proto `Rows` into aligned `RecordBatch`es. - Added `identify_missing_columns_from_proto` to detect absent tag columns without intermediate `RecordBatch`. - Implemented `build_prom_create_table_schema_from_proto` to construct table schemas directly from proto schemas. Signed-off-by: Lei, HUANG <mrsatangel@gmail.com> * feat/auto-schema-align: Add elapsed time metrics for bulk insert operations - Updated `bulk_insert` method in `bulk_insert.rs` to record elapsed time metrics using `MITO_OPERATION_ELAPSED` for both physical and logical regions. - Added a new test `test_bulk_insert_records_elapsed_metric` to verify that the elapsed time metric is recorded correctly during bulk insert operations. Signed-off-by: Lei, HUANG <mrsatangel@gmail.com> * remove flush per logical region Signed-off-by: Lei, HUANG <mrsatangel@gmail.com> * feat/auto-schema-align: Refactor `flush_batch` and `flush_batch_physical` functions - Removed unused `catalog` and `schema` variables from `flush_batch` in `pending_rows_batcher.rs`. - Updated `flush_batch_physical` to directly use `ctx.current_catalog()` and `ctx.current_schema()` for resolving table names. Signed-off-by: Lei, HUANG <mrsatangel@gmail.com> * feat/auto-schema-align: ### Remove Unused Function and Associated Test - File: `src/servers/src/prom_row_builder.rs` - Removed the unused function `build_prom_create_table_schema` which was responsible for building a `Vec<ColumnSchema>` from an Arrow schema. - Deleted the associated test `test_build_prom_create_table_schema_from_request_schema` that validated the removed function. Signed-off-by: Lei, HUANG <mrsatangel@gmail.com> * feat/auto-schema-align: - Remove Test: Deleted the `test_bulk_insert_records_elapsed_metric` test from `bulk_insert.rs`. - Refactor Table Resolution: Introduced `TableResolutionPlan` struct and refactored table resolution logic in `pending_rows_batcher.rs`. - Enhance Table Handling: Added functions for collecting non-empty table rows, unique table schemas, and handling table creation and alteration in `pending_rows_batcher.rs`. - Add Tests: Implemented tests for `collect_non_empty_table_rows` and `collect_unique_table_schemas` in `pending_rows_batcher.rs`. Signed-off-by: Lei, HUANG <mrsatangel@gmail.com> * feat/auto-schema-align: - Refactor Error Handling: Updated error handling in `pending_rows_batcher.rs` and `prom_row_builder.rs` to use `Snafu` error context for more descriptive error messages. - Remove Unused Functionality: Eliminated the `rows_to_record_batch` function and related test in `prom_row_builder.rs` as it was redundant. - Simplify Function Return Types: Modified `rows_to_aligned_record_batch` in `prom_row_builder.rs` to return only `RecordBatch` without missing columns, simplifying the function's interface and related tests. Signed-off-by: Lei, HUANG <mrsatangel@gmail.com> * feat/auto-schema-align: ### Add Helper Function for Table Options in `prom_store.rs` - Introduced `fill_metric_physical_table_options` function to encapsulate logic for setting table options, ensuring the use of flat SST format and physical table metadata. - Updated `Instance` implementation to utilize the new helper function for setting table options. - Added a unit test `test_metric_physical_table_options_forces_flat_sst_format` to verify the correct application of table options. Signed-off-by: Lei, HUANG <mrsatangel@gmail.com> * feat/auto-schema-align: - Refactor `PendingRowsBatcher`: Simplified worker retrieval logic in `get_or_spawn_worker` method by using a more concise conditional check. - Metrics Update: Added `PENDING_ROWS_BATCH_FLUSH_STAGE_ELAPSED` metric in `pending_rows_batcher.rs`. - Remove Unused Code: Deleted multiple test functions related to record batch alignment and schema preparation in `pending_rows_batcher.rs` and `prom_row_builder.rs`. - Function Visibility Change: Made `build_prom_create_table_schema_from_proto` public in `prom_row_builder.rs`. Signed-off-by: Lei, HUANG <mrsatangel@gmail.com> * chore: remove plan Signed-off-by: Lei, HUANG <mrsatangel@gmail.com> * feat/auto-schema-align: ### Refactor and Simplify Schema Alteration Logic - Removed Unused Methods: Deleted `create_table_if_missing` and `add_missing_prom_tag_columns` methods from `PendingRowsSchemaAlterer` trait in `prom_store.rs` and `pending_rows_batcher.rs`. - Error Handling Improvement: Enhanced error handling in `create_tables_if_missing_batch` method to return a specific error message for unsupported `AutoCreateTableType` in `prom_store.rs`. - Visibility Change: Made `as_str` method public in `AutoCreateTableType` enum in `insert.rs` to support external access. Signed-off-by: Lei, HUANG <mrsatangel@gmail.com> * feat/auto-schema-align: ### Commit Message Improve safety in `prom_row_builder.rs` - Updated `unzip_logical_region_schema` to use `saturating_sub` for safer capacity calculation of `tag_columns`. Signed-off-by: Lei, HUANG <mrsatangel@gmail.com> * feat/auto-schema-align: Add TODO comments for future improvements in `pending_rows_batcher.rs` - Added a TODO comment to consider bounding the `flush_region_writes_concurrently` function. - Added a TODO comment to potentially limit the maximum rows to concatenate in the `flush_batch_physical` function. Signed-off-by: Lei, HUANG <mrsatangel@gmail.com> * feat/auto-schema-align: ### Commit Message Enhance error handling in `pending_rows_batcher.rs` - Updated `collect_unique_table_schemas` to return a `Result` type, enabling error handling for duplicate table names. - Modified the function to return an error when duplicate table names are found in `table_rows`. - Adjusted test cases to handle the new `Result` return type in `collect_unique_table_schemas`. Signed-off-by: Lei, HUANG <mrsatangel@gmail.com> * feat/auto-schema-align: - Refactor `partition_columns` Method: Updated the `partition_columns` method in `multi_dim.rs`, `partition.rs`, and `splitter.rs` to return a slice reference instead of a cloned vector, improving performance by avoiding unnecessary cloning. - Enhance Partition Handling: Added functions `collect_tag_columns_and_non_tag_indices` and `strip_partition_columns_from_batch` in `pending_rows_batcher.rs` to manage partition columns more efficiently, including stripping partition columns from record batches. - Update Tests: Modified existing tests and added new ones in `pending_rows_batcher.rs` to verify the functionality of partition column handling, ensuring correct behavior of the new methods. Signed-off-by: Lei, HUANG <mrsatangel@gmail.com> * feat/auto-schema-align: ### Enhance Schema Handling and Validation in `pending_rows_batcher.rs` - Schema Validation Enhancements: - Added checks for essential columns (`timestamp`, `value`) in `collect_tag_columns_and_non_tag_indices`. - Introduced `PHYSICAL_REGION_ESSENTIAL_COLUMN_COUNT` to ensure minimum column count in `strip_partition_columns_from_batch`. - Improved error handling for unexpected data types and duplicated columns. - Function Modifications: - Updated `strip_partition_columns_from_batch` to project essential columns without lookup. - Modified `flush_batch_physical` to use `essential_col_indices` instead of `non_tag_indices`. - Test Enhancements: - Added tests for schema validation, including checks for unexpected data types and duplicated columns. - Verified correct projection of essential columns in `strip_partition_columns_from_batch`. Files affected: `pending_rows_batcher.rs`, `tests`. Signed-off-by: Lei, HUANG <mrsatangel@gmail.com> * feat/auto-schema-align: - Add `smallvec` Dependency: Updated `Cargo.lock` and `Cargo.toml` to include `smallvec` as a workspace dependency. - Refactor Function: Renamed `collect_tag_columns_and_non_tag_indices` to `columns_taxonomy` in `pending_rows_batcher.rs` and updated its return type to use `SmallVec`. - Update Tests: Modified test cases in `pending_rows_batcher.rs` to reflect changes in function name and return type. Signed-off-by: Lei, HUANG <mrsatangel@gmail.com> * feat/auto-schema-align: Refactor `pending_rows_batcher.rs` to Simplify Table ID Handling - Updated `TableBatch` struct to use `TableId` directly instead of `Option<u32>` for `table_id`. - Simplified logic in `flush_batch_physical` by removing the check for `None` in `table_id`. - Adjusted related logic in `start_worker` to accommodate the change in `table_id` handling. Signed-off-by: Lei, HUANG <mrsatangel@gmail.com> * feat/auto-schema-align: ### Enhance Batch Processing Logic - `pending_rows_batcher.rs`: - Moved column taxonomy resolution inside the loop to handle schema variations across batches. - Added checks to skip processing if both tag columns and essential column indices are empty. - Tests: - Added `test_modify_batch_sparse_with_taxonomy_per_batch` to verify batch modification logic with varying schemas. Signed-off-by: Lei, HUANG <mrsatangel@gmail.com> * feat/auto-schema-align: ### Remove Primary Key Column Check in `pending_rows_batcher.rs` - Removed the check for the primary key column and other essential column names in the function `strip_partition_columns_from_batch` within `pending_rows_batcher.rs`. - Simplified the logic by eliminating the validation of column order against expected essential names. Signed-off-by: Lei, HUANG <mrsatangel@gmail.com> * feat/auto-schema-align: ### Refactor error handling and iteration in `otlp.rs` and `pending_rows_batcher.rs` - `otlp.rs`: Simplified error handling by removing `CatalogSnafu` context when awaiting table retrieval. - `pending_rows_batcher.rs`: Streamlined iteration over tables by removing unnecessary `into_iter()` calls, improving code readability and efficiency. Signed-off-by: Lei, HUANG <mrsatangel@gmail.com> * chore/metrics-for-bulk: Add timing metrics for batch processing in `pending_rows_batcher.rs` - Introduced `modify_elapsed` and `columns_taxonomy_elapsed` to measure time spent in `modify_batch_sparse` and `columns_taxonomy` functions. - Updated `flush_batch_physical` to record these metrics using `PENDING_ROWS_BATCH_FLUSH_STAGE_ELAPSED`. Signed-off-by: Lei, HUANG <mrsatangel@gmail.com> * feat/auto-schema-align: ### Commit Summary - Remove Unused Code: Eliminated the `#[allow(dead_code)]` attribute from the `compute_tsid_array` function in `batch_modifier.rs`. - Error Handling Improvement: Enhanced error handling in `flush_batch_physical` function by adjusting the `match` block in `pending_rows_batcher.rs`. - Simplify Logic: Streamlined the logic in `rows_to_aligned_record_batch` by removing unnecessary type casting in `prom_row_builder.rs`. Signed-off-by: Lei, HUANG <mrsatangel@gmail.com> * feat/auto-schema-align: Refactor `flush_batch_physical` in `pending_rows_batcher.rs`: - Moved partition column stripping logic to a single location before processing region batches. - Updated the use of `combined_batch` to `stripped_batch` for consistency in batch processing. - Removed redundant partition column stripping logic within the region batch loop. Signed-off-by: Lei, HUANG <mrsatangel@gmail.com> * feat/auto-schema-align: ### Update `batch_modifier.rs` Documentation and Parameter Naming - Enhanced documentation for `compute_tsid_array` and `modify_batch_sparse` functions to clarify their logic and parameters. - Renamed parameter `non_tag_column_indices` to `extra_column_indices` in `modify_batch_sparse` for better clarity. Signed-off-by: Lei, HUANG <mrsatangel@gmail.com> --------- Signed-off-by: Lei, HUANG <mrsatangel@gmail.com>	2026-04-01 02:45:26 +00:00
Ning Sun	e14404c677	chore: update rust toolchain to 2026-03-21 (#7849 ) * chore: update rust toolchain to 2026-03-21 * chore: new format * fix: lint * chore: resolve lint issues * chore: remove as_millis_f64 * chore: deps up	2026-03-30 12:13:14 +00:00
discord9	d7bc5ad16b	feat: add incremental read context and scan boundaries (#7848 ) * feat: add incremental read context and scan boundaries Signed-off-by: discord9 <discord9@163.com> * chore: per review Signed-off-by: discord9 <discord9@163.com> * docs: explain field Signed-off-by: discord9 <discord9@163.com> --------- Signed-off-by: discord9 <discord9@163.com>	2026-03-26 07:00:22 +00:00
discord9	56ee8baa3f	feat: admin gc table/regions (#7619 ) * feat: gc table Signed-off-by: discord9 <discord9@163.com> * test: admin gc Signed-off-by: discord9 <discord9@163.com> * chore: after rebase fix Signed-off-by: discord9 <discord9@163.com> * refactor: GcStats Signed-off-by: discord9 <discord9@163.com> * refactor: use gc ticker for admin gc Signed-off-by: discord9 <discord9@163.com> * fix: region routes override Signed-off-by: discord9 <discord9@163.com> * test: non happy path Signed-off-by: discord9 <discord9@163.com> * refactor: gc job report enum Signed-off-by: discord9 <discord9@163.com> * test: process 0 regions Signed-off-by: discord9 <discord9@163.com> * after rebase Signed-off-by: discord9 <discord9@163.com> * feat: allow manual gc to return error Signed-off-by: discord9 <discord9@163.com> * chore: update proto Signed-off-by: discord9 <discord9@163.com> * per review Signed-off-by: discord9 <discord9@163.com> * chore: timeout and update proto Signed-off-by: discord9 <discord9@163.com> * chore: udpate proto Signed-off-by: discord9 <discord9@163.com> --------- Signed-off-by: discord9 <discord9@163.com>	2026-03-06 08:25:44 +00:00
discord9	d1151b665b	feat: flow tql cte (#7702 ) * feat: flow tql cte Signed-off-by: discord9 <discord9@163.com> * fix: creating flow TQL CTE source tables lose cte part in query Signed-off-by: discord9 <discord9@163.com> * test: update sqlness result Signed-off-by: discord9 <discord9@163.com> * chore Signed-off-by: discord9 <discord9@163.com> * fix: properly canonicalize ident Signed-off-by: discord9 <discord9@163.com> * feat: even stricter check Signed-off-by: discord9 <discord9@163.com> * chore: sqlness update Signed-off-by: discord9 <discord9@163.com> * chore: after rebase fix Signed-off-by: discord9 <discord9@163.com> --------- Signed-off-by: discord9 <discord9@163.com>	2026-03-06 03:36:42 +00:00
LFC	b2074e3863	chore: upgrade DataFusion family, again (#7578 ) * chore: upgrade DataFusion family Signed-off-by: luofucong <luofc@foxmail.com> * chore: switch to released version of datafusion-pg-catalog --------- Signed-off-by: luofucong <luofc@foxmail.com> Co-authored-by: Ning Sun <sunning@greptime.com> Co-authored-by: Ning Sun <sunng@protonmail.com>	2026-03-03 07:36:39 +00:00
Weny Xu	df04267c54	fix(repartition): reject writes on deallocating regions during region merge (#7694 ) * feat(meta): add write route policy to region route with backward compatibility Signed-off-by: WenyXu <wenymedia@gmail.com> * fix(meta): use partition_expr compatibility accessor in repartition matching Signed-off-by: WenyXu <wenymedia@gmail.com> * feat(meta): introduce staging partition rule enum for repartition instructions Signed-off-by: WenyXu <wenymedia@gmail.com> * feat(datanode): plumb staging partition rule enum through heartbeat handlers Signed-off-by: WenyXu <wenymedia@gmail.com> * feat(meta): mark pending-deallocate regions as reject-all during merge staging Signed-off-by: WenyXu <wenymedia@gmail.com> * feat(partition): exclude reject-all regions from write partitioning Signed-off-by: WenyXu <wenymedia@gmail.com> * feat(mito): store staging partition rule enum in region state Signed-off-by: WenyXu <wenymedia@gmail.com> * feat(mito): reject writes in staging when partition rule is reject-all Signed-off-by: WenyXu <wenymedia@gmail.com> * feat(meta): send enter staging instruction with reject-all Signed-off-by: WenyXu <wenymedia@gmail.com> * fix(repartition): preserve reject-all on exit, merge enter-staging instructions, and allow staged bulk writes Signed-off-by: WenyXu <wenymedia@gmail.com> * refactor: refactor to ignore all writes Signed-off-by: WenyXu <wenymedia@gmail.com> * chore: apply suggestions Signed-off-by: WenyXu <wenymedia@gmail.com> * refactor: rename StagingPartitionRule to StagingPartitionDirective across staging flow Signed-off-by: WenyXu <wenymedia@gmail.com> * chore: add comments Signed-off-by: WenyXu <wenymedia@gmail.com> * chore: clippy Signed-off-by: WenyXu <wenymedia@gmail.com> * refactor: nit Signed-off-by: WenyXu <wenymedia@gmail.com> * chore: apply suggestions Signed-off-by: WenyXu <wenymedia@gmail.com> * refactor: rename Signed-off-by: WenyXu <wenymedia@gmail.com> --------- Signed-off-by: WenyXu <wenymedia@gmail.com>	2026-02-25 07:04:38 +00:00
Ruihang Xia	db46849f40	feat: track flow source tables for TQL and info schema (#7697 ) * feat: track flow source tables for TQL and info schema * handle schema matcher Signed-off-by: Ruihang Xia <waynestxia@gmail.com> * sqlness tests Signed-off-by: Ruihang Xia <waynestxia@gmail.com> * cover __name__ case Signed-off-by: Ruihang Xia <waynestxia@gmail.com> --------- Signed-off-by: Ruihang Xia <waynestxia@gmail.com>	2026-02-11 03:19:09 +00:00
Ning Sun	43afb7962a	refactor: remove session from common meta (#7698 ) * refactor: remove session dependency from common-meta * chore: add udeps * chore: format * fix: lint issues * chore: update oneshot * chore: update unused deps	2026-02-11 03:04:45 +00:00
fys	1aa80d9363	fix: incorrect-tql-explain result (#7675 )	2026-02-11 02:30:15 +00:00
Weny Xu	0ed3b83099	refactor: rename partition rule version to partition expr version (#7696 ) * refactor: rename partition rule version to partition expr version Signed-off-by: WenyXu <wenymedia@gmail.com> * chore: update proto Signed-off-by: WenyXu <wenymedia@gmail.com> * chore: clippy Signed-off-by: WenyXu <wenymedia@gmail.com> --------- Signed-off-by: WenyXu <wenymedia@gmail.com>	2026-02-10 10:12:47 +00:00
LFC	8c23b29725	refactor: remove the `RawTableMeta` and `RawTableInfo` to make codes more concise (#7626 ) * refactor: remove the `RawTableMeta` and `RawTableInfo` to make codes more concise Signed-off-by: luofucong <luofc@foxmail.com> * fix ci Signed-off-by: luofucong <luofc@foxmail.com> * fix ci Signed-off-by: luofucong <luofc@foxmail.com> --------- Signed-off-by: luofucong <luofc@foxmail.com>	2026-02-10 07:10:04 +00:00
Yingwen	39d3744f4f	fix: bump prometheus to 0.14 (#7686 ) * chore: add metrics test for fs backend Signed-off-by: evenyag <realevenyag@gmail.com> * test: test opendal metrics Signed-off-by: evenyag <realevenyag@gmail.com> * fix: update prometheus to 0.14 The opendal metrics require 0.14. Signed-off-by: evenyag <realevenyag@gmail.com> --------- Signed-off-by: evenyag <realevenyag@gmail.com>	2026-02-09 09:51:00 +00:00
Weny Xu	8026b23834	feat: partition rule version validation for writes and staging (#7628 ) * feat: verify partition rule Signed-off-by: WenyXu <wenymedia@gmail.com> * feat: add partition version cache Signed-off-by: WenyXu <wenymedia@gmail.com> * chore: header check Signed-off-by: WenyXu <wenymedia@gmail.com> * chore: fmt toml Signed-off-by: WenyXu <wenymedia@gmail.com> * refactor: minor refactor Signed-off-by: WenyXu <wenymedia@gmail.com> * chore: header Signed-off-by: WenyXu <wenymedia@gmail.com> * chore: fix clippy Signed-off-by: WenyXu <wenymedia@gmail.com> * fix: fix unit tests Signed-off-by: WenyXu <wenymedia@gmail.com> * refactor: minor refactor Signed-off-by: WenyXu <wenymedia@gmail.com> * chore: apply suggestions Signed-off-by: WenyXu <wenymedia@gmail.com> * chore: nit Signed-off-by: WenyXu <wenymedia@gmail.com> * chore: nit Signed-off-by: WenyXu <wenymedia@gmail.com> * chore: apply suggestions Signed-off-by: WenyXu <wenymedia@gmail.com> * chore: apply suggestions Signed-off-by: WenyXu <wenymedia@gmail.com> --------- Signed-off-by: WenyXu <wenymedia@gmail.com>	2026-02-06 12:16:34 +00:00
discord9	3006ac54af	fix: get correct table info when insert create/alter table (#7641 ) * fix: set partition column&other newly acquired table info Signed-off-by: discord9 <discord9@163.com> * fix: update after alter table info Signed-off-by: discord9 <discord9@163.com> * refactor: use table name instead Signed-off-by: discord9 <discord9@163.com> * chore: log Signed-off-by: discord9 <discord9@163.com> --------- Signed-off-by: discord9 <discord9@163.com>	2026-01-30 10:32:31 +00:00
Ning Sun	124478f577	feat: use arrow-pg for postgres data encoding (#7591 ) * feat: use arrow-pg for encode_row * refactor: remove bytea and datetime module * feat: port more encodings to arrow-pg * feat: implement intervalstyle * chore: format * chore: remove error that is no longer used * chore: use released arrow-pg * Apply suggestions from code review Co-authored-by: LFC <990479+MichaelScofield@users.noreply.github.com> --------- Co-authored-by: LFC <990479+MichaelScofield@users.noreply.github.com>	2026-01-28 02:34:02 +00:00
fys	3c915a382b	fix: unit tests when enterprise feature is enabled (#7625 )	2026-01-28 02:13:17 +00:00
Ruihang Xia	c83868c4eb	feat: partition rule simplifier (#7622 ) * basic impl Signed-off-by: Ruihang Xia <waynestxia@gmail.com> * reuse collider Signed-off-by: Ruihang Xia <waynestxia@gmail.com> * simplify range helpers Signed-off-by: Ruihang Xia <waynestxia@gmail.com> * notes Signed-off-by: Ruihang Xia <waynestxia@gmail.com> * update unit test resule Signed-off-by: Ruihang Xia <waynestxia@gmail.com> --------- Signed-off-by: Ruihang Xia <waynestxia@gmail.com>	2026-01-27 14:31:20 +00:00
Weny Xu	4fb61047cb	test: add integration tests for repartition (#7560 ) * test: add integration tests for mito repartition Signed-off-by: WenyXu <wenymedia@gmail.com> * chore: update test result Signed-off-by: WenyXu <wenymedia@gmail.com> * test: add integration tests for metric repartition Signed-off-by: WenyXu <wenymedia@gmail.com> * fix: correct results Signed-off-by: WenyXu <wenymedia@gmail.com> * chore: enable tests for object store Signed-off-by: WenyXu <wenymedia@gmail.com> * test: add compaction and gc Signed-off-by: WenyXu <wenymedia@gmail.com> * fix: fix unit test Signed-off-by: WenyXu <wenymedia@gmail.com> * chore: apply suggestions Signed-off-by: WenyXu <wenymedia@gmail.com> * more cases Signed-off-by: WenyXu <wenymedia@gmail.com> * fix: file ref also in repart mapping Signed-off-by: discord9 <discord9@163.com> * chore: apply suggestions Signed-off-by: WenyXu <wenymedia@gmail.com> * chore: set a longer timeout for mock metasrv channel manager Signed-off-by: WenyXu <wenymedia@gmail.com> --------- Signed-off-by: WenyXu <wenymedia@gmail.com> Signed-off-by: discord9 <discord9@163.com> Co-authored-by: discord9 <discord9@163.com>	2026-01-22 10:14:40 +00:00
Weny Xu	25687bb282	feat: add ddl timeout/wait options, repartition `WITH` parsing, meta-client startup refactor (#7589 ) * feat: add ddl request timeouts and unify meta client startup Signed-off-by: WenyXu <wenymedia@gmail.com> * feat: omplement ALTER TABLE repartition DDL options parsing Signed-off-by: WenyXu <wenymedia@gmail.com> * test: add sqlness tests Signed-off-by: WenyXu <wenymedia@gmail.com> * test: add unit tests Signed-off-by: WenyXu <wenymedia@gmail.com> * feat: pass timeout argument to procedure Signed-off-by: WenyXu <wenymedia@gmail.com> * chore: apply suggestions Signed-off-by: WenyXu <wenymedia@gmail.com> * chore: refine comments Signed-off-by: WenyXu <wenymedia@gmail.com> * test: assert timeout Signed-off-by: WenyXu <wenymedia@gmail.com> * chore: apply suggestions Signed-off-by: WenyXu <wenymedia@gmail.com> * chore: update proto Signed-off-by: WenyXu <wenymedia@gmail.com> --------- Signed-off-by: WenyXu <wenymedia@gmail.com>	2026-01-20 09:26:53 +00:00
LFC	e64c31e59a	chore: upgrade DataFusion family (#7558 ) * chore: upgrade DataFusion family Signed-off-by: luofucong <luofc@foxmail.com> * use main proto Signed-off-by: luofucong <luofc@foxmail.com> * fix ci Signed-off-by: luofucong <luofc@foxmail.com> --------- Signed-off-by: luofucong <luofc@foxmail.com>	2026-01-14 14:02:31 +00:00
Ruihang Xia	45b4067721	feat: always canonicalize partition expr (#7553 ) * feat: always canonicalize partition expr Signed-off-by: Ruihang Xia <waynestxia@gmail.com> * fix ut assertion Signed-off-by: Ruihang Xia <waynestxia@gmail.com> --------- Signed-off-by: Ruihang Xia <waynestxia@gmail.com>	2026-01-12 07:24:29 +00:00
Weny Xu	567d3e66e9	feat: integrate repartition procedure into `DdlManager` (#7548 ) * feat: add repartition procedure factory support to DdlManager - Introduce RepartitionProcedureFactory trait for creating and registering repartition procedures - Implement DefaultRepartitionProcedureFactory for metasrv with full support - Implement StandaloneRepartitionProcedureFactory for standalone (unsupported) - Add procedure loader registration for RepartitionProcedure and RepartitionGroupProcedure - Add helper methods to TableMetadataAllocator for allocator access - Add error types for repartition procedure operations - Update DdlManager to accept and use RepartitionProcedureFactoryRef Signed-off-by: WenyXu <wenymedia@gmail.com> * feat: integrate repartition procedure into DdlManager - Add submit_repartition_task() to handle repartition from alter table - Route Repartition operations in submit_alter_table_task() to repartition factory - Refactor: rename submit_procedure() to execute_procedure_and_wait() - Make all DDL operations wait for completion by default - Add submit_procedure() for fire-and-forget submissions - Add CreateRepartitionProcedure error type - Add placeholder Repartition handling in grpc-expr (unsupported) - Update greptime-proto dependency Signed-off-by: WenyXu <wenymedia@gmail.com> * feat: implement ALTER TABLE REPARTITION procedure submission Signed-off-by: WenyXu <wenymedia@gmail.com> * refactor(repartition): handle central region in apply staging manifest - Introduce ApplyStagingManifestInstructions struct to organize instructions - Add special handling for central region when applying staging manifests - Transition state from UpdateMetadata to RepartitionEnd after applying staging manifests - Remove next_state() method in RepartitionStart and inline state transitions - Improve logging and expression serialization in DDL statement executor - Move repartition tests from standalone to distributed test suite Signed-off-by: WenyXu <wenymedia@gmail.com> * chore: apply suggestions from CR Signed-off-by: WenyXu <wenymedia@gmail.com> * chore: update proto Signed-off-by: WenyXu <wenymedia@gmail.com> --------- Signed-off-by: WenyXu <wenymedia@gmail.com>	2026-01-09 08:37:21 +00:00
Weny Xu	aadfcd7821	feat(repartition): implement validation logic for repartition table (#7538 ) * feat(repartition): implement validation logic for repartition_table Signed-off-by: WenyXu <wenymedia@gmail.com> * refactor: minor refactor Signed-off-by: WenyXu <wenymedia@gmail.com> * test: update sqlness Signed-off-by: WenyXu <wenymedia@gmail.com> --------- Signed-off-by: WenyXu <wenymedia@gmail.com>	2026-01-08 12:18:39 +00:00
Weny Xu	ada4666e10	refactor: remove `region_numbers` from `TableMeta` and `TableInfo` (#7519 ) * refactor: remove `region_numbers` from `TableMeta` and `TableInfo` Signed-off-by: WenyXu <wenymedia@gmail.com> * feat: create partitions from region route Signed-off-by: WenyXu <wenymedia@gmail.com> * fix: fix build Signed-off-by: WenyXu <wenymedia@gmail.com> --------- Signed-off-by: WenyXu <wenymedia@gmail.com>	2026-01-06 13:21:36 +00:00
AntiTopQuark	cea578244c	fix(compaction): unify behavior of database compaction options with TTL (#7402 ) * fix: fix dynamic compactiom option,unify behavior of database compaction options with TTL option Signed-off-by: AntiTopQuark <AntiTopQuark1350@outlook.com> * fix unit test Signed-off-by: AntiTopQuark <AntiTopQuark1350@outlook.com> * add debug log Signed-off-by: AntiTopQuark <AntiTopQuark1350@outlook.com> --------- Signed-off-by: AntiTopQuark <AntiTopQuark1350@outlook.com>	2025-12-25 02:34:42 +00:00
LFC	dc9f3a702e	refactor: explicitly define json struct to ingest jsonbench data (#7462 ) ingest jsonbench data Signed-off-by: luofucong <luofc@foxmail.com>	2025-12-24 07:30:22 +00:00
Ruihang Xia	564cc0c750	feat: table/column/flow COMMENT (#7060 ) * initial impl Signed-off-by: Ruihang Xia <waynestxia@gmail.com> * simplify impl Signed-off-by: Ruihang Xia <waynestxia@gmail.com> * sqlness test Signed-off-by: Ruihang Xia <waynestxia@gmail.com> * avoid unimplemented panic Signed-off-by: Ruihang Xia <waynestxia@gmail.com> * validate flow Signed-off-by: Ruihang Xia <waynestxia@gmail.com> * update sqlness result Signed-off-by: Ruihang Xia <waynestxia@gmail.com> * fix table column comment Signed-off-by: Ruihang Xia <waynestxia@gmail.com> * table level comment Signed-off-by: Ruihang Xia <waynestxia@gmail.com> * simplify table info serde Signed-off-by: Ruihang Xia <waynestxia@gmail.com> * don't txn Signed-off-by: Ruihang Xia <waynestxia@gmail.com> * remove empty trait Signed-off-by: Ruihang Xia <waynestxia@gmail.com> * wip: procedure Signed-off-by: Ruihang Xia <waynestxia@gmail.com> * update proto Signed-off-by: Ruihang Xia <waynestxia@gmail.com> * grpc support Signed-off-by: Ruihang Xia <waynestxia@gmail.com> * Apply suggestions from code review Co-authored-by: dennis zhuang <killme2008@gmail.com> Co-authored-by: LFC <990479+MichaelScofield@users.noreply.github.com> Signed-off-by: Ruihang Xia <waynestxia@gmail.com> * try from pb struct Signed-off-by: Ruihang Xia <waynestxia@gmail.com> * doc comment Signed-off-by: Ruihang Xia <waynestxia@gmail.com> * check unchanged fast case Signed-off-by: Ruihang Xia <waynestxia@gmail.com> * tune errors Signed-off-by: Ruihang Xia <waynestxia@gmail.com> * fix merge error Signed-off-by: Ruihang Xia <waynestxia@gmail.com> * use try_as_raw_value Signed-off-by: Ruihang Xia <waynestxia@gmail.com> --------- Signed-off-by: Ruihang Xia <waynestxia@gmail.com> Co-authored-by: dennis zhuang <killme2008@gmail.com> Co-authored-by: LFC <990479+MichaelScofield@users.noreply.github.com>	2025-12-10 15:08:47 +00:00
Lei, HUANG	11ecb7a28a	refactor(servers): bulk insert service (#7329 ) * refactor/bulk-insert-service: refactor: decode FlightData early in put_record_batch pipeline - Move FlightDecoder usage from Inserter up to PutRecordBatchRequestStream, passing decoded RecordBatch and schema bytes instead of raw FlightData. - Eliminate redundant per-request decoding/encoding in Inserter; encode once and reuse for all region requests. - Streamline GrpcQueryHandler trait and implementations to accept PutRecordBatchRequest containing pre-decoded data. Signed-off-by: Lei, HUANG <mrsatangel@gmail.com> * refactor/bulk-insert-service: feat: stream-based bulk insert with per-batch responses - Introduce handle_put_record_batch_stream() to process Flight DoPut streams - Resolve table & permissions once, yield (request_id, AffectedRows) per batch - Replace loop-over-request with async-stream in frontend & server - Make PutRecordBatchRequestStream public for cross-crate usage Signed-off-by: Lei, HUANG <mrsatangel@gmail.com> * refactor/bulk-insert-service: fix: propagate request_id with errors in bulk insert stream Changes the bulk-insert stream item type from Result<(i64, AffectedRows), E> to (i64, Result<AffectedRows, E>) so every emitted tuple carries the request_id even on failure, letting callers correlate errors with the originating request. Signed-off-by: Lei, HUANG <mrsatangel@gmail.com> * refactor/bulk-insert-service: refactor: unify DoPut response stream to return DoPutResponse Replace the tuple (i64, Result<AffectedRows>) with Result<DoPutResponse> throughout the gRPC bulk-insert path so the handler, adapter and server all speak the same type. Signed-off-by: Lei, HUANG <mrsatangel@gmail.com> * refactor/bulk-insert-service: feat: add elapsed_secs to DoPutResponse for bulk-insert timing - DoPutResponse now carries elapsed_secs field - Frontend measures and attaches insert duration - Server observes GRPC_BULK_INSERT_ELAPSED metric from response Signed-off-by: Lei, HUANG <mrsatangel@gmail.com> * refactor/bulk-insert-service: refactor: unify Bytes import in flight module - Replace `bytes::Bytes` with `Bytes` alias for consistency - Remove redundant `ProstBytes` alias Signed-off-by: Lei, HUANG <mrsatangel@gmail.com> * refactor/bulk-insert-service: fix: terminate gRPC stream on error and optimize FlightData handling - Stop retrying on stream errors in gRPC handler - Replace Vec1 indexing with into_iter().next() for FlightData - Remove redundant clones in bulk_insert and flight modules Signed-off-by: Lei, HUANG <mrsatangel@gmail.com> * refactor/bulk-insert-service: Improve permission check placement in `grpc.rs` - Moved the permission check for `BulkInsert` to occur before resolving the table reference in `GrpcQueryHandler` implementation. - Ensures permission validation is performed earlier in the process, potentially avoiding unnecessary operations if permission is denied. Signed-off-by: Lei, HUANG <mrsatangel@gmail.com> * refactor/bulk-insert-service: Refactor Bulk Insert Handling in gRPC - `grpc.rs`: - Switched from `async_stream::stream` to `async_stream::try_stream` for error handling. - Removed `body_size` parameter and added `flight_data` to `handle_bulk_insert`. - Simplified error handling and permission checks in `GrpcQueryHandler`. - `bulk_insert.rs`: - Added `raw_flight_data` parameter to `handle_bulk_insert`. - Calculated `body_size` from `raw_flight_data` and removed redundant encoding logic. - `flight.rs`: - Replaced `body_size` with `flight_data` in `PutRecordBatchRequest`. - Updated memory usage calculation to include `flight_data` components. Signed-off-by: Lei, HUANG <mrsatangel@gmail.com> * refactor/bulk-insert-service: perf(bulk_insert): encode record batch once per datanode Move FlightData encoding outside the per-region loop so the same encoded bytes are reused when mask.select_all(), eliminating redundant serialisation work. Signed-off-by: Lei, HUANG <mrsatangel@gmail.com> --------- Signed-off-by: Lei, HUANG <mrsatangel@gmail.com>	2025-12-04 07:08:02 +00:00
dennis zhuang	1f91422bae	feat!: improve mysql/pg compatibility (#7315 ) * feat(mysql): add SHOW WARNINGS support and return warnings for unsupported SET variables Signed-off-by: Dennis Zhuang <killme2008@gmail.com> * feat(function): add MySQL IF() function and PostgreSQL description functions for connector compatibility Signed-off-by: Dennis Zhuang <killme2008@gmail.com> * fix: show tables for mysql Signed-off-by: Dennis Zhuang <killme2008@gmail.com> * fix: partitions table in information_schema and add starrocks external catalog compatibility Signed-off-by: Dennis Zhuang <killme2008@gmail.com> * refactor: async udf Signed-off-by: Dennis Zhuang <killme2008@gmail.com> * fix: set warnings Signed-off-by: Dennis Zhuang <killme2008@gmail.com> * feat: impl pg_my_temp_schema and make description functions simple Signed-off-by: Dennis Zhuang <killme2008@gmail.com> * test: add test for issue 7313 Signed-off-by: Dennis Zhuang <killme2008@gmail.com> * feat: apply suggestions Signed-off-by: Dennis Zhuang <killme2008@gmail.com> * fix: partition_expression and partition_description Signed-off-by: Dennis Zhuang <killme2008@gmail.com> * fix: test Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com> Signed-off-by: Dennis Zhuang <killme2008@gmail.com> * fix: unit tests Signed-off-by: Dennis Zhuang <killme2008@gmail.com> * fix: saerch_path only works for pg Signed-off-by: Dennis Zhuang <killme2008@gmail.com> * feat: improve warnings processing Signed-off-by: Dennis Zhuang <killme2008@gmail.com> * fix: warnings while writing affected rows and refactor Signed-off-by: Dennis Zhuang <killme2008@gmail.com> * chore: improve ShobjDescriptionFunction signature Signed-off-by: Dennis Zhuang <killme2008@gmail.com> * refactor: array_to_boolean Signed-off-by: Dennis Zhuang <killme2008@gmail.com> --------- Signed-off-by: Dennis Zhuang <killme2008@gmail.com> Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>	2025-12-01 20:41:14 +00:00
LFC	fdab75ce27	feat: simple read write new json type values (#7175 ) feat: basic json read and write Signed-off-by: luofucong <luofc@foxmail.com>	2025-11-27 12:40:35 +00:00
fys	020477994b	feat: add some configurable points (#7227 ) * feat: enhance extension * fix: cr * move information schema table factories trait to standalone * fix: self cr * remove extension factory * refactor * remove extension filed from greptime options struct * refactor * minor refactor * fix: cargo check * fix: clippy * fix: license check * feat: enhance grpc and http configurator in servers crate * grpc builder configurator * remove unused file * complete the remaining expansion points. * fix: self-cr * rename * fix: typo	2025-11-27 09:21:46 +00:00

1 2 3 4 5 ...

375 Commits