Compare commits


22 Commits

Author SHA1 Message Date
discord9
3d17d195a3 feat: flownode-to-frontend load balancing with load guessing 2025-06-08 14:17:32 +08:00
Weny Xu
0d4f27a699 fix: convert JSON type to JSON string in COPY TABLE TO statement (#6255)
* fix: convert JSON type to JSON string in COPY TABLE TO statement

* chore: apply suggestions from CR

* chore: apply suggestions from CR
2025-06-06 02:23:57 +00:00
Ruihang Xia
c4da8bb69d feat: don't allow creating logical table with partitions (#6249)
* feat: don't allow creating logical table with partitions

Signed-off-by: Ruihang Xia <waynestxia@gmail.com>

* fix clippy

Signed-off-by: Ruihang Xia <waynestxia@gmail.com>

---------

Signed-off-by: Ruihang Xia <waynestxia@gmail.com>
2025-06-05 12:38:47 +00:00
discord9
0bd8856e2f chore: pub flow info (#6253)
* chore: make all flow info's fields public

* chore: expose flow_route

* chore: more pub
2025-06-05 12:34:11 +00:00
Lei, HUANG
92c5a9f5f4 chore: allow numeric values in alter statements (#6252)
Enhance `alter_parser.rs` to support numeric values (see the parser sketch after the commit list):

 - Updated `parse_string_options` function in `alter_parser.rs` to handle numeric literals in addition to string literals and `NULL` for alter table statements.
 - Added a new test `test_parse_alter_with_numeric_value` in `alter_parser.rs` to verify the parsing of numeric values in alter table options.
2025-06-05 02:16:53 +00:00
Weny Xu
80c5af0ecf fix: ignore incomplete WAL entries during read (#6251)
* fix: ignore incomplete entry

* fix: fix unit tests
2025-06-04 11:16:42 +00:00
LFC
7afb77fd35 fix: add "query" options to standalone (#6248) 2025-06-04 08:47:31 +00:00
discord9
0b9af77fe9 chore: test sleep longer (#6247)
* chore: test sleep longer

* Windows timer resolution is 15.6 ms, so the sleep needs to be longer
2025-06-04 08:18:44 +00:00
discord9
cbafb6e00b feat(flow): flow streaming mode in list expr support (#6229)
* feat: flow streaming in list support

* chore: per review

* chore: per review

* fix: expr correct type
2025-06-04 08:05:20 +00:00
LFC
744a754246 fix: add missing features (#6245) 2025-06-04 07:13:39 +00:00
fys
9cd4a2c525 feat: add trigger ddl manager (#6228)
* feat: add trigger ddl manager

* chore: reduce the number of cfg feature code blocks

* upgrade greptime-proto

* chore: upgrade greptime-proto
2025-06-04 06:38:02 +00:00
liyang
180920327b ci: add option to choose whether to upload artifacts to S3 in the development build (#6232)
ci: add option to choose whether to upload artifacts to S3 in the development build
2025-06-04 03:49:53 +00:00
Yingwen
ee4f830be6 fix: do not accommodate fields for multi-value protocol (#6237) 2025-06-04 01:10:52 +00:00
shuiyisong
69975f1f71 feat: pipeline with insert options (#6192)
* feat: pipeline recognize hints from exec

* chore: rename and add test

* chore: minor improve

* chore: rename and add comments

* fix: typos

* chore: remove unnecessary clone fn

* chore: group metrics

* chore: use struct in transform output enum

* chore: update hint prefix
2025-06-03 18:46:48 +00:00
discord9
38cac301f2 refactor(flow): limit the size of query (#6216)
* refactor: not wait for slow query

* chore: clippy

* chore: fmt

* WIP: time range lock

* WIP

* refactor: rm over-complicated query pool

* chore: add more metrics & remove SQL from slow query metrics
2025-06-03 12:27:07 +00:00
Yuhan Wang
083c22b90a refactor: extract some common functions and structs in election module (#6172)
* refactor: extract some common functions and structs in election module

* chore: add comments and modify a function name

* chore: add comments and modify a function name

* fix: missing 2 lines in license header

* fix: acqrel

* chore: apply comment suggestions

* Update src/meta-srv/src/election.rs

Co-authored-by: jeremyhi <jiachun_feng@proton.me>

---------

Co-authored-by: jeremyhi <jiachun_feng@proton.me>
2025-06-03 11:31:30 +00:00
Lei, HUANG
fdd164c0fa fix(mito): revert initial builder capacity for TimeSeriesMemtable (#6231)
* Enhance `Series` initialization and capacity management:

 - **`simple_bulk_memtable.rs`**: Updated the `Series` initialization to use `with_capacity` with a specified capacity of 8192, improving memory management.
 - **`time_series.rs`**: Introduced `with_capacity` method in `Series` to allow custom initial capacity for `ValueBuilder`. Adjusted `INITIAL_BUILDER_CAPACITY` to 16 for more efficient memory usage. Added a new `new` method to maintain backward compatibility.

* Adjust memory allocation in memtable:

 - **`simple_bulk_memtable.rs`**: Reduced the initial capacity of `Series` from 8192 to 1024 to optimize memory usage.
 - **`time_series.rs`**: Decreased `INITIAL_BUILDER_CAPACITY` from 16 to 4 to improve efficiency in vector building.
2025-06-03 08:25:02 +00:00
Zhenchi
078afb2bd6 feat: bloom filter index applier support or eq chain (#6227)
* feat: bloom filter index applier support or eq chain

Signed-off-by: Zhenchi <zhongzc_arch@outlook.com>

* address comments

Signed-off-by: Zhenchi <zhongzc_arch@outlook.com>

---------

Signed-off-by: Zhenchi <zhongzc_arch@outlook.com>
2025-06-03 08:08:19 +00:00
localhost
477e4cc344 chore: add pg and mysql as default features in cli (#6230) 2025-06-03 07:09:26 +00:00
Lei, HUANG
078d83cec2 chore: add some metrics to grafana dashboard (#6169)
* add compaction elapsed time avg and bulk request convert elapsed time to grafana dashboard

* fix: standalone dashboard conversion

* chore: newline

---------

Co-authored-by: Yingwen <realevenyag@gmail.com>
2025-06-03 03:33:11 +00:00
liyang
7705d84d83 docs: fix bad link (#6222)
* docs: fix bad link

* Update how-to-profile-memory.md
2025-06-03 03:19:10 +00:00
dennis zhuang
0d81400bb4 feat: supports select @@session.time_zone (#6212) 2025-06-03 02:32:19 +00:00
102 changed files with 3238 additions and 672 deletions
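
Commit `92c5a9f5f4` above extends `parse_string_options` in `alter_parser.rs` to accept numeric literals, but that file is not among the diffs below. The following is a minimal, hypothetical sketch of the idea using only `sqlparser`'s public token types; the function name and error handling are illustrative, not the project's actual parser code.

```rust
use sqlparser::keywords::Keyword;
use sqlparser::tokenizer::Token;

/// Illustrative only: extract the value of an `ALTER TABLE ... SET <key> = <value>` option,
/// now accepting numeric literals (e.g. `ttl = 3600`) in addition to quoted strings and NULL.
fn option_value(token: &Token) -> Result<Option<String>, String> {
    match token {
        Token::SingleQuotedString(s) => Ok(Some(s.clone())),
        // The new case: keep a numeric literal as its textual form.
        Token::Number(n, _) => Ok(Some(n.clone())),
        // NULL clears the option.
        Token::Word(w) if w.keyword == Keyword::NULL => Ok(None),
        other => Err(format!("unexpected option value: {other}")),
    }
}
```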

View File

@@ -55,6 +55,11 @@ on:
description: Build and push images to DockerHub and ACR
required: false
default: true
upload_artifacts_to_s3:
type: boolean
description: Whether upload artifacts to s3
required: false
default: false
cargo_profile:
type: choice
description: The cargo profile to use in building GreptimeDB.
@@ -238,7 +243,7 @@ jobs:
version: ${{ needs.allocate-runners.outputs.version }}
push-latest-tag: false # Don't push the latest tag to registry.
dev-mode: true # Only build the standard images.
- name: Echo Docker image tag to step summary
run: |
echo "## Docker Image Tag" >> $GITHUB_STEP_SUMMARY
@@ -281,7 +286,7 @@ jobs:
aws-cn-access-key-id: ${{ secrets.AWS_CN_ACCESS_KEY_ID }}
aws-cn-secret-access-key: ${{ secrets.AWS_CN_SECRET_ACCESS_KEY }}
aws-cn-region: ${{ vars.AWS_RELEASE_BUCKET_REGION }}
upload-to-s3: false
upload-to-s3: ${{ inputs.upload_artifacts_to_s3 }}
dev-mode: true # Only build the standard images(exclude centos images).
push-latest-tag: false # Don't push the latest tag to registry.
update-version-info: false # Don't update the version info in S3.

Cargo.lock generated
View File

@@ -4876,7 +4876,7 @@ dependencies = [
[[package]]
name = "greptime-proto"
version = "0.1.0"
source = "git+https://github.com/GreptimeTeam/greptime-proto.git?rev=442348b2518c0bf187fb1ad011ba370c38b96cc4#442348b2518c0bf187fb1ad011ba370c38b96cc4"
source = "git+https://github.com/GreptimeTeam/greptime-proto.git?rev=454c52634c3bac27de10bf0d85d5533eed1cf03f#454c52634c3bac27de10bf0d85d5533eed1cf03f"
dependencies = [
"prost 0.13.5",
"serde",

View File

@@ -132,7 +132,7 @@ etcd-client = "0.14"
fst = "0.4.7"
futures = "0.3"
futures-util = "0.3"
greptime-proto = { git = "https://github.com/GreptimeTeam/greptime-proto.git", rev = "442348b2518c0bf187fb1ad011ba370c38b96cc4" }
greptime-proto = { git = "https://github.com/GreptimeTeam/greptime-proto.git", rev = "454c52634c3bac27de10bf0d85d5533eed1cf03f" }
hex = "0.4"
http = "1"
humantime = "2.1"

View File

@@ -1,6 +1,6 @@
# Profile memory usage of GreptimeDB
This crate provides an easy approach to dump memory profiling info. A set of ready to use scripts is provided in [docs/how-to/memory-profile-scripts](docs/how-to/memory-profile-scripts).
This crate provides an easy approach to dump memory profiling info. A set of ready to use scripts is provided in [docs/how-to/memory-profile-scripts](./memory-profile-scripts/scripts).
## Prerequisites
### jemalloc

View File

@@ -25,7 +25,7 @@
"editable": true,
"fiscalYearStartMonth": 0,
"graphTooltip": 1,
"id": 5,
"id": 7,
"links": [],
"panels": [
{
@@ -4476,7 +4476,7 @@
"axisPlacement": "auto",
"barAlignment": 0,
"barWidthFactor": 0.6,
"drawStyle": "line",
"drawStyle": "points",
"fillOpacity": 0,
"gradientMode": "none",
"hideFrom": {
@@ -4553,9 +4553,22 @@
"legendFormat": "[{{instance}}]-[{{pod}}]-[{{stage}}]-p99",
"range": true,
"refId": "A"
},
{
"datasource": {
"type": "prometheus",
"uid": "${metrics}"
},
"editorMode": "code",
"expr": "sum by(instance, pod, le, stage) (rate(greptime_mito_compaction_stage_elapsed_sum{instance=~\"$datanode\"}[$__rate_interval]))/sum by(instance, pod, le, stage) (rate(greptime_mito_compaction_stage_elapsed_count{instance=~\"$datanode\"}[$__rate_interval]))",
"hide": false,
"instant": false,
"legendFormat": "[{{instance}}]-[{{pod}}]-[{{stage}}]-avg",
"range": true,
"refId": "B"
}
],
"title": "Compaction P99 per Instance by Stage",
"title": "Compaction Elapsed Time per Instance by Stage",
"type": "timeseries"
},
{
@@ -5546,13 +5559,131 @@
"title": "Region Worker Handle Bulk Insert Requests",
"type": "timeseries"
},
{
"datasource": {
"type": "prometheus",
"uid": "${metrics}"
},
"description": "Per-stage elapsed time for region worker to decode requests.",
"fieldConfig": {
"defaults": {
"color": {
"mode": "palette-classic"
},
"custom": {
"axisBorderShow": false,
"axisCenteredZero": false,
"axisColorMode": "text",
"axisLabel": "",
"axisPlacement": "auto",
"barAlignment": 0,
"barWidthFactor": 0.6,
"drawStyle": "points",
"fillOpacity": 0,
"gradientMode": "none",
"hideFrom": {
"legend": false,
"tooltip": false,
"viz": false
},
"insertNulls": false,
"lineInterpolation": "linear",
"lineWidth": 1,
"pointSize": 5,
"scaleDistribution": {
"type": "linear"
},
"showPoints": "auto",
"spanNulls": false,
"stacking": {
"group": "A",
"mode": "none"
},
"thresholdsStyle": {
"mode": "off"
}
},
"mappings": [],
"thresholds": {
"mode": "absolute",
"steps": [
{
"color": "green"
},
{
"color": "red",
"value": 80
}
]
},
"unit": "s"
},
"overrides": []
},
"gridPos": {
"h": 8,
"w": 12,
"x": 12,
"y": 117
},
"id": 338,
"options": {
"legend": {
"calcs": [
"lastNotNull"
],
"displayMode": "table",
"placement": "bottom",
"showLegend": true
},
"tooltip": {
"hideZeros": false,
"mode": "single",
"sort": "none"
}
},
"pluginVersion": "12.0.0",
"targets": [
{
"datasource": {
"type": "prometheus",
"uid": "${metrics}"
},
"disableTextWrap": false,
"editorMode": "builder",
"expr": "histogram_quantile(0.95, sum by(le, instance, stage, pod) (rate(greptime_datanode_convert_region_request_bucket[$__rate_interval])))",
"fullMetaSearch": false,
"includeNullMetadata": true,
"instant": false,
"legendFormat": "[{{instance}}]-[{{pod}}]-[{{stage}}]-P95",
"range": true,
"refId": "A",
"useBackend": false
},
{
"datasource": {
"type": "prometheus",
"uid": "${metrics}"
},
"editorMode": "code",
"expr": "sum by(le,instance, stage, pod) (rate(greptime_datanode_convert_region_request_sum[$__rate_interval]))/sum by(le,instance, stage, pod) (rate(greptime_datanode_convert_region_request_count[$__rate_interval]))",
"hide": false,
"instant": false,
"legendFormat": "[{{instance}}]-[{{pod}}]-[{{stage}}]-AVG",
"range": true,
"refId": "B"
}
],
"title": "Region Worker Convert Requests",
"type": "timeseries"
},
{
"collapsed": true,
"gridPos": {
"h": 1,
"w": 24,
"x": 0,
"y": 117
"y": 125
},
"id": 313,
"panels": [
@@ -6682,7 +6813,7 @@
"h": 1,
"w": 24,
"x": 0,
"y": 118
"y": 126
},
"id": 324,
"panels": [
@@ -6979,7 +7110,7 @@
"h": 1,
"w": 24,
"x": 0,
"y": 119
"y": 127
},
"id": 328,
"panels": [
@@ -7627,6 +7758,5 @@
"timezone": "",
"title": "GreptimeDB",
"uid": "dejf3k5e7g2kgb",
"version": 3,
"weekStart": ""
"version": 3
}

View File

@@ -60,7 +60,7 @@
| Read Stage P99 per Instance | `histogram_quantile(0.99, sum by(instance, pod, le, stage) (rate(greptime_mito_read_stage_elapsed_bucket{instance=~"$datanode"}[$__rate_interval])))` | `timeseries` | Read Stage P99 per Instance. | `prometheus` | `s` | `[{{instance}}]-[{{pod}}]-[{{stage}}]` |
| Write Stage P99 per Instance | `histogram_quantile(0.99, sum by(instance, pod, le, stage) (rate(greptime_mito_write_stage_elapsed_bucket{instance=~"$datanode"}[$__rate_interval])))` | `timeseries` | Write Stage P99 per Instance. | `prometheus` | `s` | `[{{instance}}]-[{{pod}}]-[{{stage}}]` |
| Compaction OPS per Instance | `sum by(instance, pod) (rate(greptime_mito_compaction_total_elapsed_count{instance=~"$datanode"}[$__rate_interval]))` | `timeseries` | Compaction OPS per Instance. | `prometheus` | `ops` | `[{{ instance }}]-[{{pod}}]` |
| Compaction P99 per Instance by Stage | `histogram_quantile(0.99, sum by(instance, pod, le, stage) (rate(greptime_mito_compaction_stage_elapsed_bucket{instance=~"$datanode"}[$__rate_interval])))` | `timeseries` | Compaction latency by stage | `prometheus` | `s` | `[{{instance}}]-[{{pod}}]-[{{stage}}]-p99` |
| Compaction Elapsed Time per Instance by Stage | `histogram_quantile(0.99, sum by(instance, pod, le, stage) (rate(greptime_mito_compaction_stage_elapsed_bucket{instance=~"$datanode"}[$__rate_interval])))`<br/>`sum by(instance, pod, le, stage) (rate(greptime_mito_compaction_stage_elapsed_sum{instance=~"$datanode"}[$__rate_interval]))/sum by(instance, pod, le, stage) (rate(greptime_mito_compaction_stage_elapsed_count{instance=~"$datanode"}[$__rate_interval]))` | `timeseries` | Compaction latency by stage | `prometheus` | `s` | `[{{instance}}]-[{{pod}}]-[{{stage}}]-p99` |
| Compaction P99 per Instance | `histogram_quantile(0.99, sum by(instance, pod, le,stage) (rate(greptime_mito_compaction_total_elapsed_bucket{instance=~"$datanode"}[$__rate_interval])))` | `timeseries` | Compaction P99 per Instance. | `prometheus` | `s` | `[{{instance}}]-[{{pod}}]-[{{stage}}]-compaction` |
| WAL write size | `histogram_quantile(0.95, sum by(le,instance, pod) (rate(raft_engine_write_size_bucket[$__rate_interval])))`<br/>`histogram_quantile(0.99, sum by(le,instance,pod) (rate(raft_engine_write_size_bucket[$__rate_interval])))`<br/>`sum by (instance, pod)(rate(raft_engine_write_size_sum[$__rate_interval]))` | `timeseries` | Write-ahead logs write size as bytes. This chart includes stats of p95 and p99 size by instance, total WAL write rate. | `prometheus` | `bytes` | `[{{instance}}]-[{{pod}}]-req-size-p95` |
| Cached Bytes per Instance | `greptime_mito_cache_bytes{instance=~"$datanode"}` | `timeseries` | Cached Bytes per Instance. | `prometheus` | `decbytes` | `[{{instance}}]-[{{pod}}]-[{{type}}]` |
@@ -70,6 +70,7 @@
| Inflight Flush | `greptime_mito_inflight_flush_count` | `timeseries` | Ongoing flush task count | `prometheus` | `none` | `[{{instance}}]-[{{pod}}]` |
| Compaction Input/Output Bytes | `sum by(instance, pod) (greptime_mito_compaction_input_bytes)`<br/>`sum by(instance, pod) (greptime_mito_compaction_output_bytes)` | `timeseries` | Compaction oinput output bytes | `prometheus` | `bytes` | `[{{instance}}]-[{{pod}}]-input` |
| Region Worker Handle Bulk Insert Requests | `histogram_quantile(0.95, sum by(le,instance, stage, pod) (rate(greptime_region_worker_handle_write_bucket[$__rate_interval])))`<br/>`sum by(le,instance, stage, pod) (rate(greptime_region_worker_handle_write_sum[$__rate_interval]))/sum by(le,instance, stage, pod) (rate(greptime_region_worker_handle_write_count[$__rate_interval]))` | `timeseries` | Per-stage elapsed time for region worker to handle bulk insert region requests. | `prometheus` | `s` | `[{{instance}}]-[{{pod}}]-[{{stage}}]-P95` |
| Region Worker Convert Requests | `histogram_quantile(0.95, sum by(le, instance, stage, pod) (rate(greptime_datanode_convert_region_request_bucket[$__rate_interval])))`<br/>`sum by(le,instance, stage, pod) (rate(greptime_datanode_convert_region_request_sum[$__rate_interval]))/sum by(le,instance, stage, pod) (rate(greptime_datanode_convert_region_request_count[$__rate_interval]))` | `timeseries` | Per-stage elapsed time for region worker to decode requests. | `prometheus` | `s` | `[{{instance}}]-[{{pod}}]-[{{stage}}]-P95` |
# OpenDAL
| Title | Query | Type | Description | Datasource | Unit | Legend Format |
| --- | --- | --- | --- | --- | --- | --- |

View File

@@ -487,7 +487,7 @@ groups:
type: prometheus
uid: ${metrics}
legendFormat: '[{{ instance }}]-[{{pod}}]'
- title: Compaction P99 per Instance by Stage
- title: Compaction Elapsed Time per Instance by Stage
type: timeseries
description: Compaction latency by stage
unit: s
@@ -497,6 +497,11 @@ groups:
type: prometheus
uid: ${metrics}
legendFormat: '[{{instance}}]-[{{pod}}]-[{{stage}}]-p99'
- expr: sum by(instance, pod, le, stage) (rate(greptime_mito_compaction_stage_elapsed_sum{instance=~"$datanode"}[$__rate_interval]))/sum by(instance, pod, le, stage) (rate(greptime_mito_compaction_stage_elapsed_count{instance=~"$datanode"}[$__rate_interval]))
datasource:
type: prometheus
uid: ${metrics}
legendFormat: '[{{instance}}]-[{{pod}}]-[{{stage}}]-avg'
- title: Compaction P99 per Instance
type: timeseries
description: Compaction P99 per Instance.
@@ -607,6 +612,21 @@ groups:
type: prometheus
uid: ${metrics}
legendFormat: '[{{instance}}]-[{{pod}}]-[{{stage}}]-AVG'
- title: Region Worker Convert Requests
type: timeseries
description: Per-stage elapsed time for region worker to decode requests.
unit: s
queries:
- expr: histogram_quantile(0.95, sum by(le, instance, stage, pod) (rate(greptime_datanode_convert_region_request_bucket[$__rate_interval])))
datasource:
type: prometheus
uid: ${metrics}
legendFormat: '[{{instance}}]-[{{pod}}]-[{{stage}}]-P95'
- expr: sum by(le,instance, stage, pod) (rate(greptime_datanode_convert_region_request_sum[$__rate_interval]))/sum by(le,instance, stage, pod) (rate(greptime_datanode_convert_region_request_count[$__rate_interval]))
datasource:
type: prometheus
uid: ${metrics}
legendFormat: '[{{instance}}]-[{{pod}}]-[{{stage}}]-AVG'
- title: OpenDAL
panels:
- title: QPS per Instance

View File

@@ -25,7 +25,7 @@
"editable": true,
"fiscalYearStartMonth": 0,
"graphTooltip": 1,
"id": 5,
"id": 7,
"links": [],
"panels": [
{
@@ -4476,7 +4476,7 @@
"axisPlacement": "auto",
"barAlignment": 0,
"barWidthFactor": 0.6,
"drawStyle": "line",
"drawStyle": "points",
"fillOpacity": 0,
"gradientMode": "none",
"hideFrom": {
@@ -4553,9 +4553,22 @@
"legendFormat": "[{{instance}}]-[{{pod}}]-[{{stage}}]-p99",
"range": true,
"refId": "A"
},
{
"datasource": {
"type": "prometheus",
"uid": "${metrics}"
},
"editorMode": "code",
"expr": "sum by(instance, pod, le, stage) (rate(greptime_mito_compaction_stage_elapsed_sum{}[$__rate_interval]))/sum by(instance, pod, le, stage) (rate(greptime_mito_compaction_stage_elapsed_count{}[$__rate_interval]))",
"hide": false,
"instant": false,
"legendFormat": "[{{instance}}]-[{{pod}}]-[{{stage}}]-avg",
"range": true,
"refId": "B"
}
],
"title": "Compaction P99 per Instance by Stage",
"title": "Compaction Elapsed Time per Instance by Stage",
"type": "timeseries"
},
{
@@ -5546,13 +5559,131 @@
"title": "Region Worker Handle Bulk Insert Requests",
"type": "timeseries"
},
{
"datasource": {
"type": "prometheus",
"uid": "${metrics}"
},
"description": "Per-stage elapsed time for region worker to decode requests.",
"fieldConfig": {
"defaults": {
"color": {
"mode": "palette-classic"
},
"custom": {
"axisBorderShow": false,
"axisCenteredZero": false,
"axisColorMode": "text",
"axisLabel": "",
"axisPlacement": "auto",
"barAlignment": 0,
"barWidthFactor": 0.6,
"drawStyle": "points",
"fillOpacity": 0,
"gradientMode": "none",
"hideFrom": {
"legend": false,
"tooltip": false,
"viz": false
},
"insertNulls": false,
"lineInterpolation": "linear",
"lineWidth": 1,
"pointSize": 5,
"scaleDistribution": {
"type": "linear"
},
"showPoints": "auto",
"spanNulls": false,
"stacking": {
"group": "A",
"mode": "none"
},
"thresholdsStyle": {
"mode": "off"
}
},
"mappings": [],
"thresholds": {
"mode": "absolute",
"steps": [
{
"color": "green"
},
{
"color": "red",
"value": 80
}
]
},
"unit": "s"
},
"overrides": []
},
"gridPos": {
"h": 8,
"w": 12,
"x": 12,
"y": 117
},
"id": 338,
"options": {
"legend": {
"calcs": [
"lastNotNull"
],
"displayMode": "table",
"placement": "bottom",
"showLegend": true
},
"tooltip": {
"hideZeros": false,
"mode": "single",
"sort": "none"
}
},
"pluginVersion": "12.0.0",
"targets": [
{
"datasource": {
"type": "prometheus",
"uid": "${metrics}"
},
"disableTextWrap": false,
"editorMode": "builder",
"expr": "histogram_quantile(0.95, sum by(le, instance, stage, pod) (rate(greptime_datanode_convert_region_request_bucket[$__rate_interval])))",
"fullMetaSearch": false,
"includeNullMetadata": true,
"instant": false,
"legendFormat": "[{{instance}}]-[{{pod}}]-[{{stage}}]-P95",
"range": true,
"refId": "A",
"useBackend": false
},
{
"datasource": {
"type": "prometheus",
"uid": "${metrics}"
},
"editorMode": "code",
"expr": "sum by(le,instance, stage, pod) (rate(greptime_datanode_convert_region_request_sum[$__rate_interval]))/sum by(le,instance, stage, pod) (rate(greptime_datanode_convert_region_request_count[$__rate_interval]))",
"hide": false,
"instant": false,
"legendFormat": "[{{instance}}]-[{{pod}}]-[{{stage}}]-AVG",
"range": true,
"refId": "B"
}
],
"title": "Region Worker Convert Requests",
"type": "timeseries"
},
{
"collapsed": true,
"gridPos": {
"h": 1,
"w": 24,
"x": 0,
"y": 117
"y": 125
},
"id": 313,
"panels": [
@@ -6682,7 +6813,7 @@
"h": 1,
"w": 24,
"x": 0,
"y": 118
"y": 126
},
"id": 324,
"panels": [
@@ -6979,7 +7110,7 @@
"h": 1,
"w": 24,
"x": 0,
"y": 119
"y": 127
},
"id": 328,
"panels": [
@@ -7627,6 +7758,5 @@
"timezone": "",
"title": "GreptimeDB",
"uid": "dejf3k5e7g2kgb",
"version": 3,
"weekStart": ""
"version": 3
}

View File

@@ -60,7 +60,7 @@
| Read Stage P99 per Instance | `histogram_quantile(0.99, sum by(instance, pod, le, stage) (rate(greptime_mito_read_stage_elapsed_bucket{}[$__rate_interval])))` | `timeseries` | Read Stage P99 per Instance. | `prometheus` | `s` | `[{{instance}}]-[{{pod}}]-[{{stage}}]` |
| Write Stage P99 per Instance | `histogram_quantile(0.99, sum by(instance, pod, le, stage) (rate(greptime_mito_write_stage_elapsed_bucket{}[$__rate_interval])))` | `timeseries` | Write Stage P99 per Instance. | `prometheus` | `s` | `[{{instance}}]-[{{pod}}]-[{{stage}}]` |
| Compaction OPS per Instance | `sum by(instance, pod) (rate(greptime_mito_compaction_total_elapsed_count{}[$__rate_interval]))` | `timeseries` | Compaction OPS per Instance. | `prometheus` | `ops` | `[{{ instance }}]-[{{pod}}]` |
| Compaction P99 per Instance by Stage | `histogram_quantile(0.99, sum by(instance, pod, le, stage) (rate(greptime_mito_compaction_stage_elapsed_bucket{}[$__rate_interval])))` | `timeseries` | Compaction latency by stage | `prometheus` | `s` | `[{{instance}}]-[{{pod}}]-[{{stage}}]-p99` |
| Compaction Elapsed Time per Instance by Stage | `histogram_quantile(0.99, sum by(instance, pod, le, stage) (rate(greptime_mito_compaction_stage_elapsed_bucket{}[$__rate_interval])))`<br/>`sum by(instance, pod, le, stage) (rate(greptime_mito_compaction_stage_elapsed_sum{}[$__rate_interval]))/sum by(instance, pod, le, stage) (rate(greptime_mito_compaction_stage_elapsed_count{}[$__rate_interval]))` | `timeseries` | Compaction latency by stage | `prometheus` | `s` | `[{{instance}}]-[{{pod}}]-[{{stage}}]-p99` |
| Compaction P99 per Instance | `histogram_quantile(0.99, sum by(instance, pod, le,stage) (rate(greptime_mito_compaction_total_elapsed_bucket{}[$__rate_interval])))` | `timeseries` | Compaction P99 per Instance. | `prometheus` | `s` | `[{{instance}}]-[{{pod}}]-[{{stage}}]-compaction` |
| WAL write size | `histogram_quantile(0.95, sum by(le,instance, pod) (rate(raft_engine_write_size_bucket[$__rate_interval])))`<br/>`histogram_quantile(0.99, sum by(le,instance,pod) (rate(raft_engine_write_size_bucket[$__rate_interval])))`<br/>`sum by (instance, pod)(rate(raft_engine_write_size_sum[$__rate_interval]))` | `timeseries` | Write-ahead logs write size as bytes. This chart includes stats of p95 and p99 size by instance, total WAL write rate. | `prometheus` | `bytes` | `[{{instance}}]-[{{pod}}]-req-size-p95` |
| Cached Bytes per Instance | `greptime_mito_cache_bytes{}` | `timeseries` | Cached Bytes per Instance. | `prometheus` | `decbytes` | `[{{instance}}]-[{{pod}}]-[{{type}}]` |
@@ -70,6 +70,7 @@
| Inflight Flush | `greptime_mito_inflight_flush_count` | `timeseries` | Ongoing flush task count | `prometheus` | `none` | `[{{instance}}]-[{{pod}}]` |
| Compaction Input/Output Bytes | `sum by(instance, pod) (greptime_mito_compaction_input_bytes)`<br/>`sum by(instance, pod) (greptime_mito_compaction_output_bytes)` | `timeseries` | Compaction oinput output bytes | `prometheus` | `bytes` | `[{{instance}}]-[{{pod}}]-input` |
| Region Worker Handle Bulk Insert Requests | `histogram_quantile(0.95, sum by(le,instance, stage, pod) (rate(greptime_region_worker_handle_write_bucket[$__rate_interval])))`<br/>`sum by(le,instance, stage, pod) (rate(greptime_region_worker_handle_write_sum[$__rate_interval]))/sum by(le,instance, stage, pod) (rate(greptime_region_worker_handle_write_count[$__rate_interval]))` | `timeseries` | Per-stage elapsed time for region worker to handle bulk insert region requests. | `prometheus` | `s` | `[{{instance}}]-[{{pod}}]-[{{stage}}]-P95` |
| Region Worker Convert Requests | `histogram_quantile(0.95, sum by(le, instance, stage, pod) (rate(greptime_datanode_convert_region_request_bucket[$__rate_interval])))`<br/>`sum by(le,instance, stage, pod) (rate(greptime_datanode_convert_region_request_sum[$__rate_interval]))/sum by(le,instance, stage, pod) (rate(greptime_datanode_convert_region_request_count[$__rate_interval]))` | `timeseries` | Per-stage elapsed time for region worker to decode requests. | `prometheus` | `s` | `[{{instance}}]-[{{pod}}]-[{{stage}}]-P95` |
# OpenDAL
| Title | Query | Type | Description | Datasource | Unit | Legend Format |
| --- | --- | --- | --- | --- | --- | --- |

View File

@@ -487,7 +487,7 @@ groups:
type: prometheus
uid: ${metrics}
legendFormat: '[{{ instance }}]-[{{pod}}]'
- title: Compaction P99 per Instance by Stage
- title: Compaction Elapsed Time per Instance by Stage
type: timeseries
description: Compaction latency by stage
unit: s
@@ -497,6 +497,11 @@ groups:
type: prometheus
uid: ${metrics}
legendFormat: '[{{instance}}]-[{{pod}}]-[{{stage}}]-p99'
- expr: sum by(instance, pod, le, stage) (rate(greptime_mito_compaction_stage_elapsed_sum{}[$__rate_interval]))/sum by(instance, pod, le, stage) (rate(greptime_mito_compaction_stage_elapsed_count{}[$__rate_interval]))
datasource:
type: prometheus
uid: ${metrics}
legendFormat: '[{{instance}}]-[{{pod}}]-[{{stage}}]-avg'
- title: Compaction P99 per Instance
type: timeseries
description: Compaction P99 per Instance.
@@ -607,6 +612,21 @@ groups:
type: prometheus
uid: ${metrics}
legendFormat: '[{{instance}}]-[{{pod}}]-[{{stage}}]-AVG'
- title: Region Worker Convert Requests
type: timeseries
description: Per-stage elapsed time for region worker to decode requests.
unit: s
queries:
- expr: histogram_quantile(0.95, sum by(le, instance, stage, pod) (rate(greptime_datanode_convert_region_request_bucket[$__rate_interval])))
datasource:
type: prometheus
uid: ${metrics}
legendFormat: '[{{instance}}]-[{{pod}}]-[{{stage}}]-P95'
- expr: sum by(le,instance, stage, pod) (rate(greptime_datanode_convert_region_request_sum[$__rate_interval]))/sum by(le,instance, stage, pod) (rate(greptime_datanode_convert_region_request_count[$__rate_interval]))
datasource:
type: prometheus
uid: ${metrics}
legendFormat: '[{{instance}}]-[{{pod}}]-[{{stage}}]-AVG'
- title: OpenDAL
panels:
- title: QPS per Instance

View File

@@ -6,7 +6,7 @@ DAC_IMAGE=ghcr.io/zyy17/dac:20250423-522bd35
remove_instance_filters() {
# Remove the instance filters for the standalone dashboards.
sed 's/instance=~\\"$datanode\\",//; s/instance=~\\"$datanode\\"//; s/instance=~\\"$frontend\\",//; s/instance=~\\"$frontend\\"//; s/instance=~\\"$metasrv\\",//; s/instance=~\\"$metasrv\\"//; s/instance=~\\"$flownode\\",//; s/instance=~\\"$flownode\\"//;' $CLUSTER_DASHBOARD_DIR/dashboard.json > $STANDALONE_DASHBOARD_DIR/dashboard.json
sed -E 's/instance=~\\"(\$datanode|\$frontend|\$metasrv|\$flownode)\\",?//g' "$CLUSTER_DASHBOARD_DIR/dashboard.json" > "$STANDALONE_DASHBOARD_DIR/dashboard.json"
}
generate_intermediate_dashboards_and_docs() {

View File

@@ -27,6 +27,8 @@ excludes = [
"src/servers/src/repeated_field.rs",
"src/servers/src/http/test_helpers.rs",
# enterprise
"src/common/meta/src/rpc/ddl/trigger.rs",
"src/operator/src/expr_helper/trigger.rs",
"src/sql/src/statements/create/trigger.rs",
"src/sql/src/statements/show/trigger.rs",
"src/sql/src/parsers/create_parser/trigger.rs",

View File

@@ -5,8 +5,12 @@ edition.workspace = true
license.workspace = true
[features]
pg_kvbackend = ["common-meta/pg_kvbackend"]
mysql_kvbackend = ["common-meta/mysql_kvbackend"]
default = [
"pg_kvbackend",
"mysql_kvbackend",
]
pg_kvbackend = ["common-meta/pg_kvbackend", "meta-srv/pg_kvbackend"]
mysql_kvbackend = ["common-meta/mysql_kvbackend", "meta-srv/mysql_kvbackend"]
[lints]
workspace = true

View File

@@ -10,7 +10,13 @@ name = "greptime"
path = "src/bin/greptime.rs"
[features]
default = ["servers/pprof", "servers/mem-prof", "meta-srv/pg_kvbackend", "meta-srv/mysql_kvbackend"]
default = [
"servers/pprof",
"servers/mem-prof",
"meta-srv/pg_kvbackend",
"meta-srv/mysql_kvbackend",
]
enterprise = ["common-meta/enterprise", "frontend/enterprise", "meta-srv/enterprise"]
tokio-console = ["common-telemetry/tokio-console"]
[lints]

View File

@@ -35,6 +35,8 @@ use common_meta::ddl::flow_meta::{FlowMetadataAllocator, FlowMetadataAllocatorRe
use common_meta::ddl::table_meta::{TableMetadataAllocator, TableMetadataAllocatorRef};
use common_meta::ddl::{DdlContext, NoopRegionFailureDetectorControl, ProcedureExecutorRef};
use common_meta::ddl_manager::DdlManager;
#[cfg(feature = "enterprise")]
use common_meta::ddl_manager::TriggerDdlManagerRef;
use common_meta::key::flow::flow_state::FlowStat;
use common_meta::key::flow::{FlowMetadataManager, FlowMetadataManagerRef};
use common_meta::key::{TableMetadataManager, TableMetadataManagerRef};
@@ -69,6 +71,7 @@ use frontend::service_config::{
};
use meta_srv::metasrv::{FLOW_ID_SEQ, TABLE_ID_SEQ};
use mito2::config::MitoConfig;
use query::options::QueryOptions;
use serde::{Deserialize, Serialize};
use servers::export_metrics::{ExportMetricsOption, ExportMetricsTask};
use servers::grpc::GrpcOptions;
@@ -153,6 +156,7 @@ pub struct StandaloneOptions {
pub init_regions_parallelism: usize,
pub max_in_flight_write_bytes: Option<ReadableSize>,
pub slow_query: Option<SlowQueryOptions>,
pub query: QueryOptions,
}
impl Default for StandaloneOptions {
@@ -185,6 +189,7 @@ impl Default for StandaloneOptions {
init_regions_parallelism: 16,
max_in_flight_write_bytes: None,
slow_query: Some(SlowQueryOptions::default()),
query: QueryOptions::default(),
}
}
}
@@ -240,6 +245,7 @@ impl StandaloneOptions {
grpc: cloned_opts.grpc,
init_regions_in_background: cloned_opts.init_regions_in_background,
init_regions_parallelism: cloned_opts.init_regions_parallelism,
query: cloned_opts.query,
..Default::default()
}
}
@@ -579,6 +585,8 @@ impl StartCommand {
flow_id_sequence,
));
#[cfg(feature = "enterprise")]
let trigger_ddl_manager: Option<TriggerDdlManagerRef> = plugins.get();
let ddl_task_executor = Self::create_ddl_task_executor(
procedure_manager.clone(),
node_manager.clone(),
@@ -587,6 +595,8 @@ impl StartCommand {
table_meta_allocator,
flow_metadata_manager,
flow_meta_allocator,
#[cfg(feature = "enterprise")]
trigger_ddl_manager,
)
.await?;
@@ -651,6 +661,7 @@ impl StartCommand {
})
}
#[allow(clippy::too_many_arguments)]
pub async fn create_ddl_task_executor(
procedure_manager: ProcedureManagerRef,
node_manager: NodeManagerRef,
@@ -659,6 +670,7 @@ impl StartCommand {
table_metadata_allocator: TableMetadataAllocatorRef,
flow_metadata_manager: FlowMetadataManagerRef,
flow_metadata_allocator: FlowMetadataAllocatorRef,
#[cfg(feature = "enterprise")] trigger_ddl_manager: Option<TriggerDdlManagerRef>,
) -> Result<ProcedureExecutorRef> {
let procedure_executor: ProcedureExecutorRef = Arc::new(
DdlManager::try_new(
@@ -675,6 +687,8 @@ impl StartCommand {
},
procedure_manager,
true,
#[cfg(feature = "enterprise")]
trigger_ddl_manager,
)
.context(error::InitDdlManagerSnafu)?,
);

View File

@@ -8,6 +8,7 @@ license.workspace = true
testing = []
pg_kvbackend = ["dep:tokio-postgres", "dep:backon", "dep:deadpool-postgres", "dep:deadpool"]
mysql_kvbackend = ["dep:sqlx", "dep:backon"]
enterprise = []
[lints]
workspace = true

View File

@@ -18,6 +18,7 @@ pub mod create_table;
pub mod datanode_handler;
pub mod flownode_handler;
use std::assert_matches::assert_matches;
use std::collections::HashMap;
use api::v1::meta::Partition;
@@ -75,8 +76,6 @@ pub async fn create_logical_table(
physical_table_id: TableId,
table_name: &str,
) -> TableId {
use std::assert_matches::assert_matches;
let tasks = vec![test_create_logical_table_task(table_name)];
let mut procedure = CreateLogicalTablesProcedure::new(tasks, physical_table_id, ddl_context);
let status = procedure.on_prepare().await.unwrap();

View File

@@ -47,6 +47,10 @@ use crate::error::{
use crate::key::table_info::TableInfoValue;
use crate::key::table_name::TableNameKey;
use crate::key::{DeserializedValueWithBytes, TableMetadataManagerRef};
#[cfg(feature = "enterprise")]
use crate::rpc::ddl::trigger::CreateTriggerTask;
#[cfg(feature = "enterprise")]
use crate::rpc::ddl::DdlTask::CreateTrigger;
use crate::rpc::ddl::DdlTask::{
AlterDatabase, AlterLogicalTables, AlterTable, CreateDatabase, CreateFlow, CreateLogicalTables,
CreateTable, CreateView, DropDatabase, DropFlow, DropLogicalTables, DropTable, DropView,
@@ -70,8 +74,29 @@ pub type BoxedProcedureLoaderFactory = dyn Fn(DdlContext) -> BoxedProcedureLoade
pub struct DdlManager {
ddl_context: DdlContext,
procedure_manager: ProcedureManagerRef,
#[cfg(feature = "enterprise")]
trigger_ddl_manager: Option<TriggerDdlManagerRef>,
}
/// This trait is responsible for handling DDL tasks about triggers. e.g.,
/// create trigger, drop trigger, etc.
#[cfg(feature = "enterprise")]
#[async_trait::async_trait]
pub trait TriggerDdlManager: Send + Sync {
async fn create_trigger(
&self,
create_trigger_task: CreateTriggerTask,
procedure_manager: ProcedureManagerRef,
ddl_context: DdlContext,
query_context: QueryContext,
) -> Result<SubmitDdlTaskResponse>;
fn as_any(&self) -> &dyn std::any::Any;
}
#[cfg(feature = "enterprise")]
pub type TriggerDdlManagerRef = Arc<dyn TriggerDdlManager>;
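// --- Illustrative sketch, not part of this change: a minimal `TriggerDdlManager` that
// --- rejects every request, matching the trait above. It reuses names already in scope in
// --- this file (CreateTriggerTask, QueryContext, UnsupportedSnafu, ...); paths are assumptions.
#[cfg(feature = "enterprise")]
pub struct NoopTriggerDdlManager;

#[cfg(feature = "enterprise")]
#[async_trait::async_trait]
impl TriggerDdlManager for NoopTriggerDdlManager {
    async fn create_trigger(
        &self,
        _create_trigger_task: CreateTriggerTask,
        _procedure_manager: ProcedureManagerRef,
        _ddl_context: DdlContext,
        _query_context: QueryContext,
    ) -> Result<SubmitDdlTaskResponse> {
        // The same error the open-source dispatch path returns when no manager is plugged in.
        UnsupportedSnafu { operation: "create trigger" }.fail()
    }

    fn as_any(&self) -> &dyn std::any::Any {
        self
    }
}
// --- A plugin would register `Arc::new(NoopTriggerDdlManager)` (or a real implementation) as a
// --- `TriggerDdlManagerRef` so `DdlManager::try_new` receives `Some(...)`, as the standalone
// --- command does above.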
macro_rules! procedure_loader_entry {
($procedure:ident) => {
(
@@ -100,10 +125,13 @@ impl DdlManager {
ddl_context: DdlContext,
procedure_manager: ProcedureManagerRef,
register_loaders: bool,
#[cfg(feature = "enterprise")] trigger_ddl_manager: Option<TriggerDdlManagerRef>,
) -> Result<Self> {
let manager = Self {
ddl_context,
procedure_manager,
#[cfg(feature = "enterprise")]
trigger_ddl_manager,
};
if register_loaders {
manager.register_loaders()?;
@@ -669,6 +697,28 @@ async fn handle_create_flow_task(
})
}
#[cfg(feature = "enterprise")]
async fn handle_create_trigger_task(
ddl_manager: &DdlManager,
create_trigger_task: CreateTriggerTask,
query_context: QueryContext,
) -> Result<SubmitDdlTaskResponse> {
let Some(m) = ddl_manager.trigger_ddl_manager.as_ref() else {
return UnsupportedSnafu {
operation: "create trigger",
}
.fail();
};
m.create_trigger(
create_trigger_task,
ddl_manager.procedure_manager.clone(),
ddl_manager.ddl_context.clone(),
query_context,
)
.await
}
async fn handle_alter_logical_table_tasks(
ddl_manager: &DdlManager,
alter_table_tasks: Vec<AlterTableTask>,
@@ -777,6 +827,15 @@ impl ProcedureExecutor for DdlManager {
handle_create_flow_task(self, create_flow_task, request.query_context.into())
.await
}
#[cfg(feature = "enterprise")]
CreateTrigger(create_trigger_task) => {
handle_create_trigger_task(
self,
create_trigger_task,
request.query_context.into(),
)
.await
}
DropFlow(drop_flow_task) => handle_drop_flow_task(self, drop_flow_task).await,
CreateView(create_view_task) => {
handle_create_view_task(self, create_view_task).await
@@ -905,6 +964,8 @@ mod tests {
},
procedure_manager.clone(),
true,
#[cfg(feature = "enterprise")]
None,
);
let expected_loaders = vec![

View File

@@ -14,7 +14,7 @@
pub mod flow_info;
pub(crate) mod flow_name;
pub(crate) mod flow_route;
pub mod flow_route;
pub mod flow_state;
mod flownode_addr_helper;
pub(crate) mod flownode_flow;

View File

@@ -114,37 +114,37 @@ impl<'a> MetadataKey<'a, FlowInfoKeyInner> for FlowInfoKeyInner {
#[derive(Debug, Clone, Serialize, Deserialize, PartialEq)]
pub struct FlowInfoValue {
/// The source tables used by the flow.
pub(crate) source_table_ids: Vec<TableId>,
pub source_table_ids: Vec<TableId>,
/// The sink table used by the flow.
pub(crate) sink_table_name: TableName,
pub sink_table_name: TableName,
/// Which flow nodes this flow is running on.
pub(crate) flownode_ids: BTreeMap<FlowPartitionId, FlownodeId>,
pub flownode_ids: BTreeMap<FlowPartitionId, FlownodeId>,
/// The catalog name.
pub(crate) catalog_name: String,
pub catalog_name: String,
/// The query context used when create flow.
/// Although flow doesn't belong to any schema, this query_context is needed to remember
/// the query context when `create_flow` is executed
/// for recovering flow using the same sql&query_context after db restart.
/// if none, should use default query context
#[serde(default)]
pub(crate) query_context: Option<crate::rpc::ddl::QueryContext>,
pub query_context: Option<crate::rpc::ddl::QueryContext>,
/// The flow name.
pub(crate) flow_name: String,
pub flow_name: String,
/// The raw sql.
pub(crate) raw_sql: String,
pub raw_sql: String,
/// The expr of expire.
/// Duration in seconds as `i64`.
pub(crate) expire_after: Option<i64>,
pub expire_after: Option<i64>,
/// The comment.
pub(crate) comment: String,
pub comment: String,
/// The options.
pub(crate) options: HashMap<String, String>,
pub options: HashMap<String, String>,
/// The created time
#[serde(default)]
pub(crate) created_time: DateTime<Utc>,
pub created_time: DateTime<Utc>,
/// The updated time.
#[serde(default)]
pub(crate) updated_time: DateTime<Utc>,
pub updated_time: DateTime<Utc>,
}
impl FlowInfoValue {

View File

@@ -12,6 +12,9 @@
// See the License for the specific language governing permissions and
// limitations under the License.
#[cfg(feature = "enterprise")]
pub mod trigger;
use std::collections::{HashMap, HashSet};
use std::result;
@@ -68,6 +71,8 @@ pub enum DdlTask {
DropFlow(DropFlowTask),
CreateView(CreateViewTask),
DropView(DropViewTask),
#[cfg(feature = "enterprise")]
CreateTrigger(trigger::CreateTriggerTask),
}
impl DdlTask {
@@ -242,6 +247,18 @@ impl TryFrom<Task> for DdlTask {
Task::DropFlowTask(drop_flow) => Ok(DdlTask::DropFlow(drop_flow.try_into()?)),
Task::CreateViewTask(create_view) => Ok(DdlTask::CreateView(create_view.try_into()?)),
Task::DropViewTask(drop_view) => Ok(DdlTask::DropView(drop_view.try_into()?)),
Task::CreateTriggerTask(create_trigger) => {
#[cfg(feature = "enterprise")]
return Ok(DdlTask::CreateTrigger(create_trigger.try_into()?));
#[cfg(not(feature = "enterprise"))]
{
let _ = create_trigger;
crate::error::UnsupportedSnafu {
operation: "create trigger",
}
.fail()
}
}
}
}
}
@@ -292,6 +309,8 @@ impl TryFrom<SubmitDdlTaskRequest> for PbDdlTaskRequest {
DdlTask::DropFlow(task) => Task::DropFlowTask(task.into()),
DdlTask::CreateView(task) => Task::CreateViewTask(task.try_into()?),
DdlTask::DropView(task) => Task::DropViewTask(task.into()),
#[cfg(feature = "enterprise")]
DdlTask::CreateTrigger(task) => Task::CreateTriggerTask(task.into()),
};
Ok(Self {

View File

@@ -0,0 +1,276 @@
use std::collections::HashMap;
use std::time::Duration;
use api::v1::meta::CreateTriggerTask as PbCreateTriggerTask;
use api::v1::notify_channel::ChannelType as PbChannelType;
use api::v1::{
CreateTriggerExpr, NotifyChannel as PbNotifyChannel, WebhookOptions as PbWebhookOptions,
};
use serde::{Deserialize, Serialize};
use snafu::OptionExt;
use crate::error;
use crate::error::Result;
use crate::rpc::ddl::DdlTask;
// Create trigger
#[derive(Debug, Clone, Serialize, Deserialize)]
pub struct CreateTriggerTask {
pub catalog_name: String,
pub trigger_name: String,
pub if_not_exists: bool,
pub sql: String,
pub channels: Vec<NotifyChannel>,
pub labels: HashMap<String, String>,
pub annotations: HashMap<String, String>,
pub interval: Duration,
}
#[derive(Debug, Clone, PartialEq, Serialize, Deserialize)]
pub struct NotifyChannel {
pub name: String,
pub channel_type: ChannelType,
}
/// The available channel enum for sending trigger notifications.
#[derive(Debug, Clone, PartialEq, Serialize, Deserialize)]
pub enum ChannelType {
Webhook(WebhookOptions),
}
#[derive(Debug, Clone, PartialEq, Serialize, Deserialize)]
pub struct WebhookOptions {
/// The URL of the AlertManager API endpoint.
///
/// e.g., "http://localhost:9093".
pub url: String,
/// Configuration options for the AlertManager webhook. e.g., timeout, etc.
pub opts: HashMap<String, String>,
}
impl From<CreateTriggerTask> for PbCreateTriggerTask {
fn from(task: CreateTriggerTask) -> Self {
let channels = task
.channels
.into_iter()
.map(PbNotifyChannel::from)
.collect();
let expr = CreateTriggerExpr {
catalog_name: task.catalog_name,
trigger_name: task.trigger_name,
create_if_not_exists: task.if_not_exists,
sql: task.sql,
channels,
labels: task.labels,
annotations: task.annotations,
interval: task.interval.as_secs(),
};
PbCreateTriggerTask {
create_trigger: Some(expr),
}
}
}
impl TryFrom<PbCreateTriggerTask> for CreateTriggerTask {
type Error = error::Error;
fn try_from(task: PbCreateTriggerTask) -> Result<Self> {
let expr = task.create_trigger.context(error::InvalidProtoMsgSnafu {
err_msg: "expected create_trigger",
})?;
let channels = expr
.channels
.into_iter()
.map(NotifyChannel::try_from)
.collect::<Result<Vec<_>>>()?;
let task = CreateTriggerTask {
catalog_name: expr.catalog_name,
trigger_name: expr.trigger_name,
if_not_exists: expr.create_if_not_exists,
sql: expr.sql,
channels,
labels: expr.labels,
annotations: expr.annotations,
interval: Duration::from_secs(expr.interval),
};
Ok(task)
}
}
impl From<NotifyChannel> for PbNotifyChannel {
fn from(channel: NotifyChannel) -> Self {
let NotifyChannel { name, channel_type } = channel;
let channel_type = match channel_type {
ChannelType::Webhook(options) => PbChannelType::Webhook(PbWebhookOptions {
url: options.url,
opts: options.opts,
}),
};
PbNotifyChannel {
name,
channel_type: Some(channel_type),
}
}
}
impl TryFrom<PbNotifyChannel> for NotifyChannel {
type Error = error::Error;
fn try_from(channel: PbNotifyChannel) -> Result<Self> {
let PbNotifyChannel { name, channel_type } = channel;
let channel_type = channel_type.context(error::InvalidProtoMsgSnafu {
err_msg: "expected channel_type",
})?;
let channel_type = match channel_type {
PbChannelType::Webhook(options) => ChannelType::Webhook(WebhookOptions {
url: options.url,
opts: options.opts,
}),
};
Ok(NotifyChannel { name, channel_type })
}
}
impl DdlTask {
/// Creates a [`DdlTask`] to create a trigger.
pub fn new_create_trigger(expr: CreateTriggerTask) -> Self {
DdlTask::CreateTrigger(expr)
}
}
#[cfg(test)]
mod tests {
use super::*;
#[test]
fn test_convert_create_trigger_task() {
let original = CreateTriggerTask {
catalog_name: "test_catalog".to_string(),
trigger_name: "test_trigger".to_string(),
if_not_exists: true,
sql: "SELECT * FROM test".to_string(),
channels: vec![
NotifyChannel {
name: "channel1".to_string(),
channel_type: ChannelType::Webhook(WebhookOptions {
url: "http://localhost:9093".to_string(),
opts: HashMap::from([("timeout".to_string(), "30s".to_string())]),
}),
},
NotifyChannel {
name: "channel2".to_string(),
channel_type: ChannelType::Webhook(WebhookOptions {
url: "http://alertmanager:9093".to_string(),
opts: HashMap::new(),
}),
},
],
labels: vec![
("key1".to_string(), "value1".to_string()),
("key2".to_string(), "value2".to_string()),
]
.into_iter()
.collect(),
annotations: vec![
("summary".to_string(), "Test alert".to_string()),
("description".to_string(), "This is a test".to_string()),
]
.into_iter()
.collect(),
interval: Duration::from_secs(60),
};
let pb_task: PbCreateTriggerTask = original.clone().into();
let expr = pb_task.create_trigger.as_ref().unwrap();
assert_eq!(expr.catalog_name, "test_catalog");
assert_eq!(expr.trigger_name, "test_trigger");
assert!(expr.create_if_not_exists);
assert_eq!(expr.sql, "SELECT * FROM test");
assert_eq!(expr.channels.len(), 2);
assert_eq!(expr.labels.len(), 2);
assert_eq!(expr.labels.get("key1").unwrap(), "value1");
assert_eq!(expr.labels.get("key2").unwrap(), "value2");
assert_eq!(expr.annotations.len(), 2);
assert_eq!(expr.annotations.get("summary").unwrap(), "Test alert");
assert_eq!(
expr.annotations.get("description").unwrap(),
"This is a test"
);
assert_eq!(expr.interval, 60);
let round_tripped = CreateTriggerTask::try_from(pb_task).unwrap();
assert_eq!(original.catalog_name, round_tripped.catalog_name);
assert_eq!(original.trigger_name, round_tripped.trigger_name);
assert_eq!(original.if_not_exists, round_tripped.if_not_exists);
assert_eq!(original.sql, round_tripped.sql);
assert_eq!(original.channels.len(), round_tripped.channels.len());
assert_eq!(&original.channels[0], &round_tripped.channels[0]);
assert_eq!(&original.channels[1], &round_tripped.channels[1]);
assert_eq!(original.labels, round_tripped.labels);
assert_eq!(original.annotations, round_tripped.annotations);
assert_eq!(original.interval, round_tripped.interval);
// Invalid, since create_trigger is None and it's required.
let invalid_task = PbCreateTriggerTask {
create_trigger: None,
};
let result = CreateTriggerTask::try_from(invalid_task);
assert!(result.is_err());
}
#[test]
fn test_convert_notify_channel() {
let original = NotifyChannel {
name: "test_channel".to_string(),
channel_type: ChannelType::Webhook(WebhookOptions {
url: "http://localhost:9093".to_string(),
opts: HashMap::new(),
}),
};
let pb_channel: PbNotifyChannel = original.clone().into();
match pb_channel.channel_type.as_ref().unwrap() {
PbChannelType::Webhook(options) => {
assert_eq!(pb_channel.name, "test_channel");
assert_eq!(options.url, "http://localhost:9093");
assert!(options.opts.is_empty());
}
}
let round_tripped = NotifyChannel::try_from(pb_channel).unwrap();
assert_eq!(original, round_tripped);
// Test with timeout is None.
let no_timeout = NotifyChannel {
name: "no_timeout".to_string(),
channel_type: ChannelType::Webhook(WebhookOptions {
url: "http://localhost:9093".to_string(),
opts: HashMap::new(),
}),
};
let pb_no_timeout: PbNotifyChannel = no_timeout.clone().into();
match pb_no_timeout.channel_type.as_ref().unwrap() {
PbChannelType::Webhook(options) => {
assert_eq!(options.url, "http://localhost:9093");
}
}
let round_tripped_no_timeout = NotifyChannel::try_from(pb_no_timeout).unwrap();
assert_eq!(no_timeout, round_tripped_no_timeout);
// Invalid, since channel_type is None and it's required.
let invalid_channel = PbNotifyChannel {
name: "invalid".to_string(),
channel_type: None,
};
let result = NotifyChannel::try_from(invalid_channel);
assert!(result.is_err());
}
}

View File

@@ -133,6 +133,18 @@ pub enum Error {
source: datatypes::error::Error,
},
#[snafu(display(
"Failed to downcast vector of type '{:?}' to type '{:?}'",
from_type,
to_type
))]
DowncastVector {
from_type: ConcreteDataType,
to_type: ConcreteDataType,
#[snafu(implicit)]
location: Location,
},
#[snafu(display("Error occurs when performing arrow computation"))]
ArrowCompute {
#[snafu(source)]
@@ -192,6 +204,8 @@ impl ErrorExt for Error {
| Error::PhysicalExpr { .. }
| Error::RecordBatchSliceIndexOverflow { .. } => StatusCode::Internal,
Error::DowncastVector { .. } => StatusCode::Unexpected,
Error::PollStream { .. } => StatusCode::EngineExecuteQuery,
Error::ArrowCompute { .. } => StatusCode::IllegalState,

View File

@@ -30,13 +30,16 @@ pub use datafusion::physical_plan::SendableRecordBatchStream as DfSendableRecord
use datatypes::arrow::compute::SortOptions;
pub use datatypes::arrow::record_batch::RecordBatch as DfRecordBatch;
use datatypes::arrow::util::pretty;
use datatypes::prelude::VectorRef;
use datatypes::schema::{Schema, SchemaRef};
use datatypes::prelude::{ConcreteDataType, VectorRef};
use datatypes::scalars::{ScalarVector, ScalarVectorBuilder};
use datatypes::schema::{ColumnSchema, Schema, SchemaRef};
use datatypes::types::json_type_value_to_string;
use datatypes::vectors::{BinaryVector, StringVectorBuilder};
use error::Result;
use futures::task::{Context, Poll};
use futures::{Stream, TryStreamExt};
pub use recordbatch::RecordBatch;
use snafu::{ensure, ResultExt};
use snafu::{ensure, OptionExt, ResultExt};
pub trait RecordBatchStream: Stream<Item = Result<RecordBatch>> {
fn name(&self) -> &str {
@@ -58,6 +61,146 @@ pub struct OrderOption {
pub options: SortOptions,
}
/// A wrapper that maps a [RecordBatchStream] to a new [RecordBatchStream] by applying a function to each [RecordBatch].
///
/// The mapper function is applied to each [RecordBatch] in the stream.
/// The schema of the new [RecordBatchStream] is the same as the schema of the inner [RecordBatchStream] after applying the schema mapper function.
/// The output ordering of the new [RecordBatchStream] is the same as the output ordering of the inner [RecordBatchStream].
/// The metrics of the new [RecordBatchStream] is the same as the metrics of the inner [RecordBatchStream] if it is not `None`.
pub struct SendableRecordBatchMapper {
inner: SendableRecordBatchStream,
/// The mapper function is applied to each [RecordBatch] in the stream.
/// The original schema and the mapped schema are passed to the mapper function.
mapper: fn(RecordBatch, &SchemaRef, &SchemaRef) -> Result<RecordBatch>,
/// The schema of the new [RecordBatchStream] is the same as the schema of the inner [RecordBatchStream] after applying the schema mapper function.
schema: SchemaRef,
/// Whether the mapper function is applied to each [RecordBatch] in the stream.
apply_mapper: bool,
}
/// Maps the json type to string in the batch.
///
/// The json type is mapped to string by converting the json value to string.
/// The batch is updated to have the same number of columns as the original batch,
/// but with the json type mapped to string.
pub fn map_json_type_to_string(
batch: RecordBatch,
original_schema: &SchemaRef,
mapped_schema: &SchemaRef,
) -> Result<RecordBatch> {
let mut vectors = Vec::with_capacity(original_schema.column_schemas().len());
for (vector, schema) in batch.columns.iter().zip(original_schema.column_schemas()) {
if let ConcreteDataType::Json(j) = schema.data_type {
let mut string_vector_builder = StringVectorBuilder::with_capacity(vector.len());
let binary_vector = vector
.as_any()
.downcast_ref::<BinaryVector>()
.with_context(|| error::DowncastVectorSnafu {
from_type: schema.data_type.clone(),
to_type: ConcreteDataType::binary_datatype(),
})?;
for value in binary_vector.iter_data() {
let Some(value) = value else {
string_vector_builder.push(None);
continue;
};
let string_value =
json_type_value_to_string(value, &j.format).with_context(|_| {
error::CastVectorSnafu {
from_type: schema.data_type.clone(),
to_type: ConcreteDataType::string_datatype(),
}
})?;
string_vector_builder.push(Some(string_value.as_str()));
}
let string_vector = string_vector_builder.finish();
vectors.push(Arc::new(string_vector) as VectorRef);
} else {
vectors.push(vector.clone());
}
}
RecordBatch::new(mapped_schema.clone(), vectors)
}
/// Maps the json type to string in the schema.
///
/// The json type is mapped to string by converting the json value to string.
/// The schema is updated to have the same number of columns as the original schema,
/// but with the json type mapped to string.
///
/// Returns the new schema and whether the schema needs to be mapped to string.
pub fn map_json_type_to_string_schema(schema: SchemaRef) -> (SchemaRef, bool) {
let mut new_columns = Vec::with_capacity(schema.column_schemas().len());
let mut apply_mapper = false;
for column in schema.column_schemas() {
if matches!(column.data_type, ConcreteDataType::Json(_)) {
new_columns.push(ColumnSchema::new(
column.name.to_string(),
ConcreteDataType::string_datatype(),
column.is_nullable(),
));
apply_mapper = true;
} else {
new_columns.push(column.clone());
}
}
(Arc::new(Schema::new(new_columns)), apply_mapper)
}
impl SendableRecordBatchMapper {
/// Creates a new [SendableRecordBatchMapper] with the given inner [RecordBatchStream], mapper function, and schema mapper function.
pub fn new(
inner: SendableRecordBatchStream,
mapper: fn(RecordBatch, &SchemaRef, &SchemaRef) -> Result<RecordBatch>,
schema_mapper: fn(SchemaRef) -> (SchemaRef, bool),
) -> Self {
let (mapped_schema, apply_mapper) = schema_mapper(inner.schema());
Self {
inner,
mapper,
schema: mapped_schema,
apply_mapper,
}
}
}
impl RecordBatchStream for SendableRecordBatchMapper {
fn name(&self) -> &str {
"SendableRecordBatchMapper"
}
fn schema(&self) -> SchemaRef {
self.schema.clone()
}
fn output_ordering(&self) -> Option<&[OrderOption]> {
self.inner.output_ordering()
}
fn metrics(&self) -> Option<RecordBatchMetrics> {
self.inner.metrics()
}
}
impl Stream for SendableRecordBatchMapper {
type Item = Result<RecordBatch>;
fn poll_next(mut self: Pin<&mut Self>, cx: &mut Context<'_>) -> Poll<Option<Self::Item>> {
if self.apply_mapper {
Pin::new(&mut self.inner).poll_next(cx).map(|opt| {
opt.map(|result| {
result
.and_then(|batch| (self.mapper)(batch, &self.inner.schema(), &self.schema))
})
})
} else {
Pin::new(&mut self.inner).poll_next(cx)
}
}
}
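// --- Usage sketch, not part of this change: wrap a stream so JSON columns are emitted as
// --- strings (as COPY TABLE TO now does). Assumes `SendableRecordBatchStream` is the usual
// --- `Pin<Box<dyn RecordBatchStream + Send>>` alias defined in this crate.
pub fn json_columns_as_strings(inner: SendableRecordBatchStream) -> SendableRecordBatchStream {
    // When the schema has no JSON columns, `apply_mapper` is false and batches pass through.
    Box::pin(SendableRecordBatchMapper::new(
        inner,
        map_json_type_to_string,
        map_json_type_to_string_schema,
    ))
}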
/// EmptyRecordBatchStream can be used to create a RecordBatchStream
/// that will produce no results
pub struct EmptyRecordBatchStream {

View File

@@ -14,8 +14,9 @@
//! Frontend client to run flow as batching task which is time-window-aware normal query triggered every tick set by user
use std::sync::{Arc, Weak};
use std::time::SystemTime;
use std::collections::HashMap;
use std::sync::{Arc, Mutex, Weak};
use std::time::{Duration, Instant, SystemTime};
use api::v1::greptime_request::Request;
use api::v1::CreateTableExpr;
@@ -26,20 +27,21 @@ use common_meta::cluster::{NodeInfo, NodeInfoKey, Role};
use common_meta::peer::Peer;
use common_meta::rpc::store::RangeRequest;
use common_query::Output;
use common_telemetry::warn;
use common_telemetry::{debug, warn};
use itertools::Itertools;
use meta_client::client::MetaClient;
use rand::rng;
use rand::seq::SliceRandom;
use servers::query_handler::grpc::GrpcQueryHandler;
use session::context::{QueryContextBuilder, QueryContextRef};
use snafu::{OptionExt, ResultExt};
use crate::batching_mode::task::BatchingTask;
use crate::batching_mode::{
DEFAULT_BATCHING_ENGINE_QUERY_TIMEOUT, FRONTEND_ACTIVITY_TIMEOUT, GRPC_CONN_TIMEOUT,
GRPC_MAX_RETRIES,
};
use crate::error::{ExternalSnafu, InvalidRequestSnafu, NoAvailableFrontendSnafu, UnexpectedSnafu};
use crate::{Error, FlowAuthHeader};
use crate::metrics::METRIC_FLOW_BATCHING_ENGINE_GUESS_FE_LOAD;
use crate::{Error, FlowAuthHeader, FlowId};
/// Just like [`GrpcQueryHandler`] but use BoxedError
///
@@ -74,6 +76,105 @@ impl<
type HandlerMutable = Arc<std::sync::Mutex<Option<Weak<dyn GrpcQueryHandlerWithBoxedError>>>>;
/// Statistics about running query on this frontend from flownode
#[derive(Debug, Default, Clone)]
struct FrontendStat {
/// The query for flow id has been running since this timestamp
since: HashMap<FlowId, Instant>,
/// The average query time for each flow id
/// This is used to calculate the average query time for each flow id
past_query_avg: HashMap<FlowId, (usize, Duration)>,
}
#[derive(Debug, Default, Clone)]
pub struct FrontendStats {
/// The statistics for each flow id
stats: Arc<Mutex<HashMap<String, FrontendStat>>>,
}
impl FrontendStats {
pub fn observe(&self, frontend_addr: &str, flow_id: FlowId) -> FrontendStatsGuard {
let mut stats = self.stats.lock().expect("Failed to lock frontend stats");
let stat = stats.entry(frontend_addr.to_string()).or_default();
stat.since.insert(flow_id, Instant::now());
FrontendStatsGuard {
stats: self.stats.clone(),
frontend_addr: frontend_addr.to_string(),
cur: flow_id,
}
}
/// return frontend addrs sorted by load, from lightest to heaviest
/// The load is calculated as the total average query time for each flow id plus running query's total running time elapsed
pub fn sort_by_load(&self) -> Vec<String> {
let stats = self.stats.lock().expect("Failed to lock frontend stats");
let fe_load_factor = stats
.iter()
.map(|(node_addr, stat)| {
// total expected avg running time for all currently running queries
let total_expect_avg_run_time = stat
.since
.keys()
.map(|f| {
let (count, total_duration) =
stat.past_query_avg.get(f).unwrap_or(&(0, Duration::ZERO));
if *count == 0 {
0.0
} else {
total_duration.as_secs_f64() / *count as f64
}
})
.sum::<f64>();
let total_cur_running_time = stat
.since
.values()
.map(|since| since.elapsed().as_secs_f64())
.sum::<f64>();
(
node_addr.to_string(),
total_expect_avg_run_time + total_cur_running_time,
)
})
.sorted_by(|(_, load_a), (_, load_b)| {
load_a
.partial_cmp(load_b)
.unwrap_or(std::cmp::Ordering::Equal)
})
.collect::<Vec<_>>();
debug!("Frontend load factor: {:?}", fe_load_factor);
for (node_addr, load) in &fe_load_factor {
METRIC_FLOW_BATCHING_ENGINE_GUESS_FE_LOAD
.with_label_values(&[&node_addr.to_string()])
.observe(*load);
}
fe_load_factor
.into_iter()
.map(|(addr, _)| addr)
.collect::<Vec<_>>()
}
}
pub struct FrontendStatsGuard {
stats: Arc<Mutex<HashMap<String, FrontendStat>>>,
frontend_addr: String,
cur: FlowId,
}
impl Drop for FrontendStatsGuard {
fn drop(&mut self) {
let mut stats = self.stats.lock().expect("Failed to lock frontend stats");
if let Some(stat) = stats.get_mut(&self.frontend_addr) {
if let Some(since) = stat.since.remove(&self.cur) {
let elapsed = since.elapsed();
let (count, total_duration) = stat.past_query_avg.entry(self.cur).or_default();
*count += 1;
*total_duration += elapsed;
}
}
}
}
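A minimal usage sketch of the guard pattern above (not part of the diff): the frontend address and flow id are made up, and FlowId is assumed to be a plain numeric alias.
fn pick_frontend_demo(stats: &FrontendStats) {
    {
        // While the query runs it is tracked in `since`; when the guard drops,
        // the elapsed time is folded into `past_query_avg` for that frontend.
        let _guard = stats.observe("10.0.0.1:4001", 1);
        // ... run the query against 10.0.0.1:4001 ...
    } // guard dropped here, the running average is updated
    // The next scheduling decision can then prefer the lightest frontend.
    let ordered = stats.sort_by_load();
    if let Some(lightest) = ordered.first() {
        println!("prefer frontend {lightest}");
    }
}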
/// A simple frontend client able to execute sql using grpc protocol
///
/// This is for computation-heavy queries which need to offload computation to the frontend, lifting the load from the flownode
@@ -83,6 +184,7 @@ pub enum FrontendClient {
meta_client: Arc<MetaClient>,
chnl_mgr: ChannelManager,
auth: Option<FlowAuthHeader>,
fe_stats: FrontendStats,
},
Standalone {
/// for the sake of simplicity still use grpc even in standalone mode
@@ -114,6 +216,7 @@ impl FrontendClient {
ChannelManager::with_config(cfg)
},
auth,
fe_stats: Default::default(),
}
}
@@ -192,6 +295,7 @@ impl FrontendClient {
meta_client: _,
chnl_mgr,
auth,
fe_stats,
} = self
else {
return UnexpectedSnafu {
@@ -208,8 +312,21 @@ impl FrontendClient {
.duration_since(SystemTime::UNIX_EPOCH)
.unwrap()
.as_millis() as i64;
// shuffle the frontends to avoid always picking the same one
frontends.shuffle(&mut rng());
let node_addrs_by_load = fe_stats.sort_by_load();
// map rank+1 as the load (ascending), so that the lightest node has load 1 and a node not yet in stats has load 0
let addr2load = node_addrs_by_load
.iter()
.enumerate()
.map(|(i, id)| (id.clone(), i + 1))
.collect::<HashMap<_, _>>();
// sort frontends by load, from lightest to heaviest
frontends.sort_by(|(_, a), (_, b)| {
// if not in stats at all, treat as load 0 since it has never been queried
let load_a = addr2load.get(&a.peer.addr).unwrap_or(&0);
let load_b = addr2load.get(&b.peer.addr).unwrap_or(&0);
load_a.cmp(load_b)
});
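An illustration of the rank-to-load mapping above, with hypothetical addresses:
// sort_by_load() -> ["10.0.0.1:4001", "10.0.0.2:4001"]      (10.0.0.1 is the lightest)
// addr2load      -> {"10.0.0.1:4001": 1, "10.0.0.2:4001": 2}
// A frontend that was never observed is absent from the map and falls back to load 0,
// so it sorts first and gets tried before any frontend with recorded load.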
debug!("Frontend nodes sorted by load: {:?}", frontends);
// find the node with the maximum last_activity_ts
for (_, node_info) in frontends
@@ -257,6 +374,7 @@ impl FrontendClient {
create: CreateTableExpr,
catalog: &str,
schema: &str,
task: Option<&BatchingTask>,
) -> Result<u32, Error> {
self.handle(
Request::Ddl(api::v1::DdlRequest {
@@ -265,6 +383,7 @@ impl FrontendClient {
catalog,
schema,
&mut None,
task,
)
.await
}
@@ -276,15 +395,19 @@ impl FrontendClient {
catalog: &str,
schema: &str,
peer_desc: &mut Option<PeerDesc>,
task: Option<&BatchingTask>,
) -> Result<u32, Error> {
match self {
FrontendClient::Distributed { .. } => {
FrontendClient::Distributed { fe_stats, .. } => {
let db = self.get_random_active_frontend(catalog, schema).await?;
*peer_desc = Some(PeerDesc::Dist {
peer: db.peer.clone(),
});
let flow_id = task.map(|t| t.config.flow_id).unwrap_or_default();
let _guard = fe_stats.observe(&db.peer.addr, flow_id);
db.database
.handle_with_retry(req.clone(), GRPC_MAX_RETRIES)
.await

View File

@@ -30,6 +30,9 @@ use crate::batching_mode::task::BatchingTask;
use crate::batching_mode::time_window::TimeWindowExpr;
use crate::batching_mode::MIN_REFRESH_DURATION;
use crate::error::{DatatypesSnafu, InternalSnafu, TimeSnafu, UnexpectedSnafu};
use crate::metrics::{
METRIC_FLOW_BATCHING_ENGINE_QUERY_TIME_RANGE, METRIC_FLOW_BATCHING_ENGINE_QUERY_WINDOW_CNT,
};
use crate::{Error, FlowId};
/// The state of the [`BatchingTask`].
@@ -127,10 +130,10 @@ impl DirtyTimeWindows {
/// Time window merge distance
///
/// TODO(discord9): make those configurable
const MERGE_DIST: i32 = 3;
pub const MERGE_DIST: i32 = 3;
/// Maximum number of filters allowed in a single query
const MAX_FILTER_NUM: usize = 20;
pub const MAX_FILTER_NUM: usize = 20;
/// Add lower bounds to the dirty time windows. Upper bounds are ignored.
///
@@ -154,11 +157,16 @@ impl DirtyTimeWindows {
}
/// Generate all filter expressions consuming all time windows
///
/// There are two limits:
/// - shouldn't return a time range that is too long (<= `window_size * window_cnt`), so that the query can be executed in a reasonable time
/// - shouldn't return too many time-range exprs, so that the query can be parsed properly instead of overflowing the parser
pub fn gen_filter_exprs(
&mut self,
col_name: &str,
expire_lower_bound: Option<Timestamp>,
window_size: chrono::Duration,
window_cnt: usize,
flow_id: FlowId,
task_ctx: Option<&BatchingTask>,
) -> Result<Option<datafusion_expr::Expr>, Error> {
@@ -196,12 +204,33 @@ impl DirtyTimeWindows {
}
}
// get the first `MAX_FILTER_NUM` time windows
let nth = self
.windows
.iter()
.nth(Self::MAX_FILTER_NUM)
.map(|(key, _)| *key);
// take at most `window_cnt` time windows, also capped by `window_size * window_cnt` of total time range
let max_time_range = window_size * window_cnt as i32;
let nth = {
let mut cur_time_range = chrono::Duration::zero();
let mut nth_key = None;
for (idx, (start, end)) in self.windows.iter().enumerate() {
// if time range is too long, stop
if cur_time_range > max_time_range {
nth_key = Some(*start);
break;
}
// if we have enough time windows, stop
if idx >= window_cnt {
nth_key = Some(*start);
break;
}
if let Some(end) = end {
if let Some(x) = end.sub(start) {
cur_time_range += x;
}
}
}
nth_key
};
let first_nth = {
if let Some(nth) = nth {
let mut after = self.windows.split_off(&nth);
@@ -213,6 +242,24 @@ impl DirtyTimeWindows {
}
};
METRIC_FLOW_BATCHING_ENGINE_QUERY_WINDOW_CNT
.with_label_values(&[flow_id.to_string().as_str()])
.observe(first_nth.len() as f64);
let full_time_range = first_nth
.iter()
.fold(chrono::Duration::zero(), |acc, (start, end)| {
if let Some(end) = end {
acc + end.sub(start).unwrap_or(chrono::Duration::zero())
} else {
acc
}
})
.num_seconds() as f64;
METRIC_FLOW_BATCHING_ENGINE_QUERY_TIME_RANGE
.with_label_values(&[flow_id.to_string().as_str()])
.observe(full_time_range);
let mut expr_lst = vec![];
for (start, end) in first_nth.into_iter() {
// align using time window exprs
@@ -274,6 +321,8 @@ impl DirtyTimeWindows {
}
/// Merge time windows that overlap or get too close
///
/// TODO(discord9): prefer not merging and instead send smaller time windows? how?
pub fn merge_dirty_time_windows(
&mut self,
window_size: chrono::Duration,
@@ -472,7 +521,14 @@ mod test {
.unwrap();
assert_eq!(expected, dirty.windows);
let filter_expr = dirty
.gen_filter_exprs("ts", expire_lower_bound, window_size, 0, None)
.gen_filter_exprs(
"ts",
expire_lower_bound,
window_size,
DirtyTimeWindows::MAX_FILTER_NUM,
0,
None,
)
.unwrap();
let unparser = datafusion::sql::unparser::Unparser::default();
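A self-contained sketch of the cutoff rule that `gen_filter_exprs` enforces above, written over plain second offsets instead of `Timestamp` so it stands alone (the function name and map layout are illustrative only):
use std::collections::BTreeMap;
/// Returns the first window key to leave for the next run, i.e. the split point reached
/// once either `window_cnt` windows or roughly `window_size * window_cnt` seconds of
/// dirty time range have been consumed; `None` means everything fits into one query.
fn split_key(
    windows: &BTreeMap<i64, Option<i64>>, // start -> optional end, in seconds
    window_size_secs: i64,
    window_cnt: usize,
) -> Option<i64> {
    let max_range = window_size_secs * window_cnt as i64;
    let mut consumed = 0;
    for (idx, (start, end)) in windows.iter().enumerate() {
        // stop once the accumulated range is too long or enough windows were taken
        if consumed > max_range || idx >= window_cnt {
            return Some(*start);
        }
        if let Some(end) = end {
            consumed += end - start;
        }
    }
    None
}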

View File

@@ -46,7 +46,7 @@ use tokio::time::Instant;
use crate::adapter::{AUTO_CREATED_PLACEHOLDER_TS_COL, AUTO_CREATED_UPDATE_AT_TS_COL};
use crate::batching_mode::frontend_client::FrontendClient;
use crate::batching_mode::state::TaskState;
use crate::batching_mode::state::{DirtyTimeWindows, TaskState};
use crate::batching_mode::time_window::TimeWindowExpr;
use crate::batching_mode::utils::{
get_table_info_df_schema, sql_to_df_plan, AddAutoColumnRewriter, AddFilterRewriter,
@@ -280,7 +280,7 @@ impl BatchingTask {
let catalog = &self.config.sink_table_name[0];
let schema = &self.config.sink_table_name[1];
frontend_client
.create(expr.clone(), catalog, schema)
.create(expr.clone(), catalog, schema, Some(self))
.await?;
Ok(())
}
@@ -361,7 +361,7 @@ impl BatchingTask {
};
frontend_client
.handle(req, catalog, schema, &mut peer_desc)
.handle(req, catalog, schema, &mut peer_desc, Some(self))
.await
};
@@ -387,7 +387,6 @@ impl BatchingTask {
METRIC_FLOW_BATCHING_ENGINE_SLOW_QUERY
.with_label_values(&[
flow_id.to_string().as_str(),
&plan.to_string(),
&peer_desc.unwrap_or_default().to_string(),
])
.observe(elapsed.as_secs_f64());
@@ -429,16 +428,23 @@ impl BatchingTask {
}
}
let mut new_query = None;
let mut gen_and_exec = async || {
new_query = self.gen_insert_plan(&engine).await?;
if let Some(new_query) = &new_query {
self.execute_logical_plan(&frontend_client, new_query).await
} else {
Ok(None)
let new_query = match self.gen_insert_plan(&engine).await {
Ok(new_query) => new_query,
Err(err) => {
common_telemetry::error!(err; "Failed to generate query for flow={}", self.config.flow_id);
// also sleep for a little while before trying again to prevent flooding the logs
tokio::time::sleep(MIN_REFRESH_DURATION).await;
continue;
}
};
match gen_and_exec().await {
let res = if let Some(new_query) = &new_query {
self.execute_logical_plan(&frontend_client, new_query).await
} else {
Ok(None)
};
match res {
// normal execution, sleep for some time before doing the next query
Ok(Some(_)) => {
let sleep_until = {
@@ -583,6 +589,7 @@ impl BatchingTask {
&col_name,
Some(l),
window_size,
DirtyTimeWindows::MAX_FILTER_NUM,
self.config.flow_id,
Some(self),
)?;

View File

@@ -15,9 +15,14 @@
//! Scalar expressions.
use std::collections::{BTreeMap, BTreeSet};
use std::sync::Arc;
use arrow::array::{make_array, ArrayData, ArrayRef};
use arrow::array::{make_array, ArrayData, ArrayRef, BooleanArray};
use arrow::buffer::BooleanBuffer;
use arrow::compute::or_kleene;
use common_error::ext::BoxedError;
use datafusion::physical_expr_common::datum::compare_with_eq;
use datafusion_common::DataFusionError;
use datatypes::prelude::{ConcreteDataType, DataType};
use datatypes::value::Value;
use datatypes::vectors::{BooleanVector, Helper, VectorRef};
@@ -92,6 +97,10 @@ pub enum ScalarExpr {
then: Box<ScalarExpr>,
els: Box<ScalarExpr>,
},
InList {
expr: Box<ScalarExpr>,
list: Vec<ScalarExpr>,
},
}
impl ScalarExpr {
@@ -137,6 +146,7 @@ impl ScalarExpr {
.context(crate::error::ExternalSnafu)?;
Ok(ColumnType::new_nullable(typ))
}
ScalarExpr::InList { expr, .. } => expr.typ(context),
}
}
}
@@ -222,9 +232,57 @@ impl ScalarExpr {
exprs,
} => df_scalar_fn.eval_batch(batch, exprs),
ScalarExpr::If { cond, then, els } => Self::eval_if_then(batch, cond, then, els),
ScalarExpr::InList { expr, list } => Self::eval_in_list(batch, expr, list),
}
}
fn eval_in_list(
batch: &Batch,
expr: &ScalarExpr,
list: &[ScalarExpr],
) -> Result<VectorRef, EvalError> {
let eval_list = list
.iter()
.map(|e| e.eval_batch(batch))
.collect::<Result<Vec<_>, _>>()?;
let eval_expr = expr.eval_batch(batch)?;
ensure!(
eval_list
.iter()
.all(|v| v.data_type() == eval_expr.data_type()),
TypeMismatchSnafu {
expected: eval_expr.data_type(),
actual: eval_list
.iter()
.find(|v| v.data_type() != eval_expr.data_type())
.map(|v| v.data_type())
.unwrap(),
}
);
let lhs = eval_expr.to_arrow_array();
let found = eval_list
.iter()
.map(|v| v.to_arrow_array())
.try_fold(
BooleanArray::new(BooleanBuffer::new_unset(batch.row_count()), None),
|result, in_list_elem| -> Result<BooleanArray, DataFusionError> {
let rhs = compare_with_eq(&lhs, &in_list_elem, false)?;
Ok(or_kleene(&result, &rhs)?)
},
)
.with_context(|_| crate::expr::error::DatafusionSnafu {
context: "Failed to compare eval_expr with eval_list",
})?;
let res = BooleanVector::from(found);
Ok(Arc::new(res))
}
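A small standalone sketch of the IN-list fold used in `eval_in_list` above, on concrete arrow arrays; each list element is pre-expanded to the batch length, mimicking per-row evaluation of a constant list expression (the function name is illustrative only):
use std::sync::Arc;
use arrow::array::{ArrayRef, BooleanArray, Int32Array};
use arrow::buffer::BooleanBuffer;
use arrow::compute::or_kleene;
use datafusion::physical_expr_common::datum::compare_with_eq;
fn in_list_demo() -> datafusion_common::Result<BooleanArray> {
    // `expr IN (2, 5)` evaluated against the column [1, 2, 5, 7]
    // is expected to yield [false, true, true, false].
    let lhs: ArrayRef = Arc::new(Int32Array::from(vec![1, 2, 5, 7]));
    let list: Vec<ArrayRef> = vec![
        Arc::new(Int32Array::from(vec![2, 2, 2, 2])),
        Arc::new(Int32Array::from(vec![5, 5, 5, 5])),
    ];
    // Start from an all-false buffer and OR in each equality comparison,
    // exactly like the fold above.
    list.iter().try_fold(
        BooleanArray::new(BooleanBuffer::new_unset(lhs.len()), None),
        |acc, rhs| Ok(or_kleene(&acc, &compare_with_eq(&lhs, rhs, false)?)?),
    )
}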
/// NOTE: this if-then eval impl assumes all given exprs are pure and will not change the state of the world,
/// since it evaluates both the then and else branches and filters the result
fn eval_if_then(
@@ -337,6 +395,15 @@ impl ScalarExpr {
df_scalar_fn,
exprs,
} => df_scalar_fn.eval(values, exprs),
ScalarExpr::InList { expr, list } => {
let eval_expr = expr.eval(values)?;
let eval_list = list
.iter()
.map(|v| v.eval(values))
.collect::<Result<Vec<_>, _>>()?;
let found = eval_list.iter().any(|item| *item == eval_expr);
Ok(Value::Boolean(found))
}
}
}
@@ -514,6 +581,13 @@ impl ScalarExpr {
}
Ok(())
}
ScalarExpr::InList { expr, list } => {
f(expr)?;
for item in list {
f(item)?;
}
Ok(())
}
}
}
@@ -558,6 +632,13 @@ impl ScalarExpr {
}
Ok(())
}
ScalarExpr::InList { expr, list } => {
f(expr)?;
for item in list {
f(item)?;
}
Ok(())
}
}
}
}

View File

@@ -38,10 +38,34 @@ lazy_static! {
pub static ref METRIC_FLOW_BATCHING_ENGINE_SLOW_QUERY: HistogramVec = register_histogram_vec!(
"greptime_flow_batching_engine_slow_query_secs",
"flow batching engine slow query(seconds)",
&["flow_id", "sql", "peer"],
&["flow_id", "peer"],
vec![60., 2. * 60., 3. * 60., 5. * 60., 10. * 60.]
)
.unwrap();
pub static ref METRIC_FLOW_BATCHING_ENGINE_QUERY_WINDOW_CNT: HistogramVec =
register_histogram_vec!(
"greptime_flow_batching_engine_query_window_cnt",
"flow batching engine query time window count",
&["flow_id"],
vec![0.0, 5., 10., 20., 40.]
)
.unwrap();
pub static ref METRIC_FLOW_BATCHING_ENGINE_QUERY_TIME_RANGE: HistogramVec =
register_histogram_vec!(
"greptime_flow_batching_engine_query_time_range_secs",
"flow batching engine query time range(seconds)",
&["flow_id"],
vec![60., 4. * 60., 16. * 60., 64. * 60., 256. * 60.]
)
.unwrap();
pub static ref METRIC_FLOW_BATCHING_ENGINE_GUESS_FE_LOAD: HistogramVec =
register_histogram_vec!(
"greptime_flow_batching_engine_guess_fe_load",
"flow batching engine guessed frontend load",
&["fe_addr"],
vec![60., 4. * 60., 16. * 60., 64. * 60., 256. * 60.]
)
.unwrap();
pub static ref METRIC_FLOW_RUN_INTERVAL_MS: IntGauge =
register_int_gauge!("greptime_flow_run_interval_ms", "flow run interval in ms").unwrap();
pub static ref METRIC_FLOW_ROWS: IntCounterVec = register_int_counter_vec!(

View File

@@ -596,7 +596,7 @@ impl FrontendInvoker {
.start_timer();
self.inserter
.handle_row_inserts(requests, ctx, &self.statement_executor, false)
.handle_row_inserts(requests, ctx, &self.statement_executor, false, false)
.await
.map_err(BoxedError::new)
.context(common_frontend::error::ExternalSnafu)

View File

@@ -476,11 +476,27 @@ impl TypedExpr {
let substrait_expr = s.value.as_ref().with_context(|| InvalidQuerySnafu {
reason: "SingularOrList expression without value",
})?;
let typed_expr =
TypedExpr::from_substrait_rex(substrait_expr, input_schema, extensions).await?;
// Note that we didn't implement support for in-list exprs
if !s.options.is_empty() {
return not_impl_err!("In list expression is not supported");
let mut list = Vec::with_capacity(s.options.len());
for opt in s.options.iter() {
let opt_expr =
TypedExpr::from_substrait_rex(opt, input_schema, extensions).await?;
list.push(opt_expr.expr);
}
let in_list_expr = ScalarExpr::InList {
expr: Box::new(typed_expr.expr),
list,
};
Ok(TypedExpr::new(
in_list_expr,
ColumnType::new_nullable(CDT::boolean_datatype()),
))
} else {
Ok(typed_expr)
}
TypedExpr::from_substrait_rex(substrait_expr, input_schema, extensions).await
}
Some(RexType::Selection(field_ref)) => match &field_ref.reference_type {
Some(DirectReference(direct)) => match &direct.reference_type.as_ref() {

View File

@@ -6,7 +6,7 @@ license.workspace = true
[features]
testing = []
enterprise = ["operator/enterprise", "sql/enterprise"]
enterprise = ["common-meta/enterprise", "operator/enterprise", "sql/enterprise"]
[lints]
workspace = true

View File

@@ -76,7 +76,7 @@ impl GrpcQueryHandler for Instance {
let output = match request {
Request::Inserts(requests) => self.handle_inserts(requests, ctx.clone()).await?,
Request::RowInserts(requests) => {
self.handle_row_inserts(requests, ctx.clone(), false)
self.handle_row_inserts(requests, ctx.clone(), false, false)
.await?
}
Request::Deletes(requests) => self.handle_deletes(requests, ctx.clone()).await?,
@@ -420,6 +420,7 @@ impl Instance {
requests: RowInsertRequests,
ctx: QueryContextRef,
accommodate_existing_schema: bool,
is_single_value: bool,
) -> Result<Output> {
self.inserter
.handle_row_inserts(
@@ -427,6 +428,7 @@ impl Instance {
ctx,
self.statement_executor.as_ref(),
accommodate_existing_schema,
is_single_value,
)
.await
.context(TableOperationSnafu)
@@ -439,7 +441,14 @@ impl Instance {
ctx: QueryContextRef,
) -> Result<Output> {
self.inserter
.handle_last_non_null_inserts(requests, ctx, self.statement_executor.as_ref(), true)
.handle_last_non_null_inserts(
requests,
ctx,
self.statement_executor.as_ref(),
true,
// The InfluxDB protocol may write multiple fields (values).
false,
)
.await
.context(TableOperationSnafu)
}

View File

@@ -52,8 +52,9 @@ impl OpentsdbProtocolHandler for Instance {
None
};
// OpenTSDB is single value.
let output = self
.handle_row_inserts(requests, ctx, true)
.handle_row_inserts(requests, ctx, true, true)
.await
.map_err(BoxedError::new)
.context(servers::error::ExecuteGrpcQuerySnafu)?;

View File

@@ -63,7 +63,7 @@ impl OpenTelemetryProtocolHandler for Instance {
None
};
self.handle_row_inserts(requests, ctx, false)
self.handle_row_inserts(requests, ctx, false, false)
.await
.map_err(BoxedError::new)
.context(error::ExecuteGrpcQuerySnafu)
@@ -125,7 +125,7 @@ impl OpenTelemetryProtocolHandler for Instance {
pipeline_params: GreptimePipelineParams,
table_name: String,
ctx: QueryContextRef,
) -> ServerResult<Output> {
) -> ServerResult<Vec<Output>> {
self.plugins
.get::<PermissionCheckerRef>()
.as_ref()
@@ -137,7 +137,7 @@ impl OpenTelemetryProtocolHandler for Instance {
.get::<OpenTelemetryProtocolInterceptorRef<servers::error::Error>>();
interceptor_ref.pre_execute(ctx.clone())?;
let (requests, rows) = otlp::logs::to_grpc_insert_requests(
let opt_req = otlp::logs::to_grpc_insert_requests(
request,
pipeline,
pipeline_params,
@@ -148,7 +148,7 @@ impl OpenTelemetryProtocolHandler for Instance {
.await?;
let _guard = if let Some(limiter) = &self.limiter {
let result = limiter.limit_row_inserts(&requests);
let result = limiter.limit_ctx_req(&opt_req);
if result.is_none() {
return InFlightWriteBytesExceededSnafu.fail();
}
@@ -157,10 +157,24 @@ impl OpenTelemetryProtocolHandler for Instance {
None
};
self.handle_log_inserts(requests, ctx)
.await
.inspect(|_| OTLP_LOGS_ROWS.inc_by(rows as u64))
.map_err(BoxedError::new)
.context(error::ExecuteGrpcQuerySnafu)
let mut outputs = vec![];
for (temp_ctx, requests) in opt_req.as_req_iter(ctx) {
let cnt = requests
.inserts
.iter()
.filter_map(|r| r.rows.as_ref().map(|r| r.rows.len()))
.sum::<usize>();
let o = self
.handle_log_inserts(requests, temp_ctx)
.await
.inspect(|_| OTLP_LOGS_ROWS.inc_by(cnt as u64))
.map_err(BoxedError::new)
.context(error::ExecuteGrpcQuerySnafu)?;
outputs.push(o);
}
Ok(outputs)
}
}

View File

@@ -195,7 +195,7 @@ impl PromStoreProtocolHandler for Instance {
.map_err(BoxedError::new)
.context(error::ExecuteGrpcQuerySnafu)?
} else {
self.handle_row_inserts(request, ctx.clone(), true)
self.handle_row_inserts(request, ctx.clone(), true, true)
.await
.map_err(BoxedError::new)
.context(error::ExecuteGrpcQuerySnafu)?

View File

@@ -18,8 +18,11 @@ use std::sync::Arc;
use api::v1::column::Values;
use api::v1::greptime_request::Request;
use api::v1::value::ValueData;
use api::v1::{Decimal128, InsertRequests, IntervalMonthDayNano, RowInsertRequests};
use api::v1::{
Decimal128, InsertRequests, IntervalMonthDayNano, RowInsertRequest, RowInsertRequests,
};
use common_telemetry::{debug, warn};
use pipeline::ContextReq;
pub(crate) type LimiterRef = Arc<Limiter>;
@@ -75,7 +78,9 @@ impl Limiter {
pub fn limit_request(&self, request: &Request) -> Option<InFlightWriteBytesCounter> {
let size = match request {
Request::Inserts(requests) => self.insert_requests_data_size(requests),
Request::RowInserts(requests) => self.rows_insert_requests_data_size(requests),
Request::RowInserts(requests) => {
self.rows_insert_requests_data_size(requests.inserts.iter())
}
_ => 0,
};
self.limit_in_flight_write_bytes(size as u64)
@@ -85,7 +90,12 @@ impl Limiter {
&self,
requests: &RowInsertRequests,
) -> Option<InFlightWriteBytesCounter> {
let size = self.rows_insert_requests_data_size(requests);
let size = self.rows_insert_requests_data_size(requests.inserts.iter());
self.limit_in_flight_write_bytes(size as u64)
}
pub fn limit_ctx_req(&self, opt_req: &ContextReq) -> Option<InFlightWriteBytesCounter> {
let size = self.rows_insert_requests_data_size(opt_req.ref_all_req());
self.limit_in_flight_write_bytes(size as u64)
}
@@ -137,9 +147,12 @@ impl Limiter {
size
}
fn rows_insert_requests_data_size(&self, request: &RowInsertRequests) -> usize {
fn rows_insert_requests_data_size<'a>(
&self,
inserts: impl Iterator<Item = &'a RowInsertRequest>,
) -> usize {
let mut size: usize = 0;
for insert in &request.inserts {
for insert in inserts {
if let Some(rows) = &insert.rows {
for row in &rows.rows {
for value in &row.values {

View File

@@ -233,7 +233,7 @@ impl SlowQueryEventHandler {
.into();
self.inserter
.handle_row_inserts(requests, query_ctx, &self.statement_executor, false)
.handle_row_inserts(requests, query_ctx, &self.statement_executor, false, false)
.await
.context(TableOperationSnafu)?;

View File

@@ -169,7 +169,6 @@ fn convert_to_naive_entry(provider: Arc<KafkaProvider>, record: Record) -> Entry
Entry::Naive(NaiveEntry {
provider: Provider::Kafka(provider),
region_id,
// TODO(weny): should be the offset in the topic
entry_id: record.meta.entry_id,
data: record.data,
})
@@ -182,6 +181,7 @@ fn convert_to_multiple_entry(
) -> Entry {
let mut headers = Vec::with_capacity(records.len());
let mut parts = Vec::with_capacity(records.len());
let entry_id = records.last().map(|r| r.meta.entry_id).unwrap_or_default();
for record in records {
let header = match record.meta.tp {
@@ -197,8 +197,7 @@ fn convert_to_multiple_entry(
Entry::MultiplePart(MultiplePartEntry {
provider: Provider::Kafka(provider),
region_id,
// TODO(weny): should be the offset in the topic
entry_id: 0,
entry_id,
headers,
parts,
})
@@ -369,8 +368,7 @@ mod tests {
Entry::MultiplePart(MultiplePartEntry {
provider: Provider::Kafka(provider.clone()),
region_id,
// TODO(weny): always be 0.
entry_id: 0,
entry_id: 1,
headers: vec![MultiplePartHeader::First],
parts: vec![vec![1; 100]],
})
@@ -388,8 +386,7 @@ mod tests {
Entry::MultiplePart(MultiplePartEntry {
provider: Provider::Kafka(provider.clone()),
region_id,
// TODO(weny): always be 0.
entry_id: 0,
entry_id: 1,
headers: vec![MultiplePartHeader::Last],
parts: vec![vec![1; 100]],
})
@@ -411,8 +408,7 @@ mod tests {
Entry::MultiplePart(MultiplePartEntry {
provider: Provider::Kafka(provider),
region_id,
// TODO(weny): always be 0.
entry_id: 0,
entry_id: 1,
headers: vec![MultiplePartHeader::Middle(0)],
parts: vec![vec![1; 100]],
})

View File

@@ -9,6 +9,7 @@ mock = []
pg_kvbackend = ["dep:tokio-postgres", "common-meta/pg_kvbackend", "dep:deadpool-postgres", "dep:deadpool"]
mysql_kvbackend = ["dep:sqlx", "common-meta/mysql_kvbackend"]
testing = ["common-wal/testing"]
enterprise = ["common-meta/enterprise"]
[lints]
workspace = true

View File

@@ -61,9 +61,9 @@ use tonic::transport::server::{Router, TcpIncoming};
use crate::election::etcd::EtcdElection;
#[cfg(feature = "mysql_kvbackend")]
use crate::election::mysql::MySqlElection;
use crate::election::rds::mysql::MySqlElection;
#[cfg(feature = "pg_kvbackend")]
use crate::election::postgres::PgElection;
use crate::election::rds::postgres::PgElection;
#[cfg(any(feature = "pg_kvbackend", feature = "mysql_kvbackend"))]
use crate::election::CANDIDATE_LEASE_SECS;
use crate::metasrv::builder::MetasrvBuilder;

View File

@@ -13,15 +13,14 @@
// limitations under the License.
pub mod etcd;
#[cfg(feature = "mysql_kvbackend")]
pub mod mysql;
#[cfg(feature = "pg_kvbackend")]
pub mod postgres;
#[cfg(any(feature = "pg_kvbackend", feature = "mysql_kvbackend"))]
pub mod rds;
use std::fmt::{self, Debug};
use std::sync::atomic::{AtomicBool, Ordering};
use std::sync::Arc;
use common_telemetry::{info, warn};
use common_telemetry::{error, info, warn};
use tokio::sync::broadcast::error::RecvError;
use tokio::sync::broadcast::{self, Receiver, Sender};
@@ -110,6 +109,28 @@ fn listen_leader_change(leader_value: String) -> Sender<LeaderChangeMessage> {
tx
}
/// Sends a leader change message to the channel and sets the `is_leader` flag.
/// If a leader is elected, it will also set the `leader_infancy` flag to true.
fn send_leader_change_and_set_flags(
is_leader: &AtomicBool,
leader_infancy: &AtomicBool,
tx: &Sender<LeaderChangeMessage>,
msg: LeaderChangeMessage,
) {
let is_elected = matches!(msg, LeaderChangeMessage::Elected(_));
if is_leader
.compare_exchange(!is_elected, is_elected, Ordering::AcqRel, Ordering::Acquire)
.is_ok()
{
if is_elected {
leader_infancy.store(true, Ordering::Release);
}
if let Err(e) = tx.send(msg) {
error!(e; "Failed to send leader change message");
}
}
}
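The transitions that the compare_exchange guard above allows, as a descriptive sketch (no additional behavior implied):
// current is_leader | message  | CAS(!is_elected, is_elected) | effect
// false             | Elected  | succeeds                     | infancy set, message sent
// true              | Elected  | fails                        | nothing sent (no duplicate)
// true              | StepDown | succeeds                     | message sent
// false             | StepDown | fails                        | nothing sent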
#[async_trait::async_trait]
pub trait Election: Send + Sync {
type Leader;

View File

@@ -27,8 +27,8 @@ use tokio::sync::broadcast::Receiver;
use tokio::time::{timeout, MissedTickBehavior};
use crate::election::{
listen_leader_change, Election, LeaderChangeMessage, LeaderKey, CANDIDATES_ROOT,
CANDIDATE_LEASE_SECS, ELECTION_KEY, KEEP_ALIVE_INTERVAL_SECS,
listen_leader_change, send_leader_change_and_set_flags, Election, LeaderChangeMessage,
LeaderKey, CANDIDATES_ROOT, CANDIDATE_LEASE_SECS, ELECTION_KEY, KEEP_ALIVE_INTERVAL_SECS,
};
use crate::error;
use crate::error::Result;
@@ -247,18 +247,12 @@ impl Election for EtcdElection {
}
}
if self
.is_leader
.compare_exchange(true, false, Ordering::AcqRel, Ordering::Acquire)
.is_ok()
{
if let Err(e) = self
.leader_watcher
.send(LeaderChangeMessage::StepDown(Arc::new(leader.clone())))
{
error!(e; "Failed to send leader change message");
}
}
send_leader_change_and_set_flags(
&self.is_leader,
&self.infancy,
&self.leader_watcher,
LeaderChangeMessage::StepDown(Arc::new(leader.clone())),
);
}
Ok(())
@@ -305,20 +299,12 @@ impl EtcdElection {
);
// Only after a successful `keep_alive` is the leader considered official.
if self
.is_leader
.compare_exchange(false, true, Ordering::AcqRel, Ordering::Acquire)
.is_ok()
{
self.infancy.store(true, Ordering::Release);
if let Err(e) = self
.leader_watcher
.send(LeaderChangeMessage::Elected(Arc::new(leader)))
{
error!(e; "Failed to send leader change message");
}
}
send_leader_change_and_set_flags(
&self.is_leader,
&self.infancy,
&self.leader_watcher,
LeaderChangeMessage::Elected(Arc::new(leader.clone())),
);
}
Ok(())

View File

@@ -0,0 +1,90 @@
// Copyright 2023 Greptime Team
//
// Licensed under the Apache License, Version 2.0 (the "License");
// you may not use this file except in compliance with the License.
// You may obtain a copy of the License at
//
// http://www.apache.org/licenses/LICENSE-2.0
//
// Unless required by applicable law or agreed to in writing, software
// distributed under the License is distributed on an "AS IS" BASIS,
// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
// See the License for the specific language governing permissions and
// limitations under the License.
#[cfg(feature = "mysql_kvbackend")]
pub mod mysql;
#[cfg(feature = "pg_kvbackend")]
pub mod postgres;
use common_time::Timestamp;
use itertools::Itertools;
use snafu::OptionExt;
use crate::election::LeaderKey;
use crate::error::{Result, UnexpectedSnafu};
// Separator between value and expire time in the lease string.
// A lease is put into rds election in the format:
// <node_info> || __metadata_lease_sep || <expire_time>
const LEASE_SEP: &str = r#"||__metadata_lease_sep||"#;
/// Parses the value and expire time from the given string retrieved from rds.
fn parse_value_and_expire_time(value: &str) -> Result<(String, Timestamp)> {
let (value, expire_time) =
value
.split(LEASE_SEP)
.collect_tuple()
.with_context(|| UnexpectedSnafu {
violated: format!(
"Invalid value {}, expect node info || {} || expire time",
value, LEASE_SEP
),
})?;
// Given expire_time is in the format 'YYYY-MM-DD HH24:MI:SS.MS'
let expire_time = match Timestamp::from_str(expire_time, None) {
Ok(ts) => ts,
Err(_) => UnexpectedSnafu {
violated: format!("Invalid timestamp: {}", expire_time),
}
.fail()?,
};
Ok((value.to_string(), expire_time))
}
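A round-trip sketch of the lease format described above (the node address is hypothetical; `Result` is the crate's alias imported in this module):
#[allow(dead_code)]
fn lease_round_trip_demo() -> Result<()> {
    // stored as "<node_info>||__metadata_lease_sep||<expire_time>"
    let stored = format!("10.0.0.1:3002{LEASE_SEP}2025-06-08 14:17:32.000");
    let (node_info, expire_time) = parse_value_and_expire_time(&stored)?;
    assert_eq!(node_info, "10.0.0.1:3002");
    assert_eq!(
        expire_time,
        Timestamp::from_str("2025-06-08 14:17:32.000", None).unwrap()
    );
    Ok(())
}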
/// LeaderKey used for [LeaderChangeMessage] in rds election components.
#[derive(Debug, Clone, Default)]
struct RdsLeaderKey {
name: Vec<u8>,
key: Vec<u8>,
rev: i64,
lease: i64,
}
impl LeaderKey for RdsLeaderKey {
fn name(&self) -> &[u8] {
&self.name
}
fn key(&self) -> &[u8] {
&self.key
}
fn revision(&self) -> i64 {
self.rev
}
fn lease_id(&self) -> i64 {
self.lease
}
}
/// Lease information for rds election.
#[derive(Default, Clone, Debug)]
struct Lease {
leader_value: String,
expire_time: Timestamp,
current: Timestamp,
// `origin` is the original value of the lease, used for CAS.
origin: String,
}

View File

@@ -16,9 +16,8 @@ use std::sync::atomic::{AtomicBool, Ordering};
use std::sync::Arc;
use std::time::Duration;
use common_telemetry::{error, warn};
use common_telemetry::warn;
use common_time::Timestamp;
use itertools::Itertools;
use snafu::{ensure, OptionExt, ResultExt};
use sqlx::mysql::{MySqlArguments, MySqlRow};
use sqlx::query::Query;
@@ -26,8 +25,10 @@ use sqlx::{MySql, MySqlConnection, MySqlTransaction, Row};
use tokio::sync::{broadcast, Mutex, MutexGuard};
use tokio::time::MissedTickBehavior;
use crate::election::rds::{parse_value_and_expire_time, Lease, RdsLeaderKey, LEASE_SEP};
use crate::election::{
listen_leader_change, Election, LeaderChangeMessage, LeaderKey, CANDIDATES_ROOT, ELECTION_KEY,
listen_leader_change, send_leader_change_and_set_flags, Election, LeaderChangeMessage,
CANDIDATES_ROOT, ELECTION_KEY,
};
use crate::error::{
DeserializeFromJsonSnafu, MySqlExecutionSnafu, NoLeaderSnafu, Result, SerializeToJsonSnafu,
@@ -35,20 +36,6 @@ use crate::error::{
};
use crate::metasrv::{ElectionRef, LeaderValue, MetasrvNodeInfo};
// Separator between value and expire time.
const LEASE_SEP: &str = r#"||__metadata_lease_sep||"#;
/// Lease information.
/// TODO(CookiePie): PgElection can also use this struct. Refactor it to a common module.
#[derive(Default, Clone, Debug)]
struct Lease {
leader_value: String,
expire_time: Timestamp,
current: Timestamp,
// origin is the origin value of the lease, used for CAS.
origin: String,
}
struct ElectionSqlFactory<'a> {
table_name: &'a str,
meta_lease_ttl_secs: u64,
@@ -204,55 +191,6 @@ impl<'a> ElectionSqlFactory<'a> {
}
}
/// Parse the value and expire time from the given string. The value should be in the format "value || LEASE_SEP || expire_time".
fn parse_value_and_expire_time(value: &str) -> Result<(String, Timestamp)> {
let (value, expire_time) =
value
.split(LEASE_SEP)
.collect_tuple()
.with_context(|| UnexpectedSnafu {
violated: format!(
"Invalid value {}, expect node info || {} || expire time",
value, LEASE_SEP
),
})?;
// Given expire_time is in the format 'YYYY-MM-DD HH24:MI:SS.MS'
let expire_time = match Timestamp::from_str(expire_time, None) {
Ok(ts) => ts,
Err(_) => UnexpectedSnafu {
violated: format!("Invalid timestamp: {}", expire_time),
}
.fail()?,
};
Ok((value.to_string(), expire_time))
}
#[derive(Debug, Clone, Default)]
struct MySqlLeaderKey {
name: Vec<u8>,
key: Vec<u8>,
rev: i64,
lease: i64,
}
impl LeaderKey for MySqlLeaderKey {
fn name(&self) -> &[u8] {
&self.name
}
fn key(&self) -> &[u8] {
&self.key
}
fn revision(&self) -> i64 {
self.rev
}
fn lease_id(&self) -> i64 {
self.lease
}
}
enum Executor<'a> {
Default(MutexGuard<'a, MySqlConnection>),
Txn(MySqlTransaction<'a>),
@@ -767,23 +705,17 @@ impl MySqlElection {
/// Still considers itself the leader locally but failed to acquire the lock. Steps down without deleting the key.
async fn step_down_without_lock(&self) -> Result<()> {
let key = self.election_key().into_bytes();
let leader_key = MySqlLeaderKey {
let leader_key = RdsLeaderKey {
name: self.leader_value.clone().into_bytes(),
key: key.clone(),
..Default::default()
};
if self
.is_leader
.compare_exchange(true, false, Ordering::AcqRel, Ordering::Acquire)
.is_ok()
{
if let Err(e) = self
.leader_watcher
.send(LeaderChangeMessage::StepDown(Arc::new(leader_key)))
{
error!(e; "Failed to send leader change message");
}
}
send_leader_change_and_set_flags(
&self.is_leader,
&self.leader_infancy,
&self.leader_watcher,
LeaderChangeMessage::StepDown(Arc::new(leader_key)),
);
Ok(())
}
@@ -791,7 +723,7 @@ impl MySqlElection {
/// Caution: Should only be elected while holding the lock.
async fn elected(&self, executor: &mut Executor<'_>) -> Result<()> {
let key = self.election_key();
let leader_key = MySqlLeaderKey {
let leader_key = RdsLeaderKey {
name: self.leader_value.clone().into_bytes(),
key: key.clone().into_bytes(),
..Default::default()
@@ -800,20 +732,12 @@ impl MySqlElection {
self.put_value_with_lease(&key, &self.leader_value, self.meta_lease_ttl_secs, executor)
.await?;
if self
.is_leader
.compare_exchange(false, true, Ordering::AcqRel, Ordering::Acquire)
.is_ok()
{
self.leader_infancy.store(true, Ordering::Release);
if let Err(e) = self
.leader_watcher
.send(LeaderChangeMessage::Elected(Arc::new(leader_key)))
{
error!(e; "Failed to send leader change message");
}
}
send_leader_change_and_set_flags(
&self.is_leader,
&self.leader_infancy,
&self.leader_watcher,
LeaderChangeMessage::Elected(Arc::new(leader_key)),
);
Ok(())
}

View File

@@ -18,15 +18,16 @@ use std::time::Duration;
use common_telemetry::{error, warn};
use common_time::Timestamp;
use itertools::Itertools;
use snafu::{ensure, OptionExt, ResultExt};
use tokio::sync::broadcast;
use tokio::time::MissedTickBehavior;
use tokio_postgres::types::ToSql;
use tokio_postgres::Client;
use crate::election::rds::{parse_value_and_expire_time, Lease, RdsLeaderKey, LEASE_SEP};
use crate::election::{
listen_leader_change, Election, LeaderChangeMessage, LeaderKey, CANDIDATES_ROOT, ELECTION_KEY,
listen_leader_change, send_leader_change_and_set_flags, Election, LeaderChangeMessage,
CANDIDATES_ROOT, ELECTION_KEY,
};
use crate::error::{
DeserializeFromJsonSnafu, NoLeaderSnafu, PostgresExecutionSnafu, Result, SerializeToJsonSnafu,
@@ -34,9 +35,6 @@ use crate::error::{
};
use crate::metasrv::{ElectionRef, LeaderValue, MetasrvNodeInfo};
// Separator between value and expire time.
const LEASE_SEP: &str = r#"||__metadata_lease_sep||"#;
struct ElectionSqlFactory<'a> {
lock_id: u64,
table_name: &'a str,
@@ -173,54 +171,6 @@ impl<'a> ElectionSqlFactory<'a> {
}
}
/// Parse the value and expire time from the given string. The value should be in the format "value || LEASE_SEP || expire_time".
fn parse_value_and_expire_time(value: &str) -> Result<(String, Timestamp)> {
let (value, expire_time) = value
.split(LEASE_SEP)
.collect_tuple()
.context(UnexpectedSnafu {
violated: format!(
"Invalid value {}, expect node info || {} || expire time",
value, LEASE_SEP
),
})?;
// Given expire_time is in the format 'YYYY-MM-DD HH24:MI:SS.MS'
let expire_time = match Timestamp::from_str(expire_time, None) {
Ok(ts) => ts,
Err(_) => UnexpectedSnafu {
violated: format!("Invalid timestamp: {}", expire_time),
}
.fail()?,
};
Ok((value.to_string(), expire_time))
}
#[derive(Debug, Clone, Default)]
struct PgLeaderKey {
name: Vec<u8>,
key: Vec<u8>,
rev: i64,
lease: i64,
}
impl LeaderKey for PgLeaderKey {
fn name(&self) -> &[u8] {
&self.name
}
fn key(&self) -> &[u8] {
&self.key
}
fn revision(&self) -> i64 {
self.rev
}
fn lease_id(&self) -> i64 {
self.lease
}
}
/// PostgreSql implementation of Election.
pub struct PgElection {
leader_value: String,
@@ -314,27 +264,31 @@ impl Election for PgElection {
loop {
let _ = keep_alive_interval.tick().await;
let (_, prev_expire_time, current_time, origin) = self
.get_value_with_lease(&key, true)
let lease = self
.get_value_with_lease(&key)
.await?
.unwrap_or_default();
.context(UnexpectedSnafu {
violated: format!("Failed to get lease for key: {:?}", key),
})?;
ensure!(
prev_expire_time > current_time,
lease.expire_time > lease.current,
UnexpectedSnafu {
violated: format!(
"Candidate lease expired at {:?} (current time {:?}), key: {:?}",
prev_expire_time,
current_time,
String::from_utf8_lossy(&key.into_bytes())
lease.expire_time, lease.current, key
),
}
);
// Safety: origin is Some since we are using `get_value_with_lease` with `true`.
let origin = origin.unwrap();
self.update_value_with_lease(&key, &origin, &node_info, self.candidate_lease_ttl_secs)
.await?;
self.update_value_with_lease(
&key,
&lease.origin,
&node_info,
self.candidate_lease_ttl_secs,
)
.await?;
}
}
@@ -400,11 +354,9 @@ impl Election for PgElection {
Ok(self.leader_value.as_bytes().into())
} else {
let key = self.election_key();
if let Some((leader, expire_time, current, _)) =
self.get_value_with_lease(&key, false).await?
{
ensure!(expire_time > current, NoLeaderSnafu);
Ok(leader.as_bytes().into())
if let Some(lease) = self.get_value_with_lease(&key).await? {
ensure!(lease.expire_time > lease.current, NoLeaderSnafu);
Ok(lease.leader_value.as_bytes().into())
} else {
NoLeaderSnafu.fail()
}
@@ -422,11 +374,7 @@ impl Election for PgElection {
impl PgElection {
/// Returns the [Lease] (leader value, expire time, current time and the original stored string) for the given key, if any.
async fn get_value_with_lease(
&self,
key: &str,
with_origin: bool,
) -> Result<Option<(String, Timestamp, Timestamp, Option<String>)>> {
async fn get_value_with_lease(&self, key: &str) -> Result<Option<Lease>> {
let key = key.as_bytes();
let res = self
.client
@@ -451,16 +399,12 @@ impl PgElection {
String::from_utf8_lossy(res[0].try_get(0).unwrap_or_default());
let (value, expire_time) = parse_value_and_expire_time(&value_and_expire_time)?;
if with_origin {
Ok(Some((
value,
expire_time,
current_time,
Some(value_and_expire_time.to_string()),
)))
} else {
Ok(Some((value, expire_time, current_time, None)))
}
Ok(Some(Lease {
leader_value: value,
expire_time,
current: current_time,
origin: value_and_expire_time.to_string(),
}))
}
}
@@ -579,16 +523,18 @@ impl PgElection {
let key = self.election_key();
// Case 1
if self.is_leader() {
match self.get_value_with_lease(&key, true).await? {
Some((prev_leader, expire_time, current, prev)) => {
match (prev_leader == self.leader_value, expire_time > current) {
match self.get_value_with_lease(&key).await? {
Some(lease) => {
match (
lease.leader_value == self.leader_value,
lease.expire_time > lease.current,
) {
// Case 1.1
(true, true) => {
// Safety: prev is Some since we are using `get_value_with_lease` with `true`.
let prev = prev.unwrap();
self.update_value_with_lease(
&key,
&prev,
&lease.origin,
&self.leader_value,
self.meta_lease_ttl_secs,
)
@@ -635,12 +581,12 @@ impl PgElection {
if self.is_leader() {
self.step_down_without_lock().await?;
}
let (_, expire_time, current, _) = self
.get_value_with_lease(&key, false)
let lease = self
.get_value_with_lease(&key)
.await?
.context(NoLeaderSnafu)?;
// Case 2
ensure!(expire_time > current, NoLeaderSnafu);
ensure!(lease.expire_time > lease.current, NoLeaderSnafu);
// Case 3
Ok(())
}
@@ -653,35 +599,29 @@ impl PgElection {
/// Should only step down while holding the advisory lock.
async fn step_down(&self) -> Result<()> {
let key = self.election_key();
let leader_key = PgLeaderKey {
let leader_key = RdsLeaderKey {
name: self.leader_value.clone().into_bytes(),
key: key.clone().into_bytes(),
..Default::default()
};
if self
.is_leader
.compare_exchange(true, false, Ordering::AcqRel, Ordering::Acquire)
.is_ok()
{
self.delete_value(&key).await?;
self.client
.query(&self.sql_set.step_down, &[])
.await
.context(PostgresExecutionSnafu)?;
if let Err(e) = self
.leader_watcher
.send(LeaderChangeMessage::StepDown(Arc::new(leader_key)))
{
error!(e; "Failed to send leader change message");
}
}
self.delete_value(&key).await?;
self.client
.query(&self.sql_set.step_down, &[])
.await
.context(PostgresExecutionSnafu)?;
send_leader_change_and_set_flags(
&self.is_leader,
&self.leader_infancy,
&self.leader_watcher,
LeaderChangeMessage::StepDown(Arc::new(leader_key)),
);
Ok(())
}
/// Still considers itself the leader locally but failed to acquire the lock. Steps down without deleting the key.
async fn step_down_without_lock(&self) -> Result<()> {
let key = self.election_key().into_bytes();
let leader_key = PgLeaderKey {
let leader_key = RdsLeaderKey {
name: self.leader_value.clone().into_bytes(),
key: key.clone(),
..Default::default()
@@ -705,7 +645,7 @@ impl PgElection {
/// Caution: Should only be elected while holding the advisory lock.
async fn elected(&self) -> Result<()> {
let key = self.election_key();
let leader_key = PgLeaderKey {
let leader_key = RdsLeaderKey {
name: self.leader_value.clone().into_bytes(),
key: key.clone().into_bytes(),
..Default::default()
@@ -800,23 +740,22 @@ mod tests {
.unwrap();
assert!(res);
let (value_get, _, _, prev) = pg_election
.get_value_with_lease(&key, true)
let lease = pg_election
.get_value_with_lease(&key)
.await
.unwrap()
.unwrap();
assert_eq!(value_get, value);
assert_eq!(lease.leader_value, value);
let prev = prev.unwrap();
pg_election
.update_value_with_lease(&key, &prev, &value, pg_election.meta_lease_ttl_secs)
.update_value_with_lease(&key, &lease.origin, &value, pg_election.meta_lease_ttl_secs)
.await
.unwrap();
let res = pg_election.delete_value(&key).await.unwrap();
assert!(res);
let res = pg_election.get_value_with_lease(&key, false).await.unwrap();
let res = pg_election.get_value_with_lease(&key).await.unwrap();
assert!(res.is_none());
for i in 0..10 {
@@ -963,13 +902,13 @@ mod tests {
};
leader_pg_election.elected().await.unwrap();
let (leader, expire_time, current, _) = leader_pg_election
.get_value_with_lease(&leader_pg_election.election_key(), false)
let lease = leader_pg_election
.get_value_with_lease(&leader_pg_election.election_key())
.await
.unwrap()
.unwrap();
assert!(leader == leader_value);
assert!(expire_time > current);
assert!(lease.leader_value == leader_value);
assert!(lease.expire_time > lease.current);
assert!(leader_pg_election.is_leader());
match rx.recv().await {
@@ -986,12 +925,12 @@ mod tests {
}
leader_pg_election.step_down_without_lock().await.unwrap();
let (leader, _, _, _) = leader_pg_election
.get_value_with_lease(&leader_pg_election.election_key(), false)
let lease = leader_pg_election
.get_value_with_lease(&leader_pg_election.election_key())
.await
.unwrap()
.unwrap();
assert!(leader == leader_value);
assert!(lease.leader_value == leader_value);
assert!(!leader_pg_election.is_leader());
match rx.recv().await {
@@ -1008,13 +947,13 @@ mod tests {
}
leader_pg_election.elected().await.unwrap();
let (leader, expire_time, current, _) = leader_pg_election
.get_value_with_lease(&leader_pg_election.election_key(), false)
let lease = leader_pg_election
.get_value_with_lease(&leader_pg_election.election_key())
.await
.unwrap()
.unwrap();
assert!(leader == leader_value);
assert!(expire_time > current);
assert!(lease.leader_value == leader_value);
assert!(lease.expire_time > lease.current);
assert!(leader_pg_election.is_leader());
match rx.recv().await {
@@ -1032,7 +971,7 @@ mod tests {
leader_pg_election.step_down().await.unwrap();
let res = leader_pg_election
.get_value_with_lease(&leader_pg_election.election_key(), false)
.get_value_with_lease(&leader_pg_election.election_key())
.await
.unwrap();
assert!(res.is_none());
@@ -1085,13 +1024,13 @@ mod tests {
let res: bool = res[0].get(0);
assert!(res);
leader_pg_election.leader_action().await.unwrap();
let (leader, expire_time, current, _) = leader_pg_election
.get_value_with_lease(&leader_pg_election.election_key(), false)
let lease = leader_pg_election
.get_value_with_lease(&leader_pg_election.election_key())
.await
.unwrap()
.unwrap();
assert!(leader == leader_value);
assert!(expire_time > current);
assert!(lease.leader_value == leader_value);
assert!(lease.expire_time > lease.current);
assert!(leader_pg_election.is_leader());
match rx.recv().await {
@@ -1116,13 +1055,15 @@ mod tests {
let res: bool = res[0].get(0);
assert!(res);
leader_pg_election.leader_action().await.unwrap();
let (leader, new_expire_time, current, _) = leader_pg_election
.get_value_with_lease(&leader_pg_election.election_key(), false)
let new_lease = leader_pg_election
.get_value_with_lease(&leader_pg_election.election_key())
.await
.unwrap()
.unwrap();
assert!(leader == leader_value);
assert!(new_expire_time > current && new_expire_time > expire_time);
assert!(new_lease.leader_value == leader_value);
assert!(
new_lease.expire_time > new_lease.current && new_lease.expire_time > lease.expire_time
);
assert!(leader_pg_election.is_leader());
// Step 3: Something wrong, the leader lease expired.
@@ -1137,7 +1078,7 @@ mod tests {
assert!(res);
leader_pg_election.leader_action().await.unwrap();
let res = leader_pg_election
.get_value_with_lease(&leader_pg_election.election_key(), false)
.get_value_with_lease(&leader_pg_election.election_key())
.await
.unwrap();
assert!(res.is_none());
@@ -1164,13 +1105,13 @@ mod tests {
let res: bool = res[0].get(0);
assert!(res);
leader_pg_election.leader_action().await.unwrap();
let (leader, expire_time, current, _) = leader_pg_election
.get_value_with_lease(&leader_pg_election.election_key(), false)
let lease = leader_pg_election
.get_value_with_lease(&leader_pg_election.election_key())
.await
.unwrap()
.unwrap();
assert!(leader == leader_value);
assert!(expire_time > current);
assert!(lease.leader_value == leader_value);
assert!(lease.expire_time > lease.current);
assert!(leader_pg_election.is_leader());
match rx.recv().await {
@@ -1193,7 +1134,7 @@ mod tests {
.unwrap();
leader_pg_election.leader_action().await.unwrap();
let res = leader_pg_election
.get_value_with_lease(&leader_pg_election.election_key(), false)
.get_value_with_lease(&leader_pg_election.election_key())
.await
.unwrap();
assert!(res.is_none());
@@ -1221,13 +1162,13 @@ mod tests {
let res: bool = res[0].get(0);
assert!(res);
leader_pg_election.leader_action().await.unwrap();
let (leader, expire_time, current, _) = leader_pg_election
.get_value_with_lease(&leader_pg_election.election_key(), false)
let lease = leader_pg_election
.get_value_with_lease(&leader_pg_election.election_key())
.await
.unwrap()
.unwrap();
assert!(leader == leader_value);
assert!(expire_time > current);
assert!(lease.leader_value == leader_value);
assert!(lease.expire_time > lease.current);
assert!(leader_pg_election.is_leader());
match rx.recv().await {
@@ -1261,7 +1202,7 @@ mod tests {
.unwrap();
leader_pg_election.leader_action().await.unwrap();
let res = leader_pg_election
.get_value_with_lease(&leader_pg_election.election_key(), false)
.get_value_with_lease(&leader_pg_election.election_key())
.await
.unwrap();
assert!(res.is_none());

View File

@@ -375,7 +375,7 @@ impl HeartbeatMailbox {
/// Parses the [Instruction] from [MailboxMessage].
#[cfg(test)]
pub(crate) fn json_instruction(msg: &MailboxMessage) -> Result<Instruction> {
pub fn json_instruction(msg: &MailboxMessage) -> Result<Instruction> {
let Payload::Json(payload) =
msg.payload
.as_ref()

View File

@@ -280,7 +280,7 @@ impl MetasrvBuilder {
ensure!(
options.allow_region_failover_on_local_wal,
error::UnexpectedSnafu {
violated: "Region failover is not supported in the local WAL implementation!
violated: "Region failover is not supported in the local WAL implementation!
If you want to enable region failover for local WAL, please set `allow_region_failover_on_local_wal` to true.",
}
);
@@ -351,6 +351,11 @@ impl MetasrvBuilder {
};
let leader_region_registry = Arc::new(LeaderRegionRegistry::default());
#[cfg(feature = "enterprise")]
let trigger_ddl_manager = plugins
.as_ref()
.and_then(|plugins| plugins.get::<common_meta::ddl_manager::TriggerDdlManagerRef>());
let ddl_manager = Arc::new(
DdlManager::try_new(
DdlContext {
@@ -366,6 +371,8 @@ impl MetasrvBuilder {
},
procedure_manager.clone(),
true,
#[cfg(feature = "enterprise")]
trigger_ddl_manager,
)
.context(error::InitDdlManagerSnafu)?,
);

View File

@@ -343,13 +343,14 @@ mod test {
#[tokio::test]
async fn test_wal_prune_ticker() {
let (tx, mut rx) = WalPruneManager::channel();
let interval = Duration::from_millis(10);
let interval = Duration::from_millis(50);
let ticker = WalPruneTicker::new(interval, tx);
assert_eq!(ticker.name(), "WalPruneTicker");
for _ in 0..2 {
ticker.start();
sleep(2 * interval).await;
// wait a bit longer to make sure not all ticks are skipped
sleep(4 * interval).await;
assert!(!rx.is_empty());
while let Ok(event) = rx.try_recv() {
assert_matches!(event, Event::Tick);

View File

@@ -657,13 +657,6 @@ pub enum Error {
unexpected_entry_id: u64,
},
#[snafu(display("Read the corrupted log entry, region_id: {}", region_id))]
CorruptedEntry {
region_id: RegionId,
#[snafu(implicit)]
location: Location,
},
#[snafu(display(
"Failed to download file, region_id: {}, file_id: {}, file_type: {:?}",
region_id,
@@ -1106,7 +1099,6 @@ impl ErrorExt for Error {
| EncodeMemtable { .. }
| CreateDir { .. }
| ReadDataPart { .. }
| CorruptedEntry { .. }
| BuildEntry { .. }
| Metadata { .. }
| MitoManifestInfo { .. } => StatusCode::Internal,

View File

@@ -65,7 +65,7 @@ impl SimpleBulkMemtable {
} else {
dedup
};
let series = RwLock::new(Series::new(&region_metadata));
let series = RwLock::new(Series::with_capacity(&region_metadata, 1024));
Self {
id,

View File

@@ -60,7 +60,7 @@ use crate::region::options::MergeMode;
use crate::row_converter::{DensePrimaryKeyCodec, PrimaryKeyCodecExt};
/// Initial vector builder capacity.
const INITIAL_BUILDER_CAPACITY: usize = 1024 * 8;
const INITIAL_BUILDER_CAPACITY: usize = 4;
/// Vector builder capacity.
const BUILDER_CAPACITY: usize = 512;
@@ -663,15 +663,19 @@ pub(crate) struct Series {
}
impl Series {
pub(crate) fn new(region_metadata: &RegionMetadataRef) -> Self {
pub(crate) fn with_capacity(region_metadata: &RegionMetadataRef, builder_cap: usize) -> Self {
Self {
pk_cache: None,
active: ValueBuilder::new(region_metadata, INITIAL_BUILDER_CAPACITY),
active: ValueBuilder::new(region_metadata, builder_cap),
frozen: vec![],
region_metadata: region_metadata.clone(),
}
}
pub(crate) fn new(region_metadata: &RegionMetadataRef) -> Self {
Self::with_capacity(region_metadata, INITIAL_BUILDER_CAPACITY)
}
pub fn is_empty(&self) -> bool {
self.active.len() == 0 && self.frozen.is_empty()
}

View File

@@ -15,7 +15,7 @@
use std::collections::{BTreeMap, BTreeSet};
use common_telemetry::warn;
use datafusion_common::ScalarValue;
use datafusion_common::{Column, ScalarValue};
use datafusion_expr::expr::InList;
use datafusion_expr::{BinaryExpr, Expr, Operator};
use datatypes::data_type::ConcreteDataType;
@@ -121,6 +121,7 @@ impl<'a> BloomFilterIndexApplierBuilder<'a> {
Ok(())
}
Operator::Eq => self.collect_eq(left, right),
Operator::Or => self.collect_or_eq_list(left, right),
_ => Ok(()),
},
Expr::InList(in_list) => self.collect_in_list(in_list),
@@ -152,10 +153,8 @@ impl<'a> BloomFilterIndexApplierBuilder<'a> {
/// Collects an equality expression (column = value)
fn collect_eq(&mut self, left: &Expr, right: &Expr) -> Result<()> {
let (col, lit) = match (left, right) {
(Expr::Column(col), Expr::Literal(lit)) => (col, lit),
(Expr::Literal(lit), Expr::Column(col)) => (col, lit),
_ => return Ok(()),
let Some((col, lit)) = Self::eq_expr_col_lit(left, right)? else {
return Ok(());
};
if lit.is_null() {
return Ok(());
@@ -218,6 +217,83 @@ impl<'a> BloomFilterIndexApplierBuilder<'a> {
Ok(())
}
/// Collects an `OR` expression in the form of `column = lit OR column = lit OR ...`.
fn collect_or_eq_list(&mut self, left: &Expr, right: &Expr) -> Result<()> {
let (eq_left, eq_right, or_list) = if let Expr::BinaryExpr(BinaryExpr {
left: l,
op: Operator::Eq,
right: r,
}) = left
{
(l, r, right)
} else if let Expr::BinaryExpr(BinaryExpr {
left: l,
op: Operator::Eq,
right: r,
}) = right
{
(l, r, left)
} else {
return Ok(());
};
let Some((col, lit)) = Self::eq_expr_col_lit(eq_left, eq_right)? else {
return Ok(());
};
if lit.is_null() {
return Ok(());
}
let Some((column_id, data_type)) = self.column_id_and_type(&col.name)? else {
return Ok(());
};
let mut inlist = BTreeSet::new();
inlist.insert(encode_lit(lit, data_type.clone())?);
if Self::collect_or_eq_list_rec(&col.name, &data_type, or_list, &mut inlist)? {
self.predicates
.entry(column_id)
.or_default()
.push(InListPredicate { list: inlist });
}
Ok(())
}
fn collect_or_eq_list_rec(
column_name: &str,
data_type: &ConcreteDataType,
expr: &Expr,
inlist: &mut BTreeSet<Bytes>,
) -> Result<bool> {
if let Expr::BinaryExpr(BinaryExpr { left, op, right }) = expr {
match op {
Operator::Or => {
let r = Self::collect_or_eq_list_rec(column_name, data_type, left, inlist)?
.then(|| {
Self::collect_or_eq_list_rec(column_name, data_type, right, inlist)
})
.transpose()?
.unwrap_or(false);
return Ok(r);
}
Operator::Eq => {
let Some((col, lit)) = Self::eq_expr_col_lit(left, right)? else {
return Ok(false);
};
if lit.is_null() || column_name != col.name {
return Ok(false);
}
let bytes = encode_lit(lit, data_type.clone())?;
inlist.insert(bytes);
return Ok(true);
}
_ => {}
}
}
Ok(false)
}
/// Helper function to get non-null literal value
fn nonnull_lit(expr: &Expr) -> Option<&ScalarValue> {
match expr {
@@ -225,6 +301,19 @@ impl<'a> BloomFilterIndexApplierBuilder<'a> {
_ => None,
}
}
/// Helper function to get the column and literal value from an equality expr (column = lit)
fn eq_expr_col_lit<'b>(
left: &'b Expr,
right: &'b Expr,
) -> Result<Option<(&'b Column, &'b ScalarValue)>> {
let (col, lit) = match (left, right) {
(Expr::Column(col), Expr::Literal(lit)) => (col, lit),
(Expr::Literal(lit), Expr::Column(col)) => (col, lit),
_ => return Ok(None),
};
Ok(Some((col, lit)))
}
}
// TODO(ruihang): extract this and the one under inverted_index into a common util mod.
@@ -241,6 +330,7 @@ fn encode_lit(lit: &ScalarValue, data_type: ConcreteDataType) -> Result<Bytes> {
mod tests {
use api::v1::SemanticType;
use datafusion_common::Column;
use datafusion_expr::{col, lit};
use datatypes::schema::ColumnSchema;
use object_store::services::Memory;
use store_api::metadata::{ColumnMetadata, RegionMetadata, RegionMetadataBuilder};
@@ -356,6 +446,66 @@ mod tests {
assert_eq!(column_predicates[0].list.len(), 3);
}
#[test]
fn test_build_with_or_chain() {
let (_d, factory) = PuffinManagerFactory::new_for_test_block("test_build_with_or_chain_");
let metadata = test_region_metadata();
let builder = || {
BloomFilterIndexApplierBuilder::new(
"test".to_string(),
test_object_store(),
&metadata,
factory.clone(),
)
};
let expr = col("column1")
.eq(lit("value1"))
.or(col("column1")
.eq(lit("value2"))
.or(col("column1").eq(lit("value4"))))
.or(col("column1").eq(lit("value3")));
let result = builder().build(&[expr]).unwrap();
assert!(result.is_some());
let predicates = result.unwrap().predicates;
let column_predicates = predicates.get(&1).unwrap();
assert_eq!(column_predicates.len(), 1);
assert_eq!(column_predicates[0].list.len(), 4);
let or_chain_predicates = &column_predicates[0].list;
let encode_str = |s: &str| {
encode_lit(
&ScalarValue::Utf8(Some(s.to_string())),
ConcreteDataType::string_datatype(),
)
.unwrap()
};
assert!(or_chain_predicates.contains(&encode_str("value1")));
assert!(or_chain_predicates.contains(&encode_str("value2")));
assert!(or_chain_predicates.contains(&encode_str("value3")));
assert!(or_chain_predicates.contains(&encode_str("value4")));
// Test with null value
let expr = col("column1").eq(Expr::Literal(ScalarValue::Utf8(None)));
let result = builder().build(&[expr]).unwrap();
assert!(result.is_none());
// Test with different column
let expr = col("column1")
.eq(lit("value1"))
.or(col("column2").eq(lit("value2")));
let result = builder().build(&[expr]).unwrap();
assert!(result.is_none());
// Test with non or chain
let expr = col("column1")
.eq(lit("value1"))
.or(col("column1").gt_eq(lit("value2")));
let result = builder().build(&[expr]).unwrap();
assert!(result.is_none());
}
#[test]
fn test_build_with_and_expressions() {
let (_d, factory) = PuffinManagerFactory::new_for_test_block("test_build_with_and_");

View File

@@ -16,7 +16,7 @@ use std::collections::HashMap;
use std::sync::Arc;
use async_stream::stream;
use common_telemetry::{debug, error};
use common_telemetry::{debug, error, warn};
use futures::future::join_all;
use snafu::OptionExt;
use store_api::logstore::entry::Entry;
@@ -133,11 +133,15 @@ impl WalEntryReader for WalEntryReceiver {
}
let stream = stream! {
let mut buffered_entry = None;
let mut buffered_entry: Option<Entry> = None;
while let Some(next_entry) = entry_receiver.recv().await {
match buffered_entry.take() {
Some(entry) => {
yield decode_raw_entry(entry);
if entry.is_complete() {
yield decode_raw_entry(entry);
} else {
warn!("Ignoring incomplete entry: {}", entry);
}
buffered_entry = Some(next_entry);
},
None => {
@@ -149,6 +153,8 @@ impl WalEntryReader for WalEntryReceiver {
// Ignores tail corrupted data.
if entry.is_complete() {
yield decode_raw_entry(entry);
} else {
warn!("Ignoring incomplete entry: {}", entry);
}
}
};
@@ -213,7 +219,6 @@ pub fn build_wal_entry_distributor_and_receivers(
#[cfg(test)]
mod tests {
use std::assert_matches::assert_matches;
use api::v1::{Mutation, OpType, WalEntry};
use futures::{stream, TryStreamExt};
@@ -385,6 +390,7 @@ mod tests {
#[tokio::test]
async fn test_tail_corrupted_stream() {
common_telemetry::init_default_ut_logging();
let mut entries = vec![];
let region1 = RegionId::new(1, 1);
let region1_expected_wal_entry = WalEntry {
@@ -484,6 +490,7 @@ mod tests {
#[tokio::test]
async fn test_part_corrupted_stream() {
common_telemetry::init_default_ut_logging();
let mut entries = vec![];
let region1 = RegionId::new(1, 1);
let region1_expected_wal_entry = WalEntry {
@@ -504,7 +511,7 @@ mod tests {
3,
));
entries.extend(vec![
// The corrupted data.
// The incomplete entry.
Entry::MultiplePart(MultiplePartEntry {
provider: provider.clone(),
region_id: region2,
@@ -512,6 +519,7 @@ mod tests {
headers: vec![MultiplePartHeader::First],
parts: vec![vec![1; 100]],
}),
// The incomplete entry.
Entry::MultiplePart(MultiplePartEntry {
provider: provider.clone(),
region_id: region2,
@@ -545,14 +553,14 @@ mod tests {
vec![(0, region1_expected_wal_entry)]
);
assert_matches!(
assert_eq!(
streams
.get_mut(1)
.unwrap()
.try_collect::<Vec<_>>()
.await
.unwrap_err(),
error::Error::CorruptedEntry { .. }
.unwrap(),
vec![]
);
}


@@ -14,21 +14,25 @@
use api::v1::WalEntry;
use async_stream::stream;
use common_telemetry::tracing::warn;
use futures::StreamExt;
use object_store::Buffer;
use prost::Message;
use snafu::{ensure, ResultExt};
use snafu::ResultExt;
use store_api::logstore::entry::Entry;
use store_api::logstore::provider::Provider;
use crate::error::{CorruptedEntrySnafu, DecodeWalSnafu, Result};
use crate::error::{DecodeWalSnafu, Result};
use crate::wal::raw_entry_reader::RawEntryReader;
use crate::wal::{EntryId, WalEntryStream};
/// Decodes the [Entry] into [WalEntry].
///
/// The caller must ensure the [Entry] is complete.
pub(crate) fn decode_raw_entry(raw_entry: Entry) -> Result<(EntryId, WalEntry)> {
let entry_id = raw_entry.entry_id();
let region_id = raw_entry.region_id();
ensure!(raw_entry.is_complete(), CorruptedEntrySnafu { region_id });
debug_assert!(raw_entry.is_complete());
let buffer = into_buffer(raw_entry);
let wal_entry = WalEntry::decode(buffer).context(DecodeWalSnafu { region_id })?;
Ok((entry_id, wal_entry))
@@ -58,7 +62,7 @@ impl WalEntryReader for NoopEntryReader {
}
}
/// A Reader reads the [RawEntry] from [RawEntryReader] and decodes [RawEntry] into [WalEntry].
/// A Reader reads the [Entry] from [RawEntryReader] and decodes [Entry] into [WalEntry].
pub struct LogStoreEntryReader<R> {
reader: R,
}
@@ -75,11 +79,15 @@ impl<R: RawEntryReader> WalEntryReader for LogStoreEntryReader<R> {
let mut stream = reader.read(ns, start_id)?;
let stream = stream! {
let mut buffered_entry = None;
let mut buffered_entry: Option<Entry> = None;
while let Some(next_entry) = stream.next().await {
match buffered_entry.take() {
Some(entry) => {
yield decode_raw_entry(entry);
if entry.is_complete() {
yield decode_raw_entry(entry);
} else {
warn!("Ignoring incomplete entry: {}", entry);
}
buffered_entry = Some(next_entry?);
},
None => {
@@ -91,6 +99,8 @@ impl<R: RawEntryReader> WalEntryReader for LogStoreEntryReader<R> {
// Ignores tail corrupted data.
if entry.is_complete() {
yield decode_raw_entry(entry);
} else {
warn!("Ignoring incomplete entry: {}", entry);
}
}
};
@@ -101,7 +111,6 @@ impl<R: RawEntryReader> WalEntryReader for LogStoreEntryReader<R> {
#[cfg(test)]
mod tests {
use std::assert_matches::assert_matches;
use api::v1::{Mutation, OpType, WalEntry};
use futures::TryStreamExt;
@@ -110,7 +119,6 @@ mod tests {
use store_api::logstore::provider::Provider;
use store_api::storage::RegionId;
use crate::error;
use crate::test_util::wal_util::MockRawEntryStream;
use crate::wal::entry_reader::{LogStoreEntryReader, WalEntryReader};
@@ -141,7 +149,7 @@ mod tests {
headers: vec![MultiplePartHeader::First, MultiplePartHeader::Last],
parts,
}),
// The tail corrupted data.
// The tail incomplete entry.
Entry::MultiplePart(MultiplePartEntry {
provider: provider.clone(),
region_id: RegionId::new(1, 1),
@@ -171,6 +179,7 @@ mod tests {
let provider = Provider::kafka_provider("my_topic".to_string());
let raw_entry_stream = MockRawEntryStream {
entries: vec![
// The incomplete entry.
Entry::MultiplePart(MultiplePartEntry {
provider: provider.clone(),
region_id: RegionId::new(1, 1),
@@ -189,12 +198,12 @@ mod tests {
};
let mut reader = LogStoreEntryReader::new(raw_entry_stream);
let err = reader
let entries = reader
.read(&provider, 0)
.unwrap()
.try_collect::<Vec<_>>()
.await
.unwrap_err();
assert_matches!(err, error::Error::CorruptedEntry { .. });
.unwrap();
assert!(entries.is_empty());
}
}
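Both readers above (`WalEntryReceiver` and `LogStoreEntryReader`) now apply the same policy: keep a one-entry look-ahead buffer and, whenever an entry fails `is_complete()`, log a warning and drop it instead of surfacing `CorruptedEntry`. A minimal, self-contained sketch of that skip policy, using a toy `MockEntry` type rather than the real `Entry`:

```rust
// Sketch of the "ignore incomplete entries" policy; MockEntry is a stand-in
// for store_api's Entry, and the id replaces the decoded WalEntry.
#[derive(Debug)]
struct MockEntry {
    id: u64,
    complete: bool,
}

impl MockEntry {
    fn is_complete(&self) -> bool {
        self.complete
    }
}

fn decode_complete_entries(entries: Vec<MockEntry>) -> Vec<u64> {
    let mut decoded = Vec::new();
    for entry in entries {
        if entry.is_complete() {
            decoded.push(entry.id); // stands in for decode_raw_entry(entry)
        } else {
            eprintln!("Ignoring incomplete entry: {:?}", entry); // warn! in the real reader
        }
    }
    decoded
}

fn main() {
    let entries = vec![
        MockEntry { id: 1, complete: true },
        MockEntry { id: 2, complete: false }, // e.g. a torn multi-part write
        MockEntry { id: 3, complete: true },
    ];
    assert_eq!(decode_complete_entries(entries), vec![1, 3]);
}
```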


@@ -6,7 +6,7 @@ license.workspace = true
[features]
testing = []
enterprise = ["sql/enterprise"]
enterprise = ["common-meta/enterprise", "sql/enterprise"]
[lints]
workspace = true


@@ -703,6 +703,14 @@ pub enum Error {
location: Location,
},
#[cfg(feature = "enterprise")]
#[snafu(display("Invalid trigger name: {name}"))]
InvalidTriggerName {
name: String,
#[snafu(implicit)]
location: Location,
},
#[snafu(display("Empty {} expr", name))]
EmptyDdlExpr {
name: String,
@@ -872,6 +880,8 @@ impl ErrorExt for Error {
| Error::CursorNotFound { .. }
| Error::CursorExists { .. }
| Error::CreatePartitionRules { .. } => StatusCode::InvalidArguments,
#[cfg(feature = "enterprise")]
Error::InvalidTriggerName { .. } => StatusCode::InvalidArguments,
Error::TableAlreadyExists { .. } | Error::ViewAlreadyExists { .. } => {
StatusCode::TableAlreadyExists
}


@@ -12,6 +12,9 @@
// See the License for the specific language governing permissions and
// limitations under the License.
#[cfg(feature = "enterprise")]
pub mod trigger;
use std::collections::{HashMap, HashSet};
use api::helper::ColumnDataTypeWrapper;
@@ -55,6 +58,8 @@ use sql::statements::{
use sql::util::extract_tables_from_query;
use table::requests::{TableOptions, FILE_TABLE_META_KEY};
use table::table_reference::TableReference;
#[cfg(feature = "enterprise")]
pub use trigger::to_create_trigger_task_expr;
use crate::error::{
BuildCreateExprOnInsertionSnafu, ColumnDataTypeSnafu, ConvertColumnDefaultConstraintSnafu,


@@ -0,0 +1,146 @@
use api::v1::notify_channel::ChannelType as PbChannelType;
use api::v1::{
CreateTriggerExpr as PbCreateTriggerExpr, NotifyChannel as PbNotifyChannel,
WebhookOptions as PbWebhookOptions,
};
use session::context::QueryContextRef;
use snafu::ensure;
use sql::ast::ObjectName;
use sql::statements::create::trigger::{ChannelType, CreateTrigger};
use crate::error::Result;
pub fn to_create_trigger_task_expr(
create_trigger: CreateTrigger,
query_ctx: &QueryContextRef,
) -> Result<PbCreateTriggerExpr> {
let CreateTrigger {
trigger_name,
if_not_exists,
query,
interval,
labels,
annotations,
channels,
} = create_trigger;
let catalog_name = query_ctx.current_catalog().to_string();
let trigger_name = sanitize_trigger_name(trigger_name)?;
let channels = channels
.into_iter()
.map(|c| {
let name = c.name.value;
match c.channel_type {
ChannelType::Webhook(am) => PbNotifyChannel {
name,
channel_type: Some(PbChannelType::Webhook(PbWebhookOptions {
url: am.url.value,
opts: am.options.into_map(),
})),
},
}
})
.collect::<Vec<_>>();
let sql = query.to_string();
let labels = labels.into_map();
let annotations = annotations.into_map();
Ok(PbCreateTriggerExpr {
catalog_name,
trigger_name,
create_if_not_exists: if_not_exists,
sql,
channels,
labels,
annotations,
interval,
})
}
fn sanitize_trigger_name(mut trigger_name: ObjectName) -> Result<String> {
ensure!(
trigger_name.0.len() == 1,
crate::error::InvalidTriggerNameSnafu {
name: trigger_name.to_string(),
}
);
// safety: we've checked trigger_name.0 has exactly one element.
Ok(trigger_name.0.swap_remove(0).value)
}
#[cfg(test)]
mod tests {
use session::context::QueryContext;
use sql::dialect::GreptimeDbDialect;
use sql::parser::{ParseOptions, ParserContext};
use sql::statements::statement::Statement;
use super::*;
#[test]
fn test_sanitize_trigger_name() {
let name = ObjectName(vec![sql::ast::Ident::new("my_trigger")]);
let sanitized = sanitize_trigger_name(name).unwrap();
assert_eq!(sanitized, "my_trigger");
let name = ObjectName(vec![sql::ast::Ident::with_quote('`', "my_trigger")]);
let sanitized = sanitize_trigger_name(name).unwrap();
assert_eq!(sanitized, "my_trigger");
let name = ObjectName(vec![sql::ast::Ident::with_quote('\'', "trigger")]);
let sanitized = sanitize_trigger_name(name).unwrap();
assert_eq!(sanitized, "trigger");
}
#[test]
fn test_to_create_trigger_task_expr() {
let sql = r#"CREATE TRIGGER IF NOT EXISTS cpu_monitor
ON (SELECT host AS host_label, cpu, memory FROM machine_monitor WHERE cpu > 2) EVERY '5 minute'::INTERVAL
LABELS (label_name=label_val)
ANNOTATIONS (annotation_name=annotation_val)
NOTIFY
(WEBHOOK alert_manager URL 'http://127.0.0.1:9093' WITH (timeout='1m'))"#;
let stmt =
ParserContext::create_with_dialect(sql, &GreptimeDbDialect {}, ParseOptions::default())
.unwrap()
.pop()
.unwrap();
let Statement::CreateTrigger(stmt) = stmt else {
unreachable!()
};
let query_ctx = QueryContext::arc();
let expr = to_create_trigger_task_expr(stmt, &query_ctx).unwrap();
assert_eq!("greptime", expr.catalog_name);
assert_eq!("cpu_monitor", expr.trigger_name);
assert!(expr.create_if_not_exists);
assert_eq!(
"(SELECT host AS host_label, cpu, memory FROM machine_monitor WHERE cpu > 2)",
expr.sql
);
assert_eq!(300, expr.interval);
assert_eq!(1, expr.labels.len());
assert_eq!("label_val", expr.labels.get("label_name").unwrap());
assert_eq!(1, expr.annotations.len());
assert_eq!(
"annotation_val",
expr.annotations.get("annotation_name").unwrap()
);
assert_eq!(1, expr.channels.len());
let c = &expr.channels[0];
assert_eq!("alert_manager", c.name,);
let channel_type = c.channel_type.as_ref().unwrap();
let PbChannelType::Webhook(am) = &channel_type;
assert_eq!("http://127.0.0.1:9093", am.url);
assert_eq!(1, am.opts.len());
assert_eq!(
"1m",
am.opts.get("timeout").expect("Expected timeout option")
);
}
}


@@ -147,7 +147,7 @@ impl Inserter {
statement_executor: &StatementExecutor,
) -> Result<Output> {
let row_inserts = ColumnToRow::convert(requests)?;
self.handle_row_inserts(row_inserts, ctx, statement_executor, false)
self.handle_row_inserts(row_inserts, ctx, statement_executor, false, false)
.await
}
@@ -158,6 +158,7 @@ impl Inserter {
ctx: QueryContextRef,
statement_executor: &StatementExecutor,
accommodate_existing_schema: bool,
is_single_value: bool,
) -> Result<Output> {
preprocess_row_insert_requests(&mut requests.inserts)?;
self.handle_row_inserts_with_create_type(
@@ -166,6 +167,7 @@ impl Inserter {
statement_executor,
AutoCreateTableType::Physical,
accommodate_existing_schema,
is_single_value,
)
.await
}
@@ -183,6 +185,7 @@ impl Inserter {
statement_executor,
AutoCreateTableType::Log,
false,
false,
)
.await
}
@@ -199,6 +202,7 @@ impl Inserter {
statement_executor,
AutoCreateTableType::Trace,
false,
false,
)
.await
}
@@ -210,6 +214,7 @@ impl Inserter {
ctx: QueryContextRef,
statement_executor: &StatementExecutor,
accommodate_existing_schema: bool,
is_single_value: bool,
) -> Result<Output> {
self.handle_row_inserts_with_create_type(
requests,
@@ -217,6 +222,7 @@ impl Inserter {
statement_executor,
AutoCreateTableType::LastNonNull,
accommodate_existing_schema,
is_single_value,
)
.await
}
@@ -229,6 +235,7 @@ impl Inserter {
statement_executor: &StatementExecutor,
create_type: AutoCreateTableType,
accommodate_existing_schema: bool,
is_single_value: bool,
) -> Result<Output> {
// remove empty requests
requests.inserts.retain(|req| {
@@ -249,6 +256,7 @@ impl Inserter {
create_type,
statement_executor,
accommodate_existing_schema,
is_single_value,
)
.await?;
@@ -299,6 +307,7 @@ impl Inserter {
AutoCreateTableType::Logical(physical_table.to_string()),
statement_executor,
true,
true,
)
.await?;
let name_to_info = table_infos
@@ -464,9 +473,10 @@ impl Inserter {
/// This mapping is used in the conversion of RowToRegion.
///
/// `accommodate_existing_schema` is used to determine if the existing schema should override the new schema.
/// It only works for TIME_INDEX and VALUE columns. This is for the case where the user creates a table with
/// It only works for TIME_INDEX and the single VALUE column. This is for the case where the user creates a table with
/// custom schema, and then inserts data with endpoints that have default schema setting, like prometheus
/// remote write. This will modify the `RowInsertRequests` in place.
/// `is_single_value` indicates whether the default schema only contains a single value column, so we can accommodate it.
async fn create_or_alter_tables_on_demand(
&self,
requests: &mut RowInsertRequests,
@@ -474,6 +484,7 @@ impl Inserter {
auto_create_table_type: AutoCreateTableType,
statement_executor: &StatementExecutor,
accommodate_existing_schema: bool,
is_single_value: bool,
) -> Result<CreateAlterTableResult> {
let _timer = crate::metrics::CREATE_ALTER_ON_DEMAND
.with_label_values(&[auto_create_table_type.as_str()])
@@ -537,6 +548,7 @@ impl Inserter {
&table,
ctx,
accommodate_existing_schema,
is_single_value,
)? {
alter_tables.push(alter_expr);
}
@@ -815,12 +827,15 @@ impl Inserter {
/// When `accommodate_existing_schema` is true, it may modify the input `req` to
/// accommodate it with existing schema. See [`create_or_alter_tables_on_demand`](Self::create_or_alter_tables_on_demand)
/// for more details.
/// When `accommodate_existing_schema` is true and `is_single_value` is true, it also considers fields when modifying the
/// input `req`.
fn get_alter_table_expr_on_demand(
&self,
req: &mut RowInsertRequest,
table: &TableRef,
ctx: &QueryContextRef,
accommodate_existing_schema: bool,
is_single_value: bool,
) -> Result<Option<AlterTableExpr>> {
let catalog_name = ctx.current_catalog();
let schema_name = ctx.current_schema();
@@ -838,18 +853,20 @@ impl Inserter {
let table_schema = table.schema();
// Find timestamp column name
let ts_col_name = table_schema.timestamp_column().map(|c| c.name.clone());
// Find field column name if there is only one
// Find field column name if there is only one and `is_single_value` is true.
let mut field_col_name = None;
let mut multiple_field_cols = false;
table.field_columns().for_each(|col| {
if field_col_name.is_none() {
field_col_name = Some(col.name.clone());
} else {
multiple_field_cols = true;
if is_single_value {
let mut multiple_field_cols = false;
table.field_columns().for_each(|col| {
if field_col_name.is_none() {
field_col_name = Some(col.name.clone());
} else {
multiple_field_cols = true;
}
});
if multiple_field_cols {
field_col_name = None;
}
});
if multiple_field_cols {
field_col_name = None;
}
// Update column name in request schema for Timestamp/Field columns
@@ -875,11 +892,11 @@ impl Inserter {
}
}
// Remove from add_columns any column that is timestamp or field (if there is only one field column)
// Only keep columns that are tags, or fields when no single field column is being accommodated.
add_columns.add_columns.retain(|col| {
let def = col.column_def.as_ref().unwrap();
def.semantic_type != SemanticType::Timestamp as i32
&& (def.semantic_type != SemanticType::Field as i32 && field_col_name.is_some())
def.semantic_type == SemanticType::Tag as i32
|| (def.semantic_type == SemanticType::Field as i32 && field_col_name.is_none())
});
if add_columns.add_columns.is_empty() {
@@ -1231,7 +1248,7 @@ mod tests {
)),
);
let alter_expr = inserter
.get_alter_table_expr_on_demand(&mut req, &table, &ctx, true)
.get_alter_table_expr_on_demand(&mut req, &table, &ctx, true, true)
.unwrap();
assert!(alter_expr.is_none());
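The retained `add_columns` above reduce to a simple rule: an auto-generated ALTER may always add tag columns, and may add field columns only when the insert is not being accommodated onto a single existing field column (`field_col_name` is `None`). A self-contained sketch of that rule, using a hypothetical `Semantic` enum instead of the proto `SemanticType`:

```rust
// Hypothetical stand-ins for the proto types; only the retain rule is real.
#[derive(PartialEq)]
enum Semantic {
    Tag,
    Field,
    Timestamp,
}

struct NewColumn {
    name: &'static str,
    semantic: Semantic,
}

// Keep only the columns an auto ALTER is allowed to add: tags always, fields
// only when no single existing field column absorbs the incoming value.
fn retain_addable(cols: Vec<NewColumn>, single_field_accommodated: bool) -> Vec<&'static str> {
    cols.into_iter()
        .filter(|c| {
            c.semantic == Semantic::Tag
                || (c.semantic == Semantic::Field && !single_field_accommodated)
        })
        .map(|c| c.name)
        .collect()
}

fn main() {
    let cols = vec![
        NewColumn { name: "host", semantic: Semantic::Tag },
        NewColumn { name: "value", semantic: Semantic::Field },
        NewColumn { name: "ts", semantic: Semantic::Timestamp },
    ];
    // Prometheus-remote-write style insert: the single value column is
    // accommodated, so only the tag remains a candidate for ALTER.
    assert_eq!(retain_addable(cols, true), vec!["host"]);
}
```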


@@ -25,7 +25,10 @@ use common_datasource::object_store::{build_backend, parse_url};
use common_datasource::util::find_dir_and_filename;
use common_query::Output;
use common_recordbatch::adapter::DfRecordBatchStreamAdapter;
use common_recordbatch::SendableRecordBatchStream;
use common_recordbatch::{
map_json_type_to_string, map_json_type_to_string_schema, RecordBatchStream,
SendableRecordBatchMapper, SendableRecordBatchStream,
};
use common_telemetry::{debug, tracing};
use datafusion::datasource::DefaultTableSource;
use datafusion_common::TableReference as DfTableReference;
@@ -57,6 +60,11 @@ impl StatementExecutor {
) -> Result<usize> {
let threshold = WRITE_BUFFER_THRESHOLD.as_bytes() as usize;
let stream = Box::pin(SendableRecordBatchMapper::new(
stream,
map_json_type_to_string,
map_json_type_to_string_schema,
));
match format {
Format::Csv(_) => stream_to_csv(
Box::pin(DfRecordBatchStreamAdapter::new(stream)),


@@ -20,6 +20,10 @@ use api::v1::meta::CreateFlowTask as PbCreateFlowTask;
use api::v1::{
column_def, AlterDatabaseExpr, AlterTableExpr, CreateFlowExpr, CreateTableExpr, CreateViewExpr,
};
#[cfg(feature = "enterprise")]
use api::v1::{
meta::CreateTriggerTask as PbCreateTriggerTask, CreateTriggerExpr as PbCreateTriggerExpr,
};
use catalog::CatalogManagerRef;
use chrono::Utc;
use common_catalog::consts::{is_readonly_schema, DEFAULT_CATALOG_NAME, DEFAULT_SCHEMA_NAME};
@@ -31,6 +35,8 @@ use common_meta::ddl::ExecutorContext;
use common_meta::instruction::CacheIdent;
use common_meta::key::schema_name::{SchemaName, SchemaNameKey};
use common_meta::key::NAME_PATTERN;
#[cfg(feature = "enterprise")]
use common_meta::rpc::ddl::trigger::CreateTriggerTask;
use common_meta::rpc::ddl::{
CreateFlowTask, DdlTask, DropFlowTask, DropViewTask, SubmitDdlTaskRequest,
SubmitDdlTaskResponse,
@@ -58,6 +64,8 @@ use session::table_name::table_idents_to_full_name;
use snafu::{ensure, OptionExt, ResultExt};
use sql::parser::{ParseOptions, ParserContext};
use sql::statements::alter::{AlterDatabase, AlterTable};
#[cfg(feature = "enterprise")]
use sql::statements::create::trigger::CreateTrigger;
use sql::statements::create::{
CreateExternalTable, CreateFlow, CreateTable, CreateTableLike, CreateView, Partitions,
};
@@ -179,24 +187,42 @@ impl StatementExecutor {
}
);
// Check if creating a logical table
if create_table.engine == METRIC_ENGINE_NAME
&& create_table
.table_options
.contains_key(LOGICAL_TABLE_METADATA_KEY)
{
return self
.create_logical_tables(std::slice::from_ref(create_table), query_ctx)
// Create logical tables
ensure!(
partitions.is_none(),
InvalidPartitionRuleSnafu {
reason: "logical table in metric engine should not have partition rule, it will be inherited from physical table",
}
);
self.create_logical_tables(std::slice::from_ref(create_table), query_ctx)
.await?
.into_iter()
.next()
.context(error::UnexpectedSnafu {
violated: "expected to create a logical table",
});
violated: "expected to create logical tables",
})
} else {
// Create other normal table
self.create_non_logic_table(create_table, partitions, query_ctx)
.await
}
}
#[tracing::instrument(skip_all)]
pub async fn create_non_logic_table(
&self,
create_table: &mut CreateTableExpr,
partitions: Option<Partitions>,
query_ctx: QueryContextRef,
) -> Result<TableRef> {
let _timer = crate::metrics::DIST_CREATE_TABLE.start_timer();
// Check if schema exists
let schema = self
.table_metadata_manager
.schema_manager()
@@ -206,7 +232,6 @@ impl StatementExecutor {
))
.await
.context(TableMetadataManagerSnafu)?;
ensure!(
schema.is_some(),
SchemaNotFoundSnafu {
@@ -347,10 +372,43 @@ impl StatementExecutor {
#[tracing::instrument(skip_all)]
pub async fn create_trigger(
&self,
_stmt: sql::statements::create::trigger::CreateTrigger,
_query_context: QueryContextRef,
stmt: CreateTrigger,
query_context: QueryContextRef,
) -> Result<Output> {
crate::error::UnsupportedTriggerSnafu {}.fail()
let expr = expr_helper::to_create_trigger_task_expr(stmt, &query_context)?;
self.create_trigger_inner(expr, query_context).await
}
#[cfg(feature = "enterprise")]
pub async fn create_trigger_inner(
&self,
expr: PbCreateTriggerExpr,
query_context: QueryContextRef,
) -> Result<Output> {
self.create_trigger_procedure(expr, query_context).await?;
Ok(Output::new_with_affected_rows(0))
}
#[cfg(feature = "enterprise")]
async fn create_trigger_procedure(
&self,
expr: PbCreateTriggerExpr,
query_context: QueryContextRef,
) -> Result<SubmitDdlTaskResponse> {
let task = CreateTriggerTask::try_from(PbCreateTriggerTask {
create_trigger: Some(expr),
})
.context(error::InvalidExprSnafu)?;
let request = SubmitDdlTaskRequest {
query_context,
task: DdlTask::new_create_trigger(task),
};
self.procedure_executor
.submit_ddl_task(&ExecutorContext::default(), request)
.await
.context(error::ExecuteDdlSnafu)
}
#[tracing::instrument(skip_all)]


@@ -13,6 +13,7 @@
// limitations under the License.
#![allow(dead_code)]
pub mod ctx_req;
pub mod field;
pub mod processor;
pub mod transform;
@@ -153,21 +154,39 @@ impl DispatchedTo {
/// The result of pipeline execution
#[derive(Debug)]
pub enum PipelineExecOutput {
Transformed((Row, Option<String>)),
// table_suffix, ts_key -> unit
AutoTransform(Option<String>, HashMap<String, TimeUnit>),
Transformed(TransformedOutput),
AutoTransform(AutoTransformOutput),
DispatchedTo(DispatchedTo),
}
#[derive(Debug)]
pub struct TransformedOutput {
pub opt: String,
pub row: Row,
pub table_suffix: Option<String>,
}
#[derive(Debug)]
pub struct AutoTransformOutput {
pub table_suffix: Option<String>,
// ts_column_name -> unit
pub ts_unit_map: HashMap<String, TimeUnit>,
}
impl PipelineExecOutput {
// Note: This is a test only function, do not use it in production.
pub fn into_transformed(self) -> Option<(Row, Option<String>)> {
if let Self::Transformed(o) = self {
Some(o)
if let Self::Transformed(TransformedOutput {
row, table_suffix, ..
}) = self
{
Some((row, table_suffix))
} else {
None
}
}
// Note: This is a test only function, do not use it in production.
pub fn into_dispatched(self) -> Option<DispatchedTo> {
if let Self::DispatchedTo(d) = self {
Some(d)
@@ -224,9 +243,13 @@ impl Pipeline {
}
if let Some(transformer) = self.transformer() {
let row = transformer.transform_mut(val)?;
let (opt, row) = transformer.transform_mut(val)?;
let table_suffix = self.tablesuffix.as_ref().and_then(|t| t.apply(val));
Ok(PipelineExecOutput::Transformed((row, table_suffix)))
Ok(PipelineExecOutput::Transformed(TransformedOutput {
opt,
row,
table_suffix,
}))
} else {
let table_suffix = self.tablesuffix.as_ref().and_then(|t| t.apply(val));
let mut ts_unit_map = HashMap::with_capacity(4);
@@ -238,7 +261,10 @@ impl Pipeline {
}
}
}
Ok(PipelineExecOutput::AutoTransform(table_suffix, ts_unit_map))
Ok(PipelineExecOutput::AutoTransform(AutoTransformOutput {
table_suffix,
ts_unit_map,
}))
}
}


@@ -0,0 +1,153 @@
// Copyright 2023 Greptime Team
//
// Licensed under the Apache License, Version 2.0 (the "License");
// you may not use this file except in compliance with the License.
// You may obtain a copy of the License at
//
// http://www.apache.org/licenses/LICENSE-2.0
//
// Unless required by applicable law or agreed to in writing, software
// distributed under the License is distributed on an "AS IS" BASIS,
// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
// See the License for the specific language governing permissions and
// limitations under the License.
use std::collections::hash_map::IntoIter;
use std::collections::BTreeMap;
use std::sync::Arc;
use ahash::{HashMap, HashMapExt};
use api::v1::{RowInsertRequest, RowInsertRequests, Rows};
use itertools::Itertools;
use session::context::{QueryContext, QueryContextRef};
use crate::PipelineMap;
const DEFAULT_OPT: &str = "";
pub const PIPELINE_HINT_KEYS: [&str; 6] = [
"greptime_auto_create_table",
"greptime_ttl",
"greptime_append_mode",
"greptime_merge_mode",
"greptime_physical_table",
"greptime_skip_wal",
];
const PIPELINE_HINT_PREFIX: &str = "greptime_";
// Remove hints from the pipeline context and form an option string,
// e.g.: skip_wal=true,ttl=1d
pub fn from_pipeline_map_to_opt(pipeline_map: &mut PipelineMap) -> String {
let mut btreemap = BTreeMap::new();
for k in PIPELINE_HINT_KEYS {
if let Some(v) = pipeline_map.remove(k) {
btreemap.insert(k, v.to_str_value());
}
}
btreemap
.into_iter()
.map(|(k, v)| format!("{}={}", k.replace(PIPELINE_HINT_PREFIX, ""), v))
.join(",")
}
// split the option string back to a map
fn from_opt_to_map(opt: &str) -> HashMap<&str, &str> {
opt.split(',')
.filter_map(|s| {
s.split_once("=")
.filter(|(k, v)| !k.is_empty() && !v.is_empty())
})
.collect()
}
// ContextReq is a collection of row insert requests with different options.
// The default option is an empty string.
// Because options are set in the query context, we have to split them into sequential calls,
// e.g.:
// {
// "skip_wal=true,ttl=1d": [RowInsertRequest],
// "ttl=1d": [RowInsertRequest],
// }
#[derive(Debug, Default)]
pub struct ContextReq {
req: HashMap<String, Vec<RowInsertRequest>>,
}
impl ContextReq {
pub fn from_opt_map(opt_map: HashMap<String, Rows>, table_name: String) -> Self {
Self {
req: opt_map
.into_iter()
.map(|(opt, rows)| {
(
opt,
vec![RowInsertRequest {
table_name: table_name.clone(),
rows: Some(rows),
}],
)
})
.collect::<HashMap<String, Vec<RowInsertRequest>>>(),
}
}
pub fn default_opt_with_reqs(reqs: Vec<RowInsertRequest>) -> Self {
let mut req_map = HashMap::new();
req_map.insert(DEFAULT_OPT.to_string(), reqs);
Self { req: req_map }
}
pub fn add_rows(&mut self, opt: String, req: RowInsertRequest) {
self.req.entry(opt).or_default().push(req);
}
pub fn merge(&mut self, other: Self) {
for (opt, req) in other.req {
self.req.entry(opt).or_default().extend(req);
}
}
pub fn as_req_iter(self, ctx: QueryContextRef) -> ContextReqIter {
let ctx = (*ctx).clone();
ContextReqIter {
opt_req: self.req.into_iter(),
ctx_template: ctx,
}
}
pub fn all_req(self) -> impl Iterator<Item = RowInsertRequest> {
self.req.into_iter().flat_map(|(_, req)| req)
}
pub fn ref_all_req(&self) -> impl Iterator<Item = &RowInsertRequest> {
self.req.values().flatten()
}
}
// ContextReqIter is an iterator that iterates over the ContextReq.
// The context template is cloned from the original query context.
// It will clone the query context for each option and set the options on the clone.
// Then it will return the context and the row insert requests for the actual insert.
pub struct ContextReqIter {
opt_req: IntoIter<String, Vec<RowInsertRequest>>,
ctx_template: QueryContext,
}
impl Iterator for ContextReqIter {
type Item = (QueryContextRef, RowInsertRequests);
fn next(&mut self) -> Option<Self::Item> {
let (opt, req_vec) = self.opt_req.next()?;
let opt_map = from_opt_to_map(&opt);
let mut ctx = self.ctx_template.clone();
for (k, v) in opt_map {
ctx.set_extension(k, v);
}
Some((Arc::new(ctx), RowInsertRequests { inserts: req_vec }))
}
}
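Taken together, `from_pipeline_map_to_opt`, `from_opt_to_map` and `ContextReq` let per-row hints travel as an option string and come back as query-context extensions at insert time. A rough usage sketch built only from the APIs shown in this diff; the empty `Rows` payloads and the `do_insert` handler are hypothetical placeholders:

```rust
// Sketch only: assumes the ContextReq API exactly as introduced above.
use api::v1::{RowInsertRequest, RowInsertRequests, Rows};
use pipeline::ContextReq;
use session::context::{QueryContext, QueryContextRef};

async fn do_insert(_reqs: RowInsertRequests, _ctx: QueryContextRef) {
    // hypothetical insert handler
}

async fn example() {
    let mut req = ContextReq::default();
    // Rows whose hints resolved to `ttl=1d` and rows without hints end up
    // under different option strings.
    req.add_rows(
        "ttl=1d".to_string(),
        RowInsertRequest {
            table_name: "d_table_http".to_string(),
            rows: Some(Rows::default()),
        },
    );
    req.add_rows(
        String::new(),
        RowInsertRequest {
            table_name: "d_table_db".to_string(),
            rows: Some(Rows::default()),
        },
    );

    // One insert call per option string; each cloned QueryContext carries the
    // parsed options (e.g. `ttl`) as extensions.
    for (ctx, inserts) in req.as_req_iter(QueryContext::arc()) {
        do_insert(inserts, ctx).await;
    }
}
```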

View File

@@ -40,7 +40,7 @@ use crate::etl::transform::index::Index;
use crate::etl::transform::{Transform, Transforms};
use crate::etl::value::{Timestamp, Value};
use crate::etl::PipelineMap;
use crate::PipelineContext;
use crate::{from_pipeline_map_to_opt, PipelineContext};
const DEFAULT_GREPTIME_TIMESTAMP_COLUMN: &str = "greptime_timestamp";
const DEFAULT_MAX_NESTED_LEVELS_FOR_JSON_FLATTENING: usize = 10;
@@ -185,13 +185,15 @@ impl GreptimeTransformer {
}
}
pub fn transform_mut(&self, val: &mut PipelineMap) -> Result<Row> {
pub fn transform_mut(&self, pipeline_map: &mut PipelineMap) -> Result<(String, Row)> {
let opt = from_pipeline_map_to_opt(pipeline_map);
let mut values = vec![GreptimeValue { value_data: None }; self.schema.len()];
let mut output_index = 0;
for transform in self.transforms.iter() {
for field in transform.fields.iter() {
let index = field.input_field();
match val.get(index) {
match pipeline_map.get(index) {
Some(v) => {
let value_data = coerce_value(v, transform)?;
// every transform fields has only one output field
@@ -217,7 +219,7 @@ impl GreptimeTransformer {
output_index += 1;
}
}
Ok(Row { values })
Ok((opt, Row { values }))
}
pub fn transforms(&self) -> &Transforms {
@@ -517,8 +519,7 @@ fn resolve_value(
fn identity_pipeline_inner(
pipeline_maps: Vec<PipelineMap>,
pipeline_ctx: &PipelineContext<'_>,
) -> Result<(SchemaInfo, Vec<Row>)> {
let mut rows = Vec::with_capacity(pipeline_maps.len());
) -> Result<(SchemaInfo, HashMap<String, Vec<Row>>)> {
let mut schema_info = SchemaInfo::default();
let custom_ts = pipeline_ctx.pipeline_definition.get_custom_ts();
@@ -539,20 +540,30 @@ fn identity_pipeline_inner(
options: None,
});
for values in pipeline_maps {
let row = values_to_row(&mut schema_info, values, pipeline_ctx)?;
rows.push(row);
let mut opt_map = HashMap::new();
let len = pipeline_maps.len();
for mut pipeline_map in pipeline_maps {
let opt = from_pipeline_map_to_opt(&mut pipeline_map);
let row = values_to_row(&mut schema_info, pipeline_map, pipeline_ctx)?;
opt_map
.entry(opt)
.or_insert_with(|| Vec::with_capacity(len))
.push(row);
}
let column_count = schema_info.schema.len();
for row in rows.iter_mut() {
let diff = column_count - row.values.len();
for _ in 0..diff {
row.values.push(GreptimeValue { value_data: None });
for (_, row) in opt_map.iter_mut() {
for row in row.iter_mut() {
let diff = column_count - row.values.len();
for _ in 0..diff {
row.values.push(GreptimeValue { value_data: None });
}
}
}
Ok((schema_info, rows))
Ok((schema_info, opt_map))
}
/// Identity pipeline for Greptime
@@ -567,7 +578,7 @@ pub fn identity_pipeline(
array: Vec<PipelineMap>,
table: Option<Arc<table::Table>>,
pipeline_ctx: &PipelineContext<'_>,
) -> Result<Rows> {
) -> Result<HashMap<String, Rows>> {
let input = if pipeline_ctx.pipeline_param.flatten_json_object() {
array
.into_iter()
@@ -577,7 +588,7 @@ pub fn identity_pipeline(
array
};
identity_pipeline_inner(input, pipeline_ctx).map(|(mut schema, rows)| {
identity_pipeline_inner(input, pipeline_ctx).map(|(mut schema, opt_map)| {
if let Some(table) = table {
let table_info = table.table_info();
for tag_name in table_info.meta.row_key_column_names() {
@@ -586,10 +597,19 @@ pub fn identity_pipeline(
}
}
}
Rows {
schema: schema.schema,
rows,
}
opt_map
.into_iter()
.map(|(opt, rows)| {
(
opt,
Rows {
schema: schema.schema.clone(),
rows,
},
)
})
.collect::<HashMap<String, Rows>>()
})
}
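Since the identity pipeline now groups rows by their option string while the schema keeps growing as new columns are discovered, rows built early have to be padded with nulls up to the final column count before becoming per-option `Rows`. A toy, self-contained sketch of that grouping-and-padding step (plain `Option<i64>` cells instead of `GreptimeValue`):

```rust
use std::collections::HashMap;

// Toy stand-in: each row is a vector of nullable cells, grouped by option string.
fn group_and_pad(
    rows: Vec<(String, Vec<Option<i64>>)>,
    column_count: usize,
) -> HashMap<String, Vec<Vec<Option<i64>>>> {
    let mut opt_map: HashMap<String, Vec<Vec<Option<i64>>>> = HashMap::new();
    for (opt, mut row) in rows {
        // Rows produced before later columns were discovered are shorter
        // than the final schema; pad them with nulls.
        row.resize(column_count, None);
        opt_map.entry(opt).or_default().push(row);
    }
    opt_map
}

fn main() {
    let grouped = group_and_pad(
        vec![
            ("ttl=1d".to_string(), vec![Some(1)]),   // early row, 1 column known
            (String::new(), vec![Some(2), Some(3)]), // later row, 2 columns known
        ],
        2,
    );
    assert_eq!(grouped["ttl=1d"], vec![vec![Some(1), None]]);
    assert_eq!(grouped[""], vec![vec![Some(2), Some(3)]]);
}
```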
@@ -739,7 +759,9 @@ mod tests {
];
let rows = identity_pipeline(json_array_to_map(array).unwrap(), None, &pipeline_ctx);
assert!(rows.is_ok());
let rows = rows.unwrap();
let mut rows = rows.unwrap();
assert!(rows.len() == 1);
let rows = rows.remove("").unwrap();
assert_eq!(rows.schema.len(), 8);
assert_eq!(rows.rows.len(), 2);
assert_eq!(8, rows.rows[0].values.len());
@@ -769,12 +791,16 @@ mod tests {
let tag_column_names = ["name".to_string(), "address".to_string()];
let rows = identity_pipeline_inner(json_array_to_map(array).unwrap(), &pipeline_ctx)
.map(|(mut schema, rows)| {
.map(|(mut schema, mut rows)| {
for name in tag_column_names {
if let Some(index) = schema.index.get(&name) {
schema.schema[*index].semantic_type = SemanticType::Tag as i32;
}
}
assert!(rows.len() == 1);
let rows = rows.remove("").unwrap();
Rows {
schema: schema.schema,
rows,


@@ -19,14 +19,16 @@ mod manager;
mod metrics;
mod tablesuffix;
pub use etl::ctx_req::{from_pipeline_map_to_opt, ContextReq};
pub use etl::processor::Processor;
pub use etl::transform::transformer::greptime::{GreptimePipelineParams, SchemaInfo};
pub use etl::transform::transformer::identity_pipeline;
pub use etl::transform::GreptimeTransformer;
pub use etl::value::{Array, Map, Value};
pub use etl::{
json_array_to_map, json_to_map, parse, simd_json_array_to_map, simd_json_to_map, Content,
DispatchedTo, Pipeline, PipelineExecOutput, PipelineMap,
json_array_to_map, json_to_map, parse, simd_json_array_to_map, simd_json_to_map,
AutoTransformOutput, Content, DispatchedTo, Pipeline, PipelineExecOutput, PipelineMap,
TransformedOutput,
};
pub use manager::{
pipeline_operator, table, util, IdentityTimeIndex, PipelineContext, PipelineDefinition,


@@ -236,6 +236,7 @@ impl PipelineTable {
Self::query_ctx(&table_info),
&self.statement_executor,
false,
false,
)
.await
.context(InsertPipelineSnafu)?;


@@ -569,6 +569,7 @@ pub enum Error {
#[snafu(implicit)]
location: Location,
},
#[snafu(display("Convert SQL value error"))]
ConvertSqlValue {
source: datatypes::error::Error,


@@ -38,6 +38,7 @@ use common_telemetry::{debug, error, tracing, warn};
use common_time::timezone::parse_timezone;
use futures_util::StreamExt;
use session::context::{QueryContext, QueryContextBuilder, QueryContextRef};
use session::hints::READ_PREFERENCE_HINT;
use snafu::{OptionExt, ResultExt};
use table::metadata::TableId;
use tokio::sync::mpsc;
@@ -49,7 +50,6 @@ use crate::error::{
};
use crate::grpc::flight::{PutRecordBatchRequest, PutRecordBatchRequestStream};
use crate::grpc::TonicResult;
use crate::hint_headers::READ_PREFERENCE_HINT;
use crate::metrics;
use crate::metrics::{METRIC_AUTH_FAILURE, METRIC_SERVER_GRPC_DB_REQUEST_TIMER};
use crate::query_handler::grpc::ServerGrpcQueryHandlerRef;


@@ -13,23 +13,9 @@
// limitations under the License.
use http::HeaderMap;
use session::hints::{HINTS_KEY, HINTS_KEY_PREFIX, HINT_KEYS};
use tonic::metadata::MetadataMap;
// For the given format: `x-greptime-hints: auto_create_table=true, ttl=7d`
pub const HINTS_KEY: &str = "x-greptime-hints";
pub const READ_PREFERENCE_HINT: &str = "read_preference";
const HINT_KEYS: [&str; 7] = [
"x-greptime-hint-auto_create_table",
"x-greptime-hint-ttl",
"x-greptime-hint-append_mode",
"x-greptime-hint-merge_mode",
"x-greptime-hint-physical_table",
"x-greptime-hint-skip_wal",
"x-greptime-hint-read_preference",
];
pub(crate) fn extract_hints<T: ToHeaderMap>(headers: &T) -> Vec<(String, String)> {
let mut hints = Vec::new();
if let Some(value_str) = headers.get(HINTS_KEY) {
@@ -44,7 +30,7 @@ pub(crate) fn extract_hints<T: ToHeaderMap>(headers: &T) -> Vec<(String, String)
}
for key in HINT_KEYS.iter() {
if let Some(value) = headers.get(key) {
let new_key = key.replace("x-greptime-hint-", "");
let new_key = key.replace(HINTS_KEY_PREFIX, "");
hints.push((new_key, value.trim().to_string()));
}
}
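For reference, the combined `x-greptime-hints` header value splits into key/value hints as sketched below; the per-key `x-greptime-hint-*` headers are handled separately by stripping `HINTS_KEY_PREFIX`. A standalone sketch of the comma-separated form (the real `extract_hints` may differ in details):

```rust
// Standalone sketch of splitting the combined hints header value,
// e.g. `x-greptime-hints: auto_create_table=true, ttl=7d`.
fn split_hints(value: &str) -> Vec<(String, String)> {
    value
        .split(',')
        .filter_map(|pair| {
            pair.split_once('=')
                .map(|(k, v)| (k.trim().to_string(), v.trim().to_string()))
        })
        .collect()
}

fn main() {
    let hints = split_hints("auto_create_table=true, ttl=7d");
    assert_eq!(
        hints,
        vec![
            ("auto_create_table".to_string(), "true".to_string()),
            ("ttl".to_string(), "7d".to_string()),
        ]
    );
}
```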


@@ -18,7 +18,6 @@ use std::str::FromStr;
use std::sync::Arc;
use std::time::Instant;
use api::v1::RowInsertRequests;
use async_trait::async_trait;
use axum::body::Bytes;
use axum::extract::{FromRequest, Multipart, Path, Query, Request, State};
@@ -34,7 +33,9 @@ use datatypes::value::column_data_to_json;
use headers::ContentType;
use lazy_static::lazy_static;
use pipeline::util::to_pipeline_version;
use pipeline::{GreptimePipelineParams, PipelineContext, PipelineDefinition, PipelineMap};
use pipeline::{
ContextReq, GreptimePipelineParams, PipelineContext, PipelineDefinition, PipelineMap,
};
use serde::{Deserialize, Serialize};
use serde_json::{json, Deserializer, Map, Value};
use session::context::{Channel, QueryContext, QueryContextRef};
@@ -345,7 +346,7 @@ async fn dryrun_pipeline_inner(
let name_key = "name";
let results = results
.into_iter()
.all_req()
.filter_map(|row| {
if let Some(rows) = row.rows {
let table_name = row.table_name;
@@ -798,7 +799,7 @@ pub(crate) async fn ingest_logs_inner(
let db = query_ctx.get_db_string();
let exec_timer = std::time::Instant::now();
let mut insert_requests = Vec::with_capacity(log_ingest_requests.len());
let mut req = ContextReq::default();
let pipeline_params = GreptimePipelineParams::from_params(
headers
@@ -811,36 +812,42 @@ pub(crate) async fn ingest_logs_inner(
let requests =
run_pipeline(&handler, &pipeline_ctx, pipeline_req, &query_ctx, true).await?;
insert_requests.extend(requests);
req.merge(requests);
}
let output = handler
.insert(
RowInsertRequests {
inserts: insert_requests,
},
query_ctx,
)
.await;
let mut outputs = Vec::new();
let mut total_rows: u64 = 0;
let mut fail = false;
for (temp_ctx, act_req) in req.as_req_iter(query_ctx) {
let output = handler.insert(act_req, temp_ctx).await;
if let Ok(Output {
data: OutputData::AffectedRows(rows),
meta: _,
}) = &output
{
if let Ok(Output {
data: OutputData::AffectedRows(rows),
meta: _,
}) = &output
{
total_rows += *rows as u64;
} else {
fail = true;
}
outputs.push(output);
}
if total_rows > 0 {
METRIC_HTTP_LOGS_INGESTION_COUNTER
.with_label_values(&[db.as_str()])
.inc_by(*rows as u64);
.inc_by(total_rows);
METRIC_HTTP_LOGS_INGESTION_ELAPSED
.with_label_values(&[db.as_str(), METRIC_SUCCESS_VALUE])
.observe(exec_timer.elapsed().as_secs_f64());
} else {
}
if fail {
METRIC_HTTP_LOGS_INGESTION_ELAPSED
.with_label_values(&[db.as_str(), METRIC_FAILURE_VALUE])
.observe(exec_timer.elapsed().as_secs_f64());
}
let response = GreptimedbV1Response::from_output(vec![output])
let response = GreptimedbV1Response::from_output(outputs)
.await
.with_execution_time(exec_timer.elapsed().as_millis() as u64);
Ok(response)


@@ -166,7 +166,7 @@ pub async fn logs(
resp_body: ExportLogsServiceResponse {
partial_success: None,
},
write_cost: o.meta.cost,
write_cost: o.iter().map(|o| o.meta.cost).sum(),
})
}


@@ -15,7 +15,6 @@
use std::sync::Arc;
use api::prom_store::remote::ReadRequest;
use api::v1::RowInsertRequests;
use axum::body::Bytes;
use axum::extract::{Query, State};
use axum::http::{header, HeaderValue, StatusCode};
@@ -29,7 +28,7 @@ use hyper::HeaderMap;
use lazy_static::lazy_static;
use object_pool::Pool;
use pipeline::util::to_pipeline_version;
use pipeline::PipelineDefinition;
use pipeline::{ContextReq, PipelineDefinition};
use prost::Message;
use serde::{Deserialize, Serialize};
use session::context::{Channel, QueryContext};
@@ -133,18 +132,24 @@ pub async fn remote_write(
processor.set_pipeline(pipeline_handler, query_ctx.clone(), pipeline_def);
}
let (request, samples) =
let req =
decode_remote_write_request(is_zstd, body, prom_validation_mode, &mut processor).await?;
let output = prom_store_handler
.write(request, query_ctx, prom_store_with_metric_engine)
.await?;
crate::metrics::PROM_STORE_REMOTE_WRITE_SAMPLES.inc_by(samples as u64);
Ok((
StatusCode::NO_CONTENT,
write_cost_header_map(output.meta.cost),
)
.into_response())
let mut cost = 0;
for (temp_ctx, reqs) in req.as_req_iter(query_ctx) {
let cnt: u64 = reqs
.inserts
.iter()
.filter_map(|s| s.rows.as_ref().map(|r| r.rows.len() as u64))
.sum();
let output = prom_store_handler
.write(reqs, temp_ctx, prom_store_with_metric_engine)
.await?;
crate::metrics::PROM_STORE_REMOTE_WRITE_SAMPLES.inc_by(cnt);
cost += output.meta.cost;
}
Ok((StatusCode::NO_CONTENT, write_cost_header_map(cost)).into_response())
}
impl IntoResponse for PromStoreResponse {
@@ -202,7 +207,7 @@ async fn decode_remote_write_request(
body: Bytes,
prom_validation_mode: PromValidationMode,
processor: &mut PromSeriesProcessor,
) -> Result<(RowInsertRequests, usize)> {
) -> Result<ContextReq> {
let _timer = crate::metrics::METRIC_HTTP_PROM_STORE_DECODE_ELAPSED.start_timer();
// due to vmagent's limitation, there is a chance that vmagent is
@@ -227,7 +232,8 @@ async fn decode_remote_write_request(
if processor.use_pipeline {
processor.exec_pipeline().await
} else {
Ok(request.as_row_insert_requests())
let reqs = request.as_row_insert_requests();
Ok(ContextReq::default_opt_with_reqs(reqs))
}
}


@@ -183,7 +183,7 @@ fn select_variable(query: &str, query_context: QueryContextRef) -> Option<Output
// get value of variables from known sources or fallback to defaults
let value = match var_as[0] {
"time_zone" => query_context.timezone().to_string(),
"session.time_zone" | "time_zone" => query_context.timezone().to_string(),
"system_time_zone" => system_timezone_name(),
_ => VAR_VALUES
.get(var_as[0])


@@ -18,13 +18,15 @@ use api::v1::column_data_type_extension::TypeExt;
use api::v1::value::ValueData;
use api::v1::{
ColumnDataType, ColumnDataTypeExtension, ColumnOptions, ColumnSchema, JsonTypeExtension, Row,
RowInsertRequest, RowInsertRequests, Rows, SemanticType, Value as GreptimeValue,
RowInsertRequest, Rows, SemanticType, Value as GreptimeValue,
};
use jsonb::{Number as JsonbNumber, Value as JsonbValue};
use opentelemetry_proto::tonic::collector::logs::v1::ExportLogsServiceRequest;
use opentelemetry_proto::tonic::common::v1::{any_value, AnyValue, InstrumentationScope, KeyValue};
use opentelemetry_proto::tonic::logs::v1::{LogRecord, ResourceLogs, ScopeLogs};
use pipeline::{GreptimePipelineParams, PipelineContext, PipelineWay, SchemaInfo, SelectInfo};
use pipeline::{
ContextReq, GreptimePipelineParams, PipelineContext, PipelineWay, SchemaInfo, SelectInfo,
};
use serde_json::{Map, Value};
use session::context::QueryContextRef;
use snafu::{ensure, ResultExt};
@@ -55,21 +57,16 @@ pub async fn to_grpc_insert_requests(
table_name: String,
query_ctx: &QueryContextRef,
pipeline_handler: PipelineHandlerRef,
) -> Result<(RowInsertRequests, usize)> {
) -> Result<ContextReq> {
match pipeline {
PipelineWay::OtlpLogDirect(select_info) => {
let rows = parse_export_logs_service_request_to_rows(request, select_info)?;
let len = rows.rows.len();
let insert_request = RowInsertRequest {
rows: Some(rows),
table_name,
};
Ok((
RowInsertRequests {
inserts: vec![insert_request],
},
len,
))
Ok(ContextReq::default_opt_with_reqs(vec![insert_request]))
}
PipelineWay::Pipeline(pipeline_def) => {
let data = parse_export_logs_service_request(request);
@@ -77,7 +74,7 @@ pub async fn to_grpc_insert_requests(
let pipeline_ctx =
PipelineContext::new(&pipeline_def, &pipeline_params, query_ctx.channel());
let inserts = run_pipeline(
run_pipeline(
&pipeline_handler,
&pipeline_ctx,
PipelineIngestRequest {
@@ -87,20 +84,7 @@ pub async fn to_grpc_insert_requests(
query_ctx,
true,
)
.await?;
let len = inserts
.iter()
.map(|insert| {
insert
.rows
.as_ref()
.map(|rows| rows.rows.len())
.unwrap_or(0)
})
.sum();
let insert_requests = RowInsertRequests { inserts };
Ok((insert_requests, len))
.await
}
_ => NotSupportedSnafu {
feat: "Unsupported pipeline for logs",


@@ -20,8 +20,9 @@ use api::v1::{RowInsertRequest, Rows};
use itertools::Itertools;
use pipeline::error::AutoTransformOneTimestampSnafu;
use pipeline::{
DispatchedTo, IdentityTimeIndex, Pipeline, PipelineContext, PipelineDefinition,
PipelineExecOutput, PipelineMap, GREPTIME_INTERNAL_IDENTITY_PIPELINE_NAME,
AutoTransformOutput, ContextReq, DispatchedTo, IdentityTimeIndex, Pipeline, PipelineContext,
PipelineDefinition, PipelineExecOutput, PipelineMap, TransformedOutput,
GREPTIME_INTERNAL_IDENTITY_PIPELINE_NAME,
};
use session::context::{Channel, QueryContextRef};
use snafu::{OptionExt, ResultExt};
@@ -66,7 +67,7 @@ pub(crate) async fn run_pipeline(
pipeline_req: PipelineIngestRequest,
query_ctx: &QueryContextRef,
is_top_level: bool,
) -> Result<Vec<RowInsertRequest>> {
) -> Result<ContextReq> {
if pipeline_ctx.pipeline_definition.is_identity() {
run_identity_pipeline(handler, pipeline_ctx, pipeline_req, query_ctx).await
} else {
@@ -79,7 +80,7 @@ async fn run_identity_pipeline(
pipeline_ctx: &PipelineContext<'_>,
pipeline_req: PipelineIngestRequest,
query_ctx: &QueryContextRef,
) -> Result<Vec<RowInsertRequest>> {
) -> Result<ContextReq> {
let PipelineIngestRequest {
table: table_name,
values: data_array,
@@ -93,12 +94,7 @@ async fn run_identity_pipeline(
.context(CatalogSnafu)?
};
pipeline::identity_pipeline(data_array, table, pipeline_ctx)
.map(|rows| {
vec![RowInsertRequest {
rows: Some(rows),
table_name,
}]
})
.map(|opt_map| ContextReq::from_opt_map(opt_map, table_name))
.context(PipelineSnafu)
}
@@ -108,7 +104,7 @@ async fn run_custom_pipeline(
pipeline_req: PipelineIngestRequest,
query_ctx: &QueryContextRef,
is_top_level: bool,
) -> Result<Vec<RowInsertRequest>> {
) -> Result<ContextReq> {
let db = query_ctx.get_db_string();
let pipeline = get_pipeline(pipeline_ctx.pipeline_definition, handler, query_ctx).await?;
@@ -135,17 +131,24 @@ async fn run_custom_pipeline(
.context(PipelineSnafu)?;
match r {
PipelineExecOutput::Transformed((row, table_suffix)) => {
PipelineExecOutput::Transformed(TransformedOutput {
opt,
row,
table_suffix,
}) => {
let act_table_name = table_suffix_to_table_name(&table_name, table_suffix);
push_to_map!(transformed_map, act_table_name, row, arr_len);
push_to_map!(transformed_map, (opt, act_table_name), row, arr_len);
}
PipelineExecOutput::AutoTransform(table_suffix, ts_keys) => {
PipelineExecOutput::AutoTransform(AutoTransformOutput {
table_suffix,
ts_unit_map,
}) => {
let act_table_name = table_suffix_to_table_name(&table_name, table_suffix);
push_to_map!(auto_map, act_table_name.clone(), pipeline_map, arr_len);
auto_map_ts_keys
.entry(act_table_name)
.or_insert_with(HashMap::new)
.extend(ts_keys);
.extend(ts_unit_map);
}
PipelineExecOutput::DispatchedTo(dispatched_to) => {
push_to_map!(dispatched, dispatched_to, pipeline_map, arr_len);
@@ -153,7 +156,7 @@ async fn run_custom_pipeline(
}
}
let mut results = Vec::new();
let mut results = ContextReq::default();
if let Some(s) = pipeline.schemas() {
// transformed
@@ -161,14 +164,17 @@ async fn run_custom_pipeline(
// if current pipeline generates some transformed results, build it as
// `RowInsertRequest` and append to results. If the pipeline doesn't
// have dispatch, this will be only output of the pipeline.
for (table_name, rows) in transformed_map {
results.push(RowInsertRequest {
rows: Some(Rows {
rows,
schema: s.clone(),
}),
table_name,
});
for ((opt, table_name), rows) in transformed_map {
results.add_rows(
opt,
RowInsertRequest {
rows: Some(Rows {
rows,
schema: s.clone(),
}),
table_name,
},
);
}
} else {
// auto map
@@ -205,7 +211,7 @@ async fn run_custom_pipeline(
)
.await?;
results.extend(reqs);
results.merge(reqs);
}
}
@@ -240,7 +246,7 @@ async fn run_custom_pipeline(
))
.await?;
results.extend(requests);
results.merge(requests);
}
if is_top_level {


@@ -18,10 +18,7 @@ use std::string::ToString;
use ahash::HashMap;
use api::prom_store::remote::Sample;
use api::v1::value::ValueData;
use api::v1::{
ColumnDataType, ColumnSchema, Row, RowInsertRequest, RowInsertRequests, Rows, SemanticType,
Value,
};
use api::v1::{ColumnDataType, ColumnSchema, Row, RowInsertRequest, Rows, SemanticType, Value};
use common_query::prelude::{GREPTIME_TIMESTAMP, GREPTIME_VALUE};
use prost::DecodeError;
@@ -55,17 +52,11 @@ impl TablesBuilder {
}
/// Converts [TablesBuilder] to [RowInsertRequests] and row numbers and clears inner states.
pub(crate) fn as_insert_requests(&mut self) -> (RowInsertRequests, usize) {
let mut total_rows = 0;
let inserts = self
.tables
pub(crate) fn as_insert_requests(&mut self) -> Vec<RowInsertRequest> {
self.tables
.drain()
.map(|(name, mut table)| {
total_rows += table.num_rows();
table.as_row_insert_request(name)
})
.collect();
(RowInsertRequests { inserts }, total_rows)
.map(|(name, mut table)| table.as_row_insert_request(name))
.collect()
}
}
@@ -116,11 +107,6 @@ impl TableBuilder {
}
}
/// Total number of rows inside table builder.
fn num_rows(&self) -> usize {
self.rows.len()
}
/// Adds a set of labels and samples to table builder.
pub(crate) fn add_labels_and_samples(
&mut self,


@@ -18,11 +18,13 @@ use std::ops::Deref;
use std::slice;
use api::prom_store::remote::Sample;
use api::v1::RowInsertRequests;
use api::v1::RowInsertRequest;
use bytes::{Buf, Bytes};
use common_query::prelude::{GREPTIME_TIMESTAMP, GREPTIME_VALUE};
use common_telemetry::debug;
use pipeline::{GreptimePipelineParams, PipelineContext, PipelineDefinition, PipelineMap, Value};
use pipeline::{
ContextReq, GreptimePipelineParams, PipelineContext, PipelineDefinition, PipelineMap, Value,
};
use prost::encoding::message::merge;
use prost::encoding::{decode_key, decode_varint, WireType};
use prost::DecodeError;
@@ -267,7 +269,7 @@ impl Clear for PromWriteRequest {
}
impl PromWriteRequest {
pub fn as_row_insert_requests(&mut self) -> (RowInsertRequests, usize) {
pub fn as_row_insert_requests(&mut self) -> Vec<RowInsertRequest> {
self.table_data.as_insert_requests()
}
@@ -409,9 +411,7 @@ impl PromSeriesProcessor {
Ok(())
}
pub(crate) async fn exec_pipeline(
&mut self,
) -> crate::error::Result<(RowInsertRequests, usize)> {
pub(crate) async fn exec_pipeline(&mut self) -> crate::error::Result<ContextReq> {
// prepare params
let handler = self.pipeline_handler.as_ref().context(InternalSnafu {
err_msg: "pipeline handler is not set",
@@ -425,10 +425,9 @@ impl PromSeriesProcessor {
})?;
let pipeline_ctx = PipelineContext::new(pipeline_def, &pipeline_param, query_ctx.channel());
let mut size = 0;
// run pipeline
let mut inserts = Vec::with_capacity(self.table_values.len());
let mut req = ContextReq::default();
for (table_name, pipeline_maps) in self.table_values.iter_mut() {
let pipeline_req = PipelineIngestRequest {
table: table_name.clone(),
@@ -436,16 +435,10 @@ impl PromSeriesProcessor {
};
let row_req =
run_pipeline(handler, &pipeline_ctx, pipeline_req, query_ctx, true).await?;
size += row_req
.iter()
.map(|rq| rq.rows.as_ref().map(|r| r.rows.len()).unwrap_or(0))
.sum::<usize>();
inserts.extend(row_req);
req.merge(row_req);
}
let row_insert_requests = RowInsertRequests { inserts };
Ok((row_insert_requests, size))
Ok(req)
}
}
@@ -489,7 +482,13 @@ mod tests {
prom_write_request
.merge(data.clone(), PromValidationMode::Strict, &mut p)
.unwrap();
let (prom_rows, samples) = prom_write_request.as_row_insert_requests();
let req = prom_write_request.as_row_insert_requests();
let samples = req
.iter()
.filter_map(|r| r.rows.as_ref().map(|r| r.rows.len()))
.sum::<usize>();
let prom_rows = RowInsertRequests { inserts: req };
assert_eq!(expected_samples, samples);
assert_eq!(expected_rows.inserts.len(), prom_rows.inserts.len());


@@ -122,7 +122,7 @@ pub trait OpenTelemetryProtocolHandler: PipelineHandler {
pipeline_params: GreptimePipelineParams,
table_name: String,
ctx: QueryContextRef,
) -> Result<Output>;
) -> Result<Vec<Output>>;
}
/// PipelineHandler is responsible for handling pipeline related requests.

src/session/src/hints.rs (new file)

@@ -0,0 +1,29 @@
// Copyright 2023 Greptime Team
//
// Licensed under the Apache License, Version 2.0 (the "License");
// you may not use this file except in compliance with the License.
// You may obtain a copy of the License at
//
// http://www.apache.org/licenses/LICENSE-2.0
//
// Unless required by applicable law or agreed to in writing, software
// distributed under the License is distributed on an "AS IS" BASIS,
// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
// See the License for the specific language governing permissions and
// limitations under the License.
// For the given format: `x-greptime-hints: auto_create_table=true, ttl=7d`
pub const HINTS_KEY: &str = "x-greptime-hints";
pub const HINTS_KEY_PREFIX: &str = "x-greptime-hint-";
pub const READ_PREFERENCE_HINT: &str = "read_preference";
pub const HINT_KEYS: [&str; 7] = [
"x-greptime-hint-auto_create_table",
"x-greptime-hint-ttl",
"x-greptime-hint-append_mode",
"x-greptime-hint-merge_mode",
"x-greptime-hint-physical_table",
"x-greptime-hint-skip_wal",
"x-greptime-hint-read_preference",
];


@@ -13,6 +13,7 @@
// limitations under the License.
pub mod context;
pub mod hints;
pub mod session_config;
pub mod table_name;


@@ -391,13 +391,16 @@ fn parse_string_options(parser: &mut Parser) -> std::result::Result<(String, Str
parser.expect_token(&Token::Eq)?;
let value = if parser.parse_keyword(Keyword::NULL) {
"".to_string()
} else if let Ok(v) = parser.parse_literal_string() {
v
} else {
return Err(ParserError::ParserError(format!(
"Unexpected option value for alter table statements, expect string literal or NULL, got: `{}`",
parser.next_token()
)));
let next_token = parser.peek_token();
if let Token::Number(number_as_string, _) = next_token.token {
parser.advance_token();
number_as_string
} else {
parser.parse_literal_string().map_err(|_|{
ParserError::ParserError(format!("Unexpected option value for alter table statements, expect string literal, numeric literal or NULL, got: `{}`", next_token))
})?
}
};
Ok((name, value))
}
@@ -1088,4 +1091,38 @@ mod tests {
)
.unwrap_err();
}
#[test]
fn test_parse_alter_with_numeric_value() {
for sql in [
"ALTER TABLE test SET 'compaction.twcs.trigger_file_num'=8;",
"ALTER TABLE test SET 'compaction.twcs.trigger_file_num'='8';",
] {
let mut result = ParserContext::create_with_dialect(
sql,
&GreptimeDbDialect {},
ParseOptions::default(),
)
.unwrap();
assert_eq!(1, result.len());
let statement = result.remove(0);
assert_matches!(statement, Statement::AlterTable { .. });
match statement {
Statement::AlterTable(alter_table) => {
let alter_operation = alter_table.alter_operation();
assert_matches!(alter_operation, AlterTableOperation::SetTableOptions { .. });
match alter_operation {
AlterTableOperation::SetTableOptions { options } => {
assert_eq!(options.len(), 1);
assert_eq!(options[0].key, "compaction.twcs.trigger_file_num");
assert_eq!(options[0].value, "8");
}
_ => unreachable!(),
}
}
_ => unreachable!(),
}
}
}
}


@@ -211,6 +211,12 @@ impl ColumnExtensions {
}
}
/// Partition on columns or values.
///
/// - `column_list` is the list of columns in `PARTITION ON COLUMNS` clause.
/// - `exprs` is the list of expressions in `PARTITION ON VALUES` clause, like
/// `host <= 'host1'`, `host > 'host1' and host <= 'host2'` or `host > 'host2'`.
/// Each expression stands for a partition.
#[derive(Debug, PartialEq, Eq, Clone, Visit, VisitMut, Serialize)]
pub struct Partitions {
pub column_list: Vec<Ident>,


@@ -12,6 +12,7 @@
// See the License for the specific language governing permissions and
// limitations under the License.
use std::fmt::{Display, Formatter};
use std::mem::size_of;
use crate::logstore::provider::Provider;
@@ -30,6 +31,15 @@ pub enum Entry {
MultiplePart(MultiplePartEntry),
}
impl Display for Entry {
fn fmt(&self, f: &mut Formatter<'_>) -> std::fmt::Result {
match self {
Entry::Naive(entry) => write!(f, "{}", entry),
Entry::MultiplePart(entry) => write!(f, "{}", entry),
}
}
}
impl Entry {
/// Into [NaiveEntry] if it's type of [Entry::Naive].
pub fn into_naive_entry(self) -> Option<NaiveEntry> {
@@ -56,6 +66,16 @@ pub struct NaiveEntry {
pub data: Vec<u8>,
}
impl Display for NaiveEntry {
fn fmt(&self, f: &mut Formatter<'_>) -> std::fmt::Result {
write!(
f,
"NaiveEntry(provider={:?}, region_id={}, entry_id={})",
self.provider, self.region_id, self.entry_id,
)
}
}
impl NaiveEntry {
/// Estimates the persisted size of the entry.
fn estimated_size(&self) -> usize {
@@ -79,6 +99,19 @@ pub struct MultiplePartEntry {
pub parts: Vec<Vec<u8>>,
}
impl Display for MultiplePartEntry {
fn fmt(&self, f: &mut Formatter<'_>) -> std::fmt::Result {
write!(
f,
"MultiplePartEntry(provider={:?}, region_id={}, entry_id={}, len={})",
self.provider,
self.region_id,
self.entry_id,
self.parts.len()
)
}
}
impl MultiplePartEntry {
fn is_complete(&self) -> bool {
self.headers.contains(&MultiplePartHeader::First)


@@ -69,10 +69,10 @@ impl Display for Provider {
fn fmt(&self, f: &mut std::fmt::Formatter<'_>) -> std::fmt::Result {
match &self {
Provider::RaftEngine(provider) => {
write!(f, "region: {}", RegionId::from_u64(provider.id))
write!(f, "RaftEngine(region={})", RegionId::from_u64(provider.id))
}
Provider::Kafka(provider) => write!(f, "topic: {}", provider.topic),
Provider::Noop => write!(f, "noop"),
Provider::Kafka(provider) => write!(f, "Kafka(topic={})", provider.topic),
Provider::Noop => write!(f, "Noop"),
}
}
}


@@ -6,6 +6,14 @@ license.workspace = true
[features]
dashboard = []
enterprise = [
"cmd/enterprise",
"common-meta/enterprise",
"frontend/enterprise",
"meta-srv/enterprise",
"operator/enterprise",
"sql/enterprise",
]
[lints]
workspace = true


@@ -226,6 +226,8 @@ impl GreptimeDbStandaloneBuilder {
},
procedure_manager.clone(),
register_procedure_loaders,
#[cfg(feature = "enterprise")]
None,
)
.unwrap(),
);


@@ -457,6 +457,7 @@ pub async fn setup_test_http_app_with_frontend_and_user_provider(
))
.with_log_ingest_handler(instance.fe_instance().clone(), None, None)
.with_logs_handler(instance.fe_instance().clone())
.with_influxdb_handler(instance.fe_instance().clone())
.with_otlp_handler(instance.fe_instance().clone())
.with_jaeger_handler(instance.fe_instance().clone())
.with_greptime_config_options(instance.opts.to_toml().unwrap());


@@ -104,6 +104,7 @@ macro_rules! http_tests {
test_identity_pipeline_with_custom_ts,
test_pipeline_dispatcher,
test_pipeline_suffix_template,
test_pipeline_context,
test_otlp_metrics,
test_otlp_traces_v0,
@@ -116,6 +117,8 @@ macro_rules! http_tests {
test_log_query,
test_jaeger_query_api,
test_jaeger_query_api_for_trace_v1,
test_influxdb_write,
);
)*
};
@@ -1155,6 +1158,9 @@ record_type = "system_table"
threshold = "30s"
sample_ratio = 1.0
ttl = "30d"
[query]
parallelism = 0
"#,
)
.trim()
@@ -2008,6 +2014,125 @@ table_suffix: _${type}
guard.remove_all().await;
}
pub async fn test_pipeline_context(storage_type: StorageType) {
common_telemetry::init_default_ut_logging();
let (app, mut guard) =
setup_test_http_app_with_frontend(storage_type, "test_pipeline_context").await;
// handshake
let client = TestClient::new(app).await;
let root_pipeline = r#"
processors:
- date:
field: time
formats:
- "%Y-%m-%d %H:%M:%S%.3f"
ignore_missing: true
transform:
- fields:
- id1, id1_root
- id2, id2_root
type: int32
- fields:
- type
- log
- logger
type: string
- field: time
type: time
index: timestamp
table_suffix: _${type}
"#;
// 1. create pipeline
let res = client
.post("/v1/events/pipelines/root")
.header("Content-Type", "application/x-yaml")
.body(root_pipeline)
.send()
.await;
assert_eq!(res.status(), StatusCode::OK);
// 2. write data
let data_body = r#"
[
{
"id1": "2436",
"id2": "2528",
"logger": "INTERACT.MANAGER",
"type": "http",
"time": "2024-05-25 20:16:37.217",
"log": "ClusterAdapter:enter sendTextDataToCluster\\n",
"greptime_ttl": "1d"
},
{
"id1": "2436",
"id2": "2528",
"logger": "INTERACT.MANAGER",
"type": "db",
"time": "2024-05-25 20:16:37.217",
"log": "ClusterAdapter:enter sendTextDataToCluster\\n"
}
]
"#;
let res = client
.post("/v1/events/logs?db=public&table=d_table&pipeline_name=root")
.header("Content-Type", "application/json")
.body(data_body)
.send()
.await;
assert_eq!(res.status(), StatusCode::OK);
// 3. check table list
validate_data(
"test_pipeline_context_table_list",
&client,
"show tables",
"[[\"d_table_db\"],[\"d_table_http\"],[\"demo\"],[\"numbers\"]]",
)
.await;
// 4. check each table's data
// CREATE TABLE IF NOT EXISTS "d_table_db" (
// ... ignore
// )
// ENGINE=mito
// WITH(
// append_mode = 'true'
// )
let expected = "[[\"d_table_db\",\"CREATE TABLE IF NOT EXISTS \\\"d_table_db\\\" (\\n \\\"id1_root\\\" INT NULL,\\n \\\"id2_root\\\" INT NULL,\\n \\\"type\\\" STRING NULL,\\n \\\"log\\\" STRING NULL,\\n \\\"logger\\\" STRING NULL,\\n \\\"time\\\" TIMESTAMP(9) NOT NULL,\\n TIME INDEX (\\\"time\\\")\\n)\\n\\nENGINE=mito\\nWITH(\\n append_mode = 'true'\\n)\"]]";
validate_data(
"test_pipeline_context_db",
&client,
"show create table d_table_db",
expected,
)
.await;
// CREATE TABLE IF NOT EXISTS "d_table_http" (
// ... ignore
// )
// ENGINE=mito
// WITH(
// append_mode = 'true',
// ttl = '1day'
// )
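// The extra `ttl = '1day'` option is expected to come from the `greptime_ttl: "1d"` hint
// carried by the first (type = http) record above; the db record carries no hint, so
// d_table_db is created without a ttl.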
let expected = "[[\"d_table_http\",\"CREATE TABLE IF NOT EXISTS \\\"d_table_http\\\" (\\n \\\"id1_root\\\" INT NULL,\\n \\\"id2_root\\\" INT NULL,\\n \\\"type\\\" STRING NULL,\\n \\\"log\\\" STRING NULL,\\n \\\"logger\\\" STRING NULL,\\n \\\"time\\\" TIMESTAMP(9) NOT NULL,\\n TIME INDEX (\\\"time\\\")\\n)\\n\\nENGINE=mito\\nWITH(\\n append_mode = 'true',\\n ttl = '1day'\\n)\"]]";
validate_data(
"test_pipeline_context_http",
&client,
"show create table d_table_http",
expected,
)
.await;
guard.remove_all().await;
}
pub async fn test_identity_pipeline_with_flatten(store_type: StorageType) {
common_telemetry::init_default_ut_logging();
let (app, mut guard) =
@@ -4472,6 +4597,52 @@ pub async fn test_jaeger_query_api_for_trace_v1(store_type: StorageType) {
guard.remove_all().await;
}
pub async fn test_influxdb_write(store_type: StorageType) {
common_telemetry::init_default_ut_logging();
let (app, mut guard) =
setup_test_http_app_with_frontend(store_type, "test_influxdb_write").await;
let client = TestClient::new(app).await;
// Only write field cpu.
let result = client
.post("/v1/influxdb/write?db=public&p=greptime&u=greptime")
.body("test_alter,host=host1 cpu=1.2 1664370459457010101")
.send()
.await;
assert_eq!(result.status(), 204);
assert!(result.text().await.is_empty());
// Only write field mem.
let result = client
.post("/v1/influxdb/write?db=public&p=greptime&u=greptime")
.body("test_alter,host=host1 mem=10240.0 1664370469457010101")
.send()
.await;
assert_eq!(result.status(), 204);
assert!(result.text().await.is_empty());
// Write field cpu & mem.
let result = client
.post("/v1/influxdb/write?db=public&p=greptime&u=greptime")
.body("test_alter,host=host1 cpu=3.2,mem=20480.0 1664370479457010101")
.send()
.await;
assert_eq!(result.status(), 204);
assert!(result.text().await.is_empty());
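// The first two writes each supplied a single field, so the rows read back below carry
// NULL in the column that was absent at write time.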
let expected = r#"[["host1",1.2,1664370459457010101,null],["host1",null,1664370469457010101,10240.0],["host1",3.2,1664370479457010101,20480.0]]"#;
validate_data(
"test_influxdb_write",
&client,
"select * from test_alter order by ts;",
expected,
)
.await;
guard.remove_all().await;
}
async fn validate_data(test_name: &str, client: &TestClient, sql: &str, expected: &str) {
let res = client
.get(format!("/v1/sql?sql={sql}").as_str())

View File

@@ -339,6 +339,8 @@ pub async fn test_mysql_timezone(store_type: StorageType) {
let timezone = conn.fetch_all("SELECT @@time_zone").await.unwrap();
assert_eq!(timezone[0].get::<String, usize>(0), "Asia/Shanghai");
let timezone = conn.fetch_all("SELECT @@session.time_zone").await.unwrap();
assert_eq!(timezone[0].get::<String, usize>(0), "Asia/Shanghai");
let timezone = conn.fetch_all("SELECT @@system_time_zone").await.unwrap();
assert_eq!(timezone[0].get::<String, usize>(0), "UTC");
let _ = conn.execute("SET time_zone = 'UTC'").await.unwrap();
@@ -367,6 +369,8 @@ pub async fn test_mysql_timezone(store_type: StorageType) {
let _ = conn.execute("SET time_zone = '+08:00'").await.unwrap();
let timezone = conn.fetch_all("SELECT @@time_zone").await.unwrap();
assert_eq!(timezone[0].get::<String, usize>(0), "+08:00");
let timezone = conn.fetch_all("SELECT @@session.time_zone").await.unwrap();
assert_eq!(timezone[0].get::<String, usize>(0), "+08:00");
let rows2 = conn.fetch_all("select ts from demo").await.unwrap();
// we use Utc here for format only
@@ -391,6 +395,8 @@ pub async fn test_mysql_timezone(store_type: StorageType) {
);
let timezone = conn.fetch_all("SELECT @@time_zone").await.unwrap();
assert_eq!(timezone[0].get::<String, usize>(0), "-07:00");
let timezone = conn.fetch_all("SELECT @@session.time_zone").await.unwrap();
assert_eq!(timezone[0].get::<String, usize>(0), "-07:00");
let _ = fe_mysql_server.shutdown().await;
guard.remove_all().await;

View File

@@ -1,15 +1,21 @@
CREATE TABLE demo(host string, cpu double, memory double, ts TIMESTAMP time index);
CREATE TABLE demo(host string, cpu double, memory double, jsons JSON, ts TIMESTAMP time index);
Affected Rows: 0
insert into
demo(host, cpu, memory, jsons, ts)
values
('host1', 66.6, 1024, '{"foo":"bar"}', 1655276557000),
('host2', 88.8, 333.3, '{"a":null,"foo":"bar"}', 1655276558000);
Affected Rows: 2
insert into
demo(host, cpu, memory, ts)
values
('host1', 66.6, 1024, 1655276557000),
('host2', 88.8, 333.3, 1655276558000),
('host3', 99.9, 444.4, 1722077263000);
Affected Rows: 3
Affected Rows: 1
Copy demo TO '${SQLNESS_HOME}/demo/export/csv/demo.csv' with (format='csv');
@@ -32,6 +38,44 @@ select * from with_filename order by ts;
| host2 | 88.8 | 333.3 | 2022-06-15T07:02:38 |
+-------+------+--------+---------------------+
CREATE TABLE with_json(host string, cpu double, memory double, jsons JSON, ts timestamp time index);
Affected Rows: 0
Copy with_json FROM '${SQLNESS_HOME}/demo/export/json/demo.json' with (format='json');
Affected Rows: 3
select host, cpu, memory, json_to_string(jsons), ts from with_json order by ts;
+-------+------+--------+---------------------------------+---------------------+
| host | cpu | memory | json_to_string(with_json.jsons) | ts |
+-------+------+--------+---------------------------------+---------------------+
| host1 | 66.6 | 1024.0 | {"foo":"bar"} | 2022-06-15T07:02:37 |
| host2 | 88.8 | 333.3 | {"a":null,"foo":"bar"} | 2022-06-15T07:02:38 |
| host3 | 99.9 | 444.4 | | 2024-07-27T10:47:43 |
+-------+------+--------+---------------------------------+---------------------+
-- SQLNESS PROTOCOL MYSQL
select host, cpu, memory, jsons, ts from demo where host != 'host3';
+-------+------+--------+------------------------+---------------------+
| host | cpu | memory | jsons | ts |
+-------+------+--------+------------------------+---------------------+
| host1 | 66.6 | 1024 | {"foo":"bar"} | 2022-06-15 07:02:37 |
| host2 | 88.8 | 333.3 | {"a":null,"foo":"bar"} | 2022-06-15 07:02:38 |
+-------+------+--------+------------------------+---------------------+
-- SQLNESS PROTOCOL POSTGRES
select host, cpu, memory, jsons, ts from demo where host != 'host3';
+-------+------+--------+------------------------+----------------------------+
| host | cpu | memory | jsons | ts |
+-------+------+--------+------------------------+----------------------------+
| host1 | 66.6 | 1024 | {"foo":"bar"} | 2022-06-15 07:02:37.000000 |
| host2 | 88.8 | 333.3 | {"a":null,"foo":"bar"} | 2022-06-15 07:02:38.000000 |
+-------+------+--------+------------------------+----------------------------+
CREATE TABLE with_path(host string, cpu double, memory double, ts timestamp time index);
Affected Rows: 0
@@ -110,6 +154,10 @@ drop table with_filename;
Affected Rows: 0
drop table with_json;
Affected Rows: 0
drop table with_path;
Affected Rows: 0

View File

@@ -1,10 +1,14 @@
CREATE TABLE demo(host string, cpu double, memory double, ts TIMESTAMP time index);
CREATE TABLE demo(host string, cpu double, memory double, jsons JSON, ts TIMESTAMP time index);
insert into
demo(host, cpu, memory, jsons, ts)
values
('host1', 66.6, 1024, '{"foo":"bar"}', 1655276557000),
('host2', 88.8, 333.3, '{"a":null,"foo":"bar"}', 1655276558000);
insert into
demo(host, cpu, memory, ts)
values
('host1', 66.6, 1024, 1655276557000),
('host2', 88.8, 333.3, 1655276558000),
('host3', 99.9, 444.4, 1722077263000);
Copy demo TO '${SQLNESS_HOME}/demo/export/csv/demo.csv' with (format='csv');
@@ -15,6 +19,18 @@ Copy with_filename FROM '${SQLNESS_HOME}/demo/export/csv/demo.csv' with (format=
select * from with_filename order by ts;
CREATE TABLE with_json(host string, cpu double, memory double, jsons JSON, ts timestamp time index);
Copy with_json FROM '${SQLNESS_HOME}/demo/export/json/demo.json' with (format='json');
select host, cpu, memory, json_to_string(jsons), ts from with_json order by ts;
-- SQLNESS PROTOCOL MYSQL
select host, cpu, memory, jsons, ts from demo where host != 'host3';
-- SQLNESS PROTOCOL POSTGRES
select host, cpu, memory, jsons, ts from demo where host != 'host3';
CREATE TABLE with_path(host string, cpu double, memory double, ts timestamp time index);
Copy with_path FROM '${SQLNESS_HOME}/demo/export/csv/' with (format='csv', start_time='2023-06-15 07:02:37');
@@ -43,6 +59,8 @@ drop table demo;
drop table with_filename;
drop table with_json;
drop table with_path;
drop table with_pattern;

View File

@@ -1,15 +1,21 @@
CREATE TABLE demo(host string, cpu double, memory double, ts TIMESTAMP time index);
CREATE TABLE demo(host string, cpu double, memory double, jsons JSON, ts TIMESTAMP time index);
Affected Rows: 0
insert into
demo(host, cpu, memory, ts)
demo(host, cpu, memory, jsons, ts)
values
('host1', 66.6, 1024, 1655276557000),
('host2', 88.8, 333.3, 1655276558000),
('host1', 66.6, 1024, '{"foo":"bar"}', 1655276557000),
('host2', 88.8, 333.3, '{"a":null,"foo":"bar"}', 1655276558000);
Affected Rows: 2
insert into
demo(host, cpu, memory, ts)
values
('host3', 99.9, 444.4, 1722077263000);
Affected Rows: 3
Affected Rows: 1
Copy demo TO '${SQLNESS_HOME}/demo/export/json/demo.json' with (format='json');
@@ -32,6 +38,44 @@ select * from with_filename order by ts;
| host2 | 88.8 | 333.3 | 2022-06-15T07:02:38 |
+-------+------+--------+---------------------+
CREATE TABLE with_json(host string, cpu double, memory double, jsons JSON, ts timestamp time index);
Affected Rows: 0
Copy with_json FROM '${SQLNESS_HOME}/demo/export/json/demo.json' with (format='json');
Affected Rows: 3
select host, cpu, memory, json_to_string(jsons), ts from with_json order by ts;
+-------+------+--------+---------------------------------+---------------------+
| host | cpu | memory | json_to_string(with_json.jsons) | ts |
+-------+------+--------+---------------------------------+---------------------+
| host1 | 66.6 | 1024.0 | {"foo":"bar"} | 2022-06-15T07:02:37 |
| host2 | 88.8 | 333.3 | {"a":null,"foo":"bar"} | 2022-06-15T07:02:38 |
| host3 | 99.9 | 444.4 | | 2024-07-27T10:47:43 |
+-------+------+--------+---------------------------------+---------------------+
-- SQLNESS PROTOCOL MYSQL
select host, cpu, memory, jsons, ts from demo where host != 'host3';
+-------+------+--------+------------------------+---------------------+
| host | cpu | memory | jsons | ts |
+-------+------+--------+------------------------+---------------------+
| host1 | 66.6 | 1024 | {"foo":"bar"} | 2022-06-15 07:02:37 |
| host2 | 88.8 | 333.3 | {"a":null,"foo":"bar"} | 2022-06-15 07:02:38 |
+-------+------+--------+------------------------+---------------------+
-- SQLNESS PROTOCOL POSTGRES
select host, cpu, memory, jsons, ts from demo where host != 'host3';
+-------+------+--------+------------------------+----------------------------+
| host | cpu | memory | jsons | ts |
+-------+------+--------+------------------------+----------------------------+
| host1 | 66.6 | 1024 | {"foo":"bar"} | 2022-06-15 07:02:37.000000 |
| host2 | 88.8 | 333.3 | {"a":null,"foo":"bar"} | 2022-06-15 07:02:38.000000 |
+-------+------+--------+------------------------+----------------------------+
CREATE TABLE with_path(host string, cpu double, memory double, ts timestamp time index);
Affected Rows: 0
@@ -110,6 +154,10 @@ drop table with_filename;
Affected Rows: 0
drop table with_json;
Affected Rows: 0
drop table with_path;
Affected Rows: 0

View File

@@ -1,10 +1,14 @@
CREATE TABLE demo(host string, cpu double, memory double, ts TIMESTAMP time index);
CREATE TABLE demo(host string, cpu double, memory double, jsons JSON, ts TIMESTAMP time index);
insert into
demo(host, cpu, memory, ts)
demo(host, cpu, memory, jsons, ts)
values
('host1', 66.6, 1024, 1655276557000),
('host2', 88.8, 333.3, 1655276558000),
('host1', 66.6, 1024, '{"foo":"bar"}', 1655276557000),
('host2', 88.8, 333.3, '{"a":null,"foo":"bar"}', 1655276558000);
insert into
demo(host, cpu, memory, ts)
values
('host3', 99.9, 444.4, 1722077263000);
Copy demo TO '${SQLNESS_HOME}/demo/export/json/demo.json' with (format='json');
@@ -15,6 +19,18 @@ Copy with_filename FROM '${SQLNESS_HOME}/demo/export/json/demo.json' with (forma
select * from with_filename order by ts;
CREATE TABLE with_json(host string, cpu double, memory double, jsons JSON, ts timestamp time index);
Copy with_json FROM '${SQLNESS_HOME}/demo/export/json/demo.json' with (format='json');
select host, cpu, memory, json_to_string(jsons), ts from with_json order by ts;
-- SQLNESS PROTOCOL MYSQL
select host, cpu, memory, jsons, ts from demo where host != 'host3';
-- SQLNESS PROTOCOL POSTGRES
select host, cpu, memory, jsons, ts from demo where host != 'host3';
CREATE TABLE with_path(host string, cpu double, memory double, ts timestamp time index);
Copy with_path FROM '${SQLNESS_HOME}/demo/export/json/' with (format='json', start_time='2022-06-15 07:02:37', end_time='2022-06-15 07:02:39');
@@ -43,6 +59,8 @@ drop table demo;
drop table with_filename;
drop table with_json;
drop table with_path;
drop table with_pattern;

View File

@@ -1,4 +1,4 @@
CREATE TABLE demo(host string, cpu double, memory double, ts TIMESTAMP time index);
CREATE TABLE demo(host string, cpu double, memory double, jsons JSON, ts TIMESTAMP time index);
Affected Rows: 0
@@ -6,14 +6,20 @@ CREATE TABLE demo_2(host string, cpu double, memory double, ts TIMESTAMP time in
Affected Rows: 0
insert into
demo(host, cpu, memory, jsons, ts)
values
('host1', 66.6, 1024, '{"foo":"bar"}', 1655276557000),
('host2', 88.8, 333.3, '{"a":null,"foo":"bar"}', 1655276558000);
Affected Rows: 2
insert into
demo(host, cpu, memory, ts)
values
('host1', 66.6, 1024, 1655276557000),
('host2', 88.8, 333.3, 1655276558000),
('host3', 111.1, 444.4, 1722077263000);
Affected Rows: 3
Affected Rows: 1
insert into
demo_2(host, cpu, memory, ts)
@@ -70,6 +76,44 @@ select * from with_path order by ts;
| host6 | 222.2 | 555.5 | 2024-07-27T10:47:44 |
+-------+-------+--------+---------------------+
CREATE TABLE with_json(host string, cpu double, memory double, jsons JSON, ts timestamp time index);
Affected Rows: 0
Copy with_json FROM '${SQLNESS_HOME}/demo/export/parquet_files/demo.parquet';
Affected Rows: 3
select host, cpu, memory, json_to_string(jsons), ts from with_json order by ts;
+-------+-------+--------+---------------------------------+---------------------+
| host | cpu | memory | json_to_string(with_json.jsons) | ts |
+-------+-------+--------+---------------------------------+---------------------+
| host1 | 66.6 | 1024.0 | {"foo":"bar"} | 2022-06-15T07:02:37 |
| host2 | 88.8 | 333.3 | {"a":null,"foo":"bar"} | 2022-06-15T07:02:38 |
| host3 | 111.1 | 444.4 | | 2024-07-27T10:47:43 |
+-------+-------+--------+---------------------------------+---------------------+
-- SQLNESS PROTOCOL MYSQL
select host, cpu, memory, jsons, ts from demo where host != 'host3';
+-------+------+--------+------------------------+---------------------+
| host | cpu | memory | jsons | ts |
+-------+------+--------+------------------------+---------------------+
| host1 | 66.6 | 1024 | {"foo":"bar"} | 2022-06-15 07:02:37 |
| host2 | 88.8 | 333.3 | {"a":null,"foo":"bar"} | 2022-06-15 07:02:38 |
+-------+------+--------+------------------------+---------------------+
-- SQLNESS PROTOCOL POSTGRES
select host, cpu, memory, jsons, ts from demo where host != 'host3';
+-------+------+--------+------------------------+----------------------------+
| host | cpu | memory | jsons | ts |
+-------+------+--------+------------------------+----------------------------+
| host1 | 66.6 | 1024 | {"foo":"bar"} | 2022-06-15 07:02:37.000000 |
| host2 | 88.8 | 333.3 | {"a":null,"foo":"bar"} | 2022-06-15 07:02:38.000000 |
+-------+------+--------+------------------------+----------------------------+
CREATE TABLE with_pattern(host string, cpu double, memory double, ts timestamp time index);
Affected Rows: 0
@@ -171,6 +215,10 @@ drop table with_filename;
Affected Rows: 0
drop table with_json;
Affected Rows: 0
drop table with_path;
Affected Rows: 0

View File

@@ -1,12 +1,16 @@
CREATE TABLE demo(host string, cpu double, memory double, ts TIMESTAMP time index);
CREATE TABLE demo(host string, cpu double, memory double, jsons JSON, ts TIMESTAMP time index);
CREATE TABLE demo_2(host string, cpu double, memory double, ts TIMESTAMP time index);
insert into
demo(host, cpu, memory, jsons, ts)
values
('host1', 66.6, 1024, '{"foo":"bar"}', 1655276557000),
('host2', 88.8, 333.3, '{"a":null,"foo":"bar"}', 1655276558000);
insert into
demo(host, cpu, memory, ts)
values
('host1', 66.6, 1024, 1655276557000),
('host2', 88.8, 333.3, 1655276558000),
('host3', 111.1, 444.4, 1722077263000);
insert into
@@ -32,6 +36,18 @@ Copy with_path FROM '${SQLNESS_HOME}/demo/export/parquet_files/';
select * from with_path order by ts;
CREATE TABLE with_json(host string, cpu double, memory double, jsons JSON, ts timestamp time index);
Copy with_json FROM '${SQLNESS_HOME}/demo/export/parquet_files/demo.parquet';
select host, cpu, memory, json_to_string(jsons), ts from with_json order by ts;
-- SQLNESS PROTOCOL MYSQL
select host, cpu, memory, jsons, ts from demo where host != 'host3';
-- SQLNESS PROTOCOL POSTGRES
select host, cpu, memory, jsons, ts from demo where host != 'host3';
CREATE TABLE with_pattern(host string, cpu double, memory double, ts timestamp time index);
Copy with_pattern FROM '${SQLNESS_HOME}/demo/export/parquet_files/' WITH (PATTERN = 'demo.*', start_time='2022-06-15 07:02:39');
@@ -70,6 +86,8 @@ drop table demo_2;
drop table with_filename;
drop table with_json;
drop table with_path;
drop table with_pattern;

View File

@@ -1,11 +1,15 @@
CREATE TABLE demo(host string, cpu DOUBLE, memory DOUBLE, ts TIMESTAMP TIME INDEX);
CREATE TABLE demo(host string, cpu DOUBLE, memory DOUBLE, jsons JSON, ts TIMESTAMP TIME INDEX);
Affected Rows: 0
insert into demo(host, cpu, memory, ts) values ('host1', 66.6, 1024, 1655276557000), ('host2', 88.8, 333.3, 1655276558000);
insert into demo(host, cpu, memory, jsons, ts) values ('host1', 66.6, 1024, '{"foo":"bar"}', 1655276557000), ('host2', 88.8, 333.3, '{"a":null,"foo":"bar"}', 1655276558000);
Affected Rows: 2
insert into demo(host, cpu, memory, ts) values ('host3', 111.1, 444.4, 1722077263000);
Affected Rows: 1
COPY demo TO '${SQLNESS_HOME}/export/demo.parquet' WITH (start_time='2022-06-15 07:02:37', end_time='2022-06-15 07:02:38');
Affected Rows: 1
@@ -18,15 +22,15 @@ COPY demo TO '${SQLNESS_HOME}/export/demo.json' WITH (format='json', start_time=
Affected Rows: 1
COPY (select host, cpu, ts from demo where host = 'host2') TO '${SQLNESS_HOME}/export/demo.parquet';
COPY (select host, cpu, jsons, ts from demo where host = 'host2') TO '${SQLNESS_HOME}/export/demo.parquet';
Affected Rows: 1
COPY (select host, cpu, ts from demo where host = 'host2') TO '${SQLNESS_HOME}/export/demo.csv' WITH (format='csv');
COPY (select host, cpu, jsons, ts from demo where host = 'host2') TO '${SQLNESS_HOME}/export/demo.csv' WITH (format='csv');
Affected Rows: 1
COPY (select host, cpu, ts from demo where host = 'host2') TO '${SQLNESS_HOME}/export/demo.json' WITH (format='json');
COPY (select host, cpu, jsons, ts from demo where host = 'host2') TO '${SQLNESS_HOME}/export/demo.json' WITH (format='json');
Affected Rows: 1

View File

@@ -1,6 +1,8 @@
CREATE TABLE demo(host string, cpu DOUBLE, memory DOUBLE, ts TIMESTAMP TIME INDEX);
CREATE TABLE demo(host string, cpu DOUBLE, memory DOUBLE, jsons JSON, ts TIMESTAMP TIME INDEX);
insert into demo(host, cpu, memory, ts) values ('host1', 66.6, 1024, 1655276557000), ('host2', 88.8, 333.3, 1655276558000);
insert into demo(host, cpu, memory, jsons, ts) values ('host1', 66.6, 1024, '{"foo":"bar"}', 1655276557000), ('host2', 88.8, 333.3, '{"a":null,"foo":"bar"}', 1655276558000);
insert into demo(host, cpu, memory, ts) values ('host3', 111.1, 444.4, 1722077263000);
COPY demo TO '${SQLNESS_HOME}/export/demo.parquet' WITH (start_time='2022-06-15 07:02:37', end_time='2022-06-15 07:02:38');
@@ -8,10 +10,10 @@ COPY demo TO '${SQLNESS_HOME}/export/demo.csv' WITH (format='csv', start_time='2
COPY demo TO '${SQLNESS_HOME}/export/demo.json' WITH (format='json', start_time='2022-06-15 07:02:37', end_time='2022-06-15 07:02:38');
COPY (select host, cpu, ts from demo where host = 'host2') TO '${SQLNESS_HOME}/export/demo.parquet';
COPY (select host, cpu, jsons, ts from demo where host = 'host2') TO '${SQLNESS_HOME}/export/demo.parquet';
COPY (select host, cpu, ts from demo where host = 'host2') TO '${SQLNESS_HOME}/export/demo.csv' WITH (format='csv');
COPY (select host, cpu, jsons, ts from demo where host = 'host2') TO '${SQLNESS_HOME}/export/demo.csv' WITH (format='csv');
COPY (select host, cpu, ts from demo where host = 'host2') TO '${SQLNESS_HOME}/export/demo.json' WITH (format='json');
COPY (select host, cpu, jsons, ts from demo where host = 'host2') TO '${SQLNESS_HOME}/export/demo.json' WITH (format='json');
drop table demo;

View File

@@ -0,0 +1,81 @@
create table metric_engine_partition (
ts timestamp time index,
host string primary key,
cpu double,
)
partition on columns (host) (
host <= 'host1',
host > 'host1' and host <= 'host2',
host > 'host2'
)
engine = metric
with (
physical_metric_table = "true",
);
Affected Rows: 0
select count(*) from metric_engine_partition;
+----------+
| count(*) |
+----------+
| 0 |
+----------+
create table logical_table_1 (
ts timestamp time index,
host string primary key,
cpu double,
)
partition on columns (host) ()
engine = metric
with (
on_physical_table = "metric_engine_partition",
);
Error: 1004(InvalidArguments), Invalid partition rule: logical table in metric engine should not have partition rule, it will be inherited from physical table
create table logical_table_2 (
ts timestamp time index,
host string primary key,
cpu double,
)
engine = metric
with (
on_physical_table = "metric_engine_partition",
);
Affected Rows: 0
show create table logical_table_2;
+-----------------+-------------------------------------------------+
| Table | Create Table |
+-----------------+-------------------------------------------------+
| logical_table_2 | CREATE TABLE IF NOT EXISTS "logical_table_2" ( |
| | "cpu" DOUBLE NULL, |
| | "host" STRING NULL, |
| | "ts" TIMESTAMP(3) NOT NULL, |
| | TIME INDEX ("ts"), |
| | PRIMARY KEY ("host") |
| | ) |
| | PARTITION ON COLUMNS ("host") ( |
| | host <= 'host1', |
| | host > 'host2', |
| | host > 'host1' AND host <= 'host2' |
| | ) |
| | ENGINE=metric |
| | WITH( |
| | on_physical_table = 'metric_engine_partition' |
| | ) |
+-----------------+-------------------------------------------------+
drop table logical_table_2;
Affected Rows: 0
drop table metric_engine_partition;
Affected Rows: 0

View File

@@ -0,0 +1,43 @@
create table metric_engine_partition (
ts timestamp time index,
host string primary key,
cpu double,
)
partition on columns (host) (
host <= 'host1',
host > 'host1' and host <= 'host2',
host > 'host2'
)
engine = metric
with (
physical_metric_table = "true",
);
select count(*) from metric_engine_partition;
create table logical_table_1 (
ts timestamp time index,
host string primary key,
cpu double,
)
partition on columns (host) ()
engine = metric
with (
on_physical_table = "metric_engine_partition",
);
create table logical_table_2 (
ts timestamp time index,
host string primary key,
cpu double,
)
engine = metric
with (
on_physical_table = "metric_engine_partition",
);
show create table logical_table_2;
drop table logical_table_2;
drop table metric_engine_partition;

Some files were not shown because too many files have changed in this diff.