* feat: support invalidate schema name key cache
* fix: remove pub for invalidate_schema_cache
* refactor: add DropMetadataBroadcast State Op
* fix: delete files
* feat: transform substrait SELECT&WHERE&GROUP BY to Flow Plan
* chore: reexport from common/substrait
* feat: use datafusion Aggr Func to map to Flow aggr func
* chore: remove unwrap&split literal
* refactor: split transform.rs into smaller files
* feat: apply optimize for variadic fn
* refactor: split unit test
* chore: per review
* fix: cli export `create table` with quoted names
* add test
* apply review comments
* fix to pass check
* remove eprintln for clippy check
* use prebuilt binary to avoid compile
* ci run coverage after build
* drop dirty hack test
Signed-off-by: tison <wander4096@gmail.com>
---------
Signed-off-by: tison <wander4096@gmail.com>
Co-authored-by: tison <wander4096@gmail.com>
* refactor: func's specialization& use Error not EvalError
* docs: some pub item
* chore: typo
* docs: add comments for every pub item
* chore: per review
* chore: per review&derive Copy
* chore: per review&test for binary fn spec
* docs: comment explain how binary func spec works
* chore: minor style change
* fix: Error not EvalError
* chore: keep the same method order in KvBackend
* feat: make meta client can get all node info of cluster
* feat: cluster info data model
* feat: frontend and datanode info
* feat: list node info
* chore: remove the method: is_started
* fix: scan key prefix
* chore: impl From for NodeInfoKey
* chore: doc for trait and struct
* chore: reuse the error
* chore: refactor two collect cluster info handlers
* chore: remove inline
* chore: refactor two collect cluster info handlers
* fix: move object store read/write timer into inner
* add Drop for PrometheusMetricWrapper
* call await on async read/write
* apply review comments
* get rid of option on timer
* test: add integration_test for datetime style
* feat: support various datestyle for postgres
* doc: rewrite the comment about merge_datestyle_value
* test: add more test to illustrate valid datestyle input
* feat: add __schema__ tag for promql parser
* feat: disable matcher op other than equals
* test: add more test to ensure context getting reset
* test: add integration test
* test: refactor tests
* refactor: remove duplicated test code
* refactor: update according to review comments
* test: add sqlness test for cross schema scenario
---------
Co-authored-by: Ruihang Xia <waynestxia@gmail.com>
* feat(tql): add initial support for start,stop,step as sql functions
* fix(tql): remove unwraps, adjust fmt
* fix(tql): address taplo issue
* feat(tql): update parse_tql_query logic
* fix(tql): change query parsing logic to use parser instead of delimiter
* fix(tql): add timestamp function support, add sqlness tests
* fix(tql): add lookback optional param for tql eval
* fix(tql): adjust tests for now() function
* fix(tql): introduce the tqlerror to differentiate failures on parsing, evaluation and simplification stages
* fix(tql): add tests for explain/analyze
* feat(tql): add lookback support for explain/analyze, update tests
* feat(tql): add more sqlness tests
* chore(tql): extract common logic for eval, analyze and explain into a single function
* feat(tql): address CR points
* feat(tql): use snafu for tql errors, add more docs
* feat(tql): address CR points
* test: add integration tests for kafka wal
* chore: rebase main
* chore: unify naming convention for wal config
* chore: add register loaders switch
* chore: alter tables by adding a new column
* chore: move rand to dev-dependencies
* chore: update Cargo.lock
* feat: support set variable statement of session
* feat: support printing postgresql's bytea data type in its "hex" and "escape" format in an ugly way
* refactor: add 'SessionConfigValue' type and unify the name
* doc: add license header
* refactor: confine coupling with 'sql::ast::Value' in SessionConfigValue
* refactor: move all bytea wrapper into bytea.rs
* fix: remove unused import in context.rs and postgres.rs
* refactor: rename 'set_configuration_parameter' to 'set_session_config'
rename 'set_configuration_parameter' in statement_.rs to 'set_session_config'
* refactor: use mod to organize options via macro
* refactor: re-model the session config value with static type
* test: add integration test
* refactor: move the encode bytea by format type logic into encoder
refactor: use Arc<DashMap> instead of DashMap in QueryContext
Avoid expensive clone
refactor: use unreachable!() instead of unimplemented!()
refactor: move the encode bytea by format type logic into encoder
test: add binary format integration test case
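A minimal sketch of why the `Arc<DashMap>` change above makes cloning the query context cheap (type and field names here are illustrative, not the actual `QueryContext` API):

```rust
// Hypothetical sketch: sharing mutable session state cheaply across clones.
use std::sync::Arc;
use dashmap::DashMap;

#[derive(Clone)]
struct QueryContext {
    // Cloning an Arc only bumps a reference count; cloning a bare DashMap
    // would copy every entry.
    configuration: Arc<DashMap<String, String>>,
}

impl QueryContext {
    fn set_config(&self, key: &str, value: &str) {
        self.configuration.insert(key.to_string(), value.to_string());
    }

    fn get_config(&self, key: &str) -> Option<String> {
        self.configuration.get(key).map(|v| v.value().clone())
    }
}
```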
* test: add ut for byte related type
* doc: remove TODO of bytea_output
* refactor: simplify the implementation with simple struct instead of complex typing
* fix: typo of 'Available'
* fix compile
Signed-off-by: tison <wander4096@gmail.com>
---------
Signed-off-by: tison <wander4096@gmail.com>
Co-authored-by: tison <wander4096@gmail.com>
* feat: Able to pretty print sql query result in http output
* fix: add some tests
* fix: add some space, delete fn into_payload, and impl Display for TableResponse
* feat: Arrangement shared state
* feat: arrange&tests
* docs: detailed&tests for get
* chore: license
* refactor: opt out ts expr&tests: internal ts
* docs: remove some TODOs
* feat: use smallvec size of 2
* refactor: per review
* chore: per review
* chore: per review
* chore: remove redundant clone
* feat: return max expire time&docs: more explain cur expire config
* feat: add memtable builder to region
* refactor: rename memtable_builder in worker to default_memtable_builder
* fix: return error instead of using default compaction options
Support deserializing memtable and compaction options from the option
map
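A rough sketch of the "error instead of default" behavior described above when reading options from the option map; the option key, values, and types are illustrative, not the actual mito2 definitions:

```rust
use std::collections::HashMap;

#[derive(Debug, PartialEq)]
enum MemtableKind {
    TimeSeries,
    PartitionTree,
}

#[derive(Debug)]
struct InvalidOption {
    key: String,
    value: String,
}

fn parse_memtable_kind(options: &HashMap<String, String>) -> Result<MemtableKind, InvalidOption> {
    match options.get("memtable.type").map(String::as_str) {
        // Missing key: falling back to a default is fine.
        None => Ok(MemtableKind::TimeSeries),
        Some("time_series") => Ok(MemtableKind::TimeSeries),
        Some("partition_tree") => Ok(MemtableKind::PartitionTree),
        // Present but invalid: surface an error instead of silently using
        // the default options.
        Some(other) => Err(InvalidOption {
            key: "memtable.type".to_string(),
            value: other.to_string(),
        }),
    }
}
```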
* feat: optional memtable options
* feat: add MemtableBuilderProvider to create builders
* feat: change default memtable and skip deserializing dedup
* chore: update test and comment
* chore: test invalid type
* feat: metric engine use new memtable manually
* feat: expose more memtable configs
* feat: add memtable options to valid option list
* test: add test
* test: sqlness test
* chore: serde workspace
* chore: remove comments
* feat: handle flush periodically
* chore: call periodical method in loop
* feat: check periodical tasks on channel timeout
* refactor: use time provider to get time
Mock a time provider to test auto flush
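A hedged sketch of the time-provider idea: production code reads the system clock, while tests advance a mock clock to cross the auto-flush interval deterministically. Trait and type names are assumptions, not the actual API:

```rust
use std::sync::atomic::{AtomicI64, Ordering};

trait TimeProvider: Send + Sync {
    /// Current timestamp in milliseconds.
    fn current_time_millis(&self) -> i64;
}

/// Production implementation backed by the system clock.
struct StdTimeProvider;

impl TimeProvider for StdTimeProvider {
    fn current_time_millis(&self) -> i64 {
        use std::time::{SystemTime, UNIX_EPOCH};
        SystemTime::now()
            .duration_since(UNIX_EPOCH)
            .unwrap()
            .as_millis() as i64
    }
}

/// Test implementation whose clock is advanced manually, so an
/// auto-flush deadline can be crossed deterministically in tests.
#[derive(Default)]
struct MockTimeProvider {
    now: AtomicI64,
}

impl MockTimeProvider {
    fn advance(&self, millis: i64) {
        self.now.fetch_add(millis, Ordering::Relaxed);
    }
}

impl TimeProvider for MockTimeProvider {
    fn current_time_millis(&self) -> i64 {
        self.now.load(Ordering::Relaxed)
    }
}
```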
* chore: fix typos
* refactor: rename mock time provider
* style: fix clippy
* chore: address comment
* feat: acquire catalog and schema lock in region failover
* chore: remove unused code
* feat!: acquire catalog and schema lock in region migration
* feat: acquire catalog and schema lock in create table
* feat: call freeze if the active data buffer in a shard is full
* chore: more metrics
* chore: print metrics
* chore: enlarge freeze threshold
* test: test freeze
* test: fix config test
* feat(influxdb): add db query param support for v2 write api
* fix(influxdb): update authorize logic to get catalog and schema from query string
* fix(influxdb): address CR suggestions
* fix(influxdb): use the correct import
* feat: add create table fuzz test
* chore: add ci cfg for fuzz tests
* refactor: remove redundant nightly config
* chore: run fuzz test in debug mode
* chore: use ubuntu-latest
* fix: close connection
* chore: add cache in fuzz test ci
* chore: apply suggestion from CR
* chore: apply suggestion from CR
* chore: refactor the fuzz test action
* feat: add configuration for tls watch option
* test: sleep longer to ensure async task run
* test: update config api integration test
* refactor: rename function
* feat: Support automatic DNS lookup for kafka bootstrap servers
* Revert "feat: Support automatic DNS lookup for kafka bootstrap servers"
This reverts commit 5baed7b01d.
* feat: Support automatic DNS lookup for Kafka broker
* fix: resolve broker endpoint in client manager
* fix: apply clippy lints
* refactor: simplify the code with clippy hint
* refactor: move resolve_broker_endpoint to common/wal/src/lib.rs
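A hedged sketch of what resolving a broker endpoint to an IPv4 address could look like with the standard library; the real helper in common-wal has its own error types and configuration handling:

```rust
use std::net::{SocketAddr, ToSocketAddrs};

fn resolve_broker_endpoint(endpoint: &str) -> std::io::Result<String> {
    let addr: SocketAddr = endpoint
        .to_socket_addrs()?
        // Prefer an IPv4 address for the broker connection.
        .find(SocketAddr::is_ipv4)
        .ok_or_else(|| {
            std::io::Error::new(
                std::io::ErrorKind::NotFound,
                format!("no IPv4 address found for {endpoint}"),
            )
        })?;
    Ok(addr.to_string())
}

fn main() -> std::io::Result<()> {
    // "localhost:9092" resolves to something like "127.0.0.1:9092".
    println!("{}", resolve_broker_endpoint("localhost:9092")?);
    Ok(())
}
```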
* test: add mock test for resolve_broker_endpoint
* refactor: accept niebayes's advice
* refactor: rename EndpointIpNotFound to EndpointIPV4NotFound
* refactor: remove mock test and simplify the implementation
* docs: add comments about test_vallid_host_ipv6
* Apply suggestions from code review
Co-authored-by: niebayes <niebayes@gmail.com>
* move more common code
Signed-off-by: tison <wander4096@gmail.com>
---------
Signed-off-by: tison <wander4096@gmail.com>
Co-authored-by: tison <wander4096@gmail.com>
Co-authored-by: niebayes <niebayes@gmail.com>
* feat: partition level map
* test: test shard and builder
* fix: do not use pk index from shard builder
* feat: add multi key test
* fix: freeze shard before finding pk in shards
* fix: dict builder resets num_keys on finish
* feat: skip empty shard and builder
* feat: avoid pruning if possible
Implementations:
- Apply all filters on the partition column
- If no filter to prune, skip decoding keys
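A rough sketch of the "skip decoding keys when there is nothing to prune" idea from the list above; all names are illustrative:

```rust
struct Filter; // placeholder for a pushed-down predicate

struct ReadContext {
    /// Filters that apply to primary-key (partition) columns.
    key_filters: Vec<Filter>,
}

impl ReadContext {
    fn need_prune_key(&self) -> bool {
        !self.key_filters.is_empty()
    }

    fn keep_series(&self, encoded_key: &[u8]) -> bool {
        if !self.need_prune_key() {
            // No applicable filter: keep the series and skip the costly
            // primary-key decoding entirely.
            return true;
        }
        let decoded = decode_key(encoded_key);
        self.key_filters.iter().all(|f| evaluate(f, &decoded))
    }
}

fn decode_key(_raw: &[u8]) -> Vec<String> { vec![] } // stand-in
fn evaluate(_f: &Filter, _key: &[String]) -> bool { true } // stand-in
```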
* refactor: change the receivers of Shard::read/DataBuffer::read/DataParts::read to &self instead of &mut self
* refactor: remove allow(dead_code) in merge tree
* fix: KeyValues num_fields() is incorrect
* chore: fix warnings
* feat: support dedup
* feat: allow using the new memtable
* feat: serde default for config
* fix: resets pk index after finishing a dict
* feat: add fork method to the memtable
* feat: allow mark immutable returns result
* feat: use fork to create the mutable memtable
* feat: remove memtable builder from freeze
* chore: warnings
* fix: inspect error
* feat: iter returns result
* chore: maintains memtable id in region
* chore: update comment
* fix: remove region status if failed to freeze a memtable
* chore: update comment
* chore: iter should not require sync
* chore: implement freeze and fork for the new memtable
* refactor: data reader returns reference to data batch
* refactor: use range to create merger
* chore: Reference RecordBatch in DataBatch
* fix: top node not read if no next node
* refactor: move timestamp_array_to_i64_slice to data mod
* style: fix clippy
* chore: derive copy for DataBatch
* chore: address CR comments
* feat: write to a shard or a shard builder
* feat: freeze and fork for partition and shards
* chore: shard builder
* chore: change dict reader to support random access
* test: test write shard
* test: test write
* test: test memtable
* feat: add new and write_row to DataParts
* refactor: partition freeze shards
* refactor: write_with_pk_id
* style: fix clippy
* chore: add methods to get pk weights
* chore: fix compiler errors
* feat: impl merge reader for DataParts
* fix: fmt
* fix: sort rows with pk and ts according to sequence desc
* fix: remove pk weight as pk indices are already replaced by weights
* fix: format
* fix: some cr comments
* fix: some cr comments
* refactor: simplify trait's associated types
* fix: some cr comments
* refactor: set the actual bound port so we can use port 0 in testing
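A minimal demonstration of the underlying trick: bind to port 0 so the OS picks a free port, then record the actual bound address for later use:

```rust
use std::net::TcpListener;

fn main() -> std::io::Result<()> {
    // Port 0 asks the OS for any free port, so parallel tests never
    // fight over fixed port numbers.
    let listener = TcpListener::bind("127.0.0.1:0")?;
    // Keep the actual address around so the test (or client) can connect later.
    let bound_addr = listener.local_addr()?;
    println!("server is listening on {bound_addr}");
    Ok(())
}
```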
* Update src/servers/src/server.rs
Co-authored-by: Weny Xu <wenymedia@gmail.com>
* fmt
---------
Co-authored-by: Weny Xu <wenymedia@gmail.com>
* fix: logical region can't find region routes
* feat: fetch partitions info in batch
* refactor: rename batch functions
* refactor: rename DdlTaskExecutor to ProcedureExecutor
* feat: impl migrate_region and query_procedure_state for ProcedureExecutor
* feat: adds SQL function procedure_state and finish migrate_region impl
* fix: constant vector
* feat: unit tests for migrate_region and procedure_state
* test: test region migration by SQL
* fix: compile error after rebasing
* fix: clippy warnings
* feat: ensure procedure_state and migrate_region can be only called under greptime catalog
* fix: license header
* feat: impl for ScalarExpr
* feat: plain functions
* refactor: simpler trait bound&tests
* chore: remove unused imports
* chore: fmt
* refactor: early ret on first error
* refactor: remove redundant match arm
* chore: per review
* doc: `support` fn
* chore: per review more
* chore: more per review
* fix: extract_bound
* chore: per review
* refactor: reduce nest
* feat: replace pk index with pk_weight during freeze
* chore: add parameter to control pk_index replacement
* fix: dedup pk weights also
* fix: generate pk array before dedup
* feat: data buffer and related structs
* fix: some cr comments
* chore: remove freeze_threshold in DataBuffer
* fix: use LazyMutableVectorBuilder instead of two vector; add option to control dedup
* fix: dedup rows according to both pk weights and timestamps
* fix: assemble DataBatch on demand
* refactor: bring metrics to http output
* chore: remove unwrap
* chore: make walk plan accumulate
* chore: change field name and comment
* chore: add metrics to http resp header
* chore: move PrometheusJsonResponse to a separate file and impl IntoResponse
* chore: put metrics in prometheus resp header too
* fix(util): join_path function should not trim leading `/`
Signed-off-by: Hudson C. Dalpra <dalpra.hcd@gmail.com>
* fix(util): making required changes at join_path function
* fix(util): added unit tests to match function comments
---------
Signed-off-by: Hudson C. Dalpra <dalpra.hcd@gmail.com>
* chore: start plugins in standalone
* chore: respect current catalog in use statement for mysql
* chore: reduce unnecessary conversion to string
* chore: reduce duplicate code
* feat: add arrow format output for sql api
* refactor: remove unwraps
* test: add test for arrow format
* chore: update cargo toml format
* fix: resolve lint warnings
* fix: ensure outputs size is one
* feat: let TypeConversionRule aware query context timezone setting
* chore: don't optimize explain command
* feat: parse string into timestamp with timezone
* fix: compile error
* chore: check the scalar value type in predicate
* chore: remove mut for engine context
* chore: return none if the scalar value is utf8 in time range predicate
* fix: some fixme
* feat: let Date and DateTime parsing from string value be aware of timezone
* chore: tweak
* test: add datetime from_str test with timezone
* feat: construct function context from query context
* test: add timezone test for to_unixtime and date_format function
* fix: typo
* chore: apply suggestion
* test: adds string with timezone
* chore: apply CR suggestion
Co-authored-by: Lei, HUANG <6406592+v0y4g3r@users.noreply.github.com>
* chore: apply suggestion
---------
Co-authored-by: Lei, HUANG <6406592+v0y4g3r@users.noreply.github.com>
* feat(tests-fuzz): add CreateTableExprGenerator
* refactor: move Column to root of ir mod
* feat: add AlterTableExprGenerator
* feat: add Serialize and Deserialize derive
* chore: refactor the AlterExprGenerator
* feat: auto config cache and buffer size according to mem size
* feat: utils
* refactor: add util function to common config
* refactor: check cgroups
* refactor: code
* fix: test
* fix: test
* chore: cr comment
Co-authored-by: Yingwen <realevenyag@gmail.com>
Co-authored-by: Dennis Zhuang <killme2008@gmail.com>
* chore: remove default comment
---------
Co-authored-by: Yingwen <realevenyag@gmail.com>
Co-authored-by: Dennis Zhuang <killme2008@gmail.com>
* feat: adds date_format function
* fix: compile error
* chore: use system timezone for FunctionContext and EvalContext
* test: as_formatted_string
* test: sqlness test
* chore: rename function
* docs: Update README.md
Complying with ASF policy we should refer to Apache projects in their full form in the first and most prominent usage.
* Update README.md
* Update README.md
* test: add unit tests
* feat: introduce kafka runtime backed by testcontainers
* test: add test for kafka runtime
* fix: format
* chore: make kafka image ready to be used
* feat: add entry builder
* tmp
* test: add unit tests for client manager
* test: add some unit tests for kafka log store
* chore: resolve some todos
* chore: resolve some todos
* test: add unit tests for kafka log store
* chore: add deprecate develop branch warning
Signed-off-by: Ruihang Xia <waynestxia@gmail.com>
* tmp: ready to move unit tests to an indie dir
* test: update unit tests for client manager
* test: add unit tests for meta srv remote wal
* fix: license
* fix: test
* refactor: kafka image
* doc: add doc example for kafka image
* chore: migrate kafka image to an indie PR
* fix: CR
* fix: CR
* fix: test
* fix: CR
* fix: update Cargo.toml
* fix: CR
* feat: skip test if no endpoints env
* fix: format
* test: rewrite parallel test with barrier
---------
Signed-off-by: Ruihang Xia <waynestxia@gmail.com>
Co-authored-by: Ruihang Xia <waynestxia@gmail.com>
* feat: split an entry if it's too large
* chore: rewrite check records
* test: add some unit tests for record
* chore: rewrite entry splitting
* chore: add unit tests for build records
* chore: add more unit tests for record
* chore: rewrite encdec of record
* revert: ignored test
* fix: set limit for max_batch_size
* fix: clippy
* chore: remove heavy logging
* fix: CR
* fix: properly terminate
* fix: CR
* fix: compiling
* fix: sqlness
* fix: CR
* fix: license
* fix: license
* refactor(metrics): add 'greptimedb_' prefix for every metrics
* chore: use 'greptime_' as prefix
* chore: add some prefix for new metrics
* chore: fix format error
* feat: implement `KeyRwLock`
* refactor: use KeyRwLock instead of LockMap
* refactor: use StringKey instead of String
* chore: remove redundant code
* refactor: cleanup KeyRwLock staled locks before granting new lock
* feat: clean staled locks manually
* feat: sort lock key in lexicographically order
* feat: ensure the ref count before dropping the rwlock
* feat: add more tests for rwlock
* feat: drop the key guards first
* feat: drops the key guards in the reverse order
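A simplified sketch of a per-key read-write lock and of why callers sort keys before locking (and drop guards in reverse); the real `KeyRwLock` also cleans up stale entries and exposes richer APIs:

```rust
use std::collections::HashMap;
use std::sync::{Arc, Mutex};
use tokio::sync::{OwnedRwLockWriteGuard, RwLock};

#[derive(Default)]
struct KeyRwLock {
    inner: Mutex<HashMap<String, Arc<RwLock<()>>>>,
}

impl KeyRwLock {
    async fn write(&self, key: &str) -> OwnedRwLockWriteGuard<()> {
        let lock = {
            let mut map = self.inner.lock().unwrap();
            map.entry(key.to_string())
                .or_insert_with(|| Arc::new(RwLock::new(())))
                .clone()
        };
        lock.write_owned().await
    }
}

async fn lock_keys(locks: &KeyRwLock, mut keys: Vec<String>) -> Vec<OwnedRwLockWriteGuard<()>> {
    // Acquire guards in lexicographic order so two procedures that need an
    // overlapping key set cannot deadlock; drop them in reverse order.
    keys.sort();
    keys.dedup();
    let mut guards = Vec::with_capacity(keys.len());
    for key in &keys {
        guards.push(locks.write(key).await);
    }
    guards
}
```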
* chore: apply suggestions from CR
* chore: apply suggestions from CR
* chore: apply suggestions from CR
* fix: fix heartbeat handler ignore upgrade candidate instruction
* fix: fix handler did not inject wal options
* feat: expose `RegionMigrationProcedureTask`
* feat(tests-integration): add a naive region migration test
* chore: apply suggestions from CR
* feat: add test if the target region has migrated
* chore: apply suggestions from CR
* chore(tests-integration): add setup tests with kafka wal to README.md
* feat(tests-integration): add meta wal config
* fix(tests-integration): fix signature of both_instances_cases_with_kafka_wal
* chore(tests-integration): set num_topic to 3 for tests
* test(tests-integration): add a naive test with kafka wal
* chore: apply suggestions from CR
* refactor: use string type instead of Option type for '--store-key-prefix'
Signed-off-by: zyy17 <zyylsxm@gmail.com>
* chore: refine for code review comments
---------
Signed-off-by: zyy17 <zyylsxm@gmail.com>
* feat: remote write metric task
* chore: pass standalone task to frontend
* chore: change name to system metric
* fix: add header and rename to export metrics
* refactor: open regions in background
* feat: add status code of RegionNotReady
* feat: use RegionNotReady instead of RegionNotFound for a registering region
* chore: apply suggestions from CR
* feat: add status code of RegionBusy
* feat: return RegionBusy for mutually exclusive operations
* refactor: refactor wal config
* test: update tests related to wal
* feat: introduce kafka wal config
* chore: augment proto with wal options
* feat: augment region open request with wal options
* feat: augment mito region with wal options
* feat: augment region create request with wal options
* refactor: refactor log store trait
* feat: add skeleton for kafka log store
* feat: generalize building log store when starting datanode
* feat: integrate wal options to region write
* chore: minor update
* refactor: remove wal options from region create/open requests
* fix: compilation issues
* chore: insert wal options into region options upon initializing region server
* chore: integrate wal options into region options
* chore: fill in kafka wal config
* chore: reuse namespaces while writing to wal
* chore: minor update
* chore: fetch wal options from region while handling truncate/flush
* fix: region options test
* fix: resolve some review conversations
* refactor: serde with wal options
* fix: resolve some review conversations
* feat: introduce wal config and kafka config
* feat: introduce kafka topic manager and selector
* feat: introduce region wal options
* chore: build region wal options upon starting meta srv
* feat: integrate region wal options allocator into table meta allocator
* chore: add wal config to metasrv.example.toml
* chore: add region wal options map to create table procedure
* feat: augment region create request with wal options
* feat: augment DatanodeTableValue with region wal options map
* chore: encode region wal options upon constructing table creator
* feat: persist region wal options when creating table meta
* fix: sqlness test
* chore: set default wal provider to raft-engine
* refactor: refactor wal options
* chore: update wal options allocator
* refactor: rename region wal options to wal options
* chore: update usages of region wal options
* chore: add some comments to kafka
* chore: fill in kafka config
* test: add tests for serde wal config
* test: add tests for wal options
* refactor: refactor wal options allocator to enum
* refactor: store wal options into the request options instead
* fix: typo
* fix: typo
* refactor: move wal options map to region info
* refactor: refactor serialization and deserialization of wal options
* refactor: use serde_json to encode wal options
* chore: rename wal_options_map to region_wal_options
* chore: resolve some review comments
* fix: typo
* refactor: replace kebab-case with snake_case
* fix: sqlness and coverage tests
* fix: typo
* fix: coverage test
* fix: coverage test
* chore: resolve some review conversations
* fix: resolve some review conversations
* chore: format comments in metasrv.example.toml
* chore: update import style
* feat: integrate wal options allocator to standalone mode
* test: add compatible test for OpenRegion
* test: add compatible test for UpdateRegionMetadata
* chore: remove based suffix from topic selector type
* fix: typos and bit operation
* fix: helper
* chore: add tests in decimal128.rs and interval.rs
* chore: test
* chore: change proto version
* chore: clippy
* chore: change auth_fn to function and return response with json body
* chore: move unsupported to debug level
* chore: add docs and tests
* chore: rebase and update test
* feat: sql with influxdb v1 result format
* chore: add unit tests
* feat: minor refactor
* chore: by comment
* chore: u128 to u64 since serde can't deserialize u128 in enum
* chore: by comment
* chore: apply suggestion
* chore: revert suggestion
* chore: try again
---------
Co-authored-by: dennis zhuang <killme2008@gmail.com>
* fix: use linear interpolation to implement range LINEAR fill strategy
* chore: update test case
* chore: optimize linear interpolation implementation
* chore: update test and add comment
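For reference, the arithmetic behind a LINEAR fill is plain linear interpolation between the two nearest known points; a tiny, illustrative sketch (not the actual range-query code):

```rust
fn linear_fill(t0: i64, v0: f64, t1: i64, v1: f64, t: i64) -> f64 {
    debug_assert!(t0 < t1 && t0 <= t && t <= t1);
    let ratio = (t - t0) as f64 / (t1 - t0) as f64;
    v0 + (v1 - v0) * ratio
}

fn main() {
    // A point missing at t=15 between (10, 1.0) and (20, 3.0) is filled as 2.0.
    assert_eq!(linear_fill(10, 1.0, 20, 3.0, 15), 2.0);
}
```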
* feat: add build function and register it
build() function to return the database build info #2909
* refactor: fix typos and change code structure
* test: add test for build()
* refactor: cargo fmt and eliminate warnings
* Apply suggestions from code review
Co-authored-by: Weny Xu <wenymedia@gmail.com>
* refactor: move system.sql to a new directory
---------
Co-authored-by: Weny Xu <wenymedia@gmail.com>
* test: add flow test for open candidate region with retryable error
* test: add flow test for upgrade candidate retry failed
* test: add flow test for upgrade candidate with retry
* refactor: use downgrading the region instead of closing region
* feat: enhance the tests for alive keeper
* feat: add a metric to track region lease expired
* chore: apply suggestions from CR
* chore: enable logging for test_distributed_handle_ddl_request
* refactor: simplify lease keeper
* feat: add metrics for lease keeper
* chore: apply suggestions from CR
* chore: apply suggestions from CR
* chore: apply suggestions from CR
* refactor: move OpeningRegionKeeper to common_meta
* feat: register operating regions to MemoryRegionKeeper
* feat: add backward compatibility test for persistent ctx
* refactor: refactor State of region migration
* feat: add test utils for region migration tests
* test: add simple region migration tests
* chore: apply suggestions from CR
* feat: adds date_add and date_sub function
* test: add date function
* fix: adding interval to date returns wrong result
* fix: header
* fix: typo
* fix: timestamp resolution
* fix: capacity
* chore: apply suggestion
* fix: wrong behavior when adding intervals to timestamp, date and datetime
* chore: remove unused error
* test: refactor and add some tests
* chore: add logs and metrics
* feat: add the timer to track heartbeat interval
* feat: add the gauge to track region leases
* refactor: use gauge instead of the timer
* chore: apply suggestions from CR
* feat: add hit rate and etcd txn metrics
* feat: add random weighted choice in load_based selector
* fix: meta cannot save heartbeats when cluster has no region
* chore: print some log
* chore: remove unused code
* cr
* add some logs when filter result is empty
* feat: add page cache
* docs: update mito config toml
* feat: impl CachedPageReader
* feat: use cache reader to read row group
* feat: do not fetch data if we have pages in cache
* chore: return if nothing to fetch
* feat: enlarge page cache size
* test: test write read parquet
* test: test cache
* docs: update comments
* test: fix config api test
* feat: cache metrics
* feat: change default page cache size
* test: fix config api test
* feat: bump prost and fix pprof feature compiler errors
* feat: fix compiler errors on tokio-console
* chore: fix compiler errors
* ci: add all features check to ci
* feat: decrease the `page size` if the response message size exceeds the limit
* chore: apply suggestions from CR
* feat: prefer to use adaptive_page_size
* chore: apply suggestions from CR
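A hedged sketch of the adaptive page-size idea: shrink the page when an encoded response would exceed the message-size limit. The limit, names, and shrink policy here are assumptions, not the actual meta-srv implementation:

```rust
const MAX_MESSAGE_SIZE: usize = 4 * 1024 * 1024; // assumed limit
const MIN_PAGE_SIZE: usize = 1;

fn adaptive_page_size(current_page_size: usize, encoded_size: usize) -> usize {
    if encoded_size <= MAX_MESSAGE_SIZE {
        return current_page_size;
    }
    // Shrink proportionally to how far over the limit we are, but always
    // make progress and never go below the minimum page size.
    let shrink_factor = (encoded_size + MAX_MESSAGE_SIZE - 1) / MAX_MESSAGE_SIZE;
    (current_page_size / shrink_factor).max(MIN_PAGE_SIZE)
}

fn main() {
    // A 10 MiB response with page size 1000 drops to roughly a third.
    let next = adaptive_page_size(1000, 10 * 1024 * 1024);
    assert!(next < 1000 && next >= MIN_PAGE_SIZE);
}
```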
* feat: Control merge reader by batch size
* test: test heap have large range
* fix: merge one batch
* test: merge many duplicates
* test: test reheap hot
* feat: don't handle empty batch in merge reader
* feat: use fixed error message for unknown error
* feat: return fixed message for internal error as well
* chore: include status code in error message
* test: update tests for asserts of error message
* feat: change status code of some datafusion error
* fix: make CollectRecordbatch a query error
* test: update sqlness results
* feat: RepeatedTask adds execute-first-wait-later behavior.
* feat: add interval generator for repeated task component
* feat: impl debug for dyn IntervalGenerator trait
* chore: change some words
* chore: instead of a complicated approach, add an initial_delay to control the task interval
* chore: some improvements per PR comment
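An illustrative sketch of the resulting behavior: run the task once up front (after an optional `initial_delay`), then wait according to a pluggable interval generator. Names are assumptions, and the real `RepeatedTask` also supports cancellation:

```rust
use std::time::Duration;

trait IntervalGenerator: Send {
    /// Returns how long to wait before the next execution.
    fn next(&mut self) -> Duration;
}

/// Simplest generator: a fixed interval.
struct FixedInterval(Duration);

impl IntervalGenerator for FixedInterval {
    fn next(&mut self) -> Duration {
        self.0
    }
}

async fn run_repeated_task<F>(
    initial_delay: Option<Duration>,
    mut interval: impl IntervalGenerator,
    mut task: F,
) where
    F: FnMut() + Send,
{
    // The initial delay only controls when the first execution happens;
    // with `None` the task runs immediately ("execute first, wait later").
    if let Some(delay) = initial_delay {
        tokio::time::sleep(delay).await;
    }
    loop {
        task();
        tokio::time::sleep(interval.next()).await;
    }
}
```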
* feat: opentsdb row protocol
* fix: added comments for num of rows and failure if output is not of affected rows
* fix: added extra 1 to number of columns
* fix: avoided cloning datapoints, took ownership instead
* fix: avoided cloning datapoints, took ownership instead
* fix: changed vector slice to vector
* fix: remove clone
* fix: combined datapoints and requests with zip instead of enumerating
---------
Co-authored-by: Ubuntu <ubuntu@ip-172-31-43-183.us-east-2.compute.internal>
* feat: enable range expr nesting
* fix: change range expr rewrite format
* chore: organize range query tests
* chore: change range expr name(e.g. MAX(v) RANGE 5s FILL 6)
* chore: add range query test
* chore: fix code advice
* chore: fix ca
* ci: add copy-image.sh and upload-artifacts-to-s3.sh
* ci: remove unused options in dev build
* ci: use 'upload-artifacts-to-s3.sh' and 'copy-image.sh' in release-cn-artifacts action
* refactor: refine copy-image.sh
* fix: invalid requests created by nyc-taxi
* feat: add timestamp to table name
* style: fix clippy
* chore: re-export deps for client
* fix: wait result
* chore: no need to define a prefix constant
* feat: memtable support filter pushdown to prune primary keys
* fix: switch to next time series when pk not selected
* fix: allow predicate evaluation failure
* fix: some clippy warnings
* fix: panic when no primary key in schema
* feat: cache decoded record batch for primary key
* refactor: use arcswap instead of rwlock
* fix: format toml
* test: test different order
* test: add tests for missing and invalid columns
* fix: do not skip schema validation while missing columns
* chore: use field_columns()
* test: add tests for different column order
* chore: add dbname in region request header for tracking purpose
* chore: fix handle read
* chore: add write meter
* chore: add meter-core to dep
* chore: add converter between RegionRequestHeader and QueryContext & update proto version
* feat: add vector_cache to CacheManager
* feat: cache repeated vectors
* feat: skip decoding pk if output doesn't contain tags
* test: add TestRegionMetadataBuilder
* test: test ProjectionMapper
* test: test vector cache
* test: test projection mapper convert
* style: fix clippy
* feat: do not cache vector if it is too large
* docs: update comment
* feat: merge by heap
* fix: fix heap order
* feat: avoid pop/push next and refactor some functions
* feat: replace merge_batches and fix tests
* test: add test that a key is deleted
* fix: skip empty batch
* style: clippy
* chore: fix typos
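A small sketch of the heap-based k-way merge idea behind these commits, simplified to plain integers instead of batches; not the actual merge reader:

```rust
use std::cmp::Reverse;
use std::collections::BinaryHeap;

fn merge_sorted(sources: Vec<Vec<i64>>) -> Vec<i64> {
    // Each heap entry is (value, source index, offset within that source);
    // Reverse turns the max-heap into a min-heap.
    let mut heap = BinaryHeap::new();
    for (i, src) in sources.iter().enumerate() {
        if let Some(&v) = src.first() {
            heap.push(Reverse((v, i, 0usize)));
        }
    }

    let mut out = Vec::new();
    while let Some(Reverse((v, i, pos))) = heap.pop() {
        out.push(v);
        // Push the next element of the same source, if any.
        if let Some(&next) = sources[i].get(pos + 1) {
            heap.push(Reverse((next, i, pos + 1)));
        }
    }
    out
}

fn main() {
    let merged = merge_sorted(vec![vec![1, 4, 7], vec![2, 5], vec![3, 6]]);
    assert_eq!(merged, vec![1, 2, 3, 4, 5, 6, 7]);
}
```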
* feat: support greatest function
* feat: make greatest take date_type as input
* fix: move sqlness test into common/function/time.sql
* fix: avoid using unwrap
* fix: use downcast
* refactor: simplify arrow cast
* feat: implement new histogram data model
* feat: use prometheus table format for histogram
* refactor: remove duplicated code
* fix: histogram tag column
* fix: use accumulated count in buckets
* refactor: using row based protocol for otlp WIP
* refactor: use row based writer for otlp.
Also updated row writer for owned keys
* refactor: use row writers for otlp
* test: add integration tests for histogram
* refactor: change le column name
* test: test on_compaction_finished
* fix: avoid submitting the same region to compact
* feat: persist and recover compaction time window
* test: fix test
* test: sort like result
* feat: added show tables command
* fix(tests): fixed parser and statement unit tests
* chore: implemented display trait for table type
* fix: handled no table type and error for unsupported command in show database
* chore: removed full as a show kind, instead as a show option
* chore(tests): fixed failing test and added more tests for show full
* chore: refactored table types to use filters
* fix: changed table_type to tables
* feat: RegionMetadataBuilder allow adding/dropping columns multiple times
* test: test add if not exists/drop if exists
* feat: change validator and add need_alter
* test: fix tests and test need_alter
* test: test alter retry
* feat: open before create
* style: fix clippy
* refactor: move RegionOptions to options mod
* refactor: define compaction strategy in region/options.rs
* feat: use duration for time window
* refactor: rename CompactionStrategy to CompactionOptions
* feat: use serde to parse options
* feat: parse options
* feat: set options on creation/opening
* test: test create/open with options
* chore: remove todo
* feat: get compaction ttl and options from RegionOptions
* style: fix clippy
* chore: Remove unused engine_options
* style: fix clippy
* chore: remove todo
* fix: check version before alter region
* chore: apply suggestions from CR
* Update src/mito2/src/worker/handle_alter.rs
Co-authored-by: dennis zhuang <killme2008@gmail.com>
---------
Co-authored-by: dennis zhuang <killme2008@gmail.com>
* feat: allow multiple waiters in compaction request
* feat: compaction status wip
* feat: track region status in compaction scheduler
* feat: impl compaction scheduler
* feat: call compaction scheduler
* feat: remove status if nothing to compact
* feat: schedule compaction after flush
* feat: set compacting to false after compaction finished
* refactor: flush status only needs region id and version control
* refactor: schedule_compaction don't need region as argument
* test: test flush/scheduler for empty requests
* test: trigger compaction in test
* feat: notify scheduler on truncated
* chore: Apply suggestions from code review
Co-authored-by: JeremyHi <jiachun_feng@proton.me>
---------
Co-authored-by: JeremyHi <jiachun_feng@proton.me>
* refactor:
1. remove TableIdent, use TableId directly
2. use the latest greptime-proto
3. independently invalidate table id cache and table name cache
* rebase
* fix: resolve PR comments
* fix: resolve PR comments
* chore: set mutable limit to half of the global write buffer size
* refactor: put handle_flush_finished after handle_flush_request
* refactor: rename tests.rs to basic_test.rs
* style: fmt code
* feat: add writable flag to region.
* refactor: rename MitoEngine to MitoEngine::scanner
* feat: add set_writable() to RegionEngine
* feat: check whether region is writable
* feat: make set_writable sync
* test: test set_writable
* docs: update comments
* feat: send result on compaction failure
* refactor: wrap output sender in new type
* feat: on failure
* refactor: use get_region_or/writable_region_or
* refactor: remove send_result
* feat: notify waiters on flush scheduler drop
* test: fix tests
* fix: only alter writable region
* feat: add more info to error messages
* feat: store next column id in procedure
* fix: update next column id for table info
* test: fix add col test
* chore: remove location from invalid request error
* test: update test
* test: fix test
* test: add test for reopen
* feat: last entry id starts from flushed entry id
* fix: store flushed sequence and recover it from manifest
* test: check sequence in alter test
* test: more tests for alter
* feat: impl handle_alter wip
* refactor: move send_result to worker.rs
* feat: skeleton for handle_alter_request
* feat: write requests should wait for alteration
* feat: define alter request
* chore: no warnings
* fix: remove memtables after flush
* chore: update comments and impl add_write_request_to_pending
* feat: add schema version to RegionMetadata
* feat: impl alter_schema/can_alter_directly
* chore: use send_result
* test: pull next_batch again
* feat: convert pb AlterRequest to RegionAlterRequest
* feat: validate alter request
* feat: validate request and alter metadata
* feat: allow none location
* test: test alter
* fix: recover files and flushed entry id from manifest
* test: test alter
* chore: change comments and variables
* chore: fix compiler errors
* feat: add is_empty() to MemtableVersion
* test: fix metadata alter test
* fix: Compaction picker doesn't notify waiters if it returns None
* chore: address CR comments
* test: add tests for alter request
* refactor: use send_result
* feat: compaction component
* feat: mito2 compaction
* Avoid building time range predicates when merging SST files since TWCS doesn't enforce a strict time window.
* fix: some CR comments
* minor: change CompactionRequest::senders to an option
* chore: handle compaction finish error
* feat: integrate compaction into region worker
* chore: rebase upstream
* fix: Some CR comments
* chore: Apply suggestions from code review
* style: fix clippy
---------
Co-authored-by: Yingwen <realevenyag@gmail.com>
* refactor:
1. remove method `register_system_table` from CatalogManager
2. the creation of ScriptTable (as a system table) is removed from CatalogManager. Instead, the ScriptTable is created when Frontend instance is starting; and is created by calling Frontend instance's grpc handler.
* rebase
* fix: filter out outdated heartbeat, #1707
* feat: reorder handlers
* refactor: disableXXX to enableXXX
* feat: make full use of region leases to facilitate failover
* chore: minor refactor
* chore: by comment
* feat: logging on inactive/active
* chore: call handle_flush_request
* feat: alias SchedulerRef and clean scheduler on drop
* feat: add scheduler to workers
* feat: remove RegionMemtableStats
* feat: pick regions to flush
* feat: add more fields to region flush task
* feat: smallvec workspace dep
* feat: Use list to hold immutable memtables
* feat: flush job wip
* feat: use access layer to read write sst
* feat: flush memtables to l0
* feat: write manifest
* feat: schedule next flush on success
* feat: schedule flush on success and failure
* feat: add purger to region
* feat: apply edit after flush
* feat: collect stats for SSTs
* feat: manual flush
* test: test flush and fix manifest test
* feat: remove flush scheduler job limit
* fix: typo
* style: clippy
* feat: clean flushed files on failure
* chore: address CR comment
* refactor: Use put_rows
* feat: Clean flush scheduler on drop
* feat: remove region flush status on drop and close
* chore: address CR comment
* feat: alias SchedulerRef and clean scheduler on drop
* feat: add scheduler to workers
* feat: use access layer to read write sst
* feat: add purger to region
* refactor: allow getting region_dir from AccessLayer
* feat: add scheduler to FlushScheduler
* feat: getter for object store
* chore: fix typo
Co-authored-by: Ruihang Xia <waynestxia@gmail.com>
---------
Co-authored-by: Ruihang Xia <waynestxia@gmail.com>
* feat: only allow timestamp data type as time index
* test: update sqltest cases, todo: need some fixes
* fix: sqlness tests
* fix: forgot adding back cte test
* chore: style
* fix: existing null value for schema name value
* chore: fix null check
* fix: change CatalogNameValue and SchemaNameValue to Option
* fix: fix null case
* chore: update proto
* chore: add try from for schema name value
* chore: merge schema opts to table opts while creating table
* chore: use table ttl opts first
* chore: add unit test
* chore: update proto version
* feat: replay memtable when opening table
* test: region replay
* refactor: save logstore in TestEnv
* fix: some cr comments
* chore: rebase develop
* chore: update last entry id during replay
* chore: change userinfo to query_ctx in http handler
* chore: minor change
* chore: move prometheus http to http mod
* chore: fix unit test
* chore: add back schema check
* chore: minor change
* chore: remove clone
* refactor: use arrow::compute::concat instead of push values to vector builders
* feat: support projection
* refactor: remove sequence
* refactor: concatenate
* fix: series must not be empty
* refactor: projection
* feat: timestamp types sqlness tests
* feat: adds timestamp tests
* test: add string tests
* test: comment a case in timestamp
* test: add float type tests
* chore: adds TODO
* feat: set TZ=UTC for sqlness test
* feat: Implement slice and first/last timestamp for Batch
* feat(mito): implements sort/concat for Batch
* chore: fix typo
* chore: remove comments
* feat: sort and dedup
* test: test batch operations
* chore: cast enum to test op type
* test: test filter related api
* style: fix clippy
* docs: comment for slice
* chore: address CR comment
Don't return Option in get_timestamp()/get_sequence()
* feat: time series memtable
* feat: add some test
* fix: some clippy warnings
* chore: some rustdoc
* refactor: test
* fix: remove useless functions
* feat: add config for TimeSeriesMemtable
* chore: some optimize
* refactor: remove bucketing
* refactor: avoid cloning RegionMetadataRef across all Series; make initial_builder_capacity a const; sort batch only by timestamp and sequence
* feat: rewrite do_get for streaming get flight data
* feat: rewrite do_get call stack but leave the async stream adapter not modified yet
* feat: rewrite the async stream adapter to accept greptime record batch stream
* fix: resolve some PR comments
* feat: rewrite tests to adapt to the streaming do_get
* feat: add unit tests for streaming do_get
* feat: rewrite timer metric of merge scan
* remove unhelpful unit tests for streaming do_get
* add a new metric timer for merge scan and fix some test errors
* rewrite mysql writer to write query results in a streaming manner
* fix: fix fmt errors
* fix: rewrite sqlness runner to take into account the streaming do_get
* fix: fix toml format errors
* fix: resolve some PR comments
* fix: resolve some PR comments
* fix: refactor do_get to increase readability
* fix: refactor mysql try_write_one to increase readability
* fix: fix ddl client can not update leader addr
* chore: apply suggestions from CR
* feat: add message to context
* fix: only retry if unavailable or deadline exceeded
* chore: apply suggestions from CR
* feat: datanode's row inserter
* refactor: ExprFactory
* feat: row inserter in standalone mode
* chore: minor refactor
* feat: influxdb line protocol's row protocol
* chore: minor refactor
* improve: avoid using too many strings
* no longer async
Co-authored-by: Ruihang Xia <waynestxia@gmail.com>
* chore: do not check empty data
* chore: by review comment
* chore: by comment
* chore: by review comment
* chore: by review comment
---------
Co-authored-by: Ruihang Xia <waynestxia@gmail.com>
* move prometheus routes to default http server
Signed-off-by: sh2 <shawnhxh@outlook.com>
* fix ci test and remove the server logic of prometheus
* remove unused import and prometheus relevant code
* fix ci: rustfmt and test
* fix ci: silly fmt
* fix ci: silly silly fmt
* change `/prom_store` back to `/prometheus`
* remove unused variable
---------
Signed-off-by: sh2 <shawnhxh@outlook.com>
* refactor: KeyValues return ValueRef
* 1. Change KeyValues returned value from pb value to ValueRef
2. Replace OpType/SemanticType with pb's OpType and SemanticType to avoid duplicated conversions.
* feat: define min value of OpType as a const
* fix: toml format
* feat: remove greptimedb-telemetry feature
* feat: adds enable_telemetry option to metasrv and datanode
* refactor: move data_home from file config to storage config
* feat: store the installation uuid into datanode and metasrv working home
* fix: cargo toml fmt
* test: ignore region failover test when using local file storage
* test: ignore telemetry reporter in test mode
* feat: print warning log when enabling telemetry
* chore: the telemetry doc link
* chore: remove enable_telemetry from datanode example config file
* refactor: rename GREPTIMEDB_TELEMETRY_CLIENT_REQUEST_TIMEOUT
* chore: rename print_warn_log to print_anonymous_usage_data_disclaimer
* ci: add context argument in build-greptime-binary action
* refactor: add 'working-dir' in upload-artifacts action and rename 'context' to 'working-dir'
* refactor: use timestamp as part of image tag when trigger manually
* fix(timestamp): add trim for the input date string
* fix(timestamp): add analyzer rule to trim strings before conversion
* fix: adjust according to CR
* chore: add version reporter
* chore: add uuid for version report
* chore: add file license
* chore: format code
* chore: fix by pr comment
* chore: change version report api url
* chore: change greptimedb opentelemetry crate name
* chore: minor code beautification
* chore: add keys only option when range etcd
* chore: fix by pr comment
* chore: fix by pr comment
* chore: change uuid file location
* chore: only run telemetry in meta leader
* chore: add more test and some minor fix
* chore: make clippy happy
* chore: fix by pr comment
* chore: fix by pr comment
* chore: add debug log for greptimedb telemetry
* feat: add exists api into KvBackend
* refactor: region lease
* feat: filter out inactive node in keep-lease
* feat: register&deregister inactive node
* chore: doc
* chore: ut
* chore: minor refactor
* feat: use memory_kv to store inactive node
* fix: use real error in
* chore: make inactive_node_manager's func compact
* chore: more efficiently
* feat: clear inactive status on candidate node
* refactor: unify the make targets of building images
* refactor: make Dockerfile more clean
1. Add dev-builder image to build greptime binary easily;
2. Add 'docker/ci/Dockerfile-centos' to release centos image;
3. Delete the aarch64 Dockerfile; only one Dockerfile is needed now;
Signed-off-by: zyy17 <zyylsxm@gmail.com>
---------
Signed-off-by: zyy17 <zyylsxm@gmail.com>
* feat: define structs for version
* feat: Build region from metadata and memtable builder
* feat: impl validate for metadata
* feat: add more fields to RegionMetadata
* test: more tests
* test: more check and test
* feat: allow overwriting version
* style: fix clippy
* chore: remove useless Option type in plugins (#1544)
Co-authored-by: paomian <qtang@greptime.com>
* feat: first commit for time type
* feat: impl time type
* fix: arrow vectors type conversion
* test: add time test
* test: adds more tests for time type
* chore: style
* fix: sqlness result
* Update src/common/time/src/time.rs
Co-authored-by: Lei, HUANG <6406592+v0y4g3r@users.noreply.github.com>
* chore: CR comments
---------
Co-authored-by: localhost <xpaomian@gmail.com>
Co-authored-by: paomian <qtang@greptime.com>
Co-authored-by: Lei, HUANG <6406592+v0y4g3r@users.noreply.github.com>
refactor: move heartbeat configuration into an independent section in config file
* refactor: move heartbeat configuration into an independent section in config file
* feat: add HeartbeatOptions struct
* test: modify corresponding test case
* chore: modify corresponding example file
* feat: support append entries from multiple regions at a time
* chore: add some tests
* fix: false positive mutable_key warning
* fix: append_batch api
* fix: remove unused clippy allows
* chore(prom): rename prometheus(remote storage) to prom-store and promql(HTTP server) to prometheus
* chore: apply clippy suggestions
* chore: adjust format according to rustfmt
* feat: meta procedure options
* chore: tune meta procedure options in tests
* Update src/common/procedure/Cargo.toml
Co-authored-by: dennis zhuang <killme2008@gmail.com>
---------
Co-authored-by: dennis zhuang <killme2008@gmail.com>
* feat(config-endpoint): add initial implementation
* feat: add initial handler implementation
* fix: apply clippy suggestions, use axum response instead of string
* feat: address CR suggestions
* fix: minor adjustments in formatting
* fix: add a test
* feat: add to_toml_string method to options
* fix: adjust the assertion for the integration test
* fix: adjust expected indents
* fix: adjust assertion for the integration test
* fix: improve according to clippy
* feat(http_body_limit): add initial support for DefaultBodyLimit
* fix: address CR suggestions
* fix: adjust the const for default http body limit
* fix: adjust the toml_str for the test
* fix: address CR suggestions
* fix: body_limit units in example config toml files
* fix: address clippy suggestions
* feat: initial twcs impl
* chore: rename SimplePicker to LeveledPicker
* rename some structs
* Remove Compaction strategy
* make compaction picker a trait object
* make compaction picker configurable for every region
* chore: add some test for ttl
* add some tests
* fix: some style issues in cr
* feat: enable twcs when creating tables
* feat: allow config time window when creating tables
* fix: some cr comments
* feat: add log
* feat: print more info
* feat: use chain reader
* fix: panic on getting first range
* fix: prev not updated
* fix: reverse readers and iter backward
* chore: don't print windows in log
* feat: consider memtable range
Also fix the issue of using an incorrect comparison method to sort time ranges.
* fix: merge memtable window with sst's
* feat: add use_chain_reader option
* feat: skip empty memtables
* chore: change log level
* fix: memtable range not ordered
* style: fix clippy
* chore: address review comments
* chore: print region id in log
* feat: txn for meta kvstore
* feat: txn
* chore: add unit test
* chore: more test
* chore: more test
* Update src/meta-srv/src/service/store/memory.rs
Co-authored-by: LFC <bayinamine@gmail.com>
* chore: by cr
---------
Co-authored-by: LFC <bayinamine@gmail.com>
* refactor: add table_id to get_table()/table_exists()
* refactor: Add table_id to alter table request
* refactor: Add table id to DropTableRequest
* refactor: add table id to DropTableRequest
* refactor: Use table id as key for the tables map
* refactor: use table id as file engine's map key
* refactor: Remove table reference from engine's get_table/table_exists
* style: remove unused imports
* feat!: Add table id to TableRegionalValue
* style: fix clippy
* chore: add comments and logs
* feat: support to copy from orc format
* test: add copy from orc test
* chore: add license header
* refactor: remove unimplemented macro
* chore: apply suggestions from CR
* chore: bump orc-rust to 0.2.3
* feat: add initial implementation for status endpoint
* feat(status_endpoint): add more data to response
* feat(status_endpoint): use build data env vars
* feat(status_endpoint): add simple test
* fix(status_endpoint): adjust the toml indentation
# - If it's a tag push release, the version is the tag name (${{ github.ref_name }});
# - If it's a scheduled release, the version is '${{ env.NEXT_RELEASE_VERSION }}-nightly-$buildTime', like 'v0.2.0-nightly-20230313';
# - If it's a manual release, the version is '${{ env.NEXT_RELEASE_VERSION }}-$(git rev-parse --short HEAD)-YYYYMMDDSS', like 'v0.2.0-e5b243c-2023071245';
# - If it's a nightly build, the version is 'nightly-YYYYMMDD-$(git rev-parse --short HEAD)', like 'nightly-20230712-e5b243c'.
# Note: be careful of https://github.com/containers/skopeo/issues/1874; we decided to use the latest stable skopeo container.
release-dev-builder-images-cn:
# 1. The tag ('v*.*.*') push release: the release workflow will be triggered by the tag push event.
# 2. The scheduled release (the version will be '${{ env.NEXT_RELEASE_VERSION }}-nightly-YYYYMMDD'): the release workflow will be triggered by the schedule event.
on:
  push:
    tags:
  schedule:
    # At 00:00 on Monday.
    - cron: '0 0 * * 1'
  workflow_dispatch: # Allows you to run this workflow manually.
    # Note: GitHub Actions only supports 10 inputs, and they are already used up.
    inputs:
      linux_amd64_runner:
        type: choice
        description: The runner used to build linux-amd64 artifacts
        default: ec2-c6i.4xlarge-amd64
        options:
          - ubuntu-20.04
          - ubuntu-20.04-8-cores
          - ubuntu-20.04-16-cores
          - ubuntu-20.04-32-cores
          - ubuntu-20.04-64-cores
          - ec2-c6i.xlarge-amd64    # 4C8G
          - ec2-c6i.2xlarge-amd64   # 8C16G
          - ec2-c6i.4xlarge-amd64   # 16C32G
          - ec2-c6i.8xlarge-amd64   # 32C64G
          - ec2-c6i.16xlarge-amd64  # 64C128G
      linux_arm64_runner:
        type: choice
        description: The runner used to build linux-arm64 artifacts
        default: ec2-c6g.4xlarge-arm64
        options:
          - ec2-c6g.xlarge-arm64    # 4C8G
          - ec2-c6g.2xlarge-arm64   # 8C16G
          - ec2-c6g.4xlarge-arm64   # 16C32G
          - ec2-c6g.8xlarge-arm64   # 32C64G
          - ec2-c6g.16xlarge-arm64  # 64C128G
      macos_runner:
        type: choice
        description: The runner used to build macOS artifacts
        default: macos-latest
        options:
          - macos-latest
      skip_test:
        description: Do not run integration tests during the build
        type: boolean
        default: true
      build_linux_amd64_artifacts:
        type: boolean
        description: Build linux-amd64 artifacts
        required: false
        default: false
      build_linux_arm64_artifacts:
        type: boolean
        description: Build linux-arm64 artifacts
        required: false
        default: false
      build_macos_artifacts:
        type: boolean
        description: Build macos artifacts
        required: false
        default: false
      build_windows_artifacts:
        type: boolean
        description: Build Windows artifacts
        required: false
        default: false
      publish_github_release:
        type: boolean
        description: Create GitHub release and upload artifacts
        required: false
        default: false
      release_images:
        type: boolean
        description: Build and push images to DockerHub and ACR
        required: false
        default: false
name: Release
# Use env variables to control the whole release process.
env:
  # The arguments of building greptime.
  RUST_TOOLCHAIN: nightly-2024-04-18
  CARGO_PROFILE: nightly
  # Controls whether to run tests, including unit tests, integration tests and sqlness.
Thanks a lot for considering contributing to GreptimeDB. We believe people like you would make GreptimeDB a great product. We intend to build a community where individuals can have open talks, show respect for one another, and speak with true ❤️. Meanwhile, we are to keep transparency and make your effort count here.
Thanks a lot for considering contributing to GreptimeDB. We believe people like you would make GreptimeDB a great product. We intend to build a community where individuals can have open talks, show respect for one another, and speak with true ❤️. Meanwhile, we are to keep transparency and make your effort count here.
Read the guidelines, and they can help you get started. Communicate with respect to developers maintaining and developing the project. In return, they should reciprocate that respect by addressing your issue, reviewing changes, as well as helping finalize and merge your pull requests.
Please read the guidelines, and they can help you get started. Communicate with respect to developers maintaining and developing the project. In return, they should reciprocate that respect by addressing your issue, reviewing changes, as well as helping finalize and merge your pull requests.
Follow our [README](https://github.com/GreptimeTeam/greptimedb#readme) to get the whole picture of the project. To learn about the design of GreptimeDB, please refer to the [design docs](https://github.com/GrepTimeTeam/docs).
Follow our [README](https://github.com/GreptimeTeam/greptimedb#readme) to get the whole picture of the project. To learn about the design of GreptimeDB, please refer to the [design docs](https://github.com/GrepTimeTeam/docs).
@@ -10,7 +10,7 @@ Follow our [README](https://github.com/GreptimeTeam/greptimedb#readme) to get th
It can feel intimidating to contribute to a complex project, but it can also be exciting and fun. These general notes will help everyone participate in this communal activity.
It can feel intimidating to contribute to a complex project, but it can also be exciting and fun. These general notes will help everyone participate in this communal activity.
- Follow the [Code of Conduct](https://github.com/GreptimeTeam/greptimedb/blob/develop/CODE_OF_CONDUCT.md)
- Follow the [Code of Conduct](https://github.com/GreptimeTeam/greptimedb/blob/main/CODE_OF_CONDUCT.md)
- Small changes make huge differences. We will happily accept a PR making a single character change if it helps move forward. Don't wait to have everything working.
- Small changes make huge differences. We will happily accept a PR making a single character change if it helps move forward. Don't wait to have everything working.
- Check the closed issues before opening your issue.
- Check the closed issues before opening your issue.
- Try to follow the existing style of the code.
- Try to follow the existing style of the code.
@@ -21,12 +21,12 @@ Pull requests are great, but we accept all kinds of other help if you like. Such
- Write tutorials or blog posts. Blog, speak about, or create tutorials about one of GreptimeDB's many features. Mention [@greptime](https://twitter.com/greptime) on Twitter and email info@greptime.com so we can give pointers and tips and help you spread the word by promoting your content on Greptime communication channels.
- Improve the documentation. [Submit documentation](http://github.com/greptimeTeam/docs/) updates, enhancements, designs, or bug fixes; fixing any spelling or grammar errors will be very much appreciated.
- Present at meetups and conferences about your GreptimeDB projects. Your unique challenges and successes in building things with GreptimeDB can provide great speaking material. We'd love to review your talk abstract, so get in touch with us if you'd like some help!
- Submit bug reports. To report a bug or a security issue, you can [open a new GitHub issue](https://github.com/GrepTimeTeam/greptimedb/issues/new).
- Speak up about feature requests. Sending feedback is a great way for us to better understand your different use cases of GreptimeDB. If you want to share your experience with GreptimeDB, or if you want to discuss any ideas, you can start a discussion on [GitHub discussions](https://github.com/GreptimeTeam/greptimedb/discussions), chat with the Greptime team on [Slack](https://greptime.com/slack), or tweet [@greptime](https://twitter.com/greptime) on Twitter.
## Code of Conduct
Also, there are things that we are not looking for because they don't match the goals of the product or benefit the community. Please read the [Code of Conduct](https://github.com/GreptimeTeam/greptimedb/blob/main/CODE_OF_CONDUCT.md); we hope everyone keeps good manners and becomes an honored member.
## License
### Before PR
- To ensure that the community is free and confident in its ability to use your contributions, please sign the Contributor License Agreement (CLA), which will be incorporated in the pull request process.
- Make sure all your code is formatted and follows the [coding style](https://pingcap.github.io/style-guide/rust/) and [style guide](http://github.com/greptimeTeam/docs/style-guide.md).
- Make sure all files have a proper license header (run `docker run --rm -v $(pwd):/github/workspace ghcr.io/korandoru/hawkeye-native:v3 format` from the project root).
- Make sure all unit tests pass (using `cargo test --workspace` or [nextest](https://nexte.st/index.html) `cargo nextest run`).
- Make sure all clippy warnings are fixed (you can check it locally by running `cargo clippy --workspace --all-targets -- -D warnings`).
### Title
The titles of pull requests should be prefixed with category names listed in the [Conventional Commits specification](https://www.conventionalcommits.org/en/v1.0.0)
like `feat`/`fix`/`docs`, with a concise summary of the code change following. AVOID using the last commit message as the pull request title.
### Description
## Community
The core team will be thrilled if you participate in any way you like. When you are stuck, try asking for help by filing an issue with a detailed description of what you were trying to do and what went wrong. If you have any questions or if you would like to get involved in our community, please check out:
- [GreptimeDB Community Slack](https://greptime.com/slack)
## Introduction
**GreptimeDB** is an open-source time-series database focusing on efficiency, scalability, and analytical capabilities.
Designed to work on infrastructure of the cloud era, GreptimeDB benefits users with its elasticity and commodity storage, offering a fast and cost-effective **alternative to InfluxDB** and a **long-term storage for Prometheus**.
## Why GreptimeDB
Our core developers have been building time-series data platforms for years. Based on our best practices, GreptimeDB is born to give you:
* **Easy horizontal scaling**
Seamless scalability from a standalone binary at edge to a robust, highly available distributed cluster in cloud, with a transparent experience for both developers and administrators.
* **Analyzing time-series data**
Query your time-series data with SQL and PromQL. Use Python scripts to facilitate complex analytical tasks.
* **Cloud-native distributed database**
Fully open-source distributed cluster architecture that harnesses the power of cloud-native elastic computing resources.
* **Performance and cost-effectiveness**
Flexible indexing capabilities and a distributed, parallel-processing query engine tackle high-cardinality issues. Optimized columnar layout for handling time-series data; compacted, compressed, and stored on various storage backends, particularly cloud object storage with 50x cost efficiency.
* **Compatible with InfluxDB, Prometheus and more protocols**
Widely adopted database protocols and APIs, including MySQL, PostgreSQL, and Prometheus Remote Storage, etc. [Read more](https://docs.greptime.com/user-guide/clients/overview).
## Quick Start
### GreptimePlay
To install GreptimeDB locally, the recommended way is via Docker:
```shell
docker pull greptime/greptimedb
```
To build GreptimeDB from source, the following dependencies are required:
- C/C++ Toolchain: provides basic tools for compiling and linking. This is available as `build-essential` on Ubuntu and under similar names on other platforms.
- Rust: the easiest way to install Rust is to use [`rustup`](https://rustup.rs/), which will check our `rust-toolchain` file and install the correct Rust version for you.
- Protobuf: `protoc` is required for compiling `.proto` files. `protobuf` is available from the major package managers on macOS and Linux distributions. You can find installation instructions [here](https://grpc.io/docs/protoc-installation/). **Note that the `protoc` version needs to be >= 3.15** because we use the `optional` keyword. You can check it with `protoc --version`.
- python3-dev or python3-devel (optional, only needed if you want to run scripts in CPython; this also requires enabling the `pyo3_backend` feature when compiling, either with `cargo run -F pyo3_backend` or by adding `pyo3_backend` to `src/script/Cargo.toml`'s `features.default`, like `default = ["python", "pyo3_backend"]`): this installs the Python shared library required for running the Python scripting engine in CPython mode. It is available as `python3-dev` on Ubuntu (`sudo apt install python3-dev`) or `python3-devel` on RPM-based distributions (e.g. Fedora, Red Hat, SuSE). macOS's `Python3` package should include this shared library by default. More detail on compiling with PyO3 can be found in [PyO3](https://pyo3.rs/v0.18.1/building_and_distribution#configuring-the-python-version)'s documentation.
#### Build with Docker
A docker image with necessary dependencies is provided:
Build GreptimeDB binary:
```shell
make
```
Run a standalone server:
```shell
cargo run -- standalone start
```
Or if you built from docker:
```shell
docker run -p 4002:4002 -v "$(pwd):/tmp/greptimedb" greptime/greptimedb standalone start
```
Please see [the online document site](https://docs.greptime.com/getting-started/overview#install-greptimedb) for more installation options and [operations info](https://docs.greptime.com/user-guide/operations/overview).
### Get started
Read the [complete getting started guide](https://docs.greptime.com/getting-started/overview#connect) on our [official document site](https://docs.greptime.com/).
To write and query data, GreptimeDB is compatible with multiple [protocols and clients](https://docs.greptime.com/user-guide/client/overview).
For Linux and macOS, you can easily download pre-built binaries, including official releases and nightly builds, that are ready to use.
In most cases, downloading the version without PyO3 is sufficient. However, if you plan to run scripts in CPython (and use Python packages like NumPy and Pandas), you will need to download the version with PyO3 and install a Python interpreter whose version matches the one the PyO3 build was compiled against.
We recommend using virtualenv for the installation process to manage multiple Python versions.
Please refer to [contribution guidelines](CONTRIBUTING.md) and [internal concepts docs](https://docs.greptime.com/contributor-guide/overview.html) for more information.
## Acknowledgement
- GreptimeDB uses [Apache Arrow™](https://arrow.apache.org/) as the memory model and [Apache Parquet™](https://parquet.apache.org/) as the persistent file format.
- GreptimeDB's query engine is powered by [Apache Arrow DataFusion™](https://arrow.apache.org/datafusion/).
- [Apache OpenDAL™](https://opendal.apache.org) gives GreptimeDB a very general and elegant data access abstraction layer.
- GreptimeDB's meta service is based on [etcd](https://etcd.io/).
- GreptimeDB uses [RustPython](https://github.com/RustPython/RustPython) for experimental embedded python scripting.
The wal benchmarker serves to evaluate the performance of GreptimeDB's Write-Ahead Log (WAL) component. It meticulously assesses the read/write performance of the WAL under diverse workloads generated by the benchmarker.
### How to use
To compile the benchmarker, navigate to the `greptimedb/benchmarks` directory and execute `cargo build --release`. Subsequently, you'll find the compiled target located at `greptimedb/target/release/wal_bench`.
The `./wal_bench -h` command reveals numerous arguments that the target accepts. Among these, a notable one is the `cfg-file` argument. By utilizing a configuration file in the TOML format, you can bypass the need to repeatedly specify cumbersome arguments.
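For example, a typical invocation might look like the following; the exact flag spelling is an assumption derived from the argument name above, so check `./wal_bench -h` for the authoritative form:
```shell
# List all supported arguments and their defaults.
./wal_bench -h

# Hypothetical: load the arguments from a TOML file instead of passing them
# one by one on the command line.
./wal_bench --cfg-file ./wal_bench.toml
```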
| `prom_store.enable` | Bool | `true` | Whether to enable Prometheus remote write and read in HTTP API. |
| `prom_store.with_metric_engine` | Bool | `true` | Whether to store the data from Prometheus remote write in metric engine. |
| `wal` | -- | -- | The WAL options. |
| `wal.provider` | String | `raft_engine` | The provider of the WAL.<br/>- `raft_engine`: the wal is stored in the local file system by raft-engine.<br/>- `kafka`: it's remote wal that data is stored in Kafka. |
| `wal.dir` | String | `None` | The directory to store the WAL files.<br/>**It's only used when the provider is `raft_engine`**. |
| `wal.file_size` | String | `256MB` | The size of the WAL segment file.<br/>**It's only used when the provider is `raft_engine`**. |
| `wal.purge_threshold` | String | `4GB` | The threshold of the WAL size to trigger a flush.<br/>**It's only used when the provider is `raft_engine`**. |
| `wal.purge_interval` | String | `10m` | The interval to trigger a flush.<br/>**It's only used when the provider is `raft_engine`**. |
| `wal.read_batch_size` | Integer | `128` | The read batch size.<br/>**It's only used when the provider is `raft_engine`**. |
| `wal.sync_write` | Bool | `false` | Whether to use sync write.<br/>**It's only used when the provider is `raft_engine`**. |
| `wal.enable_log_recycle` | Bool | `true` | Whether to reuse logically truncated log files.<br/>**It's only used when the provider is `raft_engine`**. |
| `wal.prefill_log_files` | Bool | `false` | Whether to pre-create log files on start up.<br/>**It's only used when the provider is `raft_engine`**. |
| `wal.sync_period` | String | `10s` | Duration for fsyncing log files.<br/>**It's only used when the provider is `raft_engine`**. |
| `wal.broker_endpoints` | Array | -- | The Kafka broker endpoints.<br/>**It's only used when the provider is `kafka`**. |
| `wal.max_batch_size` | String | `1MB` | The max size of a single producer batch.<br/>Warning: Kafka has a default limit of 1MB per message in a topic.<br/>**It's only used when the provider is `kafka`**. |
| `wal.linger` | String | `200ms` | The linger duration of a kafka batch producer.<br/>**It's only used when the provider is `kafka`**. |
| `wal.consumer_wait_timeout` | String | `100ms` | The consumer wait timeout.<br/>**It's only used when the provider is `kafka`**. |
| `wal.backoff_init` | String | `500ms` | The initial backoff delay.<br/>**It's only used when the provider is `kafka`**. |
| `wal.backoff_max` | String | `10s` | The maximum backoff delay.<br/>**It's only used when the provider is `kafka`**. |
| `wal.backoff_base` | Integer | `2` | The exponential backoff rate, i.e. next backoff = base * current backoff.<br/>**It's only used when the provider is `kafka`**. |
| `wal.backoff_deadline` | String | `5mins` | The deadline of retries.<br/>**It's only used when the provider is `kafka`**. |
| `storage` | -- | -- | The data storage options. |
| `storage.data_home` | String | `/tmp/greptimedb/` | The working home directory. |
| `storage.type` | String | `File` | The storage type used to store the data.<br/>- `File`: the data is stored in the local file system.<br/>- `S3`: the data is stored in the S3 object storage.<br/>- `Gcs`: the data is stored in the Google Cloud Storage.<br/>- `Azblob`: the data is stored in the Azure Blob Storage.<br/>- `Oss`: the data is stored in the Aliyun OSS. |
| `storage.cache_path` | String | `None` | Cache configuration for object storage such as 'S3' etc.<br/>The local file cache directory. |
| `storage.cache_capacity` | String | `None` | The local file cache capacity in bytes. |
| `storage.bucket` | String | `None` | The S3 bucket name.<br/>**It's only used when the storage type is `S3`, `Oss` and `Gcs`**. |
| `storage.root` | String | `None` | The S3 data will be stored in the specified prefix, for example, `s3://${bucket}/${root}`.<br/>**It's only used when the storage type is `S3`, `Oss` and `Azblob`**. |
| `storage.access_key_id` | String | `None` | The access key id of the aws account.<br/>It's **highly recommended** to use AWS IAM roles instead of hardcoding the access key id and secret key.<br/>**It's only used when the storage type is `S3` and `Oss`**. |
| `storage.secret_access_key` | String | `None` | The secret access key of the aws account.<br/>It's **highly recommended** to use AWS IAM roles instead of hardcoding the access key id and secret key.<br/>**It's only used when the storage type is `S3`**. |
| `storage.access_key_secret` | String | `None` | The secret access key of the aliyun account.<br/>**It's only used when the storage type is `Oss`**. |
| `storage.account_name` | String | `None` | The account key of the azure account.<br/>**It's only used when the storage type is `Azblob`**. |
| `storage.account_key` | String | `None` | The account key of the azure account.<br/>**It's only used when the storage type is `Azblob`**. |
| `storage.scope` | String | `None` | The scope of the google cloud storage.<br/>**It's only used when the storage type is `Gcs`**. |
| `storage.credential_path` | String | `None` | The credential path of the google cloud storage.<br/>**It's only used when the storage type is `Gcs`**. |
| `storage.container` | String | `None` | The container of the azure account.<br/>**It's only used when the storage type is `Azblob`**. |
| `storage.sas_token` | String | `None` | The sas token of the azure account.<br/>**It's only used when the storage type is `Azblob`**. |
| `storage.endpoint` | String | `None` | The endpoint of the S3 service.<br/>**It's only used when the storage type is `S3`, `Oss`, `Gcs` and `Azblob`**. |
| `storage.region` | String | `None` | The region of the S3 service.<br/>**It's only used when the storage type is `S3`, `Oss`, `Gcs` and `Azblob`**. |
| `[[region_engine]]` | -- | -- | The region engine options. You can configure multiple region engines. |
| `region_engine.mito.num_workers` | Integer | `8` | Number of region workers. |
| `region_engine.mito.worker_channel_size` | Integer | `128` | Request channel size of each worker. |
| `region_engine.mito.worker_request_batch_size` | Integer | `64` | Max batch size for a worker to handle requests. |
| `region_engine.mito.manifest_checkpoint_distance` | Integer | `10` | Number of meta action updated to trigger a new checkpoint for the manifest. |
| `region_engine.mito.compress_manifest` | Bool | `false` | Whether to compress manifest and checkpoint file by gzip (default false). |
| `region_engine.mito.max_background_jobs` | Integer | `4` | Max number of running background jobs |
| `region_engine.mito.auto_flush_interval` | String | `1h` | Interval to auto flush a region if it has not flushed yet. |
| `region_engine.mito.global_write_buffer_size` | String | `1GB` | Global write buffer size for all regions. If not set, it's default to 1/8 of OS memory with a max limitation of 1GB. |
| `region_engine.mito.global_write_buffer_reject_size` | String | `2GB` | Global write buffer size threshold to reject write requests. If not set, it's default to 2 times of `global_write_buffer_size` |
| `region_engine.mito.sst_meta_cache_size` | String | `128MB` | Cache size for SST metadata. Setting it to 0 to disable the cache.<br/>If not set, it's default to 1/32 of OS memory with a max limitation of 128MB. |
| `region_engine.mito.vector_cache_size` | String | `512MB` | Cache size for vectors and arrow arrays. Setting it to 0 to disable the cache.<br/>If not set, it's default to 1/16 of OS memory with a max limitation of 512MB. |
| `region_engine.mito.page_cache_size` | String | `512MB` | Cache size for pages of SST row groups. Setting it to 0 to disable the cache.<br/>If not set, it's default to 1/16 of OS memory with a max limitation of 512MB. |
| `region_engine.mito.scan_parallelism` | Integer | `0` | Parallelism to scan a region (default: 1/4 of cpu cores).<br/>- `0`: using the default value (1/4 of cpu cores).<br/>- `1`: scan in current thread.<br/>- `n`: scan in parallelism n. |
| `region_engine.mito.parallel_scan_channel_size` | Integer | `32` | Capacity of the channel to send data from parallel scan tasks to the main task. |
| `region_engine.mito.allow_stale_entries` | Bool | `false` | Whether to allow stale WAL entries read during replay. |
| `region_engine.mito.inverted_index` | -- | -- | The options for inverted index in Mito engine. |
| `region_engine.mito.inverted_index.create_on_flush` | String | `auto` | Whether to create the index on flush.<br/>- `auto`: automatically<br/>- `disable`: never |
| `region_engine.mito.inverted_index.create_on_compaction` | String | `auto` | Whether to create the index on compaction.<br/>- `auto`: automatically<br/>- `disable`: never |
| `region_engine.mito.inverted_index.apply_on_query` | String | `auto` | Whether to apply the index on query<br/>- `auto`: automatically<br/>- `disable`: never |
| `region_engine.mito.inverted_index.mem_threshold_on_create` | String | `64M` | Memory threshold for performing an external sort during index creation.<br/>Setting to empty will disable external sorting, forcing all sorting operations to happen in memory. |
| `region_engine.mito.inverted_index.intermediate_path` | String | `""` | File system path to store intermediate files for external sorting (default `{data_home}/index_intermediate`). |
| `region_engine.mito.memtable.index_max_keys_per_shard` | Integer | `8192` | The max number of keys in one shard.<br/>Only available for `partition_tree` memtable. |
| `region_engine.mito.memtable.data_freeze_threshold` | Integer | `32768` | The max rows of data inside the actively writing buffer in one shard.<br/>Only available for `partition_tree` memtable. |
| `region_engine.mito.memtable.fork_dictionary_bytes` | String | `1GiB` | Max dictionary bytes.<br/>Only available for `partition_tree` memtable. |
| `logging` | -- | -- | The logging options. |
| `logging.dir` | String | `/tmp/greptimedb/logs` | The directory to store the log files. |
| `logging.level` | String | `None` | The log level. Can be `info`/`debug`/`warn`/`error`. |
| `logging.append_stdout` | Bool | `true` | Whether to append logs to stdout. |
| `logging.tracing_sample_ratio` | -- | -- | The percentage of tracing that will be sampled and exported.<br/>Valid range `[0, 1]`: 1 means all traces are sampled, 0 means no traces are sampled; the default value is 1.<br/>Ratios > 1 are treated as 1, and fractions < 0 are treated as 0. |
| `export_metrics` | -- | -- | The datanode can export its metrics and send them to a Prometheus-compatible service (e.g. send to `greptimedb` itself) via the remote-write API.<br/>This is only used for `greptimedb` to export its own metrics internally. It's different from Prometheus scrape. |
| `export_metrics.remote_write.url` | String | `""` | The URL the metrics are sent to, for example: `http://127.0.0.1:4000/v1/prometheus/write?db=information_schema`. |
| `logging.append_stdout` | Bool | `true` | Whether to append logs to stdout. |
| `logging.tracing_sample_ratio` | -- | -- | The percentage of tracing that will be sampled and exported.<br/>Valid range `[0, 1]`: 1 means all traces are sampled, 0 means no traces are sampled; the default value is 1.<br/>Ratios > 1 are treated as 1, and fractions < 0 are treated as 0. |
| `export_metrics` | -- | -- | The datanode can export its metrics and send them to a Prometheus-compatible service (e.g. send to `greptimedb` itself) via the remote-write API.<br/>This is only used for `greptimedb` to export its own metrics internally. It's different from Prometheus scrape. |
| `export_metrics.remote_write.url` | String | `""` | The URL the metrics are sent to, for example: `http://127.0.0.1:4000/v1/prometheus/write?db=information_schema`. |
| `data_home` | String | `/tmp/metasrv/` | The working home directory. |
| `bind_addr` | String | `127.0.0.1:3002` | The bind address of metasrv. |
| `server_addr` | String | `127.0.0.1:3002` | The communication server address for frontend and datanode to connect to metasrv, "127.0.0.1:3002" by default for localhost. |
| `procedure.max_metadata_value_size` | String | `1500KiB` | Auto-split threshold for large metadata values.<br/>The GreptimeDB procedure framework uses etcd as the default metadata storage backend, and etcd limits the maximum size of any request to 1.5 MiB.<br/>1500KiB = 1536KiB (1.5MiB) - 36KiB (reserved size of the key).<br/>Comment out `max_metadata_value_size` to disable splitting large values (no limit). |
| `wal.topic_name_prefix` | String | `greptimedb_wal_topic` | A Kafka topic is constructed by concatenating `topic_name_prefix` and `topic_id`. |
| `wal.replication_factor` | Integer | `1` | Expected number of replicas of each partition. |
| `wal.create_topic_timeout` | String | `30s` | The timeout above which a topic creation operation will be cancelled. |
| `wal.backoff_init` | String | `500ms` | The initial backoff for kafka clients. |
| `wal.backoff_max` | String | `10s` | The maximum backoff for kafka clients. |
| `wal.backoff_base` | Integer | `2` | Exponential backoff rate, i.e. next backoff = base * current backoff. |
| `wal.backoff_deadline` | String | `5mins` | Stop reconnecting if the total wait time reaches the deadline. If this config is missing, the reconnecting won't terminate. |
| `logging` | -- | -- | The logging options. |
| `logging.dir` | String | `/tmp/greptimedb/logs` | The directory to store the log files. |
| `logging.level` | String | `None` | The log level. Can be `info`/`debug`/`warn`/`error`. |
| `logging.append_stdout` | Bool | `true` | Whether to append logs to stdout. |
| `logging.tracing_sample_ratio` | -- | -- | The percentage of tracing that will be sampled and exported.<br/>Valid range `[0, 1]`: 1 means all traces are sampled, 0 means no traces are sampled; the default value is 1.<br/>Ratios > 1 are treated as 1, and fractions < 0 are treated as 0. |
| `export_metrics` | -- | -- | The datanode can export its metrics and send them to a Prometheus-compatible service (e.g. send to `greptimedb` itself) via the remote-write API.<br/>This is only used for `greptimedb` to export its own metrics internally. It's different from Prometheus scrape. |
| `export_metrics.remote_write.url` | String | `""` | The URL the metrics are sent to, for example: `http://127.0.0.1:4000/v1/prometheus/write?db=information_schema`. |
| `mode` | String | `standalone` | The running mode of the datanode. It can be `standalone` or `distributed`. |
| `node_id` | Integer | `None` | The datanode identifier and should be unique in the cluster. |
| `require_lease_before_startup` | Bool | `false` | Start services after regions have obtained leases.<br/>It will block the datanode start if it can't receive leases in the heartbeat from metasrv. |
| `init_regions_in_background` | Bool | `false` | Initialize all regions in the background during the startup.<br/>By default, it provides services after all regions have been initialized. |
| `rpc_addr` | String | `127.0.0.1:3001` | The gRPC address of the datanode. |
| `rpc_hostname` | String | `None` | The hostname of the datanode. |
| `rpc_runtime_size` | Integer | `8` | The number of gRPC server worker threads. |
| `rpc_max_recv_message_size` | String | `512MB` | The maximum receive message size for gRPC server. |
| `rpc_max_send_message_size` | String | `512MB` | The maximum send message size for gRPC server. |
| `wal.provider` | String | `raft_engine` | The provider of the WAL.<br/>- `raft_engine`: the wal is stored in the local file system by raft-engine.<br/>- `kafka`: it's remote wal that data is stored in Kafka. |
| `wal.dir` | String | `None` | The directory to store the WAL files.<br/>**It's only used when the provider is `raft_engine`**. |
| `wal.file_size` | String | `256MB` | The size of the WAL segment file.<br/>**It's only used when the provider is `raft_engine`**. |
| `wal.purge_threshold` | String | `4GB` | The threshold of the WAL size to trigger a flush.<br/>**It's only used when the provider is `raft_engine`**. |
| `wal.purge_interval` | String | `10m` | The interval to trigger a flush.<br/>**It's only used when the provider is `raft_engine`**. |
| `wal.read_batch_size` | Integer | `128` | The read batch size.<br/>**It's only used when the provider is `raft_engine`**. |
| `wal.sync_write` | Bool | `false` | Whether to use sync write.<br/>**It's only used when the provider is `raft_engine`**. |
| `wal.enable_log_recycle` | Bool | `true` | Whether to reuse logically truncated log files.<br/>**It's only used when the provider is `raft_engine`**. |
| `wal.prefill_log_files` | Bool | `false` | Whether to pre-create log files on start up.<br/>**It's only used when the provider is `raft_engine`**. |
| `wal.sync_period` | String | `10s` | Duration for fsyncing log files.<br/>**It's only used when the provider is `raft_engine`**. |
| `wal.broker_endpoints` | Array | -- | The Kafka broker endpoints.<br/>**It's only used when the provider is `kafka`**. |
| `wal.max_batch_size` | String | `1MB` | The max size of a single producer batch.<br/>Warning: Kafka has a default limit of 1MB per message in a topic.<br/>**It's only used when the provider is `kafka`**. |
| `wal.linger` | String | `200ms` | The linger duration of a kafka batch producer.<br/>**It's only used when the provider is `kafka`**. |
| `wal.consumer_wait_timeout` | String | `100ms` | The consumer wait timeout.<br/>**It's only used when the provider is `kafka`**. |
| `wal.backoff_init` | String | `500ms` | The initial backoff delay.<br/>**It's only used when the provider is `kafka`**. |
| `wal.backoff_max` | String | `10s` | The maximum backoff delay.<br/>**It's only used when the provider is `kafka`**. |
| `wal.backoff_base` | Integer | `2` | The exponential backoff rate, i.e. next backoff = base * current backoff.<br/>**It's only used when the provider is `kafka`**. |
| `wal.backoff_deadline` | String | `5mins` | The deadline of retries.<br/>**It's only used when the provider is `kafka`**. |
| `storage` | -- | -- | The data storage options. |
| `storage.data_home` | String | `/tmp/greptimedb/` | The working home directory. |
| `storage.type` | String | `File` | The storage type used to store the data.<br/>- `File`: the data is stored in the local file system.<br/>- `S3`: the data is stored in the S3 object storage.<br/>- `Gcs`: the data is stored in the Google Cloud Storage.<br/>- `Azblob`: the data is stored in the Azure Blob Storage.<br/>- `Oss`: the data is stored in the Aliyun OSS. |
| `storage.cache_path` | String | `None` | Cache configuration for object storage such as 'S3' etc.<br/>The local file cache directory. |
| `storage.cache_capacity` | String | `None` | The local file cache capacity in bytes. |
| `storage.bucket` | String | `None` | The S3 bucket name.<br/>**It's only used when the storage type is `S3`, `Oss` and `Gcs`**. |
| `storage.root` | String | `None` | The S3 data will be stored in the specified prefix, for example, `s3://${bucket}/${root}`.<br/>**It's only used when the storage type is `S3`, `Oss` and `Azblob`**. |
| `storage.access_key_id` | String | `None` | The access key id of the aws account.<br/>It's **highly recommended** to use AWS IAM roles instead of hardcoding the access key id and secret key.<br/>**It's only used when the storage type is `S3` and `Oss`**. |
| `storage.secret_access_key` | String | `None` | The secret access key of the aws account.<br/>It's **highly recommended** to use AWS IAM roles instead of hardcoding the access key id and secret key.<br/>**It's only used when the storage type is `S3`**. |
| `storage.access_key_secret` | String | `None` | The secret access key of the aliyun account.<br/>**It's only used when the storage type is `Oss`**. |
| `storage.account_name` | String | `None` | The account key of the azure account.<br/>**It's only used when the storage type is `Azblob`**. |
| `storage.account_key` | String | `None` | The account key of the azure account.<br/>**It's only used when the storage type is `Azblob`**. |
| `storage.scope` | String | `None` | The scope of the google cloud storage.<br/>**It's only used when the storage type is `Gcs`**. |
| `storage.credential_path` | String | `None` | The credential path of the google cloud storage.<br/>**It's only used when the storage type is `Gcs`**. |
| `storage.container` | String | `None` | The container of the azure account.<br/>**It's only used when the storage type is `Azblob`**. |
| `storage.sas_token` | String | `None` | The sas token of the azure account.<br/>**It's only used when the storage type is `Azblob`**. |
| `storage.endpoint` | String | `None` | The endpoint of the S3 service.<br/>**It's only used when the storage type is `S3`, `Oss`, `Gcs` and `Azblob`**. |
| `storage.region` | String | `None` | The region of the S3 service.<br/>**It's only used when the storage type is `S3`, `Oss`, `Gcs` and `Azblob`**. |
| `[[region_engine]]` | -- | -- | The region engine options. You can configure multiple region engines. |
| `region_engine.mito.num_workers` | Integer | `8` | Number of region workers. |
| `region_engine.mito.worker_channel_size` | Integer | `128` | Request channel size of each worker. |
| `region_engine.mito.worker_request_batch_size` | Integer | `64` | Max batch size for a worker to handle requests. |
| `region_engine.mito.manifest_checkpoint_distance` | Integer | `10` | Number of meta action updated to trigger a new checkpoint for the manifest. |
| `region_engine.mito.compress_manifest` | Bool | `false` | Whether to compress manifest and checkpoint file by gzip (default false). |
| `region_engine.mito.max_background_jobs` | Integer | `4` | Max number of running background jobs |
| `region_engine.mito.auto_flush_interval` | String | `1h` | Interval to auto flush a region if it has not flushed yet. |
| `region_engine.mito.global_write_buffer_size` | String | `1GB` | Global write buffer size for all regions. If not set, it's default to 1/8 of OS memory with a max limitation of 1GB. |
| `region_engine.mito.global_write_buffer_reject_size` | String | `2GB` | Global write buffer size threshold to reject write requests. If not set, it's default to 2 times of `global_write_buffer_size` |
| `region_engine.mito.sst_meta_cache_size` | String | `128MB` | Cache size for SST metadata. Setting it to 0 to disable the cache.<br/>If not set, it's default to 1/32 of OS memory with a max limitation of 128MB. |
| `region_engine.mito.vector_cache_size` | String | `512MB` | Cache size for vectors and arrow arrays. Setting it to 0 to disable the cache.<br/>If not set, it's default to 1/16 of OS memory with a max limitation of 512MB. |
| `region_engine.mito.page_cache_size` | String | `512MB` | Cache size for pages of SST row groups. Setting it to 0 to disable the cache.<br/>If not set, it's default to 1/16 of OS memory with a max limitation of 512MB. |
| `region_engine.mito.scan_parallelism` | Integer | `0` | Parallelism to scan a region (default: 1/4 of cpu cores).<br/>- `0`: using the default value (1/4 of cpu cores).<br/>- `1`: scan in current thread.<br/>- `n`: scan in parallelism n. |
| `region_engine.mito.parallel_scan_channel_size` | Integer | `32` | Capacity of the channel to send data from parallel scan tasks to the main task. |
| `region_engine.mito.allow_stale_entries` | Bool | `false` | Whether to allow stale WAL entries read during replay. |
| `region_engine.mito.inverted_index` | -- | -- | The options for inverted index in Mito engine. |
| `region_engine.mito.inverted_index.create_on_flush` | String | `auto` | Whether to create the index on flush.<br/>- `auto`: automatically<br/>- `disable`: never |
| `region_engine.mito.inverted_index.create_on_compaction` | String | `auto` | Whether to create the index on compaction.<br/>- `auto`: automatically<br/>- `disable`: never |
| `region_engine.mito.inverted_index.apply_on_query` | String | `auto` | Whether to apply the index on query<br/>- `auto`: automatically<br/>- `disable`: never |
| `region_engine.mito.inverted_index.mem_threshold_on_create` | String | `64M` | Memory threshold for performing an external sort during index creation.<br/>Setting to empty will disable external sorting, forcing all sorting operations to happen in memory. |
| `region_engine.mito.inverted_index.intermediate_path` | String | `""` | File system path to store intermediate files for external sorting (default `{data_home}/index_intermediate`). |
| `region_engine.mito.memtable.index_max_keys_per_shard` | Integer | `8192` | The max number of keys in one shard.<br/>Only available for `partition_tree` memtable. |
| `region_engine.mito.memtable.data_freeze_threshold` | Integer | `32768` | The max rows of data inside the actively writing buffer in one shard.<br/>Only available for `partition_tree` memtable. |
| `region_engine.mito.memtable.fork_dictionary_bytes` | String | `1GiB` | Max dictionary bytes.<br/>Only available for `partition_tree` memtable. |
| `logging` | -- | -- | The logging options. |
| `logging.dir` | String | `/tmp/greptimedb/logs` | The directory to store the log files. |
| `logging.level` | String | `None` | The log level. Can be `info`/`debug`/`warn`/`error`. |
| `logging.append_stdout` | Bool | `true` | Whether to append logs to stdout. |
| `logging.tracing_sample_ratio` | -- | -- | The percentage of tracing that will be sampled and exported.<br/>Valid range `[0, 1]`: 1 means all traces are sampled, 0 means no traces are sampled; the default value is 1.<br/>Ratios > 1 are treated as 1, and fractions < 0 are treated as 0. |
| `export_metrics` | -- | -- | The datanode can export its metrics and send them to a Prometheus-compatible service (e.g. send to `greptimedb` itself) via the remote-write API.<br/>This is only used for `greptimedb` to export its own metrics internally. It's different from Prometheus scrape. |
| `export_metrics.remote_write.url` | String | `""` | The URL the metrics are sent to, for example: `http://127.0.0.1:4000/v1/prometheus/write?db=information_schema`. |
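As a rough illustration of how these options fit together, here is a minimal sketch of a datanode configuration assembled only from keys documented above; the values are placeholders, not recommendations:
```toml
mode = "distributed"
node_id = 42
rpc_addr = "127.0.0.1:3001"

[wal]
provider = "kafka"
broker_endpoints = ["127.0.0.1:9092"]
max_batch_size = "1MB"
linger = "200ms"

[storage]
type = "S3"
bucket = "my-greptimedb-bucket"
root = "data"
region = "us-east-1"

[[region_engine]]
[region_engine.mito]
num_workers = 8
global_write_buffer_size = "1GB"

[logging]
dir = "/tmp/greptimedb/logs"
level = "info"
```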
FROM --platform=linux/amd64 saschpe/android-ndk:34-jdk17.0.8_7-ndk25.2.9519653-cmake3.22.1
ENV LANG en_US.utf8
WORKDIR /greptimedb
# Rename libunwind to libgcc
RUN cp ${NDK_ROOT}/toolchains/llvm/prebuilt/linux-x86_64/lib64/clang/14.0.7/lib/linux/aarch64/libunwind.a ${NDK_ROOT}/toolchains/llvm/prebuilt/linux-x86_64/lib64/clang/14.0.7/lib/linux/aarch64/libgcc.a
# Install dependencies.
RUN apt-get update && apt-get install -y \
libssl-dev \
protobuf-compiler \
curl \
git \
build-essential \
pkg-config \
python3 \
python3-dev \
python3-pip \
&& pip3 install --upgrade pip \
&& pip3 install pyarrow
# Trust workdir
RUN git config --global --add safe.directory /greptimedb
# Install Rust.
SHELL ["/bin/bash", "-c"]
RUN curl --proto '=https' --tlsv1.2 -sSf https://sh.rustup.rs | sh -s -- --no-modify-path --default-toolchain none -y
│ │ │ │
│ │ │ │
└─Frontend──────┘ └─Remote-Sources──────────────┘
```
This merge operation simply chains all the underlying remote data sources and returns `RecordBatch`es, just like a coalesce op. Each remote source is a gRPC query to a datanode via the substrait logical plan interface. The plan is transformed and divided from the original query that comes to the frontend.
Refactor table engines to address several historical tech debts.
# Motivation
Both `Frontend` and `Datanode` have to deal with multiple regions in a table. This results in code duplication and additional burden to the `Datanode`.
Before:
```mermaid
graph TB
subgraph Frontend["Frontend"]
subgraph MyTable
A("region 0, 2 -> Datanode0")
B("region 1, 3 -> Datanode1")
end
end
MyTable --> Metasrv
Metasrv --> ETCD
MyTable-->TableEngine0
MyTable-->TableEngine1
subgraph Datanode0
Procedure0("procedure")
TableEngine0("table engine")
region0
region2
mytable0("my_table")
Procedure0-->mytable0
TableEngine0-->mytable0
mytable0-->region0
mytable0-->region2
end
subgraph Datanode1
Procedure1("procedure")
TableEngine1("table engine")
region1
region3
mytable1("my_table")
Procedure1-->mytable1
TableEngine1-->mytable1
mytable1-->region1
mytable1-->region3
end
subgraph manifest["table manifest"]
M0("my_table")
M1("regions: [0, 1, 2, 3]")
end
mytable1-->manifest
mytable0-->manifest
RegionManifest0("region manifest 0")
RegionManifest1("region manifest 1")
RegionManifest2("region manifest 2")
RegionManifest3("region manifest 3")
region0-->RegionManifest0
region1-->RegionManifest1
region2-->RegionManifest2
region3-->RegionManifest3
```
`Datanodes` can update the same manifest file for a table as regions are assigned to different nodes in the cluster. We also have to run procedures on `Datanode` to ensure the table manifest is consistent with region manifests. "Table" in a `Datanode` is a subset of the table's regions. The `Datanode` is much closer to `RegionServer` in `HBase` which only deals with regions.
In cluster mode, we store table metadata in both etcd and the table manifest, so the table manifest becomes redundant. We can remove the table manifest if we refactor the table engines into region engines that only care about regions. What's more, we no longer need to run those procedures on `Datanode`.
After:
```mermaid
graph TB
subgraph Frontend["Frontend"]
direction LR
subgraph MyTable
A("region 0, 2 -> Datanode0")
B("region 1, 3 -> Datanode1")
end
end
MyTable --> Metasrv
Metasrv --> ETCD
MyTable-->RegionEngine
MyTable-->RegionEngine1
subgraph Datanode0
RegionEngine("region engine")
region0
region2
RegionEngine-->region0
RegionEngine-->region2
end
subgraph Datanode1
RegionEngine1("region engine")
region1
region3
RegionEngine1-->region1
RegionEngine1-->region3
end
RegionManifest0("region manifest 0")
RegionManifest1("region manifest 1")
RegionManifest2("region manifest 2")
RegionManifest3("region manifest 3")
region0-->RegionManifest0
region1-->RegionManifest1
region2-->RegionManifest2
region3-->RegionManifest3
```
This RFC proposes to refactor table engines into region engines as a first step to make the `Datanode` act like a `RegionServer`.
# Details
## Overview
We plan to refactor the `TableEngine` trait into `RegionEngine` gradually. This RFC focuses on the `mito` engine as it is the default table engine and the most complicated engine.
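To make the direction concrete, here is a minimal sketch of what a region-oriented engine surface could look like; the names and signatures are illustrative assumptions, not the actual GreptimeDB API:
```rust
/// Illustrative only: a region-oriented engine knows nothing about tables;
/// it only opens, serves, and closes regions.
pub type RegionId = u64;

/// Placeholder for a concrete request body (write, alter, flush, ...).
pub struct RegionRequest {
    pub region_id: RegionId,
    pub payload: Vec<u8>,
}

pub trait RegionEngine: Send + Sync {
    /// Open an existing region or create a new one.
    fn open_region(&self, region_id: RegionId) -> Result<(), String>;
    /// Handle a request addressed to a single region.
    fn handle_request(&self, request: RegionRequest) -> Result<(), String>;
    /// Close the region and release its resources.
    fn close_region(&self, region_id: RegionId) -> Result<(), String>;
}
```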
Currently, `MitoEngine` is built upon `StorageEngine`, which manages regions of the `mito` engine. Since `MitoEngine` becomes a region engine, we can combine `StorageEngine` with `MitoEngine` to simplify our code structure.
The chart below shows the overall architecture of the `MitoEngine`.
A new metric engine that can significantly enhance our ability to handle the tremendous number of small tables in scenarios like Prometheus metrics, by leveraging a synthetic wide table that offers storage and metadata multiplexing capabilities over the existing engine.
# Motivation
The concept "Table" in GreptimeDB is a bit "heavy" compared to other time-series storage like Prometheus or VictoriaMetrics. This has lots of disadvantages in aspects from performance, footprint, and storage to cost.
# Details
## Top level description
- User Interface
This feature will add a new type of storage engine. It might be exposed as an option like `with ENGINE=mito`, or as an internal interface like auto-created tables on Prometheus remote write. From the user side, there is no difference from tables in the mito engine. All the DDL like `CREATE`, `ALTER` and DML like `SELECT` should be supported.
- Implementation Overlook
This new engine doesn't re-implement low-level components like file R/W etc. It's a wrapper layer over the existing mito engine, with extra storage and metadata multiplexing capabilities. I.e., it exposes multiple tables based on one mito engine table, like this:
The following parts will describe these implementation details:
- How to route these metric region tables and how those tables are distributed
- How to maintain the schema and other metadata of the underlying mito engine table
- How to maintain the schema of metric engine table
- How the query goes
## Routing
Before this change, the region route rule was based on a group of partition keys. Relation of physical table to region is one-to-many.
``` rust
pub struct PartitionDef {
partition_columns: Vec<String>,
partition_bounds: Vec<PartitionBound>,
}
```
And for metric engine tables, the key difference is we split the concept of "physical table" and "logical table". Like the previous ASCII chart, multiple logical tables are based on one physical table. The relationship of logical table to region becomes many-to-many. Thus, we must include the table name (of logical table) into partition rules.
Considering that the partition/route interface is a generic map from a string array to a region id, all we need to do is insert the logical table name into the request:
``` rust
fn route(request: Vec<String>) -> RegionId;
```
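A hypothetical sketch of that insertion (the helper name and types are assumptions): the frontend prepends the logical table name to the partition key values before handing them to the generic router, so logical tables sharing one physical region stay distinguishable.
```rust
type RegionId = u64;

/// Illustrative only: route a request for a logical table by prepending its
/// name to the partition key values before calling the generic router.
fn route_logical_table(
    logical_table: &str,
    partition_values: &[String],
    route: impl Fn(Vec<String>) -> RegionId,
) -> RegionId {
    let mut request = Vec::with_capacity(partition_values.len() + 1);
    request.push(logical_table.to_string());
    request.extend_from_slice(partition_values);
    route(request)
}
```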
The next question is where to do this conversion. The basic idea is to dispatch different routing behavior based on the engine type. Since we have all the necessary information in the frontend, it's a good place to do that, and it leaves the meta server untouched. The essential change is to associate the engine type with the route rule.
## Physical Region Schema
The idea "physical wide table" is to perform column-level multiplexing. I.e., map all logical columns to physical columns by their names.
This approach is very straightforward but has one problem: it breaks when two columns have the same name but different semantic types (time index, tag or field) or data types. E.g., `CREATE TABLE t1 (c1 timestamp(3) TIME INDEX)` and `CREATE TABLE t2 (c1 STRING PRIMARY KEY)`.
One possible workaround is to prefix each column with its data type and semantic type, like `_STRING_PK_c1`. However, considering the primary goal at present is to support data from monitoring metrics like Prometheus remote write, it's acceptable not to support this at first because data types are often simple and limited here.
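As a toy illustration of that prefixing workaround (the exact naming scheme here is an assumption):
```rust
/// Illustrative only: disambiguate same-named logical columns by encoding the
/// data type and semantic type into the physical column name.
fn physical_column_name(data_type: &str, semantic_type: &str, logical_name: &str) -> String {
    format!(
        "_{}_{}_{}",
        data_type.to_uppercase(),
        semantic_type.to_uppercase(),
        logical_name
    )
}

// physical_column_name("string", "pk", "c1") == "_STRING_PK_c1"
```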
The next point is changing the physical table's schema. This is only needed when creating a new logical table or altering an existing one. Generally speaking, table creation and altering are explicit: we only need to emit an add-column request to the underlying physical table when processing the logical table's DDL. GreptimeDB can create or alter tables automatically on some protocols, but the internal logic is the same.
Also for simplicity, we don't support shrinking the underlying table at first. This can be achieved later by introducing a mechanism on the physical column.
The frontend doesn't need to keep the physical table's schema.
## Metadata of physical regions
Those metric engine regions need to store extra metadata, like the schema of each logical table or all the logical tables' names. That information is relatively simple and can be stored as key-value pairs. For now, we have to use another physical mito region for metadata. This involves an issue with region scheduling: since we don't have the ability to perform affinity scheduling, the initial version will just assume the data region and the metadata region are in the same instance. See alternatives - other storage for physical region's metadata for possible future improvement.
Here is the schema of metadata region and how we would use it. The `CREATE TABLE` clause of metadata region looks like the following. Notice that it wouldn't be actually created by SQL.
``` sql
CREATE TABLE metadata(
ts timestamp time index,
key string primary key,
value string
);
```
The `ts` field is just a placeholder for the constraint that a mito region must contain a time index field. It will always be `0`. The other two fields, `key` and `value`, will be used as a k-v storage. It contains two groups of keys, illustrated below:
- `__table_<TABLE_NAME>` is used for marking table existence. It doesn't have a value.
- `__column_<TABLE_NAME>_<COLUMN_NAME>` is used for marking column existence; the value is the column's semantic type.
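For illustration, creating a hypothetical logical table `http_requests` with a single tag column `host` might leave rows like the following in the metadata region (the value encoding is an assumption):
``` sql
INSERT INTO metadata (ts, "key", value) VALUES
    (0, '__table_http_requests', ''),
    (0, '__column_http_requests_host', 'Tag');
```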
## Physical region implementation
This RFC proposes to add a new region implementation named "MetricRegion". As shown in the first chart, it's wrapped over the existing mito region. This section describes the implementation details. First, here is a chart showing what the region hierarchy looks like:
```plaintext
┌───────────────────────┐
│ Metric Region │
│ │
│ ┌────────┬──────────┤
│ │ Mito │ Mito │
│ │ Region │ Region │
│ │ for │ for │
│ │ Data │ Metadata │
└───┴────────┴──────────┘
```
All upper levels only see the Metric Region. E.g., the Meta Server schedules on this region, and the Frontend routes requests to this Metric Region's id. To be scheduled (opened or closed etc.), the Metric Region needs to implement its own procedures. Most of those procedures can simply be assembled from the underlying Mito Regions', but those related to data, like alter or drop, will have their own new logic.
Another point is region id. Since the region id is used widely from meta server to persisted state, it's better to keep it unchanged. This means we can't use the same id for two regions, but one for each. To achieve this, this RFC proposes a concept named "region id group". A region id group is a group of region ids that are bound for different purposes. Like the two underlying regions here.
This preserves the first 8 bits of the `u32` region number for grouping. Each group has one main id (the first one) and other sub ids (the rest, non-zero ids). All components other than the region implementation itself are not aware of the existence of the region id group; they only see the main id. The region implementation is responsible for managing and using the region id group.
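A hypothetical encoding of such a region id group (the exact bit layout is an assumption): the top 8 bits of the `u32` region number select the member, with 0 being the main id, and the remaining bits are shared by the whole group.
```rust
const GROUP_BITS: u32 = 8;
const MEMBER_SHIFT: u32 = 32 - GROUP_BITS;

/// Illustrative only: strip the group member bits to get the main region number.
fn main_region_number(region_number: u32) -> u32 {
    region_number & ((1u32 << MEMBER_SHIFT) - 1)
}

/// Illustrative only: derive a sub id of the group from the main region number.
fn group_member(main: u32, member: u32) -> u32 {
    assert!(member < (1u32 << GROUP_BITS));
    main_region_number(main) | (member << MEMBER_SHIFT)
}
```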
From previous sections, we can conclude the following points about routing:
- Each "logical table" has its own, universe unique table id.
- Logical table doesn't have physical region, they share the same physical region with other logical tables.
- Route rule of logical table's is a strict subset of physical table's.
To associate the logical table with physical region, we need to specify necessary information in the create table request. Specifically, the table type and its parent table. This require to change our gRPC proto's definition. And once meta recognize the table to create is a logical table, it will use the parent table's region to create route entry.
And to reduce the consumption of region failover (which need to update the physical table route info), we'd better to split the current route table structure into two parts:
```rust
region_route: Map<TableName, [RegionId]>,
node_route: Map<RegionId, NodeId>,
```
By doing this, on each failover the meta server only needs to update the second `node_route` map and leaves the first one untouched.
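A small illustrative model of that split (types simplified to standard collections): failover touches only the region-to-node mapping.
```rust
use std::collections::HashMap;

/// Illustrative only: the table-to-regions mapping never changes on failover,
/// only the region-to-node mapping does.
struct RouteTable {
    region_route: HashMap<String, Vec<u64>>, // table name -> region ids
    node_route: HashMap<u64, u64>,           // region id -> node id
}

impl RouteTable {
    fn failover_region(&mut self, region_id: u64, new_node: u64) {
        self.node_route.insert(region_id, new_node);
    }
}
```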
## Query
Like other existing components, a user query always starts in the frontend. In the planning phase, frontend needs to fetch related schemas of the queried table. This part is the same. I.e., changes in this RFC don't affect components above the `Table` abstraction.
# Alternatives
## Other routing method
We can also do this "special" route rule in the meta server. But there is no difference with the proposed method.
## Other storage for physical region's metadata
Once we have implemented the "region family" that allows multiple physical schemas to exist in one region, we can store the metadata and the table data in one region.
Before that, we can also let the `MetricRegion` hold a `KvBackend` to access the storage layer directly. But this breaks the abstraction in some way.
# Drawbacks
Since the physical storage is mixed together, it's hard to do fine-grained operations at the table level, like configuring TTL, memtable size, or compaction strategy per table, or defining different partition rules for different tables. For scenarios like this, it's better to move the table out of the metric engine and "upgrade" it to a normal mito engine table. This requires a low-cost migration process, and we have to ensure data consistency during the migration, which may require an out-of-service period.
Refactor the `Table` trait to adapt to the new region server architecture and make the code more straightforward.
# Motivation
The `Table` trait was designed with the assumption that both frontend and datanode keep the same concepts, and that all operations are served by a `Table`. However, in practice, we found that not all operations are suitable to be served by a `Table`. For example, the `Table` doesn't hold actual physical data itself, so operations like write or alter are simply a proxy over the underlying regions. And in the recent refactor of the datanode ([rfc table-engine-refactor](./2023-07-06-table-engine-refactor.md)), we are changing the datanode into a region server that is only aware of `Region` things. This also calls for a refactor of the `Table` trait.
# Details
## Definitions
The current `Table` trait contains the following methods:
```rust
pub trait Table {
    /// Get a reference to the schema for this table
    fn schema(&self) -> SchemaRef;
    /// Get a reference to the table info.
    fn table_info(&self) -> TableInfoRef;
    /// Get the type of this table for metadata/catalog purposes.
    fn table_type(&self) -> TableType;
    // ...
}
```
Considering that most of the access to metadata happens in the frontend (like routing or querying), all the persisted data are stored in regions, and only the query engine needs to read data, we can divide the `Table` trait into three concepts:
- struct `Table` provides metadata:
```rust
impl Table {
/// Get a reference to the schema for this table
fn schema(&self) -> SchemaRef;
/// Get a reference to the table info.
fn table_info(&self) -> TableInfoRef;
/// Get the type of this table for metadata/catalog purposes.
fn table_type(&self) -> TableType;
/// Get statistics for this table, if available
fn statistics(&self) -> Option<TableStatistics>;
fn to_data_source(&self) -> DataSourceRef;
}
```
- Requests to region server
- `InsertRequest`
- `AlterRequest`
- `DeleteRequest`
- `FlushRequest`
- `CompactRequest`
- `CloseRequest`
- trait `DataSource` provides data (`RecordBatch`)
`Table` will only be used in frontend. It's constructed from the `OpenTableRequest` or `CreateTableRequest`.
`Table` also provides a method `to_data_source` to generate a `DataSource` from itself. But this method is only for non-`TableType::Base` tables (i.e., `TableType::View` and `TableType::Temporary`) because `TableType::Base` table doesn't hold actual data itself. Its `DataSource` should be constructed from the `Region` directly (in other words, it's a remote query).
And it requires some extra information to construct a `DataSource`, named `TableSourceProvider`:
```rust
type TableFactory = Arc<dyn Fn() -> DataSourceRef>;
pub enum TableSourceProvider {
Base,
View(LogicalPlan),
Temporary(TableFactory),
}
```
## Use `DataSource`
`DataSource` will be adapted to the `TableProvider` from DataFusion that can be `scan()`ed in a `TableScan` plan.
In the frontend this is done in the planning phase, and the datanode will have one implementation for `Region` to generate the record batch stream.
## Interact with RegionServer
Previously, persisted state change operations went through the old `Table` trait, as described before. Now they will come from the action source, like the procedure or protocol handler, directly to the region server. E.g., on alter table, the corresponding procedure will generate its `AlterRequest` and send it to the regions; a write request will be split in the frontend handler and sent to the regions. `Table` only provides necessary metadata like route information if needed, but it is no longer a required part of the path.
## Implement temporary table
A temporary table is a special table that doesn't resolve to any persistent physical region. Examples are:
- the `Numbers` table for testing, which produces a record batch that contains 0-100 integers.
- tables in information schema. It is an interface for querying the catalog's metadata. The contents are generated on the fly with information from `CatalogManager`. The `CatalogManager` can be held in `TableFactory`.
- a function table that produces data generated by a formula or a function, like a table that always returns `sin(current_timestamp())`.
## Relationship among those components
Here is a diagram to show the relationship among those components, and how they interact with each other.
Currently, multiple transactions are involved during the procedure. This implementation is inefficient, and it's hard to keep the data consistent. Therefore, we can update multiple metadata keys in a single transaction.
These table metadata keys are only updated in the following operations.
## Region Failover
Region failover needs to update the `TableRoute` key and the `DatanodeTable` keys. If the current `TableRoute` still equals the snapshot of `TableRoute` taken when the failover task was submitted, these keys can be updated safely.
Between submitting the failover task and acquiring the lock for execution, the `TableRoute` may be updated by another task. Therefore, after acquiring the lock, we fetch the latest `TableRoute` again and only execute the update if it is still needed.
## Create Table DDL
Creates all of the above keys. The transaction requires that `TableRoute` and `TableInfo` do not already exist (i.e., they should be empty).
The **TableNameKey**'s lock will be held by the procedure framework.
## Drop Table DDL
`TableInfoKey` and `NextTableRouteKey` will be added with `__removed-` prefix, and the other above keys will be deleted. The transaction will not compare any keys.
## Alter Table DDL
1. Rename table, updates `TableInfo` and `TableName`. The transaction compares `TableInfo`: the new `TableNameKey` should be empty, and `TableInfo` should equal the snapshot taken when submitting the DDL.
The old and new **TableNameKey**'s lock will be held by the procedure framework.
2. Alter table, updates `TableInfo`. `TableInfo` should equal the snapshot taken when submitting the DDL.
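As a sketch of how these updates could be batched, the rename-table case might look roughly like the following etcd-style compare-and-swap transaction; the builder API, key names, and error variant here are illustrative assumptions, not the concrete KvBackend interface:
```rust
// Illustrative pseudo-API: compare the snapshots, then apply all key updates atomically.
let txn = Txn::new()
    .when(vec![
        // `TableInfo` must still equal the snapshot taken when the DDL was submitted.
        Compare::value_equals(&table_info_key, &table_info_snapshot),
        // The new `TableNameKey` must not exist yet.
        Compare::not_exists(&new_table_name_key),
    ])
    .and_then(vec![
        TxnOp::put(&table_info_key, renamed_table_info),
        TxnOp::delete(&old_table_name_key),
        TxnOp::put(&new_table_name_key, table_id_value),
    ]);
let response = kv_backend.txn(txn).await?;
// If the comparison failed, another DDL got there first; the procedure re-checks.
ensure!(response.succeeded, RenameTableConflictSnafu);
```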
This RFC proposes a storage engine optimization: an inverted index aimed at speeding up label selection queries on metrics, with tag columns as the indexing target.
# Introduction
In the current system setup, the first column of the primary key in the Mito engine has a min-max index, which significantly speeds up filtering on that column. However, there are limitations for other columns, primarily tags. This RFC suggests implementing an inverted index to bridge these limitations with enhanced filtering and improve overall system performance.
# Design Detail
## Inverted Index
The primary aim of the proposed inverted index is to optimize tag columns in the SST Parquet files within the Mito engine. Building an inverted index that maps tag values to row groups yields an efficient logical structure that supports faster and more flexible queries.
When scanning SST files, pushed-down filters are applied to the corresponding tag's inverted index to determine the final row groups to be scanned, further bolstering the speed and efficiency of data retrieval.
## Index Format
The Inverted Index for each SST file follows the format shown below:
The `footer_payload` is the protobuf encoding of `InvertedIndexFooter`.
The complete format is containerized in [Puffin](https://iceberg.apache.org/puffin-spec/) with the type defined as `greptime-inverted-index-v1`.
## Protobuf Details
The `InvertedIndexFooter` is defined in the following protobuf structure:
```protobuf
message InvertedIndexFooter {
  repeated InvertedIndexMeta metas = 1;
}

message InvertedIndexMeta {
  string name = 1;
  uint64 row_count_in_group = 2;
  uint64 fst_offset = 3;
  uint64 fst_size = 4;
  uint64 null_bitmap_offset = 5;
  uint64 null_bitmap_size = 6;
  InvertedIndexStats stats = 7;
}

message InvertedIndexStats {
  uint64 null_count = 1;
  uint64 distinct_count = 2;
  bytes min_value = 3;
  bytes max_value = 4;
}
```
## Bitmap
Bitmaps are used to represent indices of fixed-size groups. Rows are divided into groups of a fixed size, defined in the `InvertedIndexMeta` as `row_count_in_group`.
For example, when `row_count_in_group` is `4096`, it means each group has `4096` rows. If there are a total of `10000` rows, there will be `3` groups in total. The first two groups will have `4096` rows each, and the last group will have `1808` rows. If the indexed values are found in row `200` and `9000`, they will correspond to groups `0` and `2`, respectively. Therefore, the bitmap should show `0` and `2`.
Bitmap is implemented using [BitVec](https://docs.rs/bitvec/latest/bitvec/), selected due to its efficient representation of dense data arrays typical of indices of groups.
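A small sketch of how the group bitmap could be built with `bitvec`; the surrounding function is hypothetical and only mirrors the example above:
```rust
use bitvec::prelude::*;

/// Sketch: mark the groups that contain an indexed value.
/// With `row_count_in_group = 4096`, rows 200 and 9000 map to groups 0 and 2.
fn build_group_bitmap(row_ids: &[u64], row_count_in_group: u64, total_rows: u64) -> BitVec {
    // Number of fixed-size groups, rounding the last (possibly partial) group up.
    let group_count = (total_rows + row_count_in_group - 1) / row_count_in_group;
    let mut bitmap = bitvec![0; group_count as usize];
    for row_id in row_ids {
        let group = (row_id / row_count_in_group) as usize;
        bitmap.set(group, true);
    }
    bitmap
}
```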
## Finite State Transducer (FST)
[FST](https://docs.rs/fst/latest/fst/) is a highly efficient data structure ideal for in-memory indexing. It represents ordered sets or maps where the keys are bytes. The choice of the FST effectively balances the need for performance, space efficiency, and the ability to perform complex analyses such as regular expression matching.
The conventional usage of FST and `u64` values has been adapted to facilitate indirect indexing to row groups. As the row groups are represented as Bitmaps, we utilize the `u64` values split into bitmap's offset (higher 32 bits) and size (lower 32 bits) to represent the location of these Bitmaps.
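For clarity, the packing described above could look like the following; the helper names are illustrative:
```rust
/// Pack a bitmap location into the u64 value stored in the FST:
/// the higher 32 bits hold the bitmap's offset, the lower 32 bits its size.
fn pack_bitmap_location(offset: u32, size: u32) -> u64 {
    ((offset as u64) << 32) | size as u64
}

/// Split an FST value back into (offset, size).
fn unpack_bitmap_location(value: u64) -> (u32, u32) {
    ((value >> 32) as u32, value as u32)
}
```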
## API Design
Two APIs `InvertedIndexBuilder` for building indexes and `InvertedIndexSearcher` for querying indexes are designed:
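A hedged sketch of what the two APIs might look like; the method names and signatures below are assumptions, not the finalized design:
```rust
#[async_trait]
pub trait InvertedIndexBuilder {
    /// Adds an indexed value of a tag column for the current group of rows.
    async fn push(&mut self, column: &str, value: Option<&[u8]>) -> Result<()>;
    /// Flushes the FSTs, bitmaps, and `InvertedIndexFooter` into the puffin blob.
    async fn finish(&mut self) -> Result<()>;
}

#[async_trait]
pub trait InvertedIndexSearcher {
    /// Applies pushed-down predicates to the tag's index and returns
    /// the row groups that need to be scanned.
    async fn search(&mut self, predicates: &[Predicate]) -> Result<Vec<usize>>;
}
```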
**Only the red nodes will persist state after they have succeeded**; the other nodes (excluding the Start and End nodes) won't persist state.
## Steps
**The persistent context:** It's shared in each step and available after recovering. It will only be updated/stored after the Red node has succeeded.
Values:
- `region_id`: The target leader region.
- `peer`: The target datanode.
- `close_old_leader`: Indicates whether to close the old leader region.
- `leader_may_unreachable`: Used to support the failover procedure.
**The Volatile context:** It's shared in each step and available in executing (including retrying). It will be dropped if the procedure runner crashes.
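A sketch of the two contexts under the description above; the field types are illustrative, not the final definition:
```rust
/// Sketch: stored only after a "red" step succeeds; available after recovery.
#[derive(Serialize, Deserialize)]
struct PersistentContext {
    /// The target leader region.
    region_id: RegionId,
    /// The target datanode.
    peer: Peer,
    /// Indicates whether to close the old leader region.
    close_old_leader: bool,
    /// Used to support the failover procedure.
    leader_may_unreachable: bool,
}

/// Sketch: shared between steps while executing (including retries);
/// dropped if the procedure runner crashes.
#[derive(Default)]
struct VolatileContext {
    // Transient, step-local values live here.
}
```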
### Select Candidate
The Persistent state: Selected Candidate Region.
### Update Metadata(Down)
**The Persistent context:**
- The (latest/updated) `version` of `TableRouteValue`; it will be used in the `Update Metadata(Up)` step.
### Downgrade Leader
This step sends an instruction via heartbeat and performs:
1. Downgrades leader region.
2. Retrieves the `last_entry_id` (if available).
If the target leader region is not found:
- Sets `close_old_leader` to true.
- Sets `leader_may_unreachable` to true.
If the target Datanode is unreachable:
- Waits for the region lease to expire.
- Sets `close_old_leader` to true.
- Sets `leader_may_unreachable` to true.
**The Persistent context:**
None
**The Persistent state:**
- `last_entry_id`
*Passed to the next step.
### Upgrade Candidate
This step sends an instruction via heartbeat and performs:
1. Replays the WAL to the latest entry (`last_entry_id`).
2. Upgrades the candidate region.
If the target region is not found:
- Rolls back.
- Notifies the failover detector if `leader_may_unreachable` == true.
- Exits the procedure.
If the target Datanode is unreachable:
- Rolls back.
- Notifies the failover detector if `leader_may_unreachable` == true.
- Exits the procedure.
**The Persistent context:**
None
### Update Metadata(Up)
This step performs:
1. Switches Leader.
2. Removes Old Leader(Opt.).
3. Moves Old Leader to follower(Opt.).
The current `TableRouteValue` version should equal the `version` of `TableRouteValue` stored in the persistent context. Otherwise, this step verifies whether the `TableRouteValue` has already been updated.
**The Persistent context:**
None
### Close Old Leader(Opt.)
This step sends a close region instruction via heartbeat.
If the target leader region is not found:
- Ignore.
If the target Datanode is unreachable:
- Ignore.
### Open Candidate(Opt.)
This step sends an open region instruction via heartbeat and waits for conditions to be met (typically, that the `last_entry_id` of the candidate region is very close to, or has caught up with, that of the leader region).
This RFC proposes to confine the usage of `ColumnId` to the region engine only.
# Motivation
`ColumnId` is an identifier for columns. It's assigned by the meta server, stored in `TableInfo` and `RegionMetadata`, and used in the region engine to distinguish columns.
At present, Frontend, Datanode, and Metasrv are all aware of `ColumnId`, but it's only used in the region engine. Thus this RFC proposes to remove it from Frontend (mainly used in `TableInfo`) and Metasrv.
# Details
`ColumnId` is used widely on both read and write paths. Removing it from Frontend and Metasrv implies several things:
- A column may have different column id in different regions.
- A column is identified by its name in all components.
- Column order in the region engine is not restricted, i.e., no need to be in the same order with table info.
The first point doesn't matter in my opinion. This concept no longer exists outside of the region server, and each region is autonomous and independent -- the only guarantee it should hold is that those columns exist. But if we consider region repartition, where SST files would be re-assigned to different regions, things become a bit more complicated. A possible solution is to store the mapping between name and `ColumnId` in the manifest, but that is out of the scope of this RFC. We can likely provide a workaround by introducing an indirection mapping layer across different partition versions.
More importantly, we can still assume columns have the same column ids across regions. We have procedures to maintain consistency between regions, and the region engine should ensure alterations are idempotent. So it is possible that region repartition won't need to consider column ids or other region metadata in the future.
Users write and query columns by their names, not by `ColumnId` or anything else. The second point also means changing the column reference in `ScanRequest` from index to name. This change can greatly alleviate the misuse of column indices, which has given us many surprises.
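To make the second point concrete, here is a sketch of how the column reference in `ScanRequest` could change; the struct and field names are illustrative only:
```rust
/// Before: columns are referenced by their index into the table schema,
/// which couples region schema order to table schema order.
struct ScanRequestByIndex {
    projection: Option<Vec<usize>>,
}

/// After: columns are referenced by name, so the region engine is free
/// to assign its own `ColumnId` and ordering internally.
struct ScanRequestByName {
    projection: Option<Vec<String>>,
}
```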
As for the last point, column order only matters in the table info. This order is used in user-facing table structure operations, like adding a column, describing a column, or as the default order of an INSERT clause. None of them is connected with the order in storage.
# Drawback
Firstly, this is a breaking change. Delivering this change requires a full upgrade of the cluster. Secondly, this change may introduce some performance regression. For example, we have to pass the full column name in the `ScanRequest` instead of the `ColumnId`. But this influence is very limited, since the column index is only used in the region engine.
# Alternatives
There are two alternatives from the perspective of "what can be used as the column identifier":
- The index of the column in the table schema
- `ColumnId` of that column
The first one is what we are using now. Choosing this way requires keeping the column order in the region engine the same as in the table info. This is not hard to achieve, but it's a bit annoying. And things become tricky when there are internal columns or different schemas, like those stored in file formats. This is the initial purpose of this RFC: to decouple the table schema and the region schema.
The second one, on the other hand, requires the `ColumnId` to be identical in all regions and in `TableInfo`. It has the same drawback as the previous alternative: the `TableInfo` and `RegionMetadata` are tied together. Another point is that the `ColumnId` is assigned by the Metasrv, which doesn't need it but has to maintain it. This also limits the functionality of `ColumnId` by taking away the concrete region engine's ability to assign it.
This RFC proposes a Lightweight Module for executing continuous aggregation queries on a stream of data.
# Motivation
Being able to do continuous aggregation is a very powerful tool. It allows you to do things like:
1. downsample data, e.g., from 1 millisecond resolution to 1 second
2. calculate the average of a stream of data
3. keep a sliding window of data in memory
In order to do those things while maintaining a low memory footprint, you need to be able to manage the data in a smart way. Hence, we only store necessary data in memory, and send/recv data deltas to/from the client.
# Details
## System boundary / What it is and isn't
- GreptimeFlow provides a way to perform continuous aggregation over time-series data.
- It's not a complete stream-processing system. Only a minimal, necessary subset of functionalities is provided.
- Flow can process a configured range of fresh data. Data exceeding this range will be dropped directly. Thus it cannot handle data that arrives in random timestamp order.
- Both sliding windows (e.g., latest 5m from present) and fixed windows (every 5m from some time) are supported. And these two are the major targeting scenarios.
- Flow can handle most aggregate operators within one table (i.e., sum, avg, min, max, and comparison operators). But others (join, trigger, txn, etc.) are not target features.
## Framework
- Greptime Flow is built on top of [Hydroflow](https://github.com/hydro-project/hydroflow).
- We have three choices for the Dataflow/Streaming process framework for our simple continuous aggregation feature:
1. Base it on the timely/differential dataflow crates that [materialize](https://github.com/MaterializeInc/materialize) is built on. This proved too obscure for simple usage, and it is hard to customize memory usage control.
2. Base it on a simple dataflow framework that we write from the ground up, like [arroyo](https://www.arroyo.dev/) or [risingwave](https://www.risingwave.dev/) did; for example, the core streaming logic of [arroyo](https://github.com/ArroyoSystems/arroyo/blob/master/arroyo-datastream/src/lib.rs) only takes up to 2000 lines of code. However, it means maintaining another layer of dataflow framework, which might seem easy in the beginning, but could become too burdensome to maintain once we need more features.
3. Base it on a simple, lower-level dataflow framework maintained by someone else, like [hydroflow](https://github.com/hydro-project/hydroflow). This approach combines the best of both worlds: it is easy to comprehend and customize, and the dataflow framework offers precisely the features needed for crafting uncomplicated single-node dataflow programs while delivering decent performance.
Hence, we choose the third option and use a simple logical plan that's agnostic to the underlying dataflow framework, as it only describes what the dataflow graph should do, not how it does it. We build operators in hydroflow to execute the plan, and the resulting hydroflow graph is wrapped in an engine that only supports data in/out and a tick event to flush and compute the result. This provides a thin middle layer that's easy to maintain and allows switching to another dataflow framework if necessary.
## Deploy mode and protocol
- Greptime Flow is an independent streaming compute component. It can be used either within a standalone node or as a dedicated node at the same level as frontend in distributed mode.
- It accepts insert requests in the `Rows` format, the same format used between frontend and datanode.
- A new flow job is submitted as a modified SQL query, like Snowflake does, e.g.: `CREATE TASK avg_over_5m WINDOW_SIZE = "5m" AS SELECT avg(value) FROM table WHERE time > now() - 5m GROUP BY time(1m)`. The flow job is then stored in Metasrv.
- It also writes results back to the frontend in the `Rows` format.
- The query plan uses Substrait as the codec format, the same as GreptimeDB's query engine.
- Greptime Flow needs a WAL for recovering. It's possible to reuse datanode's.
The workflow is shown in the following diagram
```mermaid
graph TB
subgraph Flownode["Flownode"]
subgraph Dataflows
df1("Dataflow_1")
df2("Dataflow_2")
end
end
subgraph Frontend["Frontend"]
newLines["Mirror Insert
Create Task From Query
Write result from flow node"]
end
subgraph Datanode["Datanode"]
end
User --> Frontend
Frontend -->|Register Task| Metasrv
Metasrv -->|Read Task Metadata| Frontend
Frontend -->|Create Task| Flownode
Frontend -->|Mirror Insert| Flownode
Flownode -->|Write back| Frontend
Frontend --> Datanode
Datanode --> Frontend
```
## Lifecycle of data
- New data is inserted into the frontend like before. The frontend will mirror the insert request to the flow node if there is a configured flow job.
- Depending on the timestamp of incoming data, flow will either drop it (outdated data) or process it (fresh data).
- Greptime Flow will periodically write results back to the result table through frontend.
- Those results will then be written into a result table stored in datanode.
- A small table of intermediate state is kept in memory, which is used to calculate the result.
## Supported operations
- Greptime Flow accepts a configurable "materialize window"; data points outside that time window are discarded (see the sketch after this list).
- Data within that "materialize window" is queryable and updateable.
- Greptime Flow can handle partitioning, if and only if the input query can be transformed to a fully partitioned plan according to the existing commutative rules. Otherwise the corresponding flow job has to be calculated in a single node.
- Note that Greptime Flow has to see all the data belonging to one partition.
- Deletion and duplicate insertion are not supported at early stage.
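A minimal sketch of the freshness check behind the "materialize window" mentioned above; the function name and timestamp representation are illustrative:
```rust
use std::time::Duration;

/// Sketch: decide whether an incoming row is fresh enough to be processed.
/// Rows older than `now - materialize_window` are dropped directly.
fn is_fresh(row_ts_ms: i64, now_ms: i64, materialize_window: Duration) -> bool {
    row_ts_ms >= now_ms - materialize_window.as_millis() as i64
}
```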
## Miscellaneous
- Greptime Flow can translate SQL into its own plan; however, only a select few aggregate functions are supported for now, like min/max/sum/count/avg.
- Greptime Flow's operators are configurable in terms of the size of the materialize window, whether to allow delayed incoming data, etc., so the simplest operators can choose not to tolerate any delay to save memory.
# Future Work
- Support UDF that can do one-to-one mapping. Preferably, we can reuse the UDF mechanism in GreptimeDB.
- Support join operator.
- Design syntax for configuring operators with different materialize windows and delay tolerances.
- Support a cross-partition merge operator that allows complex query plans which don't necessarily accord with the partitioning rules to communicate between nodes and produce the final materialized result.
- Duplicate insertion, which can be reverted easily within the current framework, so supporting it should be straightforward.
- Deletion within the "materialize window"; this requires operators like min/max to store all inputs within the window, which might require further optimization.
A new region partition scheme that runs on multiple dimensions of the key space. The partition rule is defined by a set of simple expressions on the partition key columns.
# Motivation
The current partition rule is borrowed from MySQL's [`RANGE Partition`](https://dev.mysql.com/doc/refman/8.0/en/partitioning-range.html), which is based on a single dimension. It is sort of a [Hilbert Curve](https://en.wikipedia.org/wiki/Hilbert_curve): we pick several points on the curve to divide the space. It is neither easy to understand how the data gets partitioned nor flexible enough to handle complex partitioning requirements.
Considering future requirements like region repartitioning or autonomous rebalancing, where both workload and partition may change frequently, this RFC proposes a new region partition scheme that uses a set of simple expressions on the partition key columns to divide the key space.
# Details
## Partition rule
First, we define a simple expression that can be used to define the partition rule. A simple expression is a binary expression on the partition key columns that can be evaluated to a boolean value. The binary operator is limited to comparison operators only, like `=`, `!=`, `>`, `>=`, `<`, `<=`. And the operands are limited to either a literal value or a partition column.
Examples of valid simple expressions are $`col_A = 10`$, $`col_A \gt 10 \& col_B \gt 20`$ or $`col_A \ne 10`$.
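A sketch of how such expressions could be represented in code; this AST is an assumption, not the final design:
```rust
/// Sketch: operands are restricted to partition columns or literal values.
enum Operand {
    Column(String),
    Literal(Value),
}

/// Sketch: only comparison operators are allowed.
enum CmpOp {
    Eq,
    NotEq,
    Gt,
    GtEq,
    Lt,
    LtEq,
}

/// Sketch: one simple expression, e.g. `col_A > 10`.
/// A partition rule entry is a conjunction (`AND`) of such expressions.
struct PartitionExpr {
    lhs: Operand,
    op: CmpOp,
    rhs: Operand,
}
```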
Those expressions can be used as predicates to divide the key space into different regions. The following example has two partition columns, `Col A` and `Col B`, and four partitioned regions.
An advantage of this scheme is that it is easy to understand how the data gets partitioned. The above example can be visualized in a 2D space (two partition columns are involved in the example).

Here each expression draws a line in the 2D space. Managing data partitioning becomes a matter of drawing lines in the key space.
To make it easy to use, there is a "default region" which catches all the data that doesn't match any of the previous expressions. The default region exists by default and does not need to be specified. It is also possible to remove this default region if the DB finds it unnecessary.
## SQL interface
The SQL interface is responsible for two parts: specifying the partition columns and the partition rule. Though we are targeting an autonomous system, it's still allowed to give some bootstrap rules or hints when creating a table.
Partition column is specified by `PARTITION ON COLUMNS` sub-clause in `CREATE TABLE`:
```sql
CREATE TABLE t (...)
PARTITION ON COLUMNS (...) ();
```
The two following brackets are for partition columns and the partition rule, respectively.
Columns provided here are only used as an allow-list for how the partition rule can be defined, which means (a) the order of the columns doesn't matter, and (b) the columns provided here are not necessarily used in the partition rule.
The partition rule part is a list of comma-separated simple expressions. Expressions here do not correspond one-to-one to regions, as they might be changed by the system to fit various workloads.
A full example of `CREATE TABLE` with partition rule is:
```sql
CREATE TABLE IF NOT EXISTS demo (
    a STRING,
    b STRING,
    c STRING,
    d STRING,
    ts TIMESTAMP,
    memory DOUBLE,
    TIME INDEX (ts),
    PRIMARY KEY (a, b, c, d)
)
PARTITION ON COLUMNS (c, b, a) (
    a < 10,
    10 >= a AND a < 20,
    20 >= a AND b < 100,
    20 >= a AND b > 100
)
```
## Combine with storage
Examining columns separately suits our columnar storage very well in two aspects.
1. Simple expressions can be pushed down to the storage and file format layers and are likely to hit existing indexes, which makes pruning very efficient.
2. Columns in columnar storage are not tightly coupled like in the traditional row storages, which means we can easily add or remove columns from partition rule without much impact (like a global reshuffle) on data.
The data file itself can be "projected" onto the key space as a polyhedron, and each of its faces is guaranteed to be parallel to some coordinate plane (in a 2D scenario, this means every file can be projected to a rectangle). Thus partitioning or repartitioning also only needs to consider the related columns.

An additional limitation is that, considering how the index works and how we organize primary keys at present, the partition columns are limited to a subset of the primary key columns for better performance.
This style guide is intended to help contributors to GreptimeDB write code that is consistent with the rest of the codebase. It is a living document and will be updated as the codebase evolves.
It's mainly a complement to the [Rust Style Guide](https://pingcap.github.io/style-guide/rust/).
## Table of Contents
- Formatting
- Modules
- Comments
- Error handling
## Formatting
- Place all `mod` declarations before any `use` statements.
- Use `unimplemented!()` instead of `todo!()` for things that aren't likely to be implemented.
- Add an empty line before and after declaration blocks.
- Place comments before attributes (`#[...]`) and derives (`#[derive(...)]`). E.g.:
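The following illustrative snippet (the module and type names are made up) follows these rules:
```rust
// `mod` declarations come before any `use`.
mod cache;
mod handler;

use std::sync::Arc;

/// Doc comment goes before the attribute and derive.
#[derive(Debug, Clone)]
pub struct Server {
    inner: Arc<String>,
}

impl Server {
    /// Not planned to be supported; prefer `unimplemented!()` over `todo!()`.
    pub fn reload(&self) {
        unimplemented!()
    }
}
```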
## Modules
- Use a file with the same name as the module instead of `mod.rs` to define a module. E.g.:
```
.
├── cache
│ ├── cache_size.rs
│ └── write_cache.rs
└── cache.rs
```
## Comments
- Add comments for public functions and structs.
- Prefer document comment (`///`) over normal comment (`//`) for structs, fields, functions etc.
- Add links (`[...]`) to structs, methods, or any other references, and make sure the links work.
## Error handling
- Define a custom error type for the module if needed.
- Prefer `with_context()` over `context()` when allocation is needed to construct an error.
- Use `error!()` or `warn!()` macros in the `common_telemetry` crate to log errors. E.g.:
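A minimal illustration; the surrounding function and the fallible operation are made up for the example:
```rust
use common_telemetry::error;

// Hypothetical fallible operation, for illustration only.
fn do_flush(region_id: u64) -> Result<(), std::io::Error> {
    let _ = region_id;
    Ok(())
}

fn flush_region(region_id: u64) {
    if let Err(e) = do_flush(region_id) {
        // Log the failure with enough context to locate the region.
        error!("Failed to flush region {}, error: {:?}", region_id, e);
    }
}
```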
Status note: we are still working on this dashboard's config. It's expected to change frequently in the coming days. Please feel free to submit your feedback and/or contributions to this dashboard 🤗
# How to use
Open the Grafana dashboard page, choose `New` -> `Import`, and upload the `greptimedb.json` file.