Compare commits

..

19 Commits
v0.7.1 ... geo

Author SHA1 Message Date
Ning Sun
a362c4fb22 chore: use explicit geo_json version 2022-11-01 16:12:01 +08:00
Ning Sun
20fcc7819a feat: implement postgres read/write for geo data 2022-11-01 16:03:33 +08:00
Ning Sun
8119bbd7b4 feat: update sqlparser api 2022-10-20 10:57:11 +08:00
Ning Sun
48693cb12d fix: rename wkt conversion function 2022-10-19 14:52:48 +08:00
Ning Sun
8d938c3ac8 feat: use our forked sqlparser to parse Geometry(POINT) syntax 2022-10-19 14:13:26 +08:00
liangxingjian
3235436f60 fix: format 2022-10-19 10:31:02 +08:00
liangxingjian
3bf2c9840d feat: add impl of arrow array access 2022-10-19 10:25:58 +08:00
liangxingjian
35afa9dc74 fix: fix some error 2022-10-18 15:09:45 +08:00
liangxingjian
f4d8c2cef6 feat: implement simple sql demo 2022-10-18 11:54:38 +08:00
Ning Sun
788001b4bc fix: resolve lint warnings 2022-10-18 11:32:42 +08:00
Ning Sun
92ab3002c9 fix: resolve check warnings 2022-10-18 11:27:59 +08:00
Ning Sun
36ce08cb03 refactor: set inner subtype to ConcreteDataType::Geometry 2022-10-18 11:16:14 +08:00
liangxingjian
2f159dbe22 feat:add impl of geo-vec,add some unit test 2022-10-14 18:04:12 +08:00
liangxingjian
3d7d029cb5 feat:add some impl and test with a little refactor 2022-10-13 18:12:19 +08:00
liangxingjian
7aed777bc4 feat:add iter and ref of geo types 2022-10-12 17:06:42 +08:00
liangxingjian
ebcd18d3c4 feat:add some impl of geo type 2022-10-11 19:07:15 +08:00
liangxingjian
d44887bada feat:add some geo vec impl 2022-10-10 17:52:17 +08:00
liangxingjian
c0893ac19b fix:fix some error 2022-10-10 15:23:38 +08:00
liangxingjian
8a91e26020 feat:init to add new geo types 2022-10-09 17:46:06 +08:00
1853 changed files with 47195 additions and 295956 deletions

View File

@@ -1,5 +0,0 @@
[target.aarch64-unknown-linux-gnu]
linker = "aarch64-linux-gnu-gcc"
[alias]
sqlness = "run --bin sqlness-runner --"

View File

@@ -1,3 +0,0 @@
[profile.default]
slow-timeout = { period = "60s", terminate-after = 3, grace-period = "30s" }
retries = { backoff = "exponential", count = 3, delay = "10s", jitter = true }

View File

@@ -20,3 +20,6 @@ out/
# Rust
target/
# Git
.git

View File

@@ -1,10 +0,0 @@
root = true
[*]
end_of_line = lf
indent_style = space
insert_final_newline = true
trim_trailing_whitespace = true
[{Makefile,**.mk}]
indent_style = tab

View File

@@ -1,26 +0,0 @@
# Settings for s3 test
GT_S3_BUCKET=S3 bucket
GT_S3_ACCESS_KEY_ID=S3 access key id
GT_S3_ACCESS_KEY=S3 secret access key
GT_S3_ENDPOINT_URL=S3 endpoint url
GT_S3_REGION=S3 region
# Settings for oss test
GT_OSS_BUCKET=OSS bucket
GT_OSS_ACCESS_KEY_ID=OSS access key id
GT_OSS_ACCESS_KEY=OSS access key
GT_OSS_ENDPOINT=OSS endpoint
# Settings for azblob test
GT_AZBLOB_CONTAINER=AZBLOB container
GT_AZBLOB_ACCOUNT_NAME=AZBLOB account name
GT_AZBLOB_ACCOUNT_KEY=AZBLOB account key
GT_AZBLOB_ENDPOINT=AZBLOB endpoint
# Settings for gcs test
GT_GCS_BUCKET = GCS bucket
GT_GCS_SCOPE = GCS scope
GT_GCS_CREDENTIAL_PATH = GCS credential path
GT_GCS_ENDPOINT = GCS end point
# Settings for kafka wal test
GT_KAFKA_ENDPOINTS = localhost:9092
# Setting for fuzz tests
GT_MYSQL_ADDR = localhost:4002

View File

@@ -1,105 +0,0 @@
---
name: Bug report
description: Is something not working? Help us fix it!
labels: [ "bug" ]
body:
- type: markdown
attributes:
value: |
Take some time to fill out this bug report. Thank you!
- type: dropdown
id: type
attributes:
label: What type of bug is this?
multiple: true
options:
- Configuration
- Crash
- Data corruption
- Incorrect result
- Locking issue
- Performance issue
- Unexpected error
- User Experience
- Other
validations:
required: true
- type: dropdown
id: subsystem
attributes:
label: What subsystems are affected?
description: You can pick multiple subsystems.
multiple: true
options:
- Standalone mode
- Distributed Cluster
- Storage Engine
- Query Engine
- Table Engine
- Write Protocols
- MetaSrv
- Frontend
- Datanode
- Other
validations:
required: true
- type: textarea
id: reproduce
attributes:
label: Minimal reproduce step
description: |
Please walk us through and provide steps and details on how
to reproduce the issue. If possible, provide scripts that we
can run to trigger the bug.
validations:
required: true
- type: textarea
id: expected-manner
attributes:
label: What did you expect to see?
validations:
required: true
- type: textarea
id: actual-manner
attributes:
label: What did you see instead?
validations:
required: true
- type: input
id: os
attributes:
label: What operating system did you use?
description: |
Please provide OS, version, and architecture. For example:
Windows 10 x64, Ubuntu 21.04 x64, Mac OS X 10.5 ARM, Rasperry
Pi i386, etc.
placeholder: "Ubuntu 21.04 x64"
validations:
required: true
- type: input
id: greptimedb
attributes:
label: What version of GreptimeDB did you use?
description: |
Please provide the version of GreptimeDB. For example:
0.5.1 etc. You can get it by executing command line `greptime --version`.
placeholder: "0.5.1"
validations:
required: true
- type: textarea
id: logs
attributes:
label: Relevant log output and stack trace
description: |
Please copy and paste any relevant log output or a stack
trace. This will be automatically formatted into code, so no
need for backticks.
render: bash

View File

@@ -1,8 +0,0 @@
blank_issues_enabled: false
contact_links:
- name: Greptime Community Slack
url: https://greptime.com/slack
about: Get free help from the Greptime community
- name: Greptime Community Discussion
url: https://github.com/greptimeTeam/greptimedb/discussions
about: Get free help from the Greptime community

View File

@@ -1,39 +0,0 @@
---
name: Enhancement
description: Suggest an enhancement to existing functionality
labels: [ "enhancement" ]
body:
- type: dropdown
id: type
attributes:
label: What type of enhancement is this?
multiple: true
options:
- API improvement
- Configuration
- Performance
- Refactor
- Tech debt reduction
- User experience
- Other
validations:
required: true
- type: textarea
id: what
attributes:
label: What does the enhancement do?
description: |
Give a high-level overview of how you
suggest improving an existing feature or functionality.
validations:
required: true
- type: textarea
id: implementation
attributes:
label: Implementation challenges
description: |
Share any ideas of how to implement the enhancement.
validations:
required: false

View File

@@ -1,42 +0,0 @@
---
name: Feature request
description: Suggest a new feature for GreptimeDB
labels: [ "feature request" ]
body:
- type: markdown
id: info
attributes:
value: |
Only use this template to suggest a new feature that doesn't already exist in GreptimeDB.
For enhancements to existing features, use the "Enhancement" issue template. For bugs,
use the bug report template.
- type: textarea
id: what
attributes:
label: What problem does the new feature solve?
description: |
Describe the problem and why it is important to solve. Did you consider alternative
solutions, perhaps outside the database? Why is it better to add the feature to
GreptimeDB?
validations:
required: true
- type: textarea
id: how
attributes:
label: What does the feature do?
description: |
Give a high-level overview of what the feature does and how it would work.
validations:
required: true
- type: textarea
id: implementation
attributes:
label: Implementation challenges
description: |
If you have ideas of how to implement the feature, and any particularly
challenging issues to overcome, then provide them here.
validations:
required: false

View File

@@ -1,76 +0,0 @@
name: Build and push dev-builder images
description: Build and push dev-builder images to DockerHub and ACR
inputs:
dockerhub-image-registry:
description: The dockerhub image registry to store the images
required: false
default: docker.io
dockerhub-image-registry-username:
description: The dockerhub username to login to the image registry
required: true
dockerhub-image-registry-token:
description: The dockerhub token to login to the image registry
required: true
dockerhub-image-namespace:
description: The dockerhub namespace of the image registry to store the images
required: false
default: greptime
version:
description: Version of the dev-builder
required: false
default: latest
build-dev-builder-ubuntu:
description: Build dev-builder-ubuntu image
required: false
default: 'true'
build-dev-builder-centos:
description: Build dev-builder-centos image
required: false
default: 'true'
build-dev-builder-android:
description: Build dev-builder-android image
required: false
default: 'true'
runs:
using: composite
steps:
- name: Login to Dockerhub
uses: docker/login-action@v2
with:
registry: ${{ inputs.dockerhub-image-registry }}
username: ${{ inputs.dockerhub-image-registry-username }}
password: ${{ inputs.dockerhub-image-registry-token }}
- name: Build and push dev-builder-ubuntu image
shell: bash
if: ${{ inputs.build-dev-builder-ubuntu == 'true' }}
run: |
make dev-builder \
BASE_IMAGE=ubuntu \
BUILDX_MULTI_PLATFORM_BUILD=true \
IMAGE_REGISTRY=${{ inputs.dockerhub-image-registry }} \
IMAGE_NAMESPACE=${{ inputs.dockerhub-image-namespace }} \
IMAGE_TAG=${{ inputs.version }}
- name: Build and push dev-builder-centos image
shell: bash
if: ${{ inputs.build-dev-builder-centos == 'true' }}
run: |
make dev-builder \
BASE_IMAGE=centos \
BUILDX_MULTI_PLATFORM_BUILD=true \
IMAGE_REGISTRY=${{ inputs.dockerhub-image-registry }} \
IMAGE_NAMESPACE=${{ inputs.dockerhub-image-namespace }} \
IMAGE_TAG=${{ inputs.version }}
- name: Build and push dev-builder-android image # Only build image for amd64 platform.
shell: bash
if: ${{ inputs.build-dev-builder-android == 'true' }}
run: |
make dev-builder \
BASE_IMAGE=android \
IMAGE_REGISTRY=${{ inputs.dockerhub-image-registry }} \
IMAGE_NAMESPACE=${{ inputs.dockerhub-image-namespace }} \
IMAGE_TAG=${{ inputs.version }} && \
docker push ${{ inputs.dockerhub-image-registry }}/${{ inputs.dockerhub-image-namespace }}/dev-builder-android:${{ inputs.version }}

View File

@@ -1,65 +0,0 @@
name: Build greptime binary
description: Build and upload the single linux artifact
inputs:
base-image:
description: Base image to build greptime
required: true
features:
description: Cargo features to build
required: true
cargo-profile:
description: Cargo profile to build
required: true
artifacts-dir:
description: Directory to store artifacts
required: true
version:
description: Version of the artifact
required: true
working-dir:
description: Working directory to build the artifacts
required: false
default: .
build-android-artifacts:
description: Build android artifacts
required: false
default: 'false'
runs:
using: composite
steps:
- name: Build greptime binary
shell: bash
if: ${{ inputs.build-android-artifacts == 'false' }}
run: |
cd ${{ inputs.working-dir }} && \
make build-by-dev-builder \
CARGO_PROFILE=${{ inputs.cargo-profile }} \
FEATURES=${{ inputs.features }} \
BASE_IMAGE=${{ inputs.base-image }}
- name: Upload artifacts
uses: ./.github/actions/upload-artifacts
if: ${{ inputs.build-android-artifacts == 'false' }}
env:
PROFILE_TARGET: ${{ inputs.cargo-profile == 'dev' && 'debug' || inputs.cargo-profile }}
with:
artifacts-dir: ${{ inputs.artifacts-dir }}
target-file: ./target/$PROFILE_TARGET/greptime
version: ${{ inputs.version }}
working-dir: ${{ inputs.working-dir }}
# TODO(zyy17): We can remove build-android-artifacts flag in the future.
- name: Build greptime binary
shell: bash
if: ${{ inputs.build-android-artifacts == 'true' }}
run: |
cd ${{ inputs.working-dir }} && make strip-android-bin
- name: Upload android artifacts
uses: ./.github/actions/upload-artifacts
if: ${{ inputs.build-android-artifacts == 'true' }}
with:
artifacts-dir: ${{ inputs.artifacts-dir }}
target-file: ./target/aarch64-linux-android/release/greptime
version: ${{ inputs.version }}
working-dir: ${{ inputs.working-dir }}

View File

@@ -1,104 +0,0 @@
name: Build greptime images
description: Build and push greptime images
inputs:
image-registry:
description: The image registry to store the images
required: true
image-registry-username:
description: The username to login to the image registry
required: true
image-registry-password:
description: The password to login to the image registry
required: true
amd64-artifact-name:
description: The name of the amd64 artifact for building images
required: true
arm64-artifact-name:
description: The name of the arm64 artifact for building images
required: false
default: ""
image-namespace:
description: The namespace of the image registry to store the images
required: true
image-name:
description: The name of the image to build
required: true
image-tag:
description: The tag of the image to build
required: true
docker-file:
description: The path to the Dockerfile to build
required: true
platforms:
description: The supported platforms to build the image
required: true
push-latest-tag:
description: Whether to push the latest tag
required: false
default: 'true'
runs:
using: composite
steps:
- name: Login to image registry
uses: docker/login-action@v2
with:
registry: ${{ inputs.image-registry }}
username: ${{ inputs.image-registry-username }}
password: ${{ inputs.image-registry-password }}
- name: Set up qemu for multi-platform builds
uses: docker/setup-qemu-action@v2
- name: Set up buildx
uses: docker/setup-buildx-action@v2
- name: Download amd64 artifacts
uses: actions/download-artifact@v4
with:
name: ${{ inputs.amd64-artifact-name }}
- name: Unzip the amd64 artifacts
shell: bash
run: |
tar xvf ${{ inputs.amd64-artifact-name }}.tar.gz && \
rm ${{ inputs.amd64-artifact-name }}.tar.gz && \
rm -rf amd64 && \
mv ${{ inputs.amd64-artifact-name }} amd64
- name: Download arm64 artifacts
uses: actions/download-artifact@v4
if: ${{ inputs.arm64-artifact-name }}
with:
name: ${{ inputs.arm64-artifact-name }}
- name: Unzip the arm64 artifacts
shell: bash
if: ${{ inputs.arm64-artifact-name }}
run: |
tar xvf ${{ inputs.arm64-artifact-name }}.tar.gz && \
rm ${{ inputs.arm64-artifact-name }}.tar.gz && \
rm -rf arm64 && \
mv ${{ inputs.arm64-artifact-name }} arm64
- name: Build and push images(without latest) for amd64 and arm64
if: ${{ inputs.push-latest-tag == 'false' }}
uses: docker/build-push-action@v3
with:
context: .
file: ${{ inputs.docker-file }}
push: true
platforms: ${{ inputs.platforms }}
tags: |
${{ inputs.image-registry }}/${{ inputs.image-namespace }}/${{ inputs.image-name }}:${{ inputs.image-tag }}
- name: Build and push images for amd64 and arm64
if: ${{ inputs.push-latest-tag == 'true' }}
uses: docker/build-push-action@v3
with:
context: .
file: ${{ inputs.docker-file }}
push: true
platforms: ${{ inputs.platforms }}
tags: |
${{ inputs.image-registry }}/${{ inputs.image-namespace }}/${{ inputs.image-name }}:latest
${{ inputs.image-registry }}/${{ inputs.image-namespace }}/${{ inputs.image-name }}:${{ inputs.image-tag }}

View File

@@ -1,62 +0,0 @@
name: Group for building greptimedb images
description: Group for building greptimedb images
inputs:
image-registry:
description: The image registry to store the images
required: true
image-namespace:
description: The namespace of the image registry to store the images
required: true
image-name:
description: The name of the image to build
required: false
default: greptimedb
image-registry-username:
description: The username to login to the image registry
required: true
image-registry-password:
description: The password to login to the image registry
required: true
version:
description: Version of the artifact
required: true
push-latest-tag:
description: Whether to push the latest tag
required: false
default: 'true'
dev-mode:
description: Enable dev mode, only build standard greptime
required: false
default: 'false'
runs:
using: composite
steps:
- name: Build and push standard images to dockerhub
uses: ./.github/actions/build-greptime-images
with: # The image will be used as '${{ inputs.image-registry }}/${{ inputs.image-namespace }}/${{ inputs.image-name }}:${{ inputs.version }}'
image-registry: ${{ inputs.image-registry }}
image-namespace: ${{ inputs.image-namespace }}
image-registry-username: ${{ inputs.image-registry-username }}
image-registry-password: ${{ inputs.image-registry-password }}
image-name: ${{ inputs.image-name }}
image-tag: ${{ inputs.version }}
docker-file: docker/ci/ubuntu/Dockerfile
amd64-artifact-name: greptime-linux-amd64-pyo3-${{ inputs.version }}
arm64-artifact-name: greptime-linux-arm64-pyo3-${{ inputs.version }}
platforms: linux/amd64,linux/arm64
push-latest-tag: ${{ inputs.push-latest-tag }}
- name: Build and push centos images to dockerhub
if: ${{ inputs.dev-mode == 'false' }}
uses: ./.github/actions/build-greptime-images
with:
image-registry: ${{ inputs.image-registry }}
image-namespace: ${{ inputs.image-namespace }}
image-registry-username: ${{ inputs.image-registry-username }}
image-registry-password: ${{ inputs.image-registry-password }}
image-name: ${{ inputs.image-name }}-centos
image-tag: ${{ inputs.version }}
docker-file: docker/ci/centos/Dockerfile
amd64-artifact-name: greptime-linux-amd64-centos-${{ inputs.version }}
platforms: linux/amd64
push-latest-tag: ${{ inputs.push-latest-tag }}

View File

@@ -1,88 +0,0 @@
name: Build linux artifacts
description: Build linux artifacts
inputs:
arch:
description: Architecture to build
required: true
cargo-profile:
description: Cargo profile to build
required: true
version:
description: Version of the artifact
required: true
disable-run-tests:
description: Disable running integration tests
required: true
dev-mode:
description: Enable dev mode, only build standard greptime
required: false
default: 'false'
working-dir:
description: Working directory to build the artifacts
required: false
default: .
runs:
using: composite
steps:
- name: Run integration test
if: ${{ inputs.disable-run-tests == 'false' }}
shell: bash
# NOTE: If the BUILD_JOBS > 4, it's always OOM in EC2 instance.
run: |
cd ${{ inputs.working-dir }} && \
make run-it-in-container BUILD_JOBS=4
- name: Upload sqlness logs
if: ${{ failure() && inputs.disable-run-tests == 'false' }} # Only upload logs when the integration tests failed.
uses: actions/upload-artifact@v4
with:
name: sqlness-logs
path: /tmp/greptime-*.log
retention-days: 3
- name: Build standard greptime
uses: ./.github/actions/build-greptime-binary
with:
base-image: ubuntu
features: pyo3_backend,servers/dashboard
cargo-profile: ${{ inputs.cargo-profile }}
artifacts-dir: greptime-linux-${{ inputs.arch }}-pyo3-${{ inputs.version }}
version: ${{ inputs.version }}
working-dir: ${{ inputs.working-dir }}
- name: Build greptime without pyo3
if: ${{ inputs.dev-mode == 'false' }}
uses: ./.github/actions/build-greptime-binary
with:
base-image: ubuntu
features: servers/dashboard
cargo-profile: ${{ inputs.cargo-profile }}
artifacts-dir: greptime-linux-${{ inputs.arch }}-${{ inputs.version }}
version: ${{ inputs.version }}
working-dir: ${{ inputs.working-dir }}
- name: Clean up the target directory # Clean up the target directory for the centos7 base image, or it will still use the objects of last build.
shell: bash
run: |
rm -rf ./target/
- name: Build greptime on centos base image
uses: ./.github/actions/build-greptime-binary
if: ${{ inputs.arch == 'amd64' && inputs.dev-mode == 'false' }} # Only build centos7 base image for amd64.
with:
base-image: centos
features: servers/dashboard
cargo-profile: ${{ inputs.cargo-profile }}
artifacts-dir: greptime-linux-${{ inputs.arch }}-centos-${{ inputs.version }}
version: ${{ inputs.version }}
working-dir: ${{ inputs.working-dir }}
- name: Build greptime on android base image
uses: ./.github/actions/build-greptime-binary
if: ${{ inputs.arch == 'amd64' && inputs.dev-mode == 'false' }} # Only build android base image on amd64.
with:
base-image: android
artifacts-dir: greptime-android-arm64-${{ inputs.version }}
version: ${{ inputs.version }}
working-dir: ${{ inputs.working-dir }}
build-android-artifacts: true

View File

@@ -1,89 +0,0 @@
name: Build macos artifacts
description: Build macos artifacts
inputs:
arch:
description: Architecture to build
required: true
rust-toolchain:
description: Rust toolchain to use
required: true
cargo-profile:
description: Cargo profile to build
required: true
features:
description: Cargo features to build
required: true
version:
description: Version of the artifact
required: true
disable-run-tests:
description: Disable running integration tests
required: true
artifacts-dir:
description: Directory to store artifacts
required: true
runs:
using: composite
steps:
- name: Cache cargo assets
id: cache
uses: actions/cache@v3
with:
path: |
~/.cargo/bin/
~/.cargo/registry/index/
~/.cargo/registry/cache/
~/.cargo/git/db/
target/
key: ${{ inputs.arch }}-build-cargo-${{ hashFiles('**/Cargo.lock') }}
- name: Install protoc
shell: bash
run: |
brew install protobuf
- name: Install rust toolchain
uses: dtolnay/rust-toolchain@master
with:
toolchain: ${{ inputs.rust-toolchain }}
targets: ${{ inputs.arch }}
- name: Start etcd # For integration tests.
if: ${{ inputs.disable-run-tests == 'false' }}
shell: bash
run: |
brew install etcd && \
brew services start etcd
- name: Install latest nextest release # For integration tests.
if: ${{ inputs.disable-run-tests == 'false' }}
uses: taiki-e/install-action@nextest
- name: Run integration tests
if: ${{ inputs.disable-run-tests == 'false' }}
shell: bash
run: |
make test sqlness-test
- name: Upload sqlness logs
if: ${{ failure() }} # Only upload logs when the integration tests failed.
uses: actions/upload-artifact@v4
with:
name: sqlness-logs
path: /tmp/greptime-*.log
retention-days: 3
- name: Build greptime binary
shell: bash
run: |
make build \
CARGO_PROFILE=${{ inputs.cargo-profile }} \
FEATURES=${{ inputs.features }} \
TARGET=${{ inputs.arch }}
- name: Upload artifacts
uses: ./.github/actions/upload-artifacts
with:
artifacts-dir: ${{ inputs.artifacts-dir }}
target-file: target/${{ inputs.arch }}/${{ inputs.cargo-profile }}/greptime
version: ${{ inputs.version }}

View File

@@ -1,80 +0,0 @@
name: Build Windows artifacts
description: Build Windows artifacts
inputs:
arch:
description: Architecture to build
required: true
rust-toolchain:
description: Rust toolchain to use
required: true
cargo-profile:
description: Cargo profile to build
required: true
features:
description: Cargo features to build
required: true
version:
description: Version of the artifact
required: true
disable-run-tests:
description: Disable running integration tests
required: true
artifacts-dir:
description: Directory to store artifacts
required: true
runs:
using: composite
steps:
- uses: arduino/setup-protoc@v3
- name: Install rust toolchain
uses: dtolnay/rust-toolchain@master
with:
toolchain: ${{ inputs.rust-toolchain }}
targets: ${{ inputs.arch }}
components: llvm-tools-preview
- name: Rust Cache
uses: Swatinem/rust-cache@v2
- name: Install Python
uses: actions/setup-python@v5
with:
python-version: '3.10'
- name: Install PyArrow Package
shell: pwsh
run: pip install pyarrow
- name: Install WSL distribution
uses: Vampire/setup-wsl@v2
with:
distribution: Ubuntu-22.04
- name: Install latest nextest release # For integration tests.
if: ${{ inputs.disable-run-tests == 'false' }}
uses: taiki-e/install-action@nextest
- name: Run integration tests
if: ${{ inputs.disable-run-tests == 'false' }}
shell: pwsh
run: make test sqlness-test
- name: Upload sqlness logs
if: ${{ failure() }} # Only upload logs when the integration tests failed.
uses: actions/upload-artifact@v4
with:
name: sqlness-logs
path: /tmp/greptime-*.log
retention-days: 3
- name: Build greptime binary
shell: pwsh
run: cargo build --profile ${{ inputs.cargo-profile }} --features ${{ inputs.features }} --target ${{ inputs.arch }} --bin greptime
- name: Upload artifacts
uses: ./.github/actions/upload-artifacts
with:
artifacts-dir: ${{ inputs.artifacts-dir }}
target-file: target/${{ inputs.arch }}/${{ inputs.cargo-profile }}/greptime
version: ${{ inputs.version }}

View File

@@ -1,31 +0,0 @@
name: Deploy GreptimeDB cluster
description: Deploy GreptimeDB cluster on Kubernetes
inputs:
aws-ci-test-bucket:
description: 'AWS S3 bucket name for testing'
required: true
aws-region:
description: 'AWS region for testing'
required: true
data-root:
description: 'Data root for testing'
required: true
aws-access-key-id:
description: 'AWS access key id for testing'
required: true
aws-secret-access-key:
description: 'AWS secret access key for testing'
required: true
runs:
using: composite
steps:
- name: Deploy GreptimeDB by Helm
shell: bash
env:
DATA_ROOT: ${{ inputs.data-root }}
AWS_CI_TEST_BUCKET: ${{ inputs.aws-ci-test-bucket }}
AWS_REGION: ${{ inputs.aws-region }}
AWS_ACCESS_KEY_ID: ${{ inputs.aws-access-key-id }}
AWS_SECRET_ACCESS_KEY: ${{ inputs.aws-secret-access-key }}
run: |
./.github/scripts/deploy-greptimedb.sh

View File

@@ -1,13 +0,0 @@
name: Fuzz Test
description: 'Fuzz test given setup and service'
inputs:
target:
description: "The fuzz target to test"
runs:
using: composite
steps:
- name: Run Fuzz Test
shell: bash
run: cargo fuzz run ${{ inputs.target }} --fuzz-dir tests-fuzz -D -s none -- -max_total_time=120
env:
GT_MYSQL_ADDR: 127.0.0.1:4002

View File

@@ -1,53 +0,0 @@
name: Publish GitHub release
description: Publish GitHub release
inputs:
version:
description: Version to release
required: true
runs:
using: composite
steps:
# Download artifacts from previous jobs, the artifacts will be downloaded to:
# ${WORKING_DIR}
# |- greptime-darwin-amd64-pyo3-v0.5.0/greptime-darwin-amd64-pyo3-v0.5.0.tar.gz
# |- greptime-darwin-amd64-pyo3-v0.5.0.sha256sum/greptime-darwin-amd64-pyo3-v0.5.0.sha256sum
# |- greptime-darwin-amd64-v0.5.0/greptime-darwin-amd64-v0.5.0.tar.gz
# |- greptime-darwin-amd64-v0.5.0.sha256sum/greptime-darwin-amd64-v0.5.0.sha256sum
# ...
- name: Download artifacts
uses: actions/download-artifact@v4
- name: Create git tag for release
if: ${{ github.event_name != 'push' }} # Meaning this is a scheduled or manual workflow.
shell: bash
run: |
git tag ${{ inputs.version }}
# Only publish release when the release tag is like v1.0.0, v1.0.1, v1.0.2, etc.
- name: Set release arguments
shell: bash
run: |
if [[ "${{ inputs.version }}" =~ ^v[0-9]+\.[0-9]+\.[0-9]+$ ]]; then
echo "prerelease=false" >> $GITHUB_ENV
echo "makeLatest=true" >> $GITHUB_ENV
echo "generateReleaseNotes=false" >> $GITHUB_ENV
echo "omitBody=true" >> $GITHUB_ENV
else
echo "prerelease=true" >> $GITHUB_ENV
echo "makeLatest=false" >> $GITHUB_ENV
echo "generateReleaseNotes=true" >> $GITHUB_ENV
echo "omitBody=false" >> $GITHUB_ENV
fi
- name: Publish release
uses: ncipollo/release-action@v1
with:
name: "Release ${{ inputs.version }}"
prerelease: ${{ env.prerelease }}
makeLatest: ${{ env.makeLatest }}
tag: ${{ inputs.version }}
generateReleaseNotes: ${{ env.generateReleaseNotes }}
omitBody: ${{ env.omitBody }} # omitBody is true when the release is a official release.
allowUpdates: true
artifacts: |
**/greptime-*/*

View File

@@ -1,138 +0,0 @@
name: Release CN artifacts
description: Release artifacts to CN region
inputs:
src-image-registry:
description: The source image registry to store the images
required: true
default: docker.io
src-image-namespace:
description: The namespace of the source image registry to store the images
required: true
default: greptime
src-image-name:
description: The name of the source image
required: false
default: greptimedb
dst-image-registry:
description: The destination image registry to store the images
required: true
dst-image-namespace:
description: The namespace of the destination image registry to store the images
required: true
default: greptime
dst-image-registry-username:
description: The username to login to the image registry
required: true
dst-image-registry-password:
description: The password to login to the image registry
required: true
version:
description: Version of the artifact
required: true
dev-mode:
description: Enable dev mode, only push standard greptime
required: false
default: 'false'
push-latest-tag:
description: Whether to push the latest tag of the image
required: false
default: 'true'
aws-cn-s3-bucket:
description: S3 bucket to store released artifacts in CN region
required: true
aws-cn-access-key-id:
description: AWS access key id in CN region
required: true
aws-cn-secret-access-key:
description: AWS secret access key in CN region
required: true
aws-cn-region:
description: AWS region in CN
required: true
upload-to-s3:
description: Upload to S3
required: false
default: 'true'
artifacts-dir:
description: Directory to store artifacts
required: false
default: 'artifacts'
update-version-info:
description: Update the version info in S3
required: false
default: 'true'
upload-max-retry-times:
description: Max retry times for uploading artifacts to S3
required: false
default: "20"
upload-retry-timeout:
description: Timeout for uploading artifacts to S3
required: false
default: "30" # minutes
runs:
using: composite
steps:
- name: Download artifacts
uses: actions/download-artifact@v4
with:
path: ${{ inputs.artifacts-dir }}
- name: Release artifacts to cn region
uses: nick-invision/retry@v2
if: ${{ inputs.upload-to-s3 == 'true' }}
env:
AWS_ACCESS_KEY_ID: ${{ inputs.aws-cn-access-key-id }}
AWS_SECRET_ACCESS_KEY: ${{ inputs.aws-cn-secret-access-key }}
AWS_DEFAULT_REGION: ${{ inputs.aws-cn-region }}
UPDATE_VERSION_INFO: ${{ inputs.update-version-info }}
with:
max_attempts: ${{ inputs.upload-max-retry-times }}
timeout_minutes: ${{ inputs.upload-retry-timeout }}
command: |
./.github/scripts/upload-artifacts-to-s3.sh \
${{ inputs.artifacts-dir }} \
${{ inputs.version }} \
${{ inputs.aws-cn-s3-bucket }}
- name: Push greptimedb image from Dockerhub to ACR
shell: bash
env:
DST_REGISTRY_USERNAME: ${{ inputs.dst-image-registry-username }}
DST_REGISTRY_PASSWORD: ${{ inputs.dst-image-registry-password }}
run: |
./.github/scripts/copy-image.sh \
${{ inputs.src-image-registry }}/${{ inputs.src-image-namespace }}/${{ inputs.src-image-name }}:${{ inputs.version }} \
${{ inputs.dst-image-registry }}/${{ inputs.dst-image-namespace }}
- name: Push latest greptimedb image from Dockerhub to ACR
shell: bash
if: ${{ inputs.push-latest-tag == 'true' }}
env:
DST_REGISTRY_USERNAME: ${{ inputs.dst-image-registry-username }}
DST_REGISTRY_PASSWORD: ${{ inputs.dst-image-registry-password }}
run: |
./.github/scripts/copy-image.sh \
${{ inputs.src-image-registry }}/${{ inputs.src-image-namespace }}/${{ inputs.src-image-name }}:latest \
${{ inputs.dst-image-registry }}/${{ inputs.dst-image-namespace }}
- name: Push greptimedb-centos image from DockerHub to ACR
shell: bash
if: ${{ inputs.dev-mode == 'false' }}
env:
DST_REGISTRY_USERNAME: ${{ inputs.dst-image-registry-username }}
DST_REGISTRY_PASSWORD: ${{ inputs.dst-image-registry-password }}
run: |
./.github/scripts/copy-image.sh \
${{ inputs.src-image-registry }}/${{ inputs.src-image-namespace }}/${{ inputs.src-image-name }}-centos:latest \
${{ inputs.dst-image-registry }}/${{ inputs.dst-image-namespace }}
- name: Push greptimedb-centos image from DockerHub to ACR
shell: bash
if: ${{ inputs.dev-mode == 'false' && inputs.push-latest-tag == 'true' }}
env:
DST_REGISTRY_USERNAME: ${{ inputs.dst-image-registry-username }}
DST_REGISTRY_PASSWORD: ${{ inputs.dst-image-registry-password }}
run: |
./.github/scripts/copy-image.sh \
${{ inputs.src-image-registry }}/${{ inputs.src-image-namespace }}/${{ inputs.src-image-name }}-centos:latest \
${{ inputs.dst-image-registry }}/${{ inputs.dst-image-namespace }}

View File

@@ -1,59 +0,0 @@
name: Run sqlness test
description: Run sqlness test on GreptimeDB
inputs:
aws-ci-test-bucket:
description: 'AWS S3 bucket name for testing'
required: true
aws-region:
description: 'AWS region for testing'
required: true
data-root:
description: 'Data root for testing'
required: true
aws-access-key-id:
description: 'AWS access key id for testing'
required: true
aws-secret-access-key:
description: 'AWS secret access key for testing'
required: true
runs:
using: composite
steps:
- name: Deploy GreptimeDB cluster by Helm
uses: ./.github/actions/deploy-greptimedb
with:
data-root: ${{ inputs.data-root }}
aws-ci-test-bucket: ${{ inputs.aws-ci-test-bucket }}
aws-region: ${{ inputs.aws-region }}
aws-access-key-id: ${{ inputs.aws-access-key-id }}
aws-secret-access-key: ${{ inputs.aws-secret-access-key }}
# TODO(zyy17): The following tests will be replaced by the real sqlness test.
- name: Run tests on greptimedb cluster
shell: bash
run: |
mysql -h 127.0.0.1 -P 14002 -e "CREATE TABLE IF NOT EXISTS system_metrics (host VARCHAR(255), idc VARCHAR(255), cpu_util DOUBLE, memory_util DOUBLE, disk_util DOUBLE, ts TIMESTAMP DEFAULT CURRENT_TIMESTAMP, PRIMARY KEY(host, idc), TIME INDEX(ts));" && \
mysql -h 127.0.0.1 -P 14002 -e "SHOW TABLES;"
- name: Run tests on greptimedb cluster that uses S3
shell: bash
run: |
mysql -h 127.0.0.1 -P 24002 -e "CREATE TABLE IF NOT EXISTS system_metrics (host VARCHAR(255), idc VARCHAR(255), cpu_util DOUBLE, memory_util DOUBLE, disk_util DOUBLE, ts TIMESTAMP DEFAULT CURRENT_TIMESTAMP, PRIMARY KEY(host, idc), TIME INDEX(ts));" && \
mysql -h 127.0.0.1 -P 24002 -e "SHOW TABLES;"
- name: Run tests on standalone greptimedb
shell: bash
run: |
mysql -h 127.0.0.1 -P 34002 -e "CREATE TABLE IF NOT EXISTS system_metrics (host VARCHAR(255), idc VARCHAR(255), cpu_util DOUBLE, memory_util DOUBLE, disk_util DOUBLE, ts TIMESTAMP DEFAULT CURRENT_TIMESTAMP, PRIMARY KEY(host, idc), TIME INDEX(ts));" && \
mysql -h 127.0.0.1 -P 34002 -e "SHOW TABLES;"
- name: Clean S3 data
shell: bash
env:
AWS_DEFAULT_REGION: ${{ inputs.aws-region }}
AWS_ACCESS_KEY_ID: ${{ inputs.aws-access-key-id }}
AWS_SECRET_ACCESS_KEY: ${{ inputs.aws-secret-access-key }}
run: |
aws s3 rm s3://${{ inputs.aws-ci-test-bucket }}/${{ inputs.data-root }} --recursive

View File

@@ -1,67 +0,0 @@
name: Start EC2 runner
description: Start EC2 runner
inputs:
runner:
description: The linux runner name
required: true
aws-access-key-id:
description: AWS access key id
required: true
aws-secret-access-key:
description: AWS secret access key
required: true
aws-region:
description: AWS region
required: true
github-token:
description: The GitHub token to clone private repository
required: false
default: ""
image-id:
description: The EC2 image id
required: true
security-group-id:
description: The EC2 security group id
required: true
subnet-id:
description: The EC2 subnet id
required: true
outputs:
label:
description: "label"
value: ${{ steps.start-linux-arm64-ec2-runner.outputs.label || inputs.runner }}
ec2-instance-id:
description: "ec2-instance-id"
value: ${{ steps.start-linux-arm64-ec2-runner.outputs.ec2-instance-id }}
runs:
using: composite
steps:
- name: Configure AWS credentials
if: startsWith(inputs.runner, 'ec2')
uses: aws-actions/configure-aws-credentials@v2
with:
aws-access-key-id: ${{ inputs.aws-access-key-id }}
aws-secret-access-key: ${{ inputs.aws-secret-access-key }}
aws-region: ${{ inputs.aws-region }}
# The EC2 runner will use the following format:
# <vm-type>-<instance-type>-<arch>
# like 'ec2-c6a.4xlarge-amd64'.
- name: Get EC2 instance type
if: startsWith(inputs.runner, 'ec2')
id: get-ec2-instance-type
shell: bash
run: |
echo "instance-type=$(echo ${{ inputs.runner }} | cut -d'-' -f2)" >> $GITHUB_OUTPUT
- name: Start EC2 runner
if: startsWith(inputs.runner, 'ec2')
uses: machulav/ec2-github-runner@v2
id: start-linux-arm64-ec2-runner
with:
mode: start
ec2-image-id: ${{ inputs.image-id }}
ec2-instance-type: ${{ steps.get-ec2-instance-type.outputs.instance-type }}
subnet-id: ${{ inputs.subnet-id }}
security-group-id: ${{ inputs.security-group-id }}
github-token: ${{ inputs.github-token }}

View File

@@ -1,41 +0,0 @@
name: Stop EC2 runner
description: Stop EC2 runner
inputs:
label:
description: The linux runner name
required: true
ec2-instance-id:
description: The EC2 instance id
required: true
aws-access-key-id:
description: AWS access key id
required: true
aws-secret-access-key:
description: AWS secret access key
required: true
aws-region:
description: AWS region
required: true
github-token:
description: The GitHub token to clone private repository
required: false
default: ""
runs:
using: composite
steps:
- name: Configure AWS credentials
if: ${{ inputs.label && inputs.ec2-instance-id }}
uses: aws-actions/configure-aws-credentials@v2
with:
aws-access-key-id: ${{ inputs.aws-access-key-id }}
aws-secret-access-key: ${{ inputs.aws-secret-access-key }}
aws-region: ${{ inputs.aws-region }}
- name: Stop EC2 runner
if: ${{ inputs.label && inputs.ec2-instance-id }}
uses: machulav/ec2-github-runner@v2
with:
mode: stop
label: ${{ inputs.label }}
ec2-instance-id: ${{ inputs.ec2-instance-id }}
github-token: ${{ inputs.github-token }}

View File

@@ -1,64 +0,0 @@
name: Upload artifacts
description: Upload artifacts
inputs:
artifacts-dir:
description: Directory to store artifacts
required: true
target-file:
description: The path of the target artifact
required: false
version:
description: Version of the artifact
required: true
working-dir:
description: Working directory to upload the artifacts
required: false
default: .
runs:
using: composite
steps:
- name: Create artifacts directory
if: ${{ inputs.target-file != '' }}
working-directory: ${{ inputs.working-dir }}
shell: bash
run: |
mkdir -p ${{ inputs.artifacts-dir }} && \
cp ${{ inputs.target-file }} ${{ inputs.artifacts-dir }}
# The compressed artifacts will use the following layout:
# greptime-linux-amd64-pyo3-v0.3.0sha256sum
# greptime-linux-amd64-pyo3-v0.3.0.tar.gz
# greptime-linux-amd64-pyo3-v0.3.0
# └── greptime
- name: Compress artifacts and calculate checksum
working-directory: ${{ inputs.working-dir }}
shell: bash
run: |
tar -zcvf ${{ inputs.artifacts-dir }}.tar.gz ${{ inputs.artifacts-dir }}
- name: Calculate checksum
if: runner.os != 'Windows'
working-directory: ${{ inputs.working-dir }}
shell: bash
run: |
echo $(shasum -a 256 ${{ inputs.artifacts-dir }}.tar.gz | cut -f1 -d' ') > ${{ inputs.artifacts-dir }}.sha256sum
- name: Calculate checksum on Windows
if: runner.os == 'Windows'
working-directory: ${{ inputs.working-dir }}
shell: pwsh
run: Get-FileHash ${{ inputs.artifacts-dir }}.tar.gz -Algorithm SHA256 | select -ExpandProperty Hash > ${{ inputs.artifacts-dir }}.sha256sum
# Note: The artifacts will be double zip compressed(related issue: https://github.com/actions/upload-artifact/issues/39).
# However, when we use 'actions/download-artifact' to download the artifacts, it will be automatically unzipped.
- name: Upload artifacts
uses: actions/upload-artifact@v4
with:
name: ${{ inputs.artifacts-dir }}
path: ${{ inputs.working-dir }}/${{ inputs.artifacts-dir }}.tar.gz
- name: Upload checksum
uses: actions/upload-artifact@v4
with:
name: ${{ inputs.artifacts-dir }}.sha256sum
path: ${{ inputs.working-dir }}/${{ inputs.artifacts-dir }}.sha256sum

10
.github/codecov.yml vendored Normal file
View File

@@ -0,0 +1,10 @@
# codecov config
coverage:
status:
patch: off # disable patch status
project:
default:
enable: yes
threshold: 1%
ignore:
- "**/error*.rs" # ignore all error.rs files

View File

@@ -1,4 +0,0 @@
Doc not needed:
- '- \[x\] This PR does not require documentation updates.'
Doc update required:
- '- \[ \] This PR does not require documentation updates.'

View File

@@ -1,13 +0,0 @@
{
"LABEL": {
"name": "breaking change",
"color": "D93F0B"
},
"CHECKS": {
"regexp": "^(?:(?!!:).)*$",
"ignoreLabels": [
"ignore-title"
],
"alwaysPassCI": true
}
}

View File

@@ -1,12 +1,10 @@
{
"LABEL": {
"name": "Invalid PR Title",
"color": "B60205"
},
"CHECKS": {
"regexp": "^(feat|fix|test|refactor|chore|style|docs|perf|build|ci|revert)(\\(.*\\))?\\!?:.*",
"ignoreLabels": [
"ignore-title"
]
}
"LABEL": {
"name": "Invalid PR Title",
"color": "B60205"
},
"CHECKS": {
"regexp": "^(feat|fix|test|refactor|chore|style|doc|perf|build|ci|revert)(\\(.*\\))?:.*",
"ignoreLabels" : ["ignore-title"]
}
}

View File

@@ -1,20 +0,0 @@
I hereby agree to the terms of the [GreptimeDB CLA](https://github.com/GreptimeTeam/.github/blob/main/CLA.md).
## Refer to a related PR or issue link (optional)
## What's changed and what's your intention?
__!!! DO NOT LEAVE THIS BLOCK EMPTY !!!__
Please explain IN DETAIL what the changes are in this PR and why they are needed:
- Summarize your change (**mandatory**)
- How does this PR work? Need a brief introduction for the changed logic (optional)
- Describe clearly one logical change and avoid lazy messages (optional)
- Describe any limitations of the current code (optional)
## Checklist
- [ ] I have written the necessary rustdoc comments.
- [ ] I have added the necessary unit tests and integration tests.
- [x] This PR does not require documentation updates.

View File

@@ -1,47 +0,0 @@
#!/usr/bin/env bash
set -e
set -o pipefail
SRC_IMAGE=$1
DST_REGISTRY=$2
SKOPEO_STABLE_IMAGE="quay.io/skopeo/stable:latest"
# Check if necessary variables are set.
function check_vars() {
for var in DST_REGISTRY_USERNAME DST_REGISTRY_PASSWORD DST_REGISTRY SRC_IMAGE; do
if [ -z "${!var}" ]; then
echo "$var is not set or empty."
echo "Usage: DST_REGISTRY_USERNAME=<your-dst-registry-username> DST_REGISTRY_PASSWORD=<your-dst-registry-password> $0 <dst-registry> <src-image>"
exit 1
fi
done
}
# Copies images from DockerHub to the destination registry.
function copy_images_from_dockerhub() {
# Check if docker is installed.
if ! command -v docker &> /dev/null; then
echo "docker is not installed. Please install docker to continue."
exit 1
fi
# Extract the name and tag of the source image.
IMAGE_NAME=$(echo "$SRC_IMAGE" | sed "s/.*\///")
echo "Copying $SRC_IMAGE to $DST_REGISTRY/$IMAGE_NAME"
docker run "$SKOPEO_STABLE_IMAGE" copy -a docker://"$SRC_IMAGE" \
--dest-creds "$DST_REGISTRY_USERNAME":"$DST_REGISTRY_PASSWORD" \
docker://"$DST_REGISTRY/$IMAGE_NAME"
}
function main() {
check_vars
copy_images_from_dockerhub
}
# Usage example:
# DST_REGISTRY_USERNAME=123 DST_REGISTRY_PASSWORD=456 \
# ./copy-image.sh greptime/greptimedb:v0.4.0 greptime-registry.cn-hangzhou.cr.aliyuncs.com
main

View File

@@ -1,68 +0,0 @@
#!/usr/bin/env bash
set -e
# - If it's a tag push release, the version is the tag name(${{ github.ref_name }});
# - If it's a scheduled release, the version is '${{ env.NEXT_RELEASE_VERSION }}-nightly-$buildTime', like 'v0.2.0-nightly-20230313';
# - If it's a manual release, the version is '${{ env.NEXT_RELEASE_VERSION }}-$(git rev-parse --short HEAD)-YYYYMMDDSS', like 'v0.2.0-e5b243c-2023071245';
# - If it's a nightly build, the version is 'nightly-YYYYMMDD-$(git rev-parse --short HEAD)', like 'nightly-20230712-e5b243c'.
# create_version ${GIHUB_EVENT_NAME} ${NEXT_RELEASE_VERSION} ${NIGHTLY_RELEASE_PREFIX}
function create_version() {
# Read from envrionment variables.
if [ -z "$GITHUB_EVENT_NAME" ]; then
echo "GITHUB_EVENT_NAME is empty"
exit 1
fi
if [ -z "$NEXT_RELEASE_VERSION" ]; then
echo "NEXT_RELEASE_VERSION is empty"
exit 1
fi
if [ -z "$NIGHTLY_RELEASE_PREFIX" ]; then
echo "NIGHTLY_RELEASE_PREFIX is empty"
exit 1
fi
# Reuse $NEXT_RELEASE_VERSION to identify whether it's a nightly build.
# It will be like 'nigtly-20230808-7d0d8dc6'.
if [ "$NEXT_RELEASE_VERSION" = nightly ]; then
echo "$NIGHTLY_RELEASE_PREFIX-$(date "+%Y%m%d")-$(git rev-parse --short HEAD)"
exit 0
fi
# Reuse $NEXT_RELEASE_VERSION to identify whether it's a dev build.
# It will be like 'dev-2023080819-f0e7216c'.
if [ "$NEXT_RELEASE_VERSION" = dev ]; then
if [ -z "$COMMIT_SHA" ]; then
echo "COMMIT_SHA is empty in dev build"
exit 1
fi
echo "dev-$(date "+%Y%m%d-%s")-$(echo "$COMMIT_SHA" | cut -c1-8)"
exit 0
fi
# Note: Only output 'version=xxx' to stdout when everything is ok, so that it can be used in GitHub Actions Outputs.
if [ "$GITHUB_EVENT_NAME" = push ]; then
if [ -z "$GITHUB_REF_NAME" ]; then
echo "GITHUB_REF_NAME is empty in push event"
exit 1
fi
echo "$GITHUB_REF_NAME"
elif [ "$GITHUB_EVENT_NAME" = workflow_dispatch ]; then
echo "$NEXT_RELEASE_VERSION-$(git rev-parse --short HEAD)-$(date "+%Y%m%d-%s")"
elif [ "$GITHUB_EVENT_NAME" = schedule ]; then
echo "$NEXT_RELEASE_VERSION-$NIGHTLY_RELEASE_PREFIX-$(date "+%Y%m%d")"
else
echo "Unsupported GITHUB_EVENT_NAME: $GITHUB_EVENT_NAME"
exit 1
fi
}
# You can run as following examples:
# GITHUB_EVENT_NAME=push NEXT_RELEASE_VERSION=v0.4.0 NIGHTLY_RELEASE_PREFIX=nigtly GITHUB_REF_NAME=v0.3.0 ./create-version.sh
# GITHUB_EVENT_NAME=workflow_dispatch NEXT_RELEASE_VERSION=v0.4.0 NIGHTLY_RELEASE_PREFIX=nigtly ./create-version.sh
# GITHUB_EVENT_NAME=schedule NEXT_RELEASE_VERSION=v0.4.0 NIGHTLY_RELEASE_PREFIX=nigtly ./create-version.sh
# GITHUB_EVENT_NAME=schedule NEXT_RELEASE_VERSION=nightly NIGHTLY_RELEASE_PREFIX=nigtly ./create-version.sh
# GITHUB_EVENT_NAME=workflow_dispatch COMMIT_SHA=f0e7216c4bb6acce9b29a21ec2d683be2e3f984a NEXT_RELEASE_VERSION=dev NIGHTLY_RELEASE_PREFIX=nigtly ./create-version.sh
create_version

View File

@@ -1,169 +0,0 @@
#!/usr/bin/env bash
set -e
set -o pipefail
KUBERNETES_VERSION="${KUBERNETES_VERSION:-v1.24.0}"
ENABLE_STANDALONE_MODE="${ENABLE_STANDALONE_MODE:-true}"
DEFAULT_INSTALL_NAMESPACE=${DEFAULT_INSTALL_NAMESPACE:-default}
GREPTIMEDB_IMAGE_TAG=${GREPTIMEDB_IMAGE_TAG:-latest}
ETCD_CHART="oci://registry-1.docker.io/bitnamicharts/etcd"
GREPTIME_CHART="https://greptimeteam.github.io/helm-charts/"
# Ceate a cluster with 1 control-plane node and 5 workers.
function create_kind_cluster() {
cat <<EOF | kind create cluster --name "${CLUSTER}" --image kindest/node:"$KUBERNETES_VERSION" --config=-
kind: Cluster
apiVersion: kind.x-k8s.io/v1alpha4
nodes:
- role: control-plane
- role: worker
- role: worker
- role: worker
- role: worker
- role: worker
EOF
}
# Add greptime Helm chart repo.
function add_greptime_chart() {
helm repo add greptime "$GREPTIME_CHART"
helm repo update
}
# Deploy a etcd cluster with 3 members.
function deploy_etcd_cluster() {
local namespace="$1"
helm install etcd "$ETCD_CHART" \
--set replicaCount=3 \
--set auth.rbac.create=false \
--set auth.rbac.token.enabled=false \
-n "$namespace"
# Wait for etcd cluster to be ready.
kubectl rollout status statefulset/etcd -n "$namespace"
}
# Deploy greptimedb-operator.
function deploy_greptimedb_operator() {
# Use the latest chart and image.
helm install greptimedb-operator greptime/greptimedb-operator \
--set image.tag=latest \
-n "$DEFAULT_INSTALL_NAMESPACE"
# Wait for greptimedb-operator to be ready.
kubectl rollout status deployment/greptimedb-operator -n "$DEFAULT_INSTALL_NAMESPACE"
}
# Deploy greptimedb cluster by using local storage.
# It will expose cluster service ports as '14000', '14001', '14002', '14003' to local access.
function deploy_greptimedb_cluster() {
local cluster_name=$1
local install_namespace=$2
kubectl create ns "$install_namespace"
deploy_etcd_cluster "$install_namespace"
helm install "$cluster_name" greptime/greptimedb-cluster \
--set image.tag="$GREPTIMEDB_IMAGE_TAG" \
--set meta.etcdEndpoints="etcd.$install_namespace:2379" \
-n "$install_namespace"
# Wait for greptimedb cluster to be ready.
while true; do
PHASE=$(kubectl -n "$install_namespace" get gtc "$cluster_name" -o jsonpath='{.status.clusterPhase}')
if [ "$PHASE" == "Running" ]; then
echo "Cluster is ready"
break
else
echo "Cluster is not ready yet: Current phase: $PHASE"
sleep 5 # wait for 5 seconds before check again.
fi
done
# Expose greptimedb cluster to local access.
kubectl -n "$install_namespace" port-forward svc/"$cluster_name"-frontend \
14000:4000 \
14001:4001 \
14002:4002 \
14003:4003 > /tmp/connections.out &
}
# Deploy greptimedb cluster by using S3.
# It will expose cluster service ports as '24000', '24001', '24002', '24003' to local access.
function deploy_greptimedb_cluster_with_s3_storage() {
local cluster_name=$1
local install_namespace=$2
kubectl create ns "$install_namespace"
deploy_etcd_cluster "$install_namespace"
helm install "$cluster_name" greptime/greptimedb-cluster -n "$install_namespace" \
--set image.tag="$GREPTIMEDB_IMAGE_TAG" \
--set meta.etcdEndpoints="etcd.$install_namespace:2379" \
--set storage.s3.bucket="$AWS_CI_TEST_BUCKET" \
--set storage.s3.region="$AWS_REGION" \
--set storage.s3.root="$DATA_ROOT" \
--set storage.credentials.secretName=s3-credentials \
--set storage.credentials.accessKeyId="$AWS_ACCESS_KEY_ID" \
--set storage.credentials.secretAccessKey="$AWS_SECRET_ACCESS_KEY"
# Wait for greptimedb cluster to be ready.
while true; do
PHASE=$(kubectl -n "$install_namespace" get gtc "$cluster_name" -o jsonpath='{.status.clusterPhase}')
if [ "$PHASE" == "Running" ]; then
echo "Cluster is ready"
break
else
echo "Cluster is not ready yet: Current phase: $PHASE"
sleep 5 # wait for 5 seconds before check again.
fi
done
# Expose greptimedb cluster to local access.
kubectl -n "$install_namespace" port-forward svc/"$cluster_name"-frontend \
24000:4000 \
24001:4001 \
24002:4002 \
24003:4003 > /tmp/connections.out &
}
# Deploy standalone greptimedb.
# It will expose cluster service ports as '34000', '34001', '34002', '34003' to local access.
function deploy_standalone_greptimedb() {
helm install greptimedb-standalone greptime/greptimedb-standalone \
--set image.tag="$GREPTIMEDB_IMAGE_TAG" \
-n "$DEFAULT_INSTALL_NAMESPACE"
# Wait for etcd cluster to be ready.
kubectl rollout status statefulset/greptimedb-standalone -n "$DEFAULT_INSTALL_NAMESPACE"
# Expose greptimedb to local access.
kubectl -n "$DEFAULT_INSTALL_NAMESPACE" port-forward svc/greptimedb-standalone \
34000:4000 \
34001:4001 \
34002:4002 \
34003:4003 > /tmp/connections.out &
}
# Entrypoint of the script.
function main() {
create_kind_cluster
add_greptime_chart
# Deploy standalone greptimedb in the same K8s.
if [ "$ENABLE_STANDALONE_MODE" == "true" ]; then
deploy_standalone_greptimedb
fi
deploy_greptimedb_operator
deploy_greptimedb_cluster testcluster testcluster
deploy_greptimedb_cluster_with_s3_storage testcluster-s3 testcluster-s3
}
# Usages:
# - Deploy greptimedb cluster: ./deploy-greptimedb.sh
main

View File

@@ -1,102 +0,0 @@
#!/usr/bin/env bash
set -e
set -o pipefail
ARTIFACTS_DIR=$1
VERSION=$2
AWS_S3_BUCKET=$3
RELEASE_DIRS="releases/greptimedb"
GREPTIMEDB_REPO="GreptimeTeam/greptimedb"
# Check if necessary variables are set.
function check_vars() {
for var in AWS_S3_BUCKET VERSION ARTIFACTS_DIR; do
if [ -z "${!var}" ]; then
echo "$var is not set or empty."
echo "Usage: $0 <artifacts-dir> <version> <aws-s3-bucket>"
exit 1
fi
done
}
# Uploads artifacts to AWS S3 bucket.
function upload_artifacts() {
# The bucket layout will be:
# releases/greptimedb
# ├── latest-version.txt
# ├── latest-nightly-version.txt
# ├── v0.1.0
# │ ├── greptime-darwin-amd64-pyo3-v0.1.0.sha256sum
# │ └── greptime-darwin-amd64-pyo3-v0.1.0.tar.gz
# └── v0.2.0
# ├── greptime-darwin-amd64-pyo3-v0.2.0.sha256sum
# └── greptime-darwin-amd64-pyo3-v0.2.0.tar.gz
find "$ARTIFACTS_DIR" -type f \( -name "*.tar.gz" -o -name "*.sha256sum" \) | while IFS= read -r file; do
aws s3 cp \
"$file" "s3://$AWS_S3_BUCKET/$RELEASE_DIRS/$VERSION/$(basename "$file")"
done
}
# Updates the latest version information in AWS S3 if UPDATE_VERSION_INFO is true.
function update_version_info() {
if [ "$UPDATE_VERSION_INFO" == "true" ]; then
# If it's the officail release(like v1.0.0, v1.0.1, v1.0.2, etc.), update latest-version.txt.
if [[ "$VERSION" =~ ^v[0-9]+\.[0-9]+\.[0-9]+$ ]]; then
echo "Updating latest-version.txt"
echo "$VERSION" > latest-version.txt
aws s3 cp \
latest-version.txt "s3://$AWS_S3_BUCKET/$RELEASE_DIRS/latest-version.txt"
fi
# If it's the nightly release, update latest-nightly-version.txt.
if [[ "$VERSION" == *"nightly"* ]]; then
echo "Updating latest-nightly-version.txt"
echo "$VERSION" > latest-nightly-version.txt
aws s3 cp \
latest-nightly-version.txt "s3://$AWS_S3_BUCKET/$RELEASE_DIRS/latest-nightly-version.txt"
fi
fi
}
# Downloads artifacts from Github if DOWNLOAD_ARTIFACTS_FROM_GITHUB is true.
function download_artifacts_from_github() {
if [ "$DOWNLOAD_ARTIFACTS_FROM_GITHUB" == "true" ]; then
# Check if jq is installed.
if ! command -v jq &> /dev/null; then
echo "jq is not installed. Please install jq to continue."
exit 1
fi
# Get the latest release API response.
RELEASES_API_RESPONSE=$(curl -s -H "Accept: application/vnd.github.v3+json" "https://api.github.com/repos/$GREPTIMEDB_REPO/releases/latest")
# Extract download URLs for the artifacts.
# Exclude source code archives which are typically named as 'greptimedb-<version>.zip' or 'greptimedb-<version>.tar.gz'.
ASSET_URLS=$(echo "$RELEASES_API_RESPONSE" | jq -r '.assets[] | select(.name | test("greptimedb-.*\\.(zip|tar\\.gz)$") | not) | .browser_download_url')
# Download each asset.
while IFS= read -r url; do
if [ -n "$url" ]; then
curl -LJO "$url"
echo "Downloaded: $url"
fi
done <<< "$ASSET_URLS"
fi
}
function main() {
check_vars
download_artifacts_from_github
upload_artifacts
update_version_info
}
# Usage example:
# AWS_ACCESS_KEY_ID=<your_access_key_id> \
# AWS_SECRET_ACCESS_KEY=<your_secret_access_key> \
# AWS_DEFAULT_REGION=<your_region> \
# UPDATE_VERSION_INFO=true \
# DOWNLOAD_ARTIFACTS_FROM_GITHUB=false \
# ./upload-artifacts-to-s3.sh <artifacts-dir> <version> <aws-s3-bucket>
main

View File

@@ -1,42 +0,0 @@
on:
push:
branches:
- main
paths-ignore:
- 'docs/**'
- 'config/**'
- '**.md'
- '.dockerignore'
- 'docker/**'
- '.gitignore'
name: Build API docs
env:
RUST_TOOLCHAIN: nightly-2023-12-19
jobs:
apidoc:
runs-on: ubuntu-20.04
steps:
- uses: actions/checkout@v4
- uses: arduino/setup-protoc@v3
with:
repo-token: ${{ secrets.GITHUB_TOKEN }}
- uses: dtolnay/rust-toolchain@master
with:
toolchain: ${{ env.RUST_TOOLCHAIN }}
- run: cargo doc --workspace --no-deps --document-private-items
- run: |
cat <<EOF > target/doc/index.html
<!DOCTYPE html>
<html>
<head>
<meta http-equiv="refresh" content="0; url='greptime/'" />
</head>
<body></body></html>
EOF
- name: Publish dist directory
uses: JamesIves/github-pages-deploy-action@v4
with:
folder: target/doc

56
.github/workflows/coverage.yml vendored Normal file
View File

@@ -0,0 +1,56 @@
on:
pull_request:
types: [opened, synchronize, reopened, ready_for_review]
push:
branches:
- "main"
- "develop"
name: Code coverage
env:
RUST_TOOLCHAIN: nightly-2022-07-14
jobs:
grcov:
if: github.event.pull_request.draft == false
runs-on: ubuntu-latest
steps:
- uses: actions/checkout@v2
- uses: arduino/setup-protoc@v1
- name: Install toolchain
uses: actions-rs/toolchain@v1
with:
toolchain: ${{ env.RUST_TOOLCHAIN }}
override: true
profile: minimal
- name: Rust Cache
uses: Swatinem/rust-cache@v2.0.0
- name: Cleanup disk
uses: curoky/cleanup-disk-action@v2.0
with:
retain: 'rust'
- name: Execute tests
uses: actions-rs/cargo@v1
with:
command: test
args: --workspace
env:
RUST_BACKTRACE: 1
CARGO_INCREMENTAL: 0
RUSTFLAGS: "-Zprofile -Ccodegen-units=1 -Cinline-threshold=0 -Clink-dead-code -Coverflow-checks=off -Cpanic=unwind -Zpanic_abort_tests"
GT_S3_BUCKET: ${{ secrets.S3_BUCKET }}
GT_S3_ACCESS_KEY_ID: ${{ secrets.S3_ACCESS_KEY_ID }}
GT_S3_ACCESS_KEY: ${{ secrets.S3_ACCESS_KEY }}
UNITTEST_LOG_DIR: "__unittest_logs"
- name: Gather coverage data
id: coverage
uses: actions-rs/grcov@v0.1
- name: Codecov upload
uses: codecov/codecov-action@v2
with:
token: ${{ secrets.CODECOV_TOKEN }}
files: ./lcov.info
flags: rust
fail_ci_if_error: true
verbose: true

View File

@@ -1,345 +0,0 @@
# Development build only build the debug version of the artifacts manually.
name: GreptimeDB Development Build
on:
workflow_dispatch: # Allows you to run this workflow manually.
inputs:
repository:
description: The public repository to build
required: false
default: GreptimeTeam/greptimedb
commit: # Note: We only pull the source code and use the current workflow to build the artifacts.
description: The commit to build
required: true
linux_amd64_runner:
type: choice
description: The runner uses to build linux-amd64 artifacts
default: ec2-c6i.4xlarge-amd64
options:
- ubuntu-20.04
- ubuntu-20.04-8-cores
- ubuntu-20.04-16-cores
- ubuntu-20.04-32-cores
- ubuntu-20.04-64-cores
- ec2-c6i.xlarge-amd64 # 4C8G
- ec2-c6i.2xlarge-amd64 # 8C16G
- ec2-c6i.4xlarge-amd64 # 16C32G
- ec2-c6i.8xlarge-amd64 # 32C64G
- ec2-c6i.16xlarge-amd64 # 64C128G
linux_arm64_runner:
type: choice
description: The runner uses to build linux-arm64 artifacts
default: ec2-c6g.4xlarge-arm64
options:
- ec2-c6g.xlarge-arm64 # 4C8G
- ec2-c6g.2xlarge-arm64 # 8C16G
- ec2-c6g.4xlarge-arm64 # 16C32G
- ec2-c6g.8xlarge-arm64 # 32C64G
- ec2-c6g.16xlarge-arm64 # 64C128G
skip_test:
description: Do not run integration tests during the build
type: boolean
default: true
build_linux_amd64_artifacts:
type: boolean
description: Build linux-amd64 artifacts
required: false
default: true
build_linux_arm64_artifacts:
type: boolean
description: Build linux-arm64 artifacts
required: false
default: true
release_images:
type: boolean
description: Build and push images to DockerHub and ACR
required: false
default: true
cargo_profile:
type: choice
description: The cargo profile to use in building GreptimeDB.
default: nightly
options:
- dev
- release
- nightly
# Use env variables to control all the release process.
env:
CARGO_PROFILE: ${{ inputs.cargo_profile }}
# Controls whether to run tests, include unit-test, integration-test and sqlness.
DISABLE_RUN_TESTS: ${{ inputs.skip_test || vars.DEFAULT_SKIP_TEST }}
# Always use 'dev' to indicate it's the dev build.
NEXT_RELEASE_VERSION: dev
NIGHTLY_RELEASE_PREFIX: nightly
# Use the different image name to avoid conflict with the release images.
IMAGE_NAME: greptimedb-dev
# The source code will check out in the following path: '${WORKING_DIR}/dev/greptime'.
CHECKOUT_GREPTIMEDB_PATH: dev/greptimedb
jobs:
allocate-runners:
name: Allocate runners
if: ${{ github.repository == 'GreptimeTeam/greptimedb' }}
runs-on: ubuntu-20.04
outputs:
linux-amd64-runner: ${{ steps.start-linux-amd64-runner.outputs.label }}
linux-arm64-runner: ${{ steps.start-linux-arm64-runner.outputs.label }}
# The following EC2 resource id will be used for resource releasing.
linux-amd64-ec2-runner-label: ${{ steps.start-linux-amd64-runner.outputs.label }}
linux-amd64-ec2-runner-instance-id: ${{ steps.start-linux-amd64-runner.outputs.ec2-instance-id }}
linux-arm64-ec2-runner-label: ${{ steps.start-linux-arm64-runner.outputs.label }}
linux-arm64-ec2-runner-instance-id: ${{ steps.start-linux-arm64-runner.outputs.ec2-instance-id }}
# The 'version' use as the global tag name of the release workflow.
version: ${{ steps.create-version.outputs.version }}
steps:
- name: Checkout
uses: actions/checkout@v4
with:
fetch-depth: 0
- name: Create version
id: create-version
run: |
version=$(./.github/scripts/create-version.sh) && \
echo $version && \
echo "version=$version" >> $GITHUB_OUTPUT
env:
GITHUB_EVENT_NAME: ${{ github.event_name }}
GITHUB_REF_NAME: ${{ github.ref_name }}
COMMIT_SHA: ${{ inputs.commit }}
NEXT_RELEASE_VERSION: ${{ env.NEXT_RELEASE_VERSION }}
NIGHTLY_RELEASE_PREFIX: ${{ env.NIGHTLY_RELEASE_PREFIX }}
- name: Allocate linux-amd64 runner
if: ${{ inputs.build_linux_amd64_artifacts || github.event_name == 'schedule' }}
uses: ./.github/actions/start-runner
id: start-linux-amd64-runner
with:
runner: ${{ inputs.linux_amd64_runner || vars.DEFAULT_AMD64_RUNNER }}
aws-access-key-id: ${{ secrets.AWS_ACCESS_KEY_ID }}
aws-secret-access-key: ${{ secrets.AWS_SECRET_ACCESS_KEY }}
aws-region: ${{ vars.EC2_RUNNER_REGION }}
github-token: ${{ secrets.GH_PERSONAL_ACCESS_TOKEN }}
image-id: ${{ vars.EC2_RUNNER_LINUX_AMD64_IMAGE_ID }}
security-group-id: ${{ vars.EC2_RUNNER_SECURITY_GROUP_ID }}
subnet-id: ${{ vars.EC2_RUNNER_SUBNET_ID }}
- name: Allocate linux-arm64 runner
if: ${{ inputs.build_linux_arm64_artifacts || github.event_name == 'schedule' }}
uses: ./.github/actions/start-runner
id: start-linux-arm64-runner
with:
runner: ${{ inputs.linux_arm64_runner || vars.DEFAULT_ARM64_RUNNER }}
aws-access-key-id: ${{ secrets.AWS_ACCESS_KEY_ID }}
aws-secret-access-key: ${{ secrets.AWS_SECRET_ACCESS_KEY }}
aws-region: ${{ vars.EC2_RUNNER_REGION }}
github-token: ${{ secrets.GH_PERSONAL_ACCESS_TOKEN }}
image-id: ${{ vars.EC2_RUNNER_LINUX_ARM64_IMAGE_ID }}
security-group-id: ${{ vars.EC2_RUNNER_SECURITY_GROUP_ID }}
subnet-id: ${{ vars.EC2_RUNNER_SUBNET_ID }}
build-linux-amd64-artifacts:
name: Build linux-amd64 artifacts
if: ${{ inputs.build_linux_amd64_artifacts || github.event_name == 'schedule' }}
needs: [
allocate-runners,
]
runs-on: ${{ needs.allocate-runners.outputs.linux-amd64-runner }}
steps:
- name: Checkout
uses: actions/checkout@v4
with:
fetch-depth: 0
- name: Checkout greptimedb
uses: actions/checkout@v4
with:
repository: ${{ inputs.repository }}
ref: ${{ inputs.commit }}
path: ${{ env.CHECKOUT_GREPTIMEDB_PATH }}
- uses: ./.github/actions/build-linux-artifacts
with:
arch: amd64
cargo-profile: ${{ env.CARGO_PROFILE }}
version: ${{ needs.allocate-runners.outputs.version }}
disable-run-tests: ${{ env.DISABLE_RUN_TESTS }}
dev-mode: true # Only build the standard greptime binary.
working-dir: ${{ env.CHECKOUT_GREPTIMEDB_PATH }}
build-linux-arm64-artifacts:
name: Build linux-arm64 artifacts
if: ${{ inputs.build_linux_arm64_artifacts || github.event_name == 'schedule' }}
needs: [
allocate-runners,
]
runs-on: ${{ needs.allocate-runners.outputs.linux-arm64-runner }}
steps:
- name: Checkout
uses: actions/checkout@v4
with:
fetch-depth: 0
- name: Checkout greptimedb
uses: actions/checkout@v4
with:
repository: ${{ inputs.repository }}
ref: ${{ inputs.commit }}
path: ${{ env.CHECKOUT_GREPTIMEDB_PATH }}
- uses: ./.github/actions/build-linux-artifacts
with:
arch: arm64
cargo-profile: ${{ env.CARGO_PROFILE }}
version: ${{ needs.allocate-runners.outputs.version }}
disable-run-tests: ${{ env.DISABLE_RUN_TESTS }}
dev-mode: true # Only build the standard greptime binary.
working-dir: ${{ env.CHECKOUT_GREPTIMEDB_PATH }}
release-images-to-dockerhub:
name: Build and push images to DockerHub
if: ${{ inputs.release_images || github.event_name == 'schedule' }}
needs: [
allocate-runners,
build-linux-amd64-artifacts,
build-linux-arm64-artifacts,
]
runs-on: ubuntu-20.04
outputs:
build-result: ${{ steps.set-build-result.outputs.build-result }}
steps:
- uses: actions/checkout@v4
with:
fetch-depth: 0
- name: Build and push images to dockerhub
uses: ./.github/actions/build-images
with:
image-registry: docker.io
image-namespace: ${{ vars.IMAGE_NAMESPACE }}
image-name: ${{ env.IMAGE_NAME }}
image-registry-username: ${{ secrets.DOCKERHUB_USERNAME }}
image-registry-password: ${{ secrets.DOCKERHUB_TOKEN }}
version: ${{ needs.allocate-runners.outputs.version }}
push-latest-tag: false # Don't push the latest tag to registry.
dev-mode: true # Only build the standard images.
- name: Set build result
id: set-build-result
run: |
echo "build-result=success" >> $GITHUB_OUTPUT
release-cn-artifacts:
name: Release artifacts to CN region
if: ${{ inputs.release_images || github.event_name == 'schedule' }}
needs: [
allocate-runners,
release-images-to-dockerhub,
]
runs-on: ubuntu-20.04
continue-on-error: true
steps:
- uses: actions/checkout@v4
with:
fetch-depth: 0
- name: Release artifacts to CN region
uses: ./.github/actions/release-cn-artifacts
with:
src-image-registry: docker.io
src-image-namespace: ${{ vars.IMAGE_NAMESPACE }}
src-image-name: ${{ env.IMAGE_NAME }}
dst-image-registry-username: ${{ secrets.ALICLOUD_USERNAME }}
dst-image-registry-password: ${{ secrets.ALICLOUD_PASSWORD }}
dst-image-registry: ${{ vars.ACR_IMAGE_REGISTRY }}
dst-image-namespace: ${{ vars.IMAGE_NAMESPACE }}
version: ${{ needs.allocate-runners.outputs.version }}
aws-cn-s3-bucket: ${{ vars.AWS_RELEASE_BUCKET }}
aws-cn-access-key-id: ${{ secrets.AWS_CN_ACCESS_KEY_ID }}
aws-cn-secret-access-key: ${{ secrets.AWS_CN_SECRET_ACCESS_KEY }}
aws-cn-region: ${{ vars.AWS_RELEASE_BUCKET_REGION }}
dev-mode: true # Only build the standard images(exclude centos images).
push-latest-tag: false # Don't push the latest tag to registry.
update-version-info: false # Don't update the version info in S3.
stop-linux-amd64-runner: # It's always run as the last job in the workflow to make sure that the runner is released.
name: Stop linux-amd64 runner
# Only run this job when the runner is allocated.
if: ${{ always() }}
runs-on: ubuntu-20.04
needs: [
allocate-runners,
build-linux-amd64-artifacts,
]
steps:
- name: Checkout
uses: actions/checkout@v4
with:
fetch-depth: 0
- name: Stop EC2 runner
uses: ./.github/actions/stop-runner
with:
label: ${{ needs.allocate-runners.outputs.linux-amd64-ec2-runner-label }}
ec2-instance-id: ${{ needs.allocate-runners.outputs.linux-amd64-ec2-runner-instance-id }}
aws-access-key-id: ${{ secrets.AWS_ACCESS_KEY_ID }}
aws-secret-access-key: ${{ secrets.AWS_SECRET_ACCESS_KEY }}
aws-region: ${{ vars.EC2_RUNNER_REGION }}
github-token: ${{ secrets.GH_PERSONAL_ACCESS_TOKEN }}
stop-linux-arm64-runner: # It's always run as the last job in the workflow to make sure that the runner is released.
name: Stop linux-arm64 runner
# Only run this job when the runner is allocated.
if: ${{ always() }}
runs-on: ubuntu-20.04
needs: [
allocate-runners,
build-linux-arm64-artifacts,
]
steps:
- name: Checkout
uses: actions/checkout@v4
with:
fetch-depth: 0
- name: Stop EC2 runner
uses: ./.github/actions/stop-runner
with:
label: ${{ needs.allocate-runners.outputs.linux-arm64-ec2-runner-label }}
ec2-instance-id: ${{ needs.allocate-runners.outputs.linux-arm64-ec2-runner-instance-id }}
aws-access-key-id: ${{ secrets.AWS_ACCESS_KEY_ID }}
aws-secret-access-key: ${{ secrets.AWS_SECRET_ACCESS_KEY }}
aws-region: ${{ vars.EC2_RUNNER_REGION }}
github-token: ${{ secrets.GH_PERSONAL_ACCESS_TOKEN }}
notification:
if: ${{ always() }} # Not requiring successful dependent jobs, always run.
name: Send notification to Greptime team
needs: [
release-images-to-dockerhub
]
runs-on: ubuntu-20.04
env:
SLACK_WEBHOOK_URL: ${{ secrets.SLACK_WEBHOOK_URL_DEVELOP_CHANNEL }}
steps:
- name: Notifiy dev build successful result
uses: slackapi/slack-github-action@v1.23.0
if: ${{ needs.release-images-to-dockerhub.outputs.build-result == 'success' }}
with:
payload: |
{"text": "GreptimeDB's ${{ env.NEXT_RELEASE_VERSION }} build has completed successfully."}
- name: Notifiy dev build failed result
uses: slackapi/slack-github-action@v1.23.0
if: ${{ needs.release-images-to-dockerhub.outputs.build-result != 'success' }}
with:
payload: |
{"text": "GreptimeDB's ${{ env.NEXT_RELEASE_VERSION }} build has failed, please check 'https://github.com/GreptimeTeam/greptimedb/actions/workflows/${{ env.NEXT_RELEASE_VERSION }}-build.yml'."}

View File

@@ -1,340 +1,93 @@
on:
merge_group:
pull_request:
types: [ opened, synchronize, reopened, ready_for_review ]
paths-ignore:
- 'docs/**'
- 'config/**'
- '**.md'
- '.dockerignore'
- 'docker/**'
- '.gitignore'
- 'grafana/**'
push:
branches:
- main
paths-ignore:
- 'docs/**'
- 'config/**'
- '**.md'
- '.dockerignore'
- 'docker/**'
- '.gitignore'
- 'grafana/**'
workflow_dispatch:
types: [opened, synchronize, reopened, ready_for_review]
name: CI
concurrency:
group: ${{ github.workflow }}-${{ github.head_ref || github.run_id }}
cancel-in-progress: true
name: Continuous integration for developing
env:
RUST_TOOLCHAIN: nightly-2023-12-19
RUST_TOOLCHAIN: nightly-2022-07-14
jobs:
typos:
name: Spell Check with Typos
runs-on: ubuntu-20.04
steps:
- uses: actions/checkout@v4
- uses: crate-ci/typos@v1.13.10
check:
name: Check
runs-on: ${{ matrix.os }}
strategy:
matrix:
os: [ windows-latest, ubuntu-20.04 ]
timeout-minutes: 60
steps:
- uses: actions/checkout@v4
- uses: arduino/setup-protoc@v3
with:
repo-token: ${{ secrets.GITHUB_TOKEN }}
- uses: dtolnay/rust-toolchain@master
with:
toolchain: ${{ env.RUST_TOOLCHAIN }}
- name: Rust Cache
uses: Swatinem/rust-cache@v2
with:
# Shares across multiple jobs
# Shares with `Clippy` job
shared-key: "check-lint"
- name: Run cargo check
run: cargo check --locked --workspace --all-targets
toml:
name: Toml Check
runs-on: ubuntu-20.04
timeout-minutes: 60
steps:
- uses: actions/checkout@v4
- uses: dtolnay/rust-toolchain@master
with:
toolchain: stable
- name: Rust Cache
uses: Swatinem/rust-cache@v2
with:
# Shares across multiple jobs
shared-key: "check-toml"
- name: Install taplo
run: cargo +stable install taplo-cli --version ^0.9 --locked
- name: Run taplo
run: taplo format --check
build:
name: Build GreptimeDB binaries
runs-on: ${{ matrix.os }}
strategy:
matrix:
os: [ ubuntu-20.04 ]
timeout-minutes: 60
steps:
- uses: actions/checkout@v4
- uses: arduino/setup-protoc@v3
- uses: dtolnay/rust-toolchain@master
with:
toolchain: ${{ env.RUST_TOOLCHAIN }}
- uses: Swatinem/rust-cache@v2
with:
# Shares across multiple jobs
shared-key: "build-binaries"
- name: Build greptime binaries
shell: bash
run: cargo build --bin greptime --bin sqlness-runner
- name: Pack greptime binaries
shell: bash
run: |
mkdir bins && \
mv ./target/debug/greptime bins && \
mv ./target/debug/sqlness-runner bins
- name: Print greptime binaries info
run: ls -lh bins
- name: Upload artifacts
uses: ./.github/actions/upload-artifacts
with:
artifacts-dir: bins
version: current
fuzztest:
name: Fuzz Test
needs: build
runs-on: ubuntu-latest
strategy:
matrix:
target: [ "fuzz_create_table", "fuzz_alter_table" ]
steps:
- uses: actions/checkout@v4
- uses: arduino/setup-protoc@v3
- uses: dtolnay/rust-toolchain@master
with:
toolchain: ${{ env.RUST_TOOLCHAIN }}
- name: Rust Cache
uses: Swatinem/rust-cache@v2
with:
# Shares across multiple jobs
shared-key: "fuzz-test-targets"
- name: Set Rust Fuzz
shell: bash
run: |
sudo apt update && sudo apt install -y libfuzzer-14-dev
cargo install cargo-fuzz
- name: Download pre-built binaries
uses: actions/download-artifact@v4
with:
name: bins
path: .
- name: Unzip binaries
run: tar -xvf ./bins.tar.gz
- name: Run GreptimeDB
run: |
./bins/greptime standalone start&
- name: Fuzz Test
uses: ./.github/actions/fuzz-test
env:
CUSTOM_LIBFUZZER_PATH: /usr/lib/llvm-14/lib/libFuzzer.a
with:
target: ${{ matrix.target }}
sqlness:
name: Sqlness Test
needs: build
runs-on: ${{ matrix.os }}
strategy:
matrix:
os: [ ubuntu-20.04 ]
timeout-minutes: 60
steps:
- uses: actions/checkout@v4
- name: Download pre-built binaries
uses: actions/download-artifact@v4
with:
name: bins
path: .
- name: Unzip binaries
run: tar -xvf ./bins.tar.gz
- name: Run sqlness
run: RUST_BACKTRACE=1 ./bins/sqlness-runner -c ./tests/cases --bins-dir ./bins
- name: Upload sqlness logs
if: always()
uses: actions/upload-artifact@v4
with:
name: sqlness-logs
path: /tmp/greptime-*.log
retention-days: 3
sqlness-kafka-wal:
name: Sqlness Test with Kafka Wal
needs: build
runs-on: ${{ matrix.os }}
strategy:
matrix:
os: [ ubuntu-20.04 ]
timeout-minutes: 60
steps:
- uses: actions/checkout@v4
- name: Download pre-built binaries
uses: actions/download-artifact@v4
with:
name: bins
path: .
- name: Unzip binaries
run: tar -xvf ./bins.tar.gz
- name: Setup kafka server
working-directory: tests-integration/fixtures/kafka
run: docker compose -f docker-compose-standalone.yml up -d --wait
- name: Run sqlness
run: RUST_BACKTRACE=1 ./bins/sqlness-runner -w kafka -k 127.0.0.1:9092 -c ./tests/cases --bins-dir ./bins
- name: Upload sqlness logs
if: always()
uses: actions/upload-artifact@v4
with:
name: sqlness-logs-with-kafka-wal
path: /tmp/greptime-*.log
retention-days: 3
fmt:
name: Rustfmt
runs-on: ubuntu-20.04
timeout-minutes: 60
steps:
- uses: actions/checkout@v4
- uses: arduino/setup-protoc@v3
with:
repo-token: ${{ secrets.GITHUB_TOKEN }}
- uses: dtolnay/rust-toolchain@master
with:
toolchain: ${{ env.RUST_TOOLCHAIN }}
components: rustfmt
- name: Rust Cache
uses: Swatinem/rust-cache@v2
with:
# Shares across multiple jobs
shared-key: "check-rust-fmt"
- name: Run cargo fmt
run: cargo fmt --all -- --check
clippy:
name: Clippy
runs-on: ubuntu-20.04
timeout-minutes: 60
steps:
- uses: actions/checkout@v4
- uses: arduino/setup-protoc@v3
with:
repo-token: ${{ secrets.GITHUB_TOKEN }}
- uses: dtolnay/rust-toolchain@master
with:
toolchain: ${{ env.RUST_TOOLCHAIN }}
components: clippy
- name: Rust Cache
uses: Swatinem/rust-cache@v2
with:
# Shares across multiple jobs
# Shares with `Check` job
shared-key: "check-lint"
- name: Run cargo clippy
run: cargo clippy --workspace --all-targets -- -D warnings
coverage:
if: github.event.pull_request.draft == false
runs-on: ubuntu-20.04-8-cores
timeout-minutes: 60
runs-on: ubuntu-latest
steps:
- uses: actions/checkout@v4
- uses: arduino/setup-protoc@v3
with:
repo-token: ${{ secrets.GITHUB_TOKEN }}
- uses: KyleMayes/install-llvm-action@v1
with:
version: "14.0"
- name: Install toolchain
uses: dtolnay/rust-toolchain@master
- uses: actions/checkout@v2
- uses: arduino/setup-protoc@v1
- uses: actions-rs/toolchain@v1
with:
profile: minimal
toolchain: ${{ env.RUST_TOOLCHAIN }}
components: llvm-tools-preview
override: true
- name: Rust Cache
uses: Swatinem/rust-cache@v2
uses: Swatinem/rust-cache@v2.0.0
- uses: actions-rs/cargo@v1
with:
# Shares cross multiple jobs
shared-key: "coverage-test"
- name: Docker Cache
uses: ScribeMD/docker-cache@0.3.7
command: check
args: --workspace --all-targets
test:
name: Test Suite
if: github.event.pull_request.draft == false
runs-on: ubuntu-latest
steps:
- uses: actions/checkout@v2
- uses: arduino/setup-protoc@v1
- uses: actions-rs/toolchain@v1
with:
key: docker-${{ runner.os }}-coverage
- name: Install latest nextest release
uses: taiki-e/install-action@nextest
- name: Install cargo-llvm-cov
uses: taiki-e/install-action@cargo-llvm-cov
- name: Install Python
uses: actions/setup-python@v5
profile: minimal
toolchain: ${{ env.RUST_TOOLCHAIN }}
override: true
- name: Rust Cache
uses: Swatinem/rust-cache@v2.0.0
- uses: actions-rs/cargo@v1
with:
python-version: '3.10'
- name: Install PyArrow Package
run: pip install pyarrow
- name: Setup etcd server
working-directory: tests-integration/fixtures/etcd
run: docker compose -f docker-compose-standalone.yml up -d --wait
- name: Setup kafka server
working-directory: tests-integration/fixtures/kafka
run: docker compose -f docker-compose-standalone.yml up -d --wait
- name: Run nextest cases
run: cargo llvm-cov nextest --workspace --lcov --output-path lcov.info -F pyo3_backend -F dashboard
command: test
args: --workspace
env:
CARGO_BUILD_RUSTFLAGS: "-C link-arg=-fuse-ld=lld"
RUST_BACKTRACE: 1
CARGO_INCREMENTAL: 0
GT_S3_BUCKET: ${{ secrets.S3_BUCKET }}
GT_S3_ACCESS_KEY_ID: ${{ secrets.S3_ACCESS_KEY_ID }}
GT_S3_ACCESS_KEY: ${{ secrets.S3_ACCESS_KEY }}
GT_S3_REGION: ${{ secrets.S3_REGION }}
GT_ETCD_ENDPOINTS: http://127.0.0.1:2379
GT_KAFKA_ENDPOINTS: 127.0.0.1:9092
UNITTEST_LOG_DIR: "__unittest_logs"
- name: Codecov upload
uses: codecov/codecov-action@v4
with:
token: ${{ secrets.CODECOV_TOKEN }}
files: ./lcov.info
flags: rust
fail_ci_if_error: false
verbose: true
compat:
name: Compatibility Test
needs: build
runs-on: ubuntu-20.04
timeout-minutes: 60
fmt:
name: Rustfmt
if: github.event.pull_request.draft == false
runs-on: ubuntu-latest
steps:
- uses: actions/checkout@v4
- name: Download pre-built binaries
uses: actions/download-artifact@v4
- uses: actions/checkout@v2
- uses: arduino/setup-protoc@v1
- uses: actions-rs/toolchain@v1
with:
name: bins
path: .
- name: Unzip binaries
run: |
mkdir -p ./bins/current
tar -xvf ./bins.tar.gz --strip-components=1 -C ./bins/current
- run: ./tests/compat/test-compat.sh 0.6.0
profile: minimal
toolchain: ${{ env.RUST_TOOLCHAIN }}
override: true
- name: Rust Cache
uses: Swatinem/rust-cache@v2.0.0
- run: rustup component add rustfmt
- uses: actions-rs/cargo@v1
with:
command: fmt
args: --all -- --check
clippy:
name: Clippy
if: github.event.pull_request.draft == false
runs-on: ubuntu-latest
steps:
- uses: actions/checkout@v2
- uses: arduino/setup-protoc@v1
- uses: actions-rs/toolchain@v1
with:
profile: minimal
toolchain: ${{ env.RUST_TOOLCHAIN }}
override: true
- name: Rust Cache
uses: Swatinem/rust-cache@v2.0.0
- run: rustup component add clippy
- uses: actions-rs/cargo@v1
with:
command: clippy
args: --workspace --all-targets -- -D warnings -D clippy::print_stdout -D clippy::print_stderr

View File

@@ -1,39 +0,0 @@
name: Create Issue in downstream repos
on:
issues:
types:
- labeled
pull_request_target:
types:
- labeled
jobs:
doc_issue:
if: github.event.label.name == 'doc update required'
runs-on: ubuntu-20.04
steps:
- name: create an issue in doc repo
uses: dacbd/create-issue-action@v1.2.1
with:
owner: GreptimeTeam
repo: docs
token: ${{ secrets.DOCS_REPO_TOKEN }}
title: Update docs for ${{ github.event.issue.title || github.event.pull_request.title }}
body: |
A document change request is generated from
${{ github.event.issue.html_url || github.event.pull_request.html_url }}
cloud_issue:
if: github.event.label.name == 'cloud followup required'
runs-on: ubuntu-20.04
steps:
- name: create an issue in cloud repo
uses: dacbd/create-issue-action@v1.2.1
with:
owner: GreptimeTeam
repo: greptimedb-cloud
token: ${{ secrets.DOCS_REPO_TOKEN }}
title: Followup changes in ${{ github.event.issue.title || github.event.pull_request.title }}
body: |
A followup request is generated from
${{ github.event.issue.html_url || github.event.pull_request.html_url }}

View File

@@ -1,36 +0,0 @@
name: "PR Doc Labeler"
on:
pull_request_target:
types: [opened, edited, synchronize, ready_for_review, auto_merge_enabled, labeled, unlabeled]
permissions:
pull-requests: write
contents: read
jobs:
triage:
if: ${{ github.repository == 'GreptimeTeam/greptimedb' }}
runs-on: ubuntu-latest
steps:
- uses: github/issue-labeler@v3.4
with:
configuration-path: .github/doc-label-config.yml
enable-versioned-regex: false
repo-token: ${{ secrets.GITHUB_TOKEN }}
sync-labels: 1
- name: create an issue in doc repo
uses: dacbd/create-issue-action@v1.2.1
if: ${{ github.event.action == 'opened' && contains(github.event.pull_request.body, '- [ ] This PR does not require documentation updates.') }}
with:
owner: GreptimeTeam
repo: docs
token: ${{ secrets.DOCS_REPO_TOKEN }}
title: Update docs for ${{ github.event.issue.title || github.event.pull_request.title }}
body: |
A document change request is generated from
${{ github.event.issue.html_url || github.event.pull_request.html_url }}
- name: Check doc labels
uses: docker://agilepathway/pull-request-label-checker:latest
with:
one_of: Doc update required,Doc not needed
repo_token: ${{ secrets.GITHUB_TOKEN }}

View File

@@ -1,78 +0,0 @@
on:
merge_group:
pull_request:
types: [opened, synchronize, reopened, ready_for_review]
paths:
- 'docs/**'
- 'config/**'
- '**.md'
- '.dockerignore'
- 'docker/**'
- '.gitignore'
- 'grafana/**'
push:
branches:
- main
paths:
- 'docs/**'
- 'config/**'
- '**.md'
- '.dockerignore'
- 'docker/**'
- '.gitignore'
- 'grafana/**'
workflow_dispatch:
name: CI
# To pass the required status check, see:
# https://docs.github.com/en/repositories/configuring-branches-and-merges-in-your-repository/defining-the-mergeability-of-pull-requests/troubleshooting-required-status-checks#handling-skipped-but-required-checks
jobs:
typos:
name: Spell Check with Typos
runs-on: ubuntu-20.04
steps:
- uses: actions/checkout@v4
- uses: crate-ci/typos@v1.13.10
check:
name: Check
runs-on: ubuntu-20.04
steps:
- run: 'echo "No action required"'
fmt:
name: Rustfmt
runs-on: ubuntu-20.04
steps:
- run: 'echo "No action required"'
clippy:
name: Clippy
runs-on: ubuntu-20.04
steps:
- run: 'echo "No action required"'
coverage:
runs-on: ubuntu-20.04
steps:
- run: 'echo "No action required"'
sqlness:
name: Sqlness Test
runs-on: ${{ matrix.os }}
strategy:
matrix:
os: [ ubuntu-20.04 ]
steps:
- run: 'echo "No action required"'
sqlness-kafka-wal:
name: Sqlness Test with Kafka Wal
runs-on: ${{ matrix.os }}
strategy:
matrix:
os: [ ubuntu-20.04 ]
steps:
- run: 'echo "No action required"'

View File

@@ -1,16 +0,0 @@
name: License checker
on:
push:
branches:
- main
pull_request:
types: [opened, synchronize, reopened, ready_for_review]
jobs:
license-header-check:
runs-on: ubuntu-20.04
name: license-header-check
steps:
- uses: actions/checkout@v4
- name: Check License Header
uses: korandoru/hawkeye@v4

View File

@@ -1,309 +0,0 @@
# Nightly build only do the following things:
# 1. Run integration tests;
# 2. Build binaries and images for linux-amd64 and linux-arm64 platform;
name: GreptimeDB Nightly Build
on:
schedule:
# Trigger at 00:00(UTC) on every day-of-week from Monday through Friday.
- cron: '0 0 * * 1-5'
workflow_dispatch: # Allows you to run this workflow manually.
inputs:
linux_amd64_runner:
type: choice
description: The runner uses to build linux-amd64 artifacts
default: ec2-c6i.2xlarge-amd64
options:
- ubuntu-20.04
- ubuntu-20.04-8-cores
- ubuntu-20.04-16-cores
- ubuntu-20.04-32-cores
- ubuntu-20.04-64-cores
- ec2-c6i.xlarge-amd64 # 4C8G
- ec2-c6i.2xlarge-amd64 # 8C16G
- ec2-c6i.4xlarge-amd64 # 16C32G
- ec2-c6i.8xlarge-amd64 # 32C64G
- ec2-c6i.16xlarge-amd64 # 64C128G
linux_arm64_runner:
type: choice
description: The runner uses to build linux-arm64 artifacts
default: ec2-c6g.2xlarge-arm64
options:
- ec2-c6g.xlarge-arm64 # 4C8G
- ec2-c6g.2xlarge-arm64 # 8C16G
- ec2-c6g.4xlarge-arm64 # 16C32G
- ec2-c6g.8xlarge-arm64 # 32C64G
- ec2-c6g.16xlarge-arm64 # 64C128G
skip_test:
description: Do not run integration tests during the build
type: boolean
default: true
build_linux_amd64_artifacts:
type: boolean
description: Build linux-amd64 artifacts
required: false
default: false
build_linux_arm64_artifacts:
type: boolean
description: Build linux-arm64 artifacts
required: false
default: false
release_images:
type: boolean
description: Build and push images to DockerHub and ACR
required: false
default: false
# Use env variables to control all the release process.
env:
CARGO_PROFILE: nightly
# Controls whether to run tests, include unit-test, integration-test and sqlness.
DISABLE_RUN_TESTS: ${{ inputs.skip_test || vars.DEFAULT_SKIP_TEST }}
# Always use 'nightly' to indicate it's the nightly build.
NEXT_RELEASE_VERSION: nightly
NIGHTLY_RELEASE_PREFIX: nightly
jobs:
allocate-runners:
name: Allocate runners
if: ${{ github.repository == 'GreptimeTeam/greptimedb' }}
runs-on: ubuntu-20.04
outputs:
linux-amd64-runner: ${{ steps.start-linux-amd64-runner.outputs.label }}
linux-arm64-runner: ${{ steps.start-linux-arm64-runner.outputs.label }}
# The following EC2 resource id will be used for resource releasing.
linux-amd64-ec2-runner-label: ${{ steps.start-linux-amd64-runner.outputs.label }}
linux-amd64-ec2-runner-instance-id: ${{ steps.start-linux-amd64-runner.outputs.ec2-instance-id }}
linux-arm64-ec2-runner-label: ${{ steps.start-linux-arm64-runner.outputs.label }}
linux-arm64-ec2-runner-instance-id: ${{ steps.start-linux-arm64-runner.outputs.ec2-instance-id }}
# The 'version' use as the global tag name of the release workflow.
version: ${{ steps.create-version.outputs.version }}
steps:
- name: Checkout
uses: actions/checkout@v4
with:
fetch-depth: 0
- name: Create version
id: create-version
run: |
version=$(./.github/scripts/create-version.sh) && \
echo $version && \
echo "version=$version" >> $GITHUB_OUTPUT
env:
GITHUB_EVENT_NAME: ${{ github.event_name }}
GITHUB_REF_NAME: ${{ github.ref_name }}
NEXT_RELEASE_VERSION: ${{ env.NEXT_RELEASE_VERSION }}
NIGHTLY_RELEASE_PREFIX: ${{ env.NIGHTLY_RELEASE_PREFIX }}
- name: Allocate linux-amd64 runner
if: ${{ inputs.build_linux_amd64_artifacts || github.event_name == 'schedule' }}
uses: ./.github/actions/start-runner
id: start-linux-amd64-runner
with:
runner: ${{ inputs.linux_amd64_runner || vars.DEFAULT_AMD64_RUNNER }}
aws-access-key-id: ${{ secrets.AWS_ACCESS_KEY_ID }}
aws-secret-access-key: ${{ secrets.AWS_SECRET_ACCESS_KEY }}
aws-region: ${{ vars.EC2_RUNNER_REGION }}
github-token: ${{ secrets.GH_PERSONAL_ACCESS_TOKEN }}
image-id: ${{ vars.EC2_RUNNER_LINUX_AMD64_IMAGE_ID }}
security-group-id: ${{ vars.EC2_RUNNER_SECURITY_GROUP_ID }}
subnet-id: ${{ vars.EC2_RUNNER_SUBNET_ID }}
- name: Allocate linux-arm64 runner
if: ${{ inputs.build_linux_arm64_artifacts || github.event_name == 'schedule' }}
uses: ./.github/actions/start-runner
id: start-linux-arm64-runner
with:
runner: ${{ inputs.linux_arm64_runner || vars.DEFAULT_ARM64_RUNNER }}
aws-access-key-id: ${{ secrets.AWS_ACCESS_KEY_ID }}
aws-secret-access-key: ${{ secrets.AWS_SECRET_ACCESS_KEY }}
aws-region: ${{ vars.EC2_RUNNER_REGION }}
github-token: ${{ secrets.GH_PERSONAL_ACCESS_TOKEN }}
image-id: ${{ vars.EC2_RUNNER_LINUX_ARM64_IMAGE_ID }}
security-group-id: ${{ vars.EC2_RUNNER_SECURITY_GROUP_ID }}
subnet-id: ${{ vars.EC2_RUNNER_SUBNET_ID }}
build-linux-amd64-artifacts:
name: Build linux-amd64 artifacts
if: ${{ inputs.build_linux_amd64_artifacts || github.event_name == 'schedule' }}
needs: [
allocate-runners,
]
runs-on: ${{ needs.allocate-runners.outputs.linux-amd64-runner }}
steps:
- uses: actions/checkout@v4
with:
fetch-depth: 0
- uses: ./.github/actions/build-linux-artifacts
with:
arch: amd64
cargo-profile: ${{ env.CARGO_PROFILE }}
version: ${{ needs.allocate-runners.outputs.version }}
disable-run-tests: ${{ env.DISABLE_RUN_TESTS }}
build-linux-arm64-artifacts:
name: Build linux-arm64 artifacts
if: ${{ inputs.build_linux_arm64_artifacts || github.event_name == 'schedule' }}
needs: [
allocate-runners,
]
runs-on: ${{ needs.allocate-runners.outputs.linux-arm64-runner }}
steps:
- uses: actions/checkout@v4
with:
fetch-depth: 0
- uses: ./.github/actions/build-linux-artifacts
with:
arch: arm64
cargo-profile: ${{ env.CARGO_PROFILE }}
version: ${{ needs.allocate-runners.outputs.version }}
disable-run-tests: ${{ env.DISABLE_RUN_TESTS }}
release-images-to-dockerhub:
name: Build and push images to DockerHub
if: ${{ inputs.release_images || github.event_name == 'schedule' }}
needs: [
allocate-runners,
build-linux-amd64-artifacts,
build-linux-arm64-artifacts,
]
runs-on: ubuntu-20.04
outputs:
nightly-build-result: ${{ steps.set-nightly-build-result.outputs.nightly-build-result }}
steps:
- uses: actions/checkout@v4
with:
fetch-depth: 0
- name: Build and push images to dockerhub
uses: ./.github/actions/build-images
with:
image-registry: docker.io
image-namespace: ${{ vars.IMAGE_NAMESPACE }}
image-registry-username: ${{ secrets.DOCKERHUB_USERNAME }}
image-registry-password: ${{ secrets.DOCKERHUB_TOKEN }}
version: ${{ needs.allocate-runners.outputs.version }}
push-latest-tag: false # Don't push the latest tag to registry.
- name: Set nightly build result
id: set-nightly-build-result
run: |
echo "nightly-build-result=success" >> $GITHUB_OUTPUT
release-cn-artifacts:
name: Release artifacts to CN region
if: ${{ inputs.release_images || github.event_name == 'schedule' }}
needs: [
allocate-runners,
release-images-to-dockerhub,
]
runs-on: ubuntu-20.04
# When we push to ACR, it's easy to fail due to some unknown network issues.
# However, we don't want to fail the whole workflow because of this.
# The ACR have daily sync with DockerHub, so don't worry about the image not being updated.
continue-on-error: true
steps:
- uses: actions/checkout@v4
with:
fetch-depth: 0
- name: Release artifacts to CN region
uses: ./.github/actions/release-cn-artifacts
with:
src-image-registry: docker.io
src-image-namespace: ${{ vars.IMAGE_NAMESPACE }}
src-image-name: greptimedb
dst-image-registry-username: ${{ secrets.ALICLOUD_USERNAME }}
dst-image-registry-password: ${{ secrets.ALICLOUD_PASSWORD }}
dst-image-registry: ${{ vars.ACR_IMAGE_REGISTRY }}
dst-image-namespace: ${{ vars.IMAGE_NAMESPACE }}
version: ${{ needs.allocate-runners.outputs.version }}
aws-cn-s3-bucket: ${{ vars.AWS_RELEASE_BUCKET }}
aws-cn-access-key-id: ${{ secrets.AWS_CN_ACCESS_KEY_ID }}
aws-cn-secret-access-key: ${{ secrets.AWS_CN_SECRET_ACCESS_KEY }}
aws-cn-region: ${{ vars.AWS_RELEASE_BUCKET_REGION }}
dev-mode: false
update-version-info: false # Don't update version info in S3.
push-latest-tag: false # Don't push the latest tag to registry.
stop-linux-amd64-runner: # It's always run as the last job in the workflow to make sure that the runner is released.
name: Stop linux-amd64 runner
# Only run this job when the runner is allocated.
if: ${{ always() }}
runs-on: ubuntu-20.04
needs: [
allocate-runners,
build-linux-amd64-artifacts,
]
steps:
- name: Checkout
uses: actions/checkout@v4
with:
fetch-depth: 0
- name: Stop EC2 runner
uses: ./.github/actions/stop-runner
with:
label: ${{ needs.allocate-runners.outputs.linux-amd64-ec2-runner-label }}
ec2-instance-id: ${{ needs.allocate-runners.outputs.linux-amd64-ec2-runner-instance-id }}
aws-access-key-id: ${{ secrets.AWS_ACCESS_KEY_ID }}
aws-secret-access-key: ${{ secrets.AWS_SECRET_ACCESS_KEY }}
aws-region: ${{ vars.EC2_RUNNER_REGION }}
github-token: ${{ secrets.GH_PERSONAL_ACCESS_TOKEN }}
stop-linux-arm64-runner: # It's always run as the last job in the workflow to make sure that the runner is released.
name: Stop linux-arm64 runner
# Only run this job when the runner is allocated.
if: ${{ always() }}
runs-on: ubuntu-20.04
needs: [
allocate-runners,
build-linux-arm64-artifacts,
]
steps:
- name: Checkout
uses: actions/checkout@v4
with:
fetch-depth: 0
- name: Stop EC2 runner
uses: ./.github/actions/stop-runner
with:
label: ${{ needs.allocate-runners.outputs.linux-arm64-ec2-runner-label }}
ec2-instance-id: ${{ needs.allocate-runners.outputs.linux-arm64-ec2-runner-instance-id }}
aws-access-key-id: ${{ secrets.AWS_ACCESS_KEY_ID }}
aws-secret-access-key: ${{ secrets.AWS_SECRET_ACCESS_KEY }}
aws-region: ${{ vars.EC2_RUNNER_REGION }}
github-token: ${{ secrets.GH_PERSONAL_ACCESS_TOKEN }}
notification:
if: ${{ always() }} # Not requiring successful dependent jobs, always run.
name: Send notification to Greptime team
needs: [
release-images-to-dockerhub
]
runs-on: ubuntu-20.04
env:
SLACK_WEBHOOK_URL: ${{ secrets.SLACK_WEBHOOK_URL_DEVELOP_CHANNEL }}
steps:
- name: Notifiy nightly build successful result
uses: slackapi/slack-github-action@v1.23.0
if: ${{ needs.release-images-to-dockerhub.outputs.nightly-build-result == 'success' }}
with:
payload: |
{"text": "GreptimeDB's ${{ env.NEXT_RELEASE_VERSION }} build has completed successfully."}
- name: Notifiy nightly build failed result
uses: slackapi/slack-github-action@v1.23.0
if: ${{ needs.release-images-to-dockerhub.outputs.nightly-build-result != 'success' }}
with:
payload: |
{"text": "GreptimeDB's ${{ env.NEXT_RELEASE_VERSION }} build has failed, please check 'https://github.com/GreptimeTeam/greptimedb/actions/workflows/${{ env.NEXT_RELEASE_VERSION }}-build.yml'."}

View File

@@ -1,100 +0,0 @@
# Nightly CI: runs tests every night for our second tier plaforms (Windows)
on:
schedule:
- cron: '0 23 * * 1-5'
workflow_dispatch:
name: Nightly CI
concurrency:
group: ${{ github.workflow }}-${{ github.head_ref || github.run_id }}
cancel-in-progress: true
env:
RUST_TOOLCHAIN: nightly-2023-12-19
jobs:
sqlness:
name: Sqlness Test
if: ${{ github.repository == 'GreptimeTeam/greptimedb' }}
runs-on: ${{ matrix.os }}
strategy:
matrix:
os: [ windows-latest-8-cores ]
timeout-minutes: 60
steps:
- uses: actions/checkout@v4
- uses: arduino/setup-protoc@v3
with:
repo-token: ${{ secrets.GITHUB_TOKEN }}
- uses: dtolnay/rust-toolchain@master
with:
toolchain: ${{ env.RUST_TOOLCHAIN }}
- name: Rust Cache
uses: Swatinem/rust-cache@v2
- name: Run sqlness
run: cargo sqlness
- name: Notify slack if failed
if: failure()
uses: slackapi/slack-github-action@v1.23.0
env:
SLACK_WEBHOOK_URL: ${{ secrets.SLACK_WEBHOOK_URL_DEVELOP_CHANNEL }}
with:
payload: |
{"text": "Nightly CI failed for sqlness tests"}
- name: Upload sqlness logs
if: always()
uses: actions/upload-artifact@v4
with:
name: sqlness-logs
path: /tmp/greptime-*.log
retention-days: 3
test-on-windows:
if: ${{ github.repository == 'GreptimeTeam/greptimedb' }}
runs-on: windows-latest-8-cores
timeout-minutes: 60
steps:
- run: git config --global core.autocrlf false
- uses: actions/checkout@v4
- uses: arduino/setup-protoc@v3
with:
repo-token: ${{ secrets.GITHUB_TOKEN }}
- name: Install Rust toolchain
uses: dtolnay/rust-toolchain@master
with:
toolchain: ${{ env.RUST_TOOLCHAIN }}
components: llvm-tools-preview
- name: Rust Cache
uses: Swatinem/rust-cache@v2
- name: Install Cargo Nextest
uses: taiki-e/install-action@nextest
- name: Install Python
uses: actions/setup-python@v5
with:
python-version: '3.10'
- name: Install PyArrow Package
run: pip install pyarrow
- name: Install WSL distribution
uses: Vampire/setup-wsl@v2
with:
distribution: Ubuntu-22.04
- name: Running tests
run: cargo nextest run -F pyo3_backend,dashboard
env:
RUST_BACKTRACE: 1
CARGO_INCREMENTAL: 0
GT_S3_BUCKET: ${{ secrets.S3_BUCKET }}
GT_S3_ACCESS_KEY_ID: ${{ secrets.S3_ACCESS_KEY_ID }}
GT_S3_ACCESS_KEY: ${{ secrets.S3_ACCESS_KEY }}
GT_S3_REGION: ${{ secrets.S3_REGION }}
UNITTEST_LOG_DIR: "__unittest_logs"
- name: Notify slack if failed
if: failure()
uses: slackapi/slack-github-action@v1.23.0
env:
SLACK_WEBHOOK_URL: ${{ secrets.SLACK_WEBHOOK_URL_DEVELOP_CHANNEL }}
with:
payload: |
{"text": "Nightly CI failed for cargo test"}

View File

@@ -1,27 +0,0 @@
name: Nightly functional tests
on:
schedule:
# At 00:00 on Tuesday.
- cron: '0 0 * * 2'
workflow_dispatch:
jobs:
sqlness-test:
name: Run sqlness test
if: ${{ github.repository == 'GreptimeTeam/greptimedb' }}
runs-on: ubuntu-22.04
steps:
- name: Checkout
uses: actions/checkout@v4
with:
fetch-depth: 0
- name: Run sqlness test
uses: ./.github/actions/sqlness-test
with:
data-root: sqlness-test
aws-ci-test-bucket: ${{ vars.AWS_CI_TEST_BUCKET }}
aws-region: ${{ vars.AWS_CI_TEST_BUCKET_REGION }}
aws-access-key-id: ${{ secrets.AWS_CI_TEST_ACCESS_KEY_ID }}
aws-secret-access-key: ${{ secrets.AWS_CI_TEST_SECRET_ACCESS_KEY }}

View File

@@ -10,20 +10,10 @@ on:
jobs:
check:
runs-on: ubuntu-20.04
timeout-minutes: 10
runs-on: ubuntu-latest
steps:
- uses: thehanimo/pr-title-checker@v1.4.2
- uses: thehanimo/pr-title-checker@v1.3.4
with:
GITHUB_TOKEN: ${{ secrets.GITHUB_TOKEN }}
pass_on_octokit_error: false
configuration_path: ".github/pr-title-checker-config.json"
breaking:
runs-on: ubuntu-20.04
timeout-minutes: 10
steps:
- uses: thehanimo/pr-title-checker@v1.4.2
with:
GITHUB_TOKEN: ${{ secrets.GITHUB_TOKEN }}
pass_on_octokit_error: false
configuration_path: ".github/pr-title-breaking-change-label-config.json"

View File

@@ -1,85 +0,0 @@
name: Release dev-builder images
on:
workflow_dispatch: # Allows you to run this workflow manually.
inputs:
version:
description: Version of the dev-builder
required: false
default: latest
release_dev_builder_ubuntu_image:
type: boolean
description: Release dev-builder-ubuntu image
required: false
default: false
release_dev_builder_centos_image:
type: boolean
description: Release dev-builder-centos image
required: false
default: false
release_dev_builder_android_image:
type: boolean
description: Release dev-builder-android image
required: false
default: false
jobs:
release-dev-builder-images:
name: Release dev builder images
if: ${{ inputs.release_dev_builder_ubuntu_image || inputs.release_dev_builder_centos_image || inputs.release_dev_builder_android_image }} # Only manually trigger this job.
runs-on: ubuntu-20.04-16-cores
steps:
- name: Checkout
uses: actions/checkout@v4
with:
fetch-depth: 0
- name: Build and push dev builder images
uses: ./.github/actions/build-dev-builder-images
with:
version: ${{ inputs.version }}
dockerhub-image-registry-username: ${{ secrets.DOCKERHUB_USERNAME }}
dockerhub-image-registry-token: ${{ secrets.DOCKERHUB_TOKEN }}
build-dev-builder-ubuntu: ${{ inputs.release_dev_builder_ubuntu_image }}
build-dev-builder-centos: ${{ inputs.release_dev_builder_centos_image }}
build-dev-builder-android: ${{ inputs.release_dev_builder_android_image }}
release-dev-builder-images-cn: # Note: Be careful issue: https://github.com/containers/skopeo/issues/1874 and we decide to use the latest stable skopeo container.
name: Release dev builder images to CN region
runs-on: ubuntu-20.04
needs: [
release-dev-builder-images
]
steps:
- name: Push dev-builder-ubuntu image
shell: bash
if: ${{ inputs.release_dev_builder_ubuntu_image }}
env:
DST_REGISTRY_USERNAME: ${{ secrets.ALICLOUD_USERNAME }}
DST_REGISTRY_PASSWORD: ${{ secrets.ALICLOUD_PASSWORD }}
run: |
docker run quay.io/skopeo/stable:latest copy -a docker://docker.io/${{ vars.IMAGE_NAMESPACE }}/dev-builder-ubuntu:${{ inputs.version }} \
--dest-creds "$DST_REGISTRY_USERNAME":"$DST_REGISTRY_PASSWORD" \
docker://${{ vars.ACR_IMAGE_REGISTRY }}/${{ vars.IMAGE_NAMESPACE }}/dev-builder-ubuntu:${{ inputs.version }}
- name: Push dev-builder-centos image
shell: bash
if: ${{ inputs.release_dev_builder_centos_image }}
env:
DST_REGISTRY_USERNAME: ${{ secrets.ALICLOUD_USERNAME }}
DST_REGISTRY_PASSWORD: ${{ secrets.ALICLOUD_PASSWORD }}
run: |
docker run quay.io/skopeo/stable:latest copy -a docker://docker.io/${{ vars.IMAGE_NAMESPACE }}/dev-builder-centos:${{ inputs.version }} \
--dest-creds "$DST_REGISTRY_USERNAME":"$DST_REGISTRY_PASSWORD" \
docker://${{ vars.ACR_IMAGE_REGISTRY }}/${{ vars.IMAGE_NAMESPACE }}/dev-builder-centos:${{ inputs.version }}
- name: Push dev-builder-android image
shell: bash
if: ${{ inputs.release_dev_builder_android_image }}
env:
DST_REGISTRY_USERNAME: ${{ secrets.ALICLOUD_USERNAME }}
DST_REGISTRY_PASSWORD: ${{ secrets.ALICLOUD_PASSWORD }}
run: |
docker run quay.io/skopeo/stable:latest copy -a docker://docker.io/${{ vars.IMAGE_NAMESPACE }}/dev-builder-android:${{ inputs.version }} \
--dest-creds "$DST_REGISTRY_USERNAME":"$DST_REGISTRY_PASSWORD" \
docker://${{ vars.ACR_IMAGE_REGISTRY }}/${{ vars.IMAGE_NAMESPACE }}/dev-builder-android:${{ inputs.version }}

View File

@@ -1,462 +0,0 @@
name: Release
# There are two kinds of formal release:
# 1. The tag('v*.*.*') push release: the release workflow will be triggered by the tag push event.
# 2. The scheduled release(the version will be '${{ env.NEXT_RELEASE_VERSION }}-nightly-YYYYMMDD'): the release workflow will be triggered by the schedule event.
on:
push:
tags:
- "v*.*.*"
schedule:
# At 00:00 on Monday.
- cron: '0 0 * * 1'
workflow_dispatch: # Allows you to run this workflow manually.
# Notes: The GitHub Actions ONLY support 10 inputs, and it's already used up.
inputs:
linux_amd64_runner:
type: choice
description: The runner uses to build linux-amd64 artifacts
default: ec2-c6i.4xlarge-amd64
options:
- ubuntu-20.04
- ubuntu-20.04-8-cores
- ubuntu-20.04-16-cores
- ubuntu-20.04-32-cores
- ubuntu-20.04-64-cores
- ec2-c6i.xlarge-amd64 # 4C8G
- ec2-c6i.2xlarge-amd64 # 8C16G
- ec2-c6i.4xlarge-amd64 # 16C32G
- ec2-c6i.8xlarge-amd64 # 32C64G
- ec2-c6i.16xlarge-amd64 # 64C128G
linux_arm64_runner:
type: choice
description: The runner uses to build linux-arm64 artifacts
default: ec2-c6g.4xlarge-arm64
options:
- ec2-c6g.xlarge-arm64 # 4C8G
- ec2-c6g.2xlarge-arm64 # 8C16G
- ec2-c6g.4xlarge-arm64 # 16C32G
- ec2-c6g.8xlarge-arm64 # 32C64G
- ec2-c6g.16xlarge-arm64 # 64C128G
macos_runner:
type: choice
description: The runner uses to build macOS artifacts
default: macos-latest
options:
- macos-latest
skip_test:
description: Do not run integration tests during the build
type: boolean
default: true
build_linux_amd64_artifacts:
type: boolean
description: Build linux-amd64 artifacts
required: false
default: false
build_linux_arm64_artifacts:
type: boolean
description: Build linux-arm64 artifacts
required: false
default: false
build_macos_artifacts:
type: boolean
description: Build macos artifacts
required: false
default: false
build_windows_artifacts:
type: boolean
description: Build Windows artifacts
required: false
default: false
publish_github_release:
type: boolean
description: Create GitHub release and upload artifacts
required: false
default: false
release_images:
type: boolean
description: Build and push images to DockerHub and ACR
required: false
default: false
# Use env variables to control all the release process.
env:
# The arguments of building greptime.
RUST_TOOLCHAIN: nightly-2023-12-19
CARGO_PROFILE: nightly
# Controls whether to run tests, include unit-test, integration-test and sqlness.
DISABLE_RUN_TESTS: ${{ inputs.skip_test || vars.DEFAULT_SKIP_TEST }}
# The scheduled version is '${{ env.NEXT_RELEASE_VERSION }}-nightly-YYYYMMDD', like v0.2.0-nigthly-20230313;
NIGHTLY_RELEASE_PREFIX: nightly
# Note: The NEXT_RELEASE_VERSION should be modified manually by every formal release.
NEXT_RELEASE_VERSION: v0.8.0
jobs:
allocate-runners:
name: Allocate runners
if: ${{ github.repository == 'GreptimeTeam/greptimedb' }}
runs-on: ubuntu-20.04
outputs:
linux-amd64-runner: ${{ steps.start-linux-amd64-runner.outputs.label }}
linux-arm64-runner: ${{ steps.start-linux-arm64-runner.outputs.label }}
macos-runner: ${{ inputs.macos_runner || vars.DEFAULT_MACOS_RUNNER }}
windows-runner: windows-latest-8-cores
# The following EC2 resource id will be used for resource releasing.
linux-amd64-ec2-runner-label: ${{ steps.start-linux-amd64-runner.outputs.label }}
linux-amd64-ec2-runner-instance-id: ${{ steps.start-linux-amd64-runner.outputs.ec2-instance-id }}
linux-arm64-ec2-runner-label: ${{ steps.start-linux-arm64-runner.outputs.label }}
linux-arm64-ec2-runner-instance-id: ${{ steps.start-linux-arm64-runner.outputs.ec2-instance-id }}
# The 'version' use as the global tag name of the release workflow.
version: ${{ steps.create-version.outputs.version }}
steps:
- name: Checkout
uses: actions/checkout@v4
with:
fetch-depth: 0
# The create-version will create a global variable named 'version' in the global workflows.
# - If it's a tag push release, the version is the tag name(${{ github.ref_name }});
# - If it's a scheduled release, the version is '${{ env.NEXT_RELEASE_VERSION }}-nightly-$buildTime', like v0.2.0-nigthly-20230313;
# - If it's a manual release, the version is '${{ env.NEXT_RELEASE_VERSION }}-<short-git-sha>-YYYYMMDDSS', like v0.2.0-e5b243c-2023071245;
- name: Create version
id: create-version
run: |
echo "version=$(./.github/scripts/create-version.sh)" >> $GITHUB_OUTPUT
env:
GITHUB_EVENT_NAME: ${{ github.event_name }}
GITHUB_REF_NAME: ${{ github.ref_name }}
NEXT_RELEASE_VERSION: ${{ env.NEXT_RELEASE_VERSION }}
NIGHTLY_RELEASE_PREFIX: ${{ env.NIGHTLY_RELEASE_PREFIX }}
- name: Allocate linux-amd64 runner
if: ${{ inputs.build_linux_amd64_artifacts || github.event_name == 'push' || github.event_name == 'schedule' }}
uses: ./.github/actions/start-runner
id: start-linux-amd64-runner
with:
runner: ${{ inputs.linux_amd64_runner || vars.DEFAULT_AMD64_RUNNER }}
aws-access-key-id: ${{ secrets.AWS_ACCESS_KEY_ID }}
aws-secret-access-key: ${{ secrets.AWS_SECRET_ACCESS_KEY }}
aws-region: ${{ vars.EC2_RUNNER_REGION }}
github-token: ${{ secrets.GH_PERSONAL_ACCESS_TOKEN }}
image-id: ${{ vars.EC2_RUNNER_LINUX_AMD64_IMAGE_ID }}
security-group-id: ${{ vars.EC2_RUNNER_SECURITY_GROUP_ID }}
subnet-id: ${{ vars.EC2_RUNNER_SUBNET_ID }}
- name: Allocate linux-arm64 runner
if: ${{ inputs.build_linux_arm64_artifacts || github.event_name == 'push' || github.event_name == 'schedule' }}
uses: ./.github/actions/start-runner
id: start-linux-arm64-runner
with:
runner: ${{ inputs.linux_arm64_runner || vars.DEFAULT_ARM64_RUNNER }}
aws-access-key-id: ${{ secrets.AWS_ACCESS_KEY_ID }}
aws-secret-access-key: ${{ secrets.AWS_SECRET_ACCESS_KEY }}
aws-region: ${{ vars.EC2_RUNNER_REGION }}
github-token: ${{ secrets.GH_PERSONAL_ACCESS_TOKEN }}
image-id: ${{ vars.EC2_RUNNER_LINUX_ARM64_IMAGE_ID }}
security-group-id: ${{ vars.EC2_RUNNER_SECURITY_GROUP_ID }}
subnet-id: ${{ vars.EC2_RUNNER_SUBNET_ID }}
build-linux-amd64-artifacts:
name: Build linux-amd64 artifacts
if: ${{ inputs.build_linux_amd64_artifacts || github.event_name == 'push' || github.event_name == 'schedule' }}
needs: [
allocate-runners,
]
runs-on: ${{ needs.allocate-runners.outputs.linux-amd64-runner }}
steps:
- uses: actions/checkout@v4
with:
fetch-depth: 0
- uses: ./.github/actions/build-linux-artifacts
with:
arch: amd64
cargo-profile: ${{ env.CARGO_PROFILE }}
version: ${{ needs.allocate-runners.outputs.version }}
disable-run-tests: ${{ env.DISABLE_RUN_TESTS }}
build-linux-arm64-artifacts:
name: Build linux-arm64 artifacts
if: ${{ inputs.build_linux_arm64_artifacts || github.event_name == 'push' || github.event_name == 'schedule' }}
needs: [
allocate-runners,
]
runs-on: ${{ needs.allocate-runners.outputs.linux-arm64-runner }}
steps:
- uses: actions/checkout@v4
with:
fetch-depth: 0
- uses: ./.github/actions/build-linux-artifacts
with:
arch: arm64
cargo-profile: ${{ env.CARGO_PROFILE }}
version: ${{ needs.allocate-runners.outputs.version }}
disable-run-tests: ${{ env.DISABLE_RUN_TESTS }}
build-macos-artifacts:
name: Build macOS artifacts
strategy:
fail-fast: false
matrix:
include:
- os: ${{ needs.allocate-runners.outputs.macos-runner }}
arch: aarch64-apple-darwin
features: servers/dashboard
artifacts-dir-prefix: greptime-darwin-arm64
- os: ${{ needs.allocate-runners.outputs.macos-runner }}
arch: aarch64-apple-darwin
features: pyo3_backend,servers/dashboard
artifacts-dir-prefix: greptime-darwin-arm64-pyo3
- os: ${{ needs.allocate-runners.outputs.macos-runner }}
features: servers/dashboard
arch: x86_64-apple-darwin
artifacts-dir-prefix: greptime-darwin-amd64
- os: ${{ needs.allocate-runners.outputs.macos-runner }}
features: pyo3_backend,servers/dashboard
arch: x86_64-apple-darwin
artifacts-dir-prefix: greptime-darwin-amd64-pyo3
runs-on: ${{ matrix.os }}
outputs:
build-macos-result: ${{ steps.set-build-macos-result.outputs.build-macos-result }}
needs: [
allocate-runners,
]
if: ${{ inputs.build_macos_artifacts || github.event_name == 'push' || github.event_name == 'schedule' }}
steps:
- uses: actions/checkout@v4
with:
fetch-depth: 0
- uses: ./.github/actions/build-macos-artifacts
with:
arch: ${{ matrix.arch }}
rust-toolchain: ${{ env.RUST_TOOLCHAIN }}
cargo-profile: ${{ env.CARGO_PROFILE }}
features: ${{ matrix.features }}
version: ${{ needs.allocate-runners.outputs.version }}
disable-run-tests: ${{ env.DISABLE_RUN_TESTS }}
artifacts-dir: ${{ matrix.artifacts-dir-prefix }}-${{ needs.allocate-runners.outputs.version }}
- name: Set build macos result
id: set-build-macos-result
run: |
echo "build-macos-result=success" >> $GITHUB_OUTPUT
build-windows-artifacts:
name: Build Windows artifacts
strategy:
fail-fast: false
matrix:
include:
- os: ${{ needs.allocate-runners.outputs.windows-runner }}
arch: x86_64-pc-windows-msvc
features: servers/dashboard
artifacts-dir-prefix: greptime-windows-amd64
- os: ${{ needs.allocate-runners.outputs.windows-runner }}
arch: x86_64-pc-windows-msvc
features: pyo3_backend,servers/dashboard
artifacts-dir-prefix: greptime-windows-amd64-pyo3
runs-on: ${{ matrix.os }}
outputs:
build-windows-result: ${{ steps.set-build-windows-result.outputs.build-windows-result }}
needs: [
allocate-runners,
]
if: ${{ inputs.build_windows_artifacts || github.event_name == 'push' || github.event_name == 'schedule' }}
steps:
- run: git config --global core.autocrlf false
- uses: actions/checkout@v4
with:
fetch-depth: 0
- uses: ./.github/actions/build-windows-artifacts
with:
arch: ${{ matrix.arch }}
rust-toolchain: ${{ env.RUST_TOOLCHAIN }}
cargo-profile: ${{ env.CARGO_PROFILE }}
features: ${{ matrix.features }}
version: ${{ needs.allocate-runners.outputs.version }}
disable-run-tests: ${{ env.DISABLE_RUN_TESTS }}
artifacts-dir: ${{ matrix.artifacts-dir-prefix }}-${{ needs.allocate-runners.outputs.version }}
- name: Set build windows result
id: set-build-windows-result
run: |
echo "build-windows-result=success" >> $Env:GITHUB_OUTPUT
release-images-to-dockerhub:
name: Build and push images to DockerHub
if: ${{ inputs.release_images || github.event_name == 'push' || github.event_name == 'schedule' }}
needs: [
allocate-runners,
build-linux-amd64-artifacts,
build-linux-arm64-artifacts,
]
runs-on: ubuntu-2004-16-cores
outputs:
build-image-result: ${{ steps.set-build-image-result.outputs.build-image-result }}
steps:
- uses: actions/checkout@v4
with:
fetch-depth: 0
- name: Build and push images to dockerhub
uses: ./.github/actions/build-images
with:
image-registry: docker.io
image-namespace: ${{ vars.IMAGE_NAMESPACE }}
image-registry-username: ${{ secrets.DOCKERHUB_USERNAME }}
image-registry-password: ${{ secrets.DOCKERHUB_TOKEN }}
version: ${{ needs.allocate-runners.outputs.version }}
- name: Set build image result
id: set-build-image-result
run: |
echo "build-image-result=success" >> $GITHUB_OUTPUT
release-cn-artifacts:
name: Release artifacts to CN region
if: ${{ inputs.release_images || github.event_name == 'push' || github.event_name == 'schedule' }}
needs: [ # The job have to wait for all the artifacts are built.
allocate-runners,
build-linux-amd64-artifacts,
build-linux-arm64-artifacts,
build-macos-artifacts,
build-windows-artifacts,
release-images-to-dockerhub,
]
runs-on: ubuntu-20.04
# When we push to ACR, it's easy to fail due to some unknown network issues.
# However, we don't want to fail the whole workflow because of this.
# The ACR have daily sync with DockerHub, so don't worry about the image not being updated.
continue-on-error: true
steps:
- uses: actions/checkout@v4
with:
fetch-depth: 0
- name: Release artifacts to CN region
uses: ./.github/actions/release-cn-artifacts
with:
src-image-registry: docker.io
src-image-namespace: ${{ vars.IMAGE_NAMESPACE }}
src-image-name: greptimedb
dst-image-registry-username: ${{ secrets.ALICLOUD_USERNAME }}
dst-image-registry-password: ${{ secrets.ALICLOUD_PASSWORD }}
dst-image-registry: ${{ vars.ACR_IMAGE_REGISTRY }}
dst-image-namespace: ${{ vars.IMAGE_NAMESPACE }}
version: ${{ needs.allocate-runners.outputs.version }}
aws-cn-s3-bucket: ${{ vars.AWS_RELEASE_BUCKET }}
aws-cn-access-key-id: ${{ secrets.AWS_CN_ACCESS_KEY_ID }}
aws-cn-secret-access-key: ${{ secrets.AWS_CN_SECRET_ACCESS_KEY }}
aws-cn-region: ${{ vars.AWS_RELEASE_BUCKET_REGION }}
dev-mode: false
update-version-info: true
push-latest-tag: true
publish-github-release:
name: Create GitHub release and upload artifacts
if: ${{ inputs.publish_github_release || github.event_name == 'push' || github.event_name == 'schedule' }}
needs: [ # The job have to wait for all the artifacts are built.
allocate-runners,
build-linux-amd64-artifacts,
build-linux-arm64-artifacts,
build-macos-artifacts,
build-windows-artifacts,
release-images-to-dockerhub,
]
runs-on: ubuntu-20.04
steps:
- uses: actions/checkout@v4
with:
fetch-depth: 0
- name: Publish GitHub release
uses: ./.github/actions/publish-github-release
with:
version: ${{ needs.allocate-runners.outputs.version }}
### Stop runners ###
# It's very necessary to split the job of releasing runners into 'stop-linux-amd64-runner' and 'stop-linux-arm64-runner'.
# Because we can terminate the specified EC2 instance immediately after the job is finished without uncessary waiting.
stop-linux-amd64-runner: # It's always run as the last job in the workflow to make sure that the runner is released.
name: Stop linux-amd64 runner
# Only run this job when the runner is allocated.
if: ${{ always() }}
runs-on: ubuntu-20.04
needs: [
allocate-runners,
build-linux-amd64-artifacts,
]
steps:
- name: Checkout
uses: actions/checkout@v4
with:
fetch-depth: 0
- name: Stop EC2 runner
uses: ./.github/actions/stop-runner
with:
label: ${{ needs.allocate-runners.outputs.linux-amd64-ec2-runner-label }}
ec2-instance-id: ${{ needs.allocate-runners.outputs.linux-amd64-ec2-runner-instance-id }}
aws-access-key-id: ${{ secrets.AWS_ACCESS_KEY_ID }}
aws-secret-access-key: ${{ secrets.AWS_SECRET_ACCESS_KEY }}
aws-region: ${{ vars.EC2_RUNNER_REGION }}
github-token: ${{ secrets.GH_PERSONAL_ACCESS_TOKEN }}
stop-linux-arm64-runner: # It's always run as the last job in the workflow to make sure that the runner is released.
name: Stop linux-arm64 runner
# Only run this job when the runner is allocated.
if: ${{ always() }}
runs-on: ubuntu-20.04
needs: [
allocate-runners,
build-linux-arm64-artifacts,
]
steps:
- name: Checkout
uses: actions/checkout@v4
with:
fetch-depth: 0
- name: Stop EC2 runner
uses: ./.github/actions/stop-runner
with:
label: ${{ needs.allocate-runners.outputs.linux-arm64-ec2-runner-label }}
ec2-instance-id: ${{ needs.allocate-runners.outputs.linux-arm64-ec2-runner-instance-id }}
aws-access-key-id: ${{ secrets.AWS_ACCESS_KEY_ID }}
aws-secret-access-key: ${{ secrets.AWS_SECRET_ACCESS_KEY }}
aws-region: ${{ vars.EC2_RUNNER_REGION }}
github-token: ${{ secrets.GH_PERSONAL_ACCESS_TOKEN }}
notification:
if: ${{ always() }} # Not requiring successful dependent jobs, always run.
name: Send notification to Greptime team
needs: [
release-images-to-dockerhub,
build-macos-artifacts,
build-windows-artifacts,
]
runs-on: ubuntu-20.04
env:
SLACK_WEBHOOK_URL: ${{ secrets.SLACK_WEBHOOK_URL_DEVELOP_CHANNEL }}
steps:
- name: Notifiy release successful result
uses: slackapi/slack-github-action@v1.25.0
if: ${{ needs.release-images-to-dockerhub.outputs.build-image-result == 'success' && needs.build-windows-artifacts.outputs.build-windows-result == 'success' && needs.build-macos-artifacts.outputs.build-macos-result == 'success' }}
with:
payload: |
{"text": "GreptimeDB's release version has completed successfully."}
- name: Notifiy release failed result
uses: slackapi/slack-github-action@v1.25.0
if: ${{ needs.release-images-to-dockerhub.outputs.build-image-result != 'success' || needs.build-windows-artifacts.outputs.build-windows-result != 'success' || needs.build-macos-artifacts.outputs.build-macos-result != 'success' }}
with:
payload: |
{"text": "GreptimeDB's release version has failed, please check 'https://github.com/GreptimeTeam/greptimedb/actions/workflows/release.yml'."}

28
.gitignore vendored
View File

@@ -1,8 +1,6 @@
# Generated by Cargo
# will have compiled files and executables
/target/
# also ignore if it's a symbolic link
/target
# Remove Cargo.lock from gitignore if creating an executable, leave it for libraries
# More information here https://doc.rust-lang.org/cargo/guide/cargo-toml-vs-cargo-lock.html
@@ -20,33 +18,13 @@ debug/
# JetBrains IDE config directory
.idea/
*.iml
# VSCode IDE config directory
.vscode/
# Logs
**/__unittest_logs
logs/
.DS_store
.gitignore
# cpython's generated python byte code
**/__pycache__/
# Benchmark dataset
benchmarks/data
# dotenv
.env
# dashboard files
!/src/servers/dashboard/VERSION
/src/servers/dashboard/*
# Vscode workspace
*.code-workspace
venv/
# Fuzz tests
tests-fuzz/artifacts/
tests-fuzz/corpus/

View File

@@ -5,11 +5,11 @@ repos:
- id: conventional-pre-commit
stages: [commit-msg]
# - repo: https://github.com/DevinR528/cargo-sort
# rev: e6a795bc6b2c0958f9ef52af4863bbd7cc17238f
# hooks:
# - id: cargo-sort
# args: ["--workspace"]
- repo: https://github.com/DevinR528/cargo-sort
rev: e6a795bc6b2c0958f9ef52af4863bbd7cc17238f
hooks:
- id: cargo-sort
args: ["--workspace"]
- repo: https://github.com/doublify/pre-commit-rust
rev: v1.0

View File

@@ -1,132 +0,0 @@
# Contributor Covenant Code of Conduct
## Our Pledge
We as members, contributors, and leaders pledge to make participation in our
community a harassment-free experience for everyone, regardless of age, body
size, visible or invisible disability, ethnicity, sex characteristics, gender
identity and expression, level of experience, education, socio-economic status,
nationality, personal appearance, race, caste, color, religion, or sexual
identity and orientation.
We pledge to act and interact in ways that contribute to an open, welcoming,
diverse, inclusive, and healthy community.
## Our Standards
Examples of behavior that contributes to a positive environment for our
community include:
* Demonstrating empathy and kindness toward other people
* Being respectful of differing opinions, viewpoints, and experiences
* Giving and gracefully accepting constructive feedback
* Accepting responsibility and apologizing to those affected by our mistakes,
and learning from the experience
* Focusing on what is best not just for us as individuals, but for the overall
community
Examples of unacceptable behavior include:
* The use of sexualized language or imagery, and sexual attention or advances of
any kind
* Trolling, insulting or derogatory comments, and personal or political attacks
* Public or private harassment
* Publishing others' private information, such as a physical or email address,
without their explicit permission
* Other conduct which could reasonably be considered inappropriate in a
professional setting
## Enforcement Responsibilities
Community leaders are responsible for clarifying and enforcing our standards of
acceptable behavior and will take appropriate and fair corrective action in
response to any behavior that they deem inappropriate, threatening, offensive,
or harmful.
Community leaders have the right and responsibility to remove, edit, or reject
comments, commits, code, wiki edits, issues, and other contributions that are
not aligned to this Code of Conduct, and will communicate reasons for moderation
decisions when appropriate.
## Scope
This Code of Conduct applies within all community spaces, and also applies when
an individual is officially representing the community in public spaces.
Examples of representing our community include using an official e-mail address,
posting via an official social media account, or acting as an appointed
representative at an online or offline event.
## Enforcement
Instances of abusive, harassing, or otherwise unacceptable behavior may be
reported to the community leaders responsible for enforcement at
info@greptime.com.
All complaints will be reviewed and investigated promptly and fairly.
All community leaders are obligated to respect the privacy and security of the
reporter of any incident.
## Enforcement Guidelines
Community leaders will follow these Community Impact Guidelines in determining
the consequences for any action they deem in violation of this Code of Conduct:
### 1. Correction
**Community Impact**: Use of inappropriate language or other behavior deemed
unprofessional or unwelcome in the community.
**Consequence**: A private, written warning from community leaders, providing
clarity around the nature of the violation and an explanation of why the
behavior was inappropriate. A public apology may be requested.
### 2. Warning
**Community Impact**: A violation through a single incident or series of
actions.
**Consequence**: A warning with consequences for continued behavior. No
interaction with the people involved, including unsolicited interaction with
those enforcing the Code of Conduct, for a specified period of time. This
includes avoiding interactions in community spaces as well as external channels
like social media. Violating these terms may lead to a temporary or permanent
ban.
### 3. Temporary Ban
**Community Impact**: A serious violation of community standards, including
sustained inappropriate behavior.
**Consequence**: A temporary ban from any sort of interaction or public
communication with the community for a specified period of time. No public or
private interaction with the people involved, including unsolicited interaction
with those enforcing the Code of Conduct, is allowed during this period.
Violating these terms may lead to a permanent ban.
### 4. Permanent Ban
**Community Impact**: Demonstrating a pattern of violation of community
standards, including sustained inappropriate behavior, harassment of an
individual, or aggression toward or disparagement of classes of individuals.
**Consequence**: A permanent ban from any sort of public interaction within the
community.
## Attribution
This Code of Conduct is adapted from the [Contributor Covenant][homepage],
version 2.1, available at
[https://www.contributor-covenant.org/version/2/1/code_of_conduct.html][v2.1].
Community Impact Guidelines were inspired by
[Mozilla's code of conduct enforcement ladder][Mozilla CoC].
For answers to common questions about this code of conduct, see the FAQ at
[https://www.contributor-covenant.org/faq][FAQ]. Translations are available at
[https://www.contributor-covenant.org/translations][translations].
[homepage]: https://www.contributor-covenant.org
[v2.1]: https://www.contributor-covenant.org/version/2/1/code_of_conduct.html
[Mozilla CoC]: https://github.com/mozilla/diversity
[FAQ]: https://www.contributor-covenant.org/faq
[translations]: https://www.contributor-covenant.org/translations

View File

@@ -1,92 +1,23 @@
# Welcome 👋
# Contributing to GreptimeDB
Thanks a lot for considering contributing to GreptimeDB. We believe people like you would make GreptimeDB a great product. We intend to build a community where individuals can have open talks, show respect for one another, and speak with true ❤️. Meanwhile, we are to keep transparency and make your effort count here.
Much appreciate for your interest in contributing to GreptimeDB! This document list some guidelines for contributing to our code base.
Please read the guidelines, and they can help you get started. Communicate with respect to developers maintaining and developing the project. In return, they should reciprocate that respect by addressing your issue, reviewing changes, as well as helping finalize and merge your pull requests.
To learn about the design of GreptimeDB, please refer to the [design docs](https://github.com/GrepTimeTeam/docs).
Follow our [README](https://github.com/GreptimeTeam/greptimedb#readme) to get the whole picture of the project. To learn about the design of GreptimeDB, please refer to the [design docs](https://github.com/GrepTimeTeam/docs).
## Your First Contribution
It can feel intimidating to contribute to a complex project, but it can also be exciting and fun. These general notes will help everyone participate in this communal activity.
- Follow the [Code of Conduct](https://github.com/GreptimeTeam/greptimedb/blob/main/CODE_OF_CONDUCT.md)
- Small changes make huge differences. We will happily accept a PR making a single character change if it helps move forward. Don't wait to have everything working.
- Check the closed issues before opening your issue.
- Try to follow the existing style of the code.
- More importantly, when in doubt, ask away.
Pull requests are great, but we accept all kinds of other help if you like. Such as
- Write tutorials or blog posts. Blog, speak about, or create tutorials about one of GreptimeDB's many features. Mention [@greptime](https://twitter.com/greptime) on Twitter and email info@greptime.com so we can give pointers and tips and help you spread the word by promoting your content on Greptime communication channels.
- Improve the documentation. [Submit documentation](http://github.com/greptimeTeam/docs/) updates, enhancements, designs, or bug fixes, and fixing any spelling or grammar errors will be very much appreciated.
- Present at meetups and conferences about your GreptimeDB projects. Your unique challenges and successes in building things with GreptimeDB can provide great speaking material. We'd love to review your talk abstract, so get in touch with us if you'd like some help!
- Submitting bug reports. To report a bug or a security issue, you can [open a new GitHub issue](https://github.com/GrepTimeTeam/greptimedb/issues/new).
- Speak up feature requests. Send feedback is a great way for us to understand your different use cases of GreptimeDB better. If you want to share your experience with GreptimeDB, or if you want to discuss any ideas, you can start a discussion on [GitHub discussions](https://github.com/GreptimeTeam/greptimedb/discussions), chat with the Greptime team on [Slack](https://greptime.com/slack), or you can tweet [@greptime](https://twitter.com/greptime) on Twitter.
## Code of Conduct
Also, there are things that we are not looking for because they don't match the goals of the product or benefit the community. Please read [Code of Conduct](https://github.com/GreptimeTeam/greptimedb/blob/main/CODE_OF_CONDUCT.md); we hope everyone can keep good manners and become an honored member.
## License
GreptimeDB uses the [Apache 2.0 license](https://github.com/GreptimeTeam/greptimedb/blob/master/LICENSE) to strike a balance between open contributions and allowing you to use the software however you want.
## Getting Started
### Submitting Issues
- Check if an issue already exists. Before filing an issue report, see whether it's already covered. Use the search bar and check out existing issues.
- File an issue:
- To report a bug, a security issue, or anything that you think is a problem and that isn't under the radar, go ahead and [open a new GitHub issue](https://github.com/GrepTimeTeam/greptimedb/issues/new).
- In the given templates, look for the one that suits you.
- If you bump into anything, reach out to our [Slack](https://greptime.com/slack) for a wider audience and ask for help.
- What happens after:
- Once we spot a new issue, we identify and categorize it as soon as possible.
- Usually, it gets assigned to other developers. Follow up and see what folks are talking about and how they take care of it.
- Please be patient and offer as much information as you can to help reach a solution or a consensus. You are not alone and embrace team power.
## Pull Requests
### Before PR
- To ensure that community is free and confident in its ability to use your contributions, please sign the Contributor License Agreement (CLA) which will be incorporated in the pull request process.
- Make sure all files have proper license header (running `docker run --rm -v $(pwd):/github/workspace ghcr.io/korandoru/hawkeye-native:v3 format` from the project root).
- Make sure all your codes are formatted and follow the [coding style](https://pingcap.github.io/style-guide/rust/).
- Make sure all unit tests are passed (using `cargo test --workspace` or [nextest](https://nexte.st/index.html) `cargo nextest run`).
- Make sure all clippy warnings are fixed (you can check it locally by running `cargo clippy --workspace --all-targets -- -D warnings`).
#### `pre-commit` Hooks
You could setup the [`pre-commit`](https://pre-commit.com/#plugins) hooks to run these checks on every commit automatically.
1. Install `pre-commit`
pip install pre-commit
or
brew install pre-commit
2. Install the `pre-commit` hooks
$ pre-commit install
pre-commit installed at .git/hooks/pre-commit
$ pre-commit install --hook-type commit-msg
pre-commit installed at .git/hooks/commit-msg
$ pre-commit install --hook-type pre-push
pre-commit installed at .git/hooks/pre-push
Now, `pre-commit` will run automatically on `git commit`.
- Make sure all unit tests are passed.
- Make sure all clippy warnings are fixed (you can check it locally by running `cargo clippy --workspace --all-targets -- -D warnings -D clippy::print_stdout -D clippy::print_stderr`).
### Title
The titles of pull requests should be prefixed with category names listed in [Conventional Commits specification](https://www.conventionalcommits.org/en/v1.0.0)
like `feat`/`fix`/`docs`, with a concise summary of code change following. AVOID using the last commit message as pull request title.
The titles of pull requests should be prefixed with category name listed in [Conventional Commits specification](https://www.conventionalcommits.org/en/v1.0.0)
like `feat`/`fix`/`doc`, with a concise summary of code change follows. DO NOT use last commit message as pull request title.
### Description
- Feel free to go brief if your pull request is small, like a typo fix.
- If your pull request is small, like a typo fix, feel free to go brief.
- But if it contains large code change, make sure to state the motivation/design details of this PR so that reviewers can understand what you're trying to do.
- If the PR contains any breaking change or API change, make sure that is clearly listed in your description.
@@ -94,20 +25,11 @@ like `feat`/`fix`/`docs`, with a concise summary of code change following. AVOID
All commit messages SHOULD adhere to the [Conventional Commits specification](https://conventionalcommits.org/).
## Getting Help
## Getting help
There are many ways to get help when you're stuck. It is recommended to ask for help by opening an issue, with a detailed description
of what you were trying to do and what went wrong. You can also reach for help in our [Slack channel](https://greptime.com/slack).
of what you were trying to do and what went wrong. You can also reach for help in our Slack channel.
## Community
The core team will be thrilled if you would like to participate in any way you like. When you are stuck, try to ask for help by filing an issue, with a detailed description of what you were trying to do and what went wrong. If you have any questions or if you would like to get involved in our community, please check out:
- [GreptimeDB Community Slack](https://greptime.com/slack)
- [GreptimeDB Github Discussions](https://github.com/GreptimeTeam/greptimedb/discussions)
Also, see some extra GreptimeDB content:
- [GreptimeDB Docs](https://docs.greptime.com/)
- [Learn GreptimeDB](https://greptime.com/product/db)
- [Greptime Inc. Website](https://greptime.com)
## Bug report
To report a bug or a security issue, you can [open a new GitHub issue](https://github.com/GrepTimeTeam/greptimedb/issues/new).

9806
Cargo.lock generated

File diff suppressed because it is too large Load Diff

View File

@@ -1,222 +1,35 @@
[workspace]
members = [
"benchmarks",
"src/api",
"src/auth",
"src/catalog",
"src/client",
"src/cmd",
"src/common/base",
"src/common/catalog",
"src/common/config",
"src/common/datasource",
"src/common/error",
"src/common/function",
"src/common/macro",
"src/common/greptimedb-telemetry",
"src/common/function-macro",
"src/common/grpc",
"src/common/grpc-expr",
"src/common/mem-prof",
"src/common/meta",
"src/common/plugins",
"src/common/procedure",
"src/common/procedure-test",
"src/common/query",
"src/common/recordbatch",
"src/common/runtime",
"src/common/substrait",
"src/common/telemetry",
"src/common/test-util",
"src/common/time",
"src/common/decimal",
"src/common/version",
"src/common/wal",
"src/datanode",
"src/datatypes",
"src/file-engine",
"src/flow",
"src/frontend",
"src/log-store",
"src/meta-client",
"src/meta-srv",
"src/metric-engine",
"src/mito2",
"src/logical-plans",
"src/object-store",
"src/operator",
"src/partition",
"src/plugins",
"src/promql",
"src/puffin",
"src/query",
"src/script",
"src/servers",
"src/session",
"src/sql",
"src/storage",
"src/store-api",
"src/table",
"src/index",
"tests-fuzz",
"tests-integration",
"tests/runner",
"src/table-engine",
"test-util",
]
resolver = "2"
[workspace.package]
version = "0.7.1"
edition = "2021"
license = "Apache-2.0"
[workspace.lints]
clippy.print_stdout = "warn"
clippy.print_stderr = "warn"
clippy.implicit_clone = "warn"
rust.unknown_lints = "deny"
[workspace.dependencies]
ahash = { version = "0.8", features = ["compile-time-rng"] }
aquamarine = "0.3"
arrow = { version = "47.0" }
arrow-array = "47.0"
arrow-flight = "47.0"
arrow-ipc = { version = "47.0", features = ["lz4"] }
arrow-schema = { version = "47.0", features = ["serde"] }
async-stream = "0.3"
async-trait = "0.1"
axum = { version = "0.6", features = ["headers"] }
base64 = "0.21"
bigdecimal = "0.4.2"
bitflags = "2.4.1"
bytemuck = "1.12"
bytes = { version = "1.5", features = ["serde"] }
chrono = { version = "0.4", features = ["serde"] }
clap = { version = "4.4", features = ["derive"] }
dashmap = "5.4"
datafusion = { git = "https://github.com/apache/arrow-datafusion.git", rev = "26e43acac3a96cec8dd4c8365f22dfb1a84306e9" }
datafusion-common = { git = "https://github.com/apache/arrow-datafusion.git", rev = "26e43acac3a96cec8dd4c8365f22dfb1a84306e9" }
datafusion-expr = { git = "https://github.com/apache/arrow-datafusion.git", rev = "26e43acac3a96cec8dd4c8365f22dfb1a84306e9" }
datafusion-optimizer = { git = "https://github.com/apache/arrow-datafusion.git", rev = "26e43acac3a96cec8dd4c8365f22dfb1a84306e9" }
datafusion-physical-expr = { git = "https://github.com/apache/arrow-datafusion.git", rev = "26e43acac3a96cec8dd4c8365f22dfb1a84306e9" }
datafusion-sql = { git = "https://github.com/apache/arrow-datafusion.git", rev = "26e43acac3a96cec8dd4c8365f22dfb1a84306e9" }
datafusion-substrait = { git = "https://github.com/apache/arrow-datafusion.git", rev = "26e43acac3a96cec8dd4c8365f22dfb1a84306e9" }
derive_builder = "0.12"
etcd-client = "0.12"
fst = "0.4.7"
futures = "0.3"
futures-util = "0.3"
greptime-proto = { git = "https://github.com/GreptimeTeam/greptime-proto.git", rev = "96f1f0404f421ee560a4310c73c5071e49168168" }
humantime-serde = "1.1"
itertools = "0.10"
lazy_static = "1.4"
meter-core = { git = "https://github.com/GreptimeTeam/greptime-meter.git", rev = "80b72716dcde47ec4161478416a5c6c21343364d" }
mockall = "0.11.4"
moka = "0.12"
num_cpus = "1.16"
once_cell = "1.18"
opentelemetry-proto = { git = "https://github.com/waynexia/opentelemetry-rust.git", rev = "33841b38dda79b15f2024952be5f32533325ca02", features = [
"gen-tonic",
"metrics",
"trace",
] }
parquet = "47.0"
paste = "1.0"
pin-project = "1.0"
prometheus = { version = "0.13.3", features = ["process"] }
prost = "0.12"
raft-engine = { version = "0.4.1", default-features = false }
rand = "0.8"
regex = "1.8"
regex-automata = { version = "0.2", features = ["transducer"] }
reqwest = { version = "0.11", default-features = false, features = [
"json",
"rustls-tls-native-roots",
"stream",
] }
rskafka = "0.5"
rust_decimal = "1.33"
serde = { version = "1.0", features = ["derive"] }
serde_json = { version = "1.0", features = ["float_roundtrip"] }
serde_with = "3"
smallvec = { version = "1", features = ["serde"] }
snafu = "0.7"
sysinfo = "0.30"
# on branch v0.38.x
sqlparser = { git = "https://github.com/GreptimeTeam/sqlparser-rs.git", rev = "6a93567ae38d42be5c8d08b13c8ff4dde26502ef", features = [
"visitor",
] }
strum = { version = "0.25", features = ["derive"] }
tempfile = "3"
tokio = { version = "1.28", features = ["full"] }
tokio-stream = { version = "0.1" }
tokio-util = { version = "0.7", features = ["io-util", "compat"] }
toml = "0.8.8"
tonic = { version = "0.10", features = ["tls"] }
uuid = { version = "1", features = ["serde", "v4", "fast-rng"] }
## workspaces members
api = { path = "src/api" }
auth = { path = "src/auth" }
catalog = { path = "src/catalog" }
client = { path = "src/client" }
cmd = { path = "src/cmd" }
common-base = { path = "src/common/base" }
common-catalog = { path = "src/common/catalog" }
common-config = { path = "src/common/config" }
common-datasource = { path = "src/common/datasource" }
common-decimal = { path = "src/common/decimal" }
common-error = { path = "src/common/error" }
common-function = { path = "src/common/function" }
common-greptimedb-telemetry = { path = "src/common/greptimedb-telemetry" }
common-grpc = { path = "src/common/grpc" }
common-grpc-expr = { path = "src/common/grpc-expr" }
common-macro = { path = "src/common/macro" }
common-mem-prof = { path = "src/common/mem-prof" }
common-meta = { path = "src/common/meta" }
common-plugins = { path = "src/common/plugins" }
common-procedure = { path = "src/common/procedure" }
common-procedure-test = { path = "src/common/procedure-test" }
common-query = { path = "src/common/query" }
common-recordbatch = { path = "src/common/recordbatch" }
common-runtime = { path = "src/common/runtime" }
common-telemetry = { path = "src/common/telemetry" }
common-test-util = { path = "src/common/test-util" }
common-time = { path = "src/common/time" }
common-version = { path = "src/common/version" }
common-wal = { path = "src/common/wal" }
datanode = { path = "src/datanode" }
datatypes = { path = "src/datatypes" }
file-engine = { path = "src/file-engine" }
frontend = { path = "src/frontend" }
index = { path = "src/index" }
log-store = { path = "src/log-store" }
meta-client = { path = "src/meta-client" }
meta-srv = { path = "src/meta-srv" }
metric-engine = { path = "src/metric-engine" }
mito2 = { path = "src/mito2" }
object-store = { path = "src/object-store" }
operator = { path = "src/operator" }
partition = { path = "src/partition" }
plugins = { path = "src/plugins" }
promql = { path = "src/promql" }
puffin = { path = "src/puffin" }
query = { path = "src/query" }
script = { path = "src/script" }
servers = { path = "src/servers" }
session = { path = "src/session" }
sql = { path = "src/sql" }
store-api = { path = "src/store-api" }
substrait = { path = "src/common/substrait" }
table = { path = "src/table" }
[workspace.dependencies.meter-macros]
git = "https://github.com/GreptimeTeam/greptime-meter.git"
rev = "80b72716dcde47ec4161478416a5c6c21343364d"
[profile.release]
debug = 1
[profile.nightly]
inherits = "release"
strip = true
lto = "thin"
debug = false
incremental = false
[patch.crates-io]
sqlparser = { git = "https://github.com/sunng87/sqlparser-rs.git", branch = "feature/argument-for-custom-type-for-v015" }

View File

@@ -1,7 +0,0 @@
[build]
pre-build = [
"dpkg --add-architecture $CROSS_DEB_ARCH",
"apt update && apt install -y unzip zlib1g-dev zlib1g-dev:$CROSS_DEB_ARCH",
"curl -LO https://github.com/protocolbuffers/protobuf/releases/download/v3.15.8/protoc-3.15.8-linux-x86_64.zip && unzip protoc-3.15.8-linux-x86_64.zip -d /usr/",
"chmod a+x /usr/bin/protoc && chmod -R a+rx /usr/include/google",
]

201
LICENSE
View File

@@ -1,201 +0,0 @@
Apache License
Version 2.0, January 2004
http://www.apache.org/licenses/
TERMS AND CONDITIONS FOR USE, REPRODUCTION, AND DISTRIBUTION
1. Definitions.
"License" shall mean the terms and conditions for use, reproduction,
and distribution as defined by Sections 1 through 9 of this document.
"Licensor" shall mean the copyright owner or entity authorized by
the copyright owner that is granting the License.
"Legal Entity" shall mean the union of the acting entity and all
other entities that control, are controlled by, or are under common
control with that entity. For the purposes of this definition,
"control" means (i) the power, direct or indirect, to cause the
direction or management of such entity, whether by contract or
otherwise, or (ii) ownership of fifty percent (50%) or more of the
outstanding shares, or (iii) beneficial ownership of such entity.
"You" (or "Your") shall mean an individual or Legal Entity
exercising permissions granted by this License.
"Source" form shall mean the preferred form for making modifications,
including but not limited to software source code, documentation
source, and configuration files.
"Object" form shall mean any form resulting from mechanical
transformation or translation of a Source form, including but
not limited to compiled object code, generated documentation,
and conversions to other media types.
"Work" shall mean the work of authorship, whether in Source or
Object form, made available under the License, as indicated by a
copyright notice that is included in or attached to the work
(an example is provided in the Appendix below).
"Derivative Works" shall mean any work, whether in Source or Object
form, that is based on (or derived from) the Work and for which the
editorial revisions, annotations, elaborations, or other modifications
represent, as a whole, an original work of authorship. For the purposes
of this License, Derivative Works shall not include works that remain
separable from, or merely link (or bind by name) to the interfaces of,
the Work and Derivative Works thereof.
"Contribution" shall mean any work of authorship, including
the original version of the Work and any modifications or additions
to that Work or Derivative Works thereof, that is intentionally
submitted to Licensor for inclusion in the Work by the copyright owner
or by an individual or Legal Entity authorized to submit on behalf of
the copyright owner. For the purposes of this definition, "submitted"
means any form of electronic, verbal, or written communication sent
to the Licensor or its representatives, including but not limited to
communication on electronic mailing lists, source code control systems,
and issue tracking systems that are managed by, or on behalf of, the
Licensor for the purpose of discussing and improving the Work, but
excluding communication that is conspicuously marked or otherwise
designated in writing by the copyright owner as "Not a Contribution."
"Contributor" shall mean Licensor and any individual or Legal Entity
on behalf of whom a Contribution has been received by Licensor and
subsequently incorporated within the Work.
2. Grant of Copyright License. Subject to the terms and conditions of
this License, each Contributor hereby grants to You a perpetual,
worldwide, non-exclusive, no-charge, royalty-free, irrevocable
copyright license to reproduce, prepare Derivative Works of,
publicly display, publicly perform, sublicense, and distribute the
Work and such Derivative Works in Source or Object form.
3. Grant of Patent License. Subject to the terms and conditions of
this License, each Contributor hereby grants to You a perpetual,
worldwide, non-exclusive, no-charge, royalty-free, irrevocable
(except as stated in this section) patent license to make, have made,
use, offer to sell, sell, import, and otherwise transfer the Work,
where such license applies only to those patent claims licensable
by such Contributor that are necessarily infringed by their
Contribution(s) alone or by combination of their Contribution(s)
with the Work to which such Contribution(s) was submitted. If You
institute patent litigation against any entity (including a
cross-claim or counterclaim in a lawsuit) alleging that the Work
or a Contribution incorporated within the Work constitutes direct
or contributory patent infringement, then any patent licenses
granted to You under this License for that Work shall terminate
as of the date such litigation is filed.
4. Redistribution. You may reproduce and distribute copies of the
Work or Derivative Works thereof in any medium, with or without
modifications, and in Source or Object form, provided that You
meet the following conditions:
(a) You must give any other recipients of the Work or
Derivative Works a copy of this License; and
(b) You must cause any modified files to carry prominent notices
stating that You changed the files; and
(c) You must retain, in the Source form of any Derivative Works
that You distribute, all copyright, patent, trademark, and
attribution notices from the Source form of the Work,
excluding those notices that do not pertain to any part of
the Derivative Works; and
(d) If the Work includes a "NOTICE" text file as part of its
distribution, then any Derivative Works that You distribute must
include a readable copy of the attribution notices contained
within such NOTICE file, excluding those notices that do not
pertain to any part of the Derivative Works, in at least one
of the following places: within a NOTICE text file distributed
as part of the Derivative Works; within the Source form or
documentation, if provided along with the Derivative Works; or,
within a display generated by the Derivative Works, if and
wherever such third-party notices normally appear. The contents
of the NOTICE file are for informational purposes only and
do not modify the License. You may add Your own attribution
notices within Derivative Works that You distribute, alongside
or as an addendum to the NOTICE text from the Work, provided
that such additional attribution notices cannot be construed
as modifying the License.
You may add Your own copyright statement to Your modifications and
may provide additional or different license terms and conditions
for use, reproduction, or distribution of Your modifications, or
for any such Derivative Works as a whole, provided Your use,
reproduction, and distribution of the Work otherwise complies with
the conditions stated in this License.
5. Submission of Contributions. Unless You explicitly state otherwise,
any Contribution intentionally submitted for inclusion in the Work
by You to the Licensor shall be under the terms and conditions of
this License, without any additional terms or conditions.
Notwithstanding the above, nothing herein shall supersede or modify
the terms of any separate license agreement you may have executed
with Licensor regarding such Contributions.
6. Trademarks. This License does not grant permission to use the trade
names, trademarks, service marks, or product names of the Licensor,
except as required for reasonable and customary use in describing the
origin of the Work and reproducing the content of the NOTICE file.
7. Disclaimer of Warranty. Unless required by applicable law or
agreed to in writing, Licensor provides the Work (and each
Contributor provides its Contributions) on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or
implied, including, without limitation, any warranties or conditions
of TITLE, NON-INFRINGEMENT, MERCHANTABILITY, or FITNESS FOR A
PARTICULAR PURPOSE. You are solely responsible for determining the
appropriateness of using or redistributing the Work and assume any
risks associated with Your exercise of permissions under this License.
8. Limitation of Liability. In no event and under no legal theory,
whether in tort (including negligence), contract, or otherwise,
unless required by applicable law (such as deliberate and grossly
negligent acts) or agreed to in writing, shall any Contributor be
liable to You for damages, including any direct, indirect, special,
incidental, or consequential damages of any character arising as a
result of this License or out of the use or inability to use the
Work (including but not limited to damages for loss of goodwill,
work stoppage, computer failure or malfunction, or any and all
other commercial damages or losses), even if such Contributor
has been advised of the possibility of such damages.
9. Accepting Warranty or Additional Liability. While redistributing
the Work or Derivative Works thereof, You may choose to offer,
and charge a fee for, acceptance of support, warranty, indemnity,
or other liability obligations and/or rights consistent with this
License. However, in accepting such obligations, You may act only
on Your own behalf and on Your sole responsibility, not on behalf
of any other Contributor, and only if You agree to indemnify,
defend, and hold each Contributor harmless for any liability
incurred by, or claims asserted against, such Contributor by reason
of your accepting any such warranty or additional liability.
END OF TERMS AND CONDITIONS
APPENDIX: How to apply the Apache License to your work.
To apply the Apache License to your work, attach the following
boilerplate notice, with the fields enclosed by brackets "[]"
replaced with your own identifying information. (Don't include
the brackets!) The text should be enclosed in the appropriate
comment syntax for the file format. We also recommend that a
file or class name and description of purpose be included on the
same "printed page" as the copyright notice for easier
identification within third-party archives.
Copyright [yyyy] [name of copyright owner]
Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.

206
Makefile
View File

@@ -1,206 +0,0 @@
# The arguments for building images.
CARGO_PROFILE ?=
FEATURES ?=
TARGET_DIR ?=
TARGET ?=
BUILD_BIN ?= greptime
CARGO_BUILD_OPTS := --locked
IMAGE_REGISTRY ?= docker.io
IMAGE_NAMESPACE ?= greptime
IMAGE_TAG ?= latest
BUILDX_MULTI_PLATFORM_BUILD ?= false
BUILDX_BUILDER_NAME ?= gtbuilder
BASE_IMAGE ?= ubuntu
RUST_TOOLCHAIN ?= $(shell cat rust-toolchain.toml | grep channel | cut -d'"' -f2)
CARGO_REGISTRY_CACHE ?= ${HOME}/.cargo/registry
ARCH := $(shell uname -m | sed 's/x86_64/amd64/' | sed 's/aarch64/arm64/')
OUTPUT_DIR := $(shell if [ "$(RELEASE)" = "true" ]; then echo "release"; elif [ ! -z "$(CARGO_PROFILE)" ]; then echo "$(CARGO_PROFILE)" ; else echo "debug"; fi)
# The arguments for running integration tests.
ETCD_VERSION ?= v3.5.9
ETCD_IMAGE ?= quay.io/coreos/etcd:${ETCD_VERSION}
RETRY_COUNT ?= 3
NEXTEST_OPTS := --retries ${RETRY_COUNT}
BUILD_JOBS ?= $(shell which nproc 1>/dev/null && expr $$(nproc) / 2) # If nproc is not available, we don't set the build jobs.
ifeq ($(BUILD_JOBS), 0) # If the number of cores is less than 2, set the build jobs to 1.
BUILD_JOBS := 1
endif
ifneq ($(strip $(BUILD_JOBS)),)
NEXTEST_OPTS += --build-jobs=${BUILD_JOBS}
endif
ifneq ($(strip $(CARGO_PROFILE)),)
CARGO_BUILD_OPTS += --profile ${CARGO_PROFILE}
endif
ifneq ($(strip $(FEATURES)),)
CARGO_BUILD_OPTS += --features ${FEATURES}
endif
ifneq ($(strip $(TARGET_DIR)),)
CARGO_BUILD_OPTS += --target-dir ${TARGET_DIR}
endif
ifneq ($(strip $(TARGET)),)
CARGO_BUILD_OPTS += --target ${TARGET}
endif
ifneq ($(strip $(BUILD_BIN)),)
CARGO_BUILD_OPTS += --bin ${BUILD_BIN}
endif
ifneq ($(strip $(RELEASE)),)
CARGO_BUILD_OPTS += --release
endif
ifeq ($(BUILDX_MULTI_PLATFORM_BUILD), true)
BUILDX_MULTI_PLATFORM_BUILD_OPTS := --platform linux/amd64,linux/arm64 --push
else
BUILDX_MULTI_PLATFORM_BUILD_OPTS := -o type=docker
endif
ifneq ($(strip $(CARGO_BUILD_EXTRA_OPTS)),)
CARGO_BUILD_OPTS += ${CARGO_BUILD_EXTRA_OPTS}
endif
##@ Build
.PHONY: build
build: ## Build debug version greptime.
cargo ${CARGO_EXTENSION} build ${CARGO_BUILD_OPTS}
.PHONY: build-by-dev-builder
build-by-dev-builder: ## Build greptime by dev-builder.
docker run --network=host \
-v ${PWD}:/greptimedb -v ${CARGO_REGISTRY_CACHE}:/root/.cargo/registry \
-w /greptimedb ${IMAGE_REGISTRY}/${IMAGE_NAMESPACE}/dev-builder-${BASE_IMAGE}:latest \
make build \
CARGO_EXTENSION="${CARGO_EXTENSION}" \
CARGO_PROFILE=${CARGO_PROFILE} \
FEATURES=${FEATURES} \
TARGET_DIR=${TARGET_DIR} \
TARGET=${TARGET} \
RELEASE=${RELEASE} \
CARGO_BUILD_EXTRA_OPTS="${CARGO_BUILD_EXTRA_OPTS}"
.PHONY: build-android-bin
build-android-bin: ## Build greptime binary for android.
docker run --network=host \
-v ${PWD}:/greptimedb -v ${CARGO_REGISTRY_CACHE}:/root/.cargo/registry \
-w /greptimedb ${IMAGE_REGISTRY}/${IMAGE_NAMESPACE}/dev-builder-android:latest \
make build \
CARGO_EXTENSION="ndk --platform 23 -t aarch64-linux-android" \
CARGO_PROFILE=release \
FEATURES="${FEATURES}" \
TARGET_DIR="${TARGET_DIR}" \
TARGET="${TARGET}" \
RELEASE="${RELEASE}" \
CARGO_BUILD_EXTRA_OPTS="--bin greptime --no-default-features"
.PHONY: strip-android-bin
strip-android-bin: build-android-bin ## Strip greptime binary for android.
docker run --network=host \
-v ${PWD}:/greptimedb \
-w /greptimedb ${IMAGE_REGISTRY}/${IMAGE_NAMESPACE}/dev-builder-android:latest \
bash -c '$${NDK_ROOT}/toolchains/llvm/prebuilt/linux-x86_64/bin/llvm-strip /greptimedb/target/aarch64-linux-android/release/greptime'
.PHONY: clean
clean: ## Clean the project.
cargo clean
.PHONY: fmt
fmt: ## Format all the Rust code.
cargo fmt --all
.PHONY: fmt-toml
fmt-toml: ## Format all TOML files.
taplo format
.PHONY: check-toml
check-toml: ## Check all TOML files.
taplo format --check
.PHONY: docker-image
docker-image: build-by-dev-builder ## Build docker image.
mkdir -p ${ARCH} && \
cp ./target/${OUTPUT_DIR}/greptime ${ARCH}/greptime && \
docker build -f docker/ci/${BASE_IMAGE}/Dockerfile -t ${IMAGE_REGISTRY}/${IMAGE_NAMESPACE}/greptimedb:${IMAGE_TAG} . && \
rm -r ${ARCH}
.PHONY: docker-image-buildx
docker-image-buildx: multi-platform-buildx ## Build docker image by buildx.
docker buildx build --builder ${BUILDX_BUILDER_NAME} \
--build-arg="CARGO_PROFILE=${CARGO_PROFILE}" \
--build-arg="FEATURES=${FEATURES}" \
--build-arg="OUTPUT_DIR=${OUTPUT_DIR}" \
-f docker/buildx/${BASE_IMAGE}/Dockerfile \
-t ${IMAGE_REGISTRY}/${IMAGE_NAMESPACE}/greptimedb:${IMAGE_TAG} ${BUILDX_MULTI_PLATFORM_BUILD_OPTS} .
.PHONY: dev-builder
dev-builder: multi-platform-buildx ## Build dev-builder image.
docker buildx build --builder ${BUILDX_BUILDER_NAME} \
--build-arg="RUST_TOOLCHAIN=${RUST_TOOLCHAIN}" \
-f docker/dev-builder/${BASE_IMAGE}/Dockerfile \
-t ${IMAGE_REGISTRY}/${IMAGE_NAMESPACE}/dev-builder-${BASE_IMAGE}:${IMAGE_TAG} ${BUILDX_MULTI_PLATFORM_BUILD_OPTS} .
.PHONY: multi-platform-buildx
multi-platform-buildx: ## Create buildx multi-platform builder.
docker buildx inspect ${BUILDX_BUILDER_NAME} || docker buildx create --name ${BUILDX_BUILDER_NAME} --driver docker-container --bootstrap --use
##@ Test
.PHONY: test
test: nextest ## Run unit and integration tests.
cargo nextest run ${NEXTEST_OPTS}
.PHONY: nextest
nextest: ## Install nextest tools.
cargo --list | grep nextest || cargo install cargo-nextest --locked
.PHONY: sqlness-test
sqlness-test: ## Run sqlness test.
cargo sqlness
.PHONY: check
check: ## Cargo check all the targets.
cargo check --workspace --all-targets --all-features
.PHONY: clippy
clippy: ## Check clippy rules.
cargo clippy --workspace --all-targets --all-features -- -D warnings
.PHONY: fmt-check
fmt-check: ## Check code format.
cargo fmt --all -- --check
.PHONY: start-etcd
start-etcd: ## Start single node etcd for testing purpose.
docker run --rm -d --network=host -p 2379-2380:2379-2380 ${ETCD_IMAGE}
.PHONY: stop-etcd
stop-etcd: ## Stop single node etcd for testing purpose.
docker stop $$(docker ps -q --filter ancestor=${ETCD_IMAGE})
.PHONY: run-it-in-container
run-it-in-container: start-etcd ## Run integration tests in dev-builder.
docker run --network=host \
-v ${PWD}:/greptimedb -v ${CARGO_REGISTRY_CACHE}:/root/.cargo/registry -v /tmp:/tmp \
-w /greptimedb ${IMAGE_REGISTRY}/${IMAGE_NAMESPACE}/dev-builder-${BASE_IMAGE}:latest \
make test sqlness-test BUILD_JOBS=${BUILD_JOBS}
##@ General
# The help target prints out all targets with their descriptions organized
# beneath their categories. The categories are represented by '##@' and the
# target descriptions by '##'. The awk commands is responsible for reading the
# entire set of makefiles included in this invocation, looking for lines of the
# file as xyz: ## something, and then pretty-format the target and help. Then,
# if there's a line with ##@ something, that gets pretty-printed as a category.
# More info on the usage of ANSI control characters for terminal formatting:
# https://en.wikipedia.org/wiki/ANSI_escape_code#SGR_parameters
# More info on the awk command:
# https://linuxcommand.org/lc3_adv_awk.php
.PHONY: help
help: ## Display help messages.
@awk 'BEGIN {FS = ":.*##"; printf "\nUsage:\n make \033[36m<target>\033[0m\n"} /^[a-zA-Z_0-9-]+:.*?##/ { printf " \033[36m%-30s\033[0m %s\n", $$1, $$2 } /^##@/ { printf "\n\033[1m%s\033[0m\n", substr($$0, 5) } ' $(MAKEFILE_LIST)

254
README.md
View File

@@ -1,181 +1,183 @@
<p align="center">
<picture>
<source media="(prefers-color-scheme: light)" srcset="https://cdn.jsdelivr.net/gh/GreptimeTeam/greptimedb@main/docs/logo-text-padding.png">
<source media="(prefers-color-scheme: dark)" srcset="https://cdn.jsdelivr.net/gh/GreptimeTeam/greptimedb@main/docs/logo-text-padding-dark.png">
<img alt="GreptimeDB Logo" src="https://cdn.jsdelivr.net/gh/GreptimeTeam/greptimedb@main/docs/logo-text-padding.png" width="400px">
</picture>
</p>
# GreptimeDB
[![codecov](https://codecov.io/gh/GrepTimeTeam/greptimedb/branch/develop/graph/badge.svg?token=FITFDI3J3C)](https://codecov.io/gh/GrepTimeTeam/greptimedb)
<h3 align="center">
The next-generation hybrid time-series/analytics processing database in the cloud
</h3>
GreptimeDB: the next-generation hybrid timeseries/analytics processing database in the cloud.
<p align="center">
<a href="https://codecov.io/gh/GrepTimeTeam/greptimedb"><img src="https://codecov.io/gh/GrepTimeTeam/greptimedb/branch/main/graph/badge.svg?token=FITFDI3J3C"></img></a>
&nbsp;
<a href="https://github.com/GreptimeTeam/greptimedb/actions/workflows/develop.yml"><img src="https://github.com/GreptimeTeam/greptimedb/actions/workflows/develop.yml/badge.svg" alt="CI"></img></a>
&nbsp;
<a href="https://github.com/greptimeTeam/greptimedb/blob/main/LICENSE"><img src="https://img.shields.io/github/license/greptimeTeam/greptimedb"></a>
</p>
## Getting Started
<p align="center">
<a href="https://twitter.com/greptime"><img src="https://img.shields.io/badge/twitter-follow_us-1d9bf0.svg"></a>
&nbsp;
<a href="https://www.linkedin.com/company/greptime/"><img src="https://img.shields.io/badge/linkedin-connect_with_us-0a66c2.svg"></a>
&nbsp;
<a href="https://greptime.com/slack"><img src="https://img.shields.io/badge/slack-GreptimeDB-0abd59?logo=slack" alt="slack" /></a>
</p>
### Prerequisites
## What is GreptimeDB
To compile GreptimeDB from source, you'll need the following:
- Rust
- Protobuf
- OpenSSL
GreptimeDB is an open-source time-series database focusing on efficiency, scalability, and analytical capabilities.
It's designed to work on infrastructure of the cloud era, and users benefit from its elasticity and commodity storage.
#### Rust
Our core developers have been building time-series data platforms for years. Based on their best-practices, GreptimeDB is born to give you:
The easiest way to install Rust is to use [`rustup`](https://rustup.rs/), which will check our `rust-toolchain` file and install correct Rust version for you.
- Optimized columnar layout for handling time-series data; compacted, compressed, and stored on various storage backends, particularly cloud object storage with 50x cost efficiency.
- Fully open-source distributed cluster architecture that harnesses the power of cloud-native elastic computing resources.
- Seamless scalability from a standalone binary at edge to a robust, highly available distributed cluster in cloud, with a transparent experience for both developers and administrators.
- Native SQL and PromQL for queries, and Python scripting to facilitate complex analytical tasks.
- Flexible indexing capabilities and distributed, parallel-processing query engine, tackling high cardinality issues down.
- Widely adopted database protocols and APIs, including MySQL, PostgreSQL, and Prometheus Remote Storage, etc.
#### Protobuf
## Quick Start
`protoc` is required for compiling `.proto` files. `protobuf` is available from
major package manager on macos and linux distributions. You can find an
installation instructions [here](https://grpc.io/docs/protoc-installation/).
### [GreptimePlay](https://greptime.com/playground)
#### OpenSSL
Try out the features of GreptimeDB right from your browser.
For Ubuntu:
```bash
sudo apt install libssl-dev
```
### Build
For RedHat-based: Fedora, Oracle Linux, etc:
```bash
sudo dnf install openssl-devel
```
#### Build from Source
For macOS:
```bash
brew install openssl
```
To compile GreptimeDB from source, you'll need:
- C/C++ Toolchain: provides basic tools for compiling and linking. This is
available as `build-essential` on ubuntu and similar name on other platforms.
- Rust: the easiest way to install Rust is to use
[`rustup`](https://rustup.rs/), which will check our `rust-toolchain` file and
install correct Rust version for you.
- Protobuf: `protoc` is required for compiling `.proto` files. `protobuf` is
available from major package manager on macos and linux distributions. You can
find an installation instructions [here](https://grpc.io/docs/protoc-installation/).
**Note that `protoc` version needs to be >= 3.15** because we have used the `optional`
keyword. You can check it with `protoc --version`.
- python3-dev or python3-devel(Optional feature, only needed if you want to run scripts
in CPython, and also need to enable `pyo3_backend` feature when compiling(by `cargo run -F pyo3_backend` or add `pyo3_backend` to src/script/Cargo.toml 's `features.default` like `default = ["python", "pyo3_backend]`)): this install a Python shared library required for running Python
scripting engine(In CPython Mode). This is available as `python3-dev` on
ubuntu, you can install it with `sudo apt install python3-dev`, or
`python3-devel` on RPM based distributions (e.g. Fedora, Red Hat, SuSE). Mac's
`Python3` package should have this shared library by default. More detail for compiling with PyO3 can be found in [PyO3](https://pyo3.rs/v0.18.1/building_and_distribution#configuring-the-python-version)'s documentation.
#### Build with Docker
A docker image with necessary dependencies is provided:
### Build the Docker Image
```
docker build --network host -f docker/Dockerfile -t greptimedb .
```
### Run
## Usage
Start GreptimeDB from source code, in standalone mode:
### Start Datanode
```
cargo run -- standalone start
// Start datanode with default options.
cargo run -- datanode start
OR
// Start datanode with `http-addr` option.
cargo run -- datanode start --http-addr=0.0.0.0:9999
OR
// Start datanode with `log-dir` and `log-level` options.
cargo run -- --log-dir=logs --log-level=debug datanode start
```
Or if you built from docker:
Start datanode with config file:
```
docker run -p 4002:4002 -v "$(pwd):/tmp/greptimedb" greptime/greptimedb standalone start
cargo run -- --log-dir=logs --log-level=debug datanode start -c ./config/datanode.example.toml
```
Please see the online document site for more installation options and [operations info](https://docs.greptime.com/user-guide/operations/overview).
Start datanode by runing docker container:
### Get started
```
docker run -p 3000:3000 \
-p 3001:3001 \
-p 3306:3306 \
greptimedb
```
Read the [complete getting started guide](https://docs.greptime.com/getting-started/overview) on our [official document site](https://docs.greptime.com/).
### Start Frontend
To write and query data, GreptimeDB is compatible with multiple [protocols and clients](https://docs.greptime.com/user-guide/clients/overview).
Frontend should connect to Datanode, so **Datanode must have been started** at first!
## Resources
```
// Connects to local Datanode at its default GRPC port: 3001
### Installation
// Start Frontend with default options.
cargo run -- frontend start
- [Pre-built Binaries](https://greptime.com/download):
For Linux and macOS, you can easily download pre-built binaries including official releases and nightly builds that are ready to use.
In most cases, downloading the version without PyO3 is sufficient. However, if you plan to run scripts in CPython (and use Python packages like NumPy and Pandas), you will need to download the version with PyO3 and install a Python with the same version as the Python in the PyO3 version.
We recommend using virtualenv for the installation process to manage multiple Python versions.
- [Docker Images](https://hub.docker.com/r/greptime/greptimedb)(**recommended**): pre-built
Docker images, this is the easiest way to try GreptimeDB. By default it runs CPython script with `pyo3_backend` enabled.
- [`gtctl`](https://github.com/GreptimeTeam/gtctl): the command-line tool for
Kubernetes deployment
OR
### Documentation
// Start Frontend with `mysql-addr` option.
cargo run -- frontend start --mysql-addr=0.0.0.0:9999
- GreptimeDB [User Guide](https://docs.greptime.com/user-guide/concepts/overview)
- GreptimeDB [Developer
Guide](https://docs.greptime.com/developer-guide/overview.html)
- GreptimeDB [internal code document](https://greptimedb.rs)
OR
### Dashboard
- [The dashboard UI for GreptimeDB](https://github.com/GreptimeTeam/dashboard)
// Start datanode with `log-dir` and `log-level` options.
cargo run -- --log-dir=logs --log-level=debug frontend start
```
### SDK
Start datanode with config file:
- [GreptimeDB C++ Client](https://github.com/GreptimeTeam/greptimedb-client-cpp)
- [GreptimeDB Erlang Client](https://github.com/GreptimeTeam/greptimedb-client-erl)
- [GreptimeDB Go Ingester](https://github.com/GreptimeTeam/greptimedb-ingester-go)
- [GreptimeDB Java Ingester](https://github.com/GreptimeTeam/greptimedb-ingester-java)
- [GreptimeDB Python Client](https://github.com/GreptimeTeam/greptimedb-client-py) (WIP)
- [GreptimeDB Rust Client](https://github.com/GreptimeTeam/greptimedb-client-rust)
- [GreptimeDB JavaScript Client](https://github.com/GreptimeTeam/greptime-js-sdk)
```
cargo run -- --log-dir=logs --log-level=debug frontend start -c ./config/frontend.example.toml
```
### Grafana Dashboard
### SQL Operations
Our official Grafana dashboard is available at [grafana](./grafana/README.md) directory.
1. Connecting DB by [mysql client](https://dev.mysql.com/downloads/mysql/):
## Project Status
```
# The datanode listen on port 3306 by default.
mysql -h 127.0.0.1 -P 3306
```
This project is in its early stage and under heavy development. We move fast and
break things. Benchmark on development branch may not represent its potential
performance. We release pre-built binaries constantly for functional
evaluation. Do not use it in production at the moment.
2. Create table:
For future plans, check out [GreptimeDB roadmap](https://github.com/GreptimeTeam/greptimedb/issues/669).
```SQL
CREATE TABLE monitor (
host STRING,
ts TIMESTAMP,
cpu DOUBLE DEFAULT 0,
memory DOUBLE,
TIME INDEX (ts),
PRIMARY KEY(ts,host)) ENGINE=mito WITH(regions=1);
```
## Community
3. Insert data:
Our core team is thrilled to see you participate in any ways you like. When you are stuck, try to
ask for help by filling an issue with a detailed description of what you were trying to do
and what went wrong. If you have any questions or if you would like to get involved in our
community, please check out:
```SQL
INSERT INTO monitor(host, cpu, memory, ts) VALUES ('host1', 66.6, 1024, 1660897955);
INSERT INTO monitor(host, cpu, memory, ts) VALUES ('host2', 77.7, 2048, 1660897956);
INSERT INTO monitor(host, cpu, memory, ts) VALUES ('host3', 88.8, 4096, 1660897957);
```
- GreptimeDB Community on [Slack](https://greptime.com/slack)
- GreptimeDB GitHub [Discussions](https://github.com/GreptimeTeam/greptimedb/discussions)
- Greptime official [Website](https://greptime.com)
4. Query data:
In addition, you may:
```SQL
mysql> SELECT * FROM monitor;
+-------+------------+------+--------+
| host | ts | cpu | memory |
+-------+------------+------+--------+
| host1 | 1660897955 | 66.6 | 1024 |
| host2 | 1660897956 | 77.7 | 2048 |
| host3 | 1660897957 | 88.8 | 4096 |
+-------+------------+------+--------+
3 rows in set (0.01 sec)
```
You can delete your data by removing `/tmp/greptimedb`.
- View our official [Blog](https://greptime.com/blogs/index)
- Connect us with [Linkedin](https://www.linkedin.com/company/greptime/)
- Follow us on [Twitter](https://twitter.com/greptime)
## Contribute
## License
1. [Install rust](https://www.rust-lang.org/tools/install)
2. [Install `pre-commit`](https://pre-commit.com/#plugins) for run hooks on every commit automatically such as `cargo fmt` etc.
GreptimeDB uses the [Apache License 2.0](https://apache.org/licenses/LICENSE-2.0.txt) to strike a balance between
open contributions and allowing you to use the software however you want.
```
$ pip install pre-commit
## Contributing
or
Please refer to [contribution guidelines](CONTRIBUTING.md) for more information.
$ brew install pre-commit
$
```
## Acknowledgement
3. Install the git hook scripts:
- GreptimeDB uses [Apache Arrow™](https://arrow.apache.org/) as the memory model and [Apache Parquet™](https://parquet.apache.org/) as the persistent file format.
- GreptimeDB's query engine is powered by [Apache Arrow DataFusion™](https://arrow.apache.org/datafusion/).
- [Apache OpenDAL™](https://opendal.apache.org) gives GreptimeDB a very general and elegant data access abstraction layer.
- GreptimeDB's meta service is based on [etcd](https://etcd.io/).
- GreptimeDB uses [RustPython](https://github.com/RustPython/RustPython) for experimental embedded python scripting.
```
$ pre-commit install
pre-commit installed at .git/hooks/pre-commit
$ pre-commit install --hook-type commit-msg
pre-commit installed at .git/hooks/commit-msg
$ pre-commit install --hook-type pre-push
pre-commit installed at .git/hooks/pre-pus
```
now `pre-commit` will run automatically on `git commit`.
4. Check out branch from `develop` and make your contribution. Follow the [style guide](https://github.com/GreptimeTeam/docs/blob/main/style-guide/zh.md). Create a PR when you are ready, feel free and have fun!

View File

@@ -1,19 +0,0 @@
# Security Policy
## Supported Versions
| Version | Supported |
| ------- | ------------------ |
| >= v0.1.0 | :white_check_mark: |
| < v0.1.0 | :x: |
## Reporting a Vulnerability
We place great importance on the security of GreptimeDB code, software,
and cloud platform. If you come across a security vulnerability in GreptimeDB,
we kindly request that you inform us immediately. We will thoroughly investigate
all valid reports and make every effort to resolve the issue promptly.
To report any issues or vulnerabilities, please email us at info@greptime.com, rather than
posting publicly on GitHub. Be sure to provide us with the version identifier as well as details
on how the vulnerability can be exploited.

View File

@@ -1,19 +0,0 @@
[package]
name = "benchmarks"
version.workspace = true
edition.workspace = true
license.workspace = true
[lints]
workspace = true
[dependencies]
arrow.workspace = true
chrono.workspace = true
clap.workspace = true
client.workspace = true
futures-util.workspace = true
indicatif = "0.17.1"
itertools.workspace = true
parquet.workspace = true
tokio.workspace = true

View File

@@ -1,543 +0,0 @@
// Copyright 2023 Greptime Team
//
// Licensed under the Apache License, Version 2.0 (the "License");
// you may not use this file except in compliance with the License.
// You may obtain a copy of the License at
//
// http://www.apache.org/licenses/LICENSE-2.0
//
// Unless required by applicable law or agreed to in writing, software
// distributed under the License is distributed on an "AS IS" BASIS,
// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
// See the License for the specific language governing permissions and
// limitations under the License.
//! Use the taxi trip records from New York City dataset to bench. You can download the dataset from
//! [here](https://www1.nyc.gov/site/tlc/about/tlc-trip-record-data.page).
#![allow(clippy::print_stdout)]
use std::collections::HashMap;
use std::path::{Path, PathBuf};
use std::time::Instant;
use arrow::array::{ArrayRef, PrimitiveArray, StringArray, TimestampMicrosecondArray};
use arrow::datatypes::{DataType, Float64Type, Int64Type};
use arrow::record_batch::RecordBatch;
use clap::Parser;
use client::api::v1::column::Values;
use client::api::v1::{
Column, ColumnDataType, ColumnDef, CreateTableExpr, InsertRequest, InsertRequests, SemanticType,
};
use client::{Client, Database, OutputData, DEFAULT_CATALOG_NAME, DEFAULT_SCHEMA_NAME};
use futures_util::TryStreamExt;
use indicatif::{MultiProgress, ProgressBar, ProgressStyle};
use parquet::arrow::arrow_reader::ParquetRecordBatchReaderBuilder;
use tokio::task::JoinSet;
const CATALOG_NAME: &str = "greptime";
const SCHEMA_NAME: &str = "public";
#[derive(Parser)]
#[command(name = "NYC benchmark runner")]
struct Args {
/// Path to the dataset
#[arg(short, long)]
path: Option<String>,
/// Batch size of insert request.
#[arg(short = 's', long = "batch-size", default_value_t = 4096)]
batch_size: usize,
/// Number of client threads on write (parallel on file level)
#[arg(short = 't', long = "thread-num", default_value_t = 4)]
thread_num: usize,
/// Number of query iteration
#[arg(short = 'i', long = "iter-num", default_value_t = 3)]
iter_num: usize,
#[arg(long = "skip-write")]
skip_write: bool,
#[arg(long = "skip-read")]
skip_read: bool,
#[arg(short, long, default_value_t = String::from("127.0.0.1:4001"))]
endpoint: String,
}
fn get_file_list<P: AsRef<Path>>(path: P) -> Vec<PathBuf> {
std::fs::read_dir(path)
.unwrap()
.map(|dir| dir.unwrap().path().canonicalize().unwrap())
.collect()
}
fn new_table_name() -> String {
format!("nyc_taxi_{}", chrono::Utc::now().timestamp())
}
async fn write_data(
table_name: &str,
batch_size: usize,
db: &Database,
path: PathBuf,
mpb: MultiProgress,
pb_style: ProgressStyle,
) -> u128 {
let file = std::fs::File::open(&path).unwrap();
let record_batch_reader_builder = ParquetRecordBatchReaderBuilder::try_new(file).unwrap();
let row_num = record_batch_reader_builder
.metadata()
.file_metadata()
.num_rows();
let record_batch_reader = record_batch_reader_builder
.with_batch_size(batch_size)
.build()
.unwrap();
let progress_bar = mpb.add(ProgressBar::new(row_num as _));
progress_bar.set_style(pb_style);
progress_bar.set_message(format!("{path:?}"));
let mut total_rpc_elapsed_ms = 0;
for record_batch in record_batch_reader {
let record_batch = record_batch.unwrap();
if !is_record_batch_full(&record_batch) {
continue;
}
let (columns, row_count) = convert_record_batch(record_batch);
let request = InsertRequest {
table_name: table_name.to_string(),
columns,
row_count,
};
let requests = InsertRequests {
inserts: vec![request],
};
let now = Instant::now();
db.insert(requests).await.unwrap();
let elapsed = now.elapsed();
total_rpc_elapsed_ms += elapsed.as_millis();
progress_bar.inc(row_count as _);
}
progress_bar.finish_with_message(format!("file {path:?} done in {total_rpc_elapsed_ms}ms",));
total_rpc_elapsed_ms
}
fn convert_record_batch(record_batch: RecordBatch) -> (Vec<Column>, u32) {
let schema = record_batch.schema();
let fields = schema.fields();
let row_count = record_batch.num_rows();
let mut columns = vec![];
for (array, field) in record_batch.columns().iter().zip(fields.iter()) {
let (values, datatype) = build_values(array);
let semantic_type = match field.name().as_str() {
"VendorID" => SemanticType::Tag,
"tpep_pickup_datetime" => SemanticType::Timestamp,
_ => SemanticType::Field,
};
let column = Column {
column_name: field.name().clone(),
values: Some(values),
null_mask: array
.to_data()
.nulls()
.map(|bitmap| bitmap.buffer().as_slice().to_vec())
.unwrap_or_default(),
datatype: datatype.into(),
semantic_type: semantic_type as i32,
..Default::default()
};
columns.push(column);
}
(columns, row_count as _)
}
fn build_values(column: &ArrayRef) -> (Values, ColumnDataType) {
match column.data_type() {
DataType::Int64 => {
let array = column
.as_any()
.downcast_ref::<PrimitiveArray<Int64Type>>()
.unwrap();
let values = array.values();
(
Values {
i64_values: values.to_vec(),
..Default::default()
},
ColumnDataType::Int64,
)
}
DataType::Float64 => {
let array = column
.as_any()
.downcast_ref::<PrimitiveArray<Float64Type>>()
.unwrap();
let values = array.values();
(
Values {
f64_values: values.to_vec(),
..Default::default()
},
ColumnDataType::Float64,
)
}
DataType::Timestamp(_, _) => {
let array = column
.as_any()
.downcast_ref::<TimestampMicrosecondArray>()
.unwrap();
let values = array.values();
(
Values {
timestamp_microsecond_values: values.to_vec(),
..Default::default()
},
ColumnDataType::TimestampMicrosecond,
)
}
DataType::Utf8 => {
let array = column.as_any().downcast_ref::<StringArray>().unwrap();
let values = array.iter().filter_map(|s| s.map(String::from)).collect();
(
Values {
string_values: values,
..Default::default()
},
ColumnDataType::String,
)
}
DataType::Null
| DataType::Boolean
| DataType::Int8
| DataType::Int16
| DataType::Int32
| DataType::UInt8
| DataType::UInt16
| DataType::UInt32
| DataType::UInt64
| DataType::Float16
| DataType::Float32
| DataType::Date32
| DataType::Date64
| DataType::Time32(_)
| DataType::Time64(_)
| DataType::Duration(_)
| DataType::Interval(_)
| DataType::Binary
| DataType::FixedSizeBinary(_)
| DataType::LargeBinary
| DataType::LargeUtf8
| DataType::List(_)
| DataType::FixedSizeList(_, _)
| DataType::LargeList(_)
| DataType::Struct(_)
| DataType::Union(_, _)
| DataType::Dictionary(_, _)
| DataType::Decimal128(_, _)
| DataType::Decimal256(_, _)
| DataType::RunEndEncoded(_, _)
| DataType::Map(_, _) => todo!(),
}
}
fn is_record_batch_full(batch: &RecordBatch) -> bool {
batch.columns().iter().all(|col| col.null_count() == 0)
}
fn create_table_expr(table_name: &str) -> CreateTableExpr {
CreateTableExpr {
catalog_name: CATALOG_NAME.to_string(),
schema_name: SCHEMA_NAME.to_string(),
table_name: table_name.to_string(),
desc: String::default(),
column_defs: vec![
ColumnDef {
name: "VendorID".to_string(),
data_type: ColumnDataType::Int64 as i32,
is_nullable: true,
default_constraint: vec![],
semantic_type: SemanticType::Tag as i32,
comment: String::new(),
..Default::default()
},
ColumnDef {
name: "tpep_pickup_datetime".to_string(),
data_type: ColumnDataType::TimestampMicrosecond as i32,
is_nullable: false,
default_constraint: vec![],
semantic_type: SemanticType::Timestamp as i32,
comment: String::new(),
..Default::default()
},
ColumnDef {
name: "tpep_dropoff_datetime".to_string(),
data_type: ColumnDataType::TimestampMicrosecond as i32,
is_nullable: true,
default_constraint: vec![],
semantic_type: SemanticType::Field as i32,
comment: String::new(),
..Default::default()
},
ColumnDef {
name: "passenger_count".to_string(),
data_type: ColumnDataType::Float64 as i32,
is_nullable: true,
default_constraint: vec![],
semantic_type: SemanticType::Field as i32,
comment: String::new(),
..Default::default()
},
ColumnDef {
name: "trip_distance".to_string(),
data_type: ColumnDataType::Float64 as i32,
is_nullable: true,
default_constraint: vec![],
semantic_type: SemanticType::Field as i32,
comment: String::new(),
..Default::default()
},
ColumnDef {
name: "RatecodeID".to_string(),
data_type: ColumnDataType::Float64 as i32,
is_nullable: true,
default_constraint: vec![],
semantic_type: SemanticType::Field as i32,
comment: String::new(),
..Default::default()
},
ColumnDef {
name: "store_and_fwd_flag".to_string(),
data_type: ColumnDataType::String as i32,
is_nullable: true,
default_constraint: vec![],
semantic_type: SemanticType::Field as i32,
comment: String::new(),
..Default::default()
},
ColumnDef {
name: "PULocationID".to_string(),
data_type: ColumnDataType::Int64 as i32,
is_nullable: true,
default_constraint: vec![],
semantic_type: SemanticType::Field as i32,
comment: String::new(),
..Default::default()
},
ColumnDef {
name: "DOLocationID".to_string(),
data_type: ColumnDataType::Int64 as i32,
is_nullable: true,
default_constraint: vec![],
semantic_type: SemanticType::Field as i32,
comment: String::new(),
..Default::default()
},
ColumnDef {
name: "payment_type".to_string(),
data_type: ColumnDataType::Int64 as i32,
is_nullable: true,
default_constraint: vec![],
semantic_type: SemanticType::Field as i32,
comment: String::new(),
..Default::default()
},
ColumnDef {
name: "fare_amount".to_string(),
data_type: ColumnDataType::Float64 as i32,
is_nullable: true,
default_constraint: vec![],
semantic_type: SemanticType::Field as i32,
comment: String::new(),
..Default::default()
},
ColumnDef {
name: "extra".to_string(),
data_type: ColumnDataType::Float64 as i32,
is_nullable: true,
default_constraint: vec![],
semantic_type: SemanticType::Field as i32,
comment: String::new(),
..Default::default()
},
ColumnDef {
name: "mta_tax".to_string(),
data_type: ColumnDataType::Float64 as i32,
is_nullable: true,
default_constraint: vec![],
semantic_type: SemanticType::Field as i32,
comment: String::new(),
..Default::default()
},
ColumnDef {
name: "tip_amount".to_string(),
data_type: ColumnDataType::Float64 as i32,
is_nullable: true,
default_constraint: vec![],
semantic_type: SemanticType::Field as i32,
comment: String::new(),
..Default::default()
},
ColumnDef {
name: "tolls_amount".to_string(),
data_type: ColumnDataType::Float64 as i32,
is_nullable: true,
default_constraint: vec![],
semantic_type: SemanticType::Field as i32,
comment: String::new(),
..Default::default()
},
ColumnDef {
name: "improvement_surcharge".to_string(),
data_type: ColumnDataType::Float64 as i32,
is_nullable: true,
default_constraint: vec![],
semantic_type: SemanticType::Field as i32,
comment: String::new(),
..Default::default()
},
ColumnDef {
name: "total_amount".to_string(),
data_type: ColumnDataType::Float64 as i32,
is_nullable: true,
default_constraint: vec![],
semantic_type: SemanticType::Field as i32,
comment: String::new(),
..Default::default()
},
ColumnDef {
name: "congestion_surcharge".to_string(),
data_type: ColumnDataType::Float64 as i32,
is_nullable: true,
default_constraint: vec![],
semantic_type: SemanticType::Field as i32,
comment: String::new(),
..Default::default()
},
ColumnDef {
name: "airport_fee".to_string(),
data_type: ColumnDataType::Float64 as i32,
is_nullable: true,
default_constraint: vec![],
semantic_type: SemanticType::Field as i32,
comment: String::new(),
..Default::default()
},
],
time_index: "tpep_pickup_datetime".to_string(),
primary_keys: vec!["VendorID".to_string()],
create_if_not_exists: true,
table_options: Default::default(),
table_id: None,
engine: "mito".to_string(),
}
}
fn query_set(table_name: &str) -> HashMap<String, String> {
HashMap::from([
(
"count_all".to_string(),
format!("SELECT COUNT(*) FROM {table_name};"),
),
(
"fare_amt_by_passenger".to_string(),
format!("SELECT passenger_count, MIN(fare_amount), MAX(fare_amount), SUM(fare_amount) FROM {table_name} GROUP BY passenger_count"),
)
])
}
async fn do_write(args: &Args, db: &Database, table_name: &str) {
let mut file_list = get_file_list(args.path.clone().expect("Specify data path in argument"));
let mut write_jobs = JoinSet::new();
let create_table_result = db.create(create_table_expr(table_name)).await;
println!("Create table result: {create_table_result:?}");
let progress_bar_style = ProgressStyle::with_template(
"[{elapsed_precise}] {bar:60.cyan/blue} {pos:>7}/{len:7} {msg}",
)
.unwrap()
.progress_chars("##-");
let multi_progress_bar = MultiProgress::new();
let file_progress = multi_progress_bar.add(ProgressBar::new(file_list.len() as _));
file_progress.inc(0);
let batch_size = args.batch_size;
for _ in 0..args.thread_num {
if let Some(path) = file_list.pop() {
let db = db.clone();
let mpb = multi_progress_bar.clone();
let pb_style = progress_bar_style.clone();
let table_name = table_name.to_string();
let _ = write_jobs.spawn(async move {
write_data(&table_name, batch_size, &db, path, mpb, pb_style).await
});
}
}
while write_jobs.join_next().await.is_some() {
file_progress.inc(1);
if let Some(path) = file_list.pop() {
let db = db.clone();
let mpb = multi_progress_bar.clone();
let pb_style = progress_bar_style.clone();
let table_name = table_name.to_string();
let _ = write_jobs.spawn(async move {
write_data(&table_name, batch_size, &db, path, mpb, pb_style).await
});
}
}
}
async fn do_query(num_iter: usize, db: &Database, table_name: &str) {
for (query_name, query) in query_set(table_name) {
println!("Running query: {query}");
for i in 0..num_iter {
let now = Instant::now();
let res = db.sql(&query).await.unwrap();
match res.data {
OutputData::AffectedRows(_) | OutputData::RecordBatches(_) => (),
OutputData::Stream(stream) => {
stream.try_collect::<Vec<_>>().await.unwrap();
}
}
let elapsed = now.elapsed();
println!(
"query {}, iteration {}: {}ms",
query_name,
i,
elapsed.as_millis(),
);
}
}
}
fn main() {
let args = Args::parse();
tokio::runtime::Builder::new_multi_thread()
.worker_threads(args.thread_num)
.enable_all()
.build()
.unwrap()
.block_on(async {
let client = Client::with_urls(vec![&args.endpoint]);
let db = Database::new(DEFAULT_CATALOG_NAME, DEFAULT_SCHEMA_NAME, client);
let table_name = new_table_name();
if !args.skip_write {
do_write(&args, &db, &table_name).await;
}
if !args.skip_read {
do_query(args.iter_num, &db, &table_name).await;
}
})
}

View File

@@ -1,13 +0,0 @@
# codecov config
coverage:
status:
project:
default:
threshold: 1%
patch: off
ignore:
- "**/error*.rs" # ignore all error.rs files
- "tests/runner/*.rs" # ignore integration test runner
- "tests-integration/**/*.rs" # ignore integration tests
comment: # this is a top-level key
layout: "diff"

View File

View File

@@ -0,0 +1,71 @@
import sys
# for annoying releative import beyond top-level package
sys.path.insert(0, "../")
from greptime import mock_tester, coprocessor, greptime as gt_builtin
from greptime.greptime import interval, vector, log, prev, sqrt, datetime
import greptime.greptime as greptime
import json
import numpy as np
def data_sample(k_lines, symbol, density=5 * 30 * 86400):
"""
Only return close data for simplicty for now
"""
k_lines = k_lines["result"] if k_lines["ret_msg"] == "OK" else None
if k_lines is None:
raise Exception("Expect a `OK`ed message")
close = [float(i["close"]) for i in k_lines]
return interval(close, density, "prev")
def as_table(kline: list):
col_len = len(kline)
ret = {
k: vector([fn(row[k]) for row in kline], str(ty))
for k, fn, ty in
[
("symbol", str, "str"),
("period", str, "str"),
("open_time", int, "int"),
("open", float, "float"),
("high", float, "float"),
("low", float, "float"),
("close", float, "float")
]
}
return ret
@coprocessor(args=["open_time", "close"], returns=[
"rv_7d",
"rv_15d",
"rv_30d",
"rv_60d",
"rv_90d",
"rv_180d"
])
def calc_rvs(open_time, close):
from greptime import vector, log, prev, sqrt, datetime, pow, sum, last
import greptime as g
def calc_rv(close, open_time, time, interval):
mask = (open_time < time) & (open_time > time - interval)
close = close[mask]
open_time = open_time[mask]
close = g.interval(open_time, close, datetime("10m"), lambda x:last(x))
avg_time_interval = (open_time[-1] - open_time[0])/(len(open_time)-1)
ref = log(close/prev(close))
var = sum(pow(ref, 2)/(len(ref)-1))
return sqrt(var/avg_time_interval)
# how to get env var,
# maybe through accessing scope and serde then send to remote?
timepoint = open_time[-1]
rv_7d = vector([calc_rv(close, open_time, timepoint, datetime("7d"))])
rv_15d = vector([calc_rv(close, open_time, timepoint, datetime("15d"))])
rv_30d = vector([calc_rv(close, open_time, timepoint, datetime("30d"))])
rv_60d = vector([calc_rv(close, open_time, timepoint, datetime("60d"))])
rv_90d = vector([calc_rv(close, open_time, timepoint, datetime("90d"))])
rv_180d = vector([calc_rv(close, open_time, timepoint, datetime("180d"))])
return rv_7d, rv_15d, rv_30d, rv_60d, rv_90d, rv_180d

View File

@@ -0,0 +1 @@
curl "https://api.bybit.com/v2/public/index-price-kline?symbol=BTCUSD&interval=1&limit=$1&from=1581231260" > kline.json

View File

@@ -0,0 +1,108 @@
{
"ret_code": 0,
"ret_msg": "OK",
"ext_code": "",
"ext_info": "",
"result": [
{
"symbol": "BTCUSD",
"period": "1",
"open_time": 300,
"open": "10107",
"high": "10109.34",
"low": "10106.71",
"close": "10106.79"
},
{
"symbol": "BTCUSD",
"period": "1",
"open_time": 900,
"open": "10106.79",
"high": "10109.27",
"low": "10105.92",
"close": "10106.09"
},
{
"symbol": "BTCUSD",
"period": "1",
"open_time": 1200,
"open": "10106.09",
"high": "10108.75",
"low": "10104.66",
"close": "10108.73"
},
{
"symbol": "BTCUSD",
"period": "1",
"open_time": 1800,
"open": "10108.73",
"high": "10109.52",
"low": "10106.07",
"close": "10106.38"
},
{
"symbol": "BTCUSD",
"period": "1",
"open_time": 2400,
"open": "10106.38",
"high": "10109.48",
"low": "10104.81",
"close": "10106.95"
},
{
"symbol": "BTCUSD",
"period": "1",
"open_time": 3000,
"open": "10106.95",
"high": "10109.48",
"low": "10106.6",
"close": "10107.55"
},
{
"symbol": "BTCUSD",
"period": "1",
"open_time": 3600,
"open": "10107.55",
"high": "10109.28",
"low": "10104.68",
"close": "10104.68"
},
{
"symbol": "BTCUSD",
"period": "1",
"open_time": 4200,
"open": "10104.68",
"high": "10109.18",
"low": "10104.14",
"close": "10108.8"
},
{
"symbol": "BTCUSD",
"period": "1",
"open_time": 4800,
"open": "10108.8",
"high": "10117.36",
"low": "10108.8",
"close": "10115.96"
},
{
"symbol": "BTCUSD",
"period": "1",
"open_time": 5400,
"open": "10115.96",
"high": "10119.19",
"low": "10115.96",
"close": "10117.08"
},
{
"symbol": "BTCUSD",
"period": "1",
"open_time": 6000,
"open": "10117.08",
"high": "10120.73",
"low": "10116.96",
"close": "10120.43"
}
],
"time_now": "1661225351.158190"
}

View File

@@ -0,0 +1,4 @@
from .greptime import coprocessor, copr
from .greptime import vector, log, prev, next, first, last, sqrt, pow, datetime, sum, interval
from .mock import mock_tester
from .cfg import set_conn_addr, get_conn_addr

View File

@@ -0,0 +1,11 @@
GREPTIME_DB_CONN_ADDRESS = "localhost:3000"
"""The Global Variable for address for conntect to database"""
def set_conn_addr(addr: str):
"""set database address to given `addr`"""
global GREPTIME_DB_CONN_ADDRESS
GREPTIME_DB_CONN_ADDRESS = addr
def get_conn_addr()->str:
global GREPTIME_DB_CONN_ADDRESS
return GREPTIME_DB_CONN_ADDRESS

View File

@@ -0,0 +1,207 @@
"""
Be note that this is a mock library, if not connected to database,
it can only run on mock data and mock function which is supported by numpy
"""
import functools
import numpy as np
import json
from urllib import request
import inspect
import requests
from .cfg import set_conn_addr, get_conn_addr
log = np.log
sum = np.nansum
sqrt = np.sqrt
pow = np.power
nan = np.nan
class TimeStamp(str):
"""
TODO: impl date time
"""
pass
class i32(int):
"""
For Python Coprocessor Type Annotation ONLY
A signed 32-bit integer.
"""
def __repr__(self) -> str:
return "i32"
class i64(int):
"""
For Python Coprocessor Type Annotation ONLY
A signed 64-bit integer.
"""
def __repr__(self) -> str:
return "i64"
class f32(float):
"""
For Python Coprocessor Type Annotation ONLY
A 32-bit floating point number.
"""
def __repr__(self) -> str:
return "f32"
class f64(float):
"""
For Python Coprocessor Type Annotation ONLY
A 64-bit floating point number.
"""
def __repr__(self) -> str:
return "f64"
class vector(np.ndarray):
"""
A compact Vector with all elements of same Data type.
"""
_datatype: str | None = None
def __new__(
cls,
lst,
dtype=None
) -> ...:
self = np.asarray(lst).view(cls)
self._datatype = dtype
return self
def __str__(self) -> str:
return "vector({}, \"{}\")".format(super().__str__(), self.datatype())
def datatype(self):
return self._datatype
def filter(self, lst_bool):
return self[lst_bool]
def last(lst):
return lst[-1]
def first(lst):
return lst[0]
def prev(lst):
ret = np.zeros(len(lst))
ret[1:] = lst[0:-1]
ret[0] = nan
return ret
def next(lst):
ret = np.zeros(len(lst))
ret[:-1] = lst[1:]
ret[-1] = nan
return ret
def interval(ts: vector, arr: vector, duration: int, func):
"""
Note that this is a mock function with same functionailty to the actual Python Coprocessor
`arr` is a vector of integral or temporal type.
"""
start = np.min(ts)
end = np.max(ts)
masks = [(ts >= i) & (ts <= (i+duration)) for i in range(start, end, duration)]
lst_res = [func(arr[mask]) for mask in masks]
return lst_res
def factor(unit: str) -> int:
if unit == "d":
return 24 * 60 * 60
elif unit == "h":
return 60 * 60
elif unit == "m":
return 60
elif unit == "s":
return 1
else:
raise Exception("Only d,h,m,s, found{}".format(unit))
def datetime(input_time: str) -> int:
"""
support `d`(day) `h`(hour) `m`(minute) `s`(second)
support format:
`12s` `7d` `12d2h7m`
"""
prev = 0
cur = 0
state = "Num"
parse_res = []
for idx, ch in enumerate(input_time):
if ch.isdigit():
cur = idx
if state != "Num":
parse_res.append((state, input_time[prev:cur], (prev, cur)))
prev = idx
state = "Num"
else:
cur = idx
if state != "Symbol":
parse_res.append((state, input_time[prev:cur], (prev, cur)))
prev = idx
state = "Symbol"
parse_res.append((state, input_time[prev:cur+1], (prev, cur+1)))
cur_idx = 0
res_time = 0
while cur_idx < len(parse_res):
pair = parse_res[cur_idx]
if pair[0] == "Num":
val = int(pair[1])
nxt = parse_res[cur_idx+1]
res_time += val * factor(nxt[1])
cur_idx += 2
else:
raise Exception("Two symbol in a row is impossible")
return res_time
def coprocessor(args=None, returns=None, sql=None):
"""
The actual coprocessor, which will connect to database and update
whatever function decorated with `@coprocessor(args=[...], returns=[...], sql=...)`
"""
def decorator_copr(func):
@functools.wraps(func)
def wrapper_do_actual(*args, **kwargs):
if len(args)!=0 or len(kwargs)!=0:
raise Exception("Expect call with no arguements(for all args are given by coprocessor itself)")
source = inspect.getsource(func)
url = "http://{}/v1/scripts".format(get_conn_addr())
print("Posting to {}".format(url))
data = {
"script": source,
"engine": None,
}
res = requests.post(
url,
headers={"Content-Type": "application/json"},
json=data
)
return res
return wrapper_do_actual
return decorator_copr
# make a alias for short
copr = coprocessor

View File

@@ -0,0 +1,82 @@
"""
Note this is a mock library, if not connected to database,
it can only run on mock data and support by numpy
"""
from typing import Any
import numpy as np
from .greptime import i32,i64,f32,f64, vector, interval, prev, datetime, log, sum, sqrt, pow, nan, copr, coprocessor
import inspect
import functools
import ast
def mock_tester(
func,
env:dict,
table=None
):
"""
Mock tester helper function,
What it does is replace `@coprocessor` with `@mock_cpor` and add a keyword `env=env`
like `@mock_copr(args=...,returns=...,env=env)`
"""
code = inspect.getsource(func)
tree = ast.parse(code)
tree = HackyReplaceDecorator("env").visit(tree)
new_func = tree.body[0]
fn_name = new_func.name
code_obj = compile(tree, "<embedded>", "exec")
exec(code_obj)
ret = eval("{}()".format(fn_name))
return ret
def mock_copr(args, returns, sql=None, env:None|dict=None):
"""
This should not be used directly by user
"""
def decorator_copr(func):
@functools.wraps(func)
def wrapper_do_actual(*fn_args, **fn_kwargs):
real_args = [env[name] for name in args]
ret = func(*real_args)
return ret
return wrapper_do_actual
return decorator_copr
class HackyReplaceDecorator(ast.NodeTransformer):
"""
This class accept a `env` dict for environment to extract args from,
and put `env` dict in the param list of `mock_copr` decorator, i.e:
a `@copr(args=["a", "b"], returns=["c"])` with call like mock_helper(abc, env={"a":2, "b":3})
will be transform into `@mock_copr(args=["a", "b"], returns=["c"], env={"a":2, "b":3})`
"""
def __init__(self, env: str) -> None:
# just for add `env` keyword
self.env = env
def visit_FunctionDef(self, node: ast.FunctionDef) -> Any:
new_node = node
decorator_list = new_node.decorator_list
if len(decorator_list)!=1:
return node
deco = decorator_list[0]
if deco.func.id!="coprocessor" and deco.func.id !="copr":
raise Exception("Expect a @copr or @coprocessor, found {}.".format(deco.func.id))
deco.func = ast.Name(id="mock_copr", ctx=ast.Load())
new_kw = ast.keyword(arg="env", value=ast.Name(id=self.env, ctx=ast.Load()))
deco.keywords.append(new_kw)
# Tie up loose ends in the AST.
ast.copy_location(new_node, node)
ast.fix_missing_locations(new_node)
self.generic_visit(node)
return new_node

View File

@@ -0,0 +1,60 @@
from example.calc_rv import as_table, calc_rvs
from greptime import coprocessor, set_conn_addr, get_conn_addr, mock_tester
import sys
import json
import requests
'''
To run this script, you need to first start a http server of greptime, and
`
python3 component/script/python/test.py 地址:端口
`
'''
@coprocessor(sql='select number from numbers limit 10', args=['number'], returns=['n'])
def test(n):
return n+2
def init_table(close, open_time):
req_init = "/v1/sql?sql=create table k_line (close double, open_time bigint, TIME INDEX (open_time))"
print(get_db(req_init).text)
for c1, c2 in zip(close, open_time):
req = "/v1/sql?sql=INSERT INTO k_line(close, open_time) VALUES ({}, {})".format(c1, c2)
print(get_db(req).text)
print(get_db("/v1/sql?sql=select * from k_line").text)
def get_db(req:str):
return requests.get("http://{}{}".format(get_conn_addr(), req))
if __name__ == "__main__":
with open("component/script/python/example/kline.json", "r") as kline_file:
kline = json.load(kline_file)
table = as_table(kline["result"])
close = table["close"]
open_time = table["open_time"]
env = {"close":close, "open_time": open_time}
res = mock_tester(calc_rvs, env=env)
print("Mock result:", [i[0] for i in res])
exit()
if len(sys.argv)!=2:
raise Exception("Expect only one address as cmd's args")
set_conn_addr(sys.argv[1])
res = test()
print(res.headers)
print(res.text)
with open("component/script/python/example/kline.json", "r") as kline_file:
kline = json.load(kline_file)
# vec = vector([1,2,3], int)
# print(vec, vec.datatype())
table = as_table(kline["result"])
# print(table)
close = table["close"]
open_time = table["open_time"]
init_table(close, open_time)
real = calc_rvs()
print(real)
try:
print(real.text["error"])
except:
print(real.text)

View File

@@ -1,171 +1,14 @@
# Node running mode, see `standalone.example.toml`.
mode = "distributed"
# The datanode identifier, should be unique.
node_id = 42
# gRPC server address, "127.0.0.1:3001" by default.
rpc_addr = "127.0.0.1:3001"
# Hostname of this node.
rpc_hostname = "127.0.0.1"
# The number of gRPC server worker threads, 8 by default.
rpc_runtime_size = 8
# Start services after regions have obtained leases.
# It will block the datanode start if it can't receive leases in the heartbeat from metasrv.
require_lease_before_startup = false
http_addr = '0.0.0.0:3000'
rpc_addr = '0.0.0.0:3001'
wal_dir = '/tmp/greptimedb/wal'
# Initialize all regions in the background during the startup.
# By default, it provides services after all regions have been initialized.
init_regions_in_background = false
mysql_addr = '0.0.0.0:3306'
mysql_runtime_size = 4
[heartbeat]
# Interval for sending heartbeat messages to the Metasrv, 3 seconds by default.
interval = "3s"
# applied when postgres feature enbaled
postgres_addr = '0.0.0.0:5432'
postgres_runtime_size = 4
# Metasrv client options.
[meta_client]
# Metasrv address list.
metasrv_addrs = ["127.0.0.1:3002"]
# Heartbeat timeout, 500 milliseconds by default.
heartbeat_timeout = "500ms"
# Operation timeout, 3 seconds by default.
timeout = "3s"
# Connect server timeout, 1 second by default.
connect_timeout = "1s"
# `TCP_NODELAY` option for accepted connections, true by default.
tcp_nodelay = true
# WAL options.
[wal]
provider = "raft_engine"
# Raft-engine wal options, see `standalone.example.toml`.
# dir = "/tmp/greptimedb/wal"
file_size = "256MB"
purge_threshold = "4GB"
purge_interval = "10m"
read_batch_size = 128
sync_write = false
# Kafka wal options, see `standalone.example.toml`.
# broker_endpoints = ["127.0.0.1:9092"]
# Warning: Kafka has a default limit of 1MB per message in a topic.
# max_batch_size = "1MB"
# linger = "200ms"
# consumer_wait_timeout = "100ms"
# backoff_init = "500ms"
# backoff_max = "10s"
# backoff_base = 2
# backoff_deadline = "5mins"
# Storage options, see `standalone.example.toml`.
[storage]
# The working home directory.
data_home = "/tmp/greptimedb/"
# Storage type.
type = "File"
# TTL for all tables. Disabled by default.
# global_ttl = "7d"
# Cache configuration for object storage such as 'S3' etc.
# The local file cache directory
# cache_path = "/path/local_cache"
# The local file cache capacity in bytes.
# cache_capacity = "256MB"
# Custom storage options
#[[storage.providers]]
#type = "S3"
#[[storage.providers]]
#type = "Gcs"
# Mito engine options
[[region_engine]]
[region_engine.mito]
# Number of region workers
num_workers = 8
# Request channel size of each worker
worker_channel_size = 128
# Max batch size for a worker to handle requests
worker_request_batch_size = 64
# Number of meta action updated to trigger a new checkpoint for the manifest
manifest_checkpoint_distance = 10
# Whether to compress manifest and checkpoint file by gzip (default false).
compress_manifest = false
# Max number of running background jobs
max_background_jobs = 4
# Interval to auto flush a region if it has not flushed yet.
auto_flush_interval = "1h"
# Global write buffer size for all regions. If not set, it's default to 1/8 of OS memory with a max limitation of 1GB.
global_write_buffer_size = "1GB"
# Global write buffer size threshold to reject write requests. If not set, it's default to 2 times of `global_write_buffer_size`
global_write_buffer_reject_size = "2GB"
# Cache size for SST metadata. Setting it to 0 to disable the cache.
# If not set, it's default to 1/32 of OS memory with a max limitation of 128MB.
sst_meta_cache_size = "128MB"
# Cache size for vectors and arrow arrays. Setting it to 0 to disable the cache.
# If not set, it's default to 1/16 of OS memory with a max limitation of 512MB.
vector_cache_size = "512MB"
# Cache size for pages of SST row groups. Setting it to 0 to disable the cache.
# If not set, it's default to 1/16 of OS memory with a max limitation of 512MB.
page_cache_size = "512MB"
# Buffer size for SST writing.
sst_write_buffer_size = "8MB"
# Parallelism to scan a region (default: 1/4 of cpu cores).
# - 0: using the default value (1/4 of cpu cores).
# - 1: scan in current thread.
# - n: scan in parallelism n.
scan_parallelism = 0
# Capacity of the channel to send data from parallel scan tasks to the main task (default 32).
parallel_scan_channel_size = 32
# Whether to allow stale WAL entries read during replay.
allow_stale_entries = false
[region_engine.mito.inverted_index]
# Whether to create the index on flush.
# - "auto": automatically
# - "disable": never
create_on_flush = "auto"
# Whether to create the index on compaction.
# - "auto": automatically
# - "disable": never
create_on_compaction = "auto"
# Whether to apply the index on query
# - "auto": automatically
# - "disable": never
apply_on_query = "auto"
# Memory threshold for performing an external sort during index creation.
# Setting to empty will disable external sorting, forcing all sorting operations to happen in memory.
mem_threshold_on_create = "64M"
# File system path to store intermediate files for external sorting (default `{data_home}/index_intermediate`).
intermediate_path = ""
[region_engine.mito.memtable]
# Memtable type.
# - "experimental": experimental memtable
# - "time_series": time-series memtable (deprecated)
type = "experimental"
# The max number of keys in one shard.
index_max_keys_per_shard = 8192
# The max rows of data inside the actively writing buffer in one shard.
data_freeze_threshold = 32768
# Max dictionary bytes.
fork_dictionary_bytes = "1GiB"
# Log options, see `standalone.example.toml`
# [logging]
# dir = "/tmp/greptimedb/logs"
# level = "info"
# Datanode export the metrics generated by itself
# encoded to Prometheus remote-write format
# and send to Prometheus remote-write compatible receiver (e.g. send to `greptimedb` itself)
# This is only used for `greptimedb` to export its own metrics internally. It's different from prometheus scrape.
# [export_metrics]
# whether enable export metrics, default is false
# enable = false
# The interval of export metrics
# write_interval = "30s"
# [export_metrics.remote_write]
# The url the metrics send to. The url is empty by default, url example: `http://127.0.0.1:4000/v1/prometheus/write?db=information_schema`
# url = ""
# HTTP headers of Prometheus remote-write carry
# headers = {}
type = 'File'
data_dir = '/tmp/greptimedb/data/'

View File

@@ -1,106 +1,4 @@
# Node running mode, see `standalone.example.toml`.
mode = "distributed"
# The default timezone of the server
# default_timezone = "UTC"
[heartbeat]
# Interval for sending heartbeat task to the Metasrv, 5 seconds by default.
interval = "5s"
# Interval for retry sending heartbeat task, 5 seconds by default.
retry_interval = "5s"
# HTTP server options, see `standalone.example.toml`.
[http]
addr = "127.0.0.1:4000"
timeout = "30s"
body_limit = "64MB"
# gRPC server options, see `standalone.example.toml`.
[grpc]
addr = "127.0.0.1:4001"
runtime_size = 8
# MySQL server options, see `standalone.example.toml`.
[mysql]
enable = true
addr = "127.0.0.1:4002"
runtime_size = 2
# MySQL server TLS options, see `standalone.example.toml`.
[mysql.tls]
mode = "disable"
cert_path = ""
key_path = ""
watch = false
# PostgresSQL server options, see `standalone.example.toml`.
[postgres]
enable = true
addr = "127.0.0.1:4003"
runtime_size = 2
# PostgresSQL server TLS options, see `standalone.example.toml`.
[postgres.tls]
mode = "disable"
cert_path = ""
key_path = ""
watch = false
# OpenTSDB protocol options, see `standalone.example.toml`.
[opentsdb]
enable = true
addr = "127.0.0.1:4242"
runtime_size = 2
# InfluxDB protocol options, see `standalone.example.toml`.
[influxdb]
enable = true
# Prometheus remote storage options, see `standalone.example.toml`.
[prom_store]
enable = true
# Whether to store the data from Prometheus remote write in metric engine.
# true by default
with_metric_engine = true
# Metasrv client options, see `datanode.example.toml`.
[meta_client]
metasrv_addrs = ["127.0.0.1:3002"]
timeout = "3s"
# DDL timeouts options.
ddl_timeout = "10s"
connect_timeout = "1s"
tcp_nodelay = true
# The configuration about the cache of the Metadata.
# default: 100000
metadata_cache_max_capacity = 100000
# default: 10m
metadata_cache_ttl = "10m"
# default: 5m
metadata_cache_tti = "5m"
# Log options, see `standalone.example.toml`
# [logging]
# dir = "/tmp/greptimedb/logs"
# level = "info"
# Datanode options.
[datanode]
# Datanode client options.
[datanode.client]
timeout = "10s"
connect_timeout = "10s"
tcp_nodelay = true
# Frontend export the metrics generated by itself
# encoded to Prometheus remote-write format
# and send to Prometheus remote-write compatible receiver (e.g. send to `greptimedb` itself)
# This is only used for `greptimedb` to export its own metrics internally. It's different from prometheus scrape.
# [export_metrics]
# whether enable export metrics, default is false
# enable = false
# The interval of export metrics
# write_interval = "30s"
# for `frontend`, `self_import` is recommend to collect metrics generated by itself
# [export_metrics.self_import]
# db = "information_schema"
http_addr = '0.0.0.0:4000'
grpc_addr = '0.0.0.0:4001'
mysql_addr = '0.0.0.0:4003'
mysql_runtime_size = 4

View File

@@ -1,93 +0,0 @@
# The working home directory.
data_home = "/tmp/metasrv/"
# The bind address of metasrv, "127.0.0.1:3002" by default.
bind_addr = "127.0.0.1:3002"
# The communication server address for frontend and datanode to connect to metasrv, "127.0.0.1:3002" by default for localhost.
server_addr = "127.0.0.1:3002"
# Etcd server address, "127.0.0.1:2379" by default.
store_addr = "127.0.0.1:2379"
# Datanode selector type.
# - "lease_based" (default value).
# - "load_based"
# For details, please see "https://docs.greptime.com/developer-guide/metasrv/selector".
selector = "lease_based"
# Store data in memory, false by default.
use_memory_store = false
# Whether to enable greptimedb telemetry, true by default.
enable_telemetry = true
# If it's not empty, the metasrv will store all data with this key prefix.
store_key_prefix = ""
# Log options, see `standalone.example.toml`
# [logging]
# dir = "/tmp/greptimedb/logs"
# level = "info"
# Procedure storage options.
[procedure]
# Procedure max retry time.
max_retry_times = 12
# Initial retry delay of procedures, increases exponentially
retry_delay = "500ms"
# Failure detectors options.
[failure_detector]
threshold = 8.0
min_std_deviation = "100ms"
acceptable_heartbeat_pause = "3000ms"
first_heartbeat_estimate = "1000ms"
# # Datanode options.
# [datanode]
# # Datanode client options.
# [datanode.client_options]
# timeout = "10s"
# connect_timeout = "10s"
# tcp_nodelay = true
[wal]
# Available wal providers:
# - "raft_engine" (default)
# - "kafka"
provider = "raft_engine"
# There're none raft-engine wal config since meta srv only involves in remote wal currently.
# Kafka wal config.
# The broker endpoints of the Kafka cluster. ["127.0.0.1:9092"] by default.
# broker_endpoints = ["127.0.0.1:9092"]
# Number of topics to be created upon start.
# num_topics = 64
# Topic selector type.
# Available selector types:
# - "round_robin" (default)
# selector_type = "round_robin"
# A Kafka topic is constructed by concatenating `topic_name_prefix` and `topic_id`.
# topic_name_prefix = "greptimedb_wal_topic"
# Expected number of replicas of each partition.
# replication_factor = 1
# Above which a topic creation operation will be cancelled.
# create_topic_timeout = "30s"
# The initial backoff for kafka clients.
# backoff_init = "500ms"
# The maximum backoff for kafka clients.
# backoff_max = "10s"
# Exponential backoff rate, i.e. next backoff = base * current backoff.
# backoff_base = 2
# Stop reconnecting if the total wait time reaches the deadline. If this config is missing, the reconnecting won't terminate.
# backoff_deadline = "5mins"
# Metasrv export the metrics generated by itself
# encoded to Prometheus remote-write format
# and send to Prometheus remote-write compatible receiver (e.g. send to `greptimedb` itself)
# This is only used for `greptimedb` to export its own metrics internally. It's different from prometheus scrape.
# [export_metrics]
# whether enable export metrics, default is false
# enable = false
# The interval of export metrics
# write_interval = "30s"
# [export_metrics.remote_write]
# The url the metrics send to. The url is empty by default, url example: `http://127.0.0.1:4000/v1/prometheus/write?db=information_schema`
# url = ""
# HTTP headers of Prometheus remote-write carry
# headers = {}

View File

@@ -1,286 +0,0 @@
# Node running mode, "standalone" or "distributed".
mode = "standalone"
# Whether to enable greptimedb telemetry, true by default.
enable_telemetry = true
# The default timezone of the server
# default_timezone = "UTC"
# HTTP server options.
[http]
# Server address, "127.0.0.1:4000" by default.
addr = "127.0.0.1:4000"
# HTTP request timeout, 30s by default.
timeout = "30s"
# HTTP request body limit, 64Mb by default.
# the following units are supported: B, KB, KiB, MB, MiB, GB, GiB, TB, TiB, PB, PiB
body_limit = "64MB"
# gRPC server options.
[grpc]
# Server address, "127.0.0.1:4001" by default.
addr = "127.0.0.1:4001"
# The number of server worker threads, 8 by default.
runtime_size = 8
# MySQL server options.
[mysql]
# Whether to enable
enable = true
# Server address, "127.0.0.1:4002" by default.
addr = "127.0.0.1:4002"
# The number of server worker threads, 2 by default.
runtime_size = 2
# MySQL server TLS options.
[mysql.tls]
# TLS mode, refer to https://www.postgresql.org/docs/current/libpq-ssl.html
# - "disable" (default value)
# - "prefer"
# - "require"
# - "verify-ca"
# - "verify-full"
mode = "disable"
# Certificate file path.
cert_path = ""
# Private key file path.
key_path = ""
# Watch for Certificate and key file change and auto reload
watch = false
# PostgresSQL server options.
[postgres]
# Whether to enable
enable = true
# Server address, "127.0.0.1:4003" by default.
addr = "127.0.0.1:4003"
# The number of server worker threads, 2 by default.
runtime_size = 2
# PostgresSQL server TLS options, see `[mysql_options.tls]` section.
[postgres.tls]
# TLS mode.
mode = "disable"
# certificate file path.
cert_path = ""
# private key file path.
key_path = ""
# Watch for Certificate and key file change and auto reload
watch = false
# OpenTSDB protocol options.
[opentsdb]
# Whether to enable
enable = true
# OpenTSDB telnet API server address, "127.0.0.1:4242" by default.
addr = "127.0.0.1:4242"
# The number of server worker threads, 2 by default.
runtime_size = 2
# InfluxDB protocol options.
[influxdb]
# Whether to enable InfluxDB protocol in HTTP API, true by default.
enable = true
# Prometheus remote storage options
[prom_store]
# Whether to enable Prometheus remote write and read in HTTP API, true by default.
enable = true
# Whether to store the data from Prometheus remote write in metric engine.
# true by default
with_metric_engine = true
[wal]
# Available wal providers:
# - "raft_engine" (default)
# - "kafka"
provider = "raft_engine"
# Raft-engine wal options.
# WAL data directory
# dir = "/tmp/greptimedb/wal"
# WAL file size in bytes.
file_size = "256MB"
# WAL purge threshold.
purge_threshold = "4GB"
# WAL purge interval in seconds.
purge_interval = "10m"
# WAL read batch size.
read_batch_size = 128
# Whether to sync log file after every write.
sync_write = false
# Whether to reuse logically truncated log files.
enable_log_recycle = true
# Whether to pre-create log files on start up
prefill_log_files = false
# Duration for fsyncing log files.
sync_period = "1000ms"
# Kafka wal options.
# The broker endpoints of the Kafka cluster. ["127.0.0.1:9092"] by default.
# broker_endpoints = ["127.0.0.1:9092"]
# Number of topics to be created upon start.
# num_topics = 64
# Topic selector type.
# Available selector types:
# - "round_robin" (default)
# selector_type = "round_robin"
# The prefix of topic name.
# topic_name_prefix = "greptimedb_wal_topic"
# The number of replicas of each partition.
# Warning: the replication factor must be positive and must not be greater than the number of broker endpoints.
# replication_factor = 1
# The max size of a single producer batch.
# Warning: Kafka has a default limit of 1MB per message in a topic.
# max_batch_size = "1MB"
# The linger duration.
# linger = "200ms"
# The consumer wait timeout.
# consumer_wait_timeout = "100ms"
# Create topic timeout.
# create_topic_timeout = "30s"
# The initial backoff delay.
# backoff_init = "500ms"
# The maximum backoff delay.
# backoff_max = "10s"
# Exponential backoff rate, i.e. next backoff = base * current backoff.
# backoff_base = 2
# The deadline of retries.
# backoff_deadline = "5mins"
# Metadata storage options.
[metadata_store]
# Kv file size in bytes.
file_size = "256MB"
# Kv purge threshold.
purge_threshold = "4GB"
# Procedure storage options.
[procedure]
# Procedure max retry time.
max_retry_times = 3
# Initial retry delay of procedures, increases exponentially
retry_delay = "500ms"
# Storage options.
[storage]
# The working home directory.
data_home = "/tmp/greptimedb/"
# Storage type.
type = "File"
# TTL for all tables. Disabled by default.
# global_ttl = "7d"
# Cache configuration for object storage such as 'S3' etc.
# cache_path = "/path/local_cache"
# The local file cache capacity in bytes.
# cache_capacity = "256MB"
# Custom storage options
#[[storage.providers]]
#type = "S3"
#[[storage.providers]]
#type = "Gcs"
# Mito engine options
[[region_engine]]
[region_engine.mito]
# Number of region workers
num_workers = 8
# Request channel size of each worker
worker_channel_size = 128
# Max batch size for a worker to handle requests
worker_request_batch_size = 64
# Number of meta action updated to trigger a new checkpoint for the manifest
manifest_checkpoint_distance = 10
# Whether to compress manifest and checkpoint file by gzip (default false).
compress_manifest = false
# Max number of running background jobs
max_background_jobs = 4
# Interval to auto flush a region if it has not flushed yet.
auto_flush_interval = "1h"
# Global write buffer size for all regions. If not set, it's default to 1/8 of OS memory with a max limitation of 1GB.
global_write_buffer_size = "1GB"
# Global write buffer size threshold to reject write requests. If not set, it's default to 2 times of `global_write_buffer_size`
global_write_buffer_reject_size = "2GB"
# Cache size for SST metadata. Setting it to 0 to disable the cache.
# If not set, it's default to 1/32 of OS memory with a max limitation of 128MB.
sst_meta_cache_size = "128MB"
# Cache size for vectors and arrow arrays. Setting it to 0 to disable the cache.
# If not set, it's default to 1/16 of OS memory with a max limitation of 512MB.
vector_cache_size = "512MB"
# Cache size for pages of SST row groups. Setting it to 0 to disable the cache.
# If not set, it's default to 1/16 of OS memory with a max limitation of 512MB.
page_cache_size = "512MB"
# Buffer size for SST writing.
sst_write_buffer_size = "8MB"
# Parallelism to scan a region (default: 1/4 of cpu cores).
# - 0: using the default value (1/4 of cpu cores).
# - 1: scan in current thread.
# - n: scan in parallelism n.
scan_parallelism = 0
# Capacity of the channel to send data from parallel scan tasks to the main task (default 32).
parallel_scan_channel_size = 32
# Whether to allow stale WAL entries read during replay.
allow_stale_entries = false
[region_engine.mito.inverted_index]
# Whether to create the index on flush.
# - "auto": automatically
# - "disable": never
create_on_flush = "auto"
# Whether to create the index on compaction.
# - "auto": automatically
# - "disable": never
create_on_compaction = "auto"
# Whether to apply the index on query
# - "auto": automatically
# - "disable": never
apply_on_query = "auto"
# Memory threshold for performing an external sort during index creation.
# Setting to empty will disable external sorting, forcing all sorting operations to happen in memory.
mem_threshold_on_create = "64M"
# File system path to store intermediate files for external sorting (default `{data_home}/index_intermediate`).
intermediate_path = ""
[region_engine.mito.memtable]
# Memtable type.
# - "experimental": experimental memtable
# - "time_series": time-series memtable (deprecated)
type = "experimental"
# The max number of keys in one shard.
index_max_keys_per_shard = 8192
# The max rows of data inside the actively writing buffer in one shard.
data_freeze_threshold = 32768
# Max dictionary bytes.
fork_dictionary_bytes = "1GiB"
# Log options
# [logging]
# Specify logs directory.
# dir = "/tmp/greptimedb/logs"
# Specify the log level [info | debug | error | warn]
# level = "info"
# whether enable tracing, default is false
# enable_otlp_tracing = false
# tracing exporter endpoint with format `ip:port`, we use grpc oltp as exporter, default endpoint is `localhost:4317`
# otlp_endpoint = "localhost:4317"
# Whether to append logs to stdout. Defaults to true.
# append_stdout = true
# The percentage of tracing will be sampled and exported. Valid range `[0, 1]`, 1 means all traces are sampled, 0 means all traces are not sampled, the default value is 1. ratio > 1 are treated as 1. Fractions < 0 are treated as 0
# [logging.tracing_sample_ratio]
# default_ratio = 0.0
# Standalone export the metrics generated by itself
# encoded to Prometheus remote-write format
# and send to Prometheus remote-write compatible receiver (e.g. send to `greptimedb` itself)
# This is only used for `greptimedb` to export its own metrics internally. It's different from prometheus scrape.
# [export_metrics]
# whether enable export metrics, default is false
# enable = false
# The interval of export metrics
# write_interval = "30s"
# for `standalone`, `self_import` is recommend to collect metrics generated by itself
# [export_metrics.self_import]
# db = "information_schema"

32
docker/Dockerfile Normal file
View File

@@ -0,0 +1,32 @@
FROM ubuntu:22.04 as builder
ENV LANG en_US.utf8
WORKDIR /greptimedb
# Install dependencies.
RUN apt-get update && apt-get install -y \
libssl-dev \
protobuf-compiler \
curl \
build-essential \
pkg-config
# Install Rust.
SHELL ["/bin/bash", "-c"]
RUN curl --proto '=https' --tlsv1.2 -sSf https://sh.rustup.rs | sh -s -- --no-modify-path --default-toolchain none -y
ENV PATH /root/.cargo/bin/:$PATH
# Build the project in release mode.
COPY . .
RUN cargo build --release
# Export the binary to the clean image.
# TODO(zyy17): Maybe should use the more secure container image.
FROM ubuntu:22.04 as base
WORKDIR /greptimedb
COPY --from=builder /greptimedb/target/release/greptime /greptimedb/bin/
ENV PATH /greptimedb/bin/:$PATH
ENTRYPOINT [ "greptime" ]
CMD [ "datanode", "start"]

View File

@@ -1,54 +0,0 @@
FROM centos:7 as builder
ARG CARGO_PROFILE
ARG FEATURES
ARG OUTPUT_DIR
ENV LANG en_US.utf8
WORKDIR /greptimedb
# Install dependencies
RUN ulimit -n 1024000 && yum groupinstall -y 'Development Tools'
RUN yum install -y epel-release \
openssl \
openssl-devel \
centos-release-scl \
rh-python38 \
rh-python38-python-devel \
which
# Install protoc
RUN curl -LO https://github.com/protocolbuffers/protobuf/releases/download/v3.15.8/protoc-3.15.8-linux-x86_64.zip
RUN unzip protoc-3.15.8-linux-x86_64.zip -d /usr/local/
# Install Rust
SHELL ["/bin/bash", "-c"]
RUN curl --proto '=https' --tlsv1.2 -sSf https://sh.rustup.rs | sh -s -- --no-modify-path --default-toolchain none -y
ENV PATH /opt/rh/rh-python38/root/usr/bin:/usr/local/bin:/root/.cargo/bin/:$PATH
# Build the project in release mode.
RUN --mount=target=.,rw \
--mount=type=cache,target=/root/.cargo/registry \
make build \
CARGO_PROFILE=${CARGO_PROFILE} \
FEATURES=${FEATURES} \
TARGET_DIR=/out/target
# Export the binary to the clean image.
FROM centos:7 as base
ARG OUTPUT_DIR
RUN yum install -y epel-release \
openssl \
openssl-devel \
centos-release-scl \
rh-python38 \
rh-python38-python-devel \
which
WORKDIR /greptime
COPY --from=builder /out/target/${OUTPUT_DIR}/greptime /greptime/bin/
ENV PATH /greptime/bin/:$PATH
ENTRYPOINT ["greptime"]

View File

@@ -1,62 +0,0 @@
FROM ubuntu:20.04 as builder
ARG CARGO_PROFILE
ARG FEATURES
ARG OUTPUT_DIR
ENV LANG en_US.utf8
WORKDIR /greptimedb
# Add PPA for Python 3.10.
RUN apt-get update && \
DEBIAN_FRONTEND=noninteractive apt-get install -y software-properties-common && \
add-apt-repository ppa:deadsnakes/ppa -y
# Install dependencies.
RUN --mount=type=cache,target=/var/cache/apt \
apt-get update && apt-get install -y \
libssl-dev \
protobuf-compiler \
curl \
git \
build-essential \
pkg-config \
python3.10 \
python3.10-dev \
python3-pip
# Install Rust.
SHELL ["/bin/bash", "-c"]
RUN curl --proto '=https' --tlsv1.2 -sSf https://sh.rustup.rs | sh -s -- --no-modify-path --default-toolchain none -y
ENV PATH /root/.cargo/bin/:$PATH
# Build the project in release mode.
RUN --mount=target=. \
--mount=type=cache,target=/root/.cargo/registry \
make build \
CARGO_PROFILE=${CARGO_PROFILE} \
FEATURES=${FEATURES} \
TARGET_DIR=/out/target
# Export the binary to the clean image.
# TODO(zyy17): Maybe should use the more secure container image.
FROM ubuntu:22.04 as base
ARG OUTPUT_DIR
RUN apt-get update && DEBIAN_FRONTEND=noninteractive apt-get \
-y install ca-certificates \
python3.10 \
python3.10-dev \
python3-pip \
curl
COPY ./docker/python/requirements.txt /etc/greptime/requirements.txt
RUN python3 -m pip install -r /etc/greptime/requirements.txt
WORKDIR /greptime
COPY --from=builder /out/target/${OUTPUT_DIR}/greptime /greptime/bin/
ENV PATH /greptime/bin/:$PATH
ENTRYPOINT ["greptime"]

View File

@@ -1,16 +0,0 @@
FROM centos:7
RUN yum install -y epel-release \
openssl \
openssl-devel \
centos-release-scl \
rh-python38 \
rh-python38-python-devel
ARG TARGETARCH
ADD $TARGETARCH/greptime /greptime/bin/
ENV PATH /greptime/bin/:$PATH
ENTRYPOINT ["greptime"]

View File

@@ -1,28 +0,0 @@
FROM ubuntu:22.04
# The root path under which contains all the dependencies to build this Dockerfile.
ARG DOCKER_BUILD_ROOT=.
# The binary name of GreptimeDB executable.
# Defaults to "greptime", but sometimes in other projects it might be different.
ARG TARGET_BIN=greptime
RUN apt-get update && DEBIAN_FRONTEND=noninteractive apt-get install -y \
ca-certificates \
python3.10 \
python3.10-dev \
python3-pip \
curl
COPY $DOCKER_BUILD_ROOT/docker/python/requirements.txt /etc/greptime/requirements.txt
RUN python3 -m pip install -r /etc/greptime/requirements.txt
ARG TARGETARCH
ADD $TARGETARCH/$TARGET_BIN /greptime/bin/
ENV PATH /greptime/bin/:$PATH
ENV TARGET_BIN=$TARGET_BIN
ENTRYPOINT ["sh", "-c", "exec $TARGET_BIN \"$@\"", "--"]

View File

@@ -1,41 +0,0 @@
FROM --platform=linux/amd64 saschpe/android-ndk:34-jdk17.0.8_7-ndk25.2.9519653-cmake3.22.1
ENV LANG en_US.utf8
WORKDIR /greptimedb
# Rename libunwind to libgcc
RUN cp ${NDK_ROOT}/toolchains/llvm/prebuilt/linux-x86_64/lib64/clang/14.0.7/lib/linux/aarch64/libunwind.a ${NDK_ROOT}/toolchains/llvm/prebuilt/linux-x86_64/lib64/clang/14.0.7/lib/linux/aarch64/libgcc.a
# Install dependencies.
RUN apt-get update && apt-get install -y \
libssl-dev \
protobuf-compiler \
curl \
git \
build-essential \
pkg-config \
python3 \
python3-dev \
python3-pip \
&& pip3 install --upgrade pip \
&& pip3 install pyarrow
# Trust workdir
RUN git config --global --add safe.directory /greptimedb
# Install Rust.
SHELL ["/bin/bash", "-c"]
RUN curl --proto '=https' --tlsv1.2 -sSf https://sh.rustup.rs | sh -s -- --no-modify-path --default-toolchain none -y
ENV PATH /root/.cargo/bin/:$PATH
# Add android toolchains
ARG RUST_TOOLCHAIN
RUN rustup toolchain install ${RUST_TOOLCHAIN}
RUN rustup target add aarch64-linux-android
# Install cargo-ndk
RUN cargo install cargo-ndk
ENV ANDROID_NDK_HOME $NDK_ROOT
# Builder entrypoint.
CMD ["cargo", "ndk", "--platform", "23", "-t", "aarch64-linux-android", "build", "--bin", "greptime", "--profile", "release", "--no-default-features"]

View File

@@ -1,30 +0,0 @@
FROM centos:7 as builder
ENV LANG en_US.utf8
# Install dependencies
RUN ulimit -n 1024000 && yum groupinstall -y 'Development Tools'
RUN yum install -y epel-release \
openssl \
openssl-devel \
centos-release-scl \
rh-python38 \
rh-python38-python-devel \
which
# Install protoc
RUN curl -LO https://github.com/protocolbuffers/protobuf/releases/download/v3.15.8/protoc-3.15.8-linux-x86_64.zip
RUN unzip protoc-3.15.8-linux-x86_64.zip -d /usr/local/
# Install Rust
SHELL ["/bin/bash", "-c"]
RUN curl --proto '=https' --tlsv1.2 -sSf https://sh.rustup.rs | sh -s -- --no-modify-path --default-toolchain none -y
ENV PATH /opt/rh/rh-python38/root/usr/bin:/usr/local/bin:/root/.cargo/bin/:$PATH
# Install Rust toolchains.
ARG RUST_TOOLCHAIN
RUN rustup toolchain install ${RUST_TOOLCHAIN}
# Install nextest.
RUN cargo install cargo-binstall --locked
RUN cargo binstall cargo-nextest --no-confirm

View File

@@ -1,60 +0,0 @@
FROM ubuntu:20.04
# The root path under which contains all the dependencies to build this Dockerfile.
ARG DOCKER_BUILD_ROOT=.
ENV LANG en_US.utf8
WORKDIR /greptimedb
# Add PPA for Python 3.10.
RUN apt-get update && \
DEBIAN_FRONTEND=noninteractive apt-get install -y software-properties-common && \
add-apt-repository ppa:deadsnakes/ppa -y
# Install dependencies.
RUN apt-get update && DEBIAN_FRONTEND=noninteractive apt-get install -y \
libssl-dev \
tzdata \
protobuf-compiler \
curl \
ca-certificates \
git \
build-essential \
pkg-config \
python3.10 \
python3.10-dev
# Remove Python 3.8 and install pip.
RUN apt-get -y purge python3.8 && \
apt-get -y autoremove && \
ln -s /usr/bin/python3.10 /usr/bin/python3 && \
curl -sS https://bootstrap.pypa.io/get-pip.py | python3.10
# Silence all `safe.directory` warnings, to avoid the "detect dubious repository" error when building with submodules.
# Disabling the safe directory check here won't pose extra security issues, because in our usage for this dev build
# image, we use it solely on our own environment (that github action's VM, or ECS created dynamically by ourselves),
# and the repositories are pulled from trusted sources (still us, of course). Doing so does not violate the intention
# of the Git's addition to the "safe.directory" at the first place (see the commit message here:
# https://github.com/git/git/commit/8959555cee7ec045958f9b6dd62e541affb7e7d9).
# There's also another solution to this, that we add the desired submodules to the safe directory, instead of using
# wildcard here. However, that requires the git's config files and the submodules all owned by the very same user.
# It's troublesome to do this since the dev build runs in Docker, which is under user "root"; while outside the Docker,
# it can be a different user that have prepared the submodules.
RUN git config --global --add safe.directory *
# Install Python dependencies.
COPY $DOCKER_BUILD_ROOT/docker/python/requirements.txt /etc/greptime/requirements.txt
RUN python3 -m pip install -r /etc/greptime/requirements.txt
# Install Rust.
SHELL ["/bin/bash", "-c"]
RUN curl --proto '=https' --tlsv1.2 -sSf https://sh.rustup.rs | sh -s -- --no-modify-path --default-toolchain none -y
ENV PATH /root/.cargo/bin/:$PATH
# Install Rust toolchains.
ARG RUST_TOOLCHAIN
RUN rustup toolchain install ${RUST_TOOLCHAIN}
# Install nextest.
RUN cargo install cargo-binstall --locked
RUN cargo binstall cargo-nextest --no-confirm

View File

@@ -1,48 +0,0 @@
# Use the legacy glibc 2.28.
FROM ubuntu:18.10
ENV LANG en_US.utf8
WORKDIR /greptimedb
# Use old-releases.ubuntu.com to avoid 404s: https://help.ubuntu.com/community/EOLUpgrades.
RUN echo "deb http://old-releases.ubuntu.com/ubuntu/ cosmic main restricted universe multiverse\n\
deb http://old-releases.ubuntu.com/ubuntu/ cosmic-updates main restricted universe multiverse\n\
deb http://old-releases.ubuntu.com/ubuntu/ cosmic-security main restricted universe multiverse" > /etc/apt/sources.list
# Install dependencies.
RUN apt-get update && DEBIAN_FRONTEND=noninteractive apt-get install -y \
libssl-dev \
tzdata \
curl \
ca-certificates \
git \
build-essential \
unzip \
pkg-config
# Install protoc.
ENV PROTOC_VERSION=25.1
RUN if [ "$(uname -m)" = "x86_64" ]; then \
PROTOC_ZIP=protoc-${PROTOC_VERSION}-linux-x86_64.zip; \
elif [ "$(uname -m)" = "aarch64" ]; then \
PROTOC_ZIP=protoc-${PROTOC_VERSION}-linux-aarch_64.zip; \
else \
echo "Unsupported architecture"; exit 1; \
fi && \
curl -OL https://github.com/protocolbuffers/protobuf/releases/download/v${PROTOC_VERSION}/${PROTOC_ZIP} && \
unzip -o ${PROTOC_ZIP} -d /usr/local bin/protoc && \
unzip -o ${PROTOC_ZIP} -d /usr/local 'include/*' && \
rm -f ${PROTOC_ZIP}
# Install Rust.
SHELL ["/bin/bash", "-c"]
RUN curl --proto '=https' --tlsv1.2 -sSf https://sh.rustup.rs | sh -s -- --no-modify-path --default-toolchain none -y
ENV PATH /root/.cargo/bin/:$PATH
# Install Rust toolchains.
ARG RUST_TOOLCHAIN
RUN rustup toolchain install ${RUST_TOOLCHAIN}
# Install nextest.
RUN cargo install cargo-binstall --locked
RUN cargo binstall cargo-nextest --no-confirm

View File

@@ -1,5 +0,0 @@
numpy>=1.24.2
pandas>=1.5.3
pyarrow>=11.0.0
requests>=2.28.2
scipy>=1.10.1

Binary file not shown.

Before

Width:  |  Height:  |  Size: 51 KiB

View File

@@ -1,39 +0,0 @@
# TSBS benchmark - v0.3.2
## Environment
| | |
| --- | --- |
| CPU | AMD Ryzen 7 7735HS (8 core 3.2GHz) |
| Memory | 32GB |
| Disk | SOLIDIGM SSDPFKNU010TZ |
| OS | Ubuntu 22.04.2 LTS |
## Write performance
| Write buffer size | Ingest raterows/s |
| --- | --- |
| 512M | 139583.04 |
| 32M | 279250.52 |
## Query performance
| Query type | v0.3.2 write buffer 32M (ms) | v0.3.2 write buffer 512M (ms) | v0.3.1 write buffer 32M (ms) |
| --- | --- | --- | --- |
| cpu-max-all-1 | 921.12 | 241.23 | 553.63 |
| cpu-max-all-8 | 2657.66 | 502.78 | 3308.41 |
| double-groupby-1 | 28238.85 | 27367.42 | 52148.22 |
| double-groupby-5 | 33094.65 | 32421.89 | 56762.37 |
| double-groupby-all | 38565.89 | 38635.52 | 59596.80 |
| groupby-orderby-limit | 23321.60 | 22423.55 | 53983.23 |
| high-cpu-1 | 1167.04 | 254.15 | 832.41 |
| high-cpu-all | 32814.08 | 29906.94 | 62853.12 |
| lastpoint | 192045.05 | 153575.42 | NA |
| single-groupby-1-1-1 | 63.97 | 87.35 | 92.66 |
| single-groupby-1-1-12 | 666.24 | 326.98 | 781.50 |
| single-groupby-1-8-1 | 225.29 | 137.97 |281.95 |
| single-groupby-5-1-1 | 70.40 | 81.64 | 86.15 |
| single-groupby-5-1-12 | 722.75 | 356.01 | 805.18 |
| single-groupby-5-8-1 | 285.60 | 115.88 | 326.29 |

View File

@@ -1,61 +0,0 @@
# TSBS benchmark - v0.4.0
## Environment
### Local
| | |
| ------ | ---------------------------------- |
| CPU | AMD Ryzen 7 7735HS (8 core 3.2GHz) |
| Memory | 32GB |
| Disk | SOLIDIGM SSDPFKNU010TZ |
| OS | Ubuntu 22.04.2 LTS |
### Aliyun amd64
| | |
| ------- | -------------- |
| Machine | ecs.g7.4xlarge |
| CPU | 16 core |
| Memory | 64GB |
| Disk | 100G |
| OS | Ubuntu 22.04 |
### Aliyun arm64
| | |
| ------- | ----------------- |
| Machine | ecs.g8y.4xlarge |
| CPU | 16 core |
| Memory | 64GB |
| Disk | 100G |
| OS | Ubuntu 22.04 ARM |
## Write performance
| Environment | Ingest raterows/s |
| ------------------ | --------------------- |
| Local | 365280.60 |
| Aliyun g7.4xlarge | 341368.72 |
| Aliyun g8y.4xlarge | 320907.29 |
## Query performance
| Query type | Local (ms) | Aliyun g7.4xlarge (ms) | Aliyun g8y.4xlarge (ms) |
| --------------------- | ---------- | ---------------------- | ----------------------- |
| cpu-max-all-1 | 50.70 | 31.46 | 47.61 |
| cpu-max-all-8 | 262.16 | 129.26 | 152.43 |
| double-groupby-1 | 2512.71 | 1408.19 | 1586.10 |
| double-groupby-5 | 3896.15 | 2304.29 | 2585.29 |
| double-groupby-all | 5404.67 | 3337.61 | 3773.91 |
| groupby-orderby-limit | 3786.98 | 2065.72 | 2312.57 |
| high-cpu-1 | 71.96 | 37.29 | 54.01 |
| high-cpu-all | 9468.75 | 7595.69 | 8467.46 |
| lastpoint | 13379.43 | 11253.76 | 12949.40 |
| single-groupby-1-1-1 | 20.72 | 12.16 | 13.35 |
| single-groupby-1-1-12 | 28.53 | 15.67 | 21.62 |
| single-groupby-1-8-1 | 72.23 | 37.90 | 43.52 |
| single-groupby-5-1-1 | 26.75 | 15.59 | 17.48 |
| single-groupby-5-1-12 | 45.41 | 22.90 | 31.96 |
| single-groupby-5-8-1 | 107.96 | 59.76 | 69.58 |

View File

@@ -1,50 +0,0 @@
# TSBS benchmark - v0.7.0
## Environment
### Local
| | |
| ------ | ---------------------------------- |
| CPU | AMD Ryzen 7 7735HS (8 core 3.2GHz) |
| Memory | 32GB |
| Disk | SOLIDIGM SSDPFKNU010TZ |
| OS | Ubuntu 22.04.2 LTS |
### Amazon EC2
| | |
| ------- | -------------- |
| Machine | c5d.2xlarge |
| CPU | 8 core |
| Memory | 16GB |
| Disk | 50GB (GP3) |
| OS | Ubuntu 22.04.1 |
## Write performance
| Environment | Ingest rate (rows/s) |
| ------------------ | --------------------- |
| Local | 3695814.64 |
| EC2 c5d.2xlarge | 2987166.64 |
## Query performance
| Query type | Local (ms) | EC2 c5d.2xlarge (ms) |
| --------------------- | ---------- | ---------------------- |
| cpu-max-all-1 | 30.56 | 54.74 |
| cpu-max-all-8 | 52.69 | 70.50 |
| double-groupby-1 | 664.30 | 1366.63 |
| double-groupby-5 | 1391.26 | 2141.71 |
| double-groupby-all | 2828.94 | 3389.59 |
| groupby-orderby-limit | 718.92 | 1213.90 |
| high-cpu-1 | 29.21 | 52.98 |
| high-cpu-all | 5514.12 | 7194.91 |
| lastpoint | 7571.40 | 9423.41 |
| single-groupby-1-1-1 | 19.09 | 7.77 |
| single-groupby-1-1-12 | 27.28 | 51.64 |
| single-groupby-1-8-1 | 31.85 | 11.64 |
| single-groupby-5-1-1 | 16.14 | 9.67 |
| single-groupby-5-1-12 | 27.21 | 53.62 |
| single-groupby-5-8-1 | 39.62 | 14.96 |

View File

@@ -1,74 +0,0 @@
This document introduces how to implement SQL statements in GreptimeDB.
The execution entry point for SQL statements locates at Frontend Instance. You can see it has
implemented `SqlQueryHandler`:
```rust
impl SqlQueryHandler for Instance {
type Error = Error;
async fn do_query(&self, query: &str, query_ctx: QueryContextRef) -> Vec<Result<Output>> {
// ...
}
}
```
Normally, when a SQL query arrives at GreptimeDB, the `do_query` method will be called. After some parsing work, the SQL
will be feed into `StatementExecutor`:
```rust
// in Frontend Instance:
self.statement_executor.execute_sql(stmt, query_ctx).await
```
That's where we handle our SQL statements. You can just create a new match arm for your statement there, then the
statement is implemented for both GreptimeDB Standalone and Cluster. You can see how `DESCRIBE TABLE` is implemented as
an example.
Now, what if the statements should be handled differently for GreptimeDB Standalone and Cluster? You can see there's
a `SqlStatementExecutor` field in `StatementExecutor`. Each GreptimeDB Standalone and Cluster has its own implementation
of `SqlStatementExecutor`. If you are going to implement the statements differently in the two mode (
like `CREATE TABLE`), you have to implement them in their own `SqlStatementExecutor`s.
Summarize as the diagram below:
```text
SQL query
|
v
+---------------------------+
| SqlQueryHandler::do_query |
+---------------------------+
|
| SQL parsing
v
+--------------------------------+
| StatementExecutor::execute_sql |
+--------------------------------+
|
| SQL execution
v
+----------------------------------+
| commonly handled statements like |
| "plan_exec" for selection or |
+----------------------------------+
| |
For Standalone | | For Cluster
v v
+---------------------------+ +---------------------------+
| SqlStatementExecutor impl | | SqlStatementExecutor impl |
| in Datanode Instance | | in Frontend DistInstance |
+---------------------------+ +---------------------------+
```
Note that some SQL statements can be executed in our QueryEngine, in the form of `LogicalPlan`. You can follow the
invocation path down to the `QueryEngine` implementation from `StatementExecutor::plan_exec`. For now, there's only
one `DatafusionQueryEngine` for both GreptimeDB Standalone and Cluster. That lone query engine works for both modes is
because GreptimeDB read/write data through `Table` trait, and each mode has its own `Table` implementation.
We don't have any bias towards whether statements should be handled in query engine or `StatementExecutor`. You can
implement one kind of statement in both places. For example, `Insert` with selection is handled in query engine, because
we can easily do the query part there. However, `Insert` without selection is not, for the cost of parsing statement
to `LogicalPlan` is not neglectable. So generally if the SQL query is simple enough, you can handle it
in `StatementExecutor`; otherwise if it is complex or has some part of selection, it should be parsed to `LogicalPlan`
and handled in query engine.

View File

@@ -55,7 +55,7 @@ The DataFusion basically execute aggregate like this:
2. Call `update_batch` on each accumulator with partitioned data, to let you update your aggregate calculation.
3. Call `state` to get each accumulator's internal state, the medial calculation result.
4. Call `merge_batch` to merge all accumulator's internal state to one.
5. Execute `evaluate` on the chosen one to get the final calculation result.
5. Execute `evalute` on the chosen one to get the final calculation result.
Once you know the meaning of each method, you can easily write your accumulator. You can refer to `Median` accumulator or `SUM` accumulator defined in file `my_sum_udaf_example.rs` for more details.
@@ -63,7 +63,7 @@ Once you know the meaning of each method, you can easily write your accumulator.
You can call `register_aggregate_function` method in query engine to register your aggregate function. To do that, you have to new an instance of struct `AggregateFunctionMeta`. The struct has three fields, first is the name of your aggregate function's name. The function name is case-sensitive due to DataFusion's restriction. We strongly recommend using lowercase for your name. If you have to use uppercase name, wrap your aggregate function with quotation marks. For example, if you define an aggregate function named "my_aggr", you can use "`SELECT MY_AGGR(x)`"; if you define "my_AGGR", you have to use "`SELECT "my_AGGR"(x)`".
The second field is arg_counts ,the count of the arguments. Like accumulator `percentile`, calculating the p_number of the column. We need to input the value of column and the value of p to cacalate, and so the count of the arguments is two.
The second field is arg_counts ,the count of the arguments. Like accumulator `percentile`, caculating the p_number of the column. We need to input the value of column and the value of p to cacalate, and so the count of the arguments is two.
The third field is a function about how to create your accumulator creator that you defined in step 1 above. Create creator, that's a bit intertwined, but it is how we make DataFusion use a newly created aggregate function each time it executes a SQL, preventing the stored input types from affecting each other. The key detail can be starting looking at our `DfContextProviderAdapter` struct's `get_aggregate_meta` method.

Binary file not shown.

Before

Width:  |  Height:  |  Size: 25 KiB

Binary file not shown.

Before

Width:  |  Height:  |  Size: 18 KiB

Binary file not shown.

Before

Width:  |  Height:  |  Size: 34 KiB

Binary file not shown.

Before

Width:  |  Height:  |  Size: 58 KiB

Binary file not shown.

Before

Width:  |  Height:  |  Size: 35 KiB

Binary file not shown.

Before

Width:  |  Height:  |  Size: 46 KiB

View File

@@ -1,175 +0,0 @@
---
Feature Name: "promql-in-rust"
Tracking Issue: https://github.com/GreptimeTeam/greptimedb/issues/596
Date: 2022-12-20
Author: "Ruihang Xia <waynestxia@gmail.com>"
---
Rewrite PromQL in Rust
----------------------
# Summary
A Rust native implementation of PromQL, for GreptimeDB.
# Motivation
Prometheus and its query language PromQL prevails in the cloud-native observability area, which is an important scenario for time series database like GreptimeDB. We already have support for its remote read and write protocols. Users can now integrate GreptimeDB as the storage backend to existing Prometheus deployment, but cannot run PromQL query directly on GreptimeDB like SQL.
This RFC proposes to add support for PromQL. Because it was created in Go, we can't use the existing code easily. For interoperability, performance and extendability, porting its logic to Rust is a good choice.
# Details
## Overview
One of the goals is to make use of our existing basic operators, execution model and runtime to reduce the work. So the entire proposal is built on top of Apache Arrow DataFusion. The rewrote PromQL logic is manifested as `Expr` or `Execution Plan` in DataFusion. And both the intermediate data structure and the result is in the format of `Arrow`'s `RecordBatch`.
The following sections are organized in a top-down manner. Starts with evaluation procedure. Then introduces the building blocks of our new PromQL operation. Follows by an explanation of data model. And end with an example logic plan.
*This RFC is heavily related to Prometheus and PromQL. It won't repeat some basic concepts of them.*
## Evaluation
The original implementation is like an interpreter of parsed PromQL AST. It has two characteristics: (1) Operations are evaluated in place after they are parsed to AST. And some key parameters are separated from the AST because they do not present in the query, but come from other places like another field in the HTTP payload. (2) calculation is performed per timestamp. You can see this pattern many times:
```go
for ts := ev.startTimestamp; ts <= ev.endTimestamp; ts += ev.interval {}
```
These bring out two differences in the proposed implementation. First, to make it more general and clear, the evaluation procedure is reorganized into serval phases (and is the same as DataFusion's). And second, data are evaluated by time series (corresponding to "columnar calculation", if think timestamp as row number).
```
Logic
Query AST Plan
─────────► Parser ───────► Logical ────────► Physical ────┐
Planner Planner │
◄───────────────────────────── Executor ◄────────────────┘
Evaluation Result Execution
Plan
```
- Parser
Provided by [`promql-parser`](https://github.com/GreptimeTeam/promql-parser) crate. Same as the original implementation.
- Logical Planner
Generates a logical plan with all the needed parameters. It should accept something like `EvalStmt` in Go's implementation, which contains query time range, evaluation interval and lookback range.
Another important thing done here is assembling the logic plan, with all the operations baked into logically. Like what's the filter and time range to read, how the data then flows through a selector into a binary operation, etc. Or what's the output schema of every single step. The generated logic plan is deterministic without variables, and can be `EXPLAIN`ed clearly.
- Physical Planner
This step converts a logic plan into evaluatable execution plan. There are not many special things like the previous step. Except when a query is going to be executed distributedly. In this case, a logic plan will be divided into serval parts and sent to serval nodes. One physical planner only sees its own part.
- Executor
As its name shows, this step calculates data to result. And all new calculation logic, the implementation of PromQL in rust, is placed here. And the rewrote functions are using `RecordBatch` and `Array` from `Arrow` as the intermediate data structure.
Each "batch" contains only data from single time series. This is from the underlying storage implementation. Though it's not a requirement of this RFC, having this property can simplify some functions.
Another thing to mention is the rewrote functions don't aware of timestamp or value columns, they are defined only based on the input data types. For example, `increase()` function in PromQL calculates the unbiased delta of data, its implementation here only does this single thing. Let's compare the signature of two implementations:
- Go
```go
func funcIncrease(vals []parser.Value, args parser.Expressions) Vector {}
```
- Rust
```rust
fn prom_increase(input: Array) -> Array {}
```
Some unimportant parameters are omitted. The original Go version only writes the logic for `Point`'s value, either float or histogram. But the proposed rewritten one accepts a generic `Array` as input, which can be any type that suits, from `i8` to `u64` to `TimestampNanosecond`.
## Plan and Expression
They are structures to express logic from PromQL. The proposed implementation is built on top of DataFusion, thus our plan and expression are in form of `ExtensionPlan` and `ScalarUDF`. The only difference between them in this context is the return type: plan returns a record batch while expression returns a single column.
This RFC proposes to add four new plans, they are fundamental building blocks that mainly handle data selection logic in PromQL, for the following calculation expressions.
- `SeriesNormalize`
Sort data inside one series on the timestamp column, and bias "offset" if has. This plan usually comes after `TableScan` (or `TableScan` and `Filter`) plan.
- `VectorManipulator` and `MatrixManipulator`
Corresponding to `InstantSelector` and `RangeSelector`. We don't calculate timestamp by timestamp, thus use "vector" instead of "instant", this image shows the difference. And "matrix" is another name for "range vector", for not confused with our "vector". The following section will detail how they are implemented using Arrow.
![instant_and_vector](instant-and-vector.png)
Due to "interval" parameter in PromQL, data after "selector" (or "manipulator" here) are usually shorter than input. And we have to modify the entire record batch to shorten both timestamp, value and tag columns. So they are formed as plan.
- `PromAggregator`
The carrier of aggregator expressions. This should not be very different from the DataFusion built-in `Aggregate` plan, except PromQL can use "group without" to do reverse selection.
PromQL has around 70 expressions and functions. But luckily we can reuse lots of them from DataFusion. Like unary expression, binary expression and aggregator. We only need to implement those PromQL-specific expressions, like `rate` or `percentile`. The following table lists some typical functions in PromQL, and their signature in the proposed implementation. Other function should be the same.
| Name | In Param(s) | Out Param(s) | Explain |
|-------------------- |------------------------------------------------------ |-------------- |-------------------- |
| instant_delta | Matrix T | Array T | idelta in PromQL |
| increase | Matrix T | Array T | increase in PromQL |
| extrapolate_factor | - Matrix T<br>- Array Timestamp<br>- Array Timestamp | Array T | * |
*: *`extrapolate_factor` is one of the "dark sides" in PromQL. In short it's a translation of this [paragraph](https://github.com/prometheus/prometheus/blob/0372e259baf014bbade3134fd79bcdfd8cbdef2c/promql/functions.go#L134-L159)*
To reuse those common calculation logic, we can break them into serval expressions, and assemble in the logic planning phase. Like `rate()` in PromQL can be represented as `increase / extrapolate_factor`.
## Data Model
This part explains how data is represented. Following the data model in GreptimeDB, all the data are stored as table, with tag columns, timestamp column and value column. Table to record batch is very straightforward. So an instant vector can be thought of as a row (though as said before, we don't use instant vectors) in the table. Given four basic types in PromQL: scalar, string, instant vector and range vector, only the last "range vector" need some tricks to adapt our columnar calculation.
Range vector is some sort of matrix, it's consisted of small one-dimension vectors, with each being an input of range function. And, applying range function to a range vector can be thought of kind of convolution.
![range-vector-with-matrix](range-vector-with-matrix.png)
(Left is an illustration of range vector. Notice the Y-axis has no meaning, it's just put different pieces separately. The right side is an imagined "matrix" as range function. Multiplying the left side to it can get a one-dimension "matrix" with four elements. That's the evaluation result of a range vector.)
To adapt this range vector to record batch, it should be represented by a column. This RFC proposes to use `DictionaryArray` from Arrow to represent range vector, or `Matrix`. This is "misusing" `DictionaryArray` to ship some additional information about an array. Because the range vector is sliding over one series, we only need to know the `offset` and `length` of each slides to reconstruct the matrix from an array:
![matrix-from-array](matrix-from-array.png)
The length is not fixed, it depends on the input's timestamp. An PoC implementation of `Matrix` and `increase()` can be found in [this repo](https://github.com/waynexia/corroding-prometheus).
## Example
The logic plan of PromQL query
```promql
# start: 2022-12-20T10:00:00
# end: 2022-12-21T10:00:00
# interval: 1m
# lookback: 30s
sum (rate(request_duration[5m])) by (idc)
```
looks like
<!-- title: 'PromAggregator: \naggr = sum, column = idc'
operator: prom
inputs:
- title: 'Matrix Manipulator: \ninterval = 1m, range = 5m, expr = div(increase(value), extrapolate_factor(timestamp))'
operator: prom
inputs:
- title: 'Series Normalize: \noffset = 0'
operator: prom
inputs:
- title: 'Filter: \ntimestamp > 2022-12-20T10:00:00 && timestamp < 2022-12-21T10:00:00'
operator: filter
inputs:
- title: 'Table Scan: \ntable = request_duration, timestamp > 2022-12-20T10:00:00 && timestamp < 2022-12-21T10:00:00'
operator: scan -->
![example](example.png)
# Drawbacks
Human-being is always error-prone. It's harder to endeavor to rewrite from the ground and requires more attention to ensure correctness, than translate line-by-line. And, since the evaluator's architecture are different, it might be painful to catch up with PromQL's breaking update (if any) in the future.
Misusing Arrow's DictionaryVector as Matrix is another point. This hack needs some `unsafe` function call to bypass Arrow's check. And though Arrow's API is stable, this is still an undocumented behavior.
# Alternatives
There are a few alternatives we've considered:
- Wrap the existing PromQL's implementation via FFI, and import it to GreptimeDB.
- Translate its evaluator engine line-by-line, rather than rewrite one.
- Integrate the Prometheus server into GreptimeDB via RPC, making it a detached execution engine for PromQL.
The first and second options are making a separate execution engine in GreptimeDB, they may alleviate the pain during rewriting, but will have negative impacts to afterward evolve like resource management. And introduce another deploy component in the last option will bring a complex deploy architecture.
And all of them are more or less redundant in data transportation that affects performance and resources. The proposed built-in executing procedure is also easy to integrate and expose to the existing SQL interface GreptimeDB currently provides. Some concepts in PromQL like sliding windows (range vector in PromQL) are very convenient and ergonomic in analyzing series data. This makes it not only a PromQL evaluator, but also an enhancement to our query system.

View File

@@ -1,151 +0,0 @@
---
Feature Name: "procedure-framework"
Tracking Issue: https://github.com/GreptimeTeam/greptimedb/issues/286
Date: 2023-01-03
Author: "Yingwen <realevenyag@gmail.com>"
---
Procedure Framework
----------------------
# Summary
A framework for executing operations in a fault-tolerant manner.
# Motivation
Some operations in GreptimeDB require multiple steps to implement. For example, creating a table needs:
1. Check whether the table exists
2. Create the table in the table engine
1. Create a region for the table in the storage engine
2. Persist the metadata of the table to the table manifest
3. Add the table to the catalog manager
If the node dies or restarts in the middle of creating a table, it could leave the system in an inconsistent state. The procedure framework, inspired by [Apache HBase's ProcedureV2 framework](https://github.com/apache/hbase/blob/bfc9fc9605de638785435e404430a9408b99a8d0/src/main/asciidoc/_chapters/pv2.adoc) and [Apache Accumulos FATE framework](https://accumulo.apache.org/docs/2.x/administration/fate), aims to provide a unified way to implement multi-step operations that is tolerant to failure.
# Details
## Overview
The procedure framework consists of the following primary components:
- A `Procedure` represents an operation or a set of operations to be performed step-by-step
- `ProcedureManager`, the runtime to run `Procedures`. It executes the submitted procedures, stores procedures' states to the `ProcedureStore` and restores procedures from `ProcedureStore` while the database restarts.
- `ProcedureStore` is a storage layer for persisting the procedure state
## Procedures
The `ProcedureManager` keeps calling `Procedure::execute()` until the Procedure is done, so the operation of the Procedure should be [idempotent](https://developer.mozilla.org/en-US/docs/Glossary/Idempotent): it needs to be able to undo or replay a partial execution of itself.
```rust
trait Procedure {
fn execute(&mut self, ctx: &Context) -> Result<Status>;
fn dump(&self) -> Result<String>;
fn rollback(&self) -> Result<()>;
// other methods...
}
```
The `Status` is an enum that has the following variants:
```rust
enum Status {
Executing {
persist: bool,
},
Suspended {
subprocedures: Vec<ProcedureWithId>,
persist: bool,
},
Done,
}
```
A call to `execute()` can result in the following possibilities:
- `Ok(Status::Done)`: we are done
- `Ok(Status::Executing { .. })`: there are remaining steps to do
- `Ok(Status::Suspend { sub_procedure, .. })`: execution is suspended and can be resumed later after the sub-procedure is done.
- `Err(e)`: error occurs during execution and the procedure is unable to proceed anymore.
Users need to assign a unique `ProcedureId` to the procedure and the procedure can get this id via the `Context`. The `ProcedureId` is typically a UUID.
```rust
struct Context {
id: ProcedureId,
// other fields ...
}
```
The `ProcedureManager` calls `Procedure::dump()` to serialize the internal state of the procedure and writes to the `ProcedureStore`. The `Status` has a field `persist` to tell the `ProcedureManager` whether it needs persistence.
## Sub-procedures
A procedure may need to create some sub-procedures to process its subtasks. For example, creating a distributed table with multiple regions (partitions) needs to set up the regions in each node, thus the parent procedure should instantiate a sub-procedure for each region. The `ProcedureManager` makes sure that the parent procedure does not proceed till all sub-procedures are successfully finished.
The procedure can submit sub-procedures to the `ProcedureManager` by returning `Status::Suspended`. It needs to assign a procedure id to each procedure manually so it can track the status of the sub-procedures.
```rust
struct ProcedureWithId {
id: ProcedureId,
procedure: BoxedProcedure,
}
```
## ProcedureStore
We might need to provide two different ProcedureStore implementations:
- In standalone mode, it stores data on the local disk.
- In distributed mode, it stores data on the meta server or the object store service.
These implementations should share the same storage structure. They store each procedure's state in a unique path based on the procedure id:
```
Sample paths:
/procedures/{PROCEDURE_ID}/000001.step
/procedures/{PROCEDURE_ID}/000002.step
/procedures/{PROCEDURE_ID}/000003.commit
```
`ProcedureStore` behaves like a WAL. Before performing each step, the `ProcedureManager` can write the procedure's current state to the ProcedureStore, which stores the state in the `.step` file. The `000001` in the path is a monotonic increasing sequence of the step. After the procedure is done, the `ProcedureManager` puts a `.commit` file to indicate the procedure is finished (committed).
The `ProcedureManager` can remove the procedure's files once the procedure is done, but it needs to leave the `.commit` as the last file to remove in case of failure during removal.
## ProcedureManager
`ProcedureManager` executes procedures submitted to it.
```rust
trait ProcedureManager {
fn register_loader(&self, name: &str, loader: BoxedProcedureLoader) -> Result<()>;
async fn submit(&self, procedure: ProcedureWithId) -> Result<()>;
}
```
It supports the following operations:
- Register a `ProcedureLoader` by the type name of the `Procedure`.
- Submit a `Procedure` to the manager and execute it.
When `ProcedureManager` starts, it loads procedures from the `ProcedureStore` and restores the procedures by the `ProcedureLoader`. The manager stores the type name from `Procedure::type_name()` with the data from `Procedure::dump()` in the `.step` file and uses the type name to find a `ProcedureLoader` to recover the procedure from its data.
```rust
type BoxedProcedureLoader = Box<dyn Fn(&str) -> Result<BoxedProcedure> + Send>;
```
## Rollback
The rollback step is supposed to clean up the resources created during the execute() step. When a procedure has failed, the `ProcedureManager` puts a `rollback` file and calls the `Procedure::rollback()` method.
```text
/procedures/{PROCEDURE_ID}/000001.step
/procedures/{PROCEDURE_ID}/000002.rollback
```
Rollback is complicated to implement so some procedures might not support rollback or only provide a best-efforts approach.
## Locking
The `ProcedureManager` can provide a locking mechanism that gives a procedure read/write access to a database object such as a table so other procedures are unable to modify the same table while the current one is executing.
# Drawbacks
The `Procedure` framework introduces additional complexity and overhead to our database.
- To execute a `Procedure`, we need to write to the `ProcedureStore` multiple times, which may slow down the server
- We need to rewrite the logic of creating/dropping/altering a table using the procedure framework
# Alternatives
Another approach is to tolerate failure during execution and allow users to retry the operation until it succeeds. But we still need to:
- Make each step idempotent
- Record the status in some place to check whether we are done

View File

@@ -1,92 +0,0 @@
---
Feature Name: "table-compaction"
Tracking Issue: https://github.com/GreptimeTeam/greptimedb/issues/930
Date: 2023-02-01
Author: "Lei, HUANG <mrsatangel@gmail.com>"
---
# Table Compaction
---
## Background
GreptimeDB uses an LSM-tree based storage engine that flushes memtables to SSTs for persistence.
But currently it only supports level 0. SST files in level 0 does not guarantee to contain only rows with disjoint time ranges.
That is to say, different SST files in level 0 may contain overlapped timestamps.
The consequence is, in order to retrieve rows in some time range, all files need to be scanned, which brings a lot of IO overhead.
Also, just like other LSMT engines, delete/update to existing primary keys are converted to new rows with delete/update mark and appended to SSTs on flushing.
We need to merge the operations to same primary keys so that we don't have to go through all SST files to find the final state of these primary keys.
## Goal
Implement a compaction framework to:
- maintain SSTs in timestamp order to accelerate queries with timestamp condition;
- merge rows with same primary key;
- purge expired SSTs;
- accommodate other tasks like data rollup/indexing.
## Overview
Table compaction involves following components:
- Compaction scheduler: run compaction tasks, limit the consumed resources;
- Compaction strategy: find the SSTs to compact and determine the output files of compaction.
- Compaction task: read the rows from input SSTs and write to the output files.
## Implementation
### Compaction scheduler
`CompactionScheduler` is an executor that continuously polls and executes compaction request from a task queue.
```rust
#[async_trait]
pub trait CompactionScheduler {
/// Schedules a compaction task.
async fn schedule(&self, task: CompactionRequest) -> Result<()>;
/// Stops compaction scheduler.
async fn stop(&self) -> Result<()>;
}
```
### Compaction triggering
Currently, we can check whether to compact tables when memtable is flushed to SST.
https://github.com/GreptimeTeam/greptimedb/blob/4015dd80752e1e6aaa3d7cacc3203cb67ed9be6d/src/storage/src/flush.rs#L245
### Compaction strategy
`CompactionStrategy` defines how to pick SSTs in all levels for compaction.
```rust
pub trait CompactionStrategy {
fn pick(
&self,
ctx: CompactionContext,
levels: &LevelMetas,
) -> Result<CompactionTask>;
}
```
The most suitable compaction strategy for time-series scenario would be
a hybrid strategy that combines time window compaction with size-tired compaction, just like [Cassandra](https://cassandra.apache.org/doc/latest/cassandra/operating/compaction/twcs.html) and [ScyllaDB](https://docs.scylladb.com/stable/architecture/compaction/compaction-strategies.html#time-window-compaction-strategy-twcs) does.
We can first group SSTs in level n into buckets according to some predefined time window. Within that window,
SSTs are compacted in a size-tired manner (find SSTs with similar size and compact them to level n+1).
SSTs from different time windows are neven compacted together.
That strategy guarantees SSTs in each level are mainly sorted in timestamp order which boosts queries with
explicit timestamp condition, while size-tired compaction minimizes the impact to foreground writes.
### Alternatives
Currently, GreptimeDB's storage engine [only support two levels](https://github.com/GreptimeTeam/greptimedb/blob/43aefc5d74dfa73b7819cae77b7eb546d8534a41/src/storage/src/sst.rs#L32).
For level 0, we can start with a simple time-window based leveled compaction, which reads from all SSTs in level 0,
align them to time windows with a fixed duration, merge them with SSTs in level 1 within the same time window
to ensure there is only one sorted run in level 1.

Some files were not shown because too many files have changed in this diff Show More