greptimedb

mirror of https://github.com/GreptimeTeam/greptimedb.git synced 2026-07-07 14:30:39 +00:00

Lei, HUANG 11ecb7a28a refactor(servers): bulk insert service (#7329)

* refactor/bulk-insert-service:
 refactor: decode FlightData early in put_record_batch pipeline

 - Move FlightDecoder usage from Inserter up to PutRecordBatchRequestStream,
   passing decoded RecordBatch and schema bytes instead of raw FlightData.
 - Eliminate redundant per-request decoding/encoding in Inserter; encode
   once and reuse for all region requests.
 - Streamline GrpcQueryHandler trait and implementations to accept
   PutRecordBatchRequest containing pre-decoded data.

Signed-off-by: Lei, HUANG <mrsatangel@gmail.com>

* refactor/bulk-insert-service:
 feat: stream-based bulk insert with per-batch responses

 - Introduce handle_put_record_batch_stream() to process Flight DoPut streams
 - Resolve table & permissions once, yield (request_id, AffectedRows) per batch
 - Replace loop-over-request with async-stream in frontend & server
 - Make PutRecordBatchRequestStream public for cross-crate usage

Signed-off-by: Lei, HUANG <mrsatangel@gmail.com>

* refactor/bulk-insert-service:
 fix: propagate request_id with errors in bulk insert stream

 Changes the bulk-insert stream item type from
 Result<(i64, AffectedRows), E> to (i64, Result<AffectedRows, E>)
 so every emitted tuple carries the request_id even on failure,
 letting callers correlate errors with the originating request.

Signed-off-by: Lei, HUANG <mrsatangel@gmail.com>
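
The item-type change described in the commit above can be illustrated with a small Rust sketch. It is not the project's code: `Batch`, `insert`, `old_shape`, and `new_shape` are hypothetical names, `AffectedRows` is reduced to a plain `usize`, the error type is a `String`, and a `Vec` stands in for the async stream. The point is only that the tuple form keeps the `request_id` available when the insert fails.

```rust
/// Stand-in for the row-count type named in the commit message (assumption: a plain usize).
type AffectedRows = usize;

/// Hypothetical batch: a request id plus the number of rows it carries.
struct Batch {
    request_id: i64,
    rows: usize,
}

/// Hypothetical insert that rejects empty batches.
fn insert(batch: &Batch) -> Result<AffectedRows, String> {
    if batch.rows == 0 {
        Err("empty batch".to_string())
    } else {
        Ok(batch.rows)
    }
}

/// Old shape: the request id is only reachable on success.
fn old_shape(batches: &[Batch]) -> Vec<Result<(i64, AffectedRows), String>> {
    batches
        .iter()
        .map(|b| insert(b).map(|n| (b.request_id, n)))
        .collect()
}

/// New shape: every item carries the request id, even on failure.
fn new_shape(batches: &[Batch]) -> Vec<(i64, Result<AffectedRows, String>)> {
    batches.iter().map(|b| (b.request_id, insert(b))).collect()
}

fn main() {
    let batches = [
        Batch { request_id: 1, rows: 10 },
        Batch { request_id: 2, rows: 0 },
    ];
    // With the new shape the caller can always name the failing request.
    for (id, result) in new_shape(&batches) {
        match result {
            Ok(n) => println!("request {id}: {n} rows"),
            Err(e) => println!("request {id} failed: {e}"),
        }
    }
    // With the old shape a failure carries only the error.
    let failures = old_shape(&batches).iter().filter(|r| r.is_err()).count();
    println!("old shape: {failures} failure(s) with no request id attached");
}
```

With the old shape, a caller receiving `Err(e)` has no way to tell which request produced it unless the error type itself carries the id.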

* refactor/bulk-insert-service:
 refactor: unify DoPut response stream to return DoPutResponse

 Replace the tuple (i64, Result<AffectedRows>) with Result<DoPutResponse>
 throughout the gRPC bulk-insert path so the handler, adapter and server
 all speak the same type.

Signed-off-by: Lei, HUANG <mrsatangel@gmail.com>

* refactor/bulk-insert-service:
 feat: add elapsed_secs to DoPutResponse for bulk-insert timing

 - DoPutResponse now carries elapsed_secs field
 - Frontend measures and attaches insert duration
 - Server observes GRPC_BULK_INSERT_ELAPSED metric from response

Signed-off-by: Lei, HUANG <mrsatangel@gmail.com>

* refactor/bulk-insert-service:
 refactor: unify Bytes import in flight module

 - Replace `bytes::Bytes` with `Bytes` alias for consistency
 - Remove redundant `ProstBytes` alias

Signed-off-by: Lei, HUANG <mrsatangel@gmail.com>

* refactor/bulk-insert-service:
 fix: terminate gRPC stream on error and optimize FlightData handling

 - Stop retrying on stream errors in gRPC handler
 - Replace Vec1 indexing with into_iter().next() for FlightData
 - Remove redundant clones in bulk_insert and flight modules

Signed-off-by: Lei, HUANG <mrsatangel@gmail.com>

* refactor/bulk-insert-service:
 Improve permission check placement in `grpc.rs`

 - Moved the permission check for `BulkInsert` to occur before resolving the table reference in `GrpcQueryHandler` implementation.
 - Ensures permission validation is performed earlier in the process, potentially avoiding unnecessary operations if permission is denied.

Signed-off-by: Lei, HUANG <mrsatangel@gmail.com>

* refactor/bulk-insert-service:
 **Refactor Bulk Insert Handling in gRPC**

 - **`grpc.rs`**:
   - Switched from `async_stream::stream` to `async_stream::try_stream` for error handling.
   - Removed `body_size` parameter and added `flight_data` to `handle_bulk_insert`.
   - Simplified error handling and permission checks in `GrpcQueryHandler`.

 - **`bulk_insert.rs`**:
   - Added `raw_flight_data` parameter to `handle_bulk_insert`.
   - Calculated `body_size` from `raw_flight_data` and removed redundant encoding logic.

 - **`flight.rs`**:
   - Replaced `body_size` with `flight_data` in `PutRecordBatchRequest`.
   - Updated memory usage calculation to include `flight_data` components.

Signed-off-by: Lei, HUANG <mrsatangel@gmail.com>

* refactor/bulk-insert-service:
 perf(bulk_insert): encode record batch once per datanode

 Move FlightData encoding outside the per-region loop so the same
 encoded bytes are reused when mask.select_all(), eliminating redundant
 serialisation work.

Signed-off-by: Lei, HUANG <mrsatangel@gmail.com>
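
The "encode once, reuse when the mask selects everything" idea from the commit above might look roughly like the following Rust sketch. All types here (`RecordBatch`, `Encoder`, `Mask`) and the function `requests_for_regions` are hypothetical stand-ins, not GreptimeDB's actual types, and the meaning of `select_all()` (every row goes to that region) is an assumption. The sketch also encodes lazily on first use, which is one possible reading of "outside the per-region loop".

```rust
use std::cell::Cell;

/// Hypothetical stand-in for a decoded record batch.
struct RecordBatch {
    rows: Vec<u64>,
}

/// Hypothetical encoder that counts how often it runs.
struct Encoder {
    calls: Cell<usize>,
}

impl Encoder {
    fn encode(&self, batch: &RecordBatch) -> Vec<u8> {
        self.calls.set(self.calls.get() + 1);
        batch.rows.iter().flat_map(|r| r.to_le_bytes()).collect()
    }
}

/// Per-region row mask: `true` means the row belongs to that region (assumed semantics).
struct Mask(Vec<bool>);

impl Mask {
    fn select_all(&self) -> bool {
        self.0.iter().all(|&b| b)
    }
}

/// Builds one encoded payload per region, encoding the full batch at most once.
fn requests_for_regions(encoder: &Encoder, batch: &RecordBatch, masks: &[Mask]) -> Vec<Vec<u8>> {
    // Cache for the encoded full batch, shared by every region that takes all rows.
    let mut full: Option<Vec<u8>> = None;
    masks
        .iter()
        .map(|mask| {
            if mask.select_all() {
                // Reuse the cached bytes; a real implementation would likely
                // share a reference-counted buffer instead of copying a Vec.
                full.get_or_insert_with(|| encoder.encode(batch)).clone()
            } else {
                // Only part of the batch goes to this region: filter, then encode.
                let rows = batch
                    .rows
                    .iter()
                    .zip(&mask.0)
                    .filter(|(_, keep)| **keep)
                    .map(|(r, _)| *r)
                    .collect();
                encoder.encode(&RecordBatch { rows })
            }
        })
        .collect()
}

fn main() {
    let encoder = Encoder { calls: Cell::new(0) };
    let batch = RecordBatch { rows: vec![1, 2, 3] };
    let masks = [
        Mask(vec![true; 3]),
        Mask(vec![true; 3]),
        Mask(vec![true, false, true]),
    ];
    let requests = requests_for_regions(&encoder, &batch, &masks);
    println!(
        "{} requests built with {} encode calls",
        requests.len(),
        encoder.calls.get()
    );
}
```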

---------

Signed-off-by: Lei, HUANG <mrsatangel@gmail.com>

2025-12-04 07:08:02 +00:00

Name	Last commit	Date
.cargo	feat: put sqlness into a separated dir (#6911)	2025-09-05 01:39:29 +00:00
.config	build: on windows (#2054)	2023-08-10 08:08:37 +00:00
.github	ci: add multi lang tests workflow into release and nightly workflows (#7300)	2025-11-26 04:35:04 +00:00
config	docs(config): clarify store_addrs format (#7279)	2025-11-21 22:26:52 +00:00
cyborg	fix: doc issue assignee (#6406)	2025-06-26 09:18:47 +00:00
docker	feat: add building option to build images base on distroless image (#7240)	2025-11-26 05:13:05 +00:00
docs	feat: gc worker only local regions&test (#7203)	2025-11-18 02:45:09 +00:00
grafana	fix: use instance lables to fetch greptime_memory_limit_in_bytes and greptime_cpu_limit_in_millicores metrics (#7043)	2025-09-29 11:43:35 +00:00
scripts	feat: add TLS support for mysql backend (#6979)	2025-09-16 13:46:37 +00:00
src	refactor(servers): bulk insert service (#7329)	2025-12-04 07:08:02 +00:00
tests	perf(metric-engine)!: Replace mur3 with fxhash for faster TSID generation (#7316)	2025-12-02 08:38:29 +00:00
tests-fuzz	feat: allow fuzz input override through env var (#7208)	2025-11-10 14:02:23 +00:00
tests-integration	perf(metric-engine)!: Replace mur3 with fxhash for faster TSID generation (#7316)	2025-12-02 08:38:29 +00:00
.dockerignore	fix: docker build (#1822)	2023-06-25 11:05:46 +08:00
.editorconfig	feat: to_timezone function (#3470)	2024-03-12 01:46:19 +00:00
.env.example	feat: add GcsConfig credential field (#4568)	2024-08-16 03:11:20 +00:00
.gitignore	feat: resolve unused dependencies with cargo-udeps (#6578) (#6619)	2025-08-26 10:22:53 +00:00
.pre-commit-config.yaml	feat: Loki remote write (#4941)	2024-11-18 08:39:17 +00:00
AUTHOR.md	chore: members and committers update (#7341)	2025-12-04 04:08:43 +00:00
Cargo.lock	perf(metric-engine)!: Replace mur3 with fxhash for faster TSID generation (#7316)	2025-12-02 08:38:29 +00:00
Cargo.toml	feat: update pg-catalog for describe table (#7321)	2025-12-02 01:38:36 +00:00
cliff.toml	chore: improve contributor click in git-cliff (#3672)	2024-04-08 18:15:00 +00:00
codecov.yml	refactor: refactor TableRouteManager (#3392)	2024-02-28 06:18:09 +00:00
CONTRIBUTING.md	feat: resolve unused dependencies with cargo-udeps (#6578) (#6619)	2025-08-26 10:22:53 +00:00
Cross.toml	fix: cross compiling for aarch64 targets and allow customizing page size (#5487)	2025-02-07 11:21:16 +00:00
flake.lock	chore: update rust to nightly 2025-10-01 (#7069)	2025-10-11 07:30:52 +00:00
flake.nix	chore: update rust to nightly 2025-10-01 (#7069)	2025-10-11 07:30:52 +00:00
LICENSE	chore: multiple licenses fixes (#2714)	2023-11-09 10:38:12 +00:00
licenserc.toml	feat: trigger alter parse (#6553)	2025-07-29 11:07:31 +00:00
Makefile	ci: dev-build with large page size (#7228)	2025-11-17 02:38:16 +00:00
README.md	docs: update project status and tweak readme (#7216)	2025-11-12 15:06:56 +00:00
rust-toolchain.toml	chore: update rust to nightly 2025-10-01 (#7069)	2025-10-11 07:30:52 +00:00
rustfmt.toml	chore: specify import style in rustfmt (#460)	2022-11-15 15:58:54 +08:00
SECURITY.md	feat: Create SECURITY.md (#1270)	2023-03-28 19:14:29 +08:00
taplo.toml	chore: skip reorder workspace tables in taplo (#3388)	2024-02-26 08:57:49 +00:00
typos.toml	feat: node excluder (#5964)	2025-04-23 10:48:46 +00:00

README.md

Real-Time & Cloud-Native Observability Database
for metrics, logs, and traces

Delivers sub-second querying at PB scale and exceptional cost efficiency from edge to cloud.

User Guide | API Docs | Roadmap 2025

Introduction
⭐ Key Features
Quick Comparison
Architecture
Try GreptimeDB
Getting Started
Build From Source
Tools & Extensions
Project Status
Community
License
Commercial Support
Contributing
Acknowledgement

Introduction

GreptimeDB is an open-source, cloud-native database that unifies metrics, logs, and traces, enabling real-time observability at any scale — across edge, cloud, and hybrid environments.

Features

Feature	Description
All-in-One Observability	OpenTelemetry-native platform unifying metrics, logs, and traces. Query via SQL, PromQL, and Flow.
High Performance	Written in Rust with rich indexing (inverted, fulltext, skipping, vector), delivering sub-second responses at PB scale.
Cost Efficiency	50x lower operational and storage costs with compute-storage separation and native object storage (S3, Azure Blob, etc.).
Cloud-Native & Scalable	Purpose-built for Kubernetes with unlimited cross-cloud scaling, handling hundreds of thousands of concurrent requests.
Developer-Friendly	SQL/PromQL interfaces, built-in web dashboard, REST API, MySQL/PostgreSQL protocol compatibility, and native OpenTelemetry support.
Flexible Deployment	Deploy anywhere from ARM-based edge devices (including Android) to cloud, with unified APIs and efficient data sync.

✅ Perfect for:

Unified observability stack replacing Prometheus + Loki + Tempo
Large-scale metrics with high cardinality (millions to billions of time series)
Large-scale observability platform requiring cost efficiency and scalability
IoT and edge computing with resource and bandwidth constraints

Learn more in Why GreptimeDB and Observability 2.0 and the Database for It.

Quick Comparison

Feature	GreptimeDB	Traditional TSDB	Log Stores
Data Types	Metrics, Logs, Traces	Metrics only	Logs only
Query Language	SQL, PromQL	Custom/PromQL	Custom/DSL
Deployment	Edge + Cloud	Cloud/On-prem	Mostly central
Indexing & Performance	PB-Scale, Sub-second	Varies	Varies
Integration	REST API, SQL, Common protocols	Varies	Varies

Architecture

GreptimeDB can run in two modes:

Standalone Mode - Single binary for development and small deployments
Distributed Mode - Separate components for production scale:
- Frontend: Query processing and protocol handling
- Datanode: Data storage and retrieval
- Metasrv: Metadata management and coordination

Read the architecture document. DeepWiki provides an in-depth look at GreptimeDB.

Try GreptimeDB

docker pull greptime/greptimedb

docker run -p 127.0.0.1:4000-4003:4000-4003 \
  -v "$(pwd)/greptimedb_data:/greptimedb_data" \
  --name greptime --rm \
  greptime/greptimedb:latest standalone start \
  --http-addr 0.0.0.0:4000 \
  --rpc-bind-addr 0.0.0.0:4001 \
  --mysql-addr 0.0.0.0:4002 \
  --postgres-addr 0.0.0.0:4003

Dashboard: http://localhost:4000/dashboard
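
Once the container is running, one quick way to confirm that the four published ports accept connections is a plain TCP probe. The Rust sketch below uses only the standard library. The `is_reachable` helper and the 500 ms timeout are illustrative choices, and the port labels are taken from the `--http-addr`, `--rpc-bind-addr`, `--mysql-addr`, and `--postgres-addr` flags above. It checks only that a port accepts a TCP connection, not that the service behind it is healthy.

```rust
use std::net::{SocketAddr, TcpStream};
use std::time::Duration;

/// Returns true if a TCP connection to `addr` succeeds within `timeout`.
/// An address that does not parse as `ip:port` counts as unreachable.
fn is_reachable(addr: &str, timeout: Duration) -> bool {
    match addr.parse::<SocketAddr>() {
        Ok(sock) => TcpStream::connect_timeout(&sock, timeout).is_ok(),
        Err(_) => false,
    }
}

fn main() {
    // Ports as published by the `docker run` command above.
    let endpoints = [
        ("HTTP", "127.0.0.1:4000"),
        ("gRPC", "127.0.0.1:4001"),
        ("MySQL", "127.0.0.1:4002"),
        ("PostgreSQL", "127.0.0.1:4003"),
    ];
    for (name, addr) in endpoints {
        let state = if is_reachable(addr, Duration::from_millis(500)) {
            "open"
        } else {
            "closed"
        };
        println!("{name:<10} {addr} {state}");
    }
}
```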

Getting Started

Build From Source

Prerequisites:

Rust toolchain (nightly)
Protobuf compiler (>= 3.15)
C/C++ build essentials, including gcc/g++/autoconf and the glibc library (e.g. libc6-dev on Ubuntu and glibc-devel on Fedora)
Python toolchain (optional): required only for some test scripts.

Build and Run:

make
cargo run -- standalone start

Tools & Extensions

Kubernetes: GreptimeDB Operator
Helm Charts: Greptime Helm Charts
Dashboard: Web UI
gRPC Ingester: Go, Java, C++, Erlang, Rust
Grafana Data Source: GreptimeDB Grafana data source plugin
Grafana Dashboard: Official Dashboard for monitoring

Project Status

Status: Beta, marching toward v1.0 GA!
GA (v1.0): January 10, 2026

Deployed in production by open-source projects and commercial users
Stable, actively maintained, with regular releases (version info)
Suitable for evaluation and pilot deployments

GreptimeDB v1.0 represents a major milestone toward maturity — marking stable APIs, production readiness, and proven performance.

Roadmap: Beta1 (Nov 10) → Beta2 (Nov 24) → RC1 (Dec 8) → GA (Jan 10, 2026). Please read the v1.0 highlights and release plan for details.

For production use, we recommend using the latest stable release.

If you find this project useful, a ⭐ would mean a lot to us!

Community

We invite you to engage and contribute!

License

GreptimeDB is licensed under the Apache License 2.0.

Commercial Support

Running GreptimeDB in your organization? We offer enterprise add-ons, services, training, and consulting. Contact us for details.

Contributing

Read our Contribution Guidelines.
Explore Internal Concepts and DeepWiki.
Pick up a good first issue and join the #contributors Slack channel.

Acknowledgement

Special thanks to all contributors! See AUTHOR.md.

Uses Apache Arrow™ (memory model)
Apache Parquet™ (file storage)
Apache DataFusion™ (query engine)
Apache OpenDAL™ (data access abstraction)

Description

Open-source, cloud-native, unified observability database for metrics, logs and traces, supporting SQL/PromQL/Streaming.

analytics cloud-native database distributed greptimedb logs metrics monitoring observability observability-database observability-datalake promql rust rust-database sql time-series traces tsdb

Readme Apache-2.0 818 MiB
