Real-Time & Cloud-Native Observability Database
for metrics, logs, and traces
Delivers sub-second querying at PB scale and exceptional cost efficiency from edge to cloud.
- Introduction
- ⭐ Key Features
- Quick Comparison
- Architecture
- Try GreptimeDB
- Getting Started
- Build From Source
- Tools & Extensions
- Project Status
- Community
- License
- Commercial Support
- Contributing
- Acknowledgement
Introduction
GreptimeDB is an open-source, cloud-native database purpose-built for the unified collection and analysis of observability data (metrics, logs, and traces). Whether you’re operating on the edge, in the cloud, or across hybrid environments, GreptimeDB empowers real-time insights at massive scale — all in one system.
Features
| Feature | Description |
|---|---|
| Unified Observability Data | Store metrics, logs, and traces as timestamped, contextual wide events. Query via SQL, PromQL, and streaming. |
| High Performance & Cost Effective | Written in Rust, with a distributed query engine, rich indexing, and optimized columnar storage, delivering sub-second responses at PB scale. |
| Cloud-Native Architecture | Designed for Kubernetes, with compute/storage separation, native object storage (AWS S3, Azure Blob, etc.) and seamless cross-cloud access. |
| Developer-Friendly | Access via SQL/PromQL interfaces, REST API, MySQL/PostgreSQL protocols, and popular ingestion protocols. |
| Flexible Deployment | Deploy anywhere: edge (including ARM/Android) or cloud, with unified APIs and efficient data sync. |
Learn more in "Why GreptimeDB" and "Observability 2.0 and the Database for It".
Quick Comparison
| Feature | GreptimeDB | Traditional TSDB | Log Stores |
|---|---|---|---|
| Data Types | Metrics, Logs, Traces | Metrics only | Logs only |
| Query Language | SQL, PromQL, Streaming | Custom/PromQL | Custom/DSL |
| Deployment | Edge + Cloud | Cloud/On-prem | Mostly central |
| Indexing & Performance | PB-Scale, Sub-second | Varies | Varies |
| Integration | REST, SQL, Common protocols | Varies | Varies |
Performance: read more in the benchmark reports.
Architecture
- Read the architecture document.
- DeepWiki provides an in-depth look at GreptimeDB.

Try GreptimeDB
1. Live Demo
Experience GreptimeDB directly in your browser.
2. GreptimeCloud
Start instantly with a free cluster.
3. Docker (Local Quickstart)
docker pull greptime/greptimedb
docker run -p 127.0.0.1:4000-4003:4000-4003 \
-v "$(pwd)/greptimedb_data:/greptimedb_data" \
--name greptime --rm \
greptime/greptimedb:latest standalone start \
--http-addr 0.0.0.0:4000 \
--rpc-bind-addr 0.0.0.0:4001 \
--mysql-addr 0.0.0.0:4002 \
--postgres-addr 0.0.0.0:4003
Dashboard: http://localhost:4000/dashboard
Full Install Guide
Troubleshooting:
- Cannot connect to the database? Ensure that ports `4000`, `4001`, `4002`, and `4003` are not blocked by a firewall or used by other services.
- Failed to start? Check the container logs with `docker logs greptime` for further details.
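To make the first troubleshooting item concrete, here is a minimal sketch that reports whether anything is already listening on the four quickstart ports. It assumes bash (it relies on bash's `/dev/tcp` redirection), and the function names `port_open` and `report_ports` are illustrative, not part of GreptimeDB.

```shell
#!/usr/bin/env bash
# Sketch only: report whether the quickstart ports already have a listener.
# Relies on bash's /dev/tcp redirection; the function names are illustrative.

# port_open HOST PORT: succeeds when a TCP connection to HOST:PORT is accepted.
port_open() {
  local host=$1 port=$2
  # Open the socket in a subshell so the descriptor closes when it exits.
  (exec 3<>"/dev/tcp/${host}/${port}") 2>/dev/null
}

# report_ports HOST PORT...: print one status line per port.
report_ports() {
  local host=$1 port
  shift
  for port in "$@"; do
    if port_open "$host" "$port"; then
      echo "port ${port}: in use"
    else
      echo "port ${port}: free"
    fi
  done
}

# Before `docker run`, every line should say "free"; afterwards, "in use".
report_ports 127.0.0.1 4000 4001 4002 4003
```

This only detects local listeners. A firewall that blocks the ports from another machine will not show up here and has to be checked separately.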
Getting Started
Build From Source
Prerequisites:
- Rust toolchain (nightly)
- Protobuf compiler (>= 3.15)
- C/C++ building essentials, including `gcc`/`g++`/`autoconf` and the glibc library (e.g. `libc6-dev` on Ubuntu and `glibc-devel` on Fedora)
- Python toolchain (optional): Required only if using some test scripts.
Build and Run:
make
cargo run -- standalone start
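Before running `make`, it can help to confirm the Protobuf compiler meets the `>= 3.15` prerequisite. The following is a rough sketch, assuming bash and assuming `protoc --version` prints a line of the form `libprotoc X.Y.Z`. The helper names `version_ge` and `check_protoc` are hypothetical and not part of the GreptimeDB build.

```shell
#!/usr/bin/env bash
# Sketch only: verify the protoc prerequisite (>= 3.15) before building.
# Helper names are illustrative and not part of the GreptimeDB build system.

# version_ge A B: succeeds when dotted version A >= B (numeric fields only).
version_ge() {
  local -a a b
  local i x y
  IFS=. read -r -a a <<< "$1"
  IFS=. read -r -a b <<< "$2"
  for ((i = 0; i < ${#a[@]} || i < ${#b[@]}; i++)); do
    x=${a[i]:-0}
    y=${b[i]:-0}
    # 10# forces base 10 so a field such as "08" is not read as octal.
    if ((10#$x > 10#$y)); then return 0; fi
    if ((10#$x < 10#$y)); then return 1; fi
  done
  return 0
}

# check_protoc: print a status line; fail if protoc is missing or too old.
check_protoc() {
  if ! command -v protoc >/dev/null 2>&1; then
    echo "protoc: not found"
    return 1
  fi
  local v
  v=$(protoc --version | awk '{print $2}')
  if version_ge "$v" 3.15; then
    echo "protoc ${v}: ok"
  else
    echo "protoc ${v}: too old (need >= 3.15)"
    return 1
  fi
}

# Usage: check_protoc && make
```

Fields are compared numerically, not as strings, so `3.9` correctly counts as older than `3.15`. Versions with non-numeric suffixes (for example `-rc1`) are outside what this sketch handles.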
Tools & Extensions
- Kubernetes: GreptimeDB Operator
- Helm Charts: Greptime Helm Charts
- Dashboard: Web UI
- SDKs/Ingester: Go, Java, C++, Erlang, Rust, JS
- Grafana: Official Dashboard
Project Status
Status: Beta. GA (v1.0): Targeted for mid 2025.
- Being used in production by early adopters
- Stable, actively maintained, with regular releases (version info)
- Suitable for evaluation and pilot deployments
For production use, we recommend using the latest stable release.
If you find this project useful, a ⭐ would mean a lot to us!

Community
We invite you to engage and contribute!
License
GreptimeDB is licensed under the Apache License 2.0.
Commercial Support
Running GreptimeDB in your organization? We offer enterprise add-ons, services, training, and consulting. Contact us for details.
Contributing
- Read our Contribution Guidelines.
- Explore Internal Concepts and DeepWiki.
- Pick up a good first issue and join the #contributors Slack channel.
Acknowledgement
Special thanks to all contributors! See AUTHORS.md.
- Uses Apache Arrow™ (memory model)
- Apache Parquet™ (file storage)
- Apache Arrow DataFusion™ (query engine)
- Apache OpenDAL™ (data access abstraction)
