### Enhance Schema Handling and Add Telemetry Logging
- **`access_layer.rs`**: Refactored to use `physical_schema` for schema creation and improved error handling with telemetry logging for batch writing.
- **`batch_builder.rs`**: Introduced `physical_schema` function for schema creation and updated data structures to use `RegionId` for physical tables. Removed redundant schema function.
- **`prom_store.rs`**: Added telemetry logging to track bulk mode usage and processing time.
- **`prom_row_builder.rs`**: Implemented telemetry logging to measure elapsed time for key operations like table creation, metadata collection, and batch appending.
Signed-off-by: Lei, HUANG <mrsatangel@gmail.com>
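
The elapsed-time logging described in the commit above can be sketched roughly as follows. This is a minimal, std-only illustration and not the code from `prom_row_builder.rs`: `timed`, `append_rows`, and `log_line` are hypothetical names, and the real change presumably emits through the project's telemetry macros rather than building a string.

```rust
use std::time::{Duration, Instant};

/// Runs `op` and returns its result together with the wall-clock time it took.
fn timed<T>(op: impl FnOnce() -> T) -> (T, Duration) {
    let start = Instant::now();
    let out = op();
    (out, start.elapsed())
}

/// Hypothetical stand-in for a measured step such as appending rows to a batch.
fn append_rows(rows: &[u64]) -> u64 {
    rows.iter().sum()
}

/// Formats a log line for a finished step (assumed format, for illustration only).
fn log_line(step: &str, elapsed: Duration) -> String {
    format!("{step} finished, elapsed: {elapsed:?}")
}
```

Wrapping each step (table creation, metadata collection, batch appending) in a helper like `timed` keeps the measurement separate from the step's own logic.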
### Add Object Store Integration and Access Layer Factory
- **`Cargo.lock`, `Cargo.toml`**: Added `object-store` as a dependency to integrate object storage capabilities.
- **`frontend.rs`, `instance.rs`, `builder.rs`, `server.rs`**: Introduced `ObjectStoreConfig` and `AccessLayerFactory` to manage object storage configurations and access layers.
- **`access_layer.rs`**: Made `AccessLayerFactory` clonable and its constructor public to facilitate object store access layer creation.
- **`prom_store.rs`**: Updated `PromBulkState` to include `AccessLayerFactory` and modified `remote_write` to utilize the access layer for processing requests.
- **`lib.rs`**: Made `access_layer` module public to expose access layer functionalities.
Signed-off-by: Lei, HUANG <mrsatangel@gmail.com>
### Enhance Schema and Table Handling in MetricsBatchBuilder
- **`access_layer.rs`**: Made `create_sst_writer` function public within the crate to facilitate SST writing.
- **`batch_builder.rs`**: Updated `MetricsBatchBuilder` to handle builders as a nested `HashMap` for schemas and logical table names. Modified `finish` method to return record batches grouped by schema and logical table name.
- **`prom_row_builder.rs`**: Renamed `as_record_batch` to `as_record_batches` and implemented logic to write record batches using `AccessLayerFactory`.
- **`proto.rs`**: Updated method call from `as_record_batch` to `as_record_batches` to reflect the new method name.
Signed-off-by: Lei, HUANG <mrsatangel@gmail.com>
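
The nested-map layout described for `MetricsBatchBuilder` in the commit above might look roughly like the sketch below. It assumes plain `String` keys and a hypothetical `Row` tuple in place of real record batches; `NestedBuilders` is an illustrative name, not a type from the codebase.

```rust
use std::collections::HashMap;

/// Hypothetical row type: (timestamp in ms, value).
type Row = (i64, f64);

/// Builders keyed first by schema name, then by logical table name.
#[derive(Default)]
struct NestedBuilders {
    builders: HashMap<String, HashMap<String, Vec<Row>>>,
}

impl NestedBuilders {
    /// Appends a row, creating the schema and table entries on first use.
    fn append(&mut self, schema: &str, table: &str, row: Row) {
        self.builders
            .entry(schema.to_string())
            .or_default()
            .entry(table.to_string())
            .or_default()
            .push(row);
    }

    /// Consumes the builders and returns rows grouped by schema and logical table.
    fn finish(self) -> HashMap<String, HashMap<String, Vec<Row>>> {
        self.builders
    }
}
```

Keying on schema first means two logical tables with the same name in different schemas never share a builder.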
- **Refactor `AccessLayerFactory`**: Replaced `join_dir` with `join_path` when constructing file paths in `access_layer.rs`.
- **Enhance Testing**: Added comprehensive tests in `access_layer.rs` to verify the functionality of the Parquet writer, including writing multiple batches and handling provided timestamp ranges.
- **Optimize `MetricsBatchBuilder`**: Simplified the loop in `batch_builder.rs` by removing unnecessary mutability in the encoder variable.
Signed-off-by: Lei, HUANG <mrsatangel@gmail.com>
### Add Parquet Writer and Access Layer
- **`Cargo.lock`, `Cargo.toml`**: Updated dependencies to include `parquet`, `object-store`, and `common-datasource`.
- **`parquet_writer.rs`**: Introduced a new module for writing Parquet files asynchronously using `AsyncWriter`.
- **`access_layer.rs`**: Added a new access layer for managing Parquet file writing, including `AccessLayerFactory` and `ParquetWriter` for handling record batches and metadata.
### Enhance Batch Builder
- **`batch_builder.rs`**: Enhanced `BatchEncoder` to handle timestamp ranges and return optional record batches with timestamp metadata.
### Update Error Handling
- **`error.rs`**: Added new error variants for handling object store and Parquet operations.
### Modify Mito2 Parquet Constants
- **`parquet.rs`**: Changed visibility of `DEFAULT_ROW_GROUP_SIZE` to public for broader access.
### Add Tests
- **`access_layer.rs`**: Added basic tests for building data region directories.
Signed-off-by: Lei, HUANG <mrsatangel@gmail.com>
### Enhance `MetricsBatchBuilder` and `TablesBuilder` functionalities
- **`batch_builder.rs`**:
  - Made `collect_physical_region_metadata` and `append_rows_to_batch` methods public to allow external access.
  - Implemented the `finish` method to complete record batch building and return batches grouped by physical table ID.
- **`prom_row_builder.rs`**:
  - Updated `TablesBuilder::build` to utilize `MetricsBatchBuilder` for creating or altering physical tables and appending rows.
  - Integrated collection of physical region metadata and finalized record batch processing.
Signed-off-by: Lei, HUANG <mrsatangel@gmail.com>
### Enhance error handling in `batch_builder.rs` and `error.rs`
- **`batch_builder.rs`**:
  - Replaced `todo!` with specific error handling for invalid timestamp and field value types using `InvalidTimestampValueTypeSnafu` and `InvalidFieldValueTypeSnafu`.
- **`error.rs`**:
  - Added new error variants `InvalidTimestampValueType` and `InvalidFieldValueType` with descriptive messages.
  - Updated `status_code` method to handle new error types with `StatusCode::Unexpected`.
Signed-off-by: Lei, HUANG <mrsatangel@gmail.com>
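
As a rough illustration of the change above, the sketch below returns typed errors where a `todo!` would otherwise panic. It is an assumption-laden simplification: the real code uses `snafu` context selectors, while this version uses a plain enum so it stays dependency-free, and `Value`, `decode`, and the message strings are hypothetical.

```rust
use std::fmt;

/// Hypothetical value type standing in for the row values being encoded.
#[derive(Debug, PartialEq)]
enum Value {
    TimestampMs(i64),
    F64(f64),
    Str(String),
}

#[derive(Debug, PartialEq)]
enum Error {
    InvalidTimestampValueType,
    InvalidFieldValueType,
}

impl fmt::Display for Error {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match self {
            Error::InvalidTimestampValueType => write!(f, "invalid timestamp value type"),
            Error::InvalidFieldValueType => write!(f, "invalid field value type"),
        }
    }
}

#[derive(Debug, PartialEq)]
enum StatusCode {
    Unexpected,
}

impl Error {
    /// Both new variants map to `Unexpected`, as the commit describes.
    fn status_code(&self) -> StatusCode {
        match self {
            Error::InvalidTimestampValueType | Error::InvalidFieldValueType => {
                StatusCode::Unexpected
            }
        }
    }
}

/// Extracts (timestamp, field) or reports which column had the wrong type.
fn decode(ts: &Value, field: &Value) -> Result<(i64, f64), Error> {
    let ts = match ts {
        Value::TimestampMs(v) => *v,
        _ => return Err(Error::InvalidTimestampValueType),
    };
    let field = match field {
        Value::F64(v) => *v,
        _ => return Err(Error::InvalidFieldValueType),
    };
    Ok((ts, field))
}
```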
### Enhance batch sorting in `batch_builder.rs`
- Implement sorting of batches by primary key using `compute::sort_to_indices`.
- Update `op_type` and `sequence` to use `Arc` for consistency.
- Apply sorting to `value`, `timestamp`, and primary key arrays using `compute::take`.
Signed-off-by: Lei, HUANG <mrsatangel@gmail.com>
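
The sort-then-take pattern from the commit above can be sketched without Arrow as shown below. The two functions deliberately borrow the names `sort_to_indices` and `take`, but they are std-only stand-ins over slices, not the Arrow compute kernels, and the byte-vector primary keys are an assumption.

```rust
/// Returns the permutation that sorts `keys` in ascending order (stable).
fn sort_to_indices(keys: &[Vec<u8>]) -> Vec<usize> {
    let mut indices: Vec<usize> = (0..keys.len()).collect();
    indices.sort_by(|&a, &b| keys[a].cmp(&keys[b]));
    indices
}

/// Reorders `column` according to `indices`, one output element per index.
fn take<T: Clone>(column: &[T], indices: &[usize]) -> Vec<T> {
    indices.iter().map(|&i| column[i].clone()).collect()
}
```

Computing the permutation once from the primary key column and applying it to every other column keeps `value`, `timestamp`, and the keys aligned row by row.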
- **Add `ahash` Dependency**: Updated `Cargo.lock` and `Cargo.toml` to include `ahash` as a dependency for the `metric-engine` project.
- **Refactor `MetricsBatchBuilder`**: Modified `MetricsBatchBuilder` in `batch_builder.rs` to include a `builders` field and refactored methods to use `BatchEncoder` for appending rows.
- **Update `BatchEncoder`**: Changed `BatchEncoder` in `batch_builder.rs` to use a `name_to_id` map and added `append_rows` and `finish` methods for handling row encoding.
- **Modify `LogicalSchemas` Structure**: Updated `schema_helper.rs` to use `HashMap` instead of `ahash::HashMap` for the `schemas` field in `LogicalSchemas`.
Signed-off-by: Lei, HUANG <mrsatangel@gmail.com>
### Commit Summary
- **Add New Dependencies**: Updated `Cargo.lock` and `Cargo.toml` to include `metric-engine` and `mito-codec` dependencies.
- **Enhance `MetricEngineInner`**: Modified `put.rs` to use `logical_table_id` instead of `table_id` and adjusted method calls accordingly.
- **Expose Structs and Methods**: Made `RowModifier`, `RowsIter`, and `RowIter` structs and their methods public in `row_modifier.rs`.
- **Implement Batch Processing**: Added batch processing logic in `batch_builder.rs` to handle row conversion to record batches with primary key encoding.
- **Error Handling**: Introduced `EncodePrimaryKey` error variant in `error.rs` for handling primary key encoding errors.
- **Clone Support for `TableBuilder`**: Implemented `Clone` for `TableBuilder` in `prom_row_builder.rs`.
Signed-off-by: Lei, HUANG <mrsatangel@gmail.com>
Add a new bulk_mode flag to PromStoreOptions that enables bulk processing
for Prometheus metrics ingestion. When enabled, the server initializes
PromBulkState with a SchemaHelper instance for efficient schema management.
Changes:
- Add bulk_mode field to PromStoreOptions (default: false)
- Add create_schema_helper() method to Instance for SchemaHelper construction
- Add getter methods to StatementExecutor for procedure_executor and cache_invalidator
- Update server initialization to conditionally create PromBulkState when bulk_mode is enabled
- Fix clippy warnings in schema_helper.rs
Signed-off-by: evenyag <realevenyag@gmail.com>
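
The conditional initialization described in the commit above might be shaped like this sketch. The struct bodies are placeholders and `init_bulk_state` is a hypothetical helper; only the names `PromStoreOptions`, `bulk_mode`, `PromBulkState`, and `SchemaHelper` come from the commit message.

```rust
/// Trimmed-down options; `bulk_mode` defaults to `false` via `Default`.
#[derive(Debug, Default)]
struct PromStoreOptions {
    bulk_mode: bool,
}

/// Placeholder for the real schema helper.
#[derive(Debug)]
struct SchemaHelper;

/// Placeholder for the state that only exists in bulk mode.
#[derive(Debug)]
struct PromBulkState {
    schema_helper: SchemaHelper,
}

/// Builds the bulk state only when `bulk_mode` is enabled, so the schema
/// helper is never constructed on the default path.
fn init_bulk_state(
    opts: &PromStoreOptions,
    create_schema_helper: impl FnOnce() -> SchemaHelper,
) -> Option<PromBulkState> {
    opts.bulk_mode.then(|| PromBulkState {
        schema_helper: create_schema_helper(),
    })
}
```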
- **Add `operator` dependency**: Updated `Cargo.lock` and `Cargo.toml` to include the `operator` dependency.
- **Expose structs and functions in `schema_helper.rs`**: Made `LogicalSchema`, `LogicalSchemas`, and `ensure_logical_tables_for_metrics` public in `src/operator/src/schema_helper.rs`.
- **Refactor `batch_builder.rs`**:
  - Changed the logic for handling physical and logical tables, including the introduction of `tags_to_logical_schemas` function.
  - Modified `determine_physical_table_name` to return a string instead of a table reference.
  - Updated logic for managing tags and logical schemas in `MetricsBatchBuilder`.
Signed-off-by: Lei, HUANG <mrsatangel@gmail.com>
### Commit Summary
- **Enhancements in `MetricsBatchBuilder`:**
  - Added support for session catalog and schema in `create_or_alter_physical_tables`.
  - Introduced `physical_tables` mapping for logical to physical table references.
  - Implemented `rows_to_batch` to build `RecordBatch` from rows with primary key encoded.
- **Function Modifications:**
  - Updated `determine_physical_table` to use session catalog and schema.
  - Modified `build_create_table_expr` and `build_alter_table_expr` to include additional parameters for schema compatibility.
- **Error Handling:**
  - Added error handling for missing physical tables in `rows_to_batch`.
Signed-off-by: Lei, HUANG <mrsatangel@gmail.com>
### Add `MetricsBatchBuilder` for Physical Table Management
- **New Feature**: Introduced `MetricsBatchBuilder` in `batch_builder.rs` to manage the creation and alteration of physical tables based on logical table tags.
- **Error Handling**: Added `CommonMeta` error variant in `error.rs` to handle errors from `common_meta`.
- **Enhancements**:
  - Added `tags` method in `prom_row_builder.rs` to retrieve tag names from `TableBuilder`.
  - Implemented `primary_key_names` method in `metadata.rs` to return primary key names from `TableMeta`.
Signed-off-by: Lei, HUANG <mrsatangel@gmail.com>
refactor/building-backend-in-object-store:
### Refactor Object Store Configuration
- **Centralize Object Store Configurations**: Moved object store configurations (`FileConfig`, `S3Config`, `OssConfig`, `AzblobConfig`, `GcsConfig`) to `object-store/src/config.rs`.
- **Error Handling Enhancements**: Introduced `object-store/src/error.rs` for improved error handling related to object store operations.
- **Factory Pattern for Object Store**: Implemented `object-store/src/factory.rs` to create object store instances, consolidating logic from `datanode/src/store.rs`.
- **Remove Redundant Store Implementations**: Deleted individual store files (`azblob.rs`, `fs.rs`, `gcs.rs`, `oss.rs`, `s3.rs`) from `datanode/src/store/`.
- **Update Usage of Object Store Config**: Updated references to `ObjectStoreConfig` in `datanode.rs`, `standalone.rs`, `config.rs`, and `error.rs` to use the new centralized configuration.
Signed-off-by: Lei, HUANG <mrsatangel@gmail.com>
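
The factory approach described in the commit above could be shaped like the sketch below: one function matches on a centralized config enum instead of each backend living in its own file. Everything beyond the names `ObjectStoreConfig`, `FileConfig`, and `S3Config` is assumed: the fields, the `ObjectStore` stand-in, the validation, and `new_object_store` are hypothetical and much simpler than the real `object-store/src/factory.rs`.

```rust
/// Hypothetical, trimmed-down configs; the real structs carry more fields.
#[derive(Debug, Clone, PartialEq)]
struct FileConfig {
    data_home: String,
}

#[derive(Debug, Clone, PartialEq)]
struct S3Config {
    bucket: String,
    root: String,
}

/// Centralized config: one variant per backend (only two shown here).
#[derive(Debug, Clone, PartialEq)]
enum ObjectStoreConfig {
    File(FileConfig),
    S3(S3Config),
}

/// Stand-in for a constructed store handle.
#[derive(Debug, PartialEq)]
struct ObjectStore {
    scheme: &'static str,
    root: String,
}

/// Single entry point that builds a store from any supported config.
fn new_object_store(config: &ObjectStoreConfig) -> Result<ObjectStore, String> {
    match config {
        ObjectStoreConfig::File(c) => {
            if c.data_home.is_empty() {
                return Err("data_home must not be empty".to_string());
            }
            Ok(ObjectStore { scheme: "fs", root: c.data_home.clone() })
        }
        ObjectStoreConfig::S3(c) => {
            if c.bucket.is_empty() {
                return Err("bucket must not be empty".to_string());
            }
            let root = format!("{}/{}", c.bucket, c.root.trim_matches('/'));
            Ok(ObjectStore { scheme: "s3", root })
        }
    }
}
```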