* fix: event API content type check compares only type and subtype
Signed-off-by: paomian <xpaomian@gmail.com>
* chore: make clippy happy
Signed-off-by: paomian <xpaomian@gmail.com>
---------
Signed-off-by: paomian <xpaomian@gmail.com>
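A minimal sketch of what comparing only the type and subtype of a content type can look like. The helper names `essence` and `same_media_type` are hypothetical and are not taken from the actual fix; the sketch only assumes that parameters such as `charset` should be ignored.

```rust
/// Returns the `type/subtype` part of a Content-Type value, lowercased,
/// with any parameters (for example `; charset=utf-8`) dropped.
/// Hypothetical helper for illustration only.
fn essence(content_type: &str) -> String {
    content_type
        .split(';')
        .next()
        .unwrap_or("")
        .trim()
        .to_ascii_lowercase()
}

/// Compares two Content-Type values by type and subtype only.
fn same_media_type(a: &str, b: &str) -> bool {
    essence(a) == essence(b)
}
```

With this shape, `application/json; charset=utf-8` and `application/json` are treated as the same media type.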
* chore/add-conn-info-to-query-ctx:
### Add Connection Information to Query Context
- **`src/frontend/src/instance.rs`**: Updated to use `query_ctx.conn_info().to_string()` for connection information instead of a placeholder string.
- **`src/session/src/context.rs`**: Introduced `conn_info` field in `QueryContext` and added a method `conn_info()` to retrieve it. Updated `QueryContextBuilder` to handle `conn_info`.
- **`src/session/src/lib.rs`**: Modified `Session` to include `conn_info` in the query context building process.
These changes enhance the query context by incorporating connection information, allowing for more detailed session management.
Signed-off-by: Lei, HUANG <mrsatangel@gmail.com>
* feat/kill-process:
### Add Cancellation Support and Enhance Process Management
- **Cancellation Handle Implementation**: Introduced `CancellationHandle` in `cancellation_handle.rs` to facilitate cancellation of futures and streams.
- **Process Management Enhancements**:
- Updated `ProcessManager` in `process_manager.rs` to support cancellable processes using `CancellableProcess`.
- Added `kill_process` method for terminating processes.
- **Stream Wrapper Update**:
- Replaced `StreamWrapper` with `CancellableStreamWrapper` in `stream_wrapper.rs` and `instance.rs` to handle stream cancellation.
- **Error Handling**:
- Added `StreamCancelled` error variant in `error.rs` to handle stream cancellation scenarios.
- **gRPC Handler Update**:
- Added `kill_process` gRPC method in `frontend_grpc_handler.rs` to allow external process termination.
- **Dependency Updates**:
- Updated `Cargo.lock` and `Cargo.toml` to include `common-base` and `tokio-util`.
Signed-off-by: Lei, HUANG <mrsatangel@gmail.com>
* feat/kill-process:
**Enhancements and Bug Fixes**
- **Dependency Update**: Updated `greptime-proto` dependency in `Cargo.lock` and `Cargo.toml` to a new revision.
- **Error Handling Improvements**:
- Modified error variants in `src/catalog/src/error.rs` and `src/common/frontend/src/error.rs` to improve error messages and handling.
- Added `FrontendNotFound` error variant for better error specificity.
- **Process Management Enhancements**:
- Updated `ProcessManager` in `src/catalog/src/process_manager.rs` to include `kill_process` functionality with server address validation.
- Enhanced `FrontendClient` trait in `src/common/frontend/src/selector.rs` to support `kill_process` requests.
- **gRPC Handler Update**:
- Refactored `FrontendGrpcHandler` in `src/servers/src/grpc/frontend_grpc_handler.rs` to handle `kill_process` requests asynchronously and return process status.
Signed-off-by: Lei, HUANG <mrsatangel@gmail.com>
* feat/kill-process:
### Add Kill Process Functionality
- **`Cargo.lock`, `Cargo.toml`**: Added `common-frontend` as a dependency.
- **`server.rs`, `builder.rs`, `instance.rs`**: Updated `FrontendInvoker` and `FrontendBuilder` to support process management.
- **`error.rs`**: Introduced `InvalidProcessId` error for handling invalid process IDs.
- **`statement.rs`, `kill.rs`**: Implemented `execute_kill` method in `StatementExecutor` to handle the `KILL` statement.
- **`parser.rs`, `statement.rs`**: Updated SQL parser to recognize and parse the `KILL` statement.
Signed-off-by: Lei, HUANG <mrsatangel@gmail.com>
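To illustrate the parsing step, here is a rough sketch of recognizing a `KILL` statement whose argument is a process id string literal. The `Kill` struct, the `parse_kill` function, and the id format used in the usage note are assumptions for illustration; the real parser in `parser.rs` is not reproduced here.

```rust
/// Hypothetical parsed form of a `KILL` statement.
#[derive(Debug, PartialEq)]
struct Kill {
    process_id: String,
}

/// Parses `KILL '<process id>'`. The id must be a non-empty,
/// single-quoted string literal; anything else is rejected.
fn parse_kill(sql: &str) -> Result<Kill, String> {
    fn expected_literal() -> String {
        "expected a process id string literal".to_string()
    }

    let sql = sql.trim().trim_end_matches(';').trim_end();
    let (keyword, rest) = sql
        .split_once(char::is_whitespace)
        .ok_or_else(expected_literal)?;
    if !keyword.eq_ignore_ascii_case("KILL") {
        return Err(format!("expected KILL, found {keyword}"));
    }
    let id = rest
        .trim()
        .strip_prefix('\'')
        .and_then(|s| s.strip_suffix('\''))
        .filter(|s| !s.is_empty() && !s.contains('\''))
        .ok_or_else(expected_literal)?;
    Ok(Kill {
        process_id: id.to_string(),
    })
}
```

For example, `parse_kill("KILL 'some-id';")` yields the id `some-id`, while `parse_kill("KILL 42")` is rejected because the argument is not quoted.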
* feat/kill-process:
## Add Cancellation Support to Query Execution
- **`process_manager.rs`**: Updated `CancellationHandle` initialization to use `default()` method.
- **`cancellation_handle.rs`**: Implemented `Debug` trait for `CancellationHandle` and added `Cancellation` and `CancellableFuture` structs to support cancellable futures.
- **`error.rs`**: Introduced `Cancelled` error variant to handle query cancellations.
- **`instance.rs`**: Integrated `CancellableFuture` to manage query execution with cancellation support.
- **`stream_wrapper.rs`**: Modified `CancellableStreamWrapper` to use the new `waker()` method for cancellation handling.
- **`statement.rs`**: Added `#[allow(clippy::too_many_arguments)]` to `StatementExecutor::new` to suppress clippy warnings.
Signed-off-by: Lei, HUANG <mrsatangel@gmail.com>
* feat/kill-process:
- **Add `MetaClientMissing` Error**: Introduced a new error variant `MetaClientMissing` in `error.rs` to handle missing meta client scenarios.
- **Refactor Cancellation Handling**: Merged `cancellation_handle.rs` into `cancellation.rs` and updated related logic in `process_manager.rs`, `instance.rs`, and `stream_wrapper.rs`.
- **Enhance Process Management**: Improved process management logic in `process_manager.rs` to handle process cancellation more effectively.
- **Update Tests**: Added and updated tests in `cancellation.rs` and `stream_wrapper.rs` to cover new cancellation logic and error handling.
- **Cargo.toml Update**: Adjusted workspace settings in `Cargo.toml` for `common-frontend`.
Signed-off-by: Lei, HUANG <mrsatangel@gmail.com>
* feat/kill-process:
- **Add Tests for Process Management**: Introduced multiple async tests in `process_manager.rs` to verify query registration, deregistration, cancellation, and process killing functionalities.
- **Update Error Message in SQL Parser**: Modified the expected error message in `parser.rs` to clarify the expected token as a "process id string literal".
Signed-off-by: Lei, HUANG <mrsatangel@gmail.com>
* feat/kill-process:
### Add Process Count Metrics to Catalog
- **`metrics.rs`**: Introduced a new metric `PROCESS_LIST_COUNT` to track the count of running processes per catalog using `IntGaugeVec`.
- **`process_manager.rs`**: Updated `CancellableProcess` to increment and decrement `PROCESS_LIST_COUNT` upon creation and destruction, respectively. Added a `Drop` implementation for `CancellableProcess` to handle metric updates.
Signed-off-by: Lei, HUANG <mrsatangel@gmail.com>
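The increment-on-create, decrement-on-drop pattern described above can be sketched with a plain atomic counter. `TrackedProcess` and the `Arc<AtomicI64>` gauge are stand-ins chosen for this example; the actual code uses an `IntGaugeVec` labeled by catalog.

```rust
use std::sync::atomic::{AtomicI64, Ordering};
use std::sync::Arc;

/// Stand-in for a process entry that keeps a gauge in sync with its lifetime.
struct TrackedProcess {
    gauge: Arc<AtomicI64>,
}

impl TrackedProcess {
    /// Creating a process increments the gauge.
    fn new(gauge: Arc<AtomicI64>) -> Self {
        gauge.fetch_add(1, Ordering::Relaxed);
        Self { gauge }
    }
}

impl Drop for TrackedProcess {
    /// Dropping a process decrements the gauge, whatever path led to the drop.
    fn drop(&mut self) {
        self.gauge.fetch_sub(1, Ordering::Relaxed);
    }
}
```

Tying the decrement to `Drop` means the count stays correct on early returns and on cancellation, without an explicit cleanup call.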
* feat/kill-process:
### Fix process removal logic in `process_manager.rs`
- Corrected the condition for removing an entry from the catalog in `ProcessManager` by using `o.get()` instead of `o.get_mut()`.
Signed-off-by: Lei, HUANG <mrsatangel@gmail.com>
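For context, a small sketch of removing a map entry through the `Entry` API once its contents are empty, using `o.get()` for the emptiness check. The `deregister` function and the `HashMap<String, Vec<u64>>` layout are assumptions for illustration, not the real `ProcessManager` types.

```rust
use std::collections::hash_map::Entry;
use std::collections::HashMap;

/// Removes `id` from the catalog's process list and drops the catalog
/// entry once no processes remain.
fn deregister(catalogs: &mut HashMap<String, Vec<u64>>, catalog: &str, id: u64) {
    if let Entry::Occupied(mut o) = catalogs.entry(catalog.to_string()) {
        // Mutation needs `get_mut()`.
        o.get_mut().retain(|p| *p != id);
        // The emptiness check only reads, so a shared `get()` is enough.
        if o.get().is_empty() {
            o.remove();
        }
    }
}
```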
* feat/kill-process:
- **Error Handling Improvements**:
- Updated status codes for `Error::FrontendNotFound` and `Error::MetaClientMissing` to `StatusCode::Unexpected` in `src/catalog/src/error.rs`.
- Changed `InvokeFrontend` error display message and status code in `src/common/frontend/src/error.rs`.
- Added `ProcessManagerMissing` error in `src/operator/src/error.rs` and updated its handling in `src/operator/src/statement/kill.rs`.
- **Process Management Enhancements**:
- Added documentation for `ProcessManager` and `register_query` in `src/catalog/src/process_manager.rs`.
- Modified `kill_process` response handling in `src/servers/src/grpc/frontend_grpc_handler.rs`.
- **Cancellation Logic Update**:
- Improved cancellation logic in `src/common/base/src/cancellation.rs` to use `compare_exchange` for atomic operations.
Signed-off-by: Lei, HUANG <mrsatangel@gmail.com>
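One way `compare_exchange` can be used for cancellation is to let exactly one caller observe the transition from "running" to "cancelled". The `CancelFlag` type below is a hypothetical, simplified stand-in and does not mirror the structs in `cancellation.rs`.

```rust
use std::sync::atomic::{AtomicBool, Ordering};

/// Simplified cancellation flag for illustration.
#[derive(Default)]
struct CancelFlag {
    cancelled: AtomicBool,
}

impl CancelFlag {
    /// Returns `true` only for the caller that actually flips the flag;
    /// later calls see it already set and return `false`.
    fn cancel(&self) -> bool {
        self.cancelled
            .compare_exchange(false, true, Ordering::AcqRel, Ordering::Acquire)
            .is_ok()
    }

    /// Reports whether cancellation has been requested.
    fn is_cancelled(&self) -> bool {
        self.cancelled.load(Ordering::Acquire)
    }
}
```

The single-winner property is useful when cancelling also triggers a side effect, such as waking a task, that should happen only once.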
* feat/kill-process:
### Add Process Kill Count Metric and Refactor Cancellation Handle
- **Metrics Update**: Added a new metric `PROCESS_KILL_COUNT` in `metrics.rs` to track the count of completed kill process requests per catalog.
- **Refactor Cancellation Handle**: Renamed `cancellation_handler` to `cancellation_handle` across multiple files for consistency:
- `process_manager.rs`
- `instance.rs`
- `stream_wrapper.rs`
- **Process Management**: Updated process management logic in `process_manager.rs` to increment the `PROCESS_KILL_COUNT` metric upon successful process termination.
Signed-off-by: Lei, HUANG <mrsatangel@gmail.com>
* feat/kill-process:
Update metric description in `metrics.rs`
- Changed the description of `PROCESS_KILL_COUNT` to reflect the count of killed processes instead of running processes in `metrics.rs`.
Signed-off-by: Lei, HUANG <mrsatangel@gmail.com>
* feat/kill-process:
Update `greptime-proto` Dependency and Fix Response Field
- **Updated Dependency**: Changed the `greptime-proto` Git revision in `Cargo.lock` and `Cargo.toml` to `f0913f1`.
- **Code Fix**: Modified `frontend_grpc_handler.rs` to correct the response field from `found` to `success` in `KillProcessResponse`.
Signed-off-by: Lei, HUANG <mrsatangel@gmail.com>
---------
Signed-off-by: Lei, HUANG <mrsatangel@gmail.com>
* feat: process id for session, query context and postgres
Signed-off-by: Ning Sun <sunning@greptime.com>
* feat: add sql functions to retrieve connection/process id
Signed-off-by: Ning Sun <sunning@greptime.com>
---------
Signed-off-by: Ning Sun <sunning@greptime.com>
* fix/file-group-in-compaction:
### Enhance Compaction Logic with File Grouping
- **`run.rs`**: Introduced `FileGroup` struct to manage groups of `FileHandle` objects, allowing for more efficient compaction operations. Updated `Ranged` and `Item` trait implementations to work with `FileGroup`.
- **`test_util.rs`**: Added `new_file_handle_with_sequence` function to support file handles with sequence numbers, enhancing test utilities.
- **`twcs.rs`**: Modified `TwcsPicker` to utilize `FileGroup` for managing files within windows, improving compaction logic. Updated `Window` struct to use `HashMap` for storing `FileGroup` objects.
- **`version_util.rs`**: Updated version control utilities to handle sequence numbers in file metadata, aligning with new compaction logic.
Signed-off-by: Lei, HUANG <lhuang@greptime.com>
* fix/file-group-in-compaction:
### Add Test for File Group Assignment in TWCS
- **Enhancements in `twcs.rs`:**
- Added a new test `test_assign_file_groups_to_windows` to verify the correct assignment of file groups to windows.
- Enhanced `test_assign_compacting_to_windows` with a new case to ensure files with overlapping time ranges and the same sequence are treated as one `FileGroup`.
Signed-off-by: Lei, HUANG <lhuang@greptime.com>
* fix/file-group-in-compaction:
**Enhance Compaction Task Documentation and Initialization**
- **`run.rs`**: Added documentation for `FileGroup` to clarify its role in representing a group of files created by the same compaction task.
- **`twcs.rs`**: Introduced comments in the `Window` struct to explain the mapping of file sequences to file groups, indicating files created from the same compaction task. Simplified the initialization of the `files` hashmap using `HashMap::from`.
Signed-off-by: Lei, HUANG <mrsatangel@gmail.com>
---------
Signed-off-by: Lei, HUANG <lhuang@greptime.com>
Signed-off-by: Lei, HUANG <mrsatangel@gmail.com>
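The grouping idea in this change set, where files that share a sequence are treated as one unit, can be sketched as follows. `FileMeta` and `group_by_sequence` are hypothetical names; the real `FileGroup` holds `FileHandle` objects and carries more state.

```rust
use std::collections::HashMap;

/// Minimal stand-in for a file handle: a name plus the sequence
/// assigned by the task that produced it.
#[derive(Debug, Clone, PartialEq)]
struct FileMeta {
    name: &'static str,
    sequence: u64,
}

/// Groups files by sequence so that outputs of the same task
/// end up in the same group.
fn group_by_sequence(files: Vec<FileMeta>) -> HashMap<u64, Vec<FileMeta>> {
    let mut groups: HashMap<u64, Vec<FileMeta>> = HashMap::new();
    for file in files {
        groups.entry(file.sequence).or_default().push(file);
    }
    groups
}
```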
* ### Add Process List Management
- **Error Handling Enhancements**:
* refactor: Update test IP addresses to include ports in ProcessKey
* feat/show-process-list:
Refactor Process Management in Meta Module
- Introduced `ProcessManager` for handling process registration and deregistration.
- Added methods for managing and querying process states, including `register_query`, `deregister_query`, and `list_all_processes`.
- Removed redundant process management code from the query module.
- Updated error handling to reflect changes in process management.
- Enhanced test coverage for process management functionalities.
* chore: rebase main
* add information schema process list table
* integrate process list table to system catalog
* build ProcessManager on frontend and standalone mode
* feat/show-process-list:
**Add Process Management Enhancements**
- **`manager.rs`**: Introduced `process_manager` to `SystemCatalog` and `KvBackendCatalogManager` for improved process handling.
- **`information_schema.rs`**: Updated table insertion logic to conditionally include `PROCESS_LIST`.
- **`frontend.rs`, `standalone.rs`**: Enhanced `StartCommand` to clone `process_manager` for better resource management.
- **`instance.rs`, `builder.rs`**: Integrated `ProcessManager` into `Instance` and `FrontendBuilder` to manage query
* feat/show-process-list:
### Add Process Listing and Error Handling Enhancements
- **Error Handling**: Introduced a new error variant `ListProcess` in `error.rs` to handle failures when listing running processes.
- **Process List Implementation**: Enhanced `InformationSchemaProcessList` in `process_list.rs` to track running queries, including defining column names and implementing the `make_process_list` function to build the process list.
- **Frontend Builder**: Added a `#[allow(clippy::too_many_arguments)]` attribute in `builder.rs` to suppress Clippy warnings for the `FrontendBuilder::new` function.
These changes improve error handling and process tracking capabilities within the system.
* feat/show-process-list:
Refactor imports in `process_list.rs`
- Updated import paths for `Predicates` and `InformationTable` in `process_list.rs` to align with the new module structure.
* feat/show-process-list:
Refactor process list generation in `process_list.rs`
- Simplified the process list generation by removing intermediate row storage and directly building vectors.
- Updated `process_to_row` function to use a mutable vector for current row data, improving memory efficiency.
- Removed `rows_to_record_batch` function, integrating its logic directly into the main loop for streamlined processing.
* wip: move ProcessManager to catalog crate
* feat/show-process-list:
- **Refactor Row Construction**: Updated row construction in multiple files to use references for `Value` objects, improving memory efficiency. Affected files include:
- `cluster_info.rs`
- `columns.rs`
- `flows.rs`
- `key_column_usage.rs`
- `partitions.rs`
- `procedure_info.rs`
- `process_list.rs`
- `region_peers.rs`
- `region_statistics.rs`
- `schemata.rs`
- `table_constraints.rs`
- `tables.rs`
- `views.rs`
- `pg_class.rs`
- `pg_database.rs`
- `pg_namespace.rs`
- **Remove Unused Code**: Deleted unused functions and error variants related to process management in `process_list.rs` and `error.rs`.
- **Predicate Evaluation Update**: Modified predicate evaluation functions in `predicate.rs` to work with references, enhancing performance.
* feat/show-process-list:
### Implement Process Management Enhancements
- **Error Handling Enhancements**:
- Added new error variants `BumpSequence`, `StartReportTask`, `ReportProcess`, and `BuildProcessManager` in `error.rs` to improve error handling for process management tasks.
- Updated `ErrorExt` implementations to handle new error types.
- **Process Manager Improvements**:
- Introduced `ProcessManager` enhancements in `process_manager.rs` to manage process states using `ProcessWithState` and `ProcessState` enums.
- Implemented periodic task `ReportTask` to report running queries to the KV backend.
- Modified `register_query` and `deregister_query` methods to use the new state management system.
- **Testing and Validation**:
- Updated tests in `process_manager.rs` to validate new process management logic.
- Replaced `dump` method with `list_all_processes` for listing processes.
- **Integration with Frontend and Standalone**:
- Updated `frontend.rs` and `standalone.rs` to handle `ProcessManager` initialization errors using `BuildProcessManager` error variant.
- **Schema Adjustments**:
- Modified `process_list.rs` in `system_schema/information_schema` to use the updated process listing method.
- **Key-Value Conversion**:
- Added `TryFrom` implementation for converting `Process` to `KeyValue` in `process_list.rs`.
* chore: remove register
* fix: sqlness tests
* merge main
Signed-off-by: Lei, HUANG <lhuang@greptime.com>
* feat/show-process-list:
- **Update `greptime-proto` Dependency**: Updated the `greptime-proto` dependency in `Cargo.lock` and `Cargo.toml` to a new revision.
- **Refactor `ProcessManager`**: Simplified the `ProcessManager` implementation by removing the use of `KvBackendRef` and `SequenceRef`, and replaced them with `AtomicU64` and `RwLock` for managing process IDs and catalogs in `process_manager.rs`.
- **Remove Process List Metadata**: Deleted the `process_list.rs` file and removed related metadata key definitions in `key.rs`.
- **Update Process List Logic**: Modified the process list logic in `process_list.rs` to use the new `ProcessManager` structure.
- **Adjust Frontend and Standalone Start Commands**: Updated `frontend.rs` and `standalone.rs` to use the new `ProcessManager` constructor.
Signed-off-by: Lei, HUANG <lhuang@greptime.com>
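A rough sketch of an in-memory registry that uses `AtomicU64` for process ids and `RwLock` for the per-catalog map, in the spirit of the refactor above. `ProcessRegistry`, its method names, and the optional catalog filter in `list` are assumptions made for this example and differ from the real `ProcessManager`.

```rust
use std::collections::HashMap;
use std::sync::atomic::{AtomicU64, Ordering};
use std::sync::RwLock;

/// Hypothetical, simplified process registry.
#[derive(Default)]
struct ProcessRegistry {
    next_id: AtomicU64,
    /// catalog name -> (process id -> query text)
    catalogs: RwLock<HashMap<String, HashMap<u64, String>>>,
}

impl ProcessRegistry {
    /// Allocates a fresh id and records the query under its catalog.
    fn register(&self, catalog: &str, query: &str) -> u64 {
        let id = self.next_id.fetch_add(1, Ordering::Relaxed);
        self.catalogs
            .write()
            .unwrap()
            .entry(catalog.to_string())
            .or_default()
            .insert(id, query.to_string());
        id
    }

    /// Removes a process and drops the catalog entry once it is empty.
    fn deregister(&self, catalog: &str, id: u64) {
        let mut catalogs = self.catalogs.write().unwrap();
        let now_empty = match catalogs.get_mut(catalog) {
            Some(procs) => {
                procs.remove(&id);
                procs.is_empty()
            }
            None => false,
        };
        if now_empty {
            catalogs.remove(catalog);
        }
    }

    /// Lists process ids in ascending order, optionally for one catalog.
    fn list(&self, catalog: Option<&str>) -> Vec<u64> {
        let catalogs = self.catalogs.read().unwrap();
        let mut ids: Vec<u64> = catalogs
            .iter()
            .filter(|(name, _)| catalog.map_or(true, |c| c == name.as_str()))
            .flat_map(|(_, procs)| procs.keys().copied())
            .collect();
        ids.sort_unstable();
        ids
    }
}
```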
* feat/show-process-list:
- **Update `greptime-proto` Dependency**: Updated the `greptime-proto` dependency version in `Cargo.lock` and `Cargo.toml` to a new commit hash.
- **Refactor Error Handling**: Removed unused error variants and added a new `ParseProcessId` error in `src/catalog/src/error.rs`.
- **Enhance Process Management**: Introduced `DisplayProcessId` struct for better process ID representation and parsing in `src/catalog/src/process_manager.rs`.
- **Revise Process List Schema**: Updated the schema and logic for process listing in `src/catalog/src/system_schema/information_schema/process_list.rs` to include new fields like `client` and `frontend`.
Signed-off-by: Lei, HUANG <lhuang@greptime.com>
* feat/show-process-list:
**Enhancements and Refactoring**
- **Process Management:**
- Refactored `ProcessManager` to list local processes with an optional catalog filter in `process_manager.rs`.
- Updated related tests in `process_manager.rs` and `process_list.rs`.
- **Client Enhancements:**
- Added `frontend_client` method in `client.rs` to support gRPC communication with the frontend.
- **Error Handling:**
- Extended error handling in `error.rs` to include gRPC and Meta errors.
- **Frontend Module:**
- Introduced `selector.rs` for frontend client selection and process listing.
- Updated `Cargo.toml` to include new dependencies and dev-dependencies.
- **gRPC Server:**
- Integrated `FrontendServer` in `builder.rs` for enhanced gRPC server capabilities.
Signed-off-by: Lei, HUANG <lhuang@greptime.com>
* feat/show-process-list:
**Refactor Process Management and Frontend Integration**
- **Add `common-frontend` Dependency**:
- Updated `Cargo.lock`, `Cargo.toml` files to include `common-frontend` as a dependency.
- **Refactor Process Management**:
- Moved `ProcessManager` trait and `DisplayProcessId` struct to `common-frontend`.
- Updated `process_manager.rs` to use `MetaProcessManager` and `ProcessManagerRef`.
- Removed `ParseProcessId` error variant from `error.rs` in `catalog` and `frontend`.
- **Frontend gRPC Service**:
- Added `frontend_grpc_handler.rs` to handle gRPC requests for frontend processes.
- Updated `grpc.rs` and `builder.rs` to integrate `FrontendGrpcHandler`.
- **Update Tests**:
- Modified tests in `process_manager.rs` to align with new `ProcessManager` implementation.
- **Remove Unused Code**:
- Removed `DisplayProcessId` and related parsing logic from `process_manager.rs`.
Signed-off-by: Lei, HUANG <lhuang@greptime.com>
* feat/show-process-list:
### Add `MetaClientRef` to `MetaProcessManager` and Update Instantiation
- **Files Modified**:
- `src/catalog/src/process_manager.rs`
- `src/cmd/src/frontend.rs`
- `src/cmd/src/standalone.rs`
- **Key Changes**:
- Added `MetaClientRef` as an optional parameter to the `MetaProcessManager::new` method.
- Updated instantiation of `MetaProcessManager` to include `MetaClientRef` where applicable.
### Update `ProcessManagerRef` Usage
- **Files Modified**:
- `src/catalog/src/kvbackend/manager.rs`
- `src/catalog/src/system_schema/information_schema.rs`
- `src/catalog/src/system_schema/information_schema/process_list.rs`
- `src/frontend/src/instance.rs`
- `src/frontend/src/instance/builder.rs`
- **Key Changes**:
- Ensured consistent usage of `ProcessManagerRef` across various modules.
Signed-off-by: Lei, HUANG <lhuang@greptime.com>
* feat/show-process-list:
## Refactor Process Management
- **Unified Process Manager**:
- Replaced `MetaProcessManager` with `ProcessManager` across the codebase.
- Updated `ProcessManager` to use `Arc` for shared references and introduced a `Ticket` struct for query registration and deregistration.
- Affected files: `manager.rs`, `process_manager.rs`, `frontend.rs`, `standalone.rs`, `frontend_grpc_handler.rs`, `instance.rs`, `builder.rs`, `cluster.rs`, `standalone.rs`.
- **Stream Wrapper Implementation**:
- Added `StreamWrapper` to handle record batch streams with process management.
- Affected file: `stream_wrapper.rs`.
- **Test Adjustments**:
- Updated tests to align with the new `ProcessManager` implementation.
- Affected files: `tests-integration/src/cluster.rs`, `tests-integration/src/standalone.rs`.
Signed-off-by: Lei, HUANG <lhuang@greptime.com>
* feat/show-process-list:
### Add Error Handling and Process Management
- **Error Handling Enhancements**:
- Added new error variants `ListProcess` and `CreateChannel` in `error.rs` to handle specific gRPC service invocation failures.
- Updated error handling in `selector.rs` to use the new error variants for better context and error propagation.
- **Process Management Integration**:
- Introduced `process_manager` method in `instance.rs` to access the process manager.
- Integrated `FrontendGrpcHandler` with process management in `server.rs` to handle gRPC requests related to process management.
- **gRPC Server Enhancements**:
- Made `frontend_grpc_handler` public in `grpc.rs` to allow external access and integration with other modules.
Signed-off-by: Lei, HUANG <lhuang@greptime.com>
* feat/show-process-list:
Update `greptime-proto` dependency and enhance process management
- **Dependency Update**: Updated `greptime-proto` in `Cargo.lock` and `Cargo.toml` to a new revision.
- **Process Management**:
- Modified `process_manager.rs` to include catalog filtering in `list_process`.
- Updated `frontend_grpc_handler.rs` to handle catalog filtering in `list_process` requests.
- **System Schema**: Added a TODO comment in `process_list.rs` for future user catalog filtering implementation.
Signed-off-by: Lei, HUANG <lhuang@greptime.com>
* feat/show-process-list:
- **Update Workspace Dependencies**:
- Modified `Cargo.toml` files in `src/catalog`, `src/common/frontend`, and `src/servers` to adjust workspace dependencies.
- **Refactor `ProcessManager` Logic**:
- Updated `process_manager.rs` to simplify the condition in the `select` method.
- **Remove Unused Error Variants**:
- Deleted `BuildProcessManager` error variant from `error.rs` in `src/cmd`.
- Removed `InvalidProcessKey` error variant from `error.rs` in `src/common/meta`.
- **Add License Header**:
- Added Apache License header to `stream_wrapper.rs` in `src/frontend`.
- **Update Test Results**:
- Adjusted expected results in `information_schema.result` to reflect changes in the schema.
Signed-off-by: Lei, HUANG <lhuang@greptime.com>
* feat/show-process-list:
### Add Error Handling for Process Listing
- **`src/catalog/src/error.rs`**: Introduced a new error variant `ListProcess` to handle failures in listing frontend nodes.
- **`src/catalog/src/process_manager.rs`**: Updated `local_processes` and `list_all_processes` methods to return the new error type, adding context for error handling.
- **`src/catalog/src/system_schema/information_schema/process_list.rs`**: Modified `make_process_list` to propagate errors using the new error handling mechanism.
- **`src/servers/src/grpc/frontend_grpc_handler.rs`**: Enhanced error handling in the `list_process` method to log errors and return appropriate gRPC status codes.
Signed-off-by: Lei, HUANG <lhuang@greptime.com>
* feat/show-process-list:
Update `greptime-proto` Dependency and Remove `frontend_client` Method
- **Cargo.lock** and **Cargo.toml**: Updated the `greptime-proto` dependency to a new revision (`5f6119ac7952878d39dcde0343c4bf828d18ffc8`).
- **src/client/src/client.rs**: Removed the `frontend_client` method from the `Client` implementation.
Signed-off-by: Lei, HUANG <lhuang@greptime.com>
* feat/show-process-list:
### Add Query Registration with Pre-Generated ID
- **`process_manager.rs`**: Introduced `register_query_with_id` method to allow registering queries with a pre-generated ID. This includes creating a `ProcessInfo` instance and inserting it into the catalog. Added `next_id` method to generate the next process ID.
Signed-off-by: Lei, HUANG <lhuang@greptime.com>
* feat/show-process-list:
### Update Process List Retrieval Method
- **File**: `process_list.rs`
- Updated the method for retrieving process lists from `local_processes` to `list_all_processes` to support asynchronous operations.
Signed-off-by: Lei, HUANG <lhuang@greptime.com>
* feat/show-process-list:
### Update error handling in `error.rs`
- Refined status code handling for `CreateChannel` error by delegating to `source.status_code()`.
- Separated `ListProcess` and `CreateChannel` error handling for clarity.
Signed-off-by: Lei, HUANG <lhuang@greptime.com>
---------
Signed-off-by: Lei, HUANG <lhuang@greptime.com>
* fix/config-docs:
Update `config.md` to specify default compression mode
- Added default value `none` for `grpc.flight_compression` in both frontend and datanode sections of `config/config.md`.
Signed-off-by: Lei, HUANG <lhuang@greptime.com>
* chore/enable-flight-encoder:
### Add Flight Compression Support
- **Configuration Updates**:
- Added `grpc.flight_compression` option to `config/config.md`, `config/datanode.example.toml`, and `config/frontend.example.toml` to specify compression modes for Arrow IPC service.
- **Code Enhancements**:
- Updated `FlightEncoder` in `src/common/grpc/src/flight.rs` to support compression modes.
- Modified `RegionServer` and `DatanodeBuilder` in `src/datanode/src/datanode.rs` and `src/datanode/src/region_server.rs` to handle `FlightCompression`.
- Integrated `FlightCompression` in `src/servers/src/grpc.rs` and `src/servers/src/grpc/flight.rs` to manage compression settings.
- **Testing and Integration**:
- Updated test utilities and integration tests in `tests-integration/src/grpc/flight.rs` and `tests-integration/src/test_util.rs` to include `FlightCompression`.
Signed-off-by: Lei, HUANG <mrsatangel@gmail.com>
* chore/enable-flight-encoder:
### Enable Compression in FlightClient
- **`client.rs`**: Updated `make_flight_client` to accept `send_compression` and `accept_compression` parameters, enabling Zstd compression for sending and receiving messages.
- **`client_manager.rs`**: Modified `datanode` method to pass compression settings from `ChannelConfig` to `RegionRequester`.
- **`database.rs`**: Adjusted calls to `make_flight_client` to include compression parameters.
- **`region.rs`**: Updated `RegionRequester` to store and utilize compression settings.
- **`frontend.rs`**: Configured `ChannelConfig` to enable compression based on options.
- **`channel_manager.rs`**: Added `send_compression` and `accept_compression` fields to `ChannelConfig` with default values and updated tests accordingly.
Signed-off-by: Lei, HUANG <lhuang@greptime.com>
* chore/enable-flight-encoder:
### Update Compression Defaults and Documentation
- **Configuration Files**: Updated `datanode.example.toml` and `frontend.example.toml` to include a default setting comment for `flight_compression`, specifying it defaults to `none`.
- **gRPC Server Code**: Modified `grpc.rs` to set `None` as the default for `FlightCompression` instead of `ArrowIpc`.
Signed-off-by: Lei, HUANG <lhuang@greptime.com>
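As an illustration of defaulting to no compression, the sketch below models the setting as an enum whose `Default` is `None`. Only the `None` and `ArrowIpc` variants are named in the commit text; the `parse_flight_compression` helper and the `"arrow_ipc"` spelling are assumptions for this example.

```rust
/// Simplified stand-in for the flight compression setting.
#[derive(Debug, Clone, Copy, PartialEq, Eq, Default)]
enum FlightCompression {
    /// No compression; the default.
    #[default]
    None,
    /// Compress Arrow IPC payloads.
    ArrowIpc,
}

/// Hypothetical parser for a config value; an absent value
/// falls back to the default.
fn parse_flight_compression(value: Option<&str>) -> Result<FlightCompression, String> {
    match value {
        None => Ok(FlightCompression::default()),
        Some("none") => Ok(FlightCompression::None),
        // The "arrow_ipc" spelling is assumed for illustration.
        Some("arrow_ipc") => Ok(FlightCompression::ArrowIpc),
        Some(other) => Err(format!("unknown flight compression mode: {other}")),
    }
}
```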
---------
Signed-off-by: Lei, HUANG <mrsatangel@gmail.com>
Signed-off-by: Lei, HUANG <lhuang@greptime.com>
* chore: describe pods on CI failure
Signed-off-by: WenyXu <wenymedia@gmail.com>
* chore: increase memory limit for main pod template from 2Gi to 3Gi
Signed-off-by: WenyXu <wenymedia@gmail.com>
---------
Signed-off-by: WenyXu <wenymedia@gmail.com>
* feat/disable-flight-compression:
### Commit Summary
- **Add Compression Control in Flight Encoder**: Introduced a new method `with_compression_disabled` in `FlightEncoder` to allow encoding without compression in `flight.rs`.
- **Update Flight Stream Initialization**: Modified `FlightRecordBatchStream` to use the new `FlightEncoder::with_compression_disabled` method for initializing the encoder in `stream.rs`.
* feat/disable-flight-compression:
Remove Unused Import in `flight.rs`
- Removed the unused import `write_message` from `flight.rs` to clean up the codebase.
* feat/disable-flight-compression:
### Disable Compression in Flight Encoder
- Updated `tests-integration/src/grpc/flight.rs` to use `FlightEncoder::with_compression_disabled()` instead of `FlightEncoder::default()` for encoding `FlightMessage::Schema` and `FlightMessage::RecordBatch`. This change disables compression in the Flight encoder for these operations.
Signed-off-by: Lei, HUANG <lhuang@greptime.com>
* disable flight client compression
Signed-off-by: Ruihang Xia <waynestxia@gmail.com>
---------
Signed-off-by: Lei, HUANG <lhuang@greptime.com>
Signed-off-by: Ruihang Xia <waynestxia@gmail.com>
Co-authored-by: Ruihang Xia <waynestxia@gmail.com>
* fix unit tests
* fix: sqlness
* fix/default-time-window:
## Add Helper Functions and Enhance Compaction Tests
- **Refactor Compaction Logic**: Introduced helper functions `flush` and `compact` in `compaction_test.rs` to streamline compaction operations.
- **Enhance Compaction Tests**: Added a new test `test_infer_compaction_time_window` in `compaction_test.rs` to verify compaction time window inference.
- **Testing Improvements**: Added `#[cfg(test)]` attribute to `new_multi_partitions` in `time_partition.rs` to ensure it's only included in test builds.
* fix/default-time-window:
- **Refactor `TimePartition` Struct**: Removed unnecessary comments regarding `time_range` in `time_partition.rs`.
- **Enhance `TimePartitions` Functionality**: Added a method `part_duration_or_default` to provide a default partition duration in `time_partition.rs`.
- **Update SQL Test Cases**: Modified SQL operations and expected results in `scan_big_varchar.result` and `scan_big_varchar.sql` to reflect changes in data manipulation logic.
* fix/default-time-window:
### Update Time Partition Default Duration
- **Refactor Default Duration**: Introduced `INITIAL_TIME_WINDOW` constant to define the default time window duration as `Duration::from_days(1)`. This change replaces multiple instances of the hardcoded default duration across the `time_partition.rs` file.
- **Files Affected**: `time_partition.rs`
* fix/default-time-window:
## Update Partition Duration Handling
- **`time_partition.rs`**: Refactored `part_duration` to be non-optional, removing `Option` wrapper. Updated logic to use `unwrap_or` with `INITIAL_TIME_WINDOW` where necessary. Adjusted related methods and tests to accommodate this change.
- **`version.rs` (memtable and region)**: Updated handling of `part_duration` to align with changes in `time_partition.rs`, ensuring consistent use of non-optional `Duration`.
* fix/default-time-window:
### Improve Error Context in `time_partition.rs`
- Enhanced error context message in `time_partition.rs` to provide clearer information on partition time range issues, including bucket size details.
Signed-off-by: Lei, HUANG <lhuang@greptime.com>
---------
Signed-off-by: Lei, HUANG <lhuang@greptime.com>
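The fallback to a one-day initial window can be sketched like this. The commit text mentions `Duration::from_days(1)`; this example uses `Duration::from_secs` instead so that it builds with the stable standard library, and the free function form of `part_duration_or_default` is an assumption (the real one is a method on `TimePartitions`).

```rust
use std::time::Duration;

/// Default time window of one day, expressed in seconds.
const INITIAL_TIME_WINDOW: Duration = Duration::from_secs(24 * 60 * 60);

/// Returns the configured partition duration, or the initial
/// time window when none has been set.
fn part_duration_or_default(part_duration: Option<Duration>) -> Duration {
    part_duration.unwrap_or(INITIAL_TIME_WINDOW)
}
```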
* feat(object_store): add support for Alibaba Cloud OSS
- Implement OSS backend in object_store module
- Add OSS-related options to ExportCommand
- Update build_operator to support OSS
- Modify parse_url to handle the OSS URL scheme
Signed-off-by: Logic <zqr10159@dromara.org>
* test(object_store): update OSS backend tests with comprehensive scenarios
- Remove minimal case test for OSS backend
- Update test for OSS backend with all fields valid
- Remove invalid allow_anonymous test case
Signed-off-by: Logic <zqr10159@dromara.org>
* feat(datasource): add support for OSS (Object Storage Service)
- Implement is_supported_in_oss function to check if a key is supported in OSS configuration
- Add build_oss_backend function for creating an OSS backend
- Update requests module to include OSS support check
Signed-off-by: Logic <zqr10159@dromara.org>
* refactor(export): enhance security and logging for sensitive data
- Replace plain strings with SecretString for sensitive information
- Implement masking of sensitive data in SQL logs
- Update handling of S3 and OSS credentials
Signed-off-by: Logic <zqr10159@dromara.org>
* refactor(export): generalize remote storage support and rename options
- Rename `s3_ddl_local_dir` to `ddl_local_dir` for better clarity
- Update comments to support both S3 and OSS remote storage options
- Modify logic to handle remote storage options more generically
Signed-off-by: Logic <zqr10159@dromara.org>
---------
Signed-off-by: Logic <zqr10159@dromara.org>
* wip
* feat: add cpu and memory limit gauge
* chore: add some test cases
* docs: polish some docs
* refactor: remove `#[cfg(target_os = "linux")]`
* refactor: add cfg(target_os) in get_cpu_limit() and get_memory_limit()
* feat: pipeline recognize hints from exec
* chore: rename and add test
* chore: minor improve
* chore: rename and add comments
* fix: typos
* feat: add initial impl for vrl processor
* chore: update processors to allow vrl process
* feat: pipeline recognize hints from exec
* chore: rename and add test
* chore: minor improve
* chore: rename and add comments
* fix: typos
* chore: remove unnecessary clone fn
* chore: group metrics
* chore: use struct in transform output enum
* test: add test for vrl
* fix: leaked conflicts
* chore: merge branch code & add check in compile
* fix: check condition
* fix: check auto-transform timeindex
* chore: support table_suffix in hint
* chore: add test for table suffix in vrl hint
* refactor: change context_opt to a struct
chore/allow-numberic-values-in-alter:
### Commit Message
Enhance `alter_parser.rs` to Support Numeric Values
- Updated `parse_string_options` function in `alter_parser.rs` to handle numeric literals in addition to string literals and `NULL` for alter table statements.
- Added a new test `test_parse_alter_with_numeric_value` in `alter_parser.rs` to verify the parsing of numeric values in alter table options.
* refactor: extract some common functions and structs in election module
* chore: add comments and modify a function name
* chore: add comments and modify a function name
* fix: missing 2 lines in license header
* fix: acqrel
* chore: apply comment suggestions
* Update src/meta-srv/src/election.rs
Co-authored-by: jeremyhi <jiachun_feng@proton.me>
---------
Co-authored-by: jeremyhi <jiachun_feng@proton.me>
* fix/initial-builder-cap:
### Enhance Series Initialization and Capacity Management
- **`simple_bulk_memtable.rs`**: Updated the `Series` initialization to use `with_capacity` with a specified capacity of 8192, improving memory management.
- **`time_series.rs`**: Introduced `with_capacity` method in `Series` to allow custom initial capacity for `ValueBuilder`. Adjusted `INITIAL_BUILDER_CAPACITY` to 16 for more efficient memory usage. Added a new `new` method to maintain backward compatibility.
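The relationship between `new`, `with_capacity`, and `INITIAL_BUILDER_CAPACITY` described above might look like the sketch below. The field layout of `Series` and `ValueBuilder` is a simplification assumed for illustration; only the method names and the constant value of 16 come from the commit message.

```rust
/// Default builder capacity, as stated in this commit (later commits lower it).
const INITIAL_BUILDER_CAPACITY: usize = 16;

/// Simplified stand-in for the real value builder: a single timestamp column.
struct ValueBuilder {
    timestamps: Vec<i64>,
}

impl ValueBuilder {
    fn with_capacity(capacity: usize) -> Self {
        Self {
            timestamps: Vec::with_capacity(capacity),
        }
    }
}

/// Simplified stand-in for a memtable series.
struct Series {
    active: ValueBuilder,
}

impl Series {
    /// Lets callers such as a bulk memtable pre-size the builder.
    fn with_capacity(capacity: usize) -> Self {
        Self {
            active: ValueBuilder::with_capacity(capacity),
        }
    }

    /// Kept so existing call sites continue to compile with the default capacity.
    fn new() -> Self {
        Self::with_capacity(INITIAL_BUILDER_CAPACITY)
    }

    /// Exposes the reserved capacity for inspection.
    fn capacity(&self) -> usize {
        self.active.timestamps.capacity()
    }
}
```

A small default suits workloads with many sparse series, while a bulk path that knows its batch size can reserve once up front.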
* fix/initial-builder-cap:
### Adjust Memory Allocation in Memtable
- **`simple_bulk_memtable.rs`**: Reduced the initial capacity of `Series` from 8192 to 1024 to optimize memory usage.
- **`time_series.rs`**: Decreased `INITIAL_BUILDER_CAPACITY` from 16 to 4 to improve efficiency in vector building.
* chore: support shared pipeline under catalog with compatibility
* test: add test for cross schema ref
* chore: use empty string schema by default
* chore: remove unwrap in the patch
* fix: df check
* feat: support SQL parsing for trigger show
* add excludes in licenserc
* refine comment
* fix: typo
* fix: add show/trigger.rs to excludes in licenserc
* feat/lossy-string-validation-in-prom-remote-write:
### Commit Message
#### Refactor Prometheus Validation Mode
- **Replace `is_strict_mode` with `PromValidationMode` Enum:**
- Updated `HttpOptions` and related structures to use `PromValidationMode` enum instead of the boolean `is_strict_mode`.
- Modified functions and tests to accommodate the new enum, ensuring flexible validation modes (`Strict`, `Lossy`, `Unchecked`).
- Affected files: `server.rs`, `prom_decode.rs`, `http.rs`, `prom_store.rs`, `prom_row_builder.rs`, `proto.rs`, `prom_store_test.rs`, `test_util.rs`, `http.rs`.
- **Enhance UTF-8 String Decoding:**
- Introduced `decode_string` function to handle UTF-8 string decoding based on the selected `PromValidationMode`.
- Affected files: `proto.rs`, `prom_row_builder.rs`.
This refactor improves the flexibility and clarity of Prometheus request handling by allowing different validation strategies.
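As a rough illustration of the three validation modes, the sketch below decodes a byte slice under each one. The enum variant names and the `decode_string` name come from the commit message; the signature, return type, and body are assumptions and may differ from the code in `proto.rs`.

```rust
use std::str::Utf8Error;

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum PromValidationMode {
    /// Reject invalid UTF-8 with an error.
    Strict,
    /// Replace invalid sequences with U+FFFD.
    Lossy,
    /// Skip validation entirely.
    Unchecked,
}

/// Hypothetical signature: decodes `bytes` according to `mode`.
fn decode_string(mode: PromValidationMode, bytes: &[u8]) -> Result<String, Utf8Error> {
    match mode {
        PromValidationMode::Strict => std::str::from_utf8(bytes).map(str::to_owned),
        PromValidationMode::Lossy => Ok(String::from_utf8_lossy(bytes).into_owned()),
        PromValidationMode::Unchecked => {
            // SAFETY: in this mode the caller vouches that `bytes` is valid UTF-8.
            // Passing invalid bytes here is undefined behavior.
            Ok(unsafe { std::str::from_utf8_unchecked(bytes) }.to_owned())
        }
    }
}
```

`Strict` pays for validation and surfaces bad input, `Lossy` pays for validation but never fails, and `Unchecked` trades safety for speed and is only sound when the sender is trusted.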
* feat/lossy-string-validation-in-prom-remote-write:
- **Add Prometheus Validation Mode Configuration:**
- Updated `config/config.md`, `config/frontend.example.toml`, and `config/standalone.example.toml` to include `http.prom_validation_mode` setting for Prometheus remote write requests.
- **Enhance Benchmarking for Prometheus Requests:**
- Modified `src/servers/benches/prom_decode.rs` to benchmark different Prometheus validation modes (`Strict`, `Lossy`, `Unchecked`).
- **Implement and Test String Decoding:**
- Added `decode_string` function and comprehensive tests in `src/servers/src/proto.rs` to handle string decoding with different validation modes.
* feat/lossy-string-validation-in-prom-remote-write:
### Add Histogram Buckets to Metrics
- **Files Modified**: `src/servers/src/metrics.rs`
- **Key Changes**:
- Added specific histogram buckets to `METRIC_MYSQL_QUERY_TIMER`, `METRIC_POSTGRES_QUERY_TIMER`, and `METRIC_SERVER_GRPC_PROM_REQUEST_TIMER` to enhance granularity in query elapsed time metrics.
* feat/lossy-string-validation-in-prom-remote-write:
### Update Prometheus Validation Mode Default
- **Config Documentation**: Updated the default description for `http.prom_validation_mode` to indicate that "strict" is the default option in `config.md`, `frontend.example.toml`, and `standalone.example.toml`.
- **HTTP Server Implementation**: Changed the default `prom_validation_mode` to `PromValidationMode::Strict` in `src/servers/src/http.rs`.
* feat/lossy-string-validation-in-prom-remote-write:
**Commit Message:**
Update Prometheus Validation Mode to Strict
- Changed `http.prom_validation_mode` from `unchecked` to `strict` in `config.md`, `frontend.example.toml`, and `standalone.example.toml` to enforce strict validation of Prometheus remote write requests.
* feat/bulk-wal:
### Refactor: Simplify Data Handling in LogStore Implementations
- **`kafka/log_store.rs`, `raft_engine/log_store.rs`, `wal.rs`, `raw_entry_reader.rs`, `logstore.rs`:**
- Refactored `entry` and `build_entry` functions to accept `Vec<u8>` directly instead of `&mut Vec<u8>`.
- Removed usage of `std::mem::take` for data handling, simplifying the code and improving readability.
- Updated test cases to align with the new function signatures.
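The signature change above can be illustrated by contrasting the two styles. The `Entry` struct and both function bodies are simplified assumptions; the real `entry` and `build_entry` functions take additional parameters.

```rust
/// Simplified stand-in for a WAL entry.
struct Entry {
    region_id: u64,
    data: Vec<u8>,
}

/// Old style: borrows the buffer mutably and empties it with `std::mem::take`.
fn build_entry_borrowed(data: &mut Vec<u8>, region_id: u64) -> Entry {
    Entry {
        region_id,
        data: std::mem::take(data),
    }
}

/// New style: takes ownership, so the move is visible in the signature itself.
fn build_entry(data: Vec<u8>, region_id: u64) -> Entry {
    Entry { region_id, data }
}
```

Both versions avoid copying the payload; the owned version simply removes the hidden side effect of leaving the caller with an empty buffer.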
* feat/bulk-wal:
### Add Support for Bulk WAL Entries and Flight Data Encoding
- **Add `raw_data` field to `BulkPart` and related structs**: Updated `BulkPart` and related structures in `src/mito2/src/memtable/bulk/part.rs`, `src/mito2/src/memtable/simple_bulk_memtable.rs`, `src/mito2/src/memtable/time_partition.rs`, `src/mito2/src/region_write_ctx.rs`, `src/mito2/src/worker/handle_bulk_insert.rs`, and `src/store-api/src/region_request.rs` to include a new `raw_data` field for handling Arrow IPC data.
- **Implement Flight Data Encoding**: Added a new module `flight` in `src/common/test-util/src/flight.rs` to encode record batches to Flight data format.
- **Update `greptime-proto` dependency**: Changed the revision of the `greptime-proto` dependency in `Cargo.lock` and `Cargo.toml`.
- **Enhance WAL Writer and Tests**: Modified `src/mito2/src/wal.rs` and related test files to support bulk WAL entries and added tests for encoding and handling bulk data.
* feat/bulk-wal:
- **Update `greptime-proto` Dependency**: Updated the `greptime-proto` dependency to a new revision in `Cargo.lock` and `Cargo.toml`.
- **Add `common-grpc` Dependency**: Added `common-grpc` as a dependency in `Cargo.lock` and `src/mito2/Cargo.toml`.
- **Refactor `BulkPart` Structure**: Removed `num_rows` field and added `num_rows()` method in `src/mito2/src/memtable/bulk/part.rs`. Updated related usages in `src/mito2/src/memtable/simple_bulk_memtable.rs`, `src/mito2/src/memtable/time_partition.rs`, `src/mito2/src/memtable/time_series.rs`, `src/mito2/src/region_write_ctx.rs`, and `src/mito2/src/worker/handle_bulk_insert.rs`.
- **Implement `TryFrom` and `From` for `BulkWalEntry`**: Added implementations for converting between `BulkPart` and `BulkWalEntry` in `src/mito2/src/memtable/bulk/part.rs`.
- **Handle Bulk Entries in Region Opener**: Added logic to process bulk entries in `src/mito2/src/region/opener.rs`.
- **Fix `BulkInsertRequest` Handling**: Corrected `region_id` handling in `src/operator/src/bulk_insert.rs` and `src/store-api/src/region_request.rs`.
- **Add Error Variant for `ConvertBulkWalEntry`**: Added a new error variant in `src/mito2/src/error.rs` for handling bulk WAL entry conversion errors.
* fix: ci
* feat/bulk-wal:
Add bulk write operation in `opener.rs`
- Enhanced the region write context by adding a call to `write_bulk()` after `write_memtable()` in `opener.rs`.
- This change aims to improve the efficiency of writing operations by enabling bulk writes.
* feat/bulk-wal:
Enhance error handling and metrics in `bulk_insert.rs`
- Updated `Inserter` to improve error handling by capturing the result of `datanode.handle(request)` and incrementing the `DIST_INGEST_ROW_COUNT` metric with the number of affected rows.
* feat/bulk-wal:
### Remove Encode Error Handling for WAL Entries
- **`error.rs`**: Removed the `EncodeWal` error variant and its associated handling.
- **`wal.rs`**: Eliminated the `entry_encode_buf` buffer and its usage for encoding WAL entries. Replaced with direct encoding to a vector using `encode_to_vec()`.
* chore: add tool to export db meta
* chore: add meta restore command
* chore: fmt code
* chore: remove useless error
* chore: support key prefix
* chore: add clean check for meta restore
* chore: add more log for meta restore
* chore: resolve s3 and local file root in command meta-snapshot
* chore: remove the pg mysql features from the build script as they are already in the default feature
* chore: fix by pr comment
* fix: alter table update table column default
* fix: fuzz test also cast default value
* chore: more testcase
* test: non-zero value
* refactor: per review
* tests: unexpected alter result(WIP on fix)
* ub
* ub more
* test: update sqlness
* refactor/flight-codec:
### Refactor and Enhance Schema and RecordBatch Handling
- **Add `datatypes` Dependency**: Updated `Cargo.lock` and `Cargo.toml` to include the `datatypes` dependency.
- **Schema Conversion and Error Handling**:
- Updated `src/client/src/database.rs` and `src/client/src/region.rs` to handle schema conversion using `Arc` and added error handling for schema conversion.
- Enhanced error handling in `src/client/src/error.rs` and `src/common/grpc/src/error.rs` by adding `ConvertSchema` error and removing unused errors.
- **FlightMessage and RecordBatch Refactoring**:
- Refactored `FlightMessage` enum in `src/common/grpc/src/flight.rs` to use `RecordBatch` instead of `Recordbatch`.
- Updated related functions and tests in `src/common/grpc/benches/bench_flight_decoder.rs`, `src/operator/src/bulk_insert.rs`, `src/servers/src/grpc/flight/stream.rs`, and `tests-integration/src/grpc/flight.rs` to align with the new `FlightMessage` structure.
* refactor/flight-codec:
Remove `ConvertArrowSchema` Error Variant
- Removed the `ConvertArrowSchema` error variant from `error.rs`.
- Updated the `ErrorExt` implementation to exclude `ConvertArrowSchema`.
- Affected file: `src/common/query/src/error.rs`.
* fix: cr
* fix/bulk-insert-case-sensitive:
Add error inspection for gRPC bulk insert in `greptime_handler.rs`
- Enhanced error handling by adding `inspect_err` to log errors during the `put_record_batch` operation in `greptime_handler.rs`.
* fix: silent error while bulk ingest with uppercase columns
* main:
**Enhancements to Flight Data Handling and Error Management**
- **Flight Data Handling:**
- Added `bytes` dependency in `Cargo.lock` and `Cargo.toml`.
- Introduced `try_from_schema_bytes` and `try_decode_record_batch` methods in `FlightDecoder` to handle schema and record batch decoding more efficiently in `src/common/grpc/src/flight.rs`.
- Updated `Inserter` in `src/operator/src/bulk_insert.rs` to utilize schema bytes directly, improving bulk insert operations.
- **Error Management:**
- Added `ArrowError` handling in `src/common/grpc/src/error.rs` to manage errors related to Arrow operations.
- **Region Request Processing:**
- Modified `make_region_bulk_inserts` in `src/store-api/src/region_request.rs` to use the new `FlightDecoder` methods for decoding Arrow IPC data.
* perf/optimize-bulk-encode-decode:
Update `greptime-proto` dependency and refactor error handling
- **Dependency Update**: Updated the `greptime-proto` dependency to a new revision in `Cargo.lock` and `Cargo.toml`.
- **Error Handling Refactor**: Removed the `Prost` error variant from `MetadataError` in `src/store-api/src/metadata.rs`.
- **Error Handling Improvement**: Replaced `unwrap` with `context(FlightCodecSnafu)` for error handling in `make_region_bulk_inserts` function in `src/store-api/src/region_request.rs`.
* fix: clippy
* fix: toml
* perf/optimize-bulk-encode-decode:
### Update `Cargo.toml` Dependencies
- Updated the `bytes` dependency to use the workspace version in `Cargo.toml`.
* perf/optimize-bulk-encode-decode:
**Fix payload assignment in `bulk_insert.rs`**
- Corrected the assignment of the `payload` field in the `ArrowIpc` struct within the `Inserter` implementation in `bulk_insert.rs`.
* use main branch proto
* chore: invalid table flow mapping
* chore: exists
* fix: invalidate all related keys in kv cache when dropping flow & refactor: per review
* fix: flow not found status code
* chore: rm unused error code
* chore: stuff
* chore: unused
* - **Refactor `RegionFilePathFactory` to `RegionFilePathProvider`:** Updated references and implementations in `access_layer.rs`, `write_cache.rs`, and related test files to use the new struct name.
- **Add `max_file_size` support in compaction:** Introduced `max_file_size` option in `PickerOutput`, `SerializedPickerOutput`, and `WriteOptions` in `compactor.rs`, `picker.rs`, `twcs.rs`, and `window.rs`.
- **Enhance Parquet writing logic:** Modified `parquet.rs` and `parquet/writer.rs` to support optional `max_file_size` and added a test case `test_write_multiple_files` to verify writing multiple files based on size constraints.
**Refactor Parquet Writer Initialization and File Handling**
- Updated `ParquetWriter` in `writer.rs` to handle `current_indexer` as an `Option`, allowing for more flexible initialization and management.
- Introduced `finish_current_file` method to encapsulate logic for completing and transitioning between SST files, improving code clarity and maintainability.
- Enhanced error handling and logging with `debug` statements for better traceability during file operations.
- **Removed Output Size Enforcement in `twcs.rs`:**
- Deleted the `enforce_max_output_size` function and related logic to simplify compaction input handling.
- **Added Max File Size Option in `parquet.rs`:**
- Introduced `max_file_size` in `WriteOptions` to control the maximum size of output files.
- **Refactored Indexer Management in `parquet/writer.rs`:**
- Changed `current_indexer` from an `Option` to a direct `Indexer` type.
- Implemented `roll_to_next_file` to handle file transitions when exceeding `max_file_size`.
- Simplified indexer initialization and management logic.
- **Refactored SST File Handling**:
- Introduced `FilePathProvider` trait and its implementations (`WriteCachePathProvider`, `RegionFilePathFactory`) to manage SST and index file paths.
- Updated `AccessLayer`, `WriteCache`, and `ParquetWriter` to use `FilePathProvider` for path management.
- Modified `SstWriteRequest` and `SstUploadRequest` to use path providers instead of direct paths.
- Files affected: `access_layer.rs`, `write_cache.rs`, `parquet.rs`, `writer.rs`.
- **Enhanced Indexer Management**:
- Replaced `IndexerBuilder` with `IndexerBuilderImpl` and made it async to support dynamic indexer creation.
- Updated `ParquetWriter` to handle multiple indexers and file IDs.
- Files affected: `index.rs`, `parquet.rs`, `writer.rs`.
- **Removed Redundant File ID Handling**:
- Removed `file_id` from `SstWriteRequest` and `CompactionOutput`.
- Updated related logic to dynamically generate file IDs where necessary.
- Files affected: `compaction.rs`, `flush.rs`, `picker.rs`, `twcs.rs`, `window.rs`.
- **Test Adjustments**:
- Updated tests to align with new path and indexer management.
- Introduced `FixedPathProvider` and `NoopIndexBuilder` for testing purposes.
- Files affected: `sst_util.rs`, `version_util.rs`, `parquet.rs`.
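The `max_file_size` and `roll_to_next_file` entries above describe size-based rolling of output files. A minimal sketch of that idea follows, using in-memory byte buffers instead of Parquet files. `RollingWriter` and its fields are hypothetical; only the option name and the `roll_to_next_file` method name are taken from the commit message.

```rust
/// Hypothetical writer that starts a new output once the current one is large enough.
struct RollingWriter {
    /// `None` means a single unbounded output.
    max_file_size: Option<usize>,
    current: Vec<u8>,
    finished: Vec<Vec<u8>>,
}

impl RollingWriter {
    fn new(max_file_size: Option<usize>) -> Self {
        Self {
            max_file_size,
            current: Vec::new(),
            finished: Vec::new(),
        }
    }

    /// Appends a chunk, then rolls if the current output reached the limit.
    fn write(&mut self, chunk: &[u8]) {
        self.current.extend_from_slice(chunk);
        if let Some(max) = self.max_file_size {
            if self.current.len() >= max {
                self.roll_to_next_file();
            }
        }
    }

    /// Seals the current output (if non-empty) and starts a fresh one.
    fn roll_to_next_file(&mut self) {
        if !self.current.is_empty() {
            self.finished.push(std::mem::take(&mut self.current));
        }
    }

    /// Seals whatever is left and returns all outputs in write order.
    fn finish(mut self) -> Vec<Vec<u8>> {
        self.roll_to_next_file();
        self.finished
    }
}
```

Because the check runs after each chunk is appended, an output can exceed the limit by up to one chunk; the limit acts as a rolling threshold rather than a hard cap in this sketch.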
* chore: rebase main
* feat/multiple-compaction-output:
### Add Benchmarking and Refactor Compaction Logic
- **Benchmarking**: Added a new benchmark `run_bench` in `Cargo.toml` and implemented benchmarks in `benches/run_bench.rs` using Criterion for `find_sorted_runs` and `reduce_runs` functions.
- **Compaction Module Enhancements**:
- Made `run.rs` public and refactored the `Ranged` and `Item` traits to be public.
- Simplified the logic in `find_sorted_runs` and `reduce_runs` by removing `MergeItems` and related functions.
- Introduced `find_overlapping_items` for identifying overlapping items.
- **Code Cleanup**: Removed redundant code and tests related to `MergeItems` in `run.rs`.
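To give a sense of what grouping files into sorted runs means, here is a small greedy sketch: ranges are sorted by start, and each range joins the first run whose last range ends before it begins. The `FileRange` type and this particular greedy strategy are assumptions for illustration and are not claimed to match the implementation in `run.rs`.

```rust
/// Hypothetical inclusive time range of one file.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
struct FileRange {
    start: i64,
    end: i64,
}

/// Partitions ranges into runs so that ranges within a run never overlap.
fn find_sorted_runs(mut items: Vec<FileRange>) -> Vec<Vec<FileRange>> {
    items.sort_by_key(|r| (r.start, r.end));
    let mut runs: Vec<Vec<FileRange>> = Vec::new();
    for item in items {
        // Find the first run that this range can extend without overlapping.
        let slot = runs
            .iter()
            .position(|run| run.last().map_or(true, |last| last.end < item.start));
        match slot {
            Some(idx) => runs[idx].push(item),
            None => runs.push(vec![item]),
        }
    }
    runs
}
```

The number of runs reflects how much the files overlap in time: a single run means the files are already disjoint.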
* feat/multiple-compaction-output:
### Enhance Compaction Logic and Add Benchmarks
- **Compaction Logic Improvements**:
- Updated `reduce_runs` function in `src/mito2/src/compaction/run.rs` to remove the target parameter and improve the logic for selecting files to merge based on minimum penalty.
- Enhanced `find_overlapping_items` to handle unsorted inputs and improve overlap detection efficiency.
- **Benchmark Enhancements**:
- Added `bench_find_overlapping_items` in `src/mito2/benches/run_bench.rs` to benchmark the new `find_overlapping_items` function.
- Extended existing benchmarks to include larger data sizes.
- **Testing Enhancements**:
- Updated tests in `src/mito2/src/compaction/run.rs` to reflect changes in `reduce_runs` and added new tests for `find_overlapping_items`.
- **Logging and Debugging**:
- Improved logging in `src/mito2/src/compaction/twcs.rs` to provide more detailed information about compaction decisions.
* feat/multiple-compaction-output:
### Refactor and Enhance Compaction Logic
- **Refactor `find_overlapping_items` Function**: Changed the function signature to accept slices instead of mutable vectors in `run.rs`.
- **Rename and Update Struct Fields**: Renamed `penalty` to `size` in `SortedRun` struct and updated related logic in `run.rs`.
- **Enhance `reduce_runs` Function**: Improved logic to sort runs by size and limit probe runs to 100 in `run.rs`.
- **Add `merge_seq_files` Function**: Introduced a new function `merge_seq_files` in `run.rs` for merging sequential files.
- **Modify `TwcsPicker` Logic**: Updated the compaction logic to use `merge_seq_files` when only one run is found in `twcs.rs`.
- **Remove `enforce_file_num` Function**: Deleted the `enforce_file_num` function and its related test cases in `twcs.rs`.
* feat/multiple-compaction-output:
### Enhance Compaction Logic and Testing
- **Add `merge_seq_files` Functionality**: Implemented the `merge_seq_files` function in `run.rs` to optimize file merging based on scoring systems. Updated benchmarks in `run_bench.rs` to include `bench_merge_seq_files`.
- **Improve Compaction Strategy in `twcs.rs`**: Modified the compaction logic to handle file merging more effectively, considering file size and overlap.
- **Update Tests**: Enhanced test coverage in `compaction_test.rs` and `append_mode_test.rs` to validate new compaction logic and file merging strategies.
- **Remove Unused Function**: Deleted `new_file_handles` from `test_util.rs` as it was no longer needed.
* feat/multiple-compaction-output:
### Refactor TWCS Compaction Options
- **Refactor Compaction Logic**: Simplified the TWCS compaction logic by replacing multiple parameters (`max_active_window_runs`, `max_active_window_files`, `max_inactive_window_runs`, `max_inactive_window_files`) with a single `trigger_file_num` parameter in `picker.rs`, `twcs.rs`, and `options.rs`.
- **Update Tests**: Adjusted test cases to reflect the new compaction logic in `append_mode_test.rs`, `compaction_test.rs`, `filter_deleted_test.rs`, `merge_mode_test.rs`, and various test files under `tests/cases`.
- **Modify Engine Options**: Updated engine option keys to use `trigger_file_num` in `mito_engine_options.rs` and `region_request.rs`.
- **Fuzz Testing**: Updated fuzz test generators and translators to accommodate the new compaction parameter in `alter_expr.rs` and related files.
This refactor aims to streamline the compaction configuration by reducing the number of parameters and simplifying the codebase.
* chore: add trailing space
* fix license header
* feat/revise-compaction-picker:
**Limit File Processing and Optimize Merge Logic in `run.rs`**
- Introduced a limit to process a maximum of 100 files in `merge_seq_files` to control time complexity.
- Adjusted logic to calculate `target_size` and iterate over files using the limited set of files.
- Updated scoring calculations to use the limited file set, ensuring efficient file merging.
* feat/revise-compaction-picker:
### Add Compaction Metrics and Remove Debug Logging
- **Compaction Metrics**: Introduced new histograms `COMPACTION_INPUT_BYTES` and `COMPACTION_OUTPUT_BYTES` to track compaction input and output file sizes in `metrics.rs`. Updated `compactor.rs` to observe these metrics during the compaction process.
- **Logging Cleanup**: Removed debug logging of file ranges during the merge process in `twcs.rs`.
* feat/revise-compaction-picker:
## Enhance Compaction Logic and Metrics
- **Compaction Logic Improvements**:
- Added methods `input_file_size` and `output_file_size` to `MergeOutput` in `compactor.rs` to streamline file size calculations.
- Updated `Compactor` implementation to use these methods for metrics tracking.
- Modified `Ranged` trait logic in `run.rs` to improve range comparison.
- Enhanced test cases in `run.rs` to reflect changes in compaction logic.
- **Metrics Enhancements**:
- Changed `COMPACTION_INPUT_BYTES` and `COMPACTION_OUTPUT_BYTES` from histograms to counters in `metrics.rs` for better performance tracking.
- **Debugging and Logging**:
- Added detailed logging for compaction pick results in `twcs.rs`.
- Implemented custom `Debug` trait for `FileMeta` in `file.rs` to improve debugging output.
- **Testing Enhancements**:
- Added new test `test_compaction_overlapping_files` in `compaction_test.rs` to verify compaction behavior with overlapping files.
- Updated `merge_mode_test.rs` to reflect changes in file handling during scans.
* feat/revise-compaction-picker:
### Update `FileHandle` Debug Implementation
- **Refactor Debug Output**: Simplified the `fmt::Debug` implementation for `FileHandle` in `src/mito2/src/sst/file.rs` by consolidating multiple fields into a single `meta` field using `meta_ref()`.
- **Atomic Operations**: Updated the `deleted` field to use atomic loading with `Ordering::Relaxed`.
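A consolidated `Debug` implementation of the kind described above might look like this sketch. The fields of `FileMeta` and `FileHandle` are reduced to a minimum and are assumptions; only `meta_ref()`, the `meta` and `deleted` field names, and the use of `Ordering::Relaxed` come from the commit message.

```rust
use std::fmt;
use std::sync::atomic::{AtomicBool, Ordering};

/// Reduced stand-in for the real file metadata.
#[derive(Debug)]
struct FileMeta {
    file_id: u32,
    file_size: u64,
}

/// Reduced stand-in for the real file handle.
struct FileHandle {
    meta: FileMeta,
    deleted: AtomicBool,
}

impl FileHandle {
    fn meta_ref(&self) -> &FileMeta {
        &self.meta
    }
}

impl fmt::Debug for FileHandle {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        f.debug_struct("FileHandle")
            // One `meta` field replaces several individually listed fields.
            .field("meta", self.meta_ref())
            // A relaxed load is enough for a diagnostic snapshot of the flag.
            .field("deleted", &self.deleted.load(Ordering::Relaxed))
            .finish()
    }
}
```

Delegating to the `Debug` output of `FileMeta` means new metadata fields show up in logs without touching the handle's formatter.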
* Trigger CI
* feat/revise-compaction-picker:
**Update compaction logic and default options**
- **`twcs.rs`**: Enhanced logging for compaction pick results by improving the formatting for better readability.
- **`options.rs`**: Modified the default `max_output_file_size` in `TwcsOptions` from 2GB to 512MB to optimize file handling and performance.
* feat/revise-compaction-picker:
Refactor `find_overlapping_items` to use an external result vector
- Updated `find_overlapping_items` in `src/mito2/src/compaction/run.rs` to accept a mutable result vector instead of returning a new vector, improving memory efficiency.
- Modified benchmarks in `src/mito2/benches/bench_compaction_picker.rs` to accommodate the new function signature.
- Adjusted tests in `src/mito2/src/compaction/run.rs` to use the updated function signature, ensuring correct functionality with the new approach.
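The out-parameter style described above lets a caller reuse one allocation across many calls. The sketch below shows the pattern with plain `(start, end)` tuples and a quadratic overlap check; the tuple representation, the overlap semantics (inclusive bounds), and the rule for what gets collected are assumptions and not taken from `run.rs`.

```rust
/// Inclusive ranges overlap when each starts no later than the other ends.
fn overlaps(a: (i64, i64), b: (i64, i64)) -> bool {
    a.0 <= b.1 && b.0 <= a.1
}

/// Appends to `result` every range from either side that overlaps a range on the other side.
/// The caller owns `result` and decides when to clear it.
fn find_overlapping_items(left: &[(i64, i64)], right: &[(i64, i64)], result: &mut Vec<(i64, i64)>) {
    result.extend(
        left.iter()
            .copied()
            .filter(|l| right.iter().any(|r| overlaps(*l, *r))),
    );
    result.extend(
        right
            .iter()
            .copied()
            .filter(|r| left.iter().any(|l| overlaps(*l, *r))),
    );
}
```

A caller that probes many pairs of runs can call `result.clear()` between probes and keep the buffer's capacity, which is where the memory saving comes from.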
* feat/revise-compaction-picker:
Improve file merging logic in `run.rs`
- Refactor the loop logic in `merge_seq_files` to simplify the iteration over file groups.
- Adjust the range for `end_idx` to include the endpoint, allowing for more flexible group selection.
- Remove the condition that skips groups with only one file, enabling more comprehensive processing of file sequences.
* feat/revise-compaction-picker:
Enhance `find_overlapping_items` with `SortedRun` and Update Tests
- Refactor `find_overlapping_items` in `src/mito2/src/compaction/run.rs` to utilize the `SortedRun` struct for improved efficiency and clarity.
- Introduce a `sorted` flag in `SortedRun` to optimize sorting operations.
- Update test cases in `src/mito2/benches/bench_compaction_picker.rs` to accommodate changes in `find_overlapping_items` by using `SortedRun`.
- Add `From<Vec<T>>` implementation for `SortedRun` to facilitate easy conversion from vectors.
* feat/revise-compaction-picker:
**Enhancements in `compaction/run.rs`:**
- Added `ReadableSize` import to handle size calculations.
- Modified the logic in `merge_seq_files` to clamp the calculated target size to a maximum of 2GB when `max_file_size` is not provided.
* feat/revise-compaction-picker: Add Default Max Output Size Constant for Compaction
Introduce DEFAULT_MAX_OUTPUT_SIZE constant to define the default maximum compaction output file size as 2GB. Refactor the merge_seq_files function to utilize this constant, ensuring consistent and maintainable code for handling file size limits during compaction.
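The clamping described above can be sketched as a small helper. The constant name comes from the commit message; the helper name `effective_target_size`, the interpretation of 2GB as 2 GiB, and the behavior when `max_file_size` is provided (use it as the cap) are assumptions.

```rust
/// Default cap on a compaction output file, assumed here to mean 2 GiB.
const DEFAULT_MAX_OUTPUT_SIZE: u64 = 2 * 1024 * 1024 * 1024;

/// Hypothetical helper: caps a computed target size by the configured limit,
/// falling back to `DEFAULT_MAX_OUTPUT_SIZE` when no limit is configured.
fn effective_target_size(computed: u64, max_file_size: Option<u64>) -> u64 {
    computed.min(max_file_size.unwrap_or(DEFAULT_MAX_OUTPUT_SIZE))
}
```

Naming the default once avoids scattering the literal through `merge_seq_files` and keeps the fallback consistent with any other code that needs the same cap.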
* fix: always check for shutdown signal in flow
chore: correct log msg for flows that shouldn't exist
feat: use time window size/2 as sleep interval
* chore: better slower query refresh time
* chore
* refactor: per review
fix/stall-metrics:
Improve stalled request handling in `handle_write.rs`
- Updated logic to account for both `write_requests` and `bulk_requests` when adjusting `stalled_count`.
- Modified `reject_region_stalled_requests` and `handle_region_stalled_requests` to correctly subtract the combined length of `requests` and `bulk` from `stalled_count`.
fix/flaky-prom-gateway-test:
**Refactor gRPC Test Assertions in `grpc.rs`**
- Updated test assertions for `test_prom_gateway_query` to improve clarity and maintainability.
- Replaced direct comparison with expected `PrometheusJsonResponse` objects with individual field assertions.
- Added sorting for `vector` and `matrix` results to ensure consistent test outcomes.
* 1. rename `greptime_mito_flush_errors_total` metric to `greptime_mito_flush_errors_total` for consistency
2. update grafana dashboard to add following panel:
- compaction input/output bytes
- bulk insert handle elapsed time in frontend and region worker
* chore: supporting more data type for pipeline dryrun API
* chore: add docs for parse_dryrun_data
* chore: fix by pr comment
* chore: add user-friendly error message
* chore: change EventPayloadResolver content_type field type from owner to ref
* Apply suggestions from code review
Co-authored-by: shuiyisong <113876041+shuiyisong@users.noreply.github.com>
---------
Co-authored-by: shuiyisong <113876041+shuiyisong@users.noreply.github.com>
* feat: export to s3 add more options
* chore: rm output dir override logic
* fix: s3 root export data
* feat: use output_dir and s3 at same time
* refactor: per review
* fix: keep same behavior
* fix/fast-path-for-single-region-bulk-insert:
### Commit Summary
- **Refactor `try_decode` Method**: Updated the `try_decode` method in `FlightDecoder` to accept a reference to `FlightData` instead of consuming it. This change affects multiple files including `database.rs`, `region.rs`, `flight.rs`, `bulk_insert.rs`, `stream.rs`, and `region_request.rs`.
- **Optimize Bulk Insert Handling**: Added a fast path for handling bulk inserts when only one region is involved in `bulk_insert.rs`.
* fix/fast-path-for-single-region-bulk-insert:
Improve `FlightDecoder` usage in tests
- Updated `try_decode` method calls in `flight.rs` to remove unnecessary references for `d1`, `d2`, and `d3`.
- Ensured consistency in handling `FlightMessage` variants within test cases.
* fix/fast-path-for-single-region-bulk-insert:
**Enhancement: Skip Empty Regions in Bulk Insert**
- Updated `bulk_insert.rs` to improve efficiency by skipping regions without data during the bulk insert process. This change ensures that regions with a `true_count` of zero are not processed, optimizing resource usage and performance.
* fix/fast-path-for-single-region-bulk-insert:
### Commit Summary
- **Refactor `RegionMask` Handling**:
- Introduced `RegionMask` struct to encapsulate boolean array and selected rows count.
- Updated methods to use `RegionMask` instead of `BooleanArray` for region selection.
- Affected files: `bulk_insert.rs`, `multi_dim.rs`, `partition.rs`, `splitter.rs`.
- **Optimize Region Selection**:
- Removed unnecessary checks for empty regions in `bulk_insert.rs`.
- Improved logic for handling default regions in `multi_dim.rs`.
- **Update Tests**:
- Modified test cases to accommodate `RegionMask` changes.
- Affected files: `multi_dim.rs`, `splitter.rs`.
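A `RegionMask` that pairs a selection bitmap with a cached count could be sketched as below. A `Vec<bool>` stands in for the Arrow `BooleanArray` used in the real code, and the method names other than the struct name are hypothetical.

```rust
/// Selection of rows that belong to one region, with the selected count cached.
struct RegionMask {
    /// Stand-in for an Arrow `BooleanArray`: `true` means the row is selected.
    array: Vec<bool>,
    selected_rows: usize,
}

impl RegionMask {
    /// Counts selected rows once so later checks do not rescan the bitmap.
    fn new(array: Vec<bool>) -> Self {
        let selected_rows = array.iter().filter(|selected| **selected).count();
        Self { array, selected_rows }
    }

    fn selected_rows(&self) -> usize {
        self.selected_rows
    }

    /// True when no row is selected, so the region can be skipped.
    fn select_none(&self) -> bool {
        self.selected_rows == 0
    }

    /// True when every row is selected, so the batch can be forwarded unfiltered.
    fn select_all(&self) -> bool {
        self.selected_rows == self.array.len()
    }
}
```

Caching the count makes both the skip-empty-region check and the single-region fast path constant-time decisions.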
* fix/fast-path-for-single-region-bulk-insert:
**Enhancements to MultiDimPartitionRule Logic and Tests**
- **`multi_dim.rs`**: Improved the logic for selecting rows in `MultiDimPartitionRule` by optimizing the selection process when only one region is present.
- **Tests**: Added new test cases to verify the behavior of default regions with unselected rows, existing default regions, and scenarios where all rows are selected. These tests ensure robust handling of partition rules and validate the correct assignment of rows to regions.
* feat: improve topic management and add stale records cleanup
* fix: fix unit tests
* chore: apply suggestions from CR
* chore: apply suggestions from CR
* fix: remove files under atomic dir on failure
* fix: clean atomic dir on download failure
* chore: update comment
* fix: clean if failed to write without write cache
* feat: add a TempFileCleaner to clean files on failure
* chore: after merge fix
* chore: more fix
---------
Co-authored-by: discord9 <55937128+discord9@users.noreply.github.com>
Co-authored-by: discord9 <discord9@163.com>
* add benchmark for splitting according to time partition
* feat/write-to-multiple-time-partitions:
**Enhancements to Bulk Processing and Time Partitioning**
- **`part.rs`**: Added `Snafu` to imports and introduced `timestamp_index` in `BulkPart` struct. Implemented `timestamps` method for accessing timestamp columns.
- **`simple_bulk_memtable.rs`**: Updated tests to include `timestamp_index` initialization.
- **`time_partition.rs`**: Enhanced `TimePartition` to support partial writes with `write_record_batch_partial`. Implemented `split_record_batch` for filtering records by timestamp range. Added comprehensive tests for `split_record_batch`.
- **`handle_bulk_insert.rs`**: Modified to retrieve timestamp index and column together, updating `BulkPart` initialization with `timestamp_index`.
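Filtering records by timestamp range, as `split_record_batch` is described above, can be sketched as follows. This is an illustration over a plain slice of `i64` timestamps rather than an Arrow record batch; the function name and the half-open `[start, end)` convention are assumptions.

```rust
/// Returns the indices of rows whose timestamp falls in `[start, end)`.
/// Hypothetical helper: the real code operates on Arrow timestamp arrays.
pub fn rows_in_range(timestamps: &[i64], start: i64, end: i64) -> Vec<usize> {
    timestamps
        .iter()
        .enumerate()
        // Keep the row index only when the timestamp is inside the range.
        .filter_map(|(i, &ts)| (ts >= start && ts < end).then_some(i))
        .collect()
}
```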
* feat/write-to-multiple-time-partitions:
### Enhance Time Partitioning Logic
- **`time_partition.rs`**:
- Introduced `HashSet` for efficient partition management.
- Refactored `write_bulk` to handle multiple partitions and added `find_partitions_by_time_range` for identifying existing and missing partitions.
- Updated `get_or_create_time_partition` to manage partition creation.
- Added comprehensive tests for partition finding logic, covering various scenarios including overlapping and non-overlapping time ranges.
- **Tests**:
- Added `test_find_partitions_by_time_range` to validate new partitioning logic.
- Updated `test_split_record_batch` to ensure correct record batch splitting behavior.
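The split into existing and missing partitions can be sketched roughly as below, assuming fixed-duration partitions keyed by their start time in seconds. The signature is hypothetical and the loop visits every bucket in the range, which is a simplification of whatever `find_partitions_by_time_range` actually does.

```rust
use std::collections::HashSet;

/// Returns `(matching, missing)` partition start times (in seconds) for a
/// batch covering `[min_ts_sec, max_ts_sec]`.
/// Hypothetical sketch: names and types are assumptions, not the real API.
pub fn find_partitions(
    existing_starts: &HashSet<i64>,
    part_duration_sec: i64,
    min_ts_sec: i64,
    max_ts_sec: i64,
) -> (Vec<i64>, Vec<i64>) {
    assert!(part_duration_sec > 0 && min_ts_sec <= max_ts_sec);
    // Bucket index of the first and last timestamps in the batch.
    let start_bucket = min_ts_sec.div_euclid(part_duration_sec);
    let end_bucket = max_ts_sec.div_euclid(part_duration_sec);

    let mut matching = Vec::new();
    let mut missing = Vec::new();
    for bucket in start_bucket..=end_bucket {
        let start = bucket * part_duration_sec;
        if existing_starts.contains(&start) {
            matching.push(start);
        } else {
            missing.push(start);
        }
    }
    (matching, missing)
}
```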
* feat/write-to-multiple-time-partitions:
### Enhance Time Partitioning and Testing in `time_partition.rs`
- **Time Partitioning Enhancements**:
- Updated `split_record_batch` to handle multiple timestamp units (`Second`, `Millisecond`, `Microsecond`, `Nanosecond`) by matching on `DataType`.
- Improved filtering logic for timestamp arrays to support various time units.
- **Testing Enhancements**:
- Added `test_write_bulk` to verify writing across multiple partitions and scenarios in `time_partition.rs`.
- Updated `test_split_record_batch` to use `TimestampMillisecondArray` for testing timestamp partitioning.
- **Imports and Dependencies**:
- Added necessary imports for new timestamp array types and testing utilities.
* feat/write-to-multiple-time-partitions:
### Refactor and Enhance Time Partition Filtering
- **Refactor Filtering Logic**: Consolidated the filtering logic for timestamp arrays using macros in `time_partition.rs` and `bench_filter_time_partition.rs`. This reduces code duplication and improves maintainability.
- **Enhance `BulkPart` Struct**: Made fields in `BulkPart` public to facilitate easier access and manipulation in `memtable.rs` and `part.rs`.
- **Rename Function**: Renamed `split_record_batch` to `filter_record_batch` for clarity in `time_partition.rs` and `bench_filter_time_partition.rs`.
- **Add Feature Flag**: Introduced `int_roundings` feature in `lib.rs` to support new functionality.
* refactor tests
* feat/write-to-multiple-time-partitions:
Improve timestamp handling in `time_partition.rs`
- Enhanced safety comments for timestamp conversion to ensure clarity.
- Modified logic to prevent overflow by using `div_euclid` for `bulk_start_sec` and `bulk_end_sec` calculations.
- Adjusted the `filter_map` logic to correctly compute timestamps using `start_sec` and `part_duration_sec`.
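For context on the `div_euclid` choice, a small sketch of how it differs from `/` for timestamps before the epoch. The helper name is hypothetical; only `i64::div_euclid` itself is standard library.

```rust
/// Hypothetical helper: maps a timestamp in seconds to its partition bucket.
/// `div_euclid` rounds toward negative infinity for a positive divisor, so
/// pre-epoch timestamps land in negative buckets instead of bucket 0.
pub fn bucket_of(ts_sec: i64, part_duration_sec: i64) -> i64 {
    ts_sec.div_euclid(part_duration_sec)
}
```

With plain division, `-1 / 3600` truncates to `0`, which would place a timestamp one second before the epoch in the same bucket as the first hour after it.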
* feat/write-to-multiple-time-partitions:
**Refactor timestamp handling and add utility function**
- **Refactor `time_partition.rs`:** Simplified timestamp handling by replacing direct type access with a utility function to retrieve the timestamp unit. Improved error handling for timestamp conversion.
- **Enhance `metadata.rs`:** Added `time_index_type` function to `RegionMetadata` to retrieve the timestamp type of the time index column, ensuring safer and more readable code.
* feat/write-to-multiple-time-partitions:
Refactor time partition variable names in `time_partition.rs`
- Renamed variables for clarity: `bulk_start_sec` to `start_bucket` and `bulk_end_sec` to `end_bucket`.
- Updated related logic to use new variable names for improved readability and maintainability.
* feat/write-to-multiple-time-partitions:
**Refactor variable names in `time_partition.rs`**
- Updated variable names from `matching` and `missing` to `matchings` and `missings` for clarity and consistency.
- Modified function calls and loop iterations to align with the new variable names.
- Affected file: `src/mito2/src/memtable/time_partition.rs`
* feat/write-to-multiple-time-partitions:
### Refactor variable names in `time_partition.rs`
- Updated variable names for clarity in `time_partition.rs`:
- Renamed `matchings` to `matching_parts`
- Renamed `missings` to `missing_parts`
- Adjusted logic to use new variable names in methods `find_partitions_by_time_range` and `write_record_batch`.
* feat/write-to-multiple-time-partitions:
### Enhance Time Partition Handling
- **`time_partition.rs`**:
- Added `ArrayRef` to handle timestamp arrays, improving the partitioning logic by allowing more efficient timestamp range checks.
- Enhanced `find_partitions_by_time_range` to support sparse data and handle different timestamp units (`Second`, `Millisecond`, `Microsecond`, `Nanosecond`).
- Updated test cases to cover new scenarios, including sparse data and edge cases, ensuring robustness of partition handling.
---------
Co-authored-by: Lei <lei@Leis-MacBook-Pro.local>
* ci: automatically update helm-charts when release
* Update .github/workflows/release.yml
Co-authored-by: Ning Sun <classicning@gmail.com>
* Update update-helm-charts-version.sh
---------
Co-authored-by: Ning Sun <classicning@gmail.com>
* fix: select after alter
* fix: insert a proper row&catch a bug
* fix: alter table modify type also modifies the default value type
* refactor: per review
* chore: per review
* refactor: per review
* refactor: per review
* feat/bridge-bulk-insert:
## Implement Bulk Insert and Update Dependencies
- **Bulk Insert Implementation**: Added `handle_bulk_inserts` method in `src/operator/src/bulk_insert.rs` to manage bulk insert requests using `FlightDecoder` and `FlightData`.
- **Dependency Updates**: Updated `Cargo.lock` and `Cargo.toml` to use the latest revision of `greptime-proto` and added new dependencies like `arrow`, `arrow-ipc`, `bytes`, and `prost`.
- **gRPC Enhancements**: Modified `put_record_batch` method in `src/frontend/src/instance/grpc.rs` and `src/servers/src/grpc/flight.rs` to handle `FlightData` instead of `RawRecordBatch`.
- **Error Handling**: Added new error types in `src/operator/src/error.rs` for handling Arrow operations and decoding flight data.
- **Miscellaneous**: Updated `src/operator/src/insert.rs` to expose `partition_manager` and `node_manager` as public fields.
* feat/bridge-bulk-insert:
- **Update `greptime-proto` Dependency**: Updated the `greptime-proto` dependency to a new revision in `Cargo.lock` and `Cargo.toml`.
- **Refactor gRPC Query Handling**: Removed `RawRecordBatch` usage from `grpc.rs`, `flight.rs`, `greptime_handler.rs`, and test files, simplifying the gRPC query handling.
- **Enhance Bulk Insert Logic**: Improved bulk insert logic in `bulk_insert.rs` and `region_request.rs` by using `FlightDecoder` and `BooleanArray` for better performance and clarity.
- **Add `common-grpc` Dependency**: Added `common-grpc` as a workspace dependency in `store-api/Cargo.toml` to support gRPC functionalities.
* fix: clippy
* fix schema serialization
* feat/bridge-bulk-insert:
Add error handling for encoding/decoding in `metadata.rs` and `region_request.rs`
- Introduced new error variants `FlightCodec` and `Prost` in `MetadataError` to handle encoding/decoding failures in `metadata.rs`.
- Updated `make_region_bulk_inserts` function in `region_request.rs` to use `context` for error handling with `ProstSnafu` and `FlightCodecSnafu`.
- Enhanced error handling for `FlightData` decoding and `filter_record_batch` operations.
* fix: test
* refactor: rename
* allow empty app_metadata in FlightData
* feat/bridge-bulk-insert:
- **Remove Logging**: Removed unnecessary logging of affected rows in `region_server.rs`.
- **Error Handling Enhancement**: Improved error handling in `bulk_insert.rs` by adding context to `split_record_batch` and handling single datanode fast path.
- **Error Enum Cleanup**: Removed unused `Arrow` error variant from `error.rs`.
* fix: standalone test
* feat/bridge-bulk-insert:
### Enhance Bulk Insert Handling and Metadata Management
- **`lib.rs`**: Enabled the `result_flattening` feature for improved error handling.
- **`request.rs`**: Made `name_to_index` and `has_null` fields public in `WriteRequest` for better accessibility.
- **`handle_bulk_insert.rs`**:
- Added `handle_record_batch` function to streamline processing of bulk insert payloads.
- Improved error handling and task management for bulk insert operations.
- Updated `region_metadata_to_column_schema` to return both column schemas and a name-to-index map for efficient data access.
* feat/bridge-bulk-insert:
- **Refactor `handle_bulk_insert.rs`:**
- Replaced `handle_record_batch` with `handle_payload` for handling payloads.
- Modified the fast path to use `common_runtime::spawn_global` for asynchronous task execution.
- **Optimize `multi_dim.rs`:**
- Added a fast path for single-region scenarios in `MultiDimPartitionRule::partition_record_batch`.
* feat/bridge-bulk-insert:
- **Update `greptime-proto` Dependency**: Updated the `greptime-proto` dependency to a new revision in both `Cargo.lock` and `Cargo.toml`.
- **Optimize Memory Allocation**: Increased initial and builder capacities in `time_series.rs` to improve performance.
- **Enhance Data Handling**: Modified `bulk_insert.rs` to use `Bytes` for efficient data handling.
- **Improve Bulk Insert Logic**: Refined the bulk insert logic in `region_request.rs` to handle schema and payload data more effectively and optimize record batch filtering.
- **String Handling Improvement**: Updated string conversion in `helper.rs` for better performance.
* fix: clippy warnings
* feat/bridge-bulk-insert:
**Add Metrics and Improve Error Handling**
- **Metrics Enhancements**: Introduced new metrics for bulk insert operations in `metrics.rs`, `bulk_insert.rs`, `greptime_handler.rs`, and `region_request.rs`. Added `HANDLE_BULK_INSERT_ELAPSED`, `BULK_REQUEST_MESSAGE_SIZE`, and `GRPC_BULK_INSERT_ELAPSED` histograms to monitor performance.
- **Error Handling Improvements**: Removed unnecessary error handling in `handle_bulk_insert.rs` by eliminating redundant `let _ =` patterns.
- **Dependency Updates**: Added `lazy_static` and `prometheus` to `Cargo.lock` and `Cargo.toml` for metrics support.
- **Code Refactoring**: Simplified function calls in `region_server.rs` and `handle_bulk_insert.rs` for better readability.
* chore: rebase main
* implement simple bulk memtable
* impl write_bulk
* implement simple bulk memtable
* feat/simple-bulk-memtable:
### Enhance Time-Series Memtable and Bulk Insert Handling
- **Visibility Modifications**: Made `mutable_array` in `PrimitiveVectorBuilder` and `StringVectorBuilder` public in `primitive.rs` and `string.rs`.
- **New Module**: Added `builder.rs` to `memtable` for time-series builders, including `FieldBuilder` and `StringBuilder` implementations.
- **Bulk Insert Enhancements**:
- Added `sequence` field to `BulkPart` in `part.rs` and updated its handling in `simple_bulk_memtable.rs` and `region_write_ctx.rs`.
- Introduced metrics for bulk insert operations in `metrics.rs` and `bulk_insert.rs`.
- **Performance Metrics**: Added timing metrics for write operations in `metrics.rs`, `region_write_ctx.rs`, and `handle_write.rs`.
- **Region Request Handling**: Updated `make_region_bulk_inserts` in `region_request.rs` to include performance metrics.
* feat/simple-bulk-memtable:
**Improve Memtable Stats Calculation and Add Metrics Timer**
- **`simple_bulk_memtable.rs`**: Refactored `stats` method to use `num_rows` for checking if rows have been written, improving accuracy in memory table statistics.
- **`handle_bulk_insert.rs`**: Introduced a metrics timer to measure the elapsed time for processing bulk requests, enhancing performance monitoring.
* feat/simple-bulk-memtable:
### Commit Message
**Enhancements and Bug Fixes**
- **Dependency Update**: Updated `greptime-proto` dependency to a new revision in `Cargo.lock` and `Cargo.toml`.
- **Feature Addition**: Implemented `to_mutation` method in `BulkPart` to convert `BulkPart` to `Mutation` for fallback `write_bulk` implementation in `src/mito2/src/memtable/bulk/part.rs`.
- **Functionality Improvement**: Modified `write_bulk` method in `TimeSeriesMemtable` to support default implementation fallback to row iteration in `src/mito2/src/memtable/time_series.rs`.
- **Performance Optimization**: Enhanced `bulk_insert` handling by optimizing region request processing and data partitioning in `src/operator/src/bulk_insert.rs`.
- **Error Handling**: Added `ComputeArrow` error variant for better error management in `src/operator/src/error.rs`.
- **Code Refactoring**: Simplified region bulk insert request processing in `src/store-api/src/region_request.rs`.
* fix: some clippy warnings
* feat/simple-bulk-memtable:
### Commit Summary
- **Refactor Return Types to `Result`:**
Updated the return type of the `ranges` method in `memtable.rs`, `bulk.rs`, `partition_tree.rs`, `simple_bulk_memtable.rs`, `time_series.rs`, and `memtable_util.rs` to return `Result<MemtableRanges>` for better error handling.
- **Enhance Metrics Tracking:**
Improved metrics tracking by adding `num_rows` and `max_sequence` to `WriteMetrics` in `stats.rs`. Updated related methods in `partition_tree.rs`, `simple_bulk_memtable.rs`, `time_series.rs`, and `scan_region.rs` to utilize these metrics.
- **Remove Unused Imports:**
Cleaned up unused imports in `time_series.rs` to streamline the codebase.
* merge main
* remove useless error variant
* use newer version of proto
* feat/simple-bulk-memtable:
**Enhance `FieldBuilder` and `StringBuilder`, add tests, and improve error handling**
- **`builder.rs`**:
- Added documentation for `FieldBuilder` methods.
- Renamed `append_string_vector` to `append_vector` in `StringBuilder`.
- **`simple_bulk_memtable.rs`**:
- Added new test cases for `write_one`, `write_bulk`, `is_empty`, `stats`, `fork`, and `sequence_filter`.
- **`time_series.rs`**:
- Improved error handling in `ValueBuilder` for type mismatches.
- **`memtable_util.rs`**:
- Removed unused imports and streamlined code.
These changes enhance the robustness and test coverage of the memtable components.
* feat/simple-bulk-memtable:
Improve Time Partition Matching Logic in `time_partition.rs`
- Enhanced the `write_bulk` method in `time_partition.rs` to improve the logic for matching partitions based on time ranges.
- Introduced a new mechanism to filter and select partitions that overlap with the record batch's timestamp range before writing.
* feat/simple-bulk-memtable:
Improve Metrics Handling in `bulk_insert.rs`
- Removed the `group_request_timer` and its associated metric observation to streamline the timing logic.
- Moved the `BULK_REQUEST_ROWS` metric observation to occur after filtering, ensuring accurate row count metrics.
* feat/simple-bulk-memtable:
**Enhance Stalled Requests Calculation and Update Metrics**
- **`worker.rs`**: Updated the `stalled_count` method to include both `reqs` and `bulk_reqs` in the calculation of stalled requests.
- **`bulk_insert.rs`**: Removed duplicate observation of `BULK_REQUEST_MESSAGE_SIZE` metric.
- **`metrics.rs`**: Changed the bucket strategy for `BULK_REQUEST_ROWS` from linear to exponential, improving the granularity of metrics collection.
* feat/simple-bulk-memtable:
**Refactor `StringVector` Usage and Update Method Signatures**
- **`src/datatypes/src/vectors/string.rs`**: Changed `StringVector`'s `array` field from public to private.
- **`src/mito2/src/memtable/builder.rs`**: Refactored `append_vector` method to `append_array`, updating its usage to work directly with `StringArray` instead of `StringVector`.
- **`src/mito2/src/memtable/time_series.rs`**: Updated `ValueBuilder` to handle `StringArray` directly, replacing `StringVector` usage with `StringArray` in the `FieldBuilder::String` case.
* feat/simple-bulk-memtable:
- **Refactor `PrimitiveVectorBuilder`**: Made `mutable_array` private in `src/datatypes/src/vectors/primitive.rs`.
- **Optimize `ValueBuilder`**: Replaced `UInt64VectorBuilder` and `UInt8VectorBuilder` with `Vec<u64>` and `Vec<u8>` for `sequence` and `op_type` in `src/mito2/src/memtable/time_series.rs`.
- **Improve Metrics Initialization**: Updated histogram bucket initialization to use `exponential_buckets` in `src/mito2/src/metrics.rs`.
* feat/simple-bulk-memtable:
Improve error handling in `simple_bulk_memtable.rs` and `time_series.rs`
- Enhanced error handling by using `OptionExt` for more concise error context management in `simple_bulk_memtable.rs` and `time_series.rs`.
- Replaced `ok_or` with `with_context` to streamline error context creation in both files.
* feat/simple-bulk-memtable:
**Enhance Time Partition Handling in `time_partition.rs`**
- Introduced `create_time_partition` function to streamline the creation of new time partitions, ensuring thread safety by acquiring a lock.
- Modified logic to handle cases where no matching time partitions exist, creating new partitions as needed.
- Updated `write_record_batch` and `write_one` methods to utilize the new partition creation logic, improving partition management and data writing efficiency.
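Creating a partition under a lock, so that two writers cannot create the same partition twice, might look like the sketch below. The `Partitions` and `Partition` types are hypothetical stand-ins, and a single `Mutex` over a map is an assumption about the locking scheme.

```rust
use std::collections::BTreeMap;
use std::sync::{Arc, Mutex};

/// Hypothetical partition handle keyed by its start time in seconds.
#[derive(Debug, Default)]
pub struct Partition {
    pub start_sec: i64,
}

/// Hypothetical container; the real `TimePartitions` holds more state.
#[derive(Default)]
pub struct Partitions {
    inner: Mutex<BTreeMap<i64, Arc<Partition>>>,
}

impl Partitions {
    /// Returns the partition starting at `start_sec`, creating it if absent.
    /// The lookup and the insert happen under the same lock, so concurrent
    /// callers always observe a single partition per start time.
    pub fn get_or_create(&self, start_sec: i64) -> Arc<Partition> {
        let mut parts = self.inner.lock().unwrap();
        parts
            .entry(start_sec)
            .or_insert_with(|| Arc::new(Partition { start_sec }))
            .clone()
    }

    /// Number of partitions created so far.
    pub fn len(&self) -> usize {
        self.inner.lock().unwrap().len()
    }
}
```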
* replace proto
* feat/simple-bulk-memtable:
Update `metrics.rs` to adjust the range of exponential buckets for bulk insert message rows from `10 ~ 1_000_000` to `10 ~ 100_000`.
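As an illustration of what an exponential bucket layout over `10 ~ 100_000` could look like, here is a dependency-free sketch. The factor of 10 and the count of 5 are assumptions chosen to cover that range; the commit states only the range. The function is a local reimplementation for illustration, not the `prometheus` crate's own helper.

```rust
/// Generates `count` bucket upper bounds: start, start*factor, start*factor^2, ...
/// Local sketch for illustration; parameters used below are assumptions.
pub fn exponential_buckets(start: f64, factor: f64, count: usize) -> Vec<f64> {
    assert!(start > 0.0 && factor > 1.0 && count > 0);
    let mut buckets = Vec::with_capacity(count);
    let mut next = start;
    for _ in 0..count {
        buckets.push(next);
        next *= factor;
    }
    buckets
}
```

Calling `exponential_buckets(10.0, 10.0, 5)` yields bounds at 10, 100, 1,000, 10,000 and 100,000, which keeps resolution for small batches while still covering large ones with few buckets.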
* feat: update pgwire to 0.29
* chore: only build default binary in nix ci
* Update src/servers/Cargo.toml
Co-authored-by: dennis zhuang <killme2008@gmail.com>
---------
Co-authored-by: dennis zhuang <killme2008@gmail.com>
* feat: flow add static user/pwd auth
* fix: not print password
* chore: rm explicit Any bound
* refactor: per review
* refactor: move away from plugin
* refactor: not use any
* chore: per review
* chore: complete a todo
* chore: fix after rebase
* feat: support auto transform
* refactor: replace hashbrown with ahash
* refactor: params of run identity pipeline
* refactor: minor update
* test: add test for auto transform
* feat: add select processor
* test: select processor
* chore: use include and exclude for key
* fix: typos
* chore: address CR comment
* chore: typo
* chore: typo
* chore: address CR comment
* chore: use with_context
* fix: do not add projection for cast
Use cast to build the time filter directly instead of adding a projection,
which causes a column-not-found error
* feat: cast before creating plan
* feat/bridge-bulk-insert:
## Implement Bulk Insert and Update Dependencies
- **Bulk Insert Implementation**: Added `handle_bulk_inserts` method in `src/operator/src/bulk_insert.rs` to manage bulk insert requests using `FlightDecoder` and `FlightData`.
- **Dependency Updates**: Updated `Cargo.lock` and `Cargo.toml` to use the latest revision of `greptime-proto` and added new dependencies like `arrow`, `arrow-ipc`, `bytes`, and `prost`.
- **gRPC Enhancements**: Modified `put_record_batch` method in `src/frontend/src/instance/grpc.rs` and `src/servers/src/grpc/flight.rs` to handle `FlightData` instead of `RawRecordBatch`.
- **Error Handling**: Added new error types in `src/operator/src/error.rs` for handling Arrow operations and decoding flight data.
- **Miscellaneous**: Updated `src/operator/src/insert.rs` to expose `partition_manager` and `node_manager` as public fields.
* feat/bridge-bulk-insert:
- **Update `greptime-proto` Dependency**: Updated the `greptime-proto` dependency to a new revision in `Cargo.lock` and `Cargo.toml`.
- **Refactor gRPC Query Handling**: Removed `RawRecordBatch` usage from `grpc.rs`, `flight.rs`, `greptime_handler.rs`, and test files, simplifying the gRPC query handling.
- **Enhance Bulk Insert Logic**: Improved bulk insert logic in `bulk_insert.rs` and `region_request.rs` by using `FlightDecoder` and `BooleanArray` for better performance and clarity.
- **Add `common-grpc` Dependency**: Added `common-grpc` as a workspace dependency in `store-api/Cargo.toml` to support gRPC functionalities.
* fix: clippy
* fix schema serialization
* feat/bridge-bulk-insert:
Add error handling for encoding/decoding in `metadata.rs` and `region_request.rs`
- Introduced new error variants `FlightCodec` and `Prost` in `MetadataError` to handle encoding/decoding failures in `metadata.rs`.
- Updated `make_region_bulk_inserts` function in `region_request.rs` to use `context` for error handling with `ProstSnafu` and `FlightCodecSnafu`.
- Enhanced error handling for `FlightData` decoding and `filter_record_batch` operations.
* fix: test
* refactor: rename
* allow empty app_metadata in FlightData
* feat/bridge-bulk-insert:
- **Remove Logging**: Removed unnecessary logging of affected rows in `region_server.rs`.
- **Error Handling Enhancement**: Improved error handling in `bulk_insert.rs` by adding context to `split_record_batch` and handling single datanode fast path.
- **Error Enum Cleanup**: Removed unused `Arrow` error variant from `error.rs`.
* fix: standalone test
* feat/bridge-bulk-insert:
### Enhance Bulk Insert Handling and Metadata Management
- **`lib.rs`**: Enabled the `result_flattening` feature for improved error handling.
- **`request.rs`**: Made `name_to_index` and `has_null` fields public in `WriteRequest` for better accessibility.
- **`handle_bulk_insert.rs`**:
- Added `handle_record_batch` function to streamline processing of bulk insert payloads.
- Improved error handling and task management for bulk insert operations.
- Updated `region_metadata_to_column_schema` to return both column schemas and a name-to-index map for efficient data access.
* feat/bridge-bulk-insert:
- **Refactor `handle_bulk_insert.rs`:**
- Replaced `handle_record_batch` with `handle_payload` for handling payloads.
- Modified the fast path to use `common_runtime::spawn_global` for asynchronous task execution.
- **Optimize `multi_dim.rs`:**
- Added a fast path for single-region scenarios in `MultiDimPartitionRule::partition_record_batch`.
* feat/bridge-bulk-insert:
- **Update `greptime-proto` Dependency**: Updated the `greptime-proto` dependency to a new revision in both `Cargo.lock` and `Cargo.toml`.
- **Optimize Memory Allocation**: Increased initial and builder capacities in `time_series.rs` to improve performance.
- **Enhance Data Handling**: Modified `bulk_insert.rs` to use `Bytes` for efficient data handling.
- **Improve Bulk Insert Logic**: Refined the bulk insert logic in `region_request.rs` to handle schema and payload data more effectively and optimize record batch filtering.
- **String Handling Improvement**: Updated string conversion in `helper.rs` for better performance.
* fix: clippy warnings
* feat/bridge-bulk-insert:
**Add Metrics and Improve Error Handling**
- **Metrics Enhancements**: Introduced new metrics for bulk insert operations in `metrics.rs`, `bulk_insert.rs`, `greptime_handler.rs`, and `region_request.rs`. Added `HANDLE_BULK_INSERT_ELAPSED`, `BULK_REQUEST_MESSAGE_SIZE`, and `GRPC_BULK_INSERT_ELAPSED` histograms to monitor performance.
- **Error Handling Improvements**: Removed unnecessary error handling in `handle_bulk_insert.rs` by eliminating redundant `let _ =` patterns.
- **Dependency Updates**: Added `lazy_static` and `prometheus` to `Cargo.lock` and `Cargo.toml` for metrics support.
- **Code Refactoring**: Simplified function calls in `region_server.rs` and `handle_bulk_insert.rs` for better readability.
* chore: rebase main
* chore: merge main
* ci: update website greptimedb version when releasing automatically
* fix: token name
* chore: tweak readme
* fix: style
* chore: license year
* refactor: simplify bump-website-version.ts
* chore: being used
* fix: make ci happy
* chore: insert support string to numeric auto cast
* test: add sqlness test
* chore: remove log
* test: fix sql test
* style: fix clippy
* test: test invalid number
* feat: do not convert to default if unable to parse
* chore: update comment
* test: update sqlness test
* test: update prepare test
* feat: support auto transform
* refactor: replace hashbrown with ahash
* refactor: params of run identity pipeline
* refactor: minor update
* test: add test for auto transform
* chore: fix cr issues
* chore: only retry when retry-able
* chore: revert dbg change
* refactor: per review
* fix: check for available frontend first
* docs: more explanation & longer timeout & feat: more retries at every level & try sending `SELECT 1`
* fix: use `sql` method for "SELECT 1"
* fix: also put recover flows in spawned task and a dead loop
* test: update transient error in flow rebuild test
* chore: sleep after sqlness sleep
* chore: add a warning
* chore: wait even more time after reboot
* test: incorrect test result when filtering pk with multiple columns
* fix: prune non first tag correctly
Distinguish a missing column from missing stats, and only use the default
value when the column is missing
* test: update test result
* refactor: rename test file
* test: add test for null filter
* fix: use StatValues for null counts
* test: drop table
* test: fix unstable flow test
fix/checking-memtable-empty-and-stats:
- **Refactor timestamp updates**: Simplified timestamp range updates in `PartitionTreeMemtable` and `TimeSeriesMemtable` by replacing `update_timestamp_range` with `fetch_max` and `fetch_min` methods for `max_timestamp` and `min_timestamp`.
- Affected files: `partition_tree.rs`, `time_series.rs`
- **Remove unused code**: Deleted the `update_timestamp_range` method from `WriteMetrics` and removed unnecessary imports.
- Affected file: `stats.rs`
- **Optimize memtable filtering**: Streamlined the check for empty memtables in `ScanRegion` by directly using `time_range`.
- Affected file: `scan_region.rs`
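The `fetch_max` / `fetch_min` approach can be sketched with standard-library atomics as below. The `TimeRange` type and its method names are hypothetical; only `AtomicI64::fetch_min` and `AtomicI64::fetch_max` are real standard-library calls, and `Relaxed` ordering is an assumption.

```rust
use std::sync::atomic::{AtomicI64, Ordering};

/// Hypothetical tracker for the min/max timestamps written to a memtable.
pub struct TimeRange {
    min_timestamp: AtomicI64,
    max_timestamp: AtomicI64,
}

impl TimeRange {
    /// Starts with an "empty" range: min above max.
    pub fn new() -> Self {
        Self {
            min_timestamp: AtomicI64::new(i64::MAX),
            max_timestamp: AtomicI64::new(i64::MIN),
        }
    }

    /// Widens the range to include `ts` without taking a lock.
    pub fn observe(&self, ts: i64) {
        self.min_timestamp.fetch_min(ts, Ordering::Relaxed);
        self.max_timestamp.fetch_max(ts, Ordering::Relaxed);
    }

    /// Returns `None` while nothing has been written.
    pub fn range(&self) -> Option<(i64, i64)> {
        let min = self.min_timestamp.load(Ordering::Relaxed);
        let max = self.max_timestamp.load(Ordering::Relaxed);
        (min <= max).then_some((min, max))
    }
}
```

A `None` range gives a direct emptiness check, which fits the commit's note about using `time_range` to detect empty memtables.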
* try prune one less
* test: also not add one
* ci: use longer fuzz time
* revert fuzz time&per review
* chore: no (
* docs: add explain to offset used in delete records
* test: fix test_procedure_execution
* feat: use flow batching engine
broken: try using logical plan
fix: use dummy catalog for logical plan
fix: insert plan exec&sqlness grpc addr
feat: use frontend instance in flownode in standalone
feat: flow type in metasrv&fix: flush flow out of sync& column name alias
tests: sqlness update
tests: sqlness flow rebuild update
chore: per review
refactor: keep chnl mgr
refactor: use catalog mgr for get table
tests: use valid sql
fix: add more check
refactor: put flow type determine to frontend
* chore: update proto
* chore: update proto to main branch
* fix: add locks for create/drop flow&docs: update docs
* feat: flush_flow flush all ranges now
* test: add align time window test
* docs: explain `nodeid` use in check task
* refactor: AddAutoColumnRewriter check for Projection
* refactor: per review
* fix: query without time window also clean dirty time window
* chore: better logging
* chore: add comments per review
* refactor: per review
* chore: per review
* chore: per review rename args
* refactor: per review partially
* chore: update docs
* chore: use better error variant
* chore: better error variant
* refactor: rename FlowWorkerManager to FlowStreamingEngine
* rename again
* refactor: per review
* chore: rebase after #5963 merged
* refactor: rename all flow_worker_manager occurrences
* docs: rm resolved TODO
* fix: store flow schema on creation
* chore: update sqlness
* refactor: save the entire query context to flow info
* chore: sqlness update
* chore: rm pub
* fix: keep old version compatibility
* fix: remove obsolete failover detectors after region leader change
* chore: apply suggestions from CR
* fix: fix unit tests
* fix: fix unit test
* fix: failover logic
* [wip]: implement arrow service
* add service
* feat/otel-arrow:
### Add OpenTelemetry Arrow Support
- **`Cargo.toml`, `Cargo.lock`**: Updated `otel-arrow-rust` dependency to use a local path and added `arrow-ipc` as a dependency.
- **`src/servers/src/grpc.rs`, `src/servers/src/grpc/builder.rs`**: Integrated `ArrowMetricsServiceServer` with gRPC server, including support for custom header interception and message compression.
- **`src/servers/src/otel_arrow.rs`**: Implemented `OtelArrowServiceHandler` for handling OpenTelemetry Arrow metrics and added `HeaderInterceptor` for custom header handling.
* feat/otel-arrow:
Add error handling for OpenTelemetry Arrow requests
- **`src/error.rs`**: Introduced a new error variant `HandleOtelArrowRequest` to handle failures in processing OpenTelemetry Arrow requests.
- **`src/otel_arrow.rs`**: Implemented error handling for receiving and consuming batches from the OpenTelemetry Arrow client. Added logging for errors and updated the response status accordingly.
* feat/otel-arrow:
Remove `otel_arrow` Module from gRPC Server
- Deleted the `otel_arrow` module from the gRPC server implementation.
- Removed the `otel_arrow` module import from `grpc.rs`.
- Deleted the `otel_arrow.rs` file, which contained the `OtelArrowServer` struct and its implementation.
* feat/otel-arrow:
## Remove `Arc` Implementations for Protocol and Pipeline Handlers
- **Removed `Arc` Implementations**: Deleted `Arc` implementations for `OpenTelemetryProtocolHandler` and `PipelineHandler` traits in `query_handler.rs`. This change simplifies the code by removing redundant async trait implementations for `Arc<T>`.
- **File Affected**: `src/servers/src/query_handler.rs`
* feat/otel-arrow:
Improve error handling and metadata processing in `otel_arrow.rs`
- Updated error handling by ignoring the result of `sender.send` to prevent panic on failure.
- Enhanced metadata processing in `HeaderInterceptor` by using `Ok` to safely handle `grpc-encoding` entry retrieval.
* fix dependency
* feat/otel-arrow:
- **Update Dependencies**:
- Moved `otel-arrow-rust` dependency in `Cargo.toml`.
- Adjusted workspace dependencies in `src/frontend/Cargo.toml`.
- **Error Handling**:
- Removed `MissingQueryContext` error variant from `src/servers/src/error.rs`.
* fix: toml format
* remove useless code
* chore: resolve conflicts
* test: sqlness test case
* feat: use correct default while pruning row groups
* fix: consider default in SimpleFilterContext
* test: update sqlness test
* test: add order by
* feat: enable submitting wal prune procedure periodically
* chore: fix and add options
* test: add unit test
* test: fix unit test
* test: enable active_wal_pruning in test
* test: update default config
* chore: update config name
* refactor: use semaphore to control the number of prune process
* refactor: use split client for wal prune manager and topic creator
* chore: add configs
* chore: apply review comments
* fix: use tracker properly
* fix: use guard to track semaphore
* test: update unit tests
* chore: update config name
* chore: use prunable_entry_id
* refactor: semaphore to only limit the process of submitting
* chore: remove legacy sort
* chore: better configs
* fix: update config.md
* chore: respect fmt
* test: update unit tests
* chore: use interval_at
* fix: fix unit test
* test: fix unit test
* test: fix unit test
* chore: apply review comments
* docs: update config docs
* feat: close follower regions after dropping leader regions
* chore: upgrade greptime-proto
* feat: sync region followers after alter region operations
* test: add tests
* chore: apply suggestions from CR
* chore: apply suggestions from CR
* feat: cache regex in evaluator
* chore: fix warnings
* chore: add reference
* refactor: address CR comments
* Add negative to state
* Don't create the evaluator if the regex is invalid
* test: add test for maybe_build_regex
* fix/pg-timestamp-diff:
### Add Support for `Duration` Type in PostgreSQL Encoding
- **Enhanced `encode_value` Functionality**: Updated `src/servers/src/postgres/types.rs` to support encoding of `Value::Duration` using `PgInterval`.
- **Implemented `Duration` Conversion**: Added conversion logic from `Duration` to `PgInterval` in `src/servers/src/postgres/types/interval.rs`.
- **Added Unit Tests**: Introduced tests for `Duration` to `PgInterval` conversion in `src/servers/src/postgres/types/interval.rs`.
- **Updated SQL Test Cases**: Modified `tests/cases/standalone/common/types/timestamp/timestamp.sql` and `timestamp.result` to include tests for timestamp subtraction using PostgreSQL protocol.
* fix: overflow
* fix/pg-timestamp-diff:
Update `timestamp.sql` to ensure newline consistency
- Modified `timestamp.sql` to add a newline at the end of the file for consistency.
* fix/pg-timestamp-diff:
### Add Documentation for Month Approximation in Interval Calculation
- **File Modified**: `src/servers/src/postgres/types/interval.rs`
- **Key Change**: Added a comment explaining the approximation of one month as 30.44 days in the interval calculations.
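The month approximation can be illustrated with a small sketch. `IntervalParts` and `secs_to_interval` are hypothetical names, not the actual `PgInterval` conversion in `interval.rs`. The sketch assumes a signed second count as input and only shows how a 30.44-day month splits a duration into months, days, and microseconds.

```rust
/// Hypothetical stand-in for a PostgreSQL-style interval.
#[derive(Debug, PartialEq, Eq)]
pub struct IntervalParts {
    pub months: i32,
    pub days: i32,
    pub microseconds: i64,
}

const SECS_PER_DAY: i64 = 86_400;
/// One month approximated as 30.44 days: 30.44 * 86_400 s = 2_630_016 s.
const SECS_PER_MONTH: i64 = 2_630_016;

/// Splits a signed second count into months, days, and leftover microseconds.
/// Returns `None` instead of wrapping when the month count does not fit in `i32`.
pub fn secs_to_interval(total_secs: i64) -> Option<IntervalParts> {
    let months = i32::try_from(total_secs / SECS_PER_MONTH).ok()?;
    let rem = total_secs % SECS_PER_MONTH;
    // `rem` is strictly smaller than one month, so the day count always fits.
    let days = (rem / SECS_PER_DAY) as i32;
    let microseconds = (rem % SECS_PER_DAY) * 1_000_000;
    Some(IntervalParts { months, days, microseconds })
}
```

Under this approximation, 45 days becomes 1 month, 14 days, and a sub-day remainder, so the result is not a calendar-exact month count.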
* feat: implement Arrow Flight "DoPut" in Frontend
* support auth for "do_put"
* set request_id in DoPut requests and responses
* set "db" in request header
* wip: implement basic request handling
* feat/bulk-insert:
### Add Error Handling and Enhance Bulk Insert Functionality
- **Error Handling**: Introduced a new error variant `ConvertDataType` in `error.rs` to handle conversion failures from `ConcreteDataType` to `ColumnDataType`.
- **Bulk Insert Enhancements**:
- Updated `WorkerRequest::BulkInserts` in `request.rs` to include metadata and sender.
- Implemented `handle_bulk_inserts` in `worker.rs` to process bulk insert requests with region metadata.
- Added functions `region_metadata_to_column_schema` and `record_batch_to_rows` in `handle_bulk_insert.rs` for schema conversion and row processing.
- **API Changes**: Modified `RegionBulkInsertsRequest` in `region_request.rs` to include `region_id`.
Files affected: `error.rs`, `request.rs`, `worker.rs`, `handle_bulk_insert.rs`, `region_request.rs`.
* feat/bulk-insert:
**Enhance Error Handling and Add Unit Tests**
- Improved error handling in `record_batch_to_rows` function within `handle_bulk_insert.rs` by returning `Result` and handling errors with `context`.
- Added unit tests for `region_metadata_to_column_schema` and `record_batch_to_rows` functions in `handle_bulk_insert.rs` to ensure correct functionality and error handling.
* chore: update proto version
* feat/bulk-insert:
- **Refactor Error Handling**: Updated error handling in `error.rs` by modifying the `ConvertDataType` error handling.
- **Improve Logging and Error Reporting**: Enhanced logging and error reporting in `worker.rs` by adding error messages for missing region metadata.
- **Add New Error Type**: Introduced `DecodeArrowIpc` error in `metadata.rs` to handle Arrow IPC decoding failures.
- **Handle Arrow IPC Decoding**: Updated `region_request.rs` to handle Arrow IPC decoding errors using the new `DecodeArrowIpc` error type.
* chore: update proto version
* feat/bulk-insert:
Refactor `handle_bulk_insert.rs` to simplify row construction
- Removed the mutable `current_row` vector and refactored `row_at` function to return a new vector directly.
- Updated `record_batch_to_rows` to utilize the refactored `row_at` function for constructing rows.
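A minimal sketch of the refactored shape, assuming plain `Vec` columns rather than Arrow arrays. The generic signatures are illustrative and are not the ones in `handle_bulk_insert.rs`; `columns_to_rows` is a hypothetical stand-in for `record_batch_to_rows`.

```rust
/// Builds the row at `idx` from column-major data, returning a fresh vector
/// instead of filling a caller-owned mutable buffer.
pub fn row_at<T: Clone>(columns: &[Vec<T>], idx: usize) -> Vec<T> {
    columns.iter().map(|col| col[idx].clone()).collect()
}

/// Converts column-major data into rows.
/// All columns are assumed to have the same length.
pub fn columns_to_rows<T: Clone>(columns: &[Vec<T>]) -> Vec<Vec<T>> {
    let num_rows = columns.first().map_or(0, Vec::len);
    (0..num_rows).map(|idx| row_at(columns, idx)).collect()
}
```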
* feat/bulk-insert:
### Commit Summary
**Enhancements in Region Server Request Handling**
- Updated `region_server.rs` to include `RegionRequest::BulkInserts(_)` in the `RegionChange::Ingest` category, improving the handling of bulk insert operations.
- Refined the categorization of region requests to ensure accurate mapping to `RegionChange` actions.
* wip: naive impl
* feat/column-partition:
### Add support for DataFusion physical expressions
- **`Cargo.lock` & `Cargo.toml`**: Added `datafusion-physical-expr` as a dependency to support physical expression creation.
- **`expr.rs`**: Implemented conversion methods `try_as_logical_expr` and `try_as_physical_expr` for `Operand` and `PartitionExpr` to facilitate logical and physical expression handling.
- **`multi_dim.rs`**: Enhanced `MultiDimPartitionRule` to utilize physical expressions for partitioning logic, including new methods for evaluating record batches.
- **Tests**: Added unit tests for logical and physical expression conversions and partitioning logic in `expr.rs` and `multi_dim.rs`.
* feat/column-partition:
### Refactor and Enhance Partition Handling
- **Refactor Partition Parsing Logic**: Moved partition parsing logic from `src/operator/src/statement/ddl.rs` to a new utility module `src/partition/src/utils.rs`. This includes functions like `parse_partitions`, `find_partition_bounds`, and `convert_one_expr`.
- **Error Handling Improvements**: Added new error variants `ColumnNotFound`, `InvalidPartitionRule`, and `ParseSqlValue` in `src/partition/src/error.rs` to improve error reporting for partition-related operations.
- **Dependency Updates**: Updated `Cargo.lock` and `Cargo.toml` to include new dependencies `common-time` and `session`.
- **Code Cleanup**: Removed redundant partition parsing functions from `src/operator/src/error.rs` and `src/operator/src/statement/ddl.rs`.
* feat/column-partition:
## Refactor and Enhance SQL and Table Handling
- **Refactor Column Definitions and Error Handling**
- Made `FULLTEXT_GRPC_KEY`, `INVERTED_INDEX_GRPC_KEY`, and `SKIPPING_INDEX_GRPC_KEY` public in `column_def.rs`.
- Removed `IllegalPrimaryKeysDef` error from `error.rs` and moved it to `sql/src/error.rs`.
- Updated error handling in `fill_impure_default.rs` and `expr_helper.rs`.
- **Enhance SQL Utility Functions**
- Moved and refactored functions like `create_to_expr`, `find_primary_keys`, and `validate_create_expr` to `sql/src/util.rs`.
- Added new utility functions for SQL parsing and validation in `sql/src/util.rs`.
- **Improve Partition Handling**
- Added `parse_partition_columns_and_exprs` function in `partition/src/utils.rs`.
- Updated partition rule tests in `partition/src/multi_dim.rs` to use SQL-based partitioning.
- **Simplify Table Name Handling**
- Re-exported `table_idents_to_full_name` from `sql::util` in `session/src/table_name.rs`.
- **Test Enhancements**
- Updated tests in `partition/src/multi_dim.rs` to use SQL for partition rule creation.
* feat/column-partition:
**Add Benchmarking and Enhance Partitioning Logic**
- **Benchmarking**: Introduced a new benchmark for `split_record_batch` in `bench_split_record_batch.rs` using `criterion` and `rand` as development dependencies in `Cargo.toml`.
- **Partitioning Logic**: Enhanced `MultiDimPartitionRule` in `multi_dim.rs` to include a default region for unmatched partition expressions and optimized the `split_record_batch` method.
- **Refactoring**: Moved `sql_to_partition_rule` function to a public scope for reuse in `multi_dim.rs`.
- **Testing**: Added new test module `test_split_record_batch` to validate the partitioning logic.
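The default-region behaviour can be sketched as follows. This is a row-at-a-time illustration over plain slices with boxed predicates, not the columnar `split_record_batch` in `multi_dim.rs`. `split_rows` and its parameters are hypothetical.

```rust
use std::collections::BTreeMap;

/// A partition predicate over a single row (hypothetical type alias).
pub type Predicate<R> = Box<dyn Fn(&R) -> bool>;

/// Assigns each row index to the first region whose predicate matches,
/// falling back to `default_region` when no predicate matches.
pub fn split_rows<R>(
    rows: &[R],
    rules: &[(u32, Predicate<R>)],
    default_region: u32,
) -> BTreeMap<u32, Vec<usize>> {
    let mut out: BTreeMap<u32, Vec<usize>> = BTreeMap::new();
    for (idx, row) in rows.iter().enumerate() {
        let region = rules
            .iter()
            .find(|(_, pred)| pred(row))
            .map_or(default_region, |(region, _)| *region);
        out.entry(region).or_default().push(idx);
    }
    out
}
```

Rows that match no rule land in the default region instead of being dropped or causing an error.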
* Revert "feat/column-partition: ### Refactor and Enhance Partition Handling"
This reverts commit 183fa19f
* fix: revert refactoring parse_partition
* revert some refactor
* feat/column-partition:
### Enhance Partitioning and Error Handling
- **Benchmark Enhancements**: Added new benchmark `bench_split_record_batch_vs_row` in `bench_split_record_batch.rs` to compare row and column-based splitting.
- **Error Handling Improvements**: Introduced new error variants in `error.rs` for better error reporting related to record batch evaluation and arrow kernel computation.
- **Expression Handling**: Updated `expr.rs` to improve error context when converting schemas and creating physical expressions.
- **Partition Rule Enhancements**: Made `row_at` and `record_batch_to_cols` methods public in `multi_dim.rs` and improved error handling for physical expression evaluation and boolean operations.
* feat/column-partition:
### Add `eq` Method and Optimize Expression Caching
- **`expr.rs`**: Added a new `eq` method to the `Operand` struct for equality comparisons.
- **`multi_dim.rs`**: Introduced a caching mechanism for physical expressions using `RwLock` to improve performance in `MultiDimPartitionRule`.
- **`lib.rs`**: Enabled the `let_chains` feature for more concise code.
- **`multi_dim.rs` Tests**: Enhanced test coverage with new test cases for multi-dimensional partitioning, including random record batch generation and default region handling.
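One common way to cache a lazily built value behind an `RwLock` is a read-then-write double check. The sketch below assumes a single cached slot and a hypothetical `Cached<T>` wrapper. How `MultiDimPartitionRule` actually keys and stores its physical expressions is not stated in this log.

```rust
use std::sync::{Arc, RwLock};

/// Lazily builds a value once and shares it across callers.
pub struct Cached<T> {
    slot: RwLock<Option<Arc<T>>>,
}

impl<T> Cached<T> {
    pub fn new() -> Self {
        Self { slot: RwLock::new(None) }
    }

    /// Returns the cached value, building it with `build` on first use.
    pub fn get_or_build(&self, build: impl FnOnce() -> T) -> Arc<T> {
        // Fast path: shared read lock, nothing is built.
        if let Some(v) = self.slot.read().unwrap().as_ref() {
            return Arc::clone(v);
        }
        // Slow path: take the write lock and re-check, because another
        // thread may have filled the slot between the two locks.
        let mut guard = self.slot.write().unwrap();
        if let Some(v) = guard.as_ref() {
            return Arc::clone(v);
        }
        let v = Arc::new(build());
        *guard = Some(Arc::clone(&v));
        v
    }
}
```

Readers only contend on the write lock during the first build. Later calls take the read lock and clone the `Arc`.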
* feat/column-partition:
### Add `split_record_batch` Method to `PartitionRule` Trait
- **Files Modified**:
- `src/partition/src/multi_dim.rs`
- `src/partition/src/partition.rs`
- `src/partition/src/splitter.rs`
Added a new method `split_record_batch` to the `PartitionRule` trait, allowing record batches to be split into multiple regions based on partition values. Implemented this method in `MultiDimPartitionRule` and provided unimplemented stubs in test modules.
### Dependency Update
- **File Modified**:
- `src/operator/src/expr_helper.rs`
Removed unused import `ColumnDataType` and `Timezone` from the test module.
### Miscellaneous
- **File Modified**:
- `src/partition/Cargo.toml`
No functional changes; only minor formatting adjustments.
* chore: add license header
* chore: remove useless files
* feat/column-partition:
Add support for handling unsupported partition expression values
- **`error.rs`**: Introduced a new error variant `UnsupportedPartitionExprValue` to handle unsupported partition expression values, and updated `ErrorExt` to map this error to `StatusCode::InvalidArguments`.
- **`expr.rs`**: Modified the `Operand` implementation to return the new error when encountering unsupported partition expression values.
- **`multi_dim.rs`**: Added a fast path to optimize the selection process when all rows are selected.
* feat/column-partition: Add validation for expression and region length in MultiDimPartitionRule constructor
• Ensure the lengths of exprs and regions match to prevent mismatches.
• Introduce error handling for length discrepancies with a descriptive error message.
* chore: add debug log
* feat/column-partition: Removed the validation check for matching lengths between exprs and regions in MultiDimPartitionRule constructor, simplifying the initialization process.
* fix: unit tests
* fix: gRPC connection pool leak
* use .config() instead of .inner.config
* cancel the bg task if it is running
* fix: cr
* add unit test for pool release
* Avoid potential data races
* fix/remove-metadata-region-options:
### Add `SKIP_WAL_KEY` Option to Metric Engine
- **Enhancements**:
- Introduced `SKIP_WAL_KEY` to the metric engine options in `create.rs` and `mito_engine_options.rs`.
- Updated test cases in `create.rs` to include `skip_wal` option and ensure it is removed for metadata regions.
- **Refactoring**:
- Updated `requests.rs` to use `SKIP_WAL_KEY` from `store_api::mito_engine_options`.
These changes enhance the metric engine by allowing the option to skip Write-Ahead Logging (WAL) and ensure consistent usage of option keys across modules.
* fix/remove-metadata-region-options: Add note for new options in mito_engine_options.rs
• Introduce a comment to remind developers to check if new options should be removed in region_options_for_metadata_region within metric_engine::engine::create.
* empty
* feat: partial impl of rr task/state
* feat: recording rule engine
* chore: rm unused
* chore: per review partially
* test: gen create table
* chore: rm some unused
* test: merge time window
* refactor: rename to batching mode
* refactor: per review
* refactor(partially): per review
* refactor: split engine.rs into three files
* refactor: use plan not sql
* chore: per review
* chore: per review
* refactor: per review
* refactor: per review
* chore: more per review
* refactor: per review
* refactor(partial): per review
* refactor: per review
* chore: clone task cheaper&more comments
* chore: fmt
* chore: typo
* refactor: improve jaeger '/api/services' performance by adding the trace services table
* chore: refine some logic
* chore: compatible v0
* test: add integration test
* chore: expand default limit from 100 to 2000
* test: fix integration test
* refactor: make trace service table configurable
* refactor: use a timestamp(2100-01-01 00:00:00) as large as possible
* refactor: use '<trace_table>_services' as trace services table name
* perf: introduce simd_json for parsing ndjson
* fix: some tests
* fix: some tests
* fix: es test case
* chore: use `as_bytes_mut()`
* chore: remove unnecessary `to_string`
* chore: add safety comment
* refactor: remove mode option in configuration files
* chore: remove mode in configuration file
* remove mode field in FlownodeOptions
* add comment for test
* update config.md
* remove mode field in standalone options
* fix: ci
* feat: time window expr
* chore: comments
* refactor: per review
* chore: partially per review
* chore: per review
* chore: per review use query engine's session
* chore: add table name template in pipeline yaml
* chore: implement apply function and add simple test
* chore: add comment and integration test
* chore: minor update
* fix: typos
* chore: change to table suffix
* chore: update comment and test
* chore: change name to table_suffix
* feat: add metrics list to scanner
* chore: add report metrics method
* feat: use df metrics in PartitionMetrics
* feat: pass execution metrics to scan partition
* refactor: remove PartitionMetricsList
* feat: better debug format for ScanMetricsSet
* feat: do not expose all metrics to execution metrics by default
* refactor: use struct destruction
* feat: add metrics list to scanner
* chore: Add custom Debug for ScanMetricsSet and partition metrics display
* test: update sqlness result
* fix: fix region follower procedure
* feat: add table related info to region peers table and follower regions
* feat: impl show region
* chore: apply suggestions from CR
* chore: utils for rr
* chore: one more test
* chore: more test case
* test: even more tests
* chore: per review
* tests: add more&update testcase
* chore: update comment
* chore: add Noop Wal option
* remove: WalOptionsAllocator::alloc method
* feat/no-op-wal:
### Add Noop WAL Option
- **`engine.rs`, `opener.rs`, `wal.rs`, `entry_reader.rs`, `handle_write.rs`, `provider.rs`**:
- Introduced a new `WalOptions::Noop` variant to handle scenarios where no write-ahead logging is required.
- Implemented `NoopEntryReader` to provide a no-operation entry reader.
- Updated logic to skip WAL operations for regions with `Noop` option.
- Added `Provider::Noop` to handle `Noop` operations in the provider logic.
* feat/no-op-wal:
### Add `skip_wal` Option to Table Metadata
- **Enhancements in `table_meta.rs`**:
- Added a `skip_wal` parameter to the `create_wal_options` function to allow skipping WAL writes.
- Updated the `create_table_route` function to utilize the `skip_wal` option from `table_info.meta.options`.
- **Updates in `wal_options_allocator.rs`**:
- Modified `alloc_batch` to handle the `skip_wal` flag, setting WAL options to `Noop` when true.
- Added a test case `test_allocator_with_skip_wal` to verify the `skip_wal` functionality.
- **Changes in `requests.rs`**:
- Introduced `skip_wal` in `TableOptions` and added parsing logic.
- Updated `TableOptions` display to include `skip_wal`.
These changes introduce the ability to skip WAL writes for tables, enhancing flexibility in table metadata management.
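A rough sketch of how a `skip_wal` flag might map to `Noop` WAL options. The enum below is a simplified, hypothetical mirror with only two variants, and the `alloc_batch` signature is assumed for illustration. The real allocator in `wal_options_allocator.rs` may differ.

```rust
/// Simplified, hypothetical mirror of the WAL option variants.
#[derive(Debug, Clone, PartialEq, Eq)]
pub enum WalOptions {
    RaftEngine,
    Noop,
}

/// Allocates one WAL option per region. When `skip_wal` is set, every region
/// gets `WalOptions::Noop` regardless of the configured backend.
pub fn alloc_batch(region_count: usize, skip_wal: bool, backend: &WalOptions) -> Vec<WalOptions> {
    if skip_wal {
        return vec![WalOptions::Noop; region_count];
    }
    vec![backend.clone(); region_count]
}

/// Write path check: regions with `Noop` options bypass the WAL entirely.
pub fn should_write_wal(options: &WalOptions) -> bool {
    !matches!(options, WalOptions::Noop)
}
```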
* feat/no-op-wal:
**Add WAL Option Handling and Table Option Validation**
- **`handle_write.rs`**: Introduced a check for `WalOptions::Noop` in the `RegionWorkerLoop` to skip WAL writing for regions with this option.
- **`requests.rs`**: Added `SKIP_WAL_KEY` to the list of valid table options for enhanced table configuration validation.
* feat/no-op-wal:
### Update WAL Options Allocation
- **`key.rs`**: Modified the `allocate_region_wal_options` function to include an additional boolean parameter, enhancing the allocation logic.
- **`wal_options_allocator.rs`**: Simplified the `test_allocator_with_skip_wal` test by removing unnecessary variable declarations and directly using `WalOptionsAllocator::RaftEngine`.
These changes improve the flexibility and efficiency of WAL options allocation in the system.
* chore: reformat code
* feat/no-op-wal:
**Enhancement:** Conditional Addition of `SKIP_WAL_KEY` in `requests.rs`
- Updated `TableOptions` implementation in `requests.rs` to conditionally add `SKIP_WAL_KEY` to `key_vals` only when `self.skip_wal` is true, optimizing the key-value pair generation.
* feat/no-op-wal:
Update `requests.rs` tests to reflect changes in `skip_wal` option
- Modified test assertions in `requests.rs` to remove `skip_wal=false` from expected strings.
- Added a new test case to verify `skip_wal=true` is correctly represented in `TableOptions`.
* feat/no-op-wal: Add Debug Logging and Improve Error Handling for WAL and Table Options
• Introduced debug logging in wal.rs to skip obsolete regions, enhancing traceability.
• Improved error handling in requests.rs by replacing warn with error propagation for invalid skip_wal values.
• Added new test cases for skip_wal functionality, including SQL scripts and expected results, to ensure correct behavior and validation of the changes.
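The parsing and display behaviour described in the last three commits can be sketched roughly as below. The one-field `TableOptions`, the `try_from_kvs` name, and the `String` error type are assumptions made for illustration. Only the `skip_wal` key and the "emit only when true" rule come from the log.

```rust
use std::collections::HashMap;
use std::fmt;

pub const SKIP_WAL_KEY: &str = "skip_wal";

/// Hypothetical, reduced version of the table options.
#[derive(Debug, Default, PartialEq, Eq)]
pub struct TableOptions {
    pub skip_wal: bool,
}

impl TableOptions {
    /// Parses options and returns an error for a non-boolean `skip_wal`
    /// value, rather than logging a warning and continuing.
    pub fn try_from_kvs(kvs: &HashMap<String, String>) -> Result<Self, String> {
        let skip_wal = match kvs.get(SKIP_WAL_KEY) {
            Some(raw) => raw
                .parse::<bool>()
                .map_err(|_| format!("invalid {} value: {}", SKIP_WAL_KEY, raw))?,
            None => false,
        };
        Ok(Self { skip_wal })
    }
}

impl fmt::Display for TableOptions {
    /// Emits `skip_wal=true` only when the flag is set, so default options
    /// print nothing for it.
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        if self.skip_wal {
            write!(f, "{}=true", SKIP_WAL_KEY)?;
        }
        Ok(())
    }
}
```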
* Add explain_verbose to QueryContext
* feat: fmt plan by display type
* feat: update proto to use ExplainOptions
* feat: display more info in verbose mode
* chore: fix clippy
* test: add sqlness test
* test: update sqlness result
* chore: update proto version
* chore: Simplify QueryContextBuilder::explain_options using get_or_insert_default
* chore: minor refactor
* chore: minor refactor
* chore: support custom ts for identity pipeline
* chore: fix clippy
* chore: minor refactor & update tests
* chore: use ref on identity pipeline param
* feat: add mysql election
* feat: add mysql election
* chore: fix deps
* chore: fix deps
* fix: duplicate container
* fix: duplicate setup for sqlness
* fix: call once
* fix: do not use NOWAIT for mysql 5.7
* chore: apply comments
* fix: no parallel sqlness for mysql
* chore: comments and minor revert
* chore: apply comments
* chore: apply comments
* chore: add to table name
* ci: use 2 metasrv to detect election bugs
* refactor: better election logic
* chore: apply comments
* chore: apply comments
* feat: version check before startup
* refactor: remove trace id in primary key
* refactor: remove trace id in primary key in v0 model
* refactor: add span id in v1
* fix: integration test
* feat: add vec_kth_elem function
Signed-off-by: pikady <2652917633@qq.com>
* code format
Signed-off-by: pikady <2652917633@qq.com>
* add test sql
Signed-off-by: pikady <2652917633@qq.com>
* change indexing from 1-based to 0-based
Signed-off-by: pikady <2652917633@qq.com>
* improve code formatting and correct spelling errors
Signed-off-by: pikady <2652917633@qq.com>
* Update tests/cases/standalone/common/function/vector/vector.sql
I noticed the two lines are identical. Could you clarify the reason for the change? Thanks!
Co-authored-by: Zhenchi <zhongzc_arch@outlook.com>
---------
Signed-off-by: pikady <2652917633@qq.com>
Co-authored-by: Zhenchi <zhongzc_arch@outlook.com>
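For reference, 0-based indexing means `k = 0` selects the first element. The sketch below assumes a plain `f32` slice and returns `None` when `k` is out of range. The real SQL function's input encoding and out-of-range behaviour are not described in this log.

```rust
/// Returns the element at 0-based position `k`, or `None` when `k` is out of range.
pub fn vec_kth_elem(v: &[f32], k: usize) -> Option<f32> {
    v.get(k).copied()
}
```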
* feat: update to disable http timeout by default
* feat: make http timeout default to 0
* test: correct test case
* chore: generate new config doc
* test: correct tests
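A small sketch of the "zero means disabled" convention, assuming the configured timeout is a `Duration`. `effective_timeout` is a hypothetical helper, not a function from the codebase.

```rust
use std::time::Duration;

/// Maps a configured timeout to an optional one: zero disables the timeout.
pub fn effective_timeout(configured: Duration) -> Option<Duration> {
    if configured.is_zero() {
        None
    } else {
        Some(configured)
    }
}
```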
* refactor: update jaeger api implementation
* test: add tests for v1 data model
* feat: customize trace table name
* fix: update column requirements to use Column type instead of String
Signed-off-by: Ruihang Xia <waynestxia@gmail.com>
* fix: lint fix
* refactor: accumulate resource attributes for v1
* fix: add empty check for additional string
* feat: add table option to mark data model version
* fix: do not overwrite all tags
* feat: use table option to mark table data model version and process accordingly
* chore: update comments to reflect query changes
* feat: use header for jaeger table name
* feat: update index for service_name, drop index for span_name
---------
Signed-off-by: Ruihang Xia <waynestxia@gmail.com>
Co-authored-by: Ruihang Xia <waynestxia@gmail.com>
Co-authored-by: zyy17 <zyylsxm@gmail.com>
* refactor: use proc macro to generate conversion between TableMeta and TableMetaBuilder
* chore: format
* fix/partition-key-index:
### Update `TableMeta` and Add Partition and Alter Table Tests
- **`metadata.rs`**: Modified `new_meta_builder` method in `TableMeta` to manually remove `value_indices` by setting it to `None` in the `TableMetaBuilder`.
- **`partition_and_alter.result` & `partition_and_alter.sql`**: Added new test cases for creating, inserting, selecting, altering, and dropping a partitioned table `molestiAe`. These tests verify partitioning on the `sImiLiQUE` column and altering the table with a TTL setting.
fix/partition-key-index:
### Remove Obsolete TODO Comment in `metadata.rs`
- Removed an outdated TODO comment regarding the `new_meta_builder` function in `src/table/src/metadata.rs`.
chore: check struct name in derive_meta_builder
refactor: Simplify TableMeta struct name check in macro
refactor: Improve ToMetaBuilder derive macro validation and error handling
refactor: Enforce ToMetaBuilder macro for table::metadata::TableMeta struct
* fix/partition-key-index:
Update `partition_and_alter.sql` to modify TTL setting
- Modified the TTL setting for the `molestiAe` table to '1d' in `partition_and_alter.sql`.
* fix: sqlness
* fix/partition-key-index:
### Update `TableMeta` and Test File Structure
- **Enhancement**: Added a note in `metadata.rs` to always use `new_meta_builder` for creating `TableMetaBuilder`.
- **Refactor**: Renamed test result and SQL files for better organization:
- `partition_and_alter.result` to `alter/partition_and_alter.result`
- `partition_and_alter.sql` to `alter/partition_and_alter.sql`
* refactor: Simplify `derive_meta_builder` by initializing fields with `Default::default()`
* fix/partition-key-index:
### Commit Summary
- **Refactor `TableMetaBuilder` Initialization**:
- Replaced `TableMetaBuilder::default()` with `TableMetaBuilder::empty()` across multiple files for initializing `TableMetaBuilder` instances.
- Affected files include:
- `src/catalog/src/system_schema.rs`
- `src/common/meta/src/key/test_utils.rs`
- `src/operator/src/req_convert/insert/fill_impure_default.rs`
- `src/query/src/log_query/planner.rs`
- `src/query/src/promql/planner.rs`
- `src/query/src/range_select/plan_rewrite.rs`
- `src/query/src/sql/show_create_table.rs`
- `src/table/src/test_util/memtable.rs`
- `src/table/src/test_util/table_info.rs`
- **Enhance `TableMetaBuilder`**:
- Added `custom_constructor` to `TableMeta` and implemented an `empty` method for `TableMetaBuilder`.
- Modified `TableMetaBuilder` to include a `new_external_table` method with default values.
- Updated `src/table/src/metadata.rs` to reflect these changes.
- **Add Testing Feature**:
- Introduced a conditional compilation for `test_util` in `src/table/src/lib.rs` to include testing utilities when the `testing` feature is enabled.
- **Update `Cargo.toml`**:
- Enabled the `testing` feature for the `table` module in `src/common/meta/Cargo.toml`.
- **Modify `NumbersTable` Initialization**:
- Replaced `TableMetaBuilder` with direct `TableMeta` struct initialization in `src/table/src/table/numbers.rs`.
- **Test Result Update**:
- Updated test results in `tests/cases/standalone/common/alter/partition_and_alter.result` to reflect changes in table meta handling.
* fix: rename default to empty
* docs: add doc for TableMetaBuilder::empty
* chore: Update src/table/src/metadata.rs
---------
Co-authored-by: Yingwen <realevenyag@gmail.com>
* feat: enhance information_schema.flows
* feat: enhance information_schema.flows
* u
* u
* u
* u
* u
* u
* u
* u
* u
* update
* update
* update
* delete unused code
* u
* u
* Update src/flow/src/adapter/worker.rs
Co-authored-by: dennis zhuang <killme2008@gmail.com>
* Update src/common/meta/src/key/flow/flow_state.rs
Co-authored-by: dennis zhuang <killme2008@gmail.com>
* Update src/common/meta/src/key/flow/flow_info.rs
Co-authored-by: dennis zhuang <killme2008@gmail.com>
* Update src/common/meta/src/key/flow/flow_state.rs
Co-authored-by: dennis zhuang <killme2008@gmail.com>
* Update src/common/meta/src/key/flow/flow_info.rs
Co-authored-by: dennis zhuang <killme2008@gmail.com>
* u
* u
* u
* u
* u
* u
* chore: fix sqlness
* chore: update proto
* fix: remove date time
* fix: update result of information_schema test
---------
Co-authored-by: dennis zhuang <killme2008@gmail.com>
Co-authored-by: discord9 <discord9@163.com>
* feat: add region follower manager
* feat: add region procedure
* refactor: make add, remove follower procedure look nice
* feat: add region follower procedure
* chore: undo some change, possibly made by AI
* feat: on prepare checking
* feat: on update metadata
* feat: on broadcast
* chore: unit test
* feat: add remove follower operation
* feat: add or remove region follower procedure
* chore: ut
* chore: rename
* chore: by comment
* chore: by comment
---------
Co-authored-by: jeremy <jeremy@greptime.local>
chore/move-wal-sync-to-bg:
### Refactor Log Store Task Management
- **Error Handling Enhancements**: Updated error handling for task management in `error.rs` by renaming `StartGcTask` and `StopGcTask` to `StartWalTask` and `StopWalTask`, respectively, and added a `name` field for more descriptive error messages.
- **Task Management Improvements**: Introduced `SyncWalTaskFunction` in `log_store.rs` to handle periodic synchronization of WAL tasks, replacing the previous atomic-based sync logic.
- **Backend Adjustments**: Modified `backend.rs` to use the new `StartWalTaskSnafu` for starting tasks, ensuring consistency with the updated error handling approach.
* feat: add predicate group
* feat: pass predicate group
* feat: memtable prune by time filters
* test: test PruneTimeIterator with time filters
* feat: push down returns exact for timestamp simple filters
---------
Co-authored-by: Ruihang Xia <waynestxia@gmail.com>
* fix: skip schema check to avoid schema mismatch brought by metadata
* docs: add some comment to remind me add that check back
* test: add sqlness case
* fix/skip-schema-check:
### Update CTE Test Cases
- **Added GRPC Latencies Test**: Introduced a new test case for GRPC latencies in `cte.result` and `cte.sql` under `standalone/common/cte`.
- **Removed Redundant Test Files**: Deleted `cte.result` and `cte.sql` under `standalone/common/range` as they were duplicates of the new test case.
* fix: other col alias to time index column handle
* test: update sqlness
* chore: per review
* test: more sqlness
* test: mv some to optimizer folder
* fix: resolve alias properly
* fix: also retain old name
* chore: remove wrong comment
* chore: fix sqlness
* test: standalone/dist more projection diff
* chore: resolve conflicts
* chore: merge main
* test: add compatibility test for DatanodeLeaseKey with missing cluster_id
* test: add compatibility test for DatanodeLeaseKey without cluster_id
* refactor/remove-cluster-id:
- **Update `greptime-proto` Dependency**: Updated the `greptime-proto` dependency in `Cargo.lock` and `Cargo.toml` to a new revision.
- **Remove `cluster_id` Usage**: Removed the `cluster_id` field and its related logic from various files, including `cluster.rs`, `datanode.rs`, `rpc.rs`, `adapter.rs`, `client.rs`, `ask_leader.rs`, `heartbeat.rs`, `procedure.rs`, `store.rs`, `handler.rs`, `response_header_handler.rs`, `key.rs`, `datanode.rs`, `lease.rs`, `metrics.rs`, `cluster.rs`, `heartbeat.rs`, `procedure.rs`, and `store.rs`.
- **Refactor Tests**: Updated tests in `client.rs`, `response_header_handler.rs`, `store.rs`, and `service` modules to reflect the removal of `cluster_id`.
* fix: clippy
* refactor/remove-cluster-id:
**Refactor and Cleanup in Meta Server**
- **`response_header_handler.rs`**: Removed unused import of `HeartbeatResponse` and cleaned up the test function by eliminating the creation of an unused `HeartbeatResponse` object.
- **`node_lease.rs`**: Simplified parameter handling in `HttpHandler` implementation by using an underscore for unused parameters.
* refactor/remove-cluster-id:
### Remove `TableMetadataAllocatorContext` and Refactor Code
- **Removed `TableMetadataAllocatorContext`**: Eliminated the `TableMetadataAllocatorContext` struct and its usage across multiple files, including `ddl.rs`, `create_table.rs`, `create_view.rs`, `table_meta.rs`, `test_util.rs`, `create_logical_tables.rs`, `drop_table.rs`, and `table_meta_alloc.rs`.
- **Refactored Function Signatures**: Updated function signatures to remove the `TableMetadataAllocatorContext` parameter in methods like `create`, `create_view`, and `alloc` in `table_meta.rs` and `table_meta_alloc.rs`.
- **Updated Imports**: Adjusted import statements to reflect the removal of `TableMetadataAllocatorContext` in affected files.
These changes simplify the codebase by removing an unnecessary context struct and updating related function calls.
* refactor/remove-cluster-id:
### Update `datanode.rs` to Modify Key Prefix
- **File Modified**: `src/common/meta/src/datanode.rs`
- **Key Changes**:
- Updated `DatanodeStatKey::prefix_key` and `From<DatanodeStatKey>` to remove the cluster ID from the key prefix.
- Adjusted comments to reflect the changes in key prefix handling.
* reformat code
* refactor/remove-cluster-id:
### Commit Summary
- **Refactor `Pusher` Initialization**: Removed the `RequestHeader` parameter from the `Pusher::new` method across multiple files, including `handler.rs`, `test_util.rs`, and `heartbeat.rs`. This change simplifies the `Pusher` initialization process by eliminating the unnecessary parameter.
- **Update Imports**: Adjusted import statements in `handler.rs` and `test_util.rs` to remove unused `RequestHeader` references, ensuring cleaner and more efficient code.
* chore: update proto
* feat: include trace v1 encoding
* feat: add trace ingestion in inserter
* feat: add partition rules and index for trace_id
* chore: format
* chore: fmt
* fix: issue introduced with merge
* feat: adjust index and add integration test for v1
* refactor: remove comment key
* fix: update default value of skip index granularity
* fix: update default value of skip index granularity
* refactor: rename some functions
* feat: remove skipping index from span_id
* refactor: made span_id part of primary key for potential dedup purpose
* feat: move the special attribute resource_attribute.service.name to top level
---------
Co-authored-by: shuiyisong <113876041+shuiyisong@users.noreply.github.com>
* fix/interval-cast-rewrite:
### Enhance Interval Parsing and Casting
- **`create_parser.rs`**: Added a test case `test_parse_interval_cast` to verify the parsing of interval casts.
- **`expand_interval.rs`**: Refactored interval casting logic to handle `CastKind` and `format` attributes. Removed the `create_interval` function and integrated its logic directly into the casting process.
- **`interval.result`**: Updated test results to reflect changes in interval representation, switching from `IntervalMonthDayNano` to `Utf8` format for interval operations.
* reformat code
fix/comment-in-cjk:
### Update `OptionMap` Formatting and Add Tests
- **Enhancements in `OptionMap`**:
- Changed formatting from `escape_default` to `escape_debug` for better handling of special characters in `src/sql/src/statements/option_map.rs`.
- Added unit tests to verify the new formatting behavior.
- **Test Cases for CJK Comments**:
- Added test cases for tables with comments in CJK (Chinese, Japanese, Korean) characters in `tests/cases/standalone/common/show/show_create.sql` and `show_create.result`.
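The difference between the two standard-library escapers is easy to see on CJK input. The helper names below are hypothetical. Only `str::escape_debug` and `str::escape_default` are real standard-library methods, and how `OptionMap` wires them into its formatting is not shown here.

```rust
/// Escapes an option value for display, keeping printable non-ASCII text readable.
pub fn escape_option_value(value: &str) -> String {
    value.escape_debug().to_string()
}

/// The earlier behaviour, for comparison: every non-ASCII char becomes `\u{...}`.
pub fn escape_option_value_ascii(value: &str) -> String {
    value.escape_default().to_string()
}
```

Both variants still escape control characters such as newlines. Only the treatment of printable non-ASCII characters differs.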
* fix/frontend-node-state: Refactor NodeInfoKey and Context Handling in Meta Server
• Removed unused cluster_id from NodeInfoKey struct.
• Updated HeartbeatHandlerGroup to return Context alongside HeartbeatResponse.
• Added current_node_info to Context for tracking node information.
• Implemented on_node_disconnect in Context to handle node disconnection events, specifically for Frontend roles.
• Adjusted register_pusher function to return PusherId directly.
• Updated tests to accommodate changes in Context structure.
* fix/frontend-node-state: Refactor Heartbeat Handler Context Management
Refactored the HeartbeatHandlerGroup::handle method to use a mutable reference for Context instead of passing it by value. This change simplifies the context management by eliminating the need to return the context with the response. Updated the Metasrv implementation to align with this new context handling approach, improving code clarity and reducing unnecessary context cloning.
* revert: clean cluster info on disconnect
* fix/frontend-node-state: Add Frontend Expiry Listener and Update NodeInfoKey Conversion
• Introduced FrontendExpiryListener to manage the expiration of frontend nodes, including its integration with leadership change notifications.
• Modified NodeInfoKey conversion to use references, enhancing efficiency and consistency across the codebase.
• Updated collect_cluster_info_handler and metasrv to incorporate the new listener and conversion changes.
• Added frontend_expiry module to the project structure for better organization and maintainability.
* chore: add config for node expiry
* add some doc
* fix: clippy
* fix/frontend-node-state:
### Refactor Node Expiry Handling
- **Configuration Update**: Removed `node_expiry_tick` from `metasrv.example.toml` and `MetasrvOptions` in `metasrv.rs`.
- **Module Renaming**: Renamed `frontend_expiry.rs` to `node_expiry_listener.rs` and updated references in `lib.rs`.
- **Code Refactoring**: Replaced `FrontendExpiryListener` with `NodeExpiryListener` in `node_expiry_listener.rs` and `metasrv.rs`, removing the tick interval and adjusting logic to use a fixed 60-second interval for node expiry checks.
* fix/frontend-node-state:
Improve logging in `node_expiry_listener.rs`
- Enhanced warning message to include peer information when an unrecognized node info key is encountered in `node_expiry_listener.rs`.
* docs: update config docs
* fix/frontend-node-state:
**Refactor Context Handling in Heartbeat Services**
- Updated `HeartbeatHandlerGroup` in `handler.rs` to pass `Context` by value instead of by mutable reference, allowing for more flexible context
management.
- Modified `Metasrv` implementation in `heartbeat.rs` to clone `Context` when passing to `handle` method, ensuring thread safety and consistency in
asynchronous operations.
* fix/reject-ddl-in-follower-metasrv:
Add leader check and logging for gRPC requests in `procedure.rs`
- Implemented leader verification for `query_procedure_state`, `ddl`, and `procedure_details` gRPC requests in `procedure.rs`.
- Added logging with `warn` for requests reaching a non-leader node.
- Introduced `ResponseHeader` and `Error::is_not_leader()` to handle non-leader responses.
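
A rough sketch of the leader check described above. The types here are simplified stand-ins, not the actual definitions in the meta server; only the names `ResponseHeader` and `is_not_leader` come from the note, and `handle_on_leader` is a hypothetical helper.

```rust
/// Hypothetical error type; only the `is_not_leader` check mirrors the note above.
#[derive(Debug, PartialEq)]
enum Error {
    NotLeader,
    Other(String),
}

impl Error {
    fn is_not_leader(&self) -> bool {
        matches!(self, Error::NotLeader)
    }
}

/// Hypothetical response header carrying an optional error.
#[derive(Debug, Default, PartialEq)]
struct ResponseHeader {
    error: Option<Error>,
}

/// Rejects a request up front when this node is not the leader,
/// otherwise runs the handler and returns its result.
fn handle_on_leader<T>(
    is_leader: bool,
    handler: impl FnOnce() -> T,
) -> Result<T, ResponseHeader> {
    if !is_leader {
        // The real handler logs with `warn`; stderr stands in for it here.
        eprintln!("received request on a non-leader node");
        return Err(ResponseHeader {
            error: Some(Error::NotLeader),
        });
    }
    Ok(handler())
}
```

A client that sees `is_not_leader()` on the returned header can retry against the current leader instead of treating the response as a hard failure.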
* fix/reject-ddl-in-follower-metasrv:
Improve leader address handling in `heartbeat.rs`
- Refactor leader address retrieval by renaming `leader` to `leader_addr` for clarity.
- Update `make_client` function to use a reference to `leader_addr`.
- Enhance logging to include the leader address in the success message for creating a heartbeat stream.
* fmt
* fix/reject-ddl-in-follower-metasrv:
**Enhance Leader Check in `procedure.rs`**
- Updated the leader verification logic in `procedure.rs` to return a failed `MigrateRegionResponse` when the server is not the leader.
- Added logging to warn when a migrate request is received by a non-leader server.
* perf: do not delete columns when drop logical region in drop database
Signed-off-by: yihong0618 <zouzou0208@gmail.com>
* fix: make ci happy
Signed-off-by: yihong0618 <zouzou0208@gmail.com>
* fix: address review comments
Signed-off-by: yihong0618 <zouzou0208@gmail.com>
* fix: address some comments
Signed-off-by: yihong0618 <zouzou0208@gmail.com>
* fix: drop stupid comments by copilot
Signed-off-by: yihong0618 <zouzou0208@gmail.com>
* chore: minor refactor
* chore: minor refactor
* chore: update greptime-proto
---------
Signed-off-by: yihong0618 <zouzou0208@gmail.com>
Co-authored-by: WenyXu <wenymedia@gmail.com>
* fix: use alias expr to check commutativity
* chore: debug sort
* feat: consider alias in window sort optimizer
* test: sqlness test
* test: update sqlness result
* TODO: snapshot read
* feat: RegionEngine get last seq
* feat: query context snapshot
* chore: use new proto
* feat: get_region_seqs in region engine
* chore: typo
* chore: toml
* feat: make snapshots modifiable
* feat: add hint for snapshot read
* chore: some typo
* refactor: remove hint as not used
* fix: use committed seqs
* refactor: remove sequences variant on RegionRequest
* refactor: per review
* chore: rebase solve conflict
* refactor: rm unused key
* chore: per review
* chore: per review
* feat(metric-engine): introduce batch alter request handling
* refactor: minor refactor
* refactor: push down filter to mito
* chore: apply suggestions from CR
* feat: handle filter for window sort
* test: sqlness filter test for window sort
* test: add test on tag column filter
* test: test for filter on ts
* test: update sqlness test
* feat: change cache policy for file cache
* feat: file cache run pending task after put
* feat: run pending task in put_dir
* feat: run pending task after stager recovered
* feat: purge recycle bin periodically
* feat: use lru policy for read cache
* feat: remove datetime type
* chore: fix unit test
* chore: add column test
* refactor: move create and alter validation to one place
* chore: minor refactor ut
* refactor: rename expr_factory to expr_helper
* chore: remove unnecessary args
fix: use fixed tonistiigi/binfmt:qemu-v7.0.0-28 image version instead of latest version to avoid segmentation fault
Co-authored-by: Yingwen <realevenyag@gmail.com>
ci: skip nightly ci jobs (#9)
(cherry picked from commit 345b4c30474f47a0477263bfba9894d7b4acda2d)
(cherry picked from commit dcd779cd668802fb1ea12fefb4dc3f83f34e30a2)
* refactor: rename grpc options
* refactor: make the arg clear
* chore: comments on server_addr
* chore: fix test
* chore: remove the store_addr alias
* refactor: cli option rpc_server_addr
* chore: keep store-addr alias
* chore: by comment
* fix: do not transform exprs in the limit plan
* chore: keep some logs for debug
* feat: workaround for limit in other rules
* test: add sqlness tests for offset 0
* chore: add fixme
* - **Refactored SST File Handling**:
- Introduced `FilePathProvider` trait and its implementations (`WriteCachePathProvider`, `RegionFilePathFactory`) to manage SST and index file paths.
- Updated `AccessLayer`, `WriteCache`, and `ParquetWriter` to use `FilePathProvider` for path management.
- Modified `SstWriteRequest` and `SstUploadRequest` to use path providers instead of direct paths.
- Files affected: `access_layer.rs`, `write_cache.rs`, `parquet.rs`, `writer.rs`.
- **Enhanced Indexer Management**:
- Replaced `IndexerBuilder` with `IndexerBuilderImpl` and made it async to support dynamic indexer creation.
- Updated `ParquetWriter` to handle multiple indexers and file IDs.
- Files affected: `index.rs`, `parquet.rs`, `writer.rs`.
- **Removed Redundant File ID Handling**:
- Removed `file_id` from `SstWriteRequest` and `CompactionOutput`.
- Updated related logic to dynamically generate file IDs where necessary.
- Files affected: `compaction.rs`, `flush.rs`, `picker.rs`, `twcs.rs`, `window.rs`.
- **Test Adjustments**:
- Updated tests to align with new path and indexer management.
- Introduced `FixedPathProvider` and `NoopIndexBuilder` for testing purposes.
- Files affected: `sst_util.rs`, `version_util.rs`, `parquet.rs`.
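
The idea behind the path provider can be sketched as follows. The trait and struct names come from the notes above, but the method names, the use of a string file ID, and the directory layout are assumptions made for illustration only.

```rust
/// Hypothetical shape of a path provider: maps a file ID to storage paths.
trait FilePathProvider {
    fn build_sst_file_path(&self, file_id: &str) -> String;
    fn build_index_file_path(&self, file_id: &str) -> String;
}

/// Resolves paths under a region directory (layout assumed for illustration).
struct RegionFilePathFactory {
    region_dir: String,
}

impl FilePathProvider for RegionFilePathFactory {
    fn build_sst_file_path(&self, file_id: &str) -> String {
        format!("{}/{}.parquet", self.region_dir.trim_end_matches('/'), file_id)
    }

    fn build_index_file_path(&self, file_id: &str) -> String {
        format!(
            "{}/index/{}.puffin",
            self.region_dir.trim_end_matches('/'),
            file_id
        )
    }
}

/// A writer only needs the provider; it can pick the file ID itself
/// and ask for the matching paths afterwards.
fn describe_output(provider: &dyn FilePathProvider, file_id: &str) -> (String, String) {
    (
        provider.build_sst_file_path(file_id),
        provider.build_index_file_path(file_id),
    )
}
```

Because the writer receives a provider rather than a fixed path, the file ID no longer has to be chosen by the caller, which matches the removal of `file_id` from `SstWriteRequest`.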
* chore: merge main
* refactor/generate-file-id-in-parquet-writer:
**Enhance Logging in Compactor**
- Updated `compactor.rs` to improve logging of compaction process.
- Added `itertools::Itertools` for efficient string joining.
- Moved logging of compaction inputs and outputs to the async block for better context.
- Enhanced log message to include both input and output file names for better traceability.
* refactor: support to flatten json object in greptime_identity pipeline
* refactor: add GreptimeIdentityPipelineParams to configure greptime_identity pipeline
* refactor: pass greptime identity pipeline params by one header kv
* refactor: code review
* refactor: make pipeline params more general for all internal pipelines
* chore: remove axum deps from pipeline
* fix: clippy errors
* chore: fix and add test
* test: adopt api change for test client
---------
Co-authored-by: shuiyisong <xixing.sys@gmail.com>
Co-authored-by: Ning Sun <sunng@protonmail.com>
Co-authored-by: Ning Sun <sunning@greptime.com>
* change dep
Signed-off-by: Ruihang Xia <waynestxia@gmail.com>
* feat: adapt to arrow's interval array
* chore: fix compile errors in datatypes crate
* chore: fix api crate compiler errors
* chore: fix compiler errors in common-grpc
* chore: fix common-datasource errors
* chore: fix deprecated code in common-datasource
* fix promql and physical plan related
Signed-off-by: Ruihang Xia <waynestxia@gmail.com>
* wip: upgrading network deps
Signed-off-by: Ruihang Xia <waynestxia@gmail.com>
* block on updating `sqlparser`
* upgrade sqlparser
Signed-off-by: Ruihang Xia <waynestxia@gmail.com>
* adapt new df's trait requirements
Signed-off-by: Ruihang Xia <waynestxia@gmail.com>
* chore: fix compiler errors in mito2
* chore: fix common-function crate errors
* chore: fix catalog errors
* change import path
Signed-off-by: Ruihang Xia <waynestxia@gmail.com>
* chore: fix some errors in query crate
* chore: fix some errors in query crate
* aggr expr and some other tiny fixes
Signed-off-by: Ruihang Xia <waynestxia@gmail.com>
* chore: fix expr related errors in query crate
* chore: fix query serializer and admin command
* chore: fix grpc services
* feat: axum serve
* chore: fix http server
* remove handle_error handler
* refactor timeout layer
* serve axum
* chore: fix flow aggr functions
* chore: fix flow
* feat: fix errors in meta-srv
* boxed()
* use TokioIo
* feat!: Remove script crate and python feature (#5321)
* feat: exclude script crate
* chore: simplify feature
* feat: remove the script crate
* chore: remove python feature and some comments
* chore: fix warning
* chore: fix servers tests compiler errors
* feat: fix tests-integration errors
* chore: fix unused
* test: fix catalog test
* chore: fix compiler errors for crates using common-meta
The testing feature is enabled when checking with --workspace.
* test: use display for logical plan test
* test: implement rewrite for ScanHintRule
* fix: http server build panic
* test: fix mito test
* fix: sql parser type alias error
* test: fix TestClient not listen
* test: some flow tests
* test(flow): more fix
* fix: test_otlp_logs
* test: fix promql test that uses deprecated method fun()
* fix: sql type replace supports Int8 ~ Int64, UInt8 ~ UInt64
* test: fix infer schema test case
* test: fix tests related to plan display
* chore: fix last flow test
* test: fix function format related assertion
* test: use larger port range for tests
* fix: test_otlp_traces
* fix: test_otlp_metrics
* fix range query and dist plan
Signed-off-by: Ruihang Xia <waynestxia@gmail.com>
* fix: flow handle distinct use deprecated field
* fix: can't pass Join plan expressions to LogicalPlan::with_new_exprs
* test: fix deserialize test
* test: reduce split key case num
* tests: lower case aggr func name
* test: fix some sqlness tests
* tests: more sqlness fix
* tests: fixed sqlness test
* commit non-bug changes
Signed-off-by: Ruihang Xia <waynestxia@gmail.com>
* fix: make our udf correct
* fix: implement empty methods of ContextProvider for DfContextProviderAdapter
* test: update sqlness test result
* chore: remove unused
* fix: provide alias name for AggregateExprBuilder in range plan
* test: update range query result
* fix: implement missing ContextProvider methods for DfContextProviderAdapter
* test: update timestamps, cte result
* fix: supports empty projection in mito
* test: update comment for cte test
* fix: support projection for numbers
* test: update test cases after projection fix
* fix: fix range select first_value/last_value
* fix: handle CAST and time index conflict
* fix: handle order by correctly in range first_value/last_value
* test: update sqlness result
* test: update view test result
* test: update decimal test
wait for https://github.com/apache/datafusion/pull/14126 to fix this
* feat: remove redundant physical optimization
todo(ruihang): Check if we can remove this.
* test: update sqlness test result
* chore: range select default sort use nulls_first = false
* test: update filter push down test result
* test: comment out decimal test to avoid different panic message
* test: update some distributed test result
* test: update test for distributed count and filter push down
* test: update subqueries test
* fix: SessionState may overwrite our UDFs
* chore: fix compiler errors after merging main
* fix: fix elasticsearch and dashboard router panic
* chore: fix common-functions tests
* chore: update sqlness result
* test: fix id keyword and update sqlness result
* test: fix flow_null test
* fix: enlarge thread size in debug mode to avoid overflow
* chore: fix warnings in common-function
* chore: fix warning in flow
* chore: fix warnings in query crate
* chore: remove unused warnings
* chore: fix deprecated warnings for parquet
* chore: fix deprecated warning in servers crate
* style: fix clippy
* test: enlarge ttl time in mito cache ttl test
* chore: fix typo
* style: fmt toml
* refactor: reimplement PartialOrd for RangeSelect
* chore: remove script crate files introduced by merge
* fix: return error if sql option is not kv
* chore: do not use ..default::default()
* chore: per review
* chore: update error message in BuildAdminFunctionArgsSnafu
Co-authored-by: jeremyhi <jiachun_feng@proton.me>
* refactor: typed precision
* update sqlness view case
Signed-off-by: Ruihang Xia <waynestxia@gmail.com>
* chore: flow per review
* chore: add example in comment
* chore: warn if parquet stats of timestamp is not INT64
* style: add a newline before derive to make the comment more clear
* test: update sqlness result
* fix: flow from substrait
* chore: change update_range_context log to debug level
* chore: move axum-extra axum-macros to workspace
---------
Signed-off-by: Ruihang Xia <waynestxia@gmail.com>
Co-authored-by: Ruihang Xia <waynestxia@gmail.com>
Co-authored-by: luofucong <luofc@foxmail.com>
Co-authored-by: discord9 <discord9@163.com>
Co-authored-by: shuiyisong <xixing.sys@gmail.com>
Co-authored-by: jeremyhi <jiachun_feng@proton.me>
* fix/avoid-suppress-manual-compaction:
**Refactor Compaction Logic**
- Removed `PendingCompaction` struct and integrated its functionality directly into `CompactionStatus` in `compaction.rs`.
- Simplified waiter management by consolidating waiter handling logic into `CompactionStatus`.
- Updated `CompactionRequest` creation to directly handle waiters without intermediate structures.
- Adjusted test cases in `compaction.rs` to align with the new waiter management approach.
(cherry picked from commit 87e2d1c2cc9bd82c02991d22e429bef25c5ee348)
* fix/avoid-suppress-manual-compaction:
### Add Support for Manual Compaction Requests
- **Compaction Logic Enhancements**:
- Updated `CompactionScheduler` in `compaction.rs` to handle manual compaction requests using `Options::StrictWindow`.
- Introduced `PendingCompaction` struct to manage pending manual compaction requests.
- Added logic to reschedule manual compaction requests once the current compaction task is completed.
- **Testing**:
- Added `test_manual_compaction_when_compaction_in_progress` to verify the handling of manual compaction requests during ongoing compaction processes.
These changes enhance the compaction scheduling mechanism by allowing manual compaction requests to be queued and processed efficiently.
(cherry picked from commit bc38ed0f2f8ba2c4690e0d0e251aeb2acce308ca)
* chore: fix conflicts
* fix/avoid-suppress-manual-compaction:
### Add Error Handling for Manual Compaction Override
- **`compaction.rs`**: Enhanced the `set_pending_request` method to handle manual compaction overrides by sending an error to the waiter if a previous request exists.
- **`error.rs`**: Introduced a new error variant `ManualCompactionOverride` to represent manual compaction being overridden, and mapped it to the `Cancelled` status code.
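
A minimal sketch of the override behaviour, assuming a plain `std::sync::mpsc` channel as the waiter. The real waiter type, fields, and error plumbing differ; `PendingCompaction`, `CompactionStatus`, `set_pending_request`, and `ManualCompactionOverride` are names from these notes, while everything else is simplified.

```rust
use std::sync::mpsc::Sender;

/// Simplified error enum; only the variant name comes from the note above.
#[derive(Debug, PartialEq)]
enum CompactionError {
    ManualCompactionOverride,
}

/// A pending manual compaction and the channel its caller is waiting on.
struct PendingCompaction {
    waiter: Sender<Result<(), CompactionError>>,
}

#[derive(Default)]
struct CompactionStatus {
    pending: Option<PendingCompaction>,
}

impl CompactionStatus {
    /// Stores the new pending request. If an earlier request was still
    /// waiting, its waiter is notified that it has been overridden.
    fn set_pending_request(&mut self, pending: PendingCompaction) {
        if let Some(prev) = self.pending.replace(pending) {
            // The receiver may already be gone; ignoring the send error is fine here.
            let _ = prev
                .waiter
                .send(Err(CompactionError::ManualCompactionOverride));
        }
    }
}
```

Only the most recent manual request stays pending; the earlier caller gets an explicit error instead of waiting indefinitely.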
* fix: format
* fix/avoid-suppress-manual-compaction:
**Add Error Handling for Pending Compaction Requests**
- Enhanced error handling in `compaction.rs` by adding logic to handle errors for pending compaction requests.
- Introduced a mechanism to send errors using `waiter.send` when a pending compaction request fails, ensuring proper error propagation and context with `CompactRegionSnafu`.
* fix/avoid-suppress-manual-compaction:
**Fix Typo and Simplify Code Logic in `compaction.rs`**
- Corrected a typo in the license comment from "langucage" to "language".
- Simplified the logic for handling `pending_compaction` in `CompactionStatus` by removing unnecessary pattern matching and directly accessing `waiter`.
* fix: typo
* feat: make instant_query and range_query support not-equal matchers
* feat: impl query_metric_names
* feat: forgot some files and refactor
* chore: test and docs
* fix: typo
Co-authored-by: Ruihang Xia <waynestxia@gmail.com>
* refactor: parse_query
* chore: improve test
* fix: use current catalog to query information_schema
---------
Co-authored-by: Ruihang Xia <waynestxia@gmail.com>
* fix: vector function for PromQL needs to ignore the time index, also close #5392
Signed-off-by: yihong0618 <zouzou0208@gmail.com>
* fix: do not affect scalar function
Signed-off-by: yihong0618 <zouzou0208@gmail.com>
* fix: better name for it
Signed-off-by: yihong0618 <zouzou0208@gmail.com>
---------
Signed-off-by: yihong0618 <zouzou0208@gmail.com>
* refactor: use MetadataKey
* fix: match all prefix
* refactor: introduce TopicPool
* fix: fix test, some rename
* test: add unit test for legacy restore
* fix: add _ between prefix and topic id
* chore: readable legacy topics
* refactor: a refactor
* Apply suggestions from code review
* Apply suggestions from code review
* refactor: introduce TopicPool
* fix: fix unit test
* chore: fix unit test and add some comments
* fix: fix unit test
* refactor: just refactor
* refactor: rename
* chore: rename, comments and remove unnecessary clone
* chore/change-authorization-header:
### Add Custom Authorization Header Support
- **Files Modified**: `http.rs`, `authorize.rs`, `authorize.rs` (tests)
- **Key Changes**:
- Introduced a custom authorization header `x-greptime-auth` in `http.rs`.
- Updated authorization logic in `authorize.rs` to support both `x-greptime-auth` and the standard `Authorization` header.
- Enhanced test cases in `authorize.rs` to validate the new custom header functionality.
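
A sketch of the dual-header lookup, assuming headers are stored in a map with lower-cased names. The precedence shown (custom header first, then the standard one) is an assumption for illustration; the notes above do not state which header wins when both are present, and `extract_credentials` is a hypothetical helper.

```rust
use std::collections::HashMap;

/// Custom header name introduced by this change.
const GREPTIME_AUTH_HEADER: &str = "x-greptime-auth";
/// Standard HTTP header, lower-cased for map lookup.
const AUTHORIZATION_HEADER: &str = "authorization";

/// Returns the credential string from `x-greptime-auth` if present,
/// otherwise falls back to `Authorization`. Precedence is assumed.
fn extract_credentials(headers: &HashMap<String, String>) -> Option<&str> {
    headers
        .get(GREPTIME_AUTH_HEADER)
        .or_else(|| headers.get(AUTHORIZATION_HEADER))
        .map(String::as_str)
}
```

A custom header is useful when a proxy in front of the server consumes or rewrites the standard `Authorization` header.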
* chore: add more tests
chore/change-default-compaction-output-size-limit:
### Update `TwcsOptions` Default Configuration
- Modified the default value of `max_output_file_size` in `TwcsOptions` to `Some(ReadableSize::gb(2))` in `src/mito2/src/region/options.rs`.
* feat: use time window in compaction options for compaction window
* test: add tests for overwriting options
* chore: typo
* chore: fix a grammar issue in log
This patch supports pg_database for pg_catalog. It also adds a query replacement
in fixtures.rs, because DataFusion does not support SQL like
'select 1,1;'. See issue #5344 for more details.
Signed-off-by: yihong0618 <zouzou0208@gmail.com>
* feat: introduce `PrimaryKeyEncoding`
* fix: fix unit tests
* chore: add empty line
* test: add unit tests
* chore: fmt code
* refactor: introduce new codec trait to support various encoding
* fix: fix unit tests
* chore: update sqlness result
* chore: apply suggestions from CR
* chore: apply suggestions from CR
* feat: more workers
* feat: use round robin
* refactor: per review
* refactor: per bot review
* chore: per review
* docs: example
* docs: update config.md
* docs: update
* chore: per review
* refactor: set workers to cpu/2.max(1)
* fix: flow config in standalone mode
* test: fix config test
* docs: update docs&opt name
* chore: update config.md
* refactor: per review, sanitize at top
* chore: per review
* chore: config.md
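
The worker sizing and round-robin dispatch mentioned in this series can be sketched as below. The type and function names are hypothetical; only the `cpu / 2` with a floor of 1 and the round-robin policy come from the commit subjects.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};

/// Half of the available CPUs, but never fewer than one worker.
fn default_num_workers(cpus: usize) -> usize {
    (cpus / 2).max(1)
}

/// Hands out worker indices in rotation.
struct RoundRobin {
    next: AtomicUsize,
    num_workers: usize,
}

impl RoundRobin {
    fn new(num_workers: usize) -> Self {
        assert!(num_workers > 0, "at least one worker is required");
        Self {
            next: AtomicUsize::new(0),
            num_workers,
        }
    }

    /// Returns the index of the worker that should take the next task.
    fn pick(&self) -> usize {
        self.next.fetch_add(1, Ordering::Relaxed) % self.num_workers
    }
}
```

Note the parentheses in `(cpus / 2).max(1)`: the floor has to apply to the quotient, so a single-core machine still gets one worker.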
* refactor(elasticsearch): use `_index` as greptimedb table in log ingestion and add `/${index}/_bulk` API
Signed-off-by: zyy17 <zyylsxm@gmail.com>
* refactor: code review
---------
Signed-off-by: zyy17 <zyylsxm@gmail.com>
* fix: drop all python embedding code for docker and doc
Signed-off-by: yihong0618 <zouzou0208@gmail.com>
* fix: address comments drop the leftover python
Signed-off-by: yihong0618 <zouzou0208@gmail.com>
---------
Signed-off-by: yihong0618 <zouzou0208@gmail.com>
* feat: support `select session_user;`
This commit is part of supporting DBeaver: it supports the function
`select session_user`, as Postgres does.
Signed-off-by: yihong0618 <zouzou0208@gmail.com>
* fix: lint problem
Signed-off-by: yihong0618 <zouzou0208@gmail.com>
* fix: address comments add tests
Signed-off-by: yihong0618 <zouzou0208@gmail.com>
---------
Signed-off-by: yihong0618 <zouzou0208@gmail.com>
* test: optimize out partition split insert requests if there is only one region
* Now that the optimization for single region insert has been lifted up, the original "fast path" can be obsoleted.
* resolve PR comments
* feat(flow): (Part 1) refill utils
* chore: after rebase fix
* chore: more rebase
* rm refill.rs to reduce pr size
* chore: simpler args
* refactor: per review
* docs: more explain for instant requests
* refactor: per review
* fix: drop unused dep using udeps to minimize the size
Signed-off-by: yihong0618 <zouzou0208@gmail.com>
* fix: address comments fix the problem
Signed-off-by: yihong0618 <zouzou0208@gmail.com>
---------
Signed-off-by: yihong0618 <zouzou0208@gmail.com>
* feat: impl COPY a query resultset to external file
* chore: add more tests for parse `copy_table_to`
* chore: add more tests for parse `copy_table_to`
* feat: txn for pg kv backend
* chore: clippy
* fix: txn uses one client
* test: clean up and txn test
* test: clean up
* test: change lock_id to avoid conflict in test
* test: use different prefix in pg election test
* fix(test): just a fix
* test: aggregate multiple tests to avoid concurrency problem
* test: use uuid instead of rng
* perf: batch cmp in txn
* perf: batch same op in txn
chore/suppress-list-warning:
### Update logging level in `intermediate.rs`
- Changed logging level from `warn` to `debug` for unexpected directory entries in index creation.
- Added `debug` to the `common_telemetry` import to support the logging level change.
* test: test adding existing columns
* chore: add more checks to AlterKind
* chore: update logs
* fix: check and build table info first
* feat: Add add_if_not_exists flag to alter expr
* feat: skip existing columns when building alter kind
* checks in make_region_alter_kind()
* reuse the alter kind
* test: fix tests in common-meta
* chore: fix typos
* chore: update comments
* feat: update partition duration of memtable using compaction window
* chore: only use provided duration if it is not None
* test: more tests
* test: test compaction apply window
* style: fix clippy
* feat: init PgElection
fix: release advisory lock
fix: handle duplicate keys
chore: update comments
fix: unlock if acquired the lock
chore: add TODO and avoid unwrap
refactor: check both lock and expire time, add more comments
chore: fmt
fix: deal with multiple edge cases
feat: init PgElection with candidate registration
chore: fmt
chore: remove
* test: add unit test for pg candidate registration
* test: add unit test for pg candidate registration
* chore: update pg env
* chore: make ci happy
* fix: spawn a background connection thread
* chore: typo
* fix: shadow the election client for now
* fix: fix ci
* chore: readability
* chore: follow review comments
* refactor: use kvbackend for pg election
* chore: rename
* chore: make clippy happy
* refactor: use pg server time instead of local ones
* chore: typo
* chore: rename infancy to leader_infancy for clarification
* chore: clean up
* chore: follow review comments
* chore: follow review comments
* ci: unit test should test all features
* ci: fix
* ci: just test pg
* wip: row group reader base
* wip: memtable row group reader
* Refactor MemtableRowGroupReader to streamline data fetching
- Added early return when fetch_ranges is empty to optimize performance.
- Replaced inline chunk data assignment with a call to `assign_dense_chunk` for cleaner code.
* wip: row group reader
* wip: reuse RowGroupReader
* wip: bulk part reader
* Enhance BulkPart Iteration with Filtering
- Introduced `RangeBase` to `BulkIterContext` for improved filter handling.
- Implemented filter application in `BulkPartIter` to prune batches based on predicates.
- Updated `SimpleFilterContext::new_opt` to be public for broader access.
* chore: add prune test
* fix: clippy
* fix: introduce prune reader for memtable and add more prune test
* Enhance BulkPart read method to return Option<BoxedBatchIterator>
- Modified `BulkPart::read` to return `Option<BoxedBatchIterator>` to handle cases where no row groups are selected.
- Added logic to return `None` when all row groups are filtered out.
- Updated tests to handle the new return type and added a test case to verify behavior when no row groups match the pr
* refactor/separate-paraquet-reader: Add helper function to parse parquet metadata and integrate it into BulkPartEncoder
* refactor/separate-paraquet-reader:
Change BulkPartEncoder row_group_size from Option to usize and update tests
* refactor/separate-paraquet-reader: Add context module for bulk memtable iteration and refactor part reading
• Introduce context module to encapsulate context for bulk memtable iteration.
• Refactor BulkPart to use BulkIterContextRef for reading operations.
• Remove redundant code in BulkPart by centralizing context creation and row group pruning logic in the new context module.
• Create new file context.rs with structures and logic for handling iteration context.
• Adjust part_reader.rs and row_group_reader.rs to reference the new BulkIterContextRef.
* refactor/separate-paraquet-reader: Refactor RowGroupReader traits and implementations in memtable and parquet reader modules
• Rename RowGroupReaderVirtual to RowGroupReaderContext for clarity.
• Replace BulkPartVirt with direct usage of BulkIterContextRef in MemtableRowGroupReader.
• Simplify MemtableRowGroupReaderBuilder by directly passing context instead of creating a BulkPartVirt instance.
• Update RowGroupReaderBase to use context field instead of virt, reflecting the trait renaming and usage.
• Modify FileRangeVirt to FileRangeContextRef and adjust implementations accordingly.
* refactor/separate-paraquet-reader: Refactor column page reader creation and remove unused code
• Centralize creation of SerializedPageReader in RowGroupBase::column_reader method.
• Remove unused RowGroupCachedReader and related code from MemtableRowGroupPageFetcher.
• Eliminate redundant error handling for invalid column index in multiple places.
* chore: rebase main and resolve conflicts
* fix: some comments
* chore: resolve conflicts
* chore: resolve conflicts
* chore: improve nix-shell support
* fix: add pkg-config
* ci: add a github action to ensure build on clean system
* ci: optimise dependencies of task
* ci: move clean build to nightly
Add ORDER BY clause to subquery union tests
Updated the SQL and result files for subquery union tests to include an ORDER BY clause, ensuring consistent result ordering. This change aligns with the test case from the DuckDB repository.
* feat: do not remove time filters
* chore: remove `time_range` from parquet reader
* chore: print more message in the check script
* chore: fix unused error
* perf/avoid-holding-memtable-during-compaction: Refactor Compaction Version Handling
• Introduced CompactionVersion struct to encapsulate region version details for compaction, removing dependency on VersionRef.
• Updated CompactionRequest and CompactionRegion to use CompactionVersion.
• Modified open_compaction_region to construct CompactionVersion without memtables.
• Adjusted WindowedCompactionPicker to work with CompactionVersion.
• Enhanced flush logic in WriteBufferManager to improve memory usage checks and logging.
* reformat code
* chore: change log level
* reformat code
---------
Co-authored-by: Yingwen <realevenyag@gmail.com>
* feat: simple version switch
* chore: remove debug print
* chore: add common folder
* tests: add drop table
* feat: pull versioned binary
* chore: don't use native-tls
* chore: rm outdated docs
* chore: new line
* fix: save old bin dir
* fix: switch version restart all node
* feat: use etcd
* fix: wait for election
* fix: normal sqlness
* refactor: hashmap for bin dir
* test: past 3 major version compat create table
* refactor: allow using without setup etcd
* add metrics
* chore/bench-metrics: Add INFLIGHT_FLUSH_COUNT Metric to Flush Process
• Introduced INFLIGHT_FLUSH_COUNT metric to track the number of ongoing flush operations.
• Incremented INFLIGHT_FLUSH_COUNT in FlushScheduler to monitor active flushes.
• Removed redundant increment of INFLIGHT_FLUSH_COUNT in RegionWorkerLoop to prevent double counting.
* chore/bench-metrics: Add Metrics for Compaction and Flush Operations
• Introduced INFLIGHT_COMPACTION_COUNT and INFLIGHT_FLUSH_COUNT metrics to track the number of ongoing compaction and flush operations.
• Incremented INFLIGHT_COMPACTION_COUNT when scheduling remote and local compaction jobs, and decremented it upon completion.
• Added INFLIGHT_FLUSH_COUNT increment and decrement logic around flush tasks to monitor active flush operations.
• Removed redundant metric updates in worker.rs and handle_compaction.rs to streamline metric handling.
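
One way to keep the increment and decrement paired is an RAII guard, sketched here with a plain atomic standing in for the real gauge. This is an illustrative technique, not necessarily how the metrics are updated in the code base; `InflightGuard` is a hypothetical name.

```rust
use std::sync::atomic::{AtomicI64, Ordering};

/// Stand-in for the gauge; the real metric is a telemetry gauge, not an atomic.
static INFLIGHT_FLUSH_COUNT: AtomicI64 = AtomicI64::new(0);

/// Increments the counter on creation and decrements it on drop,
/// so early returns and errors cannot leave the count too high.
struct InflightGuard(&'static AtomicI64);

impl InflightGuard {
    fn new(gauge: &'static AtomicI64) -> Self {
        gauge.fetch_add(1, Ordering::Relaxed);
        Self(gauge)
    }
}

impl Drop for InflightGuard {
    fn drop(&mut self) {
        self.0.fetch_sub(1, Ordering::Relaxed);
    }
}
```

Creating the guard in exactly one place (for example, where the task is scheduled) also avoids the double counting that the first commit in this series removed.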
* chore: add metrics for remote compaction jobs
* chore: format
* chore: also add dashboard
* feat: cache inverted index by page instead of file
* fix: add unit test and fix bugs
* chore: typo
* chore: ci
* fix: math
* chore: apply review comments
* chore: renames
* test: add unit test for index key calculation
* refactor: use ReadableSize
* feat: add config for inverted index page size
* chore: update config file
* refactor: handle multiple range read and fix some related bugs
* fix: add config
* test: turn to a fs reader to match behaviors of object store
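
Caching by page means a byte range read has to be mapped onto fixed-size pages. The following is a minimal sketch of that arithmetic, assuming pages are numbered from zero and aligned to `page_size`; the function name and the exact key layout are hypothetical.

```rust
/// Returns the indices of all pages touched by the byte range
/// `[offset, offset + size)`, given a fixed page size.
fn page_indices(offset: u64, size: u64, page_size: u64) -> Vec<u64> {
    assert!(page_size > 0, "page size must be non-zero");
    if size == 0 {
        return Vec::new();
    }
    let first = offset / page_size;
    // The last byte of the range is at `offset + size - 1`.
    let last = (offset + size - 1) / page_size;
    (first..=last).collect()
}
```

A cache keyed by (file, page index) can then serve a range read from one or more cached pages and fetch only the missing ones.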
* chore: decide tag column in log api follow table schema if table exists
* chore: add more test for greptime_identity pipeline
* chore: change pipeline get_table function signature
* chore: change identity_pipeline_inner tag_column_names type
* feat(fuzz): add set table options to alter fuzzer
* chore: clippy is happy, I'm sad
* chore: happy ci happy
* fix: unit test
* feat(fuzz): add unset table options to alter fuzzer
* fix: unit test
* feat(fuzz): add table option validator
* fix: make clippy happy
* chore: add comments
* chore: apply review comments
* fix: unit test
* feat(fuzz): add more ttl options
* fix: #5108
* chore: add comments
* chore: add comments
* feat: assign partition ranges by rows
* feat: balance partition rows
* feat: get upper bound for part nums
* feat: only split in non-compaction seq scan
* fix: parallel scan on multiple sources
* fix: can split check
* feat: scanner prepare by request
* feat: remove scan_parallelism
* docs: update docs
* chore: update comment
* style: fix clippy
* feat: skip merge and dedup if there is only one source
* chore: Revert "feat: skip merge and dedup if there is only one source"
Since memtable won't do dedup jobs
This reverts commit 2fc7a54b11.
* test: avoid compaction in sqlness window sort test
* chore: do not create semaphore if num partitions is enough
* chore: more assertions
* chore: fix typo
* fix: compaction flag not set
* chore: address review comments
* feat: ttl zero filter
* refactor: use TimeToLive enum
* fix: unit test
* tests: sqlness
* refactor: Option<TTL> None means UNSET
* tests: sqlness
* fix: 10000 years --> forever
* chore: minor refactor from reviews
* chore: rename back TimeToLive
* refactor: split imme request from normal requests
* fix: use correct lifetime
* refactor: rename immediate to instant
* tests: flow sink table default ttl
* refactor: per review
* tests: sqlness
* fix: ttl alter to instant
* tests: sqlness
* refactor: per review
* chore: per review
* feat: add db ttl type&forbid instant for db
* tests: more unit test
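
The TTL semantics in this series can be summarised with a small sketch. The enum name `TimeToLive` and the instant/forever concepts come from the commit subjects; the variant names, the `is_expired` helper, and the error type are assumptions for illustration.

```rust
use std::time::Duration;

/// Simplified TTL: `Option<TimeToLive>::None` is treated as "unset".
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum TimeToLive {
    /// Data is dropped immediately after it is processed.
    Instant,
    /// Data never expires.
    Forever,
    /// Data expires once it is older than the given duration.
    After(Duration),
}

impl TimeToLive {
    /// Whether a row of the given age has expired under this TTL.
    fn is_expired(&self, age: Duration) -> bool {
        match self {
            TimeToLive::Instant => true,
            TimeToLive::Forever => false,
            TimeToLive::After(ttl) => age > *ttl,
        }
    }
}

/// Databases may not use the instant TTL; anything else passes through.
fn validate_database_ttl(ttl: Option<TimeToLive>) -> Result<Option<TimeToLive>, String> {
    match ttl {
        Some(TimeToLive::Instant) => {
            Err("instant TTL is not allowed for a database".to_string())
        }
        other => Ok(other),
    }
}
```

Modelling "forever" as its own variant avoids magic values such as a 10000-year duration.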
* fix: use SchemaCache to locate database metadata
* main:
Refactor SchemaMetadataManager to use TableInfoCacheRef
- Replace TableInfoManagerRef with TableInfoCacheRef in SchemaMetadataManager
- Update DatanodeBuilder to pass TableInfoCacheRef to SchemaMetadataManager
- Rename error MissingCacheRegistrySnafu to MissingCacheSnafu in datanode module
- Adjust tests to use new mock_schema_metadata_manager with TableInfoCacheRef
* fix/schema-cache-invalidation: Add cache module and integrate cache registry into datanode
• Implement build_datanode_cache_registry function to create cache registry for datanode
• Integrate cache registry into datanode by modifying DatanodeBuilder and HeartbeatTask
• Refactor InvalidateTableCacheHandler to InvalidateCacheHandler and move to common-meta crate
• Update Cargo.toml to include cache as a dev-dependency for datanode
• Adjust related modules (flownode, frontend, tests-integration, standalone) to use new cache handler and registry
• Remove obsolete handler module from frontend crate
* fix: fuzz imports
* chore: add some doc for cache builder functions
* refactor: change table info cache to table schema cache
* fix: remove unused variants
* fix fuzz
* chore: apply suggestion
Co-authored-by: Weny Xu <wenymedia@gmail.com>
* chore: apply suggestion
Co-authored-by: Weny Xu <wenymedia@gmail.com>
* fix: compile
---------
Co-authored-by: dennis zhuang <killme2008@gmail.com>
Co-authored-by: Weny Xu <wenymedia@gmail.com>
* feat: add cache for schema options
* fix/use-cache-kv-manager: Add cache invalidation handling to Datanode's heartbeat task
• Implement InvalidateSchemaCacheHandler in heartbeat.rs to handle cache invalidation instructions.
• Update HeartbeatTask constructor to accept cached_kv_backend and pass it to InvalidateSchemaCacheHandler.
• Modify DatanodeBuilder to clone cached_kv_backend when creating schema_metadata_manager.
• Refactor MetasrvCacheInvalidator in cache_invalidator.rs to reuse MailboxMessage for broadcasting to different channels.
* fix: only remove schema related cache entries
* chore: add more tests
* fix/use-cache-kv-manager: Moved InvalidateSchemaCacheHandler to a separate module
• Extracted InvalidateSchemaCacheHandler and associated tests into a new file cache_invalidator.rs
• Removed async_trait and CacheInvalidator related code from heartbeat.rs
• Added cache_invalidator module declaration in handler.rs
* fix: unit tests
* fix/use-cache-kv-manager:
Standardize TODO comment format in CachedKvBackend txn method
* Update src/datanode/src/heartbeat/handler/cache_invalidator.rs
* Update src/datanode/src/heartbeat/handler/cache_invalidator.rs
* Update src/datanode/src/heartbeat/handler/cache_invalidator.rs
---------
Co-authored-by: jeremyhi <jiachun_feng@proton.me>
* fix/metric-metadata-region-options: Remove APPEND_MODE_KEY and refactor TTL option handling in MetricEngineInner
* fix/metric-metadata-region-options: Refactor metadata region options into a shared function
• Extract metadata region options into region_options_for_metadata_region function
• Replace inline options map with a call to the new shared function in both create.rs and open.rs files
* fix: exclude typos
* fix/metric-metadata-region-options:
Refactor metadata region options to accept original options and remove APPEND_MODE_KEY
* feat: prune in each partition
* chore: change pick log to trace
* chore: add in progress partition scan to metrics
* feat: seqscan support pruning in partition
* chore: remove commented codes
* feat: Replace flow
* refactor: better show create flow&tests: better check
* tests: sqlness result update
* tests: unit test for update
* refactor: cmp with raw bytes
* refactor: rename
* refactor: per review
* support set and show on statement/execution timeout session variables.
* implement statement timeout for mysql read, and postgres queries
* add mysql test with max execution time
* tests: more flow testcase
* tests(WIP): more tests
* tests: more flow tests
* test: weird regex for sqlness
* refactor: put blog&example to two files
* fix: result of nulls
* update test result
* fix null behaviors, add null tests
* update NULL tests
* error handler when parsing json_path
* change the logic to: items' datatype in the input arrays are all the same.
* remove a comment
* refactor: better logic
* drop unnecessary err check
* added an error test case
* main:
Add common-meta dependency and implement SchemaMetadataManager
- Introduce `common-meta` as a new dependency in `mito2`.
- Implement `SchemaMetadataManager` for managing schema-level metadata.
- Update `DatanodeBuilder` and `MitoEngine` to pass `KvBackendRef` for schema metadata management.
- Add `SchemaMetadataManager` to `RegionWorkerLoop` for compaction handling.
- Include `SchemaNameKey` usage in compaction-related code.
- Add `database_metadata_manager` module with `SchemaMetadataManager` struct and associated logic.
* fix/database-base-ttl:
Refactor metadata management and update compaction logic
- Remove `database_metadata_manager` and introduce `schema_metadata_manager`
- Update compaction logic to handle TTL based on schema metadata
- Adjust tests to use `schema_metadata_manager` for setting up schema options
- Fix engine creation in tests to pass `kv_backend` explicitly
- Remove unused imports and apply minor code cleanups
* fix/database-base-ttl:
Extend CREATE TABLE LIKE to inherit schema options
- Implement inheritance of database level options for CREATE TABLE LIKE
- Add schema options to SHOW CREATE TABLE output
- Refactor create_table_stmt to include schema_options in SQL generation
- Update error handling to include TableMetadataManagerSnafu
* fix/database-base-ttl:
Refactor error handling and remove schema dependency in table creation
- Replace expect with the ? operator for error handling in open_compaction_region
- Simplify create_logical_tables by removing catalog and schema name parameters
- Remove unnecessary schema retrieval and merging of schema options in create_table_info
- Clean up unused imports and redundant code
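The switch from `expect` to the `?` operator described above can be illustrated with a minimal sketch. The function names below are hypothetical and are not taken from `open_compaction_region`; the point is only that the error reaches the caller instead of aborting the process.

```rust
use std::num::ParseIntError;

/// Before: malformed input panics inside the helper (hypothetical example).
#[allow(dead_code)]
fn region_id_or_panic(raw: &str) -> u64 {
    raw.trim().parse().expect("region id must be a number")
}

/// After: `?` returns the parse error to the caller, who decides how to react.
fn region_id(raw: &str) -> Result<u64, ParseIntError> {
    let id = raw.trim().parse::<u64>()?;
    Ok(id)
}
```

With the second form a caller can log the failure or map it into its own error type rather than crashing.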
* fix/database-base-ttl:
Refactor error handling and update documentation comments
- Update comment to reflect retrieval of schema options instead of metadata
- Introduce new error type `GetSchemaMetadataSnafu` for schema metadata retrieval failures
- Implement error handling for schema metadata retrieval in `find_ttl` function
* fix: toml
* fix/database-base-ttl:
Refactor SchemaMetadataManager and adjust Cargo.toml dependencies
- Remove unused imports in schema_metadata_manager.rs
- Add conditional compilation for SchemaMetadataManager::new
- Update Cargo.toml to remove "testing" feature from common-meta dependency in main section and add it to dev-dependencies
* fix/database-base-ttl:
Fix typos in comments and function names across multiple modules
- Correct spelling of 'parallelism' in region_server, engine, and scan_region modules
- Amend typo in TODO comment from 'persisent' to 'persistent' in server module
- Update incorrect test query from 'versiona' to 'version' in federated module tests
* fix/database-base-ttl: Add schema existence check in StatementExecutor for CREATE TABLE operation
* fix/database-base-ttl: Add warning log for failed TTL retrieval in compaction region open function
* fix/database-base-ttl:
Refactor to use SchemaMetadataManagerRef in Datanode and MitoEngine
- Replace KvBackendRef with SchemaMetadataManagerRef across various components.
- Update DatanodeBuilder and MitoEngine to pass SchemaMetadataManagerRef instead of KvBackendRef.
- Adjust test cases to use get_schema_metadata_manager method for consistency.
* fix: data_length, index_length, table_rows in tables
* feat: table stats only works for mito engine currently
* fix: tests
* fix: typo
* chore: log error when region_stats fails
Co-authored-by: Lei, HUANG <mrsatangel@gmail.com>
* feat: get part range min-max from cache for unordered scan
* feat: seq scan push row groups if num_row_groups > 0
* test: test split
* feat: update comment
* test: fix split test
* refactor: rename get meta data method
* feat: adds index size to region statistics
* feat: adds the number of rows for region statistics
* test: adds sqlness test for region_statistics
* fix: test
* feat/alter-ttl:
Update greptime-proto source and add ChangeTableOptions handling
- Change greptime-proto source repository and revision in Cargo.lock and Cargo.toml
- Implement handling for ChangeTableOptions in grpc-expr and meta modules
- Add support for parsing and applying region option changes in mito2
- Introduce new error type for invalid change table option requests
- Add humantime dependency to store-api
- Fix SQL syntax in tests for changing column types
* chore: remove write buffer size option handling since we don't support specifying write_buffer_size for single table or region
* persist ttl to manifest
* chore: add sqlness
* fix: sqlness
* fix: typo and toml format
* fix: tests
* update: change alter syntax
* feat/alter-ttl: Add Clone trait to RegionFlushRequest and remove redundant Default derive in region_request.rs.
* feat/alter-ttl: Refactor code to replace 'ChangeTableOption' with 'ChangeRegionOption' and handle TTL as a region option
• Rename ChangeTableOption to ChangeRegionOption across various files.
• Update AlterKind::ChangeTableOptions to AlterKind::ChangeRegionOptions.
• Modify TTL handling to treat '0d' as None for TTL in table options.
• Adjust related function names and comments to reflect the change from table to region options.
• Include test case updates to verify the new TTL handling behavior.
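As a rough illustration of the "'0d' means no TTL" rule above, the sketch below parses a TTL string and maps any zero duration to `None`. It is an assumption-laden example: `parse_ttl` is a hypothetical helper, and it uses a tiny hand-written parser for `s`/`m`/`h`/`d` suffixes instead of the `humantime` crate mentioned in the changes, so that it stays dependency-free.

```rust
use std::time::Duration;

/// Hypothetical helper: parses "<number><unit>" where unit is s, m, h or d.
/// Returns `Ok(None)` for a zero duration, mirroring the '0d' -> None rule.
fn parse_ttl(input: &str) -> Result<Option<Duration>, String> {
    let input = input.trim();
    // Index of the first non-digit character, i.e. where the unit starts.
    let split = input
        .find(|c: char| !c.is_ascii_digit())
        .ok_or_else(|| format!("missing unit in ttl: {input:?}"))?;
    let (digits, unit) = input.split_at(split);
    let value: u64 = digits
        .parse()
        .map_err(|e| format!("invalid ttl number {digits:?}: {e}"))?;
    let secs_per_unit: u64 = match unit {
        "s" => 1,
        "m" => 60,
        "h" => 3_600,
        "d" => 86_400,
        other => return Err(format!("unsupported ttl unit: {other:?}")),
    };
    let secs = value
        .checked_mul(secs_per_unit)
        .ok_or_else(|| "ttl overflow".to_string())?;
    // Zero means "no TTL" rather than "expire immediately".
    Ok((secs != 0).then(|| Duration::from_secs(secs)))
}
```

Under these assumptions `parse_ttl("0d")` yields `Ok(None)`, while `parse_ttl("7d")` yields a seven-day duration.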
* chore: update format
* refactor: update region options in DatanodeTableValue
* feat/alter-ttl:
Remove TTL handling from RegionManifest and related structures
- Eliminate TTL fields from `RegionManifest`, `RegionChange`, and associated handling logic.
- Update tests and checksums to reflect removal of TTL.
- Refactor `RegionOpener` and `handle_alter` to adjust to TTL removal.
- Simplify `RegionChangeResult` by replacing `change` with `new_meta`.
* chore: fmt
* remove useless delete op
* feat/alter-ttl: Updated Cargo.lock and gRPC expression Cargo.toml to include store-api dependency. Refactored alter.rs to use ChangeOption from store-api instead of ChangeTableOptionRequest.
Adjusted error handling in error.rs to use MetadataError. Modified handle_alter.rs to handle TTL changes with ChangeOption. Simplified region_request.rs by replacing
ChangeRegionOption with ChangeOption and removing redundant code. Removed UnsupportedTableOptionChange error in table/src/error.rs. Updated metadata.rs to use ChangeOption for table
options. Removed ChangeTableOptionRequest enum and related conversion code from requests.rs.
* feat/alter-ttl: Update greptime-proto dependency to revision 53ab9a9553
* chore: format code
* chore: update greptime-proto
* test: add fuzz tests for migrate metric regions
* test: insert values before migrating metric region
* feat: correct table num
* chore: apply suggestions from CR
* chore: add schema urls to otlp logs table
* chore: update meter-macros version to remove anymap warning
* chore: change span id and trace id to field
* feat: add empty batch to end of range stream
* feat: add batch validation
* fix: validate batch order
* fix: not yield empty batch in compaction
* fix: empty record batch
* feat: add a flag to enable empty batch
* chore: otlp logs api
* feat: add API to write OpenTelemetry logs to GreptimeDB
* chore: fix test data schema error
* chore: modify the underlying data structure of the pipeline value map type from hashmap to btreemap to keep key order
* chore: fix by pr comment
* chore: resolve conflicts and add some test
* chore: remove useless error
* chore: change otlp header name
* chore: fmt code
* chore: fix integration test for otlp log write api
* chore: fix by pr comment
* chore: set otlp body with fulltext default
* refactor: replace info logs with debug logs in region server
Signed-off-by: Ruihang Xia <waynestxia@gmail.com>
* fix: update error handling for closing and opening nonexistent regions
Signed-off-by: Ruihang Xia <waynestxia@gmail.com>
---------
Signed-off-by: Ruihang Xia <waynestxia@gmail.com>
* feat: enable prof by default
* docs: don't need to build with features
* feat: add common-pprof as optional dep for pprof feature
* build: remove optional
* feat: use dump_text
* fix/union_all_panic:
Improve MetricCollector by incrementing level and fix underflow issue; add tests for UNION ALL queries
* chore: remove useless documentation
* fix/union_all_panic: Add order by clause to UNION ALL select queries in tests
* feat: json output format for http
* feat: add json result test case
* fix: typo and refactor a piece of code
* fix: cargo check
* move affected_rows to top level
* feat: add geojson function to aggregate paths
* test: add sqlness results
* test: add sqlness
* refactor: corrected to aggregation function
* chore: update comments
* fix: make linter happy again
* refactor: rename to remove `geo` from `geojson` function name
The return type is not geojson at all. It's just compatible with geojson's
coordinates part and superset's deckgl path plugin.
* chore: add json write
* chore: add test for write json log api
* chore: enhancement of Error Handling
* chore: fix by pr comment
* chore: fix by pr comment
* chore: enhancement of error content and add some doc
* feat: set max log files to 720 by default, info log only
* expose max_log_files in tomls
* include dir info when panicking, limit max_log_files of err_log to 30, and that of slow_queries to opt.max_log_files
* fix clippy
* update config.md
* update expected config str
* limit err_log max files size to `max_log_files` too, include err info when panicking, put `max_l_f` in right position
* fix typos
* chore: config
Co-authored-by: Lei, HUANG <6406592+v0y4g3r@users.noreply.github.com>
---------
Co-authored-by: dennis zhuang <killme2008@gmail.com>
Co-authored-by: Lei, HUANG <6406592+v0y4g3r@users.noreply.github.com>
* feat: define range meta
* feat: group ranges
* feat: split range
* feat: build ranges from the scan input
* feat: get partition range from range meta
* feat: build file range
* feat: unordered scan read by ranges
* feat: wip for mem ranges
* feat: build ranges
* feat: remove unused codes
* chore: update comments
* feat: update metrics
* chore: address review comments
* chore: debug assertion
* Clarify documentation for CompactionOutput struct
Updated the documentation for the `CompactionOutput` struct to specify that the output time range is only relevant for windowed compaction.
* Add max_output_file_size to TwcsPicker and TwcsOptions
- Introduced `max_output_file_size` to `TwcsPicker` struct and its logic to enforce output file size limits during compaction.
- Updated `TwcsOptions` to include `max_output_file_size` and adjusted related tests.
- Modified `new_picker` function to initialize `TwcsPicker` with the new `max_output_file_size` field.
* feat/limit-compaction-output-size:
Refactor compaction picker and TWCS to support append mode and improve options handling
- Update compaction picker to accept a reference to options and append mode flag
- Modify TWCS picker logic to consider append mode when filtering deleted rows
- Remove VersionControl usage in compactor and simplify return type
- Adjust enforce_max_output_size logic in TWCS picker to handle max output file size
- Add append mode flag to TwcsPicker struct
- Fix incorrect condition in TWCS picker for enforcing max output size
- Update region options tests to reflect new max output file size format (1GB and 7MB)
- Simplify InvalidTableOptionSnafu error handling in create_parser
- Add `compaction.twcs.max_output_file_size` to mito engine option keys
* resolve some comments
* feat: add capability to send warning to pgclient
* fix: refactor query context to carry query scope data
* feat: return a warning for unsupported postgres statement
* feat: list/array support for postgres output
* fix: implement time zone support for postgresql
* feat: add a geohash function that returns array
* fix: typo
* fix: lint warnings
* test: add sqlness test
* refactor: check resolution range before convert value
* fix: test result for sqlness
* feat: upgrade pgwire apis
* chore: cherrypick 52e8eebb2dbbbe81179583c05094004a5eedd7fd
* refactor/tables: Change variable from immutable to mutable in KvBackendCatalogManager's method
* refactor/tables: Replace unbounded channel with bounded and use semaphore for concurrency control in KvBackendCatalogManager
* refactor/tables: Add common-runtime dependency and update KvBackendCatalogManager to use common_runtime::spawn_global
* refactor/tables: Await on sending error through channel in KvBackendCatalogManager
* chore: version skew
* fix: even more version skew
* feat: use `ring` instead of `aws-lc` to drop the nasm assembler requirement on windows
* feat: use `ring` for pgwire
* feat: change to use `aws-lc-sys` on windows instead
* feat: change back to use `ring`
* chore: provide CryptoProvider
* feat: use upstream repo
* feat: install ring crypto lib in main
* chore: use same fn to install in tests
* feat: make pgwire use `ring`
* generic bundle trait
* feat: impl get/let
* fix: drop batch
* test: tumble batch
* feat: use batch eval flow
* fix: div use arrow::div not mul
* perf: not append batch
* perf: use bool mask for reduce
* perf: tiny opt
* perf: refactor slow path
* feat: opt if then
* fix: WIP
* perf: if then
* chore: use trace instead
* fix: reduce missing non-first batch
* perf: flow if then using interleave
* docs: add TODO
* perf: remove unnecessary eq
* chore: remove unused import
* fix: run_available no longer loop forever
* feat: blocking on high input buf
* chore: increase threshold
* chore: after rebase
* chore: per review
* chore: per review
* fix: allow empty values in reduce&test
* tests: more flow doc example tests
* chore: per review
* chore: per review
* feat: add json type and vector
* fix: allow to create and insert json data
* feat: udf to query json as string
* refactor: remove JsonbValue and JsonVector
* feat: show json value as strings
* chore: make ci happy
* test: add unit test and sqlness test
* refactor: use binary as grpc value of json
* fix: use non-preserve-order jsonb
* test: revert changed test
* refactor: change udf get_by_path to jq
* chore: make ci happy
* fix: distinguish binary and json in proto
* chore: delete udf for future pr
* refactor: remove Value(Json)
* chore: follow review comments
* test: some tests and checks
* test: fix unit tests
* chore: follow review comments
* chore: corresponding changes to proto
* fix: change grpc and pgsql server behavior alongside with sqlness/crud tests
* chore: follow review comments
* feat: udf of conversions between json and strings, used for grpc server
* refactor: rename to_string to json_to_string
* test: add more sqlness test for json
* chore: thanks for review :)
* Apply suggestions from code review
---------
Co-authored-by: Weny Xu <wenymedia@gmail.com>
* Refactor RaftEngineLogStore to use references for config
- Updated `RaftEngineLogStore::try_new` to accept a reference to `RaftEngineConfig` instead of taking ownership.
- Replaced direct usage of `config` with individual fields (`sync_write`, `sync_period`, `read_batch_size`).
- Adjusted test cases to pass references to `RaftEngineConfig`.
* Add parallelism configuration for WAL recovery
- Introduced `recovery_parallelism` setting in `datanode.example.toml` and `standalone.example.toml` for configuring parallelism during WAL recovery.
- Updated `Cargo.lock` and `Cargo.toml` to include `num_cpus` dependency.
- Modified `RaftEngineConfig` to include `recovery_parallelism` with a default value set to the number of CP
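A minimal sketch of a config field whose default follows the machine's parallelism is shown below. It assumes the truncated sentence above refers to the CPU count, which the source does not confirm. `WalConfig` is a hypothetical stand-in for `RaftEngineConfig`, and `std::thread::available_parallelism` is used in place of the `num_cpus` crate to keep the example free of external dependencies.

```rust
use std::thread;

/// Hypothetical, trimmed-down stand-in for the WAL configuration.
#[derive(Debug, Clone, PartialEq)]
struct WalConfig {
    /// Number of threads used when replaying the WAL during recovery.
    recovery_parallelism: usize,
}

impl Default for WalConfig {
    fn default() -> Self {
        // Assumption: default to the parallelism reported by the OS,
        // falling back to 1 when it cannot be determined.
        let cpus = thread::available_parallelism()
            .map(|n| n.get())
            .unwrap_or(1);
        Self {
            recovery_parallelism: cpus,
        }
    }
}
```

An explicit value in the TOML file would simply override this default.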
* feat/wal-recovery-parallelism:
Add `wal.recovery_parallelism` configuration option
- Introduced `wal.recovery_parallelism` to config.md for specifying parallelism during WAL recovery.
- Updated `RaftEngineLogStore` to include `recovery_threads` from the new configuration.
* fix: ut
* fix: table resolving logic related to pg_catalog
refer to
https://github.com/GreptimeTeam/greptimedb/issues/3560#issuecomment-2287794348
and #4543
* refactor: remove CatalogProtocol type
* fix: sqlness
* fix: forbid create database pg_catalog with mysql client
* refactor: use QueryContext as arguments rather than Channel
* refactor: pass None as default behaviour in information_schema
* test: fix test
* chore: add test pipeline api
* chore: add test for test pipeline api
* chore: fix taplo check
* chore: change pipeline dryrun api path
* chore: add more info for pipeline dryrun api
* chore: add processor builder and transform builder
* chore: in process
* chore: intermediate state from hashmap to vector in pipeline
* chore: remove useless code and rename some struct
* chore: fix typos
* chore: format code
* chore: add error handling and optimize code readability
* chore: fix typos
* chore: remove useless code
* chore: add some doc
* chore: fix by pr commit
* chore: remove useless code and change struct name
* chore: modify the location of the find_key_index function.
* refactor: pre-read the ingested sst file in object store to fill the local cache to accelerate first query
* feat: pre-download the ingested SST from remote to accelerate following reads
* resolve PR comments
* resolve PR comments
* feat(log_store): use new `Consumer`
* feat: add `from_peer_id`
* feat: read WAL entries respect index
* test: add test for `build_region_wal_index_iterator`
* fix: keep the handle
* fix: incorrect last index
* fix: replay last entry id may be greater than expected
* chore: remove unused code
* chore: apply suggestions from CR
* chore: rename `datanode_id` to `location_id`
* chore: rename `from_peer_id` to `location_id`
* chore: rename `from_peer_id` to `location_id`
* chore: apply suggestions from CR
* use list_with_metakey and concurrent_stat_in_list
* change concurrent in recover_cache like before
* remove stat function
* use 8 concurrent
* use const value
* fmt code
* Apply suggestions from code review
---------
Co-authored-by: ozewr <l19ht@google.com>
Co-authored-by: Weny Xu <wenymedia@gmail.com>
* chore: improve pipeline performance
* chore: use arc to improve time type
* chore: improve pipeline coerce
* chore: add vec refactor
* chore: add vec pp
* chore: improve pipeline
* inprocess
* chore: set log ingester use new pipeline
* chore: fix some error by pr comment
* chore: fix typo
* chore: use enum_dispatch to simplify code
* chore: some minor fix
* chore: format code
* chore: update by pr comment
* chore: fix typo
* chore: make clippy happy
* chore: fix by pr comment
* chore: remove epoch and date process add new timestamp process
* chore: add more test for pipeline
* chore: restore epoch and date processor
* chore: compatibility issue
* chore: fix by pr comment
* chore: move the evaluation out of the loop
* chore: fix by pr comment
* chore: fix dissect output key filter
* chore: fix transform output greptime value has order error
* chore: keep pipeline transform output order
* chore: revert tests
* chore: simplify pipeline prepare implementation
* chore: add test for timestamp pipeline processor
* chore: make clippy happy
* chore: replace is_some check to match
---------
Co-authored-by: shuiyisong <xixing.sys@gmail.com>
* feat: support fast count(*) for append-only tables
* fix: total_rows stats in time series memtable
* fix: sqlness result changes for SinglePartitionScanner -> StreamScanAdapter
* fix: some cr comments
* Add file number limits to TWCS compaction
- Introduce `max_active_window_files` and `max_inactive_window_files` to `TwcsOptions`.
* feat/limit-files-in-windows: Add max active/inactive window files options to mito engine config
* feat/limit-files-in-windows: Add Debug derive to TwcsPicker and implement max file enforcement logging in TWCS compaction
* fix: clippy
* feat: export all schemas and data at once
* feat: introduce export all to export schemas and data at once
* feat: default value for target
* feat: refactor export target
* chore: fix unit test
* feat: hint options for gRPC insert
* chore: unit test for extract_hints
* feat: add integration test for grpc hint
* test: add integration test for hints
* feat: add source channel to meter recorders
* feat: provide channel for query context
* fix: testing and extension get for query context
* chore: revert cargo toml structure changes
* fix: querycontext modification for prometheus and pipeline
* chore: switch git dependency to main branches
* chore: remove TODO
* refactor: rename other to unknown
---------
Co-authored-by: shuiyisong <113876041+shuiyisong@users.noreply.github.com>
* feat: refine logs for scan
* feat: improve build parts and unordered scan metrics
* feat: change to debug log
* fix: release lock before reading part
* test: replace region id
* test: fix sqlness
* chore: add todo
Co-authored-by: dennis zhuang <killme2008@gmail.com>
---------
Co-authored-by: dennis zhuang <killme2008@gmail.com>
* feat: support setting time range in Copy From statement
* test: add batch_filter_test
* fix: ts data type inconsistent error
* test: add sqlness test for copy from with statement
* fix: sqlness result error
* fix: cr comments
* feat: show root cause on the error line
* feat: show root error for grpc
* feat: add error information for http error
* feat: add db information on error mysql/postgres logs
* feat: add function 'pg_catalog.pg_table_is_visible'
* feat: add 'pg_class' and 'pg_namespace', now we can run '\d' and '\dt'!
* refactor: move memory_table::tables to utils::tables
* refactor: move out predicate to system_schema to reuse it
* feat: predicates pushdown
* test: add pg_namespace, pg_class related sqlness test
* fix: typos and license header
* fix: sqlness test
* refactor: use `expect` instead of `unwrap` here
* refactor: remove the `information_schema::utils` mod
* doc: make the comment in pg_get_userbyid more precise
* doc: add TODO and comment in pg_catalog
* fix: typo
* fix: sqlness
* doc: change to comment on PGClassBuilder to TODO
* Add dynamic cache size adjustment for InvertedIndexConfig
* Increase cache sizes in integration tests for HTTP
- Updated `metadata_cache_size` from 32MiB to 64MiB
* Remove cache size settings from config and update drop_lines_with_inconsistent_results function to handle them
* Add cache size configurations for inverted index metadata and content
- Introduced `metadata_cache_size` with a default of 64MiB.
- Introduced `content_cache_size` with a default of 128MiB.
* chore/index-content-cache-default-size: Add cache size configuration options for Mito engine's inverted index
fix/reader-metrics:
Refactor cache hit/miss logic and update metrics in mito2
- Simplify cache retrieval logic in CacheManager by removing inline update_hit_miss function call.
- Add separate functions for incrementing cache hit and miss metrics.
- Update RowGroupLastRowCachedReader to use new cache hit/miss functions and refactor to new helper methods for creating Hit and Miss variants.
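The idea of replacing an inline hit/miss update with two dedicated functions can be sketched as follows. This is an illustrative example only: plain atomics stand in for the real metrics, and `CACHE_HIT`, `CACHE_MISS`, `cache_hit`, `cache_miss` and `get_cached` are hypothetical names, not the identifiers used in mito2.

```rust
use std::collections::HashMap;
use std::sync::atomic::{AtomicU64, Ordering};

// Stand-ins for metric counters (assumption: the real code uses a metrics library).
static CACHE_HIT: AtomicU64 = AtomicU64::new(0);
static CACHE_MISS: AtomicU64 = AtomicU64::new(0);

/// Records one cache hit.
fn cache_hit() {
    CACHE_HIT.fetch_add(1, Ordering::Relaxed);
}

/// Records one cache miss.
fn cache_miss() {
    CACHE_MISS.fetch_add(1, Ordering::Relaxed);
}

/// Looks up `key` and reports the outcome through the dedicated helpers.
fn get_cached<'a>(cache: &'a HashMap<String, Vec<u8>>, key: &str) -> Option<&'a Vec<u8>> {
    let value = cache.get(key);
    if value.is_some() {
        cache_hit();
    } else {
        cache_miss();
    }
    value
}
```

Keeping the counters behind small functions lets callers that bypass the generic lookup path, such as a reader with its own cache logic, still report hits and misses consistently.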
* chore: update grafana dashboard to reflect recent metric changes
Signed-off-by: Ruihang Xia <waynestxia@gmail.com>
* chore: add a blank line at the end
---------
Signed-off-by: Ruihang Xia <waynestxia@gmail.com>
Co-authored-by: dennis zhuang <killme2008@gmail.com>
* chore: add a compile cfg for python
* fix: feature gate additive turn off default features in workspace&add cfg in place
* chore: remove unused in different cfg
* fix: ensure keep alive is completed in time
* chore: apply suggestions from CR
* chore: use write runtime
* refactor: set META_LEASE_SECS to 5
* chore: set etcd replicas to 1
* chore: apply suggestions from CR
* chore: apply suggestions from CR
* fix: set `MissedTickBehavior::Delay`
* chore: apply suggestions from CR
* Add caching for last row reader and expose cache manager
- Implement `RowGroupLastRowCachedReader` to handle cache hits and misses for last row reads.
* Add projection field to SelectorResultValue and refactor RowGroupLastRowReader
- Introduced `projection` field in `SelectorResultValue` to store projection indices.
* WIP: pg_catalog
* refactor: move memory_table to crate public level to reuse it in pgcatalog
* refactor: new system_schema mod to manage implementation of information_schema and pg_catalog
* feat: pg_catalog.pg_type
* fix: remove unused code to avoid warning
* test: add pg_catalog sqlness test
* feat: pg_catalog_cache in system_catalog
* fix: integration test
* test: rollback unit test
* refactor: mix pg_catalog table_id with old ones
* fix: add todo information
* tests: rerun sqlness
---------
Co-authored-by: johnsonlee <johnsonlee@localhost.localdomain>
* Add PruneReader for optimized row filtering and error handling
- Introduced `PruneReader` to replace `RowGroupReader` for optimized row filtering.
* Make ReaderMetrics fields public for external access
* Add row selection support to SeqScan and FileRange readers
- Updated `SeqScan::build_part_sources` to accept an optional `TimeSeriesRowSelector`.
* Refactor `scan_region.rs` to remove unnecessary cloning of `series_row_selector`. Enhance `file_range.rs` by adding `select_all` method to check if all rows in a row group are selected, and update the logic in `reader` method to use `LastRowReader` only when all rows are selected and no DELETE operations are present.
* Enhance PruneReader and ParquetReader with reset functionality and metrics handling
• Made Source enum public in prune.rs.
* chore: Update src/mito2/src/sst/parquet/reader.rs
---------
Co-authored-by: Yingwen <realevenyag@gmail.com>
* feat: support text/plain format of log input
* refactor: pipeline query and delete using dataframe api
* chore: minor refactor
* refactor: skip jsonify when processing text/plain
* refactor: support array(string) as pipeline engine input
* feat/copy-to-parquet-parameter:
Enhance Parquet Writer with Column-wise Configuration
• Introduced column_wise_config function to customize per-column properties in Parquet writer.
* feat/copy-to-parquet-parameter:
Enhance Parquet File Format Handling for Specific Data Types
• Added ConcreteDataType import to support specific data type handling.
* feat/copy-to-parquet-parameter:
Refactor Parquet file format configuration
* feat/copy-to-parquet-parameter:
Enhance Parquet file format handling for timestamp columns
- Added logic to disable dictionary encoding and set DELTA_BINARY_PACKED encoding for timestamp columns in the Parquet file format configuration.
* feat/copy-to-parquet-parameter:
Disable dictionary encoding for timestamp columns in Parquet writer and update default max_active_window_runs in TwcsOptions
- Modified Parquet writer to disable dictionary encoding for timestamp columns to optimize for increasing timestamp data.
* feat/copy-to-parquet-parameter:
Update compaction settings in tests
- Modified `test_compaction_region` to include new compaction options: `compaction.type`,
`compaction.twcs.max_active_window_runs`, and `compaction.twcs.max_inactive_window_runs`.
- Updated `test_merge_mode_compaction` to use `compaction.twcs.max_active_window_runs` and
`compaction.twcs.max_inactive_window_runs` instead of `max_active_window_files` and
`max_inactive_window_files`.
* feat: use `Inserter` as Frontend
* fix: enable procedure in flownode
* docs: remove `frontend_addr` opts
* chore: rm fe addr in test runner
* refactor: int test also use inserter invoker
* feat: flow shutdown&refactor: remove `Frontendinvoker`
* refactor: rename `RemoteFrontendInvoker` to `FrontendInvoker`
* refactor: per review
* refactor: remove a layer of box
* fix: standalone use `node_manager`
* fix: remove a `Arc` cycle
* feat/inverted-index-cache:
Update dependencies and add caching for inverted index reader
- Updated `atomic` to 0.6.0 and `uuid` to 1.9.1 in `Cargo.lock`.
- Added `moka` and `uuid` dependencies in `Cargo.toml`.
- Introduced `seek_read` method in `InvertedIndexBlobReader` for common seek and read operations.
- Added `cache.rs` module to implement caching for inverted index reader using `moka`.
- Updated `async-compression` to 0.4.11 in `puffin/Cargo.toml`.
* feat/inverted-index-cache:
Refactor InvertedIndexReader and Add Index Cache Support
- Refactored `InvertedIndexReader` to include `seek_read` method and default implementations for `fst` and `bitmap`.
- Implemented `seek_read` in `InvertedIndexBlobReader` and `CachedInvertedIndexBlobReader`.
- Introduced `InvertedIndexCache` in `CacheManager` and `SstIndexApplier`.
- Updated `SstIndexApplierBuilder` to accept and utilize `InvertedIndexCache`.
- Added `From<FileId> for Uuid` implementation.
* feat/inverted-index-cache:
Update Cargo.toml and refactor SstIndexApplier
- Moved `uuid.workspace` entry in Cargo.toml for better organization.
* feat/inverted-index-cache:
Refactor InvertedIndexCache to use type alias for Arc
- Replaced `Arc<InvertedIndexCache>` with `InvertedIndexCacheRef` type alias.
* feat/inverted-index-cache:
Add Prometheus metrics and caching improvements for inverted index
- Introduced `prometheus` and `puffin` dependencies for metrics.
* feat/inverted-index-cache:
Refactor InvertedIndexReader and Cache handling
- Simplified `InvertedIndexReader` trait by removing seek-related comments.
* feat/inverted-index-cache:
Add configurable cache sizes for inverted index metadata and content
- Introduced `index_metadata_size` and `index_content_size` in `CacheManagerBuilder`.
* feat/inverted-index-cache:
Refactor and optimize inverted index caching
- Removed `metrics.rs` and integrated cache metrics into `index.rs`.
* feat/inverted-index-cache:
Remove unused dependencies from Cargo.lock and Cargo.toml
- Removed `moka`, `prometheus`, and `puffin` dependencies from both Cargo.lock and Cargo.toml.
* feat/inverted-index-cache:
Replace Uuid with FileId in CachedInvertedIndexBlobReader
- Updated `file_id` type from `Uuid` to `FileId` in `CachedInvertedIndexBlobReader` and related methods.
* feat/inverted-index-cache:
Refactor cache configuration for inverted index
- Moved `inverted_index_metadata_cache_size` and `inverted_index_cache_size` from `MitoConfig` to `InvertedIndexConfig`.
* feat/inverted-index-cache:
Remove unnecessary conversion of `file_id` in `SstIndexApplier`
- Simplified the initialization of `CachedInvertedIndexBlobReader` by removing the redundant `into()` conversion for `file_id`.
feat: flownode frontend client&test
feat: Frontend Client
feat: set frontend invoker for flownode
feat: set frontend invoker for flownode
chore: test script
WIP: test flow distributed
feat: hard coded demo
docs: flownode example toml
feat: add flownode support in runner
docs: comments for node
chore: after rebase
docs: add a todo
tests: move flow tests to common
fix: flownode sqlness dist test
chore: per review
docs: make
fix: make doc
* feat: add path prefix label to storage metrics
* refactor: return full path when the levels are less than 3
* refactor: align path label name with upstream
* refactor: better implementation of sub path
---------
Co-authored-by: Weny Xu <wenymedia@gmail.com>
* chore: update sqlness results
* refactor: use rwlock for modifiable data in session and querycontext
* chore: format toml
* refactor: use mutable_inner structure for mutable fields
* refactor: remove arc wrapper
* fix(fuzz): adapt for new partition rules
* feat: implement naive fuzz test for region migration
* chore(ci): add ci cfg
* chore: apply suggestions from CR
* chore: apply suggestions from CR
* fix: forbid changing tables in information_schema
* refactor: use unified read-only check function
* test: add more sqlness tests for information_schema
* refactor: move is_readonly_schema to common_catalog
* feat: add more placeholder field in information_schema.tables
* feat: make schema modifiable for use statement
* chore: add todo items
* fix: resolve lint issues after data type changes
* chore: update sqlness results
* refactor: patch for select database is no longer needed
* test: align tests and data types
* Apply suggestions from code review
Co-authored-by: dennis zhuang <killme2008@gmail.com>
* fix: use canonicalize_identifier for database name
* feat: add all columns for information_schema.tables
* test: remove variables from sqlness results
* feat: add to_string impl for table options
---------
Co-authored-by: dennis zhuang <killme2008@gmail.com>
* ci: use 'vault.centos.org' as default yum for centos:7 image
* ci: fix cargo-binstall version to adapt rust toolchain
* ci: specify cargo-binstall version to adapt current rust toolchain
* refactor: add Compactor trait
* chore: add compact() in Compactor trait and expose compaction module
* refactor: add CompactionRequest and open_compaction_region
* refactor: export the compaction api
* refactor: add DefaultCompactor::new_from_request
* refactor: no need to pass mito_config in open_compaction_region()
* refactor: CompactionRequest -> &CompactionRequest
* fix: typo
* docs: add docs for public apis
* refactor: remove 'Picker' from Compactor
* chore: add logs
* chore: change pub attribute for Picker
* refactor: remove do_merge_ssts()
* refactor: update comments
* refactor: use CompactionRegion argument in Picker
* chore: make compaction module public and remove unnecessary clone
* refactor: move build_compaction_task() in CompactionScheduler{}
* chore: use in open_compaction_region() and add some comments for public structure
* refactor: add 'manifest_dir()' in store-api
* refactor: move the default implementation to DefaultCompactor
* refactor: remove Options from MergeOutput
* chore: minor modification
* fix: clippy errors
* fix: unit test errors
* refactor: remove 'manifest_dir()' from store-api crate(already have one in opener)
* refactor: use 'region_dir' in CompactionRequest
* refactor: refine naming
* refactor: refine naming
* refactor: remove clone()
* chore: add comments
* refactor: add PickerOutput field in CompactorRequest
* feat: introduce RemoteJobScheduler
* feat: add RemoteJobScheduler in schedule_compaction_request()
* refactor: use Option type for senders field of CompactionFinished
* refactor: modify CompactionJob
* refactor: schedule remote compaction job by options
* refactor: remove unused Options
* build: remove unused log
* refactor: fallback to local compaction if the remote compaction failed
* fix: clippy errors
* refactor: add plugins in mito2
* refactor: add from_u64() for JobId
* refactor: make schedule module public
* refactor: add error for RemoteJobScheduler
* refactor: add Notifier
* refactor: use Arc for Notifier
* refactor: add 'remote_compaction' in compaction options
* fix: clippy errors
* fix: unrecognized table option
* refactor: add 'start_time' in CompactionJob
* refactor: modify error type of RemoteJobScheduler
* chore: revert changes for request
* refactor: code refactor by review comment
* refactor: use string type for JobId
* refactor: add 'waiters' field in DefaultNotifier
* fix: build error
* refactor: take coderabbit's review comment
* refactor: use uuid::Uuid as JobId
* refactor: return waiters when schedule failed and add on_failure for DefaultNotifier
* refactor: move waiters from notifier to Job
* refactor: use ObjectStoreManagerRef in open_compaction_region()
* refactor: implement for JobId and adds related unit tests
* fix: run unit tests failed
* refactor: add RemoteJobSchedulerError
* fix: make Influxdb lines able to be inserted into last created tables
* Update src/servers/src/influxdb.rs
* add an option to control the time index alignment behavior
* fix ci
* refactor: use interceptor to handle timestamp align
* Apply suggestions from code review
Co-authored-by: dennis zhuang <killme2008@gmail.com>
---------
Co-authored-by: tison <wander4096@gmail.com>
Co-authored-by: dennis zhuang <killme2008@gmail.com>
* feat: Use DATANODE_LEASE_SECS from distributed_time_constants for heartbeat pause duration
* feat: introduce `RegionFailureDetectorController` to manage region failure detectors
* feat: add `RegionFailureDetectorController` to `DdlContext`
* feat: add `region_failure_detector_controller` to `Context` in region migration
* feat: register region failure detectors during rollback region migration procedure
* feat: deregister region failure detectors during drop table procedure
* feat: register region failure detectors during create table procedure
* fix: update meta config
* chore: apply suggestions from CR
* chore: avoid cloning
* chore: rename
* chore: reduce the size of the test
* chore: apply suggestions from CR
* chore: move channel initialization into `RegionSupervisor::channel`
* chore: minor refactor
* chore: rename ident
* fix: add serialize_ignore_column_ids() to fix the failure to deserialize region options from a JSON string
* refactor: return empty vector if column_id is empty
* feat: add functions to find and merge sorted runs
* chore: refactor code
* chore: remove some duplicates
* chore: remove one clone
* refactor: change max_active_window_files to max_active_window_runs
* feat: integrate with sorted runs
* fix: unit tests
* feat: limit num of sorted runs during compaction
* fix: some test
* fix: some cr comments
* feat: use smallvec
* chore: rebase main
* feat/reduce-sorted-runs:
Refactor compaction logic and update test configurations
- Refactored `merge_all_runs` function to use `sort_ranged_items` for sorting.
- Improved item merging logic by iterating with `into_iter` and handling overlaps.
- Updated test configurations to use `max_active_window_runs` instead of `max_active_window_files` for consistency.
---------
Co-authored-by: tison <wander4096@gmail.com>
* test: add e2e test for region failover
* chore: add ci cfg
* chore: reduce parallelism to 8
* fix(ci): enable region failure
* chore: set sqlx LogLevel to Off
* refactor: move help functions to utils
* feat: add update_mode to region options
* test: add test
* feat: last not null iter
* feat: time series last not null
* feat: partition tree update mode
* feat: partition tree
* fix: last not null iter slice
* test: add test for compaction
* test: use second resolution
* style: fix clippy
* chore: merge two lines
Co-authored-by: Jeremyhi <jiachun_feng@proton.me>
* chore: address CR comments
* refactor: UpdateMode -> MergeMode
* refactor: LastNotNull -> LastNonNull
* chore: return None earlier
* feat: validate region options
make merge mode optional and use default while it is None
* test: fix tests
---------
Co-authored-by: Jeremyhi <jiachun_feng@proton.me>
* refactor: remove compaction_options and use RegionOptions type for region_options
* refactor: add file_purger field in CompactionRegion
* refactor: add SerializedPickerOutput
* refactor: rename CompactorRequest to OpenCompactionRegionRequest and remove PickerOutput
* refactor: use &PickerOutput instead of clone()
* refactor: migrate region failover implementation to region migration
* fix: use HEARTBEAT_INTERVAL_MILLIS as lease secs
* fix: return false if leader is downgraded
* fix: only remove failure detector after submitting procedure successfully
* feat: ignore dropped region
* refactor: retrieve table routes in batches
* refactor: disable region failover on local WAL implementation
* fix: move the guard into procedure
* feat: use real peer addr
* feat: use interval instead of sleep
* chore: rename `HeartbeatSender` to `HeartbeatAcceptor`
* chore: apply suggestions from CR
* chore: reduce duplicate code
* chore: apply suggestions from CR
* feat: lookup peer addr
* chore: add comments
* chore: apply suggestions from CR
* chore: apply suggestions from CR
* feat: introduce bulk memtable encoder/decoder
* chore: rebase main
* chore: resolve some comments
* refactor: only carries time unit in ArraysSorter
* fix: some comments
* feat: heartbeat task
* feat: use real flow peer allocator when building
* feat: add peer look up in ddl context
* fix: drop flow test
* refactor: per review(WIP)
* refactor: not check if is alive
* refactor: per review
* refactor: remove useless `reset`
* refactor: per bot advices
* refactor: alive peer
* chore: bot review
* fix: `region_peers` returns same region_id for multi logical tables
* test: add sqlness test for information_schema.region_peers
* refactor: region_peers sqlness
* tests: flow sqlness tests
* tests: WIP df func test
* fix: use schema before expand for transform expr
* tests: some basic flow tests
* tests: unit test
* chore: dep use rev not patch
* fix: weird sqlness error?
* refactor: per review
* fix: temp sqlness bug
* fix: use fixed sqlness
* fix: impl drop as async shutdown
* refactor: per bot's review
* tests: drop worker handler both sync/async
* docs: add rationale for test
* refactor: per review
* chore: fmt
* chore: make RegionOptions serializable and add region_dir in CompactionRegion
* refactor: make `PickerOutput` and `MergeOutput` serializable and deserializable
* refactor: remove Serialize and Deserialize from PickerOutput
* chore: revert changes for file.rs
* chore: revert changes for compactor.rs and compaction.rs
---------
Co-authored-by: tison <wander4096@gmail.com>
* feat: prepare stmt in mysql client
* feat: execute stmt in mysql client
* fix: handle parameters properly
* refactor: use existing funcs to convert expr to scalar value
* refactor: use uuid strings as stmt_key for queries from COM_PREPARE packet
* refactor: take prepare and execute parser as submodule
* test: add unit test for converting expr to scalar value
* feat: deallocate stmt in mysql client
* chore: comments and duplicates
---------
Co-authored-by: dennis zhuang <killme2008@gmail.com>
* refactor: RangeBase
* feat: memtable range
* feat: scanner use mem range
* feat: remove base from mem range context
* feat: impl ranges for memtables
* chore: fix warnings
* refactor: make predicate cheap to clone
* refactor: MemRange -> MemtableRange
* feat: pub empty memtable to fix warnings
* test: fix sqlness result
* chore: call df function types
* feat: RelationDesc to DfSchema
* refactor: use RelationDesc instead of Type
* chore: WIP get to phy expr
* feat: custom deserialize
* chore: fmt
* refactor: renaming to DfScalarFunction
* feat: eval df func(untested)
* fix: had to spawn a thread for calling async
* chore: per review advices
* tests: test df scalar function
* refactor: add Compactor trait
* chore: add compact() in Compactor trait and expose compaction module
* refactor: add CompactionRequest and open_compaction_region
* refactor: export the compaction api
* refactor: add DefaultCompactor::new_from_request
* refactor: no need to pass mito_config in open_compaction_region()
* refactor: CompactionRequest -> &CompactionRequest
* fix: typo
* docs: add docs for public apis
* refactor: remove 'Picker' from Compactor
* chore: add logs
* chore: change pub attribute for Picker
* refactor: remove do_merge_ssts()
* refactor: update comments
* refactor: use CompactionRegion argument in Picker
* chore: make compaction module public and remove unnecessary clone
* refactor: move build_compaction_task() in CompactionScheduler{}
* chore: use in open_compaction_region() and add some comments for public structure
* refactor: add 'manifest_dir()' in store-api
* refactor: move the default implementation to DefaultCompactor
* refactor: remove Options from MergeOutput
* chore: minor modification
* fix: clippy errors
* fix: unit test errors
* refactor: remove 'manifest_dir()' from store-api crate(already have one in opener)
* refactor: use 'region_dir' in CompactionRequest
* refactor: refine naming
* refactor: refine naming
* refactor: remove clone()
* chore: add comments
* refactor: add PickerOutput field in CompactorRequest
* refactor: make individual col name optional
* chore: rename TypedPlan's `typ` to `schema`
* feat: add optional col name to typed plan
* feat: pass col name all along
* feat: correct infer output table schema
* chore: unused import
* fix: error when key is not projected
* refactor: per review
* chore: fmt
* ci: enable debug log
* chore: test to reproduce panic
* chore: Revert "ci: enable debug log"
This reverts commit 17eff2a045.
* test: add test for alter during flush
* fix: clear status if region has nothing to flush
It will also execute pending DDLs and requests
* docs: fix typo
* feat: support cancellation
* chore: add unit test for cancellation
* chore: minor refactor
* feat: we do not need to spawn in distributed mode
---------
Co-authored-by: Ruihang Xia <waynestxia@gmail.com>
* set global runtime size
* fix: resolve PR comments
* fix: log the whole option
* fix ci
* debug ci
* debug ci
---------
Co-authored-by: Weny Xu <wenymedia@gmail.com>
* fix: mfp missing rows if run twice in same tick
* tests: run mfp for multiple times
* refactor: make mfp less hacky
* feat: make channel larger
* chore: typos
* fix: display the PartitionBound and PartitionDef correctly
* Update src/partition/src/partition.rs
Co-authored-by: dennis zhuang <killme2008@gmail.com>
* fix: fix unit test of partition definition
---------
Co-authored-by: dennis zhuang <killme2008@gmail.com>
* feat: implement the `WalEntryDistributor` and `WalEntryReceiver`
* test: add tests for `WalEntryDistributor`
* refactor: use bounded channel
* chore: apply suggestions from CR
* feat: unordered scanner
* feat: support compat
* chore: update debug print
fix: missing ranges in scan parts
* fix: ensure chunk size > 0
* fix: parallel is disabled if there is only one file and memtable
* chore: reader metrics
* chore: remove todo
* refactor: add ScanPartBuilder trait
* chore: pass file meta to the part builder
* chore: make part builder private
* docs: update comment
* chore: remove meta()
* refactor: only prune file ranges in ScanInput
replaces ScanPartBuilder with FileRangeCollector, which only collects file
ranges
* chore: address typo
* fix: panic when no partition
* feat: Postpone part distribution
* chore: handle empty partition in mito
* style: fix clippy
* show status returning empty contents
* return an empty set instead of affected rows
* chore: Update src/query/src/sql.rs
---------
Co-authored-by: Yingwen <realevenyag@gmail.com>
* feat: open region in background
* feat: trace opening regions
* feat: wait for the opening region
* feat: let engine handle the future open request
* fix: fix `test_region_registering`
* feat: invoke `flush_table` and `compact_table` in fuzz tests
* feat: support to flush and compact physical metric table
* fix: avoid creating tables with the same name
* feat: validate values after flushing or compacting table
* add compaction udf params
* wip: pass compaction options through grpc
* wip: pass compaction options all the way down to region server
* wip: window compaction task
* feat: trigger major compaction
* refactor: optimize compaction parameter parsing
* chore: rebase main
* chore: update proto
* chore: add some tests
* feat: validate catalog
* chore: fix typo and rebase main
* fix: some cr comments
* fix: file_time_bucket_span
* fix: avoid upper bound overflow
* chore: update proto
* feat: convert timestamp range filters to predicates
* chore: rebase main
* fix: remove predicates once they have been added to timestamp filters to avoid duplicate filtering
* fix: some comments
* fix: resolve conflicts
* feat: enable gzip in grpc server side
* feat: add enable_gzip_compression config
* test: add grpc compression test
* feat: support user configured compression on grpc server
* chore: update doc
* chore: add tests
* fix: make config-docs
* chore: fix cr issue
* chore: add test
* refactor: remove config on server side, auto enable all compression support
* chore: minor update
* chore: remove unused code
* refactor: enable zstd compression internally by default
* chore: minor fix
* chore: change binary array type from LargeBinaryArray to BinaryArray
* fix: adjust try_into_vector logic
* fix: apply CR suggestions, add tests
* chore: fix failing test
* chore: fix integration test
* chore: adjust the assertions according to changed implementation
* chore: add a test with LargeBinary type
* chore: apply CR suggestions
* chore: simplify tests
* feat(WIP): tumble window rewrite parser
* tests: tumble func
* feat: add `update_at` column for all flow output
* chore: cleanup per review
* fix: update_at not as time index
* fix: demo tumble
* fix: tests&tumble signature&accept both ts&datetime
* refactor: update_at now ts millis type
* chore: per review advices
* feat(flow): flow node manager
feat(flow): render src/sink
feat(flow): flow node manager in standalone
fix?: higher run freq
chore: remove redundant error enum variant
fix: run with higher freq if insert more
chore: fix after rebase
chore: typos
* chore(WIP): per review
* chore: per review
* ci: disable other test
* ci: timeout 30
* ci: try to use lld
* ci: change linker
* test: wait for file change in test multiple times
* ci: enable other tests
* chore: revert sleep in loop
* feat(fuzz): add validator for inserted rows
* fix: compatibility with mysql types
* feat(fuzz): add datetime and date type in mysql for row validator
* ci: use windows 2019
* test: ignore cleanup result
* chore: revert change
* test: unstable repeated task test
* build: update rust toolchain and windows
* ci: test sqlness
* chore: enable other tests
* feat: support to create & drop flow via grpc
* chore: apply suggestions from CR
* chore: apply suggestions from CR
* chore: apply suggestions from CR
* refactor: passing QueryContext to RegionServer
* refactor: change the return type of build() in QueryContextBuilder
* fix: update greptime-proto reference
* chore: apply suggestion
* chore: revert the last commit
---------
Co-authored-by: dennis zhuang <killme2008@gmail.com>
* feat: mirror insert req to flow node
* refactor: group_requests_by_peer
* chore: rename `nodes` to `flows` to be more apt
* docs: add TODO
* refactor: split flow&data node grouping to two func
* refactor: mirror_flow_node_request
* chore: add some TODOs
* refactor: use Option in value
* feat: skip non-src table quickly
* docs: add TODO for `Peer.address`
* fix: dedup
* refactor: let upper caller control whether to omit column list
* feat(fuzz): add insert logical table target
* ci: add fuzz_insert_logical_table ci cfg
* feat(fuzz): add create logical table target
* fix: drop physical table after fuzz test
* fix: remove backticks of table name in with clause
* fix: create physical and logical table properly
* chore: update comments
* chore(ci): add fuzz_create_logical_table ci cfg
* fix: create one logical table at a time
* fix: avoid possible duplicate table and column name
* feat: use hard-code physical table
* chore: remove useless phantom
* refactor: create logical table with struct initialization
* chore: suggested changes and corresponding test changes
* chore: clean up
* fix: post process result on query full column name of prom labels API
Signed-off-by: Ruihang Xia <waynestxia@gmail.com>
* only preserve tag column
Signed-off-by: Ruihang Xia <waynestxia@gmail.com>
---------
Signed-off-by: Ruihang Xia <waynestxia@gmail.com>
* feat: support different types for `CompatReader`
* chore: only compare whether we need: (data_type)
* fix: optimize code based on review suggestions
- add unit test `test_safe_cast_to_null` to test safe casting
- add DataType to projected_fields
- remove TODO
* fix: assert_eq fail on `projection.rs`
* style: codefmt
* style: fix the code based on review suggestions
* fix: do not remove deletion markers when window time range overlaps
* chore: fix some minor issues; add compaction test
* chore: add more test
* fix: nitpick master's nitpick
* feat: support invalidate schema name key cache
* fix: remove pub for invalidate_schema_cache
* refactor: add DropMetadataBroadcast State Op
* fix: delete files
* feat: transform substrait SELECT&WHERE&GROUP BY to Flow Plan
* chore: reexport from common/substrait
* feat: use datafusion Aggr Func to map to Flow aggr func
* chore: remove unwrap&split literal
* refactor: split transform.rs into smaller files
* feat: apply optimize for variadic fn
* refactor: split unit test
* chore: per review
* fix: cli export `create table` with quoted names
* add test
* apply review comments
* fix to pass check
* remove eprintln for clippy check
* use prebuilt binary to avoid compile
* ci run coverage after build
* drop dirty hack test
Signed-off-by: tison <wander4096@gmail.com>
---------
Signed-off-by: tison <wander4096@gmail.com>
Co-authored-by: tison <wander4096@gmail.com>
* refactor: func's specialization& use Error not EvalError
* docs: some pub item
* chore: typo
* docs: add comments for every pub item
* chore: per review
* chore: per review & derive Copy
* chore: per review&test for binary fn spec
* docs: comment explain how binary func spec works
* chore: minor style change
* fix: Error not EvalError
* chore: keep the same method order in KvBackend
* feat: make meta client able to get all node info of the cluster
* feat: cluster info data model
* feat: frontend and datanode info
* feat: list node info
* chore: remove the method: is_started
* fix: scan key prefix
* chore: impl From for NodeInfoKey
* chore: doc for trait and struct
* chore: reuse the error
* chore: refactor two collect cluster info handlers
* chore: remove inline
* chore: refactor two collect cluster info handlers
* fix: move object store read/write timer into inner
* add Drop for PrometheusMetricWrapper
* call await on async read/write
* apply review comments
* get rid of option on timer
* test: add integration_test for datetime style
* feat: support various datestyle for postgres
* doc: rewrite the comment about merge_datestyle_value
* test: add more test to illustrate valid datestyle input
* feat: add __schema__ tag for promql parser
* feat: disable matcher op other than equals
* test: add more test to ensure context getting reset
* test: add integration test
* test: refactor tests
* refactor: remove duplicated test code
* refactor: update according to review comments
* test: add sqlness test for cross schema scenario
---------
Co-authored-by: Ruihang Xia <waynestxia@gmail.com>
* feat(tql): add initial support for start,stop,step as sql functions
* fix(tql): remove unwraps, adjust fmt
* fix(tql): address taplo issue
* feat(tql): update parse_tql_query logic
* fix(tql): change query parsing logic to use parser instead of delimiter
* fix(tql): add timestamp function support, add sqlness tests
* fix(tql): add lookback optional param for tql eval
* fix(tql): adjust tests for now() function
* fix(tql): introduce the tqlerror to differentiate failures on parsing, evaluation and simplification stages
* fix(tql): add tests for explain/analyze
* feat(tql): add lookback support for explain/analyze, update tests
* feat(tql): add more sqlness tests
* chore(tql): extract common logic for eval, analyze and explain into a single function
* feat(tql): address CR points
* feat(tql): use snafu for tql errors, add more docs
* feat(tql): address CR points
* test: add integration tests for kafka wal
* chore: rebase main
* chore: unify naming convention for wal config
* chore: add register loaders switch
* chore: alter tables by adding a new column
* chore: move rand to dev-dependencies
* chore: update Cargo.lock
* feat: support set variable statement of session
* feat: support printing postgresql's bytea data type in its "hex" and "escape" format in an ugly way
* refactor: add 'SessionConfigValue' type and unify the name
* doc: add license header
* refactor: confine coupling with 'sql::ast::Value' in SessionConfigValue
* refactor: move all bytea wrapper into bytea.rs
* fix: remove unused import in context.rs and postgres.rs
* refactor: rename 'set_configuration_parameter' to 'set_session_config'
rename 'set_configuration_parameter' in statement_.rs to 'set_session_config'
* refactor: use mod to organize options via macro
* refactor: re-model the session config value with static type
* test: add integration test
* refactor: move the encode bytea by format type logic into encoder
refactor: use Arc<DashMap> instead of DashMap in QueryContext
Avoid expensive clone
refactor: use unreachable!() instead of unimplemented!()
refactor: move the encode bytea by format type logic into encoder
test: add binary format integration test case
* test: add ut for byte related type
* doc: remove TODO of bytea_output
* refactor: simplify the implementation with simple struct instead of complex typing
* fix: typo of 'Available'
* fix compile
Signed-off-by: tison <wander4096@gmail.com>
---------
Signed-off-by: tison <wander4096@gmail.com>
Co-authored-by: tison <wander4096@gmail.com>
* feat: Able to pretty print sql query result in http output
* fix: add some tests
* fix: add some space, delete fn into_payload, and impl Display for TableResponse
* feat: Arrangement shared state
* feat: arrange&tests
* docs: detailed&tests for get
* chore: license
* refactor: opt out ts expr&tests: internal ts
* docs: remove some TODOs
* feat: use smallvec size of 2
* refactor: per review
* chore: per review
* chore: per review
* chore: remove redundant clone
* feat: return max expire time&docs: more explain cur expire config
* feat: add memtable builder to region
* refactor: rename memtable_builder in worker to default_memtable_builder
* fix: return error instead of using default compaction options
Support deserializing memtable and compaction options from the option
map
* feat: optional memtable options
* feat: add MemtableBuilderProvider to create builders
* feat: change default memtable and skip deserializing dedup
* chore: update test and comment
* chore: test invalid type
* feat: metric engine use new memtable manually
* feat: expose more memtable configs
* feat: add memtable options to valid option list
* test: add test
* test: sqlness test
* chore: serde workspace
* chore: remove comments
* feat: handle flush periodically
* chore: call periodical method in loop
* feat: check periodical tasks on channel timeout
* refactor: use time provider to get time
Mock a time provider to test auto flush
* chore: fix typos
* refactor: rename mock time provider
* style: fix clippy
* chore: address comment
* feat: acquire catalog and schema lock in region failover
* chore: remove unused code
* feat!: acquire catalog and schema lock in region migration
* feat: acquire catalog and schema lock in create table
* feat: call freeze if the active data buffer in a shard is full
* chore: more metrics
* chore: print metrics
* chore: enlarge freeze threshold
* test: test freeze
* test: fix config test
* feat(influxdb): add db query param support for v2 write api
* fix(influxdb): update authorize logic to get catalog and schema from query string
* fix(influxdb): address CR suggestions
* fix(influxdb): use the correct import
* feat: add create table fuzz test
* chore: add ci cfg for fuzz tests
* refactor: remove redundant nightly config
* chore: run fuzz test in debug mode
* chore: use ubuntu-latest
* fix: close connection
* chore: add cache in fuzz test ci
* chore: apply suggestion from CR
* chore: apply suggestion from CR
* chore: refactor the fuzz test action
* feat: add configuration for tls watch option
* test: sleep longer to ensure async task run
* test: update config api integration test
* refactor: rename function
* feat: Support automatic DNS lookup for kafka bootstrap servers
* Revert "feat: Support automatic DNS lookup for kafka bootstrap servers"
This reverts commit 5baed7b01d.
* feat: Support automatic DNS lookup for Kafka broker
* fix: resolve broker endpoint in client manager
* fix: apply clippy lints
* refactor: simplify the code with clippy hint
* refactor: move resolve_broker_endpoint to common/wal/src/lib.rs
* test: add mock test for resolver_broker_endpoint
* refactor: accept niebayes's advice
* refactor: rename EndpointIpNotFound to EndpointIPV4NotFound
* refactor: remove mock test and simplify the implementation
* docs: add comments about test_vallid_host_ipv6
* Apply suggestions from code review
Co-authored-by: niebayes <niebayes@gmail.com>
* move more common code
Signed-off-by: tison <wander4096@gmail.com>
---------
Signed-off-by: tison <wander4096@gmail.com>
Co-authored-by: tison <wander4096@gmail.com>
Co-authored-by: niebayes <niebayes@gmail.com>
* feat: partition level map
* test: test shard and builder
* fix: do not use pk index from shard builder
* feat: add multi key test
* fix: freeze shard before finding pk in shards
* fix: dict builder resets num_keys on finish
* feat: skip empty shard and builder
* feat: avoid pruning if possible
Implementations:
- Apply all filters on the partition column
- If no filter to prune, skip decoding keys
* refactor: change the receivers of Shard::read/DataBuffer::read/DataParts::read to &self instead of &mut self
* refactor: remove allow(dead_code) in merge tree
* fix: KeyValues num_fields() is incorrect
* chore: fix warnings
* feat: support dedup
* feat: allow using the new memtable
* feat: serde default for config
* fix: resets pk index after finishing a dict
* feat: add fork method to the memtable
* feat: allow mark immutable returns result
* feat: use fork to create the mutable memtable
* feat: remove memtable builder from freeze
* chore: warnings
* fix: inspect error
* feat: iter returns result
* chore: maintains memtable id in region
* chore: update comment
* fix: remove region status if failed to freeze a memtable
* chore: update comment
* chore: iter should not require sync
* chore: implement freeze and fork for the new memtable
* refactor: data reader returns reference to data batch
* refactor: use range to create merger
* chore: Reference RecordBatch in DataBatch
* fix: top node not read if no next node
* refactor: move timestamp_array_to_i64_slice to data mod
* style: fix clippy
* chore: derive copy for DataBatch
* chore: address CR comments
* feat: write to a shard or a shard builder
* feat: freeze and fork for partition and shards
* chore: shard builder
* chore: change dict reader to support random access
* test: test write shard
* test: test write
* test: test memtable
* feat: add new and write_row to DataParts
* refactor: partition freeze shards
* refactor: write_with_pk_id
* style: fix clippy
* chore: add methods to get pk weights
* chore: fix compiler errors
* feat: impl merge reader for DataParts
* fix: fmt
* fix: sort rows with pk and ts according to sequence desc
* fix: remove pk weight as pk indices are already replaced by weights
* fix: format
* fix: some cr comments
* fix: some cr comments
* refactor: simplify trait's associated types
* fix: some cr comments
* refactor: set the actual bound port so we can use port 0 in testing
* Update src/servers/src/server.rs
Co-authored-by: Weny Xu <wenymedia@gmail.com>
* fmt
---------
Co-authored-by: Weny Xu <wenymedia@gmail.com>
* fix: logical region can't find region routes
* feat: fetch partitions info in batch
* refactor: rename batch functions
* refactor: rename DdlTaskExecutor to ProcedureExecutor
* feat: impl migrate_region and query_procedure_state for ProcedureExecutor
* feat: adds SQL function procedure_state and finish migrate_region impl
* fix: constant vector
* feat: unit tests for migrate_region and procedure_state
* test: test region migration by SQL
* fix: compile error after rebasing
* fix: clippy warnings
* feat: ensure procedure_state and migrate_region can be only called under greptime catalog
* fix: license header
* feat: impl for ScalarExpr
* feat: plain functions
* refactor: simpler trait bound&tests
* chore: remove unused imports
* chore: fmt
* refactor: early ret on first error
* refactor: remove redundant match arm
* chore: per review
* doc: `support` fn
* chore: per review more
* chore: more per review
* fix: extract_bound
* chore: per review
* refactor: reduce nest
* feat: replace pk index with pk_weight during freeze
* chore: add parameter to control pk_index replacement
* fix: dedup pk weights also
* fix: generate pk array before dedup
* feat: data buffer and related structs
* fix: some cr comments
* chore: remove freeze_threshold in DataBuffer
* fix: use LazyMutableVectorBuilder instead of two vectors; add option to control dedup
* fix: dedup rows according to both pk weights and timestamps
* fix: assemble DataBatch on demand
* refactor: bring metrics to http output
* chore: remove unwrap
* chore: make walk plan accumulate
* chore: change field name and comment
* chore: add metrics to http resp header
* chore: move PrometheusJsonResponse to a separate file and impl IntoResponse
* chore: put metrics in prometheus resp header too
* fix(util): join_path function should not trim leading `/`
Signed-off-by: Hudson C. Dalpra <dalpra.hcd@gmail.com>
* fix(util): making required changes at join_path function
* fix(util): added unit tests to match function comments
---------
Signed-off-by: Hudson C. Dalpra <dalpra.hcd@gmail.com>
* chore: start plugins in standalone
* chore: respect current catalog in use statement for mysql
* chore: reduce unnecessary conversions to string
* chore: reduce duplicate code
* feat: add arrow format output for sql api
* refactor: remove unwraps
* test: add test for arrow format
* chore: update cargo toml format
* fix: resolve lint warnings
* fix: ensure outputs size is one
* feat: make TypeConversionRule aware of the query context timezone setting
* chore: don't optimize explain command
* feat: parse string into timestamp with timezone
* fix: compile error
* chore: check the scalar value type in predicate
* chore: remove mut for engine context
* chore: return none if the scalar value is utf8 in time range predicate
* fix: some fixme
* feat: let Date and DateTime parsing from string value be aware of timezone
* chore: tweak
* test: add datetime from_str test with timezone
* feat: construct function context from query context
* test: add timezone test for to_unixtime and date_format function
* fix: typo
* chore: apply suggestion
* test: adds string with timezone
* chore: apply CR suggestion
Co-authored-by: Lei, HUANG <6406592+v0y4g3r@users.noreply.github.com>
* chore: apply suggestion
---------
Co-authored-by: Lei, HUANG <6406592+v0y4g3r@users.noreply.github.com>
* feat(tests-fuzz): add CreateTableExprGenerator
* refactor: move Column to root of ir mod
* feat: add AlterTableExprGenerator
* feat: add Serialize and Deserialize derive
* chore: refactor the AlterExprGenerator
* feat: auto config cache and buffer size according to mem size
* feat: utils
* refactor: add util function to common config
* refactor: check cgroups
* refactor: code
* fix: test
* fix: test
* chore: cr comment
Co-authored-by: Yingwen <realevenyag@gmail.com>
Co-authored-by: Dennis Zhuang <killme2008@gmail.com>
* chore: remove default comment
---------
Co-authored-by: Yingwen <realevenyag@gmail.com>
Co-authored-by: Dennis Zhuang <killme2008@gmail.com>
* feat: adds date_format function
* fix: compile error
* chore: use system timezone for FunctionContext and EvalContext
* test: as_formatted_string
* test: sqlness test
* chore: rename function
* docs: Update README.md
Complying with ASF policy we should refer to Apache projects in their full form in the first and most prominent usage.
* Update README.md
* Update README.md
* test: add unit tests
* feat: introduce kafka runtime backed by testcontainers
* test: add test for kafka runtime
* fix: format
* chore: make kafka image ready to be used
* feat: add entry builder
* tmp
* test: add unit tests for client manager
* test: add some unit tests for kafka log store
* chore: resolve some todos
* chore: resolve some todos
* test: add unit tests for kafka log store
* chore: add deprecate develop branch warning
Signed-off-by: Ruihang Xia <waynestxia@gmail.com>
* tmp: ready to move unit tests to an indie dir
* test: update unit tests for client manager
* test: add unit tests for meta srv remote wal
* fix: license
* fix: test
* refactor: kafka image
* doc: add doc example for kafka image
* chore: migrate kafka image to an indie PR
* fix: CR
* fix: CR
* fix: test
* fix: CR
* fix: update Cargo.toml
* fix: CR
* feat: skip test if no endpoints env
* fix: format
* test: rewrite parallel test with barrier
---------
Signed-off-by: Ruihang Xia <waynestxia@gmail.com>
Co-authored-by: Ruihang Xia <waynestxia@gmail.com>
* feat: split an entry if it's too large
* chore: rewrite check records
* test: add some unit tests for record
* chore: rewrite entry splitting
* chore: add unit tests for build records
* chore: add more unit tests for record
* chore: rewrite encdec of record
* revert: ignored test
* fix: set limit for max_batch_size
* fix: clippy
* chore: remove heavy logging
* fix: CR
* fix: properly terminate
* fix: CR
* fix: compiling
* fix: sqlness
* fix: CR
* fix: license
* fix: license
* refactor(metrics): add 'greptimedb_' prefix for every metrics
* chore: use 'greptime_' as prefix
* chore: add some prefix for new metrics
* chore: fix format error
* feat: implement `KeyRwLock`
* refactor: use KeyRwLock instead of LockMap
* refactor: use StringKey instead of String
* chore: remove redundant code
* refactor: clean up KeyRwLock stale locks before granting new lock
* feat: clean stale locks manually
* feat: sort lock keys in lexicographical order
* feat: ensure the ref count before dropping the rwlock
* feat: add more tests for rwlock
* feat: drop the key guards first
* feat: drops the key guards in the reverse order
* chore: apply suggestions from CR
* chore: apply suggestions from CR
* chore: apply suggestions from CR
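
The lock-ordering commits above ("sort lock key in lexicographically order", "drops the key guards in the reverse order") follow a common deadlock-avoidance pattern: every caller acquires its keys in one global order and releases them in the opposite order. A minimal sketch of that ordering, assuming plain string keys; `acquisition_order` and `release_order` are hypothetical helpers for illustration, not the `KeyRwLock` API:

```rust
/// Returns the order in which a set of lock keys would be acquired.
/// Hypothetical helper: keys are sorted lexicographically and de-duplicated,
/// so two callers that need overlapping keys always lock them in the same order.
fn acquisition_order(keys: &[&str]) -> Vec<String> {
    let mut ordered: Vec<String> = keys.iter().map(|k| k.to_string()).collect();
    ordered.sort(); // byte-wise lexicographic order on String
    ordered.dedup(); // the same key must not be locked twice
    ordered
}

/// Returns the order in which the guards would be dropped:
/// the reverse of the acquisition order.
fn release_order(acquired: &[String]) -> Vec<String> {
    acquired.iter().rev().cloned().collect()
}
```

For example, a caller that asks for a schema key before a catalog key would still lock the lexicographically smaller key first, whatever order the keys were passed in.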
* fix: fix heartbeat handler ignore upgrade candidate instruction
* fix: fix handler did not inject wal options
* feat: expose `RegionMigrationProcedureTask`
* feat(tests-integration): add a naive region migration test
* chore: apply suggestions from CR
* feat: add test if the target region has migrated
* chore: apply suggestions from CR
* chore(tests-integration): add setup tests with kafka wal to README.md
* feat(tests-integration): add meta wal config
* fix(tests-integration): fix sign of both_instances_cases_with_kafka_wal
* chore(tests-integration): set num_topic to 3 for tests
* test(tests-integration): add a naive test with kafka wal
* chore: apply suggestions from CR
* refactor: use string type instead of Option type for '--store-key-prefix'
Signed-off-by: zyy17 <zyylsxm@gmail.com>
* chore: refine for code review comments
---------
Signed-off-by: zyy17 <zyylsxm@gmail.com>
* feat: remote write metric task
* chore: pass standalone task to frontend
* chore: change name to system metric
* fix: add header and rename to export metrics
* refactor: open regions in background
* feat: add status code of RegionNotReady
* feat: use RegionNotReady instead of RegionNotFound for a registering region
* chore: apply suggestions from CR
* feat: add status code of RegionBusy
* feat: return RegionBusy for mutually exclusive operations
* refactor: refactor wal config
* test: update tests related to wal
* feat: introduce kafka wal config
* chore: augment proto with wal options
* feat: augment region open request with wal options
* feat: augment mito region with wal options
* feat: augment region create request with wal options
* refactor: refactor log store trait
* feat: add skeleton for kafka log store
* feat: generalize building log store when starting datanode
* feat: integrate wal options to region write
* chore: minor update
* refactor: remove wal options from region create/open requests
* fix: compilation issues
* chore: insert wal options into region options upon initializing region server
* chore: integrate wal options into region options
* chore: fill in kafka wal config
* chore: reuse namespaces while writing to wal
* chore: minor update
* chore: fetch wal options from region while handling truncate/flush
* fix: region options test
* fix: resolve some review conversations
* refactor: serde with wal options
* fix: resolve some review conversations
* feat: introduce wal config and kafka config
* feat: introduce kafka topic manager and selector
* feat: introduce region wal options
* chore: build region wal options upon starting meta srv
* feat: integrate region wal options allocator into table meta allocator
* chore: add wal config to metasrv.example.toml
* chore: add region wal options map to create table procedure
* feat: augment region create request with wal options
* feat: augment DatanodeTableValue with region wal options map
* chore: encode region wal options upon constructing table creator
* feat: persist region wal options when creating table meta
* fix: sqlness test
* chore: set default wal provider to raft-engine
* refactor: refactor wal options
* chore: update wal options allocator
* refactor: rename region wal options to wal options
* chore: update usages of region wal options
* chore: add some comments to kafka
* chore: fill in kafka config
* test: add tests for serde wal config
* test: add tests for wal options
* refactor: refactor wal options allocator to enum
* refactor: store wal options into the request options instead
* fix: typo
* fix: typo
* refactor: move wal options map to region info
* refactor: refactor serialization and deserialization of wal options
* refactor: use serde_json to encode wal options
* chore: rename wal_options_map to region_wal_options
* chore: resolve some review comments
* fix: typo
* refactor: replace kebab-case with snake_case
* fix: sqlness and coverage tests
* fix: typo
* fix: coverage test
* fix: coverage test
* chore: resolve some review conversations
* fix: resolve some review conversations
* chore: format comments in metasrv.example.toml
* chore: update import style
* feat: integrate wal options allocator to standalone mode
* test: add compatible test for OpenRegion
* test: add compatible test for UpdateRegionMetadata
* chore: remove based suffix from topic selector type
* fix: typos and bit operation
* fix: helper
* chore: add tests in decimal128.rs and interval.rs
* chore: test
* chore: change proto version
* chore: clippy
* chore: change auth_fn to function and return response with json body
* chore: move unsupported to debug level
* chore: add docs and tests
* chore: rebase and update test
* feat: sql with influxdb v1 result format
* chore: add unit tests
* feat: minor refactor
* chore: by comment
* chore: u128 to u64 since serde can't deser u128 in enum
* chore: by comment
* chore: apply suggestion
* chore: revert suggestion
* chore: try again
---------
Co-authored-by: dennis zhuang <killme2008@gmail.com>
* fix: use linear interpolation to implement range LINEAR fill strategy
* chore: update test case
* chore: optimize linear interpolation implementation
* chore: update test and add comment
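
For context on the LINEAR fill commits above, linear interpolation fills a missing slot from its nearest known neighbours on each side. A minimal sketch, assuming evenly spaced time slots and leaving leading and trailing gaps unfilled; `linear_fill` is a hypothetical helper and not the range query implementation, whose edge-case handling may differ:

```rust
/// Fills interior `None` gaps by linear interpolation between the nearest
/// known values on each side, treating slots as evenly spaced in time.
/// Assumption of this sketch: gaps before the first or after the last known
/// value have only one neighbour and are left as `None`.
fn linear_fill(values: &[Option<f64>]) -> Vec<Option<f64>> {
    let mut out = values.to_vec();
    // Index and value of the most recent known point seen so far.
    let mut prev: Option<(usize, f64)> = None;
    for (i, v) in values.iter().enumerate() {
        if let Some(right) = *v {
            if let Some((p, left)) = prev {
                let span = (i - p) as f64;
                // Fill every missing slot strictly between p and i.
                for j in (p + 1)..i {
                    let t = (j - p) as f64 / span;
                    out[j] = Some(left + (right - left) * t);
                }
            }
            prev = Some((i, right));
        }
    }
    out
}
```

With known values 1.0 and 5.0 four slots apart, the three slots between them are filled with 2.0, 3.0 and 4.0.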
* feat: add build function and register it
build() function to return the database build info #2909
* refactor: fix typos and change code structure
* test: add test for build()
* refactor: cargo fmt and eliminate warnings
* Apply suggestions from code review
Co-authored-by: Weny Xu <wenymedia@gmail.com>
* refactor: move system.sql to a new directory
---------
Co-authored-by: Weny Xu <wenymedia@gmail.com>
* test: add flow test for open candidate region with retryable error
* test: add flow test for upgrade candidate retry failed
* test: add flow test for upgrade candidate with retry
* refactor: use downgrading the region instead of closing region
* feat: enhance the tests for alive keeper
* feat: add a metric to track region lease expired
* chore: apply suggestions from CR
* chore: enable logging for test_distributed_handle_ddl_request
* refactor: simplify lease keeper
* feat: add metrics for lease keeper
* chore: apply suggestions from CR
* chore: apply suggestions from CR
* chore: apply suggestions from CR
* refactor: move OpeningRegionKeeper to common_meta
* feat: register operating regions to MemoryRegionKeeper
* feat: add backward compatibility test for persistent ctx
* refactor: refactor State of region migration
* feat: add test utils for region migration tests
* test: add simple region migration tests
* chore: apply suggestions from CR
* feat: adds date_add and date_sub function
* test: add date function
* fix: adds interval to date returns wrong result
* fix: header
* fix: typo
* fix: timestamp resolution
* fix: capacity
* chore: apply suggestion
* fix: wrong behavior when adding intervals to timestamp, date and datetime
* chore: remove unused error
* test: refactor and add some tests
* chore: add logs and metrics
* feat: add the timer to track heartbeat interval
* feat: add the gauge to track region leases
* refactor: use gauge instead of the timer
* chore: apply suggestions from CR
* feat: add hit rate and etcd txn metrics
* feat: add random weighted choice in load_based selector
* fix: meta cannot save heartbeats when the cluster has no regions
* chore: print some log
* chore: remove unused code
* cr
* add some logs when filter result is empty
* feat: add page cache
* docs: update mito config toml
* feat: impl CachedPageReader
* feat: use cache reader to read row group
* feat: do not fetch data if we have pages in cache
* chore: return if nothing to fetch
* feat: enlarge page cache size
* test: test write read parquet
* test: test cache
* docs: update comments
* test: fix config api test
* feat: cache metrics
* feat: change default page cache size
* test: fix config api test
* feat: bump prost and fix pprof feature compiler errors
* feat: fix compiler errors on tokio-console
* chore: fix compiler errors
* ci: add all features check to ci
* feat: decrease the `page size` if the response message size exceeds the limit
* chore: apply suggestions from CR
* feat: prefer to use adaptive_page_size
* chore: apply suggestions from CR
* feat: Control merge reader by batch size
* test: test heap have large range
* fix: merge one batch
* test: merge many duplicates
* test: test reheap hot
* feat: don't handle empty batch in merge reader
* feat: use fixed error message for unknown error
* feat: return fixed message for internal error as well
* chore: include status code in error message
* test: update tests for asserts of error message
* feat: change status code of some datafusion error
* fix: make CollectRecordbatch a query error
* test: update sqlness results
* feat: RepeatedTask adds execute-first-wait-later behavior.
* feat: add interval generator for repeated task component
* feat: impl debug for dyn IntervalGenerator trait
* chore: change some words
* chore: instead of complicated way, we add an initial_delay to control task interval
* chore: some improvements per PR comments
* feat: opentsdb row protocol
* fix: added comments for number of rows and failure if output is not affected rows
* fix: added extra 1 to number of columns
* fix: avoided cloning datapoints, took ownership instead
* fix: avoided cloning datapoints, took ownership instead
* fix: changed vector slice to vector
* fix: remove clone
* fix: combined datapoints and requests with zip instead of enumerating
---------
Co-authored-by: Ubuntu <ubuntu@ip-172-31-43-183.us-east-2.compute.internal>
* feat: enable range expr nesting
* fix: change range expr rewrite format
* chore: organize range query tests
* chore: change range expr name(e.g. MAX(v) RANGE 5s FILL 6)
* chore: add range query test
* chore: fix code advice
* chore: fix ca
* ci: add copy-image.sh and upload-artifacts-to-s3.sh
* ci: remove unused options in dev build
* ci: use 'upload-artifacts-to-s3.sh' and 'copy-image.sh' in release-cn-artifacts action
* refactor: refine copy-image.sh
* fix: invalid requests created by nyc-taxi
* feat: add timestamp to table name
* style: fix clippy
* chore: re-export deps for client
* fix: wait result
* chore: no need to define a prefix constant
* feat: memtable support filter pushdown to prune primary keys
* fix: switch to next time series when pk not selected
* fix: allow predicate evaluation failure
* fix: some clippy warnings
* fix: panic when no primary key in schema
* feat: cache decoded record batch for primary key
* refactor: use arcswap instead of rwlock
* fix: format toml
* test: test different order
* test: add tests for missing and invalid columns
* fix: do not skip schema validation while missing columns
* chore: use field_columns()
* test: add tests for different column order
* chore: add dbname in region request header for tracking purpose
* chore: fix handle read
* chore: add write meter
* chore: add meter-core to dep
* chore: add converter between RegionRequestHeader and QueryContext & update proto version
* feat: add vector_cache to CacheManager
* feat: cache repeated vectors
* feat: skip decoding pk if output doesn't contain tags
* test: add TestRegionMetadataBuilder
* test: test ProjectionMapper
* test: test vector cache
* test: test projection mapper convert
* style: fix clippy
* feat: do not cache vector if it is too large
* docs: update comment
* feat: merge by heap
* fix: fix heap order
* feat: avoid pop/push next and refactor some functions
* feat: replace merge_batches and fix tests
* test: add test that a key is deleted
* fix: skip empty batch
* style: clippy
* chore: fix typos
* feat: support greatest function
* feat: make greatest take date_type as input
* fix: move sqlness test into common/function/time.sql
* fix: avoid using unwrap
* fix: use downcast
* refactor: simplify arrow cast
* feat: implement new histogram data model
* feat: use prometheus table format for histogram
* refactor: remove duplicated code
* fix: histogram tag column
* fix: use accumulated count in buckets
* refactor: using row based protocol for otlp WIP
* refactor: use row based writer for otlp.
Also updated row writer for owned keys
* refactor: use row writers for otlp
* test: add integration tests for histogram
* refactor: change le column name
* test: test on_compaction_finished
* fix: avoid submit same region to compact
* feat: persist and recover compaction time window
* test: fix test
* test: sort like result
* feat: added show tables command
* fix(tests): fixed parser and statement unit tests
* chore: implemented display trait for table type
* fix: handled no tabletype and error for unsupported command in show database
* chore: removed full as a show kind, instead as a show option
* chore(tests): fixed failing test and added more tests for show full
* chore: refactored table types to use filters
* fix: changed table_type to tables
* feat: RegionMetadataBuilder allow adding/dropping columns multiple times
* test: test add if not exists/drop if exists
* feat: change validator and add need_alter
* test: fix tests and test need_alter
* test: test alter retry
* feat: open before create
* style: fix clippy
* refactor: move RegionOptions to options mod
* refactor: define compaction strategy in region/options.rs
* feat: use duration for time window
* refactor: rename CompactionStrategy to CompactionOptions
* feat: use serde to parse options
* feat: parse options
* feat: set options on creation/opening
* test: test create/open with options
* chore: remove todo
* feat: get compaction ttl and options from RegionOptions
* style: fix clippy
* chore: Remove unused engine_options
* style: fix clippy
* chore: remove todo
* fix: check version before alter region
* chore: apply suggestions from CR
* Update src/mito2/src/worker/handle_alter.rs
Co-authored-by: dennis zhuang <killme2008@gmail.com>
---------
Co-authored-by: dennis zhuang <killme2008@gmail.com>
* feat: allow multiple waiters in compaction request
* feat: compaction status wip
* feat: track region status in compaction scheduler
* feat: impl compaction scheduler
* feat: call compaction scheduler
* feat: remove status if nothing to compact
* feat: schedule compaction after flush
* feat: set compacting to false after compaction finished
* refactor: flush status only needs region id and version control
* refactor: schedule_compaction don't need region as argument
* test: test flush/scheduler for empty requests
* test: trigger compaction in test
* feat: notify scheduler on truncated
* chore: Apply suggestions from code review
Co-authored-by: JeremyHi <jiachun_feng@proton.me>
---------
Co-authored-by: JeremyHi <jiachun_feng@proton.me>
* refactor:
1. remove TableIdent, use TableId directly
2. use the latest greptime-proto
3. independently invalidate table id cache and table name cache
* rebase
* fix: resolve PR comments
* fix: resolve PR comments
* chore: set mutable limit to half of the global write buffer size
* refactor: put handle_flush_finished after handle_flush_request
* refactor: rename tests.rs to basic_test.rs
* style: fmt code
* feat: add writable flag to region.
* refactor: rename MitoEngine to MitoEngine::scanner
* feat: add set_writable() to RegionEngine
* feat: check whether region is writable
* feat: make set_writable sync
* test: test set_writable
* docs: update comments
* feat: send result on compaction failure
* refactor: wrap output sender in new type
* feat: on failure
* refactor: use get_region_or/writable_region_or
* refactor: remove send_result
* feat: notify waiters on flush scheduler drop
* test: fix tests
* fix: only alter writable region
* feat: add more info to error messages
* feat: store next column id in procedure
* fix: update next column id for table info
* test: fix add col test
* chore: remove location from invalid request error
* test: update test
* test: fix test
* test: add test for reopen
* feat: last entry id starts from flushed entry id
* fix: store flushed sequence and recover it from manifest
* test: check sequence in alter test
* test: more tests for alter
* feat: impl handle_alter wip
* refactor: move send_result to worker.rs
* feat: skeleton for handle_alter_request
* feat: write requests should wait for alteration
* feat: define alter request
* chore: no warnings
* fix: remove memtables after flush
* chore: update comments and impl add_write_request_to_pending
* feat: add schema version to RegionMetadata
* feat: impl alter_schema/can_alter_directly
* chore: use send_result
* test: pull next_batch again
* feat: convert pb AlterRequest to RegionAlterRequest
* feat: validate alter request
* feat: validate request and alter metadata
* feat: allow none location
* test: test alter
* fix: recover files and flushed entry id from manifest
* test: test alter
* chore: change comments and variables
* chore: fix compiler errors
* feat: add is_empty() to MemtableVersion
* test: fix metadata alter test
* fix: Compaction picker doesn't notify waiters if it returns None
* chore: address CR comments
* test: add tests for alter request
* refactor: use send_result
* feat: compaction component
* feat: mito2 compaction
* Avoid building time range predicates when merging SST files, since in TWCS we don't enforce a strict time window.
* fix: some CR comments
* minor: change CompactionRequest::senders to an option
* chore: handle compaction finish error
* feat: integrate compaction into region worker
* chore: rebase upstream
* fix: Some CR comments
* chore: Apply suggestions from code review
* style: fix clippy
---------
Co-authored-by: Yingwen <realevenyag@gmail.com>
* refactor:
1. remove method `register_system_table` from CatalogManager
2. the creation of ScriptTable (as a system table) is removed from CatalogManager. Instead, the ScriptTable is created when Frontend instance is starting; and is created by calling Frontend instance's grpc handler.
* rebase
* fix: filter out outdated heartbeat, #1707
* feat: reorder handlers
* refactor: disableXXX to enableXXX
* feat: make full use of region leases to facilitate failover
* chore: minor refactor
* chore: by comment
* feat: logging on inactive/active
* chore: call handle_flush_request
* feat: alias SchedulerRef and clean scheduler on drop
* feat: add scheduler to workers
* feat: remove RegionMemtableStats
* feat: pick regions to flush
* feat: add more fields to region flush task
* feat: smallvec workspace dep
* feat: Use list to hold immutable memtables
* feat: flush job wip
* feat: use access layer to read write sst
* feat: flush memtables to l0
* feat: write manifest
* feat: schedule next flush on success
* feat: schedule flush on success and failure
* feat: add purger to region
* feat: apply edit after flush
* feat: collect stats for SSTs
* feat: manual flush
* test: test flush and fix manifest test
* feat: remove flush scheduler job limit
* fix: typo
* style: clippy
* feat: clean flushed files on failure
* chore: address CR comment
* refactor: Use put_rows
* feat: Clean flush scheduler on drop
* feat: remove region flush status on drop and close
* chore: address CR comment
* feat: alias SchedulerRef and clean scheduler on drop
* feat: add scheduler to workers
* feat: use access layer to read write sst
* feat: add purger to region
* refactor: allow getting region_dir from AccessLayer
* feat: add scheduler to FlushScheduler
* feat: getter for object store
* chore: fix typo
Co-authored-by: Ruihang Xia <waynestxia@gmail.com>
---------
Co-authored-by: Ruihang Xia <waynestxia@gmail.com>
* feat: only allow timestamp data type as time index
* test: update sqltest cases, todo: need some fixes
* fix: sqlness tests
* fix: forgot adding back cte test
* chore: style
* fix: existing null value for schema name value
* chore: fix null check
* fix: change catalognamevalue and schemanamevalue to option
* fix: fix null case
* chore: update proto
* chore: add try from for schema name value
* chore: merge schema opts to table opts while creating table
* chore: use table ttl opts first
* chore: add unit test
* chore: update proto version
* feat: replay memtable when opening table
* test: region replay
* refactor: save logstore in TestEnv
* fix: some cr comments
* chore: rebase develop
* chore: update last entry id during replay
* chore: change userinfo to query_ctx in http handler
* chore: minor change
* chore: move prometheus http to http mod
* chore: fix unit test
* chore: add back schema check
* chore: minor change
* chore: remove clone
* refactor: use arrow::compute::concat instead of push values to vector builders
* feat: support projection
* refactor: remove sequence
* refactor: concatenate
* fix: series must not be empty
* refactor: projection
* feat: timestamp types sqlness tests
* feat: adds timestamp tests
* test: add string tests
* test: comment a case in timestamp
* test: add float type tests
* chore: adds TODO
* feat: set TZ=UTC for sqlness test
* feat: Implement slice and first/last timestamp for Batch
* feat(mito): implements sort/concat for Batch
* chore: fix typo
* chore: remove comments
* feat: sort and dedup
* test: test batch operations
* chore: cast enum to test op type
* test: test filter related api
* style: fix clippy
* docs: comment for slice
* chore: address CR comment
Don't return Option in get_timestamp()/get_sequence()
* feat: time series memtable
* feat: add some test
* fix: some clippy warnings
* chore: some rustdoc
* refactor: test
* fix: remove useless functions
* feat: add config for TimeSeriesMemtable
* chore: some optimizations
* refactor: remove bucketing
* refactor: avoid cloning RegionMetadataRef across all Series; make initial_builder_capacity a const; sort batch only by timestamp and sequence
* feat: rewrite do_get for streaming get flight data
* feat: rewrite do_get call stack but leave the async stream adapter not modified yet
* feat: rewrite the async stream adapter to accept greptime record batch stream
* fix: resolve some PR comments
* feat: rewrite tests to adapt to the streaming do_get
* feat: add unit tests for streaming do_get
* feat: rewrite timer metric of merge scan
* remove unhelpful unit tests for streaming do_get
* add a new metric timer for merge scan and fix some test errors
* rewrite mysql writer to write query results in a streaming manner
* fix: fix fmt errors
* fix: rewrite sqlness runner to take into account the streaming do_get
* fix: fix toml format errors
* fix: resolve some PR comments
* fix: resolve some PR comments
* fix: refactor do_get to increase readability
* fix: refactor mysql try_write_one to increase readability
* fix: fix ddl client can not update leader addr
* chore: apply suggestions from CR
* feat: add message to context
* fix: only retry if unavailable or deadline exceeded
* chore: apply suggestions from CR
* feat: datanode's row inserter
* refactor: ExprFactory
* feat: row inserter in standalone mode
* chore: minor refactor
* feat: influxdb line protocol's row protocol
* chore: minor refactor
* improve: avoid using too many strings
* no longer async
Co-authored-by: Ruihang Xia <waynestxia@gmail.com>
* chore: do not check empty data
* chore: by review comment
* chore: by comment
* chore: by review comment
* chore: by review comment
---------
Co-authored-by: Ruihang Xia <waynestxia@gmail.com>
* move prometheus routes to default http server
Signed-off-by: sh2 <shawnhxh@outlook.com>
* fix ci test and remove the server logic of prometheus
* remove unused import and prometheus relevant code
* fix ci: rustfmt and test
* fix ci: silly fmt
* fix ci: silly silly fmt
* change `/prom_store` back to `/prometheus`
* remove unused variable
---------
Signed-off-by: sh2 <shawnhxh@outlook.com>
* refactor: KeyValues return ValueRef
* 1. Change KeyValues returned value from pb value to ValueRef
2. Replace OpType/SemanticType with pb's OpType and SemanticType to avoid duplicated conversions.
* feat: define min value of OpType as a const
* fix: toml format
* feat: remove greptimedb-telemetry feature
* feat: adds enable_telemetry option to metasrv and datanode
* refactor: move data_home from file config to storage config
* feat: store the installation uuid into datanode and metasrv working home
* fix: cargo toml fmt
* test: ignore region failover test when using local file storage
* test: ignore telemetry reporter in test mode
* feat: print warning log when enabling telemetry
* chore: the telemetry doc link
* chore: remove enable_telemetry from datanode example config file
* refactor: rename GREPTIMEDB_TELEMETRY_CLIENT_REQUEST_TIMEOUT
* chore: rename print_warn_log to print_anonymous_usage_data_disclaimer
* ci: add context argument in build-greptime-binary action
* refactor: add 'working-dir' in upload-artifacts action and rename 'context' to 'working-dir'
* refactor: use timestamp as part of image tag when trigger manually
* fix(timestamp): add trim for the input date string
* fix(timestamp): add analyzer rule to trim strings before conversion
* fix: adjust according to CR
* chore: add version reporter
* chore: add uuid for version report
* chore: add file license
* chore: format code
* chore: fix by pr comment
* chore: change version report api url
* chore: change greptimedb opentelemetry crate name
* chore: minor code beautification
* chore: add keys only option when range etcd
* chore: fix by pr comment
* chore: fix by pr comment
* chore: change uuid file location
* chore: only run telemetry in meta leader
* chore: add more test and some minor fix
* chore: make clippy happy
* chore: fix by pr comment
* chore: fix by pr comment
* chore: add debug log for greptimedb telemetry
* feat: add exists api into KvBackend
* refactor: region lease
* feat: filter out inactive node in keep-lease
* feat: register&deregister inactive node
* chore: doc
* chore: ut
* chore: minor refactor
* feat: use memory_kv to store inactive node
* fix: use real error in
* chore: make inactive_node_manager's func compact
* chore: more efficiently
* feat: clear inactive status on candidate node
* refactor: unify the make targets of building images
* refactor: make Dockerfile more clean
1. Add dev-builder image to build greptime binary easily;
2. Add 'docker/ci/Dockerfile-centos' to release centos image;
3. Delete Dockerfile of aarch64 and just need to use one Dockerfile;
Signed-off-by: zyy17 <zyylsxm@gmail.com>
---------
Signed-off-by: zyy17 <zyylsxm@gmail.com>
* feat: define structs for version
* feat: Build region from metadata and memtable builder
* feat: impl validate for metadata
* feat: add more fields to RegionMetadata
* test: more tests
* test: more check and test
* feat: allow overwriting version
* style: fix clippy
* chore: remove useless Option type in plugins (#1544)
Co-authored-by: paomian <qtang@greptime.com>
* feat: first commit for time type
* feat: impl time type
* fix: arrow vectors type conversion
* test: add time test
* test: adds more tests for time type
* chore: style
* fix: sqlness result
* Update src/common/time/src/time.rs
Co-authored-by: Lei, HUANG <6406592+v0y4g3r@users.noreply.github.com>
* chore: CR comments
---------
Co-authored-by: localhost <xpaomian@gmail.com>
Co-authored-by: paomian <qtang@greptime.com>
Co-authored-by: Lei, HUANG <6406592+v0y4g3r@users.noreply.github.com>
refactor: move heartbeat configuration into an independent section in config file
* refactor: move heartbeat configuration into an independent section in config file
* feat: add HeartbeatOptions struct
* test: modify corresponding test case
* chore: modify corresponding example file
* feat: support append entries from multiple regions at a time
* chore: add some tests
* fix: false positive mutable_key warning
* fix: append_batch api
* fix: remove unused clippy allows
* chore(prom): rename prometheus(remote storage) to prom-store and promql(HTTP server) to prometheus
* chore: apply clippy suggestions
* chore: adjust format according to rustfmt
* feat: meta procedure options
* chore: tune meta procedure options in tests
* Update src/common/procedure/Cargo.toml
Co-authored-by: dennis zhuang <killme2008@gmail.com>
---------
Co-authored-by: dennis zhuang <killme2008@gmail.com>
* feat(config-endpoint): add initial implementation
* feat: add initial handler implementation
* fix: apply clippy suggestions, use axum response instead of string
* feat: address CR suggestions
* fix: minor adjustments in formatting
* fix: add a test
* feat: add to_toml_string method to options
* fix: adjust the assertion for the integration test
* fix: adjust expected indents
* fix: adjust assertion for the integration test
* fix: improve according to clippy
* feat(http_body_limit): add initial support for DefaultBodyLimit
* fix: address CR suggestions
* fix: adjust the const for default http body limit
* fix: adjust the toml_str for the test
* fix: address CR suggestions
* fix: body_limit units in example config toml files
* fix: address clippy suggestions
* feat: initial twcs impl
* chore: rename SimplePicker to LeveledPicker
* rename some structs
* Remove Compaction strategy
* make compaction picker a trait object
* make compaction picker configurable for every region
* chore: add some test for ttl
* add some tests
* fix: some style issues in cr
* feat: enable twcs when creating tables
* feat: allow config time window when creating tables
* fix: some cr comments
* feat: add log
* feat: print more info
* feat: use chain reader
* fix: panic on getting first range
* fix: prev not updated
* fix: reverse readers and iter backward
* chore: don't print windows in log
* feat: consider memtable range
Also fix the issue of using an incorrect comparison method to sort time
ranges.
* fix: merge memtable window with sst's
* feat: add use_chain_reader option
* feat: skip empty memtables
* chore: change log level
* fix: memtable range not ordered
* style: fix clippy
* chore: address review comments
* chore: print region id in log
* feat: txn for meta kvstore
* feat: txn
* chore: add unit test
* chore: more test
* chore: more test
* Update src/meta-srv/src/service/store/memory.rs
Co-authored-by: LFC <bayinamine@gmail.com>
* chore: by cr
---------
Co-authored-by: LFC <bayinamine@gmail.com>
* refactor: add table_id to get_table()/table_exists()
* refactor: Add table_id to alter table request
* refactor: Add table id to DropTableRequest
* refactor: add table id to DropTableRequest
* refactor: Use table id as key for the tables map
* refactor: use table id as file engine's map key
* refactor: Remove table reference from engine's get_table/table_exists
* style: remove unused imports
* feat!: Add table id to TableRegionalValue
* style: fix clippy
* chore: add comments and logs
* feat: support to copy from orc format
* test: add copy from orc test
* chore: add license header
* refactor: remove unimplemented macro
* chore: apply suggestions from CR
* chore: bump orc-rust to 0.2.3
* feat: add initial implementation for status endpoint
* feat(status_endpoint): add more data to response
* feat(status_endpoint): use build data env vars
* feat(status_endpoint): add simple test
* fix(status_endpoint): adjust the toml indentation
* fix: set max_files_in_l0 in unit tests to avoid compaction
* refactor: pass whole EngineConfig
* fix: comment out unstable sqlness test
* revert commented sqlness
* add some debug log
* fix: use lazy parquet reader in MitoTable::scan_to_stream to avoid IO in plan stage
* fix: unit tests
* fix: order-by optimization
* add some tests
* fix: move metric names to metrics.rs
* fix: some cr comments
* refactor: Remove MySQL related options from Datanode
remove mysql_addr and mysql_runtime_size in datanode.rs, remove command line argument mysql_addr in cmd/src/datanode.rs
#1739
* feat: remove --mysql-addr from command line
in pre-commit, sqlness cannot find --mysql-addr, because we removed it
issue#1739
* refactor: remove --mysql-addr from command line
in pre-commit, sqlness cannot find --mysql-addr, because we removed it
issue#1739
# - If it's a tag push release, the version is the tag name(${{ github.ref_name }});
# - If it's a scheduled release, the version is '${{ env.NEXT_RELEASE_VERSION }}-nightly-$buildTime', like 'v0.2.0-nightly-20230313';
# - If it's a manual release, the version is '${{ env.NEXT_RELEASE_VERSION }}-$(git rev-parse --short HEAD)-YYYYMMDDSS', like 'v0.2.0-e5b243c-2023071245';
# - If it's a nightly build, the version is 'nightly-YYYYMMDD-$(git rev-parse --short HEAD)', like 'nightly-20230712-e5b243c'.
release-dev-builder-images-cn: # Note: Be careful issue:https://github.com/containers/skopeo/issues/1874 and we decide to use the latest stable skopeo container.
# 1. The tag('v*.*.*') push release: the release workflow will be triggered by the tag push event.
# 2. The scheduled release(the version will be '${{ env.NEXT_RELEASE_VERSION }}-nightly-YYYYMMDD'): the release workflow will be triggered by the schedule event.
on:
  push:
    tags:
@@ -5,464 +10,543 @@ on:
  schedule:
    # At 00:00 on Monday.
    - cron: '0 0 * * 1'
  # Manual triggers only build binaries.
  workflow_dispatch:
name: Release
  workflow_dispatch: # Allows you to run this workflow manually.
    # Notes: The GitHub Actions ONLY support 10 inputs, and it's already used up.
    inputs:
      linux_amd64_runner:
        type: choice
        description: The runner used to build linux-amd64 artifacts
        default: ec2-c6i.4xlarge-amd64
        options:
          - ubuntu-22.04
          - ubuntu-22.04-8-cores
          - ubuntu-22.04-16-cores
          - ubuntu-22.04-32-cores
          - ubuntu-22.04-64-cores
          - ec2-c6i.xlarge-amd64 # 4C8G
          - ec2-c6i.2xlarge-amd64 # 8C16G
          - ec2-c6i.4xlarge-amd64 # 16C32G
          - ec2-c6i.8xlarge-amd64 # 32C64G
          - ec2-c6i.16xlarge-amd64 # 64C128G
      linux_arm64_runner:
        type: choice
        description: The runner used to build linux-arm64 artifacts
        default: ec2-c6g.8xlarge-arm64
        options:
          - ubuntu-2204-32-cores-arm
          - ec2-c6g.xlarge-arm64 # 4C8G
          - ec2-c6g.2xlarge-arm64 # 8C16G
          - ec2-c6g.4xlarge-arm64 # 16C32G
          - ec2-c6g.8xlarge-arm64 # 32C64G
          - ec2-c6g.16xlarge-arm64 # 64C128G
      macos_runner:
        type: choice
        description: The runner used to build macOS artifacts
        default: macos-latest
        options:
          - macos-latest
      skip_test:
        description: Do not run integration tests during the build
        type: boolean
        default: true
      build_linux_amd64_artifacts:
        type: boolean
        description: Build linux-amd64 artifacts
        required: false
        default: false
      build_linux_arm64_artifacts:
        type: boolean
        description: Build linux-arm64 artifacts
        required: false
        default: false
      build_macos_artifacts:
        type: boolean
        description: Build macOS artifacts
        required: false
        default: false
      build_windows_artifacts:
        type: boolean
        description: Build Windows artifacts
        required: false
        default: false
      publish_github_release:
        type: boolean
        description: Create GitHub release and upload artifacts
        required: false
        default: false
      release_images:
        type: boolean
        description: Build and push images to DockerHub and ACR
        required: false
        default: false
# Use env variables to control all the release process.
env:
  RUST_TOOLCHAIN: nightly-2023-05-03
  SCHEDULED_BUILD_VERSION_PREFIX: v0.3.0
  SCHEDULED_PERIOD: nightly
  # The arguments of building greptime.
  CARGO_PROFILE: nightly
  # Controls whether to run tests, include unit-test, integration-test and sqlness.
      - name: Configure scheduled build version # the version would be ${SCHEDULED_BUILD_VERSION_PREFIX}-${SCHEDULED_PERIOD}-YYYYMMDD, like v0.2.0-nightly-20230313.
Thanks a lot for considering contributing to GreptimeDB. We believe people like you would make GreptimeDB a great product. We intend to build a community where individuals can have open talks, show respect for one another, and speak with true ❤️. Meanwhile, we are to keep transparency and make your effort count here.
Read the guidelines, and they can help you get started. Communicate with respect to developers maintaining and developing the project. In return, they should reciprocate that respect by addressing your issue, reviewing changes, as well as helping finalize and merge your pull requests.
You can find our contributors at https://github.com/GreptimeTeam/greptimedb/graphs/contributors. When you have dedicated yourself to GreptimeDB for a few months and keep bringing high-quality contributions (code, docs, advocacy, etc.), you will be a candidate for committer.
A committer will be granted both read & write access to GreptimeDB repos. Check the [AUTHOR.md](AUTHOR.md) file for all current individual committers.
Please read the guidelines, and they can help you get started. Communicate respectfully with the developers maintaining and developing the project. In return, they should reciprocate that respect by addressing your issue, reviewing changes, as well as helping finalize and merge your pull requests.
Follow our [README](https://github.com/GreptimeTeam/greptimedb#readme) to get the whole picture of the project. To learn about the design of GreptimeDB, please refer to the [design docs](https://github.com/GrepTimeTeam/docs).
@@ -10,7 +14,7 @@ Follow our [README](https://github.com/GreptimeTeam/greptimedb#readme) to get th
It can feel intimidating to contribute to a complex project, but it can also be exciting and fun. These general notes will help everyone participate in this communal activity.
- Follow the [Code of Conduct](https://github.com/GreptimeTeam/greptimedb/blob/develop/CODE_OF_CONDUCT.md)
- Follow the [Code of Conduct](https://github.com/GreptimeTeam/.github/blob/main/.github/CODE_OF_CONDUCT.md)
- Small changes make huge differences. We will happily accept a PR making a single character change if it helps move forward. Don't wait to have everything working.
- Check the closed issues before opening your issue.
- Try to follow the existing style of the code.
@@ -21,12 +25,12 @@ Pull requests are great, but we accept all kinds of other help if you like. Such
- Write tutorials or blog posts. Blog, speak about, or create tutorials about one of GreptimeDB's many features. Mention [@greptime](https://twitter.com/greptime) on Twitter and email info@greptime.com so we can give pointers and tips and help you spread the word by promoting your content on Greptime communication channels.
- Improve the documentation. [Submit documentation](http://github.com/greptimeTeam/docs/) updates, enhancements, designs, or bug fixes. Fixes for spelling or grammar errors are also very much appreciated.
- Present at meetups and conferences about your GreptimeDB projects. Your unique challenges and successes in building things with GreptimeDB can provide great speaking material. We'd love to review your talk abstract, so get in touch with us if you'd like some help!
- Submit bug reports. To report a bug or a security issue, you can [open a new GitHub issue](https://github.com/GrepTimeTeam/greptimedb/issues/new).
- Submitting bug reports. To report a bug or a security issue, you can [open a new GitHub issue](https://github.com/GrepTimeTeam/greptimedb/issues/new).
- Speak up with feature requests. Sending feedback is a great way for us to better understand your different use cases of GreptimeDB. If you want to share your experience with GreptimeDB, or if you want to discuss any ideas, you can start a discussion on [GitHub discussions](https://github.com/GreptimeTeam/greptimedb/discussions), chat with the Greptime team on [Slack](https://greptime.com/slack), or you can tweet [@greptime](https://twitter.com/greptime) on Twitter.
## Code of Conduct
Also, there are things that we are not looking for because they don't match the goals of the product or benefit the community. Please read [Code of Conduct](https://github.com/GreptimeTeam/greptimedb/blob/develop/CODE_OF_CONDUCT.md); we hope everyone can keep good manners and become an honored member.
Also, there are things that we are not looking for because they don't match the goals of the product or benefit the community. Please read [Code of Conduct](https://github.com/GreptimeTeam/.github/blob/main/.github/CODE_OF_CONDUCT.md); we hope everyone can keep good manners and become an honored member.
## License
@@ -49,8 +53,9 @@ GreptimeDB uses the [Apache 2.0 license](https://github.com/GreptimeTeam/greptim
### Before PR
- To ensure that the community is free and confident in its ability to use your contributions, please sign the Contributor License Agreement (CLA), which will be incorporated in the pull request process.
- Make sure all your codes are formatted and follow the [coding style](https://pingcap.github.io/style-guide/rust/).
- Make sure all unit tests are passed (using `cargo test --workspace` or [nextest](https://nexte.st/index.html) `cargo nextest run`).
- Make sure all files have proper license header (running `docker run --rm -v $(pwd):/github/workspace ghcr.io/korandoru/hawkeye-native:v3 format` from the project root).
- Make sure all your code is formatted and follows the [coding style](https://pingcap.github.io/style-guide/rust/) and [style guide](docs/style-guide.md).
- Make sure all unit tests pass using [nextest](https://nexte.st/index.html) `cargo nextest run`.
- Make sure all clippy warnings are fixed (you can check it locally by running `cargo clippy --workspace --all-targets -- -D warnings`).
#### `pre-commit` Hooks
@@ -81,7 +86,7 @@ Now, `pre-commit` will run automatically on `git commit`.
### Title
The titles of pull requests should be prefixed with category names listed in [Conventional Commits specification](https://www.conventionalcommits.org/en/v1.0.0)
like `feat`/`fix`/`docs`, with a concise summary of code change following. DO NOT use last commit message as pull request title.
like `feat`/`fix`/`docs`, with a concise summary of code change following. AVOID using the last commit message as pull request title.
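As an illustration of the title rule above, here is a minimal sketch of a title checker. It is not part of the GreptimeDB tooling: `parse_title` and the `TYPES` list are hypothetical names, and the list of accepted prefixes is an assumption based on commonly used Conventional Commits types, not an official allow-list for this project.

```python
import re

# Hypothetical allow-list of category prefixes (an assumption, not project policy).
TYPES = ("feat", "fix", "docs", "style", "refactor", "perf",
         "test", "build", "ci", "chore", "revert")

# <type>[(scope)][!]: <summary>
TITLE_RE = re.compile(
    r"^(?P<type>" + "|".join(TYPES) + r")"
    r"(?:\((?P<scope>[^()\s]+)\))?"   # optional scope, e.g. (mito)
    r"(?P<breaking>!)?"               # optional breaking-change marker
    r": (?P<summary>\S.*)$"           # colon, one space, non-empty summary
)


def parse_title(title):
    """Return the parsed parts of a PR title, or None if it does not conform."""
    match = TITLE_RE.match(title)
    return match.groupdict() if match else None
```

For example, `parse_title("feat(mito): implements sort/concat for Batch")` yields a dictionary with `type` set to `feat` and `scope` set to `mito`, while a title without a recognized prefix returns `None`.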
### Description
@@ -100,10 +105,10 @@ of what you were trying to do and what went wrong. You can also reach for help i
## Community
The core team will be thrilled if you participate in any way you like. When you are stuck, try ask for help by filing an issue, with a detailed description of what you were trying to do and what went wrong. If you have any questions or if you would like to get involved in our community, please check out:
The core team will be thrilled if you would like to participate in any way you like. When you are stuck, try to ask for help by filing an issue, with a detailed description of what you were trying to do and what went wrong. If you have any questions or if you would like to get involved in our community, please check out:
- [GreptimeDB Community Slack](https://greptime.com/slack)
- Native SQL, and Python scripting for advanced analytical scenarios
- Widely adopted database protocols and APIs, native PromQL supports
- Extensible table engine architecture for extensive workloads
**GreptimeDB** is an open-source, cloud-native database purpose-built for the unified collection and analysis of observability data (metrics, logs, and traces). Whether you’re operating on the edge, in the cloud, or across hybrid environments, GreptimeDB empowers real-time insights at massive scale — all in one system.
## Quick Start
## Features
### GreptimePlay
| Feature | Description |
| --------- | ----------- |
| [Unified Observability Data](https://docs.greptime.com/user-guide/concepts/why-greptimedb) | Store metrics, logs, and traces as timestamped, contextual wide events. Query via [SQL](https://docs.greptime.com/user-guide/query-data/sql), [PromQL](https://docs.greptime.com/user-guide/query-data/promql), and [streaming](https://docs.greptime.com/user-guide/flow-computation/overview). |
| [High Performance & Cost Effective](https://docs.greptime.com/user-guide/manage-data/data-index) | Written in Rust, with a distributed query engine, [rich indexing](https://docs.greptime.com/user-guide/manage-data/data-index), and optimized columnar storage, delivering sub-second responses at PB scale. |
| [Cloud-Native Architecture](https://docs.greptime.com/user-guide/concepts/architecture) | Designed for [Kubernetes](https://docs.greptime.com/user-guide/deployments/deploy-on-kubernetes/greptimedb-operator-management), with compute/storage separation, native object storage (AWS S3, Azure Blob, etc.) and seamless cross-cloud access. |
| [Developer-Friendly](https://docs.greptime.com/user-guide/protocols/overview) | Access via SQL/PromQL interfaces, REST API, MySQL/PostgreSQL protocols, and popular ingestion [protocols](https://docs.greptime.com/user-guide/protocols/overview). |
| [Flexible Deployment](https://docs.greptime.com/user-guide/deployments/overview) | Deploy anywhere: edge (including ARM/[Android](https://docs.greptime.com/user-guide/deployments/run-on-android)) or cloud, with unified APIs and efficient data sync. |
Try out the features of GreptimeDB right from your browser.
Learn more in [Why GreptimeDB](https://docs.greptime.com/user-guide/concepts/why-greptimedb) and [Observability 2.0 and the Database for It](https://greptime.com/blogs/2025-04-25-greptimedb-observability2-new-database).
- C/C++ Toolchain: provides basic tools for compiling and linking. This is
available as `build-essential` on Ubuntu and under similar names on other platforms.
- Rust: the easiest way to install Rust is to use
[`rustup`](https://rustup.rs/), which will check our `rust-toolchain` file and
install the correct Rust version for you.
- Protobuf: `protoc` is required for compiling `.proto` files. `protobuf` is
available from major package managers on macOS and Linux distributions. You can
find installation instructions [here](https://grpc.io/docs/protoc-installation/).
**Note that the `protoc` version needs to be >= 3.15** because we use the `optional`
keyword. You can check it with `protoc --version`.
- python3-dev or python3-devel (optional; only needed if you want to run scripts
in CPython, in which case you also need to enable the `pyo3_backend` feature when compiling, either with `cargo run -F pyo3_backend` or by adding `pyo3_backend` to `features.default` in src/script/Cargo.toml, like `default = ["python", "pyo3_backend"]`): this installs the Python shared library required for running the Python
scripting engine (in CPython mode). It is available as `python3-dev` on
Ubuntu, where you can install it with `sudo apt install python3-dev`, or as
`python3-devel` on RPM-based distributions (e.g. Fedora, Red Hat, SuSE). Mac's
`Python3` package should have this shared library by default. More details on compiling with PyO3 can be found in [PyO3](https://pyo3.rs/v0.18.1/building_and_distribution#configuring-the-python-version)'s documentation.
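The `protoc` version requirement above can be checked programmatically. The following is a minimal sketch, assuming the output of `protoc --version` has the usual `libprotoc X.Y.Z` form; `parse_protoc_version` and `protoc_is_new_enough` are hypothetical helper names, not part of the build scripts.

```python
import re

# Minimum version required for the `optional` keyword, per the note above.
MIN_PROTOC = (3, 15)


def parse_protoc_version(output):
    """Extract the dotted version from text such as 'libprotoc 3.21.12'."""
    match = re.search(r"(\d+(?:\.\d+)+)", output)
    if match is None:
        raise ValueError(f"no version found in {output!r}")
    return tuple(int(part) for part in match.group(1).split("."))


def protoc_is_new_enough(output, minimum=MIN_PROTOC):
    """Compare only as many components as the minimum specifies."""
    return parse_protoc_version(output)[:len(minimum)] >= minimum
```

To use it, pass in the text printed by `protoc --version`, for instance the `stdout` captured with `subprocess.run(["protoc", "--version"], capture_output=True, text=True)`.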
## Architecture
#### Build with Docker
* Read the [architecture](https://docs.greptime.com/contributor-guide/overview/#architecture) document.
* [DeepWiki](https://deepwiki.com/GreptimeTeam/greptimedb/1-overview) provides an in-depth look at GreptimeDB:
<img alt="GreptimeDB System Overview" src="docs/architecture.png">
A docker image with necessary dependencies is provided:
* C/C++ build essentials, including `gcc`/`g++`/`autoconf` and the glibc library (e.g. `libc6-dev` on Ubuntu and `glibc-devel` on Fedora)
* Python toolchain (optional): Required only if using some test scripts.
**Build and Run:**
```bash
make
cargo run -- standalone start
```
Or if you built from docker:
## Tools & Extensions
```bash
docker run -p 4002:4002 -v "$(pwd):/tmp/greptimedb" greptime/greptimedb standalone start
```
Please see [the online document site](https://docs.greptime.com/getting-started/overview#install-greptimedb) for more installation options and [operations info](https://docs.greptime.com/user-guide/operations/overview).
### Get started
Read the [complete getting started guide](https://docs.greptime.com/getting-started/overview#connect) on our [official document site](https://docs.greptime.com/).
To write and query data, GreptimeDB is compatible with multiple [protocols and clients](https://docs.greptime.com/user-guide/clients).
For Linux and macOS, you can easily download pre-built binaries including official releases and nightly builds that are ready to use.
In most cases, downloading the version without PyO3 is sufficient. However, if you plan to run scripts in CPython (and use Python packages like NumPy and Pandas), you will need to download the version with PyO3 and install a Python with the same version as the Python in the PyO3 version.
We recommend using virtualenv for the installation process to manage multiple Python versions.
This project is in its early stage and under heavy development. We move fast and
break things. Benchmark on development branch may not represent its potential
performance. We release pre-built binaries constantly for functional
evaluation. Do not use it in production at the moment.
> **Status:** Beta.
> **GA (v1.0):** Targeted for mid 2025.
For future plans, check out [GreptimeDB roadmap](https://github.com/GreptimeTeam/greptimedb/issues/669).
- Being used in production by early adopters
- Stable, actively maintained, with regular releases ([version info](https://docs.greptime.com/nightly/reference/about-greptimedb-version))
- Suitable for evaluation and pilot deployments
For production use, we recommend using the latest stable release.
[Star History](https://www.star-history.com/#GreptimeTeam/GreptimeDB&Date)
If you find this project useful, a ⭐ would mean a lot to us!
- Explore [Internal Concepts](https://docs.greptime.com/contributor-guide/overview.html) and [DeepWiki](https://deepwiki.com/GreptimeTeam/greptimedb).
- Pick up a [good first issue](https://github.com/GreptimeTeam/greptimedb/issues?q=is%3Aissue+is%3Aopen+label%3A%22good+first+issue%22) and join the #contributors [Slack](https://greptime.com/slack) channel.
## Acknowledgement
- GreptimeDB uses [Apache Arrow](https://arrow.apache.org/) as the memory model and [Apache Parquet](https://parquet.apache.org/) as the persistent file format.
- GreptimeDB's query engine is powered by [Apache Arrow DataFusion](https://github.com/apache/arrow-datafusion).
- [OpenDAL](https://github.com/datafuselabs/opendal) from [Datafuse Labs](https://github.com/datafuselabs) gives GreptimeDB a very general and elegant data access abstraction layer.
- GreptimeDB’s meta service is based on [etcd](https://etcd.io/).
- GreptimeDB uses [RustPython](https://github.com/RustPython/RustPython) for experimental embedded Python scripting.
Special thanks to all contributors! See [AUTHOR.md](https://github.com/GreptimeTeam/greptimedb/blob/main/AUTHOR.md).
| `default_timezone` | String | Unset | The default timezone of the server. |
| `init_regions_in_background` | Bool | `false` | Initialize all regions in the background during the startup.<br/>By default, it provides services after all regions have been initialized. |
| `max_concurrent_queries` | Integer | `0` | The maximum number of concurrent queries allowed to be executed. Zero means unlimited. |
| `enable_telemetry` | Bool | `true` | Enable telemetry to collect anonymous usage data. Enabled by default. |
| `max_in_flight_write_bytes` | String | Unset | The maximum in-flight write bytes. |
| `runtime` | -- | -- | The runtime options. |
| `runtime.global_rt_size` | Integer | `8` | The number of threads to execute the runtime for global read operations. |
| `runtime.compact_rt_size` | Integer | `4` | The number of threads to execute the runtime for global write operations. |
| `http` | -- | -- | The HTTP server options. |
| `http.addr` | String | `127.0.0.1:4000` | The address to bind the HTTP server. |
| `http.timeout` | String | `0s` | HTTP request timeout. Set to 0 to disable timeout. |
| `http.body_limit` | String | `64MB` | HTTP request body limit.<br/>The following units are supported: `B`, `KB`, `KiB`, `MB`, `MiB`, `GB`, `GiB`, `TB`, `TiB`, `PB`, `PiB`.<br/>Set to 0 to disable limit. |
| `http.enable_cors` | Bool | `true` | HTTP CORS support, it's turned on by default<br/>This allows browser to access http APIs without CORS restrictions |
| `grpc.tls.watch` | Bool | `false` | Watch for Certificate and key file change and auto reload.<br/>For now, gRPC tls config does not support auto reload. |
| `prom_store.enable` | Bool | `true` | Whether to enable Prometheus remote write and read in HTTP API. |
| `prom_store.with_metric_engine` | Bool | `true` | Whether to store the data from Prometheus remote write in metric engine. |
| `wal` | -- | -- | The WAL options. |
| `wal.provider` | String | `raft_engine` | The provider of the WAL.<br/>- `raft_engine`: the wal is stored in the local file system by raft-engine.<br/>- `kafka`: it's remote wal that data is stored in Kafka. |
| `wal.dir` | String | Unset | The directory to store the WAL files.<br/>**It's only used when the provider is `raft_engine`**. |
| `wal.file_size` | String | `128MB` | The size of the WAL segment file.<br/>**It's only used when the provider is `raft_engine`**. |
| `wal.purge_threshold` | String | `1GB` | The threshold of the WAL size to trigger a purge.<br/>**It's only used when the provider is `raft_engine`**. |
| `wal.purge_interval` | String | `1m` | The interval to trigger a purge.<br/>**It's only used when the provider is `raft_engine`**. |
| `wal.read_batch_size` | Integer | `128` | The read batch size.<br/>**It's only used when the provider is `raft_engine`**. |
| `wal.sync_write` | Bool | `false` | Whether to use sync write.<br/>**It's only used when the provider is `raft_engine`**. |
| `wal.enable_log_recycle` | Bool | `true` | Whether to reuse logically truncated log files.<br/>**It's only used when the provider is `raft_engine`**. |
| `wal.prefill_log_files` | Bool | `false` | Whether to pre-create log files on start up.<br/>**It's only used when the provider is `raft_engine`**. |
| `wal.sync_period` | String | `10s` | Duration for fsyncing log files.<br/>**It's only used when the provider is `raft_engine`**. |
| `wal.broker_endpoints` | Array | -- | The Kafka broker endpoints.<br/>**It's only used when the provider is `kafka`**. |
| `wal.auto_create_topics` | Bool | `true` | Automatically create topics for WAL.<br/>Set to `true` to automatically create topics for WAL.<br/>Otherwise, use topics named `topic_name_prefix_[0..num_topics)` |
| `wal.num_topics` | Integer | `64` | Number of topics.<br/>**It's only used when the provider is `kafka`**. |
| `wal.selector_type` | String | `round_robin` | Topic selector type.<br/>Available selector types:<br/>- `round_robin` (default)<br/>**It's only used when the provider is `kafka`**. |
| `wal.topic_name_prefix` | String | `greptimedb_wal_topic` | A Kafka topic is constructed by concatenating `topic_name_prefix` and `topic_id`.<br/>e.g., greptimedb_wal_topic_0, greptimedb_wal_topic_1.<br/>**It's only used when the provider is `kafka`**. |
| `wal.replication_factor` | Integer | `1` | Expected number of replicas of each partition.<br/>**It's only used when the provider is `kafka`**. |
| `wal.create_topic_timeout` | String | `30s` | Above which a topic creation operation will be cancelled.<br/>**It's only used when the provider is `kafka`**. |
| `wal.max_batch_bytes` | String | `1MB` | The max size of a single producer batch.<br/>Warning: Kafka has a default limit of 1MB per message in a topic.<br/>**It's only used when the provider is `kafka`**. |
| `wal.consumer_wait_timeout` | String | `100ms` | The consumer wait timeout.<br/>**It's only used when the provider is `kafka`**. |
| `wal.overwrite_entry_start_id` | Bool | `false` | Ignore missing entries when reading the WAL.<br/>**It's only used when the provider is `kafka`**.<br/><br/>This option ensures that when Kafka messages are deleted, the system<br/>can still successfully replay memtable data without throwing an<br/>out-of-range error.<br/>However, enabling this option might lead to unexpected data loss,<br/>as the system will skip over missing entries instead of treating<br/>them as critical errors. |
| `procedure.max_running_procedures` | Integer | `128` | Max running procedures.<br/>The maximum number of procedures that can be running at the same time.<br/>If the number of running procedures exceeds this limit, the procedure will be rejected. |
| `flow` | -- | -- | flow engine options. |
| `flow.num_workers` | Integer | `0` | The number of flow workers in the flownode.<br/>If unset (or set to 0), the number of CPU cores divided by 2 is used. |
| `query` | -- | -- | The query engine options. |
| `query.parallelism` | Integer | `0` | Parallelism of the query engine.<br/>Default to 0, which means the number of CPU cores. |
| `storage` | -- | -- | The data storage options. |
| `storage.data_home` | String | `./greptimedb_data` | The working home directory. |
| `storage.type` | String | `File` | The storage type used to store the data.<br/>- `File`: the data is stored in the local file system.<br/>- `S3`: the data is stored in the S3 object storage.<br/>- `Gcs`: the data is stored in the Google Cloud Storage.<br/>- `Azblob`: the data is stored in the Azure Blob Storage.<br/>- `Oss`: the data is stored in the Aliyun OSS. |
| `storage.cache_path` | String | Unset | Read cache configuration for object storage such as 'S3' etc, it's configured by default when using object storage. It is recommended to configure it when using object storage for better performance.<br/>A local file directory, defaults to `{data_home}`. An empty string means disabling. |
| `storage.cache_capacity` | String | Unset | The local file cache capacity in bytes. If your disk space is sufficient, it is recommended to set it larger. |
| `storage.bucket` | String | Unset | The S3 bucket name.<br/>**It's only used when the storage type is `S3`, `Oss` and `Gcs`**. |
| `storage.root` | String | Unset | The S3 data will be stored in the specified prefix, for example, `s3://${bucket}/${root}`.<br/>**It's only used when the storage type is `S3`, `Oss` and `Azblob`**. |
| `storage.access_key_id` | String | Unset | The access key id of the aws account.<br/>It's **highly recommended** to use AWS IAM roles instead of hardcoding the access key id and secret key.<br/>**It's only used when the storage type is `S3` and `Oss`**. |
| `storage.secret_access_key` | String | Unset | The secret access key of the aws account.<br/>It's **highly recommended** to use AWS IAM roles instead of hardcoding the access key id and secret key.<br/>**It's only used when the storage type is `S3`**. |
| `storage.access_key_secret` | String | Unset | The secret access key of the aliyun account.<br/>**It's only used when the storage type is `Oss`**. |
| `storage.account_name` | String | Unset | The account name of the azure account.<br/>**It's only used when the storage type is `Azblob`**. |
| `storage.account_key` | String | Unset | The account key of the azure account.<br/>**It's only used when the storage type is `Azblob`**. |
| `storage.scope` | String | Unset | The scope of the google cloud storage.<br/>**It's only used when the storage type is `Gcs`**. |
| `storage.credential_path` | String | Unset | The credential path of the google cloud storage.<br/>**It's only used when the storage type is `Gcs`**. |
| `storage.credential` | String | Unset | The credential of the google cloud storage.<br/>**It's only used when the storage type is `Gcs`**. |
| `storage.container` | String | Unset | The container of the azure account.<br/>**It's only used when the storage type is `Azblob`**. |
| `storage.sas_token` | String | Unset | The sas token of the azure account.<br/>**It's only used when the storage type is `Azblob`**. |
| `storage.endpoint` | String | Unset | The endpoint of the S3 service.<br/>**It's only used when the storage type is `S3`, `Oss`, `Gcs` and `Azblob`**. |
| `storage.region` | String | Unset | The region of the S3 service.<br/>**It's only used when the storage type is `S3`, `Oss`, `Gcs` and `Azblob`**. |
| `storage.http_client` | -- | -- | The http client options to the storage.<br/>**It's only used when the storage type is `S3`, `Oss`, `Gcs` and `Azblob`**. |
| `storage.http_client.pool_max_idle_per_host` | Integer | `1024` | The maximum idle connection per host allowed in the pool. |
| `storage.http_client.connect_timeout` | String | `30s` | The timeout for only the connect phase of a http client. |
| `storage.http_client.timeout` | String | `30s` | The total request timeout, applied from when the request starts connecting until the response body has finished.<br/>Also considered a total deadline. |
| `storage.http_client.pool_idle_timeout` | String | `90s` | The timeout for idle sockets being kept-alive. |
| `[[region_engine]]` | -- | -- | The region engine options. You can configure multiple region engines. |
| `region_engine.mito.num_workers` | Integer | `8` | Number of region workers. |
| `region_engine.mito.worker_channel_size` | Integer | `128` | Request channel size of each worker. |
| `region_engine.mito.worker_request_batch_size` | Integer | `64` | Max batch size for a worker to handle requests. |
| `region_engine.mito.manifest_checkpoint_distance` | Integer | `10` | Number of meta action updates that trigger a new checkpoint for the manifest. |
| `region_engine.mito.compress_manifest` | Bool | `false` | Whether to compress manifest and checkpoint file by gzip (default false). |
| `region_engine.mito.max_background_flushes` | Integer | Auto | Max number of running background flush jobs (default: 1/2 of cpu cores). |
| `region_engine.mito.max_background_compactions` | Integer | Auto | Max number of running background compaction jobs (default: 1/4 of cpu cores). |
| `region_engine.mito.max_background_purges` | Integer | Auto | Max number of running background purge jobs (default: number of cpu cores). |
| `region_engine.mito.auto_flush_interval` | String | `1h` | Interval to auto flush a region if it has not flushed yet. |
| `region_engine.mito.global_write_buffer_size` | String | Auto | Global write buffer size for all regions. If not set, it defaults to 1/8 of OS memory with a max limitation of 1GB. |
| `region_engine.mito.global_write_buffer_reject_size` | String | Auto | Global write buffer size threshold to reject write requests. If not set, it defaults to 2 times `global_write_buffer_size`. |
| `region_engine.mito.sst_meta_cache_size` | String | Auto | Cache size for SST metadata. Set it to 0 to disable the cache.<br/>If not set, it defaults to 1/32 of OS memory with a max limitation of 128MB. |
| `region_engine.mito.vector_cache_size` | String | Auto | Cache size for vectors and arrow arrays. Set it to 0 to disable the cache.<br/>If not set, it defaults to 1/16 of OS memory with a max limitation of 512MB. |
| `region_engine.mito.page_cache_size` | String | Auto | Cache size for pages of SST row groups. Set it to 0 to disable the cache.<br/>If not set, it defaults to 1/8 of OS memory. |
| `region_engine.mito.selector_result_cache_size` | String | Auto | Cache size for time series selector (e.g. `last_value()`). Set it to 0 to disable the cache.<br/>If not set, it defaults to 1/16 of OS memory with a max limitation of 512MB. |
| `region_engine.mito.enable_write_cache` | Bool | `false` | Whether to enable the write cache, it's enabled by default when using object storage. It is recommended to enable it when using object storage for better performance. |
| `region_engine.mito.write_cache_path` | String | `""` | File system path for write cache, defaults to `{data_home}`. |
| `region_engine.mito.write_cache_size` | String | `5GiB` | Capacity for write cache. If your disk space is sufficient, it is recommended to set it larger. |
| `region_engine.mito.parallel_scan_channel_size` | Integer | `32` | Capacity of the channel to send data from parallel scan tasks to the main task. |
| `region_engine.mito.allow_stale_entries` | Bool | `false` | Whether to allow stale WAL entries read during replay. |
| `region_engine.mito.min_compaction_interval` | String | `0m` | Minimum time interval between two compactions.<br/>To align with the old behavior, the default value is 0 (no restrictions). |
| `region_engine.mito.index` | -- | -- | The options for index in Mito engine. |
| `region_engine.mito.index.aux_path` | String | `""` | Auxiliary directory path for the index in filesystem, used to store intermediate files for<br/>creating the index and staging files for searching the index, defaults to `{data_home}/index_intermediate`.<br/>The default name for this directory is `index_intermediate` for backward compatibility.<br/><br/>This path contains two subdirectories:<br/>- `__intm`: for storing intermediate files used during creating index.<br/>- `staging`: for storing staging files used during searching index. |
| `region_engine.mito.index.staging_size` | String | `2GB` | The max capacity of the staging directory. |
| `region_engine.mito.index.staging_ttl` | String | `7d` | The TTL of the staging directory.<br/>Defaults to 7 days.<br/>Setting it to "0s" to disable TTL. |
| `region_engine.mito.index.metadata_cache_size` | String | `64MiB` | Cache size for inverted index metadata. |
| `region_engine.mito.index.content_cache_size` | String | `128MiB` | Cache size for inverted index content. |
| `region_engine.mito.index.content_cache_page_size` | String | `64KiB` | Page size for inverted index content cache. |
| `region_engine.mito.index.result_cache_size` | String | `128MiB` | Cache size for index result. |
| `region_engine.mito.inverted_index` | -- | -- | The options for inverted index in Mito engine. |
| `region_engine.mito.inverted_index.create_on_flush` | String | `auto` | Whether to create the index on flush.<br/>- `auto`: automatically (default)<br/>- `disable`: never |
| `region_engine.mito.inverted_index.create_on_compaction` | String | `auto` | Whether to create the index on compaction.<br/>- `auto`: automatically (default)<br/>- `disable`: never |
| `region_engine.mito.inverted_index.apply_on_query` | String | `auto` | Whether to apply the index on query<br/>- `auto`: automatically (default)<br/>- `disable`: never |
| `region_engine.mito.inverted_index.mem_threshold_on_create` | String | `auto` | Memory threshold for performing an external sort during index creation.<br/>- `auto`: automatically determine the threshold based on the system memory size (default)<br/>- `unlimited`: no memory limit<br/>- `[size]` e.g. `64MB`: fixed memory threshold |
| `region_engine.mito.fulltext_index` | -- | -- | The options for full-text index in Mito engine. |
| `region_engine.mito.fulltext_index.create_on_flush` | String | `auto` | Whether to create the index on flush.<br/>- `auto`: automatically (default)<br/>- `disable`: never |
| `region_engine.mito.fulltext_index.create_on_compaction` | String | `auto` | Whether to create the index on compaction.<br/>- `auto`: automatically (default)<br/>- `disable`: never |
| `region_engine.mito.fulltext_index.apply_on_query` | String | `auto` | Whether to apply the index on query<br/>- `auto`: automatically (default)<br/>- `disable`: never |
| `region_engine.mito.fulltext_index.mem_threshold_on_create` | String | `auto` | Memory threshold for index creation.<br/>- `auto`: automatically determine the threshold based on the system memory size (default)<br/>- `unlimited`: no memory limit<br/>- `[size]` e.g. `64MB`: fixed memory threshold |
| `region_engine.mito.bloom_filter_index` | -- | -- | The options for bloom filter in Mito engine. |
| `region_engine.mito.bloom_filter_index.create_on_flush` | String | `auto` | Whether to create the bloom filter on flush.<br/>- `auto`: automatically (default)<br/>- `disable`: never |
| `region_engine.mito.bloom_filter_index.create_on_compaction` | String | `auto` | Whether to create the bloom filter on compaction.<br/>- `auto`: automatically (default)<br/>- `disable`: never |
| `region_engine.mito.bloom_filter_index.apply_on_query` | String | `auto` | Whether to apply the bloom filter on query<br/>- `auto`: automatically (default)<br/>- `disable`: never |
| `region_engine.mito.bloom_filter_index.mem_threshold_on_create` | String | `auto` | Memory threshold for bloom filter creation.<br/>- `auto`: automatically determine the threshold based on the system memory size (default)<br/>- `unlimited`: no memory limit<br/>- `[size]` e.g. `64MB`: fixed memory threshold |
| `region_engine.mito.memtable.index_max_keys_per_shard` | Integer | `8192` | The max number of keys in one shard.<br/>Only available for `partition_tree` memtable. |
| `region_engine.mito.memtable.data_freeze_threshold` | Integer | `32768` | The max rows of data inside the actively writing buffer in one shard.<br/>Only available for `partition_tree` memtable. |
| `region_engine.mito.memtable.fork_dictionary_bytes` | String | `1GiB` | Max dictionary bytes.<br/>Only available for `partition_tree` memtable. |
| `logging.append_stdout` | Bool | `true` | Whether to append logs to stdout. |
| `logging.log_format` | String | `text` | The log format. Can be `text`/`json`. |
| `logging.max_log_files` | Integer | `720` | The maximum number of log files. |
| `logging.tracing_sample_ratio` | -- | -- | The percentage of tracing will be sampled and exported.<br/>Valid range `[0, 1]`, 1 means all traces are sampled, 0 means all traces are not sampled, the default value is 1.<br/>ratio > 1 are treated as 1. Fractions < 0 are treated as 0 |
| `export_metrics` | -- | -- | The standalone can export its metrics and send to Prometheus compatible service (e.g. `greptimedb`) from remote-write API.<br/>This is only used for `greptimedb` to export its own metrics internally. It's different from prometheus scrape. |
| `export_metrics.write_interval` | String | `30s` | The interval of export metrics. |
| `export_metrics.self_import` | -- | -- | For `standalone` mode, `self_import` is recommended to collect metrics generated by itself<br/>You must create the database before enabling it. |
| `export_metrics.remote_write.url` | String | `""` | The prometheus remote write endpoint that the metrics send to. The url example can be: `http://127.0.0.1:4000/v1/prometheus/write?db=greptime_metrics`. |
| `heartbeat.interval` | String | `18s` | Interval for sending heartbeat messages to the metasrv. |
| `heartbeat.retry_interval` | String | `3s` | Interval for retrying to send heartbeat messages to the metasrv. |
| `http` | -- | -- | The HTTP server options. |
| `http.addr` | String | `127.0.0.1:4000` | The address to bind the HTTP server. |
| `http.timeout` | String | `0s` | HTTP request timeout. Set to 0 to disable timeout. |
| `http.body_limit` | String | `64MB` | HTTP request body limit.<br/>The following units are supported: `B`, `KB`, `KiB`, `MB`, `MiB`, `GB`, `GiB`, `TB`, `TiB`, `PB`, `PiB`.<br/>Set to 0 to disable limit. |
| `http.enable_cors` | Bool | `true` | HTTP CORS support, turned on by default.<br/>This allows browsers to access HTTP APIs without CORS restrictions. |
| `http.prom_validation_mode` | String | `strict` | Whether to enable validation for Prometheus remote write requests.<br/>Available options:<br/>- strict: deny invalid UTF-8 strings (default).<br/>- lossy: allow invalid UTF-8 strings, replace invalid characters with REPLACEMENT_CHARACTER(U+FFFD).<br/>- unchecked: do not validate strings. |
| `grpc` | -- | -- | The gRPC server options. |
| `grpc.bind_addr` | String | `127.0.0.1:4001` | The address to bind the gRPC server. |
| `grpc.server_addr` | String | `127.0.0.1:4001` | The address advertised to the metasrv, and used for connections from outside the host.<br/>If left empty or unset, the server will automatically use the IP address of the first network interface<br/>on the host, with the same port number as the one specified in `grpc.bind_addr`. |
| `grpc.runtime_size` | Integer | `8` | The number of server worker threads. |
| `grpc.flight_compression` | String | `arrow_ipc` | Compression mode for frontend side Arrow IPC service. Available options:<br/>- `none`: disable all compression<br/>- `transport`: only enable gRPC transport compression (zstd)<br/>- `arrow_ipc`: only enable Arrow IPC compression (lz4)<br/>- `all`: enable all compression.<br/>Defaults to `arrow_ipc`. |
| `grpc.tls` | -- | -- | gRPC server TLS options, see `mysql.tls` section. |
| `grpc.tls.watch` | Bool | `false` | Watch for Certificate and key file change and auto reload.<br/>For now, gRPC tls config does not support auto reload. |
| `logging.append_stdout` | Bool | `true` | Whether to append logs to stdout. |
| `logging.log_format` | String | `text` | The log format. Can be `text`/`json`. |
| `logging.max_log_files` | Integer | `720` | The maximum number of log files. |
| `logging.tracing_sample_ratio` | -- | -- | The percentage of tracing will be sampled and exported.<br/>Valid range `[0, 1]`, 1 means all traces are sampled, 0 means all traces are not sampled, the default value is 1.<br/>ratio > 1 are treated as 1. Fractions < 0 are treated as 0 |
| `slow_query.record_type` | String | `system_table` | The record type of slow queries. It can be `system_table` or `log`.<br/>If `system_table` is selected, the slow queries will be recorded in a system table `greptime_private.slow_queries`.<br/>If `log` is selected, the slow queries will be logged in a log file `greptimedb-slow-queries.*`. |
| `slow_query.threshold` | String | `30s` | The threshold of slow query. It can be human readable time string, for example: `10s`, `100ms`, `1s`. |
| `slow_query.sample_ratio` | Float | `1.0` | The sampling ratio of slow query log. The value should be in the range of (0, 1]. For example, `0.1` means 10% of the slow queries will be logged and `1.0` means all slow queries will be logged. |
| `slow_query.ttl` | String | `30d` | The TTL of the `slow_queries` system table. Default is `30d` when `record_type` is `system_table`. |
| `export_metrics` | -- | -- | The frontend can export its metrics and send to Prometheus compatible service (e.g. `greptimedb` itself) from remote-write API.<br/>This is only used for `greptimedb` to export its own metrics internally. It's different from prometheus scrape. |
| `export_metrics.write_interval` | String | `30s` | The interval of export metrics. |
| `export_metrics.remote_write` | -- | -- | -- |
| `export_metrics.remote_write.url` | String | `""` | The prometheus remote write endpoint that the metrics send to. The url example can be: `http://127.0.0.1:4000/v1/prometheus/write?db=greptime_metrics`. |
| `tracing` | -- | -- | The tracing options. Only takes effect when compiled with the `tokio-console` feature. |
| `tracing.tokio_console_addr` | String | Unset | The tokio console address. |
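
As an illustration of how the dotted keys in the table above map onto a TOML file, here is a minimal sketch of a frontend fragment. It uses only keys listed in the table; the values are illustrative, and whether additional keys are required to activate slow query recording is not covered by the rows above, so treat this as an assumption-laden example rather than a complete configuration.

```toml
# Sketch only: each dotted key `a.b` in the table becomes key `b` under table `[a]`.

[http]
addr = "127.0.0.1:4000"
timeout = "0s"          # 0 disables the request timeout
body_limit = "64MB"

[slow_query]
record_type = "log"     # or "system_table" (the documented default)
threshold = "10s"       # queries slower than this are considered slow
sample_ratio = 0.1      # about 10% of slow queries are recorded
```

With `record_type = "log"`, the table states that slow queries go to a `greptimedb-slow-queries.*` log file; `slow_query.ttl` is described only in relation to the `slow_queries` system table, so it is left out here.
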
### Metasrv
| Key | Type | Default | Descriptions |
| --- | -----| ------- | ----------- |
| `data_home` | String | `./greptimedb_data` | The working home directory. |
| `store_addrs` | Array | -- | Store server addresses. Defaults to the etcd store.<br/>For postgres store, the format is:<br/>"password=password dbname=postgres user=postgres host=localhost port=5432"<br/>For etcd store, the format is:<br/>"127.0.0.1:2379" |
| `store_key_prefix` | String | `""` | If it's not empty, the metasrv will store all data with this key prefix. |
| `backend` | String | `etcd_store` | The datastore for meta server.<br/>Available values:<br/>- `etcd_store` (default value)<br/>- `memory_store`<br/>- `postgres_store`<br/>- `mysql_store` |
| `meta_table_name` | String | `greptime_metakv` | Table name in RDS to store metadata. Takes effect when using an RDS kvbackend.<br/>**Only used when backend is `postgres_store`.** |
| `meta_election_lock_id` | Integer | `1` | Advisory lock id in PostgreSQL for election. Takes effect when using PostgreSQL as kvbackend.<br/>Only used when backend is `postgres_store`. |
| `use_memory_store` | Bool | `false` | Store data in memory. |
| `enable_region_failover` | Bool | `false` | Whether to enable region failover.<br/>This feature is only available on GreptimeDB running on cluster mode and<br/>- Using Remote WAL<br/>- Using shared storage (e.g., s3). |
| `allow_region_failover_on_local_wal` | Bool | `false` | Whether to allow region failover on local WAL.<br/>**This option is not recommended to be set to true, because it may lead to data loss during failover.** |
| `node_max_idle_time` | String | `24hours` | Max allowed idle time before removing node info from metasrv memory. |
| `enable_telemetry` | Bool | `true` | Whether to enable greptimedb telemetry. Enabled by default. |
| `runtime` | -- | -- | The runtime options. |
| `runtime.global_rt_size` | Integer | `8` | The number of threads to execute the runtime for global read operations. |
| `runtime.compact_rt_size` | Integer | `4` | The number of threads to execute the runtime for global write operations. |
| `grpc` | -- | -- | The gRPC server options. |
| `grpc.bind_addr` | String | `127.0.0.1:3002` | The address to bind the gRPC server. |
| `grpc.server_addr` | String | `127.0.0.1:3002` | The communication server address for the frontend and datanode to connect to metasrv.<br/>If left empty or unset, the server will automatically use the IP address of the first network interface<br/>on the host, with the same port number as the one specified in `bind_addr`. |
| `grpc.runtime_size` | Integer | `8` | The number of server worker threads. |
| `grpc.max_recv_message_size` | String | `512MB` | The maximum receive message size for gRPC server. |
| `grpc.max_send_message_size` | String | `512MB` | The maximum send message size for gRPC server. |
| `http` | -- | -- | The HTTP server options. |
| `http.addr` | String | `127.0.0.1:4000` | The address to bind the HTTP server. |
| `http.timeout` | String | `0s` | HTTP request timeout. Set to 0 to disable timeout. |
| `http.body_limit` | String | `64MB` | HTTP request body limit.<br/>The following units are supported: `B`, `KB`, `KiB`, `MB`, `MiB`, `GB`, `GiB`, `TB`, `TiB`, `PB`, `PiB`.<br/>Set to 0 to disable limit. |
| `procedure.max_metadata_value_size` | String | `1500KiB` | Automatically split large values.<br/>GreptimeDB procedure uses etcd as the default metadata storage backend.<br/>The maximum size of any etcd request is 1.5 MiB.<br/>1500KiB = 1536KiB (1.5MiB) - 36KiB (reserved size of key)<br/>Comment out `max_metadata_value_size` to avoid splitting large values (no limit). |
| `procedure.max_running_procedures` | Integer | `128` | Max running procedures.<br/>The maximum number of procedures that can be running at the same time.<br/>If the number of running procedures exceeds this limit, the procedure will be rejected. |
| `failure_detector` | -- | -- | -- |
| `failure_detector.threshold` | Float | `8.0` | The threshold value used by the failure detector to determine failure conditions. |
| `failure_detector.min_std_deviation` | String | `100ms` | The minimum standard deviation of the heartbeat intervals, used to calculate acceptable variations. |
| `failure_detector.acceptable_heartbeat_pause` | String | `10000ms` | The acceptable pause duration between heartbeats, used to determine if a heartbeat interval is acceptable. |
| `failure_detector.first_heartbeat_estimate` | String | `1000ms` | The initial estimate of the heartbeat interval used by the failure detector. |
| `wal.broker_endpoints` | Array | -- | The broker endpoints of the Kafka cluster. |
| `wal.auto_create_topics` | Bool | `true` | Automatically create topics for WAL.<br/>Set to `true` to automatically create topics for WAL.<br/>Otherwise, use topics named `topic_name_prefix_[0..num_topics)` |
| `wal.auto_prune_interval` | String | `0s` | Interval of automatic WAL pruning.<br/>Set to `0s` to disable automatic WAL pruning, which periodically deletes unused remote WAL entries. |
| `wal.trigger_flush_threshold` | Integer | `0` | The threshold to trigger a flush operation of a region during automatic WAL pruning.<br/>Metasrv will send a flush request to flush the region when:<br/>`trigger_flush_threshold` + `prunable_entry_id` < `max_prunable_entry_id`<br/>where:<br/>- `prunable_entry_id` is the maximum entry id that can be pruned of the region.<br/>- `max_prunable_entry_id` is the maximum prunable entry id among all regions in the same topic.<br/>Set to `0` to disable the flush operation. |
| `wal.topic_name_prefix` | String | `greptimedb_wal_topic` | A Kafka topic is constructed by concatenating `topic_name_prefix` and `topic_id`.<br/>Only accepts strings that match the following regular expression pattern:<br/>`[a-zA-Z_:-][a-zA-Z0-9_:\-\.@#]*`<br/>e.g., greptimedb_wal_topic_0, greptimedb_wal_topic_1. |
| `wal.replication_factor` | Integer | `1` | Expected number of replicas of each partition. |
| `wal.create_topic_timeout` | String | `30s` | The timeout after which a topic creation operation is cancelled. |
| `logging` | -- | -- | The logging options. |
| `logging.dir` | String | `./greptimedb_data/logs` | The directory to store the log files. If set to empty, logs will not be written to files. |
| `logging.level` | String | Unset | The log level. Can be `info`/`debug`/`warn`/`error`. |
| `logging.append_stdout` | Bool | `true` | Whether to append logs to stdout. |
| `logging.log_format` | String | `text` | The log format. Can be `text`/`json`. |
| `logging.max_log_files` | Integer | `720` | The maximum number of log files. |
| `logging.tracing_sample_ratio` | -- | -- | The percentage of tracing will be sampled and exported.<br/>Valid range `[0, 1]`, 1 means all traces are sampled, 0 means all traces are not sampled, the default value is 1.<br/>ratio > 1 are treated as 1. Fractions < 0 are treated as 0 |
| `export_metrics` | -- | -- | The metasrv can export its metrics and send to Prometheus compatible service (e.g. `greptimedb` itself) from remote-write API.<br/>This is only used for `greptimedb` to export its own metrics internally. It's different from prometheus scrape. |
| `export_metrics.write_interval` | String | `30s` | The interval of export metrics. |
| `export_metrics.remote_write` | -- | -- | -- |
| `export_metrics.remote_write.url` | String | `""` | The prometheus remote write endpoint that the metrics send to. The url example can be: `http://127.0.0.1:4000/v1/prometheus/write?db=greptime_metrics`. |
| `tracing` | -- | -- | The tracing options. Only takes effect when compiled with the `tokio-console` feature. |
| `tracing.tokio_console_addr` | String | Unset | The tokio console address. |
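
The WAL pruning options in the table above interact through the flush condition `trigger_flush_threshold + prunable_entry_id < max_prunable_entry_id`. The fragment below is a hedged sketch of a metasrv `[wal]` section using only keys from the table; the broker address, the pruning interval, and the threshold are illustrative values, and selecting Kafka as the WAL provider is assumed to be configured elsewhere since that key is not listed in the rows above.

```toml
# Sketch only: values are illustrative, not recommendations.
[wal]
# Placeholder address of a Kafka broker (assumption for this example).
broker_endpoints = ["127.0.0.1:9092"]
auto_create_topics = true
topic_name_prefix = "greptimedb_wal_topic"
replication_factor = 1
create_topic_timeout = "30s"

# Enable automatic pruning of unused remote WAL entries ("0s" disables it).
auto_prune_interval = "10m"

# Worked example of the documented flush condition with this threshold:
#   region A: prunable_entry_id = 500, max_prunable_entry_id in its topic = 2000
#   1000 + 500 = 1500 < 2000  -> metasrv sends a flush request for region A
#   region B: prunable_entry_id = 1500 in the same topic
#   1000 + 1500 = 2500 >= 2000 -> no flush request for region B
trigger_flush_threshold = 1000
```

A larger threshold tolerates regions that lag further behind the most advanced region in the same topic before a flush is requested; `0` turns the flush request off entirely, as the table notes.
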
### Datanode
| Key | Type | Default | Descriptions |
| --- | -----| ------- | ----------- |
| `node_id` | Integer | Unset | The datanode identifier and should be unique in the cluster. |
| `require_lease_before_startup` | Bool | `false` | Start services after regions have obtained leases.<br/>It will block the datanode start if it can't receive leases in the heartbeat from metasrv. |
| `init_regions_in_background` | Bool | `false` | Initialize all regions in the background during the startup.<br/>By default, it provides services after all regions have been initialized. |
| `max_concurrent_queries` | Integer | `0` | The maximum number of concurrent queries allowed to be executed. Zero means unlimited. |
| `enable_telemetry` | Bool | `true` | Enable telemetry to collect anonymous usage data. Enabled by default. |
| `http` | -- | -- | The HTTP server options. |
| `http.addr` | String | `127.0.0.1:4000` | The address to bind the HTTP server. |
| `http.timeout` | String | `0s` | HTTP request timeout. Set to 0 to disable timeout. |
| `http.body_limit` | String | `64MB` | HTTP request body limit.<br/>The following units are supported: `B`, `KB`, `KiB`, `MB`, `MiB`, `GB`, `GiB`, `TB`, `TiB`, `PB`, `PiB`.<br/>Set to 0 to disable limit. |
| `grpc` | -- | -- | The gRPC server options. |
| `grpc.bind_addr` | String | `127.0.0.1:3001` | The address to bind the gRPC server. |
| `grpc.server_addr` | String | `127.0.0.1:3001` | The address advertised to the metasrv, and used for connections from outside the host.<br/>If left empty or unset, the server will automatically use the IP address of the first network interface<br/>on the host, with the same port number as the one specified in `grpc.bind_addr`. |
| `grpc.runtime_size` | Integer | `8` | The number of server worker threads. |
| `grpc.max_recv_message_size` | String | `512MB` | The maximum receive message size for gRPC server. |
| `grpc.max_send_message_size` | String | `512MB` | The maximum send message size for gRPC server. |
| `grpc.flight_compression` | String | `arrow_ipc` | Compression mode for datanode side Arrow IPC service. Available options:<br/>- `none`: disable all compression<br/>- `transport`: only enable gRPC transport compression (zstd)<br/>- `arrow_ipc`: only enable Arrow IPC compression (lz4)<br/>- `all`: enable all compression.<br/>Defaults to `arrow_ipc`. |
| `grpc.tls` | -- | -- | gRPC server TLS options, see `mysql.tls` section. |
| `grpc.tls.watch` | Bool | `false` | Watch for Certificate and key file change and auto reload.<br/>For now, gRPC tls config does not support auto reload. |
| `runtime` | -- | -- | The runtime options. |
| `runtime.global_rt_size` | Integer | `8` | The number of threads to execute the runtime for global read operations. |
| `runtime.compact_rt_size` | Integer | `4` | The number of threads to execute the runtime for global write operations. |
| `wal.provider` | String | `raft_engine` | The provider of the WAL.<br/>- `raft_engine`: the WAL is stored in the local file system by raft-engine.<br/>- `kafka`: a remote WAL whose data is stored in Kafka. |
| `wal.dir` | String | Unset | The directory to store the WAL files.<br/>**It's only used when the provider is `raft_engine`**. |
| `wal.file_size` | String | `128MB` | The size of the WAL segment file.<br/>**It's only used when the provider is `raft_engine`**. |
| `wal.purge_threshold` | String | `1GB` | The threshold of the WAL size to trigger a flush.<br/>**It's only used when the provider is `raft_engine`**. |
| `wal.purge_interval` | String | `1m` | The interval to trigger a flush.<br/>**It's only used when the provider is `raft_engine`**. |
| `wal.read_batch_size` | Integer | `128` | The read batch size.<br/>**It's only used when the provider is `raft_engine`**. |
| `wal.sync_write` | Bool | `false` | Whether to use sync write.<br/>**It's only used when the provider is `raft_engine`**. |
| `wal.enable_log_recycle` | Bool | `true` | Whether to reuse logically truncated log files.<br/>**It's only used when the provider is `raft_engine`**. |
| `wal.prefill_log_files` | Bool | `false` | Whether to pre-create log files on start up.<br/>**It's only used when the provider is `raft_engine`**. |
| `wal.sync_period` | String | `10s` | Duration for fsyncing log files.<br/>**It's only used when the provider is `raft_engine`**. |
| `wal.broker_endpoints` | Array | -- | The Kafka broker endpoints.<br/>**It's only used when the provider is `kafka`**. |
| `wal.max_batch_bytes` | String | `1MB` | The max size of a single producer batch.<br/>Warning: Kafka has a default limit of 1MB per message in a topic.<br/>**It's only used when the provider is `kafka`**. |
| `wal.consumer_wait_timeout` | String | `100ms` | The consumer wait timeout.<br/>**It's only used when the provider is `kafka`**. |
| `wal.create_index` | Bool | `true` | Whether to enable WAL index creation.<br/>**It's only used when the provider is `kafka`**. |
| `wal.dump_index_interval` | String | `60s` | The interval for dumping WAL indexes.<br/>**It's only used when the provider is `kafka`**. |
| `wal.overwrite_entry_start_id` | Bool | `false` | Ignore missing entries during read WAL.<br/>**It's only used when the provider is `kafka`**.<br/><br/>This option ensures that when Kafka messages are deleted, the system<br/>can still successfully replay memtable data without throwing an<br/>out-of-range error.<br/>However, enabling this option might lead to unexpected data loss,<br/>as the system will skip over missing entries instead of treating<br/>them as critical errors. |
| `query` | -- | -- | The query engine options. |
| `query.parallelism` | Integer | `0` | Parallelism of the query engine.<br/>Default to 0, which means the number of CPU cores. |
| `storage` | -- | -- | The data storage options. |
| `storage.data_home` | String | `./greptimedb_data` | The working home directory. |
| `storage.type` | String | `File` | The storage type used to store the data.<br/>- `File`: the data is stored in the local file system.<br/>- `S3`: the data is stored in the S3 object storage.<br/>- `Gcs`: the data is stored in the Google Cloud Storage.<br/>- `Azblob`: the data is stored in the Azure Blob Storage.<br/>- `Oss`: the data is stored in the Aliyun OSS. |
| `storage.cache_path` | String | Unset | Read cache configuration for object storage such as 'S3' etc, it's configured by default when using object storage. It is recommended to configure it when using object storage for better performance.<br/>A local file directory, defaults to `{data_home}`. An empty string means disabling. |
| `storage.cache_capacity` | String | Unset | The local file cache capacity in bytes. If your disk space is sufficient, it is recommended to set it larger. |
| `storage.bucket` | String | Unset | The S3 bucket name.<br/>**It's only used when the storage type is `S3`, `Oss` and `Gcs`**. |
| `storage.root` | String | Unset | The S3 data will be stored in the specified prefix, for example, `s3://${bucket}/${root}`.<br/>**It's only used when the storage type is `S3`, `Oss` and `Azblob`**. |
| `storage.access_key_id` | String | Unset | The access key id of the aws account.<br/>It's **highly recommended** to use AWS IAM roles instead of hardcoding the access key id and secret key.<br/>**It's only used when the storage type is `S3` and `Oss`**. |
| `storage.secret_access_key` | String | Unset | The secret access key of the aws account.<br/>It's **highly recommended** to use AWS IAM roles instead of hardcoding the access key id and secret key.<br/>**It's only used when the storage type is `S3`**. |
| `storage.access_key_secret` | String | Unset | The secret access key of the aliyun account.<br/>**It's only used when the storage type is `Oss`**. |
| `storage.account_name` | String | Unset | The account name of the azure account.<br/>**It's only used when the storage type is `Azblob`**. |
| `storage.account_key` | String | Unset | The account key of the azure account.<br/>**It's only used when the storage type is `Azblob`**. |
| `storage.scope` | String | Unset | The scope of the google cloud storage.<br/>**It's only used when the storage type is `Gcs`**. |
| `storage.credential_path` | String | Unset | The credential path of the google cloud storage.<br/>**It's only used when the storage type is `Gcs`**. |
| `storage.credential` | String | Unset | The credential of the google cloud storage.<br/>**It's only used when the storage type is `Gcs`**. |
| `storage.container` | String | Unset | The container of the azure account.<br/>**It's only used when the storage type is `Azblob`**. |
| `storage.sas_token` | String | Unset | The sas token of the azure account.<br/>**It's only used when the storage type is `Azblob`**. |
| `storage.endpoint` | String | Unset | The endpoint of the S3 service.<br/>**It's only used when the storage type is `S3`, `Oss`, `Gcs` and `Azblob`**. |
| `storage.region` | String | Unset | The region of the S3 service.<br/>**It's only used when the storage type is `S3`, `Oss`, `Gcs` and `Azblob`**. |
| `storage.http_client` | -- | -- | The http client options to the storage.<br/>**It's only used when the storage type is `S3`, `Oss`, `Gcs` and `Azblob`**. |
| `storage.http_client.pool_max_idle_per_host` | Integer | `1024` | The maximum idle connection per host allowed in the pool. |
| `storage.http_client.connect_timeout` | String | `30s` | The timeout for only the connect phase of a http client. |
| `storage.http_client.timeout` | String | `30s` | The total request timeout, applied from when the request starts connecting until the response body has finished.<br/>Also considered a total deadline. |
| `storage.http_client.pool_idle_timeout` | String | `90s` | The timeout for idle sockets being kept-alive. |
| `[[region_engine]]` | -- | -- | The region engine options. You can configure multiple region engines. |
| `region_engine.mito.num_workers` | Integer | `8` | Number of region workers. |
| `region_engine.mito.worker_channel_size` | Integer | `128` | Request channel size of each worker. |
| `region_engine.mito.worker_request_batch_size` | Integer | `64` | Max batch size for a worker to handle requests. |
| `region_engine.mito.manifest_checkpoint_distance` | Integer | `10` | Number of meta action updated to trigger a new checkpoint for the manifest. |
| `region_engine.mito.compress_manifest` | Bool | `false` | Whether to compress manifest and checkpoint file by gzip (default false). |
| `region_engine.mito.max_background_flushes` | Integer | Auto | Max number of running background flush jobs (default: 1/2 of cpu cores). |
| `region_engine.mito.max_background_compactions` | Integer | Auto | Max number of running background compaction jobs (default: 1/4 of cpu cores). |
| `region_engine.mito.max_background_purges` | Integer | Auto | Max number of running background purge jobs (default: number of cpu cores). |
| `region_engine.mito.auto_flush_interval` | String | `1h` | Interval to auto flush a region if it has not flushed yet. |
| `region_engine.mito.global_write_buffer_size` | String | Auto | Global write buffer size for all regions. If not set, it's default to 1/8 of OS memory with a max limitation of 1GB. |
| `region_engine.mito.global_write_buffer_reject_size` | String | Auto | Global write buffer size threshold to reject write requests. If not set, it's default to 2 times of `global_write_buffer_size` |
| `region_engine.mito.sst_meta_cache_size` | String | Auto | Cache size for SST metadata. Setting it to 0 to disable the cache.<br/>If not set, it's default to 1/32 of OS memory with a max limitation of 128MB. |
| `region_engine.mito.vector_cache_size` | String | Auto | Cache size for vectors and arrow arrays. Setting it to 0 to disable the cache.<br/>If not set, it's default to 1/16 of OS memory with a max limitation of 512MB. |
| `region_engine.mito.page_cache_size` | String | Auto | Cache size for pages of SST row groups. Setting it to 0 to disable the cache.<br/>If not set, it's default to 1/8 of OS memory. |
| `region_engine.mito.selector_result_cache_size` | String | Auto | Cache size for time series selector (e.g. `last_value()`). Setting it to 0 to disable the cache.<br/>If not set, it's default to 1/16 of OS memory with a max limitation of 512MB. |
| `region_engine.mito.enable_write_cache` | Bool | `false` | Whether to enable the write cache, it's enabled by default when using object storage. It is recommended to enable it when using object storage for better performance. |
| `region_engine.mito.write_cache_path` | String | `""` | File system path for write cache, defaults to `{data_home}`. |
| `region_engine.mito.write_cache_size` | String | `5GiB` | Capacity for write cache. If your disk space is sufficient, it is recommended to set it larger. |
| `region_engine.mito.parallel_scan_channel_size` | Integer | `32` | Capacity of the channel to send data from parallel scan tasks to the main task. |
| `region_engine.mito.allow_stale_entries` | Bool | `false` | Whether to allow stale WAL entries read during replay. |
| `region_engine.mito.min_compaction_interval` | String | `0m` | Minimum time interval between two compactions.<br/>To align with the old behavior, the default value is 0 (no restrictions). |
| `region_engine.mito.index` | -- | -- | The options for index in Mito engine. |
| `region_engine.mito.index.aux_path` | String | `""` | Auxiliary directory path for the index in filesystem, used to store intermediate files for<br/>creating the index and staging files for searching the index, defaults to `{data_home}/index_intermediate`.<br/>The default name for this directory is `index_intermediate` for backward compatibility.<br/><br/>This path contains two subdirectories:<br/>- `__intm`: for storing intermediate files used during creating index.<br/>- `staging`: for storing staging files used during searching index. |
| `region_engine.mito.index.staging_size` | String | `2GB` | The max capacity of the staging directory. |
| `region_engine.mito.index.staging_ttl` | String | `7d` | The TTL of the staging directory.<br/>Defaults to 7 days.<br/>Setting it to "0s" to disable TTL. |
| `region_engine.mito.index.metadata_cache_size` | String | `64MiB` | Cache size for inverted index metadata. |
| `region_engine.mito.index.content_cache_size` | String | `128MiB` | Cache size for inverted index content. |
| `region_engine.mito.index.content_cache_page_size` | String | `64KiB` | Page size for inverted index content cache. |
| `region_engine.mito.index.result_cache_size` | String | `128MiB` | Cache size for index result. |
| `region_engine.mito.inverted_index` | -- | -- | The options for inverted index in Mito engine. |
| `region_engine.mito.inverted_index.create_on_flush` | String | `auto` | Whether to create the index on flush.<br/>- `auto`: automatically (default)<br/>- `disable`: never |
| `region_engine.mito.inverted_index.create_on_compaction` | String | `auto` | Whether to create the index on compaction.<br/>- `auto`: automatically (default)<br/>- `disable`: never |
| `region_engine.mito.inverted_index.apply_on_query` | String | `auto` | Whether to apply the index on query<br/>- `auto`: automatically (default)<br/>- `disable`: never |
| `region_engine.mito.inverted_index.mem_threshold_on_create` | String | `auto` | Memory threshold for performing an external sort during index creation.<br/>- `auto`: automatically determine the threshold based on the system memory size (default)<br/>- `unlimited`: no memory limit<br/>- `[size]` e.g. `64MB`: fixed memory threshold |
| `region_engine.mito.fulltext_index` | -- | -- | The options for full-text index in Mito engine. |
| `region_engine.mito.fulltext_index.create_on_flush` | String | `auto` | Whether to create the index on flush.<br/>- `auto`: automatically (default)<br/>- `disable`: never |
| `region_engine.mito.fulltext_index.create_on_compaction` | String | `auto` | Whether to create the index on compaction.<br/>- `auto`: automatically (default)<br/>- `disable`: never |
| `region_engine.mito.fulltext_index.apply_on_query` | String | `auto` | Whether to apply the index on query<br/>- `auto`: automatically (default)<br/>- `disable`: never |
| `region_engine.mito.fulltext_index.mem_threshold_on_create` | String | `auto` | Memory threshold for index creation.<br/>- `auto`: automatically determine the threshold based on the system memory size (default)<br/>- `unlimited`: no memory limit<br/>- `[size]` e.g. `64MB`: fixed memory threshold |
| `region_engine.mito.bloom_filter_index` | -- | -- | The options for bloom filter index in Mito engine. |
| `region_engine.mito.bloom_filter_index.create_on_flush` | String | `auto` | Whether to create the index on flush.<br/>- `auto`: automatically (default)<br/>- `disable`: never |
| `region_engine.mito.bloom_filter_index.create_on_compaction` | String | `auto` | Whether to create the index on compaction.<br/>- `auto`: automatically (default)<br/>- `disable`: never |
| `region_engine.mito.bloom_filter_index.apply_on_query` | String | `auto` | Whether to apply the index on query<br/>- `auto`: automatically (default)<br/>- `disable`: never |
| `region_engine.mito.bloom_filter_index.mem_threshold_on_create` | String | `auto` | Memory threshold for the index creation.<br/>- `auto`: automatically determine the threshold based on the system memory size (default)<br/>- `unlimited`: no memory limit<br/>- `[size]` e.g. `64MB`: fixed memory threshold |
| `region_engine.mito.memtable.index_max_keys_per_shard` | Integer | `8192` | The max number of keys in one shard.<br/>Only available for `partition_tree` memtable. |
| `region_engine.mito.memtable.data_freeze_threshold` | Integer | `32768` | The max rows of data inside the actively writing buffer in one shard.<br/>Only available for `partition_tree` memtable. |
| `region_engine.mito.memtable.fork_dictionary_bytes` | String | `1GiB` | Max dictionary bytes.<br/>Only available for `partition_tree` memtable. |
| `logging.append_stdout` | Bool | `true` | Whether to append logs to stdout. |
| `logging.log_format` | String | `text` | The log format. Can be `text`/`json`. |
| `logging.max_log_files` | Integer | `720` | The maximum amount of log files. |
| `logging.tracing_sample_ratio` | -- | -- | The percentage of tracing will be sampled and exported.<br/>Valid range `[0, 1]`, 1 means all traces are sampled, 0 means all traces are not sampled, the default value is 1.<br/>ratio > 1 are treated as 1. Fractions < 0 are treated as 0 |
| `export_metrics` | -- | -- | The datanode can export its metrics and send to Prometheus compatible service (e.g. `greptimedb` itself) from remote-write API.<br/>This is only used for `greptimedb` to export its own metrics internally. It's different from prometheus scrape. |
| `export_metrics.write_interval` | String | `30s` | The interval of export metrics. |
| `export_metrics.remote_write` | -- | -- | -- |
| `export_metrics.remote_write.url` | String | `""` | The prometheus remote write endpoint that the metrics send to. The url example can be: `http://127.0.0.1:4000/v1/prometheus/write?db=greptime_metrics`. |
| `tracing` | -- | -- | The tracing options. Only takes effect when compiled with the `tokio-console` feature. |
| `tracing.tokio_console_addr` | String | Unset | The tokio console address. |
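
As a rough illustration of how the WAL and storage keys in the table above combine, the following sketch configures a datanode with a Kafka-backed WAL and S3 object storage. The key names and defaults come from the table; the broker address, bucket, root, endpoint, and region values are placeholders assumed for the example, not recommended settings.

```toml
# Sketch only: broker address, bucket, root, endpoint and region are placeholders.
[wal]
# Use the remote WAL; the `raft_engine`-only keys (dir, file_size, ...) do not apply.
provider = "kafka"
broker_endpoints = ["127.0.0.1:9092"]
# Kafka has a default limit of 1MB per message in a topic.
max_batch_bytes = "1MB"
consumer_wait_timeout = "100ms"

[storage]
data_home = "./greptimedb_data"
type = "S3"
bucket = "greptimedb"
root = "data"
endpoint = "https://s3.amazonaws.com"
region = "us-west-2"
# access_key_id / secret_access_key are left unset here because the table
# recommends AWS IAM roles over hardcoded keys.
```

Keys marked as specific to another provider or storage type (for example `wal.dir` or `storage.container`) are left out, since the table states they are only used with their own provider or type.
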
### Flownode
| Key | Type | Default | Descriptions |
| --- | -----| ------- | ----------- |
| `node_id` | Integer | Unset | The flownode identifier, which should be unique in the cluster. |
| `flow` | -- | -- | flow engine options. |
| `flow.num_workers` | Integer | `0` | The number of flow workers in the flownode.<br/>If unset (or set to 0), the number of CPU cores divided by 2 is used. |
| `grpc` | -- | -- | The gRPC server options. |
| `grpc.bind_addr` | String | `127.0.0.1:6800` | The address to bind the gRPC server. |
| `grpc.server_addr` | String | `127.0.0.1:6800` | The address advertised to the metasrv,<br/>and used for connections from outside the host |
| `grpc.runtime_size` | Integer | `2` | The number of server worker threads. |
| `grpc.max_recv_message_size` | String | `512MB` | The maximum receive message size for gRPC server. |
| `grpc.max_send_message_size` | String | `512MB` | The maximum send message size for gRPC server. |
| `http` | -- | -- | The HTTP server options. |
| `http.addr` | String | `127.0.0.1:4000` | The address to bind the HTTP server. |
| `http.timeout` | String | `0s` | HTTP request timeout. Set to 0 to disable timeout. |
| `http.body_limit` | String | `64MB` | HTTP request body limit.<br/>The following units are supported: `B`, `KB`, `KiB`, `MB`, `MiB`, `GB`, `GiB`, `TB`, `TiB`, `PB`, `PiB`.<br/>Set to 0 to disable limit. |
| `logging.append_stdout` | Bool | `true` | Whether to append logs to stdout. |
| `logging.log_format` | String | `text` | The log format. Can be `text`/`json`. |
| `logging.max_log_files` | Integer | `720` | The maximum amount of log files. |
| `logging.tracing_sample_ratio` | -- | -- | The percentage of tracing will be sampled and exported.<br/>Valid range `[0, 1]`, 1 means all traces are sampled, 0 means all traces are not sampled, the default value is 1.<br/>ratio > 1 are treated as 1. Fractions < 0 are treated as 0 |
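
For orientation, a minimal flownode configuration built only from the keys listed in the table above might look like the sketch below. All values except `node_id` are the documented defaults; the `node_id` value is an arbitrary placeholder chosen for the example.

```toml
# Sketch only: node_id is an arbitrary placeholder and must be unique in the cluster.
node_id = 14

[flow]
# 0 (or unset) means the number of CPU cores divided by 2.
num_workers = 0

[grpc]
bind_addr = "127.0.0.1:6800"
server_addr = "127.0.0.1:6800"
runtime_size = 2

[http]
addr = "127.0.0.1:4000"
# "0s" disables the request timeout.
timeout = "0s"
body_limit = "64MB"

[logging]
append_stdout = true
log_format = "text"
max_log_files = 720
```
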
## Default to 0, which means the number of CPU cores.
parallelism=0
## The data storage options.
[storage]
## The working home directory.
data_home="./greptimedb_data"
## The storage type used to store the data.
## - `File`: the data is stored in the local file system.
## - `S3`: the data is stored in the S3 object storage.
## - `Gcs`: the data is stored in the Google Cloud Storage.
## - `Azblob`: the data is stored in the Azure Blob Storage.
## - `Oss`: the data is stored in the Aliyun OSS.
type="File"
data_home="/tmp/greptimedb/"
# TTL for all tables. Disabled by default.
# global_ttl = "7d"
# Compaction options, see `standalone.example.toml`.
[storage.compaction]
max_inflight_tasks=4
max_files_in_level0=8
max_purge_tasks=32
## Read cache configuration for object storage such as 'S3' etc, it's configured by default when using object storage. It is recommended to configure it when using object storage for better performance.
## A local file directory, defaults to `{data_home}`. An empty string means disabling.
## @toml2docs:none-default
#+ cache_path = ""
# Storage manifest options
[storage.manifest]
# Region checkpoint actions margin.
# Create a checkpoint every <checkpoint_margin> actions.
checkpoint_margin=10
# Region manifest logs and checkpoints gc execution duration
gc_duration='30s'
# Whether to try creating a manifest checkpoint on region opening
checkpoint_on_startup=false
## The local file cache capacity in bytes. If your disk space is sufficient, it is recommended to set it larger.
## @toml2docs:none-default
cache_capacity="5GiB"
# Storage flush options
[storage.flush]
# Max inflight flush tasks.
max_flush_tasks=8
# Default write buffer size for a region.
region_write_buffer_size="32MB"
# Interval to check whether a region needs flush.
picker_schedule_interval="5m"
# Interval to auto flush a region if it has not flushed yet.
## The S3 bucket name.
## **It's only used when the storage type is `S3`, `Oss` and `Gcs`**.
## @toml2docs:none-default
bucket="greptimedb"
## The S3 data will be stored in the specified prefix, for example, `s3://${bucket}/${root}`.
## **It's only used when the storage type is `S3`, `Oss` and `Azblob`**.
## @toml2docs:none-default
root="greptimedb"
## The access key id of the aws account.
## It's **highly recommended** to use AWS IAM roles instead of hardcoding the access key id and secret key.
## **It's only used when the storage type is `S3` and `Oss`**.
## @toml2docs:none-default
access_key_id="test"
## The secret access key of the aws account.
## It's **highly recommended** to use AWS IAM roles instead of hardcoding the access key id and secret key.
## **It's only used when the storage type is `S3`**.
## @toml2docs:none-default
secret_access_key="test"
## The secret access key of the aliyun account.
## **It's only used when the storage type is `Oss`**.
## @toml2docs:none-default
access_key_secret="test"
## The account name of the azure account.
## **It's only used when the storage type is `Azblob`**.
## @toml2docs:none-default
account_name="test"
## The account key of the azure account.
## **It's only used when the storage type is `Azblob`**.
## @toml2docs:none-default
account_key="test"
## The scope of the google cloud storage.
## **It's only used when the storage type is `Gcs`**.
## @toml2docs:none-default
scope="test"
## The credential path of the google cloud storage.
## **It's only used when the storage type is `Gcs`**.
## @toml2docs:none-default
credential_path="test"
## The credential of the google cloud storage.
## **It's only used when the storage type is `Gcs`**.
## @toml2docs:none-default
credential="base64-credential"
## The container of the azure account.
## **It's only used when the storage type is `Azblob`**.
## @toml2docs:none-default
container="greptimedb"
## The sas token of the azure account.
## **It's only used when the storage type is `Azblob`**.
## @toml2docs:none-default
sas_token=""
## The endpoint of the S3 service.
## **It's only used when the storage type is `S3`, `Oss`, `Gcs` and `Azblob`**.
## @toml2docs:none-default
endpoint="https://s3.amazonaws.com"
## The region of the S3 service.
## **It's only used when the storage type is `S3`, `Oss`, `Gcs` and `Azblob`**.
## @toml2docs:none-default
region="us-west-2"
## The http client options to the storage.
## **It's only used when the storage type is `S3`, `Oss`, `Gcs` and `Azblob`**.
[storage.http_client]
## The maximum idle connection per host allowed in the pool.
pool_max_idle_per_host=1024
## The timeout for only the connect phase of a http client.
connect_timeout="30s"
## The total request timeout, applied from when the request starts connecting until the response body has finished.
## Also considered a total deadline.
timeout="30s"
## The timeout for idle sockets being kept-alive.
pool_idle_timeout="90s"
# Custom storage options
# [[storage.providers]]
# name = "S3"
# type = "S3"
# bucket = "greptimedb"
# root = "data"
# access_key_id = "test"
# secret_access_key = "123456"
# endpoint = "https://s3.amazonaws.com"
# region = "us-west-2"
# [[storage.providers]]
# name = "Gcs"
# type = "Gcs"
# bucket = "greptimedb"
# root = "data"
# scope = "test"
# credential_path = "123456"
# credential = "base64-credential"
# endpoint = "https://storage.googleapis.com"
## The region engine options. You can configure multiple region engines.
[[region_engine]]
## The Mito engine options.
[region_engine.mito]
## Number of region workers.
#+ num_workers = 8
## Request channel size of each worker.
worker_channel_size=128
## Max batch size for a worker to handle requests.
worker_request_batch_size=64
## Number of meta action updated to trigger a new checkpoint for the manifest.
manifest_checkpoint_distance=10
## Whether to compress manifest and checkpoint file by gzip (default false).
compress_manifest=false
## Max number of running background flush jobs (default: 1/2 of cpu cores).
## @toml2docs:none-default="Auto"
#+ max_background_flushes = 4
## Max number of running background compaction jobs (default: 1/4 of cpu cores).
## @toml2docs:none-default="Auto"
#+ max_background_compactions = 2
## Max number of running background purge jobs (default: number of cpu cores).
## @toml2docs:none-default="Auto"
#+ max_background_purges = 8
## Interval to auto flush a region if it has not flushed yet.
auto_flush_interval="1h"
# Global write buffer size for all regions.
global_write_buffer_size="1GB"
# Procedure storage options, see `standalone.example.toml`.
[procedure]
max_retry_times=3
retry_delay="500ms"
## Global write buffer size for all regions. If not set, it's default to 1/8 of OS memory with a max limitation of 1GB.
## Global write buffer size threshold to reject write requests. If not set, it's default to 2 times of `global_write_buffer_size`
## @toml2docs:none-default="Auto"
#+ global_write_buffer_reject_size = "2GB"
## Cache size for SST metadata. Setting it to 0 to disable the cache.
## If not set, it's default to 1/32 of OS memory with a max limitation of 128MB.
## @toml2docs:none-default="Auto"
#+ sst_meta_cache_size = "128MB"
## Cache size for vectors and arrow arrays. Setting it to 0 to disable the cache.
## If not set, it's default to 1/16 of OS memory with a max limitation of 512MB.
## @toml2docs:none-default="Auto"
#+ vector_cache_size = "512MB"
## Cache size for pages of SST row groups. Setting it to 0 to disable the cache.
## If not set, it's default to 1/8 of OS memory.
## @toml2docs:none-default="Auto"
#+ page_cache_size = "512MB"
## Cache size for time series selector (e.g. `last_value()`). Setting it to 0 to disable the cache.
## If not set, it's default to 1/16 of OS memory with a max limitation of 512MB.
## @toml2docs:none-default="Auto"
#+ selector_result_cache_size = "512MB"
## Whether to enable the write cache, it's enabled by default when using object storage. It is recommended to enable it when using object storage for better performance.
enable_write_cache=false
## File system path for write cache, defaults to `{data_home}`.
write_cache_path=""
## Capacity for write cache. If your disk space is sufficient, it is recommended to set it larger.
write_cache_size="5GiB"
## TTL for write cache.
## @toml2docs:none-default
write_cache_ttl="8h"
## Buffer size for SST writing.
sst_write_buffer_size="8MB"
## Capacity of the channel to send data from parallel scan tasks to the main task.
parallel_scan_channel_size=32
## Whether to allow stale WAL entries read during replay.
allow_stale_entries=false
## Minimum time interval between two compactions.
## To align with the old behavior, the default value is 0 (no restrictions).
min_compaction_interval="0m"
## The options for index in Mito engine.
[region_engine.mito.index]
## Auxiliary directory path for the index in filesystem, used to store intermediate files for
## creating the index and staging files for searching the index, defaults to `{data_home}/index_intermediate`.
## The default name for this directory is `index_intermediate` for backward compatibility.
##
## This path contains two subdirectories:
## - `__intm`: for storing intermediate files used during creating index.
## - `staging`: for storing staging files used during searching index.
aux_path=""
## The max capacity of the staging directory.
staging_size="2GB"
## The TTL of the staging directory.
## Defaults to 7 days.
## Setting it to "0s" to disable TTL.
staging_ttl="7d"
## Cache size for inverted index metadata.
metadata_cache_size="64MiB"
## Cache size for inverted index content.
content_cache_size="128MiB"
## Page size for inverted index content cache.
content_cache_page_size="64KiB"
## Cache size for index result.
result_cache_size="128MiB"
## The options for inverted index in Mito engine.
[region_engine.mito.inverted_index]
## Whether to create the index on flush.
## - `auto`: automatically (default)
## - `disable`: never
create_on_flush="auto"
## Whether to create the index on compaction.
## - `auto`: automatically (default)
## - `disable`: never
create_on_compaction="auto"
## Whether to apply the index on query
## - `auto`: automatically (default)
## - `disable`: never
apply_on_query="auto"
## Memory threshold for performing an external sort during index creation.
## - `auto`: automatically determine the threshold based on the system memory size (default)
## - `unlimited`: no memory limit
## - `[size]` e.g. `64MB`: fixed memory threshold
mem_threshold_on_create="auto"
## Deprecated, use `region_engine.mito.index.aux_path` instead.
intermediate_path=""
## The options for full-text index in Mito engine.
[region_engine.mito.fulltext_index]
## Whether to create the index on flush.
## - `auto`: automatically (default)
## - `disable`: never
create_on_flush="auto"
## Whether to create the index on compaction.
## - `auto`: automatically (default)
## - `disable`: never
create_on_compaction="auto"
## Whether to apply the index on query
## - `auto`: automatically (default)
## - `disable`: never
apply_on_query="auto"
## Memory threshold for index creation.
## - `auto`: automatically determine the threshold based on the system memory size (default)
## - `unlimited`: no memory limit
## - `[size]` e.g. `64MB`: fixed memory threshold
mem_threshold_on_create="auto"
## The options for bloom filter index in Mito engine.
[region_engine.mito.bloom_filter_index]
## Whether to create the index on flush.
## - `auto`: automatically (default)
## - `disable`: never
create_on_flush="auto"
## Whether to create the index on compaction.
## - `auto`: automatically (default)
## - `disable`: never
create_on_compaction="auto"
## Whether to apply the index on query
## - `auto`: automatically (default)
## - `disable`: never
apply_on_query="auto"
## Memory threshold for the index creation.
## - `auto`: automatically determine the threshold based on the system memory size (default)
## - `unlimited`: no memory limit
## - `[size]` e.g. `64MB`: fixed memory threshold
mem_threshold_on_create="auto"
[region_engine.mito.memtable]
## Memtable type.
## - `time_series`: time-series memtable
## - `partition_tree`: partition tree memtable (experimental)
type="time_series"
## The max number of keys in one shard.
## Only available for `partition_tree` memtable.
index_max_keys_per_shard=8192
## The max rows of data inside the actively writing buffer in one shard.
## Only available for `partition_tree` memtable.
data_freeze_threshold=32768
## Max dictionary bytes.
## Only available for `partition_tree` memtable.
fork_dictionary_bytes="1GiB"
[[region_engine]]
## Enable the file engine.
[region_engine.file]
[[region_engine]]
## Metric engine options.
[region_engine.metric]
## Whether to enable the experimental sparse primary key encoding.
experimental_sparse_primary_key_encoding=false
## The logging options.
[logging]
## The directory to store the log files. If set to empty, logs will not be written to files.
dir="./greptimedb_data/logs"
## The log level. Can be `info`/`debug`/`warn`/`error`.
## @toml2docs:none-default
level="info"
## Enable OTLP tracing.
enable_otlp_tracing=false
## The OTLP tracing endpoint.
otlp_endpoint="http://localhost:4317"
## Whether to append logs to stdout.
append_stdout=true
## The log format. Can be `text`/`json`.
log_format="text"
## The maximum amount of log files.
max_log_files=720
## The percentage of tracing will be sampled and exported.
## Valid range `[0, 1]`, 1 means all traces are sampled, 0 means all traces are not sampled, the default value is 1.
## ratio > 1 are treated as 1. Fractions < 0 are treated as 0
[logging.tracing_sample_ratio]
default_ratio=1.0
## The datanode can export its metrics and send to Prometheus compatible service (e.g. `greptimedb` itself) from remote-write API.
## This is only used for `greptimedb` to export its own metrics internally. It's different from prometheus scrape.
[export_metrics]
## Whether to enable export metrics.
enable=false
## The interval of export metrics.
write_interval="30s"
[export_metrics.remote_write]
## The prometheus remote write endpoint that the metrics send to. The url example can be: `http://127.0.0.1:4000/v1/prometheus/write?db=greptime_metrics`.
url=""
## HTTP headers of Prometheus remote-write carry.
headers={}
## The tracing options. Only takes effect when compiled with the `tokio-console` feature.
# gRPC server options, see `standalone.example.toml`.
[grpc_options]
addr="127.0.0.1:4001"
## The gRPC server options.
[grpc]
## The address to bind the gRPC server.
bind_addr="127.0.0.1:4001"
## The address advertised to the metasrv, and used for connections from outside the host.
## If left empty or unset, the server will automatically use the IP address of the first network interface
## on the host, with the same port number as the one specified in `grpc.bind_addr`.
server_addr="127.0.0.1:4001"
## The number of server worker threads.
runtime_size=8
## Compression mode for frontend side Arrow IPC service. Available options:
## - `none`: disable all compression
## - `transport`: only enable gRPC transport compression (zstd)
## - `arrow_ipc`: only enable Arrow IPC compression (lz4)
## - `all`: enable all compression.
## Default to `none`
flight_compression="arrow_ipc"
# MySQL server options, see `standalone.example.toml`.
[mysql_options]
## gRPC server TLS options, see `mysql.tls` section.
[grpc.tls]
## TLS mode.
mode="disable"
## Certificate file path.
## @toml2docs:none-default
cert_path=""
## Private key file path.
## @toml2docs:none-default
key_path=""
## Watch for Certificate and key file change and auto reload.
## For now, gRPC tls config does not support auto reload.
watch=false
## MySQL server options.
[mysql]
## Whether to enable.
enable=true
## The addr to bind the MySQL server.
addr="127.0.0.1:4002"
## The number of server worker threads.
runtime_size=2
## Server-side keep-alive time.
## Set to 0 (default) to disable.
keep_alive="0s"
# MySQL server TLS options, see `standalone.example.toml`.
[mysql_options.tls]
# MySQL server TLS options.
[mysql.tls]
## TLS mode, refer to https://www.postgresql.org/docs/current/libpq-ssl.html
## - `disable` (default value)
## - `prefer`
## - `require`
## - `verify-ca`
## - `verify-full`
mode="disable"
## Certificate file path.
## @toml2docs:none-default
cert_path=""
## Private key file path.
## @toml2docs:none-default
key_path=""
# PostgreSQL server options, see `standalone.example.toml`.
[postgres_options]
## Watch for certificate and key file changes and reload automatically.
watch=false
## PostgreSQL server options.
[postgres]
## Whether to enable.
enable=true
## The addr to bind the PostgreSQL server.
addr="127.0.0.1:4003"
## The number of server worker threads.
runtime_size=2
## Server-side keep-alive time.
## Set to 0 (default) to disable.
keep_alive="0s"
# PostgreSQL server TLS options, see `standalone.example.toml`.
[postgres_options.tls]
## PostgreSQL server TLS options, see `mysql.tls` section.
[postgres.tls]
## TLS mode.
mode="disable"
## Certificate file path.
## @toml2docs:none-default
cert_path=""
## Private key file path.
## @toml2docs:none-default
key_path=""
# OpenTSDB protocol options, see `standalone.example.toml`.
[opentsdb_options]
addr="127.0.0.1:4242"
runtime_size=2
## Watch for certificate and key file changes and reload automatically.
watch=false
# InfluxDB protocol options, see `standalone.example.toml`.
[influxdb_options]
## OpenTSDB protocol options.
[opentsdb]
## Whether to enable OpenTSDB put in HTTP API.
enable=true
# Prometheus protocol options, see `standalone.example.toml`.
[prometheus_options]
## InfluxDB protocol options.
[influxdb]
## Whether to enable InfluxDB protocol in HTTP API.
enable=true
# Prometheus protocol options, see `standalone.example.toml`.
[prom_options]
addr="127.0.0.1:4004"
## Jaeger protocol options.
[jaeger]
## Whether to enable Jaeger protocol in HTTP API.
enable=true
# Metasrv client options, see `datanode.example.toml`.
[meta_client_options]
## Prometheus remote storage options
[prom_store]
## Whether to enable Prometheus remote write and read in HTTP API.
enable=true
## Whether to store the data from Prometheus remote write in metric engine.
with_metric_engine=true
## The metasrv client options.
[meta_client]
## The addresses of the metasrv.
metasrv_addrs=["127.0.0.1:3002"]
timeout_millis=3000
connect_timeout_millis=5000
## Operation timeout.
timeout="3s"
## Heartbeat timeout.
heartbeat_timeout="500ms"
## DDL timeout.
ddl_timeout="10s"
## Connect server timeout.
connect_timeout="1s"
## `TCP_NODELAY` option for accepted connections.
tcp_nodelay=true
# Log options, see `standalone.example.toml`
# [logging]
# dir = "/tmp/greptimedb/logs"
# level = "info"
## The configuration about the cache of the metadata.
metadata_cache_max_capacity=100000
## TTL of the metadata cache.
metadata_cache_ttl="10m"
# TTI of the metadata cache.
metadata_cache_tti="5m"
## The query engine options.
[query]
## Parallelism of the query engine.
## Default to 0, which means the number of CPU cores.
parallelism=0
## Datanode options.
[datanode]
## Datanode client options.
[datanode.client]
connect_timeout="10s"
tcp_nodelay=true
## The logging options.
[logging]
## The directory to store the log files. If set to empty, logs will not be written to files.
dir="./greptimedb_data/logs"
## The log level. Can be `info`/`debug`/`warn`/`error`.
## @toml2docs:none-default
level="info"
## Enable OTLP tracing.
enable_otlp_tracing=false
## The OTLP tracing endpoint.
otlp_endpoint="http://localhost:4317"
## Whether to append logs to stdout.
append_stdout=true
## The log format. Can be `text`/`json`.
log_format="text"
## The maximum amount of log files.
max_log_files=720
## The fraction of traces that will be sampled and exported.
## Valid range is `[0, 1]`: 1 means all traces are sampled, 0 means no traces are sampled. The default value is 1.
## Ratios > 1 are treated as 1. Ratios < 0 are treated as 0.
[logging.tracing_sample_ratio]
default_ratio=1.0
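# A minimal sketch, assuming you want to export roughly 10% of traces instead of all of them.
# The 0.1 value is illustrative only, not a recommended setting:
# [logging.tracing_sample_ratio]
# default_ratio = 0.1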
## The slow query log options.
[slow_query]
## Whether to enable slow query log.
enable=true
## The record type of slow queries. It can be `system_table` or `log`.
## If `system_table` is selected, the slow queries will be recorded in a system table `greptime_private.slow_queries`.
## If `log` is selected, the slow queries will be logged in a log file `greptimedb-slow-queries.*`.
record_type="system_table"
## The threshold of a slow query. It can be a human-readable time string, for example: `10s`, `100ms`, `1s`.
threshold="30s"
## The sampling ratio of the slow query log. The value should be in the range (0, 1]. For example, `0.1` means 10% of the slow queries will be logged and `1.0` means all slow queries will be logged.
sample_ratio=1.0
## The TTL of the `slow_queries` system table. Default is `30d` when `record_type` is `system_table`.
ttl="30d"
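# A minimal sketch, assuming slow queries should be written to the `greptimedb-slow-queries.*` log file
# rather than the system table. The threshold and sampling values are illustrative only:
# [slow_query]
# enable = true
# record_type = "log"
# threshold = "10s"
# sample_ratio = 0.1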
## The frontend can export its metrics and send them to a Prometheus-compatible service (e.g. `greptimedb` itself) through the remote-write API.
## This is only used for `greptimedb` to export its own metrics internally. It is different from Prometheus scraping.
[export_metrics]
## Whether to enable exporting metrics.
enable=false
## The interval of export metrics.
write_interval="30s"
[export_metrics.remote_write]
## The Prometheus remote-write endpoint that the metrics are sent to. Example URL: `http://127.0.0.1:4000/v1/prometheus/write?db=greptime_metrics`.
url=""
## HTTP headers to send with Prometheus remote-write requests.
headers={}
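# A minimal sketch of enabling metric export, assuming a GreptimeDB HTTP endpoint at the example URL
# given in the comment above and an already created `greptime_metrics` database.
# The `Authorization` header and its value are hypothetical placeholders:
# [export_metrics]
# enable = true
# write_interval = "30s"
# [export_metrics.remote_write]
# url = "http://127.0.0.1:4000/v1/prometheus/write?db=greptime_metrics"
# headers = { Authorization = "Basic <base64-credentials>" }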
## The tracing options. Only takes effect when compiled with the `tokio-console` feature.
## The timeout above which a topic creation operation will be cancelled.
create_topic_timeout="30s"
# The Kafka SASL configuration.
# **It's only used when the provider is `kafka`**.
# Available SASL mechanisms:
# - `PLAIN`
# - `SCRAM-SHA-256`
# - `SCRAM-SHA-512`
# [wal.sasl]
# type = "SCRAM-SHA-512"
# username = "user_kafka"
# password = "secret"
# The Kafka TLS configuration.
# **It's only used when the provider is `kafka`**.
# [wal.tls]
# server_ca_cert_path = "/path/to/server_cert"
# client_cert_path = "/path/to/client_cert"
# client_key_path = "/path/to/key"
## The logging options.
[logging]
## The directory to store the log files. If set to empty, logs will not be written to files.
dir="./greptimedb_data/logs"
## The log level. Can be `info`/`debug`/`warn`/`error`.
## @toml2docs:none-default
level="info"
## Enable OTLP tracing.
enable_otlp_tracing=false
## The OTLP tracing endpoint.
otlp_endpoint="http://localhost:4317"
## Whether to append logs to stdout.
append_stdout=true
## The log format. Can be `text`/`json`.
log_format="text"
## The maximum amount of log files.
max_log_files=720
## The fraction of traces that will be sampled and exported.
## Valid range is `[0, 1]`: 1 means all traces are sampled, 0 means no traces are sampled. The default value is 1.
## Ratios > 1 are treated as 1. Ratios < 0 are treated as 0.
[logging.tracing_sample_ratio]
default_ratio=1.0
## The metasrv can export its metrics and send them to a Prometheus-compatible service (e.g. `greptimedb` itself) through the remote-write API.
## This is only used for `greptimedb` to export its own metrics internally. It is different from Prometheus scraping.
[export_metrics]
## Whether to enable exporting metrics.
enable=false
## The interval of export metrics.
write_interval="30s"
[export_metrics.remote_write]
## The Prometheus remote-write endpoint that the metrics are sent to. Example URL: `http://127.0.0.1:4000/v1/prometheus/write?db=greptime_metrics`.
url=""
## HTTP headers to send with Prometheus remote-write requests.
headers={}
## The tracing options. Only takes effect when compiled with the `tokio-console` feature.
## Defaults to 0, which means the number of CPU cores.
parallelism=0
## The data storage options.
[storage]
## The working home directory.
data_home="./greptimedb_data"
## The storage type used to store the data.
## - `File`: the data is stored in the local file system.
## - `S3`: the data is stored in the S3 object storage.
## - `Gcs`: the data is stored in the Google Cloud Storage.
## - `Azblob`: the data is stored in the Azure Blob Storage.
## - `Oss`: the data is stored in the Aliyun OSS.
type="File"
## Read cache configuration for object storage such as `S3`. It is configured by default when using object storage, and configuring it is recommended for better performance.
## A local file directory, defaults to `{data_home}`. An empty string means disabling.
## @toml2docs:none-default
#+ cache_path = ""
## The local file cache capacity in bytes. If your disk space is sufficient, it is recommended to set it larger.
## @toml2docs:none-default
cache_capacity="5GiB"
## The S3 bucket name.
## **It's only used when the storage type is `S3`, `Oss` and `Gcs`**.
## @toml2docs:none-default
bucket="greptimedb"
## The S3 data will be stored in the specified prefix, for example, `s3://${bucket}/${root}`.
## **It's only used when the storage type is `S3`, `Oss` and `Azblob`**.
## @toml2docs:none-default
root="greptimedb"
## The access key id of the AWS account.
## It's **highly recommended** to use AWS IAM roles instead of hardcoding the access key id and secret key.
## **It's only used when the storage type is `S3` and `Oss`**.
## @toml2docs:none-default
access_key_id="test"
## The secret access key of the AWS account.
## It's **highly recommended** to use AWS IAM roles instead of hardcoding the access key id and secret key.
## **It's only used when the storage type is `S3`**.
## @toml2docs:none-default
secret_access_key="test"
## The secret access key of the aliyun account.
## **It's only used when the storage type is `Oss`**.
## @toml2docs:none-default
access_key_secret="test"
## The account name of the Azure account.
## **It's only used when the storage type is `Azblob`**.
## @toml2docs:none-default
account_name="test"
## The account key of the azure account.
## **It's only used when the storage type is `Azblob`**.
## @toml2docs:none-default
account_key="test"
## The scope of the google cloud storage.
## **It's only used when the storage type is `Gcs`**.
## @toml2docs:none-default
scope="test"
## The credential path of the google cloud storage.
## **It's only used when the storage type is `Gcs`**.
## @toml2docs:none-default
credential_path="test"
## The credential of the google cloud storage.
## **It's only used when the storage type is `Gcs`**.
## @toml2docs:none-default
credential="base64-credential"
## The container of the azure account.
## **It's only used when the storage type is `Azblob`**.
## @toml2docs:none-default
container="greptimedb"
## The sas token of the azure account.
## **It's only used when the storage type is `Azblob`**.
## @toml2docs:none-default
sas_token=""
## The endpoint of the S3 service.
## **It's only used when the storage type is `S3`, `Oss`, `Gcs` and `Azblob`**.
## @toml2docs:none-default
endpoint="https://s3.amazonaws.com"
## The region of the S3 service.
## **It's only used when the storage type is `S3`, `Oss`, `Gcs` and `Azblob`**.
## @toml2docs:none-default
region="us-west-2"
## The http client options to the storage.
## **It's only used when the storage type is `S3`, `Oss`, `Gcs` and `Azblob`**.
[storage.http_client]
## The maximum idle connection per host allowed in the pool.
pool_max_idle_per_host=1024
## The timeout for only the connect phase of a http client.
connect_timeout="30s"
## The total request timeout, applied from when the request starts connecting until the response body has finished.
## Also considered a total deadline.
timeout="30s"
## The timeout for idle sockets being kept-alive.
pool_idle_timeout="90s"
# Custom storage options
# [[storage.providers]]
# name = "S3"
# type = "S3"
# bucket = "greptimedb"
# root = "data"
# access_key_id = "test"
# secret_access_key = "123456"
# endpoint = "https://s3.amazonaws.com"
# region = "us-west-2"
# [[storage.providers]]
# name = "Gcs"
# type = "Gcs"
# bucket = "greptimedb"
# root = "data"
# scope = "test"
# credential_path = "123456"
# credential = "base64-credential"
# endpoint = "https://storage.googleapis.com"
## The region engine options. You can configure multiple region engines.
[[region_engine]]
## The Mito engine options.
[region_engine.mito]
## Number of region workers.
#+ num_workers = 8
## Request channel size of each worker.
worker_channel_size=128
## Max batch size for a worker to handle requests.
worker_request_batch_size=64
## Number of meta actions applied before a new checkpoint is triggered for the manifest.
manifest_checkpoint_distance=10
## Whether to compress manifest and checkpoint file by gzip (default false).
compress_manifest=false
## Max number of running background flush jobs (default: 1/2 of cpu cores).
## @toml2docs:none-default="Auto"
#+ max_background_flushes = 4
## Max number of running background compaction jobs (default: 1/4 of cpu cores).
## @toml2docs:none-default="Auto"
#+ max_background_compactions = 2
## Max number of running background purge jobs (default: number of cpu cores).
## @toml2docs:none-default="Auto"
#+ max_background_purges = 8
## Interval to auto flush a region if it has not flushed yet.
auto_flush_interval="1h"
## Global write buffer size for all regions. If not set, it defaults to 1/8 of OS memory, capped at 1GB.
## @toml2docs:none-default="Auto"
#+ global_write_buffer_size = "1GB"
## Global write buffer size threshold to reject write requests. If not set, it defaults to 2 times `global_write_buffer_size`.
## @toml2docs:none-default="Auto"
#+ global_write_buffer_reject_size = "2GB"
## Cache size for SST metadata. Set it to 0 to disable the cache.
## If not set, it defaults to 1/32 of OS memory, capped at 128MB.
## @toml2docs:none-default="Auto"
#+ sst_meta_cache_size = "128MB"
## Cache size for vectors and arrow arrays. Set it to 0 to disable the cache.
## If not set, it defaults to 1/16 of OS memory, capped at 512MB.
## @toml2docs:none-default="Auto"
#+ vector_cache_size = "512MB"
## Cache size for pages of SST row groups. Set it to 0 to disable the cache.
## If not set, it defaults to 1/8 of OS memory.
## @toml2docs:none-default="Auto"
#+ page_cache_size = "512MB"
## Cache size for time series selector (e.g. `last_value()`). Set it to 0 to disable the cache.
## If not set, it defaults to 1/16 of OS memory, capped at 512MB.
## @toml2docs:none-default="Auto"
#+ selector_result_cache_size = "512MB"
## Whether to enable the write cache. It is enabled by default when using object storage, and enabling it is recommended for better performance.
enable_write_cache=false
## File system path for write cache, defaults to `{data_home}`.
write_cache_path=""
## Capacity for write cache. If your disk space is sufficient, it is recommended to set it larger.
write_cache_size="5GiB"
## TTL for write cache.
## @toml2docs:none-default
write_cache_ttl="8h"
## Buffer size for SST writing.
sst_write_buffer_size="8MB"
## Capacity of the channel to send data from parallel scan tasks to the main task.
parallel_scan_channel_size=32
## Whether to allow stale WAL entries read during replay.
allow_stale_entries=false
## Minimum time interval between two compactions.
## To align with the old behavior, the default value is 0 (no restrictions).
min_compaction_interval="0m"
## The options for index in Mito engine.
[region_engine.mito.index]
## Auxiliary directory path for the index in filesystem, used to store intermediate files for
## creating the index and staging files for searching the index, defaults to `{data_home}/index_intermediate`.
## The default name for this directory is `index_intermediate` for backward compatibility.
##
## This path contains two subdirectories:
## - `__intm`: for storing intermediate files used during creating index.
## - `staging`: for storing staging files used during searching index.
aux_path=""
## The max capacity of the staging directory.
staging_size="2GB"
## The TTL of the staging directory.
## Defaults to 7 days.
## Set it to "0s" to disable TTL.
staging_ttl="7d"
## Cache size for inverted index metadata.
metadata_cache_size="64MiB"
## Cache size for inverted index content.
content_cache_size="128MiB"
## Page size for inverted index content cache.
content_cache_page_size="64KiB"
## Cache size for index result.
result_cache_size="128MiB"
## The options for inverted index in Mito engine.
[region_engine.mito.inverted_index]
## Whether to create the index on flush.
## - `auto`: automatically (default)
## - `disable`: never
create_on_flush="auto"
## Whether to create the index on compaction.
## - `auto`: automatically (default)
## - `disable`: never
create_on_compaction="auto"
## Whether to apply the index on query
## - `auto`: automatically (default)
## - `disable`: never
apply_on_query="auto"
## Memory threshold for performing an external sort during index creation.
## - `auto`: automatically determine the threshold based on the system memory size (default)
## - `unlimited`: no memory limit
## - `[size]` e.g. `64MB`: fixed memory threshold
mem_threshold_on_create="auto"
## Deprecated, use `region_engine.mito.index.aux_path` instead.
intermediate_path=""
## The options for full-text index in Mito engine.
[region_engine.mito.fulltext_index]
## Whether to create the index on flush.
## - `auto`: automatically (default)
## - `disable`: never
create_on_flush="auto"
## Whether to create the index on compaction.
## - `auto`: automatically (default)
## - `disable`: never
create_on_compaction="auto"
## Whether to apply the index on query
## - `auto`: automatically (default)
## - `disable`: never
apply_on_query="auto"
## Memory threshold for index creation.
## - `auto`: automatically determine the threshold based on the system memory size (default)
## - `unlimited`: no memory limit
## - `[size]` e.g. `64MB`: fixed memory threshold
mem_threshold_on_create="auto"
## The options for bloom filter in Mito engine.
[region_engine.mito.bloom_filter_index]
## Whether to create the bloom filter on flush.
## - `auto`: automatically (default)
## - `disable`: never
create_on_flush="auto"
## Whether to create the bloom filter on compaction.
## - `auto`: automatically (default)
## - `disable`: never
create_on_compaction="auto"
## Whether to apply the bloom filter on query
## - `auto`: automatically (default)
## - `disable`: never
apply_on_query="auto"
## Memory threshold for bloom filter creation.
## - `auto`: automatically determine the threshold based on the system memory size (default)
## - `unlimited`: no memory limit
## - `[size]` e.g. `64MB`: fixed memory threshold
mem_threshold_on_create="auto"
[region_engine.mito.memtable]
## Memtable type.
## - `time_series`: time-series memtable
## - `partition_tree`: partition tree memtable (experimental)
type="time_series"
## The max number of keys in one shard.
## Only available for `partition_tree` memtable.
index_max_keys_per_shard=8192
## The max rows of data inside the actively writing buffer in one shard.
## Only available for `partition_tree` memtable.
data_freeze_threshold=32768
## Max dictionary bytes.
## Only available for `partition_tree` memtable.
fork_dictionary_bytes="1GiB"
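# A minimal sketch, assuming you want to try the experimental partition tree memtable.
# The numeric values simply repeat the ones shown above and are not tuning advice:
# [region_engine.mito.memtable]
# type = "partition_tree"
# index_max_keys_per_shard = 8192
# data_freeze_threshold = 32768
# fork_dictionary_bytes = "1GiB"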
[[region_engine]]
## Enable the file engine.
[region_engine.file]
[[region_engine]]
## Metric engine options.
[region_engine.metric]
## Whether to enable the experimental sparse primary key encoding.
experimental_sparse_primary_key_encoding=false
## The logging options.
[logging]
## The directory to store the log files. If set to empty, logs will not be written to files.
dir="./greptimedb_data/logs"
## The log level. Can be `info`/`debug`/`warn`/`error`.
## @toml2docs:none-default
level="info"
## Enable OTLP tracing.
enable_otlp_tracing=false
## The OTLP tracing endpoint.
otlp_endpoint="http://localhost:4317"
## Whether to append logs to stdout.
append_stdout=true
## The log format. Can be `text`/`json`.
log_format="text"
## The maximum amount of log files.
max_log_files=720
## The fraction of traces that will be sampled and exported.
## Valid range is `[0, 1]`: 1 means all traces are sampled, 0 means no traces are sampled. The default value is 1.
## Ratios > 1 are treated as 1. Ratios < 0 are treated as 0.
[logging.tracing_sample_ratio]
default_ratio=1.0
## The slow query log options.
[slow_query]
## Whether to enable slow query log.
#+ enable = false
## The record type of slow queries. It can be `system_table` or `log`.
## @toml2docs:none-default
#+ record_type = "system_table"
## The threshold of slow query.
## @toml2docs:none-default
#+ threshold = "10s"
## The sampling ratio of slow query log. The value should be in the range of (0, 1].
## @toml2docs:none-default
#+ sample_ratio = 1.0
## The standalone instance can export its metrics and send them to a Prometheus-compatible service (e.g. `greptimedb`) through the remote-write API.
## This is only used for `greptimedb` to export its own metrics internally. It is different from Prometheus scraping.
[export_metrics]
## Whether to enable exporting metrics.
enable=false
## The interval of export metrics.
write_interval="30s"
## For `standalone` mode, `self_import` is recommended for collecting the metrics the instance generates itself.
## You must create the database before enabling it.
[export_metrics.self_import]
## @toml2docs:none-default
db="greptime_metrics"
[export_metrics.remote_write]
## The Prometheus remote-write endpoint that the metrics are sent to. Example URL: `http://127.0.0.1:4000/v1/prometheus/write?db=greptime_metrics`.
url=""
## HTTP headers to send with Prometheus remote-write requests.
headers={}
## The tracing options. Only takes effect when compiled with the `tokio-console` feature.
Some files were not shown because too many files have changed in this diff