Implement MVP split logic with checker-safe degrade paths and move module under utils/split with aligned split naming and tests.
Signed-off-by: WenyXu <wenymedia@gmail.com>
* refactor/prom-related-code:
Refactor Byte Handling and Improve Decoding Logic
- **`prom_decode.rs`**: Removed `Bytes` usage in favor of `Vec<u8>` for handling raw data, improving memory management and simplifying the decoding process.
- **`prom_store.rs`**: Updated `try_decompress` function to return `Vec<u8>` instead of `Bytes`, aligning with the new data handling approach.
- **`prom_row_builder.rs`**: Modified `TablesBuilder` to use `Vec<u8>` for `raw_data`, enhancing data manipulation capabilities.
- **`proto.rs`**: Refactored `PromWriteRequest` decoding logic to use `Vec<u8>`, optimizing the buffer management and decoding flow.
Signed-off-by: Lei, HUANG <mrsatangel@gmail.com>
* refactor: mod structure
Signed-off-by: Lei, HUANG <mrsatangel@gmail.com>
* refactor/prom-related-code:
- **Refactor `prom_store.rs` and `prom_remote_write/mod.rs`:** Moved `decode_remote_write_request` and `try_decompress` functions from `prom_store.rs` to `prom_remote_write/mod.rs`. This change centralizes the logic related to remote write request decoding and decompression.
- **Update `PromValidationMode` in `validation.rs`:** Implemented `Default` trait using the `#[derive(Default)]` attribute for `PromValidationMode` and updated related methods to use `Result` instead of `std::result::Result`.
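
A minimal sketch of the derive-based `Default` described above. Only the `Strict` variant is named elsewhere in this log; the `Lossy` and `Unchecked` variants and the choice of `Strict` as the default are assumptions made for illustration.

```rust
/// Hypothetical stand-in for `PromValidationMode`.
/// Only `Strict` appears in this log; the other variants and the
/// default choice are assumptions.
#[derive(Debug, Clone, Copy, PartialEq, Eq, Default)]
pub enum PromValidationMode {
    /// Marked with `#[default]`, so no hand-written `impl Default` is needed.
    #[default]
    Strict,
    Lossy,
    Unchecked,
}
```

With the attribute in place, `PromValidationMode::default()` yields the marked variant and the manual `impl Default` block can be dropped.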
Signed-off-by: Lei, HUANG <mrsatangel@gmail.com>
* refactor/prom-related-code:
### Remove `proto.rs` and Update References
- **Removed**: Deleted the `proto.rs` file, which contained re-exports for Prometheus remote write decode types.
- **Updated References**: Adjusted references to `PromSeriesProcessor` and `PromWriteRequest` in `prom_decode.rs` and `prom_store.rs` to import directly from `prom_remote_write`.
- **Modified Modules**: Removed the `proto` module from `lib.rs`.
Signed-off-by: Lei, HUANG <mrsatangel@gmail.com>
* fix: lint
Signed-off-by: Lei, HUANG <mrsatangel@gmail.com>
* fix: remove assert_eq
Signed-off-by: Lei, HUANG <mrsatangel@gmail.com>
* refactor/prom-related-code:
### Refactor Prometheus Remote Write Module
- **Modularization of `prom_remote_write`:**
  - Split `PromValidationMode` and `validate_label_name` into a new `validation` module.
  - Moved `PromSeriesProcessor` and `PromWriteRequest` to a `decode` module.
  - Separated `PromLabel` into a `types` module and adjusted visibility.
- **Visibility Adjustments:**
  - Changed `PromTimeSeries` and `PromLabel` structs to `pub(crate)` for internal use.
- **File Updates:**
  - Updated references in `prom_decode.rs`, `http.rs`, `prom_store.rs`, `decode.rs`, `mod.rs`, `row_builder.rs`, `types.rs`, `prom_store_test.rs`, and `test_util.rs` to reflect module changes.
Signed-off-by: Lei, HUANG <mrsatangel@gmail.com>
---------
Signed-off-by: Lei, HUANG <mrsatangel@gmail.com>
* test: fix unstable index meta list test
Signed-off-by: evenyag <realevenyag@gmail.com>
* fix: raise bucket_size threshold to avoid bucketing sizes in [512, 999] to 0
Signed-off-by: evenyag <realevenyag@gmail.com>
---------
Signed-off-by: evenyag <realevenyag@gmail.com>
* feat: initial function rewriter for json_get
* feat: make sure rewrite rule is applied
* feat: keep analyzer's default rules
* feat: implement rewriter for arrow_cast
* test: add unit test for the rewriter
* chore: format
* refactor: extract some more functions
* Apply suggestion from @waynexia
Co-authored-by: Ruihang Xia <waynestxia@gmail.com>
---------
Co-authored-by: Ruihang Xia <waynestxia@gmail.com>
* perf: optimize Prometheus label name decoding with byte-level validation
Add `decode_label_name` and `validate_label_name` to skip redundant
UTF-8 validation for Prometheus label names, which are guaranteed ASCII
(`[a-zA-Z_][a-zA-Z0-9_]*`). Rename `validate_bytes` to `validate_utf8`
for clarity and add benchmarks for label name validation and UTF-8
validation (std vs simdutf8).
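
The idea above can be sketched roughly as follows. The function names match the ones in this message, but the signatures and bodies are assumptions, not the code from this change: once every byte is known to match `[a-zA-Z_][a-zA-Z0-9_]*`, the name is pure ASCII and the general UTF-8 check can be skipped.

```rust
/// Returns true if `name` matches `[a-zA-Z_][a-zA-Z0-9_]*`.
/// Signature is an assumption for illustration.
pub fn validate_label_name(name: &[u8]) -> bool {
    match name.split_first() {
        Some((first, rest)) => {
            (first.is_ascii_alphabetic() || *first == b'_')
                && rest.iter().all(|b| b.is_ascii_alphanumeric() || *b == b'_')
        }
        // An empty name is not a valid label name.
        None => false,
    }
}

/// Decodes a label name without running a separate UTF-8 validation pass.
/// Returns `None` when the bytes are not a valid Prometheus label name.
pub fn decode_label_name(name: &[u8]) -> Option<&str> {
    if validate_label_name(name) {
        // SAFETY: every byte passed the ASCII check above,
        // and any ASCII sequence is valid UTF-8.
        Some(unsafe { std::str::from_utf8_unchecked(name) })
    } else {
        None
    }
}
```

Because the byte check already rejects every non-ASCII byte, the unchecked conversion cannot produce an invalid `&str`.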
Signed-off-by: Lei, HUANG <mrsatangel@gmail.com>
* perf(servers): optimize validate_label_name with lookup table and loop unrolling
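
One possible shape of a lookup-table validator with a manually unrolled loop is sketched below. The table layout, the chunk width of 4, and the name `validate_label_name_lut` are assumptions for illustration and may differ from the actual implementation.

```rust
/// Bit set when a byte may continue a label name (`[a-zA-Z0-9_]`).
const CONT: u8 = 0b10;
/// Bit set when a byte may start a label name (`[a-zA-Z_]`).
const START: u8 = 0b01;

/// Builds the 256-entry classification table at compile time.
const fn build_table() -> [u8; 256] {
    let mut table = [0u8; 256];
    let mut i = 0usize;
    while i < 256 {
        let b = i as u8;
        if b.is_ascii_alphabetic() || b == b'_' {
            table[i] = START | CONT;
        } else if b.is_ascii_digit() {
            table[i] = CONT;
        }
        i += 1;
    }
    table
}

static LABEL_TABLE: [u8; 256] = build_table();

/// Hypothetical table-driven check for `[a-zA-Z_][a-zA-Z0-9_]*`.
pub fn validate_label_name_lut(name: &[u8]) -> bool {
    let (first, rest) = match name.split_first() {
        Some(parts) => parts,
        None => return false,
    };
    if LABEL_TABLE[*first as usize] & START == 0 {
        return false;
    }
    // Unrolled by 4: AND the table entries so one branch covers four bytes.
    let mut chunks = rest.chunks_exact(4);
    for c in &mut chunks {
        let all = LABEL_TABLE[c[0] as usize]
            & LABEL_TABLE[c[1] as usize]
            & LABEL_TABLE[c[2] as usize]
            & LABEL_TABLE[c[3] as usize];
        if (all & CONT) == 0 {
            return false;
        }
    }
    // Fewer than four bytes remain; check them one by one.
    chunks
        .remainder()
        .iter()
        .all(|&b| (LABEL_TABLE[b as usize] & CONT) != 0)
}
```

ANDing the four table entries keeps the `CONT` bit only if all four bytes are valid, which replaces four range comparisons and branches with four loads and a single test.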
Signed-off-by: Lei, HUANG <mrsatangel@gmail.com>
* perf/decode-prom-2:
- **Refactor UTF-8 Validation and Label Decoding**:
  - Removed `validate_utf8` method and integrated label name validation directly in `decode_label_name` in `http.rs`.
  - Updated `decode_label_name` to always enforce Prometheus label name validation across all modes.
  - Adjusted test cases in `http.rs` to reflect the new validation logic.
- **Enhance Label Validation in `prom_row_builder.rs`**:
  - Replaced UTF-8 validation with direct label name validation using `validate_label_name`.
  - Updated `decode_label_name` usage to return `&str` and adjusted related logic.
Signed-off-by: Lei, HUANG <mrsatangel@gmail.com>
* perf/decode-prom-2:
**Refactor `TableBuilder` to Use `RawBytes` for Column Indexes**
- Updated `TableBuilder` in `prom_row_builder.rs` to use `RawBytes` instead of `Vec<u8>` for `col_indexes`.
- Modified `with_capacity` method to directly insert `RawBytes` for timestamp and value columns.
- Adjusted schema handling to use `to_owned` for `tag_name` and directly insert `raw_tag_name` into `col_indexes`.
Signed-off-by: Lei, HUANG <mrsatangel@gmail.com>
* perf/decode-prom-2:
Refactor `PromWriteRequest` Method and Enhance Data Handling
- **Refactor Method**: Renamed the `merge` method to `decode` in `PromWriteRequest` to better reflect its functionality. Updated references in `prom_decode.rs`, `prom_store.rs`, and `prom_row_builder.rs`.
- **Enhance Data Handling**: Introduced `raw_data` field in `PromWriteRequest` to store a clone of the buffer for potential future use. Updated the `clear` method to reset `raw_data`.
Files affected: `prom_decode.rs`, `prom_store.rs`, `prom_row_builder.rs`, `proto.rs`.
Signed-off-by: Lei, HUANG <mrsatangel@gmail.com>
* perf/decode-prom-2:
- **Enhancement in `prom_row_builder.rs`:**
  - Added a new field `raw_data` of type `Bytes` to `TablesBuilder`.
  - Implemented `set_raw_data` method to update `raw_data`.
  - Modified `clear` method to reset `raw_data`.
- **Refactor in `proto.rs`:**
  - Removed `raw_data` field from `PromWriteRequest`.
  - Updated `decode_and_process` method to use `set_raw_data` from `TablesBuilder` for handling raw data.
Signed-off-by: Lei, HUANG <mrsatangel@gmail.com>
* chore: remove duplicated validation
Signed-off-by: Lei, HUANG <mrsatangel@gmail.com>
* perf/decode-prom-2:
Refactor `TablesBuilder` and `TableBuilder` to Use Lifetime Annotations
- Updated `prom_store.rs`:
  - Modified `PROM_WRITE_REQUEST_POOL` and `decode_remote_write_request` to use lifetime annotations for `PromWriteRequest` and `TablesBuilder`.
- Updated `prom_row_builder.rs`:
  - Refactored `TablesBuilder` and `TableBuilder` structs to include lifetime annotations.
  - Adjusted methods in `TablesBuilder` and `TableBuilder` to accommodate lifetime changes.
- Updated `proto.rs`:
  - Added lifetime annotations to `PromWriteRequest` and its methods.
  - Modified `add_to_table_data` to use lifetime annotations for `TablesBuilder`.
Signed-off-by: Lei, HUANG <mrsatangel@gmail.com>
* chore: fmt
Signed-off-by: Lei, HUANG <mrsatangel@gmail.com>
---------
Signed-off-by: Lei, HUANG <mrsatangel@gmail.com>
* feat: cast filters type for scanbench
Signed-off-by: evenyag <realevenyag@gmail.com>
* chore: make file_range mod public
So the public struct `FileRange` can be used in other places.
Signed-off-by: evenyag <realevenyag@gmail.com>
* fix: add api as dev-dependency to cmd for clippy
Signed-off-by: evenyag <realevenyag@gmail.com>
* feat: support profiling after warmup
Signed-off-by: evenyag <realevenyag@gmail.com>
---------
Signed-off-by: evenyag <realevenyag@gmail.com>
* perf/prom-decode:
**Refactor `PromLabel` to Use Raw Byte Slices**
- Updated `PromLabel` struct in `proto.rs` to use `RawBytes` for `name` and `value` fields, replacing `Bytes` with static byte slices.
- Modified test cases in `prom_row_builder.rs` to accommodate changes in `PromLabel` by using byte literals.
- Simplified `merge_bytes` function in `proto.rs` to directly assign byte slices, removing unnecessary memory operations.
Signed-off-by: Lei, HUANG <mrsatangel@gmail.com>
* perf/prom-decode:
- **Add UTF-8 Validation**: Introduced `validate_bytes` method in `http.rs` to validate UTF-8 encoding using `simdutf8` for `PromValidationMode::Strict`.
- **Update Column Indexing**: Modified `prom_row_builder.rs` to use `Vec<u8>` for `col_indexes` keys, ensuring UTF-8 validation for label names.
- **Dependency Update**: Added `simdutf8` version `0.1.5` to `Cargo.toml` and updated `Cargo.lock` to include this new dependency.
Signed-off-by: Lei, HUANG <mrsatangel@gmail.com>
* fix: style issues
Signed-off-by: Lei, HUANG <mrsatangel@gmail.com>
---------
Signed-off-by: Lei, HUANG <mrsatangel@gmail.com>
* chore(version): refresh build info on demand
Introduce a `refresh-build-info` feature to `common-version` to control
whether build timestamps are updated. By default, timestamps are no longer
refreshed, and `shadow.rs` regeneration is skipped if it already exists.
This prevents the build script from invalidating incremental compilation
results when nothing else has changed. CI and release builds are updated
to explicitly enable this feature.
Signed-off-by: Lei, HUANG <mrsatangel@gmail.com>
* chore/refresh-build-info-on-demand:
### Update Build Configuration
- **Remove `refresh-build-info` Feature:**
  - Removed the `refresh-build-info` feature from `action.yml`, `release.yml`, and `Cargo.toml`.
  - Updated `build.rs` to refresh timestamps by default in release builds, with an option to disable via `DISABLE_BUILD_INFO`.
- **Modify GitHub Actions:**
  - Updated `.github/actions/build-linux-artifacts/action.yml` and `.github/workflows/release.yml` to exclude `refresh-build-info` from the `features` list.
- **Enhance Build Script Logic:**
  - Adjusted logic in `build.rs` to handle timestamp refreshing based on build profile and environment variables.
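
The decision described above might look roughly like the sketch below. The helper name `should_refresh_build_info` is hypothetical, and treating any value of `DISABLE_BUILD_INFO` (even an empty one) as "disabled" is an assumption; the real `build.rs` may interpret the variable differently.

```rust
/// Hypothetical helper: decides whether the build script refreshes timestamps.
/// `profile` mirrors the `PROFILE` variable Cargo passes to build scripts;
/// `disable` is the value of `DISABLE_BUILD_INFO`, if it is set at all.
pub fn should_refresh_build_info(profile: &str, disable: Option<&str>) -> bool {
    // Assumption: the mere presence of the variable disables refreshing.
    let disabled = disable.is_some();
    profile == "release" && !disabled
}
```

In a build script the inputs would typically come from `std::env::var("PROFILE")` and `std::env::var("DISABLE_BUILD_INFO").ok()`, so debug builds keep their incremental compilation results while release builds still get fresh timestamps.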
Signed-off-by: Lei, HUANG <mrsatangel@gmail.com>
---------
Signed-off-by: Lei, HUANG <mrsatangel@gmail.com>