* test: fix unstable index meta list test
Signed-off-by: evenyag <realevenyag@gmail.com>
* fix: raise bucket_size threshold so sizes in [512, 999] are no longer bucketed to 0
Signed-off-by: evenyag <realevenyag@gmail.com>
---------
Signed-off-by: evenyag <realevenyag@gmail.com>
* feat: initial function rewriter for json_get
* feat: make sure rewrite rule is applied
* feat: keep analyzer's default rules
* feat: implement rewriter for arrow_cast
* test: add unit test for the rewriter
* chore: format
* refactor: extract some more functions
* Apply suggestion from @waynexia
Co-authored-by: Ruihang Xia <waynestxia@gmail.com>
---------
Co-authored-by: Ruihang Xia <waynestxia@gmail.com>
* perf: optimize Prometheus label name decoding with byte-level validation
Add `decode_label_name` and `validate_label_name` to skip redundant
UTF-8 validation for Prometheus label names, which are guaranteed ASCII
(`[a-zA-Z_][a-zA-Z0-9_]*`). Rename `validate_bytes` to `validate_utf8`
for clarity and add benchmarks for label name validation and UTF-8
validation (std vs simdutf8).
Signed-off-by: Lei, HUANG <mrsatangel@gmail.com>
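The idea described above can be sketched as follows. This is a minimal illustration and not the code from the commit: the function names are borrowed from the message, but the signatures and the `Option<&str>` return type are assumptions. Because every accepted byte is ASCII, the slice is already valid UTF-8 and a second UTF-8 validation pass can be skipped.

```rust
/// Returns true if `bytes` matches the Prometheus label name grammar
/// `[a-zA-Z_][a-zA-Z0-9_]*`. (Hypothetical signature for illustration.)
fn validate_label_name(bytes: &[u8]) -> bool {
    match bytes.split_first() {
        Some((first, rest)) => {
            // First byte: letter or underscore only.
            (first.is_ascii_alphabetic() || *first == b'_')
                // Remaining bytes: letters, digits, or underscore.
                && rest.iter().all(|b| b.is_ascii_alphanumeric() || *b == b'_')
        }
        // An empty name is not a valid label name.
        None => false,
    }
}

/// Decodes a label name without a separate UTF-8 validation pass.
/// (Hypothetical signature; the real function may return a `Result`.)
fn decode_label_name(bytes: &[u8]) -> Option<&str> {
    if validate_label_name(bytes) {
        // SAFETY: every byte was checked to be ASCII, and any ASCII
        // sequence is valid UTF-8.
        Some(unsafe { std::str::from_utf8_unchecked(bytes) })
    } else {
        None
    }
}
```

For example, `decode_label_name(b"__name__")` yields `Some("__name__")`, while a name starting with a digit or containing a non-ASCII byte yields `None`.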
* perf(servers): optimize validate_label_name with lookup table and loop unrolling
Signed-off-by: Lei, HUANG <mrsatangel@gmail.com>
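One way a lookup table and loop unrolling could be combined is sketched below. The table layout, the bit flags, the chunk width of 4, and the name `is_label_name_lut` are all assumptions made for illustration; the commit's actual implementation may differ.

```rust
// Bit flags stored per byte value (hypothetical layout).
const FIRST: u8 = 1; // byte may appear as the first character
const REST: u8 = 2; // byte may appear after the first character

/// Builds the 256-entry classification table at compile time.
const fn build_table() -> [u8; 256] {
    let mut table = [0u8; 256];
    let mut i = 0usize;
    while i < 256 {
        let b = i as u8;
        let is_alpha =
            (b >= b'a' && b <= b'z') || (b >= b'A' && b <= b'Z') || b == b'_';
        let is_digit = b >= b'0' && b <= b'9';
        if is_alpha {
            table[i] = FIRST | REST;
        } else if is_digit {
            table[i] = REST;
        }
        i += 1;
    }
    table
}

static LABEL_CHAR: [u8; 256] = build_table();

/// Validates `[a-zA-Z_][a-zA-Z0-9_]*` with one table load per byte.
fn is_label_name_lut(bytes: &[u8]) -> bool {
    let Some((&first, rest)) = bytes.split_first() else {
        return false;
    };
    if LABEL_CHAR[first as usize] & FIRST == 0 {
        return false;
    }
    // Manually unrolled: test four bytes per iteration. The AND of the
    // four flag bytes keeps the REST bit only if all four bytes have it.
    let mut chunks = rest.chunks_exact(4);
    for c in &mut chunks {
        let flags = LABEL_CHAR[c[0] as usize]
            & LABEL_CHAR[c[1] as usize]
            & LABEL_CHAR[c[2] as usize]
            & LABEL_CHAR[c[3] as usize];
        if flags & REST == 0 {
            return false;
        }
    }
    // Handle the up-to-three trailing bytes one at a time.
    chunks
        .remainder()
        .iter()
        .all(|&b| LABEL_CHAR[b as usize] & REST != 0)
}
```

The table replaces several range comparisons per byte with a single indexed load, and combining four lookups before branching reduces the number of branches in the hot loop. Whether this is faster in practice depends on the workload, which is what the benchmarks mentioned earlier are for.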
* perf/decode-prom-2:
- **Refactor UTF-8 Validation and Label Decoding**:
  - Removed `validate_utf8` method and integrated label name validation directly in `decode_label_name` in `http.rs`.
  - Updated `decode_label_name` to always enforce Prometheus label name validation across all modes.
  - Adjusted test cases in `http.rs` to reflect the new validation logic.
- **Enhance Label Validation in `prom_row_builder.rs`**:
  - Replaced UTF-8 validation with direct label name validation using `validate_label_name`.
  - Updated `decode_label_name` usage to return `&str` and adjusted related logic.
Signed-off-by: Lei, HUANG <mrsatangel@gmail.com>
* perf/decode-prom-2:
**Refactor `TableBuilder` to Use `RawBytes` for Column Indexes**
- Updated `TableBuilder` in `prom_row_builder.rs` to use `RawBytes` instead of `Vec<u8>` for `col_indexes`.
- Modified `with_capacity` method to directly insert `RawBytes` for timestamp and value columns.
- Adjusted schema handling to use `to_owned` for `tag_name` and directly insert `raw_tag_name` into `col_indexes`.
Signed-off-by: Lei, HUANG <mrsatangel@gmail.com>
* perf/decode-prom-2:
Refactor `PromWriteRequest` Method and Enhance Data Handling
- **Refactor Method**: Renamed the `merge` method to `decode` in `PromWriteRequest` to better reflect its functionality. Updated references in `prom_decode.rs`, `prom_store.rs`, and `prom_row_builder.rs`.
- **Enhance Data Handling**: Introduced `raw_data` field in `PromWriteRequest` to store a clone of the buffer for potential future use. Updated the `clear` method to reset `raw_data`.
Files affected: `prom_decode.rs`, `prom_store.rs`, `prom_row_builder.rs`, `proto.rs`.
Signed-off-by: Lei, HUANG <mrsatangel@gmail.com>
* perf/decode-prom-2:
- **Enhancement in `prom_row_builder.rs`:**
  - Added a new field `raw_data` of type `Bytes` to `TablesBuilder`.
  - Implemented `set_raw_data` method to update `raw_data`.
  - Modified `clear` method to reset `raw_data`.
- **Refactor in `proto.rs`:**
  - Removed `raw_data` field from `PromWriteRequest`.
  - Updated `decode_and_process` method to use `set_raw_data` from `TablesBuilder` for handling raw data.
Signed-off-by: Lei, HUANG <mrsatangel@gmail.com>
* chore: remove duplicated validation
Signed-off-by: Lei, HUANG <mrsatangel@gmail.com>
* perf/decode-prom-2:
Refactor `TablesBuilder` and `TableBuilder` to Use Lifetime Annotations
- Updated `prom_store.rs`:
  - Modified `PROM_WRITE_REQUEST_POOL` and `decode_remote_write_request` to use lifetime annotations for `PromWriteRequest` and `TablesBuilder`.
- Updated `prom_row_builder.rs`:
  - Refactored `TablesBuilder` and `TableBuilder` structs to include lifetime annotations.
  - Adjusted methods in `TablesBuilder` and `TableBuilder` to accommodate lifetime changes.
- Updated `proto.rs`:
  - Added lifetime annotations to `PromWriteRequest` and its methods.
  - Modified `add_to_table_data` to use lifetime annotations for `TablesBuilder`.
Signed-off-by: Lei, HUANG <mrsatangel@gmail.com>
* chore: fmt
Signed-off-by: Lei, HUANG <mrsatangel@gmail.com>
---------
Signed-off-by: Lei, HUANG <mrsatangel@gmail.com>
* feat: cast filters type for scanbench
Signed-off-by: evenyag <realevenyag@gmail.com>
* chore: pub file_range mod
Make the module public so that the public struct `FileRange` can be used from other modules.
Signed-off-by: evenyag <realevenyag@gmail.com>
* fix: add api as dev-dependency to cmd for clippy
Signed-off-by: evenyag <realevenyag@gmail.com>
* feat: support profiling after warmup
Signed-off-by: evenyag <realevenyag@gmail.com>
---------
Signed-off-by: evenyag <realevenyag@gmail.com>
* perf/prom-decode:
**Refactor `PromLabel` to Use Raw Byte Slices**
- Updated `PromLabel` struct in `proto.rs` to use `RawBytes` for `name` and `value` fields, replacing `Bytes` with static byte slices.
- Modified test cases in `prom_row_builder.rs` to accommodate changes in `PromLabel` by using byte literals.
- Simplified `merge_bytes` function in `proto.rs` to directly assign byte slices, removing unnecessary memory operations.
Signed-off-by: Lei, HUANG <mrsatangel@gmail.com>
* perf/prom-decode:
- **Add UTF-8 Validation**: Introduced `validate_bytes` method in `http.rs` to validate UTF-8 encoding using `simdutf8` for `PromValidationMode::Strict`.
- **Update Column Indexing**: Modified `prom_row_builder.rs` to use `Vec<u8>` for `col_indexes` keys, ensuring UTF-8 validation for label names.
- **Dependency Update**: Added `simdutf8` version `0.1.5` to `Cargo.toml` and updated `Cargo.lock` to include this new dependency.
Signed-off-by: Lei, HUANG <mrsatangel@gmail.com>
* fix: style issues
Signed-off-by: Lei, HUANG <mrsatangel@gmail.com>
---------
Signed-off-by: Lei, HUANG <mrsatangel@gmail.com>
* chore(version): refresh build info on demand
Introduce a `refresh-build-info` feature to `common-version` to control
whether build timestamps are updated. By default, timestamps are no longer
refreshed, and `shadow.rs` regeneration is skipped if it already exists.
This prevents the build script from invalidating incremental compilation
results when nothing else has changed. CI and release builds are updated
to explicitly enable this feature.
Signed-off-by: Lei, HUANG <mrsatangel@gmail.com>
* chore/refresh-build-info-on-demand:
### Update Build Configuration
- **Remove `refresh-build-info` Feature:**
  - Removed the `refresh-build-info` feature from `action.yml`, `release.yml`, and `Cargo.toml`.
  - Updated `build.rs` to refresh timestamps by default in release builds, with an option to disable via `DISABLE_BUILD_INFO`.
- **Modify GitHub Actions:**
  - Updated `.github/actions/build-linux-artifacts/action.yml` and `.github/workflows/release.yml` to exclude `refresh-build-info` from the `features` list.
- **Enhance Build Script Logic:**
  - Adjusted logic in `build.rs` to handle timestamp refreshing based on build profile and environment variables.
Signed-off-by: Lei, HUANG <mrsatangel@gmail.com>
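The decision described above (refresh in release builds unless `DISABLE_BUILD_INFO` opts out) might look roughly like the sketch below. The function name `should_refresh_build_info` is hypothetical, and treating the mere presence of `DISABLE_BUILD_INFO` as "disabled" is an assumption; the real `build.rs` may inspect the variable's value instead.

```rust
/// Decides whether the build script should regenerate build timestamps.
///
/// `profile` is the Cargo build profile ("release" or "debug").
/// `disable_build_info` is the value of the `DISABLE_BUILD_INFO`
/// environment variable, if set. Assumption: any value disables refresh.
fn should_refresh_build_info(profile: &str, disable_build_info: Option<&str>) -> bool {
    let disabled = disable_build_info.is_some();
    // Only release builds refresh by default; debug builds keep the
    // existing build info so incremental compilation is not invalidated.
    profile == "release" && !disabled
}
```

In a build script, Cargo exposes the profile through the `PROFILE` environment variable, so the inputs could be gathered with `std::env::var("PROFILE")` and `std::env::var("DISABLE_BUILD_INFO").ok()` before calling this helper.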
---------
Signed-off-by: Lei, HUANG <mrsatangel@gmail.com>