* feat(copy_to_json): add `date_format`/`timestamp_format`/`time_format` for JSON format.
Signed-off-by: Yihai Lin <yihai-lin@foxmail.com>
* Update src/common/datasource/src/file_format/json.rs
Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>
Signed-off-by: Yihai Lin <yihai-lin@foxmail.com>
* chore: Use predefined constants as the time format.
Signed-off-by: Yihai Lin <yihai-lin@foxmail.com>
---------
Signed-off-by: Yihai Lin <yihai-lin@foxmail.com>
Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>
* feat: adds format, regex_extract function and more type tests
Signed-off-by: Dennis Zhuang <killme2008@gmail.com>
* fix: forgot functions
Signed-off-by: Dennis Zhuang <killme2008@gmail.com>
* chore: forgot null type
Signed-off-by: Dennis Zhuang <killme2008@gmail.com>
* test: forgot date type
Signed-off-by: Dennis Zhuang <killme2008@gmail.com>
* feat: remove format function
Signed-off-by: Dennis Zhuang <killme2008@gmail.com>
* test: update results after upgrading datafusion
Signed-off-by: Dennis Zhuang <killme2008@gmail.com>
---------
Signed-off-by: Dennis Zhuang <killme2008@gmail.com>
fix/disable-parquet-stats-truncate:
- **Update `memcomparable` Dependency**: Switched from crates.io to a Git repository for `memcomparable` in `Cargo.lock`, `mito-codec/Cargo.toml`, and removed it from `mito2/Cargo.toml`.
- **Enhance Parquet Writer Properties**: Added `set_statistics_truncate_length` and `set_column_index_truncate_length` to `WriterProperties` in `parquet.rs`, `bulk/part.rs`, `partition_tree/data.rs`, and `writer.rs`.
- **Add Test for Corrupt Scan**: Introduced a new test module `scan_corrupt.rs` in `mito2/src/engine` to verify handling of corrupt data.
- **Update Test Data**: Modified test data in `flush.rs` to reflect changes in file sizes and sequences.
Signed-off-by: Lei, HUANG <mrsatangel@gmail.com>
* wip
Signed-off-by: Lei, HUANG <mrsatangel@gmail.com>
* refactor/expose-bulk-symbols:
### Commit Message
Enhance DDL Module Accessibility and Refactor `verify_alter` Function
- **`statement.rs`**: Made the `ddl` module public to enhance accessibility.
- **`ddl.rs`**:
- Made `NAME_PATTERN_REG` public for broader usage.
- Refactored `verify_alter` function to be a standalone public function, improving modularity and reusability.
- Made `parse_partitions` function public to allow external access.
Signed-off-by: Lei, HUANG <mrsatangel@gmail.com>
* refactor/expose-bulk-symbols:
### Add Parquet Writer and Enhance Row Modifier
- **Add Parquet Writer Module**: Introduced a new module `parquet_writer.rs` to bridge `opendal` `Writer` with `parquet` `AsyncFileWriter`.
- **Enhance Row Modifier**: Updated `RowModifier` to use `Default` trait and made `fill_internal_columns` a public static method in `row_modifier.rs`.
- **Expose Internal Structures**: Made `RowsIter`, `RowIter`, `TablesBuilder`, and `TableBuilder` structs public in `row_modifier.rs` and `prom_row_builder.rs`.
- **Update Metric Engine**: Changed `RowModifier` instantiation to use `default()` in `engine.rs`.
- **Modify Table Options Handling**: Added `fill_table_options_for_create` function in `insert.rs` to handle table options based on `AutoCreateTableType`.
- **Make Constants Public**: Changed `DEFAULT_ROW_GROUP_SIZE` to public in `parquet.rs`.
- **Expose Functions**: Made `extract_add_columns_expr` public in `expr_helper.rs` and `AutoCreateTableType` public in `insert.rs`.
Signed-off-by: Lei, HUANG <mrsatangel@gmail.com>
* refactor/expose-bulk-symbols:
### Commit Message
Enhance HTTP Server and Prometheus Integration
- **`http.rs`**: Made `extractor` module public to allow external access.
- **`prom_store.rs`**: Refactored `decode_remote_write_request` to return `TablesBuilder` and adjusted logic for processing requests based on pipeline usage.
- **`lib.rs`**: Made `metrics` module public for broader accessibility.
- **`prom_row_builder.rs`**: Exposed `tables` field in `TablesBuilder` for external manipulation.
- **`proto.rs`**: Changed visibility of `table_data` in `PromWriteRequest` to `pub(crate)` for internal module access.
Signed-off-by: Lei, HUANG <mrsatangel@gmail.com>
* refactor/expose-bulk-symbols:
### Add Accessor Methods for Managers and Executors
- **`src/frontend/src/instance.rs`**: Added accessor methods for `NodeManagerRef`, `PartitionRuleManagerRef`, `CacheInvalidatorRef`, and `ProcedureExecutorRef` to the `Instance` struct.
- **`src/operator/src/insert.rs`**: Introduced methods to access `NodeManagerRef` and `PartitionRuleManagerRef` in the `Inserter` struct.
- **`src/operator/src/statement.rs`**: Added methods to retrieve `ProcedureExecutorRef` and `CacheInvalidatorRef` in the `StatementExecutor` struct.
### Change HashMap Implementation
- **`src/servers/src/prom_row_builder.rs`**: Replaced `ahash::HashMap` with `std::collections::HashMap`.
Signed-off-by: Lei, HUANG <mrsatangel@gmail.com>
* refactor/expose-bulk-symbols:
Refactor table option handling in `insert.rs`
- Replaced `Vec` with `HashMap` for `table_options` to improve efficiency.
- Extracted logic for filling table options into a new function `fill_table_options_for_create`.
- Modified `fill_table_options_for_create` to return the engine name based on `create_type`.
- Simplified the insertion of table options into `create_table_expr` by using `extend` method.
Signed-off-by: Lei, HUANG <mrsatangel@gmail.com>
* refactor/expose-bulk-symbols:
Refactor `insert.rs` to separate engine name logic from table options
- Updated `Inserter` implementation to determine `engine_name` separately from `fill_table_options_for_create`.
- Modified `fill_table_options_for_create` to no longer return an engine name, focusing solely on populating table options.
- Adjusted logic to set `engine_name` based on `AutoCreateTableType`, using `METRIC_ENGINE_NAME` for logical tables and `default_engine()` otherwise.
Signed-off-by: Lei, HUANG <mrsatangel@gmail.com>
---------
Signed-off-by: Lei, HUANG <mrsatangel@gmail.com>
* feat(object_store): add support for Alibaba Cloud OSS
- Implement OSS backend in object_store module
- Add OSS-related options to ExportCommand
- Update build_operator to support OSS
- Modify parse_url to handle OSS schema
Signed-off-by: Logic <zqr10159@dromara.org>
* feat(object_store): add support for Alibaba Cloud OSS
- Implement OSS backend in object_store module
- Add OSS-related options to ExportCommand
- Update build_operator to support OSS
- Modify parse_url to handle OSS schema
Signed-off-by: Logic <zqr10159@dromara.org>
* test(object_store): update OSS backend tests with comprehensive scenarios
- Remove minimal case test for OSS backend
- Update test for OSS backend with all fields valid- Remove invalid allow_anonymous test case
Signed-off-by: Logic <zqr10159@dromara.org>
* feat(datasource): add support for OSS (Object Storage Service)
- Implement is_supported_in_oss function to check if a key is supported in OSS configuration- Add build_oss_backend function for creating an OSS backend
- Update requests module to include OSS support check
Signed-off-by: Logic <zqr10159@dromara.org>
* refactor(export): enhance security and logging for sensitive data
- Replace plain strings with SecretString for sensitive information- Implement masking of sensitive data in SQL logs
- Update handling of S3 and OSS credentials
Signed-off-by: Logic <zqr10159@dromara.org>
* refactor(export): generalize remote storage support and rename options
- Rename `s3_ddl_local_dir` to `ddl_local_dir` for better clarity
- Update comments to support both S3 and OSS remote storage options
- Modify logic to handle remote storage options more generically
Signed-off-by: Logic <zqr10159@dromara.org>
* refactor(export): generalize remote storage support and rename options
- Rename `s3_ddl_local_dir` to `ddl_local_dir` for better clarity
- Update comments to support both S3 and OSS remote storage options
- Modify logic to handle remote storage options more generically
Signed-off-by: Logic <zqr10159@dromara.org>
---------
Signed-off-by: Logic <zqr10159@dromara.org>
* change dep
Signed-off-by: Ruihang Xia <waynestxia@gmail.com>
* feat: adapt to arrow's interval array
* chore: fix compile errors in datatypes crate
* chore: fix api crate compiler errors
* chore: fix compiler errors in common-grpc
* chore: fix common-datasource errors
* chore: fix deprecated code in common-datasource
* fix promql and physical plan related
Signed-off-by: Ruihang Xia <waynestxia@gmail.com>
* wip: upgrading network deps
Signed-off-by: Ruihang Xia <waynestxia@gmail.com>
* block on updating `sqlparser`
* upgrade sqlparser
Signed-off-by: Ruihang Xia <waynestxia@gmail.com>
* adapt new df's trait requirements
Signed-off-by: Ruihang Xia <waynestxia@gmail.com>
* chore: fix compiler errors in mito2
* chore: fix common-function crate errors
* chore: fix catalog errors
* change import path
Signed-off-by: Ruihang Xia <waynestxia@gmail.com>
* chore: fix some errors in query crate
* chore: fix some errors in query crate
* aggr expr and some other tiny fixes
Signed-off-by: Ruihang Xia <waynestxia@gmail.com>
* chore: fix expr related errors in query crate
* chore: fix query serializer and admin command
* chore: fix grpc services
* feat: axum serve
* chore: fix http server
* remove handle_error handler
* refactor timeout layer
* serve axum
* chore: fix flow aggr functions
* chore: fix flow
* feat: fix errors in meta-srv
* boxed()
* use TokioIo
* feat!: Remove script crate and python feature (#5321)
* feat: exclude script crate
* chore: simplify feature
* feat: remove the script crate
* chore: remove python feature and some comments
* chore: fix warning
* chore: fix servers tests compiler errors
* feat: fix tests-integration errors
* chore: fix unused
* test: fix catalog test
* chore: fix compiler errors for crates using common-meta
testing feature is enabled when check with --workspace
* test: use display for logical plan test
* test: implement rewrite for ScanHintRule
* fix: http server build panic
* test: fix mito test
* fix: sql parser type alias error
* test: fix TestClient not listen
* test: some flow tests
* test(flow): more fix
* fix: test_otlp_logs
* test: fix promql test that using deprecated method fun()
* fix: sql type replace supports Int8 ~ Int64, UInt8 ~ UInt64
* test: fix infer schema test case
* test: fix tests related to plan display
* chore: fix last flow test
* test: fix function format related assertion
* test: use larger port range for tests
* fix: test_otlp_traces
* fix: test_otlp_metrics
* fix range query and dist plan
Signed-off-by: Ruihang Xia <waynestxia@gmail.com>
* fix: flow handle distinct use deprecated field
* fix: can't pass Join plan expressions to LogicalPlan::with_new_exprs
* test: fix deserialize test
* test: reduce split key case num
* tests: lower case aggr func name
* test: fix some sqlness tests
* tests: more sqlness fix
* tests: fixed sqlness test
* commit non-bug changes
Signed-off-by: Ruihang Xia <waynestxia@gmail.com>
* fix: make our udf correct
* fix: implement empty methods of ContextProvider for DfContextProviderAdapter
* test: update sqlness test result
* chore: remove unused
* fix: provide alias name for AggregateExprBuilder in range plan
* test: update range query result
* fix: implement missing ContextProvider methods for DfContextProviderAdapter
* test: update timestamps, cte result
* fix: supports empty projection in mito
* test: update comment for cte test
* fix: support projection for numbers
* test: update test cases after projection fix
* fix: fix range select first_value/last_value
* fix: handle CAST and time index conflict
* fix: handle order by correctly in range first_value/last_value
* test: update sqlness result
* test: update view test result
* test: update decimal test
wait for https://github.com/apache/datafusion/pull/14126 to fix this
* feat: remove redundant physical optimization
todo(ruihang): Check if we can remove this.
* test: update sqlness test result
* chore: range select default sort use nulls_first = false
* test: update filter push down test result
* test: comment deciaml test to avoid different panic message
* test: update some distributed test result
* test: update test for distributed count and filter push down
* test: update subqueries test
* fix: SessionState may overwrite our UDFs
* chore: fix compiler errors after merging main
* fix: fix elasticsearch and dashboard router panic
* chore: fix common-functions tests
* chore: update sqlness result
* test: fix id keyword and update sqlness result
* test: fix flow_null test
* fix: enlarge thread size in debug mode to avoid overflow
* chore: fix warnings in common-function
* chore: fix warning in flow
* chore: fix warnings in query crate
* chore: remove unused warnings
* chore: fix deprecated warnings for parquet
* chore: fix deprecated warning in servers crate
* style: fix clippy
* test: enlarge mito cache tttl test ttl time
* chore: fix typo
* style: fmt toml
* refactor: reimplement PartialOrd for RangeSelect
* chore: remove script crate files introduced by merge
* fix: return error if sql option is not kv
* chore: do not use ..default::default()
* chore: per review
* chore: update error message in BuildAdminFunctionArgsSnafu
Co-authored-by: jeremyhi <jiachun_feng@proton.me>
* refactor: typed precision
* update sqlness view case
Signed-off-by: Ruihang Xia <waynestxia@gmail.com>
* chore: flow per review
* chore: add example in comment
* chore: warn if parquet stats of timestamp is not INT64
* style: add a newline before derive to make the comment more clear
* test: update sqlness result
* fix: flow from substrait
* chore: change update_range_context log to debug level
* chore: move axum-extra axum-macros to workspace
---------
Signed-off-by: Ruihang Xia <waynestxia@gmail.com>
Co-authored-by: Ruihang Xia <waynestxia@gmail.com>
Co-authored-by: luofucong <luofc@foxmail.com>
Co-authored-by: discord9 <discord9@163.com>
Co-authored-by: shuiyisong <xixing.sys@gmail.com>
Co-authored-by: jeremyhi <jiachun_feng@proton.me>
* feat/copy-to-parquet-parameter: Commit Message:
Enhance Parquet Writer with Column-wise Configuration
Summary:
• Introduced column_wise_config function to customize per-column properties in Parquet writer.
* feat/copy-to-parquet-parameter: Commit Message:
Enhance Parquet File Format Handling for Specific Data Types
Summary:
• Added ConcreteDataType import to support specific data type handling.
* feat/copy-to-parquet-parameter: Commit Message:
Refactor Parquet file format configuration
* feat/copy-to-parquet-parameter:
Enhance Parquet file format handling for timestamp columns
- Added logic to disable dictionary encoding and set DELTA_BINARY_PACKED encoding for timestamp columns in the Parquet file format configuration.
* feat/copy-to-parquet-parameter:
Disable dictionary encoding for timestamp columns in Parquet writer and update default max_active_window_runs in TwcsOptions
- Modified Parquet writer to disable dictionary encoding for timestamp columns to optimize for increasing timestamp data.
* feat/copy-to-parquet-parameter:
Update compaction settings in tests
- Modified `test_compaction_region` to include new compaction options: `compaction.type`,
`compaction.twcs.max_active_window_runs`, and `compaction.twcs.max_inactive_window_runs`.
- Updated `test_merge_mode_compaction` to use `compaction.twcs.max_active_window_runs` and
`compaction.twcs.max_inactive_window_runs` instead of `max_active_window_files` and
`max_inactive_window_files`.
* chore: update sqlness results
* refactor: use rwlock for modifiable data in session and querycontext
* chore: format toml
* refactor: use mutable_inner structure for mutable fields
* refactor: remove arc wrapper