* wip: naive impl
* feat/column-partition:
### Add support for DataFusion physical expressions
- **`Cargo.lock` & `Cargo.toml`**: Added `datafusion-physical-expr` as a dependency to support physical expression creation.
- **`expr.rs`**: Implemented conversion methods `try_as_logical_expr` and `try_as_physical_expr` for `Operand` and `PartitionExpr` to facilitate logical and physical expression handling.
- **`multi_dim.rs`**: Enhanced `MultiDimPartitionRule` to utilize physical expressions for partitioning logic, including new methods for evaluating record batches.
- **Tests**: Added unit tests for logical and physical expression conversions and partitioning logic in `expr.rs` and `multi_dim.rs`.
* feat/column-partition:
### Refactor and Enhance Partition Handling
- **Refactor Partition Parsing Logic**: Moved partition parsing logic from `src/operator/src/statement/ddl.rs` to a new utility module `src/partition/src/utils.rs`. This includes functions like `parse_partitions`, `find_partition_bounds`, and `convert_one_expr`.
- **Error Handling Improvements**: Added new error variants `ColumnNotFound`, `InvalidPartitionRule`, and `ParseSqlValue` in `src/partition/src/error.rs` to improve error reporting for partition-related operations.
- **Dependency Updates**: Updated `Cargo.lock` and `Cargo.toml` to include new dependencies `common-time` and `session`.
- **Code Cleanup**: Removed redundant partition parsing functions from `src/operator/src/error.rs` and `src/operator/src/statement/ddl.rs`.
* feat/column-partition:
## Refactor and Enhance SQL and Table Handling
- **Refactor Column Definitions and Error Handling**
- Made `FULLTEXT_GRPC_KEY`, `INVERTED_INDEX_GRPC_KEY`, and `SKIPPING_INDEX_GRPC_KEY` public in `column_def.rs`.
- Removed `IllegalPrimaryKeysDef` error from `error.rs` and moved it to `sql/src/error.rs`.
- Updated error handling in `fill_impure_default.rs` and `expr_helper.rs`.
- **Enhance SQL Utility Functions**
- Moved and refactored functions like `create_to_expr`, `find_primary_keys`, and `validate_create_expr` to `sql/src/util.rs`.
- Added new utility functions for SQL parsing and validation in `sql/src/util.rs`.
- **Improve Partition Handling**
- Added `parse_partition_columns_and_exprs` function in `partition/src/utils.rs`.
- Updated partition rule tests in `partition/src/multi_dim.rs` to use SQL-based partitioning.
- **Simplify Table Name Handling**
- Re-exported `table_idents_to_full_name` from `sql::util` in `session/src/table_name.rs`.
- **Test Enhancements**
- Updated tests in `partition/src/multi_dim.rs` to use SQL for partition rule creation.
* feat/column-partition:
**Add Benchmarking and Enhance Partitioning Logic**
- **Benchmarking**: Introduced a new benchmark for `split_record_batch` in `bench_split_record_batch.rs` using `criterion` and `rand` as development dependencies in `Cargo.toml`.
- **Partitioning Logic**: Enhanced `MultiDimPartitionRule` in `multi_dim.rs` to include a default region for unmatched partition expressions and optimized the `split_record_batch` method.
- **Refactoring**: Moved `sql_to_partition_rule` function to a public scope for reuse in `multi_dim.rs`.
- **Testing**: Added new test module `test_split_record_batch` to validate the partitioning logic.
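
The column-based split with a default region can be illustrated with a minimal, standard-library-only sketch. It is not the actual `MultiDimPartitionRule` code: `split_by_region`, the `RegionNumber` alias, and the use of integer ranges in place of partition expressions are assumptions made for illustration. Each rule is evaluated once over the whole column to produce a boolean mask, and rows matched by no rule fall into the default region.

```rust
use std::collections::HashMap;
use std::ops::Range;

/// Hypothetical region identifier used only in this sketch.
type RegionNumber = u32;

/// Builds one boolean selection mask per region for a single column.
/// `rules` stands in for partition expressions (here: half-open ranges).
/// Rows that no rule matches are assigned to `default_region`.
fn split_by_region(
    column: &[i64],
    rules: &[(RegionNumber, Range<i64>)],
    default_region: RegionNumber,
) -> HashMap<RegionNumber, Vec<bool>> {
    let mut masks = HashMap::new();
    // Tracks which rows have been claimed by at least one rule.
    let mut matched = vec![false; column.len()];

    for (region, range) in rules {
        // Evaluate the rule over the whole column at once.
        let mask: Vec<bool> = column.iter().map(|v| range.contains(v)).collect();
        for (seen, hit) in matched.iter_mut().zip(&mask) {
            *seen |= *hit;
        }
        masks.insert(*region, mask);
    }

    // Route leftover rows to the default region, if there are any.
    if matched.iter().any(|seen| !seen) {
        let default_mask = masks
            .entry(default_region)
            .or_insert_with(|| vec![false; column.len()]);
        for (slot, seen) in default_mask.iter_mut().zip(&matched) {
            *slot |= !*seen;
        }
    }
    masks
}
```

A caller would then apply each mask to the batch to obtain the rows for that region; a region with no mask in the map receives no rows.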
* Revert "feat/column-partition: ### Refactor and Enhance Partition Handling"
This reverts commit 183fa19f
* fix: revert refactoring parse_partition
* revert some refactor
* feat/column-partition:
### Enhance Partitioning and Error Handling
- **Benchmark Enhancements**: Added new benchmark `bench_split_record_batch_vs_row` in `bench_split_record_batch.rs` to compare row and column-based splitting.
- **Error Handling Improvements**: Introduced new error variants in `error.rs` for better error reporting related to record batch evaluation and arrow kernel computation.
- **Expression Handling**: Updated `expr.rs` to improve error context when converting schemas and creating physical expressions.
- **Partition Rule Enhancements**: Made `row_at` and `record_batch_to_cols` methods public in `multi_dim.rs` and improved error handling for physical expression evaluation and boolean operations.
* feat/column-partition:
### Add `eq` Method and Optimize Expression Caching
- **`expr.rs`**: Added a new `eq` method to the `Operand` struct for equality comparisons.
- **`multi_dim.rs`**: Introduced a caching mechanism for physical expressions using `RwLock` to improve performance in `MultiDimPartitionRule`.
- **`lib.rs`**: Enabled the `let_chains` feature for more concise code.
- **`multi_dim.rs` Tests**: Enhanced test coverage with new test cases for multi-dimensional partitioning, including random record batch generation and default region handling.
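
A rough sketch of the `RwLock`-based caching idea, using only the standard library. The names `ExprCache`, `CompiledExpr`, and `get_or_compile`, as well as the string cache key, are hypothetical and do not reflect the real fields of `MultiDimPartitionRule`. Lookups take the shared read lock; only a miss takes the write lock and builds the expression.

```rust
use std::collections::HashMap;
use std::sync::atomic::{AtomicUsize, Ordering};
use std::sync::{Arc, RwLock};

/// Hypothetical stand-in for a compiled physical expression.
#[derive(Debug)]
struct CompiledExpr {
    source: String,
}

/// Hypothetical cache keyed by a string (for example, a schema fingerprint).
#[derive(Default)]
struct ExprCache {
    compiled: RwLock<HashMap<String, Arc<CompiledExpr>>>,
    /// Counts how often the slow path actually built an expression.
    compile_calls: AtomicUsize,
}

impl ExprCache {
    fn get_or_compile(&self, key: &str) -> Arc<CompiledExpr> {
        // Fast path: many readers can hold the read lock concurrently.
        {
            let guard = self.compiled.read().unwrap();
            if let Some(hit) = guard.get(key) {
                return Arc::clone(hit);
            }
        } // Read guard is dropped here, before the write lock is requested.

        // Slow path: `entry` re-checks under the write lock, so a racing
        // thread that filled the slot first is not overwritten.
        let mut guard = self.compiled.write().unwrap();
        let entry = guard.entry(key.to_string()).or_insert_with(|| {
            self.compile_calls.fetch_add(1, Ordering::Relaxed);
            Arc::new(CompiledExpr {
                source: key.to_string(),
            })
        });
        entry.clone()
    }
}
```

Returning `Arc` handles keeps the lock hold time short: callers evaluate the expression after the guard has been released.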
* feat/column-partition:
### Add `split_record_batch` Method to `PartitionRule` Trait
- **Files Modified**:
- `src/partition/src/multi_dim.rs`
- `src/partition/src/partition.rs`
- `src/partition/src/splitter.rs`
Added a new method `split_record_batch` to the `PartitionRule` trait, allowing record batches to be split into multiple regions based on partition values. Implemented this method in `MultiDimPartitionRule` and provided unimplemented stubs in test modules.
### Dependency Update
- **File Modified**:
- `src/operator/src/expr_helper.rs`
Removed unused imports `ColumnDataType` and `Timezone` from the test module.
### Miscellaneous
- **File Modified**:
- `src/partition/Cargo.toml`
No functional changes; only minor formatting adjustments.
* chore: add license header
* chore: remove useless files
* feat/column-partition:
Add support for handling unsupported partition expression values
- **`error.rs`**: Introduced a new error variant `UnsupportedPartitionExprValue` to handle unsupported partition expression values, and updated `ErrorExt` to map this error to `StatusCode::InvalidArguments`.
- **`expr.rs`**: Modified the `Operand` implementation to return the new error when encountering unsupported partition expression values.
- **`multi_dim.rs`**: Added a fast path to optimize the selection process when all rows are selected.
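
The fast path can be pictured with the following sketch, which assumes a plain slice in place of a record batch and a hypothetical helper named `filter_rows`. When the mask selects every row, the input is returned as-is without copying; otherwise the selected rows are collected into a new buffer.

```rust
use std::borrow::Cow;

/// Applies a boolean selection mask to `rows`.
/// Hypothetical helper: a slice stands in for a record batch here.
fn filter_rows<'a>(rows: &'a [i64], mask: &[bool]) -> Cow<'a, [i64]> {
    assert_eq!(rows.len(), mask.len(), "mask must cover every row");

    // Fast path: every row is selected, so hand back the input untouched.
    if mask.iter().all(|keep| *keep) {
        return Cow::Borrowed(rows);
    }

    // General path: copy only the selected rows.
    Cow::Owned(
        rows.iter()
            .zip(mask)
            .filter(|(_, keep)| **keep)
            .map(|(value, _)| *value)
            .collect(),
    )
}
```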
* feat/column-partition: Add validation for expression and region length in MultiDimPartitionRule constructor
• Ensure the lengths of exprs and regions match to prevent mismatches.
• Introduce error handling for length discrepancies with a descriptive error message.
* chore: add debug log
* feat/column-partition: Removed the validation check for matching lengths between exprs and regions in MultiDimPartitionRule constructor, simplifying the initialization process.
* fix: unit tests
* change dep
Signed-off-by: Ruihang Xia <waynestxia@gmail.com>
* feat: adapt to arrow's interval array
* chore: fix compile errors in datatypes crate
* chore: fix api crate compiler errors
* chore: fix compiler errors in common-grpc
* chore: fix common-datasource errors
* chore: fix deprecated code in common-datasource
* fix promql and physical plan related
Signed-off-by: Ruihang Xia <waynestxia@gmail.com>
* wip: upgrading network deps
Signed-off-by: Ruihang Xia <waynestxia@gmail.com>
* block on updating `sqlparser`
* upgrade sqlparser
Signed-off-by: Ruihang Xia <waynestxia@gmail.com>
* adapt new df's trait requirements
Signed-off-by: Ruihang Xia <waynestxia@gmail.com>
* chore: fix compiler errors in mito2
* chore: fix common-function crate errors
* chore: fix catalog errors
* change import path
Signed-off-by: Ruihang Xia <waynestxia@gmail.com>
* chore: fix some errors in query crate
* chore: fix some errors in query crate
* aggr expr and some other tiny fixes
Signed-off-by: Ruihang Xia <waynestxia@gmail.com>
* chore: fix expr related errors in query crate
* chore: fix query serializer and admin command
* chore: fix grpc services
* feat: axum serve
* chore: fix http server
* remove handle_error handler
* refactor timeout layer
* serve axum
* chore: fix flow aggr functions
* chore: fix flow
* feat: fix errors in meta-srv
* boxed()
* use TokioIo
* feat!: Remove script crate and python feature (#5321)
* feat: exclude script crate
* chore: simplify feature
* feat: remove the script crate
* chore: remove python feature and some comments
* chore: fix warning
* chore: fix servers tests compiler errors
* feat: fix tests-integration errors
* chore: fix unused
* test: fix catalog test
* chore: fix compiler errors for crates using common-meta
testing feature is enabled when checking with --workspace
* test: use display for logical plan test
* test: implement rewrite for ScanHintRule
* fix: http server build panic
* test: fix mito test
* fix: sql parser type alias error
* test: fix TestClient not listening
* test: some flow tests
* test(flow): more fix
* fix: test_otlp_logs
* test: fix promql test that uses deprecated method fun()
* fix: sql type replace supports Int8 ~ Int64, UInt8 ~ UInt64
* test: fix infer schema test case
* test: fix tests related to plan display
* chore: fix last flow test
* test: fix function format related assertion
* test: use larger port range for tests
* fix: test_otlp_traces
* fix: test_otlp_metrics
* fix range query and dist plan
Signed-off-by: Ruihang Xia <waynestxia@gmail.com>
* fix: flow's distinct handling uses deprecated field
* fix: can't pass Join plan expressions to LogicalPlan::with_new_exprs
* test: fix deserialize test
* test: reduce split key case num
* tests: lower case aggr func name
* test: fix some sqlness tests
* tests: more sqlness fix
* tests: fixed sqlness test
* commit non-bug changes
Signed-off-by: Ruihang Xia <waynestxia@gmail.com>
* fix: make our udf correct
* fix: implement empty methods of ContextProvider for DfContextProviderAdapter
* test: update sqlness test result
* chore: remove unused
* fix: provide alias name for AggregateExprBuilder in range plan
* test: update range query result
* fix: implement missing ContextProvider methods for DfContextProviderAdapter
* test: update timestamps, cte result
* fix: supports empty projection in mito
* test: update comment for cte test
* fix: support projection for numbers
* test: update test cases after projection fix
* fix: fix range select first_value/last_value
* fix: handle CAST and time index conflict
* fix: handle order by correctly in range first_value/last_value
* test: update sqlness result
* test: update view test result
* test: update decimal test
wait for https://github.com/apache/datafusion/pull/14126 to fix this
* feat: remove redundant physical optimization
todo(ruihang): Check if we can remove this.
* test: update sqlness test result
* chore: range select default sort use nulls_first = false
* test: update filter push down test result
* test: comment out decimal test to avoid different panic message
* test: update some distributed test result
* test: update test for distributed count and filter push down
* test: update subqueries test
* fix: SessionState may overwrite our UDFs
* chore: fix compiler errors after merging main
* fix: fix elasticsearch and dashboard router panic
* chore: fix common-functions tests
* chore: update sqlness result
* test: fix id keyword and update sqlness result
* test: fix flow_null test
* fix: enlarge thread stack size in debug mode to avoid overflow
* chore: fix warnings in common-function
* chore: fix warning in flow
* chore: fix warnings in query crate
* chore: remove unused warnings
* chore: fix deprecated warnings for parquet
* chore: fix deprecated warning in servers crate
* style: fix clippy
* test: enlarge mito cache ttl test ttl time
* chore: fix typo
* style: fmt toml
* refactor: reimplement PartialOrd for RangeSelect
* chore: remove script crate files introduced by merge
* fix: return error if sql option is not kv
* chore: do not use ..default::default()
* chore: per review
* chore: update error message in BuildAdminFunctionArgsSnafu
Co-authored-by: jeremyhi <jiachun_feng@proton.me>
* refactor: typed precision
* update sqlness view case
Signed-off-by: Ruihang Xia <waynestxia@gmail.com>
* chore: flow per review
* chore: add example in comment
* chore: warn if parquet stats of timestamp is not INT64
* style: add a newline before derive to make the comment more clear
* test: update sqlness result
* fix: flow from substrait
* chore: change update_range_context log to debug level
* chore: move axum-extra axum-macros to workspace
---------
Signed-off-by: Ruihang Xia <waynestxia@gmail.com>
Co-authored-by: Ruihang Xia <waynestxia@gmail.com>
Co-authored-by: luofucong <luofc@foxmail.com>
Co-authored-by: discord9 <discord9@163.com>
Co-authored-by: shuiyisong <xixing.sys@gmail.com>
Co-authored-by: jeremyhi <jiachun_feng@proton.me>
* test: optimize out partition split insert requests if there is only one region
* Now that the optimization for single region insert has been lifted up, the original "fast path" can be obsoleted.
* resolve PR comments
* fix: display the PartitionBound and PartitionDef correctly
* Update src/partition/src/partition.rs
Co-authored-by: dennis zhuang <killme2008@gmail.com>
* fix: fix unit test of partition definition
---------
Co-authored-by: dennis zhuang <killme2008@gmail.com>
* fix: logical region can't find region routes
* feat: fetch partitions info in batch
* refactor: rename batch functions
* refactor: rename DdlTaskExecutor to ProcedureExecutor
* feat: impl migrate_region and query_procedure_state for ProcedureExecutor
* feat: adds SQL function procedure_state and finish migrate_region impl
* fix: constant vector
* feat: unit tests for migrate_region and procedure_state
* test: test region migration by SQL
* fix: compile error after rebasing
* fix: clippy warnings
* feat: ensure procedure_state and migrate_region can be only called under greptime catalog
* fix: license header
* feat(tests-fuzz): add CreateTableExprGenerator
* refactor: move Column to root of ir mod
* feat: add AlterTableExprGenerator
* feat: add Serialize and Deserialize derive
* chore: refactor the AlterExprGenerator