mirror of
https://github.com/GreptimeTeam/greptimedb.git
synced 2026-01-07 13:52:59 +00:00
* feat/bridge-bulk-insert: Implement Bulk Insert and Update Dependencies
  - **Bulk Insert Implementation**: Added `handle_bulk_inserts` method in `src/operator/src/bulk_insert.rs` to manage bulk insert requests using `FlightDecoder` and `FlightData`.
  - **Dependency Updates**: Updated `Cargo.lock` and `Cargo.toml` to use the latest revision of `greptime-proto` and added new dependencies like `arrow`, `arrow-ipc`, `bytes`, and `prost`.
  - **gRPC Enhancements**: Modified `put_record_batch` method in `src/frontend/src/instance/grpc.rs` and `src/servers/src/grpc/flight.rs` to handle `FlightData` instead of `RawRecordBatch`.
  - **Error Handling**: Added new error types in `src/operator/src/error.rs` for handling Arrow operations and decoding flight data.
  - **Miscellaneous**: Updated `src/operator/src/insert.rs` to expose `partition_manager` and `node_manager` as public fields.
* feat/bridge-bulk-insert:
  - **Update `greptime-proto` Dependency**: Updated the `greptime-proto` dependency to a new revision in `Cargo.lock` and `Cargo.toml`.
  - **Refactor gRPC Query Handling**: Removed `RawRecordBatch` usage from `grpc.rs`, `flight.rs`, `greptime_handler.rs`, and test files, simplifying gRPC query handling.
  - **Enhance Bulk Insert Logic**: Improved bulk insert logic in `bulk_insert.rs` and `region_request.rs` by using `FlightDecoder` and `BooleanArray` for better performance and clarity.
  - **Add `common-grpc` Dependency**: Added `common-grpc` as a workspace dependency in `store-api/Cargo.toml` to support gRPC functionality.
* fix: clippy
* fix: schema serialization
* feat/bridge-bulk-insert: Add error handling for encoding/decoding in `metadata.rs` and `region_request.rs`
  - Introduced new error variants `FlightCodec` and `Prost` in `MetadataError` to handle encoding/decoding failures in `metadata.rs`.
  - Updated `make_region_bulk_inserts` function in `region_request.rs` to use `context` for error handling with `ProstSnafu` and `FlightCodecSnafu`.
  - Enhanced error handling for `FlightData` decoding and `filter_record_batch` operations.
* fix: test
* refactor: rename
* allow empty app_metadata in FlightData
* feat/bridge-bulk-insert:
  - **Remove Logging**: Removed unnecessary logging of affected rows in `region_server.rs`.
  - **Error Handling Enhancement**: Improved error handling in `bulk_insert.rs` by adding context to `split_record_batch` and handling the single-datanode fast path.
  - **Error Enum Cleanup**: Removed unused `Arrow` error variant from `error.rs`.
* fix: standalone test
* feat/bridge-bulk-insert: Enhance Bulk Insert Handling and Metadata Management
  - **`lib.rs`**: Enabled the `result_flattening` feature for improved error handling.
  - **`request.rs`**: Made `name_to_index` and `has_null` fields public in `WriteRequest` for better accessibility.
  - **`handle_bulk_insert.rs`**:
    - Added `handle_record_batch` function to streamline processing of bulk insert payloads.
    - Improved error handling and task management for bulk insert operations.
    - Updated `region_metadata_to_column_schema` to return both column schemas and a name-to-index map for efficient data access.
* feat/bridge-bulk-insert:
  - **Refactor `handle_bulk_insert.rs`**:
    - Replaced `handle_record_batch` with `handle_payload` for handling payloads.
    - Modified the fast path to use `common_runtime::spawn_global` for asynchronous task execution.
  - **Optimize `multi_dim.rs`**:
    - Added a fast path for single-region scenarios in `MultiDimPartitionRule::partition_record_batch`.
* feat/bridge-bulk-insert:
  - **Update `greptime-proto` Dependency**: Updated the `greptime-proto` dependency to a new revision in both `Cargo.lock` and `Cargo.toml`.
  - **Optimize Memory Allocation**: Increased initial and builder capacities in `time_series.rs` to improve performance.
  - **Enhance Data Handling**: Modified `bulk_insert.rs` to use `Bytes` for efficient data handling.
  - **Improve Bulk Insert Logic**: Refined the bulk insert logic in `region_request.rs` to handle schema and payload data more effectively and optimize record batch filtering.
  - **String Handling Improvement**: Updated string conversion in `helper.rs` for better performance.
* fix: clippy warnings
* feat/bridge-bulk-insert: Add Metrics and Improve Error Handling
  - **Metrics Enhancements**: Introduced new metrics for bulk insert operations in `metrics.rs`, `bulk_insert.rs`, `greptime_handler.rs`, and `region_request.rs`. Added `HANDLE_BULK_INSERT_ELAPSED`, `BULK_REQUEST_MESSAGE_SIZE`, and `GRPC_BULK_INSERT_ELAPSED` histograms to monitor performance.
  - **Error Handling Improvements**: Removed unnecessary error handling in `handle_bulk_insert.rs` by eliminating redundant `let _ =` patterns.
  - **Dependency Updates**: Added `lazy_static` and `prometheus` to `Cargo.lock` and `Cargo.toml` for metrics support.
  - **Code Refactoring**: Simplified function calls in `region_server.rs` and `handle_bulk_insert.rs` for better readability.
* chore: rebase main
* chore: merge main
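The single-region fast path mentioned above (in `MultiDimPartitionRule::partition_record_batch`) can be illustrated with a std-only sketch. The real code partitions Arrow record batches by filtering with `BooleanArray` masks; here rows are plain integers, and `partition_rows` and its modulo-based rule are hypothetical stand-ins, not GreptimeDB APIs:

```rust
/// Std-only sketch of partitioning rows across regions, with the
/// single-region fast path from the commit log above. Assumptions:
/// rows are plain i64 values and the partition rule is `row % num_regions`.
fn partition_rows(rows: &[i64], num_regions: usize) -> Vec<Vec<i64>> {
    // Fast path: with one region every row belongs to region 0, so no
    // per-row membership mask needs to be built or applied at all.
    if num_regions == 1 {
        return vec![rows.to_vec()];
    }
    // Slow path: assign each row to a region (the analogue of building a
    // BooleanArray mask per region and filtering the record batch with it).
    let mut partitions = vec![Vec::new(); num_regions];
    for &row in rows {
        let region = (row as usize) % num_regions; // hypothetical rule
        partitions[region].push(row);
    }
    partitions
}

fn main() {
    let rows = vec![1, 2, 3, 4, 5];
    // Single region: the input is returned without per-row work.
    assert_eq!(partition_rows(&rows, 1), vec![rows.clone()]);
    // Two regions: rows split by the modulo rule.
    assert_eq!(partition_rows(&rows, 2), vec![vec![2, 4], vec![1, 3, 5]]);
    println!("partitioning sketch ok");
}
```

The design point is that the fast path skips mask construction entirely, which matters when most tables map to a single region.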
201 lines
6.5 KiB
Rust
// Copyright 2023 Greptime Team
//
// Licensed under the Apache License, Version 2.0 (the "License");
// you may not use this file except in compliance with the License.
// You may obtain a copy of the License at
//
//     http://www.apache.org/licenses/LICENSE-2.0
//
// Unless required by applicable law or agreed to in writing, software
// distributed under the License is distributed on an "AS IS" BASIS,
// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
// See the License for the specific language governing permissions and
// limitations under the License.

use std::sync::Arc;

use api::v1::greptime_request::Request;
use api::v1::query_request::Query;
use arrow_flight::FlightData;
use async_trait::async_trait;
use catalog::memory::MemoryCatalogManager;
use common_base::AffectedRows;
use common_catalog::consts::{DEFAULT_CATALOG_NAME, DEFAULT_SCHEMA_NAME};
use common_grpc::flight::FlightDecoder;
use common_query::Output;
use datafusion_expr::LogicalPlan;
use query::options::QueryOptions;
use query::parser::{PromQuery, QueryLanguageParser, QueryStatement};
use query::query_engine::DescribeResult;
use query::{QueryEngineFactory, QueryEngineRef};
use servers::error::{Error, NotSupportedSnafu, Result};
use servers::query_handler::grpc::{GrpcQueryHandler, ServerGrpcQueryHandlerRef};
use servers::query_handler::sql::{ServerSqlQueryHandlerRef, SqlQueryHandler};
use session::context::QueryContextRef;
use snafu::ensure;
use sql::statements::statement::Statement;
use table::metadata::TableId;
use table::table_name::TableName;
use table::TableRef;

mod grpc;
mod http;
mod interceptor;
mod mysql;
mod postgres;

const LOCALHOST_WITH_0: &str = "127.0.0.1:0";

pub struct DummyInstance {
    query_engine: QueryEngineRef,
}

impl DummyInstance {
    fn new(query_engine: QueryEngineRef) -> Self {
        Self { query_engine }
    }
}

#[async_trait]
impl SqlQueryHandler for DummyInstance {
    type Error = Error;

    async fn do_query(&self, query: &str, query_ctx: QueryContextRef) -> Vec<Result<Output>> {
        let stmt = QueryLanguageParser::parse_sql(query, &query_ctx).unwrap();
        let plan = self
            .query_engine
            .planner()
            .plan(&stmt, query_ctx.clone())
            .await
            .unwrap();
        let output = self.query_engine.execute(plan, query_ctx).await.unwrap();
        vec![Ok(output)]
    }

    async fn do_exec_plan(&self, plan: LogicalPlan, query_ctx: QueryContextRef) -> Result<Output> {
        Ok(self.query_engine.execute(plan, query_ctx).await.unwrap())
    }

    async fn do_promql_query(
        &self,
        _: &PromQuery,
        _: QueryContextRef,
    ) -> Vec<std::result::Result<Output, Self::Error>> {
        unimplemented!()
    }

    async fn do_describe(
        &self,
        stmt: Statement,
        query_ctx: QueryContextRef,
    ) -> Result<Option<DescribeResult>> {
        if let Statement::Query(_) = stmt {
            let plan = self
                .query_engine
                .planner()
                .plan(&QueryStatement::Sql(stmt), query_ctx.clone())
                .await
                .unwrap();
            let schema = self.query_engine.describe(plan, query_ctx).await.unwrap();
            Ok(Some(schema))
        } else {
            Ok(None)
        }
    }

    async fn is_valid_schema(&self, catalog: &str, schema: &str) -> Result<bool> {
        Ok(catalog == DEFAULT_CATALOG_NAME && schema == DEFAULT_SCHEMA_NAME)
    }
}

#[async_trait]
impl GrpcQueryHandler for DummyInstance {
    type Error = Error;

    async fn do_query(
        &self,
        request: Request,
        ctx: QueryContextRef,
    ) -> std::result::Result<Output, Self::Error> {
        let output = match request {
            Request::Inserts(_)
            | Request::Deletes(_)
            | Request::RowInserts(_)
            | Request::RowDeletes(_) => unimplemented!(),
            Request::Query(query_request) => {
                let query = query_request.query.unwrap();
                match query {
                    Query::Sql(sql) => {
                        let mut result = SqlQueryHandler::do_query(self, &sql, ctx).await;
                        ensure!(
                            result.len() == 1,
                            NotSupportedSnafu {
                                feat: "execute multiple statements in SQL query string through GRPC interface"
                            }
                        );
                        result.remove(0)?
                    }
                    Query::LogicalPlan(_) | Query::InsertIntoPlan(_) => unimplemented!(),
                    Query::PromRangeQuery(promql) => {
                        let prom_query = PromQuery {
                            query: promql.query,
                            start: promql.start,
                            end: promql.end,
                            step: promql.step,
                            lookback: promql.lookback,
                        };
                        let mut result =
                            SqlQueryHandler::do_promql_query(self, &prom_query, ctx).await;
                        ensure!(
                            result.len() == 1,
                            NotSupportedSnafu {
                                feat: "execute multiple statements in PromQL query string through GRPC interface"
                            }
                        );
                        result.remove(0)?
                    }
                }
            }
            Request::Ddl(_) => unimplemented!(),
        };
        Ok(output)
    }

    async fn put_record_batch(
        &self,
        table: &TableName,
        table_id: &mut Option<TableId>,
        decoder: &mut FlightDecoder,
        data: FlightData,
    ) -> std::result::Result<AffectedRows, Self::Error> {
        let _ = table;
        let _ = data;
        let _ = table_id;
        let _ = decoder;
        unimplemented!()
    }
}

fn create_testing_instance(table: TableRef) -> DummyInstance {
    let catalog_manager = MemoryCatalogManager::new_with_table(table);
    let query_engine = QueryEngineFactory::new(
        catalog_manager,
        None,
        None,
        None,
        None,
        false,
        QueryOptions::default(),
    )
    .query_engine();
    DummyInstance::new(query_engine)
}

fn create_testing_sql_query_handler(table: TableRef) -> ServerSqlQueryHandlerRef {
    Arc::new(create_testing_instance(table)) as _
}

fn create_testing_grpc_query_handler(table: TableRef) -> ServerGrpcQueryHandlerRef {
    Arc::new(create_testing_instance(table)) as _
}