greptimedb/docs/how-to/how-to-implement-sql-statement.md
2025-05-12 12:12:47 +00:00

This document describes how to implement SQL statements in GreptimeDB.

The execution entry point for SQL statements is located in the Frontend Instance, which implements SqlQueryHandler:

impl SqlQueryHandler for Instance {
    type Error = Error;

    async fn do_query(&self, query: &str, query_ctx: QueryContextRef) -> Vec<Result<Output>> {
        // ...
    }
}

Normally, when a SQL query arrives at GreptimeDB, the do_query method is called. After some parsing work, the resulting statement is passed to StatementExecutor:

// in Frontend Instance:
self.statement_executor.execute_sql(stmt, query_ctx).await

That's where we handle our SQL statements. You can add a new match arm for your statement there, and the statement is then implemented for both GreptimeDB Standalone and Cluster. See how DESCRIBE TABLE is implemented for an example.
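To make the "new match arm" idea concrete, here is a minimal, synchronous sketch. The `Statement` and `Output` enums, the `ShowGreeting` variant, and the helper methods are all hypothetical stand-ins invented for illustration; GreptimeDB's real types are richer and its `execute_sql` is async and takes a query context.

```rust
// Simplified, hypothetical stand-ins; not GreptimeDB's actual definitions.
#[derive(Debug, PartialEq)]
enum Statement {
    DescribeTable(String),
    // The new statement being added (hypothetical).
    ShowGreeting,
}

#[derive(Debug, PartialEq)]
enum Output {
    Rows(Vec<String>),
}

struct StatementExecutor;

impl StatementExecutor {
    // One match arm per statement kind; adding a statement means adding an arm here.
    fn execute_sql(&self, stmt: Statement) -> Result<Output, String> {
        match stmt {
            Statement::DescribeTable(table) => self.describe_table(&table),
            // The new arm: it is shared by every deployment mode.
            Statement::ShowGreeting => self.show_greeting(),
        }
    }

    fn describe_table(&self, table: &str) -> Result<Output, String> {
        if table.is_empty() {
            return Err("table name must not be empty".to_string());
        }
        // Placeholder schema: a real implementation would consult the catalog.
        Ok(Output::Rows(vec![format!("{}.ts TIMESTAMP", table)]))
    }

    fn show_greeting(&self) -> Result<Output, String> {
        Ok(Output::Rows(vec!["hello".to_string()]))
    }
}
```

Because the arm lives in the shared executor, nothing mode-specific has to change for the new statement to work.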

Now, what if a statement should be handled differently in GreptimeDB Standalone and Cluster? Notice that StatementExecutor has a SqlStatementExecutor field. GreptimeDB Standalone and Cluster each have their own implementation of SqlStatementExecutor. If you are going to implement a statement differently in the two modes (like CREATE TABLE), you have to implement it in each mode's SqlStatementExecutor.
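The mode-specific delegation can be sketched as a trait object held by the shared executor. Everything below is a simplified assumption made for illustration: the field name `sql_stmt_executor`, the `StandaloneExecutor` and `ClusterExecutor` structs, the `datanodes` count, and the `String` results are hypothetical, and the real trait is async.

```rust
// Hypothetical, simplified types; not GreptimeDB's actual definitions.
#[derive(Debug)]
enum Statement {
    CreateTable(String),
}

// Each deployment mode supplies its own implementation of this trait.
trait SqlStatementExecutor {
    fn execute_sql(&self, stmt: &Statement) -> Result<String, String>;
}

struct StandaloneExecutor;

struct ClusterExecutor {
    datanodes: usize,
}

impl SqlStatementExecutor for StandaloneExecutor {
    fn execute_sql(&self, stmt: &Statement) -> Result<String, String> {
        match stmt {
            Statement::CreateTable(name) => Ok(format!("created {} locally", name)),
        }
    }
}

impl SqlStatementExecutor for ClusterExecutor {
    fn execute_sql(&self, stmt: &Statement) -> Result<String, String> {
        match stmt {
            Statement::CreateTable(name) => Ok(format!(
                "created {} across {} datanodes",
                name, self.datanodes
            )),
        }
    }
}

// The shared executor only knows the trait, not the concrete mode.
struct StatementExecutor {
    sql_stmt_executor: Box<dyn SqlStatementExecutor>,
}

impl StatementExecutor {
    fn execute_sql(&self, stmt: Statement) -> Result<String, String> {
        match &stmt {
            // Mode-dependent statements are forwarded to the injected executor.
            Statement::CreateTable(_) => self.sql_stmt_executor.execute_sql(&stmt),
        }
    }
}
```

With this shape, the same `StatementExecutor::execute_sql` call produces mode-appropriate behavior depending on which implementation was injected at construction time.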

The flow is summarized in the diagram below:

                             SQL query                            
                                |                                
                                v                                
                  +---------------------------+                  
                  | SqlQueryHandler::do_query |                  
                  +---------------------------+                  
                                |                                
                                | SQL parsing                    
                                v                                
               +--------------------------------+                
               | StatementExecutor::execute_sql |                
               +--------------------------------+                
                                |                                
                                | SQL execution                    
                                v                                
               +----------------------------------+                
               | commonly handled statements like |
               | "plan_exec" for selection or     |
               +----------------------------------+                
                       |                |                        
        For Standalone |                | For Cluster          
                       v                v                        
+---------------------------+      +---------------------------+ 
| SqlStatementExecutor impl |      | SqlStatementExecutor impl | 
| in Datanode Instance      |      | in Frontend DistInstance  | 
+---------------------------+      +---------------------------+ 

Note that some SQL statements can be executed in our QueryEngine, in the form of a LogicalPlan. You can follow the invocation path from StatementExecutor::plan_exec down to the QueryEngine implementation. For now, there's only one DatafusionQueryEngine for both GreptimeDB Standalone and Cluster. A single query engine works for both modes because GreptimeDB reads and writes data through the Table trait, and each mode has its own Table implementation.
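The reason one engine can serve both modes is ordinary trait-based polymorphism. The sketch below is a toy model, not the real API: the `scan` method, the `LocalTable` and `DistTable` structs, and the `sum` query are hypothetical, and the actual Table trait and DatafusionQueryEngine are considerably more involved.

```rust
// Toy model: the engine depends only on the trait, never on a concrete table.
trait Table {
    fn scan(&self) -> Vec<i64>;
}

// Hypothetical standalone table: all rows live in one place.
struct LocalTable {
    rows: Vec<i64>,
}

// Hypothetical cluster table: rows are spread over several partitions.
struct DistTable {
    partitions: Vec<Vec<i64>>,
}

impl Table for LocalTable {
    fn scan(&self) -> Vec<i64> {
        self.rows.clone()
    }
}

impl Table for DistTable {
    fn scan(&self) -> Vec<i64> {
        // Gather rows from every partition into one result.
        self.partitions.iter().flatten().copied().collect()
    }
}

// A single engine implementation shared by both modes.
struct QueryEngine;

impl QueryEngine {
    fn sum(&self, table: &dyn Table) -> i64 {
        table.scan().iter().sum()
    }
}
```

The engine code is identical for both tables; only the `Table` implementation knows whether data is local or distributed.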

We have no preference for whether statements are handled in the query engine or in StatementExecutor. One kind of statement can even be implemented in both places. For example, Insert with selection is handled in the query engine, because we can easily do the query part there. However, Insert without selection is not, because the cost of parsing the statement into a LogicalPlan is not negligible. So generally, if the SQL query is simple enough, you can handle it in StatementExecutor; otherwise, if it is complex or contains a selection, it should be parsed into a LogicalPlan and handled in the query engine.
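The Insert rule of thumb can be written down as a small routing function. This is only an illustration of the guideline: the `InsertSource` and `Route` enums and the `route_insert` function are hypothetical names and do not exist in GreptimeDB.

```rust
// Hypothetical types that model only the routing decision described above.
#[derive(Debug, PartialEq)]
enum Route {
    StatementExecutor,
    QueryEngine,
}

enum InsertSource {
    // INSERT ... VALUES (...): literal rows, no query part.
    Values(Vec<Vec<i64>>),
    // INSERT ... SELECT ...: the rows come from a query.
    Select(String),
}

fn route_insert(source: &InsertSource) -> Route {
    match source {
        // Simple case: skip building a LogicalPlan and handle it directly.
        InsertSource::Values(_) => Route::StatementExecutor,
        // Has a selection: plan it and let the query engine run it.
        InsertSource::Select(_) => Route::QueryEngine,
    }
}
```

The same statement kind ends up in different places depending on whether a query part is present.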