* fix/frontend-node-state: Refactor NodeInfoKey and Context Handling in Meta Server
• Removed unused cluster_id from NodeInfoKey struct.
• Updated HeartbeatHandlerGroup to return Context alongside HeartbeatResponse.
• Added current_node_info to Context for tracking node information.
• Implemented on_node_disconnect in Context to handle node disconnection events, specifically for Frontend roles.
• Adjusted register_pusher function to return PusherId directly.
• Updated tests to accommodate changes in Context structure.
* fix/frontend-node-state: Refactor Heartbeat Handler Context Management
Refactored the HeartbeatHandlerGroup::handle method to use a mutable reference for Context instead of passing it by value. This change simplifies the
context management by eliminating the need to return the context with the response. Updated the Metasrv implementation to align with this new context
handling approach, improving code clarity and reducing unnecessary context cloning.
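The borrow-based shape described above can be sketched as follows. This is a minimal, synchronous illustration only: the field type of `current_node_info`, the `RecordNodeInfo` handler, and the `handled_by` field are hypothetical, and the real handlers are likely async and carry more state.

```rust
// Simplified stand-ins for the meta server types (assumed shapes).
#[derive(Debug, Default)]
struct Context {
    // Hypothetical field type; the commit only names `current_node_info`.
    current_node_info: Option<String>,
}

#[derive(Debug, Default)]
struct HeartbeatResponse {
    // Hypothetical field used here to observe which handlers ran.
    handled_by: Vec<&'static str>,
}

trait HeartbeatHandler {
    fn name(&self) -> &'static str;
    // Each handler borrows the shared context mutably instead of owning it.
    fn handle(&self, ctx: &mut Context, resp: &mut HeartbeatResponse);
}

// Hypothetical handler that records which node sent the heartbeat.
struct RecordNodeInfo;

impl HeartbeatHandler for RecordNodeInfo {
    fn name(&self) -> &'static str {
        "record_node_info"
    }

    fn handle(&self, ctx: &mut Context, resp: &mut HeartbeatResponse) {
        ctx.current_node_info = Some("frontend-1".to_string());
        resp.handled_by.push(self.name());
    }
}

struct HeartbeatHandlerGroup {
    handlers: Vec<Box<dyn HeartbeatHandler>>,
}

impl HeartbeatHandlerGroup {
    // Only the response is returned; context changes stay visible to the caller
    // through the mutable borrow, so no clone or hand-back is needed.
    fn handle(&self, ctx: &mut Context) -> HeartbeatResponse {
        let mut resp = HeartbeatResponse::default();
        for handler in &self.handlers {
            handler.handle(ctx, &mut resp);
        }
        resp
    }
}
```

With this shape the caller keeps ownership of the `Context` across heartbeats and can inspect `current_node_info` after `handle` returns.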
* revert: clean cluster info on disconnect
* fix/frontend-node-state: Add Frontend Expiry Listener and Update NodeInfoKey Conversion
• Introduced FrontendExpiryListener to manage the expiration of frontend nodes, including its integration with leadership change notifications.
• Modified NodeInfoKey conversion to use references, enhancing efficiency and consistency across the codebase.
• Updated collect_cluster_info_handler and metasrv to incorporate the new listener and conversion changes.
• Added frontend_expiry module to the project structure for better organization and maintainability.
* chore: add config for node expiry
* add some doc
* fix: clippy
* fix/frontend-node-state:
### Refactor Node Expiry Handling
- **Configuration Update**: Removed `node_expiry_tick` from `metasrv.example.toml` and `MetasrvOptions` in `metasrv.rs`.
- **Module Renaming**: Renamed `frontend_expiry.rs` to `node_expiry_listener.rs` and updated references in `lib.rs`.
- **Code Refactoring**: Replaced `FrontendExpiryListener` with `NodeExpiryListener` in `node_expiry_listener.rs` and `metasrv.rs`, removing the tick interval and adjusting logic to use a fixed 60-second interval for node expiry checks.
* fix/frontend-node-state:
Improve logging in `node_expiry_listener.rs`
- Enhanced warning message to include peer information when an unrecognized node info key is encountered in `node_expiry_listener.rs`.
* docs: update config docs
* fix/frontend-node-state:
**Refactor Context Handling in Heartbeat Services**
- Updated `HeartbeatHandlerGroup` in `handler.rs` to pass `Context` by value instead of by mutable reference, allowing for more flexible context
management.
- Modified `Metasrv` implementation in `heartbeat.rs` to clone `Context` when passing to `handle` method, ensuring thread safety and consistency in
asynchronous operations.
* fix/reject-ddl-in-follower-metasrv:
Add leader check and logging for gRPC requests in `procedure.rs`
- Implemented leader verification for `query_procedure_state`, `ddl`, and `procedure_details` gRPC requests in `procedure.rs`.
- Added logging with `warn` for requests reaching a non-leader node.
- Introduced `ResponseHeader` and `Error::is_not_leader()` to handle non-leader responses.
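A rough sketch of the leader check described above, assuming simplified types. `ResponseHeader` and `Error::is_not_leader()` are named in the commit message, but their fields and the `DdlResponse`/`Metasrv` shapes below are hypothetical, and `eprintln!` stands in for the real `warn` logging.

```rust
#[derive(Debug, Clone, PartialEq)]
enum Error {
    NotLeader,
}

impl Error {
    // Lets callers branch on the non-leader case without matching variants.
    fn is_not_leader(&self) -> bool {
        matches!(self, Error::NotLeader)
    }
}

#[derive(Debug, Default, PartialEq)]
struct ResponseHeader {
    // Assumed layout: `None` means the request succeeded.
    error: Option<Error>,
}

impl ResponseHeader {
    fn success() -> Self {
        Self { error: None }
    }

    fn failed(error: Error) -> Self {
        Self { error: Some(error) }
    }
}

#[derive(Debug, PartialEq)]
struct DdlResponse {
    header: ResponseHeader,
    procedure_id: Option<u64>,
}

struct Metasrv {
    is_leader: bool,
}

impl Metasrv {
    fn ddl(&self, _statement: &str) -> DdlResponse {
        if !self.is_leader {
            // Stand-in for a `warn!` log line.
            eprintln!("warn: ddl request reached a non-leader node");
            return DdlResponse {
                header: ResponseHeader::failed(Error::NotLeader),
                procedure_id: None,
            };
        }
        // Placeholder for submitting the DDL procedure on the leader.
        DdlResponse {
            header: ResponseHeader::success(),
            procedure_id: Some(1),
        }
    }
}
```

A client can then check `header.error` with `is_not_leader()` and retry against the current leader instead of treating the failure as fatal.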
* fix/reject-ddl-in-follower-metasrv:
Improve leader address handling in `heartbeat.rs`
- Refactor leader address retrieval by renaming `leader` to `leader_addr` for clarity.
- Update `make_client` function to use a reference to `leader_addr`.
- Enhance logging to include the leader address in the success message for creating a heartbeat stream.
* fmt
* fix/reject-ddl-in-follower-metasrv:
**Enhance Leader Check in `procedure.rs`**
- Updated the leader verification logic in `procedure.rs` to return a failed `MigrateRegionResponse` when the server is not the leader.
- Added logging to warn when a migrate request is received by a non-leader server.
* perf: do not delete columns when drop logical region in drop database
Signed-off-by: yihong0618 <zouzou0208@gmail.com>
* fix: make ci happy
Signed-off-by: yihong0618 <zouzou0208@gmail.com>
* fix: address review comments
Signed-off-by: yihong0618 <zouzou0208@gmail.com>
* fix: address some comments
Signed-off-by: yihong0618 <zouzou0208@gmail.com>
* fix: drop stupid comments by copilot
Signed-off-by: yihong0618 <zouzou0208@gmail.com>
* chore: minor refactor
* chore: minor refactor
* chore: update greptime-proto
---------
Signed-off-by: yihong0618 <zouzou0208@gmail.com>
Co-authored-by: WenyXu <wenymedia@gmail.com>
* fix: use alias expr to check commutativity
* chore: debug sort
* feat: consider alias in window sort optimizer
* test: sqlness test
* test: update sqlness result
* TODO: snapshot read
* feat: RegionEngine get last seq
* feat: query context snapshot
* chore: use new proto
* feat: get_region_seqs in region engine
* chore: typo
* chore: toml
* feat: make snapshots modifiable
* feat: add hint for snapshot read
* chore: some typo
* refactor: remove hint as not used
* fix: use committed seqs
* refactor: remove sequences variant on RegionRequest
* refactor: per review
* chore: rebase solve conflict
* refactor: rm unused key
* chore: per review
* chore: per review
* feat(metric-engine): introduce batch alter request handling
* refactor: minor refactor
* refactor: push down filter to mito
* chore: apply suggestions from CR
* feat: handle filter for window sort
* test: sqlness filter test for window sort
* test: add test on tag column filter
* test: test for filter on ts
* test: update sqlness test
* feat: change cache policy for file cache
* feat: file cache run pending task after put
* feat: run pending task in put_dir
* feat: run pending task after stager recovered
* feat: purge recycle bin periodically
* feat: use lru policy for read cache
* feat: remove datetime type
* chore: fix unit test
* chore: add column test
* refactor: move create and alter validation to one place
* chore: minor refactor ut
* refactor: rename expr_factory to expr_helper
* chore: remove unnecessary args
fix: use fixed tonistiigi/binfmt:qemu-v7.0.0-28 image version instead of latest version to avoid segmentation fault
Co-authored-by: Yingwen <realevenyag@gmail.com>
ci: skip nightly ci jobs (#9)
(cherry picked from commit 345b4c30474f47a0477263bfba9894d7b4acda2d)
(cherry picked from commit dcd779cd668802fb1ea12fefb4dc3f83f34e30a2)
* refactor: rename grpc options
* refactor: make the arg clear
* chore: comments on server_addr
* chore: fix test
* chore: remove the store_addr alias
* refactor: cli option rpc_server_addr
* chore: keep store-addr alias
* chore: by comment
* fix: do not transform exprs in the limit plan
* chore: keep some logs for debug
* feat: workaround for limit in other rules
* test: add sqlness tests for offset 0
* chore: add fixme
* - **Refactored SST File Handling**:
- Introduced `FilePathProvider` trait and its implementations (`WriteCachePathProvider`, `RegionFilePathFactory`) to manage SST and index file paths.
- Updated `AccessLayer`, `WriteCache`, and `ParquetWriter` to use `FilePathProvider` for path management.
- Modified `SstWriteRequest` and `SstUploadRequest` to use path providers instead of direct paths.
- Files affected: `access_layer.rs`, `write_cache.rs`, `parquet.rs`, `writer.rs`.
- **Enhanced Indexer Management**:
- Replaced `IndexerBuilder` with `IndexerBuilderImpl` and made it async to support dynamic indexer creation.
- Updated `ParquetWriter` to handle multiple indexers and file IDs.
- Files affected: `index.rs`, `parquet.rs`, `writer.rs`.
- **Removed Redundant File ID Handling**:
- Removed `file_id` from `SstWriteRequest` and `CompactionOutput`.
- Updated related logic to dynamically generate file IDs where necessary.
- Files affected: `compaction.rs`, `flush.rs`, `picker.rs`, `twcs.rs`, `window.rs`.
- **Test Adjustments**:
- Updated tests to align with new path and indexer management.
- Introduced `FixedPathProvider` and `NoopIndexBuilder` for testing purposes.
- Files affected: `sst_util.rs`, `version_util.rs`, `parquet.rs`.
* chore: merge main
* refactor/generate-file-id-in-parquet-writer:
**Enhance Logging in Compactor**
- Updated `compactor.rs` to improve logging of compaction process.
- Added `itertools::Itertools` for efficient string joining.
- Moved logging of compaction inputs and outputs to the async block for better context.
- Enhanced log message to include both input and output file names for better traceability.
* refactor: support to flatten json object in greptime_identity pipeline
* refactor: add GreptimeIdentityPipelineParams to configure greptime_identity pipeline
* refactor: pass greptime identity pipeline params by one header kv
* refactor: code review
* refactor: make pipeline params more general for all internal pipelines
* chore: remove axum deps from pipeline
* fix: clippy errors
* chore: fix and add test
* test: adopt api change for test client
---------
Co-authored-by: shuiyisong <xixing.sys@gmail.com>
Co-authored-by: Ning Sun <sunng@protonmail.com>
Co-authored-by: Ning Sun <sunning@greptime.com>
* change dep
Signed-off-by: Ruihang Xia <waynestxia@gmail.com>
* feat: adapt to arrow's interval array
* chore: fix compile errors in datatypes crate
* chore: fix api crate compiler errors
* chore: fix compiler errors in common-grpc
* chore: fix common-datasource errors
* chore: fix deprecated code in common-datasource
* fix promql and physical plan related
Signed-off-by: Ruihang Xia <waynestxia@gmail.com>
* wip: upgrading network deps
Signed-off-by: Ruihang Xia <waynestxia@gmail.com>
* block on updating `sqlparser`
* upgrade sqlparser
Signed-off-by: Ruihang Xia <waynestxia@gmail.com>
* adapt new df's trait requirements
Signed-off-by: Ruihang Xia <waynestxia@gmail.com>
* chore: fix compiler errors in mito2
* chore: fix common-function crate errors
* chore: fix catalog errors
* change import path
Signed-off-by: Ruihang Xia <waynestxia@gmail.com>
* chore: fix some errors in query crate
* chore: fix some errors in query crate
* aggr expr and some other tiny fixes
Signed-off-by: Ruihang Xia <waynestxia@gmail.com>
* chore: fix expr related errors in query crate
* chore: fix query serializer and admin command
* chore: fix grpc services
* feat: axum serve
* chore: fix http server
* remove handle_error handler
* refactor timeout layer
* serve axum
* chore: fix flow aggr functions
* chore: fix flow
* feat: fix errors in meta-srv
* boxed()
* use TokioIo
* feat!: Remove script crate and python feature (#5321)
* feat: exclude script crate
* chore: simplify feature
* feat: remove the script crate
* chore: remove python feature and some comments
* chore: fix warning
* chore: fix servers tests compiler errors
* feat: fix tests-integration errors
* chore: fix unused
* test: fix catalog test
* chore: fix compiler errors for crates using common-meta
testing feature is enabled when checking with --workspace
* test: use display for logical plan test
* test: implement rewrite for ScanHintRule
* fix: http server build panic
* test: fix mito test
* fix: sql parser type alias error
* test: fix TestClient not listen
* test: some flow tests
* test(flow): more fix
* fix: test_otlp_logs
* test: fix promql test that uses deprecated method fun()
* fix: sql type replace supports Int8 ~ Int64, UInt8 ~ UInt64
* test: fix infer schema test case
* test: fix tests related to plan display
* chore: fix last flow test
* test: fix function format related assertion
* test: use larger port range for tests
* fix: test_otlp_traces
* fix: test_otlp_metrics
* fix range query and dist plan
Signed-off-by: Ruihang Xia <waynestxia@gmail.com>
* fix: flow handle distinct use deprecated field
* fix: can't pass Join plan expressions to LogicalPlan::with_new_exprs
* test: fix deserialize test
* test: reduce split key case num
* tests: lower case aggr func name
* test: fix some sqlness tests
* tests: more sqlness fix
* tests: fixed sqlness test
* commit non-bug changes
Signed-off-by: Ruihang Xia <waynestxia@gmail.com>
* fix: make our udf correct
* fix: implement empty methods of ContextProvider for DfContextProviderAdapter
* test: update sqlness test result
* chore: remove unused
* fix: provide alias name for AggregateExprBuilder in range plan
* test: update range query result
* fix: implement missing ContextProvider methods for DfContextProviderAdapter
* test: update timestamps, cte result
* fix: supports empty projection in mito
* test: update comment for cte test
* fix: support projection for numbers
* test: update test cases after projection fix
* fix: fix range select first_value/last_value
* fix: handle CAST and time index conflict
* fix: handle order by correctly in range first_value/last_value
* test: update sqlness result
* test: update view test result
* test: update decimal test
wait for https://github.com/apache/datafusion/pull/14126 to fix this
* feat: remove redundant physical optimization
todo(ruihang): Check if we can remove this.
* test: update sqlness test result
* chore: range select default sort use nulls_first = false
* test: update filter push down test result
* test: comment out decimal test to avoid different panic message
* test: update some distributed test result
* test: update test for distributed count and filter push down
* test: update subqueries test
* fix: SessionState may overwrite our UDFs
* chore: fix compiler errors after merging main
* fix: fix elasticsearch and dashboard router panic
* chore: fix common-functions tests
* chore: update sqlness result
* test: fix id keyword and update sqlness result
* test: fix flow_null test
* fix: enlarge thread size in debug mode to avoid overflow
* chore: fix warnings in common-function
* chore: fix warning in flow
* chore: fix warnings in query crate
* chore: remove unused warnings
* chore: fix deprecated warnings for parquet
* chore: fix deprecated warning in servers crate
* style: fix clippy
* test: enlarge ttl time in mito cache ttl test
* chore: fix typo
* style: fmt toml
* refactor: reimplement PartialOrd for RangeSelect
* chore: remove script crate files introduced by merge
* fix: return error if sql option is not kv
* chore: do not use ..default::default()
* chore: per review
* chore: update error message in BuildAdminFunctionArgsSnafu
Co-authored-by: jeremyhi <jiachun_feng@proton.me>
* refactor: typed precision
* update sqlness view case
Signed-off-by: Ruihang Xia <waynestxia@gmail.com>
* chore: flow per review
* chore: add example in comment
* chore: warn if parquet stats of timestamp is not INT64
* style: add a newline before derive to make the comment more clear
* test: update sqlness result
* fix: flow from substrait
* chore: change update_range_context log to debug level
* chore: move axum-extra axum-macros to workspace
---------
Signed-off-by: Ruihang Xia <waynestxia@gmail.com>
Co-authored-by: Ruihang Xia <waynestxia@gmail.com>
Co-authored-by: luofucong <luofc@foxmail.com>
Co-authored-by: discord9 <discord9@163.com>
Co-authored-by: shuiyisong <xixing.sys@gmail.com>
Co-authored-by: jeremyhi <jiachun_feng@proton.me>
* fix/avoid-suppress-manual-compaction:
**Refactor Compaction Logic**
- Removed `PendingCompaction` struct and integrated its functionality directly into `CompactionStatus` in `compaction.rs`.
- Simplified waiter management by consolidating waiter handling logic into `CompactionStatus`.
- Updated `CompactionRequest` creation to directly handle waiters without intermediate structures.
- Adjusted test cases in `compaction.rs` to align with the new waiter management approach.
(cherry picked from commit 87e2d1c2cc9bd82c02991d22e429bef25c5ee348)
* fix/avoid-suppress-manual-compaction:
### Add Support for Manual Compaction Requests
- **Compaction Logic Enhancements**:
- Updated `CompactionScheduler` in `compaction.rs` to handle manual compaction requests using `Options::StrictWindow`.
- Introduced `PendingCompaction` struct to manage pending manual compaction requests.
- Added logic to reschedule manual compaction requests once the current compaction task is completed.
- **Testing**:
- Added `test_manual_compaction_when_compaction_in_progress` to verify the handling of manual compaction requests during ongoing compaction processes.
These changes enhance the compaction scheduling mechanism by allowing manual compaction requests to be queued and processed efficiently.
(cherry picked from commit bc38ed0f2f8ba2c4690e0d0e251aeb2acce308ca)
* chore: fix conflicts
* fix/avoid-suppress-manual-compaction:
### Add Error Handling for Manual Compaction Override
- **`compaction.rs`**: Enhanced the `set_pending_request` method to handle manual compaction overrides by sending an error to the waiter if a previous request exists.
- **`error.rs`**: Introduced a new error variant `ManualCompactionOverride` to represent manual compaction being overridden, and mapped it to the `Cancelled` status code.
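The override behavior described above might look roughly like this. It is a sketch under assumptions: a `std::sync::mpsc` sender stands in for the real waiter type, and `CompactionStatus` is reduced to the single pending slot.

```rust
use std::sync::mpsc::Sender;

#[derive(Debug, PartialEq)]
enum CompactionError {
    // Mirrors the `ManualCompactionOverride` variant named in the commit.
    ManualCompactionOverride,
}

// Hypothetical waiter type: the caller blocks on the receiving end.
type Waiter = Sender<Result<(), CompactionError>>;

#[derive(Default)]
struct CompactionStatus {
    // At most one manual compaction request is kept pending.
    pending_request: Option<Waiter>,
}

impl CompactionStatus {
    fn set_pending_request(&mut self, waiter: Waiter) {
        // `replace` stores the new waiter and hands back the previous one, if any.
        if let Some(previous) = self.pending_request.replace(waiter) {
            // Notify the overridden caller; ignore the error if it already hung up.
            let _ = previous.send(Err(CompactionError::ManualCompactionOverride));
        }
    }
}
```

The earlier caller is told explicitly that its request was superseded rather than being left waiting on a request that will never run.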
* fix: format
* fix/avoid-suppress-manual-compaction:
**Add Error Handling for Pending Compaction Requests**
- Enhanced error handling in `compaction.rs` by adding logic to handle errors for pending compaction requests.
- Introduced a mechanism to send errors using `waiter.send` when a pending compaction request fails, ensuring proper error propagation and context with `CompactRegionSnafu`.
* fix/avoid-suppress-manual-compaction:
**Fix Typo and Simplify Code Logic in `compaction.rs`**
- Corrected a typo in the license comment from "langucage" to "language".
- Simplified the logic for handling `pending_compaction` in `CompactionStatus` by removing unnecessary pattern matching and directly accessing `waiter`.
* fix: typo
* feat: make instant_query and range_query support not-equal matchers
* feat: impl query_metric_names
* feat: forgot some files and refactor
* chore: test and docs
* fix: typo
Co-authored-by: Ruihang Xia <waynestxia@gmail.com>
* refactor: parse_query
* chore: improve test
* fix: use current catalog to query information_schema
---------
Co-authored-by: Ruihang Xia <waynestxia@gmail.com>
* fix: vector function for PromQL needs to ignore the time index; also closes #5392
Signed-off-by: yihong0618 <zouzou0208@gmail.com>
* fix: do not affect scalar function
Signed-off-by: yihong0618 <zouzou0208@gmail.com>
* fix: better name for it
Signed-off-by: yihong0618 <zouzou0208@gmail.com>
---------
Signed-off-by: yihong0618 <zouzou0208@gmail.com>
* refactor: use MetadataKey
* fix: match all prefix
* refactor: introduce TopicPool
* fix: fix test, some rename
* test: add unit test for legacy restore
* fix: add _ between prefix and topic id
* chore: readable legacy topics
* refactor: a refactor
* Apply suggestions from code review
* Apply suggestions from code review
* refactor: introduce TopicPool
* fix: fix unit test
* chore: fix unit test and add some comments
* fix: fix unit test
* refactor: just refactor
* refactor: rename
* chore: rename, comments and remove unnecessary clone
* chore/change-authorization-header:
### Add Custom Authorization Header Support
- **Files Modified**: `http.rs`, `authorize.rs`, `authorize.rs` (tests)
- **Key Changes**:
- Introduced a custom authorization header `x-greptime-auth` in `http.rs`.
- Updated authorization logic in `authorize.rs` to support both `x-greptime-auth` and the standard `Authorization` header.
- Enhanced test cases in `authorize.rs` to validate the new custom header functionality.
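A minimal sketch of the two-header lookup described above. The commit only says both headers are supported; giving `x-greptime-auth` precedence over `Authorization` is an assumption here, as are the helper name and the use of a plain map with lowercased keys in place of the real HTTP header type.

```rust
use std::collections::HashMap;

// Header name taken from the commit message.
const GREPTIME_AUTH_HEADER: &str = "x-greptime-auth";
// Standard header, lowercased because this sketch assumes normalized keys.
const AUTHORIZATION_HEADER: &str = "authorization";

// Hypothetical helper: returns the credential string to authenticate with.
fn auth_header_value(headers: &HashMap<String, String>) -> Option<&str> {
    headers
        .get(GREPTIME_AUTH_HEADER)
        // Fall back to the standard header when the custom one is absent.
        .or_else(|| headers.get(AUTHORIZATION_HEADER))
        .map(String::as_str)
}
```

A custom header of this kind is useful when a proxy in front of the server consumes or rewrites `Authorization` for its own purposes.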
* chore: add more tests
chore/change-default-compaction-output-size-limit:
### Update `TwcsOptions` Default Configuration
- Modified the default value of `max_output_file_size` in `TwcsOptions` to `Some(ReadableSize::gb(2))` in `src/mito2/src/region/options.rs`.
* feat: use time window in compaction options for compaction window
* test: add tests for overwriting options
* chore: typo
* chore: fix a grammar issue in log
This patch supports pg_database in pg_catalog and also adds a query
replacement in fixtures.rs, because datafusion does not support SQL like
'select 1,1;'. See issue #5344 for details.
Signed-off-by: yihong0618 <zouzou0208@gmail.com>
* feat: introduce `PrimaryKeyEncoding`
* fix: fix unit tests
* chore: add empty line
* test: add unit tests
* chore: fmt code
* refactor: introduce new codec trait to support various encoding
* fix: fix unit tests
* chore: update sqlness result
* chore: apply suggestions from CR
* chore: apply suggestions from CR
* feat: more workers
* feat: use round robin
* refactor: per review
* refactor: per bot review
* chore: per review
* docs: example
* docs: update config.md
* docs: update
* chore: per review
* refactor: set workers to cpu/2.max(1)
* fix: flow config in standalone mode
* test: fix config test
* docs: update docs&opt name
* chore: update config.md
* refactor: per review, sanitize at top
* chore: per review
* chore: config.md
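The `cpu/2.max(1)` worker default mentioned in this series can be written out as below. The function names are hypothetical; the sketch parenthesizes the division because a method call binds tighter than `/` in Rust, so the clamp must apply to the halved count.

```rust
use std::thread::available_parallelism;

// Half of the available CPUs, but never fewer than one worker.
fn default_workers(cpus: usize) -> usize {
    (cpus / 2).max(1)
}

// Best-effort CPU count; falls back to 1 if the platform cannot report it.
fn detected_cpus() -> usize {
    available_parallelism().map(|n| n.get()).unwrap_or(1)
}
```

Calling `default_workers(detected_cpus())` yields at least one worker even on a single-core machine, where plain integer division would give zero.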
* refactor(elasticsearch): use `_index` as greptimedb table in log ingestion and add `/${index}/_bulk` API
Signed-off-by: zyy17 <zyylsxm@gmail.com>
* refactor: code review
---------
Signed-off-by: zyy17 <zyylsxm@gmail.com>
* fix: drop all python embedding code for docker and doc
Signed-off-by: yihong0618 <zouzou0208@gmail.com>
* fix: address comments, drop the leftover python
Signed-off-by: yihong0618 <zouzou0208@gmail.com>
---------
Signed-off-by: yihong0618 <zouzou0208@gmail.com>
* feat: support `select session_user;`
This commit is part of the DBeaver support work: it adds the function
select session_user, as postgres does.
Signed-off-by: yihong0618 <zouzou0208@gmail.com>
* fix: lint problem
Signed-off-by: yihong0618 <zouzou0208@gmail.com>
* fix: address comments add tests
Signed-off-by: yihong0618 <zouzou0208@gmail.com>
---------
Signed-off-by: yihong0618 <zouzou0208@gmail.com>
* test: optimize out partition split insert requests if there is only one region
* Now that the optimization for single region insert has been lifted up, the original "fast path" can be obsoleted.
* resolve PR comments
* feat(flow): (Part 1) refill utils
* chore: after rebase fix
* chore: more rebase
* rm refill.rs to reduce pr size
* chore: simpler args
* refactor: per review
* docs: more explain for instant requests
* refactor: per review
* fix: drop unused dep using udeps to minimize the size
Signed-off-by: yihong0618 <zouzou0208@gmail.com>
* fix: address comments, fix the problem
Signed-off-by: yihong0618 <zouzou0208@gmail.com>
---------
Signed-off-by: yihong0618 <zouzou0208@gmail.com>
* feat: impl COPY a query resultset to external file
* chore: add more tests for parse `copy_table_to`
* chore: add more tests for parse `copy_table_to`
* feat: txn for pg kv backend
* chore: clippy
* fix: txn uses one client
* test: clean up and txn test
* test: clean up
* test: change lock_id to avoid conflict in test
* test: use different prefix in pg election test
* fix(test): just a fix
* test: aggregate multiple test to avoid concurrency problem
* test: use uuid instead of rng
* perf: batch cmp in txn
* perf: batch same op in txn
chore/suppress-list-warning:
### Update logging level in `intermediate.rs`
- Changed logging level from `warn` to `debug` for unexpected directory entries in index creation.
- Added `debug` to the `common_telemetry` import to support the logging level change.
* test: test adding existing columns
* chore: add more checks to AlterKind
* chore: update logs
* fix: check and build table info first
* feat: Add add_if_not_exists flag to alter expr
* feat: skip existing columns when building alter kind
* checks in make_region_alter_kind()
* reuse the alter kind
* test: fix tests in common-meta
* chore: fix typos
* chore: update comments
* feat: update partition duration of memtable using compaction window
* chore: only use provided duration if it is not None
* test: more tests
* test: test compaction apply window
* style: fix clippy
* feat: init PgElection
fix: release advisory lock
fix: handle duplicate keys
chore: update comments
fix: unlock if acquired the lock
chore: add TODO and avoid unwrap
refactor: check both lock and expire time, add more comments
chore: fmt
fix: deal with multiple edge cases
feat: init PgElection with candidate registration
chore: fmt
chore: remove
* test: add unit test for pg candidate registration
* test: add unit test for pg candidate registration
* chore: update pg env
* chore: make ci happy
* fix: spawn a background connection thread
* chore: typo
* fix: shadow the election client for now
* fix: fix ci
* chore: readability
* chore: follow review comments
* refactor: use kvbackend for pg election
* chore: rename
* chore: make clippy happy
* refactor: use pg server time instead of local ones
* chore: typo
* chore: rename infancy to leader_infancy for clarification
* chore: clean up
* chore: follow review comments
* chore: follow review comments
* ci: unit test should test all features
* ci: fix
* ci: just test pg
* wip: row group reader base
* wip: memtable row group reader
* Refactor MemtableRowGroupReader to streamline data fetching
- Added early return when fetch_ranges is empty to optimize performance.
- Replaced inline chunk data assignment with a call to `assign_dense_chunk` for cleaner code.
* wip: row group reader
* wip: reuse RowGroupReader
* wip: bulk part reader
* Enhance BulkPart Iteration with Filtering
- Introduced `RangeBase` to `BulkIterContext` for improved filter handling.
- Implemented filter application in `BulkPartIter` to prune batches based on predicates.
- Updated `SimpleFilterContext::new_opt` to be public for broader access.
* chore: add prune test
* fix: clippy
* fix: introduce prune reader for memtable and add more prune test
* Enhance BulkPart read method to return Option<BoxedBatchIterator>
- Modified `BulkPart::read` to return `Option<BoxedBatchIterator>` to handle cases where no row groups are selected.
- Added logic to return `None` when all row groups are filtered out.
- Updated tests to handle the new return type and added a test case to verify behavior when no row groups match the pr
* refactor/separate-paraquet-reader: Add helper function to parse parquet metadata and integrate it into BulkPartEncoder
* refactor/separate-paraquet-reader:
Change BulkPartEncoder row_group_size from Option to usize and update tests
* refactor/separate-paraquet-reader: Add context module for bulk memtable iteration and refactor part reading
• Introduce context module to encapsulate context for bulk memtable iteration.
• Refactor BulkPart to use BulkIterContextRef for reading operations.
• Remove redundant code in BulkPart by centralizing context creation and row group pruning logic in the new context module.
• Create new file context.rs with structures and logic for handling iteration context.
• Adjust part_reader.rs and row_group_reader.rs to reference the new BulkIterContextRef.
* refactor/separate-paraquet-reader: Refactor RowGroupReader traits and implementations in memtable and parquet reader modules
• Rename RowGroupReaderVirtual to RowGroupReaderContext for clarity.
• Replace BulkPartVirt with direct usage of BulkIterContextRef in MemtableRowGroupReader.
• Simplify MemtableRowGroupReaderBuilder by directly passing context instead of creating a BulkPartVirt instance.
• Update RowGroupReaderBase to use context field instead of virt, reflecting the trait renaming and usage.
• Modify FileRangeVirt to FileRangeContextRef and adjust implementations accordingly.
* refactor/separate-paraquet-reader: Refactor column page reader creation and remove unused code
• Centralize creation of SerializedPageReader in RowGroupBase::column_reader method.
• Remove unused RowGroupCachedReader and related code from MemtableRowGroupPageFetcher.
• Eliminate redundant error handling for invalid column index in multiple places.
* chore: rebase main and resolve conflicts
* fix: some comments
* chore: resolve conflicts
* chore: resolve conflicts
* chore: improve nix-shell support
* fix: add pkg-config
* ci: add a github action to ensure build on clean system
* ci: optimise dependencies of task
* ci: move clean build to nightly
Add ORDER BY clause to subquery union tests
Updated the SQL and result files for subquery union tests to include an ORDER BY clause, ensuring consistent result ordering. This change aligns with the test case from the DuckDB repository.
* feat: do not remove time filters
* chore: remove `time_range` from parquet reader
* chore: print more message in the check script
* chore: fix unused error
* perf/avoid-holding-memtable-during-compaction: Refactor Compaction Version Handling
• Introduced CompactionVersion struct to encapsulate region version details for compaction, removing dependency on VersionRef.
• Updated CompactionRequest and CompactionRegion to use CompactionVersion.
• Modified open_compaction_region to construct CompactionVersion without memtables.
• Adjusted WindowedCompactionPicker to work with CompactionVersion.
• Enhanced flush logic in WriteBufferManager to improve memory usage checks and logging.
* reformat code
* chore: change log level
* reformat code
---------
Co-authored-by: Yingwen <realevenyag@gmail.com>
* feat: simple version switch
* chore: remove debug print
* chore: add common folder
* tests: add drop table
* feat: pull versioned binary
* chore: don't use native-tls
* chore: rm outdated docs
* chore: new line
* fix: save old bin dir
* fix: switch version restart all node
* feat: use etcd
* fix: wait for election
* fix: normal sqlness
* refactor: hashmap for bin dir
* test: past 3 major version compat create table
* refactor: allow using without setup etcd
* add metrics
* chore/bench-metrics: Add INFLIGHT_FLUSH_COUNT Metric to Flush Process
• Introduced INFLIGHT_FLUSH_COUNT metric to track the number of ongoing flush operations.
• Incremented INFLIGHT_FLUSH_COUNT in FlushScheduler to monitor active flushes.
• Removed redundant increment of INFLIGHT_FLUSH_COUNT in RegionWorkerLoop to prevent double counting.
* chore/bench-metrics: Add Metrics for Compaction and Flush Operations
• Introduced INFLIGHT_COMPACTION_COUNT and INFLIGHT_FLUSH_COUNT metrics to track the number of ongoing compaction and flush operations.
• Incremented INFLIGHT_COMPACTION_COUNT when scheduling remote and local compaction jobs, and decremented it upon completion.
• Added INFLIGHT_FLUSH_COUNT increment and decrement logic around flush tasks to monitor active flush operations.
• Removed redundant metric updates in worker.rs and handle_compaction.rs to streamline metric handling.
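One way to keep the increment and decrement paired, and so avoid the double counting mentioned above, is a small guard. This is an illustrative swap-in rather than the approach taken in these commits: a plain `AtomicI64` stands in for the real metrics gauge, and `InflightGuard` is a hypothetical name.

```rust
use std::sync::atomic::{AtomicI64, Ordering};

// Stand-in for the gauge named in the commit message.
static INFLIGHT_FLUSH_COUNT: AtomicI64 = AtomicI64::new(0);

// Increments the gauge on creation and decrements it when dropped.
struct InflightGuard(&'static AtomicI64);

impl InflightGuard {
    fn new(gauge: &'static AtomicI64) -> Self {
        gauge.fetch_add(1, Ordering::Relaxed);
        Self(gauge)
    }
}

impl Drop for InflightGuard {
    fn drop(&mut self) {
        // Runs on every exit path, including early returns and panics.
        self.0.fetch_sub(1, Ordering::Relaxed);
    }
}
```

Creating the guard at the single place where a flush task starts, and moving it into the task, keeps the gauge accurate without separate updates in the worker loop.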
* chore: add metrics for remote compaction jobs
* chore: format
* chore: also add dashboard
* feat: cache inverted index by page instead of file
* fix: add unit test and fix bugs
* chore: typo
* chore: ci
* fix: math
* chore: apply review comments
* chore: renames
* test: add unit test for index key calculation
* refactor: use ReadableSize
* feat: add config for inverted index page size
* chore: update config file
* refactor: handle multiple range read and fix some related bugs
* fix: add config
* test: turn to a fs reader to match behaviors of object store
* chore: decide tag column in log api follow table schema if table exists
* chore: add more test for greptime_identity pipeline
* chore: change pipeline get_table function signature
* chore: change identity_pipeline_inner tag_column_names type
* feat(fuzz): add set table options to alter fuzzer
* chore: clippy is happy, I'm sad
* chore: happy ci happy
* fix: unit test
* feat(fuzz): add unset table options to alter fuzzer
* fix: unit test
* feat(fuzz): add table option validator
* fix: make clippy happy
* chore: add comments
* chore: apply review comments
* fix: unit test
* feat(fuzz): add more ttl options
* fix: #5108
* chore: add comments
* chore: add comments
* feat: assign partition ranges by rows
* feat: balance partition rows
* feat: get upper bound for part nums
* feat: only split in non-compaction seq scan
* fix: parallel scan on multiple sources
* fix: can split check
* feat: scanner prepare by request
* feat: remove scan_parallelism
* docs: update docs
* chore: update comment
* style: fix clippy
* feat: skip merge and dedup if there is only one source
* chore: Revert "feat: skip merge and dedup if there is only one source"
Since memtable won't do dedup jobs
This reverts commit 2fc7a54b11.
* test: avoid compaction in sqlness window sort test
* chore: do not create semaphore if num partitions is enough
* chore: more assertions
* chore: fix typo
* fix: compaction flag not set
* chore: address review comments
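The "assign partition ranges by rows" and "balance partition rows" commits above describe distributing scan ranges so that partitions carry similar row counts. One common way to do this is a greedy largest-first assignment, sketched below. This is an assumption about the general technique, not the scanner's actual code; `assign_ranges_by_rows` and its arguments are hypothetical names.

```rust
/// Hypothetical sketch: assigns range indices to `num_partitions` partitions,
/// always giving the next-largest range to the currently lightest partition.
fn assign_ranges_by_rows(range_rows: &[usize], num_partitions: usize) -> Vec<Vec<usize>> {
    assert!(num_partitions > 0, "at least one partition is required");
    // Visit ranges from the largest row count to the smallest (ties by index).
    let mut order: Vec<usize> = (0..range_rows.len()).collect();
    order.sort_by(|&a, &b| range_rows[b].cmp(&range_rows[a]).then(a.cmp(&b)));

    let mut partitions = vec![Vec::new(); num_partitions];
    let mut loads = vec![0usize; num_partitions];
    for idx in order {
        // `min_by_key` returns the first partition among equally light ones.
        let target = (0..num_partitions)
            .min_by_key(|&p| loads[p])
            .expect("num_partitions > 0");
        partitions[target].push(idx);
        loads[target] += range_rows[idx];
    }
    partitions
}
```

With row counts `[10, 1, 7, 3]` and two partitions, this sketch yields loads of 11 and 10 rows.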
* feat: ttl zero filter
* refactor: use TimeToLive enum
* fix: unit test
* tests: sqlness
* refactor: Option<TTL> None means UNSET
* tests: sqlness
* fix: 10000 years --> forever
* chore: minor refactor from reviews
* chore: rename back TimeToLive
* refactor: split imme request from normal requests
* fix: use correct lifetime
* refactor: rename immediate to instant
* tests: flow sink table default ttl
* refactor: per review
* tests: sqlness
* fix: ttl alter to instant
* tests: sqlness
* refactor: per review
* chore: per review
* feat: add db ttl type&forbid instant for db
* tests: more unit test
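The TTL commits above mention a `TimeToLive` enum with instant and forever variants and say that instant is forbidden at the database level. A minimal sketch of that shape follows. The variant layout, the `is_expired` method, and `validate_for_database` are assumptions made for illustration and may differ from the real type.

```rust
use std::time::Duration;

/// Hypothetical shape of a TTL value; the real enum may differ.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum TimeToLive {
    /// Data is dropped immediately after it is written.
    Instant,
    /// Data never expires.
    Forever,
    /// Data expires once it is older than the given duration.
    Duration(Duration),
}

impl TimeToLive {
    /// Returns true if data of the given age should be removed.
    fn is_expired(&self, age: Duration) -> bool {
        match self {
            TimeToLive::Instant => true,
            TimeToLive::Forever => false,
            TimeToLive::Duration(ttl) => age > *ttl,
        }
    }
}

/// Hypothetical validation: a database-level TTL must not be `Instant`.
fn validate_for_database(ttl: TimeToLive) -> Result<(), String> {
    match ttl {
        TimeToLive::Instant => Err("instant TTL is not allowed for databases".to_string()),
        _ => Ok(()),
    }
}
```

In this sketch an unset TTL would be modelled as `Option<TimeToLive>::None`, in line with the "None means UNSET" commit.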
* fix: use SchemaCache to locate database metadata
* main:
Refactor SchemaMetadataManager to use TableInfoCacheRef
- Replace TableInfoManagerRef with TableInfoCacheRef in SchemaMetadataManager
- Update DatanodeBuilder to pass TableInfoCacheRef to SchemaMetadataManager
- Rename error MissingCacheRegistrySnafu to MissingCacheSnafu in datanode module
- Adjust tests to use new mock_schema_metadata_manager with TableInfoCacheRef
* fix/schema-cache-invalidation: Add cache module and integrate cache registry into datanode
• Implement build_datanode_cache_registry function to create cache registry for datanode
• Integrate cache registry into datanode by modifying DatanodeBuilder and HeartbeatTask
• Refactor InvalidateTableCacheHandler to InvalidateCacheHandler and move to common-meta crate
• Update Cargo.toml to include cache as a dev-dependency for datanode
• Adjust related modules (flownode, frontend, tests-integration, standalone) to use new cache handler and registry
• Remove obsolete handler module from frontend crate
* fix: fuzz imports
* chore: add some doc for cache builder functions
* refactor: change table info cache to table schema cache
* fix: remove unused variants
* fix fuzz
* chore: apply suggestion
Co-authored-by: Weny Xu <wenymedia@gmail.com>
* chore: apply suggestion
Co-authored-by: Weny Xu <wenymedia@gmail.com>
* fix: compile
---------
Co-authored-by: dennis zhuang <killme2008@gmail.com>
Co-authored-by: Weny Xu <wenymedia@gmail.com>
* feat: add cache for schema options
* fix/use-cache-kv-manager: Add cache invalidation handling to Datanode's heartbeat task
• Implement InvalidateSchemaCacheHandler in heartbeat.rs to handle cache invalidation instructions.
• Update HeartbeatTask constructor to accept cached_kv_backend and pass it to InvalidateSchemaCacheHandler.
• Modify DatanodeBuilder to clone cached_kv_backend when creating schema_metadata_manager.
• Refactor MetasrvCacheInvalidator in cache_invalidator.rs to reuse MailboxMessage for broadcasting to different channels.
* fix: only remove schema related cache entries
* chore: add more tests
* fix/use-cache-kv-manager: Moved InvalidateSchemaCacheHandler to a separate module
• Extracted InvalidateSchemaCacheHandler and associated tests into a new file cache_invalidator.rs
• Removed async_trait and CacheInvalidator related code from heartbeat.rs
• Added cache_invalidator module declaration in handler.rs
* fix: unit tests
* fix/use-cache-kv-manager:
Standardize TODO comment format in CachedKvBackend txn method
* Update src/datanode/src/heartbeat/handler/cache_invalidator.rs
* Update src/datanode/src/heartbeat/handler/cache_invalidator.rs
* Update src/datanode/src/heartbeat/handler/cache_invalidator.rs
---------
Co-authored-by: jeremyhi <jiachun_feng@proton.me>
* fix/metric-metadata-region-options: Remove APPEND_MODE_KEY and refactor TTL option handling in MetricEngineInner
* fix/metric-metadata-region-options: Refactor metadata region options into a shared function
• Extract metadata region options into region_options_for_metadata_region function
• Replace inline options map with a call to the new shared function in both create.rs and open.rs files
* fix: exclude typos
* fix/metric-metadata-region-options:
Refactor metadata region options to accept original options and remove APPEND_MODE_KEY
* feat: prune in each partition
* chore: change pick log to trace
* chore: add in progress partition scan to metrics
* feat: seqscan support pruning in partition
* chore: remove commented codes
* feat: Replace flow
* refactor: better show create flow&tests: better check
* tests: sqlness result update
* tests: unit test for update
* refactor: cmp with raw bytes
* refactor: rename
* refactor: per review
* support set and show on statement/execution timeout session variables.
* implement statement timeout for mysql read, and postgres queries
* add mysql test with max execution time
* tests: more flow testcase
* tests(WIP): more tests
* tests: more flow tests
* test: weird regex for sqlness
* refactor: put blog&example to two files
* fix: result of nulls
* update test result
* fix null behaviors, add null tests
* update NULL tests
* error handler when parsing json_path
* change the logic to: items' datatype in the input arrays are all the same.
* remove a comment
* refactor: better logic
* drop unnecessary err check
* added an error test case
* main:
Add common-meta dependency and implement SchemaMetadataManager
- Introduce `common-meta` as a new dependency in `mito2`.
- Implement `SchemaMetadataManager` for managing schema-level metadata.
- Update `DatanodeBuilder` and `MitoEngine` to pass `KvBackendRef` for schema metadata management.
- Add `SchemaMetadataManager` to `RegionWorkerLoop` for compaction handling.
- Include `SchemaNameKey` usage in compaction-related code.
- Add `database_metadata_manager` module with `SchemaMetadataManager` struct and associated logic.
* fix/database-base-ttl:
Refactor metadata management and update compaction logic
- Remove `database_metadata_manager` and introduce `schema_metadata_manager`
- Update compaction logic to handle TTL based on schema metadata
- Adjust tests to use `schema_metadata_manager` for setting up schema options
- Fix engine creation in tests to pass `kv_backend` explicitly
- Remove unused imports and apply minor code cleanups
* fix/database-base-ttl:
Extend CREATE TABLE LIKE to inherit schema options
- Implement inheritance of database level options for CREATE TABLE LIKE
- Add schema options to SHOW CREATE TABLE output
- Refactor create_table_stmt to include schema_options in SQL generation
- Update error handling to include TableMetadataManagerSnafu
* fix/database-base-ttl:
Refactor error handling and remove schema dependency in table creation
- Replace expect with the ? operator for error handling in open_compaction_region
- Simplify create_logical_tables by removing catalog and schema name parameters
- Remove unnecessary schema retrieval and merging of schema options in create_table_info
- Clean up unused imports and redundant code
* fix/database-base-ttl:
Refactor error handling and update documentation comments
- Update comment to reflect retrieval of schema options instead of metadata
- Introduce new error type `GetSchemaMetadataSnafu` for schema metadata retrieval failures
- Implement error handling for schema metadata retrieval in `find_ttl` function
* fix: toml
* fix/database-base-ttl:
Refactor SchemaMetadataManager and adjust Cargo.toml dependencies
- Remove unused imports in schema_metadata_manager.rs
- Add conditional compilation for SchemaMetadataManager::new
- Update Cargo.toml to remove "testing" feature from common-meta dependency in main section and add it to dev-dependencies
* fix/database-base-ttl:
Fix typos in comments and function names across multiple modules
- Correct spelling of 'parallelism' in region_server, engine, and scan_region modules
- Amend typo in TODO comment from 'persisent' to 'persistent' in server module
- Update incorrect test query from 'versiona' to 'version' in federated module tests
* fix/database-base-ttl: Add schema existence check in StatementExecutor for CREATE TABLE operation
* fix/database-base-ttl: Add warning log for failed TTL retrieval in compaction region open function
* fix/database-base-ttl:
Refactor to use SchemaMetadataManagerRef in Datanode and MitoEngine
- Replace KvBackendRef with SchemaMetadataManagerRef across various components.
- Update DatanodeBuilder and MitoEngine to pass SchemaMetadataManagerRef instead of KvBackendRef.
- Adjust test cases to use get_schema_metadata_manager method for consistency.
* fix: data_length, index_length, table_rows in tables
* feat: table stats only works for mito engine currently
* fix: tests
* fix: typo
* chore: log error when region_stats fails
Co-authored-by: Lei, HUANG <mrsatangel@gmail.com>
* feat: get part range min-max from cache for unordered scan
* feat: seq scan push row groups if num_row_groups > 0
* test: test split
* feat: update comment
* test: fix split test
* refactor: rename get meta data method
* feat: adds index size to region statistics
* feat: adds the number of rows for region statistics
* test: adds sqlness test for region_statistics
* fix: test
* feat/alter-ttl:
Update greptime-proto source and add ChangeTableOptions handling
- Change greptime-proto source repository and revision in Cargo.lock and Cargo.toml
- Implement handling for ChangeTableOptions in grpc-expr and meta modules
- Add support for parsing and applying region option changes in mito2
- Introduce new error type for invalid change table option requests
- Add humantime dependency to store-api
- Fix SQL syntax in tests for changing column types
* chore: remove write buffer size option handling since we don't support specifying write_buffer_size for single table or region
* persist ttl to manifest
* chore: add sqlness
* fix: sqlness
* fix: typo and toml format
* fix: tests
* update: change alter syntax
* feat/alter-ttl: Add Clone trait to RegionFlushRequest and remove redundant Default derive in region_request.rs.
* feat/alter-ttl: Refactor code to replace 'ChangeTableOption' with 'ChangeRegionOption' and handle TTL as a region option
• Rename ChangeTableOption to ChangeRegionOption across various files.
• Update AlterKind::ChangeTableOptions to AlterKind::ChangeRegionOptions.
• Modify TTL handling to treat '0d' as None for TTL in table options.
• Adjust related function names and comments to reflect the change from table to region options.
• Include test case updates to verify the new TTL handling behavior.
* chore: update format
* refactor: update region options in DatanodeTableValue
* feat/alter-ttl:
Remove TTL handling from RegionManifest and related structures
- Eliminate TTL fields from `RegionManifest`, `RegionChange`, and associated handling logic.
- Update tests and checksums to reflect removal of TTL.
- Refactor `RegionOpener` and `handle_alter` to adjust to TTL removal.
- Simplify `RegionChangeResult` by replacing `change` with `new_meta`.
* chore: fmt
* remove useless delete op
* feat/alter-ttl: Updated Cargo.lock and gRPC expression Cargo.toml to include store-api dependency. Refactored alter.rs to use ChangeOption from store-api instead of ChangeTableOptionRequest.
Adjusted error handling in error.rs to use MetadataError. Modified handle_alter.rs to handle TTL changes with ChangeOption. Simplified region_request.rs by replacing
ChangeRegionOption with ChangeOption and removing redundant code. Removed UnsupportedTableOptionChange error in table/src/error.rs. Updated metadata.rs to use ChangeOption for table
options. Removed ChangeTableOptionRequest enum and related conversion code from requests.rs.
* feat/alter-ttl: Update greptime-proto dependency to revision 53ab9a9553
* chore: format code
* chore: update greptime-proto
* test: add fuzz tests for migrate metric regions
* test: insert values before migrating metric region
* feat: correct table num
* chore: apply suggestions from CR
* chore: add schema urls to otlp logs table
* chore: update meter-macros version to remove anymap warning
* chore: change span id and trace id to field
* feat: add empty batch to end of range stream
* feat: add batch validation
* fix: validate batch order
* fix: not yield empty batch in compaction
* fix: empty record batch
* feat: add a flag to enable empty batch
* chore: otlp logs api
* feat: add API to write OpenTelemetry logs to GreptimeDB
* chore: fix test data schema error
* chore: modify the underlying data structure of the pipeline value map type from hashmap to btreemap to keep key order
* chore: fix by pr comment
* chore: resolve conflicts and add some test
* chore: remove useless error
* chore: change otlp header name
* chore: fmt code
* chore: fix integration test for otlp log write api
* chore: fix by pr comment
* chore: set otlp body with fulltext default
* refactor: replace info logs with debug logs in region server
Signed-off-by: Ruihang Xia <waynestxia@gmail.com>
* fix: update error handling for closing and opening nonexistent regions
Signed-off-by: Ruihang Xia <waynestxia@gmail.com>
---------
Signed-off-by: Ruihang Xia <waynestxia@gmail.com>
* feat: enable prof by default
* docs: don't need to build with features
* feat: add common-pprof as optional dep for pprof feature
* build: remove optional
* feat: use dump_text
* fix/union_all_panic:
Improve MetricCollector by incrementing level and fix underflow issue; add tests for UNION ALL queries
* chore: remove useless documentation
* fix/union_all_panic: Add order by clause to UNION ALL select queries in tests
* feat: json output format for http
* feat: add json result test case
* fix: typo and refactor a piece of code
* fix: cargo check
* move affected_rows to top level
* feat: add geojson function to aggregate paths
* test: add sqlness results
* test: add sqlness
* refactor: corrected to aggregation function
* chore: update comments
* fix: make linter happy again
* refactor: rename to remove `geo` from `geojson` function name
The return type is not geojson at all. It's just compatible with geojson's
coordinates part and superset's deckgl path plugin.
* chore: add json write
* chore: add test for write json log api
* chore: enhancement of Error Handling
* chore: fix by pr comment
* chore: fix by pr comment
* chore: enhancement of error content and add some doc
* feat: set max log files to 720 by default, info log only
* expose max_log_files in tomls
* include dir info when panicking, limit max_log_files of err_log to 30, and that of slow_queries to opt.max_log_files
* fix clippy
* update config.md
* update expected config str
* limit err_log max files size to `max_log_files` too, include err info when panicking, put `max_l_f` in right position
* fix typos
* chore: config
Co-authored-by: Lei, HUANG <6406592+v0y4g3r@users.noreply.github.com>
---------
Co-authored-by: dennis zhuang <killme2008@gmail.com>
Co-authored-by: Lei, HUANG <6406592+v0y4g3r@users.noreply.github.com>
* feat: define range meta
* feat: group ranges
* feat: split range
* feat: build ranges from the scan input
* feat: get partition range from range meta
* feat: build file range
* feat: unordered scan read by ranges
* feat: wip for mem ranges
* feat: build ranges
* feat: remove unused codes
* chore: update comments
* feat: update metrics
* chore: address review comments
* chore: debug assertion
* Commit Message
Clarify documentation for CompactionOutput struct
Updated the documentation for the `CompactionOutput` struct to specify that the output time range is only relevant for windowed compaction.
* Add max_output_file_size to TwcsPicker and TwcsOptions
- Introduced `max_output_file_size` to `TwcsPicker` struct and its logic to enforce output file size limits during compaction.
- Updated `TwcsOptions` to include `max_output_file_size` and adjusted related tests.
- Modified `new_picker` function to initialize `TwcsPicker` with the new `max_output_file_size` field.
* feat/limit-compaction-output-size:
Refactor compaction picker and TWCS to support append mode and improve options handling
- Update compaction picker to accept a reference to options and append mode flag
- Modify TWCS picker logic to consider append mode when filtering deleted rows
- Remove VersionControl usage in compactor and simplify return type
- Adjust enforce_max_output_size logic in TWCS picker to handle max output file size
- Add append mode flag to TwcsPicker struct
- Fix incorrect condition in TWCS picker for enforcing max output size
- Update region options tests to reflect new max output file size format (1GB and 7MB)
- Simplify InvalidTableOptionSnafu error handling in create_parser
- Add `compaction.twcs.max_output_file_size` to mito engine option keys
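The idea of enforcing `max_output_file_size` during compaction can be sketched as splitting a run of input files into groups whose combined size stays under the limit. This is a simplified illustration under stated assumptions: it works on plain file sizes rather than file handles, and `split_by_max_output_size` is a hypothetical name, not the picker's real method.

```rust
/// Hypothetical sketch: splits input file sizes (in bytes) into consecutive
/// groups so that each group's total stays within `max_output_file_size`.
/// A single file larger than the limit still forms its own group.
fn split_by_max_output_size(file_sizes: &[u64], max_output_file_size: u64) -> Vec<Vec<u64>> {
    let mut groups: Vec<Vec<u64>> = Vec::new();
    let mut current: Vec<u64> = Vec::new();
    let mut current_size = 0u64;
    for &size in file_sizes {
        // Close the current group before it would exceed the limit.
        if !current.is_empty() && current_size.saturating_add(size) > max_output_file_size {
            groups.push(std::mem::take(&mut current));
            current_size = 0;
        }
        current_size = current_size.saturating_add(size);
        current.push(size);
    }
    if !current.is_empty() {
        groups.push(current);
    }
    groups
}
```

A real picker would likely also decide what to do with groups that contain only one file; this sketch leaves them in the result.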
* resolve some comments
* feat: add capability to send warning to pgclient
* fix: refactor query context to carry query scope data
* feat: return a warning for unsupported postgres statement
* feat: list/array support for postgres output
* fix: implement time zone support for postgresql
* feat: add a geohash function that returns array
* fix: typo
* fix: lint warnings
* test: add sqlness test
* refactor: check resolution range before convert value
* fix: test result for sqlness
* feat: upgrade pgwire apis
* chore: cherrypick 52e8eebb2dbbbe81179583c05094004a5eedd7fd
* refactor/tables: Change variable from immutable to mutable in KvBackendCatalogManager's method
* refactor/tables: Replace unbounded channel with bounded and use semaphore for concurrency control in KvBackendCatalogManager
* refactor/tables: Add common-runtime dependency and update KvBackendCatalogManager to use common_runtime::spawn_global
* refactor/tables: Await on sending error through channel in KvBackendCatalogManager
* chore: version skew
* fix: even more version skew
* feat: use `ring` instead of `aws-lc` to remove the nasm assembler requirement on Windows
* feat: use `ring` for pgwire
* feat: change to use `aws-lc-sys` on windows instead
* feat: change back to use `ring`
* chore: provide CryptoProvider
* feat: use upstream repo
* feat: install ring crypto lib in main
* chore: use same fn to install in tests
* feat: make pgwire use `ring`
* generic bundle trait
* feat: impl get/let
* fix: drop batch
* test: tumble batch
* feat: use batch eval flow
* fix: div use arrow::div not mul
* perf: not append batch
* perf: use bool mask for reduce
* perf: tiny opt
* perf: refactor slow path
* feat: opt if then
* fix: WIP
* perf: if then
* chore: use trace instead
* fix: reduce missing non-first batch
* perf: flow if then using interleave
* docs: add TODO
* perf: remove unnecessary eq
* chore: remove unused import
* fix: run_available no longer loop forever
* feat: blocking on high input buf
* chore: increase threshold
* chore: after rebase
* chore: per review
* chore: per review
* fix: allow empty values in reduce&test
* tests: more flow doc example tests
* chore: per review
* chore: per review
* feat: add json type and vector
* fix: allow to create and insert json data
* feat: udf to query json as string
* refactor: remove JsonbValue and JsonVector
* feat: show json value as strings
* chore: make ci happy
* test: add unit test and sqlness test
* refactor: use binary as grpc value of json
* fix: use non-preserve-order jsonb
* test: revert changed test
* refactor: change udf get_by_path to jq
* chore: make ci happy
* fix: distinguish binary and json in proto
* chore: delete udf for future pr
* refactor: remove Value(Json)
* chore: follow review comments
* test: some tests and checks
* test: fix unit tests
* chore: follow review comments
* chore: corresponding changes to proto
* fix: change grpc and pgsql server behavior alongside with sqlness/crud tests
* chore: follow review comments
* feat: udf of conversions between json and strings, used for grpc server
* refactor: rename to_string to json_to_string
* test: add more sqlness test for json
* chore: thanks for review :)
* Apply suggestions from code review
---------
Co-authored-by: Weny Xu <wenymedia@gmail.com>
* Refactor RaftEngineLogStore to use references for config
- Updated `RaftEngineLogStore::try_new` to accept a reference to `RaftEngineConfig` instead of taking ownership.
- Replaced direct usage of `config` with individual fields (`sync_write`, `sync_period`, `read_batch_size`).
- Adjusted test cases to pass references to `RaftEngineConfig`.
* Add parallelism configuration for WAL recovery
- Introduced `recovery_parallelism` setting in `datanode.example.toml` and `standalone.example.toml` for configuring parallelism during WAL recovery.
- Updated `Cargo.lock` and `Cargo.toml` to include `num_cpus` dependency.
- Modified `RaftEngineConfig` to include `recovery_parallelism` with a default value set to the number of CP
* feat/wal-recovery-parallelism:
Add `wal.recovery_parallelism` configuration option
- Introduced `wal.recovery_parallelism` to config.md for specifying parallelism during WAL recovery.
- Updated `RaftEngineLogStore` to include `recovery_threads` from the new configuration.
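A parallelism setting such as `wal.recovery_parallelism` is typically given a default derived from the host's CPU count. The sketch below assumes that behaviour and uses the standard library's `available_parallelism` instead of the `num_cpus` crate so that it stays self-contained. The struct `WalRecoveryConfig` is hypothetical and is not the repository's `RaftEngineConfig`.

```rust
use std::thread;

/// Hypothetical stand-in for the WAL configuration; only the field discussed
/// above is modelled.
#[derive(Debug, Clone, PartialEq)]
struct WalRecoveryConfig {
    /// Number of threads used when replaying the WAL during recovery.
    recovery_parallelism: usize,
}

impl Default for WalRecoveryConfig {
    fn default() -> Self {
        // Assumption: fall back to a single thread if the CPU count is unknown.
        let cpus = thread::available_parallelism()
            .map(|n| n.get())
            .unwrap_or(1);
        Self {
            recovery_parallelism: cpus,
        }
    }
}
```

An explicit value in the configuration file would override this default.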
* fix: ut
* fix: table resolving logic related to pg_catalog
refer to
https://github.com/GreptimeTeam/greptimedb/issues/3560#issuecomment-2287794348
and #4543
* refactor: remove CatalogProtocol type
* fix: sqlness
* fix: forbid create database pg_catalog with mysql client
* refactor: use QueryContext as arguments rather than Channel
* refactor: pass None as default behaviour in information_schema
* test: fix test
* chore: add test pipeline api
* chore: add test for test pipeline api
* chore: fix taplo check
* chore: change pipeline dryrun api path
* chore: add more info for pipeline dryrun api
* chore: add processor builder and transform builder
* chore: in process
* chore: intermediate state from hashmap to vector in pipeline
* chore: remove useless code and rename some struct
* chore: fix typos
* chore: format code
* chore: add error handling and optimize code readability
* chore: fix typos
* chore: remove useless code
* chore: add some doc
* chore: fix by pr commit
* chore: remove useless code and change struct name
* chore: modify the location of the find_key_index function.
* refactor: pre-read the ingested sst file in object store to fill the local cache to accelerate first query
* feat: pre-download the ingested SST from remote to accelerate following reads
* resolve PR comments
* resolve PR comments
* feat(log_store): use new `Consumer`
* feat: add `from_peer_id`
* feat: read WAL entries respect index
* test: add test for `build_region_wal_index_iterator`
* fix: keep the handle
* fix: incorrect last index
* fix: replay last entry id may be greater than expected
* chore: remove unused code
* chore: apply suggestions from CR
* chore: rename `datanode_id` to `location_id`
* chore: rename `from_peer_id` to `location_id`
* chore: rename `from_peer_id` to `location_id`
* chore: apply suggestions from CR
* use list_with_metakey and concurrent_stat_in_list
* change concurrent in recover_cache like before
* remove stat function
* use 8 concurrent
* use const value
* fmt code
* Apply suggestions from code review
---------
Co-authored-by: ozewr <l19ht@google.com>
Co-authored-by: Weny Xu <wenymedia@gmail.com>
* chore: improve pipeline performance
* chore: use arc to improve time type
* chore: improve pipeline coerce
* chore: add vec refactor
* chore: add vec pp
* chore: improve pipeline
* inprocess
* chore: set log ingester use new pipeline
* chore: fix some error by pr comment
* chore: fix typo
* chore: use enum_dispatch to simplify code
* chore: some minor fix
* chore: format code
* chore: update by pr comment
* chore: fix typo
* chore: make clippy happy
* chore: fix by pr comment
* chore: remove epoch and date process add new timestamp process
* chore: add more test for pipeline
* chore: restore epoch and date processor
* chore: compatibility issue
* chore: fix by pr comment
* chore: move the evaluation out of the loop
* chore: fix by pr comment
* chore: fix dissect output key filter
* chore: fix transform output greptime value has order error
* chore: keep pipeline transform output order
* chore: revert tests
* chore: simplify pipeline prepare implementation
* chore: add test for timestamp pipeline processor
* chore: make clippy happy
* chore: replace is_some check to match
---------
Co-authored-by: shuiyisong <xixing.sys@gmail.com>
* feat: support fast count(*) for append-only tables
* fix: total_rows stats in time series memtable
* fix: sqlness result changes for SinglePartitionScanner -> StreamScanAdapter
* fix: some cr comments
release-dev-builder-images-cn: # Note: Be careful issue:https://github.com/containers/skopeo/issues/1874 and we decide to use the latest stable skopeo container.
# change all rustls dependencies to use our fork to default to `ring` to make it "just work"
hyper-rustls = { git = "https://github.com/GreptimeTeam/hyper-rustls", rev = "a951e03" } # version = "0.27.5" with ring patch
rustls = { git = "https://github.com/GreptimeTeam/rustls", rev = "34fd0c6" } # version = "0.23.20" with ring patch
tokio-rustls = { git = "https://github.com/GreptimeTeam/tokio-rustls", rev = "4604ca6" } # version = "0.26.0" with ring patch
# This is commented, since we are not using aws-lc-sys, if we need to use it, we need to uncomment this line or use a release after this commit, or it wouldn't compile with gcc < 8.1
**GreptimeDB** is an open-source unified time-series database for **Metrics**, **Logs**, and **Events** (also **Traces** in plan). You can gain real-time insights from Edge to Cloud at any scale.
**GreptimeDB** is an open-source unified & cost-effective time-series database for **Metrics**, **Logs**, and **Events** (also **Traces** in plan). You can gain real-time insights from Edge to Cloud at Any Scale.
## Why GreptimeDB
Our core developers have been building time-series data platforms for years. Based on our best-practices, GreptimeDB is born to give you:
Our core developers have been building time-series data platforms for years. Based on our best practices, GreptimeDB was born to give you:
* **Unified all kinds of time series**
* **Unified Processing of Metrics, Logs, and Events**
GreptimeDB treats all time series as contextual events with timestamp, and thus unifies the processing of metrics, logs, and events. It supports analyzing metrics, logs, and events with SQL and PromQL, and doing streaming with continuous aggregation.
GreptimeDB unifies time series data processing by treating all data - whether metrics, logs, or events - as timestamped events with context. Users can analyze this data using either [SQL](https://docs.greptime.com/user-guide/query-data/sql) or [PromQL](https://docs.greptime.com/user-guide/query-data/promql) and leverage stream processing ([Flow](https://docs.greptime.com/user-guide/flow-computation/overview)) to enable continuous aggregation. [Read more](https://docs.greptime.com/user-guide/concepts/data-model).
* **Cloud-Edge collaboration**
* **Cloud-native Distributed Database**
GreptimeDB can be deployed on ARM architecture-compatible Android/Linux systems as well as cloud environments from various vendors. Both sides run the same software, providing identical APIs and control planes, so your application can run at the edge or on the cloud without modification, and data synchronization also becomes extremely easy and efficient.
* **Cloud-native distributed database**
By leveraging object storage (S3 and others), separating compute and storage, scaling stateless compute nodes arbitrarily, GreptimeDB implements seamless scalability. It also supports cross-cloud deployment with a built-in unified data access layer over different object storages.
Built for [Kubernetes](https://docs.greptime.com/user-guide/deployments/deploy-on-kubernetes/greptimedb-operator-management). GreptimeDB achieves seamless scalability with its [cloud-native architecture](https://docs.greptime.com/user-guide/concepts/architecture) of separated compute and storage, built on object storage (AWS S3, Azure Blob Storage, etc.) while enabling cross-cloud deployment through a unified data access layer.
* **Performance and Cost-effective**
Flexible indexing capabilities and distributed, parallel-processing query engine, tackling high cardinality issues down. Optimized columnar layout for handling time-series data; compacted, compressed, and stored on various storage backends, particularly cloud object storage with 50x cost efficiency.
Written in pure Rust for superior performance and reliability. GreptimeDB features a distributed query engine with intelligent indexing to handle high cardinality data efficiently. Its optimized columnar storage achieves 50x cost efficiency on cloud object storage through advanced compression. [Benchmark reports](https://www.greptime.com/blogs/2024-09-09-report-summary).
* **Compatible with InfluxDB, Prometheus and more protocols**
* **Cloud-Edge Collaboration**
Widely adopted database protocols and APIs, including MySQL, PostgreSQL, and Prometheus Remote Storage, etc. [Read more](https://docs.greptime.com/user-guide/clients/overview).
GreptimeDB seamlessly operates across cloud and edge (ARM/Android/Linux), providing consistent APIs and control plane for unified data management and efficient synchronization. [Learn how to run on Android](https://docs.greptime.com/user-guide/deployments/run-on-android/).
* Python toolchain (optional): Required only if built with PyO3 backend. More detail for compiling with PyO3 can be found in its [documentation](https://pyo3.rs/v0.18.1/building_and_distribution#configuring-the-python-version).
* C/C++ building essentials, including `gcc`/`g++`/`autoconf` and glibc library (e.g. `libc6-dev` on Ubuntu and `glibc-devel` on Fedora)
* Python toolchain (optional): Required only if using some test scripts.
@@ -146,14 +174,19 @@ cargo run -- standalone start
### Grafana Dashboard
Our official Grafana dashboard is available at [grafana](grafana/README.md) directory.
Our official Grafana dashboard for monitoring GreptimeDB is available at [grafana](grafana/README.md) directory.
## Project Status
The current version has not yet reached the standards for General Availability.
According to our Greptime 2024 Roadmap, we aim to achieve a production-level version with the release of v1.0 by the end of 2024. [Join Us](https://github.com/GreptimeTeam/greptimedb/issues/3412)
GreptimeDB is currently in Beta. We are targeting GA (General Availability) with the v1.0 release in early 2025.
We welcome you to test and use GreptimeDB. Some users have already adopted it in their production environments. If you're interested in trying it out, please use the latest stable release available.
While in Beta, GreptimeDB is already:
* Being used in production by early adopters
* Actively maintained with regular releases, [about version number](https://docs.greptime.com/nightly/reference/about-greptimedb-version)
* Suitable for testing and evaluation
For production use, we recommend using the latest stable release.
## Community
@@ -172,6 +205,13 @@ In addition, you may:
- Connect us with [Linkedin](https://www.linkedin.com/company/greptime/)
- Follow us on [Twitter](https://twitter.com/greptime)
## Commercial Support
If you are running GreptimeDB OSS in your organization, we offer additional
enterprise add-ons, installation services, training, and consulting. [Contact
us](https://greptime.com/contactus) and we will reach out to you with more
detail of our commercial license.
## License
GreptimeDB uses the [Apache License 2.0](https://apache.org/licenses/LICENSE-2.0.txt) to strike a balance between
@@ -189,4 +229,3 @@ Special thanks to all the contributors who have propelled GreptimeDB forward. Fo
- GreptimeDB's query engine is powered by [Apache Arrow DataFusion™](https://arrow.apache.org/datafusion/).
- [Apache OpenDAL™](https://opendal.apache.org) gives GreptimeDB a very general and elegant data access abstraction layer.
- GreptimeDB's meta service is based on [etcd](https://etcd.io/).
- GreptimeDB uses [RustPython](https://github.com/RustPython/RustPython) for experimental embedded python scripting.
| `default_timezone` | String | `None` | The default timezone of the server. |
| `default_timezone` | String | Unset | The default timezone of the server. |
| `init_regions_in_background` | Bool | `false` | Initialize all regions in the background during the startup.<br/>By default, it provides services after all regions have been initialized. |
| `max_concurrent_queries` | Integer | `0` | The maximum number of concurrent queries allowed to be executed. Zero means unlimited. |
| `enable_telemetry` | Bool | `true` | Enable telemetry to collect anonymous usage data. Enabled by default. |
| `max_in_flight_write_bytes` | String | Unset | The maximum in-flight write bytes. |
| `runtime` | -- | -- | The runtime options. |
| `runtime.global_rt_size` | Integer | `8` | The number of threads to execute the runtime for global read operations. |
| `runtime.compact_rt_size` | Integer | `4` | The number of threads to execute the runtime for global write operations. |
@@ -22,81 +26,104 @@
| `http.addr` | String | `127.0.0.1:4000` | The address to bind the HTTP server. |
| `http.timeout` | String | `30s` | HTTP request timeout. Set to 0 to disable timeout. |
| `http.body_limit` | String | `64MB` | HTTP request body limit.<br/>The following units are supported: `B`, `KB`, `KiB`, `MB`, `MiB`, `GB`, `GiB`, `TB`, `TiB`, `PB`, `PiB`.<br/>Set to 0 to disable limit. |
| `http.enable_cors` | Bool | `true` | HTTP CORS support, it's turned on by default<br/>This allows browser to access http APIs without CORS restrictions |
| `grpc.tls.watch` | Bool | `false` | Watch for Certificate and key file change and auto reload.<br/>For now, gRPC tls config does not support auto reload. |
| `prom_store.enable` | Bool | `true` | Whether to enable Prometheus remote write and read in HTTP API. |
| `prom_store.with_metric_engine` | Bool | `true` | Whether to store the data from Prometheus remote write in metric engine. |
| `wal` | -- | -- | The WAL options. |
| `wal.provider` | String | `raft_engine` | The provider of the WAL.<br/>- `raft_engine`: the wal is stored in the local file system by raft-engine.<br/>- `kafka`: it's remote wal that data is stored in Kafka. |
| `wal.dir` | String | `None` | The directory to store the WAL files.<br/>**It's only used when the provider is `raft_engine`**. |
| `wal.file_size` | String | `256MB` | The size of the WAL segment file.<br/>**It's only used when the provider is `raft_engine`**. |
| `wal.purge_threshold` | String | `4GB` | The threshold of the WAL size to trigger a flush.<br/>**It's only used when the provider is `raft_engine`**. |
| `wal.purge_interval` | String | `10m` | The interval to trigger a flush.<br/>**It's only used when the provider is `raft_engine`**. |
| `wal.dir` | String | Unset | The directory to store the WAL files.<br/>**It's only used when the provider is `raft_engine`**. |
| `wal.file_size` | String | `128MB` | The size of the WAL segment file.<br/>**It's only used when the provider is `raft_engine`**. |
| `wal.purge_threshold` | String | `1GB` | The threshold of the WAL size to trigger a purge.<br/>**It's only used when the provider is `raft_engine`**. |
| `wal.purge_interval` | String | `1m` | The interval to trigger a purge.<br/>**It's only used when the provider is `raft_engine`**. |
| `wal.read_batch_size` | Integer | `128` | The read batch size.<br/>**It's only used when the provider is `raft_engine`**. |
| `wal.sync_write` | Bool | `false` | Whether to use sync write.<br/>**It's only used when the provider is `raft_engine`**. |
| `wal.enable_log_recycle` | Bool | `true` | Whether to reuse logically truncated log files.<br/>**It's only used when the provider is `raft_engine`**. |
| `wal.prefill_log_files` | Bool | `false` | Whether to pre-create log files on start up.<br/>**It's only used when the provider is `raft_engine`**. |
| `wal.sync_period` | String | `10s` | Duration for fsyncing log files.<br/>**It's only used when the provider is `raft_engine`**. |
| `wal.broker_endpoints` | Array | -- | The Kafka broker endpoints.<br/>**It's only used when the provider is `kafka`**. |
| `wal.auto_create_topics` | Bool | `true` | Automatically create topics for WAL.<br/>Set to `true` to automatically create topics for WAL.<br/>Otherwise, use topics named `topic_name_prefix_[0..num_topics)` |
| `wal.num_topics` | Integer | `64` | Number of topics.<br/>**It's only used when the provider is `kafka`**. |
| `wal.selector_type` | String | `round_robin` | Topic selector type.<br/>Available selector types:<br/>- `round_robin` (default)<br/>**It's only used when the provider is `kafka`**. |
| `wal.topic_name_prefix` | String | `greptimedb_wal_topic` | A Kafka topic is constructed by concatenating `topic_name_prefix` and `topic_id`.<br/>e.g., greptimedb_wal_topic_0, greptimedb_wal_topic_1.<br/>**It's only used when the provider is `kafka`**. |
| `wal.replication_factor` | Integer | `1` | Expected number of replicas of each partition.<br/>**It's only used when the provider is `kafka`**. |
| `wal.create_topic_timeout` | String | `30s` | Above which a topic creation operation will be cancelled.<br/>**It's only used when the provider is `kafka`**. |
| `wal.max_batch_bytes` | String | `1MB` | The max size of a single producer batch.<br/>Warning: Kafka has a default limit of 1MB per message in a topic.<br/>**It's only used when the provider is `kafka`**. |
| `wal.consumer_wait_timeout` | String | `100ms` | The consumer wait timeout.<br/>**It's only used when the provider is `kafka`**. |
| `wal.backoff_init` | String | `500ms` | The initial backoff delay.<br/>**It's only used when the provider is `kafka`**. |
| `wal.backoff_max` | String | `10s` | The maximum backoff delay.<br/>**It's only used when the provider is `kafka`**. |
| `wal.backoff_base` | Integer | `2` | The exponential backoff rate, i.e. next backoff = base * current backoff.<br/>**It's only used when the provider is `kafka`**. |
| `wal.backoff_deadline` | String | `5mins` | The deadline of retries.<br/>**It's only used when the provider is `kafka`**. |
| `wal.overwrite_entry_start_id` | Bool | `false` | Ignore missing entries during read WAL.<br/>**It's only used when the provider is `kafka`**.<br/><br/>This option ensures that when Kafka messages are deleted, the system<br/>can still successfully replay memtable data without throwing an<br/>out-of-range error.<br/>However, enabling this option might lead to unexpected data loss,<br/>as the system will skip over missing entries instead of treating<br/>them as critical errors. |
| `flow.num_workers` | Integer | `0` | The number of flow workers in the flownode.<br/>If not set (or set to 0), the number of CPU cores divided by 2 is used. |
| `storage` | -- | -- | The data storage options. |
| `storage.data_home` | String | `/tmp/greptimedb/` | The working home directory. |
| `storage.type` | String | `File` | The storage type used to store the data.<br/>- `File`: the data is stored in the local file system.<br/>- `S3`: the data is stored in the S3 object storage.<br/>- `Gcs`: the data is stored in the Google Cloud Storage.<br/>- `Azblob`: the data is stored in the Azure Blob Storage.<br/>- `Oss`: the data is stored in the Aliyun OSS. |
| `storage.cache_path` | String | `None` | Cache configuration for object storage such as 'S3' etc.<br/>The local file cache directory. |
| `storage.cache_capacity` | String | `None` | The local file cache capacity in bytes. |
| `storage.bucket` | String | `None` | The S3 bucket name.<br/>**It's only used when the storage type is `S3`, `Oss` and `Gcs`**. |
| `storage.root` | String | `None` | The S3 data will be stored in the specified prefix, for example, `s3://${bucket}/${root}`.<br/>**It's only used when the storage type is `S3`, `Oss` and `Azblob`**. |
| `storage.access_key_id` | String | `None` | The access key id of the aws account.<br/>It's **highly recommended** to use AWS IAM roles instead of hardcoding the access key id and secret key.<br/>**It's only used when the storage type is `S3` and `Oss`**. |
| `storage.secret_access_key` | String | `None` | The secret access key of the aws account.<br/>It's **highly recommended** to use AWS IAM roles instead of hardcoding the access key id and secret key.<br/>**It's only used when the storage type is `S3`**. |
| `storage.access_key_secret` | String | `None` | The secret access key of the aliyun account.<br/>**It's only used when the storage type is `Oss`**. |
| `storage.account_name` | String | `None` | The account name of the azure account.<br/>**It's only used when the storage type is `Azblob`**. |
| `storage.account_key` | String | `None` | The account key of the azure account.<br/>**It's only used when the storage type is `Azblob`**. |
| `storage.scope` | String | `None` | The scope of the google cloud storage.<br/>**It's only used when the storage type is `Gcs`**. |
| `storage.credential_path` | String | `None` | The credential path of the google cloud storage.<br/>**It's only used when the storage type is `Gcs`**. |
| `storage.container` | String | `None` | The container of the azure account.<br/>**It's only used when the storage type is `Azblob`**. |
| `storage.sas_token` | String | `None` | The sas token of the azure account.<br/>**It's only used when the storage type is `Azblob`**. |
| `storage.endpoint` | String | `None` | The endpoint of the S3 service.<br/>**It's only used when the storage type is `S3`, `Oss`, `Gcs` and `Azblob`**. |
| `storage.region` | String | `None` | The region of the S3 service.<br/>**It's only used when the storage type is `S3`, `Oss`, `Gcs` and `Azblob`**. |
| `storage.cache_path` | String | Unset | Read cache configuration for object storage such as 'S3' etc, it's configured by default when using object storage. It is recommended to configure it when using object storage for better performance.<br/>A local file directory, defaults to `{data_home}`. An empty string means disabling. |
| `storage.cache_capacity` | String | Unset | The local file cache capacity in bytes. If your disk space is sufficient, it is recommended to set it larger. |
| `storage.bucket` | String | Unset | The S3 bucket name.<br/>**It's only used when the storage type is `S3`, `Oss` and `Gcs`**. |
| `storage.root` | String | Unset | The S3 data will be stored in the specified prefix, for example, `s3://${bucket}/${root}`.<br/>**It's only used when the storage type is `S3`, `Oss` and `Azblob`**. |
| `storage.access_key_id` | String | Unset | The access key id of the aws account.<br/>It's **highly recommended** to use AWS IAM roles instead of hardcoding the access key id and secret key.<br/>**It's only used when the storage type is `S3` and `Oss`**. |
| `storage.secret_access_key` | String | Unset | The secret access key of the aws account.<br/>It's **highly recommended** to use AWS IAM roles instead of hardcoding the access key id and secret key.<br/>**It's only used when the storage type is `S3`**. |
| `storage.access_key_secret` | String | Unset | The secret access key of the aliyun account.<br/>**It's only used when the storage type is `Oss`**. |
| `storage.account_name` | String | Unset | The account name of the azure account.<br/>**It's only used when the storage type is `Azblob`**. |
| `storage.account_key` | String | Unset | The account key of the azure account.<br/>**It's only used when the storage type is `Azblob`**. |
| `storage.scope` | String | Unset | The scope of the google cloud storage.<br/>**It's only used when the storage type is `Gcs`**. |
| `storage.credential_path` | String | Unset | The credential path of the google cloud storage.<br/>**It's only used when the storage type is `Gcs`**. |
| `storage.credential` | String | Unset | The credential of the google cloud storage.<br/>**It's only used when the storage type is `Gcs`**. |
| `storage.container` | String | Unset | The container of the azure account.<br/>**It's only used when the storage type is `Azblob`**. |
| `storage.sas_token` | String | Unset | The sas token of the azure account.<br/>**It's only used when the storage type is `Azblob`**. |
| `storage.endpoint` | String | Unset | The endpoint of the S3 service.<br/>**It's only used when the storage type is `S3`, `Oss`, `Gcs` and `Azblob`**. |
| `storage.region` | String | Unset | The region of the S3 service.<br/>**It's only used when the storage type is `S3`, `Oss`, `Gcs` and `Azblob`**. |
| `storage.http_client` | -- | -- | The http client options to the storage.<br/>**It's only used when the storage type is `S3`, `Oss`, `Gcs` and `Azblob`**. |
| `storage.http_client.pool_max_idle_per_host` | Integer | `1024` | The maximum idle connection per host allowed in the pool. |
| `storage.http_client.connect_timeout` | String | `30s` | The timeout for only the connect phase of a http client. |
| `storage.http_client.timeout` | String | `30s` | The total request timeout, applied from when the request starts connecting until the response body has finished.<br/>Also considered a total deadline. |
| `storage.http_client.pool_idle_timeout` | String | `90s` | The timeout for idle sockets being kept-alive. |
| `[[region_engine]]` | -- | -- | The region engine options. You can configure multiple region engines. |
| `region_engine.mito.num_workers` | Integer | `8` | Number of region workers. |
@@ -104,61 +131,79 @@
| `region_engine.mito.worker_request_batch_size` | Integer | `64` | Max batch size for a worker to handle requests. |
| `region_engine.mito.manifest_checkpoint_distance` | Integer | `10` | Number of meta action updated to trigger a new checkpoint for the manifest. |
| `region_engine.mito.compress_manifest` | Bool | `false` | Whether to compress manifest and checkpoint file by gzip (default false). |
| `region_engine.mito.max_background_jobs` | Integer | `4` | Max number of running background jobs |
| `region_engine.mito.max_background_flushes` | Integer | Auto | Max number of running background flush jobs (default: 1/2 of cpu cores). |
| `region_engine.mito.max_background_compactions` | Integer | Auto | Max number of running background compaction jobs (default: 1/4 of cpu cores). |
| `region_engine.mito.max_background_purges` | Integer | Auto | Max number of running background purge jobs (default: number of cpu cores). |
| `region_engine.mito.auto_flush_interval` | String | `1h` | Interval to auto flush a region if it has not flushed yet. |
| `region_engine.mito.global_write_buffer_size` | String | `1GB` | Global write buffer size for all regions. If not set, it's default to 1/8 of OS memory with a max limitation of 1GB. |
| `region_engine.mito.global_write_buffer_reject_size` | String | `2GB` | Global write buffer size threshold to reject write requests. If not set, it's default to 2 times of `global_write_buffer_size` |
| `region_engine.mito.sst_meta_cache_size` | String | `128MB` | Cache size for SST metadata. Setting it to 0 to disable the cache.<br/>If not set, it's default to 1/32 of OS memory with a max limitation of 128MB. |
| `region_engine.mito.vector_cache_size` | String | `512MB` | Cache size for vectors and arrow arrays. Setting it to 0 to disable the cache.<br/>If not set, it's default to 1/16 of OS memory with a max limitation of 512MB. |
| `region_engine.mito.page_cache_size` | String | `512MB` | Cache size for pages of SST row groups. Setting it to 0 to disable the cache.<br/>If not set, it's default to 1/16 of OS memory with a max limitation of 512MB. |
| `region_engine.mito.enable_experimental_write_cache` | Bool | `false` | Whether to enable the experimental write cache. |
| `region_engine.mito.experimental_write_cache_path` | String | `""` | File system path for write cache, defaults to `{data_home}/write_cache`. |
| `region_engine.mito.global_write_buffer_size` | String | Auto | Global write buffer size for all regions. If not set, it's default to 1/8 of OS memory with a max limitation of 1GB. |
| `region_engine.mito.global_write_buffer_reject_size` | String | Auto | Global write buffer size threshold to reject write requests. If not set, it's default to 2 times of `global_write_buffer_size`. |
| `region_engine.mito.sst_meta_cache_size` | String | Auto | Cache size for SST metadata. Setting it to 0 to disable the cache.<br/>If not set, it's default to 1/32 of OS memory with a max limitation of 128MB. |
| `region_engine.mito.vector_cache_size` | String | Auto | Cache size for vectors and arrow arrays. Setting it to 0 to disable the cache.<br/>If not set, it's default to 1/16 of OS memory with a max limitation of 512MB. |
| `region_engine.mito.page_cache_size` | String | Auto | Cache size for pages of SST row groups. Setting it to 0 to disable the cache.<br/>If not set, it's default to 1/8 of OS memory. |
| `region_engine.mito.selector_result_cache_size` | String | Auto | Cache size for time series selector (e.g. `last_value()`). Setting it to 0 to disable the cache.<br/>If not set, it's default to 1/16 of OS memory with a max limitation of 512MB. |
| `region_engine.mito.enable_write_cache` | Bool | `false` | Whether to enable the write cache, it's enabled by default when using object storage. It is recommended to enable it when using object storage for better performance. |
| `region_engine.mito.write_cache_path` | String | `""` | File system path for write cache, defaults to `{data_home}`. |
| `region_engine.mito.write_cache_size` | String | `5GiB` | Capacity for write cache. If your disk space is sufficient, it is recommended to set it larger. |
| `region_engine.mito.scan_parallelism` | Integer | `0` | Parallelism to scan a region (default: 1/4 of cpu cores).<br/>- `0`: using the default value (1/4 of cpu cores).<br/>- `1`: scan in current thread.<br/>- `n`: scan in parallelism n. |
| `region_engine.mito.parallel_scan_channel_size` | Integer | `32` | Capacity of the channel to send data from parallel scan tasks to the main task. |
| `region_engine.mito.allow_stale_entries` | Bool | `false` | Whether to allow stale WAL entries read during replay. |
| `region_engine.mito.min_compaction_interval` | String | `0m` | Minimum time interval between two compactions.<br/>To align with the old behavior, the default value is 0 (no restrictions). |
| `region_engine.mito.index` | -- | -- | The options for index in Mito engine. |
| `region_engine.mito.index.aux_path` | String | `""` | Auxiliary directory path for the index in filesystem, used to store intermediate files for<br/>creating the index and staging files for searching the index, defaults to `{data_home}/index_intermediate`.<br/>The default name for this directory is `index_intermediate` for backward compatibility.<br/><br/>This path contains two subdirectories:<br/>- `__intm`: for storing intermediate files used during creating index.<br/>- `staging`: for storing staging files used during searching index. |
| `region_engine.mito.index.staging_size` | String | `2GB` | The max capacity of the staging directory. |
| `region_engine.mito.index.staging_ttl` | String | `7d` | The TTL of the staging directory.<br/>Defaults to 7 days.<br/>Setting it to "0s" to disable TTL. |
| `region_engine.mito.index.metadata_cache_size` | String | `64MiB` | Cache size for inverted index metadata. |
| `region_engine.mito.index.content_cache_size` | String | `128MiB` | Cache size for inverted index content. |
| `region_engine.mito.index.content_cache_page_size` | String | `64KiB` | Page size for inverted index content cache. |
| `region_engine.mito.inverted_index` | -- | -- | The options for inverted index in Mito engine. |
| `region_engine.mito.inverted_index.create_on_flush` | String | `auto` | Whether to create the index on flush.<br/>- `auto`: automatically (default)<br/>- `disable`: never |
| `region_engine.mito.inverted_index.create_on_compaction` | String | `auto` | Whether to create the index on compaction.<br/>- `auto`: automatically (default)<br/>- `disable`: never |
| `region_engine.mito.inverted_index.apply_on_query` | String | `auto` | Whether to apply the index on query<br/>- `auto`: automatically (default)<br/>- `disable`: never |
| `region_engine.mito.inverted_index.mem_threshold_on_create` | String | `auto` | Memory threshold for performing an external sort during index creation.<br/>- `auto`: automatically determine the threshold based on the system memory size (default)<br/>- `unlimited`: no memory limit<br/>- `[size]` e.g. `64MB`: fixed memory threshold |
| `region_engine.mito.inverted_index.metadata_cache_size` | String | `64MiB` | Cache size for inverted index metadata. |
| `region_engine.mito.inverted_index.content_cache_size` | String | `128MiB` | Cache size for inverted index content. |
| `region_engine.mito.fulltext_index` | -- | -- | The options for full-text index in Mito engine. |
| `region_engine.mito.fulltext_index.create_on_flush` | String | `auto` | Whether to create the index on flush.<br/>- `auto`: automatically (default)<br/>- `disable`: never |
| `region_engine.mito.fulltext_index.create_on_compaction` | String | `auto` | Whether to create the index on compaction.<br/>- `auto`: automatically (default)<br/>- `disable`: never |
| `region_engine.mito.fulltext_index.apply_on_query` | String | `auto` | Whether to apply the index on query<br/>- `auto`: automatically (default)<br/>- `disable`: never |
| `region_engine.mito.fulltext_index.mem_threshold_on_create` | String | `auto` | Memory threshold for index creation.<br/>- `auto`: automatically determine the threshold based on the system memory size (default)<br/>- `unlimited`: no memory limit<br/>- `[size]` e.g. `64MB`: fixed memory threshold |
| `region_engine.mito.bloom_filter_index` | -- | -- | The options for bloom filter in Mito engine. |
| `region_engine.mito.bloom_filter_index.create_on_flush` | String | `auto` | Whether to create the bloom filter on flush.<br/>- `auto`: automatically (default)<br/>- `disable`: never |
| `region_engine.mito.bloom_filter_index.create_on_compaction` | String | `auto` | Whether to create the bloom filter on compaction.<br/>- `auto`: automatically (default)<br/>- `disable`: never |
| `region_engine.mito.bloom_filter_index.apply_on_query` | String | `auto` | Whether to apply the bloom filter on query<br/>- `auto`: automatically (default)<br/>- `disable`: never |
| `region_engine.mito.bloom_filter_index.mem_threshold_on_create` | String | `auto` | Memory threshold for bloom filter creation.<br/>- `auto`: automatically determine the threshold based on the system memory size (default)<br/>- `unlimited`: no memory limit<br/>- `[size]` e.g. `64MB`: fixed memory threshold |
| `region_engine.mito.memtable.index_max_keys_per_shard` | Integer | `8192` | The max number of keys in one shard.<br/>Only available for `partition_tree` memtable. |
| `region_engine.mito.memtable.data_freeze_threshold` | Integer | `32768` | The max rows of data inside the actively writing buffer in one shard.<br/>Only available for `partition_tree` memtable. |
| `region_engine.mito.memtable.fork_dictionary_bytes` | String | `1GiB` | Max dictionary bytes.<br/>Only available for `partition_tree` memtable. |
| `logging.append_stdout` | Bool | `true` | Whether to append logs to stdout. |
| `logging.log_format` | String | `text` | The log format. Can be `text`/`json`. |
| `logging.max_log_files` | Integer | `720` | The maximum amount of log files. |
| `logging.tracing_sample_ratio` | -- | -- | The percentage of tracing will be sampled and exported.<br/>Valid range `[0, 1]`, 1 means all traces are sampled, 0 means all traces are not sampled, the default value is 1.<br/>ratio > 1 are treated as 1. Fractions < 0 are treated as 0 |
| `export_metrics` | -- | -- | The datanode can export its metrics and send to Prometheus compatible service (e.g. send to `greptimedb` itself) from remote-write API.<br/>This is only used for `greptimedb` to export its own metrics internally. It's different from prometheus scrape. |
| `export_metrics.self_import` | -- | -- | For `standalone` mode, `self_import` is recommended to collect metrics generated by itself<br/>You must create the database before enabling it. |
| `export_metrics.remote_write.url` | String | `""` | The url the metrics send to. The url example can be: `http://127.0.0.1:4000/v1/prometheus/write?db=information_schema`. |
| `export_metrics.remote_write.url` | String | `""` | The url the metrics send to. The url example can be: `http://127.0.0.1:4000/v1/prometheus/write?db=greptime_metrics`. |
| `tracing` | -- | -- | The tracing options. Only takes effect when compiled with the `tokio-console` feature. |
| `tracing.tokio_console_addr` | String | `None` | The tokio console address. |
| `tracing.tokio_console_addr` | String | Unset | The tokio console address. |
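
The standalone options above can be combined into a single configuration file. The following is a minimal sketch, assuming each dotted key in the table maps to a TOML table of the same name (for example, `wal.dir` becomes `dir` under `[wal]`). The WAL path, bucket name, and root prefix are placeholders chosen for illustration, not defaults.

```toml
# Local WAL backed by raft-engine, using the defaults listed in the table.
[wal]
provider = "raft_engine"
dir = "/tmp/greptimedb/wal"   # placeholder path; this key is unset by default
file_size = "128MB"
purge_threshold = "1GB"
purge_interval = "1m"
sync_write = false

# Object storage backend. Credentials are left out on purpose:
# the table recommends AWS IAM roles over hardcoded keys.
[storage]
data_home = "/tmp/greptimedb/"
type = "S3"
bucket = "my-bucket"          # placeholder bucket name
root = "greptimedb"           # placeholder prefix inside the bucket

# Mito engine with the write cache turned on for object storage.
[[region_engine]]
[region_engine.mito]
num_workers = 8
enable_write_cache = true
write_cache_size = "5GiB"
```

Switching `wal.provider` to `kafka` would make the `raft_engine`-only keys above inactive, and `wal.broker_endpoints` and the other Kafka-only keys would apply instead.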
## Distributed Mode
@@ -167,7 +212,8 @@
| Key | Type | Default | Descriptions |
| --- | -----| ------- | ----------- |
| `default_timezone` | String | `None` | The default timezone of the server. |
| `default_timezone` | String | Unset | The default timezone of the server. |
| `max_in_flight_write_bytes` | String | Unset | The maximum in-flight write bytes. |
| `runtime` | -- | -- | The runtime options. |
| `runtime.global_rt_size` | Integer | `8` | The number of threads to execute the runtime for global read operations. |
| `runtime.compact_rt_size` | Integer | `4` | The number of threads to execute the runtime for global write operations. |
@@ -178,37 +224,43 @@
| `http.addr` | String | `127.0.0.1:4000` | The address to bind the HTTP server. |
| `http.timeout` | String | `30s` | HTTP request timeout. Set to 0 to disable timeout. |
| `http.body_limit` | String | `64MB` | HTTP request body limit.<br/>The following units are supported: `B`, `KB`, `KiB`, `MB`, `MiB`, `GB`, `GiB`, `TB`, `TiB`, `PB`, `PiB`.<br/>Set to 0 to disable limit. |
| `http.enable_cors` | Bool | `true` | HTTP CORS support, it's turned on by default<br/>This allows browser to access http APIs without CORS restrictions |
| `grpc.addr` | String | `127.0.0.1:4001` | The address to bind the gRPC server. |
| `grpc.hostname` | String | `127.0.0.1` | The hostname advertised to the metasrv,<br/>and used for connections from outside the host |
| `grpc.bind_addr` | String | `127.0.0.1:4001` | The address to bind the gRPC server. |
| `grpc.server_addr` | String | `127.0.0.1:4001` | The address advertised to the metasrv, and used for connections from outside the host.<br/>If left empty or unset, the server will automatically use the IP address of the first network interface<br/>on the host, with the same port number as the one specified in `grpc.bind_addr`. |
| `grpc.runtime_size` | Integer | `8` | The number of server worker threads. |
| `grpc.tls` | -- | -- | gRPC server TLS options, see `mysql.tls` section. |
| `grpc.tls.watch` | Bool | `false` | Watch for Certificate and key file change and auto reload.<br/>For now, gRPC tls config does not support auto reload. |
| `logging.append_stdout` | Bool | `true` | Whether to append logs to stdout. |
| `logging.log_format` | String | `text` | The log format. Can be `text`/`json`. |
| `logging.max_log_files` | Integer | `720` | The maximum amount of log files. |
| `logging.tracing_sample_ratio` | -- | -- | The percentage of tracing will be sampled and exported.<br/>Valid range `[0, 1]`, 1 means all traces are sampled, 0 means all traces are not sampled, the default value is 1.<br/>ratio > 1 are treated as 1. Fractions < 0 are treated as 0 |
| `export_metrics` | -- | -- | The datanode can export its metrics and send to Prometheus compatible service (e.g. send to `greptimedb` itself) from remote-write API.<br/>This is only used for `greptimedb` to export its own metrics internally. It's different from prometheus scrape. |
| `export_metrics.self_import` | -- | -- | For `standalone` mode, `self_import` is recommended to collect metrics generated by itself<br/>You must create the database before enabling it. |
| `export_metrics.remote_write.url` | String | `""` | The url the metrics send to. The url example can be: `http://127.0.0.1:4000/v1/prometheus/write?db=information_schema`. |
| `export_metrics.remote_write.url` | String | `""` | The url the metrics send to. The url example can be: `http://127.0.0.1:4000/v1/prometheus/write?db=greptime_metrics`. |
| `tracing` | -- | -- | The tracing options. Only takes effect when compiled with the `tokio-console` feature. |
| `tracing.tokio_console_addr` | String | `None` | The tokio console address. |
| `tracing.tokio_console_addr` | String | Unset | The tokio console address. |
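
As an illustration of the gRPC and metrics options above, here is a minimal sketch for a distributed-mode component. It assumes the same mapping from dotted keys to TOML tables, and it uses `grpc.bind_addr` and `grpc.server_addr` rather than the `grpc.addr` and `grpc.hostname` rows. The `greptime_metrics` database in the URL is taken from the example in the table and has to exist before metrics are written to it.

```toml
[grpc]
# Address the gRPC server listens on.
bind_addr = "127.0.0.1:4001"
# Address advertised to the metasrv; if left unset, the first network
# interface's IP is used with the port from bind_addr.
server_addr = "127.0.0.1:4001"
runtime_size = 8

[logging]
append_stdout = true
log_format = "text"
max_log_files = 720

# Push this node's own metrics through the Prometheus remote-write API.
[export_metrics.remote_write]
url = "http://127.0.0.1:4000/v1/prometheus/write?db=greptime_metrics"
```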
### Metasrv
@@ -252,13 +310,17 @@
| --- | -----| ------- | ----------- |
| `data_home` | String | `/tmp/metasrv/` | The working home directory. |
| `bind_addr` | String | `127.0.0.1:3002` | The bind address of metasrv. |
| `server_addr` | String | `127.0.0.1:3002` | The communication server address for frontend and datanode to connect to metasrv, "127.0.0.1:3002" by default for localhost. |
| `server_addr` | String | `127.0.0.1:3002` | The communication server address for the frontend and datanode to connect to metasrv.<br/>If left empty or unset, the server will automatically use the IP address of the first network interface<br/>on the host, with the same port number as the one specified in `bind_addr`. |
| `store_addrs` | Array | -- | Store server address default to etcd store.<br/>For postgres store, the format is:<br/>"password=password dbname=postgres user=postgres host=localhost port=5432"<br/>For etcd store, the format is:<br/>"127.0.0.1:2379" |
| `store_key_prefix` | String | `""` | If it's not empty, the metasrv will store all data with this key prefix. |
| `backend` | String | `etcd_store` | The datastore for meta server.<br/>Available values:<br/>- `etcd_store` (default value)<br/>- `memory_store`<br/>- `postgres_store` |
| `meta_table_name` | String | `greptime_metakv` | Table name in RDS to store metadata. Takes effect when using an RDS kvbackend.<br/>**Only used when backend is `postgres_store`.** |
| `meta_election_lock_id` | Integer | `1` | Advisory lock id in PostgreSQL for election. Takes effect when using PostgreSQL as kvbackend.<br/>Only used when backend is `postgres_store`. |
| `use_memory_store` | Bool | `false` | Store data in memory. |
| `enable_region_failover` | Bool | `false` | Whether to enable region failover.<br/>This feature is only available on GreptimeDB running on cluster mode and<br/>- Using Remote WAL<br/>- Using shared storage (e.g., s3). |
| `node_max_idle_time` | String | `24hours` | Max allowed idle time before removing node info from metasrv memory. |
| `enable_telemetry` | Bool | `true` | Whether to enable greptimedb telemetry. Enabled by default. |
| `runtime` | -- | -- | The runtime options. |
| `runtime.global_rt_size` | Integer | `8` | The number of threads to execute the runtime for global read operations. |
| `runtime.compact_rt_size` | Integer | `4` | The number of threads to execute the runtime for global write operations. |
@@ -279,9 +341,10 @@
| `wal` | -- | -- | -- |
| `wal.provider` | String | `raft_engine` | -- |
| `wal.broker_endpoints` | Array | -- | The broker endpoints of the Kafka cluster. |
| `wal.num_topics` | Integer | `64` | Number of topics to be created upon start. |
| `wal.auto_create_topics` | Bool | `true` | Automatically create topics for WAL.<br/>Set to `true` to automatically create topics for WAL.<br/>Otherwise, use topics named `topic_name_prefix_[0..num_topics)` |
| `wal.num_topics` | Integer | `64` | Number of topics. |
| `wal.topic_name_prefix` | String | `greptimedb_wal_topic` | A Kafka topic is constructed by concatenating `topic_name_prefix` and `topic_id`. |
| `wal.topic_name_prefix` | String | `greptimedb_wal_topic` | A Kafka topic is constructed by concatenating `topic_name_prefix` and `topic_id`.<br/>Only accepts strings that match the following regular expression pattern:<br/>[a-zA-Z_:-][a-zA-Z0-9_:\-\.@#]*<br/>e.g., greptimedb_wal_topic_0, greptimedb_wal_topic_1. |
| `wal.replication_factor` | Integer | `1` | Expected number of replicas of each partition. |
| `wal.create_topic_timeout` | String | `30s` | Above which a topic creation operation will be cancelled. |
| `wal.backoff_init` | String | `500ms` | The initial backoff for kafka clients. |
@@ -289,23 +352,29 @@
| `wal.backoff_base` | Integer | `2` | Exponential backoff rate, i.e. next backoff = base * current backoff. |
| `wal.backoff_deadline` | String | `5mins` | Stop reconnecting if the total wait time reaches the deadline. If this config is missing, the reconnecting won't terminate. |
| `logging` | -- | -- | The logging options. |
| `logging.dir` | String | `/tmp/greptimedb/logs` | The directory to store the log files. |
| `logging.level` | String | `None` | The log level. Can be `info`/`debug`/`warn`/`error`. |
| `logging.dir` | String | `/tmp/greptimedb/logs` | The directory to store the log files. If set to empty, logs will not be written to files. |
| `logging.level` | String | Unset | The log level. Can be `info`/`debug`/`warn`/`error`. |
| `logging.append_stdout` | Bool | `true` | Whether to append logs to stdout. |
| `logging.log_format` | String | `text` | The log format. Can be `text`/`json`. |
| `logging.max_log_files` | Integer | `720` | The maximum amount of log files. |
| `logging.tracing_sample_ratio` | -- | -- | The percentage of tracing will be sampled and exported.<br/>Valid range `[0, 1]`, 1 means all traces are sampled, 0 means all traces are not sampled, the default value is 1.<br/>ratio > 1 are treated as 1. Fractions < 0 are treated as 0 |
| `export_metrics` | -- | -- | The datanode can export its metrics and send to Prometheus compatible service (e.g. send to `greptimedb` itself) from remote-write API.<br/>This is only used for `greptimedb` to export its own metrics internally. It's different from prometheus scrape. |
| `export_metrics.self_import` | -- | -- | For `standalone` mode, `self_import` is recommended to collect metrics generated by itself.<br/>You must create the database before enabling it. |
| `export_metrics.remote_write.url` | String | `""` | The url the metrics send to. The url example can be: `http://127.0.0.1:4000/v1/prometheus/write?db=information_schema`. |
| `export_metrics.remote_write.url` | String | `""` | The url the metrics send to. The url example can be: `http://127.0.0.1:4000/v1/prometheus/write?db=greptime_metrics`. |
| `tracing` | -- | -- | The tracing options. Only takes effect when compiled with the `tokio-console` feature. |
| `tracing.tokio_console_addr` | String | `None` | The tokio console address. |
| `tracing.tokio_console_addr` | String | Unset | The tokio console address. |
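
As a rough illustration of how the `wal.*` keys above fit together, here is a minimal sketch of a metasrv `[wal]` section that switches to the Kafka provider. The broker address is a hypothetical placeholder; the other values are the defaults listed in the table, so treat this as a starting point rather than a recommended production setup.

```toml
[wal]
# Use Kafka as the remote WAL instead of the default `raft_engine`.
provider = "kafka"
# Hypothetical broker address; replace with your own Kafka endpoints.
broker_endpoints = ["127.0.0.1:9092"]
# Let metasrv create the topics; otherwise topics named
# `topic_name_prefix_[0..num_topics)` are expected to exist already.
auto_create_topics = true
num_topics = 64
topic_name_prefix = "greptimedb_wal_topic"
replication_factor = 1
create_topic_timeout = "30s"
# Exponential backoff: next backoff = backoff_base * current backoff.
backoff_init = "500ms"
backoff_base = 2
backoff_deadline = "5mins"
```

With these values the generated topic names would follow the `greptimedb_wal_topic_0`, `greptimedb_wal_topic_1`, ... pattern described for `wal.topic_name_prefix`.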
### Datanode
@@ -313,26 +382,26 @@
| Key | Type | Default | Descriptions |
| --- | -----| ------- | ----------- |
| `mode` | String | `standalone` | The running mode of the datanode. It can be `standalone` or `distributed`. |
| `node_id` | Integer | `None` | The datanode identifier and should be unique in the cluster. |
| `node_id` | Integer | Unset | The datanode identifier and should be unique in the cluster. |
| `require_lease_before_startup` | Bool | `false` | Start services after regions have obtained leases.<br/>It will block the datanode start if it can't receive leases in the heartbeat from metasrv. |
| `init_regions_in_background` | Bool | `false` | Initialize all regions in the background during the startup.<br/>By default, it provides services after all regions have been initialized. |
| `max_concurrent_queries` | Integer | `0` | The maximum number of concurrent queries allowed to be executed. Zero means unlimited. |
| `enable_telemetry` | Bool | `true` | Enable telemetry to collect anonymous usage data. Enabled by default. |
| `http` | -- | -- | The HTTP server options. |
| `http.addr` | String | `127.0.0.1:4000` | The address to bind the HTTP server. |
| `http.timeout` | String | `30s` | HTTP request timeout. Set to 0 to disable timeout. |
| `http.body_limit` | String | `64MB` | HTTP request body limit.<br/>The following units are supported: `B`, `KB`, `KiB`, `MB`, `MiB`, `GB`, `GiB`, `TB`, `TiB`, `PB`, `PiB`.<br/>Set to 0 to disable limit. |
| `grpc` | -- | -- | The gRPC server options. |
| `grpc.addr` | String | `127.0.0.1:3001` | The address to bind the gRPC server. |
| `grpc.hostname` | String | `127.0.0.1` | The hostname advertised to the metasrv,<br/>and used for connections from outside the host |
| `grpc.bind_addr` | String | `127.0.0.1:3001` | The address to bind the gRPC server. |
| `grpc.server_addr` | String | `127.0.0.1:3001` | The address advertised to the metasrv, and used for connections from outside the host.<br/>If left empty or unset, the server will automatically use the IP address of the first network interface<br/>on the host, with the same port number as the one specified in `grpc.bind_addr`. |
| `grpc.runtime_size` | Integer | `8` | The number of server worker threads. |
| `grpc.max_recv_message_size` | String | `512MB` | The maximum receive message size for gRPC server. |
| `grpc.max_send_message_size` | String | `512MB` | The maximum send message size for gRPC server. |
| `grpc.tls` | -- | -- | gRPC server TLS options, see `mysql.tls` section. |
| `grpc.tls.watch` | Bool | `false` | Watch for Certificate and key file change and auto reload.<br/>For now, gRPC tls config does not support auto reload. |
| `runtime` | -- | -- | The runtime options. |
| `runtime.global_rt_size` | Integer | `8` | The number of threads to execute the runtime for global read operations. |
| `wal.provider` | String | `raft_engine` | The provider of the WAL.<br/>- `raft_engine`: the wal is stored in the local file system by raft-engine.<br/>- `kafka`: it's remote wal that data is stored in Kafka. |
| `wal.dir` | String | `None` | The directory to store the WAL files.<br/>**It's only used when the provider is `raft_engine`**. |
| `wal.file_size` | String | `256MB` | The size of the WAL segment file.<br/>**It's only used when the provider is `raft_engine`**. |
| `wal.purge_threshold` | String | `4GB` | The threshold of the WAL size to trigger a flush.<br/>**It's only used when the provider is `raft_engine`**. |
| `wal.purge_interval` | String | `10m` | The interval to trigger a flush.<br/>**It's only used when the provider is `raft_engine`**. |
| `wal.dir` | String | Unset | The directory to store the WAL files.<br/>**It's only used when the provider is `raft_engine`**. |
| `wal.file_size` | String | `128MB` | The size of the WAL segment file.<br/>**It's only used when the provider is `raft_engine`**. |
| `wal.purge_threshold` | String | `1GB` | The threshold of the WAL size to trigger a flush.<br/>**It's only used when the provider is `raft_engine`**. |
| `wal.purge_interval` | String | `1m` | The interval to trigger a flush.<br/>**It's only used when the provider is `raft_engine`**. |
| `wal.read_batch_size` | Integer | `128` | The read batch size.<br/>**It's only used when the provider is `raft_engine`**. |
| `wal.sync_write` | Bool | `false` | Whether to use sync write.<br/>**It's only used when the provider is `raft_engine`**. |
| `wal.enable_log_recycle` | Bool | `true` | Whether to reuse logically truncated log files.<br/>**It's only used when the provider is `raft_engine`**. |
| `wal.prefill_log_files` | Bool | `false` | Whether to pre-create log files on start up.<br/>**It's only used when the provider is `raft_engine`**. |
| `wal.sync_period` | String | `10s` | Duration for fsyncing log files.<br/>**It's only used when the provider is `raft_engine`**. |
| `wal.broker_endpoints` | Array | -- | The Kafka broker endpoints.<br/>**It's only used when the provider is `kafka`**. |
| `wal.max_batch_bytes` | String | `1MB` | The max size of a single producer batch.<br/>Warning: Kafka has a default limit of 1MB per message in a topic.<br/>**It's only used when the provider is `kafka`**. |
| `wal.consumer_wait_timeout` | String | `100ms` | The consumer wait timeout.<br/>**It's only used when the provider is `kafka`**. |
@@ -368,24 +438,33 @@
| `wal.backoff_max` | String | `10s` | The maximum backoff delay.<br/>**It's only used when the provider is `kafka`**. |
| `wal.backoff_base` | Integer | `2` | The exponential backoff rate, i.e. next backoff = base * current backoff.<br/>**It's only used when the provider is `kafka`**. |
| `wal.backoff_deadline` | String | `5mins` | The deadline of retries.<br/>**It's only used when the provider is `kafka`**. |
| `wal.create_index` | Bool | `true` | Whether to enable WAL index creation.<br/>**It's only used when the provider is `kafka`**. |
| `wal.dump_index_interval` | String | `60s` | The interval for dumping WAL indexes.<br/>**It's only used when the provider is `kafka`**. |
| `wal.overwrite_entry_start_id` | Bool | `false` | Ignore missing entries during read WAL.<br/>**It's only used when the provider is `kafka`**.<br/><br/>This option ensures that when Kafka messages are deleted, the system<br/>can still successfully replay memtable data without throwing an<br/>out-of-range error.<br/>However, enabling this option might lead to unexpected data loss,<br/>as the system will skip over missing entries instead of treating<br/>them as critical errors. |
| `storage` | -- | -- | The data storage options. |
| `storage.data_home` | String | `/tmp/greptimedb/` | The working home directory. |
| `storage.type` | String | `File` | The storage type used to store the data.<br/>- `File`: the data is stored in the local file system.<br/>- `S3`: the data is stored in the S3 object storage.<br/>- `Gcs`: the data is stored in the Google Cloud Storage.<br/>- `Azblob`: the data is stored in the Azure Blob Storage.<br/>- `Oss`: the data is stored in the Aliyun OSS. |
| `storage.cache_path` | String | `None` | Cache configuration for object storage such as 'S3' etc.<br/>The local file cache directory. |
| `storage.cache_capacity` | String | `None` | The local file cache capacity in bytes. |
| `storage.bucket` | String | `None` | The S3 bucket name.<br/>**It's only used when the storage type is `S3`, `Oss` and `Gcs`**. |
| `storage.root` | String | `None` | The S3 data will be stored in the specified prefix, for example, `s3://${bucket}/${root}`.<br/>**It's only used when the storage type is `S3`, `Oss` and `Azblob`**. |
| `storage.access_key_id` | String | `None` | The access key id of the aws account.<br/>It's **highly recommended** to use AWS IAM roles instead of hardcoding the access key id and secret key.<br/>**It's only used when the storage type is `S3` and `Oss`**. |
| `storage.secret_access_key` | String | `None` | The secret access key of the aws account.<br/>It's **highly recommended** to use AWS IAM roles instead of hardcoding the access key id and secret key.<br/>**It's only used when the storage type is `S3`**. |
| `storage.access_key_secret` | String | `None` | The secret access key of the aliyun account.<br/>**It's only used when the storage type is `Oss`**. |
| `storage.account_name` | String | `None` | The account name of the azure account.<br/>**It's only used when the storage type is `Azblob`**. |
| `storage.account_key` | String | `None` | The account key of the azure account.<br/>**It's only used when the storage type is `Azblob`**. |
| `storage.scope` | String | `None` | The scope of the google cloud storage.<br/>**It's only used when the storage type is `Gcs`**. |
| `storage.credential_path` | String | `None` | The credential path of the google cloud storage.<br/>**It's only used when the storage type is `Gcs`**. |
| `storage.container` | String | `None` | The container of the azure account.<br/>**It's only used when the storage type is `Azblob`**. |
| `storage.sas_token` | String | `None` | The sas token of the azure account.<br/>**It's only used when the storage type is `Azblob`**. |
| `storage.endpoint` | String | `None` | The endpoint of the S3 service.<br/>**It's only used when the storage type is `S3`, `Oss`, `Gcs` and `Azblob`**. |
| `storage.region` | String | `None` | The region of the S3 service.<br/>**It's only used when the storage type is `S3`, `Oss`, `Gcs` and `Azblob`**. |
| `storage.cache_path` | String | Unset | Read cache configuration for object storage such as 'S3' etc, it's configured by default when using object storage. It is recommended to configure it when using object storage for better performance.<br/>A local file directory, defaults to `{data_home}`. An empty string means disabling. |
| `storage.cache_capacity` | String | Unset | The local file cache capacity in bytes. If your disk space is sufficient, it is recommended to set it larger. |
| `storage.bucket` | String | Unset | The S3 bucket name.<br/>**It's only used when the storage type is `S3`, `Oss` and `Gcs`**. |
| `storage.root` | String | Unset | The S3 data will be stored in the specified prefix, for example, `s3://${bucket}/${root}`.<br/>**It's only used when the storage type is `S3`, `Oss` and `Azblob`**. |
| `storage.access_key_id` | String | Unset | The access key id of the aws account.<br/>It's **highly recommended** to use AWS IAM roles instead of hardcoding the access key id and secret key.<br/>**It's only used when the storage type is `S3` and `Oss`**. |
| `storage.secret_access_key` | String | Unset | The secret access key of the aws account.<br/>It's **highly recommended** to use AWS IAM roles instead of hardcoding the access key id and secret key.<br/>**It's only used when the storage type is `S3`**. |
| `storage.access_key_secret` | String | Unset | The secret access key of the aliyun account.<br/>**It's only used when the storage type is `Oss`**. |
| `storage.account_name` | String | Unset | The account name of the azure account.<br/>**It's only used when the storage type is `Azblob`**. |
| `storage.account_key` | String | Unset | The account key of the azure account.<br/>**It's only used when the storage type is `Azblob`**. |
| `storage.scope` | String | Unset | The scope of the google cloud storage.<br/>**It's only used when the storage type is `Gcs`**. |
| `storage.credential_path` | String | Unset | The credential path of the google cloud storage.<br/>**It's only used when the storage type is `Gcs`**. |
| `storage.credential` | String | Unset | The credential of the google cloud storage.<br/>**It's only used when the storage type is `Gcs`**. |
| `storage.container` | String | Unset | The container of the azure account.<br/>**It's only used when the storage type is `Azblob`**. |
| `storage.sas_token` | String | Unset | The sas token of the azure account.<br/>**It's only used when the storage type is `Azblob`**. |
| `storage.endpoint` | String | Unset | The endpoint of the S3 service.<br/>**It's only used when the storage type is `S3`, `Oss`, `Gcs` and `Azblob`**. |
| `storage.region` | String | Unset | The region of the S3 service.<br/>**It's only used when the storage type is `S3`, `Oss`, `Gcs` and `Azblob`**. |
| `storage.http_client` | -- | -- | The http client options to the storage.<br/>**It's only used when the storage type is `S3`, `Oss`, `Gcs` and `Azblob`**. |
| `storage.http_client.pool_max_idle_per_host` | Integer | `1024` | The maximum idle connection per host allowed in the pool. |
| `storage.http_client.connect_timeout` | String | `30s` | The timeout for only the connect phase of a http client. |
| `storage.http_client.timeout` | String | `30s` | The total request timeout, applied from when the request starts connecting until the response body has finished.<br/>Also considered a total deadline. |
| `storage.http_client.pool_idle_timeout` | String | `90s` | The timeout for idle sockets being kept-alive. |
| `[[region_engine]]` | -- | -- | The region engine options. You can configure multiple region engines. |
| `region_engine.mito.num_workers` | Integer | `8` | Number of region workers. |
@@ -393,24 +472,31 @@
| `region_engine.mito.worker_request_batch_size` | Integer | `64` | Max batch size for a worker to handle requests. |
| `region_engine.mito.manifest_checkpoint_distance` | Integer | `10` | Number of meta action updated to trigger a new checkpoint for the manifest. |
| `region_engine.mito.compress_manifest` | Bool | `false` | Whether to compress manifest and checkpoint file by gzip (default false). |
| `region_engine.mito.max_background_jobs` | Integer | `4` | Max number of running background jobs |
| `region_engine.mito.max_background_flushes` | Integer | Auto | Max number of running background flush jobs (default: 1/2 of cpu cores). |
| `region_engine.mito.max_background_compactions` | Integer | Auto | Max number of running background compaction jobs (default: 1/4 of cpu cores). |
| `region_engine.mito.max_background_purges` | Integer | Auto | Max number of running background purge jobs (default: number of cpu cores). |
| `region_engine.mito.auto_flush_interval` | String | `1h` | Interval to auto flush a region if it has not flushed yet. |
| `region_engine.mito.global_write_buffer_size` | String | `1GB` | Global write buffer size for all regions. If not set, it's default to 1/8 of OS memory with a max limitation of 1GB. |
| `region_engine.mito.global_write_buffer_reject_size` | String | `2GB` | Global write buffer size threshold to reject write requests. If not set, it's default to 2 times of `global_write_buffer_size` |
| `region_engine.mito.sst_meta_cache_size` | String | `128MB` | Cache size for SST metadata. Setting it to 0 to disable the cache.<br/>If not set, it's default to 1/32 of OS memory with a max limitation of 128MB. |
| `region_engine.mito.vector_cache_size` | String | `512MB` | Cache size for vectors and arrow arrays. Setting it to 0 to disable the cache.<br/>If not set, it's default to 1/16 of OS memory with a max limitation of 512MB. |
| `region_engine.mito.page_cache_size` | String | `512MB` | Cache size for pages of SST row groups. Setting it to 0 to disable the cache.<br/>If not set, it's default to 1/16 of OS memory with a max limitation of 512MB. |
| `region_engine.mito.enable_experimental_write_cache` | Bool | `false` | Whether to enable the experimental write cache. |
| `region_engine.mito.experimental_write_cache_path` | String | `""` | File system path for write cache, defaults to `{data_home}/write_cache`. |
| `region_engine.mito.global_write_buffer_size` | String | Auto | Global write buffer size for all regions. If not set, it's default to 1/8 of OS memory with a max limitation of 1GB. |
| `region_engine.mito.global_write_buffer_reject_size` | String | Auto | Global write buffer size threshold to reject write requests. If not set, it's default to 2 times of `global_write_buffer_size` |
| `region_engine.mito.sst_meta_cache_size` | String | Auto | Cache size for SST metadata. Setting it to 0 to disable the cache.<br/>If not set, it's default to 1/32 of OS memory with a max limitation of 128MB. |
| `region_engine.mito.vector_cache_size` | String | Auto | Cache size for vectors and arrow arrays. Setting it to 0 to disable the cache.<br/>If not set, it's default to 1/16 of OS memory with a max limitation of 512MB. |
| `region_engine.mito.page_cache_size` | String | Auto | Cache size for pages of SST row groups. Setting it to 0 to disable the cache.<br/>If not set, it's default to 1/8 of OS memory. |
| `region_engine.mito.selector_result_cache_size` | String | Auto | Cache size for time series selector (e.g. `last_value()`). Setting it to 0 to disable the cache.<br/>If not set, it's default to 1/16 of OS memory with a max limitation of 512MB. |
| `region_engine.mito.enable_write_cache` | Bool | `false` | Whether to enable the write cache, it's enabled by default when using object storage. It is recommended to enable it when using object storage for better performance. |
| `region_engine.mito.write_cache_path` | String | `""` | File system path for write cache, defaults to `{data_home}`. |
| `region_engine.mito.write_cache_size` | String | `5GiB` | Capacity for write cache. If your disk space is sufficient, it is recommended to set it larger. |
| `region_engine.mito.scan_parallelism` | Integer | `0` | Parallelism to scan a region (default: 1/4 of cpu cores).<br/>- `0`: using the default value (1/4 of cpu cores).<br/>- `1`: scan in current thread.<br/>- `n`: scan in parallelism n. |
| `region_engine.mito.parallel_scan_channel_size` | Integer | `32` | Capacity of the channel to send data from parallel scan tasks to the main task. |
| `region_engine.mito.allow_stale_entries` | Bool | `false` | Whether to allow stale WAL entries read during replay. |
| `region_engine.mito.min_compaction_interval` | String | `0m` | Minimum time interval between two compactions.<br/>To align with the old behavior, the default value is 0 (no restrictions). |
| `region_engine.mito.index` | -- | -- | The options for index in Mito engine. |
| `region_engine.mito.index.aux_path` | String | `""` | Auxiliary directory path for the index in filesystem, used to store intermediate files for<br/>creating the index and staging files for searching the index, defaults to `{data_home}/index_intermediate`.<br/>The default name for this directory is `index_intermediate` for backward compatibility.<br/><br/>This path contains two subdirectories:<br/>- `__intm`: for storing intermediate files used during creating index.<br/>- `staging`: for storing staging files used during searching index. |
| `region_engine.mito.index.staging_size` | String | `2GB` | The max capacity of the staging directory. |
| `region_engine.mito.index.staging_ttl` | String | `7d` | The TTL of the staging directory.<br/>Defaults to 7 days.<br/>Setting it to "0s" to disable TTL. |
| `region_engine.mito.index.metadata_cache_size` | String | `64MiB` | Cache size for inverted index metadata. |
| `region_engine.mito.index.content_cache_size` | String | `128MiB` | Cache size for inverted index content. |
| `region_engine.mito.index.content_cache_page_size` | String | `64KiB` | Page size for inverted index content cache. |
| `region_engine.mito.inverted_index` | -- | -- | The options for inverted index in Mito engine. |
| `region_engine.mito.inverted_index.create_on_flush` | String | `auto` | Whether to create the index on flush.<br/>- `auto`: automatically (default)<br/>- `disable`: never |
| `region_engine.mito.inverted_index.create_on_compaction` | String | `auto` | Whether to create the index on compaction.<br/>- `auto`: automatically (default)<br/>- `disable`: never |
@@ -422,30 +508,43 @@
| `region_engine.mito.fulltext_index.create_on_compaction` | String | `auto` | Whether to create the index on compaction.<br/>- `auto`: automatically (default)<br/>- `disable`: never |
| `region_engine.mito.fulltext_index.apply_on_query` | String | `auto` | Whether to apply the index on query<br/>- `auto`: automatically (default)<br/>- `disable`: never |
| `region_engine.mito.fulltext_index.mem_threshold_on_create` | String | `auto` | Memory threshold for index creation.<br/>- `auto`: automatically determine the threshold based on the system memory size (default)<br/>- `unlimited`: no memory limit<br/>- `[size]` e.g. `64MB`: fixed memory threshold |
| `region_engine.mito.bloom_filter_index` | -- | -- | The options for bloom filter index in Mito engine. |
| `region_engine.mito.bloom_filter_index.create_on_flush` | String | `auto` | Whether to create the index on flush.<br/>- `auto`: automatically (default)<br/>- `disable`: never |
| `region_engine.mito.bloom_filter_index.create_on_compaction` | String | `auto` | Whether to create the index on compaction.<br/>- `auto`: automatically (default)<br/>- `disable`: never |
| `region_engine.mito.bloom_filter_index.apply_on_query` | String | `auto` | Whether to apply the index on query<br/>- `auto`: automatically (default)<br/>- `disable`: never |
| `region_engine.mito.bloom_filter_index.mem_threshold_on_create` | String | `auto` | Memory threshold for the index creation.<br/>- `auto`: automatically determine the threshold based on the system memory size (default)<br/>- `unlimited`: no memory limit<br/>- `[size]` e.g. `64MB`: fixed memory threshold |
| `region_engine.mito.memtable.index_max_keys_per_shard` | Integer | `8192` | The max number of keys in one shard.<br/>Only available for `partition_tree` memtable. |
| `region_engine.mito.memtable.data_freeze_threshold` | Integer | `32768` | The max rows of data inside the actively writing buffer in one shard.<br/>Only available for `partition_tree` memtable. |
| `region_engine.mito.memtable.fork_dictionary_bytes` | String | `1GiB` | Max dictionary bytes.<br/>Only available for `partition_tree` memtable. |
| `logging.append_stdout` | Bool | `true` | Whether to append logs to stdout. |
| `logging.log_format` | String | `text` | The log format. Can be `text`/`json`. |
| `logging.max_log_files` | Integer | `720` | The maximum amount of log files. |
| `logging.tracing_sample_ratio` | -- | -- | The percentage of tracing will be sampled and exported.<br/>Valid range `[0, 1]`, 1 means all traces are sampled, 0 means all traces are not sampled, the default value is 1.<br/>ratio > 1 are treated as 1. Fractions < 0 are treated as 0 |
| `export_metrics` | -- | -- | The datanode can export its metrics and send to Prometheus compatible service (e.g. send to `greptimedb` itself) from remote-write API.<br/>This is only used for `greptimedb` to export its own metrics internally. It's different from prometheus scrape. |
| `export_metrics.self_import` | -- | -- | For `standalone` mode, `self_import` is recommended to collect metrics generated by itself.<br/>You must create the database before enabling it. |
| `export_metrics.remote_write.url` | String | `""` | The url the metrics send to. The url example can be: `http://127.0.0.1:4000/v1/prometheus/write?db=information_schema`. |
| `export_metrics.remote_write.url` | String | `""` | The url the metrics send to. The url example can be: `http://127.0.0.1:4000/v1/prometheus/write?db=greptime_metrics`. |
| `tracing` | -- | -- | The tracing options. Only takes effect when compiled with the `tokio-console` feature. |
| `tracing.tokio_console_addr` | String | `None` | The tokio console address. |
| `tracing.tokio_console_addr` | String | Unset | The tokio console address. |
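
To show how the object-storage and cache options above combine, the following is a minimal sketch of a datanode configuration that stores data in S3 with the read and write caches sized explicitly. The bucket, root, endpoint, and region are the placeholder values from the example config, not real resources, and credentials are left out on the assumption that AWS IAM roles are used, as the table recommends.

```toml
[storage]
data_home = "/tmp/greptimedb/"
type = "S3"
# Placeholder bucket and prefix: data lands under s3://${bucket}/${root}.
bucket = "greptimedb"
root = "greptimedb"
endpoint = "https://s3.amazonaws.com"
region = "us-west-2"
# Read cache capacity; `cache_path` is left unset so it defaults to `{data_home}`.
cache_capacity = "5GiB"

[[region_engine]]
[region_engine.mito]
# Set explicitly here for illustration only.
enable_write_cache = true
# `write_cache_path` is left unset so it defaults to `{data_home}`.
write_cache_size = "5GiB"
```

If disk space allows, both `cache_capacity` and `write_cache_size` can be raised, as the descriptions above suggest.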
### Flownode
@@ -453,13 +552,19 @@
| Key | Type | Default | Descriptions |
| --- | -----| ------- | ----------- |
| `mode` | String | `distributed` | The running mode of the flownode. It can be `standalone` or `distributed`. |
| `node_id` | Integer | `None` | The flownode identifier and should be unique in the cluster. |
| `node_id` | Integer | Unset | The flownode identifier and should be unique in the cluster. |
| `flow` | -- | -- | flow engine options. |
| `flow.num_workers` | Integer | `0` | The number of flow workers in the flownode.<br/>If unset (or set to 0), the number of CPU cores divided by 2 is used. |
| `grpc` | -- | -- | The gRPC server options. |
| `grpc.addr` | String | `127.0.0.1:6800` | The address to bind the gRPC server. |
| `grpc.hostname` | String | `127.0.0.1` | The hostname advertised to the metasrv,<br/>and used for connections from outside the host |
| `grpc.bind_addr` | String | `127.0.0.1:6800` | The address to bind the gRPC server. |
| `grpc.server_addr` | String | `127.0.0.1:6800` | The address advertised to the metasrv,<br/>and used for connections from outside the host |
| `grpc.runtime_size` | Integer | `2` | The number of server worker threads. |
| `grpc.max_recv_message_size` | String | `512MB` | The maximum receive message size for gRPC server. |
| `grpc.max_send_message_size` | String | `512MB` | The maximum send message size for gRPC server. |
| `http` | -- | -- | The HTTP server options. |
| `http.addr` | String | `127.0.0.1:4000` | The address to bind the HTTP server. |
| `http.timeout` | String | `30s` | HTTP request timeout. Set to 0 to disable timeout. |
| `http.body_limit` | String | `64MB` | HTTP request body limit.<br/>The following units are supported: `B`, `KB`, `KiB`, `MB`, `MiB`, `GB`, `GiB`, `TB`, `TiB`, `PB`, `PiB`.<br/>Set to 0 to disable limit. |
| `logging.append_stdout` | Bool | `true` | Whether to append logs to stdout. |
| `logging.log_format` | String | `text` | The log format. Can be `text`/`json`. |
| `logging.max_log_files` | Integer | `720` | The maximum amount of log files. |
| `logging.tracing_sample_ratio` | -- | -- | The percentage of tracing will be sampled and exported.<br/>Valid range `[0, 1]`, 1 means all traces are sampled, 0 means all traces are not sampled, the default value is 1.<br/>ratio > 1 are treated as 1. Fractions < 0 are treated as 0 |
## Cache configuration for object storage such as 'S3' etc.
## The local file cache directory.
## +toml2docs:none-default
cache_path ="/path/local_cache"
## Read cache configuration for object storage such as 'S3' etc, it's configured by default when using object storage. It is recommended to configure it when using object storage for better performance.
## A local file directory, defaults to `{data_home}`. An empty string means disabling.
## @toml2docs:none-default
#+ cache_path = ""
## The local file cache capacity in bytes.
## +toml2docs:none-default
cache_capacity="256MB"
## The local file cache capacity in bytes. If your disk space is sufficient, it is recommended to set it larger.
## @toml2docs:none-default
cache_capacity="5GiB"
## The S3 bucket name.
## **It's only used when the storage type is `S3`, `Oss` and `Gcs`**.
## +toml2docs:none-default
## @toml2docs:none-default
bucket="greptimedb"
## The S3 data will be stored in the specified prefix, for example, `s3://${bucket}/${root}`.
## **It's only used when the storage type is `S3`, `Oss` and `Azblob`**.
## +toml2docs:none-default
## @toml2docs:none-default
root="greptimedb"
## The access key id of the aws account.
## It's **highly recommended** to use AWS IAM roles instead of hardcoding the access key id and secret key.
## **It's only used when the storage type is `S3` and `Oss`**.
## +toml2docs:none-default
## @toml2docs:none-default
access_key_id="test"
## The secret access key of the aws account.
## It's **highly recommended** to use AWS IAM roles instead of hardcoding the access key id and secret key.
## **It's only used when the storage type is `S3`**.
## +toml2docs:none-default
## @toml2docs:none-default
secret_access_key="test"
## The secret access key of the aliyun account.
## **It's only used when the storage type is `Oss`**.
## +toml2docs:none-default
## @toml2docs:none-default
access_key_secret="test"
## The account name of the azure account.
## **It's only used when the storage type is `Azblob`**.
## +toml2docs:none-default
## @toml2docs:none-default
account_name="test"
## The account key of the azure account.
## **It's only used when the storage type is `Azblob`**.
## +toml2docs:none-default
## @toml2docs:none-default
account_key="test"
## The scope of the google cloud storage.
## **It's only used when the storage type is `Gcs`**.
## +toml2docs:none-default
## @toml2docs:none-default
scope="test"
## The credential path of the google cloud storage.
## **It's only used when the storage type is `Gcs`**.
## +toml2docs:none-default
## @toml2docs:none-default
credential_path="test"
## The credential of the google cloud storage.
## **It's only used when the storage type is `Gcs`**.
## @toml2docs:none-default
credential="base64-credential"
## The container of the azure account.
## **It's only used when the storage type is `Azblob`**.
## +toml2docs:none-default
## @toml2docs:none-default
container="greptimedb"
## The sas token of the azure account.
## **It's only used when the storage type is `Azblob`**.
## +toml2docs:none-default
## @toml2docs:none-default
sas_token=""
## The endpoint of the S3 service.
## **It's only used when the storage type is `S3`, `Oss`, `Gcs` and `Azblob`**.
## +toml2docs:none-default
## @toml2docs:none-default
endpoint="https://s3.amazonaws.com"
## The region of the S3 service.
## **It's only used when the storage type is `S3`, `Oss`, `Gcs` and `Azblob`**.
## +toml2docs:none-default
## @toml2docs:none-default
region="us-west-2"
## The http client options to the storage.
## **It's only used when the storage type is `S3`, `Oss`, `Gcs` and `Azblob`**.
[storage.http_client]
## The maximum idle connection per host allowed in the pool.
pool_max_idle_per_host=1024
## The timeout for only the connect phase of a http client.
connect_timeout="30s"
## The total request timeout, applied from when the request starts connecting until the response body has finished.
## Also considered a total deadline.
timeout="30s"
## The timeout for idle sockets being kept-alive.
pool_idle_timeout="90s"
# Custom storage options
# [[storage.providers]]
# name = "S3"
# type = "S3"
# bucket = "greptimedb"
# root = "data"
# access_key_id = "test"
# secret_access_key = "123456"
# endpoint = "https://s3.amazonaws.com"
# region = "us-west-2"
# [[storage.providers]]
# name = "Gcs"
# type = "Gcs"
# bucket = "greptimedb"
# root = "data"
# scope = "test"
# credential_path = "123456"
# credential = "base64-credential"
# endpoint = "https://storage.googleapis.com"
## The region engine options. You can configure multiple region engines.
## Whether to compress manifest and checkpoint file by gzip (default false).
compress_manifest=false
## Max number of running background jobs
max_background_jobs=4
## Max number of running background flush jobs (default: 1/2 of cpu cores).
## @toml2docs:none-default="Auto"
#+ max_background_flushes = 4
## Max number of running background compaction jobs (default: 1/4 of cpu cores).
## @toml2docs:none-default="Auto"
#+ max_background_compactions = 2
## Max number of running background purge jobs (default: number of cpu cores).
## @toml2docs:none-default="Auto"
#+ max_background_purges = 8
## Interval to auto flush a region if it has not flushed yet.
auto_flush_interval="1h"
## Global write buffer size for all regions. If not set, it's default to 1/8 of OS memory with a max limitation of 1GB.
global_write_buffer_size="1GB"
## @toml2docs:none-default="Auto"
#+ global_write_buffer_size = "1GB"
## Global write buffer size threshold to reject write requests. If not set, it's default to 2 times of `global_write_buffer_size`
global_write_buffer_reject_size="2GB"
## @toml2docs:none-default="Auto"
#+ global_write_buffer_reject_size = "2GB"
## Cache size for SST metadata. Setting it to 0 to disable the cache.
## If not set, it's default to 1/32 of OS memory with a max limitation of 128MB.
sst_meta_cache_size="128MB"
## @toml2docs:none-default="Auto"
#+ sst_meta_cache_size = "128MB"
## Cache size for vectors and arrow arrays. Setting it to 0 to disable the cache.
## If not set, it's default to 1/16 of OS memory with a max limitation of 512MB.
vector_cache_size="512MB"
## @toml2docs:none-default="Auto"
#+ vector_cache_size = "512MB"
## Cache size for pages of SST row groups. Setting it to 0 to disable the cache.
## If not set, it's default to 1/8 of OS memory.
## @toml2docs:none-default="Auto"
#+ page_cache_size = "512MB"
## Cache size for time series selector (e.g. `last_value()`). Setting it to 0 to disable the cache.
## If not set, it's default to 1/16 of OS memory with a max limitation of 512MB.
page_cache_size="512MB"
## @toml2docs:none-default="Auto"
#+ selector_result_cache_size = "512MB"
## Whether to enable the experimental write cache.
enable_experimental_write_cache=false
## Whether to enable the write cache, it's enabled by default when using object storage. It is recommended to enable it when using object storage for better performance.
enable_write_cache=false
## File system path for write cache, defaults to `{data_home}/write_cache`.
experimental_write_cache_path=""
## File system path for write cache, defaults to `{data_home}`.
write_cache_path=""
## Capacity for write cache.
experimental_write_cache_size="512MB"
## Capacity for write cache. If your disk space is sufficient, it is recommended to set it larger.
write_cache_size="5GiB"
## TTL for write cache.
experimental_write_cache_ttl="1h"
## @toml2docs:none-default
write_cache_ttl="8h"
## Buffer size for SST writing.
sst_write_buffer_size="8MB"
## Parallelism to scan a region (default: 1/4 of cpu cores).
## - `0`: using the default value (1/4 of cpu cores).
## - `1`: scan in current thread.
## - `n`: scan in parallelism n.
scan_parallelism=0
## Capacity of the channel to send data from parallel scan tasks to the main task.
parallel_scan_channel_size=32
## Whether to allow stale WAL entries read during replay.
allow_stale_entries=false
## Minimum time interval between two compactions.
## To align with the old behavior, the default value is 0 (no restrictions).
min_compaction_interval="0m"
## The options for index in Mito engine.
[region_engine.mito.index]
@@ -407,6 +497,20 @@ aux_path = ""
## The max capacity of the staging directory.
staging_size="2GB"
## The TTL of the staging directory.
## Defaults to 7 days.
## Setting it to "0s" to disable TTL.
staging_ttl="7d"
## Cache size for inverted index metadata.
metadata_cache_size="64MiB"
## Cache size for inverted index content.
content_cache_size="128MiB"
## Page size for inverted index content cache.
content_cache_page_size="64KiB"
## The options for inverted index in Mito engine.
[region_engine.mito.inverted_index]
@@ -458,6 +562,30 @@ apply_on_query = "auto"
## - `[size]` e.g. `64MB`: fixed memory threshold
mem_threshold_on_create="auto"
## The options for bloom filter index in Mito engine.
[region_engine.mito.bloom_filter_index]
## Whether to create the index on flush.
## - `auto`: automatically (default)
## - `disable`: never
create_on_flush="auto"
## Whether to create the index on compaction.
## - `auto`: automatically (default)
## - `disable`: never
create_on_compaction="auto"
## Whether to apply the index on query
## - `auto`: automatically (default)
## - `disable`: never
apply_on_query="auto"
## Memory threshold for the index creation.
## - `auto`: automatically determine the threshold based on the system memory size (default)
## Cache configuration for object storage such as 'S3' etc.
## The local file cache directory.
## +toml2docs:none-default
cache_path ="/path/local_cache"
## Read cache configuration for object storage such as 'S3' etc, it's configured by default when using object storage. It is recommended to configure it when using object storage for better performance.
## A local file directory, defaults to `{data_home}`. An empty string means disabling.
## @toml2docs:none-default
#+ cache_path = ""
## The local file cache capacity in bytes.
## +toml2docs:none-default
cache_capacity="256MB"
## The local file cache capacity in bytes. If your disk space is sufficient, it is recommended to set it larger.
## @toml2docs:none-default
cache_capacity="5GiB"
## The S3 bucket name.
## **It's only used when the storage type is `S3`, `Oss` and `Gcs`**.
## +toml2docs:none-default
## @toml2docs:none-default
bucket="greptimedb"
## The S3 data will be stored in the specified prefix, for example, `s3://${bucket}/${root}`.
## **It's only used when the storage type is `S3`, `Oss` and `Azblob`**.
## +toml2docs:none-default
## @toml2docs:none-default
root="greptimedb"
## The access key id of the aws account.
## It's **highly recommended** to use AWS IAM roles instead of hardcoding the access key id and secret key.
## **It's only used when the storage type is `S3` and `Oss`**.
## +toml2docs:none-default
## @toml2docs:none-default
access_key_id="test"
## The secret access key of the aws account.
## It's **highly recommended** to use AWS IAM roles instead of hardcoding the access key id and secret key.
## **It's only used when the storage type is `S3`**.
## +toml2docs:none-default
## @toml2docs:none-default
secret_access_key="test"
## The secret access key of the aliyun account.
## **It's only used when the storage type is `Oss`**.
## +toml2docs:none-default
## @toml2docs:none-default
access_key_secret="test"
## The account name of the azure account.
## **It's only used when the storage type is `Azblob`**.
## +toml2docs:none-default
## @toml2docs:none-default
account_name="test"
## The account key of the azure account.
## **It's only used when the storage type is `Azblob`**.
## +toml2docs:none-default
## @toml2docs:none-default
account_key="test"
## The scope of the google cloud storage.
## **It's only used when the storage type is `Gcs`**.
## +toml2docs:none-default
## @toml2docs:none-default
scope="test"
## The credential path of the google cloud storage.
## **It's only used when the storage type is `Gcs`**.
## +toml2docs:none-default
## @toml2docs:none-default
credential_path="test"
## The credential of the google cloud storage.
## **It's only used when the storage type is `Gcs`**.
## @toml2docs:none-default
credential="base64-credential"
## The container of the azure account.
## **It's only used when the storage type is `Azblob`**.
## +toml2docs:none-default
## @toml2docs:none-default
container="greptimedb"
## The sas token of the azure account.
## **It's only used when the storage type is `Azblob`**.
## +toml2docs:none-default
## @toml2docs:none-default
sas_token=""
## The endpoint of the S3 service.
## **It's only used when the storage type is `S3`, `Oss`, `Gcs` and `Azblob`**.
## +toml2docs:none-default
## @toml2docs:none-default
endpoint="https://s3.amazonaws.com"
## The region of the S3 service.
## **It's only used when the storage type is `S3`, `Oss`, `Gcs` and `Azblob`**.
## +toml2docs:none-default
## @toml2docs:none-default
region="us-west-2"
## The http client options to the storage.
## **It's only used when the storage type is `S3`, `Oss`, `Gcs` and `Azblob`**.
[storage.http_client]
## The maximum idle connection per host allowed in the pool.
pool_max_idle_per_host=1024
## The timeout for only the connect phase of a http client.
connect_timeout="30s"
## The total request timeout, applied from when the request starts connecting until the response body has finished.
## Also considered a total deadline.
timeout="30s"
## The timeout for idle sockets being kept-alive.
pool_idle_timeout="90s"
# Custom storage options
# [[storage.providers]]
# name = "S3"
# type = "S3"
# bucket = "greptimedb"
# root = "data"
# access_key_id = "test"
# secret_access_key = "123456"
# endpoint = "https://s3.amazonaws.com"
# region = "us-west-2"
# [[storage.providers]]
# name = "Gcs"
# type = "Gcs"
# bucket = "greptimedb"
# root = "data"
# scope = "test"
# credential_path = "123456"
# credential = "base64-credential"
# endpoint = "https://storage.googleapis.com"
## The region engine options. You can configure multiple region engines.
## Whether to compress manifest and checkpoint file by gzip (default false).
compress_manifest=false
## Max number of running background jobs
max_background_jobs=4
## Max number of running background flush jobs (default: 1/2 of cpu cores).
## @toml2docs:none-default="Auto"
#+ max_background_flushes = 4
## Max number of running background compaction jobs (default: 1/4 of cpu cores).
## @toml2docs:none-default="Auto"
#+ max_background_compactions = 2
## Max number of running background purge jobs (default: number of cpu cores).
## @toml2docs:none-default="Auto"
#+ max_background_purges = 8
## Interval to auto flush a region if it has not flushed yet.
auto_flush_interval="1h"
## Global write buffer size for all regions. If not set, it's default to 1/8 of OS memory with a max limitation of 1GB.
global_write_buffer_size="1GB"
## @toml2docs:none-default="Auto"
#+ global_write_buffer_size = "1GB"
## Global write buffer size threshold to reject write requests. If not set, it's default to 2 times of `global_write_buffer_size`
global_write_buffer_reject_size="2GB"
## Global write buffer size threshold to reject write requests. If not set, it's default to 2 times of `global_write_buffer_size`.
## @toml2docs:none-default="Auto"
#+ global_write_buffer_reject_size = "2GB"
## Cache size for SST metadata. Setting it to 0 to disable the cache.
## If not set, it's default to 1/32 of OS memory with a max limitation of 128MB.
sst_meta_cache_size="128MB"
## @toml2docs:none-default="Auto"
#+ sst_meta_cache_size = "128MB"
## Cache size for vectors and arrow arrays. Setting it to 0 to disable the cache.
## If not set, it's default to 1/16 of OS memory with a max limitation of 512MB.
vector_cache_size="512MB"
## @toml2docs:none-default="Auto"
#+ vector_cache_size = "512MB"
## Cache size for pages of SST row groups. Setting it to 0 to disable the cache.
## If not set, it's default to 1/8 of OS memory.
## @toml2docs:none-default="Auto"
#+ page_cache_size = "512MB"
## Cache size for time series selector (e.g. `last_value()`). Setting it to 0 to disable the cache.
## If not set, it's default to 1/16 of OS memory with a max limitation of 512MB.
page_cache_size="512MB"
## @toml2docs:none-default="Auto"
#+ selector_result_cache_size = "512MB"
## Whether to enable the experimental write cache.
enable_experimental_write_cache=false
## Whether to enable the write cache, it's enabled by default when using object storage. It is recommended to enable it when using object storage for better performance.
enable_write_cache=false
## File system path for write cache, defaults to `{data_home}/write_cache`.
experimental_write_cache_path=""
## File system path for write cache, defaults to `{data_home}`.
write_cache_path=""
## Capacity for write cache.
experimental_write_cache_size="512MB"
## Capacity for write cache. If your disk space is sufficient, it is recommended to set it larger.
write_cache_size="5GiB"
## TTL for write cache.
experimental_write_cache_ttl="1h"
## @toml2docs:none-default
write_cache_ttl="8h"
## Buffer size for SST writing.
sst_write_buffer_size="8MB"
## Parallelism to scan a region (default: 1/4 of cpu cores).
## - `0`: using the default value (1/4 of cpu cores).
## - `1`: scan in current thread.
## - `n`: scan in parallelism n.
scan_parallelism=0
## Capacity of the channel to send data from parallel scan tasks to the main task.
parallel_scan_channel_size=32
## Whether to allow stale WAL entries read during replay.
allow_stale_entries=false
## Minimum time interval between two compactions.
## To align with the old behavior, the default value is 0 (no restrictions).
The goal is to test string/text support for each database. In real scenarios, this means the data source (or log data producer) has separate fields defined, or has already processed the raw input.
__Unstructured model__
The log data is inserted as a long string, and then we build a fulltext index on these strings. For example, an insert request looks like the following:
```SQL
INSERT INTO test_table (message, timestamp) VALUES ()
```
The goal is to test fuzzy search performance for each database. In real scenarios it means the log is produced by some kind of middleware and inserted directly into the database.
## Creating tables
See [here](./create_table.sql) for the `CREATE TABLE` statements for GreptimeDB and ClickHouse.
The Elasticsearch mapping is created automatically.
## Vector Configuration
We use vector to generate random log data and send inserts to databases.
Please refer to [structured config](./structured_vector.toml) and [unstructured config](./unstructured_vector.toml) for detailed configuration.
## SQLs and payloads
Please refer to [SQL query](./query.sql) for GreptimeDB and ClickHouse, and [query payload](./query.md) for Elasticsearch.
## Steps to reproduce
0. Decide whether to run the structured model test or the unstructured model test.
1. Build the vector binary (see vector's config file for the specific branch) and the database binaries accordingly.
2. Create the tables in GreptimeDB and ClickHouse in advance.
3. Run vector to insert data.
4. When data insertion is finished, run queries against each database. Note: you'll need to update the time range value after data insertion.
## Additional notes
- You can tune GreptimeDB's configuration to get better performance.
- You can set up GreptimeDB to use S3 as storage, see [here](https://docs.greptime.com/user-guide/deployments/configuration#storage-options).
```
Log Level changed from Some("info") to "trace,flow=debug"
```
The data is a string in the format `global_level,module1=level1,module2=level2,...`, which follows the same rules as `RUST_LOG`.
The module is the module name of the log, and the level is the log level. The log level can be one of the following: `trace`, `debug`, `info`, `warn`, `error`, `off` (case-insensitive).
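The format above can be illustrated with a small parser. This is a minimal sketch using only the standard library, not the server's actual implementation: the names `Level`, `LogFilter`, and `parse_log_filter` are hypothetical, and it assumes that a token without `=` is the global level, which is a simplification of the full `RUST_LOG` rules.

```rust
use std::collections::BTreeMap;

/// Log levels accepted in the filter string (matched case-insensitively).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Level {
    Trace,
    Debug,
    Info,
    Warn,
    Error,
    Off,
}

/// Parses a single level name, ignoring case and surrounding whitespace.
fn parse_level(s: &str) -> Result<Level, String> {
    match s.trim().to_ascii_lowercase().as_str() {
        "trace" => Ok(Level::Trace),
        "debug" => Ok(Level::Debug),
        "info" => Ok(Level::Info),
        "warn" => Ok(Level::Warn),
        "error" => Ok(Level::Error),
        "off" => Ok(Level::Off),
        other => Err(format!("unknown log level: {other}")),
    }
}

/// Parsed form of `global_level,module1=level1,module2=level2,...`.
/// (Hypothetical type, for illustration only.)
#[derive(Debug, PartialEq, Eq)]
struct LogFilter {
    global: Option<Level>,
    modules: BTreeMap<String, Level>,
}

/// Splits the string on commas; `module=level` entries become per-module
/// overrides, and a bare level is treated as the global level (assumption).
fn parse_log_filter(input: &str) -> Result<LogFilter, String> {
    let mut filter = LogFilter {
        global: None,
        modules: BTreeMap::new(),
    };
    for part in input.split(',').map(str::trim).filter(|p| !p.is_empty()) {
        match part.split_once('=') {
            Some((module, level)) => {
                filter
                    .modules
                    .insert(module.trim().to_string(), parse_level(level)?);
            }
            None => filter.global = Some(parse_level(part)?),
        }
    }
    Ok(filter)
}
```

With this sketch, `parse_log_filter("trace,flow=debug")` yields a global level of `Trace` with a `Debug` override for the `flow` module, and an unrecognized level name is reported as an error.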
This RFC proposes a method for storing and querying JSON data in the database.
# Motivation
JSON is widely used across various scenarios. Direct support for writing and querying JSON can significantly enhance the database's flexibility.
# Details
## Storage and Query
GreptimeDB's type system is built on Arrow/DataFusion, where each data type in GreptimeDB corresponds to a data type in Arrow/DataFusion. The proposed JSON type will be implemented on top of the existing `Binary` type, leveraging the current `datatype::value::Value` and `datatype::vectors::BinaryVector` implementations, utilizing the JSONB format as the encoding of JSON data. JSON data is stored and processed similarly to binary data within the storage layer and query engine.
This approach brings problems when dealing with insertions and queries of JSON columns.
## Insertion
Users commonly write JSON data as strings. Thus we need to make conversions between string and JSONB. There are 2 ways to do this:
1. MySQL and PostgreSQL servers provide auto-conversions between strings and JSONB. When a string is inserted into a JSON column, the server will try to parse the string as JSON and convert it to JSONB. The non-JSON strings will be rejected.
2. A function `parse_json` is provided to convert string to JSONB. If the string is not a valid JSON string, the function will return an error.
For example, in MySQL client:
```SQL
CREATE TABLE IF NOT EXISTS test (
    ts TIMESTAMP TIME INDEX,
    a INT,
    b JSON
);

INSERT INTO test VALUES (
    0,
    0,
    '{
        "name": "jHl2oDDnPc1i2OzlP5Y",
        "timestamp": "2024-07-25T04:33:11.369386Z",
        "attributes": { "event_attributes": 48.28667 }
    }'
);

INSERT INTO test VALUES (
    0,
    0,
    parse_json('{
        "name": "jHl2oDDnPc1i2OzlP5Y",
        "timestamp": "2024-07-25T04:33:11.369386Z",
        "attributes": { "event_attributes": 48.28667 }
    }')
);
```
Both statements are valid.
The dataflow of the insertion process is as follows:
```
                              ┌──────────┐                       ┌─────┐                   ┌──────┐
Client ---------------------->│  Server  │---------------------->│ UDF │------------------>│ Mito │------------------> Storage
                              └──────────┘                       └─────┘                   └──────┘
                                        (Conversion is performed by UDF inside Query Engine)
```
Servers identify JSON columns through the column schema and perform auto-conversions. However, when prepared statements and bound parameters are used, the cached DataFusion plans generated by the prepared statements cannot identify JSON columns. In this case, the servers identify JSON columns through the given parameters and perform auto-conversions.
The following is an example of inserting JSON data through prepared statements:
```Rust
// Assumes `pool` is an existing sqlx connection pool and `i` is the value
// bound to the `ts` column; both are defined elsewhere.
sqlx::query("create table test(ts timestamp time index, j json)")
    .execute(&pool)
    .await
    .unwrap();

let json = serde_json::json!({
    "code": 200,
    "success": true,
    "payload": {
        "features": [
            "serde",
            "json"
        ],
        "homepage": null
    }
});

// Valid: the server can identify `serde_json::Value` as the JSON type.
sqlx::query("insert into test values($1, $2)")
    .bind(i)
    .bind(json.clone()) // clone so that `json` can be reused below
    .execute(&pool)
    .await
    .unwrap();

// Invalid: the server cannot identify `String` as the JSON type.
sqlx::query("insert into test values($1, $2)")
    .bind(i)
    .bind(json.to_string())
    .execute(&pool)
    .await
    .unwrap();
```
## Query
Correspondingly, users prefer to display JSON data as strings, so JSON data must be converted to strings before it is presented. There are also 2 ways to do this: auto-conversions on MySQL and PostgreSQL servers, and the function `json_to_string`.
For example, in MySQL client:
```SQL
SELECT b FROM test;
SELECT json_to_string(b) FROM test;
```
Both queries return the JSON as human-readable strings.
Specifically, to perform auto-conversions, we attach a message to JSON data in the `metadata` of `Field` in the Arrow/DataFusion schema when scanning a JSON column. Frontend servers can then identify JSON data and convert it to strings.
```
                              ┌──────────┐                       ┌──────────────┐
Client <----------------------│  Server  │<----------------------│ Query Engine │<----------------- Storage
                              └──────────┘                       └──────────────┘
                                        (Conversion is performed by UDF inside Query Engine)
```
However, if a function uses JSON type as its return type, the metadata method mentioned above is not applicable. Thus the functions of JSON type should specify the return type explicitly instead of returning a JSON type, such as `json_get_int` and `json_get_float` which return corresponding data of `INT` and `FLOAT` type respectively.
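The metadata-based identification described above can be sketched as follows. This is a self-contained stand-in written against the standard library rather than the real Arrow API: the `Field` struct, the metadata key `example:logical_type`, the value `json`, and the `render` helper are all hypothetical names chosen for illustration, since the RFC does not specify them.

```rust
use std::collections::HashMap;

// Hypothetical metadata key and value; the RFC does not name them.
const TYPE_KEY: &str = "example:logical_type";
const JSON_TYPE: &str = "json";

/// Simplified stand-in for an Arrow `Field`: a name plus string metadata.
#[derive(Debug, Clone)]
struct Field {
    name: String,
    metadata: HashMap<String, String>,
}

impl Field {
    fn new(name: &str) -> Self {
        Field {
            name: name.to_string(),
            metadata: HashMap::new(),
        }
    }

    /// Attaches the JSON marker, as would happen when scanning a JSON column.
    fn with_json_marker(mut self) -> Self {
        self.metadata
            .insert(TYPE_KEY.to_string(), JSON_TYPE.to_string());
        self
    }

    /// Returns true if the field carries the JSON marker.
    fn is_json(&self) -> bool {
        self.metadata.get(TYPE_KEY).map(String::as_str) == Some(JSON_TYPE)
    }
}

/// Renders one binary value for the client: fields marked as JSON are
/// decoded to a string, other binary fields are shown as raw bytes.
fn render(field: &Field, raw: &[u8], decode_jsonb: impl Fn(&[u8]) -> String) -> String {
    if field.is_json() {
        decode_jsonb(raw)
    } else {
        format!("{raw:?}")
    }
}
```

The sketch also shows why the approach does not extend to function results: a value produced by a function has no scanned column behind it, so nothing attaches the marker and the server would treat the result as plain binary.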
## Functions
Similar to the common JSON type, JSON data can be queried with functions.
As a general purpose JSON data type, JSONB may not be as efficient as specialized data types for specific scenarios.
The auto-conversion mechanism is not supported in all scenarios. We need to find workarounds for these scenarios.
# Alternatives
Extract and flatten JSON schema to store in a structured format through pipeline. For nested data, we can provide nested types like `STRUCT` or `ARRAY`.
@@ -5,6 +5,13 @@ GreptimeDB's official Grafana dashboard.
Status notice: we are still working on this configuration, and it is expected to change frequently in the coming days. Please feel free to submit your feedback and/or contributions to this dashboard 🤗
If you use Helm [chart](https://github.com/GreptimeTeam/helm-charts) to deploy GreptimeDB cluster, you can enable self-monitoring by setting the following values in your Helm chart:
- `monitoring.enabled=true`: Deploys a standalone GreptimeDB instance dedicated to monitoring the cluster;
- `grafana.enabled=true`: Deploys Grafana and automatically imports the monitoring dashboard;
The standalone GreptimeDB instance will collect metrics from your cluster and the dashboard will be available in the Grafana UI. For detailed deployment instructions, please refer to our [Kubernetes deployment guide](https://docs.greptime.com/nightly/user-guide/deployments/deploy-on-kubernetes/getting-started).
# How to use
## `greptimedb.json`
@@ -25,7 +32,7 @@ Please ensure the following configuration before importing the dashboard into Gr
__1. Prometheus scrape config__
Assign `greptime_pod` label to each host target. We use this label to identify each node instance.
Configure Prometheus to scrape the cluster.
```yml
# example config
```
@@ -34,27 +41,15 @@ Assign `greptime_pod` label to each host target. We use this label to identify e
echo "Error: The rust toolchain '$RUST_TOOLCHAIN_VERSION_IN_BUILDER' in builder '$DEV_BUILDER_UBUNTU_IMAGE' may be outdated, please update it to '$CURRENT_VERSION'"