* feat: v2 schema handling Signed-off-by: jeremyhi <fengjiachun@gmail.com> * feat: impl m1.5 ddl export/import and schema tests Signed-off-by: jeremyhi <fengjiachun@gmail.com> * chore: git ignore update Signed-off-by: jeremyhi <fengjiachun@gmail.com> * chore: add license header Signed-off-by: jeremyhi <fengjiachun@gmail.com> * chore: make fmt-check happy Signed-off-by: jeremyhi <fengjiachun@gmail.com> * fix: Run imported DDL against the intended schema Signed-off-by: jeremyhi <fengjiachun@gmail.com> * fix: Canonicalize schema names after case-insensitive check Signed-off-by: jeremyhi <fengjiachun@gmail.com> * fix: escape sql funcs Signed-off-by: jeremyhi <fengjiachun@gmail.com> * fix: Fixed by carrying explicit execution_schema in DdlStatement instead of parsing schema from SQL Signed-off-by: jeremyhi <fengjiachun@gmail.com> * fix: Fixed by encoding schema names as safe path segments in shared DDL path helpers Signed-off-by: jeremyhi <fengjiachun@gmail.com> * refactor(cli): make export/import v2 schema recovery DDL-only Signed-off-by: jeremyhi <fengjiachun@gmail.com> * chore: by clippy Signed-off-by: jeremyhi <fengjiachun@gmail.com> * chore: follow our styling Signed-off-by: jeremyhi <fengjiachun@gmail.com> * fix(cli): reject remote snapshot URIs with empty root Signed-off-by: jeremyhi <fengjiachun@gmail.com> * fix(cli): dedupe schema filters after canonicalization Signed-off-by: jeremyhi <fengjiachun@gmail.com> * fix(cli): schema-scoped detection to cover external tables Signed-off-by: jeremyhi <fengjiachun@gmail.com> --------- Signed-off-by: jeremyhi <fengjiachun@gmail.com>
14 KiB
Summary
This RFC proposes a redesigned export/import system (V2) for GreptimeDB that addresses fundamental issues in the current implementation. The new design leverages time-series characteristics for efficient chunking, provides clear storage semantics, and ensures data reliability through comprehensive validation mechanisms.
Motivation
Problems with V1
The current export/import implementation has several critical issues:
- Ambiguous path semantics:
output_dirserves dual purposes (local vs remote), causing confusion - No chunking strategy: Large databases cannot be exported/imported efficiently
- Poor reliability: No resume capability, no progress tracking
- Limited format support: Relies on SQL dumps, not optimized for time-series data
Goals
- Clear semantics: Explicit distinction between remote storage and server-local paths
- Scalability: Support TB-scale databases through time-based chunking
- Reliability: Resume capability, progress tracking, data integrity verification
- Performance: Streaming export/import using native COPY DATABASE
Non-Goals
- Schema evolution/migration (out of scope)
- Cross-version compatibility (V2 does not support V1 format)
- Real-time replication (use dedicated replication mechanism)
Guide-level Explanation
Core Concepts
Snapshot
A snapshot represents the complete state of a GreptimeDB catalog at a point in time, including schemas and data.
Snapshot structure:
snapshot-20250101/
├── manifest.json # Snapshot metadata and chunk index
├── schema/
│ ├── schemas.json # Schema definitions (JSON)
│ ├── tables.json # Table definitions (JSON)
│ └── views.json # View definitions (JSON)
└── data/
├── 1/
│ ├── public.metrics.parquet
│ └── public.logs.parquet
├── 2/
│ ├── public.metrics.parquet
│ └── public.logs.parquet
└── 3/
├── public.metrics.parquet
└── public.logs.parquet
Key properties:
- Self-contained (all information needed for restore)
- Immutable (content never changes after creation)
- Verifiable (checksums at file, chunk, and snapshot levels)
- Schema-only snapshots contain only
manifest.jsonandschema/;data/is absent,chunksis empty, and later data append is rejected (use--forceto recreate)
Chunk
A chunk is a time-range partition of data. Each chunk is independently exportable/importable and retryable.
Chunk properties:
- Has explicit
start_timeandend_time(recorded in manifest) - Non-overlapping with other chunks
- Covers a contiguous time range
- Independent (can be exported/imported in any order)
- Atomic (either fully succeeds or fully fails)
Chunk directory naming:
- Sequential numbers:
1/,2/,3/, ... - Time ranges are recorded in manifest.json
Storage Types
V2 supports two storage types:
| Type | Example | Use Case |
|---|---|---|
| Remote Storage | s3://bucket/snapshots |
Production (recommended) |
| Server Path | file:///data/backup |
Local dev/testing |
Important: Local paths (e.g., /tmp/export, ./backup) are not supported because schema export (CLI) and data export (server) run in different processes, which would split the snapshot across two machines.
Basic Usage
Export
# Full snapshot to S3
greptime export create \
--to s3://my-bucket/snapshots/prod-20250101
# Incremental snapshot (time range)
greptime export create \
--start-time 2024-12-01T00:00:00Z \
--end-time 2024-12-31T23:59:59Z \
--to s3://my-bucket/snapshots/prod-december
# Schema-only export
greptime export create \
--schema-only \
--to s3://my-bucket/snapshots/prod-schema-only
Schema-only snapshots cannot be resumed with data; use `--force` to recreate.
# Export with specific format (default: parquet)
greptime export create \
--format csv \
--to s3://my-bucket/snapshots/prod-csv
# Resume interrupted export (automatic if snapshot exists)
greptime export create \
--to s3://my-bucket/snapshots/prod-20250101
# Force recreate (delete existing and start over)
greptime export create \
--to s3://my-bucket/snapshots/prod-20250101 \
--force
Import
# Full import
greptime import \
--from s3://my-bucket/snapshots/prod-20250101
# Partial import (selected schemas)
greptime import \
--from s3://my-bucket/snapshots/prod-20250101 \
--schemas public,private
# Dry-run (verify without importing)
greptime import \
--from s3://my-bucket/snapshots/prod-20250101 \
--dry-run
Reference-level Explanation
Architecture
The export/import system consists of four main components:
- CLI: Command parsing, progress display, state management
- Coordinator: Snapshot planning, chunk scheduling, retry logic
- Schema Engine: DDL extraction, JSON serialization, schema validation
- Data Engine: Time-based chunking, streaming export/import via COPY DATABASE
All components use OpenDAL for storage abstraction, supporting S3, OSS, GCS, Azure Blob, and local filesystem.
Data Format
Manifest File
The manifest is a JSON file containing snapshot metadata and chunk index:
Key fields:
snapshot_id: Unique identifier (UUID)catalog,schemas: Catalog and schema listtime_range: Overall time range coveredschema_only: Whether the snapshot contains schema onlychunks[]: Array of chunk metadataformat: Data format for exported fileschecksum: Snapshot-level SHA256 checksum
Chunk metadata structure:
Each chunk entry in the manifest contains:
id: Chunk identifier (sequential number)time_range: Start and end timestampsstatus: Export status (Pending, InProgress, Completed, Failed)files: List of data files in the chunk directorychecksum: Chunk-level checksum for integrity verification
Schema Files
Schema definitions are stored as JSON (not SQL) for better version compatibility and programmatic processing.
Why JSON instead of SQL?
- Version-agnostic (can handle schema evolution)
- Programmatically processable (direct deserialization)
- Extensible (easy to add new fields)
Data Files
Data is exported via COPY DATABASE, supporting multiple formats:
- Parquet (default): Columnar format, efficient compression, recommended for production use
- CSV: Human-readable, universally compatible, useful for debugging and third-party integration
- JSON: Structured text format, flexible schema representation
- Other formats supported by COPY DATABASE
Format is specified via --format flag and recorded in manifest.json. Import automatically detects the format from manifest.
Core Design Decisions
1. Storage Path Validation
Export/import operations validate storage paths to prevent misconfigurations:
Path types:
s3://,oss://,gs://,azblob://→ Remote storage (recommended)file:///→ Server-local path (only allowed when CLI and server are co-located)/path,./path→ Rejected (would split snapshot across machines)
Validation:
- Detect path type by URI scheme
- For
file:///, verify server endpoint resolves to localhost or local IP - Reject bare paths without URI scheme
2. Time-based Chunking
Data is partitioned into time-range chunks for efficient parallel processing and retry.
Algorithm:
- Generate non-overlapping half-open intervals with configurable time window (default: 1 day)
- Chunks are numbered sequentially (1, 2, 3, ...)
- Each chunk's time range is recorded in manifest.json
- Empty chunks (no data in time window) are skipped and not recorded in manifest
- Guarantees: no gaps, no overlaps, complete coverage of time range
Chunk time window selection:
The optimal chunk time window depends on data density (volume per unit time):
- Target: 100MB - 1GB per chunk (balances parallelism and retry cost)
- Default: 1 day (suitable for most workloads)
- Recommendations:
- High density (>1GB/day): Use smaller windows like 1h, 6h, or 12h
- Low density (<100MB/day): Use larger windows like 7d or 30d
- Time windows can be adjusted flexibly (not required to align to day boundaries)
Example: 500GB database spanning 30 days → ~16.7GB/day → use 1h chunks → ~695MB/chunk
3. Data Export via COPY DATABASE
V2 leverages the existing COPY DATABASE TO for data export, with additional tooling layer for chunking, resume, and metadata management.
How it works:
- Export tool generates chunks based on time range and chunk window
- For each chunk, calls COPY DATABASE with specific time range:
COPY DATABASE <schema> TO '<chunk_path>' WITH ( START_TIME = '<chunk_start>', END_TIME = '<chunk_end>', FORMAT = 'parquet' ) - Export tool records chunk metadata, calculates checksums, and updates manifest
Separation of concerns:
- COPY DATABASE (data layer): Streaming export, format support, time filtering
- Export tool (tooling layer): Chunking, resume, manifest management, checksum calculation, schema export
4. Data Integrity
Three-layer checksum validation ensures data integrity:
- File-level: SHA256 of each Parquet file
- Chunk-level: Aggregate checksum of all files in a chunk
- Snapshot-level: Aggregate checksum of all chunks
Checksums are verified during import before data is written to the database.
5. Retry and Resume
Chunk-level retry:
- Each chunk is an independent unit of work
- Failed chunks are retried with exponential backoff (capped at 5 minutes)
- Successful chunks are never re-exported
Resume capability:
- Manifest tracks chunk status (Pending, InProgress, Completed, Failed)
- Export/import automatically resumes when executed on existing snapshot
- Skips completed chunks, retries failed/in-progress chunks, processes pending chunks
- Works across process restarts
- Use
--force(export only) to delete existing snapshot and start over
6. Concurrent Export Safety
Scenario 1: Export to different paths ✅
- Multiple exports can run simultaneously to different storage locations
- No conflicts (only reads database, writes to different paths)
Scenario 2: Export to same path ⚠️
- Concurrent exports to the same path can corrupt the snapshot
- With default resume behavior: both processes will resume the same export (usually safe)
- Race condition exists during initial manifest creation, but unlikely in practice
- Recommendation: Include timestamp in snapshot path to avoid conflicts
CLI Interface
Export Command
greptime export create [OPTIONS] --to <LOCATION>
Required:
--to <LOCATION> Target storage location
Optional:
--catalog <CATALOG> Catalog name (default: greptime)
--schemas <SCHEMAS> Comma-separated schema list (default: all)
--start-time <TIMESTAMP> Time range start (default: earliest)
--end-time <TIMESTAMP> Time range end (default: now)
--chunk-time-window <DURATION> Chunk time window (default: 1d)
--parallelism <N> Concurrency level (default: 1)
--format <FORMAT> Export format for data file: parquet (default), csv, json, or other formats supported by COPY DATABASE
--schema-only Export schema only, no data
--force Delete existing snapshot and recreate
Behavior:
- If snapshot doesn't exist: create new snapshot
- If snapshot exists: automatically resume export (skip completed chunks,
retry failed chunks, process pending chunks)
- If --force is specified: delete existing snapshot first, then create new one
Import Command
greptime import [OPTIONS] --from <SNAPSHOT>
Required:
--from <SNAPSHOT> Source snapshot location
Optional:
--catalog <CATALOG> Catalog name (default: greptime)
--schemas <SCHEMAS> Comma-separated schema list (default: all)
--parallelism <N> Concurrency level (default: 1)
--dry-run Verify without importing
--time-range <RANGE> Import partial time range only
Behavior:
- Automatically resumes if import was previously interrupted
- Skips completed chunks, retries failed chunks, processes pending chunks
Management Commands
# List snapshots
greptime export list --location s3://bucket/snapshots
# Verify snapshot integrity
greptime export verify --snapshot s3://bucket/snapshots/prod-20250101
# Delete snapshot
greptime export delete --snapshot s3://bucket/snapshots/old-snapshot
Drawbacks
- Breaking change: V2 does not support V1 format (migration required)
- Storage dependency: Relies on object storage for production use
- Time-series assumption: Assumes data has time index (not suitable for non-time-series tables)
- Chunk granularity: Very sparse data may result in many small chunks
Rationale and Alternatives
Why time-based chunking?
Alternatives considered:
- Size-based chunking: Requires scanning data to estimate size (double I/O overhead)
- Adaptive chunking: Complex, requires metadata scanning, marginal benefit
Decision: Fixed time-window chunking
- Simple and predictable
- Aligns with time-series data characteristics
- User can adjust based on data density
- No pre-scanning required
Why JSON for schema, not SQL?
Alternatives: SQL dumps (V1 approach)
Decision: JSON schema format
- Version-agnostic (can add compatibility layer)
- Programmatically processable
- Easier to validate and transform
- Avoids SQL dialect issues
Unresolved Questions
- Cross-version restore: Should V2 support restoring to older GreptimeDB versions?
- Partial schema export: Should we support table-level filtering (not just schema-level)?
Future Possibilities
- Incremental backup: Export only data changes since last backup (requires WAL integration)
- Parallel chunk processing: Export/import multiple chunks simultaneously (requires careful resource management)
- Snapshot metadata service: Centralized snapshot registry and lifecycle management