Commit Graph

7 Commits

Author SHA1 Message Date
discord9
ba15a9c056 feat: support pending flow metadata with defer_on_missing_source (#8124)
* feat: support defer_on_missing_source for pending flow creation

Add `defer_on_missing_source` flow option that allows creating flows
even when source tables do not yet exist. The flow enters a pending
state and is automatically activated when source tables become available.

Key changes:
- New `FlowStatus::PendingSources` and fields in `FlowInfoValue` for
  unresolved source table names and last activation error
- `defer_on_missing_source` create-time-only option: stripped from
  runtime/flownode `CreateRequest` but preserved in metadata for
  SQL round-trip (`SHOW CREATE FLOW`, `information_schema.flows`)
- `CreateFlowProcedure` creates pending metadata when sources are
  missing and `defer_on_missing_source=true`; falls back to
  `FlowType::Batching` for missing-source flows
- `PendingFlowReconcileManager` in meta-srv periodically checks
  pending flows and activates them when source tables resolve
- `ActivatePendingFlowProcedure` handles activation: allocates peers,
  creates flows on flownodes, updates metadata, invalidates cache
- `OR REPLACE` properly handles pending<->active transitions,
  including peer allocation and flownode flow teardown
- `FlowMetadataAllocator::alloc_peers` for peer allocation at
  activation time
- Validated flow options: only `defer_on_missing_source` allowed;
  unknown options rejected
- Known issue: standalone mode does not support flownodes, so
  pending flow flush/sink behavior covered only in distributed
  sqlness; operator and meta unit tests cover activation logic

Tests:
- operator `determine_flow_type_for_source_state` (3 passed)
- common-meta `create_flow` (19 passed) including replacement
- common-meta `activate_flow` (4 passed)
- meta-srv `flow` (11 passed)
- sqlness: `flow_pending` covers create/replace/round-trip

Signed-off-by: discord9 <discord9@163.com>

* chore: simplify pending flow PR scope

Reduce PR #8124 to the metadata-only MVP after complexity review.

Changes:
- Remove automatic activation procedure and meta-srv reconcile wiring
- Remove activation tests and activation-only metadata fields
- Reject cross-state pending<->active `OR REPLACE` transitions for now
- Keep pending metadata creation and SQL round-trip behavior
- Allow `DROP FLOW` for pending flows without routes
- Reduce flow_pending sqlness to metadata/round-trip/drop coverage only

Deferred follow-ups are documented locally in `.tmp/tasks/pending-defer-semantics/deferred-followups.md` and intentionally not committed.

Tests:
- `cargo test -p operator determine_flow_type_for_source_state`
- `cargo test -p common-meta create_flow`
- `cargo test -p common-meta drop_flow`
- `cargo sqlness bare --test-filter flow_pending --bins-dir /mnt/nvme_rust/rust-targets/pending_defer/debug`

Signed-off-by: discord9 <discord9@163.com>

* test: cover pending flow metadata edge cases

Signed-off-by: discord9 <discord9@163.com>

* test: fix pending flow metadata test lint

Signed-off-by: discord9 <discord9@163.com>

* docs: document pending flow metadata fields

Signed-off-by: discord9 <discord9@163.com>

* chore: more sleep when test

Signed-off-by: discord9 <discord9@163.com>

---------

Signed-off-by: discord9 <discord9@163.com>
2026-05-29 06:59:21 +00:00
LFC
b2074e3863 chore: upgrade DataFusion family, again (#7578)
* chore: upgrade DataFusion family

Signed-off-by: luofucong <luofc@foxmail.com>

* chore: switch to released version of datafusion-pg-catalog

---------

Signed-off-by: luofucong <luofc@foxmail.com>
Co-authored-by: Ning Sun <sunning@greptime.com>
Co-authored-by: Ning Sun <sunng@protonmail.com>
2026-03-03 07:36:39 +00:00
discord9
9b5baa965c feat: truly limit time range by split window (#6295)
* feat: actually split window to limit time range

feat: truly limit time range by split window

Update src/flow/src/batching_mode/state.rs

Co-authored-by: Lei, HUANG <6406592+v0y4g3r@users.noreply.github.com>
Signed-off-by: discord9 <discord9@163.com>

* chore: added stalled time window range

Signed-off-by: discord9 <discord9@163.com>

* fix: not flush all time range as too expensive

Signed-off-by: discord9 <discord9@163.com>

* test: make it more robust

Signed-off-by: discord9 <discord9@163.com>

* what

Signed-off-by: discord9 <discord9@163.com>

* feat: denfensively handle surplus

Signed-off-by: discord9 <discord9@163.com>

* refactor: per review,explain flush flow

Signed-off-by: discord9 <discord9@163.com>

* chore: per bugbot

Signed-off-by: discord9 <discord9@163.com>

* fix: a temp fix to make mirror insert go first(still need better fix to sync with mirror insert that happens before

Signed-off-by: discord9 <discord9@163.com>

* chore: add todo

Signed-off-by: discord9 <discord9@163.com>

---------

Signed-off-by: discord9 <discord9@163.com>
Co-authored-by: Lei, HUANG <6406592+v0y4g3r@users.noreply.github.com>
2025-07-04 03:37:43 +00:00
discord9
bdd44fd7ec chore: only retry when retry-able in flow (#5987)
* chore: only retry when retry-able

* chore: revert dbg change

* refactor: per review

* fix: check for available frontend first

* docs: more explain&longer timeout&feat: more retry at every level&try send select 1

* fix: use `sql` method for "SELECT 1"

* fix: also put recover flows in spawned task and a dead loop

* test: update transient error in flow rebuild test

* chore: sleep after sqlness sleep

* chore: add a warning

* chore: wait even more time after reboot
2025-04-28 09:49:49 +00:00
discord9
a0900f5b90 feat(flow): use batching mode&fix sqlness (#5903)
* feat: use flow batching engine

broken: try using logical plan

fix: use dummy catalog for logical plan

fix: insert plan exec&sqlness grpc addr

feat: use frontend instance in flownode in standalone

feat: flow type in metasrv&fix: flush flow out of sync& column name alias

tests: sqlness update

tests: sqlness flow rebuild udpate

chore: per review

refactor: keep chnl mgr

refactor: use catalog mgr for get table

tests: use valid sql

fix: add more check

refactor: put flow type determine to frontend

* chore: update proto

* chore: update proto to main branch

* fix: add locks for create/drop flow&docs: update docs

* feat: flush_flow flush all ranges now

* test: add align time window test

* docs: explain `nodeid` use in check task

* refactor: AddAutoColumnRewriter check for Projection

* refactor: per review

* fix: query without time window also clean dirty time window

* chore: better logging

* chore: add comments per review

* refactor: per review

* chore: per review

* chore: per review rename args

* refactor: per review partially

* chore: update docs

* chore: use better error variant

* chore: better error variant

* refactor: rename FlowWorkerManager to FlowStreamingEngine

* rename again

* refactor: per review

* chore: rebase after #5963 merged

* refactor: rename all flow_worker_manager occurs

* docs: rm resolved TODO
2025-04-23 15:12:16 +00:00
discord9
79ed7bbc44 fix: store flow query ctx on creation (#5963)
* fix: store flow schema on creation

* chore: update sqlness

* refactor: save the entire query context to flow info

* chore: sqlness update

* chore: rm pub

* fix: keep old version compatibility
2025-04-23 09:59:09 +00:00
discord9
043d0bd7c2 test: flow rebuild (#5162)
* tests: rebuild flow

* tests: more rebuild

* tests: restart

* chore: drop clean
2024-12-16 12:25:23 +00:00