mirror of
https://github.com/neondatabase/neon.git
synced 2026-05-27 01:50:38 +00:00
make TenantState::{Loading,Attaching,Activating} owned by spawn_load / spawn_attach
See the Mermaid diagram in the doc comment for the now-possible state transitions.
The two core insights / changes are:
- spawn_load and spawn_attach own the tenant state until they're done
- once the load()/attach() calls are done
  - if they failed, transition the tenant to Broken directly (we know
    that there's no background activity because we didn't call activate yet)
  - if they succeeded, call activate. We can make it infallible. How? Later.
- set_broken() and set_stopping() are changed to wait for spawn_load() /
  spawn_attach() to finish. This sounds scary because it might hinder
  detach or shutdown, but actually, concurrent attach+detach,
  attach+shutdown, or load+shutdown were just racy.
  With this change, they're not anymore.
We can add a CancellationToken stored in Tenant for load/attach and cancel
it from set_stopping() or set_broken() if necessary in the future.
So why can activate() be infallible now? Because we declare that
spawn_load and spawn_attach own the tenant state until they're done,
and we enforce that ownership using the wait_for at the start of
set_stopping and set_broken.
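
The ownership rule described above can be sketched with the standard library alone. This is a minimal, blocking analogue and not the pageserver's code: the `Tenant` struct, `finish_load`, and `load_then_stop` are hypothetical names, and a `Mutex` plus `Condvar` stands in for whatever mechanism the real `wait_for` uses. It only illustrates why a shutdown request that waits for the loader can no longer race with it.

```rust
use std::sync::{Arc, Condvar, Mutex};
use std::thread;

#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub enum TenantState {
    Loading,
    Attaching,
    Activating,
    Active,
    Stopping,
    Broken,
}

/// Hypothetical stand-in for a tenant: the state sits behind a mutex, and the
/// condition variable is notified on every state change.
pub struct Tenant {
    state: Mutex<TenantState>,
    changed: Condvar,
}

impl Tenant {
    pub fn new(initial: TenantState) -> Arc<Self> {
        Arc::new(Tenant {
            state: Mutex::new(initial),
            changed: Condvar::new(),
        })
    }

    pub fn current_state(&self) -> TenantState {
        *self.state.lock().unwrap()
    }

    /// Called only by the load task, which owns the state until it is done.
    fn finish_load(&self, result: Result<(), String>) {
        let mut state = self.state.lock().unwrap();
        match result {
            // Failure: nothing was activated yet, so go to Broken directly.
            Err(_) => *state = TenantState::Broken,
            // Success: nobody else may write the state yet, so activation
            // cannot be interrupted and needs no error path.
            Ok(()) => {
                *state = TenantState::Activating;
                *state = TenantState::Active;
            }
        }
        self.changed.notify_all();
    }

    /// Waits until the load/attach task has released the state, then stops.
    pub fn set_stopping(&self) {
        let guard = self.state.lock().unwrap();
        let mut state = self
            .changed
            .wait_while(guard, |s| {
                matches!(
                    *s,
                    TenantState::Loading | TenantState::Attaching | TenantState::Activating
                )
            })
            .unwrap();
        if *state == TenantState::Active {
            *state = TenantState::Stopping;
            self.changed.notify_all();
        }
        // Broken and Stopping are left unchanged.
    }
}

/// Runs a load on a background thread while the caller requests shutdown.
pub fn load_then_stop(load_result: Result<(), String>) -> TenantState {
    let tenant = Tenant::new(TenantState::Loading);
    let loader = {
        let tenant = Arc::clone(&tenant);
        thread::spawn(move || tenant.finish_load(load_result))
    };
    tenant.set_stopping(); // blocks until the loader has left Loading
    loader.join().unwrap();
    tenant.current_state()
}
```

Whichever thread runs first, `load_then_stop` ends in `Stopping` after a successful load and in `Broken` after a failed one, because `set_stopping` never observes or overwrites an in-progress state.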
@@ -18,7 +18,29 @@ use crate::reltag::RelTag;
 use anyhow::bail;
 use bytes::{BufMut, Bytes, BytesMut};
 
-/// A state of a tenant in pageserver's memory.
+/// The state of a tenant in this pageserver.
+///
+/// ```mermaid
+/// stateDiagram-v2
+///
+/// [*] --> Loading: spawn_load()
+/// [*] --> Attaching: spawn_attach()
+///
+/// Loading --> Activating: activate()
+/// Attaching --> Activating: activate()
+/// Activating --> Active: infallible
+///
+/// Loading --> Broken: load() failure
+/// Attaching --> Broken: attach() failure
+///
+/// Active --> Stopping: set_stopping(), part of shutdown & detach
+/// Stopping --> Broken: late error in remove_tenant_from_memory
+///
+/// Broken --> [*]: ignore / detach / shutdown
+/// Stopping --> [*]: remove_from_memory complete
+///
+/// Active --> Broken: cfg(testing)-only tenant break point
+/// ```
 #[derive(
     Clone,
     PartialEq,
@@ -35,11 +57,11 @@ use bytes::{BufMut, Bytes, BytesMut};
 pub enum TenantState {
     /// This tenant is being loaded from local disk
     Loading,
-    /// This tenant is being downloaded from cloud storage.
+    /// This tenant is being attached to the pageserver.
     Attaching,
     /// The tenant is transitioning from Loading/Attaching to Active.
     Activating,
-    /// Tenant is fully operational
+    /// The tenant has finished activating and is open for business.
     Active,
     /// A tenant is recognized by pageserver, but it is being detached or the
     /// system is being shut down.
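
The transitions listed in the Mermaid diagram can also be written down as a small lookup, which is handy for checking a proposed transition in a test. This is only an illustrative sketch: `DiagramState` and `is_allowed` are hypothetical names, and `Start` and `End` are stand-ins for the diagram's `[*]` nodes rather than variants of the real `TenantState`.

```rust
/// States from the diagram; `Start` and `End` model the `[*]` nodes.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub enum DiagramState {
    Start,
    Loading,
    Attaching,
    Activating,
    Active,
    Stopping,
    Broken,
    End,
}

/// Returns true if the diagram contains an edge `from --> to`.
pub fn is_allowed(from: DiagramState, to: DiagramState) -> bool {
    use DiagramState::*;
    matches!(
        (from, to),
        // spawn_load() / spawn_attach()
        (Start, Loading) | (Start, Attaching)
        // activate(), which is infallible once reached
        | (Loading, Activating) | (Attaching, Activating) | (Activating, Active)
        // load() / attach() failure
        | (Loading, Broken) | (Attaching, Broken)
        // set_stopping(), and a late error while removing the tenant
        | (Active, Stopping) | (Stopping, Broken)
        // ignore / detach / shutdown, or removal complete
        | (Broken, End) | (Stopping, End)
        // cfg(testing)-only break point
        | (Active, Broken)
    )
}
```

Note that there is no edge from `Activating` to `Broken` or `Stopping`, which is the diagram's way of saying that activation cannot fail or be interrupted.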