mirror of
https://github.com/neondatabase/neon.git
synced 2026-01-16 18:02:56 +00:00
This PR contains the first version of [FoundationDB-like](https://www.youtube.com/watch?v=4fFDFbi3toc) simulation testing for safekeeper and walproposer.

### desim

This is the core "framework" for running deterministic simulations. It operates on threads, which makes it possible to test synchronous code (like walproposer).

`libs/desim/src/executor.rs` contains the implementation of deterministic thread execution. This is achieved by blocking all threads and allowing only a single thread at a time to make an execution step. All of the executor's threads are blocked using the `yield_me(after_ms)` function. This function is called when a thread wants to sleep or wait for an external notification (like blocking on a channel until it has a ready message).

`libs/desim/src/chan.rs` contains the implementation of a channel (a basic sync primitive). It has unlimited capacity, and any thread can push messages to it or read messages from it.

`libs/desim/src/network.rs` has a very naive implementation of a network (only reliable TCP-like connections are supported for now). It can apply arbitrary delays to each packet and inject failures that break connections with some probability.

`libs/desim/src/world.rs` ties everything together, providing the concept of virtual nodes that can have network connections between them.

### walproposer_sim

Has everything needed to run walproposer and safekeepers in a simulation.

`safekeeper.rs` reimplements all the necessary parts of `receive_wal.rs`, `send_wal.rs` and `timelines_global_map.rs`.

`walproposer_api.rs` implements all walproposer callbacks on top of the simulation library.

`simulation.rs` defines a schedule: a set of events like `restart <sk>` or `write_wal` that should happen at time `<ts>`. It also has code to spawn walproposer/safekeeper threads and provide config to them.

### tests

`simple_test.rs` has tests that just start walproposer and 3 safekeepers together in a simulation and check that they do not crash right away.

`misc_test.rs` has tests checking more advanced simulation cases, like crashing or restarting threads, testing memory deallocation, etc.

`random_test.rs` is the main test; it checks thousands of random seeds (schedules) for correctness. It roughly corresponds to running a real Python integration test in an environment with a very unstable network and CPU, but in a deterministic way (each seed results in the same execution log) and much faster.

Closes #547

---------

Co-authored-by: Arseny Sher <sher-ars@yandex.ru>
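To illustrate the "each seed results in the same execution log" property described for `random_test.rs`, here is a minimal std-only sketch of deriving a schedule from a seed. It is not the code from `simulation.rs`: the names `ScheduledAction`, `ScheduledEvent`, `XorShift64` and `generate_schedule`, the xorshift generator, and the event probabilities are all assumptions made for this example.

```rust
/// Hypothetical event kinds, modelled on the `restart <sk>` / `write_wal`
/// wording of the description above (not the real simulation types).
#[derive(Debug, Clone, PartialEq)]
pub enum ScheduledAction {
    RestartSafekeeper(usize),
    WriteWal,
}

/// Hypothetical schedule entry: an action that should happen at time `ts`.
#[derive(Debug, Clone, PartialEq)]
pub struct ScheduledEvent {
    pub ts: u64,
    pub action: ScheduledAction,
}

/// Tiny xorshift64 generator, a std-only stand-in for a real seeded RNG.
pub struct XorShift64(u64);

impl XorShift64 {
    pub fn new(seed: u64) -> Self {
        // xorshift state must never be zero, so substitute a fixed constant.
        XorShift64(if seed == 0 { 0x9E37_79B9_7F4A_7C15 } else { seed })
    }

    pub fn next_u64(&mut self) -> u64 {
        let mut x = self.0;
        x ^= x << 13;
        x ^= x >> 7;
        x ^= x << 17;
        self.0 = x;
        x
    }
}

/// Builds a schedule that depends only on its arguments, so the same seed
/// always yields the same list of events.
pub fn generate_schedule(seed: u64, events: usize, safekeepers: usize) -> Vec<ScheduledEvent> {
    let mut rng = XorShift64::new(seed);
    let sk_count = safekeepers.max(1) as u64; // avoid modulo by zero
    let mut ts = 0u64;
    let mut schedule = Vec::with_capacity(events);
    for _ in 0..events {
        // Advance virtual time by 1..=100 so timestamps strictly increase.
        ts += rng.next_u64() % 100 + 1;
        // Roughly one event in four is a restart (arbitrary choice).
        let action = if rng.next_u64() % 4 == 0 {
            ScheduledAction::RestartSafekeeper((rng.next_u64() % sk_count) as usize)
        } else {
            ScheduledAction::WriteWal
        };
        schedule.push(ScheduledEvent { ts, action });
    }
    schedule
}
```

Calling `generate_schedule(42, 20, 3)` twice returns identical vectors, which is what allows a failing seed to be replayed. In this sketch all randomness flows through one generator owned by the schedule builder, so nothing outside the seed can influence the result.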
64 lines
1.6 KiB
Rust
use std::fmt::Debug;

use bytes::Bytes;
use utils::lsn::Lsn;

use crate::{network::TCP, world::NodeId};

/// Internal node events.
#[derive(Debug)]
pub enum NodeEvent {
    Accept(TCP),
    Internal(AnyMessage),
}

/// Events that are coming from a network socket.
#[derive(Clone, Debug)]
pub enum NetEvent {
    Message(AnyMessage),
    Closed,
}

/// Custom events generated throughout the simulation. Can be used by the test to verify the correctness.
#[derive(Debug)]
pub struct SimEvent {
    pub time: u64,
    pub node: NodeId,
    pub data: String,
}

/// Umbrella type for all possible flavours of messages. These events can be sent over network
/// or to an internal node events channel.
#[derive(Clone)]
pub enum AnyMessage {
    /// Not used, empty placeholder.
    None,
    /// Used internally for notifying node about new incoming connection.
    InternalConnect,
    Just32(u32),
    ReplCell(ReplCell),
    Bytes(Bytes),
    LSN(u64),
}

impl Debug for AnyMessage {
    fn fmt(&self, f: &mut std::fmt::Formatter<'_>) -> std::fmt::Result {
        match self {
            AnyMessage::None => write!(f, "None"),
            AnyMessage::InternalConnect => write!(f, "InternalConnect"),
            AnyMessage::Just32(v) => write!(f, "Just32({})", v),
            AnyMessage::ReplCell(v) => write!(f, "ReplCell({:?})", v),
            AnyMessage::Bytes(v) => write!(f, "Bytes({})", hex::encode(v)),
            AnyMessage::LSN(v) => write!(f, "LSN({})", Lsn(*v)),
        }
    }
}

/// Used in reliable_copy_test.rs
#[derive(Clone, Debug)]
pub struct ReplCell {
    pub value: u32,
    pub client_id: u32,
    pub seqno: u32,
}
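The hand-written `Debug` impl above exists so that byte payloads and LSNs print in a readable form in simulation logs. The following std-only sketch mirrors that idea without the `bytes`, `hex` and `utils` crates. The `Msg` type, `hex_encode` and `render` are hypothetical names for this example, `Vec<u8>` stands in for `Bytes`, and the `hi/lo` hexadecimal LSN rendering is an assumption based on the PostgreSQL `pg_lsn` text format rather than a confirmed detail of `Lsn`'s `Display` impl.

```rust
use std::fmt;

/// Simplified, hypothetical stand-in for `AnyMessage` that needs no external crates.
#[derive(Clone)]
pub enum Msg {
    Just32(u32),
    Bytes(Vec<u8>),
    Lsn(u64),
}

/// Lowercase hex encoding, two digits per byte.
fn hex_encode(data: &[u8]) -> String {
    data.iter().map(|b| format!("{:02x}", b)).collect()
}

impl fmt::Debug for Msg {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match self {
            Msg::Just32(v) => write!(f, "Just32({})", v),
            // Raw bytes are shown as hex instead of a list of integers.
            Msg::Bytes(v) => write!(f, "Bytes({})", hex_encode(v)),
            // Assumed pg_lsn-style format: high 32 bits, slash, low 32 bits.
            Msg::Lsn(v) => write!(f, "LSN({:X}/{:X})", v >> 32, v & 0xffff_ffff),
        }
    }
}

/// Convenience helper: the string that would end up in a log line.
pub fn render(msg: &Msg) -> String {
    format!("{:?}", msg)
}
```

With this sketch, `render(&Msg::Bytes(vec![0xde, 0xad]))` gives `Bytes(dead)` and `render(&Msg::Lsn(0x1_0000_0010))` gives `LSN(1/10)`. A derived `Debug` would instead print the raw integer and a byte list, which is harder to compare against WAL positions when reading an execution log.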