mirror of
https://github.com/neondatabase/neon.git
synced 2026-01-08 05:52:55 +00:00
This PR contains the first version of [FoundationDB-like](https://www.youtube.com/watch?v=4fFDFbi3toc) simulation testing for safekeeper and walproposer.

### desim

This is the core "framework" for running deterministic simulations. It operates on threads, which makes it possible to test synchronous code (like walproposer).

`libs/desim/src/executor.rs` contains the implementation of deterministic thread execution. This is achieved by blocking all threads and allowing only a single thread at a time to make an execution step. All of the executor's threads are blocked using the `yield_me(after_ms)` function. This function is called when a thread wants to sleep or wait for an external notification (like blocking on a channel until it has a ready message).

`libs/desim/src/chan.rs` contains the implementation of a channel (a basic sync primitive). It has unlimited capacity, and any thread can push messages to it or read messages from it.

`libs/desim/src/network.rs` has a very naive implementation of a network (only reliable TCP-like connections are supported for now). It can apply arbitrary delays to each packet and inject failures that break connections with some probability.

`libs/desim/src/world.rs` ties everything together to provide a concept of virtual nodes that can have network connections between them.

### walproposer_sim

Has everything needed to run walproposer and safekeepers in a simulation.

`safekeeper.rs` reimplements all the necessary parts of `receive_wal.rs`, `send_wal.rs` and `timelines_global_map.rs`.

`walproposer_api.rs` implements all walproposer callbacks on top of the simulation library.

`simulation.rs` defines a schedule: a set of events like `restart <sk>` or `write_wal` that should happen at time `<ts>`. It also has code to spawn walproposer/safekeeper threads and provide config to them.

### tests

`simple_test.rs` has tests that just start walproposer and 3 safekeepers together in a simulation and check that they do not crash right away.

`misc_test.rs` has tests checking more advanced simulation cases, like crashing or restarting threads, testing memory deallocation, etc.

`random_test.rs` is the main test; it checks thousands of random seeds (schedules) for correctness. It roughly corresponds to running a real Python integration test in an environment with a very unstable network and CPU, but in a deterministic way (each seed results in the same execution log) and much, much faster.

Closes #547

---------

Co-authored-by: Arseny Sher <sher-ars@yandex.ru>
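The schedule idea described above (events that should happen at time `<ts>`, with each seed producing the same execution log) can be illustrated with a small, std-only sketch. Everything here is hypothetical and for illustration only: `Event`, `Rng`, `generate_schedule` and `run_schedule` are made-up names, not the actual `desim` or `walproposer_sim` API, and the real executor drives threads rather than a plain event queue.

```rust
use std::cmp::Reverse;
use std::collections::BinaryHeap;

/// Hypothetical event kinds, loosely modelled on the `restart <sk>` and
/// `write_wal` events from the PR text. Not the real simulation types.
#[derive(Debug, Clone, Copy, PartialEq, Eq, PartialOrd, Ord)]
enum Event {
    WriteWal,
    RestartSafekeeper(u8),
}

/// Tiny xorshift64 PRNG so the sketch needs no external crates.
struct Rng(u64);

impl Rng {
    fn new(seed: u64) -> Self {
        // xorshift must never start from an all-zero state.
        let state = seed ^ 0x9E37_79B9_7F4A_7C15;
        Rng(if state == 0 { 1 } else { state })
    }

    fn next(&mut self) -> u64 {
        let mut x = self.0;
        x ^= x << 13;
        x ^= x >> 7;
        x ^= x << 17;
        self.0 = x;
        x
    }
}

/// Derives a schedule of `(timestamp_ms, event)` pairs from a seed.
fn generate_schedule(seed: u64, len: usize) -> Vec<(u64, Event)> {
    let mut rng = Rng::new(seed);
    (0..len)
        .map(|_| {
            let ts = rng.next() % 1_000;
            let event = if rng.next() % 4 == 0 {
                Event::RestartSafekeeper((rng.next() % 3) as u8)
            } else {
                Event::WriteWal
            };
            (ts, event)
        })
        .collect()
}

/// Replays the schedule in virtual-time order and returns the execution log.
/// Ties on the timestamp are broken by insertion order, so the result is
/// fully determined by the schedule.
fn run_schedule(schedule: &[(u64, Event)]) -> Vec<String> {
    let mut queue = BinaryHeap::new();
    for (seq, (ts, event)) in schedule.iter().enumerate() {
        queue.push(Reverse((*ts, seq, *event)));
    }

    let mut log = Vec::new();
    while let Some(Reverse((now, _seq, event))) = queue.pop() {
        // `now` plays the role of the simulated clock: no real sleeping.
        log.push(format!("[{now}] {event:?}"));
    }
    log
}
```

Calling `run_schedule(&generate_schedule(seed, n))` twice with the same `seed` yields identical logs, which is the property that makes a failing seed reproducible.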
78 lines
2.1 KiB
Rust
use std::{fmt, sync::Arc};

use desim::time::Timing;
use once_cell::sync::OnceCell;
use parking_lot::Mutex;
use tracing_subscriber::fmt::{format::Writer, time::FormatTime};

/// SimClock can be plugged into tracing logger to print simulation time.
#[derive(Clone)]
pub struct SimClock {
    clock_ptr: Arc<Mutex<Option<Arc<Timing>>>>,
}

impl Default for SimClock {
    fn default() -> Self {
        SimClock {
            clock_ptr: Arc::new(Mutex::new(None)),
        }
    }
}

impl SimClock {
    pub fn set_clock(&self, clock: Arc<Timing>) {
        *self.clock_ptr.lock() = Some(clock);
    }
}

impl FormatTime for SimClock {
    fn format_time(&self, w: &mut Writer<'_>) -> fmt::Result {
        let clock = self.clock_ptr.lock();

        if let Some(clock) = clock.as_ref() {
            let now = clock.now();
            write!(w, "[{}]", now)
        } else {
            write!(w, "[?]")
        }
    }
}

static LOGGING_DONE: OnceCell<SimClock> = OnceCell::new();

/// Returns a handle to the clock attached to the tracing logger, so that the
/// clock can be updated when the world is (re)created.
pub fn init_tracing_logger(debug_enabled: bool) -> SimClock {
    LOGGING_DONE
        .get_or_init(|| {
            let clock = SimClock::default();
            let base_logger = tracing_subscriber::fmt()
                .with_target(false)
                // prefix log lines with simulated time timestamp
                .with_timer(clock.clone())
                // .with_ansi(true) TODO
                .with_max_level(match debug_enabled {
                    true => tracing::Level::DEBUG,
                    false => tracing::Level::WARN,
                })
                .with_writer(std::io::stdout);
            base_logger.init();

            // logging::replace_panic_hook_with_tracing_panic_hook().forget();

            // Without debug logging, install a no-op panic hook so that
            // panics print nothing.
            if !debug_enabled {
                std::panic::set_hook(Box::new(|_| {}));
            }

            clock
        })
        .clone()
}

pub fn init_logger() -> SimClock {
    // The RUST_TRACEBACK envvar controls whether we print all logs or only
    // warnings: if it is set (to any value), debug logging is enabled.
    let debug_enabled = std::env::var("RUST_TRACEBACK").is_ok();

    init_tracing_logger(debug_enabled)
}
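
The `Arc<Mutex<Option<Arc<Timing>>>>` inside `SimClock` lets the logger be installed once per process while the clock it reads is swapped each time a world is (re)created. Below is a minimal std-only sketch of that pattern. It makes several assumptions: `FakeTiming` and `ClockHandle` are hypothetical stand-ins for `desim::time::Timing` and `SimClock`, `timestamp()` replaces the `FormatTime` implementation, and `std::sync::Mutex` is used instead of `parking_lot::Mutex`.

```rust
use std::sync::atomic::{AtomicU64, Ordering};
use std::sync::{Arc, Mutex};

/// Hypothetical stand-in for `desim::time::Timing`: a virtual clock in ms.
struct FakeTiming {
    now_ms: AtomicU64,
}

impl FakeTiming {
    fn new(start_ms: u64) -> Self {
        FakeTiming {
            now_ms: AtomicU64::new(start_ms),
        }
    }

    fn now(&self) -> u64 {
        self.now_ms.load(Ordering::SeqCst)
    }

    /// Advances virtual time; nothing actually sleeps.
    fn advance(&self, ms: u64) {
        self.now_ms.fetch_add(ms, Ordering::SeqCst);
    }
}

/// Hypothetical stand-in for `SimClock`: every clone shares one slot that
/// holds the current world's clock, or `None` before any world exists.
#[derive(Clone, Default)]
struct ClockHandle {
    clock_ptr: Arc<Mutex<Option<Arc<FakeTiming>>>>,
}

impl ClockHandle {
    /// Points all clones of this handle at a new world's clock.
    fn set_clock(&self, clock: Arc<FakeTiming>) {
        *self.clock_ptr.lock().unwrap() = Some(clock);
    }

    /// Formats the timestamp prefix the way the logger above does.
    fn timestamp(&self) -> String {
        let guard = self.clock_ptr.lock().unwrap();
        match guard.as_ref() {
            Some(clock) => format!("[{}]", clock.now()),
            None => "[?]".to_string(),
        }
    }
}
```

In this sketch the copy handed to the logger keeps working across world restarts: calling `set_clock` on any clone changes what every other clone prints, and `[?]` appears only before the first world is created.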