mirror of
https://github.com/neondatabase/neon.git
synced 2026-01-10 23:12:54 +00:00
In general, tiered compaction splits delta layers along the key dimension, but this can only continue until a single key is reached: if the changes for a single key don't fit into one layer file, we used to create layer files of unbounded size. This patch implements the method listed as a TODO/FIXME in the source code. It does the following:

* Make `accum_key_values` take the target size; if one key's modifications exceed it, have it fill `partition_lsns`, a vector of LSNs to use for partitioning.
* Have `retile_deltas` use that `partition_lsns` to create delta layers separated by LSN.
* Adjust `test_many_updates_for_single_key` to allow layer files below 0.5× the target size. This situation can create arbitrarily small layer files: an arbitrary amount of data can sit between having just cut a new delta and then stumbling upon the key that needs to be split along the LSN dimension. That data ends up in a dedicated layer, which can therefore be arbitrarily small.
* Ignore single-key delta layers for depth calculation: in theory a tier might consist only of single-key delta layers, which could confuse the depth calculation as well, but this should be unlikely.

Fixes #7243
Part of #7554

---------

Co-authored-by: Heikki Linnakangas <heikki@neon.tech>
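The core idea of the fix — cutting a single key's run of records along the LSN dimension so each piece fits the target file size — can be sketched as follows. This is an illustrative standalone function, not the actual `accum_key_values` implementation; the record representation and function name are assumptions:

```rust
/// Hedged sketch: given the (lsn, size) records accumulated for one key,
/// compute the LSNs at which to cut new delta layers so that no single
/// layer exceeds `target_size`. Illustrative only, not the Neon API.
fn partition_lsns(records: &[(u64, u64)], target_size: u64) -> Vec<u64> {
    let mut cuts = Vec::new();
    let mut current = 0u64; // bytes accumulated in the layer being built
    for &(lsn, size) in records {
        // If adding this record would overflow the target, start a new
        // layer at this record's LSN (never cut before the first record).
        if current > 0 && current + size > target_size {
            cuts.push(lsn);
            current = 0;
        }
        current += size;
    }
    cuts
}

fn main() {
    // Four 600-byte records against a 1000-byte target: each record forces
    // a new layer after the first, so we cut at LSNs 20, 30, and 40.
    let records = [(10, 600), (20, 600), (30, 600), (40, 600)];
    let cuts = partition_lsns(&records, 1000);
    assert_eq!(cuts, vec![20, 30, 40]);
    println!("{:?}", cuts);
}
```

In this sketch `retile_deltas` would then close the current delta layer whenever it crosses one of the returned LSNs, which is why the layer holding the leftover data just before a cut can be arbitrarily small.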
71 lines
2.2 KiB
Rust
use once_cell::sync::OnceCell;

use pageserver_compaction::interface::CompactionLayer;
use pageserver_compaction::simulator::MockTimeline;
use utils::logging;

static LOG_HANDLE: OnceCell<()> = OnceCell::new();

pub(crate) fn setup_logging() {
    LOG_HANDLE.get_or_init(|| {
        logging::init(
            logging::LogFormat::Test,
            logging::TracingErrorLayerEnablement::EnableWithRustLogFilter,
            logging::Output::Stdout,
        )
        .expect("Failed to init test logging")
    });
}

/// Test the extreme case that there are so many updates for a single key that
/// even if we produce an extremely narrow delta layer, spanning just that one
/// key, we still have too many records to fit in the target file size. We need
/// to split in the LSN dimension too in that case.
#[tokio::test]
async fn test_many_updates_for_single_key() {
    setup_logging();
    let mut executor = MockTimeline::new();
    executor.target_file_size = 1_000_000; // 1 MB

    // Ingest 10 MB of updates to a single key.
    for _ in 1..1000 {
        executor.ingest_uniform(100, 10, &(0..100_000)).unwrap();
        executor.ingest_uniform(1000, 10, &(0..1)).unwrap();
        executor.compact().await.unwrap();
    }

    // Check that all the layers are smaller than the target size (with some slop).
    for l in executor.live_layers.iter() {
        println!("layer {}: {}", l.short_id(), l.file_size());
    }
    for l in executor.live_layers.iter() {
        assert!(l.file_size() < executor.target_file_size * 2);
        // Sanity check that none of the delta layers are empty either.
        if l.is_delta() {
            assert!(l.file_size() > 0);
        }
    }
}

#[tokio::test]
async fn test_simple_updates() {
    setup_logging();
    let mut executor = MockTimeline::new();
    executor.target_file_size = 500_000; // 500 KB

    // Ingest some traffic.
    for _ in 1..400 {
        executor.ingest_uniform(100, 500, &(0..100_000)).unwrap();
    }

    for l in executor.live_layers.iter() {
        println!("layer {}: {}", l.short_id(), l.file_size());
    }

    println!("Running compaction...");
    executor.compact().await.unwrap();

    for l in executor.live_layers.iter() {
        println!("layer {}: {}", l.short_id(), l.file_size());
    }
}