Files
neon/test_runner/performance/pageserver/util.py
Vlad Lazar 2b11466b59 pageserver: optimise disk io for vectored get (#6780)
## Problem
The vectored read path proposed in
https://github.com/neondatabase/neon/pull/6576 seems
to be functionally correct, but in my testing (see below) it is about 10-20% slower than the naive
sequential vectored implementation.

## Summary of changes
There's three parts to this PR:
1. Supporting vectored blob reads. This is actually trickier than it
sounds because on disk blobs are prefixed with a variable length size header.
Since the blobs are not necessarily fixed size, we need to juggle the offsets
such that the callers can retrieve the blobs from the resulting buffer.

2. Merge disk read requests issued by the vectored read path up to a
maximum size. Again, the merging is complicated by the fact that blobs
are not fixed size. We keep track of the begin and end offset of each blob
and pass them into the vectored blob reader. In turn, the reader will return
a buffer and the offsets at which the blobs begin and end.

3. A benchmark for basebackup requests against tenant with large SLRU
block counts is added. This required a small change to pagebench and a new config
variable for the pageserver which toggles the vectored get validation.

We can probably optimise things further by adding a little bit of
concurrency for our IO. In principle, it's as simple as spawning a task which deals with issuing
IO and doing the serialisation and handling on the parent task which receives input via a
channel.
2024-02-28 12:06:00 +00:00

56 lines
1.7 KiB
Python

"""
Utilities used by all code in this sub-directory
"""
from typing import Any, Callable, Dict, Tuple
import fixtures.pageserver.many_tenants as many_tenants
from fixtures.log_helper import log
from fixtures.neon_fixtures import (
NeonEnv,
NeonEnvBuilder,
)
from fixtures.pageserver.utils import wait_until_all_tenants_state
from fixtures.types import TenantId, TimelineId
def ensure_pageserver_ready_for_benchmarking(env: NeonEnv, n_tenants: int):
"""
Helper function.
"""
ps_http = env.pageserver.http_client()
log.info("wait for all tenants to become active")
wait_until_all_tenants_state(
ps_http, "Active", iterations=n_tenants, period=1, http_error_ok=False
)
# ensure all layers are resident for predictiable performance
tenants = [info["id"] for info in ps_http.tenant_list()]
for tenant in tenants:
for timeline in ps_http.tenant_status(tenant)["timelines"]:
info = ps_http.layer_map_info(tenant, timeline)
for layer in info.historic_layers:
assert not layer.remote
log.info("ready")
def setup_pageserver_with_tenants(
neon_env_builder: NeonEnvBuilder,
name: str,
n_tenants: int,
setup: Callable[[NeonEnv], Tuple[TenantId, TimelineId, Dict[str, Any]]],
) -> NeonEnv:
"""
Utility function to set up a pageserver with a given number of identical tenants.
"""
def doit(neon_env_builder: NeonEnvBuilder) -> NeonEnv:
return many_tenants.single_timeline(neon_env_builder, setup, n_tenants)
env = neon_env_builder.build_and_use_snapshot(name, doit)
env.start()
ensure_pageserver_ready_for_benchmarking(env, n_tenants)
return env