pageserver: simplify LayerAccessStats (#8431)

## Problem

LayerAccessStats contains a lot of detail that we don't use: short
histories of most recent accesses, specifics on what kind of task
accessed a layer, etc. This is all stored inside a Mutex, which is
locked every time something accesses a layer.

## Summary of changes

- Store timestamps at a very low resolution (to the nearest second),
sufficient for use on the timescales of eviction.
- Pack access time and last residence change time into a single u64.
- Use the high bits of the u64 for other flags, including the new layer
visibility concept.
- Simplify the external-facing model for access stats to just include
what we now track.
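
As a rough illustration of the packing described above, here is a minimal sketch of two second-resolution timestamps and one flag sharing a single 64-bit word. The bit widths, the names `pack` and `unpack`, and the single visibility bit are assumptions made for this example; the layout the pageserver actually uses may differ.

```python
from typing import Tuple

# Hypothetical layout (an assumption for illustration, not taken from the commit):
#   bits  0..31  last access time, whole seconds
#   bits 32..62  last residence change time, whole seconds
#   bit  63      visibility flag
ATIME_BITS = 32
RTIME_BITS = 31
ATIME_MASK = (1 << ATIME_BITS) - 1
RTIME_MASK = (1 << RTIME_BITS) - 1
RTIME_SHIFT = ATIME_BITS
VISIBLE_SHIFT = ATIME_BITS + RTIME_BITS  # bit 63


def pack(atime_secs: int, rtime_secs: int, visible: bool) -> int:
    """Pack two second-resolution timestamps and one flag into a 64-bit value."""
    word = atime_secs & ATIME_MASK
    word |= (rtime_secs & RTIME_MASK) << RTIME_SHIFT
    word |= int(visible) << VISIBLE_SHIFT
    return word


def unpack(word: int) -> Tuple[int, int, bool]:
    """Recover (access time, residence change time, visibility) from a packed value."""
    atime = word & ATIME_MASK
    rtime = (word >> RTIME_SHIFT) & RTIME_MASK
    visible = bool((word >> VISIBLE_SHIFT) & 1)
    return atime, rtime, visible
```

Because everything fits in one machine word, a value like this can in principle be read and updated with a single atomic operation rather than under a lock, which is presumably what makes the packing attractive here.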

Note that the `HistoryBufferWithDropCounter` is removed here because it
is no longer used. I do not dislike this type; we just happen not to use
it for anything else at present.


Co-authored-by: Christian Schwarz <christian@neon.tech>
John Spray
2024-07-24 08:17:28 +01:00
committed by GitHub
parent 925c5ad1e8
commit f5db655447
15 changed files with 209 additions and 579 deletions


@@ -21,6 +21,10 @@ from fixtures.utils import human_bytes, wait_until
GLOBAL_LRU_LOG_LINE = "tenant_min_resident_size-respecting LRU would not relieve pressure, evicting more following global LRU policy"
+# access times in the pageserver are stored at a very low resolution: to generate meaningfully different
+# values, tests must inject sleeps
+ATIME_RESOLUTION = 2
@pytest.mark.parametrize("config_level_override", [None, 400])
def test_min_resident_size_override_handling(
@@ -546,6 +550,7 @@ def test_partial_evict_tenant(eviction_env: EvictionEnv, order: EvictionOrder):
(tenant_id, timeline_id) = warm
# make picked tenant more recently used than the other one
+time.sleep(ATIME_RESOLUTION)
env.warm_up_tenant(tenant_id)
# Build up enough pressure to require evictions from both tenants,
@@ -622,6 +627,10 @@ def test_fast_growing_tenant(neon_env_builder: NeonEnvBuilder, pg_bin: PgBin, or
for scale in [1, 1, 1, 4]:
timelines.append((pgbench_init_tenant(layer_size, scale, env, pg_bin), scale))
+# Eviction times are stored at a low resolution. We must ensure that the time between
+# tenants is long enough for the pageserver to distinguish them.
+time.sleep(ATIME_RESOLUTION)
env.neon_cli.safekeeper_stop()
for (tenant_id, timeline_id), scale in timelines:


@@ -52,8 +52,8 @@ def test_threshold_based_eviction(
"kind": "NoEviction"
}
-eviction_threshold = 5
-eviction_period = 1
+eviction_threshold = 10
+eviction_period = 2
ps_http.set_tenant_config(
tenant_id,
{