Don't upload index file in compaction, if there was nothing to do. (#3149)

This splits `storage_sync2::schedule_index_file` into two (public)
functions:
1. `schedule_index_upload_for_metadata_update`, for when the metadata
(e.g. disk_consistent_lsn or last_gc_cutoff) has changed, and

2. `schedule_index_upload_for_file_changes`, for when layer file uploads
or deletions have been scheduled.

We now keep track of whether there have been any uploads or deletions
since the last index-file upload, and skip the upload in
`schedule_index_upload_for_file_changes` if there haven't been any
changes. That allows us to call the function liberally in timeline.rs,
whenever layer file uploads or deletions might've been scheduled,
without starting a lot of unnecessary index file uploads.
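
The bookkeeping described above can be sketched roughly as follows. This is
a minimal illustration, not the actual pageserver code: the `UploadQueue`
struct, its fields, the `schedule_layer_file_*` helpers and the `bool`
return value are all hypothetical. Only the two
`schedule_index_upload_for_*` names come from this commit.

```rust
/// Hypothetical, simplified stand-in for the remote upload queue state.
#[derive(Debug, Default)]
struct UploadQueue {
    /// Layer file uploads or deletions scheduled since the last index upload.
    changes_since_last_index_upload: u32,
    /// Index uploads scheduled so far (kept only to make the sketch observable).
    index_uploads_scheduled: u32,
}

impl UploadQueue {
    /// Assumed helper: scheduling a layer upload counts as a pending change.
    fn schedule_layer_file_upload(&mut self, _layer_name: &str) {
        self.changes_since_last_index_upload += 1;
    }

    /// Assumed helper: scheduling a layer deletion counts as a pending change.
    fn schedule_layer_file_deletion(&mut self, _layer_name: &str) {
        self.changes_since_last_index_upload += 1;
    }

    /// Shared tail of both public entry points: schedule the index upload
    /// and reset the change counter.
    fn schedule_index_upload(&mut self) {
        self.index_uploads_scheduled += 1;
        self.changes_since_last_index_upload = 0;
    }

    /// Metadata (e.g. disk_consistent_lsn) changed: always upload the index.
    fn schedule_index_upload_for_metadata_update(&mut self) {
        self.schedule_index_upload();
    }

    /// Layer files may have changed: upload the index only if something was
    /// actually scheduled since the last index upload.
    /// Returns true if an upload was scheduled (assumption for this sketch).
    fn schedule_index_upload_for_file_changes(&mut self) -> bool {
        if self.changes_since_last_index_upload == 0 {
            return false; // nothing to do, skip the upload
        }
        self.schedule_index_upload();
        true
    }
}
```

With this shape, a no-op compaction that calls
`schedule_index_upload_for_file_changes` without having scheduled any layer
uploads or deletions leaves the index upload count unchanged, which is the
behavior the test change below checks for.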

GC was covered earlier by commit c262390214, but that missed that we
have the same problem with compaction.
Heikki Linnakangas
2022-12-19 23:58:24 +02:00
committed by GitHub
parent 3735aece56
commit 39f58038d1
4 changed files with 114 additions and 37 deletions

@@ -165,6 +165,11 @@ def test_gc_index_upload(neon_env_builder: NeonEnvBuilder, remote_storage_kind:
cur.execute("INSERT INTO foo VALUES (0, 0, 'foo')")
pageserver_http.timeline_gc(tenant_id, timeline_id, 10000 - i * 32)
num_index_uploads = get_num_remote_ops("index", "upload")
# Also make sure that a no-op compaction doesn't upload the index
# file unnecessarily.
pageserver_http.timeline_compact(tenant_id, timeline_id)
log.info(f"{num_index_uploads} index uploads after GC iteration {i}")
after = num_index_uploads