## Problem

For context, this problem was observed in a research project in which we try to make Neon run in multiple regions, and I was asked by @hlinnaka to open this PR. In our project, we use the pageserver in a non-conventional way and send a much larger number of requests to it than usual (imagine Postgres without the buffer pool). I measured the time from the moment a WAL record left the safekeeper to when it reached the pageserver ([code](e593db1f5a/pageserver/src/tenant/timeline/walreceiver/walreceiver_connection.rs (L282-L287))) and observed that when the number of `get_page_at_lsn` requests was high, the WAL receive time increased significantly (see the left side of the graphs below).

Upon further investigation, I found that the delay was caused by [this line](d2ca410919/pageserver/src/tenant/timeline.rs (L2348)). The `get_layer_for_write` method is called for every value during WAL ingestion and tries to acquire the layers write lock each time, which results in high contention when the read lock is also being acquired frequently by read requests.

## Summary of changes

It is unnecessary to call `get_layer_for_write` repeatedly for every value in a WAL message, since they all end up in the same in-memory layer anyway. This PR therefore adds batched versions of `InMemoryLayer::put_value`, `InMemoryLayer::put_tombstone`, `Timeline::put_value`, and `Timeline::put_tombstone` that acquire the locks once per batch of values. Additionally, `DatadirModification` is changed to store multiple versions of uncommitted values, and `WalIngest::ingest_record()` can now ingest records without immediately committing them. With these new APIs, the ingestion loop now commits once every `ingest_batch_size` records (see the sketches at the end of this description). `ingest_batch_size` is exposed as a config option; setting it to 1 gives the same behavior as before this change. I found that a value of 100 seems to work best, and its effect is visible on the right side of the above graphs.

---------

Co-authored-by: John Spray <john@neon.tech>
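As an illustration of the locking change, here is a minimal standalone sketch. The types and the batched method name `put_values` are stand-ins rather than the actual pageserver APIs: a layer protects its in-memory index with an `RwLock`, and the batched path takes the write lock once per batch instead of once per value, which is the contention reduction described above.

```rust
// Minimal sketch (not the real pageserver types): an in-memory layer whose
// index is guarded by an RwLock shared with readers (get_page_at_lsn-style
// lookups take the read lock).
use std::collections::BTreeMap;
use std::sync::RwLock;

type Key = u64;
type Lsn = u64;

struct InMemoryLayerSketch {
    index: RwLock<BTreeMap<(Key, Lsn), Vec<u8>>>,
}

impl InMemoryLayerSketch {
    fn new() -> Self {
        Self { index: RwLock::new(BTreeMap::new()) }
    }

    // Pre-existing style: one write-lock acquisition per value.
    fn put_value(&self, key: Key, lsn: Lsn, value: Vec<u8>) {
        let mut index = self.index.write().unwrap();
        index.insert((key, lsn), value);
    }

    // Batched style (hypothetical name): one write-lock acquisition for a
    // whole batch of values from a WAL record.
    fn put_values(&self, values: Vec<(Key, Lsn, Vec<u8>)>) {
        let mut index = self.index.write().unwrap();
        for (key, lsn, value) in values {
            index.insert((key, lsn), value);
        }
    }

    // Readers contend on the same lock, which is why fewer write-lock
    // acquisitions reduce the observed WAL receive delay.
    fn get_value(&self, key: Key, lsn: Lsn) -> Option<Vec<u8>> {
        self.index.read().unwrap().get(&(key, lsn)).cloned()
    }
}

fn main() {
    let layer = InMemoryLayerSketch::new();
    layer.put_value(1, 10, b"one".to_vec());
    layer.put_values(vec![(2, 11, b"two".to_vec()), (3, 11, b"three".to_vec())]);
    assert_eq!(layer.get_value(2, 11), Some(b"two".to_vec()));
}
```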
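And here is a sketch of the batched ingestion loop. `WalRecord`, `Modification`, `ingest_record`, and `commit` below are simplified stand-ins for the real `WalIngest`/`DatadirModification` APIs; the point is the control flow: stage records without committing, then commit once every `ingest_batch_size` records.

```rust
// Simplified sketch of committing once per batch of ingested WAL records.
struct WalRecord {
    lsn: u64,
    payload: Vec<u8>,
}

#[derive(Default)]
struct Modification {
    pending: Vec<(u64, Vec<u8>)>,
}

impl Modification {
    fn ingest_record(&mut self, rec: WalRecord) {
        // Stage the record's values; no layer write lock is taken here.
        self.pending.push((rec.lsn, rec.payload));
    }

    fn commit(&mut self) {
        // In the real code this is where the write lock would be taken once
        // and all staged values flushed as a single batch.
        println!("committing {} staged records", self.pending.len());
        self.pending.clear();
    }
}

fn ingest(records: Vec<WalRecord>, ingest_batch_size: usize) {
    let mut modification = Modification::default();
    let mut uncommitted = 0;

    for rec in records {
        modification.ingest_record(rec);
        uncommitted += 1;
        if uncommitted >= ingest_batch_size {
            modification.commit();
            uncommitted = 0;
        }
    }
    // Flush whatever remains at the end of the stream.
    if uncommitted > 0 {
        modification.commit();
    }
}

fn main() {
    let records = (0..250u64)
        .map(|lsn| WalRecord { lsn, payload: vec![0u8; 8] })
        .collect();
    // ingest_batch_size = 1 reproduces the old per-record commit behavior;
    // the PR found around 100 to work best in its experiments.
    ingest(records, 100);
}
```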