Christian Schwarz
a2a3613185
reintroduce task-based execution
2024-11-28 20:50:06 +01:00
Christian Schwarz
6bd39f95f5
run benchmark on Hetzner runner
...
-------------------------------------------------------------------------------------------------------------------- Benchmark results ---------------------------------------------------------------------------------------------------------------------
test_throughput[release-pg16-50-None-30-1-128-not batchable None].tablesize_mib: 50 MiB
test_throughput[release-pg16-50-None-30-1-128-not batchable None].pipelining_enabled: 0
test_throughput[release-pg16-50-None-30-1-128-not batchable None].effective_io_concurrency: 1
test_throughput[release-pg16-50-None-30-1-128-not batchable None].readhead_buffer_size: 128
test_throughput[release-pg16-50-None-30-1-128-not batchable None].counters.time: 0.8905
test_throughput[release-pg16-50-None-30-1-128-not batchable None].counters.pageserver_getpage_count: 6,403.0000
test_throughput[release-pg16-50-None-30-1-128-not batchable None].counters.pageserver_vectored_get_count: 6,403.0000
test_throughput[release-pg16-50-None-30-1-128-not batchable None].counters.compute_getpage_count: 6,403.0000
test_throughput[release-pg16-50-None-30-1-128-not batchable None].counters.pageserver_cpu_seconds_total: 0.8633
test_throughput[release-pg16-50-None-30-1-128-not batchable None].perfmetric.batching_factor: 1.0000
test_throughput[release-pg16-50-pipelining_config1-30-1-128-not batchable {'max_batch_size': 1}].tablesize_mib: 50 MiB
test_throughput[release-pg16-50-pipelining_config1-30-1-128-not batchable {'max_batch_size': 1}].pipelining_enabled: 1
test_throughput[release-pg16-50-pipelining_config1-30-1-128-not batchable {'max_batch_size': 1}].effective_io_concurrency: 1
test_throughput[release-pg16-50-pipelining_config1-30-1-128-not batchable {'max_batch_size': 1}].readhead_buffer_size: 128
test_throughput[release-pg16-50-pipelining_config1-30-1-128-not batchable {'max_batch_size': 1}].pipelining_config.max_batch_size: 1
test_throughput[release-pg16-50-pipelining_config1-30-1-128-not batchable {'max_batch_size': 1}].counters.time: 0.9195
test_throughput[release-pg16-50-pipelining_config1-30-1-128-not batchable {'max_batch_size': 1}].counters.pageserver_getpage_count: 6,403.0000
test_throughput[release-pg16-50-pipelining_config1-30-1-128-not batchable {'max_batch_size': 1}].counters.pageserver_vectored_get_count: 6,403.0000
test_throughput[release-pg16-50-pipelining_config1-30-1-128-not batchable {'max_batch_size': 1}].counters.compute_getpage_count: 6,403.0000
test_throughput[release-pg16-50-pipelining_config1-30-1-128-not batchable {'max_batch_size': 1}].counters.pageserver_cpu_seconds_total: 0.8925
test_throughput[release-pg16-50-pipelining_config1-30-1-128-not batchable {'max_batch_size': 1}].perfmetric.batching_factor: 1.0000
test_throughput[release-pg16-50-pipelining_config2-30-1-128-not batchable {'max_batch_size': 32}].tablesize_mib: 50 MiB
test_throughput[release-pg16-50-pipelining_config2-30-1-128-not batchable {'max_batch_size': 32}].pipelining_enabled: 1
test_throughput[release-pg16-50-pipelining_config2-30-1-128-not batchable {'max_batch_size': 32}].effective_io_concurrency: 1
test_throughput[release-pg16-50-pipelining_config2-30-1-128-not batchable {'max_batch_size': 32}].readhead_buffer_size: 128
test_throughput[release-pg16-50-pipelining_config2-30-1-128-not batchable {'max_batch_size': 32}].pipelining_config.max_batch_size: 32
test_throughput[release-pg16-50-pipelining_config2-30-1-128-not batchable {'max_batch_size': 32}].counters.time: 0.8724
test_throughput[release-pg16-50-pipelining_config2-30-1-128-not batchable {'max_batch_size': 32}].counters.pageserver_getpage_count: 6,403.0000
test_throughput[release-pg16-50-pipelining_config2-30-1-128-not batchable {'max_batch_size': 32}].counters.pageserver_vectored_get_count: 6,403.0000
test_throughput[release-pg16-50-pipelining_config2-30-1-128-not batchable {'max_batch_size': 32}].counters.compute_getpage_count: 6,403.0000
test_throughput[release-pg16-50-pipelining_config2-30-1-128-not batchable {'max_batch_size': 32}].counters.pageserver_cpu_seconds_total: 0.8406
test_throughput[release-pg16-50-pipelining_config2-30-1-128-not batchable {'max_batch_size': 32}].perfmetric.batching_factor: 1.0000
test_throughput[release-pg16-50-None-30-100-128-batchable None].tablesize_mib: 50 MiB
test_throughput[release-pg16-50-None-30-100-128-batchable None].pipelining_enabled: 0
test_throughput[release-pg16-50-None-30-100-128-batchable None].effective_io_concurrency: 100
test_throughput[release-pg16-50-None-30-100-128-batchable None].readhead_buffer_size: 128
test_throughput[release-pg16-50-None-30-100-128-batchable None].counters.time: 0.2576
test_throughput[release-pg16-50-None-30-100-128-batchable None].counters.pageserver_getpage_count: 6,401.5259
test_throughput[release-pg16-50-None-30-100-128-batchable None].counters.pageserver_vectored_get_count: 307.8534
test_throughput[release-pg16-50-None-30-100-128-batchable None].counters.compute_getpage_count: 6,401.5259
test_throughput[release-pg16-50-None-30-100-128-batchable None].counters.pageserver_cpu_seconds_total: 0.3043
test_throughput[release-pg16-50-None-30-100-128-batchable None].perfmetric.batching_factor: 20.7941
test_throughput[release-pg16-50-pipelining_config4-30-100-128-batchable {'max_batch_size': 1}].tablesize_mib: 50 MiB
test_throughput[release-pg16-50-pipelining_config4-30-100-128-batchable {'max_batch_size': 1}].pipelining_enabled: 1
test_throughput[release-pg16-50-pipelining_config4-30-100-128-batchable {'max_batch_size': 1}].effective_io_concurrency: 100
test_throughput[release-pg16-50-pipelining_config4-30-100-128-batchable {'max_batch_size': 1}].readhead_buffer_size: 128
test_throughput[release-pg16-50-pipelining_config4-30-100-128-batchable {'max_batch_size': 1}].pipelining_config.max_batch_size: 1
test_throughput[release-pg16-50-pipelining_config4-30-100-128-batchable {'max_batch_size': 1}].counters.time: 0.6187
test_throughput[release-pg16-50-pipelining_config4-30-100-128-batchable {'max_batch_size': 1}].counters.pageserver_getpage_count: 6,403.0000
test_throughput[release-pg16-50-pipelining_config4-30-100-128-batchable {'max_batch_size': 1}].counters.pageserver_vectored_get_count: 6,403.0000
test_throughput[release-pg16-50-pipelining_config4-30-100-128-batchable {'max_batch_size': 1}].counters.compute_getpage_count: 6,403.0000
test_throughput[release-pg16-50-pipelining_config4-30-100-128-batchable {'max_batch_size': 1}].counters.pageserver_cpu_seconds_total: 0.7473
test_throughput[release-pg16-50-pipelining_config4-30-100-128-batchable {'max_batch_size': 1}].perfmetric.batching_factor: 1.0000
test_throughput[release-pg16-50-pipelining_config5-30-100-128-batchable {'max_batch_size': 2}].tablesize_mib: 50 MiB
test_throughput[release-pg16-50-pipelining_config5-30-100-128-batchable {'max_batch_size': 2}].pipelining_enabled: 1
test_throughput[release-pg16-50-pipelining_config5-30-100-128-batchable {'max_batch_size': 2}].effective_io_concurrency: 100
test_throughput[release-pg16-50-pipelining_config5-30-100-128-batchable {'max_batch_size': 2}].readhead_buffer_size: 128
test_throughput[release-pg16-50-pipelining_config5-30-100-128-batchable {'max_batch_size': 2}].pipelining_config.max_batch_size: 2
test_throughput[release-pg16-50-pipelining_config5-30-100-128-batchable {'max_batch_size': 2}].counters.time: 0.4419
test_throughput[release-pg16-50-pipelining_config5-30-100-128-batchable {'max_batch_size': 2}].counters.pageserver_getpage_count: 6,402.6418
test_throughput[release-pg16-50-pipelining_config5-30-100-128-batchable {'max_batch_size': 2}].counters.pageserver_vectored_get_count: 3,207.7015
test_throughput[release-pg16-50-pipelining_config5-30-100-128-batchable {'max_batch_size': 2}].counters.compute_getpage_count: 6,402.6418
test_throughput[release-pg16-50-pipelining_config5-30-100-128-batchable {'max_batch_size': 2}].counters.pageserver_cpu_seconds_total: 0.5391
test_throughput[release-pg16-50-pipelining_config5-30-100-128-batchable {'max_batch_size': 2}].perfmetric.batching_factor: 1.9960
test_throughput[release-pg16-50-pipelining_config6-30-100-128-batchable {'max_batch_size': 4}].tablesize_mib: 50 MiB
test_throughput[release-pg16-50-pipelining_config6-30-100-128-batchable {'max_batch_size': 4}].pipelining_enabled: 1
test_throughput[release-pg16-50-pipelining_config6-30-100-128-batchable {'max_batch_size': 4}].effective_io_concurrency: 100
test_throughput[release-pg16-50-pipelining_config6-30-100-128-batchable {'max_batch_size': 4}].readhead_buffer_size: 128
test_throughput[release-pg16-50-pipelining_config6-30-100-128-batchable {'max_batch_size': 4}].pipelining_config.max_batch_size: 4
test_throughput[release-pg16-50-pipelining_config6-30-100-128-batchable {'max_batch_size': 4}].counters.time: 0.3569
test_throughput[release-pg16-50-pipelining_config6-30-100-128-batchable {'max_batch_size': 4}].counters.pageserver_getpage_count: 6,402.1071
test_throughput[release-pg16-50-pipelining_config6-30-100-128-batchable {'max_batch_size': 4}].counters.pageserver_vectored_get_count: 1,660.0952
test_throughput[release-pg16-50-pipelining_config6-30-100-128-batchable {'max_batch_size': 4}].counters.compute_getpage_count: 6,402.1071
test_throughput[release-pg16-50-pipelining_config6-30-100-128-batchable {'max_batch_size': 4}].counters.pageserver_cpu_seconds_total: 0.4244
test_throughput[release-pg16-50-pipelining_config6-30-100-128-batchable {'max_batch_size': 4}].perfmetric.batching_factor: 3.8565
test_throughput[release-pg16-50-pipelining_config7-30-100-128-batchable {'max_batch_size': 8}].tablesize_mib: 50 MiB
test_throughput[release-pg16-50-pipelining_config7-30-100-128-batchable {'max_batch_size': 8}].pipelining_enabled: 1
test_throughput[release-pg16-50-pipelining_config7-30-100-128-batchable {'max_batch_size': 8}].effective_io_concurrency: 100
test_throughput[release-pg16-50-pipelining_config7-30-100-128-batchable {'max_batch_size': 8}].readhead_buffer_size: 128
test_throughput[release-pg16-50-pipelining_config7-30-100-128-batchable {'max_batch_size': 8}].pipelining_config.max_batch_size: 8
test_throughput[release-pg16-50-pipelining_config7-30-100-128-batchable {'max_batch_size': 8}].counters.time: 0.2977
test_throughput[release-pg16-50-pipelining_config7-30-100-128-batchable {'max_batch_size': 8}].counters.pageserver_getpage_count: 6,401.7700
test_throughput[release-pg16-50-pipelining_config7-30-100-128-batchable {'max_batch_size': 8}].counters.pageserver_vectored_get_count: 886.6900
test_throughput[release-pg16-50-pipelining_config7-30-100-128-batchable {'max_batch_size': 8}].counters.compute_getpage_count: 6,401.7700
test_throughput[release-pg16-50-pipelining_config7-30-100-128-batchable {'max_batch_size': 8}].counters.pageserver_cpu_seconds_total: 0.3511
test_throughput[release-pg16-50-pipelining_config7-30-100-128-batchable {'max_batch_size': 8}].perfmetric.batching_factor: 7.2199
test_throughput[release-pg16-50-pipelining_config8-30-100-128-batchable {'max_batch_size': 16}].tablesize_mib: 50 MiB
test_throughput[release-pg16-50-pipelining_config8-30-100-128-batchable {'max_batch_size': 16}].pipelining_enabled: 1
test_throughput[release-pg16-50-pipelining_config8-30-100-128-batchable {'max_batch_size': 16}].effective_io_concurrency: 100
test_throughput[release-pg16-50-pipelining_config8-30-100-128-batchable {'max_batch_size': 16}].readhead_buffer_size: 128
test_throughput[release-pg16-50-pipelining_config8-30-100-128-batchable {'max_batch_size': 16}].pipelining_config.max_batch_size: 16
test_throughput[release-pg16-50-pipelining_config8-30-100-128-batchable {'max_batch_size': 16}].counters.time: 0.2697
test_throughput[release-pg16-50-pipelining_config8-30-100-128-batchable {'max_batch_size': 16}].counters.pageserver_getpage_count: 6,401.5946
test_throughput[release-pg16-50-pipelining_config8-30-100-128-batchable {'max_batch_size': 16}].counters.pageserver_vectored_get_count: 500.5766
test_throughput[release-pg16-50-pipelining_config8-30-100-128-batchable {'max_batch_size': 16}].counters.compute_getpage_count: 6,401.5946
test_throughput[release-pg16-50-pipelining_config8-30-100-128-batchable {'max_batch_size': 16}].counters.pageserver_cpu_seconds_total: 0.3195
test_throughput[release-pg16-50-pipelining_config8-30-100-128-batchable {'max_batch_size': 16}].perfmetric.batching_factor: 12.7884
test_throughput[release-pg16-50-pipelining_config9-30-100-128-batchable {'max_batch_size': 32}].tablesize_mib: 50 MiB
test_throughput[release-pg16-50-pipelining_config9-30-100-128-batchable {'max_batch_size': 32}].pipelining_enabled: 1
test_throughput[release-pg16-50-pipelining_config9-30-100-128-batchable {'max_batch_size': 32}].effective_io_concurrency: 100
test_throughput[release-pg16-50-pipelining_config9-30-100-128-batchable {'max_batch_size': 32}].readhead_buffer_size: 128
test_throughput[release-pg16-50-pipelining_config9-30-100-128-batchable {'max_batch_size': 32}].pipelining_config.max_batch_size: 32
test_throughput[release-pg16-50-pipelining_config9-30-100-128-batchable {'max_batch_size': 32}].counters.time: 0.2548
test_throughput[release-pg16-50-pipelining_config9-30-100-128-batchable {'max_batch_size': 32}].counters.pageserver_getpage_count: 6,401.5128
test_throughput[release-pg16-50-pipelining_config9-30-100-128-batchable {'max_batch_size': 32}].counters.pageserver_vectored_get_count: 307.7692
test_throughput[release-pg16-50-pipelining_config9-30-100-128-batchable {'max_batch_size': 32}].counters.compute_getpage_count: 6,401.5128
test_throughput[release-pg16-50-pipelining_config9-30-100-128-batchable {'max_batch_size': 32}].counters.pageserver_cpu_seconds_total: 0.3015
test_throughput[release-pg16-50-pipelining_config9-30-100-128-batchable {'max_batch_size': 32}].perfmetric.batching_factor: 20.7997
test_latency[release-pg16-None-None].latency_mean: 0.127 ms
test_latency[release-pg16-None-None].latency_percentiles.p95: 0.166 ms
test_latency[release-pg16-None-None].latency_percentiles.p99: 0.187 ms
test_latency[release-pg16-None-None].latency_percentiles.p99.9: 0.292 ms
test_latency[release-pg16-None-None].latency_percentiles.p99.99: 0.624 ms
test_latency[release-pg16-pipelining_config1-{'max_batch_size': 1}].latency_mean: 0.139 ms
test_latency[release-pg16-pipelining_config1-{'max_batch_size': 1}].latency_percentiles.p95: 0.175 ms
test_latency[release-pg16-pipelining_config1-{'max_batch_size': 1}].latency_percentiles.p99: 0.200 ms
test_latency[release-pg16-pipelining_config1-{'max_batch_size': 1}].latency_percentiles.p99.9: 0.444 ms
test_latency[release-pg16-pipelining_config1-{'max_batch_size': 1}].latency_percentiles.p99.99: 0.658 ms
test_latency[release-pg16-pipelining_config2-{'max_batch_size': 32}].latency_mean: 0.119 ms
test_latency[release-pg16-pipelining_config2-{'max_batch_size': 32}].latency_percentiles.p95: 0.155 ms
test_latency[release-pg16-pipelining_config2-{'max_batch_size': 32}].latency_percentiles.p99: 0.172 ms
test_latency[release-pg16-pipelining_config2-{'max_batch_size': 32}].latency_percentiles.p99.9: 0.267 ms
test_latency[release-pg16-pipelining_config2-{'max_batch_size': 32}].latency_percentiles.p99.99: 0.587 ms
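The reported `batching_factor` values in the batchable runs appear to equal `pageserver_getpage_count / pageserver_vectored_get_count`. The sketch below checks that reading against numbers copied from the results above. The helper name `batching_factor` and the `RUNS` table are illustrative assumptions, not part of the benchmark code.

```python
# Counter values copied from the batchable throughput runs above:
# label -> (pageserver_getpage_count, pageserver_vectored_get_count, reported factor)
RUNS = {
    "None": (6401.5259, 307.8534, 20.7941),
    "max_batch_size=2": (6402.6418, 3207.7015, 1.9960),
    "max_batch_size=4": (6402.1071, 1660.0952, 3.8565),
    "max_batch_size=32": (6401.5128, 307.7692, 20.7997),
}


def batching_factor(getpage_count: float, vectored_get_count: float) -> float:
    """Average number of getpage requests served per vectored get (assumed definition)."""
    return getpage_count / vectored_get_count


for label, (getpages, vectored, reported) in RUNS.items():
    computed = batching_factor(getpages, vectored)
    print(f"{label}: computed {computed:.4f}, reported {reported:.4f}")
```

Under this reading, a factor of 1.0 means every getpage request resulted in its own vectored get, which is what the non-batchable runs show.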
2024-11-28 20:24:01 +01:00
Christian Schwarz
07358dea89
converge on approach that pushes read Result through pipeline
2024-11-28 20:06:15 +01:00
Christian Schwarz
82e1fa3f83
WIP
2024-11-27 12:31:56 +01:00
Christian Schwarz
7fb3d95596
review & identified a case that isn't handled, document that
2024-11-27 10:33:53 +01:00
Christian Schwarz
e0123c8a80
explain the pipeline cancellation story
2024-11-27 10:13:51 +01:00
Christian Schwarz
18ffaba975
fix pipeline cancellation
2024-11-26 20:44:36 +01:00
Christian Schwarz
a23abb2cc0
adopt spsc_fold
2024-11-26 13:30:40 +01:00
Christian Schwarz
99b664c9ed
expand fix to tasks mode; add some comments
2024-11-25 11:51:58 +01:00
Christian Schwarz
b9477aa945
fix: batcher wouldn't shut down after executor exits
2024-11-25 11:28:30 +01:00
Christian Schwarz
0bb037240d
logging to debug test_pageserver_restarts_under_worload
2024-11-25 10:36:48 +01:00
Christian Schwarz
d6e5a46015
eliminate the word batch and stale doc comments
2024-11-22 12:46:52 +01:00
Christian Schwarz
a28c54dac1
cosmetics
2024-11-22 12:44:31 +01:00
Christian Schwarz
ef502f8311
remove async-timer heritage
2024-11-22 12:43:55 +01:00
Christian Schwarz
c1e8347160
make configurable whether pipelining should use concurrent futures or tasks
2024-11-22 11:27:23 +01:00
Christian Schwarz
093674b2fb
implement the serial mode
2024-11-22 09:53:08 +01:00
Christian Schwarz
0fa8ae3c0a
WIP refactor to allow truly serial mode
2024-11-22 09:47:49 +01:00
Christian Schwarz
c1040bc25d
task-based mode
2024-11-22 09:36:45 +01:00
Christian Schwarz
a3d1cf636b
config changes to express pipelining config (not respected yet)
2024-11-22 08:36:17 +01:00
Christian Schwarz
88fd8aed52
watch-based approach
2024-11-21 23:03:21 +01:00
Christian Schwarz
db9093f938
revert back to 'span fixes' commit
2024-11-21 22:07:05 +01:00
Christian Schwarz
240e48df59
improvements
2024-11-21 21:57:53 +01:00
Christian Schwarz
7680aa12a8
draft
2024-11-21 21:34:58 +01:00
Christian Schwarz
56de07154e
fruitless debugging
2024-11-21 20:46:56 +01:00
Christian Schwarz
73046fdf5b
span fixes
2024-11-21 20:21:55 +01:00
Christian Schwarz
408bc8fc71
cleanups
2024-11-21 19:42:43 +01:00
Christian Schwarz
345f8b6c3b
fix ready_for_next_batch order
2024-11-21 19:11:57 +01:00
Christian Schwarz
aa1032aeff
no need for cancel & ctx in pagestream_do_batch
2024-11-21 18:40:22 +01:00
Christian Schwarz
a1bb2e7bb0
WIP: pipelined batching
2024-11-21 18:33:34 +01:00
Christian Schwarz
09e7485004
Merge branch 'problame/merge-getpage-test' into problame/batching-timer
2024-11-21 11:28:12 +01:00
Christian Schwarz
058b35f884
Merge branch 'problame/batching-benchmark' into problame/merge-getpage-test
2024-11-21 11:27:16 +01:00
Christian Schwarz
fa7ce2ca07
the final choice: async-timer 1.0beta15 with features=["tokio1"]
2024-11-21 11:15:02 +01:00
John Spray
42bda5d632
pageserver: revise metrics lifetime for SecondaryTenant (#9818)
...
## Problem
We saw a scale test failure when one shard went
secondary->attached->secondary in a short period of time -- the metrics
for the shard failed a validation assertion that is meant to ensure the
size metric matches the sum of layer sizes in the SecondaryDetail
struct.
This appears to be due to two SecondaryTenants being alive at the same
time -- the first one was shut down but still had its contributions to
the metrics.
Closes: https://github.com/neondatabase/neon/issues/9628
## Summary of changes
- Refactor code for validating metrics and call it in shutdown as well
as during downloads
- Move code for dropping per-tenant secondary metrics from drop() into
shutdown(), so that once shutdown() completes it is definitely safe to
instantiate another SecondaryTenant for the same tenant.
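The lifetime issue described above can be modelled as a minimal sketch. All names here (`SizeGauges`, `SecondaryTenantSketch`, `MetricsMismatch`) are hypothetical stand-ins, not the pageserver's types, and the real code is Rust. The point is only that removing a tenant's metric contributions in `shutdown()` makes it safe to create a successor as soon as `shutdown()` returns.

```python
from collections import defaultdict


class MetricsMismatch(Exception):
    """Raised when the size gauge disagrees with the tracked layer sizes."""


class SizeGauges:
    """Hypothetical process-wide gauge: tenant id -> resident bytes."""

    def __init__(self):
        self.resident_bytes = defaultdict(int)


class SecondaryTenantSketch:
    def __init__(self, tenant_id, gauges):
        self.tenant_id = tenant_id
        self.gauges = gauges
        self.layer_sizes = {}

    def record_download(self, layer_name, size):
        self.layer_sizes[layer_name] = size
        self.gauges.resident_bytes[self.tenant_id] += size
        self.validate_metrics()  # validated during downloads

    def validate_metrics(self):
        expected = sum(self.layer_sizes.values())
        actual = self.gauges.resident_bytes[self.tenant_id]
        if actual != expected:
            raise MetricsMismatch(f"gauge={actual}, layers={expected}")

    def shutdown(self):
        self.validate_metrics()  # validated in shutdown as well
        # Drop this tenant's contributions here rather than in a destructor.
        self.gauges.resident_bytes.pop(self.tenant_id, None)
        self.layer_sizes.clear()


gauges = SizeGauges()
first = SecondaryTenantSketch("tenant-a", gauges)
first.record_download("layer-1", 100)
first.shutdown()  # contributions are gone once this returns

second = SecondaryTenantSketch("tenant-a", gauges)
second.record_download("layer-2", 40)  # validates cleanly against a fresh gauge
```

If `first.shutdown()` were skipped, the second instance's validation would see the stale 100 bytes and raise, which mirrors the assertion failure described in the problem statement.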
2024-11-21 08:31:24 +00:00
Christian Schwarz
89b6cb8eba
Revert "vanilla tokio based timer impl based on tokio::time::Sleep"
...
This reverts commit 517dda849f.
2024-11-20 20:17:49 +01:00
Christian Schwarz
517dda849f
vanilla tokio based timer impl based on tokio::time::Sleep
2024-11-20 19:52:47 +01:00
Christian Schwarz
f22ad868cf
Revert "tokio_timerfd::Delay based impl"
...
This reverts commit fcda7a72c6.
2024-11-20 19:45:37 +01:00
Christian Schwarz
fcda7a72c6
tokio_timerfd::Delay based impl
...
Performs just as well as the async-timer::Timer features=tokio1 impl.
That makes sense because the same thing is happening under the hood.
https://www.notion.so/neondatabase/benchmarking-notes-143f189e004780c4a630cb5f426e39ba?pvs=4#144f189e004780ea9decc82281f6b8d1
2024-11-20 19:42:00 +01:00
Christian Schwarz
469ce810fc
Revert "async-timer based approach (again, with data)"
...
This reverts commit 689788cbba.
2024-11-20 19:40:24 +01:00
Christian Schwarz
21866faa8a
Revert "try async-timer 1.0.0-beta15 (still signal-based timers)"
...
This reverts commit c73e9e40e9.
2024-11-20 19:37:51 +01:00
Vlad Lazar
ee26f09e45
pageserver: remove shard split hard link assertion (#9829)
...
## Problem
We were hitting this assertion in debug mode tests sometimes.
This case was being hit when the parent shard has no resident layers.
For instance, this is the case on split retry where the previous attempt
shut down the parent and deleted local state for it. If the logical size
calculation does not download some layers before we get to the
hardlinking, then the assertion is hit.
## Summary of Changes
Remove the assertion. It's fine for the ancestor to not have any
resident layers at the time of the split.
Closes https://github.com/neondatabase/neon/issues/9412
2024-11-20 18:33:05 +00:00
Christian Schwarz
5f3e6f398c
Revert "try interval-based impl to cross-check"
...
This reverts commit 721643beed.
2024-11-20 18:52:55 +01:00
Christian Schwarz
721643beed
try interval-based impl to cross-check
...
=> zero batching
https://www.notion.so/neondatabase/benchmarking-notes-143f189e004780c4a630cb5f426e39ba?pvs=4#144f189e00478065a9b3e51726082885
2024-11-20 18:50:48 +01:00
Christian Schwarz
c73e9e40e9
try async-timer 1.0.0-beta15 (still signal-based timers)
...
Results unchanged from 0.7.4
https://www.notion.so/neondatabase/benchmarking-notes-143f189e004780c4a630cb5f426e39ba?pvs=4#144f189e004780e18416cc0faf2aca65
2024-11-20 18:32:53 +01:00
John Spray
5ff2f1ee7d
pageserver: enable compaction to proceed while live-migrating (#5397)
...
## Problem
Long ago, in #5299, the tenant states for migration were added, but
respected only in a coarse-grained way: when hinted not to do deletions,
tenants will just avoid doing all GC or compaction.
Skipping compaction is not necessary for AttachedMulti, as we will soon
become the primary attached location, and it is not a waste of resources
to proceed with compaction. Instead, per the RFC
(https://github.com/neondatabase/neon/pull/5029/files), deletions should
be queued up in this state, and executed later when we switch to
AttachedSingle.
Avoiding compaction in AttachedMulti can have an operational impact if a
tenant is under significant write load, as a long-running migration can
result in a large accumulation of delta layers with commensurate impact
on read latency.
Closes: https://github.com/neondatabase/neon/issues/5396
## Summary of changes
- Add a 'config' part to RemoteTimelineClient so that it can be aware of
the mode of the tenant it belongs to, and wire this through for
construction + updates
- Add a special buffer for delayed deletions, and when in AttachedMulti
route deletions here instead of into the main remote client queue. This
is drained when transitioning to AttachedSingle. If the tenant is
detached or our process dies before then, then these objects are leaked.
- As a quality of life improvement, also use the remote timeline
client's knowledge of the tenant state to avoid submitting remote
consistent LSN updates for validation when in AttachedStale (as we know
these will fail)
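The routing described in the summary above might look roughly like the following sketch. `RemoteClientSketch`, its method names, and the `Mode` enum are hypothetical; the real RemoteTimelineClient is Rust and its queue carries more than deletions. Sending deletions to the main queue in every mode other than AttachedMulti is an assumption of this sketch.

```python
from collections import deque
from enum import Enum


class Mode(Enum):
    ATTACHED_SINGLE = "AttachedSingle"
    ATTACHED_MULTI = "AttachedMulti"
    ATTACHED_STALE = "AttachedStale"


class RemoteClientSketch:
    def __init__(self, mode):
        self.mode = mode
        self.upload_queue = deque()   # main remote client queue
        self.delayed_deletions = []   # buffer used only while AttachedMulti

    def schedule_deletion(self, key):
        if self.mode is Mode.ATTACHED_MULTI:
            # Hold the deletion back; it is lost if the process dies first.
            self.delayed_deletions.append(key)
        else:
            # Assumption: all other modes enqueue directly.
            self.upload_queue.append(("delete", key))

    def update_mode(self, new_mode):
        self.mode = new_mode
        if new_mode is Mode.ATTACHED_SINGLE:
            # Drain the buffer in the order the deletions were requested.
            for key in self.delayed_deletions:
                self.upload_queue.append(("delete", key))
            self.delayed_deletions.clear()

    def should_submit_remote_consistent_lsn(self):
        # In AttachedStale the validation is known to fail, so skip it.
        return self.mode is not Mode.ATTACHED_STALE


client = RemoteClientSketch(Mode.ATTACHED_MULTI)
client.schedule_deletion("layer-1")           # buffered, not enqueued
client.update_mode(Mode.ATTACHED_SINGLE)      # buffer drained into the queue
print(list(client.upload_queue))
```

Because the buffer lives only in memory in this sketch, a crash or detach before the transition leaks the objects, matching the caveat in the summary.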
2024-11-20 17:31:55 +00:00
John Spray
67f5f83edc
pageserver: avoid reading SLRU blocks for GC on shards >0 (#9423)
...
## Problem
SLRU blocks, which can add up to several gigabytes, are currently
ingested by all shards, multiplying their capacity cost by the shard
count and slowing down ingest. We do this because all shards need the
SLRU pages to do timestamp->LSN lookup for GC.
Related: https://github.com/neondatabase/neon/issues/7512
## Summary of changes
- On non-zero shards, learn the GC offset from shard 0's index instead
of calculating it.
- Add a test `test_sharding_gc` that exercises this
- Do GC in test_pg_regress as a general smoke test that GC functions run
(e.g. this would fail if we were using SLRUs we didn't have)
In this PR we are still ingesting SLRUs everywhere, but not using them
any more. Part 2 PR (https://github.com/neondatabase/neon/pull/9786)
makes the change to not store them at all.
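As a rough illustration of the first bullet, the sketch below has shard 0 compute the GC cutoff and record it in its index, while other shards only read that value. The function name, the `gc_cutoff_lsn` key, the dict standing in for shard 0's index, and the "return None until published" behaviour are all assumptions made for the example, not the pageserver's actual interface.

```python
def gc_cutoff_lsn(shard_number, compute_from_slru, shard_zero_index):
    """Return the GC cutoff for one shard, or None if it is not known yet.

    compute_from_slru: callable doing the timestamp->LSN lookup (needs SLRU pages).
    shard_zero_index: dict standing in for shard 0's index contents.
    """
    if shard_number == 0:
        cutoff = compute_from_slru()
        shard_zero_index["gc_cutoff_lsn"] = cutoff  # publish for the other shards
        return cutoff
    # Non-zero shards never touch SLRU data; they learn the value from shard 0.
    return shard_zero_index.get("gc_cutoff_lsn")


def slru_unavailable():
    raise RuntimeError("non-zero shards must not read SLRU blocks")


index = {}
print(gc_cutoff_lsn(1, slru_unavailable, index))   # nothing published yet
print(gc_cutoff_lsn(0, lambda: 1234, index))       # shard 0 computes and publishes
print(gc_cutoff_lsn(3, slru_unavailable, index))   # shard 3 reads shard 0's value
```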
2024-11-20 15:56:14 +00:00
Christian Schwarz
689788cbba
async-timer based approach (again, with data)
...
Yep, it's clearly the best one with best batching factor at lowest CPU
usage.
https://www.notion.so/neondatabase/benchmarking-notes-143f189e004780c4a630cb5f426e39ba?pvs=4#144f189e004780d0a205e081458b46db
2024-11-20 15:36:10 +01:00
Christian Schwarz
f9bf038d2c
Revert "tokio_timerfd::Interval"
...
This reverts commit 12124b28d0.
2024-11-20 15:25:52 +01:00
Christian Schwarz
12124b28d0
tokio_timerfd::Interval
...
Resolution not high enough to do _any_ batching at 10us or 20us
https://www.notion.so/neondatabase/benchmarking-notes-143f189e004780c4a630cb5f426e39ba?pvs=4#144f189e0047800fb74bd8f4ab6cf8e2
2024-11-20 15:25:14 +01:00
Christian Schwarz
1d85bec0ea
Revert "tokio::time::Interval based approach"
...
This reverts commit 81d99704ee.
2024-11-20 15:13:26 +01:00
Christian Schwarz
81d99704ee
tokio::time::Interval based approach
...
Batching at 10us doesn't work well enough, probably because the future is
ready too soon. The batching factor is just 1.5.
https://www.notion.so/neondatabase/benchmarking-notes-143f189e004780c4a630cb5f426e39ba?pvs=4#144f189e004780b79c8dd6d007dbb120
2024-11-20 15:13:11 +01:00