rust/neon

mirror of https://github.com/neondatabase/neon.git synced 2026-06-01 20:40:37 +00:00

Author	SHA1	Message	Date
Christian Schwarz	bab3dd009d	Merge remote-tracking branch 'origin/main' into problame/batching-sidecar-task	2024-11-29 14:19:06 +01:00
Christian Schwarz	dfcbb139fb	the `None` configuration in the benchmark would use the default instead of the serial configuration; fix that	2024-11-29 13:35:24 +01:00
Christian Schwarz	9a5611a5ef	merge reader&batcher stages, update docs	2024-11-29 11:39:16 +01:00
Christian Schwarz	a2a3613185	reintroduce task-based execution	2024-11-28 20:50:06 +01:00
Christian Schwarz	6bd39f95f5	run benchmark on hetzner runner	2024-11-28 20:24:01 +01:00

Benchmark results (body of commit 6bd39f95f5)

test_throughput[release-pg16-50-<config id>-30-<effective_io_concurrency>-128-<workload> <pipelining_config>]
tablesize_mib is 50 MiB and readhead_buffer_size is 128 in every run. The columns from time through pageserver_cpu_seconds_total are counters.*; batching_factor is perfmetric.*.

| config id | workload | pipelining_config | pipelining_enabled | effective_io_concurrency | time | pageserver_getpage_count | pageserver_vectored_get_count | compute_getpage_count | pageserver_cpu_seconds_total | batching_factor |
|---|---|---|---|---|---|---|---|---|---|---|
| None | not batchable | None | 0 | 1 | 0.8905 | 6,403.0000 | 6,403.0000 | 6,403.0000 | 0.8633 | 1.0000 |
| pipelining_config1 | not batchable | {'max_batch_size': 1} | 1 | 1 | 0.9195 | 6,403.0000 | 6,403.0000 | 6,403.0000 | 0.8925 | 1.0000 |
| pipelining_config2 | not batchable | {'max_batch_size': 32} | 1 | 1 | 0.8724 | 6,403.0000 | 6,403.0000 | 6,403.0000 | 0.8406 | 1.0000 |
| None | batchable | None | 0 | 100 | 0.2576 | 6,401.5259 | 307.8534 | 6,401.5259 | 0.3043 | 20.7941 |
| pipelining_config4 | batchable | {'max_batch_size': 1} | 1 | 100 | 0.6187 | 6,403.0000 | 6,403.0000 | 6,403.0000 | 0.7473 | 1.0000 |
| pipelining_config5 | batchable | {'max_batch_size': 2} | 1 | 100 | 0.4419 | 6,402.6418 | 3,207.7015 | 6,402.6418 | 0.5391 | 1.9960 |
| pipelining_config6 | batchable | {'max_batch_size': 4} | 1 | 100 | 0.3569 | 6,402.1071 | 1,660.0952 | 6,402.1071 | 0.4244 | 3.8565 |
| pipelining_config7 | batchable | {'max_batch_size': 8} | 1 | 100 | 0.2977 | 6,401.7700 | 886.6900 | 6,401.7700 | 0.3511 | 7.2199 |
| pipelining_config8 | batchable | {'max_batch_size': 16} | 1 | 100 | 0.2697 | 6,401.5946 | 500.5766 | 6,401.5946 | 0.3195 | 12.7884 |
| pipelining_config9 | batchable | {'max_batch_size': 32} | 1 | 100 | 0.2548 | 6,401.5128 | 307.7692 | 6,401.5128 | 0.3015 | 20.7997 |

test_latency[release-pg16-<config id>-<pipelining_config>]

| config id | pipelining_config | latency_mean | p95 | p99 | p99.9 | p99.99 |
|---|---|---|---|---|---|---|
| None | None | 0.127 ms | 0.166 ms | 0.187 ms | 0.292 ms | 0.624 ms |
| pipelining_config1 | {'max_batch_size': 1} | 0.139 ms | 0.175 ms | 0.200 ms | 0.444 ms | 0.658 ms |
| pipelining_config2 | {'max_batch_size': 32} | 0.119 ms | 0.155 ms | 0.172 ms | 0.267 ms | 0.587 ms |
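The reported `batching_factor` values are consistent with dividing `pageserver_getpage_count` by `pageserver_vectored_get_count` (for example, 6,401.5128 / 307.7692 ≈ 20.80 for `max_batch_size` 32). That relationship is inferred from the numbers in the results and is not stated in the commit message. A minimal Rust sketch of the arithmetic, using a hypothetical helper name:

```rust
/// Hypothetical helper (not from the neon codebase): average number of
/// getpage requests served per vectored get call.
/// Returns `None` when no vectored gets were recorded, to avoid dividing by zero.
fn batching_factor(getpage_count: f64, vectored_get_count: f64) -> Option<f64> {
    if vectored_get_count > 0.0 {
        Some(getpage_count / vectored_get_count)
    } else {
        None
    }
}
```

With the unbatched runs (6,403 getpage requests, 6,403 vectored gets) the ratio is exactly 1, matching the `1.0000` entries.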
Christian Schwarz	07358dea89	converge on approach that pushes read Result through pipeline	2024-11-28 20:06:15 +01:00
Christian Schwarz	82e1fa3f83	WIP	2024-11-27 12:31:56 +01:00
Christian Schwarz	7fb3d95596	review & identified a cast that isn't handled, document that	2024-11-27 10:33:53 +01:00
Christian Schwarz	e0123c8a80	explain the pipeline cancellation story	2024-11-27 10:13:51 +01:00
Christian Schwarz	18ffaba975	fix pipeline cancellation	2024-11-26 20:44:36 +01:00
Christian Schwarz	a23abb2cc0	adopt spsc_fold	2024-11-26 13:30:40 +01:00
Christian Schwarz	99b664c9ed	expand fix to tasks mode; add some comments	2024-11-25 11:51:58 +01:00
Christian Schwarz	b9477aa945	fix: batcher wouldn't shut down after executor exits	2024-11-25 11:28:30 +01:00
Christian Schwarz	0bb037240d	logging to debug test_pageserver_restarts_under_worload	2024-11-25 10:36:48 +01:00
Christian Schwarz	450be26bbb	fast imports: initial Importer and Storage changes (#9218) Co-authored-by: Heikki Linnakangas <heikki@neon.tech> Co-authored-by: Stas Kelvic <stas@neon.tech> # Context This PR contains PoC-level changes for a product feature that allows onboarding large databases into Neon without going through the regular data path. # Changes This internal RFC provides all the context * https://github.com/neondatabase/cloud/pull/19799 In the language of the RFC, this PR covers * the Importer code (`fast_import`) * all the Pageserver changes (mgmt API changes, flow implementation, etc) * a basic test for the Pageserver changes # Reviewing As acknowledged in the RFC, the code added in this PR is not ready for general availability. Also, the architecture is not to be discussed in this PR, but in the RFC and associated Slack channel instead. Reviewers of this PR should take that into consideration. The quality bar to apply during review depends on what area of the code is being reviewed: * Importer code (`fast_import`): practically anything goes * Core flow (`flow.rs`): * Malicious input data must be expected and the existing threat models apply. * The code must be safe to execute on dedicated Pageserver instances: * This means in particular that tenants on other Pageserver instances must not be affected negatively wrt data confidentiality, integrity or availability. * Other code: the usual quality bar * Pay special attention to correct use of gate guards, timeline cancellation in all places during shutdown & migration, etc. * Consider the broader system impact; if you find potentially problematic interactions with Storage features that were not covered in the RFC, bring that up during the review. I recommend submitting three separate reviews, for the three high-level areas with different quality bars. 
# References (Internal-only) * refs https://github.com/neondatabase/cloud/issues/17507 * refs https://github.com/neondatabase/company_projects/issues/293 * refs https://github.com/neondatabase/company_projects/issues/309 * refs https://github.com/neondatabase/cloud/issues/20646 --------- Co-authored-by: Stas Kelvich <stas.kelvich@gmail.com> Co-authored-by: Heikki Linnakangas <heikki@neon.tech> Co-authored-by: John Spray <john@neon.tech>	2024-11-22 22:47:06 +00:00
Alex Chi Z.	c1937d073f	fix(pageserver): ensure upload happens after delete (#9844) ## Problem Follow up of https://github.com/neondatabase/neon/pull/9682, that patch didn't fully address the problem: what if shutdown fails due to whatever reason and then we reattach the tenant? Then we will still remove the future layer. The underlying problem is that the fix for #5878 gets voided because of the generation optimizations. Of course, we also need to ensure that delete happens after uploads, but note that we only schedule deletes when there are no ongoing upload tasks, so that's fine. ## Summary of changes * Add a test case to reproduce the behavior (by changing the original test case to attach the same generation). * If layer upload happens after the deletion, drain the deletion queue before uploading. * If blocked_deletion is enabled, directly remove it from the blocked_deletion queue. * Local fs backend fix to avoid race between deletion and preload. * test_emergency_mode does not need to wait for uploads (and it's generally not possible to wait for uploads). * ~~Optimize deletion executor to skip validation if there are no files to delete.~~ this doesn't work --------- Signed-off-by: Alex Chi Z <chi@neon.tech>	2024-11-22 18:30:53 +00:00
Erik Grinaker	e939d36dd4	safekeeper,pageserver: fix CPU profiling allowlists (#9856) ## Problem The HTTP router allowlists matched both on the path and the query string. This meant that only `/profile/cpu` would be allowed without auth, while `/profile/cpu?format=svg` would require auth. Follows #9764. ## Summary of changes * Match allowlists on URI path, rather than the entire URI. * Fix the allowlist for Safekeeper to use `/profile/cpu` rather than the old `/pprof/profile`. * Just use a constant slice for the allowlist; it's only a handful of items, and these handlers are not on hot paths.	2024-11-22 17:50:33 +00:00
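The difference between matching an allowlist on the entire URI and matching on the path alone can be sketched as follows. This is an illustrative Rust example, not the router code from the commit: the constant, the function names, and the hand-rolled path extraction are all assumptions made for the sketch.

```rust
/// Hypothetical allowlist of paths served without auth (a constant slice,
/// as the commit message suggests; the contents here are illustrative).
const ALLOWLIST: &[&str] = &["/profile/cpu"];

/// Strip the query string (and any fragment) from a request URI,
/// leaving only the path component.
fn uri_path(uri: &str) -> &str {
    let end = uri
        .find(|c: char| c == '?' || c == '#')
        .unwrap_or(uri.len());
    &uri[..end]
}

/// Behaviour described as the problem: compare the entire URI,
/// so any query string defeats the match.
fn allowed_by_full_uri(uri: &str) -> bool {
    ALLOWLIST.contains(&uri)
}

/// Behaviour described as the fix: compare only the path.
fn allowed_by_path(uri: &str) -> bool {
    ALLOWLIST.contains(&uri_path(uri))
}
```

Under the full-URI check, `/profile/cpu?format=svg` is rejected even though `/profile/cpu` is allowed; the path-only check accepts both.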
Christian Schwarz	d6e5a46015	eliminate the word `batch` and stale doc comments	2024-11-22 12:46:52 +01:00
Christian Schwarz	a28c54dac1	cosmetics	2024-11-22 12:44:31 +01:00
Christian Schwarz	ef502f8311	remove async-timer heritage	2024-11-22 12:43:55 +01:00
Christian Schwarz	c1e8347160	make configurable whether pipelining should use concurrent futures or tasks	2024-11-22 11:27:23 +01:00
John Spray	d9de65ee8f	pageserver: permit reads behind GC cutoff during LSN grace period (#9833) ## Problem In https://github.com/neondatabase/neon/issues/9754 and the flakiness of `test_readonly_node_gc`, we saw that although our logic for controlling GC was sound, the validation of getpage requests was not, because it could not consider LSN leases when requests arrived shortly after restart. Closes https://github.com/neondatabase/neon/issues/9754 ## Summary of changes This is the "Option 3" discussed verbally -- rather than holding back gc cutoff, we waive the usual validation of request LSN if we are still waiting for leases to be sent after startup - When validating LSN in `wait_or_get_last_lsn`, skip the validation relative to GC cutoff if the timeline is still in its LSN lease grace period - Re-enable test_readonly_node_gc	2024-11-22 09:24:23 +00:00
Christian Schwarz	093674b2fb	implement the serial mode	2024-11-22 09:53:08 +01:00
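The rule described in this commit, waiving the GC-cutoff check while the timeline is still in its LSN lease grace period, can be illustrated with a small Rust sketch. The types and the function below are simplified, hypothetical stand-ins and do not reproduce the actual `wait_or_get_last_lsn` logic.

```rust
/// Hypothetical, simplified stand-in for a log sequence number.
#[derive(Debug, Clone, Copy, PartialEq, Eq, PartialOrd, Ord)]
struct Lsn(u64);

/// Outcome of validating a request LSN (illustrative only).
#[derive(Debug, PartialEq, Eq)]
enum LsnCheck {
    Ok,
    BehindGcCutoff,
}

/// Validate a request LSN against the GC cutoff, skipping the check
/// while leases may still be arriving after startup.
fn validate_request_lsn(
    request_lsn: Lsn,
    gc_cutoff: Lsn,
    in_lease_grace_period: bool,
) -> LsnCheck {
    if in_lease_grace_period {
        // Leases may not have been re-sent yet, so the cutoff check is waived.
        return LsnCheck::Ok;
    }
    if request_lsn < gc_cutoff {
        LsnCheck::BehindGcCutoff
    } else {
        LsnCheck::Ok
    }
}
```

The same request that would be rejected after the grace period is accepted during it, which is the behaviour the commit title describes.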
Christian Schwarz	0fa8ae3c0a	WIP refactor to allow truly serial mode	2024-11-22 09:47:49 +01:00
Christian Schwarz	c1040bc25d	task-based mode	2024-11-22 09:36:45 +01:00
Christian Schwarz	a3d1cf636b	config changes to express pipelining config (not respected yet)	2024-11-22 08:36:17 +01:00
Christian Schwarz	88fd8aed52	watch-based approach	2024-11-21 23:03:21 +01:00
Christian Schwarz	db9093f938	revert back to 'span fixes' commit	2024-11-21 22:07:05 +01:00
Christian Schwarz	240e48df59	improvements	2024-11-21 21:57:53 +01:00
Christian Schwarz	7680aa12a8	draft	2024-11-21 21:34:58 +01:00
Christian Schwarz	56de07154e	fruitless debugging	2024-11-21 20:46:56 +01:00
Christian Schwarz	73046fdf5b	span fixes	2024-11-21 20:21:55 +01:00
Erik Grinaker	190e8cebac	safekeeper,pageserver: add CPU profiling (#9764) ## Problem We don't have a convenient way to gather CPU profiles from a running binary, e.g. during production incidents or end-to-end benchmarks, nor during microbenchmarks (particularly on macOS). We would also like to have continuous profiling in production, likely using [Grafana Cloud Profiles](https://grafana.com/products/cloud/profiles-for-continuous-profiling/). We may choose to use either eBPF profiles or pprof profiles for this (pending testing and discussion with SREs), but pprof profiles appear useful regardless for the reasons listed above. See https://github.com/neondatabase/cloud/issues/14888. This PR is intended as a proof of concept, to try it out in staging and drive further discussions about profiling more broadly. Touches #9534. Touches https://github.com/neondatabase/cloud/issues/14888. ## Summary of changes Adds a HTTP route `/profile/cpu` that takes a CPU profile and returns it. Defaults to a 5-second pprof Protobuf profile for use with e.g. `pprof` or Grafana Alloy, but can also emit an SVG flamegraph. Query parameters: * `format`: output format (`pprof` or `svg`) * `frequency`: sampling frequency in Hz (default 100) * `seconds`: number of seconds to profile (default 5) Also integrates pprof profiles into Criterion benchmarks, such that flamegraph reports can be taken with `cargo bench ... --profile-duration <seconds>`. Output under `target/criterion//profile/flamegraph.svg`. Example profiles: pprof profile (use [`pprof`](https://github.com/google/pprof)): [profile.pb.gz](https://github.com/user-attachments/files/17756788/profile.pb.gz) * Web interface: `pprof -http :6060 profile.pb.gz` * Interactive flamegraph: [profile.svg.gz](https://github.com/user-attachments/files/17756782/profile.svg.gz)	2024-11-21 18:59:46 +00:00
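The query parameters listed for `/profile/cpu` and their defaults could be parsed along the following lines. This is a hedged Rust sketch using only the standard library; the `ProfileParams` struct, the `Format` enum, and `parse_profile_params` are hypothetical names and do not reflect how the handler in the commit is written.

```rust
/// Output formats named in the commit message.
#[derive(Debug, PartialEq, Eq)]
enum Format {
    Pprof,
    Svg,
}

/// Hypothetical container for the three documented query parameters.
#[derive(Debug, PartialEq, Eq)]
struct ProfileParams {
    format: Format,
    frequency: u32,
    seconds: u64,
}

/// Parse a raw query string such as "format=svg&seconds=10",
/// falling back to the documented defaults (pprof, 100, 5).
fn parse_profile_params(query: &str) -> Result<ProfileParams, String> {
    let mut params = ProfileParams {
        format: Format::Pprof,
        frequency: 100,
        seconds: 5,
    };
    for pair in query.split('&').filter(|p| !p.is_empty()) {
        let (key, value) = pair.split_once('=').unwrap_or((pair, ""));
        match key {
            "format" => {
                params.format = match value {
                    "pprof" => Format::Pprof,
                    "svg" => Format::Svg,
                    other => return Err(format!("unknown format: {other}")),
                }
            }
            "frequency" => {
                params.frequency = value
                    .parse::<u32>()
                    .map_err(|e| format!("bad frequency: {e}"))?
            }
            "seconds" => {
                params.seconds = value
                    .parse::<u64>()
                    .map_err(|e| format!("bad seconds: {e}"))?
            }
            // Unknown keys are ignored in this sketch.
            _ => {}
        }
    }
    Ok(params)
}
```

An empty query yields the defaults, so a bare request to `/profile/cpu` corresponds to a 5-second pprof profile as described in the commit message.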
Christian Schwarz	408bc8fc71	cleanups	2024-11-21 19:42:43 +01:00
Christian Schwarz	345f8b6c3b	fix ready_for_next_batch order	2024-11-21 19:11:57 +01:00
Christian Schwarz	aa1032aeff	no need for cancel & ctx in pagestream_do_batch	2024-11-21 18:40:22 +01:00
Christian Schwarz	a1bb2e7bb0	WIP: pipelined batching	2024-11-21 18:33:34 +01:00
Christian Schwarz	09e7485004	Merge branch 'problame/merge-getpage-test' into problame/batching-timer	2024-11-21 11:28:12 +01:00
Christian Schwarz	058b35f884	Merge branch 'problame/batching-benchmark' into problame/merge-getpage-test	2024-11-21 11:27:16 +01:00
Christian Schwarz	fa7ce2ca07	the final choice: async-timer 1.0beta15 with features=["tokio1"]	2024-11-21 11:15:02 +01:00
John Spray	42bda5d632	pageserver: revise metrics lifetime for SecondaryTenant (#9818) ## Problem We saw a scale test failure when one shard went secondary->attached->secondary in a short period of time -- the metrics for the shard failed a validation assertion that is meant to ensure the size metric matches the sum of layer sizes in the SecondaryDetail struct. This appears to be due to two SecondaryTenants being alive at the same time -- the first one was shut down but still had its contributions to the metrics. Closes: https://github.com/neondatabase/neon/issues/9628 ## Summary of changes - Refactor code for validating metrics and call it in shutdown as well as during downloads - Move code for dropping per-tenant secondary metrics from drop() into shutdown(), so that once shutdown() completes it is definitely safe to instantiate another SecondaryTenant for the same tenant.	2024-11-21 08:31:24 +00:00
Christian Schwarz	89b6cb8eba	Revert "vanilla tokio based timer impl based on tokio::time::Sleep" This reverts commit `517dda849f`.	2024-11-20 20:17:49 +01:00
Christian Schwarz	517dda849f	vanilla tokio based timer impl based on tokio::time::Sleep	2024-11-20 19:52:47 +01:00
Christian Schwarz	f22ad868cf	Revert "tokio_timerfd::Delay based impl" This reverts commit `fcda7a72c6`.	2024-11-20 19:45:37 +01:00
Christian Schwarz	fcda7a72c6	tokio_timerfd::Delay based impl Performs identically great to the async-timer::Timer features=tokio1 impl Makes sense because it's the same thing that's happening under the hood. https://www.notion.so/neondatabase/benchmarking-notes-143f189e004780c4a630cb5f426e39ba?pvs=4#144f189e004780ea9decc82281f6b8d1	2024-11-20 19:42:00 +01:00
Christian Schwarz	469ce810fc	Revert "async-timer based approach (again, with data)" This reverts commit `689788cbba`.	2024-11-20 19:40:24 +01:00
Christian Schwarz	21866faa8a	Revert "try async-timer 1.0.0-beta15 (still signal-based timers)" This reverts commit `c73e9e40e9`.	2024-11-20 19:37:51 +01:00
Vlad Lazar	ee26f09e45	pageserver: remove shard split hard link assertion (#9829) ## Problem We were hitting this assertion in debug mode tests sometimes. This case was being hit when the parent shard has no resident layers. For instance, this is the case on split retry where the previous attempt shut-down the parent and deleted local state for it. If the logical size calculation does not download some layers before we get to the hardlinking, then the assertion is hit. ## Summary of Changes Remove the assertion. It's fine for the ancestor to not have any resident layers at the time of the split. Closes https://github.com/neondatabase/neon/issues/9412	2024-11-20 18:33:05 +00:00
Christian Schwarz	5f3e6f398c	Revert "try interval-based impl to cross-check" This reverts commit `721643beed`.	2024-11-20 18:52:55 +01:00
Christian Schwarz	721643beed	try interval-based impl to cross-check => zero batching https://www.notion.so/neondatabase/benchmarking-notes-143f189e004780c4a630cb5f426e39ba?pvs=4#144f189e00478065a9b3e51726082885	2024-11-20 18:50:48 +01:00

2614 Commits