rust/neon

mirror of https://github.com/neondatabase/neon.git synced 2026-07-05 05:00:37 +00:00

Author	SHA1	Message	Date
Arpad Müller	9ba2a87e69	storcon: sk heartbeat fixes (#10891 ) This PR does the following things: * The initial heartbeat round blocks the storage controller from becoming online again. If all safekeepers are unresponsive, this can cause storage controller startup to be very slow. The original intent of #10583 was that heartbeats don't affect normal functionality of the storage controller. So add a short timeout to prevent it from impeding storcon functionality. * Fix the URL of the utilization endpoint. * Don't send heartbeats to safekeepers which are decommissioned. Part of https://github.com/neondatabase/neon/issues/9011 context: https://neondb.slack.com/archives/C033RQ5SPDH/p1739966807592589	2025-02-19 16:57:11 +00:00
Alex Chi Z.	1f9511dbd9	feat(pageserver): yield image creation to L0 compactions across timelines (#10877 ) ## Problem A simpler version of https://github.com/neondatabase/neon/pull/10812 ## Summary of changes Image layer creation will be preempted by L0 accumulated on other timelines. We stop image layer generation if there's a pending L0 compaction request. --------- Signed-off-by: Alex Chi Z <chi@neon.tech>	2025-02-19 15:10:12 +00:00
Erik Grinaker	aab5482fd5	storcon: add CPU/heap profiling endpoints (#10894 ) Adds CPU/heap profiling for storcon. Also fixes allowlists to match on the path only, since profiling endpoints take query parameters. Requires #10892 for heap profiling.	2025-02-19 14:43:29 +00:00
Erik Grinaker	3720cf1c5a	storcon: use jemalloc (#10892 ) ## Problem We'd like to enable CPU/heap profiling for storcon. This requires jemalloc. ## Summary of changes Use jemalloc as the global allocator, and enable heap sampling for profiling.	2025-02-19 14:20:51 +00:00
Erik Grinaker	0453eaf65c	pageserver: reduce default `compaction_upper_limit` to 20 (#10889 ) ## Problem We've seen the previous default of 50 cause OOMs. Compacting many L0 layers at once now has limited benefit, since the cost is mostly linear anyway. This is already being reduced to 20 in production settings. ## Summary of changes Reduce `DEFAULT_COMPACTION_UPPER_LIMIT` to 20. Once released, let's remove the config overrides.	2025-02-19 14:12:05 +00:00
Heikki Linnakangas	2d96134a4e	Remove unused dependencies (#10887 ) Per cargo machete.	2025-02-19 14:09:01 +00:00
JC Grünhage	e52e93797f	refactor(ci): use variables for AWS account IDs (#10886 ) ## Problem Our AWS account IDs are copy-pasted all over the place. A wrong paste might only be caught late if we hardcode them, but will get flagged instantly by actionlint if we access them from github actions variables. Resolves https://github.com/neondatabase/neon/issues/10787, follow-up for https://github.com/neondatabase/neon/pull/10613. ## Summary of changes Access AWS account IDs using Github Actions variables.	2025-02-19 12:34:41 +00:00
Erik Grinaker	aa115a774c	storcon: eagerly attempt autosplits (#10849 ) ## Problem Autosplits are crucial for bulk ingest performance. However, autosplits were only attempted when there was no other pending work. This could cause e.g. mass AZ affinity violations following Pageserver restarts to starve out autosplits for hours. Resolves #10762. ## Summary of changes Always attempt autosplits in the background reconciliation loop, regardless of other pending work.	2025-02-19 09:01:02 +00:00
Peter Bendel	2f0d6571a9	add a variant to ingest benchmark with shard-splitting disabled (#10876 ) ## Problem We measure ingest performance for a few variants (stripe-sizes, pre-sharded, shard-split). However, some phenomena (e.g. related to L0 compaction) in PS can be better observed and optimized with un-sharded tenants. ## Summary of changes - Allow creating projects with a policy that disables sharding (`{"scheduling": "Essential"}`) - Add a variant to ingest_benchmark that uses that policy for the new project ## Test run https://github.com/neondatabase/neon/actions/runs/13396325970	2025-02-19 08:43:53 +00:00
a-masterov	7199919f04	Fix the problems discovered in the upgrade test (#10826 ) ## Problem The nightly test discovered problems in the extensions upgrade test. 1. `PLv8` has different versions on PGv17 and PGv16 and a different test set, which was not implemented correctly [sample](https://github.com/neondatabase/neon/actions/runs/13382330475/job/37372930271) 2. The same for `semver` [sample](https://github.com/neondatabase/neon/actions/runs/13382330475/job/37372930017) 3. `pgtap` interfered with the other tests, e.g. tables, created by other extensions caused the tests to fail. ## Summary of changes The discovered problems were fixed. 1. The tests list for `PLv8` is now generated using the original Makefile 2. The patches for `semver` are now split for PGv16 and PGv17. 3. `pgtap` is being tested in a separate database now. --------- Co-authored-by: Mikhail Kot <mikhail@neon.tech>	2025-02-19 06:40:09 +00:00
Alex Chi Z.	a4e3989c8d	fix(pageserver): make repartition error critical (#10872 ) ## Problem Read errors during repartition should be a critical error. ## Summary of changes <del>We only have one call site</del> We have two call sites of `repartition` where one of them is during the initial image upload optimization and another is during image layer creation, so I added a `critical!` here instead of inside `collect_keyspace`. --------- Signed-off-by: Alex Chi Z <chi@neon.tech>	2025-02-18 20:19:23 +00:00
Peter Bendel	9d074db18d	Use link to cross-service-endpoint dashboard in allure reports and benchmarking workflow logs (#10874 ) ## Problem We have links to deprecated dashboards in our logs Example https://github.com/neondatabase/neon/actions/runs/13382454571/job/37401983608#step:8:348 ## Summary of changes Use link to cross service endpoint instead. Example: https://github.com/neondatabase/neon/actions/runs/13395407925/job/37413056148#step:7:345	2025-02-18 19:54:21 +00:00
Alex Chi Z.	538ea03f73	feat(pageserver): allow read path debug in getpagelsn API (#10748 ) ## Problem The usual workflow for me to debug read path errors in staging is: download the tenant to my laptop, import, and then run some read tests. With this patch, we can do this directly over staging pageservers. ## Summary of changes * Add a new `touchpagelsn` API that does a page read but does not return page info back. * Allow read from latest record LSN from get/touchpagelsn * Add read_debug config in the context. * The read path will read the context config to decide whether to enable read path tracing or not. Signed-off-by: Alex Chi Z <chi@neon.tech>	2025-02-18 18:54:53 +00:00
Erik Grinaker	cb8060545d	pageserver: don't log noop image compaction (#10873 ) ## Problem We log image compaction stats even when no image compaction happened. This is logged every 10 seconds for every timeline. ## Summary of changes Only log when we actually performed any image compaction.	2025-02-18 17:49:01 +00:00
JC Grünhage	9151d3a318	feat(ci): notify storage oncall if deploy job fails on release branch (#10865 ) ## Problem If the deploy job on the release branch doesn't succeed, the preprod deployment will not have happened. It was requested that this triggers a notification in https://github.com/neondatabase/neon/issues/10662. ## Summary of changes If we're on the release branch and the deploy job doesn't end up in "success", notify storage oncall on slack.	2025-02-18 17:20:03 +00:00
Anastasia Lubennikova	381115b68e	Add pgaudit and pgauditlogtofile extensions (#10763 ) to compute image. This commit doesn't enable anything yet. It is a preparatory work for enabling audit logging in computes.	2025-02-18 16:32:32 +00:00
Vlad Lazar	1a69a8cba7	storage: add APIs for warming up location after cold migrations (#10788 ) ## Problem We lack an API for warming up attached locations based on the heatmap contents. This is problematic in two places: 1. If we manually migrate and cut over while the secondary is still cold 2. When we re-attach a previously offloaded tenant ## Summary of changes https://github.com/neondatabase/neon/pull/10597 made heatmap generation additive across migrations, so we won't clobber it after a cold migration. This allows us to implement: 1. An endpoint for downloading all missing heatmap layers on the pageserver: `/v1/tenant/:tenant_shard_id/timeline/:timeline_id/download_heatmap_layers`. Only one such operation per timeline is allowed at any given time. The granularity is tenant shard. 2. An endpoint to the storage controller to trigger the downloads on the pageserver: `/v1/tenant/:tenant_shard_id/timeline/:timeline_id/download_heatmap_layers`. This works both at tenant and tenant shard level. If an unsharded tenant id is provided, the operation is started on all shards, otherwise only the specified shard. 3. A storcon cli command. Again, tenant and tenant-shard level granularities are supported. Cplane will call into storcon and trigger the downloads for all shards. When we want to rescue a migration, we will use storcon cli targeting the specific tenant shard. Related: https://github.com/neondatabase/neon/issues/10541	2025-02-18 16:09:06 +00:00
Alex Chi Z.	ed98f6d57e	feat(pageserver): log lease request (#10832 ) ## Problem To investigate https://github.com/neondatabase/cloud/issues/23650 ## Summary of changes We log lease requests to see why there are clients accessing things below gc_cutoff. Signed-off-by: Alex Chi Z <chi@neon.tech>	2025-02-18 16:06:39 +00:00
Alex Chi Z.	f9a063e2e9	test(pageserver): fix test_pageserver_gc_compaction_idempotent (#10833 ) ## Problem ref https://github.com/neondatabase/neon/issues/10517 ## Summary of changes For some reason, the job split algorithm decides on different image coverage ranges for the two compactions before/after restart. So we remove the subcompaction key range and let it generate an image covering the full range, which should make the test more stable. Also slightly tuned the logging span. --------- Signed-off-by: Alex Chi Z <chi@neon.tech>	2025-02-18 16:06:20 +00:00
Heikki Linnakangas	f36ec5c84b	chore(compute): Postgres 17.4, 16.8, 15.12 and 14.17 (#10868 ) Update all minor versions. No conflicts. Postgres repository PRs: - https://github.com/neondatabase/postgres/pull/584 - https://github.com/neondatabase/postgres/pull/583 - https://github.com/neondatabase/postgres/pull/582 - https://github.com/neondatabase/postgres/pull/581	2025-02-18 15:56:43 +00:00
Alexander Bayandin	274cb13293	test_runner: fix mismatch versions tests on linux (#10869 ) ## Problem Tests with mixed-version binaries always use the latest binaries on CI ([an example](https://neon-github-public-dev.s3.amazonaws.com/reports/pr-10848/13378137061/index.html#suites/8fc5d1648d2225380766afde7c428d81/1ccefc4cfd4ef176/)): The versions of new `storage_broker` and old `pageserver` are the same: `b45254a5605f6fdafdf475cdd3e920fe00898543`. This affects only Linux, on macOS the version mixed correctly. ## Summary of changes - Use hardlinks instead of symlinks to create a directory with mixed-version binaries	2025-02-18 15:52:00 +00:00
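The difference between the two link types in the fix above can be illustrated with a short sketch. The commit message does not say why symlinks picked up the wrong binaries on Linux; one plausible reason, stated here purely as an assumption, is that a program which locates sibling binaries through its resolved path follows a symlink back to the original directory, while a hardlink has no target to follow. The directory and file names below are hypothetical.

```python
import os
import tempfile


def link_demo():
    """Compare where a symlink and a hardlink resolve to.

    Returns a pair of booleans:
    (symlink resolves into the original dir, hardlink stays in the mixed dir).
    """
    with tempfile.TemporaryDirectory() as root:
        root = os.path.realpath(root)
        old_dir = os.path.join(root, "old")      # hypothetical "old version" dir
        mixed_dir = os.path.join(root, "mixed")  # hypothetical mixed-version dir
        os.mkdir(old_dir)
        os.mkdir(mixed_dir)

        target = os.path.join(old_dir, "pageserver")
        with open(target, "w") as f:
            f.write("old build")

        sym = os.path.join(mixed_dir, "pageserver_symlink")
        hard = os.path.join(mixed_dir, "pageserver_hardlink")
        os.symlink(target, sym)  # a pointer to the path in old_dir
        os.link(target, hard)    # a second directory entry for the same file

        # Resolving the symlink leads back to old_dir; the hardlink is a
        # regular directory entry, so its resolved path stays in mixed_dir.
        sym_in_old = os.path.dirname(os.path.realpath(sym)) == old_dir
        hard_in_mixed = os.path.dirname(os.path.realpath(hard)) == mixed_dir
        return sym_in_old, hard_in_mixed
```

Note that hardlinks only work within a single filesystem, so both directories have to live on the same mount.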
Alex Chi Z.	290f007b8e	Revert "feat(pageserver): repartition on L0-L1 boundary (#10548 )" (#10870 ) This reverts commit `443c8d0b4b`. ## Problem We observe a massive amount of compaction errors. ## Summary of changes If the tenant did not write any L1 layers (i.e., they accumulate L0 layers where number of them is below L0 threshold), image creation will always fail. Therefore, it's not correct to simply use the disk_consistent_lsn or L0/L1 boundary for the image creation.	2025-02-18 15:43:33 +00:00
Alexander Lakhin	29e4ca351e	Pass asan/ubsan options to pg_dump/pg_restore started by fast_import (#10866 )	2025-02-18 15:41:20 +00:00
Arpad Müller	caece02da7	move pull_timeline to safekeeper_api and add SafekeeperGeneration (#10863 ) Preparations for a successor of #10440: * Move `pull_timeline` to `safekeeper_api` and add it to `SafekeeperClient`. We want to do `pull_timeline` on any creations that we couldn't do initially. * Add a `SafekeeperGeneration` type instead of relying on a type alias. We now want to maintain a safekeeper-specific generation number in the storcon database. A separate type is important to make it impossible to mix it up with the tenant's pageserver-specific generation number. We absolutely want to avoid that for correctness reasons. If someone mixes up a safekeeper and pageserver id (both use the `NodeId` type), that's bad, but there are no wrong generations flying around. Part of #9011	2025-02-18 14:02:22 +00:00
Arseny Sher	d36baae758	Add gc_blocking and restore latest_gc_cutoff in openapi spec (#10867 ) ## Problem gc_blocking is missing in the tenant info, but cplane wants to use it. Also, https://github.com/neondatabase/neon/pull/10707/ removed latest_gc_cutoff from the spec, renaming it to applied_gc_cutoff. Temporarily get it back until cplane migrates. ## Summary of changes Add them. ref https://neondb.slack.com/archives/C03438W3FLZ/p1739877734963979	2025-02-18 13:57:12 +00:00
Alexander Lakhin	f81259967d	Add test to make sure sanitizers really work when expected (#10838 )	2025-02-18 13:23:18 +00:00
Conrad Ludgate	719ec378cd	fix(local_proxy): discard all in tx (#10864 ) ## Problem `discard all` cannot run in a transaction (even if implicit) ## Summary of changes Split up the query into two, we don't need transaction support.	2025-02-18 08:54:20 +00:00
Alexander Bayandin	27241f039c	test_runner: fix `neon_local` usage for version mismatch tests (#10859 ) ## Problem Tests with mixed versions of binaries always pick up new versions if services are started using `neon_local`. ## Summary of changes - Set `neon_local_binpath` along with `neon_binpath` and `pg_distrib_dir` for tests with mixed versions	2025-02-17 20:29:14 +00:00
Heikki Linnakangas	811506aaa2	fast_import: Use rust s3 client for uploading (#10777 ) This replaces the use of the awscli utility. awscli binary is massive, it added about 200 MB to the docker image size, while the s3 client was already a dependency so using that is essentially free, as far as binary size is concerned. I implemented a simple upload function that tries to keep 10 uploads going in parallel. I believe that's the default behavior of the "aws s3 sync" command too.	2025-02-17 20:07:31 +00:00
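The "keep 10 uploads going in parallel" idea above can be sketched with a semaphore. This is an illustration only and not the Rust code from the commit: `upload_all`, `upload_one`, and `demo` are hypothetical names, and the upload itself is simulated with a short sleep.

```python
import asyncio


async def upload_all(paths, upload_one, max_in_flight=10):
    """Run upload_one(path) for every path, at most max_in_flight at a time."""
    sem = asyncio.Semaphore(max_in_flight)

    async def guarded(path):
        # Each task waits here until one of the slots is free.
        async with sem:
            return await upload_one(path)

    # gather() returns results in the same order as the input paths.
    return await asyncio.gather(*(guarded(p) for p in paths))


async def demo():
    """Simulate 25 uploads and record the peak number running at once."""
    in_flight = 0
    peak = 0

    async def fake_upload(path):
        nonlocal in_flight, peak
        in_flight += 1
        peak = max(peak, in_flight)
        await asyncio.sleep(0.01)  # stand-in for the network transfer
        in_flight -= 1
        return path

    paths = [f"file-{i}" for i in range(25)]
    results = await upload_all(paths, fake_upload)
    return peak, results
```

Running `asyncio.run(demo())` returns the observed peak concurrency together with the uploaded paths; the peak never exceeds `max_in_flight`.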
Heikki Linnakangas	2884917bd4	compute: Allow postgres user to power off the VM also on <= v16 (#10860 ) I did this for debian bookworm variant in PR #10710, but forgot to update the "bullseye" dockerfile that is used to build older PostgreSQL versions.	2025-02-17 19:42:57 +00:00
Tristan Partin	b34598516f	Warn when PR may require regenerating cloud PG settings (#10229 ) These generated Postgres settings JSON files can get out of sync, causing the control plane to reject updates to an endpoint or project's Postgres settings. Signed-off-by: Tristan Partin <tristan@neon.tech>	2025-02-17 19:02:16 +00:00
Erik Grinaker	84bbe87d60	pageserver: tweak `pageserver_layers_per_read` histogram resolution (#10847 ) ## Problem The current `pageserver_layers_per_read` histogram buckets don't represent the current reality very well. For the percentiles we care about (e.g. p50 and p99), we often see fairly high read amp, especially during ingestion, and anything below 4 can be considered very good. ## Summary of changes Change the per-timeline read amp histogram buckets to `[4.0, 8.0, 16.0, 32.0, 64.0, 128.0, 256.0]`.	2025-02-17 17:24:17 +00:00
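To see how the new bucket boundaries above classify a read, here is a minimal sketch. It assumes Prometheus-style buckets with inclusive upper bounds and an implicit `+Inf` bucket; `bucket_upper_bound` is a hypothetical helper, not pageserver code.

```python
import bisect
import math

# Bucket upper bounds taken from the commit message above.
BUCKETS = [4.0, 8.0, 16.0, 32.0, 64.0, 128.0, 256.0]


def bucket_upper_bound(layers_visited: float) -> float:
    """Return the smallest bucket bound that is >= layers_visited."""
    # bisect_left finds the first bound that is not smaller than the value,
    # which matches an inclusive ("less than or equal") upper bound.
    idx = bisect.bisect_left(BUCKETS, layers_visited)
    return BUCKETS[idx] if idx < len(BUCKETS) else math.inf
```

Under these assumptions, every read that visits 4 layers or fewer lands in the first bucket, which matches the message's remark that anything below 4 can be considered very good.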
Vlad Lazar	b10890b81c	tests: compare digests in test_peer_recovery (#10853 ) ## Problem Test fails when comparing the first WAL segment because the system id in the segment header is different. The system id is not consistently set correctly since segments are usually inited on the safekeeper sync step with sysid 0. ## Summary of changes Compare timeline digests instead. This skips the header. Closes https://github.com/neondatabase/neon/issues/10596	2025-02-17 16:32:24 +00:00
Conrad Ludgate	3204efc860	chore(proxy): use specially named prepared statements for type-checking (#10843 ) I was looking into https://github.com/neondatabase/serverless/issues/144 and recalled previous cases where proxy would trigger these prepared statements, which would conflict with other statements prepared by our client downstream. Because of that, and also to aid in debugging, I've made sure all prepared statements that proxy needs to make have specific names that likely won't conflict and make it clear in an error log if it's our statements that are causing issues.	2025-02-17 16:19:57 +00:00
Tristan Partin	da79cc5eee	Add neon.extension_server_{connect,request}_timeout (#10801 ) Instead of hardcoding the request timeout, let's make it configurable as a PGC_SUSET GUC. Additionally, add a connect timeout GUC. Although the extension server runs on the compute, it is always best to keep operations from hanging. Better to present a timeout error to the user than a stuck backend. Signed-off-by: Tristan Partin <tristan@neon.tech>	2025-02-17 15:40:43 +00:00
John Spray	39d42d846a	pageserver_api: fix decoding old-version TimelineInfo (#10845 ) ## Problem In #10707 some new fields were introduced in TimelineInfo. I forgot that we do not only use TimelineInfo for encoding, but also decoding when the storage controller calls into a pageserver, so this broke some calls from controller to pageserver while in a mixed-version state. ## Summary of changes - Make new fields have default behavior so that they are optional	2025-02-17 15:04:47 +00:00
Arpad Müller	0330b61729	Azure SDK: use neon branch again (#10844 ) Originally I wanted to switch back to the `neon` branch before merging #10825, but I forgot to do it. Do it in a separate PR now. No actual change of the source code, only changes the branch name (so that maybe in a few weeks we can delete the temporary branch `arpad/neon-rebase`).	2025-02-17 14:59:01 +00:00
Erik Grinaker	8a2d95b4b5	pageserver: appease unused lint on macOS (#10846 ) ## Problem `SmgrOpFlushInProgress::measure()` takes a `socket_fd` argument which is only used on Linux. This causes linter warnings on macOS. Touches #10823. ## Summary of changes Add a noop use of `socket_fd` on non-Linux branch.	2025-02-17 14:41:22 +00:00
Konstantin Knizhnik	8c6d133d31	Fix out-of-boundaries access in addSHLL function (#10840 ) ## Problem See https://github.com/neondatabase/neon/issues/10839 The rho(x,b) function returns values in the range [1,b+1], and addSHLL tries to store them in an array of size b+1. ## Summary of changes Subtract 1 from the value returned by rho --------- Co-authored-by: Konstantin Knizhnik <knizhnik@neon.tech>	2025-02-17 12:54:17 +00:00
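The off-by-one above can be shown with a simplified sketch. It assumes the usual HyperLogLog-style definition of rho (1-based position of the first set bit among `b` bits, or `b + 1` if none is set) and a flat counter array; the real `addSHLL` is C code with a different data layout, so `rho` and `add_observation` below are hypothetical stand-ins.

```python
def rho(x: int, b: int) -> int:
    """1-based position of the first set bit among the low b bits of x,
    scanning from the most significant of those bits; b + 1 if none is set."""
    for i in range(b):
        if x & (1 << (b - 1 - i)):
            return i + 1
    return b + 1


def add_observation(counters: list, x: int, b: int) -> None:
    """Count an observation in a (b + 1)-slot array."""
    # rho() yields values in [1, b + 1], but valid indexes are [0, b].
    # Subtracting 1 maps the range onto the array; without it, rho == b + 1
    # would index one element past the end.
    slot = rho(x, b) - 1
    counters[slot] += 1


B = 4
counters = [0] * (B + 1)
add_observation(counters, 0b0000, B)  # rho = 5, stored in slot 4
add_observation(counters, 0b1000, B)  # rho = 1, stored in slot 0
```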
Arpad Müller	81f08d304a	Rebase Azure SDK and apply newest patch (#10825 ) The [upstream PR](https://github.com/Azure/azure-sdk-for-rust/pull/1997) has been merged with some changes to use threads with async, so apply them to the neon specific fork to be nice to the executor (before, we had the state as of filing of that PR). Also, rebase onto the latest version of upstream's `legacy` branch. current SDK commits: [link](https://github.com/neondatabase/azure-sdk-for-rust/commits/neon-2025-02-14) now: [link](https://github.com/neondatabase/azure-sdk-for-rust/commits/arpad/neon-refresh) Prior update was in #10790	2025-02-17 10:44:44 +00:00
Peter Bendel	d566d604cf	feat(compute) add pg_duckdb extension v0.3.1 (#10829 ) We want to host pg_duckdb (starting with v0.3.1) on Neon. This PR replaces https://github.com/neondatabase/neon/pull/10350 which was for older pg_duckdb v0.2.0 Use cases - faster OLAP queries - access to data lake files (e.g. parquet) on S3 buckets from Neon PostgreSQL Because neon does not provide superuser role to neon customers we need to grant some additional permissions to neon_superuser: Note: some grants that we require are already granted to `PUBLIC` in new release of pg_duckdb [here](`3789e4c509/sql/pg_duckdb--0.2.0--0.3.0.sql (L1054)`) ```sql GRANT ALL ON FUNCTION duckdb.install_extension(TEXT) TO neon_superuser; GRANT ALL ON TABLE duckdb.extensions TO neon_superuser; GRANT ALL ON SEQUENCE duckdb.extensions_table_seq TO neon_superuser; ```	2025-02-17 10:43:16 +00:00
Alexander Lakhin	f739773edd	Fix format of milliseconds in pytest output (#10836 ) ## Problem The timestamp prefix of pytest log lines contains milliseconds without leading zeros, so values of milliseconds less than 100 printed incorrectly. For example: ``` 2025-02-15 12:02:51.997 INFO [_internal.py:97] 127.0.0.1 - - ... 2025-02-15 12:02:52.4 INFO [_internal.py:97] 127.0.0.1 - - ... 2025-02-15 12:02:52.9 INFO [_internal.py:97] 127.0.0.1 - - ... 2025-02-15 12:02:52.23 INFO [_internal.py:97] 127.0.0.1 - - ... ``` ## Summary of changes Fix log_format for pytest so that milliseconds are printed with leading zeros.	2025-02-16 04:59:52 +00:00
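The leading-zero problem above can be reproduced with plain string formatting. A minimal sketch, assuming the timestamp is assembled from a seconds part and an integer millisecond part; `format_ts` is a hypothetical helper and not the actual pytest `log_format` change.

```python
def format_ts(seconds_part: str, msecs: int, pad: bool) -> str:
    """Join a formatted timestamp with its millisecond component."""
    # "%d" drops leading zeros, so 4 ms renders as ".4" and reads like 400 ms.
    # "%03d" always emits three digits, so 4 ms renders as ".004".
    fmt = "%s.%03d" if pad else "%s.%d"
    return fmt % (seconds_part, msecs)


unpadded = format_ts("2025-02-15 12:02:52", 4, pad=False)
padded = format_ts("2025-02-15 12:02:52", 4, pad=True)
```

In a `logging`-style format string, the same padding is usually written as `%(msecs)03d`.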
Heikki Linnakangas	2dae0612dd	fast_import: Fix shared_buffers setting (#10837 ) In commit `9537829ccd` I made shared_buffers be derived from the system's available RAM. However, I failed to remove the old hard-coded shared_buffers=10GB settings, so shared_buffers was set twice. Oopsie.	2025-02-16 00:01:19 +00:00
Alexander Bayandin	2ec8dff6f7	CI(build-and-test-locally): set `session-timeout` for pytest (#10831 ) ## Problem Sometimes, a regression test run gets stuck (taking more than 60 minutes) and is killed by GitHub's `timeout-minutes` without leaving any traces in the test results database. I find no correlation between this and either the build type, the architecture, or the Postgres version. See: https://neonprod.grafana.net/goto/nM7ih7cHR?orgId=1 ## Summary of changes - Bump `pytest-timeout` to the version that supports `--session-timeout` - Set `--session-timeout` to (timeout-minutes - 10 minutes) * 60 seconds, in an attempt to stop tests gracefully and generate test reports before they are forcibly stopped by the stricter `timeout-minutes` limit.	2025-02-15 10:34:11 +00:00
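The timeout arithmetic above is simple enough to check by hand. A minimal sketch, where `session_timeout_seconds` is a hypothetical helper and the 60-minute job limit is only an illustrative input:

```python
GRACE_MINUTES = 10  # head start left for generating test reports


def session_timeout_seconds(timeout_minutes: int) -> int:
    """Compute (timeout-minutes - 10 minutes) * 60 seconds."""
    if timeout_minutes <= GRACE_MINUTES:
        raise ValueError("job timeout must exceed the grace period")
    return (timeout_minutes - GRACE_MINUTES) * 60


# A 60-minute job limit leaves pytest 50 minutes, i.e. 3000 seconds.
example = session_timeout_seconds(60)
```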
Alex Chi Z.	ae091c6913	feat(pageserver): store reldir in sparse keyspace (#10593 ) ## Problem Part of https://github.com/neondatabase/neon/issues/9516 ## Summary of changes This patch adds support for storing reldir in the sparse keyspace. All logic is guarded with the `rel_size_v2_enabled` flag, so if it's set to false, the code path is exactly the same as what's currently in prod. Note that we did not persist the `rel_size_v2_enabled` flag, and the logic around it will be implemented in the next patch. (i.e., what if we enabled it, restart the pageserver, and then it gets set to false? we should still read from v2 using the rel_size_v2_migration_status in the index_part). The persistence logic I'll implement in the next patch will disallow switching from v2->v1 via config item. I also refactored the metrics so that they can work with the new reldir store. However, this metric was not computed correctly for reldirs before (see the comments). With the refactor, the value will be computed only when we have an initial value for the reldir size. The refactor keeps the incorrectness of the computation when there is more than 1 database. For the tests, we currently run all the tests with v2, and I'll set it to false and add some v2-specific tests before merging, probably also v1->v2 migration tests. --------- Signed-off-by: Alex Chi Z <chi@neon.tech>	2025-02-14 20:31:54 +00:00
Christian Schwarz	a32e8871ac	compute/pageserver: correlation of logs through backend PID (via `application_name`) (#10810 ) This PR makes compute set the `application_name` field to the PG backend process PID which is also included in each compute log line. This allows correlation of Pageserver connection logs with compute logs in a way that was guesswork before this PR. In future, we can switch for a more unique identifier for a page_service session. Refs - discussion in https://neondb.slack.com/archives/C08DE6Q9C3B/p1739465208296169?thread_ts=1739462628.361019&cid=C08DE6Q9C3B - fixes https://github.com/neondatabase/neon/issues/10808	2025-02-14 20:11:42 +00:00
Christian Schwarz	9177312ba6	basebackup: use `Timeline::get` for `get_rel` instead of `get_rel_page_at_lsn` (#10476 ) I noticed the opportunity to simplify here while working on https://github.com/neondatabase/neon/pull/9353 . The only difference is the zero-fill behavior: if one reads past rel size, `get_rel_page_at_lsn` returns a zeroed page whereas `Timeline::get` returns an error. However, the `endblk` is at most rel size large, because `nblocks` is eq `get_rel_size`, see a few lines above this change. We're using the same LSN (`self.lsn`) for everything, so there is no chance of non-determinism. Refs: - Slack discussion debating correctness: https://neondb.slack.com/archives/C033RQ5SPDH/p1737457010607119	2025-02-14 17:57:18 +00:00
Christian Schwarz	b992a1a62a	page_service: include socket send & recv queue length in slow flush log message (#10823 ) # Summary In - https://github.com/neondatabase/neon/pull/10813 we added slow flush logging but it didn't log the TCP send & recv queue length. This PR adds that data to the log message. I believe the implementation to be safe & correct right now, but it's brittle and thus this PR should be reverted or improved upon once the investigation is over. Refs: - stacked atop https://github.com/neondatabase/neon/pull/10813 - context: https://neondb.slack.com/archives/C08DE6Q9C3B/p1739464533762049?thread_ts=1739462628.361019&cid=C08DE6Q9C3B - improves https://github.com/neondatabase/neon/issues/10668 - part of https://github.com/neondatabase/cloud/issues/23515 # How It Works The trouble is two-fold: 1. getting to the raw socket file descriptor through the many Rust types that wrap it and 2. integrating with the `measure()` function. Rust wraps it in types to model file descriptor lifetimes and ownership, and usually one can get access using `as_raw_fd()`. However, we `split()` the stream and the resulting [`tokio::io::WriteHalf`](https://docs.rs/tokio/latest/tokio/io/struct.WriteHalf.html) . Check the PR commit history for my attempts to do it. My solution is to get the socket fd before we wrap it in our protocol types, and to store that fd in the new `PostgresBackend::socket_fd` field. I believe it's safe because the lifetime of `PostgresBackend::socket_fd` value == the lifetime of the `TcpStream` that we wrap and store in `PostgresBackend::framed`. Specifically, the only place that close()s the socket is the `impl Drop for TcpStream`. I think the protocol stack calls `TcpStream::shutdown()`, but that doesn't `close()` the file descriptor underneath. Regarding integration with the `measure()` function, the trouble is that `flush_fut` is currently a generic `Future` type. So, we just pass in the `socket_fd` as a separate argument. 
A clean implementation would convert the `pgb_writer.flush()` to a named future that provides an accessor for the socket fd while not being polled. I tried (see PR history), but failed to break through the `WriteHalf`. # Testing Tested locally by running ``` ./target/debug/pagebench get-page-latest-lsn --num-clients=1000 --queue-depth=1000 ``` in one terminal, waiting a bit, then ``` pkill -STOP pagebench ``` then wait for slow logs to show up in `pageserver.log`. Pick one of the slow log message's port pairs, e.g., `127.0.0.1:39500`, and then checking sockstat output ``` ss -ntp \| grep '127.0.0.1:39500' ``` to ensure that send & recv queue size match those in the log message.	2025-02-14 16:20:07 +00:00
Gleb Novikov	3d7a32f619	fast import: allow restore to provided connection string (#10407 ) Within https://github.com/neondatabase/cloud/issues/22089 we decided that would be nice to start with import that runs dump-restore into a running compute (more on this [here](https://www.notion.so/neondatabase/2024-Jan-13-Migration-Assistant-Next-Steps-Proposal-Revised-17af189e004780228bdbcad13eeda93f?pvs=4#17af189e004780de816ccd9c13afd953)) We could do it by writing another tool or by extending existing `fast_import.rs`, we chose the latter. In this PR, I have added optional `restore_connection_string` as a cli arg and as a part of the json spec. If specified, the script will not run postgres and will just perform restore into provided connection string. TODO: - [x] fast_import.rs: - [x] cli arg in the fast_import.rs - [x] encoded connstring in json spec - [x] simplify `fn main` a little, take out too verbose stuff to some functions - [ ] ~~allow streaming from dump stdout to restore stdin~~ will do in a separate PR - [ ] ~~address https://github.com/neondatabase/neon/pull/10251#pullrequestreview-2551877845~~ will do in a separate PR - [x] tests: - [x] restore with cli arg in the fast_import.rs - [x] restore with encoded connstring in json spec in s3 - [ ] ~~test with custom dbname~~ will do in a separate PR - [ ] ~~test with s3 + pageserver + fast import binary~~ https://github.com/neondatabase/neon/pull/10487 - [ ] ~~https://github.com/neondatabase/neon/pull/10271#discussion_r1923715493~~ will do in a separate PR neondatabase/cloud#22775 --------- Co-authored-by: Eduard Dykman <bird.duskpoet@gmail.com>	2025-02-14 16:10:06 +00:00
Christian Schwarz	fac5db3c8d	page_service: emit periodic log message while response flush is slow (#10813 ) The logic might seem a bit intricate / over-optimized, but I recently spent time benchmarking this code path in the context of a nightly pagebench regression (https://github.com/neondatabase/cloud/issues/21759) and I want to avoid regressing it any further. Ideally would also log the socket send & recv queue length like we do on the compute side in - https://github.com/neondatabase/neon/pull/10673 But that is proving difficult due to the Rust abstractions that wrap the socket fd. Work in progress on that is happening in - https://github.com/neondatabase/neon/pull/10823 Regarding production impact, I am worried at a theoretical level that the additional logging may cause a downward spiral in the case where a pageserver is slow to flush because there is not enough CPU. The logging would consume more CPU and thereby slow down flushes even more. However, I don't think this matters practically speaking. # Refs - context: https://neondb.slack.com/archives/C08DE6Q9C3B/p1739464533762049?thread_ts=1739462628.361019&cid=C08DE6Q9C3B - fixes https://github.com/neondatabase/neon/issues/10668 - part of https://github.com/neondatabase/cloud/issues/23515 # Testing Tested locally by running ``` ./target/debug/pagebench get-page-latest-lsn --num-clients=1000 --queue-depth=1000 ``` in one terminal, waiting a bit, then ``` pkill -STOP pagebench ``` then wait for slow logs to show up in `pageserver.log`. To see that the completion log message is logged, run ``` pkill -CONT pagebench ```	2025-02-14 14:37:03 +00:00

7263 Commits