rust/neon - neon - Gitea: Git with a cup of tea

rust/neon

mirror of https://github.com/neondatabase/neon.git synced 2026-01-06 21:12:55 +00:00

Author	SHA1	Message	Date
Konstantin Knizhnik	63b2060aef	Drop connections with all shards invoplved in prefetch in case of error (#7249 ) ## Problem See https://github.com/neondatabase/cloud/issues/11559 If we have multiple shards, we need to reset connections to all shards involved in prefetch (having active prefetch requests) if connection with any of them is lost. ## Summary of changes In `prefetch_on_ps_disconnect` drop connection to all shards with active page requests. ## Checklist before requesting a review - [ ] I have performed a self-review of my code. - [ ] If it is a core feature, I have added thorough tests. - [ ] Do we need to implement analytics? if so did you add the relevant metrics to the dashboard? - [ ] If this PR requires public announcement, mark it with /release-notes label and add several sentences in this section. ## Checklist before merging - [ ] Do not forget to reformat commit message to not include the above checklist Co-authored-by: Konstantin Knizhnik <knizhnik@neon.tech>	2024-03-28 08:16:05 +02:00
Sasha Krassovsky	24c5a5ac16	Revert "Revoke REPLICATION" (#7261 ) Reverts neondatabase/neon#7052	2024-03-27 18:07:51 +00:00
Alexander Bayandin	7f9cc1bd5e	CI(trigger-e2e-tests): set e2e-platforms (#7229 ) ## Problem We don't want to run an excessive e2e test suite on neonvm if there are no relevant changes. ## Summary of changes - Check PR diff and if there are no relevant compute changes (in `vendor/`, `pgxn/`, `libs/vm_monitor` or `Dockerfile.compute-node` - Switch job from `small` to `ubuntu-latest` runner to make it possible to use GitHub CLI	2024-03-27 13:10:37 +00:00
Christian Schwarz	cdf12ed008	fix(walreceiver): Timeline::shutdown can leave a dangling handle_walreceiver_connection tokio task (#7235 ) # Problem As pointed out through doc-comments in this PR, `drop_old_connection` is not cancellation-safe. This means we can leave a `handle_walreceiver_connection` tokio task dangling during Timeline shutdown. More details described in the corresponding issue #7062. # Solution Don't cancel-by-drop the `connection_manager_loop_step` from the `tokio::select!()` in the task_mgr task. Instead, transform the code to use a `CancellationToken` --- specifically, `task_mgr::shutdown_token()` --- and make code responsive to it. The `drop_old_connection()` is still not cancellation-safe and also doesn't get a cancellation token, because there's no point inside the function where we could return early if cancellation were requested using a token. We rely on the `handle_walreceiver_connection` to be sensitive to the `TaskHandle`s cancellation token (argument name: `cancellation`). Currently it checks for `cancellation` on each WAL message. It is probably also sensitive to `Timeline::cancel` because ultimately all that `handle_walreceiver_connection` does is interact with the `Timeline`. In summary, the above means that the following code (which is found in `Timeline::shutdown`) now might take longer, but actually ensures that all `handle_walreceiver_connection` tasks are finished: ```rust task_mgr::shutdown_tasks( Some(TaskKind::WalReceiverManager), Some(self.tenant_shard_id), Some(self.timeline_id) ) ``` # Refs refs #7062	2024-03-27 12:04:31 +01:00
Conrad Ludgate	12512f3173	add authentication rate limiting (#6865 ) ## Problem https://github.com/neondatabase/cloud/issues/9642 ## Summary of changes 1. Make `EndpointRateLimiter` generic, renamed as `BucketRateLimiter` 2. Add support for claiming multiple tokens at once 3. Add `AuthRateLimiter` alias. 4. Check `(Endpoint, IP)` pair during authentication, weighted by how many hashes proxy would be doing. TODO: handle ipv6 subnets. will do this in a separate PR.	2024-03-26 19:31:19 +00:00
John Spray	b3b7ce457c	pageserver: remove bare mgr::get_tenant, mgr::list_tenants (#7237 ) ## Problem This is a refactor. This PR was a precursor to a much smaller change `e5bd602dc1`, where as I was writing it I found that we were not far from getting rid of the last non-deprecated code paths that use `mgr::` scoped functions to get at the TenantManager state. We're almost done cleaning this up as per https://github.com/neondatabase/neon/issues/5796. The only significant remaining mgr:: item is `get_active_tenant_with_timeout`, which is page_service's path for fetching tenants. ## Summary of changes - Remove the bool argument to get_attached_tenant_shard: this was almost always false from API use cases, and in cases when it was true, it was readily replacable with an explicit check of the returned tenant's status. - Rather than letting the timeline eviction task query any tenant it likes via `mgr::`, pass an `Arc<Tenant>` into the task. This is still an ugly circular reference, but should eventually go away: either when we switch to exclusively using disk usage eviction, or when we change metadata storage to avoid the need to imitate layer accesses. - Convert all the mgr::get_tenant call sites to use TenantManager::get_attached_tenant_shard - Move list_tenants into TenantManager.	2024-03-26 18:29:08 +00:00
John Spray	6814bb4b59	tests: add a log allow list to stabilize benchmarks (#7251 ) ## Problem https://github.com/neondatabase/neon/pull/7227 destabilized various tests in the performance suite, with log errors during shutdown. It's because we switched shutdown order to stop the storage controller before the pageservers. ## Summary of changes - Tolerate "connection failed" errors from pageservers trying to validation their deletion queue.	2024-03-26 17:44:18 +00:00
John Spray	b3bb1d1cad	storage controller: make direct tenant creation more robust (#7247 ) ## Problem - Creations were not idempotent (unique key violation) - Creations waited for reconciliation, which control plane blocks while an operation is in flight ## Summary of changes - Handle unique key constraint violation as an OK situation: if we're creating the same tenant ID and shard count, it's reasonable to assume this is a duplicate creation. - Make the wait for reconcile during creation tolerate failures: this is similar to location_conf, where the cloud control plane blocks our notification calls until it is done with calling into our API (in future this constraint is expected to relax as the cloud control plane learns to run multiple operations concurrently for a tenant)	2024-03-26 16:57:35 +00:00
John Spray	47d2b3a483	pageserver: limit total ephemeral layer bytes (#7218 ) ## Problem Follows: https://github.com/neondatabase/neon/pull/7182 - Sufficient concurrent writes could OOM a pageserver from the size of indices on all the InMemoryLayer instances. - Enforcement of checkpoint_period only happened if there were some writes. Closes: https://github.com/neondatabase/neon/issues/6916 ## Summary of changes - Add `ephemeral_bytes_per_memory_kb` config property. This controls the ratio of ephemeral layer capacity to memory capacity. The weird unit is to enable making the ratio less than 1:1 (set this property to 1024 to use 1MB of ephemeral layers for every 1MB of RAM, set it smaller to get a fraction). - Implement background layer rolling checks in Timeline::compaction_iteration -- this ensures we apply layer rolling policy in the absence of writes. - During background checks, if the total ephemeral layer size has exceeded the limit, then roll layers whose size is greater than the mean size of all ephemeral layers. - Remove the tick() path from walreceiver: it isn't needed any more now that we do equivalent checks from compaction_iteration. - Add tests for the above. --------- Co-authored-by: Arpad Müller <arpad-m@users.noreply.github.com>	2024-03-26 15:45:32 +00:00
John Spray	8dfe3a070c	pageserver: return 429 on timeline creation in progress (#7225 ) ## Problem Currently, we return 409 (Conflict) in two cases: - Temporary: Timeline creation cannot proceed because another timeline with the same ID is being created - Permanent: Timeline creation cannot proceed because another timeline exists with different parameters but the same ID. Callers which time out a request and retry should be able to distinguish these cases. Closes: #7208 ## Summary of changes - Expose `AlreadyCreating` errors as 429 instead of 409	2024-03-26 15:20:05 +00:00
Alexander Bayandin	3426619a79	test_runner/performance: skip test_bulk_insert (#7238 ) ## Problem `test_bulk_insert` becomes too slow, and it fails constantly: https://github.com/neondatabase/neon/issues/7124 ## Summary of changes - Skip `test_bulk_insert` until it's fixed	2024-03-26 15:10:15 +00:00
Vlad Lazar	de03742ca3	pageserver: drop layer map lock in Timeline::get (#7217 ) ## Problem We currently hold the layer map read lock while doing IO on the read path. This is not required for correctness. ## Summary of changes Drop the layer map lock after figuring out which layer we wish to read from. Why is this correct: * `Layer` models the lifecycle of an on disk layer. In the event the layer is removed from local disk, it will be on demand downloaded * `InMemoryLayer` holds the `EphemeralFile` which wraps the on disk file. As long as the `InMemoryLayer` is in scope, it's safe to read from it. Related https://github.com/neondatabase/neon/issues/6833	2024-03-26 14:35:36 +00:00
Christian Schwarz	ad072de420	Revert "pageserver: use a single tokio runtime (#6555 )" (#7246 )	2024-03-26 15:24:18 +01:00
Anna Khanova	6c18109734	proxy: reuse sess_id as request_id for the cplane requests (#7245 ) ## Problem https://github.com/neondatabase/cloud/issues/11599 ## Summary of changes Reuse the same sess_id for requests within the one session. TODO: get rid of `session_id` in query params.	2024-03-26 11:27:48 +00:00
John Spray	5dee58f492	tests: wait for uploads in test_secondary_downloads (#7220 ) ## Problem - https://github.com/neondatabase/neon/issues/6966 This test occasionally failed with some layers unexpectedly not present on the secondary pageserver. The issue in that failure is the attached pageserver uploading heatmaps that refer to not-yet-uploaded layers. ## Summary of changes After uploading heatmap, drain upload queue on attached pageserver, to guarantee that all the layers referenced in the haetmap are uploaded.	2024-03-26 10:59:16 +00:00
John Spray	6313f1fa7a	tests: tolerate transient unavailability in test_sharding_split_failures (#7223 ) ## Problem While most forms of split rollback don't interrupt clients, there are a couple of cases that do -- this interruption is brief, driven by the time it takes the controller to kick off Reconcilers during the async abort of the split, so it's operationally fine, but can trip up a test. - #7148 ## Summary of changes - Relax test check to require that the tenant is eventually available after split failure, rather than immediately. In the vast majority of cases this will pass on the first iteration.	2024-03-26 09:56:47 +00:00
Christian Schwarz	f72415e1fd	refactor(remote_timeline_client): infallible stop() and shutdown() (#7234 ) preliminary refactoring for https://github.com/neondatabase/neon/pull/7233 part of #7062	2024-03-25 18:42:18 +01:00
George Ma	d837ce0686	chore: remove repetitive words (#7206 ) Signed-off-by: availhang <mayangang@outlook.com>	2024-03-25 11:43:02 -04:00
John Spray	2713142308	tests: stabilize compat tests (#7227 ) This test had two flaky failure modes: - pageserver log error for timeline not found: this resulted from changes for DR when timeline destroy/create was added, but endpoint was left running during that operation. - storage controller log error because the test was running for long enough that a background reconcile happened at almost the exact moment of test teardown, and our test fixtures tear down the pageservers before the controller. Closes: #7224	2024-03-25 14:35:24 +00:00
Arseny Sher	a6c1fdcaf6	Try to fix test_crafted_wal_end flakiness. Postgres can always write some more WAL, so previous checks that WAL doesn't change after something had been crafted were wrong; remove them. Add comments here and there. should fix https://github.com/neondatabase/neon/issues/4691	2024-03-25 14:53:06 +03:00
John Spray	adb0526262	pageserver: track total ephemeral layer bytes (#7182 ) ## Problem Large quantities of ephemeral layer data can lead to excessive memory consumption (https://github.com/neondatabase/neon/issues/6939). We currently don't have a way to know how much ephemeral layer data is present on a pageserver. Before we can add new behaviors to proactively roll layers in response to too much ephemeral data, we must calculate that total. Related: https://github.com/neondatabase/neon/issues/6916 ## Summary of changes - Create GlobalResources and GlobalResourceUnits types, where timelines carry a GlobalResourceUnits in their TimelineWriterState. - Periodically update the size in GlobalResourceUnits: - During tick() - During layer roll - During put() if the latest value has drifted more than 10MB since our last update - Expose the value of the global ephemeral layer bytes counter as a prometheus metric. - Extend the lifetime of TimelineWriterState: - Instead of dropping it in TimelineWriter::drop, let it remain. - Drop TimelineWriterState in roll_layer: this drops our guard on the global byte count to reflect the fact that we're freezing the layer. - Ensure the validity of the later in the writer state by clearing the state in the same place we freeze layers, and asserting on the write-ability of the layer in `writer()` - Add a 'context' parameter to `get_open_layer_action` so that it can skip the prev_lsn==lsn check when called in tick() -- this is needed because now tick is called with a populated state, where prev_lsn==Some(lsn) is true for an idle timeline. - Extend layer rolling test to use this metric	2024-03-25 11:52:50 +00:00
John Spray	0099dfa56b	storage controller: tighten up secrets handling (#7105 ) - Remove code for using AWS secrets manager, as we're deploying with k8s->env vars instead - Load each secret independently, so that one can mix CLI args with environment variables, rather than requiring that all secrets are loaded with the same mechanism. - Add a 'strict mode', enabled by default, which will refuse to start if secrets are not loaded. This avoids the risk of accidentially disabling auth by omitting the public key, for example	2024-03-25 11:52:33 +00:00
Vlad Lazar	3a4ebfb95d	test: fix `test_pageserver_recovery` flakyness (#7207 ) ## Problem We recently introduced log file validation for the storage controller. The heartbeater will WARN when it fails for a node, hence the test fails. Closes https://github.com/neondatabase/neon/issues/7159 ## Summary of changes * Warn only once for each set of heartbeat retries * Allow list heartbeat warns	2024-03-25 09:38:12 +00:00
Christian Schwarz	3220f830b7	pageserver: use a single tokio runtime (#6555 ) Before this PR, each core had 3 executor threads from 3 different runtimes. With this PR, we just have one runtime, with one thread per core. Switching to a single tokio runtime should reduce that effective over-commit of CPU and in theory help with tail latencies -- iff all tokio tasks are well-behaved and yield to the runtime regularly. Are All Tasks Well-Behaved? Are We Ready? ----------------------------------------- Sadly there doesn't seem to be good out-of-the box tokio tooling to answer this question. We believe all tasks are well behaved in today's code base, as of the switch to `virtual_file_io_engine = "tokio-epoll-uring"` in production (https://github.com/neondatabase/aws/pull/1121). The only remaining executor-thread-blocking code is walredo and some filesystem namespace operations. Filesystem namespace operations work is being tracked in #6663 and not considered likely to actually block at this time. Regarding walredo, it currently does a blocking `poll` for read/write to the pipe file descriptors we use for IPC with the walredo process. There is an ongoing experiment to make walredo async (#6628), but it needs more time because there are surprisingly tricky trade-offs that are articulated in that PR's description (which itself is still WIP). What's relevant for this PR is that 1. walredo is always CPU-bound 2. production tail latencies for walredo request-response (`pageserver_wal_redo_seconds_bucket`) are - p90: with few exceptions, low hundreds of micro-seconds - p95: except on very packed pageservers, below 1ms - p99: all below 50ms, vast majority below 1ms - p99.9: almost all around 50ms, rarely at >= 70ms - [Dashboard Link](https://neonprod.grafana.net/d/edgggcrmki3uof/2024-03-walredo-latency?orgId=1&var-ds=ZNX49CDVz&var-pXX_by_instance=0.9&var-pXX_by_instance=0.99&var-pXX_by_instance=0.95&var-adhoc=instance%7C%21%3D%7Cpageserver-30.us-west-2.aws.neon.tech&var-per_instance_pXX_max_seconds=0.0005&from=1711049688777&to=1711136088777) The ones below 1ms are below our current threshold for when we start thinking about yielding to the executor. The tens of milliseconds stalls aren't great, but, not least because of the implicit overcommit of CPU by the three runtimes, we can't be sure whether these tens of milliseconds are inherently necessary to do the walredo work or whether we could be faster if there was less contention for CPU. On the first item (walredo being always CPU-bound work): it means that walredo processes will always compete with the executor threads. We could yield, using async walredo, but then we hit the trade-offs explained in that PR. tl;dr: the risk of stalling executor threads through blocking walredo seems low, and switching to one runtime cleans up one potential source for higher-than-necessary stall times (explained in the previous paragraphs). Code Changes ------------ - Remove the 3 different runtime definitions. - Add a new definition called `THE_RUNTIME`. - Use it in all places that previously used one of the 3 removed runtimes. - Remove the argument from `task_mgr`. - Fix failpoint usage where `pausable_failpoint!` should have been used. We encountered some actual failures because of this, e.g., hung `get_metric()` calls during test teardown that would client-timeout after 300s. As indicated by the comment above `THE_RUNTIME`, we could take this clean-up further. But before we create so much churn, let's first validate that there's no perf regression. Performance ----------- We will test this in staging using the various nightly benchmark runs. However, the worst-case impact of this change is likely compaction (=>image layer creation) competing with compute requests. Image layer creation work can't be easily generated & repeated quickly by pagebench. So, we'll simply watch getpage & basebackup tail latencies in staging. Additionally, I have done manual benchmarking using pagebench. Report: https://neondatabase.notion.site/2024-03-23-oneruntime-change-benchmarking-22a399c411e24399a73311115fb703ec?pvs=4 Tail latencies and throughput are marginally better (no regression = good). Except in a workload with 128 clients against one tenant. There, the p99.9 and p99.99 getpage latency is about 2x worse (at slightly lower throughput). A dip in throughput every 20s (compaction_period_ is clearly visible, and probably responsible for that worse tail latency. This has potential to improve with async walredo, and is an edge case workload anyway. Future Work ----------- 1. Once this change has shown satisfying results in production, change the codebase to use the ambient runtime instead of explicitly referencing `THE_RUNTIME`. 2. Have a mode where we run with a single-threaded runtime, so we uncover executor stalls more quickly. 3. Switch or write our own failpoints library that is async-native: https://github.com/neondatabase/neon/issues/7216	2024-03-23 19:25:11 +01:00
Conrad Ludgate	72103d481d	proxy: fix stack overflow in cancel publisher (#7212 ) ## Problem stack overflow in blanket impl for `CancellationPublisher` ## Summary of changes Removes `async_trait` and fixes the impl order to make it non-recursive.	2024-03-23 06:36:58 +00:00
Alex Chi Z	643683f41a	fixup(#7204 / postgres): revert `IsPrimaryAlive` checks (#7209 ) Fix #7204. https://github.com/neondatabase/postgres/pull/400 https://github.com/neondatabase/postgres/pull/401 https://github.com/neondatabase/postgres/pull/402 These commits never go into prod. Detailed investigation will be posted in another issue. Reverting the commits so that things can keep running in prod. This pull request adds the test to start two replicas. It fails on the current main https://github.com/neondatabase/neon/pull/7210 but passes in this pull request. --------- Signed-off-by: Alex Chi Z <chi@neon.tech>	2024-03-23 01:01:51 +00:00
Konstantin Knizhnik	35f4c04c9b	Remove Get/SetZenithCurrentClusterSize from Postgres core (#7196 ) ## Problem See https://neondb.slack.com/archives/C04DGM6SMTM/p1711003752072899 ## Summary of changes Move keeping of cluster size to neon extension --------- Co-authored-by: Konstantin Knizhnik <knizhnik@neon.tech>	2024-03-22 13:14:31 -04:00
John Spray	1787cf19e3	pageserver: write consumption metrics to S3 (#7200 ) ## Problem The service that receives consumption metrics has lower availability than S3. Writing metrics to S3 improves their availability. Closes: https://github.com/neondatabase/cloud/issues/9824 ## Summary of changes - The same data as consumption metrics POST bodies is also compressed and written to an S3 object with a timestamp-formatted path. - Set `metric_collection_bucket` (same format as `remote_storage` config) to configure the location to write to	2024-03-22 14:52:14 +00:00
Alexander Bayandin	2668a1dfab	CI: deploy release version to a preprod region (#6811 ) ## Problem We want to deploy releases to a preprod region first to perform required checks ## Summary of changes - Deploy `release-XXX` / `release-proxy-YYY` docker tags to a preprod region	2024-03-22 14:42:10 +00:00
Conrad Ludgate	77f3a30440	proxy: unit tests for auth_quirks (#7199 ) ## Problem I noticed code coverage for auth_quirks was pretty bare ## Summary of changes Adds 3 happy path unit tests for auth_quirks * scram * cleartext (websockets) * cleartext (password hack)	2024-03-22 13:31:10 +00:00
John Spray	62b318c928	Fix ephemeral file warning on secondaries (#7201 ) A test was added which exercises secondary locations more, and there was a location in the secondary downloader that warned on ephemeral files. This was intended to be fixed in this faulty commit: `8cea866adf`	2024-03-22 10:10:28 +00:00
Anna Khanova	6770ddba2e	proxy: connect redis with AWS IAM (#7189 ) ## Problem Support of IAM Roles for Service Accounts for authentication. ## Summary of changes * Obtain aws 15m-long credentials * Retrieve redis password from credentials * Update every 1h to keep connection for more than 12h * For now allow to have different endpoints for pubsub/stream redis. TODOs: * PubSub doesn't support credentials refresh, consider using stream instead. * We need an AWS role for proxy to be able to connect to both: S3 and elasticache. Credentials obtaining and connection refresh was tested on xenon preview. https://github.com/neondatabase/cloud/issues/10365	2024-03-22 09:38:04 +01:00
Arpad Müller	3ee34a3f26	Update Rust to 1.77.0 (#7198 ) Release notes: https://blog.rust-lang.org/2024/03/21/Rust-1.77.0.html Thanks to #6886 the diff is reasonable, only for one new lint `clippy::suspicious_open_options`. I added `truncate()` calls to the places where it is obviously the right choice to me, and added allows everywhere else, leaving it for followups. I had to specify cargo install --locked because the build would fail otherwise. This was also recommended by upstream.	2024-03-22 06:52:31 +00:00
Christian Schwarz	fb60278e02	walredo benchmark: throughput-oriented rewrite (#7190 ) See the updated `bench_walredo.rs` module comment. tl;dr: we measure avg latency of single redo operations issues against a single redo manager from N tokio tasks. part of https://github.com/neondatabase/neon/issues/6628	2024-03-21 15:24:56 +01:00
Conrad Ludgate	d5304337cf	proxy: simplify password validation (#7188 ) ## Problem for HTTP/WS/password hack flows we imitate SCRAM to validate passwords. This code was unnecessarily complicated. ## Summary of changes Copy in the `pbkdf2` and 'derive keys' steps from the `postgres_protocol` crate in our `rust-postgres` fork. Derive the `client_key`, `server_key` and `stored_key` from the password directly. Use constant time equality to compare the `stored_key` and `server_key` with the ones we are sent from cplane.	2024-03-21 13:54:06 +00:00
John Spray	06cb582d91	pageserver: extend /re-attach response to include tenant mode (#6941 ) This change improves the resilience of the system to unclean restarts. Previously, re-attach responses only included attached tenants - If the pageserver had local state for a secondary location, it would remain, but with no guarantee that it was still _meant_ to be there. After this change, the pageserver will only retain secondary locations if the /re-attach response indicates that they should still be there. - If the pageserver had local state for an attached location that was omitted from a re-attach response, it would be entirely detached. This is wasteful in a typical HA setup, where an offline node's tenants might have been re-attached elsewhere before it restarts, but the offline node's location should revert to a secondary location rather than being wiped. Including secondary tenants in the re-attach response enables the pageserver to avoid throwing away local state unnecessarily. In this PR: - The re-attach items are extended with a 'mode' field. - Storage controller populates 'mode' - Pageserver interprets it (default is attached if missing) to construct either a SecondaryTenant or a Tenant. - A new test exercises both cases.	2024-03-21 13:39:23 +00:00
John Spray	bb47d536fb	pageserver: quieten log on shutdown-while-attaching (#7177 ) ## Problem If a shutdown happens when a tenant is attaching, we were logging at ERROR severity and with a backtrace. Yuck. ## Summary of changes - Pass a flag into `make_broken` to enable quietening this non-scary case.	2024-03-21 12:56:13 +00:00
John Spray	59cdee749e	storage controller: fixes to secondary location handling (#7169 ) Stacks on: - https://github.com/neondatabase/neon/pull/7165 Fixes while working on background optimization of scheduling after a split: - When a tenant has secondary locations, we weren't detaching the parent shards' secondary locations when doing a split - When a reconciler detaches a location, it was feeding back a locationconf with `Detached` mode in its `observed` object, whereas it should omit that location. This could cause the background reconcile task to keep kicking off no-op reconcilers forever (harmless but annoying). - During shard split, we were scheduling secondary locations for the child shards, but no reconcile was run for these until the next time the background reconcile task ran. Creating these ASAP is useful, because they'll be used shortly after a shard split as the destination locations for migrating the new shards to different nodes.	2024-03-21 12:06:57 +00:00
Vlad Lazar	c75b584430	storage_controller: add metrics (#7178 ) ## Problem Storage controller had basically no metrics. ## Summary of changes 1. Migrate the existing metrics to use Conrad's [`measured`](https://docs.rs/measured/0.0.14/measured/) crate. 2. Add metrics for incoming http requests 3. Add metrics for outgoing http requests to the pageserver 4. Add metrics for outgoing pass through requests to the pageserver 5. Add metrics for database queries Note that the metrics response for the attachment service does not use chunked encoding like the rest of the metrics endpoints. Conrad has kindly extended the crate such that it can now be done. Let's leave it for a follow-up since the payload shouldn't be that big at this point. Fixes https://github.com/neondatabase/neon/issues/6875	2024-03-21 12:00:20 +00:00
Conrad Ludgate	5ec6862bcf	proxy: async aware password validation (#7176 ) ## Problem spawn_blocking in #7171 was a hack ## Summary of changes https://github.com/neondatabase/rust-postgres/pull/29	2024-03-21 11:58:41 +01:00
Jure Bajic	94138c1a28	Enforce LSN ordering of batch entries (#7071 ) ## Summary of changes Enforce LSN ordering of batch entries. Closes https://github.com/neondatabase/neon/issues/6707	2024-03-21 09:17:24 +00:00
Joonas Koivunen	2206e14c26	fix(layer): remove the need to repair internal state (#7030 ) ## Problem The current implementation of struct Layer supports canceled read requests, but those will leave the internal state such that a following `Layer::keep_resident` call will need to repair the state. In pathological cases seen during generation numbers resetting in staging or with too many in-progress on-demand downloads, this repair activity will need to wait for the download to complete, which stalls disk usage-based eviction. Similar stalls have been observed in staging near disk-full situations, where downloads failed because the disk was full. Fixes #6028 or the "layer is present on filesystem but not evictable" problems by: 1. not canceling pending evictions by a canceled `LayerInner::get_or_maybe_download` 2. completing post-download initialization of the `LayerInner::inner` from the download task Not canceling evictions above case (1) and always initializing (2) lead to plain `LayerInner::inner` always having the up-to-date information, which leads to the old `Layer::keep_resident` never having to wait for downloads to complete. Finally, the `Layer::keep_resident` is replaced with `Layer::is_likely_resident`. These fix #7145. ## Summary of changes - add a new test showing that a canceled get_or_maybe_download should not cancel the eviction - switch to using a `watch` internally rather than a `broadcast` to avoid hanging eviction while a download is ongoing - doc changes for new semantics and cleanup - fix `Layer::keep_resident` to use just `self.0.inner.get()` as truth as `Layer::is_likely_resident` - remove `LayerInner::wanted_evicted` boolean as no longer needed Builds upon: #7185. Cc: #5331.	2024-03-21 03:19:08 +02:00
Joonas Koivunen	a95c41f463	fix(heavier_once_cell): take_and_deinit should take ownership (#7185 ) Small fix to remove confusing `mut` bindings. Builds upon #7175, split off from #7030. Cc: #5331.	2024-03-21 00:42:38 +02:00
Tristan Partin	041b653a1a	Add state diagram for compute Models a compute's lifetime.	2024-03-20 17:10:46 -05:00
Alex Chi Z	55c4ef408b	safekeeper: correctly handle signals (#7167 ) errno is not preserved in the signal handler. This pull request fixes it. Maybe related: https://github.com/neondatabase/neon/issues/6969, but does not fix the flaky test problem. Signed-off-by: Alex Chi Z <chi@neon.tech>	2024-03-20 15:22:25 -04:00
Alex Chi Z	5f0d9f2360	fix: add safekeeper team to pgxn codeowners (#7170 ) `pgxn/` also contains WAL proposer code, so modifications to this directory should be able to be approved by the safekeeper team. Signed-off-by: Alex Chi Z <chi@neon.tech>	2024-03-20 18:40:48 +00:00
Arpad Müller	34fa34d15c	Dump layer map json in test_gc_feedback.py (#7179 ) The layer map json is an interesting file for that test, so dump it to make debugging easier.	2024-03-20 18:39:46 +00:00
Joonas Koivunen	e961e0d3df	fix(Layer): always init after downloading in the spawned task (#7175 ) Before this PR, cancellation for `LayerInner::get_or_maybe_download` could occur so that we have downloaded the layer file in the filesystem, but because of the cancellation chance, we have not set the internal `LayerInner::inner` or initialized the state. With the detached init support introduced in #7135 and in place in #7152, we can now initialize the internal state after successfully downloading in the spawned task. The next PR will fix the remaining problems that this PR leaves: - `Layer::keep_resident` is still used because - `Layer::get_or_maybe_download` always cancels an eviction, even when canceled Split off from #7030. Stacked on top of #7152. Cc: #5331.	2024-03-20 20:37:47 +02:00
John Spray	2726b1934e	pageserver: extra debug for test_secondary_downloads failures (#7183 ) - Enable debug logs for this test - Add some debug logging detail in downloader.rs - Add an info-level message in scheduler.rs that makes it obvious if a command is waiting for an existing task rather than spawning a new one.	2024-03-20 18:07:45 +00:00
Joonas Koivunen	3d16cda846	refactor(layer): use detached init (#7152 ) The second part of work towards fixing `Layer::keep_resident` so that it does not need to repair the internal state. #7135 added a nicer API for initialization. This PR uses it to remove a few indentation levels and the loop construction. The next PR #7175 will use the refactorings done in this PR, and always initialize the internal state after a download. Cc: #5331	2024-03-20 18:03:09 +02:00

1 2 3 4 5 ...

4930 Commits