rust/neon

mirror of https://github.com/neondatabase/neon.git synced 2026-01-07 13:32:57 +00:00

Author	SHA1	Message	Date
John Spray	e89e41f8ba	tests: update for tenant generations (#5449 ) ## Problem Some existing tests are written in a way that's incompatible with tenant generations. ## Summary of changes Update all the tests that need updating: this is things like calling through the NeonPageserver.tenant_attach helper to get a generation number, instead of calling directly into the pageserver API. There are various more subtle cases.	2023-12-07 12:27:16 +00:00
Conrad Ludgate	f9401fdd31	proxy: fix channel binding error messages (#6054 ) ## Problem For channel binding failed messages we were still saying "channel binding not supported" in the errors. ## Summary of changes Fix error messages	2023-12-07 11:47:16 +00:00
Joonas Koivunen	b7ffe24426	build: update tokio to 1.34.0, tokio-util 0.7.10 (#6061 ) We should still remember to bump minimum crates for libraries beginning to use task tracker.	2023-12-07 11:31:38 +00:00
Joonas Koivunen	52718bb8ff	fix(layer): metric splitting, span rename (#5902 ) Per [feedback], split the Layer metrics, also finally account for lost and [re-submitted feedback] on `layer_gc` by renaming it to `layer_delete`, `Layer::garbage_collect_on_drop` renamed to `Layer::delete_on_drop`. References to "gc" dropped from metric names and elsewhere. Also fixes how the cancellations were tracked: there was one rare counter. Now there is a top level metric for cancelled inits, and the rare "download failed but failed to communicate" counter is kept. Fixes: #6027 [feedback]: https://github.com/neondatabase/neon/pull/5809#pullrequestreview-1720043251 [re-submitted feedback]: https://github.com/neondatabase/neon/pull/5108#discussion_r1401867311	2023-12-07 11:39:40 +02:00
Joonas Koivunen	10c77cb410	temp: increase the wait tenant activation timeout (#6058 ) 5s is causing way too much noise; this is of course a temporary fix, we should prioritize tenants for which there are pagestream openings the highest, second highest the basebackups. Deployment thread for context: https://neondb.slack.com/archives/C03H1K0PGKH/p1701935048144479?thread_ts=1701765158.926659&cid=C03H1K0PGKH	2023-12-07 09:01:08 +00:00
Heikki Linnakangas	31be301ef3	Make simple_rcu::RcuWaitList::wait() async (#6046 ) The gc_timeline() function is async, but it calls the synchronous wait() function. In the worst case, that could lead to a deadlock by using up all tokio executor threads. In the passing, fix a few typos in comments. Fixes issue #6045. --------- Co-authored-by: Joonas Koivunen <joonas@neon.tech>	2023-12-07 10:20:40 +02:00
Joonas Koivunen	a3c7d400b4	fix: avoid allocations with logging a slug (#6047 ) to_string forces allocating a less than pointer sized string (costing on stack 4 usize), using a Display formattable slug saves that. the difference seems small, but at the same time, we log these a lot.	2023-12-07 07:25:22 +00:00
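The allocation-avoidance idea in the commit above can be sketched as follows. This is a minimal illustration rather than the pageserver's actual code: the `Slug` type and its fields are hypothetical, and only the technique (implementing `Display` instead of calling `to_string` before logging) comes from the commit message.

```rust
use std::fmt;

/// Hypothetical slug type (not from the neon codebase): a short label that
/// is formatted on demand instead of being turned into a `String` up front.
struct Slug<'a> {
    prefix: &'a str,
    id: u32,
}

impl fmt::Display for Slug<'_> {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        // Writes straight into the caller's formatter.
        write!(f, "{}-{}", self.prefix, self.id)
    }
}

fn main() {
    let slug = Slug { prefix: "layer", id: 7 };
    // The slug is formatted as part of the message; no separate String
    // is built for it first, unlike `slug.to_string()`.
    println!("evicting {slug}");
}
```

The saving per call is small, but as the commit notes, it adds up for values that are logged frequently.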
Vadim Kharitonov	7501ca6efb	Revert timescaledb for pg14 and pg15 (#6056 ) ``` could not start the compute node: compute is in state "failed": db error: ERROR: could not access file "$libdir/timescaledb-2.10.1": No such file or directory Caused by: ERROR: could not access file "$libdir/timescaledb-2.10.1": No such file or directory ```	2023-12-06 15:12:36 +00:00
Christian Schwarz	987c9aaea0	virtual_file: fix the metric for close() calls done by VirtualFile::drop (#6051 ) Before this PR we would inc() the counter for `Close` even though the slot's FD had already been closed. Especially visible when subtracting `open` from `close+close-by-replace` on a system that does a lot of attach and detach. refs https://github.com/neondatabase/cloud/issues/8440 refs https://github.com/neondatabase/cloud/issues/8351	2023-12-06 12:05:28 +00:00
Konstantin Knizhnik	7fab731f65	Track size of FSM fork while applying records at replica (#5901 ) ## Problem See https://neondb.slack.com/archives/C04DGM6SMTM/p1700560921471619 ## Summary of changes Update relation size cache for FSM fork in WAL records filter ## Checklist before requesting a review - [ ] I have performed a self-review of my code. - [ ] If it is a core feature, I have added thorough tests. - [ ] Do we need to implement analytics? if so did you add the relevant metrics to the dashboard? - [ ] If this PR requires public announcement, mark it with /release-notes label and add several sentences in this section. ## Checklist before merging - [ ] Do not forget to reformat commit message to not include the above checklist Co-authored-by: Konstantin Knizhnik <knizhnik@neon.tech>	2023-12-05 18:49:24 +02:00
John Spray	483caa22c6	pageserver: logging tweaks (#6039 ) - The `Attaching tenant` log message omitted some useful information like the generation and mode - info-level messages about writing configuration files were unnecessarily verbose - During process shutdown, we don't emit logs about the various phases: this is very cheap to log since we do it once per process lifetime, and is helpful when figuring out where something got stuck during a hang.	2023-12-05 16:11:15 +00:00
John Spray	da5e03b0d8	pageserver: add a /reset API for tenants (#6014 ) ## Problem Traditionally we would detach/attach directly with curl if we wanted to "reboot" a single tenant. That's kind of inconvenient these days, because one needs to know a generation number to issue an attach request. Closes: https://github.com/neondatabase/neon/issues/6011 ## Summary of changes - Introduce a new `/reset` API, which remembers the LocationConf from the current attachment so that callers do not have to work out the correct configuration/generation to use. - As an additional support tool, allow an optional `drop_cache` query parameter, for situations where we are concerned that some on-disk state might be bad and want to clear that as well as the in-memory state. One might wonder why I didn't call this "reattach" -- it's because there's already a PS->CP API of that name and it could get confusing.	2023-12-05 15:38:27 +00:00
John Spray	be885370f6	pageserver: remove redundant unsafe_create_dir_all (#6040 ) This non-fsyncing analog to our safe directory creation function was just duplicating what tokio's fs::create_dir_all does.	2023-12-05 15:03:07 +00:00
Alexey Kondratov	bc1020f965	compute_ctl: Notify waiters when Postgres failed to start (#6034 ) In case of configuring the empty compute, API handler is waiting on condvar for compute state change. Yet, previously if Postgres failed to start we were just setting compute status to `Failed` without notifying. It causes a timeout on control plane side, although we can return a proper error from compute earlier. With this commit API handler should be properly notified.	2023-12-05 13:38:45 +01:00
John Spray	61fe9d360d	pageserver: add Key->Shard mapping logic & use it in page service (#5980 ) ## Problem When a pageserver receives a page service request identified by TenantId, it must decide which `Tenant` object to route it to. As in earlier PRs, this stuff is all a no-op for tenants with a single shard: calls to `is_key_local` always return true without doing any hashing on a single-shard ShardIdentity. Closes: https://github.com/neondatabase/neon/issues/6026 ## Summary of changes - Carry immutable `ShardIdentity` objects in Tenant and Timeline. These provide the information that Tenants/Timelines need to figure out which shard is responsible for which Key. - Augment `get_active_tenant_with_timeout` to take a `ShardSelector` specifying how the shard should be resolved for this tenant. This mode depends on the kind of request (e.g. basebackups always go to shard zero). - In `handle_get_page_at_lsn_request`, handle the case where the Timeline we looked up at connection time is not the correct shard for the page being requested. This can happen whenever one node holds multiple shards for the same tenant. This is currently written as a "slow path" with the optimistic expectation that usually we'll run with one shard per pageserver, and the Timeline resolved at connection time will be the one serving page requests. There is scope for optimization here later, to avoid doing the full shard lookup for each page. - Omit consumption metrics from nonzero shards: only the 0th shard is responsible for tracing accurate relation sizes. Note to reviewers: - Testing of these changes is happening separately on the `jcsp/sharding-pt1` branch, where we have hacked neon_local etc needed to run a test_pg_regress. - The main caveat to this implementation is that page service connections still look up one Timeline when the connection is opened, before they know which pages are going to be read. 
If there is one shard per pageserver then this will always also be the Timeline that serves page requests. However, if multiple shards are on one pageserver then get page requests will incur the cost of looking up the correct Timeline on each getpage request. We may look to improve this in future with a "sticky" timeline per connection handler so that subsequent requests for the same Timeline don't have to look up again, and/or by having postgres pass a shard hint when connecting. This is tracked in the "Loose ends" section of https://github.com/neondatabase/neon/issues/5507	2023-12-05 12:01:55 +00:00
Conrad Ludgate	f60e49fe8e	proxy: fix panic in startup packet (#6032 ) ## Problem Panic when less than 8 bytes is presented in a startup packet. ## Summary of changes We need there to be a 4 byte message code, so the expected min length is 8.	2023-12-05 11:24:16 +01:00
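A minimal sketch of the length check described in the commit above, assuming the PostgreSQL wire layout of a big-endian 4-byte length followed by a big-endian 4-byte message code. `parse_startup_header` and `MIN_STARTUP_PACKET_LEN` are hypothetical names for illustration, not the proxy's real API.

```rust
/// Minimum bytes needed: a 4-byte length prefix plus a 4-byte message code.
const MIN_STARTUP_PACKET_LEN: usize = 8;

/// Hypothetical helper: returns (declared length, message code), or an
/// error instead of panicking on short input.
fn parse_startup_header(buf: &[u8]) -> Result<(u32, u32), String> {
    if buf.len() < MIN_STARTUP_PACKET_LEN {
        return Err(format!(
            "startup packet too short: got {} bytes, need at least {}",
            buf.len(),
            MIN_STARTUP_PACKET_LEN
        ));
    }
    // Safe to index: the length check above guarantees 8 bytes.
    let len = u32::from_be_bytes([buf[0], buf[1], buf[2], buf[3]]);
    let code = u32::from_be_bytes([buf[4], buf[5], buf[6], buf[7]]);
    if (len as usize) < MIN_STARTUP_PACKET_LEN {
        return Err(format!("declared length {len} is below the minimum"));
    }
    Ok((len, code))
}

fn main() {
    // Declared length 8, code 196608 (protocol version 3.0).
    let ok = [0u8, 0, 0, 8, 0, 3, 0, 0];
    println!("{:?}", parse_startup_header(&ok));
    // A truncated packet is rejected rather than indexed out of bounds.
    println!("{:?}", parse_startup_header(&[0, 0, 0]));
}
```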
Anna Khanova	c48918d329	Rename metric (#6030 ) ## Problem It looks like the reallocation of the buckets in the previous PR broke the metric in Grafana. ## Summary of changes Renamed the metric.	2023-12-05 10:03:07 +00:00
Sasha Krassovsky	bad686bb71	Remove trusted from wal2json (#6035 ) ## Problem ## Summary of changes	2023-12-04 21:10:23 +00:00
Alexey Kondratov	85d08581ed	[compute_ctl] Introduce feature flags in the compute spec (#6016 ) ## Problem In the past we've rolled out all new `compute_ctl` functionality right to all users, which could be risky. I want to have a more fine-grained control over what we enable, in which env and to which users. ## Summary of changes Add an option to pass a list of feature flags to `compute_ctl`. If not passed, it defaults to an empty list. Any unknown flags are ignored. This allows us to release new experimental features safer, as we can then flip the flag for one specific user, only Neon employees, free / pro / etc. users and so on. Or control it per environment. In the current implementation feature flags are passed via compute spec, so they do not allow controlling behavior of `empty` computes. For them, we can either stick with the previous approach, i.e. add separate cli args or introduce a more generic `--features` cli argument.	2023-12-04 19:54:18 +01:00
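The flag-handling rules stated in the commit above (a missing list defaults to empty, unknown flags are ignored) could look roughly like this. The `ComputeFeature` variants, the flag strings, and `parse_features` are all hypothetical placeholders; the commit message does not name any concrete flags or say how the spec is deserialized.

```rust
/// Hypothetical flag set; the real flag names are not given in the commit.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum ComputeFeature {
    ExampleFeatureA,
    ExampleFeatureB,
}

impl ComputeFeature {
    /// Maps a flag name to a known feature; unknown names yield `None`.
    fn from_name(name: &str) -> Option<Self> {
        match name {
            "example_feature_a" => Some(Self::ExampleFeatureA),
            "example_feature_b" => Some(Self::ExampleFeatureB),
            _ => None,
        }
    }
}

/// A missing list behaves like an empty one; unknown flags are dropped.
fn parse_features(raw: Option<&[&str]>) -> Vec<ComputeFeature> {
    raw.unwrap_or(&[])
        .iter()
        .filter_map(|name| ComputeFeature::from_name(name))
        .collect()
}

fn main() {
    let spec_flags = ["example_feature_a", "not_a_real_flag"];
    println!("{:?}", parse_features(Some(&spec_flags[..])));
    println!("{:?}", parse_features(None));
}
```

Ignoring unknown flags means a flag can be set for a user before every compute version knows about it, which fits the gradual rollout described in the commit.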
Christian Schwarz	c7f1143e57	concurrency-limit low-priority initial logical size calculation [v2] (#6000 ) Problem ------- Before this PR, there was no concurrency limit on initial logical size computations. While logical size computations are lazy in theory, in practice (production), they happen in a short timeframe after restart. This means that on a PS with 20k tenants, we'd have up to 20k concurrent initial logical size calculation requests. This is self-inflicted needless overload. This hasn't been a problem so far because the `.await` points on the logical size calculation path never return `Pending`, hence we have a natural concurrency limit of the number of executor threads. But, as soon as we return `Pending` somewhere in the logical size calculation path, other concurrent tasks get scheduled by tokio. If these other tasks are also logical size calculations, they eventually pound on the same bottleneck. For example, in #5479, we want to switch the VirtualFile descriptor cache to a `tokio::sync::RwLock`, which makes us return `Pending`, and without measures like this patch, after PS restart, VirtualFile descriptor cache thrashes heavily for 2 hours until all the logical size calculations have been computed and the degree of concurrency / concurrent VirtualFile operations is down to regular levels. See the Experiment section below for details. <!-- Experiments (see below) show that plain #5479 causes heavy thrashing of the VirtualFile descriptor cache. The high degree of concurrency is too much for In the case of #5479 the VirtualFile descriptor cache size starts thrashing heavily. --> Background ---------- Before this PR, initial logical size calculation was spawned lazily on first call to `Timeline::get_current_logical_size()`. 
In practice (prod), the lazy calculation is triggered by `WalReceiverConnectionHandler` if the timeline is active according to storage broker, or by the first iteration of consumption metrics worker after restart (`MetricsCollection`). The spawns by walreceiver are high-priority because logical size is needed by Safekeepers (via walreceiver `PageserverFeedback`) to enforce the project logical size limit. The spawns by metrics collection are not on the user-critical path and hence low-priority. [^consumption_metrics_slo] [^consumption_metrics_slo]: We can't delay metrics collection indefinitely because there are TBD internal SLOs tied to metrics collection happening in a timely manner (https://github.com/neondatabase/cloud/issues/7408). But let's ignore that in this issue. The ratio of walreceiver-initiated spawns vs consumption-metrics-initiated spawns can be reconstructed from logs (`spawning logical size computation from context of task kind {:?}"`). PRs #5995 and #6018 add metrics for this. First investigation of the ratio led to the discovery that walreceiver spawns 75% of init logical size computations. That's because of two bugs: - In Safekeepers: https://github.com/neondatabase/neon/issues/5993 - In interaction between Pageservers and Safekeepers: https://github.com/neondatabase/neon/issues/5962 The safekeeper bug is likely primarily responsible but we don't have the data yet. The metrics will hopefully provide some insights. When assessing production-readiness of this PR, please assume that neither of these bugs is fixed yet. Changes In This PR ------------------ With this PR, initial logical size calculation is reworked as follows: First, all initial logical size calculation task_mgr tasks are started early, as part of timeline activation, and run a retry loop with long back-off until success. This removes the lazy computation; it was needless complexity because in practice, we compute all logical sizes anyway, because consumption metrics collects them. 
Second, within the initial logical size calculation task, each attempt queues behind the background loop concurrency limiter semaphore. This fixes the performance issue that we pointed out in the "Problem" section earlier. Third, there is a twist to queuing behind the background loop concurrency limiter semaphore. Logical size is needed by Safekeepers (via walreceiver `PageserverFeedback`) to enforce the project logical size limit. However, we currently do open walreceiver connections even before we have an exact logical size. That's bad, and I'll build on top of this PR to fix that (https://github.com/neondatabase/neon/issues/5963). But, for the purposes of this PR, we don't want to introduce a regression, i.e., we don't want to provide an exact value later than before this PR. The solution is to introduce a priority-boosting mechanism (`GetLogicalSizePriority`), allowing callers of `Timeline::get_current_logical_size` to specify how urgently they need an exact value. The effect of specifying high urgency is that the initial logical size calculation task for the timeline will skip the concurrency limiting semaphore. This should yield effectively the same behavior as we had before this PR with lazy spawning. Last, the priority-boosting mechanism obsoletes the `init_order`'s grace period for initial logical size calculations. It's a separate commit to reduce the churn during review. We can drop that commit if people think it's too much churn, and commit it later once we know this PR here worked as intended. Experiment With #5479 --------------------- I validated this PR combined with #5479 to assess whether we're making forward progress towards asyncification. The setup is an `i3en.3xlarge` instance with 20k tenants, each with one timeline that has 9 layers. All tenants are inactive, i.e., not known to SKs nor storage broker. This means all initial logical size calculations are spawned by consumption metrics `MetricsCollection` task kind. 
The consumption metrics worker starts requesting logical sizes at low priority immediately after restart. This is achieved by deleting the consumption metrics cache file on disk before starting PS.[^consumption_metrics_cache_file] [^consumption_metrics_cache_file] Consumption metrics worker persists its interval across restarts to achieve persistent reporting intervals across PS restarts; delete the state file on disk to get predictable (and I believe worst-case in terms of concurrency during PS restart) behavior. Before this patch, all of these timelines would all do their initial logical size calculation in parallel, leading to extreme thrashing in page cache and virtual file cache. With this patch, the virtual file cache thrashing is reduced significantly (from 80k `open`-system-calls/second to ~500 `open`-system-calls/second during loading). ### Critique The obvious critique with above experiment is that there's no skipping of the semaphore, i.e., the priority-boosting aspect of this PR is not exercised. If even just 1% of our 20k tenants in the setup were active in SK/storage_broker, then 200 logical size calculations would skip the limiting semaphore immediately after restart and run concurrently. Further critique: given the two bugs wrt timeline inactive vs active state that were mentioned in the Background section, we could have 75% of our 20k tenants being (falsely) active on restart. So... (next section) This Doesn't Make Us Ready For Async VirtualFile ------------------------------------------------ This PR is a step towards asynchronous `VirtualFile`, aka, #5479 or even #4744. But it doesn't yet enable us to ship #5479. The reason is that this PR doesn't limit the amount of high-priority logical size computations. If there are many high-priority logical size calculations requested, we'll fall over like we did if #5479 is applied without this PR. 
And currently, at the very least due to the bugs mentioned in the Background section, we run thousands of high-priority logical size calculations on PS startup in prod. So, at a minimum, we need to fix these bugs. Then we can ship #5479 and #4744, and things will likely be fine under normal operation. But in high-traffic situations, overload problems will still be more likely to happen, e.g., VirtualFile cache descriptor thrashing. The solution candidates for that are orthogonal to this PR though: * global concurrency limiting * per-tenant rate limiting => #5899 * load shedding * scaling bottleneck resources (fd cache size (neondatabase/cloud#8351), page cache size (neondatabase/cloud#8351), spread load across more PSes, etc) Conclusion ---------- Even with the remarks in the previous section, we should merge this PR because: 1. it's an improvement over the status quo (esp. if the aforementioned bugs wrt timeline active / inactive are fixed) 2. it prepares the way for https://github.com/neondatabase/neon/pull/6010 3. it gets us close to shipping #5479 and #4744	2023-12-04 17:22:26 +00:00
Christian Schwarz	7403d55013	walredo: stderr cleanup & make explicitly cancel safe (#6031 ) # Problem I need walredo to be cancellation-safe for https://github.com/neondatabase/neon/pull/6000#discussion_r1412049728 # Solution We are only `async fn` because of `wait_for(stderr_logger_task_done).await`, added in #5560 . The `stderr_logger_cancel` and `stderr_logger_task_done` were there out of precaution that the stderr logger task might for some reason not stop when the walredo process terminates. That hasn't been a problem in practice. So, simplify things: - remove `stderr_logger_cancel` and the `wait_for(...stderr_logger_task_done...)` - use `tokio::process::ChildStderr` in the stderr logger task - add metrics to track number of running stderr logger tasks so in case I'm wrong here, we can use these metrics to identify the issue (not planning to put them into a dashboard or anything)	2023-12-04 16:06:41 +00:00
Anna Khanova	12f02523a4	Enable dynamic rate limiter (#6029 ) ## Problem Limit the number of open connections between the control plane and proxy. ## Summary of changes Enable dynamic rate limiter in prod. Unfortunately the latency metrics are a bit broken, but from logs I see that on staging for the past 7 days only 2 times latency for acquiring was greater than 1ms (for most of the cases it's insignificant).	2023-12-04 15:00:24 +00:00
Arseny Sher	207c527270	Safekeepers: persist state before timeline deactivation. Without it, sometimes on restart we lose latest remote_consistent_lsn which leads to excessive ps -> sk reconnections. https://github.com/neondatabase/neon/issues/5993	2023-12-04 18:22:36 +04:00
John Khvatov	eae49ff598	Perform L0 compaction before creating new image layers (#5950 ) If there are too many L0 layers before compaction, the compaction process becomes slow because of slow `Timeline::get`. As a result of the slowdown, the pageserver will generate even more L0 layers for the next iteration, further exacerbating the slow performance. Change to perform L0 -> L1 compaction before creating new images. The simple change speeds up compaction time and `Timeline::get` to 5x. `Timeline::get` is faster on top of L1 layers. Co-authored-by: Joonas Koivunen <joonas@neon.tech>	2023-12-04 12:35:09 +00:00
Alexander Bayandin	e6b2f89fec	test_pg_clients: fix test that reads from stdout (#6021 ) ## Problem `test_pg_clients` reads the actual result from a .stdout file, https://github.com/neondatabase/neon/pull/5977 has added a header to such files, so `test_pg_clients` started to fail. ## Summary of changes - Use `capture_stdout` and compare the expected result with the output instead of .stdout file content	2023-12-04 11:18:41 +00:00
John Spray	1d81e70d60	pageserver: tweak logs for index_part loading (#6005 ) ## Problem On pageservers upgraded to enable generations, these INFO level logs were rather frequent. If a tenant timeline hasn't written new layers since the upgrade, it will emit the "No index_part.json*" log every time it starts. ## Summary of changes - Downgrade two log lines from info to debug - Add a tiny unit test that I wrote for sanity-checking that there wasn't something wrong with our Generation-comparing logic when loading index parts.	2023-12-04 09:57:47 +00:00
Anastasia Lubennikova	e3512340c1	Override neon.max_cluster_size for the time of compute_ctl (#5998 ) Temporarily reset neon.max_cluster_size to avoid the possibility of hitting the limit, while we are applying config: creating new extensions, roles, etc...	2023-12-03 15:21:44 +00:00
Christian Schwarz	e43cde7aba	initial logical size: remove CALLS metric from hot path (#6018 ) Only introduced a few hours ago (#5995), I took a look at the numbers from staging and realized that `get_current_logical_size()` is on the walingest hot path: we call it for every `ReplicationMessage::XLogData` that we receive. Since the metric is global, it would be quite a busy cache line. This PR replaces it with a new metric purpose-built for what's most interesting right now.	2023-12-01 22:45:04 +01:00
Alexey Kondratov	c1295bfb3a	[compute_ctl] Use correct HTTP code in the /configure errors (#6017 ) It was using `PRECONDITION_FAILED` for errors during `ComputeSpec` to `ParsedSpec` conversion, but this disobeys the OpenAPI spec [1] and the correct code should be `BAD_REQUEST` for any spec processing errors. While at it, I also noticed that the `compute_ctl` OpenAPI spec has an invalid format and fixed it. [1] `fd81945a60/compute_tools/src/http/openapi_spec.yaml (L119-L120)`	2023-12-01 18:19:55 +01:00
Joonas Koivunen	711425cc47	fix: use create_new instead of create for mutex file (#6012 ) Using create_new makes the uninit marker work as a mutual exclusion primitive. Temporary hopefully.	2023-12-01 18:30:51 +02:00
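The difference the commit above relies on can be shown with the standard library alone: `OpenOptions::create_new(true)` fails with `AlreadyExists` if the file is already there, so only one caller can succeed in creating it. The `try_claim` helper and the marker file name below are hypothetical and only illustrate the technique.

```rust
use std::fs::{self, OpenOptions};
use std::io;
use std::path::Path;

/// Hypothetical helper: returns Ok(true) if this call created the marker
/// file, Ok(false) if it already existed (someone else holds the claim).
fn try_claim(marker: &Path) -> io::Result<bool> {
    match OpenOptions::new().write(true).create_new(true).open(marker) {
        Ok(_) => Ok(true),
        Err(e) if e.kind() == io::ErrorKind::AlreadyExists => Ok(false),
        Err(e) => Err(e),
    }
}

fn main() {
    // Illustrative path only; the real uninit marker lives elsewhere.
    let marker = std::env::temp_dir()
        .join(format!("example-uninit-mark-{}", std::process::id()));
    let _ = fs::remove_file(&marker);

    println!("first claim succeeded: {}", try_claim(&marker).unwrap());
    println!("second claim succeeded: {}", try_claim(&marker).unwrap());

    fs::remove_file(&marker).unwrap();
}
```

With plain `create(true)` both calls would succeed, so the file could not serve as a mutual exclusion primitive.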
bojanserafimov	fd81945a60	Use TEST_OUTPUT envvar in pageserver (#5984 )	2023-12-01 09:16:24 -05:00
bojanserafimov	e49c21a3cd	Speed up rel extend (#5983 )	2023-12-01 09:11:41 -05:00
Anastasia Lubennikova	92e7cd40e8	add sql_exporter to vm-image (#5949 ) expose LFC metrics	2023-12-01 13:40:49 +00:00
Alexander Bayandin	7eabfc40ee	test_runner: use separate directory for each rerun (#6004 ) ## Problem While investigating https://github.com/neondatabase/neon/issues/5854, we hypothesised that logs/repo-dir from the initial failure might leak into reruns. Use different directories for each run to avoid such a possibility. ## Summary of changes - make each test rerun use different directories - update `pytest-rerunfailure` plugin from 11.1.2 to 13.0	2023-12-01 13:26:19 +00:00
Christian Schwarz	ce1652990d	logical size: better represent level of accuracy in the type system (#5999 ) I would love to not expose the inaccurate value in the mgmt API at all, and in fact control plane doesn't use it [^1]. But our tests do, and I have no desire to change them at this time. [^1]: https://github.com/neondatabase/cloud/pull/8317	2023-12-01 14:16:29 +01:00
Christian Schwarz	8cd28e1718	logical size calculation: make .current_size() infallible (#5999 ) ... by panicking on overflow. It was made fallible initially due to a lack of confidence in the logical size calculation. However, the error has never happened since I have been at Neon. Let's stop worrying about this by converting the overflow check into a panic.	2023-12-01 14:16:29 +01:00
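Converting a fallible overflow check into a panic, as the commit above describes, might look like the sketch below. The function signature and the split into an initial value plus a signed delta are assumptions made for illustration; the commit message does not show how the size is actually represented.

```rust
/// Hypothetical infallible accessor: combines an initial size with a signed
/// incremental delta and panics if the result does not fit in a u64.
fn current_size(initial: u64, delta: i64) -> u64 {
    initial
        .checked_add_signed(delta)
        .expect("logical size calculation overflowed")
}

fn main() {
    // Normal case: the delta shrinks the size.
    println!("{}", current_size(100, -30));
    // An underflow such as current_size(10, -30) would panic rather than
    // return an error the caller has to handle.
}
```

Callers no longer need a `Result` branch for a condition that, per the commit, has never been observed.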
Christian Schwarz	1c88824ed0	initial logical size calculation: add a bunch of metrics (#5995 ) These will help us answer questions such as: - when & at what do calculations get started after PS restart? - how often is the api to get current incrementally-computed logical size called, and does it return Exact vs Approximate? I'd also be interested in a histogram of how much wall clock time size calculations take, but, I don't know good bucket sizes, and, logging it would introduce yet another per-timeline log message during startup; don't think that's worth it just yet. Context - https://neondb.slack.com/archives/C033RQ5SPDH/p1701197668789769 - https://github.com/neondatabase/neon/issues/5962 - https://github.com/neondatabase/neon/issues/5963 - https://github.com/neondatabase/neon/pull/5955 - https://github.com/neondatabase/cloud/issues/7408	2023-12-01 12:52:59 +01:00
Arpad Müller	1ce1c82d78	Clean up local state if index_part.json request gives 404 (#6009 ) If `index_part.json` is (verifiably) not present on remote storage, we should regard the timeline as inexistent. This lets `clean_up_timelines` purge the partial local disk state, which is important in the case of incomplete creations leaving behind state that hinders retries. For incomplete deletions, we also want the timeline's local disk content be gone completely. The PR removes the allowed warnings added by #5390 and #5912, as we now are only supposed to issue info level messages. It also adds a reproducer for #6007, by parametrizing the `test_timeline_init_break_before_checkpoint_recreate` test added by #5390. If one reverts the .rs changes, the "cannot create its uninit mark file" log line occurs once one comments out the failing checks for the local disk state being actually empty. Closes #6007 --------- Co-authored-by: Joonas Koivunen <joonas@neon.tech>	2023-12-01 10:58:06 +00:00
Vadim Kharitonov	f784e59b12	Update timescaledb to 2.13.0 (#5975 ) TimescaleDB has released 2.13.0. This version is compatible with Postgres16	2023-11-30 17:12:52 -06:00
Arpad Müller	b71b8ecfc2	Add existing_initdb_timeline_id param to timeline creation (#5912 ) This PR adds an `existing_initdb_timeline_id` option to timeline creation APIs, taking an optional timeline ID. Follow-up of #5390. If the `existing_initdb_timeline_id` option is specified via the HTTP API, the pageserver downloads the existing initdb archive from the given timeline ID and extracts it, instead of running initdb itself. --------- Co-authored-by: Christian Schwarz <christian@neon.tech>	2023-11-30 22:32:04 +01:00
Arpad Müller	3842773546	Correct RFC number for Pageserver WAL DR RFC (#5997 ) When I opened #5248, 27 was an unused RFC number. Since then, two RFCs have been merged, so now 27 is taken. 29 is free though, so move it there.	2023-11-30 21:01:25 +00:00
Conrad Ludgate	f39fca0049	proxy: chore: replace strings with SmolStr (#5786 ) ## Problem no problem ## Summary of changes replaces boxstr with arcstr as it's cheaper to clone. mild perf improvement. probably should look into other smallstring optimisations tbh, they will likely be even better. The longest endpoint name I was able to construct is something like `ep-weathered-wildflower-12345678` which is 32 bytes. Most string optimisations top out at 23 bytes	2023-11-30 20:52:30 +00:00
Joonas Koivunen	b451e75dc6	test: include cmdline in captured output (#5977 ) aiming for faster to understand a bunch of `.stdout` and `.stderr` files, see example echo_1.stdout differences: ``` +# echo foobar abbacd + foobar abbacd ``` it can be disabled and is disabled in this PR for some tests; use `pg_bin.run_capture(..., with_command_header=False)` for that. as a bonus this cleans up the echoed newlines from s3_scrubber output which are also saved to file but echoed to test log. Co-authored-by: Alexander Bayandin <alexander@neon.tech>	2023-11-30 17:31:03 +00:00
Anna Khanova	3657a3c76e	Proxy fix metrics record (#5996 ) ## Problem Some latency metrics are recorded in inconsistent way. ## Summary of changes Make sure that everything is recorded in seconds.	2023-11-30 16:33:54 +00:00
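One way to keep latency metrics in a single unit, as the commit above asks, is to convert every `Duration` at one point before recording it. `as_metric_seconds` is a hypothetical helper, and the metrics library call is left out since the commit does not show it.

```rust
use std::time::{Duration, Instant};

/// Hypothetical helper: the single place where durations are converted to
/// the unit used by all latency metrics (fractional seconds).
fn as_metric_seconds(elapsed: Duration) -> f64 {
    elapsed.as_secs_f64()
}

fn main() {
    let started = Instant::now();
    // ... the measured operation would run here ...
    let elapsed = started.elapsed();
    println!("measured: {} s", as_metric_seconds(elapsed));

    // A fixed duration, to show the unit: 1500 ms is recorded as 1.5 s.
    println!("fixed: {} s", as_metric_seconds(Duration::from_millis(1500)));
}
```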
Joonas Koivunen	eba3bfc57e	test: python needs thread safety as well (#5992 ) we have test cases which launch processes from threads, and they capture output assuming this counter is thread-safe. at least according to my understanding this operation in python requires a lock to be thread-safe.	2023-11-30 15:48:40 +00:00
John Spray	57ae9cd07f	pageserver: add `flush_ms` and document `/location_config` API (#5860 ) - During migration of tenants, it is useful for callers to `/location_conf` to flush a tenant's layers while transitioning to AttachedStale: this optimization reduces the redundant WAL replay work that the tenant's new attached pageserver will have to do. Test coverage for this will come as part of the larger tests for live migration in #5745 #5842 - Flushing is controlled with `flush_ms` query parameter: it is the caller's job to decide how long they want to wait for a flush to complete. If flush is not complete within the time limit, the pageserver proceeds to succeed anyway: flushing is only an optimization. - Add swagger definitions for all this: the location_config API is the primary interface for driving tenant migration as described in docs/rfcs/028-pageserver-migration.md, and will eventually replace the various /attach /detach /load /ignore APIs. --------- Co-authored-by: Joonas Koivunen <joonas@neon.tech>	2023-11-30 14:22:07 +00:00
Christian Schwarz	3bb1030f5d	walingest: refactor if-cascade on `decoded.xl_rmid` into match statement (#5974 ) refs https://github.com/neondatabase/neon/issues/5962 --------- Co-authored-by: Joonas Koivunen <joonas@neon.tech>	2023-11-30 14:07:41 +00:00
John Spray	5d3c3636fc	tests: add a log allow list entry in `test_timeline_deletion_with_files_stuck_in_upload_queue` (#5981 ) Test failure seen here: https://neon-github-public-dev.s3.amazonaws.com/reports/pr-5860/7032903218/index.html#suites/837740b64a53e769572c4ed7b7a7eeeb/c0f1c79a70a3b9ab ``` E AssertionError: assert not [(302, '2023-11-29T13:23:51.046801Z ERROR request{method=PUT path=/v1/tenant/f6b845de60cb0e92f4426e0d6af1d2ea/timeline/69a8c98004abe71a281cff8642a45274/checkpoint request_id=eca33d8a-7af2-46e7-92ab-c28629feb42c}: Error processing HTTP request: InternalServerError(queue is in state Stopped\n')] ``` This appears to be a legitimate log: the test is issuing checkpoint requests in the background, and deleting (therefore shutting down) a timeline.	2023-11-30 13:44:14 +00:00
Conrad Ludgate	0c87d1866b	proxy: fix wake_compute error prop (#5989 ) ## Problem fixes #5654 - WakeComputeErrors occurring during a connect_to_compute got propagated as IO errors, which get forwarded to the user as "Couldn't connect to compute node" with no helpful message. ## Summary of changes Handle WakeComputeError during ConnectionError properly	2023-11-30 13:43:21 +00:00
Arpad Müller	8ec6033ed8	Pageserver disaster recovery RFC (#5248 ) Enable the pageserver to recover from data corruption events by implementing a feature to re-apply historic WAL records in parallel to the already occurring WAL replay. The feature is outside of the user-visible backup and history story, and only serves as a second-level backup for the case that there is a bug in the pageservers that corrupted the served pages. The RFC proposes the addition of two new features: * recover a broken branch from WAL (downtime is allowed) * a test recovery system to recover random branches to make sure recovery works	2023-11-30 14:30:17 +01:00

4152 Commits