rust/neon - neon - Gitea: Git with a cup of tea

rust/neon

mirror of https://github.com/neondatabase/neon.git synced 2026-05-24 00:20:37 +00:00

Author	SHA1	Message	Date
Christian Schwarz	e603337f39	WIP	2024-02-05 11:19:19 +00:00
Christian Schwarz	cb4908dedb	Merge branch 'problame/2024-02-walredo-work/prespawn/switch-to-heavier-once-cell-with-rwlock' into problame/2024-02-walredo-work/prespawn/impl	2024-02-02 18:36:19 +00:00
Christian Schwarz	f8652fc738	Merge branch 'problame/2024-02-walredo-work/prespawn/heaver-once-cell-for-process-launch' into problame/2024-02-walredo-work/prespawn/switch-to-heavier-once-cell-with-rwlock	2024-02-02 18:36:17 +00:00
Christian Schwarz	bbf954d411	Merge branch 'problame/2024-02-walredo-work/prespawn/broken-tenants-no-walredo' into problame/2024-02-walredo-work/prespawn/heaver-once-cell-for-process-launch	2024-02-02 18:36:16 +00:00
Christian Schwarz	efb3e7bb15	Merge branch 'problame/2024-02-walredo-work/prespawn/split-code' into problame/2024-02-walredo-work/prespawn/broken-tenants-no-walredo	2024-02-02 18:36:15 +00:00
Christian Schwarz	01688a5ce1	Merge branch 'main' into problame/2024-02-walredo-work/prespawn/split-code	2024-02-02 18:36:14 +00:00
John Spray	2e5eab69c6	tests: remove test_gc_cutoff (#6587 ) This test became flaky when postgres retry handling was fixed to use backoff delays -- each iteration in this test's loop was taking much longer because pgbench doesn't fail until postgres has given up on retrying to the pageserver. We are just removing it, because the condition it tests is no longer risky: we reload all metadata from remote storage on restart, so crashing directly between making local changes and doing remote uploads isn't interesting any more. Closes: https://github.com/neondatabase/neon/issues/2856 Closes: https://github.com/neondatabase/neon/issues/5329	2024-02-02 18:20:18 +00:00
Christian Schwarz	86bd14181e	Merge branch 'problame/2024-02-walredo-work/prespawn/switch-to-heavier-once-cell-with-rwlock' into problame/2024-02-walredo-work/prespawn/impl	2024-02-02 17:34:49 +00:00
Christian Schwarz	64b4b498a4	Revert "remove the walredo usage, that'll be in the next pr" This reverts commit `20e82629df`.	2024-02-02 17:25:25 +00:00
Christian Schwarz	20e82629df	remove the walredo usage, that'll be in the next pr	2024-02-02 17:21:59 +00:00
Christian Schwarz	0a09cff816	heavier_once_cell: switch to tokio::sync::RwLock Using the RwLock reduces contention on the hot path.	2024-02-02 17:09:56 +00:00
Christian Schwarz	c29532cded	Revert "Revert "[DO NOT MERGE] refactor(walredo): use replace RwLock with heavier_once_cell"" This reverts commit `6d94d9fb19`.	2024-02-02 16:43:14 +00:00
Christian Schwarz	1102d3f0bf	Revert "switch to tokio::RwLock" This reverts commit `e8f1af5527`.	2024-02-02 16:43:08 +00:00
Christian Schwarz	e8f1af5527	switch to tokio::RwLock	2024-02-02 16:42:54 +00:00
Christian Schwarz	6d94d9fb19	Revert "[DO NOT MERGE] refactor(walredo): use replace RwLock with heavier_once_cell" This reverts commit `2ab2608d4c`.	2024-02-02 16:15:37 +00:00
Christian Schwarz	84169c926a	Merge branch 'problame/2024-02-walredo-work/prespawn/broken-tenants-no-walredo' into problame/2024-02-walredo-work/prespawn/heaver-once-cell-for-process-launch	2024-02-02 15:53:57 +00:00
Christian Schwarz	acdebf2cec	Merge branch 'problame/2024-02-walredo-work/prespawn/split-code' into problame/2024-02-walredo-work/prespawn/broken-tenants-no-walredo	2024-02-02 15:53:56 +00:00
Christian Schwarz	44cb5e5be6	Merge branch 'main' into problame/2024-02-walredo-work/prespawn/split-code	2024-02-02 15:53:55 +00:00
John Spray	46fb1a90ce	pageserver: avoid calculating/sending logical sizes on shard !=0 (#6567 ) ## Problem Sharded tenants only maintain accurate relation sizes on shard 0. Therefore logical size can only be calculated on shard 0. Fortunately it is also only _needed_ on shard 0, to provide Safekeeper feedback and to send consumption metrics. Closes: #6307 ## Summary of changes - Send 0 for logical size to safekeepers on shards !=0 - Skip logical size warmup task on shards !=0 - Skip imitate_layer_accesses on shards !=0	2024-02-02 15:52:03 +00:00
Christian Schwarz	2ab2608d4c	[DO NOT MERGE] refactor(walredo): use replace RwLock with heavier_once_cell The API is nice, exactly what we want, but we would want a more optimistic underlying sync primitive.	2024-02-02 15:36:15 +00:00
Christian Schwarz	de7d366df3	wip	2024-02-02 15:25:39 +00:00
Christian Schwarz	7c1b2dc9ef	Merge branch 'problame/2024-02-walredo-work/prespawn/broken-tenants-no-walredo' into problame/2024-02-walredo-work/prespawn/impl	2024-02-02 14:56:29 +00:00
Christian Schwarz	f73aa3eb32	refactor(walredo): avoid the need for a WalRedoManager in broken tenants When we'll later introduce a global pool of pre-spawned walredo processes (https://github.com/neondatabase/neon/issues/6581), this refactoring avoids plumbing through the reference to the pool to all the places where we create a broken tenant. Builds atop the refactoring in #6583	2024-02-02 14:52:53 +00:00
Christian Schwarz	2374e1318e	Merge branch 'main' into problame/2024-02-walredo-work/prespawn/split-code	2024-02-02 14:42:30 +00:00
Christian Schwarz	8fe3c9ff55	wip	2024-02-02 14:42:00 +00:00
Christian Schwarz	8e8890530c	wip	2024-02-02 14:26:26 +00:00
John Spray	56171cbe8c	pageserver: more permissive activation timeout when testing (#6564 ) ## Problem The 5 second activation timeout is appropriate for production environments, where we want to give a prompt response to the cloud control plane, and if we fail it will retry the call. In tests however, we don't want every call to e.g. timeline create to have to come with a retry wrapper. This issue has always been there, but it is more apparent in sharding tests that concurrently attach several tenant shards. Closes: https://github.com/neondatabase/neon/issues/6563 ## Summary of changes When `testing` feature is enabled, make `ACTIVE_TENANT_TIMEOUT` 30 seconds instead of 5 seconds.	2024-02-02 15:14:42 +01:00
Arpad Müller	48b05b7c50	Add a time_travel_remote_storage http endpoint (#6533 ) Adds an endpoint to the pageserver to S3-recover an entire tenant to a specific given timestamp. Required input parameters: * `travel_to`: the target timestamp to recover the S3 state to * `done_if_after`: a timestamp that marks the beginning of the recovery process. retries of the query should keep this value constant. it must be after `travel_to`, and also after any changes we want to revert, and must represent a point in time before the endpoint is being called, all of these time points in terms of the time source used by S3. these criteria need to hold even in the face of clock differences, so I recommend waiting a specific amount of time, then taking `done_if_after`, then waiting some amount of time again, and only then issuing the request. Also important to note: the timestamps in S3 work at second accuracy, so one needs to add generous waits before and after for the process to work smoothly (at least 2-3 seconds). We ignore the added test for the mocked S3 for now due to a limitation in moto: https://github.com/getmoto/moto/issues/7300 . Part of https://github.com/neondatabase/cloud/issues/8233	2024-02-02 14:52:12 +01:00
Christian Schwarz	014147a644	wip	2024-02-02 11:50:43 +00:00
Christian Schwarz	aa0e9fdaef	Merge branch 'main' into problame/2024-02-walredo-work/prespawn/split-code	2024-02-02 11:50:15 +00:00
John Spray	24e916d37f	pageserver: fix a syntax error in swagger (#6566 ) A description was written as a follow-on to a section line, rather than in the proper `description:` part. This caused swagger parsers to rightly reject it.	2024-02-02 10:35:09 +00:00
Christian Schwarz	9b8aa270b8	cleanups	2024-02-02 10:19:18 +00:00
Christian Schwarz	4571db1750	extract NeonWalRecord apply logic	2024-02-02 10:14:50 +00:00
Christian Schwarz	6fe534fea3	move protocol ad child module of `process`, where it belongs	2024-02-02 10:05:50 +00:00
Christian Schwarz	8b258e20a0	move more stuff around	2024-02-02 10:03:40 +00:00
Christian Schwarz	29eec6c563	split off walredo process & protocol from walredo.rs	2024-02-02 09:59:31 +00:00
Heikki Linnakangas	350865392c	Print checkpoint key contents with "pagectl print-layer-file" (#6541 ) This was very useful in debugging the bugs fixed in #6410 and #6502. There's a lot more we could do. This only adds the printing to delta layers, not image layers, for example, and it might be useful to print details of more record types. But this is a good start.	2024-02-02 01:35:31 +02:00
Christian Schwarz	1be5e564ce	feat(walredo): use posix_spawn by moving close_fds() work to walredo C code (#6574 ) The rust stdlib uses the efficient `posix_spawn` by default. However, before this PR, pageserver used `pre_exec()` in our `close_fds()` ext trait. This PR moves the work that `close_fds()` did to the walredo C code. I verified manually using `gdb` that we're now forking out the walredo process using `posix_spawn`. refs https://github.com/neondatabase/neon/issues/6565	2024-02-01 22:38:34 +01:00
Christian Schwarz	7a70ef991f	feat(walredo): various observability improvements (#6573 ) - log when we start walredo process - include tenant shard id in walredo argv - dump some basic walredo state in tenant details api - more suitable walredo process launch histogram buckets - avoid duplicate tracing labels in walredo launch spans	2024-02-01 21:59:40 +01:00
Vlad Lazar	221531c9db	pageserver: lift ancestor timeline logic from read path (#6543 ) When the read path needs to follow a key into the ancestor timeline, it needs to wait for said ancestor to become active and aware of it's branching lsn. The logic is lifted into a separate function with it's own new error type. This is done because the vectored read path needs the same logic. It's also the reason for the newly introduced error type. When we'll switch the read path to proxy into `get_vectored`, we can remove the duplicated variants from `PageReconstructError`.	2024-02-01 10:35:18 +00:00
Christian Schwarz	e82625b77d	refactor(pageserver main): signal handling (#6554 ) This refactoring makes it easier to experimentally replace BACKGROUND_RUNTIME with a single-threaded runtime. Found this useful [during benchmarking](https://github.com/neondatabase/neon/pull/6555).	2024-01-31 23:25:57 +00:00
Joonas Koivunen	3d5fab127a	rewrite Gate impl for better observability (#6542 ) changes: - two messages instead of message every second when gate was closing - replace the gate name string by using a pointer - slow GateGuards are likely to log who they were (see example) example found in regress tests: <https://github.com/neondatabase/neon/pull/6542#issuecomment-1919009256>	2024-01-31 22:15:58 +00:00
Joonas Koivunen	66719d7eaf	logging: fix span usage (#6549 ) Fixes some duplication due to extra or misconfigured `#[instrument]`, while filling in the `timeline_id` to delete timeline flow calls.	2024-01-31 20:52:00 +00:00
Konstantin Knizhnik	9a9d9beaee	Download SLRU segments on demand (#6151 ) ## Problem See https://github.com/neondatabase/cloud/issues/8673 ## Summary of changes Download missed SLRU segments from page server ## Checklist before requesting a review - [ ] I have performed a self-review of my code. - [ ] If it is a core feature, I have added thorough tests. - [ ] Do we need to implement analytics? if so did you add the relevant metrics to the dashboard? - [ ] If this PR requires public announcement, mark it with /release-notes label and add several sentences in this section. ## Checklist before merging - [ ] Do not forget to reformat commit message to not include the above checklist --------- Co-authored-by: Konstantin Knizhnik <knizhnik@neon.tech> Co-authored-by: Heikki Linnakangas <heikki@neon.tech>	2024-01-31 21:39:18 +02:00
Arpad Müller	47380be12d	Remove version param from get_lsn_by_timestamp (#6551 ) This removes the last remnants of the version param added by #5608 , concluding the transition plan laid out in https://github.com/neondatabase/cloud/pull/7553#discussion_r1370473911 . It follows PR https://github.com/neondatabase/cloud/pull/9202, which we now assume has been deployed to all environments. Full history: * https://github.com/neondatabase/neon/pull/5608 * https://github.com/neondatabase/cloud/pull/7553 * https://github.com/neondatabase/neon/pull/6178 * https://github.com/neondatabase/cloud/pull/9202	2024-01-31 15:30:19 +01:00
John Spray	4010adf653	control_plane/attachment_service: complete APIs (#6394 ) Depends on: https://github.com/neondatabase/neon/pull/6468 ## Problem The sharding service will be used as a "virtual pageserver" by the control plane -- so it needs the set of pageserver APIs that the control plane uses, and to present them under identical URLs, including prefix (/v1). ## Summary of changes - Add missing APIs: - Tenant deletion - Timeline deletion - Node list (used in test now, later in tools) - `/location_config` API (for migrating tenants into the sharding service) - Rework attachment service URLs: - `/v1` prefix is used for pageserver-compatible APIs - `/upcall/v1` prefix is used for APIs that are called by the pageserver (re-attach and validate) - `/debug/v1` prefix is used for endpoints that are for testing - `/control/v1` prefix is used for new sharding service APIs that do not mimic a pageserver API, such as registering and configuring nodes. - Add test_sharding_service. The sharding service already had some collateral coverage from its use in general tests, but this is the first dedicated testing for it.	2024-01-31 12:23:06 +00:00
Christian Schwarz	79137a089f	fix(#6366 ): pageserver: incorrect log level for Tenant not found during basebackup (#6400 ) Before this patch, when requesting basebackup for a not-found tenant or timeline, we'd emit an ERROR-level log entry with a huge stack trace. See #6366 "Details" section for an example With this patch, we log at INFO level and only a single line. Example: ``` 2024-01-19T14:16:11.479800Z INFO page_service_conn_main{peer_addr=127.0.0.1:43448}: query handler for 'basebackup d69a536d529a68fcf85bc070030cdf4b 035484e9c28d8d0138a492caadd03ffd 0/2204340 --gzip' entity not found: Tenant d69a536d529a68fcf85bc070030cdf4b not found 2024-01-19T14:19:35.807819Z INFO page_service_conn_main{peer_addr=127.0.0.1:48862}: query handler for 'basebackup d69a536d529a68fcf85bc070030cdf4a 035484e9c28d8d0138a492caadd03ffd 0/2204340 --gzip' entity not found: Timeline d69a536d529a68fcf85bc070030cdf4a/035484e9c28d8d0138a492caadd03ffd was not found ``` fixes https://github.com/neondatabase/neon/issues/6366 Changes ------- - Change `handle_basebackup_request` to return a `QueryError` - The new `impl From<WaitLsnError> for QueryError` is needed so the `?` at `wait_lsn()` call in `handle_basebackup_request` works again. It's duplicating `impl From<WaitLsnError> for PageStreamError`. - Remove hard-to-spot conversion of `handle_basebackup_request` return value to anyhow::Result (the place where I replaced `anyhow::Ok` with `Result::<(), QueryError>::Ok(())` - Add forgotten distinguished handling for "Tenant not found" case in `impl From<GetActiveTenantError> for QueryError` This was not at all pleasant, and I find it very hard to follow the various error conversions. It took me a while to spot the hard-to-spot `anyhow::Ok` thing above. It would have been caught by the compiler if we weren't auto-converting `anyhow::Error` into `QueryError::Other`. We should move away from that, in my opinion, instead forcing each `.context()` site to become `.context().map_err(QueryError::Other)`. But that's for a future PR.	2024-01-30 13:10:48 +00:00
Joonas Koivunen	e3cb715e8a	fix: capture initdb stderr, discard others (#6524 ) When using spawn + wait_with_output instead of std::process::Command::output or tokio::process::Command::output we must configure the redirection. Fixes: #6523 by discarding the stdout completely, we only care about stderr if any.	2024-01-30 14:07:58 +01:00
Vlad Lazar	0c7b89235c	pageserver: add range layer map search implementation (#6469 ) ## Problem There's no efficient way of querying the layer map for a range. ## Summary of changes Introduce a range query for the layer map (`LayerMap::range_search`). There's two broad steps to it: 1. Find all coverage changes for layers that intersect the queried range (see `LayerCoverage::range_overlaps`). The slightly tricky part is dealing with the start of the range. We can either be aligned with a layer or not and we need to treat these cases differently. 2. Iterate over the coverage changes and collect the result. For this we use a two pointer approach: the trailing pointer tracks the start of the current range (current location in the key space) and the forward pointer tracks the next coverage change. Plugging the range search into the read path is deferred to a future PR. ## Performance I adapted the layer map benchmarks on a local branch. Range searches are between 2x and 2.5x slower than point searches. That's in line with what I expected since we query thelayer map twice. Since `Timeline::get` will proxy to `Timeline::get_vectored` we can special case the one element layer map range search at that point.	2024-01-29 09:47:12 +00:00
Joonas Koivunen	1e9a50bca8	disk_usage_eviction_task: cleanup summaries (#6490 ) This is the "partial revert" of #6384. The summaries turned out to be expensive due to naive vec usage, but also inconclusive because of the additional context required. In addition to removing summary traces, small refactoring is done.	2024-01-29 10:38:40 +02:00

1 2 3 4 5 ...

1805 Commits