rust/neon

mirror of https://github.com/neondatabase/neon.git synced 2026-01-15 17:32:56 +00:00

Author	SHA1	Message	Date
Joonas Koivunen	44cdb9fb58	refactor: reparented_direct_children query	2024-07-26 14:39:31 +00:00
Joonas Koivunen	cdfaf0700f	fix: bifurcate the detach+reparent step	2024-07-26 14:39:31 +00:00
Joonas Koivunen	881e1ad056	refactor: no need to collect reparentable here	2024-07-26 14:39:31 +00:00
Joonas Koivunen	bb3d70e24d	fix: properly cancel if any reparenting failed	2024-07-26 14:39:31 +00:00
Joonas Koivunen	c6c560e4c8	rewrite to include testing assertion	2024-07-26 14:39:31 +00:00
Joonas Koivunen	8dd332aed5	doc: remove unnecessary comment	2024-07-26 14:39:31 +00:00
Joonas Koivunen	5c03a17eb8	wip: some progress now we hit the todo! in "already detached" path.	2024-07-26 14:39:31 +00:00
Joonas Koivunen	402d66778e	make reparenting operations idempotent	2024-07-26 14:39:31 +00:00
Joonas Koivunen	39e2bc932f	prepare to reparent while gc blocked	2024-07-26 14:39:31 +00:00
Joonas Koivunen	5fc034fa7f	feat: block gc persistently until detach ancestor completes	2024-07-26 14:39:31 +00:00
Joonas Koivunen	f9b12def0b	add support for WaitToActivate errors	2024-07-26 14:39:31 +00:00
Joonas Koivunen	5d0071447c	partial: index_part.json support for ongoing_detach_ancestor	2024-07-26 14:39:31 +00:00
Joonas Koivunen	409e2eff9e	fix: run upload_rewritten_layer in a span. There was a weird failure observed with CI tests: https://neon-github-public-dev.s3.amazonaws.com/reports/pr-8430/10108870590/index.html#suites/a1c2be32556270764423c495fad75d47/94a4686382b96297	2024-07-26 14:39:31 +00:00
Joonas Koivunen	e6e3b9a716	doc: remove on_gc_task_start fixme	2024-07-26 08:52:55 +00:00
Joonas Koivunen	7f31a3f671	forgotten rename, maybe	2024-07-26 08:52:55 +00:00
Joonas Koivunen	9971ae3d24	rename is_detached_from_{original_,}ancestor (just the rename)	2024-07-26 08:52:55 +00:00
Joonas Koivunen	48a2a20de3	chore: derive default	2024-07-26 08:52:55 +00:00
Joonas Koivunen	29ef8f15ce	chore: unused variable	2024-07-26 08:52:55 +00:00
Joonas Koivunen	5e45dd3f86	rename SharedState::notify to continue_existing_attempt	2024-07-26 08:52:55 +00:00
Joonas Koivunen	5fced442d7	warning caused by removed body	2024-07-26 08:52:55 +00:00
Joonas Koivunen	4222610233	cleanup index part dependent	2024-07-26 08:52:55 +00:00
Joonas Koivunen	92deb0dfd7	plumbing: collect timelines index parts	2024-07-26 08:52:55 +00:00
Joonas Koivunen	46ca6f17c5	plumbing: notify shared state of existing attempt	2024-07-26 08:52:55 +00:00
Joonas Koivunen	14869abb77	complete the plumbing with non-notifying attempt_blocks_gc impl	2024-07-26 08:52:55 +00:00
Joonas Koivunen	5330fd9366	doc(fixme): shared state	2024-07-26 08:52:55 +00:00
Joonas Koivunen	6c5b3b7812	doc: more sketched api comments	2024-07-26 08:52:55 +00:00
Joonas Koivunen	849fe0f191	plumb the shared state through the api for the gc pausing is quite awkward.	2024-07-26 08:52:55 +00:00
Joonas Koivunen	f564b66f21	shared state sketch	2024-07-26 08:52:55 +00:00
Joonas Koivunen	2e58ccee78	temp: planning	2024-07-26 08:52:55 +00:00
Joonas Koivunen	0ad31bb7fb	doc: remove obsolete FIXME. This was cleared with partial metadata updates.	2024-07-26 08:52:55 +00:00
Joonas Koivunen	86f26d0918	chore: minor rename FIXME in IndexPart	2024-07-26 08:52:55 +00:00
Joonas Koivunen	4a562dff2e	doc: more	2024-07-26 08:52:55 +00:00
Joonas Koivunen	f9185b42a9	doc: minor enhancements	2024-07-26 08:52:55 +00:00
Joonas Koivunen	d4f30daa81	chore: minor indentation problem	2024-07-26 08:52:55 +00:00
John Spray	6711087ddf	remote_storage: expose last_modified in listings (#8497 ) ## Problem The scrubber would like to check the highest mtime in a tenant's objects as a safety check during purges. It recently switched to use GenericRemoteStorage, so we need to expose that in the listing methods. ## Summary of changes - In Listing.keys, return a ListingObject{} including a last_modified field, instead of a RemotePath --------- Co-authored-by: Arpad Müller <arpad-m@users.noreply.github.com>	2024-07-26 10:57:52 +03:00
Arpad Müller	8e02db1ab9	Handle NotInitialized::ShuttingDown error in shard split (#8506) There is a race condition between timeline shutdown and the split task. Timeline shutdown first shuts down the upload queue, and only then fires the cancellation token. A parallel running timeline split operation might thus encounter a cancelled upload queue before the cancellation token is fired, and print a noisy error. Fix this by mapping `anyhow::Error { NotInitialized::ShuttingDown }` to `FlushLayerError::Cancelled` instead of `FlushLayerError::Other(_)`. Fixes #8496	2024-07-26 02:16:10 +02:00
Alex Chi Z.	bea0468f1f	fix(pageserver): allow incomplete history in btm-gc-compaction (#8500 ) This pull request (should) fix the failure of test_gc_feedback. See the explanation in the newly-added test case. Part of https://github.com/neondatabase/neon/issues/8002 Allow incomplete history for the compaction algorithm. Signed-off-by: Alex Chi Z <chi@neon.tech>	2024-07-25 12:56:37 -04:00
Vlad Lazar	9c5ad21341	storcon: make heartbeats restart aware (#8222 ) ## Problem Re-attach blocks the pageserver http server from starting up. Hence, it can't reply to heartbeats until that's done. This makes the storage controller mark the node off-line (not good). We worked around this by setting the interval after which nodes are marked offline to 5 minutes. This isn't a long term solution. ## Summary of changes * Introduce a new `NodeAvailability` state: `WarmingUp`. This state models the following time interval: * From receiving the re-attach request until the pageserver replies to the first heartbeat post re-attach * The heartbeat delta generator becomes aware of this state and uses a separate longer interval * Flag `max-warming-up-interval` now models the longer timeout and `max-offline-interval` the shorter one to match the names of the states Closes https://github.com/neondatabase/neon/issues/7552	2024-07-25 14:09:12 +01:00
Christian Schwarz	a1256b2a67	fix: remote timeline client shutdown trips circuit breaker (#8495) Before this PR: 1. The circuit breaker would trip on CompactionError::Shutdown. That's wrong, we want to ignore those cases. 2. Remote timeline client shutdown would not be mapped to CompactionError::Shutdown in all circumstances. We observed this in staging, see https://neondb.slack.com/archives/C033RQ5SPDH/p1721829745384449 This PR fixes (1) with a simple `match` statement, and (2) by switching a bunch of `anyhow` usage over to distinguished errors that ultimately get mapped to `CompactionError::Shutdown`. I removed the implicit `#[from]` conversion from `anyhow::Error` to `CompactionError::Other` to discover all the places that were mapping remote timeline client shutdown to `anyhow::Error`. In my opinion `#[from]` is an antipattern and we should avoid it, especially for `anyhow::Error`. If some callee is going to return anyhow, the very least the caller should do is to acknowledge, through a `map_err(MyError::Other)`, that they're conflating different failure reasons.	2024-07-25 09:44:31 +01:00
Christian Schwarz	d57412aaab	followup(#8359): pre-initialize circuitbreaker metrics (#8491)	2024-07-25 10:24:28 +02:00
John Spray	5f4e14d27d	pageserver: fix a compilation error (#8487 ) ## Problem PR that modified compaction raced with PR that modified the GcInfo structure ## Summary of changes Fix it Co-authored-by: Vlad Lazar <vlalazar.vlad@gmail.com>	2024-07-24 16:37:15 +01:00
Vlad Lazar	2723a8156a	pageserver: faster and simpler inmem layer vec read (#8469) ## Problem The in-memory layer vectored read was very slow in some conditions (the walingest::test_large_rel test). Upon profiling, I realised that 80% of the time was spent building up the binary heap of reads. This stage isn't actually needed. ## Summary of changes Remove the planning stage as we never took advantage of it in order to merge reads. There should be no functional change from this patch.	2024-07-24 14:23:03 +01:00
John Spray	2ef8e57f86	pageserver: maintain gc_info incrementally (#8427 ) ## Problem Previously, Timeline::gc_info was only updated in a batch operation at the start of GC. That means that timelines didn't generally have accurate information about who their children were before the first GC, or between GC cycles. Knowledge of child branches is important for calculating layer visibility in #8398 ## Summary of changes - Split out part of refresh_gc_info into initialize_gc_info, which is now called early in startup - Include TimelineId in retain_lsns so that we can later add/remove the LSNs for particular children - When timelines are added/removed, update their parent's retain_lsns	2024-07-24 12:33:44 +02:00
John Spray	842c3d8c10	tests: simplify code around unstable `test_basebackup_with_high_slru_count` (#8477 ) ## Problem In `test_basebackup_with_high_slru_count`, the pageserver is sometimes mysteriously hanging on startup, having been started+stopped earlier in the test setup while populating template tenant data. - #7586 We can't see why this is hanging in this particular test. The test does some weird stuff though, like attaching a load of broken tenants and then doing a SIGQUIT kill of a pageserver. ## Summary of changes - Attach tenants normally instead of doing a failpoint dance to attach them as broken - Shut the pageserver down gracefully during init instead of using immediate mode - Remove the "sequential" variant of the unstable test, as this is going away soon anyway - Log before trying to acquire lock file, so that if it hangs we have a clearer sense of if that's really where it's hanging. It seems like it is, but that code does a non-blocking flock so it's surprising.	2024-07-24 11:26:24 +01:00
John Spray	f5db655447	pageserver: simplify LayerAccessStats (#8431 ) ## Problem LayerAccessStats contains a lot of detail that we don't use: short histories of most recent accesses, specifics on what kind of task accessed a layer, etc. This is all stored inside a Mutex, which is locked every time something accesses a layer. ## Summary of changes - Store timestamps at a very low resolution (to the nearest second), sufficient for use on the timescales of eviction. - Pack access time and last residence change time into a single u64 - Use the high bits of the u64 for other flags, including the new layer visibility concept. - Simplify the external-facing model for access stats to just include what we now track. Note that the `HistoryBufferWithDropCounter` is removed here because it is no longer used. I do not dislike this type, we just happen not to use it for anything else at present. Co-authored-by: Christian Schwarz <christian@neon.tech>	2024-07-24 08:17:28 +01:00
Alex Chi Z.	18cf5cfefd	feat(pageserver): support retain_lsn in bottommost gc-compaction (#8328 ) part of https://github.com/neondatabase/neon/issues/8002 The main thing in this pull request is the new `generate_key_retention` function. It decides which deltas to retain and generate images for a given key based on its history + retain_lsn + horizon. On that, we generate a flat single level of delta layers over all deltas included in the compaction. In the future, we can decide whether to split them over the LSN axis as described in the RFC. --------- Signed-off-by: Alex Chi Z <chi@neon.tech> Co-authored-by: Christian Schwarz <christian@neon.tech>	2024-07-24 00:28:43 +01:00
Konstantin Knizhnik	563d73d923	Use smgrexists() instead of access() to enforce uniqueness of generated relfilenumber (#7992) ## Problem Postgres is using `access()` function in `GetNewRelFileNumber` to check if assigned relfilenumber is not used for any other relation. This check will not work in Neon, because we do not have all files in local storage. ## Summary of changes Use smgrexists() instead which will check at page server if such relfilenode is used. --------- Co-authored-by: Konstantin Knizhnik <knizhnik@neon.tech>	2024-07-23 18:41:55 +03:00
John Spray	1a4c1eba92	pageserver: add LayerVisibilityHint (#8432 ) ## Problem As described in https://github.com/neondatabase/neon/issues/8398, layer visibility is a new hint that will help us manage disk space more efficiently. ## Summary of changes - Introduce LayerVisibilityHint and store it as part of access stats - Automatically mark a layer visible if it is accessed, or when it is created. The impact on the access stats size will be reversed in https://github.com/neondatabase/neon/pull/8431 This is functionally a no-op change: subsequent PRs will add the logic that sets layers to Covered, and which uses the layer visibility as an input to eviction and heatmap generation. --------- Co-authored-by: Joonas Koivunen <joonas@neon.tech>	2024-07-23 15:37:12 +01:00
John Spray	80c8ceacbc	tests: make `test_scrubber_physical_gc_ancestors` more stable (#8453 ) ## Problem This test sometimes found that ancestors were getting cleaned up before it had done any compaction. Compaction was happening implicitly via Workload. Example: https://neon-github-public-dev.s3.amazonaws.com/reports/pr-8298/10032173390/index.html#testresult/fb04786402f80822/retries ## Summary of changes - Set upload=False when writing data after shard split, to avoid doing a checkpoint - Add a checkpoint_period & explicit wait for uploads so that we ensure data lands in S3 without doing a checkpoint	2024-07-23 12:57:57 +01:00
Vlad Lazar	35854928d9	pageserver: use identity file as node id authority and remove init command and config-override flags (#7766 ) Ansible will soon write the node id to `identity.toml` in the work dir for new pageservers. On the pageserver side, we read the node id from the identity file if it is present and use that as the source of truth. If the identity file is missing, cannot be read, or does not deserialise, start-up is aborted. This PR also removes the `--init` mode and the `--config-override` flag from the `pageserver` binary. The neon_local is already not using these flags anymore. Ansible still uses them until the linked change is merged & deployed, so, this PR has to land simultaneously or after the Ansible change due to that. Related Ansible change: https://github.com/neondatabase/aws/pull/1322 Cplane change to remove config-override usages: https://github.com/neondatabase/cloud/pull/13417 Closes: https://github.com/neondatabase/neon/issues/7736 Overall plan: https://www.notion.so/neondatabase/Rollout-Plan-simplified-pageserver-initialization-f935ae02b225444e8a41130b7d34e4ea?pvs=4 Co-authored-by: Christian Schwarz <christian@neon.tech>	2024-07-23 11:41:12 +01:00
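
The `LayerAccessStats` change listed above (#8431) describes storing second-resolution timestamps, packing the access time and the last residence change time into a single u64, and using the high bits for flags such as layer visibility. A minimal sketch of that idea follows. The type name `PackedAccessStats`, the 30-bit field widths, the flag position, and all method names are assumptions made for illustration; they are not taken from the pageserver code.

```rust
use std::sync::atomic::{AtomicU64, Ordering};

// Hypothetical layout, chosen for illustration only:
//   bits  0..=29  last access time, whole seconds since some epoch
//   bits 30..=59  last residence change time, whole seconds
//   bit  63       "covered" flag (set when the layer is not visible)
const TS_BITS: u32 = 30;
const TS_MASK: u64 = (1 << TS_BITS) - 1;
const COVERED_FLAG: u64 = 1 << 63;

/// Access stats held in one atomic word, so readers and writers need no Mutex.
pub struct PackedAccessStats(AtomicU64);

impl PackedAccessStats {
    /// A new layer starts out visible, with both timestamps set to `now_secs`.
    /// Timestamps are truncated to `TS_BITS` bits.
    pub fn new(now_secs: u64) -> Self {
        let ts = now_secs & TS_MASK;
        Self(AtomicU64::new(ts | (ts << TS_BITS)))
    }

    /// Record an access: overwrite the access time and clear the covered
    /// flag, leaving the residence change time untouched.
    pub fn record_access(&self, now_secs: u64) {
        let ts = now_secs & TS_MASK;
        self.0
            .fetch_update(Ordering::Relaxed, Ordering::Relaxed, |old| {
                Some((old & !TS_MASK & !COVERED_FLAG) | ts)
            })
            .expect("closure always returns Some");
    }

    /// Mark the layer as covered (not visible).
    pub fn set_covered(&self) {
        self.0.fetch_or(COVERED_FLAG, Ordering::Relaxed);
    }

    pub fn is_covered(&self) -> bool {
        (self.0.load(Ordering::Relaxed) & COVERED_FLAG) != 0
    }

    pub fn last_access_secs(&self) -> u64 {
        self.0.load(Ordering::Relaxed) & TS_MASK
    }

    pub fn last_residence_change_secs(&self) -> u64 {
        (self.0.load(Ordering::Relaxed) >> TS_BITS) & TS_MASK
    }
}
```

In this sketch every update is a single atomic read-modify-write on one word, which is the property that makes a per-access Mutex unnecessary. The trade-off is that the timestamps wrap after 2^30 seconds, so a real implementation would need to pick its epoch and field widths to suit the timescales it cares about.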

2334 Commits