rust/neon

mirror of https://github.com/neondatabase/neon.git synced 2026-01-10 15:02:56 +00:00

Author	SHA1	Message	Date
Alex Chi Z.	29ee273d78	fix(storcon): correctly converts 404 for tenant passthrough requests (#12631) ## Problem Follow up of https://github.com/neondatabase/neon/pull/12620 Discussions: https://databricks.slack.com/archives/C09254R641L/p1752677940697529 Both the original code and the code after the patch above convert 404s to 503s regardless of the type of 404. We should only do that for tenant not found errors. For other 404s, like timeline not found, we should not prompt clients to retry. ## Summary of changes - Inspect the response body to figure out the type of 404. If it's a tenant not found error, return 503. - Otherwise, fall through and return 404 as-is. - Add `tenant_shard_remote_mutation` that manipulates a single shard. - Use `Service::tenant_shard_remote_mutation` for tenant shard passthrough requests. This protects us from another race in which the attach state changes within the request. (This patch mainly addresses the case where the tenant is "not yet attached".) - TODO: the lease API is still using the old code path. We should refactor it to use `tenant_remote_mutation`. --------- Signed-off-by: Alex Chi Z <chi@neon.tech>	2025-07-17 19:42:48 +00:00
HaoyuHuang	2c6b327be6	A few PS changes (#12540 ) # TLDR All changes are no-op except some metrics. ## Summary of changes I ### Pageserver Added a new global counter metric `pageserver_pagestream_handler_results_total` that categorizes pagestream request results according to their outcomes: 1. Success 2. Internal errors 3. Other errors Internal errors include: 1. Page reconstruction error: This probably indicates a pageserver bug/corruption 2. LSN timeout error: Could indicate overload or bugs with PS's ability to reach other components 3. Misrouted request error: Indicates bugs in the Storage Controller/HCC Other errors include transient errors that are expected during normal operation or errors indicating bugs with other parts of the system (e.g., malformed requests, errors due to cancelled operations during PS shutdown, etc.) ## Summary of changes II This PR adds a pageserver endpoint and its counterpart in storage controller to list visible size of all tenant shards. This will be a prerequisite of the tenant rebalance command. ## Problem III We need a way to download WAL segments/layerfiles from S3 and replay WAL records. We cannot access production S3 from our laptops directly, and we also can't transfer any user data out of production systems for GDPR compliance, so we need solutions. ## Summary of changes III This PR adds a couple of tools to support the debugging workflow in production: 1. A new `pagectl download-remote-object` command that can be used to download remote storage objects assuming the correct access is set up. ## Summary of changes IV This PR adds a command to list all visible delta and image layers from index_part. This is useful to debug compaction issues as index_part often contain a lot of covered layers due to PITR. --------- Co-authored-by: William Huang <william.huang@databricks.com> Co-authored-by: Chen Luo <chen.luo@databricks.com> Co-authored-by: Vlad Lazar <vlad@neon.tech>	2025-07-10 14:39:38 +00:00
Vlad Lazar	ffeede085e	libs: move metric collection for pageserver and safekeeper in a background task (#12525) ## Problem Safekeeper and pageserver metrics collection might time out. We've seen this in both hadron and neon. ## Summary of changes This PR moves metrics collection in PS/SK to the background so that we will always get some metrics, although they may be somewhat delayed. Reducing metrics collection time is left to future work. --------- Co-authored-by: Chen Luo <chen.luo@databricks.com>	2025-07-10 11:58:22 +00:00
Alex Chi Z.	5ec82105cc	fix(pageserver): ensure remote size gets computed (#12520 ) ## Problem Follow up of #12400 ## Summary of changes We didn't set remote_size_mb to Some when initialized so it never gets computed :( Also added a new API to force refresh the properties. Signed-off-by: Alex Chi Z <chi@neon.tech>	2025-07-09 15:35:19 +00:00
Alex Chi Z.	85164422d0	feat(pageserver): support force overriding feature flags (#12233 ) ## Problem Part of #11813 ## Summary of changes Add a test API to make it easier to manipulate the feature flags within tests. --------- Signed-off-by: Alex Chi Z <chi@neon.tech>	2025-06-23 17:31:53 +00:00
Vlad Lazar	51639cd6af	pageserver: allow for deletion of importing timelines (#12033) ## Problem Importing timelines can't currently be deleted. This is problematic because: 1. Cplane cannot delete failed imports and we leave the timeline behind. 2. The flow does not support user driven cancellation of the import ## Summary of changes On the pageserver: I've taken the path of least resistance, extended `TimelineOrOffloaded` with a new variant and added handling in the right places. I'm open to thoughts here, but I think it turned out better than I was envisioning. On the storage controller: Again, fairly simple business: when a DELETE timeline request is received, we remove the import from the DB and stop any finalization tasks/futures. In order to stop finalizations, we track them in-memory. For each finalizing import, we associate a gate and a cancellation token. Note that we delete the entry from the database before cancelling any finalizations. This is so that a concurrent request can't progress the import into finalize state and race with the deletion. This concern about deleting an import with on-going finalization is theoretical in the near future. We are only going to delete importing timelines after the storage controller reports the failure to cplane. Still, the design works for user driven cancellation too. Closes https://github.com/neondatabase/neon/issues/11897	2025-05-29 11:13:52 +00:00
Alex Chi Z.	131b32ef48	fix(pageserver): clean up aux files before detaching (#11299 ) ## Problem Related to https://github.com/neondatabase/cloud/issues/26091 and https://github.com/neondatabase/cloud/issues/25840 Close https://github.com/neondatabase/neon/issues/11297 Discussion on Slack: https://neondb.slack.com/archives/C033RQ5SPDH/p1742320666313969 ## Summary of changes * When detaching, scan all aux files within `sparse_non_inherited_keyspace` in the ancestor timeline and create an image layer exactly at the ancestor LSN. All scanned keys will map to an empty value, which is a delete tombstone. - Note that end_lsn for rewritten delta layers = ancestor_lsn + 1, so the image layer will have image_end_lsn=end_lsn. With the current `select_layer` logic, the read path will always first read the image layer. * Add a test case. --------- Signed-off-by: Alex Chi Z <chi@neon.tech> Co-authored-by: Christian Schwarz <christian@neon.tech>	2025-04-03 15:55:22 +00:00
Alex Chi Z.	dd1299f337	feat(storcon): passthrough mark invisible and add tests (#11401 ) ## Problem close https://github.com/neondatabase/neon/issues/11279 ## Summary of changes * Allow passthrough of other methods in tenant timeline shard0 passthrough of storcon. * Passthrough mark invisible API in storcon. --------- Signed-off-by: Alex Chi Z <chi@neon.tech>	2025-04-02 17:11:49 +00:00
Alexander Bayandin	30a7dd630c	ruff: enable TC — flake8-type-checking (#11368 ) ## Problem `TYPE_CHECKING` is used inconsistently across Python tests. ## Summary of changes - Update `ruff`: 0.7.0 -> 0.11.2 - Enable TC (flake8-type-checking): https://docs.astral.sh/ruff/rules/#flake8-type-checking-tc - (auto)fix all new issues	2025-03-30 18:58:33 +00:00
Alex Chi Z.	23b713900e	feat(storcon): passthrough ancestor detach behavior (#11199 ) ## Problem https://github.com/neondatabase/neon/issues/10310 https://github.com/neondatabase/neon/pull/11158 ## Summary of changes We need to passthrough the new detach behavior through the storcon API. Signed-off-by: Alex Chi Z <chi@neon.tech>	2025-03-13 20:21:23 +00:00
Alex Chi Z.	c3b3b507f7	feat(pageserver): support detaching behavior v2 (#11158 ) ## Problem close https://github.com/neondatabase/neon/issues/10310 ## Summary of changes This patch adds a new behavior for the detach_ancestor API: detach with multi-level ancestor and no reparenting. Though we can potentially support multi-level + do reparenting / single-level + no-reparenting in the future, as it's not required for the recovery/snapshot epic, I'd prefer keeping things simple now that we only handle the old one and the new one instead of supporting the full feature matrix. I only added a test case of successful detaching instead of testing failures. I'd like to make this into staging and add more tests in the future. --------- Signed-off-by: Alex Chi Z <chi@neon.tech>	2025-03-12 22:27:23 +00:00
Alex Chi Z.	cd438406fb	feat(pageserver): add force patch index_part API (#11119 ) ## Problem As part of the disaster recovery tool. Partly for https://github.com/neondatabase/neon/issues/9114. ## Summary of changes * Add a new pageserver API to force patch the fields in index_part and modify the timeline internal structures. --------- Signed-off-by: Alex Chi Z <chi@neon.tech>	2025-03-07 17:42:52 +00:00
Arseny Sher	2d0ea08524	Add safekeeper membership conf to control file. (#10196) ## Problem https://github.com/neondatabase/neon/issues/9965 ## Summary of changes Add the safekeeper membership configuration struct itself and store it in the control file. In passing, also add a creation timestamp to the control file (there were cases where I wanted it in the past). Remove the obsolete unused PersistedPeerInfo struct from the control file (still keep it in control_file_upgrade.rs to have it in old upgrade code). Remove the binary representation of cfile in the roundtrip test. Updating it is annoying, and we still test the actual roundtrip. Also add configuration to the timeline creation http request, currently used only in one python test. In passing, slightly change the meaning of the LSNs in the request: normally start_lsn is passed (the same as ancestor_start_lsn in the similar pageserver call), but we allow specifying a higher commit_lsn for manual intervention if needed. Also, when an LSN is given, initialize term_history with it.	2025-01-15 09:45:58 +00:00
Alex Chi Z.	3d1c3a80ae	feat(pageserver): add compact queue http endpoint (#10173 ) ## Problem We cannot get the size of the compaction queue and access the info. Part of #9114 ## Summary of changes * Add an API endpoint to get the compaction queue. * gc_compaction test case now waits until the compaction finishes. --------- Signed-off-by: Alex Chi Z <chi@neon.tech>	2024-12-18 18:09:02 +00:00
Vlad Lazar	a3e80448e8	pageserver/storcon: add patch endpoints for tenant config metrics (#10020) ## Problem Cplane and storage controller tenant config changes are not additive. Any change overrides all existing tenant configs. This would be fine if both did client side patching, but that's not the case. Once this merges, we must update cplane to use the PATCH endpoint. ## Summary of changes ### High Level Allow for patching of tenant configuration with a `PATCH /v1/tenant/config` endpoint. It takes the same data as its PUT counterpart. For example, the payload below will update `gc_period` and unset `compaction_period`. All other fields are left in their original state. ``` { "tenant_id": "1234", "gc_period": "10s", "compaction_period": null } ``` ### Low Level * PS and storcon gain `PATCH /v1/tenant/config` endpoints. PS endpoint is only used for cplane managed instances. * `storcon_cli` is updated to have separate commands for `set-tenant-config` and `patch-tenant-config` Related https://github.com/neondatabase/cloud/issues/21043	2024-12-11 19:16:33 +00:00
Erik Grinaker	ec4072f845	pageserver: add `wait_until_flushed` parameter for timeline checkpoint (#10013) ## Problem I'm writing an ingest benchmark in #9812. To time S3 uploads, I need to schedule a flush of the Pageserver's in-memory layer, but don't actually want to wait around for it to complete (which will take a minute). ## Summary of changes Add a parameter `wait_until_flushed` (default `true`) for `timeline/checkpoint` to control whether to wait for the flush to complete.	2024-12-06 10:12:39 +00:00
Christian Schwarz	450be26bbb	fast imports: initial Importer and Storage changes (#9218) Co-authored-by: Heikki Linnakangas <heikki@neon.tech> Co-authored-by: Stas Kelvich <stas@neon.tech> # Context This PR contains PoC-level changes for a product feature that allows onboarding large databases into Neon without going through the regular data path. # Changes This internal RFC provides all the context * https://github.com/neondatabase/cloud/pull/19799 In the language of the RFC, this PR covers * the Importer code (`fast_import`) * all the Pageserver changes (mgmt API changes, flow implementation, etc) * a basic test for the Pageserver changes # Reviewing As acknowledged in the RFC, the code added in this PR is not ready for general availability. Also, the architecture is not to be discussed in this PR, but in the RFC and associated Slack channel instead. Reviewers of this PR should take that into consideration. The quality bar to apply during review depends on what area of the code is being reviewed: * Importer code (`fast_import`): practically anything goes * Core flow (`flow.rs`): * Malicious input data must be expected and the existing threat models apply. * The code must be safe to execute on dedicated Pageserver instances: * This means in particular that tenants on other Pageserver instances must not be affected negatively wrt data confidentiality, integrity or availability. * Other code: the usual quality bar * Pay special attention to correct use of gate guards, timeline cancellation in all places during shutdown & migration, etc. * Consider the broader system impact; if you find potentially problematic interactions with Storage features that were not covered in the RFC, bring that up during the review. I recommend submitting three separate reviews, for the three high-level areas with different quality bars. # References (Internal-only) * refs https://github.com/neondatabase/cloud/issues/17507 * refs https://github.com/neondatabase/company_projects/issues/293 * refs https://github.com/neondatabase/company_projects/issues/309 * refs https://github.com/neondatabase/cloud/issues/20646 --------- Co-authored-by: Stas Kelvich <stas.kelvich@gmail.com> Co-authored-by: Heikki Linnakangas <heikki@neon.tech> Co-authored-by: John Spray <john@neon.tech>	2024-11-22 22:47:06 +00:00
Alex Chi Z.	c1937d073f	fix(pageserver): ensure upload happens after delete (#9844 ) ## Problem Follow up of https://github.com/neondatabase/neon/pull/9682, that patch didn't fully address the problem: what if shutdown fails due to whatever reason and then we reattach the tenant? Then we will still remove the future layer. The underlying problem is that the fix for #5878 gets voided because of the generation optimizations. Of course, we also need to ensure that delete happens after uploads, but note that we only schedule deletes when there are no ongoing upload tasks, so that's fine. ## Summary of changes * Add a test case to reproduce the behavior (by changing the original test case to attach the same generation). * If layer upload happens after the deletion, drain the deletion queue before uploading. * If blocked_deletion is enabled, directly remove it from the blocked_deletion queue. * Local fs backend fix to avoid race between deletion and preload. * test_emergency_mode does not need to wait for uploads (and it's generally not possible to wait for uploads). * ~~Optimize deletion executor to skip validation if there are no files to delete.~~ this doesn't work --------- Signed-off-by: Alex Chi Z <chi@neon.tech>	2024-11-22 18:30:53 +00:00
Alex Chi Z.	6f8b1eb5a6	test(pageserver): add detach ancestor smoke test (#9842 ) ## Problem Follow up to https://github.com/neondatabase/neon/pull/9682, hopefully we can detect some issues or assure ourselves that this is ready for production. ## Summary of changes * Add a compaction-detach-ancestor smoke test. --------- Signed-off-by: Alex Chi Z <chi@neon.tech>	2024-11-22 18:21:51 +00:00
Alexander Bayandin	8d1c44039e	Python 3.11 (#9515 ) ## Problem On Debian 12 (Bookworm), Python 3.11 is the latest available version. ## Summary of changes - Update Python to 3.11 in build-tools - Fix ruff check / format - Fix mypy - Use `StrEnum` instead of pair `str`, `Enum` - Update docs	2024-11-21 16:25:31 +00:00
Alex Chi Z.	b22a84a7bf	feat(pageserver): support key range for manual compaction trigger (#9723 ) part of https://github.com/neondatabase/neon/issues/9114, we want to be able to run partial gc-compaction in tests. In the future, we can also expand this functionality to legacy compaction, so that we can trigger compaction for a specific key range. ## Summary of changes * Support passing compaction key range through pageserver routes. * Refactor input parameters of compact related function to take the new `CompactOptions`. * Add tests for partial compaction. Note that the test may or may not trigger compaction based on GC horizon. We need to improve the test case to ensure things always get below the gc_horizon and the gc-compaction can be triggered. --------- Signed-off-by: Alex Chi Z <chi@neon.tech>	2024-11-19 19:38:41 +00:00
Tristan Partin	ecde8d7632	Improve type safety according to pyright Pyright found many issues that mypy doesn't seem to want to catch or mypy isn't configured to catch. Signed-off-by: Tristan Partin <tristan@neon.tech>	2024-11-08 14:43:15 -06:00
Christian Schwarz	06113e94e6	fix(test_regress): always use storcon virtual pageserver API to set tenant config (#9622) Problem ------- Tests that directly call the Pageserver Management API to set tenant config are flaky if the Pageserver is managed by Storcon because Storcon is the source of truth and may (theoretically) reconcile a tenant at any time. Solution -------- Switch all users of `set_tenant_config`/`patch_tenant_config_client_side` to use the `env.storage_controller.pageserver_api()` Future Work ----------- Prevent regressions from creeping in. And generally clean up tenant configuration. Maybe we can avoid the Pageserver having a default tenant config at all and put the default into Storcon instead? * => https://github.com/neondatabase/neon/issues/9621 Refs ---- fixes https://github.com/neondatabase/neon/issues/9522	2024-11-04 17:42:08 +01:00
Christian Schwarz	6f5c262684	pageserver: add testing API to scan layers for disposable keys (#9393 ) This PR adds a pageserver mgmt API to scan a layer file for disposable keys. It hooks it up to the sharding compaction test, demonstrating that we're not filtering out all disposable keys. This is extracted from PGDATA import (https://github.com/neondatabase/neon/pull/9218) where I do the filtering of layer files based on `is_key_disposable`.	2024-10-25 14:16:45 +02:00
Arpad Müller	4d9036bf1f	Support offloaded timelines during shard split (#9489 ) Before, we didn't copy over the `index-part.json` of offloaded timelines to the new shard's location, resulting in the new shard not knowing the timeline even exists. In #9444, we copy over the manifest, but we also need to do this for `index-part.json`. As the operations to do are mostly the same between offloaded and non-offloaded timelines, we can iterate over all of them in the same loop, after the introduction of a `TimelineOrOffloadedArcRef` type to generalize over the two cases. This is analogous to the deletion code added in #8907. The added test also ensures that the sharded archival config endpoint works, something that has not yet been ensured by tests. Part of #8088	2024-10-25 12:32:46 +02:00
Christian Schwarz	b782b11b33	refactor(timeline creation): represent bootstrap vs branch using enum (#9366 ) # Problem Timeline creation can either be bootstrap or branch. The distinction is made based on whether the `ancestor_` fields are present or not. In the PGDATA import code (https://github.com/neondatabase/neon/pull/9218), I add a third variant to timeline creation. # Solution The above pushed me to refactor the code in Pageserver to distinguish the different creation requests through enum variants. There is no externally observable effect from this change. On the implementation level, a notable change is that the acquisition of the `TimelineCreationGuard` happens later than before. This is necessary so that we have everything in place to construct the `CreateTimelineIdempotency`. Notably, this moves the acquisition of the creation guard _after_ the acquisition of the `gc_cs` lock in the case of branching. This might appear as if we're at risk of holding `gc_cs` longer than before this PR, but, even before this PR, we were holding `gc_cs` until after the `wait_completion()` that makes the timeline creation durable in S3 returns. I don't see any deadlock risk with reversing the lock acquisition order. As a drive-by change, I found that the `create_timeline()` function in `neon_local` is unused, so I removed it. # Refs platform context: https://github.com/neondatabase/neon/pull/9218 * product context: https://github.com/neondatabase/cloud/issues/17507 * next PR stacked atop this one: https://github.com/neondatabase/neon/pull/9501	2024-10-25 10:04:27 +00:00
Arpad Müller	ec4cc30de9	Shut down timelines during offload and add offload tests (#9289 ) Add a test for timeline offloading, and subsequent unoffloading. Also adds a manual endpoint, and issues a proper timeline shutdown during offloading which prevents a pageserver hang at shutdown. Part of #8088.	2024-10-15 09:46:51 +00:00
Tristan Partin	53147b51f9	Use valid type hints for Python 3.9 I have no idea how this made it past the linters. Signed-off-by: Tristan Partin <tristan@neon.tech>	2024-10-10 13:00:25 -05:00
Tristan Partin	5bd8e2363a	Enable all pyupgrade checks in ruff This will help to keep us from using deprecated Python features going forward. Signed-off-by: Tristan Partin <tristan@neon.tech>	2024-10-08 14:32:26 -05:00
Alex Chi Z.	700885471f	fix(test): only test num of L1 layers in compaction smoke test (#9186 ) close https://github.com/neondatabase/neon/issues/9160 For whatever reason, pg17's WAL pattern seems different from others, which triggers some flaky behavior within the compaction smoke test. ## Summary of changes * Run L0 compaction before proceeding with the read benchmark. * So that we can ensure the num of L0 layers is 0 and test the compaction behavior only with L1 layers. We have a threshold for triggering L0 compaction. In some cases, the test case did not produce enough L0 layers to do a L0 compaction, therefore leaving the layer map with 3+ L0 layers above the L1 layers. This increases the average read depth for the timeline. --------- Signed-off-by: Alex Chi Z <chi@neon.tech>	2024-10-02 17:42:35 +01:00
Nikita Kalyanov	f446e08fb8	change HTTP method to comply with spec (#9100) There is a discrepancy with the spec, which uses PUT.	2024-09-23 15:53:06 +02:00
Arpad Müller	2dd53e7ae0	Timeline archival test (#8824 ) This PR: * Implements the rule that archived timelines require all of their children to be archived as well, as specified in the RFC. There is no fancy locking mechanism though, so the precondition can still be broken. As a TODO for later, we still allow unarchiving timelines with archived parents. * Adds an `is_archived` flag to `TimelineInfo` * Adds timeline_archival_config to `PageserverHttpClient` * Adds a new `test_timeline_archive` test, loosely based on `test_timeline_delete` Part of #8088	2024-08-26 17:30:19 +02:00
Vlad Lazar	f5cef7bf7f	storcon: skip draining shard if its secondary is lagging too much (#8644) ## Problem Migrations of tenant shards with cold secondaries are holding up drains during production deployments. ## Summary of changes If a secondary location is lagging by more than 256MiB (configurable, but that's the default), then skip cutting it over to the secondary as part of the node drain.	2024-08-09 15:45:07 +01:00
Joonas Koivunen	a81fab4826	refactor(timeline_detach_ancestor): replace ordered reparented with a hashset (#8629 ) Earlier I was thinking we'd need a (ancestor_lsn, timeline_id) ordered list of reparented. Turns out we did not need it at all. Replace it with an unordered hashset. Additionally refactor the reparented direct children query out, it will later be used from more places. Split off from #8430. Cc: #6994	2024-08-07 18:19:00 +02:00
John Spray	3727c6fbbe	pageserver: use layer visibility when composing heatmap (#8616) ## Problem Sometimes, a layer is Covered but hasn't yet been evicted from local disk (e.g. shortly after image layer generation). It is not a good use of resources to download these to a secondary location, as there's a good chance they will never be read. This follows the previous change that added layer visibility: - #8511 Part of epic: - https://github.com/neondatabase/neon/issues/8398 ## Summary of changes - When generating heatmaps, only include Visible layers - Update test_secondary_downloads to filter to visible layers when listing layers from an attached location	2024-08-06 17:15:40 +01:00
Joonas Koivunen	138f008bab	feat: persistent gc blocking (#8600 ) Currently, we do not have facilities to persistently block GC on a tenant for whatever reason. We could do a tenant configuration update, but that is risky for generation numbers and would also be transient. Introduce a `gc_block` facility in the tenant, which manages per timeline blocking reasons. Additionally, add HTTP endpoints for enabling/disabling manual gc blocking for a specific timeline. For debugging, individual tenant status now includes a similar string representation logged when GC is skipped. Cc: #6994	2024-08-06 10:09:56 +01:00
Arpad Müller	8c828c586e	Wait for completion of the upload queue in flush_frozen_layer (#8550 ) Makes `flush_frozen_layer` add a barrier to the upload queue and makes it wait for that barrier to be reached until it lets the flushing be completed. This gives us backpressure and ensures that writes can't build up in an unbounded fashion. Fixes #7317	2024-08-02 13:07:12 +02:00
John Spray	80c8ceacbc	tests: make `test_scrubber_physical_gc_ancestors` more stable (#8453 ) ## Problem This test sometimes found that ancestors were getting cleaned up before it had done any compaction. Compaction was happening implicitly via Workload. Example: https://neon-github-public-dev.s3.amazonaws.com/reports/pr-8298/10032173390/index.html#testresult/fb04786402f80822/retries ## Summary of changes - Set upload=False when writing data after shard split, to avoid doing a checkpoint - Add a checkpoint_period & explicit wait for uploads so that we ensure data lands in S3 without doing a checkpoint	2024-07-23 12:57:57 +01:00
John Spray	975f8ac658	tests: add test_compaction_l0_memory (#8403) This test reproduces the case of a writer creating a deep stack of L0 layers. It uses realistic layer sizes and writes several gigabytes of data, and therefore runs as a performance test, although it is validating memory footprint rather than performance per se. It acts as a regression test for two recent fixes: - https://github.com/neondatabase/neon/pull/8401 - https://github.com/neondatabase/neon/pull/8391 In future it will demonstrate the larger improvement of using a k-merge iterator for L0 compaction (#8184) This test can be extended to enforce limits on the memory consumption of other housekeeping steps, by restarting the pageserver and then running other things to do the same "how much did RSS increase" measurement.	2024-07-17 17:35:27 +00:00
Arpad Müller	66337097de	Avoid the storage controller in test_tenant_creation_fails (#8392 ) As described in #8385, the likely source for flakiness in test_tenant_creation_fails is the following sequence of events: 1. test instructs the storage controller to create the tenant 2. storage controller adds the tenant and persists it to the database. issues a creation request 3. the pageserver restarts with the failpoint disabled 4. storage controller's background reconciliation still wants to create the tenant 5. pageserver gets new request to create the tenant from background reconciliation This commit just avoids the storage controller entirely. It has its own set of issues, as the re-attach request will obviously not include the tenant, but it's still useful to test for non-existence of the tenant. The generation is also not optional any more during tenant attachment. If you omit it, the pageserver yields an error. We change the signature of `tenant_attach` to reflect that. Alternative to #8385 Fixes #8266	2024-07-16 12:19:28 +02:00
Joonas Koivunen	730db859c7	feat(timeline_detach_ancestor): success idempotency (#8354 ) Right now timeline detach ancestor reports an error (409, "no ancestor") on a new attempt after successful completion. This makes it troublesome for storage controller retries. Fix it to respond with `200 OK` as if the operation had just completed quickly. Additionally, the returned timeline identifiers in the 200 OK response are now ordered so that responses between different nodes for error comparison are done by the storage controller added in #8353. Design-wise, this PR introduces a new strategy for accessing the latest uploaded IndexPart: `RemoteTimelineClient::initialized_upload_queue(&self) -> Result<UploadQueueAccessor<'_>, NotInitialized>`. It should be a more scalable way to query the latest uploaded `IndexPart` than to add a query method for each question directly on `RemoteTimelineClient`. GC blocking will need to be introduced to make the operation fully idempotent. However, it is idempotent for the cases demonstrated by tests. Cc: #6994	2024-07-15 17:47:53 +00:00
Yuchen Liang	19accfee4e	feat(pageserver): integrate lsn lease into synthetic size (#8220 ) Part of #7497, closes #8071. (accidentally closed #8208, reopened here) ## Problem After the changes in #8084, we need synthetic size to also account for leased LSNs so that users do not get free retention by running a small ephemeral endpoint for a long time. ## Summary of changes This PR integrates LSN leases into the synthetic size calculation. We model leases as read-only branches started at the leased LSN (except it does not have a timeline id). Other changes: - Add new unit tests testing whether a lease behaves like a read-only branch. - Change `/size_debug` response to include lease point in the SVG visualization. - Fix `/lsn_lease` HTTP API to do proper parsing for POST. Signed-off-by: Yuchen Liang <yuchen@neon.tech> Co-authored-by: Joonas Koivunen <joonas@neon.tech> Co-authored-by: Christian Schwarz <christian@neon.tech>	2024-07-04 15:09:05 +00:00
John Spray	063553a51b	pageserver: remove tenant create API (#8135 ) ## Problem For some time, we have created tenants with calls to location_conf. The legacy "POST /v1/tenant" path was only used in some tests. ## Summary of changes - Remove the API - Relocate TenantCreateRequest to the controller API file (this used to be used in both pageserver and controller APIs) - Rewrite tenant_create test helper to use location_config API, as control plane and storage controller do - Update docker-compose test script to create tenants with location_config API (this small commit is also present in https://github.com/neondatabase/neon/pull/7947)	2024-06-28 09:14:19 +01:00
Alex Chi Z	9b98823d61	bottom-most-compaction: use in test_gc_feedback + fix bugs (#8103 ) Adds manual compaction trigger; add gc compaction to test_gc_feedback Part of https://github.com/neondatabase/neon/issues/8002 ``` test_gc_feedback[debug-pg15].logical_size: 50 Mb test_gc_feedback[debug-pg15].physical_size: 2269 Mb test_gc_feedback[debug-pg15].physical/logical ratio: 44.5302 test_gc_feedback[debug-pg15].max_total_num_of_deltas: 7 test_gc_feedback[debug-pg15].max_num_of_deltas_above_image: 2 test_gc_feedback[debug-pg15].logical_size_after_bottom_most_compaction: 50 Mb test_gc_feedback[debug-pg15].physical_size_after_bottom_most_compaction: 287 Mb test_gc_feedback[debug-pg15].physical/logical ratio after bottom_most_compaction: 5.6312 test_gc_feedback[debug-pg15].max_total_num_of_deltas_after_bottom_most_compaction: 4 test_gc_feedback[debug-pg15].max_num_of_deltas_above_image_after_bottom_most_compaction: 1 ``` ## Summary of changes * Add the manual compaction trigger * Use in test_gc_feedback * Add a guard to avoid running it with retain_lsns * Fix: Do `schedule_compaction_update` after compaction * Fix: Supply deltas in the correct order to reconstruct value --------- Signed-off-by: Alex Chi Z <chi@neon.tech>	2024-06-25 23:00:14 +00:00
John Spray	07f21dd6b6	pageserver: remove attach/detach apis (#8134 ) ## Problem These APIs have been deprecated for some time, but were still used from test code. Closes: https://github.com/neondatabase/neon/issues/4282 ## Summary of changes - It is still convenient to do a "tenant_attach" from a test without having to write out a location_conf body, so those test methods have been retained with implementations that call through to their location_conf equivalent.	2024-06-25 17:38:06 +01:00
Yuchen Liang	219e78f885	feat(pageserver): add an optional lease to the get_lsn_by_timestamp API (#8104 ) Part of #7497, closes #8072. ## Problem Currently the `get_lsn_by_timestamp` and branch creation pageserver APIs do not provide a pleasant client experience where the looked-up LSN might be GC-ed between the two API calls. This PR attempts to prevent common races between GC and branch creation by making use of LSN leases provided in #8084. A lease can be optionally granted to a looked-up LSN. With the lease, GC will not touch layers needed to reconstruct all pages at this LSN for the duration of the lease. Signed-off-by: Yuchen Liang <yuchen@neon.tech>	2024-06-24 20:12:24 +00:00
John Spray	59f949b4a8	pageserver: remove unused load/ignore APIs (#8122 ) ## Problem These APIs have been unused for some time. They were superseded by /location_conf: the equivalent of ignoring a tenant is now to put it in secondary mode. ## Summary of changes - Remove APIs - Remove tests & helpers that used them - Remove error variants that are no longer needed.	2024-06-21 10:02:15 +00:00
Alex Chi Z	3e63d0f9e0	test(pageserver): quantify compaction outcome (#7867 ) A simple API to collect some statistics after compaction to easily understand the result. The tool reads the layer map and analyzes it range by range instead of doing single-key operations, which is more efficient than running a benchmark to collect the result. It currently computes two key metrics: * Latest data access efficiency, which finds how many delta layers / image layers the system needs to iterate before returning any key in a key range. * (Approximate) PiTR efficiency, as in https://github.com/neondatabase/neon/issues/7770, which is simply the number of delta files in the range. The reasoning is that, assuming no image layer is created, PiTR efficiency is simply the cost of collecting records from the delta layers plus the replay time. Number of delta files (or in the future, estimated size of reads) is a simple yet efficient way of estimating how much effort the page server needs to reconstruct a page. Signed-off-by: Alex Chi Z <chi@neon.tech>	2024-06-10 10:42:13 +02:00
Yuchen Liang	630cfbe420	refactor(pageserver): designated api error type for cancelled request (#7949 ) Closes #7406. ## Problem When a `get_lsn_by_timestamp` request is cancelled, an anyhow error is exposed to handle that case, which verbosely logs the error. However, we don't benefit from having the full backtrace provided by anyhow in this case. ## Summary of changes This PR introduces a new `ApiError` type to handle errors caused by cancelled requests more robustly. - A new enum variant `ApiError::Cancelled` - Currently the cancelled request is mapped to status code 500. - Need to handle this error in proxy's `http_util` as well. - Added a failpoint test to simulate a cancelled `get_lsn_by_timestamp` request. Signed-off-by: Yuchen Liang <yuchen@neon.tech>	2024-06-06 14:00:14 +00:00
Joonas Koivunen	a8a88ba7bc	test(detach_ancestor): ensure L0 compaction in history is ok (#7813 ) detaching a timeline from its ancestor can leave the resulting timeline with more L0 layers than the compaction threshold. most of the time, the detached timeline has made progress, and next time the L0 -> L1 compaction happens near the original branch point and not near the last_record_lsn. add a test to ensure that inheriting the historical L0s does not change fullbackup. additionally: - add `wait_until_completed` to test-only timeline checkpoint and compact HTTP endpoints. with `?wait_until_completed=true` the endpoints will wait until the remote client has completed uploads. - for delta layers, describe L0-ness with the `/layer` endpoint Cc: #6994	2024-05-21 20:08:43 +03:00
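
The dedicated error variant described in commit 630cfbe420 above can be illustrated with a minimal sketch. This is not the pageserver's actual code: the enum below is a hypothetical, simplified stand-in, and the `InternalServerError` variant and the `status_code` and `wants_backtrace` helpers are names assumed for illustration. Only the `ApiError::Cancelled` variant and its mapping to status code 500 are taken from the commit message.

```rust
/// Hypothetical, simplified stand-in for the pageserver's `ApiError`.
/// Only `Cancelled` and its 500 mapping come from the commit message.
#[derive(Debug, PartialEq)]
pub enum ApiError {
    /// The request was cancelled before it completed.
    Cancelled,
    /// Catch-all for unexpected failures (illustrative variant).
    InternalServerError(String),
}

impl ApiError {
    /// HTTP status code reported to the client.
    pub fn status_code(&self) -> u16 {
        match self {
            // Per the commit message, cancellation currently maps to 500.
            ApiError::Cancelled => 500,
            ApiError::InternalServerError(_) => 500,
        }
    }

    /// Whether the error warrants verbose logging with a backtrace
    /// (assumed helper). A cancelled request is expected, so it does not.
    pub fn wants_backtrace(&self) -> bool {
        !matches!(self, ApiError::Cancelled)
    }
}
```

Giving cancellation its own variant lets a handler match on it explicitly and skip the verbose logging that a generic wrapped error would trigger, which is the motivation the commit message gives.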


105 Commits