rust/neon

mirror of https://github.com/neondatabase/neon.git synced 2026-01-08 05:52:55 +00:00

Author	SHA1	Message	Date
Christian Schwarz	2b2a547671	fix(tests): periodic and immediate gc is effectively a no-op in tests (#12431 ) The introduction of the default lease deadline feature 9 months ago made it so that after PS restart, `.timeline_gc()` calls in Python tests are no-ops for 10 minutes after pageserver startup: the `gc_iteration()` bails with `Skipping GC because lsn lease deadline is not reached`. I did some impact analysis in the following PR. About 30 Python tests are affected: - https://github.com/neondatabase/neon/pull/12411 Rust tests that don't explicitly enable periodic GC or invoke GC manually are unaffected because we disable periodic GC by default in the `TenantHarness`'s tenant config. Two tests explicitly did `start_paused=true` + `tokio::time::advance()`, but it would add cognitive and code bloat to each existing and future test case that uses TenantHarness if we took that route. So, this PR sets the default lease deadline feature in both Python and Rust tests to zero by default. Tests that test the feature were thus identified by failing the test: - Python test `test_readonly_node_gc` + `test_lsn_lease_size` - Rust test `test_lsn_lease`. To accomplish the above, I changed the code that computes the initial lease deadline to respect the pageserver.toml's default tenant config, which it didn't before (and I would consider a bug). The Python test harness and the Rust TenantHarness test harness then simply set the default tenant config field to zero. Drive-by: - `test_lsn_lease_size` was writing a lot of data unnecessarily; reduce the amount and speed up the test refs - PR that introduced default lease deadline: https://github.com/neondatabase/neon/pull/9055/files - fixes https://databricks.atlassian.net/browse/LKB-92 --------- Co-authored-by: Christian Schwarz <Christian Schwarz>	2025-07-08 12:56:22 +00:00
Alex Chi Z.	bfc767d60d	fix(test): wait for shard split complete for test_lsn_lease_storcon (#11436 ) ## Problem close https://github.com/neondatabase/neon/issues/11397 ref https://github.com/neondatabase/cloud/issues/23667 ## Summary of changes We need to wait until the shard split is complete, otherwise it will print warning like waiting for shard split exclusive lock for 30s. Signed-off-by: Alex Chi Z <chi@neon.tech>	2025-04-03 17:49:45 +00:00
Alex Chi Z.	dd1299f337	feat(storcon): passthrough mark invisible and add tests (#11401 ) ## Problem close https://github.com/neondatabase/neon/issues/11279 ## Summary of changes * Allow passthrough of other methods in tenant timeline shard0 passthrough of storcon. * Passthrough mark invisible API in storcon. --------- Signed-off-by: Alex Chi Z <chi@neon.tech>	2025-04-02 17:11:49 +00:00
Alex Chi Z.	47d47000df	fix(pageserver): passthrough lsn lease in storcon API (#11386 ) ## Problem part of https://github.com/neondatabase/cloud/issues/23667 ## Summary of changes lsn_lease API can only be used on pageservers. This patch enables storcon passthrough. Signed-off-by: Alex Chi Z <chi@neon.tech>	2025-03-31 19:16:42 +00:00
Alexander Bayandin	30a7dd630c	ruff: enable TC — flake8-type-checking (#11368 ) ## Problem `TYPE_CHECKING` is used inconsistently across Python tests. ## Summary of changes - Update `ruff`: 0.7.0 -> 0.11.2 - Enable TC (flake8-type-checking): https://docs.astral.sh/ruff/rules/#flake8-type-checking-tc - (auto)fix all new issues	2025-03-30 18:58:33 +00:00
Vlad Lazar	02a83913ec	storcon: do not update observed state on node activation (#11155 ) ## Problem When a node becomes active, we query its locations and update the observed state in-place. This can race with the observed state updates done when processing reconcile results. ## Summary of changes The argument for this reconciliation step is that it reduces the need for background reconciliations. I don't think this is actually true anymore. There are two cases. 1. Restart of node after drain. Usually the node does not go through the offline state here, so observed locations were not marked as none. In any case, there should be a handful of shards max on the node since we've just drained it. 2. Node comes back online after failure or network partition. When the node is marked offline, we reschedule everything away from it. When it later becomes active, the previous observed location is extraneous and requires a reconciliation anyway. Closes https://github.com/neondatabase/neon/issues/11148	2025-03-12 15:31:28 +00:00
Erik Grinaker	5330122049	test_runner: improve `wait_until` (#9936 ) Improves `wait_until` by: * Use `timeout` instead of `iterations`. This allows changing the timeout/interval parameters independently. * Make `timeout` and `interval` optional (default 20s and 0.5s). Most callers don't care. * Only output status every 1s by default, and add optional `status_interval` parameter. * Remove `show_intermediate_error`, this was always emitted anyway. Most callers have been updated to use the defaults, except where they had good reason otherwise.	2024-12-02 10:26:15 +00:00
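The polling behaviour described in the `wait_until` commit above can be sketched as follows. This is a minimal illustration, not the helper from the repository: only the parameter names `timeout`, `interval` and `status_interval` and their defaults (20s, 0.5s, 1s) come from the commit message; the signature, the exception handling, and the use of `print` for status output are assumptions.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


def wait_until(
    func: Callable[[], T],
    timeout: float = 20.0,         # total time budget in seconds (default per the commit message)
    interval: float = 0.5,         # pause between attempts (default per the commit message)
    status_interval: float = 1.0,  # minimum spacing between status lines
) -> T:
    """Call func until it stops raising, or raise TimeoutError once timeout elapses.

    Hypothetical sketch: the real helper's signature and logging may differ.
    """
    deadline = time.monotonic() + timeout
    last_status = float("-inf")
    while True:
        try:
            return func()
        except Exception as exc:
            now = time.monotonic()
            if now >= deadline:
                raise TimeoutError(f"timed out after {timeout}s") from exc
            # Rate-limit the status output instead of logging every failed attempt.
            if now - last_status >= status_interval:
                print(f"still waiting: {exc!r}")
                last_status = now
            time.sleep(interval)
```

Because the timeout is expressed in seconds rather than as an iteration count, a caller can shorten `interval` without also shrinking the overall time budget.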
Alexander Bayandin	e9dcfa2eb2	test_runner: skip more tests using decorator instead of pytest.skip (#9704 ) ## Problem Running `pytest.skip(...)` in a test body instead of marking the test with `@pytest.mark.skipif(...)` causes all fixtures to be initialised, which is not necessary if the test is going to be skipped anyway. Also, according to comments, some tests are unnecessarily skipped (e.g. `test_layer_bloating` on Postgres 17, or `test_idle_reconnections` entirely) or run (e.g. `test_parse_project_git_version_output_positive` on more than one configuration). ## Summary of changes - Move `skip_on_postgres` / `xfail_on_postgres` / `run_only_on_default_postgres` decorators to `fixture.utils` - Add new `skip_in_debug_build` and `skip_on_ci` decorators - Replace `pytest.skip(...)` calls with decorators where possible	2024-11-11 18:07:01 +00:00
Tristan Partin	ecde8d7632	Improve type safety according to pyright Pyright found many issues that mypy doesn't seem to want to catch or mypy isn't configured to catch. Signed-off-by: Tristan Partin <tristan@neon.tech>	2024-11-08 14:43:15 -06:00
Vlad Lazar	f4f7ea247c	tests: make size comparisons more lenient (#9388 ) The empirically determined threshold doesn't hold for PG 17. Bump the limit to stabilise ci.	2024-10-14 16:50:12 +01:00
Tristan Partin	5bd8e2363a	Enable all pyupgrade checks in ruff This will help to keep us from using deprecated Python features going forward. Signed-off-by: Tristan Partin <tristan@neon.tech>	2024-10-08 14:32:26 -05:00
Heikki Linnakangas	19db9e9aad	tests: Replace direct calls to neon_cli with wrappers in NeonEnv (#9195 ) Add wrappers for a few commands that didn't have them before. Move the logic to generate tenant and timeline IDs from NeonCli to the callers, so that NeonCli is more purely just a type-safe wrapper around 'neon_local'.	2024-10-03 22:03:22 +03:00
Heikki Linnakangas	4dc9cb7cf9	tests: Remove some spurious list_timelines calls These calls seem really out of place. We know what the initial tenant and branch are in these tests, just like in all other tests.	2024-09-30 17:56:37 +03:00
Joonas Koivunen	b83d722369	test: fix more flaky due to graceful shutdown (#8787 ) Going through the list of recent flaky tests, trying to fix those related to graceful shutdown. - test_forward_compatibility: flush and wait for uploads to avoid graceful shutdown - test_layer_bloating: in the end the endpoint and vanilla are still up => immediate shutdown - test_lagging_sk: pageserver shutdown is not related to the test => immediate shutdown - test_lsn_lease_size: pageserver flushing is not needed => immediate shutdown Additionally: - remove `wait_for_upload` usage from workload fixture Cc: #8708 Fixes: #8710	2024-08-21 17:22:47 +01:00
Yuchen Liang	d9c1068cf4	tests: increase approx size equal threshold to avoid `test_lsn_lease_size` flakiness (#8282 ) ## Summary of changes Increase the `assert_size_approx_equal` threshold to avoid flakiness of `test_lsn_lease_size`. Still needs more investigation to fully resolve #8293. - Also set `autovacuum=off` for the endpoint we are running in the test. Signed-off-by: Yuchen Liang <yuchen@neon.tech>	2024-07-08 21:50:13 +01:00
Yuchen Liang	19accfee4e	feat(pageserver): integrate lsn lease into synthetic size (#8220 ) Part of #7497, closes #8071. (accidentally closed #8208, reopened here) ## Problem After the changes in #8084, we need synthetic size to also account for leased LSNs so that users do not get free retention by running a small ephemeral endpoint for a long time. ## Summary of changes This PR integrates LSN leases into the synthetic size calculation. We model leases as read-only branches started at the leased LSN (except it does not have a timeline id). Other changes: - Add new unit tests testing whether a lease behaves like a read-only branch. - Change `/size_debug` response to include lease point in the SVG visualization. - Fix `/lsn_lease` HTTP API to do proper parsing for POST. Signed-off-by: Yuchen Liang <yuchen@neon.tech> Co-authored-by: Joonas Koivunen <joonas@neon.tech> Co-authored-by: Christian Schwarz <christian@neon.tech>	2024-07-04 15:09:05 +00:00
John Spray	15728be0e1	pageserver: always detach before deleting (#8082 ) In #7957 we enabled deletion without attachment, but retained the old-style deletion (return 202, delete in background) for attached tenants. In this PR, we remove the old-style deletion path, such that if the tenant delete API is invoked while a tenant is attached, it is simply detached before completing the deletion. This intentionally doesn't rip out all the old deletion code: in case a deletion was in progress at time of upgrade, we keep around the code for finishing it for one release cycle. The rest of the code removal happens in https://github.com/neondatabase/neon/pull/8091 Now that deletion will always be via the new path, the new path is also updated to use some retries around remote storage operations, to avoid tripping up the control plane with 500s if S3 has an intermittent issue.	2024-06-21 15:39:19 +01:00
John Spray	eb0ca9b648	pageserver: improved synthetic size & find_gc_cutoff error handling (#8051 ) ## Problem This PR refactors some error handling to avoid log spam on tenant/timeline shutdown. - "ignoring failure to find gc cutoffs: timeline shutting down." logs (https://github.com/neondatabase/neon/issues/8012) - "synthetic_size_worker: failed to calculate synthetic size for tenant ...: Failed to refresh gc_info before gathering inputs: tenant shutting down", for example here: https://neon-github-public-dev.s3.amazonaws.com/reports/pr-8049/9502988669/index.html#suites/3fc871d9ee8127d8501d607e03205abb/1a074a66548bbcea Closes: https://github.com/neondatabase/neon/issues/8012 ## Summary of changes - Refactor: Add a PageReconstructError variant to GcError: this is the only kind of error that find_gc_cutoffs can emit. - Functional change: only ignore shutdown PageReconstructError variant: for other variants, treat it as a real error - Refactor: add a structured CalculateSyntheticSizeError type and use it instead of anyhow::Error in synthetic size calculations - Functional change: while iterating through timelines gathering logical sizes, only drop out if the whole tenant is cancelled: individual timeline cancellations indicate deletion in progress and we can just ignore those.	2024-06-14 11:08:11 +01:00
Christian Schwarz	6d951e69d6	test_suite: patch, don't replace, the `tenant_config` field, where appropriate (#7771 ) Before this PR, the changed tests would overwrite the entire `tenant_config` because `pageserver_config_override` is merged non-recursively into the `ps_cfg`. This meant they would override the `PAGESERVER_DEFAULT_TENANT_CONFIG_COMPACTION_ALGORITHM`, impacting our matrix build for `compaction_algorithm=Tiered\|Legacy` in https://github.com/neondatabase/neon/pull/7748. I found the tests fixed in this PR using the `NEON_PAGESERVER_PANIC_ON_UNSPECIFIED_COMPACTION_ALGORITHM` env var that I added in #7748. Therefore, I think this is an exhaustive fix. This is better than just searching the code base for `tenant_config`, which is what I had sketched in #7747. refs #7749	2024-05-17 12:24:02 +02:00
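The difference between replacing and patching `tenant_config`, as described in the commit above, can be illustrated with plain dictionaries. This is a sketch under assumptions: the helper names `shallow_merge` and `patch_tenant_config` and the example keys and values are hypothetical, chosen only to show why a non-recursive merge drops the default compaction algorithm.

```python
def shallow_merge(base: dict, override: dict) -> dict:
    """Non-recursive merge: a top-level key in override replaces the whole value."""
    merged = dict(base)
    merged.update(override)
    return merged


def patch_tenant_config(base: dict, tenant_overrides: dict) -> dict:
    """Merge overrides into the nested tenant_config, keeping its other fields."""
    merged = dict(base)
    merged["tenant_config"] = {**base.get("tenant_config", {}), **tenant_overrides}
    return merged


# Hypothetical pageserver config carrying a default set by the test matrix.
ps_cfg = {"tenant_config": {"compaction_algorithm": "Tiered"}}

replaced = shallow_merge(ps_cfg, {"tenant_config": {"gc_period": "0s"}})
patched = patch_tenant_config(ps_cfg, {"gc_period": "0s"})

print(replaced["tenant_config"])  # compaction_algorithm is lost
print(patched["tenant_config"])   # compaction_algorithm is preserved
```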
Joonas Koivunen	d9dcbffac3	python: allow using allowed_errors.py (#7719 ) See #7718. Fix it by renaming all `types.py` to `common_types.py`. Additionally, add an advert for using `allowed_errors.py` to test any added regex.	2024-05-13 15:16:23 +03:00
Joonas Koivunen	a96e15cb6b	test: less flaky test_synthetic_size_while_deleting (#7622 ) #7585 introduced test case for deletions while synthetic size is being calculated. The test has a race against deletion, but we only accept one outcome. Fix it to accept 404 as well, as we cannot control from outside which outcome happens. Evidence: https://neon-github-public-dev.s3.amazonaws.com/reports/pr-7456/8970595458/index.html#/testresult/32a5b2f8c4094bdb	2024-05-06 15:52:51 +00:00
Joonas Koivunen	ed9a114bde	fix: find gc cutoff points without holding Tenant::gc_cs (#7585 ) The current implementation of finding timeline gc cutoff Lsn(s) is done while holding `Tenant::gc_cs`. In recent incidents long create branch times were caused by holding the `Tenant::gc_cs` over extremely long `Timeline::find_lsn_by_timestamp`. The fix is to find the GC cutoff values before taking the `Tenant::gc_cs` lock. This change is safe to do because the GC cutoff values and the branch points have no dependencies on each other. In the case of `Timeline::find_gc_cutoff` taking a long time with this change, we should no longer see `Tenant::gc_cs` interfering with branch creation. Additionally, the `Tenant::refresh_gc_info` is now tolerant of timeline deletions (or any other failures to find the pitr_cutoff). This helps with the synthetic size calculation being constantly completed instead of having a break for a timely timeline deletion. Fixes: #7560 Fixes: #7587	2024-05-03 14:57:26 +03:00
John Spray	ee9ec26808	pageserver: change pitr_interval=0 behavior (#7423 ) ## Problem We already made a change in #6407 to make pitr_interval authoritative for synthetic size calculations (do not charge users for data retained due to gc_horizon), but that change didn't cover the case where someone entirely disables time-based retention by setting pitr_interval=0. Relates to: https://github.com/neondatabase/neon/issues/6374 ## Summary of changes When pitr_interval is zero, do not set `pitr_cutoff` based on gc_horizon. gc_horizon is still enforced, but separately (its value is passed separately, there was never a need to clamp pitr_cutoff to gc_horizon) ## More detail ### Issue 1 Before this PR, we would skip the update_gc_info for timelines with last_record_lsn() < gc_horizon. Let's call such timelines "tiny". The rationale for that presumably was that we can't GC anything in the tiny timelines, so why bother to call update_gc_info(). However, synthetic size calculation relies on up-to-date update_gc_info() data. Before this PR, tiny timelines would never get an updated GcInfo::pitr_horizon (it remained Lsn(0)), even on projects with pitr_interval=0d. With this PR, update_gc_info is always called, hence GcInfo::pitr_horizon is always updated, thereby providing synthetic size calculation with up-to-date data. ### Issue 2 Before this PR, regardless of whether the timeline is "tiny" or not, GcInfo::pitr_horizon was clamped to at least last_record_lsn - gc_horizon, even if the pitr window in terms of LSN range was shorter (=less than) the gc_horizon. With this PR, that clamping is removed, so, for pitr_interval=0, the pitr_horizon = last_record_lsn.	2024-04-23 17:16:17 +01:00
Vlad Lazar	c5f64fe54f	tests: reinstate some synthetic size tests (#7294 ) ## Problem `test_empty_tenant_size` was marked `xfail` and a few other tests were skipped. ## Summary of changes Stabilise `test_empty_tenant_size`. This test attempted to disable checkpointing for the postgres instance and expected that the synthetic size remains stable for an empty tenant. When debugging I noticed that postgres was issuing a checkpoint after the transaction in the test (perhaps something changed since the test was introduced). Hence, I relaxed the size check to allow for the checkpoint key written on the pageserver. Also removed the checks for synthetic size inputs since the expected values differ between postgres versions. Closes https://github.com/neondatabase/neon/issues/7138	2024-04-04 09:45:14 +00:00
John Spray	bc1efa827f	pageserver: exclude gc_horizon from synthetic size calculation (#6407 ) ## Problem See: - https://github.com/neondatabase/neon/issues/6374 ## Summary of changes Whereas previously we calculated synthetic size from the gc_horizon or the pitr_interval (whichever is the lower LSN), now we ignore gc_horizon and exclusively start from the `pitr_interval`. This is a more generous calculation for billing, where we do not charge users for data retained due to gc_horizon.	2024-03-15 16:07:36 +00:00
John Spray	6a1903987a	tests: use approximate equality in test_get_tenant_size_with_multiple_branches (#5411 ) ## Problem This test has been flaky for a long time. As far as I can tell, the test was simply wrong to expect postgres activity to result in deterministic sizes: making the match fuzzy is not a hack, it's just matching the reality that postgres doesn't promise to write exactly the same number of pages every time it runs a given query. ## Summary of changes Equalities now tolerate up to 4 pages different. This is big enough to tolerate the deltas we've seen in practice. Closes: https://github.com/neondatabase/neon/issues/2962	2023-09-29 09:15:43 +01:00
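The fuzzy comparison described in the commit above ("tolerate up to 4 pages different") might look like the sketch below. The function name `sizes_approx_equal` is hypothetical, and the 8192-byte page size is an assumption based on the default Postgres block size; neither is taken from the repository.

```python
PAGE_SIZE = 8192  # assumed: default Postgres block size in bytes


def sizes_approx_equal(size_a: int, size_b: int, max_pages: int = 4) -> bool:
    """Return True if two byte sizes differ by at most max_pages pages."""
    return abs(size_a - size_b) <= max_pages * PAGE_SIZE


# Two runs of the same workload that differ by three pages still compare equal.
print(sizes_approx_equal(100 * PAGE_SIZE, 103 * PAGE_SIZE))
```

A test would then assert `sizes_approx_equal(a, b)` where it previously asserted `a == b`.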
Joonas Koivunen	a55a78a453	Misc test flakiness fixes (#5233 ) Assorted flakiness fixes from #5198, might not be flaky on `main`. Migrate some tests using neon_simple_env to just neon_env_builder and using initial_tenant to make the flakiness easier to understand. (Did not understand the flakiness of `test_timeline_create_break_after_uninit_mark`.) `test_download_remote_layers_api` is flaky because we have no atomic "wait for WAL, checkpoint, wait for upload and do not receive any more WAL". `test_tenant_size` fixes are just boilerplate which should have always existed; we should wait for the tenant to be active. Similarly for `test_timeline_delete`. `test_timeline_size_post_checkpoint` fails often for me with reading zero from metrics. Give it a few attempts.	2023-09-11 11:42:49 +03:00
Sasha Krassovsky	c215389f1c	quote_ident identifiers when creating neon_superuser (#4562 ) ## Problem	2023-06-24 10:34:15 +03:00
Sasha Krassovsky	b1477b4448	Create neon_superuser role, grant it to roles created from control plane (#4425 ) ## Problem Currently, if a user creates a role, it won't by default have any grants applied to it. If the compute restarts, the grants get applied. This gives a very strange UX of being able to drop roles/not have any access to anything at first, and then once something triggers a config application, suddenly grants are applied. This removes these grants.	2023-06-24 01:38:27 +03:00
Dmitry Rodionov	472cc17b7a	propagate lock guard to background deletion task (#4495 ) ## Problem 1. During the rollout we got a panic: "timeline that we were deleting was concurrently removed from 'timelines' map" that was caused by the lock guard not being propagated to the background part of the deletion. The existing test didn't catch it because the failpoint that was used for verification was placed earlier, prior to background task spawning. 2. When looking at the surrounding code, one more bug was detected. We removed the timeline from the map before deletion finished, which breaks client retry logic, because it will indicate 404 before actual deletion is completed, which can lead to the client stopping its retry poll earlier. ## Summary of changes 1. Carry the lock guard over to background deletion. Ensure the existing test case fails without the applied patch (second deletion becomes stuck without it, which eventually leads to a test failure). 2. Move the delete_all call earlier so that removing the timeline from the map is the last thing done during deletion. Additionally I've added timeline_id to the `update_gc_info` span, because `debug_assert_current_span_has_tenant_and_timeline_id` in `download_remote_layer` was firing when `update_gc_info` led to on-demand downloads via `find_lsn_for_timestamp` (caught by @problame). This is not directly related to the PR but fixes possible flakiness. Another smaller set of changes involves the deletion wrapper used in python tests. Now there is a simpler wrapper that waits for deletions to complete, `timeline_delete_wait_completed`. Most of the test_delete_timeline.py tests are negative tests, i.e., "does ps_http.timeline_delete() fail in this and that scenario". These can be left alone. In other places where we actually do the deletions, we need to use the helper that polls for completion. 
Discussion https://neondb.slack.com/archives/C03F5SM1N02/p1686668007396639 resolves #4496 --------- Co-authored-by: Christian Schwarz <christian@neon.tech>	2023-06-15 17:30:12 +03:00
Heikki Linnakangas	df3bae2ce3	Use `compute_ctl` to manage Postgres in tests. (#3886 ) This adds test coverage for 'compute_ctl', as it is now used by all the python tests. There are a few differences in how 'compute_ctl' is called in the tests, compared to the real web console: - In the tests, the postgresql.conf file is included as one large string in the spec file, and it is written out as it is to the data directory. I added a new field for that to the spec file. The real web console, however, sets all the necessary settings in the 'settings' field, and 'compute_ctl' creates the postgresql.conf from those settings. - In the tests, the information needed to connect to the storage, i.e. tenant_id, timeline_id, connection strings to pageserver and safekeepers, are now passed as new fields in the spec file. The real web console includes them as the GUCs in the 'settings' field. (Both of these are different from what the test control plane used to do: It used to write the GUCs directly in the postgresql.conf file). The plan is to change the control plane to use the new method, and remove the old method, but for now, support both. Some tests that were sensitive to the amount of WAL generated needed small changes, to accommodate that compute_ctl runs the background health monitor which makes a few small updates. Also some tests shut down the pageserver, and now that the background health check can run some queries while the pageserver is down, that can produce a few extra errors in the logs, which needed to be allowlisted. Other changes: - remove obsolete comments about PostgresNode; - create standby.signal file for Static compute node; - log output of `compute_ctl` and `postgres` is merged into `endpoints/compute.log`. --------- Co-authored-by: Anastasia Lubennikova <anastasia@neon.tech>	2023-06-06 14:59:36 +01:00
Alexander Bayandin	bb06d281ea	Run regressions tests on both Postgres 14 and 15 (#4192 ) This PR adds tests runs on Postgres 15 and created unified Allure report with results for all tests. - Split `.github/actions/allure-report` into `.github/actions/allure-report-store` and `.github/actions/allure-report-generate` - Add debug or release pytest parameter for all tests (depending on `BUILD_TYPE` env variable) - Add Postgres version as a pytest parameter for all tests (depending on `DEFAULT_PG_VERSION` env variable) - Fix `test_wal_restore` and `restore_from_wal.sh` to support path with `[`/`]` in it (fixed by applying spellcheck to the script and fixing all warnings), `restore_from_wal_archive.sh` is deleted as unused. - All known failures on Postgres 15 marked with xfail	2023-05-12 15:28:51 +01:00
Heikki Linnakangas	89b5589b1b	Tenant size should never be zero. Simplify test. Looking at the git history of this test, I think "size == 0" used to have a special meaning earlier, but now it should never happen.	2023-04-13 16:57:31 +03:00
Heikki Linnakangas	53f438a8a8	Rename "Postgres nodes" in control_plane to endpoints. We use the term "endpoint" for compute Postgres nodes in the web UI and user-facing documentation now. Adjust the nomenclature in the code. This changes the name of the "neon_local pg" command to "neon_local endpoint". Also adjust names of classes, variables etc. in the python tests accordingly. This also changes the directory structure so that endpoints are now stored in: .neon/endpoints/<endpoint id> instead of: .neon/pgdatadirs/tenants/<tenant_id>/<endpoint (node) name> The tenant ID is no longer part of the path. That means that you cannot have two endpoints with the same name/ID in two different tenants anymore. That's consistent with how we treat endpoints in the real control plane and proxy: the endpoint ID must be globally unique.	2023-04-13 14:34:29 +03:00
Dmitry Rodionov	bfeb428d1b	tests: make neon_fixtures a bit thinner by splitting out some pageserver related helpers (#3977 ) neon_fixture is quite big and messy, lets clean it up a bit.	2023-04-07 13:47:28 +03:00
MMeent	20a4d817ce	Update vendored PostgreSQL versions to 14.7 and 15.2 (#3581 ) ## Describe your changes Rebase vendored PostgreSQL onto 14.7 and 15.2 ## Issue ticket number and link #3579 ## Checklist before requesting a review - [x] I have performed a self-review of my code. - [x] If it is a core feature, I have added thorough tests. - [ ] Do we need to implement analytics? if so did you add the relevant metrics to the dashboard? - [x] If this PR requires public announcement, mark it with /release-notes label and add several sentences in this section. ``` The version of PostgreSQL that we use is updated to 14.7 for PostgreSQL 14 and 15.2 for PostgreSQL 15. ```	2023-02-23 16:10:22 +02:00
Heikki Linnakangas	ddbdcdddd7	Tenant size calculation: refactor, rewrite, and add SVG (#2817 ) Refactor the tenant_size_model code. Segment now contains just the minimum amount of information needed to calculate the size. Other information that is useful for building up the segment tree, and for display purposes, is now kept elsewhere. The code in 'main.rs' has a new ScenarioBuilder struct for that. Calculating which Segments are "needed" is now the responsibility of the caller of tenant_size_mode, not part of the calculation itself. So it's up to the caller to make all the decisions with retention periods for each branch. The output of the sizing calculation is now a Vec of SizeResults, rather than a tree. It uses a tree representation internally, when doing the calculation, but it's not exposed to the caller anymore. Refactor the way the recursive calculation is performed. Rewrite the code in size.rs that builds the Segment model. Get rid of the intermediate representation with Update structs. Build the Segments directly, with some local HashMaps and Vecs to track branch points to help with that. retention_period is now an input to gather_inputs(), rather than an output. Update pageserver http API: rename /size endpoint to /synthetic_size with following parameters: - /synthetic_size?inputs_only to get debug info; - /synthetic_size?retention_period=0 to override cutoff that is used to calculate the size; pass header -H "Accept: text/html" to get HTML output, otherwise JSON is returned Update python tests and openapi spec. --------- Co-authored-by: Anastasia Lubennikova <anastasia@neon.tech> Co-authored-by: Joonas Koivunen <joonas@neon.tech>	2023-02-16 10:53:46 +02:00
Shany Pozin	c9821f13e0	Expose the tenant calculated synthetic size as a Prometheus metric (#3541 ) ## Describe your changes Expose the currently calculated synthetic size as a Prometheus metric ## Issue ticket number and link #3509 ## Checklist before requesting a review - [X] I have performed a self-review of my code. - [ ] If it is a core feature, I have added thorough tests. - [ ] Do we need to implement analytics? if so did you add the relevant metrics to the dashboard? - [ ] If this PR requires public announcement, mark it with /release-notes label and add several sentences in this section.	2023-02-06 09:25:15 +02:00
Anastasia Lubennikova	36f048d6b0	Fix tenant size orphans (#3377 ) Previously, only the timelines that had passed the `gc_horizon` were processed, which failed with orphans at the tree_sort phase. Example input is in the added `test_branched_empty_timeline_size` test case. The PR changes iteration to happen through all timelines, and in addition to that, any learned branch points will be calculated as they would have been in the original implementation if the ancestor branch had been over the `gc_horizon`. This also changes how tenants where all timelines are below `gc_horizon` are handled. Previously tenant_size 0 was returned, but now they will have approximately `initdb_lsn` worth of tenant_size. The PR also adds several new tenant size tests that describe various corner cases of branching structure and `gc_horizon` setting. They are currently disabled to not consume time during CI. Co-authored-by: Joonas Koivunen <joonas@neon.tech> Co-authored-by: Anastasia Lubennikova <anastasia@neon.tech>	2023-01-20 20:21:36 +02:00
Heikki Linnakangas	9a6c0be823	storage_sync2 The code in this change was extracted from PR #2595, i.e., Heikki’s draft PR for on-demand download. High-Level Changes - storage_sync module rewrite - Changes to Tenant Loading - Changes to Timeline States - Crash-safe & Resumable Tenant Attach There are several follow-up work items planned. Refer to the Epic issue on GitHub: https://github.com/neondatabase/neon/issues/2029 Metadata: closes https://github.com/neondatabase/neon/pull/2785 unsquashed history of this patch: archive/pr-2785-storage-sync2/pre-squash Co-authored-by: Dmitry Rodionov <dmitry@neon.tech> Co-authored-by: Christian Schwarz <christian@neon.tech> =============================================================================== storage_sync module rewrite =========================== The storage_sync code is rewritten. New module name is storage_sync2, mostly to make a more reasonable git diff. The updated block comment in storage_sync2.rs describes the changes quite well, so, we will not reproduce that comment here. TL;DR: - Global sync queue and RemoteIndex are replaced with per-timeline `RemoteTimelineClient` structure that contains a queue for UploadOperations to ensure proper ordering and necessary metadata. - Before deleting local layer files, wait for ongoing UploadOps to finish (wait_completion()). - Download operations are not queued and executed immediately. Changes to Tenant Loading ========================= Initial sync part was rewritten as well and represents the other major change that serves as a foundation for on-demand downloads. Routines for attaching and loading shifted directly to Tenant struct and now are asynchronous and spawned into the background. Since this patch doesn’t introduce on-demand download of layers we fully synchronize with the remote during pageserver startup. See details in `Timeline::reconcile_with_remote` and `Timeline::download_missing`. 
Changes to Tenant States ======================== The “Active” state has lost its “background_jobs_running: bool” member. That variable indicated whether the GC & Compaction background loops are spawned or not. With this patch, they are now always spawned. Unit tests (#[test]) use the TenantConf::{gc_period,compaction_period} to disable their effect (`15db566`). This patch introduces a new tenant state, “Attaching”. A tenant that is being attached starts in this state and transitions to “Active” once it finishes download. The `GET /tenant` endpoint returns `TenantInfo::has_in_progress_downloads`. We derive the value for that field from the tenant state now, to remain backwards-compatible with cloud.git. We will remove that field when we switch to on-demand downloads. Changes to Timeline States ========================== The TimelineInfo::awaits_download field is now equivalent to the tenant being in Attaching state. Previously, download progress was tracked per timeline. With this change, it’s only tracked per tenant. When on-demand downloads arrive, the field will be completely obsolete. Deprecation is tracked in issue #2930. Crash-safe & Resumable Tenant Attach ==================================== Previously, the attach operation was not persistent. I.e., when tenant attach was interrupted by a crash, the pageserver would not continue attaching after pageserver restart. In fact, the half-finished tenant directory on disk would simply be skipped by tenant_mgr because it lacked the metadata file (it’s written last). This patch introduces an “attaching” marker file that is present inside the tenant directory while the tenant is attaching. During pageserver startup, tenant_mgr will resume attach if that file is present. If not, it assumes that the local tenant state is consistent and tries to load the tenant. If that fails, the tenant transitions into Broken state.	2022-11-29 18:55:20 +01:00
Heikki Linnakangas	15db566420	Allow setting gc/compaction_period to 0, to disable automatic GC/compaction Many python tests were setting the GC/compaction period to large values, to effectively disable GC / compaction. Reserve value 0 to mean "explicitly disabled". We also set them to 0 in unit tests now, although currently, unit tests don't launch the background jobs at all, so it won't have any effect. Fixes https://github.com/neondatabase/neon/issues/2917	2022-11-25 20:14:06 +02:00
Heikki Linnakangas	86e483f87b	Fix tenant size modeling code to include WAL at end of branch Imagine that you have a tenant with a single branch like this: ---------------==========> ^ gc horizon where: ---- is the portion of the branch that is older than retention period ==== is the portion of the branch that is newer than retention period. Before this commit, the sizing model included the logical size at the GC horizon, but not the WAL after that. In particular, that meant that on a newly created tenant with just one timeline, where the retention period covered the whole history of the timeline, i.e. gc_cutoff was 0, the calculated tenant size was always zero. We now include the WAL after the GC horizon in the size. So in the above example, the calculated tenant size would be the logical size of the database at the GC horizon, plus all the WAL after it (marked with ===). This adds a new `insert_point` function to the sizing model, alongside `modify_branch`, and changes the code in size.rs to use the new function. The new function takes an absolute lsn and logical size as argument, so we no longer need to calculate the difference to the previous point. Also, the end-size is now optional, because we now need to add a point to represent the end of each branch to the model, but we don't want to or need to calculate the logical size at that point.	2022-11-22 17:11:27 +02:00
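For the single-branch example in the commit above, the arithmetic before and after the fix can be sketched as below. This is a simplified illustration under stated assumptions, not the `libs/tenant_size_model` API: the `Branch` struct, its field names, and both functions are hypothetical, and the amount of WAL after the GC horizon is approximated as the difference between two LSNs treated as byte positions.

```rust
/// Hypothetical, simplified view of a tenant with a single branch.
struct Branch {
    /// Logical size of the database at the GC horizon, in bytes.
    logical_size_at_horizon: u64,
    /// LSN of the GC horizon (gc_cutoff), as a byte position in the WAL.
    gc_cutoff_lsn: u64,
    /// LSN at the end of the branch.
    end_lsn: u64,
}

/// Size as calculated before the fix: only the logical size at the horizon.
fn size_before_fix(b: &Branch) -> u64 {
    b.logical_size_at_horizon
}

/// Size as calculated after the fix: the logical size at the horizon plus
/// the WAL written after it (the `====` portion in the diagram).
fn size_after_fix(b: &Branch) -> u64 {
    let wal_after_horizon = b.end_lsn.saturating_sub(b.gc_cutoff_lsn);
    b.logical_size_at_horizon + wal_after_horizon
}
```

With `gc_cutoff_lsn` and `logical_size_at_horizon` both 0, as on a newly created tenant whose retention period covers its whole history, `size_before_fix` returns 0 while `size_after_fix` returns the amount of WAL written so far.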
Heikki Linnakangas	dcb79ef08f	Silence yet another test failure from race condition between GC and delete. Another similar case to commit `9ae4da4f31`.	2022-11-18 10:18:15 +02:00
Heikki Linnakangas	24d3ed0952	Ignore another ERROR that's expected in test. Got a test failure in CI because of this.	2022-11-17 12:42:56 +02:00
Joonas Koivunen	548d472b12	fix: logical size query at before initdb_lsn (#2755) With more realistic selection of gc_horizon in tests, there is an immediate failure when trying to query logical size with lsn < initdb_lsn. Fixes that, and adds an illustration that came out of explaining this tenant size calculation to more people. Cc: #2748, #2599.	2022-11-07 12:03:57 +02:00
Joonas Koivunen	cf68963b18	Add initial tenant sizing model and a http route to query it (#2714) Tenant size information is gathered by using existing parts of `Tenant::gc_iteration` which are now separated as `Tenant::refresh_gc_info`. `Tenant::refresh_gc_info` collects branch points, and invokes `Timeline::update_gc_info`; nothing was supposed to be changed there. The gathered branch points (through Timeline's `GcInfo::retain_lsns`), `GcInfo::horizon_cutoff`, and `GcInfo::pitr_cutoff` are used to build up a Vec of updates fed into the `libs/tenant_size_model` to calculate the history size. The gathered information is now exposed using `GET /v1/tenant/{tenant_id}/size`, which will respond with the actual calculated size. Initially the idea was to have this delivered as tenant background task and exported via metric, but it might be too computationally expensive to run it periodically as we don't yet know if the returned values are any good. Adds one new metric: - pageserver_storage_operations_seconds with label `logical_size` - separating from original `init_logical_size` Adds a pageserver wide configuration variable: - `concurrent_tenant_size_logical_size_queries` with default 1 This leaves a lot of TODOs, tracked on issue #2748.	2022-11-03 12:39:19 +00:00

46 Commits