rust/neon - neon - Gitea: Git with a cup of tea

rust/neon

mirror of https://github.com/neondatabase/neon.git synced 2026-01-10 15:02:56 +00:00

Author	SHA1	Message	Date
Dmitry Rodionov	6d023484ed	Use mark file to allow for deletion operations to continue through restarts (#4552 ) ## Problem Currently we delete local files first, so if pageserver restarts after local files deletion then remote deletion is not continued. This can be solved with inversion of these steps. But even if these steps are inverted when index_part.json is deleted there is no way to distinguish between "this timeline is good, we just didnt upload it to remote" and "this timeline is deleted we should continue with removal of local state". So to solve it we use another mark file. After index part is deleted presence of this mark file indentifies that it was a deletion intention. Alternative approach that was discussed was to delete all except metadata first, and then delete metadata and index part. In this case we still do not support local only configs making them rather unsafe (deletion in them is already unsafe, but this direction solidifies this direction instead of fixing it). Another downside is that if we crash after local metadata gets removed we may leave dangling index part on the remote which in theory shouldnt be a big deal because the file is small. It is not a big change to choose another approach at this point. ## Summary of changes Timeline deletion sequence: 1. Set deleted_at in remote index part. 2. Create local mark file. 3. Delete local files except metadata (it is simpler this way, to be able to reuse timeline initialization code that expects metadata) 4. Delete remote layers 5. Delete index part 6. Delete meta, timeline directory. 7. Delete mark file. This works for local only configuration without remote storage. Sequence is resumable from any point. resolves #4453 resolves https://github.com/neondatabase/neon/pull/4552 (the issue was created with async cancellation in mind, but we can still have issues with retries if metadata is deleted among the first by remove_dir_all (which doesnt have any ordering guarantees)) --------- Co-authored-by: Joonas Koivunen <joonas@neon.tech> Co-authored-by: Christian Schwarz <christian@neon.tech>	2023-07-25 16:25:27 +03:00
Joonas Koivunen	77a68326c5	Thin out TenantState metric, keep set of broken tenants (#4796 ) We currently have a timeseries for each of the tenants in different states. We only want this for Broken. Other states could be counters. Fix this by making the `pageserver_tenant_states_count` a counter without a `tenant_id` and add a `pageserver_broken_tenants_count` which has a `tenant_id` label, each broken tenant being 1.	2023-07-25 11:15:54 +03:00
Joonas Koivunen	294b8a8fde	Convert per timeline metrics to global (#4769 ) Cut down the per-(tenant, timeline) histograms by making them global: - `pageserver_getpage_get_reconstruct_data_seconds` - `pageserver_read_num_fs_layers` - `pageserver_remote_operation_seconds` - `pageserver_remote_timeline_client_calls_started` - `pageserver_wait_lsn_seconds` - `pageserver_io_operations_seconds` --------- Co-authored-by: Shany Pozin <shany@neon.tech>	2023-07-25 00:43:27 +03:00
Arseny Sher	921bb86909	Use safekeeper tenant only port in all tests and actually test it. Compute now uses special safekeeper WAL service port allowing auth tokens with only tenant scope. Adds understanding of this port to neon_local and fixtures, as well as test of both ports behaviour with different tokens. ref https://github.com/neondatabase/neon/issues/4730	2023-07-19 06:03:51 +04:00
Joonas Koivunen	762a8a7bb5	python: more linting (#4734 ) Ruff has "B" class of lints, including B018 which will nag on useless expressions, related to #4719. Enable such lints and fix the existing issues. Most notably: - https://beta.ruff.rs/docs/rules/mutable-argument-default/ - https://beta.ruff.rs/docs/rules/assert-false/ --------- Co-authored-by: Alexander Bayandin <alexander@neon.tech>	2023-07-18 12:56:40 +03:00
Alexander Bayandin	33c2d94ba6	Fix git-env version for PRs (#4641 ) ## Problem Binaries created from PRs (both in docker images and for tests) have wrong git-env versions, they point to phantom merge commits. ## Summary of changes - Prefer GIT_VERSION env variable even if git information was accessible - Use `${{ github.event.pull_request.head.sha \|\| github.sha }}` instead of `${{ github.sha }}` for `GIT_VERSION` in workflows So the builds will still happen from this phantom commit, but we will report the PR commit. --------- Co-authored-by: Joonas Koivunen <joonas@neon.tech>	2023-07-10 20:01:01 +01:00
Christian Schwarz	505aa242ac	page cache: add size metrics (#4629 ) Make them a member of `struct PageCache` to prepare for a future where there's no global state.	2023-07-05 15:36:42 +03:00
Christian Schwarz	3f9defbfb4	page cache: add access & hit rate metrics (#4628 ) Co-authored-by: Dmitry Rodionov <dmitry@neon.tech>	2023-07-05 10:38:32 +02:00
Alexander Bayandin	78a7f68902	Make pg_version and build_type regular parameters (#4311 ) ## Problem All tests have already been parametrised by Postgres version and build type (to have them distinguishable in the Allure report), but despite it, it's anyway required to have DEFAULT_PG_VERSION and BUILD_TYPE env vars set to corresponding values, for example to run`test_timeline_deletion_with_files_stuck_in_upload_queue[release-pg14-local_fs]` test it's required to set `DEFAULT_PG_VERSION=14` and `BUILD_TYPE=release`. This PR makes the test framework pick up parameters from the test name itself. ## Summary of changes - Postgres version and build type related fixtures now are function-scoped (instead of being sessions scoped before) - Deprecate `--pg-version` argument in favour of DEFAULT_PG_VERSION env variable (it's easier to parse) - GitHub autocomment now includes only one command with all the failed tests + runs them in parallel	2023-07-03 13:51:40 +01:00
Christian Schwarz	24eaa3b7ca	timeline creation: reflect failures due to ancestor LSN issues in status code (#4600 ) Before, it was a `500` and control plane would retry, wereas it actually should have stopped retrying. (Stacked on top of https://github.com/neondatabase/neon/pull/4597 ) fixes https://github.com/neondatabase/neon/issues/4595 part of https://github.com/neondatabase/cloud/issues/5626 --------- Co-authored-by: Shany Pozin <shany@neon.tech>	2023-07-03 15:21:10 +03:00
Christian Schwarz	76718472be	add pageserver-global histogram for basebackup latency (#4559 ) The histogram distinguishes by ok/err. I took the liberty to create a small abstraction for such use cases. It helps keep the label values inside `metrics.rs`, right next to the place where the metric and its labels are declared.	2023-06-23 16:42:12 +02:00
bojanserafimov	90e1f629e8	Add test for `skip_pg_catalog_updates` (#4530 )	2023-06-20 11:38:59 -04:00
Alexander Bayandin	1b947fc8af	test_runner: workaround rerunfailures and timeout incompatibility (#4469 ) ## Problem `pytest-timeout` and `pytest-rerunfailures` are incompatible (or rather not fully compatible). Timeouts aren't set for reruns. Ref https://github.com/pytest-dev/pytest-rerunfailures/issues/99 ## Summary of changes - Dynamically make timeouts `func_only` for tests that we're going to retry. It applies timeouts for reruns as well.	2023-06-16 18:08:11 +01:00
Dmitry Rodionov	472cc17b7a	propagate lock guard to background deletion task (#4495 ) ## Problem 1. During the rollout we got a panic: "timeline that we were deleting was concurrently removed from 'timelines' map" that was caused by lock guard not being propagated to the background part of the deletion. Existing test didnt catch it because failpoint that was used for verification was placed earlier prior to background task spawning. 2. When looking at surrounding code one more bug was detected. We removed timeline from the map before deletion is finished, which breaks client retry logic, because it will indicate 404 before actual deletion is completed which can lead to client stopping its retry poll earlier. ## Summary of changes 1. Carry the lock guard over to background deletion. Ensure existing test case fails without applied patch (second deletion becomes stuck without it, which eventually leads to a test failure). 2. Move delete_all call earlier so timeline is removed from the map is the last thing done during deletion. Additionally I've added timeline_id to the `update_gc_info` span, because `debug_assert_current_span_has_tenant_and_timeline_id` in `download_remote_layer` was firing when `update_gc_info` lead to on-demand downloads via `find_lsn_for_timestamp` (caught by @problame). This is not directly related to the PR but fixes possible flakiness. Another smaller set of changes involves deletion wrapper used in python tests. Now there is a simpler wrapper that waits for deletions to complete `timeline_delete_wait_completed`. Most of the test_delete_timeline.py tests make negative tests, i.e., "does ps_http.timeline_delete() fail in this and that scenario". These can be left alone. Other places when we actually do the deletions, we need to use the helper that polls for completion. Discussion https://neondb.slack.com/archives/C03F5SM1N02/p1686668007396639 resolves #4496 --------- Co-authored-by: Christian Schwarz <christian@neon.tech>	2023-06-15 17:30:12 +03:00
Alex Chi Z	2252c5c282	metrics: convert some metrics to pageserver-level (#4490 ) ## Problem Some metrics are better to be observed at page-server level. Otherwise, as we have a lot of tenants in production, we cannot do a sum b/c Prometheus has limit on how many time series we can aggregate. This also helps reduce metrics scraping size. ## Summary of changes Some integration tests are likely not to pass as it will check the existence of some metrics. Waiting for CI complete and fix them. Metrics downgraded: page cache hit (where we are likely to have a page-server level page cache in the future instead of per-tenant), and reconstruct time (this would better be tenant-level, as we have one pg replayer for each tenant, but now we make it page-server level as we do not need that fine-grained data). --------- Signed-off-by: Alex Chi <iskyzh@gmail.com>	2023-06-14 17:12:34 -04:00
Dmitry Rodionov	d53f9ab3eb	delete timelines from s3 (#4384 ) Delete data from s3 when timeline deletion is requested ## Summary of changes UploadQueue is altered to support scheduling of delete operations in stopped state. This looks weird, and I'm thinking whether there are better options/refactorings for upload client to make it look better. Probably can be part of https://github.com/neondatabase/neon/issues/4378 Deletion is implemented directly in existing endpoint because changes are not that significant. If we want more safety we can separate those or create feature flag for new behavior. resolves [#4193](https://github.com/neondatabase/neon/issues/4193) --------- Co-authored-by: Joonas Koivunen <joonas@neon.tech>	2023-06-08 15:01:22 +03:00
Dmitry Rodionov	1a1019990a	map TenantState::Broken to TenantAttachmentStatus::Failed (#4371 ) ## Problem Attach failures are not reported in public part of the api (in `attachment_status` field of TenantInfo). ## Summary of changes Expose TenantState::Broken as TenantAttachmentStatus::Failed In the way its written Failed status will be reported even if no attachment happened. (I e if tenant become broken on startup). This is in line with other members. I e Active will be resolved to Attached even if no actual attach took place. This can be tweaked if needed. At the current stage it would be overengineering without clear motivation resolves #4344	2023-06-07 18:25:30 +03:00
Heikki Linnakangas	df3bae2ce3	Use `compute_ctl` to manage Postgres in tests. (#3886 ) This adds test coverage for 'compute_ctl', as it is now used by all the python tests. There are a few differences in how 'compute_ctl' is called in the tests, compared to the real web console: - In the tests, the postgresql.conf file is included as one large string in the spec file, and it is written out as it is to the data directory. I added a new field for that to the spec file. The real web console, however, sets all the necessary settings in the 'settings' field, and 'compute_ctl' creates the postgresql.conf from those settings. - In the tests, the information needed to connect to the storage, i.e. tenant_id, timeline_id, connection strings to pageserver and safekeepers, are now passed as new fields in the spec file. The real web console includes them as the GUCs in the 'settings' field. (Both of these are different from what the test control plane used to do: It used to write the GUCs directly in the postgresql.conf file). The plan is to change the control plane to use the new method, and remove the old method, but for now, support both. Some tests that were sensitive to the amount of WAL generated needed small changes, to accommodate that compute_ctl runs the background health monitor which makes a few small updates. Also some tests shut down the pageserver, and now that the background health check can run some queries while the pageserver is down, that can produce a few extra errors in the logs, which needed to be allowlisted. Other changes: - remove obsolete comments about PostgresNode; - create standby.signal file for Static compute node; - log output of `compute_ctl` and `postgres` is merged into `endpoints/compute.log`. --------- Co-authored-by: Anastasia Lubennikova <anastasia@neon.tech>	2023-06-06 14:59:36 +01:00
Joonas Koivunen	0cef7e977d	refactor: just one way to shutdown a tenant (#4407 ) We have 2 ways of tenant shutdown, we should have just one. Changes are mostly mechanical simple refactorings. Added `warn!` on the "shutdown all remaining tasks" should trigger test failures in the between time of not having solved the "tenant/timeline owns all spawned tasks" issue. Cc: #4327.	2023-06-06 15:30:55 +03:00
Joonas Koivunen	18a9d47f8e	test: restore NotConnected being allowed globally (#4426 ) Flakyness introduced by #4402 evidence [^1]. I had assumed the NotConnected would had been an expected io error, but it's not. Restore the global `allowed_error`. [^1]: https://neon-github-public-dev.s3.amazonaws.com/reports/pr-4407/5185897757/index.html#suites/82004ab4e3720b47bf78f312dabe7c55/14f636d0ecd3939d/	2023-06-06 13:51:39 +03:00
Joonas Koivunen	77598f5d0a	Better walreceiver logging (#4402 ) walreceiver logs are a bit hard to understand because of partial span usage, extra messages, ignored errors popping up as huge stacktraces. Fixes #3330 (by spans, also demote info -> debug). - arrange walreceivers spans into a hiearchy: - `wal_connection_manager{tenant_id, timeline_id}` -> `connection{node_id}` -> `poller` - unifies the error reporting inside `wal_receiver`: - All ok errors are now `walreceiver connection handling ended: {e:#}` - All unknown errors are still stacktraceful task_mgr reported errors with context `walreceiver connection handling failure` - Remove `connect` special casing, was: `DB connection stream finished` for ok errors - Remove `done replicating` special casing, was `Replication stream finished` for ok errors - lowered log levels for (non-exhaustive list): - `WAL receiver manager started, connecting to broker` (at startup) - `WAL receiver shutdown requested, shutting down` (at shutdown) - `Connection manager loop ended, shutting down` (at shutdown) - `sender is dropped while join handle is still alive` (at lucky shutdown, see #2885) - `timeline entered terminal state {:?}, stopping wal connection manager loop` (at shutdown) - `connected!` (at startup) - `Walreceiver db connection closed` (at disconnects?, was without span) - `Connection cancelled` (at shutdown, was without span) - `observed timeline state change, new state is {new_state:?}` (never after Timeline::activate was made infallible) - changed: - `Timeline dropped state updates sender, stopping wal connection manager loop` - was out of date; sender is not dropped but `Broken \| Stopping` state transition - also made `debug!` - `Timeline dropped state updates sender before becoming active, stopping wal connection manager loop` - was out of date: sender is again not dropped but `Broken \| Stopping` state transition - also made `debug!` - log fixes: - stop double reporting panics via JoinError	2023-06-05 17:35:23 +03:00
Joonas Koivunen	8142edda01	test: Less flaky gc (#4416 ) Solves a flaky test error in the wild[^1] by: - Make the gc shutdown signal reading an `allowed_error` - Note the gc shutdown signal readings as being in `allowed_error`s - Allow passing tenant conf to init_start to avoid unncessary tenants [^1]: https://neon-github-public-dev.s3.amazonaws.com/reports/pr-4399/5176432780/index.html#suites/b97efae3a617afb71cb8142f5afa5224/2cd76021ea011f93	2023-06-05 15:43:52 +03:00
Alex Chi Z	82484e8241	pgserver: add more metrics for better observability (#4323 ) ## Problem This PR includes doc changes to the current metrics as well as adding new metrics. With the new set of metrics, we can quantitatively analyze the read amp., write amp. and space amp. in the system, when used together with https://github.com/neondatabase/neonbench close https://github.com/neondatabase/neon/issues/4312 ref https://github.com/neondatabase/neon/issues/4347 compaction metrics TBD, a novel idea is to print L0 file number and number of layers in the system, and we can do this in the future when we start working on compaction. ## Summary of changes * Add `READ_NUM_FS_LAYERS` for computing read amp. * Add `MATERIALIZED_PAGE_CACHE_HIT_UPON_REQUEST`. * Add `GET_RECONSTRUCT_DATA_TIME`. GET_RECONSTRUCT_DATA_TIME + RECONSTRUCT_TIME + WAIT_LSN_TIME should be approximately total time of reads. * Add `5.0` and `10.0` to `STORAGE_IO_TIME_BUCKETS` given some fsync runs slow (i.e., > 1s) in some cases. * Some `WAL_REDO` metrics are only used when Postgres is involved in the redo process. --------- Signed-off-by: Alex Chi <iskyzh@gmail.com>	2023-06-01 21:46:04 +03:00
Dmitry Rodionov	b190c3e6c3	reduce flakiness by allowing Compaction failed, retrying in X queue is in state Stopped. (#4379 ) resolves https://github.com/neondatabase/neon/issues/4374 by adding the error to allowed_errors	2023-05-30 20:11:44 +03:00
Heikki Linnakangas	a560b28829	Make new tenant/timeline IDs mandatory in create APIs. (#4304 ) We used to generate the ID, if the caller didn't specify it. That's bad practice, however, because network is never fully reliable, so it's possible we create a new tenant but the caller doesn't know about it, and because it doesn't know the tenant ID, it has no way of retrying or checking if it succeeded. To discourage that, make it mandatory. The web control plane has not relied on the auto-generation for a long time.	2023-05-26 16:19:36 +03:00
Joonas Koivunen	85e76090ea	test: fix ancestor is stopping flakyness (#4234 ) Flakyness most likely introduced in #4170, detected in https://neon-github-public-dev.s3.amazonaws.com/reports/pr-4232/4980691289/index.html#suites/542b1248464b42cc5a4560f408115965/18e623585e47af33. Opted to allow it globally because it can happen in other tests as well, basically whenever compaction is enabled and we stop pageserver gracefully.	2023-05-25 16:22:58 +00:00
Sasha Krassovsky	6052ecee07	Add connector extension to send Role/Database updates to console (#3891 ) ## Describe your changes ## Issue ticket number and link ## Checklist before requesting a review - [x] I have performed a self-review of my code. - [x] If it is a core feature, I have added thorough tests. - [ ] Do we need to implement analytics? if so did you add the relevant metrics to the dashboard? - [ ] If this PR requires public announcement, mark it with /release-notes label and add several sentences in this section.	2023-05-25 12:36:57 +03:00
Christian Schwarz	e11ba24ec5	tenant loops: operate on the Arc<Tenant> directly (#4298 ) (Instead of going through mgr every iteration.) The `wait_for_active_tenant` function's `wait` argument could be removed because it was only used for the loop that waits for the tenant to show up in the tenants map. Since we're passing the tenant in, we now longer need to get it from the tenants map. NB that there's no guarantee that the tenant object is in the tenants map at the time the background loop function starts running. But the tenant mgr guarantees that it will be quite soon. See `tenant_map_insert` way upwards in the call hierarchy for details. This is prep work to eliminate `subscribe_for_state_updates` (PR #4299 ) Fixes: #3501	2023-05-25 10:49:09 +02:00
Christian Schwarz	df52587bef	attach-time tenant config (#4255 ) This PR adds support for supplying the tenant config upon /attach. Before this change, when relocating a tenant using `/detach` and `/attach`, the tenant config after `/attach` would be the default config from `pageserver.toml`. That is undesirable for settings such as the PITR-interval: if the tenant's config on the source was `30 days` and the default config on the attach-side is `7 days`, then the first GC run would eradicate 23 days worth of PITR capability. The API change is backwards-compatible: if the body is empty, we continue to use the default config. We'll remove that capability as soon as the cloud.git code is updated to use attach-time tenant config (https://github.com/neondatabase/neon/issues/4282 keeps track of this). unblocks https://github.com/neondatabase/cloud/issues/5092 fixes https://github.com/neondatabase/neon/issues/1555 part of https://github.com/neondatabase/neon/issues/886 (Tenant Relocation) Implementation ============== The preliminary PRs for this work were (most-recent to least-recent) * https://github.com/neondatabase/neon/pull/4279 * https://github.com/neondatabase/neon/pull/4267 * https://github.com/neondatabase/neon/pull/4252 * https://github.com/neondatabase/neon/pull/4235	2023-05-24 17:46:30 +03:00
Alexander Bayandin	2a3f54002c	test_runner: update dependencies (#4328 ) ## Problem `pytest` 6 truncates error messages and this is not configured. It's fixed in `pytest` 7, it prints the whole message (truncating limit is higher) if `--verbose` is set (it's set on CI). ## Summary of changes - `pytest` and `pytest` plugins are updated to their latest versions - linters (`black` and `ruff`) are updated to their latest versions - `mypy` and types are updated to their latest versions, new warnings are fixed - while we're here, allure updated its latest version as well	2023-05-24 12:47:01 +01:00
Stas Kelvich	dad3519351	Add SQL-over-HTTP endpoint to Proxy This commit introduces an SQL-over-HTTP endpoint in the proxy, with a JSON response structure resembling that of the node-postgres driver. This method, using HTTP POST, achieves smaller amortized latencies in edge setups due to fewer round trips and an enhanced open connection reuse by the v8 engine. This update involves several intricacies: 1. SQL injection protection: We employed the extended query protocol, modifying the rust-postgres driver to send queries in one roundtrip using a text protocol rather than binary, bypassing potential issues like those identified in https://github.com/sfackler/rust-postgres/issues/1030. 2. Postgres type compatibility: As not all postgres types have binary representations (e.g., acl's in pg_class), we adjusted rust-postgres to respond with text protocol, simplifying serialization and fixing queries with text-only types in response. 3. Data type conversion: Considering JSON supports fewer data types than Postgres, we perform conversions where possible, passing all other types as strings. Key conversions include: - postgres int2, int4, float4, float8 -> json number (NaN and Inf remain text) - postgres bool, null, text -> json bool, null, string - postgres array -> json array - postgres json and jsonb -> json object 4. Alignment with node-postgres: To facilitate integration with js libraries, we've matched the response structure of node-postgres, returning command tags and column oids. Command tag capturing was added to the rust-postgres functionality as part of this change.	2023-05-23 20:01:40 +03:00
Christian Schwarz	b391c94440	tenant create / update-config: reject unknown fields (#4267 ) This PR enforces that the tenant create / update-config APIs reject requests with unknown fields. This is a desirable property because some tenant config settings control the lifetime of user data (e.g., GC horizon or PITR interval). Suppose we inadvertently rename the `pitr_interval` field in the Rust code. Then, right now, a client that still uses the old name will send a tenant config request to configure a new PITR interval. Before this PR, we would accept such a request, ignore the old name field, and use the pageserver.toml default value for what the new PITR interval is. With this PR, we will instead reject such a request. One might argue that the client could simply check whether the config it sent has been applied, using the `/v1/tenant/.../config` endpoint. That is correct for tenant create and update-config. But, attach will soon [^1] grow the ability to have attach-time config as well. If we ignore unknown fields and fall back to global defaults in that case, we risk data loss. Example: 1. Default PITR in pageservers is 7 days. 2. Create a tenant and set its PITR to 30 days. 3. For 30 days, fill the tenant continuously with data. 4. Detach the tenant. 5. Attach tenant. Attach must use the 30-day PITR setting in this scenario. If it were to fall back to the 7-day default value, we would lose 23 days of PITR capability for the tenant. So, the PR that adds attach-time tenant config will build on the (clunky) infrastructure added in this PR [^1]: https://github.com/neondatabase/neon/pull/4255 Implementation Notes ==================== This could have been a simple `#[serde(deny_unknown_fields)]` but sadly, that is documented- but silent-at-compile-time-incompatible with `#[serde(flatten)]`. But we are still using this by adding on outer struct and use unit tests to ensure it is correct. `neon_local tenant config` now uses the `.remove()` pattern + bail if there are leftover config args. That's in line with what `neon_local tenant create` does. We should dedupe that logic in a future PR. --------- Signed-off-by: Alex Chi <iskyzh@gmail.com> Co-authored-by: Alex Chi <iskyzh@gmail.com>	2023-05-18 21:16:09 -04:00
Alexander Bayandin	1b2ece3715	Re-enable compatibility tests on Postgres 15 (#4274 ) - Enable compatibility tests for Postgres 15 - Also add `PgVersion::v_prefixed` property to return the version number with, _guess what,_ v-prefix!	2023-05-18 19:56:09 +01:00
Joonas Koivunen	4a76f2b8d6	upload new timeline index part json before 201 or on retry (#4204 ) Await for upload to complete before returning 201 Created on `branch_timeline` or when `bootstrap_timeline` happens. Should either of those waits fail, then on the retried request await for uploads again. This should work as expected assuming control-plane does not start to use timeline creation as a wait_for_upload mechanism. Fixes #3865, started from https://github.com/neondatabase/neon/pull/3857/files#r1144468177 Co-authored-by: Heikki Linnakangas <heikki@neon.tech>	2023-05-15 14:16:43 +03:00
Alexander Bayandin	bb06d281ea	Run regressions tests on both Postgres 14 and 15 (#4192 ) This PR adds tests runs on Postgres 15 and created unified Allure report with results for all tests. - Split `.github/actions/allure-report` into `.github/actions/allure-report-store` and `.github/actions/allure-report-generate` - Add debug or release pytest parameter for all tests (depending on `BUILD_TYPE` env variable) - Add Postgres version as a pytest parameter for all tests (depending on `DEFAULT_PG_VERSION` env variable) - Fix `test_wal_restore` and `restore_from_wal.sh` to support path with `[`/`]` in it (fixed by applying spellcheck to the script and fixing all warnings), `restore_from_wal_archive.sh` is deleted as unused. - All known failures on Postgres 15 marked with xfail	2023-05-12 15:28:51 +01:00
Alexander Bayandin	1d490b2311	Make benchmark_fixture less noisy	2023-05-10 16:59:03 +01:00
Dmitry Rodionov	eb3a8be933	keep track of timeline deletion status in IndexPart to prevent timeline resurrection (#3919 ) Before this patch, the following sequence would lead to the resurrection of a deleted timeline: - create timeline - wait for its index part to reach s3 - delete timeline - wait an arbitrary amount of time, including 0 seconds - detach tenant - attach tenant - the timeline is there and Active again This happens because we only kept track of the deletion in the tenant dir (by deleting the timeline dir) but not in S3. The solution is to turn the deleted timeline's IndexPart into a tombstone. The deletion status of the timeline is expressed in the `deleted_at: Option<NativeDateTime>` field of IndexPart. It's `None` while the timeline is alive and `Some(deletion time stamp)` if it is deleted. We change the timeline deletion handler to upload this tombstoned IndexPart. The handler does not return success if the upload fails. Coincidentally, this fixes the long-stanging TODO about the `std::fs::remove_dir_all` being not atomic. It need not be atomic anymore because we set the `deleted_at=Some()` before starting the `remove_dir_all`. The tombstone is in the IndexPart only, not in the `metadata`. So, we only have the tombstone and the `remove_dir_all` benefits mentioned above if remote storage is configured. This was a conscious trade-off because there's no good format evolution story for the current metadata file format. The introduction of this additional step into `delete_timeline` was painful because delete_timeline needs to be 1. cancel-safe 2. idempotent 3. safe to call concurrently These are mostly self-inflicted limitations that can be avoided by using request-coalescing. PR https://github.com/neondatabase/neon/pull/4159 will do that. fixes https://github.com/neondatabase/neon/issues/3560 refs https://github.com/neondatabase/neon/issues/3889 (part of tenant relocation) Co-authored-by: Joonas Koivunen <joonas@neon.tech> Co-authored-by: Christian Schwarz <christian@neon.tech>	2023-05-10 10:27:12 +02:00
Alexander Bayandin	653e633c59	test_runner: add --pg-version pytest argument (#4037 ) - allows setting Postgres version for testing using --pg-version argument - fixes tests for the non-default Postgres version.	2023-05-05 02:57:47 +03:00
Christian Schwarz	7dd9553bbb	eviction: regression test + distinguish layer write from map insert (#4005 ) This patch adds a regression test for the threshold-based layer eviction. The test asserts the basic invariant that, if left alone, the residence statuses will stabilize, with some layers resident and some layers evicted. Thereby, we cover both the aspect of last-access-time-threshold-based eviction, and the "imitate access" hacks that we put in recently. The aggressive `period` and `threshold` values revealed a subtle bug which is also fixed in this patch. The symptom was that, without the Rust changes of this patch, there would be occasional test failures due to `WARN... unexpectedly downloading` log messages. These log messages were caused by the "imitate access" calls of the eviction task. But, the whole point of the "imitate access" hack was to prevent eviction of the layers that we access there. After some digging, I found the root cause, which is the following race condition: 1. Compact: Write out an L1 layer from several L0 layers. This records residence event `LayerCreate` with the current timestamp. 2. Eviction: imitate access logical size calculation. This accesses the L0 layers because the L1 layer is not yet in the layer map. 3. Compact: Grab layer map lock, add the new L1 to layer map and remove the L0s, release layer map lock. 4. Eviction: observes the new L1 layer whose only activity timestamp is the `LayerCreate` event. The L1 layer had no chance of being accessed until after (3). So, if enough time passes between (1) and (3), then (4) will observe a layer with `now-last_activity > threshold` and evict it The fix is to require the first `record_residence_event` to happen while we already hold the layer map lock. The API requires a ref to a `BatchedUpdates` as a witness that we are inside a layer map lock. That is not fool-proof, e.g., new call sites for `insert_historic` could just completely forget to record the residence event. It would be nice to prevent this at the type level. In the meantime, we have a rate-limited log messages to warn us, if such an implementation error sneaks in in the future. fixes https://github.com/neondatabase/neon/issues/3593 fixes https://github.com/neondatabase/neon/issues/3942 --------- Co-authored-by: Joonas Koivunen <joonas@neon.tech>	2023-05-04 16:16:48 +02:00
Arthur Petukhovsky	8543485e92	Pull clone timeline from peer safekeepers (#4089 ) Add HTTP endpoint to initialize safekeeper timeline from peer safekeepers. This is useful for initializing new safekeeper to replace failed safekeeper. Not fully "correct" in all cases, but should work in most. This code is not suitable for production workloads but can be tested on staging to get started. New endpoint is separated from usual cases and should not affect anything if no one explicitly uses a new endpoint. We can rollback this commit in case of issues.	2023-04-28 14:20:46 +00:00
Stas Kelvich	5bb971d64e	fix more python tests	2023-04-28 17:15:43 +03:00
Stas Kelvich	0364f77b9a	fix python styling	2023-04-28 17:15:43 +03:00
Stas Kelvich	9486d76b2a	Add tests for link auth to compute connection	2023-04-28 17:15:43 +03:00
Arseny Sher	b2a3981ead	Move tracking of walsenders out of Timeline. Refactors walsenders out of timeline.rs to makes it less convoluted into separate WalSenders with its own lock, but otherwise having the same structure. Tracking of in-memory remote_consistent_lsn is also moved there as it is mainly received from pageserver. State of walsender (feedback) is also restructured to be cleaner; now it is either PageserverFeedback or StandbyFeedback(StandbyReply, HotStandbyFeedback), but not both.	2023-04-28 06:22:13 +04:00
MMeent	e6ec2400fc	Enable hot standby PostgreSQL replicas. Notes: - This still needs UI support from the Console - I've not tuned any GUCs for PostgreSQL to make this work better - Safekeeper has gotten a tweak in which WAL is sent and how: It now sends zero-ed WAL data from the start of the timeline's first segment up to the first byte of the timeline to be compatible with normal PostgreSQL WAL streaming. - This includes the commits of #3714 Fixes one part of https://github.com/neondatabase/neon/issues/769 Co-authored-by: Anastasia Lubennikova <anastasia@neon.tech>	2023-04-27 15:26:44 +02:00
Christian Schwarz	6861259be7	add global metric for unexpected on-demand downloads (#4069 ) Until we have toned down the prod logs to zero WARN and ERROR, we want a dedicated metric for which we can have a dedicated alert. fixes https://github.com/neondatabase/neon/issues/3924	2023-04-26 15:18:26 +02:00
Arseny Sher	31a3910fd9	Remove wait_for_sk_commit_lsn_to_reach_remote_storage. It had a couple of inherent races: 1) Even if compute is killed before the call, some more data might still arrive to safekeepers after commit_lsn on them is polled, advancing it. Then checkpoint on pageserver might not include this tail, and so upload of expected LSN won't happen until one more checkpoint. 2) commit_lsn is updated asynchronously -- compute can commit transaction before communicating commit_lsn to even single safekeeper (sync-safekeepers can be used to forces the advancement). This makes semantics of wait_for_sk_commit_lsn_to_reach_remote_storage quite complicated. Replace it with last_flush_lsn_upload which 1) Learns last flush LSN on compute; 2) Waits for it to arrive to pageserver; 3) Checkpoints it; 4) Waits for the upload. In some tests this keeps compute alive longer than before, but this doesn't seem to be important. There is a chance this fixes https://github.com/neondatabase/neon/issues/3209	2023-04-26 13:46:33 +04:00
Christian Schwarz	fa20e37574	add gauge for in-flight layer uploads (#3951 ) For the "worst-case /storage usage panel", we need to compute ``` remote size + local-only size ``` We currently don't have a metric for local-only layers. The number of in-flight layers in the upload queue is just that, so, let Prometheus scrape it. The metric is two counters (started and finished). The delta is the amount of in-flight uploads in the queue. The metrics are incremented in the respective `call_unfinished_metric_*` functions. These track ongoing operations by file_kind and op_kind. We only need this metric for layer uploads, so, there's the new RemoteTimelineClientMetricsCallTrackSize type that forces all call sites to decide whether they want the size tracked or not. If we find that other file_kinds or op_kinds are interesting (metadata uploads, layer downloads, layer deletes) are interesting, we can just enable them, and they'll be just another label combination within the metrics that this PR adds. fixes https://github.com/neondatabase/neon/issues/3922	2023-04-25 14:22:48 +02:00
Christian Schwarz	e83684b868	add libmetric metric for each logged log message (#4055 ) This patch extends the libmetrics logging setup functionality with a `tracing` layer that increments a Prometheus counter each time we log a log message. We have the counter per tracing event level. This allows for monitoring WARN and ERR log volume without parsing the log. Also, it would allow cross-checking whether logs got dropped on the way into Loki. It would be nicer if we could hook deeper into the tracing logging layer, to avoid evaluating the filter twice. But I don't know how to do it.	2023-04-25 14:10:18 +02:00
Alexander Bayandin	0c82ff3d98	test_runner: add Timeline Inspector to Grafana links (#4021 )	2023-04-14 11:46:47 +01:00

1 2 3 4 5 ...

366 Commits