rust/neon - neon - Gitea: Git with a cup of tea

rust/neon

mirror of https://github.com/neondatabase/neon.git synced 2026-01-10 15:02:56 +00:00

Author	SHA1	Message	Date
John Spray	34ebfbdd6f	pageserver: fix handling getpage with multiple shards on one node Previously, we would wait for the LSN to be visible on whichever timeline we happened to load at the start of the connection, then proceed to look up the correct timeline for the key and do the read. If the timeline holding the key was behind the timeline we used for the LSN wait, then we might serve an apparently-successful read result that actually contains data from behind the requested lsn.	2024-01-03 14:22:40 +00:00
John Spray	ef7c9c2ccc	pageserver: fix active tenant lookup hitting secondaries with sharding If there is some secondary shard for a tenant on the same node as an attached shard, the secondary shard could trip up this code and cause page_service to incorrectly get an error instead of finding the attached shard.	2024-01-03 14:22:40 +00:00
John Spray	6c79e12630	pageserver: drop unwanted keys during compaction after split	2024-01-03 14:22:40 +00:00
John Spray	753d97bd77	pageserver: don't delete ancestor shard layers	2024-01-03 14:22:40 +00:00
John Spray	edc962f1d7	test_runner: test_issue_5878 log allow list (#6259 ) ## Problem https://neon-github-public-dev.s3.amazonaws.com/reports/pr-6254/7388706419/index.html#suites/5a4b8734277a9878cb429b80c314f470/e54c4f6f6ed22672 ## Summary of changes Permit the log message: because the test helper's detach function increments the generation number, a detach/attach cycle can cause the error if the test runner node is slow enough for the opportunistic deletion queue flush on detach not to complete by the time we call attach.	2024-01-03 14:22:17 +00:00
Arseny Sher	65b4e6e7d6	Remove empty safekeeper init since truncateLsn. It has caveats such as creating half empty segment which can't be offloaded. Instead we'll pursue approach of pull_timeline, seeding new state from some peer.	2024-01-03 18:20:19 +04:00
Alexander Bayandin	17b256679b	vm-image-spec: build pgbouncer from Neon's fork (#6249 ) ## Problem We need to add one more patch to pgbouncer (for https://github.com/neondatabase/neon/issues/5801). I've decided to cherry-pick all required patches to a pgbouncer fork (`neondatabase/pgbouncer`) and use it instead. See https://github.com/neondatabase/pgbouncer/releases/tag/pgbouncer_1_21_0-neon-1 ## Summary of changes - Revert the previous patch (for deallocate/discard all) — the fork already contains it. - Remove `libssl-dev` dependency — we build pgbouncer without `openssl` support. - Clone git tag and build pgbouncer from source code.	2024-01-03 13:02:04 +00:00
John Spray	673a865055	tests: tolerate 304 when evicting layers (#6261 ) In tests that evict layers, explicit eviction can race with automatic eviction of the same layer and result in a 304	2024-01-03 11:50:58 +00:00
Cuong Nguyen	fb518aea0d	Add batch ingestion mechanism to avoid high contention (#5886 ) ## Problem For context, this problem was observed in a research project where we try to make neon run in multiple regions and I was asked by @hlinnaka to make this PR. In our project, we use the pageserver in a non-conventional way such that we would send a larger number of requests to the pageserver than normal (imagine postgres without the buffer pool). I measured the time from the moment a WAL record left the safekeeper to when it reached the pageserver ([code](`e593db1f5a/pageserver/src/tenant/timeline/walreceiver/walreceiver_connection.rs (L282-L287)`)) and observed that when the number of get_page_at_lsn requests was high, the wal receiving time increased significantly (see the left side of the graphs below). Upon further investigation, I found that the delay was caused by this line `d2ca410919/pageserver/src/tenant/timeline.rs (L2348)` The `get_layer_for_write` method is called for every value during WAL ingestion and it tries to acquire layers write lock every time, thus this results in high contention when read lock is acquired more frequently. ![Untitled](https://github.com/neondatabase/neon/assets/6244849/85460f4d-ead1-4532-bc64-736d0bfd7f16) ![Untitled2](https://github.com/neondatabase/neon/assets/6244849/84199ab7-5f0e-413b-a42b-f728f2225218) ## Summary of changes It is unnecessary to call `get_layer_for_write` repeatedly for all values in a WAL message since they would end up in the same memory layer anyway, so I created the batched versions of `InMemoryLayer::put_value`, `InMemoryLayer ::put_tombstone`, `Timeline::put_value`, and `Timeline::put_tombstone`, that acquire the locks once for a batch of values. Additionally, `DatadirModification` is changed to store multiple versions of uncommitted values, and `WalIngest::ingest_record()` can now ingest records without immediately committing them. With these new APIs, the new ingestion loop can be changed to commit for every `ingest_batch_size` records. The `ingest_batch_size` variable is exposed as a config. If it is set to 1 then we get the same behavior before this change. I found that setting this value to 100 seems to work the best, and you can see its effect on the right side of the above graphs. --------- Co-authored-by: John Spray <john@neon.tech>	2024-01-03 10:41:58 +00:00
John Spray	42f41afcbd	tests: update pytest and boto3 dependencies (#6253 ) ## Problem The version of pytest we were using emits a number of DeprecationWarnings on latest python: these are fixed in latest release. boto3 and python-dateutil also have deprecation warnings, but unfortunately these aren't fixed upstream yet. ## Summary of changes - Update pytest - Update boto3 (this doesn't fix deprecation warnings, but by the time I figured that out I had already done the update, and it's good hygiene anyway)	2024-01-03 10:36:53 +00:00
Arseny Sher	f71110383c	Remove second check for max_slot_wal_keep_size download size. Already checked in GetLogRepRestartLSN, a rebase artifact.	2024-01-03 13:13:32 +04:00
Arseny Sher	ae3eaf9995	Add [WP] prefix to all walproposer logging. - rename walpop_log to wp_log - create also wpg_log which is used in postgres-specific code - in passing format messages to start with lower case	2024-01-03 11:10:27 +04:00
Christian Schwarz	aa9f1d4b69	pagebench get-page: default to latest=true, make configurable via flag (#6252 ) fixes https://github.com/neondatabase/neon/issues/6209	2024-01-02 16:57:29 +00:00
Joonas Koivunen	946c6a0006	scrubber: use adaptive config with retries, check subset of tenants (#6219 ) The tool still needs a lot of work. These are the easiest fix and feature: - use similar adaptive config with s3 as remote_storage, use retries - process only particular tenants Tenants need to be from the correct region, they are not deduplicated, but the feature is useful for re-checking small amount of tenants after a large run.	2024-01-02 15:22:16 +00:00
Sasha Krassovsky	ce13281d54	MIN not MAX	2024-01-02 06:28:49 -08:00
Sasha Krassovsky	4e1d16f311	Switch to exponential rate-limiting	2024-01-02 06:28:49 -08:00
Sasha Krassovsky	091a0cda9d	Switch to rate-limiting strategy	2024-01-02 06:28:49 -08:00
Sasha Krassovsky	ea9fad419e	Add exponential backoff to page_server->send	2024-01-02 06:28:49 -08:00
Arseny Sher	e92c9f42c0	Don't split WAL record across two XLogData's when sending from safekeepers. As protocol demands. Not following this makes standby complain about corrupted WAL in various ways. https://neondb.slack.com/archives/C05L7D1JAUS/p1703774799114719 closes https://github.com/neondatabase/cloud/issues/9057	2024-01-02 10:50:20 +04:00
Arseny Sher	aaaa39d9f5	Add large insertion and slow WAL sending to test_hot_standby. To exercise MAX_SEND_SIZE sending from safekeeper; we've had a bug with WAL records torn across several XLogData messages. Add failpoint to safekeeper to slow down sending. Also check for corrupted WAL complains in standby log. Make the test a bit simpler in passing, e.g. we don't need explicit commits as autocommit is enabled by default. https://neondb.slack.com/archives/C05L7D1JAUS/p1703774799114719 https://github.com/neondatabase/cloud/issues/9057	2024-01-02 10:50:20 +04:00
Arseny Sher	e79a19339c	Add failpoint support to safekeeper. Just a copy paste from pageserver.	2024-01-02 10:50:20 +04:00
Arseny Sher	dbd36e40dc	Move failpoint support code to utils. To enable them in safekeeper as well.	2024-01-02 10:50:20 +04:00
Arseny Sher	90ef48aab8	Fix safekeeper START_REPLICATION (term=n). It was giving WAL only up to commit_lsn instead of flush_lsn, so recovery of uncommitted WAL since `cdb08f03` hanged. Add test for this.	2024-01-01 20:44:05 +04:00
Arseny Sher	9a43c04a19	compute_ctl: kill postgres and sync-safekeeprs on exit. Otherwise they are left orphaned when compute_ctl is terminated with a signal. It was invisible most of the time because normally neon_local or k8s kills postgres directly and then compute_ctl finishes gracefully. However, in some tests compute_ctl gets stuck waiting for sync-safekeepers which intentionally never ends because safekeepers are offline, and we want to stop compute_ctl without leaving orphanes behind. This is a quite rough approach which doesn't wait for children termination. A better way would be to convert compute_ctl to async which would make waiting easy.	2024-01-01 20:44:05 +04:00
Abhijeet Patil	f28bdb6528	Use nextest for rust unittests (#6223 ) ## Problem `cargo test` doesn't support timeouts or junit output format ## Summary of changes - Add `nextest` to `build-tools` image - Switch `cargo test` with `cargo nextest` on CI - Set timeout	2023-12-30 13:45:31 +00:00
Conrad Ludgate	1c037209c7	proxy: fix compute addr parsing (#6237 ) ## Problem control plane should be able to return domain names and not just IP addresses. ## Summary of changes 1. add regression tests 2. use rsplit to split the port from the back, then trim the ipv6 brackets	2023-12-29 09:32:24 +00:00
Bodobolero	e5a3b6dfd8	Pg stat statements reset for neon superuser (#6232 ) ## Problem Extension pg_stat_statements has function pg_stat_statements_reset(). In vanilla Postgres this function can only be called by superuser role or other users/roles explicitly granted. In Neon no end user can use superuser role. Instead we have neon_superuser role. We need to grant execute on pg_stat_statements_reset() to neon_superuser ## Summary of changes Modify the Postgres v14, v15, v16 contrib in our compute docker file to grant execute on pg_stat_statements_reset() to neon_superuser. (Modifying it in our docker file is preferable to changes in neondatabase/postgres because we want to limit the changes in our fork that we have to carry with each new version of Postgres). Note that the interface of proc/function pg_stat_statements_reset changed in pg_stat_statements version 1.7 So for versions up to and including 1.6 we must `GRANT EXECUTE ON FUNCTION pg_stat_statements_reset() TO neon_superuser;` and for versions starting from 1.7 we must `GRANT EXECUTE ON FUNCTION pg_stat_statements_reset(Oid, Oid, bigint) TO neon_superuser;` If we just use `GRANT EXECUTE ON FUNCTION pg_stat_statements_reset() TO neon_superuser;` for all version this results in the following error for versions 1.7+: ```sql neondb=> create extension pg_stat_statements; ERROR: function pg_stat_statements_reset() does not exist ``` ## Checklist before requesting a review - [x ] I have performed a self-review of my code. - [ ] If it is a core feature, I have added thorough tests. - [ ] Do we need to implement analytics? if so did you add the relevant metrics to the dashboard? - [x ] If this PR requires public announcement, mark it with /release-notes label and add several sentences in this section. ## Checklist before merging - [ ] Do not forget to reformat commit message to not include the above checklist ## I have run the following test and could now invoke pg_stat_statements_reset() using default user ```bash (neon) peterbendel@Peters-MBP neon % kubectl get pods \| grep compute-quiet-mud-88416983 compute-quiet-mud-88416983-74f4bf67db-crl4c 3/3 Running 0 7m26s (neon) peterbendel@Peters-MBP neon % kubectl set image deploy/compute-quiet-mud-88416983 compute-node=neondatabase/compute-node-v15:7307610371 deployment.apps/compute-quiet-mud-88416983 image updated (neon) peterbendel@Peters-MBP neon % psql postgresql://peterbendel:<secret>@ep-bitter-sunset-73589702.us-east-2.aws.neon.build/neondb psql (16.1, server 15.5) SSL connection (protocol: TLSv1.3, cipher: TLS_AES_256_GCM_SHA384, compression: off) Type "help" for help. neondb=> select version(); version --------------------------------------------------------------------------------------------------- PostgreSQL 15.5 on x86_64-pc-linux-gnu, compiled by gcc (Debian 10.2.1-6) 10.2.1 20210110, 64-bit (1 row) neondb=> create extension pg_stat_statements; CREATE EXTENSION neondb=> select pg_stat_statements_reset(); pg_stat_statements_reset -------------------------- (1 row) ```	2023-12-27 18:15:17 +01:00
Sasha Krassovsky	136aab5479	Bump postgres submodule versions	2023-12-27 08:39:00 -08:00
Anastasia Lubennikova	6e40900569	Manage pgbouncer configuration from compute_ctl: - add pgbouncer_settings section to compute spec; - add pgbouncer-connstr option to compute_ctl. - add pgbouncer-ini-path option to compute_ctl. Default: /etc/pgbouncer/pgbouncer.ini Apply pgbouncer config on compute start and respec to override default spec. Save pgbouncer config updates to pgbouncer.ini to preserve them across pgbouncer restarts.	2023-12-26 15:17:09 +00:00
Arseny Sher	ddc431fc8f	pgindent walproposer condvar comment	2023-12-26 14:12:53 +04:00
Arseny Sher	bfc98f36e3	Refactor handling responses in walproposer. Remove confirm_wal_streamed; we already apply both write and flush positions of the slot to commit_lsn which is fine because 1) we need to wake up waiters 2) committed WAL can be fetched from safekeepers by neon_walreader now.	2023-12-26 14:12:53 +04:00
Arseny Sher	d5fbfe2399	Remove test_wal_deleted_after_broadcast. It is superseded by stronger test_lagging_sk.	2023-12-26 14:12:53 +04:00
Arseny Sher	1f1c50e8c7	Don't re-add neon_walreader socket to waiteventset if possible. Should make recovery slightly more efficient (likely negligibly).	2023-12-26 14:12:53 +04:00
Arseny Sher	854df0f566	Do PQgetCopyData before PQconsumeInput in libpqwp_async_read. To avoid a lot of redundant memmoves and bloated input buffer. fixes https://github.com/neondatabase/neon/issues/6055	2023-12-26 14:12:53 +04:00
Arseny Sher	9c493869c7	Perform synchronous WAL download in wp only for logical replication. wp -> sk communication now uses neon_walreader which will fetch missing WAL on demand from safekeepers, so doesn't need this anymore. Also, cap WAL download by max_slot_wal_keep_size to be able to start compute if lag is too high.	2023-12-26 14:12:53 +04:00
Arseny Sher	df760e6de5	Add test_lagging_sk.	2023-12-26 14:12:53 +04:00
Arseny Sher	14913c6443	Adapt rust walproposer to neon_walreader.	2023-12-26 14:12:53 +04:00
Arseny Sher	cdb08f0362	Introduce NeonWALReader downloading sk -> compute WAL on demand. It is similar to XLogReader, but when either requested segment is missing locally or requested LSN is before basebackup_lsn NeonWALReader asynchronously fetches WAL from one of safekeepers. Patch includes walproposer switch to NeonWALReader, splitting wouldn't make much sense as it is hard to test otherwise. This finally removes risk of pg_wal explosion (as well as slow start time) when one safekeeper is lagging, at the same time allowing to recover it. In the future reader should also be used by logical walsender for similar reasons (currently we download the tail on compute start synchronously). The main test is test_lagging_sk. However, I also run it manually a lot varying MAX_SEND_SIZE on both sides (on safekeeper and on walproposer), testing various fragmentations (one side having small buffer, another, both), which brought up https://github.com/neondatabase/neon/issues/6055 closes https://github.com/neondatabase/neon/issues/1012	2023-12-26 14:12:53 +04:00
Konstantin Knizhnik	572bc06011	Do not copy WAL for lagged slots (#6221 ) ## Problem See https://neondb.slack.com/archives/C026T7K2YP9/p1702813041997959 ## Summary of changes Do not take in account invalidated slots when calculate restart_lsn position for basebackup at page server ## Checklist before requesting a review - [ ] I have performed a self-review of my code. - [ ] If it is a core feature, I have added thorough tests. - [ ] Do we need to implement analytics? if so did you add the relevant metrics to the dashboard? - [ ] If this PR requires public announcement, mark it with /release-notes label and add several sentences in this section. ## Checklist before merging - [ ] Do not forget to reformat commit message to not include the above checklist --------- Co-authored-by: Konstantin Knizhnik <knizhnik@neon.tech>	2023-12-22 20:47:55 +02:00
Arpad Müller	a7342b3897	remote_storage: store last_modified and etag in Download (#6227 ) Store the content of the `last-modified` and `etag` HTTP headers in `Download`. This serves both as the first step towards #6199 and as a preparation for tests in #6155 .	2023-12-22 14:13:20 +01:00
John Spray	e68ae2888a	pageserver: expedite tenant activation on delete (#6190 ) ## Problem During startup, a tenant delete request might have to retry for many minutes waiting for a tenant to enter Active state. ## Summary of changes - Refactor delete_tenant into TenantManager: this is not a functional change, but will avoid merge conflicts with https://github.com/neondatabase/neon/pull/6105 later - Add 412 responses to the swagger definition of this endpoint. - Use Tenant::wait_to_become_active in `TenantManager::delete_tenant` --------- Co-authored-by: Arpad Müller <arpad-m@users.noreply.github.com>	2023-12-22 10:22:22 +00:00
Arpad Müller	83000b3824	buildtools: update protoc and mold (#6222 ) These updates aren't very important but I would like to try out the new process as of #6195	2023-12-21 18:07:21 +01:00
Arpad Müller	a21b719770	Use neon-github-ci-tests S3 bucket for remote_storage tests (#6216 ) This bucket is already used by the pytests. The current bucket github-public-dev is more meant for longer living artifacts. slack thread: https://neondb.slack.com/archives/C039YKBRZB4/p1703124944669009 Part of https://github.com/neondatabase/cloud/issues/8233 / #6155	2023-12-21 17:28:28 +01:00
Alexander Bayandin	1dff98be84	CI: fix build-tools image tag for PRs (#6217 ) ## Problem Fix build-tools image tag calculation for PRs. Broken in https://github.com/neondatabase/neon/pull/6195 ## Summary of changes - Use `pinned` tag instead of `$GITHUB_RUN_ID` if there's no changes in the dockerfile (and we don't build such image)	2023-12-21 14:55:24 +00:00
Arpad Müller	7d6fc3c826	Use pre-generated initdb.tar.zst in test_ingest_real_wal (#6206 ) This implements the TODO mentioned in the test added by #5892.	2023-12-21 14:23:09 +00:00
Abhijeet Patil	61b6c4cf30	Build dockerfile from neon repo (#6195 ) ## Fixing GitHub workflow issue related to build and push images ## Summary of changes Followup of PR#608[move docker file from build repo to neon to solve issue some issues The build started failing because it missed a validation in logic that determines changes in the docker file Also, all the dependent jobs were skipped because of the build and push of the image job. To address the above issue following changes were made - we are adding validation to generate image tag even if it's a merge to repo. - All the dependent jobs won't skip even if the build and push image job is skipped. - We have moved the logic to generate a tag in the sub-workflow. As the tag name was necessary to be passed to the sub-workflow it made sense to abstract that away where it was needed and then store it as an output variable so that downward dependent jobs could access the value. - This made the dependency logic easy and we don't need complex expressions to check the condition on which it will run - An earlier PR was closed that tried solving a similar problem that has some feedback and context before creating this PR https://github.com/neondatabase/neon/pull/6175 ## Checklist before requesting a review - [x] Move the tag generation logic from the main workflow to the sub-workflow of build and push the image - [x] Add a condition to generate an image tag for a non-PR-related run - [x] remove complex if the condition from the job if conditions --------- Co-authored-by: Alexander Bayandin <alexander@neon.tech> Co-authored-by: Abhijeet Patil <abhijeet@neon.tech>	2023-12-21 12:46:51 +00:00
Bodobolero	f93d15f781	add comment to run vacuum for clickbench (#6212 ) ## Problem This is a comment only change. To ensure that our benchmarking results are fair we need to have correct stats in catalog. Otherwise optimizer chooses seq scan instead of index only scan for some queries. Added comment to run vacuum after data prep.	2023-12-21 13:34:31 +01:00
Christian Schwarz	5385791ca6	add pageserver component-level benchmark (`pagebench`) (#6174 ) This PR adds a component-level benchmarking utility for pageserver. Its name is `pagebench`. The problem solved by `pagebench` is that we want to put Pageserver under high load. This isn't easily achieved with `pgbench` because it needs to go through a compute, which has signficant performance overhead compared to accessing Pageserver directly. Further, compute has its own performance optimizations (most importantly: caches). Instead of designing a compute-facing workload that defeats those internal optimizations, `pagebench` simply bypasses them by accessing pageserver directly. Supported benchmarks: * getpage@latest_lsn * basebackup * triggering logical size calculation This code has no automated users yet. A performance regression test for getpage@latest_lsn will be added in a later PR. part of https://github.com/neondatabase/neon/issues/5771	2023-12-21 13:07:23 +01:00
Conrad Ludgate	2df3602a4b	Add GC to http connection pool (#6196 ) ## Problem HTTP connection pool will grow without being pruned ## Summary of changes Remove connection clients from pools once idle, or once they exit. Periodically clear pool shards. GC Logic: Each shard contains a hashmap of `Arc<EndpointPool>`s. Each connection stores a `Weak<EndpointPool>`. During a GC sweep, we take a random shard write lock, and check that if any of the `Arc<EndpointPool>`s are unique (using `Arc::get_mut`). - If they are unique, then we check that the endpoint-pool is empty, and sweep if it is. - If they are not unique, then the endpoint-pool is in active use and we don't sweep. - Idle connections will self-clear from the endpoint-pool after 5 minutes. Technically, the uniqueness of the endpoint-pool should be enough to consider it empty, but the connection count check is done for completeness sake.	2023-12-21 12:00:10 +00:00
Arpad Müller	48890d206e	Simplify inject_index_part test function (#6207 ) Instead of manually constructing the directory's path, we can just use the `parent()` function. This is a drive-by improvement from #6206	2023-12-21 12:52:38 +01:00

1 2 3 4 5 ...

4306 Commits