Commit Graph

4083 Commits

Author SHA1 Message Date
Christian Schwarz
f911050e31 Pull in squash of page_cache: find_victim: prevent starvation #5483
Squashed commit of the following:

commit 71bf9cf8ae
Author: Christian Schwarz <me@cschwarz.com>
Date:   Mon Oct 9 20:21:22 2023 +0000

    origin/problame/page-cache-forward-progress/3: trace spans and events only for tests

commit fd97c98dd9
Author: Christian Schwarz <christian@neon.tech>
Date:   Mon Oct 9 21:02:27 2023 +0200

    move into library

commit 05dbff7a18
Author: Christian Schwarz <christian@neon.tech>
Date:   Mon Oct 9 19:26:47 2023 +0200

    commented out the check for just-once-polled, works now, don't understand why though

commit 31632502aa
Author: Christian Schwarz <christian@neon.tech>
Date:   Mon Oct 9 17:54:44 2023 +0200

    fixes

commit 76d3e44588
Author: Christian Schwarz <christian@neon.tech>
Date:   Fri Oct 6 14:39:40 2023 +0200

    hand-roll it instead

commit a5912dcc1b
Author: Christian Schwarz <me@cschwarz.com>
Date:   Wed Oct 4 17:21:24 2023 +0000

    page_cache: find_victim: prevent starvation

commit da9a88a882
Author: Christian Schwarz <me@cschwarz.com>
Date:   Mon Oct 2 17:39:35 2023 +0000

    page_cache: ensure forward progress on cache miss
2023-11-29 11:51:24 +00:00
Christian Schwarz
2eb9f64978 results look good: with initial size calculation in background, we can revert the revert without incurring significant overhead
with revert    49b63570c8 20:05:04 - 20:28:00 =>                22:56 duration
without revert e0ac820e87 11:46:38 - 12:09:05 => 13:22 + 9:05 = 22:30 duration
2023-11-29 11:37:26 +00:00
Christian Schwarz
e0ac820e87 Revert "Revert "revert recent VirtualFile asyncification changes (#5291)""
This reverts commit fef7018ec6.
2023-11-29 10:42:56 +00:00
Christian Schwarz
49b63570c8 idea: concurrency-limit initial logical size calculation
Before this patch, there was no concurrency limit on initial logical
size computations.

In an experiment with a PS with 20k tenants, 1 timeline each,
all tenants inactive in SKs / not present in storage broker,
all logical size calculations are spawned by MetricsCollection,
i.e., consumption metrics worker.

Before this patch, these timelines would all do their initial logical
size calculation in parallel, leading to extreme thrashing in page cache
and virtual file cache.

With this patch, the virtual file cache thrashing is reduced
significantly (from 80k `open`-system-calls/second to ~500
`open`-system-calls/second during loading).

This patch uses the existing background tasks semaphore to limit
concurrency, which generally is the right call for background activity.
However, due to logical size's involvement in PageserverFeedback towards
safekeepers, I think we need a priority-boosting mechanism, e.g., if
we're still calculating but walreceiver is actively asking, skip the
semaphore. That's fairly easy to implement, but I want some feedback
on the general idea first before implementing it.
See also the FIXME in the block comment added in this commit.

NB: when evaluating, keep in mind that consumption metrics worker
persists its interval across restarts; delete the state file on disk
to get predictable (and I believe worst-case in terms of concurrency
during PS restart) behavior.
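As a rough illustration of the limiting mechanism (not the actual pageserver code, which uses the existing background-tasks semaphore), a channel pre-filled with permit tokens can serve as a counting semaphore; `calculate_initial_logical_size` and the constants below are hypothetical stand-ins.

```rust
use std::sync::{mpsc, Arc, Mutex};
use std::thread;

// Stand-in for the expensive per-timeline computation.
fn calculate_initial_logical_size(timeline: u64) -> u64 {
    timeline * 1024
}

fn main() {
    // A channel pre-filled with N tokens acts as a counting semaphore:
    // at most N calculations run at once, the rest block on recv().
    const CONCURRENCY_LIMIT: usize = 2;
    let (permit_tx, permit_rx) = mpsc::channel::<()>();
    for _ in 0..CONCURRENCY_LIMIT {
        permit_tx.send(()).unwrap();
    }
    let permit_rx = Arc::new(Mutex::new(permit_rx));

    let handles: Vec<_> = (0..8u64)
        .map(|timeline| {
            let rx = Arc::clone(&permit_rx);
            let tx = permit_tx.clone();
            thread::spawn(move || {
                // Block until a permit is free; the mutex guard is a
                // temporary dropped at the end of this statement, so the
                // actual work below runs without holding the lock.
                rx.lock().unwrap().recv().unwrap();
                let size = calculate_initial_logical_size(timeline);
                tx.send(()).unwrap(); // return the permit
                size
            })
        })
        .collect();

    let total: u64 = handles.into_iter().map(|h| h.join().unwrap()).sum();
    assert_eq!(total, 28 * 1024); // timelines 0..8 sum to 28 units
}
```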
2023-11-28 19:02:07 +00:00
Christian Schwarz
fef7018ec6 Revert "revert recent VirtualFile asyncification changes (#5291)"
This reverts commit ab1f37e908.

fixes #5479
2023-11-28 15:25:54 +00:00
Christian Schwarz
d16ff0d12c usage instructions for generator script 2023-11-28 15:24:58 +00:00
Christian Schwarz
b4b3f3e3b6 Revert "idea: concurrency-limit initial logical size calculation"
This reverts commit e94fdd9838c0abcb823b891c1e7389c2615e4f5a.
2023-11-28 15:24:58 +00:00
Christian Schwarz
537ef94146 idea: concurrency-limit initial logical size calculation
Before this patch, there was no concurrency limit on initial logical
size computations.

In an experiment with a PS with 20k tenants, 1 timeline each,
all tenants inactive in SKs / not present in storage broker,
all logical size calculations are spawned by MetricsCollection,
i.e., consumption metrics worker.

Before this patch, these timelines would all do their initial logical
size calculation in parallel, leading to extreme thrashing in page cache
and virtual file cache.

With this patch, the virtual file cache thrashing is reduced
significantly (from 80k `open`-system-calls/second to ~500
`open`-system-calls/second during loading).

This patch uses the existing background tasks semaphore to limit
concurrency, which generally is the right call for background activity.
However, due to logical size's involvement in PageserverFeedback towards
safekeepers, I think we need a priority-boosting mechanism, e.g., if
we're still calculating but walreceiver is actively asking, skip the
semaphore. That's fairly easy to implement, but I want some feedback
on the general idea first before implementing it.
See also the FIXME in the block comment added in this commit.

NB: when evaluating, keep in mind that consumption metrics worker
persists its interval across restarts; delete the state file on disk
to get predictable (and I believe worst-case in terms of concurrency
during PS restart) behavior.
2023-11-28 15:24:58 +00:00
Christian Schwarz
efe0d93bf5 many_tenants script now works 2023-11-27 16:11:05 +00:00
Christian Schwarz
306880081d test suite: add method for generation-aware detachment of a tenant 2023-11-27 16:06:54 +00:00
Christian Schwarz
9915597d3a update many tenants script to use the new method for duplicating tenants (copy-paste from benchmarking WIP PR) 2023-11-27 15:12:27 +00:00
Christian Schwarz
e01c0c989e Squashed commit of the following:
commit de90ba56d4
Author: Christian Schwarz <christian@neon.tech>
Date:   Mon Nov 27 14:47:26 2023 +0000

    expose generation number in API

commit ae2c7589f9
Author: Christian Schwarz <christian@neon.tech>
Date:   Mon Nov 27 14:53:13 2023 +0000

    pagectl: add subcommand to rewrite layer file history
2023-11-27 15:00:51 +00:00
Christian Schwarz
3a95fbcae9 measured BACKGROUND_RUNTIME performance using wrk
Launch wrk from command line 3-4 seconds after the load starts.
=> blocking of executor threads is clearly visible, my branch
  performs _much_ better.

baseline: commit 15b8618d25 (HEAD -> problame/loadtest-baseline, origin/problame/loadtest-baseline, main)
neon-main (compaction semaphore disabled!)

admin@ip-172-31-13-23:[~/neon]: wrk --latency http://localhost:2342
Running 10s test @ http://localhost:2342
  2 threads and 10 connections
  Thread Stats   Avg      Stdev     Max   +/- Stdev
    Latency    71.42ms   15.97ms 125.18ms   70.82%
    Req/Sec    41.44     28.85   101.00     57.35%
  Latency Distribution
     50%   72.53ms
     75%   82.07ms
     90%   91.44ms
     99%  116.56ms
  291 requests in 10.01s, 22.73KB read
  Socket errors: connect 0, read 0, write 0, timeout 10
Requests/sec:     29.07
Transfer/sec:      2.27KB

this branch (compaction semaphore also disabled!):

admin@ip-172-31-13-23:[~/neon]: wrk --latency http://localhost:2342
Running 10s test @ http://localhost:2342
  2 threads and 10 connections
  Thread Stats   Avg      Stdev     Max   +/- Stdev
    Latency    45.74ms   64.13ms 293.44ms   83.27%
    Req/Sec   442.81    258.18     1.32k    69.79%
  Latency Distribution
     50%    2.92ms
     75%   75.52ms
     90%  148.03ms
     99%  248.50ms
  8641 requests in 10.01s, 675.08KB read
Requests/sec:    862.81
Transfer/sec:     67.41KB
2023-11-27 13:13:07 +00:00
Christian Schwarz
c9dc9e7d70 HACK: BACKGROUND_RUNTIME webserver to measure response time using wrk 2023-11-27 13:13:07 +00:00
Christian Schwarz
fc7403944e REPRO the problem: uses 430GB of space; 4 seconds load time; constant 20kIOPS after ~20s 2023-11-27 13:13:07 +00:00
Christian Schwarz
a76a503b8b remove confusing no-op .take() of init_tenant_load_remote (#5923)
The `Tenant::spawn()` method already `.take()`s it.

I think this was an oversight in
https://github.com/neondatabase/neon/pull/5580 .
2023-11-27 12:50:19 +00:00
Anastasia Lubennikova
92bc2bb132 Refactor remote extensions feature to request extensions from proxy (#5836)
instead of direct S3 request.

Pros:
- simplify code a lot (no need to provide AWS credentials and paths);
- reduce latency of downloading extension data, as proxy resides near
computes;
- reduce AWS costs, as proxy has a cache and 1000 computes asking for
the same extension will not generate 1000 downloads from S3;
- we can use only one S3 bucket to store extensions (and get rid of the
regional buckets which were introduced to reduce latency).

Changes:
- deprecate the remote-ext-config compute_ctl parameter; use
http://pg-ext-s3-gateway if any old-format remote-ext-config is provided;
- refactor tests to use a mock http server.
2023-11-27 12:10:23 +00:00
John Spray
b80b9e1c4c pageserver: remove defunct local timeline delete markers (#5699)
## Problem

Historically, we treated the presence of a timeline on local disk as
evidence that it logically exists. Since #5580 that is no longer the
case, so we can always rely on remote storage. If we restart and the
timeline is gone in remote storage, we will also purge it from local
disk: no need for a marker.

Reference on why this PR is for timeline markers and not tenant markers:
https://github.com/neondatabase/neon/issues/5080#issuecomment-1783187807

## Summary of changes

Remove code paths that read + write deletion marker for timelines.

Leave code path that deletes these markers, just in case we deploy while
there are some in existence. This can be cleaned up later.
(https://github.com/neondatabase/neon/issues/5718)
2023-11-27 09:31:20 +00:00
Anastasia Lubennikova
87b8ac3ec3 Only create neon extension in postgres database; (#5918)
Create neon extension in neon schema.
2023-11-26 08:37:01 +00:00
Joonas Koivunen
6b1c4cc983 fix: long timeline create cancelled by tenant delete (#5917)
Fix the fallible vs. infallible check order with
`UninitTimeline::finish_creation` so that the incomplete timeline can be
removed. Currently the order of drop guard unwrapping causes uninit
files to be left on pageserver, blocking the tenant deletion.

Cc: #5914
Cc: #investigation-2023-11-23-stuck-tenant-deletion
2023-11-24 16:17:56 +00:00
Joonas Koivunen
831fad46d5 tests: fix allowed_error for compaction detecting a shutdown (#5919)
This has been causing flaky tests, [example evidence].

Follow-up to #5883 where I forgot to fix this.

[example evidence]:
https://neon-github-public-dev.s3.amazonaws.com/reports/pr-5917/6981540065/index.html#suites/9d2450a537238135fd4007859e09aca7/6fd3556a879fa3d1
2023-11-24 16:14:32 +00:00
Joonas Koivunen
53851ea8ec fix: log cancelled request handler errors (#5915)
Noticed during [investigation] with @problame a major point of lost
error logging which would have sped up the investigation.

Cc: #5815

[investigation]:
https://neondb.slack.com/archives/C066ZFAJU85/p1700751858049319
2023-11-24 15:54:06 +02:00
Joonas Koivunen
044375732a test: support validating allowed_errors against a logfile (#5905)
This will make it easier to test whether an added allowed_error does in
fact match, for example against a log file from an Allure report.

```
$ python3 test_runner/fixtures/pageserver/allowed_errors.py --help
usage: allowed_errors.py [-h] [-i INPUT]

check input against pageserver global allowed_errors

optional arguments:
  -h, --help            show this help message and exit
  -i INPUT, --input INPUT
                        Pageserver logs file. Reads from stdin if no file is provided.
```

Co-authored-by: Alexander Bayandin <alexander@neon.tech>
2023-11-24 12:43:25 +00:00
Konstantin Knizhnik
ea63b43009 Check if LFC was initialized in local_cache_pages function (#5911)
## Problem

There is no check that LFC is initialized (`lfc_max_size != 0`) in
the `local_cache_pages` function.

## Summary of changes

Add proper check.

Co-authored-by: Konstantin Knizhnik <knizhnik@neon.tech>
2023-11-24 08:23:00 +02:00
Conrad Ludgate
a56fd45f56 proxy: fix memory leak again (#5909)
## Problem

The `connections.join_next()` fix helped, but it wasn't enough. The way I
implemented the improvement before was still faulty, but it mostly worked,
so it appeared correct.

From [`tokio::select`
docs](https://docs.rs/tokio/latest/tokio/macro.select.html):
> 4. Once an <async expression> returns a value, attempt to apply the
value to the provided <pattern>, if the pattern matches, evaluate
<handler> and return. If the pattern does not match, disable the current
branch and for the remainder of the current call to select!. Continue
from step 3.

The `connections.join_next()` future would complete and `Some(Err(e))`
branch would be evaluated but not match (as the future would complete
without panicking, we would hope). Since the branch doesn't match, it's
disabled. The select continues but never attempts to call `join_next`
again. Getting unlucky, more TCP connections are created than we attempt
to join_next.

## Summary of changes

Replace the `Some(Err(e))` pattern with `Some(e)`. Because of the
auto-disabling feature, we don't need the `if !connections.is_empty()`
step as the `None` pattern will disable it for us.
2023-11-23 19:11:24 +00:00
Anastasia Lubennikova
582a42762b update extension version in test_neon_extension 2023-11-23 18:53:03 +00:00
Anastasia Lubennikova
f5dfa6f140 Create extension neon in existing databases too 2023-11-23 18:53:03 +00:00
Anastasia Lubennikova
f8d9bd8d14 Add extension neon to all databases.
- Run CREATE EXTENSION neon for template1, so that it is created in all new databases.
- Run ALTER EXTENSION neon in all databases, to always have the newest version of the extension in computes.
- Add test_neon_extension test
2023-11-23 18:53:03 +00:00
Anastasia Lubennikova
04e6c09f14 Add pgxn/neon/README.md 2023-11-23 18:53:03 +00:00
Arpad Müller
54327bbeec Upload initdb results to S3 (#5390)
## Problem

See #2592

## Summary of changes

Compresses the results of initdb into a .tar.zst file and uploads them
to S3, to enable usage in recovery from lsn.

I think generations should not be involved, because we do this only once
at the very beginning of a timeline.

---------

Co-authored-by: Joonas Koivunen <joonas@neon.tech>
2023-11-23 18:11:52 +00:00
Shany Pozin
35f243e787 Move weekly release PR trigger to Monday morning (#5908) 2023-11-23 19:09:34 +02:00
Shany Pozin
b7a988ba46 Support cancellation for find_lsn_for_timestamp API (#5904)
## Problem
#5900
## Summary of changes
Added cancellation token as param in all relevant code paths and actually used it in the find_lsn_for_timestamp main loop
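A minimal sketch of the pattern, assuming a flat list of (LSN, timestamp) pairs and using `AtomicBool` as a stand-in for the real cancellation token; the function shape is illustrative, not the pageserver's actual API.

```rust
use std::sync::atomic::{AtomicBool, Ordering};

/// Cancellable search loop in the spirit of find_lsn_for_timestamp:
/// check the token each iteration and bail out early instead of
/// finishing a long scan. Input is (lsn, timestamp) pairs, ascending
/// by lsn; we want the last LSN at or before the target timestamp.
fn find_lsn_for_timestamp(
    timestamps: &[(u64, u64)],
    target: u64,
    cancel: &AtomicBool,
) -> Result<Option<u64>, &'static str> {
    let mut best = None;
    for &(lsn, ts) in timestamps {
        if cancel.load(Ordering::Relaxed) {
            return Err("cancelled"); // e.g. tenant/timeline shutting down
        }
        if ts <= target {
            best = Some(lsn);
        }
    }
    Ok(best)
}

fn main() {
    let data = [(10, 100), (20, 200), (30, 300)];
    let cancel = AtomicBool::new(false);
    assert_eq!(find_lsn_for_timestamp(&data, 250, &cancel), Ok(Some(20)));

    // Once the token fires, the loop stops instead of scanning on.
    cancel.store(true, Ordering::Relaxed);
    assert!(find_lsn_for_timestamp(&data, 250, &cancel).is_err());
}
```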
2023-11-23 17:08:32 +02:00
Christian Schwarz
a0e61145c8 fix: cleanup of layers from the future can race with their re-creation (#5890)
fixes https://github.com/neondatabase/neon/issues/5878
obsoletes https://github.com/neondatabase/neon/issues/5879

Before this PR, it could happen that `load_layer_map` schedules removal
of the future
image layer. Then a later compaction run could re-create the same image
layer, scheduling a PUT.
Due to lack of an upload queue barrier, the PUT and DELETE could be
re-ordered.
The result was IndexPart referencing a non-existent object.

## Summary of changes

* Add support to `pagectl` / Python tests to decode `IndexPart`
  * Rust
    * new `pagectl` Subcommand
* `IndexPart::{from,to}_s3_bytes()` methods to internalize knowledge
about encoding of `IndexPart`
  * Python
    * new `NeonCli` subclass
* Add regression test
  * Rust
* Ability to force repartitioning; required to ensure image layer
creation at last_record_lsn
  * Python
    * The regression test.
* Fix the issue
  * Insert an `UploadOp::Barrier` after scheduling the deletions.
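A toy model of the barrier semantics (not the real upload queue): operations between barriers may run concurrently and thus reorder, while a `Barrier` forces everything scheduled before it to complete first. All type names here are made up for illustration.

```rust
#[derive(Debug, PartialEq)]
enum UploadOp {
    Put(&'static str),
    Delete(&'static str),
    Barrier,
}

/// Split the queue at barriers: each inner Vec is a batch whose ops may
/// run concurrently (and hence reorder); batches themselves run one
/// after another, strictly in order.
fn into_ordered_batches(queue: Vec<UploadOp>) -> Vec<Vec<UploadOp>> {
    let mut batches = vec![Vec::new()];
    for op in queue {
        match op {
            UploadOp::Barrier => batches.push(Vec::new()),
            other => batches.last_mut().unwrap().push(other),
        }
    }
    batches.retain(|b| !b.is_empty());
    batches
}

fn main() {
    // Without a barrier, the DELETE and the later PUT of the same layer
    // would land in one batch and could be reordered. With the barrier
    // between them, the PUT can never overtake the DELETE.
    let queue = vec![
        UploadOp::Delete("layer-A"),
        UploadOp::Barrier,
        UploadOp::Put("layer-A"),
    ];
    let batches = into_ordered_batches(queue);
    assert_eq!(batches.len(), 2);
    assert_eq!(batches[0], vec![UploadOp::Delete("layer-A")]);
    assert_eq!(batches[1], vec![UploadOp::Put("layer-A")]);
}
```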
2023-11-23 13:33:41 +00:00
Konstantin Knizhnik
6afbadc90e LFC fixes + statistics (#5727)
## Summary of changes

See #5500

---------

Co-authored-by: Konstantin Knizhnik <knizhnik@neon.tech>
2023-11-23 08:59:19 +02:00
Anastasia Lubennikova
2a12e9c46b Add documentation for our sample pre-commit hook (#5868) 2023-11-22 12:04:36 +00:00
Christian Schwarz
9e3c07611c logging: support output to stderr (#5896)
(part of the getpage benchmarking epic #5771)

The plan is to make the benchmarking tool log on stderr and emit results
as JSON on stdout. That way, the test suite can simply capture
stdout and json.loads() it, while interactive users of the benchmarking
tool have a reasonable experience as well.

Existing logging users continue to print to stdout, so, this change
should be a no-op functionally and performance-wise.
2023-11-22 11:08:35 +00:00
Christian Schwarz
d353fa1998 refer to our rust-postgres.git fork by branch name (#5894)
This way, `cargo update -p tokio-postgres` just works. The `Cargo.toml`
communicates more clearly that we're referring to the `main` branch. And
the git revision is still pinned in `Cargo.lock`.
2023-11-22 10:58:27 +00:00
Joonas Koivunen
0d10992e46 Cleanup compact_level0_phase1 fsyncing (#5852)
While reviewing code I noticed a scary `layer_paths.pop().unwrap()`, then
realized this should be further asyncified, something I forgot to do
when I switched `compact_level0_phase1` back to async in #4938.

This keeps the double-fsync for new deltas as #4749 is still unsolved.
2023-11-21 15:30:40 +02:00
Arpad Müller
3e131bb3d7 Update Rust to 1.74.0 (#5873)
[Release notes](https://github.com/rust-lang/rust/releases/tag/1.74.0).
2023-11-21 11:41:41 +01:00
Sasha Krassovsky
81b2cefe10 Disallow CREATE DATABASE WITH OWNER neon_superuser (#5887)
## Problem
Currently, control plane doesn't know about neon_superuser, so if a user
creates a database with owner neon_superuser it causes an exception when
it tries to forward it. It is also currently possible to ALTER ROLE
neon_superuser.

## Summary of changes
Disallow creating a database with owner neon_superuser. This is probably
fine, since I don't think you can create a database owned by a normal
superuser. Also forbid altering neon_superuser.
2023-11-20 22:39:47 +00:00
Christian Schwarz
d2ca410919 build: back to opt-level=0 in debug builds, for faster compile times (#5751)
This change brings down incremental compilation for me
from > 1min to 10s (and this is a pretty old Ryzen 1700X).

More details: "incremental compilation" here means changing one character
in the `failed to read value from offset` string in `image_layer.rs`.
The command for incremental compilation is `cargo build_testing`.
The system on which I got these numbers uses `mold` via
`~/.cargo/config.toml`.

As a bonus, `rust-gdb` is now at least a little fun again.

Some tests are timing out in debug builds due to these changes.
This PR makes them skip for debug builds.
We run both with debug and release build, so, the loss of coverage is
marginal.

---------

Co-authored-by: Alexander Bayandin <alexander@neon.tech>
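For reference, the knob involved is Cargo's per-profile `opt-level`; a sketch of what the workspace `Cargo.toml` override presumably looks like (the exact profile tables in the repo may differ):

```toml
# Debug (dev) builds default to opt-level 0: fastest incremental
# compiles, slowest test runtime. Raising it trades the other way.
[profile.dev]
opt-level = 0
```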
2023-11-20 15:41:37 +01:00
Joonas Koivunen
d98ac04136 chore(background_tasks): missed allowed_error change, logging change (#5883)
- I am always confused by the log line for the error wait time; now it will
be `2s` or `2.0s`, not `2.0`
- fix missed string change introduced in #5881 [evidence]

[evidence]:
https://neon-github-public-dev.s3.amazonaws.com/reports/main/6921062837/index.html#suites/f9eba3cfdb71aa6e2b54f6466222829b/87897fe1ddee3825
2023-11-20 07:33:17 +00:00
Joonas Koivunen
ac08072d2e fix(layer): VirtualFile opening and read errors can be caused by contention (#5880)
A very low number of layer loads have been marked wrongly as permanent,
as I did not remember that `VirtualFile::open` or reading could fail
transiently for contention. Return separate errors for transient and
persistent errors from `{Delta,Image}LayerInner::load`.

Includes drive-by comment changes.

The implementation looks quite ugly because the same type serves as both
the inner (operation error) and outer (critical error), but with the
alternatives I tried I did not find a better way.
2023-11-19 14:57:39 +00:00
John Spray
d22dce2e31 pageserver: shut down idle walredo processes (#5877)
The longer a pageserver runs, the more walredo processes it accumulates
from tenants that are touched intermittently (e.g. by availability
checks). This can lead to getting OOM killed.

Changes:
- Add an Instant recording the last use of the walredo process for a
tenant
- After compaction iteration in the background task, check for idleness
and stop the walredo process if idle for more than 10x compaction
period.

Cc: #3620

Co-authored-by: Joonas Koivunen <joonas@neon.tech>
Co-authored-by: Shany Pozin <shany@neon.tech>
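The idleness rule above can be sketched as follows; `WalRedoProcess`, the function shape, and the way the 10x factor is applied are illustrative, not the actual pageserver types.

```rust
use std::time::{Duration, Instant};

// Record the last use of the walredo process; after each compaction
// iteration, stop it if idle for more than 10x the compaction period.
struct WalRedoProcess {
    last_used: Instant,
}

fn should_shut_down(process: &WalRedoProcess, compaction_period: Duration, now: Instant) -> bool {
    now.duration_since(process.last_used) > compaction_period * 10
}

fn main() {
    let period = Duration::from_secs(20);
    let started = Instant::now();
    let p = WalRedoProcess { last_used: started };

    // Used a minute ago with a 200s threshold: keep it alive.
    assert!(!should_shut_down(&p, period, started + Duration::from_secs(60)));
    // Idle for more than 10x the compaction period: shut it down.
    assert!(should_shut_down(&p, period, started + Duration::from_secs(201)));
}
```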
2023-11-19 14:21:16 +00:00
Joonas Koivunen
3b3f040be3 fix(background_tasks): first backoff, compaction error stacktraces (#5881)
First compaction/gc error backoff starts from 0 which is less than 2s
what it was before #5672. This is now fixed to be the intended 2**n.

Additionally noticed the `compaction_iteration` creating an
`anyhow::Error` via `into()` always captures a stacktrace even if we had
a stacktraceful anyhow error within the CompactionError because there is
no stable api for querying that.
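A sketch of the intended schedule, where the n-th consecutive error waits 2^n seconds so the very first retry already waits 2s rather than 0; the cap value is made up, and the real code's constants may differ.

```rust
/// Backoff in seconds for the n-th consecutive error (n >= 1): the
/// intended 2**n, capped so a long error streak cannot grow unboundedly.
fn backoff_seconds(consecutive_errors: u32) -> u64 {
    const MAX_BACKOFF_SECONDS: u64 = 300; // illustrative cap
    2u64.checked_pow(consecutive_errors)
        .map(|s| s.min(MAX_BACKOFF_SECONDS))
        .unwrap_or(MAX_BACKOFF_SECONDS) // overflow: clamp to the cap
}

fn main() {
    assert_eq!(backoff_seconds(1), 2); // first error: 2s, not 0s
    assert_eq!(backoff_seconds(2), 4);
    assert_eq!(backoff_seconds(3), 8);
    assert_eq!(backoff_seconds(10), 300); // 1024s, clamped to the cap
}
```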
2023-11-19 14:16:31 +00:00
Em Sharnoff
cad0dca4b8 compute_ctl: Remove deprecated flag --file-cache-on-disk (#5622)
See neondatabase/cloud#7516 for more.
2023-11-18 12:43:54 +01:00
Em Sharnoff
5d13a2e426 Improve error message when neon.max_cluster_size reached (#4173)
Changes the error message encountered when the `neon.max_cluster_size`
limit is reached. Reasoning is that this is user-visible, and so should
*probably* use language that's closer to what users are familiar with.
2023-11-16 21:51:26 +00:00
khanova
0c243faf96 Proxy log pid hack (#5869)
## Problem

Improve observability for the compute node.

## Summary of changes

Log pid from the compute node. Doesn't work with pgbouncer.
2023-11-16 20:46:23 +00:00
Em Sharnoff
d0a842a509 Update vm-builder to v0.19.0 and move its customization here (#5783)
ref neondatabase/autoscaling#600 for more
2023-11-16 18:17:42 +01:00
khanova
6b82f22ada Collect number of connections by sni type (#5867)
## Problem

We don't know the number of users with the different kinds of
authentication: ["sni", "endpoint in options" (A and B from
[here](https://neon.tech/docs/connect/connection-errors)),
"password_hack"]

## Summary of changes

Collect metrics by sni kind.
2023-11-16 12:19:13 +00:00