rust/neon - neon - Gitea: Git with a cup of tea

rust/neon

mirror of https://github.com/neondatabase/neon.git synced 2025-12-23 06:09:59 +00:00

Author	SHA1	Message	Date
Tristan Partin	9794f386f4	Make Postgres 17 the default version (#11619 ) This is mostly a documentation update, but a few updates with regard to neon_local, pageserver, and tests. 17 is our default for users in production, so dropping references to 16 makes sense. Signed-off-by: Tristan Partin <tristan@neon.tech> Signed-off-by: Tristan Partin <tristan@neon.tech>	2025-04-16 23:23:37 +00:00
Erik Grinaker	c610f3584d	test_runner: tweak `test_create_snapshot` compaction (#11495 ) ## Problem With the recent improvements to L0 compaction responsiveness, `test_create_snapshot` now ends up generating 10,000 layer files (compared to 1,000 in previous snapshots). This increases the snapshot size by 4x, and significantly slows down tests. ## Summary of changes Increase the target layer size from 128 KB to 256 KB, and the L0 compaction threshold from 1 to 5. This reduces the layer count from about 10,000 to 1,000.	2025-04-09 06:52:49 +00:00
Erik Grinaker	7679b63a2c	pageserver: persist stripe size in tenant manifest for tenant_import (#11181 ) ## Problem `tenant_import`, used to import an existing tenant from remote storage into a storage controller for support and debugging, assumed `DEFAULT_STRIPE_SIZE` since this can't be recovered from remote storage. In #11168, we are changing the stripe size, which will break `tenant_import`. Resolves #11175. ## Summary of changes * Add `stripe_size` to the tenant manifest. * Add `TenantScanRemoteStorageShard::stripe_size` and return from `tenant_scan_remote` if present. * Recover the stripe size during`tenant_import`, or fall back to 32768 (the original default stripe size). * Add tenant manifest compatibility snapshot: `2025-04-08-pgv17-tenant-manifest-v1.tar.zst` There are no cross-version concerns here, since unknown fields are ignored during deserialization where relevant.	2025-04-08 20:43:27 +00:00
Christian Schwarz	4f94751b75	pageserver config: ignore+warn about unknown fields (instead of `deny_unknown_fields`) (#11275 ) # Refs - refs https://github.com/neondatabase/neon/issues/8915 - discussion thread: https://neondb.slack.com/archives/C033RQ5SPDH/p1742406381132599 - stacked atop https://github.com/neondatabase/neon/pull/11298 - corresponding internal docs update that illustrates how this PR removes friction: https://github.com/neondatabase/docs/pull/404 # Problem Rejecting `pageserver.toml`s with unknown fields adds friction, especially when using `pageserver.toml` fields as feature flags that need to be decommissioned. See the added paragraphs on `pageserver_api::models::ConfigToml` for details on what kind of friction it causes. Also read the corresponding internal docs update linked above to see a more imperative guide for using `pageserver.toml` flags as feature flags. # Solution ## Ignoring unknown fields Ignoring is the serde default behavior. So, just remove `serde(deny_unknown_fields)` from all structs in `pageserver_api::config::ConfigToml` `pageserver_api::config::TenantConfigToml`. I went through all the child fields and verified they don't use `deny_unknown_fields` either, including those shared with `pageserver_api::models`. ## Warning about unknown fields We still want to warn about unknown fields to - be informed about typos in the config template - be reminded about feature-flag style configs that have been cleaned up in code but not yet in config templates We tried `serde_ignore` (cf draft #11319) but it doesn't work with `serde(flatten)`. The solution we arrived at is to compare the on-disk TOML with the TOML that we produce if we serialize the `ConfigToml` again. Any key specified in the on-disk TOML but not present in the serialized TOML is flagged as an ignored key. The mechanism to do it is a tiny recursive decent visitor on the `toml_edit::DocumentMut`. # Future Work Invalid config _values_ in known fields will continue to fail pageserver startup. See - https://github.com/neondatabase/cloud/issues/24349 for current worst case impact to deployments & ideas to improve.	2025-04-04 17:30:58 +00:00
Vlad Lazar	74920d8cd8	storcon: notify compute if correct observed state was refreshed (#11342 ) ## Problem Previously, if the observed state was refreshed and matching the intent, we wouldn't send a compute notification. This is unsafe. There's no guarantee that the location landed on the pageserver _and_ a compute notification for it was delivered. See https://github.com/neondatabase/neon/issues/11291#issuecomment-2743205411 for one such example. ## Summary of changes Add a reproducer and notify the compute if the correct observed state required a refresh. Closes https://github.com/neondatabase/neon/issues/11291	2025-04-03 16:35:55 +00:00
Alexander Bayandin	30a7dd630c	ruff: enable TC — flake8-type-checking (#11368 ) ## Problem `TYPE_CHECKING` is used inconsistently across Python tests. ## Summary of changes - Update `ruff`: 0.7.0 -> 0.11.2 - Enable TC (flake8-type-checking): https://docs.astral.sh/ruff/rules/#flake8-type-checking-tc - (auto)fix all new issues	2025-03-30 18:58:33 +00:00
Vlad Lazar	9fc7c22cc9	storcon: add use_local_compute_notifications flag (#11333 ) ## Problem While working on bulk import, I want to use the `control-plane-url` flag for a different request. Currently, the local compute hook is used whenever no control plane is specified in the config. My test requires local compute notifications and a configured `control-plane-url` which isn't supported. ## Summary of changes Add a `use-local-compute-notifications` flag. When this is set, we use the local flow regardless of other config values. It's enabled by default in neon_local and disabled by default in all other envs. I had to turn the flag off in tests that wish to bypass the local flow, but that's expected. --------- Co-authored-by: Arpad Müller <arpad-m@users.noreply.github.com>	2025-03-21 15:31:06 +00:00
John Spray	08f92bb916	pageserver: clean up DeletionQueue push_layers_sync (#10701 ) ## Problem This is tech debt. While we introduced generations for tenants, some legacy situations without generations needed to delete things inline (async operation) instead of enqueing them (sync operation). ## Summary of changes - Remove the async code, replace calls with the sync variant, and assert that the generation is always set	2025-02-07 13:03:01 +00:00
Alexander Lakhin	977781e423	Enable sanitizers for postgres v17 (#10401 ) Add a build with sanitizers (asan, ubsan) to the CI pipeline and run tests on it. See https://github.com/neondatabase/neon/issues/6053 --------- Co-authored-by: Alexander Bayandin <alexander@neon.tech>	2025-02-06 12:53:43 +00:00
Erik Grinaker	ddb9ae1214	pageserver: add compaction backpressure for layer flushes (#10405 ) ## Problem There is no direct backpressure for compaction and L0 read amplification. This allows a large buildup of compaction debt and read amplification. Resolves #5415. Requires #10402. ## Summary of changes Delay layer flushes based on the number of level 0 delta layers: * `l0_flush_delay_threshold`: delay flushes such that they take 2x as long (default `2 * compaction_threshold`). * `l0_flush_stall_threshold`: stall flushes until level 0 delta layers drop below threshold (default `4 * compaction_threshold`). If either threshold is reached, ephemeral layer rolls also synchronously wait for layer flushes to propagate this backpressure up into WAL ingestion. This will bound the number of frozen layers to 1 once backpressure kicks in, since all other frozen layers must flush before the rolled layer. ## Analysis This will significantly change the compute backpressure characteristics. Recall the three compute backpressure knobs: * `max_replication_write_lag`: 500 MB (based on Pageserver `last_received_lsn`). * `max_replication_flush_lag`: 10 GB (based on Pageserver `disk_consistent_lsn`). * `max_replication_apply_lag`: disabled (based on Pageserver `remote_consistent_lsn`). Previously, the Pageserver would keep ingesting WAL and build up ephemeral layers and L0 layers until the compute hit `max_replication_flush_lag` at 10 GB and began backpressuring. Now, once we delay/stall WAL ingestion, the compute will begin backpressuring after `max_replication_write_lag`, i.e. 500 MB. This is probably a good thing (we're not building up a ton of compaction debt), but we should consider tuning these settings. `max_replication_flush_lag` probably doesn't serve a purpose anymore, and we should consider removing it. Furthermore, the removal of the upload barrier in #10402 will mean that we no longer backpressure flushes based on S3 uploads, since `max_replication_apply_lag` is disabled. We should consider enabling this as well. ### When and what do we compact? Default compaction settings: * `compaction_threshold`: 10 L0 delta layers. * `compaction_period`: 20 seconds (between each compaction loop check). * `checkpoint_distance`: 256 MB (size of L0 delta layers). * `l0_flush_delay_threshold`: 20 L0 delta layers. * `l0_flush_stall_threshold`: 40 L0 delta layers. Compaction characteristics: * Minimum compaction volume: 10 layers * 256 MB = 2.5 GB. * Additional compaction volume (assuming 128 MB/s WAL): 128 MB/s * 20 seconds = 2.5 GB (10 L0 layers). * Required compaction bandwidth: 5.0 GB / 20 seconds = 256 MB/s. ### When do we hit `max_replication_write_lag`? Depending on how fast compaction and flushes happens, the compute will backpressure somewhere between `l0_flush_delay_threshold` or `l0_flush_stall_threshold` + `max_replication_write_lag`. * Minimum compute backpressure lag: 20 layers * 256 MB + 500 MB = 5.6 GB * Maximum compute backpressure lag: 40 layers * 256 MB + 500 MB = 10.0 GB This seems like a reasonable range to me.	2025-01-24 09:47:28 +00:00
Vlad Lazar	414ed82c1f	pageserver: issue concurrent IO on the read path (#9353 ) ## Refs - Epic: https://github.com/neondatabase/neon/issues/9378 Co-authored-by: Vlad Lazar <vlad@neon.tech> Co-authored-by: Christian Schwarz <christian@neon.tech> ## Problem The read path does its IOs sequentially. This means that if N values need to be read to reconstruct a page, we will do N IOs and getpage latency is `O(NIoLatency)`. ## Solution With this PR we gain the ability to issue IO concurrently within one layer visit and* to move on to the next layer without waiting for IOs from the previous visit to complete. This is an evolved version of the work done at the Lisbon hackathon, cf https://github.com/neondatabase/neon/pull/9002. ## Design ### `will_init` now sourced from disk btree index keys On the algorithmic level, the only change is that the `get_values_reconstruct_data` now sources `will_init` from the disk btree index key (which is PS-page_cache'd), instead of from the `Value`, which is only available after the IO completes. ### Concurrent IOs, Submission & Completion To separate IO submission from waiting for its completion, while simultaneously feature-gating the change, we introduce the notion of an `IoConcurrency` struct through which IO futures are "spawned". An IO is an opaque future, and waiting for completions is handled through `tokio::sync::oneshot` channels. The oneshot Receiver's take the place of the `img` and `records` fields inside `VectoredValueReconstructState`. When we're done visiting all the layers and submitting all the IOs along the way we concurrently `collect_pending_ios` for each value, which means for each value there is a future that awaits all the oneshot receivers and then calls into walredo to reconstruct the page image. Walredo is now invoked concurrently for each value instead of sequentially. Walredo itself remains unchanged. The spawned IO futures are driven to completion by a sidecar tokio task that is separate from the task that performs all the layer visiting and spawning of IOs. That tasks receives the IO futures via an unbounded mpsc channel and drives them to completion inside a `FuturedUnordered`. (The behavior from before this PR is available through `IoConcurrency::Sequential`, which awaits the IO futures in place, without "spawning" or "submitting" them anywhere.) #### Alternatives Explored A few words on the rationale behind having a sidecar task and what alternatives were considered. One option is to queue up all IO futures in a FuturesUnordered that is polled the first time when we `collect_pending_ios`. Firstly, the IO futures are opaque, compiler-generated futures that need to be polled at least once to submit their IO. "At least once" because tokio-epoll-uring may not be able to submit the IO to the kernel on first poll right away. Second, there are deadlocks if we don't drive the IO futures to completion independently of the spawning task. The reason is that both the IO futures and the spawning task may hold some _and_ try to acquire _more_ shared limited resources. For example, both spawning task and IO future may try to acquire * a VirtualFile file descriptor cache slot async mutex (observed during impl) * a tokio-epoll-uring submission slot (observed during impl) * a PageCache slot (currently this is not the case but we may move more code into the IO futures in the future) Another option is to spawn a short-lived `tokio::task` for each IO future. We implemented and benchmarked it during development, but found little throughput improvement and moderate mean & tail latency degradation. Concerns about pressure on the tokio scheduler made us discard this variant. The sidecar task could be obsoleted if the IOs were not arbitrary code but a well-defined struct. However, 1. the opaque futures approach taken in this PR allows leaving the existing code unchanged, which 2. allows us to implement the `IoConcurrency::Sequential` mode for feature-gating the change. Once the new mode sidecar task implementation is rolled out everywhere, and `::Sequential` removed, we can think about a descriptive submission & completion interface. The problems around deadlocks pointed out earlier will need to be solved then. For example, we could eliminate VirtualFile file descriptor cache and tokio-epoll-uring slots. The latter has been drafted in https://github.com/neondatabase/tokio-epoll-uring/pull/63. See the lengthy doc comment on `spawn_io()` for more details. ### Error handling There are two error classes during reconstruct data retrieval: * traversal errors: index lookup, move to next layer, and the like * value read IO errors A traversal error fails the entire get_vectored request, as before this PR. A value read error only fails that value. In any case, we preserve the existing behavior that once `get_vectored` returns, all IOs are done. Panics and failing to poll `get_vectored` to completion will leave the IOs dangling, which is safe but shouldn't happen, and so, a rate-limited log statement will be emitted at warning level. There is a doc comment on `collect_pending_ios` giving more code-level details and rationale. ### Feature Gating The new behavior is opt-in via pageserver config. The `Sequential` mode is the default. The only significant change in `Sequential` mode compared to before this PR is the buffering of results in the `oneshot`s. ## Code-Level Changes Prep work: * Make `GateGuard` clonable. Core Feature: * Traversal code: track `will_init` in `BlobMeta` and source it from the Delta/Image/InMemory layer index, instead of determining `will_init` after we've read the value. This avoids having to read the value to determine whether traversal can stop. * Introduce `IoConcurrency` & its sidecar task. * `IoConcurrency` is the clonable handle. * It connects to the sidecar task via an `mpsc`. * Plumb through `IoConcurrency` from high level code to the individual layer implementations' `get_values_reconstruct_data`. We piggy-back on the `ValuesReconstructState` for this. * The sidecar task should be long-lived, so, `IoConcurrency` needs to be rooted up "high" in the call stack. * Roots as of this PR: * `page_service`: outside of pagestream loop * `create_image_layers`: when it is called * `basebackup`(only auxfiles + replorigin + SLRU segments) * Code with no roots that uses `IoConcurrency::sequential` * any `Timeline::get` call * `collect_keyspace` is a good example * follow-up: https://github.com/neondatabase/neon/issues/10460 * `TimelineAdaptor` code used by the compaction simulator, unused in practive * `ingest_xlog_dbase_create` * Transform Delta/Image/InMemoryLayer to * do their values IO in a distinct `async {}` block * extend the residence of the Delta/Image layer until the IO is done * buffer their results in a `oneshot` channel instead of straight in `ValuesReconstructState` * the `oneshot` channel is wrapped in `OnDiskValueIo` / `OnDiskValueIoWaiter` types that aid in expressiveness and are used to keep track of in-flight IOs so we can print warnings if we leave them dangling. * Change `ValuesReconstructState` to hold the receiving end of the `oneshot` channel aka `OnDiskValueIoWaiter`. * Change `get_vectored_impl` to `collect_pending_ios` and issue walredo concurrently, in a `FuturesUnordered`. Testing / Benchmarking: * Support queue-depth in pagebench for manual benchmarkinng. * Add test suite support for setting concurrency mode ps config field via a) an env var and b) via NeonEnvBuilder. * Hacky helper to have sidecar-based IoConcurrency in tests. This will be cleaned up later. More benchmarking will happen post-merge in nightly benchmarks, plus in staging/pre-prod. Some intermediate helpers for manual benchmarking have been preserved in https://github.com/neondatabase/neon/pull/10466 and will be landed in later PRs. (L0 layer stack generator!) Drive-By: * test suite actually didn't enable batching by default because `config.compatibility_neon_binpath` is always Truthy in our CI environment => https://neondb.slack.com/archives/C059ZC138NR/p1737490501941309 * initial logical size calculation wasn't always polled to completion, which was surfaced through the added WARN logs emitted when dropping a `ValuesReconstructState` that still has inflight IOs. * remove the timing histograms `pageserver_getpage_get_reconstruct_data_seconds` and `pageserver_getpage_reconstruct_seconds` because with planning, value read IO, and walredo happening concurrently, one can no longer attribute latency to any one of them; we'll revisit this when Vlad's work on tracing/sampling through RequestContext lands. * remove code related to `get_cached_lsn()`. The logic around this has been dead at runtime for a long time, ever since the removal of the materialized page cache in #8105. ## Testing Unit tests use the sidecar task by default and run both modes in CI. Python regression tests and benchmarks also use the sidecar task by default. We'll test more in staging and possibly preprod. # Future Work Please refer to the parent epic for the full plan. The next step will be to fold the plumbing of IoConcurrency into RequestContext so that the function signatures get cleaned up. Once `Sequential` isn't used anymore, we can take the next big leap which is replacing the opaque IOs with structs that have well-defined semantics. --------- Co-authored-by: Christian Schwarz <christian@neon.tech>	2025-01-22 15:30:23 +00:00
John Spray	0d4fce2d35	tests: refine how compat snapshot is generated (#10342 ) ## Problem I noticed in https://github.com/neondatabase/neon/pull/9537 that tests which work with compat snapshots were writing several hundred MB of data, which isn't really necessary. Also, the snapshots are large but don't have the proper variety of storage format features, e.g. they could just have L0 deltas. ## Summary of changes - Use smaller scale factor and runtime to generate less data - Configure a small layer size and use force image layer generation so that our output contains L1 deltas and image layers, and has a decent number of entries in the layer map	2025-01-10 13:57:23 +00:00
Alexander Bayandin	8d1c44039e	Python 3.11 (#9515 ) ## Problem On Debian 12 (Bookworm), Python 3.11 is the latest available version. ## Summary of changes - Update Python to 3.11 in build-tools - Fix ruff check / format - Fix mypy - Use `StrEnum` instead of pair `str`, `Enum` - Update docs	2024-11-21 16:25:31 +00:00
a-masterov	091a175a3e	Test versions mismatch (#9167 ) ## Problem We faced the problem of incompatibility of the different components of different versions. This should be detected automatically to prevent production bugs. ## Summary of changes The test for this situation was implemented Co-authored-by: Alexander Bayandin <alexander@neon.tech>	2024-10-11 15:29:54 +02:00
Tristan Partin	5bd8e2363a	Enable all pyupgrade checks in ruff This will help to keep us from using deprecated Python features going forward. Signed-off-by: Tristan Partin <tristan@neon.tech>	2024-10-08 14:32:26 -05:00
Heikki Linnakangas	8ef0c38b23	tests: Rename NeonLocalCli functions to match the 'neon_local' commands (#9195 ) This makes it more clear that the functions in NeonLocalCli are just typed wrappers around the corresponding 'neon_local' commands.	2024-10-03 22:03:27 +03:00
a-masterov	5dc68e4e6a	test_compatibility: fix the regexes detecting the version (#9205 ) ## Problem The Neon components, built locally and by the GitHub workflow have slightly different version prefixes (git: vs git-env:) This does not allow running tests against local builds correctly. ## Summary of changes The regular expressions were changed to work with both prefixes.	2024-09-30 16:37:14 +02:00
Alexander Bayandin	7fdf1ab5b6	CI: run compatibility tests on Postgres 17 (#9145 ) ## Problem The latest storage release has generated artifacts for Postgres 17, so we can enable compatibility tests this version ## Summary of changes - Unskip `test_backward_compatibility` / `test_forward_compatibility` on Postgres 17	2024-09-26 15:17:01 +01:00
Matthias van de Meent	78938d1b59	[compute/postgres] feature: PostgreSQL 17 (#8573 ) This adds preliminary PG17 support to Neon, based on RC1 / 2024-09-04 `07b828e9d4` NOTICE: The data produced by the included version of the PostgreSQL fork may not be compatible with the future full release of PostgreSQL 17 due to expected or unexpected future changes in magic numbers and internals. DO NOT EXPECT DATA IN V17-TENANTS TO BE COMPATIBLE WITH THE 17.0 RELEASE! Co-authored-by: Anastasia Lubennikova <anastasia@neon.tech> Co-authored-by: Alexander Bayandin <alexander@neon.tech> Co-authored-by: Konstantin Knizhnik <knizhnik@neon.tech> Co-authored-by: Heikki Linnakangas <heikki@neon.tech>	2024-09-12 23:18:41 +01:00
Heikki Linnakangas	8dc069037b	Remove NeonEnvBuilder.start() function It feels wrong to me to start() from the builder object. Surely the thing you start is the environment itself, not its configuration.	2024-09-12 01:28:56 +03:00
Christian Schwarz	bf0531d107	fixup(#8839 ): `test_forward_compatibility` needs to allow lag warning as well (#8891 ) Found in https://neon-github-public-dev.s3.amazonaws.com/reports/pr-8885/10665614629/index.html#suites/0fbaeb107ef328d03993d44a1fb15690/ea10ba1c140fba1d	2024-09-02 15:10:10 +01:00
Christian Schwarz	c2f8fdccd7	ingest: rate-limited warning if WAL commit timestamps lags for > wait_lsn_timeout (#8839 ) refs https://github.com/neondatabase/cloud/issues/13750 The logging in this commit will make it easier to detect lagging ingest. We're trusting compute timestamps --- ideally we'd use SK timestmaps instead. But trusting the compute timestamp is ok for now.	2024-08-29 12:06:00 +01:00
Joonas Koivunen	b1c457898b	test_compatibility: flush in the end (#8804 ) `test_forward_compatibility` is still often failing at graceful shutdown. Fix this by explicit flush before shutdown. Example: https://neon-github-public-dev.s3.amazonaws.com/reports/main/10506613738/index.html#testresult/5e7111907f7ecfb2/ Cc: #8655 and #8708 Previous attempt: #8787	2024-08-22 16:38:03 +01:00
Joonas Koivunen	b83d722369	test: fix more flaky due to graceful shutdown (#8787 ) Going through the list of recent flaky tests, trying to fix those related to graceful shutdown. - test_forward_compatibility: flush and wait for uploads to avoid graceful shutdown - test_layer_bloating: in the end the endpoint and vanilla are still up => immediate shutdown - test_lagging_sk: pageserver shutdown is not related to the test => immediate shutdown - test_lsn_lease_size: pageserver flushing is not needed => immediate shutdown Additionally: - remove `wait_for_upload` usage from workload fixture Cc: #8708 Fixes: #8710	2024-08-21 17:22:47 +01:00
Alexander Bayandin	c96593b473	Make Postgres 16 default version (#8745 ) ## Problem The default Postgres version is set to 15 in code, while we use 16 in most of the other places (and Postgres 17 is coming) ## Summary of changes - Run `benchmarks` job with Postgres 16 (instead of Postgres 14) - Set `DEFAULT_PG_VERSION` to 16 in all places - Remove deprecated `--pg-version` pytest argument - Update `test_metadata_bincode_serde_ensure_roundtrip` for Postgres 16	2024-08-20 10:46:58 +01:00
Yuchen Liang	ed5724d79d	scrubber: clean up `scan_metadata` before prod (#8565 ) Part of #8128. ## Problem Currently, scrubber `scan_metadata` command will return with an error code if the metadata on remote storage is corrupted with fatal errors. To safely deploy this command in a cronjob, we want to differentiate between failures while running scrubber command and the erroneous metadata. At the same time, we also want our regression tests to catch corrupted metadata using the scrubber command. ## Summary of changes - Return with error code only when the scrubber command fails - Uses explicit checks on errors and warnings to determine metadata health in regression tests. Resolve conflict with `tenant-snapshot` command (after shard split): [`test_scrubber_tenant_snapshot`](https://github.com/neondatabase/neon/blob/yuchen/scrubber-scan-cleanup-before-prod/test_runner/regress/test_storage_scrubber.py#L23) failed before applying `422a8443dd` - When taking a snapshot, the old `index_part.json` in the unsharded tenant directory is not kept. - The current `list_timeline_blobs` implementation consider no `index_part.json` as a parse error. - During the scan, we are only analyzing shards with highest shard count, so we will not get a parse error. but we do need to add the layers to tenant object listing, otherwise we will get index is referencing a layer that is not in remote storage error. - Action: Add s3_layers from `list_timeline_blobs` regardless of parsing error Signed-off-by: Yuchen Liang <yuchen@neon.tech>	2024-08-06 18:55:42 +01:00
John Spray	0a667bc8ef	tests: add test_historic_storage_formats (#8423 ) ## Problem Currently, our backward compatibility tests only look one release back. That means, for example, that when we switch on image layer compression by default, we'll test reading of uncompressed layers for one release, and then stop doing it. When we make an index_part.json format change, we'll test against the old format for a week, then stop (unless we write separate unit tests for each old format). The reality in the field is that data in old formats will continue to exist for weeks/months/years. When we make major format changes, we should retain examples of the old format data, and continuously verify that the latest code can still read them. This test uses contents from a new path in the public S3 bucket, `compatibility-data-snapshots/`. It is populated by hand. The first important artifact is one from before we switch on compression, so that we will keep testing reads of uncompressed data. We will generate more artifacts ahead of other key changes, like when we update remote storage format for archival timelines. Closes: https://github.com/neondatabase/cloud/issues/15576	2024-08-02 18:28:23 +01:00
Vlad Lazar	cdaa2816e7	pageserver: make vectored get the default read path for the pageserver (#8384 ) ## Problem Vectored get is already enabled in all prod regions without validation. The pageserver defaults are out of sync however. ## Summary of changes Update the pageserver defaults to match the prod config. Also means that when running tests locally, people don't have to use the env vars to get the prod config.	2024-07-26 14:19:52 +01:00
Christian Schwarz	a2d170b6d0	NeonEnv.from_repo_dir: use storage_controller_db instead of `attachments.json` (#8382 ) When `NeonEnv.from_repo_dir` was introduced, storage controller stored its state exclusively `attachments.json`. Since then, it has moved to using Postgres, which stores its state in `storage_controller_db`. But `NeonEnv.from_repo_dir` wasn't adjusted to do this. This PR rectifies the situation. Context for this is failures in `test_pageserver_characterize_throughput_with_n_tenants` CF: https://neondb.slack.com/archives/C033RQ5SPDH/p1721035799502239?thread_ts=1720901332.293769&cid=C033RQ5SPDH Notably, `from_repo_dir` is also used by the backwards- and forwards-compatibility. Thus, the changes in this PR affect those tests as well. However, it turns out that the compatibility snapshot already contains the `storage_controller_db`. Thus, it should just work and in fact we can remove hacks like `fixup_storage_controller`. Follow-ups created as part of this work: * https://github.com/neondatabase/neon/issues/8399 * https://github.com/neondatabase/neon/issues/8400	2024-07-18 10:56:07 +02:00
John Spray	814c8e8f68	storage controller: add node deletion API (#8226 ) ## Problem In anticipation of later adding a really nice drain+delete API, I initially only added an intentionally basic `/drop` API that is just about usable for deleting nodes in a pinch, but requires some ugly storage controller restarts to persuade it to restart secondaries. ## Summary of changes I started making a few tiny fixes, and ended up writing the delete API... - Quality of life nit: ordering of node + tenant listings in storcon_cli - Papercut: Fix the attach_hook using the wrong operation type for reporting slow locks - Make Service::spawn tolerate `generation_pageserver` columns that point to nonexistent node IDs. I started out thinking of this as a general resilience thing, but when implementing the delete API I realized it was actually a legitimate end state after the delete API is called (as that API doesn't wait for all reconciles to succeed). - Add a `DELETE` API for nodes, which does not gracefully drain, but does reschedule everything. This becomes safe to use when the system is in any state, but will incur availability gaps for any tenants that weren't already live-migrated away. If tenants have already been drained, this becomes a totally clean + safe way to decom a node. - Add a test and a storcon_cli wrapper for it This is meant to be a robust initial API that lets us remove nodes without doing ugly things like restarting the storage controller -- it's not quite a totally graceful node-draining routine yet. There's more work in https://github.com/neondatabase/neon/issues/8333 to get to our end-end state.	2024-07-11 17:05:47 +01:00
Alex Chi Z	30d15ad403	chore(test): add version check for forward compat test (#7685 ) A test for https://github.com/neondatabase/neon/pull/7684. This pull request checks if the pageserver version we specified is the one actually running by comparing the git hash in forward compatibility tests. --------- Signed-off-by: Alex Chi Z <chi@neon.tech>	2024-05-14 10:36:48 -04:00
Joonas Koivunen	d9dcbffac3	python: allow using allowed_errors.py (#7719 ) See #7718. Fix it by renaming all `types.py` to `common_types.py`. Additionally, add an advert for using `allowed_errors.py` to test any added regex.	2024-05-13 15:16:23 +03:00
Christian Schwarz	ea531d448e	fix(test suite): forward compat test is not using latest neon_local (#7637 ) The `test_forward_compatibility` test runs the old production binaries, but is supposed to always run the latest neon_local binary. I think commit `6acbee23` broke that by accident because in that commit, `from_repo_dir` is introduced and runs an `init_start()` before the `test_forward_compatibility` gets a chance to patch up the neon_local_binpath.	2024-05-07 15:43:04 +00:00
Vlad Lazar	1f417af9fd	pagserver: use vectored read path in benchmarks (#7498 ) ## Problem Benchmarks don't use the vectored read path. ## Summary of changes * Update the benchmarks to use the vectored read path for both singular and vectored gets. * Disable validation for the benchmarks	2024-04-29 17:26:35 +01:00
Vlad Lazar	e4a279db13	pageserver: coalesce read paths (#7477 ) ## Problem We are currently supporting two read paths. No bueno. ## Summary of changes High level: use vectored read path to serve get page requests - gated by `get_impl` config Low level: 1. Add ps config, `get_impl` to specify which read path to use when serving get page requests 2. Fix base cached image handling for the vectored read path. This was subtly broken: previously we would not mark keys that went past their cached lsn as complete. This is a self standing change which could be its own PR, but I've included it here because writing separate tests for it is tricky. 3. Fork get page to use either the legacy or vectored implementation 4. Validate the use of vectored read path when serving get page requests against the legacy implementation. Controlled by `validate_vectored_get` ps config. 5. Use the vectored read path to serve get page requests in tests (with validation). ## Note Since the vectored read path does not go through the page cache to read buffers, this change also amounts to a removal of the buffer page cache. Materialized page cache is still used.	2024-04-25 13:29:17 +01:00
Arthur Petukhovsky	db72543f4d	Reenable test_forward_compatibility (#7358 ) It was disabled due to https://github.com/neondatabase/neon/pull/6530 breaking forward compatiblity. Now that we have deployed it to production, we can reenable the test	2024-04-11 12:31:27 +02:00
Arthur Petukhovsky	3f77f26aa2	Upload partial segments (#6530 ) Add support for backing up partial segments to remote storage. Disabled by default, can be enabled with `--partial-backup-enabled`. Safekeeper timeline has a background task which is subscribed to `commit_lsn` and `flush_lsn` updates. After the partial segment was updated (`flush_lsn` was changed), the segment will be uploaded to S3 in about 15 minutes. The filename format for partial segments is `Segment_Term_Flush_Commit_skNN.partial`, where: - `Segment` – the segment name, like `000000010000000000000001` - `Term` – current term - `Flush` – flush_lsn in hex format `{:016X}`, e.g. `00000000346BC568` - `Commit` – commit_lsn in the same hex format - `NN` – safekeeper_id, like `1` The full object name example: `000000010000000000000002_2_0000000002534868_0000000002534410_sk1.partial` Each safekeeper will keep info about remote partial segments in its control file. Code updates state in the control file before doing any S3 operations. This way control file stores information about all potentially existing remote partial segments and can clean them up after uploading a newer version. Closes #6336	2024-04-03 15:20:51 +00:00
John Spray	a5777bab09	tests: clean up compat test workarounds (#7097 ) - Cleanup from https://github.com/neondatabase/neon/pull/7040#discussion_r1521120263 -- in that PR, we needed to let compat tests manually register a node, because it would run an old binary that doesn't self-register. - Cleanup vectored get config workaround - Cleanup a log allow list for which the underlying log noise has been fixed.	2024-04-02 16:46:24 +01:00
John Spray	2713142308	tests: stabilize compat tests (#7227 ) This test had two flaky failure modes: - pageserver log error for timeline not found: this resulted from changes for DR when timeline destroy/create was added, but endpoint was left running during that operation. - storage controller log error because the test was running for long enough that a background reconcile happened at almost the exact moment of test teardown, and our test fixtures tear down the pageservers before the controller. Closes: #7224	2024-03-25 14:35:24 +00:00
John Spray	b80704cd34	tests: log hygiene checks for storage controller (#6710 ) ## Problem As with the pageserver, we should fail tests that emit unexpected log errors/warnings. ## Summary of changes - Refactor existing log checks to be reusable - Run log checks for attachment_service - Add allow lists as needed.	2024-03-19 10:30:33 +00:00
John Spray	7ae8364b0b	storage controller: register nodes in re-attach request (#7040 ) ## Problem Currently we manually register nodes with the storage controller, and use a script during deploy to register with the cloud control plane. Rather than extend that script further, nodes should just register on startup. ## Summary of changes - Extend the re-attach request to include an optional NodeRegisterRequest - If the `register` field is set, handle it like a normal node registration before executing the normal re-attach work. - Update tests/neon_local that used to rely on doing an explicit register step that could be enabled/disabled. --------- Co-authored-by: Christian Schwarz <christian@neon.tech>	2024-03-12 14:47:12 +00:00
John Spray	89cf714890	tests/neon_local: rename "attachment service" -> "storage controller" (#7087 ) Not a user-facing change, but can break any existing `.neon` directories created by neon_local, as the name of the database used by the storage controller changes. This PR changes all the locations apart from the path of `control_plane/attachment_service` (waiting for an opportune moment to do that one, because it's the most conflict-ish wrt ongoing PRs like #6676 )	2024-03-12 11:36:27 +00:00
Vlad Lazar	5d6083bfc6	pageserver: add vectored get implementation (#6576 ) This PR introduces a new vectored implementation of the read path. The search is basically a DFS if you squint at it long enough. LayerFringe tracks the next layers to visit and acts as our stack. Vertices are tuples of (layer, keyspace, lsn range). Continuously pop the top of the stack (most recent layer) and do all the reads for one layer at once. The search maintains a fringe (`LayerFringe`) which tracks all the layers that intersect the current keyspace being searched. Continuously pop the top of the fringe (layer with highest LSN) and get all the data required from the layer in one go. Said search is done on one timeline at a time. If data is still required for some keys, then search the ancestor timeline. Apart from the high level layer traversal, vectored variants have been introduced for grabbing data from each layer type. They still suffer from read amplification issues and that will be addressed in a different PR. You might notice that in some places we duplicate the code for the existing read path. All of that code will be removed when we switch the non-vectored read path to proxy into the vectored read path. In the meantime, we'll have to contend with the extra cruft for the sake of testing and gentle releasing.	2024-02-21 09:49:46 +00:00
Alexander Bayandin	61f99d703d	test_create_snapshot: do not try to copy pg_dynshmem dir (#6796 ) ## Problem `test_create_snapshot` is flaky[0] on CI and fails constantly on macOS, but with a slightly different error: ``` shutil.Error: [('/Users/bayandin/work/neon/test_output/test_create_snapshot[release-pg15-1-100]/repo/endpoints/ep-1/pgdata/pg_dynshmem', '/Users/bayandin/work/neon/test_output/compatibility_snapshot_pgv15/repo/endpoints/ep-1/pgdata/pg_dynshmem', "[Errno 2] No such file or directory: '/Users/bayandin/work/neon/test_output/test_create_snapshot[release-pg15-1-100]/repo/endpoints/ep-1/pgdata/pg_dynshmem'")] ``` Also (on macOS) `repo/endpoints/ep-1/pgdata/pg_dynshmem` is a symlink to `/dev/shm/`. - [0] https://github.com/neondatabase/neon/issues/6784 ## Summary of changes Ignore `pg_dynshmem` directory while copying a snapshot	2024-02-18 12:16:07 +00:00
Heikki Linnakangas	d77583c86a	tests: Remove obsolete allowlist entries Commit `9a6c0be823` removed the code that printed these warnings: marking {} as locally complete, while it doesnt exist in remote index No timelines to attach received Remove those warnings from all the allowlists in tests.	2024-02-11 01:34:31 +02:00
John Spray	58f6cb649e	control_plane: database persistence for attachment_service (#6468 ) ## Problem Spun off from https://github.com/neondatabase/neon/pull/6394 -- this PR is just the persistence parts and the changes that enable it to work nicely ## Summary of changes - Revert #6444 and #6450 - In neon_local, start a vanilla postgres instance for the attachment service to use. - Adopt `diesel` crate for database access in attachment service. This uses raw SQL migrations as the source of truth for the schema, so it's a soft dependency: we can switch libraries pretty easily. - Rewrite persistence.rs to use postgres (via diesel) instead of JSON. - Preserve JSON read+write at startup and shutdown: this enables using the JSON format in compatibility tests, so that we don't have to commit to our DB schema yet. - In neon_local, run database creation + migrations before starting attachment service - Run the initial reconciliation in Service::spawn in the background, so that the pageserver + attachment service don't get stuck waiting for each other to start, when restarting both together in a test.	2024-01-26 17:20:44 +00:00
Arpad Müller	faf275d4a2	Remove initdb on timeline delete (#6387 ) This PR: * makes `initdb.tar.zst` be deleted by default on timeline deletion (#6226), mirroring the safekeeper: https://github.com/neondatabase/neon/pull/6381 * adds a new `preserve_initdb_archive` endpoint for a timeline, to be used during the disaster recovery process, see reasoning [here](https://github.com/neondatabase/neon/issues/6226#issuecomment-1894574778) * makes the creation code look for `initdb-preserved.tar.zst` in addition to `initdb.tar.zst`. * makes the tests use the new endpoint fixes #6226	2024-01-23 18:22:59 +00:00
Alexander Bayandin	7de829e475	test_runner: replace black with ruff format (#6268 ) ## Problem `black` is slow sometimes, we can replace it with `ruff format` (a new feature in 0.1.2 [0]), which produces pretty similar to black style [1]. On my local machine (MacBook M1 Pro 16GB): ``` # `black` on main $ hyperfine "BLACK_CACHE_DIR=/dev/null poetry run black ." Benchmark 1: BLACK_CACHE_DIR=/dev/null poetry run black . Time (mean ± σ): 3.131 s ± 0.090 s [User: 5.194 s, System: 0.859 s] Range (min … max): 3.047 s … 3.354 s 10 runs ``` ``` # `ruff format` on the current PR $ hyperfine "RUFF_NO_CACHE=true poetry run ruff format" Benchmark 1: RUFF_NO_CACHE=true poetry run ruff format Time (mean ± σ): 300.7 ms ± 50.2 ms [User: 259.5 ms, System: 76.1 ms] Range (min … max): 267.5 ms … 420.2 ms 10 runs ``` ## Summary of changes - Replace `black` with `ruff format` everywhere - [0] https://docs.astral.sh/ruff/formatter/ - [1] https://docs.astral.sh/ruff/formatter/#black-compatibility	2024-01-05 15:35:07 +00:00
Alexander Bayandin	0cd49cac84	test_compatibility: make it use initdb.tar.zst	2023-12-13 15:04:25 -06:00
Alexander Bayandin	6acbee2368	test_runner: add `from_repo_dir` method (#6087 ) ## Problem We need a reliable way to restore a project state (in this context, I mean data on pageservers, safekeepers, and remote storage) from a snapshot. The existing method (that we use in `test_compatibility`) heavily relies on config files, which makes it harder to add/change fields in the config. The proposed solution uses config file only to get `default_tenant_id` and `branch_name_mappings`. ## Summary of changes - Add `NeonEnvBuilder#from_repo_dir` method, which allows using the `neon_env_builder` fixture with data from a snapshot. - Use `NeonEnvBuilder#from_repo_dir` in compatibility tests Requires for https://github.com/neondatabase/neon/issues/6033	2023-12-12 16:24:13 +00:00

1 2

92 Commits