Compare commits

..

12 Commits

Author SHA1 Message Date
Heikki Linnakangas
2efad33ab9 Temporarily disable submodules check
It failed on a PR that looks totally fine:
https://github.com/neondatabase/neon/actions/runs/10993839972/job/30526595474?pr=9084

Disable it to unblock that work and other submodule changes, while we
investigate.
2024-09-23 18:52:32 +03:00
Nikita Kalyanov
f446e08fb8 change HTTP method to comply with spec (#9100)
There is discrepancy with the spec, it has PUT
2024-09-23 15:53:06 +02:00
Christian Schwarz
4d5add9ca0 compact_level0_phase1: remove final traces of value access mode config (#8935)
refs https://github.com/neondatabase/neon/issues/8184
stacked atop https://github.com/neondatabase/neon/pull/8934

This PR changes from ignoring the config field to rejecting configs that
contain it.

PR https://github.com/neondatabase/infra/pull/1903 removes the field
usage from `pageserver.toml`.

It rolls into prod sooner or in the same release as this PR.
2024-09-23 15:05:22 +02:00
Christian Schwarz
59b4c2eaf9 walredo: add a ping method (#8952)
Not used in production, but in benchmarks, to demonstrate minimal RTT.
(It would be nice to not have to copy the 8KiB of zeroes, but, that
would require larger protocol changes).

Found this useful in investigation
https://github.com/neondatabase/neon/pull/8952.
2024-09-23 10:19:37 +00:00
Vlad Lazar
5432155b0d storcon: update compute hook state on detach (#9045)
## Problem

Previously, the storage controller may send compute notifications
containing stale pageservers (i.e. pageserver serving the shard was
detached). This happened because detaches did not update the compute
hook state.

## Summary of Changes

Update compute hook state on shard detach.

Fixes #8928
2024-09-23 10:05:02 +01:00
Heikki Linnakangas
e16e82749f Remove unused crates from workspace Cargo.toml
These were not referenced in any of the other Cargo.toml files in the
workspace. They were not being built because of that, so there was
little harm in having them listed, but let's be tidy.
2024-09-23 00:37:41 +03:00
Heikki Linnakangas
9f653893b9 Update a few dependencies, removing some indirect dependencies
cargo update ciborium iana-time-zone lazy_static schannel uuid
    cargo update hyper@0.14
    cargo update  --precise 2.9.7 ureq

It might be worthwhile just update all our dependencies at some point,
but this is aimed at pruning the dependency tree, to make the build a
little faster. That's also why I didn't update ureq to the latest
version: that would've added a dependency to yet another version of
rustls.
2024-09-23 00:37:41 +03:00
Heikki Linnakangas
913af44219 Update "memoffset" crate
To eliminate one version of it from our dependency tree.
2024-09-23 00:37:41 +03:00
Heikki Linnakangas
ecd615ab6d Update "hostname" crate
We were already building v0.4.0 as an indirect dependency, so this
avoids having to build two different versions of it.
2024-09-23 00:37:41 +03:00
Heikki Linnakangas
c9b2ec9ff1 Check submodule forward progress (#8949)
We frequently mess up our submodule references. This adds one safeguard:
it checks that the submodule references are only updated "forwards", not
to some older commit, or a commit that's not a descended of the previous 
one.

As next step, I'm thinking that we should automate things so that when
you merge a PR to the 'neon' repository that updates the submodule
references, the REL_*_STABLE_neon branches are automatically updated to
match the submodule references. That way, you never need to manually
merge PRs in the postgres repository, it's all triggered from commits in
the 'neon' repository. But that's not included here.
2024-09-22 21:46:53 +03:00
Arpad Müller
a3800dcb0c Move load_timeline_metadata into separate function (#9080)
Moves the per-timeline code to load timeline metadata into a new
dedicated function called `load_timeline_metadata`. The old
`load_timeline_metadata` becomes `load_timelines_metadata`.

Split out of #8907

Part of #8088
2024-09-21 12:36:41 +00:00
Heikki Linnakangas
9a32aa828d Fix init of WAL page header at startup (#8914)
If the primary is started at an LSN within the first of a 16 MB WAL
segment, the "long XLOG page header" at the beginning of the segment was
not initialized correctly. That has gone unnnoticed, because under
normal circumstances, nothing looks at the page header. The WAL that is
streamed to the safekeepers starts at the new record's LSN, not at the
beginning of the page, so that bogus page header didn't propagate
elsewhere, and a primary server doesn't normally read the WAL its
written. Which is good because the contents of the page would be bogus
anyway, as it wouldn't contain any of the records before the LSN where
the new record is written.

Except that in the following cases a primary does read its own WAL:

1. When there are two-phase transactions in prepared state at
checkpoint. The checkpointer reads the two-phase state from the
XLOG_XACT_PREPARE record, and writes it to a file in pg_twophase/.

2. Logical decoding reads the WAL starting from the replication slot's
restart LSN.

This PR fixes the problem with two-phase transactions. For that, it's
sufficient to initialize the page header correctly. The checkpointer
only needs to read XLOG_XACT_PREPARE records that were generated after
the server startup, so it's still OK that older WAL is missing / bogus.

I have not investigated if we have a problem with logical decoding,
however. Let's deal with that separately.

Special thanks to @Lzjing-1997, who independently found the same bug
and opened a PR to fix it, although I did not use that PR.
2024-09-21 04:00:38 +03:00
26 changed files with 458 additions and 502 deletions

View File

@@ -120,6 +120,59 @@ jobs:
- name: Run mypy to check types
run: poetry run mypy .
# Check that the vendor/postgres-* submodules point to the
# corresponding REL_*_STABLE_neon branches.
check-submodules:
runs-on: ubuntu-22.04
steps:
- name: Checkout
uses: actions/checkout@v4
with:
submodules: true
- uses: dorny/paths-filter@v3
id: check-if-submodules-changed
with:
filters: |
vendor:
- 'vendor/**'
- name: Check vendor/postgres-v14 submodule reference
if: steps.check-if-submodules-changed.outputs.vendor == 'true'
uses: jtmullen/submodule-branch-check-action@v1
with:
path: "vendor/postgres-v14"
fetch_depth: "50"
sub_fetch_depth: "50"
pass_if_unchanged: true
- name: Check vendor/postgres-v15 submodule reference
if: steps.check-if-submodules-changed.outputs.vendor == 'true'
uses: jtmullen/submodule-branch-check-action@v1
with:
path: "vendor/postgres-v15"
fetch_depth: "50"
sub_fetch_depth: "50"
pass_if_unchanged: true
- name: Check vendor/postgres-v16 submodule reference
if: steps.check-if-submodules-changed.outputs.vendor == 'true'
uses: jtmullen/submodule-branch-check-action@v1
with:
path: "vendor/postgres-v16"
fetch_depth: "50"
sub_fetch_depth: "50"
pass_if_unchanged: true
- name: Check vendor/postgres-v17 submodule reference
if: steps.check-if-submodules-changed.outputs.vendor == 'true'
uses: jtmullen/submodule-branch-check-action@v1
with:
path: "vendor/postgres-v17"
fetch_depth: "50"
sub_fetch_depth: "50"
pass_if_unchanged: true
check-codestyle-rust:
needs: [ check-permissions, build-build-tools-image ]
strategy:
@@ -1154,6 +1207,8 @@ jobs:
# Usually we do `needs: [...]`
needs:
- build-and-test-locally
# XXX: Temporarily disabled, while we investigate an unexpected failure with it
#- check-submodules
- check-codestyle-python
- check-codestyle-rust
- promote-images

307
Cargo.lock generated
View File

@@ -255,12 +255,6 @@ dependencies = [
"syn 2.0.52",
]
[[package]]
name = "atomic"
version = "0.5.3"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "c59bdb34bc650a32731b31bd8f0829cc15d24a708ee31559e0bb34f2bc320cba"
[[package]]
name = "atomic-take"
version = "1.1.0"
@@ -295,8 +289,8 @@ dependencies = [
"fastrand 2.0.0",
"hex",
"http 0.2.9",
"hyper 0.14.26",
"ring 0.17.6",
"hyper 0.14.30",
"ring",
"time",
"tokio",
"tracing",
@@ -486,7 +480,7 @@ dependencies = [
"once_cell",
"p256 0.11.1",
"percent-encoding",
"ring 0.17.6",
"ring",
"sha2",
"subtle",
"time",
@@ -593,7 +587,7 @@ dependencies = [
"http 0.2.9",
"http-body 0.4.5",
"http-body 1.0.0",
"hyper 0.14.26",
"hyper 0.14.30",
"hyper-rustls 0.24.0",
"once_cell",
"pin-project-lite",
@@ -684,7 +678,7 @@ dependencies = [
"futures-util",
"http 0.2.9",
"http-body 0.4.5",
"hyper 0.14.26",
"hyper 0.14.30",
"itoa",
"matchit 0.7.0",
"memchr",
@@ -1089,9 +1083,9 @@ dependencies = [
[[package]]
name = "ciborium"
version = "0.2.1"
version = "0.2.2"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "effd91f6c78e5a4ace8a5d3c0b6bfaec9e2baaef55f3efc00e45fb2e477ee926"
checksum = "42e69ffd6f0917f5c029256a24d0161db17cea3997d185db0d35926308770f0e"
dependencies = [
"ciborium-io",
"ciborium-ll",
@@ -1100,18 +1094,18 @@ dependencies = [
[[package]]
name = "ciborium-io"
version = "0.2.1"
version = "0.2.2"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "cdf919175532b369853f5d5e20b26b43112613fd6fe7aee757e35f7a44642656"
checksum = "05afea1e0a06c9be33d539b876f1ce3692f4afea2cb41f740e7743225ed1c757"
[[package]]
name = "ciborium-ll"
version = "0.2.1"
version = "0.2.2"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "defaa24ecc093c77630e6c15e17c51f5e187bf35ee514f4e2d67baaa96dae22b"
checksum = "57663b653d948a338bfb3eeba9bb2fd5fcfaecb9e199e87e1eda4d9e8b240fd9"
dependencies = [
"ciborium-io",
"half 1.8.2",
"half",
]
[[package]]
@@ -1224,7 +1218,7 @@ dependencies = [
"compute_api",
"flate2",
"futures",
"hyper 0.14.26",
"hyper 0.14.30",
"nix 0.27.1",
"notify",
"num_cpus",
@@ -1330,7 +1324,7 @@ dependencies = [
"git-version",
"humantime",
"humantime-serde",
"hyper 0.14.26",
"hyper 0.14.30",
"nix 0.27.1",
"once_cell",
"pageserver_api",
@@ -2304,12 +2298,6 @@ dependencies = [
"tracing",
]
[[package]]
name = "half"
version = "1.8.2"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "eabb4a44450da02c90444cf74558da904edde8fb4e9035a9a6a4e15445af0bd7"
[[package]]
name = "half"
version = "2.4.1"
@@ -2411,17 +2399,6 @@ dependencies = [
"digest",
]
[[package]]
name = "hostname"
version = "0.3.1"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "3c731c3e10504cc8ed35cfe2f1db4c9274c3d35fa486e3b31df46f068ef3e867"
dependencies = [
"libc",
"match_cfg",
"winapi",
]
[[package]]
name = "hostname"
version = "0.4.0"
@@ -2430,7 +2407,7 @@ checksum = "f9c7c7c8ac16c798734b8a24560c1362120597c40d5e1459f09498f8f6c8f2ba"
dependencies = [
"cfg-if",
"libc",
"windows 0.52.0",
"windows",
]
[[package]]
@@ -2539,9 +2516,9 @@ dependencies = [
[[package]]
name = "hyper"
version = "0.14.26"
version = "0.14.30"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "ab302d72a6f11a3b910431ff93aae7e773078c769f0a3ef15fb9ec692ed147d4"
checksum = "a152ddd61dfaec7273fe8419ab357f33aee0d914c5f4efbf0d96fa749eea5ec9"
dependencies = [
"bytes",
"futures-channel",
@@ -2554,7 +2531,7 @@ dependencies = [
"httpdate",
"itoa",
"pin-project-lite",
"socket2 0.4.9",
"socket2",
"tokio",
"tower-service",
"tracing",
@@ -2589,7 +2566,7 @@ source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "0646026eb1b3eea4cd9ba47912ea5ce9cc07713d105b1a14698f4e6433d348b7"
dependencies = [
"http 0.2.9",
"hyper 0.14.26",
"hyper 0.14.30",
"log",
"rustls 0.21.11",
"rustls-native-certs 0.6.2",
@@ -2620,7 +2597,7 @@ version = "0.4.1"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "bbb958482e8c7be4bc3cf272a766a2b0bf1a6755e7a6ae777f017a31d11b13b1"
dependencies = [
"hyper 0.14.26",
"hyper 0.14.30",
"pin-project-lite",
"tokio",
"tokio-io-timeout",
@@ -2639,7 +2616,7 @@ dependencies = [
"http-body 1.0.0",
"hyper 1.2.0",
"pin-project-lite",
"socket2 0.5.5",
"socket2",
"tokio",
"tower",
"tower-service",
@@ -2648,16 +2625,16 @@ dependencies = [
[[package]]
name = "iana-time-zone"
version = "0.1.56"
version = "0.1.61"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "0722cd7114b7de04316e7ea5456a0bbb20e4adb46fd27a3697adb812cff0f37c"
checksum = "235e081f3925a06703c2d0117ea8b91f042756fd6e7a6e5d901e8ca1a996b220"
dependencies = [
"android_system_properties",
"core-foundation-sys",
"iana-time-zone-haiku",
"js-sys",
"wasm-bindgen",
"windows 0.48.0",
"windows-core",
]
[[package]]
@@ -2870,7 +2847,7 @@ dependencies = [
"base64 0.21.1",
"js-sys",
"pem",
"ring 0.17.6",
"ring",
"serde",
"serde_json",
"simple_asn1",
@@ -2908,11 +2885,11 @@ dependencies = [
[[package]]
name = "lazy_static"
version = "1.4.0"
version = "1.5.0"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "e2abad23fbc42b3700f2f279844dc832adb2b2eb069b2df918f455c4e18cc646"
checksum = "bbd2bcb4c963f2ddae06a2efc7e9f3591312473c50c6685e1f298068316e66fe"
dependencies = [
"spin 0.5.2",
"spin",
]
[[package]]
@@ -2974,12 +2951,6 @@ dependencies = [
"hashbrown 0.14.5",
]
[[package]]
name = "match_cfg"
version = "0.1.0"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "ffbee8634e0d45d258acb448e7eaab3fce7a0a467395d4d9f228e3c1f01fb2e4"
[[package]]
name = "matchers"
version = "0.1.0"
@@ -3072,15 +3043,6 @@ dependencies = [
"autocfg",
]
[[package]]
name = "memoffset"
version = "0.8.0"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "d61c719bcfbcf5d62b3a09efa6088de8c54bc0bfcd3ea7ae39fcc186108b8de1"
dependencies = [
"autocfg",
]
[[package]]
name = "memoffset"
version = "0.9.0"
@@ -3660,7 +3622,7 @@ dependencies = [
"hex-literal",
"humantime",
"humantime-serde",
"hyper 0.14.26",
"hyper 0.14.30",
"indoc",
"itertools 0.10.5",
"md5",
@@ -3853,7 +3815,7 @@ dependencies = [
"ahash",
"bytes",
"chrono",
"half 2.4.1",
"half",
"hashbrown 0.14.5",
"num",
"num-bigint",
@@ -4140,7 +4102,7 @@ dependencies = [
"crc32c",
"env_logger",
"log",
"memoffset 0.8.0",
"memoffset 0.9.0",
"once_cell",
"postgres",
"regex",
@@ -4350,12 +4312,12 @@ dependencies = [
"hashlink",
"hex",
"hmac",
"hostname 0.3.1",
"hostname",
"http 1.1.0",
"http-body-util",
"humantime",
"humantime-serde",
"hyper 0.14.26",
"hyper 0.14.30",
"hyper 1.2.0",
"hyper-util",
"indexmap 2.0.1",
@@ -4400,7 +4362,7 @@ dependencies = [
"signature 2.2.0",
"smallvec",
"smol_str",
"socket2 0.5.5",
"socket2",
"subtle",
"thiserror",
"tikv-jemalloc-ctl",
@@ -4578,7 +4540,7 @@ source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "48406db8ac1f3cbc7dcdb56ec355343817958a356ff430259bb07baf7607e1e1"
dependencies = [
"pem",
"ring 0.17.6",
"ring",
"time",
"yasna",
]
@@ -4602,7 +4564,7 @@ dependencies = [
"rustls-pki-types",
"ryu",
"sha1_smol",
"socket2 0.5.5",
"socket2",
"tokio",
"tokio-rustls 0.25.0",
"tokio-util",
@@ -4714,7 +4676,7 @@ dependencies = [
"futures-util",
"http-types",
"humantime-serde",
"hyper 0.14.26",
"hyper 0.14.30",
"itertools 0.10.5",
"metrics",
"once_cell",
@@ -4747,7 +4709,7 @@ dependencies = [
"h2 0.3.26",
"http 0.2.9",
"http-body 0.4.5",
"hyper 0.14.26",
"hyper 0.14.30",
"hyper-rustls 0.24.0",
"ipnet",
"js-sys",
@@ -4905,21 +4867,6 @@ dependencies = [
"subtle",
]
[[package]]
name = "ring"
version = "0.16.20"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "3053cf52e236a3ed746dfc745aa9cacf1b791d846bdaf412f60a8d7d6e17c8fc"
dependencies = [
"cc",
"libc",
"once_cell",
"spin 0.5.2",
"untrusted 0.7.1",
"web-sys",
"winapi",
]
[[package]]
name = "ring"
version = "0.17.6"
@@ -4929,8 +4876,8 @@ dependencies = [
"cc",
"getrandom 0.2.11",
"libc",
"spin 0.9.8",
"untrusted 0.9.0",
"spin",
"untrusted",
"windows-sys 0.48.0",
]
@@ -4950,7 +4897,7 @@ source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "496c1d3718081c45ba9c31fbfc07417900aa96f4070ff90dc29961836b7a9945"
dependencies = [
"http 0.2.9",
"hyper 0.14.26",
"hyper 0.14.30",
"lazy_static",
"percent-encoding",
"regex",
@@ -5074,7 +5021,7 @@ source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "7fecbfb7b1444f477b345853b1fce097a2c6fb637b2bfb87e6bc5db0f043fae4"
dependencies = [
"log",
"ring 0.17.6",
"ring",
"rustls-webpki 0.101.7",
"sct",
]
@@ -5086,7 +5033,7 @@ source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "bf4ef73721ac7bcd79b2b315da7779d8fc09718c6b3d2d1b2d94850eb8c18432"
dependencies = [
"log",
"ring 0.17.6",
"ring",
"rustls-pki-types",
"rustls-webpki 0.102.2",
"subtle",
@@ -5143,24 +5090,14 @@ version = "1.3.1"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "5ede67b28608b4c60685c7d54122d4400d90f62b40caee7700e700380a390fa8"
[[package]]
name = "rustls-webpki"
version = "0.100.2"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "e98ff011474fa39949b7e5c0428f9b4937eda7da7848bbb947786b7be0b27dab"
dependencies = [
"ring 0.16.20",
"untrusted 0.7.1",
]
[[package]]
name = "rustls-webpki"
version = "0.101.7"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "8b6275d1ee7a1cd780b64aca7726599a1dbc893b1e64144529e55c3c2f745765"
dependencies = [
"ring 0.17.6",
"untrusted 0.9.0",
"ring",
"untrusted",
]
[[package]]
@@ -5169,9 +5106,9 @@ version = "0.102.2"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "faaa0a62740bedb9b2ef5afa303da42764c012f743917351dc9a237ea1663610"
dependencies = [
"ring 0.17.6",
"ring",
"rustls-pki-types",
"untrusted 0.9.0",
"untrusted",
]
[[package]]
@@ -5205,7 +5142,7 @@ dependencies = [
"git-version",
"hex",
"humantime",
"hyper 0.14.26",
"hyper 0.14.30",
"metrics",
"once_cell",
"parking_lot 0.12.1",
@@ -5262,11 +5199,11 @@ dependencies = [
[[package]]
name = "schannel"
version = "0.1.21"
version = "0.1.23"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "713cfb06c7059f3588fb8044c0fad1d09e3c01d225e25b9220dbfdcf16dbb1b3"
checksum = "fbc91545643bcf3a0bbb6569265615222618bdf33ce4ffbbd13c4bbd4c093534"
dependencies = [
"windows-sys 0.42.0",
"windows-sys 0.52.0",
]
[[package]]
@@ -5290,8 +5227,8 @@ version = "0.7.1"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "da046153aa2352493d6cb7da4b6e5c0c057d8a1d0a9aa8560baffdd945acd414"
dependencies = [
"ring 0.17.6",
"untrusted 0.9.0",
"ring",
"untrusted",
]
[[package]]
@@ -5400,7 +5337,7 @@ version = "0.32.3"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "eba8870c5dba2bfd9db25c75574a11429f6b95957b0a78ac02e2970dd7a5249a"
dependencies = [
"hostname 0.4.0",
"hostname",
"libc",
"os_info",
"rustc_version",
@@ -5712,16 +5649,6 @@ dependencies = [
"serde",
]
[[package]]
name = "socket2"
version = "0.4.9"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "64a4a911eed85daf18834cfaa86a79b7d266ff93ff5ba14005426219480ed662"
dependencies = [
"libc",
"winapi",
]
[[package]]
name = "socket2"
version = "0.5.5"
@@ -5732,12 +5659,6 @@ dependencies = [
"windows-sys 0.48.0",
]
[[package]]
name = "spin"
version = "0.5.2"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "6e63cff320ae2c57904679ba7cb63280a3dc4613885beafb148ee7bf9aa9042d"
[[package]]
name = "spin"
version = "0.9.8"
@@ -5783,7 +5704,7 @@ dependencies = [
"futures-util",
"git-version",
"humantime",
"hyper 0.14.26",
"hyper 0.14.30",
"metrics",
"once_cell",
"parking_lot 0.12.1",
@@ -5812,7 +5733,7 @@ dependencies = [
"git-version",
"hex",
"humantime",
"hyper 0.14.26",
"hyper 0.14.30",
"itertools 0.10.5",
"lasso",
"measured",
@@ -6228,7 +6149,7 @@ dependencies = [
"num_cpus",
"pin-project-lite",
"signal-hook-registry",
"socket2 0.5.5",
"socket2",
"tokio-macros",
"windows-sys 0.48.0",
]
@@ -6288,7 +6209,7 @@ dependencies = [
"pin-project-lite",
"postgres-protocol",
"postgres-types",
"socket2 0.5.5",
"socket2",
"tokio",
"tokio-util",
]
@@ -6300,7 +6221,7 @@ source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "0ea13f22eda7127c827983bdaf0d7fff9df21c8817bab02815ac277a21143677"
dependencies = [
"futures",
"ring 0.17.6",
"ring",
"rustls 0.22.4",
"tokio",
"tokio-postgres",
@@ -6434,7 +6355,7 @@ dependencies = [
"h2 0.3.26",
"http 0.2.9",
"http-body 0.4.5",
"hyper 0.14.26",
"hyper 0.14.30",
"hyper-timeout",
"percent-encoding",
"pin-project",
@@ -6611,7 +6532,7 @@ dependencies = [
name = "tracing-utils"
version = "0.1.0"
dependencies = [
"hyper 0.14.26",
"hyper 0.14.30",
"opentelemetry",
"opentelemetry-otlp",
"opentelemetry-semantic-conventions",
@@ -6714,12 +6635,6 @@ version = "0.2.4"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "f962df74c8c05a667b5ee8bcf162993134c104e96440b663c8daa176dc772d8c"
[[package]]
name = "untrusted"
version = "0.7.1"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "a156c684c91ea7d62626509bce3cb4e1d9ed5c4d978f7b4352658f96a4c26b4a"
[[package]]
name = "untrusted"
version = "0.9.0"
@@ -6728,17 +6643,18 @@ checksum = "8ecb6da28b8a351d773b68d5825ac39017e680750f980f3a1a85cd8dd28a47c1"
[[package]]
name = "ureq"
version = "2.7.1"
version = "2.9.7"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "0b11c96ac7ee530603dcdf68ed1557050f374ce55a5a07193ebf8cbc9f8927e9"
checksum = "d11a831e3c0b56e438a28308e7c810799e3c118417f342d30ecec080105395cd"
dependencies = [
"base64 0.21.1",
"base64 0.22.1",
"log",
"once_cell",
"rustls 0.21.11",
"rustls-webpki 0.100.2",
"rustls 0.22.4",
"rustls-pki-types",
"rustls-webpki 0.102.2",
"url",
"webpki-roots 0.23.1",
"webpki-roots 0.26.1",
]
[[package]]
@@ -6802,7 +6718,7 @@ dependencies = [
"hex",
"hex-literal",
"humantime",
"hyper 0.14.26",
"hyper 0.14.30",
"jsonwebtoken",
"metrics",
"nix 0.27.1",
@@ -6837,11 +6753,10 @@ dependencies = [
[[package]]
name = "uuid"
version = "1.6.1"
version = "1.10.0"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "5e395fcf16a7a3d8127ec99782007af141946b4795001f876d54fb0d55978560"
checksum = "81dfa00651efa65069b0b6b651f4aaa31ba9e3c3ce0137aaad053604ee7e0314"
dependencies = [
"atomic",
"getrandom 0.2.11",
"serde",
]
@@ -7075,15 +6990,6 @@ dependencies = [
"wasm-bindgen",
]
[[package]]
name = "webpki-roots"
version = "0.23.1"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "b03058f88386e5ff5310d9111d53f48b17d732b401aeb83a8d5190f2ac459338"
dependencies = [
"rustls-webpki 0.100.2",
]
[[package]]
name = "webpki-roots"
version = "0.25.2"
@@ -7152,15 +7058,6 @@ version = "0.4.0"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "712e227841d057c1ee1cd2fb22fa7e5a5461ae8e48fa2ca79ec42cfc1931183f"
[[package]]
name = "windows"
version = "0.48.0"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "e686886bc078bc1b0b600cac0147aadb815089b6e4da64016cbd754b6342700f"
dependencies = [
"windows-targets 0.48.0",
]
[[package]]
name = "windows"
version = "0.52.0"
@@ -7180,21 +7077,6 @@ dependencies = [
"windows-targets 0.52.4",
]
[[package]]
name = "windows-sys"
version = "0.42.0"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "5a3e1820f08b8513f676f7ab6c1f99ff312fb97b553d30ff4dd86f9f15728aa7"
dependencies = [
"windows_aarch64_gnullvm 0.42.2",
"windows_aarch64_msvc 0.42.2",
"windows_i686_gnu 0.42.2",
"windows_i686_msvc 0.42.2",
"windows_x86_64_gnu 0.42.2",
"windows_x86_64_gnullvm 0.42.2",
"windows_x86_64_msvc 0.42.2",
]
[[package]]
name = "windows-sys"
version = "0.48.0"
@@ -7243,12 +7125,6 @@ dependencies = [
"windows_x86_64_msvc 0.52.4",
]
[[package]]
name = "windows_aarch64_gnullvm"
version = "0.42.2"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "597a5118570b68bc08d8d59125332c54f1ba9d9adeedeef5b99b02ba2b0698f8"
[[package]]
name = "windows_aarch64_gnullvm"
version = "0.48.0"
@@ -7261,12 +7137,6 @@ version = "0.52.4"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "bcf46cf4c365c6f2d1cc93ce535f2c8b244591df96ceee75d8e83deb70a9cac9"
[[package]]
name = "windows_aarch64_msvc"
version = "0.42.2"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "e08e8864a60f06ef0d0ff4ba04124db8b0fb3be5776a5cd47641e942e58c4d43"
[[package]]
name = "windows_aarch64_msvc"
version = "0.48.0"
@@ -7279,12 +7149,6 @@ version = "0.52.4"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "da9f259dd3bcf6990b55bffd094c4f7235817ba4ceebde8e6d11cd0c5633b675"
[[package]]
name = "windows_i686_gnu"
version = "0.42.2"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "c61d927d8da41da96a81f029489353e68739737d3beca43145c8afec9a31a84f"
[[package]]
name = "windows_i686_gnu"
version = "0.48.0"
@@ -7297,12 +7161,6 @@ version = "0.52.4"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "b474d8268f99e0995f25b9f095bc7434632601028cf86590aea5c8a5cb7801d3"
[[package]]
name = "windows_i686_msvc"
version = "0.42.2"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "44d840b6ec649f480a41c8d80f9c65108b92d89345dd94027bfe06ac444d1060"
[[package]]
name = "windows_i686_msvc"
version = "0.48.0"
@@ -7315,12 +7173,6 @@ version = "0.52.4"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "1515e9a29e5bed743cb4415a9ecf5dfca648ce85ee42e15873c3cd8610ff8e02"
[[package]]
name = "windows_x86_64_gnu"
version = "0.42.2"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "8de912b8b8feb55c064867cf047dda097f92d51efad5b491dfb98f6bbb70cb36"
[[package]]
name = "windows_x86_64_gnu"
version = "0.48.0"
@@ -7333,12 +7185,6 @@ version = "0.52.4"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "5eee091590e89cc02ad514ffe3ead9eb6b660aedca2183455434b93546371a03"
[[package]]
name = "windows_x86_64_gnullvm"
version = "0.42.2"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "26d41b46a36d453748aedef1486d5c7a85db22e56aff34643984ea85514e94a3"
[[package]]
name = "windows_x86_64_gnullvm"
version = "0.48.0"
@@ -7351,12 +7197,6 @@ version = "0.52.4"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "77ca79f2451b49fa9e2af39f0747fe999fcda4f5e241b2898624dca97a1f2177"
[[package]]
name = "windows_x86_64_msvc"
version = "0.42.2"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "9aec5da331524158c6d1a4ac0ab1541149c0b9505fde06423b02f5ef0106b9f0"
[[package]]
name = "windows_x86_64_msvc"
version = "0.48.0"
@@ -7433,10 +7273,11 @@ dependencies = [
"futures-util",
"generic-array",
"getrandom 0.2.11",
"half",
"hashbrown 0.14.5",
"hex",
"hmac",
"hyper 0.14.26",
"hyper 0.14.30",
"indexmap 1.9.3",
"itertools 0.10.5",
"itertools 0.12.1",
@@ -7504,7 +7345,7 @@ dependencies = [
"der 0.7.8",
"hex",
"pem",
"ring 0.17.6",
"ring",
"signature 2.2.0",
"spki 0.7.3",
"thiserror",

View File

@@ -76,8 +76,6 @@ clap = { version = "4.0", features = ["derive"] }
comfy-table = "7.1"
const_format = "0.2"
crc32c = "0.6"
crossbeam-deque = "0.8.5"
crossbeam-utils = "0.8.5"
dashmap = { version = "5.5.0", features = ["raw-api"] }
either = "1.8"
enum-map = "2.4.2"
@@ -95,7 +93,7 @@ hdrhistogram = "7.5.2"
hex = "0.4"
hex-literal = "0.4"
hmac = "0.12.1"
hostname = "0.3.1"
hostname = "0.4"
http = {version = "1.1.0", features = ["std"]}
http-types = { version = "2", default-features = false }
humantime = "2.1"
@@ -104,7 +102,6 @@ hyper = "0.14"
tokio-tungstenite = "0.20.0"
indexmap = "2"
indoc = "2"
inotify = "0.10.2"
ipnet = "2.9.0"
itertools = "0.10"
jsonwebtoken = "9"
@@ -113,7 +110,7 @@ libc = "0.2"
md5 = "0.7.0"
measured = { version = "0.0.22", features=["lasso"] }
measured-process = { version = "0.0.22" }
memoffset = "0.8"
memoffset = "0.9"
nix = { version = "0.27", features = ["dir", "fs", "process", "socket", "signal", "poll"] }
notify = "6.0.0"
num_cpus = "1.15"
@@ -142,7 +139,6 @@ rpds = "0.13"
rustc-hash = "1.1.0"
rustls = "0.22"
rustls-pemfile = "2"
rustls-split = "0.3"
scopeguard = "1.1"
sysinfo = "0.29.2"
sd-notify = "0.4.1"
@@ -164,7 +160,6 @@ strum_macros = "0.26"
svg_fmt = "0.4.3"
sync_wrapper = "0.1.2"
tar = "0.4"
task-local-extensions = "0.1.4"
test-context = "0.3"
thiserror = "1.0"
tikv-jemallocator = "0.5"

View File

@@ -64,7 +64,6 @@ pub struct ConfigToml {
#[serde(with = "humantime_serde")]
pub wal_redo_timeout: Duration,
pub superuser: String,
pub initdb_cache_dir: Option<Utf8PathBuf>,
pub page_cache_size: usize,
pub max_file_descriptors: usize,
pub pg_distrib_dir: Option<Utf8PathBuf>,
@@ -105,9 +104,6 @@ pub struct ConfigToml {
pub image_compression: ImageCompressionAlgorithm,
pub ephemeral_bytes_per_memory_kb: usize,
pub l0_flush: Option<crate::models::L0FlushConfig>,
#[serde(skip_serializing)]
// TODO(https://github.com/neondatabase/neon/issues/8184): remove after this field is removed from all pageserver.toml's
pub compact_level0_phase1_value_access: serde::de::IgnoredAny,
pub virtual_file_direct_io: crate::models::virtual_file::DirectIoMode,
pub io_buffer_alignment: usize,
}
@@ -323,7 +319,6 @@ impl Default for ConfigToml {
wal_redo_timeout: (humantime::parse_duration(DEFAULT_WAL_REDO_TIMEOUT)
.expect("cannot parse default wal redo timeout")),
superuser: (DEFAULT_SUPERUSER.to_string()),
initdb_cache_dir: None,
page_cache_size: (DEFAULT_PAGE_CACHE_SIZE),
max_file_descriptors: (DEFAULT_MAX_FILE_DESCRIPTORS),
pg_distrib_dir: None, // Utf8PathBuf::from("./pg_install"), // TODO: formely, this was std::env::current_dir()
@@ -386,7 +381,6 @@ impl Default for ConfigToml {
image_compression: (DEFAULT_IMAGE_COMPRESSION),
ephemeral_bytes_per_memory_kb: (DEFAULT_EPHEMERAL_BYTES_PER_MEMORY_KB),
l0_flush: None,
compact_level0_phase1_value_access: Default::default(),
virtual_file_direct_io: crate::models::virtual_file::DirectIoMode::default(),
io_buffer_alignment: DEFAULT_IO_BUFFER_ALIGNMENT,

View File

@@ -1,7 +1,7 @@
//! Quantify a single walredo manager's throughput under N concurrent callers.
//!
//! The benchmark implementation ([`bench_impl`]) is parametrized by
//! - `redo_work` => [`Request::short_request`] or [`Request::medium_request`]
//! - `redo_work` => an async closure that takes a `PostgresRedoManager` and performs one redo
//! - `n_redos` => number of times the benchmark shell execute the `redo_work`
//! - `nclients` => number of clients (more on this shortly).
//!
@@ -10,7 +10,7 @@
//! Each task executes the `redo_work` `n_redos/nclients` times.
//!
//! We exercise the following combinations:
//! - `redo_work = short / medium``
//! - `redo_work = ping / short / medium``
//! - `nclients = [1, 2, 4, 8, 16, 32, 64, 128]`
//!
//! We let `criterion` determine the `n_redos` using `iter_custom`.
@@ -27,33 +27,43 @@
//!
//! # Reference Numbers
//!
//! 2024-04-15 on i3en.3xlarge
//! 2024-09-18 on im4gn.2xlarge
//!
//! ```text
//! short/1 time: [24.584 µs 24.737 µs 24.922 µs]
//! short/2 time: [33.479 µs 33.660 µs 33.888 µs]
//! short/4 time: [42.713 µs 43.046 µs 43.440 µs]
//! short/8 time: [71.814 µs 72.478 µs 73.240 µs]
//! short/16 time: [132.73 µs 134.45 µs 136.22 µs]
//! short/32 time: [258.31 µs 260.73 µs 263.27 µs]
//! short/64 time: [511.61 µs 514.44 µs 517.51 µs]
//! short/128 time: [992.64 µs 998.23 µs 1.0042 ms]
//! medium/1 time: [110.11 µs 110.50 µs 110.96 µs]
//! medium/2 time: [153.06 µs 153.85 µs 154.99 µs]
//! medium/4 time: [317.51 µs 319.92 µs 322.85 µs]
//! medium/8 time: [638.30 µs 644.68 µs 652.12 µs]
//! medium/16 time: [1.2651 ms 1.2773 ms 1.2914 ms]
//! medium/32 time: [2.5117 ms 2.5410 ms 2.5720 ms]
//! medium/64 time: [4.8088 ms 4.8555 ms 4.9047 ms]
//! medium/128 time: [8.8311 ms 8.9849 ms 9.1263 ms]
//! ping/1 time: [21.789 µs 21.918 µs 22.078 µs]
//! ping/2 time: [27.686 µs 27.812 µs 27.970 µs]
//! ping/4 time: [35.468 µs 35.671 µs 35.926 µs]
//! ping/8 time: [59.682 µs 59.987 µs 60.363 µs]
//! ping/16 time: [101.79 µs 102.37 µs 103.08 µs]
//! ping/32 time: [184.18 µs 185.15 µs 186.36 µs]
//! ping/64 time: [349.86 µs 351.45 µs 353.47 µs]
//! ping/128 time: [684.53 µs 687.98 µs 692.17 µs]
//! short/1 time: [31.833 µs 32.126 µs 32.428 µs]
//! short/2 time: [35.558 µs 35.756 µs 35.992 µs]
//! short/4 time: [44.850 µs 45.138 µs 45.484 µs]
//! short/8 time: [65.985 µs 66.379 µs 66.853 µs]
//! short/16 time: [127.06 µs 127.90 µs 128.87 µs]
//! short/32 time: [252.98 µs 254.70 µs 256.73 µs]
//! short/64 time: [497.13 µs 499.86 µs 503.26 µs]
//! short/128 time: [987.46 µs 993.45 µs 1.0004 ms]
//! medium/1 time: [137.91 µs 138.55 µs 139.35 µs]
//! medium/2 time: [192.00 µs 192.91 µs 194.07 µs]
//! medium/4 time: [389.62 µs 391.55 µs 394.01 µs]
//! medium/8 time: [776.80 µs 780.33 µs 784.77 µs]
//! medium/16 time: [1.5323 ms 1.5383 ms 1.5459 ms]
//! medium/32 time: [3.0120 ms 3.0226 ms 3.0350 ms]
//! medium/64 time: [5.7405 ms 5.7787 ms 5.8166 ms]
//! medium/128 time: [10.412 ms 10.574 ms 10.718 ms]
//! ```
use anyhow::Context;
use bytes::{Buf, Bytes};
use criterion::{BenchmarkId, Criterion};
use once_cell::sync::Lazy;
use pageserver::{config::PageServerConf, walrecord::NeonWalRecord, walredo::PostgresRedoManager};
use pageserver_api::{key::Key, shard::TenantShardId};
use std::{
future::Future,
sync::Arc,
time::{Duration, Instant},
};
@@ -61,40 +71,59 @@ use tokio::{sync::Barrier, task::JoinSet};
use utils::{id::TenantId, lsn::Lsn};
fn bench(c: &mut Criterion) {
{
let nclients = [1, 2, 4, 8, 16, 32, 64, 128];
for nclients in nclients {
let mut group = c.benchmark_group("short");
group.bench_with_input(
BenchmarkId::from_parameter(nclients),
&nclients,
|b, nclients| {
let redo_work = Arc::new(Request::short_input());
b.iter_custom(|iters| bench_impl(Arc::clone(&redo_work), iters, *nclients));
},
);
}
}
{
let nclients = [1, 2, 4, 8, 16, 32, 64, 128];
for nclients in nclients {
let mut group = c.benchmark_group("medium");
group.bench_with_input(
BenchmarkId::from_parameter(nclients),
&nclients,
|b, nclients| {
let redo_work = Arc::new(Request::medium_input());
b.iter_custom(|iters| bench_impl(Arc::clone(&redo_work), iters, *nclients));
},
);
}
macro_rules! bench_group {
($name:expr, $redo_work:expr) => {{
let name: &str = $name;
let nclients = [1, 2, 4, 8, 16, 32, 64, 128];
for nclients in nclients {
let mut group = c.benchmark_group(name);
group.bench_with_input(
BenchmarkId::from_parameter(nclients),
&nclients,
|b, nclients| {
b.iter_custom(|iters| bench_impl($redo_work, iters, *nclients));
},
);
}
}};
}
//
// benchmark the protocol implementation
//
let pg_version = 14;
bench_group!(
"ping",
Arc::new(move |mgr: Arc<PostgresRedoManager>| async move {
let _: () = mgr.ping(pg_version).await.unwrap();
})
);
//
// benchmarks with actual record redo
//
let make_redo_work = |req: &'static Request| {
Arc::new(move |mgr: Arc<PostgresRedoManager>| async move {
let page = req.execute(&mgr).await.unwrap();
assert_eq!(page.remaining(), 8192);
})
};
bench_group!("short", {
static REQUEST: Lazy<Request> = Lazy::new(Request::short_input);
make_redo_work(&REQUEST)
});
bench_group!("medium", {
static REQUEST: Lazy<Request> = Lazy::new(Request::medium_input);
make_redo_work(&REQUEST)
});
}
criterion::criterion_group!(benches, bench);
criterion::criterion_main!(benches);
// Returns the sum of each client's wall-clock time spent executing their share of the n_redos.
fn bench_impl(redo_work: Arc<Request>, n_redos: u64, nclients: u64) -> Duration {
fn bench_impl<F, Fut>(redo_work: Arc<F>, n_redos: u64, nclients: u64) -> Duration
where
F: Fn(Arc<PostgresRedoManager>) -> Fut + Send + Sync + 'static,
Fut: Future<Output = ()> + Send + 'static,
{
let repo_dir = camino_tempfile::tempdir_in(env!("CARGO_TARGET_TMPDIR")).unwrap();
let conf = PageServerConf::dummy_conf(repo_dir.path().to_path_buf());
@@ -135,17 +164,20 @@ fn bench_impl(redo_work: Arc<Request>, n_redos: u64, nclients: u64) -> Duration
})
}
async fn client(
async fn client<F, Fut>(
mgr: Arc<PostgresRedoManager>,
start: Arc<Barrier>,
redo_work: Arc<Request>,
redo_work: Arc<F>,
n_redos: u64,
) -> Duration {
) -> Duration
where
F: Fn(Arc<PostgresRedoManager>) -> Fut + Send + Sync + 'static,
Fut: Future<Output = ()> + Send + 'static,
{
start.wait().await;
let start = Instant::now();
for _ in 0..n_redos {
let page = redo_work.execute(&mgr).await.unwrap();
assert_eq!(page.remaining(), 8192);
redo_work(Arc::clone(&mgr)).await;
// The real pageserver will rarely if ever do 2 walredos in a row without
// yielding to the executor.
tokio::task::yield_now().await;

View File

@@ -432,7 +432,7 @@ impl Client {
self.mgmt_api_endpoint
);
self.request(Method::POST, &uri, req)
self.request(Method::PUT, &uri, req)
.await?
.json()
.await

View File

@@ -70,8 +70,6 @@ pub struct PageServerConf {
pub superuser: String,
pub initdb_cache_dir: Option<Utf8PathBuf>,
pub page_cache_size: usize,
pub max_file_descriptors: usize,
@@ -299,7 +297,6 @@ impl PageServerConf {
wait_lsn_timeout,
wal_redo_timeout,
superuser,
initdb_cache_dir,
page_cache_size,
max_file_descriptors,
pg_distrib_dir,
@@ -327,7 +324,6 @@ impl PageServerConf {
max_vectored_read_bytes,
image_compression,
ephemeral_bytes_per_memory_kb,
compact_level0_phase1_value_access: _,
l0_flush,
virtual_file_direct_io,
concurrent_tenant_warmup,
@@ -347,7 +343,6 @@ impl PageServerConf {
wait_lsn_timeout,
wal_redo_timeout,
superuser,
initdb_cache_dir,
page_cache_size,
max_file_descriptors,
http_auth_type,
@@ -539,16 +534,6 @@ mod tests {
.expect("parse_and_validate");
}
#[test]
fn test_compactl0_phase1_access_mode_is_ignored_silently() {
let input = indoc::indoc! {r#"
[compact_level0_phase1_value_access]
mode = "streaming-kmerge"
validate = "key-lsn-value"
"#};
toml_edit::de::from_str::<pageserver_api::config::ConfigToml>(input).unwrap();
}
/// If there's a typo in the pageserver config, we'd rather catch that typo
/// and fail pageserver startup than silently ignoring the typo, leaving whoever
/// made it in the believe that their config change is effective.

View File

@@ -2955,7 +2955,7 @@ pub fn make_router(
"/v1/tenant/:tenant_shard_id/timeline/:timeline_id/preserve_initdb_archive",
|r| api_handler(r, timeline_preserve_initdb_handler),
)
.post(
.put(
"/v1/tenant/:tenant_shard_id/timeline/:timeline_id/archival_config",
|r| api_handler(r, timeline_archival_config_handler),
)

View File

@@ -18,7 +18,6 @@ use camino::Utf8Path;
use camino::Utf8PathBuf;
use enumset::EnumSet;
use futures::stream::FuturesUnordered;
use futures::FutureExt;
use futures::StreamExt;
use pageserver_api::models;
use pageserver_api::models::AuxFilePolicy;
@@ -34,6 +33,7 @@ use remote_storage::GenericRemoteStorage;
use remote_storage::TimeoutOrCancel;
use std::collections::BTreeMap;
use std::fmt;
use std::future::Future;
use std::sync::Weak;
use std::time::SystemTime;
use storage_broker::BrokerClientChannel;
@@ -1031,13 +1031,9 @@ impl Tenant {
}
Ok(TenantPreload {
timelines: Self::load_timeline_metadata(
self,
remote_timeline_ids,
remote_storage,
cancel,
)
.await?,
timelines: self
.load_timelines_metadata(remote_timeline_ids, remote_storage, cancel)
.await?,
})
}
@@ -1303,7 +1299,7 @@ impl Tenant {
.await
}
async fn load_timeline_metadata(
async fn load_timelines_metadata(
self: &Arc<Tenant>,
timeline_ids: HashSet<TimelineId>,
remote_storage: &GenericRemoteStorage,
@@ -1311,33 +1307,10 @@ impl Tenant {
) -> anyhow::Result<HashMap<TimelineId, TimelinePreload>> {
let mut part_downloads = JoinSet::new();
for timeline_id in timeline_ids {
let client = RemoteTimelineClient::new(
remote_storage.clone(),
self.deletion_queue_client.clone(),
self.conf,
self.tenant_shard_id,
timeline_id,
self.generation,
);
let cancel_clone = cancel.clone();
part_downloads.spawn(
async move {
debug!("starting index part download");
let index_part = client.download_index_file(&cancel_clone).await;
debug!("finished index part download");
Result::<_, anyhow::Error>::Ok(TimelinePreload {
client,
timeline_id,
index_part,
})
}
.map(move |res| {
res.with_context(|| format!("download index part for timeline {timeline_id}"))
})
.instrument(info_span!("download_index_part", %timeline_id)),
self.load_timeline_metadata(timeline_id, remote_storage.clone(), cancel_clone)
.instrument(info_span!("download_index_part", %timeline_id)),
);
}
@@ -1348,8 +1321,7 @@ impl Tenant {
next = part_downloads.join_next() => {
match next {
Some(result) => {
let preload_result = result.context("join preload task")?;
let preload = preload_result?;
let preload = result.context("join preload task")?;
timeline_preloads.insert(preload.timeline_id, preload);
},
None => {
@@ -1366,6 +1338,36 @@ impl Tenant {
Ok(timeline_preloads)
}
fn load_timeline_metadata(
self: &Arc<Tenant>,
timeline_id: TimelineId,
remote_storage: GenericRemoteStorage,
cancel: CancellationToken,
) -> impl Future<Output = TimelinePreload> {
let client = RemoteTimelineClient::new(
remote_storage.clone(),
self.deletion_queue_client.clone(),
self.conf,
self.tenant_shard_id,
timeline_id,
self.generation,
);
async move {
debug_assert_current_span_has_tenant_and_timeline_id();
debug!("starting index part download");
let index_part = client.download_index_file(&cancel).await;
debug!("finished index part download");
TimelinePreload {
client,
timeline_id,
index_part,
}
}
}
pub(crate) async fn apply_timeline_archival_config(
&self,
timeline_id: TimelineId,
@@ -3522,7 +3524,7 @@ impl Tenant {
.context("extract initdb tar")?;
} else {
// Init temporarily repo to get bootstrap data, this creates a directory in the `pgdata_path` path
run_initdb_with_cache(self.conf, &pgdata_path, pg_version, &self.cancel).await?;
run_initdb(self.conf, &pgdata_path, pg_version, &self.cancel).await?;
// Upload the created data dir to S3
if self.tenant_shard_id().is_shard_zero() {
@@ -3868,118 +3870,6 @@ impl Tenant {
}
}
fn cached_initdb_dirname(initial_superuser_name: &str, pg_version: u32) -> String
{
use std::hash::Hash;
use std::hash::Hasher;
use std::collections::hash_map::DefaultHasher;
let mut hasher = DefaultHasher::new();
initial_superuser_name.hash(&mut hasher);
let hash = hasher.finish();
format!("cached_initial_pgdata_{pg_version}_{:016}", hash)
}
fn copy_dir_all(src: impl AsRef<std::path::Path>, dst: impl AsRef<std::path::Path>) -> std::io::Result<()> {
for entry in fs::read_dir(src.as_ref())? {
let entry = entry?;
let subsrc = entry.path();
let subdst = dst.as_ref().join(&entry.file_name());
if entry.file_type()?.is_dir() {
std::fs::create_dir(&subdst)?;
copy_dir_all(&subsrc, &subdst)?;
} else {
std::fs::copy(&subsrc, &subdst)?;
}
}
Ok(())
}
fn restore_cached_initdb_dir(
cached_path: &Utf8Path,
target_path: &Utf8Path,
) -> anyhow::Result<bool> {
if !cached_path.exists() {
info!("cached initdb dir \"{cached_path}\" does not exist yet");
return Ok(false);
}
std::fs::create_dir(target_path)?;
copy_dir_all(cached_path, target_path)?;
info!("restored initdb result from cache dir \"{cached_path}\"");
Ok(true)
}
fn save_cached_initdb_dir(
src_path: &Utf8Path,
cache_path: &Utf8Path,
) -> anyhow::Result<()> {
match std::fs::create_dir(cache_path) {
Ok(()) => {
info!("saving initdb result to cache dir \"{cache_path}\"");
},
Err(err) if err.kind() == std::io::ErrorKind::AlreadyExists => {
info!("cache initdb dir \"{cache_path}\" already exists, not saving");
return Ok(())
},
Err(err) => { return Err(anyhow::Error::from(err))},
};
let cache_dir_guard = scopeguard::guard(cache_path, |cp| {
if let Err(err) = std::fs::remove_dir_all(&cp) {
error!("could not remove cached initdb directory {cp}: {err}");
}
});
let cache_parent_path = cache_path.parent().ok_or(anyhow::Error::msg("no cache parent path"))?;
let tmp_dirpath = camino_tempfile::tempdir_in(cache_parent_path)?;
copy_dir_all(src_path, &tmp_dirpath)?;
std::fs::rename(tmp_dirpath, &*cache_dir_guard)?;
// disarm the guard
scopeguard::ScopeGuard::into_inner(cache_dir_guard);
Ok(())
}
async fn run_initdb_with_cache(
conf: &'static PageServerConf,
initdb_target_dir: &Utf8Path,
pg_version: u32,
cancel: &CancellationToken,
) -> Result<(), InitdbError> {
let cache_dir = conf.initdb_cache_dir.as_ref().map(|initdb_cache_dir| {
initdb_cache_dir.join(cached_initdb_dirname(&conf.superuser, pg_version))
});
if let Some(cache_dir) = &cache_dir {
match restore_cached_initdb_dir(&cache_dir, initdb_target_dir) {
Ok(true) => return Ok(()),
Ok(false) => {},
Err(err) => {
warn!("Error restoring from cached initdb directory \"{cache_dir}\": {err}");
if initdb_target_dir.exists() {
if let Err(err) = std::fs::remove_dir_all(&initdb_target_dir) {
error!("could not remove temporary initdb target directory {initdb_target_dir}: {err}");
}
}
},
}
}
run_initdb(conf, initdb_target_dir, pg_version, cancel).await?;
if let Some(cache_dir) = &cache_dir {
if let Err(err) = save_cached_initdb_dir(initdb_target_dir, &cache_dir) {
warn!("error saving initdb result to cache directory \"{cache_dir}\": {err}");
}
}
Ok(())
}
/// Create the cluster temporarily in 'initdbpath' directory inside the repository
/// to get bootstrap data for timeline initialization.
async fn run_initdb(

View File

@@ -205,6 +205,22 @@ impl PostgresRedoManager {
}
}
/// Do a ping request-response roundtrip.
///
/// Not used in production, but by Rust benchmarks.
///
/// # Cancel-Safety
///
/// This method is cancellation-safe.
pub async fn ping(&self, pg_version: u32) -> Result<(), Error> {
self.do_with_walredo_process(pg_version, |proc| async move {
proc.ping(Duration::from_secs(1))
.await
.map_err(Error::Other)
})
.await
}
pub fn status(&self) -> WalRedoManagerStatus {
WalRedoManagerStatus {
last_redo_at: {
@@ -297,6 +313,9 @@ impl PostgresRedoManager {
}
}
/// # Cancel-Safety
///
/// This method is cancel-safe iff `closure` is cancel-safe.
async fn do_with_walredo_process<
F: FnOnce(Arc<Process>) -> Fut,
Fut: Future<Output = Result<O, Error>>,
@@ -537,6 +556,17 @@ mod tests {
use tracing::Instrument;
use utils::{id::TenantId, lsn::Lsn};
#[tokio::test]
async fn test_ping() {
let h = RedoHarness::new().unwrap();
h.manager
.ping(14)
.instrument(h.span())
.await
.expect("ping should work");
}
#[tokio::test]
async fn short_v14_redo() {
let expected = std::fs::read("test_data/short_v14_redo.page").unwrap();

View File

@@ -6,6 +6,7 @@ use self::no_leak_child::NoLeakChild;
use crate::{
config::PageServerConf,
metrics::{WalRedoKillCause, WAL_REDO_PROCESS_COUNTERS, WAL_REDO_RECORD_COUNTER},
page_cache::PAGE_SZ,
span::debug_assert_current_span_has_tenant_id,
walrecord::NeonWalRecord,
};
@@ -237,6 +238,26 @@ impl WalRedoProcess {
res
}
/// Do a ping request-response roundtrip.
///
/// Not used in production, but by Rust benchmarks.
pub(crate) async fn ping(&self, timeout: Duration) -> anyhow::Result<()> {
let mut writebuf: Vec<u8> = Vec::with_capacity(4);
protocol::build_ping_msg(&mut writebuf);
let Ok(res) = tokio::time::timeout(timeout, self.apply_wal_records0(&writebuf)).await
else {
anyhow::bail!("WAL redo ping timed out");
};
let response = res?;
if response.len() != PAGE_SZ {
anyhow::bail!(
"WAL redo ping response should respond with page-sized response: {}",
response.len()
);
}
Ok(())
}
/// # Cancel-Safety
///
/// When not polled to completion (e.g. because in `tokio::select!` another

View File

@@ -55,3 +55,8 @@ pub(crate) fn build_get_page_msg(tag: BufferTag, buf: &mut Vec<u8>) {
tag.ser_into(buf)
.expect("serialize BufferTag should always succeed");
}
pub(crate) fn build_ping_msg(buf: &mut Vec<u8>) {
buf.put_u8(b'H');
buf.put_u32(4);
}

View File

@@ -24,6 +24,7 @@
* PushPage ('P'): Copy a page image (in the payload) to buffer cache
* ApplyRecord ('A'): Apply a WAL record (in the payload)
* GetPage ('G'): Return a page image from buffer cache.
* Ping ('H'): Return the input message.
*
* Currently, you only get a response to GetPage requests; the response is
* simply a 8k page, without any headers. Errors are logged to stderr.
@@ -133,6 +134,7 @@ static void ApplyRecord(StringInfo input_message);
static void apply_error_callback(void *arg);
static bool redo_block_filter(XLogReaderState *record, uint8 block_id);
static void GetPage(StringInfo input_message);
static void Ping(StringInfo input_message);
static ssize_t buffered_read(void *buf, size_t count);
static void CreateFakeSharedMemoryAndSemaphores();
@@ -394,6 +396,10 @@ WalRedoMain(int argc, char *argv[])
GetPage(&input_message);
break;
case 'H': /* Ping */
Ping(&input_message);
break;
/*
* EOF means we're done. Perform normal shutdown.
*/
@@ -1057,6 +1063,36 @@ GetPage(StringInfo input_message)
}
static void
Ping(StringInfo input_message)
{
int tot_written;
/* Response: the input message */
tot_written = 0;
do {
ssize_t rc;
/* We don't need alignment, but it's bad practice to use char[BLCKSZ] */
#if PG_VERSION_NUM >= 160000
static const PGIOAlignedBlock response;
#else
static const PGAlignedBlock response;
#endif
rc = write(STDOUT_FILENO, &response.data[tot_written], BLCKSZ - tot_written);
if (rc < 0) {
/* If interrupted by signal, just retry */
if (errno == EINTR)
continue;
ereport(ERROR,
(errcode_for_file_access(),
errmsg("could not write to stdout: %m")));
}
tot_written += rc;
} while (tot_written < BLCKSZ);
elog(TRACE, "Page sent back for ping");
}
/* Buffer used by buffered_read() */
static char stdin_buf[16 * 1024];
static size_t stdin_len = 0; /* # of bytes in buffer */

View File

@@ -71,6 +71,37 @@ impl ComputeHookTenant {
}
}
fn is_sharded(&self) -> bool {
matches!(self, ComputeHookTenant::Sharded(_))
}
/// Clear compute hook state for the specified shard.
/// Only valid for [`ComputeHookTenant::Sharded`] instances.
fn remove_shard(&mut self, tenant_shard_id: TenantShardId, stripe_size: ShardStripeSize) {
match self {
ComputeHookTenant::Sharded(sharded) => {
if sharded.stripe_size != stripe_size
|| sharded.shard_count != tenant_shard_id.shard_count
{
tracing::warn!("Shard split detected while handling detach")
}
let shard_idx = sharded.shards.iter().position(|(shard_number, _node_id)| {
*shard_number == tenant_shard_id.shard_number
});
if let Some(shard_idx) = shard_idx {
sharded.shards.remove(shard_idx);
} else {
tracing::warn!("Shard not found while handling detach")
}
}
ComputeHookTenant::Unsharded(_) => {
unreachable!("Detach of unsharded tenants is handled externally");
}
}
}
/// Set one shard's location. If stripe size or shard count have changed, Self is reset
/// and drops existing content.
fn update(
@@ -614,6 +645,36 @@ impl ComputeHook {
self.notify_execute(maybe_send_result, tenant_shard_id, cancel)
.await
}
/// Reflect a detach for a particular shard in the compute hook state.
///
/// The goal is to avoid sending compute notifications with stale information (i.e.
/// including detach pageservers).
#[tracing::instrument(skip_all, fields(tenant_id=%tenant_shard_id.tenant_id, shard_id=%tenant_shard_id.shard_slug()))]
pub(super) fn handle_detach(
&self,
tenant_shard_id: TenantShardId,
stripe_size: ShardStripeSize,
) {
use std::collections::hash_map::Entry;
let mut state_locked = self.state.lock().unwrap();
match state_locked.entry(tenant_shard_id.tenant_id) {
Entry::Vacant(_) => {
tracing::warn!("Compute hook tenant not found for detach");
}
Entry::Occupied(mut e) => {
let sharded = e.get().is_sharded();
if !sharded {
e.remove();
} else {
e.get_mut().remove_shard(tenant_shard_id, stripe_size);
}
tracing::debug!("Compute hook handled shard detach");
}
}
}
}
#[cfg(test)]

View File

@@ -1849,7 +1849,7 @@ pub fn make_router(
RequestName("v1_tenant_timeline"),
)
})
.post(
.put(
"/v1/tenant/:tenant_id/timeline/:timeline_id/archival_config",
|r| {
tenant_service_handler(

View File

@@ -238,7 +238,7 @@ impl PageserverClient {
) -> Result<()> {
measured_request!(
"timeline_archival_config",
crate::metrics::Method::Post,
crate::metrics::Method::Put,
&self.node_id_label,
self.inner
.timeline_archival_config(tenant_shard_id, timeline_id, req)

View File

@@ -820,6 +820,16 @@ impl Reconciler {
self.location_config(&node, conf, None, false).await?;
}
// The condition below identifies a detach. We must have no attached intent and
// must have been attached to something previously. Pass this information to
// the [`ComputeHook`] such that it can update its tenant-wide state.
if self.intent.attached.is_none() && !self.detach.is_empty() {
// TODO: Consider notifying control plane about detaches. This would avoid situations
// where the compute tries to start-up with a stale set of pageservers.
self.compute_hook
.handle_detach(self.tenant_shard_id, self.shard.stripe_size);
}
failpoint_support::sleep_millis_async!("sleep-on-reconcile-epilogue");
Ok(())

View File

@@ -181,17 +181,6 @@ def top_output_dir(base_dir: Path) -> Iterator[Path]:
log.info(f"top_output_dir is {output_dir}")
yield output_dir
@pytest.fixture(scope="session", autouse=True)
def shared_initdb_cache_dir(top_output_dir: Path) -> Iterator[Path]:
log.info("Creating shared initdb cache directory")
cache_dir = top_output_dir / "shared_initdb_cache"
shutil.rmtree(cache_dir, ignore_errors=True)
cache_dir.mkdir(exist_ok=True)
yield cache_dir
@pytest.fixture(scope="session")
def neon_api_key() -> str:
@@ -412,7 +401,6 @@ class NeonEnvBuilder:
safekeeper_extra_opts: Optional[list[str]] = None,
storage_controller_port_override: Optional[int] = None,
pageserver_io_buffer_alignment: Optional[int] = None,
shared_initdb_cache_dir: Optional[Path] = None,
):
self.repo_dir = repo_dir
self.rust_log_override = rust_log_override
@@ -444,7 +432,6 @@ class NeonEnvBuilder:
self.enable_scrub_on_exit = True
self.test_output_dir = test_output_dir
self.test_overlay_dir = test_overlay_dir
self.shared_initdb_cache_dir = shared_initdb_cache_dir
self.overlay_mounts_created_by_us: List[Tuple[str, Path]] = []
self.config_init_force: Optional[str] = None
self.top_output_dir = top_output_dir
@@ -982,7 +969,6 @@ class NeonEnv:
def __init__(self, config: NeonEnvBuilder):
self.repo_dir = config.repo_dir
self.shared_initdb_cache_dir = config.shared_initdb_cache_dir
self.rust_log_override = config.rust_log_override
self.port_distributor = config.port_distributor
self.s3_mock_server = config.mock_s3_server
@@ -1085,10 +1071,6 @@ class NeonEnv:
# Default which can be overriden with `NeonEnvBuilder.pageserver_config_override`
"availability_zone": "us-east-2a",
}
if self.shared_initdb_cache_dir is not None:
ps_cfg["initdb_cache_dir"] = str(self.shared_initdb_cache_dir)
if self.pageserver_virtual_file_io_engine is not None:
ps_cfg["virtual_file_io_engine"] = self.pageserver_virtual_file_io_engine
if config.pageserver_default_tenant_config_compaction_algorithm is not None:
@@ -1351,7 +1333,6 @@ def neon_simple_env(
pageserver_aux_file_policy: Optional[AuxFileStore],
pageserver_default_tenant_config_compaction_algorithm: Optional[Dict[str, Any]],
pageserver_io_buffer_alignment: Optional[int],
shared_initdb_cache_dir: Optional[Path],
) -> Iterator[NeonEnv]:
"""
Simple Neon environment, with no authentication and no safekeepers.
@@ -1378,7 +1359,6 @@ def neon_simple_env(
pageserver_aux_file_policy=pageserver_aux_file_policy,
pageserver_default_tenant_config_compaction_algorithm=pageserver_default_tenant_config_compaction_algorithm,
pageserver_io_buffer_alignment=pageserver_io_buffer_alignment,
shared_initdb_cache_dir=shared_initdb_cache_dir
) as builder:
env = builder.init_start()
@@ -1403,7 +1383,6 @@ def neon_env_builder(
pageserver_aux_file_policy: Optional[AuxFileStore],
record_property: Callable[[str, object], None],
pageserver_io_buffer_alignment: Optional[int],
shared_initdb_cache_dir: Optional[Path],
) -> Iterator[NeonEnvBuilder]:
"""
Fixture to create a Neon environment for test.
@@ -1439,7 +1418,6 @@ def neon_env_builder(
pageserver_aux_file_policy=pageserver_aux_file_policy,
pageserver_default_tenant_config_compaction_algorithm=pageserver_default_tenant_config_compaction_algorithm,
pageserver_io_buffer_alignment=pageserver_io_buffer_alignment,
shared_initdb_cache_dir=shared_initdb_cache_dir
) as builder:
yield builder
# Propogate `preserve_database_files` to make it possible to use in other fixtures,

View File

@@ -631,7 +631,7 @@ class PageserverHttpClient(requests.Session, MetricsGetter):
log.info(
f"requesting timeline archival config {config} for tenant {tenant_id} and timeline {timeline_id}"
)
res = self.post(
res = self.put(
f"http://localhost:{self.port}/v1/tenant/{tenant_id}/timeline/{timeline_id}/archival_config",
json=config,
)

View File

@@ -8,6 +8,7 @@ from fixtures.neon_fixtures import (
PgBin,
fork_at_current_lsn,
import_timeline_from_vanilla_postgres,
wait_for_wal_insert_lsn,
)
@@ -22,11 +23,6 @@ def twophase_test_on_timeline(env: NeonEnv):
conn = endpoint.connect()
cur = conn.cursor()
# FIXME: Switch to the next WAL segment, to work around the bug fixed in
# https://github.com/neondatabase/neon/pull/8914. When that is merged, this can be
# removed.
cur.execute("select pg_switch_wal()")
cur.execute("CREATE TABLE foo (t text)")
# Prepare a transaction that will insert a row
@@ -140,3 +136,28 @@ def test_twophase_nonzero_epoch(
vanilla_pg.stop() # don't need the original server anymore
twophase_test_on_timeline(env)
def test_twophase_at_wal_segment_start(neon_simple_env: NeonEnv):
"""
Same as 'test_twophase' test, but the server is started at an LSN at the beginning
of a WAL segment. We had a bug where we didn't initialize the "long XLOG page header"
at the beginning of the segment correctly, which was detected when the checkpointer
tried to read the XLOG_XACT_PREPARE record from the WAL, if that record was on the
very first page of a WAL segment and the server was started up at that first page.
"""
env = neon_simple_env
timeline_id = env.neon_cli.create_branch("test_twophase", "main")
endpoint = env.endpoints.create_start(
"test_twophase", config_lines=["max_prepared_transactions=5"]
)
endpoint.safe_psql("SELECT pg_switch_wal()")
# to avoid hitting https://github.com/neondatabase/neon/issues/9079, wait for the
# WAL to reach the pageserver.
wait_for_wal_insert_lsn(env, endpoint, env.initial_tenant, timeline_id)
endpoint.stop_and_destroy()
twophase_test_on_timeline(env)

View File

@@ -1,18 +1,18 @@
{
"v17": [
"17rc1",
"dadd6fe208bb906cc0a48980f2ab4e13c47ba3ad"
"2cf120e7393ca5f537c6a38b457585576dc035fc"
],
"v16": [
"16.4",
"d009084a745cb4d5e6de222c778b2a562c8b2767"
"1d7081a3b076ddf5086e0b118d4329820e6a7427"
],
"v15": [
"15.8",
"1dbd6f316416c8360bbd4f3d6db956cf70937cf0"
"16c3c6b64f1420a367a2a9b2510f20d94f791af8"
],
"v14": [
"14.13",
"f9c51c12438b20049b6905eb4e43d321defd6ff2"
"a38d15f3233a4c07f2bf3335fcbd874dd1f4e386"
]
}

View File

@@ -45,6 +45,7 @@ futures-io = { version = "0.3" }
futures-util = { version = "0.3", features = ["channel", "io", "sink"] }
generic-array = { version = "0.14", default-features = false, features = ["more_lengths", "zeroize"] }
getrandom = { version = "0.2", default-features = false, features = ["std"] }
half = { version = "2", default-features = false, features = ["num-traits"] }
hashbrown = { version = "0.14", features = ["raw"] }
hex = { version = "0.4", features = ["serde"] }
hmac = { version = "0.12", default-features = false, features = ["reset"] }
@@ -106,6 +107,7 @@ cc = { version = "1", default-features = false, features = ["parallel"] }
chrono = { version = "0.4", default-features = false, features = ["clock", "serde", "wasmbind"] }
either = { version = "1" }
getrandom = { version = "0.2", default-features = false, features = ["std"] }
half = { version = "2", default-features = false, features = ["num-traits"] }
hashbrown = { version = "0.14", features = ["raw"] }
indexmap = { version = "1", default-features = false, features = ["std"] }
itertools-5ef9efb8ec2df382 = { package = "itertools", version = "0.12", default-features = false, features = ["use_std"] }