Compare commits

..

81 Commits

Author SHA1 Message Date
Erik Grinaker
1821b6ea3f Storage release 2025-05-09 12:25 UTC 2025-05-09 14:25:07 +02:00
Alex Chi Z.
93b964f829 fix(pageserver): do not do image compaction if it's below gc cutoff (#11872)
## Problem

We observe image compaction errors after gc-compaction finishes
compacting below the gc_cutoff. This is because `repartition` returns an
LSN below the gc horizon as we (likely) determined that `distance <=
self.repartition_threshold`.

I think it's better to keep the current behavior of when to trigger
compaction but we should skip image compaction if the returned LSN is
below the gc horizon.

## Summary of changes

If the repartition returns an invalid LSN, skip image compaction.

Signed-off-by: Alex Chi Z <chi@neon.tech>
2025-05-09 12:07:52 +00:00
Vlad Lazar
d0aaec2abb storage_controller: create imported timelines on safekeepers (#11801)
## Problem

SK timeline creations were skipped for imported timelines since we
didn't know the correct start LSN
of the timeline at that point.

## Summary of changes

Created imported timelines on the SK as part of the import finalize
step.
We use the last record LSN of shard 0 as the start LSN for the
safekeeper timeline.

Closes https://github.com/neondatabase/neon/issues/11569
2025-05-09 10:55:26 +00:00
Alex Chi Z.
d0dc65da12 fix(pageserver): give up gc-compaction if one key has too long history (#11869)
## Problem

The limitation we imposed last week
https://github.com/neondatabase/neon/pull/11709 is not enough to protect
excessive memory usage.

## Summary of changes

If a single key accumulated too much history, give up compaction. In the
future, we can make the `generate_key_retention` function take a stream
of keys instead of first accumulating them in memory, thus easily
support such long key history cases.

Signed-off-by: Alex Chi Z <chi@neon.tech>
2025-05-09 10:12:49 +00:00
Konstantin Knizhnik
03d635b916 Add more guards for prefetch_pump_state (#11859)
## Problem

See https://neondb.slack.com/archives/C08PJ07BZ44/p1746566292750689

Looks like there are more cases when `prefetch_pump_state` can be called
in unexpected place and cause core dump.

## Summary of changes

Add more guards.

---------

Co-authored-by: Konstantin Knizhnik <knizhnik@neon.tech>
2025-05-09 09:07:08 +00:00
Conrad Ludgate
5cd7f936f9 fix(neon-rls): optimistically assume role grants are already assigned for replicas (#11811)
## Problem

Read replicas cannot grant permissions for roles for Neon RLS. Usually
the permission is already granted, so we can optimistically check. See
INC-509

## Summary of changes

Perform a permission lookup prior to actually executing any grants.
2025-05-09 07:48:30 +00:00
Erik Grinaker
be3686e2af Storage release 2025-05-06 15:12 UTC 2025-05-06 17:12:17 +02:00
Erik Grinaker
1c3cb18c60 storcon: fix split aborts removing other tenants (#11837)
## Problem

When aborting a split, the code accidentally removes all other tenant
shards from the in-memory map that have the same shard count as the
aborted split, causing "tenant not found" errors. It will recover on a
storcon restart, when it loads the persisted state. This issue has been
present for at least a year.

Resolves https://github.com/neondatabase/cloud/issues/28589.

## Summary of changes

Only remove shards belonging to the relevant tenant when aborting a
split.

Also adds a regression test.
2025-05-06 17:12:17 +02:00
github-actions[bot]
946a0667eb Storage release 2025-05-02 06:10 UTC 2025-05-02 06:10:54 +00:00
Vlad Lazar
e5d15450ec Storage release 2025-04-28 2025-04-28 22:43:18 +02:00
Alex Chi Z.
d294078d1f fix(pageserver): consider tombstones in replorigin (#11752)
## Problem

We didn't consider tombstones in replorigin read path in the past. This
was fine because tombstones are stored as LSN::Invalid before we
universally define what the tombstone is for sparse keyspaces.

Now we remove non-inherited keys during detach ancestor and write the
universal tombstone "empty image". So we need to consider it across all
the read paths.

related: https://github.com/neondatabase/neon/pull/11299

## Summary of changes

Empty value gets ignored for replorigin scans.

---------

Signed-off-by: Alex Chi Z <chi@neon.tech>
2025-04-28 21:20:11 +01:00
github-actions[bot]
eb70d7a55c Storage release 2025-04-25 2025-04-25 11:04:06 +00:00
github-actions[bot]
20723ea039 Storage release 2025-04-21 2025-04-21 19:39:21 +00:00
Vlad Lazar
db95540975 pageserver: handle empty get vectored queries (#11652)
## Problem

If all batched requests are excluded from the query by
`Timeine::get_rel_page_at_lsn_batched` (e.g. because they are past the
end of the relation), the read path would panic since it doesn't expect
empty queries. This is a change in behaviour that was introduced with
the scattered query implementation.

## Summary of Changes

Handle empty queries explicitly.
2025-04-21 15:38:44 -04:00
JC Grünhage
90033fe693 fix(ci): set token for fast-forward failure comments and allow merging with state unstable (#11647)
## Problem

https://github.com/neondatabase/neon/actions/runs/14538136318/job/40790985693?pr=11645
failed, even though the relevant parts of the CI had passed and
auto-merge determined the PR is ready to merge. After that, commenting
failed.

## Summary of changes
- set GH_TOKEN for commenting after fast-forward failure
- allow merging with mergeable_state unstable
2025-04-21 15:38:44 -04:00
JC Grünhage
cb9d439cc1 fix(ci): make regex to find rc branches less strict (#11646)
## Problem

https://github.com/neondatabase/neon/actions/runs/14537161022/job/40787763965
failed to find the correct RC PR run, preventing artifact re-use. This
broke in https://github.com/neondatabase/neon/pull/11547.

There's a hotfix release containing this in
https://github.com/neondatabase/neon/pull/11645.

## Summary of changes
Make the regex for finding the RC PR run less strict, it was needlessly
precise.
2025-04-21 15:38:44 -04:00
Jan Christian Grünhage
7a2f0c4d53 Storage release 2025-04-18 2025-04-18 18:51:56 +02:00
Jan Christian Grünhage
60c907cbdb fix(ci): allow merging with mergeable_state unstable 2025-04-18 18:51:27 +02:00
Jan Christian Grünhage
2eb43a5d81 fix(ci): set GH_TOKEN for commenting after fast-forward failure 2025-04-18 18:51:21 +02:00
Jan Christian Grünhage
b2f45fe37f fix(ci): make regex to find rc branches less strict 2025-04-18 17:15:01 +02:00
github-actions[bot]
3904a0fe4f Storage release 2025-04-18 2025-04-18 06:02:29 +00:00
github-actions[bot]
cd07b120b5 Storage release 2025-04-11 2025-04-11 16:20:12 +00:00
github-actions[bot]
7c06a33df0 Storage release 2025-04-04 2025-04-04 15:40:58 +00:00
github-actions[bot]
85992000e3 Storage release 2025-03-28 2025-03-28 06:02:32 +00:00
github-actions[bot]
c33cf739e3 Storage release 2025-03-16 2025-03-16 17:05:15 +00:00
JC Grünhage
1812fc7cf1 Merge pull request #11251 from neondatabase/rc/release/2025-03-14--storcon-hotfix
Storage controller hotfix 2025-03-14
2025-03-14 16:43:22 +01:00
Christian Schwarz
929d963a35 Merge branch 'hotfix/release/2025-03-14-storcon-optimizations' into rc/release/2025-03-14--storcon-hotfix 2025-03-14 15:06:25 +01:00
Christian Schwarz
e405420458 CHERRY PICK: fix(storcon): optimization validation makes decisions based on wrong SecondaryProgress (#11229)
(cherry picked from commit 04370b48b3)

Conflicts:
	storage_controller/src/service.rs

Because `release` head doesn't yet have
`storcon: timetime table, creation and deletion (#11058)`
2025-03-14 14:47:29 +01:00
Alex Chi Z.
1b5258ef6a Merge pull request #11161 from neondatabase/arpad/release-db-sk-loading
release branch: re-apply "storcon db: load safekeepers from DB again"
2025-03-10 21:50:58 -04:00
Arpad Müller
9fbb33e9d9 storcon db: load safekeepers from DB again (#11087)
Earlier PR #11041 soft-disabled the loading code for safekeepers from
the storcon db. This PR makes us load the safekeepers from the database
again, now that we have [JWTs available on
staging](https://github.com/neondatabase/neon/pull/11087) and soon on
prod.

This reverts commit 23fb8053c5.

Part of https://github.com/neondatabase/cloud/issues/24727
2025-03-11 01:12:21 +01:00
Alex Chi Z.
93983ac5fc Merge pull request #11128 from neondatabase/rc/release/2025-03-07
Storage release 2025-03-07
2025-03-07 14:02:55 -05:00
Alex Chi Z
72dd540c87 Merge branch 'release' of https://github.com/neondatabase/neon into rc/release/2025-03-07 2025-03-07 13:04:10 -05:00
Alex Chi Z.
612d0aea4f feat(pageserver): add force patch index_part API (#11119)
## Problem

As part of the disaster recovery tool. Partly for
https://github.com/neondatabase/neon/issues/9114.

## Summary of changes

* Add a new pageserver API to force patch the fields in index_part and
modify the timeline internal structures.

---------

Signed-off-by: Alex Chi Z <chi@neon.tech>
2025-03-07 12:54:44 -05:00
Vlad Lazar
aa3a75a0a7 pageserver: enable previous heatmaps by default (#11132)
We add the off by default configs in
https://github.com/neondatabase/neon/pull/11088 because
the unarchival heatmap was causing oversized secondary locations. That
was fixed in https://github.com/neondatabase/neon/pull/11098, so let's
turn them on by default.
2025-03-07 11:14:38 -05:00
Arpad Müller
6389c9184c update ring to 0.17.13 (#11131)
Update ring from 0.17.6 to 0.17.13. Addresses the advisory:
https://rustsec.org/advisories/RUSTSEC-2025-0009
2025-03-07 11:14:26 -05:00
Vlad Lazar
e177927476 safekeeper: don't skip empty records for shard zero (#11137)
## Problem

Shard zero needs to track the start LSN of the latest record
in adition to the LSN of the next record to ingest. The former
is included in basebackup persisted by the compute in WAL.

Previously, empty records were skipped for all shards. This caused
the prev LSN tracking on the PS to fall behind and led to logical
replication
issues.

## Summary of changes

Shard zero now receives emtpy interpreted records for LSN tracking
purposes.
A test is included too.
2025-03-07 11:04:30 -05:00
github-actions[bot]
e9bbafebbd Storage release 2025-03-07 2025-03-07 06:02:10 +00:00
Vlad Lazar
7430fb9836 Merge pull request #11090 from neondatabase/vlad/release-gate-previous-heatmap
Storage release 2025-03-05
2025-03-05 14:37:24 +00:00
Vlad Lazar
c45d169527 pageserver: gate previous heatmap behind config flag (#11088)
## Problem

On unarchival, we update the previous heatmap with all visible layers.
When the primary generates a new heatmap it includes all those layers,
so the secondary will download them. Since they're not actually resident
on the primary (we didn't call the warm up API), they'll never be
evicted, so they remain in the heatmap.

This leads to oversized secondary locations like we saw in pre-prod.

## Summary of changes

Gate the loading of the previous heatmaps and the heatmap generation on
unarchival behind configuration
flags. They are disabled by default, but enabled in tests.
2025-03-05 13:29:46 +01:00
Vlad Lazar
a1e67cfe86 Merge pull request #11051 from neondatabase/vlad/release-8005-and-cherry-picks
Storage release 2025-02-28
2025-02-28 19:40:33 +00:00
Erik Grinaker
0263c92c47 pageserver: fix race that can wedge background tasks (#11047)
## Problem

`wait_for_active_tenant()`, used when starting background tasks, has a
race condition that can cause it to wait forever (until cancelled). It
first checks the current tenant state, and then subscribes for state
updates, but if the state changes between these then it won't be
notified about it.

We've seen this wedge compaction tasks, which can cause unbounded layer
file buildup and read amplification.

## Summary of changes

Use `watch::Receiver::wait_for()` to check both the current and new
tenant states.
2025-02-28 18:11:10 +01:00
Vlad Lazar
5fc599d653 storcon: soft disable SK heartbeats (#11041)
## Problem

JWT tokens aren't in place, so all SK heartbeats fail. This is
equivalent to a wait before applying the PS heartbeats and makes things
more flaky.

## Summary of Changes

Add a flag that skips loading SKs from the db on start-up and at
runtime.
2025-02-28 18:11:10 +01:00
JC Grünhage
66d2592d04 Merge pull request #11032 from neondatabase/rc/release/2025-02-28
Storage release 2025-02-28
2025-02-28 11:12:07 +01:00
Jan Christian Grünhage
517ae7a60e Merge remote-tracking branch 'origin/release' into rc/release/2025-02-28 2025-02-28 10:10:19 +01:00
github-actions[bot]
b2a769cc86 Storage release 2025-02-28 2025-02-28 06:02:09 +00:00
Erik Grinaker
12f0e525c6 Merge pull request #10961 from neondatabase/erik/release-7930-slow-getpage
pageserver: tweak slow GetPage logging (#10956)
2025-02-24 23:05:10 +01:00
Erik Grinaker
6c6e5bfc2b pageserver: tweak slow GetPage logging 2025-02-24 21:27:38 +01:00
Erik Grinaker
9c0fefee25 Merge pull request #10921 from neondatabase/rc/release/2025-02-21
Storage release 2025-02-21
2025-02-21 18:55:58 +01:00
Vlad Lazar
97e2e27f68 storcon: use Duration for duration's in the storage controller tenant config (#10928)
## Problem

The storage controller treats durations in the tenant config as strings.
These are loaded from the db.
The pageserver maps these durations to a seconds only format and we
always get a mismatch compared
to what's in the db.

## Summary of changes

Treat durations as durations inside the storage controller and not as
strings.
Nothing changes in the cross service API's themselves or the way things
are stored in the db.

I also added some logging which I would have made the investigation a
10min job:
1. Reason for why the reconciliation was spawned
2. Location config diff between the observed and wanted states
2025-02-21 17:14:43 +01:00
Erik Grinaker
cea2b222d0 Merge branch 'release' into rc/release/2025-02-21 2025-02-21 12:38:14 +01:00
github-actions[bot]
5b2afd953c Storage release 2025-02-21 2025-02-21 06:02:10 +00:00
Alex Chi Z.
74e789b155 Merge pull request #10878 from neondatabase/rc/release/2025-02-18 2025-02-18 23:04:21 -05:00
Alex Chi Z.
235439e639 fix(pageserver): make repartition error critical (#10872)
## Problem

Read errors during repartition should be a critical error.

## Summary of changes

<del>We only have one call site</del> We have two call sites of
`repartition` where one of them is during the initial image upload
optimization and another is during image layer creation, so I added a
`critical!` here instead of inside `collect_keyspace`.

---------

Signed-off-by: Alex Chi Z <chi@neon.tech>
2025-02-18 15:29:19 -05:00
Alex Chi Z.
a78438f15c Revert "feat(pageserver): repartition on L0-L1 boundary (#10548)" (#10870)
This reverts commit 443c8d0b4b.

## Problem

We observe a massive amount of compaction errors.

## Summary of changes

If the tenant did not write any L1 layers (i.e., they accumulate L0
layers where number of them is below L0 threshold), image creation will
always fail. Therefore, it's not correct to simply use the
disk_consistent_lsn or L0/L1 boundary for the image creation.
2025-02-18 13:39:01 -05:00
Arseny Sher
3d370679a1 Merge pull request #10855 from neondatabase/rel-02-14-39d42d846ae38
Cherry-pick #10845 into release
2025-02-17 18:46:22 +03:00
John Spray
c37e3020ab pageserver_api: fix decoding old-version TimelineInfo (#10845)
## Problem

In #10707 some new fields were introduced in TimelineInfo.

I forgot that we do not only use TimelineInfo for encoding, but also
decoding when the storage controller calls into a pageserver, so this
broke some calls from controller to pageserver while in a mixed-version
state.

## Summary of changes

- Make new fields have default behavior so that they are optional
2025-02-17 18:43:14 +03:00
Arseny Sher
a036708da1 Merge pull request #10820 from neondatabase/rc/release/2025-02-14
Storage release 2025-02-14
2025-02-14 19:36:36 +03:00
John Spray
1e7ad80ee7 storage controller: prioritize reconciles for user-facing operations (#10822)
## Problem

Some situations may produce a large number of pending reconciles. If we
experience an issue where reconciles are processed more slowly than
expected, that can prevent us responding promptly to user requests like
tenant/timeline CRUD.

This is a cleaner implementation of the hotfix in
https://github.com/neondatabase/neon/pull/10815

## Summary of changes

- Introduce a second semaphore for high priority tasks, with
configurable units (default 256). The intent is that in practical
situations these user-facing requests should never have to wait.
- Use the high priority semaphore for: tenant/timeline CRUD, and shard
splitting operations. Use normal priority for everything else.
2025-02-14 17:34:51 +03:00
John Spray
581be23100 storcon: fix eliding parameters from proxied URL labels (#10817)
## Problem

We had code for stripping IDs out of proxied paths to reduce cardinality
of metrics, but it was only stripping out tenant IDs, and leaving in
timeline IDs and query parameters (e.g. LSN in lsn->timestamp lookups).

## Summary of changes

- Use a more general regex approach.

There is still some risk that a future pageserver API might include a
parameter in `/the/path/`, but we control that API and it is not often
extended. We will also alert on metrics cardinality in staging so that
if we made that mistake we would notice.
2025-02-14 17:34:25 +03:00
github-actions[bot]
8ca7ea859d Storage release 2025-02-14 2025-02-14 06:02:05 +00:00
Christian Schwarz
efea8223bb Merge pull request #10774 from neondatabase/releases/2025-02-11-smgr-op-latency-metrics-hotfix 2025-02-11 21:16:44 +01:00
Christian Schwarz
d3d3bfc6d0 fix(page_service / batching): smgr op latency metric of dropped responses include flush time (#10756)
# Problem

Say we have a batch of 10 responses to send out.

Then, even with

- #10728

we've still only called observe_execution_end_flush_start for the first
3 responses.

The remaining 7 response timers are still ticking.

When compute now closes the connection, the waiting flush fails with an
error and we `drop()` the remaining 7 responses' smgr op timers. The
`impl Drop for SmgrOpTimer` will observe an execution time that includes
the flush time.

In practice, this is supsected to produce the `+Inf` observations in the
smgr op latency histogram we've seen since the introduction of
pipelining, even after shipping #10728.

refs:
- fixup of https://github.com/neondatabase/neon/pull/10042
- fixup of https://github.com/neondatabase/neon/pull/10728
- fixes https://github.com/neondatabase/neon/issues/10754
2025-02-11 20:25:13 +01:00
Christian Schwarz
b3a911ff8c fix(page_service / batching): smgr op latency metrics includes the flush time of preceding requests (#10728)
Before this PR, if a batch contains N responses, the smgr op latency
reported for response (N-i) would include the time we spent flushing
the preceding requests.

refs:
- fixup of https://github.com/neondatabase/neon/pull/10042
- fixes https://github.com/neondatabase/neon/issues/10674
2025-02-11 20:25:08 +01:00
John Spray
a54853abd5 Merge pull request #10712 from neondatabase/rc/release/2025-02-07
Storage release 2025-02-07
2025-02-07 18:21:13 +00:00
Arpad Müller
69007f7ac8 Revert recent AWS SDK update (#10724)
We've been seeing some regressions in staging since the AWS SDK updates:
https://github.com/neondatabase/neon/issues/10695 . We aren't sure the
regression was caused by the SDK update, but the issues do involve S3,
so it's not unlikely. By reverting the SDK update we find out whether it
was really the SDK update, or something else.

Reverts the two PRs:

* https://github.com/neondatabase/neon/pull/10588
* https://github.com/neondatabase/neon/pull/10699

https://neondb.slack.com/archives/C08C2G15M6U/p1738576986047179
2025-02-07 18:18:45 +00:00
github-actions[bot]
d255fa4b7e Storage release 2025-02-07 2025-02-07 06:02:18 +00:00
Arpad Müller
40d6b3a34e Merge pull request #10602 from neondatabase/rc/release/2025-01-31
Storage release 2025-01-31
2025-02-03 16:43:04 +01:00
github-actions[bot]
a018878e27 Storage release 2025-01-31 2025-01-31 06:02:08 +00:00
Christian Schwarz
e5b3eb1e64 Merge pull request #10500 from neondatabase/rc/release/2025-01-24
Storage release 2025-01-24
2025-01-25 00:54:56 +01:00
github-actions[bot]
f35e1356a1 Storage release 2025-01-24 2025-01-24 06:02:13 +00:00
Christian Schwarz
4dec0dddc6 Merge pull request #10447 from neondatabase/releases/2025-01-20-hotfix
Release: storage hotfix 2025-01-20
2025-01-20 15:55:44 +01:00
Christian Schwarz
e0c504af38 fix(page_service / handle): panic when parallel client disconnect & Timeline shutdown
Refs
- fixes https://github.com/neondatabase/neon/issues/10444
2025-01-20 14:37:16 +01:00
Alex Chi Z.
3399eea2ed Merge pull request #10436 from neondatabase/rc/release/2025-01-17
Storage release 2025-01-17
2025-01-17 12:36:17 -05:00
Alex Chi Z
6a29c809d5 Merge branch 'release' of https://github.com/neondatabase/neon into rc/release/2025-01-17 2025-01-17 10:44:25 -05:00
github-actions[bot]
a62c01df4c Storage release 2025-01-17 2025-01-17 06:02:11 +00:00
Vlad Lazar
4c093c6314 Merge pull request #10338 from neondatabase/rc/release/2025-01-10
Storage release 2025-01-10
2025-01-10 19:21:43 +00:00
github-actions[bot]
32f58f8228 Storage release 2025-01-10 2025-01-10 06:02:00 +00:00
Erik Grinaker
96c36c0894 Merge pull request #10263 from neondatabase/rc/release/2025-01-03
Storage release 2025-01-03
2025-01-03 20:32:37 +01:00
Erik Grinaker
d719709316 Revert "pageserver: revert flush backpressure (#8550) (#10135)" (#10270)
This reverts commit f3ecd5d76a.

It is
[suspected](https://neondb.slack.com/archives/C033RQ5SPDH/p1735907405716759)
to have caused significant read amplification in the [ingest
benchmark](https://neonprod.grafana.net/d/de3mupf4g68e8e/perf-test3a-ingest-benchmark?orgId=1&from=now-30d&to=now&timezone=utc&var-new_project_endpoint_id=ep-solitary-sun-w22bmut6&var-large_tenant_endpoint_id=ep-holy-bread-w203krzs)
(specifically during index creation).

We will revisit an intermediate improvement here to unblock [upload
parallelism](https://github.com/neondatabase/neon/issues/10096) before
properly addressing [compaction
backpressure](https://github.com/neondatabase/neon/issues/8390).
2025-01-03 16:51:16 +01:00
Erik Grinaker
97912f19fc pageserver,safekeeper: disable heap profiling (#10268)
## Problem

Since enabling continuous profiling in staging, we've seen frequent seg
faults. This is suspected to be because jemalloc and pprof-rs take a
stack trace at the same time, and the handlers aren't signal safe.
jemalloc does this probabilistically on every allocation, regardless of
whether someone is taking a heap profile, which means that any CPU
profile has a chance to cause a seg fault.

Touches #10225.

## Summary of changes

For now, just disable heap profiles -- CPU profiles are more important,
and we need to be able to take them without risking a crash.
2025-01-03 16:51:16 +01:00
github-actions[bot]
49724aa3b6 Storage release 2025-01-03 2025-01-03 06:02:03 +00:00
21 changed files with 246 additions and 113 deletions

1
Cargo.lock generated
View File

@@ -1303,6 +1303,7 @@ dependencies = [
"futures",
"http 1.1.0",
"indexmap 2.0.1",
"itertools 0.10.5",
"jsonwebtoken",
"metrics",
"nix 0.27.1",

View File

@@ -28,6 +28,7 @@ flate2.workspace = true
futures.workspace = true
http.workspace = true
indexmap.workspace = true
itertools.workspace = true
jsonwebtoken.workspace = true
metrics.workspace = true
nix.workspace = true

View File

@@ -60,12 +60,16 @@ use utils::failpoint_support;
// Compatibility hack: if the control plane specified any remote-ext-config
// use the default value for extension storage proxy gateway.
// Remove this once the control plane is updated to pass the gateway URL
fn parse_remote_ext_config(arg: &str) -> Result<String> {
if arg.starts_with("http") {
Ok(arg.trim_end_matches('/').to_string())
fn parse_remote_ext_base_url(arg: &str) -> Result<String> {
const FALLBACK_PG_EXT_GATEWAY_BASE_URL: &str =
"http://pg-ext-s3-gateway.pg-ext-s3-gateway.svc.cluster.local";
Ok(if arg.starts_with("http") {
arg
} else {
Ok("http://pg-ext-s3-gateway".to_string())
FALLBACK_PG_EXT_GATEWAY_BASE_URL
}
.to_owned())
}
#[derive(Parser)]
@@ -74,8 +78,10 @@ struct Cli {
#[arg(short = 'b', long, default_value = "postgres", env = "POSTGRES_PATH")]
pub pgbin: String,
#[arg(short = 'r', long, value_parser = parse_remote_ext_config)]
pub remote_ext_config: Option<String>,
/// The base URL for the remote extension storage proxy gateway.
/// Should be in the form of `http(s)://<gateway-hostname>[:<port>]`.
#[arg(short = 'r', long, value_parser = parse_remote_ext_base_url, alias = "remote-ext-config")]
pub remote_ext_base_url: Option<String>,
/// The port to bind the external listening HTTP server to. Clients running
/// outside the compute will talk to the compute through this port. Keep
@@ -164,7 +170,7 @@ fn main() -> Result<()> {
pgversion: get_pg_version_string(&cli.pgbin),
external_http_port: cli.external_http_port,
internal_http_port: cli.internal_http_port,
ext_remote_storage: cli.remote_ext_config.clone(),
remote_ext_base_url: cli.remote_ext_base_url.clone(),
resize_swap_on_bind: cli.resize_swap_on_bind,
set_disk_quota_for_fs: cli.set_disk_quota_for_fs,
#[cfg(target_os = "linux")]
@@ -265,4 +271,18 @@ mod test {
fn verify_cli() {
Cli::command().debug_assert()
}
#[test]
fn parse_pg_ext_gateway_base_url() {
let arg = "http://pg-ext-s3-gateway2";
let result = super::parse_remote_ext_base_url(arg).unwrap();
assert_eq!(result, arg);
let arg = "pg-ext-s3-gateway";
let result = super::parse_remote_ext_base_url(arg).unwrap();
assert_eq!(
result,
"http://pg-ext-s3-gateway.pg-ext-s3-gateway.svc.cluster.local"
);
}
}

View File

@@ -11,6 +11,7 @@ use compute_api::spec::{
use futures::StreamExt;
use futures::future::join_all;
use futures::stream::FuturesUnordered;
use itertools::Itertools;
use nix::sys::signal::{Signal, kill};
use nix::unistd::Pid;
use once_cell::sync::Lazy;
@@ -18,7 +19,7 @@ use postgres;
use postgres::NoTls;
use postgres::error::SqlState;
use remote_storage::{DownloadError, RemotePath};
use std::collections::HashMap;
use std::collections::{HashMap, HashSet};
use std::net::SocketAddr;
use std::os::unix::fs::{PermissionsExt, symlink};
use std::path::Path;
@@ -95,7 +96,7 @@ pub struct ComputeNodeParams {
pub internal_http_port: u16,
/// the address of extension storage proxy gateway
pub ext_remote_storage: Option<String>,
pub remote_ext_base_url: Option<String>,
}
/// Compute node info shared across several `compute_ctl` threads.
@@ -1896,9 +1897,9 @@ LIMIT 100",
real_ext_name: String,
ext_path: RemotePath,
) -> Result<u64, DownloadError> {
let ext_remote_storage =
let remote_ext_base_url =
self.params
.ext_remote_storage
.remote_ext_base_url
.as_ref()
.ok_or(DownloadError::BadInput(anyhow::anyhow!(
"Remote extensions storage is not configured",
@@ -1960,7 +1961,7 @@ LIMIT 100",
let download_size = extension_server::download_extension(
&real_ext_name,
&ext_path,
ext_remote_storage,
remote_ext_base_url,
&self.params.pgbin,
)
.await
@@ -1995,23 +1996,40 @@ LIMIT 100",
tokio::spawn(conn);
// TODO: support other types of grants apart from schemas?
let query = format!(
"GRANT {} ON SCHEMA {} TO {}",
privileges
.iter()
// should not be quoted as it's part of the command.
// is already sanitized so it's ok
.map(|p| p.as_str())
.collect::<Vec<&'static str>>()
.join(", "),
// quote the schema and role name as identifiers to sanitize them.
schema_name.pg_quote(),
role_name.pg_quote(),
);
db_client
.simple_query(&query)
// check the role grants first - to gracefully handle read-replicas.
let select = "SELECT privilege_type
FROM pg_namespace
JOIN LATERAL (SELECT * FROM aclexplode(nspacl) AS x) acl ON true
JOIN pg_user users ON acl.grantee = users.usesysid
WHERE users.usename = $1
AND nspname = $2";
let rows = db_client
.query(select, &[role_name, schema_name])
.await
.with_context(|| format!("Failed to execute query: {}", query))?;
.with_context(|| format!("Failed to execute query: {select}"))?;
let already_granted: HashSet<String> = rows.into_iter().map(|row| row.get(0)).collect();
let grants = privileges
.iter()
.filter(|p| !already_granted.contains(p.as_str()))
// should not be quoted as it's part of the command.
// is already sanitized so it's ok
.map(|p| p.as_str())
.join(", ");
if !grants.is_empty() {
// quote the schema and role name as identifiers to sanitize them.
let schema_name = schema_name.pg_quote();
let role_name = role_name.pg_quote();
let query = format!("GRANT {grants} ON SCHEMA {schema_name} TO {role_name}",);
db_client
.simple_query(&query)
.await
.with_context(|| format!("Failed to execute query: {}", query))?;
}
Ok(())
}
@@ -2069,7 +2087,7 @@ LIMIT 100",
&self,
spec: &ComputeSpec,
) -> Result<RemoteExtensionMetrics> {
if self.params.ext_remote_storage.is_none() {
if self.params.remote_ext_base_url.is_none() {
return Ok(RemoteExtensionMetrics {
num_ext_downloaded: 0,
largest_ext_size: 0,

View File

@@ -158,14 +158,14 @@ fn parse_pg_version(human_version: &str) -> PostgresMajorVersion {
pub async fn download_extension(
ext_name: &str,
ext_path: &RemotePath,
ext_remote_storage: &str,
remote_ext_base_url: &str,
pgbin: &str,
) -> Result<u64> {
info!("Download extension {:?} from {:?}", ext_name, ext_path);
// TODO add retry logic
let download_buffer =
match download_extension_tar(ext_remote_storage, &ext_path.to_string()).await {
match download_extension_tar(remote_ext_base_url, &ext_path.to_string()).await {
Ok(buffer) => buffer,
Err(error_message) => {
return Err(anyhow::anyhow!(
@@ -272,8 +272,8 @@ pub fn create_control_files(remote_extensions: &RemoteExtSpec, pgbin: &str) {
// Do request to extension storage proxy, e.g.,
// curl http://pg-ext-s3-gateway/latest/v15/extensions/anon.tar.zst
// using HTTP GET and return the response body as bytes.
async fn download_extension_tar(ext_remote_storage: &str, ext_path: &str) -> Result<Bytes> {
let uri = format!("{}/{}", ext_remote_storage, ext_path);
async fn download_extension_tar(remote_ext_base_url: &str, ext_path: &str) -> Result<Bytes> {
let uri = format!("{}/{}", remote_ext_base_url, ext_path);
let filename = Path::new(ext_path)
.file_name()
.unwrap_or_else(|| std::ffi::OsStr::new("unknown"))

View File

@@ -22,7 +22,7 @@ pub(in crate::http) async fn download_extension(
State(compute): State<Arc<ComputeNode>>,
) -> Response {
// Don't even try to download extensions if no remote storage is configured
if compute.params.ext_remote_storage.is_none() {
if compute.params.remote_ext_base_url.is_none() {
return JsonResponse::error(
StatusCode::PRECONDITION_FAILED,
"remote storage is not configured",

View File

@@ -644,9 +644,10 @@ struct EndpointStartCmdArgs {
#[clap(
long,
help = "Configure the remote extensions storage proxy gateway to request for extensions."
help = "Configure the remote extensions storage proxy gateway URL to request for extensions.",
alias = "remote-ext-config"
)]
remote_ext_config: Option<String>,
remote_ext_base_url: Option<String>,
#[clap(
long,
@@ -1414,7 +1415,7 @@ async fn handle_endpoint(subcmd: &EndpointCmd, env: &local_env::LocalEnv) -> Res
EndpointCmd::Start(args) => {
let endpoint_id = &args.endpoint_id;
let pageserver_id = args.endpoint_pageserver_id;
let remote_ext_config = &args.remote_ext_config;
let remote_ext_base_url = &args.remote_ext_base_url;
let default_generation = env
.storage_controller
@@ -1517,7 +1518,7 @@ async fn handle_endpoint(subcmd: &EndpointCmd, env: &local_env::LocalEnv) -> Res
safekeepers_generation,
safekeepers,
pageservers,
remote_ext_config.as_ref(),
remote_ext_base_url.as_ref(),
stripe_size.0 as usize,
args.create_test_user,
args.start_timeout,

View File

@@ -655,7 +655,7 @@ impl Endpoint {
safekeepers_generation: Option<SafekeeperGeneration>,
safekeepers: Vec<NodeId>,
pageservers: Vec<(Host, u16)>,
remote_ext_config: Option<&String>,
remote_ext_base_url: Option<&String>,
shard_stripe_size: usize,
create_test_user: bool,
start_timeout: Duration,
@@ -825,8 +825,8 @@ impl Endpoint {
.stderr(logfile.try_clone()?)
.stdout(logfile);
if let Some(remote_ext_config) = remote_ext_config {
cmd.args(["--remote-ext-config", remote_ext_config]);
if let Some(remote_ext_base_url) = remote_ext_base_url {
cmd.args(["--remote-ext-base-url", remote_ext_base_url]);
}
let child = cmd.spawn()?;

View File

@@ -1277,6 +1277,8 @@ impl Timeline {
return Ok(CompactionOutcome::YieldForL0);
}
let gc_cutoff = *self.applied_gc_cutoff_lsn.read();
// 2. Repartition and create image layers if necessary
match self
.repartition(
@@ -1287,7 +1289,7 @@ impl Timeline {
)
.await
{
Ok(((dense_partitioning, sparse_partitioning), lsn)) => {
Ok(((dense_partitioning, sparse_partitioning), lsn)) if lsn >= gc_cutoff => {
// Disables access_stats updates, so that the files we read remain candidates for eviction after we're done with them
let image_ctx = RequestContextBuilder::from(ctx)
.access_stats_behavior(AccessStatsBehavior::Skip)
@@ -1341,6 +1343,10 @@ impl Timeline {
}
}
Ok(_) => {
info!("skipping repartitioning due to image compaction LSN being below GC cutoff");
}
// Suppress errors when cancelled.
Err(_) if self.cancel.is_cancelled() => {}
Err(err) if err.is_cancel() => {}
@@ -3606,6 +3612,13 @@ impl Timeline {
last_key = Some(key);
}
accumulated_values.push((key, lsn, val));
if accumulated_values.len() >= 65536 {
// Assume all of them are images, that would be 512MB of data in memory for a single key.
return Err(CompactionError::Other(anyhow!(
"too many values for a single key, giving up gc-compaction"
)));
}
} else {
let last_key: &mut Key = last_key.as_mut().unwrap();
stat.on_unique_key_visited(); // TODO: adjust statistics for partial compaction

View File

@@ -425,15 +425,12 @@ compact_prefetch_buffers(void)
* point inside and outside PostgreSQL.
*
* This still does throw errors when it receives malformed responses from PS.
*
* When we're not called from CHECK_FOR_INTERRUPTS (indicated by
* IsHandlingInterrupts) we also report we've ended prefetch receive work,
* just in case state tracking was lost due to an error in the sync getPage
* response code.
*/
void
communicator_prefetch_pump_state(bool IsHandlingInterrupts)
communicator_prefetch_pump_state(void)
{
START_PREFETCH_RECEIVE_WORK();
while (MyPState->ring_receive != MyPState->ring_flush)
{
NeonResponse *response;
@@ -482,9 +479,7 @@ communicator_prefetch_pump_state(bool IsHandlingInterrupts)
}
}
/* We never pump the prefetch state while handling other pages */
if (!IsHandlingInterrupts)
END_PREFETCH_RECEIVE_WORK();
END_PREFETCH_RECEIVE_WORK();
communicator_reconfigure_timeout_if_needed();
}
@@ -672,9 +667,10 @@ prefetch_wait_for(uint64 ring_index)
Assert(MyPState->ring_unused > ring_index);
START_PREFETCH_RECEIVE_WORK();
while (MyPState->ring_receive <= ring_index)
{
START_PREFETCH_RECEIVE_WORK();
entry = GetPrfSlot(MyPState->ring_receive);
Assert(entry->status == PRFS_REQUESTED);
@@ -683,17 +679,18 @@ prefetch_wait_for(uint64 ring_index)
result = false;
break;
}
END_PREFETCH_RECEIVE_WORK();
CHECK_FOR_INTERRUPTS();
}
if (result)
{
/* Check that slot is actually received (srver can be disconnected in prefetch_pump_state called from CHECK_FOR_INTERRUPTS */
PrefetchRequest *slot = GetPrfSlot(ring_index);
return slot->status == PRFS_RECEIVED;
result = slot->status == PRFS_RECEIVED;
}
return false;
END_PREFETCH_RECEIVE_WORK();
return result;
;
}
@@ -720,6 +717,7 @@ prefetch_read(PrefetchRequest *slot)
Assert(slot->status == PRFS_REQUESTED);
Assert(slot->response == NULL);
Assert(slot->my_ring_index == MyPState->ring_receive);
Assert(readpage_reentrant_guard);
if (slot->status != PRFS_REQUESTED ||
slot->response != NULL ||
@@ -802,6 +800,7 @@ communicator_prefetch_receive(BufferTag tag)
PrfHashEntry *entry;
PrefetchRequest hashkey;
Assert(readpage_reentrant_guard);
hashkey.buftag = tag;
entry = prfh_lookup(MyPState->prf_hash, &hashkey);
if (entry != NULL && prefetch_wait_for(entry->slot->my_ring_index))
@@ -821,8 +820,12 @@ communicator_prefetch_receive(BufferTag tag)
void
prefetch_on_ps_disconnect(void)
{
bool save_readpage_reentrant_guard = readpage_reentrant_guard;
MyPState->ring_flush = MyPState->ring_unused;
/* Prohibit callig of prefetch_pump_state */
START_PREFETCH_RECEIVE_WORK();
while (MyPState->ring_receive < MyPState->ring_unused)
{
PrefetchRequest *slot;
@@ -851,6 +854,9 @@ prefetch_on_ps_disconnect(void)
MyNeonCounters->getpage_prefetch_discards_total += 1;
}
/* Restore guard */
readpage_reentrant_guard = save_readpage_reentrant_guard;
/*
* We can have gone into retry due to network error, so update stats with
* the latest available
@@ -2509,7 +2515,7 @@ communicator_processinterrupts(void)
if (timeout_signaled)
{
if (!readpage_reentrant_guard && readahead_getpage_pull_timeout_ms > 0)
communicator_prefetch_pump_state(true);
communicator_prefetch_pump_state();
timeout_signaled = false;
communicator_reconfigure_timeout_if_needed();

View File

@@ -44,7 +44,7 @@ extern int communicator_read_slru_segment(SlruKind kind, int64 segno,
void *buffer);
extern void communicator_reconfigure_timeout_if_needed(void);
extern void communicator_prefetch_pump_state(bool IsHandlingInterrupts);
extern void communicator_prefetch_pump_state(void);
#endif

View File

@@ -1179,7 +1179,7 @@ neon_prefetch(SMgrRelation reln, ForkNumber forknum, BlockNumber blocknum,
blocknum += iterblocks;
}
communicator_prefetch_pump_state(false);
communicator_prefetch_pump_state();
return false;
}
@@ -1218,7 +1218,7 @@ neon_prefetch(SMgrRelation reln, ForkNumber forknum, BlockNumber blocknum)
communicator_prefetch_register_bufferv(tag, NULL, 1, NULL);
communicator_prefetch_pump_state(false);
communicator_prefetch_pump_state();
return false;
}
@@ -1262,7 +1262,7 @@ neon_writeback(SMgrRelation reln, ForkNumber forknum,
*/
neon_log(SmgrTrace, "writeback noop");
communicator_prefetch_pump_state(false);
communicator_prefetch_pump_state();
#ifdef DEBUG_COMPARE_LOCAL
if (IS_LOCAL_REL(reln))
@@ -1315,7 +1315,7 @@ neon_read(SMgrRelation reln, ForkNumber forkNum, BlockNumber blkno, void *buffer
}
/* Try to read PS results if they are available */
communicator_prefetch_pump_state(false);
communicator_prefetch_pump_state();
neon_get_request_lsns(InfoFromSMgrRel(reln), forkNum, blkno, &request_lsns, 1);
@@ -1339,7 +1339,7 @@ neon_read(SMgrRelation reln, ForkNumber forkNum, BlockNumber blkno, void *buffer
/*
* Try to receive prefetch results once again just to make sure we don't leave the smgr code while the OS might still have buffered bytes.
*/
communicator_prefetch_pump_state(false);
communicator_prefetch_pump_state();
#ifdef DEBUG_COMPARE_LOCAL
if (forkNum == MAIN_FORKNUM && IS_LOCAL_REL(reln))
@@ -1449,7 +1449,7 @@ neon_readv(SMgrRelation reln, ForkNumber forknum, BlockNumber blocknum,
nblocks, PG_IOV_MAX);
/* Try to read PS results if they are available */
communicator_prefetch_pump_state(false);
communicator_prefetch_pump_state();
neon_get_request_lsns(InfoFromSMgrRel(reln), forknum, blocknum,
request_lsns, nblocks);
@@ -1480,7 +1480,7 @@ neon_readv(SMgrRelation reln, ForkNumber forknum, BlockNumber blocknum,
/*
* Try to receive prefetch results once again just to make sure we don't leave the smgr code while the OS might still have buffered bytes.
*/
communicator_prefetch_pump_state(false);
communicator_prefetch_pump_state();
#ifdef DEBUG_COMPARE_LOCAL
if (forknum == MAIN_FORKNUM && IS_LOCAL_REL(reln))
@@ -1665,7 +1665,7 @@ neon_write(SMgrRelation reln, ForkNumber forknum, BlockNumber blocknum, const vo
lfc_write(InfoFromSMgrRel(reln), forknum, blocknum, buffer);
communicator_prefetch_pump_state(false);
communicator_prefetch_pump_state();
#ifdef DEBUG_COMPARE_LOCAL
if (IS_LOCAL_REL(reln))
@@ -1727,7 +1727,7 @@ neon_writev(SMgrRelation reln, ForkNumber forknum, BlockNumber blkno,
lfc_writev(InfoFromSMgrRel(reln), forknum, blkno, buffers, nblocks);
communicator_prefetch_pump_state(false);
communicator_prefetch_pump_state();
#ifdef DEBUG_COMPARE_LOCAL
if (IS_LOCAL_REL(reln))
@@ -1902,7 +1902,7 @@ neon_immedsync(SMgrRelation reln, ForkNumber forknum)
neon_log(SmgrTrace, "[NEON_SMGR] immedsync noop");
communicator_prefetch_pump_state(false);
communicator_prefetch_pump_state();
#ifdef DEBUG_COMPARE_LOCAL
if (IS_LOCAL_REL(reln))

View File

@@ -3886,10 +3886,10 @@ impl Service {
None
} else if safekeepers {
// Note that we do not support creating the timeline on the safekeepers
// for imported timelines. The `start_lsn` of the timeline is not known
// until the import finshes.
// https://github.com/neondatabase/neon/issues/11569
// Note that for imported timelines, we do not create the timeline on the safekeepers
// straight away. Instead, we do it once the import finalized such that we know what
// start LSN to provide for the safekeepers. This is done in
// [`Self::finalize_timeline_import`].
let res = self
.tenant_timeline_create_safekeepers(tenant_id, &timeline_info)
.instrument(tracing::info_span!("timeline_create_safekeepers", %tenant_id, timeline_id=%timeline_info.timeline_id))
@@ -3966,11 +3966,22 @@ impl Service {
let active = self.timeline_active_on_all_shards(&import).await?;
match active {
true => {
Some(timeline_info) => {
tracing::info!("Timeline became active on all shards");
if self.config.timelines_onto_safekeepers {
// Now that we know the start LSN of this timeline, create it on the
// safekeepers.
self.tenant_timeline_create_safekeepers_until_success(
import.tenant_id,
timeline_info,
)
.await?;
}
break;
}
false => {
None => {
tracing::info!("Timeline not active on all shards yet");
tokio::select! {
@@ -4004,9 +4015,6 @@ impl Service {
.range_mut(TenantShardId::tenant_range(import.tenant_id))
.for_each(|(_id, shard)| shard.importing = TimelineImportState::Idle);
// TODO(vlad): Timeline creations in import mode do not return a correct initdb lsn,
// so we can't create the timeline on the safekeepers. Fix by moving creation here.
// https://github.com/neondatabase/neon/issues/11569
tracing::info!(%import_failed, "Timeline import complete");
Ok(())
@@ -4021,10 +4029,16 @@ impl Service {
.await;
}
/// If the timeline is active on all shards, returns the [`TimelineInfo`]
/// collected from shard 0.
///
/// An error is returned if the shard layout has changed during the import.
/// This is guarded against within the storage controller and the pageserver,
/// and, therefore, unexpected.
async fn timeline_active_on_all_shards(
self: &Arc<Self>,
import: &TimelineImport,
) -> anyhow::Result<bool> {
) -> anyhow::Result<Option<TimelineInfo>> {
let targets = {
let locked = self.inner.read().unwrap();
let mut targets = Vec::new();
@@ -4048,13 +4062,17 @@ impl Service {
.expect("Pageservers may not be deleted while referenced");
targets.push((*tenant_shard_id, node.clone()));
} else {
return Ok(false);
return Ok(None);
}
}
targets
};
if targets.is_empty() {
anyhow::bail!("No shards found to finalize import for");
}
let results = self
.tenant_for_shards_api(
targets,
@@ -4070,10 +4088,17 @@ impl Service {
)
.await;
Ok(results.into_iter().all(|res| match res {
let all_active = results.iter().all(|res| match res {
Ok(info) => info.state == TimelineState::Active,
Err(_) => false,
}))
});
if all_active {
// Both unwraps are validated above
Ok(Some(results.into_iter().next().unwrap().unwrap()))
} else {
Ok(None)
}
}
pub(crate) async fn tenant_timeline_archival_config(

View File

@@ -323,6 +323,42 @@ impl Service {
})
}
pub(crate) async fn tenant_timeline_create_safekeepers_until_success(
self: &Arc<Self>,
tenant_id: TenantId,
timeline_info: TimelineInfo,
) -> anyhow::Result<()> {
const BACKOFF: Duration = Duration::from_secs(5);
loop {
if self.cancel.is_cancelled() {
anyhow::bail!("Shut down requested while finalizing import");
}
let res = self
.tenant_timeline_create_safekeepers(tenant_id, &timeline_info)
.await;
match res {
Ok(_) => {
tracing::info!("Timeline created on safekeepers");
break;
}
Err(err) => {
tracing::error!("Failed to create timeline on safekeepers: {err}");
tokio::select! {
_ = self.cancel.cancelled() => {
anyhow::bail!("Shut down requested while finalizing import");
},
_ = tokio::time::sleep(BACKOFF) => {}
};
}
}
}
Ok(())
}
/// Directly insert the timeline into the database without reconciling it with safekeepers.
///
/// Useful if the timeline already exists on the specified safekeepers,

View File

@@ -557,7 +557,7 @@ class NeonLocalCli(AbstractNeonCli):
endpoint_id: str,
safekeepers_generation: int | None = None,
safekeepers: list[int] | None = None,
remote_ext_config: str | None = None,
remote_ext_base_url: str | None = None,
pageserver_id: int | None = None,
allow_multiple: bool = False,
create_test_user: bool = False,
@@ -572,8 +572,8 @@ class NeonLocalCli(AbstractNeonCli):
extra_env_vars = env or {}
if basebackup_request_tries is not None:
extra_env_vars["NEON_COMPUTE_TESTING_BASEBACKUP_TRIES"] = str(basebackup_request_tries)
if remote_ext_config is not None:
args.extend(["--remote-ext-config", remote_ext_config])
if remote_ext_base_url is not None:
args.extend(["--remote-ext-base-url", remote_ext_base_url])
if safekeepers_generation is not None:
args.extend(["--safekeepers-generation", str(safekeepers_generation)])

View File

@@ -4195,7 +4195,7 @@ class Endpoint(PgProtocol, LogUtils):
def start(
self,
remote_ext_config: str | None = None,
remote_ext_base_url: str | None = None,
pageserver_id: int | None = None,
safekeeper_generation: int | None = None,
safekeepers: list[int] | None = None,
@@ -4221,7 +4221,7 @@ class Endpoint(PgProtocol, LogUtils):
self.endpoint_id,
safekeepers_generation=safekeeper_generation,
safekeepers=self.active_safekeepers,
remote_ext_config=remote_ext_config,
remote_ext_base_url=remote_ext_base_url,
pageserver_id=pageserver_id,
allow_multiple=allow_multiple,
create_test_user=create_test_user,
@@ -4436,7 +4436,7 @@ class Endpoint(PgProtocol, LogUtils):
hot_standby: bool = False,
lsn: Lsn | None = None,
config_lines: list[str] | None = None,
remote_ext_config: str | None = None,
remote_ext_base_url: str | None = None,
pageserver_id: int | None = None,
allow_multiple: bool = False,
basebackup_request_tries: int | None = None,
@@ -4455,7 +4455,7 @@ class Endpoint(PgProtocol, LogUtils):
pageserver_id=pageserver_id,
allow_multiple=allow_multiple,
).start(
remote_ext_config=remote_ext_config,
remote_ext_base_url=remote_ext_base_url,
pageserver_id=pageserver_id,
allow_multiple=allow_multiple,
basebackup_request_tries=basebackup_request_tries,
@@ -4539,7 +4539,7 @@ class EndpointFactory:
lsn: Lsn | None = None,
hot_standby: bool = False,
config_lines: list[str] | None = None,
remote_ext_config: str | None = None,
remote_ext_base_url: str | None = None,
pageserver_id: int | None = None,
basebackup_request_tries: int | None = None,
) -> Endpoint:
@@ -4559,7 +4559,7 @@ class EndpointFactory:
hot_standby=hot_standby,
config_lines=config_lines,
lsn=lsn,
remote_ext_config=remote_ext_config,
remote_ext_base_url=remote_ext_base_url,
pageserver_id=pageserver_id,
basebackup_request_tries=basebackup_request_tries,
)
@@ -4613,7 +4613,10 @@ class EndpointFactory:
return self
def new_replica(
self, origin: Endpoint, endpoint_id: str, config_lines: list[str] | None = None
self,
origin: Endpoint,
endpoint_id: str | None = None,
config_lines: list[str] | None = None,
):
branch_name = origin.branch_name
assert origin in self.endpoints
@@ -4629,7 +4632,10 @@ class EndpointFactory:
)
def new_replica_start(
self, origin: Endpoint, endpoint_id: str, config_lines: list[str] | None = None
self,
origin: Endpoint,
endpoint_id: str | None = None,
config_lines: list[str] | None = None,
):
branch_name = origin.branch_name
assert origin in self.endpoints

View File

@@ -221,7 +221,7 @@ def test_remote_extensions(
endpoint.create_remote_extension_spec(spec)
endpoint.start(remote_ext_config=extensions_endpoint)
endpoint.start(remote_ext_base_url=extensions_endpoint)
with endpoint.connect() as conn:
with conn.cursor() as cur:
@@ -249,7 +249,7 @@ def test_remote_extensions(
# Remove the extension files to force a redownload of the extension.
extension.remove(test_output_dir, pg_version)
endpoint.start(remote_ext_config=extensions_endpoint)
endpoint.start(remote_ext_base_url=extensions_endpoint)
# Test that ALTER EXTENSION UPDATE statements also fetch remote extensions.
with endpoint.connect() as conn:

View File

@@ -24,6 +24,7 @@ from fixtures.utils import (
skip_in_debug_build,
wait_until,
)
from fixtures.workload import Workload
from mypy_boto3_kms import KMSClient
from mypy_boto3_kms.type_defs import EncryptResponseTypeDef
from mypy_boto3_s3 import S3Client
@@ -97,6 +98,10 @@ def test_pgdata_import_smoke(
f"http://{cplane_mgmt_api_server.host}:{cplane_mgmt_api_server.port}/storage/api/v1/"
)
if neon_env_builder.storage_controller_config is None:
neon_env_builder.storage_controller_config = {}
neon_env_builder.storage_controller_config["timelines_onto_safekeepers"] = True
env = neon_env_builder.init_start()
# The test needs LocalFs support, which is only built in testing mode.
@@ -286,34 +291,28 @@ def test_pgdata_import_smoke(
#
# validate that we can write
#
rw_endpoint = env.endpoints.create_start(
branch_name=import_branch_name,
endpoint_id="rw",
tenant_id=tenant_id,
config_lines=ep_config,
)
rw_endpoint.safe_psql("create table othertable(values text)")
rw_lsn = Lsn(rw_endpoint.safe_psql_scalar("select pg_current_wal_flush_lsn()"))
workload = Workload(env, tenant_id, timeline_id, branch_name=import_branch_name)
workload.init()
workload.write_rows(64)
workload.validate()
# TODO: consider using `class Workload` here
# to do compaction and whatnot?
rw_lsn = Lsn(workload.endpoint().safe_psql_scalar("select pg_current_wal_flush_lsn()"))
#
# validate that we can branch (important use case)
#
# ... at the tip
_ = env.create_branch(
child_timeline_id = env.create_branch(
new_branch_name="br-tip",
ancestor_branch_name=import_branch_name,
tenant_id=tenant_id,
ancestor_start_lsn=rw_lsn,
)
br_tip_endpoint = env.endpoints.create_start(
branch_name="br-tip", endpoint_id="br-tip-ro", tenant_id=tenant_id, config_lines=ep_config
)
validate_vanilla_equivalence(br_tip_endpoint)
br_tip_endpoint.safe_psql("select * from othertable")
child_workload = workload.branch(timeline_id=child_timeline_id, branch_name="br-tip")
child_workload.validate()
validate_vanilla_equivalence(child_workload.endpoint())
# ... at the initdb lsn
_ = env.create_branch(
@@ -330,7 +329,7 @@ def test_pgdata_import_smoke(
)
validate_vanilla_equivalence(br_initdb_endpoint)
with pytest.raises(psycopg2.errors.UndefinedTable):
br_initdb_endpoint.safe_psql("select * from othertable")
br_initdb_endpoint.safe_psql(f"select * from {workload.table}")
@run_only_on_default_postgres(reason="PG version is irrelevant here")

View File

@@ -39,3 +39,10 @@ def test_role_grants(neon_simple_env: NeonEnv):
res = cur.fetchall()
assert res == [(1,)], "select should not succeed"
# confirm that replicas can also ensure the grants are correctly set.
replica = env.endpoints.new_replica_start(endpoint)
replica_client = replica.http_client()
replica_client.set_role_grants(
"test_role_grants", "test_role", "test_schema", ["CREATE", "USAGE"]
)

View File

@@ -5,7 +5,7 @@
],
"v16": [
"16.8",
"05ddf212e2e07b788b5c8b88bdcf98630941f6ae"
"d72d76f2cdee4194dd052ce099e9784aca7c794a"
],
"v15": [
"15.12",