Compare commits

...

58 Commits

Author SHA1 Message Date
Erik Grinaker
7475bf4a34 pageserver: fix tenant::storage_layer::layer tests 2024-12-23 12:53:21 +01:00
Alex Chi Z.
9c53b41245 fix(pageserver): update remote latest_gc_cutoff after gc-compaction (#10209)
## Problem

close https://github.com/neondatabase/neon/issues/10208
part of #9114 

## Summary of changes

* Ensure remote `latest_gc_cutoff` is up-to-date before removing any
files for gc-compaction.

Signed-off-by: Alex Chi Z <chi@neon.tech>
2024-12-19 18:40:20 +00:00
Konstantin Knizhnik
197a89ab3d Increase default stotrage controller heartbeat interval from 100msec … (#10206)
## Problem

Currently default value of storage controller heartbeat interval is
100msec. It means that 10 times per second it establish connection to
PS. And it seems to be quite expensive.
At MacOS right now storage_controller consumes 70% CPU and trusts - 30%.
So together they completely utilize one core.
A lot of us has Macs. Let's save environment a little bit and do not
waste electricity and contribute to global warming.

By the way, on prod we have interval  10seconds 

## Summary of changes

Increase heartbeat interval from 100msec to 1 second.

Co-authored-by: Konstantin Knizhnik <knizhnik@neon.tech>
2024-12-19 18:32:32 +00:00
Alex Chi Z.
b89e02f3e8 fix(pageserver): consider partial compaction layer map in layer check (#10044)
## Problem

In https://github.com/neondatabase/neon/pull/9897 we temporarily
disabled the layer valid check because the current one only considers
the end result of all compaction algorithms, but partial gc-compaction
would temporarily produce an "invalid" layer map.

part of https://github.com/neondatabase/neon/issues/9114

## Summary of changes

Allow LSN splits to overlap in the slow path check. Currently, the valid
check is only used in storage scrubber (background job) and during
gc-compaction (without taking layer lock). Therefore, it's fine for such
checks to be a little bit inefficient but more accurate.

---------

Signed-off-by: Alex Chi Z <chi@neon.tech>
Co-authored-by: Arpad Müller <arpad-m@users.noreply.github.com>
2024-12-19 18:04:53 +00:00
Konstantin Knizhnik
04517c6ff3 Do not reload config file on PS reconnect (#10204)
## Problem

See https://github.com/neondatabase/neon/issues/10184
and
https://neondb.slack.com/archives/C04DGM6SMTM/p1733997259898819

Reloading config file inside parallel worker cause it's termination

## Summary of changes

Remove call of `HandleMainLoopInterrupts()` 
Update of page server URL is propagated by postmaster through shared
memory and we should not reload config for it.

Co-authored-by: Konstantin Knizhnik <knizhnik@neon.tech>
2024-12-19 15:22:39 +00:00
Vlad Lazar
628451d68e safekeeper: short-circuit interpreted wal sender (#10202)
## Problem

Safekeeper may currently send a batch to the pageserver even if it
hasn't decoded a new record.
I think this is quite unlikely in the field, but worth adressing.

## Summary of changes

Don't send anything if we haven't decoded a full record. Once this
merges and releases, the `InterpretedWalRecords` struct can be updated
to remove the Option wrapper for `next_record_lsn`.
2024-12-19 14:04:46 +00:00
Vlad Lazar
502d512fe2 safekeeper: lift benchmarking utils into safekeeper crate (#10200)
## Problem

The benchmarking utilities are also useful for testing. We want to write
tests in the safekeeper crate.

## Summary of changes

This commit lifts the utils to the safekeeper crate. They are compiled
if the benchmarking features is enabled or if in test mode.
2024-12-19 14:04:42 +00:00
John Spray
afda6d4700 storage_scrubber: don't report half-created timelines as corruption (#10198)
## Problem

test_timeline_archival_chaos does timeline creation with failure
injection, and thereby sometimes leaves timelines in a part created
state. This was being reported as corruption by the scrubber on test
teardown, because it considered a layer without an index to be an
invalid state. This was incorrect: the scrubber should accept this
state, it occurs legitimately during timeline creation.

Closes: https://github.com/neondatabase/neon/issues/9988

## Summary of changes

- Report a timeline with layers but no index as Relic rather than
MissingIndexPart.
- We retain the MissingIndexPart variant for the case where an index
_was_ found in the listing, but was not found by a subsequent GET, i.e.
racing with deletion.
2024-12-19 12:55:05 +00:00
John Spray
65042cbadd tests: use high IO concurrency in test_pgdata_import_smoke, use effective_io_concurrency=2 in tests by default (#10114)
## Problem

`test_pgdata_import_smoke` writes two gigabytes of pages and then reads
them back serially. This is CPU bottlenecked and results in a long
runtime, and sensitivity to CPU load from other tests on the same
machine.

Closes: https://github.com/neondatabase/neon/issues/10071

## Summary of changes

- Use effective_io_concurrency=32 when doing sequential scans through
2GiB of pages in test_pgdata_import_smoke. This is a ~10x runtime
decrease in the parts of the test that do sequential scans.
- Also set `effective_io_concurrency=2` for tests, as I noticed while
debugging that we were doing all getpage requests serially, which is bad
for checking the stability of the batching code.
2024-12-19 10:58:49 +00:00
Folke Behrens
b135194090 proxy: Delay SASL complete message until auth is done (#10189)
The final SASL complete message can be bundled with the remainder of the
auth flow messages until ReadyForQuery.

neondatabase/cloud#19184
2024-12-19 10:37:08 +00:00
Peter Bendel
43dc03459d Run pgbench on 10 GB scale factor on database with n relations (e.g. 10k) (#10172)
## Problem

We want to verify how much / if pgbench throughput and latency on Neon
suffers if the database contains many other relations, too.

## Summary of changes

Modify the benchmarking.yml pgbench-compare job to
- create an addiitional project at scale factor 10 GiB
- before running pgbench add n tables (initially 10k) to the database
- then compare the pgbench throughput and latency to the existing
pgbench-compare at 10 Gib scale factor

We use a realistic template for the n relations that is a partitioned
table with some realistic data types, indexes and constraints - similar
to a table that we use internally.

Example run:
https://github.com/neondatabase/neon/actions/runs/12377565956/job/34547386959
2024-12-19 10:25:44 +00:00
Christian Schwarz
a1b0558493 fast import: importer: use aws s3 cli (#10162)
## Problem

s5cmd doesn't pick up the pod service account

```
2024/12/16 16:26:01 Ignoring, HTTP credential provider invalid endpoint host, "169.254.170.23", only loopback hosts are allowed. <nil>
ERROR "ls s3://neon-dev-bulk-import-us-east-2/import-pgdata/fast-import/v1/br-wandering-hall-w2xobawv": NoCredentialProviders: no valid providers in chain. Deprecated. For verbose messaging see aws.Config.CredentialsChainVerboseErrors
```

## Summary of changes

Switch to offical CLI.


## Testing

Tested the pre-merge image in staging, using `job_image` override in
project settings.


https://neondb.slack.com/archives/C033RQ5SPDH/p1734554944391949?thread_ts=1734368383.258759&cid=C033RQ5SPDH

## Future Work

Switch back to s5cmd once https://github.com/peak/s5cmd/pull/769 gets
merged.

## Refs

- fixes https://github.com/neondatabase/cloud/issues/21876

---------

Co-authored-by: Gleb Novikov <NanoBjorn@users.noreply.github.com>
2024-12-19 10:04:17 +00:00
Alex Chi Z.
cc138b56f9 fix(pageserver): run psql in thread to avoid blocking (#10177)
## Problem

ref https://github.com/neondatabase/neon/issues/10170
ref https://github.com/neondatabase/neon/issues/9994

The psql command will block the main thread, causing other async tasks
to timeout (i.e., HTTP connect). Therefore, we need to move it to an I/O
executor thread.

## Summary of changes

* run psql connection in a thread

---------

Signed-off-by: Alex Chi Z <chi@neon.tech>
Co-authored-by: John Spray <john@neon.tech>
2024-12-19 09:45:06 +00:00
Konstantin Knizhnik
61fcf64c22 Fix flukyness of test_physical_and_logical_replicaiton.py (#10176)
## Problem

See https://github.com/neondatabase/neon/issues/10037
test_physical_and_logical_replication.py sometimes failed.

## Summary of changes

Add `wait_replica_caughtup` to wait for replica sync

Co-authored-by: Konstantin Knizhnik <knizhnik@neon.tech>
2024-12-18 19:15:38 +00:00
Alex Chi Z.
6d3e8096fc refactor(test): tighten up test_gc_feedback (#10126)
## Problem

In https://github.com/neondatabase/neon/pull/8103 we changed the test
case to have more test coverage of gc_compaction. Now that we have
`test_gc_compaction_smoke`, we can revert this test case to serve its
original purpose and revert the parameter changes.

part of https://github.com/neondatabase/neon/issues/9114

## Summary of changes

* Revert pitr_interval from 60s to 10s.
* Assert the physical/logical size ratio in the benchmark.

---------

Signed-off-by: Alex Chi Z <chi@neon.tech>
Co-authored-by: Arpad Müller <arpad-m@users.noreply.github.com>
2024-12-18 18:10:05 +00:00
Alex Chi Z.
3d1c3a80ae feat(pageserver): add compact queue http endpoint (#10173)
## Problem

We cannot get the size of the compaction queue and access the info.

Part of #9114 

## Summary of changes

* Add an API endpoint to get the compaction queue.
* gc_compaction test case now waits until the compaction finishes.

---------

Signed-off-by: Alex Chi Z <chi@neon.tech>
2024-12-18 18:09:02 +00:00
John Spray
835287ba3a neon_local: add a flock to protect against concurrent execution (#10185)
## Problem

`neon_local` has always been unsafe to run concurrently with itself: it
uses simple text files for persistent state, and concurrent runs will
step on each other.

In some test environments we intentionally handle this with mutexes in
python land, but it's fragile to try and always remember to do that.

## Summary of changes

- Add a `flock` based mutex around the `main` function of neon_local,
using the repo directory as the file to lock
- Clean up an Option<> around control_plane_api, this is a drive-by
change because it was one of the fields that had a weird effect when
previous concurrent stuff stamped on it.
2024-12-18 16:29:47 +00:00
Conrad Ludgate
d63602cc78 chore(proxy): fully remove allow-self-signed-compute flag (#10168)
When https://github.com/neondatabase/cloud/pull/21856 is merged, this
flag is no longer necessary.
2024-12-18 16:03:14 +00:00
Erik Grinaker
1668d39b7c safekeeper: fix typo in allowlist for /profile/heap (#10186) 2024-12-18 15:51:53 +00:00
Alex Chi Z.
1d12efc428 fix(pageserver): allow repartition errors during gc-compaction smoke tests (#10164)
## Problem

part of https://github.com/neondatabase/neon/issues/9114

In https://github.com/neondatabase/neon/pull/10127 we fixed the race,
but we didn't add the errors to the allowlist.

## Summary of changes

* Allow repartition errors in the gc-compaction smoke test.

I think it might be worth to refactor the code to allow multiple threads
getting a copy of repartition status (i.e., using Rcu) in the future.

Signed-off-by: Alex Chi Z <chi@neon.tech>
2024-12-18 15:37:26 +00:00
Arpad Müller
85696297c5 Add safekeepers command to storcon_cli for listing (#10151)
Add a `safekeepers` subcommand to `storcon_cli` that allows listing the
safekeepers.

```
$ curl -X POST --url http://localhost:1234/control/v1/safekeeper/42 --data \
  '{"active":true, "id":42, "created_at":"2023-10-25T09:11:25Z", "updated_at":"2024-08-28T11:32:43Z","region_id":"neon_local","host":"localhost","port":5454,"http_port":0,"version":123,"availability_zone_id":"us-east-2b"}'
$ cargo run --bin storcon_cli  -- --api http://localhost:1234 safekeepers
    Finished `dev` profile [unoptimized + debuginfo] target(s) in 0.38s
     Running `target/debug/storcon_cli --api 'http://localhost:1234' safekeepers`
+----+---------+-----------+------+-----------+------------+
| Id | Version | Host      | Port | Http Port | AZ Id      |
+==========================================================+
| 42 | 123     | localhost | 5454 | 0         | us-east-2b |
+----+---------+-----------+------+-----------+------------+
```

Also:

* Don't return the raw `SafekeeperPersistence` struct that contains the
raw database presentation, but instead a new
`SafekeeperDescribeResponse` struct.
* The `SafekeeperPersistence` struct leaves out the `active` field on
purpose because we want to deprecate it and replace it with a
`scheduling_policy` one.

Part of https://github.com/neondatabase/neon/issues/9981
2024-12-18 12:47:56 +00:00
Konstantin Knizhnik
aaf980f70d Online checkpoint replication state (#9976)
## Problem

See https://neondb.slack.com/archives/C04DGM6SMTM/p1733180965970089

Replication state is checkpointed only by shutdown checkpoint.
It means that replication snapshots are not removed till compute
shutdown.

## Summary of changes

Checkpoint replication state during online checkpoint

Related Postgres PR:
https://github.com/neondatabase/postgres/pull/546

Co-authored-by: Konstantin Knizhnik <knizhnik@neon.tech>
2024-12-18 09:34:38 +00:00
a-masterov
c52514ab02 Fix allure report creation on periodic pg_regress testing (#10171)
## Problem
The allure report finishes with the error `HttpError: Resource not
accessible by integration` while running the `pg_regress` test against a
cloud staging project due to a lack of permissions.
## Summary of changes
The permissions are added.
2024-12-17 20:47:44 +00:00
Conrad Ludgate
2ee6bc5ec4 chore(proxy): update vendored postgres libs to edition 2021 (#10139)
I ran `cargo fix --edition` in each project prior, and it found nothing
that needed fixing.
2024-12-17 20:06:18 +00:00
John Spray
fd230227f2 storcon: include preferred AZ in compute notifications (#9953)
## Problem

It is unreliable for the control plane to infer the AZ for computes from
where the tenant is currently attached, because if a tenant happens to
be in a degraded state or a release is ongoing while a compute starts,
then the tenant's attached AZ can be a different one to where it will
run long-term, and the control plane doesn't check back later to restart
the compute.

This can land in parallel with
https://github.com/neondatabase/neon/pull/9947

## Summary of changes

- Thread through the preferred AZ into the compute hook code via the
reconciler
- Include the preferred AZ in the body of compute hook notifications
2024-12-17 20:04:09 +00:00
Ivan Efremov
93e958341f [proxy]: Use TLS for cancellation queries (#10152)
## Problem
pg_sni_router assumes that all the streams are upgradable to TLS.
Cancellation requests were declined because of using NoTls config.

## Summary of changes
Provide TLS client config for cancellation requests.

Fixes
[#21789](https://github.com/orgs/neondatabase/projects/65/views/1?pane=issue&itemId=90911361&issue=neondatabase%7Ccloud%7C21789)
2024-12-17 19:26:54 +00:00
Tristan Partin
7dddbb9570 Add pg_repack extension (#10100)
Our solutions engineers and some customers would like to have this
extension available.

Link: https://github.com/neondatabase/cloud/issues/18890

Signed-off-by: Tristan Partin <tristan@neon.tech>
2024-12-17 18:36:55 +00:00
Erik Grinaker
a55853f67f utils: symbolize heap profiles (#10153)
## Problem

Jemalloc heap profiles aren't symbolized. This is inconvenient, and
doesn't work with Grafana Cloud Profiles.

Resolves #9964.

## Summary of changes

Symbolize the heap profiles in-process, and strip unnecessary cruft.

This uses about 100 MB additional memory to cache the DWARF information,
but I believe this is already the case with CPU profiles, which use the
same library for symbolization. With cached DWARF information, the
symbolization CPU overhead is negligible.

Example profiles:

*
[pageserver.pb.gz](https://github.com/user-attachments/files/18141395/pageserver.pb.gz)
*
[safekeeper.pb.gz](https://github.com/user-attachments/files/18141396/safekeeper.pb.gz)
2024-12-17 16:51:58 +00:00
Mikhail Kot
007b13b79a Don't build tests in compute image, use ninja (#10149)
Don't build tests in h3 and rdkit: ~15 min speedup.
Use Ninja as cmake generator where possible: ~10 min speedup.
Clean apt cache for smaller images: around 250mb size loss for
intermediate layers
2024-12-17 16:43:54 +00:00
Alexey Kondratov
2dfd3cab8c fix(compute): Report compute_backpressure_throttling_seconds as counter (#10125)
## Problem

It was reported as `gauge`, but it's actually a `counter`.

Also add `_total` suffix as that's the convention for counters.

The corresponding flux-fleet PR:
https://github.com/neondatabase/flux-fleet/pull/386
2024-12-17 16:14:07 +00:00
John Spray
b5833ef259 remote_storage: configurable connection pooling for ABS (#10169)
## Problem

The ABS SDK's default behavior is to do no connection pooling, i.e. open
and close a fresh connection for each request. Under high request rates,
this can result in an accumulation of TCP connections in TIME_WAIT or
CLOSE_WAIT state, and in extreme cases exhaustion of client ports.

Related: https://github.com/neondatabase/cloud/issues/20971

## Summary of changes

- Add a configurable `conn_pool_size` parameter for Azure storage,
defaulting to zero (current behavior)
- Construct a custom reqwest client using this connection pool size.
2024-12-17 12:24:51 +00:00
Erik Grinaker
b0e43c2f88 postgres_ffi: add WalStreamDecoder::complete_record() benchmark (#10158)
Touches #10097.
2024-12-17 10:35:00 +00:00
a-masterov
e226d7a3d1 Fix docker compose with PG17 (#10165)
## Problem
It's impossible to run docker compose with compute v17 due to `pg_anon`
extension which is not supported under PG17.
## Summary of changes
The auto-loading of `pg_anon` is disabled by default
2024-12-17 08:16:54 +00:00
Folke Behrens
aa7ab9b3ac proxy: Allow dumping TLS session keys for debugging (#10163)
## Problem

To debug issues with TLS connections there's no easy way to decrypt
packets unless a client has special support for logging the keys.

## Summary of changes

Add TLS session keys logging to proxy via `SSLKEYLOGFILE` env var gated
by flag.
2024-12-16 18:56:24 +00:00
Erik Grinaker
28ccda0a63 test_runner: ignore error in test_timeline_archival_chaos (#10161)
Resolves #10159.
2024-12-16 17:10:55 +00:00
Conrad Ludgate
59b7ff8988 chore(proxy): disallow unwrap and unimplemented (#10142)
As the title says, I updated the lint rules to no longer allow unwrap or
unimplemented.

Three special cases:
* Tests are allowed to use them
* std::sync::Mutex lock().unwrap() is common because it's usually
correct to continue panicking on poison
* `tokio::spawn_blocking(...).await.unwrap()` is common because it will
only error if the blocking fn panics, so continuing the panic is also
correct

I've introduced two extension traits to help with these last two, that
are a bit more explicit so they don't need an expect message every time.
2024-12-16 16:37:15 +00:00
Conrad Ludgate
2e4c9c5704 chore(proxy): remove allow_self_signed from regular proxy (#10157)
I noticed that the only place we use this flag is for testing console
redirect proxy. Makes sense to me to make this assumption more explicit.
2024-12-16 16:11:39 +00:00
Erik Grinaker
3d30a7a934 pageserver: make RemoteTimelineClient::schedule_index_upload infallible (#10155)
Remove an unnecessary `Result` and address a `FIXME`.
2024-12-16 15:54:47 +00:00
Conrad Ludgate
6565fd4056 chore: fix clippy lints 2024-12-06 (#10138) 2024-12-16 15:33:21 +00:00
Arseny Sher
c5e3314c6e Add test restarting compute at WAL page boundary (#10111)
## Problem

We've had similar test in test_logical_replication, but then removed it
because it wasn't needed to trigger LR related bug. Restarting at WAL
page boundary is still a useful test, so add it separately back.

## Summary of changes

Add the test.
2024-12-16 14:53:04 +00:00
Arseny Sher
1ed0e52bc8 Extract safekeeper http client to separate crate. (#10140)
## Problem

We want to use safekeeper http client in storage controller and
neon_local.

## Summary of changes

Extract it to separate crate. No functional changes.
2024-12-16 12:07:24 +00:00
Conrad Ludgate
24d6587914 chore(proxy): refactor self-signed config (#10154)
## Problem

While reviewing #10152 I found it tricky to actually determine whether
the connection used `allow_self_signed_compute` or not.

I've tried to remove this setting in the past:
* https://github.com/neondatabase/neon/pull/7884
* https://github.com/neondatabase/neon/pull/7437
* https://github.com/neondatabase/cloud/pull/13702

But each time it seems it is used by e2e tests

## Summary of changes

The `node_info.allow_self_signed_computes` is always initialised to
false, and then sometimes inherits the proxy config value. There's no
need this needs to be in the node_info, so removing it and propagating
it via `TcpMechansim` is simpler.
2024-12-16 11:15:25 +00:00
John Spray
ebcbc1a482 pageserver: tighten up code around SLRU dir key handling (#10082)
## Problem

Changes in #9786 were functionally complete but missed some edges that
made testing less robust than it should have been:
- `is_key_disposable` didn't consider SLRU dir keys disposable
- Timeline `init_empty` was always creating SLRU dir keys on all shards

The result was that when we had a bug
(https://github.com/neondatabase/neon/pull/10080), it wasn't apparent in
tests, because one would only encounter the issue if running on a
long-lived timeline with enough compaction to drop the initially created
empty SLRU dir keys, _and_ some CLog truncation going on.

Closes: https://github.com/neondatabase/cloud/issues/21516

## Summary of changes

- Update is_key_global and init_empty to handle SLRU dir keys properly
-- the only functional impact is that we avoid writing some spurious
keys in shards >0, but this makes testing much more robust.
- Make `test_clog_truncate` explicitly use a sharded tenant

The net result is that if one reverts #10080, then tests fail (i.e. this
PR is a reproducer for the issue)
2024-12-16 10:06:08 +00:00
Konstantin Knizhnik
117c1b5dde Do not perform prefetch for temp relations (#10146)
## Problem

See https://neondb.slack.com/archives/C04DGM6SMTM/p1734002916827019

With recent prefetch fixes for pg17 and `effective_io_concurrency=100` 
pg_regress test stats.sql is failed when set temp_buffers to 100.
Stream API will try to lock all this 100 buffers for prefetch.

## Summary of changes

Disable such behaviour for temp relations.
Postgres PR: https://github.com/neondatabase/postgres/pull/548

Co-authored-by: Konstantin Knizhnik <knizhnik@neon.tech>
2024-12-16 06:03:53 +00:00
Erik Grinaker
f3ecd5d76a pageserver: revert flush backpressure (#8550) (#10135)
## Problem

In #8550, we made the flush loop wait for uploads after every layer.
This was to avoid unbounded buildup of uploads, and to reduce compaction
debt. However, the approach has several problems:

* It prevents upload parallelism.
* It prevents flush and upload pipelining.
* It slows down ingestion even when there is no need to backpressure.
* It does not directly backpressure WAL ingestion (only via
`disk_consistent_lsn`), and will build up in-memory layers.
* It does not directly backpressure based on compaction debt and read
amplification.

An alternative solution to these problems is proposed in #8390.

In the meanwhile, we revert the change to reduce the impact on ingest
throughput. This does reintroduce some risk of unbounded
upload/compaction buildup. Until
https://github.com/neondatabase/neon/issues/8390, this can be addressed
in other ways:

* Use `max_replication_apply_lag` (aka `remote_consistent_lsn`), which
will more directly limit upload debt.
* Shard the tenant, which will spread the flush/upload work across more
Pageservers and move the bottleneck to Safekeeper.

Touches #10095.

## Summary of changes

Remove waiting on the upload queue in the flush loop.
2024-12-15 09:45:12 +00:00
Mikhail Kot
cf161e1556 fix(adapter): password not set in role drop (#10130)
## Problem

When entry was dropped and password wasn't set, new entry
had uninitialized memory in controlplane adapter

Resolves: https://github.com/neondatabase/cloud/issues/14914

## Summary of changes

Initialize password in all cases, add tests.
Minor formatting for less indentation
2024-12-14 17:37:13 +00:00
Konstantin Knizhnik
2521eba674 Check for invalid down link while prefetching B-Tree leave pages for index-only scan (#9867)
## Problem

See #9866

Index-only scan prefetch implementation doesn't take in account that
down link may be invalid

## Summary of changes

Check that downlink is valid block number


Correspondent Postgres PRs:
https://github.com/neondatabase/postgres/pull/534
https://github.com/neondatabase/postgres/pull/535
https://github.com/neondatabase/postgres/pull/536
https://github.com/neondatabase/postgres/pull/537

---------

Co-authored-by: Konstantin Knizhnik <knizhnik@neon.tech>
2024-12-13 20:46:41 +00:00
Alexander Bayandin
d56fea680e CI: always require aws-oicd-role-arn input to be set (#10145)
## Problem
`benchmarking` job fails because `aws-oicd-role-arn` input is not set

## Summary of changes:
- Set `aws-oicd-role-arn` for `benchmarking job
- Always require `aws-oicd-role-arn` to be set
- Rename `aws_oicd_role_arn` to `aws-oicd-role-arn` for consistency
2024-12-13 19:56:32 +00:00
Alex Chi Z.
7ee5dca752 fix(pageserver): race between gc-compaction and repartition (#10127)
## Problem

close https://github.com/neondatabase/neon/issues/10124

gc-compaction split_gc_jobs is holding the repartition lock for too long
time.

## Summary of changes

* Ensure split_gc_compaction_jobs drops the repartition lock once it
finishes cloning the structures.
* Update comments.

---------

Signed-off-by: Alex Chi Z <chi@neon.tech>
2024-12-13 18:22:25 +00:00
Tristan Partin
07d1db54b3 Improve comments and log messages in the logical replication monitor (#9974)
Improved comments will help others when they read the code, and the log
messages will help others understand why the logical replication monitor
works the way it does.

Signed-off-by: Tristan Partin <tristan@neon.tech>
2024-12-13 18:10:42 +00:00
Konstantin Knizhnik
eeabecd89f Correctly update LFC used_pages in case of LFC resize (#10128)
## Problem

LFC used_pages statistic is not updated in case of LFC resize (shrinking
`neon.file_cache_size_limit`)

## Summary of changes

Update `lfc_ctl->used_pages` in `lfc_change_limit_hook`

Co-authored-by: Konstantin Knizhnik <knizhnik@neon.tech>
2024-12-13 17:40:26 +00:00
Christian Schwarz
fcff752851 fix(test_timeline_archival_chaos): flakiness caused by orphan layers (#10083)
The test was failing with the scary but generic message `Remote storage
metadata corrupted`.

The underlying scrubber error is `Orphan layer detected: ...`.

The test kills pageserver at random points, hence it's expected that we
leak layers if we're killed in the window after layer upload but before
it's referenced from index part.

Refer to generation numbers RFC for details.

Refs:
- fixes https://github.com/neondatabase/neon/issues/9988
- root-cause analysis
https://github.com/neondatabase/neon/issues/9988#issuecomment-2520673167
2024-12-13 16:28:21 +00:00
Alexander Bayandin
2c91062828 test_prefetch: reduce timeout to default 5m from 10m (#10105)
## Problem

`test_prefetch` is flaky
(https://github.com/neondatabase/neon/issues/9961), but if it passes,
the run time is less than 30 seconds — we don't need an extended timeout
for it.

## Summary of changes
- Remove extended test timeout for `test_prefetch`
2024-12-13 14:52:54 +00:00
Arseny Sher
ce8eb089f3 Extract public sk types to safekeeper_api (#10137)
## Problem

We want to extract safekeeper http client to separate crate for use in
storage controller and neon_local. However, many types used in the API
are internal to safekeeper.

## Summary of changes

Move them to safekeeper_api crate. No functional changes.

ref https://github.com/neondatabase/neon/issues/9011
2024-12-13 14:06:27 +00:00
a-masterov
7dc382601c Fix pg_regress tests on a cloud staging instance (#10134)
## Problem
pg_regress tests start failing due to unique ids added to Neon error
messages
## Summary of changes
Patches updated
2024-12-13 13:59:04 +00:00
Rahul Patil
2451969d5c fix(ci): Allow github-action-script to post reports (#10136)
Allow github-action-script to post reports.

Failed CI:
https://github.com/neondatabase/neon/actions/runs/12304655364/job/34342554049#step:13:514
2024-12-13 12:22:15 +00:00
JC Grünhage
59ef701925 CI(deploy): fix git tag/release creation (#10119)
## Problem

When moving the comment on proxy-releases from the yaml doc into a
javascript code block, I missed converting the comment marker from `#`
to `//`.

## Summary of changes

Correctly convert comment marker.
2024-12-12 23:38:20 +00:00
Alexander Bayandin
ac04bad457 CI: don't run debug builds with LFC (#10123)
## Problem

I've noticed that debug builds with LFC fail more frequently and for
some reason ,their failure do block merging (but it should not)

## Summary of changes
- Do not run Debug builds with LFC
2024-12-12 22:55:38 +00:00
206 changed files with 2667 additions and 1408 deletions

View File

@@ -7,10 +7,9 @@ inputs:
type: boolean
required: false
default: false
aws_oicd_role_arn:
description: 'the OIDC role arn to (re-)acquire for allure report upload - if not set call must acquire OIDC role'
required: false
default: ''
aws-oicd-role-arn:
description: 'OIDC role arn to interract with S3'
required: true
outputs:
base-url:
@@ -84,12 +83,11 @@ runs:
ALLURE_VERSION: 2.27.0
ALLURE_ZIP_SHA256: b071858fb2fa542c65d8f152c5c40d26267b2dfb74df1f1608a589ecca38e777
- name: (Re-)configure AWS credentials # necessary to upload reports to S3 after a long-running test
if: ${{ !cancelled() && (inputs.aws_oicd_role_arn != '') }}
uses: aws-actions/configure-aws-credentials@v4
- uses: aws-actions/configure-aws-credentials@v4
if: ${{ !cancelled() }}
with:
aws-region: eu-central-1
role-to-assume: ${{ inputs.aws_oicd_role_arn }}
role-to-assume: ${{ inputs.aws-oicd-role-arn }}
role-duration-seconds: 3600 # 1 hour should be more than enough to upload report
# Potentially we could have several running build for the same key (for example, for the main branch), so we use improvised lock for this

View File

@@ -8,10 +8,9 @@ inputs:
unique-key:
description: 'string to distinguish different results in the same run'
required: true
aws_oicd_role_arn:
description: 'the OIDC role arn to (re-)acquire for allure report upload - if not set call must acquire OIDC role'
required: false
default: ''
aws-oicd-role-arn:
description: 'OIDC role arn to interract with S3'
required: true
runs:
using: "composite"
@@ -36,12 +35,11 @@ runs:
env:
REPORT_DIR: ${{ inputs.report-dir }}
- name: (Re-)configure AWS credentials # necessary to upload reports to S3 after a long-running test
if: ${{ !cancelled() && (inputs.aws_oicd_role_arn != '') }}
uses: aws-actions/configure-aws-credentials@v4
- uses: aws-actions/configure-aws-credentials@v4
if: ${{ !cancelled() }}
with:
aws-region: eu-central-1
role-to-assume: ${{ inputs.aws_oicd_role_arn }}
role-to-assume: ${{ inputs.aws-oicd-role-arn }}
role-duration-seconds: 3600 # 1 hour should be more than enough to upload report
- name: Upload test results

View File

@@ -15,19 +15,17 @@ inputs:
prefix:
description: "S3 prefix. Default is '${GITHUB_RUN_ID}/${GITHUB_RUN_ATTEMPT}'"
required: false
aws_oicd_role_arn:
description: "the OIDC role arn for aws auth"
required: false
default: ""
aws-oicd-role-arn:
description: 'OIDC role arn to interract with S3'
required: true
runs:
using: "composite"
steps:
- name: Configure AWS credentials
uses: aws-actions/configure-aws-credentials@v4
- uses: aws-actions/configure-aws-credentials@v4
with:
aws-region: eu-central-1
role-to-assume: ${{ inputs.aws_oicd_role_arn }}
role-to-assume: ${{ inputs.aws-oicd-role-arn }}
role-duration-seconds: 3600
- name: Download artifact

View File

@@ -48,10 +48,9 @@ inputs:
description: 'benchmark durations JSON'
required: false
default: '{}'
aws_oicd_role_arn:
description: 'the OIDC role arn to (re-)acquire for allure report upload - if not set call must acquire OIDC role'
required: false
default: ''
aws-oicd-role-arn:
description: 'OIDC role arn to interract with S3'
required: true
runs:
using: "composite"
@@ -62,7 +61,7 @@ runs:
with:
name: neon-${{ runner.os }}-${{ runner.arch }}-${{ inputs.build_type }}-artifact
path: /tmp/neon
aws_oicd_role_arn: ${{ inputs.aws_oicd_role_arn }}
aws-oicd-role-arn: ${{ inputs.aws-oicd-role-arn }}
- name: Download Neon binaries for the previous release
if: inputs.build_type != 'remote'
@@ -71,7 +70,7 @@ runs:
name: neon-${{ runner.os }}-${{ runner.arch }}-${{ inputs.build_type }}-artifact
path: /tmp/neon-previous
prefix: latest
aws_oicd_role_arn: ${{ inputs.aws_oicd_role_arn }}
aws-oicd-role-arn: ${{ inputs.aws-oicd-role-arn }}
- name: Download compatibility snapshot
if: inputs.build_type != 'remote'
@@ -83,7 +82,7 @@ runs:
# The lack of compatibility snapshot (for example, for the new Postgres version)
# shouldn't fail the whole job. Only relevant test should fail.
skip-if-does-not-exist: true
aws_oicd_role_arn: ${{ inputs.aws_oicd_role_arn }}
aws-oicd-role-arn: ${{ inputs.aws-oicd-role-arn }}
- name: Checkout
if: inputs.needs_postgres_source == 'true'
@@ -221,19 +220,19 @@ runs:
# The lack of compatibility snapshot shouldn't fail the job
# (for example if we didn't run the test for non build-and-test workflow)
skip-if-does-not-exist: true
aws_oicd_role_arn: ${{ inputs.aws_oicd_role_arn }}
aws-oicd-role-arn: ${{ inputs.aws-oicd-role-arn }}
- name: (Re-)configure AWS credentials # necessary to upload reports to S3 after a long-running test
if: ${{ !cancelled() && (inputs.aws_oicd_role_arn != '') }}
uses: aws-actions/configure-aws-credentials@v4
- uses: aws-actions/configure-aws-credentials@v4
if: ${{ !cancelled() }}
with:
aws-region: eu-central-1
role-to-assume: ${{ inputs.aws_oicd_role_arn }}
role-to-assume: ${{ inputs.aws-oicd-role-arn }}
role-duration-seconds: 3600 # 1 hour should be more than enough to upload report
- name: Upload test results
if: ${{ !cancelled() }}
uses: ./.github/actions/allure-report-store
with:
report-dir: /tmp/test_output/allure/results
unique-key: ${{ inputs.build_type }}-${{ inputs.pg_version }}
aws_oicd_role_arn: ${{ inputs.aws_oicd_role_arn }}
aws-oicd-role-arn: ${{ inputs.aws-oicd-role-arn }}

View File

@@ -14,11 +14,11 @@ runs:
name: coverage-data-artifact
path: /tmp/coverage
skip-if-does-not-exist: true # skip if there's no previous coverage to download
aws_oicd_role_arn: ${{ inputs.aws_oicd_role_arn }}
aws-oicd-role-arn: ${{ inputs.aws-oicd-role-arn }}
- name: Upload coverage data
uses: ./.github/actions/upload
with:
name: coverage-data-artifact
path: /tmp/coverage
aws_oicd_role_arn: ${{ inputs.aws_oicd_role_arn }}
aws-oicd-role-arn: ${{ inputs.aws-oicd-role-arn }}

View File

@@ -14,7 +14,7 @@ inputs:
prefix:
description: "S3 prefix. Default is '${GITHUB_SHA}/${GITHUB_RUN_ID}/${GITHUB_RUN_ATTEMPT}'"
required: false
aws_oicd_role_arn:
aws-oicd-role-arn:
description: "the OIDC role arn for aws auth"
required: false
default: ""
@@ -61,7 +61,7 @@ runs:
uses: aws-actions/configure-aws-credentials@v4
with:
aws-region: eu-central-1
role-to-assume: ${{ inputs.aws_oicd_role_arn }}
role-to-assume: ${{ inputs.aws-oicd-role-arn }}
role-duration-seconds: 3600
- name: Upload artifact

View File

@@ -70,7 +70,7 @@ jobs:
name: neon-${{ runner.os }}-${{ runner.arch }}-release-artifact
path: /tmp/neon/
prefix: latest
aws_oicd_role_arn: ${{ vars.DEV_AWS_OIDC_ROLE_ARN }}
aws-oicd-role-arn: ${{ vars.DEV_AWS_OIDC_ROLE_ARN }}
# we create a table that has one row for each database that we want to restore with the status whether the restore is done
- name: Create benchmark_restore_status table if it does not exist

View File

@@ -264,7 +264,7 @@ jobs:
with:
name: neon-${{ runner.os }}-${{ runner.arch }}-${{ inputs.build-type }}-artifact
path: /tmp/neon
aws_oicd_role_arn: ${{ vars.DEV_AWS_OIDC_ROLE_ARN }}
aws-oicd-role-arn: ${{ vars.DEV_AWS_OIDC_ROLE_ARN }}
# XXX: keep this after the binaries.list is formed, so the coverage can properly work later
- name: Merge and upload coverage data
@@ -308,7 +308,7 @@ jobs:
real_s3_region: eu-central-1
rerun_failed: true
pg_version: ${{ matrix.pg_version }}
aws_oicd_role_arn: ${{ vars.DEV_AWS_OIDC_ROLE_ARN }}
aws-oicd-role-arn: ${{ vars.DEV_AWS_OIDC_ROLE_ARN }}
env:
TEST_RESULT_CONNSTR: ${{ secrets.REGRESS_TEST_RESULT_CONNSTR_NEW }}
CHECK_ONDISK_DATA_COMPATIBILITY: nonempty

View File

@@ -105,7 +105,7 @@ jobs:
name: neon-${{ runner.os }}-${{ runner.arch }}-release-artifact
path: /tmp/neon/
prefix: latest
aws_oicd_role_arn: ${{ vars.DEV_AWS_OIDC_ROLE_ARN }}
aws-oicd-role-arn: ${{ vars.DEV_AWS_OIDC_ROLE_ARN }}
- name: Create Neon Project
id: create-neon-project
@@ -123,7 +123,7 @@ jobs:
run_in_parallel: false
save_perf_report: ${{ env.SAVE_PERF_REPORT }}
pg_version: ${{ env.DEFAULT_PG_VERSION }}
aws_oicd_role_arn: ${{ vars.DEV_AWS_OIDC_ROLE_ARN }}
aws-oicd-role-arn: ${{ vars.DEV_AWS_OIDC_ROLE_ARN }}
# Set --sparse-ordering option of pytest-order plugin
# to ensure tests are running in order of appears in the file.
# It's important for test_perf_pgbench.py::test_pgbench_remote_* tests
@@ -153,7 +153,7 @@ jobs:
if: ${{ !cancelled() }}
uses: ./.github/actions/allure-report-generate
with:
aws_oicd_role_arn: ${{ vars.DEV_AWS_OIDC_ROLE_ARN }}
aws-oicd-role-arn: ${{ vars.DEV_AWS_OIDC_ROLE_ARN }}
- name: Post to a Slack channel
if: ${{ github.event.schedule && failure() }}
@@ -205,7 +205,7 @@ jobs:
name: neon-${{ runner.os }}-${{ runner.arch }}-release-artifact
path: /tmp/neon/
prefix: latest
aws_oicd_role_arn: ${{ vars.DEV_AWS_OIDC_ROLE_ARN }}
aws-oicd-role-arn: ${{ vars.DEV_AWS_OIDC_ROLE_ARN }}
- name: Run Logical Replication benchmarks
uses: ./.github/actions/run-python-test-set
@@ -216,7 +216,7 @@ jobs:
save_perf_report: ${{ env.SAVE_PERF_REPORT }}
extra_params: -m remote_cluster --timeout 5400
pg_version: ${{ env.DEFAULT_PG_VERSION }}
aws_oicd_role_arn: ${{ vars.DEV_AWS_OIDC_ROLE_ARN }}
aws-oicd-role-arn: ${{ vars.DEV_AWS_OIDC_ROLE_ARN }}
env:
VIP_VAP_ACCESS_TOKEN: "${{ secrets.VIP_VAP_ACCESS_TOKEN }}"
PERF_TEST_RESULT_CONNSTR: "${{ secrets.PERF_TEST_RESULT_CONNSTR }}"
@@ -233,7 +233,7 @@ jobs:
save_perf_report: ${{ env.SAVE_PERF_REPORT }}
extra_params: -m remote_cluster --timeout 5400
pg_version: ${{ env.DEFAULT_PG_VERSION }}
aws_oicd_role_arn: ${{ vars.DEV_AWS_OIDC_ROLE_ARN }}
aws-oicd-role-arn: ${{ vars.DEV_AWS_OIDC_ROLE_ARN }}
env:
VIP_VAP_ACCESS_TOKEN: "${{ secrets.VIP_VAP_ACCESS_TOKEN }}"
PERF_TEST_RESULT_CONNSTR: "${{ secrets.PERF_TEST_RESULT_CONNSTR }}"
@@ -245,7 +245,7 @@ jobs:
uses: ./.github/actions/allure-report-generate
with:
store-test-results-into-db: true
aws_oicd_role_arn: ${{ vars.DEV_AWS_OIDC_ROLE_ARN }}
aws-oicd-role-arn: ${{ vars.DEV_AWS_OIDC_ROLE_ARN }}
env:
REGRESS_TEST_RESULT_CONNSTR_NEW: ${{ secrets.REGRESS_TEST_RESULT_CONNSTR_NEW }}
@@ -308,6 +308,7 @@ jobs:
"image": [ "'"$image_default"'" ],
"include": [{ "pg_version": 16, "region_id": "'"$region_id_default"'", "platform": "neonvm-captest-freetier", "db_size": "3gb" ,"runner": '"$runner_default"', "image": "'"$image_default"'" },
{ "pg_version": 16, "region_id": "'"$region_id_default"'", "platform": "neonvm-captest-new", "db_size": "10gb","runner": '"$runner_default"', "image": "'"$image_default"'" },
{ "pg_version": 16, "region_id": "'"$region_id_default"'", "platform": "neonvm-captest-new-many-tables","db_size": "10gb","runner": '"$runner_default"', "image": "'"$image_default"'" },
{ "pg_version": 16, "region_id": "'"$region_id_default"'", "platform": "neonvm-captest-new", "db_size": "50gb","runner": '"$runner_default"', "image": "'"$image_default"'" },
{ "pg_version": 16, "region_id": "azure-eastus2", "platform": "neonvm-azure-captest-freetier", "db_size": "3gb" ,"runner": '"$runner_azure"', "image": "neondatabase/build-tools:pinned-bookworm" },
{ "pg_version": 16, "region_id": "azure-eastus2", "platform": "neonvm-azure-captest-new", "db_size": "10gb","runner": '"$runner_azure"', "image": "neondatabase/build-tools:pinned-bookworm" },
@@ -407,10 +408,10 @@ jobs:
name: neon-${{ runner.os }}-${{ runner.arch }}-release-artifact
path: /tmp/neon/
prefix: latest
aws_oicd_role_arn: ${{ vars.DEV_AWS_OIDC_ROLE_ARN }}
aws-oicd-role-arn: ${{ vars.DEV_AWS_OIDC_ROLE_ARN }}
- name: Create Neon Project
if: contains(fromJson('["neonvm-captest-new", "neonvm-captest-freetier", "neonvm-azure-captest-freetier", "neonvm-azure-captest-new"]'), matrix.platform)
if: contains(fromJson('["neonvm-captest-new", "neonvm-captest-new-many-tables", "neonvm-captest-freetier", "neonvm-azure-captest-freetier", "neonvm-azure-captest-new"]'), matrix.platform)
id: create-neon-project
uses: ./.github/actions/neon-project-create
with:
@@ -429,7 +430,7 @@ jobs:
neonvm-captest-sharding-reuse)
CONNSTR=${{ secrets.BENCHMARK_CAPTEST_SHARDING_CONNSTR }}
;;
neonvm-captest-new | neonvm-captest-freetier | neonvm-azure-captest-new | neonvm-azure-captest-freetier)
neonvm-captest-new | neonvm-captest-new-many-tables | neonvm-captest-freetier | neonvm-azure-captest-new | neonvm-azure-captest-freetier)
CONNSTR=${{ steps.create-neon-project.outputs.dsn }}
;;
rds-aurora)
@@ -446,6 +447,26 @@ jobs:
echo "connstr=${CONNSTR}" >> $GITHUB_OUTPUT
# we want to compare Neon project OLTP throughput and latency at scale factor 10 GB
# without (neonvm-captest-new)
# and with (neonvm-captest-new-many-tables) many relations in the database
- name: Create many relations before the run
if: contains(fromJson('["neonvm-captest-new-many-tables"]'), matrix.platform)
uses: ./.github/actions/run-python-test-set
with:
build_type: ${{ env.BUILD_TYPE }}
test_selection: performance
run_in_parallel: false
save_perf_report: ${{ env.SAVE_PERF_REPORT }}
extra_params: -m remote_cluster --timeout 21600 -k test_perf_many_relations
pg_version: ${{ env.DEFAULT_PG_VERSION }}
aws-oicd-role-arn: ${{ vars.DEV_AWS_OIDC_ROLE_ARN }}
env:
BENCHMARK_CONNSTR: ${{ steps.set-up-connstr.outputs.connstr }}
VIP_VAP_ACCESS_TOKEN: "${{ secrets.VIP_VAP_ACCESS_TOKEN }}"
PERF_TEST_RESULT_CONNSTR: "${{ secrets.PERF_TEST_RESULT_CONNSTR }}"
TEST_NUM_RELATIONS: 10000
- name: Benchmark init
uses: ./.github/actions/run-python-test-set
with:
@@ -455,7 +476,7 @@ jobs:
save_perf_report: ${{ env.SAVE_PERF_REPORT }}
extra_params: -m remote_cluster --timeout 21600 -k test_pgbench_remote_init
pg_version: ${{ env.DEFAULT_PG_VERSION }}
aws_oicd_role_arn: ${{ vars.DEV_AWS_OIDC_ROLE_ARN }}
aws-oicd-role-arn: ${{ vars.DEV_AWS_OIDC_ROLE_ARN }}
env:
BENCHMARK_CONNSTR: ${{ steps.set-up-connstr.outputs.connstr }}
VIP_VAP_ACCESS_TOKEN: "${{ secrets.VIP_VAP_ACCESS_TOKEN }}"
@@ -470,7 +491,7 @@ jobs:
save_perf_report: ${{ env.SAVE_PERF_REPORT }}
extra_params: -m remote_cluster --timeout 21600 -k test_pgbench_remote_simple_update
pg_version: ${{ env.DEFAULT_PG_VERSION }}
aws_oicd_role_arn: ${{ vars.DEV_AWS_OIDC_ROLE_ARN }}
aws-oicd-role-arn: ${{ vars.DEV_AWS_OIDC_ROLE_ARN }}
env:
BENCHMARK_CONNSTR: ${{ steps.set-up-connstr.outputs.connstr }}
VIP_VAP_ACCESS_TOKEN: "${{ secrets.VIP_VAP_ACCESS_TOKEN }}"
@@ -485,7 +506,7 @@ jobs:
save_perf_report: ${{ env.SAVE_PERF_REPORT }}
extra_params: -m remote_cluster --timeout 21600 -k test_pgbench_remote_select_only
pg_version: ${{ env.DEFAULT_PG_VERSION }}
aws_oicd_role_arn: ${{ vars.DEV_AWS_OIDC_ROLE_ARN }}
aws-oicd-role-arn: ${{ vars.DEV_AWS_OIDC_ROLE_ARN }}
env:
BENCHMARK_CONNSTR: ${{ steps.set-up-connstr.outputs.connstr }}
VIP_VAP_ACCESS_TOKEN: "${{ secrets.VIP_VAP_ACCESS_TOKEN }}"
@@ -503,7 +524,7 @@ jobs:
if: ${{ !cancelled() }}
uses: ./.github/actions/allure-report-generate
with:
aws_oicd_role_arn: ${{ vars.DEV_AWS_OIDC_ROLE_ARN }}
aws-oicd-role-arn: ${{ vars.DEV_AWS_OIDC_ROLE_ARN }}
- name: Post to a Slack channel
if: ${{ github.event.schedule && failure() }}
@@ -614,7 +635,7 @@ jobs:
save_perf_report: ${{ env.SAVE_PERF_REPORT }}
extra_params: -m remote_cluster --timeout 21600 -k test_pgvector_indexing
pg_version: ${{ env.DEFAULT_PG_VERSION }}
aws_oicd_role_arn: ${{ vars.DEV_AWS_OIDC_ROLE_ARN }}
aws-oicd-role-arn: ${{ vars.DEV_AWS_OIDC_ROLE_ARN }}
env:
VIP_VAP_ACCESS_TOKEN: "${{ secrets.VIP_VAP_ACCESS_TOKEN }}"
PERF_TEST_RESULT_CONNSTR: "${{ secrets.PERF_TEST_RESULT_CONNSTR }}"
@@ -629,7 +650,7 @@ jobs:
save_perf_report: ${{ env.SAVE_PERF_REPORT }}
extra_params: -m remote_cluster --timeout 21600
pg_version: ${{ env.DEFAULT_PG_VERSION }}
aws_oicd_role_arn: ${{ vars.DEV_AWS_OIDC_ROLE_ARN }}
aws-oicd-role-arn: ${{ vars.DEV_AWS_OIDC_ROLE_ARN }}
env:
BENCHMARK_CONNSTR: ${{ steps.set-up-connstr.outputs.connstr }}
VIP_VAP_ACCESS_TOKEN: "${{ secrets.VIP_VAP_ACCESS_TOKEN }}"
@@ -640,7 +661,7 @@ jobs:
if: ${{ !cancelled() }}
uses: ./.github/actions/allure-report-generate
with:
aws_oicd_role_arn: ${{ vars.DEV_AWS_OIDC_ROLE_ARN }}
aws-oicd-role-arn: ${{ vars.DEV_AWS_OIDC_ROLE_ARN }}
- name: Post to a Slack channel
if: ${{ github.event.schedule && failure() }}
@@ -711,7 +732,7 @@ jobs:
name: neon-${{ runner.os }}-${{ runner.arch }}-release-artifact
path: /tmp/neon/
prefix: latest
aws_oicd_role_arn: ${{ vars.DEV_AWS_OIDC_ROLE_ARN }}
aws-oicd-role-arn: ${{ vars.DEV_AWS_OIDC_ROLE_ARN }}
- name: Set up Connection String
id: set-up-connstr
@@ -743,7 +764,7 @@ jobs:
save_perf_report: ${{ env.SAVE_PERF_REPORT }}
extra_params: -m remote_cluster --timeout 43200 -k test_clickbench
pg_version: ${{ env.DEFAULT_PG_VERSION }}
aws_oicd_role_arn: ${{ vars.DEV_AWS_OIDC_ROLE_ARN }}
aws-oicd-role-arn: ${{ vars.DEV_AWS_OIDC_ROLE_ARN }}
env:
VIP_VAP_ACCESS_TOKEN: "${{ secrets.VIP_VAP_ACCESS_TOKEN }}"
PERF_TEST_RESULT_CONNSTR: "${{ secrets.PERF_TEST_RESULT_CONNSTR }}"
@@ -757,7 +778,7 @@ jobs:
if: ${{ !cancelled() }}
uses: ./.github/actions/allure-report-generate
with:
aws_oicd_role_arn: ${{ vars.DEV_AWS_OIDC_ROLE_ARN }}
aws-oicd-role-arn: ${{ vars.DEV_AWS_OIDC_ROLE_ARN }}
- name: Post to a Slack channel
if: ${{ github.event.schedule && failure() }}
@@ -822,7 +843,7 @@ jobs:
name: neon-${{ runner.os }}-${{ runner.arch }}-release-artifact
path: /tmp/neon/
prefix: latest
aws_oicd_role_arn: ${{ vars.DEV_AWS_OIDC_ROLE_ARN }}
aws-oicd-role-arn: ${{ vars.DEV_AWS_OIDC_ROLE_ARN }}
- name: Get Connstring Secret Name
run: |
@@ -861,7 +882,7 @@ jobs:
save_perf_report: ${{ env.SAVE_PERF_REPORT }}
extra_params: -m remote_cluster --timeout 21600 -k test_tpch
pg_version: ${{ env.DEFAULT_PG_VERSION }}
aws_oicd_role_arn: ${{ vars.DEV_AWS_OIDC_ROLE_ARN }}
aws-oicd-role-arn: ${{ vars.DEV_AWS_OIDC_ROLE_ARN }}
env:
VIP_VAP_ACCESS_TOKEN: "${{ secrets.VIP_VAP_ACCESS_TOKEN }}"
PERF_TEST_RESULT_CONNSTR: "${{ secrets.PERF_TEST_RESULT_CONNSTR }}"
@@ -873,7 +894,7 @@ jobs:
if: ${{ !cancelled() }}
uses: ./.github/actions/allure-report-generate
with:
aws_oicd_role_arn: ${{ vars.DEV_AWS_OIDC_ROLE_ARN }}
aws-oicd-role-arn: ${{ vars.DEV_AWS_OIDC_ROLE_ARN }}
- name: Post to a Slack channel
if: ${{ github.event.schedule && failure() }}
@@ -931,7 +952,7 @@ jobs:
name: neon-${{ runner.os }}-${{ runner.arch }}-release-artifact
path: /tmp/neon/
prefix: latest
aws_oicd_role_arn: ${{ vars.DEV_AWS_OIDC_ROLE_ARN }}
aws-oicd-role-arn: ${{ vars.DEV_AWS_OIDC_ROLE_ARN }}
- name: Set up Connection String
id: set-up-connstr
@@ -963,7 +984,7 @@ jobs:
save_perf_report: ${{ env.SAVE_PERF_REPORT }}
extra_params: -m remote_cluster --timeout 21600 -k test_user_examples
pg_version: ${{ env.DEFAULT_PG_VERSION }}
aws_oicd_role_arn: ${{ vars.DEV_AWS_OIDC_ROLE_ARN }}
aws-oicd-role-arn: ${{ vars.DEV_AWS_OIDC_ROLE_ARN }}
env:
VIP_VAP_ACCESS_TOKEN: "${{ secrets.VIP_VAP_ACCESS_TOKEN }}"
PERF_TEST_RESULT_CONNSTR: "${{ secrets.PERF_TEST_RESULT_CONNSTR }}"
@@ -974,7 +995,7 @@ jobs:
if: ${{ !cancelled() }}
uses: ./.github/actions/allure-report-generate
with:
aws_oicd_role_arn: ${{ vars.DEV_AWS_OIDC_ROLE_ARN }}
aws-oicd-role-arn: ${{ vars.DEV_AWS_OIDC_ROLE_ARN }}
- name: Post to a Slack channel
if: ${{ github.event.schedule && failure() }}

View File

@@ -254,16 +254,14 @@ jobs:
build-tag: ${{ needs.tag.outputs.build-tag }}
build-type: ${{ matrix.build-type }}
# Run tests on all Postgres versions in release builds and only on the latest version in debug builds.
# Run without LFC on v17 release and debug builds only. For all the other cases LFC is enabled. Failure on the
# debug build with LFC enabled doesn't block merging.
# Run without LFC on v17 release and debug builds only. For all the other cases LFC is enabled.
test-cfg: |
${{ matrix.build-type == 'release' && '[{"pg_version":"v14", "lfc_state": "with-lfc"},
{"pg_version":"v15", "lfc_state": "with-lfc"},
{"pg_version":"v16", "lfc_state": "with-lfc"},
{"pg_version":"v17", "lfc_state": "with-lfc"},
{"pg_version":"v17", "lfc_state": "without-lfc"}]'
|| '[{"pg_version":"v17", "lfc_state": "without-lfc"},
{"pg_version":"v17", "lfc_state": "with-lfc" }]' }}
|| '[{"pg_version":"v17", "lfc_state": "without-lfc" }]' }}
secrets: inherit
# Keep `benchmarks` job outside of `build-and-test-locally` workflow to make job failures non-blocking
@@ -305,6 +303,11 @@ jobs:
benchmarks:
if: github.ref_name == 'main' || contains(github.event.pull_request.labels.*.name, 'run-benchmarks')
needs: [ check-permissions, build-and-test-locally, build-build-tools-image, get-benchmarks-durations ]
permissions:
id-token: write # aws-actions/configure-aws-credentials
statuses: write
contents: write
pull-requests: write
runs-on: [ self-hosted, small ]
container:
image: ${{ needs.build-build-tools-image.outputs.image }}-bookworm
@@ -333,6 +336,7 @@ jobs:
extra_params: --splits 5 --group ${{ matrix.pytest_split_group }}
benchmark_durations: ${{ needs.get-benchmarks-durations.outputs.json }}
pg_version: v16
aws-oicd-role-arn: ${{ vars.DEV_AWS_OIDC_ROLE_ARN }}
env:
VIP_VAP_ACCESS_TOKEN: "${{ secrets.VIP_VAP_ACCESS_TOKEN }}"
PERF_TEST_RESULT_CONNSTR: "${{ secrets.PERF_TEST_RESULT_CONNSTR }}"
@@ -345,6 +349,11 @@ jobs:
report-benchmarks-failures:
needs: [ benchmarks, create-test-report ]
if: github.ref_name == 'main' && failure() && needs.benchmarks.result == 'failure'
permissions:
id-token: write # aws-actions/configure-aws-credentials
statuses: write
contents: write
pull-requests: write
runs-on: ubuntu-22.04
steps:
@@ -385,7 +394,7 @@ jobs:
uses: ./.github/actions/allure-report-generate
with:
store-test-results-into-db: true
aws_oicd_role_arn: ${{ vars.DEV_AWS_OIDC_ROLE_ARN }}
aws-oicd-role-arn: ${{ vars.DEV_AWS_OIDC_ROLE_ARN }}
env:
REGRESS_TEST_RESULT_CONNSTR_NEW: ${{ secrets.REGRESS_TEST_RESULT_CONNSTR_NEW }}
@@ -447,14 +456,14 @@ jobs:
with:
name: neon-${{ runner.os }}-${{ runner.arch }}-${{ matrix.build_type }}-artifact
path: /tmp/neon
aws_oicd_role_arn: ${{ vars.DEV_AWS_OIDC_ROLE_ARN }}
aws-oicd-role-arn: ${{ vars.DEV_AWS_OIDC_ROLE_ARN }}
- name: Get coverage artifact
uses: ./.github/actions/download
with:
name: coverage-data-artifact
path: /tmp/coverage
aws_oicd_role_arn: ${{ vars.DEV_AWS_OIDC_ROLE_ARN }}
aws-oicd-role-arn: ${{ vars.DEV_AWS_OIDC_ROLE_ARN }}
- name: Merge coverage data
run: scripts/coverage "--profraw-prefix=$GITHUB_JOB" --dir=/tmp/coverage merge
@@ -1026,6 +1035,11 @@ jobs:
trigger-custom-extensions-build-and-wait:
needs: [ check-permissions, tag ]
runs-on: ubuntu-22.04
permissions:
id-token: write # aws-actions/configure-aws-credentials
statuses: write
contents: write
pull-requests: write
steps:
- name: Set PR's status to pending and request a remote CI test
run: |
@@ -1145,7 +1159,7 @@ jobs:
console.log(`Tag ${tag} created successfully.`);
}
# TODO: check how GitHub releases looks for proxy/compute releases and enable them if they're ok
// TODO: check how GitHub releases looks for proxy/compute releases and enable them if they're ok
if (context.ref !== 'refs/heads/release') {
console.log(`GitHub release skipped for ${context.ref}.`);
return;
@@ -1266,6 +1280,12 @@ jobs:
echo "run-id=${run_id}" | tee -a ${GITHUB_OUTPUT}
echo "commit-sha=${last_commit_sha}" | tee -a ${GITHUB_OUTPUT}
- uses: aws-actions/configure-aws-credentials@v4
with:
aws-region: eu-central-1
role-to-assume: ${{ vars.DEV_AWS_OIDC_ROLE_ARN }}
role-duration-seconds: 3600
- name: Promote compatibility snapshot and Neon artifact
env:
BUCKET: neon-github-public-dev

View File

@@ -21,6 +21,8 @@ concurrency:
permissions:
id-token: write # aws-actions/configure-aws-credentials
statuses: write
contents: write
jobs:
regress:
@@ -79,7 +81,7 @@ jobs:
name: neon-${{ runner.os }}-${{ runner.arch }}-release-artifact
path: /tmp/neon/
prefix: latest
aws_oicd_role_arn: ${{ vars.DEV_AWS_OIDC_ROLE_ARN }}
aws-oicd-role-arn: ${{ vars.DEV_AWS_OIDC_ROLE_ARN }}
- name: Create a new branch
id: create-branch
@@ -95,10 +97,12 @@ jobs:
test_selection: cloud_regress
pg_version: ${{matrix.pg-version}}
extra_params: -m remote_cluster
aws-oicd-role-arn: ${{ vars.DEV_AWS_OIDC_ROLE_ARN }}
env:
BENCHMARK_CONNSTR: ${{steps.create-branch.outputs.dsn}}
- name: Delete branch
if: always()
uses: ./.github/actions/neon-branch-delete
with:
api_key: ${{ secrets.NEON_STAGING_API_KEY }}
@@ -110,7 +114,7 @@ jobs:
if: ${{ !cancelled() }}
uses: ./.github/actions/allure-report-generate
with:
aws_oicd_role_arn: ${{ vars.DEV_AWS_OIDC_ROLE_ARN }}
aws-oicd-role-arn: ${{ vars.DEV_AWS_OIDC_ROLE_ARN }}
- name: Post to a Slack channel
if: ${{ github.event.schedule && failure() }}

View File

@@ -13,7 +13,7 @@ on:
# │ │ │ │ ┌───────────── day of the week (0 - 6 or SUN-SAT)
- cron: '0 9 * * *' # run once a day, timezone is utc
workflow_dispatch: # adds ability to run this manually
defaults:
run:
shell: bash -euxo pipefail {0}
@@ -28,7 +28,7 @@ jobs:
strategy:
fail-fast: false # allow other variants to continue even if one fails
matrix:
target_project: [new_empty_project, large_existing_project]
target_project: [new_empty_project, large_existing_project]
permissions:
contents: write
statuses: write
@@ -56,7 +56,7 @@ jobs:
with:
aws-region: eu-central-1
role-to-assume: ${{ vars.DEV_AWS_OIDC_ROLE_ARN }}
role-duration-seconds: 18000 # 5 hours is currently max associated with IAM role
role-duration-seconds: 18000 # 5 hours is currently max associated with IAM role
- name: Download Neon artifact
uses: ./.github/actions/download
@@ -64,7 +64,7 @@ jobs:
name: neon-${{ runner.os }}-${{ runner.arch }}-release-artifact
path: /tmp/neon/
prefix: latest
aws_oicd_role_arn: ${{ vars.DEV_AWS_OIDC_ROLE_ARN }}
aws-oicd-role-arn: ${{ vars.DEV_AWS_OIDC_ROLE_ARN }}
- name: Create Neon Project
if: ${{ matrix.target_project == 'new_empty_project' }}
@@ -95,7 +95,7 @@ jobs:
project_id: ${{ vars.BENCHMARK_INGEST_TARGET_PROJECTID }}
api_key: ${{ secrets.NEON_STAGING_API_KEY }}
- name: Initialize Neon project
- name: Initialize Neon project
if: ${{ matrix.target_project == 'large_existing_project' }}
env:
BENCHMARK_INGEST_TARGET_CONNSTR: ${{ steps.create-neon-branch-ingest-target.outputs.dsn }}
@@ -123,7 +123,7 @@ jobs:
${PSQL} "${BENCHMARK_INGEST_TARGET_CONNSTR}" -c "CREATE EXTENSION IF NOT EXISTS neon; CREATE EXTENSION IF NOT EXISTS neon_utils;"
echo "BENCHMARK_INGEST_TARGET_CONNSTR=${BENCHMARK_INGEST_TARGET_CONNSTR}" >> $GITHUB_ENV
- name: Invoke pgcopydb
- name: Invoke pgcopydb
uses: ./.github/actions/run-python-test-set
with:
build_type: remote
@@ -132,7 +132,7 @@ jobs:
extra_params: -s -m remote_cluster --timeout 86400 -k test_ingest_performance_using_pgcopydb
pg_version: v16
save_perf_report: true
aws_oicd_role_arn: ${{ vars.DEV_AWS_OIDC_ROLE_ARN }}
aws-oicd-role-arn: ${{ vars.DEV_AWS_OIDC_ROLE_ARN }}
env:
BENCHMARK_INGEST_SOURCE_CONNSTR: ${{ secrets.BENCHMARK_INGEST_SOURCE_CONNSTR }}
TARGET_PROJECT_TYPE: ${{ matrix.target_project }}
@@ -144,7 +144,7 @@ jobs:
run: |
export LD_LIBRARY_PATH=${PG_16_LIB_PATH}
${PSQL} "${BENCHMARK_INGEST_TARGET_CONNSTR}" -c "\dt+"
- name: Delete Neon Project
if: ${{ always() && matrix.target_project == 'new_empty_project' }}
uses: ./.github/actions/neon-project-delete

View File

@@ -21,15 +21,17 @@ defaults:
run:
shell: bash -euo pipefail {0}
permissions:
id-token: write # aws-actions/configure-aws-credentials
concurrency:
group: ${{ github.workflow }}
cancel-in-progress: false
jobs:
trigger_bench_on_ec2_machine_in_eu_central_1:
permissions:
id-token: write # aws-actions/configure-aws-credentials
statuses: write
contents: write
pull-requests: write
runs-on: [ self-hosted, small ]
container:
image: neondatabase/build-tools:pinned-bookworm
@@ -135,7 +137,7 @@ jobs:
if: ${{ !cancelled() }}
uses: ./.github/actions/allure-report-generate
with:
aws_oicd_role_arn: ${{ vars.DEV_AWS_OIDC_ROLE_ARN }}
aws-oicd-role-arn: ${{ vars.DEV_AWS_OIDC_ROLE_ARN }}
- name: Post to a Slack channel
if: ${{ github.event.schedule && failure() }}

View File

@@ -96,7 +96,7 @@ jobs:
name: neon-${{ runner.os }}-${{ runner.arch }}-release-artifact
path: /tmp/neon/
prefix: latest
aws_oicd_role_arn: ${{ vars.DEV_AWS_OIDC_ROLE_ARN }}
aws-oicd-role-arn: ${{ vars.DEV_AWS_OIDC_ROLE_ARN }}
- name: Create Neon Project
id: create-neon-project
@@ -113,6 +113,7 @@ jobs:
run_in_parallel: false
extra_params: -m remote_cluster
pg_version: ${{ env.DEFAULT_PG_VERSION }}
aws-oicd-role-arn: ${{ vars.DEV_AWS_OIDC_ROLE_ARN }}
env:
BENCHMARK_CONNSTR: ${{ steps.create-neon-project.outputs.dsn }}
@@ -129,7 +130,7 @@ jobs:
uses: ./.github/actions/allure-report-generate
with:
store-test-results-into-db: true
aws_oicd_role_arn: ${{ vars.DEV_AWS_OIDC_ROLE_ARN }}
aws-oicd-role-arn: ${{ vars.DEV_AWS_OIDC_ROLE_ARN }}
env:
REGRESS_TEST_RESULT_CONNSTR_NEW: ${{ secrets.REGRESS_TEST_RESULT_CONNSTR_NEW }}
@@ -163,7 +164,7 @@ jobs:
name: neon-${{ runner.os }}-${{ runner.arch }}-release-artifact
path: /tmp/neon/
prefix: latest
aws_oicd_role_arn: ${{ vars.DEV_AWS_OIDC_ROLE_ARN }}
aws-oicd-role-arn: ${{ vars.DEV_AWS_OIDC_ROLE_ARN }}
- name: Create Neon Project
id: create-neon-project
@@ -180,6 +181,7 @@ jobs:
run_in_parallel: false
extra_params: -m remote_cluster
pg_version: ${{ env.DEFAULT_PG_VERSION }}
aws-oicd-role-arn: ${{ vars.DEV_AWS_OIDC_ROLE_ARN }}
env:
BENCHMARK_CONNSTR: ${{ steps.create-neon-project.outputs.dsn }}
@@ -196,7 +198,7 @@ jobs:
uses: ./.github/actions/allure-report-generate
with:
store-test-results-into-db: true
aws_oicd_role_arn: ${{ vars.DEV_AWS_OIDC_ROLE_ARN }}
aws-oicd-role-arn: ${{ vars.DEV_AWS_OIDC_ROLE_ARN }}
env:
REGRESS_TEST_RESULT_CONNSTR_NEW: ${{ secrets.REGRESS_TEST_RESULT_CONNSTR_NEW }}

165
Cargo.lock generated
View File

@@ -10,9 +10,9 @@ checksum = "8b5ace29ee3216de37c0546865ad08edef58b0f9e76838ed8959a84a990e58c5"
[[package]]
name = "addr2line"
version = "0.21.0"
version = "0.24.2"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "8a30b2e23b9e17a9f90641c7ab1549cd9b44f296d3ccbf309d2863cfe398a0cb"
checksum = "dfbe277e56a376000877090da837660b4427aad530e3028d44e0bffe4f89a1c1"
dependencies = [
"gimli",
]
@@ -23,6 +23,12 @@ version = "1.0.2"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "f26201604c87b1e01bd3d98f8d5d9a8fcbb815e8cedb41ffccbeb4bf593a35fe"
[[package]]
name = "adler2"
version = "2.0.0"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "512761e0bb2578dd7380c6baaa0f4ce03e84f95e960231d1dec8bf4d7d6e2627"
[[package]]
name = "ahash"
version = "0.8.11"
@@ -871,17 +877,17 @@ dependencies = [
[[package]]
name = "backtrace"
version = "0.3.69"
version = "0.3.74"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "2089b7e3f35b9dd2d0ed921ead4f6d318c27680d4a5bd167b3ee120edb105837"
checksum = "8d82cb332cdfaed17ae235a638438ac4d4839913cc2af585c3c6746e8f8bee1a"
dependencies = [
"addr2line",
"cc",
"cfg-if",
"libc",
"miniz_oxide",
"miniz_oxide 0.8.0",
"object",
"rustc-demangle",
"windows-targets 0.52.6",
]
[[package]]
@@ -1127,7 +1133,7 @@ dependencies = [
"num-traits",
"serde",
"wasm-bindgen",
"windows-targets 0.52.4",
"windows-targets 0.52.6",
]
[[package]]
@@ -2107,7 +2113,7 @@ source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "3b9429470923de8e8cbd4d2dc513535400b4b3fef0319fb5c4e1f520a7bef743"
dependencies = [
"crc32fast",
"miniz_oxide",
"miniz_oxide 0.7.1",
]
[[package]]
@@ -2308,9 +2314,9 @@ dependencies = [
[[package]]
name = "gimli"
version = "0.28.1"
version = "0.31.1"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "4271d37baee1b8c7e4b708028c57d816cf9d2434acb33a549475f78c181f6253"
checksum = "07e28edb80900c19c28f1072f2e8aeca7fa06b23cd4169cefe1af5aa3260783f"
[[package]]
name = "git-version"
@@ -3404,6 +3410,15 @@ dependencies = [
"adler",
]
[[package]]
name = "miniz_oxide"
version = "0.8.0"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "e2d80299ef12ff69b16a84bb182e3b9df68b5a91574d3d4fa6e41b65deec4df1"
dependencies = [
"adler2",
]
[[package]]
name = "mio"
version = "0.8.11"
@@ -3638,9 +3653,9 @@ dependencies = [
[[package]]
name = "object"
version = "0.32.2"
version = "0.36.5"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "a6a622008b6e321afc04970976f62ee297fdbaa6f95318ca343e3eebb9648441"
checksum = "aedf0a2d09c573ed1d8d85b30c119153926a2b36dce0ab28322c09a117a4683e"
dependencies = [
"memchr",
]
@@ -3930,6 +3945,7 @@ dependencies = [
"serde_json",
"serde_path_to_error",
"serde_with",
"serial_test",
"smallvec",
"storage_broker",
"strum",
@@ -4401,11 +4417,13 @@ dependencies = [
"bindgen",
"bytes",
"crc32c",
"criterion",
"env_logger",
"log",
"memoffset 0.9.0",
"once_cell",
"postgres",
"pprof",
"regex",
"serde",
"thiserror",
@@ -5062,6 +5080,7 @@ dependencies = [
"once_cell",
"pin-project-lite",
"rand 0.8.5",
"reqwest",
"scopeguard",
"serde",
"serde_json",
@@ -5320,9 +5339,9 @@ dependencies = [
[[package]]
name = "rustc-demangle"
version = "0.1.23"
version = "0.1.24"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "d626bb9dae77e28219937af045c257c28bfd3f69333c512553507f5f9798cb76"
checksum = "719b953e2095829ee67db738b3bfa9fa368c94900df327b3f07fe6e794d2fe1f"
[[package]]
name = "rustc-hash"
@@ -5535,6 +5554,7 @@ dependencies = [
"remote_storage",
"reqwest",
"safekeeper_api",
"safekeeper_client",
"scopeguard",
"sd-notify",
"serde",
@@ -5565,10 +5585,25 @@ name = "safekeeper_api"
version = "0.1.0"
dependencies = [
"const_format",
"postgres_ffi",
"pq_proto",
"serde",
"tokio",
"utils",
]
[[package]]
name = "safekeeper_client"
version = "0.1.0"
dependencies = [
"reqwest",
"safekeeper_api",
"serde",
"thiserror",
"utils",
"workspace_hack",
]
[[package]]
name = "same-file"
version = "1.0.6"
@@ -5578,6 +5613,15 @@ dependencies = [
"winapi-util",
]
[[package]]
name = "scc"
version = "2.2.6"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "94b13f8ea6177672c49d12ed964cca44836f59621981b04a3e26b87e675181de"
dependencies = [
"sdd",
]
[[package]]
name = "schannel"
version = "0.1.23"
@@ -5618,6 +5662,12 @@ version = "0.4.1"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "621e3680f3e07db4c9c2c3fb07c6223ab2fab2e54bd3c04c3ae037990f428c32"
[[package]]
name = "sdd"
version = "3.0.5"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "478f121bb72bbf63c52c93011ea1791dca40140dfe13f8336c4c5ac952c33aa9"
[[package]]
name = "sec1"
version = "0.3.0"
@@ -5907,6 +5957,31 @@ dependencies = [
"serde",
]
[[package]]
name = "serial_test"
version = "3.2.0"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "1b258109f244e1d6891bf1053a55d63a5cd4f8f4c30cf9a1280989f80e7a1fa9"
dependencies = [
"futures",
"log",
"once_cell",
"parking_lot 0.12.1",
"scc",
"serial_test_derive",
]
[[package]]
name = "serial_test_derive"
version = "3.2.0"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "5d69265a08751de7844521fd15003ae0a888e035773ba05695c5c759a6f89eef"
dependencies = [
"proc-macro2",
"quote",
"syn 2.0.90",
]
[[package]]
name = "sha1"
version = "0.10.5"
@@ -7200,6 +7275,7 @@ dependencies = [
"anyhow",
"arc-swap",
"async-compression",
"backtrace",
"bincode",
"byteorder",
"bytes",
@@ -7210,12 +7286,14 @@ dependencies = [
"criterion",
"diatomic-waker",
"fail",
"flate2",
"futures",
"git-version",
"hex",
"hex-literal",
"humantime",
"hyper 0.14.30",
"itertools 0.10.5",
"jemalloc_pprof",
"jsonwebtoken",
"metrics",
@@ -7572,7 +7650,7 @@ source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "e48a53791691ab099e5e2ad123536d0fff50652600abaf43bbf952894110d0be"
dependencies = [
"windows-core",
"windows-targets 0.52.4",
"windows-targets 0.52.6",
]
[[package]]
@@ -7581,7 +7659,7 @@ version = "0.52.0"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "33ab640c8d7e35bf8ba19b884ba838ceb4fba93a4e8c65a9059d08afcfc683d9"
dependencies = [
"windows-targets 0.52.4",
"windows-targets 0.52.6",
]
[[package]]
@@ -7599,7 +7677,7 @@ version = "0.52.0"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "282be5f36a8ce781fad8c8ae18fa3f9beff57ec1b52cb3de0789201425d9a33d"
dependencies = [
"windows-targets 0.52.4",
"windows-targets 0.52.6",
]
[[package]]
@@ -7619,17 +7697,18 @@ dependencies = [
[[package]]
name = "windows-targets"
version = "0.52.4"
version = "0.52.6"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "7dd37b7e5ab9018759f893a1952c9420d060016fc19a472b4bb20d1bdd694d1b"
checksum = "9b724f72796e036ab90c1021d4780d4d3d648aca59e491e6b98e725b84e99973"
dependencies = [
"windows_aarch64_gnullvm 0.52.4",
"windows_aarch64_msvc 0.52.4",
"windows_i686_gnu 0.52.4",
"windows_i686_msvc 0.52.4",
"windows_x86_64_gnu 0.52.4",
"windows_x86_64_gnullvm 0.52.4",
"windows_x86_64_msvc 0.52.4",
"windows_aarch64_gnullvm 0.52.6",
"windows_aarch64_msvc 0.52.6",
"windows_i686_gnu 0.52.6",
"windows_i686_gnullvm",
"windows_i686_msvc 0.52.6",
"windows_x86_64_gnu 0.52.6",
"windows_x86_64_gnullvm 0.52.6",
"windows_x86_64_msvc 0.52.6",
]
[[package]]
@@ -7640,9 +7719,9 @@ checksum = "91ae572e1b79dba883e0d315474df7305d12f569b400fcf90581b06062f7e1bc"
[[package]]
name = "windows_aarch64_gnullvm"
version = "0.52.4"
version = "0.52.6"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "bcf46cf4c365c6f2d1cc93ce535f2c8b244591df96ceee75d8e83deb70a9cac9"
checksum = "32a4622180e7a0ec044bb555404c800bc9fd9ec262ec147edd5989ccd0c02cd3"
[[package]]
name = "windows_aarch64_msvc"
@@ -7652,9 +7731,9 @@ checksum = "b2ef27e0d7bdfcfc7b868b317c1d32c641a6fe4629c171b8928c7b08d98d7cf3"
[[package]]
name = "windows_aarch64_msvc"
version = "0.52.4"
version = "0.52.6"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "da9f259dd3bcf6990b55bffd094c4f7235817ba4ceebde8e6d11cd0c5633b675"
checksum = "09ec2a7bb152e2252b53fa7803150007879548bc709c039df7627cabbd05d469"
[[package]]
name = "windows_i686_gnu"
@@ -7664,9 +7743,15 @@ checksum = "622a1962a7db830d6fd0a69683c80a18fda201879f0f447f065a3b7467daa241"
[[package]]
name = "windows_i686_gnu"
version = "0.52.4"
version = "0.52.6"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "b474d8268f99e0995f25b9f095bc7434632601028cf86590aea5c8a5cb7801d3"
checksum = "8e9b5ad5ab802e97eb8e295ac6720e509ee4c243f69d781394014ebfe8bbfa0b"
[[package]]
name = "windows_i686_gnullvm"
version = "0.52.6"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "0eee52d38c090b3caa76c563b86c3a4bd71ef1a819287c19d586d7334ae8ed66"
[[package]]
name = "windows_i686_msvc"
@@ -7676,9 +7761,9 @@ checksum = "4542c6e364ce21bf45d69fdd2a8e455fa38d316158cfd43b3ac1c5b1b19f8e00"
[[package]]
name = "windows_i686_msvc"
version = "0.52.4"
version = "0.52.6"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "1515e9a29e5bed743cb4415a9ecf5dfca648ce85ee42e15873c3cd8610ff8e02"
checksum = "240948bc05c5e7c6dabba28bf89d89ffce3e303022809e73deaefe4f6ec56c66"
[[package]]
name = "windows_x86_64_gnu"
@@ -7688,9 +7773,9 @@ checksum = "ca2b8a661f7628cbd23440e50b05d705db3686f894fc9580820623656af974b1"
[[package]]
name = "windows_x86_64_gnu"
version = "0.52.4"
version = "0.52.6"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "5eee091590e89cc02ad514ffe3ead9eb6b660aedca2183455434b93546371a03"
checksum = "147a5c80aabfbf0c7d901cb5895d1de30ef2907eb21fbbab29ca94c5b08b1a78"
[[package]]
name = "windows_x86_64_gnullvm"
@@ -7700,9 +7785,9 @@ checksum = "7896dbc1f41e08872e9d5e8f8baa8fdd2677f29468c4e156210174edc7f7b953"
[[package]]
name = "windows_x86_64_gnullvm"
version = "0.52.4"
version = "0.52.6"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "77ca79f2451b49fa9e2af39f0747fe999fcda4f5e241b2898624dca97a1f2177"
checksum = "24d5b23dc417412679681396f2b49f3de8c1473deb516bd34410872eff51ed0d"
[[package]]
name = "windows_x86_64_msvc"
@@ -7712,9 +7797,9 @@ checksum = "1a515f5799fe4961cb532f983ce2b23082366b898e52ffbce459c86f67c8378a"
[[package]]
name = "windows_x86_64_msvc"
version = "0.52.4"
version = "0.52.6"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "32b752e52a2da0ddfbdbcc6fceadfeede4c939ed16d13e648833a61dfb611ed8"
checksum = "589f6da84c646204747d1270a2a5661ea66ed1cced2631d546fdfb155959f9ec"
[[package]]
name = "winnow"

View File

@@ -11,6 +11,7 @@ members = [
"pageserver/pagebench",
"proxy",
"safekeeper",
"safekeeper/client",
"storage_broker",
"storage_controller",
"storage_controller/client",
@@ -51,6 +52,7 @@ anyhow = { version = "1.0", features = ["backtrace"] }
arc-swap = "1.6"
async-compression = { version = "0.4.0", features = ["tokio", "gzip", "zstd"] }
atomic-take = "1.1.0"
backtrace = "0.3.74"
flate2 = "1.0.26"
async-stream = "0.3"
async-trait = "0.1"
@@ -159,6 +161,7 @@ serde_json = "1"
serde_path_to_error = "0.1"
serde_with = { version = "2.0", features = [ "base64" ] }
serde_assert = "0.5.0"
serial_test = "3.2.0"
sha2 = "0.10.2"
signal-hook = "0.3"
smallvec = "1.11"
@@ -233,6 +236,7 @@ postgres_initdb = { path = "./libs/postgres_initdb" }
pq_proto = { version = "0.1", path = "./libs/pq_proto/" }
remote_storage = { version = "0.1", path = "./libs/remote_storage/" }
safekeeper_api = { version = "0.1", path = "./libs/safekeeper_api" }
safekeeper_client = { path = "./safekeeper/client" }
desim = { version = "0.1", path = "./libs/desim" }
storage_broker = { version = "0.1", path = "./storage_broker/" } # Note: main broker code is inside the binary crate, so linking with the library shouldn't be heavy.
storage_controller_client = { path = "./storage_controller/client" }

View File

@@ -35,10 +35,12 @@ RUN case $DEBIAN_VERSION in \
;; \
esac && \
apt update && \
apt install --no-install-recommends -y git autoconf automake libtool build-essential bison flex libreadline-dev \
apt install --no-install-recommends --no-install-suggests -y \
ninja-build git autoconf automake libtool build-essential bison flex libreadline-dev \
zlib1g-dev libxml2-dev libcurl4-openssl-dev libossp-uuid-dev wget ca-certificates pkg-config libssl-dev \
libicu-dev libxslt1-dev liblz4-dev libzstd-dev zstd \
$VERSION_INSTALLS
$VERSION_INSTALLS \
&& apt clean && rm -rf /var/lib/apt/lists/*
#########################################################################################
#
@@ -113,10 +115,12 @@ ARG DEBIAN_VERSION
ARG PG_VERSION
COPY --from=pg-build /usr/local/pgsql/ /usr/local/pgsql/
RUN apt update && \
apt install --no-install-recommends -y gdal-bin libboost-dev libboost-thread-dev libboost-filesystem-dev \
apt install --no-install-recommends --no-install-suggests -y \
gdal-bin libboost-dev libboost-thread-dev libboost-filesystem-dev \
libboost-system-dev libboost-iostreams-dev libboost-program-options-dev libboost-timer-dev \
libcgal-dev libgdal-dev libgmp-dev libmpfr-dev libopenscenegraph-dev libprotobuf-c-dev \
protobuf-c-compiler xsltproc
protobuf-c-compiler xsltproc \
&& apt clean && rm -rf /var/lib/apt/lists/*
# Postgis 3.5.0 requires SFCGAL 1.4+
@@ -143,9 +147,9 @@ RUN case "${DEBIAN_VERSION}" in \
wget https://gitlab.com/sfcgal/SFCGAL/-/archive/v${SFCGAL_VERSION}/SFCGAL-v${SFCGAL_VERSION}.tar.gz -O SFCGAL.tar.gz && \
echo "${SFCGAL_CHECKSUM} SFCGAL.tar.gz" | sha256sum --check && \
mkdir sfcgal-src && cd sfcgal-src && tar xzf ../SFCGAL.tar.gz --strip-components=1 -C . && \
cmake -DCMAKE_BUILD_TYPE=Release . && make -j $(getconf _NPROCESSORS_ONLN) && \
DESTDIR=/sfcgal make install -j $(getconf _NPROCESSORS_ONLN) && \
make clean && cp -R /sfcgal/* /
cmake -DCMAKE_BUILD_TYPE=Release -GNinja . && ninja -j $(getconf _NPROCESSORS_ONLN) && \
DESTDIR=/sfcgal ninja install -j $(getconf _NPROCESSORS_ONLN) && \
ninja clean && cp -R /sfcgal/* /
ENV PATH="/usr/local/pgsql/bin:$PATH"
@@ -213,9 +217,9 @@ RUN case "${PG_VERSION}" in \
echo "${PGROUTING_CHECKSUM} pgrouting.tar.gz" | sha256sum --check && \
mkdir pgrouting-src && cd pgrouting-src && tar xzf ../pgrouting.tar.gz --strip-components=1 -C . && \
mkdir build && cd build && \
cmake -DCMAKE_BUILD_TYPE=Release .. && \
make -j $(getconf _NPROCESSORS_ONLN) && \
make -j $(getconf _NPROCESSORS_ONLN) install && \
cmake -GNinja -DCMAKE_BUILD_TYPE=Release .. && \
ninja -j $(getconf _NPROCESSORS_ONLN) && \
ninja -j $(getconf _NPROCESSORS_ONLN) install && \
echo 'trusted = true' >> /usr/local/pgsql/share/extension/pgrouting.control && \
find /usr/local/pgsql -type f | sed 's|^/usr/local/pgsql/||' > /after.txt &&\
cp /usr/local/pgsql/share/extension/pgrouting.control /extensions/postgis && \
@@ -235,7 +239,9 @@ COPY --from=pg-build /usr/local/pgsql/ /usr/local/pgsql/
COPY compute/patches/plv8-3.1.10.patch /plv8-3.1.10.patch
RUN apt update && \
apt install --no-install-recommends -y ninja-build python3-dev libncurses5 binutils clang
apt install --no-install-recommends --no-install-suggests -y \
ninja-build python3-dev libncurses5 binutils clang \
&& apt clean && rm -rf /var/lib/apt/lists/*
# plv8 3.2.3 supports v17
# last release v3.2.3 - Sep 7, 2024
@@ -301,9 +307,10 @@ RUN mkdir -p /h3/usr/ && \
echo "ec99f1f5974846bde64f4513cf8d2ea1b8d172d2218ab41803bf6a63532272bc h3.tar.gz" | sha256sum --check && \
mkdir h3-src && cd h3-src && tar xzf ../h3.tar.gz --strip-components=1 -C . && \
mkdir build && cd build && \
cmake .. -DCMAKE_BUILD_TYPE=Release && \
make -j $(getconf _NPROCESSORS_ONLN) && \
DESTDIR=/h3 make install && \
cmake .. -GNinja -DBUILD_BENCHMARKS=0 -DCMAKE_BUILD_TYPE=Release \
-DBUILD_FUZZERS=0 -DBUILD_FILTERS=0 -DBUILD_GENERATORS=0 -DBUILD_TESTING=0 \
&& ninja -j $(getconf _NPROCESSORS_ONLN) && \
DESTDIR=/h3 ninja install && \
cp -R /h3/usr / && \
rm -rf build
@@ -650,14 +657,15 @@ FROM build-deps AS rdkit-pg-build
ARG PG_VERSION
COPY --from=pg-build /usr/local/pgsql/ /usr/local/pgsql/
RUN apt-get update && \
apt-get install --no-install-recommends -y \
RUN apt update && \
apt install --no-install-recommends --no-install-suggests -y \
libboost-iostreams1.74-dev \
libboost-regex1.74-dev \
libboost-serialization1.74-dev \
libboost-system1.74-dev \
libeigen3-dev \
libboost-all-dev
libboost-all-dev \
&& apt clean && rm -rf /var/lib/apt/lists/*
# rdkit Release_2024_09_1 supports v17
# last release Release_2024_09_1 - Sep 27, 2024
@@ -693,6 +701,8 @@ RUN case "${PG_VERSION}" in \
-D RDK_BUILD_MOLINTERCHANGE_SUPPORT=OFF \
-D RDK_BUILD_YAEHMOP_SUPPORT=OFF \
-D RDK_BUILD_STRUCTCHECKER_SUPPORT=OFF \
-D RDK_TEST_MULTITHREADED=OFF \
-D RDK_BUILD_CPP_TESTS=OFF \
-D RDK_USE_URF=OFF \
-D RDK_BUILD_PGSQL=ON \
-D RDK_PGSQL_STATIC=ON \
@@ -704,9 +714,10 @@ RUN case "${PG_VERSION}" in \
-D RDK_INSTALL_COMIC_FONTS=OFF \
-D RDK_BUILD_FREETYPE_SUPPORT=OFF \
-D CMAKE_BUILD_TYPE=Release \
-GNinja \
. && \
make -j $(getconf _NPROCESSORS_ONLN) && \
make -j $(getconf _NPROCESSORS_ONLN) install && \
ninja -j $(getconf _NPROCESSORS_ONLN) && \
ninja -j $(getconf _NPROCESSORS_ONLN) install && \
echo 'trusted = true' >> /usr/local/pgsql/share/extension/rdkit.control
#########################################################################################
@@ -849,8 +860,9 @@ FROM build-deps AS rust-extensions-build
ARG PG_VERSION
COPY --from=pg-build /usr/local/pgsql/ /usr/local/pgsql/
RUN apt-get update && \
apt-get install --no-install-recommends -y curl libclang-dev && \
RUN apt update && \
apt install --no-install-recommends --no-install-suggests -y curl libclang-dev && \
apt clean && rm -rf /var/lib/apt/lists/* && \
useradd -ms /bin/bash nonroot -b /home
ENV HOME=/home/nonroot
@@ -885,8 +897,9 @@ FROM build-deps AS rust-extensions-build-pgrx12
ARG PG_VERSION
COPY --from=pg-build /usr/local/pgsql/ /usr/local/pgsql/
RUN apt-get update && \
apt-get install --no-install-recommends -y curl libclang-dev && \
RUN apt update && \
apt install --no-install-recommends --no-install-suggests -y curl libclang-dev && \
apt clean && rm -rf /var/lib/apt/lists/* && \
useradd -ms /bin/bash nonroot -b /home
ENV HOME=/home/nonroot
@@ -914,18 +927,22 @@ FROM rust-extensions-build-pgrx12 AS pg-onnx-build
# cmake 3.26 or higher is required, so installing it using pip (bullseye-backports has cmake 3.25).
# Install it using virtual environment, because Python 3.11 (the default version on Debian 12 (Bookworm)) complains otherwise
RUN apt-get update && apt-get install -y python3 python3-pip python3-venv && \
RUN apt update && apt install --no-install-recommends --no-install-suggests -y \
python3 python3-pip python3-venv && \
apt clean && rm -rf /var/lib/apt/lists/* && \
python3 -m venv venv && \
. venv/bin/activate && \
python3 -m pip install cmake==3.30.5 && \
wget https://github.com/microsoft/onnxruntime/archive/refs/tags/v1.18.1.tar.gz -O onnxruntime.tar.gz && \
mkdir onnxruntime-src && cd onnxruntime-src && tar xzf ../onnxruntime.tar.gz --strip-components=1 -C . && \
./build.sh --config Release --parallel --skip_submodule_sync --skip_tests --allow_running_as_root
./build.sh --config Release --parallel --cmake_generator Ninja \
--skip_submodule_sync --skip_tests --allow_running_as_root
FROM pg-onnx-build AS pgrag-pg-build
RUN apt-get install -y protobuf-compiler && \
RUN apt update && apt install --no-install-recommends --no-install-suggests -y protobuf-compiler \
&& apt clean && rm -rf /var/lib/apt/lists/* && \
wget https://github.com/neondatabase-labs/pgrag/archive/refs/tags/v0.0.0.tar.gz -O pgrag.tar.gz && \
echo "2cbe394c1e74fc8bcad9b52d5fbbfb783aef834ca3ce44626cfd770573700bb4 pgrag.tar.gz" | sha256sum --check && \
mkdir pgrag-src && cd pgrag-src && tar xzf ../pgrag.tar.gz --strip-components=1 -C . && \
@@ -1168,6 +1185,25 @@ RUN case "${PG_VERSION}" in \
make BUILD_TYPE=release -j $(getconf _NPROCESSORS_ONLN) install && \
echo 'trusted = true' >> /usr/local/pgsql/share/extension/pg_mooncake.control
#########################################################################################
#
# Layer "pg_repack"
# compile pg_repack extension
#
#########################################################################################
FROM build-deps AS pg-repack-build
ARG PG_VERSION
COPY --from=pg-build /usr/local/pgsql/ /usr/local/pgsql/
ENV PATH="/usr/local/pgsql/bin/:$PATH"
RUN wget https://github.com/reorg/pg_repack/archive/refs/tags/ver_1.5.2.tar.gz -O pg_repack.tar.gz && \
echo '4516cad42251ed3ad53ff619733004db47d5755acac83f75924cd94d1c4fb681 pg_repack.tar.gz' | sha256sum --check && \
mkdir pg_repack-src && cd pg_repack-src && tar xzf ../pg_repack.tar.gz --strip-components=1 -C . && \
make -j $(getconf _NPROCESSORS_ONLN) && \
make -j $(getconf _NPROCESSORS_ONLN) install
#########################################################################################
#
# Layer "neon-pg-ext-build"
@@ -1213,6 +1249,7 @@ COPY --from=pg-anon-pg-build /usr/local/pgsql/ /usr/local/pgsql/
COPY --from=pg-ivm-build /usr/local/pgsql/ /usr/local/pgsql/
COPY --from=pg-partman-build /usr/local/pgsql/ /usr/local/pgsql/
COPY --from=pg-mooncake-build /usr/local/pgsql/ /usr/local/pgsql/
COPY --from=pg-repack-build /usr/local/pgsql/ /usr/local/pgsql/
COPY pgxn/ pgxn/
RUN make -j $(getconf _NPROCESSORS_ONLN) \
@@ -1279,8 +1316,8 @@ COPY --from=compute-tools /home/nonroot/target/release-line-debug-size-lto/fast_
FROM debian:$DEBIAN_FLAVOR AS pgbouncer
RUN set -e \
&& apt-get update \
&& apt-get install --no-install-recommends -y \
&& apt update \
&& apt install --no-install-suggests --no-install-recommends -y \
build-essential \
git \
ca-certificates \
@@ -1288,7 +1325,8 @@ RUN set -e \
automake \
libevent-dev \
libtool \
pkg-config
pkg-config \
&& apt clean && rm -rf /var/lib/apt/lists/*
# Use `dist_man_MANS=` to skip manpage generation (which requires python3/pandoc)
ENV PGBOUNCER_TAG=pgbouncer_1_22_1
@@ -1518,28 +1556,30 @@ RUN apt update && \
locales \
procps \
ca-certificates \
curl \
unzip \
$VERSION_INSTALLS && \
rm -rf /var/lib/apt/lists/* /tmp/* /var/tmp/* && \
apt clean && rm -rf /var/lib/apt/lists/* /tmp/* /var/tmp/* && \
localedef -i en_US -c -f UTF-8 -A /usr/share/locale/locale.alias en_US.UTF-8
# s5cmd 2.2.2 from https://github.com/peak/s5cmd/releases/tag/v2.2.2
# used by fast_import
# aws cli is used by fast_import (curl and unzip above are at this time only used for this installation step)
ARG TARGETARCH
ADD https://github.com/peak/s5cmd/releases/download/v2.2.2/s5cmd_2.2.2_linux_$TARGETARCH.deb /tmp/s5cmd.deb
RUN set -ex; \
\
# Determine the expected checksum based on TARGETARCH
if [ "${TARGETARCH}" = "amd64" ]; then \
CHECKSUM="392c385320cd5ffa435759a95af77c215553d967e4b1c0fffe52e4f14c29cf85"; \
TARGETARCH_ALT="x86_64"; \
CHECKSUM="c9a9df3770a3ff9259cb469b6179e02829687a464e0824d5c32d378820b53a00"; \
elif [ "${TARGETARCH}" = "arm64" ]; then \
CHECKSUM="939bee3cf4b5604ddb00e67f8c157b91d7c7a5b553d1fbb6890fad32894b7b46"; \
TARGETARCH_ALT="aarch64"; \
CHECKSUM="8181730be7891582b38b028112e81b4899ca817e8c616aad807c9e9d1289223a"; \
else \
echo "Unsupported architecture: ${TARGETARCH}"; exit 1; \
fi; \
\
# Compute and validate the checksum
echo "${CHECKSUM} /tmp/s5cmd.deb" | sha256sum -c -
RUN dpkg -i /tmp/s5cmd.deb && rm /tmp/s5cmd.deb
curl -L "https://awscli.amazonaws.com/awscli-exe-linux-${TARGETARCH_ALT}-2.17.5.zip" -o /tmp/awscliv2.zip; \
echo "${CHECKSUM} /tmp/awscliv2.zip" | sha256sum -c -; \
unzip /tmp/awscliv2.zip -d /tmp/awscliv2; \
/tmp/awscliv2/aws/install; \
rm -rf /tmp/awscliv2.zip /tmp/awscliv2; \
true
ENV LANG=en_US.utf8
USER postgres

View File

@@ -3,7 +3,7 @@
metrics: [
import 'sql_exporter/checkpoints_req.libsonnet',
import 'sql_exporter/checkpoints_timed.libsonnet',
import 'sql_exporter/compute_backpressure_throttling_seconds.libsonnet',
import 'sql_exporter/compute_backpressure_throttling_seconds_total.libsonnet',
import 'sql_exporter/compute_current_lsn.libsonnet',
import 'sql_exporter/compute_logical_snapshot_files.libsonnet',
import 'sql_exporter/compute_logical_snapshots_bytes.libsonnet',

View File

@@ -1,10 +1,10 @@
{
metric_name: 'compute_backpressure_throttling_seconds',
type: 'gauge',
metric_name: 'compute_backpressure_throttling_seconds_total',
type: 'counter',
help: 'Time compute has spent throttled',
key_labels: null,
values: [
'throttled',
],
query: importstr 'sql_exporter/compute_backpressure_throttling_seconds.sql',
query: importstr 'sql_exporter/compute_backpressure_throttling_seconds_total.sql',
}

View File

@@ -981,7 +981,7 @@ index fc42d418bf..e38f517574 100644
CREATE SCHEMA addr_nsp;
SET search_path TO 'addr_nsp';
diff --git a/src/test/regress/expected/password.out b/src/test/regress/expected/password.out
index 8475231735..1afae5395f 100644
index 8475231735..0653946337 100644
--- a/src/test/regress/expected/password.out
+++ b/src/test/regress/expected/password.out
@@ -12,11 +12,11 @@ SET password_encryption = 'md5'; -- ok
@@ -1006,65 +1006,63 @@ index 8475231735..1afae5395f 100644
-----------------+---------------------------------------------------
- regress_passwd1 | md5783277baca28003b33453252be4dbb34
- regress_passwd2 | md54044304ba511dd062133eb5b4b84a2a3
+ regress_passwd1 | NEON_MD5_PLACEHOLDER_regress_passwd1
+ regress_passwd2 | NEON_MD5_PLACEHOLDER_regress_passwd2
+ regress_passwd1 | NEON_MD5_PLACEHOLDER:regress_passwd1
+ regress_passwd2 | NEON_MD5_PLACEHOLDER:regress_passwd2
regress_passwd3 | SCRAM-SHA-256$4096:<salt>$<storedkey>:<serverkey>
- regress_passwd4 |
+ regress_passwd4 | SCRAM-SHA-256$4096:<salt>$<storedkey>:<serverkey>
(4 rows)
-- Rename a role
@@ -54,24 +54,30 @@ ALTER ROLE regress_passwd2_new RENAME TO regress_passwd2;
@@ -54,24 +54,16 @@ ALTER ROLE regress_passwd2_new RENAME TO regress_passwd2;
-- passwords.
SET password_encryption = 'md5';
-- encrypt with MD5
-ALTER ROLE regress_passwd2 PASSWORD 'foo';
--- already encrypted, use as they are
-ALTER ROLE regress_passwd1 PASSWORD 'md5cd3578025fe2c3d7ed1b9a9b26238b70';
-ALTER ROLE regress_passwd3 PASSWORD 'SCRAM-SHA-256$4096:VLK4RMaQLCvNtQ==$6YtlR4t69SguDiwFvbVgVZtuz6gpJQQqUMZ7IQJK5yI=:ps75jrHeYU4lXCcXI4O8oIdJ3eO8o2jirjruw9phBTo=';
+ALTER ROLE regress_passwd2 PASSWORD NEON_PASSWORD_PLACEHOLDER;
-- already encrypted, use as they are
ALTER ROLE regress_passwd1 PASSWORD 'md5cd3578025fe2c3d7ed1b9a9b26238b70';
+ERROR: Received HTTP code 400 from control plane: {"error":"Neon only supports being given plaintext passwords"}
ALTER ROLE regress_passwd3 PASSWORD 'SCRAM-SHA-256$4096:VLK4RMaQLCvNtQ==$6YtlR4t69SguDiwFvbVgVZtuz6gpJQQqUMZ7IQJK5yI=:ps75jrHeYU4lXCcXI4O8oIdJ3eO8o2jirjruw9phBTo=';
+ERROR: Received HTTP code 400 from control plane: {"error":"Neon only supports being given plaintext passwords"}
SET password_encryption = 'scram-sha-256';
-- create SCRAM secret
-ALTER ROLE regress_passwd4 PASSWORD 'foo';
--- already encrypted with MD5, use as it is
-CREATE ROLE regress_passwd5 PASSWORD 'md5e73a4b11df52a6068f8b39f90be36023';
--- This looks like a valid SCRAM-SHA-256 secret, but it is not
--- so it should be hashed with SCRAM-SHA-256.
-CREATE ROLE regress_passwd6 PASSWORD 'SCRAM-SHA-256$1234';
--- These may look like valid MD5 secrets, but they are not, so they
--- should be hashed with SCRAM-SHA-256.
--- trailing garbage at the end
-CREATE ROLE regress_passwd7 PASSWORD 'md5012345678901234567890123456789zz';
--- invalid length
-CREATE ROLE regress_passwd8 PASSWORD 'md501234567890123456789012345678901zz';
+ALTER ROLE regress_passwd4 PASSWORD NEON_PASSWORD_PLACEHOLDER;
-- already encrypted with MD5, use as it is
CREATE ROLE regress_passwd5 PASSWORD 'md5e73a4b11df52a6068f8b39f90be36023';
+ERROR: Received HTTP code 400 from control plane: {"error":"Neon only supports being given plaintext passwords"}
-- This looks like a valid SCRAM-SHA-256 secret, but it is not
-- so it should be hashed with SCRAM-SHA-256.
CREATE ROLE regress_passwd6 PASSWORD 'SCRAM-SHA-256$1234';
+ERROR: Received HTTP code 400 from control plane: {"error":"Neon only supports being given plaintext passwords"}
-- These may look like valid MD5 secrets, but they are not, so they
-- should be hashed with SCRAM-SHA-256.
-- trailing garbage at the end
CREATE ROLE regress_passwd7 PASSWORD 'md5012345678901234567890123456789zz';
+ERROR: Received HTTP code 400 from control plane: {"error":"Neon only supports being given plaintext passwords"}
-- invalid length
CREATE ROLE regress_passwd8 PASSWORD 'md501234567890123456789012345678901zz';
+ERROR: Received HTTP code 400 from control plane: {"error":"Neon only supports being given plaintext passwords"}
+-- Neon does not support encrypted passwords, use unencrypted instead
+CREATE ROLE regress_passwd5 PASSWORD NEON_PASSWORD_PLACEHOLDER;
+-- Neon does not support encrypted passwords, use unencrypted instead
+CREATE ROLE regress_passwd6 PASSWORD NEON_PASSWORD_PLACEHOLDER;
+CREATE ROLE regress_passwd7 PASSWORD NEON_PASSWORD_PLACEHOLDER;
+CREATE ROLE regress_passwd8 PASSWORD NEON_PASSWORD_PLACEHOLDER;
-- Changing the SCRAM iteration count
SET scram_iterations = 1024;
CREATE ROLE regress_passwd9 PASSWORD 'alterediterationcount';
@@ -81,63 +87,67 @@ SELECT rolname, regexp_replace(rolpassword, '(SCRAM-SHA-256)\$(\d+):([a-zA-Z0-9+
@@ -81,11 +73,11 @@ SELECT rolname, regexp_replace(rolpassword, '(SCRAM-SHA-256)\$(\d+):([a-zA-Z0-9+
ORDER BY rolname, rolpassword;
rolname | rolpassword_masked
-----------------+---------------------------------------------------
- regress_passwd1 | md5cd3578025fe2c3d7ed1b9a9b26238b70
- regress_passwd2 | md5dfa155cadd5f4ad57860162f3fab9cdb
+ regress_passwd1 | NEON_MD5_PLACEHOLDER_regress_passwd1
+ regress_passwd2 | NEON_MD5_PLACEHOLDER_regress_passwd2
+ regress_passwd1 | NEON_MD5_PLACEHOLDER:regress_passwd1
+ regress_passwd2 | NEON_MD5_PLACEHOLDER:regress_passwd2
regress_passwd3 | SCRAM-SHA-256$4096:<salt>$<storedkey>:<serverkey>
regress_passwd4 | SCRAM-SHA-256$4096:<salt>$<storedkey>:<serverkey>
- regress_passwd5 | md5e73a4b11df52a6068f8b39f90be36023
- regress_passwd6 | SCRAM-SHA-256$4096:<salt>$<storedkey>:<serverkey>
- regress_passwd7 | SCRAM-SHA-256$4096:<salt>$<storedkey>:<serverkey>
- regress_passwd8 | SCRAM-SHA-256$4096:<salt>$<storedkey>:<serverkey>
regress_passwd9 | SCRAM-SHA-256$1024:<salt>$<storedkey>:<serverkey>
-(9 rows)
+(5 rows)
+ regress_passwd5 | SCRAM-SHA-256$4096:<salt>$<storedkey>:<serverkey>
regress_passwd6 | SCRAM-SHA-256$4096:<salt>$<storedkey>:<serverkey>
regress_passwd7 | SCRAM-SHA-256$4096:<salt>$<storedkey>:<serverkey>
regress_passwd8 | SCRAM-SHA-256$4096:<salt>$<storedkey>:<serverkey>
@@ -95,23 +87,20 @@ SELECT rolname, regexp_replace(rolpassword, '(SCRAM-SHA-256)\$(\d+):([a-zA-Z0-9+
-- An empty password is not allowed, in any form
CREATE ROLE regress_passwd_empty PASSWORD '';
NOTICE: empty string is not a valid password, clearing password
@@ -1082,56 +1080,37 @@ index 8475231735..1afae5395f 100644
-(1 row)
+(0 rows)
-- Test with invalid stored and server keys.
--
-- The first is valid, to act as a control. The others have too long
-- stored/server keys. They will be re-hashed.
CREATE ROLE regress_passwd_sha_len0 PASSWORD 'SCRAM-SHA-256$4096:A6xHKoH/494E941doaPOYg==$Ky+A30sewHIH3VHQLRN9vYsuzlgNyGNKCh37dy96Rqw=:COPdlNiIkrsacU5QoxydEuOH6e/KfiipeETb/bPw8ZI=';
+ERROR: Received HTTP code 400 from control plane: {"error":"Neon only supports being given plaintext passwords"}
CREATE ROLE regress_passwd_sha_len1 PASSWORD 'SCRAM-SHA-256$4096:A6xHKoH/494E941doaPOYg==$Ky+A30sewHIH3VHQLRN9vYsuzlgNyGNKCh37dy96RqwAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA=:COPdlNiIkrsacU5QoxydEuOH6e/KfiipeETb/bPw8ZI=';
+ERROR: Received HTTP code 400 from control plane: {"error":"Neon only supports being given plaintext passwords"}
CREATE ROLE regress_passwd_sha_len2 PASSWORD 'SCRAM-SHA-256$4096:A6xHKoH/494E941doaPOYg==$Ky+A30sewHIH3VHQLRN9vYsuzlgNyGNKCh37dy96Rqw=:COPdlNiIkrsacU5QoxydEuOH6e/KfiipeETb/bPw8ZIAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA=';
+ERROR: Received HTTP code 400 from control plane: {"error":"Neon only supports being given plaintext passwords"}
--- Test with invalid stored and server keys.
---
--- The first is valid, to act as a control. The others have too long
--- stored/server keys. They will be re-hashed.
-CREATE ROLE regress_passwd_sha_len0 PASSWORD 'SCRAM-SHA-256$4096:A6xHKoH/494E941doaPOYg==$Ky+A30sewHIH3VHQLRN9vYsuzlgNyGNKCh37dy96Rqw=:COPdlNiIkrsacU5QoxydEuOH6e/KfiipeETb/bPw8ZI=';
-CREATE ROLE regress_passwd_sha_len1 PASSWORD 'SCRAM-SHA-256$4096:A6xHKoH/494E941doaPOYg==$Ky+A30sewHIH3VHQLRN9vYsuzlgNyGNKCh37dy96RqwAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA=:COPdlNiIkrsacU5QoxydEuOH6e/KfiipeETb/bPw8ZI=';
-CREATE ROLE regress_passwd_sha_len2 PASSWORD 'SCRAM-SHA-256$4096:A6xHKoH/494E941doaPOYg==$Ky+A30sewHIH3VHQLRN9vYsuzlgNyGNKCh37dy96Rqw=:COPdlNiIkrsacU5QoxydEuOH6e/KfiipeETb/bPw8ZIAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA=';
+-- Neon does not support encrypted passwords, use unencrypted instead
+CREATE ROLE regress_passwd_sha_len0 PASSWORD NEON_PASSWORD_PLACEHOLDER;
+CREATE ROLE regress_passwd_sha_len1 PASSWORD NEON_PASSWORD_PLACEHOLDER;
+CREATE ROLE regress_passwd_sha_len2 PASSWORD NEON_PASSWORD_PLACEHOLDER;
-- Check that the invalid secrets were re-hashed. A re-hashed secret
-- should not contain the original salt.
SELECT rolname, rolpassword not like '%A6xHKoH/494E941doaPOYg==%' as is_rolpassword_rehashed
FROM pg_authid
WHERE rolname LIKE 'regress_passwd_sha_len%'
@@ -120,7 +109,7 @@ SELECT rolname, rolpassword not like '%A6xHKoH/494E941doaPOYg==%' as is_rolpassw
ORDER BY rolname;
- rolname | is_rolpassword_rehashed
--------------------------+-------------------------
rolname | is_rolpassword_rehashed
-------------------------+-------------------------
- regress_passwd_sha_len0 | f
- regress_passwd_sha_len1 | t
- regress_passwd_sha_len2 | t
-(3 rows)
+ rolname | is_rolpassword_rehashed
+---------+-------------------------
+(0 rows)
DROP ROLE regress_passwd1;
DROP ROLE regress_passwd2;
DROP ROLE regress_passwd3;
DROP ROLE regress_passwd4;
DROP ROLE regress_passwd5;
+ERROR: role "regress_passwd5" does not exist
DROP ROLE regress_passwd6;
+ERROR: role "regress_passwd6" does not exist
DROP ROLE regress_passwd7;
+ERROR: role "regress_passwd7" does not exist
+ regress_passwd_sha_len0 | t
regress_passwd_sha_len1 | t
regress_passwd_sha_len2 | t
(3 rows)
@@ -135,6 +124,7 @@ DROP ROLE regress_passwd7;
DROP ROLE regress_passwd8;
+ERROR: role "regress_passwd8" does not exist
DROP ROLE regress_passwd9;
DROP ROLE regress_passwd_empty;
+ERROR: role "regress_passwd_empty" does not exist
DROP ROLE regress_passwd_sha_len0;
+ERROR: role "regress_passwd_sha_len0" does not exist
DROP ROLE regress_passwd_sha_len1;
+ERROR: role "regress_passwd_sha_len1" does not exist
DROP ROLE regress_passwd_sha_len2;
+ERROR: role "regress_passwd_sha_len2" does not exist
-- all entries should have been removed
SELECT rolname, rolpassword
FROM pg_authid
diff --git a/src/test/regress/expected/privileges.out b/src/test/regress/expected/privileges.out
index 5b9dba7b32..cc408dad42 100644
--- a/src/test/regress/expected/privileges.out
@@ -3194,7 +3173,7 @@ index 1a6c61f49d..1c31ac6a53 100644
-- Test generic object addressing/identification functions
CREATE SCHEMA addr_nsp;
diff --git a/src/test/regress/sql/password.sql b/src/test/regress/sql/password.sql
index 53e86b0b6c..f07cf1ec54 100644
index 53e86b0b6c..0303fdfe96 100644
--- a/src/test/regress/sql/password.sql
+++ b/src/test/regress/sql/password.sql
@@ -10,11 +10,11 @@ SET password_encryption = 'scram-sha-256'; -- ok
@@ -3213,23 +3192,59 @@ index 53e86b0b6c..f07cf1ec54 100644
-- check list of created entries
--
@@ -42,14 +42,14 @@ ALTER ROLE regress_passwd2_new RENAME TO regress_passwd2;
@@ -42,26 +42,18 @@ ALTER ROLE regress_passwd2_new RENAME TO regress_passwd2;
SET password_encryption = 'md5';
-- encrypt with MD5
-ALTER ROLE regress_passwd2 PASSWORD 'foo';
--- already encrypted, use as they are
-ALTER ROLE regress_passwd1 PASSWORD 'md5cd3578025fe2c3d7ed1b9a9b26238b70';
-ALTER ROLE regress_passwd3 PASSWORD 'SCRAM-SHA-256$4096:VLK4RMaQLCvNtQ==$6YtlR4t69SguDiwFvbVgVZtuz6gpJQQqUMZ7IQJK5yI=:ps75jrHeYU4lXCcXI4O8oIdJ3eO8o2jirjruw9phBTo=';
+ALTER ROLE regress_passwd2 PASSWORD NEON_PASSWORD_PLACEHOLDER;
-- already encrypted, use as they are
ALTER ROLE regress_passwd1 PASSWORD 'md5cd3578025fe2c3d7ed1b9a9b26238b70';
ALTER ROLE regress_passwd3 PASSWORD 'SCRAM-SHA-256$4096:VLK4RMaQLCvNtQ==$6YtlR4t69SguDiwFvbVgVZtuz6gpJQQqUMZ7IQJK5yI=:ps75jrHeYU4lXCcXI4O8oIdJ3eO8o2jirjruw9phBTo=';
SET password_encryption = 'scram-sha-256';
-- create SCRAM secret
-ALTER ROLE regress_passwd4 PASSWORD 'foo';
--- already encrypted with MD5, use as it is
-CREATE ROLE regress_passwd5 PASSWORD 'md5e73a4b11df52a6068f8b39f90be36023';
+ALTER ROLE regress_passwd4 PASSWORD NEON_PASSWORD_PLACEHOLDER;
-- already encrypted with MD5, use as it is
CREATE ROLE regress_passwd5 PASSWORD 'md5e73a4b11df52a6068f8b39f90be36023';
+-- Neon does not support encrypted passwords, use unencrypted instead
+CREATE ROLE regress_passwd5 PASSWORD NEON_PASSWORD_PLACEHOLDER;
--- This looks like a valid SCRAM-SHA-256 secret, but it is not
--- so it should be hashed with SCRAM-SHA-256.
-CREATE ROLE regress_passwd6 PASSWORD 'SCRAM-SHA-256$1234';
--- These may look like valid MD5 secrets, but they are not, so they
--- should be hashed with SCRAM-SHA-256.
--- trailing garbage at the end
-CREATE ROLE regress_passwd7 PASSWORD 'md5012345678901234567890123456789zz';
--- invalid length
-CREATE ROLE regress_passwd8 PASSWORD 'md501234567890123456789012345678901zz';
+-- Neon does not support encrypted passwords, use unencrypted instead
+CREATE ROLE regress_passwd6 PASSWORD NEON_PASSWORD_PLACEHOLDER;
+CREATE ROLE regress_passwd7 PASSWORD NEON_PASSWORD_PLACEHOLDER;
+CREATE ROLE regress_passwd8 PASSWORD NEON_PASSWORD_PLACEHOLDER;
-- Changing the SCRAM iteration count
SET scram_iterations = 1024;
@@ -78,13 +70,10 @@ ALTER ROLE regress_passwd_empty PASSWORD 'md585939a5ce845f1a1b620742e3c659e0a';
ALTER ROLE regress_passwd_empty PASSWORD 'SCRAM-SHA-256$4096:hpFyHTUsSWcR7O9P$LgZFIt6Oqdo27ZFKbZ2nV+vtnYM995pDh9ca6WSi120=:qVV5NeluNfUPkwm7Vqat25RjSPLkGeoZBQs6wVv+um4=';
SELECT rolpassword FROM pg_authid WHERE rolname='regress_passwd_empty';
--- Test with invalid stored and server keys.
---
--- The first is valid, to act as a control. The others have too long
--- stored/server keys. They will be re-hashed.
-CREATE ROLE regress_passwd_sha_len0 PASSWORD 'SCRAM-SHA-256$4096:A6xHKoH/494E941doaPOYg==$Ky+A30sewHIH3VHQLRN9vYsuzlgNyGNKCh37dy96Rqw=:COPdlNiIkrsacU5QoxydEuOH6e/KfiipeETb/bPw8ZI=';
-CREATE ROLE regress_passwd_sha_len1 PASSWORD 'SCRAM-SHA-256$4096:A6xHKoH/494E941doaPOYg==$Ky+A30sewHIH3VHQLRN9vYsuzlgNyGNKCh37dy96RqwAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA=:COPdlNiIkrsacU5QoxydEuOH6e/KfiipeETb/bPw8ZI=';
-CREATE ROLE regress_passwd_sha_len2 PASSWORD 'SCRAM-SHA-256$4096:A6xHKoH/494E941doaPOYg==$Ky+A30sewHIH3VHQLRN9vYsuzlgNyGNKCh37dy96Rqw=:COPdlNiIkrsacU5QoxydEuOH6e/KfiipeETb/bPw8ZIAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA=';
+-- Neon does not support encrypted passwords, use unencrypted instead
+CREATE ROLE regress_passwd_sha_len0 PASSWORD NEON_PASSWORD_PLACEHOLDER;
+CREATE ROLE regress_passwd_sha_len1 PASSWORD NEON_PASSWORD_PLACEHOLDER;
+CREATE ROLE regress_passwd_sha_len2 PASSWORD NEON_PASSWORD_PLACEHOLDER;
-- Check that the invalid secrets were re-hashed. A re-hashed secret
-- should not contain the original salt.
diff --git a/src/test/regress/sql/privileges.sql b/src/test/regress/sql/privileges.sql
index 249df17a58..b258e7f26a 100644
--- a/src/test/regress/sql/privileges.sql

View File

@@ -1014,10 +1014,10 @@ index fc42d418bf..e38f517574 100644
CREATE SCHEMA addr_nsp;
SET search_path TO 'addr_nsp';
diff --git a/src/test/regress/expected/password.out b/src/test/regress/expected/password.out
index 924d6e001d..5966531db6 100644
index 924d6e001d..7fdda73439 100644
--- a/src/test/regress/expected/password.out
+++ b/src/test/regress/expected/password.out
@@ -12,13 +12,13 @@ SET password_encryption = 'md5'; -- ok
@@ -12,13 +12,11 @@ SET password_encryption = 'md5'; -- ok
SET password_encryption = 'scram-sha-256'; -- ok
-- consistency of password entries
SET password_encryption = 'md5';
@@ -1026,9 +1026,7 @@ index 924d6e001d..5966531db6 100644
-CREATE ROLE regress_passwd2;
-ALTER ROLE regress_passwd2 PASSWORD 'role_pwd2';
+CREATE ROLE regress_passwd1 PASSWORD NEON_PASSWORD_PLACEHOLDER;
+ALTER ROLE regress_passwd1 PASSWORD NEON_PASSWORD_PLACEHOLDER;
+CREATE ROLE regress_passwd2 PASSWORD NEON_PASSWORD_PLACEHOLDER;
+ALTER ROLE regress_passwd2 PASSWORD NEON_PASSWORD_PLACEHOLDER;
SET password_encryption = 'scram-sha-256';
-CREATE ROLE regress_passwd3 PASSWORD 'role_pwd3';
-CREATE ROLE regress_passwd4 PASSWORD NULL;
@@ -1037,71 +1035,69 @@ index 924d6e001d..5966531db6 100644
-- check list of created entries
--
-- The scram secret will look something like:
@@ -32,10 +32,10 @@ SELECT rolname, regexp_replace(rolpassword, '(SCRAM-SHA-256)\$(\d+):([a-zA-Z0-9+
@@ -32,10 +30,10 @@ SELECT rolname, regexp_replace(rolpassword, '(SCRAM-SHA-256)\$(\d+):([a-zA-Z0-9+
ORDER BY rolname, rolpassword;
rolname | rolpassword_masked
-----------------+---------------------------------------------------
- regress_passwd1 | md5783277baca28003b33453252be4dbb34
- regress_passwd2 | md54044304ba511dd062133eb5b4b84a2a3
+ regress_passwd1 | NEON_MD5_PLACEHOLDER_regress_passwd1
+ regress_passwd2 | NEON_MD5_PLACEHOLDER_regress_passwd2
+ regress_passwd1 | NEON_MD5_PLACEHOLDER:regress_passwd1
+ regress_passwd2 | NEON_MD5_PLACEHOLDER:regress_passwd2
regress_passwd3 | SCRAM-SHA-256$4096:<salt>$<storedkey>:<serverkey>
- regress_passwd4 |
+ regress_passwd4 | SCRAM-SHA-256$4096:<salt>$<storedkey>:<serverkey>
(4 rows)
-- Rename a role
@@ -56,24 +56,30 @@ ALTER ROLE regress_passwd2_new RENAME TO regress_passwd2;
@@ -56,24 +54,17 @@ ALTER ROLE regress_passwd2_new RENAME TO regress_passwd2;
-- passwords.
SET password_encryption = 'md5';
-- encrypt with MD5
-ALTER ROLE regress_passwd2 PASSWORD 'foo';
--- already encrypted, use as they are
-ALTER ROLE regress_passwd1 PASSWORD 'md5cd3578025fe2c3d7ed1b9a9b26238b70';
-ALTER ROLE regress_passwd3 PASSWORD 'SCRAM-SHA-256$4096:VLK4RMaQLCvNtQ==$6YtlR4t69SguDiwFvbVgVZtuz6gpJQQqUMZ7IQJK5yI=:ps75jrHeYU4lXCcXI4O8oIdJ3eO8o2jirjruw9phBTo=';
+ALTER ROLE regress_passwd2 PASSWORD NEON_PASSWORD_PLACEHOLDER;
-- already encrypted, use as they are
ALTER ROLE regress_passwd1 PASSWORD 'md5cd3578025fe2c3d7ed1b9a9b26238b70';
+ERROR: Received HTTP code 400 from control plane: {"error":"Neon only supports being given plaintext passwords"}
ALTER ROLE regress_passwd3 PASSWORD 'SCRAM-SHA-256$4096:VLK4RMaQLCvNtQ==$6YtlR4t69SguDiwFvbVgVZtuz6gpJQQqUMZ7IQJK5yI=:ps75jrHeYU4lXCcXI4O8oIdJ3eO8o2jirjruw9phBTo=';
+ERROR: Received HTTP code 400 from control plane: {"error":"Neon only supports being given plaintext passwords"}
SET password_encryption = 'scram-sha-256';
-- create SCRAM secret
-ALTER ROLE regress_passwd4 PASSWORD 'foo';
+ALTER ROLE regress_passwd4 PASSWORD NEON_PASSWORD_PLACEHOLDER;
-- already encrypted with MD5, use as it is
CREATE ROLE regress_passwd5 PASSWORD 'md5e73a4b11df52a6068f8b39f90be36023';
+ERROR: Received HTTP code 400 from control plane: {"error":"Neon only supports being given plaintext passwords"}
-- This looks like a valid SCRAM-SHA-256 secret, but it is not
-- so it should be hashed with SCRAM-SHA-256.
CREATE ROLE regress_passwd6 PASSWORD 'SCRAM-SHA-256$1234';
+ERROR: Received HTTP code 400 from control plane: {"error":"Neon only supports being given plaintext passwords"}
-- These may look like valid MD5 secrets, but they are not, so they
-- should be hashed with SCRAM-SHA-256.
-- trailing garbage at the end
CREATE ROLE regress_passwd7 PASSWORD 'md5012345678901234567890123456789zz';
+ERROR: Received HTTP code 400 from control plane: {"error":"Neon only supports being given plaintext passwords"}
-- invalid length
CREATE ROLE regress_passwd8 PASSWORD 'md501234567890123456789012345678901zz';
+ERROR: Received HTTP code 400 from control plane: {"error":"Neon only supports being given plaintext passwords"}
-CREATE ROLE regress_passwd5 PASSWORD 'md5e73a4b11df52a6068f8b39f90be36023';
--- This looks like a valid SCRAM-SHA-256 secret, but it is not
--- so it should be hashed with SCRAM-SHA-256.
-CREATE ROLE regress_passwd6 PASSWORD 'SCRAM-SHA-256$1234';
--- These may look like valid MD5 secrets, but they are not, so they
--- should be hashed with SCRAM-SHA-256.
--- trailing garbage at the end
-CREATE ROLE regress_passwd7 PASSWORD 'md5012345678901234567890123456789zz';
--- invalid length
-CREATE ROLE regress_passwd8 PASSWORD 'md501234567890123456789012345678901zz';
+-- Neon does not support encrypted passwords, use unencrypted instead
+CREATE ROLE regress_passwd5 PASSWORD NEON_PASSWORD_PLACEHOLDER;
+-- Neon does not support encrypted passwords, use unencrypted instead
+CREATE ROLE regress_passwd6 PASSWORD NEON_PASSWORD_PLACEHOLDER;
+CREATE ROLE regress_passwd7 PASSWORD NEON_PASSWORD_PLACEHOLDER;
+CREATE ROLE regress_passwd8 PASSWORD NEON_PASSWORD_PLACEHOLDER;
-- Changing the SCRAM iteration count
SET scram_iterations = 1024;
CREATE ROLE regress_passwd9 PASSWORD 'alterediterationcount';
@@ -83,63 +89,67 @@ SELECT rolname, regexp_replace(rolpassword, '(SCRAM-SHA-256)\$(\d+):([a-zA-Z0-9+
@@ -83,11 +74,11 @@ SELECT rolname, regexp_replace(rolpassword, '(SCRAM-SHA-256)\$(\d+):([a-zA-Z0-9+
ORDER BY rolname, rolpassword;
rolname | rolpassword_masked
-----------------+---------------------------------------------------
- regress_passwd1 | md5cd3578025fe2c3d7ed1b9a9b26238b70
- regress_passwd2 | md5dfa155cadd5f4ad57860162f3fab9cdb
+ regress_passwd1 | NEON_MD5_PLACEHOLDER_regress_passwd1
+ regress_passwd2 | NEON_MD5_PLACEHOLDER_regress_passwd2
+ regress_passwd1 | NEON_MD5_PLACEHOLDER:regress_passwd1
+ regress_passwd2 | NEON_MD5_PLACEHOLDER:regress_passwd2
regress_passwd3 | SCRAM-SHA-256$4096:<salt>$<storedkey>:<serverkey>
regress_passwd4 | SCRAM-SHA-256$4096:<salt>$<storedkey>:<serverkey>
- regress_passwd5 | md5e73a4b11df52a6068f8b39f90be36023
- regress_passwd6 | SCRAM-SHA-256$4096:<salt>$<storedkey>:<serverkey>
- regress_passwd7 | SCRAM-SHA-256$4096:<salt>$<storedkey>:<serverkey>
- regress_passwd8 | SCRAM-SHA-256$4096:<salt>$<storedkey>:<serverkey>
regress_passwd9 | SCRAM-SHA-256$1024:<salt>$<storedkey>:<serverkey>
-(9 rows)
+(5 rows)
+ regress_passwd5 | SCRAM-SHA-256$4096:<salt>$<storedkey>:<serverkey>
regress_passwd6 | SCRAM-SHA-256$4096:<salt>$<storedkey>:<serverkey>
regress_passwd7 | SCRAM-SHA-256$4096:<salt>$<storedkey>:<serverkey>
regress_passwd8 | SCRAM-SHA-256$4096:<salt>$<storedkey>:<serverkey>
@@ -97,23 +88,20 @@ SELECT rolname, regexp_replace(rolpassword, '(SCRAM-SHA-256)\$(\d+):([a-zA-Z0-9+
-- An empty password is not allowed, in any form
CREATE ROLE regress_passwd_empty PASSWORD '';
NOTICE: empty string is not a valid password, clearing password
@@ -1119,56 +1115,37 @@ index 924d6e001d..5966531db6 100644
-(1 row)
+(0 rows)
-- Test with invalid stored and server keys.
--
-- The first is valid, to act as a control. The others have too long
-- stored/server keys. They will be re-hashed.
CREATE ROLE regress_passwd_sha_len0 PASSWORD 'SCRAM-SHA-256$4096:A6xHKoH/494E941doaPOYg==$Ky+A30sewHIH3VHQLRN9vYsuzlgNyGNKCh37dy96Rqw=:COPdlNiIkrsacU5QoxydEuOH6e/KfiipeETb/bPw8ZI=';
+ERROR: Received HTTP code 400 from control plane: {"error":"Neon only supports being given plaintext passwords"}
CREATE ROLE regress_passwd_sha_len1 PASSWORD 'SCRAM-SHA-256$4096:A6xHKoH/494E941doaPOYg==$Ky+A30sewHIH3VHQLRN9vYsuzlgNyGNKCh37dy96RqwAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA=:COPdlNiIkrsacU5QoxydEuOH6e/KfiipeETb/bPw8ZI=';
+ERROR: Received HTTP code 400 from control plane: {"error":"Neon only supports being given plaintext passwords"}
CREATE ROLE regress_passwd_sha_len2 PASSWORD 'SCRAM-SHA-256$4096:A6xHKoH/494E941doaPOYg==$Ky+A30sewHIH3VHQLRN9vYsuzlgNyGNKCh37dy96Rqw=:COPdlNiIkrsacU5QoxydEuOH6e/KfiipeETb/bPw8ZIAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA=';
+ERROR: Received HTTP code 400 from control plane: {"error":"Neon only supports being given plaintext passwords"}
--- Test with invalid stored and server keys.
---
--- The first is valid, to act as a control. The others have too long
--- stored/server keys. They will be re-hashed.
-CREATE ROLE regress_passwd_sha_len0 PASSWORD 'SCRAM-SHA-256$4096:A6xHKoH/494E941doaPOYg==$Ky+A30sewHIH3VHQLRN9vYsuzlgNyGNKCh37dy96Rqw=:COPdlNiIkrsacU5QoxydEuOH6e/KfiipeETb/bPw8ZI=';
-CREATE ROLE regress_passwd_sha_len1 PASSWORD 'SCRAM-SHA-256$4096:A6xHKoH/494E941doaPOYg==$Ky+A30sewHIH3VHQLRN9vYsuzlgNyGNKCh37dy96RqwAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA=:COPdlNiIkrsacU5QoxydEuOH6e/KfiipeETb/bPw8ZI=';
-CREATE ROLE regress_passwd_sha_len2 PASSWORD 'SCRAM-SHA-256$4096:A6xHKoH/494E941doaPOYg==$Ky+A30sewHIH3VHQLRN9vYsuzlgNyGNKCh37dy96Rqw=:COPdlNiIkrsacU5QoxydEuOH6e/KfiipeETb/bPw8ZIAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA=';
+-- Neon does not support encrypted passwords, use unencrypted instead
+CREATE ROLE regress_passwd_sha_len0 PASSWORD NEON_PASSWORD_PLACEHOLDER;
+CREATE ROLE regress_passwd_sha_len1 PASSWORD NEON_PASSWORD_PLACEHOLDER;
+CREATE ROLE regress_passwd_sha_len2 PASSWORD NEON_PASSWORD_PLACEHOLDER;
-- Check that the invalid secrets were re-hashed. A re-hashed secret
-- should not contain the original salt.
SELECT rolname, rolpassword not like '%A6xHKoH/494E941doaPOYg==%' as is_rolpassword_rehashed
FROM pg_authid
WHERE rolname LIKE 'regress_passwd_sha_len%'
@@ -122,7 +110,7 @@ SELECT rolname, rolpassword not like '%A6xHKoH/494E941doaPOYg==%' as is_rolpassw
ORDER BY rolname;
- rolname | is_rolpassword_rehashed
--------------------------+-------------------------
rolname | is_rolpassword_rehashed
-------------------------+-------------------------
- regress_passwd_sha_len0 | f
- regress_passwd_sha_len1 | t
- regress_passwd_sha_len2 | t
-(3 rows)
+ rolname | is_rolpassword_rehashed
+---------+-------------------------
+(0 rows)
DROP ROLE regress_passwd1;
DROP ROLE regress_passwd2;
DROP ROLE regress_passwd3;
DROP ROLE regress_passwd4;
DROP ROLE regress_passwd5;
+ERROR: role "regress_passwd5" does not exist
DROP ROLE regress_passwd6;
+ERROR: role "regress_passwd6" does not exist
DROP ROLE regress_passwd7;
+ERROR: role "regress_passwd7" does not exist
+ regress_passwd_sha_len0 | t
regress_passwd_sha_len1 | t
regress_passwd_sha_len2 | t
(3 rows)
@@ -137,6 +125,7 @@ DROP ROLE regress_passwd7;
DROP ROLE regress_passwd8;
+ERROR: role "regress_passwd8" does not exist
DROP ROLE regress_passwd9;
DROP ROLE regress_passwd_empty;
+ERROR: role "regress_passwd_empty" does not exist
DROP ROLE regress_passwd_sha_len0;
+ERROR: role "regress_passwd_sha_len0" does not exist
DROP ROLE regress_passwd_sha_len1;
+ERROR: role "regress_passwd_sha_len1" does not exist
DROP ROLE regress_passwd_sha_len2;
+ERROR: role "regress_passwd_sha_len2" does not exist
-- all entries should have been removed
SELECT rolname, rolpassword
FROM pg_authid
diff --git a/src/test/regress/expected/privileges.out b/src/test/regress/expected/privileges.out
index 1296da0d57..f43fffa44c 100644
--- a/src/test/regress/expected/privileges.out
@@ -3249,10 +3226,10 @@ index 1a6c61f49d..1c31ac6a53 100644
-- Test generic object addressing/identification functions
CREATE SCHEMA addr_nsp;
diff --git a/src/test/regress/sql/password.sql b/src/test/regress/sql/password.sql
index bb82aa4aa2..7424c91b10 100644
index bb82aa4aa2..dd8a05e24d 100644
--- a/src/test/regress/sql/password.sql
+++ b/src/test/regress/sql/password.sql
@@ -10,13 +10,13 @@ SET password_encryption = 'scram-sha-256'; -- ok
@@ -10,13 +10,11 @@ SET password_encryption = 'scram-sha-256'; -- ok
-- consistency of password entries
SET password_encryption = 'md5';
@@ -3261,9 +3238,7 @@ index bb82aa4aa2..7424c91b10 100644
-CREATE ROLE regress_passwd2;
-ALTER ROLE regress_passwd2 PASSWORD 'role_pwd2';
+CREATE ROLE regress_passwd1 PASSWORD NEON_PASSWORD_PLACEHOLDER;
+ALTER ROLE regress_passwd1 PASSWORD NEON_PASSWORD_PLACEHOLDER;
+CREATE ROLE regress_passwd2 PASSWORD NEON_PASSWORD_PLACEHOLDER;
+ALTER ROLE regress_passwd2 PASSWORD NEON_PASSWORD_PLACEHOLDER;
SET password_encryption = 'scram-sha-256';
-CREATE ROLE regress_passwd3 PASSWORD 'role_pwd3';
-CREATE ROLE regress_passwd4 PASSWORD NULL;
@@ -3272,23 +3247,59 @@ index bb82aa4aa2..7424c91b10 100644
-- check list of created entries
--
@@ -44,14 +44,14 @@ ALTER ROLE regress_passwd2_new RENAME TO regress_passwd2;
@@ -44,26 +42,19 @@ ALTER ROLE regress_passwd2_new RENAME TO regress_passwd2;
SET password_encryption = 'md5';
-- encrypt with MD5
-ALTER ROLE regress_passwd2 PASSWORD 'foo';
--- already encrypted, use as they are
-ALTER ROLE regress_passwd1 PASSWORD 'md5cd3578025fe2c3d7ed1b9a9b26238b70';
-ALTER ROLE regress_passwd3 PASSWORD 'SCRAM-SHA-256$4096:VLK4RMaQLCvNtQ==$6YtlR4t69SguDiwFvbVgVZtuz6gpJQQqUMZ7IQJK5yI=:ps75jrHeYU4lXCcXI4O8oIdJ3eO8o2jirjruw9phBTo=';
+ALTER ROLE regress_passwd2 PASSWORD NEON_PASSWORD_PLACEHOLDER;
-- already encrypted, use as they are
ALTER ROLE regress_passwd1 PASSWORD 'md5cd3578025fe2c3d7ed1b9a9b26238b70';
ALTER ROLE regress_passwd3 PASSWORD 'SCRAM-SHA-256$4096:VLK4RMaQLCvNtQ==$6YtlR4t69SguDiwFvbVgVZtuz6gpJQQqUMZ7IQJK5yI=:ps75jrHeYU4lXCcXI4O8oIdJ3eO8o2jirjruw9phBTo=';
SET password_encryption = 'scram-sha-256';
-- create SCRAM secret
-ALTER ROLE regress_passwd4 PASSWORD 'foo';
+ALTER ROLE regress_passwd4 PASSWORD NEON_PASSWORD_PLACEHOLDER;
-- already encrypted with MD5, use as it is
CREATE ROLE regress_passwd5 PASSWORD 'md5e73a4b11df52a6068f8b39f90be36023';
-CREATE ROLE regress_passwd5 PASSWORD 'md5e73a4b11df52a6068f8b39f90be36023';
+-- Neon does not support encrypted passwords, use unencrypted instead
+CREATE ROLE regress_passwd5 PASSWORD NEON_PASSWORD_PLACEHOLDER;
--- This looks like a valid SCRAM-SHA-256 secret, but it is not
--- so it should be hashed with SCRAM-SHA-256.
-CREATE ROLE regress_passwd6 PASSWORD 'SCRAM-SHA-256$1234';
--- These may look like valid MD5 secrets, but they are not, so they
--- should be hashed with SCRAM-SHA-256.
--- trailing garbage at the end
-CREATE ROLE regress_passwd7 PASSWORD 'md5012345678901234567890123456789zz';
--- invalid length
-CREATE ROLE regress_passwd8 PASSWORD 'md501234567890123456789012345678901zz';
+-- Neon does not support encrypted passwords, use unencrypted instead
+CREATE ROLE regress_passwd6 PASSWORD NEON_PASSWORD_PLACEHOLDER;
+CREATE ROLE regress_passwd7 PASSWORD NEON_PASSWORD_PLACEHOLDER;
+CREATE ROLE regress_passwd8 PASSWORD NEON_PASSWORD_PLACEHOLDER;
-- Changing the SCRAM iteration count
SET scram_iterations = 1024;
@@ -80,13 +71,10 @@ ALTER ROLE regress_passwd_empty PASSWORD 'md585939a5ce845f1a1b620742e3c659e0a';
ALTER ROLE regress_passwd_empty PASSWORD 'SCRAM-SHA-256$4096:hpFyHTUsSWcR7O9P$LgZFIt6Oqdo27ZFKbZ2nV+vtnYM995pDh9ca6WSi120=:qVV5NeluNfUPkwm7Vqat25RjSPLkGeoZBQs6wVv+um4=';
SELECT rolpassword FROM pg_authid WHERE rolname='regress_passwd_empty';
--- Test with invalid stored and server keys.
---
--- The first is valid, to act as a control. The others have too long
--- stored/server keys. They will be re-hashed.
-CREATE ROLE regress_passwd_sha_len0 PASSWORD 'SCRAM-SHA-256$4096:A6xHKoH/494E941doaPOYg==$Ky+A30sewHIH3VHQLRN9vYsuzlgNyGNKCh37dy96Rqw=:COPdlNiIkrsacU5QoxydEuOH6e/KfiipeETb/bPw8ZI=';
-CREATE ROLE regress_passwd_sha_len1 PASSWORD 'SCRAM-SHA-256$4096:A6xHKoH/494E941doaPOYg==$Ky+A30sewHIH3VHQLRN9vYsuzlgNyGNKCh37dy96RqwAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA=:COPdlNiIkrsacU5QoxydEuOH6e/KfiipeETb/bPw8ZI=';
-CREATE ROLE regress_passwd_sha_len2 PASSWORD 'SCRAM-SHA-256$4096:A6xHKoH/494E941doaPOYg==$Ky+A30sewHIH3VHQLRN9vYsuzlgNyGNKCh37dy96Rqw=:COPdlNiIkrsacU5QoxydEuOH6e/KfiipeETb/bPw8ZIAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA=';
+-- Neon does not support encrypted passwords, use unencrypted instead
+CREATE ROLE regress_passwd_sha_len0 PASSWORD NEON_PASSWORD_PLACEHOLDER;
+CREATE ROLE regress_passwd_sha_len1 PASSWORD NEON_PASSWORD_PLACEHOLDER;
+CREATE ROLE regress_passwd_sha_len2 PASSWORD NEON_PASSWORD_PLACEHOLDER;
-- Check that the invalid secrets were re-hashed. A re-hashed secret
-- should not contain the original salt.
diff --git a/src/test/regress/sql/privileges.sql b/src/test/regress/sql/privileges.sql
index 5880bc018d..27aa952b18 100644
--- a/src/test/regress/sql/privileges.sql

View File

@@ -34,12 +34,12 @@ use nix::unistd::Pid;
use tracing::{info, info_span, warn, Instrument};
use utils::fs_ext::is_directory_empty;
#[path = "fast_import/aws_s3_sync.rs"]
mod aws_s3_sync;
#[path = "fast_import/child_stdio_to_log.rs"]
mod child_stdio_to_log;
#[path = "fast_import/s3_uri.rs"]
mod s3_uri;
#[path = "fast_import/s5cmd.rs"]
mod s5cmd;
#[derive(clap::Parser)]
struct Args {
@@ -326,7 +326,7 @@ pub(crate) async fn main() -> anyhow::Result<()> {
}
info!("upload pgdata");
s5cmd::sync(Utf8Path::new(&pgdata_dir), &s3_prefix.append("/"))
aws_s3_sync::sync(Utf8Path::new(&pgdata_dir), &s3_prefix.append("/pgdata/"))
.await
.context("sync dump directory to destination")?;
@@ -334,10 +334,10 @@ pub(crate) async fn main() -> anyhow::Result<()> {
{
let status_dir = working_directory.join("status");
std::fs::create_dir(&status_dir).context("create status directory")?;
let status_file = status_dir.join("status");
let status_file = status_dir.join("pgdata");
std::fs::write(&status_file, serde_json::json!({"done": true}).to_string())
.context("write status file")?;
s5cmd::sync(&status_file, &s3_prefix.append("/status/pgdata"))
aws_s3_sync::sync(&status_dir, &s3_prefix.append("/status/"))
.await
.context("sync status directory to destination")?;
}

View File

@@ -4,24 +4,21 @@ use camino::Utf8Path;
use super::s3_uri::S3Uri;
pub(crate) async fn sync(local: &Utf8Path, remote: &S3Uri) -> anyhow::Result<()> {
let mut builder = tokio::process::Command::new("s5cmd");
// s5cmd uses aws-sdk-go v1, hence doesn't support AWS_ENDPOINT_URL
if let Some(val) = std::env::var_os("AWS_ENDPOINT_URL") {
builder.arg("--endpoint-url").arg(val);
}
let mut builder = tokio::process::Command::new("aws");
builder
.arg("s3")
.arg("sync")
.arg(local.as_str())
.arg(remote.to_string());
let st = builder
.spawn()
.context("spawn s5cmd")?
.context("spawn aws s3 sync")?
.wait()
.await
.context("wait for s5cmd")?;
.context("wait for aws s3 sync")?;
if st.success() {
Ok(())
} else {
Err(anyhow::anyhow!("s5cmd failed"))
Err(anyhow::anyhow!("aws s3 sync failed"))
}
}

View File

@@ -19,6 +19,7 @@ use control_plane::storage_controller::{
NeonStorageControllerStartArgs, NeonStorageControllerStopArgs, StorageController,
};
use control_plane::{broker, local_env};
use nix::fcntl::{flock, FlockArg};
use pageserver_api::config::{
DEFAULT_HTTP_LISTEN_PORT as DEFAULT_PAGESERVER_HTTP_PORT,
DEFAULT_PG_LISTEN_PORT as DEFAULT_PAGESERVER_PG_PORT,
@@ -36,6 +37,8 @@ use safekeeper_api::{
};
use std::borrow::Cow;
use std::collections::{BTreeSet, HashMap};
use std::fs::File;
use std::os::fd::AsRawFd;
use std::path::PathBuf;
use std::process::exit;
use std::str::FromStr;
@@ -689,6 +692,21 @@ struct TimelineTreeEl {
pub children: BTreeSet<TimelineId>,
}
/// A flock-based guard over the neon_local repository directory
struct RepoLock {
_file: File,
}
impl RepoLock {
fn new() -> Result<Self> {
let repo_dir = File::open(local_env::base_path())?;
let repo_dir_fd = repo_dir.as_raw_fd();
flock(repo_dir_fd, FlockArg::LockExclusive)?;
Ok(Self { _file: repo_dir })
}
}
// Main entry point for the 'neon_local' CLI utility
//
// This utility helps to manage neon installation. That includes following:
@@ -700,9 +718,14 @@ fn main() -> Result<()> {
let cli = Cli::parse();
// Check for 'neon init' command first.
let subcommand_result = if let NeonLocalCmd::Init(args) = cli.command {
handle_init(&args).map(|env| Some(Cow::Owned(env)))
let (subcommand_result, _lock) = if let NeonLocalCmd::Init(args) = cli.command {
(handle_init(&args).map(|env| Some(Cow::Owned(env))), None)
} else {
// This tool uses a collection of simple files to store its state, and consequently
// it is not generally safe to run multiple commands concurrently. Rather than expect
// all callers to know this, use a lock file to protect against concurrent execution.
let _repo_lock = RepoLock::new().unwrap();
// all other commands need an existing config
let env = LocalEnv::load_config(&local_env::base_path()).context("Error loading config")?;
let original_env = env.clone();
@@ -728,11 +751,12 @@ fn main() -> Result<()> {
NeonLocalCmd::Mappings(subcmd) => handle_mappings(&subcmd, env),
};
if &original_env != env {
let subcommand_result = if &original_env != env {
subcommand_result.map(|()| Some(Cow::Borrowed(env)))
} else {
subcommand_result.map(|()| None)
}
};
(subcommand_result, Some(_repo_lock))
};
match subcommand_result {
@@ -922,7 +946,7 @@ fn handle_init(args: &InitCmdArgs) -> anyhow::Result<LocalEnv> {
} else {
// User (likely interactive) did not provide a description of the environment, give them the default
NeonLocalInitConf {
control_plane_api: Some(Some(DEFAULT_PAGESERVER_CONTROL_PLANE_API.parse().unwrap())),
control_plane_api: Some(DEFAULT_PAGESERVER_CONTROL_PLANE_API.parse().unwrap()),
broker: NeonBroker {
listen_addr: DEFAULT_BROKER_ADDR.parse().unwrap(),
},
@@ -1718,18 +1742,15 @@ async fn handle_start_all_impl(
broker::start_broker_process(env, &retry_timeout).await
});
// Only start the storage controller if the pageserver is configured to need it
if env.control_plane_api.is_some() {
js.spawn(async move {
let storage_controller = StorageController::from_env(env);
storage_controller
.start(NeonStorageControllerStartArgs::with_default_instance_id(
retry_timeout,
))
.await
.map_err(|e| e.context("start storage_controller"))
});
}
js.spawn(async move {
let storage_controller = StorageController::from_env(env);
storage_controller
.start(NeonStorageControllerStartArgs::with_default_instance_id(
retry_timeout,
))
.await
.map_err(|e| e.context("start storage_controller"))
});
for ps_conf in &env.pageservers {
js.spawn(async move {
@@ -1774,10 +1795,6 @@ async fn neon_start_status_check(
const RETRY_INTERVAL: Duration = Duration::from_millis(100);
const NOTICE_AFTER_RETRIES: Duration = Duration::from_secs(5);
if env.control_plane_api.is_none() {
return Ok(());
}
let storcon = StorageController::from_env(env);
let retries = retry_timeout.as_millis() / RETRY_INTERVAL.as_millis();

View File

@@ -316,6 +316,10 @@ impl Endpoint {
// and can cause errors like 'no unpinned buffers available', see
// <https://github.com/neondatabase/neon/issues/9956>
conf.append("shared_buffers", "1MB");
// Postgres defaults to effective_io_concurrency=1, which does not exercise the pageserver's
// batching logic. Set this to 2 so that we exercise the code a bit without letting
// individual tests do a lot of concurrent work on underpowered test machines
conf.append("effective_io_concurrency", "2");
conf.append("fsync", "off");
conf.append("max_connections", "100");
conf.append("wal_level", "logical");

View File

@@ -76,7 +76,7 @@ pub struct LocalEnv {
// Control plane upcall API for pageserver: if None, we will not run storage_controller If set, this will
// be propagated into each pageserver's configuration.
pub control_plane_api: Option<Url>,
pub control_plane_api: Url,
// Control plane upcall API for storage controller. If set, this will be propagated into the
// storage controller's configuration.
@@ -133,7 +133,7 @@ pub struct NeonLocalInitConf {
pub storage_controller: Option<NeonStorageControllerConf>,
pub pageservers: Vec<NeonLocalInitPageserverConf>,
pub safekeepers: Vec<SafekeeperConf>,
pub control_plane_api: Option<Option<Url>>,
pub control_plane_api: Option<Url>,
pub control_plane_compute_hook_api: Option<Option<Url>>,
}
@@ -180,7 +180,7 @@ impl NeonStorageControllerConf {
const DEFAULT_MAX_WARMING_UP_INTERVAL: std::time::Duration = std::time::Duration::from_secs(30);
// Very tight heartbeat interval to speed up tests
const DEFAULT_HEARTBEAT_INTERVAL: std::time::Duration = std::time::Duration::from_millis(100);
const DEFAULT_HEARTBEAT_INTERVAL: std::time::Duration = std::time::Duration::from_millis(1000);
}
impl Default for NeonStorageControllerConf {
@@ -535,7 +535,7 @@ impl LocalEnv {
storage_controller,
pageservers,
safekeepers,
control_plane_api,
control_plane_api: control_plane_api.unwrap(),
control_plane_compute_hook_api,
branch_name_mappings,
}
@@ -638,7 +638,7 @@ impl LocalEnv {
storage_controller: self.storage_controller.clone(),
pageservers: vec![], // it's skip_serializing anyway
safekeepers: self.safekeepers.clone(),
control_plane_api: self.control_plane_api.clone(),
control_plane_api: Some(self.control_plane_api.clone()),
control_plane_compute_hook_api: self.control_plane_compute_hook_api.clone(),
branch_name_mappings: self.branch_name_mappings.clone(),
},
@@ -768,7 +768,7 @@ impl LocalEnv {
storage_controller: storage_controller.unwrap_or_default(),
pageservers: pageservers.iter().map(Into::into).collect(),
safekeepers,
control_plane_api: control_plane_api.unwrap_or_default(),
control_plane_api: control_plane_api.unwrap(),
control_plane_compute_hook_api: control_plane_compute_hook_api.unwrap_or_default(),
branch_name_mappings: Default::default(),
};

View File

@@ -95,21 +95,19 @@ impl PageServerNode {
let mut overrides = vec![pg_distrib_dir_param, broker_endpoint_param];
if let Some(control_plane_api) = &self.env.control_plane_api {
overrides.push(format!(
"control_plane_api='{}'",
control_plane_api.as_str()
));
overrides.push(format!(
"control_plane_api='{}'",
self.env.control_plane_api.as_str()
));
// Storage controller uses the same auth as pageserver: if JWT is enabled
// for us, we will also need it to talk to them.
if matches!(conf.http_auth_type, AuthType::NeonJWT) {
let jwt_token = self
.env
.generate_auth_token(&Claims::new(None, Scope::GenerationsApi))
.unwrap();
overrides.push(format!("control_plane_api_token='{}'", jwt_token));
}
// Storage controller uses the same auth as pageserver: if JWT is enabled
// for us, we will also need it to talk to them.
if matches!(conf.http_auth_type, AuthType::NeonJWT) {
let jwt_token = self
.env
.generate_auth_token(&Claims::new(None, Scope::GenerationsApi))
.unwrap();
overrides.push(format!("control_plane_api_token='{}'", jwt_token));
}
if !conf.other.contains_key("remote_storage") {

View File

@@ -338,7 +338,7 @@ impl StorageController {
.port(),
)
} else {
let listen_url = self.env.control_plane_api.clone().unwrap();
let listen_url = self.env.control_plane_api.clone();
let listen = format!(
"{}:{}",
@@ -708,7 +708,7 @@ impl StorageController {
} else {
// The configured URL has the /upcall path prefix for pageservers to use: we will strip that out
// for general purpose API access.
let listen_url = self.env.control_plane_api.clone().unwrap();
let listen_url = self.env.control_plane_api.clone();
Url::from_str(&format!(
"http://{}:{}/{path}",
listen_url.host_str().unwrap(),

View File

@@ -5,7 +5,8 @@ use clap::{Parser, Subcommand};
use pageserver_api::{
controller_api::{
AvailabilityZone, NodeAvailabilityWrapper, NodeDescribeResponse, NodeShardResponse,
ShardSchedulingPolicy, TenantCreateRequest, TenantDescribeResponse, TenantPolicyRequest,
SafekeeperDescribeResponse, ShardSchedulingPolicy, TenantCreateRequest,
TenantDescribeResponse, TenantPolicyRequest,
},
models::{
EvictionPolicy, EvictionPolicyLayerAccessThreshold, LocationConfigSecondary,
@@ -211,6 +212,8 @@ enum Command {
#[arg(long)]
timeout: humantime::Duration,
},
/// List safekeepers known to the storage controller
Safekeepers {},
}
#[derive(Parser)]
@@ -1020,6 +1023,31 @@ async fn main() -> anyhow::Result<()> {
"Fill was cancelled for node {node_id}. Schedulling policy is now {final_policy:?}"
);
}
Command::Safekeepers {} => {
let mut resp = storcon_client
.dispatch::<(), Vec<SafekeeperDescribeResponse>>(
Method::GET,
"control/v1/safekeeper".to_string(),
None,
)
.await?;
resp.sort_by(|a, b| a.id.cmp(&b.id));
let mut table = comfy_table::Table::new();
table.set_header(["Id", "Version", "Host", "Port", "Http Port", "AZ Id"]);
for sk in resp {
table.add_row([
format!("{}", sk.id),
format!("{}", sk.version),
sk.host,
format!("{}", sk.port),
format!("{}", sk.http_port),
sk.availability_zone_id.to_string(),
]);
}
println!("{table}");
}
}
Ok(())

View File

@@ -132,11 +132,6 @@
"name": "cron.database",
"value": "postgres",
"vartype": "string"
},
{
"name": "session_preload_libraries",
"value": "anon",
"vartype": "string"
}
]
},

View File

@@ -35,11 +35,11 @@ for pg_version in ${TEST_VERSION_ONLY-14 15 16 17}; do
echo "clean up containers if exists"
cleanup
PG_TEST_VERSION=$((pg_version < 16 ? 16 : pg_version))
# The support of pg_anon not yet added to PG17, so we have to remove the corresponding option
if [ $pg_version -eq 17 ]; then
# The support of pg_anon not yet added to PG17, so we have to add the corresponding option for other PG versions
if [ "${pg_version}" -ne 17 ]; then
SPEC_PATH="compute_wrapper/var/db/postgres/specs"
mv $SPEC_PATH/spec.json $SPEC_PATH/spec.bak
jq 'del(.cluster.settings[] | select (.name == "session_preload_libraries"))' $SPEC_PATH/spec.bak > $SPEC_PATH/spec.json
jq '.cluster.settings += [{"name": "session_preload_libraries","value": "anon","vartype": "string"}]' "${SPEC_PATH}/spec.bak" > "${SPEC_PATH}/spec.json"
fi
PG_VERSION=$pg_version PG_TEST_VERSION=$PG_TEST_VERSION docker compose --profile test-extensions -f $COMPOSE_FILE up --build -d
@@ -106,8 +106,8 @@ for pg_version in ${TEST_VERSION_ONLY-14 15 16 17}; do
fi
fi
cleanup
# The support of pg_anon not yet added to PG17, so we have to remove the corresponding option
if [ $pg_version -eq 17 ]; then
mv $SPEC_PATH/spec.bak $SPEC_PATH/spec.json
# Restore the original spec.json
if [ "$pg_version" -ne 17 ]; then
mv "$SPEC_PATH/spec.bak" "$SPEC_PATH/spec.json"
fi
done

View File

@@ -91,7 +91,7 @@ impl Timing {
/// Return true if there is a ready event.
fn is_event_ready(&self, queue: &mut BinaryHeap<Pending>) -> bool {
queue.peek().map_or(false, |x| x.time <= self.now())
queue.peek().is_some_and(|x| x.time <= self.now())
}
/// Clear all pending events.

View File

@@ -372,6 +372,23 @@ pub struct MetadataHealthListOutdatedResponse {
pub health_records: Vec<MetadataHealthRecord>,
}
/// Publicly exposed safekeeper description
///
/// The `active` flag which we have in the DB is not included on purpose: it is deprecated.
#[derive(Serialize, Deserialize, Clone)]
pub struct SafekeeperDescribeResponse {
pub id: NodeId,
pub region_id: String,
/// 1 is special, it means just created (not currently posted to storcon).
/// Zero or negative is not really expected.
/// Otherwise the number from `release-$(number_of_commits_on_branch)` tag.
pub version: i64,
pub host: String,
pub port: i32,
pub http_port: i32,
pub availability_zone_id: String,
}
#[cfg(test)]
mod test {
use super::*;

View File

@@ -565,6 +565,10 @@ impl Key {
&& self.field5 == 0
&& self.field6 == u32::MAX
}
pub fn is_slru_dir_key(&self) -> bool {
slru_dir_kind(self).is_some()
}
}
#[inline(always)]

View File

@@ -6,6 +6,7 @@ pub mod utilization;
use camino::Utf8PathBuf;
pub use utilization::PageserverUtilization;
use core::ops::Range;
use std::{
collections::HashMap,
fmt::Display,
@@ -28,6 +29,7 @@ use utils::{
};
use crate::{
key::Key,
reltag::RelTag,
shard::{ShardCount, ShardStripeSize, TenantShardId},
};
@@ -210,6 +212,68 @@ pub enum TimelineState {
Broken { reason: String, backtrace: String },
}
#[serde_with::serde_as]
#[derive(Debug, Clone, serde::Deserialize, serde::Serialize)]
pub struct CompactLsnRange {
pub start: Lsn,
pub end: Lsn,
}
#[serde_with::serde_as]
#[derive(Debug, Clone, serde::Deserialize, serde::Serialize)]
pub struct CompactKeyRange {
#[serde_as(as = "serde_with::DisplayFromStr")]
pub start: Key,
#[serde_as(as = "serde_with::DisplayFromStr")]
pub end: Key,
}
impl From<Range<Lsn>> for CompactLsnRange {
fn from(range: Range<Lsn>) -> Self {
Self {
start: range.start,
end: range.end,
}
}
}
impl From<Range<Key>> for CompactKeyRange {
fn from(range: Range<Key>) -> Self {
Self {
start: range.start,
end: range.end,
}
}
}
impl From<CompactLsnRange> for Range<Lsn> {
fn from(range: CompactLsnRange) -> Self {
range.start..range.end
}
}
impl From<CompactKeyRange> for Range<Key> {
fn from(range: CompactKeyRange) -> Self {
range.start..range.end
}
}
impl CompactLsnRange {
pub fn above(lsn: Lsn) -> Self {
Self {
start: lsn,
end: Lsn::MAX,
}
}
}
#[derive(Debug, Clone, Serialize)]
pub struct CompactInfoResponse {
pub compact_key_range: Option<CompactKeyRange>,
pub compact_lsn_range: Option<CompactLsnRange>,
pub sub_compaction: bool,
}
#[derive(Serialize, Deserialize, Clone)]
pub struct TimelineCreateRequest {
pub new_timeline_id: TimelineId,

View File

@@ -173,7 +173,11 @@ impl ShardIdentity {
/// Return true if the key should be stored on all shards, not just one.
pub fn is_key_global(&self, key: &Key) -> bool {
if key.is_slru_block_key() || key.is_slru_segment_size_key() || key.is_aux_file_key() {
if key.is_slru_block_key()
|| key.is_slru_segment_size_key()
|| key.is_aux_file_key()
|| key.is_slru_dir_key()
{
// Special keys that are only stored on shard 0
false
} else if key.is_rel_block_key() {

View File

@@ -9,9 +9,11 @@ regex.workspace = true
bytes.workspace = true
anyhow.workspace = true
crc32c.workspace = true
criterion.workspace = true
once_cell.workspace = true
log.workspace = true
memoffset.workspace = true
pprof.workspace = true
thiserror.workspace = true
serde.workspace = true
utils.workspace = true
@@ -24,3 +26,7 @@ postgres.workspace = true
[build-dependencies]
anyhow.workspace = true
bindgen.workspace = true
[[bench]]
name = "waldecoder"
harness = false

View File

@@ -0,0 +1,26 @@
## Benchmarks
To run benchmarks:
```sh
# All benchmarks.
cargo bench --package postgres_ffi
# Specific file.
cargo bench --package postgres_ffi --bench waldecoder
# Specific benchmark.
cargo bench --package postgres_ffi --bench waldecoder complete_record/size=1024
# List available benchmarks.
cargo bench --package postgres_ffi --benches -- --list
# Generate flamegraph profiles using pprof-rs, profiling for 10 seconds.
# Output in target/criterion/*/profile/flamegraph.svg.
cargo bench --package postgres_ffi --bench waldecoder complete_record/size=1024 -- --profile-time 10
```
Additional charts and statistics are available in `target/criterion/report/index.html`.
Benchmarks are automatically compared against the previous run. To compare against other runs, see
`--baseline` and `--save-baseline`.

View File

@@ -0,0 +1,49 @@
use std::ffi::CStr;
use criterion::{criterion_group, criterion_main, Bencher, Criterion};
use postgres_ffi::v17::wal_generator::LogicalMessageGenerator;
use postgres_ffi::v17::waldecoder_handler::WalStreamDecoderHandler;
use postgres_ffi::waldecoder::WalStreamDecoder;
use pprof::criterion::{Output, PProfProfiler};
use utils::lsn::Lsn;
const KB: usize = 1024;
// Register benchmarks with Criterion.
criterion_group!(
name = benches;
config = Criterion::default().with_profiler(PProfProfiler::new(100, Output::Flamegraph(None)));
targets = bench_complete_record,
);
criterion_main!(benches);
/// Benchmarks WalStreamDecoder::complete_record() for a logical message of varying size.
fn bench_complete_record(c: &mut Criterion) {
let mut g = c.benchmark_group("complete_record");
for size in [64, KB, 8 * KB, 128 * KB] {
// Kind of weird to change the group throughput per benchmark, but it's the only way
// to vary it per benchmark. It works.
g.throughput(criterion::Throughput::Bytes(size as u64));
g.bench_function(format!("size={size}"), |b| run_bench(b, size).unwrap());
}
fn run_bench(b: &mut Bencher, size: usize) -> anyhow::Result<()> {
const PREFIX: &CStr = c"";
let value_size = LogicalMessageGenerator::make_value_size(size, PREFIX);
let value = vec![1; value_size];
let mut decoder = WalStreamDecoder::new(Lsn(0), 170000);
let msg = LogicalMessageGenerator::new(PREFIX, &value)
.next()
.unwrap()
.encode(Lsn(0));
assert_eq!(msg.len(), size);
b.iter(|| {
let msg = msg.clone(); // Bytes::clone() is cheap
decoder.complete_record(msg).unwrap();
});
Ok(())
}
}

View File

@@ -106,11 +106,11 @@ impl<R: RecordGenerator> WalGenerator<R> {
const TIMELINE_ID: u32 = 1;
/// Creates a new WAL generator with the given record generator.
pub fn new(record_generator: R) -> WalGenerator<R> {
pub fn new(record_generator: R, start_lsn: Lsn) -> WalGenerator<R> {
Self {
record_generator,
lsn: Lsn(0),
prev_lsn: Lsn(0),
lsn: start_lsn,
prev_lsn: start_lsn,
}
}
@@ -231,6 +231,22 @@ impl LogicalMessageGenerator {
};
[&header.encode(), prefix, message].concat().into()
}
/// Computes how large a value must be to get a record of the given size. Convenience method to
/// construct records of pre-determined size. Panics if the record size is too small.
pub fn make_value_size(record_size: usize, prefix: &CStr) -> usize {
let xlog_header_size = XLOG_SIZE_OF_XLOG_RECORD;
let lm_header_size = size_of::<XlLogicalMessage>();
let prefix_size = prefix.to_bytes_with_nul().len();
let data_header_size = match record_size - xlog_header_size - 2 {
0..=255 => 2,
256..=258 => panic!("impossible record_size {record_size}"),
259.. => 5,
};
record_size
.checked_sub(xlog_header_size + lm_header_size + prefix_size + data_header_size)
.expect("record_size too small")
}
}
impl Iterator for LogicalMessageGenerator {

View File

@@ -81,7 +81,7 @@ fn test_end_of_wal<C: crate::Crafter>(test_name: &str) {
continue;
}
let mut f = File::options().write(true).open(file.path()).unwrap();
const ZEROS: [u8; WAL_SEGMENT_SIZE] = [0u8; WAL_SEGMENT_SIZE];
static ZEROS: [u8; WAL_SEGMENT_SIZE] = [0u8; WAL_SEGMENT_SIZE];
f.write_all(
&ZEROS[0..min(
WAL_SEGMENT_SIZE,

View File

@@ -1,7 +1,7 @@
[package]
name = "postgres-protocol2"
version = "0.1.0"
edition = "2018"
edition = "2021"
license = "MIT/Apache-2.0"
[dependencies]

View File

@@ -9,8 +9,7 @@
//!
//! This library assumes that the `client_encoding` backend parameter has been
//! set to `UTF8`. It will most likely not behave properly if that is not the case.
#![doc(html_root_url = "https://docs.rs/postgres-protocol/0.6")]
#![warn(missing_docs, rust_2018_idioms, clippy::all)]
#![warn(missing_docs, clippy::all)]
use byteorder::{BigEndian, ByteOrder};
use bytes::{BufMut, BytesMut};

View File

@@ -3,7 +3,6 @@
use byteorder::{BigEndian, ByteOrder};
use bytes::{Buf, BufMut, BytesMut};
use std::convert::TryFrom;
use std::error::Error;
use std::io;
use std::marker;

View File

@@ -1,7 +1,7 @@
[package]
name = "postgres-types2"
version = "0.1.0"
edition = "2018"
edition = "2021"
license = "MIT/Apache-2.0"
[dependencies]

View File

@@ -2,8 +2,7 @@
//!
//! This crate is used by the `tokio-postgres` and `postgres` crates. You normally don't need to depend directly on it
//! unless you want to define your own `ToSql` or `FromSql` definitions.
#![doc(html_root_url = "https://docs.rs/postgres-types/0.2")]
#![warn(clippy::all, rust_2018_idioms, missing_docs)]
#![warn(clippy::all, missing_docs)]
use fallible_iterator::FallibleIterator;
use postgres_protocol2::types;

View File

@@ -1,7 +1,7 @@
[package]
name = "tokio-postgres2"
version = "0.1.0"
edition = "2018"
edition = "2021"
license = "MIT/Apache-2.0"
[dependencies]

View File

@@ -1,5 +1,5 @@
//! An asynchronous, pipelined, PostgreSQL client.
#![warn(rust_2018_idioms, clippy::all)]
#![warn(clippy::all)]
pub use crate::cancel_token::CancelToken;
pub use crate::client::{Client, SocketConfig};

View File

@@ -11,7 +11,7 @@ mod private {
Query(&'a str),
}
impl<'a> ToStatementType<'a> {
impl ToStatementType<'_> {
pub async fn into_statement(self, client: &Client) -> Result<Statement, Error> {
match self {
ToStatementType::Statement(s) => Ok(s.clone()),

View File

@@ -18,6 +18,7 @@ camino = { workspace = true, features = ["serde1"] }
humantime-serde.workspace = true
hyper = { workspace = true, features = ["client"] }
futures.workspace = true
reqwest.workspace = true
serde.workspace = true
serde_json.workspace = true
tokio = { workspace = true, features = ["sync", "fs", "io-util"] }

View File

@@ -8,6 +8,7 @@ use std::io;
use std::num::NonZeroU32;
use std::pin::Pin;
use std::str::FromStr;
use std::sync::Arc;
use std::time::Duration;
use std::time::SystemTime;
@@ -15,6 +16,8 @@ use super::REMOTE_STORAGE_PREFIX_SEPARATOR;
use anyhow::Context;
use anyhow::Result;
use azure_core::request_options::{IfMatchCondition, MaxResults, Metadata, Range};
use azure_core::HttpClient;
use azure_core::TransportOptions;
use azure_core::{Continuable, RetryOptions};
use azure_storage::StorageCredentials;
use azure_storage_blobs::blob::CopyStatus;
@@ -80,8 +83,13 @@ impl AzureBlobStorage {
StorageCredentials::token_credential(token_credential)
};
// we have an outer retry
let builder = ClientBuilder::new(account, credentials).retry(RetryOptions::none());
let builder = ClientBuilder::new(account, credentials)
// we have an outer retry
.retry(RetryOptions::none())
// Customize transport to configure conneciton pooling
.transport(TransportOptions::new(Self::reqwest_client(
azure_config.conn_pool_size,
)));
let client = builder.container_client(azure_config.container_name.to_owned());
@@ -106,6 +114,14 @@ impl AzureBlobStorage {
})
}
fn reqwest_client(conn_pool_size: usize) -> Arc<dyn HttpClient> {
let client = reqwest::ClientBuilder::new()
.pool_max_idle_per_host(conn_pool_size)
.build()
.expect("failed to build `reqwest` client");
Arc::new(client)
}
pub fn relative_path_to_name(&self, path: &RemotePath) -> String {
assert_eq!(std::path::MAIN_SEPARATOR, REMOTE_STORAGE_PREFIX_SEPARATOR);
let path_string = path.get_path().as_str();
@@ -544,9 +560,9 @@ impl RemoteStorage for AzureBlobStorage {
.await
}
async fn delete_objects<'a>(
async fn delete_objects(
&self,
paths: &'a [RemotePath],
paths: &[RemotePath],
cancel: &CancellationToken,
) -> anyhow::Result<()> {
let kind = RequestKind::Delete;

View File

@@ -114,6 +114,16 @@ fn default_max_keys_per_list_response() -> Option<i32> {
DEFAULT_MAX_KEYS_PER_LIST_RESPONSE
}
fn default_azure_conn_pool_size() -> usize {
// Conservative default: no connection pooling. At time of writing this is the Azure
// SDK's default as well, due to historic reports of hard-to-reproduce issues
// (https://github.com/hyperium/hyper/issues/2312)
//
// However, using connection pooling is important to avoid exhausting client ports when
// doing huge numbers of requests (https://github.com/neondatabase/cloud/issues/20971)
0
}
impl Debug for S3Config {
fn fmt(&self, f: &mut std::fmt::Formatter<'_>) -> std::fmt::Result {
f.debug_struct("S3Config")
@@ -146,6 +156,8 @@ pub struct AzureConfig {
pub concurrency_limit: NonZeroUsize,
#[serde(default = "default_max_keys_per_list_response")]
pub max_keys_per_list_response: Option<i32>,
#[serde(default = "default_azure_conn_pool_size")]
pub conn_pool_size: usize,
}
fn default_remote_storage_azure_concurrency_limit() -> NonZeroUsize {
@@ -302,6 +314,7 @@ timeout = '5s'";
container_region = 'westeurope'
upload_storage_class = 'INTELLIGENT_TIERING'
timeout = '7s'
conn_pool_size = 8
";
let config = parse(toml).unwrap();
@@ -316,6 +329,7 @@ timeout = '5s'";
prefix_in_container: None,
concurrency_limit: default_remote_storage_azure_concurrency_limit(),
max_keys_per_list_response: DEFAULT_MAX_KEYS_PER_LIST_RESPONSE,
conn_pool_size: 8,
}),
timeout: Duration::from_secs(7),
small_timeout: RemoteStorageConfig::DEFAULT_SMALL_TIMEOUT

View File

@@ -341,9 +341,9 @@ pub trait RemoteStorage: Send + Sync + 'static {
/// If the operation fails because of timeout or cancellation, the root cause of the error will be
/// set to `TimeoutOrCancel`. In such situation it is unknown which deletions, if any, went
/// through.
async fn delete_objects<'a>(
async fn delete_objects(
&self,
paths: &'a [RemotePath],
paths: &[RemotePath],
cancel: &CancellationToken,
) -> anyhow::Result<()>;

View File

@@ -562,9 +562,9 @@ impl RemoteStorage for LocalFs {
}
}
async fn delete_objects<'a>(
async fn delete_objects(
&self,
paths: &'a [RemotePath],
paths: &[RemotePath],
cancel: &CancellationToken,
) -> anyhow::Result<()> {
for path in paths {

View File

@@ -813,9 +813,9 @@ impl RemoteStorage for S3Bucket {
.await
}
async fn delete_objects<'a>(
async fn delete_objects(
&self,
paths: &'a [RemotePath],
paths: &[RemotePath],
cancel: &CancellationToken,
) -> anyhow::Result<()> {
let kind = RequestKind::Delete;

View File

@@ -181,9 +181,9 @@ impl RemoteStorage for UnreliableWrapper {
self.delete_inner(path, true, cancel).await
}
async fn delete_objects<'a>(
async fn delete_objects(
&self,
paths: &'a [RemotePath],
paths: &[RemotePath],
cancel: &CancellationToken,
) -> anyhow::Result<()> {
self.attempt(RemoteOp::DeleteObjects(paths.to_vec()))?;

View File

@@ -218,6 +218,7 @@ async fn create_azure_client(
prefix_in_container: Some(format!("test_{millis}_{random:08x}/")),
concurrency_limit: NonZeroUsize::new(100).unwrap(),
max_keys_per_list_response,
conn_pool_size: 8,
}),
timeout: RemoteStorageConfig::DEFAULT_TIMEOUT,
small_timeout: RemoteStorageConfig::DEFAULT_SMALL_TIMEOUT,

View File

@@ -5,6 +5,9 @@ edition.workspace = true
license.workspace = true
[dependencies]
serde.workspace = true
const_format.workspace = true
serde.workspace = true
postgres_ffi.workspace = true
pq_proto.workspace = true
tokio.workspace = true
utils.workspace = true

View File

@@ -1,10 +1,27 @@
#![deny(unsafe_code)]
#![deny(clippy::undocumented_unsafe_blocks)]
use const_format::formatcp;
use pq_proto::SystemId;
use serde::{Deserialize, Serialize};
/// Public API types
pub mod models;
/// Consensus logical timestamp. Note: it is a part of sk control file.
pub type Term = u64;
pub const INVALID_TERM: Term = 0;
/// Information about Postgres. Safekeeper gets it once and then verifies all
/// further connections from computes match. Note: it is a part of sk control
/// file.
#[derive(Debug, Clone, PartialEq, Eq, Serialize, Deserialize)]
pub struct ServerInfo {
/// Postgres server version
pub pg_version: u32,
pub system_id: SystemId,
pub wal_seg_size: u32,
}
pub const DEFAULT_PG_LISTEN_PORT: u16 = 5454;
pub const DEFAULT_PG_LISTEN_ADDR: &str = formatcp!("127.0.0.1:{DEFAULT_PG_LISTEN_PORT}");

View File

@@ -1,10 +1,23 @@
//! Types used in safekeeper http API. Many of them are also reused internally.
use postgres_ffi::TimestampTz;
use serde::{Deserialize, Serialize};
use std::net::SocketAddr;
use tokio::time::Instant;
use utils::{
id::{NodeId, TenantId, TimelineId},
id::{NodeId, TenantId, TenantTimelineId, TimelineId},
lsn::Lsn,
pageserver_feedback::PageserverFeedback,
};
use crate::{ServerInfo, Term};
#[derive(Debug, Serialize)]
pub struct SafekeeperStatus {
pub id: NodeId,
}
#[derive(Serialize, Deserialize)]
pub struct TimelineCreateRequest {
pub tenant_id: TenantId,
@@ -18,6 +31,161 @@ pub struct TimelineCreateRequest {
pub local_start_lsn: Option<Lsn>,
}
/// Same as TermLsn, but serializes LSN using display serializer
/// in Postgres format, i.e. 0/FFFFFFFF. Used only for the API response.
#[derive(Debug, Clone, Copy, Serialize, Deserialize)]
pub struct TermSwitchApiEntry {
pub term: Term,
pub lsn: Lsn,
}
/// Augment AcceptorState with last_log_term for convenience
#[derive(Debug, Serialize, Deserialize)]
pub struct AcceptorStateStatus {
pub term: Term,
pub epoch: Term, // aka last_log_term, old `epoch` name is left for compatibility
pub term_history: Vec<TermSwitchApiEntry>,
}
/// Things safekeeper should know about timeline state on peers.
/// Used as both model and internally.
#[derive(Debug, Clone, Serialize, Deserialize)]
pub struct PeerInfo {
pub sk_id: NodeId,
pub term: Term,
/// Term of the last entry.
pub last_log_term: Term,
/// LSN of the last record.
pub flush_lsn: Lsn,
pub commit_lsn: Lsn,
/// Since which LSN safekeeper has WAL.
pub local_start_lsn: Lsn,
/// When info was received. Serde annotations are not very useful but make
/// the code compile -- we don't rely on this field externally.
#[serde(skip)]
#[serde(default = "Instant::now")]
pub ts: Instant,
pub pg_connstr: String,
pub http_connstr: String,
}
pub type FullTransactionId = u64;
/// Hot standby feedback received from replica
#[derive(Debug, Clone, Copy, Serialize, Deserialize)]
pub struct HotStandbyFeedback {
pub ts: TimestampTz,
pub xmin: FullTransactionId,
pub catalog_xmin: FullTransactionId,
}
pub const INVALID_FULL_TRANSACTION_ID: FullTransactionId = 0;
impl HotStandbyFeedback {
pub fn empty() -> HotStandbyFeedback {
HotStandbyFeedback {
ts: 0,
xmin: 0,
catalog_xmin: 0,
}
}
}
/// Standby status update
#[derive(Debug, Clone, Copy, Serialize, Deserialize)]
pub struct StandbyReply {
pub write_lsn: Lsn, // The location of the last WAL byte + 1 received and written to disk in the standby.
pub flush_lsn: Lsn, // The location of the last WAL byte + 1 flushed to disk in the standby.
pub apply_lsn: Lsn, // The location of the last WAL byte + 1 applied in the standby.
pub reply_ts: TimestampTz, // The client's system clock at the time of transmission, as microseconds since midnight on 2000-01-01.
pub reply_requested: bool,
}
impl StandbyReply {
pub fn empty() -> Self {
StandbyReply {
write_lsn: Lsn::INVALID,
flush_lsn: Lsn::INVALID,
apply_lsn: Lsn::INVALID,
reply_ts: 0,
reply_requested: false,
}
}
}
#[derive(Debug, Clone, Copy, Serialize, Deserialize)]
pub struct StandbyFeedback {
pub reply: StandbyReply,
pub hs_feedback: HotStandbyFeedback,
}
impl StandbyFeedback {
pub fn empty() -> Self {
StandbyFeedback {
reply: StandbyReply::empty(),
hs_feedback: HotStandbyFeedback::empty(),
}
}
}
/// Receiver is either pageserver or regular standby, which have different
/// feedbacks.
/// Used as both model and internally.
#[derive(Debug, Clone, Copy, Serialize, Deserialize)]
pub enum ReplicationFeedback {
Pageserver(PageserverFeedback),
Standby(StandbyFeedback),
}
/// Uniquely identifies a WAL service connection. Logged in spans for
/// observability.
pub type ConnectionId = u32;
/// Serialize is used only for json'ing in API response. Also used internally.
#[derive(Debug, Clone, Serialize, Deserialize)]
pub struct WalSenderState {
pub ttid: TenantTimelineId,
pub addr: SocketAddr,
pub conn_id: ConnectionId,
// postgres application_name
pub appname: Option<String>,
pub feedback: ReplicationFeedback,
}
#[derive(Debug, Clone, Serialize, Deserialize)]
pub struct WalReceiverState {
/// None means it is recovery initiated by us (this safekeeper).
pub conn_id: Option<ConnectionId>,
pub status: WalReceiverStatus,
}
/// Walreceiver status. Currently only whether it passed voting stage and
/// started receiving the stream, but it is easy to add more if needed.
#[derive(Debug, Clone, Serialize, Deserialize)]
pub enum WalReceiverStatus {
Voting,
Streaming,
}
/// Info about timeline on safekeeper ready for reporting.
#[derive(Debug, Serialize, Deserialize)]
pub struct TimelineStatus {
pub tenant_id: TenantId,
pub timeline_id: TimelineId,
pub acceptor_state: AcceptorStateStatus,
pub pg_info: ServerInfo,
pub flush_lsn: Lsn,
pub timeline_start_lsn: Lsn,
pub local_start_lsn: Lsn,
pub commit_lsn: Lsn,
pub backup_lsn: Lsn,
pub peer_horizon_lsn: Lsn,
pub remote_consistent_lsn: Lsn,
pub peers: Vec<PeerInfo>,
pub walsenders: Vec<WalSenderState>,
pub walreceivers: Vec<WalReceiverState>,
}
fn lsn_invalid() -> Lsn {
Lsn::INVALID
}

View File

@@ -15,17 +15,20 @@ arc-swap.workspace = true
sentry.workspace = true
async-compression.workspace = true
anyhow.workspace = true
backtrace.workspace = true
bincode.workspace = true
bytes.workspace = true
camino.workspace = true
chrono.workspace = true
diatomic-waker.workspace = true
flate2.workspace = true
git-version.workspace = true
hex = { workspace = true, features = ["serde"] }
humantime.workspace = true
hyper0 = { workspace = true, features = ["full"] }
itertools.workspace = true
fail.workspace = true
futures = { workspace = true}
futures = { workspace = true }
jemalloc_pprof.workspace = true
jsonwebtoken.workspace = true
nix.workspace = true

View File

@@ -1,15 +1,22 @@
use crate::auth::{AuthError, Claims, SwappableJwtAuth};
use crate::http::error::{api_error_handler, route_error_handler, ApiError};
use crate::http::request::{get_query_param, parse_query_param};
use crate::pprof;
use ::pprof::protos::Message as _;
use ::pprof::ProfilerGuardBuilder;
use anyhow::{anyhow, Context};
use bytes::{Bytes, BytesMut};
use hyper::header::{HeaderName, AUTHORIZATION, CONTENT_DISPOSITION};
use hyper::http::HeaderValue;
use hyper::Method;
use hyper::{header::CONTENT_TYPE, Body, Request, Response};
use metrics::{register_int_counter, Encoder, IntCounter, TextEncoder};
use once_cell::sync::Lazy;
use regex::Regex;
use routerify::ext::RequestExt;
use routerify::{Middleware, RequestInfo, Router, RouterBuilder};
use tokio::sync::{mpsc, Mutex};
use tokio_stream::wrappers::ReceiverStream;
use tokio_util::io::ReaderStream;
use tracing::{debug, info, info_span, warn, Instrument};
@@ -18,11 +25,6 @@ use std::io::Write as _;
use std::str::FromStr;
use std::time::Duration;
use bytes::{Bytes, BytesMut};
use pprof::protos::Message as _;
use tokio::sync::{mpsc, Mutex};
use tokio_stream::wrappers::ReceiverStream;
static SERVE_METRICS_COUNT: Lazy<IntCounter> = Lazy::new(|| {
register_int_counter!(
"libmetrics_metric_handler_requests_total",
@@ -365,7 +367,7 @@ pub async fn profile_cpu_handler(req: Request<Body>) -> Result<Response<Body>, A
// Take the profile.
let report = tokio::task::spawn_blocking(move || {
let guard = pprof::ProfilerGuardBuilder::default()
let guard = ProfilerGuardBuilder::default()
.frequency(frequency_hz)
.blocklist(&["libc", "libgcc", "pthread", "vdso"])
.build()?;
@@ -457,10 +459,34 @@ pub async fn profile_heap_handler(req: Request<Body>) -> Result<Response<Body>,
}
Format::Pprof => {
let data = tokio::task::spawn_blocking(move || prof_ctl.dump_pprof())
.await
.map_err(|join_err| ApiError::InternalServerError(join_err.into()))?
.map_err(ApiError::InternalServerError)?;
let data = tokio::task::spawn_blocking(move || {
let bytes = prof_ctl.dump_pprof()?;
// Symbolize the profile.
// TODO: consider moving this upstream to jemalloc_pprof and avoiding the
// serialization roundtrip.
static STRIP_FUNCTIONS: Lazy<Vec<(Regex, bool)>> = Lazy::new(|| {
// Functions to strip from profiles. If true, also remove child frames.
vec![
(Regex::new("^__rust").unwrap(), false),
(Regex::new("^_start$").unwrap(), false),
(Regex::new("^irallocx_prof").unwrap(), true),
(Regex::new("^prof_alloc_prep").unwrap(), true),
(Regex::new("^std::rt::lang_start").unwrap(), false),
(Regex::new("^std::sys::backtrace::__rust").unwrap(), false),
]
});
let profile = pprof::decode(&bytes)?;
let profile = pprof::symbolize(profile)?;
let profile = pprof::strip_locations(
profile,
&["libc", "libgcc", "pthread", "vdso"],
&STRIP_FUNCTIONS,
);
pprof::encode(&profile)
})
.await
.map_err(|join_err| ApiError::InternalServerError(join_err.into()))?
.map_err(ApiError::InternalServerError)?;
Response::builder()
.status(200)
.header(CONTENT_TYPE, "application/octet-stream")

View File

@@ -96,6 +96,8 @@ pub mod circuit_breaker;
pub mod try_rcu;
pub mod pprof;
// Re-export used in macro. Avoids adding git-version as dep in target crates.
#[doc(hidden)]
pub use git_version;

190
libs/utils/src/pprof.rs Normal file
View File

@@ -0,0 +1,190 @@
use flate2::write::{GzDecoder, GzEncoder};
use flate2::Compression;
use itertools::Itertools as _;
use once_cell::sync::Lazy;
use pprof::protos::{Function, Line, Message as _, Profile};
use regex::Regex;
use std::borrow::Cow;
use std::collections::{HashMap, HashSet};
use std::ffi::c_void;
use std::io::Write as _;
/// Decodes a gzip-compressed Protobuf-encoded pprof profile.
pub fn decode(bytes: &[u8]) -> anyhow::Result<Profile> {
let mut gz = GzDecoder::new(Vec::new());
gz.write_all(bytes)?;
Ok(Profile::parse_from_bytes(&gz.finish()?)?)
}
/// Encodes a pprof profile as gzip-compressed Protobuf.
pub fn encode(profile: &Profile) -> anyhow::Result<Vec<u8>> {
let mut gz = GzEncoder::new(Vec::new(), Compression::default());
profile.write_to_writer(&mut gz)?;
Ok(gz.finish()?)
}
/// Symbolizes a pprof profile using the current binary.
pub fn symbolize(mut profile: Profile) -> anyhow::Result<Profile> {
if !profile.function.is_empty() {
return Ok(profile); // already symbolized
}
// Collect function names.
let mut functions: HashMap<String, Function> = HashMap::new();
let mut strings: HashMap<String, i64> = profile
.string_table
.into_iter()
.enumerate()
.map(|(i, s)| (s, i as i64))
.collect();
// Helper to look up or register a string.
let mut string_id = |s: &str| -> i64 {
// Don't use .entry() to avoid unnecessary allocations.
if let Some(id) = strings.get(s) {
return *id;
}
let id = strings.len() as i64;
strings.insert(s.to_string(), id);
id
};
for loc in &mut profile.location {
if !loc.line.is_empty() {
continue;
}
// Resolve the line and function for each location.
backtrace::resolve(loc.address as *mut c_void, |symbol| {
let Some(symname) = symbol.name() else {
return;
};
let mut name = symname.to_string();
// Strip the Rust monomorphization suffix from the symbol name.
static SUFFIX_REGEX: Lazy<Regex> =
Lazy::new(|| Regex::new("::h[0-9a-f]{16}$").expect("invalid regex"));
if let Some(m) = SUFFIX_REGEX.find(&name) {
name.truncate(m.start());
}
let function_id = match functions.get(&name) {
Some(function) => function.id,
None => {
let id = functions.len() as u64 + 1;
let system_name = String::from_utf8_lossy(symname.as_bytes());
let filename = symbol
.filename()
.map(|path| path.to_string_lossy())
.unwrap_or(Cow::Borrowed(""));
let function = Function {
id,
name: string_id(&name),
system_name: string_id(&system_name),
filename: string_id(&filename),
..Default::default()
};
functions.insert(name, function);
id
}
};
loc.line.push(Line {
function_id,
line: symbol.lineno().unwrap_or(0) as i64,
..Default::default()
});
});
}
// Store the resolved functions, and mark the mapping as resolved.
profile.function = functions.into_values().sorted_by_key(|f| f.id).collect();
profile.string_table = strings
.into_iter()
.sorted_by_key(|(_, i)| *i)
.map(|(s, _)| s)
.collect();
for mapping in &mut profile.mapping {
mapping.has_functions = true;
mapping.has_filenames = true;
}
Ok(profile)
}
/// Strips locations (stack frames) matching the given mappings (substring) or function names
/// (regex). The function bool specifies whether child frames should be stripped as well.
///
/// The string definitions are left behind in the profile for simplicity, to avoid rewriting all
/// string references.
pub fn strip_locations(
mut profile: Profile,
mappings: &[&str],
functions: &[(Regex, bool)],
) -> Profile {
// Strip mappings.
let mut strip_mappings: HashSet<u64> = HashSet::new();
profile.mapping.retain(|mapping| {
let Some(name) = profile.string_table.get(mapping.filename as usize) else {
return true;
};
if mappings.iter().any(|substr| name.contains(substr)) {
strip_mappings.insert(mapping.id);
return false;
}
true
});
// Strip functions.
let mut strip_functions: HashMap<u64, bool> = HashMap::new();
profile.function.retain(|function| {
let Some(name) = profile.string_table.get(function.name as usize) else {
return true;
};
for (regex, strip_children) in functions {
if regex.is_match(name) {
strip_functions.insert(function.id, *strip_children);
return false;
}
}
true
});
// Strip locations. The bool specifies whether child frames should be stripped too.
let mut strip_locations: HashMap<u64, bool> = HashMap::new();
profile.location.retain(|location| {
for line in &location.line {
if let Some(strip_children) = strip_functions.get(&line.function_id) {
strip_locations.insert(location.id, *strip_children);
return false;
}
}
if strip_mappings.contains(&location.mapping_id) {
strip_locations.insert(location.id, false);
return false;
}
true
});
// Strip sample locations.
for sample in &mut profile.sample {
// First, find the uppermost function with child removal and truncate the stack.
if let Some(truncate) = sample
.location_id
.iter()
.rposition(|id| strip_locations.get(id) == Some(&true))
{
sample.location_id.drain(..=truncate);
}
// Next, strip any individual frames without child removal.
sample
.location_id
.retain(|id| !strip_locations.contains_key(id));
}
profile
}

View File

@@ -94,6 +94,7 @@ procfs.workspace = true
[dev-dependencies]
criterion.workspace = true
hex-literal.workspace = true
serial_test.workspace = true
tokio = { workspace = true, features = ["process", "sync", "fs", "rt", "io-util", "time", "test-util"] }
indoc.workspace = true

View File

@@ -272,7 +272,7 @@ struct CompactionJob<E: CompactionJobExecutor> {
completed: bool,
}
impl<'a, E> LevelCompactionState<'a, E>
impl<E> LevelCompactionState<'_, E>
where
E: CompactionJobExecutor,
{

View File

@@ -224,9 +224,8 @@ impl<L> Level<L> {
}
// recalculate depth if this was the last event at this point
let more_events_at_this_key = events_iter
.peek()
.map_or(false, |next_e| next_e.key == e.key);
let more_events_at_this_key =
events_iter.peek().is_some_and(|next_e| next_e.key == e.key);
if !more_events_at_this_key {
let mut active_depth = 0;
for (_end_lsn, is_image, _idx) in active_set.iter().rev() {

View File

@@ -148,7 +148,7 @@ pub trait CompactionDeltaLayer<E: CompactionJobExecutor + ?Sized>: CompactionLay
Self: 'a;
/// Return all keys in this delta layer.
fn load_keys<'a>(
fn load_keys(
&self,
ctx: &E::RequestContext,
) -> impl Future<Output = anyhow::Result<Vec<Self::DeltaEntry<'_>>>> + Send;

View File

@@ -143,7 +143,7 @@ impl interface::CompactionLayer<Key> for Arc<MockDeltaLayer> {
impl interface::CompactionDeltaLayer<MockTimeline> for Arc<MockDeltaLayer> {
type DeltaEntry<'a> = MockRecord;
async fn load_keys<'a>(&self, _ctx: &MockRequestContext) -> anyhow::Result<Vec<MockRecord>> {
async fn load_keys(&self, _ctx: &MockRequestContext) -> anyhow::Result<Vec<MockRecord>> {
Ok(self.records.clone())
}
}

View File

@@ -248,7 +248,7 @@ where
}
}
impl<'a, W> Basebackup<'a, W>
impl<W> Basebackup<'_, W>
where
W: AsyncWrite + Send + Sync + Unpin,
{

View File

@@ -97,8 +97,8 @@ use crate::tenant::{LogicalSizeCalculationCause, PageReconstructError};
use crate::DEFAULT_PG_VERSION;
use crate::{disk_usage_eviction_task, tenant};
use pageserver_api::models::{
StatusResponse, TenantConfigRequest, TenantInfo, TimelineCreateRequest, TimelineGcRequest,
TimelineInfo,
CompactInfoResponse, StatusResponse, TenantConfigRequest, TenantInfo, TimelineCreateRequest,
TimelineGcRequest, TimelineInfo,
};
use utils::{
auth::SwappableJwtAuth,
@@ -2039,6 +2039,34 @@ async fn timeline_cancel_compact_handler(
.await
}
// Get compact info of a timeline
async fn timeline_compact_info_handler(
request: Request<Body>,
_cancel: CancellationToken,
) -> Result<Response<Body>, ApiError> {
let tenant_shard_id: TenantShardId = parse_request_param(&request, "tenant_shard_id")?;
let timeline_id: TimelineId = parse_request_param(&request, "timeline_id")?;
check_permission(&request, Some(tenant_shard_id.tenant_id))?;
let state = get_state(&request);
async {
let tenant = state
.tenant_manager
.get_attached_tenant_shard(tenant_shard_id)?;
let res = tenant.get_scheduled_compaction_tasks(timeline_id);
let mut resp = Vec::new();
for item in res {
resp.push(CompactInfoResponse {
compact_key_range: item.compact_key_range,
compact_lsn_range: item.compact_lsn_range,
sub_compaction: item.sub_compaction,
});
}
json_response(StatusCode::OK, resp)
}
.instrument(info_span!("timeline_compact_info", tenant_id = %tenant_shard_id.tenant_id, shard_id = %tenant_shard_id.shard_slug(), %timeline_id))
.await
}
// Run compaction immediately on given timeline.
async fn timeline_compact_handler(
mut request: Request<Body>,
@@ -3400,6 +3428,10 @@ pub fn make_router(
"/v1/tenant/:tenant_shard_id/timeline/:timeline_id/do_gc",
|r| api_handler(r, timeline_gc_handler),
)
.get(
"/v1/tenant/:tenant_shard_id/timeline/:timeline_id/compact",
|r| api_handler(r, timeline_compact_info_handler),
)
.put(
"/v1/tenant/:tenant_shard_id/timeline/:timeline_id/compact",
|r| api_handler(r, timeline_compact_handler),

View File

@@ -3,7 +3,7 @@ use metrics::{
register_counter_vec, register_gauge_vec, register_histogram, register_histogram_vec,
register_int_counter, register_int_counter_pair_vec, register_int_counter_vec,
register_int_gauge, register_int_gauge_vec, register_uint_gauge, register_uint_gauge_vec,
Counter, CounterVec, Gauge, GaugeVec, Histogram, HistogramVec, IntCounter, IntCounterPair,
Counter, CounterVec, GaugeVec, Histogram, HistogramVec, IntCounter, IntCounterPair,
IntCounterPairVec, IntCounterVec, IntGauge, IntGaugeVec, UIntGauge, UIntGaugeVec,
};
use once_cell::sync::Lazy;
@@ -445,15 +445,6 @@ pub(crate) static WAIT_LSN_TIME: Lazy<Histogram> = Lazy::new(|| {
.expect("failed to define a metric")
});
static FLUSH_WAIT_UPLOAD_TIME: Lazy<GaugeVec> = Lazy::new(|| {
register_gauge_vec!(
"pageserver_flush_wait_upload_seconds",
"Time spent waiting for preceding uploads during layer flush",
&["tenant_id", "shard_id", "timeline_id"]
)
.expect("failed to define a metric")
});
static LAST_RECORD_LSN: Lazy<IntGaugeVec> = Lazy::new(|| {
register_int_gauge_vec!(
"pageserver_last_record_lsn",
@@ -2586,7 +2577,6 @@ pub(crate) struct TimelineMetrics {
shard_id: String,
timeline_id: String,
pub flush_time_histo: StorageTimeMetrics,
pub flush_wait_upload_time_gauge: Gauge,
pub compact_time_histo: StorageTimeMetrics,
pub create_images_time_histo: StorageTimeMetrics,
pub logical_size_histo: StorageTimeMetrics,
@@ -2632,9 +2622,6 @@ impl TimelineMetrics {
&shard_id,
&timeline_id,
);
let flush_wait_upload_time_gauge = FLUSH_WAIT_UPLOAD_TIME
.get_metric_with_label_values(&[&tenant_id, &shard_id, &timeline_id])
.unwrap();
let compact_time_histo = StorageTimeMetrics::new(
StorageTimeOperation::Compact,
&tenant_id,
@@ -2780,7 +2767,6 @@ impl TimelineMetrics {
shard_id,
timeline_id,
flush_time_histo,
flush_wait_upload_time_gauge,
compact_time_histo,
create_images_time_histo,
logical_size_histo,
@@ -2830,14 +2816,6 @@ impl TimelineMetrics {
self.resident_physical_size_gauge.get()
}
pub(crate) fn flush_wait_upload_time_gauge_add(&self, duration: f64) {
self.flush_wait_upload_time_gauge.add(duration);
crate::metrics::FLUSH_WAIT_UPLOAD_TIME
.get_metric_with_label_values(&[&self.tenant_id, &self.shard_id, &self.timeline_id])
.unwrap()
.add(duration);
}
pub(crate) fn shutdown(&self) {
let was_shutdown = self
.shutdown
@@ -2855,7 +2833,6 @@ impl TimelineMetrics {
let shard_id = &self.shard_id;
let _ = LAST_RECORD_LSN.remove_label_values(&[tenant_id, shard_id, timeline_id]);
let _ = DISK_CONSISTENT_LSN.remove_label_values(&[tenant_id, shard_id, timeline_id]);
let _ = FLUSH_WAIT_UPLOAD_TIME.remove_label_values(&[tenant_id, shard_id, timeline_id]);
let _ = STANDBY_HORIZON.remove_label_values(&[tenant_id, shard_id, timeline_id]);
{
RESIDENT_PHYSICAL_SIZE_GLOBAL.sub(self.resident_physical_size_get());

View File

@@ -1242,7 +1242,7 @@ pub struct DatadirModification<'a> {
pending_metadata_bytes: usize,
}
impl<'a> DatadirModification<'a> {
impl DatadirModification<'_> {
// When a DatadirModification is committed, we do a monolithic serialization of all its contents. WAL records can
// contain multiple pages, so the pageserver's record-based batch size isn't sufficient to bound this allocation: we
// additionally specify a limit on how much payload a DatadirModification may contain before it should be committed.
@@ -1263,7 +1263,7 @@ impl<'a> DatadirModification<'a> {
pub(crate) fn has_dirty_data(&self) -> bool {
self.pending_data_batch
.as_ref()
.map_or(false, |b| b.has_data())
.is_some_and(|b| b.has_data())
}
/// Set the current lsn
@@ -1319,18 +1319,23 @@ impl<'a> DatadirModification<'a> {
let buf: Bytes = SlruSegmentDirectory::ser(&SlruSegmentDirectory::default())?.into();
let empty_dir = Value::Image(buf);
self.put(slru_dir_to_key(SlruKind::Clog), empty_dir.clone());
self.pending_directory_entries
.push((DirectoryKind::SlruSegment(SlruKind::Clog), 0));
self.put(
slru_dir_to_key(SlruKind::MultiXactMembers),
empty_dir.clone(),
);
self.pending_directory_entries
.push((DirectoryKind::SlruSegment(SlruKind::Clog), 0));
self.put(slru_dir_to_key(SlruKind::MultiXactOffsets), empty_dir);
self.pending_directory_entries
.push((DirectoryKind::SlruSegment(SlruKind::MultiXactOffsets), 0));
// Initialize SLRUs on shard 0 only: creating these on other shards would be
// harmless but they'd just be dropped on later compaction.
if self.tline.tenant_shard_id.is_shard_zero() {
self.put(slru_dir_to_key(SlruKind::Clog), empty_dir.clone());
self.pending_directory_entries
.push((DirectoryKind::SlruSegment(SlruKind::Clog), 0));
self.put(
slru_dir_to_key(SlruKind::MultiXactMembers),
empty_dir.clone(),
);
self.pending_directory_entries
.push((DirectoryKind::SlruSegment(SlruKind::Clog), 0));
self.put(slru_dir_to_key(SlruKind::MultiXactOffsets), empty_dir);
self.pending_directory_entries
.push((DirectoryKind::SlruSegment(SlruKind::MultiXactOffsets), 0));
}
Ok(())
}
@@ -2225,7 +2230,7 @@ impl<'a> DatadirModification<'a> {
assert!(!self
.pending_data_batch
.as_ref()
.map_or(false, |b| b.updates_key(&key)));
.is_some_and(|b| b.updates_key(&key)));
}
}
@@ -2294,7 +2299,7 @@ pub enum Version<'a> {
Modified(&'a DatadirModification<'a>),
}
impl<'a> Version<'a> {
impl Version<'_> {
async fn get(
&self,
timeline: &Timeline,

View File

@@ -3122,6 +3122,23 @@ impl Tenant {
}
}
pub(crate) fn get_scheduled_compaction_tasks(
&self,
timeline_id: TimelineId,
) -> Vec<CompactOptions> {
use itertools::Itertools;
let guard = self.scheduled_compaction_tasks.lock().unwrap();
guard
.get(&timeline_id)
.map(|tline_pending_tasks| {
tline_pending_tasks
.iter()
.map(|x| x.options.clone())
.collect_vec()
})
.unwrap_or_default()
}
/// Schedule a compaction task for a timeline.
pub(crate) async fn schedule_compaction(
&self,
@@ -5759,13 +5776,13 @@ mod tests {
use timeline::{CompactOptions, DeltaLayerTestDesc};
use utils::id::TenantId;
#[cfg(feature = "testing")]
use models::CompactLsnRange;
#[cfg(feature = "testing")]
use pageserver_api::record::NeonWalRecord;
#[cfg(feature = "testing")]
use timeline::compaction::{KeyHistoryRetention, KeyLogAtLsn};
#[cfg(feature = "testing")]
use timeline::CompactLsnRange;
#[cfg(feature = "testing")]
use timeline::GcInfo;
static TEST_KEY: Lazy<Key> =
@@ -9634,7 +9651,7 @@ mod tests {
#[cfg(feature = "testing")]
#[tokio::test]
async fn test_simple_bottom_most_compaction_on_branch() -> anyhow::Result<()> {
use timeline::CompactLsnRange;
use models::CompactLsnRange;
let harness = TenantHarness::create("test_simple_bottom_most_compaction_on_branch").await?;
let (tenant, ctx) = harness.load().await;

View File

@@ -35,7 +35,7 @@ pub struct CompressionInfo {
pub compressed_size: Option<usize>,
}
impl<'a> BlockCursor<'a> {
impl BlockCursor<'_> {
/// Read a blob into a new buffer.
pub async fn read_blob(
&self,

View File

@@ -89,7 +89,7 @@ pub(crate) enum BlockReaderRef<'a> {
VirtualFile(&'a VirtualFile),
}
impl<'a> BlockReaderRef<'a> {
impl BlockReaderRef<'_> {
#[inline(always)]
async fn read_blk(
&self,

View File

@@ -1,12 +1,15 @@
use std::collections::BTreeSet;
use itertools::Itertools;
use pageserver_compaction::helpers::overlaps_with;
use super::storage_layer::LayerName;
/// Checks whether a layer map is valid (i.e., is a valid result of the current compaction algorithm if nothing goes wrong).
///
/// The function checks if we can split the LSN range of a delta layer only at the LSNs of the delta layers. For example,
/// The function implements a fast path check and a slow path check.
///
/// The fast path checks if we can split the LSN range of a delta layer only at the LSNs of the delta layers. For example,
///
/// ```plain
/// | | | |
@@ -25,31 +28,47 @@ use super::storage_layer::LayerName;
/// | | | 4 | | |
///
/// If layer 2 and 4 contain the same single key, this is also a valid layer map.
///
/// However, if a partial compaction is still going on, it is possible that we get a layer map not satisfying the above condition.
/// Therefore, we fallback to simply check if any of the two delta layers overlap. (See "A slow path...")
pub fn check_valid_layermap(metadata: &[LayerName]) -> Option<String> {
let mut lsn_split_point = BTreeSet::new(); // TODO: use a better data structure (range tree / range set?)
let mut all_delta_layers = Vec::new();
for name in metadata {
if let LayerName::Delta(layer) = name {
if layer.key_range.start.next() != layer.key_range.end {
all_delta_layers.push(layer.clone());
}
all_delta_layers.push(layer.clone());
}
}
for layer in &all_delta_layers {
let lsn_range = &layer.lsn_range;
lsn_split_point.insert(lsn_range.start);
lsn_split_point.insert(lsn_range.end);
if layer.key_range.start.next() != layer.key_range.end {
let lsn_range = &layer.lsn_range;
lsn_split_point.insert(lsn_range.start);
lsn_split_point.insert(lsn_range.end);
}
}
for layer in &all_delta_layers {
for (idx, layer) in all_delta_layers.iter().enumerate() {
if layer.key_range.start.next() == layer.key_range.end {
continue;
}
let lsn_range = layer.lsn_range.clone();
let intersects = lsn_split_point.range(lsn_range).collect_vec();
if intersects.len() > 1 {
let err = format!(
"layer violates the layer map LSN split assumption: layer {} intersects with LSN [{}]",
layer,
intersects.into_iter().map(|lsn| lsn.to_string()).join(", ")
);
return Some(err);
// A slow path to check if the layer intersects with any other delta layer.
for (other_idx, other_layer) in all_delta_layers.iter().enumerate() {
if other_idx == idx {
// do not check self intersects with self
continue;
}
if overlaps_with(&layer.lsn_range, &other_layer.lsn_range)
&& overlaps_with(&layer.key_range, &other_layer.key_range)
{
let err = format!(
"layer violates the layer map LSN split assumption: layer {} intersects with layer {}",
layer, other_layer
);
return Some(err);
}
}
}
}
None

View File

@@ -532,7 +532,7 @@ pub struct DiskBtreeIterator<'a> {
>,
}
impl<'a> DiskBtreeIterator<'a> {
impl DiskBtreeIterator<'_> {
pub async fn next(&mut self) -> Option<std::result::Result<(Vec<u8>, u64), DiskBtreeError>> {
self.stream.next().await
}

View File

@@ -174,11 +174,11 @@ impl EphemeralFile {
}
impl super::storage_layer::inmemory_layer::vectored_dio_read::File for EphemeralFile {
async fn read_exact_at_eof_ok<'a, 'b, B: IoBufAlignedMut + Send>(
&'b self,
async fn read_exact_at_eof_ok<B: IoBufAlignedMut + Send>(
&self,
start: u64,
dst: tokio_epoll_uring::Slice<B>,
ctx: &'a RequestContext,
ctx: &RequestContext,
) -> std::io::Result<(tokio_epoll_uring::Slice<B>, usize)> {
let submitted_offset = self.buffered_writer.bytes_submitted();

View File

@@ -392,8 +392,8 @@ impl LayerMap {
image_layer: Option<Arc<PersistentLayerDesc>>,
end_lsn: Lsn,
) -> Option<SearchResult> {
assert!(delta_layer.as_ref().map_or(true, |l| l.is_delta()));
assert!(image_layer.as_ref().map_or(true, |l| !l.is_delta()));
assert!(delta_layer.as_ref().is_none_or(|l| l.is_delta()));
assert!(image_layer.as_ref().is_none_or(|l| !l.is_delta()));
match (delta_layer, image_layer) {
(None, None) => None,

View File

@@ -749,7 +749,7 @@ impl RemoteTimelineClient {
// ahead of what's _actually_ on the remote during index upload.
upload_queue.dirty.metadata = metadata.clone();
self.schedule_index_upload(upload_queue)?;
self.schedule_index_upload(upload_queue);
Ok(())
}
@@ -770,7 +770,7 @@ impl RemoteTimelineClient {
upload_queue.dirty.metadata.apply(update);
self.schedule_index_upload(upload_queue)?;
self.schedule_index_upload(upload_queue);
Ok(())
}
@@ -809,7 +809,7 @@ impl RemoteTimelineClient {
if let Some(archived_at_set) = need_upload_scheduled {
let intended_archived_at = archived_at_set.then(|| Utc::now().naive_utc());
upload_queue.dirty.archived_at = intended_archived_at;
self.schedule_index_upload(upload_queue)?;
self.schedule_index_upload(upload_queue);
}
let need_wait = need_change(&upload_queue.clean.0.archived_at, state).is_some();
@@ -824,7 +824,7 @@ impl RemoteTimelineClient {
let mut guard = self.upload_queue.lock().unwrap();
let upload_queue = guard.initialized_mut()?;
upload_queue.dirty.import_pgdata = state;
self.schedule_index_upload(upload_queue)?;
self.schedule_index_upload(upload_queue);
Ok(())
}
@@ -843,17 +843,14 @@ impl RemoteTimelineClient {
let upload_queue = guard.initialized_mut()?;
if upload_queue.latest_files_changes_since_metadata_upload_scheduled > 0 {
self.schedule_index_upload(upload_queue)?;
self.schedule_index_upload(upload_queue);
}
Ok(())
}
/// Launch an index-file upload operation in the background (internal function)
fn schedule_index_upload(
self: &Arc<Self>,
upload_queue: &mut UploadQueueInitialized,
) -> Result<(), NotInitialized> {
fn schedule_index_upload(self: &Arc<Self>, upload_queue: &mut UploadQueueInitialized) {
let disk_consistent_lsn = upload_queue.dirty.metadata.disk_consistent_lsn();
// fix up the duplicated field
upload_queue.dirty.disk_consistent_lsn = disk_consistent_lsn;
@@ -880,7 +877,6 @@ impl RemoteTimelineClient {
// Launch the task immediately, if possible
self.launch_queued_tasks(upload_queue);
Ok(())
}
/// Reparent this timeline to a new parent.
@@ -909,7 +905,7 @@ impl RemoteTimelineClient {
upload_queue.dirty.metadata.reparent(new_parent);
upload_queue.dirty.lineage.record_previous_ancestor(&prev);
self.schedule_index_upload(upload_queue)?;
self.schedule_index_upload(upload_queue);
Some(self.schedule_barrier0(upload_queue))
}
@@ -948,7 +944,7 @@ impl RemoteTimelineClient {
assert!(prev.is_none(), "copied layer existed already {layer}");
}
self.schedule_index_upload(upload_queue)?;
self.schedule_index_upload(upload_queue);
Some(self.schedule_barrier0(upload_queue))
}
@@ -1004,7 +1000,7 @@ impl RemoteTimelineClient {
upload_queue.dirty.gc_blocking = current
.map(|x| x.with_reason(reason))
.or_else(|| Some(index::GcBlocking::started_now_for(reason)));
self.schedule_index_upload(upload_queue)?;
self.schedule_index_upload(upload_queue);
Some(self.schedule_barrier0(upload_queue))
}
}
@@ -1057,8 +1053,7 @@ impl RemoteTimelineClient {
upload_queue.dirty.gc_blocking =
current.as_ref().and_then(|x| x.without_reason(reason));
assert!(wanted(upload_queue.dirty.gc_blocking.as_ref()));
// FIXME: bogus ?
self.schedule_index_upload(upload_queue)?;
self.schedule_index_upload(upload_queue);
Some(self.schedule_barrier0(upload_queue))
}
}
@@ -1125,8 +1120,8 @@ impl RemoteTimelineClient {
let mut guard = self.upload_queue.lock().unwrap();
let upload_queue = guard.initialized_mut()?;
let with_metadata = self
.schedule_unlinking_of_layers_from_index_part0(upload_queue, names.iter().cloned())?;
let with_metadata =
self.schedule_unlinking_of_layers_from_index_part0(upload_queue, names.iter().cloned());
self.schedule_deletion_of_unlinked0(upload_queue, with_metadata);
@@ -1153,7 +1148,7 @@ impl RemoteTimelineClient {
let names = gc_layers.iter().map(|x| x.layer_desc().layer_name());
self.schedule_unlinking_of_layers_from_index_part0(upload_queue, names)?;
self.schedule_unlinking_of_layers_from_index_part0(upload_queue, names);
self.launch_queued_tasks(upload_queue);
@@ -1166,7 +1161,7 @@ impl RemoteTimelineClient {
self: &Arc<Self>,
upload_queue: &mut UploadQueueInitialized,
names: I,
) -> Result<Vec<(LayerName, LayerFileMetadata)>, NotInitialized>
) -> Vec<(LayerName, LayerFileMetadata)>
where
I: IntoIterator<Item = LayerName>,
{
@@ -1208,10 +1203,10 @@ impl RemoteTimelineClient {
// index_part update, because that needs to be uploaded before we can actually delete the
// files.
if upload_queue.latest_files_changes_since_metadata_upload_scheduled > 0 {
self.schedule_index_upload(upload_queue)?;
self.schedule_index_upload(upload_queue);
}
Ok(with_metadata)
with_metadata
}
/// Schedules deletion for layer files which have previously been unlinked from the
@@ -1302,7 +1297,7 @@ impl RemoteTimelineClient {
let names = compacted_from.iter().map(|x| x.layer_desc().layer_name());
self.schedule_unlinking_of_layers_from_index_part0(upload_queue, names)?;
self.schedule_unlinking_of_layers_from_index_part0(upload_queue, names);
self.launch_queued_tasks(upload_queue);
Ok(())

View File

@@ -145,8 +145,8 @@ pub async fn download_layer_file<'a>(
///
/// If Err() is returned, there was some error. The file at `dst_path` has been unlinked.
/// The unlinking has _not_ been made durable.
async fn download_object<'a>(
storage: &'a GenericRemoteStorage,
async fn download_object(
storage: &GenericRemoteStorage,
src_path: &RemotePath,
dst_path: &Utf8PathBuf,
#[cfg_attr(target_os = "macos", allow(unused_variables))] gate: &utils::sync::gate::Gate,

View File

@@ -25,8 +25,8 @@ use utils::id::{TenantId, TimelineId};
use tracing::info;
/// Serializes and uploads the given index part data to the remote storage.
pub(crate) async fn upload_index_part<'a>(
storage: &'a GenericRemoteStorage,
pub(crate) async fn upload_index_part(
storage: &GenericRemoteStorage,
tenant_shard_id: &TenantShardId,
timeline_id: &TimelineId,
generation: Generation,

View File

@@ -345,10 +345,7 @@ impl LayerFringe {
}
pub(crate) fn next_layer(&mut self) -> Option<(ReadableLayer, KeySpace, Range<Lsn>)> {
let read_desc = match self.planned_visits_by_lsn.pop() {
Some(desc) => desc,
None => return None,
};
let read_desc = self.planned_visits_by_lsn.pop()?;
let removed = self.visit_reads.remove_entry(&read_desc.layer_to_visit_id);

View File

@@ -1486,7 +1486,7 @@ pub struct ValueRef<'a> {
layer: &'a DeltaLayerInner,
}
impl<'a> ValueRef<'a> {
impl ValueRef<'_> {
/// Loads the value from disk
pub async fn load(&self, ctx: &RequestContext) -> Result<Value> {
let buf = self.load_raw(ctx).await?;
@@ -1543,7 +1543,7 @@ pub struct DeltaLayerIterator<'a> {
is_end: bool,
}
impl<'a> DeltaLayerIterator<'a> {
impl DeltaLayerIterator<'_> {
pub(crate) fn layer_dbg_info(&self) -> String {
self.delta_layer.layer_dbg_info()
}

View File

@@ -1052,7 +1052,7 @@ pub struct ImageLayerIterator<'a> {
is_end: bool,
}
impl<'a> ImageLayerIterator<'a> {
impl ImageLayerIterator<'_> {
pub(crate) fn layer_dbg_info(&self) -> String {
self.image_layer.layer_dbg_info()
}

View File

@@ -25,11 +25,11 @@ pub trait File: Send {
/// [`std::io::ErrorKind::UnexpectedEof`] error if the file is shorter than `start+dst.len()`.
///
/// No guarantees are made about the remaining bytes in `dst` in case of a short read.
async fn read_exact_at_eof_ok<'a, 'b, B: IoBufAlignedMut + Send>(
&'b self,
async fn read_exact_at_eof_ok<B: IoBufAlignedMut + Send>(
&self,
start: u64,
dst: Slice<B>,
ctx: &'a RequestContext,
ctx: &RequestContext,
) -> std::io::Result<(Slice<B>, usize)>;
}
@@ -479,11 +479,11 @@ mod tests {
}
impl File for InMemoryFile {
async fn read_exact_at_eof_ok<'a, 'b, B: IoBufMut + Send>(
&'b self,
async fn read_exact_at_eof_ok<B: IoBufMut + Send>(
&self,
start: u64,
mut dst: Slice<B>,
_ctx: &'a RequestContext,
_ctx: &RequestContext,
) -> std::io::Result<(Slice<B>, usize)> {
let dst_slice: &mut [u8] = dst.as_mut_rust_slice_full_zeroed();
let nread = {
@@ -609,12 +609,12 @@ mod tests {
}
}
impl<'x> File for RecorderFile<'x> {
async fn read_exact_at_eof_ok<'a, 'b, B: IoBufAlignedMut + Send>(
&'b self,
impl File for RecorderFile<'_> {
async fn read_exact_at_eof_ok<B: IoBufAlignedMut + Send>(
&self,
start: u64,
dst: Slice<B>,
ctx: &'a RequestContext,
ctx: &RequestContext,
) -> std::io::Result<(Slice<B>, usize)> {
let (dst, nread) = self.file.read_exact_at_eof_ok(start, dst, ctx).await?;
self.recorded.borrow_mut().push(RecordedRead {
@@ -740,11 +740,11 @@ mod tests {
}
impl File for MockFile {
async fn read_exact_at_eof_ok<'a, 'b, B: IoBufMut + Send>(
&'b self,
async fn read_exact_at_eof_ok<B: IoBufMut + Send>(
&self,
start: u64,
mut dst: Slice<B>,
_ctx: &'a RequestContext,
_ctx: &RequestContext,
) -> std::io::Result<(Slice<B>, usize)> {
let ExpectedRead {
expect_pos,

View File

@@ -1,5 +1,6 @@
use anyhow::Context;
use camino::{Utf8Path, Utf8PathBuf};
use once_cell::sync::Lazy;
use pageserver_api::keyspace::KeySpace;
use pageserver_api::models::HistoricLayerInfo;
use pageserver_api::shard::{ShardIdentity, ShardIndex, TenantShardId};
@@ -2096,6 +2097,36 @@ impl Default for LayerImplMetrics {
}
impl LayerImplMetrics {
/// Resets the layer metrics to 0, for use in tests. Since this is a global static, metrics will
/// be shared across tests, and must be reset in each test case.
#[cfg(test)]
fn reset(&self) {
// Destructure to error on new fields.
let LayerImplMetrics {
started_evictions,
completed_evictions,
cancelled_evictions,
started_deletes,
completed_deletes,
failed_deletes,
rare_counters,
inits_cancelled,
redownload_after,
time_to_evict,
} = self;
started_evictions.reset();
completed_evictions.reset();
cancelled_evictions.values().for_each(|c| c.reset());
started_deletes.reset();
completed_deletes.reset();
failed_deletes.values().for_each(|c| c.reset());
rare_counters.values().for_each(|c| c.reset());
inits_cancelled.reset();
redownload_after.local().clear();
time_to_evict.local().clear();
}
fn inc_started_evictions(&self) {
self.started_evictions.inc();
}
@@ -2247,5 +2278,4 @@ impl RareEvent {
}
}
pub(crate) static LAYER_IMPL_METRICS: once_cell::sync::Lazy<LayerImplMetrics> =
once_cell::sync::Lazy::new(LayerImplMetrics::default);
pub(crate) static LAYER_IMPL_METRICS: Lazy<LayerImplMetrics> = Lazy::new(LayerImplMetrics::default);

View File

@@ -1,6 +1,7 @@
use std::time::UNIX_EPOCH;
use pageserver_api::key::CONTROLFILE_KEY;
use serial_test::serial;
use tokio::task::JoinSet;
use utils::{
completion::{self, Completion},
@@ -21,7 +22,10 @@ const FOREVER: std::time::Duration = std::time::Duration::from_secs(ADVANCE.as_s
/// Demonstrate the API and resident -> evicted -> resident -> deleted transitions.
#[tokio::test]
#[serial]
async fn smoke_test() {
LAYER_IMPL_METRICS.reset();
let handle = tokio::runtime::Handle::current();
let h = TenantHarness::create("smoke_test").await.unwrap();
@@ -198,7 +202,10 @@ async fn smoke_test() {
/// This test demonstrates a previous hang when a eviction and deletion were requested at the same
/// time. Now both of them complete per Arc drop semantics.
#[tokio::test(start_paused = true)]
#[serial]
async fn evict_and_wait_on_wanted_deleted() {
LAYER_IMPL_METRICS.reset();
// this is the runtime on which Layer spawns the blocking tasks on
let handle = tokio::runtime::Handle::current();
@@ -275,7 +282,10 @@ async fn evict_and_wait_on_wanted_deleted() {
/// This test ensures we are able to read the layer while the layer eviction has been
/// started but not completed.
#[test]
#[serial]
fn read_wins_pending_eviction() {
LAYER_IMPL_METRICS.reset();
let rt = tokio::runtime::Builder::new_current_thread()
.max_blocking_threads(1)
.enable_all()
@@ -395,6 +405,7 @@ fn read_wins_pending_eviction() {
/// Use failpoint to delay an eviction starting to get a VersionCheckFailed.
#[test]
#[serial]
fn multiple_pending_evictions_in_order() {
let name = "multiple_pending_evictions_in_order";
let in_order = true;
@@ -403,6 +414,7 @@ fn multiple_pending_evictions_in_order() {
/// Use failpoint to reorder later eviction before first to get a UnexpectedEvictedState.
#[test]
#[serial]
fn multiple_pending_evictions_out_of_order() {
let name = "multiple_pending_evictions_out_of_order";
let in_order = false;
@@ -410,6 +422,8 @@ fn multiple_pending_evictions_out_of_order() {
}
fn multiple_pending_evictions_scenario(name: &'static str, in_order: bool) {
LAYER_IMPL_METRICS.reset();
let rt = tokio::runtime::Builder::new_current_thread()
.max_blocking_threads(1)
.enable_all()
@@ -587,7 +601,10 @@ fn multiple_pending_evictions_scenario(name: &'static str, in_order: bool) {
/// disk but the layer internal state says it has not been initialized. Futhermore, it allows us to
/// have non-repairing `Layer::is_likely_resident`.
#[tokio::test(start_paused = true)]
#[serial]
async fn cancelled_get_or_maybe_download_does_not_cancel_eviction() {
LAYER_IMPL_METRICS.reset();
let handle = tokio::runtime::Handle::current();
let h = TenantHarness::create("cancelled_get_or_maybe_download_does_not_cancel_eviction")
.await
@@ -665,8 +682,8 @@ async fn cancelled_get_or_maybe_download_does_not_cancel_eviction() {
}
#[tokio::test(start_paused = true)]
#[serial]
async fn evict_and_wait_does_not_wait_for_download() {
// let handle = tokio::runtime::Handle::current();
let h = TenantHarness::create("evict_and_wait_does_not_wait_for_download")
.await
.unwrap();
@@ -759,10 +776,13 @@ async fn evict_and_wait_does_not_wait_for_download() {
///
/// Also checks that the same does not happen on a non-evicted layer (regression test).
#[tokio::test(start_paused = true)]
#[serial]
async fn eviction_cancellation_on_drop() {
use bytes::Bytes;
use pageserver_api::value::Value;
LAYER_IMPL_METRICS.reset();
// this is the runtime on which Layer spawns the blocking tasks on
let handle = tokio::runtime::Handle::current();

View File

@@ -31,9 +31,9 @@ use pageserver_api::{
},
keyspace::{KeySpaceAccum, KeySpaceRandomAccum, SparseKeyPartitioning},
models::{
CompactionAlgorithm, CompactionAlgorithmSettings, DownloadRemoteLayersTaskInfo,
DownloadRemoteLayersTaskSpawnRequest, EvictionPolicy, InMemoryLayerInfo, LayerMapInfo,
LsnLease, TimelineState,
CompactKeyRange, CompactLsnRange, CompactionAlgorithm, CompactionAlgorithmSettings,
DownloadRemoteLayersTaskInfo, DownloadRemoteLayersTaskSpawnRequest, EvictionPolicy,
InMemoryLayerInfo, LayerMapInfo, LsnLease, TimelineState,
},
reltag::BlockNumber,
shard::{ShardIdentity, ShardNumber, TenantShardId},
@@ -144,19 +144,15 @@ use self::layer_manager::LayerManager;
use self::logical_size::LogicalSize;
use self::walreceiver::{WalReceiver, WalReceiverConf};
use super::config::TenantConf;
use super::remote_timeline_client::index::IndexPart;
use super::remote_timeline_client::RemoteTimelineClient;
use super::secondary::heatmap::{HeatMapLayer, HeatMapTimeline};
use super::storage_layer::{LayerFringe, LayerVisibilityHint, ReadableLayer};
use super::upload_queue::NotInitialized;
use super::GcError;
use super::{
config::TenantConf, storage_layer::LayerVisibilityHint, upload_queue::NotInitialized,
MaybeOffloaded,
};
use super::{debug_assert_current_span_has_tenant_and_timeline_id, AttachedTenantConf};
use super::{remote_timeline_client::index::IndexPart, storage_layer::LayerFringe};
use super::{
remote_timeline_client::RemoteTimelineClient, remote_timeline_client::WaitCompletionError,
storage_layer::ReadableLayer,
};
use super::{
secondary::heatmap::{HeatMapLayer, HeatMapTimeline},
GcError,
debug_assert_current_span_has_tenant_and_timeline_id, AttachedTenantConf, MaybeOffloaded,
};
#[cfg(test)]
@@ -792,63 +788,6 @@ pub(crate) struct CompactRequest {
pub sub_compaction_max_job_size_mb: Option<u64>,
}
#[serde_with::serde_as]
#[derive(Debug, Clone, serde::Deserialize)]
pub(crate) struct CompactLsnRange {
pub start: Lsn,
pub end: Lsn,
}
#[serde_with::serde_as]
#[derive(Debug, Clone, serde::Deserialize)]
pub(crate) struct CompactKeyRange {
#[serde_as(as = "serde_with::DisplayFromStr")]
pub start: Key,
#[serde_as(as = "serde_with::DisplayFromStr")]
pub end: Key,
}
impl From<Range<Lsn>> for CompactLsnRange {
fn from(range: Range<Lsn>) -> Self {
Self {
start: range.start,
end: range.end,
}
}
}
impl From<Range<Key>> for CompactKeyRange {
fn from(range: Range<Key>) -> Self {
Self {
start: range.start,
end: range.end,
}
}
}
impl From<CompactLsnRange> for Range<Lsn> {
fn from(range: CompactLsnRange) -> Self {
range.start..range.end
}
}
impl From<CompactKeyRange> for Range<Key> {
fn from(range: CompactKeyRange) -> Self {
range.start..range.end
}
}
impl CompactLsnRange {
#[cfg(test)]
#[cfg(feature = "testing")]
pub fn above(lsn: Lsn) -> Self {
Self {
start: lsn,
end: Lsn::MAX,
}
}
}
#[derive(Debug, Clone, Default)]
pub(crate) struct CompactOptions {
pub flags: EnumSet<CompactFlags>,
@@ -3897,24 +3836,6 @@ impl Timeline {
// release lock on 'layers'
};
// Backpressure mechanism: wait with continuation of the flush loop until we have uploaded all layer files.
// This makes us refuse ingest until the new layers have been persisted to the remote
let start = Instant::now();
self.remote_client
.wait_completion()
.await
.map_err(|e| match e {
WaitCompletionError::UploadQueueShutDownOrStopped
| WaitCompletionError::NotInitialized(
NotInitialized::ShuttingDown | NotInitialized::Stopped,
) => FlushLayerError::Cancelled,
WaitCompletionError::NotInitialized(NotInitialized::Uninitialized) => {
FlushLayerError::Other(anyhow!(e).into())
}
})?;
let duration = start.elapsed().as_secs_f64();
self.metrics.flush_wait_upload_time_gauge_add(duration);
// FIXME: between create_delta_layer and the scheduling of the upload in `update_metadata_file`,
// a compaction can delete the file and then it won't be available for uploads any more.
// We still schedule the upload, resulting in an error, but ideally we'd somehow avoid this
@@ -4064,8 +3985,11 @@ impl Timeline {
// NB: there are two callers, one is the compaction task, of which there is only one per struct Tenant and hence Timeline.
// The other is the initdb optimization in flush_frozen_layer, used by `boostrap_timeline`, which runs before `.activate()`
// and hence before the compaction task starts.
// Note that there are a third "caller" that will take the `partitioning` lock. It is `gc_compaction_split_jobs` for
// gc-compaction where it uses the repartition data to determine the split jobs. In the future, it might use its own
// heuristics, but for now, we should allow concurrent access to it and let the caller retry compaction.
return Err(CompactionError::Other(anyhow!(
"repartition() called concurrently, this should not happen"
"repartition() called concurrently, this is rare and a retry should be fine"
)));
};
let ((dense_partition, sparse_partition), partition_lsn) = &*partitioning_guard;
@@ -5861,7 +5785,7 @@ enum OpenLayerAction {
None,
}
impl<'a> TimelineWriter<'a> {
impl TimelineWriter<'_> {
async fn handle_open_layer_action(
&mut self,
at: Lsn,

View File

@@ -29,6 +29,7 @@ use utils::id::TimelineId;
use crate::context::{AccessStatsBehavior, RequestContext, RequestContextBuilder};
use crate::page_cache;
use crate::statvfs::Statvfs;
use crate::tenant::checks::check_valid_layermap;
use crate::tenant::remote_timeline_client::WaitCompletionError;
use crate::tenant::storage_layer::batch_split_writer::{
BatchWriterResult, SplitDeltaLayerWriter, SplitImageLayerWriter,
@@ -1110,7 +1111,7 @@ impl Timeline {
return Err(CompactionError::ShuttingDown);
}
let same_key = prev_key.map_or(false, |prev_key| prev_key == key);
let same_key = prev_key == Some(key);
// We need to check key boundaries once we reach next key or end of layer with the same key
if !same_key || lsn == dup_end_lsn {
let mut next_key_size = 0u64;
@@ -1821,10 +1822,12 @@ impl Timeline {
let mut compact_jobs = Vec::new();
// For now, we simply use the key partitioning information; we should do a more fine-grained partitioning
// by estimating the amount of files read for a compaction job. We should also partition on LSN.
let Ok(partition) = self.partitioning.try_lock() else {
bail!("failed to acquire partition lock");
let ((dense_ks, sparse_ks), _) = {
let Ok(partition) = self.partitioning.try_lock() else {
bail!("failed to acquire partition lock during gc-compaction");
};
partition.clone()
};
let ((dense_ks, sparse_ks), _) = &*partition;
// Truncate the key range to be within user specified compaction range.
fn truncate_to(
source_start: &Key,
@@ -2154,15 +2157,14 @@ impl Timeline {
// Step 1: construct a k-merge iterator over all layers.
// Also, verify if the layer map can be split by drawing a horizontal line at every LSN start/end split point.
// disable the check for now because we need to adjust the check for partial compactions, will enable later.
// let layer_names = job_desc
// .selected_layers
// .iter()
// .map(|layer| layer.layer_desc().layer_name())
// .collect_vec();
// if let Some(err) = check_valid_layermap(&layer_names) {
// warn!("gc-compaction layer map check failed because {}, this is normal if partial compaction is not finished yet", err);
// }
let layer_names = job_desc
.selected_layers
.iter()
.map(|layer| layer.layer_desc().layer_name())
.collect_vec();
if let Some(err) = check_valid_layermap(&layer_names) {
bail!("gc-compaction layer map check failed because {}, cannot proceed with compaction due to potential data loss", err);
}
// The maximum LSN we are processing in this compaction loop
let end_lsn = job_desc
.selected_layers
@@ -2544,13 +2546,48 @@ impl Timeline {
);
// Step 3: Place back to the layer map.
// First, do a sanity check to ensure the newly-created layer map does not contain overlaps.
let all_layers = {
let guard = self.layers.read().await;
let layer_map = guard.layer_map()?;
layer_map.iter_historic_layers().collect_vec()
};
let mut final_layers = all_layers
.iter()
.map(|layer| layer.layer_name())
.collect::<HashSet<_>>();
for layer in &layer_selection {
final_layers.remove(&layer.layer_desc().layer_name());
}
for layer in &compact_to {
final_layers.insert(layer.layer_desc().layer_name());
}
let final_layers = final_layers.into_iter().collect_vec();
// TODO: move this check before we call `finish` on image layer writers. However, this will require us to get the layer name before we finish
// the writer, so potentially, we will need a function like `ImageLayerBatchWriter::get_all_pending_layer_keys` to get all the keys that are
// in the writer before finalizing the persistent layers. Now we would leave some dangling layers on the disk if the check fails.
if let Some(err) = check_valid_layermap(&final_layers) {
bail!("gc-compaction layer map check failed after compaction because {}, compaction result not applied to the layer map due to potential data loss", err);
}
// Between the sanity check and this compaction update, there could be new layers being flushed, but it should be fine because we only
// operate on L1 layers.
{
// TODO: sanity check if the layer map is valid (i.e., should not have overlaps)
let mut guard = self.layers.write().await;
guard
.open_mut()?
.finish_gc_compaction(&layer_selection, &compact_to, &self.metrics)
};
// Schedule an index-only upload to update the `latest_gc_cutoff` in the index_part.json.
// Otherwise, after restart, the index_part only contains the old `latest_gc_cutoff` and
// find_gc_cutoffs will try accessing things below the cutoff. TODO: ideally, this should
// be batched into `schedule_compaction_update`.
let disk_consistent_lsn = self.disk_consistent_lsn.load();
self.schedule_uploads(disk_consistent_lsn, None)?;
self.remote_client
.schedule_compaction_update(&layer_selection, &compact_to)?;
@@ -2902,7 +2939,7 @@ impl CompactionLayer<Key> for ResidentDeltaLayer {
impl CompactionDeltaLayer<TimelineAdaptor> for ResidentDeltaLayer {
type DeltaEntry<'a> = DeltaEntry<'a>;
async fn load_keys<'a>(&self, ctx: &RequestContext) -> anyhow::Result<Vec<DeltaEntry<'_>>> {
async fn load_keys(&self, ctx: &RequestContext) -> anyhow::Result<Vec<DeltaEntry<'_>>> {
self.0.get_as_delta(ctx).await?.index_entries(ctx).await
}
}

View File

@@ -428,6 +428,8 @@ MergeTable()
hash_seq_init(&status, old_table->role_table);
while ((entry = hash_seq_search(&status)) != NULL)
{
RoleEntry * old;
bool found_old = false;
RoleEntry *to_write = hash_search(
CurrentDdlTable->role_table,
entry->name,
@@ -435,30 +437,23 @@ MergeTable()
NULL);
to_write->type = entry->type;
if (entry->password)
to_write->password = entry->password;
to_write->password = entry->password;
strlcpy(to_write->old_name, entry->old_name, NAMEDATALEN);
if (entry->old_name[0] != '\0')
{
bool found_old = false;
RoleEntry *old = hash_search(
CurrentDdlTable->role_table,
entry->old_name,
HASH_FIND,
&found_old);
if (entry->old_name[0] == '\0')
continue;
if (found_old)
{
if (old->old_name[0] != '\0')
strlcpy(to_write->old_name, old->old_name, NAMEDATALEN);
else
strlcpy(to_write->old_name, entry->old_name, NAMEDATALEN);
hash_search(CurrentDdlTable->role_table,
entry->old_name,
HASH_REMOVE,
NULL);
}
}
old = hash_search(
CurrentDdlTable->role_table,
entry->old_name,
HASH_FIND,
&found_old);
if (!found_old)
continue;
strlcpy(to_write->old_name, old->old_name, NAMEDATALEN);
hash_search(CurrentDdlTable->role_table,
entry->old_name,
HASH_REMOVE,
NULL);
}
hash_destroy(old_table->role_table);
}

View File

@@ -365,6 +365,10 @@ lfc_change_limit_hook(int newval, void *extra)
neon_log(LOG, "Failed to punch hole in file: %m");
#endif
/* We remove the old entry, and re-enter a hole to the hash table */
for (int i = 0; i < BLOCKS_PER_CHUNK; i++)
{
lfc_ctl->used_pages -= (victim->bitmap[i >> 5] >> (i & 31)) & 1;
}
hash_search_with_hash_value(lfc_hash, &victim->key, victim->hash, HASH_REMOVE, NULL);
memset(&holetag, 0, sizeof(holetag));

View File

@@ -827,7 +827,6 @@ pageserver_send(shardno_t shard_no, NeonRequest *request)
{
while (!pageserver_connect(shard_no, shard->n_reconnect_attempts < max_reconnect_attempts ? LOG : ERROR))
{
HandleMainLoopInterrupts();
shard->n_reconnect_attempts += 1;
}
shard->n_reconnect_attempts = 0;

View File

@@ -131,8 +131,8 @@ get_snapshots_cutoff_lsn(void)
{
cutoff = snapshot_descriptors[logical_replication_max_snap_files - 1].lsn;
elog(LOG,
"ls_monitor: dropping logical slots with restart_lsn lower %X/%X, found %zu snapshot files, limit is %d",
LSN_FORMAT_ARGS(cutoff), snapshot_index, logical_replication_max_snap_files);
"ls_monitor: number of snapshot files, %zu, is larger than limit of %d",
snapshot_index, logical_replication_max_snap_files);
}
/* Is the size of the logical snapshots directory larger than specified?
@@ -162,8 +162,8 @@ get_snapshots_cutoff_lsn(void)
}
if (cutoff != original)
elog(LOG, "ls_monitor: dropping logical slots with restart_lsn lower than %X/%X, " SNAPDIR " is larger than %d KB",
LSN_FORMAT_ARGS(cutoff), logical_replication_max_logicalsnapdir_size);
elog(LOG, "ls_monitor: " SNAPDIR " is larger than %d KB",
logical_replication_max_logicalsnapdir_size);
}
pfree(snapshot_descriptors);
@@ -214,9 +214,13 @@ InitLogicalReplicationMonitor(void)
}
/*
* Unused logical replication slots pins WAL and prevents deletion of snapshots.
* Unused logical replication slots pins WAL and prevent deletion of snapshots.
* WAL bloat is guarded by max_slot_wal_keep_size; this bgw removes slots which
* need too many .snap files.
* need too many .snap files. These files are stored as AUX files, which are a
* pageserver mechanism for storing non-relation data. AUX files are shipped in
* in the basebackup which is requested by compute_ctl before Postgres starts.
* The larger the time to retrieve the basebackup, the more likely it is the
* compute will be killed by the control plane due to a timeout.
*/
void
LogicalSlotsMonitorMain(Datum main_arg)
@@ -239,10 +243,7 @@ LogicalSlotsMonitorMain(Datum main_arg)
ProcessConfigFile(PGC_SIGHUP);
}
/*
* If there are too many .snap files, just drop all logical slots to
* prevent aux files bloat.
*/
/* Get the cutoff LSN */
cutoff_lsn = get_snapshots_cutoff_lsn();
if (cutoff_lsn > 0)
{
@@ -252,31 +253,37 @@ LogicalSlotsMonitorMain(Datum main_arg)
ReplicationSlot *s = &ReplicationSlotCtl->replication_slots[i];
XLogRecPtr restart_lsn;
/* find the name */
LWLockAcquire(ReplicationSlotControlLock, LW_SHARED);
/* Consider only logical repliction slots */
/* Consider only active logical repliction slots */
if (!s->in_use || !SlotIsLogical(s))
{
LWLockRelease(ReplicationSlotControlLock);
continue;
}
/* do we need to drop it? */
/*
* Retrieve the restart LSN to determine if we need to drop the
* slot
*/
SpinLockAcquire(&s->mutex);
restart_lsn = s->data.restart_lsn;
SpinLockRelease(&s->mutex);
strlcpy(slot_name, s->data.name.data, sizeof(slot_name));
LWLockRelease(ReplicationSlotControlLock);
if (restart_lsn >= cutoff_lsn)
{
LWLockRelease(ReplicationSlotControlLock);
elog(LOG, "ls_monitor: not dropping replication slot %s because restart LSN %X/%X is greater than cutoff LSN %X/%X",
slot_name, LSN_FORMAT_ARGS(restart_lsn), LSN_FORMAT_ARGS(cutoff_lsn));
continue;
}
strlcpy(slot_name, s->data.name.data, NAMEDATALEN);
elog(LOG, "ls_monitor: dropping slot %s with restart_lsn %X/%X below horizon %X/%X",
elog(LOG, "ls_monitor: dropping replication slot %s because restart LSN %X/%X lower than cutoff LSN %X/%X",
slot_name, LSN_FORMAT_ARGS(restart_lsn), LSN_FORMAT_ARGS(cutoff_lsn));
LWLockRelease(ReplicationSlotControlLock);
/* now try to drop it, killing owner before if any */
/* now try to drop it, killing owner before, if any */
for (;;)
{
pid_t active_pid;
@@ -288,9 +295,9 @@ LogicalSlotsMonitorMain(Datum main_arg)
if (active_pid == 0)
{
/*
* Slot is releasted, try to drop it. Though of course
* Slot is released, try to drop it. Though of course,
* it could have been reacquired, so drop can ERROR
* out. Similarly it could have been dropped in the
* out. Similarly, it could have been dropped in the
* meanwhile.
*
* In principle we could remove pg_try/pg_catch, that
@@ -300,14 +307,14 @@ LogicalSlotsMonitorMain(Datum main_arg)
PG_TRY();
{
ReplicationSlotDrop(slot_name, true);
elog(LOG, "ls_monitor: slot %s dropped", slot_name);
elog(LOG, "ls_monitor: replication slot %s dropped", slot_name);
}
PG_CATCH();
{
/* log ERROR and reset elog stack */
EmitErrorReport();
FlushErrorState();
elog(LOG, "ls_monitor: failed to drop slot %s", slot_name);
elog(LOG, "ls_monitor: failed to drop replication slot %s", slot_name);
}
PG_END_TRY();
break;
@@ -315,7 +322,7 @@ LogicalSlotsMonitorMain(Datum main_arg)
else
{
/* kill the owner and wait for release */
elog(LOG, "ls_monitor: killing slot %s owner %d", slot_name, active_pid);
elog(LOG, "ls_monitor: killing replication slot %s owner %d", slot_name, active_pid);
(void) kill(active_pid, SIGTERM);
/* We shouldn't get stuck, but to be safe add timeout. */
ConditionVariableTimedSleep(&s->active_cv, 1000, WAIT_EVENT_REPLICATION_SLOT_DROP);

View File

@@ -187,7 +187,6 @@ async fn authenticate(
NodeInfo {
config,
aux: db_info.aux,
allow_self_signed_compute: false, // caller may override
},
db_info.allowed_ips,
))

View File

@@ -776,6 +776,7 @@ impl From<&jose_jwk::Key> for KeyType {
}
#[cfg(test)]
#[expect(clippy::unwrap_used)]
mod tests {
use std::future::IntoFuture;
use std::net::SocketAddr;

View File

@@ -37,7 +37,6 @@ impl LocalBackend {
branch_id: BranchIdTag::get_interner().get_or_intern("local"),
cold_start_info: ColdStartInfo::WarmCached,
},
allow_self_signed_compute: false,
},
}
}

View File

@@ -463,6 +463,8 @@ impl ComputeConnectBackend for Backend<'_, ComputeCredentials> {
#[cfg(test)]
mod tests {
#![allow(clippy::unimplemented, clippy::unwrap_used)]
use std::net::IpAddr;
use std::sync::Arc;
use std::time::Duration;
@@ -676,6 +678,9 @@ mod tests {
.await
.unwrap();
// flush the final server message
stream.flush().await.unwrap();
handle.await.unwrap();
}

Some files were not shown because too many files have changed in this diff Show More