Commit Graph

4530 Commits

Author SHA1 Message Date
Christian Schwarz
64b4b498a4 Revert "remove the walredo usage, that'll be in the next pr"
This reverts commit 20e82629df.
2024-02-02 17:25:25 +00:00
Christian Schwarz
20e82629df remove the walredo usage, that'll be in the next pr 2024-02-02 17:21:59 +00:00
Christian Schwarz
6788bde87a Merge branch 'problame/2024-02-walredo-work/prespawn/broken-tenants-no-walredo' into problame/2024-02-walredo-work/prespawn/heaver-once-cell-for-process-launch 2024-02-02 17:16:26 +00:00
Christian Schwarz
283c8abc04 Merge branch 'problame/2024-02-walredo-work/prespawn/split-code' into problame/2024-02-walredo-work/prespawn/broken-tenants-no-walredo 2024-02-02 17:16:25 +00:00
Christian Schwarz
647d409f0f Merge branch 'main' into problame/2024-02-walredo-work/prespawn/split-code 2024-02-02 17:16:24 +00:00
Christian Schwarz
0a09cff816 heavier_once_cell: switch to tokio::sync::RwLock
Using the RwLock reduces contention on the hot path.
2024-02-02 17:09:56 +00:00
John Spray
7e2436695d storage controller: use AWS Secrets Manager for database URL, etc (#6585)
## Problem

Passing secrets in via CLI/environment is awkward when using helm for
deployment, and not ideal for security (secrets may show up in ps,
/proc).

We can bypass these issues by simply connecting directly to the AWS
Secrets Manager service at runtime.

## Summary of changes

- Add dependency on aws-sdk-secretsmanager
- Update other aws dependencies to latest, to match transitive
dependency versions
- Add `Secrets` type in attachment service, using AWS SDK to load if
secrets are not provided on the command line.
2024-02-02 16:57:11 +00:00
Christian Schwarz
c29532cded Revert "Revert "[DO NOT MERGE] refactor(walredo): use replace RwLock with heavier_once_cell""
This reverts commit 6d94d9fb19.
2024-02-02 16:43:14 +00:00
Christian Schwarz
1102d3f0bf Revert "switch to tokio::RwLock"
This reverts commit e8f1af5527.
2024-02-02 16:43:08 +00:00
Christian Schwarz
e8f1af5527 switch to tokio::RwLock 2024-02-02 16:42:54 +00:00
Christian Schwarz
6d94d9fb19 Revert "[DO NOT MERGE] refactor(walredo): use replace RwLock with heavier_once_cell"
This reverts commit 2ab2608d4c.
2024-02-02 16:15:37 +00:00
Conrad Ludgate
6506fd14c4 proxy: more refactors (#6526)
## Problem

not really any problem, just some drive-by changes

## Summary of changes

1. move wake compute
2. move json processing
3. move handle_try_wake
4. move test backend to api provider
5. reduce wake-compute concerns
6. remove duplicate wake-compute loop
2024-02-02 16:07:35 +00:00
Christian Schwarz
84169c926a Merge branch 'problame/2024-02-walredo-work/prespawn/broken-tenants-no-walredo' into problame/2024-02-walredo-work/prespawn/heaver-once-cell-for-process-launch 2024-02-02 15:53:57 +00:00
Christian Schwarz
acdebf2cec Merge branch 'problame/2024-02-walredo-work/prespawn/split-code' into problame/2024-02-walredo-work/prespawn/broken-tenants-no-walredo 2024-02-02 15:53:56 +00:00
Christian Schwarz
44cb5e5be6 Merge branch 'main' into problame/2024-02-walredo-work/prespawn/split-code 2024-02-02 15:53:55 +00:00
John Spray
46fb1a90ce pageserver: avoid calculating/sending logical sizes on shard !=0 (#6567)
## Problem

Sharded tenants only maintain accurate relation sizes on shard 0.
Therefore logical size can only be calculated on shard 0. Fortunately it
is also only _needed_ on shard 0, to provide Safekeeper feedback and to
send consumption metrics.

Closes: #6307

## Summary of changes

- Send 0 for logical size to safekeepers on shards !=0
- Skip logical size warmup task on shards !=0
- Skip imitate_layer_accesses on shards !=0
2024-02-02 15:52:03 +00:00
Christian Schwarz
2ab2608d4c [DO NOT MERGE] refactor(walredo): use replace RwLock with heavier_once_cell
The API is nice, exactly what we want, but we would want a more
optimistic underlying sync primitive.
2024-02-02 15:36:15 +00:00
Christian Schwarz
f73aa3eb32 refactor(walredo): avoid the need for a WalRedoManager in broken tenants
When we later introduce a global pool of pre-spawned walredo
processes (https://github.com/neondatabase/neon/issues/6581), this
refactoring avoids plumbing through the reference to the pool to all the
places where we create a broken tenant.

Builds atop the refactoring in #6583
2024-02-02 14:52:53 +00:00
Christian Schwarz
2374e1318e Merge branch 'main' into problame/2024-02-walredo-work/prespawn/split-code 2024-02-02 14:42:30 +00:00
John Spray
56171cbe8c pageserver: more permissive activation timeout when testing (#6564)
## Problem

The 5 second activation timeout is appropriate for production
environments, where we want to give a prompt response to the cloud
control plane, and if we fail it will retry the call. In tests however,
we don't want every call to e.g. timeline create to have to come with a
retry wrapper.

This issue has always been there, but it is more apparent in sharding
tests that concurrently attach several tenant shards.

Closes: https://github.com/neondatabase/neon/issues/6563

## Summary of changes

When `testing` feature is enabled, make `ACTIVE_TENANT_TIMEOUT` 30
seconds instead of 5 seconds.
2024-02-02 15:14:42 +01:00
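A minimal sketch of the timeout switch described in 56171cbe8c, assuming the constant is a `std::time::Duration` selected by a Cargo feature named `testing`. The helper `active_tenant_timeout` and its `testing` parameter are hypothetical names introduced for illustration; this is not the pageserver's actual code.

```rust
use std::time::Duration;

/// Picks the activation timeout: 30 seconds for test builds, 5 seconds otherwise.
/// The function name and the `testing` parameter are illustrative assumptions.
pub const fn active_tenant_timeout(testing: bool) -> Duration {
    if testing {
        Duration::from_secs(30)
    } else {
        Duration::from_secs(5)
    }
}

/// Resolved at compile time: `cfg!` expands to `true` only when the crate
/// is built with the `testing` feature enabled.
pub const ACTIVE_TENANT_TIMEOUT: Duration = active_tenant_timeout(cfg!(feature = "testing"));
```

Keeping the choice in a `const fn` lets both branches be checked in an ordinary unit test, regardless of which features the test binary was built with.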
Arpad Müller
48b05b7c50 Add a time_travel_remote_storage http endpoint (#6533)
Adds an endpoint to the pageserver to S3-recover an entire tenant to a
specific given timestamp.

Required input parameters:
* `travel_to`: the target timestamp to recover the S3 state to
* `done_if_after`: a timestamp that marks the beginning of the recovery
process. Retries of the query should keep this value constant. It *must*
be after `travel_to`, and also after any changes we want to revert, and
must represent a point in time before the endpoint is called, all
of these time points in terms of the time source used by S3. These
criteria need to hold even in the face of clock differences, so I
recommend waiting a specific amount of time, then taking
`done_if_after`, then waiting some amount of time again, and only then
issuing the request.

Also important to note: the timestamps in S3 work at second accuracy, so
one needs to add generous waits before and after for the process to work
smoothly (at least 2-3 seconds).

We ignore the added test for the mocked S3 for now due to a limitation
in moto: https://github.com/getmoto/moto/issues/7300 .

Part of https://github.com/neondatabase/cloud/issues/8233
2024-02-02 14:52:12 +01:00
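The ordering constraints on `travel_to` and `done_if_after` from 48b05b7c50 can be sketched as a client-side sanity check. The function name `check_time_travel_params` and the `margin` parameter are assumptions made for this example. The check uses local timestamps, whereas the commit message defines the constraints in terms of S3's time source, so a generous margin (the message suggests at least 2-3 seconds) has to absorb clock differences. It cannot verify that `done_if_after` is later than every change to be reverted.

```rust
use std::time::{Duration, SystemTime};

/// Hypothetical pre-flight check for a time-travel request.
/// Returns an error message if the timestamps are ordered too tightly.
pub fn check_time_travel_params(
    travel_to: SystemTime,
    done_if_after: SystemTime,
    now: SystemTime,
    margin: Duration,
) -> Result<(), String> {
    // `done_if_after` must lie after `travel_to`, with room for second-level accuracy.
    if done_if_after < travel_to + margin {
        return Err("done_if_after must be at least `margin` after travel_to".to_string());
    }
    // The request must be issued some time after `done_if_after` was taken.
    if now < done_if_after + margin {
        return Err("wait at least `margin` after done_if_after before sending".to_string());
    }
    Ok(())
}
```

A caller would take `done_if_after` once, keep it constant across retries, and pass the current time as `now` on each attempt.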
Conrad Ludgate
0856fe6676 proxy: remove per client bytes (#5466)
## Problem

Follow up to #5461

In my memory usage/fragmentation measurements, these metrics came up as
a large source of small allocations. The replacement metric has been in
use for a long time now so I think it's good to finally remove this.
Per-endpoint data is still tracked elsewhere.

## Summary of changes

remove the per-client bytes metrics
2024-02-02 12:28:48 +00:00
Christian Schwarz
aa0e9fdaef Merge branch 'main' into problame/2024-02-walredo-work/prespawn/split-code 2024-02-02 11:50:15 +00:00
Alexander Bayandin
4133d14a77 Compute: pgbouncer 1.22.0 (#6582)
## Problem
Update pgbouncer from 1.21 (and patches[0][1]) to 1.22 (which includes
these patches)
- [0] https://github.com/pgbouncer/pgbouncer/pull/972
- [1] https://github.com/pgbouncer/pgbouncer/pull/998

## Summary of changes
- Build pgbouncer 1.22.0 for neonVMs from upstream
2024-02-02 11:49:11 +00:00
Alexander Bayandin
30c9e145d7 check-macos-build: switch job to macos-14 (M1) (#6539)
## Problem
- GitHub made available `macos-14` runners, and they run on M1
processors[0]
- The price is the same as Intel-based runners — "macOS | 3 or 4 (M1 or
Intel) | $0.08"[1], but runners on Apple Silicon should be significantly 
faster than their Intel counterparts.
- Most developers who use macOS use Apple Silicon-based Macs nowadays.

- [0] https://github.blog/changelog/2024-01-30-github-actions-introducing-the-new-m1-macos-runner-available-to-open-source/
- [1] https://docs.github.com/en/billing/managing-billing-for-github-actions/about-billing-for-github-actions#per-minute-rates

## Summary of changes
- Run `check-macos-build` on `macos-14`
2024-02-02 10:51:20 +00:00
John Spray
24e916d37f pageserver: fix a syntax error in swagger (#6566)
A description was written as a follow-on to a section line, rather than
in the proper `description:` part. This caused swagger parsers to
rightly reject it.
2024-02-02 10:35:09 +00:00
Andreas Scherbaum
23f58145ed Update wording for better readability (#6559)
Update wording, add spaces in commandline arguments

Co-authored-by: Andreas Scherbaum <andreas@neon.tech>
2024-02-02 11:22:32 +01:00
Christian Schwarz
9b8aa270b8 cleanups 2024-02-02 10:19:18 +00:00
Christian Schwarz
4571db1750 extract NeonWalRecord apply logic 2024-02-02 10:14:50 +00:00
Christian Schwarz
6fe534fea3 move protocol as child module of process, where it belongs 2024-02-02 10:05:50 +00:00
Christian Schwarz
8b258e20a0 move more stuff around 2024-02-02 10:03:40 +00:00
Christian Schwarz
29eec6c563 split off walredo process & protocol from walredo.rs 2024-02-02 09:59:31 +00:00
Heikki Linnakangas
350865392c Print checkpoint key contents with "pagectl print-layer-file" (#6541)
This was very useful in debugging the bugs fixed in #6410 and #6502.

There's a lot more we could do. This only adds the printing to delta
layers, not image layers, for example, and it might be useful to print
details of more record types. But this is a good start.
2024-02-02 01:35:31 +02:00
Christian Schwarz
1be5e564ce feat(walredo): use posix_spawn by moving close_fds() work to walredo C code (#6574)
The rust stdlib uses the efficient `posix_spawn` by default.
However, before this PR, pageserver used `pre_exec()` in our
`close_fds()` ext trait.

This PR moves the work that `close_fds()` did to the walredo C code.
I verified manually using `gdb` that we're now forking out the walredo
process using `posix_spawn`.

refs https://github.com/neondatabase/neon/issues/6565
2024-02-01 22:38:34 +01:00
Christian Schwarz
7a70ef991f feat(walredo): various observability improvements (#6573)
- log when we start walredo process
- include tenant shard id in walredo argv
- dump some basic walredo state in tenant details api
- more suitable walredo process launch histogram buckets
- avoid duplicate tracing labels in walredo launch spans
2024-02-01 21:59:40 +01:00
Sasha Krassovsky
be30388901 Add retry to fetching basebackup (#6537)
## Problem
Currently we have no retry mechanism for fetching basebackup. If there's
an unstable connection, starting compute will just fail.

## Summary of changes
Adds an exponential backoff with 7 retries to get the basebackup.
2024-02-01 20:50:04 +00:00
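The retry behaviour described in be30388901 (exponential backoff, 7 retries) can be sketched generically as follows. This is an illustration under assumptions, not the compute code: `retry_with_backoff`, the injected `sleep` callback, and the doubling factor are all hypothetical choices, and the commit message does not state the base delay.

```rust
use std::time::Duration;

/// Calls `op` until it succeeds or until `max_retries` retries have failed.
/// The delay starts at `base_delay` and doubles after every failed attempt.
/// `sleep` is injected so that callers (and tests) decide how to wait.
pub fn retry_with_backoff<T, E>(
    max_retries: u32,
    base_delay: Duration,
    mut op: impl FnMut() -> Result<T, E>,
    mut sleep: impl FnMut(Duration),
) -> Result<T, E> {
    let mut delay = base_delay;
    let mut retries_done = 0;
    loop {
        match op() {
            Ok(value) => return Ok(value),
            // Out of retries: surface the last error to the caller.
            Err(err) if retries_done >= max_retries => return Err(err),
            Err(_) => {
                sleep(delay);
                delay = delay.saturating_mul(2);
                retries_done += 1;
            }
        }
    }
}
```

With `max_retries = 7` the operation is attempted at most eight times (one initial attempt plus seven retries). In production code `sleep` would typically be `std::thread::sleep` or an async equivalent.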
Heikki Linnakangas
3525080031 Fix pgvector 0.6.0 with Neon. (#6571)
The previous patch was broken. rd_smgr was not open yet, so we need to use
RelationGetSmgr() to access it.
2024-02-01 20:48:31 +00:00
Arpad Müller
527cdbc010 Don't require AWS access keys for S3 pytests (#6556)
Don't require AWS access keys (AWS_ACCESS_KEY_ID and
AWS_SECRET_ACCESS_KEY) for S3 usage in the pytests, and also allow
AWS_PROFILE to be passed.

One of the two methods is required however.

This allows local development like:

```
aws sso login --profile dev
export ENABLE_REAL_S3_REMOTE_STORAGE=nonempty REMOTE_STORAGE_S3_REGION=eu-central-1 REMOTE_STORAGE_S3_BUCKET=neon-github-ci-tests AWS_PROFILE=dev
cargo build_testing && RUST_BACKTRACE=1 ./scripts/pytest -k debug-pg16 test_runner/regress/test_tenant_delete.py::test_tenant_delete_smoke
```

related earlier PR for the cargo unit tests of the `remote_storage` crate: #6202

---------

Co-authored-by: Alexander Bayandin <alexander@neon.tech>
2024-02-01 20:18:07 +00:00
Alexander Bayandin
39be2b0108 Makefile: set PQ_LIB_DIR to avoid linkage with system libpq (#6538)
## Problem

Initially spotted on macOS. When building `attachment_service`, it might
get linked with system `libpq`:
```
$ otool -L target/debug/attachment_service
target/debug/attachment_service:
	/opt/homebrew/opt/libpq/lib/libpq.5.dylib (compatibility version 5.0.0, current version 5.16.0)
	/System/Library/Frameworks/Security.framework/Versions/A/Security (compatibility version 1.0.0, current version 61040.61.1)
	/System/Library/Frameworks/CoreFoundation.framework/Versions/A/CoreFoundation (compatibility version 150.0.0, current version 2202.0.0)
	/usr/lib/libiconv.2.dylib (compatibility version 7.0.0, current version 7.0.0)
	/usr/lib/libSystem.B.dylib (compatibility version 1.0.0, current version 1336.61.1)
```

After this PR:
```
$ otool -L target/debug/attachment_service 
target/debug/attachment_service:
	/Users/bayandin/work/neon/pg_install/v16/lib/libpq.5.dylib (compatibility version 5.0.0, current version 5.16.0)
	/System/Library/Frameworks/Security.framework/Versions/A/Security (compatibility version 1.0.0, current version 61040.61.1)
	/System/Library/Frameworks/CoreFoundation.framework/Versions/A/CoreFoundation (compatibility version 150.0.0, current version 2202.0.0)
	/usr/lib/libiconv.2.dylib (compatibility version 7.0.0, current version 7.0.0)
	/usr/lib/libSystem.B.dylib (compatibility version 1.0.0, current version 1336.61.1)
```

## Summary of changes
- Set `PQ_LIB_DIR` to bundled Postgres 16 lib dir
2024-02-01 17:34:48 +00:00
Alexander Bayandin
fa52cd575e Remove old tests results and old coverage collection (#6376)
## Problem
We have switched to new test results and new coverage results, so no
need to collect these data in old formats.

## Summary of changes
- Remove "Upload coverage report" for old coverage report
- Remove "Store Allure test stat in the DB" for old test results format
2024-02-01 13:36:55 +00:00
Vlad Lazar
d2c410c748 pageserver_api: remove overlaps from KeySpace (#6544)
This commit adds a function to `KeySpace` which updates a key space
by removing all overlaps with a second key space. This can involve
splitting or removing of existing ranges.

The implementation is not particularly efficient: O(M * N * log(N))
where N is the number of ranges in the current key space and M is the
number of ranges in the key space we are checking against. In practice,
this shouldn't matter much since, in the short term, the only caller of
this function will be the vectored read path and the number of key
spaces involved will be small. This follows from the upper bound placed
on the number of keys accepted by the vectored read path.

A couple other small utility functions are added. They'll be used by the
vectored search path as well.
2024-02-01 13:14:35 +00:00
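To illustrate what "removing all overlaps with a second key space" in d2c410c748 means, here is a simplified sketch over half-open `Range<u64>` values. It is not the `KeySpace` implementation from the commit: real keys are not plain integers, the function name `remove_overlaps` is hypothetical, and this version is a plain nested loop rather than the algorithm whose complexity the commit message quotes.

```rust
use std::ops::Range;

/// Returns the parts of `current` that are not covered by any range in `other`.
/// A range may be kept as-is, trimmed, split in two, or dropped entirely.
pub fn remove_overlaps(current: &[Range<u64>], other: &[Range<u64>]) -> Vec<Range<u64>> {
    let mut result: Vec<Range<u64>> = current.to_vec();
    for cut in other {
        if cut.is_empty() {
            continue; // an empty range covers nothing
        }
        let mut next = Vec::with_capacity(result.len() + 1);
        for r in result {
            if cut.end <= r.start || cut.start >= r.end {
                next.push(r); // disjoint: keep unchanged
                continue;
            }
            if r.start < cut.start {
                next.push(r.start..cut.start); // piece left of the cut
            }
            if cut.end < r.end {
                next.push(cut.end..r.end); // piece right of the cut
            }
        }
        result = next;
    }
    result
}
```

For example, removing `5..25` from the ranges `0..10` and `20..30` trims both and leaves `0..5` and `25..30`, while removing `3..4` from `0..10` splits it into `0..3` and `4..10`.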
Vlad Lazar
221531c9db pageserver: lift ancestor timeline logic from read path (#6543)
When the read path needs to follow a key into the ancestor timeline, it
needs to wait for said ancestor to become active and aware of its
branching lsn. The logic is lifted into a separate function with its
own new error type.

This is done because the vectored read path needs the same logic. It's
also the reason for the newly introduced error type.

When we switch the read path to proxy into `get_vectored`, we can
remove the duplicated variants from `PageReconstructError`.
2024-02-01 10:35:18 +00:00
Christian Schwarz
4c173456dc pagebench: fix percentiles reporting (#6547)
Before this patch, pagebench was always showing the same value.

refs https://github.com/neondatabase/neon/issues/6509
2024-01-31 23:29:48 +00:00
Christian Schwarz
e82625b77d refactor(pageserver main): signal handling (#6554)
This refactoring makes it easier to experimentally replace
BACKGROUND_RUNTIME with a single-threaded runtime. Found this useful
[during benchmarking](https://github.com/neondatabase/neon/pull/6555).
2024-01-31 23:25:57 +00:00
Christian Schwarz
0ac1e71524 update tokio-epoll-uring (#6558)
to pull in fixes for
https://github.com/neondatabase/tokio-epoll-uring/issues/37
2024-01-31 22:54:54 +00:00
Anna Khanova
271133d960 Proxy: reduce number of get role secret calls (#6557)
## Problem

Right now, if the get_role_secret response wasn't cached (e.g. because the
cache already reached its max size), a second, identical request is sent.

## Summary of changes

Avoid needless request.
2024-01-31 22:16:56 +00:00
Joonas Koivunen
3d5fab127a rewrite Gate impl for better observability (#6542)
changes:
- two messages instead of a message every second while the gate is closing
- replace the gate name string by using a pointer
- slow GateGuards are likely to log who they were (see example)

example found in regress tests: <https://github.com/neondatabase/neon/pull/6542#issuecomment-1919009256>
2024-01-31 22:15:58 +00:00
Joonas Koivunen
66719d7eaf logging: fix span usage (#6549)
Fixes some duplication due to extra or misconfigured `#[instrument]`,
while filling in the `timeline_id` to delete timeline flow calls.
2024-01-31 20:52:00 +00:00
Konstantin Knizhnik
9a9d9beaee Download SLRU segments on demand (#6151)
## Problem

See https://github.com/neondatabase/cloud/issues/8673

## Summary of changes


Download missing SLRU segments from the pageserver

---------

Co-authored-by: Konstantin Knizhnik <knizhnik@neon.tech>
Co-authored-by: Heikki Linnakangas <heikki@neon.tech>
2024-01-31 21:39:18 +02:00
John Spray
2bfc831c60 control_plane/attachment_service: make --path optional (#6545)
## Problem

The `--path` argument is only used in testing, for compat tests that use
a JSON snapshot of state rather than the postgres database. In regular
deployments, it should be omitted (currently one has to specify `--path
""`)

## Summary of changes

Make `--path` optional.
2024-01-31 17:02:41 +00:00