John Spray
18159b7695
deletion queue: expose errors from push/flush
2023-08-22 10:01:10 +01:00
John Spray
c1bc9c0f70
Various test fixes + tweaks to flushing
2023-08-18 12:44:35 +01:00
John Spray
2de5efa208
Fix broken wait_untils in test_remote_storage_upload_queue_retries
2023-08-18 12:44:35 +01:00
John Spray
d330eac4bc
clippy
2023-08-18 12:44:35 +01:00
John Spray
3ebceeda71
pageserver: refactor timeline args into TimelineResources
...
This sidesteps clippy complaining about function arg counts,
and will enable introducing more shared structures in future
without the noise of adding extra args to all the functions
involved in timeline setup.
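
A minimal sketch of the parameter-struct pattern this message describes. Only the name `TimelineResources` comes from the commit; the field names, `RemoteClient`, `DeletionQueueClient`, `Timeline` and `create_timeline` are hypothetical stand-ins, not the pageserver's actual definitions.

```rust
use std::sync::Arc;

// Hypothetical stand-ins for shared services; the real types are not shown in this log.
struct RemoteClient;
struct DeletionQueueClient;

/// Everything a timeline needs from its surroundings, passed as one value.
/// Adding a new shared structure later means adding a field here rather than
/// a parameter to every function involved in timeline setup.
struct TimelineResources {
    remote_client: Option<Arc<RemoteClient>>,
    deletion_queue_client: Arc<DeletionQueueClient>,
}

struct Timeline {
    id: u64,
    resources: TimelineResources,
}

// One `resources` argument instead of one argument per shared handle.
fn create_timeline(id: u64, resources: TimelineResources) -> Timeline {
    Timeline { id, resources }
}

fn main() {
    let queue = Arc::new(DeletionQueueClient);
    let timeline = create_timeline(
        1,
        TimelineResources {
            remote_client: Some(Arc::new(RemoteClient)),
            deletion_queue_client: Arc::clone(&queue),
        },
    );
    // The caller and the timeline hold the same queue client.
    assert_eq!(Arc::strong_count(&timeline.resources.deletion_queue_client), 2);
    assert!(timeline.resources.remote_client.is_some());
    println!("timeline {} created", timeline.id);
}
```

With this shape, a function signature only changes when the struct itself is introduced, which is also why the argument-count lint stops firing.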
2023-08-18 12:44:35 +01:00
John Spray
31729d6f4d
pageserver: refactor tenant args into a structure
...
This way, when we add some new shared structure that the
tenants need a reference to, we do not have to add it
individually as an extra argument to the various functions.
2023-08-18 12:44:35 +01:00
John Spray
7e0e3517c1
clippy
2023-08-18 12:44:35 +01:00
John Spray
c4fc6e433d
tests: add e2e deletion queue recovery test
2023-08-18 12:44:35 +01:00
John Spray
c36cba28d6
pageserver: generalize flush API
2023-08-18 12:44:35 +01:00
John Spray
8eaa4015de
deletion queue: versions in keys
2023-08-18 12:44:35 +01:00
John Spray
10e927ee3e
Add encoding versions to deletion queue structs
2023-08-18 12:44:35 +01:00
John Spray
bb3a59f275
clippy
2023-08-18 12:44:35 +01:00
John Spray
a0ed43cc12
deletion queue: add DeletionHeader for sequence numbers
2023-08-18 12:44:35 +01:00
John Spray
99dc5a5c27
Deletion queue: implement recovery on startup
2023-08-18 12:44:35 +01:00
John Spray
54db1f5d8a
remote_storage: add a helper for downloading full objects
...
This is only for use with small objects that we will
deserialize in a non-streaming way.
Also add a strip_prefix method to RemotePath.
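
A rough sketch of what the two helpers might look like, using only the standard library. The `RemotePath` newtype below and the `download_all` function are assumptions for illustration; the actual remote_storage signatures are not given in this log and are likely async.

```rust
use std::io::{self, Read};
use std::path::{Path, PathBuf, StripPrefixError};

/// Hypothetical stand-in for a storage-relative path.
#[derive(Debug, Clone, PartialEq)]
struct RemotePath(PathBuf);

impl RemotePath {
    fn new(p: &str) -> Self {
        RemotePath(PathBuf::from(p))
    }

    /// Returns the part of this path that follows `prefix`, or an error if
    /// this path does not start with `prefix`.
    fn strip_prefix(&self, prefix: &RemotePath) -> Result<&Path, StripPrefixError> {
        self.0.strip_prefix(&prefix.0)
    }
}

/// Reads an entire object into memory. Only sensible for small objects that
/// are deserialized in one go, since nothing is streamed.
fn download_all(mut body: impl Read) -> io::Result<Vec<u8>> {
    let mut buf = Vec::new();
    body.read_to_end(&mut buf)?;
    Ok(buf)
}

fn main() {
    // A byte slice stands in for the body of a downloaded object.
    let bytes = download_all(&b"{\"version\":1}"[..]).unwrap();
    assert_eq!(bytes, b"{\"version\":1}".to_vec());

    let full = RemotePath::new("deletion/list-01.json");
    let prefix = RemotePath::new("deletion");
    assert_eq!(full.strip_prefix(&prefix).unwrap(), Path::new("list-01.json"));
    // A prefix that does not match is reported as an error, not ignored.
    assert!(full.strip_prefix(&RemotePath::new("tenants")).is_err());
}
```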
2023-08-18 12:44:35 +01:00
John Spray
404b25e45f
Remove vestigial remote_timeline_client deletion paths
2023-08-18 12:44:35 +01:00
John Spray
f4dba9f907
tests: update tenant deletion tests for deletion queue
2023-08-18 12:44:35 +01:00
John Spray
4ec45bc7dc
tests: update tenant deletion tests for deletion queue
2023-08-18 12:44:35 +01:00
John Spray
a00d4a8d8c
tests: update test_remote_timeline_client_calls_started_metric for deletion queue
2023-08-18 12:44:35 +01:00
John Spray
43c9a09d8f
tests: update remote storage test for deletion queue
2023-08-18 12:44:35 +01:00
John Spray
3edd7ece40
deletion queue: improve frontend retry
2023-08-18 12:44:35 +01:00
John Spray
504fe9c2b0
pageserver: send timeline deletions through the deletion queue
2023-08-18 12:44:35 +01:00
John Spray
10df237a81
deletion queue: add push for generic objects (layers and garbage)
2023-08-18 12:44:35 +01:00
John Spray
d40f8475a5
Error metric and retries
2023-08-18 12:44:35 +01:00
John Spray
164f916a40
Spawn deletion workers with info spans
2023-08-18 12:44:35 +01:00
John Spray
4ebc29768c
Add failpoint for deletion execution
2023-08-18 12:44:35 +01:00
John Spray
bae62916dc
pageserver/http: add /v1/deletion_queue/flush_execute
...
This is principally for testing, but might be useful in
the field if we want to e.g. flush a deletion queue
before running an external scrub tool
2023-08-18 12:44:35 +01:00
John Spray
5e2b8b376c
utils: add ApiError::ShuttingDown
...
So that handlers that check their CancellationToken
explicitly can map it to a set http status.
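
A small sketch of the idea, assuming a trimmed-down error enum. Only the variant name `ShuttingDown` comes from the commit; the other variant, the 503 status, and `flush_handler` are assumptions, and an `AtomicBool` stands in for the CancellationToken a real handler would check.

```rust
use std::sync::atomic::{AtomicBool, Ordering};

/// Hypothetical, trimmed-down error type; the variant set and the status
/// codes chosen here are assumptions.
#[derive(Debug, PartialEq)]
enum ApiError {
    NotFound(String),
    ShuttingDown,
}

impl ApiError {
    /// Each variant maps to one fixed HTTP status.
    fn status_code(&self) -> u16 {
        match self {
            ApiError::NotFound(_) => 404,
            // Assumption: 503 signals that the request may be retried later.
            ApiError::ShuttingDown => 503,
        }
    }
}

/// Stand-in for a handler that checks for cancellation explicitly before
/// starting work.
fn flush_handler(cancelled: &AtomicBool) -> Result<&'static str, ApiError> {
    if cancelled.load(Ordering::Relaxed) {
        return Err(ApiError::ShuttingDown);
    }
    Ok("flushed")
}

fn main() {
    let cancelled = AtomicBool::new(false);
    assert_eq!(flush_handler(&cancelled), Ok("flushed"));

    // Once shutdown has been requested, the handler reports it as a typed error.
    cancelled.store(true, Ordering::Relaxed);
    let err = flush_handler(&cancelled).unwrap_err();
    assert_eq!(err.status_code(), 503);
    assert_eq!(ApiError::NotFound("timeline".to_string()).status_code(), 404);
}
```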
2023-08-18 12:44:35 +01:00
John Spray
54ec7919b8
pageserver: add deletion queue submitted/executed metrics
2023-08-18 12:44:35 +01:00
John Spray
e0bed0732c
Tweak deletion queue constants
2023-08-18 12:44:35 +01:00
John Spray
9e92121cc3
pageserver: flush deletion queue on clean shutdown
2023-08-18 12:44:35 +01:00
John Spray
50a9508f4f
clippy
2023-08-18 12:44:35 +01:00
John Spray
f61402be24
pageserver: testing for deletion queue
2023-08-18 12:44:35 +01:00
John Spray
975e4f2235
Refactor deletion worker construction
2023-08-18 12:44:35 +01:00
John Spray
537eca489e
Implement flush_execute() in deletion queue
2023-08-18 12:44:35 +01:00
John Spray
de4882886e
pageserver: implement batching in deletion queue
2023-08-18 12:44:35 +01:00
John Spray
6982288426
pageserver: implement frontend of deletion queue
2023-08-18 12:44:35 +01:00
John Spray
ccfcfa1098
remote_storage: implement Serialize/Deserialize for RemotePath
2023-08-18 12:44:35 +01:00
John Spray
e2c793c897
Use deletion queue in schedule_layer_file_deletion
2023-08-18 12:44:33 +01:00
John Spray
0fdc492aa4
Add MockDeletionQueue for unit tests
2023-08-18 11:25:40 +01:00
John Spray
787b099541
wire deletion queue into timeline
2023-08-18 11:25:40 +01:00
John Spray
3af693749d
pageserver: wire deletion queue through to Tenant
2023-08-18 11:25:40 +01:00
John Spray
6f9ae6bb5f
pageserver: instantiate deletion queue at process scope
2023-08-18 11:25:40 +01:00
John Spray
16d77dcb73
Initial stub implementation of deletion queue
2023-08-18 11:25:40 +01:00
Joonas Koivunen
67af24191e
test: cleanup remote_timeline_client tests (#5013)
...
I will have to change these as I change the remote_timeline_client API in
#4938. So this is a bit of cleanup, handling my comments which were just
resolved during initial review.
Cleanup:
- use unwrap in tests instead of mixed `?` and `unwrap`
- use `Handle` instead of `&'static Reactor` to make the
RemoteTimelineClient more natural
- use arrays in tests
- use plain `#[tokio::test]`
2023-08-17 19:27:30 +03:00
Joonas Koivunen
6af5f9bfe0
fix: format context (#5022)
...
We return an error with unformatted `{timeline_id}`.
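
To illustrate the class of bug, here is a minimal sketch with hypothetical function names and message text; the actual call site is not shown in this log. In Rust, a placeholder such as `{timeline_id}` is only interpolated when the string goes through a formatting macro like `format!`.

```rust
/// The bug: a plain string literal is never interpolated, so the braces and
/// the variable name end up in the message verbatim.
fn unformatted_context(_timeline_id: u32) -> String {
    "timeline {timeline_id} not found".to_string()
}

/// The fix: `format!` captures the variable named inside the braces.
fn formatted_context(timeline_id: u32) -> String {
    format!("timeline {timeline_id} not found")
}

fn main() {
    assert_eq!(unformatted_context(42), "timeline {timeline_id} not found");
    assert_eq!(formatted_context(42), "timeline 42 not found");
    println!("{}", formatted_context(42));
}
```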
2023-08-17 14:30:25 +00:00
Dmitry Rodionov
64fc7eafcd
Increase timeout once again. (#5021)
...
When the failpoint is early in the deletion process, deletion takes longer
to complete after the failpoint is removed.
Example was: https://neon-github-public-dev.s3.amazonaws.com/reports/main/5889544346/index.html#suites/3556ed71f2d69272a7014df6dcb02317/49826c68ce8492b1
2023-08-17 15:37:28 +03:00
Conrad Ludgate
3e4710c59e
proxy: add more sasl logs (#5012)
...
## Problem
A customer is having trouble connecting to neon from their production
environment. The logs show a mix of "Internal error" and "authentication
protocol violation" but not the full error
## Summary of changes
Make sure we don't miss any logs during SASL/SCRAM
2023-08-17 12:05:54 +01:00
Dmitry Rodionov
d8b0a298b7
Do not attach deleted tenants (#5008)
...
A rather temporary solution until the proper one:
https://github.com/neondatabase/neon/issues/5006
It requires more plumbing, so let's not attach deleted tenants first and
then implement resume.
Additionally fix `assert_prefix_empty`. It had a buggy prefix calculation,
and since we always asserted for absence of stuff it worked. Here I
started to assert for presence of stuff too and it failed. Added more
"presence" asserts to other places to be confident that it works.
Resolves [#5016](https://github.com/neondatabase/neon/issues/5016)
2023-08-17 13:46:49 +03:00
Alexander Bayandin
c8094ee51e
test_compatibility: run amcheck unconditionally (#4985)
...
## Problem
The previous version of neon (that we use in the forward compatibility test)
now has the `amcheck` extension installed. We can run `pg_amcheck`
unconditionally.
## Summary of changes
- Run `pg_amcheck` in compatibility tests unconditionally
2023-08-17 11:46:00 +01:00