Commit Graph

3205 Commits

Author SHA1 Message Date
Konstantin Knizhnik
7d4ebf8485 Update pageserver/src/keyspace.rs
Co-authored-by: Heikki Linnakangas <heikki@neon.tech>
2023-05-16 21:18:20 +03:00
Konstantin Knizhnik
9b9b125d13 Make clippy happy 2023-05-16 21:18:20 +03:00
Konstantin Knizhnik
4d76c2916e Apply black to test_gc_feedback 2023-05-16 21:18:20 +03:00
Konstantin Knizhnik
199771371c Make ruff happy 2023-05-16 21:18:20 +03:00
Konstantin Knizhnik
43187715d6 Add KeySpaceRandomAccum 2023-05-16 21:18:20 +03:00
Konstantin Knizhnik
1e63fc99db Move test_gc_feedback test to performance 2023-05-16 21:18:20 +03:00
Konstantin Knizhnik
7b58f82f7b Move test_gc_feedback test to performance 2023-05-16 21:18:20 +03:00
Konstantin Knizhnik
2af45505b8 test_runner/performance/test_gc_feedback.py 2023-05-16 21:18:20 +03:00
Konstantin Knizhnik
3c5b99b4b9 Rename test_gc_feedback test 2023-05-16 21:18:20 +03:00
Konstantin Knizhnik
8b05a87f75 Add test that no redundant image are generatd if them are wanted by GC 2023-05-16 21:18:20 +03:00
Konstantin Knizhnik
3bc4a7c1e2 Add test that no redundant image are generatd if them are wanted by GC 2023-05-16 21:18:20 +03:00
Konstantin Knizhnik
3d0a51567f Fix KeySpace.add_range 2023-05-16 21:18:20 +03:00
Konstantin Knizhnik
7e6dbc32d1 Update pageserver/src/tenant/timeline.rs
Co-authored-by: Heikki Linnakangas <heikki@neon.tech>
2023-05-16 21:18:20 +03:00
Konstantin Knizhnik
f12f6b0275 Fix pythin style 2023-05-16 21:18:20 +03:00
Konstantin Knizhnik
0fbd85f64b Fix python style violations in test_gc_old_layers.py 2023-05-16 21:18:20 +03:00
Konstantin Knizhnik
af75d59b4c Update pageserver/src/tenant/timeline.rs
Co-authored-by: Joonas Koivunen <joonas@neon.tech>
2023-05-16 21:18:20 +03:00
Konstantin Knizhnik
4618739cb3 Update test_runner/regress/test_gc_old_layers.py
Co-authored-by: Joonas Koivunen <joonas@neon.tech>
2023-05-16 21:18:20 +03:00
Konstantin Knizhnik
f838a11514 Update pageserver/src/keyspace.rs
Co-authored-by: Joonas Koivunen <joonas@neon.tech>
2023-05-16 21:18:20 +03:00
Konstantin Knizhnik
f0fe03ea80 Make clippy happy 2023-05-16 21:18:20 +03:00
Konstantin Knizhnik
be22be7b24 Make clippy happy 2023-05-16 21:18:20 +03:00
Konstantin Knizhnik
451479305e Use KeySpace for passing infirmation about wanted image layers from GC to copaction task 2023-05-16 21:18:20 +03:00
Konstantin Knizhnik
5e690307fb Avoid redundant generation of wanted image layers if such layer already exists beyond GC cutoff horizon 2023-05-16 21:18:20 +03:00
Konstantin Knizhnik
3e6288d7d8 Update pageserver/src/tenant/timeline.rs
Co-authored-by: Joonas Koivunen <joonas@neon.tech>
2023-05-16 21:18:20 +03:00
Konstantin Knizhnik
2d015a1464 Update pageserver/src/tenant/timeline.rs
Co-authored-by: Joonas Koivunen <joonas@neon.tech>
2023-05-16 21:18:20 +03:00
Konstantin Knizhnik
0785d92577 Add isort happy 2023-05-16 21:18:20 +03:00
Konstantin Knizhnik
fcb9bac847 Revert changes in key space partitioning 2023-05-16 21:18:20 +03:00
Konstantin Knizhnik
7a3d6531b8 Revert "fix KeySpace initialization in bench_layer_map.rs"
This reverts commit 63b1fcb813ca5f40a2b1328d4cb6e21646fba69f.
2023-05-16 21:18:20 +03:00
Konstantin Knizhnik
3275305a30 Revert "Split keyspace in partitions without holes"
This reverts commit 02c0e9082f804ccf201fe1cf07eb167b697ea9a3.
2023-05-16 21:18:20 +03:00
Konstantin Knizhnik
0deca452bf Add comments 2023-05-16 21:18:20 +03:00
Konstantin Knizhnik
98de2a6d93 Update test_runner/regress/test_gc_old_layers.py
Co-authored-by: Joonas Koivunen <joonas@neon.tech>
2023-05-16 21:18:20 +03:00
Konstantin Knizhnik
88257b91d7 Update test_runner/regress/test_gc_old_layers.py
Co-authored-by: Joonas Koivunen <joonas@neon.tech>
2023-05-16 21:18:20 +03:00
Konstantin Knizhnik
e8066631a6 Update pageserver/src/tenant/timeline.rs
Co-authored-by: Joonas Koivunen <joonas@neon.tech>
2023-05-16 21:18:20 +03:00
Konstantin Knizhnik
e069c409ef Update pageserver/src/tenant/timeline.rs
Co-authored-by: Joonas Koivunen <joonas@neon.tech>
2023-05-16 21:18:20 +03:00
Konstantin Knizhnik
7f81d57d52 Update pageserver/src/tenant/timeline.rs
Co-authored-by: Joonas Koivunen <joonas@neon.tech>
2023-05-16 21:18:20 +03:00
Konstantin Knizhnik
787c4a8bbb Update pageserver/src/tenant/timeline.rs
Co-authored-by: Joonas Koivunen <joonas@neon.tech>
2023-05-16 21:18:20 +03:00
Konstantin Knizhnik
6ec9922184 Make clippy happy 2023-05-16 21:18:20 +03:00
Konstantin Knizhnik
9b241f29cd Remove sleep at the end of test_gc_old_layers.py 2023-05-16 21:18:20 +03:00
Konstantin Knizhnik
9b418a71ac fix KeySpace initialization in bench_layer_map.rs 2023-05-16 21:18:20 +03:00
Konstantin Knizhnik
1bb8ca0806 Split keyspace in partitions without holes 2023-05-16 21:18:20 +03:00
Konstantin Knizhnik
a1c8e74fb9 Add test for GC of stairs layers 2023-05-16 21:18:20 +03:00
Konstantin Knizhnik
f9999c84d9 Rebase with main 2023-05-16 21:18:20 +03:00
Konstantin Knizhnik
c01c31d045 Add comment exlaining wanted_image_layers 2023-05-16 21:18:19 +03:00
Konstantin Knizhnik
4da24ba34f Pass set of wanted image layers from GC to compaction 2023-05-16 21:18:19 +03:00
Alexander Bayandin
a5615bd8ea Fix Allure reports for different benchmark jobs (#4229)
- Fix Allure report generation failure for Nightly Benchmarks
- Fix GitHub Autocomment for `run-benchmarks` label
(`build_and_test.yml::benchmarks` job)
2023-05-15 13:04:03 +01:00
Joonas Koivunen
4a76f2b8d6 upload new timeline index part json before 201 or on retry (#4204)
Await for upload to complete before returning 201 Created on
`branch_timeline` or when `bootstrap_timeline` happens. Should either of
those waits fail, then on the retried request await for uploads again.
This should work as expected assuming control-plane does not start to
use timeline creation as a wait_for_upload mechanism.

Fixes #3865, started from
https://github.com/neondatabase/neon/pull/3857/files#r1144468177

Co-authored-by: Heikki Linnakangas <heikki@neon.tech>
2023-05-15 14:16:43 +03:00
Shany Pozin
9cd6f2ceeb Remove duplicated logic in creating TenantConfOpt (#4230)
## Describe your changes

Remove duplicated logic in creating TenantConfOpt in both TryFrom of
TenantConfigRequest and TenantCreateRequest
2023-05-15 10:08:44 +03:00
Heikki Linnakangas
2855c73990 Fix race condition after attaching tenant with branches. (#4170)
After tenant attach, there is a window where the child timeline is
loaded and accepts GetPage requests, but its parent is not. If a
GetPage request needs to traverse to the parent, it needs to wait for
the parent timeline to become active, or it might miss some records on
the parent timeline.

It's also possible that the parent timeline is active, but it hasn't
yet received all the WAL up to the branch point from the safekeeper.
This happens if a pageserver crashes soon after creating a timeline,
so that the WAL leading to the branch point has not yet been uploaded
to remote storage. After restart, the WAL will be re-streamed and
ingested from the safekeeper, but that takes a while. Because of that,
it's not enough to check that the parent timeline is active, we also
need to wait for the WAL to arrive on the parent timeline, just like
at the beginning of GetPage handling. We probably should change the
behavior at create_timeline so that a timeline can only be created
after all the WAL up to the branch point has been uploaded to remote
storage, but that's not currently the case and out of scope for this
PR (see github issue #4218).

@NanoBjorn encountered this while working on tenant migration. After
migrating a tenant with a parent and child branch, connecting to the
child branch failed with an error like:

```
FATAL:  "base/16385" is not a valid data directory
DETAIL:  File "base/16385/PG_VERSION" is missing.
```

This commit adds two tests that reproduce the bug, with slightly
different symptoms.
2023-05-13 10:44:11 +03:00
Christian Schwarz
edcf4d61a4 distinguish imitated from real size::gather_input calls in metrics (#4224)
Before this PR, the gather_inputs() calls made to imitate synthetic size
calculation accesses were accounted towards the real logical size
calculation metric.

This PR forces all callers to declare the cause for making logical size
calculations, making the decision which cause counts towards which
metric explicit.

This is follow-up to

```
commit 1d266a6365
Author: Christian Schwarz <christian@neon.tech>
Date:   Thu May 11 16:09:29 2023 +0200

    logical size calculation metrics: differentiate regular vs imitated (#4197)
```

After merging this patch, I hope to be able to explain why we have ca
30x more "logical size" ops in prod than "imitate logical size" for any
given observation interval.

refs https://github.com/neondatabase/neon/issues/4154
2023-05-12 17:57:33 +00:00
Christian Schwarz
a2a9c598be add counter metric that increases whenever a background loop overruns its period (#4223)
We already have the warn!() log line for this condition. This PR adds a
corresponding metric on which we can have a dedicated alert. Cheaper and
more reliable than alerting on the logs, because, we run into log rate
limits from time to time these days.

refs https://github.com/neondatabase/neon/issues/4222
2023-05-12 19:00:06 +03:00
Alexander Bayandin
bb06d281ea Run regressions tests on both Postgres 14 and 15 (#4192)
This PR adds tests runs on Postgres 15 and created unified Allure report
with results for all tests.

- Split `.github/actions/allure-report` into
`.github/actions/allure-report-store` and
`.github/actions/allure-report-generate`
- Add debug or release pytest parameter for all tests (depending on
`BUILD_TYPE` env variable)
- Add Postgres version as a pytest parameter for all tests (depending on
`DEFAULT_PG_VERSION` env variable)
- Fix `test_wal_restore` and `restore_from_wal.sh` to support path with
`[`/`]` in it (fixed by applying spellcheck to the script and fixing all
warnings), `restore_from_wal_archive.sh` is deleted as unused.
- All known failures on Postgres 15 marked with xfail
2023-05-12 15:28:51 +01:00