## Problem
`pytest` 6 truncates error messages and this is not configured.
It's fixed in `pytest` 7, it prints the whole message (truncating limit
is higher) if `--verbose` is set (it's set on CI).
## Summary of changes
- `pytest` and `pytest` plugins are updated to their latest versions
- linters (`black` and `ruff`) are updated to their latest versions
- `mypy` and types are updated to their latest versions, new warnings
are fixed
- while we're here, allure updated its latest version as well
In the v0.6.0 release, vm-builder was changed to be Neon-specific, so
it's handling all the stuff that Dockerfile.vm-compute-node used to do.
This commit bumps vm-builder to v0.7.3-alpha3.
## Problem
`osgeo.org` is experiencing some problems with DNS resolving which
breaks `compute-node-image` (because it can't download postgis)
## Summary of changes
- Add `140.211.15.30 download.osgeo.org` to /etc/hosts by passing it via
the container option
- Group tests by Postgres version
- Merge different build types
- Add a command to GitHub comment on how to rerun all failed tests
(different command for different Postgres versions)
- Restore a link to a test report in the build summary
## Problem
`neondatabase/zenith-coverage-data` is too big:
- It takes ~6 minutes to clone and push the repo
- GitHub fails to publish an HTML report to github.io
Part of https://github.com/neondatabase/neon/issues/3543
## Summary of changes
Replace pushing code coverage report to
`neondatabase/zenith-coverage-data` with uploading it to S3
## Problem
Compatibility tests don't support Postgres 15 yet, but we're still
trying to upload compatibility snapshot (which we do not collect).
Ref
https://github.com/neondatabase/neon/actions/runs/4991394158/jobs/8940369368#step:4:38129
## Summary of changes
Add `pg_version` parameter to `run-python-test-set` actions and do not
upload compatibility snapshot for Postgres 15
Should fix flakiness caused by the error
```
FATAL: could not resize shared memory segment "/PostgreSQL.3944613150" to 1048576 bytes: No space left on device
```
This PR adds tests runs on Postgres 15 and created unified Allure report
with results for all tests.
- Split `.github/actions/allure-report` into
`.github/actions/allure-report-store` and
`.github/actions/allure-report-generate`
- Add debug or release pytest parameter for all tests (depending on
`BUILD_TYPE` env variable)
- Add Postgres version as a pytest parameter for all tests (depending on
`DEFAULT_PG_VERSION` env variable)
- Fix `test_wal_restore` and `restore_from_wal.sh` to support path with
`[`/`]` in it (fixed by applying spellcheck to the script and fixing all
warnings), `restore_from_wal_archive.sh` is deleted as unused.
- All known failures on Postgres 15 marked with xfail
Follow-up fix after https://github.com/neondatabase/neon/pull/4067
```
+ crane tag neondatabase/vm-compute-node-v14:3064 latest
Error: fetching "neondatabase/vm-compute-node-v14:3064": GET https://index.docker.io/v2/neondatabase/vm-compute-node-v14/manifests/3064: MANIFEST_UNKNOWN: manifest unknown; unknown tag=3064
```
I reverted back the previous approach for promoting images
(login to one registry, save images to local fs, logout and login to
another registry, and push images from local fs). It turns out what
works for one Google project (kaniko), doesn't work for another (crane)
[sigh]
- Update kaniko to 1.9.2 (from 1.7.0), problem with reproducible build is fixed
- Login to ECR and Docker Hub at once, so we can push to several
registries, it makes job `push-docker-hub` unneeded
- `push-docker-hub` replaced with `promote-images` in `needs:` clause,
Pushing images to production ECR moved to `promote-images` job
## Describe your changes
Deploy `main` proxies to the preview environments
We don't deploy storage there yet, as it's tricky.
## Issue ticket number and link
https://github.com/neondatabase/cloud/issues/4737
## Describe your changes
## Issue ticket number and link
## Checklist before requesting a review
- [x] I have performed a self-review of my code.
- [ ] If it is a core feature, I have added thorough tests.
- [ ] Do we need to implement analytics? if so did you add the relevant
metrics to the dashboard?
- [ ] If this PR requires public announcement, mark it with
/release-notes label and add several sentences in this section.
## Checklist before merging
- [ ] Do not forget to reformat commit message to not include the above
checklist
Before this patch, if a tenant would override its eviction_policy
setting to use a lower LayerAccessThreshold::threshold than the
`evictions_low_residence_duration_metric_threshold`, the evictions done
for that tenant would count towards the
`evictions_with_low_residence_duration` metric.
That metric is used to identify pre-mature evictions, commonly triggered
by disk-usage-based eviction under disk pressure.
We don't want that to happen for the legitimate evictions of the tenant
that overrides its eviction_policy.
So, this patch
- moves the setting into TenantConf
- adds test coverage
- updates the staging & prod yamls
Forward Compatibility:
Software before this patch will ignore the new tenant conf field and use
the global one instead.
So we can roll back safely.
Backward Compatibility:
Parsing old configs with software as of this patch will fail in
`PageServerConf::parse_and_validate` with error
`unrecognized pageserver option 'evictions_low_residence_duration_metric_threshold'`
if the option is still present in the global section.
We deal with this by updating the configs in Ansible.
fixes https://github.com/neondatabase/neon/issues/3940
## Describe your changes
Do not forget to process required manual stuff after release
## Issue ticket number and link
## Checklist before requesting a review
- [ ] I have performed a self-review of my code.
- [ ] If it is a core feature, I have added thorough tests.
- [ ] Do we need to implement analytics? if so did you add the relevant
metrics to the dashboard?
- [ ] If this PR requires public announcement, mark it with
/release-notes label and add several sentences in this section.
## Checklist before merging
- [ ] Do not forget to reformat commit message to not include the above
checklist
---------
Co-authored-by: Dmitry Rodionov <dmitry@neon.tech>
We have enabled prefetch by default, let's use this in Nightly
Benchmarks:
- effective_io_concurrency=100 by default (instead of 32)
- maintenance_io_concurrency=100 by default (instead of 32)
Rename `neon-captest-prefetch` to `neon-captest-new` (for pgbench with
initialisation) and `neon-captest-reuse` (for OLAP scenarios)
This allows to skip compatibility tests based on `CHECK_ONDISK_DATA_COMPATIBILITY` environment variable. When the variable is missing (default) compatibility tests wont be run.
## Describe your changes
In https://github.com/neondatabase/cloud/issues/4354 we are making
scheduling of projects based on available disk space and overcommit, so
we need to know disk size and just in case instance type of the
pageserver
## Issue ticket number and link
https://github.com/neondatabase/cloud/issues/4354
## Checklist before requesting a review
- [x] I have performed a self-review of my code.
- [ ] ~If it is a core feature, I have added thorough tests.~
- [ ] ~Do we need to implement analytics? if so did you add the relevant
metrics to the dashboard?~
- [ ] ~If this PR requires public announcement, mark it with
/release-notes label and add several sentences in this section.~