Commit Graph

718 Commits

Author SHA1 Message Date
Sergey Melnikov
1254dc7ee2 Fix production deploy: run as root to access docker (#3555) 2023-02-07 15:21:15 +01:00
Sergey Melnikov
959f5c6f40 Do not deploy legacy scram proxy (*.cloud.neon.tech) to the old account (#3546)
We have migrated to the new proxy, which was set up in
https://github.com/neondatabase/neon/pull/3461
2023-02-06 15:51:20 +01:00
Kirill Bulatov
f474495ba0 Publish builds stats that are easy to browse (#3514)
Adds two new tags, `run-extra-build-macos` and `run-extra-build-stats`
to trigger corresponding build jobs on any PR.

On every build for `main` or PR with `run-extra-build-stats` tag, publish a GitHub commit status with the link to the `cargo build --all --release --timings` report.
2023-02-02 11:18:42 +02:00
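A minimal sketch of what publishing such a commit status could look like, assuming GitHub's REST "create a commit status" endpoint (`POST /repos/{owner}/{repo}/statuses/{sha}`). The context name, description, report URL and SHA below are hypothetical placeholders, not values taken from the workflow. The sketch only builds the request; it does not send it.

```python
import json

# States accepted by the commit status endpoint.
VALID_STATES = {"error", "failure", "pending", "success"}


def build_status_request(owner, repo, sha, report_url, state="success"):
    """Return (path, JSON body) for a commit status that links to a build report."""
    if state not in VALID_STATES:
        raise ValueError(f"unsupported state: {state}")
    path = f"/repos/{owner}/{repo}/statuses/{sha}"
    body = {
        "state": state,
        "target_url": report_url,  # link shown next to the commit
        "description": "cargo build --timings report",  # hypothetical text
        "context": "build-stats",  # hypothetical context name
    }
    return path, json.dumps(body)


# Placeholder SHA and URL, for illustration only.
path, body = build_status_request(
    "neondatabase",
    "neon",
    "0123456789abcdef0123456789abcdef01234567",
    "https://example.com/cargo-timing.html",
)
print(path)
print(body)
```

Sending the request would additionally need an authenticated HTTP client, which is left out here.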
Alexander Bayandin
567b71c1d2 Require poetry 1.3; regenerate poetry.lock (#3508)
Ref https://python-poetry.org/blog/announcing-poetry-1.3.0/#new-lock-file-format
2023-02-01 18:11:00 +00:00
Sergey Melnikov
f3dadfb3d0 Confirm that there is an emergency before manual execution of prod deploy workflow (#3507)
![image](https://user-images.githubusercontent.com/7127190/215840037-69eda3ee-920e-4b90-bf7d-aa58f0bdfb50.png)
2023-02-01 16:01:27 +01:00
Sergey Melnikov
847fc566fd Use the same runners/container for old prod deployments as for new prod 2023-01-31 17:40:24 +01:00
Vadim Kharitonov
a7d8bfa631 Fix create release PR 2023-01-31 14:36:04 +01:00
Sergey Melnikov
0806a46c0c Fix production deploy (#3498)
`get_binaries.sh` no longer uses the `RELEASE` environment variable; it
uses only `DOCKER_TAG`.
2023-01-31 13:36:25 +01:00
Sergey Melnikov
5e08b35f53 Fix new deploy workflow (#3492)
Add a 'branch' input to specify the commit for deploy scripts/configs. A commit
can't be passed to the workflow as a ref, and we need to pin configs to a
specific commit for main/release deploys.
Update deploy input descriptions to match the GH interface.
2023-01-30 22:08:00 +01:00
Sergey Melnikov
82cbcb36ab Extract neon deploy jobs into separate workflows (#3424)
Extract deploy jobs from build_and_test.yml to deploy-dev and
deploy-prod workflows.
Add a trigger to run these workflows after Neon is built and tested on the main and
release branches.

This will allow us to redeploy/rollback/patch config without full
rebuild.
2023-01-30 20:10:54 +01:00
Vadim Kharitonov
ec0e641578 Create Release PR: review fixes 2023-01-30 16:15:22 +01:00
Rory de Zoete
7bb13569b3 Switch more jobs to small runner (#3483)
As these jobs don't benefit from additional cores

Co-authored-by: Rory de Zoete <rdezoete@RorysMacStudio.fritz.box>
2023-01-30 14:00:44 +01:00
Vadim Kharitonov
5fc233964a Create release PR 2023-01-30 12:44:48 +01:00
Rory de Zoete
4d291d0e90 Prevent assume error (#3476)
To fix `Error: The requested DurationSeconds exceeds the
MaxSessionDuration set for this role.`

Co-authored-by: Rory de Zoete <rdezoete@Rorys-Mac-Studio.fritz.box>
2023-01-27 19:27:23 +01:00
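The commit message does not say how the error was fixed. As an illustration of the constraint behind it, here is a minimal sketch that clamps a requested session length to the role's `MaxSessionDuration` before calling AssumeRole. The function name and the example values are hypothetical; the 900-second lower bound is the documented minimum for `DurationSeconds`.

```python
STS_MIN_DURATION = 900  # lower bound for DurationSeconds, in seconds


def clamp_duration(requested_seconds, role_max_seconds):
    """Return a DurationSeconds value that the role will accept."""
    return max(STS_MIN_DURATION, min(requested_seconds, role_max_seconds))


# A job asks for 2 hours, but the role only allows 1 hour.
print(clamp_duration(7200, 3600))  # 3600
```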
Rory de Zoete
4718c67c17 Update deploy steps (#3470)
The first one isn't optimal, but because it was requested that the runner run as
non-root (see
https://github.com/neondatabase/runner/pull/1#discussion_r1069909593),
this job will need more significant refactoring. This should unblock the
deployment process.

---------

Co-authored-by: Rory de Zoete <rdezoete@Rorys-Mac-Studio.fritz.box>
2023-01-27 18:05:49 +01:00
Rory de Zoete
8342e9ea6f Update helm job (#3467)
As a follow-up to https://github.com/neondatabase/build/pull/47

Co-authored-by: Rory de Zoete <rdezoete@Rorys-Mac-Studio.fritz.box>
2023-01-27 13:28:26 +01:00
Rory de Zoete
2388981311 Add cleanup tasks for ansible and helm (#3465)
To fix:

https://github.com/neondatabase/neon/actions/runs/4023027504/jobs/6913421070

https://github.com/neondatabase/neon/actions/runs/4023027504/jobs/6913421268

Co-authored-by: Rory de Zoete <rdezoete@RorysMacStudio.fritz.box>
2023-01-27 11:20:51 +01:00
Sergey Melnikov
fb721cdfa5 Setup legacy scram proxy in the new account (#3461)
This sets up proxies with the *.cloud.neon.tech certificate in the us-west-2
region of the new account; we are not switching to them here yet.
2023-01-27 11:05:05 +01:00
Sergey Melnikov
2ecd0e1f00 Decommission link proxy from old account (#3454) 2023-01-26 16:18:57 +01:00
Rory de Zoete
b858d70f19 Update promote job (#3455)
To fix errors such as:
`An error occurred (ImageAlreadyExistsException) when calling the
PutImage operation: Image with digest
'sha256:da6d8ad97d84e3aec4e6a240c3a35868b626692ee5d199cdd3fe45d29a8e54df'
and tag 'latest' already exists in the repository with name
'compute-node-v14' in registry with id '369495373322'`

Co-authored-by: Rory de Zoete <rdezoete@RorysMacStudio.fritz.box>
Co-authored-by: Rory de Zoete <rdezoete@Rorys-Mac-Studio.fritz.box>
2023-01-26 14:26:23 +01:00
Rory de Zoete
4bcbb7793d Revert docker hub job (#3453)
Regression fix as permissions aren't configured properly on gen3 for
this job.

Co-authored-by: Rory de Zoete <rdezoete@RorysMacStudio.fritz.box>
2023-01-26 11:30:53 +01:00
Rory de Zoete
cd5732d9d8 Gen3 runners (#3220)
https://github.com/neondatabase/cloud/issues/2738

Co-authored-by: Rory de Zoete <rdezoete@Rorys-Mac-Studio.fritz.box>
Co-authored-by: Rory de Zoete <rdezoete@RorysMacStudio.fritz.box>
2023-01-26 10:46:06 +01:00
Sergey Melnikov
4b8dbea5c1 Add production link proxy to new account (#3444)
This PR sets up the link proxy in the us-east-2 region but does not redirect
the pg.neon.tech DNS name to it.
The old link proxy will be kept for the duration of the migration.
2023-01-25 17:15:56 +01:00
Vadim Kharitonov
00f1f54b7a Leave one Dockerfile 2023-01-25 15:10:45 +01:00
sharnoff
f8e887830a build: Use curl -f on vm-informant download (#3363)
Without this, we can silently fail
2023-01-17 10:38:33 +01:00
sharnoff
5c6a7a17cb Add VM informant to vm-compute-node (#3324)
The general idea is that the VM informant binary is added to the
vm-compute-node images only. `compute_tools` then will run whatever's at
`/bin/vm-informant`, if the path exists.
2023-01-16 07:05:29 -08:00
Alexander Bayandin
c28bfd4c63 Nightly Benchmarks: add user provided example (#3308) 2023-01-12 23:03:21 +00:00
Sergey Melnikov
95bf19b85a Add --atomic to all helm upgrade operations (#3299)
When the number of GitHub Actions workers changes, some jobs get killed.
If helm is killed during an upgrade, the release gets stuck in the pending-upgrade
state. --atomic should initiate an automatic rollback in this case.
2023-01-10 10:05:27 +00:00
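A minimal sketch of how a deploy script might assemble such a command, assuming the standard `helm upgrade` flags `--install`, `--namespace`, `--atomic` and `--timeout`. The release, chart and namespace names are hypothetical and not taken from the workflows. The command is only printed, not executed.

```python
import shlex


def helm_upgrade_command(release, chart, namespace, timeout="5m"):
    """Build a helm upgrade invocation that rolls back on failure."""
    return [
        "helm", "upgrade", "--install", release, chart,
        "--namespace", namespace,
        "--atomic",            # roll back automatically if the upgrade fails
        "--timeout", timeout,  # how long helm waits before giving up
    ]


# Hypothetical release and chart names, for illustration only.
cmd = helm_upgrade_command("example-proxy", "example-repo/example-proxy", "default")
print(shlex.join(cmd))
```

Keeping the flag in one helper makes it harder for a single upgrade step to be added without it.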
Sergey Melnikov
14df37c108 Use GHA environments for gradual prod rollout (#3295)
Each release will wait for manual approval for each region
2023-01-09 20:18:16 +04:00
Sergey Melnikov
93c77b0383 Use GHA environment for per-region deploy approvals on staging (#3293)
Each main deploy will wait for manual approval for each region
2023-01-09 15:40:14 +04:00
Heikki Linnakangas
e9583db73b Remove code and test to generate flamegraph on GetPage requests. (#3257)
It was nice to have and useful at the time, but unfortunately the method
used to gather the profiling data doesn't play nicely with 'async'. PR
#3228 will turn 'get_page_at_lsn' function async, which will break the
profiling support. Let's remove it, and re-introduce some kind of
profiling later, using some different method, if we feel like we need it
again.
2023-01-03 20:11:32 +02:00
Vadim Kharitonov
0b428f7c41 Enable licenses check for 3rd-parties 2023-01-03 15:11:50 +01:00
Sergey Melnikov
c01f92c081 Fully remove old staging deploy (#3191) 2022-12-22 20:09:45 +01:00
Sergey Melnikov
7bc17b373e Fix calculate-deploy-targets (#3189)
Was broken in https://github.com/neondatabase/neon/pull/3180
2022-12-22 16:28:36 +01:00
Sergey Melnikov
5a496d82b0 Do not deploy storage and proxies to old staging (#3180)
We have fully migrated out; these nodes will soon be decommissioned.
2022-12-22 15:37:17 +01:00
Alexander Bayandin
201fedd65c tpch-compare: use rust image instead of rustlegacy (#3182) 2022-12-22 12:40:39 +00:00
Sergey Melnikov
707d1c1c94 Fix vm-compute-image upload to dockerhub (#3181) 2022-12-22 13:34:16 +01:00
Sergey Melnikov
f5f1197e15 Build vm-compute-node images (#3174) 2022-12-22 11:25:56 +01:00
Alexander Bayandin
8d39fcdf72 pgbench-compare: don't run neon-captest-new (#3130)
Do not run Nightly Benchmarks on `neon-captest-new`.
This is a temporary solution to avoid spikes in the storage we consume
during the test run. To collect data for the default instance, we could
run tests weekly (i.e. not daily).
2022-12-16 13:23:36 +00:00
Arseny Sher
70ce01d84d Deploy broker with L4 LB in new env. (#3125)
This seems to fix the issue with missing keepalives.
2022-12-15 22:42:30 +01:00
Sergey Melnikov
827ee10b5a Disable neon-stress deploy (#3093) 2022-12-14 01:51:42 +01:00
Alexander Bayandin
c819b699be Nightly Benchmark: run neon-captest-reuse from staging (#3086)
The project has been migrated (now it is `restless-king-632302`), and
now we should run tests from staging runners.

Test run:
https://github.com/neondatabase/neon/actions/runs/3686865543/jobs/6241367161

Ref https://github.com/neondatabase/cloud/issues/2836
2022-12-13 23:02:45 +00:00
Sergey Melnikov
826214ae56 Force ansible-galaxy to also use local ansible.cfg (#3091) 2022-12-13 21:06:18 +01:00
Sergey Melnikov
b39d6126bb Force ansible to use local ansible.cfg (#3089) 2022-12-13 21:57:39 +03:00
Alexander Bayandin
feb07ed510 deploy (old): replace actions/setup-python@v4 with ansible image (#3081)
Replace actions/setup-python@v4 with the ansible image to fix
```
Version 3.10 was not found in the local cache
Error: The version '3.10' with architecture 'x64' was not found for this operating system.
```
2022-12-13 14:01:29 +00:00
Sergey Melnikov
e5d523c86a Add new us-west-2 region (#3071) 2022-12-13 14:11:40 +01:00
Arseny Sher
544777e86b Fix storage_broker deploy typo. 2022-12-13 10:57:26 +03:00
Arseny Sher
e2ae4c09a6 Put e2e tag back.
32662ff1c4 required running e2e tests on a patched branch of the cloud repo; now
that it is merged, put the tag back.
2022-12-13 09:53:22 +03:00
Rory de Zoete
d1edc8aa00 Deprecate old runner for deploy job (#3070)
As we plan to no longer use them

Co-authored-by: Rory de Zoete <rdezoete@RorysMacStudio.fritz.box>
Co-authored-by: Rory de Zoete <rdezoete@Rorys-Mac-Studio.fritz.box>
2022-12-12 16:55:40 +01:00
Kirill Bulatov
0aa2f5c9a5 Regroup CI testing (#3049)
Part of https://github.com/neondatabase/neon/pull/2410 and
https://github.com/neondatabase/neon/pull/2407

* adds `hashFiles('rust-toolchain.toml')` into Rust cache keys, thus
removing one of the manual steps to do when upgrading rustc
* copies Python and Rust style checks from the `codestyle.yml` workflow
* adjusts shell defaults in the main workflow
* replaces `codestyle.yml` with a `neon_extra_builds.yml` workflow

The new workflow runs on commits to `main` (`codestyle.yml` was run per
PR), and runs two custom builds on GH agents:

* macos-latest, to ensure the entire project compiles on it (no tests
run)

There were no frequent breakages on macOS in our builds, so we can check
it rarely without making every storage PR wait for it to complete.
The updated macOS build now uses release builds, so it should presumably run
a bit faster because there are smaller files to cache between builds.

* ubuntu-latest, without caches, to produce full compilation stats for
Rust builds and upload it as an artifact to GitHub

The old `clippy build --timings` stats were collected from builds that
use caches and incremental compilation, so they could never produce a full
report; that step was removed.
2022-12-12 12:58:55 +02:00
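The effect of adding `hashFiles('rust-toolchain.toml')` to the cache keys can be sketched as follows. This is a simplified analog, not GitHub's implementation: it hashes a single file with SHA-256 and appends the digest to a key prefix. The prefix and the toolchain versions are hypothetical.

```python
import hashlib
import tempfile
from pathlib import Path


def cache_key(prefix, toolchain_file):
    """Derive a cache key that changes whenever the toolchain file changes."""
    digest = hashlib.sha256(Path(toolchain_file).read_bytes()).hexdigest()
    return f"{prefix}-{digest}"


with tempfile.TemporaryDirectory() as tmp:
    toolchain = Path(tmp) / "rust-toolchain.toml"

    # Illustrative toolchain versions, not taken from the repository.
    toolchain.write_text('[toolchain]\nchannel = "1.66.0"\n')
    old_key = cache_key("v1-cargo", toolchain)

    toolchain.write_text('[toolchain]\nchannel = "1.67.0"\n')
    new_key = cache_key("v1-cargo", toolchain)

# A rustc upgrade yields a new key, so stale caches are no longer restored.
print(old_key != new_key)  # True
```

Because the key follows the file contents, upgrading rustc no longer requires bumping a cache version by hand.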