Commit Graph

404 Commits

Author SHA1 Message Date
Andrey Taranik
873b222080 use own arm64 gha runners (#7373)
## Problem

Move from aws based arm64 runners to bare-metal based

## Summary of changes
Changes in GitHub action workflows where `runs-on: arm64` used. More
parallelism added, build time for `neon with extra platform builds`
workflow reduced from 45m to 25m
2024-05-10 11:04:23 +00:00
Alexander Bayandin
51376ef3c8 Add Postgres commit sha to Postgres version (#4603)
## Problem

Ref https://neondb.slack.com/archives/C036U0GRMRB/p1688122168477729

## Summary of changes
- Add sha from postgres repo into postgres version string (via
`--with-extra-version`)
- Add a test that Postgres version matches the expected one
- Remove build-time hard check and allow only related tests to fail
2024-05-07 15:18:17 +00:00
Vlad Lazar
1f417af9fd pagserver: use vectored read path in benchmarks (#7498)
## Problem
Benchmarks don't use the vectored read path.

## Summary of changes
* Update the benchmarks to use the vectored read path for both singular
and vectored gets.
* Disable validation for the benchmarks
2024-04-29 17:26:35 +01:00
Vlad Lazar
e4a279db13 pageserver: coalesce read paths (#7477)
## Problem
We are currently supporting two read paths. No bueno.

## Summary of changes
High level: use vectored read path to serve get page requests - gated by
`get_impl` config
Low level:
1. Add ps config, `get_impl` to specify which read path to use when
serving get page requests
2. Fix base cached image handling for the vectored read path. This was
subtly broken: previously we
would not mark keys that went past their cached lsn as complete. This is
a self standing change which
could be its own PR, but I've included it here because writing separate
tests for it is tricky.
3. Fork get page to use either the legacy or vectored implementation 
4. Validate the use of vectored read path when serving get page requests
against the legacy implementation.
Controlled by `validate_vectored_get` ps config.
5. Use the vectored read path to serve get page requests in tests (with
validation).

## Note
Since the vectored read path does not go through the page cache to read
buffers, this change also amounts to a removal of the buffer page cache. Materialized page cache
is still used.
2024-04-25 13:29:17 +01:00
Em Sharnoff
35e9fb360b Bump vm-builder v0.23.2 -> v0.28.1 (#7433)
Only one relevant change, from v0.28.0:

- neondatabase/autoscaling#887

Double-checked with `git log neonvm/tools/vm-builder`.
2024-04-21 17:35:01 -07:00
Alexander Bayandin
e8a98adcd0 CI: downgrade docker/setup-buildx-action to v2
- Cleanup part for `docker/setup-buildx-action` started to fail with the following error (for no obvious reason):
```
/nvme/actions-runner/_work/_actions/docker/setup-buildx-action/v3/webpack:/docker-setup-buildx/node_modules/@actions/cache/lib/cache.js:175
            throw new Error(`Path Validation Error: Path(s) specified in the action for caching do(es) not exist, hence no cache is being saved.`);
^
Error: Path Validation Error: Path(s) specified in the action for caching do(es) not exist, hence no cache is being saved.
    at Object.rejected (/nvme/actions-runner/_work/_actions/docker/setup-buildx-action/v3/webpack:/docker-setup-buildx/node_modules/@actions/cache/lib/cache.js:175:1)
    at Generator.next (<anonymous>)
    at fulfilled (/nvme/actions-runner/_work/_actions/docker/setup-buildx-action/v3/webpack:/docker-setup-buildx/node_modules/@actions/cache/lib/cache.js:29:1)
```

- Downgrade `docker/setup-buildx-action` from v3 to v2
2024-04-21 21:10:05 +01:00
Alexander Bayandin
41bb1e42b8 CI(check-build-tools-image): fix getting build-tools image tag (#7402)
## Problem

For PRs, by default, we check out a phantom merge commit (merge a branch
into the main), but using a real branches head when finding `build-tools`
image tag.

## Summary of changes
- Change `COMMIT_SHA` to use `${{ github.sha }}` instead of `${{
github.event.pull_request.head.sha }}` for PRs

## Checklist before requesting a review

- [x] I have performed a self-review of my code.
- [ ] If it is a core feature, I have added thorough tests.
- [ ] Do we need to implement analytics? if so did you add the relevant
metrics to the dashboard?
- [ ] If this PR requires public announcement, mark it with
/release-notes label and add several sentences in this section.

## Checklist before merging

- [ ] Do not forget to reformat commit message to not include the above
checklist
2024-04-17 09:50:58 +01:00
Christian Schwarz
f752c40f58 storage release: stop using no-op deployProxy / deployPgSniRouter (#7382)
As of https://github.com/neondatabase/aws/pull/1264
these options are no-ops.

This PR unblocks removal of the variables in
https://github.com/neondatabase/aws/pull/1263
2024-04-15 15:05:44 +02:00
Alexander Bayandin
99a56b5606 CI(build-build-tools-image): Do not cancel concurrent workflows (#7226)
## Problem

`build-build-tools-image` workflow is designed to be run only in one
example per the whole repository. Currently, the job gets cancelled if a
newer one is scheduled, here's an example:
https://github.com/neondatabase/neon/actions/runs/8419610607

## Summary of changes
- Explicitly set `cancel-in-progress: false` for all jobs that aren't
supposed to be cancelled
2024-04-11 15:23:08 +01:00
Alexander Bayandin
90a8ff55fa CI(benchmarking): Add Sharded Tenant for pgbench (#7186)
## Problem

During Nightly Benchmarks, we want to collect pgbench results for
sharded tenants as well.

## Summary of changes
- Add pre-created sharded project for pgbench
2024-04-02 14:39:24 +01:00
Alexander Bayandin
c52b80b930 CI(deploy): Do not deploy storage controller to preprod for proxy releases (#7269)
## Problem

Proxy release to a preprod automatically triggers a deployment of storage
controller (`deployStorageController=true` by default)

## Summary of changes
- Set `deployStorageController=false` for proxy releases to preprod
- Set explicitly `deployStorageController=true` for storage releases to
preprod and prod
2024-03-28 16:51:45 +00:00
Alexander Bayandin
7f9cc1bd5e CI(trigger-e2e-tests): set e2e-platforms (#7229)
## Problem

We don't want to run an excessive e2e test suite on neonvm if there are
no relevant changes.

## Summary of changes
- Check PR diff and if there are no relevant compute changes (in
`vendor/`, `pgxn/`, `libs/vm_monitor` or `Dockerfile.compute-node`
- Switch job from `small` to `ubuntu-latest` runner to make it possible
to use GitHub CLI
2024-03-27 13:10:37 +00:00
Alexander Bayandin
2668a1dfab CI: deploy release version to a preprod region (#6811)
## Problem

We want to deploy releases to a preprod region first to perform required
checks

## Summary of changes
- Deploy `release-XXX` / `release-proxy-YYY` docker tags to a preprod region
2024-03-22 14:42:10 +00:00
John Spray
7d32af5ad5 .github: apply timeout to pytest regress (#7142)
These test runs usually take 20-30 minutes. if something hangs, we see
actions proceeding for several hours: it's more convenient to have them
time out sooner so that we notice that something has hung faster.
2024-03-15 15:57:01 +00:00
Christian Schwarz
2b0f3549f7 default to tokio-epoll-uring in CI tests & on Linux (#7077)
All of production is using it now as of
https://github.com/neondatabase/aws/pull/1121

The change in `flaky_tests.py` resets the flakiness detection logic.

The alternative would have been to repeat the choice of io engine in
each test name, which would junk up the various test reports too much.

---------

Co-authored-by: Alexander Bayandin <alexander@neon.tech>
2024-03-11 14:35:59 +00:00
Anna Khanova
e1c032fb3c Fix type (#6998)
## Problem

Typo

## Summary of changes

Fix
2024-03-04 13:26:16 +00:00
Alexander Bayandin
6e46204712 CI(deploy): use separate workflow for proxy deploys (#6995)
## Problem

The current implementation of `deploy-prod` workflow doesn't allow to
run parallel deploys on Storage and Proxy.

## Summary of changes
- Call `deploy-proxy-prod` workflow that deploys only Proxy components,
and that can be run in parallel with `deploy-prod` for Storage.
2024-03-04 12:08:44 +00:00
Alexander Bayandin
54586d6b57 CI: create compute-tools image from compute-node image (#6899)
## Problem

We build compute-tools binary twice — in `compute-node` and in
`compute-tools` jobs, and we build them slightly differently:
- `cargo build --locked --profile release-line-debug-size-lto`
(previously in `compute-node`)
- `mold -run cargo build -p compute_tools --locked --release`
(previously in `compute-tools`)

Before:
- compute-node: **6m 34s**
- compute-tools (as a separate job): **7m 47s**

After:
- compute-node: **7m 34s**
- compute-tools (as a separate step, within compute-node job):  **5s**

## Summary of changes
- Move compute-tools image creation to `Dockerfile.compute-node`
- Delete `Dockerfile.compute-tools`
2024-02-28 15:24:35 +00:00
Alexander Bayandin
60a232400b CI(pin-build-tools-image): pass secrets to the job (#6949)
## Problem

`pin-build-tools-image` job doesn't have access to secrets and thus
fails. Missed in the original PR[0]

- [0] https://github.com/neondatabase/neon/pull/6795

## Summary of changes
- pass secrets to `pin-build-tools-image` job
2024-02-28 14:36:17 +00:00
Alexander Bayandin
1d5e476c96 CI: use build-tools image from dockerhub (#6795)
## Problem

Currently, after updating `Dockerfile.build-tools` in a PR, it requires
a manual action to make it `pinned`, i.e., the default for everyone. It
also makes all opened PRs use such images (even created in the PR and
without such changes).
This PR overhauls the way we build and use `build-tools` image (and uses
the image from Docker Hub).

## Summary of changes
- The `neondatabase/build-tools` image gets tagged with the latest
commit sha for the `Dockerfile.build-tools` file
- Each PR calculates the tag for `neondatabase/build-tools`, tries to
pull it, and rebuilds the image with such tag if it doesn't exist.
- Use `neondatabase/build-tools` as a default image
- When running on `main` branch — create a `pinned` tag and push it to
ECR
- Use `concurrency` to ensure we don't build `build-tools` image for the
same commit in parallel from different PRs
2024-02-28 12:38:11 +00:00
Joonas Koivunen
fcb77f3d8f build: add a timeout for test-images (#6942)
normal runtime seems to be 3min, add 20min timeout.
2024-02-28 12:58:13 +02:00
Alexander Bayandin
975786265c CI: Delete GitHub Actions caches once PR is closed (#6900)
## Problem

> Approaching total cache storage limit (9.25 GB of 10 GB Used)
> Least recently used caches will be automatically evicted to limit the
total cache storage to 10 GB. [Learn more about cache
usage.](https://docs.github.com/actions/using-workflows/caching-dependencies-to-speed-up-workflows#usage-limits-and-eviction-policy)

From https://github.com/neondatabase/neon/actions/caches

Some of these caches are from closed/merged PRs.

## Summary of changes
- Add a workflow that deletes caches for closed branches
2024-02-26 18:17:22 +01:00
Alexander Bayandin
94f6b488ed CI(release-proxy): fix a couple missed release-proxy branch handling (#6892)
## Problem

In the original PR[0], I've missed a couple of `release` occurrences
that should also be handled for `release-proxy` branch

- [0] https://github.com/neondatabase/neon/pull/6797

## Summary of changes
- Add handling for `release-proxy` branch to allure report
- Add handling for `release-proxy` branch to e2e tests malts.com
2024-02-23 14:12:09 +00:00
Alexander Bayandin
6f8f7c7de9 CI: Build images using docker buildx instead of kaniko (#6871)
## Problem

To "build" a compute image that doesn't have anything new, kaniko takes
13m[0], docker buildx does it in 5m[1].
Also, kaniko doesn't fully support bash expressions in the Dockerfile
`RUN`, so we have to use different workarounds for this (like `bash -c
...`).

- [0]
https://github.com/neondatabase/neon/actions/runs/8011512414/job/21884933687
- [1]
https://github.com/neondatabase/neon/actions/runs/8008245697/job/21874278162

## Summary of changes
- Use docker buildx to build `compute-node` images
- Use docker buildx to build `neon-image` image
- Use docker buildx to build `compute-tools` image 
- Use docker hub for image cache (instead of ECR)
2024-02-23 12:36:18 +01:00
Alexander Bayandin
2424d90883 CI: Split Proxy and Storage releases (#6797)
## Problem

We want to release Proxy at a different cadence.

## Summary of changes

- build-and-test workflow:
  - Handle the `release-proxy` branch
  - Tag images built on this branch with `release-proxy-XXX` tag
- Trigger deploy workflow with `deployStorage=true` &
`deployStorageBroker=true` on `release` branch
- Trigger deploy workflow with `deployPgSniRouter=true` &
`deployProxy=true` on `release-proxy` branch
- release workflow (scheduled creation of release branch):
- Schedule Proxy releases for Thursdays (a random day to make it
different from Storage releases)
2024-02-22 17:15:18 +01:00
Alexander Bayandin
f2767d2056 CI: run check-permissions before all jobs (#6794)
## Problem
For PRs from external contributors, we're still running `actionlint` and
`neon_extra_builds` workflows (which could fail due to lack of
permissions to secrets).

## Summary of changes
- Extract `check-permissions` job to a separate reusable workflow
- Depend all jobs from `actionlint` and `neon_extra_builds` workflows on
`check-permissions`
2024-02-21 20:32:12 +01:00
Vlad Lazar
5d6083bfc6 pageserver: add vectored get implementation (#6576)
This PR introduces a new vectored implementation of the read path.

The search is basically a DFS if you squint at it long enough.
LayerFringe tracks the next layers to visit and acts as our stack.
Vertices are tuples of (layer, keyspace, lsn range). Continuously
pop the top of the stack (most recent layer) and do all the reads
for one layer at once.

The search maintains a fringe (`LayerFringe`) which tracks all the
layers that intersect the current keyspace being searched. Continuously
pop the top of the fringe (layer with highest LSN) and get all the data
required from the layer in one go.

Said search is done on one timeline at a time. If data is still required for
some keys, then search the ancestor timeline.

Apart from the high level layer traversal, vectored variants have been
introduced for grabbing data from each layer type. They still suffer from
read amplification issues and that will be addressed in a different PR.

You might notice that in some places we duplicate the code for the
existing read path. All of that code will be removed when we switch
the non-vectored read path to proxy into the vectored read path.
In the meantime, we'll have to contend with the extra cruft for the sake
of testing and gentle releasing.
2024-02-21 09:49:46 +00:00
Alexander Bayandin
feb359b459 CI: Update deprecated GitHub Actions (#6822)
## Problem

We use a bunch of deprecated actions.
See https://github.com/neondatabase/neon/actions/runs/7958569728
(Annotations section)

```
Node.js 16 actions are deprecated. Please update the following actions to use Node.js 20: actions/checkout@v3, actions/setup-java@v3, actions/cache@v3, actions/github-script@v6. For more information see: https://github.blog/changelog/2023-09-22-github-actions-transitioning-from-node-16-to-node-20/.
```

## Summary of changes
- `actions/cache@v3` -> `actions/cache@v4`
- `actions/checkout@v3` -> `actions/checkout@v4`
- `actions/github-script@v6` -> `actions/github-script@v7`
- `actions/setup-java@v3` -> `actions/setup-java@v4`
- `actions/upload-artifact@v3` -> `actions/upload-artifact@v4`
2024-02-19 21:46:22 +00:00
Alexander Bayandin
4d2bf55e6c CI: temporary disable coverage report for regression tests (#6798)
## Problem

The merging coverage data step recently started to be too flaky.
This failure blocks staging deployment and along with the flakiness of
regression tests might require 4-5-6 manual restarts of a CI job.

Refs:
- https://github.com/neondatabase/neon/issues/4540
- https://github.com/neondatabase/neon/issues/6485
- https://neondb.slack.com/archives/C059ZC138NR/p1704131143740669

## Summary of changes
- Disable code coverage report for functional tests
2024-02-19 11:07:27 +00:00
Heikki Linnakangas
1af047dd3e Fix typo in CI message (#6749) 2024-02-15 14:34:19 +02:00
Christian Schwarz
8b8ff88e4b GH actions: label to disable CI runs completely (#6677)
I don't want my very-early-draft PRs to trigger any CI runs.
So, add a label `run-no-ci`, and piggy-back on the `check-permissions` job.
2024-02-12 15:25:33 +00:00
Abhijeet Patil
75f1a01d4a Optimise e2e run (#6513)
## Problem
We have finite amount of runners and intermediate results are often
wanted before a PR is ready for merging. Currently all PRs get e2e tests
run and this creates a lot of throwaway e2e results which may or may not
get to start or complete before a new push.

## Summary of changes

1. Skip e2e test when PR is in draft mode
2. Run e2e when PR status changes from draft to ready for review (change
this to having its trigger in below PR and update results of build and
test)
3. Abstract e2e test in a Separate workflow and call it from the main
workflow for the e2e test
5. Add a label, if that label is present run e2e test in draft
(run-e2e-test-in-draft)
6. Auto add a label(approve to ci) so that all the external contributors
PR , e2e run in draft
7. Document the new label changes and the above behaviour

Draft PR  : https://github.com/neondatabase/neon/actions/runs/7729128470
Ready To Review :
https://github.com/neondatabase/neon/actions/runs/7733779916
Draft PR with label :
https://github.com/neondatabase/neon/actions/runs/7725691012/job/21062432342
and https://github.com/neondatabase/neon/actions/runs/7733854028

## Checklist before requesting a review

- [x] I have performed a self-review of my code.
- [ ] If it is a core feature, I have added thorough tests.
- [ ] Do we need to implement analytics? if so did you add the relevant
metrics to the dashboard?
- [ ] If this PR requires public announcement, mark it with
/release-notes label and add several sentences in this section.

## Checklist before merging

- [ ] Do not forget to reformat commit message to not include the above
checklist

---------

Co-authored-by: Alexander Bayandin <alexander@neon.tech>
2024-02-07 16:14:10 +00:00
Alexander Bayandin
f4cc7cae14 CI(build-tools): Update Python from 3.9.2 to 3.9.18 (#6615)
## Problem

We use an outdated version of Python (3.9.2)

## Summary of changes
- Update Python to the latest patch version (3.9.18)
- Unify the usage of python caches where possible
2024-02-06 20:30:43 +00:00
Alexander Bayandin
e65f0fe874 CI(benchmarks): make job split consistent across reruns (#6614)
## Problem

We've got several issues with the current `benchmarks` job setup:
- `benchmark_durations.json` file (that we generate in runtime to
split tests into several jobs[0]) is not consistent between these
jobs (and very not consistent with the file if we rerun the job). I.e.
test selection for each job can be different, which could end up in
missed tests in a test run.
- `scripts/benchmark_durations` doesn't fetch all tests from the
database (it doesn't expect any extra directories inside
`test_runner/performance`)
- For some reason, currently split into 4 groups ends up with the 4th
group has no tests to run, which fails the job[1]

- [0] https://github.com/neondatabase/neon/pull/4683
- [1] https://github.com/neondatabase/neon/issues/6629

## Summary of changes
- Generate `benchmark_durations.json` file once before we start
`benchmarks` jobs (this makes it consistent across the jobs) and pass
the file content through the GitHub Actions input (this makes it
consistent for reruns)
- `scripts/benchmark_durations` fix SQL query for getting all required
tests
- Split benchmarks into 5 jobs instead of 4 jobs.
2024-02-06 17:00:55 +00:00
Joonas Koivunen
bb92721168 build: migrate check-style-rust to small runners (#6588)
We have more small runners than large runners, and often a shortage of
large runners. Migrate `check-style-rust` to run on small runners.
2024-02-06 15:53:04 +00:00
Em Sharnoff
d820d64e38 Bump vm-builder v0.21.0 -> v0.23.2 (#6480)
Relevant changes were all from v0.23.0:

- neondatabase/autoscaling#724
- neondatabase/autoscaling#726
- neondatabase/autoscaling#732

Co-authored-by: Alexander Bayandin <alexander@neon.tech>
2024-02-02 22:39:20 +00:00
Alexander Bayandin
30c9e145d7 check-macos-build: switch job to macos-14 (M1) (#6539)
## Problem
- GitHub made available `macos-14` runners, and they run on M1
processors[0]
- The price is the same as Intel-based runners — "macOS | 3 or 4 (M1 or
Intel) | $0.08"[1], but runners on Apple Silicon should be significantly 
faster than their Intel counterparts.
- Most developers who use macOS use Apple Silicon-based Macs nowadays.

- [0] https://github.blog/changelog/2024-01-30-github-actions-introducing-the-new-m1-macos-runner-available-to-open-source/
- [1] https://docs.github.com/en/billing/managing-billing-for-github-actions/about-billing-for-github-actions#per-minute-rates

## Summary of changes
- Run `check-macos-build` on `macos-14`
2024-02-02 10:51:20 +00:00
Alexander Bayandin
fa52cd575e Remove old tests results and old coverage collection (#6376)
## Problem
We have switched to new test results and new coverage results, so no
need to collect these data in old formats.

## Summary of changes
- Remove "Upload coverage report" for old coverage report
- Remove "Store Allure test stat in the DB" for old test results format
2024-02-01 13:36:55 +00:00
Christian Schwarz
3a82430432 fixup(#6492): also switch the benchmarks that runs on merge-to-main back to std-fs (#6501) 2024-01-28 00:15:11 +01:00
Arpad Müller
734755eaca Enable nextest retries for the arm build (#6496)
Also make the NEXTEST_RETRIES declaration more local.

Requested in https://github.com/neondatabase/neon/pull/6493#issuecomment-1912110202
2024-01-27 05:16:11 +01:00
Christian Schwarz
e34166a28f CI: switch back to std-fs io engine for soak time before next release (#6492)
PR #5824 introduced the concept of io engines in pageserver and
implemented `tokio-epoll-uring` in addition to our current method,
`std-fs`.

We used `tokio-epoll-uring` in CI for a day to get more exposure to
the code.  Now it's time to switch CI back so that we test with `std-fs`
as well, because that's what we're (still) using in production.
2024-01-26 22:48:34 +01:00
Alexander Bayandin
4c245b0f5a update_build_tools_image.yml: Push build-tools image to Docker Hub (#6481)
## Problem

- `docker.io/neondatabase/build-tools:pinned` image is frequently
outdated on Docker Hub because there's no automated way to update it.
- `update_build_tools_image.yml` workflow contains legacy roll-back
logic, which is not required anymore because it updates only a single
image.

## Summary of changes
- Make `update_build_tools_image.yml` workflow push images to both ECR
and Docker Hub
- Remove unneeded roll-back logic
2024-01-26 16:12:49 +00:00
Christian Schwarz
918b03b3b0 integrate tokio-epoll-uring as alternative VirtualFile IO engine (#5824) 2024-01-26 09:25:07 +01:00
Alexander Bayandin
d36623ad74 CI: cancel old e2e-tests on new commits (#6463)
## Problem

Triggered `e2e-tests` job is not cancelled along with other jobs in a PR
if the PR get new commits. We can improve the situation by setting
`concurrency_group` for the remote workflow
(https://github.com/neondatabase/cloud/pull/9622 adds
`concurrency_group` group input to the remote workflow).

Ref https://neondb.slack.com/archives/C059ZC138NR/p1706087124297569

Cloud's part added in https://github.com/neondatabase/cloud/pull/9622

## Summary of changes
- Set `concurrency_group` parameter when triggering `e2e-tests`
- At the beginning of a CI pipeline, trigger Cloud's
`cancel-previous-in-concurrency-group.yml` workflow which cancels
previously triggered e2e-tests
2024-01-25 19:25:29 +00:00
Arpad Müller
d52b81340f S3 based recovery (#6155)
Adds a new `time_travel_recover` function to the `RemoteStorage` trait
that allows time travel like functionality for S3 buckets, regardless of
their content (it is not even pageserver related). It takes a different
approach from [this
post](https://aws.amazon.com/blogs/storage/point-in-time-restore-for-amazon-s3-buckets/)
that is more complicated.

It takes as input a prefix a target timestamp, and a limit timestamp:

* executes [`ListObjectVersions`](https://docs.aws.amazon.com/AmazonS3/latest/API/API_ListObjectVersions.html)
* obtains the latest version that comes before the target timestamp
* copies that latest version to the same prefix
* if there is versions newer than the limit timestamp, it doesn't do
anything for the file

The limit timestamp is meant to be some timestamp before the start of
the recovery operation and after any changes that one wants to revert.
For example, it might be the time point after a tenant was detached from
all involved pageservers. The limiting mechanism ensures that the
operation is idempotent and can be retried without causing additional
writes/copies.

The approach fulfills all the requirements laid out in 8233, and is a
recoverable operation. Nothing is deleted permanently, only new entries
added to the version log.

I also enable [nextest retries](https://nexte.st/book/retries.html) to
help with some general S3 flakiness (on top of low level retries).

Part of https://github.com/neondatabase/cloud/issues/8233
2024-01-25 18:23:18 +01:00
Cihan Demirci
d34adf46b4 do not provide disclaimer input for the deploy-prod workflow (#6360)
We've removed this input from the deploy-prod workflow.
2024-01-15 16:15:34 +00:00
Alexander Bayandin
7de829e475 test_runner: replace black with ruff format (#6268)
## Problem

`black` is slow sometimes, we can replace it with `ruff format` (a new
feature in 0.1.2 [0]), which produces pretty similar to black style [1].

On my local machine (MacBook M1 Pro 16GB):
```
# `black` on main
$ hyperfine "BLACK_CACHE_DIR=/dev/null poetry run black ."
Benchmark 1: BLACK_CACHE_DIR=/dev/null poetry run black .
  Time (mean ± σ):      3.131 s ±  0.090 s    [User: 5.194 s, System: 0.859 s]
  Range (min … max):    3.047 s …  3.354 s    10 runs
```
```
# `ruff format` on the current PR
$ hyperfine "RUFF_NO_CACHE=true poetry run ruff format"      
Benchmark 1: RUFF_NO_CACHE=true poetry run ruff format
  Time (mean ± σ):     300.7 ms ±  50.2 ms    [User: 259.5 ms, System: 76.1 ms]
  Range (min … max):   267.5 ms … 420.2 ms    10 runs
```

## Summary of changes
- Replace `black` with `ruff format` everywhere

- [0] https://docs.astral.sh/ruff/formatter/
- [1] https://docs.astral.sh/ruff/formatter/#black-compatibility
2024-01-05 15:35:07 +00:00
Abhijeet Patil
f28bdb6528 Use nextest for rust unittests (#6223)
## Problem
`cargo test` doesn't support timeouts 
or junit output format

## Summary of changes
- Add `nextest` to `build-tools` image
- Switch `cargo test` with `cargo nextest` on CI
- Set timeout
2023-12-30 13:45:31 +00:00
Arpad Müller
a21b719770 Use neon-github-ci-tests S3 bucket for remote_storage tests (#6216)
This bucket is already used by the pytests. The current bucket
github-public-dev is more meant for longer living artifacts.

slack thread:
https://neondb.slack.com/archives/C039YKBRZB4/p1703124944669009

Part of https://github.com/neondatabase/cloud/issues/8233 / #6155
2023-12-21 17:28:28 +01:00
Alexander Bayandin
1dff98be84 CI: fix build-tools image tag for PRs (#6217)
## Problem

Fix build-tools image tag calculation for PRs.
Broken in https://github.com/neondatabase/neon/pull/6195

## Summary of changes
- Use `pinned` tag instead of `$GITHUB_RUN_ID` if there's no changes in
the dockerfile (and we don't build such image)
2023-12-21 14:55:24 +00:00