29 Commits

Author SHA1 Message Date
a-masterov
59e393aef3 Enable parallel execution of extension tests (#12118)
## Problem
Extension tests were previously run sequentially, resulting in
unnecessary wait time and underutilization of available CPU cores.
## Summary of changes
Tests are now executed in a customizable number of parallel threads
using separate database branches. This reduces overall test time by
approximately 50% (e.g., on my laptop, parallel test lasts 173s, while
sequential one lasts 340s) and increases the load on the pageserver,
providing better test coverage.

---------

Co-authored-by: Alexander Bayandin <alexander@neon.tech>
Co-authored-by: Devin AI <158243242+devin-ai-integration[bot]@users.noreply.github.com>
Co-authored-by: Alexey Masterov <alexey.masterov@databricks.com>
2025-07-08 11:28:39 +00:00
Suhas Thalanki
5f3532970e [compute] fix: background worker that collects installed extension metrics now updates collection interval (#12277)
## Problem

Previously, the background worker that collects the list of installed
extensions across DBs had a timeout set to 1 hour. This cause a problem
with computes that had a `suspend_timeout` > 1 hour as this collection
was treated as activity, preventing compute shutdown.

Issue: https://github.com/neondatabase/cloud/issues/30147

## Summary of changes

Passing the `suspend_timeout` as part of the `ComputeSpec` so that any
updates to this are taken into account by the background worker and
updates its collection interval.
2025-06-30 22:12:37 +00:00
Suhas Thalanki
83069f6ca1 fix: terminate pgbouncer on compute suspend (#12153)
## Problem

PgBouncer does not terminate connections on a suspend:
https://github.com/neondatabase/cloud/issues/16282

## Summary of changes

1. Adds a pid file to store the pid of PgBouncer
2. Terminates connections on a compute suspend

---------

Co-authored-by: Alexey Kondratov <kondratov.aleksey@gmail.com>
2025-06-17 22:56:05 +00:00
a-masterov
e4ca3ac745 Fix codestyle for compute.sh for docker-compose (#12128)
## Problem
The script `compute.sh` had a non-consistent coding style and didn't
follow best practices for modern bash scripts
## Summary of changes
The coding style was fixed to follow best practices.
2025-06-04 15:07:48 +00:00
a-masterov
208cbd52d4 Add postgis to the test image (#11672)
## Problem
We don't currently run tests for PostGIS in our test environment.

## Summary of Changes
- Added PostGIS test support for PostgreSQL v16 and v17
- Configured different PostGIS versions based on PostgreSQL version:
  - PostgreSQL v17: PostGIS 3.5.0
  - PostgreSQL v14/v15/v16: PostGIS 3.3.3
- Added necessary test scripts and configurations

This ensures our PostgreSQL implementation remains compatible with this
widely-used extension.

---------

Co-authored-by: Alexander Bayandin <alexander@neon.tech>
Co-authored-by: Devin AI <158243242+devin-ai-integration[bot]@users.noreply.github.com>
2025-06-04 09:57:31 +00:00
Tristan Partin
cdfa06caad Remove test-images compatibility hack for confirming library load paths (#11927)
This hack was needed for compatiblity tests, but after the compute
release is no longer needed.

Signed-off-by: Tristan Partin <tristan@neon.tech>
2025-05-27 17:33:16 +00:00
Tristan Partin
9971fba584 Properly configure the dynamic loader to load our compiled libraries (#11858)
The first line in /etc/ld.so.conf is:

	/etc/ld.so.conf.d/*

We want to control library load order so that our compiled binaries are
picked up before others from system packages. The previous solution
allowed the system libraries to load before ours.

Part-of: https://github.com/neondatabase/neon/issues/11857

Signed-off-by: Tristan Partin <tristan@neon.tech>
2025-05-12 17:36:07 +00:00
Tristan Partin
79083de61c Remove forward compatibility hacks related to compute_ctl auth (#11621)
These various hacks were needed for the forward compatibility tests.
Enough time has passed since the merge that these are no longer needed.

Signed-off-by: Tristan Partin <tristan@neon.tech>
2025-04-16 23:14:24 +00:00
Tristan Partin
c002236145 Remove compute_ctl authorization bypass if testing feature was enable (#11596)
We want to exercise the authorization middleware in our regression
tests.

Signed-off-by: Tristan Partin <tristan@neon.tech>
2025-04-16 17:54:51 +00:00
Tristan Partin
ff5a527167 Consolidate compute_ctl configuration structures (#11514)
Previously, the structure of the spec file was just the compute spec.
However, the response from the control plane get spec request included
the compute spec and the compute_ctl config. This divergence was
hindering other work such as adding regression tests for compute_ctl
HTTP authorization.

Signed-off-by: Tristan Partin <tristan@neon.tech>
2025-04-11 15:06:29 +00:00
a-masterov
cf9d817a21 Add tests for some extensions currently not covered by the regression tests (#11191)
## Problem
Some extensions do not contain tests, which can be easily run on top of
docker-compose or staging.

## Summary of changes
Added the pg_regress based tests for `pg_tiktoken`, `pgx_ulid`, `pg_rag`
Now they will be run on top of docker-compose, but I intend to adopt
them to be run on top staging in the next PRs
2025-03-19 15:38:33 +00:00
JC Grünhage
2dfff6a2a3 impr(ci): use ghcr.io as the default container registry (#11210)
## Problem
Docker Hub has new rate limits coming up, and to avoid problems coming
with those we're switching to GHCR.

## Summary of changes
- Push images to GHCR initially and distribute them from there
- Use images from GHCR in docker-compose
2025-03-18 11:30:49 +00:00
Tristan Partin
d571553d8a Remove hacks in compute_ctl related to compute ID (#10751) 2025-02-20 17:31:52 +00:00
Tristan Partin
3d143ad799 Unbrick the forward compatibility test failures (#10747)
Since the merge of https://github.com/neondatabase/neon/pull/10523,
forward compatibility tests have been broken everywhere.

Signed-off-by: Tristan Partin <tristan@neon.tech>
2025-02-10 22:22:10 +00:00
Tristan Partin
946da3f7e2 Require --compute-id when running compute_ctl (#10523)
The compute_id will be used when verifying claims sent by the control
plane.

Signed-off-by: Tristan Partin <tristan@neon.tech>

Signed-off-by: Tristan Partin <tristan@neon.tech>
2025-02-10 16:46:20 +00:00
a-masterov
bf6d5e93ba Run tests of the contrib extensions (#10392)
## Problem
We don't test the extensions, shipped with contrib
## Summary of changes
The tests are now running
2025-01-30 19:32:35 +00:00
Fedor Dikarev
de1c35fab3 add retries for apt, wget and curl (#10553)
Ref: https://github.com/neondatabase/cloud/issues/23461

## Problem
> recent CI failure due to apt-get:
```
4.266 E: Failed to fetch http://deb.debian.org/debian/pool/main/g/gcc-10/libgfortran5_10.2.1-6_arm64.deb  Error reading from server - read (104: Connection reset by peer) [IP: 146.75.122.132 80]
```

https://github.com/neondatabase/neon/actions/runs/11144974698/job/30973537767?pr=9186
thinking about if there should be a mirror-selector at the beginning of
the dockerfile so that it uses a debian mirror closer to the build
server?
## Summary of changes
We could consider adding local mirror or proxy and keep it close to our
self-hosted runners.
For now lets just add retries for `apt`, `wget` and `curl`

thanks to @skyzh for reporting that in October 2024, I just finally
found time to take a look here :)
2025-01-29 21:02:54 +00:00
a-masterov
9f81828429 Test extension upgrade compatibility (#10244)
## Problem
We have to test the extensions, shipped with Neon for compatibility
before the upgrade.
## Summary of changes
Added the test for compatibility with the upgraded extensions.
2025-01-29 09:19:11 +00:00
a-masterov
737888e5c9 Remove the tests for pg_anon (#10382)
## Problem
We are removing the `pg_anon` v1 extension from Neon. So we don't need
to test it anymore and can remove the code for simplicity.
## Summary of changes
The code required for testing `pg_anon` is removed.
2025-01-21 19:17:14 +00:00
a-masterov
e226d7a3d1 Fix docker compose with PG17 (#10165)
## Problem
It's impossible to run docker compose with compute v17 due to `pg_anon`
extension which is not supported under PG17.
## Summary of changes
The auto-loading of `pg_anon` is disabled by default
2024-12-17 08:16:54 +00:00
a-masterov
a2a942f93c Add support for the extensions test for Postgres v17 (#9748)
## Problem
The extensions for Postgres v17 are ready but we do not test the
extensions shipped with v17
## Summary of changes
Build the test image based on Postgres v17. Run the tests for v17.

---------

Co-authored-by: Anastasia Lubennikova <anastasia@neon.tech>
2024-12-03 11:25:29 +00:00
Vlad Lazar
35854928d9 pageserver: use identity file as node id authority and remove init command and config-override flags (#7766)
Ansible will soon write the node id to `identity.toml` in the work dir
for new pageservers. On the pageserver side, we read the node id from
the identity file if it is present and use that as the source of truth.
If the identity file is missing, cannot be read, or does not
deserialise, start-up is aborted.
 
This PR also removes the `--init` mode and the `--config-override` flag
from the `pageserver` binary.
The neon_local is already not using these flags anymore.

Ansible still uses them until the linked change is merged & deployed,
so, this PR has to land simultaneously or after the Ansible change due
to that.

Related Ansible change: https://github.com/neondatabase/aws/pull/1322
Cplane change to remove config-override usages:
https://github.com/neondatabase/cloud/pull/13417
Closes: https://github.com/neondatabase/neon/issues/7736
Overall plan:
https://www.notion.so/neondatabase/Rollout-Plan-simplified-pageserver-initialization-f935ae02b225444e8a41130b7d34e4ea?pvs=4

Co-authored-by: Christian Schwarz <christian@neon.tech>
2024-07-23 11:41:12 +01:00
John Spray
c39d5b03e8 pageserver: remove legacy tenant config code, clean up redundant generation none/broken usages (#7947)
## Problem

In https://github.com/neondatabase/neon/pull/5299, the new config-v1
tenant config file was added to hold the LocationConf type. We left the
old config file in place for forward compat, and because running without
generations (therefore without LocationConf) as still useful before the
storage controller was ready for prime-time.

Closes: https://github.com/neondatabase/neon/issues/5388

## Summary of changes

- Remove code for reading and writing the legacy config file
- Remove Generation::Broken: it was unused.
- Treat missing config file on disk as an error loading a tenant, rather
than defaulting it. We can now remove LocationConf::default, and thereby
guarantee that we never construct a tenant with a None generation.
- Update some comments + add some assertions to clarify that
Generation::None is only used in layer metadata, not in the state of a
running tenant.
- Update docker compose test to create tenants with a generation
2024-06-26 19:53:59 +00:00
a-masterov
e27ce38619 Add testing for extensions (#7818)
## Problem

We need automated tests of extensions shipped with Neon to detect
possible problems.

## Summary of changes

A new image neon-test-extensions is added. Workflow changes to test the
shipped extensions are added as well.
Currently, the regression tests, shipped with extensions are in use.
Some extensions, i.e. rum, timescaledb, rdkit, postgis, pgx_ulid, pgtap,
pg_tiktoken, pg_jsonschema, pg_graphql, kq_imcx, wal2json_2_5 are
excluded due to problems or absence of internal tests.

---------

Co-authored-by: Alexander Bayandin <alexander@neon.tech>
Co-authored-by: Heikki Linnakangas <heikki@neon.tech>
2024-06-11 13:07:51 +02:00
Alexander Bayandin
a7f31f1a59 CI: build multi-arch images (#7696)
## Problem

We don't build our docker images for ARM arch, and that makes it harder
to run images on ARM (on MacBooks with Apple Silicon, for example).

## Summary of changes
- Build `neondatabase/neon` for ARM and create a multi-arch image
- Build `neondatabase/compute-node-vXX` for ARM and create a multi-arch
image
- Run `test-images` job on ARM as well
2024-05-22 16:06:05 +01:00
Konstantin Knizhnik
5c88213eaf Logical replication (#5271)
## Problem

See https://github.com/neondatabase/company_projects/issues/111

## Summary of changes

Save logical replication files in WAL at compute and include them in
basebackup at pate server.

## Checklist before requesting a review

- [ ] I have performed a self-review of my code.
- [ ] If it is a core feature, I have added thorough tests.
- [ ] Do we need to implement analytics? if so did you add the relevant
metrics to the dashboard?
- [ ] If this PR requires public announcement, mark it with
/release-notes label and add several sentences in this section.

## Checklist before merging

- [ ] Do not forget to reformat commit message to not include the above
checklist

---------

Co-authored-by: Konstantin Knizhnik <knizhnik@neon.tech>
Co-authored-by: Arseny Sher <sher-ars@yandex.ru>
2023-10-18 16:42:22 +03:00
Heikki Linnakangas
a560b28829 Make new tenant/timeline IDs mandatory in create APIs. (#4304)
We used to generate the ID, if the caller didn't specify it. That's bad
practice, however, because network is never fully reliable, so it's
possible we create a new tenant but the caller doesn't know about it,
and because it doesn't know the tenant ID, it has no way of retrying or
checking if it succeeded. To discourage that, make it mandatory. The web
control plane has not relied on the auto-generation for a long time.
2023-05-26 16:19:36 +03:00
MMeent
e6ec2400fc Enable hot standby PostgreSQL replicas.
Notes:
 - This still needs UI support from the Console
 - I've not tuned any GUCs for PostgreSQL to make this work better
 - Safekeeper has gotten a tweak in which WAL is sent and how: It now
sends zero-ed WAL data from the start of the timeline's first segment up to
the first byte of the timeline to be compatible with normal PostgreSQL
WAL streaming.
 - This includes the commits of #3714 

Fixes one part of https://github.com/neondatabase/neon/issues/769

Co-authored-by: Anastasia Lubennikova <anastasia@neon.tech>
2023-04-27 15:26:44 +02:00
Kirill Bulatov
03695261fc Test storage Docker images (#2767)
Closes https://github.com/neondatabase/neon/issues/2697
Example:
https://github.com/neondatabase/neon/actions/runs/3416774593/jobs/5688394855

Adds a set of tests on the storage Docker images before they are pushed
to the public registries:
* tests that pageserver binary has the correct version string (other
binaries are built with the same library, so it should be enough to test
one)
* tests that the compose file set-up works and all components are able
to start and perform a single SQL query (CREATE TABLE)
2022-11-11 19:42:26 +02:00