Compare commits

..

182 Commits

Author SHA1 Message Date
Erik Grinaker
377af2c673 pageserver: add DeletionNotify to wait for deletion queue execution 2025-02-12 18:20:39 +01:00
John Spray
9989d8bfae tests: make Workload more determinstic (#10741)
## Problem

Previously, Workload was reconfiguring the compute before each run of
writes, which was meant to be a no-op when nothing changed, but was
actually writing extra data due to an issue being fixed in
https://github.com/neondatabase/neon/pull/10696.

The row counts in tests were too low in some cases, these tests were
only working because of those extra writes that shouldn't have been
happening, and moreover were relying on checkpoints happening.

## Summary of changes

- Only reconfigure compute if the attached pageserver actually changed.
If pageserver is set to None, that means controller is managing
everything, so never reconfigure compute.
- Update tests that wrote too few rows.

---------

Co-authored-by: Alexey Kondratov <kondratov.aleksey@gmail.com>
2025-02-12 12:35:29 +00:00
Heikki Linnakangas
9537829ccd fast_import: Make CPU & memory size configurable (#10709)
The old values assumed that you have at least about 18 GB of RAM
available (shared_buffers=10GB and maintenance_work_mem=8GB). That's a
lot when testing locally. Make it configurable, and make the default
assumption much smaller: 256 MB.

This is nice for local testing, but it's also in preparation for
starting to use VMs to run these jobs. When launched in a VM, the
control plane can set these env variables according to the max size of
the VM.

Also change the formula for how RAM is distributed: use 10% of RAM for
shared_buffers, and 70% for maintenance_work_mem. That leaves a good
amount for misc. other stuff and the OS. A very large shared_buffers
setting won't typically help with bulk loading. It won't help with the
network and I/O of processing all the tables, unless maybe if the whole
database fits in shared buffers, but even then it's not much faster than
using local disk. Bulk loading is all sequential I/O. It also won't help
much with index creation, which is also sequential I/O. A large
maintenance_work_mem can be quite useful, however, so that's where we
put most of the RAM.
2025-02-12 11:43:23 +00:00
Mikhail Kot
2c4c6e6330 fix(neon): Add tests clarifying postgres sigabrt on pageserver unavailability (#10666)
Resolves: https://github.com/neondatabase/neon/issues/5734

When we query pageserver and it's unavailable after some retries,
postgres sigabrt's. This is intended behavior so I've added tests
checking it
2025-02-12 10:52:26 +00:00
Erik Grinaker
71c30e52fa pageserver: properly yield for L0 compaction (#10769)
## Problem

When image compaction yields for L0 compaction, it may not immediately
schedule L0 compaction, because it just goes on to compact the next
pending timeline.

Touches #10694.
Requires #10744.

## Summary of changes

Extend `CompactionOutcome` with `YieldForL0` and `Skipped` variants, and
immediately schedule an L0 compaction pass in the `YieldForL0` case.
2025-02-11 23:43:58 +00:00
Erik Grinaker
6c83ac3fd2 pageserver: do all L0 compaction before image compaction (#10744)
## Problem

Image compaction can starve out L0 compaction if a tenant has several
timelines with L0 debt.

Touches #10694.
Requires #10740.

## Summary of changes

* Add an initial L0 compaction pass, in order of L0 count.
* Add a tenant option `compaction_l0_first` to control the L0 pass
(disabled by default).
* Add `CompactFlags::OnlyL0Compaction` to run an L0-only compaction
pass.
* Clean up the compaction iteration logic.

A later PR will use separate semaphores for the L0 and image compaction
passes to avoid cross-tenant L0 starvation. That PR will also make image
compaction yield if _any_ of the tenant's timelines have pending L0
compaction to further avoid starvation.
2025-02-11 22:08:46 +00:00
Heikki Linnakangas
635b67508b Split utils::http to separate crate (#10753)
Avoids compiling the crate and its dependencies into binaries that don't
need them. Shrinks the compute_ctl binary from about 31MB to 28MB in the
release-line-debug-size-lto profile.
2025-02-11 22:06:53 +00:00
dependabot[bot]
9491154eae build(deps): bump cryptography from 43.0.1 to 44.0.1 in the pip group (#10773)
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: Alexander Bayandin <alexander@neon.tech>
2025-02-11 21:23:17 +00:00
Heikki Linnakangas
b5e09fdaf3 Re-order Dockerfile steps for putting together final compute image (#10736)
Run "apt install" first, and only then COPY the files from the
intermediary build layers to the final image. This way, if you modify
any of the sources that trigger e.g. rebuilding compute_ctl, the "apt
install" step can still be cached.
2025-02-11 20:10:06 +00:00
John Spray
cd51ed2f86 tests: parametrize test_graceful_cluster_restart on AZ count (#10427)
## Problem

In https://github.com/neondatabase/neon/pull/10411 fill logic changes
such that it benefits us to test it with & without AZs set up. I didn't
extend the test inline in that PR because there were overlapping test
changes in flight to add `num_az` parameter.

## Summary of changes

- Parameterise test on AZ count (1 or 2)
- When AZ count is 2, use a different balance check that just asserts
the _tenants_ are balanced (since AZ affinity is chosen on a per-tenant
basis)
2025-02-11 20:09:41 +00:00
Folke Behrens
f62bc28086 proxy: Move binaries into the lib (#10758)
* This way all clippy lints defined in the lib also cover the binary
code.
* It's much easier to detect unused code.
* Fix all discovered lints.
2025-02-11 19:46:23 +00:00
Tristan Partin
da9c101939 Implement a second HTTP server within compute_ctl (#10574)
The compute_ctl HTTP server has the following purposes:

- Allow management via the control plane
- Provide an endpoint for scaping metrics
- Provide APIs for compute internal clients
  - Neon Postgres extension for installing remote extensions
  - local_proxy for installing extensions and adding grants

The first two purposes require the HTTP server to be available outside
the compute.

The Neon threat model is a bad actor within our internal network. We
need to reduce the surface area of attack. By exposing unnecessary
unauthenticated HTTP endpoints to the internal network, we increase the
surface area of attack. For endpoints described in the third bullet
point, we can just run an extra HTTP server, which is only bound to the
loopback interface since all consumers of those endpoints are within the
compute.
2025-02-11 18:02:22 +00:00
Arpad Müller
f7b2293317 Hardlink resident layers during detach ancestor (#10729)
After a detach ancestor operation, we don't want to on-demand download
layers that are already resident. This has shown to impede performance,
sometimes quite a lot (50 seconds:
https://github.com/neondatabase/neon/issues/8828#issuecomment-2643735644)

Fixes #8828.
2025-02-11 16:58:34 +00:00
Arpad Müller
be447ba4f8 Change timeline_offloading setting default to true (#10760)
This changes the default value of the `timeline_offloading` pageserver
and tenant configs to true, now that offloading has been rolled out
without problems.

There is also a small fix in the tenant config merge function, where we
applied the `lazy_slru_download` value instead of `timeline_offloading`.

Related issue: https://github.com/neondatabase/cloud/issues/21353
2025-02-11 16:36:54 +00:00
Christian Schwarz
9247331c67 fix(page_service / batching): smgr op latency metric of dropped responses include flush time (#10756)
# Problem

Say we have a batch of 10 responses to send out.

Then, even with

- #10728

we've still only called observe_execution_end_flush_start for the first
3 responses.

The remaining 7 response timers are still ticking.

When compute now closes the connection, the waiting flush fails with an
error and we `drop()` the remaining 7 responses' smgr op timers. The
`impl Drop for SmgrOpTimer` will observe an execution time that includes
the flush time.

In practice, this is supsected to produce the `+Inf` observations in the
smgr op latency histogram we've seen since the introduction of
pipelining, even after shipping #10728.

refs:
- fixup of https://github.com/neondatabase/neon/pull/10042
- fixup of https://github.com/neondatabase/neon/pull/10728
- fixes https://github.com/neondatabase/neon/issues/10754
2025-02-11 14:05:59 +00:00
John Spray
fcedd10226 tests: temporarily permit a log error (#10752)
## Problem

These tests can encounter a bug in the pageserver read path (#9185)
which occurs under the very specific circumstances that the tests
create, but is very unlikely to happen in the field.

We will fix the bug, but in the meantime let's un-flake the tests.

Related: https://github.com/neondatabase/neon/issues/10720

## Summary of changes

- Permit "could not find data for key" errors in tests affected by #9185
2025-02-11 12:37:09 +00:00
Arpad Müller
a4ea1e53ae Apply Azure SDK patch to periodically load workload identity file (#10415)
The SDK bug https://github.com/Azure/azure-sdk-for-rust/issues/1739 was
originally worked around via #10378, but now upstream has provided a fix
in [this](https://github.com/Azure/azure-sdk-for-rust/pull/1997) PR,
which we've been asked to test.

So this is what this PR is doing: revert #10378 (to make sure we fail if
the bug isn't fixed by the SDK PR), and apply the SDK PR to our fork.

Currently pointing to my local branch to check CI. I'd like to merge the
[SDK fork PR](https://github.com/neondatabase/azure-sdk-for-rust/pull/2)
before merging this to main.
2025-02-11 09:40:22 +00:00
Heikki Linnakangas
c26131c2b3 Link pgbouncer dynamically (#10749)
I don't see the point of static linking, postgres itself and many of the
extensions are already built dynamically.

One reason for the change is that I'm working on bigger changes to start
using systemd in the compute, and as part of that I wanted to add the
--with-systemd configure option to pgbouncer, and there doesn't seem to
be a static version of libsystemd (at least not on Debian).
2025-02-11 07:48:54 +00:00
Andrew Rudenko
4ab18444ec compute_ctl: database_schema should keep process::Child as part of returned value (#10273)
## Problem

/database_schema endpoint returns incomplete output from `pg_dump`

## Summary of changes

The Tokio process was not used properly. The returned stream does not
include `process::Child`, and the process is scheduled to be killed
immediately after the `get_database_schema` call when `cmd` goes out of
scope.

The solution in this PR is to return a special Stream implementation
that retains `process::Child`.
2025-02-11 07:02:13 +00:00
Heikki Linnakangas
98883e4b30 compute_ctl: Use a single tokio runtime (#10743)
compute_ctl is mostly written in synchronous fashion, intended to run in
a single thread. However various parts had become async, and they
launched their own tokio runtimes to run the async code. For example, VM
monitor ran in its own multi-threaded runtime, and apply_spec_sql()
launched another multi-threaded runtime to run the per-database SQL
commands in parallel. In addition to that, a few places used a
current-thread runtime to run async code in the main thread, or launched
a current-thread runtime in a *different* thread to run background
tasks.

Unify the runtimes so that there is only one tokio runtime. It's created
very early at process startup, and the main thread "enters" the runtime,
so that it's always available for tokio::spawn() and runtime.block_on()
calls. All code that needs to run async code uses the same runtime.

The main thread still mostly runs in a synchronous fashion. When it
needs to run async code, it uses rt.block_on().

Spawn fewer additional threads, prefer to spawn tokio tasks instead.
Convert some code that ran synchronously in background threads into
async. I didn't go all the way, though, some background threads are
still spawned.
2025-02-11 00:39:44 +00:00
Tristan Partin
3d143ad799 Unbrick the forward compatibility test failures (#10747)
Since the merge of https://github.com/neondatabase/neon/pull/10523,
forward compatibility tests have been broken everywhere.

Signed-off-by: Tristan Partin <tristan@neon.tech>
2025-02-10 22:22:10 +00:00
Alex Chi Z.
b0c7ee0175 feat(pageserver): better gc_compaction_split heuristics (#10727)
## Problem

close https://github.com/neondatabase/neon/issues/10213

`range_search` only returns the top-most layers that may satisfy the
search, so it doesn't include all layers that might be accessed (the
user needs to recursively call this function). We need to retrieve the
full layer map and find overlaps in order to have a correct heuristics
of the job split.

## Summary of changes

Retrieve all layers and find overlaps instead of doing `range_search`.
The patch also reduces the time holding the layer map read guard.

Signed-off-by: Alex Chi Z <chi@neon.tech>
2025-02-10 19:33:34 +00:00
Erik Grinaker
8c4e94107d pageserver: notify compaction loop at threshold (#10740)
## Problem

The compaction loop currently runs periodically, which can cause it to
wait for up to 20 seconds before starting L0 compaction by default.

Also, when we later separate the semaphores for L0 compaction and image
compaction, we want to give up waiting for the image compaction
semaphore if L0 compaction is needed on any timeline.

Touches #10694.

## Summary of changes

Notify the compaction loop when an L0 flush (on any timeline) exceeds
`compaction_threshold`.

Also do some opportunistic cleanups in the area.
2025-02-10 17:48:09 +00:00
Heikki Linnakangas
c368b0fe14 Use a cache mount to speed up rebuilding compute node image (#10737)
Building the compute rust binaries from scratch is pretty slow, it takes
between 4-15 minutes on my laptop, depending on which compiler flags and
other tricks I use. A cache mount allows caching the dependencies and
incremental builds, which speeds up rebuilding significantly when you
only makes a small change in a source file.
2025-02-10 16:58:29 +00:00
Heikki Linnakangas
aba61a3712 Download awscli in separate layer in Dockerfile, to allow caching (#10733)
The awscli was downloaded at the last stages of the overall compute
image build, which meant that if you modified any part of the build, it
would trigger a re-download of the awscli. That's a bit annoying when
developing locally and rebuilding the compute image repeatedly. Move it
to a separate layer, to cache separately and to avoid the spurious
rebuilds.
2025-02-10 16:48:28 +00:00
Tristan Partin
946da3f7e2 Require --compute-id when running compute_ctl (#10523)
The compute_id will be used when verifying claims sent by the control
plane.

Signed-off-by: Tristan Partin <tristan@neon.tech>

Signed-off-by: Tristan Partin <tristan@neon.tech>
2025-02-10 16:46:20 +00:00
Ivan Efremov
73633e27ed fix(proxy): Log errors from the local proxy in auth-broker (#10659)
Handle errors from local proxy by parsing HTTP response in auth broker
code

Closes [#19476](https://github.com/neondatabase/cloud/issues/19476)
2025-02-10 16:06:13 +00:00
Konstantin Knizhnik
0cf0119751 Add --save_records option to pg_waldump (#10626)
## Problem

Make it possible to dump WAL records in format recognised by walredo
process.
Intended usage:

```
pg_waldump -R 1663/5/16396  -B 771727 000000010000000100000034 --save-records=/tmp/walredo.records
postgres --wal-redo < /tmp/walredo.records > /tmp/page.img
```

## Summary of changes

Related Postgres PRs:
https://github.com/neondatabase/postgres/pull/575
https://github.com/neondatabase/postgres/pull/572

---------

Co-authored-by: Konstantin Knizhnik <knizhnik@neon.tech>
2025-02-10 15:48:03 +00:00
Alex Chi Z.
b37f52fdf1 feat(pageserver): dump read path on missing key error (#10528)
## Problem

helps investigate https://github.com/neondatabase/neon/issues/10482

## Summary of changes

In debug mode and testing mode, we will record all files visited by a
read operation, and print it out when it errors.

---------

Signed-off-by: Alex Chi Z <chi@neon.tech>
2025-02-10 14:25:56 +00:00
Alex Chi Z.
443c8d0b4b feat(pageserver): repartition on L0-L1 boundary (#10548)
## Problem

Reduce the read amplification when doing `repartition`.

## Summary of changes

Compute the L0-L1 boundary LSN and do repartition here.

---------

Signed-off-by: Alex Chi Z <chi@neon.tech>
2025-02-10 14:25:48 +00:00
Alexander Bayandin
2f36bdb218 CI(build-neon): fix duplicated builds (#10731)
## Problem

Parameterising `build-neon` job with `test-cfg` makes it to build
exactly the same thing several times.

See
-
874accd6ed/.github/workflows/_build-and-test-locally.yml (L51-L52)
-
https://github.com/neondatabase/neon/actions/runs/13215068271/job/36893373038

## Summary of changes
- Extract `sanitizers` to a separate input from `test-cfg` and set it
separately
- Don't parametrise `build-neon` with `test-cfg`
2025-02-10 12:29:39 +00:00
Ivan Efremov
e7118213ab impr(proxy): Set TTL for Redis cancellation map keys (#10671)
Use expire() op to set TTL for Redis cancellation key
2025-02-10 10:51:53 +00:00
a-masterov
d204d51faf Fix the upgrade test for pg_jwt by adding the database name (#10738)
## Problem
The upgrade test for pg_jwt does not work correctly.
## Summary of changes
The script for the upgrade test is modified to use the database
`contrib_regression`.
2025-02-10 09:56:46 +00:00
Erik Grinaker
ac55e2dbe5 pageserver: improve tenant housekeeping task (#10725)
# Problem

walredo shutdown is done in the compaction task. Let's move it to tenant
housekeeping.

# Summary of changes

* Rename "ingest housekeeping" to "tenant housekeeping".
* Move walredo shutdown into tenant housekeeping.
* Add a constant `WALREDO_IDLE_TIMEOUT` set to 3 minutes (previously 10x
compaction threshold).
2025-02-08 12:42:55 +00:00
Erik Grinaker
874accd6ed pageserver: misc task cleanups (#10723)
This patch does a bunch of superficial cleanups of `tenant::tasks` to
avoid noise in subsequent PRs. There are no functional changes.

PS: enable "hide whitespace" when reviewing, due to the unindentation of
large async blocks.
2025-02-08 11:02:13 +00:00
Christian Schwarz
6cd3b501ec fix(page_service / batching): smgr op latency metrics includes the flush time of preceding requests (#10728)
Before this PR, if a batch contains N responses, the smgr op latency
reported for response (N-i) would include the time we spent flushing
the preceding requests.

refs:
- fixup of https://github.com/neondatabase/neon/pull/10042
- fixes https://github.com/neondatabase/neon/issues/10674
2025-02-08 09:28:09 +00:00
Christian Schwarz
bf20d78292 fix(page_service): page reconstruct error log does not include shard_id label (#10680)
# Problem

Before this PR, the `shard_id` field was missing when page_service logs
a reconstruct error.

This was caused by batching-related refactorings.

Example from staging:

```
2025-01-30T07:10:04.346022Z ERROR page_service_conn_main{peer_addr=...}:process_query{tenant_id=... timeline_id=...}:handle_pagerequests:request:handle_get_page_at_lsn_request_batched{req_lsn=FFFFFFFF/FFFFFFFF}: error reading relation or page version: Read error: whole vectored get request failed because one or more of the requested keys were missing: could not find data for key  ...
```

# Changes

Delay creation of the handler-specific span until after shard routing

This also avoids the need for the record() call in the pagestream hot
path.

# Testing

Manual testing with a failpoint that is part of this PR's history but
will be squashed away.


# Refs

- fixes https://github.com/neondatabase/neon/issues/10599
2025-02-07 19:45:39 +00:00
Arpad Müller
2656c713a4 Revert recent AWS SDK update (#10724)
We've been seeing some regressions in staging since the AWS SDK updates:
https://github.com/neondatabase/neon/issues/10695 . We aren't sure the
regression was caused by the SDK update, but the issues do involve S3,
so it's not unlikely. By reverting the SDK update we find out whether it
was really the SDK update, or something else.

Reverts the two PRs:

* https://github.com/neondatabase/neon/pull/10588
* https://github.com/neondatabase/neon/pull/10699

https://neondb.slack.com/archives/C08C2G15M6U/p1738576986047179
2025-02-07 17:37:53 +00:00
John Spray
5e95860e70 tests: wait for manifest persistence in test_timeline_archival_chaos (#10719)
## Problem

This test would sometimes fail its assertion that a timeline does not
revert to active once archived. That's because it was using the
in-memory offload state, not the persistent state, so this was sometimes
lost across a pageserver restart.

Closes: https://github.com/neondatabase/neon/issues/10389

## Summary of changes

- When reading offload status, read from pageserver API _and_ remote
storage before considering the timeline offloaded
2025-02-07 16:27:39 +00:00
Heikki Linnakangas
0abff59e97 compute: Allow postgres user to power off the VM (#10710)
I plan to use this when launching a fast_import job in a VM. There's
currently no good way for an executable running in a NeonVM to exit
gracefully and have the VM shut down. The inittab we use always respawns
the payload command. The idea is that the control plane can use
"fast_import ... && poweroff" as the command, so that when fast_import
completes successfully, the VM is terminated, and the k8s Pod and
VirtualMachine object are marked as completed successfully.

I'm working on bigger changes to how we launch VMs, and will try to come
up with a nicer system for that, but in the meanwhile, this quick hack
allows us to proceed with using VMs for one-off jobs like fast_import.
2025-02-07 16:03:01 +00:00
John Spray
9609f7547e tests: address warnings in timeline shutdown (#10702)
## Problem

There are a couple of log warnings tripping up
`test_timeline_archival_chaos`

- `[stopping left-over name="timeline_delete"
tenant_shard_id=2d526292b67dac0e6425266d7079c253
timeline_id=Some(44ba36bfdee5023672c93778985facd9)
kind=TimelineDeletionWorker\n')](https://neon-github-public-dev.s3.amazonaws.com/reports/pr-10672/13161357302/index.html#/testresult/716b997bb1d8a021)`
- `ignoring attempt to restart exited flush_loop
503d8f401d8887cfaae873040a6cc193/d5eed0673ba37d8992f7ec411363a7e3\n')`

Related: https://github.com/neondatabase/neon/issues/10389

## Summary of changes

- Downgrade the 'ignoring attempt to restart' to info -- there's nothing
in the design that forbids this happening, i.e. someone calling
maybe_spawn_flush_loop concurrently with shutdown()
- Prevent timeline deletion tasks outliving tenants by carrying a
gateguard. This logically makes sense because the deletion process does
call into Tenant to update manifests.
2025-02-07 15:29:34 +00:00
Erik Grinaker
d6e87a3a9c pageserver: add separate, disabled compaction semaphore (#10716)
## Problem

L0 compaction can get starved by other background tasks. It needs to be
responsive to avoid read amp blowing up during heavy write workloads.

Touches #10694.

## Summary of changes

Add a separate semaphore for compaction, configurable via
`use_compaction_semaphore` (disabled by default). This is primarily for
testing in staging; it needs further work (in particular to split
image/L0 compaction jobs) before it can be enabled.
2025-02-07 15:11:31 +00:00
Arpad Müller
f5243992fa safekeeper: make timeline deletions a bit more verbose (#10721)
Make timeline deletion print the sub-steps, so that we can narrow down
some stuck timeline deletion issues we are observing.

https://neondb.slack.com/archives/C08C2G15M6U/p1738930694716009
2025-02-07 15:06:26 +00:00
John Spray
95220ba43e tests: fix flaky endpoint in test_ingest_logical_message (#10700)
## Problem

Endpoint kept running while timeline was deleted, causing forbidden
warnings on the pageserver when the tenant is not found.

## Summary of changes

- Explicitly stop the endpoint before the end of the test, so that it
isn't trying to talk to the pageserver in the background while things
are torn down
2025-02-07 14:51:36 +00:00
John Spray
08f92bb916 pageserver: clean up DeletionQueue push_layers_sync (#10701)
## Problem

This is tech debt. While we introduced generations for tenants, some
legacy situations without generations needed to delete things inline
(async operation) instead of enqueing them (sync operation).

## Summary of changes

- Remove the async code, replace calls with the sync variant, and assert
that the generation is always set
2025-02-07 13:03:01 +00:00
Fedor Dikarev
8f651f9582 switch from localtest.me to local.neon.build (#10714)
## Problem
Ref: https://github.com/neondatabase/neon/issues/10632

We use dns named `*.localtest.me` in our test, and that domain is
well-known and widely used for that, with all the records there resolve
to the localhost, both IPv4 and IPv6: `127.0.0.1` and `::1`

In some cases on our runners these addresses resolves only to `IPv6`,
and so components fail to connect when runner doesn't have `IPv6`
address. We suspect issue in systemd-resolved here
(https://github.com/systemd/systemd/issues/17745)
To workaround that and improve test stability, we introduced our own
domain `*.local.neon.build` with IPv4 address `127.0.0.1` only

See full details and troubleshoot log in referred issue.

p.s.
If you're FritzBox user, don't forget to add that domain
`local.neon.build` to the `DNS Rebind Protection` section under `Home
Network -> Network -> Network Settings`, otherwise FritzBox will block
addresses, resolving to the local addresses.
For other devices/vendors, please check corresponding documentation, if
resolving `local.neon.build` will produce empty answer for you.

## Summary of changes
Replace all the occurrences of `localtest.me` with `local.neon.build`
2025-02-07 12:25:16 +00:00
Arseny Sher
b5a239c4ae Add reconciliation details to sk membership change rfc (#10514)
## Problem

RFC pointed out the need of reconciliation, but wasn't detailed how it
can be done.

## Summary of changes

Add these details.
2025-02-07 11:20:49 +00:00
Alexander Lakhin
de05258419 Adjust diesel schema check for build with sanitizers (#10711)
We need to disable the detection of memory leaks when running
``neon_local init` for build with sanitizers to avoid an error thrown by
AddressSanitizer.
2025-02-07 08:56:39 +00:00
Peter Bendel
e73d681a0e Patch pgcopydb and fix another segfault (#10706)
## Problem

Found another pgcopydb segfault in error handling

```bash
2025-02-06 15:30:40.112 51299 ERROR  pgsql.c:2330              [TARGET -738302813] FATAL:  terminating connection due to administrator command
2025-02-06 15:30:40.112 51298 ERROR  pgsql.c:2330              [TARGET -1407749748] FATAL:  terminating connection due to administrator command
2025-02-06 15:30:40.112 51297 ERROR  pgsql.c:2330              [TARGET -2073308066] FATAL:  terminating connection due to administrator command
2025-02-06 15:30:40.112 51300 ERROR  pgsql.c:2330              [TARGET 1220908650] FATAL:  terminating connection due to administrator command
2025-02-06 15:30:40.432 51300 ERROR  pgsql.c:2536              [Postgres] FATAL:  terminating connection due to administrator command
2025-02-06 15:30:40.513 51290 ERROR  copydb.c:773              Sub-process 51300 exited with code 0 and signal Segmentation fault
2025-02-06 15:30:40.578 51299 ERROR  pgsql.c:2536              [Postgres] FATAL:  terminating connection due to administrator command
2025-02-06 15:30:40.613 51290 ERROR  copydb.c:773              Sub-process 51299 exited with code 0 and signal Segmentation fault
2025-02-06 15:30:41.253 51298 ERROR  pgsql.c:2536              [Postgres] FATAL:  terminating connection due to administrator command
2025-02-06 15:30:41.314 51290 ERROR  copydb.c:773              Sub-process 51298 exited with code 0 and signal Segmentation fault
2025-02-06 15:30:43.133 51297 ERROR  pgsql.c:2536              [Postgres] FATAL:  terminating connection due to administrator command
2025-02-06 15:30:43.215 51290 ERROR  copydb.c:773              Sub-process 51297 exited with code 0 and signal Segmentation fault
2025-02-06 15:30:43.215 51290 ERROR  indexes.c:123             Some INDEX worker process(es) have exited with error, see above for details
2025-02-06 15:30:43.215 51290 ERROR  indexes.c:59              Failed to create indexes, see above for details
2025-02-06 15:30:43.232 51271 ERROR  copydb.c:768              Sub-process 51290 exited with code 12
```

```bashadmin@ip-172-31-38-164:~/pgcopydb$ gdb /usr/local/pgsql/bin/pgcopydb core
GNU gdb (Debian 13.1-3) 13.1
Copyright (C) 2023 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.
Type "show copying" and "show warranty" for details.
This GDB was configured as "aarch64-linux-gnu".
Type "show configuration" for configuration details.
For bug reporting instructions, please see:
<https://www.gnu.org/software/gdb/bugs/>.
Find the GDB manual and other documentation resources online at:
    <http://www.gnu.org/software/gdb/documentation/>.

For help, type "help".
Type "apropos word" to search for commands related to "word"...
Reading symbols from /usr/local/pgsql/bin/pgcopydb...
[New LWP 51297]
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/aarch64-linux-gnu/libthread_db.so.1".
Core was generated by `pgcopydb: create index ocr.ocr_pipeline_step_results_version_pkey             '.
Program terminated with signal SIGSEGV, Segmentation fault.
#0  0x0000aaaac3a4b030 in splitLines (lbuf=lbuf@entry=0xffffd8b86930, buffer=<optimized out>) at string_utils.c:630
630				*newLinePtr = '\0';
(gdb) bt
#0  0x0000aaaac3a4b030 in splitLines (lbuf=lbuf@entry=0xffffd8b86930, buffer=<optimized out>) at string_utils.c:630
#1  0x0000aaaac3a3a678 in pgsql_execute_log_error (pgsql=pgsql@entry=0xffffd8b87040, result=result@entry=0x0, 
    sql=sql@entry=0xffff81fe9be0 "CREATE UNIQUE INDEX IF NOT EXISTS ocr_pipeline_step_results_version_pkey ON ocr.ocr_pipeline_step_results_version USING btree (id, transaction_id);", 
    debugParameters=debugParameters@entry=0xaaaaec5f92f0, context=context@entry=0x0) at pgsql.c:2322
#2  0x0000aaaac3a3bbec in pgsql_execute_with_params (pgsql=pgsql@entry=0xffffd8b87040, 
    sql=0xffff81fe9be0 "CREATE UNIQUE INDEX IF NOT EXISTS ocr_pipeline_step_results_version_pkey ON ocr.ocr_pipeline_step_results_version USING btree (id, transaction_id);", paramCount=paramCount@entry=0, 
    paramTypes=paramTypes@entry=0x0, paramValues=paramValues@entry=0x0, context=context@entry=0x0, parseFun=parseFun@entry=0x0) at pgsql.c:1649
#3  0x0000aaaac3a3c468 in pgsql_execute (pgsql=pgsql@entry=0xffffd8b87040, sql=<optimized out>) at pgsql.c:1522
#4  0x0000aaaac3a245f4 in copydb_create_index (specs=specs@entry=0xffffd8b8ec98, dst=dst@entry=0xffffd8b87040, index=index@entry=0xffff81f71800, ifNotExists=<optimized out>) at indexes.c:846
#5  0x0000aaaac3a24ca8 in copydb_create_index_by_oid (specs=specs@entry=0xffffd8b8ec98, dst=dst@entry=0xffffd8b87040, indexOid=<optimized out>) at indexes.c:410
#6  0x0000aaaac3a25040 in copydb_index_worker (specs=specs@entry=0xffffd8b8ec98) at indexes.c:297
#7  0x0000aaaac3a25238 in copydb_start_index_workers (specs=specs@entry=0xffffd8b8ec98) at indexes.c:209
#8  0x0000aaaac3a252f4 in copydb_index_supervisor (specs=specs@entry=0xffffd8b8ec98) at indexes.c:112
#9  0x0000aaaac3a253f4 in copydb_start_index_supervisor (specs=0xffffd8b8ec98) at indexes.c:57
#10 copydb_start_index_supervisor (specs=specs@entry=0xffffd8b8ec98) at indexes.c:34
#11 0x0000aaaac3a51ff4 in copydb_process_table_data (specs=specs@entry=0xffffd8b8ec98) at table-data.c:146
#12 0x0000aaaac3a520dc in copydb_copy_all_table_data (specs=specs@entry=0xffffd8b8ec98) at table-data.c:69
#13 0x0000aaaac3a0ccd8 in cloneDB (copySpecs=copySpecs@entry=0xffffd8b8ec98) at cli_clone_follow.c:602
#14 0x0000aaaac3a0d2cc in start_clone_process (pid=0xffffd8b743d8, copySpecs=0xffffd8b8ec98) at cli_clone_follow.c:502
#15 start_clone_process (copySpecs=copySpecs@entry=0xffffd8b8ec98, pid=pid@entry=0xffffd8b89788) at cli_clone_follow.c:482
#16 0x0000aaaac3a0d52c in cli_clone (argc=<optimized out>, argv=<optimized out>) at cli_clone_follow.c:164
#17 0x0000aaaac3a53850 in commandline_run (command=command@entry=0xffffd8b9eb88, argc=0, argc@entry=22, argv=0xffffd8b9edf8, argv@entry=0xffffd8b9ed48) at /home/admin/pgcopydb/src/bin/pgcopydb/../lib/subcommands.c/commandline.c:71
#18 0x0000aaaac3a01464 in main (argc=22, argv=0xffffd8b9ed48) at main.c:140
(gdb) 

```

The problem is most likely that the following call returned a message in
a read-only memory segment where we cannot replace \n with \0 in
string_utils.c splitLines() function
```C
char *message = PQerrorMessage(pgsql->connection);
```

## Summary of changes

modified the patch to also address this problem
2025-02-06 20:21:18 +00:00
Anastasia Lubennikova
44b905d14b Fix remote extension lookup (#10708)
when library name doesn't match extension name.
The bug was introduced by recent commit ebc55e6a
2025-02-06 19:21:38 +00:00
Arseny Sher
186199f406 Update aws sdk (#10699)
## Problem

We have unclear issue with stuck s3 client, probably after partial aws
sdk update without updating sdk-s3.
https://github.com/neondatabase/neon/pull/10588
Let's try to update s3 as well.

## Summary of changes

Result of running

cargo update -p aws-types -p aws-sigv4 -p aws-credential-types -p
aws-smithy-types -p aws-smithy-async -p aws-sdk-kms -p aws-sdk-iam -p
aws-sdk-s3 -p aws-config

ref https://github.com/neondatabase/neon/issues/10695
2025-02-06 17:28:27 +00:00
OBBO67
82cbab7512 Switch reqlsns[0].request_lsn to arrow operator in neon_read_at_lsnv() (#10620) (#10687)
## Problem

Currently the following line below uses array subscript notation which
is confusing since `reqlsns` is not an array but just a pointer to a
struct.

```
XLogWaitForReplayOf(reqlsns[0].request_lsn);
```

## Summary of changes

Switch from array subscript notation to arrow operator to improve
readability of code.

Close #10620.
2025-02-06 17:26:26 +00:00
Erik Grinaker
2943590694 pageserver: use histogram for background job semaphore waits (#10697)
## Problem

We don't have visibility into how long an individual background job is
waiting for a semaphore permit.

## Summary of changes

* Make `pageserver_background_loop_semaphore_wait_seconds` a histogram
rather than a sum.
* Add a paced warning when a task takes more than 10 minutes to get a
permit (for now).
* Drive-by cleanup of some `EnumMap` usage.
2025-02-06 17:17:47 +00:00
John Spray
df06c41085 tests: don't detach from controller in test_issue_5878 (#10675)
## Problem

This test called NeonPageserver.tenant_detach, which as well as
detaching locally on the pageserver, also updates the storage controller
to put the tenant into Detached mode. When the test runs slowly in debug
mode, it sometimes takes long enough that the background_reconcile loop
wakes up and drops the tenant from memory in response, such that the
pageserver can't validate its deletions and the test does not behave as
expected.

Closes: https://github.com/neondatabase/neon/issues/10513

## Summary of changes

- Call the pageserver HTTP client directly rather than going via
NeonPageserver.tenant_detach
2025-02-06 15:18:50 +00:00
Alexander Bayandin
ddd7c36343 CI(approved-for-ci-run): Use internal CI_ACCESS_TOKEN for cloning repo (#10693)
## Problem

The default `GITHUB_TOKEN` is used to push changes created with
`approved-for-ci-run`, which doesn't work:
```
Run git push --force origin "${BRANCH}"
remote: Permission to neondatabase/neon.git denied to github-actions[bot].
fatal: unable to access 'https://github.com/neondatabase/neon/': The requested URL returned error: 403
```
Ref:
https://github.com/neondatabase/neon/actions/runs/13166108303/job/36746518291?pr=10687

## Summary of changes
- Use `CI_ACCESS_TOKEN` to clone an external repo
- Remove unneeded `actions/checkout`
2025-02-06 14:40:22 +00:00
Peter Bendel
839f41f5bb fix pgcopydb seg fault and -c idle_in_transaction_session_timeout=0 (#10692)
## Problem

During ingest_benchmark which uses `pgcopydb`
([see](https://github.com/dimitri/pgcopydb))we sometimes had outages.
- when PostgreSQL COPY step failed we got a segfault (reported
[here](https://github.com/dimitri/pgcopydb/issues/899))
- the root cause was Neon idle_in_transaction_session_timeout is set to
5 minutes which is suboptimal for long-running tasks like project import
(reported [here](https://github.com/dimitri/pgcopydb/issues/900))

## Summary of changes

Patch pgcopydb to avoid segfault.
override idle_in_transaction_session_timeout and set it to "unlimited"
2025-02-06 14:39:45 +00:00
Alex Chi Z.
f22d41eaec feat(pageserver): num of background job metrics (#10690)
## Problem

We need a metrics to know what's going on in pageserver's background
jobs.

## Summary of changes

* Waiting tasks: task still waiting for the semaphore.
* Running tasks: tasks doing their actual jobs.

---------

Signed-off-by: Alex Chi Z <chi@neon.tech>
Co-authored-by: Erik Grinaker <erik@neon.tech>
2025-02-06 14:39:37 +00:00
Alexander Lakhin
977781e423 Enable sanitizers for postgres v17 (#10401)
Add a build with sanitizers (asan, ubsan) to the CI pipeline and run
tests on it.

See https://github.com/neondatabase/neon/issues/6053

---------

Co-authored-by: Alexander Bayandin <alexander@neon.tech>
2025-02-06 12:53:43 +00:00
Arpad Müller
67b71538d0 Limit returned lsn for timestamp by the planned gc cutoff (#10678)
Often the output of the timestamp->lsn API is used as input for branch
creation, and branch creation takes the planned lsn into account, i.e.
rejects lsn's as branch lsns that are before the planned lsn.

This patch doesn't fix all race conditions, it's still racy. But at
least it is a step into the right direction.

For #10639
2025-02-06 11:17:08 +00:00
Erik Grinaker
f4cfa725b8 pageserver: add a few critical errors (#10657)
## Problem

Following #10641, let's add a few critical errors.

Resolves #10094.

## Summary of changes

Adds the following critical errors:

* WAL sender read/decode failure.
* WAL record ingestion failure.
* WAL redo failure.
* Missing key during compaction.

We don't add an error for missing keys during GetPage requests, since
we've seen a handful of these in production recently, and the cause is
still unclear (most likely a benign race).
2025-02-06 10:30:27 +00:00
Arpad Müller
05326cc247 Skip gc cutoff lsn check at timeline creation if lease exists (#10685)
Right now, branch creation doesn't care if a lsn lease exists or not, it
just fails if the passed lsn is older than either the last or the
planned gc cutoff.

However, if an lsn lease exists for a given lsn, we can actually create
a branch at that point: nothing has been gc'd away.

This prevents race conditions that #10678 still leaves around.

Related: #10639
https://github.com/neondatabase/cloud/issues/23667
2025-02-06 10:10:11 +00:00
Arpad Müller
b66fbd6176 Warn on basebackups for archived timelines (#10688)
We don't want any external requests for an archived timeline. This
includes basebackup requests, i.e. when a compute is being started up.

Therefore, we'd like to forbid such basebackup requests: any attempt to
get a basebackup on an archived timeline (or any getpage request really)
is a cplane bug. Make this a warning for now so that, if there is
potentially a bug, we can detect cases in the wild before they cause
stuck operations, but the intention is to return an error eventually.

Related: #9548
2025-02-06 10:09:20 +00:00
Vlad Lazar
95588dab98 safekeeper: fix wal fan-out shard subscription data race (#10677)
## Problem

[This select
arm](https://github.com/neondatabase/neon/blob/main/safekeeper/src/send_interpreted_wal.rs#L414)
runs when we want to attach a new reader to the current cursor.
It checks the current position of the cursor and resets it if required.

The current position of the cursor is updated in the [other select
arm](https://github.com/neondatabase/neon/blob/main/safekeeper/src/send_interpreted_wal.rs#L336-L345).
That runs when we get some WAL to send.

Now, what happens if we want to attach two shards consecutively to the
cursor?
Let's say [this select
arm](https://github.com/neondatabase/neon/blob/main/safekeeper/src/send_interpreted_wal.rs#L397)
runs twice in a row.

Let's assume cursor is currently at LSN X. First shard wants to attach
at position V
and the other one at W. Assume X > W > V.

First shard resets the stream to position V. Second shard comes in, 
sees stale cursor position X and resets it to W. This means that the 
first shard doesn't get wal in the [V, W) range.

## Summary of changes

Ultimately, this boils down to the current position not being kept in
sync with the reset of the WAL stream. This patch fixes the race by
updating it when resetting the WAL stream and adds a unit test repro.

Closes https://github.com/neondatabase/cloud/issues/23750
2025-02-06 09:24:28 +00:00
Christian Schwarz
1686d9e733 perf(page_service): dont .instrument(span.clone()) the response flush (#10686)
On my AX102 Hetzner box, removing this line removes about 20us from the
`latency_mean` result in

`test_pageserver_characterize_latencies_with_1_client_and_throughput_with_many_clients_one_tenant`.

If the same 20us can be removed in the nightly benchmark run, this will
be a ~10% improvement because there, mean latencies are about ~220us.

This span was added during batching refactors, we didn't have it before,
and I don't think it's terribly useful.

refs
- https://github.com/neondatabase/cloud/issues/21759
2025-02-06 08:33:37 +00:00
Erik Grinaker
abcd00181c pageserver: set a concurrency limit for LocalFS (#10676)
## Problem

The local filesystem backend for remote storage doesn't set a
concurrency limit. While it can't/won't enforce a concurrency limit
itself, this also bounds the upload queue concurrency. Some tests create
thousands of uploads, which slows down the quadratic scheduling of the
upload queue, and there is no point spawning that many Tokio tasks.

Resolves #10409.

## Summary of changes

Set a concurrency limit of 100 for the LocalFS backend.

Before: `test_layer_map[release-pg17].test_query: 68.338 s`
After: `test_layer_map[release-pg17].test_query: 5.209 s`
2025-02-06 07:24:36 +00:00
Konstantin Knizhnik
01f0be03b5 Fix bugs in lfc_cache_containsv (#10682)
## Problem

Incorrect manipulations with iteration index in `lfc_cache_containsv`

## Summary of changes

```
-		int		this_chunk = Min(nblocks, BLOCKS_PER_CHUNK - chunk_offs);
+		int		this_chunk = Min(nblocks - i, BLOCKS_PER_CHUNK - chunk_offs);		int		this_chunk = ```
 -		if (i + 1 >= nblocks)
+		if (i >= nblocks)
```

Co-authored-by: Konstantin Knizhnik <knizhnik@neon.tech>
2025-02-06 07:00:00 +00:00
Konstantin Knizhnik
81cd30e4d6 Use #ifdef instead of #if USE_ASSERT_CHECKING (#10683)
## Problem

USE_ASSERT _CHECKING is defined as empty entity. but it is checked using
#if

## Summary of changes

Replace `#if USE_ASSERT _CHECKING` with `#ifdef USE_ASSERT _CHECKING` as
done in other places in Postgres

Co-authored-by: Konstantin Knizhnik <knizhnik@neon.tech>
2025-02-06 05:47:56 +00:00
Konstantin Knizhnik
7fc6953da4 Is neon superuser (#10625)
## Problem

is_neon_superuser() fiunction is public in pg14/pg15
but statically defined in publicationcmd.c in pg16/pg17

## Summary of changes

Make this function public for all Postgres version.
It is intended to be used not only in  publicationcmd.c

See
https://github.com/neondatabase/postgres/pull/573
https://github.com/neondatabase/postgres/pull/576

---------

Co-authored-by: Konstantin Knizhnik <knizhnik@neon.tech>
2025-02-06 05:42:14 +00:00
Christian Schwarz
77f9e74d86 pgxn: include socket send & recv queue size in slow response logs (#10673)
# Problem

When we see an apparent slow request, one possible cause is that the
client is failing to consume responses, but we don't have a clear way to
see that.

# Solution

- Log the socket queue depths on slow/stuck connections, so that we have
an indication of whether the compute is keeping up with processing the
connection's responses.

refs
- slack https://neondb.slack.com/archives/C036U0GRMRB/p1738652644396329
- refs https://github.com/neondatabase/cloud/issues/23515
- refs https://github.com/neondatabase/cloud/issues/23486
2025-02-06 01:14:29 +00:00
Alex Chi Z.
0ceeec9be3 fix(pageserver): schedule compaction immediately if pending (#10684)
## Problem

The code is intended to reschedule compaction immediately if there are
pending tasks. We set the duration to 0 before if there are pending
tasks, but this will go through the `if period == Duration::ZERO {`
branch and sleep for another 10 seconds.

## Summary of changes

Set duration to 1 so that it doesn't sleep for too long.

Signed-off-by: Alex Chi Z <chi@neon.tech>
2025-02-05 22:11:50 +00:00
Alex Chi Z.
733a57247b fix(pageserver): disallow gc-compaction produce l0 layer (#10679)
## Problem

Any compaction should never produce l0 layers. This never happened in my
experiments, but would be good to guard it early.

## Summary of changes

Disallow gc-compaction to produce l0 layers.

Signed-off-by: Alex Chi Z <chi@neon.tech>
2025-02-05 20:44:28 +00:00
Heikki Linnakangas
6699a30a49 Make it easy to build only a subset of extensions into compute image (#10655)
The full build of all extensions takes a long time. When working locally
on parts that don't need extensions, you can iterate more quickly by
skipping the unnecessary extensions.

This adds a build argument to the dockerfile to specify extensions to
build. There are three options:

- EXTENSIONS=all (default)
- EXTENSIONS=minimal: Build only a few extensions that are listed in
shared_preload_libraries in the default neon config.
- EXTENSIONS=none: Build no extensions (except for the mandatory 'neon'
extension).
2025-02-05 18:07:51 +00:00
Alex Chi Z.
133b89a83d feat(pageserver): continue from last incomplete image layer creation (#10660)
## Problem

close https://github.com/neondatabase/neon/issues/10651

## Summary of changes

* Image layer creation starts from the next partition of the last
processed partition if the previous attempt was not complete.
* Add tests.

---------

Signed-off-by: Alex Chi Z <chi@neon.tech>
2025-02-05 17:35:39 +00:00
Arseny Sher
fba22a7123 Record more timings in test_layer_map (#10670)
## Problem

It it is not very clear how much time take different operations.

## Summary of changes

Record more timings.

ref https://github.com/neondatabase/neon/issues/10409
2025-02-05 17:00:26 +00:00
John Spray
14e05276a3 storcon: fix a case where optimise could get stuck on unschedulable node (#10648)
## Problem

When a shard has two secondary locations, but one of them is on a node
with MaySchedule::No, the optimiser would get stuck, because it couldn't
decide which secondary to remove.

This is generally okay if a node is offline, but if a node is in Pause
mode for a long period of time, it's a problem.

Closes: https://github.com/neondatabase/neon/issues/10646

## Summary of changes

- Instead of insisting on finding a node in the wrong AZ to remove, find
an available node in the _right_ AZ, and remove all the others. This
ensures that if there is one live suitable node, then other
offline/paused nodes cannot hold things up.
2025-02-05 16:05:12 +00:00
Tristan Partin
ebc55e6ae8 Fix logic for checking if a compute can install a remote extension (#10656)
Given a remote extensions manifest of the following:

```json
  {
    "public_extensions": [],
    "custom_extensions": null,
    "library_index": {
      "pg_search": "pg_search"
    },
    "extension_data": {
      "pg_search": {
        "control_data": {
          "pg_search.control": "comment = 'pg_search: Full text search for PostgreSQL using BM25'\ndefault_version = '0.14.1'\nmodule_pathname = '$libdir/pg_search'\nrelocatable = false\nsuperuser = true\nschema = paradedb\ntrusted = true\n"
        },
        "archive_path": "13117844657/v14/extensions/pg_search.tar.zst"
      }
    }
  }
```

We were allowing a compute to install a remote extension that wasn't
listed in either public_extensions or custom_extensions.

Signed-off-by: Tristan Partin <tristan@neon.tech>
2025-02-05 14:58:33 +00:00
Erik Grinaker
f07119cca7 pageserver: add pageserver_wal_ingest_values_committed metric (#10653)
## Problem

We don't have visibility into the ratio of image vs. delta pages
ingested in Pageservers. This might be useful to determine whether we
should compress WAL records before storing them, which in turn might
make compaction more efficient.

## Summary of changes

Add `pageserver_wal_ingest_values_committed` metric with dimensions
`class=metadata|data` and `kind=image|delta`.
2025-02-05 14:33:04 +00:00
Vlad Lazar
47975d06d9 storcon: silence cplane 404s on tenant creation (#10665)
## Problem

We get WARN log noise on tenant creations. Cplane creates tenants via
/location_config. That returns the attached locations in the response
and spawns a reconciliation which will also attempt to notify cplane. If
the notification is attempted before cplane persists the shards to its
database, storcon gets back a 404. The situation is harmless, but
annoying.

## Summary of Changes

* Add a tenant creation hint to the reconciler config
* If the hint is true and we get back a 404 on the notification from
cplane, ignore the error, but still queue the reconcile up for a retry.

Closes https://github.com/neondatabase/cloud/issues/20732
2025-02-05 12:41:09 +00:00
Fedor Dikarev
472007dd7c ci: unify Dockerfiles, set bash as SHELL for debian layers, make cpan step as separate RUN (#10645)
## Problem
Ref: https://github.com/neondatabase/cloud/issues/23461

and follow-up after: https://github.com/neondatabase/neon/pull/10553

we used `echo` to set-up `.wgetrc` and `.curlrc`, and there we used `\n`
to make these multiline configs with one echo command.

The problem is that Debian `/bin/sh`'s built-in echo command behaves
differently from the `/bin/echo` executable and from the `echo` built-in
in `bash`. Namely, it does not support the`-e` option, and while it does
treat `\n` as a newline, passing `-e` here will add that `-e` to the
output.
At the same time, when we use different base images, for example
`alpine/curl`, their `/bin/sh` supports and requires `-e` for treating
escape sequences like `\n`.
But having different `echo` and remembering difference in their
behaviour isn't best experience for the developer and makes bad
experience maintaining Dockerfiles.

Work-arounds:

- Explicitly use `/bin/bash` (like in this PR)
- Use `/bin/echo` instead of the shell's built-in echo function
- Use printf "foo\n" instead of echo -e "foo\n"

## Summary of changes
1. To fix that, we process with the option setting `/bin/bash` as a
SHELL for the debian-baysed layers
2. With no changes for `alpine/curl` based layers.
3. And one more change here: in `extensions` layer split to the 2 steps:
installing dependencies from `CPAN` and installing `lcov` from github,
so upgrading `lcov` could reuse previous layer with installed cpan
modules.
2025-02-04 18:58:02 +00:00
Vlad Lazar
f9009d6b80 pageserver: write heatmap to disk after uploading it (#10650)
## Problem

We wish to make heatmap generation additive in
https://github.com/neondatabase/neon/pull/10597.
However, if the pageserver restarts and has a heatmap on disk from when
it was a secondary long ago,
we can end up keeping extra layers on the secondary's disk.

## Summary of changes

Persist the heatmap after a successful upload.
2025-02-04 17:52:54 +00:00
Alex Chi Z.
cab60b6d9f fix(pagesever): stablize gc-compaction tests (#10621)
## Problem

Hopefully this can resolve
https://github.com/neondatabase/neon/issues/10517. The reason why the
test is flaky is that after restart the compute node might write some
data so that the pageserver flush some layers, and in the end, causing
L0 compaction to run, and we cannot get the test scenario as we want.

## Summary of changes

Ensure all L0 layers are compacted before starting the test.

Signed-off-by: Alex Chi Z <chi@neon.tech>
2025-02-04 16:11:31 +00:00
Erik Grinaker
06090bbccd pageserver: log critical error on ClearVmBits for unknown pages (#10634)
## Problem

In #9895, we fixed some issues where `ClearVmBits` were broadcast to all
shards, even those not owning the VM relation. As part of that, we found
some ancient code from #1417, which discarded spurious incorrect
`ClearVmBits` records for pages outside of the VM relation. We added
observability in #9911 to see how often this actually happens in the
wild.

After two months, we have not seen this happen once in production or
staging. However, out of caution, we don't want a hard error and break
WAL ingestion.

Resolves #10067.

## Summary of changes

Log a critical error when ingesting `ClearVmBits` for unknown VM
relations or pages.
2025-02-04 14:55:11 +00:00
Folke Behrens
dcf335a251 proxy: Switch proxy to JSON logging (#9857)
## Problem

We want to switch proxy and ideally all Rust services to structured JSON
logging to support better filtering and cross-referencing with tracing.

## Summary of changes

* Introduce a custom tracing-subscriber to write the JSON. In a first
attempt a customized tracing::fmt::FmtSubscriber was used, but it's very
inefficient and can still generate invalid JSON. It's also doesn't allow
us to add important fields to the root object.
* Make this opt in: the `LOGFMT` env var can be set to `"json"` to
enable to new logger at startup.
2025-02-04 14:50:53 +00:00
Arpad Müller
b6e9daea9a storcon: only allow errrors of the server cert verification (#10644)
This PR does a bunch of things:

* only allow errors of the server cert verification, not of the TLS
handshake. The TLS handshake doesn't cause any errors for us so we can
just always require it to be valid. This simplifies the code a little.
* As the solution is more permanent than originally anticipated, I think
it makes sense to move the `AcceptAll` verifier outside.
* log the connstr information. this helps with figuring out which domain
names are configured in the connstr, etc. I think it is generally useful
to print it. make extra sure that the password is not leaked.

Follow-up of #10640
2025-02-04 14:01:57 +00:00
a-masterov
d5c3a4e2b9 Add support for pgjwt test (#10611)
## Problem
We don't currently test pgjwt, while it is based on pg_prove and can be
easily added
## Summary of changes
The test for pgjwt was added.
2025-02-04 13:49:44 +00:00
Heikki Linnakangas
8107140f7f Refactor compute dockerfile (#10371)
Refactor how extensions are built in compute Dockerfile
    
1. Rename some of the extension layers, so that names correspond more
   precisely to the upstream repository name and the source directory
   name. For example, instead of "pg-jsonschema-pg-build", spell it
   "pg_jsonschema-build". Some of the layer names had the extra "pg-"
   part, and some didn't; harmonize on not having it. And use an
   underscore if the upstream project name uses an underscore.

2. Each extension now consists of two dockerfile targets:
   [extension]-src and [extension]-build. By convention, the -src
   target downloads the sources and applies any neon-specific patches
   if necessary. The source tarball is downloaded and extracted under
   /ext-src. For example, the 'pgvector' extension creates the
   following files and directory:

        /ext-src/pgvector.tar.gz  # original tarball
/ext-src/pgvector.patch # neon-specific patch, copied from patches/ dir
/ext-src/pgvector-src/ # extracted tarball, with patch applied

    This separation avoids re-downloading the sources every time the
    extension is recompiled. The 'extension-tests' target also uses the
    [extension]-src layers, by copying the /ext-src/ dirs from all
    the extensions together into one image.

This refactoring came about when I was experimenting with different
ways of splitting up the Dockerfile so that each extension would be in
a separate file. That's not part of this PR yet, but this is a good
step in modularizing the extensions.
2025-02-04 10:35:43 +00:00
Alex Chi Z.
e219d48bfe refactor(pageserver): clearify compaction return value (#10643)
## Problem

## Summary of changes

Make the return value of the set of compaction functions less confusing.

Signed-off-by: Alex Chi Z <chi@neon.tech>
2025-02-03 21:56:55 +00:00
Alex Chi Z.
c1be84197e feat(pageserver): preempt image layer generation if L0 piles up (#10572)
## Problem

Image layer generation could block L0 compactions for a long time.

## Summary of changes

* Refactored the return value of `create_image_layers_for_*` functions
to make it self-explainable.
* Preempt image layer generation in `Try` mode if L0 piles up.

Note that we might potentially run into a state that only the beginning
part of the keyspace gets image coverage. In that case, we either need
to implement something to prioritize some keyspaces with image coverage,
or tune the image_creation_threshold to ensure that the frequency of
image creation could keep up with L0 compaction.

---------

Signed-off-by: Alex Chi Z <chi@neon.tech>
Co-authored-by: Erik Grinaker <erik@neon.tech>
2025-02-03 20:55:47 +00:00
dependabot[bot]
d80cbb2443 build(deps): bump openssl from 0.10.66 to 0.10.70 in /test_runner/pg_clients/rust/tokio-postgres in the cargo group across 1 directory (#10642)
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2025-02-03 19:42:40 +00:00
Erik Grinaker
06b45fd0fd utils/logging: add critical! macro and metric (#10641)
## Problem

We don't currently have good alerts for critical errors, e.g. data
loss/corruption.

Touches #10094.

## Summary of changes

Add a `critical!` macro and corresponding
`libmetrics_tracing_event_count{level="critical"}` metric. This will:

* Emit an `ERROR` log message with prefix `"CRITICAL:"` and a backtrace.
* Increment `libmetrics_tracing_event_count{level="critical"}`, and
indirectly `level="error"`.
* Trigger a pageable alert (via the metric above).
* In debug builds, panic the process.

I'll add uses of the macro separately.
2025-02-03 19:23:12 +00:00
John Spray
715e20343a storage controller: improve scheduling of tenants created in PlacementPolicy::Secondary (#10590)
## Problem

I noticed when onboarding lots of tenants that the AZ scheduling
violation stat was climbing, before falling later as optimisations
happened. This was happening because we first add the tenant with
PlacementPolicy::Secondary, and then later go to
PlacementPolicy::Attached, and the scheduler's behavior led to a bad AZ
choice:
1. Create a secondary location in the non-preferred AZ
2. Upgrade to Attached where we promote that non-preferred-AZ location
to attached and then create another secondary
3. Optimiser later realises we're in the wrong AZ and moves us

## Summary of changes

- Extend some logging to give more information about AZs
- When scheduling secondary location in PlacementPolicy::Secondary,
select it as if we were attached: in this mode, our business goal is to
have a warm pageserver location that we can make available as attached
quickly if needed, therefore we want it to be in the preferred AZ.
- Make optimize_secondary logic the same, so that it will consider a
secondary location in the preferred AZ to be optimal when in
PlacementPolicy::Secondary
- When transitioning to from PlacementPolicy::Attached(N) to
PlacementPolicy::Secondary, instead of arbitrarily picking a location to
keep, prefer to keep the location in the preferred AZ
2025-02-03 19:01:16 +00:00
Arpad Müller
c774f0a147 storcon db: allow accepting any TLS certificate (#10640)
We encountered some TLS validation errors for the storcon since applying
#10614. Add an option to downgrade them to logged errors instead to
allow us to debug with more peace.

cc issue https://github.com/neondatabase/cloud/issues/23583
2025-02-03 18:21:01 +00:00
Folke Behrens
628a9616c4 fix(proxy): Don't use --is-private-access-proxy to disable IP check (#10633)
## Problem
* The behavior of this flag changed. Plus, it's not necessary to disable
the IP check as long as there are no IPs listed in the local postgres.

## Summary of changes
* Drop the flag from the command in the README.md section.
* Change the postgres URL passed to proxy to not use the endpoint
hostname.
* Also swap postgres creation and proxy startup, so the DB is running
when proxy comes up.
2025-02-03 14:12:41 +00:00
Alexander Bayandin
43682624b5 CI(pg-clients): fix logical replication tests (#10623)
## Problem

Tests for logical replication (on Staging) have been failing for some
time because logical replication is not enabled for them. This issue
occurred after switching to an org API key with a different default
setting, where logical replication was not enabled by default.

## Summary of changes
- Add `enable_logical_replication` input to
`actions/neon-project-create`
- Enable logical replication in `test-logical-replication` job
2025-02-03 13:41:41 +00:00
Em Sharnoff
e617a3a075 vm-monitor: Improve error display (#10542)
Logging errors with the debug format specifier causes multi-line errors,
which are sometimes a pain to deal with. Instead, we should use anyhow's
alternate display format, which shows the same information on a single
line.

Also adjusted a couple of error messages that were stale.

Fixes neondatabase/cloud#14710.
2025-02-03 13:34:11 +00:00
Fedor Dikarev
23ca8b061b Use actions/checkout for checkout (#10630)
## Problem
1. First of all it's more correct
2. Current usage allows ` Time-of-Check-Time-of-Use (TOCTOU) 'Pwn
Request' vulnerabilities`. Please check security slack channel or reach
me for more details. I will update PR description after merge.

## Summary of changes
1. Use `actions/checkout` with `ref: ${{
github.event.pull_request.head.sha }}`

Discovered by and Co-author: @varunsh-coder
2025-02-03 12:55:48 +00:00
Anastasia Lubennikova
b1bc33eb4d Fix logical_replication_sync test fixture (#10531)
Fixes flaky test_lr_with_slow_safekeeper test #10242

Fix query to `pg_catalog.pg_stat_subscription` catalog to handle table
synchronization and parallel LR correctly.
2025-02-03 12:44:47 +00:00
OBBO67
b1e451091a pageserver: clean up references to timeline delete marker, uninit marker (#5718) (#10627)
## Problem

Since [#5580](https://github.com/neondatabase/neon/pull/5580) the delete
and uninit file markers are no longer needed.

## Summary of changes

Remove the remaining code for the delete and uninit markers.

Additionally removes the `ends_with_suffix` function as it is no longer
required.

Closes [#5718](https://github.com/neondatabase/neon/issues/5718).
2025-02-03 11:54:07 +00:00
Arpad Müller
87ad50c925 storcon: use diesel-async again, now with tls support (#10614)
Successor of #10280 after it was reverted in #10592.

Re-introduce the usage of diesel-async again, but now also add TLS
support so that we connect to the storcon database using TLS. By
default, diesel-async doesn't support TLS, so add some code to make us
explicitly request TLS.

cc https://github.com/neondatabase/cloud/issues/23583
2025-02-03 11:53:51 +00:00
Alexander Bayandin
89b9f74077 CI(pre-merge-checks): do not run conclusion job for PRs (#10619)
## Problem

While working on https://github.com/neondatabase/neon/pull/10617 I
(unintentionally) merged the PR before the main CI pipeline has
finished.
I suspect this happens because we have received all the required job
results from the pre-merge-checks workflow, which runs on PRs that
include changes to relevant files.

## Summary of changes
- Skip the `conclusion` job in `pre-merge-checks` workflows for PRs
2025-02-03 09:40:12 +00:00
John Spray
f071800979 tests: stabilize shard locations earlier in test_scrubber_tenant_snapshot (#10606)
## Problem

This test would sometimes emit unexpected logs from the storage
controller's requests to do migrations, which overlap with the test's
restarts of pageservers, where those migrations are happening some time
after a shard split as the controller moves load around.

Example:
https://neon-github-public-dev.s3.amazonaws.com/reports/pr-10602/13067323736/index.html#testresult/f66f1329557a1fc5/retries

## Summary of changes

- Do a reconcile_until_idle after shard split, so that the rest of the
test doesn't run concurrently with migrations
2025-02-03 09:02:21 +00:00
Peter Bendel
4dfe60e2ad revert https://github.com/neondatabase/neon/pull/10616 (#10631)
## Problem

https://github.com/neondatabase/neon/pull/10616 was only intended
temparily during the weekend, want to reset to prior state

## Summary of changes

revert https://github.com/neondatabase/neon/pull/10616 but keep fixes in
https://github.com/neondatabase/neon/pull/10622
2025-02-03 09:00:23 +00:00
Arpad Müller
8ae6f656a6 Don't require partial backup semaphore capacity for deletions (#10628)
In the safekeeper, we block deletions on the timeline's gate closing,
and any `WalResidentTimeline` keeps the gate open (because it owns a
gate lock object). Thus, unless the `main_task` function of a partial
backup doesn't return, we can't delete the associated timeline.

In order to make these tasks exit early, we call the cancellation token
of the timeline upon its shutdown. However, the partial backup task
wasn't looking for the cancellation while waiting to acquire a partial
backup permit.

On a staging safekeeper we have been in a situation in the past where
the semaphore was already empty for a duration of many hours, rendering
all attempted deletions unable to proceed until a restart where the
semaphore was reset:
https://neondb.slack.com/archives/C03H1K0PGKH/p1738416586442029
2025-02-03 04:11:06 +00:00
Peter Bendel
b9e1a67246 fix generate matrix for olap for saturdays (#10622)
## Problem

when introducing pg17 for job step `Generate matrix for OLAP benchmarks`
I introduced a syntax error that only hits on Saturdays.

## Summary of changes

Remove trailing comma

## successful test run

https://github.com/neondatabase/neon/actions/runs/13086363907
2025-02-01 11:09:45 +00:00
Folke Behrens
6318828c63 Update rust to 1.84.1 (#10618)
We keep the practice of keeping the compiler up to date, pointing to the
latest release. This is done by many other projects in the Rust
ecosystem as well.

[Release notes](https://releases.rs/docs/1.84.1/).

Prior update was in https://github.com/neondatabase/neon/pull/10328.

Co-authored-by: Arpad Müller <arpad-m@users.noreply.github.com>
2025-01-31 20:52:17 +00:00
Stefan Radig
6dd48ba148 feat(proxy): Implement access control with VPC endpoint checks and block for public internet / VPC (#10143)
- Wired up filtering on VPC endpoints
- Wired up block access from public internet / VPC depending on per
project flag
- Added cache invalidation for VPC endpoints (partially based on PR from
Raphael)
- Removed BackendIpAllowlist trait

---------

Co-authored-by: Ivan Efremov <ivan@neon.tech>
2025-01-31 20:32:57 +00:00
Conrad Ludgate
ad1a41157a feat(proxy): optimizing the chances of large write in copy_bidirectional (#10608)
We forked copy_bidirectional to solve some issues like fast-shutdown
(disallowing half-open connections) and to introduce better error
tracking (which side of the conn closed down).

A change recently made its way upstream offering performance
improvements: https://github.com/tokio-rs/tokio/pull/6532. These seem
applicable to our fork, thus it makes sense to apply them here as well.
2025-01-31 19:14:27 +00:00
Tristan Partin
fcd195c2b6 Migrate compute_ctl arg parsing to clap derive (#10497)
The primary benefit is that all the ad hoc get_matches() calls are no
longer necessary. Now all it takes to get at the CLI arguments is
referencing a struct member. It's also great the we can replace the ad
hoc CLI struct we had with this more formal solution.

Signed-off-by: Tristan Partin <tristan@neon.tech>
2025-01-31 19:04:26 +00:00
Peter Bendel
bc7822d90c temporarily disable some steps and run more often to expose more pgbench --initialize in benchmarking workflow (#10616)
## Problem

we want to disable some steps in benchmarking workflow that do not
initialize new projects and instead run the test more frequently

Test run
 https://github.com/neondatabase/neon/actions/runs/13077737888
2025-01-31 18:41:17 +00:00
Alexander Bayandin
48c87dc458 CI(pre-merge-checks): fix condition (#10617)
## Problem

Merge Queue fails if changes include Rust code.

## Summary of changes
- Fix condition for `build-build-tools-image`
- Add a couple of no-op `false ||` to make predicates look 
symmetric
2025-01-31 18:07:26 +00:00
John Spray
aedeb1c7c2 pageserver: revise logging of cancelled request results (#10604)
## Problem

When a client dropped before a request completed, and a handler returned
an ApiError, we would log that at error severity. That was excessive in
the case of a request erroring on a shutdown, and could cause test
flakes.

example:
https://neon-github-public-dev.s3.amazonaws.com/reports/main/13067651123/index.html#suites/ad9c266207b45eafe19909d1020dd987/6021ce86a0d72ae7/

```
Cancelled request finished with an error: ShuttingDown
```

## Summary of changes

- Log a different info-level on ShuttingDown and ResourceUnavailable API
errors from cancelled requests
2025-01-31 17:43:54 +00:00
John Spray
a93e9f22fc pageserver: remove faulty debug assertion in compaction (#10610)
## Problem

This assertion is incorrect: it is legal to see another shard's data at
this point, after a shard split.

Closes: https://github.com/neondatabase/neon/issues/10609

## Summary of changes

- Remove faulty assertion
2025-01-31 17:43:31 +00:00
JC Grünhage
10cf5e7a38 Move cargo-deny into a separate workflow on a schedule (#10289)
## Problem
There are two (related) problems with the previous handling of
`cargo-deny`:
- When a new advisory is added to rustsec that affects a dependency,
unrelated pull requests will fail.
- New advisories rely on pushes or PRs to be surfaced. Problems that
already exist on main will only be found if we try to merge new things
into main.

## Summary of changes
We split out `cargo-deny` into a separate workflow that runs on all PRs
that touch `Cargo.lock`, and on a schedule on `main`, `release`,
`release-compute` and `release-proxy` to find new advisories.
2025-01-31 13:42:59 +00:00
Arpad Müller
dce617fe07 Update to rebased rust-postgres (#10584)
Update to a rebased version of our rust-postgres patches, rebased on
[this](98f5a11bc0)
commit this time.

With #10280 reapplied, this means that the rust-postgres crates will be
deduplicated, as the new crate versions are finally compatible with the
requirements of diesel-async.

Earlier update: #10561

rust-postgres PR: https://github.com/neondatabase/rust-postgres/pull/39
2025-01-31 12:40:20 +00:00
Alexander Bayandin
503bc72d31 CI: add diesel print-schema check (#10527)
## Problem

We want to check that `diesel print-schema` doesn't generate any changes
(`storage_controller/src/schema.rs`) in comparison with the list of
migration.

## Summary of changes
- Add `diesel_cli` to `build-tools` image
- Add `Check diesel schema` step to `build-neon` job, at this stage we
have all required binaries, so don't need to compile anything
additionally
- Check runs only on x86 release builds to be sure we do it at least
once per CI run.
2025-01-31 11:48:46 +00:00
Fedor Dikarev
89cff08354 unify pg-build-nonroot-with-cargo base layer and config retries in curl (#10575)
Ref: https://github.com/neondatabase/cloud/issues/23461

## Problem

Just made changes around and see these 2 base layers could be optimised.

and after review comment from @myrrc setting up timeouts and retries in
`alpine/curl` image

## Summary of changes
2025-01-31 11:46:33 +00:00
Erik Grinaker
afbcebe7f7 test_runner: force-compact in test_sharding_autosplit (#10605)
## Problem

This test may not fully detect data corruption during splits, since we
don't force-compact the entire keyspace.

## Summary of changes

Force-compact all data in `test_sharding_autosplit`.
2025-01-31 11:31:58 +00:00
Arpad Müller
7d5c70c717 Update AWS SDK crates (#10588)
We want to keep the AWS SDK up to date as that way we benefit from new
developments and improvements.

Prior update was in #10056
2025-01-31 11:23:12 +00:00
John Spray
f09cfd11cb pageserver: exclude archived timelines from freeze+flush on shutdown (#10594)
## Problem

If offloading races with normal shutdown, we get a "failed to freeze and
flush: cannot flush frozen layers when flush_loop is not running, state
is Exited". This is harmless but points to it being quite strange to try
and freeze and flush such a timeline. flushing on shutdown for an
archived timeline isn't useful.

Related: https://github.com/neondatabase/neon/issues/10389

## Summary of changes

- During Timeline::shutdown, ignore ShutdownMode::FreezeAndFlush if the
timeline is archived
2025-01-31 10:54:14 +00:00
Arseny Sher
765ba43438 Allow pageserver unreachable errors in test_scrubber_tenant_snapshot (#10585)
## Problem

test_scrubber_tenant_snapshot restarts pageservers, but log validation
fails tests on any non white listed storcon warnings, making the test
flaky.

## Summary of changes

Allow warns like
2025-01-29T12:37:42.622179Z WARN reconciler{seq=1
tenant_id=2011077aea9b4e8a60e8e8a19407634c shard_id=0004}: Call to node
2 (localhost:15352) management API failed, will retry (attempt 1):
receive body: error sending request for url
(http://localhost:15352/v1/tenant/2011077aea9b4e8a60e8e8a19407634c-0004/location_config):
client error (Connect)

ref https://github.com/neondatabase/neon/issues/10462
2025-01-31 10:33:24 +00:00
Folke Behrens
6041a93591 Update tokio base crates (#10556)
Update `tokio` base crates and their deps. Pin `tokio` to at least 1.41
which stabilized task ID APIs.

To dedup `mio` dep the `notify` crate is updated. It's used in
`compute_tools`.

9f81828429/compute_tools/src/pg_helpers.rs (L258-L367)
2025-01-31 09:54:31 +00:00
Conrad Ludgate
738bf83583 chore: replace dashmap with clashmap (#10582)
## Problem

Because dashmap 6 switched to hashbrown RawTable API, it required us to
use unsafe code in the upgrade:
https://github.com/neondatabase/neon/pull/8107

## Summary of changes

Switch to clashmap, a fork maintained by me which removes much of the
unsafe and ultimately switches to HashTable instead of RawTable to
remove much of the unsafe requirement on us.
2025-01-31 09:53:43 +00:00
Anna Stepanyan
423e239617 [infra/notes] impr: add issue types to issue templates (#10018)
refs #0000

---------

Co-authored-by: Fedor Dikarev <fedor@neon.tech>
2025-01-31 06:29:06 +00:00
Heikki Linnakangas
df87a55609 tests: Speed up test_pgdata_import_smoke on Postgres v17 (#10567)
The test runs this query:

    select count(*), sum(data::bigint)::bigint from t

to validate the test results between each part of the test. It performs
a simple sequential scan and aggregation, but was taking an order of
magnitude longer on v17 than on previous Postgres versions, which
sometimes caused the test to time out. There were two reasons for that:

1. On v17, the planner estimates the table to have only only one row. In
reality it has 305790 rows, and older versions estimated it at 611580,
which is not too bad given that the table has not been analyzed so the
planner bases that estimate just on the number of pages and the widths
of the datatypes. The new estimate of 1 row is much worse, and it leads
the planner to disregard parallel plans, whereas on older versions you
got a Parallel Seq Scan.

I tracked this down to upstream commit 29cf61ade3, "Consider fillfactor
when estimating relation size". With that commit,
table_block_relation_estimate_size() function calculates that each page
accommodates less than 1 row when the fillfactor is taken into account,
which rounds down to 0. In reality, the executor will always place at
least one row on a page regardless of fillfactor, but the new estimation
formula doesn't take that into account.

I reported this to pgsql-hackers
(https://www.postgresql.org/message-id/2bf9d973-7789-4937-a7ca-0af9fb49c71e%40iki.fi),
we don't need to do anything more about it in neon. It's OK to not use
parallel scans here; once issue 2. below is addressed, the queries are
fast enough without parallelism..

2. On v17, prefetching was not happening for the sequential scan. That's
because starting with v17, buffers are reserved in the shared buffer
cache before prefetching is initiated, and we use a tiny
shared_buffers=1MB setting in the tests. The prefetching is effectively
disabled with such a small shared_buffers setting, to protect the system
from completely starving out of buffers.

   To address that, simply bump up shared_buffers in the test.

This patch addresses the second issue, which is enough to fix the
problem.
2025-01-30 22:55:17 +00:00
John Spray
5e0c40709f storcon: refine chaos selection logic (#10600)
## Problem

In https://github.com/neondatabase/neon/pull/10438 it was pointed out
that it would be good to avoid picking tenants in ID order, and also to
avoid situations where we might double-select the same tenant.

There was an initial swing at this in
https://github.com/neondatabase/neon/pull/10443, where Chi suggested a
simpler approach which is done in this PR

## Summary of changes

- Split total set of tenants into in and out of home AZ
- Consume out of home AZ first, and if necessary shuffle + consume from
out of home AZ
2025-01-30 22:45:43 +00:00
John Spray
e1273acdb1 pageserver: handle shutdown cleanly in layer download API (#10598)
## Problem

This API is used in tests and occasionally for support. It cast all
errors to 500.

That can cause a failure on the log checks:
https://neon-github-public-dev.s3.amazonaws.com/reports/main/13056992876/index.html#suites/ad9c266207b45eafe19909d1020dd987/683a7031d877f3db/

## Summary of changes

- Avoid using generic anyhow::Error for layer downloads
- Map shutdown cases to 503 in http route
2025-01-30 22:43:36 +00:00
John Spray
d18f6198e1 storcon: fix AZ-driven tenant selection in chaos (#10443)
## Problem

In https://github.com/neondatabase/neon/pull/10438 I had got the
function for picking tenants backwards, and it was preferring to move
things _away_ from their preferred AZ.

## Summary of changes

- Fix condition in `is_attached_outside_preferred_az`
2025-01-30 22:17:07 +00:00
John Spray
6da7c556c2 pageserver: fix race cleaning up timeline files when shut down during bootstrap (#10532)
## Problem

Timeline bootstrap starts a flush loop, but doesn't reliably shut down
the timeline (incl. waiting for flush loop to exit) before destroying
UninitializedTimeline, and that destructor tries to clean up local
storage. If local storage is still being written to, then this is
unsound.

Currently the symptom is that we see a "Directory not empty" error log,
e.g.
https://neon-github-public-dev.s3.amazonaws.com/reports/main/12966756686/index.html#testresult/5523f7d15f46f7f7/retries

## Summary of changes

- Move fallible IO part of bootstrap into a function (notably, this is
fallible in the case of the tenant being shut down while creation is
happening)
- When that function returns an error, call shutdown() on the timeline
2025-01-30 20:33:22 +00:00
a-masterov
bf6d5e93ba Run tests of the contrib extensions (#10392)
## Problem
We don't test the extensions, shipped with contrib
## Summary of changes
The tests are now running
2025-01-30 19:32:35 +00:00
Arpad Müller
4d2c2e9460 Revert "storcon: switch to diesel-async and tokio-postgres (#10280)" (#10592)
There was a regression of #10280, tracked in
[#23583](https://github.com/neondatabase/cloud/issues/23583).

I have ideas how to fix the issue, but we are too close to the release
cutoff, so revert #10280 for now. We can revert the revert later :).
2025-01-30 19:23:25 +00:00
John Spray
bae0de643e tests: relax constraints on test_timeline_archival_chaos (#10595)
## Problem

The test asserts that it completes at least 10 full timeline lifecycles,
but the noisy CI environment sometimes doesn't meet that goal.

Related: https://github.com/neondatabase/neon/issues/10389

## Summary of changes

- Sleep for longer between pageserver restarts, so that the timeline
workers have more chance to make progress
- Sleep for shorter between retries from timeline worker, so that they
have better chance to get in while a pageserver is up between restarts
- Relax the success condition to complete at least 5 iterations instead
of 10
2025-01-30 19:22:59 +00:00
Cheng Chen
8293b252b2 chore(compute): pg_mooncake v0.1.1 (#10578)
## Problem
Upgrade pg_mooncake to v0.1.1

## Summary of changes

https://github.com/Mooncake-Labs/pg_mooncake/blob/main/CHANGELOG.md#011-2025-01-29
2025-01-30 18:33:25 +00:00
Peter Bendel
6c8fc909d6 Benchmarking PostgreSQL17: for OLAP need specific connstr secrets (#10587)
## Problem

for OLAP benchmarks we need specific connstr secrets with different
database names for each job step

This is a follow-up for https://github.com/neondatabase/neon/pull/10536
In previous PR we used a common GitHub secret for a shared re-use
project that has 4 databases: neondb, tpch, clickbench and userexamples.

[Failure
example](https://neon-github-public-dev.s3.amazonaws.com/reports/main/13044872855/index.html#suites/54d0af6f403f1d8611e8894c2e07d023/fc029330265e9f6e/):


```log
# /tmp/neon/pg_install/v17/bin/psql user=neondb_owner dbname=neondb host=ep-broad-brook-w2luwzzv.us-east-2.aws.neon.build sslmode=require options='-cstatement_timeout=0 ' -c -- $ID$
-- TPC-H/TPC-R Pricing Summary Report Query (Q1)
-- Functional Query Definition
-- Approved February 1998
...
ERROR:  relation "lineitem" does not exist

```

## Summary of changes

We need dedicated GitHub secrets and dedicated connection strings for
each of the use cases.

## Test run
https://github.com/neondatabase/neon/actions/runs/13053968231
2025-01-30 16:41:46 +00:00
Heikki Linnakangas
efe42db264 tests: test_pgdata_import_smoke requires the 'testing' cargo feature (#10569)
It took me ages to figure out why it was failing on my laptop. What I
saw was that when the test makes the 'import_pgdata' in the pageserver,
the pageserver actually performs a regular 'bootstrap' timeline creation
by running initdb, with no importing. It boiled down to the json request
that the test uses:

```
        {
            "new_timeline_id": str(timeline_id),
            "import_pgdata": {
                "idempotency_key": str(idempotency),
                "location": {"LocalFs": {"path": str(importbucket.absolute())}},
            },
        },
```

and how serde deserializes into rust structs. The 'LocalFs' enum variant
in `models.rs` is gated on the 'testing' cargo feature. On a non-testing
build, that got deserialized into the default Bootstrap enum variant, as
a valid TimelineCreateRequestModeImportPgdata variant could not be
formed.

PS. IMHO we should get rid of the testing feature, compile in all the
functionality, and have a runtime flag to disable anything dangeorous.
With that, you would've gotten a nice "feature only enabled in testing
mode" error in this case, or the test would've simply worked. But that's
another story.
2025-01-30 16:11:26 +00:00
Alex Chi Z.
cf6dee946e fix(pageserver): gc-compaction race with read (#10543)
## Problem

close https://github.com/neondatabase/neon/issues/10482

## Summary of changes

Add an extra lock on the read path to protect against races. The read
path has an implication that only certain kind of compactions can be
performed. Garbage keys must first have an image layer covering the
range, and then being gc-ed -- they cannot be done in one operation. An
alternative to fix this is to move the layers read guard to be acquired
at the beginning of `get_vectored_reconstruct_data_timeline`, but that
was intentionally optimized out and I don't want to regress.

The race is not limited to image layers. Gc-compaction will consolidate
deltas automatically and produce a flat delta layer (i.e., when we have
retain_lsns below the gc-horizon). The same race would also cause
behaviors like getting an un-replayable key history as in
https://github.com/neondatabase/neon/issues/10049.

Signed-off-by: Alex Chi Z <chi@neon.tech>
2025-01-30 15:25:29 +00:00
Alexey Kondratov
be51b10da7 chore(compute): Print some compute_ctl errors in debug mode (#10586)
## Problem

In some cases, we were returning a very shallow error like `error
sending request for url (XXX)`, which made it very hard to figure out
the actual error.

## Summary of changes

Use `{:?}` in a few places, and remove it from places where we were
printing a string anyway.
2025-01-30 14:31:49 +00:00
Arpad Müller
93714c4c7b secondary downloader: load metadata on loading of timeline (#10539)
Related to #10308, we might have legitimate changes in file size or
generation. Those changes should not cause warn log lines.

In order to detect changes of the generation number while the file size
stayed the same, load the metadata that we store on disk on loading of
the timeline.

Still do a comparison with the on-disk layer sizes to find any
discrepancies that might occur due to race conditions (new metadata file
gets written but layer file has not been updated yet, and PS shuts
down). However, as it's possible to hit it in a race conditon, downgrade
it to a warning.

Also fix a mistake in #10529: we want to compare the old with the new
metadata, not the old metadata with itself.
2025-01-30 12:03:36 +00:00
John Spray
ab627ad9fd storcon_cli: fix spurious error setting preferred AZ (#10568)
## Problem

The client code for `tenant-set-preferred-az` declared response type
`()`, so printed a spurious error on each use:
```
Error: receive body: error decoding response body: invalid type: map, expected unit at line 1 column 0
```

The requests were successful anyway.

## Summary of changes

- Declare the proper return type, so that the command succeeds quietly.
2025-01-30 11:54:02 +00:00
Erik Grinaker
6a2afa0c02 pageserver: add per-timeline read amp histogram (#10566)
## Problem

We don't have per-timeline observability for read amplification.

Touches https://github.com/neondatabase/cloud/issues/23283.

## Summary of changes

Add a per-timeline `pageserver_layers_per_read` histogram.

NB: per-timeline histograms are expensive, but probably worth it in this
case.
2025-01-30 11:24:49 +00:00
Alexander Bayandin
8804d58943 Nightly Benchmarks: use pgbench from artifacts (#10370)
We don't use statically linked OpenSSL anymore (#10302), 
it's ok to switch to Neon's pgbench for pgvector benchmarks
2025-01-30 11:18:07 +00:00
Erik Grinaker
d3db96c211 pageserver: add pageserver_deltas_per_read_global metric (#10570)
## Problem

We suspect that Postgres checkpoints will limit the number of page
deltas necessary to reconstruct a page, but don't know for certain.

Touches https://github.com/neondatabase/cloud/issues/23283.

## Summary of changes

Add `pageserver_deltas_per_read_global` metric.

This pairs with `pageserver_layers_per_read_global` from #10573.
2025-01-30 10:55:07 +00:00
Erik Grinaker
b24727134c pageserver: improve read amp metric (#10573)
## Problem

The current global `pageserver_layers_visited_per_vectored_read_global`
metric does not appear to accurately measure read amplification. It
divides the layer count by the number of reads in a batch, but this
means that e.g. 10 reads with 100 L0 layers will only measure a read amp
of 10 per read, while the actual read amp was 100.

While the cost of layer visits are amortized across the batch, and some
layers may not intersect with a given key, each visited layer
contributes directly to the observed latency for every read in the
batch, which is what we care about.

Touches https://github.com/neondatabase/cloud/issues/23283.
Extracted from #10566.

## Summary of changes

* Count the number of layers visited towards each read in the batch,
instead of the average across the batch.
* Rename `pageserver_layers_visited_per_vectored_read_global` to
`pageserver_layers_per_read_global`.
* Reduce the read amp log warning threshold down from 512 to 100.
2025-01-30 09:27:40 +00:00
Alexander Lakhin
a7a706cff7 Fix submodule reference after #10473 (#10577) 2025-01-30 09:09:43 +00:00
Alex Chi Z.
77ea9b16fe fix(pageserver): use the larger one of upper limit and threshold (#10571)
## Problem

Follow up of https://github.com/neondatabase/neon/pull/10550 in case the
upper limit is set larger than threshold. It does not make sense for
someone to enforce the behavior like "if there are >= 50 L0s, only
compact 10 of them".

## Summary of changes

Use the maximum of compaction threshold and upper limit when selecting
L0 files to compact.

Signed-off-by: Alex Chi Z <chi@neon.tech>
2025-01-30 00:05:40 +00:00
Alex Chi Z.
9dff6cc2a4 fix(pageserver): skip repartition if we need L0 compaction (#10547)
## Problem

Repartition is slow, but it's only used in image layer creation. We can
skip it if we have a lot of L0 layers to ingest.

## Summary of changes

If L0 compaction is not complete, do not repartition and do not create
image layers.

---------

Signed-off-by: Alex Chi Z <chi@neon.tech>
2025-01-29 21:32:50 +00:00
Erik Grinaker
ff298afb97 pageserver: add level for timeline layer metrics (#10563)
## Problem

We don't have good observability for per-timeline compaction debt,
specifically the number of delta layers in the frozen, L0, and L1
levels.

Touches https://github.com/neondatabase/cloud/issues/23283.

## Summary of changes

* Add a `level` label for `pageserver_layer_{count,size}` with values
`l0`, `l1`, and `frozen`.
* Track metrics for frozen layers.

There is already a `kind={delta,image}` label. `kind=image` is only
possible for `level=l1`.

We don't include the currently open ephemeral layer, only frozen layers.
There is always exactly 1 ephemeral layer, with a dynamic size which is
already tracked in `pageserver_timeline_ephemeral_bytes`.
2025-01-29 21:10:56 +00:00
Fedor Dikarev
de1c35fab3 add retries for apt, wget and curl (#10553)
Ref: https://github.com/neondatabase/cloud/issues/23461

## Problem
> recent CI failure due to apt-get:
```
4.266 E: Failed to fetch http://deb.debian.org/debian/pool/main/g/gcc-10/libgfortran5_10.2.1-6_arm64.deb  Error reading from server - read (104: Connection reset by peer) [IP: 146.75.122.132 80]
```

https://github.com/neondatabase/neon/actions/runs/11144974698/job/30973537767?pr=9186
thinking about if there should be a mirror-selector at the beginning of
the dockerfile so that it uses a debian mirror closer to the build
server?
## Summary of changes
We could consider adding local mirror or proxy and keep it close to our
self-hosted runners.
For now lets just add retries for `apt`, `wget` and `curl`

thanks to @skyzh for reporting that in October 2024, I just finally
found time to take a look here :)
2025-01-29 21:02:54 +00:00
Peter Bendel
62819aca36 Add PostgreSQL version 17 benchmarks (#10536)
## Problem

benchmarking.yml so far is only running benchmarks with PostgreSQL
version 16.
However neon recently changed the default for new customers to
PostgreSQL version 17.

See related [epic](https://github.com/neondatabase/cloud/issues/23295)

## Summary of changes

We do not want to run every job step with both pg 16 and 17 because this
would need excessive resources (runners, computes) and extend the
benchmarking run wall clock time too much.

So we select an opinionated subset of testcases that we also report in
weekly reporting and add a postgres v17 job step.

For re-use projects associated Neon projects have been created and
connection strings have been added to neon database organization
secrets.

A follow up is to add the reporting for these new runs to some grafana
dashboards.
2025-01-29 20:21:42 +00:00
Tristan Partin
707a926057 Remove unused compute_ctl HTTP routes (#10544)
These are not used anywhere within the platform, so let's remove dead
code.

Signed-off-by: Tristan Partin <tristan@neon.tech>
2025-01-29 19:22:01 +00:00
Alex Chi Z.
5bcefb4ee1 fix(pageserver): compaction perftest wrt upper limit (#10564)
## Problem

The config is added in https://github.com/neondatabase/neon/pull/10550
causing behavior change for l0 compaction.

close https://github.com/neondatabase/neon/issues/10562

## Summary of changes

Fix the test case to consider the effect of upper_limit.

Signed-off-by: Alex Chi Z <chi@neon.tech>
2025-01-29 18:43:39 +00:00
Alexey Kondratov
34322b2424 chore(compute): Simplify new compute_ctl metrics and fix flaky test (#10560)
## Problem

1. d04d924 added separate metrics for total requests and failures
separately, but it doesn't make much sense. We could just have a unified
counter with `http_status`.
2. `test_compute_migrations_retry` had a race, i.e., it was waiting for
the last successful migration, not an actual failure. This was revealed
after adding an assert on failure metric in d04d924.

## Summary of changes

1. Switch to unified counters for `compute_ctl` requests.
2. Add a waiting loop into `test_compute_migrations_retry` to eliminate
the race.

Part of neondatabase/cloud#17590
2025-01-29 18:09:25 +00:00
Vlad Lazar
fdfbc7b358 pageserver: hold GC while reading from a timeline (#10559)
## Problem

If we are GC-ing because a new image layer was added while traversing
the timeline, then it will remove layers that are required for
fulfilling the current get request (read-path cannot "look back" and
notice the new image layer).

## Summary of Changes

Prevent GC from progressing on the current timeline while it is being
visited for a read.

Epic: https://github.com/neondatabase/neon/issues/9376
2025-01-29 17:08:25 +00:00
Conrad Ludgate
190c19c034 chore: update rust-postgres on rebase (#10561)
I tried a full update of our tokio-postgres fork before. We hit some
breaking change. This PR only pulls in ~50% of the changes from
upstream: https://github.com/neondatabase/rust-postgres/pull/38.
2025-01-29 17:02:07 +00:00
Mikhail Kot
34e560fe37 download exporters from releases rather than using docker images (#10551)
Use releases for postgres-exporter, pgbouncer-exporter, and sql-exporter
2025-01-29 15:52:00 +00:00
Tristan Partin
7922458b98 Use num_cpus from the workspace in pageserver (#10545)
Luckily they were the same version, so we didn't spend time compiling
two versions, which could have been the case in the future.

Signed-off-by: Tristan Partin <tristan@neon.tech>
2025-01-29 15:45:36 +00:00
a-masterov
34d9e2d8e3 Add a test for GrapgQL (#10156)
## Problem
We currently don't run the tests shipped with `pg_graphql`.
## Summary of changes
The tests for `pg_graphql` are added.
2025-01-29 15:01:56 +00:00
Conrad Ludgate
2f82c21c63 chore: update rust-postgres fork (#10557)
I updated the fork to fix some lints. Cargo keeps getting confused by it
so let's just update the lockfile here
2025-01-29 12:55:24 +00:00
Ivan Efremov
222cc181e9 impr(proxy): Move the CancelMap to Redis hashes (#10364)
## Problem
The approach of having CancelMap as an in-memory structure increases
code complexity,
as well as putting additional load for Redis streams.

## Summary of changes
- Implement a set of KV ops for Redis client;
- Remove cancel notifications code;
- Send KV ops over the bounded channel to the handling background task
for removing and adding the cancel keys.


Closes #9660
2025-01-29 11:19:10 +00:00
alexanderlaw
4d2328ebe3 Fix C code to satisfy sanitizers (#10473) 2025-01-29 10:05:43 +00:00
a-masterov
9f81828429 Test extension upgrade compatibility (#10244)
## Problem
We have to test the extensions, shipped with Neon for compatibility
before the upgrade.
## Summary of changes
Added the test for compatibility with the upgraded extensions.
2025-01-29 09:19:11 +00:00
Arseny Sher
9ab13d6e2c Log statements in test_layer_map (#10554)
## Problem

test_layer_map doesn't log statements and it is not clear how long they
take.

## Summary of changes

Do log them.

ref https://github.com/neondatabase/neon/issues/10409
2025-01-29 09:16:00 +00:00
Alex Chi Z.
983e18e63e feat(pageserver): add compaction_upper_limit config (#10550)
## Problem

Follow-up of the incident, we should not use the same bound on
lower/upper limit of compaction files. This patch adds an upper bound
limit, which is set to 50 for now.

## Summary of changes

Add `compaction_upper_limit`.

---------

Signed-off-by: Alex Chi Z <chi@neon.tech>
Co-authored-by: Christian Schwarz <christian@neon.tech>
2025-01-28 23:18:32 +00:00
Alex Chi Z.
b735df6ff0 fix(pageserver): make image layer generation atomic (#10516)
## Problem

close https://github.com/neondatabase/neon/issues/8362

## Summary of changes

Use `BatchLayerWriter` to ensure we clean up image layers after failed
compaction.

Signed-off-by: Alex Chi Z <chi@neon.tech>
2025-01-28 21:29:51 +00:00
Fedor Dikarev
68cf0ba439 run benchmark tests on small-metal runners (#10549)
## Problem
Ref: https://github.com/neondatabase/cloud/issues/23314

We suspect some inconsistency in Benchmark tests runs could be due to
different type of runners they are landed in.
To have that aligned in both terms: failure rates and benchmark results,
lets run them for now on `small-metal` servers and see the progress for
the tests stability.
 
## Summary of changes
2025-01-28 21:26:38 +00:00
Alexey Kondratov
d04d924649 feat(compute): Add some basic compute_ctl metrics (#10504)
## Problem

There are several parts of `compute_ctl` with a very low visibility of
errors:
1. DB migrations that run async in the background after compute start.
2. Requests made to control plane (currently only `GetSpec`).
3. Requests made to the remote extensions server.

## Summary of changes

Add new counters to quickly evaluate the amount of errors among the
fleet.

Part of neondatabase/cloud#17590
2025-01-28 19:24:07 +00:00
JC Grünhage
f5fdaa6dc6 feat(ci): generate basic release notes with links (#10511)
## Problem
https://github.com/neondatabase/neon/pull/10448 removed release notes,
because if their generation failed, the whole release was failing.
People liked them though, and wanted some basic release notes as a
fall-back instead of completely removing them.

## Summary of changes
Include basic release notes that link to the release PR and to a diff to
the previous release.
2025-01-28 19:13:39 +00:00
Vlad Lazar
c54cd9e76a storcon: signal LSN wait to pageserver during live migration (#10452)
## Problem

We've seen the ingest connection manager get stuck shortly after a
migration.

## Summary of changes

A speculative mitigation is to use the same mechanism as get page
requests for kicking LSN ingest. The connection manager monitors
LSN waits and queries the broker if no updates are received for the
timeline.

Closes https://github.com/neondatabase/neon/issues/10351
2025-01-28 17:33:07 +00:00
Erik Grinaker
1010b8add4 pageserver: add l0_flush_wait_upload setting (#10534)
## Problem

We need a setting to disable the flush upload wait, to test L0 flush
backpressure in staging.

## Summary of changes

Add `l0_flush_wait_upload` setting.
2025-01-28 17:21:05 +00:00
Folke Behrens
ae4b2af299 fix(proxy): Use correct identifier for usage metrics upload (#10538)
## Problem

The request data and usage metrics S3 requests use the same identifier
shown in logs, causing confusion about what type of upload failed.

## Summary of changes

Use the correct identifier for usage metrics uploads.

neondatabase/cloud#23084
2025-01-28 17:08:17 +00:00
Tristan Partin
15fecb8474 Update axum to 0.8.1 (#10332)
Only a few things that needed updating:

- async_trait was removed
- Message::Text takes a Utf8Bytes object instead of a String

Signed-off-by: Tristan Partin <tristan@neon.tech>
Co-authored-by: Conrad Ludgate <connor@neon.tech>
2025-01-28 15:32:59 +00:00
Erik Grinaker
47677ba578 pageserver: disable L0 backpressure by default (#10535)
## Problem

We'll need further improvements to compaction before enabling L0 flush
backpressure by default. See:
https://neondb.slack.com/archives/C033RQ5SPDH/p1738066068960519?thread_ts=1737818888.474179&cid=C033RQ5SPDH.

Touches #5415.

## Summary of changes

Disable `l0_flush_delay_threshold` by default.
2025-01-28 14:51:30 +00:00
Arpad Müller
83b6bfa229 Re-download layer if its local and on-disk metadata diverge (#10529)
In #10308, we noticed many warnings about the local layer having
different sizes on-disk compared to the metadata.

However, the layer downloader would never redownload layer files if the
sizes or generation numbers change. This is obviously a bug, which we
aim to fix with this PR.

This change also moves the code deciding what to do about a layer to a
dedicated function: before we handled the "routing" via control flow,
but now it's become too complicated and it is nicer to have the
different verdicts for a layer spelled out in a list/match.
2025-01-28 13:39:53 +00:00
Erik Grinaker
ed942b05f7 Revert "pageserver: revert flush backpressure" (#10402)" (#10533)
This reverts commit 9e55d79803.

We'll still need this until we can tune L0 flush backpressure and
compaction. I'll add a setting to disable this separately.
2025-01-28 13:33:58 +00:00
Vlad Lazar
62a717a2ca pageserver: use PS node id for SK appname (#10522)
## Problem

This one is fairly embarrassing. Safekeeper node id was used in the
pageserver application name
when connecting to safekeepers.

## Summary of changes

Use the right node id.

Closes https://github.com/neondatabase/neon/issues/10461
2025-01-28 13:11:51 +00:00
Peter Bendel
c8fbbb9b65 Test ingest_benchmark with different stripe size and also PostgreSQL version 17 (#10510)
We want to verify if pageserver stripe size has an impact on ingest
performance.
We want to verify if ingest performance has improved or regressed with
postgres version 17.

## Summary of changes

- Allow to create new project with different postgres versions
- allow to pre-shard new project with different stripe sizes instead of
relying on storage manager to shard_split the project once a threshold
is exceeded

Replaces https://github.com/neondatabase/neon/pull/10509

Test run https://github.com/neondatabase/neon/actions/runs/12986410381
2025-01-27 21:06:05 +00:00
John Spray
d73f4a6470 pageserver: retry wrapper on manifest upload (#10524)
## Problem

On remote storage errors (e.g. I/O timeout) uploading tenant manifest,
all of compaction could fail. This is a problem IRL because we shouldn't
abort compaction on a single IO error, and in tests because it generates
spurious failures.

Related:
https://github.com/orgs/neondatabase/projects/51/views/2?sliceBy%5Bvalue%5D=jcsp&pane=issue&itemId=93692919&issue=neondatabase%7Cneon%7C10389

## Summary of changes

- Use `backoff::retry` when uploading tenant manifest
2025-01-27 21:02:25 +00:00
Heikki Linnakangas
5477d7db93 fast_import: fixes for Postgres v17 (#10414)
Now that the tests are run on v17, they're also run in debug mode, which
is slow. Increase statement_timeout in the test to work around that.
2025-01-27 19:47:49 +00:00
Arpad Müller
eb9832d846 Remove PQ_LIB_DIR env var (#10526)
We now don't need libpq any more for the build of the storage
controller, as we use `diesel-async` since #10280. Therefore, we remove
the env var that gave cargo/rustc the location for libpq.

Follow-up of #10280
2025-01-27 19:38:18 +00:00
Christian Schwarz
3d36dfe533 fix: noisy broker subscription failed error during storage broker deploys (#10521)
During broker deploys, pageservers log this noisy WARN en masse.

I can trivially reproduce the WARN message in neon_local by SIGKILLing
broker during e.g. `pgbench -i`.

I don't understand why tonic is not detecting the error as
`Code::Unavailable`.

Until we find time to understand that / fix upstream, this PR adds the
error message to the existing list of known error messages that get
demoted to INFO level.

Refs:
-  refs https://github.com/neondatabase/neon/issues/9562
2025-01-27 19:19:55 +00:00
John Spray
ebf44210ba remote_storage: less sensitive timeout logging in ABS listings (#10518)
## Problem

We were logging a warning after a single request timeout, while listing
objects.

Closes: https://github.com/neondatabase/neon/issues/10166

## Summary of changes

- These timeouts are a pretty normal part of life, so back it off to
only log a warning after two in a row.
2025-01-27 17:44:18 +00:00
John Spray
aabf455dfb README: clarify that neon_local is a dev/test tool (#10512)
## Problem

From time to time, folks discover our `control_plane/` folder and make
the (reasonable) mistake of thinking it's a tool for running full-sized
Neon systems, whereas in reality it is a tool for dev/test.

## Summary of changes

- Change control_plane's readme title to "Local Development Control
Plane (`neon_local`)`
- Change "Running local installation" to "Running a local development
environment" in the main readme
2025-01-27 17:24:42 +00:00
John Spray
aec92bfc34 pageserver: decrease utilization MAX_SHARDS (#10489)
## Problem

The intent of this parameter is to have pageservers consider themselves
"full" if they've got lots of shards, even if they have plenty of
capacity. It works, but because we typically successfully oversubscribe
capacity up to 200%, the MAX_SHARDS limit is effectively doubled, so
this 20,000 value ends up meaning 40,000, whereas the original intent
was to limit nodes to ~10000 shards.

## Summary of changes

- Change MAX_SHARDS to 5000, so that a node with 5000 will get a 100%
utilization, which is equivalent in practice to being considered "half
full" by the storage controller in capacity terms.

This is all a bit subtle and indiret. Originally the limit was baked
into the pageserver with the idea that the pageserver knows better what
its own resources tolerate than the storage controller does, but in
practice it would be probably be easier to understand all this if we
just did it controller-side. So there's scope to refactor here in
future.
2025-01-27 17:03:32 +00:00
311 changed files with 12711 additions and 6280 deletions

View File

@@ -24,3 +24,4 @@
!storage_controller/
!vendor/postgres-*/
!workspace_hack/
!build_tools/patches

View File

@@ -3,6 +3,7 @@ name: Bug Template
about: Used for describing bugs
title: ''
labels: t/bug
type: Bug
assignees: ''
---

View File

@@ -4,6 +4,7 @@ about: A set of related tasks contributing towards specific outcome, comprising
more than 1 week of work.
title: 'Epic: '
labels: t/Epic
type: Epic
assignees: ''
---

View File

@@ -4,6 +4,7 @@ self-hosted-runner:
- large
- large-arm64
- small
- small-metal
- small-arm64
- us-east-2
config-variables:
@@ -26,3 +27,4 @@ config-variables:
- SLACK_ON_CALL_QA_STAGING_STREAM
- DEV_AWS_OIDC_ROLE_MANAGE_BENCHMARK_EC2_VMS_ARN
- SLACK_ON_CALL_STORAGE_STAGING_STREAM
- SLACK_CICD_CHANNEL_ID

View File

@@ -17,6 +17,34 @@ inputs:
compute_units:
description: '[Min, Max] compute units'
default: '[1, 1]'
# settings below only needed if you want the project to be sharded from the beginning
shard_split_project:
description: 'by default new projects are not shard-split, specify true to shard-split'
required: false
default: 'false'
admin_api_key:
description: 'Admin API Key needed for shard-splitting. Must be specified if shard_split_project is true'
required: false
shard_count:
description: 'Number of shards to split the project into, only applies if shard_split_project is true'
required: false
default: '8'
stripe_size:
description: 'Stripe size, optional, in 8kiB pages. e.g. set 2048 for 16MB stripes. Default is 128 MiB, only applies if shard_split_project is true'
required: false
default: '32768'
psql_path:
description: 'Path to psql binary - it is caller responsibility to provision the psql binary'
required: false
default: '/tmp/neon/pg_install/v16/bin/psql'
libpq_lib_path:
description: 'Path to directory containing libpq library - it is caller responsibility to provision the libpq library'
required: false
default: '/tmp/neon/pg_install/v16/lib'
project_settings:
description: 'A JSON object with project settings'
required: false
default: '{}'
outputs:
dsn:
@@ -48,7 +76,7 @@ runs:
\"provisioner\": \"k8s-neonvm\",
\"autoscaling_limit_min_cu\": ${MIN_CU},
\"autoscaling_limit_max_cu\": ${MAX_CU},
\"settings\": { }
\"settings\": ${PROJECT_SETTINGS}
}
}")
@@ -63,6 +91,23 @@ runs:
echo "project_id=${project_id}" >> $GITHUB_OUTPUT
echo "Project ${project_id} has been created"
if [ "${SHARD_SPLIT_PROJECT}" = "true" ]; then
# determine tenant ID
TENANT_ID=`${PSQL} ${dsn} -t -A -c "SHOW neon.tenant_id"`
echo "Splitting project ${project_id} with tenant_id ${TENANT_ID} into $((SHARD_COUNT)) shards with stripe size $((STRIPE_SIZE))"
echo "Sending PUT request to https://${API_HOST}/regions/${REGION_ID}/api/v1/admin/storage/proxy/control/v1/tenant/${TENANT_ID}/shard_split"
echo "with body {\"new_shard_count\": $((SHARD_COUNT)), \"new_stripe_size\": $((STRIPE_SIZE))}"
# we need an ADMIN API KEY to invoke storage controller API for shard splitting (bash -u above checks that the variable is set)
curl -X PUT \
"https://${API_HOST}/regions/${REGION_ID}/api/v1/admin/storage/proxy/control/v1/tenant/${TENANT_ID}/shard_split" \
-H "Accept: application/json" -H "Content-Type: application/json" -H "Authorization: Bearer ${ADMIN_API_KEY}" \
-d "{\"new_shard_count\": $SHARD_COUNT, \"new_stripe_size\": $STRIPE_SIZE}"
fi
env:
API_HOST: ${{ inputs.api_host }}
API_KEY: ${{ inputs.api_key }}
@@ -70,3 +115,10 @@ runs:
POSTGRES_VERSION: ${{ inputs.postgres_version }}
MIN_CU: ${{ fromJSON(inputs.compute_units)[0] }}
MAX_CU: ${{ fromJSON(inputs.compute_units)[1] }}
SHARD_SPLIT_PROJECT: ${{ inputs.shard_split_project }}
ADMIN_API_KEY: ${{ inputs.admin_api_key }}
SHARD_COUNT: ${{ inputs.shard_count }}
STRIPE_SIZE: ${{ inputs.stripe_size }}
PSQL: ${{ inputs.psql_path }}
LD_LIBRARY_PATH: ${{ inputs.libpq_lib_path }}
PROJECT_SETTINGS: ${{ inputs.project_settings }}

View File

@@ -121,6 +121,8 @@ runs:
export DEFAULT_PG_VERSION=${PG_VERSION#v}
export LD_LIBRARY_PATH=${POSTGRES_DISTRIB_DIR}/v${DEFAULT_PG_VERSION}/lib
export BENCHMARK_CONNSTR=${BENCHMARK_CONNSTR:-}
export ASAN_OPTIONS=detect_leaks=0:detect_stack_use_after_return=0:abort_on_error=1:strict_string_checks=1:check_initialization_order=1:strict_init_order=1
export UBSAN_OPTIONS=abort_on_error=1:print_stacktrace=1
if [ "${BUILD_TYPE}" = "remote" ]; then
export REMOTE_ENV=1

View File

@@ -1,4 +1,5 @@
rust_code: ['**/*.rs', '**/Cargo.toml', '**/Cargo.lock']
rust_dependencies: ['**/Cargo.lock']
v14: ['vendor/postgres-v14/**', 'Makefile', 'pgxn/**']
v15: ['vendor/postgres-v15/**', 'Makefile', 'pgxn/**']

View File

@@ -17,7 +17,7 @@ jobs:
strategy:
fail-fast: false
matrix:
platform: [ aws-rds-postgres, aws-aurora-serverless-v2-postgres, neon ]
platform: [ aws-rds-postgres, aws-aurora-serverless-v2-postgres, neon, neon_pg17 ]
database: [ clickbench, tpch, userexample ]
env:
@@ -41,6 +41,9 @@ jobs:
neon)
CONNSTR=${{ secrets.BENCHMARK_CAPTEST_CONNSTR }}
;;
neon_pg17)
CONNSTR=${{ secrets.BENCHMARK_CAPTEST_CONNSTR_PG17 }}
;;
aws-rds-postgres)
CONNSTR=${{ secrets.BENCHMARK_RDS_POSTGRES_CONNSTR }}
;;

View File

@@ -23,6 +23,11 @@ on:
description: 'a json object of postgres versions and lfc states to run regression tests on'
required: true
type: string
sanitizers:
description: 'enabled or disabled'
required: false
default: 'disabled'
type: string
defaults:
run:
@@ -87,6 +92,7 @@ jobs:
- name: Set env variables
env:
ARCH: ${{ inputs.arch }}
SANITIZERS: ${{ inputs.sanitizers }}
run: |
CARGO_FEATURES="--features testing"
if [[ $BUILD_TYPE == "debug" && $ARCH == 'x64' ]]; then
@@ -99,8 +105,14 @@ jobs:
cov_prefix=""
CARGO_FLAGS="--locked --release"
fi
if [[ $SANITIZERS == 'enabled' ]]; then
make_vars="WITH_SANITIZERS=yes"
else
make_vars=""
fi
{
echo "cov_prefix=${cov_prefix}"
echo "make_vars=${make_vars}"
echo "CARGO_FEATURES=${CARGO_FEATURES}"
echo "CARGO_FLAGS=${CARGO_FLAGS}"
echo "CARGO_HOME=${GITHUB_WORKSPACE}/.cargo"
@@ -136,37 +148,39 @@ jobs:
- name: Build postgres v14
if: steps.cache_pg_14.outputs.cache-hit != 'true'
run: mold -run make postgres-v14 -j$(nproc)
run: mold -run make ${make_vars} postgres-v14 -j$(nproc)
- name: Build postgres v15
if: steps.cache_pg_15.outputs.cache-hit != 'true'
run: mold -run make postgres-v15 -j$(nproc)
run: mold -run make ${make_vars} postgres-v15 -j$(nproc)
- name: Build postgres v16
if: steps.cache_pg_16.outputs.cache-hit != 'true'
run: mold -run make postgres-v16 -j$(nproc)
run: mold -run make ${make_vars} postgres-v16 -j$(nproc)
- name: Build postgres v17
if: steps.cache_pg_17.outputs.cache-hit != 'true'
run: mold -run make postgres-v17 -j$(nproc)
run: mold -run make ${make_vars} postgres-v17 -j$(nproc)
- name: Build neon extensions
run: mold -run make neon-pg-ext -j$(nproc)
run: mold -run make ${make_vars} neon-pg-ext -j$(nproc)
- name: Build walproposer-lib
run: mold -run make walproposer-lib -j$(nproc)
run: mold -run make ${make_vars} walproposer-lib -j$(nproc)
- name: Run cargo build
env:
WITH_TESTS: ${{ inputs.sanitizers != 'enabled' && '--tests' || '' }}
run: |
PQ_LIB_DIR=$(pwd)/pg_install/v16/lib
export PQ_LIB_DIR
${cov_prefix} mold -run cargo build $CARGO_FLAGS $CARGO_FEATURES --bins --tests
export ASAN_OPTIONS=detect_leaks=0
${cov_prefix} mold -run cargo build $CARGO_FLAGS $CARGO_FEATURES --bins ${WITH_TESTS}
# Do install *before* running rust tests because they might recompile the
# binaries with different features/flags.
- name: Install rust binaries
env:
ARCH: ${{ inputs.arch }}
SANITIZERS: ${{ inputs.sanitizers }}
run: |
# Install target binaries
mkdir -p /tmp/neon/bin/
@@ -181,7 +195,7 @@ jobs:
done
# Install test executables and write list of all binaries (for code coverage)
if [[ $BUILD_TYPE == "debug" && $ARCH == 'x64' ]]; then
if [[ $BUILD_TYPE == "debug" && $ARCH == 'x64' && $SANITIZERS != 'enabled' ]]; then
# Keep bloated coverage data files away from the rest of the artifact
mkdir -p /tmp/coverage/
@@ -214,11 +228,10 @@ jobs:
role-duration-seconds: 18000 # 5 hours
- name: Run rust tests
if: ${{ inputs.sanitizers != 'enabled' }}
env:
NEXTEST_RETRIES: 3
run: |
PQ_LIB_DIR=$(pwd)/pg_install/v16/lib
export PQ_LIB_DIR
LD_LIBRARY_PATH=$(pwd)/pg_install/v17/lib
export LD_LIBRARY_PATH
@@ -271,6 +284,27 @@ jobs:
path: /tmp/neon
aws-oicd-role-arn: ${{ vars.DEV_AWS_OIDC_ROLE_ARN }}
- name: Check diesel schema
if: inputs.build-type == 'release' && inputs.arch == 'x64'
env:
DATABASE_URL: postgresql://localhost:1235/storage_controller
POSTGRES_DISTRIB_DIR: /tmp/neon/pg_install
run: |
export ASAN_OPTIONS=detect_leaks=0
/tmp/neon/bin/neon_local init
/tmp/neon/bin/neon_local storage_controller start
diesel print-schema > storage_controller/src/schema.rs
if [ -n "$(git diff storage_controller/src/schema.rs)" ]; then
echo >&2 "Uncommitted changes in diesel schema"
git diff .
exit 1
fi
/tmp/neon/bin/neon_local storage_controller stop
# XXX: keep this after the binaries.list is formed, so the coverage can properly work later
- name: Merge and upload coverage data
if: inputs.build-type == 'debug'
@@ -303,7 +337,7 @@ jobs:
- name: Pytest regression tests
continue-on-error: ${{ matrix.lfc_state == 'with-lfc' && inputs.build-type == 'debug' }}
uses: ./.github/actions/run-python-test-set
timeout-minutes: 60
timeout-minutes: ${{ inputs.sanitizers != 'enabled' && 60 || 180 }}
with:
build_type: ${{ inputs.build-type }}
test_selection: regress
@@ -321,6 +355,7 @@ jobs:
PAGESERVER_VIRTUAL_FILE_IO_ENGINE: tokio-epoll-uring
PAGESERVER_GET_VECTORED_CONCURRENT_IO: sidecar-task
USE_LFC: ${{ matrix.lfc_state == 'with-lfc' && 'true' || 'false' }}
SANITIZERS: ${{ inputs.sanitizers }}
# Temporary disable this step until we figure out why it's so flaky
# Ref https://github.com/neondatabase/neon/issues/4540

View File

@@ -16,6 +16,9 @@ defaults:
run:
shell: bash -euxo pipefail {0}
# No permission for GITHUB_TOKEN by default; the **minimal required** set of permissions should be granted in each job.
permissions: {}
jobs:
check-codestyle-rust:
strategy:
@@ -84,8 +87,3 @@ jobs:
run: |
cargo hakari generate --diff # workspace-hack Cargo.toml is up-to-date
cargo hakari manage-deps --dry-run # all workspace crates depend on workspace-hack
# https://github.com/EmbarkStudios/cargo-deny
- name: Check rust licenses/bans/advisories/sources
if: ${{ !cancelled() }}
run: cargo deny check --hide-inclusion-graph

View File

@@ -67,9 +67,9 @@ jobs:
- uses: actions/checkout@v4
with:
ref: main
ref: ${{ github.event.pull_request.head.sha }}
token: ${{ secrets.CI_ACCESS_TOKEN }}
- name: Look for existing PR
id: get-pr
env:
@@ -77,7 +77,7 @@ jobs:
run: |
ALREADY_CREATED="$(gh pr --repo ${GITHUB_REPOSITORY} list --head ${BRANCH} --base main --json number --jq '.[].number')"
echo "ALREADY_CREATED=${ALREADY_CREATED}" >> ${GITHUB_OUTPUT}
- name: Get changed labels
id: get-labels
if: steps.get-pr.outputs.ALREADY_CREATED != ''
@@ -94,8 +94,6 @@ jobs:
echo "LABELS_TO_ADD=${LABELS_TO_ADD}" >> ${GITHUB_OUTPUT}
echo "LABELS_TO_REMOVE=${LABELS_TO_REMOVE}" >> ${GITHUB_OUTPUT}
- run: gh pr checkout "${PR_NUMBER}"
- run: git checkout -b "${BRANCH}"
- run: git push --force origin "${BRANCH}"
@@ -103,7 +101,7 @@ jobs:
- name: Create a Pull Request for CI run (if required)
if: steps.get-pr.outputs.ALREADY_CREATED == ''
env:
env:
GH_TOKEN: ${{ secrets.CI_ACCESS_TOKEN }}
run: |
cat << EOF > body.md
@@ -140,7 +138,7 @@ jobs:
- run: git push --force origin "${BRANCH}"
if: steps.get-pr.outputs.ALREADY_CREATED != ''
cleanup:
# Close PRs and delete branchs if the original PR is closed.

View File

@@ -63,11 +63,15 @@ jobs:
fail-fast: false
matrix:
include:
- DEFAULT_PG_VERSION: 16
- PG_VERSION: 16
PLATFORM: "neon-staging"
region_id: ${{ github.event.inputs.region_id || 'aws-us-east-2' }}
RUNNER: [ self-hosted, us-east-2, x64 ]
- DEFAULT_PG_VERSION: 16
- PG_VERSION: 17
PLATFORM: "neon-staging"
region_id: ${{ github.event.inputs.region_id || 'aws-us-east-2' }}
RUNNER: [ self-hosted, us-east-2, x64 ]
- PG_VERSION: 16
PLATFORM: "azure-staging"
region_id: 'azure-eastus2'
RUNNER: [ self-hosted, eastus2, x64 ]
@@ -75,7 +79,7 @@ jobs:
TEST_PG_BENCH_DURATIONS_MATRIX: "300"
TEST_PG_BENCH_SCALES_MATRIX: "10,100"
POSTGRES_DISTRIB_DIR: /tmp/neon/pg_install
DEFAULT_PG_VERSION: ${{ matrix.DEFAULT_PG_VERSION }}
PG_VERSION: ${{ matrix.PG_VERSION }}
TEST_OUTPUT: /tmp/test_output
BUILD_TYPE: remote
SAVE_PERF_REPORT: ${{ github.event.inputs.save_perf_report || ( github.ref_name == 'main' ) }}
@@ -112,7 +116,7 @@ jobs:
uses: ./.github/actions/neon-project-create
with:
region_id: ${{ matrix.region_id }}
postgres_version: ${{ env.DEFAULT_PG_VERSION }}
postgres_version: ${{ env.PG_VERSION }}
api_key: ${{ secrets.NEON_STAGING_API_KEY }}
- name: Run benchmark
@@ -122,7 +126,7 @@ jobs:
test_selection: performance
run_in_parallel: false
save_perf_report: ${{ env.SAVE_PERF_REPORT }}
pg_version: ${{ env.DEFAULT_PG_VERSION }}
pg_version: ${{ env.PG_VERSION }}
aws-oicd-role-arn: ${{ vars.DEV_AWS_OIDC_ROLE_ARN }}
# Set --sparse-ordering option of pytest-order plugin
# to ensure tests are running in order of appears in the file.
@@ -313,7 +317,11 @@ jobs:
{ "pg_version": 16, "region_id": "azure-eastus2", "platform": "neonvm-azure-captest-freetier", "db_size": "3gb" ,"runner": '"$runner_azure"', "image": "neondatabase/build-tools:pinned-bookworm" },
{ "pg_version": 16, "region_id": "azure-eastus2", "platform": "neonvm-azure-captest-new", "db_size": "10gb","runner": '"$runner_azure"', "image": "neondatabase/build-tools:pinned-bookworm" },
{ "pg_version": 16, "region_id": "azure-eastus2", "platform": "neonvm-azure-captest-new", "db_size": "50gb","runner": '"$runner_azure"', "image": "neondatabase/build-tools:pinned-bookworm" },
{ "pg_version": 16, "region_id": "'"$region_id_default"'", "platform": "neonvm-captest-sharding-reuse", "db_size": "50gb","runner": '"$runner_default"', "image": "'"$image_default"'" }]
{ "pg_version": 16, "region_id": "'"$region_id_default"'", "platform": "neonvm-captest-sharding-reuse", "db_size": "50gb","runner": '"$runner_default"', "image": "'"$image_default"'" },
{ "pg_version": 17, "region_id": "'"$region_id_default"'", "platform": "neonvm-captest-freetier", "db_size": "3gb" ,"runner": '"$runner_default"', "image": "'"$image_default"'" },
{ "pg_version": 17, "region_id": "'"$region_id_default"'", "platform": "neonvm-captest-new", "db_size": "10gb","runner": '"$runner_default"', "image": "'"$image_default"'" },
{ "pg_version": 17, "region_id": "'"$region_id_default"'", "platform": "neonvm-captest-new-many-tables","db_size": "10gb","runner": '"$runner_default"', "image": "'"$image_default"'" },
{ "pg_version": 17, "region_id": "'"$region_id_default"'", "platform": "neonvm-captest-new", "db_size": "50gb","runner": '"$runner_default"', "image": "'"$image_default"'" }]
}'
if [ "$(date +%A)" = "Saturday" ] || [ ${RUN_AWS_RDS_AND_AURORA} = "true" ]; then
@@ -329,12 +337,15 @@ jobs:
matrix='{
"platform": [
"neonvm-captest-reuse"
],
"pg_version" : [
16,17
]
}'
if [ "$(date +%A)" = "Saturday" ] || [ ${RUN_AWS_RDS_AND_AURORA} = "true" ]; then
matrix=$(echo "$matrix" | jq '.include += [{ "platform": "rds-postgres" },
{ "platform": "rds-aurora" }]')
matrix=$(echo "$matrix" | jq '.include += [{ "pg_version": 16, "platform": "rds-postgres" },
{ "pg_version": 16, "platform": "rds-aurora" }]')
fi
echo "matrix=$(echo "$matrix" | jq --compact-output '.')" >> $GITHUB_OUTPUT
@@ -346,14 +357,14 @@ jobs:
"platform": [
"neonvm-captest-reuse"
],
"scale": [
"10"
"pg_version" : [
16,17
]
}'
if [ "$(date +%A)" = "Saturday" ] || [ ${RUN_AWS_RDS_AND_AURORA} = "true" ]; then
matrix=$(echo "$matrix" | jq '.include += [{ "platform": "rds-postgres", "scale": "10" },
{ "platform": "rds-aurora", "scale": "10" }]')
matrix=$(echo "$matrix" | jq '.include += [{ "pg_version": 16, "platform": "rds-postgres" },
{ "pg_version": 16, "platform": "rds-aurora" }]')
fi
echo "matrix=$(echo "$matrix" | jq --compact-output '.')" >> $GITHUB_OUTPUT
@@ -378,7 +389,7 @@ jobs:
TEST_PG_BENCH_DURATIONS_MATRIX: "60m"
TEST_PG_BENCH_SCALES_MATRIX: ${{ matrix.db_size }}
POSTGRES_DISTRIB_DIR: /tmp/neon/pg_install
DEFAULT_PG_VERSION: ${{ matrix.pg_version }}
PG_VERSION: ${{ matrix.pg_version }}
TEST_OUTPUT: /tmp/test_output
BUILD_TYPE: remote
SAVE_PERF_REPORT: ${{ github.event.inputs.save_perf_report || ( github.ref_name == 'main' ) }}
@@ -416,7 +427,7 @@ jobs:
uses: ./.github/actions/neon-project-create
with:
region_id: ${{ matrix.region_id }}
postgres_version: ${{ env.DEFAULT_PG_VERSION }}
postgres_version: ${{ env.PG_VERSION }}
api_key: ${{ secrets.NEON_STAGING_API_KEY }}
compute_units: ${{ (contains(matrix.platform, 'captest-freetier') && '[0.25, 0.25]') || '[1, 1]' }}
@@ -447,7 +458,7 @@ jobs:
echo "connstr=${CONNSTR}" >> $GITHUB_OUTPUT
# we want to compare Neon project OLTP throughput and latency at scale factor 10 GB
# we want to compare Neon project OLTP throughput and latency at scale factor 10 GB
# without (neonvm-captest-new)
# and with (neonvm-captest-new-many-tables) many relations in the database
- name: Create many relations before the run
@@ -459,7 +470,7 @@ jobs:
run_in_parallel: false
save_perf_report: ${{ env.SAVE_PERF_REPORT }}
extra_params: -m remote_cluster --timeout 21600 -k test_perf_many_relations
pg_version: ${{ env.DEFAULT_PG_VERSION }}
pg_version: ${{ env.PG_VERSION }}
aws-oicd-role-arn: ${{ vars.DEV_AWS_OIDC_ROLE_ARN }}
env:
BENCHMARK_CONNSTR: ${{ steps.set-up-connstr.outputs.connstr }}
@@ -475,7 +486,7 @@ jobs:
run_in_parallel: false
save_perf_report: ${{ env.SAVE_PERF_REPORT }}
extra_params: -m remote_cluster --timeout 21600 -k test_pgbench_remote_init
pg_version: ${{ env.DEFAULT_PG_VERSION }}
pg_version: ${{ env.PG_VERSION }}
aws-oicd-role-arn: ${{ vars.DEV_AWS_OIDC_ROLE_ARN }}
env:
BENCHMARK_CONNSTR: ${{ steps.set-up-connstr.outputs.connstr }}
@@ -490,7 +501,7 @@ jobs:
run_in_parallel: false
save_perf_report: ${{ env.SAVE_PERF_REPORT }}
extra_params: -m remote_cluster --timeout 21600 -k test_pgbench_remote_simple_update
pg_version: ${{ env.DEFAULT_PG_VERSION }}
pg_version: ${{ env.PG_VERSION }}
aws-oicd-role-arn: ${{ vars.DEV_AWS_OIDC_ROLE_ARN }}
env:
BENCHMARK_CONNSTR: ${{ steps.set-up-connstr.outputs.connstr }}
@@ -505,7 +516,7 @@ jobs:
run_in_parallel: false
save_perf_report: ${{ env.SAVE_PERF_REPORT }}
extra_params: -m remote_cluster --timeout 21600 -k test_pgbench_remote_select_only
pg_version: ${{ env.DEFAULT_PG_VERSION }}
pg_version: ${{ env.PG_VERSION }}
aws-oicd-role-arn: ${{ vars.DEV_AWS_OIDC_ROLE_ARN }}
env:
BENCHMARK_CONNSTR: ${{ steps.set-up-connstr.outputs.connstr }}
@@ -549,14 +560,19 @@ jobs:
include:
- PLATFORM: "neonvm-captest-pgvector"
RUNNER: [ self-hosted, us-east-2, x64 ]
postgres_version: 16
- PLATFORM: "neonvm-captest-pgvector-pg17"
RUNNER: [ self-hosted, us-east-2, x64 ]
postgres_version: 17
- PLATFORM: "azure-captest-pgvector"
RUNNER: [ self-hosted, eastus2, x64 ]
postgres_version: 16
env:
TEST_PG_BENCH_DURATIONS_MATRIX: "15m"
TEST_PG_BENCH_SCALES_MATRIX: "1"
POSTGRES_DISTRIB_DIR: /tmp/neon/pg_install
DEFAULT_PG_VERSION: 16
PG_VERSION: ${{ matrix.postgres_version }}
TEST_OUTPUT: /tmp/test_output
BUILD_TYPE: remote
@@ -574,32 +590,20 @@ jobs:
steps:
- uses: actions/checkout@v4
# until https://github.com/neondatabase/neon/issues/8275 is fixed we temporarily install postgresql-16
# instead of using Neon artifacts containing pgbench
- name: Install postgresql-16 where pytest expects it
run: |
# Just to make it easier to test things locally on macOS (with arm64)
arch=$(uname -m | sed 's/x86_64/amd64/g' | sed 's/aarch64/arm64/g')
- name: Configure AWS credentials
uses: aws-actions/configure-aws-credentials@v4
with:
aws-region: eu-central-1
role-to-assume: ${{ vars.DEV_AWS_OIDC_ROLE_ARN }}
role-duration-seconds: 18000 # 5 hours
cd /home/nonroot
wget -q "https://apt.postgresql.org/pub/repos/apt/pool/main/p/postgresql-17/libpq5_17.2-1.pgdg120+1_${arch}.deb"
wget -q "https://apt.postgresql.org/pub/repos/apt/pool/main/p/postgresql-16/postgresql-client-16_16.6-1.pgdg120+1_${arch}.deb"
wget -q "https://apt.postgresql.org/pub/repos/apt/pool/main/p/postgresql-16/postgresql-16_16.6-1.pgdg120+1_${arch}.deb"
dpkg -x libpq5_17.2-1.pgdg120+1_${arch}.deb pg
dpkg -x postgresql-16_16.6-1.pgdg120+1_${arch}.deb pg
dpkg -x postgresql-client-16_16.6-1.pgdg120+1_${arch}.deb pg
mkdir -p /tmp/neon/pg_install/v16/bin
ln -s /home/nonroot/pg/usr/lib/postgresql/16/bin/pgbench /tmp/neon/pg_install/v16/bin/pgbench
ln -s /home/nonroot/pg/usr/lib/postgresql/16/bin/psql /tmp/neon/pg_install/v16/bin/psql
ln -s /home/nonroot/pg/usr/lib/$(uname -m)-linux-gnu /tmp/neon/pg_install/v16/lib
LD_LIBRARY_PATH="/home/nonroot/pg/usr/lib/$(uname -m)-linux-gnu:${LD_LIBRARY_PATH:-}"
export LD_LIBRARY_PATH
echo "LD_LIBRARY_PATH=${LD_LIBRARY_PATH}" >> ${GITHUB_ENV}
/tmp/neon/pg_install/v16/bin/pgbench --version
/tmp/neon/pg_install/v16/bin/psql --version
- name: Download Neon artifact
uses: ./.github/actions/download
with:
name: neon-${{ runner.os }}-${{ runner.arch }}-release-artifact
path: /tmp/neon/
prefix: latest
aws-oicd-role-arn: ${{ vars.DEV_AWS_OIDC_ROLE_ARN }}
- name: Set up Connection String
id: set-up-connstr
@@ -608,6 +612,9 @@ jobs:
neonvm-captest-pgvector)
CONNSTR=${{ secrets.BENCHMARK_PGVECTOR_CONNSTR }}
;;
neonvm-captest-pgvector-pg17)
CONNSTR=${{ secrets.BENCHMARK_PGVECTOR_CONNSTR_PG17 }}
;;
azure-captest-pgvector)
CONNSTR=${{ secrets.BENCHMARK_PGVECTOR_CONNSTR_AZURE }}
;;
@@ -619,13 +626,6 @@ jobs:
echo "connstr=${CONNSTR}" >> $GITHUB_OUTPUT
- name: Configure AWS credentials
uses: aws-actions/configure-aws-credentials@v4
with:
aws-region: eu-central-1
role-to-assume: ${{ vars.DEV_AWS_OIDC_ROLE_ARN }}
role-duration-seconds: 18000 # 5 hours
- name: Benchmark pgvector hnsw indexing
uses: ./.github/actions/run-python-test-set
with:
@@ -634,7 +634,7 @@ jobs:
run_in_parallel: false
save_perf_report: ${{ env.SAVE_PERF_REPORT }}
extra_params: -m remote_cluster --timeout 21600 -k test_pgvector_indexing
pg_version: ${{ env.DEFAULT_PG_VERSION }}
pg_version: ${{ env.PG_VERSION }}
aws-oicd-role-arn: ${{ vars.DEV_AWS_OIDC_ROLE_ARN }}
env:
VIP_VAP_ACCESS_TOKEN: "${{ secrets.VIP_VAP_ACCESS_TOKEN }}"
@@ -649,7 +649,7 @@ jobs:
run_in_parallel: false
save_perf_report: ${{ env.SAVE_PERF_REPORT }}
extra_params: -m remote_cluster --timeout 21600
pg_version: ${{ env.DEFAULT_PG_VERSION }}
pg_version: ${{ env.PG_VERSION }}
aws-oicd-role-arn: ${{ vars.DEV_AWS_OIDC_ROLE_ARN }}
env:
BENCHMARK_CONNSTR: ${{ steps.set-up-connstr.outputs.connstr }}
@@ -696,7 +696,7 @@ jobs:
env:
POSTGRES_DISTRIB_DIR: /tmp/neon/pg_install
DEFAULT_PG_VERSION: 16
PG_VERSION: ${{ matrix.pg_version }}
TEST_OUTPUT: /tmp/test_output
TEST_OLAP_COLLECT_EXPLAIN: ${{ github.event.inputs.collect_olap_explain }}
TEST_OLAP_COLLECT_PG_STAT_STATEMENTS: ${{ github.event.inputs.collect_pg_stat_statements }}
@@ -739,7 +739,18 @@ jobs:
run: |
case "${PLATFORM}" in
neonvm-captest-reuse)
CONNSTR=${{ secrets.BENCHMARK_CAPTEST_CLICKBENCH_10M_CONNSTR }}
case "${PG_VERSION}" in
16)
CONNSTR=${{ secrets.BENCHMARK_CAPTEST_CLICKBENCH_10M_CONNSTR }}
;;
17)
CONNSTR=${{ secrets.BENCHMARK_CAPTEST_CLICKBENCH_CONNSTR_PG17 }}
;;
*)
echo >&2 "Unsupported PG_VERSION=${PG_VERSION} for PLATFORM=${PLATFORM}"
exit 1
;;
esac
;;
rds-aurora)
CONNSTR=${{ secrets.BENCHMARK_RDS_AURORA_CLICKBENCH_10M_CONNSTR }}
@@ -763,7 +774,7 @@ jobs:
run_in_parallel: false
save_perf_report: ${{ env.SAVE_PERF_REPORT }}
extra_params: -m remote_cluster --timeout 43200 -k test_clickbench
pg_version: ${{ env.DEFAULT_PG_VERSION }}
pg_version: ${{ env.PG_VERSION }}
aws-oicd-role-arn: ${{ vars.DEV_AWS_OIDC_ROLE_ARN }}
env:
VIP_VAP_ACCESS_TOKEN: "${{ secrets.VIP_VAP_ACCESS_TOKEN }}"
@@ -799,7 +810,7 @@ jobs:
# We might change it after https://github.com/neondatabase/neon/issues/2900.
#
# *_TPCH_S10_CONNSTR: DB generated with scale factor 10 (~10 GB)
if: ${{ !cancelled() && (github.event.inputs.run_only_pgvector_tests == 'false' || github.event.inputs.run_only_pgvector_tests == null) }}
# if: ${{ !cancelled() && (github.event.inputs.run_only_pgvector_tests == 'false' || github.event.inputs.run_only_pgvector_tests == null) }}
permissions:
contents: write
statuses: write
@@ -812,12 +823,11 @@ jobs:
env:
POSTGRES_DISTRIB_DIR: /tmp/neon/pg_install
DEFAULT_PG_VERSION: 16
PG_VERSION: ${{ matrix.pg_version }}
TEST_OUTPUT: /tmp/test_output
BUILD_TYPE: remote
SAVE_PERF_REPORT: ${{ github.event.inputs.save_perf_report || ( github.ref_name == 'main' ) }}
PLATFORM: ${{ matrix.platform }}
TEST_OLAP_SCALE: ${{ matrix.scale }}
runs-on: [ self-hosted, us-east-2, x64 ]
container:
@@ -849,13 +859,24 @@ jobs:
run: |
case "${PLATFORM}" in
neonvm-captest-reuse)
ENV_PLATFORM=CAPTEST_TPCH
case "${PG_VERSION}" in
16)
CONNSTR_SECRET_NAME="BENCHMARK_CAPTEST_TPCH_S10_CONNSTR"
;;
17)
CONNSTR_SECRET_NAME="BENCHMARK_CAPTEST_TPCH_CONNSTR_PG17"
;;
*)
echo >&2 "Unsupported PG_VERSION=${PG_VERSION} for PLATFORM=${PLATFORM}"
exit 1
;;
esac
;;
rds-aurora)
ENV_PLATFORM=RDS_AURORA_TPCH
CONNSTR_SECRET_NAME="BENCHMARK_RDS_AURORA_TPCH_S10_CONNSTR"
;;
rds-postgres)
ENV_PLATFORM=RDS_POSTGRES_TPCH
CONNSTR_SECRET_NAME="BENCHMARK_RDS_POSTGRES_TPCH_S10_CONNSTR"
;;
*)
echo >&2 "Unknown PLATFORM=${PLATFORM}. Allowed only 'neonvm-captest-reuse', 'rds-aurora', or 'rds-postgres'"
@@ -863,7 +884,6 @@ jobs:
;;
esac
CONNSTR_SECRET_NAME="BENCHMARK_${ENV_PLATFORM}_S${TEST_OLAP_SCALE}_CONNSTR"
echo "CONNSTR_SECRET_NAME=${CONNSTR_SECRET_NAME}" >> $GITHUB_ENV
- name: Set up Connection String
@@ -881,13 +901,13 @@ jobs:
run_in_parallel: false
save_perf_report: ${{ env.SAVE_PERF_REPORT }}
extra_params: -m remote_cluster --timeout 21600 -k test_tpch
pg_version: ${{ env.DEFAULT_PG_VERSION }}
pg_version: ${{ env.PG_VERSION }}
aws-oicd-role-arn: ${{ vars.DEV_AWS_OIDC_ROLE_ARN }}
env:
VIP_VAP_ACCESS_TOKEN: "${{ secrets.VIP_VAP_ACCESS_TOKEN }}"
PERF_TEST_RESULT_CONNSTR: "${{ secrets.PERF_TEST_RESULT_CONNSTR }}"
BENCHMARK_CONNSTR: ${{ steps.set-up-connstr.outputs.connstr }}
TEST_OLAP_SCALE: ${{ matrix.scale }}
TEST_OLAP_SCALE: 10
- name: Create Allure report
id: create-allure-report
@@ -909,7 +929,7 @@ jobs:
SLACK_BOT_TOKEN: ${{ secrets.SLACK_BOT_TOKEN }}
user-examples-compare:
if: ${{ !cancelled() && (github.event.inputs.run_only_pgvector_tests == 'false' || github.event.inputs.run_only_pgvector_tests == null) }}
# if: ${{ !cancelled() && (github.event.inputs.run_only_pgvector_tests == 'false' || github.event.inputs.run_only_pgvector_tests == null) }}
permissions:
contents: write
statuses: write
@@ -922,7 +942,7 @@ jobs:
env:
POSTGRES_DISTRIB_DIR: /tmp/neon/pg_install
DEFAULT_PG_VERSION: 16
PG_VERSION: ${{ matrix.pg_version }}
TEST_OUTPUT: /tmp/test_output
BUILD_TYPE: remote
SAVE_PERF_REPORT: ${{ github.event.inputs.save_perf_report || ( github.ref_name == 'main' ) }}
@@ -959,7 +979,18 @@ jobs:
run: |
case "${PLATFORM}" in
neonvm-captest-reuse)
CONNSTR=${{ secrets.BENCHMARK_USER_EXAMPLE_CAPTEST_CONNSTR }}
case "${PG_VERSION}" in
16)
CONNSTR=${{ secrets.BENCHMARK_USER_EXAMPLE_CAPTEST_CONNSTR }}
;;
17)
CONNSTR=${{ secrets.BENCHMARK_CAPTEST_USER_EXAMPLE_CONNSTR_PG17 }}
;;
*)
echo >&2 "Unsupported PG_VERSION=${PG_VERSION} for PLATFORM=${PLATFORM}"
exit 1
;;
esac
;;
rds-aurora)
CONNSTR=${{ secrets.BENCHMARK_USER_EXAMPLE_RDS_AURORA_CONNSTR }}
@@ -983,7 +1014,7 @@ jobs:
run_in_parallel: false
save_perf_report: ${{ env.SAVE_PERF_REPORT }}
extra_params: -m remote_cluster --timeout 21600 -k test_user_examples
pg_version: ${{ env.DEFAULT_PG_VERSION }}
pg_version: ${{ env.PG_VERSION }}
aws-oicd-role-arn: ${{ vars.DEV_AWS_OIDC_ROLE_ARN }}
env:
VIP_VAP_ACCESS_TOKEN: "${{ secrets.VIP_VAP_ACCESS_TOKEN }}"

View File

@@ -235,7 +235,7 @@ jobs:
echo 'CPPFLAGS=-I/usr/local/opt/openssl@3/include' >> $GITHUB_ENV
- name: Run cargo build (only for v17)
run: PQ_LIB_DIR=$(pwd)/pg_install/v17/lib cargo build --all --release -j$(sysctl -n hw.ncpu)
run: cargo build --all --release -j$(sysctl -n hw.ncpu)
- name: Check that no warnings are produced (only for v17)
run: ./run_clippy.sh

View File

@@ -45,6 +45,26 @@ jobs:
run cancel-previous-in-concurrency-group.yml \
--field concurrency_group="${{ env.E2E_CONCURRENCY_GROUP }}"
files-changed:
needs: [ check-permissions ]
runs-on: [ self-hosted, small ]
timeout-minutes: 3
outputs:
check-rust-dependencies: ${{ steps.files-changed.outputs.rust_dependencies }}
steps:
- name: Checkout
uses: actions/checkout@v4
with:
submodules: true
- name: Check for file changes
uses: dorny/paths-filter@de90cc6fb38fc0963ad72b210f1f284cd68cea36 # v3.0.2
id: files-changed
with:
token: ${{ secrets.GITHUB_TOKEN }}
filters: .github/file-filters.yaml
tag:
needs: [ check-permissions ]
runs-on: [ self-hosted, small ]
@@ -170,6 +190,14 @@ jobs:
archs: '["x64", "arm64"]'
secrets: inherit
check-dependencies-rust:
needs: [ files-changed, build-build-tools-image ]
if: ${{ needs.files-changed.outputs.check-rust-dependencies == 'true' }}
uses: ./.github/workflows/cargo-deny.yml
with:
build-tools-image: ${{ needs.build-build-tools-image.outputs.image }}-bookworm
secrets: inherit
build-and-test-locally:
needs: [ tag, build-build-tools-image ]
strategy:
@@ -242,7 +270,7 @@ jobs:
statuses: write
contents: write
pull-requests: write
runs-on: [ self-hosted, small ]
runs-on: [ self-hosted, small-metal ]
container:
image: ${{ needs.build-build-tools-image.outputs.image }}-bookworm
credentials:
@@ -654,7 +682,7 @@ jobs:
push: true
pull: true
file: compute/compute-node.Dockerfile
target: neon-pg-ext-test
target: extension-tests
cache-from: type=registry,ref=cache.neon.build/compute-node-${{ matrix.version.pg }}:cache-${{ matrix.version.debian }}-${{ matrix.arch }}
tags: |
neondatabase/neon-test-extensions-${{ matrix.version.pg }}:${{needs.tag.outputs.build-tag}}-${{ matrix.version.debian }}-${{ matrix.arch }}
@@ -786,6 +814,17 @@ jobs:
username: ${{ secrets.NEON_DOCKERHUB_USERNAME }}
password: ${{ secrets.NEON_DOCKERHUB_PASSWORD }}
- name: Get the last compute release tag
id: get-last-compute-release-tag
env:
GH_TOKEN: ${{ secrets.CI_ACCESS_TOKEN }}
run: |
tag=$(gh api -q '[.[].tag_name | select(startswith("release-compute"))][0]'\
-H "Accept: application/vnd.github+json" \
-H "X-GitHub-Api-Version: 2022-11-28" \
"/repos/${{ github.repository }}/releases")
echo tag=${tag} >> ${GITHUB_OUTPUT}
# `neondatabase/neon` contains multiple binaries, all of them use the same input for the version into the same version formatting library.
# Pick pageserver as currently the only binary with extra "version" features printed in the string to verify.
# Regular pageserver version string looks like
@@ -817,6 +856,20 @@ jobs:
TEST_VERSION_ONLY: ${{ matrix.pg_version }}
run: ./docker-compose/docker_compose_test.sh
- name: Print logs and clean up docker-compose test
if: always()
run: |
docker compose --profile test-extensions -f ./docker-compose/docker-compose.yml logs || true
docker compose --profile test-extensions -f ./docker-compose/docker-compose.yml down
- name: Test extension upgrade
timeout-minutes: 20
if: ${{ needs.tag.outputs.build-tag == github.run_id }}
env:
NEWTAG: ${{ needs.tag.outputs.build-tag }}
OLDTAG: ${{ steps.get-last-compute-release-tag.outputs.tag }}
run: ./docker-compose/test_extensions_upgrade.sh
- name: Print logs and clean up
if: always()
run: |
@@ -1050,6 +1103,7 @@ jobs:
retries: 5
script: |
const tag = "${{ needs.tag.outputs.build-tag }}";
const branch = "${{ github.ref_name }}";
try {
const existingRef = await github.rest.git.getRef({
@@ -1092,12 +1146,48 @@ jobs:
}
console.log(`Release for tag ${tag} does not exist. Creating it...`);
// Find the PR number using the commit SHA
const pullRequests = await github.rest.pulls.list({
owner: context.repo.owner,
repo: context.repo.repo,
state: 'closed',
base: branch,
});
const pr = pullRequests.data.find(pr => pr.merge_commit_sha === context.sha);
const prNumber = pr ? pr.number : null;
// Find the previous release on the branch
const releases = await github.rest.repos.listReleases({
owner: context.repo.owner,
repo: context.repo.repo,
per_page: 100,
});
const branchReleases = releases.data
.filter((release) => {
const regex = new RegExp(`^${branch}-\\d+$`);
return regex.test(release.tag_name) && !release.draft && !release.prerelease;
})
.sort((a, b) => new Date(b.created_at) - new Date(a.created_at));
const previousTag = branchReleases.length > 0 ? branchReleases[0].tag_name : null;
const releaseNotes = [
prNumber
? `Release PR https://github.com/${context.repo.owner}/${context.repo.repo}/pull/${prNumber}.`
: 'Release PR not found.',
previousTag
? `Diff with the previous release https://github.com/${context.repo.owner}/${context.repo.repo}/compare/${previousTag}...${tag}.`
: `No previous release found on branch ${branch}.`,
].join('\n\n');
await github.rest.repos.createRelease({
owner: context.repo.owner,
repo: context.repo.repo,
tag_name: tag,
// TODO: Automate release notes properly
generate_release_notes: false,
body: releaseNotes,
});
console.log(`Release for tag ${tag} created successfully.`);
}
@@ -1270,6 +1360,8 @@ jobs:
- build-and-test-locally
- check-codestyle-python
- check-codestyle-rust
- check-dependencies-rust
- files-changed
- promote-images-dev
- test-images
- trigger-custom-extensions-build-and-wait
@@ -1282,4 +1374,11 @@ jobs:
if: |
contains(needs.*.result, 'failure')
|| contains(needs.*.result, 'cancelled')
|| contains(needs.*.result, 'skipped')
|| (needs.check-dependencies-rust.result == 'skipped' && needs.files-changed.outputs.check-rust-dependencies == 'true')
|| needs.build-and-test-locally.result == 'skipped'
|| needs.check-codestyle-python.result == 'skipped'
|| needs.check-codestyle-rust.result == 'skipped'
|| needs.files-changed.result == 'skipped'
|| needs.promote-images-dev.result == 'skipped'
|| needs.test-images.result == 'skipped'
|| needs.trigger-custom-extensions-build-and-wait.result == 'skipped'

View File

@@ -0,0 +1,134 @@
name: Build and Test with Sanitizers
on:
schedule:
# * is a special character in YAML so you have to quote this string
# ┌───────────── minute (0 - 59)
# │ ┌───────────── hour (0 - 23)
# │ │ ┌───────────── day of the month (1 - 31)
# │ │ │ ┌───────────── month (1 - 12 or JAN-DEC)
# │ │ │ │ ┌───────────── day of the week (0 - 6 or SUN-SAT)
- cron: '0 1 * * *' # run once a day, timezone is utc
workflow_dispatch:
defaults:
run:
shell: bash -euxo pipefail {0}
concurrency:
# Allow only one workflow per any non-`main` branch.
group: ${{ github.workflow }}-${{ github.ref_name }}-${{ github.ref_name == 'main' && github.sha || 'anysha' }}
cancel-in-progress: true
env:
RUST_BACKTRACE: 1
COPT: '-Werror'
jobs:
tag:
runs-on: [ self-hosted, small ]
container: 369495373322.dkr.ecr.eu-central-1.amazonaws.com/base:pinned
outputs:
build-tag: ${{steps.build-tag.outputs.tag}}
steps:
# Need `fetch-depth: 0` to count the number of commits in the branch
- uses: actions/checkout@v4
with:
fetch-depth: 0
- name: Get build tag
run: |
echo run:$GITHUB_RUN_ID
echo ref:$GITHUB_REF_NAME
echo rev:$(git rev-list --count HEAD)
if [[ "$GITHUB_REF_NAME" == "main" ]]; then
echo "tag=$(git rev-list --count HEAD)" >> $GITHUB_OUTPUT
elif [[ "$GITHUB_REF_NAME" == "release" ]]; then
echo "tag=release-$(git rev-list --count HEAD)" >> $GITHUB_OUTPUT
elif [[ "$GITHUB_REF_NAME" == "release-proxy" ]]; then
echo "tag=release-proxy-$(git rev-list --count HEAD)" >> $GITHUB_OUTPUT
elif [[ "$GITHUB_REF_NAME" == "release-compute" ]]; then
echo "tag=release-compute-$(git rev-list --count HEAD)" >> $GITHUB_OUTPUT
else
echo "GITHUB_REF_NAME (value '$GITHUB_REF_NAME') is not set to either 'main' or 'release', 'release-proxy', 'release-compute'"
echo "tag=$GITHUB_RUN_ID" >> $GITHUB_OUTPUT
fi
shell: bash
id: build-tag
build-build-tools-image:
uses: ./.github/workflows/build-build-tools-image.yml
secrets: inherit
build-and-test-locally:
needs: [ tag, build-build-tools-image ]
strategy:
fail-fast: false
matrix:
arch: [ x64, arm64 ]
build-type: [ release ]
uses: ./.github/workflows/_build-and-test-locally.yml
with:
arch: ${{ matrix.arch }}
build-tools-image: ${{ needs.build-build-tools-image.outputs.image }}-bookworm
build-tag: ${{ needs.tag.outputs.build-tag }}
build-type: ${{ matrix.build-type }}
test-cfg: '[{"pg_version":"v17"}]'
sanitizers: enabled
secrets: inherit
create-test-report:
needs: [ build-and-test-locally, build-build-tools-image ]
if: ${{ !cancelled() }}
permissions:
id-token: write # aws-actions/configure-aws-credentials
statuses: write
contents: write
pull-requests: write
outputs:
report-url: ${{ steps.create-allure-report.outputs.report-url }}
runs-on: [ self-hosted, small ]
container:
image: ${{ needs.build-build-tools-image.outputs.image }}-bookworm
credentials:
username: ${{ secrets.NEON_DOCKERHUB_USERNAME }}
password: ${{ secrets.NEON_DOCKERHUB_PASSWORD }}
options: --init
steps:
- uses: actions/checkout@v4
- name: Create Allure report
if: ${{ !cancelled() }}
id: create-allure-report
uses: ./.github/actions/allure-report-generate
with:
store-test-results-into-db: true
aws-oicd-role-arn: ${{ vars.DEV_AWS_OIDC_ROLE_ARN }}
env:
REGRESS_TEST_RESULT_CONNSTR_NEW: ${{ secrets.REGRESS_TEST_RESULT_CONNSTR_NEW }}
- uses: actions/github-script@v7
if: ${{ !cancelled() }}
with:
# Retry script for 5XX server errors: https://github.com/actions/github-script#retries
retries: 5
script: |
const report = {
reportUrl: "${{ steps.create-allure-report.outputs.report-url }}",
reportJsonUrl: "${{ steps.create-allure-report.outputs.report-json-url }}",
}
const coverage = {}
const script = require("./scripts/comment-test-report.js")
await script({
github,
context,
fetch,
report,
coverage,
})

57
.github/workflows/cargo-deny.yml vendored Normal file
View File

@@ -0,0 +1,57 @@
name: cargo deny checks
on:
workflow_call:
inputs:
build-tools-image:
required: false
type: string
schedule:
- cron: '0 0 * * *'
jobs:
cargo-deny:
strategy:
matrix:
ref: >-
${{
fromJSON(
github.event_name == 'schedule'
&& '["main","release","release-proxy","release-compute"]'
|| format('["{0}"]', github.sha)
)
}}
runs-on: [self-hosted, small]
container:
image: ${{ inputs.build-tools-image || 'neondatabase/build-tools:pinned' }}
credentials:
username: ${{ secrets.NEON_DOCKERHUB_USERNAME }}
password: ${{ secrets.NEON_DOCKERHUB_PASSWORD }}
options: --init
steps:
- name: Checkout
uses: actions/checkout@v4
with:
ref: ${{ matrix.ref }}
- name: Check rust licenses/bans/advisories/sources
env:
CARGO_DENY_TARGET: >-
${{ github.event_name == 'schedule' && 'advisories' || 'all' }}
run: cargo deny check --hide-inclusion-graph $CARGO_DENY_TARGET
- name: Post to a Slack channel
if: ${{ github.event_name == 'schedule' && failure() }}
uses: slackapi/slack-github-action@v2
with:
method: chat.postMessage
token: ${{ secrets.SLACK_BOT_TOKEN }}
payload: |
channel: ${{ vars.SLACK_CICD_CHANNEL_ID }}
text: |
Periodic cargo-deny on ${{ matrix.ref }}: ${{ job.status }}
<${{ github.server_url }}/${{ github.repository }}/actions/runs/${{ github.run_id }}|GitHub Run>
Pinging @oncall-devprod.

View File

@@ -28,7 +28,24 @@ jobs:
strategy:
fail-fast: false # allow other variants to continue even if one fails
matrix:
target_project: [new_empty_project, large_existing_project]
include:
- target_project: new_empty_project_stripe_size_2048
stripe_size: 2048 # 16 MiB
postgres_version: 16
- target_project: new_empty_project_stripe_size_32768
stripe_size: 32768 # 256 MiB # note that this is different from null because using null will shard_split the project only if it reaches the threshold
# while here it is sharded from the beginning with a shard size of 256 MiB
postgres_version: 16
- target_project: new_empty_project
stripe_size: null # run with neon defaults which will shard split only when reaching the threshold
postgres_version: 16
- target_project: new_empty_project
stripe_size: null # run with neon defaults which will shard split only when reaching the threshold
postgres_version: 17
- target_project: large_existing_project
stripe_size: null # cannot re-shared or choose different stripe size for existing, already sharded project
postgres_version: 16
max-parallel: 1 # we want to run each stripe size sequentially to be able to compare the results
permissions:
contents: write
statuses: write
@@ -67,17 +84,21 @@ jobs:
aws-oicd-role-arn: ${{ vars.DEV_AWS_OIDC_ROLE_ARN }}
- name: Create Neon Project
if: ${{ matrix.target_project == 'new_empty_project' }}
if: ${{ startsWith(matrix.target_project, 'new_empty_project') }}
id: create-neon-project-ingest-target
uses: ./.github/actions/neon-project-create
with:
region_id: aws-us-east-2
postgres_version: 16
postgres_version: ${{ matrix.postgres_version }}
compute_units: '[7, 7]' # we want to test large compute here to avoid compute-side bottleneck
api_key: ${{ secrets.NEON_STAGING_API_KEY }}
shard_split_project: ${{ matrix.stripe_size != null && 'true' || 'false' }}
admin_api_key: ${{ secrets.NEON_STAGING_ADMIN_API_KEY }}
shard_count: 8
stripe_size: ${{ matrix.stripe_size }}
- name: Initialize Neon project
if: ${{ matrix.target_project == 'new_empty_project' }}
if: ${{ startsWith(matrix.target_project, 'new_empty_project') }}
env:
BENCHMARK_INGEST_TARGET_CONNSTR: ${{ steps.create-neon-project-ingest-target.outputs.dsn }}
NEW_PROJECT_ID: ${{ steps.create-neon-project-ingest-target.outputs.project_id }}
@@ -130,7 +151,7 @@ jobs:
test_selection: performance/test_perf_ingest_using_pgcopydb.py
run_in_parallel: false
extra_params: -s -m remote_cluster --timeout 86400 -k test_ingest_performance_using_pgcopydb
pg_version: v16
pg_version: v${{ matrix.postgres_version }}
save_perf_report: true
aws-oicd-role-arn: ${{ vars.DEV_AWS_OIDC_ROLE_ARN }}
env:
@@ -146,7 +167,7 @@ jobs:
${PSQL} "${BENCHMARK_INGEST_TARGET_CONNSTR}" -c "\dt+"
- name: Delete Neon Project
if: ${{ always() && matrix.target_project == 'new_empty_project' }}
if: ${{ always() && startsWith(matrix.target_project, 'new_empty_project') }}
uses: ./.github/actions/neon-project-delete
with:
project_id: ${{ steps.create-neon-project-ingest-target.outputs.project_id }}

View File

@@ -114,7 +114,7 @@ jobs:
run: make walproposer-lib -j$(nproc)
- name: Produce the build stats
run: PQ_LIB_DIR=$(pwd)/pg_install/v17/lib cargo build --all --release --timings -j$(nproc)
run: cargo build --all --release --timings -j$(nproc)
- name: Configure AWS credentials
uses: aws-actions/configure-aws-credentials@v4

View File

@@ -12,8 +12,8 @@ on:
pull_request:
paths:
- '.github/workflows/pg-clients.yml'
- 'test_runner/pg_clients/**'
- 'test_runner/logical_repl/**'
- 'test_runner/pg_clients/**/*.py'
- 'test_runner/logical_repl/**/*.py'
- 'poetry.lock'
workflow_dispatch:
@@ -104,6 +104,8 @@ jobs:
with:
api_key: ${{ secrets.NEON_STAGING_API_KEY }}
postgres_version: ${{ env.DEFAULT_PG_VERSION }}
project_settings: >-
{"enable_logical_replication": true}
- name: Run tests
uses: ./.github/actions/run-python-test-set

View File

@@ -59,7 +59,10 @@ jobs:
echo "${RUST_CHANGED_FILES}"
build-build-tools-image:
if: needs.get-changed-files.outputs.python-changed == 'true'
if: |
false
|| needs.get-changed-files.outputs.python-changed == 'true'
|| needs.get-changed-files.outputs.rust-changed == 'true'
needs: [ get-changed-files ]
uses: ./.github/workflows/build-build-tools-image.yml
with:
@@ -92,7 +95,8 @@ jobs:
# - conclusion
# - neon-cloud-e2e
conclusion:
if: always()
# Do not run job on Pull Requests as it interferes with the `conclusion` job from the `build_and_test` workflow
if: always() && github.event_name == 'merge_group'
permissions:
statuses: write # for `github.repos.createCommitStatus(...)`
contents: write
@@ -124,6 +128,8 @@ jobs:
- name: Fail the job if any of the dependencies do not succeed or skipped
run: exit 1
if: |
(contains(needs.check-codestyle-python.result, 'skipped') && needs.get-changed-files.outputs.python-changed == 'true')
false
|| (needs.check-codestyle-python.result == 'skipped' && needs.get-changed-files.outputs.python-changed == 'true')
|| (needs.check-codestyle-rust.result == 'skipped' && needs.get-changed-files.outputs.rust-changed == 'true')
|| contains(needs.*.result, 'failure')
|| contains(needs.*.result, 'cancelled')

496
Cargo.lock generated

File diff suppressed because it is too large Load Diff

View File

@@ -18,6 +18,7 @@ members = [
"storage_scrubber",
"workspace_hack",
"libs/compute_api",
"libs/http-utils",
"libs/pageserver_api",
"libs/postgres_ffi",
"libs/safekeeper_api",
@@ -54,6 +55,7 @@ async-compression = { version = "0.4.0", features = ["tokio", "gzip", "zstd"] }
atomic-take = "1.1.0"
backtrace = "0.3.74"
flate2 = "1.0.26"
assert-json-diff = "2"
async-stream = "0.3"
async-trait = "0.1"
aws-config = { version = "1.5", default-features = false, features=["rustls", "sso"] }
@@ -65,7 +67,7 @@ aws-smithy-types = "1.2"
aws-credential-types = "1.2.0"
aws-sigv4 = { version = "1.2", features = ["sign-http"] }
aws-types = "1.3"
axum = { version = "0.7.9", features = ["ws"] }
axum = { version = "0.8.1", features = ["ws"] }
base64 = "0.13.0"
bincode = "1.3"
bindgen = "0.70"
@@ -77,10 +79,10 @@ camino = "1.1.6"
cfg-if = "1.0.0"
chrono = { version = "0.4", default-features = false, features = ["clock"] }
clap = { version = "4.0", features = ["derive", "env"] }
clashmap = { version = "1.0", features = ["raw-api"] }
comfy-table = "7.1"
const_format = "0.2"
crc32c = "0.6"
dashmap = { version = "5.5.0", features = ["raw-api"] }
diatomic-waker = { version = "0.2.3" }
either = "1.8"
enum-map = "2.4.2"
@@ -123,7 +125,7 @@ measured = { version = "0.0.22", features=["lasso"] }
measured-process = { version = "0.0.22" }
memoffset = "0.9"
nix = { version = "0.27", features = ["dir", "fs", "process", "socket", "signal", "poll"] }
notify = "6.0.0"
notify = "8.0.0"
num_cpus = "1.15"
num-traits = "0.2.15"
once_cell = "1.13"
@@ -177,7 +179,7 @@ test-context = "0.3"
thiserror = "1.0"
tikv-jemallocator = { version = "0.6", features = ["profiling", "stats", "unprefixed_malloc_on_supported_platforms"] }
tikv-jemalloc-ctl = { version = "0.6", features = ["stats"] }
tokio = { version = "1.17", features = ["macros"] }
tokio = { version = "1.41", features = ["macros"] }
tokio-epoll-uring = { git = "https://github.com/neondatabase/tokio-epoll-uring.git" , branch = "main" }
tokio-io-timeout = "1.2.0"
tokio-postgres-rustls = "0.12.0"
@@ -193,7 +195,9 @@ tower-http = { version = "0.6.2", features = ["request-id", "trace"] }
tower-service = "0.3.3"
tracing = "0.1"
tracing-error = "0.2"
tracing-log = "0.2"
tracing-opentelemetry = "0.28"
tracing-serde = "0.2.0"
tracing-subscriber = { version = "0.3", default-features = false, features = ["smallvec", "fmt", "tracing-log", "std", "env-filter", "json"] }
try-lock = "0.2.5"
twox-hash = { version = "1.6.3", default-features = false }
@@ -226,6 +230,7 @@ azure_storage_blobs = { git = "https://github.com/neondatabase/azure-sdk-for-rus
## Local libraries
compute_api = { version = "0.1", path = "./libs/compute_api/" }
consumption_metrics = { version = "0.1", path = "./libs/consumption_metrics/" }
http-utils = { version = "0.1", path = "./libs/http-utils/" }
metrics = { version = "0.1", path = "./libs/metrics/" }
pageserver = { path = "./pageserver" }
pageserver_api = { version = "0.1", path = "./libs/pageserver_api/" }

View File

@@ -45,7 +45,7 @@ COPY --chown=nonroot . .
ARG ADDITIONAL_RUSTFLAGS
RUN set -e \
&& PQ_LIB_DIR=$(pwd)/pg_install/v${STABLE_PG_VERSION}/lib RUSTFLAGS="-Clinker=clang -Clink-arg=-fuse-ld=mold -Clink-arg=-Wl,--no-rosegment -Cforce-frame-pointers=yes ${ADDITIONAL_RUSTFLAGS}" cargo build \
&& RUSTFLAGS="-Clinker=clang -Clink-arg=-fuse-ld=mold -Clink-arg=-Wl,--no-rosegment -Cforce-frame-pointers=yes ${ADDITIONAL_RUSTFLAGS}" cargo build \
--bin pg_sni_router \
--bin pageserver \
--bin pagectl \
@@ -64,6 +64,7 @@ ARG DEFAULT_PG_VERSION
WORKDIR /data
RUN set -e \
&& echo 'Acquire::Retries "5";' > /etc/apt/apt.conf.d/80-retries \
&& apt update \
&& apt install -y \
libreadline-dev \
@@ -72,6 +73,7 @@ RUN set -e \
# System postgres for use with client libraries (e.g. in storage controller)
postgresql-15 \
openssl \
&& rm -f /etc/apt/apt.conf.d/80-retries \
&& rm -rf /var/lib/apt/lists/* /tmp/* /var/tmp/* \
&& useradd -d /data neon \
&& chown -R neon:neon /data

View File

@@ -10,18 +10,29 @@ ICU_PREFIX_DIR := /usr/local/icu
# environment variable.
#
BUILD_TYPE ?= debug
WITH_SANITIZERS ?= no
ifeq ($(BUILD_TYPE),release)
PG_CONFIGURE_OPTS = --enable-debug --with-openssl
PG_CFLAGS = -O2 -g3 $(CFLAGS)
PG_LDFLAGS = $(LDFLAGS)
# Unfortunately, `--profile=...` is a nightly feature
CARGO_BUILD_FLAGS += --release
else ifeq ($(BUILD_TYPE),debug)
PG_CONFIGURE_OPTS = --enable-debug --with-openssl --enable-cassert --enable-depend
PG_CFLAGS = -O0 -g3 $(CFLAGS)
PG_LDFLAGS = $(LDFLAGS)
else
$(error Bad build type '$(BUILD_TYPE)', see Makefile for options)
endif
ifeq ($(WITH_SANITIZERS),yes)
PG_CFLAGS += -fsanitize=address -fsanitize=undefined -fno-sanitize-recover
COPT += -Wno-error # to avoid failing on warnings induced by sanitizers
PG_LDFLAGS = -fsanitize=address -fsanitize=undefined -static-libasan -static-libubsan $(LDFLAGS)
export CC := gcc
export ASAN_OPTIONS := detect_leaks=0
endif
ifeq ($(shell test -e /home/nonroot/.docker_build && echo -n yes),yes)
# Exclude static build openssl, icu for local build (MacOS, Linux)
# Only keep for build type release and debug
@@ -33,7 +44,9 @@ endif
UNAME_S := $(shell uname -s)
ifeq ($(UNAME_S),Linux)
# Seccomp BPF is only available for Linux
PG_CONFIGURE_OPTS += --with-libseccomp
ifneq ($(WITH_SANITIZERS),yes)
PG_CONFIGURE_OPTS += --with-libseccomp
endif
else ifeq ($(UNAME_S),Darwin)
PG_CFLAGS += -DUSE_PREFETCH
ifndef DISABLE_HOMEBREW
@@ -106,7 +119,7 @@ $(POSTGRES_INSTALL_DIR)/build/%/config.status:
EXTRA_VERSION=$$(cd $(ROOT_PROJECT_DIR)/vendor/postgres-$$VERSION && git rev-parse HEAD); \
(cd $(POSTGRES_INSTALL_DIR)/build/$$VERSION && \
env PATH="$(EXTRA_PATH_OVERRIDES):$$PATH" $(ROOT_PROJECT_DIR)/vendor/postgres-$$VERSION/configure \
CFLAGS='$(PG_CFLAGS)' \
CFLAGS='$(PG_CFLAGS)' LDFLAGS='$(PG_LDFLAGS)' \
$(PG_CONFIGURE_OPTS) --with-extra-version=" ($$EXTRA_VERSION)" \
--prefix=$(abspath $(POSTGRES_INSTALL_DIR))/$$VERSION > configure.log)

View File

@@ -21,8 +21,10 @@ The Neon storage engine consists of two major components:
See developer documentation in [SUMMARY.md](/docs/SUMMARY.md) for more information.
## Running local installation
## Running a local development environment
Neon can be run on a workstation for small experiments and to test code changes, by
following these instructions.
#### Installing dependencies on Linux
1. Install build dependencies and other applicable packages
@@ -238,7 +240,7 @@ postgres=# select * from t;
> cargo neon stop
```
More advanced usages can be found at [Control Plane and Neon Local](./control_plane/README.md).
More advanced usages can be found at [Local Development Control Plane (`neon_local`))](./control_plane/README.md).
#### Handling build failures

View File

@@ -3,6 +3,17 @@ ARG DEBIAN_VERSION=bookworm
FROM debian:bookworm-slim AS pgcopydb_builder
ARG DEBIAN_VERSION
# Use strict mode for bash to catch errors early
SHELL ["/bin/bash", "-euo", "pipefail", "-c"]
# By default, /bin/sh used in debian images will treat '\n' as eol,
# but as we use bash as SHELL, and built-in echo in bash requires '-e' flag for that.
RUN echo 'Acquire::Retries "5";' > /etc/apt/apt.conf.d/80-retries && \
echo -e "retry_connrefused = on\ntimeout=15\ntries=5\n" > /root/.wgetrc && \
echo -e "--retry-connrefused\n--connect-timeout 15\n--retry 5\n--max-time 300\n" > /root/.curlrc
COPY build_tools/patches/pgcopydbv017.patch /pgcopydbv017.patch
RUN if [ "${DEBIAN_VERSION}" = "bookworm" ]; then \
set -e && \
apt update && \
@@ -35,6 +46,7 @@ RUN if [ "${DEBIAN_VERSION}" = "bookworm" ]; then \
mkdir /tmp/pgcopydb && \
tar -xzf /tmp/pgcopydb.tar.gz -C /tmp/pgcopydb --strip-components=1 && \
cd /tmp/pgcopydb && \
patch -p1 < /pgcopydbv017.patch && \
make -s clean && \
make -s -j12 install && \
libpq_path=$(find /lib /usr/lib -name "libpq.so.5" | head -n 1) && \
@@ -51,7 +63,8 @@ ARG DEBIAN_VERSION
# Add nonroot user
RUN useradd -ms /bin/bash nonroot -b /home
SHELL ["/bin/bash", "-c"]
# Use strict mode for bash to catch errors early
SHELL ["/bin/bash", "-euo", "pipefail", "-c"]
RUN mkdir -p /pgcopydb/bin && \
mkdir -p /pgcopydb/lib && \
@@ -61,6 +74,10 @@ RUN mkdir -p /pgcopydb/bin && \
COPY --from=pgcopydb_builder /usr/lib/postgresql/16/bin/pgcopydb /pgcopydb/bin/pgcopydb
COPY --from=pgcopydb_builder /pgcopydb/lib/libpq.so.5 /pgcopydb/lib/libpq.so.5
RUN echo 'Acquire::Retries "5";' > /etc/apt/apt.conf.d/80-retries && \
echo -e "retry_connrefused = on\ntimeout=15\ntries=5\n" > /root/.wgetrc && \
echo -e "--retry-connrefused\n--connect-timeout 15\n--retry 5\n--max-time 300\n" > /root/.curlrc
# System deps
#
# 'gdb' is included so that we get backtraces of core dumps produced in
@@ -182,8 +199,14 @@ RUN set -e \
# It includes several bug fixes on top on v2.0 release (https://github.com/linux-test-project/lcov/compare/v2.0...master)
# And patches from us:
# - Generates json file with code coverage summary (https://github.com/neondatabase/lcov/commit/426e7e7a22f669da54278e9b55e6d8caabd00af0.tar.gz)
RUN for package in Capture::Tiny DateTime Devel::Cover Digest::MD5 File::Spec JSON::XS Memory::Process Time::HiRes JSON; do yes | perl -MCPAN -e "CPAN::Shell->notest('install', '$package')"; done \
&& wget https://github.com/neondatabase/lcov/archive/426e7e7a22f669da54278e9b55e6d8caabd00af0.tar.gz -O lcov.tar.gz \
RUN set +o pipefail && \
for package in Capture::Tiny DateTime Devel::Cover Digest::MD5 File::Spec JSON::XS Memory::Process Time::HiRes JSON; do \
yes | perl -MCPAN -e "CPAN::Shell->notest('install', '$package')";\
done && \
set -o pipefail
# Split into separate step to debug flaky failures here
RUN wget https://github.com/neondatabase/lcov/archive/426e7e7a22f669da54278e9b55e6d8caabd00af0.tar.gz -O lcov.tar.gz \
&& ls -laht lcov.tar.gz && sha256sum lcov.tar.gz \
&& echo "61a22a62e20908b8b9e27d890bd0ea31f567a7b9668065589266371dcbca0992 lcov.tar.gz" | sha256sum --check \
&& mkdir -p lcov && tar -xzf lcov.tar.gz -C lcov --strip-components=1 \
&& cd lcov \
@@ -218,6 +241,8 @@ RUN wget -O /tmp/libicu-${ICU_VERSION}.tgz https://github.com/unicode-org/icu/re
USER nonroot:nonroot
WORKDIR /home/nonroot
RUN echo -e "--retry-connrefused\n--connect-timeout 15\n--retry 5\n--max-time 300\n" > /home/nonroot/.curlrc
# Python
ENV PYTHON_VERSION=3.11.10 \
PYENV_ROOT=/home/nonroot/.pyenv \
@@ -243,7 +268,7 @@ WORKDIR /home/nonroot
# Rust
# Please keep the version of llvm (installed above) in sync with rust llvm (`rustc --version --verbose | grep LLVM`)
ENV RUSTC_VERSION=1.84.0
ENV RUSTC_VERSION=1.84.1
ENV RUSTUP_HOME="/home/nonroot/.rustup"
ENV PATH="/home/nonroot/.cargo/bin:${PATH}"
ARG RUSTFILT_VERSION=0.2.1
@@ -251,6 +276,7 @@ ARG CARGO_HAKARI_VERSION=0.9.33
ARG CARGO_DENY_VERSION=0.16.2
ARG CARGO_HACK_VERSION=0.6.33
ARG CARGO_NEXTEST_VERSION=0.9.85
ARG CARGO_DIESEL_CLI_VERSION=2.2.6
RUN curl -sSO https://static.rust-lang.org/rustup/dist/$(uname -m)-unknown-linux-gnu/rustup-init && whoami && \
chmod +x rustup-init && \
./rustup-init -y --default-toolchain ${RUSTC_VERSION} && \
@@ -264,6 +290,8 @@ RUN curl -sSO https://static.rust-lang.org/rustup/dist/$(uname -m)-unknown-linux
cargo install cargo-deny --locked --version ${CARGO_DENY_VERSION} && \
cargo install cargo-hack --version ${CARGO_HACK_VERSION} && \
cargo install cargo-nextest --version ${CARGO_NEXTEST_VERSION} && \
cargo install diesel_cli --version ${CARGO_DIESEL_CLI_VERSION} \
--features postgres-bundled --no-default-features && \
rm -rf /home/nonroot/.cargo/registry && \
rm -rf /home/nonroot/.cargo/git

View File

@@ -0,0 +1,57 @@
diff --git a/src/bin/pgcopydb/copydb.c b/src/bin/pgcopydb/copydb.c
index d730b03..69a9be9 100644
--- a/src/bin/pgcopydb/copydb.c
+++ b/src/bin/pgcopydb/copydb.c
@@ -44,6 +44,7 @@ GUC dstSettings[] = {
{ "synchronous_commit", "'off'" },
{ "statement_timeout", "0" },
{ "lock_timeout", "0" },
+ { "idle_in_transaction_session_timeout", "0" },
{ NULL, NULL },
};
diff --git a/src/bin/pgcopydb/pgsql.c b/src/bin/pgcopydb/pgsql.c
index 94f2f46..e051ba8 100644
--- a/src/bin/pgcopydb/pgsql.c
+++ b/src/bin/pgcopydb/pgsql.c
@@ -2319,6 +2319,11 @@ pgsql_execute_log_error(PGSQL *pgsql,
LinesBuffer lbuf = { 0 };
+ if (message != NULL){
+ // make sure message is writable by splitLines
+ message = strdup(message);
+ }
+
if (!splitLines(&lbuf, message))
{
/* errors have already been logged */
@@ -2332,6 +2337,7 @@ pgsql_execute_log_error(PGSQL *pgsql,
PQbackendPID(pgsql->connection),
lbuf.lines[lineNumber]);
}
+ free(message); // free copy of message we created above
if (pgsql->logSQL)
{
@@ -3174,11 +3180,18 @@ pgcopy_log_error(PGSQL *pgsql, PGresult *res, const char *context)
/* errors have already been logged */
return;
}
-
if (res != NULL)
{
char *sqlstate = PQresultErrorField(res, PG_DIAG_SQLSTATE);
- strlcpy(pgsql->sqlstate, sqlstate, sizeof(pgsql->sqlstate));
+ if (sqlstate == NULL)
+ {
+ // PQresultErrorField returned NULL!
+ pgsql->sqlstate[0] = '\0'; // Set to an empty string to avoid segfault
+ }
+ else
+ {
+ strlcpy(pgsql->sqlstate, sqlstate, sizeof(pgsql->sqlstate));
+ }
}
char *endpoint =

File diff suppressed because it is too large Load Diff

View File

@@ -0,0 +1,242 @@
diff --git a/contrib/amcheck/expected/check_heap.out b/contrib/amcheck/expected/check_heap.out
index 979e5e8..2375b45 100644
--- a/contrib/amcheck/expected/check_heap.out
+++ b/contrib/amcheck/expected/check_heap.out
@@ -80,12 +80,9 @@ INSERT INTO heaptest (a, b)
-- same transaction. The heaptest table is smaller than the default
-- wal_skip_threshold, so a wal_level=minimal commit reads the table into
-- shared_buffers. A transaction delays that and excludes any autovacuum.
-SET allow_in_place_tablespaces = true;
-CREATE TABLESPACE regress_test_stats_tblspc LOCATION '';
SELECT sum(reads) AS stats_bulkreads_before
FROM pg_stat_io WHERE context = 'bulkread' \gset
BEGIN;
-ALTER TABLE heaptest SET TABLESPACE regress_test_stats_tblspc;
-- Check that valid options are not rejected nor corruption reported
-- for a non-empty table
SELECT * FROM verify_heapam(relation := 'heaptest', skip := 'none');
@@ -118,14 +115,6 @@ SELECT pg_stat_force_next_flush();
(1 row)
-SELECT sum(reads) AS stats_bulkreads_after
- FROM pg_stat_io WHERE context = 'bulkread' \gset
-SELECT :stats_bulkreads_after > :stats_bulkreads_before;
- ?column?
-----------
- t
-(1 row)
-
CREATE ROLE regress_heaptest_role;
-- verify permissions are checked (error due to function not callable)
SET ROLE regress_heaptest_role;
@@ -233,7 +222,6 @@ ERROR: cannot check relation "test_foreign_table"
DETAIL: This operation is not supported for foreign tables.
-- cleanup
DROP TABLE heaptest;
-DROP TABLESPACE regress_test_stats_tblspc;
DROP TABLE test_partition;
DROP TABLE test_partitioned;
DROP OWNED BY regress_heaptest_role; -- permissions
diff --git a/contrib/amcheck/sql/check_heap.sql b/contrib/amcheck/sql/check_heap.sql
index 1745bae..3b429c3 100644
--- a/contrib/amcheck/sql/check_heap.sql
+++ b/contrib/amcheck/sql/check_heap.sql
@@ -40,12 +40,9 @@ INSERT INTO heaptest (a, b)
-- same transaction. The heaptest table is smaller than the default
-- wal_skip_threshold, so a wal_level=minimal commit reads the table into
-- shared_buffers. A transaction delays that and excludes any autovacuum.
-SET allow_in_place_tablespaces = true;
-CREATE TABLESPACE regress_test_stats_tblspc LOCATION '';
SELECT sum(reads) AS stats_bulkreads_before
FROM pg_stat_io WHERE context = 'bulkread' \gset
BEGIN;
-ALTER TABLE heaptest SET TABLESPACE regress_test_stats_tblspc;
-- Check that valid options are not rejected nor corruption reported
-- for a non-empty table
SELECT * FROM verify_heapam(relation := 'heaptest', skip := 'none');
@@ -58,9 +55,6 @@ COMMIT;
-- ALTER TABLE ... SET TABLESPACE ...
-- causing an additional bulkread, which should be reflected in pg_stat_io.
SELECT pg_stat_force_next_flush();
-SELECT sum(reads) AS stats_bulkreads_after
- FROM pg_stat_io WHERE context = 'bulkread' \gset
-SELECT :stats_bulkreads_after > :stats_bulkreads_before;
CREATE ROLE regress_heaptest_role;
@@ -140,7 +134,6 @@ SELECT * FROM verify_heapam('test_foreign_table',
-- cleanup
DROP TABLE heaptest;
-DROP TABLESPACE regress_test_stats_tblspc;
DROP TABLE test_partition;
DROP TABLE test_partitioned;
DROP OWNED BY regress_heaptest_role; -- permissions
diff --git a/contrib/citext/expected/create_index_acl.out b/contrib/citext/expected/create_index_acl.out
index 33be13a..70a406c 100644
--- a/contrib/citext/expected/create_index_acl.out
+++ b/contrib/citext/expected/create_index_acl.out
@@ -5,9 +5,6 @@
-- owner having as few applicable privileges as possible. (The privileges.sql
-- regress_sro_user tests look for the opposite defect; they confirm that
-- DefineIndex() uses the table owner userid where necessary.)
-SET allow_in_place_tablespaces = true;
-CREATE TABLESPACE regress_create_idx_tblspace LOCATION '';
-RESET allow_in_place_tablespaces;
BEGIN;
CREATE ROLE regress_minimal;
CREATE SCHEMA s;
@@ -49,11 +46,9 @@ ALTER TABLE s.x OWNER TO regress_minimal;
-- Empty-table DefineIndex()
CREATE UNIQUE INDEX u0rows ON s.x USING btree
((s.index_this_expr(y, s.const())) COLLATE s.coll s.citext_pattern_ops)
- TABLESPACE regress_create_idx_tblspace
WHERE s.index_row_if(y);
ALTER TABLE s.x ADD CONSTRAINT e0rows EXCLUDE USING btree
((s.index_this_expr(y, s.const())) COLLATE s.coll WITH s.=)
- USING INDEX TABLESPACE regress_create_idx_tblspace
WHERE (s.index_row_if(y));
-- Make the table nonempty.
INSERT INTO s.x VALUES ('foo'), ('bar');
@@ -66,11 +61,9 @@ RESET search_path;
GRANT EXECUTE ON FUNCTION s.index_this_expr TO regress_minimal;
CREATE UNIQUE INDEX u2rows ON s.x USING btree
((s.index_this_expr(y, s.const())) COLLATE s.coll s.citext_pattern_ops)
- TABLESPACE regress_create_idx_tblspace
WHERE s.index_row_if(y);
ALTER TABLE s.x ADD CONSTRAINT e2rows EXCLUDE USING btree
((s.index_this_expr(y, s.const())) COLLATE s.coll WITH s.=)
- USING INDEX TABLESPACE regress_create_idx_tblspace
WHERE (s.index_row_if(y));
-- Shall not find s.coll via search_path, despite the s.const->public.setter
-- call having set search_path=s during expression planning. Suppress the
@@ -78,9 +71,7 @@ ALTER TABLE s.x ADD CONSTRAINT e2rows EXCLUDE USING btree
\set VERBOSITY sqlstate
ALTER TABLE s.x ADD CONSTRAINT underqualified EXCLUDE USING btree
((s.index_this_expr(y, s.const())) COLLATE coll WITH s.=)
- USING INDEX TABLESPACE regress_create_idx_tblspace
WHERE (s.index_row_if(y));
ERROR: 42704
\set VERBOSITY default
ROLLBACK;
-DROP TABLESPACE regress_create_idx_tblspace;
diff --git a/contrib/citext/sql/create_index_acl.sql b/contrib/citext/sql/create_index_acl.sql
index 10b5225..ae442e1 100644
--- a/contrib/citext/sql/create_index_acl.sql
+++ b/contrib/citext/sql/create_index_acl.sql
@@ -6,10 +6,6 @@
-- regress_sro_user tests look for the opposite defect; they confirm that
-- DefineIndex() uses the table owner userid where necessary.)
-SET allow_in_place_tablespaces = true;
-CREATE TABLESPACE regress_create_idx_tblspace LOCATION '';
-RESET allow_in_place_tablespaces;
-
BEGIN;
CREATE ROLE regress_minimal;
CREATE SCHEMA s;
@@ -51,11 +47,9 @@ ALTER TABLE s.x OWNER TO regress_minimal;
-- Empty-table DefineIndex()
CREATE UNIQUE INDEX u0rows ON s.x USING btree
((s.index_this_expr(y, s.const())) COLLATE s.coll s.citext_pattern_ops)
- TABLESPACE regress_create_idx_tblspace
WHERE s.index_row_if(y);
ALTER TABLE s.x ADD CONSTRAINT e0rows EXCLUDE USING btree
((s.index_this_expr(y, s.const())) COLLATE s.coll WITH s.=)
- USING INDEX TABLESPACE regress_create_idx_tblspace
WHERE (s.index_row_if(y));
-- Make the table nonempty.
INSERT INTO s.x VALUES ('foo'), ('bar');
@@ -68,11 +62,9 @@ RESET search_path;
GRANT EXECUTE ON FUNCTION s.index_this_expr TO regress_minimal;
CREATE UNIQUE INDEX u2rows ON s.x USING btree
((s.index_this_expr(y, s.const())) COLLATE s.coll s.citext_pattern_ops)
- TABLESPACE regress_create_idx_tblspace
WHERE s.index_row_if(y);
ALTER TABLE s.x ADD CONSTRAINT e2rows EXCLUDE USING btree
((s.index_this_expr(y, s.const())) COLLATE s.coll WITH s.=)
- USING INDEX TABLESPACE regress_create_idx_tblspace
WHERE (s.index_row_if(y));
-- Shall not find s.coll via search_path, despite the s.const->public.setter
-- call having set search_path=s during expression planning. Suppress the
@@ -80,9 +72,7 @@ ALTER TABLE s.x ADD CONSTRAINT e2rows EXCLUDE USING btree
\set VERBOSITY sqlstate
ALTER TABLE s.x ADD CONSTRAINT underqualified EXCLUDE USING btree
((s.index_this_expr(y, s.const())) COLLATE coll WITH s.=)
- USING INDEX TABLESPACE regress_create_idx_tblspace
WHERE (s.index_row_if(y));
\set VERBOSITY default
ROLLBACK;
-DROP TABLESPACE regress_create_idx_tblspace;
diff --git a/contrib/file_fdw/expected/file_fdw.out b/contrib/file_fdw/expected/file_fdw.out
index 72304e0..ebe131b 100644
--- a/contrib/file_fdw/expected/file_fdw.out
+++ b/contrib/file_fdw/expected/file_fdw.out
@@ -4,6 +4,7 @@
-- directory paths are passed to us in environment variables
\getenv abs_srcdir PG_ABS_SRCDIR
-- Clean up in case a prior regression run failed
+SET compute_query_id TO 'off';
SET client_min_messages TO 'warning';
DROP ROLE IF EXISTS regress_file_fdw_superuser, regress_file_fdw_user, regress_no_priv_user;
RESET client_min_messages;
diff --git a/contrib/file_fdw/sql/file_fdw.sql b/contrib/file_fdw/sql/file_fdw.sql
index f0548e1..848a08c 100644
--- a/contrib/file_fdw/sql/file_fdw.sql
+++ b/contrib/file_fdw/sql/file_fdw.sql
@@ -6,6 +6,7 @@
\getenv abs_srcdir PG_ABS_SRCDIR
-- Clean up in case a prior regression run failed
+SET compute_query_id TO 'off';
SET client_min_messages TO 'warning';
DROP ROLE IF EXISTS regress_file_fdw_superuser, regress_file_fdw_user, regress_no_priv_user;
RESET client_min_messages;
diff --git a/contrib/pageinspect/expected/gist.out b/contrib/pageinspect/expected/gist.out
index d1adbab..38b52ac 100644
--- a/contrib/pageinspect/expected/gist.out
+++ b/contrib/pageinspect/expected/gist.out
@@ -10,25 +10,6 @@ BEGIN;
CREATE TABLE test_gist AS SELECT point(i,i) p, i::text t FROM
generate_series(1,1000) i;
CREATE INDEX test_gist_idx ON test_gist USING gist (p);
--- Page 0 is the root, the rest are leaf pages
-SELECT * FROM gist_page_opaque_info(get_raw_page('test_gist_idx', 0));
- lsn | nsn | rightlink | flags
------+-----+------------+-------
- 0/1 | 0/0 | 4294967295 | {}
-(1 row)
-
-SELECT * FROM gist_page_opaque_info(get_raw_page('test_gist_idx', 1));
- lsn | nsn | rightlink | flags
------+-----+------------+--------
- 0/1 | 0/0 | 4294967295 | {leaf}
-(1 row)
-
-SELECT * FROM gist_page_opaque_info(get_raw_page('test_gist_idx', 2));
- lsn | nsn | rightlink | flags
------+-----+-----------+--------
- 0/1 | 0/0 | 1 | {leaf}
-(1 row)
-
COMMIT;
SELECT * FROM gist_page_items(get_raw_page('test_gist_idx', 0), 'test_gist_idx');
itemoffset | ctid | itemlen | dead | keys
diff --git a/contrib/pageinspect/sql/gist.sql b/contrib/pageinspect/sql/gist.sql
index d263542..607992f 100644
--- a/contrib/pageinspect/sql/gist.sql
+++ b/contrib/pageinspect/sql/gist.sql
@@ -12,11 +12,6 @@ CREATE TABLE test_gist AS SELECT point(i,i) p, i::text t FROM
generate_series(1,1000) i;
CREATE INDEX test_gist_idx ON test_gist USING gist (p);
--- Page 0 is the root, the rest are leaf pages
-SELECT * FROM gist_page_opaque_info(get_raw_page('test_gist_idx', 0));
-SELECT * FROM gist_page_opaque_info(get_raw_page('test_gist_idx', 1));
-SELECT * FROM gist_page_opaque_info(get_raw_page('test_gist_idx', 2));
-
COMMIT;
SELECT * FROM gist_page_items(get_raw_page('test_gist_idx', 0), 'test_gist_idx');

View File

@@ -0,0 +1,196 @@
diff --git a/contrib/amcheck/expected/check_heap.out b/contrib/amcheck/expected/check_heap.out
index 979e5e8..2375b45 100644
--- a/contrib/amcheck/expected/check_heap.out
+++ b/contrib/amcheck/expected/check_heap.out
@@ -80,12 +80,9 @@ INSERT INTO heaptest (a, b)
-- same transaction. The heaptest table is smaller than the default
-- wal_skip_threshold, so a wal_level=minimal commit reads the table into
-- shared_buffers. A transaction delays that and excludes any autovacuum.
-SET allow_in_place_tablespaces = true;
-CREATE TABLESPACE regress_test_stats_tblspc LOCATION '';
SELECT sum(reads) AS stats_bulkreads_before
FROM pg_stat_io WHERE context = 'bulkread' \gset
BEGIN;
-ALTER TABLE heaptest SET TABLESPACE regress_test_stats_tblspc;
-- Check that valid options are not rejected nor corruption reported
-- for a non-empty table
SELECT * FROM verify_heapam(relation := 'heaptest', skip := 'none');
@@ -118,14 +115,6 @@ SELECT pg_stat_force_next_flush();
(1 row)
-SELECT sum(reads) AS stats_bulkreads_after
- FROM pg_stat_io WHERE context = 'bulkread' \gset
-SELECT :stats_bulkreads_after > :stats_bulkreads_before;
- ?column?
-----------
- t
-(1 row)
-
CREATE ROLE regress_heaptest_role;
-- verify permissions are checked (error due to function not callable)
SET ROLE regress_heaptest_role;
@@ -233,7 +222,6 @@ ERROR: cannot check relation "test_foreign_table"
DETAIL: This operation is not supported for foreign tables.
-- cleanup
DROP TABLE heaptest;
-DROP TABLESPACE regress_test_stats_tblspc;
DROP TABLE test_partition;
DROP TABLE test_partitioned;
DROP OWNED BY regress_heaptest_role; -- permissions
diff --git a/contrib/amcheck/sql/check_heap.sql b/contrib/amcheck/sql/check_heap.sql
index 1745bae..3b429c3 100644
--- a/contrib/amcheck/sql/check_heap.sql
+++ b/contrib/amcheck/sql/check_heap.sql
@@ -40,12 +40,9 @@ INSERT INTO heaptest (a, b)
-- same transaction. The heaptest table is smaller than the default
-- wal_skip_threshold, so a wal_level=minimal commit reads the table into
-- shared_buffers. A transaction delays that and excludes any autovacuum.
-SET allow_in_place_tablespaces = true;
-CREATE TABLESPACE regress_test_stats_tblspc LOCATION '';
SELECT sum(reads) AS stats_bulkreads_before
FROM pg_stat_io WHERE context = 'bulkread' \gset
BEGIN;
-ALTER TABLE heaptest SET TABLESPACE regress_test_stats_tblspc;
-- Check that valid options are not rejected nor corruption reported
-- for a non-empty table
SELECT * FROM verify_heapam(relation := 'heaptest', skip := 'none');
@@ -58,9 +55,6 @@ COMMIT;
-- ALTER TABLE ... SET TABLESPACE ...
-- causing an additional bulkread, which should be reflected in pg_stat_io.
SELECT pg_stat_force_next_flush();
-SELECT sum(reads) AS stats_bulkreads_after
- FROM pg_stat_io WHERE context = 'bulkread' \gset
-SELECT :stats_bulkreads_after > :stats_bulkreads_before;
CREATE ROLE regress_heaptest_role;
@@ -140,7 +134,6 @@ SELECT * FROM verify_heapam('test_foreign_table',
-- cleanup
DROP TABLE heaptest;
-DROP TABLESPACE regress_test_stats_tblspc;
DROP TABLE test_partition;
DROP TABLE test_partitioned;
DROP OWNED BY regress_heaptest_role; -- permissions
diff --git a/contrib/citext/expected/create_index_acl.out b/contrib/citext/expected/create_index_acl.out
index 33be13a..70a406c 100644
--- a/contrib/citext/expected/create_index_acl.out
+++ b/contrib/citext/expected/create_index_acl.out
@@ -5,9 +5,6 @@
-- owner having as few applicable privileges as possible. (The privileges.sql
-- regress_sro_user tests look for the opposite defect; they confirm that
-- DefineIndex() uses the table owner userid where necessary.)
-SET allow_in_place_tablespaces = true;
-CREATE TABLESPACE regress_create_idx_tblspace LOCATION '';
-RESET allow_in_place_tablespaces;
BEGIN;
CREATE ROLE regress_minimal;
CREATE SCHEMA s;
@@ -49,11 +46,9 @@ ALTER TABLE s.x OWNER TO regress_minimal;
-- Empty-table DefineIndex()
CREATE UNIQUE INDEX u0rows ON s.x USING btree
((s.index_this_expr(y, s.const())) COLLATE s.coll s.citext_pattern_ops)
- TABLESPACE regress_create_idx_tblspace
WHERE s.index_row_if(y);
ALTER TABLE s.x ADD CONSTRAINT e0rows EXCLUDE USING btree
((s.index_this_expr(y, s.const())) COLLATE s.coll WITH s.=)
- USING INDEX TABLESPACE regress_create_idx_tblspace
WHERE (s.index_row_if(y));
-- Make the table nonempty.
INSERT INTO s.x VALUES ('foo'), ('bar');
@@ -66,11 +61,9 @@ RESET search_path;
GRANT EXECUTE ON FUNCTION s.index_this_expr TO regress_minimal;
CREATE UNIQUE INDEX u2rows ON s.x USING btree
((s.index_this_expr(y, s.const())) COLLATE s.coll s.citext_pattern_ops)
- TABLESPACE regress_create_idx_tblspace
WHERE s.index_row_if(y);
ALTER TABLE s.x ADD CONSTRAINT e2rows EXCLUDE USING btree
((s.index_this_expr(y, s.const())) COLLATE s.coll WITH s.=)
- USING INDEX TABLESPACE regress_create_idx_tblspace
WHERE (s.index_row_if(y));
-- Shall not find s.coll via search_path, despite the s.const->public.setter
-- call having set search_path=s during expression planning. Suppress the
@@ -78,9 +71,7 @@ ALTER TABLE s.x ADD CONSTRAINT e2rows EXCLUDE USING btree
\set VERBOSITY sqlstate
ALTER TABLE s.x ADD CONSTRAINT underqualified EXCLUDE USING btree
((s.index_this_expr(y, s.const())) COLLATE coll WITH s.=)
- USING INDEX TABLESPACE regress_create_idx_tblspace
WHERE (s.index_row_if(y));
ERROR: 42704
\set VERBOSITY default
ROLLBACK;
-DROP TABLESPACE regress_create_idx_tblspace;
diff --git a/contrib/citext/sql/create_index_acl.sql b/contrib/citext/sql/create_index_acl.sql
index 10b5225..ae442e1 100644
--- a/contrib/citext/sql/create_index_acl.sql
+++ b/contrib/citext/sql/create_index_acl.sql
@@ -6,10 +6,6 @@
-- regress_sro_user tests look for the opposite defect; they confirm that
-- DefineIndex() uses the table owner userid where necessary.)
-SET allow_in_place_tablespaces = true;
-CREATE TABLESPACE regress_create_idx_tblspace LOCATION '';
-RESET allow_in_place_tablespaces;
-
BEGIN;
CREATE ROLE regress_minimal;
CREATE SCHEMA s;
@@ -51,11 +47,9 @@ ALTER TABLE s.x OWNER TO regress_minimal;
-- Empty-table DefineIndex()
CREATE UNIQUE INDEX u0rows ON s.x USING btree
((s.index_this_expr(y, s.const())) COLLATE s.coll s.citext_pattern_ops)
- TABLESPACE regress_create_idx_tblspace
WHERE s.index_row_if(y);
ALTER TABLE s.x ADD CONSTRAINT e0rows EXCLUDE USING btree
((s.index_this_expr(y, s.const())) COLLATE s.coll WITH s.=)
- USING INDEX TABLESPACE regress_create_idx_tblspace
WHERE (s.index_row_if(y));
-- Make the table nonempty.
INSERT INTO s.x VALUES ('foo'), ('bar');
@@ -68,11 +62,9 @@ RESET search_path;
GRANT EXECUTE ON FUNCTION s.index_this_expr TO regress_minimal;
CREATE UNIQUE INDEX u2rows ON s.x USING btree
((s.index_this_expr(y, s.const())) COLLATE s.coll s.citext_pattern_ops)
- TABLESPACE regress_create_idx_tblspace
WHERE s.index_row_if(y);
ALTER TABLE s.x ADD CONSTRAINT e2rows EXCLUDE USING btree
((s.index_this_expr(y, s.const())) COLLATE s.coll WITH s.=)
- USING INDEX TABLESPACE regress_create_idx_tblspace
WHERE (s.index_row_if(y));
-- Shall not find s.coll via search_path, despite the s.const->public.setter
-- call having set search_path=s during expression planning. Suppress the
@@ -80,9 +72,7 @@ ALTER TABLE s.x ADD CONSTRAINT e2rows EXCLUDE USING btree
\set VERBOSITY sqlstate
ALTER TABLE s.x ADD CONSTRAINT underqualified EXCLUDE USING btree
((s.index_this_expr(y, s.const())) COLLATE coll WITH s.=)
- USING INDEX TABLESPACE regress_create_idx_tblspace
WHERE (s.index_row_if(y));
\set VERBOSITY default
ROLLBACK;
-DROP TABLESPACE regress_create_idx_tblspace;
diff --git a/contrib/file_fdw/expected/file_fdw.out b/contrib/file_fdw/expected/file_fdw.out
index 86c148a..81bdb2c 100644
--- a/contrib/file_fdw/expected/file_fdw.out
+++ b/contrib/file_fdw/expected/file_fdw.out
@@ -4,6 +4,7 @@
-- directory paths are passed to us in environment variables
\getenv abs_srcdir PG_ABS_SRCDIR
-- Clean up in case a prior regression run failed
+SET compute_query_id TO 'off';
SET client_min_messages TO 'warning';
DROP ROLE IF EXISTS regress_file_fdw_superuser, regress_file_fdw_user, regress_no_priv_user;
RESET client_min_messages;
diff --git a/contrib/file_fdw/sql/file_fdw.sql b/contrib/file_fdw/sql/file_fdw.sql
index f0548e1..848a08c 100644
--- a/contrib/file_fdw/sql/file_fdw.sql
+++ b/contrib/file_fdw/sql/file_fdw.sql
@@ -6,6 +6,7 @@
\getenv abs_srcdir PG_ABS_SRCDIR
-- Clean up in case a prior regression run failed
+SET compute_query_id TO 'off';
SET client_min_messages TO 'warning';
DROP ROLE IF EXISTS regress_file_fdw_superuser, regress_file_fdw_user, regress_no_priv_user;
RESET client_min_messages;

View File

@@ -0,0 +1,19 @@
commit ec6a491d126882966a696f9ad5d3698935361d55
Author: Alexey Masterov <alexeymasterov@neon.tech>
Date: Tue Dec 17 10:25:00 2024 +0100
Changes required to run tests on Neon
diff --git a/test/expected/permissions_functions.out b/test/expected/permissions_functions.out
index 1e9fbc2..94cbe25 100644
--- a/test/expected/permissions_functions.out
+++ b/test/expected/permissions_functions.out
@@ -64,7 +64,7 @@ begin;
select current_user;
current_user
--------------
- postgres
+ cloud_admin
(1 row)
-- revoke default access from the public role for new functions

View File

@@ -6,16 +6,16 @@ index da723b8..5328114 100644
----
-- No.A-1-1-3
CREATE EXTENSION pg_hint_plan;
+LOG: Sending request to compute_ctl: http://localhost:3080/extension_server/pg_hint_plan
+LOG: Sending request to compute_ctl: http://localhost:3081/extension_server/pg_hint_plan
-- No.A-1-2-3
DROP EXTENSION pg_hint_plan;
-- No.A-1-1-4
CREATE SCHEMA other_schema;
CREATE EXTENSION pg_hint_plan SCHEMA other_schema;
+LOG: Sending request to compute_ctl: http://localhost:3080/extension_server/pg_hint_plan
+LOG: Sending request to compute_ctl: http://localhost:3081/extension_server/pg_hint_plan
ERROR: extension "pg_hint_plan" must be installed in schema "hint_plan"
CREATE EXTENSION pg_hint_plan;
+LOG: Sending request to compute_ctl: http://localhost:3080/extension_server/pg_hint_plan
+LOG: Sending request to compute_ctl: http://localhost:3081/extension_server/pg_hint_plan
DROP SCHEMA other_schema;
----
---- No. A-5-1 comment pattern
@@ -35,7 +35,7 @@ index d372459..6282afe 100644
SET client_min_messages TO LOG;
SET pg_hint_plan.enable_hint TO on;
CREATE EXTENSION file_fdw;
+LOG: Sending request to compute_ctl: http://localhost:3080/extension_server/file_fdw
+LOG: Sending request to compute_ctl: http://localhost:3081/extension_server/file_fdw
CREATE SERVER file_server FOREIGN DATA WRAPPER file_fdw;
CREATE USER MAPPING FOR PUBLIC SERVER file_server;
CREATE FOREIGN TABLE ft1 (id int, val int) SERVER file_server OPTIONS (format 'csv', filename :'filename');

View File

@@ -6,16 +6,16 @@ index e7d68a1..65a056c 100644
----
-- No.A-1-1-3
CREATE EXTENSION pg_hint_plan;
+LOG: Sending request to compute_ctl: http://localhost:3080/extension_server/pg_hint_plan
+LOG: Sending request to compute_ctl: http://localhost:3081/extension_server/pg_hint_plan
-- No.A-1-2-3
DROP EXTENSION pg_hint_plan;
-- No.A-1-1-4
CREATE SCHEMA other_schema;
CREATE EXTENSION pg_hint_plan SCHEMA other_schema;
+LOG: Sending request to compute_ctl: http://localhost:3080/extension_server/pg_hint_plan
+LOG: Sending request to compute_ctl: http://localhost:3081/extension_server/pg_hint_plan
ERROR: extension "pg_hint_plan" must be installed in schema "hint_plan"
CREATE EXTENSION pg_hint_plan;
+LOG: Sending request to compute_ctl: http://localhost:3080/extension_server/pg_hint_plan
+LOG: Sending request to compute_ctl: http://localhost:3081/extension_server/pg_hint_plan
DROP SCHEMA other_schema;
----
---- No. A-5-1 comment pattern
@@ -168,7 +168,7 @@ index 017fa4b..98d989b 100644
SET client_min_messages TO LOG;
SET pg_hint_plan.enable_hint TO on;
CREATE EXTENSION file_fdw;
+LOG: Sending request to compute_ctl: http://localhost:3080/extension_server/file_fdw
+LOG: Sending request to compute_ctl: http://localhost:3081/extension_server/file_fdw
CREATE SERVER file_server FOREIGN DATA WRAPPER file_fdw;
CREATE USER MAPPING FOR PUBLIC SERVER file_server;
CREATE FOREIGN TABLE ft1 (id int, val int) SERVER file_server OPTIONS (format 'csv', filename :'filename');

View File

@@ -47,7 +47,9 @@ files:
# Allow postgres user (which is what compute_ctl runs as) to run /neonvm/bin/resize-swap
# and /neonvm/bin/set-disk-quota as root without requiring entering a password (NOPASSWD),
# regardless of hostname (ALL)
postgres ALL=(root) NOPASSWD: /neonvm/bin/resize-swap, /neonvm/bin/set-disk-quota
#
# Also allow it to shut down the VM. The fast_import job does that when it's finished.
postgres ALL=(root) NOPASSWD: /neonvm/bin/resize-swap, /neonvm/bin/set-disk-quota, /neonvm/bin/poweroff
- filename: cgconfig.conf
content: |
# Configuration for cgroups in VM compute nodes

View File

@@ -34,18 +34,21 @@
//! -r http://pg-ext-s3-gateway \
//! ```
use std::collections::HashMap;
use std::ffi::OsString;
use std::fs::File;
use std::path::Path;
use std::process::exit;
use std::str::FromStr;
use std::sync::atomic::Ordering;
use std::sync::{mpsc, Arc, Condvar, Mutex, RwLock};
use std::time::SystemTime;
use std::{thread, time::Duration};
use anyhow::{Context, Result};
use chrono::Utc;
use clap::Arg;
use clap::Parser;
use compute_tools::disk_quota::set_disk_quota;
use compute_tools::http::server::Server;
use compute_tools::lsn_lease::launch_lsn_lease_bg_task_for_static;
use signal_hook::consts::{SIGQUIT, SIGTERM};
use signal_hook::{consts::SIGINT, iterator::Signals};
@@ -60,7 +63,6 @@ use compute_tools::compute::{
};
use compute_tools::configurator::launch_configurator;
use compute_tools::extension_server::get_pg_version_string;
use compute_tools::http::launch_http_server;
use compute_tools::logger::*;
use compute_tools::monitor::launch_monitor;
use compute_tools::params::*;
@@ -73,10 +75,109 @@ use utils::failpoint_support;
// in-case of not-set environment var
const BUILD_TAG_DEFAULT: &str = "latest";
fn main() -> Result<()> {
let scenario = failpoint_support::init();
// Compatibility hack: if the control plane specified any remote-ext-config
// use the default value for extension storage proxy gateway.
// Remove this once the control plane is updated to pass the gateway URL
fn parse_remote_ext_config(arg: &str) -> Result<String> {
if arg.starts_with("http") {
Ok(arg.trim_end_matches('/').to_string())
} else {
Ok("http://pg-ext-s3-gateway".to_string())
}
}
let (build_tag, clap_args) = init()?;
/// Generate a compute ID if one is not supplied. This exists to keep forward
/// compatibility tests working, but will be removed in a future iteration.
fn generate_compute_id() -> String {
let now = SystemTime::now();
format!(
"compute-{}",
now.duration_since(SystemTime::UNIX_EPOCH)
.unwrap()
.as_secs()
)
}
#[derive(Parser)]
#[command(rename_all = "kebab-case")]
struct Cli {
#[arg(short = 'b', long, default_value = "postgres", env = "POSTGRES_PATH")]
pub pgbin: String,
#[arg(short = 'r', long, value_parser = parse_remote_ext_config)]
pub remote_ext_config: Option<String>,
/// The port to bind the external listening HTTP server to. Clients running
/// outside the compute will talk to the compute through this port. Keep
/// the previous name for this argument around for a smoother release
/// with the control plane.
///
/// TODO: Remove the alias after the control plane release which teaches the
/// control plane about the renamed argument.
#[arg(long, alias = "http-port", default_value_t = 3080)]
pub external_http_port: u16,
/// The port to bind the internal listening HTTP server to. Clients like
/// the neon extension (for installing remote extensions) and local_proxy.
#[arg(long)]
pub internal_http_port: Option<u16>,
#[arg(short = 'D', long, value_name = "DATADIR")]
pub pgdata: String,
#[arg(short = 'C', long, value_name = "DATABASE_URL")]
pub connstr: String,
#[cfg(target_os = "linux")]
#[arg(long, default_value = "neon-postgres")]
pub cgroup: String,
#[cfg(target_os = "linux")]
#[arg(
long,
default_value = "host=localhost port=5432 dbname=postgres user=cloud_admin sslmode=disable application_name=vm-monitor"
)]
pub filecache_connstr: String,
#[cfg(target_os = "linux")]
#[arg(long, default_value = "0.0.0.0:10301")]
pub vm_monitor_addr: String,
#[arg(long, action = clap::ArgAction::SetTrue)]
pub resize_swap_on_bind: bool,
#[arg(long)]
pub set_disk_quota_for_fs: Option<String>,
#[arg(short = 's', long = "spec", group = "spec")]
pub spec_json: Option<String>,
#[arg(short = 'S', long, group = "spec-path")]
pub spec_path: Option<OsString>,
#[arg(short = 'i', long, group = "compute-id", default_value = generate_compute_id())]
pub compute_id: String,
#[arg(short = 'p', long, conflicts_with_all = ["spec", "spec-path"], value_name = "CONTROL_PLANE_API_BASE_URL")]
pub control_plane_uri: Option<String>,
}
fn main() -> Result<()> {
let cli = Cli::parse();
// For historical reasons, the main thread that processes the spec and launches postgres
// is synchronous, but we always have this tokio runtime available and we "enter" it so
// that you can use tokio::spawn() and tokio::runtime::Handle::current().block_on(...)
// from all parts of compute_ctl.
let runtime = tokio::runtime::Builder::new_multi_thread()
.enable_all()
.build()?;
let _rt_guard = runtime.enter();
let build_tag = runtime.block_on(init())?;
let scenario = failpoint_support::init();
// enable core dumping for all child processes
setrlimit(Resource::CORE, rlimit::INFINITY, rlimit::INFINITY)?;
@@ -85,13 +186,11 @@ fn main() -> Result<()> {
// Enter startup tracing context
let _startup_context_guard = startup_context_from_env();
let cli_args = process_cli(&clap_args)?;
let cli_spec = try_spec_from_cli(&cli)?;
let cli_spec = try_spec_from_cli(&clap_args, &cli_args)?;
let compute = wait_spec(build_tag, &cli, cli_spec)?;
let wait_spec_result = wait_spec(build_tag, cli_args, cli_spec)?;
start_postgres(&clap_args, wait_spec_result)?
start_postgres(&cli, compute)?
// Startup is finished, exit the startup tracing span
};
@@ -108,8 +207,8 @@ fn main() -> Result<()> {
deinit_and_exit(wait_pg_result);
}
fn init() -> Result<(String, clap::ArgMatches)> {
init_tracing_and_logging(DEFAULT_LOG_LEVEL)?;
async fn init() -> Result<String> {
init_tracing_and_logging(DEFAULT_LOG_LEVEL).await?;
let mut signals = Signals::new([SIGINT, SIGTERM, SIGQUIT])?;
thread::spawn(move || {
@@ -123,66 +222,7 @@ fn init() -> Result<(String, clap::ArgMatches)> {
.to_string();
info!("build_tag: {build_tag}");
Ok((build_tag, cli().get_matches()))
}
fn process_cli(matches: &clap::ArgMatches) -> Result<ProcessCliResult> {
let pgbin_default = "postgres";
let pgbin = matches
.get_one::<String>("pgbin")
.map(|s| s.as_str())
.unwrap_or(pgbin_default);
let ext_remote_storage = matches
.get_one::<String>("remote-ext-config")
// Compatibility hack: if the control plane specified any remote-ext-config
// use the default value for extension storage proxy gateway.
// Remove this once the control plane is updated to pass the gateway URL
.map(|conf| {
if conf.starts_with("http") {
conf.trim_end_matches('/')
} else {
"http://pg-ext-s3-gateway"
}
});
let http_port = *matches
.get_one::<u16>("http-port")
.expect("http-port is required");
let pgdata = matches
.get_one::<String>("pgdata")
.expect("PGDATA path is required");
let connstr = matches
.get_one::<String>("connstr")
.expect("Postgres connection string is required");
let spec_json = matches.get_one::<String>("spec");
let spec_path = matches.get_one::<String>("spec-path");
let resize_swap_on_bind = matches.get_flag("resize-swap-on-bind");
let set_disk_quota_for_fs = matches.get_one::<String>("set-disk-quota-for-fs");
Ok(ProcessCliResult {
connstr,
pgdata,
pgbin,
ext_remote_storage,
http_port,
spec_json,
spec_path,
resize_swap_on_bind,
set_disk_quota_for_fs,
})
}
struct ProcessCliResult<'clap> {
connstr: &'clap str,
pgdata: &'clap str,
pgbin: &'clap str,
ext_remote_storage: Option<&'clap str>,
http_port: u16,
spec_json: Option<&'clap String>,
spec_path: Option<&'clap String>,
resize_swap_on_bind: bool,
set_disk_quota_for_fs: Option<&'clap String>,
Ok(build_tag)
}
fn startup_context_from_env() -> Option<opentelemetry::ContextGuard> {
@@ -235,19 +275,9 @@ fn startup_context_from_env() -> Option<opentelemetry::ContextGuard> {
}
}
fn try_spec_from_cli(
matches: &clap::ArgMatches,
ProcessCliResult {
spec_json,
spec_path,
..
}: &ProcessCliResult,
) -> Result<CliSpecParams> {
let compute_id = matches.get_one::<String>("compute-id");
let control_plane_uri = matches.get_one::<String>("control-plane-uri");
fn try_spec_from_cli(cli: &Cli) -> Result<CliSpecParams> {
// First, try to get cluster spec from the cli argument
if let Some(spec_json) = spec_json {
if let Some(ref spec_json) = cli.spec_json {
info!("got spec from cli argument {}", spec_json);
return Ok(CliSpecParams {
spec: Some(serde_json::from_str(spec_json)?),
@@ -256,7 +286,7 @@ fn try_spec_from_cli(
}
// Second, try to read it from the file if path is provided
if let Some(spec_path) = spec_path {
if let Some(ref spec_path) = cli.spec_path {
let file = File::open(Path::new(spec_path))?;
return Ok(CliSpecParams {
spec: Some(serde_json::from_reader(file)?),
@@ -264,17 +294,11 @@ fn try_spec_from_cli(
});
}
let Some(compute_id) = compute_id else {
panic!(
"compute spec should be provided by one of the following ways: \
--spec OR --spec-path OR --control-plane-uri and --compute-id"
);
};
let Some(control_plane_uri) = control_plane_uri else {
panic!("must specify both --control-plane-uri and --compute-id or none");
if cli.control_plane_uri.is_none() {
panic!("must specify --control-plane-uri");
};
match get_spec_from_control_plane(control_plane_uri, compute_id) {
match get_spec_from_control_plane(cli.control_plane_uri.as_ref().unwrap(), &cli.compute_id) {
Ok(spec) => Ok(CliSpecParams {
spec,
live_config_allowed: true,
@@ -298,21 +322,12 @@ struct CliSpecParams {
fn wait_spec(
build_tag: String,
ProcessCliResult {
connstr,
pgdata,
pgbin,
ext_remote_storage,
resize_swap_on_bind,
set_disk_quota_for_fs,
http_port,
..
}: ProcessCliResult,
cli: &Cli,
CliSpecParams {
spec,
live_config_allowed,
}: CliSpecParams,
) -> Result<WaitSpecResult> {
) -> Result<Arc<ComputeNode>> {
let mut new_state = ComputeState::new();
let spec_set;
@@ -324,23 +339,25 @@ fn wait_spec(
} else {
spec_set = false;
}
let connstr = Url::parse(connstr).context("cannot parse connstr as a URL")?;
let connstr = Url::parse(&cli.connstr).context("cannot parse connstr as a URL")?;
let conn_conf = postgres::config::Config::from_str(connstr.as_str())
.context("cannot build postgres config from connstr")?;
let tokio_conn_conf = tokio_postgres::config::Config::from_str(connstr.as_str())
.context("cannot build tokio postgres config from connstr")?;
let compute_node = ComputeNode {
compute_id: cli.compute_id.clone(),
connstr,
conn_conf,
tokio_conn_conf,
pgdata: pgdata.to_string(),
pgbin: pgbin.to_string(),
pgversion: get_pg_version_string(pgbin),
http_port,
pgdata: cli.pgdata.clone(),
pgbin: cli.pgbin.clone(),
pgversion: get_pg_version_string(&cli.pgbin),
external_http_port: cli.external_http_port,
internal_http_port: cli.internal_http_port.unwrap_or(cli.external_http_port + 1),
live_config_allowed,
state: Mutex::new(new_state),
state_changed: Condvar::new(),
ext_remote_storage: ext_remote_storage.map(|s| s.to_string()),
ext_remote_storage: cli.remote_ext_config.clone(),
ext_download_progress: RwLock::new(HashMap::new()),
build_tag,
};
@@ -354,10 +371,13 @@ fn wait_spec(
compute.prewarm_postgres()?;
}
// Launch http service first, so that we can serve control-plane requests
// while configuration is still in progress.
let _http_handle =
launch_http_server(http_port, &compute).expect("cannot launch http endpoint thread");
// Launch the external HTTP server first, so that we can serve control plane
// requests while configuration is still in progress.
Server::External(cli.external_http_port).launch(&compute);
// The internal HTTP server could be launched later, but there isn't much
// sense in waiting.
Server::Internal(cli.internal_http_port.unwrap_or(cli.external_http_port + 1)).launch(&compute);
if !spec_set {
// No spec provided, hang waiting for it.
@@ -389,27 +409,12 @@ fn wait_spec(
launch_lsn_lease_bg_task_for_static(&compute);
Ok(WaitSpecResult {
compute,
resize_swap_on_bind,
set_disk_quota_for_fs: set_disk_quota_for_fs.cloned(),
})
}
struct WaitSpecResult {
compute: Arc<ComputeNode>,
resize_swap_on_bind: bool,
set_disk_quota_for_fs: Option<String>,
Ok(compute)
}
fn start_postgres(
// need to allow unused because `matches` is only used if target_os = "linux"
#[allow(unused_variables)] matches: &clap::ArgMatches,
WaitSpecResult {
compute,
resize_swap_on_bind,
set_disk_quota_for_fs,
}: WaitSpecResult,
cli: &Cli,
compute: Arc<ComputeNode>,
) -> Result<(Option<PostgresHandle>, StartPostgresResult)> {
// We got all we need, update the state.
let mut state = compute.state.lock().unwrap();
@@ -437,7 +442,7 @@ fn start_postgres(
let mut delay_exit = false;
// Resize swap to the desired size if the compute spec says so
if let (Some(size_bytes), true) = (swap_size_bytes, resize_swap_on_bind) {
if let (Some(size_bytes), true) = (swap_size_bytes, cli.resize_swap_on_bind) {
// To avoid 'swapoff' hitting postgres startup, we need to run resize-swap to completion
// *before* starting postgres.
//
@@ -464,9 +469,9 @@ fn start_postgres(
// Set disk quota if the compute spec says so
if let (Some(disk_quota_bytes), Some(disk_quota_fs_mountpoint)) =
(disk_quota_bytes, set_disk_quota_for_fs)
(disk_quota_bytes, cli.set_disk_quota_for_fs.as_ref())
{
match set_disk_quota(disk_quota_bytes, &disk_quota_fs_mountpoint) {
match set_disk_quota(disk_quota_bytes, disk_quota_fs_mountpoint) {
Ok(()) => {
let size_mib = disk_quota_bytes as f32 / (1 << 20) as f32; // just for more coherent display.
info!(%disk_quota_bytes, %size_mib, "set disk quota");
@@ -509,27 +514,6 @@ fn start_postgres(
if #[cfg(target_os = "linux")] {
use std::env;
use tokio_util::sync::CancellationToken;
let vm_monitor_addr = matches
.get_one::<String>("vm-monitor-addr")
.expect("--vm-monitor-addr should always be set because it has a default arg");
let file_cache_connstr = matches.get_one::<String>("filecache-connstr");
let cgroup = matches.get_one::<String>("cgroup");
// Only make a runtime if we need to.
// Note: it seems like you can make a runtime in an inner scope and
// if you start a task in it it won't be dropped. However, make it
// in the outermost scope just to be safe.
let rt = if env::var_os("AUTOSCALING").is_some() {
Some(
tokio::runtime::Builder::new_multi_thread()
.worker_threads(4)
.enable_all()
.build()
.expect("failed to create tokio runtime for monitor")
)
} else {
None
};
// This token is used internally by the monitor to clean up all threads
let token = CancellationToken::new();
@@ -538,19 +522,22 @@ fn start_postgres(
let pgconnstr = if disable_lfc_resizing.unwrap_or(false) {
None
} else {
file_cache_connstr.cloned()
Some(cli.filecache_connstr.clone())
};
let vm_monitor = rt.as_ref().map(|rt| {
rt.spawn(vm_monitor::start(
let vm_monitor = if env::var_os("AUTOSCALING").is_some() {
let vm_monitor = tokio::spawn(vm_monitor::start(
Box::leak(Box::new(vm_monitor::Args {
cgroup: cgroup.cloned(),
cgroup: Some(cli.cgroup.clone()),
pgconnstr,
addr: vm_monitor_addr.clone(),
addr: cli.vm_monitor_addr.clone(),
})),
token.clone(),
))
});
));
Some(vm_monitor)
} else {
None
};
}
}
@@ -560,8 +547,6 @@ fn start_postgres(
delay_exit,
compute,
#[cfg(target_os = "linux")]
rt,
#[cfg(target_os = "linux")]
token,
#[cfg(target_os = "linux")]
vm_monitor,
@@ -569,15 +554,13 @@ fn start_postgres(
))
}
type PostgresHandle = (std::process::Child, std::thread::JoinHandle<()>);
type PostgresHandle = (std::process::Child, tokio::task::JoinHandle<Result<()>>);
struct StartPostgresResult {
delay_exit: bool,
// passed through from WaitSpecResult
compute: Arc<ComputeNode>,
#[cfg(target_os = "linux")]
rt: Option<tokio::runtime::Runtime>,
#[cfg(target_os = "linux")]
token: tokio_util::sync::CancellationToken,
#[cfg(target_os = "linux")]
@@ -596,10 +579,10 @@ fn wait_postgres(pg: Option<PostgresHandle>) -> Result<WaitPostgresResult> {
.expect("failed to start waiting on Postgres process");
PG_PID.store(0, Ordering::SeqCst);
// Process has exited, so we can join the logs thread.
let _ = logs_handle
.join()
.map_err(|e| tracing::error!("log thread panicked: {:?}", e));
// Process has exited. Wait for the log collecting task to finish.
let _ = tokio::runtime::Handle::current()
.block_on(logs_handle)
.map_err(|e| tracing::error!("log task panicked: {:?}", e));
info!("Postgres exited with code {}, shutting down", ecode);
exit_code = ecode.code()
@@ -620,8 +603,6 @@ fn cleanup_after_postgres_exit(
vm_monitor,
#[cfg(target_os = "linux")]
token,
#[cfg(target_os = "linux")]
rt,
}: StartPostgresResult,
) -> Result<bool> {
// Terminate the vm_monitor so it releases the file watcher on
@@ -634,10 +615,6 @@ fn cleanup_after_postgres_exit(
token.cancel();
// Kills the actual task running the monitor
handle.abort();
// If handle is some, rt must have been used to produce it, and
// hence is also some
rt.unwrap().shutdown_timeout(Duration::from_secs(2));
}
}
}
@@ -702,105 +679,6 @@ fn deinit_and_exit(WaitPostgresResult { exit_code }: WaitPostgresResult) -> ! {
exit(exit_code.unwrap_or(1))
}
fn cli() -> clap::Command {
// Env variable is set by `cargo`
let version = option_env!("CARGO_PKG_VERSION").unwrap_or("unknown");
clap::Command::new("compute_ctl")
.version(version)
.arg(
Arg::new("http-port")
.long("http-port")
.value_name("HTTP_PORT")
.default_value("3080")
.value_parser(clap::value_parser!(u16))
.required(false),
)
.arg(
Arg::new("connstr")
.short('C')
.long("connstr")
.value_name("DATABASE_URL")
.required(true),
)
.arg(
Arg::new("pgdata")
.short('D')
.long("pgdata")
.value_name("DATADIR")
.required(true),
)
.arg(
Arg::new("pgbin")
.short('b')
.long("pgbin")
.default_value("postgres")
.value_name("POSTGRES_PATH"),
)
.arg(
Arg::new("spec")
.short('s')
.long("spec")
.value_name("SPEC_JSON"),
)
.arg(
Arg::new("spec-path")
.short('S')
.long("spec-path")
.value_name("SPEC_PATH"),
)
.arg(
Arg::new("compute-id")
.short('i')
.long("compute-id")
.value_name("COMPUTE_ID"),
)
.arg(
Arg::new("control-plane-uri")
.short('p')
.long("control-plane-uri")
.value_name("CONTROL_PLANE_API_BASE_URI"),
)
.arg(
Arg::new("remote-ext-config")
.short('r')
.long("remote-ext-config")
.value_name("REMOTE_EXT_CONFIG"),
)
// TODO(fprasx): we currently have default arguments because the cloud PR
// to pass them in hasn't been merged yet. We should get rid of them once
// the PR is merged.
.arg(
Arg::new("vm-monitor-addr")
.long("vm-monitor-addr")
.default_value("0.0.0.0:10301")
.value_name("VM_MONITOR_ADDR"),
)
.arg(
Arg::new("cgroup")
.long("cgroup")
.default_value("neon-postgres")
.value_name("CGROUP"),
)
.arg(
Arg::new("filecache-connstr")
.long("filecache-connstr")
.default_value(
"host=localhost port=5432 dbname=postgres user=cloud_admin sslmode=disable application_name=vm-monitor",
)
.value_name("FILECACHE_CONNSTR"),
)
.arg(
Arg::new("resize-swap-on-bind")
.long("resize-swap-on-bind")
.action(clap::ArgAction::SetTrue),
)
.arg(
Arg::new("set-disk-quota-for-fs")
.long("set-disk-quota-for-fs")
.value_name("SET_DISK_QUOTA_FOR_FS")
)
}
/// When compute_ctl is killed, send also termination signal to sync-safekeepers
/// to prevent leakage. TODO: it is better to convert compute_ctl to async and
/// wait for termination which would be easy then.
@@ -810,7 +688,14 @@ fn handle_exit_signal(sig: i32) {
exit(1);
}
#[test]
fn verify_cli() {
cli().debug_assert()
#[cfg(test)]
mod test {
use clap::CommandFactory;
use super::Cli;
#[test]
fn verify_cli() {
Cli::command().debug_assert()
}
}

View File

@@ -60,6 +60,16 @@ struct Args {
pg_lib_dir: Utf8PathBuf,
#[clap(long)]
pg_port: Option<u16>, // port to run postgres on, 5432 is default
/// Number of CPUs in the system. This is used to configure # of
/// parallel worker processes, for index creation.
#[clap(long, env = "NEON_IMPORTER_NUM_CPUS")]
num_cpus: Option<usize>,
/// Amount of RAM in the system. This is used to configure shared_buffers
/// and maintenance_work_mem.
#[clap(long, env = "NEON_IMPORTER_MEMORY_MB")]
memory_mb: Option<usize>,
}
#[serde_with::serde_as]
@@ -202,7 +212,16 @@ pub(crate) async fn main() -> anyhow::Result<()> {
.await
.context("initdb")?;
let nproc = num_cpus::get();
// If the caller didn't specify CPU / RAM to use for sizing, default to
// number of CPUs in the system, and pretty arbitrarily, 256 MB of RAM.
let nproc = args.num_cpus.unwrap_or_else(num_cpus::get);
let memory_mb = args.memory_mb.unwrap_or(256);
// Somewhat arbitrarily, use 10 % of memory for shared buffer cache, 70% for
// maintenance_work_mem (i.e. for sorting during index creation), and leave the rest
// available for misc other stuff that PostgreSQL uses memory for.
let shared_buffers_mb = ((memory_mb as f32) * 0.10) as usize;
let maintenance_work_mem_mb = ((memory_mb as f32) * 0.70) as usize;
//
// Launch postgres process
@@ -212,12 +231,15 @@ pub(crate) async fn main() -> anyhow::Result<()> {
.arg(&pgdata_dir)
.args(["-p", &format!("{pg_port}")])
.args(["-c", "wal_level=minimal"])
.args(["-c", "shared_buffers=10GB"])
.args(["-c", &format!("shared_buffers={shared_buffers_mb}MB")])
.args(["-c", "max_wal_senders=0"])
.args(["-c", "fsync=off"])
.args(["-c", "full_page_writes=off"])
.args(["-c", "synchronous_commit=off"])
.args(["-c", "maintenance_work_mem=8388608"])
.args([
"-c",
&format!("maintenance_work_mem={maintenance_work_mem_mb}MB"),
])
.args(["-c", &format!("max_parallel_maintenance_workers={nproc}")])
.args(["-c", &format!("max_parallel_workers={nproc}")])
.args(["-c", &format!("max_parallel_workers_per_gather={nproc}")])
@@ -231,6 +253,14 @@ pub(crate) async fn main() -> anyhow::Result<()> {
])
.env_clear()
.env("LD_LIBRARY_PATH", &pg_lib_dir)
.env(
"ASAN_OPTIONS",
std::env::var("ASAN_OPTIONS").unwrap_or_default(),
)
.env(
"UBSAN_OPTIONS",
std::env::var("UBSAN_OPTIONS").unwrap_or_default(),
)
.stdout(std::process::Stdio::piped())
.stderr(std::process::Stdio::piped())
.spawn()

View File

@@ -140,5 +140,34 @@ pub async fn get_database_schema(
warn!("pg_dump stderr: {}", line)
}
});
Ok(initial_stream.chain(stdout_reader.map(|res| res.map(|b| b.freeze()))))
#[allow(dead_code)]
struct SchemaStream<S> {
// We keep a reference to the child process to ensure it stays alive
// while the stream is being consumed. When SchemaStream is dropped,
// cmd will be dropped, which triggers kill_on_drop and terminates pg_dump
cmd: tokio::process::Child,
stream: S,
}
impl<S> Stream for SchemaStream<S>
where
S: Stream<Item = Result<bytes::Bytes, std::io::Error>> + Unpin,
{
type Item = Result<bytes::Bytes, std::io::Error>;
fn poll_next(
mut self: std::pin::Pin<&mut Self>,
cx: &mut std::task::Context<'_>,
) -> std::task::Poll<Option<Self::Item>> {
Stream::poll_next(std::pin::Pin::new(&mut self.stream), cx)
}
}
let schema_stream = SchemaStream {
cmd,
stream: initial_stream.chain(stdout_reader.map(|res| res.map(|b| b.freeze()))),
};
Ok(schema_stream)
}

View File

@@ -9,7 +9,6 @@ use std::str::FromStr;
use std::sync::atomic::AtomicU32;
use std::sync::atomic::Ordering;
use std::sync::{Arc, Condvar, Mutex, RwLock};
use std::thread;
use std::time::Duration;
use std::time::Instant;
@@ -59,6 +58,8 @@ pub static PG_PID: AtomicU32 = AtomicU32::new(0);
/// Compute node info shared across several `compute_ctl` threads.
pub struct ComputeNode {
/// The ID of the compute
pub compute_id: String,
// Url type maintains proper escaping
pub connstr: url::Url,
// We connect to Postgres from many different places, so build configs once
@@ -81,8 +82,10 @@ pub struct ComputeNode {
/// - we push spec and it does configuration
/// - but then it is restarted without any spec again
pub live_config_allowed: bool,
/// The port that the compute's HTTP server listens on
pub http_port: u16,
/// The port that the compute's external HTTP server listens on
pub external_http_port: u16,
/// The port that the compute's internal HTTP server listens on
pub internal_http_port: u16,
/// Volatile part of the `ComputeNode`, which should be used under `Mutex`.
/// To allow HTTP API server to serving status requests, while configuration
/// is in progress, lock should be held only for short periods of time to do
@@ -546,11 +549,7 @@ impl ComputeNode {
pub fn check_safekeepers_synced(&self, compute_state: &ComputeState) -> Result<Option<Lsn>> {
let start_time = Utc::now();
// Run actual work with new tokio runtime
let rt = tokio::runtime::Builder::new_current_thread()
.enable_all()
.build()
.expect("failed to create rt");
let rt = tokio::runtime::Handle::current();
let result = rt.block_on(self.check_safekeepers_synced_async(compute_state));
// Record runtime
@@ -597,9 +596,9 @@ impl ComputeNode {
SYNC_SAFEKEEPERS_PID.store(0, Ordering::SeqCst);
// Process has exited, so we can join the logs thread.
let _ = logs_handle
.join()
.map_err(|e| tracing::error!("log thread panicked: {:?}", e));
let _ = tokio::runtime::Handle::current()
.block_on(logs_handle)
.map_err(|e| tracing::error!("log task panicked: {:?}", e));
if !sync_output.status.success() {
anyhow::bail!(
@@ -634,7 +633,7 @@ impl ComputeNode {
config::write_postgres_conf(
&pgdata_path.join("postgresql.conf"),
&pspec.spec,
self.http_port,
self.internal_http_port,
)?;
// Syncing safekeepers is only safe with primary nodes: if a primary
@@ -784,7 +783,7 @@ impl ComputeNode {
pub fn start_postgres(
&self,
storage_auth_token: Option<String>,
) -> Result<(std::process::Child, std::thread::JoinHandle<()>)> {
) -> Result<(std::process::Child, tokio::task::JoinHandle<Result<()>>)> {
let pgdata_path = Path::new(&self.pgdata);
// Run postgres as a child process.
@@ -800,7 +799,7 @@ impl ComputeNode {
.expect("cannot start postgres process");
PG_PID.store(pg.id(), Ordering::SeqCst);
// Start a thread to collect logs from stderr.
// Start a task to collect logs from stderr.
let stderr = pg.stderr.take().expect("stderr should be captured");
let logs_handle = handle_postgres_logs(stderr);
@@ -809,20 +808,28 @@ impl ComputeNode {
Ok((pg, logs_handle))
}
/// Do post configuration of the already started Postgres. This function spawns a background thread to
/// Do post configuration of the already started Postgres. This function spawns a background task to
/// configure the database after applying the compute spec. Currently, it upgrades the neon extension
/// version. In the future, it may upgrade all 3rd-party extensions.
#[instrument(skip_all)]
pub fn post_apply_config(&self) -> Result<()> {
let conf = self.get_conn_conf(Some("compute_ctl:post_apply_config"));
thread::spawn(move || {
let func = || {
let mut client = conf.connect(NoTls)?;
let conf = self.get_tokio_conn_conf(Some("compute_ctl:post_apply_config"));
tokio::spawn(async move {
let res = async {
let (mut client, connection) = conf.connect(NoTls).await?;
tokio::spawn(async move {
if let Err(e) = connection.await {
eprintln!("connection error: {}", e);
}
});
handle_neon_extension_upgrade(&mut client)
.await
.context("handle_neon_extension_upgrade")?;
Ok::<_, anyhow::Error>(())
};
if let Err(err) = func() {
}
.await;
if let Err(err) = res {
error!("error while post_apply_config: {err:#}");
}
});
@@ -919,13 +926,10 @@ impl ComputeNode {
conf: Arc<tokio_postgres::Config>,
concurrency: usize,
) -> Result<()> {
let rt = tokio::runtime::Builder::new_multi_thread()
.enable_all()
.build()?;
info!("Applying config with max {} concurrency", concurrency);
debug!("Config: {:?}", spec);
let rt = tokio::runtime::Handle::current();
rt.block_on(async {
// Proceed with post-startup configuration. Note, that order of operations is important.
let client = Self::get_maintenance_client(&conf).await?;
@@ -1319,14 +1323,18 @@ impl ComputeNode {
}
// Run migrations separately to not hold up cold starts
thread::spawn(move || {
let conf = conf.as_ref().clone();
let mut conf = postgres::config::Config::from(conf);
tokio::spawn(async move {
let mut conf = conf.as_ref().clone();
conf.application_name("compute_ctl:migrations");
match conf.connect(NoTls) {
Ok(mut client) => {
if let Err(e) = handle_migrations(&mut client) {
match conf.connect(NoTls).await {
Ok((mut client, connection)) => {
tokio::spawn(async move {
if let Err(e) = connection.await {
eprintln!("connection error: {}", e);
}
});
if let Err(e) = handle_migrations(&mut client).await {
error!("Failed to run migrations: {}", e);
}
}
@@ -1363,16 +1371,11 @@ impl ComputeNode {
if let Some(ref pgbouncer_settings) = spec.pgbouncer_settings {
info!("tuning pgbouncer");
let rt = tokio::runtime::Builder::new_current_thread()
.enable_all()
.build()
.expect("failed to create rt");
// Spawn a thread to do the tuning,
// Spawn a background task to do the tuning,
// so that we don't block the main thread that starts Postgres.
let pgbouncer_settings = pgbouncer_settings.clone();
let _handle = thread::spawn(move || {
let res = rt.block_on(tune_pgbouncer(pgbouncer_settings));
tokio::spawn(async move {
let res = tune_pgbouncer(pgbouncer_settings).await;
if let Err(err) = res {
error!("error while tuning pgbouncer: {err:?}");
}
@@ -1382,20 +1385,20 @@ impl ComputeNode {
if let Some(ref local_proxy) = spec.local_proxy_config {
info!("configuring local_proxy");
// Spawn a thread to do the configuration,
// Spawn a background task to do the configuration,
// so that we don't block the main thread that starts Postgres.
let local_proxy = local_proxy.clone();
let _handle = Some(thread::spawn(move || {
tokio::spawn(async move {
if let Err(err) = local_proxy::configure(&local_proxy) {
error!("error while configuring local_proxy: {err:?}");
}
}));
});
}
// Write new config
let pgdata_path = Path::new(&self.pgdata);
let postgresql_conf_path = pgdata_path.join("postgresql.conf");
config::write_postgres_conf(&postgresql_conf_path, &spec, self.http_port)?;
config::write_postgres_conf(&postgresql_conf_path, &spec, self.internal_http_port)?;
let max_concurrent_connections = spec.reconfigure_concurrency;
@@ -1431,7 +1434,9 @@ impl ComputeNode {
}
#[instrument(skip_all)]
pub fn start_compute(&self) -> Result<(std::process::Child, std::thread::JoinHandle<()>)> {
pub fn start_compute(
&self,
) -> Result<(std::process::Child, tokio::task::JoinHandle<Result<()>>)> {
let compute_state = self.state.lock().unwrap().clone();
let pspec = compute_state.pspec.as_ref().expect("spec must be set");
info!(
@@ -1446,16 +1451,11 @@ impl ComputeNode {
if let Some(pgbouncer_settings) = &pspec.spec.pgbouncer_settings {
info!("tuning pgbouncer");
let rt = tokio::runtime::Builder::new_current_thread()
.enable_all()
.build()
.expect("failed to create rt");
// Spawn a thread to do the tuning,
// Spawn a background task to do the tuning,
// so that we don't block the main thread that starts Postgres.
let pgbouncer_settings = pgbouncer_settings.clone();
let _handle = thread::spawn(move || {
let res = rt.block_on(tune_pgbouncer(pgbouncer_settings));
let _handle = tokio::spawn(async move {
let res = tune_pgbouncer(pgbouncer_settings).await;
if let Err(err) = res {
error!("error while tuning pgbouncer: {err:?}");
}
@@ -1465,10 +1465,10 @@ impl ComputeNode {
if let Some(local_proxy) = &pspec.spec.local_proxy_config {
info!("configuring local_proxy");
// Spawn a thread to do the configuration,
// Spawn a background task to do the configuration,
// so that we don't block the main thread that starts Postgres.
let local_proxy = local_proxy.clone();
let _handle = thread::spawn(move || {
let _handle = tokio::spawn(async move {
if let Err(err) = local_proxy::configure(&local_proxy) {
error!("error while configuring local_proxy: {err:?}");
}
@@ -1487,7 +1487,8 @@ impl ComputeNode {
extension_server::create_control_files(remote_extensions, &self.pgbin);
let library_load_start_time = Utc::now();
let remote_ext_metrics = self.prepare_preload_libraries(&pspec.spec)?;
let rt = tokio::runtime::Handle::current();
let remote_ext_metrics = rt.block_on(self.prepare_preload_libraries(&pspec.spec))?;
let library_load_time = Utc::now()
.signed_duration_since(library_load_start_time)
@@ -1542,7 +1543,7 @@ impl ComputeNode {
self.post_apply_config()?;
let conf = self.get_conn_conf(None);
thread::spawn(move || {
tokio::task::spawn_blocking(|| {
let res = get_installed_extensions(conf);
match res {
Ok(extensions) => {
@@ -1891,7 +1892,6 @@ LIMIT 100",
Ok(ext_version)
}
#[tokio::main]
pub async fn prepare_preload_libraries(
&self,
spec: &ComputeSpec,

View File

@@ -51,9 +51,12 @@ fn configurator_main_loop(compute: &Arc<ComputeNode>) {
pub fn launch_configurator(compute: &Arc<ComputeNode>) -> thread::JoinHandle<()> {
let compute = Arc::clone(compute);
let runtime = tokio::runtime::Handle::current();
thread::Builder::new()
.name("compute-configurator".into())
.spawn(move || {
let _rt_guard = runtime.enter();
configurator_main_loop(&compute);
info!("configurator thread is exited");
})

View File

@@ -85,6 +85,8 @@ use tracing::info;
use tracing::log::warn;
use zstd::stream::read::Decoder;
use crate::metrics::{REMOTE_EXT_REQUESTS_TOTAL, UNKNOWN_HTTP_STATUS};
fn get_pg_config(argument: &str, pgbin: &str) -> String {
// gives the result of `pg_config [argument]`
// where argument is a flag like `--version` or `--sharedir`
@@ -256,23 +258,60 @@ pub fn create_control_files(remote_extensions: &RemoteExtSpec, pgbin: &str) {
async fn download_extension_tar(ext_remote_storage: &str, ext_path: &str) -> Result<Bytes> {
let uri = format!("{}/{}", ext_remote_storage, ext_path);
info!("Download extension {:?} from uri {:?}", ext_path, uri);
info!("Download extension {} from uri {}", ext_path, uri);
let resp = reqwest::get(uri).await?;
match do_extension_server_request(&uri).await {
Ok(resp) => {
info!("Successfully downloaded remote extension data {}", ext_path);
REMOTE_EXT_REQUESTS_TOTAL
.with_label_values(&[&StatusCode::OK.to_string()])
.inc();
Ok(resp)
}
Err((msg, status)) => {
REMOTE_EXT_REQUESTS_TOTAL
.with_label_values(&[&status])
.inc();
bail!(msg);
}
}
}
match resp.status() {
// Do a single remote extensions server request.
// Return result or (error message + stringified status code) in case of any failures.
async fn do_extension_server_request(uri: &str) -> Result<Bytes, (String, String)> {
let resp = reqwest::get(uri).await.map_err(|e| {
(
format!(
"could not perform remote extensions server request: {:?}",
e
),
UNKNOWN_HTTP_STATUS.to_string(),
)
})?;
let status = resp.status();
match status {
StatusCode::OK => match resp.bytes().await {
Ok(resp) => {
info!("Download extension {:?} completed successfully", ext_path);
Ok(resp)
}
Err(e) => bail!("could not deserialize remote extension response: {}", e),
Ok(resp) => Ok(resp),
Err(e) => Err((
format!("could not read remote extensions server response: {:?}", e),
// It's fine to return and report error with status as 200 OK,
// because we still failed to read the response.
status.to_string(),
)),
},
StatusCode::SERVICE_UNAVAILABLE => bail!("remote extension is temporarily unavailable"),
_ => bail!(
"unexpected remote extension response status code: {}",
resp.status()
),
StatusCode::SERVICE_UNAVAILABLE => Err((
"remote extensions server is temporarily unavailable".to_string(),
status.to_string(),
)),
_ => Err((
format!(
"unexpected remote extensions server response status code: {}",
status
),
status.to_string(),
)),
}
}

View File

@@ -1,9 +1,6 @@
use std::ops::{Deref, DerefMut};
use axum::{
async_trait,
extract::{rejection::JsonRejection, FromRequest, Request},
};
use axum::extract::{rejection::JsonRejection, FromRequest, Request};
use compute_api::responses::GenericAPIError;
use http::StatusCode;
@@ -12,7 +9,6 @@ use http::StatusCode;
#[derive(Debug, Clone, Copy, Default)]
pub(crate) struct Json<T>(pub T);
#[async_trait]
impl<S, T> FromRequest<S> for Json<T>
where
axum::Json<T>: FromRequest<S, Rejection = JsonRejection>,

View File

@@ -1,9 +1,6 @@
use std::ops::{Deref, DerefMut};
use axum::{
async_trait,
extract::{rejection::PathRejection, FromRequestParts},
};
use axum::extract::{rejection::PathRejection, FromRequestParts};
use compute_api::responses::GenericAPIError;
use http::{request::Parts, StatusCode};
@@ -12,7 +9,6 @@ use http::{request::Parts, StatusCode};
#[derive(Debug, Clone, Copy, Default)]
pub(crate) struct Path<T>(pub T);
#[async_trait]
impl<S, T> FromRequestParts<S> for Path<T>
where
axum::extract::Path<T>: FromRequestParts<S, Rejection = PathRejection>,

View File

@@ -1,9 +1,6 @@
use std::ops::{Deref, DerefMut};
use axum::{
async_trait,
extract::{rejection::QueryRejection, FromRequestParts},
};
use axum::extract::{rejection::QueryRejection, FromRequestParts};
use compute_api::responses::GenericAPIError;
use http::{request::Parts, StatusCode};
@@ -12,7 +9,6 @@ use http::{request::Parts, StatusCode};
#[derive(Debug, Clone, Copy, Default)]
pub(crate) struct Query<T>(pub T);
#[async_trait]
impl<S, T> FromRequestParts<S> for Query<T>
where
axum::extract::Query<T>: FromRequestParts<S, Rejection = QueryRejection>,

View File

@@ -4,11 +4,9 @@ use http::{header::CONTENT_TYPE, StatusCode};
use serde::Serialize;
use tracing::error;
pub use server::launch_http_server;
mod extract;
mod routes;
mod server;
pub mod server;
/// Convenience response builder for JSON responses
struct JsonResponse;

View File

@@ -68,35 +68,6 @@ paths:
schema:
$ref: "#/components/schemas/ComputeInsights"
/installed_extensions:
get:
tags:
- Info
summary: Get installed extensions.
description: ""
operationId: getInstalledExtensions
responses:
200:
description: List of installed extensions
content:
application/json:
schema:
$ref: "#/components/schemas/InstalledExtensions"
/info:
get:
tags:
- Info
summary: Get info about the compute pod / VM.
description: ""
operationId: getInfo
responses:
200:
description: Info
content:
application/json:
schema:
$ref: "#/components/schemas/Info"
/dbs_and_roles:
get:
tags:

View File

@@ -1,7 +1,21 @@
use axum::response::{IntoResponse, Response};
use http::StatusCode;
use serde::{Deserialize, Serialize};
use tracing::info;
use utils::failpoint_support::{apply_failpoint, ConfigureFailpointsRequest};
use utils::failpoint_support::apply_failpoint;
pub type ConfigureFailpointsRequest = Vec<FailpointConfig>;
/// Information for configuring a single fail point
#[derive(Debug, Serialize, Deserialize)]
pub struct FailpointConfig {
/// Name of the fail point
pub name: String,
/// List of actions to take, using the format described in `fail::cfg`
///
/// We also support `actions = "exit"` to cause the fail point to immediately exit.
pub actions: String,
}
use crate::http::{extract::Json, JsonResponse};

View File

@@ -1,11 +0,0 @@
use axum::response::Response;
use compute_api::responses::InfoResponse;
use http::StatusCode;
use crate::http::JsonResponse;
/// Get information about the physical characteristics about the compute.
pub(in crate::http) async fn get_info() -> Response {
let num_cpus = num_cpus::get_physical();
JsonResponse::success(StatusCode::OK, &InfoResponse { num_cpus })
}

View File

@@ -1,33 +0,0 @@
use std::sync::Arc;
use axum::{extract::State, response::Response};
use compute_api::responses::ComputeStatus;
use http::StatusCode;
use tokio::task;
use crate::{compute::ComputeNode, http::JsonResponse, installed_extensions};
/// Get a list of installed extensions.
pub(in crate::http) async fn get_installed_extensions(
State(compute): State<Arc<ComputeNode>>,
) -> Response {
let status = compute.get_status();
if status != ComputeStatus::Running {
return JsonResponse::invalid_status(status);
}
let conf = compute.get_conn_conf(None);
let res = task::spawn_blocking(move || installed_extensions::get_installed_extensions(conf))
.await
.unwrap();
match res {
Ok(installed_extensions) => {
JsonResponse::success(StatusCode::OK, Some(installed_extensions))
}
Err(e) => JsonResponse::error(
StatusCode::INTERNAL_SERVER_ERROR,
format!("failed to get list of installed extensions: {e}"),
),
}
}

View File

@@ -2,17 +2,16 @@ use axum::{body::Body, response::Response};
use http::header::CONTENT_TYPE;
use http::StatusCode;
use metrics::proto::MetricFamily;
use metrics::Encoder;
use metrics::TextEncoder;
use metrics::{Encoder, TextEncoder};
use crate::{http::JsonResponse, installed_extensions};
use crate::{http::JsonResponse, metrics::collect};
/// Expose Prometheus metrics.
pub(in crate::http) async fn get_metrics() -> Response {
// When we call TextEncoder::encode() below, it will immediately return an
// error if a metric family has no metrics, so we need to preemptively
// filter out metric families with no metrics.
let metrics = installed_extensions::collect()
let metrics = collect()
.into_iter()
.filter(|m| !m.get_metric().is_empty())
.collect::<Vec<MetricFamily>>();

View File

@@ -10,9 +10,7 @@ pub(in crate::http) mod extension_server;
pub(in crate::http) mod extensions;
pub(in crate::http) mod failpoints;
pub(in crate::http) mod grants;
pub(in crate::http) mod info;
pub(in crate::http) mod insights;
pub(in crate::http) mod installed_extensions;
pub(in crate::http) mod metrics;
pub(in crate::http) mod metrics_json;
pub(in crate::http) mod status;

View File

@@ -1,7 +1,7 @@
use std::{
fmt::Display,
net::{IpAddr, Ipv6Addr, SocketAddr},
sync::Arc,
thread,
time::Duration,
};
@@ -22,56 +22,69 @@ use uuid::Uuid;
use super::routes::{
check_writability, configure, database_schema, dbs_and_roles, extension_server, extensions,
grants, info as info_route, insights, installed_extensions, metrics, metrics_json, status,
terminate,
grants, insights, metrics, metrics_json, status, terminate,
};
use crate::compute::ComputeNode;
async fn handle_404() -> Response {
StatusCode::NOT_FOUND.into_response()
}
const X_REQUEST_ID: &str = "x-request-id";
/// This middleware function allows compute_ctl to generate its own request ID
/// if one isn't supplied. The control plane will always send one as a UUID. The
/// neon Postgres extension on the other hand does not send one.
async fn maybe_add_request_id_header(mut request: Request, next: Next) -> Response {
let headers = request.headers_mut();
if headers.get(X_REQUEST_ID).is_none() {
headers.append(X_REQUEST_ID, Uuid::new_v4().to_string().parse().unwrap());
}
next.run(request).await
/// `compute_ctl` has two servers: internal and external. The internal server
/// binds to the loopback interface and handles communication from clients on
/// the compute. The external server is what receives communication from the
/// control plane, the metrics scraper, etc. We make the distinction because
/// certain routes in `compute_ctl` only need to be exposed to local processes
/// like Postgres via the neon extension and local_proxy.
#[derive(Clone, Copy, Debug)]
pub enum Server {
Internal(u16),
External(u16),
}
/// Run the HTTP server and wait on it forever.
#[tokio::main]
async fn serve(port: u16, compute: Arc<ComputeNode>) {
let mut app = Router::new()
.route("/check_writability", post(check_writability::is_writable))
.route("/configure", post(configure::configure))
.route("/database_schema", get(database_schema::get_schema_dump))
.route("/dbs_and_roles", get(dbs_and_roles::get_catalog_objects))
.route(
"/extension_server/*filename",
post(extension_server::download_extension),
)
.route("/extensions", post(extensions::install_extension))
.route("/grants", post(grants::add_grant))
.route("/info", get(info_route::get_info))
.route("/insights", get(insights::get_insights))
.route(
"/installed_extensions",
get(installed_extensions::get_installed_extensions),
)
.route("/metrics", get(metrics::get_metrics))
.route("/metrics.json", get(metrics_json::get_metrics))
.route("/status", get(status::get_status))
.route("/terminate", post(terminate::terminate))
.fallback(handle_404)
.layer(
impl Display for Server {
fn fmt(&self, f: &mut std::fmt::Formatter<'_>) -> std::fmt::Result {
match self {
Server::Internal(_) => f.write_str("internal"),
Server::External(_) => f.write_str("external"),
}
}
}
impl From<Server> for Router<Arc<ComputeNode>> {
fn from(server: Server) -> Self {
let mut router = Router::<Arc<ComputeNode>>::new();
router = match server {
Server::Internal(_) => {
router = router
.route(
"/extension_server/{*filename}",
post(extension_server::download_extension),
)
.route("/extensions", post(extensions::install_extension))
.route("/grants", post(grants::add_grant));
// Add in any testing support
if cfg!(feature = "testing") {
use super::routes::failpoints;
router = router.route("/failpoints", post(failpoints::configure_failpoints));
}
router
}
Server::External(_) => router
.route("/check_writability", post(check_writability::is_writable))
.route("/configure", post(configure::configure))
.route("/database_schema", get(database_schema::get_schema_dump))
.route("/dbs_and_roles", get(dbs_and_roles::get_catalog_objects))
.route("/insights", get(insights::get_insights))
.route("/metrics", get(metrics::get_metrics))
.route("/metrics.json", get(metrics_json::get_metrics))
.route("/status", get(status::get_status))
.route("/terminate", post(terminate::terminate)),
};
router.fallback(Server::handle_404).method_not_allowed_fallback(Server::handle_405).layer(
ServiceBuilder::new()
// Add this middleware since we assume the request ID exists
.layer(middleware::from_fn(maybe_add_request_id_header))
@@ -111,45 +124,88 @@ async fn serve(port: u16, compute: Arc<ComputeNode>) {
)
.layer(PropagateRequestIdLayer::x_request_id()),
)
.with_state(compute);
}
}
// Add in any testing support
if cfg!(feature = "testing") {
use super::routes::failpoints;
app = app.route("/failpoints", post(failpoints::configure_failpoints))
impl Server {
async fn handle_404() -> impl IntoResponse {
StatusCode::NOT_FOUND
}
// This usually binds to both IPv4 and IPv6 on Linux, see
// https://github.com/rust-lang/rust/pull/34440 for more information
let addr = SocketAddr::new(IpAddr::from(Ipv6Addr::UNSPECIFIED), port);
let listener = match TcpListener::bind(&addr).await {
Ok(listener) => listener,
Err(e) => {
error!(
"failed to bind the compute_ctl HTTP server to port {}: {}",
port, e
);
return;
async fn handle_405() -> impl IntoResponse {
StatusCode::METHOD_NOT_ALLOWED
}
async fn listener(&self) -> Result<TcpListener> {
let addr = SocketAddr::new(self.ip(), self.port());
let listener = TcpListener::bind(&addr).await?;
Ok(listener)
}
fn ip(&self) -> IpAddr {
match self {
// TODO: Change this to Ipv6Addr::LOCALHOST when the GitHub runners
// allow binding to localhost
Server::Internal(_) => IpAddr::from(Ipv6Addr::UNSPECIFIED),
Server::External(_) => IpAddr::from(Ipv6Addr::UNSPECIFIED),
}
};
if let Ok(local_addr) = listener.local_addr() {
info!("compute_ctl HTTP server listening on {}", local_addr);
} else {
info!("compute_ctl HTTP server listening on port {}", port);
}
if let Err(e) = axum::serve(listener, app).await {
error!("compute_ctl HTTP server error: {}", e);
fn port(self) -> u16 {
match self {
Server::Internal(port) => port,
Server::External(port) => port,
}
}
async fn serve(self, compute: Arc<ComputeNode>) {
let listener = self.listener().await.unwrap_or_else(|e| {
// If we can't bind, the compute cannot operate correctly
panic!(
"failed to bind the compute_ctl {} HTTP server to {}: {}",
self,
SocketAddr::new(self.ip(), self.port()),
e
);
});
if tracing::enabled!(tracing::Level::INFO) {
let local_addr = match listener.local_addr() {
Ok(local_addr) => local_addr,
Err(_) => SocketAddr::new(self.ip(), self.port()),
};
info!(
"compute_ctl {} HTTP server listening at {}",
self, local_addr
);
}
let router = Router::from(self).with_state(compute);
if let Err(e) = axum::serve(listener, router).await {
error!("compute_ctl {} HTTP server error: {}", self, e);
}
}
pub fn launch(self, compute: &Arc<ComputeNode>) {
let state = Arc::clone(compute);
info!("Launching the {} server", self);
tokio::spawn(self.serve(state));
}
}
/// Launch a separate HTTP server thread and return its `JoinHandle`.
pub fn launch_http_server(port: u16, state: &Arc<ComputeNode>) -> Result<thread::JoinHandle<()>> {
let state = Arc::clone(state);
/// This middleware function allows compute_ctl to generate its own request ID
/// if one isn't supplied. The control plane will always send one as a UUID. The
/// neon Postgres extension on the other hand does not send one.
async fn maybe_add_request_id_header(mut request: Request, next: Next) -> Response {
let headers = request.headers_mut();
if headers.get(X_REQUEST_ID).is_none() {
headers.append(X_REQUEST_ID, Uuid::new_v4().to_string().parse().unwrap());
}
Ok(thread::Builder::new()
.name("http-server".into())
.spawn(move || serve(port, state))?)
next.run(request).await
}

View File

@@ -1,13 +1,10 @@
use compute_api::responses::{InstalledExtension, InstalledExtensions};
use metrics::proto::MetricFamily;
use std::collections::HashMap;
use anyhow::Result;
use postgres::{Client, NoTls};
use metrics::core::Collector;
use metrics::{register_uint_gauge_vec, UIntGaugeVec};
use once_cell::sync::Lazy;
use crate::metrics::INSTALLED_EXTENSIONS;
/// We don't reuse get_existing_dbs() just for code clarity
/// and to make database listing query here more explicit.
@@ -102,16 +99,3 @@ pub fn get_installed_extensions(mut conf: postgres::config::Config) -> Result<In
extensions: extensions_map.into_values().collect(),
})
}
static INSTALLED_EXTENSIONS: Lazy<UIntGaugeVec> = Lazy::new(|| {
register_uint_gauge_vec!(
"compute_installed_extensions",
"Number of databases where the version of extension is installed",
&["extension_name", "version", "owned_by_superuser"]
)
.expect("failed to define a metric")
});
pub fn collect() -> Vec<MetricFamily> {
INSTALLED_EXTENSIONS.collect()
}

View File

@@ -16,6 +16,7 @@ pub mod extension_server;
pub mod installed_extensions;
pub mod local_proxy;
pub mod lsn_lease;
pub mod metrics;
mod migration;
pub mod monitor;
pub mod params;

View File

@@ -11,7 +11,7 @@ use tracing_subscriber::prelude::*;
/// set `OTEL_EXPORTER_OTLP_ENDPOINT=http://jaeger:4318`. See
/// `tracing-utils` package description.
///
pub fn init_tracing_and_logging(default_log_level: &str) -> anyhow::Result<()> {
pub async fn init_tracing_and_logging(default_log_level: &str) -> anyhow::Result<()> {
// Initialize Logging
let env_filter = tracing_subscriber::EnvFilter::try_from_default_env()
.unwrap_or_else(|_| tracing_subscriber::EnvFilter::new(default_log_level));
@@ -22,7 +22,7 @@ pub fn init_tracing_and_logging(default_log_level: &str) -> anyhow::Result<()> {
.with_writer(std::io::stderr);
// Initialize OpenTelemetry
let otlp_layer = tracing_utils::init_tracing_without_runtime("compute_ctl");
let otlp_layer = tracing_utils::init_tracing("compute_ctl").await;
// Put it all together
tracing_subscriber::registry()

View File

@@ -0,0 +1,70 @@
use metrics::core::Collector;
use metrics::proto::MetricFamily;
use metrics::{register_int_counter_vec, register_uint_gauge_vec, IntCounterVec, UIntGaugeVec};
use once_cell::sync::Lazy;
pub(crate) static INSTALLED_EXTENSIONS: Lazy<UIntGaugeVec> = Lazy::new(|| {
register_uint_gauge_vec!(
"compute_installed_extensions",
"Number of databases where the version of extension is installed",
&["extension_name", "version", "owned_by_superuser"]
)
.expect("failed to define a metric")
});
// Normally, any HTTP API request is described by METHOD (e.g. GET, POST, etc.) + PATH,
// but for all our APIs we defined a 'slug'/method/operationId in the OpenAPI spec.
// And it's fair to call it a 'RPC' (Remote Procedure Call).
pub enum CPlaneRequestRPC {
GetSpec,
}
impl CPlaneRequestRPC {
pub fn as_str(&self) -> &str {
match self {
CPlaneRequestRPC::GetSpec => "GetSpec",
}
}
}
pub const UNKNOWN_HTTP_STATUS: &str = "unknown";
pub(crate) static CPLANE_REQUESTS_TOTAL: Lazy<IntCounterVec> = Lazy::new(|| {
register_int_counter_vec!(
"compute_ctl_cplane_requests_total",
"Total number of control plane requests made by compute_ctl by status",
&["rpc", "http_status"]
)
.expect("failed to define a metric")
});
/// Total number of failed database migrations. Per-compute, this is actually a boolean metric,
/// either empty or with a single value (1, migration_id) because we stop at the first failure.
/// Yet, the sum over the fleet will provide the total number of failures.
pub(crate) static DB_MIGRATION_FAILED: Lazy<IntCounterVec> = Lazy::new(|| {
register_int_counter_vec!(
"compute_ctl_db_migration_failed_total",
"Total number of failed database migrations",
&["migration_id"]
)
.expect("failed to define a metric")
});
pub(crate) static REMOTE_EXT_REQUESTS_TOTAL: Lazy<IntCounterVec> = Lazy::new(|| {
register_int_counter_vec!(
"compute_ctl_remote_ext_requests_total",
"Total number of requests made by compute_ctl to download extensions from S3 proxy by status",
// Do not use any labels like extension name yet.
// We can add them later if needed.
&["http_status"]
)
.expect("failed to define a metric")
});
pub fn collect() -> Vec<MetricFamily> {
let mut metrics = INSTALLED_EXTENSIONS.collect();
metrics.extend(CPLANE_REQUESTS_TOTAL.collect());
metrics.extend(REMOTE_EXT_REQUESTS_TOTAL.collect());
metrics.extend(DB_MIGRATION_FAILED.collect());
metrics
}

View File

@@ -1,7 +1,9 @@
use anyhow::{Context, Result};
use fail::fail_point;
use postgres::{Client, Transaction};
use tracing::info;
use tokio_postgres::{Client, Transaction};
use tracing::{error, info};
use crate::metrics::DB_MIGRATION_FAILED;
/// Runs a series of migrations on a target database
pub(crate) struct MigrationRunner<'m> {
@@ -19,10 +21,11 @@ impl<'m> MigrationRunner<'m> {
}
/// Get the current value neon_migration.migration_id
fn get_migration_id(&mut self) -> Result<i64> {
async fn get_migration_id(&mut self) -> Result<i64> {
let row = self
.client
.query_one("SELECT id FROM neon_migration.migration_id", &[])?;
.query_one("SELECT id FROM neon_migration.migration_id", &[])
.await?;
Ok(row.get::<&str, i64>("id"))
}
@@ -32,7 +35,7 @@ impl<'m> MigrationRunner<'m> {
/// This function has a fail point called compute-migration, which can be
/// used if you would like to fail the application of a series of migrations
/// at some point.
fn update_migration_id(txn: &mut Transaction, migration_id: i64) -> Result<()> {
async fn update_migration_id(txn: &mut Transaction<'_>, migration_id: i64) -> Result<()> {
// We use this fail point in order to check that failing in the
// middle of applying a series of migrations fails in an expected
// manner
@@ -57,71 +60,89 @@ impl<'m> MigrationRunner<'m> {
"UPDATE neon_migration.migration_id SET id = $1",
&[&migration_id],
)
.await
.with_context(|| format!("update neon_migration.migration_id to {migration_id}"))?;
Ok(())
}
/// Prepare the migrations the target database for handling migrations
fn prepare_database(&mut self) -> Result<()> {
async fn prepare_database(&mut self) -> Result<()> {
self.client
.simple_query("CREATE SCHEMA IF NOT EXISTS neon_migration")?;
self.client.simple_query("CREATE TABLE IF NOT EXISTS neon_migration.migration_id (key INT NOT NULL PRIMARY KEY, id bigint NOT NULL DEFAULT 0)")?;
self.client.simple_query(
"INSERT INTO neon_migration.migration_id VALUES (0, 0) ON CONFLICT DO NOTHING",
)?;
.simple_query("CREATE SCHEMA IF NOT EXISTS neon_migration")
.await?;
self.client.simple_query("CREATE TABLE IF NOT EXISTS neon_migration.migration_id (key INT NOT NULL PRIMARY KEY, id bigint NOT NULL DEFAULT 0)").await?;
self.client
.simple_query("ALTER SCHEMA neon_migration OWNER TO cloud_admin")?;
.simple_query(
"INSERT INTO neon_migration.migration_id VALUES (0, 0) ON CONFLICT DO NOTHING",
)
.await?;
self.client
.simple_query("REVOKE ALL ON SCHEMA neon_migration FROM PUBLIC")?;
.simple_query("ALTER SCHEMA neon_migration OWNER TO cloud_admin")
.await?;
self.client
.simple_query("REVOKE ALL ON SCHEMA neon_migration FROM PUBLIC")
.await?;
Ok(())
}
/// Run an individual migration
fn run_migration(txn: &mut Transaction, migration_id: i64, migration: &str) -> Result<()> {
/// Run an individual migration in a separate transaction block.
async fn run_migration(client: &mut Client, migration_id: i64, migration: &str) -> Result<()> {
let mut txn = client
.transaction()
.await
.with_context(|| format!("begin transaction for migration {migration_id}"))?;
if migration.starts_with("-- SKIP") {
info!("Skipping migration id={}", migration_id);
// Even though we are skipping the migration, updating the
// migration ID should help keep logic easy to understand when
// trying to understand the state of a cluster.
Self::update_migration_id(txn, migration_id)?;
Self::update_migration_id(&mut txn, migration_id).await?;
} else {
info!("Running migration id={}:\n{}\n", migration_id, migration);
txn.simple_query(migration)
.await
.with_context(|| format!("apply migration {migration_id}"))?;
Self::update_migration_id(txn, migration_id)?;
Self::update_migration_id(&mut txn, migration_id).await?;
}
txn.commit()
.await
.with_context(|| format!("commit transaction for migration {migration_id}"))?;
Ok(())
}
/// Run the configured set of migrations
pub fn run_migrations(mut self) -> Result<()> {
pub async fn run_migrations(mut self) -> Result<()> {
self.prepare_database()
.await
.context("prepare database to handle migrations")?;
let mut current_migration = self.get_migration_id()? as usize;
let mut current_migration = self.get_migration_id().await? as usize;
while current_migration < self.migrations.len() {
// The index lags the migration ID by 1, so the current migration
// ID is also the next index
let migration_id = (current_migration + 1) as i64;
let migration = self.migrations[current_migration];
let mut txn = self
.client
.transaction()
.with_context(|| format!("begin transaction for migration {migration_id}"))?;
Self::run_migration(&mut txn, migration_id, self.migrations[current_migration])
.with_context(|| format!("running migration {migration_id}"))?;
txn.commit()
.with_context(|| format!("commit transaction for migration {migration_id}"))?;
info!("Finished migration id={}", migration_id);
match Self::run_migration(self.client, migration_id, migration).await {
Ok(_) => {
info!("Finished migration id={}", migration_id);
}
Err(e) => {
error!("Failed to run migration id={}: {:?}", migration_id, e);
DB_MIGRATION_FAILED
.with_label_values(&[migration_id.to_string().as_str()])
.inc();
return Err(e);
}
}
current_migration += 1;
}

View File

@@ -7,7 +7,6 @@ use std::os::unix::fs::PermissionsExt;
use std::path::Path;
use std::process::Child;
use std::str::FromStr;
use std::thread::JoinHandle;
use std::time::{Duration, Instant};
use anyhow::{bail, Result};
@@ -16,6 +15,7 @@ use ini::Ini;
use notify::{RecursiveMode, Watcher};
use postgres::config::Config;
use tokio::io::AsyncBufReadExt;
use tokio::task::JoinHandle;
use tokio::time::timeout;
use tokio_postgres;
use tokio_postgres::NoTls;
@@ -477,23 +477,13 @@ pub async fn tune_pgbouncer(pgbouncer_config: HashMap<String, String>) -> Result
Ok(())
}
/// Spawn a thread that will read Postgres logs from `stderr`, join multiline logs
/// Spawn a task that will read Postgres logs from `stderr`, join multiline logs
/// and send them to the logger. In the future we may also want to add context to
/// these logs.
pub fn handle_postgres_logs(stderr: std::process::ChildStderr) -> JoinHandle<()> {
std::thread::spawn(move || {
let runtime = tokio::runtime::Builder::new_current_thread()
.enable_all()
.build()
.expect("failed to build tokio runtime");
let res = runtime.block_on(async move {
let stderr = tokio::process::ChildStderr::from_std(stderr)?;
handle_postgres_logs_async(stderr).await
});
if let Err(e) = res {
tracing::error!("error while processing postgres logs: {}", e);
}
pub fn handle_postgres_logs(stderr: std::process::ChildStderr) -> JoinHandle<Result<()>> {
tokio::spawn(async move {
let stderr = tokio::process::ChildStderr::from_std(stderr)?;
handle_postgres_logs_async(stderr).await
})
}

View File

@@ -1,11 +1,12 @@
use anyhow::{anyhow, bail, Result};
use postgres::Client;
use reqwest::StatusCode;
use std::fs::File;
use std::path::Path;
use tokio_postgres::Client;
use tracing::{error, info, instrument, warn};
use crate::config;
use crate::metrics::{CPlaneRequestRPC, CPLANE_REQUESTS_TOTAL, UNKNOWN_HTTP_STATUS};
use crate::migration::MigrationRunner;
use crate::params::PG_HBA_ALL_MD5;
use crate::pg_helpers::*;
@@ -19,7 +20,7 @@ use compute_api::spec::ComputeSpec;
fn do_control_plane_request(
uri: &str,
jwt: &str,
) -> Result<ControlPlaneSpecResponse, (bool, String)> {
) -> Result<ControlPlaneSpecResponse, (bool, String, String)> {
let resp = reqwest::blocking::Client::new()
.get(uri)
.header("Authorization", format!("Bearer {}", jwt))
@@ -27,35 +28,42 @@ fn do_control_plane_request(
.map_err(|e| {
(
true,
format!("could not perform spec request to control plane: {}", e),
format!("could not perform spec request to control plane: {:?}", e),
UNKNOWN_HTTP_STATUS.to_string(),
)
})?;
match resp.status() {
let status = resp.status();
match status {
StatusCode::OK => match resp.json::<ControlPlaneSpecResponse>() {
Ok(spec_resp) => Ok(spec_resp),
Err(e) => Err((
true,
format!("could not deserialize control plane response: {}", e),
format!("could not deserialize control plane response: {:?}", e),
status.to_string(),
)),
},
StatusCode::SERVICE_UNAVAILABLE => {
Err((true, "control plane is temporarily unavailable".to_string()))
}
StatusCode::SERVICE_UNAVAILABLE => Err((
true,
"control plane is temporarily unavailable".to_string(),
status.to_string(),
)),
StatusCode::BAD_GATEWAY => {
// We have a problem with intermittent 502 errors now
// https://github.com/neondatabase/cloud/issues/2353
// It's fine to retry GET request in this case.
Err((true, "control plane request failed with 502".to_string()))
Err((
true,
"control plane request failed with 502".to_string(),
status.to_string(),
))
}
// Another code, likely 500 or 404, means that compute is unknown to the control plane
// or some internal failure happened. Doesn't make much sense to retry in this case.
_ => Err((
false,
format!(
"unexpected control plane response status code: {}",
resp.status()
),
format!("unexpected control plane response status code: {}", status),
status.to_string(),
)),
}
}
@@ -83,17 +91,28 @@ pub fn get_spec_from_control_plane(
// - got spec -> return Ok(Some(spec))
while attempt < 4 {
spec = match do_control_plane_request(&cp_uri, &jwt) {
Ok(spec_resp) => match spec_resp.status {
ControlPlaneComputeStatus::Empty => Ok(None),
ControlPlaneComputeStatus::Attached => {
if let Some(spec) = spec_resp.spec {
Ok(Some(spec))
} else {
bail!("compute is attached, but spec is empty")
Ok(spec_resp) => {
CPLANE_REQUESTS_TOTAL
.with_label_values(&[
CPlaneRequestRPC::GetSpec.as_str(),
&StatusCode::OK.to_string(),
])
.inc();
match spec_resp.status {
ControlPlaneComputeStatus::Empty => Ok(None),
ControlPlaneComputeStatus::Attached => {
if let Some(spec) = spec_resp.spec {
Ok(Some(spec))
} else {
bail!("compute is attached, but spec is empty")
}
}
}
},
Err((retry, msg)) => {
}
Err((retry, msg, status)) => {
CPLANE_REQUESTS_TOTAL
.with_label_values(&[CPlaneRequestRPC::GetSpec.as_str(), &status])
.inc();
if retry {
Err(anyhow!(msg))
} else {
@@ -147,17 +166,17 @@ pub fn add_standby_signal(pgdata_path: &Path) -> Result<()> {
}
#[instrument(skip_all)]
pub fn handle_neon_extension_upgrade(client: &mut Client) -> Result<()> {
pub async fn handle_neon_extension_upgrade(client: &mut Client) -> Result<()> {
info!("handle neon extension upgrade");
let query = "ALTER EXTENSION neon UPDATE";
info!("update neon extension version with query: {}", query);
client.simple_query(query)?;
client.simple_query(query).await?;
Ok(())
}
#[instrument(skip_all)]
pub fn handle_migrations(client: &mut Client) -> Result<()> {
pub async fn handle_migrations(client: &mut Client) -> Result<()> {
info!("handle migrations");
// !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
@@ -187,7 +206,9 @@ pub fn handle_migrations(client: &mut Client) -> Result<()> {
),
];
MigrationRunner::new(client, &migrations).run_migrations()?;
MigrationRunner::new(client, &migrations)
.run_migrations()
.await?;
Ok(())
}
@@ -195,7 +216,7 @@ pub fn handle_migrations(client: &mut Client) -> Result<()> {
/// Connect to the database as superuser and pre-create anon extension
/// if it is present in shared_preload_libraries
#[instrument(skip_all)]
pub fn handle_extension_anon(
pub async fn handle_extension_anon(
spec: &ComputeSpec,
db_owner: &str,
db_client: &mut Client,
@@ -208,7 +229,7 @@ pub fn handle_extension_anon(
if !grants_only {
// check if extension is already initialized using anon.is_initialized()
let query = "SELECT anon.is_initialized()";
match db_client.query(query, &[]) {
match db_client.query(query, &[]).await {
Ok(rows) => {
if !rows.is_empty() {
let is_initialized: bool = rows[0].get(0);
@@ -230,7 +251,7 @@ pub fn handle_extension_anon(
// Users cannot create it themselves, because superuser is required.
let mut query = "CREATE EXTENSION IF NOT EXISTS anon CASCADE";
info!("creating anon extension with query: {}", query);
match db_client.query(query, &[]) {
match db_client.query(query, &[]).await {
Ok(_) => {}
Err(e) => {
error!("anon extension creation failed with error: {}", e);
@@ -240,7 +261,7 @@ pub fn handle_extension_anon(
// check that extension is installed
query = "SELECT extname FROM pg_extension WHERE extname = 'anon'";
let rows = db_client.query(query, &[])?;
let rows = db_client.query(query, &[]).await?;
if rows.is_empty() {
error!("anon extension is not installed");
return Ok(());
@@ -249,7 +270,7 @@ pub fn handle_extension_anon(
// Initialize anon extension
// This also requires superuser privileges, so users cannot do it themselves.
query = "SELECT anon.init()";
match db_client.query(query, &[]) {
match db_client.query(query, &[]).await {
Ok(_) => {}
Err(e) => {
error!("anon.init() failed with error: {}", e);
@@ -260,7 +281,7 @@ pub fn handle_extension_anon(
// check that extension is installed, if not bail early
let query = "SELECT extname FROM pg_extension WHERE extname = 'anon'";
match db_client.query(query, &[]) {
match db_client.query(query, &[]).await {
Ok(rows) => {
if rows.is_empty() {
error!("anon extension is not installed");
@@ -275,12 +296,12 @@ pub fn handle_extension_anon(
let query = format!("GRANT ALL ON SCHEMA anon TO {}", db_owner);
info!("granting anon extension permissions with query: {}", query);
db_client.simple_query(&query)?;
db_client.simple_query(&query).await?;
// Grant permissions to db_owner to use anon extension functions
let query = format!("GRANT ALL ON ALL FUNCTIONS IN SCHEMA anon TO {}", db_owner);
info!("granting anon extension permissions with query: {}", query);
db_client.simple_query(&query)?;
db_client.simple_query(&query).await?;
// This is needed, because some functions are defined as SECURITY DEFINER.
// In Postgres SECURITY DEFINER functions are executed with the privileges
@@ -295,16 +316,16 @@ pub fn handle_extension_anon(
where nsp.nspname = 'anon';", db_owner);
info!("change anon extension functions owner to db owner");
db_client.simple_query(&query)?;
db_client.simple_query(&query).await?;
// affects views as well
let query = format!("GRANT ALL ON ALL TABLES IN SCHEMA anon TO {}", db_owner);
info!("granting anon extension permissions with query: {}", query);
db_client.simple_query(&query)?;
db_client.simple_query(&query).await?;
let query = format!("GRANT ALL ON ALL SEQUENCES IN SCHEMA anon TO {}", db_owner);
info!("granting anon extension permissions with query: {}", query);
db_client.simple_query(&query)?;
db_client.simple_query(&query).await?;
}
}

View File

@@ -33,6 +33,7 @@ postgres_backend.workspace = true
safekeeper_api.workspace = true
postgres_connection.workspace = true
storage_broker.workspace = true
http-utils.workspace = true
utils.workspace = true
whoami.workspace = true

View File

@@ -1,6 +1,10 @@
# Control Plane and Neon Local
# Local Development Control Plane (`neon_local`)
This crate contains tools to start a Neon development environment locally. This utility can be used with the `cargo neon` command.
This crate contains tools to start a Neon development environment locally. This utility can be used with the `cargo neon` command. This is a convenience to invoke
the `neon_local` binary.
**Note**: this is a dev/test tool -- a minimal control plane suitable for testing
code changes locally, but not suitable for running production systems.
## Example: Start with Postgres 16

View File

@@ -261,7 +261,13 @@ fn fill_rust_env_vars(cmd: &mut Command) -> &mut Command {
let mut filled_cmd = cmd.env_clear().env("RUST_BACKTRACE", backtrace_setting);
// Pass through these environment variables to the command
for var in ["LLVM_PROFILE_FILE", "FAILPOINTS", "RUST_LOG"] {
for var in [
"LLVM_PROFILE_FILE",
"FAILPOINTS",
"RUST_LOG",
"ASAN_OPTIONS",
"UBSAN_OPTIONS",
] {
if let Some(val) = std::env::var_os(var) {
filled_cmd = filled_cmd.env(var, val);
}

View File

@@ -552,8 +552,10 @@ struct EndpointCreateCmdArgs {
lsn: Option<Lsn>,
#[clap(long)]
pg_port: Option<u16>,
#[clap(long, alias = "http-port")]
external_http_port: Option<u16>,
#[clap(long)]
http_port: Option<u16>,
internal_http_port: Option<u16>,
#[clap(long = "pageserver-id")]
endpoint_pageserver_id: Option<NodeId>,
@@ -1353,7 +1355,8 @@ async fn handle_endpoint(subcmd: &EndpointCmd, env: &local_env::LocalEnv) -> Res
tenant_id,
timeline_id,
args.pg_port,
args.http_port,
args.external_http_port,
args.internal_http_port,
args.pg_version,
mode,
!args.update_catalog,

View File

@@ -37,6 +37,8 @@
//! ```
//!
use std::collections::BTreeMap;
use std::net::IpAddr;
use std::net::Ipv4Addr;
use std::net::SocketAddr;
use std::net::TcpStream;
use std::path::PathBuf;
@@ -73,7 +75,8 @@ pub struct EndpointConf {
timeline_id: TimelineId,
mode: ComputeMode,
pg_port: u16,
http_port: u16,
external_http_port: u16,
internal_http_port: u16,
pg_version: u32,
skip_pg_catalog_updates: bool,
drop_subscriptions_before_start: bool,
@@ -128,7 +131,7 @@ impl ComputeControlPlane {
1 + self
.endpoints
.values()
.map(|ep| std::cmp::max(ep.pg_address.port(), ep.http_address.port()))
.map(|ep| std::cmp::max(ep.pg_address.port(), ep.external_http_address.port()))
.max()
.unwrap_or(self.base_port)
}
@@ -140,18 +143,27 @@ impl ComputeControlPlane {
tenant_id: TenantId,
timeline_id: TimelineId,
pg_port: Option<u16>,
http_port: Option<u16>,
external_http_port: Option<u16>,
internal_http_port: Option<u16>,
pg_version: u32,
mode: ComputeMode,
skip_pg_catalog_updates: bool,
drop_subscriptions_before_start: bool,
) -> Result<Arc<Endpoint>> {
let pg_port = pg_port.unwrap_or_else(|| self.get_port());
let http_port = http_port.unwrap_or_else(|| self.get_port() + 1);
let external_http_port = external_http_port.unwrap_or_else(|| self.get_port() + 1);
let internal_http_port = internal_http_port.unwrap_or_else(|| external_http_port + 1);
let ep = Arc::new(Endpoint {
endpoint_id: endpoint_id.to_owned(),
pg_address: SocketAddr::new("127.0.0.1".parse().unwrap(), pg_port),
http_address: SocketAddr::new("127.0.0.1".parse().unwrap(), http_port),
pg_address: SocketAddr::new(IpAddr::from(Ipv4Addr::LOCALHOST), pg_port),
external_http_address: SocketAddr::new(
IpAddr::from(Ipv4Addr::UNSPECIFIED),
external_http_port,
),
internal_http_address: SocketAddr::new(
IpAddr::from(Ipv4Addr::LOCALHOST),
internal_http_port,
),
env: self.env.clone(),
timeline_id,
mode,
@@ -176,7 +188,8 @@ impl ComputeControlPlane {
tenant_id,
timeline_id,
mode,
http_port,
external_http_port,
internal_http_port,
pg_port,
pg_version,
skip_pg_catalog_updates,
@@ -230,9 +243,10 @@ pub struct Endpoint {
pub timeline_id: TimelineId,
pub mode: ComputeMode,
// port and address of the Postgres server and `compute_ctl`'s HTTP API
// port and address of the Postgres server and `compute_ctl`'s HTTP APIs
pub pg_address: SocketAddr,
pub http_address: SocketAddr,
pub external_http_address: SocketAddr,
pub internal_http_address: SocketAddr,
// postgres major version in the format: 14, 15, etc.
pg_version: u32,
@@ -287,8 +301,15 @@ impl Endpoint {
serde_json::from_slice(&std::fs::read(entry.path().join("endpoint.json"))?)?;
Ok(Endpoint {
pg_address: SocketAddr::new("127.0.0.1".parse().unwrap(), conf.pg_port),
http_address: SocketAddr::new("127.0.0.1".parse().unwrap(), conf.http_port),
pg_address: SocketAddr::new(IpAddr::from(Ipv4Addr::LOCALHOST), conf.pg_port),
external_http_address: SocketAddr::new(
IpAddr::from(Ipv4Addr::UNSPECIFIED),
conf.external_http_port,
),
internal_http_address: SocketAddr::new(
IpAddr::from(Ipv4Addr::LOCALHOST),
conf.internal_http_port,
),
endpoint_id,
env: env.clone(),
timeline_id: conf.timeline_id,
@@ -650,24 +671,51 @@ impl Endpoint {
println!("Also at '{}'", conn_str);
}
let mut cmd = Command::new(self.env.neon_distrib_dir.join("compute_ctl"));
cmd.args(["--http-port", &self.http_address.port().to_string()])
.args(["--pgdata", self.pgdata().to_str().unwrap()])
.args(["--connstr", &conn_str])
.args([
"--spec-path",
self.endpoint_path().join("spec.json").to_str().unwrap(),
])
.args([
"--pgbin",
self.env
.pg_bin_dir(self.pg_version)?
.join("postgres")
.to_str()
.unwrap(),
])
.stdin(std::process::Stdio::null())
.stderr(logfile.try_clone()?)
.stdout(logfile);
//cmd.args([
// "--external-http-port",
// &self.external_http_address.port().to_string(),
//])
//.args([
// "--internal-http-port",
// &self.internal_http_address.port().to_string(),
//])
cmd.args([
"--http-port",
&self.external_http_address.port().to_string(),
])
.args(["--pgdata", self.pgdata().to_str().unwrap()])
.args(["--connstr", &conn_str])
.args([
"--spec-path",
self.endpoint_path().join("spec.json").to_str().unwrap(),
])
.args([
"--pgbin",
self.env
.pg_bin_dir(self.pg_version)?
.join("postgres")
.to_str()
.unwrap(),
])
// TODO: It would be nice if we generated compute IDs with the same
// algorithm as the real control plane.
//
// TODO: Add this back when
// https://github.com/neondatabase/neon/pull/10747 is merged.
//
//.args([
// "--compute-id",
// &format!(
// "compute-{}",
// SystemTime::now()
// .duration_since(UNIX_EPOCH)
// .unwrap()
// .as_secs()
// ),
//])
.stdin(std::process::Stdio::null())
.stderr(logfile.try_clone()?)
.stdout(logfile);
if let Some(remote_ext_config) = remote_ext_config {
cmd.args(["--remote-ext-config", remote_ext_config]);
@@ -754,8 +802,8 @@ impl Endpoint {
reqwest::Method::GET,
format!(
"http://{}:{}/status",
self.http_address.ip(),
self.http_address.port()
self.external_http_address.ip(),
self.external_http_address.port()
),
)
.send()
@@ -828,8 +876,8 @@ impl Endpoint {
let response = client
.post(format!(
"http://{}:{}/configure",
self.http_address.ip(),
self.http_address.port()
self.external_http_address.ip(),
self.external_http_address.port()
))
.header(CONTENT_TYPE.as_str(), "application/json")
.body(format!(

View File

@@ -347,16 +347,31 @@ impl PageServerNode {
.map(|x| x.parse::<usize>())
.transpose()
.context("Failed to parse 'compaction_threshold' as an integer")?,
compaction_upper_limit: settings
.remove("compaction_upper_limit")
.map(|x| x.parse::<usize>())
.transpose()
.context("Failed to parse 'compaction_upper_limit' as an integer")?,
compaction_algorithm: settings
.remove("compaction_algorithm")
.map(serde_json::from_str)
.transpose()
.context("Failed to parse 'compaction_algorithm' json")?,
compaction_l0_first: settings
.remove("compaction_l0_first")
.map(|x| x.parse::<bool>())
.transpose()
.context("Failed to parse 'compaction_l0_first' as a bool")?,
l0_flush_delay_threshold: settings
.remove("l0_flush_delay_threshold")
.map(|x| x.parse::<usize>())
.transpose()
.context("Failed to parse 'l0_flush_delay_threshold' as an integer")?,
l0_flush_wait_upload: settings
.remove("l0_flush_wait_upload")
.map(|x| x.parse::<bool>())
.transpose()
.context("Failed to parse 'l0_flush_wait_upload' as a boolean")?,
l0_flush_stall_threshold: settings
.remove("l0_flush_stall_threshold")
.map(|x| x.parse::<usize>())
@@ -378,6 +393,11 @@ impl PageServerNode {
.map(|x| x.parse::<u8>())
.transpose()
.context("Failed to parse 'image_creation_check_threshold' as integer")?,
image_creation_preempt_threshold: settings
.remove("image_creation_preempt_threshold")
.map(|x| x.parse::<usize>())
.transpose()
.context("Failed to parse 'image_creation_preempt_threshold' as integer")?,
pitr_interval: settings.remove("pitr_interval").map(|x| x.to_string()),
walreceiver_connect_timeout: settings
.remove("walreceiver_connect_timeout")

View File

@@ -17,8 +17,10 @@ use camino::Utf8PathBuf;
use postgres_connection::PgConnectionConfig;
use reqwest::{IntoUrl, Method};
use thiserror::Error;
use http_utils::error::HttpErrorBody;
use utils::auth::{Claims, Scope};
use utils::{http::error::HttpErrorBody, id::NodeId};
use utils::id::NodeId;
use crate::{
background_process,

View File

@@ -221,7 +221,17 @@ impl StorageController {
"-p",
&format!("{}", postgres_port),
];
let exitcode = Command::new(bin_path).args(args).spawn()?.wait().await?;
let pg_lib_dir = self.get_pg_lib_dir().await.unwrap();
let envs = [
("LD_LIBRARY_PATH".to_owned(), pg_lib_dir.to_string()),
("DYLD_LIBRARY_PATH".to_owned(), pg_lib_dir.to_string()),
];
let exitcode = Command::new(bin_path)
.args(args)
.envs(envs)
.spawn()?
.wait()
.await?;
Ok(exitcode.success())
}
@@ -242,6 +252,11 @@ impl StorageController {
let pg_bin_dir = self.get_pg_bin_dir().await?;
let createdb_path = pg_bin_dir.join("createdb");
let pg_lib_dir = self.get_pg_lib_dir().await.unwrap();
let envs = [
("LD_LIBRARY_PATH".to_owned(), pg_lib_dir.to_string()),
("DYLD_LIBRARY_PATH".to_owned(), pg_lib_dir.to_string()),
];
let output = Command::new(&createdb_path)
.args([
"-h",
@@ -254,6 +269,7 @@ impl StorageController {
&username(),
DB_NAME,
])
.envs(envs)
.output()
.await
.expect("Failed to spawn createdb");

View File

@@ -10,8 +10,8 @@ use pageserver_api::{
controller_api::{
AvailabilityZone, NodeAvailabilityWrapper, NodeDescribeResponse, NodeShardResponse,
SafekeeperDescribeResponse, SafekeeperSchedulingPolicyRequest, ShardSchedulingPolicy,
ShardsPreferredAzsRequest, SkSchedulingPolicy, TenantCreateRequest, TenantDescribeResponse,
TenantPolicyRequest,
ShardsPreferredAzsRequest, ShardsPreferredAzsResponse, SkSchedulingPolicy,
TenantCreateRequest, TenantDescribeResponse, TenantPolicyRequest,
},
models::{
EvictionPolicy, EvictionPolicyLayerAccessThreshold, LocationConfigSecondary,
@@ -800,7 +800,7 @@ async fn main() -> anyhow::Result<()> {
.collect(),
};
storcon_client
.dispatch::<ShardsPreferredAzsRequest, ()>(
.dispatch::<ShardsPreferredAzsRequest, ShardsPreferredAzsResponse>(
Method::PUT,
"control/v1/preferred_azs".to_string(),
Some(req),

View File

@@ -32,6 +32,7 @@ reason = "the marvin attack only affects private key decryption, not public key
# https://embarkstudios.github.io/cargo-deny/checks/licenses/cfg.html
[licenses]
allow = [
"0BSD",
"Apache-2.0",
"BSD-2-Clause",
"BSD-3-Clause",
@@ -41,8 +42,8 @@ allow = [
"MIT",
"MPL-2.0",
"OpenSSL",
"Unicode-DFS-2016",
"Unicode-3.0",
"Zlib",
]
confidence-threshold = 0.8
exceptions = [

View File

@@ -7,11 +7,12 @@ FROM $REPOSITORY/${COMPUTE_IMAGE}:$TAG
ARG COMPUTE_IMAGE
USER root
RUN apt-get update && \
RUN echo 'Acquire::Retries "5";' > /etc/apt/apt.conf.d/80-retries && \
apt-get update && \
apt-get install -y curl \
jq \
netcat-openbsd
#This is required for the pg_hintplan test
RUN mkdir -p /ext-src/pg_hint_plan-src && chown postgres /ext-src/pg_hint_plan-src
RUN mkdir -p /ext-src/pg_hint_plan-src /postgres/contrib/file_fdw && chown postgres /ext-src/pg_hint_plan-src /postgres/contrib/file_fdw
USER postgres
USER postgres

View File

@@ -20,30 +20,55 @@ while ! nc -z pageserver 6400; do
done
echo "Page server is ready."
echo "Create a tenant and timeline"
generate_id tenant_id
PARAMS=(
-X PUT
-H "Content-Type: application/json"
-d "{\"mode\": \"AttachedSingle\", \"generation\": 1, \"tenant_conf\": {}}"
"http://pageserver:9898/v1/tenant/${tenant_id}/location_config"
)
result=$(curl "${PARAMS[@]}")
echo $result | jq .
cp ${SPEC_FILE_ORG} ${SPEC_FILE}
generate_id timeline_id
PARAMS=(
-sbf
-X POST
-H "Content-Type: application/json"
-d "{\"new_timeline_id\": \"${timeline_id}\", \"pg_version\": ${PG_VERSION}}"
"http://pageserver:9898/v1/tenant/${tenant_id}/timeline/"
)
result=$(curl "${PARAMS[@]}")
echo $result | jq .
if [ -n "${TENANT_ID:-}" ] && [ -n "${TIMELINE_ID:-}" ]; then
tenant_id=${TENANT_ID}
timeline_id=${TIMELINE_ID}
else
echo "Check if a tenant present"
PARAMS=(
-X GET
-H "Content-Type: application/json"
"http://pageserver:9898/v1/tenant"
)
tenant_id=$(curl "${PARAMS[@]}" | jq -r .[0].id)
if [ -z "${tenant_id}" ] || [ "${tenant_id}" = null ]; then
echo "Create a tenant"
generate_id tenant_id
PARAMS=(
-X PUT
-H "Content-Type: application/json"
-d "{\"mode\": \"AttachedSingle\", \"generation\": 1, \"tenant_conf\": {}}"
"http://pageserver:9898/v1/tenant/${tenant_id}/location_config"
)
result=$(curl "${PARAMS[@]}")
echo $result | jq .
fi
echo "Check if a timeline present"
PARAMS=(
-X GET
-H "Content-Type: application/json"
"http://pageserver:9898/v1/tenant/${tenant_id}/timeline"
)
timeline_id=$(curl "${PARAMS[@]}" | jq -r .[0].timeline_id)
if [ -z "${timeline_id}" ] || [ "${timeline_id}" = null ]; then
generate_id timeline_id
PARAMS=(
-sbf
-X POST
-H "Content-Type: application/json"
-d "{\"new_timeline_id\": \"${timeline_id}\", \"pg_version\": ${PG_VERSION}}"
"http://pageserver:9898/v1/tenant/${tenant_id}/timeline/"
)
result=$(curl "${PARAMS[@]}")
echo $result | jq .
fi
fi
echo "Overwrite tenant id and timeline id in spec file"
sed "s/TENANT_ID/${tenant_id}/" ${SPEC_FILE_ORG} > ${SPEC_FILE}
sed -i "s/TENANT_ID/${tenant_id}/" ${SPEC_FILE}
sed -i "s/TIMELINE_ID/${timeline_id}/" ${SPEC_FILE}
cat ${SPEC_FILE}

View File

@@ -149,11 +149,13 @@ services:
args:
- REPOSITORY=${REPOSITORY:-neondatabase}
- COMPUTE_IMAGE=compute-node-v${PG_VERSION:-16}
- TAG=${TAG:-latest}
- TAG=${COMPUTE_TAG:-${TAG:-latest}}
- http_proxy=${http_proxy:-}
- https_proxy=${https_proxy:-}
environment:
- PG_VERSION=${PG_VERSION:-16}
- TENANT_ID=${TENANT_ID:-}
- TIMELINE_ID=${TIMELINE_ID:-}
#- RUST_BACKTRACE=1
# Mount the test files directly, for faster editing cycle.
volumes:

View File

@@ -31,7 +31,7 @@ for pg_version in ${TEST_VERSION_ONLY-14 15 16 17}; do
echo "clean up containers if exists"
cleanup
PG_TEST_VERSION=$((pg_version < 16 ? 16 : pg_version))
PG_VERSION=$pg_version PG_TEST_VERSION=$PG_TEST_VERSION docker compose --profile test-extensions -f $COMPOSE_FILE up --build -d
PG_VERSION=$pg_version PG_TEST_VERSION=$PG_TEST_VERSION docker compose --profile test-extensions -f $COMPOSE_FILE up --quiet-pull --build -d
echo "wait until the compute is ready. timeout after 60s. "
cnt=0
@@ -51,6 +51,8 @@ for pg_version in ${TEST_VERSION_ONLY-14 15 16 17}; do
done
if [ $pg_version -ge 16 ]; then
docker cp ext-src $TEST_CONTAINER_NAME:/
docker exec $TEST_CONTAINER_NAME bash -c "apt update && apt install -y libtap-parser-sourcehandler-pgtap-perl"
# This is required for the pg_hint_plan test, to prevent flaky log message causing the test to fail
# It cannot be moved to Dockerfile now because the database directory is created after the start of the container
echo Adding dummy config
@@ -60,17 +62,32 @@ for pg_version in ${TEST_VERSION_ONLY-14 15 16 17}; do
docker cp $TEST_CONTAINER_NAME:/ext-src/pg_hint_plan-src/data $TMPDIR/data
docker cp $TMPDIR/data $COMPUTE_CONTAINER_NAME:/ext-src/pg_hint_plan-src/
rm -rf $TMPDIR
# The following block does the same for the contrib/file_fdw test
TMPDIR=$(mktemp -d)
docker cp $TEST_CONTAINER_NAME:/postgres/contrib/file_fdw/data $TMPDIR/data
docker cp $TMPDIR/data $COMPUTE_CONTAINER_NAME:/postgres/contrib/file_fdw/data
rm -rf $TMPDIR
# Apply patches
cat ../compute/patches/contrib_pg${pg_version}.patch | docker exec -i $TEST_CONTAINER_NAME bash -c "(cd /postgres && patch -p1)"
# We are running tests now
if ! docker exec -e SKIP=timescaledb-src,rdkit-src,postgis-src,pgx_ulid-src,pgtap-src,pg_tiktoken-src,pg_jsonschema-src,pg_graphql-src,kq_imcx-src,wal2json_2_5-src \
$TEST_CONTAINER_NAME /run-tests.sh | tee testout.txt
then
FAILED=$(tail -1 testout.txt)
for d in $FAILED
do
mkdir $d
docker cp $TEST_CONTAINER_NAME:/ext-src/$d/regression.diffs $d || true
docker cp $TEST_CONTAINER_NAME:/ext-src/$d/regression.out $d || true
cat $d/regression.out $d/regression.diffs || true
rm -f testout.txt testout_contrib.txt
docker exec -e USE_PGXS=1 -e SKIP=timescaledb-src,rdkit-src,postgis-src,pgx_ulid-src,pgtap-src,pg_tiktoken-src,pg_jsonschema-src,kq_imcx-src,wal2json_2_5-src \
$TEST_CONTAINER_NAME /run-tests.sh /ext-src | tee testout.txt && EXT_SUCCESS=1 || EXT_SUCCESS=0
docker exec -e SKIP=start-scripts,postgres_fdw,ltree_plpython,jsonb_plpython,jsonb_plperl,hstore_plpython,hstore_plperl,dblink,bool_plperl \
$TEST_CONTAINER_NAME /run-tests.sh /postgres/contrib | tee testout_contrib.txt && CONTRIB_SUCCESS=1 || CONTRIB_SUCCESS=0
if [ $EXT_SUCCESS -eq 0 ] || [ $CONTRIB_SUCCESS -eq 0 ]; then
CONTRIB_FAILED=
FAILED=
[ $EXT_SUCCESS -eq 0 ] && FAILED=$(tail -1 testout.txt | awk '{for(i=1;i<=NF;i++){print "/ext-src/"$i;}}')
[ $CONTRIB_SUCCESS -eq 0 ] && CONTRIB_FAILED=$(tail -1 testout_contrib.txt | awk '{for(i=0;i<=NF;i++){print "/postgres/contrib/"$i;}}')
for d in $FAILED $CONTRIB_FAILED; do
dn="$(basename $d)"
rm -rf $dn
mkdir $dn
docker cp $TEST_CONTAINER_NAME:$d/regression.diffs $dn || [ $? -eq 1 ]
docker cp $TEST_CONTAINER_NAME:$d/regression.out $dn || [ $? -eq 1 ]
cat $dn/regression.out $dn/regression.diffs || true
rm -rf $dn
done
rm -rf $FAILED
exit 1

View File

@@ -0,0 +1,5 @@
#!/bin/sh
set -ex
cd "$(dirname ${0})"
PG_REGRESS=$(dirname "$(pg_config --pgxs)")/../test/regress/pg_regress
${PG_REGRESS} --use-existing --inputdir=./ --bindir='/usr/local/pgsql/bin' --dbname=contrib_regression add_agg agg_oob auto_sparse card_op cast_shape copy_binary cumulative_add_cardinality_correction cumulative_add_comprehensive_promotion cumulative_add_sparse_edge cumulative_add_sparse_random cumulative_add_sparse_step cumulative_union_comprehensive cumulative_union_explicit_explicit cumulative_union_explicit_promotion cumulative_union_probabilistic_probabilistic cumulative_union_sparse_full_representation cumulative_union_sparse_promotion cumulative_union_sparse_sparse disable_hashagg equal explicit_thresh hash hash_any meta_func murmur_bigint murmur_bytea nosparse notequal scalar_oob storedproc transaction typmod typmod_insert union_op

View File

@@ -0,0 +1,27 @@
diff --git a/expected/hypopg.out b/expected/hypopg.out
index 90121d0..859260b 100644
--- a/expected/hypopg.out
+++ b/expected/hypopg.out
@@ -11,7 +11,8 @@ BEGIN
END;
$_$
LANGUAGE plpgsql;
-CREATE EXTENSION hypopg;
+CREATE EXTENSION IF NOT EXISTS hypopg;
+NOTICE: extension "hypopg" already exists, skipping
CREATE TABLE hypo (id integer, val text, "Id2" bigint);
INSERT INTO hypo SELECT i, 'line ' || i
FROM generate_series(1,100000) f(i);
diff --git a/test/sql/hypopg.sql b/test/sql/hypopg.sql
index 99722b0..8d6bacb 100644
--- a/test/sql/hypopg.sql
+++ b/test/sql/hypopg.sql
@@ -12,7 +12,7 @@ END;
$_$
LANGUAGE plpgsql;
-CREATE EXTENSION hypopg;
+CREATE EXTENSION IF NOT EXISTS hypopg;
CREATE TABLE hypo (id integer, val text, "Id2" bigint);

View File

@@ -0,0 +1,6 @@
#!/bin/sh
set -ex
cd "$(dirname ${0})"
patch -p1 <test-upgrade.patch
PG_REGRESS=$(dirname "$(pg_config --pgxs)")/../test/regress/pg_regress
${PG_REGRESS} --inputdir=./ --bindir='/usr/local/pgsql/bin' --use-existing --inputdir=test --dbname=contrib_regression hypopg hypo_brin hypo_index_part hypo_include hypo_hash hypo_hide_index

View File

@@ -0,0 +1,23 @@
diff --git a/expected/ip4r.out b/expected/ip4r.out
index 7527af3..b38ed29 100644
--- a/expected/ip4r.out
+++ b/expected/ip4r.out
@@ -1,6 +1,5 @@
--
/*CUT-HERE*/
-CREATE EXTENSION ip4r;
-- Check whether any of our opclasses fail amvalidate
DO $d$
DECLARE
diff --git a/sql/ip4r.sql b/sql/ip4r.sql
index 65c49ec..24ade09 100644
--- a/sql/ip4r.sql
+++ b/sql/ip4r.sql
@@ -1,7 +1,6 @@
--
/*CUT-HERE*/
-CREATE EXTENSION ip4r;
-- Check whether any of our opclasses fail amvalidate

View File

@@ -0,0 +1,6 @@
#!/bin/sh
set -ex
cd "$(dirname ${0})"
patch -p1 <test-upgrade.patch
PG_REGRESS=$(dirname "$(pg_config --pgxs)")/../test/regress/pg_regress
${PG_REGRESS} --use-existing --inputdir=./ --bindir='/usr/local/pgsql/bin' --dbname=contrib_regression ip4r ip4r-softerr ip4r-v11

View File

@@ -0,0 +1,75 @@
diff --git a/expected/pg_cron-test.out b/expected/pg_cron-test.out
index d79d542..1663886 100644
--- a/expected/pg_cron-test.out
+++ b/expected/pg_cron-test.out
@@ -1,30 +1,3 @@
-CREATE EXTENSION pg_cron VERSION '1.0';
-SELECT extversion FROM pg_extension WHERE extname='pg_cron';
- extversion
-------------
- 1.0
-(1 row)
-
--- Test binary compatibility with v1.4 function signature.
-ALTER EXTENSION pg_cron UPDATE TO '1.4';
-SELECT cron.unschedule(job_name := 'no_such_job');
-ERROR: could not find valid entry for job 'no_such_job'
-SELECT cron.schedule('testjob', '* * * * *', 'SELECT 1');
- schedule
-----------
- 1
-(1 row)
-
-SELECT cron.unschedule('testjob');
- unschedule
-------------
- t
-(1 row)
-
--- Test cache invalidation
-DROP EXTENSION pg_cron;
-CREATE EXTENSION pg_cron VERSION '1.4';
-ALTER EXTENSION pg_cron UPDATE;
-- Vacuum every day at 10:00am (GMT)
SELECT cron.schedule('0 10 * * *', 'VACUUM');
schedule
@@ -300,8 +273,3 @@ SELECT jobid, jobname, schedule, command FROM cron.job ORDER BY jobid;
SELECT cron.schedule('bad-last-dom-job1', '0 11 $foo * *', 'VACUUM FULL');
ERROR: invalid schedule: 0 11 $foo * *
HINT: Use cron format (e.g. 5 4 * * *), or interval format '[1-59] seconds'
--- cleaning
-DROP EXTENSION pg_cron;
-drop user pgcron_cront;
-drop database pgcron_dbno;
-drop database pgcron_dbyes;
diff --git a/sql/pg_cron-test.sql b/sql/pg_cron-test.sql
index 45f94d9..241cf73 100644
--- a/sql/pg_cron-test.sql
+++ b/sql/pg_cron-test.sql
@@ -1,17 +1,3 @@
-CREATE EXTENSION pg_cron VERSION '1.0';
-SELECT extversion FROM pg_extension WHERE extname='pg_cron';
--- Test binary compatibility with v1.4 function signature.
-ALTER EXTENSION pg_cron UPDATE TO '1.4';
-SELECT cron.unschedule(job_name := 'no_such_job');
-SELECT cron.schedule('testjob', '* * * * *', 'SELECT 1');
-SELECT cron.unschedule('testjob');
-
--- Test cache invalidation
-DROP EXTENSION pg_cron;
-CREATE EXTENSION pg_cron VERSION '1.4';
-
-ALTER EXTENSION pg_cron UPDATE;
-
-- Vacuum every day at 10:00am (GMT)
SELECT cron.schedule('0 10 * * *', 'VACUUM');
@@ -156,8 +142,3 @@ SELECT jobid, jobname, schedule, command FROM cron.job ORDER BY jobid;
-- invalid last of day job
SELECT cron.schedule('bad-last-dom-job1', '0 11 $foo * *', 'VACUUM FULL');
--- cleaning
-DROP EXTENSION pg_cron;
-drop user pgcron_cront;
-drop database pgcron_dbno;
-drop database pgcron_dbyes;

View File

@@ -0,0 +1,6 @@
#!/bin/sh
set -ex
cd "$(dirname ${0})"
patch -p1 <test-upgrade.patch
PG_REGRESS=$(dirname "$(pg_config --pgxs)")/../test/regress/pg_regress
${PG_REGRESS} --use-existing --inputdir=./ --bindir='/usr/local/pgsql/bin' --dbname=contrib_regression pg_cron-test

View File

@@ -0,0 +1,13 @@
#!/bin/bash
set -ex
cd "$(dirname "${0}")"
dropdb --if-exists contrib_regression
createdb contrib_regression
PGXS="$(dirname "$(pg_config --pgxs)" )"
REGRESS="${PGXS}/../test/regress/pg_regress"
TESTDIR="test"
TESTS=$(ls "${TESTDIR}/sql" | sort )
TESTS=${TESTS//\.sql/}
psql -v ON_ERROR_STOP=1 -f test/fixtures.sql -d contrib_regression
${REGRESS} --use-existing --dbname=contrib_regression --inputdir=${TESTDIR} ${TESTS}

View File

@@ -0,0 +1,18 @@
diff --git a/expected/pg_ivm.out b/expected/pg_ivm.out
index e8798ee..cca58d0 100644
--- a/expected/pg_ivm.out
+++ b/expected/pg_ivm.out
@@ -1,4 +1,3 @@
-CREATE EXTENSION pg_ivm;
GRANT ALL ON SCHEMA public TO public;
-- create a table to use as a basis for views and materialized views in various combinations
CREATE TABLE mv_base_a (i int, j int);
diff --git a/sql/pg_ivm.sql b/sql/pg_ivm.sql
index d3c1a01..9382d7f 100644
--- a/sql/pg_ivm.sql
+++ b/sql/pg_ivm.sql
@@ -1,4 +1,3 @@
-CREATE EXTENSION pg_ivm;
GRANT ALL ON SCHEMA public TO public;
-- create a table to use as a basis for views and materialized views in various combinations

View File

@@ -0,0 +1,6 @@
#!/bin/sh
set -ex
cd "$(dirname ${0})"
patch -p1 <test-upgrade.patch
PG_REGRESS=$(dirname "$(pg_config --pgxs)")/../test/regress/pg_regress
${PG_REGRESS} --use-existing --inputdir=./ --bindir='/usr/local/pgsql/bin' --dbname=contrib_regression pg_ivm create_immv refresh_immv

View File

@@ -0,0 +1,25 @@
diff --git a/expected/roaringbitmap.out b/expected/roaringbitmap.out
index de70531..a5f7c15 100644
--- a/expected/roaringbitmap.out
+++ b/expected/roaringbitmap.out
@@ -1,7 +1,6 @@
--
-- Test roaringbitmap extension
--
-CREATE EXTENSION if not exists roaringbitmap;
-- Test input and output
set roaringbitmap.output_format='array';
set extra_float_digits = 0;
diff --git a/sql/roaringbitmap.sql b/sql/roaringbitmap.sql
index a0e9c74..84bc966 100644
--- a/sql/roaringbitmap.sql
+++ b/sql/roaringbitmap.sql
@@ -2,8 +2,6 @@
-- Test roaringbitmap extension
--
-CREATE EXTENSION if not exists roaringbitmap;
-
-- Test input and output
set roaringbitmap.output_format='array';

View File

@@ -0,0 +1,6 @@
#!/bin/sh
set -ex
cd "$(dirname ${0})"
patch -p1 <test-upgrade.patch
PG_REGRESS=$(dirname "$(pg_config --pgxs)")/../test/regress/pg_regress
${PG_REGRESS} --use-existing --inputdir=./ --bindir='/usr/local/pgsql/bin' --dbname=contrib_regression roaringbitmap

View File

@@ -0,0 +1,24 @@
diff --git a/test/sql/base.sql b/test/sql/base.sql
index af599d8..2eed91b 100644
--- a/test/sql/base.sql
+++ b/test/sql/base.sql
@@ -2,7 +2,6 @@
BEGIN;
\i test/pgtap-core.sql
-\i sql/semver.sql
SELECT plan(334);
--SELECT * FROM no_plan();
diff --git a/test/sql/corpus.sql b/test/sql/corpus.sql
index 1f5f637..a519905 100644
--- a/test/sql/corpus.sql
+++ b/test/sql/corpus.sql
@@ -4,7 +4,6 @@ BEGIN;
-- Test the SemVer corpus from https://regex101.com/r/Ly7O1x/3/.
\i test/pgtap-core.sql
-\i sql/semver.sql
SELECT plan(71);
--SELECT * FROM no_plan();

View File

@@ -0,0 +1,6 @@
#!/bin/sh
set -ex
cd "$(dirname ${0})"
patch -p1 <test-upgrade.patch
PG_REGRESS=$(dirname "$(pg_config --pgxs)")/../test/regress/pg_regress
${PG_REGRESS} --use-existing --inputdir=./ --bindir='/usr/local/pgsql/bin' --inputdir=test --dbname=contrib_regression base corpus

View File

@@ -0,0 +1,5 @@
#!/bin/sh
set -ex
cd "$(dirname ${0})"
PG_REGRESS=$(dirname "$(pg_config --pgxs)")/../test/regress/pg_regress
${PG_REGRESS} --use-existing --inputdir=./ --bindir='/usr/local/pgsql/bin' --inputdir=test --dbname=contrib_regression 002_uuid_generate_v7 003_uuid_v7_to_timestamptz 004_uuid_timestamptz_to_v7 005_uuid_v7_to_timestamp 006_uuid_timestamp_to_v7

View File

@@ -0,0 +1,4 @@
#!/bin/bash
set -ex
cd "$(dirname "${0}")"
pg_prove test.sql

View File

@@ -0,0 +1,15 @@
diff --git a/test.sql b/test.sql
index d7a0ca8..f15bc76 100644
--- a/test.sql
+++ b/test.sql
@@ -9,9 +9,7 @@
\set ON_ERROR_STOP true
\set QUIET 1
-CREATE EXTENSION pgcrypto;
-CREATE EXTENSION pgtap;
-CREATE EXTENSION pgjwt;
+CREATE EXTENSION IF NOT EXISTS pgtap;
BEGIN;
SELECT plan(23);

View File

@@ -0,0 +1,5 @@
#!/bin/sh
set -ex
cd "$(dirname ${0})"
patch -p1 <test-upgrade.patch
pg_prove -d contrib_regression test.sql

View File

@@ -0,0 +1,5 @@
#!/bin/sh
set -ex
cd "$(dirname ${0})"
PG_REGRESS=$(dirname "$(pg_config --pgxs)")/../test/regress/pg_regress
${PG_REGRESS} --inputdir=./ --bindir='/usr/local/pgsql/bin' --inputdir=test --use-existing --dbname=contrib_regression bit btree cast copy halfvec hnsw_bit hnsw_halfvec hnsw_sparsevec hnsw_vector ivfflat_bit ivfflat_halfvec ivfflat_vector sparsevec vector_type

View File

@@ -0,0 +1,5 @@
#!/bin/sh
set -ex
cd "$(dirname ${0})"
PG_REGRESS=$(dirname "$(pg_config --pgxs)")/../test/regress/pg_regress
${PG_REGRESS} --inputdir=./ --bindir='/usr/local/pgsql/bin' --use-existing --dbname=contrib_regression plv8 plv8-errors scalar_args inline json startup_pre startup varparam json_conv jsonb_conv window guc es6 arraybuffer composites currentresource startup_perms bytea find_function_perms memory_limits reset show array_spread regression dialect bigint procedure

View File

@@ -0,0 +1,5 @@
#!/bin/sh
set -ex
cd "$(dirname ${0})"
PG_REGRESS=$(dirname "$(pg_config --pgxs)")/../test/regress/pg_regress
${PG_REGRESS} --inputdir=./ --bindir='/usr/local/pgsql/bin' --use-existing --dbname=contrib_regression extension tables unit binary unicode prefix units time temperature functions language_functions round derived compare aggregate iec custom crosstab convert

View File

@@ -0,0 +1,5 @@
#!/bin/sh
set -ex
cd "$(dirname ${0})"
PG_REGRESS=$(dirname "$(pg_config --pgxs)")/../test/regress/pg_regress
${PG_REGRESS} --use-existing --inputdir=./ --bindir='/usr/local/pgsql/bin' --dbname=contrib_regression prefix falcon explain queries

View File

@@ -0,0 +1,19 @@
diff --git a/expected/rum.out b/expected/rum.out
index 5966d19..8860b79 100644
--- a/expected/rum.out
+++ b/expected/rum.out
@@ -1,4 +1,3 @@
-CREATE EXTENSION rum;
CREATE TABLE test_rum( t text, a tsvector );
CREATE TRIGGER tsvectorupdate
BEFORE UPDATE OR INSERT ON test_rum
diff --git a/sql/rum.sql b/sql/rum.sql
index 8414bb9..898e6ab 100644
--- a/sql/rum.sql
+++ b/sql/rum.sql
@@ -1,5 +1,3 @@
-CREATE EXTENSION rum;
-
CREATE TABLE test_rum( t text, a tsvector );
CREATE TRIGGER tsvectorupdate

View File

@@ -0,0 +1,6 @@
#!/bin/sh
set -ex
cd "$(dirname ${0})"
patch -p1 <test-upgrade.patch
PG_REGRESS=$(dirname "$(pg_config --pgxs)")/../test/regress/pg_regress
${PG_REGRESS} --inputdir=./ --bindir='/usr/local/pgsql/bin' --use-existing --dbname=contrib_regression rum rum_validate rum_hash ruminv timestamp orderby orderby_hash altorder altorder_hash limits int2 int4 int8 float4 float8 money oid time timetz date interval macaddr inet cidr text varchar char bytea bit varbit numeric rum_weight expr array

View File

@@ -1,17 +1,22 @@
#!/bin/bash
set -x
cd /ext-src || exit 2
extdir=${1}
cd "${extdir}" || exit 2
FAILED=
LIST=$( (echo -e "${SKIP//","/"\n"}"; ls -d -- *-src) | sort | uniq -u)
for d in ${LIST}
do
[ -d "${d}" ] || continue
if ! psql -w -c "select 1" >/dev/null; then
FAILED="${d} ${FAILED}"
break
fi
LIST=$( (echo -e "${SKIP//","/"\n"}"; ls) | sort | uniq -u)
for d in ${LIST}; do
[ -d "${d}" ] || continue
if ! psql -w -c "select 1" >/dev/null; then
FAILED="${d} ${FAILED}"
break
fi
if [ -f "${d}/neon-test.sh" ]; then
"${d}/neon-test.sh" || FAILED="${d} ${FAILED}"
else
USE_PGXS=1 make -C "${d}" installcheck || FAILED="${d} ${FAILED}"
fi
done
[ -z "${FAILED}" ] && exit 0
echo "${FAILED}"

View File

@@ -0,0 +1,94 @@
#!/bin/bash
set -eux -o pipefail
cd "$(dirname "${0}")"
# Takes a variable name as argument. The result is stored in that variable.
generate_id() {
local -n resvar=$1
printf -v resvar '%08x%08x%08x%08x' $SRANDOM $SRANDOM $SRANDOM $SRANDOM
}
if [ -z ${OLDTAG+x} ] || [ -z ${NEWTAG+x} ] || [ -z "${OLDTAG}" ] || [ -z "${NEWTAG}" ]; then
echo OLDTAG and NEWTAG must be defined
exit 1
fi
export PG_VERSION=${PG_VERSION:-16}
function wait_for_ready {
TIME=0
while ! docker compose logs compute_is_ready | grep -q "accepting connections" && [ ${TIME} -le 300 ] ; do
((TIME += 1 ))
sleep 1
done
if [ ${TIME} -gt 300 ]; then
echo Time is out.
exit 2
fi
}
function create_extensions() {
for ext in ${1}; do
docker compose exec neon-test-extensions psql -X -v ON_ERROR_STOP=1 -d contrib_regression -c "CREATE EXTENSION IF NOT EXISTS ${ext} CASCADE"
done
}
EXTENSIONS='[
{"extname": "plv8", "extdir": "plv8-src"},
{"extname": "vector", "extdir": "pgvector-src"},
{"extname": "unit", "extdir": "postgresql-unit-src"},
{"extname": "hypopg", "extdir": "hypopg-src"},
{"extname": "rum", "extdir": "rum-src"},
{"extname": "ip4r", "extdir": "ip4r-src"},
{"extname": "prefix", "extdir": "prefix-src"},
{"extname": "hll", "extdir": "hll-src"},
{"extname": "pg_cron", "extdir": "pg_cron-src"},
{"extname": "pg_uuidv7", "extdir": "pg_uuidv7-src"},
{"extname": "roaringbitmap", "extdir": "pg_roaringbitmap-src"},
{"extname": "semver", "extdir": "pg_semver-src"},
{"extname": "pg_ivm", "extdir": "pg_ivm-src"},
{"extname": "pgjwt", "extdir": "pgjwt-src"}
]'
EXTNAMES=$(echo ${EXTENSIONS} | jq -r '.[].extname' | paste -sd ' ' -)
TAG=${NEWTAG} docker compose --profile test-extensions up --quiet-pull --build -d
wait_for_ready
docker compose exec neon-test-extensions psql -c "DROP DATABASE IF EXISTS contrib_regression"
docker compose exec neon-test-extensions psql -c "CREATE DATABASE contrib_regression"
create_extensions "${EXTNAMES}"
query="select json_object_agg(extname,extversion) from pg_extension where extname in ('${EXTNAMES// /\',\'}')"
new_vers=$(docker compose exec neon-test-extensions psql -Aqt -d contrib_regression -c "$query")
docker compose --profile test-extensions down
TAG=${OLDTAG} docker compose --profile test-extensions up --quiet-pull --build -d --force-recreate
wait_for_ready
docker compose cp ext-src neon-test-extensions:/
docker compose exec neon-test-extensions psql -c "DROP DATABASE IF EXISTS contrib_regression"
docker compose exec neon-test-extensions psql -c "CREATE DATABASE contrib_regression"
create_extensions "${EXTNAMES}"
query="select pge.extname from pg_extension pge join (select key as extname, value as extversion from json_each_text('${new_vers}')) x on pge.extname=x.extname and pge.extversion <> x.extversion"
exts=$(docker compose exec neon-test-extensions psql -Aqt -d contrib_regression -c "$query")
if [ -z "${exts}" ]; then
echo "No extensions were upgraded"
else
tenant_id=$(docker compose exec neon-test-extensions psql -Aqt -c "SHOW neon.tenant_id")
timeline_id=$(docker compose exec neon-test-extensions psql -Aqt -c "SHOW neon.timeline_id")
for ext in ${exts}; do
echo Testing ${ext}...
EXTDIR=$(echo ${EXTENSIONS} | jq -r '.[] | select(.extname=="'${ext}'") | .extdir')
generate_id new_timeline_id
PARAMS=(
-sbf
-X POST
-H "Content-Type: application/json"
-d "{\"new_timeline_id\": \"${new_timeline_id}\", \"pg_version\": ${PG_VERSION}, \"ancestor_timeline_id\": \"${timeline_id}\"}"
"http://127.0.0.1:9898/v1/tenant/${tenant_id}/timeline/"
)
result=$(curl "${PARAMS[@]}")
echo $result | jq .
TENANT_ID=${tenant_id} TIMELINE_ID=${new_timeline_id} TAG=${OLDTAG} docker compose down compute compute_is_ready
COMPUTE_TAG=${NEWTAG} TAG=${OLDTAG} TENANT_ID=${tenant_id} TIMELINE_ID=${new_timeline_id} docker compose up --quiet-pull -d --build compute compute_is_ready
wait_for_ready
TID=$(docker compose exec neon-test-extensions psql -Aqt -c "SHOW neon.timeline_id")
if [ ${TID} != ${new_timeline_id} ]; then
echo Timeline mismatch
exit 1
fi
docker compose exec neon-test-extensions psql -d contrib_regression -c "\dx ${ext}"
docker compose exec neon-test-extensions sh -c /ext-src/${EXTDIR}/test-upgrade.sh
docker compose exec neon-test-extensions psql -d contrib_regression -c "alter extension ${ext} update"
docker compose exec neon-test-extensions psql -d contrib_regression -c "\dx ${ext}"
done
fi

View File

@@ -285,10 +285,10 @@ To summarize, list of cplane changes:
### storage_controller implementation
Current 'load everything on startup and keep in memory' easy design is fine.
Single timeline shouldn't take more than 100 bytes (it's 16 byte tenant_id, 16
byte timeline_id, int generation, vec of ~3 safekeeper ids plus some flags), so
10^6 of timelines shouldn't take more than 100MB.
If desired, we may continue using current 'load everything on startup and keep
in memory' approach: single timeline shouldn't take more than 100 bytes (it's 16
byte tenant_id, 16 byte timeline_id, int generation, vec of ~3 safekeeper ids
plus some flags), so 10^6 of timelines shouldn't take more than 100MB.
Similar to pageserver attachment Intents storage_controller would have in-memory
`MigrationRequest` (or its absense) for each timeline and pool of tasks trying
@@ -296,7 +296,7 @@ to make these request reality; this ensures one instance of storage_controller
won't do several migrations on the same timeline concurrently. In the first
version it is simpler to have more manual control and no retries, i.e. migration
failure removes the request. Later we can build retries and automatic
scheduling/migration. `MigrationRequest` is
scheduling/migration around. `MigrationRequest` is
```
enum MigrationRequest {
To(Vec<NodeId>),
@@ -313,9 +313,9 @@ similarly, in the first version it is ok to trigger it manually).
#### Schema
`safekeepers` table mirroring current `nodes` should be added, except that for
`scheduling_policy` field (seems like `status` is a better name for it): it is enough
to have at least in the beginning only 3 fields: 1) `active` 2) `offline` 3)
`decomissioned`.
`scheduling_policy`: it is enough to have at least in the beginning only 3
fields: 1) `active` 2) `paused` (initially means only not assign new tlis there
3) `decomissioned` (node is removed).
`timelines` table:
```
@@ -324,18 +324,24 @@ table! {
timelines (tenant_id, timeline_id) {
timeline_id -> Varchar,
tenant_id -> Varchar,
start_lsn -> pg_lsn,
generation -> Int4,
sk_set -> Array<Int4>, // list of safekeeper ids
new_sk_set -> Nullable<Array<Int4>>, // list of safekeeper ids, null if not joint conf
new_sk_set -> Nullable<Array<Int8>>, // list of safekeeper ids, null if not joint conf
cplane_notified_generation -> Int4,
deleted_at -> Nullable<Timestamptz>,
}
}
```
`start_lsn` is needed to create timeline on safekeepers properly, see below. We
might also want to add ancestor_timeline_id to preserve the hierarchy, but for
this RFC it is not needed.
#### API
Node management is similar to pageserver:
1) POST `/control/v1/safekeepers` upserts safekeeper.
1) POST `/control/v1/safekeepers` inserts safekeeper.
2) GET `/control/v1/safekeepers` lists safekeepers.
3) GET `/control/v1/safekeepers/:node_id` gets safekeeper.
4) PUT `/control/v1/safekepers/:node_id/status` changes status to e.g.
@@ -345,25 +351,15 @@ Node management is similar to pageserver:
Safekeeper deploy scripts should register safekeeper at storage_contorller as
they currently do with cplane, under the same id.
Timeline creation/deletion: already existing POST `tenant/:tenant_id/timeline`
would 1) choose initial set of safekeepers; 2) write to the db initial
`Configuration` with `INSERT ON CONFLICT DO NOTHING` returning existing row in
case of conflict; 3) create timeline on the majority of safekeepers (already
created is ok).
Timeline creation/deletion will work through already existing POST and DELETE
`tenant/:tenant_id/timeline`. Cplane is expected to retry both until they
succeed. See next section on the implementation details.
We don't want to block timeline creation when one safekeeper is down. Currently
this is solved by compute implicitly creating timeline on any safekeeper it is
connected to. This creates ugly timeline state on safekeeper when timeline is
created, but start LSN is not defined yet. It would be nice to remove this; to
do that, controller can in the background retry to create timeline on
safekeeper(s) which missed that during initial creation call. It can do that
through `pull_timeline` from majority so it doesn't need to remember
`parent_lsn` in its db.
Timeline deletion removes the row from the db and forwards deletion to the
current configuration members. Without additional actions deletions might leak,
see below on this; initially let's ignore these, reporting to cplane success if
at least one safekeeper deleted the timeline (this will remove s3 data).
We don't want to block timeline creation/deletion when one safekeeper is down.
Currently this is crutched by compute implicitly creating timeline on any
safekeeper it is connected to. This creates ugly timeline state on safekeeper
when timeline is created, but start LSN is not defined yet. Next section
describes dealing with this.
Tenant deletion repeats timeline deletion for all timelines.
@@ -395,26 +391,6 @@ Similar call should be added for the tenant.
It would be great to have some way of subscribing to the results (apart from
looking at logs/metrics).
Migration is executed as described above. One subtlety is that (local) deletion on
source safekeeper might fail, which is not a problem if we are going to
decomission the node but leaves garbage otherwise. I'd propose in the first version
1) Don't attempt deletion at all if node status is `offline`.
2) If it failed, just issue warning.
And add PUT `/control/v1/safekeepers/:node_id/scrub` endpoint which would find and
remove garbage timelines for manual use. It will 1) list all timelines on the
safekeeper 2) compare each one against configuration storage: if timeline
doesn't exist at all (had been deleted), it can be deleted. Otherwise, it can
be deleted under generation number if node is not member of current generation.
Automating this is untrivial; we'd need to register all potential missing
deletions <tenant_id, timeline_id, generation, node_id> in the same transaction
which switches configurations. Similarly when timeline is fully deleted to
prevent cplane operation from blocking when some safekeeper is not available
deletion should be also registered.
One more task pool should infinitely retry notifying control plane about changed
safekeeper sets.
3) GET `/control/v1/tenant/:tenant_id/timeline/:timeline_id/` should return
current in memory state of the timeline and pending `MigrationRequest`,
if any.
@@ -423,12 +399,153 @@ safekeeper sets.
migration by switching configuration from the joint to the one with (previous) `sk_set` under CAS
(incrementing generation as always).
#### API implementation and reconciliation
For timeline creation/deletion we want to preserve the basic assumption that
unreachable minority (1 sk of 3) doesn't block their completion, but eventually
we want to finish creation/deletion on nodes which missed it (unless they are
removed). Similarly for migration; it may and should finish even though excluded
members missed their exclusion. And of course e.g. such pending exclusion on
node C after migration ABC -> ABD must not prevent next migration ABD -> ABE. As
another example, if some node missed timeline creation it clearly must not block
migration from it. Hence it is natural to have per safekeeper background
reconciler which retries these ops until they succeed. There are 3 possible
operation types, and the type is defined by timeline state (membership
configuration and whether it is deleted) and safekeeper id: we may need to
create timeline on sk (node added), locally delete it (node excluded, somewhat
similar to detach) or globally delete it (timeline is deleted).
Next, on storage controller restart in principle these pending operations can be
figured out by comparing safekeepers state against storcon state. But it seems
better to me to materialize them in the database; it is not expensive, avoids
these startup scans which themselves can fail etc and makes it very easy to see
outstanding work directly at the source of truth -- the db. So we can add table
`safekeeper_timeline_pending_ops`
```
table! {
// timeline_id, sk_id is primary key
safekeeper_timeline_pending_ops (sk_id, tenant_id, timeline_id) {
sk_id -> int8,
tenant_id -> Varchar,
timeline_id -> Varchar,
generation -> Int4,
op_type -> Varchar,
}
}
```
`op_type` can be `include` (seed from peers and ensure generation is up to
date), `exclude` (remove locally) and `delete`. Field is actually not strictly
needed as it can be computed from current configuration, but gives more explicit
observability.
`generation` is necessary there because after op is done reconciler must remove
it and not remove another row with higher gen which in theory might appear.
Any insert of row should overwrite (remove) all rows with the same sk and
timeline id but lower `generation` as next op makes previous obsolete. Insertion
of `op_type` `delete` overwrites all rows.
About `exclude`: rather than adding explicit safekeeper http endpoint, it is
reasonable to reuse membership switch endpoint: if safekeeper is not member
of the configuration it locally removes the timeline on the switch. In this case
404 should also be considered an 'ok' answer by the caller.
So, main loop of per sk reconcile reads `safekeeper_timeline_pending_ops`
joined with timeline configuration to get current conf (with generation `n`)
for the safekeeper and does the jobs, infinitely retrying failures:
1) If node is member (`include`):
- Check if timeline exists on it, if not, call pull_timeline on it from
other members
- Call switch configuration to the current
2) If node is not member (`exclude`):
- Call switch configuration to the current, 404 is ok.
3) If timeline is deleted (`delete`), call delete.
In cases 1 and 2 remove `safekeeper_timeline_pending_ops` for the sk and
timeline with generation <= `n` if `op_type` is not `delete`.
In case 3 also remove `safekeeper_timeline_pending_ops`
entry + remove `timelines` entry if there is nothing left in `safekeeper_timeline_pending_ops` for the timeline.
Let's consider in details how APIs can be implemented from this angle.
Timeline creation. It is assumed that cplane retries it until success, so all
actions must be idempotent. Now, a tricky point here is timeline start LSN. For
the initial (tenant creation) call cplane doesn't know it. However, setting
start_lsn on safekeepers during creation is a good thing -- it provides a
guarantee that walproposer can always find a common point in WAL histories of
safekeeper and its own, and so absense of it would be a clear sign of
corruption. The following sequence works:
1) Create timeline (or observe that it exists) on pageserver,
figuring out last_record_lsn in response.
2) Choose safekeepers and insert (ON CONFLICT DO NOTHING) timeline row into the
db. Note that last_record_lsn returned on the previous step is movable as it
changes once ingestion starts, insert must not overwrite it (as well as other
fields like membership conf). On the contrary, start_lsn used in the next
step must be set to the value in the db. cplane_notified_generation can be set
to 1 (initial generation) in insert to avoid notifying cplane about initial
conf as cplane will receive it in timeline creation request anyway.
3) Issue timeline creation calls to at least majority of safekeepers. Using
majority here is not necessary but handy because it guarantees that any live
majority will have at least one sk with created timeline and so
reconciliation task can use pull_timeline shared with migration instead of
create timeline special init case. OFC if timeline is already exists call is
ignored.
4) For minority of safekeepers which could have missed creation insert
entries to `safekeeper_timeline_pending_ops`. We won't miss this insertion
because response to cplane is sent only after it has happened, and cplane
retries the call until 200 response.
There is a small question how request handler (timeline creation in this
case) would interact with per sk reconciler. As always I prefer to do the
simplest possible thing and here it seems to be just waking it up so it
re-reads the db for work to do. Passing work in memory is faster, but
that shouldn't matter, and path to scan db for work will exist anyway,
simpler to reuse it.
For pg version / wal segment size: while we may persist them in `timelines`
table, it is not necessary as initial creation at step 3 can take them from
pageserver or cplane creation call and later pull_timeline will carry them
around.
Timeline migration.
1) CAS to the db to create joint conf, and in the same transaction create
`safekeeper_timeline_pending_ops` `include` entries to initialize new members
as well as deliver this conf to current ones; poke per sk reconcilers to work
on it. Also any conf change should also poke cplane notifier task(s).
2) Once it becomes possible per alg description above, get out of joint conf
with another CAS. Task should get wakeups from per sk reconcilers because
conf switch is required for advancement; however retries should be sleep
based as well as LSN advancement might be needed, though in happy path
it isn't. To see whether further transition is possible on wakup migration
executor polls safekeepers per the algorithm. CAS creating new conf with only
new members should again insert entries to `safekeeper_timeline_pending_ops`
to switch them there, as well as `exclude` rows to remove timeline from
old members.
Timeline deletion: just set `deleted_at` on the timeline row and insert
`safekeeper_timeline_pending_ops` entries in the same xact, the rest is done by
per sk reconcilers.
When node is removed (set to `decomissioned`), `safekeeper_timeline_pending_ops`
for it must be cleared in the same transaction.
One more task pool should infinitely retry notifying control plane about changed
safekeeper sets (trying making `cplane_notified_generation` equal `generation`).
#### Dealing with multiple instances of storage_controller
Operations described above executed concurrently might create some errors but do
not prevent progress, so while we normally don't want to run multiple instances
of storage_controller it is fine to have it temporarily, e.g. during redeploy.
To harden against some controller instance creating some work in
`safekeeper_timeline_pending_ops` and then disappearing without anyone pickup up
the job per sk reconcilers apart from explicit wakups should scan for work
periodically. It is possible to remove that though if all db updates are
protected with leadership token/term -- then such scans are needed only after
leadership is acquired.
Any interactions with db update in-memory controller state, e.g. if migration
request failed because different one is in progress, controller remembers that
and tries to finish it.
@@ -545,7 +662,7 @@ Aurora does this but similarly I don't think this is needed.
We should use Compute <-> safekeeper protocol change to include other (long
yearned) modifications:
- send data in network order to make arm work.
- send data in network order without putting whole structs to be arch independent
- remove term_start_lsn from AppendRequest
- add horizon to TermHistory
- add to ProposerGreeting number of connection from this wp to sk

View File

@@ -15,11 +15,6 @@ pub struct GenericAPIError {
pub error: String,
}
#[derive(Debug, Clone, Serialize)]
pub struct InfoResponse {
pub num_cpus: usize,
}
#[derive(Debug, Clone, Serialize)]
pub struct ExtensionInstallResponse {
pub extension: PgIdent,

Some files were not shown because too many files have changed in this diff Show More