Commit Graph

2234 Commits

Konstantin Knizhnik
3da70abfa5 Fix pageserver_try_receive (#11096)
## Problem

See https://neondb.slack.com/archives/C04DGM6SMTM/p1741176713523469

The problem is that this function uses `PQgetCopyData(shard->conn,
&resp_buff.data, 1 /* async = true */)` to try to fetch the next message.
But that call returns 0 if the whole message is not yet present in the
buffer, and the input buffer may contain only part of a message, so the
result is never fetched.

## Summary of changes

Use `PQisBusy` + `WaitEventSetWait` to check whether data is available, and
`PQgetCopyData(shard->conn, &resp_buff.data, 0)` to read the whole message
in that case.

---------

Co-authored-by: Konstantin Knizhnik <knizhnik@neon.tech>
2025-03-20 15:21:00 +00:00
Dmitrii Kovalkov
9bf59989db storcon: add https API (#11239)
## Problem

Pageservers use unencrypted HTTP requests for storage controller API.

- Closes: https://github.com/neondatabase/cloud/issues/25524

## Summary of changes

- Replace hyper0::server::Server with http_utils::server::Server in
storage controller.
- Add HTTPS handler for storage controller API.
- Support `ssl_ca_file` in pageserver.
2025-03-20 08:22:02 +00:00
Alexey Kondratov
518269ea6a feat(compute): Add perf test for compute startup time breakdown (#11198)
## Problem

We recently had a Postgres startup latency (`start_postgres_ms`)
degradation, but it was only caught by SLO alerts. There was actually an
existing test measuring the same metric (`start_postgres_ms`), but it does
only two starts, so it's a bit noisy.

## Summary of changes

Add a new compute startup latency test that does 100 iterations and
reports p50, p90, and p99 latencies.

Part of https://github.com/neondatabase/cloud/issues/24882
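
For reference, a minimal sketch (nearest-rank percentiles over made-up sample data; the actual test lives in the Python perf suite):

```rust
// Nearest-rank percentile over a sorted sample set; names and data are illustrative.
fn percentile(sorted_ms: &[f64], p: f64) -> f64 {
    let rank = ((p / 100.0) * sorted_ms.len() as f64).ceil() as usize;
    sorted_ms[rank.saturating_sub(1).min(sorted_ms.len() - 1)]
}

fn main() {
    // Pretend these are 100 measured `start_postgres_ms` samples.
    let mut samples_ms: Vec<f64> = (1..=100).map(|i| i as f64 * 3.0).collect();
    samples_ms.sort_by(|a, b| a.partial_cmp(b).unwrap());
    println!(
        "p50={} p90={} p99={}",
        percentile(&samples_ms, 50.0),
        percentile(&samples_ms, 90.0),
        percentile(&samples_ms, 99.0),
    );
}
```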
2025-03-19 16:11:33 +00:00
Christian Schwarz
0f20dae3c3 impr: merge pageserver_api::models::TenantConfig and pageserver::tenant::config::TenantConfOpt (#11298)
The only difference between
- `pageserver_api::models::TenantConfig` and
- `pageserver::tenant::config::TenantConfOpt`

at this point is that `TenantConfOpt` serializes with
`skip_serializing_if = Option::is_none`.
That is an efficiency improvement for all the places that currently
serialize `models::TenantConfig`, because new serializations will no longer
write `$fieldname: null` for each field that is `None` at runtime.

This should be particularly beneficial for Storcon, which stores
JSON-serialized `models::TenantConfig` in its DB.

# Behavior Changes


This PR changes the serialization behavior: we omit `None` fields
instead of serializing `$fieldname: null`.

So it's a data format change (see section on compatibility below).

And it changes API responses from Storcon and Pageserver.

## API Response Compatibility

Storcon returns the location description.
Afaik it is passed through into
- storcon_cli output
- storcon UI in console admin UI

These outputs will no longer contain `$fieldname: null` values,
which de-bloats the output (good).
But in the storcon UI, it also serves as an editor "default", which
will be eliminated once a storcon build with this PR is released.


## Data Format Compatibility


Backwards compat: new software reading old serialized data will
deserialize to the same runtime value because all the field types
are exactly the same and `skip_serializing_if` does not affect
deserialization.

Forward compat: old software reading data serialized by new software
will map fields absent in the serialized form to the runtime value
`Option::None`. This is serde's default behavior; see this playground
to convince yourself:

https://play.rust-lang.org/?version=stable&mode=debug&edition=2024&gist=f7f4e1a169959a3085b6158c022a05eb

The `serde(with = "humantime_serde")` attribute, however, behaves strangely:
if used on an `Option<Duration>`, it still requires the field to be present,
unlike the default serde behavior shown in the previous paragraph.
The workaround is to set `serde(default)`.
Previously it was set on each individual field, but since the container
attribute exists, we set it there instead.
This requires deriving a `Default` impl, which is non-magic because all
fields are `Option`.
See my notes here:
https://gist.github.com/problame/eddbc225a5d12617e9f2c6413e0cf799
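
A minimal sketch of the resulting serde setup (field names are illustrative, not the full `TenantConfig`), showing the omitted-`None` serialization, the container-level `serde(default)`, and the `humantime_serde` workaround described above:

```rust
use std::time::Duration;

use serde::{Deserialize, Serialize};

#[derive(Debug, Default, PartialEq, Serialize, Deserialize)]
#[serde(default)] // container-level default: absent fields become `None`
struct TenantConfigSketch {
    #[serde(skip_serializing_if = "Option::is_none")]
    compaction_threshold: Option<usize>,
    // `humantime_serde` on Option<Duration> needs the `default` above to
    // tolerate a missing field.
    #[serde(with = "humantime_serde", skip_serializing_if = "Option::is_none")]
    pitr_interval: Option<Duration>,
}

fn main() {
    let conf = TenantConfigSketch {
        compaction_threshold: Some(10),
        pitr_interval: None,
    };
    // `None` fields are omitted instead of being written as `"field": null`.
    assert_eq!(
        serde_json::to_string(&conf).unwrap(),
        r#"{"compaction_threshold":10}"#
    );
    // Old data with explicit nulls still deserializes to the same runtime value.
    let old: TenantConfigSketch =
        serde_json::from_str(r#"{"compaction_threshold":10,"pitr_interval":null}"#).unwrap();
    assert_eq!(old, conf);
}
```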

# Future Work

We should have separate types (& crates) for
- runtime configuration types (e.g. PageServerConf::tenant_config,
AttachedLocationConf)
- the `config-v1` pageserver local disk file format
- the `mgmt API`
- `pageserver.toml`

Right now they all use the same type, which is convenient but makes it hard
to reason about compatibility breakage.

# Refs

- corresponding docs.neon.build PR
https://github.com/neondatabase/docs/pull/470
2025-03-19 12:47:17 +00:00
Dmitrii Kovalkov
57d51e949d tests: suppress excessive pageserver errors in test_timeline_ancestor_detach_errors (#11277)
## Problem

The test is flaky for the same reasons as described in
https://github.com/neondatabase/neon/issues/11177.
The test has already suppressed these `WARN` and `ERROR` log messages,
but the regexp didn't match all possible errors.

## Summary of changes
- Change the regexp to suppress all allowed error log messages.
2025-03-18 07:10:11 +00:00
Arpad Müller
56149a046a Add test_explicit_timeline_creation_storcon and make it work (#11261)
Adds a basic test that makes the storcon issue explicit creation of a
timeline on safekeepers (main storcon PR in #11058). It was adapted from
`test_explicit_timeline_creation` from #11002.

Also, do a bunch of fixes needed to get the test to work (the API
definitions weren't correct), and log more when we can't create a
new timeline because no safekeepers are active.

Part of #9011

---------

Co-authored-by: Arseny Sher <sher-ars@yandex.ru>
2025-03-17 16:28:21 +00:00
Alexey Kondratov
966abd3bd6 fix(compute_ctl): Dollar escaping helper fixes (#11263)
## Problem

In the previous PR #11045, one edge case wasn't covered: when an ident
contains only one `$`, we were picking `$$` as a 'wrapper'. Yet, when
this `$` is at the beginning or at the end of the ident, we end up
with `$$$` in a row, which breaks the escaping.

## Summary of changes

Start from an `x` tag instead of a blank string (see the sketch below).

Slack:
https://neondb.slack.com/archives/C08HV951W2W/p1742076675079769?thread_ts=1742004205.461159&cid=C08HV951W2W
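
A rough sketch of the tag-selection idea (hypothetical helper names, not the actual `compute_ctl` code): starting from `x` means a leading or trailing `$` in the ident can no longer merge with the delimiter into an ambiguous `$$$` run.

```rust
// Hypothetical sketch: pick a dollar-quote tag that doesn't occur in the
// payload, starting from "x" rather than the empty tag.
fn pick_dollar_tag(payload: &str) -> String {
    let mut tag = String::from("x");
    while payload.contains(&format!("${tag}$")) {
        tag.push('x');
    }
    tag
}

fn quote_dollar(payload: &str) -> String {
    let tag = pick_dollar_tag(payload);
    format!("${tag}${payload}${tag}$")
}

fn main() {
    // With a blank tag this would have been `$$$foo$$`, which is ambiguous.
    assert_eq!(quote_dollar("$foo"), "$x$$foo$x$");
}
```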
2025-03-16 18:39:54 +00:00
Peter Bendel
228bb75354 Extend large tenant OLTP workload ... (#11166)
... to better match the workload characteristics of real Neon customers

## Problem

We analyzed workloads of large Neon users and want to extend the oltp
workload to include characteristics seen in those workloads.

## Summary of changes

- for the re-used branch, delete inserted rows from the last run
- adjust expected run-time (time-outs) in the GitHub workflow
- add queries that expose the prefetch getpages path
- add I/U/D transactions for another table (so far the workload was
insert/append-only)
- add an explicit vacuum analyze step and measure its time
- add a reindex concurrently step and measure its time (and take care that
this step succeeds even if prior reindex runs have failed or were
canceled)
- create a second connection string for the pooled connection that
removes the `-pooler` suffix from the hostname, because we want to run
long-running statements (database maintenance) and bypass the pooler,
which doesn't support an unlimited statement timeout

## Test run


https://github.com/neondatabase/neon/actions/runs/13851772887/job/38760172415
2025-03-16 14:04:48 +00:00
Dmitrii Kovalkov
3168bd0e3a tests: suppress "Cancelled request finished with an error" in test_timeline_archive (#11241)
## Problem

Previous PR https://github.com/neondatabase/neon/pull/11190 didn't
suppress `Cancelled request finished with an error` messages, which are
also expected, so the test
https://github.com/neondatabase/neon/issues/11177 is still flaky.

## Summary of changes
- Suppress `Cancelled request finished with an error` in
`test_timeline_archive`
2025-03-14 17:42:09 +00:00
Alexander Bayandin
4a97cd0b7e test_runner: fix tests with jsonnet for Python 3.13 (#11240)
## Problem
Python's `jsonnet` 0.20.0 doesn't support Python 3.13, so we have a
couple of tests xfailed because of that.

## Summary of changes
- Bump `jsonnet` to `0.21.0rc2` which supports Python 3.13
- Unxfail `test_sql_exporter_metrics_e2e` and
`test_sql_exporter_metrics_smoke` on Python 3.13
2025-03-14 17:02:55 +00:00
Dmitrii Kovalkov
f68be2b5e2 safekeeper: https for management API (#11171)
## Problem

Storage controller uses unencrypted HTTP requests for safekeeper
management API.

- Closes: https://github.com/neondatabase/cloud/issues/24836

## Summary of changes

- Replace `hyper0::server::Server` with `http_utils::server::Server` in
safekeeper.
- Add HTTPS handler for safekeeper management API.
2025-03-14 11:41:22 +00:00
Erik Grinaker
d6d78a050f pageserver: disable l0_flush_wait_upload by default (#11215)
## Problem

This is already disabled in production, as it is replaced by L0 flush
delays. It will be removed in a later PR, once the config option is no
longer specified in production.

## Summary of changes

Disable `l0_flush_wait_upload` by default.
2025-03-13 21:08:28 +00:00
Alex Chi Z.
23b713900e feat(storcon): passthrough ancestor detach behavior (#11199)
## Problem

https://github.com/neondatabase/neon/issues/10310
https://github.com/neondatabase/neon/pull/11158

## Summary of changes

We need to passthrough the new detach behavior through the storcon API.

Signed-off-by: Alex Chi Z <chi@neon.tech>
2025-03-13 20:21:23 +00:00
Arpad Müller
b1a1be6a4c switch pytests and neon_local to control_plane_hooks_api (#11195)
We want to switch away from and deprecate the `--compute-hook-url` param
for the storcon in favour of `--control-plane-url`, because it allows us
to construct URLs with `notify-safekeepers`.

This PR switches the pytests and neon_local from
`control_plane_compute_hook_api` to a new param named
`control_plane_hooks_api`, which is supposed to point to the parent of
the `notify-attach` URL.

We still support reading the old URL from disk to avoid being too
disruptive to existing deployments, but we just ignore it.

Also add docs for the `notify-safekeepers` upcall API.

Follow-up of #11173
Part of https://github.com/neondatabase/neon/issues/11163
2025-03-13 19:50:52 +00:00
Christian Schwarz
ed31dd2a3c pageserver: better observability for slow wait_lsn (#11176)
# Problem

We leave too few observability breadcrumbs in the case where wait_lsn is
exceptionally slow.

# Changes

- refactor: extract the monitoring logic out of `log_slow` into
`monitor_slow_future`
- add global + per-timeline counter for time spent waiting for wait_lsn
- It is updated while we're still waiting, similar to what we do for
page_service response flush.
- add per-timeline counterpair for started & finished wait_lsn count
- add slow-logging to leave breadcrumbs in logs, not just metrics

For the slow-logging, we need to consider not flooding the logs during a
broker or network outage/blip.
The solution is a "log-streak-level" concurrency limit per timeline.
At any given time, there is at most one slow wait_lsn that is logging
the "still running" and "completed" sequence of logs.
Other concurrent slow wait_lsn's don't log at all.
This leaves at least one breadcrumb in each timeline's logs if some
wait_lsn was exceptionally slow during a given period.
The full degree of slowness can then be determined by looking at the
per-timeline metric.
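
A toy sketch of the log-streak idea (illustrative names, not the pageserver's actual implementation): a single-permit semaphore per timeline, where only the waiter holding the permit emits the "still running"/"completed" log sequence.

```rust
use std::sync::Arc;
use std::time::Duration;

use tokio::sync::Semaphore;

// Only attempt to become the logging waiter *after* the wait has already
// been classified as slow, so the fast path never touches the semaphore.
async fn slow_wait_lsn(log_streak: Arc<Semaphore>, id: u32) {
    let permit = log_streak.try_acquire();
    if permit.is_ok() {
        println!("wait_lsn {id}: slow, still waiting");
    }
    tokio::time::sleep(Duration::from_millis(50)).await; // stand-in for the actual wait
    if permit.is_ok() {
        println!("wait_lsn {id}: slow wait completed");
    }
    // Dropping the permit lets the next slow waiter start its own log streak.
}

#[tokio::main]
async fn main() {
    let log_streak = Arc::new(Semaphore::new(1));
    // Two concurrent slow waiters: only one of them logs.
    tokio::join!(
        slow_wait_lsn(log_streak.clone(), 1),
        slow_wait_lsn(log_streak.clone(), 2),
    );
}
```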

# Performance

Reran the `bench_log_slow` benchmark: no difference, so existing call
sites are fine.

We do use a Semaphore, but we only `try_acquire` it _after_ things have
already been determined to be slow, so no baseline overhead is
anticipated.

# Refs

-
https://github.com/neondatabase/cloud/issues/23486#issuecomment-2711587222
2025-03-13 15:03:53 +00:00
devin-ai-integration[bot]
efb1df4362 fix: Change metric_unit from 'microseconds' to 'μs' in test_compute_ctl_api.py (#11209)
# Fix metric_unit length in test_compute_ctl_api.py

## Description
This PR changes the metric_unit from "microseconds" to "μs" in
test_compute_ctl_api.py to fix the issue where perf test results were
not being stored in the database due to the string exceeding the 10
character limit of the metric_unit column in the perf_test_results
table.

## Problem
As reported in Slack, the perf test results were not being uploaded to
the database because the "microseconds" string (12 characters) exceeds
the 10 character limit of the metric_unit column in the
perf_test_results table.

## Solution
Replace "microseconds" with "μs" in all metric_unit parameters in the
test_compute_ctl_api.py file.

## Testing
The changes have been committed and pushed. The PR is ready for review.

Link to Devin run:
https://app.devin.ai/sessions/e29edd672bd34114b059915820e8a853
Requested by: Peter Bendel

Co-authored-by: Devin AI <158243242+devin-ai-integration[bot]@users.noreply.github.com>
Co-authored-by: peterbendel@neon.tech <peterbendel@neon.tech>
2025-03-13 10:17:01 +00:00
Alex Chi Z.
c3b3b507f7 feat(pageserver): support detaching behavior v2 (#11158)
## Problem

close https://github.com/neondatabase/neon/issues/10310

## Summary of changes

This patch adds a new behavior to the detach_ancestor API: detach with a
multi-level ancestor and no reparenting. Though we could potentially
support multi-level + reparenting or single-level + no-reparenting in
the future, it's not required for the recovery/snapshot epic, so I'd
prefer to keep things simple for now and handle only the old behavior and
the new one instead of supporting the full feature matrix.

I only added a test case for successful detaching instead of testing
failures. I'd like to get this into staging and add more tests in the
future.

---------

Signed-off-by: Alex Chi Z <chi@neon.tech>
2025-03-12 22:27:23 +00:00
Alex Chi Z.
8a5a739af0 test(pageserver): add small tenant compaction (#11049)
## Problem

close https://github.com/neondatabase/neon/issues/10881

## Summary of changes

Mock a tenant with a very small amount of data.

---------

Signed-off-by: Alex Chi Z <chi@neon.tech>
2025-03-12 20:34:19 +00:00
Tristan Partin
5eed0e4b94 Add docs to performance/test_logical_replication.py on how to run the suite (#10175)
These docs are in tandem with what was recently published on the
internal docs site.

Signed-off-by: Tristan Partin <tristan@neon.tech>
2025-03-12 17:31:09 +00:00
Vlad Lazar
02a83913ec storcon: do not update observed state on node activation (#11155)
## Problem

When a node becomes active, we query its locations and update the
observed state in-place.
This can race with the observed state updates done when processing
reconcile results.

## Summary of changes

The argument for this reconciliation step is that it reduces the need
for background reconciliations.
I don't think that is actually true anymore. There are two cases.

1. Restart of node after drain. Usually the node does not go through the
offline state here, so observed locations
were not marked as none. In any case, there should be a handful of
shards max on the node since we've just drained it.
2. Node comes back online after failure or network partition. When the
node is marked offline, we reschedule everything away from it. When it
later becomes active, the previous observed location is extraneous and
requires a reconciliation anyway.

Closes https://github.com/neondatabase/neon/issues/11148
2025-03-12 15:31:28 +00:00
Dmitrii Kovalkov
73e37ae388 Suppress "request was dropped" errors in test_timeline_archive (#11190)
## Problem

Test `test_timeline_archive` is flaky because it makes requests that are
intended to fail. This sometimes leads to warnings in the pageserver's logs.
More details are in the issue.

- Closes: https://github.com/neondatabase/neon/issues/11177

## Summary of changes
- Suppress such errors.
2025-03-12 13:23:31 +00:00
Alex Chi Z.
3451bdd3d2 fix(test): force L0 compaction before gc-compaction (#11143)
## Problem

Fix test flakiness of `test_gc_feedback`

Closes: https://github.com/neondatabase/neon/issues/11153

## Summary of changes

Looking at the log, gc-compaction is interrupted by L0 compaction.

Signed-off-by: Alex Chi Z <chi@neon.tech>
2025-03-10 20:03:49 +00:00
Dmitrii Kovalkov
63b22d3fb1 pageserver: https for management API (#11025)
## Problem

Storage controller uses unencrypted HTTP requests for pageserver
management API.

Closes: https://github.com/neondatabase/cloud/issues/24283


## Summary of changes
- Implement `http_utils::server::Server` with TLS support.
- Replace `hyper0::server::Server` with `http_utils::server::Server` in
pageserver.
- Add HTTPS handler for pageserver management API.
- Generate local SSL certificates in neon local.
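
As a rough illustration of the TLS-termination piece (a sketch with assumed crates — tokio, tokio-rustls, rustls, rustls-pemfile, anyhow — and a placeholder bind address; the real implementation lives in `http_utils::server::Server`):

```rust
use std::sync::Arc;

use tokio::net::TcpListener;
use tokio_rustls::TlsAcceptor;

async fn serve_https(cert_pem: &[u8], key_pem: &[u8]) -> anyhow::Result<()> {
    // Parse the PEM-encoded certificate chain and private key.
    let certs = rustls_pemfile::certs(&mut std::io::BufReader::new(cert_pem))
        .collect::<Result<Vec<_>, _>>()?;
    let key = rustls_pemfile::private_key(&mut std::io::BufReader::new(key_pem))?
        .ok_or_else(|| anyhow::anyhow!("no private key found"))?;

    let config = rustls::ServerConfig::builder()
        .with_no_client_auth()
        .with_single_cert(certs, key)?;
    let acceptor = TlsAcceptor::from(Arc::new(config));

    let listener = TcpListener::bind("127.0.0.1:9899").await?; // placeholder address
    loop {
        let (stream, _peer) = listener.accept().await?;
        let acceptor = acceptor.clone();
        tokio::spawn(async move {
            // Hand the decrypted stream to the HTTP service (e.g. hyper) here.
            let _tls_stream = acceptor.accept(stream).await;
        });
    }
}
```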
2025-03-10 15:07:59 +00:00
Tristan Partin
1b8c4286c4 Fetch remote extension in ALTER EXTENSION UPDATE statements (#11102)
Previously, remote extensions were not fetched unless they were used in
some other manner. For instance, loading a BM25 index in pg_search
fetches the pg_search extension. However, if, on a fresh compute with
pg_search 0.15.5 installed, the user ran `ALTER EXTENSION pg_search
UPDATE TO '0.15.6'` without first using the pg_search extension, we
would not fetch the extension and would fail to find an update path.

Signed-off-by: Tristan Partin <tristan@neon.tech>
2025-03-09 17:29:44 +00:00
Tristan Partin
3fe5650039 Fix dropping role with table privileges granted by non-neon_superuser (#10964)
We were previously only revoking privileges granted by neon_superuser.
However, we need to do it for all grantors.

Signed-off-by: Tristan Partin <tristan@neon.tech>
2025-03-07 19:00:11 +00:00
Alex Chi Z.
cd438406fb feat(pageserver): add force patch index_part API (#11119)
## Problem

As part of the disaster recovery tool. Partly for
https://github.com/neondatabase/neon/issues/9114.

## Summary of changes

* Add a new pageserver API to force patch the fields in index_part and
modify the timeline internal structures.

---------

Signed-off-by: Alex Chi Z <chi@neon.tech>
2025-03-07 17:42:52 +00:00
Dmitrii Kovalkov
e876794ce5 storcon: use https safekeeper api (#11065)
## Problem

Storage controller uses plain HTTP for requests to the safekeeper management API.

Closes: https://github.com/neondatabase/cloud/issues/24835

## Summary of changes
- Add a `use_https_safekeeper_api` option to storcon to use the HTTPS API
- Use HTTPS for requests to the safekeeper management API if this option is
enabled
- Add an `ssl_ca_file` option to storcon to allow specifying a custom root
CA certificate (see the sketch below)
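
Roughly, the `ssl_ca_file` handling could look like this sketch (assuming a reqwest-based HTTP client; the real storcon client code may differ):

```rust
use std::path::Path;

// Build an HTTPS-capable client, trusting an extra root CA if `ssl_ca_file`
// is configured. Function name and error handling are illustrative.
fn build_safekeeper_client(ssl_ca_file: Option<&Path>) -> anyhow::Result<reqwest::Client> {
    let mut builder = reqwest::Client::builder();
    if let Some(path) = ssl_ca_file {
        let pem = std::fs::read(path)?;
        let cert = reqwest::Certificate::from_pem(&pem)?;
        builder = builder.add_root_certificate(cert);
    }
    Ok(builder.build()?)
}
```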
2025-03-07 17:22:47 +00:00
John Spray
87e6117dfd storage controller: API-driven graceful migrations (#10913)
## Problem

The current migration API does a live migration, but if the destination
doesn't already have a secondary, that live migration is unlikely to be
able to warm up a tenant properly within its timeout (full warmup of a
big tenant can take tens of minutes).

Background optimisation code knows how to do this gracefully by creating
a secondary first, but we don't currently give a human a way to trigger
that.

Closes: https://github.com/neondatabase/neon/issues/10540

## Summary of changes

- Add a `preferred_node` parameter to TenantShard, which is respected by
optimize_attachment
- Modify the migration API to have an optional prewarm=true mode, in which we
set preferred_node and call optimize_attachment, rather than directly
modifying the intent state
- Require an override_scheduler=true flag if migrating somewhere that is a
less-than-optimal scheduling location (e.g. the wrong AZ)
- Add `origin_node_id` to migration API so that callers can ensure
they're moving from where they think they're moving from
- Add tests for the above

The storcon_cli wrapper for this has a 'watch' mode that waits for
eventual cutover. This doesn't show the warmth of the secondary evolving
because we don't currently have an API for that in the controller, as
the passthrough API only targets attached locations, not secondaries. It
would be straightforward to add later as a dedicated endpoint for
getting secondary status, then extend the storcon_cli to consume that
and print a nice progress indicator.
2025-03-07 17:02:38 +00:00
Vlad Lazar
084fc4a757 pageserver: enable previous heatmaps by default (#11132)
We added the off-by-default configs in
https://github.com/neondatabase/neon/pull/11088 because
the unarchival heatmap was causing oversized secondary locations. That
was fixed in https://github.com/neondatabase/neon/pull/11098, so let's
turn them on by default.
2025-03-07 16:05:31 +00:00
Alexey Kondratov
f5aa8c3eac feat(compute_ctl): Add a basic HTTP API benchmark (#11123)
## Problem

We just had a regression reported at
https://neondb.slack.com/archives/C08EXUJF554/p1741102467515599, which
clearly came with one of the releases. It's not a huge problem yet, but
it's annoying that we cannot quickly attribute it to a specific commit.

## Summary of changes

Add a very simple `compute_ctl` HTTP API benchmark that does 10k
requests to `/status` and `metrics.json` and reports p50 and p99.

---------

Co-authored-by: Peter Bendel <peterbendel@neon.tech>
2025-03-07 12:35:42 +00:00
Alexey Kondratov
a485022300 fix(compute_ctl): Properly escape identifiers inside PL/pgSQL blocks (#11045)
## Problem

In f37eeb56, I properly escaped the identifier, but I hadn't noticed
that the resulting string is used in `format('...')`, so it needs
additional escaping. Yet, after looking at it more closely, and with
Heikki's and Tristan's help, it turned out to be a full can of worms:
we have problems all over the code in places where we use PL/pgSQL
blocks.

## Summary of changes

Add a new `pg_quote_dollar()` helper to deal with it, as dollar-quoting
of strings seems to be the only robust way to escape strings in dynamic
PL/pgSQL blocks. We mimic the Postgres' `pg_get_functiondef` logic here
[1].

While at it, I added more tests and caught a couple more bugs with
string escaping:

1. `get_existing_dbs_async()` was wrapping `owner` in additional
double-quotes if it contained special characters
2. `construct_superuser_query()` was flawed in even more ways than the
rest of the code. It wasn't realistic to fix it quickly, but after
thinking about it more, I realized that we could drop most of it
altogether. IIUC, it was added as some sort of migration, probably back
when we didn't have migrations yet, so all the complicated code was
needed to properly update existing roles and DBs. In the current Neon,
this code only runs before we create the very first DB and role. When we
create roles and DBs, all `neon_superuser` grants are added in
different places. So the worst thing that could happen is that there is
an ancient branch somewhere, and when users poke it, they will realize
that not all Neon features work as expected. Yet, the fix is simple and
self-serve -- just create a new role via UI or API, and it will get a
proper `neon_superuser` grant.

[1]:
8b49392b27/src/backend/utils/adt/ruleutils.c (L3153)

Closes neondatabase/cloud#25048
2025-03-06 19:54:29 +00:00
Vlad Lazar
5ceb8c994d pageserver: mark unarchival heatmap layers as cold (#11098)
## Problem

On unarchival, we update the previous heatmap with all visible layers.
When the primary generates a new heatmap it includes all those layers,
so the secondary will download them. Since they're not actually resident
on the primary (we didn't call the warm up API), they'll never be
evicted,
so they remain in the heatmap.

We want these layers in the heatmap, since we might wish to warm up an
unarchived timeline after a shard migration. However, we don't want them
to be downloaded on the secondary until we've warmed up the primary.

## Summary of Changes

Include these layers in the heatmap and mark them as cold. All heatmap
operations act on non-cold layers, apart from the attached-location
warm-up API, which will download the cold layers. Once the cold layers
are downloaded on the primary, they'll be included in the next heatmap
as hot, and the secondary will start fetching them too.
2025-03-06 11:25:02 +00:00
Alex Chi Z.
2de3629b88 test(pageserver): use reldirv2 by default in regress tests (#11081)
## Problem

For the pg_regress tests, we run both v1 and v2; for all the rest, we
default to v2.

part of https://github.com/neondatabase/neon/issues/9516

## Summary of changes

Use reldir v2 across test cases by default.

---------

Signed-off-by: Alex Chi Z <chi@neon.tech>
2025-03-05 21:02:44 +00:00
Peter Bendel
604eb5e8d4 fix grafana dashboard link for pooler endpoints (#11099)
## Problem

Our benchmarking workflows contain links to grafana dashboards to
troubleshoot problems.
This works fine for non-pooled endpoints.
For pooled endpoints we need to remove the `-pooler` suffix from the
endpoint's hostname to get a valid endpoint ID.

Example link that doesn't work in this run


https://github.com/neondatabase/neon/actions/runs/13678933253/job/38246028316#step:8:311

## Summary of changes

Check if the connection string is a `-pooler` connection string and, if so,
remove this suffix from the endpoint ID.

---------

Co-authored-by: Alexander Bayandin <alexander@neon.tech>
2025-03-05 20:01:17 +00:00
Alexey Kondratov
8263107f6c feat(compute): Add filename label to remote ext requests metric (#11091)
## Problem

We realized that we may use this metric for more 'live' info about
extension installations vs. what we have with the installed-extensions
metric, which is currently only updated at start.

## Summary of changes

Add `filename` label to `compute_ctl_remote_ext_requests_total`. Note
that it contains the raw archive name with `.tar.zst` at the end, so the
consumer may need to strip this suffix.

Closes https://github.com/neondatabase/cloud/issues/24694
2025-03-05 18:17:57 +00:00
Arpad Müller
2d45522fa6 storcon db: load safekeepers from DB again (#11087)
Earlier PR #11041 soft-disabled the loading code for safekeepers from
the storcon db. This PR makes us load the safekeepers from the database
again, now that we have [JWTs available on
staging](https://github.com/neondatabase/neon/pull/11087) and soon on
prod.

This reverts commit 23fb8053c5.

Part of https://github.com/neondatabase/cloud/issues/24727
2025-03-05 15:45:43 +00:00
Erik Grinaker
332aae1484 test_runner/regress: speed up test_check_visibility_map (#11086)
## Problem

`test_check_visibility_map` is the slowest test in CI, and can cause
timeouts under particularly slow configurations (`debug` and
`without-lfc`).

## Summary of changes

* Reduce the `pgbench` scale factor from 10 to 8.
* Omit a redundant vacuum during `pgbench` init.
* Remove a final `vacuum freeze` + `pg_check_visible` pass, which has
questionable value (we've already done a vacuum freeze previously, and
we don't flush the compute cache before checking anyway).
2025-03-05 13:50:35 +00:00
Vlad Lazar
8c12ccf729 pageserver: gate previous heatmap behind config flag (#11088)
## Problem

On unarchival, we update the previous heatmap with all visible layers.
When the primary generates a new heatmap it includes all those layers,
so the secondary will download them. Since they're not actually resident
on the primary (we didn't call the warm up API), they'll never be
evicted, so they remain in the heatmap.

This leads to oversized secondary locations like we saw in pre-prod.

## Summary of changes

Gate the loading of the previous heatmaps and the heatmap generation on
unarchival behind configuration
flags. They are disabled by default, but enabled in tests.
2025-03-05 12:20:18 +00:00
Alex Chi Z.
20af9cef17 fix(test): use the same value for reldir v1+v2 (#11070)
## Problem

part of https://github.com/neondatabase/neon/issues/11067

My observation is that with the current settings, x86-v1
usually takes 30s, arm-v1 1m30s, x86-v2 1m, and arm-v2 3m. But sometimes the
system runs too slowly and causes the test to time out on arm with reldir
v2.

While I investigate what's going on and further improve the performance,
I'd like to set both of them to use the same test input, so that the test
doesn't time out and we don't abuse this test case as a performance test.

## Summary of changes

Use the same settings for both test cases.

Signed-off-by: Alex Chi Z <chi@neon.tech>
2025-03-04 14:55:50 +00:00
Vlad Lazar
435bf452e6 tests: remove obsolete err log whitelisting (#11069)
The pageserver read path now supports overlapped in-memory and image
layers via
https://github.com/neondatabase/neon/pull/11000. These allowed errors
are now obsolete.
2025-03-04 08:18:19 +00:00
Erik Grinaker
65addfc524 storcon: add per-tenant rate limiting for API requests (#10924)
## Problem

Incoming requests often take the service lock, and sometimes even do
database transactions. That creates a risk that a rogue client can
starve the controller of the ability to do its primary job of
reconciling tenants to an available state.

## Summary of changes

* Use the `governor` crate to rate limit tenant requests at 10 requests
per second. This is ~10-100x lower than the worst "attack" we've seen
from a client bug. Admin APIs are not rate limited.
* Add a `storage_controller_http_request_rate_limited` histogram for
rate limited requests.
* Log a warning every 10 seconds for rate limited tenants.

The rate limiter is parametrized on TenantId, because the kinds of
client bug we're protecting against generally happen within tenant
scope, and the rates should be somewhat stable: we expect the global
rate of requests to increase as we do more work, but we do not expect
the rate of requests to one tenant to increase.
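
For illustration, a minimal per-key limiter with the `governor` crate (key type and output are placeholders, not the storcon integration):

```rust
use std::num::NonZeroU32;

use governor::{Quota, RateLimiter};

fn main() {
    // Keyed limiter: each tenant gets its own 10 req/s budget.
    let limiter = RateLimiter::keyed(Quota::per_second(NonZeroU32::new(10).unwrap()));

    let tenant = "tenant-a".to_string();
    for i in 0..12 {
        match limiter.check_key(&tenant) {
            Ok(()) => println!("request {i}: allowed"),
            Err(_not_until) => println!("request {i}: rate limited"),
        }
    }
}
```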

---------

Co-authored-by: John Spray <john@neon.tech>
2025-03-03 22:04:59 +00:00
Alex Chi Z.
6d0976dad5 feat(pageserver): persist reldir v2 migration status (#10980)
## Problem

part of https://github.com/neondatabase/neon/issues/9516

## Summary of changes

Similar to the aux v2 migration, we persist the relv2 migration status
into index_part, so that even if the config item is set to false, we will
still read from the v2 storage to avoid data loss.

Note that only the two variants `None` and
`Some(RelSizeMigration::Migrating)` are used for now. We don't have the full
migration implemented, so it will never be set to
`RelSizeMigration::Migrated`.

---------

Signed-off-by: Alex Chi Z <chi@neon.tech>
2025-03-03 21:05:43 +00:00
John Spray
b953daa21f safekeeper: allow remote deletion to proceed after dropped requests (#11042)
## Problem

If a caller times out on safekeeper timeline deletion for a large
timeline and waits a while before retrying, the deletion does not
progress until the retry arrives. The net effect is very slow
deletion, as it only proceeds in 30-second bursts across 5-minute idle
periods.

Related: https://github.com/neondatabase/neon/issues/10265

## Summary of changes

- Run remote deletion in a background task
- Carry a watch::Receiver on the Timeline for other callers to join the
wait
- Restart deletion if the API is called again and the previous attempt
failed
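
A small sketch of the "join the wait" pattern with a `tokio::sync::watch` channel (illustrative only; the real code carries the receiver on the safekeeper `Timeline`):

```rust
use tokio::sync::watch;

#[tokio::main]
async fn main() {
    // `false` = remote deletion still in progress.
    let (tx, rx) = watch::channel(false);

    // Deletion runs in a background task and survives dropped HTTP requests.
    tokio::spawn(async move {
        // ... delete remote objects here ...
        let _ = tx.send(true);
    });

    // Any number of API callers can clone the receiver and join the wait.
    let mut caller = rx.clone();
    caller
        .wait_for(|done| *done)
        .await
        .expect("deletion task dropped the sender before finishing");
    println!("remote deletion finished");
}
```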
2025-03-03 16:03:51 +00:00
Peter Bendel
a07599949f First version of a new benchmark to test larger OLTP workload (#11053)
## Problem

We want to support larger tenants (in terms of logical database size,
number of transactions per second, etc.) and should increase our test
coverage of OLTP transactions at larger scale.

## Summary of changes

Start a new benchmark that over time will add more OLTP tests at larger
scale.
This PR covers the first version and will be extended in further PRs.

Also fix some infrastructure:
- the default for new connections and large tenants is to use the pgbouncer
connection pooler, however our fixture always added
`statement_timeout=120`, which is not compatible with the pooler
[see](https://neon.tech/docs/connect/connection-errors#unsupported-startup-parameter)
- the action to create a branch timed out after 10 seconds and 10 retries, but
for large tenants it can take longer, so use an increasing back-off for
retries

## Test run

https://github.com/neondatabase/neon/actions/runs/13593446706
2025-03-03 15:25:48 +00:00
Arseny Sher
ef2b50994c walproposer: basic infra to enable generations (#11002)
## Problem

Preparation for https://github.com/neondatabase/neon/issues/10851

## Summary of changes

Add a walproposer `safekeepers_generations` field, which can be set by
prefixing the `neon.safekeepers` GUC with `g#n:`. A non-zero value (n) forces
the walproposer to use generations. In particular, this also disables
implicit timeline creation, as the timeline will be created by storcon. Add a
test checking this. Also add missing infra: a `--safekeepers-generation`
flag to `neon_local endpoint start`, plus a fix for the `--start-timeout`
flag: it existed but its value wasn't used.
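
To illustrate the format, a hypothetical parser for the `g#n:` prefix (the safekeeper list shown is made up; the real parsing happens in the walproposer C code):

```rust
// Split an optional "g#<generation>:" prefix off a `neon.safekeepers`-style
// string. Generation 0 means "generations disabled".
fn parse_safekeepers_guc(guc: &str) -> (u32, &str) {
    if let Some(rest) = guc.strip_prefix("g#") {
        if let Some((generation, list)) = rest.split_once(':') {
            if let Ok(generation) = generation.parse::<u32>() {
                return (generation, list);
            }
        }
    }
    (0, guc)
}

fn main() {
    assert_eq!(
        parse_safekeepers_guc("g#3:sk1:6500,sk2:6500"),
        (3, "sk1:6500,sk2:6500")
    );
    assert_eq!(parse_safekeepers_guc("sk1:6500"), (0, "sk1:6500"));
}
```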
2025-03-03 13:20:20 +00:00
Suhas Thalanki
ee0c8ca8fd Add -fsigned-char for cross platform signed chars (#10852)
## Problem

For multi-character keys, the GIN index creates a CRC hash of the first 3
bytes of the key. The hash can have its first bit set or unset, so a
consistent representation of `char` across architectures is needed for
consistent results. GIN stores these keys by their hashes, which determines
the order in which the keys are obtained from the GIN index.

By default, chars are signed on x86 and unsigned on arm, leading to
inconsistent behavior across different platform architectures. Adding
the `-fsigned-char` flag to the GCC compiler forces chars to be treated
as signed across platforms, ensuring that the ordering in which the keys
are obtained is consistent.

## Summary of changes

Added `-fsigned-char` to the `CFLAGS` to force GCC to use signed chars
across platforms.
Added a test to check this across platforms.

Fixes: https://github.com/neondatabase/cloud/issues/23199
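
Not GIN code, just a quick illustration (using `i8`/`u8` as stand-ins for signed/unsigned `char`) of why the same hash byte sorts differently depending on `char` signedness:

```rust
fn main() {
    let a: u8 = 0x7f; // 127
    let b: u8 = 0x80; // 128 as unsigned char, -128 as signed char

    // unsigned-char platforms (arm default): 0x7f < 0x80
    assert!(a < b);
    // signed-char platforms (x86 default): 0x7f > 0x80, so key order flips
    assert!((a as i8) > (b as i8));
}
```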
2025-02-28 21:07:21 +00:00
Vlad Lazar
23fb8053c5 storcon: soft disable SK heartbeats (#11041)
## Problem

JWT tokens aren't in place, so all SK heartbeats fail. This is
equivalent to a wait before applying the PS heartbeats and makes things
more flaky.

## Summary of Changes

Add a flag that skips loading SKs from the db on start-up and at
runtime.
2025-02-28 15:49:09 +00:00
Conrad Ludgate
d9ced89ec0 feat(proxy): require TLS to compute if prompted by cplane (#10717)
https://github.com/neondatabase/cloud/issues/23008

For TLS between proxy and compute, we are using an internally
provisioned CA to sign the compute certificates. This change ensures
that proxy will load them from a supplied env var pointing to the
correct file - this file and env var will be configured later, using a
kubernetes secret.

Control plane responds with a `server_name` field if and only if the
compute uses TLS. This server name is the name we use to validate the
certificate. Control plane still sends us the IP to connect to as well
(to support overlay IP).

To support this change, I had to split `host` and `host_addr` into
separate fields, using `host_addr` and bypassing `lookup_addr` if
possible (which is what happens in production). `host` is then only used
for the TLS connection.

There's no blocker to merging this. The code paths will not be triggered
until the new control plane is deployed and the `enableTLS` compute flag
is enabled on a project.
2025-02-28 14:20:25 +00:00
Vlad Lazar
0d6d58bd3e pageserver: make heatmap layer download API more cplane friendly (#10957)
## Problem

We intend for cplane to use the heatmap layer download API to warm up
timelines after unarchival. It's tricky for them to recurse into the
ancestors, and the current implementation doesn't work well when
unarchiving a chain of branches and warming them up.

## Summary of changes

* Add a `recurse` flag to the API. When the flag is set, the operation
recurses into the parent
timeline after the current one is done.
* Be resilient to warming up a chain of unarchived branches. Let's say we
unarchived `B` and `C` from the `A -> B -> C` branch hierarchy. `B` got
unarchived first. We generated the unarchival heatmaps and stashed them in
`A` and `B`. When `C` was unarchived, it dropped its unarchival heatmap
since `A` and `B` already had one. If `C` needed layers from `A` and `B`,
it was out of luck. Now, when choosing whether to keep an unarchival
heatmap, we look at its end LSN. If it's more inclusive than what we
currently have, we keep it.
2025-02-28 10:36:53 +00:00
Christian Schwarz
e35f7758d8 impr(controller_upcall_client): clean up copy-pasta code & add context to retries (#10991)
Before this PR, re-attach and validate would log the same warning
```
calling control plane generation validation API failed
```
on retry errors.

This can be confusing.

This PR makes the message generically valid for any upcall and adds
additional tracing spans to capture context.

Along the way, clean up some copy-pasta variable naming.

refs
-
https://github.com/neondatabase/neon/issues/10381#issuecomment-2684755827

---------

Co-authored-by: Alexander Lakhin <alexander.lakhin@neon.tech>
2025-02-27 10:59:43 +00:00