Adds a benchmark for logical message WAL ingestion throughput
end-to-end. Logical messages are essentially noops, and thus ignored by
the Pageserver.
Example results from my MacBook, with fsync enabled:
```
postgres_ingest: 14.445 s
safekeeper_ingest: 29.948 s
pageserver_ingest: 30.013 s
pageserver_recover_ingest: 8.633 s
wal_written: 10,340 MB
message_count: 1310720 messages
postgres_throughput: 715 MB/s
safekeeper_throughput: 345 MB/s
pageserver_throughput: 344 MB/s
pageserver_recover_throughput: 1197 MB/s
```
See
https://github.com/neondatabase/neon/issues/9642#issuecomment-2475995205
for running analysis.
Touches #9642.
## Problem
https://github.com/neondatabase/neon/pull/9746 lifted decoding and
interpretation of WAL to the safekeeper.
This reduced the ingested amount on the pageservers by around 10x for a
tenant with 8 shards, but doubled
the ingested amount for single sharded tenants.
Also, https://github.com/neondatabase/neon/pull/9746 uses bincode which
doesn't support schema evolution.
Technically the schema can be evolved, but it's very cumbersome.
## Summary of changes
This patch set addresses both problems by adding protobuf support for
the interpreted wal records and adding compression support. Compressed
protobuf reduced the ingested amount by 100x on the 32 shards
`test_sharded_ingest` case (compared to non-interpreted proto). For the
1 shard case the reduction is 5x.
Sister change to `rust-postgres` is
[here](https://github.com/neondatabase/rust-postgres/pull/33).
## Links
Related: https://github.com/neondatabase/neon/issues/9336
Epic: https://github.com/neondatabase/neon/issues/9329
## Problem
ingest benchmark tests project migration to Neon involving steps
- COPY relation data
- create indexes
- create constraints
Previously we used only 4 copy jobs, 4 create index jobs and 7
maintenance workers. After increasing effective_io_concurrency on
compute we see that we can sustain more parallelism in the ingest bench
## Summary of changes
Increase copy jobs to 8, create index jobs to 8 and maintenance workers
to 16
## Problem
For any given tenant shard, pageservers receive all of the tenant's WAL
from the safekeeper.
This soft-blocks us from using larger shard counts due to bandwidth
concerns and CPU overhead of filtering
out the records.
## Summary of changes
This PR lifts the decoding and interpretation of WAL from the pageserver
into the safekeeper.
A customised PG replication protocol is used where instead of sending
raw WAL, the safekeeper sends
filtered, interpreted records. The receiver drives the protocol
selection, so, on the pageserver side, usage
of the new protocol is gated by a new pageserver config:
`wal_receiver_protocol`.
More granularly the changes are:
1. Optionally inject the protocol and shard identity into the arguments
used for starting replication
2. On the safekeeper side, implement a new wal sending primitive which
decodes and interprets records
before sending them over
3. On the pageserver side, implement the ingestion of this new
replication message type. It's very similar
to what we already have for raw wal (minus decoding and interpreting).
## Notes
* This PR currently uses my [branch of
rust-postgres](https://github.com/neondatabase/rust-postgres/tree/vlad/interpreted-wal-record-replication-support)
which includes the deserialization logic for the new replication message
type. PR for that is open
[here](https://github.com/neondatabase/rust-postgres/pull/32).
* This PR contains changes for both pageservers and safekeepers. It's
safe to merge because the new protocol is disabled by default on the
pageserver side. We can gradually start enabling it in subsequent
releases.
* CI tests are running on https://github.com/neondatabase/neon/pull/9747
## Links
Related: https://github.com/neondatabase/neon/issues/9336
Epic: https://github.com/neondatabase/neon/issues/9329
## Problem
close https://github.com/neondatabase/neon/issues/9761
The test assumed that no new L0 layers are flushed throughout the
process, which is not true.
## Summary of changes
Fix the test case `test_compaction_l0_memory` by flushing in-memory
layers before compaction.
Signed-off-by: Alex Chi Z <chi@neon.tech>
## Problem
Along with the migration to Python 3.11, I switched `C(str, Enum)` with
`C(StrEnum)`; one such example is the `PgVersion` enum.
It required more changes in `PgVersion` itself (before, it accepted both
`str` and `int`, and after it, it supports only `str`), which caused the
`test_bulk_insert` test to fail.
## Summary of changes
- `test_bulk_insert`: explicitly cast pg_version from `timeline_detail`
to str
## Problem
On Debian 12 (Bookworm), Python 3.11 is the latest available version.
## Summary of changes
- Update Python to 3.11 in build-tools
- Fix ruff check / format
- Fix mypy
- Use `StrEnum` instead of pair `str`, `Enum`
- Update docs
## Problem
The first version of the ingest benchmark had some parsing and reporting
logic in shell script inside GitHub workflow.
it is better to move that logic into a python testcase so that we can
also run it locally.
## Summary of changes
- Create new python testcase
- invoke pgcopydb inside python test case
- move the following logic into python testcase
- determine backpressure
- invoke pgcopydb and report its progress
- parse pgcopydb log and extract metrics
- insert metrics into perf test database
- add additional column to perf test database that can receive endpoint
ID used for pgcopydb run to have it available in grafana dashboard when
retrieving other metrics for an endpoint
## Example run
https://github.com/neondatabase/neon/actions/runs/11860622170/job/33056264386
## Problem
Due to #9471 , the scale test occasionally gets 404s while trying to
modify the config of a timeline that belongs to a tenant being migrated.
We rarely see this narrow race in the field, but the test is quite good
at reproducing it.
## Summary of changes
- Ignore 404 errors in this test.
Analysis of the LR benchmarking tests indicates that in the duration of
test_subscriber_lag, a leftover 'slotter' replication slot can lead to
retained WAL growing on the publisher. This replication slot is not used
by any subscriber. The only purpose of the slot is to generate snapshot
files for the puspose of test_snap_files.
Signed-off-by: Tristan Partin <tristan@neon.tech>
psycopg2 has the following warning related to autocommit:
> By default, any query execution, including a simple SELECT will start
> a transaction: for long-running programs, if no further action is
> taken, the session will remain “idle in transaction”, an undesirable
> condition for several reasons (locks are held by the session, tables
> bloat…). For long lived scripts, either ensure to terminate a
> transaction as soon as possible or use an autocommit connection.
In the 2.9 release notes, psycopg2 also made the following change:
> `with connection` starts a transaction on autocommit transactions too
Some of these connections are indeed long-lived, so we were retaining
tons of WAL on the endpoints because we had a transaction pinned in the
past.
Link: https://www.psycopg.org/docs/news.html#what-s-new-in-psycopg-2-9
Link: https://github.com/psycopg/psycopg2/issues/941
Signed-off-by: Tristan Partin <tristan@neon.tech>
## Problem
Running `pytest.skip(...)` in a test body instead of marking the test
with `@pytest.mark.skipif(...)` makes all fixtures to be initialised,
which is not necessary if the test is going to be skipped anyway.
Also, some tests are unnecessarily skipped (e.g. `test_layer_bloating`
on Postgres 17, or `test_idle_reconnections` at all) or run (e.g.
`test_parse_project_git_version_output_positive` more than on once
configuration) according to comments.
## Summary of changes
- Move `skip_on_postgres` / `xfail_on_postgres` /
`run_only_on_default_postgres` decorators to `fixture.utils`
- Add new `skip_in_debug_build` and `skip_on_ci` decorators
- Replace `pytest.skip(...)` calls with decorators where possible
Adds a Python benchmark for sharded ingestion. This ingests 7 GB of WAL
(100M rows) into a Safekeeper and fans out to 10 shards running on 10
different pageservers. The ingest volume and duration is recorded.
## Problem
Part of https://github.com/neondatabase/neon/issues/8623
## Summary of changes
Removed all aux-v1 config processing code. Note that we persisted it
into the index part file, so we cannot really remove the field from
index part. I also kept the config item within the tenant config, but we
will not read it any more.
---------
Signed-off-by: Alex Chi Z <chi@neon.tech>
## Problem
We were seeing timeouts on migrations in this test.
The test unfortunately tends to saturate local storage, which is shared
between the pageservers and the control plane database, which makes the
test kind of unrealistic. We will also want to increase the scale of
this test, so it's worth fixing that.
## Summary of changes
- Instead of randomly creating timelines at the same time as the other
background operations, explicitly identify a subset of tenant which will
have timelines, and create them at the start. This avoids pageservers
putting a lot of load on the test node during the main body of the test.
- Adjust the tenants created to create some number of 8 shard tenants
and the rest 1 shard tenants, instead of just creating a lot of 2 shard
tenants.
- Use archival_config to exercise tenant-mutating operations, instead of
using timeline creation for this.
- Adjust reconcile_until_idle calls to avoid waiting 5 seconds between
calls, which causes timelines with large shard count tenants.
- Fix a pageserver bug where calls to archival_config during activation
get 404
## Problem
The path to TPC-H queries was incorrectly changed in #9306.
This path is used for `test_tpch` parameterization, so all perf tests
started to fail:
```
==================================== ERRORS ====================================
__________ ERROR collecting test_runner/performance/test_perf_olap.py __________
test_runner/performance/test_perf_olap.py:205: in <module>
@pytest.mark.parametrize("query", tpch_queuies())
test_runner/performance/test_perf_olap.py:196: in tpch_queuies
assert queries_dir.exists(), f"TPC-H queries dir not found: {queries_dir}"
E AssertionError: TPC-H queries dir not found: /__w/neon/neon/test_runner/performance/performance/tpc-h/queries
E assert False
E + where False = <bound method Path.exists of PosixPath('/__w/neon/neon/test_runner/performance/performance/tpc-h/queries')>()
E + where <bound method Path.exists of PosixPath('/__w/neon/neon/test_runner/performance/performance/tpc-h/queries')> = PosixPath('/__w/neon/neon/test_runner/performance/performance/tpc-h/queries').exists
```
## Summary of changes
- Fix the path to tpc-h queries
It didn't serve much value, and was only used twice.
Path(__file__).parent is a pretty easy invocation to use.
Signed-off-by: Tristan Partin <tristan@neon.tech>