Commit Graph

1177 Commits

Author SHA1 Message Date
Konstantin Knizhnik
c28b6573b4 Use 1Mb chunk instead of page for loading data from pageserver 2022-01-21 10:55:58 +03:00
Konstantin Knizhnik
cb70e63f34 Update test_snapfiles_gc test 2022-01-20 12:11:38 +03:00
Konstantin Knizhnik
e6c82c9609 Add max_image_layers and image_layer_generation_threshold parameters to config and rewrite criteria of image layer generation 2022-01-17 18:59:34 +03:00
Konstantin Knizhnik
79ade52535 Add comments and make changes suggested by reviewers
refer #1133
2022-01-16 23:04:39 +03:00
Konstantin Knizhnik
5c33095918 Do not write full page images at each checkpoint 2022-01-14 15:15:31 +03:00
bojanserafimov
5e0f39cc9e Add proxy metrics (#1093) 2022-01-13 20:34:30 -05:00
Arthur Petukhovsky
0a34a592d5 Bump vendor/postgres (#1120) 2022-01-13 20:28:37 +03:00
Heikki Linnakangas
19aaa91f6d Timeline IDs are not globally unique, fix some code that assumed that.
A timeline ID is only guaranteed to be unique for a particular tenant,
so you need to use tenant ID + timeline ID as the key, rather than just
timeline ID.

The safekeeper currently makes the same assumption, and we should fix that
too, but this commit just addresses this one case in the page server.

In the passing, reorder some function arguments to be more consistent.
2022-01-13 18:45:30 +02:00
Konstantin Knizhnik
404aab9373 Use mutex to prevent concurrent checkpoints (#1115)
* Use mutex to prevent concurrent checkpoints

* Fix comment
2022-01-13 17:48:24 +03:00
Konstantin Knizhnik
bc6db2c10e Implement IO metrics in VirtualFile (#1112)
* Implement IO metrics in VirtualFile

* Do not group virtual file close statistics by tenantid/timelineid

* Add comments concenring close metrics
2022-01-13 17:36:53 +03:00
Heikki Linnakangas
772d853dcf Fix race condition leading to panic in walkeeper.
The walkeeper launch two threads for each connection, and uses a guard
object to remove entry from 'replicas' array, when finishes. But only
the background thread held onto the guard object, so if the background
thread finished before the other thread, the array entry would be
removed prematurely, which lead to panic in the check_stop_streaming()
call.

Fixes https://github.com/zenithdb/zenith/issues/1103
2022-01-13 11:21:11 +02:00
Arseny Sher
ab4d272149 Add safekeeper --dump-control-file option.
Hexalize zids there for better output; since Serde doesn't support several
formats for one struct, on-disk representation is changed as well, make
upgrade.rs cope with it.
2022-01-12 19:47:24 +03:00
Konstantin Knizhnik
f70a5cad61 Fix releasing of timelines lock (#1100)
refer #1087
2022-01-12 15:05:08 +03:00
anastasia
7aba299dbd Use safekeeper in test_branch_behind (#1068)
to avoid a subtle race condition.

Without safekeeper, walreceiver reconnection can stuck,
because of IO deadlock between walsender auth and regular backend.
2022-01-12 14:38:04 +03:00
Kirill Bulatov
4b3b19f444 Support prefixes when working with s3 buckets 2022-01-11 15:44:50 +02:00
Kirill Bulatov
8ab4c8a050 Code review fixes 2022-01-11 15:44:23 +02:00
Kirill Bulatov
7c4a653230 Propagate Zenith CLI's RUST_LOG env var to subprocesses 2022-01-11 15:44:23 +02:00
Kirill Bulatov
a3cd8f0e6d Add the remote storage test 2022-01-11 15:44:23 +02:00
Kirill Bulatov
65c851a451 Test pageserver's timeline http methods
z
2022-01-11 15:44:23 +02:00
Kirill Bulatov
23cf2fa984 Properly shutdown storage sync loop 2022-01-11 15:44:23 +02:00
Kirill Bulatov
ce8d6ae958 Allow using remote storage in tests 2022-01-11 15:44:23 +02:00
Kirill Bulatov
384b2a91fa Pass generic pageserver params through zenith cli 2022-01-11 15:44:23 +02:00
Arseny Sher
233c4811db Fix default safekeeper http port. 2022-01-11 10:13:27 +03:00
Konstantin Knizhnik
2fd4c390cb Do not hold timelines lock during GC (#1089)
* Do not hold timelines lock during GC
refer #1087

* Add gc_cs mutex for preveting creation of new timelines during GC

* Make clippy happy

* Use Mutex<()> instead of Mutex<i32> for GC critical section
2022-01-10 14:41:15 +03:00
bojanserafimov
5b9391b51d Support "query cancel" in proxy (#1052) 2022-01-05 17:27:12 -05:00
Arthur Petukhovsky
5a6405848d Bump vendor/postgres (#1086) 2022-01-05 14:27:51 +03:00
Patrick Insinger
191d9d2b74 par_fsync - use VirtualFile 2022-01-04 20:40:57 -08:00
Patrick Insinger
24c8dab86f pageserver - parallelize checkpoint fsyncs 2022-01-04 20:40:57 -08:00
Heikki Linnakangas
55a4cf64a1 Refactor WAL record handling.
Introduce the concept of a "ZenithWalRecord", which can be a Postgres WAL
record that is replayed with the Postgres WAL redo process, or a built-in
type that is handled entirely by pageserver code.

Replace the special code to replay Postgres XACT commit/abort records
with new Zenith WAL records. A separate zenith WAL record is created for
each modified CLOG page. This allows removing the 'main_data_offset'
field from stored PostgreSQL WAL records, which saves some memory and
some disk space in delta layers.

Introduce zenith WAL records for updating bits in the visibility map.
Previously, when e.g. a heap insert cleared the VM bit, we duplicated the
heap insert WAL record for the affected VM page. That was very wasteful.
The heap WAL record could be massive, containing a full page image in
the worst case. This addresses github issue #941.
2022-01-04 11:26:37 +02:00
Heikki Linnakangas
722667f189 Add test case for performance issue #941.
The first COPY generates about 230 MB of write I/O, but the second
COPY, after deleting most of the rows and vacuuming the rows away,
generates 370 MB of writes. Both COPYs insert the same amount of data,
so they should generate roughly the same amount of I/O. This commit
doesn't try to fix the issue, just adds a test case to demonstrate it.

Add a new 'checkpoint' command to the pageserver API. Previously,
we've used 'do_gc' for that, but many tests, including this new one,
really only want to perform a checkpoint and don't care about GC. For
now, I only used the command in the new test, though, and didn't
convert any existing tests to use it.
2022-01-04 11:26:37 +02:00
Arseny Sher
25a515b968 Don't call immediately on resume in callmemaybe.
It creates busy loop if pageserver <-> safekeeper connection fails after it was
established (e.g. currently due to 'segment checkpoint not found' error on
pageserver).

Also wake up callmemaybe thread regularly once in recall_period regardless of
channel activity.
2022-01-03 20:44:36 +03:00
Konstantin Knizhnik
1c47fbae81 Do not write image layers during enforced checkpoint (#1057)
* Do not write image layers during enforced checkpoint
refer #1056

* Add Flush option to CheckpointConfig

refer #1057
2022-01-01 19:08:09 +03:00
Alexey Kondratov
8f0cd7fb9f [compute_tools] Switch cluster_id in spec to string (zenithdb/console#72) 2021-12-29 16:35:29 +03:00
Dmitry Rodionov
c910132d4b Fix wal receiver shutdown
This patch allows to shutdown wal receiver when there are no messages
and wal receiver is blocked inside tokio-postgres. In this case it
cannot check the shutdown flag.

This patch switches to use async interface of tokio-postgres directly
without sync wrappers. It opens the possibility to use tokio::select!
between the phsycal_stream.next() and a shutdown channel readiness to
interrupt replication process.

Also this allows to shutdown only particular wal receiver without
using global shutdown_requested flag.
2021-12-29 14:42:29 +03:00
Arthur Petukhovsky
70778058d9 Add test for safekeeper setup without pageserver (#1000) 2021-12-29 12:58:27 +03:00
nikitashamgunov
a379b45257 Update README.md 2021-12-28 14:26:42 -08:00
bojanserafimov
24eca8d58b Parse cancel message in pq_proto (#1060) 2021-12-28 16:43:44 -05:00
Bojan Serafimov
1e3ddd43bc Add struct for key data 2021-12-28 22:40:22 +03:00
Bojan Serafimov
989371493b Add BeMessage::BackendKeyData variant 2021-12-28 22:40:22 +03:00
Alexey Kondratov
f64074c609 Move compute_tools from console repo (zenithdb/console#383)
Currently it's included with minimal changes and lives aside of the main
workspace. Later we may re-use and combine common parts with zenith
control_plane.

This change is mostly needed to unify cloud deployment pipeline:
1.1. build compute-tools image
1.2. build compute-node image based on the freshly built compute-tools
2. build zenith image

So we can roll new compute image and new storage required by it to
operate properly. Also it becomes easier to test console against some
specific version of compute-node/-tools.
2021-12-28 20:17:29 +03:00
anastasia
eba897ffe7 send CallmeEvent::Unsubscribe request only when pageserver is caught up with safekeeper and it's time to stop streaming 2021-12-28 17:50:48 +03:00
anastasia
5ef2b1baf7 Add new test illustrating issue with sync-safekeepers.
If safekeepers sync fast enough, callmemaybe thread may never make a call before receiving Unsubscribe request. This leads to the situation, when pageserver lacks data that exists on safekeepers.
2021-12-28 17:50:48 +03:00
Kirill Bulatov
f0afd08667 Fix zenith init defaults 2021-12-28 00:21:48 +02:00
Kirill Bulatov
b494ac1ea0 Remove redundant pageserver cli params 2021-12-27 18:38:54 +02:00
Arseny Sher
a163650a99 Refactor Postgres command parsing in safekeeper.
Do it separately with SafekeeperPostgresCommand enum as a result. Since query is
always C string, switch postgres_backend process_query argument from Bytes to
&str.

Make passing ztli/ztenant id in safekeeper connection string optional; this is
needed for upcoming intra-safekeeper heartbeat cmd which is not bound to any
timeline.
2021-12-24 15:48:13 +03:00
anastasia
980f5f8440 Propagate remote_consistent_lsn to safekeepers.
Change meaning of lsns in HOT_STANDBY_FEEDBACK:
flush_lsn = disk_consistent_lsn,
apply_lsn = remote_consistent_lsn
Update compute node backpressure configuration respectively.

Update compute node configuration:
set 'synchronous_commit=remote_write' in setup without safekeepers.
This way compute node doesn't have to wait for data checkpoint on pageserver.
This doesn't guarantee data durability, but we only use this setup for tests, so it's fine.
2021-12-24 15:32:54 +03:00
Kirill Bulatov
42647f606e Use correct pageserver CLI parameters in docker entrypoint 2021-12-24 03:41:45 +02:00
bojanserafimov
b807570f46 Use parking_lot::Mutex instead of std::Mutex in walreceiver (#1045) 2021-12-23 14:25:44 -05:00
Kirill Bulatov
114a757d1c Use generic config parameters in pageserver cli
Co-authored-by: Heikki Linnakangas <heikki.linnakangas@iki.fi>
2021-12-23 18:58:28 +02:00
Andrey Taranik
9854ded56b Feature/proxy deploy (#1046)
* zenith proxy deployment

* proxy deploy ci fix

* ci cleanup or zenith proxy deploy
2021-12-23 15:53:28 +03:00