132 Commits

Author SHA1 Message Date
Eric Seppanen
4ff248515b remove unnecessary dependencies between peer crates
These dependencies make cargo rebuild more than is strictly necessary.
Removing them makes the build a little faster.
2021-04-16 15:25:43 -07:00
Eric Seppanen
2246b48348 handle keepalive messages
When postgres sends us a keepalive message, send a reply so it doesn't
time out and close the connection.

The LSN values sent may need to change in the future. Currently we send:
write_lsn <= page_cache last_valid_lsn
flush_lsn <= page_cache last_valid_lsn
apply_lsn <= 0
2021-04-16 08:29:47 -07:00
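
The reply values listed in this commit can be sketched as follows. This is a minimal illustration, not the code from the commit: `StandbyStatus`, `status_for`, and `encode_status` are hypothetical names, and the byte layout follows the "Standby status update" message of the PostgreSQL streaming replication protocol rather than whatever API the tokio-postgres replication branch exposes.

```rust
/// Fields of a standby status update, named as in the commit message.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct StandbyStatus {
    pub write_lsn: u64,
    pub flush_lsn: u64,
    pub apply_lsn: u64,
}

/// Hypothetical helper: derive the reply values from the page cache's
/// last valid LSN, as the commit message describes.
pub fn status_for(last_valid_lsn: u64) -> StandbyStatus {
    StandbyStatus {
        write_lsn: last_valid_lsn,
        flush_lsn: last_valid_lsn,
        apply_lsn: 0,
    }
}

/// Encode the message body: 'r', write, flush, apply, client clock
/// (microseconds since 2000-01-01), then a one-byte "reply requested" flag.
/// All integers are big-endian.
pub fn encode_status(s: &StandbyStatus, clock_micros: i64, request_reply: bool) -> Vec<u8> {
    let mut buf = Vec::with_capacity(34);
    buf.push(b'r');
    buf.extend_from_slice(&s.write_lsn.to_be_bytes());
    buf.extend_from_slice(&s.flush_lsn.to_be_bytes());
    buf.extend_from_slice(&s.apply_lsn.to_be_bytes());
    buf.extend_from_slice(&clock_micros.to_be_bytes());
    buf.push(request_reply as u8);
    buf
}
```

In this sketch the encoded body would be wrapped in a CopyData message before being sent back on the replication connection.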
Eric Seppanen
e8032f26e6 adopt new tokio-postgres:replication branch
This PR has evolved a lot; jump to the newer version. This should make
it easier to handle keepalive messages.
2021-04-16 08:29:47 -07:00
Heikki Linnakangas
d2c3ad162a Prefer passing PageServerConf by reference.
It seems more idiomatic Rust.
2021-04-16 10:42:41 +03:00
Heikki Linnakangas
b4c5cb2773 Clean up error types a little bit.
Don't use std::io::Error for errors that are not I/O related. Prefer
anyhow::Result instead.
2021-04-16 10:42:25 +03:00
anastasia
24c3e961e4 Always run cargo build before tests in CI 2021-04-15 16:43:03 +03:00
anastasia
92fb7a1641 don't ignore multitenancy test 2021-04-15 16:43:03 +03:00
anastasia
05886b33e5 fix typo 2021-04-15 16:43:03 +03:00
anastasia
d7eeaec706 add test for restore from local pgdata 2021-04-15 16:43:03 +03:00
anastasia
1190030872 handle SLRU in restore_datadir 2021-04-15 16:43:03 +03:00
anastasia
913a91c541 bump vendor/postgres 2021-04-15 16:20:23 +03:00
lubennikovaav
82dc1e82ba Restore pageserver from s3 or local datadir (#9)
* change pageserver --skip-recovery option to --restore-from=[s3|local]
* implement restore from local pgdata 
* add simple test for local restore
2021-04-14 21:14:10 +03:00
anastasia
2e9c730dd1 Cargo fmt pass 2021-04-14 20:12:50 +03:00
Heikki Linnakangas
6266fd102c Avoid short writes if a buffer is full.
write_buf() tries to write as much as it can in one go, and can return
without writing the whole buffer. We need to use write_all() instead.
2021-04-14 18:18:38 +03:00
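
The short-write problem described in this commit can be reproduced with the synchronous `std::io::Write` trait. This is only a sketch: the commit concerns an async `write_buf()` call, and `ShortWriter` is a hypothetical stand-in for a socket whose buffer is nearly full.

```rust
use std::io::{self, Write};

/// Hypothetical writer that accepts at most `limit` bytes per call.
/// `limit` must be greater than zero, otherwise `write_all` fails with WriteZero.
pub struct ShortWriter {
    pub received: Vec<u8>,
    pub limit: usize,
}

impl Write for ShortWriter {
    fn write(&mut self, buf: &[u8]) -> io::Result<usize> {
        // Accept only part of the buffer and report how much was taken.
        let n = buf.len().min(self.limit);
        self.received.extend_from_slice(&buf[..n]);
        Ok(n)
    }

    fn flush(&mut self) -> io::Result<()> {
        Ok(())
    }
}
```

A single `write()` on such a writer returns after `limit` bytes and leaves the rest unsent, while `write_all()` keeps calling `write()` until the whole buffer has been accepted.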
Eric Seppanen
d1d6c968d5 control_plane: add error handling to reading pid files
print file errors to stderr; propagate the io::Error to the caller.
This error isn't handled very gracefully in WalAcceptorNode::drop(),
but there aren't any good options there since drop can't fail.
2021-04-13 14:30:48 -07:00
Eric Seppanen
3c4ebc4030 init_logging: return Result, print error on file create
Instead of panicking if the file create fails, print the filename and
error description to stderr; then propagate the error to our caller.
2021-04-13 14:06:14 -07:00
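
The pattern this commit describes (report on stderr, then propagate instead of panicking) might look roughly like this. `create_log_file` is a hypothetical helper using only the standard library; the real `init_logging` signature is not shown in this log.

```rust
use std::fs::File;
use std::io;
use std::path::Path;

/// Create the log file. On failure, print the filename and the error
/// description to stderr, then hand the error back to the caller.
pub fn create_log_file(path: &Path) -> io::Result<File> {
    File::create(path).map_err(|err| {
        eprintln!("could not create log file {}: {}", path.display(), err);
        err
    })
}
```

The caller can then use `?` to pass the failure further up instead of unwinding from inside the logging setup.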
Eric Seppanen
46543f54a6 pgbuild.sh: halt if a subcommand fails
This is helpful when first setting up a build machine, just in case
build dependencies are missing.
2021-04-13 11:45:27 -07:00
Stas Kelvich
b07fa4c896 update readme 2021-04-13 18:58:22 +03:00
Stas Kelvich
f35d13183e fixup, check testing in CI 2021-04-13 18:58:22 +03:00
Stas Kelvich
c5f379bff3 [WIP] Implement CLI pg part 2021-04-13 18:58:22 +03:00
Stas Kelvich
39ebec51d1 Return Result<()> from pageserver start/stop.
To provide meaningful error messages when it is called by CLI.
2021-04-10 19:03:40 +03:00
Stas Kelvich
6264dc6aa3 Move control_plane code out of lib.rs and split up control plane
into compute and storage parts.

Previously the code was concentrated in lib.rs, which was awkward to
open by name.
2021-04-10 13:56:19 +03:00
Stas Kelvich
59163cf3b3 Rework control_plane code to reuse it in CLI.
Move all paths from control_plane to local_env which can set them
for testing environment or for local installation.
2021-04-10 12:09:20 +03:00
Heikki Linnakangas
ba4f8e94aa Minor comment fixes. 2021-04-08 13:30:42 +03:00
Heikki Linnakangas
3e09cb5718 Add README to give an overview of the WAL safekeeper. 2021-04-08 13:30:10 +03:00
Heikki Linnakangas
0c6471ca0d Print a few more LSNs in the standard format. 2021-04-07 19:05:28 +03:00
Heikki Linnakangas
198fc9ea53 Capture initdb's stdout/stderr, to avoid messing with log formatting.
Especially with --interactive.
2021-04-07 18:51:34 +03:00
Konstantin Knizhnik
abaa36a15c Format code according to rust style guide 2021-04-07 16:50:58 +03:00
Heikki Linnakangas
6b9fc3aff0 Fix minor typos and copy-pastos 2021-04-07 16:39:37 +03:00
Konstantin Knizhnik
3fea78d688 Multitenant wal_acceptor 2021-04-07 13:43:40 +03:00
Konstantin Knizhnik
8547184830 Fix merge conflicts 2021-04-06 15:55:55 +03:00
Konstantin Knizhnik
dc16386639 Fix wal_acceptor tests 2021-04-06 15:45:21 +03:00
Stas Kelvich
c0fcbbbe0c Cargo fmt pass over a codebase 2021-04-06 14:42:13 +03:00
Heikki Linnakangas
494b95886b Move launch.sh into 'pageserver' directory.
It's a script to launch the Page Server.
2021-04-06 14:05:43 +03:00
Heikki Linnakangas
4706ed4238 Add --test-threads=1 to instructions on how to run tests.
Currently, the tests don't work in parallel. The tests try to create a
data directory in the same path and clash with each other. That should be
fixed, but for now, let's at least fix the instructions on how it works
now.
2021-04-06 14:00:37 +03:00
Heikki Linnakangas
37e258edf2 Add a brief overview of the source code layout in README 2021-04-06 13:55:29 +03:00
Heikki Linnakangas
1367332447 Separate walkeeper and pageserver sources into different directories.
The integration tests, which depend on both walkeeper and pageserver,
are moved into yet another directory, 'integration_tests'.
2021-04-06 13:15:26 +03:00
Konstantin Knizhnik
012c528749 Add race condition test for safekeeper 2021-04-06 11:58:48 +03:00
Heikki Linnakangas
abbbdc401a Don't panic in test if data directory is missing. 2021-04-06 11:48:25 +03:00
Konstantin Knizhnik
b5c49b2482 Add safekeeper tests 2021-04-06 00:51:22 +03:00
Stas Kelvich
dab1f0381c Cache postgres build and cargo deps in CI builds
Most of the CI check time is currently spent installing and compiling dependencies (~10 min total). Use actions/cache@v2 to cache things between checks. This commit sets up two caching targets:

* ./tmp_install with postgres build files and installed binaries uses $runner.os-pg-$pg_submodule_revision as a cache key and will be rebuilt only if linked submodule revision changes.

* ./target with cargo dependencies. That one uses hash(Cargo.lock) as a caching key and will be rebuilt only on deps update.

Also add tg notifications in passing.
2021-04-05 22:59:49 +03:00
Heikki Linnakangas
412e4a2ee7 Move the mgmt-console code from vendor/postgres repository. 2021-04-05 21:27:40 +03:00
Heikki Linnakangas
0bf3c3224e Make the debugging output from WAL receiver a bit nicer. 2021-04-05 17:35:28 +03:00
Stas Kelvich
79e4110cf0 Use immediate postgres stop in tests.
No need to wait for checkpoint during compute node stop.
2021-04-05 14:11:05 +03:00
Stas Kelvich
bd606ab37a Start pageserver walreceiver from predefined "-inf" lsn.
If we start walreceiver with identify_system.xlogpos() we will have a race
condition with postgres start: postgres may request a page that was modified
at an lsn smaller than identify_system.xlogpos().

The current procedure for starting postgres will be changed anyway to something
different, like having an 'initdb' method on the pageserver (or importing some
shared empty database snapshot), so for now I just use the start of the first
segment, which seems to be a valid record and is strictly before the first lsn records.
2021-04-05 12:37:45 +03:00
Stas Kelvich
a555b5917f bump pg commit 2021-04-04 10:50:51 +03:00
Konstantin Knizhnik
2c7ed574b8 Fix bug in XLogFromFileName 2021-04-04 08:58:23 +03:00
Stas Kelvich
decb04fc4f fix build 2021-04-03 19:42:45 +03:00
Stas Kelvich
da0decc24e bump pg version: include system_id in getPage requests 2021-04-03 19:15:15 +03:00
Stas Kelvich
2c308da4d2 Support several postgres instances on top of a single pageserver.
Each postgres will use its own page cache with associated data
structures. Postgres system_id is used to distinguish instances.
That also means that backup should have valid system_id stashed
somewhere. For now I put '42' as sys_id during S3 restore, but
that ought to be fixed.

This commit also introduces a new way of starting WAL receivers:
postgres can initiate such a connection by calling the 'callmemaybe $url'
command in the page_service -- that will start the appropriate wal-redo
and wal-receiver threads. This way the page server may start without
a priori knowledge of compute node addresses.
2021-04-03 19:02:44 +03:00
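
The per-instance page caches described in this commit could be organized along these lines. This is a sketch under assumptions: `PageCache` is a placeholder for the real structure, and `PageCacheRegistry` with its `get_or_create` method is a hypothetical name, not taken from the repository.

```rust
use std::collections::HashMap;
use std::sync::{Arc, Mutex};

/// Placeholder for the per-instance page cache; the real fields are not
/// shown in this log.
#[derive(Default)]
pub struct PageCache {
    pub pages: Mutex<HashMap<u64, Vec<u8>>>,
}

/// Hypothetical registry mapping a Postgres system_id to its own page cache.
#[derive(Default)]
pub struct PageCacheRegistry {
    caches: Mutex<HashMap<u64, Arc<PageCache>>>,
}

impl PageCacheRegistry {
    /// Return the cache for `system_id`, creating an empty one on first use.
    pub fn get_or_create(&self, system_id: u64) -> Arc<PageCache> {
        let mut caches = self.caches.lock().unwrap();
        Arc::clone(caches.entry(system_id).or_default())
    }
}
```

Two requests carrying the same system_id receive the same `Arc<PageCache>`, while a different system_id gets an independent cache.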