36 Commits

Author SHA1 Message Date
anastasia
babd2339cc [issue #56] Fix race at postgres instance + walreceiver start. Uses postgres/vendor issue_56 branch.
TODO: rebase on main
2021-04-22 15:51:44 +03:00
anastasia
69b786040e Decode main_data in decode_wal_record().
Replay XLOG_XACT_COMMIT and XLOG_XACT_ABORT records in walredo.
Don't wait for lsn catchup before walreceiver connected.
Use 'request_nonrel' branch of vendor/postgres
2021-04-21 19:33:29 +03:00
Heikki Linnakangas
c794f128cc Fix a few cargo clippy warnings in tui code 2021-04-21 17:12:43 +03:00
Heikki Linnakangas
e911427872 Remove some unnecessary dependencies 2021-04-21 16:42:12 +03:00
Eric Seppanen
92e4f4b3b6 cargo fmt 2021-04-20 17:59:56 -07:00
Heikki Linnakangas
d8ab2e00cb Fix compilation failure caused by last minute change in relsize_inc() 2021-04-20 19:34:48 +03:00
Heikki Linnakangas
d047a3abf7 Fixes, per Eric's and Konstantin's comments 2021-04-20 19:11:29 +03:00
Heikki Linnakangas
f69db17409 Make WAL safekeeper work with zenith timelines 2021-04-20 19:11:29 +03:00
Heikki Linnakangas
3600b33f1c Implement "timelines" in page server
This replaces the page server's "datadir" concept. The Page Server now
always works with a "Zenith Repository". When you initialize a new
repository with "zenith init", it runs initdb and loads an initial
basebackup of the freshly-created cluster into the repository, on "main"
branch. Repository can hold multiple "timelines", which can be given
human-friendly names, making them "branches". One page server simultaneously
serves all timelines stored in the repository, and you can have multiple
Postgres compute nodes connected to the page server, as long they all
operate on a different timeline.

There is a new command "zenith branch", which can be used to fork off
new branches from existing branches.

The repository uses the directory layout desribed as Repository format
v1 in https://github.com/zenithdb/rfcs/pull/5. It it *highly* inefficient:
- we never create new snapshots. So in practice, it's really just a base
  backup of the initial empty cluster, and everything else is reconstructed
  by redoing all WAL

- when you create a new timeline, the base snapshot and *all* WAL is copied
  from the new timeline to the new one. There is no smarts about
  referencing the old snapshots/wal from the ancestor timeline.

To support all this, this commit includes a bunch of other changes:

- Implement "basebackup" funtionality in page server. When you initialize
  a new compute node with "zenith pg create", it connects to the page
  server, and requests a base backup of the Postgres data directory on
  that timeline. (the base backup excludes user tables, so it's not
  as bad as it sounds).

- Have page server's WAL receiver write the WAL into timeline dir. This
  allows running a Page Server and Compute Nodes without a WAL safekeeper,
  until we get around to integrate that properly into the system. (Even
  after we integrate WAL safekeeper, this is perhaps how this will operate
  when you want to run the system on your laptop.)

- restore_datadir.rs was renamed to restore_local_repo.rs, and heavily
  modified to use the new format. It now also restores all WAL.

- Page server no longer scans and restores everything into memory at startup.
  Instead, when the first request is made for a timeline, the timeline is
  slurped into memory at that point.

- The responsibility for telling page server to "callmemaybe" was moved
  into Postgres libpqpagestore code. Also, WAL producer connstring cannot
  be specified in the pageserver's command line anymore.

- Having multiple "system identifiers" in the same page server is no
  longer supported. I repurposed much of that code to support multiple
  timelines, instead.

- Implemented very basic, incomplete, support for PostgreSQL's Extended
  Query Protocol in page_service.rs. Turns out that rust-postgres'
  copy_out() function always uses the extended query protocol to send
  out the command, and I'm using that to stream the base backup from the
  page server.

TODO: I haven't fixed the WAL safekeeper for this scheme, so all the
integration tests involving safekeepers are failing. My plan is to modify
the safekeeper to know about Zenith timelines, too, and modify it to work
with the same Zenith repository format. It only needs to care about the
'.zenith/timelines/<timeline>/wal' directories.
2021-04-20 19:11:27 +03:00
Heikki Linnakangas
2c5fb6d6c8 Change 'relsize_inc' signature to be a bit nicer.
Don't add 1 to the argument in the function, the callers must do it now.
And don't accept None argument, pass 0 instead for an empty relation.
2021-04-20 19:10:37 +03:00
Eric Seppanen
8d1bf152cf fix up logged error for walreceiver connection failed
For some reason printing the Result made the error string print twice,
along with some annoying newlines. Extracting the error first gets the
expected result (just one explanation, no newlines)
2021-04-18 23:06:35 -07:00
Eric Seppanen
3725815935 pageserver: propage errors instead of calling .unwrap()
Just a few more places where we can drop the .unwrap() call in favor of
`?`.

Also include a fix to the log file handling: don't open the file twice.
Writing to two fds would result in one message overwriting another.
Presumably `File.try_clone()` reduces down to `dup` on Linux.
2021-04-18 23:06:35 -07:00
Eric Seppanen
b32cc6a088 pageserver: change over some error handling to anyhow+thiserror
This is a first attempt at a new error-handling strategy:
- Use anyhow::Error as the first choice for easy error handling
- Use thiserror to generate local error types for anything that
  needs it (no error type is available to us) or will be inspected
  or matched on by higher layers.
2021-04-18 23:06:35 -07:00
Eric Seppanen
3c7f810849 clippy cleanup #1
Resolve some basic warnings from clippy:
- useless conversion to the same type
- redundant field names in struct initialization
- redundant single-component path imports
2021-04-18 19:15:06 -07:00
Eric Seppanen
e03417a7c9 suppress dead_code warnings on nightly
We don't need the nightly compiler, but there's no reason it shouldn't
compile without warnings, either. I don't know why stable doesn't warn
about these, but it's cheap to suppress them.
2021-04-17 14:14:27 -07:00
Eric Seppanen
52d6275812 drop nonfunctional attributes allow(dead_code)
These had no effect, so remove them.
2021-04-16 15:59:32 -07:00
Eric Seppanen
35e0099ac6 pin remote rust-s3 dependency to a git hash
Using the hash should allow us to change the remote repo and propagate
that change to user builds without that change becoming visible at a
random time.

It's unfortunate that we can't declare this dependency once in the
top-level Cargo.toml; that feature request is rust-lang rfc 2906.
2021-04-16 15:26:11 -07:00
Eric Seppanen
2246b48348 handle keepalive messages
When postgres sends us a keepalive message, send a reply so it doesn't
time out and close the connection.

The LSN values sent may need to change in the future. Currently we send:
write_lsn <= page_cache last_valid_lsn
flush_lsn <= page_cache last_valid_lsn
apply_lsn <= 0
2021-04-16 08:29:47 -07:00
Eric Seppanen
e8032f26e6 adopt new tokio-postgres:replication branch
This PR has evolved a lot; jump to the newer version. This should make
it easier to handle keepalive messages.
2021-04-16 08:29:47 -07:00
Heikki Linnakangas
d2c3ad162a Prefer passing PageServerConf by reference.
It seems more idiomatic Rust.
2021-04-16 10:42:41 +03:00
Heikki Linnakangas
b4c5cb2773 Clean up error types a little bit.
Don't use std::io::Error for errors that are not I/O related. Prefer
anyhow::Result instead.
2021-04-16 10:42:25 +03:00
anastasia
05886b33e5 fix typo 2021-04-15 16:43:03 +03:00
anastasia
d7eeaec706 add test for restore from local pgdata 2021-04-15 16:43:03 +03:00
anastasia
1190030872 handle SLRU in restore_datadir 2021-04-15 16:43:03 +03:00
lubennikovaav
82dc1e82ba Restore pageserver from s3 or local datadir (#9)
* change pageserver --skip-recovery option to --restore-from=[s3|local]
* implement restore from local pgdata 
* add simple test for local restore
2021-04-14 21:14:10 +03:00
anastasia
2e9c730dd1 Cargo fmt pass 2021-04-14 20:12:50 +03:00
Heikki Linnakangas
6266fd102c Avoid short writes if a buffer is full.
write_buf() tries to write as much as it can in one go, and can return
without writing the whole buffer. We need to use write_all() instead.
2021-04-14 18:18:38 +03:00
Eric Seppanen
3c4ebc4030 init_logging: return Result, print error on file create
Instead of panicking if the file create fails, print the filename and
error description to stderr; then propagate the error to our caller.
2021-04-13 14:06:14 -07:00
Heikki Linnakangas
0c6471ca0d Print a few more LSNs in the standard format. 2021-04-07 19:05:28 +03:00
Heikki Linnakangas
198fc9ea53 Capture initdb's stdout/stderr, to avoid messing with log formatting.
Especially with --interactive.
2021-04-07 18:51:34 +03:00
Heikki Linnakangas
6b9fc3aff0 Fix minor typos and copy-pastos 2021-04-07 16:39:37 +03:00
Stas Kelvich
c0fcbbbe0c Cargo fmt pass over a codebase 2021-04-06 14:42:13 +03:00
Heikki Linnakangas
494b95886b Move launch.sh into 'pageserver' directory.
It's a script to launch the Page Server.
2021-04-06 14:05:43 +03:00
Heikki Linnakangas
1367332447 Separate walkeeper and pageserver sources into different directories.
The integration tests, which depend on both walkeeper and pageserver,
are moved into yet another directory, 'integration_tests'.
2021-04-06 13:15:26 +03:00
Stas Kelvich
e2ce9e562e remove unused modules 2021-04-02 10:38:51 +03:00
Arun Sharma
253fc123e7 Initial repo structure 2021-03-26 10:54:25 -07:00