Commit Graph

34 Commits

Author SHA1 Message Date
Heikki Linnakangas
b484b896b6 Refactor the functionality page_cache.rs.
This moves things around:

- The PageCache is split into two structs: Repository and Timeline. A
  Repository holds multiple Timelines. In order to get a page version,
  you must first get a reference to the Repository, then the Timeline
  in the repository, and finally call the get_page_at_lsn() function
  on the Timeline object. This sounds complicated, but because each
  connection from a compute node, and each WAL receiver, only deals
  with one timeline at a time, the callers can get the reference to
  the Timeline object once and hold onto it. The Timeline corresponds
  most closely to the old PageCache object.

- Repository and Timeline are now abstract traits, so that we can
  support multiple implementations. I don't actually expect us to have
  multiple implementations for long. We have the RocksDB
  implementation now, but as soon as we have a different
  implementation that's usable, I expect that we will retire the
  RocksDB implementation. But I think this abstraction works as good
  documentation in any case: it's now easier to see what the interface
  for storing and loading pages from the repository is, by looking at
  the Repository/Timeline traits. They abstract traits are in
  repository.rs, and the RocksDB implementation of them is in
  repository/rocksdb.rs.

- page_cache.rs is now a "switchboard" to get a handle to the
  repository. Currently, the page server can only handle one
  repository at a time, so there isn't much there, but in the future
  we might do multi-tenancy there.
2021-05-05 10:37:36 +03:00
anastasia
1cdeba9db7 [issue #18] log module name and position in the file 2021-05-03 15:17:51 +03:00
Eric Seppanen
4acdcbe90f clippy cleanup #3
Fix issues raised by clippy. Mostly trivial ones, though some allow
4-5 lines of code to be reduced to 1.
2021-04-26 12:35:35 -07:00
Konstantin Knizhnik
3e007b0eb9 Do not delete versions in GC 2021-04-24 22:32:22 +03:00
Konstantin Knizhnik
499b4f7eba Log garbage collection statistics 2021-04-23 18:02:58 +03:00
Konstantin Knizhnik
52ee3a2bac Support CREATE DATABASE command 2021-04-23 17:03:56 +03:00
Konstantin Knizhnik
4a0a9e748c Enable garbage collector 2021-04-22 17:52:15 +03:00
Konstantin Knizhnik
2ca8fbb6ff Fix DEFAULT_GC_PERIOD_SEC type 2021-04-22 12:01:25 +03:00
Konstantin Knizhnik
c5a8c31b8a Update comments 2021-04-22 11:46:20 +03:00
Konstantin Knizhnik
ed30f2096c Disable GC by default 2021-04-22 11:30:27 +03:00
Konstantin Knizhnik
2dbbb8c59b Address issues from Eric's review 2021-04-22 10:12:22 +03:00
Konstantin Knizhnik
9e7c45cb72 Merge with master 2021-04-22 09:45:13 +03:00
Heikki Linnakangas
a4fd1e1a80 Cleanup more issues noted by 'clippy'
Mostly stuff that was introduced by commit 3600b33f1c.
2021-04-22 09:20:05 +03:00
Eric Seppanen
1f3f4cfaf5 clippy cleanup #2
- remove needless return
- remove needless format!
- remove a few more needless clone()
- from_str_radix(_, 10) -> .parse()
- remove needless reference
- remove needless `mut`

Also manually replaced a match statement with map_err() because after
clippy was done with it, there was almost nothing left in the match
expression.
2021-04-21 17:56:58 -07:00
Konstantin Knizhnik
c981f4ad66 Implement garbage collection of unused versions 2021-04-21 19:04:30 +03:00
Konstantin Knizhnik
d8fa2ec367 Merge with main branch 2021-04-21 16:10:05 +03:00
Konstantin Knizhnik
07507274c0 Merge branch 'main' into rocksdb_pageserver 2021-04-21 16:06:31 +03:00
Eric Seppanen
92e4f4b3b6 cargo fmt 2021-04-20 17:59:56 -07:00
Heikki Linnakangas
f69db17409 Make WAL safekeeper work with zenith timelines 2021-04-20 19:11:29 +03:00
Heikki Linnakangas
3600b33f1c Implement "timelines" in page server
This replaces the page server's "datadir" concept. The Page Server now
always works with a "Zenith Repository". When you initialize a new
repository with "zenith init", it runs initdb and loads an initial
basebackup of the freshly-created cluster into the repository, on "main"
branch. Repository can hold multiple "timelines", which can be given
human-friendly names, making them "branches". One page server simultaneously
serves all timelines stored in the repository, and you can have multiple
Postgres compute nodes connected to the page server, as long they all
operate on a different timeline.

There is a new command "zenith branch", which can be used to fork off
new branches from existing branches.

The repository uses the directory layout desribed as Repository format
v1 in https://github.com/zenithdb/rfcs/pull/5. It it *highly* inefficient:
- we never create new snapshots. So in practice, it's really just a base
  backup of the initial empty cluster, and everything else is reconstructed
  by redoing all WAL

- when you create a new timeline, the base snapshot and *all* WAL is copied
  from the new timeline to the new one. There is no smarts about
  referencing the old snapshots/wal from the ancestor timeline.

To support all this, this commit includes a bunch of other changes:

- Implement "basebackup" funtionality in page server. When you initialize
  a new compute node with "zenith pg create", it connects to the page
  server, and requests a base backup of the Postgres data directory on
  that timeline. (the base backup excludes user tables, so it's not
  as bad as it sounds).

- Have page server's WAL receiver write the WAL into timeline dir. This
  allows running a Page Server and Compute Nodes without a WAL safekeeper,
  until we get around to integrate that properly into the system. (Even
  after we integrate WAL safekeeper, this is perhaps how this will operate
  when you want to run the system on your laptop.)

- restore_datadir.rs was renamed to restore_local_repo.rs, and heavily
  modified to use the new format. It now also restores all WAL.

- Page server no longer scans and restores everything into memory at startup.
  Instead, when the first request is made for a timeline, the timeline is
  slurped into memory at that point.

- The responsibility for telling page server to "callmemaybe" was moved
  into Postgres libpqpagestore code. Also, WAL producer connstring cannot
  be specified in the pageserver's command line anymore.

- Having multiple "system identifiers" in the same page server is no
  longer supported. I repurposed much of that code to support multiple
  timelines, instead.

- Implemented very basic, incomplete, support for PostgreSQL's Extended
  Query Protocol in page_service.rs. Turns out that rust-postgres'
  copy_out() function always uses the extended query protocol to send
  out the command, and I'm using that to stream the base backup from the
  page server.

TODO: I haven't fixed the WAL safekeeper for this scheme, so all the
integration tests involving safekeepers are failing. My plan is to modify
the safekeeper to know about Zenith timelines, too, and modify it to work
with the same Zenith repository format. It only needs to care about the
'.zenith/timelines/<timeline>/wal' directories.
2021-04-20 19:11:27 +03:00
Konstantin Knizhnik
95160dee6d Merge with main branch 2021-04-19 17:00:30 +03:00
Konstantin Knizhnik
8aa3013ec2 Merge branch 'main' into rocksdb_pageserver 2021-04-19 16:28:29 +03:00
Eric Seppanen
3725815935 pageserver: propage errors instead of calling .unwrap()
Just a few more places where we can drop the .unwrap() call in favor of
`?`.

Also include a fix to the log file handling: don't open the file twice.
Writing to two fds would result in one message overwriting another.
Presumably `File.try_clone()` reduces down to `dup` on Linux.
2021-04-18 23:06:35 -07:00
Eric Seppanen
3c7f810849 clippy cleanup #1
Resolve some basic warnings from clippy:
- useless conversion to the same type
- redundant field names in struct initialization
- redundant single-component path imports
2021-04-18 19:15:06 -07:00
Heikki Linnakangas
d2c3ad162a Prefer passing PageServerConf by reference.
It seems more idiomatic Rust.
2021-04-16 10:42:41 +03:00
Heikki Linnakangas
b4c5cb2773 Clean up error types a little bit.
Don't use std::io::Error for errors that are not I/O related. Prefer
anyhow::Result instead.
2021-04-16 10:42:25 +03:00
anastasia
d7eeaec706 add test for restore from local pgdata 2021-04-15 16:43:03 +03:00
anastasia
1190030872 handle SLRU in restore_datadir 2021-04-15 16:43:03 +03:00
Konstantin Knizhnik
24b925d528 Support truncate WAL record 2021-04-15 15:50:47 +03:00
lubennikovaav
82dc1e82ba Restore pageserver from s3 or local datadir (#9)
* change pageserver --skip-recovery option to --restore-from=[s3|local]
* implement restore from local pgdata 
* add simple test for local restore
2021-04-14 21:14:10 +03:00
Eric Seppanen
3c4ebc4030 init_logging: return Result, print error on file create
Instead of panicking if the file create fails, print the filename and
error description to stderr; then propagate the error to our caller.
2021-04-13 14:06:14 -07:00
Konstantin Knizhnik
a606336074 Fix bug in WALRecord serializer 2021-04-09 20:31:34 +03:00
Stas Kelvich
c0fcbbbe0c Cargo fmt pass over a codebase 2021-04-06 14:42:13 +03:00
Heikki Linnakangas
1367332447 Separate walkeeper and pageserver sources into different directories.
The integration tests, which depend on both walkeeper and pageserver,
are moved into yet another directory, 'integration_tests'.
2021-04-06 13:15:26 +03:00