Commit Graph

198 Commits

Author SHA1 Message Date
Konstantin Knizhnik
2ca8fbb6ff Fix DEFAULT_GC_PERIOD_SEC type 2021-04-22 12:01:25 +03:00
Konstantin Knizhnik
546266b86d Merge with main branch 2021-04-22 11:53:22 +03:00
Konstantin Knizhnik
c5a8c31b8a Update comments 2021-04-22 11:46:20 +03:00
Stas Kelvich
bab954b87f Fix error message wording introduced in previous commit 2021-04-22 11:45:06 +03:00
Stas Kelvich
3ded550272 Use postgres version with computenode_mode defaulted to false as it
has ongoing issues. In a passing also fix test tests on macOS.
2021-04-22 11:32:40 +03:00
Konstantin Knizhnik
ed30f2096c Disable GC by default 2021-04-22 11:30:27 +03:00
Konstantin Knizhnik
da9508716d Address issues from Eric's review 2021-04-22 10:37:52 +03:00
Konstantin Knizhnik
2dbbb8c59b Address issues from Eric's review 2021-04-22 10:12:22 +03:00
Konstantin Knizhnik
f3192ee415 Merge branch 'main' into rocksdb_pageserver 2021-04-22 09:45:42 +03:00
Konstantin Knizhnik
9e7c45cb72 Merge with master 2021-04-22 09:45:13 +03:00
Heikki Linnakangas
18ba16aaac Fix and improve comment on ZTimelineId.
The comment was incorrect, claiming that ZTimelineId is a 32-byte value.
It is actually 16 bytes wide. While we're at it, improve the comment,
explaining what a zenith timeline is, and why it's different from
PostgreSQL timelines.
2021-04-22 09:25:53 +03:00
Heikki Linnakangas
a4fd1e1a80 Cleanup more issues noted by 'clippy'
Mostly stuff that was introduced by commit 3600b33f1c.
2021-04-22 09:20:05 +03:00
Eric Seppanen
9b71ae7dce page_cache: add an assert on the last_valid_lsn 2021-04-21 18:02:13 -07:00
Eric Seppanen
2cd730d31f page_cache: replace long mutex sleep with SeqWait
When calling into the page cache, it was possible to wait on a blocking
mutex, which can stall the async executor.

Replace that sleep with a SeqWait::wait_for(lsn).await so that the
executor can go on with other work while we wait.

Change walreceiver_works to an AtomicBool to avoid the awkwardness of
taking the lock, then dropping it while we call wait_for and then
acquiring it again to do real work.
2021-04-21 18:02:13 -07:00
Eric Seppanen
8060e17b50 add SeqWait
SeqWait adds a way to .await the arrival of some sequence number.
It provides wait_for(num) which is an async fn, and advance(num) which
is synchronous.

This should be useful in solving the page cache deadlocks, and may be
useful in other areas too.

This implementation still uses a Mutex internally, but only for a brief
critical section. If we find this code broadly useful and start to care
more about executor stalls due to unfair thread scheduling, there might
be ways to make it lock-free.
2021-04-21 18:02:13 -07:00
Eric Seppanen
1f3f4cfaf5 clippy cleanup #2
- remove needless return
- remove needless format!
- remove a few more needless clone()
- from_str_radix(_, 10) -> .parse()
- remove needless reference
- remove needless `mut`

Also manually replaced a match statement with map_err() because after
clippy was done with it, there was almost nothing left in the match
expression.
2021-04-21 17:56:58 -07:00
Konstantin Knizhnik
a22cb7acc1 Merge with main branch 2021-04-21 20:19:34 +03:00
Konstantin Knizhnik
785502c92c New version of postgres 2021-04-21 19:52:28 +03:00
anastasia
69b786040e Decode main_data in decode_wal_record().
Replay XLOG_XACT_COMMIT and XLOG_XACT_ABORT records in walredo.
Don't wait for lsn catchup before walreceiver connected.
Use 'request_nonrel' branch of vendor/postgres
2021-04-21 19:33:29 +03:00
Konstantin Knizhnik
4f3f0304c2 Merge branch 'main' into rocksdb_pageserver 2021-04-21 19:05:02 +03:00
Konstantin Knizhnik
c981f4ad66 Implement garbage collection of unused versions 2021-04-21 19:04:30 +03:00
Heikki Linnakangas
c794f128cc Fix a few cargo clippy warnings in tui code 2021-04-21 17:12:43 +03:00
Heikki Linnakangas
220a023e51 Fix typo in error message 2021-04-21 17:12:43 +03:00
Heikki Linnakangas
e911427872 Remove some unnecessary dependencies 2021-04-21 16:42:12 +03:00
Heikki Linnakangas
eb42fbadeb Re-enable test_redo_cases() test.
I accidentally commented it out in commit 3600b33f.
2021-04-21 16:30:15 +03:00
Konstantin Knizhnik
d8fa2ec367 Merge with main branch 2021-04-21 16:10:05 +03:00
Konstantin Knizhnik
07507274c0 Merge branch 'main' into rocksdb_pageserver 2021-04-21 16:06:31 +03:00
Eric Seppanen
92e4f4b3b6 cargo fmt 2021-04-20 17:59:56 -07:00
Eric Seppanen
b5a5ea5831 update README: "zenith pageserver start"
The old command was "zenith start", which no longer works.
2021-04-20 13:21:02 -07:00
Eric Seppanen
f387769203 add zenith_utils crate
This is a place for code that's shared between other crates in this
repository.
2021-04-20 11:11:29 -07:00
Heikki Linnakangas
7f777a485e Fix caching of 'postgres' build in github action.
The postgres_ext.h isn't found in vendor/postgres, if the Postgres
was restored from cache instead of building it in vendor/postgres.
To fix, change include path to point into tmp_install/include where the
headers are installed, instead of the vendor/postgres source dir.
2021-04-20 20:27:22 +03:00
Heikki Linnakangas
d8ab2e00cb Fix compilation failure caused by last minute change in relsize_inc() 2021-04-20 19:34:48 +03:00
Heikki Linnakangas
f520ef9a64 Update 'postgres' submodule 2021-04-20 19:26:27 +03:00
Heikki Linnakangas
d047a3abf7 Fixes, per Eric's and Konstantin's comments 2021-04-20 19:11:29 +03:00
Heikki Linnakangas
f69db17409 Make WAL safekeeper work with zenith timelines 2021-04-20 19:11:29 +03:00
Heikki Linnakangas
3600b33f1c Implement "timelines" in page server
This replaces the page server's "datadir" concept. The Page Server now
always works with a "Zenith Repository". When you initialize a new
repository with "zenith init", it runs initdb and loads an initial
basebackup of the freshly-created cluster into the repository, on "main"
branch. Repository can hold multiple "timelines", which can be given
human-friendly names, making them "branches". One page server simultaneously
serves all timelines stored in the repository, and you can have multiple
Postgres compute nodes connected to the page server, as long they all
operate on a different timeline.

There is a new command "zenith branch", which can be used to fork off
new branches from existing branches.

The repository uses the directory layout desribed as Repository format
v1 in https://github.com/zenithdb/rfcs/pull/5. It it *highly* inefficient:
- we never create new snapshots. So in practice, it's really just a base
  backup of the initial empty cluster, and everything else is reconstructed
  by redoing all WAL

- when you create a new timeline, the base snapshot and *all* WAL is copied
  from the new timeline to the new one. There is no smarts about
  referencing the old snapshots/wal from the ancestor timeline.

To support all this, this commit includes a bunch of other changes:

- Implement "basebackup" funtionality in page server. When you initialize
  a new compute node with "zenith pg create", it connects to the page
  server, and requests a base backup of the Postgres data directory on
  that timeline. (the base backup excludes user tables, so it's not
  as bad as it sounds).

- Have page server's WAL receiver write the WAL into timeline dir. This
  allows running a Page Server and Compute Nodes without a WAL safekeeper,
  until we get around to integrate that properly into the system. (Even
  after we integrate WAL safekeeper, this is perhaps how this will operate
  when you want to run the system on your laptop.)

- restore_datadir.rs was renamed to restore_local_repo.rs, and heavily
  modified to use the new format. It now also restores all WAL.

- Page server no longer scans and restores everything into memory at startup.
  Instead, when the first request is made for a timeline, the timeline is
  slurped into memory at that point.

- The responsibility for telling page server to "callmemaybe" was moved
  into Postgres libpqpagestore code. Also, WAL producer connstring cannot
  be specified in the pageserver's command line anymore.

- Having multiple "system identifiers" in the same page server is no
  longer supported. I repurposed much of that code to support multiple
  timelines, instead.

- Implemented very basic, incomplete, support for PostgreSQL's Extended
  Query Protocol in page_service.rs. Turns out that rust-postgres'
  copy_out() function always uses the extended query protocol to send
  out the command, and I'm using that to stream the base backup from the
  page server.

TODO: I haven't fixed the WAL safekeeper for this scheme, so all the
integration tests involving safekeepers are failing. My plan is to modify
the safekeeper to know about Zenith timelines, too, and modify it to work
with the same Zenith repository format. It only needs to care about the
'.zenith/timelines/<timeline>/wal' directories.
2021-04-20 19:11:27 +03:00
Heikki Linnakangas
2c5fb6d6c8 Change 'relsize_inc' signature to be a bit nicer.
Don't add 1 to the argument in the function, the callers must do it now.
And don't accept None argument, pass 0 instead for an empty relation.
2021-04-20 19:10:37 +03:00
Konstantin Knizhnik
8604bb8750 Increase timeout for running github tests 2021-04-20 18:46:31 +03:00
Konstantin Knizhnik
936cad17e4 LSN-aware smgrnblock/smgrexists implementations 2021-04-20 18:28:35 +03:00
Heikki Linnakangas
fa5d31056b Remove unimplemented "snapshot" subcommand from --help 2021-04-20 17:35:32 +03:00
Heikki Linnakangas
583f64768f Fix wal safekeeper's reply to IDENTIFY_SYSTEM command.
The PostgreSQL FE/BE RowDescription message was built incorrectly,
the colums were sent in wrong order, and the command tag was missing
NULL-terminator. With these fixes, 'psql' understands the reply and
shows it correctly.
2021-04-20 17:35:27 +03:00
anastasia
c5d56ffe22 Fix build: configure postgres in vendor/postgres directory for postgres_ffi 2021-04-20 15:57:05 +03:00
Heikki Linnakangas
b451ede199 Use rust bindgen for reading/writing the PostgreSQL control file. 2021-04-20 15:57:05 +03:00
Eric Seppanen
533087fd5d cargo fmt 2021-04-19 23:26:13 -07:00
Konstantin Knizhnik
95160dee6d Merge with main branch 2021-04-19 17:00:30 +03:00
Konstantin Knizhnik
8aa3013ec2 Merge branch 'main' into rocksdb_pageserver 2021-04-19 16:28:29 +03:00
Konstantin Knizhnik
8879f747ee Add multitenancy test for wal_acceptor 2021-04-19 14:30:42 +03:00
Heikki Linnakangas
9809613c6f Don't try to read from two WAL files in one read() call.
That obviously won't work, you have to stop at the WAL file boundary,
and open the next file.
2021-04-19 10:34:51 +03:00
Eric Seppanen
8d1bf152cf fix up logged error for walreceiver connection failed
For some reason printing the Result made the error string print twice,
along with some annoying newlines. Extracting the error first gets the
expected result (just one explanation, no newlines)
2021-04-18 23:06:35 -07:00
Eric Seppanen
3725815935 pageserver: propage errors instead of calling .unwrap()
Just a few more places where we can drop the .unwrap() call in favor of
`?`.

Also include a fix to the log file handling: don't open the file twice.
Writing to two fds would result in one message overwriting another.
Presumably `File.try_clone()` reduces down to `dup` on Linux.
2021-04-18 23:06:35 -07:00