Commit Graph

8 Commits

Author SHA1 Message Date
Stas Kelvich
8c07a36fda Remove last_valid_lsn tracking in wal_receiver.
There are two main reasons for that:

a) Latest unfinished record may disapper after compute node restart, so let's
    try not leak volatile part of the WAL into the repository. Always use
    last_valid_record instead.

    That change requires different getPage@LSN logic in postgres -- we need
    to ask LSN's that point to some complete record instead of GetFlushRecPtr()
    that can point in the middle of the record. That was already done by @knizhnik
    to deal with the same problem during the work on `postgres --sync-safekeepers`.

    Postgres will use LSN's aligned on 0x8 boundary in get_page requests, so we
    also need to be sure that last_valid_record is aligned.

b) Switch to get_last_record_lsn() in basebackup@no_lsn. When compute node
    is running without safekeepers and streams WAL directly
    to pageserver it is important to match basebackup LSN and LSN of replication
    start. Before this commit basebackup@no_lsn was waiting for last_valid_lsn
    and walreceiver started replication with last_record_lsn, which can be less.
    So replication was failing since compute node doesn't have requested WAL.
2021-09-02 12:06:12 +03:00
Heikki Linnakangas
cff671c1bd Remove duplicated LSN fields from the page cache.
Having multiple copies of the same values is a source of confusion.
Commit da9bf5dc63 fixed one race condition caused by that, for example.
See also discussion at
https://github.com/zenithdb/zenith/issues/57#issuecomment-824393470

This changes SeqWait.advance() to return the old number, and not panic if
you try to move the value backwards. The caller should check for that and
act accordingly.
2021-04-27 10:32:39 +03:00
Heikki Linnakangas
3b9e7fc5e6 Use explicit threads.
Remove 'async' usage a much as feasible. Async code is harder to debug,
and mixing async and non-async code is a recipe for confusion and bugs.

There are a couple of exceptions:

- The code in walredo.rs, which needs to read and write to the child
  process simultaneously, still uses async. It's more convenient there.
  The 'async' usage is carefully limited to just the functions that
  communicate with the child process.

- Code in walreceiver.rs that uses tokio-postgres to do streaming
  replication. We have to use async there, because tokio-postgres is
  async. Most rust-postgres functionality has non-async wrappers, but
  not the new replication client code. The async usage is very limited
  here, too: we use just block_on to call the tokio-postgres functions.

The code in 'page_service.rs' now launches a dedicated thread for each
connection.

This replaces tokio::sync::channel with std::sync:mpsc in
'seqwait.rs', to make that non-async. It's not a drop-in replacement,
though: std::sync::mpsc doesn't support multiple consumers, so we cannot
share a channel between multiple waiters. So this removes the code to
check if an existing channel can be reused, and creates a new one for
each waiter. That created another problem: BTreeMap cannot hold
duplicates, so I replaced that with BinaryHeap.

Similarly, the tokio::{mpsc, oneshot} channels used between WAL redo
manager and PageCache are replaced with std::sync::mpsc. (There is no
separate 'oneshot' channel in the standard library.)

Fixes github issue #58, and coincidentally also issue #66.
2021-04-26 13:07:51 +03:00
Eric Seppanen
f62ce4bcf7 make seqwait generic
SeqWait can use any type that is Ord + Debug + Copy. Debug is not
strictly necessary, but allows us to keep the panic message if a caller
wants the sequence number to go backwards.
2021-04-25 19:37:02 -07:00
Eric Seppanen
fe79082e29 require documentation in seqwait.rs 2021-04-23 15:01:22 -07:00
Eric Seppanen
8beaf76c85 SeqWait: don't do wakeups under the lock
Clippy pointed out that `drop(waiters)` didn't do anything, because
there was a misplaced ";" causing `waiters` to be a unit type `()`.

This change makes it do what was intended: the lock should be dropped
first, then the wakeups should be processed.
2021-04-23 14:16:34 -07:00
Konstantin Knizhnik
9e7c45cb72 Merge with master 2021-04-22 09:45:13 +03:00
Eric Seppanen
8060e17b50 add SeqWait
SeqWait adds a way to .await the arrival of some sequence number.
It provides wait_for(num) which is an async fn, and advance(num) which
is synchronous.

This should be useful in solving the page cache deadlocks, and may be
useful in other areas too.

This implementation still uses a Mutex internally, but only for a brief
critical section. If we find this code broadly useful and start to care
more about executor stalls due to unfair thread scheduling, there might
be ways to make it lock-free.
2021-04-21 18:02:13 -07:00