Based on Konstantin's original patch (PR #275), but I introduced helper
functions for serializing/deserializing the different kinds of
ObjectValues, which made it more pleasant to use, as the deserialization
checks are now performed in the helper functions.
Add back code to parse transaction commit and abort records, and in
particular the list of dropped relations in them. Add 'put_unlink'
function to the Timeline trait and implementation. We had the code to
handle dropped relations in the GC code and elsewhere in ObjectRepository
already, but there was nothing to create the RelationSizeEntry::Unlink
tombstone entries until now. Also add a test to check that GC correctly
removes all page versions of a dropped relation.
Implements https://github.com/zenithdb/zenith/issues/232, except for the
"orphaned" rels.
Reviewed-by: Konstantin Knizhnik
Some of these were related to handling various WAL records that are not
related to any relations, like pg_multixact updates. These should have
been removed in the revert commit 6a9c036ac1, but I missed them.
Also, we didn't anything with commit/abort records. We will start
parsing commit/abort records in the next commit, but seems better to
add that from clean slate.
Reviewed-by: Konstantin Knizhnik
Without this step, the page versions won't actually be removed, they're
just marked for deletion on the next RocksDB "merge" or "compact"
operation.
Author: Konstantin Knizhnik
- Print the number of dropped relations, and the number of relations
encountered overall.
- If a block has only one page version, the latest one, don't count it as
a "truncated" version history. Only count pages for which we actually
removed some old versions.
- Change "last" to "latest" in variable names and comments. "Last" could
be interpreted as "oldest", but here it means "newest".
- Add a comment noting that the GC code depends on get_page_at_lsn_nowait
to store the materialized page version in the repository.
- Change "last" to "latest" in variable names for clarity. "Last" could
be interpreted as the oldest, but here it means newest.
To simplify cloud ops, allow configuration via file.
toml is used as the config format, and the file is stored in the working
directory.
Arguments used at initialization are saved in the config file.
Config file params may be overridden by CLI arguments.
Use CLI args instead of environment variables to parameterize the
working directory and postgres distirbution.
Before this change, there was a mixture of environment variables and CLI
arguments that needed to be set. Moving to a single input simplifies
cloud configuration management.