The algorithm is the same (with two small exceptions), but rewrite the
way it's implemented to make it easier to follow.
The exceptions:
1. 'min_resident_size' now protects at least that much data in the first
"respectful" phase of the algorithm. Previously, it would evict layers
until the resident size fell below min_resident_size. In other words,
we know protect one more layer of each tenant, so that the resident
size stays just above min_resident_size, while previously we would
evict enough to bring the resident size just under min_resident_size.
2. Previously, the "max layer size" that's used as the default
min_resident_size was calculated from *all* layers in the tenant,
including remote layers. Now it's only calculated across all
locally-present layers. I don't know if that was a deliberate choice,
but this is slightly simpler.
Without this, we run it every p.period, which can be quite low. For
example, the running experiment with 3000 tenants in prod uses a period
of 1 minute.
Doing it once per p.threshold is enough to prevent eviction.
fix synthetic size for (last_record_lsn - gc_horizon) < initdb_lsn
Assume a single-timeline project.
If the gc_horizon covers all WAL (last_record_lsn < gc_horizon)
but we have written more data than just initdb, the synthetic
size calculation worker needs to calculate the logical size
at LSN initdb_lsn (Segment BranchStart).
Before this patch, that calculation would incorrectly return
the initial logical size calculation result that we cache in
the Timeline::initial_logical_size. Presumably, because there
was confusion around initdb_lsn vs. initial size calculation.
The fix is to only hand out the initialized_size() only if
the LSN matches.
The distinction in the metrics between "init logical size" and "logical
size" was also incorrect because of the above. So, remove it.
There was a special case for `size != 0`. This was to cover the case of
LogicalSize::empty_initial(), but `initial_part_end` is `None` in that
case, so the new `LogicalSize::initialized_size()` will return None
in that case as well.
Lastly, to prevent confusion like this in the future, rename all
occurrences of `init_lsn` to either just `lsn` or a more specific name.
Co-authored-by: Joonas Koivunen <joonas@neon.tech>
Co-authored-by: Heikki Linnakangas <heikki@neon.tech>
whoami() was never called, 'is_test' was never set.
'restart()' might be useful, but it wasn't hooked up the CLI so it was
dead code. It's not clear what kind of a restart it should perform,
anyway: just restart Postgres, or re-initialize the data directory
from a fresh basebackup like "stop"+"start" does.
To avoid re-downloading evicted files on restart, re-compute logical
size and partitioning before each threshold based eviction run.
Cc: #3802
Co-authored-by: Christian Schwarz <christian@neon.tech>
* Add the REPOSITORY env to build args to avoid the following error when
executing without the credentials for the repository.
```
ERROR: Service 'compute' failed to build: Head
"https://369495373322.dkr.ecr.eu-central-1.amazonaws.com/v2/compute-node-v15/manifests/2221":
no basic auth credentials
```
* update the tag version in the documentation to support storage broker