The spawned thread didn't have the tokio runtime active, which lead to
this error:
ERROR lsn_lease_bg_task{tenant_id=1bb647cb7d3974b52e74f7442fa7d059 timeline_id=cf41456d3202e1c3940cb8f372d160ab lsn=0/1576000}:panic{thread=<unnamed> location=compute_tools/src/lsn_lease.rs:201:5}: there is no reactor running, must be called from the context of a Tokio 1.x runtime
Fixes `test_readonly_node_gc`
Pass back a suitable 'errno' from the communicator process to the
originating backend in all cases. Usually it's just EIO because we
don't have a good way to map from tonic StatusCodes to libc error
numbers. That's probably good enough; from the original backend's
perspective all errors are IO errors.
In the C code, set libc errno variable before calling ereport(), so
that errcode_for_file_access() works. And once we do that, we can
replace pg_strerror() calls with %m.
Commit 1dce2a9e74 changed how the `neon.pageserver_connstring` setting
is formed, but it messed up setting the `neon.stripe_size` setting so
that it was set twice. That got mixed up during development of the
patch, as commit 7fef4435c1 landed first and was merged incorrectly.
Clap automatically uses doc comments as help/about texts. Doc comments
are strictly better, since they're also used e.g. for IDE documentation,
and are better formatted.
This patch updates all `neon_local` commands to use doc comments
(courtesy of GPT-o3).
## Problem
I discovered two bugs corresponding to safekeeper migration, which
together might lead to a data loss during the migration. The second bug
is from a hadron patch and might lead to a data loss during the
safekeeper restore in hadron as well.
1. `switch_membership` returns the current `term` instead of
`last_log_term`. It is used to choose the `sync_position` in the
algorithm, so we might choose the wrong one and break the correctness
guarantees.
2. The current `term` is used to choose the most advanced SK in
`pull_timeline` with higher priority than `flush_lsn`. It is incorrect
because the most advanced safekeeper is the one with the highest
`(last_log_term, flush_lsn)` pair. The compute might bump term on the
least advanced sk, making it the best choice to pull from, and thus
making committed log entries "uncommitted" after `pull_timeline`
Part of https://databricks.atlassian.net/browse/LKB-1017
## Summary of changes
- Return `last_log_term` in `switch_membership`
- Use `(last_log_term, flush_lsn)` as a primary key for choosing the
most advanced sk in `pull_timeline` and deny pulling if the `max_term`
is higher than on the most advanced sk (hadron only)
- Write tests for both cases
- Retry `sync_safekeepers` in `compute_ctl`
- Take into the account the quorum size when calculating `sync_position`
## Problem
The deletion logic had become difficult to understand and maintain.
## Summary of changes
- Added an RFC detailing proposed improvements to all deletion-related
APIs.
---------
Co-authored-by: Aleksandr Sarantsev <aleksandr.sarantsev@databricks.com>
This adds a new request type between backend and communicator, to make
a getpage request at a given LSN, bypassing the LFC. Only used by the
get_raw_page_at_lsn() debugging/testing function.
Add DELETE /lfc/prewarm route which handles ongoing prewarm
cancellation, update API spec, add prewarm Cancelled state
Add offload Cancelled state when LFC is not initialized
These lines are a significant fraction of the total log size of the
regression tests. And it seems very uninteresting, it's always
'localhost' in local tests.
## Problem
In #12268, we added Pageserver gRPC addresses to the storage controller.
However, we didn't actually persist these in the database.
## Summary of changes
Update the database with the new gRPC address on reattach.
Today we don't have any indications (other than spammy logs in PG that
nobody monitors) if the Walproposer in PG cannot connect to/get votes
from all Safekeepers. This means we don't have signals indicating that
the Safekeepers are operating at degraded redundancy. We need these
signals.
Added plumbing in PG extension so that the `neon_perf_counters` view
exports the following gauge metrics on safekeeper health:
- `num_configured_safekeepers`: The total number of safekeepers
configured in PG.
- `num_active_safekeepers`: The number of safekeepers that PG is
actively streaming WAL to.
An alert should be raised whenever `num_active_safekeepers` <
`num_configured_safekeepers`.
The metrics are implemented by adding additional state to the
Walproposer shared memory keeping track of the active statuses of
safekeepers using a simple array. The status of the safekeeper is set to
active (1) after the Walproposer acquires a quorum and starts streaming
data to the safekeeper, and is set to inactive (0) when the connection
with a safekeeper is shut down. We scan the safekeeper status array in
Walproposer shared memory when collecting the metrics to produce results
for the gauges.
Added coverage for the metrics to integration test
`test_wal_acceptor.py::test_timeline_disk_usage_limit`.
## Problem
## Summary of changes
---------
Co-authored-by: William Huang <william.huang@databricks.com>
## Problem
```
postgres=> select neon.prewarm_local_cache('\xfcfcfcfc01000000ffffffff070000000000000000000000000000000000000000000000000000000000000000000000000000ff', 1);
WARNING: terminating connection because of crash of another server process
DETAIL: The postmaster has commanded this server process to roll back the current transaction and exit, because another server process exited abnormally and possibly corrupted shared memory.
HINT: In a moment you should be able to reconnect to the database and repeat your command.
FATAL: server conn crashed?
```
The function takes a bytea argument and casts it to a C struct, without
validating the contents.
## Summary of changes
Added validation for number of pages to be prefetched and for the chunks
as well.