If the uploads after compaction happen slowly, they might have
finished before the pageserver is shut down. The L0 files have been
uploaded, so no data is lost, but then the query later in the test
will need to download all the L0 files, and causes the test to fail
because it specifically checks that download happens on-demand, not
all at once.
RequestContext is used to track each "operation" or "task" in a way
that's not tied to tokio tasks. It provides support for fine-grained
cancellation of individual operations, or all tasks working on an
active tenant or timeline. Most async functions now take a
RequestContext argument.
RequestContexts form a hierarchy, so that you have a top-level context
e.g. for a TCP listener task, a child context for each task handling
a connection, and perhaps a grandchild context for each individual
client request. In addition to the hierarchy, each RequestContext can
be associated with a Tenant or Timeline object. This is used to
prevent a Tenant or Timeline from being deleted or detached while
there are still tasks accessing it. This fixes a long-standing race
conditions between GC/compaction and deletion (see issues #2914 and
compiler in any way, but the functions like `get_active_timeline`
make it easy to do the right thing.
This replaces most of the machinery in `task_mgr.rs`. We don't track
running tasks as such anymore, only RequestContexts. In practice,
every task holds onto a RequestContext.
In addition to supporting cancellation, the RequestContext specifies
the desired behavior if a remote layer is needed for the operation.
This replaces the `with_ondemand_download_sync` and
`no_ondemand_download` macros. The on-demand download now happens deep
in the call stack, in get_reconstruct_data(), and the caller is no
longer involved in the download, except by passing a RequestContext
that specifies whether to do on-demand download or not. The
PageReconstructResult type is gone but the
PageReconstructError::NeedsDownload variant remains. It's now returned
if the context specified "don't do on-demand download", and a layer
is missing.
TODO:
- Enforce better that you hold a RequestContext associated with a Tenant
or Timeline.
- All the fields in RequestContext are currently 'pub', but things will
break if you modify the tenant/timeline fields directly. Make that more
safe.
- When you create a subcontext, should it inherit the Tenant / Timeline
of its parent?
- Can the walreceiver::TaskHandle stuff be replaced with this?
- Extract smaller patches:
- What else could we extract?
This commit adds the 'ctx' parameter to all the functions that will
need an active context. However, you can just create new contexts on
the fly, there is no cross-checks that the tenant/timeline is still in
active state. You can simply call Tenant::get_context or
Timeline::get_context, and they always succee. In the next commit, we
will change the functions for constructing contexts, so that you
cannot create a new TenantRequestContext if the tenant is being
stopped (and similarly for TimelineRequestContext).
This commit isn't useful on its own, but splitting these fairly
mechanical changes helps to make the next commit smaller, and thus
easier to review. Because the contexts are merely passed through
places, and not actually used for anything, this introduces a lot of
"unused variable" warnings. They will go away with the next commit.
This makes Timeline::get() async, and all functions that call it
directly or indirectly with it. The with_ondemand_download() mechanism
is gone, Timeline::get() now always downloads files, whether you want
it or not. That is what all the current callers want, so even though
this loses the capability to get a page only if it's already in the
pageserver, without downloading, we were not using that capability.
There were some places that used 'no_ondemand_download' in the WAL
ingestion code that would error out if a layer file was not found
locally, but those were dubious. We do actually want to on-demand
download in all of those places.
Per discussion at
https://github.com/neondatabase/neon/pull/3233#issuecomment-1368032358
When number of github actions workers is changed, some jobs get killed.
When helm if killed during the upgrade, release stuck in pending-upgrade
state. --atomic should initiate automatic rollback in this case.
With this patch, tenant_detach and timeline_delete's
task_mgr::shutdown_tasks() call will wait for on-demand
compaction to finish.
Before this patch, the on-demand compaction would grab the
layer_removal_cs after tenant_detach / timeline_delete had
removed the timeline directory.
This resulted in error
No such file or directory (os error 2)
NB: I already implemented this pattern for ondemand GC a while back.
fixes https://github.com/neondatabase/neon/issues/3136
- Fix and improve comments
- Rename 'physical_size' local variable to 'resident_size' for clarity.
- Remove one 'unnecessary wait_for_upload' call. The
'wait_for_sk_commit_lsn_to_reach_remote_storage' call after shutting
down compute is sufficient.
## Describe your changes
Added a PR template
## Issue ticket number and link
#3162
## Checklist before requesting a review
- [ ] I have performed a self-review of my code
- [ ] If it is a core feature, I have added thorough tests.
- [ ] Do we need to implement analytics? if so did you add the relevant
metrics to the dashboard?
- [ ] If this PR requires public announcement, mark it with
/release-notes label and add several sentences in this section.
This is a hacky implementation of WebSocket server, embedded into our
postgres proxy. The server is used to allow https://github.com/neondatabase/serverless
to connect to our postgres from browser and serverless javascript functions.
How it will work (general schema):
- browser opens a websocket connection to
`wss://ep-abc-xyz-123.xx-central-1.aws.neon.tech/`
- proxy accepts this connection and terminates TLS (https)
- inside encrypted tunnel (HTTPS), browser initiates plain
(non-encrypted) postgres connection
- proxy performs auth as in usual plain pg connection and forwards
connection to the compute
Related issue: #3225
Follow-up of https://github.com/neondatabase/neon/pull/3270 which made
an example from main README.md not working.
Fixes that, by adding a way to specify a default tenant now and modifies
the basic neon_local test to start postgres and check branching.
Not all neon_local commands are implemented, so not all README.md
contents is tested yet.
It's not OK to return early from within a PG_TRY-CATCH block. The
PG_TRY macro sets the global PG_exception_stack variable, and
PG_END_TRY restores it. If we jump out in between with "return NULL",
the PG_exception_stack is left to point to garbage. (I'm surprised the
comments in PG_TRY_CATCH don't warn about this.)
Add test that re-attaches tenant in pageserver while Postgres is
running. If the tenant is detached while compute is connected and
busy running queries, those queries will fail if they try to fetch any
pages. But when the tenant is re-attached, things should start working
again, without disconnecting the client <-> postgres connections.
Without this fix, this reproduced the segfault.
Fixes issue #3231
pageserver_disconnect() call invalidates 'pageserver_conn', including
the error message pointer we got from PQerrorMessage(pageserver_conn).
Copy the message to a temporary variable before disconnecting, like
we do in a few other places.
In the passing, clear 'pageserver_conn_wes' variable in a few places
where it was free'd. I didn't see any live bug from this, but since
pageserver_disconnect() checks if it's NULL, let's not leave it
dangling to already-free'd memory.
For every Python test, we start the storage first, and expect that
later, in the test, when we start a compute, it will work without
specific timeline and tenant creation or their IDs specified.
For that, we have a concept of "default" branch that was created on the
control plane level first, but that's not needed at all, given that it's
only Python tests that need it: let them create the initial timeline
during set-up.
Before, control plane started and stopped pageserver for timeline
creation, now Python harness runs an extra tenant creation request on
test env init.
I had to adjust the metrics test, turns out it registered the metrics
from the default tenant after an extra pageserver restart.
New model does not sent the metrics before the collection time happens,
and that was 30s before.
- Clean up redundant metric removal in TimelineMetrics::drop.
RemoteTimelineClientMetrics is responsible for cleaning up
REMOTE_OPERATION_TIME andREMOTE_UPLOAD_QUEUE_UNFINISHED_TASKS.
- Rename `pageserver_remote_upload_queue_unfinished_tasks` to
`pageserver_remote_timeline_client_calls_unfinished`. The new name
reflects that the metric is with respect to the entire call to remote
timeline client. This includes wait time in the upload queue and hence
it's a longer span than what `pageserver_remote_OPERATION_seconds`
measures.
- Add the `pageserver_remote_timeline_client_calls_started` histogram.
See the metric description for why we need it.
- Add helper functions `call_begin` etc to `RemoteTimelineClientMetrics`
to centralize the logic for updating the metrics above (they relate to
each other, see comments in code).
- Use these constructs to track ongoing downloads in
`pageserver_remote_timeline_client_calls_unfinished`
refs https://github.com/neondatabase/neon/issues/2029
fixes https://github.com/neondatabase/neon/issues/3249
closes https://github.com/neondatabase/neon/pull/3250
Before this change, we would not .measure_remote_op for index part
downloads.
And more generally, it's good to pass not just uploads but also
downloads through RemoteTimelineClient, e.g., if we ever want to
implement some timeline-scoped policies there.
Found this while working on https://github.com/neondatabase/neon/pull/3250
where I add a metric to measure the degree of concurrent downloads.
Layer download was missing in a test that I added there.
The Basebackup struct is really just a convenient place to carry the
various parameters around in send_tarball and its subroutines. Make it
internal to the send_tarball function.
This reverts commit 56a4466d0a.
Seems that flackiness increased after this commit, while the time
decrease was a couple of seconds.
With every regular Python test spawing 1 etcd, 3 safekeepers, 1
pageserver, few CLI commands and post-run cleanup hooks, it might be
hard to run many such tests in parallel.
We could return to this later, after we consider alternative test
structure and/or CI runner structure.
Closes https://github.com/neondatabase/neon/issues/3114
Adds more typization into errors that appear during protocol messages (`FeMessage`), postgres and walreceiver connections.
Socket IO errors are now better detected and logged with lesser (INFO, DEBUG) error level, without traces that they were logged before, when they were wrapped in anyhow context.
It was nice to have and useful at the time, but unfortunately the method
used to gather the profiling data doesn't play nicely with 'async'. PR
#3228 will turn 'get_page_at_lsn' function async, which will break the
profiling support. Let's remove it, and re-introduce some kind of
profiling later, using some different method, if we feel like we need it
again.
Makes the top-level functions in WalIngest async, and replaces
no_ondemand_download calls with with_ondemand_download.
This hopefully fixes the problem reported in issue #3230, although I
don't have a self-contained test case for it.
The synchronous 'tar' crate has required us to use block_in_place and
SyncIoBridge to work together with the async I/O in the client
connection. Switch to 'tokio-tar' crate that uses async I/O natively.
As part of this, move the CopyDataWriter implementation to
postgres_backend_async.rs. Even though it's only used in one place
currently, it's in principle generally applicable whenever you want to
use COPY out.
Unfortunately we cannot use the 'tokio-tar' as it is: the Builder
implementation requires the writer to have 'static lifetime. So we
have to use a modified version without that requirement. The 'static
lifetime was required just for the Drop implementation that writes
the end-of-archive sections if the Builder is dropped without calling
`finish`. But we don't actually want that behavior anyway; in fact
we had to jump through some hoops with the AbortableWrite hack to skip
those. With the modified version of 'tokio-tar' without that Drop
implementation, we don't need AbortableWrite either.
Co-authored-by: Kirill Bulatov <kirill@neon.tech>