Alternative to #3586.
Introduces usage of `current_logical_size.current_size` as a boundary after which we start to update the metric gauge on ingested WAL. Previously, any incremented value (ingested WAL) would have updated the gauge, but this would have left the metric at zero for timelines which never receive any WAL, even if the size had been calculated. Now the gauge is updated as soon as the calculation completes, without requiring any WAL to be received.
This PR replaces the ill-advised `unsafe Sync` impl with a de-facto
standard way to solve the underlying problem.
TLDR:
- tokio::task::spawn requires future to be Send
- ∀t. (t : Sync) <=> (&t : Send)
- ∀t. (t : Send + !Sync) => (&t : !Send)
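The rules in the TLDR list can be checked with a small sketch. It assumes, purely for illustration, that the non-`Sync` value is a `Cell<u32>` and that wrapping it in a `Mutex` is an acceptable fix; `assert_send`, `assert_sync`, `check_bounds` and `bump_from_thread` are hypothetical helpers, and `std::thread::scope` stands in for `tokio::task::spawn` so that the example needs no external crates.

```rust
use std::cell::Cell;
use std::sync::Mutex;
use std::thread;

// Compile-time probes: a call only type-checks if the bound holds.
fn assert_send<T: Send>() {}
fn assert_sync<T: Sync>() {}

// Type-checks the claims from the TLDR list for one concrete type.
fn check_bounds() {
    // Cell<u32> is Send but not Sync ...
    assert_send::<Cell<u32>>();
    // ... so `assert_send::<&Cell<u32>>()` would be rejected by the compiler.

    // Mutex<T> is Sync whenever T is Send, so a shared reference to it is Send.
    assert_sync::<Mutex<Cell<u32>>>();
    assert_send::<&Mutex<Cell<u32>>>();
}

// Mutates the counter from another thread through a shared reference.
// This only compiles because Mutex<Cell<u32>> is Sync.
fn bump_from_thread(counter: &Mutex<Cell<u32>>) -> u32 {
    thread::scope(|s| {
        s.spawn(|| {
            let guard = counter.lock().unwrap();
            guard.set(guard.get() + 1);
        });
    }); // the scope joins the spawned thread before returning
    let value = counter.lock().unwrap().get();
    value
}
```

Calling `bump_from_thread` twice on a counter that starts at zero returns 1 and then 2. No `unsafe impl Sync` is needed because the `Mutex` provides the synchronization.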
Without this patch, basebackup fails if we evict all layers before that.
This slipped in as part of
commit 01b4b0c2f3
Author: Christian Schwarz <christian@neon.tech>
Date: Fri Jan 13 17:02:22 2023 +0100
Introduce RequestContext
This patch adds a per-timeline periodic task that executes an eviction
policy. The eviction policy is configurable per tenant.
Two policies exist:
- NoEviction (the default one)
- LayerAccessThreshold
The LayerAccessThreshold policy examines the last access timestamp per
layer in the layer map and evicts the layer if that last access is
further in the past than a configurable threshold value.
This policy kind is evaluated periodically at a configurable period.
It logs a summary statistic at `info!()` or `warn!()` level, depending
on whether any evictions failed.
This feature has no explicit killswitch since it's off by default.
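A minimal sketch of how the threshold check described above could look. All names here (`EvictionPolicy` fields, `LayerAccess`, `eviction_candidates`) are assumptions made for illustration and are not taken from the patch; the sketch only selects candidates and does not evict anything.

```rust
use std::time::{Duration, SystemTime};

// Hypothetical per-tenant policy; NoEviction is the default, as in the text.
#[allow(dead_code)] // `period` is only read by the (omitted) periodic task
#[derive(Debug, Clone, PartialEq, Default)]
enum EvictionPolicy {
    #[default]
    NoEviction,
    LayerAccessThreshold { period: Duration, threshold: Duration },
}

// Hypothetical view of one layer map entry.
struct LayerAccess {
    name: String,
    last_access: SystemTime,
}

// Returns the names of layers whose last access is older than the threshold.
fn eviction_candidates<'a>(
    policy: &EvictionPolicy,
    layers: &'a [LayerAccess],
    now: SystemTime,
) -> Vec<&'a str> {
    match policy {
        EvictionPolicy::NoEviction => Vec::new(),
        EvictionPolicy::LayerAccessThreshold { threshold, .. } => layers
            .iter()
            .filter(|layer| match now.duration_since(layer.last_access) {
                Ok(age) => age > *threshold,
                // Last access lies in the future (clock skew): keep the layer.
                Err(_) => false,
            })
            .map(|layer| layer.name.as_str())
            .collect(),
    }
}
```

A periodic task would call such a function once per `period` and then log how many of the selected layers were evicted successfully.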
#3536 added the custom `Debug` implementations, but using the derived
`Debug` on `Key` led to overly verbose output. Instead of making `Key`'s
`Debug` unconditionally or conditionally do the `Display` variant (for
tablespace'd keys), opted to build a newtype that provides `Debug` for
`Range<Key>` via `Display`, which seemed to work unconditionally.
Also orders `Key` to have: 1. comment, 2. derive, 3. `struct Key`.
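The newtype approach can be sketched as follows. The `Key` here is a hypothetical one-field stand-in with a made-up `Display` format, and `RangeDisplayDebug` is an illustrative name; only the idea of routing `Debug` for a range through `Display` is taken from the description above.

```rust
use std::fmt;
use std::ops::Range;

// Hypothetical stand-in for the real key type; its derived Debug is the verbose one.
#[derive(Debug, Clone, Copy)]
struct Key(u64);

impl fmt::Display for Key {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        // Compact fixed-width hex; the real format is an assumption here.
        write!(f, "{:016X}", self.0)
    }
}

// Newtype whose Debug output uses the Display impl of the range bounds.
struct RangeDisplayDebug<'a, T: fmt::Display>(&'a Range<T>);

impl<T: fmt::Display> fmt::Debug for RangeDisplayDebug<'_, T> {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        write!(f, "{}..{}", self.0.start, self.0.end)
    }
}
```

With this, `format!("{:?}", RangeDisplayDebug(&(Key(1)..Key(255))))` yields `0000000000000001..00000000000000FF`, while `Key` itself keeps its derived `Debug`.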
This script can be used to remove tenant directories on safekeepers for
projects which no longer exist (deleted in the console).
To run this script you need to upload it to a safekeeper (e.g. with SSH)
and run it with python3. Ansible can be used to run this script on
multiple safekeepers.
Fixes https://github.com/neondatabase/cloud/issues/3356
The auto-eviction PR (#3552) operates in two phases:
1. find candidate layers
2. evict them.
For (2), a batch API like the one added in this commit is useful.
Note that this PR requires #3558 to be merged first.
Otherwise, the tests won't pass.
This changes the way we compare `Arc<dyn PersistentLayer>` in Timeline's
`LayerMap` not to use `Arc::ptr_eq` which has been witnessed in
development of #3557 to yield wrong results. It gives wrong results
because it compares fat pointers, which are `(object, vtable)` tuples
for `dyn Trait` and there are no guarantees that the `vtable`s are
unique. As in there were multiple vtables for `RemoteLayer` which is why
the comparison failed in #3557.
This is a known issue in Rust: clippy warns against it, and Rust std
might be moving to the solution which has been reproduced in this PR:
compare only object pointers by "casting out" the vtable pointer.
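A minimal sketch of comparing only the object pointers. `Layer`, `RemoteLayerSketch` and `same_object` are hypothetical names used for illustration; the PR's actual helper may look different.

```rust
use std::sync::Arc;

// Hypothetical stand-in for the layer trait.
trait Layer {
    fn name(&self) -> &str;
}

struct RemoteLayerSketch {
    name: String,
}

impl Layer for RemoteLayerSketch {
    fn name(&self) -> &str {
        &self.name
    }
}

// Compares only the data addresses of two Arcs. Casting the (possibly fat)
// pointer to `*const ()` drops the vtable part, so duplicate vtables for the
// same concrete type cannot make the comparison fail.
fn same_object<T: ?Sized>(a: &Arc<T>, b: &Arc<T>) -> bool {
    let a = Arc::as_ptr(a) as *const ();
    let b = Arc::as_ptr(b) as *const ();
    a == b
}
```

Two clones of the same `Arc<dyn Layer>` compare equal, while two separately allocated layers with identical contents do not.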
Follow-up to #3536: actually use the new `Debug` when replacing the
layers, and use replacement in the manual eviction endpoint.
It turns out the two paths share a lot of `Replacement` handling, but
this PR does not unify the two (need 3). There are also upcoming
refactorings from other PRs to this.
This patch adds basic access statistics for historic layers
and exposes them in the management API's `LayerMapInfo`.
We record the accesses in the `{Delta,Image}Layer::load()` function
because it's the common path of
* page_service (`Timeline::get_reconstruct_data()`)
* Compaction (`PersistentLayer::iter()` and `PersistentLayer::key_iter()`)
The stats survive residence status changes, and record these as well.
When scraping the layer map endpoint to record its evolution over time,
one must account for stat resets because they are in-memory only and
will reset on pageserver restart.
Use the launch timestamp header added by #3527 to identify pageserver restarts.
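A scraper could account for resets roughly as sketched below. `Scrape` and `accesses_since` are hypothetical, and the sketch assumes the scraper stores the launch timestamp header value alongside each counter it reads; the header name and the counter layout are not specified here.

```rust
// One scrape of a single layer's access counter (hypothetical shape).
#[derive(Debug, Clone, PartialEq)]
struct Scrape {
    // Value of the launch timestamp header seen with this response.
    launch_timestamp: String,
    // In-memory access counter as reported by the layer map endpoint.
    access_count: u64,
}

// Number of accesses that happened between two consecutive scrapes.
fn accesses_since(prev: &Scrape, cur: &Scrape) -> u64 {
    if prev.launch_timestamp != cur.launch_timestamp {
        // The pageserver restarted: the counter started again from zero,
        // so everything currently reported happened after the restart.
        cur.access_count
    } else {
        cur.access_count.saturating_sub(prev.access_count)
    }
}
```

Accesses made between the previous scrape and the restart itself are lost in this scheme, since the stats are in-memory only.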
This is PR https://github.com/neondatabase/neon/pull/3496
Add new pageserver config setting `cached_metric_collection_interval`
with default `1 hour`.
This setting controls how often unchanged cached consumption metrics are sent to
the HTTP endpoint.
This is a workaround for billing service limitations.
Fixes #3485
Follow-up to #3513.
This removes the old blanket `std::fmt::Debug` impl on `dyn Layer` which
did not seem to be used from anywhere (no compilation errors after
removing).
Adds `std::fmt::Debug` requirement and implementations for `trait Layer`
implementors:
- LayerDescriptor (derived)
- RemoteLayer (manual)
- DeltaLayer (manual)
- ImageLayer (manual)
Manual implementations are used to skip PageserverConf, tenant and
timeline ids, large collections.
Adds and adjusts some doc comments to be more rustdoc-like.
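A manual `Debug` implementation that skips bulky fields can be sketched with `debug_struct` and `finish_non_exhaustive`. `ImageLayerSketch` and its fields are hypothetical stand-ins, not the real struct layout.

```rust
use std::fmt;

// Hypothetical layer struct; `conf` and `index` stand in for the
// configuration and large collections that should not be printed.
#[allow(dead_code)] // the skipped fields are never read in this sketch
struct ImageLayerSketch {
    key_start: u64,
    lsn: u64,
    conf: Vec<String>,
    index: Vec<u8>,
}

impl fmt::Debug for ImageLayerSketch {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        f.debug_struct("ImageLayerSketch")
            .field("key_start", &self.key_start)
            .field("lsn", &self.lsn)
            // Prints `..` to signal that some fields were left out.
            .finish_non_exhaustive()
    }
}
```

Formatting such a value with `{:?}` prints only the two listed fields followed by `..`, regardless of how large the skipped collections are.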
## Describe your changes
Expose the currently calculated synthetic size as a Prometheus metric
## Issue ticket number and link
#3509
## Checklist before requesting a review
- [X] I have performed a self-review of my code.
- [ ] If it is a core feature, I have added thorough tests.
- [ ] Do we need to implement analytics? if so did you add the relevant
metrics to the dashboard?
- [ ] If this PR requires public announcement, mark it with
/release-notes label and add several sentences in this section.
The PR adds an endpoint to show tenant's current config: `GET
/v1/tenant/:tenant_id/config`
Tenant's config consists of two parts: tenant overrides (could be
changed via other management API requests) and the default part,
substituting all missing overrides (constant, hardcoded in pageserver).
The API returns the custom overrides and the final tenant config, after
applying all the defaults.
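The relation between overrides, defaults and the final config can be sketched as below. The struct names, the two settings and the default values are placeholders chosen for illustration, not the pageserver's real definitions.

```rust
// Tenant overrides: every setting is optional (hypothetical subset).
#[derive(Debug, Clone, PartialEq, Default)]
struct TenantOverrides {
    checkpoint_distance: Option<u64>,
    gc_horizon: Option<u64>,
}

// Final config after applying the defaults.
#[derive(Debug, Clone, PartialEq)]
struct EffectiveConfig {
    checkpoint_distance: u64,
    gc_horizon: u64,
}

// Hardcoded defaults; the numbers are arbitrary placeholders.
const DEFAULTS: EffectiveConfig = EffectiveConfig {
    checkpoint_distance: 256,
    gc_horizon: 64,
};

// Every missing override is substituted by the corresponding default.
fn effective(overrides: &TenantOverrides) -> EffectiveConfig {
    EffectiveConfig {
        checkpoint_distance: overrides
            .checkpoint_distance
            .unwrap_or(DEFAULTS.checkpoint_distance),
        gc_horizon: overrides.gc_horizon.unwrap_or(DEFAULTS.gc_horizon),
    }
}
```

Replacing the stored overrides wholesale means that a request omitting a setting reverts it to the default, which an update-only merge could not express.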
Along the way, it had to fix a few things in the config:
* allow shortening the JSON version by omitting all `null`s (the same way
the TOML serializer behaves by default), and understand that shortened
format when deserializing. A unit test is added.
* fix a bug where the `PUT /v1/tenant/config` endpoint rewrote the local
file with what came in the request, but only updated the in-memory
state (without replacing the old values).
This was uncovered while adjusting the e2e test and is fixed by doing the
replacement everywhere; otherwise there is no way to revert existing
overrides. Fixes #3471 (commit
dc688affe8)
* fixes https://github.com/neondatabase/neon/issues/3472 by reordering
the config saving operations
This patch adds a LaunchTimestamp type to the `metrics` crate,
along with a `libmetric_` Prometheus metric.
The initial user is pageserver.
In addition to exposing the Prometheus metric, it also reproduces
the launch timestamp as a header in the API responses.
The motivation for this is that we plan to scrape the pageserver's
/v1/tenant/:tenant_id/timeline/:timeline_id/layer
HTTP endpoint over time. It will soon expose access metrics (#3496)
which reset upon process restart. We will use the pageserver's launch
ID to identify a restart between two scrape points.
However, there are other potential uses. For example, we could use
the Prometheus metric to annotate Grafana plots whenever the launch
timestamp changes.
Cc: #3486
Adds a method to replace a particular layer in the LayerMap for the
purposes of remote layer download and layer eviction. In those use cases
the read lock on the layer map needs to be released after the initial
search, but other operations could modify the layer map before the
replacing thread gets to run.
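The idea is a compare-and-replace: the caller passes the layer it saw during the initial search, and the replacement only happens if the map still holds that same layer. A minimal sketch under stated assumptions: `LayerMapSketch`, its `replace` method and the `Replacement` variants are illustrative, the map is keyed by name, and `Arc<String>` stands in for the layer type.

```rust
use std::collections::HashMap;
use std::sync::Arc;

// Outcome of a replacement attempt (hypothetical variants).
#[derive(Debug, PartialEq)]
enum Replacement {
    Replaced,
    NotFound,
    Unexpected,
}

struct LayerMapSketch {
    layers: HashMap<String, Arc<String>>,
}

impl LayerMapSketch {
    // Replaces the entry only if it is still the `expected` allocation,
    // i.e. nobody modified it after the caller's initial search.
    fn replace(&mut self, name: &str, expected: &Arc<String>, new: Arc<String>) -> Replacement {
        match self.layers.get_mut(name) {
            None => Replacement::NotFound,
            Some(current) => {
                // Arc<String> is a thin pointer, so Arc::ptr_eq is reliable here.
                if Arc::ptr_eq(&*current, expected) {
                    *current = new;
                    Replacement::Replaced
                } else {
                    Replacement::Unexpected
                }
            }
        }
    }
}
```

A second attempt with the same `expected` value reports `Unexpected`, because the map now holds the new layer.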
Co-authored-by: bojanserafimov <bojan.serafimov7@gmail.com>
## Describe your changes
Add libmetrics_build_info metrics with commit sha to storage_broker
/metrics, to match behaviour of proxy, pageserver and safekeeper.
The project/endpoint should be set in the original (non-as_ref'd) creds,
because we call `wake_compute` not only in `try_password_hack` but also
later in the connection retry logic.
This PR also removes the obsolete `as_ref` method and makes the code
simpler because we no longer need this complication after a recent
refactoring.
Further action points: finally introduce typestate in creds (planned).
We need neither a special enum variant for testing the file names nor
its special handling across the code.
Current tests are able to create regular layers with normal layer names,
as the PR shows.
Closes https://github.com/neondatabase/neon/issues/3439
Adds a set of commands to manipulate the layer map:
* dump the layer map contents
* evict the layer from the layer map (remove the local file, put the
remote layer in the layer map instead)
* download the layer (the operation that reverses the eviction)
The commands will change later, when the statistics are added on top, so
the swagger schema is not adjusted.
The commands might have issues with a large number of layers: no pagination
is done for the dump command, eviction and download commands look for
the layer to evict/download by iterating all layers sequentially and
comparing the layer names.
For now, that seems to be tolerable ("big" number of layers is ~2_000)
and further experiments are needed.
---------
Co-authored-by: Christian Schwarz <christian@neon.tech>
Adds two new tags, `run-extra-build-macos` and `run-extra-build-stats`
to trigger corresponding build jobs on any PR.
On every build for `main` or PR with `run-extra-build-stats` tag, publish a GitHub commit status with the link to the `cargo build --all --release --timings` report.
This patch adds a timed LRU cache implementation and a compute node info cache on top of that.
Cache entries might expire on their own (default ttl=5mins) or become invalid due to real-world events,
e.g. compute node scale-to-zero event, so we add a connection retry loop with a wake-up call.
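The expiry and invalidation rules can be sketched as follows. This is not the patch's `TimedLru`: it covers only the time-to-live part and omits LRU eviction, `TtlCacheSketch` and its methods are hypothetical, and the current time is passed in explicitly to keep the behaviour deterministic.

```rust
use std::collections::HashMap;
use std::time::{Duration, Instant};

struct Entry<V> {
    value: V,
    created_at: Instant,
}

// TTL-only cache sketch (no LRU eviction).
struct TtlCacheSketch<V> {
    ttl: Duration,
    entries: HashMap<String, Entry<V>>,
}

impl<V: Clone> TtlCacheSketch<V> {
    fn new(ttl: Duration) -> Self {
        Self { ttl, entries: HashMap::new() }
    }

    fn insert(&mut self, key: &str, value: V, now: Instant) {
        self.entries.insert(key.to_string(), Entry { value, created_at: now });
    }

    // Returns the value unless the entry has outlived its TTL.
    fn get(&mut self, key: &str, now: Instant) -> Option<V> {
        let ttl = self.ttl;
        let expired = self
            .entries
            .get(key)
            .map_or(false, |e| now.saturating_duration_since(e.created_at) >= ttl);
        if expired {
            self.entries.remove(key);
        }
        self.entries.get(key).map(|e| e.value.clone())
    }

    // Invalidates the entry the caller observed, identified by its creation
    // time. A newer entry stored under the same key is left untouched.
    fn invalidate(&mut self, key: &str, seen_created_at: Instant) -> bool {
        let stale = self
            .entries
            .get(key)
            .map_or(false, |e| e.created_at <= seen_created_at);
        if stale {
            self.entries.remove(key);
        }
        stale
    }
}
```

A connection retry loop would call `invalidate` after a failed connect and then repeat the wake-up call to obtain a fresh entry.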
Solved problems:
- [x] Find a decent LRU implementation.
- [x] Implement timed LRU on top of that.
- [x] Cache results of `proxy_wake_compute` API call.
- [x] Don't invalidate newer cache entries for the same key.
- [x] Add cmdline configuration knobs (requires some refactoring).
- [x] Add failed connection establishment metric.
- [x] Refactor auth backends to make things simpler (retries, cache
placement, etc).
- [x] Address review comments (add code comments + cleanup).
- [x] Retry `/proxy_wake_compute` if we couldn't connect to a compute
(e.g. stale cache entry).
- [x] Add high-level description for `TimedLru`.
TODOs (will be addressed later):
- [ ] Add cache metrics (hit, spurious hit, miss).
- [ ] Synchronize http requests across concurrent per-client tasks
(https://github.com/neondatabase/neon/pull/3331#issuecomment-1399216069).
- [ ] Cache results of `proxy_get_role_secret` API call.
- add parse_query_param()
- use Cow<> where possible
- move param parsing code to utils::http::request
This was originally PR https://github.com/neondatabase/neon/pull/3502
which targeted a different branch.
closes #3510