another rename

rename the crate and fix some compile errors that I missed a while back
2026-02-26 14:00:37 +00:00 · 2023-06-07 16:27:53 +02:00
156 changed files with 2742 additions and 6714 deletions
--- a/.config/hakari.toml
+++ b/.config/hakari.toml
@@ -4,7 +4,7 @@
 hakari-package = "workspace_hack"

 # Format for `workspace-hack = ...` lines in other Cargo.tomls. Requires cargo-hakari 0.9.8 or above.
-dep-format-version = "4"
+dep-format-version = "3"

 # Setting workspace.resolver = "2" in the root Cargo.toml is HIGHLY recommended.
 # Hakari works much better with the new feature resolver.
--- a/.github/ansible/prod.ap-southeast-1.hosts.yaml
+++ b/.github/ansible/prod.ap-southeast-1.hosts.yaml
@@ -17,7 +17,7 @@ storage:
          kind: "LayerAccessThreshold"
          period: "10m"
          threshold: &default_eviction_threshold "24h"
-        evictions_low_residence_duration_metric_threshold: *default_eviction_threshold
+      evictions_low_residence_duration_metric_threshold: *default_eviction_threshold
      remote_storage:
        bucket_name: "{{ bucket_name }}"
        bucket_region: "{{ bucket_region }}"
--- a/.github/ansible/prod.eu-central-1.hosts.yaml
+++ b/.github/ansible/prod.eu-central-1.hosts.yaml
@@ -17,7 +17,7 @@ storage:
          kind: "LayerAccessThreshold"
          period: "10m"
          threshold: &default_eviction_threshold "24h"
-        evictions_low_residence_duration_metric_threshold: *default_eviction_threshold
+      evictions_low_residence_duration_metric_threshold: *default_eviction_threshold
      remote_storage:
        bucket_name: "{{ bucket_name }}"
        bucket_region: "{{ bucket_region }}"
--- a/.github/ansible/prod.us-east-1.hosts.yaml
+++ b/.github/ansible/prod.us-east-1.hosts.yaml
@@ -1,50 +0,0 @@
-storage:
-  vars:
-    bucket_name: neon-prod-storage-us-east-1
-    bucket_region: us-east-1
-    console_mgmt_base_url: http://neon-internal-api.aws.neon.tech
-    broker_endpoint: http://storage-broker-lb.theta.us-east-1.internal.aws.neon.tech:50051
-    pageserver_config_stub:
-      pg_distrib_dir: /usr/local
-      metric_collection_endpoint: http://neon-internal-api.aws.neon.tech/billing/api/v1/usage_events
-      metric_collection_interval: 10min
-      disk_usage_based_eviction:
-        max_usage_pct: 85 # TODO: decrease to 80 after all pageservers are below 80
-        min_avail_bytes: 0
-        period: "10s"
-      tenant_config:
-        eviction_policy:
-          kind: "LayerAccessThreshold"
-          period: "10m"
-          threshold: &default_eviction_threshold "24h"
-        evictions_low_residence_duration_metric_threshold: *default_eviction_threshold
-      remote_storage:
-        bucket_name: "{{ bucket_name }}"
-        bucket_region: "{{ bucket_region }}"
-        prefix_in_bucket: "pageserver/v1"
-    safekeeper_s3_prefix: safekeeper/v1/wal
-    hostname_suffix: ""
-    remote_user: ssm-user
-    ansible_aws_ssm_region: us-east-1
-    ansible_aws_ssm_bucket_name: neon-prod-storage-us-east-1
-    console_region_id: aws-us-east-1
-    sentry_environment: production
-
-  children:
-    pageservers:
-      hosts:
-        pageserver-0.us-east-1.aws.neon.tech:
-          ansible_host: i-085222088b0d2e0c7
-        pageserver-1.us-east-1.aws.neon.tech:
-          ansible_host: i-0969d4f684d23a21e
-        pageserver-2.us-east-1.aws.neon.tech:
-          ansible_host: i-05dee87895da58dad
-
-    safekeepers:
-      hosts:
-        safekeeper-0.us-east-1.aws.neon.tech:
-          ansible_host: i-04ce739e88793d864
-        safekeeper-1.us-east-1.aws.neon.tech:
-          ansible_host: i-0e9e6c9227fb81410
-        safekeeper-2.us-east-1.aws.neon.tech:
-          ansible_host: i-072f4dd86a327d52f
--- a/.github/ansible/prod.us-east-2.hosts.yaml
+++ b/.github/ansible/prod.us-east-2.hosts.yaml
@@ -17,7 +17,7 @@ storage:
          kind: "LayerAccessThreshold"
          period: "10m"
          threshold: &default_eviction_threshold "24h"
-        evictions_low_residence_duration_metric_threshold: *default_eviction_threshold
+      evictions_low_residence_duration_metric_threshold: *default_eviction_threshold
      remote_storage:
        bucket_name: "{{ bucket_name }}"
        bucket_region: "{{ bucket_region }}"
--- a/.github/ansible/prod.us-west-2.hosts.yaml
+++ b/.github/ansible/prod.us-west-2.hosts.yaml
@@ -17,7 +17,7 @@ storage:
          kind: "LayerAccessThreshold"
          period: "10m"
          threshold: &default_eviction_threshold "24h"
-        evictions_low_residence_duration_metric_threshold: *default_eviction_threshold
+      evictions_low_residence_duration_metric_threshold: *default_eviction_threshold
      remote_storage:
        bucket_name: "{{ bucket_name }}"
        bucket_region: "{{ bucket_region }}"
@@ -34,21 +34,13 @@ storage:
    pageservers:
      hosts:
        pageserver-0.us-west-2.aws.neon.tech:
-          ansible_host: i-0d9f6dfae0e1c780d
+          ansible_host: i-0d9f6dfae0e1c780d 
        pageserver-1.us-west-2.aws.neon.tech:
          ansible_host: i-0c834be1dddba8b3f
        pageserver-2.us-west-2.aws.neon.tech:
          ansible_host: i-051642d372c0a4f32
        pageserver-3.us-west-2.aws.neon.tech:
          ansible_host: i-00c3844beb9ad1c6b
-        pageserver-4.us-west-2.aws.neon.tech:
-          ansible_host: i-013263dd1c239adcc
-        pageserver-5.us-west-2.aws.neon.tech:
-          ansible_host: i-00ca6417c7bf96820
-        pageserver-6.us-west-2.aws.neon.tech:
-          ansible_host: i-01cdf7d2bc1433b6a
-        pageserver-7.us-west-2.aws.neon.tech:
-          ansible_host: i-02eec9b40617db5bc

    safekeepers:
      hosts:
@@ -57,16 +49,5 @@ storage:
        safekeeper-1.us-west-2.aws.neon.tech:
          ansible_host: i-074682f9d3c712e7c
        safekeeper-2.us-west-2.aws.neon.tech:
-          ansible_host: i-042b7efb1729d7966
-        safekeeper-3.us-west-2.aws.neon.tech:
-          ansible_host: i-089f6b9ef426dff76
-        safekeeper-4.us-west-2.aws.neon.tech:
-          ansible_host: i-0fe6bf912c4710c82
-        safekeeper-5.us-west-2.aws.neon.tech:
-          ansible_host: i-0a83c1c46d2b4e409
-        safekeeper-6.us-west-2.aws.neon.tech:
-          ansible_host: i-0fef5317b8fdc9f8d
-        safekeeper-7.us-west-2.aws.neon.tech:
-          ansible_host: i-0be739190d4289bf9
-        safekeeper-8.us-west-2.aws.neon.tech:
-          ansible_host: i-00e851803669e5cfe                    
+          ansible_host: i-042b7efb1729d7966 
+          
--- a/.github/ansible/staging.eu-central-1.hosts.yaml
+++ b/.github/ansible/staging.eu-central-1.hosts.yaml
@@ -1,47 +0,0 @@
-storage:
-  vars:
-    bucket_name: neon-dev-storage-eu-central-1
-    bucket_region: eu-central-1
-    # We only register/update storage in one preview console and manually copy to other instances
-    console_mgmt_base_url: http://neon-internal-api.helium.aws.neon.build
-    broker_endpoint: http://storage-broker-lb.alpha.eu-central-1.internal.aws.neon.build:50051
-    pageserver_config_stub:
-      pg_distrib_dir: /usr/local
-      metric_collection_endpoint: http://neon-internal-api.helium.aws.neon.build/billing/api/v1/usage_events
-      metric_collection_interval: 10min
-      disk_usage_based_eviction:
-        max_usage_pct: 80
-        min_avail_bytes: 0
-        period: "10s"
-      tenant_config:
-        eviction_policy:
-          kind: "LayerAccessThreshold"
-          period: "20m"
-          threshold: &default_eviction_threshold "20m"
-      evictions_low_residence_duration_metric_threshold: *default_eviction_threshold
-      remote_storage:
-        bucket_name: "{{ bucket_name }}"
-        bucket_region: "{{ bucket_region }}"
-        prefix_in_bucket: "pageserver/v1"
-    safekeeper_s3_prefix: safekeeper/v1/wal
-    hostname_suffix: ""
-    remote_user: ssm-user
-    ansible_aws_ssm_region: eu-central-1
-    ansible_aws_ssm_bucket_name: neon-dev-storage-eu-central-1
-    console_region_id: aws-eu-central-1
-    sentry_environment: staging
-
-  children:
-    pageservers:
-      hosts:
-        pageserver-0.eu-central-1.aws.neon.build:
-          ansible_host: i-011f93ec26cfba2d4
-
-    safekeepers:
-      hosts:
-        safekeeper-0.eu-central-1.aws.neon.build:
-          ansible_host: i-0ff026d27babf8ddd
-        safekeeper-1.eu-central-1.aws.neon.build:
-          ansible_host: i-03983a49ee54725d9
-        safekeeper-2.eu-central-1.aws.neon.build:
-          ansible_host: i-0bd025ecdb61b0db3
--- a/.github/ansible/staging.eu-west-1.hosts.yaml
+++ b/.github/ansible/staging.eu-west-1.hosts.yaml
@@ -17,7 +17,7 @@ storage:
          kind: "LayerAccessThreshold"
          period: "20m"
          threshold: &default_eviction_threshold "20m"
-        evictions_low_residence_duration_metric_threshold: *default_eviction_threshold
+      evictions_low_residence_duration_metric_threshold: *default_eviction_threshold
      remote_storage:
        bucket_name: "{{ bucket_name }}"
        bucket_region: "{{ bucket_region }}"
@@ -35,8 +35,6 @@ storage:
      hosts:
        pageserver-0.eu-west-1.aws.neon.build:
          ansible_host: i-01d496c5041c7f34c
-        pageserver-1.eu-west-1.aws.neon.build:
-          ansible_host: i-0e8013e239ce3928c

    safekeepers:
      hosts:
@@ -46,15 +44,3 @@ storage:
          ansible_host: i-06969ee1bf2958bfc
        safekeeper-2.eu-west-1.aws.neon.build:
          ansible_host: i-087892e9625984a0b
-        safekeeper-3.eu-west-1.aws.neon.build:
-          ansible_host: i-0a6f91660e99e8891
-        safekeeper-4.eu-west-1.aws.neon.build:
-          ansible_host: i-0012e309e28e7c249
-        safekeeper-5.eu-west-1.aws.neon.build:
-          ansible_host: i-085a2b1193287b32e
-        safekeeper-6.eu-west-1.aws.neon.build:
-          ansible_host: i-0c713248465ed0fbd
-        safekeeper-7.eu-west-1.aws.neon.build:
-          ansible_host: i-02ad231aed2a80b7a
-        safekeeper-8.eu-west-1.aws.neon.build:
-          ansible_host: i-0dbbd8ffef66efda8
--- a/.github/ansible/staging.us-east-2.hosts.yaml
+++ b/.github/ansible/staging.us-east-2.hosts.yaml
@@ -17,7 +17,7 @@ storage:
          kind: "LayerAccessThreshold"
          period: "20m"
          threshold: &default_eviction_threshold "20m"
-        evictions_low_residence_duration_metric_threshold: *default_eviction_threshold
+      evictions_low_residence_duration_metric_threshold: *default_eviction_threshold
      remote_storage:
        bucket_name: "{{ bucket_name }}"
        bucket_region: "{{ bucket_region }}"
@@ -48,9 +48,9 @@ storage:
      hosts:
        safekeeper-0.us-east-2.aws.neon.build:
          ansible_host: i-027662bd552bf5db0
+        safekeeper-1.us-east-2.aws.neon.build:
+          ansible_host: i-0171efc3604a7b907
        safekeeper-2.us-east-2.aws.neon.build:
          ansible_host: i-0de0b03a51676a6ce
-        safekeeper-3.us-east-2.aws.neon.build:
-          ansible_host: i-05f8ba2cda243bd18
        safekeeper-99.us-east-2.aws.neon.build:
          ansible_host: i-0d61b6a2ea32028d5
--- a/.github/helm-values/dev-eu-central-1-alpha.neon-storage-broker.yaml
+++ b/.github/helm-values/dev-eu-central-1-alpha.neon-storage-broker.yaml
@@ -1,52 +0,0 @@
-# Helm chart values for neon-storage-broker
-podLabels:
-  neon_env: staging
-  neon_service: storage-broker
-
-# Use L4 LB
-service:
-  # service.annotations -- Annotations to add to the service
-  annotations:
-    service.beta.kubernetes.io/aws-load-balancer-type: external  # use newer AWS Load Balancer Controller
-    service.beta.kubernetes.io/aws-load-balancer-nlb-target-type: ip
-    service.beta.kubernetes.io/aws-load-balancer-scheme: internal  # deploy LB to private subnet
-    # assign service to this name at external-dns
-    external-dns.alpha.kubernetes.io/hostname: storage-broker-lb.alpha.eu-central-1.internal.aws.neon.build
-  # service.type -- Service type
-  type: LoadBalancer
-  # service.port -- broker listen port
-  port: 50051
-
-ingress:
-  enabled: false
-
-metrics:
-  enabled: false
-
-extraManifests:
-  - apiVersion: operator.victoriametrics.com/v1beta1
-    kind: VMServiceScrape
-    metadata:
-      name: "{{ include \"neon-storage-broker.fullname\" . }}"
-      labels:
-        helm.sh/chart: neon-storage-broker-{{ .Chart.Version }}
-        app.kubernetes.io/name: neon-storage-broker
-        app.kubernetes.io/instance: neon-storage-broker
-        app.kubernetes.io/version: "{{ .Chart.AppVersion }}"
-        app.kubernetes.io/managed-by: Helm
-      namespace: "{{ .Release.Namespace }}"
-    spec:
-      selector:
-        matchLabels:
-          app.kubernetes.io/name: "neon-storage-broker"
-      endpoints:
-        - port: broker
-          path: /metrics
-          interval: 10s
-          scrapeTimeout: 10s
-      namespaceSelector:
-        matchNames:
-          - "{{ .Release.Namespace }}"
-
-settings:
-  sentryEnvironment: "staging"
--- a/.github/helm-values/dev-eu-central-1-alpha.pg-sni-router.yaml
+++ b/.github/helm-values/dev-eu-central-1-alpha.pg-sni-router.yaml
@@ -1,19 +0,0 @@
-useCertManager: true
-
-replicaCount: 3
-
-exposedService:
-  # exposedService.port -- Exposed Service proxy port
-  port: 4432
-  annotations:
-    external-dns.alpha.kubernetes.io/hostname: "*.snirouter.alpha.eu-central-1.internal.aws.neon.build"
-
-settings:
-  domain: "*.snirouter.alpha.eu-central-1.internal.aws.neon.build"
-  sentryEnvironment: "staging"
-
-imagePullSecrets:
-  - name: docker-hub-neon
-
-metrics:
-  enabled: false
--- a/.github/helm-values/dev-eu-west-1-zeta.neon-proxy-scram.yaml
+++ b/.github/helm-values/dev-eu-west-1-zeta.neon-proxy-scram.yaml
@@ -7,13 +7,13 @@ deploymentStrategy:
    maxSurge: 100%
    maxUnavailable: 50%

-# Delay the kill signal by 5 minutes (5 * 60)
+# Delay the kill signal by 7 days (7 * 24 * 60 * 60)
 # The pod(s) will stay in Terminating, keeps the existing connections
 # but doesn't receive new ones
 containerLifecycle:
  preStop:
    exec:
-      command: ["/bin/sh", "-c", "sleep 300"]
+      command: ["/bin/sh", "-c", "sleep 604800"]
 terminationGracePeriodSeconds: 604800

 image:
@@ -23,7 +23,6 @@ settings:
  authBackend: "console"
  authEndpoint: "http://neon-internal-api.aws.neon.build/management/api/v2"
  domain: "*.eu-west-1.aws.neon.build"
-  otelExporterOtlpEndpoint: "https://otel-collector.zeta.eu-west-1.internal.aws.neon.build"
  sentryEnvironment: "staging"
  wssPort: 8443
  metricCollectionEndpoint: "http://neon-internal-api.aws.neon.build/billing/api/v1/usage_events"
--- a/.github/helm-values/dev-eu-west-1-zeta.pg-sni-router.yaml
+++ b/.github/helm-values/dev-eu-west-1-zeta.pg-sni-router.yaml
@@ -1,19 +0,0 @@
-useCertManager: true
-
-replicaCount: 3
-
-exposedService:
-  # exposedService.port -- Exposed Service proxy port
-  port: 4432
-  annotations:
-    external-dns.alpha.kubernetes.io/hostname: "*.snirouter.zeta.eu-west-1.internal.aws.neon.build"
-
-settings:
-  domain: "*.snirouter.zeta.eu-west-1.internal.aws.neon.build"
-  sentryEnvironment: "staging"
-
-imagePullSecrets:
-  - name: docker-hub-neon
-
-metrics:
-  enabled: false
--- a/.github/helm-values/dev-us-east-2-beta.neon-proxy-link.yaml
+++ b/.github/helm-values/dev-us-east-2-beta.neon-proxy-link.yaml
@@ -9,7 +9,6 @@ settings:
  authEndpoint: "https://console.stage.neon.tech/authenticate_proxy_request/"
  uri: "https://console.stage.neon.tech/psql_session/"
  domain: "pg.neon.build"
-  otelExporterOtlpEndpoint: "https://otel-collector.beta.us-east-2.internal.aws.neon.build"
  sentryEnvironment: "staging"
  metricCollectionEndpoint: "http://neon-internal-api.aws.neon.build/billing/api/v1/usage_events"
  metricCollectionInterval: "1min"
--- a/.github/helm-values/dev-us-east-2-beta.neon-proxy-scram-legacy.yaml
+++ b/.github/helm-values/dev-us-east-2-beta.neon-proxy-scram-legacy.yaml
@@ -1,22 +1,6 @@
 # Helm chart values for neon-proxy-scram.
 # This is a YAML-formatted file.

-deploymentStrategy:
-  type: RollingUpdate
-  rollingUpdate:
-    maxSurge: 100%
-    maxUnavailable: 50%
-
-# Delay the kill signal by 5 minutes (5 * 60)
-# The pod(s) will stay in Terminating, keeps the existing connections
-# but doesn't receive new ones
-containerLifecycle:
-  preStop:
-    exec:
-      command: ["/bin/sh", "-c", "sleep 300"]
-terminationGracePeriodSeconds: 604800
-
-
 image:
  repository: neondatabase/neon

@@ -24,7 +8,6 @@ settings:
  authBackend: "console"
  authEndpoint: "http://neon-internal-api.aws.neon.build/management/api/v2"
  domain: "*.cloud.stage.neon.tech"
-  otelExporterOtlpEndpoint: "https://otel-collector.beta.us-east-2.internal.aws.neon.build"
  sentryEnvironment: "staging"
  wssPort: 8443
  metricCollectionEndpoint: "http://neon-internal-api.aws.neon.build/billing/api/v1/usage_events"
--- a/.github/helm-values/dev-us-east-2-beta.neon-proxy-scram.yaml
+++ b/.github/helm-values/dev-us-east-2-beta.neon-proxy-scram.yaml
@@ -7,16 +7,15 @@ deploymentStrategy:
    maxSurge: 100%
    maxUnavailable: 50%

-# Delay the kill signal by 5 minutes (5 * 60)
+# Delay the kill signal by 7 days (7 * 24 * 60 * 60)
 # The pod(s) will stay in Terminating, keeps the existing connections
 # but doesn't receive new ones
 containerLifecycle:
  preStop:
    exec:
-      command: ["/bin/sh", "-c", "sleep 300"]
+      command: ["/bin/sh", "-c", "sleep 604800"]
 terminationGracePeriodSeconds: 604800

-
 image:
  repository: neondatabase/neon

@@ -25,7 +24,6 @@ settings:
  authEndpoint: "http://neon-internal-api.aws.neon.build/management/api/v2"
  domain: "*.us-east-2.aws.neon.build"
  extraDomains: ["*.us-east-2.postgres.zenith.tech", "*.us-east-2.retooldb-staging.com"]
-  otelExporterOtlpEndpoint: "https://otel-collector.beta.us-east-2.internal.aws.neon.build"
  sentryEnvironment: "staging"
  wssPort: 8443
  metricCollectionEndpoint: "http://neon-internal-api.aws.neon.build/billing/api/v1/usage_events"
--- a/.github/helm-values/dev-us-east-2-beta.pg-sni-router.yaml
+++ b/.github/helm-values/dev-us-east-2-beta.pg-sni-router.yaml
@@ -1,19 +0,0 @@
-useCertManager: true
-
-replicaCount: 3
-
-exposedService:
-  # exposedService.port -- Exposed Service proxy port
-  port: 4432
-  annotations:
-    external-dns.alpha.kubernetes.io/hostname: "*.snirouter.beta.us-east-2.internal.aws.neon.build"
-
-settings:
-  domain: "*.snirouter.beta.us-east-2.internal.aws.neon.build"
-  sentryEnvironment: "staging"
-
-imagePullSecrets:
-  - name: docker-hub-neon
-
-metrics:
-  enabled: false
--- a/.github/helm-values/preview-template.neon-proxy-scram.yaml
+++ b/.github/helm-values/preview-template.neon-proxy-scram.yaml
@@ -1,67 +0,0 @@
-# Helm chart values for neon-proxy-scram.
-# This is a YAML-formatted file.
-
-deploymentStrategy:
-  type: RollingUpdate
-  rollingUpdate:
-    maxSurge: 100%
-    maxUnavailable: 50%
-
-image:
-  repository: neondatabase/neon
-
-settings:
-  authBackend: "console"
-  authEndpoint: "http://neon-internal-api.${PREVIEW_NAME}.aws.neon.build/management/api/v2"
-  domain: "*.cloud.${PREVIEW_NAME}.aws.neon.build"
-  sentryEnvironment: "staging"
-  wssPort: 8443
-  metricCollectionEndpoint: "http://neon-internal-api.${PREVIEW_NAME}.aws.neon.build/billing/api/v1/usage_events"
-  metricCollectionInterval: "1min"
-
-# -- Additional labels for neon-proxy pods
-podLabels:
-  neon_service: proxy-scram
-  neon_env: test
-  neon_region: ${PREVIEW_NAME}.eu-central-1
-
-
-exposedService:
-  annotations:
-    service.beta.kubernetes.io/aws-load-balancer-type: external
-    service.beta.kubernetes.io/aws-load-balancer-nlb-target-type: ip
-    service.beta.kubernetes.io/aws-load-balancer-scheme: internet-facing
-    external-dns.alpha.kubernetes.io/hostname: cloud.${PREVIEW_NAME}.aws.neon.build
-  httpsPort: 443
-
-#metrics:
-#  enabled: true
-#  serviceMonitor:
-#    enabled: true
-#    selector:
-#      release: kube-prometheus-stack
-
-extraManifests:
-  - apiVersion: operator.victoriametrics.com/v1beta1
-    kind: VMServiceScrape
-    metadata:
-      name: "{{ include \"neon-proxy.fullname\" . }}"
-      labels:
-        helm.sh/chart: neon-proxy-{{ .Chart.Version }}
-        app.kubernetes.io/name: neon-proxy
-        app.kubernetes.io/instance: "{{ include \"neon-proxy.fullname\" . }}"
-        app.kubernetes.io/version: "{{ .Chart.AppVersion }}"
-        app.kubernetes.io/managed-by: Helm
-      namespace: "{{ .Release.Namespace }}"
-    spec:
-      selector:
-        matchLabels:
-          app.kubernetes.io/name: "neon-proxy"
-      endpoints:
-        - port: http
-          path: /metrics
-          interval: 10s
-          scrapeTimeout: 10s
-      namespaceSelector:
-        matchNames:
-          - "{{ .Release.Namespace }}"
--- a/.github/helm-values/prod-ap-southeast-1-epsilon.neon-proxy-scram.yaml
+++ b/.github/helm-values/prod-ap-southeast-1-epsilon.neon-proxy-scram.yaml
@@ -7,13 +7,13 @@ deploymentStrategy:
    maxSurge: 100%
    maxUnavailable: 50%

-# Delay the kill signal by 5 minutes (5 * 60)
+# Delay the kill signal by 7 days (7 * 24 * 60 * 60)
 # The pod(s) will stay in Terminating, keeps the existing connections
 # but doesn't receive new ones
 containerLifecycle:
  preStop:
    exec:
-      command: ["/bin/sh", "-c", "sleep 300"]
+      command: ["/bin/sh", "-c", "sleep 604800"]
 terminationGracePeriodSeconds: 604800


--- a/.github/helm-values/prod-ap-southeast-1-epsilon.pg-sni-router.yaml
+++ b/.github/helm-values/prod-ap-southeast-1-epsilon.pg-sni-router.yaml
@@ -1,19 +0,0 @@
-useCertManager: true
-
-replicaCount: 3
-
-exposedService:
-  # exposedService.port -- Exposed Service proxy port
-  port: 4432
-  annotations:
-    external-dns.alpha.kubernetes.io/hostname: "*.snirouter.epsilon.ap-southeast-1.internal.aws.neon.tech"
-
-settings:
-  domain: "*.snirouter.epsilon.ap-southeast-1.internal.aws.neon.tech"
-  sentryEnvironment: "production"
-
-imagePullSecrets:
-  - name: docker-hub-neon
-
-metrics:
-  enabled: false
--- a/.github/helm-values/prod-eu-central-1-gamma.neon-proxy-scram.yaml
+++ b/.github/helm-values/prod-eu-central-1-gamma.neon-proxy-scram.yaml
@@ -7,13 +7,13 @@ deploymentStrategy:
    maxSurge: 100%
    maxUnavailable: 50%

-# Delay the kill signal by 5 minutes (5 * 60)
+# Delay the kill signal by 7 days (7 * 24 * 60 * 60)
 # The pod(s) will stay in Terminating, keeps the existing connections
 # but doesn't receive new ones
 containerLifecycle:
  preStop:
    exec:
-      command: ["/bin/sh", "-c", "sleep 300"]
+      command: ["/bin/sh", "-c", "sleep 604800"]
 terminationGracePeriodSeconds: 604800


--- a/.github/helm-values/prod-eu-central-1-gamma.pg-sni-router.yaml
+++ b/.github/helm-values/prod-eu-central-1-gamma.pg-sni-router.yaml
@@ -1,19 +0,0 @@
-useCertManager: true
-
-replicaCount: 3
-
-exposedService:
-  # exposedService.port -- Exposed Service proxy port
-  port: 4432
-  annotations:
-    external-dns.alpha.kubernetes.io/hostname: "*.snirouter.gamma.eu-central-1.internal.aws.neon.tech"
-
-settings:
-  domain: "*.snirouter.gamma.eu-central-1.internal.aws.neon.tech"
-  sentryEnvironment: "production"
-
-imagePullSecrets:
-  - name: docker-hub-neon
-
-metrics:
-  enabled: false
--- a/.github/helm-values/prod-us-east-1-theta.neon-proxy-scram.yaml
+++ b/.github/helm-values/prod-us-east-1-theta.neon-proxy-scram.yaml
@@ -1,69 +0,0 @@
-# Helm chart values for neon-proxy-scram.
-# This is a YAML-formatted file.
-
-deploymentStrategy:
-  type: RollingUpdate
-  rollingUpdate:
-    maxSurge: 100%
-    maxUnavailable: 50%
-
-# Delay the kill signal by 5 minutes (5 * 60)
-# The pod(s) will stay in Terminating, keeps the existing connections
-# but doesn't receive new ones
-containerLifecycle:
-  preStop:
-    exec:
-      command: ["/bin/sh", "-c", "sleep 300"]
-terminationGracePeriodSeconds: 604800
-
-image:
-  repository: neondatabase/neon
-
-settings:
-  authBackend: "console"
-  authEndpoint: "http://neon-internal-api.aws.neon.tech/management/api/v2"
-  domain: "*.us-east-1.aws.neon.tech"
-  # *.us-east-1.retooldb.com hasn't been delegated yet.
-  extraDomains: ["*.us-east-1.postgres.vercel-storage.com"]
-  sentryEnvironment: "production"
-  wssPort: 8443
-  metricCollectionEndpoint: "http://neon-internal-api.aws.neon.tech/billing/api/v1/usage_events"
-  metricCollectionInterval: "10min"
-
-podLabels:
-  neon_service: proxy-scram
-  neon_env: prod
-  neon_region: us-east-1
-
-exposedService:
-  annotations:
-    service.beta.kubernetes.io/aws-load-balancer-type: external
-    service.beta.kubernetes.io/aws-load-balancer-nlb-target-type: ip
-    service.beta.kubernetes.io/aws-load-balancer-scheme: internet-facing
-    external-dns.alpha.kubernetes.io/hostname: us-east-1.aws.neon.tech
-  httpsPort: 443
-
-extraManifests:
-  - apiVersion: operator.victoriametrics.com/v1beta1
-    kind: VMServiceScrape
-    metadata:
-      name: "{{ include \"neon-proxy.fullname\" . }}"
-      labels:
-        helm.sh/chart: neon-proxy-{{ .Chart.Version }}
-        app.kubernetes.io/name: neon-proxy
-        app.kubernetes.io/instance: "{{ include \"neon-proxy.fullname\" . }}"
-        app.kubernetes.io/version: "{{ .Chart.AppVersion }}"
-        app.kubernetes.io/managed-by: Helm
-      namespace: "{{ .Release.Namespace }}"
-    spec:
-      selector:
-        matchLabels:
-          app.kubernetes.io/name: "neon-proxy"
-      endpoints:
-        - port: http
-          path: /metrics
-          interval: 10s
-          scrapeTimeout: 10s
-      namespaceSelector:
-        matchNames:
-          - "{{ .Release.Namespace }}"
--- a/.github/helm-values/prod-us-east-1-theta.neon-storage-broker.yaml
+++ b/.github/helm-values/prod-us-east-1-theta.neon-storage-broker.yaml
@@ -1,52 +0,0 @@
-# Helm chart values for neon-storage-broker
-podLabels:
-  neon_env: production
-  neon_service: storage-broker
-
-# Use L4 LB
-service:
-  # service.annotations -- Annotations to add to the service
-  annotations:
-    service.beta.kubernetes.io/aws-load-balancer-type: external  # use newer AWS Load Balancer Controller
-    service.beta.kubernetes.io/aws-load-balancer-nlb-target-type: ip
-    service.beta.kubernetes.io/aws-load-balancer-scheme: internal  # deploy LB to private subnet
-    # assign service to this name at external-dns
-    external-dns.alpha.kubernetes.io/hostname: storage-broker-lb.theta.us-east-1.internal.aws.neon.tech
-  # service.type -- Service type
-  type: LoadBalancer
-  # service.port -- broker listen port
-  port: 50051
-
-ingress:
-  enabled: false
-
-metrics:
-  enabled: false
-
-extraManifests:
-  - apiVersion: operator.victoriametrics.com/v1beta1
-    kind: VMServiceScrape
-    metadata:
-      name: "{{ include \"neon-storage-broker.fullname\" . }}"
-      labels:
-        helm.sh/chart: neon-storage-broker-{{ .Chart.Version }}
-        app.kubernetes.io/name: neon-storage-broker
-        app.kubernetes.io/instance: neon-storage-broker
-        app.kubernetes.io/version: "{{ .Chart.AppVersion }}"
-        app.kubernetes.io/managed-by: Helm
-      namespace: "{{ .Release.Namespace }}"
-    spec:
-      selector:
-        matchLabels:
-          app.kubernetes.io/name: "neon-storage-broker"
-      endpoints:
-        - port: broker
-          path: /metrics
-          interval: 10s
-          scrapeTimeout: 10s
-      namespaceSelector:
-        matchNames:
-          - "{{ .Release.Namespace }}"
-
-settings:
-  sentryEnvironment: "production"
--- a/.github/helm-values/prod-us-east-1-theta.pg-sni-router.yaml
+++ b/.github/helm-values/prod-us-east-1-theta.pg-sni-router.yaml
@@ -1,19 +0,0 @@
-useCertManager: true
-
-replicaCount: 3
-
-exposedService:
-  # exposedService.port -- Exposed Service proxy port
-  port: 4432
-  annotations:
-    external-dns.alpha.kubernetes.io/hostname: "*.snirouter.theta.us-east-1.internal.aws.neon.tech"
-
-settings:
-  domain: "*.snirouter.theta.us-east-1.internal.aws.neon.tech"
-  sentryEnvironment: "production"
-
-imagePullSecrets:
-  - name: docker-hub-neon
-
-metrics:
-  enabled: false
--- a/.github/helm-values/prod-us-east-2-delta.neon-proxy-scram.yaml
+++ b/.github/helm-values/prod-us-east-2-delta.neon-proxy-scram.yaml
@@ -7,13 +7,13 @@ deploymentStrategy:
    maxSurge: 100%
    maxUnavailable: 50%

-# Delay the kill signal by 5 minutes (5 * 60)
+# Delay the kill signal by 7 days (7 * 24 * 60 * 60)
 # The pod(s) will stay in Terminating, keeps the existing connections
 # but doesn't receive new ones
 containerLifecycle:
  preStop:
    exec:
-      command: ["/bin/sh", "-c", "sleep 300"]
+      command: ["/bin/sh", "-c", "sleep 604800"]
 terminationGracePeriodSeconds: 604800


--- a/.github/helm-values/prod-us-east-2-delta.pg-sni-router.yaml
+++ b/.github/helm-values/prod-us-east-2-delta.pg-sni-router.yaml
@@ -1,19 +0,0 @@
-useCertManager: true
-
-replicaCount: 3
-
-exposedService:
-  # exposedService.port -- Exposed Service proxy port
-  port: 4432
-  annotations:
-    external-dns.alpha.kubernetes.io/hostname: "*.snirouter.delta.us-east-2.internal.aws.neon.tech"
-
-settings:
-  domain: "*.snirouter.delta.us-east-2.internal.aws.neon.tech"
-  sentryEnvironment: "production"
-
-imagePullSecrets:
-  - name: docker-hub-neon
-
-metrics:
-  enabled: false
--- a/.github/helm-values/prod-us-west-2-eta.neon-proxy-scram-legacy.yaml
+++ b/.github/helm-values/prod-us-west-2-eta.neon-proxy-scram-legacy.yaml
@@ -7,13 +7,13 @@ deploymentStrategy:
    maxSurge: 100%
    maxUnavailable: 50%

-# Delay the kill signal by 5 minutes (5 * 60)
+# Delay the kill signal by 7 days (7 * 24 * 60 * 60)
 # The pod(s) will stay in Terminating, keeps the existing connections
 # but doesn't receive new ones
 containerLifecycle:
  preStop:
    exec:
-      command: ["/bin/sh", "-c", "sleep 300"]
+      command: ["/bin/sh", "-c", "sleep 604800"]
 terminationGracePeriodSeconds: 604800


--- a/.github/helm-values/prod-us-west-2-eta.neon-proxy-scram.yaml
+++ b/.github/helm-values/prod-us-west-2-eta.neon-proxy-scram.yaml
@@ -7,13 +7,13 @@ deploymentStrategy:
    maxSurge: 100%
    maxUnavailable: 50%

-# Delay the kill signal by 5 minutes (5 * 60)
+# Delay the kill signal by 7 days (7 * 24 * 60 * 60)
 # The pod(s) will stay in Terminating, keeps the existing connections
 # but doesn't receive new ones
 containerLifecycle:
  preStop:
    exec:
-      command: ["/bin/sh", "-c", "sleep 300"]
+      command: ["/bin/sh", "-c", "sleep 604800"]
 terminationGracePeriodSeconds: 604800


--- a/.github/helm-values/prod-us-west-2-eta.pg-sni-router.yaml
+++ b/.github/helm-values/prod-us-west-2-eta.pg-sni-router.yaml
@@ -1,19 +0,0 @@
-useCertManager: true
-
-replicaCount: 3
-
-exposedService:
-  # exposedService.port -- Exposed Service proxy port
-  port: 4432
-  annotations:
-    external-dns.alpha.kubernetes.io/hostname: "*.snirouter.eta.us-west-2.internal.aws.neon.tech"
-
-settings:
-  domain: "*.snirouter.eta.us-west-2.internal.aws.neon.tech"
-  sentryEnvironment: "production"
-
-imagePullSecrets:
-  - name: docker-hub-neon
-
-metrics:
-  enabled: false
--- a/.github/workflows/build_and_test.yml
+++ b/.github/workflows/build_and_test.yml
@@ -111,21 +111,8 @@ jobs:
      - name: Get postgres headers
        run: make postgres-headers -j$(nproc)

-      # cargo hack runs the given cargo subcommand (clippy in this case) for all feature combinations.
-      # This will catch compiler & clippy warnings in all feature combinations.
-      # TODO: use cargo hack for build and test as well, but, that's quite expensive.
-      # NB: keep clippy args in sync with ./run_clippy.sh
-      - run: |
-          CLIPPY_COMMON_ARGS="$( source .neon_clippy_args; echo "$CLIPPY_COMMON_ARGS")"
-          if [ "$CLIPPY_COMMON_ARGS" = "" ]; then
-            echo "No clippy args found in .neon_clippy_args"
-            exit 1
-          fi
-          echo "CLIPPY_COMMON_ARGS=${CLIPPY_COMMON_ARGS}" >> $GITHUB_ENV
-      - name: Run cargo clippy (debug)
-        run: cargo hack --feature-powerset clippy $CLIPPY_COMMON_ARGS
-      - name: Run cargo clippy (release)
-        run: cargo hack --feature-powerset clippy --release $CLIPPY_COMMON_ARGS
+      - name: Run cargo clippy
+        run: ./run_clippy.sh

      # Use `${{ !cancelled() }}` to run quick tests after the longer clippy run
      - name: Check formatting
@@ -554,7 +541,7 @@ jobs:
    container:
      image: 369495373322.dkr.ecr.eu-central-1.amazonaws.com/base:pinned
      options: --init
-    needs: [ promote-images, tag ]
+    needs: [ push-docker-hub, tag ]
    steps:
      - name: Set PR's status to pending and request a remote CI test
        run: |
@@ -597,7 +584,8 @@ jobs:
  neon-image:
    runs-on: [ self-hosted, gen3, large ]
    needs: [ tag ]
-    container: gcr.io/kaniko-project/executor:v1.9.2-debug
+    # https://github.com/GoogleContainerTools/kaniko/issues/2005
+    container: gcr.io/kaniko-project/executor:v1.7.0-debug
    defaults:
      run:
        shell: sh -eu {0}
@@ -609,32 +597,11 @@ jobs:
          submodules: true
          fetch-depth: 0

-      - name: Configure ECR and Docker Hub login
-        run: |
-          DOCKERHUB_AUTH=$(echo -n "${{ secrets.NEON_DOCKERHUB_USERNAME }}:${{ secrets.NEON_DOCKERHUB_PASSWORD }}" | base64)
-          echo "::add-mask::${DOCKERHUB_AUTH}"
-
-          cat <<-EOF > /kaniko/.docker/config.json
-            {
-              "auths": {
-                "https://index.docker.io/v1/": {
-                  "auth": "${DOCKERHUB_AUTH}"
-                }
-              },
-              "credHelpers": {
-                "369495373322.dkr.ecr.eu-central-1.amazonaws.com": "ecr-login"
-              }
-            }
-          EOF
+      - name: Configure ECR login
+        run: echo "{\"credsStore\":\"ecr-login\"}" > /kaniko/.docker/config.json

      - name: Kaniko build neon
-        run:
-          /kaniko/executor --reproducible --snapshot-mode=redo --skip-unused-stages --cache=true
-                           --cache-repo 369495373322.dkr.ecr.eu-central-1.amazonaws.com/cache
-                           --context .
-                           --build-arg GIT_VERSION=${{ github.sha }}
-                           --destination 369495373322.dkr.ecr.eu-central-1.amazonaws.com/neon:${{needs.tag.outputs.build-tag}}
-                           --destination neondatabase/neon:${{needs.tag.outputs.build-tag}}
+        run: /kaniko/executor --reproducible --snapshotMode=redo --skip-unused-stages --cache=true --cache-repo 369495373322.dkr.ecr.eu-central-1.amazonaws.com/cache --context . --build-arg GIT_VERSION=${{ github.sha }} --destination 369495373322.dkr.ecr.eu-central-1.amazonaws.com/neon:${{needs.tag.outputs.build-tag}}

      # Cleanup script fails otherwise - rm: cannot remove '/nvme/actions-runner/_work/_temp/_github_home/.ecr': Permission denied
      - name: Cleanup ECR folder
@@ -685,7 +652,7 @@ jobs:
  compute-tools-image:
    runs-on: [ self-hosted, gen3, large ]
    needs: [ tag ]
-    container: gcr.io/kaniko-project/executor:v1.9.2-debug
+    container: gcr.io/kaniko-project/executor:v1.7.0-debug
    defaults:
      run:
        shell: sh -eu {0}
@@ -694,41 +661,18 @@ jobs:
      - name: Checkout
        uses: actions/checkout@v1 # v3 won't work with kaniko

-      - name: Configure ECR and Docker Hub login
-        run: |
-          DOCKERHUB_AUTH=$(echo -n "${{ secrets.NEON_DOCKERHUB_USERNAME }}:${{ secrets.NEON_DOCKERHUB_PASSWORD }}" | base64)
-          echo "::add-mask::${DOCKERHUB_AUTH}"
-
-          cat <<-EOF > /kaniko/.docker/config.json
-            {
-              "auths": {
-                "https://index.docker.io/v1/": {
-                  "auth": "${DOCKERHUB_AUTH}"
-                }
-              },
-              "credHelpers": {
-                "369495373322.dkr.ecr.eu-central-1.amazonaws.com": "ecr-login"
-              }
-            }
-          EOF
+      - name: Configure ECR login
+        run: echo "{\"credsStore\":\"ecr-login\"}" > /kaniko/.docker/config.json

      - name: Kaniko build compute tools
-        run:
-          /kaniko/executor --reproducible --snapshot-mode=redo --skip-unused-stages --cache=true
-                           --cache-repo 369495373322.dkr.ecr.eu-central-1.amazonaws.com/cache
-                           --context .
-                           --build-arg GIT_VERSION=${{ github.sha }}
-                           --dockerfile Dockerfile.compute-tools
-                           --destination 369495373322.dkr.ecr.eu-central-1.amazonaws.com/compute-tools:${{needs.tag.outputs.build-tag}}
-                           --destination neondatabase/compute-tools:${{needs.tag.outputs.build-tag}}
+        run: /kaniko/executor --reproducible --snapshotMode=redo --skip-unused-stages --cache=true --cache-repo 369495373322.dkr.ecr.eu-central-1.amazonaws.com/cache --context . --build-arg GIT_VERSION=${{ github.sha }} --dockerfile Dockerfile.compute-tools --destination 369495373322.dkr.ecr.eu-central-1.amazonaws.com/compute-tools:${{needs.tag.outputs.build-tag}}

-      # Cleanup script fails otherwise - rm: cannot remove '/nvme/actions-runner/_work/_temp/_github_home/.ecr': Permission denied
      - name: Cleanup ECR folder
        run: rm -rf ~/.ecr

  compute-node-image:
    runs-on: [ self-hosted, gen3, large ]
-    container: gcr.io/kaniko-project/executor:v1.9.2-debug
+    container: gcr.io/kaniko-project/executor:v1.7.0-debug
    needs: [ tag ]
    strategy:
      fail-fast: false
@@ -745,36 +689,12 @@ jobs:
          submodules: true
          fetch-depth: 0

-      - name: Configure ECR and Docker Hub login
-        run: |
-          DOCKERHUB_AUTH=$(echo -n "${{ secrets.NEON_DOCKERHUB_USERNAME }}:${{ secrets.NEON_DOCKERHUB_PASSWORD }}" | base64)
-          echo "::add-mask::${DOCKERHUB_AUTH}"
-
-          cat <<-EOF > /kaniko/.docker/config.json
-            {
-              "auths": {
-                "https://index.docker.io/v1/": {
-                  "auth": "${DOCKERHUB_AUTH}"
-                }
-              },
-              "credHelpers": {
-                "369495373322.dkr.ecr.eu-central-1.amazonaws.com": "ecr-login"
-              }
-            }
-          EOF
+      - name: Configure ECR login
+        run: echo "{\"credsStore\":\"ecr-login\"}" > /kaniko/.docker/config.json

      - name: Kaniko build compute node with extensions
-        run:
-          /kaniko/executor --reproducible --snapshot-mode=redo --skip-unused-stages --cache=true
-                           --cache-repo 369495373322.dkr.ecr.eu-central-1.amazonaws.com/cache
-                           --context .
-                           --build-arg GIT_VERSION=${{ github.sha }}
-                           --build-arg PG_VERSION=${{ matrix.version }}
-                           --dockerfile Dockerfile.compute-node
-                           --destination 369495373322.dkr.ecr.eu-central-1.amazonaws.com/compute-node-${{ matrix.version }}:${{needs.tag.outputs.build-tag}}
-                           --destination neondatabase/compute-node-${{ matrix.version }}:${{needs.tag.outputs.build-tag}}
+        run: /kaniko/executor --reproducible --snapshotMode=redo --skip-unused-stages --cache=true --cache-repo 369495373322.dkr.ecr.eu-central-1.amazonaws.com/cache --context . --build-arg GIT_VERSION=${{ github.sha }} --build-arg PG_VERSION=${{ matrix.version }} --dockerfile Dockerfile.compute-node --destination 369495373322.dkr.ecr.eu-central-1.amazonaws.com/compute-node-${{ matrix.version }}:${{needs.tag.outputs.build-tag}}

-      # Cleanup script fails otherwise - rm: cannot remove '/nvme/actions-runner/_work/_temp/_github_home/.ecr': Permission denied
      - name: Cleanup ECR folder
        run: rm -rf ~/.ecr

@@ -866,8 +786,41 @@ jobs:
    runs-on: [ self-hosted, gen3, small ]
    needs: [ tag, test-images, vm-compute-node-image ]
    container: golang:1.19-bullseye
-    # Don't add if-condition here.
-    # The job should always be run because we have dependant other jobs that shouldn't be skipped
+    if: github.event_name != 'workflow_dispatch'
+
+    steps:
+      - name: Install Crane & ECR helper
+        if: |
+          (github.ref_name == 'main' || github.ref_name == 'release') &&
+          github.event_name != 'workflow_dispatch'
+        run: |
+          go install github.com/google/go-containerregistry/cmd/crane@31786c6cbb82d6ec4fb8eb79cd9387905130534e # v0.11.0
+          go install github.com/awslabs/amazon-ecr-credential-helper/ecr-login/cli/docker-credential-ecr-login@69c85dc22db6511932bbf119e1a0cc5c90c69a7f # v0.6.0
+
+      - name: Configure ECR login
+        run: |
+          mkdir /github/home/.docker/
+          echo "{\"credsStore\":\"ecr-login\"}" > /github/home/.docker/config.json
+
+      - name: Add latest tag to images
+        if: |
+          (github.ref_name == 'main' || github.ref_name == 'release') &&
+          github.event_name != 'workflow_dispatch'
+        run: |
+          crane tag 369495373322.dkr.ecr.eu-central-1.amazonaws.com/neon:${{needs.tag.outputs.build-tag}} latest
+          crane tag 369495373322.dkr.ecr.eu-central-1.amazonaws.com/compute-tools:${{needs.tag.outputs.build-tag}} latest
+          crane tag 369495373322.dkr.ecr.eu-central-1.amazonaws.com/compute-node-v14:${{needs.tag.outputs.build-tag}} latest
+          crane tag 369495373322.dkr.ecr.eu-central-1.amazonaws.com/vm-compute-node-v14:${{needs.tag.outputs.build-tag}} latest
+          crane tag 369495373322.dkr.ecr.eu-central-1.amazonaws.com/compute-node-v15:${{needs.tag.outputs.build-tag}} latest
+          crane tag 369495373322.dkr.ecr.eu-central-1.amazonaws.com/vm-compute-node-v15:${{needs.tag.outputs.build-tag}} latest
+
+      - name: Cleanup ECR folder
+        run: rm -rf ~/.ecr
+
+  push-docker-hub:
+    runs-on: [ self-hosted, dev, x64 ]
+    needs: [ promote-images, tag ]
+    container: golang:1.19-bullseye

    steps:
      - name: Install Crane & ECR helper
@@ -880,27 +833,31 @@ jobs:
          mkdir /github/home/.docker/
          echo "{\"credsStore\":\"ecr-login\"}" > /github/home/.docker/config.json

-      - name: Copy vm-compute-node images to Docker Hub
-        run: |
-          crane pull 369495373322.dkr.ecr.eu-central-1.amazonaws.com/vm-compute-node-v14:${{needs.tag.outputs.build-tag}} vm-compute-node-v14
-          crane pull 369495373322.dkr.ecr.eu-central-1.amazonaws.com/vm-compute-node-v15:${{needs.tag.outputs.build-tag}} vm-compute-node-v15
+      - name: Pull neon image from ECR
+        run: crane pull 369495373322.dkr.ecr.eu-central-1.amazonaws.com/neon:${{needs.tag.outputs.build-tag}} neon

-      - name: Add latest tag to images
-        if: |
-          (github.ref_name == 'main' || github.ref_name == 'release') &&
-           github.event_name != 'workflow_dispatch'
-        run: |
-          crane tag 369495373322.dkr.ecr.eu-central-1.amazonaws.com/neon:${{needs.tag.outputs.build-tag}} latest
-          crane tag 369495373322.dkr.ecr.eu-central-1.amazonaws.com/compute-tools:${{needs.tag.outputs.build-tag}} latest
-          crane tag 369495373322.dkr.ecr.eu-central-1.amazonaws.com/compute-node-v14:${{needs.tag.outputs.build-tag}} latest
-          crane tag 369495373322.dkr.ecr.eu-central-1.amazonaws.com/vm-compute-node-v14:${{needs.tag.outputs.build-tag}} latest
-          crane tag 369495373322.dkr.ecr.eu-central-1.amazonaws.com/compute-node-v15:${{needs.tag.outputs.build-tag}} latest
-          crane tag 369495373322.dkr.ecr.eu-central-1.amazonaws.com/vm-compute-node-v15:${{needs.tag.outputs.build-tag}} latest
+      - name: Pull compute tools image from ECR
+        run: crane pull 369495373322.dkr.ecr.eu-central-1.amazonaws.com/compute-tools:${{needs.tag.outputs.build-tag}} compute-tools
+
+      - name: Pull compute node v14 image from ECR
+        run: crane pull 369495373322.dkr.ecr.eu-central-1.amazonaws.com/compute-node-v14:${{needs.tag.outputs.build-tag}} compute-node-v14
+
+      - name: Pull vm compute node v14 image from ECR
+        run: crane pull 369495373322.dkr.ecr.eu-central-1.amazonaws.com/vm-compute-node-v14:${{needs.tag.outputs.build-tag}} vm-compute-node-v14
+
+      - name: Pull compute node v15 image from ECR
+        run: crane pull 369495373322.dkr.ecr.eu-central-1.amazonaws.com/compute-node-v15:${{needs.tag.outputs.build-tag}} compute-node-v15
+
+      - name: Pull vm compute node v15 image from ECR
+        run: crane pull 369495373322.dkr.ecr.eu-central-1.amazonaws.com/vm-compute-node-v15:${{needs.tag.outputs.build-tag}} vm-compute-node-v15
+
+      - name: Pull rust image from ECR
+        run: crane pull 369495373322.dkr.ecr.eu-central-1.amazonaws.com/rust:pinned rust

      - name: Push images to production ECR
        if: |
          (github.ref_name == 'main' || github.ref_name == 'release') &&
-           github.event_name != 'workflow_dispatch'
+          github.event_name != 'workflow_dispatch'
        run: |
          crane copy 369495373322.dkr.ecr.eu-central-1.amazonaws.com/neon:${{needs.tag.outputs.build-tag}} 093970136003.dkr.ecr.eu-central-1.amazonaws.com/neon:latest
          crane copy 369495373322.dkr.ecr.eu-central-1.amazonaws.com/compute-tools:${{needs.tag.outputs.build-tag}} 093970136003.dkr.ecr.eu-central-1.amazonaws.com/compute-tools:latest
@@ -915,12 +872,28 @@ jobs:
          echo "" > /github/home/.docker/config.json
          crane auth login -u ${{ secrets.NEON_DOCKERHUB_USERNAME }} -p ${{ secrets.NEON_DOCKERHUB_PASSWORD }} index.docker.io

-      - name: Push vm-compute-node to Docker Hub
-        run: |
-          crane push vm-compute-node-v14 neondatabase/vm-compute-node-v14:${{needs.tag.outputs.build-tag}}
-          crane push vm-compute-node-v15 neondatabase/vm-compute-node-v15:${{needs.tag.outputs.build-tag}}
+      - name: Push neon image to Docker Hub
+        run: crane push neon neondatabase/neon:${{needs.tag.outputs.build-tag}}

-      - name: Push latest tags to Docker Hub
+      - name: Push compute tools image to Docker Hub
+        run: crane push compute-tools neondatabase/compute-tools:${{needs.tag.outputs.build-tag}}
+
+      - name: Push compute node v14 image to Docker Hub
+        run: crane push compute-node-v14 neondatabase/compute-node-v14:${{needs.tag.outputs.build-tag}}
+
+      - name: Push vm compute node v14 image to Docker Hub
+        run: crane push vm-compute-node-v14 neondatabase/vm-compute-node-v14:${{needs.tag.outputs.build-tag}}
+
+      - name: Push compute node v15 image to Docker Hub
+        run: crane push compute-node-v15 neondatabase/compute-node-v15:${{needs.tag.outputs.build-tag}}
+
+      - name: Push vm compute node v15 image to Docker Hub
+        run: crane push vm-compute-node-v15 neondatabase/vm-compute-node-v15:${{needs.tag.outputs.build-tag}}
+
+      - name: Push rust image to Docker Hub
+        run: crane push rust neondatabase/rust:pinned
+
+      - name: Add latest tag to images in Docker Hub
        if: |
          (github.ref_name == 'main' || github.ref_name == 'release') &&
          github.event_name != 'workflow_dispatch'
@@ -940,7 +913,7 @@ jobs:
    container: 369495373322.dkr.ecr.eu-central-1.amazonaws.com/ansible:pinned
    # We need both storage **and** compute images for deploy, because control plane picks the compute version based on the storage version.
    # If it notices a fresh storage it may bump the compute version. And if compute image failed to build it may break things badly
-    needs: [ promote-images, tag, regress-tests ]
+    needs: [ push-docker-hub, tag, regress-tests ]
    if: |
      contains(github.event.pull_request.labels.*.name, 'deploy-test-storage') &&
      github.event_name != 'workflow_dispatch'
@@ -974,7 +947,7 @@ jobs:
  deploy:
    runs-on: [ self-hosted, gen3, small ]
    container: 369495373322.dkr.ecr.eu-central-1.amazonaws.com/ansible:latest
-    needs: [ promote-images, tag, regress-tests ]
+    needs: [ push-docker-hub, tag, regress-tests ]
    if: ( github.ref_name == 'main' || github.ref_name == 'release' ) && github.event_name != 'workflow_dispatch'
    steps:
      - name: Fix git ownership
@@ -1011,7 +984,7 @@ jobs:
    container:
      image: 369495373322.dkr.ecr.eu-central-1.amazonaws.com/rust:pinned
      options: --init
-    needs: [ promote-images, tag, regress-tests ]
+    needs: [ push-docker-hub, tag, regress-tests ]
    if: github.ref_name == 'release' && github.event_name != 'workflow_dispatch'
    steps:
      - name: Promote compatibility snapshot for the release
--- a/.github/workflows/deploy-dev.yml
+++ b/.github/workflows/deploy-dev.yml
@@ -27,11 +27,6 @@ on:
        required: true
        type: boolean
        default: true
-      deployPgSniRouter:
-        description: 'Deploy pg-sni-router'
-        required: true
-        type: boolean
-        default: true

 env:
  AWS_ACCESS_KEY_ID: ${{ secrets.AWS_ACCESS_KEY_DEV }}
@@ -53,8 +48,7 @@ jobs:
        shell: bash
    strategy:
      matrix:
-        # TODO(sergey): Fix storage deploy in eu-central-1
-        target_region: [ eu-west-1, us-east-2]
+        target_region: [ eu-west-1, us-east-2 ]
    environment:
      name: dev-${{ matrix.target_region }}
    steps:
@@ -139,53 +133,6 @@ jobs:
  
      - name: Cleanup helm folder
        run: rm -rf ~/.cache
-
-  deploy-preview-proxy-new:
-    runs-on: [ self-hosted, gen3, small ]
-    container: 369495373322.dkr.ecr.eu-central-1.amazonaws.com/ansible:pinned
-    if: inputs.deployProxy
-    defaults:
-      run:
-        shell: bash
-    strategy:
-      matrix:
-        include:
-          - target_region:  eu-central-1
-            target_cluster: dev-eu-central-1-alpha
-    environment:
-      name: dev-${{ matrix.target_region }}
-    steps:
-      - name: Checkout
-        uses: actions/checkout@v3
-        with:
-          submodules: true
-          fetch-depth: 0
-          ref: ${{ inputs.branch }}
-  
-      - name: Configure AWS Credentials
-        uses: aws-actions/configure-aws-credentials@v1-node16
-        with:
-          role-to-assume: arn:aws:iam::369495373322:role/github-runner
-          aws-region: eu-central-1
-          role-skip-session-tagging: true
-          role-duration-seconds: 1800
-  
-      - name: Configure environment
-        run: |
-          helm repo add neondatabase https://neondatabase.github.io/helm-charts
-          aws --region ${{ matrix.target_region }} eks update-kubeconfig --name  ${{ matrix.target_cluster }}
-  
-      - name: Re-deploy preview proxies
-        run: |
-          DOCKER_TAG=${{ inputs.dockerTag }}
-          for PREVIEW_NAME in helium argon krypton xenon radon oganesson hydrogen nitrogen oxygen fluorine chlorine; do
-            export PREVIEW_NAME
-            envsubst <.github/helm-values/preview-template.neon-proxy-scram.yaml >preview-${PREVIEW_NAME}.neon-proxy-scram.yaml
-            helm upgrade neon-proxy-scram-${PREVIEW_NAME} neondatabase/neon-proxy --namespace neon-proxy-${PREVIEW_NAME} --create-namespace --install --atomic -f preview-${PREVIEW_NAME}.neon-proxy-scram.yaml --set image.tag=${DOCKER_TAG} --set settings.sentryUrl=${{ secrets.SENTRY_URL_PROXY }} --wait --timeout 15m0s
-          done
-
-      - name: Cleanup helm folder
-        run: rm -rf ~/.cache
  
  deploy-storage-broker-new:
    runs-on: [ self-hosted, gen3, small ]
@@ -201,8 +148,6 @@ jobs:
            target_cluster: dev-us-east-2-beta
          - target_region:  eu-west-1
            target_cluster: dev-eu-west-1-zeta
-          - target_region:  eu-central-1
-            target_cluster: dev-eu-central-1-alpha
    environment:
      name: dev-${{ matrix.target_region }}
    steps:
@@ -232,49 +177,3 @@ jobs:
  
      - name: Cleanup helm folder
        run: rm -rf ~/.cache
-
-  deploy-pg-sni-router:
-    runs-on: [ self-hosted, gen3, small ]
-    container: 369495373322.dkr.ecr.eu-central-1.amazonaws.com/ansible:pinned
-    if: inputs.deployPgSniRouter
-    defaults:
-      run:
-        shell: bash
-    strategy:
-      matrix:
-        include:
-          - target_region:  us-east-2
-            target_cluster: dev-us-east-2-beta
-          - target_region:  eu-west-1
-            target_cluster: dev-eu-west-1-zeta
-          - target_region:  eu-central-1
-            target_cluster: dev-eu-central-1-alpha
-    environment:
-      name: dev-${{ matrix.target_region }}
-    steps:
-      - name: Checkout
-        uses: actions/checkout@v3
-        with:
-          submodules: true
-          fetch-depth: 0
-          ref: ${{ inputs.branch }}
-  
-      - name: Configure AWS Credentials
-        uses: aws-actions/configure-aws-credentials@v1-node16
-        with:
-          role-to-assume: arn:aws:iam::369495373322:role/github-runner
-          aws-region: eu-central-1
-          role-skip-session-tagging: true
-          role-duration-seconds: 1800
-  
-      - name: Configure environment
-        run: |
-          helm repo add neondatabase https://neondatabase.github.io/helm-charts
-          aws --region ${{ matrix.target_region }} eks update-kubeconfig --name  ${{ matrix.target_cluster }}
-  
-      - name: Deploy pg-sni-router
-        run:
-          helm upgrade neon-pg-sni-router neondatabase/neon-pg-sni-router --namespace neon-pg-sni-router --create-namespace --install --atomic -f .github/helm-values/${{ matrix.target_cluster }}.pg-sni-router.yaml --set image.tag=${{ inputs.dockerTag }} --set settings.sentryUrl=${{ secrets.SENTRY_URL_BROKER }} --wait --timeout 15m0s
-  
-      - name: Cleanup helm folder
-        run: rm -rf ~/.cache
--- a/.github/workflows/deploy-prod.yml
+++ b/.github/workflows/deploy-prod.yml
@@ -27,11 +27,6 @@ on:
        required: true
        type: boolean
        default: true
-      deployPgSniRouter:
-        description: 'Deploy pg-sni-router'
-        required: true
-        type: boolean
-        default: true
      disclamerAcknowledged:
        description: 'I confirm that there is an emergency and I can not use regular release workflow'
        required: true
@@ -54,7 +49,7 @@ jobs:
        shell: bash
    strategy:
      matrix:
-        target_region: [ us-east-2, us-west-2, eu-central-1, ap-southeast-1, us-east-1 ]
+        target_region: [ us-east-2, us-west-2, eu-central-1, ap-southeast-1 ]
    environment:
      name: prod-${{ matrix.target_region }}
    steps:
@@ -102,10 +97,6 @@ jobs:
            target_cluster: prod-ap-southeast-1-epsilon
            deploy_link_proxy: false
            deploy_legacy_scram_proxy: false
-          - target_region: us-east-1
-            target_cluster: prod-us-east-1-theta
-            deploy_link_proxy: false
-            deploy_legacy_scram_proxy: false
    environment:
      name: prod-${{ matrix.target_region }}
    steps:
@@ -156,8 +147,6 @@ jobs:
            target_cluster: prod-eu-central-1-gamma
          - target_region: ap-southeast-1
            target_cluster: prod-ap-southeast-1-epsilon
-          - target_region: us-east-1
-            target_cluster: prod-us-east-1-theta
    environment:
      name: prod-${{ matrix.target_region }}
    steps:
@@ -176,42 +165,3 @@ jobs:
      - name: Deploy storage-broker
        run:
          helm upgrade neon-storage-broker-lb neondatabase/neon-storage-broker --namespace neon-storage-broker-lb --create-namespace --install --atomic -f .github/helm-values/${{ matrix.target_cluster }}.neon-storage-broker.yaml --set image.tag=${{ inputs.dockerTag }} --set settings.sentryUrl=${{ secrets.SENTRY_URL_BROKER }} --wait --timeout 5m0s
-
-  deploy-pg-sni-router:
-    runs-on: prod
-    container: 093970136003.dkr.ecr.eu-central-1.amazonaws.com/ansible:latest
-    if: inputs.deployPgSniRouter && inputs.disclamerAcknowledged
-    defaults:
-      run:
-        shell: bash
-    strategy:
-      matrix:
-        include:
-          - target_region:  us-east-2
-            target_cluster: prod-us-east-2-delta
-          - target_region:  us-west-2
-            target_cluster: prod-us-west-2-eta
-          - target_region: eu-central-1
-            target_cluster: prod-eu-central-1-gamma
-          - target_region: ap-southeast-1
-            target_cluster: prod-ap-southeast-1-epsilon
-          - target_region: us-east-1
-            target_cluster: prod-us-east-1-theta
-    environment:
-      name: prod-${{ matrix.target_region }}
-    steps:
-      - name: Checkout
-        uses: actions/checkout@v3
-        with:
-          submodules: true
-          fetch-depth: 0
-          ref: ${{ inputs.branch }}
-  
-      - name: Configure environment
-        run: |
-          helm repo add neondatabase https://neondatabase.github.io/helm-charts
-          aws --region ${{ matrix.target_region }} eks update-kubeconfig --name  ${{ matrix.target_cluster }}
-  
-      - name: Deploy pg-sni-router
-        run:
-          helm upgrade neon-pg-sni-router neondatabase/neon-pg-sni-router --namespace neon-pg-sni-router --create-namespace --install --atomic -f .github/helm-values/${{ matrix.target_cluster }}.pg-sni-router.yaml --set image.tag=${{ inputs.dockerTag }} --set settings.sentryUrl=${{ secrets.SENTRY_URL_BROKER }} --wait --timeout 15m0s
--- a/.neon_clippy_args
+++ b/.neon_clippy_args
@@ -1,4 +0,0 @@
-# * `-A unknown_lints` – do not warn about unknown lint suppressions
-#                        that people with newer toolchains might use
-# * `-D warnings`      - fail on any warnings (`cargo` returns non-zero exit status)
-export CLIPPY_COMMON_ARGS="--locked --workspace --all-targets -- -A unknown_lints -D warnings"
--- a/Cargo.lock
+++ b/Cargo.lock
--- a/Cargo.toml
+++ b/Cargo.toml
@@ -24,10 +24,10 @@ atty = "0.2.14"
 aws-config = { version = "0.51.0", default-features = false, features=["rustls"] }
 aws-sdk-s3 = "0.21.0"
 aws-smithy-http = "0.51.0"
-aws-types = "0.55"
+aws-types = "0.51.0"
 base64 = "0.13.0"
 bincode = "1.3"
-bindgen = "0.65"
+bindgen = "0.61"
 bstr = "1.0"
 byteorder = "1.4"
 bytes = "1.0"
@@ -50,7 +50,7 @@ git-version = "0.3"
 hashbrown = "0.13"
 hashlink = "0.8.1"
 hex = "0.4"
-hex-literal = "0.4"
+hex-literal = "0.3"
 hmac = "0.12.1"
 hostname = "0.3.1"
 humantime = "2.1"
@@ -62,7 +62,6 @@ jsonwebtoken = "8"
 libc = "0.2"
 md5 = "0.7.0"
 memoffset = "0.8"
-native-tls = "0.2"
 nix = "0.26"
 notify = "5.0.0"
 num_cpus = "1.15"
@@ -81,18 +80,18 @@ reqwest = { version = "0.11", default-features = false, features = ["rustls-tls"
 reqwest-tracing = { version = "0.4.0", features = ["opentelemetry_0_18"] }
 reqwest-middleware = "0.2.0"
 routerify = "3"
-rpds = "0.13"
+rpds = "0.12.0"
 rustls = "0.20"
 rustls-pemfile = "1"
 rustls-split = "0.3"
 scopeguard = "1.1"
-sentry = { version = "0.30", default-features = false, features = ["backtrace", "contexts", "panic", "rustls", "reqwest" ] }
+sentry = { version = "0.29", default-features = false, features = ["backtrace", "contexts", "panic", "rustls", "reqwest" ] }
 serde = { version = "1.0", features = ["derive"] }
 serde_json = "1"
 serde_with = "2.0"
 sha2 = "0.10.2"
 signal-hook = "0.3"
-socket2 = "0.5"
+socket2 = "0.4.4"
 strum = "0.24"
 strum_macros = "0.24"
 svg_fmt = "0.4.1"
@@ -107,29 +106,27 @@ tokio-postgres-rustls = "0.9.0"
 tokio-rustls = "0.23"
 tokio-stream = "0.1"
 tokio-util = { version = "0.7", features = ["io"] }
-toml = "0.7"
-toml_edit = "0.19"
-tonic = {version = "0.9", features = ["tls", "tls-roots"]}
+toml = "0.5"
+toml_edit = { version = "0.17", features = ["easy"] }
+tonic = {version = "0.8", features = ["tls", "tls-roots"]}
 tracing = "0.1"
-tracing-error = "0.2.0"
 tracing-opentelemetry = "0.18.0"
 tracing-subscriber = { version = "0.3", features = ["env-filter"] }
 url = "2.2"
 uuid = { version = "1.2", features = ["v4", "serde"] }
 walkdir = "2.3.2"
-webpki-roots = "0.23"
-x509-parser = "0.15"
+webpki-roots = "0.22.5"
+x509-parser = "0.14"

 ## TODO replace this with tracing
 env_logger = "0.10"
 log = "0.4"

 ## Libraries from neondatabase/ git forks, ideally with changes to be upstreamed
-postgres = { git = "https://github.com/neondatabase/rust-postgres.git", rev="0bc41d8503c092b040142214aac3cf7d11d0c19f" }
-postgres-native-tls = { git = "https://github.com/neondatabase/rust-postgres.git", rev="0bc41d8503c092b040142214aac3cf7d11d0c19f" }
-postgres-protocol = { git = "https://github.com/neondatabase/rust-postgres.git", rev="0bc41d8503c092b040142214aac3cf7d11d0c19f" }
-postgres-types = { git = "https://github.com/neondatabase/rust-postgres.git", rev="0bc41d8503c092b040142214aac3cf7d11d0c19f" }
-tokio-postgres = { git = "https://github.com/neondatabase/rust-postgres.git", rev="0bc41d8503c092b040142214aac3cf7d11d0c19f" }
+postgres = { git = "https://github.com/neondatabase/rust-postgres.git", rev="43e6db254a97fdecbce33d8bc0890accfd74495e" }
+postgres-protocol = { git = "https://github.com/neondatabase/rust-postgres.git", rev="43e6db254a97fdecbce33d8bc0890accfd74495e" }
+postgres-types = { git = "https://github.com/neondatabase/rust-postgres.git", rev="43e6db254a97fdecbce33d8bc0890accfd74495e" }
+tokio-postgres = { git = "https://github.com/neondatabase/rust-postgres.git", rev="43e6db254a97fdecbce33d8bc0890accfd74495e" }
 tokio-tar = { git = "https://github.com/neondatabase/tokio-tar.git", rev="404df61437de0feef49ba2ccdbdd94eb8ad6e142" }

 ## Other git libraries
@@ -157,20 +154,14 @@ workspace_hack = { version = "0.1", path = "./workspace_hack/" }
 ## Build dependencies
 criterion = "0.4"
 rcgen = "0.10"
-rstest = "0.17"
+rstest = "0.16"
 tempfile = "3.4"
-tonic-build = "0.9"
-
-[patch.crates-io]
+tonic-build = "0.8"

 # This is only needed for proxy's tests.
 # TODO: we should probably fork `tokio-postgres-rustls` instead.
-tokio-postgres = { git = "https://github.com/neondatabase/rust-postgres.git", rev="0bc41d8503c092b040142214aac3cf7d11d0c19f" }
-
-# Changes the MAX_THREADS limit from 4096 to 32768.
-# This is a temporary workaround for using tracing from many threads in safekeepers code,
-# until async safekeepers patch is merged to the main.
-sharded-slab = { git = "https://github.com/neondatabase/sharded-slab.git", rev="98d16753ab01c61f0a028de44167307a00efea00" }
+[patch.crates-io]
+tokio-postgres = { git = "https://github.com/neondatabase/rust-postgres.git", rev="43e6db254a97fdecbce33d8bc0890accfd74495e" }

 ################# Binary contents sections

--- a/11
+++ b/11
@@ -44,15 +44,7 @@ COPY --chown=nonroot . .
 # Show build caching stats to check if it was used in the end.
 # Has to be the part of the same RUN since cachepot daemon is killed in the end of this RUN, losing the compilation stats.
 RUN set -e \
-    && mold -run cargo build  \
-      --bin pg_sni_router  \
-      --bin pageserver  \
-      --bin pageserver_binutils  \
-      --bin draw_timeline_dir \
-      --bin safekeeper  \
-      --bin storage_broker  \
-      --bin proxy  \
-      --locked --release \
+    && mold -run cargo build --bin pageserver --bin pageserver_binutils --bin draw_timeline_dir --bin safekeeper --bin storage_broker --bin proxy --locked --release \
    && cachepot -s

 # Build final image
@@ -71,7 +63,6 @@ RUN set -e \
    && useradd -d /data neon \
    && chown -R neon:neon /data

-COPY --from=build --chown=neon:neon /home/nonroot/target/release/pg_sni_router       /usr/local/bin
 COPY --from=build --chown=neon:neon /home/nonroot/target/release/pageserver          /usr/local/bin
 COPY --from=build --chown=neon:neon /home/nonroot/target/release/pageserver_binutils /usr/local/bin
 COPY --from=build --chown=neon:neon /home/nonroot/target/release/draw_timeline_dir   /usr/local/bin
--- a/Dockerfile.compute-node
+++ b/Dockerfile.compute-node
@@ -12,7 +12,7 @@ FROM debian:bullseye-slim AS build-deps
 RUN apt update &&  \
    apt install -y git autoconf automake libtool build-essential bison flex libreadline-dev \
    zlib1g-dev libxml2-dev libcurl4-openssl-dev libossp-uuid-dev wget pkg-config libssl-dev \
-    libicu-dev libxslt1-dev liblz4-dev libzstd-dev
+    libicu-dev libxslt1-dev

 #########################################################################################
 #
@@ -24,13 +24,8 @@ FROM build-deps AS pg-build
 ARG PG_VERSION
 COPY vendor/postgres-${PG_VERSION} postgres
 RUN cd postgres && \
-    export CONFIGURE_CMD="./configure CFLAGS='-O2 -g3' --enable-debug --with-openssl --with-uuid=ossp \
-    --with-icu --with-libxml --with-libxslt --with-lz4" && \
-    if [ "${PG_VERSION}" != "v14" ]; then \
-        # zstd is available only from PG15
-        export CONFIGURE_CMD="${CONFIGURE_CMD} --with-zstd"; \
-    fi && \
-    eval $CONFIGURE_CMD && \
+    ./configure CFLAGS='-O2 -g3' --enable-debug --with-openssl --with-uuid=ossp --with-icu \
+    --with-libxml --with-libxslt && \
    make MAKELEVEL=0 -j $(getconf _NPROCESSORS_ONLN) -s install && \
    make MAKELEVEL=0 -j $(getconf _NPROCESSORS_ONLN) -s -C contrib/ install && \
    # Install headers
@@ -570,17 +565,13 @@ COPY --from=compute-tools --chown=postgres /home/nonroot/target/release-line-deb
 # Install:
 # libreadline8 for psql
 # libicu67, locales for collations (including ICU and plpgsql_check)
-# liblz4-1 for lz4
 # libossp-uuid16 for extension ossp-uuid
 # libgeos, libgdal, libsfcgal1, libproj and libprotobuf-c1 for PostGIS
 # libxml2, libxslt1.1 for xml2
-# libzstd1 for zstd
 RUN apt update &&  \
    apt install --no-install-recommends -y \
-        gdb \
        locales \
        libicu67 \
-        liblz4-1 \
        libreadline8 \
        libossp-uuid16 \
        libgeos-c1v5 \
@@ -590,8 +581,7 @@ RUN apt update &&  \
        libsfcgal1 \
        libxml2 \
        libxslt1.1 \
-        libzstd1 \
-        procps && \
+        gdb && \
    rm -rf /var/lib/apt/lists/* /tmp/* /var/tmp/* && \
    localedef -i en_US -c -f UTF-8 -A /usr/share/locale/locale.alias en_US.UTF-8

--- a/Dockerfile.vm-compute-node
+++ b/Dockerfile.vm-compute-node
@@ -54,7 +54,7 @@ RUN set -e \

 RUN set -e \
 	&& echo "::sysinit:cgconfigparser -l /etc/cgconfig.conf -s 1664" >> /etc/inittab \
-	&& CONNSTR="dbname=postgres user=cloud_admin sslmode=disable" \
+	&& CONNSTR="dbname=neondb user=cloud_admin sslmode=disable" \
 	&& ARGS="--auto-restart --cgroup=neon-postgres --pgconnstr=\"$CONNSTR\"" \
 	&& echo "::respawn:su vm-informant -c '/usr/local/bin/vm-informant $ARGS'" >> /etc/inittab

--- a/compute_tools/src/bin/compute_ctl.rs
+++ b/compute_tools/src/bin/compute_ctl.rs
@@ -73,7 +73,7 @@ fn main() -> Result<()> {
    // Try to use just 'postgres' if no path is provided
    let pgbin = matches.get_one::<String>("pgbin").unwrap();

-    let spec;
+    let mut spec = None;
    let mut live_config_allowed = false;
    match spec_json {
        // First, try to get cluster spec from the cli argument
@@ -89,13 +89,9 @@ fn main() -> Result<()> {
            } else if let Some(id) = compute_id {
                if let Some(cp_base) = control_plane_uri {
                    live_config_allowed = true;
-                    spec = match get_spec_from_control_plane(cp_base, id) {
-                        Ok(s) => s,
-                        Err(e) => {
-                            error!("cannot get response from control plane: {}", e);
-                            panic!("neither spec nor confirmation that compute is in the Empty state was received");
-                        }
-                    };
+                    if let Ok(s) = get_spec_from_control_plane(cp_base, id) {
+                        spec = Some(s);
+                    }
                } else {
                    panic!("must specify both --control-plane-uri and --compute-id or none");
                }
@@ -118,6 +114,7 @@ fn main() -> Result<()> {
        spec_set = false;
    }
    let compute_node = ComputeNode {
+        start_time: Utc::now(),
        connstr: Url::parse(connstr).context("cannot parse connstr as a URL")?,
        pgdata: pgdata.to_string(),
        pgbin: pgbin.to_string(),
@@ -150,17 +147,6 @@ fn main() -> Result<()> {
    let mut state = compute.state.lock().unwrap();
    let pspec = state.pspec.as_ref().expect("spec must be set");
    let startup_tracing_context = pspec.spec.startup_tracing_context.clone();
-
-    // Record for how long we slept waiting for the spec.
-    state.metrics.wait_for_spec_ms = Utc::now()
-        .signed_duration_since(state.start_time)
-        .to_std()
-        .unwrap()
-        .as_millis() as u64;
-    // Reset start time to the actual start of the configuration, so that
-    // total startup time was properly measured at the end.
-    state.start_time = Utc::now();
-
    state.status = ComputeStatus::Init;
    compute.state_changed.notify_all();
    drop(state);
--- a/compute_tools/src/checker.rs
+++ b/compute_tools/src/checker.rs
@@ -1,28 +1,12 @@
 use anyhow::{anyhow, Result};
+use postgres::Client;
 use tokio_postgres::NoTls;
 use tracing::{error, instrument};

 use crate::compute::ComputeNode;

-/// Update timestamp in a row in a special service table to check
-/// that we can actually write some data in this particular timeline.
-/// Create table if it's missing.
 #[instrument(skip_all)]
-pub async fn check_writability(compute: &ComputeNode) -> Result<()> {
-    // Connect to the database.
-    let (client, connection) = tokio_postgres::connect(compute.connstr.as_str(), NoTls).await?;
-    if client.is_closed() {
-        return Err(anyhow!("connection to postgres closed"));
-    }
-
-    // The connection object performs the actual communication with the database,
-    // so spawn it off to run on its own.
-    tokio::spawn(async move {
-        if let Err(e) = connection.await {
-            error!("connection error: {}", e);
-        }
-    });
-
+pub fn create_writability_check_data(client: &mut Client) -> Result<()> {
    let query = "
    CREATE TABLE IF NOT EXISTS health_check (
        id serial primary key,
@@ -31,15 +15,31 @@ pub async fn check_writability(compute: &ComputeNode) -> Result<()> {
    INSERT INTO health_check VALUES (1, now())
        ON CONFLICT (id) DO UPDATE
         SET updated_at = now();";
-
-    let result = client.simple_query(query).await?;
-
-    if result.len() != 2 {
-        return Err(anyhow::format_err!(
-            "expected 2 query results, but got {}",
-            result.len()
-        ));
+    let result = client.simple_query(query)?;
+    if result.len() < 2 {
+        return Err(anyhow::format_err!("expected at least 2 query results, but got {}", result.len()));
+    }
+    Ok(())
+}
+
+#[instrument(skip_all)]
+pub async fn check_writability(compute: &ComputeNode) -> Result<()> {
+    let (client, connection) = tokio_postgres::connect(compute.connstr.as_str(), NoTls).await?;
+    if client.is_closed() {
+        return Err(anyhow!("connection to postgres closed"));
+    }
+    tokio::spawn(async move {
+        if let Err(e) = connection.await {
+            error!("connection error: {}", e);
+        }
+    });
+
+    let result = client
+        .simple_query("UPDATE health_check SET updated_at = now() WHERE id = 1;")
+        .await?;
+
+    if result.len() != 1 {
+        return Err(anyhow!("statement can't be executed"));
    }
-
    Ok(())
 }
--- a/compute_tools/src/compute.rs
+++ b/compute_tools/src/compute.rs
@@ -32,12 +32,14 @@ use utils::lsn::Lsn;
 use compute_api::responses::{ComputeMetrics, ComputeStatus};
 use compute_api::spec::ComputeSpec;

+use crate::checker::create_writability_check_data;
 use crate::config;
 use crate::pg_helpers::*;
 use crate::spec::*;

 /// Compute node info shared across several `compute_ctl` threads.
 pub struct ComputeNode {
+    pub start_time: DateTime<Utc>,
    // Url type maintains proper escaping
    pub connstr: url::Url,
    pub pgdata: String,
@@ -65,7 +67,6 @@ pub struct ComputeNode {

 #[derive(Clone, Debug)]
 pub struct ComputeState {
-    pub start_time: DateTime<Utc>,
    pub status: ComputeStatus,
    /// Timestamp of the last Postgres activity
    pub last_active: DateTime<Utc>,
@@ -77,7 +78,6 @@ pub struct ComputeState {
 impl ComputeState {
    pub fn new() -> Self {
        Self {
-            start_time: Utc::now(),
            status: ComputeStatus::Empty,
            last_active: Utc::now(),
            error: None,
@@ -249,63 +249,18 @@ impl ComputeNode {
    /// safekeepers sync, basebackup, etc.
    #[instrument(skip(self, compute_state))]
    pub fn prepare_pgdata(&self, compute_state: &ComputeState) -> Result<()> {
-        #[derive(Clone)]
-        enum Replication {
-            Primary,
-            Static { lsn: Lsn },
-            HotStandby,
-        }
-
        let pspec = compute_state.pspec.as_ref().expect("spec must be set");
-        let spec = &pspec.spec;
        let pgdata_path = Path::new(&self.pgdata);

-        let hot_replica = if let Some(option) = spec.cluster.settings.find_ref("hot_standby") {
-            if let Some(value) = &option.value {
-                anyhow::ensure!(option.vartype == "bool");
-                matches!(value.as_str(), "on" | "yes" | "true")
-            } else {
-                false
-            }
-        } else {
-            false
-        };
-
-        let replication = if hot_replica {
-            Replication::HotStandby
-        } else if let Some(lsn) = spec.cluster.settings.find("recovery_target_lsn") {
-            Replication::Static {
-                lsn: Lsn::from_str(&lsn)?,
-            }
-        } else {
-            Replication::Primary
-        };
-
        // Remove/create an empty pgdata directory and put configuration there.
        self.create_pgdata()?;
        config::write_postgres_conf(&pgdata_path.join("postgresql.conf"), &pspec.spec)?;

-        // Syncing safekeepers is only safe with primary nodes: if a primary
-        // is already connected it will be kicked out, so a secondary (standby)
-        // cannot sync safekeepers.
-        let lsn = match &replication {
-            Replication::Primary => {
-                info!("starting safekeepers syncing");
-                let lsn = self
-                    .sync_safekeepers(pspec.storage_auth_token.clone())
-                    .with_context(|| "failed to sync safekeepers")?;
-                info!("safekeepers synced at LSN {}", lsn);
-                lsn
-            }
-            Replication::Static { lsn } => {
-                info!("Starting read-only node at static LSN {}", lsn);
-                *lsn
-            }
-            Replication::HotStandby => {
-                info!("Initializing standby from latest Pageserver LSN");
-                Lsn(0)
-            }
-        };
+        info!("starting safekeepers syncing");
+        let lsn = self
+            .sync_safekeepers(pspec.storage_auth_token.clone())
+            .with_context(|| "failed to sync safekeepers")?;
+        info!("safekeepers synced at LSN {}", lsn);

        info!(
            "getting basebackup@{} from pageserver {}",
@@ -321,13 +276,6 @@ impl ComputeNode {
        // Update pg_hba.conf received with basebackup.
        update_pg_hba(pgdata_path)?;

-        match &replication {
-            Replication::Primary | Replication::Static { .. } => {}
-            Replication::HotStandby => {
-                add_standby_signal(pgdata_path)?;
-            }
-        }
-
        Ok(())
    }

@@ -394,6 +342,7 @@ impl ComputeNode {
        handle_databases(spec, &mut client)?;
        handle_role_deletions(spec, self.connstr.as_str(), &mut client)?;
        handle_grants(spec, self.connstr.as_str(), &mut client)?;
+        create_writability_check_data(&mut client)?;
        handle_extensions(spec, &mut client)?;

        // 'Close' connection
@@ -478,7 +427,7 @@ impl ComputeNode {
                .unwrap()
                .as_millis() as u64;
            state.metrics.total_startup_ms = startup_end_time
-                .signed_duration_since(compute_state.start_time)
+                .signed_duration_since(self.start_time)
                .to_std()
                .unwrap()
                .as_millis() as u64;
--- a/compute_tools/src/http/api.rs
+++ b/compute_tools/src/http/api.rs
@@ -18,7 +18,6 @@ use tracing_utils::http::OtelName;

 fn status_response_from_state(state: &ComputeState) -> ComputeStatusResponse {
    ComputeStatusResponse {
-        start_time: state.start_time,
        tenant: state
            .pspec
            .as_ref()
@@ -86,10 +85,7 @@ async fn routes(req: Request<Body>, compute: &Arc<ComputeNode>) -> Response<Body
            let res = crate::checker::check_writability(compute).await;
            match res {
                Ok(_) => Response::new(Body::from("true")),
-                Err(e) => {
-                    error!("check_writability failed: {}", e);
-                    Response::new(Body::from(e.to_string()))
-                }
+                Err(e) => Response::new(Body::from(e.to_string())),
            }
        }

--- a/compute_tools/src/http/openapi_spec.yaml
+++ b/compute_tools/src/http/openapi_spec.yaml
@@ -152,14 +152,11 @@ components:
      type: object
      description: Compute startup metrics.
      required:
-        - wait_for_spec_ms
        - sync_safekeepers_ms
        - basebackup_ms
        - config_ms
        - total_startup_ms
      properties:
-        wait_for_spec_ms:
-          type: integer
        sync_safekeepers_ms:
          type: integer
        basebackup_ms:
@@ -184,13 +181,6 @@ components:
        - status
        - last_active
      properties:
-        start_time:
-          type: string
-          description: |
-            Time when compute was started. If initially compute was started in the `empty`
-            state and then provided with valid spec, `start_time` will be reset to the
-            moment, when spec was received.
-          example: "2022-10-12T07:20:50.52Z"
        status:
          $ref: '#/components/schemas/ComputeStatus'
        last_active:
--- a/compute_tools/src/pg_helpers.rs
+++ b/compute_tools/src/pg_helpers.rs
@@ -94,7 +94,6 @@ impl PgOptionsSerialize for GenericOptions {

 pub trait GenericOptionsSearch {
    fn find(&self, name: &str) -> Option<String>;
-    fn find_ref(&self, name: &str) -> Option<&GenericOption>;
 }

 impl GenericOptionsSearch for GenericOptions {
@@ -104,12 +103,6 @@ impl GenericOptionsSearch for GenericOptions {
        let op = ops.iter().find(|s| s.name == name)?;
        op.value.clone()
    }
-
-    /// Lookup option by name, returning ref
-    fn find_ref(&self, name: &str) -> Option<&GenericOption> {
-        let ops = self.as_ref()?;
-        ops.iter().find(|s| s.name == name)
-    }
 }

 pub trait RoleExt {
--- a/compute_tools/src/spec.rs
+++ b/compute_tools/src/spec.rs
@@ -1,121 +1,45 @@
-use std::fs::File;
 use std::path::Path;
 use std::str::FromStr;

 use anyhow::{anyhow, bail, Result};
 use postgres::config::Config;
 use postgres::{Client, NoTls};
-use reqwest::StatusCode;
-use tracing::{error, info, info_span, instrument, span_enabled, warn, Level};
+use tracing::{info, info_span, instrument, span_enabled, warn, Level};

 use crate::config;
 use crate::params::PG_HBA_ALL_MD5;
 use crate::pg_helpers::*;

-use compute_api::responses::{ControlPlaneComputeStatus, ControlPlaneSpecResponse};
+use compute_api::responses::ControlPlaneSpecResponse;
 use compute_api::spec::{ComputeSpec, Database, PgIdent, Role};

-// Do control plane request and return response if any. In case of error it
-// returns a bool flag indicating whether it makes sense to retry the request
-// and a string with error message.
-fn do_control_plane_request(
-    uri: &str,
-    jwt: &str,
-) -> Result<ControlPlaneSpecResponse, (bool, String)> {
-    let resp = reqwest::blocking::Client::new()
-        .get(uri)
-        .header("Authorization", jwt)
-        .send()
-        .map_err(|e| {
-            (
-                true,
-                format!("could not perform spec request to control plane: {}", e),
-            )
-        })?;
-
-    match resp.status() {
-        StatusCode::OK => match resp.json::<ControlPlaneSpecResponse>() {
-            Ok(spec_resp) => Ok(spec_resp),
-            Err(e) => Err((
-                true,
-                format!("could not deserialize control plane response: {}", e),
-            )),
-        },
-        StatusCode::SERVICE_UNAVAILABLE => {
-            Err((true, "control plane is temporarily unavailable".to_string()))
-        }
-        StatusCode::BAD_GATEWAY => {
-            // We have a problem with intermittent 502 errors now
-            // https://github.com/neondatabase/cloud/issues/2353
-            // It's fine to retry GET request in this case.
-            Err((true, "control plane request failed with 502".to_string()))
-        }
-        // Another code, likely 500 or 404, means that compute is unknown to the control plane
-        // or some internal failure happened. Doesn't make much sense to retry in this case.
-        _ => Err((
-            false,
-            format!(
-                "unexpected control plane response status code: {}",
-                resp.status()
-            ),
-        )),
-    }
-}
-
 /// Request spec from the control-plane by compute_id. If `NEON_CONSOLE_JWT`
 /// env variable is set, it will be used for authorization.
-pub fn get_spec_from_control_plane(
-    base_uri: &str,
-    compute_id: &str,
-) -> Result<Option<ComputeSpec>> {
+pub fn get_spec_from_control_plane(base_uri: &str, compute_id: &str) -> Result<ComputeSpec> {
    let cp_uri = format!("{base_uri}/management/api/v2/computes/{compute_id}/spec");
-    let jwt: String = match std::env::var("NEON_CONTROL_PLANE_TOKEN") {
+    let jwt: String = match std::env::var("NEON_CONSOLE_JWT") {
        Ok(v) => v,
        Err(_) => "".to_string(),
    };
-    let mut attempt = 1;
-    let mut spec: Result<Option<ComputeSpec>> = Ok(None);
-
    info!("getting spec from control plane: {}", cp_uri);

-    // Do 3 attempts to get spec from the control plane using the following logic:
-    // - network error -> then retry
-    // - compute id is unknown or any other error -> bail out
-    // - no spec for compute yet (Empty state) -> return Ok(None)
-    // - got spec -> return Ok(Some(spec))
-    while attempt < 4 {
-        spec = match do_control_plane_request(&cp_uri, &jwt) {
-            Ok(spec_resp) => match spec_resp.status {
-                ControlPlaneComputeStatus::Empty => Ok(None),
-                ControlPlaneComputeStatus::Attached => {
-                    if let Some(spec) = spec_resp.spec {
-                        Ok(Some(spec))
-                    } else {
-                        bail!("compute is attached, but spec is empty")
-                    }
-                }
-            },
-            Err((retry, msg)) => {
-                if retry {
-                    Err(anyhow!(msg))
-                } else {
-                    bail!(msg);
-                }
-            }
-        };
+    // TODO: check the response. We should distinguish cases when it's
+    // - network error, then retry
+    // - no spec for compute yet, then wait
+    // - compute id is unknown or any other error, then bail out
+    let resp: ControlPlaneSpecResponse = reqwest::blocking::Client::new()
+        .get(cp_uri)
+        .header("Authorization", jwt)
+        .send()
+        .map_err(|e| anyhow!("could not send spec request to control plane: {}", e))?
+        .json()
+        .map_err(|e| anyhow!("could not get compute spec from control plane: {}", e))?;

-        if let Err(e) = &spec {
-            error!("attempt {} to get spec failed with: {}", attempt, e);
-        } else {
-            return spec;
-        }
-
-        attempt += 1;
-        std::thread::sleep(std::time::Duration::from_millis(100));
+    if let Some(spec) = resp.spec {
+        Ok(spec)
+    } else {
+        bail!("could not get compute spec from control plane")
    }
-
-    // All attempts failed, return error.
-    spec
 }

 /// It takes cluster specification and does the following:
@@ -146,21 +70,6 @@ pub fn update_pg_hba(pgdata_path: &Path) -> Result<()> {
    Ok(())
 }

-/// Create a standby.signal file
-pub fn add_standby_signal(pgdata_path: &Path) -> Result<()> {
-    // XXX: consider making it a part of spec.json
-    info!("adding standby.signal");
-    let signalfile = pgdata_path.join("standby.signal");
-
-    if !signalfile.exists() {
-        info!("created standby.signal");
-        File::create(signalfile)?;
-    } else {
-        info!("reused pre-existing standby.signal");
-    }
-    Ok(())
-}
-
 /// Given a cluster spec json and open transaction it handles roles creation,
 /// deletion and update.
 #[instrument(skip_all)]
--- a/control_plane/src/bin/neon_local.rs
+++ b/control_plane/src/bin/neon_local.rs
@@ -8,7 +8,6 @@
 use anyhow::{anyhow, bail, Context, Result};
 use clap::{value_parser, Arg, ArgAction, ArgMatches, Command};
 use control_plane::endpoint::ComputeControlPlane;
-use control_plane::endpoint::ComputeMode;
 use control_plane::local_env::LocalEnv;
 use control_plane::pageserver::PageServerNode;
 use control_plane::safekeeper::SafekeeperNode;
@@ -475,14 +474,7 @@ fn handle_timeline(timeline_match: &ArgMatches, env: &mut local_env::LocalEnv) -
            env.register_branch_mapping(name.to_string(), tenant_id, timeline_id)?;

            println!("Creating endpoint for imported timeline ...");
-            cplane.new_endpoint(
-                tenant_id,
-                name,
-                timeline_id,
-                None,
-                pg_version,
-                ComputeMode::Primary,
-            )?;
+            cplane.new_endpoint(tenant_id, name, timeline_id, None, None, pg_version)?;
            println!("Done");
        }
        Some(("branch", branch_match)) => {
@@ -568,20 +560,20 @@ fn handle_endpoint(ep_match: &ArgMatches, env: &local_env::LocalEnv) -> Result<(
                .iter()
                .filter(|(_, endpoint)| endpoint.tenant_id == tenant_id)
            {
-                let lsn_str = match endpoint.mode {
-                    ComputeMode::Static(lsn) => {
-                        // -> read-only endpoint
-                        // Use the node's LSN.
-                        lsn.to_string()
-                    }
-                    _ => {
-                        // -> primary endpoint or hot replica
+                let lsn_str = match endpoint.lsn {
+                    None => {
+                        // -> primary endpoint
                        // Use the LSN at the end of the timeline.
                        timeline_infos
                            .get(&endpoint.timeline_id)
                            .map(|bi| bi.last_record_lsn.to_string())
                            .unwrap_or_else(|| "?".to_string())
                    }
+                    Some(lsn) => {
+                        // -> read-only endpoint
+                        // Use the endpoint's LSN.
+                        lsn.to_string()
+                    }
                };

                let branch_name = timeline_name_mappings
@@ -627,19 +619,7 @@ fn handle_endpoint(ep_match: &ArgMatches, env: &local_env::LocalEnv) -> Result<(
                .copied()
                .context("Failed to parse postgres version from the argument string")?;

-            let hot_standby = sub_args
-                .get_one::<bool>("hot-standby")
-                .copied()
-                .unwrap_or(false);
-
-            let mode = match (lsn, hot_standby) {
-                (Some(lsn), false) => ComputeMode::Static(lsn),
-                (None, true) => ComputeMode::Replica,
-                (None, false) => ComputeMode::Primary,
-                (Some(_), true) => anyhow::bail!("cannot specify both lsn and hot-standby"),
-            };
-
-            cplane.new_endpoint(tenant_id, &endpoint_id, timeline_id, port, pg_version, mode)?;
+            cplane.new_endpoint(tenant_id, &endpoint_id, timeline_id, lsn, port, pg_version)?;
        }
        "start" => {
            let port: Option<u16> = sub_args.get_one::<u16>("port").copied();
@@ -657,21 +637,7 @@ fn handle_endpoint(ep_match: &ArgMatches, env: &local_env::LocalEnv) -> Result<(
                None
            };

-            let hot_standby = sub_args
-                .get_one::<bool>("hot-standby")
-                .copied()
-                .unwrap_or(false);
-
            if let Some(endpoint) = endpoint {
-                match (&endpoint.mode, hot_standby) {
-                    (ComputeMode::Static(_), true) => {
-                        bail!("Cannot start a node in hot standby mode when it is already configured as a static replica")
-                    }
-                    (ComputeMode::Primary, true) => {
-                        bail!("Cannot start a node as a hot standby replica, it is already configured as primary node")
-                    }
-                    _ => {}
-                }
                println!("Starting existing endpoint {endpoint_id}...");
                endpoint.start(&auth_token)?;
            } else {
@@ -693,14 +659,6 @@ fn handle_endpoint(ep_match: &ArgMatches, env: &local_env::LocalEnv) -> Result<(
                    .get_one::<u32>("pg-version")
                    .copied()
                    .context("Failed to `pg-version` from the argument string")?;
-
-                let mode = match (lsn, hot_standby) {
-                    (Some(lsn), false) => ComputeMode::Static(lsn),
-                    (None, true) => ComputeMode::Replica,
-                    (None, false) => ComputeMode::Primary,
-                    (Some(_), true) => anyhow::bail!("cannot specify both lsn and hot-standby"),
-                };
-
                // when used with custom port this results in non obvious behaviour
                // port is remembered from first start command, i e
                // start --port X
@@ -712,9 +670,9 @@ fn handle_endpoint(ep_match: &ArgMatches, env: &local_env::LocalEnv) -> Result<(
                    tenant_id,
                    endpoint_id,
                    timeline_id,
+                    lsn,
                    port,
                    pg_version,
-                    mode,
                )?;
                ep.start(&auth_token)?;
            }
@@ -970,12 +928,6 @@ fn cli() -> Command {
        .help("Specify Lsn on the timeline to start from. By default, end of the timeline would be used.")
        .required(false);

-    let hot_standby_arg = Arg::new("hot-standby")
-        .value_parser(value_parser!(bool))
-        .long("hot-standby")
-        .help("If set, the node will be a hot replica on the specified timeline")
-        .required(false);
-
    Command::new("Neon CLI")
        .arg_required_else_help(true)
        .version(GIT_VERSION)
@@ -1100,7 +1052,6 @@ fn cli() -> Command {
                            .long("config-only")
                            .required(false))
                    .arg(pg_version_arg.clone())
-                    .arg(hot_standby_arg.clone())
                )
                .subcommand(Command::new("start")
                    .about("Start postgres.\n If the endpoint doesn't exist yet, it is created.")
@@ -1111,7 +1062,6 @@ fn cli() -> Command {
                    .arg(lsn_arg)
                    .arg(port_arg)
                    .arg(pg_version_arg)
-                    .arg(hot_standby_arg)
                )
                .subcommand(
                    Command::new("stop")
--- a/control_plane/src/endpoint.rs
+++ b/control_plane/src/endpoint.rs
@@ -11,31 +11,15 @@ use std::sync::Arc;
 use std::time::Duration;

 use anyhow::{Context, Result};
-use serde::{Deserialize, Serialize};
-use serde_with::{serde_as, DisplayFromStr};
 use utils::{
    id::{TenantId, TimelineId},
    lsn::Lsn,
 };

-use crate::local_env::LocalEnv;
+use crate::local_env::{LocalEnv, DEFAULT_PG_VERSION};
 use crate::pageserver::PageServerNode;
 use crate::postgresql_conf::PostgresConf;

-// contents of a endpoint.json file
-#[serde_as]
-#[derive(Serialize, Deserialize, PartialEq, Eq, Clone, Debug)]
-pub struct EndpointConf {
-    name: String,
-    #[serde_as(as = "DisplayFromStr")]
-    tenant_id: TenantId,
-    #[serde_as(as = "DisplayFromStr")]
-    timeline_id: TimelineId,
-    mode: ComputeMode,
-    port: u16,
-    pg_version: u32,
-}
-
 //
 // ComputeControlPlane
 //
@@ -84,34 +68,23 @@ impl ComputeControlPlane {
        tenant_id: TenantId,
        name: &str,
        timeline_id: TimelineId,
+        lsn: Option<Lsn>,
        port: Option<u16>,
        pg_version: u32,
-        mode: ComputeMode,
    ) -> Result<Arc<Endpoint>> {
        let port = port.unwrap_or_else(|| self.get_port());
-
        let ep = Arc::new(Endpoint {
            name: name.to_owned(),
            address: SocketAddr::new("127.0.0.1".parse().unwrap(), port),
            env: self.env.clone(),
            pageserver: Arc::clone(&self.pageserver),
            timeline_id,
-            mode,
+            lsn,
            tenant_id,
            pg_version,
        });
+
        ep.create_pgdata()?;
-        std::fs::write(
-            ep.endpoint_path().join("endpoint.json"),
-            serde_json::to_string_pretty(&EndpointConf {
-                name: name.to_string(),
-                tenant_id,
-                timeline_id,
-                mode,
-                port,
-                pg_version,
-            })?,
-        )?;
        ep.setup_pg_conf()?;

        self.endpoints.insert(ep.name.clone(), Arc::clone(&ep));
@@ -122,19 +95,6 @@ impl ComputeControlPlane {

 ///////////////////////////////////////////////////////////////////////////////

-#[serde_as]
-#[derive(Serialize, Deserialize, Debug, Clone, Copy, Eq, PartialEq)]
-pub enum ComputeMode {
-    // Regular read-write node
-    Primary,
-    // if recovery_target_lsn is provided, and we want to pin the node to a specific LSN
-    Static(#[serde_as(as = "DisplayFromStr")] Lsn),
-    // Hot standby; read-only replica.
-    // Future versions may want to distinguish between replicas with hot standby
-    // feedback and other kinds of replication configurations.
-    Replica,
-}
-
 #[derive(Debug)]
 pub struct Endpoint {
    /// used as the directory name
@@ -142,7 +102,7 @@ pub struct Endpoint {
    pub tenant_id: TenantId,
    pub timeline_id: TimelineId,
    // Some(lsn) if this is a read-only endpoint anchored at 'lsn'. None for the primary.
-    pub mode: ComputeMode,
+    pub lsn: Option<Lsn>,

    // port and address of the Postgres server
    pub address: SocketAddr,
@@ -171,20 +131,42 @@ impl Endpoint {
        let fname = entry.file_name();
        let name = fname.to_str().unwrap().to_string();

-        // Read the endpoint.json file
-        let conf: EndpointConf =
-            serde_json::from_slice(&std::fs::read(entry.path().join("endpoint.json"))?)?;
+        // Read config file into memory
+        let cfg_path = entry.path().join("pgdata").join("postgresql.conf");
+        let cfg_path_str = cfg_path.to_string_lossy();
+        let mut conf_file = File::open(&cfg_path)
+            .with_context(|| format!("failed to open config file in {}", cfg_path_str))?;
+        let conf = PostgresConf::read(&mut conf_file)
+            .with_context(|| format!("failed to read config file in {}", cfg_path_str))?;
+
+        // Read a few options from the config file
+        let context = format!("in config file {}", cfg_path_str);
+        let port: u16 = conf.parse_field("port", &context)?;
+        let timeline_id: TimelineId = conf.parse_field("neon.timeline_id", &context)?;
+        let tenant_id: TenantId = conf.parse_field("neon.tenant_id", &context)?;
+
+        // Read postgres version from PG_VERSION file to determine which postgres version binary to use.
+        // If it doesn't exist, assume broken data directory and use default pg version.
+        let pg_version_path = entry.path().join("PG_VERSION");
+
+        let pg_version_str =
+            fs::read_to_string(pg_version_path).unwrap_or_else(|_| DEFAULT_PG_VERSION.to_string());
+        let pg_version = u32::from_str(&pg_version_str)?;
+
+        // parse recovery_target_lsn, if any
+        let recovery_target_lsn: Option<Lsn> =
+            conf.parse_field_optional("recovery_target_lsn", &context)?;

        // ok now
        Ok(Endpoint {
-            address: SocketAddr::new("127.0.0.1".parse().unwrap(), conf.port),
+            address: SocketAddr::new("127.0.0.1".parse().unwrap(), port),
            name,
            env: env.clone(),
            pageserver: Arc::clone(pageserver),
-            timeline_id: conf.timeline_id,
-            mode: conf.mode,
-            tenant_id: conf.tenant_id,
-            pg_version: conf.pg_version,
+            timeline_id,
+            lsn: recovery_target_lsn,
+            tenant_id,
+            pg_version,
        })
    }

@@ -317,83 +299,50 @@ impl Endpoint {
        conf.append("neon.pageserver_connstring", &pageserver_connstr);
        conf.append("neon.tenant_id", &self.tenant_id.to_string());
        conf.append("neon.timeline_id", &self.timeline_id.to_string());
+        if let Some(lsn) = self.lsn {
+            conf.append("recovery_target_lsn", &lsn.to_string());
+        }

        conf.append_line("");
-        // Replication-related configurations, such as WAL sending
-        match &self.mode {
-            ComputeMode::Primary => {
-                // Configure backpressure
-                // - Replication write lag depends on how fast the walreceiver can process incoming WAL.
-                //   This lag determines latency of get_page_at_lsn. Speed of applying WAL is about 10MB/sec,
-                //   so to avoid expiration of 1 minute timeout, this lag should not be larger than 600MB.
-                //   Actually latency should be much smaller (better if < 1sec). But we assume that recently
-                //   updates pages are not requested from pageserver.
-                // - Replication flush lag depends on speed of persisting data by checkpointer (creation of
-                //   delta/image layers) and advancing disk_consistent_lsn. Safekeepers are able to
-                //   remove/archive WAL only beyond disk_consistent_lsn. Too large a lag can cause long
-                //   recovery time (in case of pageserver crash) and disk space overflow at safekeepers.
-                // - Replication apply lag depends on speed of uploading changes to S3 by uploader thread.
-                //   To be able to restore database in case of pageserver node crash, safekeeper should not
-                //   remove WAL beyond this point. Too large lag can cause space exhaustion in safekeepers
-                //   (if they are not able to upload WAL to S3).
-                conf.append("max_replication_write_lag", "15MB");
-                conf.append("max_replication_flush_lag", "10GB");
+        // Configure backpressure
+        // - Replication write lag depends on how fast the walreceiver can process incoming WAL.
+        //   This lag determines latency of get_page_at_lsn. Speed of applying WAL is about 10MB/sec,
+        //   so to avoid expiration of 1 minute timeout, this lag should not be larger than 600MB.
+        //   Actually latency should be much smaller (better if < 1sec). But we assume that recently
+        //   updated pages are not requested from pageserver.
+        // - Replication flush lag depends on speed of persisting data by checkpointer (creation of
+        //   delta/image layers) and advancing disk_consistent_lsn. Safekeepers are able to
+        //   remove/archive WAL only beyond disk_consistent_lsn. Too large a lag can cause long
+        //   recovery time (in case of pageserver crash) and disk space overflow at safekeepers.
+        // - Replication apply lag depends on speed of uploading changes to S3 by uploader thread.
+        //   To be able to restore database in case of pageserver node crash, safekeeper should not
+        //   remove WAL beyond this point. Too large a lag can cause space exhaustion in safekeepers
+        //   (if they are not able to upload WAL to S3).
+        conf.append("max_replication_write_lag", "15MB");
+        conf.append("max_replication_flush_lag", "10GB");

-                if !self.env.safekeepers.is_empty() {
-                    // Configure Postgres to connect to the safekeepers
-                    conf.append("synchronous_standby_names", "walproposer");
+        if !self.env.safekeepers.is_empty() {
+            // Configure Postgres to connect to the safekeepers
+            conf.append("synchronous_standby_names", "walproposer");

-                    let safekeepers = self
-                        .env
-                        .safekeepers
-                        .iter()
-                        .map(|sk| format!("localhost:{}", sk.pg_port))
-                        .collect::<Vec<String>>()
-                        .join(",");
-                    conf.append("neon.safekeepers", &safekeepers);
-                } else {
-                    // We only use setup without safekeepers for tests,
-                    // and don't care about data durability on pageserver,
-                    // so set more relaxed synchronous_commit.
-                    conf.append("synchronous_commit", "remote_write");
+            let safekeepers = self
+                .env
+                .safekeepers
+                .iter()
+                .map(|sk| format!("localhost:{}", sk.pg_port))
+                .collect::<Vec<String>>()
+                .join(",");
+            conf.append("neon.safekeepers", &safekeepers);
+        } else {
+            // We only use setup without safekeepers for tests,
+            // and don't care about data durability on pageserver,
+            // so set more relaxed synchronous_commit.
+            conf.append("synchronous_commit", "remote_write");

-                    // Configure the node to stream WAL directly to the pageserver
-                    // This isn't really a supported configuration, but can be useful for
-                    // testing.
-                    conf.append("synchronous_standby_names", "pageserver");
-                }
-            }
-            ComputeMode::Static(lsn) => {
-                conf.append("recovery_target_lsn", &lsn.to_string());
-            }
-            ComputeMode::Replica => {
-                assert!(!self.env.safekeepers.is_empty());
-
-                // TODO: use future host field from safekeeper spec
-                // Pass the list of safekeepers to the replica so that it can connect to any of them,
-                // whichever is availiable.
-                let sk_ports = self
-                    .env
-                    .safekeepers
-                    .iter()
-                    .map(|x| x.pg_port.to_string())
-                    .collect::<Vec<_>>()
-                    .join(",");
-                let sk_hosts = vec!["localhost"; self.env.safekeepers.len()].join(",");
-
-                let connstr = format!(
-                    "host={} port={} options='-c timeline_id={} tenant_id={}' application_name=replica replication=true",
-                    sk_hosts,
-                    sk_ports,
-                    &self.timeline_id.to_string(),
-                    &self.tenant_id.to_string(),
-                );
-
-                let slot_name = format!("repl_{}_", self.timeline_id);
-                conf.append("primary_conninfo", connstr.as_str());
-                conf.append("primary_slot_name", slot_name.as_str());
-                conf.append("hot_standby", "on");
-            }
+            // Configure the node to stream WAL directly to the pageserver
+            // This isn't really a supported configuration, but can be useful for
+            // testing.
+            conf.append("synchronous_standby_names", "pageserver");
        }

        let mut file = File::create(self.pgdata().join("postgresql.conf"))?;
@@ -406,27 +355,21 @@ impl Endpoint {
    }

    fn load_basebackup(&self, auth_token: &Option<String>) -> Result<()> {
-        let backup_lsn = match &self.mode {
-            ComputeMode::Primary => {
-                if !self.env.safekeepers.is_empty() {
-                    // LSN 0 means that it is bootstrap and we need to download just
-                    // latest data from the pageserver. That is a bit clumsy but whole bootstrap
-                    // procedure evolves quite actively right now, so let's think about it again
-                    // when things would be more stable (TODO).
-                    let lsn = self.sync_safekeepers(auth_token, self.pg_version)?;
-                    if lsn == Lsn(0) {
-                        None
-                    } else {
-                        Some(lsn)
-                    }
-                } else {
-                    None
-                }
-            }
-            ComputeMode::Static(lsn) => Some(*lsn),
-            ComputeMode::Replica => {
-                None // Take the latest snapshot available to start with
+        let backup_lsn = if let Some(lsn) = self.lsn {
+            Some(lsn)
+        } else if !self.env.safekeepers.is_empty() {
+            // LSN 0 means that it is bootstrap and we need to download just
+            // latest data from the pageserver. That is a bit clumsy but whole bootstrap
+            // procedure evolves quite actively right now, so let's think about it again
+            // when things would be more stable (TODO).
+            let lsn = self.sync_safekeepers(auth_token, self.pg_version)?;
+            if lsn == Lsn(0) {
+                None
+            } else {
+                Some(lsn)
            }
+        } else {
+            None
        };

        self.do_basebackup(backup_lsn)?;
@@ -523,7 +466,7 @@ impl Endpoint {
        // 3. Load basebackup
        self.load_basebackup(auth_token)?;

-        if self.mode != ComputeMode::Primary {
+        if self.lsn.is_some() {
            File::create(self.pgdata().join("standby.signal"))?;
        }

--- a/control_plane/src/pageserver.rs
+++ b/control_plane/src/pageserver.rs
@@ -359,8 +359,8 @@ impl PageServerNode {
                .transpose()
                .context("Failed to parse 'trace_read_requests' as bool")?,
            eviction_policy: settings
-                .remove("eviction_policy")
-                .map(serde_json::from_str)
+                .get("eviction_policy")
+                .map(|x| serde_json::from_str(x))
                .transpose()
                .context("Failed to parse 'eviction_policy' json")?,
            min_resident_size_override: settings
@@ -368,9 +368,6 @@ impl PageServerNode {
                .map(|x| x.parse::<u64>())
                .transpose()
                .context("Failed to parse 'min_resident_size_override' as integer")?,
-            evictions_low_residence_duration_metric_threshold: settings
-                .remove("evictions_low_residence_duration_metric_threshold")
-                .map(|x| x.to_string()),
        };
        if !settings.is_empty() {
            bail!("Unrecognized tenant settings: {settings:?}")
@@ -448,9 +445,6 @@ impl PageServerNode {
                    .map(|x| x.parse::<u64>())
                    .transpose()
                    .context("Failed to parse 'min_resident_size_override' as an integer")?,
-                evictions_low_residence_duration_metric_threshold: settings
-                    .get("evictions_low_residence_duration_metric_threshold")
-                    .map(|x| x.to_string()),
            })
            .send()?
            .error_from_body()?;
--- a/control_plane/src/postgresql_conf.rs
+++ b/control_plane/src/postgresql_conf.rs
@@ -13,7 +13,7 @@ use std::io::BufRead;
 use std::str::FromStr;

 /// In-memory representation of a postgresql.conf file
-#[derive(Default, Debug)]
+#[derive(Default)]
 pub struct PostgresConf {
    lines: Vec<String>,
    hash: HashMap<String, String>,
--- a/docker-compose/compute_wrapper/var/db/postgres/specs/spec.json
+++ b/docker-compose/compute_wrapper/var/db/postgres/specs/spec.json
@@ -28,6 +28,11 @@
                "value": "replica",
                "vartype": "enum"
            },
+            {
+                "name": "hot_standby",
+                "value": "on",
+                "vartype": "bool"
+            },
            {
                "name": "wal_log_hints",
                "value": "on",
--- a/libs/compute_api/src/responses.rs
+++ b/libs/compute_api/src/responses.rs
@@ -14,7 +14,6 @@ pub struct GenericAPIError {
 #[derive(Serialize, Debug)]
 #[serde(rename_all = "snake_case")]
 pub struct ComputeStatusResponse {
-    pub start_time: DateTime<Utc>,
    pub tenant: Option<String>,
    pub timeline: Option<String>,
    pub status: ComputeStatus,
@@ -64,7 +63,6 @@ where
 /// Response of the /metrics.json API
 #[derive(Clone, Debug, Default, Serialize)]
 pub struct ComputeMetrics {
-    pub wait_for_spec_ms: u64,
    pub sync_safekeepers_ms: u64,
    pub basebackup_ms: u64,
    pub config_ms: u64,
@@ -77,16 +75,4 @@ pub struct ComputeMetrics {
 #[derive(Deserialize, Debug)]
 pub struct ControlPlaneSpecResponse {
    pub spec: Option<ComputeSpec>,
-    pub status: ControlPlaneComputeStatus,
-}
-
-#[derive(Deserialize, Clone, Copy, Debug, PartialEq, Eq)]
-#[serde(rename_all = "snake_case")]
-pub enum ControlPlaneComputeStatus {
-    // Compute is known to control-plane, but it's not
-    // yet attached to any timeline / endpoint.
-    Empty,
-    // Compute is attached to some timeline / endpoint and
-    // should be able to start with provided spec.
-    Attached,
 }
--- a/libs/consumption_metrics/Cargo.toml
+++ b/libs/consumption_metrics/Cargo.toml
@@ -4,12 +4,13 @@ version = "0.1.0"
 edition = "2021"
 license = "Apache-2.0"

-[dependencies]
-anyhow.workspace = true
-chrono.workspace = true
-rand.workspace = true
-serde.workspace = true
-serde_with.workspace = true
-utils.workspace = true
+# See more keys and their definitions at https://doc.rust-lang.org/cargo/reference/manifest.html

-workspace_hack.workspace = true
+[dependencies]
+anyhow = "1.0.68"
+chrono = { version = "0.4", default-features = false, features = ["clock", "serde"] }
+rand = "0.8.3"
+serde = "1.0.152"
+serde_with = "2.1.0"
+utils = { version = "0.1.0", path = "../utils" }
+workspace_hack = { version = "0.1.0", path = "../../workspace_hack" }
--- a/libs/pageserver_api/src/models.rs
+++ b/libs/pageserver_api/src/models.rs
@@ -135,7 +135,6 @@ pub struct TenantCreateRequest {
    // For now, this field is not even documented in the openapi_spec.yml.
    pub eviction_policy: Option<serde_json::Value>,
    pub min_resident_size_override: Option<u64>,
-    pub evictions_low_residence_duration_metric_threshold: Option<String>,
 }

 #[serde_as]
@@ -182,7 +181,6 @@ pub struct TenantConfigRequest {
    // For now, this field is not even documented in the openapi_spec.yml.
    pub eviction_policy: Option<serde_json::Value>,
    pub min_resident_size_override: Option<u64>,
-    pub evictions_low_residence_duration_metric_threshold: Option<String>,
 }

 impl TenantConfigRequest {
@@ -204,7 +202,6 @@ impl TenantConfigRequest {
            trace_read_requests: None,
            eviction_policy: None,
            min_resident_size_override: None,
-            evictions_low_residence_duration_metric_threshold: None,
        }
    }
 }
--- a/libs/postgres_backend/src/lib.rs
+++ b/libs/postgres_backend/src/lib.rs
@@ -50,14 +50,11 @@ impl QueryError {
    }
 }

-/// Returns true if the given error is a normal consequence of a network issue,
-/// or the client closing the connection. These errors can happen during normal
-/// operations, and don't indicate a bug in our code.
 pub fn is_expected_io_error(e: &io::Error) -> bool {
    use io::ErrorKind::*;
    matches!(
        e.kind(),
-        BrokenPipe | ConnectionRefused | ConnectionAborted | ConnectionReset | TimedOut
+        ConnectionRefused | ConnectionAborted | ConnectionReset | TimedOut
    )
 }

--- a/libs/postgres_ffi/build.rs
+++ b/libs/postgres_ffi/build.rs
@@ -5,7 +5,7 @@ use std::path::PathBuf;
 use std::process::Command;

 use anyhow::{anyhow, Context};
-use bindgen::callbacks::{DeriveInfo, ParseCallbacks};
+use bindgen::callbacks::ParseCallbacks;

 #[derive(Debug)]
 struct PostgresFfiCallbacks;
@@ -20,7 +20,7 @@ impl ParseCallbacks for PostgresFfiCallbacks {

    // Add any custom #[derive] attributes to the data structures that bindgen
    // creates.
-    fn add_derives(&self, derive_info: &DeriveInfo) -> Vec<String> {
+    fn add_derives(&self, name: &str) -> Vec<String> {
        // This is the list of data structures that we want to serialize/deserialize.
        let serde_list = [
            "XLogRecord",
@@ -31,7 +31,7 @@ impl ParseCallbacks for PostgresFfiCallbacks {
            "ControlFileData",
        ];

-        if serde_list.contains(&derive_info.name) {
+        if serde_list.contains(&name) {
            vec![
                "Default".into(), // Default allows us to easily fill the padding fields with 0.
                "Serialize".into(),
--- a/libs/postgres_ffi/src/lib.rs
+++ b/libs/postgres_ffi/src/lib.rs
@@ -95,13 +95,10 @@ pub fn generate_wal_segment(
    segno: u64,
    system_id: u64,
    pg_version: u32,
-    lsn: Lsn,
 ) -> Result<Bytes, SerializeError> {
-    assert_eq!(segno, lsn.segment_number(WAL_SEGMENT_SIZE));
-
    match pg_version {
-        14 => v14::xlog_utils::generate_wal_segment(segno, system_id, lsn),
-        15 => v15::xlog_utils::generate_wal_segment(segno, system_id, lsn),
+        14 => v14::xlog_utils::generate_wal_segment(segno, system_id),
+        15 => v15::xlog_utils::generate_wal_segment(segno, system_id),
        _ => Err(SerializeError::BadInput),
    }
 }
--- a/libs/postgres_ffi/src/pg_constants.rs
+++ b/libs/postgres_ffi/src/pg_constants.rs
@@ -195,7 +195,6 @@ pub const FIRST_NORMAL_OBJECT_ID: u32 = 16384;

 pub const XLOG_CHECKPOINT_SHUTDOWN: u8 = 0x00;
 pub const XLOG_CHECKPOINT_ONLINE: u8 = 0x10;
-pub const XLP_FIRST_IS_CONTRECORD: u16 = 0x0001;
 pub const XLP_LONG_HEADER: u16 = 0x0002;

 /* From fsm_internals.h */
--- a/libs/postgres_ffi/src/xlog_utils.rs
+++ b/libs/postgres_ffi/src/xlog_utils.rs
@@ -270,11 +270,6 @@ impl XLogPageHeaderData {
        use utils::bin_ser::LeSer;
        XLogPageHeaderData::des_from(&mut buf.reader())
    }
-
-    pub fn encode(&self) -> Result<Bytes, SerializeError> {
-        use utils::bin_ser::LeSer;
-        self.ser().map(|b| b.into())
-    }
 }

 impl XLogLongPageHeaderData {
@@ -333,32 +328,22 @@ impl CheckPoint {
    }
 }

-/// Generate new, empty WAL segment, with correct block headers at the first
-/// page of the segment and the page that contains the given LSN.
-/// We need this segment to start compute node.
-pub fn generate_wal_segment(segno: u64, system_id: u64, lsn: Lsn) -> Result<Bytes, SerializeError> {
+//
+// Generate new, empty WAL segment.
+// We need this segment to start compute node.
+//
+pub fn generate_wal_segment(segno: u64, system_id: u64) -> Result<Bytes, SerializeError> {
    let mut seg_buf = BytesMut::with_capacity(WAL_SEGMENT_SIZE);

    let pageaddr = XLogSegNoOffsetToRecPtr(segno, 0, WAL_SEGMENT_SIZE);
-
-    let page_off = lsn.block_offset();
-    let seg_off = lsn.segment_offset(WAL_SEGMENT_SIZE);
-
-    let first_page_only = seg_off < XLOG_BLCKSZ;
-    let (shdr_rem_len, infoflags) = if first_page_only {
-        (seg_off, pg_constants::XLP_FIRST_IS_CONTRECORD)
-    } else {
-        (0, 0)
-    };
-
    let hdr = XLogLongPageHeaderData {
        std: {
            XLogPageHeaderData {
                xlp_magic: XLOG_PAGE_MAGIC as u16,
-                xlp_info: pg_constants::XLP_LONG_HEADER | infoflags,
+                xlp_info: pg_constants::XLP_LONG_HEADER,
                xlp_tli: PG_TLI,
                xlp_pageaddr: pageaddr,
-                xlp_rem_len: shdr_rem_len as u32,
+                xlp_rem_len: 0,
                ..Default::default() // Put 0 in padding fields.
            }
        },
@@ -372,33 +357,6 @@ pub fn generate_wal_segment(segno: u64, system_id: u64, lsn: Lsn) -> Result<Byte

    //zero out the rest of the file
    seg_buf.resize(WAL_SEGMENT_SIZE, 0);
-
-    if !first_page_only {
-        let block_offset = lsn.page_offset_in_segment(WAL_SEGMENT_SIZE) as usize;
-        let header = XLogPageHeaderData {
-            xlp_magic: XLOG_PAGE_MAGIC as u16,
-            xlp_info: if page_off >= pg_constants::SIZE_OF_PAGE_HEADER as u64 {
-                pg_constants::XLP_FIRST_IS_CONTRECORD
-            } else {
-                0
-            },
-            xlp_tli: PG_TLI,
-            xlp_pageaddr: lsn.page_lsn().0,
-            xlp_rem_len: if page_off >= pg_constants::SIZE_OF_PAGE_HEADER as u64 {
-                page_off as u32
-            } else {
-                0u32
-            },
-            ..Default::default() // Put 0 in padding fields.
-        };
-        let hdr_bytes = header.encode()?;
-
-        debug_assert!(seg_buf.len() > block_offset + hdr_bytes.len());
-        debug_assert_ne!(block_offset, 0);
-
-        seg_buf[block_offset..block_offset + hdr_bytes.len()].copy_from_slice(&hdr_bytes[..]);
-    }
-
    Ok(seg_buf.freeze())
 }

--- a/libs/postgres_ffi/wal_craft/src/lib.rs
+++ b/libs/postgres_ffi/wal_craft/src/lib.rs
@@ -1,13 +1,15 @@
-use anyhow::{bail, ensure};
+use anyhow::*;
+use core::time::Duration;
 use log::*;
 use postgres::types::PgLsn;
 use postgres::Client;
 use postgres_ffi::{WAL_SEGMENT_SIZE, XLOG_BLCKSZ};
 use postgres_ffi::{XLOG_SIZE_OF_XLOG_RECORD, XLOG_SIZE_OF_XLOG_SHORT_PHD};
 use std::cmp::Ordering;
+use std::fs;
 use std::path::{Path, PathBuf};
-use std::process::Command;
-use std::time::{Duration, Instant};
+use std::process::{Command, Stdio};
+use std::time::Instant;
 use tempfile::{tempdir, TempDir};

 #[derive(Debug, Clone, PartialEq, Eq)]
@@ -54,7 +56,7 @@ impl Conf {
        self.datadir.join("pg_wal")
    }

-    fn new_pg_command(&self, command: impl AsRef<Path>) -> anyhow::Result<Command> {
+    fn new_pg_command(&self, command: impl AsRef<Path>) -> Result<Command> {
        let path = self.pg_bin_dir()?.join(command);
        ensure!(path.exists(), "Command {:?} does not exist", path);
        let mut cmd = Command::new(path);
@@ -64,7 +66,7 @@ impl Conf {
        Ok(cmd)
    }

-    pub fn initdb(&self) -> anyhow::Result<()> {
+    pub fn initdb(&self) -> Result<()> {
        if let Some(parent) = self.datadir.parent() {
            info!("Pre-creating parent directory {:?}", parent);
            // Tests may be run concurrently and there may be a race to create `test_output/`.
@@ -78,7 +80,7 @@ impl Conf {
        let output = self
            .new_pg_command("initdb")?
            .arg("-D")
-            .arg(&self.datadir)
+            .arg(self.datadir.as_os_str())
            .args(["-U", "postgres", "--no-instructions", "--no-sync"])
            .output()?;
        debug!("initdb output: {:?}", output);
@@ -91,18 +93,26 @@ impl Conf {
        Ok(())
    }

-    pub fn start_server(&self) -> anyhow::Result<PostgresServer> {
+    pub fn start_server(&self) -> Result<PostgresServer> {
        info!("Starting Postgres server in {:?}", self.datadir);
+        let log_file = fs::File::create(self.datadir.join("pg.log")).with_context(|| {
+            format!(
+                "Failed to create pg.log file in directory {}",
+                self.datadir.display()
+            )
+        })?;
        let unix_socket_dir = tempdir()?; // We need a directory with a short name for Unix socket (up to 108 symbols)
        let unix_socket_dir_path = unix_socket_dir.path().to_owned();
        let server_process = self
            .new_pg_command("postgres")?
            .args(["-c", "listen_addresses="])
            .arg("-k")
-            .arg(&unix_socket_dir_path)
+            .arg(unix_socket_dir_path.as_os_str())
            .arg("-D")
-            .arg(&self.datadir)
+            .arg(self.datadir.as_os_str())
+            .args(["-c", "logging_collector=on"]) // stderr would otherwise clutter the test output
            .args(REQUIRED_POSTGRES_CONFIG.iter().flat_map(|cfg| ["-c", cfg]))
+            .stderr(Stdio::from(log_file))
            .spawn()?;
        let server = PostgresServer {
            process: server_process,
@@ -111,7 +121,7 @@ impl Conf {
                let mut c = postgres::Config::new();
                c.host_path(&unix_socket_dir_path);
                c.user("postgres");
-                c.connect_timeout(Duration::from_millis(10000));
+                c.connect_timeout(Duration::from_millis(1000));
                c
            },
        };
@@ -122,7 +132,7 @@ impl Conf {
        &self,
        first_segment_name: &str,
        last_segment_name: &str,
-    ) -> anyhow::Result<std::process::Output> {
+    ) -> Result<std::process::Output> {
        let first_segment_file = self.datadir.join(first_segment_name);
        let last_segment_file = self.datadir.join(last_segment_name);
        info!(
@@ -132,7 +142,10 @@ impl Conf {
        );
        let output = self
            .new_pg_command("pg_waldump")?
-            .args([&first_segment_file, &last_segment_file])
+            .args([
+                &first_segment_file.as_os_str(),
+                &last_segment_file.as_os_str(),
+            ])
            .output()?;
        debug!("waldump output: {:?}", output);
        Ok(output)
@@ -140,9 +153,10 @@ impl Conf {
 }

 impl PostgresServer {
-    pub fn connect_with_timeout(&self) -> anyhow::Result<Client> {
+    pub fn connect_with_timeout(&self) -> Result<Client> {
        let retry_until = Instant::now() + *self.client_config.get_connect_timeout().unwrap();
        while Instant::now() < retry_until {
+            use std::result::Result::Ok;
            if let Ok(client) = self.client_config.connect(postgres::NoTls) {
                return Ok(client);
            }
@@ -159,6 +173,7 @@ impl PostgresServer {

 impl Drop for PostgresServer {
    fn drop(&mut self) {
+        use std::result::Result::Ok;
        match self.process.try_wait() {
            Ok(Some(_)) => return,
            Ok(None) => {
@@ -173,12 +188,12 @@ impl Drop for PostgresServer {
 }

 pub trait PostgresClientExt: postgres::GenericClient {
-    fn pg_current_wal_insert_lsn(&mut self) -> anyhow::Result<PgLsn> {
+    fn pg_current_wal_insert_lsn(&mut self) -> Result<PgLsn> {
        Ok(self
            .query_one("SELECT pg_current_wal_insert_lsn()", &[])?
            .get(0))
    }
-    fn pg_current_wal_flush_lsn(&mut self) -> anyhow::Result<PgLsn> {
+    fn pg_current_wal_flush_lsn(&mut self) -> Result<PgLsn> {
        Ok(self
            .query_one("SELECT pg_current_wal_flush_lsn()", &[])?
            .get(0))
@@ -187,7 +202,7 @@ pub trait PostgresClientExt: postgres::GenericClient {

 impl<C: postgres::GenericClient> PostgresClientExt for C {}

-pub fn ensure_server_config(client: &mut impl postgres::GenericClient) -> anyhow::Result<()> {
+pub fn ensure_server_config(client: &mut impl postgres::GenericClient) -> Result<()> {
    client.execute("create extension if not exists neon_test_utils", &[])?;

    let wal_keep_size: String = client.query_one("SHOW wal_keep_size", &[])?.get(0);
@@ -221,13 +236,13 @@ pub trait Crafter {
    /// * A vector of some valid "interesting" intermediate LSNs which one may start reading from.
    ///   May include or exclude Lsn(0) and the end-of-wal.
    /// * The expected end-of-wal LSN.
-    fn craft(client: &mut impl postgres::GenericClient) -> anyhow::Result<(Vec<PgLsn>, PgLsn)>;
+    fn craft(client: &mut impl postgres::GenericClient) -> Result<(Vec<PgLsn>, PgLsn)>;
 }

 fn craft_internal<C: postgres::GenericClient>(
    client: &mut C,
-    f: impl Fn(&mut C, PgLsn) -> anyhow::Result<(Vec<PgLsn>, Option<PgLsn>)>,
-) -> anyhow::Result<(Vec<PgLsn>, PgLsn)> {
+    f: impl Fn(&mut C, PgLsn) -> Result<(Vec<PgLsn>, Option<PgLsn>)>,
+) -> Result<(Vec<PgLsn>, PgLsn)> {
    ensure_server_config(client)?;

    let initial_lsn = client.pg_current_wal_insert_lsn()?;
@@ -259,7 +274,7 @@ fn craft_internal<C: postgres::GenericClient>(
 pub struct Simple;
 impl Crafter for Simple {
    const NAME: &'static str = "simple";
-    fn craft(client: &mut impl postgres::GenericClient) -> anyhow::Result<(Vec<PgLsn>, PgLsn)> {
+    fn craft(client: &mut impl postgres::GenericClient) -> Result<(Vec<PgLsn>, PgLsn)> {
        craft_internal(client, |client, _| {
            client.execute("CREATE table t(x int)", &[])?;
            Ok((Vec::new(), None))
@@ -270,7 +285,7 @@ impl Crafter for Simple {
 pub struct LastWalRecordXlogSwitch;
 impl Crafter for LastWalRecordXlogSwitch {
    const NAME: &'static str = "last_wal_record_xlog_switch";
-    fn craft(client: &mut impl postgres::GenericClient) -> anyhow::Result<(Vec<PgLsn>, PgLsn)> {
+    fn craft(client: &mut impl postgres::GenericClient) -> Result<(Vec<PgLsn>, PgLsn)> {
        // Do not use generate_internal because here we end up with flush_lsn exactly on
        // the segment boundary and insert_lsn after the initial page header, which is unusual.
        ensure_server_config(client)?;
@@ -292,7 +307,7 @@ impl Crafter for LastWalRecordXlogSwitch {
 pub struct LastWalRecordXlogSwitchEndsOnPageBoundary;
 impl Crafter for LastWalRecordXlogSwitchEndsOnPageBoundary {
    const NAME: &'static str = "last_wal_record_xlog_switch_ends_on_page_boundary";
-    fn craft(client: &mut impl postgres::GenericClient) -> anyhow::Result<(Vec<PgLsn>, PgLsn)> {
+    fn craft(client: &mut impl postgres::GenericClient) -> Result<(Vec<PgLsn>, PgLsn)> {
        // Do not use generate_internal because here we end up with flush_lsn exactly on
        // the segment boundary and insert_lsn after the initial page header, which is unusual.
        ensure_server_config(client)?;
@@ -359,7 +374,7 @@ impl Crafter for LastWalRecordXlogSwitchEndsOnPageBoundary {
 fn craft_single_logical_message(
    client: &mut impl postgres::GenericClient,
    transactional: bool,
-) -> anyhow::Result<(Vec<PgLsn>, PgLsn)> {
+) -> Result<(Vec<PgLsn>, PgLsn)> {
    craft_internal(client, |client, initial_lsn| {
        ensure!(
            initial_lsn < PgLsn::from(0x0200_0000 - 1024 * 1024),
@@ -401,7 +416,7 @@ fn craft_single_logical_message(
 pub struct WalRecordCrossingSegmentFollowedBySmallOne;
 impl Crafter for WalRecordCrossingSegmentFollowedBySmallOne {
    const NAME: &'static str = "wal_record_crossing_segment_followed_by_small_one";
-    fn craft(client: &mut impl postgres::GenericClient) -> anyhow::Result<(Vec<PgLsn>, PgLsn)> {
+    fn craft(client: &mut impl postgres::GenericClient) -> Result<(Vec<PgLsn>, PgLsn)> {
        craft_single_logical_message(client, true)
    }
 }
@@ -409,7 +424,7 @@ impl Crafter for WalRecordCrossingSegmentFollowedBySmallOne {
 pub struct LastWalRecordCrossingSegment;
 impl Crafter for LastWalRecordCrossingSegment {
    const NAME: &'static str = "last_wal_record_crossing_segment";
-    fn craft(client: &mut impl postgres::GenericClient) -> anyhow::Result<(Vec<PgLsn>, PgLsn)> {
+    fn craft(client: &mut impl postgres::GenericClient) -> Result<(Vec<PgLsn>, PgLsn)> {
        craft_single_logical_message(client, false)
    }
 }
--- a/libs/pq_proto/Cargo.toml
+++ b/libs/pq_proto/Cargo.toml
@@ -10,6 +10,7 @@ byteorder.workspace = true
 pin-project-lite.workspace = true
 postgres-protocol.workspace = true
 rand.workspace = true
+serde.workspace = true
 tokio.workspace = true
 tracing.workspace = true
 thiserror.workspace = true
--- a/libs/pq_proto/src/lib.rs
+++ b/libs/pq_proto/src/lib.rs
@@ -6,10 +6,15 @@ pub mod framed;

 use byteorder::{BigEndian, ReadBytesExt};
 use bytes::{Buf, BufMut, Bytes, BytesMut};
-use std::{borrow::Cow, collections::HashMap, fmt, io, str};
-
-// re-export for use in utils pageserver_feedback.rs
-pub use postgres_protocol::PG_EPOCH;
+use postgres_protocol::PG_EPOCH;
+use serde::{Deserialize, Serialize};
+use std::{
+    borrow::Cow,
+    collections::HashMap,
+    fmt, io, str,
+    time::{Duration, SystemTime},
+};
+use tracing::{trace, warn};

 pub type Oid = u32;
 pub type SystemId = u64;
@@ -659,7 +664,7 @@ fn write_cstr(s: impl AsRef<[u8]>, buf: &mut BytesMut) -> Result<(), ProtocolErr
 }

 /// Read cstring from buf, advancing it.
-pub fn read_cstr(buf: &mut Bytes) -> Result<Bytes, ProtocolError> {
+fn read_cstr(buf: &mut Bytes) -> Result<Bytes, ProtocolError> {
    let pos = buf
        .iter()
        .position(|x| *x == 0)
@@ -934,10 +939,175 @@ impl<'a> BeMessage<'a> {
    }
 }

+/// Feedback pageserver sends to safekeeper and safekeeper resends to compute.
+/// Serialized in custom flexible key/value format. In replication protocol, it
+/// is marked with NEON_STATUS_UPDATE_TAG_BYTE to differentiate from postgres
+/// Standby status update / Hot standby feedback messages.
+#[derive(Debug, Clone, Copy, PartialEq, Eq, Serialize, Deserialize)]
+pub struct PageserverFeedback {
+    /// Last known size of the timeline. Used to enforce timeline size limit.
+    pub current_timeline_size: u64,
+    /// LSN last received and ingested by the pageserver.
+    pub last_received_lsn: u64,
+    /// LSN up to which data is persisted by the pageserver to its local disk.
+    pub disk_consistent_lsn: u64,
+    /// LSN up to which data is persisted by the pageserver on S3; safekeepers
+    /// consider WAL before it can be removed.
+    pub remote_consistent_lsn: u64,
+    pub replytime: SystemTime,
+}
+
+// NOTE: Do not forget to increment this number when adding new fields to PageserverFeedback.
+// Do not remove previously available fields because this might be backwards incompatible.
+pub const PAGESERVER_FEEDBACK_FIELDS_NUMBER: u8 = 5;
+
+impl PageserverFeedback {
+    pub fn empty() -> PageserverFeedback {
+        PageserverFeedback {
+            current_timeline_size: 0,
+            last_received_lsn: 0,
+            remote_consistent_lsn: 0,
+            disk_consistent_lsn: 0,
+            replytime: SystemTime::now(),
+        }
+    }
+
+    // Serialize PageserverFeedback using custom format
+    // to support protocol extensibility.
+    //
+    // Following layout is used:
+    // char - number of key-value pairs that follow.
+    //
+    // key-value pairs:
+    // null-terminated string - key,
+    // uint32 - value length in bytes
+    // value itself
+    //
+    // TODO: change serialized fields names once all computes migrate to rename.
+    pub fn serialize(&self, buf: &mut BytesMut) {
+        buf.put_u8(PAGESERVER_FEEDBACK_FIELDS_NUMBER); // # of keys
+        buf.put_slice(b"current_timeline_size\0");
+        buf.put_i32(8);
+        buf.put_u64(self.current_timeline_size);
+
+        buf.put_slice(b"ps_writelsn\0");
+        buf.put_i32(8);
+        buf.put_u64(self.last_received_lsn);
+        buf.put_slice(b"ps_flushlsn\0");
+        buf.put_i32(8);
+        buf.put_u64(self.disk_consistent_lsn);
+        buf.put_slice(b"ps_applylsn\0");
+        buf.put_i32(8);
+        buf.put_u64(self.remote_consistent_lsn);
+
+        let timestamp = self
+            .replytime
+            .duration_since(*PG_EPOCH)
+            .expect("failed to serialize pg_replytime earlier than PG_EPOCH")
+            .as_micros() as i64;
+
+        buf.put_slice(b"ps_replytime\0");
+        buf.put_i32(8);
+        buf.put_i64(timestamp);
+    }
+
+    // Deserialize PageserverFeedback message
+    // TODO: change serialized fields names once all computes migrate to rename.
+    pub fn parse(mut buf: Bytes) -> PageserverFeedback {
+        let mut rf = PageserverFeedback::empty();
+        let nfields = buf.get_u8();
+        for _ in 0..nfields {
+            let key = read_cstr(&mut buf).unwrap();
+            match key.as_ref() {
+                b"current_timeline_size" => {
+                    let len = buf.get_i32();
+                    assert_eq!(len, 8);
+                    rf.current_timeline_size = buf.get_u64();
+                }
+                b"ps_writelsn" => {
+                    let len = buf.get_i32();
+                    assert_eq!(len, 8);
+                    rf.last_received_lsn = buf.get_u64();
+                }
+                b"ps_flushlsn" => {
+                    let len = buf.get_i32();
+                    assert_eq!(len, 8);
+                    rf.disk_consistent_lsn = buf.get_u64();
+                }
+                b"ps_applylsn" => {
+                    let len = buf.get_i32();
+                    assert_eq!(len, 8);
+                    rf.remote_consistent_lsn = buf.get_u64();
+                }
+                b"ps_replytime" => {
+                    let len = buf.get_i32();
+                    assert_eq!(len, 8);
+                    let raw_time = buf.get_i64();
+                    if raw_time > 0 {
+                        rf.replytime = *PG_EPOCH + Duration::from_micros(raw_time as u64);
+                    } else {
+                        rf.replytime = *PG_EPOCH - Duration::from_micros(-raw_time as u64);
+                    }
+                }
+                _ => {
+                    let len = buf.get_i32();
+                    warn!(
+                        "PageserverFeedback parse. unknown key {} of len {len}. Skip it.",
+                        String::from_utf8_lossy(key.as_ref())
+                    );
+                    buf.advance(len as usize);
+                }
+            }
+        }
+        trace!("PageserverFeedback parsed is {:?}", rf);
+        rf
+    }
+}
+
 #[cfg(test)]
 mod tests {
    use super::*;

+    #[test]
+    fn test_replication_feedback_serialization() {
+        let mut rf = PageserverFeedback::empty();
+        // Fill rf with some values
+        rf.current_timeline_size = 12345678;
+        // Set rounded time to be able to compare it with deserialized value,
+        // because it is truncated to microseconds during serialization.
+        rf.replytime = *PG_EPOCH + Duration::from_secs(100_000_000);
+        let mut data = BytesMut::new();
+        rf.serialize(&mut data);
+
+        let rf_parsed = PageserverFeedback::parse(data.freeze());
+        assert_eq!(rf, rf_parsed);
+    }
+
+    #[test]
+    fn test_replication_feedback_unknown_key() {
+        let mut rf = PageserverFeedback::empty();
+        // Fill rf with some values
+        rf.current_timeline_size = 12345678;
+        // Set rounded time to be able to compare it with deserialized value,
+        // because it is truncated to microseconds during serialization.
+        rf.replytime = *PG_EPOCH + Duration::from_secs(100_000_000);
+        let mut data = BytesMut::new();
+        rf.serialize(&mut data);
+
+        // Add an extra field to the buffer and adjust number of keys
+        if let Some(first) = data.first_mut() {
+            *first = PAGESERVER_FEEDBACK_FIELDS_NUMBER + 1;
+        }
+
+        data.put_slice(b"new_field_one\0");
+        data.put_i32(8);
+        data.put_u64(42);
+
+        // Parse serialized data and check that new field is not parsed
+        let rf_parsed = PageserverFeedback::parse(data.freeze());
+        assert_eq!(rf, rf_parsed);
+    }
+
    #[test]
    fn test_startup_message_params_options_escaped() {
        fn split_options(params: &StartupMessageParams) -> Vec<Cow<'_, str>> {
--- a/libs/remote_storage/tests/pagination_tests.rs
+++ b/libs/remote_storage/tests/pagination_tests.rs
@@ -99,11 +99,7 @@ struct S3WithTestBlobs {
 #[async_trait::async_trait]
 impl AsyncTestContext for MaybeEnabledS3 {
    async fn setup() -> Self {
-        utils::logging::init(
-            utils::logging::LogFormat::Test,
-            utils::logging::TracingErrorLayerEnablement::Disabled,
-        )
-        .expect("logging init failed");
+        utils::logging::init(utils::logging::LogFormat::Test).expect("logging init failed");
        if env::var(ENABLE_REAL_S3_REMOTE_STORAGE_ENV_VAR_NAME).is_err() {
            info!(
                "`{}` env variable is not set, skipping the test",
@@ -208,7 +204,12 @@ async fn upload_s3_data(
            let data = format!("remote blob data {i}").into_bytes();
            let data_len = data.len();
            task_client
-                .upload(std::io::Cursor::new(data), data_len, &blob_path, None)
+                .upload(
+                    Box::new(std::io::Cursor::new(data)),
+                    data_len,
+                    &blob_path,
+                    None,
+                )
                .await?;

            Ok::<_, anyhow::Error>((blob_prefix, blob_path))
--- a/libs/timeline_data_path/Cargo.toml
+++ b/libs/timeline_data_path/Cargo.toml
@@ -0,0 +1,13 @@
+[package]
+name = "timeline_data_path"
+version = "0.1.0"
+edition.workspace = true
+license.workspace = true
+
+# See more keys and their definitions at https://doc.rust-lang.org/cargo/reference/manifest.html
+
+[dependencies]
+utils.workspace = true
+workspace_hack.workspace = true
+tokio.workspace = true
+thiserror.workspace = true
--- a/libs/timeline_data_path/src/lib.rs
+++ b/libs/timeline_data_path/src/lib.rs
@@ -0,0 +1,396 @@
+//! The Timeline's core data path.
+//!
+//! # Overview
+//!
+//! This crate implements the core data path of a Timeline inside Pageserver:
+//!
+//! 1. WAL records from `walreceiver`, via in-memory layers, into persistent L0 layers.
+//! 1. `GetPage@LSN`: retrieval of WAL records and page images for feeding into WAL redo.
+//! 1. Data re-shuffling through compaction (TODO).
+//! 1. Page image creation & garbage collection through GC (TODO).
+//!
+//! This crate assumes the following concepts, but is fully generic over their implementation:
+//!
+//! - **Delta Records**: data is written into the system in the form of self-descriptive deltas.
+//!   For the Pageserver use case, these deltas are derived from Postgres WAL records.
+//! - **Page Numbers**: Delta Records always affect a single key.
+//!   That key is called page number, because, in the Pageserver use case, the Postgres table page numbers are the keys.
+//! - **LSN**: When writing Delta Records into the system, they are associated with a monotonically increasing LSN.
+//!   Subsequently written Delta Records must have increasing LSNs.
+//! - **Page Images**: Delta Records for a given page can be used to reconstruct the page. Think of it like squashing diffs.
+//!   - When sorting the Delta Records for a given key by their LSN, any prefix of that sorting can be squashed into a page image.
+//!   - Delta Records following such a squash can be squashed into that page image.
+//!   - In Pageserver, WAL redo implements the (pure) function of squashing.
+//! - **In-Memory Layer**: an object that represents an "unfinished" L0 layer file, holding Delta Records in insertion order.
+//!   "Unfinished" means that we're still writing Delta Records to that file.
+//! - **Historic Layer**: an object that represents a "finished" layer file, at any compaction level.
+//!   Such objects reside on disk and/or in remote storage.
+//!   They may contain Delta Records, Page Images, or a mixture thereof. It doesn't matter.
+//! - **HistoricStuff**: an efficient lookup data structure to find the list of Historic Layer objects
+//!   that hold the Delta Records / Page Images required to reconstruct a Page Image at a given LSN.
+//!
+//! # API
+//!
+//! The core idea is that of a specialized single-producer multi-consumer structure,
+//! embodied by a Read-end and a Write-end.
+//!
+//! The Write-end is used to push new `DeltaRecord @ LSN`s into the system.
+//! In Pageserver, this is used by the `WalReceiver`.
+//!
+//! The Read-end provides the `GetPage@LSN` API.
+//! In the current iteration, we actually return something called `ReconstructWork`.
+//! I.e., we leave the work of reading the values from the layers, and the WAL redo invocation to the caller.
+//! Find rationale for this design in the *Scope* section.
+//!
+//! ## Immutability
+//!
+//! The traits defined by this crate assume immutable data structures that are multi-versioned.
+//!
+//! As an example for what "immutable" means, take the case where we add a new Historic Layer to HistoricStuff.
+//! Traditionally, one would use shared mutable state, i.e. `Arc<RwLock<...>>`.
+//! To insert the new Historic Layer, we would acquire the RwLock in write mode and modify a lookup data structure to accommodate the new layer.
+//! The Read-ends would use RwLock in read mode to read from the data structure.
+//!
+//! Conversely, with *immutable data structures*, writers create new versions (aka *snapshots*) of the lookup data structure.
+//! New reads on the Read-ends will use the new snapshot, but old ongoing reads would use the old version(s).
+//! An efficient implementation would likely share the Historic Layer objects, e.g., using `Arc`.
+//! And maybe there's internally mutable state inside the layer objects, e.g., to track residence (i.e., *on-demand downloaded* vs *evicted*).
+//! But the important point is that there's no synchronization / lock-holding at any higher level, except when grabbing a reference to the snapshot (Read-end), or when publishing a new snapshot (Write-end).
+//!
+//! ## Scope
+//!
+//! The following concerns are considered implementation details from the perspective of this crate:
+//!
+//! - **Layer File Persistence**: `HistoricStuff::make_historic` is responsible for this.
+//! - **Reading Layer Files**: the `ReconstructWork` that the Read-end returns from `GetPage@LSN` requests contains the list of layers to consult.
+//!   The crate consumer is responsible for reading the layers & doing WAL redo.
+//!   Likely the implementation of `HistoricStuff` plays a role here, because it is responsible for persisting the layer files.
+//! - **Layer Eviction & On-Demand Download**: this is just an aspect of the above.
+//!   The crate consumer can choose to implement eviction & on-demand download however they wish.
+//!   The only requirement is that the Historic Layers don't change their contents, i.e., they always return the same reconstruct values for the same lookup.
+//!   - For example, a `LayerCache` module or service could take care of layer uploads, eviction, and on-demand downloads.
+//!     Initially, the `layer cache` can be local-only.
+//!     But in the future, it can be multi-machine / clustered pageservers / aka "sharding".
+//!
+//! # Example
+//!
+//! The [`new`] function is the entrypoint to this crate.
+//!
+//! See the test cases for how it is used.
+
+use std::{marker::PhantomData, time::Duration};
+
+use utils::seqwait::{self, Advance, SeqWait, Wait};
+
+#[cfg(test)]
+mod tests;
+
+/// Collection of types / type bounds used by Read-end and Write-end.
+///
+/// See the [`crate`]-level docs' *Overview* section to learn about
+/// the meaning of each associated `type`.
+///
+/// # Usage
+///
+/// Define a zero-sized-type and impl this Trait for it.
+/// Then use that zero-sized-type as the single generic argument to [`new`]
+/// and almost all types declared in this crate.
+///
+/// It might feel a bit weird, but, the alternative is to have umpteen generic
+/// types per `impl` with repetitive trait bounds.
+///
+/// Search the test cases for an example of how this can be used to improve testability.
+pub trait Types {
+    type Key: Copy;
+    type Lsn: Ord + Copy;
+    type LsnCounter: seqwait::MonotonicCounter<Self::Lsn> + Copy;
+    type DeltaRecord;
+    type HistoricLayer;
+    type InMemoryLayer: InMemoryLayer<Types = Self> + Clone;
+    type HistoricStuff: HistoricStuff<Types = Self> + Clone;
+    type GetReconstructPathError: std::error::Error;
+}
+
+/// Error returned by [`InMemoryLayer::put`].
+#[derive(thiserror::Error)]
+pub struct InMemoryLayerPutError<DeltaRecord> {
+    delta: DeltaRecord,
+    kind: InMemoryLayerPutErrorKind,
+}
+
+/// Part of [`InMemoryLayerPutError`].
+#[derive(Debug)]
+pub enum InMemoryLayerPutErrorKind {
+    LayerFull,
+    AlreadyHaveRecordForKeyAndLsn,
+}
+
+impl<DeltaRecord> std::fmt::Debug for InMemoryLayerPutError<DeltaRecord> {
+    fn fmt(&self, f: &mut std::fmt::Formatter<'_>) -> std::fmt::Result {
+        f.debug_struct("InMemoryLayerPutError")
+            // would require DeltaRecord to impl Debug
+            //         .field("delta", &self.delta)
+            .field("kind", &self.kind)
+            .finish()
+    }
+}
+
+/// An in-memory layer. See [`crate`] docs for details on this concept.
+pub trait InMemoryLayer: std::fmt::Debug + Default + Clone {
+    type Types: Types;
+    fn put(
+        &mut self,
+        key: <Self::Types as Types>::Key,
+        lsn: <Self::Types as Types>::Lsn,
+        delta: <Self::Types as Types>::DeltaRecord,
+    ) -> Result<Self, InMemoryLayerPutError<<Self::Types as Types>::DeltaRecord>>;
+    fn get(
+        &self,
+        key: <Self::Types as Types>::Key,
+        lsn: <Self::Types as Types>::Lsn,
+    ) -> Vec<<Self::Types as Types>::DeltaRecord>;
+}
+
+/// The manager of [`Types::HistoricLayer`]s.
+pub trait HistoricStuff {
+    type Types: Types;
+    fn get_reconstruct_path(
+        &self,
+        key: <Self::Types as Types>::Key,
+        lsn: <Self::Types as Types>::Lsn,
+    ) -> Result<
+        Vec<<Self::Types as Types>::HistoricLayer>,
+        <Self::Types as Types>::GetReconstructPathError,
+    >;
+    /// Produce a new version of `self` that includes the given inmem layer.
+    fn make_historic(&self, inmem: <Self::Types as Types>::InMemoryLayer) -> Self;
+}
+
+/// A snapshot of the data. See [`crate`]-level docs section on *immutability* for details.
+struct Snapshot<T: Types> {
+    _types: PhantomData<T>,
+    inmem: Option<T::InMemoryLayer>,
+    historic: T::HistoricStuff,
+}
+
+impl<T: Types> Clone for Snapshot<T> {
+    fn clone(&self) -> Self {
+        Self {
+            _types: self._types.clone(),
+            inmem: self.inmem.clone(),
+            historic: self.historic.clone(),
+        }
+    }
+}
+
+/// The Read-end. See [`crate`]-level docs for details.
+pub struct Reader<T: Types> {
+    wait: Wait<T::LsnCounter, T::Lsn, Snapshot<T>>,
+}
+
+/// The Write-end. See [`crate`]-level docs for details.
+pub struct Writer<T: Types> {
+    advance: Advance<T::LsnCounter, T::Lsn, Snapshot<T>>,
+}
+
+/// Set up a pair of Read-end and Write-end. This is the entrypoint to this crate.
+///
+/// The idea is that the caller loads the arguments from persistent state that `HistoricStuff` wrote at an earlier point in time.
+pub fn new<T: Types>(lsn: T::LsnCounter, historic: T::HistoricStuff) -> (Reader<T>, Writer<T>) {
+    let state = Snapshot {
+        _types: PhantomData::<T>::default(),
+        inmem: None,
+        historic: historic,
+    };
+    let (wait, advance) = SeqWait::new(lsn, state).split_spmc();
+    let reader = Reader { wait };
+    let read_writer = Writer { advance };
+    (reader, read_writer)
+}
+
+/// Error returned by the get-page operations.
+#[derive(Debug, thiserror::Error)]
+pub enum GetError<T: Types> {
+    #[error(transparent)]
+    SeqWait(seqwait::SeqWaitError),
+    #[error(transparent)]
+    GetReconstructPath(T::GetReconstructPathError),
+}
+
+/// Self-contained set of objects required to reconstruct a page image for the given `key` @ `lsn`.
+///
+/// This is returned by the `get` methods of [`Reader`] and [`Writer`].
+///
+/// To reconstruct the page image, stack up (top to bottom) `inmem_records` plus all records found for `key` and `lsn` along the `historic_path` until an initial page image is found.
+/// Then feed that stack to WAL-redo to get the page image.
+///
+/// See [`crate`]-level docs on *scope* for why we don't return page images from these functions.
+pub struct ReconstructWork<T: Types> {
+    pub key: T::Key,
+    pub lsn: T::Lsn,
+    pub inmem_records: Vec<T::DeltaRecord>,
+    pub historic_path: Vec<T::HistoricLayer>,
+}
+
+impl<T: Types> Reader<T> {
+    /// This is the `GetPage@LSN` operation.
+    ///
+    /// See the [`crate`]-level docs for why we return [`ReconstructWork`] instead of a Page Image here.
+    pub async fn get(&self, key: T::Key, lsn: T::Lsn) -> Result<ReconstructWork<T>, GetError<T>> {
+        // XXX dedup with Writer::get_nowait
+        let state = self.wait.wait_for(lsn).await.map_err(GetError::SeqWait)?;
+        let inmem_records = state
+            .inmem
+            .as_ref()
+            .map(|iml| iml.get(key, lsn))
+            .unwrap_or_default();
+        let historic_path = state
+            .historic
+            .get_reconstruct_path(key, lsn)
+            .map_err(GetError::GetReconstructPath)?;
+        Ok(ReconstructWork {
+            key,
+            lsn,
+            inmem_records,
+            historic_path,
+        })
+    }
+}
+
+/// Error returned by the `put` operation.
+#[derive(thiserror::Error)]
+pub struct PutError<T: Types> {
+    /// The `delta` record which we failed to `put`.
+    pub delta: T::DeltaRecord,
+    /// Description of what went wrong.
+    pub kind: PutErrorKind,
+}
+
+/// Part of [`PutError`].
+#[derive(Debug)]
+pub enum PutErrorKind {
+    AlreadyHaveInMemoryRecordForKeyAndLsn,
+}
+
+impl<T: Types> std::fmt::Debug for PutError<T> {
+    fn fmt(&self, f: &mut std::fmt::Formatter<'_>) -> std::fmt::Result {
+        f.debug_struct("PutError")
+            // would need to require Debug for DeltaRecord
+            // .field("delta", &self.delta)
+            .field("kind", &self.kind)
+            .finish()
+    }
+}
+
+impl<T: Types> Writer<T> {
+    /// Insert data into the system.
+    pub async fn put(
+        &mut self,
+        key: T::Key,
+        lsn: T::Lsn,
+        delta: T::DeltaRecord,
+    ) -> Result<(), PutError<T>> {
+        let (_snapshot_lsn, snapshot) = self.advance.get_current_data();
+        // TODO ensure snapshot_lsn <= lsn?
+        let mut inmem = snapshot
+            .inmem
+            .unwrap_or_else(|| T::InMemoryLayer::default());
+        // XXX: use the Advance as witness and only allow witness to access inmem in write mode
+        match inmem.put(key, lsn, delta) {
+            Ok(new_inmem) => {
+                let new_snapshot = Snapshot {
+                    _types: PhantomData,
+                    inmem: Some(new_inmem),
+                    historic: snapshot.historic,
+                };
+                self.advance.advance(lsn, Some(new_snapshot));
+            }
+            Err(InMemoryLayerPutError {
+                delta,
+                kind: InMemoryLayerPutErrorKind::AlreadyHaveRecordForKeyAndLsn,
+            }) => {
+                return Err(PutError {
+                    delta,
+                    kind: PutErrorKind::AlreadyHaveInMemoryRecordForKeyAndLsn,
+                });
+            }
+            Err(InMemoryLayerPutError {
+                delta,
+                kind: InMemoryLayerPutErrorKind::LayerFull,
+            }) => {
+                let new_historic = snapshot.historic.make_historic(inmem);
+                let mut new_inmem = T::InMemoryLayer::default();
+                let new_inmem = new_inmem
+                    .put(key, lsn, delta)
+                    .expect("put into default inmem layer must not fail");
+                let new_state = Snapshot {
+                    _types: PhantomData::<T>::default(),
+                    inmem: Some(new_inmem),
+                    historic: new_historic,
+                };
+                self.advance.advance(lsn, Some(new_state));
+            }
+        }
+        Ok(())
+    }
+
+    /// Force flushing of the current in-memory layer.
+    ///
+    /// Usually, flushing happens only if the in-memory layer is full.
+    /// Use this API to make it happen in other circumstances (shutdown, periodic ticker, etc.).
+    pub async fn force_flush(&mut self) -> tokio::io::Result<()> {
+        let (snapshot_lsn, snapshot) = self.advance.get_current_data();
+        let Snapshot {
+            _types,
+            inmem,
+            historic,
+        } = snapshot;
+        // XXX: use the Advance as witness and only allow witness to access inmem in "write" mode
+        let Some(inmem) = inmem else {
+            // nothing to do
+            return Ok(());
+        };
+        let new_historic = historic.make_historic(inmem);
+        let new_snapshot = Snapshot {
+            _types: PhantomData::<T>::default(),
+            inmem: None,
+            historic: new_historic,
+        };
+        self.advance.advance(snapshot_lsn, Some(new_snapshot)); // TODO: should fail if we're past snapshot_lsn
+        Ok(())
+    }
+
+    /// `get` at the given LSN, without blocking.
+    ///
+    /// Fails with a timeout error if the `lsn` isn't there yet.
+    /// That makes sense because the only way we'd stop waiting is by a `self.put()`.
+    /// But concurrent `put()` is forbidden.
+    pub async fn get_nowait(
+        &self,
+        key: T::Key,
+        lsn: T::Lsn,
+    ) -> Result<ReconstructWork<T>, GetError<T>> {
+        // XXX dedup with Reader::get
+        let state = self
+            .advance
+            .wait_for_timeout(lsn, Duration::from_secs(0))
+            // The await is never going to block because we pass from_secs(0).
+            .await
+            .map_err(GetError::SeqWait)?;
+        let inmem_records = state
+            .inmem
+            .as_ref()
+            .map(|iml| iml.get(key, lsn))
+            .unwrap_or_default();
+        let historic_path = state
+            .historic
+            .get_reconstruct_path(key, lsn)
+            .map_err(GetError::GetReconstructPath)?;
+        Ok(ReconstructWork {
+            key,
+            lsn,
+            inmem_records,
+            historic_path,
+        })
+    }
+}
--- a/libs/timeline_data_path/src/tests.rs
+++ b/libs/timeline_data_path/src/tests.rs
@@ -0,0 +1,170 @@
+use std::collections::{btree_map::Entry, BTreeMap};
+use std::sync::Arc;
+use utils::seqwait;
+
+/// The ZST for which we impl the `super::Types` type collection trait.
+struct TestTypes;
+
+impl super::Types for TestTypes {
+    type Key = usize;
+
+    type Lsn = usize;
+
+    type LsnCounter = UsizeCounter;
+
+    type DeltaRecord = &'static str;
+
+    type HistoricLayer = Arc<TestHistoricLayer>;
+
+    type InMemoryLayer = TestInMemoryLayer;
+
+    type HistoricStuff = TestHistoricStuff;
+}
+
+/// For testing, our in-memory layer is a simple nested `BTreeMap` (key -> lsn -> record).
+#[derive(Clone, Default, Debug)]
+struct TestInMemoryLayer {
+    by_key: BTreeMap<usize, BTreeMap<usize, &'static str>>,
+}
+
+/// For testing, our historic layers are just in-memory layer objects wrapped in a newtype.
+struct TestHistoricLayer(TestInMemoryLayer);
+
+/// This is the data structure that impls the `HistoricStuff` trait.
+#[derive(Default, Clone)]
+struct TestHistoricStuff {
+    by_key: BTreeMap<usize, BTreeMap<usize, Arc<TestHistoricLayer>>>,
+}
+
+/// `seqwait::MonotonicCounter` impl
+#[derive(Copy, Clone)]
+pub struct UsizeCounter(usize);
+
+// Our testing impl of HistoricStuff references the frozen InMemoryLayer objects
+// from all the (key,lsn) entries that it covers.
+// This mimics the (much more efficient) search tree in the real impl.
+impl super::HistoricStuff for TestHistoricStuff {
+    type Types = TestTypes;
+    fn get_reconstruct_path(
+        &self,
+        key: usize,
+        lsn: usize,
+    ) -> Result<Vec<Arc<TestHistoricLayer>>, super::GetReconstructPathError> {
+        let Some(bk) = self.by_key.get(&key) else {
+                return Ok(vec![]);
+            };
+        Ok(bk.range(..=lsn).rev().map(|(_, l)| Arc::clone(l)).collect())
+    }
+
+    fn make_historic(&self, inmem: TestInMemoryLayer) -> Self {
+        // For the purposes of testing, just turn the inmemory layer historic through the type system
+        let historic = Arc::new(TestHistoricLayer(inmem));
+        // Deep-copy
+        let mut copy = self.by_key.clone();
+        // Add the references to `inmem` to the deep-copied struct
+        for (k, v) in historic.0.by_key.iter() {
+            for (lsn, _deltas) in v.into_iter() {
+                let by_key = copy.entry(*k).or_default();
+                let overwritten = by_key.insert(*lsn, historic.clone());
+                assert!(matches!(overwritten, None), "layers must not overlap");
+            }
+        }
+        Self { by_key: copy }
+    }
+}
+
+impl super::InMemoryLayer for TestInMemoryLayer {
+    type Types = TestTypes;
+
+    fn put(
+        &mut self,
+        key: usize,
+        lsn: usize,
+        delta: &'static str,
+    ) -> Result<Self, super::InMemoryLayerPutError<&'static str>> {
+        let mut clone = self.clone();
+        drop(self);
+        let by_key = clone.by_key.entry(key).or_default();
+        match by_key.entry(lsn) {
+            Entry::Occupied(_record) => {
+                return Err(super::InMemoryLayerPutError {
+                    delta,
+                    kind: super::InMemoryLayerPutErrorKind::AlreadyHaveRecordForKeyAndLsn,
+                });
+            }
+            Entry::Vacant(vacant) => vacant.insert(delta),
+        };
+        Ok(clone)
+    }
+
+    fn get(&self, key: usize, lsn: usize) -> Vec<&'static str> {
+        let by_key = match self.by_key.get(&key) {
+            Some(by_key) => by_key,
+            None => return vec![],
+        };
+        by_key
+            .range(..=lsn)
+            .map(|(_, v)| v)
+            .rev()
+            .cloned()
+            .collect()
+    }
+}
+
+impl UsizeCounter {
+    pub fn new(initial: usize) -> Self {
+        UsizeCounter(initial)
+    }
+}
+
+impl seqwait::MonotonicCounter<usize> for UsizeCounter {
+    fn cnt_advance(&mut self, new_val: usize) {
+        assert!(self.0 < new_val);
+        self.0 = new_val;
+    }
+
+    fn cnt_value(&self) -> usize {
+        self.0
+    }
+}
+
+#[test]
+fn basic() {
+    let lm = TestHistoricStuff::default();
+
+    let (r, mut rw) = super::new::<TestTypes>(UsizeCounter::new(0), lm);
+
+    let r = Arc::new(r);
+    let r2 = Arc::clone(&r);
+
+    let rt = tokio::runtime::Builder::new_current_thread()
+        .enable_all()
+        .build()
+        .unwrap();
+
+    let read_jh = rt.spawn(async move { r.get(0, 10).await });
+
+    let mut rw = rt.block_on(async move {
+        rw.put(0, 1, "foo").await.unwrap();
+        rw.put(1, 1, "bar").await.unwrap();
+        rw.put(0, 10, "baz").await.unwrap();
+        rw
+    });
+
+    let read_res = rt.block_on(read_jh).unwrap().unwrap();
+    assert!(
+        read_res.historic_path.is_empty(),
+        "we have pushed less than needed for flush"
+    );
+    assert_eq!(read_res.inmem_records, vec!["baz", "foo"]);
+
+    let rw = rt.block_on(async move {
+        rw.put(0, 11, "blup").await.unwrap();
+        rw
+    });
+    let read_res = rt.block_on(async move { r2.get(0, 11).await.unwrap() });
+    assert_eq!(read_res.historic_path.len(), 0);
+    assert_eq!(read_res.inmem_records, vec!["blup", "baz", "foo"]);
+
+    drop(rw);
+}
--- a/libs/tracing-utils/Cargo.toml
+++ b/libs/tracing-utils/Cargo.toml
@@ -14,5 +14,4 @@ tokio = { workspace = true, features = ["rt", "rt-multi-thread"] }
 tracing.workspace = true
 tracing-opentelemetry.workspace = true
 tracing-subscriber.workspace = true
-
-workspace_hack.workspace = true
+workspace_hack = { version = "0.1", path = "../../workspace_hack" }
--- a/libs/utils/Cargo.toml
+++ b/libs/utils/Cargo.toml
@@ -11,7 +11,6 @@ async-trait.workspace = true
 anyhow.workspace = true
 bincode.workspace = true
 bytes.workspace = true
-chrono.workspace = true
 heapless.workspace = true
 hex = { workspace = true, features = ["serde"] }
 hyper = { workspace = true, features = ["full"] }
@@ -28,18 +27,17 @@ signal-hook.workspace = true
 thiserror.workspace = true
 tokio.workspace = true
 tracing.workspace = true
-tracing-error.workspace = true
-tracing-subscriber = { workspace = true, features = ["json", "registry"] }
+tracing-subscriber = { workspace = true, features = ["json"] }
 rand.workspace = true
 serde_with.workspace = true
 strum.workspace = true
 strum_macros.workspace = true
 url.workspace = true
-uuid.workspace = true
+uuid = { version = "1.2", features = ["v4", "serde"] }

-pq_proto.workspace = true
 metrics.workspace = true
 workspace_hack.workspace = true
+either.workspace = true

 [dev-dependencies]
 byteorder.workspace = true
--- a/libs/utils/src/http/endpoint.rs
+++ b/libs/utils/src/http/endpoint.rs
@@ -76,7 +76,6 @@ where

        let log_quietly = method == Method::GET;
        async move {
-            let cancellation_guard = RequestCancelled::warn_when_dropped_without_responding();
            if log_quietly {
                debug!("Handling request");
            } else {
@@ -88,11 +87,7 @@ where
            // Usage of the error handler also means that we expect only the `ApiError` errors to be raised in this call.
            //
            // Panics are not handled separately, there's a `tracing_panic_hook` from another module to do that globally.
-            let res = (self.0)(request).await;
-
-            cancellation_guard.disarm();
-
-            match res {
+            match (self.0)(request).await {
                Ok(response) => {
                    let response_status = response.status();
                    if log_quietly && response_status.is_success() {
@@ -110,40 +105,6 @@ where
    }
 }

-/// Drop guard to WARN in case the request was dropped before completion.
-struct RequestCancelled {
-    warn: Option<tracing::Span>,
-}
-
-impl RequestCancelled {
-    /// Create the drop guard using the [`tracing::Span::current`] as the span.
-    fn warn_when_dropped_without_responding() -> Self {
-        RequestCancelled {
-            warn: Some(tracing::Span::current()),
-        }
-    }
-
-    /// Consume the drop guard without logging anything.
-    fn disarm(mut self) {
-        self.warn = None;
-    }
-}
-
-impl Drop for RequestCancelled {
-    fn drop(&mut self) {
-        if std::thread::panicking() {
-            // we are unwinding due to panicking, assume we are not dropped for cancellation
-        } else if let Some(span) = self.warn.take() {
-            // the span has all of the info already, but the outer `.instrument(span)` has already
-            // been dropped, so we need to manually re-enter it for this message.
-            //
-            // this is what the instrument would do before polling so it is fine.
-            let _g = span.entered();
-            warn!("request was dropped before completing");
-        }
-    }
-}
-
 async fn prometheus_metrics_handler(_req: Request<Body>) -> Result<Response<Body>, ApiError> {
    SERVE_METRICS_COUNT.inc();

--- a/libs/utils/src/http/json.rs
+++ b/libs/utils/src/http/json.rs
@@ -1,7 +1,9 @@
+use std::fmt::Display;
+
 use anyhow::Context;
 use bytes::Buf;
 use hyper::{header, Body, Request, Response, StatusCode};
-use serde::{Deserialize, Serialize};
+use serde::{Deserialize, Serialize, Serializer};

 use super::error::ApiError;

@@ -31,3 +33,12 @@ pub fn json_response<T: Serialize>(
        .map_err(|e| ApiError::InternalServerError(e.into()))?;
    Ok(response)
 }
+
+/// Serialize through Display trait.
+pub fn display_serialize<S, F>(z: &F, s: S) -> Result<S::Ok, S::Error>
+where
+    S: Serializer,
+    F: Display,
+{
+    s.serialize_str(&format!("{}", z))
+}
--- a/libs/utils/src/id.rs
+++ b/libs/utils/src/id.rs
@@ -265,26 +265,6 @@ impl fmt::Display for TenantTimelineId {
    }
 }

-impl FromStr for TenantTimelineId {
-    type Err = anyhow::Error;
-
-    fn from_str(s: &str) -> Result<Self, Self::Err> {
-        let mut parts = s.split('/');
-        let tenant_id = parts
-            .next()
-            .ok_or_else(|| anyhow::anyhow!("TenantTimelineId must contain tenant_id"))?
-            .parse()?;
-        let timeline_id = parts
-            .next()
-            .ok_or_else(|| anyhow::anyhow!("TenantTimelineId must contain timeline_id"))?
-            .parse()?;
-        if parts.next().is_some() {
-            anyhow::bail!("TenantTimelineId must contain only tenant_id and timeline_id");
-        }
-        Ok(TenantTimelineId::new(tenant_id, timeline_id))
-    }
-}
-
 // Unique ID of a storage node (safekeeper or pageserver). Supposed to be issued
 // by the console.
 #[derive(Clone, Copy, Eq, Ord, PartialEq, PartialOrd, Hash, Debug, Serialize, Deserialize)]
--- a/libs/utils/src/lib.rs
+++ b/libs/utils/src/lib.rs
@@ -54,10 +54,6 @@ pub mod measured_stream;
 pub mod serde_percent;
 pub mod serde_regex;

-pub mod pageserver_feedback;
-
-pub mod tracing_span_assert;
-
 /// use with fail::cfg("$name", "return(2000)")
 #[macro_export]
 macro_rules! failpoint_sleep_millis_async {
--- a/libs/utils/src/logging.rs
+++ b/libs/utils/src/logging.rs
@@ -1,7 +1,6 @@
 use std::str::FromStr;

 use anyhow::Context;
-use once_cell::sync::Lazy;
 use strum_macros::{EnumString, EnumVariantNames};

 #[derive(EnumString, EnumVariantNames, Eq, PartialEq, Debug, Clone, Copy)]
@@ -24,81 +23,24 @@ impl LogFormat {
    }
 }

-static TRACING_EVENT_COUNT: Lazy<metrics::IntCounterVec> = Lazy::new(|| {
-    metrics::register_int_counter_vec!(
-        "libmetrics_tracing_event_count",
-        "Number of tracing events, by level",
-        &["level"]
-    )
-    .expect("failed to define metric")
-});
+pub fn init(log_format: LogFormat) -> anyhow::Result<()> {
+    let default_filter_str = "info";

-struct TracingEventCountLayer(&'static metrics::IntCounterVec);
-
-impl<S> tracing_subscriber::layer::Layer<S> for TracingEventCountLayer
-where
-    S: tracing::Subscriber,
-{
-    fn on_event(
-        &self,
-        event: &tracing::Event<'_>,
-        _ctx: tracing_subscriber::layer::Context<'_, S>,
-    ) {
-        let level = event.metadata().level();
-        let level = match *level {
-            tracing::Level::ERROR => "error",
-            tracing::Level::WARN => "warn",
-            tracing::Level::INFO => "info",
-            tracing::Level::DEBUG => "debug",
-            tracing::Level::TRACE => "trace",
-        };
-        self.0.with_label_values(&[level]).inc();
-    }
-}
-
-/// Whether to add the `tracing_error` crate's `ErrorLayer`
-/// to the global tracing subscriber.
-///
-pub enum TracingErrorLayerEnablement {
-    /// Do not add the `ErrorLayer`.
-    Disabled,
-    /// Add the `ErrorLayer` with the filter specified by RUST_LOG, defaulting to `info` if `RUST_LOG` is unset.
-    EnableWithRustLogFilter,
-}
-
-pub fn init(
-    log_format: LogFormat,
-    tracing_error_layer_enablement: TracingErrorLayerEnablement,
-) -> anyhow::Result<()> {
    // We fall back to printing all spans at info-level or above if
    // the RUST_LOG environment variable is not set.
-    let rust_log_env_filter = || {
-        tracing_subscriber::EnvFilter::try_from_default_env()
-            .unwrap_or_else(|_| tracing_subscriber::EnvFilter::new("info"))
-    };
+    let env_filter = tracing_subscriber::EnvFilter::try_from_default_env()
+        .unwrap_or_else(|_| tracing_subscriber::EnvFilter::new(default_filter_str));

-    // NB: the order of the with() calls does not matter.
-    // See https://docs.rs/tracing-subscriber/0.3.16/tracing_subscriber/layer/index.html#per-layer-filtering
-    use tracing_subscriber::prelude::*;
-    let r = tracing_subscriber::registry();
-    let r = r.with({
-        let log_layer = tracing_subscriber::fmt::layer()
-            .with_target(false)
-            .with_ansi(atty::is(atty::Stream::Stdout))
-            .with_writer(std::io::stdout);
-        let log_layer = match log_format {
-            LogFormat::Json => log_layer.json().boxed(),
-            LogFormat::Plain => log_layer.boxed(),
-            LogFormat::Test => log_layer.with_test_writer().boxed(),
-        };
-        log_layer.with_filter(rust_log_env_filter())
-    });
-    let r = r.with(TracingEventCountLayer(&TRACING_EVENT_COUNT).with_filter(rust_log_env_filter()));
-    match tracing_error_layer_enablement {
-        TracingErrorLayerEnablement::EnableWithRustLogFilter => r
-            .with(tracing_error::ErrorLayer::default().with_filter(rust_log_env_filter()))
-            .init(),
-        TracingErrorLayerEnablement::Disabled => r.init(),
+    let base_logger = tracing_subscriber::fmt()
+        .with_env_filter(env_filter)
+        .with_target(false)
+        .with_ansi(atty::is(atty::Stream::Stdout))
+        .with_writer(std::io::stdout);
+
+    match log_format {
+        LogFormat::Json => base_logger.json().init(),
+        LogFormat::Plain => base_logger.init(),
+        LogFormat::Test => base_logger.with_test_writer().init(),
    }

    Ok(())
@@ -215,33 +157,3 @@ impl std::fmt::Debug for PrettyLocation<'_, '_> {
        <Self as std::fmt::Display>::fmt(self, f)
    }
 }
-
-#[cfg(test)]
-mod tests {
-    use metrics::{core::Opts, IntCounterVec};
-
-    use super::TracingEventCountLayer;
-
-    #[test]
-    fn tracing_event_count_metric() {
-        let counter_vec =
-            IntCounterVec::new(Opts::new("testmetric", "testhelp"), &["level"]).unwrap();
-        let counter_vec = Box::leak(Box::new(counter_vec)); // make it 'static
-        let layer = TracingEventCountLayer(counter_vec);
-        use tracing_subscriber::prelude::*;
-
-        tracing::subscriber::with_default(tracing_subscriber::registry().with(layer), || {
-            tracing::trace!("foo");
-            tracing::debug!("foo");
-            tracing::info!("foo");
-            tracing::warn!("foo");
-            tracing::error!("foo");
-        });
-
-        assert_eq!(counter_vec.with_label_values(&["trace"]).get(), 1);
-        assert_eq!(counter_vec.with_label_values(&["debug"]).get(), 1);
-        assert_eq!(counter_vec.with_label_values(&["info"]).get(), 1);
-        assert_eq!(counter_vec.with_label_values(&["warn"]).get(), 1);
-        assert_eq!(counter_vec.with_label_values(&["error"]).get(), 1);
-    }
-}
--- a/libs/utils/src/lsn.rs
+++ b/libs/utils/src/lsn.rs
@@ -62,48 +62,29 @@ impl Lsn {
    }

    /// Compute the offset into a segment
-    #[inline]
    pub fn segment_offset(self, seg_sz: usize) -> usize {
        (self.0 % seg_sz as u64) as usize
    }

    /// Compute LSN of the segment start.
-    #[inline]
    pub fn segment_lsn(self, seg_sz: usize) -> Lsn {
        Lsn(self.0 - (self.0 % seg_sz as u64))
    }

    /// Compute the segment number
-    #[inline]
    pub fn segment_number(self, seg_sz: usize) -> u64 {
        self.0 / seg_sz as u64
    }

    /// Compute the offset into a block
-    #[inline]
    pub fn block_offset(self) -> u64 {
        const BLCKSZ: u64 = XLOG_BLCKSZ as u64;
        self.0 % BLCKSZ
    }

-    /// Compute the block offset of the first byte of this Lsn within this
-    /// segment
-    #[inline]
-    pub fn page_lsn(self) -> Lsn {
-        Lsn(self.0 - self.block_offset())
-    }
-
-    /// Compute the block offset of the first byte of this Lsn within this
-    /// segment
-    #[inline]
-    pub fn page_offset_in_segment(self, seg_sz: usize) -> u64 {
-        (self.0 - self.block_offset()) - self.segment_lsn(seg_sz).0
-    }
-
    /// Compute the bytes remaining in this block
    ///
    /// If the LSN is already at the block boundary, it will return `XLOG_BLCKSZ`.
-    #[inline]
    pub fn remaining_in_block(self) -> u64 {
        const BLCKSZ: u64 = XLOG_BLCKSZ as u64;
        BLCKSZ - (self.0 % BLCKSZ)
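Review note: the helpers that remain in `lsn.rs` after this hunk are plain modular arithmetic on the raw `u64`. The sketch below reproduces them on a stand-alone `Lsn` newtype so the numbers can be checked by hand. It is an illustration rather than the crate's code: the `XLOG_BLCKSZ` value of 8192 is an assumption (PostgreSQL's default WAL block size), and the real constant is imported from elsewhere in the workspace.

```rust
/// Assumed WAL block size (PostgreSQL's default); the crate imports its own constant.
const XLOG_BLCKSZ: u64 = 8192;

/// Stand-alone stand-in for the crate's `Lsn` newtype.
#[derive(Clone, Copy, Debug, PartialEq, Eq, PartialOrd, Ord)]
pub struct Lsn(pub u64);

impl Lsn {
    /// Byte offset of this LSN inside its segment.
    pub fn segment_offset(self, seg_sz: usize) -> usize {
        (self.0 % seg_sz as u64) as usize
    }

    /// LSN of the first byte of the segment containing this LSN.
    pub fn segment_lsn(self, seg_sz: usize) -> Lsn {
        Lsn(self.0 - (self.0 % seg_sz as u64))
    }

    /// Zero-based index of the segment containing this LSN.
    pub fn segment_number(self, seg_sz: usize) -> u64 {
        self.0 / seg_sz as u64
    }

    /// Byte offset of this LSN inside its block.
    pub fn block_offset(self) -> u64 {
        self.0 % XLOG_BLCKSZ
    }

    /// Bytes left in the current block; a full block if already on a boundary.
    pub fn remaining_in_block(self) -> u64 {
        XLOG_BLCKSZ - (self.0 % XLOG_BLCKSZ)
    }
}
```

With a 16 MiB segment, an LSN that sits 100 bytes into the second block of the third segment has `segment_number` 2, `segment_offset` 8292, `block_offset` 100, and 8092 bytes remaining in its block.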
--- a/libs/utils/src/pageserver_feedback.rs
+++ b/libs/utils/src/pageserver_feedback.rs
@@ -1,214 +0,0 @@
-use std::time::{Duration, SystemTime};
-
-use bytes::{Buf, BufMut, Bytes, BytesMut};
-use pq_proto::{read_cstr, PG_EPOCH};
-use serde::{Deserialize, Serialize};
-use serde_with::{serde_as, DisplayFromStr};
-use tracing::{trace, warn};
-
-use crate::lsn::Lsn;
-
-/// Feedback pageserver sends to safekeeper and safekeeper resends to compute.
-/// Serialized in custom flexible key/value format. In replication protocol, it
-/// is marked with NEON_STATUS_UPDATE_TAG_BYTE to differentiate from postgres
-/// Standby status update / Hot standby feedback messages.
-///
-/// serde Serialize is used only for human readable dump to json (e.g. in
-/// safekeepers debug_dump).
-#[serde_as]
-#[derive(Debug, Clone, Copy, PartialEq, Eq, Serialize, Deserialize)]
-pub struct PageserverFeedback {
-    /// Last known size of the timeline. Used to enforce timeline size limit.
-    pub current_timeline_size: u64,
-    /// LSN last received and ingested by the pageserver. Controls backpressure.
-    #[serde_as(as = "DisplayFromStr")]
-    pub last_received_lsn: Lsn,
-    /// LSN up to which data is persisted by the pageserver to its local disc.
-    /// Controls backpressure.
-    #[serde_as(as = "DisplayFromStr")]
-    pub disk_consistent_lsn: Lsn,
-    /// LSN up to which data is persisted by the pageserver on s3; safekeepers
-    /// consider WAL before it can be removed.
-    #[serde_as(as = "DisplayFromStr")]
-    pub remote_consistent_lsn: Lsn,
-    // Serialize with RFC3339 format.
-    #[serde(with = "serde_systemtime")]
-    pub replytime: SystemTime,
-}
-
-// NOTE: Do not forget to increment this number when adding new fields to PageserverFeedback.
-// Do not remove previously available fields because this might be backwards incompatible.
-pub const PAGESERVER_FEEDBACK_FIELDS_NUMBER: u8 = 5;
-
-impl PageserverFeedback {
-    pub fn empty() -> PageserverFeedback {
-        PageserverFeedback {
-            current_timeline_size: 0,
-            last_received_lsn: Lsn::INVALID,
-            remote_consistent_lsn: Lsn::INVALID,
-            disk_consistent_lsn: Lsn::INVALID,
-            replytime: *PG_EPOCH,
-        }
-    }
-
-    // Serialize PageserverFeedback using custom format
-    // to support protocol extensibility.
-    //
-    // Following layout is used:
-    // char - number of key-value pairs that follow.
-    //
-    // key-value pairs:
-    // null-terminated string - key,
-    // uint32 - value length in bytes
-    // value itself
-    //
-    // TODO: change serialized fields names once all computes migrate to rename.
-    pub fn serialize(&self, buf: &mut BytesMut) {
-        buf.put_u8(PAGESERVER_FEEDBACK_FIELDS_NUMBER); // # of keys
-        buf.put_slice(b"current_timeline_size\0");
-        buf.put_i32(8);
-        buf.put_u64(self.current_timeline_size);
-
-        buf.put_slice(b"ps_writelsn\0");
-        buf.put_i32(8);
-        buf.put_u64(self.last_received_lsn.0);
-        buf.put_slice(b"ps_flushlsn\0");
-        buf.put_i32(8);
-        buf.put_u64(self.disk_consistent_lsn.0);
-        buf.put_slice(b"ps_applylsn\0");
-        buf.put_i32(8);
-        buf.put_u64(self.remote_consistent_lsn.0);
-
-        let timestamp = self
-            .replytime
-            .duration_since(*PG_EPOCH)
-            .expect("failed to serialize pg_replytime earlier than PG_EPOCH")
-            .as_micros() as i64;
-
-        buf.put_slice(b"ps_replytime\0");
-        buf.put_i32(8);
-        buf.put_i64(timestamp);
-    }
-
-    // Deserialize PageserverFeedback message
-    // TODO: change serialized fields names once all computes migrate to rename.
-    pub fn parse(mut buf: Bytes) -> PageserverFeedback {
-        let mut rf = PageserverFeedback::empty();
-        let nfields = buf.get_u8();
-        for _ in 0..nfields {
-            let key = read_cstr(&mut buf).unwrap();
-            match key.as_ref() {
-                b"current_timeline_size" => {
-                    let len = buf.get_i32();
-                    assert_eq!(len, 8);
-                    rf.current_timeline_size = buf.get_u64();
-                }
-                b"ps_writelsn" => {
-                    let len = buf.get_i32();
-                    assert_eq!(len, 8);
-                    rf.last_received_lsn = Lsn(buf.get_u64());
-                }
-                b"ps_flushlsn" => {
-                    let len = buf.get_i32();
-                    assert_eq!(len, 8);
-                    rf.disk_consistent_lsn = Lsn(buf.get_u64());
-                }
-                b"ps_applylsn" => {
-                    let len = buf.get_i32();
-                    assert_eq!(len, 8);
-                    rf.remote_consistent_lsn = Lsn(buf.get_u64());
-                }
-                b"ps_replytime" => {
-                    let len = buf.get_i32();
-                    assert_eq!(len, 8);
-                    let raw_time = buf.get_i64();
-                    if raw_time > 0 {
-                        rf.replytime = *PG_EPOCH + Duration::from_micros(raw_time as u64);
-                    } else {
-                        rf.replytime = *PG_EPOCH - Duration::from_micros(-raw_time as u64);
-                    }
-                }
-                _ => {
-                    let len = buf.get_i32();
-                    warn!(
-                        "PageserverFeedback parse. unknown key {} of len {len}. Skip it.",
-                        String::from_utf8_lossy(key.as_ref())
-                    );
-                    buf.advance(len as usize);
-                }
-            }
-        }
-        trace!("PageserverFeedback parsed is {:?}", rf);
-        rf
-    }
-}
-
-mod serde_systemtime {
-    use std::time::SystemTime;
-
-    use chrono::{DateTime, Utc};
-    use serde::{Deserialize, Deserializer, Serializer};
-
-    pub fn serialize<S>(ts: &SystemTime, serializer: S) -> Result<S::Ok, S::Error>
-    where
-        S: Serializer,
-    {
-        let chrono_dt: DateTime<Utc> = (*ts).into();
-        serializer.serialize_str(&chrono_dt.to_rfc3339())
-    }
-
-    pub fn deserialize<'de, D>(deserializer: D) -> Result<SystemTime, D::Error>
-    where
-        D: Deserializer<'de>,
-    {
-        let time: String = Deserialize::deserialize(deserializer)?;
-        Ok(DateTime::parse_from_rfc3339(&time)
-            .map_err(serde::de::Error::custom)?
-            .into())
-    }
-}
-
-#[cfg(test)]
-mod tests {
-    use super::*;
-
-    #[test]
-    fn test_replication_feedback_serialization() {
-        let mut rf = PageserverFeedback::empty();
-        // Fill rf with some values
-        rf.current_timeline_size = 12345678;
-        // Set rounded time to be able to compare it with deserialized value,
-        // because it is rounded up to microseconds during serialization.
-        rf.replytime = *PG_EPOCH + Duration::from_secs(100_000_000);
-        let mut data = BytesMut::new();
-        rf.serialize(&mut data);
-
-        let rf_parsed = PageserverFeedback::parse(data.freeze());
-        assert_eq!(rf, rf_parsed);
-    }
-
-    #[test]
-    fn test_replication_feedback_unknown_key() {
-        let mut rf = PageserverFeedback::empty();
-        // Fill rf with some values
-        rf.current_timeline_size = 12345678;
-        // Set rounded time to be able to compare it with deserialized value,
-        // because it is rounded up to microseconds during serialization.
-        rf.replytime = *PG_EPOCH + Duration::from_secs(100_000_000);
-        let mut data = BytesMut::new();
-        rf.serialize(&mut data);
-
-        // Add an extra field to the buffer and adjust number of keys
-        if let Some(first) = data.first_mut() {
-            *first = PAGESERVER_FEEDBACK_FIELDS_NUMBER + 1;
-        }
-
-        data.put_slice(b"new_field_one\0");
-        data.put_i32(8);
-        data.put_u64(42);
-
-        // Parse serialized data and check that new field is not parsed
-        let rf_parsed = PageserverFeedback::parse(data.freeze());
-        assert_eq!(rf, rf_parsed);
-    }
-}
--- a/libs/utils/src/seqwait.rs
+++ b/libs/utils/src/seqwait.rs
@@ -1,12 +1,13 @@
 #![warn(missing_docs)]

+use either::Either;
 use std::cmp::{Eq, Ordering, PartialOrd};
 use std::collections::BinaryHeap;
 use std::fmt::Debug;
 use std::mem;
-use std::sync::Mutex;
+use std::sync::{Arc, Mutex};
 use std::time::Duration;
-use tokio::sync::watch::{channel, Receiver, Sender};
+use tokio::sync::oneshot::{channel, Receiver, Sender};
 use tokio::time::timeout;

 /// An error happened while waiting for a number
@@ -36,45 +37,48 @@ pub trait MonotonicCounter<V> {
 }

 /// Internal components of a `SeqWait`
-struct SeqWaitInt<S, V>
+struct SeqWaitInt<S, V, T>
 where
    S: MonotonicCounter<V>,
    V: Ord,
+    T: Clone,
 {
-    waiters: BinaryHeap<Waiter<V>>,
+    waiters: BinaryHeap<Waiter<V, T>>,
    current: S,
    shutdown: bool,
+    data: T,
 }

-struct Waiter<T>
+struct Waiter<V, T>
 where
-    T: Ord,
+    V: Ord,
+    T: Clone,
 {
-    wake_num: T,              // wake me when this number arrives ...
-    wake_channel: Sender<()>, // ... by sending a message to this channel
+    wake_num: V,             // wake me when this number arrives ...
+    wake_channel: Sender<T>, // ... by sending a message to this channel
 }

 // BinaryHeap is a max-heap, and we want a min-heap. Reverse the ordering here
 // to get that.
-impl<T: Ord> PartialOrd for Waiter<T> {
+impl<V: Ord, T: Clone> PartialOrd for Waiter<V, T> {
    fn partial_cmp(&self, other: &Self) -> Option<Ordering> {
        other.wake_num.partial_cmp(&self.wake_num)
    }
 }

-impl<T: Ord> Ord for Waiter<T> {
+impl<V: Ord, T: Clone> Ord for Waiter<V, T> {
    fn cmp(&self, other: &Self) -> Ordering {
        other.wake_num.cmp(&self.wake_num)
    }
 }

-impl<T: Ord> PartialEq for Waiter<T> {
+impl<V: Ord, T: Clone> PartialEq for Waiter<V, T> {
    fn eq(&self, other: &Self) -> bool {
        other.wake_num == self.wake_num
    }
 }

-impl<T: Ord> Eq for Waiter<T> {}
+impl<V: Ord, T: Clone> Eq for Waiter<V, T> {}

 /// A tool for waiting on a sequence number
 ///
@@ -92,25 +96,28 @@ impl<T: Ord> Eq for Waiter<T> {}
 ///
 /// <S> means Storage, <V> is type of counter that this storage exposes.
 ///
-pub struct SeqWait<S, V>
+pub struct SeqWait<S, V, T>
 where
    S: MonotonicCounter<V>,
    V: Ord,
+    T: Clone,
 {
-    internal: Mutex<SeqWaitInt<S, V>>,
+    internal: Mutex<SeqWaitInt<S, V, T>>,
 }

-impl<S, V> SeqWait<S, V>
+impl<S, V, T> SeqWait<S, V, T>
 where
    S: MonotonicCounter<V> + Copy,
    V: Ord + Copy,
+    T: Clone,
 {
    /// Create a new `SeqWait`, initialized to a particular number
-    pub fn new(starting_num: S) -> Self {
+    pub fn new(starting_num: S, data: T) -> Self {
        let internal = SeqWaitInt {
            waiters: BinaryHeap::new(),
            current: starting_num,
            shutdown: false,
+            data,
        };
        SeqWait {
            internal: Mutex::new(internal),
@@ -144,10 +151,13 @@ where
    ///
    /// This call won't complete until someone has called `advance`
    /// with a number greater than or equal to the one we're waiting for.
-    pub async fn wait_for(&self, num: V) -> Result<(), SeqWaitError> {
-        match self.queue_for_wait(num) {
-            Ok(None) => Ok(()),
-            Ok(Some(mut rx)) => rx.changed().await.map_err(|_| SeqWaitError::Shutdown),
+    pub async fn wait_for(&self, num: V) -> Result<T, SeqWaitError> {
+        match self.queue_for_wait(num, false) {
+            Ok(Either::Left(data)) => Ok(data),
+            Ok(Either::Right(rx)) => match rx.await {
+                Err(_) => Err(SeqWaitError::Shutdown),
+                Ok(data) => Ok(data),
+            },
            Err(e) => Err(e),
        }
    }
@@ -159,15 +169,18 @@ where
    ///
    /// If that hasn't happened after the specified timeout duration,
    /// [`SeqWaitError::Timeout`] will be returned.
+    ///
+    /// Pass a zero `timeout_duration` to guarantee that the future
+    /// returned by this function never awaits.
    pub async fn wait_for_timeout(
        &self,
        num: V,
        timeout_duration: Duration,
-    ) -> Result<(), SeqWaitError> {
-        match self.queue_for_wait(num) {
-            Ok(None) => Ok(()),
-            Ok(Some(mut rx)) => match timeout(timeout_duration, rx.changed()).await {
-                Ok(Ok(())) => Ok(()),
+    ) -> Result<T, SeqWaitError> {
+        match self.queue_for_wait(num, timeout_duration.is_zero()) {
+            Ok(Either::Left(data)) => Ok(data),
+            Ok(Either::Right(rx)) => match timeout(timeout_duration, rx).await {
+                Ok(Ok(data)) => Ok(data),
                Ok(Err(_)) => Err(SeqWaitError::Shutdown),
                Err(_) => Err(SeqWaitError::Timeout),
            },
@@ -177,41 +190,50 @@ where

    /// Register and return a channel that will be notified when a number arrives,
    /// or None, if it has already arrived.
-    fn queue_for_wait(&self, num: V) -> Result<Option<Receiver<()>>, SeqWaitError> {
+    fn queue_for_wait(&self, num: V, nowait: bool) -> Result<Either<T, Receiver<T>>, SeqWaitError> {
        let mut internal = self.internal.lock().unwrap();
        if internal.current.cnt_value() >= num {
-            return Ok(None);
+            return Ok(Either::Left(internal.data.clone()));
        }
        if internal.shutdown {
            return Err(SeqWaitError::Shutdown);
        }
+        if nowait {
+            return Err(SeqWaitError::Timeout);
+        }

        // Create a new channel.
-        let (tx, rx) = channel(());
+        let (tx, rx) = channel();
        internal.waiters.push(Waiter {
            wake_num: num,
            wake_channel: tx,
        });
        // Drop the lock as we exit this scope.
-        Ok(Some(rx))
+        Ok(Either::Right(rx))
    }

    /// Announce a new number has arrived
    ///
    /// All waiters at this value or below will be woken.
    ///
+    /// If `new_data` is Some(), it will update the internal data,
+    /// even if `num` is smaller than the internal counter.
+    /// It will not cause a wake-up though, in this case.
+    ///
    /// Returns the old number.
-    pub fn advance(&self, num: V) -> V {
+    pub fn advance(&self, num: V, new_data: Option<T>) -> V {
        let old_value;
-        let wake_these = {
+        let (wake_these, with_data) = {
            let mut internal = self.internal.lock().unwrap();
+            if let Some(new_data) = new_data {
+                internal.data = new_data;
+            }

            old_value = internal.current.cnt_value();
            if old_value >= num {
                return old_value;
            }
            internal.current.cnt_advance(num);
-
            // Pop all waiters <= num from the heap. Collect them in a vector, and
            // wake them up after releasing the lock.
            let mut wake_these = Vec::new();
@@ -221,13 +243,13 @@ where
                }
                wake_these.push(internal.waiters.pop().unwrap().wake_channel);
            }
-            wake_these
+            (wake_these, internal.data.clone())
        };

        for tx in wake_these {
            // This can fail if there are no receivers.
            // We don't care; discard the error.
-            let _ = tx.send(());
+            let _ = tx.send(with_data.clone());
        }
        old_value
    }
@@ -236,6 +258,106 @@ where
    pub fn load(&self) -> S {
        self.internal.lock().unwrap().current
    }
+
+    /// Split the seqwait into a part that can only do wait,
+    /// and another part that can do advance + wait.
+    ///
+    /// The wait-only part can be cloned, the advance part cannot be cloned.
+    /// This provides a single-producer multi-consumer scheme.
+    pub fn split_spmc(self) -> (Wait<S, V, T>, Advance<S, V, T>) {
+        let inner = Arc::new(self);
+        let w = Wait {
+            inner: inner.clone(),
+        };
+        let a = Advance { inner };
+        (w, a)
+    }
+}
+
+/// See [`SeqWait::split_spmc`].
+pub struct Wait<S, V, T>
+where
+    S: MonotonicCounter<V> + Copy,
+    V: Ord + Copy,
+    T: Clone,
+{
+    inner: Arc<SeqWait<S, V, T>>,
+}
+
+/// See [`SeqWait::split_spmc`].
+pub struct Advance<S, V, T>
+where
+    S: MonotonicCounter<V> + Copy,
+    V: Ord + Copy,
+    T: Clone,
+{
+    inner: Arc<SeqWait<S, V, T>>,
+}
+
+impl<S, V, T> Wait<S, V, T>
+where
+    S: MonotonicCounter<V> + Copy,
+    V: Ord + Copy,
+    T: Clone,
+{
+    /// See [`SeqWait::wait_for`].
+    pub async fn wait_for(&self, num: V) -> Result<T, SeqWaitError> {
+        self.inner.wait_for(num).await
+    }
+
+    /// See [`SeqWait::wait_for_timeout`].
+    pub async fn wait_for_timeout(
+        &self,
+        num: V,
+        timeout_duration: Duration,
+    ) -> Result<T, SeqWaitError> {
+        self.inner.wait_for_timeout(num, timeout_duration).await
+    }
+}
+
+impl<S, V, T> Advance<S, V, T>
+where
+    S: MonotonicCounter<V> + Copy,
+    V: Ord + Copy,
+    T: Clone,
+{
+    /// See [`SeqWait::advance`].
+    pub fn advance(&self, num: V, new_data: Option<T>) -> V {
+        self.inner.advance(num, new_data)
+    }
+
+    /// See [`SeqWait::wait_for`].
+    pub async fn wait_for(&self, num: V) -> Result<T, SeqWaitError> {
+        self.inner.wait_for(num).await
+    }
+
+    /// See [`SeqWait::wait_for_timeout`].
+    pub async fn wait_for_timeout(
+        &self,
+        num: V,
+        timeout_duration: Duration,
+    ) -> Result<T, SeqWaitError> {
+        self.inner.wait_for_timeout(num, timeout_duration).await
+    }
+
+    /// Get a `Clone::clone` of the current data inside the seqwait.
+    pub fn get_current_data(&self) -> (V, T) {
+        let inner = self.inner.internal.lock().unwrap();
+        (inner.current.cnt_value(), inner.data.clone())
+    }
+}
+
+impl<S, V, T> Clone for Wait<S, V, T>
+where
+    S: MonotonicCounter<V> + Copy,
+    V: Ord + Copy,
+    T: Clone,
+{
+    fn clone(&self) -> Self {
+        Self {
+            inner: self.inner.clone(),
+        }
+    }
 }

 #[cfg(test)]
@@ -256,12 +378,12 @@ mod tests {

    #[tokio::test]
    async fn seqwait() {
-        let seq = Arc::new(SeqWait::new(0));
+        let seq = Arc::new(SeqWait::new(0, ()));
        let seq2 = Arc::clone(&seq);
        let seq3 = Arc::clone(&seq);
        let jh1 = tokio::task::spawn(async move {
            seq2.wait_for(42).await.expect("wait_for 42");
-            let old = seq2.advance(100);
+            let old = seq2.advance(100, None);
            assert_eq!(old, 99);
            seq2.wait_for_timeout(999, Duration::from_millis(100))
                .await
@@ -272,12 +394,12 @@ mod tests {
            seq3.wait_for(0).await.expect("wait_for 0");
        });
        tokio::time::sleep(Duration::from_millis(200)).await;
-        let old = seq.advance(99);
+        let old = seq.advance(99, None);
        assert_eq!(old, 0);
        seq.wait_for(100).await.expect("wait_for 100");

        // Calling advance with a smaller value is a no-op
-        assert_eq!(seq.advance(98), 100);
+        assert_eq!(seq.advance(98, None), 100);
        assert_eq!(seq.load(), 100);

        jh1.await.unwrap();
@@ -288,7 +410,7 @@ mod tests {

    #[tokio::test]
    async fn seqwait_timeout() {
-        let seq = Arc::new(SeqWait::new(0));
+        let seq = Arc::new(SeqWait::new(0, ()));
        let seq2 = Arc::clone(&seq);
        let jh = tokio::task::spawn(async move {
            let timeout = Duration::from_millis(1);
@@ -298,10 +420,104 @@ mod tests {
        tokio::time::sleep(Duration::from_millis(200)).await;
        // This will attempt to wake, but nothing will happen
        // because the waiter already dropped its Receiver.
-        let old = seq.advance(99);
+        let old = seq.advance(99, None);
        assert_eq!(old, 0);
        jh.await.unwrap();

        seq.shutdown();
    }
+
+    #[tokio::test]
+    async fn data_basic() {
+        let seq = Arc::new(SeqWait::new(0, "a"));
+        let seq2 = Arc::clone(&seq);
+        let jh = tokio::task::spawn(async move {
+            let data = seq.wait_for(2).await.unwrap();
+            assert_eq!(data, "b");
+        });
+        seq2.advance(1, Some("x"));
+        seq2.advance(2, Some("b"));
+        jh.await.unwrap();
+    }
+
+    #[test]
+    fn data_always_most_recent() {
+        let rt = tokio::runtime::Builder::new_current_thread()
+            .build()
+            .unwrap();
+
+        let seq = Arc::new(SeqWait::new(0, "a"));
+        let seq2 = Arc::clone(&seq);
+
+        let jh = rt.spawn(async move {
+            let data = seq.wait_for(2).await.unwrap();
+            assert_eq!(data, "d");
+        });
+
+        // jh is not running until we poll it, thanks to current thread runtime
+
+        rt.block_on(async move {
+            seq2.advance(2, Some("b"));
+            seq2.advance(3, Some("c"));
+            seq2.advance(4, Some("d"));
+        });
+
+        rt.block_on(jh).unwrap();
+    }
+
+    #[tokio::test]
+    async fn split_spmc_api_surface() {
+        let seq = SeqWait::new(0, 1);
+        let (w, a) = seq.split_spmc();
+
+        let _ = w.wait_for(1);
+        let _ = w.wait_for_timeout(0, Duration::from_secs(10));
+        let _ = w.clone();
+
+        let _ = a.advance(1, None);
+        let _ = a.wait_for(1);
+        let _ = a.wait_for_timeout(0, Duration::from_secs(10));
+
+        // TODO: add must-not-compile tests asserting that Advance is not cloneable.
+    }
+
+    #[tokio::test]
+    async fn new_data_same_lsn() {
+        let seq = Arc::new(SeqWait::new(0, "a"));
+
+        seq.advance(1, Some("b"));
+        let data = seq.wait_for(1).await.unwrap();
+        assert_eq!(data, "b", "the regular case where lsn and data advance");
+
+        seq.advance(1, Some("c"));
+        let data = seq.wait_for(1).await.unwrap();
+        assert_eq!(
+            data, "c",
+            "no lsn advance still gives new data for old lsn wait_for's"
+        );
+
+        let (start_wait_for_sender, start_wait_for_receiver) = tokio::sync::oneshot::channel();
+        // ensure we don't wake waiters for data-only change
+        let jh = tokio::spawn({
+            let seq = seq.clone();
+            async move {
+                start_wait_for_receiver.await.unwrap();
+                match tokio::time::timeout(Duration::from_secs(2), seq.wait_for(2)).await {
+                    Ok(_) => {
+                        assert!(
+                            false,
+                            "advance should not wake waiters if data changes but LSN doesn't"
+                        );
+                    }
+                    Err(_) => {
+                        // Good, we weren't woken up.
+                    }
+                }
+            }
+        });
+
+        seq.advance(1, Some("d"));
+        start_wait_for_sender.send(()).unwrap();
+        jh.await.unwrap();
+    }
 }
--- a/libs/utils/src/tracing_span_assert.rs
+++ b/libs/utils/src/tracing_span_assert.rs
@@ -1,287 +0,0 @@
-//! Assert that the current [`tracing::Span`] has a given set of fields.
-//!
-//! # Usage
-//!
-//! ```
-//! use tracing_subscriber::prelude::*;
-//! let registry = tracing_subscriber::registry()
-//!    .with(tracing_error::ErrorLayer::default());
-//!
-//! // Register the registry as the global subscriber.
-//! // In this example, we'll only use it as a thread-local subscriber.
-//! let _guard = tracing::subscriber::set_default(registry);
-//!
-//! // Then, in the main code:
-//!
-//! let span = tracing::info_span!("TestSpan", test_id = 1);
-//! let _guard = span.enter();
-//!
-//! // ... down the call stack
-//!
-//! use utils::tracing_span_assert::{check_fields_present, MultiNameExtractor};
-//! let extractor = MultiNameExtractor::new("TestExtractor", ["test", "test_id"]);
-//! match check_fields_present([&extractor]) {
-//!    Ok(()) => {},
-//!    Err(missing) => {
-//!        panic!("Missing fields: {:?}", missing.into_iter().map(|f| f.name() ).collect::<Vec<_>>());
-//!    }
-//! }
-//! ```
-//!
-//! Recommended reading: https://docs.rs/tracing-subscriber/0.3.16/tracing_subscriber/layer/index.html#per-layer-filtering
-//!
-
-use std::{
-    collections::HashSet,
-    fmt::{self},
-    hash::{Hash, Hasher},
-};
-
-pub enum ExtractionResult {
-    Present,
-    Absent,
-}
-
-pub trait Extractor: Send + Sync + std::fmt::Debug {
-    fn name(&self) -> &str;
-    fn extract(&self, fields: &tracing::field::FieldSet) -> ExtractionResult;
-}
-
-#[derive(Debug)]
-pub struct MultiNameExtractor<const L: usize> {
-    name: &'static str,
-    field_names: [&'static str; L],
-}
-
-impl<const L: usize> MultiNameExtractor<L> {
-    pub fn new(name: &'static str, field_names: [&'static str; L]) -> MultiNameExtractor<L> {
-        MultiNameExtractor { name, field_names }
-    }
-}
-impl<const L: usize> Extractor for MultiNameExtractor<L> {
-    fn name(&self) -> &str {
-        self.name
-    }
-    fn extract(&self, fields: &tracing::field::FieldSet) -> ExtractionResult {
-        if fields.iter().any(|f| self.field_names.contains(&f.name())) {
-            ExtractionResult::Present
-        } else {
-            ExtractionResult::Absent
-        }
-    }
-}
-
-struct MemoryIdentity<'a>(&'a dyn Extractor);
-
-impl<'a> MemoryIdentity<'a> {
-    fn as_ptr(&self) -> *const () {
-        self.0 as *const _ as *const ()
-    }
-}
-impl<'a> PartialEq for MemoryIdentity<'a> {
-    fn eq(&self, other: &Self) -> bool {
-        self.as_ptr() == other.as_ptr()
-    }
-}
-impl<'a> Eq for MemoryIdentity<'a> {}
-impl<'a> Hash for MemoryIdentity<'a> {
-    fn hash<H: Hasher>(&self, state: &mut H) {
-        self.as_ptr().hash(state);
-    }
-}
-impl<'a> fmt::Debug for MemoryIdentity<'a> {
-    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> std::fmt::Result {
-        write!(f, "{:p}: {}", self.as_ptr(), self.0.name())
-    }
-}
-
-/// The extractor names passed as keys to [`new`].
-pub fn check_fields_present<const L: usize>(
-    must_be_present: [&dyn Extractor; L],
-) -> Result<(), Vec<&dyn Extractor>> {
-    let mut missing: HashSet<MemoryIdentity> =
-        HashSet::from_iter(must_be_present.into_iter().map(|r| MemoryIdentity(r)));
-    let trace = tracing_error::SpanTrace::capture();
-    trace.with_spans(|md, _formatted_fields| {
-        missing.retain(|extractor| match extractor.0.extract(md.fields()) {
-            ExtractionResult::Present => false,
-            ExtractionResult::Absent => true,
-        });
-        !missing.is_empty() // continue walking up until we've found all missing
-    });
-    if missing.is_empty() {
-        Ok(())
-    } else {
-        Err(missing.into_iter().map(|mi| mi.0).collect())
-    }
-}
-
-#[cfg(test)]
-mod tests {
-
-    use tracing_subscriber::prelude::*;
-
-    use super::*;
-
-    struct Setup {
-        _current_thread_subscriber_guard: tracing::subscriber::DefaultGuard,
-        tenant_extractor: MultiNameExtractor<2>,
-        timeline_extractor: MultiNameExtractor<2>,
-    }
-
-    fn setup_current_thread() -> Setup {
-        let tenant_extractor = MultiNameExtractor::new("TenantId", ["tenant_id", "tenant"]);
-        let timeline_extractor = MultiNameExtractor::new("TimelineId", ["timeline_id", "timeline"]);
-
-        let registry = tracing_subscriber::registry()
-            .with(tracing_subscriber::fmt::layer())
-            .with(tracing_error::ErrorLayer::default());
-
-        let guard = tracing::subscriber::set_default(registry);
-
-        Setup {
-            _current_thread_subscriber_guard: guard,
-            tenant_extractor,
-            timeline_extractor,
-        }
-    }
-
-    fn assert_missing(missing: Vec<&dyn Extractor>, expected: Vec<&dyn Extractor>) {
-        let missing: HashSet<MemoryIdentity> =
-            HashSet::from_iter(missing.into_iter().map(MemoryIdentity));
-        let expected: HashSet<MemoryIdentity> =
-            HashSet::from_iter(expected.into_iter().map(MemoryIdentity));
-        assert_eq!(missing, expected);
-    }
-
-    #[test]
-    fn positive_one_level() {
-        let setup = setup_current_thread();
-        let span = tracing::info_span!("root", tenant_id = "tenant-1", timeline_id = "timeline-1");
-        let _guard = span.enter();
-        check_fields_present([&setup.tenant_extractor, &setup.timeline_extractor]).unwrap();
-    }
-
-    #[test]
-    fn negative_one_level() {
-        let setup = setup_current_thread();
-        let span = tracing::info_span!("root", timeline_id = "timeline-1");
-        let _guard = span.enter();
-        let missing =
-            check_fields_present([&setup.tenant_extractor, &setup.timeline_extractor]).unwrap_err();
-        assert_missing(missing, vec![&setup.tenant_extractor]);
-    }
-
-    #[test]
-    fn positive_multiple_levels() {
-        let setup = setup_current_thread();
-
-        let span = tracing::info_span!("root");
-        let _guard = span.enter();
-
-        let span = tracing::info_span!("child", tenant_id = "tenant-1");
-        let _guard = span.enter();
-
-        let span = tracing::info_span!("grandchild", timeline_id = "timeline-1");
-        let _guard = span.enter();
-
-        check_fields_present([&setup.tenant_extractor, &setup.timeline_extractor]).unwrap();
-    }
-
-    #[test]
-    fn negative_multiple_levels() {
-        let setup = setup_current_thread();
-
-        let span = tracing::info_span!("root");
-        let _guard = span.enter();
-
-        let span = tracing::info_span!("child", timeline_id = "timeline-1");
-        let _guard = span.enter();
-
-        let missing = check_fields_present([&setup.tenant_extractor]).unwrap_err();
-        assert_missing(missing, vec![&setup.tenant_extractor]);
-    }
-
-    #[test]
-    fn positive_subset_one_level() {
-        let setup = setup_current_thread();
-        let span = tracing::info_span!("root", tenant_id = "tenant-1", timeline_id = "timeline-1");
-        let _guard = span.enter();
-        check_fields_present([&setup.tenant_extractor]).unwrap();
-    }
-
-    #[test]
-    fn positive_subset_multiple_levels() {
-        let setup = setup_current_thread();
-
-        let span = tracing::info_span!("root");
-        let _guard = span.enter();
-
-        let span = tracing::info_span!("child", tenant_id = "tenant-1");
-        let _guard = span.enter();
-
-        let span = tracing::info_span!("grandchild", timeline_id = "timeline-1");
-        let _guard = span.enter();
-
-        check_fields_present([&setup.tenant_extractor]).unwrap();
-    }
-
-    #[test]
-    fn negative_subset_one_level() {
-        let setup = setup_current_thread();
-        let span = tracing::info_span!("root", timeline_id = "timeline-1");
-        let _guard = span.enter();
-        let missing = check_fields_present([&setup.tenant_extractor]).unwrap_err();
-        assert_missing(missing, vec![&setup.tenant_extractor]);
-    }
-
-    #[test]
-    fn negative_subset_multiple_levels() {
-        let setup = setup_current_thread();
-
-        let span = tracing::info_span!("root");
-        let _guard = span.enter();
-
-        let span = tracing::info_span!("child", timeline_id = "timeline-1");
-        let _guard = span.enter();
-
-        let missing = check_fields_present([&setup.tenant_extractor]).unwrap_err();
-        assert_missing(missing, vec![&setup.tenant_extractor]);
-    }
-
-    #[test]
-    fn tracing_error_subscriber_not_set_up() {
-        // no setup
-
-        let span = tracing::info_span!("foo", e = "some value");
-        let _guard = span.enter();
-
-        let extractor = MultiNameExtractor::new("E", ["e"]);
-        let missing = check_fields_present([&extractor]).unwrap_err();
-        assert_missing(missing, vec![&extractor]);
-    }
-
-    #[test]
-    #[should_panic]
-    fn panics_if_tracing_error_subscriber_has_wrong_filter() {
-        let r = tracing_subscriber::registry().with({
-            tracing_error::ErrorLayer::default().with_filter(
-                tracing_subscriber::filter::dynamic_filter_fn(|md, _| {
-                    if md.is_span() && *md.level() == tracing::Level::INFO {
-                        return false;
-                    }
-                    true
-                }),
-            )
-        });
-
-        let _guard = tracing::subscriber::set_default(r);
-
-        let span = tracing::info_span!("foo", e = "some value");
-        let _guard = span.enter();
-
-        let extractor = MultiNameExtractor::new("E", ["e"]);
-        let missing = check_fields_present([&extractor]).unwrap_err();
-        assert_missing(missing, vec![&extractor]);
-    }
-}
--- a/pageserver/Cargo.toml
+++ b/pageserver/Cargo.toml
@@ -52,7 +52,6 @@ sync_wrapper.workspace = true
 tokio-tar.workspace = true
 thiserror.workspace = true
 tokio = { workspace = true, features = ["process", "sync", "fs", "rt", "io-util", "time"] }
-tokio-io-timeout.workspace = true
 tokio-postgres.workspace = true
 tokio-util.workspace = true
 toml_edit = { workspace = true, features = [ "serde" ] }
--- a/pageserver/benches/bench_layer_map.rs
+++ b/pageserver/benches/bench_layer_map.rs
@@ -13,7 +13,7 @@ use std::time::Instant;

 use utils::lsn::Lsn;

-use criterion::{black_box, criterion_group, criterion_main, Criterion};
+use criterion::{criterion_group, criterion_main, Criterion};

 fn build_layer_map(filename_dump: PathBuf) -> LayerMap<LayerDescriptor> {
    let mut layer_map = LayerMap::<LayerDescriptor>::default();
@@ -33,7 +33,7 @@ fn build_layer_map(filename_dump: PathBuf) -> LayerMap<LayerDescriptor> {
        min_lsn = min(min_lsn, lsn_range.start);
        max_lsn = max(max_lsn, Lsn(lsn_range.end.0 - 1));

-        updates.insert_historic(Arc::new(layer));
+        updates.insert_historic(Arc::new(layer)).unwrap();
    }

    println!("min: {min_lsn}, max: {max_lsn}");
@@ -114,7 +114,7 @@ fn bench_from_captest_env(c: &mut Criterion) {
    c.bench_function("captest_uniform_queries", |b| {
        b.iter(|| {
            for q in queries.clone().into_iter() {
-                black_box(layer_map.search(q.0, q.1));
+                layer_map.search(q.0, q.1);
            }
        });
    });
@@ -122,11 +122,11 @@ fn bench_from_captest_env(c: &mut Criterion) {
    // test with a key that corresponds to the RelDir entry. See pgdatadir_mapping.rs.
    c.bench_function("captest_rel_dir_query", |b| {
        b.iter(|| {
-            let result = black_box(layer_map.search(
+            let result = layer_map.search(
                Key::from_hex("000000067F00008000000000000000000001").unwrap(),
                // This LSN is higher than any of the LSNs in the tree
                Lsn::from_str("D0/80208AE1").unwrap(),
-            ));
+            );
            result.unwrap();
        });
    });
@@ -183,7 +183,7 @@ fn bench_from_real_project(c: &mut Criterion) {
    group.bench_function("uniform_queries", |b| {
        b.iter(|| {
            for q in queries.clone().into_iter() {
-                black_box(layer_map.search(q.0, q.1));
+                layer_map.search(q.0, q.1);
            }
        });
    });
@@ -215,7 +215,7 @@ fn bench_sequential(c: &mut Criterion) {
            is_incremental: false,
            short_id: format!("Layer {}", i),
        };
-        updates.insert_historic(Arc::new(layer));
+        updates.insert_historic(Arc::new(layer)).unwrap();
    }
    updates.flush();
    println!("Finished layer map init in {:?}", now.elapsed());
@@ -232,7 +232,7 @@ fn bench_sequential(c: &mut Criterion) {
    group.bench_function("uniform_queries", |b| {
        b.iter(|| {
            for q in queries.clone().into_iter() {
-                black_box(layer_map.search(q.0, q.1));
+                layer_map.search(q.0, q.1);
            }
        });
    });
--- a/pageserver/src/basebackup.rs
+++ b/pageserver/src/basebackup.rs
@@ -463,13 +463,9 @@ where
        let wal_file_path = format!("pg_wal/{}", wal_file_name);
        let header = new_tar_header(&wal_file_path, WAL_SEGMENT_SIZE as u64)?;

-        let wal_seg = postgres_ffi::generate_wal_segment(
-            segno,
-            system_identifier,
-            self.timeline.pg_version,
-            self.lsn,
-        )
-        .map_err(|e| anyhow!(e).context("Failed generating wal segment"))?;
+        let wal_seg =
+            postgres_ffi::generate_wal_segment(segno, system_identifier, self.timeline.pg_version)
+                .map_err(|e| anyhow!(e).context("Failed generating wal segment"))?;
        ensure!(wal_seg.len() == WAL_SEGMENT_SIZE);
        self.ar.append(&header, &wal_seg[..]).await?;
        Ok(())
--- a/pageserver/src/bin/pageserver.rs
+++ b/pageserver/src/bin/pageserver.rs
@@ -25,7 +25,6 @@ use pageserver::{
    virtual_file,
 };
 use postgres_backend::AuthType;
-use utils::logging::TracingErrorLayerEnablement;
 use utils::signals::ShutdownSignals;
 use utils::{
    auth::JwtAuth, logging, project_git_version, sentry_init::init_sentry, signals::Signal,
@@ -87,19 +86,8 @@ fn main() -> anyhow::Result<()> {
        }
    };

-    // Initialize logging.
-    //
-    // It must be initialized before the custom panic hook is installed below.
-    //
-    // Regarding tracing_error enablement: at this time, we only use the
-    // tracing_error crate to debug_assert that log spans contain tenant and timeline ids.
-    // See `debug_assert_current_span_has_tenant_and_timeline_id` in the timeline module
-    let tracing_error_layer_enablement = if cfg!(debug_assertions) {
-        TracingErrorLayerEnablement::EnableWithRustLogFilter
-    } else {
-        TracingErrorLayerEnablement::Disabled
-    };
-    logging::init(conf.log_format, tracing_error_layer_enablement)?;
+    // Initialize logging. This must happen before the custom panic hook is installed.
+    logging::init(conf.log_format)?;

    // mind the order required here: 1. logging, 2. panic_hook, 3. sentry.
    // disarming this hook on pageserver, because we never tear down tracing.
@@ -238,7 +226,6 @@ fn start_pageserver(
    );
    set_build_info_metric(GIT_VERSION);
    set_launch_timestamp_metric(launch_ts);
-    pageserver::preinitialize_metrics();

    // If any failpoints were set from FAILPOINTS environment variable,
    // print them to the log for debugging purposes
--- a/pageserver/src/config.rs
+++ b/pageserver/src/config.rs
@@ -6,7 +6,6 @@

 use anyhow::{anyhow, bail, ensure, Context, Result};
 use remote_storage::{RemotePath, RemoteStorageConfig};
-use serde::de::IntoDeserializer;
 use std::env;
 use storage_broker::Uri;
 use utils::crashsafe::path_with_suffix_extension;
@@ -63,6 +62,7 @@ pub mod defaults {
    pub const DEFAULT_CACHED_METRIC_COLLECTION_INTERVAL: &str = "1 hour";
    pub const DEFAULT_METRIC_COLLECTION_ENDPOINT: Option<reqwest::Url> = None;
    pub const DEFAULT_SYNTHETIC_SIZE_CALCULATION_INTERVAL: &str = "10 min";
+    pub const DEFAULT_EVICTIONS_LOW_RESIDENCE_DURATION_METRIC_THRESHOLD: &str = "24 hour";

    ///
    /// Default built-in configuration file.
@@ -91,6 +91,7 @@ pub mod defaults {
 #cached_metric_collection_interval = '{DEFAULT_CACHED_METRIC_COLLECTION_INTERVAL}'
 #synthetic_size_calculation_interval = '{DEFAULT_SYNTHETIC_SIZE_CALCULATION_INTERVAL}'

+#evictions_low_residence_duration_metric_threshold = '{DEFAULT_EVICTIONS_LOW_RESIDENCE_DURATION_METRIC_THRESHOLD}'

 #disk_usage_based_eviction = {{ max_usage_pct = .., min_avail_bytes = .., period = "10s"}}

@@ -107,7 +108,6 @@ pub mod defaults {
 #pitr_interval = '{DEFAULT_PITR_INTERVAL}'

 #min_resident_size_override = .. # in bytes
-#evictions_low_residence_duration_metric_threshold = '{DEFAULT_EVICTIONS_LOW_RESIDENCE_DURATION_METRIC_THRESHOLD}'

 # [remote_storage]

@@ -182,6 +182,9 @@ pub struct PageServerConf {
    pub metric_collection_endpoint: Option<Url>,
    pub synthetic_size_calculation_interval: Duration,

+    // See the corresponding metric's help string.
+    pub evictions_low_residence_duration_metric_threshold: Duration,
+
    pub disk_usage_based_eviction: Option<DiskUsageEvictionTaskConfig>,

    pub test_remote_failures: u64,
@@ -254,6 +257,8 @@ struct PageServerConfigBuilder {
    metric_collection_endpoint: BuilderValue<Option<Url>>,
    synthetic_size_calculation_interval: BuilderValue<Duration>,

+    evictions_low_residence_duration_metric_threshold: BuilderValue<Duration>,
+
    disk_usage_based_eviction: BuilderValue<Option<DiskUsageEvictionTaskConfig>>,

    test_remote_failures: BuilderValue<u64>,
@@ -311,6 +316,11 @@ impl Default for PageServerConfigBuilder {
            .expect("cannot parse default synthetic size calculation interval")),
            metric_collection_endpoint: Set(DEFAULT_METRIC_COLLECTION_ENDPOINT),

+            evictions_low_residence_duration_metric_threshold: Set(humantime::parse_duration(
+                DEFAULT_EVICTIONS_LOW_RESIDENCE_DURATION_METRIC_THRESHOLD,
+            )
+            .expect("cannot parse DEFAULT_EVICTIONS_LOW_RESIDENCE_DURATION_METRIC_THRESHOLD")),
+
            disk_usage_based_eviction: Set(None),

            test_remote_failures: Set(0),
@@ -428,6 +438,10 @@ impl PageServerConfigBuilder {
        self.test_remote_failures = BuilderValue::Set(fail_first);
    }

+    pub fn evictions_low_residence_duration_metric_threshold(&mut self, value: Duration) {
+        self.evictions_low_residence_duration_metric_threshold = BuilderValue::Set(value);
+    }
+
    pub fn disk_usage_based_eviction(&mut self, value: Option<DiskUsageEvictionTaskConfig>) {
        self.disk_usage_based_eviction = BuilderValue::Set(value);
    }
@@ -511,6 +525,11 @@ impl PageServerConfigBuilder {
            synthetic_size_calculation_interval: self
                .synthetic_size_calculation_interval
                .ok_or(anyhow!("missing synthetic_size_calculation_interval"))?,
+            evictions_low_residence_duration_metric_threshold: self
+                .evictions_low_residence_duration_metric_threshold
+                .ok_or(anyhow!(
+                    "missing evictions_low_residence_duration_metric_threshold"
+                ))?,
            disk_usage_based_eviction: self
                .disk_usage_based_eviction
                .ok_or(anyhow!("missing disk_usage_based_eviction"))?,
@@ -702,12 +721,12 @@ impl PageServerConf {
                "synthetic_size_calculation_interval" =>
                    builder.synthetic_size_calculation_interval(parse_toml_duration(key, item)?),
                "test_remote_failures" => builder.test_remote_failures(parse_toml_u64(key, item)?),
+                "evictions_low_residence_duration_metric_threshold" => builder.evictions_low_residence_duration_metric_threshold(parse_toml_duration(key, item)?),
                "disk_usage_based_eviction" => {
                    tracing::info!("disk_usage_based_eviction: {:#?}", &item);
                    builder.disk_usage_based_eviction(
-                        deserialize_from_item("disk_usage_based_eviction", item)
-                            .context("parse disk_usage_based_eviction")?
-                    )
+                    toml_edit::de::from_item(item.clone())
+                    .context("parse disk_usage_based_eviction")?)
                },
                "ondemand_download_behavior_treat_error_as_warn" => builder.ondemand_download_behavior_treat_error_as_warn(parse_toml_bool(key, item)?),
                _ => bail!("unrecognized pageserver option '{key}'"),
@@ -808,25 +827,18 @@ impl PageServerConf {

        if let Some(eviction_policy) = item.get("eviction_policy") {
            t_conf.eviction_policy = Some(
-                deserialize_from_item("eviction_policy", eviction_policy)
+                toml_edit::de::from_item(eviction_policy.clone())
                    .context("parse eviction_policy")?,
            );
        }

        if let Some(item) = item.get("min_resident_size_override") {
            t_conf.min_resident_size_override = Some(
-                deserialize_from_item("min_resident_size_override", item)
+                toml_edit::de::from_item(item.clone())
                    .context("parse min_resident_size_override")?,
            );
        }

-        if let Some(item) = item.get("evictions_low_residence_duration_metric_threshold") {
-            t_conf.evictions_low_residence_duration_metric_threshold = Some(parse_toml_duration(
-                "evictions_low_residence_duration_metric_threshold",
-                item,
-            )?);
-        }
-
        Ok(t_conf)
    }

@@ -865,6 +877,10 @@ impl PageServerConf {
            cached_metric_collection_interval: Duration::from_secs(60 * 60),
            metric_collection_endpoint: defaults::DEFAULT_METRIC_COLLECTION_ENDPOINT,
            synthetic_size_calculation_interval: Duration::from_secs(60),
+            evictions_low_residence_duration_metric_threshold: humantime::parse_duration(
+                defaults::DEFAULT_EVICTIONS_LOW_RESIDENCE_DURATION_METRIC_THRESHOLD,
+            )
+            .unwrap(),
            disk_usage_based_eviction: None,
            test_remote_failures: 0,
            ondemand_download_behavior_treat_error_as_warn: false,
@@ -922,18 +938,6 @@ where
    })
 }

-fn deserialize_from_item<T>(name: &str, item: &Item) -> anyhow::Result<T>
-where
-    T: serde::de::DeserializeOwned,
-{
-    // ValueDeserializer::new is not public, so use the ValueDeserializer's documented way
-    let deserializer = match item.clone().into_value() {
-        Ok(value) => value.into_deserializer(),
-        Err(item) => anyhow::bail!("toml_edit::Item '{item}' is not a toml_edit::Value"),
-    };
-    T::deserialize(deserializer).with_context(|| format!("deserializing item for node {name}"))
-}
-
 /// Configurable semaphore permits setting.
 ///
 /// Does not allow semaphore permits to be zero, because at runtime initially zero permits and empty
@@ -1000,10 +1004,9 @@ mod tests {

    use remote_storage::{RemoteStorageKind, S3Config};
    use tempfile::{tempdir, TempDir};
-    use utils::serde_percent::Percent;

    use super::*;
-    use crate::{tenant::config::EvictionPolicy, DEFAULT_PG_VERSION};
+    use crate::DEFAULT_PG_VERSION;

    const ALL_BASE_VALUES_TOML: &str = r#"
 # Initial configuration file created by 'pageserver --init'
@@ -1026,6 +1029,8 @@ cached_metric_collection_interval = '22200 s'
 metric_collection_endpoint = 'http://localhost:80/metrics'
 synthetic_size_calculation_interval = '333 s'

+evictions_low_residence_duration_metric_threshold = '444 s'
+
 log_format = 'json'

 "#;
@@ -1082,6 +1087,9 @@ log_format = 'json'
                synthetic_size_calculation_interval: humantime::parse_duration(
                    defaults::DEFAULT_SYNTHETIC_SIZE_CALCULATION_INTERVAL
                )?,
+                evictions_low_residence_duration_metric_threshold: humantime::parse_duration(
+                    defaults::DEFAULT_EVICTIONS_LOW_RESIDENCE_DURATION_METRIC_THRESHOLD
+                )?,
                disk_usage_based_eviction: None,
                test_remote_failures: 0,
                ondemand_download_behavior_treat_error_as_warn: false,
@@ -1136,6 +1144,7 @@ log_format = 'json'
                cached_metric_collection_interval: Duration::from_secs(22200),
                metric_collection_endpoint: Some(Url::parse("http://localhost:80/metrics")?),
                synthetic_size_calculation_interval: Duration::from_secs(333),
+                evictions_low_residence_duration_metric_threshold: Duration::from_secs(444),
                disk_usage_based_eviction: None,
                test_remote_failures: 0,
                ondemand_download_behavior_treat_error_as_warn: false,
@@ -1301,71 +1310,6 @@ trace_read_requests = {trace_read_requests}"#,
        Ok(())
    }

-    #[test]
-    fn eviction_pageserver_config_parse() -> anyhow::Result<()> {
-        let tempdir = tempdir()?;
-        let (workdir, pg_distrib_dir) = prepare_fs(&tempdir)?;
-
-        let pageserver_conf_toml = format!(
-            r#"pg_distrib_dir = "{}"
-metric_collection_endpoint = "http://sample.url"
-metric_collection_interval = "10min"
-id = 222
-
-[disk_usage_based_eviction]
-max_usage_pct = 80
-min_avail_bytes = 0
-period = "10s"
-
-[tenant_config]
-evictions_low_residence_duration_metric_threshold = "20m"
-
-[tenant_config.eviction_policy]
-kind = "LayerAccessThreshold"
-period = "20m"
-threshold = "20m"
-"#,
-            pg_distrib_dir.display(),
-        );
-        let toml: Document = pageserver_conf_toml.parse()?;
-        let conf = PageServerConf::parse_and_validate(&toml, &workdir)?;
-
-        assert_eq!(conf.pg_distrib_dir, pg_distrib_dir);
-        assert_eq!(
-            conf.metric_collection_endpoint,
-            Some("http://sample.url".parse().unwrap())
-        );
-        assert_eq!(
-            conf.metric_collection_interval,
-            Duration::from_secs(10 * 60)
-        );
-        assert_eq!(
-            conf.default_tenant_conf
-                .evictions_low_residence_duration_metric_threshold,
-            Duration::from_secs(20 * 60)
-        );
-        assert_eq!(conf.id, NodeId(222));
-        assert_eq!(
-            conf.disk_usage_based_eviction,
-            Some(DiskUsageEvictionTaskConfig {
-                max_usage_pct: Percent::new(80).unwrap(),
-                min_avail_bytes: 0,
-                period: Duration::from_secs(10),
-                #[cfg(feature = "testing")]
-                mock_statvfs: None,
-            })
-        );
-        match &conf.default_tenant_conf.eviction_policy {
-            EvictionPolicy::NoEviction => panic!("Unexpected eviction opolicy tenant settings"),
-            EvictionPolicy::LayerAccessThreshold(eviction_thresold) => {
-                assert_eq!(eviction_thresold.period, Duration::from_secs(20 * 60));
-                assert_eq!(eviction_thresold.threshold, Duration::from_secs(20 * 60));
-            }
-        }
-
-        Ok(())
-    }
-
    fn prepare_fs(tempdir: &TempDir) -> anyhow::Result<(PathBuf, PathBuf)> {
        let tempdir_path = tempdir.path();

--- a/pageserver/src/http/openapi_spec.yml
+++ b/pageserver/src/http/openapi_spec.yml
@@ -520,43 +520,6 @@ paths:
              schema:
                $ref: "#/components/schemas/Error"

-  /v1/tenant/{tenant_id}/synthetic_size:
-    parameters:
-      - name: tenant_id
-        in: path
-        required: true
-        schema:
-          type: string
-          format: hex
-    get:
-      description: |
-        Calculate tenant's synthetic size
-      responses:
-        "200":
-          description: Tenant's synthetic size
-          content:
-            application/json:
-              schema:
-                $ref: "#/components/schemas/SyntheticSizeResponse"
-        "401":
-          description: Unauthorized Error
-          content:
-            application/json:
-              schema:
-                $ref: "#/components/schemas/UnauthorizedError"
-        "403":
-          description: Forbidden Error
-          content:
-            application/json:
-              schema:
-                $ref: "#/components/schemas/ForbiddenError"
-        "500":
-          description: Generic operation error
-          content:
-            application/json:
-              schema:
-                $ref: "#/components/schemas/Error"
-
  /v1/tenant/{tenant_id}/size:
    parameters:
      - name: tenant_id
@@ -985,84 +948,6 @@ components:
        latest_gc_cutoff_lsn:
          type: string
          format: hex
-
-    SyntheticSizeResponse:
-      type: object
-      required:
-        - id
-        - size
-        - segment_sizes
-        - inputs
-      properties:
-        id:
-          type: string
-          format: hex
-        size:
-          type: integer
-        segment_sizes:
-          type: array
-          items:
-            $ref: "#/components/schemas/SegmentSize"
-        inputs:
-          type: object
-          properties:
-            segments:
-              type: array
-              items:
-                $ref: "#/components/schemas/SegmentData"
-            timeline_inputs:
-              type: array
-              items:
-                $ref: "#/components/schemas/TimelineInput"
-
-    SegmentSize:
-      type: object
-      required:
-        - method
-        - accum_size
-      properties:
-        method:
-          type: string
-        accum_size:
-          type: integer
-
-    SegmentData:
-      type: object
-      required:
-        - segment
-      properties:
-        segment:
-          type: object
-          required:
-            - lsn
-          properties:
-            parent:
-              type: integer
-            lsn:
-              type: integer
-            size:
-              type: integer
-            needed:
-              type: boolean
-        timeline_id:
-          type: string
-          format: hex
-        kind:
-          type: string
-
-    TimelineInput:
-      type: object
-      required:
-        - timeline_id
-      properties:
-        ancestor_id:
-          type: string
-        ancestor_lsn:
-          type: string
-        timeline_id:
-          type: string
-          format: hex
-
    Error:
      type: object
      required:
--- a/pageserver/src/http/routes.rs
+++ b/pageserver/src/http/routes.rs
@@ -781,19 +781,6 @@ async fn tenant_create_handler(mut request: Request<Body>) -> Result<Response<Bo

    tenant_conf.min_resident_size_override = request_data.min_resident_size_override;

-    if let Some(evictions_low_residence_duration_metric_threshold) =
-        request_data.evictions_low_residence_duration_metric_threshold
-    {
-        tenant_conf.evictions_low_residence_duration_metric_threshold = Some(
-            humantime::parse_duration(&evictions_low_residence_duration_metric_threshold)
-                .with_context(bad_duration(
-                    "evictions_low_residence_duration_metric_threshold",
-                    &evictions_low_residence_duration_metric_threshold,
-                ))
-                .map_err(ApiError::BadRequest)?,
-        );
-    }
-
    let target_tenant_id = request_data
        .new_tenant_id
        .map(TenantId::from)
@@ -927,19 +914,6 @@ async fn update_tenant_config_handler(

    tenant_conf.min_resident_size_override = request_data.min_resident_size_override;

-    if let Some(evictions_low_residence_duration_metric_threshold) =
-        request_data.evictions_low_residence_duration_metric_threshold
-    {
-        tenant_conf.evictions_low_residence_duration_metric_threshold = Some(
-            humantime::parse_duration(&evictions_low_residence_duration_metric_threshold)
-                .with_context(bad_duration(
-                    "evictions_low_residence_duration_metric_threshold",
-                    &evictions_low_residence_duration_metric_threshold,
-                ))
-                .map_err(ApiError::BadRequest)?,
-        );
-    }
-
    let state = get_state(&request);
    mgr::set_new_tenant_config(state.conf, tenant_conf, tenant_id)
        .instrument(info_span!("tenant_config", tenant = ?tenant_id))
@@ -1201,37 +1175,6 @@ async fn handler_404(_: Request<Body>) -> Result<Response<Body>, ApiError> {
    )
 }

-#[cfg(feature = "testing")]
-async fn post_tracing_event_handler(mut r: Request<Body>) -> Result<Response<Body>, ApiError> {
-    #[derive(Debug, serde::Deserialize)]
-    #[serde(rename_all = "lowercase")]
-    enum Level {
-        Error,
-        Warn,
-        Info,
-        Debug,
-        Trace,
-    }
-    #[derive(Debug, serde::Deserialize)]
-    struct Request {
-        level: Level,
-        message: String,
-    }
-    let body: Request = json_request(&mut r)
-        .await
-        .map_err(|_| ApiError::BadRequest(anyhow::anyhow!("invalid JSON body")))?;
-
-    match body.level {
-        Level::Error => tracing::error!(?body.message),
-        Level::Warn => tracing::warn!(?body.message),
-        Level::Info => tracing::info!(?body.message),
-        Level::Debug => tracing::debug!(?body.message),
-        Level::Trace => tracing::trace!(?body.message),
-    }
-
-    json_response(StatusCode::OK, ())
-}
-
 pub fn make_router(
    conf: &'static PageServerConf,
    launch_ts: &'static LaunchTimestamp,
@@ -1372,9 +1315,5 @@ pub fn make_router(
            testing_api!("set tenant state to broken", handle_tenant_break),
        )
        .get("/v1/panic", |r| RequestSpan(always_panic_handler).handle(r))
-        .post(
-            "/v1/tracing/event",
-            testing_api!("emit a tracing event", post_tracing_event_handler),
-        )
        .any(handler_404))
 }
--- a/pageserver/src/import_datadir.rs
+++ b/pageserver/src/import_datadir.rs
@@ -114,7 +114,7 @@ async fn import_rel(
    path: &Path,
    spcoid: Oid,
    dboid: Oid,
-    reader: &mut (impl AsyncRead + Unpin),
+    reader: &mut (impl AsyncRead + Send + Sync + Unpin),
    len: usize,
    ctx: &RequestContext,
 ) -> anyhow::Result<()> {
@@ -200,7 +200,7 @@ async fn import_slru(
    modification: &mut DatadirModification<'_>,
    slru: SlruKind,
    path: &Path,
-    reader: &mut (impl AsyncRead + Unpin),
+    reader: &mut (impl AsyncRead + Send + Sync + Unpin),
    len: usize,
    ctx: &RequestContext,
 ) -> anyhow::Result<()> {
@@ -612,8 +612,8 @@ async fn import_file(
    Ok(None)
 }

-async fn read_all_bytes(reader: &mut (impl AsyncRead + Unpin)) -> Result<Bytes> {
+async fn read_all_bytes(reader: &mut (impl AsyncRead + Send + Sync + Unpin)) -> Result<Bytes> {
    let mut buf: Vec<u8> = vec![];
    reader.read_to_end(&mut buf).await?;
-    Ok(Bytes::from(buf))
+    Ok(Bytes::copy_from_slice(&buf[..]))
 }
--- a/pageserver/src/lib.rs
+++ b/pageserver/src/lib.rs
@@ -44,8 +44,6 @@ pub const DELTA_FILE_MAGIC: u16 = 0x5A61;

 static ZERO_PAGE: bytes::Bytes = bytes::Bytes::from_static(&[0u8; 8192]);

-pub use crate::metrics::preinitialize_metrics;
-
 pub async fn shutdown_pageserver(exit_code: i32) {
    // Shut down the libpq endpoint task. This prevents new connections from
    // being accepted.
--- a/pageserver/src/metrics.rs
+++ b/pageserver/src/metrics.rs
@@ -1,9 +1,9 @@
 use metrics::core::{AtomicU64, GenericCounter};
 use metrics::{
    register_counter_vec, register_histogram, register_histogram_vec, register_int_counter,
-    register_int_counter_vec, register_int_gauge_vec, register_uint_gauge_vec, Counter, CounterVec,
-    Histogram, HistogramVec, IntCounter, IntCounterVec, IntGauge, IntGaugeVec, UIntGauge,
-    UIntGaugeVec,
+    register_int_counter_vec, register_int_gauge, register_int_gauge_vec, register_uint_gauge_vec,
+    Counter, CounterVec, Histogram, HistogramVec, IntCounter, IntCounterVec, IntGauge, IntGaugeVec,
+    UIntGauge, UIntGaugeVec,
 };
 use once_cell::sync::Lazy;
 use pageserver_api::models::TenantState;
@@ -205,15 +205,6 @@ static EVICTIONS_WITH_LOW_RESIDENCE_DURATION: Lazy<IntCounterVec> = Lazy::new(||
    .expect("failed to define a metric")
 });

-pub static UNEXPECTED_ONDEMAND_DOWNLOADS: Lazy<IntCounter> = Lazy::new(|| {
-    register_int_counter!(
-        "pageserver_unexpected_ondemand_downloads_count",
-        "Number of unexpected on-demand downloads. \
-         We log more context for each increment, so, forgo any labels in this metric.",
-    )
-    .expect("failed to define a metric")
-});
-
 /// Each [`Timeline`]'s  [`EVICTIONS_WITH_LOW_RESIDENCE_DURATION`] metric.
 #[derive(Debug)]
 pub struct EvictionsWithLowResidenceDuration {
@@ -266,54 +257,19 @@ impl EvictionsWithLowResidenceDuration {
        }
    }

-    pub fn change_threshold(
-        &mut self,
-        tenant_id: &str,
-        timeline_id: &str,
-        new_threshold: Duration,
-    ) {
-        if new_threshold == self.threshold {
-            return;
-        }
-        let mut with_new =
-            EvictionsWithLowResidenceDurationBuilder::new(self.data_source, new_threshold)
-                .build(tenant_id, timeline_id);
-        std::mem::swap(self, &mut with_new);
-        with_new.remove(tenant_id, timeline_id);
-    }
-
    // This could be a `Drop` impl, but, we need the `tenant_id` and `timeline_id`.
    fn remove(&mut self, tenant_id: &str, timeline_id: &str) {
        let Some(_counter) = self.counter.take() else {
            return;
        };
-
-        let threshold = Self::threshold_label_value(self.threshold);
-
-        let removed = EVICTIONS_WITH_LOW_RESIDENCE_DURATION.remove_label_values(&[
-            tenant_id,
-            timeline_id,
-            self.data_source,
-            &threshold,
-        ]);
-
-        match removed {
-            Err(e) => {
-                // this has been hit in staging as
-                // <https://neondatabase.sentry.io/issues/4142396994/>, but we don't know how.
-                // because we can be in the drop path already, don't risk:
-                // - "double-panic => illegal instruction" or
-                // - future "drop panick => abort"
-                //
-                // so just nag: (the error has the labels)
-                tracing::warn!("failed to remove EvictionsWithLowResidenceDuration, it was already removed? {e:#?}");
-            }
-            Ok(()) => {
-                // to help identify cases where we double-remove the same values, let's log all
-                // deletions?
-                tracing::info!("removed EvictionsWithLowResidenceDuration with {tenant_id}, {timeline_id}, {}, {threshold}", self.data_source);
-            }
-        }
+        EVICTIONS_WITH_LOW_RESIDENCE_DURATION
+            .remove_label_values(&[
+                tenant_id,
+                timeline_id,
+                self.data_source,
+                &Self::threshold_label_value(self.threshold),
+            ])
+            .expect("we own the metric, no-one else should remove it");
    }
 }

@@ -378,6 +334,11 @@ pub static LIVE_CONNECTIONS_COUNT: Lazy<IntGaugeVec> = Lazy::new(|| {
    .expect("failed to define a metric")
 });

+pub static NUM_ONDISK_LAYERS: Lazy<IntGauge> = Lazy::new(|| {
+    register_int_gauge!("pageserver_ondisk_layers", "Number of layers on-disk")
+        .expect("failed to define a metric")
+});
+
 // remote storage metrics

 /// NB: increment _after_ recording the current value into [`REMOTE_TIMELINE_CLIENT_CALLS_STARTED_HIST`].
@@ -408,26 +369,6 @@ static REMOTE_TIMELINE_CLIENT_CALLS_STARTED_HIST: Lazy<HistogramVec> = Lazy::new
    .expect("failed to define a metric")
 });

-static REMOTE_TIMELINE_CLIENT_BYTES_STARTED_COUNTER: Lazy<IntCounterVec> = Lazy::new(|| {
-    register_int_counter_vec!(
-        "pageserver_remote_timeline_client_bytes_started",
-        "Incremented by the number of bytes associated with a remote timeline client operation. \
-         The increment happens when the operation is scheduled.",
-        &["tenant_id", "timeline_id", "file_kind", "op_kind"],
-    )
-    .expect("failed to define a metric")
-});
-
-static REMOTE_TIMELINE_CLIENT_BYTES_FINISHED_COUNTER: Lazy<IntCounterVec> = Lazy::new(|| {
-    register_int_counter_vec!(
-        "pageserver_remote_timeline_client_bytes_finished",
-        "Incremented by the number of bytes associated with a remote timeline client operation. \
-         The increment happens when the operation finishes (regardless of success/failure/shutdown).",
-        &["tenant_id", "timeline_id", "file_kind", "op_kind"],
-    )
-    .expect("failed to define a metric")
-});
-
 #[derive(Debug, Clone, Copy, PartialEq, Eq, Hash)]
 pub enum RemoteOpKind {
    Upload,
@@ -648,7 +589,7 @@ pub struct TimelineMetrics {
    pub num_persistent_files_created: IntCounter,
    pub persistent_bytes_written: IntCounter,
    pub evictions: IntCounter,
-    pub evictions_with_low_residence_duration: std::sync::RwLock<EvictionsWithLowResidenceDuration>,
+    pub evictions_with_low_residence_duration: EvictionsWithLowResidenceDuration,
 }

 impl TimelineMetrics {
@@ -715,9 +656,7 @@ impl TimelineMetrics {
            num_persistent_files_created,
            persistent_bytes_written,
            evictions,
-            evictions_with_low_residence_duration: std::sync::RwLock::new(
-                evictions_with_low_residence_duration,
-            ),
+            evictions_with_low_residence_duration,
        }
    }
 }
@@ -736,8 +675,6 @@ impl Drop for TimelineMetrics {
        let _ = PERSISTENT_BYTES_WRITTEN.remove_label_values(&[tenant_id, timeline_id]);
        let _ = EVICTIONS.remove_label_values(&[tenant_id, timeline_id]);
        self.evictions_with_low_residence_duration
-            .write()
-            .unwrap()
            .remove(tenant_id, timeline_id);
        for op in STORAGE_TIME_OPERATIONS {
            let _ =
@@ -782,8 +719,6 @@ pub struct RemoteTimelineClientMetrics {
    remote_operation_time: Mutex<HashMap<(&'static str, &'static str, &'static str), Histogram>>,
    calls_unfinished_gauge: Mutex<HashMap<(&'static str, &'static str), IntGauge>>,
    calls_started_hist: Mutex<HashMap<(&'static str, &'static str), Histogram>>,
-    bytes_started_counter: Mutex<HashMap<(&'static str, &'static str), IntCounter>>,
-    bytes_finished_counter: Mutex<HashMap<(&'static str, &'static str), IntCounter>>,
 }

 impl RemoteTimelineClientMetrics {
@@ -794,8 +729,6 @@ impl RemoteTimelineClientMetrics {
            remote_operation_time: Mutex::new(HashMap::default()),
            calls_unfinished_gauge: Mutex::new(HashMap::default()),
            calls_started_hist: Mutex::new(HashMap::default()),
-            bytes_started_counter: Mutex::new(HashMap::default()),
-            bytes_finished_counter: Mutex::new(HashMap::default()),
            remote_physical_size_gauge: Mutex::new(None),
        }
    }
@@ -834,7 +767,6 @@ impl RemoteTimelineClientMetrics {
        });
        metric.clone()
    }
-
    fn calls_unfinished_gauge(
        &self,
        file_kind: &RemoteOpFileKind,
@@ -876,125 +808,32 @@ impl RemoteTimelineClientMetrics {
        });
        metric.clone()
    }
-
-    fn bytes_started_counter(
-        &self,
-        file_kind: &RemoteOpFileKind,
-        op_kind: &RemoteOpKind,
-    ) -> IntCounter {
-        // XXX would be nice to have an upgradable RwLock
-        let mut guard = self.bytes_started_counter.lock().unwrap();
-        let key = (file_kind.as_str(), op_kind.as_str());
-        let metric = guard.entry(key).or_insert_with(move || {
-            REMOTE_TIMELINE_CLIENT_BYTES_STARTED_COUNTER
-                .get_metric_with_label_values(&[
-                    &self.tenant_id.to_string(),
-                    &self.timeline_id.to_string(),
-                    key.0,
-                    key.1,
-                ])
-                .unwrap()
-        });
-        metric.clone()
-    }
-
-    fn bytes_finished_counter(
-        &self,
-        file_kind: &RemoteOpFileKind,
-        op_kind: &RemoteOpKind,
-    ) -> IntCounter {
-        // XXX would be nice to have an upgradable RwLock
-        let mut guard = self.bytes_finished_counter.lock().unwrap();
-        let key = (file_kind.as_str(), op_kind.as_str());
-        let metric = guard.entry(key).or_insert_with(move || {
-            REMOTE_TIMELINE_CLIENT_BYTES_FINISHED_COUNTER
-                .get_metric_with_label_values(&[
-                    &self.tenant_id.to_string(),
-                    &self.timeline_id.to_string(),
-                    key.0,
-                    key.1,
-                ])
-                .unwrap()
-        });
-        metric.clone()
-    }
-}
-
-#[cfg(test)]
-impl RemoteTimelineClientMetrics {
-    pub fn get_bytes_started_counter_value(
-        &self,
-        file_kind: &RemoteOpFileKind,
-        op_kind: &RemoteOpKind,
-    ) -> Option<u64> {
-        let guard = self.bytes_started_counter.lock().unwrap();
-        let key = (file_kind.as_str(), op_kind.as_str());
-        guard.get(&key).map(|counter| counter.get())
-    }
-
-    pub fn get_bytes_finished_counter_value(
-        &self,
-        file_kind: &RemoteOpFileKind,
-        op_kind: &RemoteOpKind,
-    ) -> Option<u64> {
-        let guard = self.bytes_finished_counter.lock().unwrap();
-        let key = (file_kind.as_str(), op_kind.as_str());
-        guard.get(&key).map(|counter| counter.get())
-    }
 }

 /// See [`RemoteTimelineClientMetrics::call_begin`].
 #[must_use]
-pub(crate) struct RemoteTimelineClientCallMetricGuard {
-    /// Decremented on drop.
-    calls_unfinished_metric: Option<IntGauge>,
-    /// If Some(), this references the bytes_finished metric, and we increment it by the given `u64` on drop.
-    bytes_finished: Option<(IntCounter, u64)>,
-}
+pub(crate) struct RemoteTimelineClientCallMetricGuard(Option<IntGauge>);

 impl RemoteTimelineClientCallMetricGuard {
-    /// Consume this guard object without performing the metric updates it would do on `drop()`.
-    /// The caller vouches to do the metric updates manually.
+    /// Consume this guard object without decrementing the metric.
+    /// The caller vouches to do this manually, so that the prior increment of the gauge will cancel out.
    pub fn will_decrement_manually(mut self) {
-        let RemoteTimelineClientCallMetricGuard {
-            calls_unfinished_metric,
-            bytes_finished,
-        } = &mut self;
-        calls_unfinished_metric.take();
-        bytes_finished.take();
+        self.0 = None; // prevent drop() from decrementing
    }
 }

 impl Drop for RemoteTimelineClientCallMetricGuard {
    fn drop(&mut self) {
-        let RemoteTimelineClientCallMetricGuard {
-            calls_unfinished_metric,
-            bytes_finished,
-        } = self;
-        if let Some(guard) = calls_unfinished_metric.take() {
+        if let RemoteTimelineClientCallMetricGuard(Some(guard)) = self {
            guard.dec();
        }
-        if let Some((bytes_finished_metric, value)) = bytes_finished {
-            bytes_finished_metric.inc_by(*value);
-        }
    }
 }

-/// The enum variants communicate to the [`RemoteTimelineClientMetrics`] whether to
-/// track the byte size of this call in applicable metric(s).
-pub(crate) enum RemoteTimelineClientMetricsCallTrackSize {
-    /// Do not account for this call's byte size in any metrics.
-    /// The `reason` field is there to make the call sites self-documenting
-    /// about why they don't need the metric.
-    DontTrackSize { reason: &'static str },
-    /// Track the byte size of the call in applicable metric(s).
-    Bytes(u64),
-}
-
 impl RemoteTimelineClientMetrics {
-    /// Update the metrics that change when a call to the remote timeline client instance starts.
+    /// Increment the metrics that track ongoing calls to the remote timeline client instance.
    ///
-    /// Drop the returned guard object once the operation is finished to updates corresponding metrics that track completions.
+    /// Drop the returned guard object once the operation is finished to decrement the values.
    /// Or, use [`RemoteTimelineClientCallMetricGuard::will_decrement_manually`] and [`call_end`] if that
    /// is more suitable.
    /// Never do both.
@@ -1002,51 +841,24 @@ impl RemoteTimelineClientMetrics {
        &self,
        file_kind: &RemoteOpFileKind,
        op_kind: &RemoteOpKind,
-        size: RemoteTimelineClientMetricsCallTrackSize,
    ) -> RemoteTimelineClientCallMetricGuard {
-        let calls_unfinished_metric = self.calls_unfinished_gauge(file_kind, op_kind);
+        let unfinished_metric = self.calls_unfinished_gauge(file_kind, op_kind);
        self.calls_started_hist(file_kind, op_kind)
-            .observe(calls_unfinished_metric.get() as f64);
-        calls_unfinished_metric.inc(); // NB: inc after the histogram, see comment on underlying metric
-
-        let bytes_finished = match size {
-            RemoteTimelineClientMetricsCallTrackSize::DontTrackSize { reason: _reason } => {
-                // nothing to do
-                None
-            }
-            RemoteTimelineClientMetricsCallTrackSize::Bytes(size) => {
-                self.bytes_started_counter(file_kind, op_kind).inc_by(size);
-                let finished_counter = self.bytes_finished_counter(file_kind, op_kind);
-                Some((finished_counter, size))
-            }
-        };
-        RemoteTimelineClientCallMetricGuard {
-            calls_unfinished_metric: Some(calls_unfinished_metric),
-            bytes_finished,
-        }
+            .observe(unfinished_metric.get() as f64);
+        unfinished_metric.inc();
+        RemoteTimelineClientCallMetricGuard(Some(unfinished_metric))
    }

-    /// Manually udpate the metrics that track completions, instead of using the guard object.
+    /// Manually decrement the metric instead of using the guard object.
    /// Using the guard object is generally preferable.
    /// See [`call_begin`] for more context.
-    pub(crate) fn call_end(
-        &self,
-        file_kind: &RemoteOpFileKind,
-        op_kind: &RemoteOpKind,
-        size: RemoteTimelineClientMetricsCallTrackSize,
-    ) {
-        let calls_unfinished_metric = self.calls_unfinished_gauge(file_kind, op_kind);
+    pub(crate) fn call_end(&self, file_kind: &RemoteOpFileKind, op_kind: &RemoteOpKind) {
+        let unfinished_metric = self.calls_unfinished_gauge(file_kind, op_kind);
        debug_assert!(
-            calls_unfinished_metric.get() > 0,
+            unfinished_metric.get() > 0,
            "begin and end should cancel out"
        );
-        calls_unfinished_metric.dec();
-        match size {
-            RemoteTimelineClientMetricsCallTrackSize::DontTrackSize { reason: _reason } => {}
-            RemoteTimelineClientMetricsCallTrackSize::Bytes(size) => {
-                self.bytes_finished_counter(file_kind, op_kind).inc_by(size);
-            }
-        }
+        unfinished_metric.dec();
    }
 }

@@ -1059,8 +871,6 @@ impl Drop for RemoteTimelineClientMetrics {
            remote_operation_time,
            calls_unfinished_gauge,
            calls_started_hist,
-            bytes_started_counter,
-            bytes_finished_counter,
        } = self;
        for ((a, b, c), _) in remote_operation_time.get_mut().unwrap().drain() {
            let _ = REMOTE_OPERATION_TIME.remove_label_values(&[tenant_id, timeline_id, a, b, c]);
@@ -1081,22 +891,6 @@ impl Drop for RemoteTimelineClientMetrics {
                b,
            ]);
        }
-        for ((a, b), _) in bytes_started_counter.get_mut().unwrap().drain() {
-            let _ = REMOTE_TIMELINE_CLIENT_BYTES_STARTED_COUNTER.remove_label_values(&[
-                tenant_id,
-                timeline_id,
-                a,
-                b,
-            ]);
-        }
-        for ((a, b), _) in bytes_finished_counter.get_mut().unwrap().drain() {
-            let _ = REMOTE_TIMELINE_CLIENT_BYTES_FINISHED_COUNTER.remove_label_values(&[
-                tenant_id,
-                timeline_id,
-                a,
-                b,
-            ]);
-        }
        {
            let _ = remote_physical_size_gauge; // use to avoid 'unused' warning in desctructuring above
            let _ = REMOTE_PHYSICAL_SIZE.remove_label_values(&[tenant_id, timeline_id]);
@@ -1160,10 +954,3 @@ impl<F: Future<Output = Result<O, E>>, O, E> Future for MeasuredRemoteOp<F> {
        poll_result
    }
 }
-
-pub fn preinitialize_metrics() {
-    // We want to alert on this metric increasing.
-    // Initialize it eagerly, so that our alert rule can distinguish absence of the metric from metric value 0.
-    assert_eq!(UNEXPECTED_ONDEMAND_DOWNLOADS.get(), 0);
-    UNEXPECTED_ONDEMAND_DOWNLOADS.reset();
-}
--- a/pageserver/src/page_service.rs
+++ b/pageserver/src/page_service.rs
@@ -20,6 +20,7 @@ use pageserver_api::models::{
    PagestreamFeMessage, PagestreamGetPageRequest, PagestreamGetPageResponse,
    PagestreamNblocksRequest, PagestreamNblocksResponse,
 };
+use postgres_backend::PostgresBackendTCP;
 use postgres_backend::{self, is_expected_io_error, AuthType, PostgresBackend, QueryError};
 use pq_proto::framed::ConnectionError;
 use pq_proto::FeStartupPacket;
@@ -31,7 +32,6 @@ use std::str;
 use std::str::FromStr;
 use std::sync::Arc;
 use std::time::Duration;
-use tokio::io::{AsyncRead, AsyncWrite};
 use tokio_util::io::StreamReader;
 use tracing::*;
 use utils::id::ConnectionId;
@@ -57,10 +57,7 @@ use crate::trace::Tracer;
 use postgres_ffi::pg_constants::DEFAULTTABLESPACE_OID;
 use postgres_ffi::BLCKSZ;

-fn copyin_stream<IO>(pgb: &mut PostgresBackend<IO>) -> impl Stream<Item = io::Result<Bytes>> + '_
-where
-    IO: AsyncRead + AsyncWrite + Unpin,
-{
+fn copyin_stream(pgb: &mut PostgresBackendTCP) -> impl Stream<Item = io::Result<Bytes>> + '_ {
    async_stream::try_stream! {
        loop {
            let msg = tokio::select! {
@@ -68,8 +65,8 @@ where

                _ = task_mgr::shutdown_watcher() => {
                    // We were requested to shut down.
-                    let msg = "pageserver is shutting down";
-                    let _ = pgb.write_message_noflush(&BeMessage::ErrorResponse(msg, None));
+                    let msg = format!("pageserver is shutting down");
+                    let _ = pgb.write_message_noflush(&BeMessage::ErrorResponse(&msg, None));
                    Err(QueryError::Other(anyhow::anyhow!(msg)))
                }

@@ -128,7 +125,7 @@ where
 ///
 /// XXX: Currently, any trailing data after the EOF marker prints a warning.
 /// Perhaps it should be a hard error?
-async fn read_tar_eof(mut reader: (impl AsyncRead + Unpin)) -> anyhow::Result<()> {
+async fn read_tar_eof(mut reader: (impl tokio::io::AsyncRead + Unpin)) -> anyhow::Result<()> {
    use tokio::io::AsyncReadExt;
    let mut buf = [0u8; 512];

@@ -248,23 +245,12 @@ async fn page_service_conn_main(
        .set_nodelay(true)
        .context("could not set TCP_NODELAY")?;

-    let peer_addr = socket.peer_addr().context("get peer address")?;
-
-    // setup read timeout of 10 minutes. the timeout is rather arbitrary for requirements:
-    // - long enough for most valid compute connections
-    // - less than infinite to stop us from "leaking" connections to long-gone computes
-    //
-    // no write timeout is used, because the kernel is assumed to error writes after some time.
-    let mut socket = tokio_io_timeout::TimeoutReader::new(socket);
-    socket.set_timeout(Some(std::time::Duration::from_secs(60 * 10)));
-    let socket = std::pin::pin!(socket);
-
    // XXX: pgbackend.run() should take the connection_ctx,
    // and create a child per-query context when it invokes process_query.
    // But it's in a shared crate, so, we store connection_ctx inside PageServerHandler
    // and create the per-query context in process_query ourselves.
    let mut conn_handler = PageServerHandler::new(conf, auth, connection_ctx);
-    let pgbackend = PostgresBackend::new_from_io(socket, peer_addr, auth_type, None)?;
+    let pgbackend = PostgresBackend::new(socket, auth_type, None)?;

    match pgbackend
        .run(&mut conn_handler, task_mgr::shutdown_watcher)
@@ -346,16 +332,13 @@ impl PageServerHandler {
    }

    #[instrument(skip(self, pgb, ctx))]
-    async fn handle_pagerequests<IO>(
+    async fn handle_pagerequests(
        &self,
-        pgb: &mut PostgresBackend<IO>,
+        pgb: &mut PostgresBackendTCP,
        tenant_id: TenantId,
        timeline_id: TimelineId,
        ctx: RequestContext,
-    ) -> Result<(), QueryError>
-    where
-        IO: AsyncRead + AsyncWrite + Send + Sync + Unpin,
-    {
+    ) -> anyhow::Result<()> {
        // NOTE: pagerequests handler exits when connection is closed,
        //       so there is no need to reset the association
        task_mgr::associate_with(Some(tenant_id), Some(timeline_id));
@@ -398,9 +381,7 @@ impl PageServerHandler {
                Some(FeMessage::CopyData(bytes)) => bytes,
                Some(FeMessage::Terminate) => break,
                Some(m) => {
-                    return Err(QueryError::Other(anyhow::anyhow!(
-                        "unexpected message: {m:?} during COPY"
-                    )));
+                    anyhow::bail!("unexpected message: {m:?} during COPY");
                }
                None => break, // client disconnected
            };
@@ -455,19 +436,16 @@ impl PageServerHandler {

    #[allow(clippy::too_many_arguments)]
    #[instrument(skip(self, pgb, ctx))]
-    async fn handle_import_basebackup<IO>(
+    async fn handle_import_basebackup(
        &self,
-        pgb: &mut PostgresBackend<IO>,
+        pgb: &mut PostgresBackendTCP,
        tenant_id: TenantId,
        timeline_id: TimelineId,
        base_lsn: Lsn,
        _end_lsn: Lsn,
        pg_version: u32,
        ctx: RequestContext,
-    ) -> Result<(), QueryError>
-    where
-        IO: AsyncRead + AsyncWrite + Send + Sync + Unpin,
-    {
+    ) -> Result<(), QueryError> {
        task_mgr::associate_with(Some(tenant_id), Some(timeline_id));
        // Create empty timeline
        info!("creating new timeline");
@@ -508,18 +486,15 @@ impl PageServerHandler {
    }

    #[instrument(skip(self, pgb, ctx))]
-    async fn handle_import_wal<IO>(
+    async fn handle_import_wal(
        &self,
-        pgb: &mut PostgresBackend<IO>,
+        pgb: &mut PostgresBackendTCP,
        tenant_id: TenantId,
        timeline_id: TimelineId,
        start_lsn: Lsn,
        end_lsn: Lsn,
        ctx: RequestContext,
-    ) -> Result<(), QueryError>
-    where
-        IO: AsyncRead + AsyncWrite + Send + Sync + Unpin,
-    {
+    ) -> Result<(), QueryError> {
        task_mgr::associate_with(Some(tenant_id), Some(timeline_id));

        let timeline = get_active_tenant_timeline(tenant_id, timeline_id, &ctx).await?;
@@ -715,21 +690,16 @@ impl PageServerHandler {

    #[allow(clippy::too_many_arguments)]
    #[instrument(skip(self, pgb, ctx))]
-    async fn handle_basebackup_request<IO>(
+    async fn handle_basebackup_request(
        &mut self,
-        pgb: &mut PostgresBackend<IO>,
+        pgb: &mut PostgresBackendTCP,
        tenant_id: TenantId,
        timeline_id: TimelineId,
        lsn: Option<Lsn>,
        prev_lsn: Option<Lsn>,
        full_backup: bool,
        ctx: RequestContext,
-    ) -> anyhow::Result<()>
-    where
-        IO: AsyncRead + AsyncWrite + Send + Sync + Unpin,
-    {
-        let started = std::time::Instant::now();
-
+    ) -> anyhow::Result<()> {
        // check that the timeline exists
        let timeline = get_active_tenant_timeline(tenant_id, timeline_id, &ctx).await?;
        let latest_gc_cutoff_lsn = timeline.get_latest_gc_cutoff_lsn();
@@ -742,8 +712,6 @@ impl PageServerHandler {
                .context("invalid basebackup lsn")?;
        }

-        let lsn_awaited_after = started.elapsed();
-
        // switch client to COPYOUT
        pgb.write_message_noflush(&BeMessage::CopyOutResponse)?;
        pgb.flush().await?;
@@ -764,17 +732,7 @@ impl PageServerHandler {

        pgb.write_message_noflush(&BeMessage::CopyDone)?;
        pgb.flush().await?;
-
-        let basebackup_after = started
-            .elapsed()
-            .checked_sub(lsn_awaited_after)
-            .unwrap_or(Duration::ZERO);
-
-        info!(
-            lsn_await_millis = lsn_awaited_after.as_millis(),
-            basebackup_millis = basebackup_after.as_millis(),
-            "basebackup complete"
-        );
+        info!("basebackup complete");

        Ok(())
    }
@@ -798,13 +756,10 @@ impl PageServerHandler {
 }

 #[async_trait::async_trait]
-impl<IO> postgres_backend::Handler<IO> for PageServerHandler
-where
-    IO: AsyncRead + AsyncWrite + Send + Sync + Unpin,
-{
+impl postgres_backend::Handler<tokio::net::TcpStream> for PageServerHandler {
    fn check_auth_jwt(
        &mut self,
-        _pgb: &mut PostgresBackend<IO>,
+        _pgb: &mut PostgresBackendTCP,
        jwt_response: &[u8],
    ) -> Result<(), QueryError> {
        // this unwrap is never triggered, because check_auth_jwt only called when auth_type is NeonJWT
@@ -832,7 +787,7 @@ where

    fn startup(
        &mut self,
-        _pgb: &mut PostgresBackend<IO>,
+        _pgb: &mut PostgresBackendTCP,
        _sm: &FeStartupPacket,
    ) -> Result<(), QueryError> {
        Ok(())
@@ -840,7 +795,7 @@ where

    async fn process_query(
        &mut self,
-        pgb: &mut PostgresBackend<IO>,
+        pgb: &mut PostgresBackendTCP,
        query_string: &str,
    ) -> Result<(), QueryError> {
        let ctx = self.connection_ctx.attached_child();
--- a/pageserver/src/tenant.rs
+++ b/pageserver/src/tenant.rs
@@ -118,10 +118,6 @@ pub struct Tenant {
    // Global pageserver config parameters
    pub conf: &'static PageServerConf,

-    /// The value creation timestamp, used to measure activation delay, see:
-    /// <https://github.com/neondatabase/neon/issues/4025>
-    loading_started_at: Instant,
-
    state: watch::Sender<TenantState>,

    // Overridden tenant-specific config parameters.
@@ -271,7 +267,10 @@ impl UninitializedTimeline<'_> {
            .await
            .context("Failed to flush after basebackup import")?;

-        self.initialize(ctx)
+        // Initialize without loading the layer map. We started with an empty layer map, and already
+        // updated it for the layers that we created during the import.
+        let mut timelines = self.owning_tenant.timelines.lock().unwrap();
+        self.initialize_with_lock(ctx, &mut timelines, false, true)
    }

    fn raw_timeline(&self) -> anyhow::Result<&Arc<Timeline>> {
@@ -1477,7 +1476,7 @@ impl Tenant {
                TenantState::Loading | TenantState::Attaching => {
                    *current_state = TenantState::Active;

-                    debug!(tenant_id = %self.tenant_id, "Activating tenant");
+                    info!("Activating tenant {}", self.tenant_id);

                    let timelines_accessor = self.timelines.lock().unwrap();
                    let not_broken_timelines = timelines_accessor
@@ -1488,17 +1487,12 @@ impl Tenant {
                    // down when they notice that the tenant is inactive.
                    tasks::start_background_loops(self.tenant_id);

-                    let mut activated_timelines = 0;
-                    let mut timelines_broken_during_activation = 0;
-
                    for timeline in not_broken_timelines {
                        match timeline
                            .activate(ctx)
                            .context("timeline activation for activating tenant")
                        {
-                            Ok(()) => {
-                                activated_timelines += 1;
-                            }
+                            Ok(()) => {}
                            Err(e) => {
                                error!(
                                    "Failed to activate timeline {}: {:#}",
@@ -1509,26 +1503,9 @@ impl Tenant {
                                    "failed to activate timeline {}: {}",
                                    timeline.timeline_id, e
                                ));
-
-                                timelines_broken_during_activation += 1;
                            }
                        }
                    }
-
-                    let elapsed = self.loading_started_at.elapsed();
-                    let total_timelines = timelines_accessor.len();
-
-                    // log a lot of stuff, because some tenants sometimes suffer from user-visible
-                    // times to activate. see https://github.com/neondatabase/neon/issues/4025
-                    info!(
-                        since_creation_millis = elapsed.as_millis(),
-                        tenant_id = %self.tenant_id,
-                        activated_timelines,
-                        timelines_broken_during_activation,
-                        total_timelines,
-                        post_state = <&'static str>::from(&*current_state),
-                        "activation attempt finished"
-                    );
                }
            }
        });
@@ -1758,13 +1735,6 @@ impl Tenant {

    pub fn set_new_tenant_config(&self, new_tenant_conf: TenantConfOpt) {
        *self.tenant_conf.write().unwrap() = new_tenant_conf;
-        // Don't hold self.timelines.lock() during the notifies.
-        // There's no risk of deadlock right now, but there could be if we consolidate
-        // mutexes in struct Timeline in the future.
-        let timelines = self.list_timelines();
-        for timeline in timelines {
-            timeline.tenant_conf_updated();
-        }
    }

    fn create_timeline_data(
@@ -1835,9 +1805,6 @@ impl Tenant {
        Tenant {
            tenant_id,
            conf,
-            // using now here is good enough approximation to catch tenants with really long
-            // activation times.
-            loading_started_at: Instant::now(),
            tenant_conf: Arc::new(RwLock::new(tenant_conf)),
            timelines: Mutex::new(HashMap::new()),
            gc_cs: tokio::sync::Mutex::new(()),
@@ -1920,7 +1887,7 @@ impl Tenant {
            .to_string();

            // Convert the config to a toml file.
-            conf_content += &toml_edit::ser::to_string(&tenant_conf)?;
+            conf_content += &toml_edit::easy::to_string(&tenant_conf)?;

            let mut target_config_file = VirtualFile::open_with_options(
                target_config_path,
@@ -2352,6 +2319,8 @@ impl Tenant {
                )
            })?;

+        // Initialize the timeline without loading the layer map, because we already updated the layer
+        // map above, when we imported the datadir.
        let timeline = {
            let mut timelines = self.timelines.lock().unwrap();
            raw_timeline.initialize_with_lock(ctx, &mut timelines, false, true)?
@@ -2846,9 +2815,6 @@ pub mod harness {
                trace_read_requests: Some(tenant_conf.trace_read_requests),
                eviction_policy: Some(tenant_conf.eviction_policy),
                min_resident_size_override: tenant_conf.min_resident_size_override,
-                evictions_low_residence_duration_metric_threshold: Some(
-                    tenant_conf.evictions_low_residence_duration_metric_threshold,
-                ),
            }
        }
    }
@@ -2881,13 +2847,7 @@ pub mod harness {
            };

            LOG_HANDLE.get_or_init(|| {
-                logging::init(
-                    logging::LogFormat::Test,
-                    // enable it in case in case the tests exercise code paths that use
-                    // debug_assert_current_span_has_tenant_and_timeline_id
-                    logging::TracingErrorLayerEnablement::EnableWithRustLogFilter,
-                )
-                .expect("Failed to init test logging")
+                logging::init(logging::LogFormat::Test).expect("Failed to init test logging")
            });

            let repo_dir = PageServerConf::test_repo_dir(test_name);
--- a/pageserver/src/tenant/config.rs
+++ b/pageserver/src/tenant/config.rs
@@ -39,7 +39,6 @@ pub mod defaults {
    pub const DEFAULT_WALRECEIVER_CONNECT_TIMEOUT: &str = "2 seconds";
    pub const DEFAULT_WALRECEIVER_LAGGING_WAL_TIMEOUT: &str = "3 seconds";
    pub const DEFAULT_MAX_WALRECEIVER_LSN_WAL_LAG: u64 = 10 * 1024 * 1024;
-    pub const DEFAULT_EVICTIONS_LOW_RESIDENCE_DURATION_METRIC_THRESHOLD: &str = "24 hour";
 }

 /// Per-tenant configuration options
@@ -94,9 +93,6 @@ pub struct TenantConf {
    pub trace_read_requests: bool,
    pub eviction_policy: EvictionPolicy,
    pub min_resident_size_override: Option<u64>,
-    // See the corresponding metric's help string.
-    #[serde(with = "humantime_serde")]
-    pub evictions_low_residence_duration_metric_threshold: Duration,
 }

 /// Same as TenantConf, but this struct preserves the information about
@@ -168,11 +164,6 @@ pub struct TenantConfOpt {
    #[serde(skip_serializing_if = "Option::is_none")]
    #[serde(default)]
    pub min_resident_size_override: Option<u64>,
-
-    #[serde(skip_serializing_if = "Option::is_none")]
-    #[serde(with = "humantime_serde")]
-    #[serde(default)]
-    pub evictions_low_residence_duration_metric_threshold: Option<Duration>,
 }

 #[derive(Debug, Clone, Copy, PartialEq, Eq, Serialize, Deserialize)]
@@ -237,9 +228,6 @@ impl TenantConfOpt {
            min_resident_size_override: self
                .min_resident_size_override
                .or(global_conf.min_resident_size_override),
-            evictions_low_residence_duration_metric_threshold: self
-                .evictions_low_residence_duration_metric_threshold
-                .unwrap_or(global_conf.evictions_low_residence_duration_metric_threshold),
        }
    }
 }
@@ -272,10 +260,6 @@ impl Default for TenantConf {
            trace_read_requests: false,
            eviction_policy: EvictionPolicy::NoEviction,
            min_resident_size_override: None,
-            evictions_low_residence_duration_metric_threshold: humantime::parse_duration(
-                DEFAULT_EVICTIONS_LOW_RESIDENCE_DURATION_METRIC_THRESHOLD,
-            )
-            .expect("cannot parse default evictions_low_residence_duration_metric_threshold"),
        }
    }
 }
@@ -291,9 +275,9 @@ mod tests {
            ..TenantConfOpt::default()
        };

-        let toml_form = toml_edit::ser::to_string(&small_conf).unwrap();
+        let toml_form = toml_edit::easy::to_string(&small_conf).unwrap();
        assert_eq!(toml_form, "gc_horizon = 42\n");
-        assert_eq!(small_conf, toml_edit::de::from_str(&toml_form).unwrap());
+        assert_eq!(small_conf, toml_edit::easy::from_str(&toml_form).unwrap());

        let json_form = serde_json::to_string(&small_conf).unwrap();
        assert_eq!(json_form, "{\"gc_horizon\":42}");
--- a/pageserver/src/tenant/layer_map.rs
+++ b/pageserver/src/tenant/layer_map.rs
@@ -48,10 +48,11 @@ mod layer_coverage;

 use crate::context::RequestContext;
 use crate::keyspace::KeyPartitioning;
+use crate::metrics::NUM_ONDISK_LAYERS;
 use crate::repository::Key;
 use crate::tenant::storage_layer::InMemoryLayer;
 use crate::tenant::storage_layer::Layer;
-use anyhow::Result;
+use anyhow::{bail, Result};
 use std::collections::VecDeque;
 use std::ops::Range;
 use std::sync::Arc;
@@ -125,7 +126,7 @@ where
    ///
    /// Insert an on-disk layer.
    ///
-    pub fn insert_historic(&mut self, layer: Arc<L>) {
+    pub fn insert_historic(&mut self, layer: Arc<L>) -> anyhow::Result<()> {
        self.layer_map.insert_historic_noflush(layer)
    }

@@ -273,16 +274,22 @@ where
    ///
    /// Helper function for BatchedUpdates::insert_historic
    ///
-    pub(self) fn insert_historic_noflush(&mut self, layer: Arc<L>) {
-        // TODO: See #3869, resulting #4088, attempted fix and repro #4094
-        self.historic.insert(
-            historic_layer_coverage::LayerKey::from(&*layer),
-            Arc::clone(&layer),
-        );
+    pub(self) fn insert_historic_noflush(&mut self, layer: Arc<L>) -> anyhow::Result<()> {
+        let key = historic_layer_coverage::LayerKey::from(&*layer);
+        if self.historic.contains(&key) {
+            bail!(
+                "Attempt to insert duplicate layer {} in layer map",
+                layer.short_id()
+            );
+        }
+        self.historic.insert(key, Arc::clone(&layer));

        if Self::is_l0(&layer) {
            self.l0_delta_layers.push(layer);
        }
+
+        NUM_ONDISK_LAYERS.inc();
+        Ok(())
    }

    ///
@@ -307,6 +314,8 @@ where
                "failed to locate removed historic layer from l0_delta_layers"
            );
        }
+
+        NUM_ONDISK_LAYERS.dec();
    }

    pub(self) fn replace_historic_noflush(
@@ -834,7 +843,7 @@ mod tests {

            let expected_in_counts = (1, usize::from(expected_l0));

-            map.batch_update().insert_historic(remote.clone());
+            map.batch_update().insert_historic(remote.clone()).unwrap();
            assert_eq!(count_layer_in(&map, &remote), expected_in_counts);

            let replaced = map
--- a/pageserver/src/tenant/layer_map/historic_layer_coverage.rs
+++ b/pageserver/src/tenant/layer_map/historic_layer_coverage.rs
@@ -417,6 +417,14 @@ impl<Value: Clone> BufferedHistoricLayerCoverage<Value> {
        }
    }

+    pub fn contains(&self, layer_key: &LayerKey) -> bool {
+        match self.buffer.get(layer_key) {
+            Some(None) => false,                         // layer remove was buffered
+            Some(_) => true,                             // layer insert was buffered
+            None => self.layers.contains_key(layer_key), // no buffered ops for this layer
+        }
+    }
+
    pub fn insert(&mut self, layer_key: LayerKey, value: Value) {
        self.buffer.insert(layer_key, Some(value));
    }
--- a/pageserver/src/tenant/remote_timeline_client.rs
+++ b/pageserver/src/tenant/remote_timeline_client.rs
@@ -219,8 +219,7 @@ use utils::lsn::Lsn;

 use crate::metrics::{
    MeasureRemoteOp, RemoteOpFileKind, RemoteOpKind, RemoteTimelineClientMetrics,
-    RemoteTimelineClientMetricsCallTrackSize, REMOTE_ONDEMAND_DOWNLOADED_BYTES,
-    REMOTE_ONDEMAND_DOWNLOADED_LAYERS,
+    REMOTE_ONDEMAND_DOWNLOADED_BYTES, REMOTE_ONDEMAND_DOWNLOADED_LAYERS,
 };
 use crate::tenant::remote_timeline_client::index::LayerFileMetadata;
 use crate::{
@@ -368,13 +367,9 @@ impl RemoteTimelineClient {

    /// Download index file
    pub async fn download_index_file(&self) -> Result<IndexPart, DownloadError> {
-        let _unfinished_gauge_guard = self.metrics.call_begin(
-            &RemoteOpFileKind::Index,
-            &RemoteOpKind::Download,
-            crate::metrics::RemoteTimelineClientMetricsCallTrackSize::DontTrackSize {
-                reason: "no need for a downloads gauge",
-            },
-        );
+        let _unfinished_gauge_guard = self
+            .metrics
+            .call_begin(&RemoteOpFileKind::Index, &RemoteOpKind::Download);

        download::download_index_part(
            self.conf,
@@ -403,13 +398,9 @@ impl RemoteTimelineClient {
        layer_metadata: &LayerFileMetadata,
    ) -> anyhow::Result<u64> {
        let downloaded_size = {
-            let _unfinished_gauge_guard = self.metrics.call_begin(
-                &RemoteOpFileKind::Layer,
-                &RemoteOpKind::Download,
-                crate::metrics::RemoteTimelineClientMetricsCallTrackSize::DontTrackSize {
-                    reason: "no need for a downloads gauge",
-                },
-            );
+            let _unfinished_gauge_guard = self
+                .metrics
+                .call_begin(&RemoteOpFileKind::Layer, &RemoteOpKind::Download);
            download::download_layer_file(
                self.conf,
                &self.storage_impl,
@@ -895,32 +886,11 @@ impl RemoteTimelineClient {
    fn calls_unfinished_metric_impl(
        &self,
        op: &UploadOp,
-    ) -> Option<(
-        RemoteOpFileKind,
-        RemoteOpKind,
-        RemoteTimelineClientMetricsCallTrackSize,
-    )> {
-        use RemoteTimelineClientMetricsCallTrackSize::DontTrackSize;
+    ) -> Option<(RemoteOpFileKind, RemoteOpKind)> {
        let res = match op {
-            UploadOp::UploadLayer(_, m) => (
-                RemoteOpFileKind::Layer,
-                RemoteOpKind::Upload,
-                RemoteTimelineClientMetricsCallTrackSize::Bytes(m.file_size()),
-            ),
-            UploadOp::UploadMetadata(_, _) => (
-                RemoteOpFileKind::Index,
-                RemoteOpKind::Upload,
-                DontTrackSize {
-                    reason: "metadata uploads are tiny",
-                },
-            ),
-            UploadOp::Delete(file_kind, _) => (
-                *file_kind,
-                RemoteOpKind::Delete,
-                DontTrackSize {
-                    reason: "should we track deletes? positive or negative sign?",
-                },
-            ),
+            UploadOp::UploadLayer(_, _) => (RemoteOpFileKind::Layer, RemoteOpKind::Upload),
+            UploadOp::UploadMetadata(_, _) => (RemoteOpFileKind::Index, RemoteOpKind::Upload),
+            UploadOp::Delete(file_kind, _) => (*file_kind, RemoteOpKind::Delete),
            UploadOp::Barrier(_) => {
                // we do not account these
                return None;
@@ -930,20 +900,20 @@ impl RemoteTimelineClient {
    }

    fn calls_unfinished_metric_begin(&self, op: &UploadOp) {
-        let (file_kind, op_kind, track_bytes) = match self.calls_unfinished_metric_impl(op) {
+        let (file_kind, op_kind) = match self.calls_unfinished_metric_impl(op) {
            Some(x) => x,
            None => return,
        };
-        let guard = self.metrics.call_begin(&file_kind, &op_kind, track_bytes);
+        let guard = self.metrics.call_begin(&file_kind, &op_kind);
        guard.will_decrement_manually(); // in unfinished_ops_metric_end()
    }

    fn calls_unfinished_metric_end(&self, op: &UploadOp) {
-        let (file_kind, op_kind, track_bytes) = match self.calls_unfinished_metric_impl(op) {
+        let (file_kind, op_kind) = match self.calls_unfinished_metric_impl(op) {
            Some(x) => x,
            None => return,
        };
-        self.metrics.call_end(&file_kind, &op_kind, track_bytes);
+        self.metrics.call_end(&file_kind, &op_kind);
    }

    fn stop(&self) {
@@ -1011,19 +981,11 @@ impl RemoteTimelineClient {
 mod tests {
    use super::*;
    use crate::{
-        context::RequestContext,
-        tenant::{
-            harness::{TenantHarness, TIMELINE_ID},
-            Tenant,
-        },
+        tenant::harness::{TenantHarness, TIMELINE_ID},
        DEFAULT_PG_VERSION,
    };
    use remote_storage::{RemoteStorageConfig, RemoteStorageKind};
-    use std::{
-        collections::HashSet,
-        path::{Path, PathBuf},
-    };
-    use tokio::runtime::EnterGuard;
+    use std::{collections::HashSet, path::Path};
    use utils::lsn::Lsn;

    pub(super) fn dummy_contents(name: &str) -> Vec<u8> {
@@ -1072,80 +1034,39 @@ mod tests {
        assert_eq!(found, expected);
    }

-    struct TestSetup {
-        runtime: &'static tokio::runtime::Runtime,
-        entered_runtime: EnterGuard<'static>,
-        harness: TenantHarness<'static>,
-        tenant: Arc<Tenant>,
-        tenant_ctx: RequestContext,
-        remote_fs_dir: PathBuf,
-        client: Arc<RemoteTimelineClient>,
-    }
-
-    impl TestSetup {
-        fn new(test_name: &str) -> anyhow::Result<Self> {
-            // Use a current-thread runtime in the test
-            let runtime = Box::leak(Box::new(
-                tokio::runtime::Builder::new_current_thread()
-                    .enable_all()
-                    .build()?,
-            ));
-            let entered_runtime = runtime.enter();
-
-            let test_name = Box::leak(Box::new(format!("remote_timeline_client__{test_name}")));
-            let harness = TenantHarness::create(test_name)?;
-            let (tenant, ctx) = runtime.block_on(harness.load());
-            // create an empty timeline directory
-            let timeline =
-                tenant.create_empty_timeline(TIMELINE_ID, Lsn(0), DEFAULT_PG_VERSION, &ctx)?;
-            let _ = timeline.initialize(&ctx).unwrap();
-
-            let remote_fs_dir = harness.conf.workdir.join("remote_fs");
-            std::fs::create_dir_all(remote_fs_dir)?;
-            let remote_fs_dir = std::fs::canonicalize(harness.conf.workdir.join("remote_fs"))?;
-
-            let storage_config = RemoteStorageConfig {
-                max_concurrent_syncs: std::num::NonZeroUsize::new(
-                    remote_storage::DEFAULT_REMOTE_STORAGE_MAX_CONCURRENT_SYNCS,
-                )
-                .unwrap(),
-                max_sync_errors: std::num::NonZeroU32::new(
-                    remote_storage::DEFAULT_REMOTE_STORAGE_MAX_SYNC_ERRORS,
-                )
-                .unwrap(),
-                storage: RemoteStorageKind::LocalFs(remote_fs_dir.clone()),
-            };
-
-            let storage = GenericRemoteStorage::from_config(&storage_config).unwrap();
-
-            let client = Arc::new(RemoteTimelineClient {
-                conf: harness.conf,
-                runtime,
-                tenant_id: harness.tenant_id,
-                timeline_id: TIMELINE_ID,
-                storage_impl: storage,
-                upload_queue: Mutex::new(UploadQueue::Uninitialized),
-                metrics: Arc::new(RemoteTimelineClientMetrics::new(
-                    &harness.tenant_id,
-                    &TIMELINE_ID,
-                )),
-            });
-
-            Ok(Self {
-                runtime,
-                entered_runtime,
-                harness,
-                tenant,
-                tenant_ctx: ctx,
-                remote_fs_dir,
-                client,
-            })
-        }
-    }
-
    // Test scheduling
    #[test]
    fn upload_scheduling() -> anyhow::Result<()> {
+        // Use a current-thread runtime in the test
+        let runtime = Box::leak(Box::new(
+            tokio::runtime::Builder::new_current_thread()
+                .enable_all()
+                .build()?,
+        ));
+        let _entered = runtime.enter();
+
+        let harness = TenantHarness::create("upload_scheduling")?;
+        let (tenant, ctx) = runtime.block_on(harness.load());
+        let _timeline =
+            tenant.create_empty_timeline(TIMELINE_ID, Lsn(0), DEFAULT_PG_VERSION, &ctx)?;
+        let timeline_path = harness.timeline_path(&TIMELINE_ID);
+
+        let remote_fs_dir = harness.conf.workdir.join("remote_fs");
+        std::fs::create_dir_all(remote_fs_dir)?;
+        let remote_fs_dir = std::fs::canonicalize(harness.conf.workdir.join("remote_fs"))?;
+
+        let storage_config = RemoteStorageConfig {
+            max_concurrent_syncs: std::num::NonZeroUsize::new(
+                remote_storage::DEFAULT_REMOTE_STORAGE_MAX_CONCURRENT_SYNCS,
+            )
+            .unwrap(),
+            max_sync_errors: std::num::NonZeroU32::new(
+                remote_storage::DEFAULT_REMOTE_STORAGE_MAX_SYNC_ERRORS,
+            )
+            .unwrap(),
+            storage: RemoteStorageKind::LocalFs(remote_fs_dir.clone()),
+        };
+
        // Test outline:
        //
        // Schedule upload of a bunch of layers. Check that they are started immediately, not queued
@@ -1160,20 +1081,22 @@ mod tests {
        // Schedule another deletion. Check that it's launched immediately.
        // Schedule index upload. Check that it's queued

-        let TestSetup {
-            runtime,
-            entered_runtime: _entered_runtime,
-            harness,
-            tenant: _tenant,
-            tenant_ctx: _tenant_ctx,
-            remote_fs_dir,
-            client,
-        } = TestSetup::new("upload_scheduling").unwrap();
-
-        let timeline_path = harness.timeline_path(&TIMELINE_ID);
-
        println!("workdir: {}", harness.conf.workdir.display());

+        let storage_impl = GenericRemoteStorage::from_config(&storage_config)?;
+        let client = Arc::new(RemoteTimelineClient {
+            conf: harness.conf,
+            runtime,
+            tenant_id: harness.tenant_id,
+            timeline_id: TIMELINE_ID,
+            storage_impl,
+            upload_queue: Mutex::new(UploadQueue::Uninitialized),
+            metrics: Arc::new(RemoteTimelineClientMetrics::new(
+                &harness.tenant_id,
+                &TIMELINE_ID,
+            )),
+        });
+
        let remote_timeline_dir =
            remote_fs_dir.join(timeline_path.strip_prefix(&harness.conf.workdir)?);
        println!("remote_timeline_dir: {}", remote_timeline_dir.display());
@@ -1293,90 +1216,4 @@ mod tests {

        Ok(())
    }
-
-    #[test]
-    fn bytes_unfinished_gauge_for_layer_file_uploads() -> anyhow::Result<()> {
-        // Setup
-
-        let TestSetup {
-            runtime,
-            harness,
-            client,
-            ..
-        } = TestSetup::new("metrics")?;
-
-        let metadata = dummy_metadata(Lsn(0x10));
-        client.init_upload_queue_for_empty_remote(&metadata)?;
-
-        let timeline_path = harness.timeline_path(&TIMELINE_ID);
-
-        let layer_file_name_1: LayerFileName = "000000000000000000000000000000000000-FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF__00000000016B59D8-00000000016B5A51".parse().unwrap();
-        let content_1 = dummy_contents("foo");
-        std::fs::write(
-            timeline_path.join(layer_file_name_1.file_name()),
-            &content_1,
-        )?;
-
-        #[derive(Debug, PartialEq)]
-        struct BytesStartedFinished {
-            started: Option<usize>,
-            finished: Option<usize>,
-        }
-        let get_bytes_started_stopped = || {
-            let started = client
-                .metrics
-                .get_bytes_started_counter_value(&RemoteOpFileKind::Layer, &RemoteOpKind::Upload)
-                .map(|v| v.try_into().unwrap());
-            let stopped = client
-                .metrics
-                .get_bytes_finished_counter_value(&RemoteOpFileKind::Layer, &RemoteOpKind::Upload)
-                .map(|v| v.try_into().unwrap());
-            BytesStartedFinished {
-                started,
-                finished: stopped,
-            }
-        };
-
-        // Test
-
-        let init = get_bytes_started_stopped();
-
-        client.schedule_layer_file_upload(
-            &layer_file_name_1,
-            &LayerFileMetadata::new(content_1.len() as u64),
-        )?;
-
-        let pre = get_bytes_started_stopped();
-
-        runtime.block_on(client.wait_completion())?;
-
-        let post = get_bytes_started_stopped();
-
-        // Validate
-
-        assert_eq!(
-            init,
-            BytesStartedFinished {
-                started: None,
-                finished: None
-            }
-        );
-        assert_eq!(
-            pre,
-            BytesStartedFinished {
-                started: Some(content_1.len()),
-                // assert that the _finished metric is created eagerly so that subtractions work on first sample
-                finished: Some(0),
-            }
-        );
-        assert_eq!(
-            post,
-            BytesStartedFinished {
-                started: Some(content_1.len()),
-                finished: Some(content_1.len())
-            }
-        );
-
-        Ok(())
-    }
 }
--- a/pageserver/src/tenant/remote_timeline_client/download.rs
+++ b/pageserver/src/tenant/remote_timeline_client/download.rs
@@ -16,7 +16,6 @@ use tracing::{info, warn};

 use crate::config::PageServerConf;
 use crate::tenant::storage_layer::LayerFileName;
-use crate::tenant::timeline::debug_assert_current_span_has_tenant_and_timeline_id;
 use crate::{exponential_backoff, DEFAULT_BASE_BACKOFF_SECONDS, DEFAULT_MAX_BACKOFF_SECONDS};
 use remote_storage::{DownloadError, GenericRemoteStorage};
 use utils::crashsafe::path_with_suffix_extension;
@@ -44,8 +43,6 @@ pub async fn download_layer_file<'a>(
    layer_file_name: &'a LayerFileName,
    layer_metadata: &'a LayerFileMetadata,
 ) -> Result<u64, DownloadError> {
-    debug_assert_current_span_has_tenant_and_timeline_id();
-
    let timeline_path = conf.timeline_path(&timeline_id, &tenant_id);

    let local_path = timeline_path.join(layer_file_name.file_name());
@@ -157,7 +154,7 @@ pub async fn download_layer_file<'a>(
        .with_context(|| format!("Could not fsync layer file {}", local_path.display(),))
        .map_err(DownloadError::Other)?;

-    tracing::debug!("download complete: {}", local_path.display());
+    tracing::info!("download complete: {}", local_path.display());

    Ok(bytes_amount)
 }
--- a/pageserver/src/tenant/remote_timeline_client/upload.rs
+++ b/pageserver/src/tenant/remote_timeline_client/upload.rs
@@ -74,7 +74,7 @@ pub(super) async fn upload_timeline_layer<'a>(
    })?;

    storage
-        .upload(source_file, fs_size, &storage_path, None)
+        .upload(Box::new(source_file), fs_size, &storage_path, None)
        .await
        .with_context(|| {
            format!(
--- a/pageserver/src/tenant/timeline.rs
+++ b/pageserver/src/tenant/timeline.rs
@@ -48,7 +48,7 @@ use crate::tenant::{

 use crate::config::PageServerConf;
 use crate::keyspace::{KeyPartitioning, KeySpace};
-use crate::metrics::{TimelineMetrics, UNEXPECTED_ONDEMAND_DOWNLOADS};
+use crate::metrics::TimelineMetrics;
 use crate::pgdatadir_mapping::LsnForTimestamp;
 use crate::pgdatadir_mapping::{is_rel_fsm_block_key, is_rel_vm_block_key};
 use crate::pgdatadir_mapping::{BlockNumber, CalculateLogicalSizeError};
@@ -77,7 +77,6 @@ pub(super) use self::eviction_task::EvictionTaskTenantState;
 use self::eviction_task::EvictionTaskTimelineState;
 use self::walreceiver::{WalReceiver, WalReceiverConf};

-use super::config::TenantConf;
 use super::layer_map::BatchedUpdates;
 use super::remote_timeline_client::index::IndexPart;
 use super::remote_timeline_client::RemoteTimelineClient;
@@ -146,7 +145,7 @@ pub struct Timeline {
    // 'last_record_lsn.load().prev'. It's used to set the xl_prev pointer of the
    // first WAL record when the node is started up. But here, we just
    // keep track of it.
-    last_record_lsn: SeqWait<RecordLsn, Lsn>,
+    last_record_lsn: SeqWait<RecordLsn, Lsn, ()>,

    // All WAL records have been processed and stored durably on files on
    // local disk, up to this LSN. On crash and restart, we need to re-process
@@ -162,7 +161,7 @@ pub struct Timeline {
    ancestor_timeline: Option<Arc<Timeline>>,
    ancestor_lsn: Lsn,

-    pub(super) metrics: TimelineMetrics,
+    metrics: TimelineMetrics,

    /// Ensures layers aren't frozen by checkpointer between
    /// [`Timeline::get_layer_for_write`] and layer reads.
@@ -936,7 +935,6 @@ impl Timeline {
        }
    }

-    #[instrument(skip_all, fields(tenant = %self.tenant_id, timeline = %self.timeline_id))]
    pub async fn download_layer(&self, layer_file_name: &str) -> anyhow::Result<Option<bool>> {
        let Some(layer) = self.find_layer(layer_file_name) else { return Ok(None) };
        let Some(remote_layer) = layer.downcast_remote_layer() else { return  Ok(Some(false)) };
@@ -1138,8 +1136,6 @@ impl Timeline {
                if let Some(delta) = local_layer_residence_duration {
                    self.metrics
                        .evictions_with_low_residence_duration
-                        .read()
-                        .unwrap()
                        .observe(delta);
                    info!(layer=%local_layer.short_id(), residence_millis=delta.as_millis(), "evicted layer after known residence period");
                } else {
@@ -1213,35 +1209,6 @@ impl Timeline {
            .unwrap_or(self.conf.default_tenant_conf.eviction_policy)
    }

-    fn get_evictions_low_residence_duration_metric_threshold(
-        tenant_conf: &TenantConfOpt,
-        default_tenant_conf: &TenantConf,
-    ) -> Duration {
-        tenant_conf
-            .evictions_low_residence_duration_metric_threshold
-            .unwrap_or(default_tenant_conf.evictions_low_residence_duration_metric_threshold)
-    }
-
-    pub(super) fn tenant_conf_updated(&self) {
-        // NB: Most tenant conf options are read by background loops, so,
-        // changes will automatically be picked up.
-
-        // The threshold is embedded in the metric. So, we need to update it.
-        {
-            let new_threshold = Self::get_evictions_low_residence_duration_metric_threshold(
-                &self.tenant_conf.read().unwrap(),
-                &self.conf.default_tenant_conf,
-            );
-            let tenant_id_str = self.tenant_id.to_string();
-            let timeline_id_str = self.timeline_id.to_string();
-            self.metrics
-                .evictions_with_low_residence_duration
-                .write()
-                .unwrap()
-                .change_threshold(&tenant_id_str, &timeline_id_str, new_threshold);
-        }
-    }
-
    /// Open a Timeline handle.
    ///
    /// Loads the metadata for the timeline into memory, but not the layer map.
@@ -1273,11 +1240,6 @@ impl Timeline {
        let max_lsn_wal_lag = tenant_conf_guard
            .max_lsn_wal_lag
            .unwrap_or(conf.default_tenant_conf.max_lsn_wal_lag);
-        let evictions_low_residence_duration_metric_threshold =
-            Self::get_evictions_low_residence_duration_metric_threshold(
-                &tenant_conf_guard,
-                &conf.default_tenant_conf,
-            );
        drop(tenant_conf_guard);

        Arc::new_cyclic(|myself| {
@@ -1308,10 +1270,13 @@ impl Timeline {
                remote_client: remote_client.map(Arc::new),

                // initialize in-memory 'last_record_lsn' from 'disk_consistent_lsn'.
-                last_record_lsn: SeqWait::new(RecordLsn {
-                    last: disk_consistent_lsn,
-                    prev: metadata.prev_record_lsn().unwrap_or(Lsn(0)),
-                }),
+                last_record_lsn: SeqWait::new(
+                    RecordLsn {
+                        last: disk_consistent_lsn,
+                        prev: metadata.prev_record_lsn().unwrap_or(Lsn(0)),
+                    },
+                    (),
+                ),
                disk_consistent_lsn: AtomicLsn::new(disk_consistent_lsn.0),

                last_freeze_at: AtomicLsn::new(disk_consistent_lsn.0),
@@ -1325,7 +1290,7 @@ impl Timeline {
                    &timeline_id,
                    crate::metrics::EvictionsWithLowResidenceDurationBuilder::new(
                        "mtime",
-                        evictions_low_residence_duration_metric_threshold,
+                        conf.evictions_low_residence_duration_metric_threshold,
                    ),
                ),

@@ -1484,7 +1449,7 @@ impl Timeline {

                trace!("found layer {}", layer.path().display());
                total_physical_size += file_size;
-                updates.insert_historic(Arc::new(layer));
+                updates.insert_historic(Arc::new(layer))?;
                num_layers += 1;
            } else if let Some(deltafilename) = DeltaFileName::parse_str(&fname) {
                // Create a DeltaLayer struct for each delta file.
@@ -1516,7 +1481,7 @@ impl Timeline {

                trace!("found layer {}", layer.path().display());
                total_physical_size += file_size;
-                updates.insert_historic(Arc::new(layer));
+                updates.insert_historic(Arc::new(layer))?;
                num_layers += 1;
            } else if fname == METADATA_FILE_NAME || fname.ends_with(".old") {
                // ignore these
@@ -1590,7 +1555,7 @@ impl Timeline {
            // remote index file?
            // If so, rename_to_backup those files & replace their local layer with
            // a RemoteLayer in the layer map so that we re-download them on-demand.
-            if let Some(local_layer) = local_layer {
+            if let Some(local_layer) = &local_layer {
                let local_layer_path = local_layer
                    .local_path()
                    .expect("caller must ensure that local_layers only contains local layers");
@@ -1615,7 +1580,6 @@ impl Timeline {
                        anyhow::bail!("could not rename file {local_layer_path:?}: {err:?}");
                    } else {
                        self.metrics.resident_physical_size_gauge.sub(local_size);
-                        updates.remove_historic(local_layer);
                        // fall-through to adding the remote layer
                    }
                } else {
@@ -1651,7 +1615,11 @@ impl Timeline {
                    );
                    let remote_layer = Arc::new(remote_layer);

-                    updates.insert_historic(remote_layer);
+                    if let Some(local_layer) = &local_layer {
+                        updates.replace_historic(local_layer, remote_layer)?;
+                    } else {
+                        updates.insert_historic(remote_layer)?;
+                    }
                }
                LayerFileName::Delta(deltafilename) => {
                    // Create a RemoteLayer for the delta file.
@@ -1675,7 +1643,11 @@ impl Timeline {
                        LayerAccessStats::for_loading_layer(LayerResidenceStatus::Evicted),
                    );
                    let remote_layer = Arc::new(remote_layer);
-                    updates.insert_historic(remote_layer);
+                    if let Some(local_layer) = &local_layer {
+                        updates.replace_historic(local_layer, remote_layer)?;
+                    } else {
+                        updates.insert_historic(remote_layer)?;
+                    }
                }
            }
        }
@@ -2349,7 +2321,6 @@ impl Timeline {
                            id,
                            ctx.task_kind()
                        );
-                        UNEXPECTED_ONDEMAND_DOWNLOADS.inc();
                        timeline.download_remote_layer(remote_layer).await?;
                        continue 'layer_map_search;
                    }
@@ -2452,7 +2423,7 @@ impl Timeline {
        assert!(new_lsn.is_aligned());

        self.metrics.last_record_gauge.set(new_lsn.0 as i64);
-        self.last_record_lsn.advance(new_lsn);
+        self.last_record_lsn.advance(new_lsn, None);
    }

    fn freeze_inmem_layer(&self, write_lock_held: bool) {
@@ -2723,7 +2694,7 @@ impl Timeline {
            .write()
            .unwrap()
            .batch_update()
-            .insert_historic(Arc::new(new_delta));
+            .insert_historic(Arc::new(new_delta))?;

        // update the timeline's physical size
        let sz = new_delta_path.metadata()?.len();
@@ -2928,7 +2899,7 @@ impl Timeline {
            self.metrics
                .resident_physical_size_gauge
                .add(metadata.len());
-            updates.insert_historic(Arc::new(l));
+            updates.insert_historic(Arc::new(l))?;
        }
        updates.flush();
        drop(layers);
@@ -3361,7 +3332,7 @@ impl Timeline {

            new_layer_paths.insert(new_delta_path, LayerFileMetadata::new(metadata.len()));
            let x: Arc<dyn PersistentLayer + 'static> = Arc::new(l);
-            updates.insert_historic(x);
+            updates.insert_historic(x)?;
        }

        // Now that we have reshuffled the data to set of new delta layers, we can
@@ -3813,13 +3784,11 @@ impl Timeline {
    /// If the caller has a deadline or needs a timeout, they can simply stop polling:
    /// we're **cancellation-safe** because the download happens in a separate task_mgr task.
    /// So, the current download attempt will run to completion even if we stop polling.
-    #[instrument(skip_all, fields(layer=%remote_layer.short_id()))]
+    #[instrument(skip_all, fields(tenant_id=%self.tenant_id, timeline_id=%self.timeline_id, layer=%remote_layer.short_id()))]
    pub async fn download_remote_layer(
        &self,
        remote_layer: Arc<RemoteLayer>,
    ) -> anyhow::Result<()> {
-        debug_assert_current_span_has_tenant_and_timeline_id();
-
        use std::sync::atomic::Ordering::Relaxed;

        let permit = match Arc::clone(&remote_layer.ongoing_download)
@@ -3863,8 +3832,6 @@ impl Timeline {
                    .await;

                if let Ok(size) = &result {
-                    info!("layer file download finished");
-
                    // XXX the temp file is still around in Err() case
                    // and consumes space until we clean up upon pageserver restart.
                    self_clone.metrics.resident_physical_size_gauge.add(*size);
@@ -3936,8 +3903,6 @@ impl Timeline {
                    updates.flush();
                    drop(layers);

-                    info!("on-demand download successful");
-
                    // Now that we've inserted the download into the layer map,
                    // close the semaphore. This will make other waiters for
                    // this download return Ok(()).
@@ -3945,7 +3910,7 @@ impl Timeline {
                    remote_layer.ongoing_download.close();
                } else {
                    // Keep semaphore open. We'll drop the permit at the end of the function.
-                    error!("layer file download failed: {:?}", result.as_ref().unwrap_err());
+                    error!("on-demand download failed: {:?}", result.as_ref().unwrap_err());
                }

                // Don't treat it as an error if the task that triggered the download
@@ -4256,36 +4221,3 @@ fn rename_to_backup(path: &Path) -> anyhow::Result<()> {

    bail!("couldn't find an unused backup number for {:?}", path)
 }
-
-#[cfg(not(debug_assertions))]
-#[inline]
-pub(crate) fn debug_assert_current_span_has_tenant_and_timeline_id() {}
-
-#[cfg(debug_assertions)]
-#[inline]
-pub(crate) fn debug_assert_current_span_has_tenant_and_timeline_id() {
-    use utils::tracing_span_assert;
-
-    pub static TENANT_ID_EXTRACTOR: once_cell::sync::Lazy<
-        tracing_span_assert::MultiNameExtractor<2>,
-    > = once_cell::sync::Lazy::new(|| {
-        tracing_span_assert::MultiNameExtractor::new("TenantId", ["tenant_id", "tenant"])
-    });
-
-    pub static TIMELINE_ID_EXTRACTOR: once_cell::sync::Lazy<
-        tracing_span_assert::MultiNameExtractor<2>,
-    > = once_cell::sync::Lazy::new(|| {
-        tracing_span_assert::MultiNameExtractor::new("TimelineId", ["timeline_id", "timeline"])
-    });
-
-    match tracing_span_assert::check_fields_present([
-        &*TENANT_ID_EXTRACTOR,
-        &*TIMELINE_ID_EXTRACTOR,
-    ]) {
-        Ok(()) => (),
-        Err(missing) => panic!(
-            "missing extractors: {:?}",
-            missing.into_iter().map(|e| e.name()).collect::<Vec<_>>()
-        ),
-    }
-}
--- a/pageserver/src/tenant/timeline/walreceiver/connection_manager.rs
+++ b/pageserver/src/tenant/timeline/walreceiver/connection_manager.rs
@@ -348,7 +348,7 @@ impl ConnectionManagerState {
                .context("walreceiver connection handling failure")
            }
            .instrument(
-                info_span!("walreceiver_connection", tenant_id = %id.tenant_id, timeline_id = %id.timeline_id, node_id = %new_sk.safekeeper_id),
+                info_span!("walreceiver_connection", id = %id, node_id = %new_sk.safekeeper_id),
            )
        });

--- a/pageserver/src/tenant/timeline/walreceiver/walreceiver_connection.rs
+++ b/pageserver/src/tenant/timeline/walreceiver/walreceiver_connection.rs
@@ -37,8 +37,8 @@ use crate::{
 use postgres_backend::is_expected_io_error;
 use postgres_connection::PgConnectionConfig;
 use postgres_ffi::waldecoder::WalStreamDecoder;
+use pq_proto::PageserverFeedback;
 use utils::lsn::Lsn;
-use utils::pageserver_feedback::PageserverFeedback;

 /// Status of the connection.
 #[derive(Debug, Clone, Copy)]
@@ -319,12 +319,12 @@ pub(super) async fn handle_walreceiver_connection(
                timeline.get_remote_consistent_lsn().unwrap_or(Lsn(0));

            // The last LSN we processed. It is not guaranteed to survive pageserver crash.
-            let last_received_lsn = last_lsn;
+            let last_received_lsn = u64::from(last_lsn);
            // `disk_consistent_lsn` is the LSN at which page server guarantees local persistence of all received data
-            let disk_consistent_lsn = timeline.get_disk_consistent_lsn();
+            let disk_consistent_lsn = u64::from(timeline.get_disk_consistent_lsn());
            // The last LSN that is synced to remote storage and is guaranteed to survive pageserver crash
            // Used by safekeepers to remove WAL preceding `remote_consistent_lsn`.
-            let remote_consistent_lsn = timeline_remote_consistent_lsn;
+            let remote_consistent_lsn = u64::from(timeline_remote_consistent_lsn);
            let ts = SystemTime::now();

            // Update the status about what we just received. This is shown in the mgmt API.
--- a/pgxn/neon/file_cache.c
+++ b/pgxn/neon/file_cache.c
@@ -96,8 +96,6 @@ static shmem_request_hook_type prev_shmem_request_hook;
 #endif
 static int   lfc_shrinking_factor; /* power of two by which local cache size will be shrinked when lfc_free_space_watermark is reached */

-void FileCacheMonitorMain(Datum main_arg);
-
 static void
 lfc_shmem_startup(void)
 {
@@ -372,73 +370,6 @@ lfc_cache_contains(RelFileNode rnode, ForkNumber forkNum, BlockNumber blkno)
 	return found;
 }

-/*
- * Evict a page (if present) from the local file cache
- */
-void
-lfc_evict(RelFileNode rnode, ForkNumber forkNum, BlockNumber blkno)
-{
-	BufferTag tag;
-	FileCacheEntry* entry;
-	bool found;
-	int chunk_offs = blkno & (BLOCKS_PER_CHUNK-1);
-	uint32 hash;
-
-	if (lfc_size_limit == 0) /* fast exit if file cache is disabled */
-		return;
-
-	INIT_BUFFERTAG(tag, rnode, forkNum, (blkno & ~(BLOCKS_PER_CHUNK-1)));
-
-	hash = get_hash_value(lfc_hash, &tag);
-
-	LWLockAcquire(lfc_lock, LW_EXCLUSIVE);
-	entry = hash_search_with_hash_value(lfc_hash, &tag, hash, HASH_FIND, &found);
-
-	if (!found)
-	{
-		/* nothing to do */
-		LWLockRelease(lfc_lock);
-		return;
-	}
-
-	/* remove the page from the cache */
-	entry->bitmap[chunk_offs >> 5] &= ~(1 << (chunk_offs & (32 - 1)));
-
-	/*
-	 * If the chunk has no live entries, we can position the chunk to be
-	 * recycled first.
-	 */
-	if (entry->bitmap[chunk_offs >> 5] == 0)
-	{
-		bool has_remaining_pages;
-
-		for (int i = 0; i < (BLOCKS_PER_CHUNK / 32); i++) {
-			if (entry->bitmap[i] != 0)
-			{
-				has_remaining_pages = true;
-				break;
-			}
-		}
-
-		/*
-		 * Put the entry at the position that is first to be reclaimed when
-		 * we have no cached pages remaining in the chunk
-		 */
-		if (!has_remaining_pages)
-		{
-			dlist_delete(&entry->lru_node);
-			dlist_push_head(&lfc_ctl->lru, &entry->lru_node);
-		}
-	}
-
-	/*
-	 * Done: apart from empty chunks, we don't move chunks in the LRU when
-	 * they're empty because eviction isn't usage.
-	 */
-
-	LWLockRelease(lfc_lock);
-}
-
 /*
 * Try to read page from local cache.
 * Returns true if page is found in local cache.
@@ -597,6 +528,7 @@ lfc_write(RelFileNode rnode, ForkNumber forkNum, BlockNumber blkno,
 	LWLockRelease(lfc_lock);
 }

+
 /*
 * Record structure holding the to be exposed cache data.
 */
--- a/pgxn/neon/libpagestore.c
+++ b/pgxn/neon/libpagestore.c
@@ -17,8 +17,6 @@
 #include "pagestore_client.h"
 #include "fmgr.h"
 #include "access/xlog.h"
-#include "access/xlogutils.h"
-#include "storage/buf_internals.h"

 #include "libpq-fe.h"
 #include "libpq/pqformat.h"
@@ -59,8 +57,6 @@ int			n_unflushed_requests = 0;
 int			flush_every_n_requests = 8;
 int			readahead_buffer_size = 128;

-bool	(*old_redo_read_buffer_filter) (XLogReaderState *record, uint8 block_id) = NULL;
-
 static void pageserver_flush(void);

 static bool
@@ -471,8 +467,6 @@ pg_init_libpagestore(void)
 		smgr_hook = smgr_neon;
 		smgr_init_hook = smgr_init_neon;
 		dbsize_hook = neon_dbsize;
-		old_redo_read_buffer_filter = redo_read_buffer_filter;
-		redo_read_buffer_filter = neon_redo_read_buffer_filter;
 	}
 	lfc_init();
 }
Author	SHA1	Message	Date
Christian Schwarz	81d715b187	another rename	2023-06-07 16:27:53 +02:00
Christian Schwarz	0afd20068b	title	2023-06-07 16:27:53 +02:00
Christian Schwarz	f3d7bf9e09	rename the crate and fix some compile errors that I missed a while back	2023-06-07 16:27:53 +02:00
Christian Schwarz	748c06cff8	docs improvements	2023-06-07 16:27:53 +02:00
Christian Schwarz	0d82862d55	generic GetReconstructPathError	2023-06-07 16:27:53 +02:00
Christian Schwarz	f2bd71d0a8	more docs	2023-06-07 16:27:53 +02:00
Christian Schwarz	de9521214d	all-in on immutable + better crate comment	2023-06-07 16:27:53 +02:00
Christian Schwarz	8d9207040f	WIP doc comments (more aspirational than reality)	2023-06-07 16:27:53 +02:00
Christian Schwarz	8e57d95026	completely immutable variant of the design	2023-06-07 16:27:53 +02:00
Christian Schwarz	6c71fc6646	struct type for in-memory layer put failure	2023-06-07 16:27:53 +02:00
Christian Schwarz	e6a36b5236	unused PutError accessors	2023-06-07 16:27:53 +02:00
Christian Schwarz	56f57172dd	move code around to prep for alternative impl	2023-06-07 16:27:53 +02:00
Christian Schwarz	74ad719ede	fix most unused variable warnings	2023-06-07 16:27:53 +02:00
Christian Schwarz	5570384672	WIP	2023-06-07 16:27:53 +02:00
Christian Schwarz	321e74b5ee	implement support for non-blocking SeqWait::wait_for_timeout and use it	2023-06-07 16:27:53 +02:00
Christian Schwarz	712a516a2f	switch to a `Types` trait to declutter the generics	2023-06-07 16:27:53 +02:00
Christian Schwarz	be5ba04dca	rename trait to HistoricStuff	2023-06-07 16:27:53 +02:00
Christian Schwarz	b4b1292e15	WIP version of fully generic layer map built atop the new seqwait	2023-06-07 16:27:53 +02:00
Christian Schwarz	9b992c621d	seqwait: handle wait_for_timeout() with a zero duration efficiently	2023-06-07 16:27:49 +02:00
Christian Schwarz	1fa17ed486	seqwait: add support for communicating immutable data between advance and wait The long-term plan is to make LayerMap an immutable data structure that is multi-versioned. Meaning, we will not modify LayerMap in place but create a (cheap) copy and modify that copy. Once we're done making modifications, we make the copy available to readers through the SeqWait. The modifications will be made by a _single_ task, the pageserver actor. But _many_ readers can wait_for & use same or multiple versions of the LayerMap. So, there's a new method `split_spmc` that splits up a `SeqWait` into a not-clonable producer (Advance) and a clonable consumer (Wait). (SeqWait itself is mpmc, but, for the actor architecture, it makes sense to enforce spmc in the type system) # Please enter the commit message for your changes. Lines starting	2023-06-07 16:26:48 +02:00
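
The last commit message above describes splitting a `SeqWait` into a non-clonable producer (`Advance`) and a clonable consumer (`Wait`), with immutable data handed from `advance` to the waiters. The following is a minimal sketch of that idea only, built on `std::sync::Mutex` and `Condvar`. It is not the code from `utils::seqwait`: the free function `split_spmc`, the struct layout, and the method signatures here are assumptions made for illustration, and the real API in this diff differs (for example, `SeqWait::new` takes an extra `()` argument and `advance` takes an extra `None`).

```rust
use std::sync::{Arc, Condvar, Mutex};
use std::thread;

/// Shared state: the latest sequence number plus the immutable snapshot
/// that was published together with it. (Hypothetical layout.)
struct Shared<T> {
    state: Mutex<(u64, Arc<T>)>,
    cond: Condvar,
}

/// Producer half. Deliberately does NOT implement `Clone`, so the type
/// system enforces a single producer.
pub struct Advance<T> {
    shared: Arc<Shared<T>>,
}

/// Consumer half. Cloneable, so many readers can wait concurrently.
pub struct Wait<T> {
    shared: Arc<Shared<T>>,
}

impl<T> Clone for Wait<T> {
    fn clone(&self) -> Self {
        Wait { shared: Arc::clone(&self.shared) }
    }
}

/// Create a connected producer/consumer pair (hypothetical constructor).
pub fn split_spmc<T>(start: u64, initial: T) -> (Advance<T>, Wait<T>) {
    let shared = Arc::new(Shared {
        state: Mutex::new((start, Arc::new(initial))),
        cond: Condvar::new(),
    });
    (Advance { shared: Arc::clone(&shared) }, Wait { shared })
}

impl<T> Advance<T> {
    /// Publish a new snapshot at sequence number `num` and wake all waiters.
    /// A `num` that does not move forward is ignored.
    pub fn advance(&self, num: u64, data: T) {
        let mut guard = self.shared.state.lock().unwrap();
        if num > guard.0 {
            *guard = (num, Arc::new(data));
            self.shared.cond.notify_all();
        }
    }
}

impl<T> Wait<T> {
    /// Block until the sequence number reaches at least `num`, then return
    /// the current number and a cheap handle to the snapshot published with it.
    pub fn wait_for(&self, num: u64) -> (u64, Arc<T>) {
        let guard = self.shared.state.lock().unwrap();
        let guard = self.shared.cond.wait_while(guard, |s| s.0 < num).unwrap();
        (guard.0, Arc::clone(&guard.1))
    }
}

fn main() {
    let (producer, consumer) = split_spmc(0, vec!["base"]);

    // Several readers wait for version 2 of the snapshot.
    let readers: Vec<_> = (0..3)
        .map(|_| {
            let c = consumer.clone();
            thread::spawn(move || c.wait_for(2))
        })
        .collect();

    // The single producer publishes modified copies, never editing in place.
    producer.advance(1, vec!["base", "delta-1"]);
    producer.advance(2, vec!["base", "delta-1", "delta-2"]);

    for r in readers {
        let (seq, snapshot) = r.join().unwrap();
        println!("reader saw version {seq} with {} layers", snapshot.len());
    }
}
```

Because readers only ever receive an `Arc` to a snapshot that is never mutated, a reader holding an older version is unaffected by later calls to `advance`. That is the property the commit message relies on for a multi-versioned layer map.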