[neon/azure] impr: push directly into ACR

As we observed [^1], a mistake with the compute image (trying to use a nonexistent one) causes cplane to schedule too many pods for the pool. Those pods cannot pull the image because it does not exist, so they reach out to Docker Hub too often, which results in our token being rate-limited. So we need to push the images directly into ACR instead of using the pull-through cache.

[^1]: https://neondb.slack.com/archives/C06SJG60FRB/p1721749525396229
review:
2026-07-04 12:40:37 +00:00 · 2024-07-24 17:44:49 +03:00 · 2024-07-22 08:36:46 +03:00 · 2024-07-19 18:12:48 +03:00 · 2024-07-19 16:43:29 +03:00 · 2024-07-19 14:13:49 +03:00
17 changed files with 280 additions and 203 deletions
--- a/.github/workflows/build_and_test.yml
+++ b/.github/workflows/build_and_test.yml
@@ -783,6 +783,10 @@ jobs:

  neon-image:
    needs: [ neon-image-arch, tag ]
+    permissions: # This is for Azure login to work.
+      id-token: write
+      contents: read
+    environment: dev
    runs-on: ubuntu-22.04

    steps:
@@ -808,6 +812,18 @@ jobs:
          docker buildx imagetools create -t 369495373322.dkr.ecr.eu-central-1.amazonaws.com/neon:${{ needs.tag.outputs.build-tag }} \
                                                                                neondatabase/neon:${{ needs.tag.outputs.build-tag }}

+      - name: Azure login
+        uses: azure/login@6c251865b4e6290e7b78be643ea2d005bc51f69a  # @v2.1.1
+        with:
+          client-id: ${{ secrets.AZURE_CLIENT_ID }}
+          tenant-id: ${{ secrets.AZURE_TENANT_ID }}
+          subscription-id: ${{ secrets.AZURE_SUBSCRIPTION_ID }}
+
+      - name: Copy docker images to ACR-dev
+        run: |
+          docker buildx imagetools create -t neoneastus2.azurecr.io/neondatabase/neon:${{ needs.tag.outputs.build-tag }} \
+                                             neondatabase/neon:${{ needs.tag.outputs.build-tag }}
+
  compute-node-image-arch:
    needs: [ check-permissions, build-build-tools-image, tag ]
    strategy:
@@ -913,6 +929,10 @@ jobs:
          rm -rf .docker-custom

  compute-node-image:
+    permissions: # This is for Azure login to work.
+      id-token: write
+      contents: read
+    environment: dev
    needs: [ compute-node-image-arch, tag ]
    runs-on: ubuntu-22.04

@@ -963,6 +983,24 @@ jobs:
          docker buildx imagetools create -t 369495373322.dkr.ecr.eu-central-1.amazonaws.com/compute-tools:${{ needs.tag.outputs.build-tag }} \
                                                                                neondatabase/compute-tools:${{ needs.tag.outputs.build-tag }}

+      - name: Azure login
+        uses: azure/login@6c251865b4e6290e7b78be643ea2d005bc51f69a  # @v2.1.1
+        with:
+          client-id: ${{ secrets.AZURE_CLIENT_ID }}
+          tenant-id: ${{ secrets.AZURE_TENANT_ID }}
+          subscription-id: ${{ secrets.AZURE_SUBSCRIPTION_ID }}
+
+      - name: Push multi-arch compute-node-${{ matrix.version }} image to ACR
+        run: |
+          docker buildx imagetools create -t neoneastus2.azurecr.io/neondatabase/compute-node-${{ matrix.version }}:${{ needs.tag.outputs.build-tag }} \
+                                                                                neondatabase/compute-node-${{ matrix.version }}:${{ needs.tag.outputs.build-tag }}
+
+      - name: Push multi-arch compute-tools image to ACR
+        if: matrix.version == 'v16'
+        run: |
+          docker buildx imagetools create -t neoneastus2.azurecr.io/neondatabase/compute-tools:${{ needs.tag.outputs.build-tag }} \
+                                                                                neondatabase/compute-tools:${{ needs.tag.outputs.build-tag }}
+
  vm-compute-node-image:
    needs: [ check-permissions, tag, compute-node-image ]
    runs-on: [ self-hosted, gen3, large ]
@@ -1085,6 +1123,10 @@ jobs:
          rm -rf .docker-custom

  promote-images:
+    permissions: # This is for Azure login to work.
+      id-token: write
+      contents: read
+    environment: dev
    needs: [ check-permissions, tag, test-images, vm-compute-node-image ]
    runs-on: ubuntu-22.04

@@ -1111,6 +1153,20 @@ jobs:
                                               neondatabase/vm-compute-node-${version}:${{ needs.tag.outputs.build-tag }}
          done

+      - name: Azure login
+        uses: azure/login@6c251865b4e6290e7b78be643ea2d005bc51f69a  # @v2.1.1
+        with:
+          client-id: ${{ secrets.AZURE_CLIENT_ID }}
+          tenant-id: ${{ secrets.AZURE_TENANT_ID }}
+          subscription-id: ${{ secrets.AZURE_SUBSCRIPTION_ID }}
+
+      - name: Copy docker images to ACR-dev
+        run: |
+          for version in ${VERSIONS}; do
+            docker buildx imagetools create -t neoneastus2.azurecr.io/neondatabase/vm-compute-node-${version}:${{ needs.tag.outputs.build-tag }} \
+                                               neondatabase/vm-compute-node-${version}:${{ needs.tag.outputs.build-tag }}
+          done
+
      - name: Add latest tag to images
        if: github.ref_name == 'main'
        run: |
--- a/Cargo.lock
+++ b/Cargo.lock
@@ -1368,6 +1368,7 @@ dependencies = [
 "tracing",
 "url",
 "utils",
+ "whoami",
 "workspace_hack",
 ]

@@ -4603,6 +4604,15 @@ dependencies = [
 "bitflags 1.3.2",
 ]

+[[package]]
+name = "redox_syscall"
+version = "0.4.1"
+source = "registry+https://github.com/rust-lang/crates.io-index"
+checksum = "4722d768eff46b75989dd134e5c353f0d6296e5aaa3132e776cbdb56be7731aa"
+dependencies = [
+ "bitflags 1.3.2",
+]
+
 [[package]]
 name = "regex"
 version = "1.10.2"
@@ -6972,6 +6982,12 @@ version = "0.11.0+wasi-snapshot-preview1"
 source = "registry+https://github.com/rust-lang/crates.io-index"
 checksum = "9c8d87e72b64a3b4db28d11ce29237c246188f4f51057d65a7eab63b7987e423"

+[[package]]
+name = "wasite"
+version = "0.1.0"
+source = "registry+https://github.com/rust-lang/crates.io-index"
+checksum = "b8dad83b4f25e74f184f64c43b150b91efe7647395b42289f38e50566d82855b"
+
 [[package]]
 name = "wasm-bindgen"
 version = "0.2.92"
@@ -7124,6 +7140,17 @@ dependencies = [
 "once_cell",
 ]

+[[package]]
+name = "whoami"
+version = "1.5.1"
+source = "registry+https://github.com/rust-lang/crates.io-index"
+checksum = "a44ab49fad634e88f55bf8f9bb3abd2f27d7204172a112c7c9987e01c1c94ea9"
+dependencies = [
+ "redox_syscall 0.4.1",
+ "wasite",
+ "web-sys",
+]
+
 [[package]]
 name = "winapi"
 version = "0.3.9"
--- a/Cargo.toml
+++ b/Cargo.toml
@@ -191,6 +191,7 @@ uuid = { version = "1.6.1", features = ["v4", "v7", "serde"] }
 walkdir = "2.3.2"
 rustls-native-certs = "0.7"
 x509-parser = "0.15"
+whoami = "1.5.1"

 ## TODO replace this with tracing
 env_logger = "0.10"
--- a/control_plane/Cargo.toml
+++ b/control_plane/Cargo.toml
@@ -40,6 +40,7 @@ safekeeper_api.workspace = true
 postgres_connection.workspace = true
 storage_broker.workspace = true
 utils.workspace = true
+whoami.workspace = true

 compute_api.workspace = true
 workspace_hack.workspace = true
--- a/control_plane/src/broker.rs
+++ b/control_plane/src/broker.rs
@@ -1,9 +1,9 @@
 //! Code to manage the storage broker
 //!
-//! In the local test environment, the data for each safekeeper is stored in
+//! In the local test environment, the storage broker stores its data directly in
 //!
 //! ```text
-//!   .neon/safekeepers/<safekeeper id>
+//!   .neon
 //! ```
 use std::time::Duration;

--- a/control_plane/src/pageserver.rs
+++ b/control_plane/src/pageserver.rs
@@ -1,8 +1,10 @@
 //! Code to manage pageservers
 //!
-//! In the local test environment, the pageserver stores its data directly in
+//! In the local test environment, the data for each pageserver is stored in
 //!
-//!   .neon/
+//! ```text
+//!   .neon/pageserver_<pageserver_id>
+//! ```
 //!
 use std::collections::HashMap;

--- a/control_plane/src/storage_controller.rs
+++ b/control_plane/src/storage_controller.rs
@@ -29,7 +29,6 @@ use utils::{
 pub struct StorageController {
    env: LocalEnv,
    listen: String,
-    path: Utf8PathBuf,
    private_key: Option<Vec<u8>>,
    public_key: Option<String>,
    postgres_port: u16,
@@ -41,6 +40,8 @@ const COMMAND: &str = "storage_controller";

 const STORAGE_CONTROLLER_POSTGRES_VERSION: u32 = 16;

+const DB_NAME: &str = "storage_controller";
+
 #[derive(Serialize, Deserialize)]
 pub struct AttachHookRequest {
    pub tenant_shard_id: TenantShardId,
@@ -65,10 +66,6 @@ pub struct InspectResponse {

 impl StorageController {
    pub fn from_env(env: &LocalEnv) -> Self {
-        let path = Utf8PathBuf::from_path_buf(env.base_data_dir.clone())
-            .unwrap()
-            .join("attachments.json");
-
        // Makes no sense to construct this if pageservers aren't going to use it: assume
        // pageservers have control plane API set
        let listen_url = env.control_plane_api.clone().unwrap();
@@ -128,7 +125,6 @@ impl StorageController {

        Self {
            env: env.clone(),
-            path,
            listen,
            private_key,
            public_key,
@@ -203,7 +199,6 @@ impl StorageController {
    ///
    /// Returns the database url
    pub async fn setup_database(&self) -> anyhow::Result<String> {
-        const DB_NAME: &str = "storage_controller";
        let database_url = format!("postgresql://localhost:{}/{DB_NAME}", self.postgres_port);

        let pg_bin_dir = self.get_pg_bin_dir().await?;
@@ -232,6 +227,30 @@ impl StorageController {
        Ok(database_url)
    }

+    pub async fn connect_to_database(
+        &self,
+    ) -> anyhow::Result<(
+        tokio_postgres::Client,
+        tokio_postgres::Connection<tokio_postgres::Socket, tokio_postgres::tls::NoTlsStream>,
+    )> {
+        tokio_postgres::Config::new()
+            .host("localhost")
+            .port(self.postgres_port)
+            // The user is the ambient operating system user name.
+            // That is an impurity which we want to fix in => TODO https://github.com/neondatabase/neon/issues/8400
+            //
+            // Until we get there, use the ambient operating system user name.
+            // Recent tokio-postgres versions default to this if the user isn't specified.
+            // But tokio-postgres fork doesn't have this upstream commit:
+            // https://github.com/sfackler/rust-postgres/commit/cb609be758f3fb5af537f04b584a2ee0cebd5e79
+            // => we should rebase our fork => TODO https://github.com/neondatabase/neon/issues/8399
+            .user(&whoami::username())
+            .dbname(DB_NAME)
+            .connect(tokio_postgres::NoTls)
+            .await
+            .map_err(anyhow::Error::new)
+    }
+
    pub async fn start(&self, retry_timeout: &Duration) -> anyhow::Result<()> {
        // Start a vanilla Postgres process used by the storage controller for persistence.
        let pg_data_path = Utf8PathBuf::from_path_buf(self.env.base_data_dir.clone())
@@ -256,18 +275,21 @@ impl StorageController {
            if !status.success() {
                anyhow::bail!("initdb failed with status {status}");
            }
-
-            // Write a minimal config file:
-            // - Specify the port, since this is chosen dynamically
-            // - Switch off fsync, since we're running on lightweight test environments and when e.g. scale testing
-            //   the storage controller we don't want a slow local disk to interfere with that.
-            tokio::fs::write(
-                &pg_data_path.join("postgresql.conf"),
-                format!("port = {}\nfsync=off\n", self.postgres_port),
-            )
-            .await?;
        };

+        // Write a minimal config file:
+        // - Specify the port, since this is chosen dynamically
+        // - Switch off fsync, since we're running on lightweight test environments and when e.g. scale testing
+        //   the storage controller we don't want a slow local disk to interfere with that.
+        //
+        // NB: it's important that we rewrite this file on each start command so we propagate changes
+        // from `LocalEnv`'s config file (`.neon/config`).
+        tokio::fs::write(
+            &pg_data_path.join("postgresql.conf"),
+            format!("port = {}\nfsync=off\n", self.postgres_port),
+        )
+        .await?;
+
        println!("Starting storage controller database...");
        let db_start_args = [
            "-w",
@@ -296,11 +318,38 @@ impl StorageController {
        // Run migrations on every startup, in case something changed.
        let database_url = self.setup_database().await?;

+        // We support running a startup SQL script to fiddle with the database before we launch storcon.
+        // This is used by the test suite.
+        let startup_script_path = self
+            .env
+            .base_data_dir
+            .join("storage_controller_db.startup.sql");
+        let startup_script = match tokio::fs::read_to_string(&startup_script_path).await {
+            Ok(script) => {
+                tokio::fs::remove_file(startup_script_path).await?;
+                script
+            }
+            Err(e) => {
+                if e.kind() == std::io::ErrorKind::NotFound {
+                    // always run some startup script so that this code path doesn't bit rot
+                    "BEGIN; COMMIT;".to_string()
+                } else {
+                    anyhow::bail!("Failed to read startup script: {e}")
+                }
+            }
+        };
+        let (mut client, conn) = self.connect_to_database().await?;
+        let conn = tokio::spawn(conn);
+        let tx = client.build_transaction();
+        let tx = tx.start().await?;
+        tx.batch_execute(&startup_script).await?;
+        tx.commit().await?;
+        drop(client);
+        conn.await??;
+
        let mut args = vec![
            "-l",
            &self.listen,
-            "-p",
-            self.path.as_ref(),
            "--dev",
            "--database-url",
            &database_url,
--- a/proxy/src/bin/proxy.rs
+++ b/proxy/src/bin/proxy.rs
@@ -176,6 +176,9 @@ struct ProxyCliArgs {
    /// redis url for notifications (if empty, redis_host:port will be used for both notifications and streaming connections)
    #[clap(long)]
    redis_notifications: Option<String>,
+    /// Which of the available authentication types to use for the regional redis. Supported values are "irsa" and "plain".
+    #[clap(long, default_value = "irsa")]
+    redis_auth_type: String,
    /// redis host for streaming connections (might be different from the notifications host)
    #[clap(long)]
    redis_host: Option<String>,
@@ -319,24 +322,38 @@ async fn main() -> anyhow::Result<()> {
        ),
        aws_credentials_provider,
    ));
-    let regional_redis_client = match (args.redis_host, args.redis_port) {
-        (Some(host), Some(port)) => Some(
-            ConnectionWithCredentialsProvider::new_with_credentials_provider(
-                host,
-                port,
-                elasticache_credentials_provider.clone(),
+    let regional_redis_client = match (args.redis_auth_type.as_str(), &args.redis_notifications) {
+        ("plain", redis_url) => match redis_url {
+            None => {
+                bail!("plain auth requires redis_notifications to be set");
+            }
+            Some(url) => Some(
+                ConnectionWithCredentialsProvider::new_with_static_credentials(url.to_string()),
            ),
-        ),
-        (None, None) => {
-            warn!("Redis events from console are disabled");
-            None
-        }
+        },
+        ("irsa", _) => match (&args.redis_host, args.redis_port) {
+            (Some(host), Some(port)) => Some(
+                ConnectionWithCredentialsProvider::new_with_credentials_provider(
+                    host.to_string(),
+                    port,
+                    elasticache_credentials_provider.clone(),
+                ),
+            ),
+            (None, None) => {
+                warn!("irsa auth requires redis-host and redis-port to be set, continuing without regional_redis_client");
+                None
+            }
+            _ => {
+                bail!("redis-host and redis-port must be specified together");
+            }
+        },
        _ => {
-            bail!("redis-host and redis-port must be specified together");
+            bail!("unknown auth type given");
        }
    };
+
    let redis_notifications_client = if let Some(url) = args.redis_notifications {
-        Some(ConnectionWithCredentialsProvider::new_with_static_credentials(url))
+        Some(ConnectionWithCredentialsProvider::new_with_static_credentials(url.to_string()))
    } else {
        regional_redis_client.clone()
    };
--- a/safekeeper/src/timeline_eviction.rs
+++ b/safekeeper/src/timeline_eviction.rs
@@ -199,10 +199,7 @@ async fn redownload_partial_segment(
    file.flush().await?;

    let final_path = local_segment_path(mgr, partial);
-    info!(
-        "downloaded {} bytes, renaming to {}",
-        final_path, final_path,
-    );
+    info!("downloaded {actual_len} bytes, renaming to {final_path}");
    if let Err(e) = durable_rename(&tmp_file, &final_path, !mgr.conf.no_sync).await {
        // Probably rename succeeded, but fsync of it failed. Remove
        // the file then to avoid using it.
--- a/safekeeper/src/wal_backup_partial.rs
+++ b/safekeeper/src/wal_backup_partial.rs
@@ -289,6 +289,18 @@ impl PartialBackup {
            })
            .collect();

+        if new_segments.len() == 1 {
+            // we have an uploaded segment, it must not be deleted from remote storage
+            segments_to_delete.retain(|name| name != &new_segments[0].name);
+        } else {
+            // there should always be zero or one uploaded segment
+            assert!(
+                new_segments.is_empty(),
+                "too many uploaded segments: {:?}",
+                new_segments
+            );
+        }
+
        info!("deleting objects: {:?}", segments_to_delete);
        let mut objects_to_delete = vec![];
        for seg in segments_to_delete.iter() {
--- a/storage_controller/src/main.rs
+++ b/storage_controller/src/main.rs
@@ -1,5 +1,4 @@
 use anyhow::{anyhow, Context};
-use camino::Utf8PathBuf;
 use clap::Parser;
 use diesel::Connection;
 use metrics::launch_timestamp::LaunchTimestamp;
@@ -51,10 +50,6 @@ struct Cli {
    #[arg(long)]
    compute_hook_url: Option<String>,

-    /// Path to the .json file to store state (will be created if it doesn't exist)
-    #[arg(short, long)]
-    path: Option<Utf8PathBuf>,
-
    /// URL to connect to postgres, like postgresql://localhost:1234/storage_controller
    #[arg(long)]
    database_url: Option<String>,
@@ -206,11 +201,10 @@ async fn async_main() -> anyhow::Result<()> {

    let args = Cli::parse();
    tracing::info!(
-        "version: {}, launch_timestamp: {}, build_tag {}, state at {}, listening on {}",
+        "version: {}, launch_timestamp: {}, build_tag {}, listening on {}",
        GIT_VERSION,
        launch_ts.to_string(),
        BUILD_TAG,
-        args.path.as_ref().unwrap_or(&Utf8PathBuf::from("<none>")),
        args.listen
    );

@@ -277,8 +271,7 @@ async fn async_main() -> anyhow::Result<()> {
        .await
        .context("Running database migrations")?;

-    let json_path = args.path;
-    let persistence = Arc::new(Persistence::new(secrets.database_url, json_path.clone()));
+    let persistence = Arc::new(Persistence::new(secrets.database_url));

    let service = Service::spawn(config, persistence.clone()).await?;

@@ -316,14 +309,6 @@ async fn async_main() -> anyhow::Result<()> {
    }
    tracing::info!("Terminating on signal");

-    if json_path.is_some() {
-        // Write out a JSON dump on shutdown: this is used in compat tests to avoid passing
-        // full postgres dumps around.
-        if let Err(e) = persistence.write_tenants_json().await {
-            tracing::error!("Failed to write JSON on shutdown: {e}")
-        }
-    }
-
    // Stop HTTP server first, so that we don't have to service requests
    // while shutting down Service
    server_shutdown.cancel();
--- a/storage_controller/src/persistence.rs
+++ b/storage_controller/src/persistence.rs
@@ -5,8 +5,6 @@ use std::time::Duration;
 use std::time::Instant;

 use self::split_state::SplitState;
-use camino::Utf8Path;
-use camino::Utf8PathBuf;
 use diesel::pg::PgConnection;
 use diesel::prelude::*;
 use diesel::Connection;
@@ -55,11 +53,6 @@ use crate::node::Node;
 /// we can UPDATE a node's scheduling mode reasonably quickly to mark a bad node offline.
 pub struct Persistence {
    connection_pool: diesel::r2d2::Pool<diesel::r2d2::ConnectionManager<PgConnection>>,
-
-    // In test environments, we support loading+saving a JSON file.  This is temporary, for the benefit of
-    // test_compatibility.py, so that we don't have to commit to making the database contents fully backward/forward
-    // compatible just yet.
-    json_path: Option<Utf8PathBuf>,
 }

 /// Legacy format, for use in JSON compat objects in test environment
@@ -124,7 +117,7 @@ impl Persistence {
    const IDLE_CONNECTION_TIMEOUT: Duration = Duration::from_secs(10);
    const MAX_CONNECTION_LIFETIME: Duration = Duration::from_secs(60);

-    pub fn new(database_url: String, json_path: Option<Utf8PathBuf>) -> Self {
+    pub fn new(database_url: String) -> Self {
        let manager = diesel::r2d2::ConnectionManager::<PgConnection>::new(database_url);

        // We will use a connection pool: this is primarily to _limit_ our connection count, rather than to optimize time
@@ -139,10 +132,7 @@ impl Persistence {
            .build(manager)
            .expect("Could not build connection pool");

-        Self {
-            connection_pool,
-            json_path,
-        }
+        Self { connection_pool }
    }

    /// A helper for use during startup, where we would like to tolerate concurrent restarts of the
@@ -302,85 +292,13 @@ impl Persistence {
    /// At startup, load the high level state for shards, such as their config + policy.  This will
    /// be enriched at runtime with state discovered on pageservers.
    pub(crate) async fn list_tenant_shards(&self) -> DatabaseResult<Vec<TenantShardPersistence>> {
-        let loaded = self
-            .with_measured_conn(
-                DatabaseOperation::ListTenantShards,
-                move |conn| -> DatabaseResult<_> {
-                    Ok(crate::schema::tenant_shards::table.load::<TenantShardPersistence>(conn)?)
-                },
-            )
-            .await?;
-
-        if loaded.is_empty() {
-            if let Some(path) = &self.json_path {
-                if tokio::fs::try_exists(path)
-                    .await
-                    .map_err(|e| DatabaseError::Logical(format!("Error stat'ing JSON file: {e}")))?
-                {
-                    tracing::info!("Importing from legacy JSON format at {path}");
-                    return self.list_tenant_shards_json(path).await;
-                }
-            }
-        }
-        Ok(loaded)
-    }
-
-    /// Shim for automated compatibility tests: load tenants from a JSON file instead of database
-    pub(crate) async fn list_tenant_shards_json(
-        &self,
-        path: &Utf8Path,
-    ) -> DatabaseResult<Vec<TenantShardPersistence>> {
-        let bytes = tokio::fs::read(path)
-            .await
-            .map_err(|e| DatabaseError::Logical(format!("Failed to load JSON: {e}")))?;
-
-        let mut decoded = serde_json::from_slice::<JsonPersistence>(&bytes)
-            .map_err(|e| DatabaseError::Logical(format!("Deserialization error: {e}")))?;
-        for shard in decoded.tenants.values_mut() {
-            if shard.placement_policy == "\"Single\"" {
-                // Backward compat for test data after PR https://github.com/neondatabase/neon/pull/7165
-                shard.placement_policy = "{\"Attached\":0}".to_string();
-            }
-
-            if shard.scheduling_policy.is_empty() {
-                shard.scheduling_policy =
-                    serde_json::to_string(&ShardSchedulingPolicy::default()).unwrap();
-            }
-        }
-
-        let tenants: Vec<TenantShardPersistence> = decoded.tenants.into_values().collect();
-
-        // Synchronize database with what is in the JSON file
-        self.insert_tenant_shards(tenants.clone()).await?;
-
-        Ok(tenants)
-    }
-
-    /// For use in testing environments, where we dump out JSON on shutdown.
-    pub async fn write_tenants_json(&self) -> anyhow::Result<()> {
-        let Some(path) = &self.json_path else {
-            anyhow::bail!("Cannot write JSON if path isn't set (test environment bug)");
-        };
-        tracing::info!("Writing state to {path}...");
-        let tenants = self.list_tenant_shards().await?;
-        let mut tenants_map = HashMap::new();
-        for tsp in tenants {
-            let tenant_shard_id = TenantShardId {
-                tenant_id: TenantId::from_str(tsp.tenant_id.as_str())?,
-                shard_number: ShardNumber(tsp.shard_number as u8),
-                shard_count: ShardCount::new(tsp.shard_count as u8),
-            };
-
-            tenants_map.insert(tenant_shard_id, tsp);
-        }
-        let json = serde_json::to_string(&JsonPersistence {
-            tenants: tenants_map,
-        })?;
-
-        tokio::fs::write(path, &json).await?;
-        tracing::info!("Wrote {} bytes to {path}...", json.len());
-
-        Ok(())
+        self.with_measured_conn(
+            DatabaseOperation::ListTenantShards,
+            move |conn| -> DatabaseResult<_> {
+                Ok(crate::schema::tenant_shards::table.load::<TenantShardPersistence>(conn)?)
+            },
+        )
+        .await
    }

    /// Tenants must be persisted before we schedule them for the first time.  This enables us
--- a/test_runner/fixtures/neon_fixtures.py
+++ b/test_runner/fixtures/neon_fixtures.py
@@ -31,6 +31,7 @@ import backoff
 import httpx
 import jwt
 import psycopg2
+import psycopg2.sql
 import pytest
 import requests
 import toml
@@ -727,8 +728,30 @@ class NeonEnvBuilder:
                self.repo_dir / "local_fs_remote_storage",
            )

-        if (attachments_json := Path(repo_dir / "attachments.json")).exists():
-            shutil.copyfile(attachments_json, self.repo_dir / attachments_json.name)
+        # restore storage controller (the db is small, don't bother with overlayfs)
+        storcon_db_from_dir = repo_dir / "storage_controller_db"
+        storcon_db_to_dir = self.repo_dir / "storage_controller_db"
+        log.info(f"Copying storage_controller_db from {storcon_db_from_dir} to {storcon_db_to_dir}")
+        assert storcon_db_from_dir.is_dir()
+        assert not storcon_db_to_dir.exists()
+
+        def ignore_postgres_log(path: str, _names):
+            if Path(path) == storcon_db_from_dir:
+                return {"postgres.log"}
+            return set()
+
+        shutil.copytree(storcon_db_from_dir, storcon_db_to_dir, ignore=ignore_postgres_log)
+        assert not (storcon_db_to_dir / "postgres.log").exists()
+        # NB: neon_local rewrites postgresql.conf on each start based on neon_local config. No need to patch it.
+        # However, in this new NeonEnv, the pageservers listen on different ports, and the storage controller
+        # will currently reject re-attach requests from them because the NodeMetadata isn't identical.
+        # So, from_repo_dir patches up the storcon database.
+        patch_script_path = self.repo_dir / "storage_controller_db.startup.sql"
+        assert not patch_script_path.exists()
+        patch_script = ""
+        for ps in self.env.pageservers:
+            patch_script += f"UPDATE nodes SET listen_http_port={ps.service_port.http}, listen_pg_port={ps.service_port.pg}  WHERE node_id = '{ps.id}';"
+        patch_script_path.write_text(patch_script)

        # Update the config with info about tenants and timelines
        with (self.repo_dir / "config").open("r") as f:
@@ -4054,6 +4077,22 @@ class Safekeeper(LogUtils):
        self.id = id
        self.running = running
        self.logfile = Path(self.data_dir) / f"safekeeper-{id}.log"
+
+        if extra_opts is None:
+            # Testing defaults: enable everything, and set short timeouts so that background
+            # work will happen during short tests.
+            # **Note**: Any test that explicitly sets extra_opts will not get these defaults.
+            extra_opts = [
+                "--enable-offload",
+                "--delete-offloaded-wal",
+                "--partial-backup-timeout",
+                "10s",
+                "--control-file-save-interval",
+                "1s",
+                "--eviction-min-resident",
+                "10s",
+            ]
+
        self.extra_opts = extra_opts

    def start(
--- a/test_runner/performance/pageserver/pagebench/test_pageserver_max_throughput_getpage_at_latest_lsn.py
+++ b/test_runner/performance/pageserver/pagebench/test_pageserver_max_throughput_getpage_at_latest_lsn.py
@@ -255,11 +255,3 @@ def run_pagebench_benchmark(
            unit="ms",
            report=MetricReport.LOWER_IS_BETTER,
        )
-
-    env.storage_controller.allowed_errors.append(
-        # The test setup swaps NeonEnv instances, hence different
-        # pg instances are used for the storage controller db. This means
-        # the storage controller doesn't know about the nodes mentioned
-        # in attachments.json at start-up.
-        ".* Scheduler missing node 1",
-    )
--- a/test_runner/regress/test_compatibility.py
+++ b/test_runner/regress/test_compatibility.py
@@ -93,29 +93,6 @@ check_ondisk_data_compatibility_if_enabled = pytest.mark.skipif(
 )


-def fixup_storage_controller(env: NeonEnv):
-    """
-    After importing a repo_dir, we need to massage the storage controller's state a bit: it will have
-    initially started up with no nodes, but some tenants, and thereby those tenants won't be scheduled
-    anywhere.
-
-    After NeonEnv.start() is done (i.e. nodes are started + registered), call this function to get
-    the storage controller into a good state.
-
-    This function should go away once compat tests carry the controller database in their snapshots, so
-    that the controller properly remembers nodes between creating + restoring the snapshot.
-    """
-    env.storage_controller.allowed_errors.extend(
-        [
-            ".*Tenant shard .+ references non-existent node.*",
-            ".*Failed to schedule tenant .+ at startup.*",
-        ]
-    )
-    env.storage_controller.stop()
-    env.storage_controller.start()
-    env.storage_controller.reconcile_until_idle()
-
-
@pytest.mark.xdist_group("compatibility")
@pytest.mark.order(before="test_forward_compatibility")
 def test_create_snapshot(
@@ -198,7 +175,6 @@ def test_backward_compatibility(
        neon_env_builder.num_safekeepers = 3
        env = neon_env_builder.from_repo_dir(compatibility_snapshot_dir / "repo")
        neon_env_builder.start()
-        fixup_storage_controller(env)

        check_neon_works(
            env,
@@ -287,7 +263,6 @@ def test_forward_compatibility(
        assert not env.pageserver.log_contains("git-env:" + prev_pageserver_version)

        neon_env_builder.start()
-        fixup_storage_controller(env)

        # ensure the specified pageserver is running
        assert env.pageserver.log_contains("git-env:" + prev_pageserver_version)
--- a/test_runner/regress/test_pg_regress.py
+++ b/test_runner/regress/test_pg_regress.py
@@ -117,7 +117,7 @@ def post_checks(env: NeonEnv, test_output_dir: Path, db_name: str, endpoint: End

 # Run the main PostgreSQL regression tests, in src/test/regress.
 #
-@pytest.mark.timeout(600)
+@pytest.mark.timeout(900)  # Contains many sub-tests, is slow in debug builds
@pytest.mark.parametrize("shard_count", [None, 4])
 def test_pg_regress(
    neon_env_builder: NeonEnvBuilder,
@@ -186,6 +186,7 @@ def test_pg_regress(

 # Run the PostgreSQL "isolation" tests, in src/test/isolation.
 #
+@pytest.mark.timeout(600)  # Contains many sub-tests, is slow in debug builds
@pytest.mark.parametrize("shard_count", [None, 4])
 def test_isolation(
    neon_env_builder: NeonEnvBuilder,
--- a/test_runner/regress/test_wal_acceptor.py
+++ b/test_runner/regress/test_wal_acceptor.py
@@ -2242,6 +2242,8 @@ def test_s3_eviction(

    check_values = [0] * n_timelines

+    event_metrics_seen = False
+
    n_iters = 20
    for _ in range(n_iters):
        if log.isEnabledFor(logging.DEBUG):
@@ -2266,6 +2268,27 @@ def test_s3_eviction(
        # update remote_consistent_lsn on pageserver
        ps_client.timeline_checkpoint(env.initial_tenant, timelines[i], wait_until_uploaded=True)

+        # Do metrics check before restarts, since these will reset to zero across a restart
+        event_metrics_seen |= any(
+            sk.http_client().get_metric_value(
+                "safekeeper_eviction_events_started_total", {"kind": "evict"}
+            )
+            or 0 > 0
+            and sk.http_client().get_metric_value(
+                "safekeeper_eviction_events_completed_total", {"kind": "evict"}
+            )
+            or 0 > 0
+            and sk.http_client().get_metric_value(
+                "safekeeper_eviction_events_started_total", {"kind": "restore"}
+            )
+            or 0 > 0
+            and sk.http_client().get_metric_value(
+                "safekeeper_eviction_events_completed_total", {"kind": "restore"}
+            )
+            or 0 > 0
+            for sk in env.safekeepers
+        )
+
        # restarting random safekeepers
        for sk in env.safekeepers:
            if random.random() < restart_chance:
@@ -2280,22 +2303,4 @@ def test_s3_eviction(
        for sk in env.safekeepers
    )

-    assert any(
-        sk.http_client().get_metric_value(
-            "safekeeper_eviction_events_started_total", {"kind": "evict"}
-        )
-        or 0 > 0
-        and sk.http_client().get_metric_value(
-            "safekeeper_eviction_events_completed_total", {"kind": "evict"}
-        )
-        or 0 > 0
-        and sk.http_client().get_metric_value(
-            "safekeeper_eviction_events_started_total", {"kind": "restore"}
-        )
-        or 0 > 0
-        and sk.http_client().get_metric_value(
-            "safekeeper_eviction_events_completed_total", {"kind": "restore"}
-        )
-        or 0 > 0
-        for sk in env.safekeepers
-    )
+    assert event_metrics_seen
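
Note on the `test_s3_eviction` hunk above: in Python, comparisons bind tighter than `and`, and `and` binds tighter than `or`. An expression of the form `a or 0 > 0 and b or 0 > 0` therefore parses as `a or ((0 > 0) and b) or (0 > 0)`, which is truthy whenever `a` alone is non-zero. The sketch below shows the same check-before-restart, assert-at-the-end pattern with each comparison parenthesized. It assumes the intent is that a single safekeeper reports all four counters above zero. The names `MetricReader`, `EVICTION_METRICS`, `all_eviction_events_seen` and `accumulate_seen` are hypothetical and not part of the test suite.

```python
from typing import Callable, Dict, Iterable, Optional

# Hypothetical stand-in for sk.http_client().get_metric_value:
# takes a metric name and a label filter, returns the value or None.
MetricReader = Callable[[str, Dict[str, str]], Optional[float]]

# The four counters checked in the hunk above.
EVICTION_METRICS = [
    ("safekeeper_eviction_events_started_total", {"kind": "evict"}),
    ("safekeeper_eviction_events_completed_total", {"kind": "evict"}),
    ("safekeeper_eviction_events_started_total", {"kind": "restore"}),
    ("safekeeper_eviction_events_completed_total", {"kind": "restore"}),
]


def all_eviction_events_seen(get_metric_value: MetricReader) -> bool:
    """True if every counter is present and strictly positive."""
    # The parentheses make None fall back to 0 *before* the comparison.
    return all(
        (get_metric_value(name, labels) or 0) > 0
        for name, labels in EVICTION_METRICS
    )


def accumulate_seen(seen_so_far: bool, readers: Iterable[MetricReader]) -> bool:
    """Fold one iteration's check into the running flag.

    Counters reset to zero on restart, so this would be called before
    any node is restarted and the flag asserted once at the end.
    """
    return seen_so_far or any(all_eviction_events_seen(r) for r in readers)
```

In the test loop this would correspond to updating the flag once per iteration, before the restart step, and asserting on it after the loop.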
Author	SHA1	Message	Date
Anton Chaporgin	49db1c47ee	[neon/azure] impr: push directly into ACR As we observed [^1], misconfiguring the compute image by pointing at a nonexistent one makes cplane schedule too many pods for the pool. The pods cannot pull the image because it does not exist, so they reach out to Docker Hub too often, which results in our token being rate-limited. So we need to push the images directly into ACR instead of using the pull-through cache. [^1]: https://neondb.slack.com/archives/C06SJG60FRB/p1721749525396229	2024-07-24 17:44:49 +03:00
Anton Chaporgin	cf386c6c2c	review: * redis_publisher is based on regional_redis_client as it was before * do not error out when irsa redis is not configured	2024-07-22 08:36:46 +03:00
Anton Chaporgin	a9a5a19d30	cargo fmt --all	2024-07-19 18:12:48 +03:00
Anton Chaporgin	a6e67eb13e	fix the with	2024-07-19 16:43:29 +03:00
Anton Chaporgin	8646fc8361	fix the bug	2024-07-19 14:13:49 +03:00
Anton Chaporgin	d73b9b8afd	removed redundant comment	2024-07-19 11:36:36 +03:00
Anton Chaporgin	095af95bd9	[proxy/redis] impr: use redis_auth_type to switch between auth types This adds `redis_auth_type` to the config with a default value of "irsa". Not specifying it will enforce the `regional_redis_client` to be configured with IRSA redis (as is done now). If "plain" is specified, then the regional client is configured with `redis_notifications`, consuming username:password auth from the URI. We plan to do that for Azure. Configuring `regional_redis_client` is now required; there is no opt-out from configuring it.	2024-07-19 11:34:45 +03:00
Arthur Petukhovsky	d263b1804e	Fix partial upload bug with invalid remote state (#8383) We have an issue where some partially uploaded segments can actually be missing in remote storage. I found this issue when I was looking at the logs in staging, and it can be triggered by failed uploads: 1. Code tries to upload `SEG_TERM_LSN_LSN_sk5.partial`, but receives an error from S3 2. The failed attempt is saved to the `segments` vec 3. After some time, the code tries to upload `SEG_TERM_LSN_LSN_sk5.partial` again 4. This time the upload is successful and the code calls `gc()` to delete previous uploads 5. Since the new object and the old object share the same name, the uploaded data gets deleted from remote storage This commit fixes the issue by patching `gc()` not to delete objects with the same name as the currently uploaded one. --------- Co-authored-by: Arpad Müller <arpad-m@users.noreply.github.com>	2024-07-18 13:46:00 +01:00
John Spray	b461755326	tests: turn on safekeeper eviction by default (#8352 ) ## Problem Ahead of enabling eviction in the field, where it will become the normal/default mode, let's enable it by default throughout our tests in case any issues become visible there. ## Summary of changes - Make default `extra_opts` for safekeepers enable offload & deletion - Set low timeouts in `extra_opts` so that tests running for tens of seconds have a chance to hit some of these background operations.	2024-07-18 12:59:14 +01:00
John Spray	9ded2556df	tests: increase test_pg_regress and test_isolation timeouts (#8418) ## Problem These tests time out ~1 in 50 runs when in debug mode. There is no indication of a real issue: they're just wrappers that have large numbers of individual tests contained within one pytest case. ## Summary of changes - Bump pg_regress timeout from 600 to 900s - Bump test_isolation timeout from 300s (default) to 600s In future it would be nice to break out these tests to run individual cases (or batches thereof) as separate tests, rather than this monolith.	2024-07-18 10:23:17 +01:00
John Spray	7672e49ab5	tests: fix metrics check in test_s3_eviction (#8419 ) ## Problem This test would occasionally fail its metric check. This could happen in the rare case that the nodes had all been restarted before their most recent eviction. The metric check was added in https://github.com/neondatabase/neon/pull/8348 ## Summary of changes - Check metrics before each restart, accumulate into a bool that we assert on at the end of the test	2024-07-18 10:14:56 +01:00
Christian Schwarz	a2d170b6d0	NeonEnv.from_repo_dir: use storage_controller_db instead of `attachments.json` (#8382) When `NeonEnv.from_repo_dir` was introduced, storage controller stored its state exclusively in `attachments.json`. Since then, it has moved to using Postgres, which stores its state in `storage_controller_db`. But `NeonEnv.from_repo_dir` wasn't adjusted to do this. This PR rectifies the situation. Context for this is failures in `test_pageserver_characterize_throughput_with_n_tenants` CF: https://neondb.slack.com/archives/C033RQ5SPDH/p1721035799502239?thread_ts=1720901332.293769&cid=C033RQ5SPDH Notably, `from_repo_dir` is also used by the backwards- and forwards-compatibility tests. Thus, the changes in this PR affect those tests as well. However, it turns out that the compatibility snapshot already contains the `storage_controller_db`. Thus, it should just work and in fact we can remove hacks like `fixup_storage_controller`. Follow-ups created as part of this work: * https://github.com/neondatabase/neon/issues/8399 * https://github.com/neondatabase/neon/issues/8400	2024-07-18 10:56:07 +02:00
dotdister	1303d47778	Fix comment in Control Plane (#8406) ## Problem There is something wrong in the comments of `control_plane/src/broker.rs` and `control_plane/src/pageserver.rs` ## Summary of changes Fixed the comments about component names and their data paths in `control_plane/src/broker.rs` and `control_plane/src/pageserver.rs`.	2024-07-18 09:33:46 +01:00
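
The `gc()` behaviour described in commit d263b1804e above (do not delete a remote object that shares its name with the segment that was just uploaded) can be illustrated with a small sketch. The actual fix lives in the safekeeper's own code, which is not shown on this page. The Python below is only an illustration of the rule, and the function name `names_safe_to_delete` is hypothetical.

```python
from typing import Iterable, List


def names_safe_to_delete(previous_uploads: Iterable[str], current_upload: str) -> List[str]:
    """Return the object names from earlier upload attempts that may be removed.

    A failed attempt and a later successful retry can share one object name.
    Deleting by name would then remove the freshly uploaded data, so any
    entry equal to the current upload's name is skipped.
    """
    safe: List[str] = []
    for name in previous_uploads:
        if name == current_upload:
            # Same remote object as the one just uploaded: keep it.
            continue
        if name not in safe:
            # Drop duplicate entries so each object is deleted at most once.
            safe.append(name)
    return safe
```

For example, if the recorded attempts are an older segment name plus a failed attempt at the current name, only the older name is returned for deletion.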