rust/neon

mirror of https://github.com/neondatabase/neon.git synced 2026-01-07 05:22:56 +00:00

Author	SHA1	Message	Date
Arseny Sher	930763cad2	s/jsonb/array	2024-08-12 14:48:05 +03:00
Arseny Sher	28ef1522d6	cosmetic fixes	2024-08-12 14:48:05 +03:00
Arseny Sher	c9d2b61195	fix term uniqueness	2024-08-12 14:48:05 +03:00
Arseny Sher	4d1cf2dc6f	tests, rollout	2024-08-12 14:48:05 +03:00
Arseny Sher	7b50c1a457	more wip ref https://github.com/neondatabase/cloud/issues/14668	2024-08-12 14:48:05 +03:00
Arseny Sher	1e789fb963	wipwip	2024-08-12 14:48:05 +03:00
Arseny Sher	162424ad77	wip	2024-08-12 14:48:05 +03:00
Arseny Sher	a4eea5025c	Fix logical apply worker reporting of flush_lsn wrt sync replication. It should take syncrep flush_lsn into account because WAL before it on endpoint restart is lost, which makes replication miss some data if slot had already been advanced too far. This commit adds test reproducing the issue and bumps vendor/postgres to commit with the actual fix.	2024-08-12 13:14:02 +03:00
Alexander Bayandin	4476caf670	CI: add `actions/set-docker-config-dir` to set DOCKER_CONFIG (#8676 ) ## Problem In several workflows, we have repeating code which is separated into two steps: ```bash mkdir -p $(pwd)/.docker-custom echo DOCKER_CONFIG=/tmp/.docker-custom >> $GITHUB_ENV ... rm -rf $(pwd)/.docker-custom ``` Such copy-paste is prone to errors; for example, in one case, instead of `$(pwd)/.docker-custom`, we use `/tmp/.docker-custom`, which is shared between workflows. ## Summary of changes - Create a new action `actions/set-docker-config-dir`, which sets `DOCKER_CONFIG` and deletes it in a Post action part	2024-08-12 09:17:31 +01:00
dependabot[bot]	f7a3380aec	chore(deps): bump aiohttp from 3.9.4 to 3.10.2 (#8684 )	2024-08-11 12:21:32 +01:00
Arpad Müller	507f1a5bdd	Also pass HOME env var in access_env_vars (#8685 ) Noticed this while debugging a test failure in #8673 which only occurs with real S3 instead of mock S3: if you authenticate to S3 via `AWS_PROFILE`, then it requires the `HOME` env var to be set so that it can read inside the `~/.aws` directory. The scrubber abstraction `StorageScrubber::scrubber_cli` in `neon_fixtures.py` would otherwise not work. My earlier PR #6556 has done similar things for the `neon_local` wrapper. You can try: ``` aws sso login --profile dev export ENABLE_REAL_S3_REMOTE_STORAGE=y REMOTE_STORAGE_S3_BUCKET=neon-github-ci-tests REMOTE_STORAGE_S3_REGION=eu-central-1 AWS_PROFILE=dev RUST_BACKTRACE=1 BUILD_TYPE=debug DEFAULT_PG_VERSION=16 ./scripts/pytest -vv --tb=short -k test_scrubber_tenant_snapshot ``` before and after this patch: this patch fixes it.	2024-08-10 12:04:47 +00:00
John Spray	401dcd3551	Update docs/SUMMARY.md (#8665 ) ## Problem This page had many dead links, and was confusing for folks looking for documentation about our product. Closes: https://github.com/neondatabase/neon/issues/8535 ## Summary of changes - Add a link to the product docs up top - Remove dead/placeholder links	2024-08-09 18:30:15 +01:00
Alexander Bayandin	4a53cd0fc3	Dockerfiles: remove cachepot (#8666 ) ## Problem We install and try to use `cachepot`. But it is not configured correctly and doesn't work (after https://github.com/neondatabase/neon/pull/2290) ## Summary of changes - Remove `cachepot`	2024-08-09 15:48:16 +01:00
Vlad Lazar	f5cef7bf7f	storcon: skip draining shard if its secondary is lagging too much (#8644 ) ## Problem Migrations of tenant shards with cold secondaries are holding up drains during production deployments. ## Summary of changes If a secondary location is lagging by more than 256MiB (configurable, but that's the default), then skip cutting it over to the secondary as part of the node drain.	2024-08-09 15:45:07 +01:00
John Spray	e6770d79fd	pageserver: don't treat NotInitialized::Stopped as unexpected (#8675 ) ## Problem This type of error can happen during shutdown & was triggering a circuit breaker alert. ## Summary of changes - Map NotInitialized::Stopped to CompactionError::ShuttingDown, so that we may handle it cleanly	2024-08-09 14:01:56 +01:00
Alexander Bayandin	201f56baf7	CI(pin-build-tools-image): fix permissions for Azure login (#8671 ) ## Problem Azure login fails in `pin-build-tools-image` workflow because the job doesn't have the required permissions. ``` Error: Please make sure to give write permissions to id-token in the workflow. Error: Login failed with Error: Error message: Unable to get ACTIONS_ID_TOKEN_REQUEST_URL env variable. Double check if the 'auth-type' is correct. Refer to https://github.com/Azure/login#readme for more information. ``` ## Summary of changes - Add `id-token: write` permission to `pin-build-tools-image` - Add an input to force image tagging - Unify pushing to Docker Hub with other registries - Split the job into two to have less if's	2024-08-09 12:05:43 +01:00
Alex Chi Z.	a155914c1c	fix(neon): disable create tablespace stmt (#8657 ) part of https://github.com/neondatabase/neon/issues/8653 Disable create tablespace stmt. It turns out it requires much less effort to do the regress test mode flag than patching the test cases, and given that we might need to support tablespaces in the future, I decided to add a new flag `regress_test_mode` to change the behavior of create tablespace. Tested manually that without setting regress_test_mode, create tablespace will be rejected. --------- Signed-off-by: Alex Chi Z <chi@neon.tech> Co-authored-by: Heikki Linnakangas <heikki@neon.tech>	2024-08-09 09:18:55 +01:00
Conrad Ludgate	7e08fbd1b9	Revert "proxy: update tokio-postgres to allow arbitrary config params (#8076 )" (#8654 ) This reverts #8076 - which was already reverted from the release branch since forever (it would have been a breaking change to release for all users who currently set TimeZone options). It's causing conflicts now so we should revert it here as well.	2024-08-09 09:09:29 +01:00
Peter Bendel	2ca5ff26d7	Run a subset of benchmarking job steps on GitHub action runners in Azure - closer to the system under test (#8651 ) ## Problem Latency from one cloud provider to another one is higher than within the same cloud provider. Some of our benchmarks are latency sensitive - we run a pgbench or psql in the github action runner and the system under test is running in Neon (database project). For realistic perf tps and latency results we need to compare apples to apples and run the database client in the same "latency distance" for all tests. ## Summary of changes Move job steps that test Neon databases deployed on Azure into Azure action runners. - bench strategy variant using azure database - pgvector strategy variant using azure database - pgbench-compare strategy variants using azure database ## Test run https://github.com/neondatabase/neon/actions/runs/10314848502	2024-08-09 08:36:29 +01:00
Alexander Bayandin	8acce00953	Dockerfiles: fix LegacyKeyValueFormat & JSONArgsRecommended (#8664 ) ## Problem CI complains in all PRs: ``` "ENV key=value" should be used instead of legacy "ENV key value" format ``` https://docs.docker.com/reference/build-checks/legacy-key-value-format/ See - https://github.com/neondatabase/neon/pull/8644/files ("Unchanged files with check annotations" section) - https://github.com/neondatabase/neon/actions/runs/10304090562?pr=8644 ("Annotations" section) ## Summary of changes - Use `ENV key=value` instead of `ENV key value` in all Dockerfiles	2024-08-09 07:54:54 +01:00
Alexander Bayandin	d28a6f2576	CI(build-tools): update Rust, Python, Mold (#8667 ) ## Problem - Rust 1.80.1 has been released: https://blog.rust-lang.org/2024/08/08/Rust-1.80.1.html - Python 3.9.19 has been released: https://www.python.org/downloads/release/python-3919/ - Mold 2.33.0 has been released: https://github.com/rui314/mold/releases/tag/v2.33.0 - Unpinned `cargo-deny` in `build-tools` got updated to the latest version and doesn't work anymore with the current config file ## Summary of changes - Bump Rust to 1.80.1 - Bump Python to 3.9.19 - Bump Mold to 2.33.0 - Pin `cargo-deny`, `cargo-hack`, `cargo-hakari`, `cargo-nextest`, `rustfilt` versions - Update `deny.toml` to the latest format, see https://github.com/EmbarkStudios/cargo-deny/pull/611	2024-08-09 06:17:16 +00:00
John Spray	4431688dc6	tests: don't require kafka client for regular tests (#8662 ) ## Problem We're adding more third party dependencies to support more diverse + realistic test cases in `test_runner/logical_repl`. I ❤️ these tests, they are a good thing. The slight glitch is that python packaging is hard, and some third party python packages have issues. For example the current kafka dependency doesn't work on latest python. We can mitigate that by only importing these more specialized dependencies in the tests that use them. ## Summary of changes - Move the `kafka` import into a test body, so that folks running the regular `test_runner/regress` tests don't have to have a working kafka client package.	2024-08-08 19:24:21 +01:00
John Spray	953b7d4f7e	pageserver: remove paranoia double-calculation of retain_lsns (#8617 ) ## Problem This code was to mitigate risk in https://github.com/neondatabase/neon/pull/8427 As expected, we did not hit this code path - the new continuous updates of gc_info are working fine, we can remove this code now. ## Summary of changes - Remove block that double-checks retain_lsns	2024-08-08 12:57:48 +01:00
Joonas Koivunen	8561b2c628	fix: stop leaking BackgroundPurges (#8650 ) avoid "leaking" the completions of BackgroundPurges by: 1. switching it to TaskTracker for provided close+wait 2. stop using tokio::fs::remove_dir_all which will consume two units of memory instead of one blocking task Additionally, use more graceful shutdown in tests which actually do some background cleanup.	2024-08-08 12:02:53 +01:00
Joonas Koivunen	21638ee96c	fix(test): do not fail test for filesystem race (#8643 ) evidence: https://neon-github-public-dev.s3.amazonaws.com/reports/pr-8632/10287641784/index.html#suites/0e58fb04d9998963e98e45fe1880af7d/c7a46335515142b/	2024-08-08 10:34:47 +01:00
Konstantin Knizhnik	cbe8c77997	Use synchronous commit for logical replication worker (#8645 ) ## Problem See https://neondb.slack.com/archives/C03QLRH7PPD/p1723038557449239?thread_ts=1722868375.476789&cid=C03QLRH7PPD Logical replication subscriptions by default use `synchronous_commit=off`, which causes problems with safekeeper ## Summary of changes Set `synchronous_commit=on` for logical replication subscription in test_subscriber_restart.py ## Checklist before requesting a review - [ ] I have performed a self-review of my code. - [ ] If it is a core feature, I have added thorough tests. - [ ] Do we need to implement analytics? if so did you add the relevant metrics to the dashboard? - [ ] If this PR requires public announcement, mark it with /release-notes label and add several sentences in this section. ## Checklist before merging - [ ] Do not forget to reformat commit message to not include the above checklist --------- Co-authored-by: Konstantin Knizhnik <knizhnik@neon.tech> Co-authored-by: Heikki Linnakangas <heikki@neon.tech>	2024-08-08 10:23:57 +03:00
John Spray	cf3eac785b	pageserver: make bench_ingest build (but panic) on macOS (#8641 ) ## Problem Some developers build on MacOS, which doesn't have io_uring. ## Summary of changes - Add `io_engine_for_bench`, which on linux will give io_uring or panic if it's unavailable, and on MacOS will always panic. We do not want to run such benchmarks with StdFs: the results aren't interesting, and will actively waste the time of any developers who start investigating performance before they realize they're using a known-slow I/O backend. Why not just conditionally compile this benchmark on linux only? Because even on linux, I still want it to refuse to run if it can't get io_uring.	2024-08-07 21:17:08 +01:00
Yuchen Liang	542385e364	feat(pageserver): add direct io pageserver config (#8622 ) Part of #8130, [RFC: Direct IO For Pageserver](https://github.com/neondatabase/neon/blob/problame/direct-io-rfc/docs/rfcs/034-direct-io-for-pageserver.md) ## Description Add pageserver config for evaluating/enabling direct I/O. - Disabled: current default, uses buffered io as is. - Evaluate: still uses buffered io, but could do alignment checking and perf simulation (pad latency by direct io RW to a fake file). - Enabled: uses direct io, behavior on alignment error is configurable. Signed-off-by: Yuchen Liang <yuchen@neon.tech>	2024-08-07 21:04:19 +01:00
Joonas Koivunen	05dd1ae9e0	fix: drain completed page_service connections (#8632 ) We've noticed increased memory usage with the latest release. Drain the joinset of `page_service` connection handlers to avoid leaking them until shutdown. An alternative would be to use a TaskTracker. TaskTracker was not discussed in original PR #8339 review, so not hot fixing it in here either.	2024-08-07 17:14:45 +00:00
Cihan Demirci	8468d51a14	cicd: push build-tools image to ACR as well (#8638 ) https://github.com/neondatabase/cloud/issues/15899	2024-08-07 17:53:47 +01:00
Joonas Koivunen	a81fab4826	refactor(timeline_detach_ancestor): replace ordered reparented with a hashset (#8629 ) Earlier I was thinking we'd need a (ancestor_lsn, timeline_id) ordered list of reparented. Turns out we did not need it at all. Replace it with an unordered hashset. Additionally refactor the reparented direct children query out, it will later be used from more places. Split off from #8430. Cc: #6994	2024-08-07 18:19:00 +02:00
Alex Chi Z.	b3eea45277	fix(pageserver): dump the key when it's invalid (#8633 ) We see an assertion error in staging. Dump the key to guess where it was from, and then we can fix it. Signed-off-by: Alex Chi Z <chi@neon.tech>	2024-08-07 16:37:46 +01:00
Joonas Koivunen	fc78774f39	fix: EphemeralFiles can outlive their Timeline via `enum LayerManager` (#8229 ) Ephemeral files clean up on drop but did not delay shutdown, leading to problems with restarting the tenant. The solution is as proposed: - make ephemeral files carry the gate guard to delay `Timeline::gate` closing - flush in-memory layers and strong references to those on `Timeline::shutdown` The above are realized by making LayerManager an `enum` with `Open` and `Closed` variants, and failing requests to modify `LayerMap`. Additionally: - fix too eager anyhow conversions in compaction - unify how we freeze layers and handle errors - optimize likely_resident_layers to read LayerFileManager hashmap values instead of bouncing through LayerMap Fixes: #7830	2024-08-07 17:50:09 +03:00
Conrad Ludgate	ad0988f278	proxy: random changes (#8602 ) ## Problem 1. Hard to correlate startup parameters with the endpoint that provided them. 2. Some configurations are not needed in the `ProxyConfig` struct. ## Summary of changes Because of some borrow checker fun, I needed to switch to an interior-mutability implementation of our `RequestMonitoring` context system. Using https://docs.rs/try-lock/latest/try_lock/ as a cheap lock for such a use-case (needed to be thread safe). Removed the lock of each startup message, instead just logging only the startup params in a successful handshake. Also removed some values from `ProxyConfig` and kept them as arguments. (needed for local-proxy config)	2024-08-07 14:37:03 +01:00
Arpad Müller	4d7c0dac93	Add missing colon to ArchivalConfigRequest specification (#8627 ) Add a missing colon to the API specification of `ArchivalConfigRequest`. The `state` field is required. Pointed out by Gleb.	2024-08-07 14:53:52 +02:00
Arpad Müller	00c981576a	Lower level for timeline cancellations during gc (#8626 ) Timeline cancellation running in parallel with gc yields error log lines like: ``` Gc failed 1 times, retrying in 2s: TimelineCancelled ``` They are completely harmless though and normal to occur. Therefore, only print those messages at an info level. Still print them at all so that we know what is going on if we focus on a single timeline.	2024-08-07 09:29:52 +02:00
Arpad Müller	c3f2240fbd	storage broker: only print one line for version and build tag in init (#8624 ) This makes it more consistent with pageserver and safekeeper. Also, it is easier to collect the two values into one data point.	2024-08-07 09:14:26 +02:00
Yuchen Liang	ed5724d79d	scrubber: clean up `scan_metadata` before prod (#8565 ) Part of #8128. ## Problem Currently, scrubber `scan_metadata` command will return with an error code if the metadata on remote storage is corrupted with fatal errors. To safely deploy this command in a cronjob, we want to differentiate between failures while running scrubber command and the erroneous metadata. At the same time, we also want our regression tests to catch corrupted metadata using the scrubber command. ## Summary of changes - Return with error code only when the scrubber command fails - Use explicit checks on errors and warnings to determine metadata health in regression tests. Resolve conflict with `tenant-snapshot` command (after shard split): [`test_scrubber_tenant_snapshot`](https://github.com/neondatabase/neon/blob/yuchen/scrubber-scan-cleanup-before-prod/test_runner/regress/test_storage_scrubber.py#L23) failed before applying `422a8443dd` - When taking a snapshot, the old `index_part.json` in the unsharded tenant directory is not kept. - The current `list_timeline_blobs` implementation considers a missing `index_part.json` a parse error. - During the scan, we are only analyzing shards with highest shard count, so we will not get a parse error, but we do need to add the layers to tenant object listing, otherwise we will get an "index is referencing a layer that is not in remote storage" error. - Action: Add s3_layers from `list_timeline_blobs` regardless of parsing error Signed-off-by: Yuchen Liang <yuchen@neon.tech>	2024-08-06 18:55:42 +01:00
John Spray	ca5390a89d	pageserver: add `bench_ingest` (#7409 ) ## Problem We lack a rust bench for the inmemory layer and delta layer write paths: it is useful to benchmark these components independent of postgres & WAL decoding. Related: https://github.com/neondatabase/neon/issues/8452 ## Summary of changes - Refactor DeltaLayerWriter to avoid carrying a Timeline, so that it can be cleanly tested + benched without a Tenant/Timeline test harness. It only needed the Timeline for building `Layer`, so this can be done in a separate step. - Add `bench_ingest`, which exercises a variety of workload "shapes" (big values, small values, sequential keys, random keys) - Include a small uncontroversial optimization: in `freeze`, only exhaustively walk values to assert ordering relative to end_lsn in debug mode. These benches are limited by drive performance on a lot of machines, but still useful as a local tool for iterating on CPU/memory improvements around this code path. Anecdotal measurements on Hetzner AX102 (Ryzen 7950xd): ``` ingest-small-values/ingest 128MB/100b seq time: [1.1160 s 1.1230 s 1.1289 s] thrpt: [113.38 MiB/s 113.98 MiB/s 114.70 MiB/s] Found 1 outliers among 10 measurements (10.00%) 1 (10.00%) low mild Benchmarking ingest-small-values/ingest 128MB/100b rand: Warming up for 3.0000 s Warning: Unable to complete 10 samples in 10.0s. You may wish to increase target time to 18.9s. ingest-small-values/ingest 128MB/100b rand time: [1.9001 s 1.9056 s 1.9110 s] thrpt: [66.982 MiB/s 67.171 MiB/s 67.365 MiB/s] Benchmarking ingest-small-values/ingest 128MB/100b rand-1024keys: Warming up for 3.0000 s Warning: Unable to complete 10 samples in 10.0s. You may wish to increase target time to 11.0s. 
ingest-small-values/ingest 128MB/100b rand-1024keys time: [1.0715 s 1.0828 s 1.0937 s] thrpt: [117.04 MiB/s 118.21 MiB/s 119.46 MiB/s] ingest-small-values/ingest 128MB/100b seq, no delta time: [425.49 ms 429.07 ms 432.04 ms] thrpt: [296.27 MiB/s 298.32 MiB/s 300.83 MiB/s] Found 1 outliers among 10 measurements (10.00%) 1 (10.00%) low mild ingest-big-values/ingest 128MB/8k seq time: [373.03 ms 375.84 ms 379.17 ms] thrpt: [337.58 MiB/s 340.57 MiB/s 343.13 MiB/s] Found 1 outliers among 10 measurements (10.00%) 1 (10.00%) high mild ingest-big-values/ingest 128MB/8k seq, no delta time: [81.534 ms 82.811 ms 83.364 ms] thrpt: [1.4994 GiB/s 1.5095 GiB/s 1.5331 GiB/s] Found 1 outliers among 10 measurements (10.00%) ```	2024-08-06 16:39:40 +00:00
John Spray	3727c6fbbe	pageserver: use layer visibility when composing heatmap (#8616 ) ## Problem Sometimes, a layer is Covered but hasn't yet been evicted from local disk (e.g. shortly after image layer generation). It is not a good use of resources to download these to a secondary location, as there's a good chance they will never be read. This follows the previous change that added layer visibility: - #8511 Part of epic: - https://github.com/neondatabase/neon/issues/8398 ## Summary of changes - When generating heatmaps, only include Visible layers - Update test_secondary_downloads to filter to visible layers when listing layers from an attached location	2024-08-06 17:15:40 +01:00
John Spray	42229aacf6	pageserver: fixes for layer visibility metric (#8603 ) ## Problem In staging, we could see that occasionally tenants were wrapping their pageserver_visible_physical_size metric past zero to 2^64. This is harmless right now, but will matter more later when we start using visible size in things like the /utilization endpoint. ## Summary of changes - Add debug asserts that detect this case. `test_gc_of_remote_layers` works as a reproducer for this issue once the asserts are added. - Tighten up the interface around access_stats so that only Layer can mutate it. - In Layer, wrap calls to `record_access` in code that will update the visible size statistic if the access implicitly marks the layer visible (this was what caused the bug) - In LayerManager::rewrite_layers, use the proper set_visibility layer function instead of directly using access_stats (this is an additional path where metrics could go bad.) - Removed unused instances of LayerAccessStats in DeltaLayer and ImageLayer which I noticed while reviewing the code paths that call record_access.	2024-08-06 14:47:01 +01:00
John Spray	b7beaa0fd7	tests: improve stability of `test_storage_controller_many_tenants` (#8607 ) ## Problem The controller scale test does random migrations. These mutate secondary locations, and therefore can cause secondary optimizations to happen in the background, violating the test's expectation that consistency_check will work as there are no reconciliations running. Example: https://neon-github-public-dev.s3.amazonaws.com/reports/main/10247161379/index.html#suites/07874de07c4a1c9effe0d92da7755ebf/6316beacd3fb3060/ ## Summary of changes - Only migrate to existing secondary locations, not randomly picked nodes, so that we can do a fast reconcile_until_idle (otherwise reconcile_until_idle takes a long time to create new secondary locations). - Do a reconcile_until_idle before consistency_check.	2024-08-06 12:58:33 +01:00
a-masterov	16c91ff5d3	enable rum test (#8380 ) ## Problem We need to test the rum extension automatically as a part of the GitHub workflow ## Summary of changes rum test is enabled	2024-08-06 13:56:42 +02:00
a-masterov	078f941dc8	Add a test using Debezium as a client for the logical replication (#8568 ) ## Problem We need to test the logical replication with some external consumers. ## Summary of changes A test of the logical replication with Debezium as a consumer was added. --------- Co-authored-by: Alexander Bayandin <alexander@neon.tech>	2024-08-06 13:08:55 +02:00
Arseny Sher	68bcbf8227	Add package-mode=false to poetry. We don't use it for packaging, and 'poetry install' will soon error otherwise. Also remove name and version fields as these are not required for non-packaging mode.	2024-08-06 13:53:23 +03:00
Arpad Müller	a31c95cb40	storage_scrubber: migrate scan_safekeeper_metadata to remote_storage (#8595 ) Migrates the safekeeper-specific parts of `ScanMetadata` to GenericRemoteStorage, making it Azure-ready. Part of https://github.com/neondatabase/neon/issues/7547	2024-08-06 10:51:39 +00:00
Joonas Koivunen	dc7eb5ae5a	chore: bump index part version (#8611 ) #8600 missed the hunk changing index_part.json informative version. Include it in this PR, in addition add more non-warning index_part.json versions to scrubber.	2024-08-06 11:45:41 +01:00
Vlad Lazar	44fedfd6c3	pageserver: remove legacy read path (#8601 ) ## Problem We have been maintaining two read paths (legacy and vectored) for a while now. The legacy read-path was only used for cross validation in some tests. ## Summary of changes * Tweak all tests that were using the legacy read path to use the vectored read path instead * Remove the read path dispatching based on the pageserver configs * Remove the legacy read path code We will be able to remove the single blob io code in `pageserver/src/tenant/blob_io.rs` when https://github.com/neondatabase/neon/issues/7386 is complete. Closes https://github.com/neondatabase/neon/issues/8005	2024-08-06 10:14:01 +01:00
Joonas Koivunen	138f008bab	feat: persistent gc blocking (#8600 ) Currently, we do not have facilities to persistently block GC on a tenant for whatever reason. We could do a tenant configuration update, but that is risky for generation numbers and would also be transient. Introduce a `gc_block` facility in the tenant, which manages per timeline blocking reasons. Additionally, add HTTP endpoints for enabling/disabling manual gc blocking for a specific timeline. For debugging, individual tenant status now includes a similar string representation logged when GC is skipped. Cc: #6994	2024-08-06 10:09:56 +01:00
Joonas Koivunen	6a6f30e378	fix: make Timeline::set_disk_consistent_lsn use fetch_max (#8311 ) now it is safe to use from multiple callers, as we have two callers.	2024-08-06 08:52:01 +01:00
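
The last commit above (6a6f30e378) switches `Timeline::set_disk_consistent_lsn` to `fetch_max` so that it is safe with more than one caller. A minimal sketch of that idea follows, assuming the LSN is held in an `AtomicU64`; the `DiskConsistentLsn` wrapper, its method names, and `concurrent_advance_demo` are hypothetical illustrations and not the actual pageserver types.

```rust
use std::sync::atomic::{AtomicU64, Ordering};
use std::thread;

/// Hypothetical stand-in for the timeline's disk-consistent LSN field.
/// The real Neon type may differ; this only illustrates `fetch_max`.
struct DiskConsistentLsn(AtomicU64);

impl DiskConsistentLsn {
    fn new(lsn: u64) -> Self {
        Self(AtomicU64::new(lsn))
    }

    /// Atomically raise the stored LSN to `new_lsn` if it is larger.
    /// Returns the value that was stored before the call. A smaller
    /// `new_lsn` leaves the stored value untouched, so the LSN never
    /// moves backwards regardless of how callers interleave.
    fn advance(&self, new_lsn: u64) -> u64 {
        self.0.fetch_max(new_lsn, Ordering::AcqRel)
    }

    fn load(&self) -> u64 {
        self.0.load(Ordering::Acquire)
    }
}

/// Two callers race to advance the LSN; the larger value always wins.
fn concurrent_advance_demo() -> u64 {
    let lsn = DiskConsistentLsn::new(100);
    thread::scope(|s| {
        s.spawn(|| lsn.advance(300));
        s.spawn(|| lsn.advance(200));
    });
    lsn.load()
}
```

With a plain load-compare-store sequence, two concurrent callers could overwrite a newer LSN with an older one; `fetch_max` performs the comparison and the update as a single atomic step.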

5833 Commits