rust/neon

mirror of https://github.com/neondatabase/neon.git synced 2026-05-31 12:00:42 +00:00

Author	SHA1	Message	Date
Conrad Ludgate	686b3c79c8	http2 alpn (#6815 ) ## Problem Proxy already supported HTTP2, but I expect no one is using it because we don't advertise it in the TLS handshake. ## Summary of changes #6335 without the websocket changes.	2024-02-20 10:44:46 +00:00
John Spray	02a8b7fbe0	storage controller: issue timeline create/delete calls concurrently (#6827 ) ## Problem Timeline creation is meant to be very fast: it should only take approximately one S3 PUT latency. When we have many shards in a tenant, we should preserve that responsiveness. ## Summary of changes - Issue create/delete pageserver API calls concurrently across all >0 shards - During tenant deletion, delete shard zero last, separately, to avoid confusing anything using GETs on the timeline. - Return 201 instead of 200 on creations to make cloud control plane happy --------- Co-authored-by: Arpad Müller <arpad-m@users.noreply.github.com>	2024-02-20 10:13:21 +00:00
Alexander Bayandin	feb359b459	CI: Update deprecated GitHub Actions (#6822 ) ## Problem We use a bunch of deprecated actions. See https://github.com/neondatabase/neon/actions/runs/7958569728 (Annotations section) ``` Node.js 16 actions are deprecated. Please update the following actions to use Node.js 20: actions/checkout@v3, actions/setup-java@v3, actions/cache@v3, actions/github-script@v6. For more information see: https://github.blog/changelog/2023-09-22-github-actions-transitioning-from-node-16-to-node-20/. ``` ## Summary of changes - `actions/cache@v3` -> `actions/cache@v4` - `actions/checkout@v3` -> `actions/checkout@v4` - `actions/github-script@v6` -> `actions/github-script@v7` - `actions/setup-java@v3` -> `actions/setup-java@v4` - `actions/upload-artifact@v3` -> `actions/upload-artifact@v4`	2024-02-19 21:46:22 +00:00
John Spray	0c105ef352	storage controller: debug observability endpoints and self-test (#6820 ) This PR stacks on https://github.com/neondatabase/neon/pull/6814 Observability: - Because we only persist a subset of our state, and our external API is pretty high level, it can be hard to get at the detail of what's going on internally (e.g. the IntentState of a shard). - Add debug endpoints for getting a full dump of all TenantState and SchedulerNode objects - Enrich the /control/v1/node listing endpoint to include full in-memory detail of `Node` rather than just the `NodePersistence` subset Consistency checks: - The storage controller maintains separate in-memory and on-disk states, by design. To catch subtle bugs, it is useful to occasionally cross-check these. - The Scheduler maintains reference counts for shard->node relationships, which could drift if there was a bug in IntentState: exhaustively cross-check them in tests.	2024-02-19 20:29:23 +00:00
John Spray	4f7704af24	storage controller: fix spurious reconciles after pageserver restarts (#6814 ) ## Problem When investigating test failures (https://github.com/neondatabase/neon/issues/6813) I noticed we were doing a bunch of Reconciler runs right after splitting a tenant. It's because the splitting test does a pageserver restart, and there was a bug in /re-attach handling, where we would update the generation correctly in the database and intent state, but not observed state, thereby triggering a reconciliation on the next call to maybe_reconcile. This didn't break anything profound (underlying rules about generations were respected), but caused the storage controller to do an un-needed extra round of bumping the generation and reconciling. ## Summary of changes - Start adding metrics to the storage controller - Assert on the number of reconciles done in test_sharding_split_smoke - Fix /re-attach to update `observed` such that we don't spuriously re-reconcile tenants.	2024-02-19 17:44:20 +00:00
Arpad Müller	e0c12faabd	Allow initdb preservation for broken tenants (#6790 ) Often times the tenants we want to (WAL) DR are the ones which the pageserver marks as broken. Therefore, we should allow initdb preservation also for broken tenants. Fixes #6781.	2024-02-19 17:27:02 +01:00
John Spray	2f8a2681b8	pageserver: ensure we never try to save empty delta layer (#6805 ) ## Problem Sharded tenants could panic during compaction when they try to generate an L1 delta layer for a region that contains no keys on a particular shard. This is a variant of https://github.com/neondatabase/neon/issues/6755, where we attempt to save a delta layer with no keys. It is harder to reproduce than the case of image layers fixed in https://github.com/neondatabase/neon/pull/6776. It will become even less likely once https://github.com/neondatabase/neon/pull/6778 tweaks keyspace generation, but even then, we should not rely on keyspace partitioning to guarantee at least one stored key in each partition. ## Summary of changes - Move construction of `writer` in `compact_level0_phase1`, so that we never leave a writer constructed but without any keys.	2024-02-19 15:07:07 +00:00
John Spray	7e4280955e	control_plane/attachment_service: improve Scheduler (#6633 ) ## Problem One of the major shortcuts in the initial version of this code was to construct a fresh `Scheduler` each time we need it, which is an O(N^2) cost as the tenant count increases. ## Summary of changes - Keep `Scheduler` alive through the lifetime of ServiceState - Use `IntentState` as a reference tracking helper, updating Scheduler refcounts as nodes are added/removed from the intent. There is an automated test that checks things don't get pathologically slow with thousands of shards, but it's not included in this PR because tests that implicitly test the runner node performance take some thought to stabilize/land in CI.	2024-02-19 14:12:20 +00:00
John Spray	349b375010	pageserver: remove heatmap file during tenant delete (#6806 ) ## Problem Secondary mode locations keep a local copy of the heatmap, which needs cleaning up during deletion. Closes: https://github.com/neondatabase/neon/issues/6802 ## Summary of changes - Extend test_live_migration to reproduce the issue - Remove heatmap-v1.json during tenant deletion	2024-02-19 14:01:36 +00:00
Conrad Ludgate	d0d4871682	proxy: use postgres_protocol scram/sasl code (#4748 ) 1) `scram::password` was used in tests only. can be replaced with `postgres_protocol::password`. 2) `postgres_protocol::authentication::sasl` provides a client impl of SASL which improves our ability to test	2024-02-19 12:54:17 +00:00
Vlad Lazar	587cb705b8	pageserver: roll open layer in timeline writer (#6661 ) ## Problem One WAL record can actually produce an arbitrary amount of key value pairs. This is problematic since it might cause our frozen layers to bloat past the max allowed size of S3 single shot uploads. [#6639](https://github.com/neondatabase/neon/pull/6639) introduced a "should roll" check after every batch of `ingest_batch_size` (100 WAL records by default). This helps, but the original problem still exists. ## Summary of changes This patch moves the responsibility of rolling the currently open layer to the `TimelineWriter`. Previously, this was done ad-hoc via calls to `check_checkpoint_distance`. The advantages of this approach are: * ability to split one batch over multiple open layers * less layer map locking * remove ad-hoc check_checkpoint_distance calls More specifically, we track the current size of the open layer in the writer. On each `put` check whether the current layer should be closed and a new one opened. Keeping track of the currently open layer results in less contention on the layer map lock. It only needs to be acquired on the first write and on writes that require a roll afterwards. Rolling the open layer can be triggered by: 1. The distance from the last LSN we rolled at. This bounds the amount of WAL that the safekeepers need to store. 2. The size of the currently open layer. 3. The time since the last roll. It helps safekeepers to regard pageserver as caught up and suspend activity. Closes #6624	2024-02-19 12:34:27 +00:00
Alexander Bayandin	4d2bf55e6c	CI: temporary disable coverage report for regression tests (#6798 ) ## Problem The merging coverage data step recently started to be too flaky. This failure blocks staging deployment and along with the flakiness of regression tests might require 4-5-6 manual restarts of a CI job. Refs: - https://github.com/neondatabase/neon/issues/4540 - https://github.com/neondatabase/neon/issues/6485 - https://neondb.slack.com/archives/C059ZC138NR/p1704131143740669 ## Summary of changes - Disable code coverage report for functional tests	2024-02-19 11:07:27 +00:00
John Spray	5667372c61	pageserver: during shard split, wait for child to activate (#6789 ) ## Problem test_sharding_split_unsharded was flaky with log errors from tenants not being active. This was happening when the split function enters wait_lsn() while the child shard might still be activating. It's flaky rather than an outright failure because activation is usually very fast. This is also a real bug fix, because in realistic scenarios we could proceed to detach the parent shard before the children are ready, leading to an availability gap for clients. ## Summary of changes - Do a short wait_to_become_active on the child shards before proceeding to wait for their LSNs to advance --------- Co-authored-by: Arpad Müller <arpad-m@users.noreply.github.com>	2024-02-18 15:55:19 +00:00
Alexander Bayandin	61f99d703d	test_create_snapshot: do not try to copy pg_dynshmem dir (#6796 ) ## Problem `test_create_snapshot` is flaky[0] on CI and fails constantly on macOS, but with a slightly different error: ``` shutil.Error: [('/Users/bayandin/work/neon/test_output/test_create_snapshot[release-pg15-1-100]/repo/endpoints/ep-1/pgdata/pg_dynshmem', '/Users/bayandin/work/neon/test_output/compatibility_snapshot_pgv15/repo/endpoints/ep-1/pgdata/pg_dynshmem', "[Errno 2] No such file or directory: '/Users/bayandin/work/neon/test_output/test_create_snapshot[release-pg15-1-100]/repo/endpoints/ep-1/pgdata/pg_dynshmem'")] ``` Also (on macOS) `repo/endpoints/ep-1/pgdata/pg_dynshmem` is a symlink to `/dev/shm/`. - [0] https://github.com/neondatabase/neon/issues/6784 ## Summary of changes Ignore `pg_dynshmem` directory while copying a snapshot	2024-02-18 12:16:07 +00:00
John Spray	24014d8383	pageserver: fix sharding emitting empty image layers during compaction (#6776 ) ## Problem Sharded tenants would sometimes try to write empty image layers during compaction: this was more noticeable on larger databases. - https://github.com/neondatabase/neon/issues/6755 Note to reviewers: the last commit is a refactor that de-indents a whole block, I recommend reviewing the earlier commits one by one to see the real changes ## Summary of changes - Fix a case where when we drop a key during compaction, we might fail to write out keys (this was broken when vectored get was added) - If an image layer is empty, then do not try and write it out, but leave `start` where it is so that if the subsequent key range meets criteria for writing an image layer, we will extend its key range to cover the empty area. - Add a compaction test that configures small layers and compaction thresholds, and asserts that we really successfully did image layer generation. This fails before the fix.	2024-02-18 08:51:12 +00:00
Konstantin Knizhnik	e3ded64d1b	Support pg-ivm extension (#6793 ) ## Problem See https://github.com/neondatabase/cloud/issues/10268 ## Summary of changes Add pg_ivm extension ## Checklist before requesting a review - [ ] I have performed a self-review of my code. - [ ] If it is a core feature, I have added thorough tests. - [ ] Do we need to implement analytics? if so did you add the relevant metrics to the dashboard? - [ ] If this PR requires public announcement, mark it with /release-notes label and add several sentences in this section. ## Checklist before merging - [ ] Do not forget to reformat commit message to not include the above checklist --------- Co-authored-by: Konstantin Knizhnik <knizhnik@neon.tech> Co-authored-by: Alexander Bayandin <alexander@neon.tech>	2024-02-17 22:13:25 +02:00
dependabot[bot]	9b714c8572	build(deps): bump cryptography from 42.0.0 to 42.0.2 (#6792 )	2024-02-17 19:15:21 +00:00
Alex Chi Z	29fb675432	Revert "fix superuser permission check for extensions (#6733 )" (#6791 ) This reverts commit `9ad940086c`. This pull request reverts #6733 to avoid incompatibility with pgvector and I will push further fixes later. Note that after reverting this pull request, the postgres submodule will point to some detached branches.	2024-02-16 20:50:09 +00:00
Christian Schwarz	ca07fa5f8b	per-TenantShard read throttling (#6706 )	2024-02-16 21:26:59 +01:00
John Spray	5d039c6e9b	libs: add 'generations_api' auth scope (#6783 ) ## Problem Even if you're not enforcing auth, the JwtAuth middleware barfs on scopes it doesn't know about. Add `generations_api` scope, which was invented in the cloud control plane for the pageserver's /re-attach and /validate upcalls: this will be enforced in storage controller's implementation of these in a later PR. Unfortunately the scope's naming doesn't match the other scope's naming styles, so needs a manual serde decorator to give it an underscore. ## Summary of changes - Add `Scope::GenerationsApi` variant - Update pageserver + safekeeper auth code to print appropriate message if they see it.	2024-02-16 15:53:09 +00:00
Calin Anca	36e1100949	bench_walredo: use tokio multi-threaded runtime (#6743 ) fixes https://github.com/neondatabase/neon/issues/6648 Co-authored-by: Christian Schwarz <christian@neon.tech>	2024-02-16 16:31:54 +01:00
Alexander Bayandin	59c5b374de	test_pageserver_max_throughput_getpage_at_latest_lsn: disable on CI (#6785 ) ## Problem `test_pageserver_max_throughput_getpage_at_latest_lsn` is flaky, which makes CI status red pretty frequently. `benchmarks` is not a blocking job (doesn't block `deploy`), so having it red might hide failures in other jobs. Ref: https://github.com/neondatabase/neon/issues/6724 ## Summary of changes - Disable `test_pageserver_max_throughput_getpage_at_latest_lsn` on CI until it is fixed	2024-02-16 15:30:04 +00:00
Arpad Müller	0f3b87d023	Add test for pageserver_directory_entries_count metric (#6767 ) Adds a simple test to ensure the metric works. The test creates a bunch of relations to activate the metric. Follow-up of #6736	2024-02-16 14:53:36 +00:00
Konstantin Knizhnik	c19625a29c	Support sharding for compute_ctl (#6787 ) ## Problem See https://github.com/neondatabase/neon/issues/6786 ## Summary of changes Split connection string in compute.rs when requesting basebackup	2024-02-16 14:50:09 +00:00
John Spray	f2e5212fed	storage controller: background reconcile, graceful shutdown, better logging (#6709 ) ## Problem Now that the storage controller is working end to end, we start burning down the robustness aspects. ## Summary of changes - Add a background task that periodically calls `reconcile_all`. This ensures that if earlier operations couldn't succeed (e.g. because a node was unavailable), we will eventually retry. This is a naive initial implementation that can start an unlimited number of reconcile tasks: limiting reconcile concurrency is a later item in #6342 - Add a number of tracing spans in key locations: each background task, each reconciler task. - Add a top level CancellationToken and Gate, and use these to implement a graceful shutdown that waits for tasks to shut down. This is not bulletproof yet, because within these tasks we have remote HTTP calls that aren't wrapped in cancellation/timeouts, but it creates the structure, and if we don't shut down promptly then k8s will kill us. - To protect shard splits from background reconciliation, expose the `SplitState` in memory and use it to guard any APIs that require an attached tenant.	2024-02-16 13:00:53 +00:00
Christian Schwarz	568bc1fde3	fix(build): production flamegraphs are useless (#6764 )	2024-02-16 10:12:34 +00:00
Christian Schwarz	45e929c069	stop reading local `metadata` file (#6777 )	2024-02-16 09:35:11 +00:00
John Spray	6b980f38da	libs: refactor ShardCount.0 to private (#6690 ) ## Problem The ShardCount type has a magic '0' value that represents a legacy single-sharded tenant, whose TenantShardId is formatted without a `-0001` suffix (i.e. formatted as a traditional TenantId). This was error-prone in code locations that wanted the actual number of shards: they had to handle the 0 case specially. ## Summary of changes - Make the internal value of ShardCount private, and expose `count()` and `literal()` getters so that callers have to explicitly say whether they want the literal value (e.g. for storing in a TenantShardId), or the actual number of shards in the tenant. --------- Co-authored-by: Arpad Müller <arpad-m@users.noreply.github.com>	2024-02-15 21:59:39 +00:00
MMeent	f0d8bd7855	Update Makefile (#6779 ) This fixes issues where `neon-pg-ext-clean-vYY` is used as target and resolves using the `neon-pg-ext-%` template with `$` resolving as `clean-vYY`, for older versions of GNU Make, rather than `neon-pg-ext-clean-%` using `$` = `vYY` ## Problem ``` $ make clean ... rm -f pg_config_paths.h Compiling neon clean-v14 mkdir -p /Users/<user>/neon-build//pg_install//build/neon-clean-v14 /Applications/Xcode.app/Contents/Developer/usr/bin/make PG_CONFIG=/Users/<user>/neon-build//pg_install//clean-v14/bin/pg_config CFLAGS='-O0 -g3 ' \ -C /Users/<user>/neon-build//pg_install//build/neon-clean-v14 \ -f /Users/<user>/neon-build//pgxn/neon/Makefile install make[1]: /Users/<user>/neon-build//pg_install//clean-v14/bin/pg_config: Command not found make[1]: * No rule to make target `install'. Stop. make: * [neon-pg-ext-clean-v14] Error 2 ```	2024-02-15 19:48:50 +00:00
Joonas Koivunen	046d9c69e6	fix: require wider jwt for changing the io engine (#6770 ) io-engine should not be changeable with any JWT token, for example the tenant_id scoped token which computes have.	2024-02-15 16:58:26 +00:00
Alexander Bayandin	c72cb44213	test_runner/performance: parametrize benchmarks (#6744 ) ## Problem Currently, we don't store `PLATFORM` for Nightly Benchmarks. It causes them to be merged as reruns in Allure report (because they have the same test name). ## Summary of changes - Parametrize benchmarks by - Postgres Version (14/15/16) - Build Type (debug/release/remote) - PLATFORM (neon-staging/github-actions-selfhosted/...) --------- Co-authored-by: Bodobolero <peterbendel@neon.tech>	2024-02-15 15:53:58 +00:00
Arpad Müller	cd3e4ac18d	Rename TEST_IMG function to test_img (#6762 ) The latter follows the canonical way of naming functions in Rust.	2024-02-15 15:14:51 +00:00
Alex Chi Z	9ad940086c	fix superuser permission check for extensions (#6733 ) close https://github.com/neondatabase/neon/issues/6236 This pull request bumps neon postgres dependencies. The corresponding postgres commits fix the checks for superuser permission when creating an extension. Also, for creating native functions, it now allows neon_superuser only in the extension creation process. --------- Signed-off-by: Alex Chi Z <chi@neon.tech> Co-authored-by: Heikki Linnakangas <heikki@neon.tech>	2024-02-15 14:59:13 +00:00
Joonas Koivunen	936f2ee2a5	fix: accidental wide span in tests (#6772 ) introduced in a PR without other #[tracing::instrument] changes.	2024-02-15 13:48:44 +00:00
Heikki Linnakangas	1af047dd3e	Fix typo in CI message (#6749 )	2024-02-15 14:34:19 +02:00
John Spray	5fa747e493	pageserver: shard splitting refinements (parent deletion, hard linking) (#6725 ) ## Problem - We weren't deleting parent shard contents once the split was done - Re-downloading layers into child shards is wasteful ## Summary of changes - Hard-link layers into child shard local storage during split - Delete the parent shard's content at the end --------- Co-authored-by: Joonas Koivunen <joonas@neon.tech>	2024-02-15 10:21:53 +02:00
Joonas Koivunen	80854b98ff	move timeouts and cancellation handling to remote_storage (#6697 ) Cancellation and timeouts are handled at remote_storage callsites, if they are. However they should always be handled, because we've had transient problems with remote storage connections. - Add cancellation token to the `trait RemoteStorage` methods - For `download`, `list` methods there is `DownloadError::{Cancelled,Timeout}` - For the rest now using `anyhow::Error`, it will have root cause `remote_storage::TimeoutOrCancel::{Cancel,Timeout}` - Both types have `::is_permanent` equivalent which should be passed to `backoff::retry` - New generic RemoteStorageConfig option `timeout`, defaults to 120s - Start counting timeouts only after acquiring concurrency limiter permit - Cancellable permit acquiring - Download stream timeout or cancellation is communicated via an `std::io::Error` - Exit backoff::retry by marking cancellation errors permanent Fixes: #6096 Closes: #4781 Co-authored-by: arpad-m <arpad-m@users.noreply.github.com>	2024-02-14 23:24:07 +00:00
Christian Schwarz	024372a3db	Revert "refactor(VirtualFile::crashsafe_overwrite): avoid Handle::block_on in callers" (#6765 ) Reverts neondatabase/neon#6731 On high tenant count Pageservers in staging, memory and CPU usage shoots to 100% with this change. (NB: staging currently has tokio-epoll-uring enabled) Will analyze tomorrow. https://neondb.slack.com/archives/C03H1K0PGKH/p1707933875639379?thread_ts=1707929541.125329&cid=C03H1K0PGKH	2024-02-14 19:17:12 +00:00
Shayan Hosseini	fff2468aa2	Add resource consume test funcs (#6747 ) ## Problem Building on #5875 to add handy test functions for autoscaling. Resolves #5609 ## Summary of changes This PR makes the following changes to #5875: - Enable `neon_test_utils` extension in the compute node docker image, so we could use it in the e2e tests (as discussed with @kelvich). - Removed test functions related to disk as we don't use them for autoscaling. - Fix the warning with printf-ing unsigned long variables. --------- Co-authored-by: Heikki Linnakangas <heikki@neon.tech>	2024-02-14 18:45:05 +00:00
Anna Khanova	c7538a2c20	Proxy: remove fail fast logic to connect to compute (#6759 ) ## Problem Flaky tests ## Summary of changes Remove failfast logic	2024-02-14 18:43:52 +00:00
Arpad Müller	a2d0d44b42	Remove unused allow's (#6760 ) These allow's became redundant some time ago so remove them, or address them if addressing is very simple.	2024-02-14 18:16:05 +00:00
Christian Schwarz	7d3cdc05d4	fix(pageserver): pagebench doesn't work with released artifacts (#6757 ) The canonical release artifact of neon.git is the Docker image with all the binaries in them: ``` docker pull neondatabase/neon:release-4854 docker create --name extract neondatabase/neon:release-4854 docker cp extract:/usr/local/bin/pageserver ./pageserver.release-4854 chmod +x pageserver.release-4854 cp -a pageserver.release-4854 ./target/release/pageserver ``` Before this PR, these artifacts didn't expose the `keyspace` API, thereby preventing `pagebench get-page-latest-lsn` from working. Having working pagebench is useful, e.g., for experiments in staging. So, expose the API, but don't document it, as it's not part of the interface with control plane.	2024-02-14 17:01:15 +00:00
John Spray	840abe3954	pageserver: store aux files as deltas (#6742 ) ## Problem Aux files were stored with an O(N^2) cost, since on each modification the entire map is re-written as a page image. This addresses one axis of the inefficiency in logical replication's use of storage (https://github.com/neondatabase/neon/issues/6626). It will still be writing a large amount of duplicative data if writing the same slot's state every 15 seconds, but the impact will be O(N) instead of O(N^2). ## Summary of changes - Introduce `NeonWalRecord::AuxFile` - In `DatadirModification`, if the AUX_FILES_KEY has already been set, then write a delta instead of an image	2024-02-14 15:01:16 +00:00
Christian Schwarz	774a6e7475	refactor(virtual_file) make write_all_at take owned buffers (#6673 ) context: https://github.com/neondatabase/neon/issues/6663 Building atop #6664, this PR switches `write_all_at` to take owned buffers. The main challenge here is the `EphemeralFile::mutable_tail`, for which I'm picking the ugly solution of an `Option` that is `None` while the IO is in flight. After this, we will be able to switch `write_at` to take owned buffers and call tokio-epoll-uring's `write` function with that owned buffer. That'll be done in #6378.	2024-02-14 15:59:06 +01:00
Christian Schwarz	df5d588f63	refactor(VirtualFile::crashsafe_overwrite): avoid Handle::block_on in callers (#6731 ) Some callers of `VirtualFile::crashsafe_overwrite` call it on the executor thread, thereby potentially stalling it. Others are more diligent and wrap it in `spawn_blocking(..., Handle::block_on, ... )` to avoid stalling the executor thread. However, because `crashsafe_overwrite` uses VirtualFile::open_with_options internally, we spawn a new thread-local `tokio-epoll-uring::System` in the blocking pool thread that's used for the `spawn_blocking` call. This PR refactors the situation such that we do the `spawn_blocking` inside `VirtualFile::crashsafe_overwrite`. This unifies the situation for the better: 1. Callers who didn't wrap in `spawn_blocking(..., Handle::block_on, ...)` before no longer stall the executor. 2. Callers who did it before now can avoid the `block_on`, resolving the problem with the short-lived `tokio-epoll-uring::System`s in the blocking pool threads. A future PR will build on top of this and divert to tokio-epoll-uring if it's configured as the IO engine. Changes ------- - Convert implementation to std::fs and move it into `crashsafe.rs` - Yes, I know, Safekeepers (cc @arssher ) added `durable_rename` and `fsync_async_opt` recently. However, `crashsafe_overwrite` is different in the sense that it's higher level, i.e., it's more like `std::fs::write` and the Safekeeper team's code is more building block style. - The consequence is that we don't use the VirtualFile file descriptor cache anymore. - I don't think it's a big deal because we have plenty of slack wrt production file descriptor limit rlimit (see [this dashboard](https://neonprod.grafana.net/d/e4a40325-9acf-4aa0-8fd9-f6322b3f30bd/pageserver-open-file-descriptors?orgId=1)) - Use `tokio::task::spawn_blocking` in `VirtualFile::crashsafe_overwrite` to call the new `crashsafe::overwrite` API. - Inspect all callers to remove any double-`spawn_blocking` - spawn_blocking requires the captured data to be 'static + Send. So, refactor the callers. We'll need this for future tokio-epoll-uring support anyway, because tokio-epoll-uring requires owned buffers. Related Issues -------------- - overall epic to enable write path to tokio-epoll-uring: #6663 - this is also kind of relevant to the tokio-epoll-uring System creation failures that we encountered in staging, investigation being tracked in #6667 - why is it relevant? Because this PR removes two uses of `spawn_blocking+Handle::block_on`	2024-02-14 14:22:41 +00:00
John Spray	f39b0fce9b	Revert #6666 "tests: try to make restored-datadir comparison tests not flaky" (#6751 ) The #6666 change appears to have made the test fail more often. PR https://github.com/neondatabase/neon/pull/6712 should re-instate this change, along with its change to make the overall flow more reliable. This reverts commit `568f91420a`.	2024-02-14 10:57:01 +00:00
Conrad Ludgate	a9ec4eb4fc	hold cancel session (#6750 ) ## Problem In a recent refactor, we accidentally dropped the cancel session early ## Summary of changes Hold the cancel session during proxy passthrough	2024-02-14 10:26:32 +00:00
Heikki Linnakangas	a97b54e3b9	Cherry-pick Postgres bugfix to 'mmap' DSM implementation Cherry-pick Upstream commit fbf9a7ac4d to neon stable branches. We'll get it in the next PostgreSQL minor release anyway, but we need it now, if we want to start using the 'mmap' implementation. See https://github.com/neondatabase/autoscaling/issues/800 for the plans on doing that.	2024-02-14 11:37:52 +02:00
Heikki Linnakangas	a5114a99b2	Create a symlink from pg_dynshmem to /dev/shm See included comment and issue https://github.com/neondatabase/autoscaling/issues/800 for details. This has no effect, unless you set "dynamic_shared_memory_type = mmap" in postgresql.conf.	2024-02-14 11:37:52 +02:00
Arpad Müller	ee7bbdda0e	Create new metric for directory counts (#6736 ) There are O(n^2) issues due to how we store these directories (#6626), so it's good to keep an eye on them and ensure the numbers stay low. The new per-timeline metric `pageserver_directory_entries_count` isn't perfect, namely we don't calculate it every time we attach the timeline, but only if there is an actual change. Also, it is a collective metric over multiple scalars. Lastly, we only emit the metric if it is above a certain threshold. However, the metric still gives a feel for the general size of the timeline. We care less for small values as the metric is mainly there to detect and track tenants with large directory counts. We also expose the directory counts in `TimelineInfo` so that one can get the detailed size distribution directly via the pageserver's API. Related: #6642 , https://github.com/neondatabase/cloud/issues/10273	2024-02-14 02:12:00 +01:00
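
Illustration for commit 6b980f38da (#6690): the message describes hiding the raw `ShardCount` value and exposing `count()` and `literal()` getters so callers must say which meaning they want. The following is only a minimal sketch of that idea, not the code from the commit. The `u8` representation and the `new` constructor are assumptions made for the example.

```rust
/// Sketch of a shard-count newtype whose raw value is private.
/// The u8 representation is an assumption for this example.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub struct ShardCount(u8);

impl ShardCount {
    /// Hypothetical constructor. A raw value of 0 is the legacy marker
    /// for a single-sharded (unsharded) tenant.
    pub const fn new(raw: u8) -> Self {
        ShardCount(raw)
    }

    /// Actual number of shards in the tenant: the legacy 0 value
    /// still means there is exactly one shard.
    pub fn count(&self) -> u8 {
        if self.0 == 0 {
            1
        } else {
            self.0
        }
    }

    /// Raw stored value, e.g. for embedding in an identifier,
    /// where 0 and 1 must stay distinguishable.
    pub fn literal(&self) -> u8 {
        self.0
    }
}
```

With this shape, code that sizes a loop over shards would call `count()`, while code that formats or persists an identifier would call `literal()`, so the special case for 0 lives in one place.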

4661 Commits
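
Illustration for commit 587cb705b8 (#6661): the message lists three triggers for rolling the open layer (LSN distance since the last roll, size of the open layer, time since the last roll). The sketch below shows one way such a check could be written. All type names, field names, and the order in which triggers are evaluated are assumptions for the example and are not taken from the pageserver code.

```rust
use std::time::Duration;

/// Hypothetical thresholds for rolling the open layer.
pub struct RollPolicy {
    pub max_lsn_distance: u64,
    pub max_layer_size: u64,
    pub max_open_time: Duration,
}

/// Hypothetical snapshot of what the writer tracks about the open layer.
pub struct OpenLayerState {
    pub lsn_since_last_roll: u64,
    pub size_bytes: u64,
    pub time_since_last_roll: Duration,
}

/// Which of the three triggers fired.
#[derive(Debug, PartialEq, Eq)]
pub enum RollReason {
    LsnDistance,
    LayerSize,
    Timeout,
}

/// Returns the first trigger that fires, or None if the layer can stay open.
/// The evaluation order is an arbitrary choice for this sketch.
pub fn should_roll(policy: &RollPolicy, state: &OpenLayerState) -> Option<RollReason> {
    if state.lsn_since_last_roll >= policy.max_lsn_distance {
        return Some(RollReason::LsnDistance);
    }
    if state.size_bytes >= policy.max_layer_size {
        return Some(RollReason::LayerSize);
    }
    if state.time_since_last_roll >= policy.max_open_time {
        return Some(RollReason::Timeout);
    }
    None
}
```

A writer following the commit's description would run a check like this on each `put`, using the size it already tracks, and only take the layer map lock when the result is `Some(..)`.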