Refers to #5208.
## Problem
See
https://neondb.slack.com/archives/C03H1K0PGKH/p1693938336062439?thread_ts=1693928260.704799&cid=C03H1K0PGKH
#5208 disables the LFC forever in case of an error. This is not good, because
the problem causing the error (for example, ENOSPC) can be resolved, and it
would be nice to re-enable the LFC after the fix.
Also, #5208 disables the LFC locally, in one backend only, but other backends
may still see corrupted data.
This should not cause problems right now with a "permission denied" error,
because there should be no backend that is able to open the LFC file
normally. But in case of an out-of-disk-space error, other backends can read
corrupted data.
## Summary of changes
1. Clean up the hash table after an error to prevent access to stale or
corrupted data.
2. Perform disk writes under an exclusive lock (hoping this will not affect
performance, because a write usually just copies data from user space to
system space).
3. Use generations to prevent access to stale data in `lfc_read` (see the
sketch below).
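To illustrate item 3, here is a minimal sketch of the generation idea in Rust (the real implementation lives in the C compute code; all names here are hypothetical): resetting the cache bumps a generation counter, and a read result is discarded as a miss if the generation changed while the I/O ran without the lock held.
```rust
use std::sync::Mutex;

/// Hypothetical stand-in for the shared LFC control state.
struct CacheState {
    generation: u64,
    enabled: bool,
}

struct Cache {
    state: Mutex<CacheState>,
}

impl Cache {
    /// After an error such as ENOSPC: disable the cache and bump the
    /// generation so that concurrent readers discard their results.
    fn reset_after_error(&self) {
        let mut st = self.state.lock().unwrap();
        st.enabled = false;
        st.generation += 1;
    }

    /// Generation-checked read: if the generation changed while we
    /// performed I/O without holding the lock, the data may be stale,
    /// so treat it as a cache miss.
    fn read(&self, perform_io: impl Fn() -> Vec<u8>) -> Option<Vec<u8>> {
        let generation_before = {
            let st = self.state.lock().unwrap();
            if !st.enabled {
                return None;
            }
            st.generation
        };
        let data = perform_io(); // I/O happens without the lock held
        let st = self.state.lock().unwrap();
        if st.generation != generation_before {
            return None;
        }
        Some(data)
    }
}
```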
## Checklist before requesting a review
- [ ] I have performed a self-review of my code.
- [ ] If it is a core feature, I have added thorough tests.
- [ ] Do we need to implement analytics? if so did you add the relevant
metrics to the dashboard?
- [ ] If this PR requires public announcement, mark it with
/release-notes label and add several sentences in this section.
## Checklist before merging
- [ ] Do not forget to reformat commit message to not include the above
checklist
---------
Co-authored-by: Konstantin Knizhnik <knizhnik@neon.tech>
I forgot a `str(...)` conversion in #5243. This led to log lines such
as:
```
Using fs root 'PosixPath('/tmp/test_output/test_backward_compatibility[debug-pg14]/compatibility_snapshot/repo/local_fs_remote_storage/pageserver')' as a remote storage
```
Surprisingly, this works, creating a hierarchy under the current working
directory (`repo_dir` for tests):
- `PosixPath('`
- `tmp` .. up until .. `local_fs_remote_storage`
- `pageserver')`
It should not work, but right now the test_compatibility.py tests find local
metadata and layers, which end up being used. After #5172, when remote
storage becomes the source of truth, it will no longer work.
## Problem
A bunch of fixes for different test-related things
## Summary of changes
- Fix test_runner/pg_clients (the `subprocess_capture` return value has
changed)
- Do not run `create-test-report` if `check-permissions` failed, for
non-cancelled jobs
- Fix the Code Coverage comment layout after flaky tests. Add another
healing "\n"
- test_compatibility: add instructions for a local run
Co-authored-by: Joonas Koivunen <joonas@neon.tech>
## Problem
Currently our testing environment only supports running a single
pageserver at a time. This is insufficient for testing failover and
migrations.
- Dependency of writing tests for #5207
## Summary of changes
- `neon_local` and `neon_fixture` now handle multiple pageservers
- This is a breaking change to the `.neon/config` format: any local
environments will need recreating
- Existing tests continue to work unchanged:
- The default number of pageservers is 1
- `NeonEnv.pageserver` is now a helper property that returns the only
pageserver if there is exactly one, and throws otherwise (see the sketch
after this list).
- Pageserver data directories are now at `.neon/pageserver_{n}`, where n
is 1, 2, 3, ...
- Compatibility tests get some special casing to migrate neon_local
configs: these are not meant to be backward/forward compatible, but they
were treated that way by the test.
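For illustration, a minimal Rust sketch of the shape this takes on the `neon_local` side (names and fields hypothetical, not the actual code):
```rust
use std::path::PathBuf;

/// Hypothetical sketch of the multi-pageserver configuration shape.
struct PageServerConf {
    id: u32,
    listen_pg_addr: String,
}

struct LocalEnv {
    pageservers: Vec<PageServerConf>,
}

impl LocalEnv {
    /// Mirrors the `NeonEnv.pageserver` helper in the test fixtures:
    /// only valid when exactly one pageserver is configured.
    fn single_pageserver(&self) -> &PageServerConf {
        match self.pageservers.as_slice() {
            [only] => only,
            others => panic!("environment has {} pageservers", others.len()),
        }
    }

    /// Data directory for a pageserver, following `.neon/pageserver_{n}`.
    fn pageserver_data_dir(&self, n: u32) -> PathBuf {
        PathBuf::from(format!(".neon/pageserver_{n}"))
    }
}
```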
See
https://neondb.slack.com/archives/C03H1K0PGKH/p1692550646191429
## Problem
A concurrent index build writes WAL outside of a transaction.
`backpressure_throttling_impl` doesn't perform throttling for read-only
transactions (no XID assigned).
This causes a huge write lag, which can cause a large delay in accessing
the table.
## Summary of changes
Look at the `PROC_IN_SAFE_IC` flag in the process status, which is set
during a concurrent index build.
## Checklist before requesting a review
- [ ] I have performed a self-review of my code.
- [ ] If it is a core feature, I have added thorough tests.
- [ ] Do we need to implement analytics? if so did you add the relevant
metrics to the dashboard?
- [ ] If this PR requires public announcement, mark it with
/release-notes label and add several sentences in this section.
## Checklist before merging
- [ ] Do not forget to reformat commit message to not include the above
checklist
---------
Co-authored-by: Konstantin Knizhnik <knizhnik@neon.tech>
Co-authored-by: Heikki Linnakangas <heikki@neon.tech>
Prepare to upgrade the Rust version to the latest stable.
- `rustfmt` has learned to format `let irrefutable = $expr else { ...
};` blocks (example below)
- There's a new warning about the virtual (workspace) crate resolver; picked
the latest resolver, as I suspect everyone would expect it to be the
latest; it should not matter anyway
- Some new clippy lints, which seem alright
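For reference, the newly formatted construct looks like this (a stand-alone example, not code from this repo):
```rust
fn first_line(text: &str) -> Result<&str, &'static str> {
    // rustfmt can now format `let ... else { ... };` blocks like this one.
    let Some(line) = text.lines().next() else {
        return Err("input is empty");
    };
    Ok(line)
}
```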
Remote storage cleanup split from #5198:
- pageserver, extensions, and safekeepers now have their separate remote
storage
- RemoteStorageKind has the configuration code
- S3Storage has the cleanup code
- with MOCK_S3, pageserver, extensions, safekeepers use different
buckets
- with LOCAL_FS, `repo_dir / "local_fs_remote_storage" / $user` is used
as path, where $user is `pageserver`, `safekeeper`
- instead of several `NeonEnvBuilder.enable_xxx_remote_storage` methods,
there is one
`enable_{pageserver,extensions,safekeeper}_remote_storage`
These changes should not have any real effect. They will allow us to default
to `LOCAL_FS` for the pageserver in the next PR, remove
`RemoteStorageKind.NOOP`, and work towards #5172.
Co-authored-by: Alexander Bayandin <alexander@neon.tech>
## Problem
`test_runner/performance/test_startup.py::test_startup` started to fail
more frequently because of the timeout.
Let's increase the timeout to see the failures on the perf dashboard.
## Summary of changes
- Increase the timeout for `test_startup` from 600 to 900 seconds
v1.4.0 includes changes to make it compile with PostgreSQL 16. The
commit log doesn't call it out explicitly, but I tested it manually.
v1.4.0 includes some new functions, but I tested manually that the
v1.3.1 functionality works with the v1.4.0 version of the library. That
means that this doesn't break existing installations. Users can run
"ALTER EXTENSION hypopg UPDATE" if they want to use the new v1.4.0
functionality, but they don't have to.
This includes PostgreSQL 16 support. No other changes, really.
The extension version upstream was changed from 2.17 to 2.18;
however, there is no difference between the catalog objects. So if you
had installed 2.17 previously, it will continue to work. You can run
"ALTER EXTENSION hll UPDATE", but all it will do is update the version
number in the pg_extension table.
## Problem
Once we use async file system APIs for `VirtualFile`, these functions
will also need to be async fn.
## Summary of changes
Make the `VirtualFile` functions `open`, `open_with_options`, `create`,
`sync_all`, and `with_file` async fns, including all functions that call
them. Like in the prior PRs, the actual I/O operations are not using async
APIs yet, as per the request in the #4743 epic (see the sketch below).
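A simplified illustration of the conversion (hypothetical body; the real function takes more options and does more bookkeeping): only the signature becomes async, so callers can be converted incrementally.
```rust
use std::io;
use std::path::{Path, PathBuf};

/// Simplified stand-in for the real `VirtualFile`.
struct VirtualFile {
    path: PathBuf,
    file: std::fs::File,
}

impl VirtualFile {
    /// Now an `async fn`, but the body still performs blocking I/O,
    /// as requested in the #4743 epic.
    async fn open(path: &Path) -> io::Result<VirtualFile> {
        let file = std::fs::File::open(path)?;
        Ok(VirtualFile { path: path.to_owned(), file })
    }
}
```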
We switch towards not using `VirtualFile` in the par_fsync module;
hopefully this is only temporary, until we can actually do fully async
I/O in `VirtualFile`. This might cause us to exhaust fd limits in the
tests, but it should only be an issue for local developers, as we have
high ulimits in prod.
This PR is a follow-up of #5189, #5190, #5195, and #5203. Part of #4743.
It's a good idea to keep up-to-date in general. One noteworthy change is
that PostGIS 3.3.3 adds support for PostgreSQL v16. We'll need that.
PostGIS 3.4.0 has already been released, and we should consider
upgrading to that. However, it's a major upgrade and requires running
"SELECT postgis_extensions_upgrade();" in each database, to upgrade the
catalogs. I don't want to deal with that right now.
## Problem
We likely need this to support Postgres 16.
It has also been asked for by a user:
https://github.com/neondatabase/neon/discussions/5042
The latest version is 3.2.0, but it requires some changes in the build
script (which I haven't looked into, but it didn't work right away).
## Summary of changes
```
3.1.8 2023-08-01
- force v8 to compile in release mode
3.1.7 2023-06-26
- fix byteoffset issue with arraybuffers
- support postgres 16 beta
3.1.6 2023-04-08
- fix crash issue on fetch apply
- fix interrupt issue
```
From https://github.com/plv8/plv8/blob/v3.1.8/Changes
## Problem
We've got `approved-for-ci-run` to work 🎉
But it's still a bit rough; this PR should improve the UX for external
contributors.
## Summary of changes
- `build_and_test.yml`: add a `check-permissions` job, which fails if the PR
is created from a fork. Make all jobs in the workflow dependent on
`check-permissions` to fail fast
- `approved-for-ci-run.yml`: add `cleanup` job to close `ci-run/pr-*`
PRs and delete linked branches when the parent PR is closed
- `approved-for-ci-run.yml`: fix the layout for the `ci-run/pr-*` PR
description
- GitHub Autocomment: add a comment with test results to the original PR
(instead of the PR from `ci-run/pr-*`)
Add a `walreceiver_state` field to `TimelineInfo` (response of `GET /v1/tenant/:tenant_id/timeline/:timeline_id`) and while doing that, refactor out a common `Timeline::walreceiver_state(..)`. No OpenAPI changes, because this is an internal debugging addition.
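Roughly, the change has this shape (heavily simplified; the real types carry many more fields):
```rust
/// Simplified sketch of the response type of
/// `GET /v1/tenant/:tenant_id/timeline/:timeline_id`.
struct TimelineInfo {
    timeline_id: String,
    /// New field: the current state of the WAL receiver, for debugging.
    walreceiver_state: String,
}

struct Timeline;

impl Timeline {
    /// The common helper factored out by this change (placeholder body).
    fn walreceiver_state(&self) -> String {
        "connected".to_string()
    }
}
```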
Fixes #3115.
Co-authored-by: Joonas Koivunen <joonas.koivunen@gmail.com>
It was easy to interpret the comment in the page cache initialization code
as being about justifying why we leak here at all, not just why this
specific type of leaking is done (which is what the comment was actually
meant to describe).
See
https://github.com/neondatabase/neon/pull/5125#discussion_r1308445993
---------
Co-authored-by: Joonas Koivunen <joonas@neon.tech>
## Problem
- `SCALE: unbound variable` from
https://github.com/neondatabase/neon/pull/5079
- The layout of the GitHub auto-comment is broken if the code coverage
section follows the flaky tests section, from
https://github.com/neondatabase/neon/pull/4999
## Summary of changes
- `benchmarking.yml`: Rename `SCALE` to `TEST_OLAP_SCALE`
- `comment-test-report.js`: Add an extra new-line before Code coverage
section
## Problem
When the next release is coming, we want to let everyone know about it by
posting a message to the Slack channel with a list of commits.
## Summary of changes
- `.github/workflows/release-notify.yml` is added
- the workflow sends a message to
`vars.SLACK_UPCOMING_RELEASE_CHANNEL_ID` (or
[#test-release-notifications](https://neondb.slack.com/archives/C05QQ9J1BRC)
if not configured)
- On each PR update, the workflow updates the list of commits in the
message (it doesn't send additional messages)
## Problem
We want to convert the `VirtualFile` APIs to async fn so that we can
adopt one of the async I/O solutions.
## Summary of changes
This PR is a follow-up of #5189, #5190, and #5195, and does the
following:
* Move the used `Write` trait functions of `VirtualFile` into inherent
functions
* Add optional buffering to `WriteBlobWriter` (see the sketch after this
list). The buffer is discarded on drop, similarly to how tokio's
`BufWriter` does it: drop is neither async nor does it support errors.
* Remove the generic `Write` impl of `WriteBlobWriter`, always
using `VirtualFile`
* Rename `WriteBlobWriter` to `BlobWriter`
* Make various functions in the write path async, like
`VirtualFile::{write,write_all}`.
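For the buffering in particular, the key point is that dropping the writer silently discards unflushed bytes, so callers must flush explicitly. A minimal sketch (much simpler than the real code; it keeps a generic writer only to stay self-contained, whereas the actual change removes the generic and always uses `VirtualFile`):
```rust
use std::io::Write;

/// Minimal sketch of an optional write buffer that is discarded on drop.
struct BlobWriter<W: Write> {
    inner: W,
    buf: Vec<u8>,
}

impl<W: Write> BlobWriter<W> {
    const BUF_CAPACITY: usize = 8192;

    fn new(inner: W) -> Self {
        Self { inner, buf: Vec::with_capacity(Self::BUF_CAPACITY) }
    }

    fn write_blob(&mut self, blob: &[u8]) -> std::io::Result<()> {
        self.buf.extend_from_slice(blob);
        if self.buf.len() >= Self::BUF_CAPACITY {
            self.flush_buffer()?;
        }
        Ok(())
    }

    /// Must be called before the writer goes away: like tokio's
    /// `BufWriter`, drop is neither async nor fallible, so any bytes
    /// still buffered at drop time are silently lost.
    fn flush_buffer(&mut self) -> std::io::Result<()> {
        self.inner.write_all(&self.buf)?;
        self.buf.clear();
        Ok(())
    }
}
```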
Part of #4743.
## Problem
`CI_ACCESS_TOKEN` has quite limited access (which is good), but this
doesn't allow it to remove labels from PRs (which is bad)
## Summary of changes
- Use `GITHUB_TOKEN` to remove labels
- Use `CI_ACCESS_TOKEN` to create PRs
## Problem
- #5050
Closes: https://github.com/neondatabase/neon/issues/5136
## Summary of changes
- A new configuration property `control_plane_api` controls other
functionality in this PR: if it is unset (default) then everything still
works as it does today.
- If `control_plane_api` is set, then on startup we call out to control
plane `/re-attach` endpoint to discover our attachments and their
generations. If an attachment is missing from the response we implicitly
detach the tenant.
- Calls to pageserver `/attach` API may include a `generation`
parameter. If `control_plane_api` is set, then this parameter is
mandatory.
- `RemoteTimelineClient`'s loading of index_part.json is generation-aware,
and will try to load the index_part with the most recent generation <=
its own generation (see the sketch after this list).
- The `neon_local` testing environment now includes a new binary
`attachment_service` which implements the endpoints that the pageserver
requires to operate. This is on by default if running `cargo neon` by
hand. In `test_runner/` tests, it is off by default: existing tests
continue to run in the legacy generation-less mode.
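The generation selection rule from the index_part bullet above, as a runnable sketch (hypothetical helper, not the actual `RemoteTimelineClient` code):
```rust
/// Pick the index_part generation to load: the most recent one that is
/// not newer than our own generation, or None if no such file exists
/// in the bucket listing.
fn select_index_part(available: &[u32], my_generation: u32) -> Option<u32> {
    available
        .iter()
        .copied()
        .filter(|&g| g <= my_generation)
        .max()
}

fn main() {
    // e.g. the bucket contains index_part files for generations 1, 3, 7
    assert_eq!(select_index_part(&[1, 3, 7], 5), Some(3));
    assert_eq!(select_index_part(&[7], 5), None);
}
```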
Caveats:
- The re-attachment during startup assumes that we are only re-attaching
tenants that have previously been attached, and not totally new tenants
-- this relies on the control plane's attachment logic to keep retrying
so that we should eventually see the attach API call. That's important
because the `/re-attach` API doesn't tell us which timelines we should
attach -- we still use local disk state for that. Ref:
https://github.com/neondatabase/neon/issues/5173
- Testing: generations are only enabled for one integration test right
now (test_pageserver_restart), as a smoke test that all the machinery
basically works. Writing fuller tests that stress tenant migration will
come later, and involve extending our test fixtures to deal with
multiple pageservers.
- I'm not in love with "attachment_service" as a name for the neon_local
component, but it's not very important because we can easily rename
these test bits whenever we want.
- Limited observability during re-attach on startup: when I add
generation validation for deletions in a later PR, I want to wrap the
control plane API calls in some small client class that will expose
metrics for things like errors calling the control plane API, which will
act as a strong red signal that something is not right.
Co-authored-by: Christian Schwarz <christian@neon.tech>
Co-authored-by: Joonas Koivunen <joonas@neon.tech>
## Problem
Pull Requests created by the GitHub Actions bot don't have access to
secrets, so we need to use our bot for them to auto-trigger a test run.
See previous PRs #4663, #5210, #5212
## Summary of changes
- Use our bot to create PRs
## Problem
- Scrubber's `tidy` command requires presence of a control plane
- Scrubber has no tests at all
## Summary of changes
- Add re-usable async streams for reading metadata from a bucket
- Add a `scan-metadata` command that reads from those streams and calls the
existing `checks.rs` code to validate metadata, then returns a summary
struct for the bucket (sketched after this list). The command returns a
nonzero status if errors are found.
- Add an `enable_scrub_on_exit()` function to NeonEnvBuilder so that
tests using remote storage can request to have the scrubber run after
they finish
- Enable remote storage and scrub_on_exit in test_pageserver_restart
and test_pageserver_chaos
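The `scan-metadata` flow, roughly, as a sketch over the `futures` crate (types hypothetical; the real code streams S3 listings into the `checks.rs` validators):
```rust
use futures::{Stream, StreamExt};

/// Hypothetical per-timeline record produced by the listing stream.
struct TimelineEntry {
    errors: Vec<String>,
}

/// Sketch of `scan-metadata`: consume the stream, count healthy and
/// broken timelines, and report failure if any errors were found,
/// which the CLI would map to a nonzero exit status.
async fn scan_metadata(
    mut entries: impl Stream<Item = TimelineEntry> + Unpin,
) -> Result<usize, usize> {
    let mut ok = 0;
    let mut failed = 0;
    while let Some(entry) = entries.next().await {
        if entry.errors.is_empty() { ok += 1 } else { failed += 1 }
    }
    if failed == 0 { Ok(ok) } else { Err(failed) }
}
```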
This is a "toe in the water" of the overall space of validating the
scrubber. Later, we should:
- Enable scrubbing at end of tests using remote storage by default
- Make the success condition stricter than "no errors": tests should
declare what tenants+timelines they expect to see in the bucket (or
sniff these from the functions tests use to create them) and we should
require that the scrubber reports on these particular tenants/timelines.
The `tidy` command is untouched in this PR, but it should be refactored
later to use similar async streaming interface instead of the current
batch-reading approach (the streams are faster with large buckets), and
to also be covered by some tests.
---------
Co-authored-by: Joonas Koivunen <joonas@neon.tech>
Co-authored-by: Alexander Bayandin <alexander@neon.tech>
Co-authored-by: Christian Schwarz <christian@neon.tech>
Co-authored-by: Conrad Ludgate <conrad@neon.tech>
## Problem
It's hard to find out which DB size we use for OLAP benchmarks (TPC-H in
particular).
This PR adds handling of the `TEST_OLAP_SCALE` env var, which gets added
to the test name as a parameter.
This is required for performing larger periodic benchmarks.
## Summary of changes
- Handle `TEST_OLAP_SCALE` in
`test_runner/performance/test_perf_olap.py`
- Set `TEST_OLAP_SCALE` in `.github/workflows/benchmarking.yml` to a
TPC-H scale
Fixes #3830 by adding the `#[cfg(not(feature = "testing"))]` attribute
to unnecessary log statements in `pageserver/src/tenant/tasks.rs`.
Co-authored-by: Joonas Koivunen <joonas@neon.tech>
## Problem
Continuation of #4663, #5210
We're still getting an error:
```
GraphQL: Resource not accessible by integration (removeLabelsFromLabelable)
```
## Summary of changes
- trigger `approved-for-ci-run.yml` workflow on `pull_request_target`
instead of `pull_request`
Fix an issue where updating the size of the Local File Cache (LFC) could
lead to invalid reads:
## Problem
The LFC can get re-enabled when lfc_max_size is set, e.g. through an
autoscaler configuration, or through PostgreSQL not liking us setting the
variable:
1. Initialize: LFC enabled (lfc_size_limit > 0; lfc_desc = 0)
2. Opening the LFC file fails: lfc_desc = -1, and lfc_size_limit is set to 0
3. lfc_size_limit is updated by autoscaling to > 0
4. read() now thinks the LFC is enabled (lfc_size_limit > 0) and that
lfc_desc is valid, but it doesn't actually read from the invalid file
handle and thus doesn't update the buffer content with the page's data,
yet it does report the data as read...
Any buffer we try to read from the local file cache is then essentially
uninitialized memory. It is most likely all zero bytes, but it could
potentially be any old buffer that was previously read from or flushed to
disk.
Fix this by adding a more definitive disable flag, plus better handling of
invalid states (sketched below).
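A sketch of the fix's shape (in Rust for illustration; the actual LFC lives in the C compute code, and these names are made up): an explicit disable flag that a size-limit update alone cannot clear.
```rust
/// Hypothetical LFC control state, illustrating the fix: a size-limit
/// change alone must not be able to re-enable a cache whose file
/// handle is known to be invalid.
struct LfcState {
    size_limit: u64,
    file_open: bool,
    /// Set when opening or writing the cache file failed; only an
    /// explicit, successful re-initialization may clear it.
    disabled: bool,
}

impl LfcState {
    fn set_size_limit(&mut self, new_limit: u64) {
        self.size_limit = new_limit;
        // Note: deliberately does NOT touch `disabled`.
    }

    /// The read path now checks the definitive flag and the file
    /// handle, not just `size_limit > 0`.
    fn can_read(&self) -> bool {
        !self.disabled && self.file_open && self.size_limit > 0
    }
}
```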
## Problem
When a remote custom extension build fails, it looks a bit confusing in
neon CI:
- `trigger-custom-extensions-build` is green
- `wait-for-extensions-build` is red
- `build-and-upload-extensions` is red
But to restart the build (to get everything green), you need to restart
the only job that passed, `trigger-custom-extensions-build`.
## Summary of changes
- Merge `trigger-custom-extensions-build` and
`wait-for-extensions-build` jobs into
`trigger-custom-extensions-build-and-wait`
## Problem
We want to convert the `VirtualFile` APIs to async fn so that we can
adopt one of the async I/O solutions.
## Summary of changes
Convert the following APIs of `VirtualFile` to async fn (as well as all
of the APIs calling them):
* `VirtualFile::seek`
* `VirtualFile::metadata`
* Also, prepare for deletion of the write impl by writing the summary to
a buffer before writing it to disk, as suggested in
https://github.com/neondatabase/neon/issues/4743#issuecomment-1700663864
(sketched after this list). This change adds an additional warning for
the case when the summary exceeds a block. Previously, we'd have silently
corrupted data in this (unlikely) case.
* `WriteBlobWriter::write_blob`, in preparation for making
`VirtualFile::write_all` async.
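The buffer-then-write step for the summary looks roughly like this (simplified sketch; the block size constant and the error handling are illustrative):
```rust
const BLOCK_SIZE: usize = 8192;

/// Serialize the summary into an in-memory block first, so overflow is
/// detected before anything reaches the disk. Previously an oversized
/// summary would have been silently corrupted on write.
fn summary_to_block(serialized_summary: &[u8]) -> Option<[u8; BLOCK_SIZE]> {
    if serialized_summary.len() > BLOCK_SIZE {
        eprintln!(
            "summary ({} bytes) does not fit in a block, not writing it",
            serialized_summary.len()
        );
        return None;
    }
    let mut block = [0u8; BLOCK_SIZE];
    block[..serialized_summary.len()].copy_from_slice(serialized_summary);
    Some(block)
}
```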
A set of changes to enable neon to work in IPv6 environments. The
changes are backward-compatible but allow deploying neon even in
IPv6-only environments:
- bind to both IPv4 and IPv6 interfaces
- allow connections to Postgres from the IPv6 interface
- parse addresses from the control plane, which could also be IPv6
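For illustration, the address handling with just the Rust standard library (not the exact neon code):
```rust
use std::net::{SocketAddr, TcpListener};

// std's SocketAddr parser accepts both address families, so a control
// plane value like "[2001:db8::1]:6400" parses as readily as
// "127.0.0.1:6400".
fn main() -> std::io::Result<()> {
    let v4: SocketAddr = "127.0.0.1:6400".parse().unwrap();
    let v6: SocketAddr = "[::1]:6400".parse().unwrap();

    // Binding a listener on each family covers IPv4-only, IPv6-only,
    // and dual-stack hosts.
    let _l4 = TcpListener::bind(v4)?;
    let _l6 = TcpListener::bind(v6)?;
    Ok(())
}
```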
## Problem
`VirtualFile` does both reading and writing, and it would be nice if
both could be converted to async, so that it doesn't have to support both
an async read path and a blocking write path (the locks are especially
annoying here, as none of the lock implementations in std, tokio, or
parking_lot support both async and blocking access).
## Summary of changes
This PR is some initial work on making the `VirtualFile` APIs async. It
can be reviewed commit-by-commit.
* Introduce the `MaybeVirtualFile` enum to be generic in a test that
compares real files with virtual files (see the sketch after this list).
* Make various APIs of `VirtualFile` async, including `write_all_at`,
`read_at`, `read_exact_at`.
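The test helper is essentially this shape (simplified; the virtual branch is stubbed out here):
```rust
use std::fs::File;
use std::io::{self, Seek, SeekFrom};

struct VirtualFile; // stand-in for the real type

/// Simplified sketch: lets a single test body exercise either a real
/// `std::fs::File` or a `VirtualFile` through one type.
enum MaybeVirtualFile {
    Real(File),
    Virtual(VirtualFile),
}

impl MaybeVirtualFile {
    fn seek(&mut self, pos: SeekFrom) -> io::Result<u64> {
        match self {
            MaybeVirtualFile::Real(file) => file.seek(pos),
            // The real test calls the async VirtualFile API here.
            MaybeVirtualFile::Virtual(_) => unimplemented!("sketch only"),
        }
    }
}
```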
Part of #4743, successor of #5180.
Co-authored-by: Christian Schwarz <me@cschwarz.com>
## Problem
The `VirtualFile::crashsafe_overwrite` function was introduced in #5186,
but it was not turned into an `async fn` yet. We want to make these
functions async fns as part of #4743.
## Summary of changes
Make `VirtualFile::crashsafe_overwrite` an async fn, as well as all the
functions calling it. Don't make anything inside `crashsafe_overwrite`
use async functionality, as per the #4743 instructions.
Also, add rustdoc to `crashsafe_overwrite`.
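For context, the crash-safe overwrite pattern itself is the classic write-to-temp-then-rename dance; a standard-library sketch of the idea (the real function is async and differs in detail):
```rust
use std::fs;
use std::io::Write;
use std::path::Path;

/// Sketch of a crash-safe overwrite: write the new content to a
/// temporary file, fsync it, atomically rename it over the final path,
/// then fsync the parent directory so the rename itself is durable.
fn crashsafe_overwrite(path: &Path, tmp_path: &Path, content: &[u8]) -> std::io::Result<()> {
    let mut tmp = fs::File::create(tmp_path)?;
    tmp.write_all(content)?;
    tmp.sync_all()?;
    drop(tmp);
    fs::rename(tmp_path, path)?;
    let dir = fs::File::open(path.parent().expect("path must have a parent"))?;
    dir.sync_all()?; // fsync the directory entry
    Ok(())
}
```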
Part of #4743.
## Problem
If a pageserver crashes partway through deleting a tenant's directory,
it might leave a partial state that confuses a subsequent
startup/attach.
## Summary of changes
Rename the tenant directory to a temporary path before deleting it
(sketched below). Timeline deletions already have deletion markers to
provide this safety.
In the future, it would be nice to exploit this to send responses to detach
requests earlier: https://github.com/neondatabase/neon/issues/5183
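The idea from the summary above, sketched with the standard library (the real code also makes the temporary name unique and handles leftover temporary directories on startup):
```rust
use std::fs;
use std::io;
use std::path::Path;

/// Sketch: rename first, so a crash mid-deletion leaves an obviously
/// temporary directory rather than a half-deleted tenant directory
/// that could confuse a later startup or attach.
fn delete_tenant_dir(tenant_dir: &Path) -> io::Result<()> {
    let tmp = tenant_dir.with_extension("___deleting");
    fs::rename(tenant_dir, &tmp)?;
    fs::remove_dir_all(&tmp)
}
```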