rust/neon

mirror of https://github.com/neondatabase/neon.git synced 2026-07-05 05:00:37 +00:00

Author	SHA1	Message	Date
Joonas Koivunen	f8536c741f	refactor: needlessly fallible DeleteTenantFlow::should_resume_deletion -- unsure why it was originally left like this, probably an oversight I had while reviewing.	2023-11-27 11:42:28 +00:00
John Spray	b80b9e1c4c	pageserver: remove defunct local timeline delete markers (#5699 ) ## Problem Historically, we treated the presence of a timeline on local disk as evidence that it logically exists. Since #5580 that is no longer the case, so we can always rely on remote storage. If we restart and the timeline is gone in remote storage, we will also purge it from local disk: no need for a marker. Reference on why this PR is for timeline markers and not tenant markers: https://github.com/neondatabase/neon/issues/5080#issuecomment-1783187807 ## Summary of changes Remove code paths that read + write deletion marker for timelines. Leave code path that deletes these markers, just in case we deploy while there are some in existence. This can be cleaned up later. (https://github.com/neondatabase/neon/issues/5718)	2023-11-27 09:31:20 +00:00
Joonas Koivunen	6b1c4cc983	fix: long timeline create cancelled by tenant delete (#5917 ) Fix the fallible vs. infallible check order with `UninitTimeline::finish_creation` so that the incomplete timeline can be removed. Currently the order of drop guard unwrapping causes uninit files to be left on pageserver, blocking the tenant deletion. Cc: #5914 Cc: #investigation-2023-11-23-stuck-tenant-deletion	2023-11-24 16:17:56 +00:00
Joonas Koivunen	53851ea8ec	fix: log cancelled request handler errors (#5915 ) Noticed during [investigation] with @problame a major point of lost error logging which would have sped up the investigation. Cc: #5815 [investigation]: https://neondb.slack.com/archives/C066ZFAJU85/p1700751858049319	2023-11-24 15:54:06 +02:00
Arpad Müller	54327bbeec	Upload initdb results to S3 (#5390 ) ## Problem See #2592 ## Summary of changes Compresses the results of initdb into a .tar.zst file and uploads them to S3, to enable usage in recovery from lsn. Generations should not be involved I think because we do this only once at the very beginning of a timeline. --------- Co-authored-by: Joonas Koivunen <joonas@neon.tech>	2023-11-23 18:11:52 +00:00
Shany Pozin	b7a988ba46	Support cancellation for find_lsn_for_timestamp API (#5904 ) ## Problem #5900 ## Summary of changes Added cancellation token as param in all relevant code paths and actually used it in the find_lsn_for_timestamp main loop	2023-11-23 17:08:32 +02:00
Christian Schwarz	a0e61145c8	fix: cleanup of layers from the future can race with their re-creation (#5890 ) fixes https://github.com/neondatabase/neon/issues/5878 obsoletes https://github.com/neondatabase/neon/issues/5879 Before this PR, it could happen that `load_layer_map` schedules removal of the future image layer. Then a later compaction run could re-create the same image layer, scheduling a PUT. Due to lack of an upload queue barrier, the PUT and DELETE could be re-ordered. The result was IndexPart referencing a non-existent object. ## Summary of changes * Add support to `pagectl` / Python tests to decode `IndexPart` * Rust * new `pagectl` Subcommand * `IndexPart::{from,to}_s3_bytes()` methods to internalize knowledge about encoding of `IndexPart` * Python * new `NeonCli` subclass * Add regression test * Rust * Ability to force repartitioning; required to ensure image layer creation at last_record_lsn * Python * The regression test. * Fix the issue * Insert an `UploadOp::Barrier` after scheduling the deletions.	2023-11-23 13:33:41 +00:00
Christian Schwarz	9e3c07611c	logging: support output to stderr (#5896 ) (part of the getpage benchmarking epic #5771) The plan is to make the benchmarking tool log on stderr and emit results as JSON on stdout. That way, the test suite can simply capture stdout and json.loads() it, while interactive users of the benchmarking tool have a reasonable experience as well. Existing logging users continue to print to stdout, so this change should be a no-op functionally and performance-wise.	2023-11-22 11:08:35 +00:00
Joonas Koivunen	0d10992e46	Cleanup compact_level0_phase1 fsyncing (#5852 ) While reviewing code noticed a scary `layer_paths.pop().unwrap()` then realized this should be further asyncified, something I forgot to do when I switched the `compact_level0_phase1` back to async in #4938. This keeps the double-fsync for new deltas as #4749 is still unsolved.	2023-11-21 15:30:40 +02:00
Joonas Koivunen	d98ac04136	chore(background_tasks): missed allowed_error change, logging change (#5883 ) - I am always confused by the log for the error wait time, now it will be `2s` or `2.0s` not `2.0` - fix missed string change introduced in #5881 [evidence] [evidence]: https://neon-github-public-dev.s3.amazonaws.com/reports/main/6921062837/index.html#suites/f9eba3cfdb71aa6e2b54f6466222829b/87897fe1ddee3825	2023-11-20 07:33:17 +00:00
Joonas Koivunen	ac08072d2e	fix(layer): VirtualFile opening and read errors can be caused by contention (#5880 ) A very low number of layer loads have been marked wrongly as permanent, as I did not remember that `VirtualFile::open` or reading could fail transiently for contention. Return separate errors for transient and persistent errors from `{Delta,Image}LayerInner::load`. Includes drive-by comment changes. The implementation looks quite ugly because having the same type be both the inner (operation error) and outer (critical error), but with the alternatives I tried I did not find a better way.	2023-11-19 14:57:39 +00:00
John Spray	d22dce2e31	pageserver: shut down idle walredo processes (#5877 ) The longer a pageserver runs, the more walredo processes it accumulates from tenants that are touched intermittently (e.g. by availability checks). This can lead to getting OOM killed. Changes: - Add an Instant recording the last use of the walredo process for a tenant - After compaction iteration in the background task, check for idleness and stop the walredo process if idle for more than 10x compaction period. Cc: #3620 Co-authored-by: Joonas Koivunen <joonas@neon.tech> Co-authored-by: Shany Pozin <shany@neon.tech>	2023-11-19 14:21:16 +00:00
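The idleness check described in d22dce2e31 can be illustrated with a minimal sketch. The `Tenant` and `WalRedoProcess` types and the `maybe_quiesce` name below are hypothetical stand-ins, not the pageserver's own types; only the "last use `Instant`" and "idle for more than 10x compaction period" rules come from the commit message.

```rust
use std::time::{Duration, Instant};

/// Hypothetical stand-in for a tenant's walredo process handle.
struct WalRedoProcess {
    /// When the process last served a request.
    last_used: Instant,
}

/// Hypothetical stand-in for a tenant that lazily owns a walredo process.
struct Tenant {
    walredo: Option<WalRedoProcess>,
}

/// True when the process has been idle for more than 10x the compaction period.
fn is_idle(last_used: Instant, now: Instant, compaction_period: Duration) -> bool {
    now.saturating_duration_since(last_used) > compaction_period * 10
}

impl Tenant {
    /// Meant to be called after a compaction iteration; returns true if the
    /// process was stopped. Dropping the handle stands in for stopping it.
    fn maybe_quiesce(&mut self, now: Instant, compaction_period: Duration) -> bool {
        let idle = self
            .walredo
            .as_ref()
            .map_or(false, |p| is_idle(p.last_used, now, compaction_period));
        if idle {
            self.walredo = None;
        }
        idle
    }
}

fn main() {
    let start = Instant::now();
    let period = Duration::from_secs(20);
    let mut tenant = Tenant {
        walredo: Some(WalRedoProcess { last_used: start }),
    };
    // Pretend that 5 minutes have passed without any walredo request.
    let stopped = tenant.maybe_quiesce(start + Duration::from_secs(300), period);
    println!("stopped idle walredo process: {stopped}");
}
```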
Joonas Koivunen	3b3f040be3	fix(background_tasks): first backoff, compaction error stacktraces (#5881 ) First compaction/gc error backoff starts from 0 which is less than 2s what it was before #5672. This is now fixed to be the intended 2**n. Additionally noticed the `compaction_iteration` creating an `anyhow::Error` via `into()` always captures a stacktrace even if we had a stacktraceful anyhow error within the CompactionError because there is no stable api for querying that.	2023-11-19 14:16:31 +00:00
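The intended 2**n backoff from 3b3f040be3 can be sketched as follows. The function name, the cap parameter, and the convention that `n` counts consecutive failures starting from 1 (so that the first delay is 2s) are assumptions made for illustration; the commit message only states the 2**n shape and the 2s first value.

```rust
use std::time::Duration;

/// Delay before the retry that follows the n-th consecutive failure:
/// 2**n seconds, capped at `max`. With n starting at 1 the first delay is 2s.
/// Name, signature and the cap are hypothetical.
fn backoff_delay(n: u32, max: Duration) -> Duration {
    // checked_pow returns None on overflow; in that case fall back to the cap.
    let secs = 2u64.checked_pow(n).unwrap_or(u64::MAX);
    Duration::from_secs(secs).min(max)
}

fn main() {
    let cap = Duration::from_secs(300);
    for n in 1..=10 {
        println!("failure #{n}: wait {:?}", backoff_delay(n, cap));
    }
}
```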
John Spray	ab631e6792	pageserver: make TenantsMap shard-aware (#5819 ) ## Problem When using TenantId as the key, we are unable to handle multiple tenant shards attached to the same pageserver for the same tenant ID. This is an expected scenario if we have e.g. 8 shards and 5 pageservers. ## Summary of changes - TenantsMap is now a BTreeMap instead of a HashMap: this enables looking up by range. In future, we will need this for page_service, as incoming requests will just specify the Key, and we'll have to figure out which shard to route it to. - A new key type TenantShardId is introduced, to act as the key in TenantsMap, and as the id type in external APIs. Its human readable serialization is backward compatible with TenantId, and also forward-compatible as long as sharding is not actually used (when we construct a TenantShardId with ShardCount(0), it serializes to an old-fashioned TenantId). - Essential tenant APIs are updated to accept TenantShardIds: tenant/timeline create, tenant delete, and /location_conf. These are the APIs that will enable driving sharded tenants. Other apis like /attach /detach /load /ignore will not work with sharding: those will soon be deprecated and replaced with /location_conf as part of the live migration work. Closes: #5787	2023-11-15 23:20:21 +02:00
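Why a BTreeMap keyed by a shard-aware ID "enables looking up by range" (ab631e6792) can be shown with a simplified sketch. The field layout of `TenantShardId` and the `shards_of` helper below are assumptions for illustration only; the real type lives in the neon repository and may differ.

```rust
use std::collections::BTreeMap;

/// Hypothetical, simplified key. Derived `Ord` compares fields in declaration
/// order, so all shards of one tenant are adjacent in the map.
#[derive(Debug, Clone, Copy, PartialEq, Eq, PartialOrd, Ord)]
struct TenantShardId {
    tenant_id: u128,
    shard_number: u8,
    shard_count: u8,
}

/// Collect every shard of `tenant_id` present in the map with one range scan.
fn shards_of<V>(map: &BTreeMap<TenantShardId, V>, tenant_id: u128) -> Vec<TenantShardId> {
    let lo = TenantShardId { tenant_id, shard_number: 0, shard_count: 0 };
    let hi = TenantShardId { tenant_id, shard_number: u8::MAX, shard_count: u8::MAX };
    map.range(lo..=hi).map(|(k, _)| *k).collect()
}

fn main() {
    let mut tenants = BTreeMap::new();
    // Tenant 1 has two of its shards attached here; tenant 2 is unsharded.
    tenants.insert(TenantShardId { tenant_id: 1, shard_number: 0, shard_count: 2 }, "shard 0");
    tenants.insert(TenantShardId { tenant_id: 1, shard_number: 1, shard_count: 2 }, "shard 1");
    tenants.insert(TenantShardId { tenant_id: 2, shard_number: 0, shard_count: 0 }, "unsharded");

    println!("{:?}", shards_of(&tenants, 1));
}
```

A HashMap keyed the same way would need a full scan to answer the same question, which is the motivation the commit message gives for the switch.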
Joonas Koivunen	462f04d377	Smaller test addition and change (#5858 ) - trivial serialization roundtrip test for `pageserver::repository::Value` - add missing `start_paused = true` to 15s test making it <0s test - completely unrelated future clippy lint avoidance (helps beta channel users)	2023-11-14 18:04:34 +01:00
Arpad Müller	f7249b9018	Fix comment in find_lsn_for_timestamp (#5855 ) We still subtract 1 from low to compute `commit_lsn`. the comment moved/added by #5844 should point this out.	2023-11-11 00:32:00 +00:00
John Spray	d672e44eee	pageserver: error type for collect_keyspace (#5846 ) ## Problem This is a log hygiene fix, for an occasional test failure. warn-level logging in imitate_timeline_cached_layer_accesses can't distinguish actual errors from shutdown cases. ## Summary of changes Replaced anyhow::Error with an explicit CollectKeySpaceError type, that includes conversion from PageReconstructError::Cancelled.	2023-11-10 13:58:18 +00:00
Rahul Modpur	a6f892e200	metric: add started and killed walredo processes counter (#5809 ) In OOM situations, knowing exactly how many walredo processes there were at a time would help afterwards to understand why was pageserver OOM killed. Add `pageserver_wal_redo_process_total` metric to keep track of total wal redo process started, shutdown and killed since pageserver start. Closes #5722 --------- Signed-off-by: Rahul Modpur <rmodpur2@gmail.com> Co-authored-by: Joonas Koivunen <joonas@neon.tech> Co-authored-by: Christian Schwarz <me@cschwarz.com>	2023-11-10 15:05:22 +02:00
Arpad Müller	8e5e3971ba	find_lsn_for_timestamp fixes (#5844 ) Includes the changes of #3689 that address point 1 of #3689, plus some further improvements. In particular, this PR does: * set `min_lsn` to a safe value to create branches from (and verify it in tests) * return `min_lsn` instead of `max_lsn` for `NoData` and `Past` (verify it in test for `Past`, `NoData` is harder and not as important) * return `commit_lsn` instead of `max_lsn` for Future (and verify it in the tests) * add some comments Split out of #5686 to get something more minimal out to users.	2023-11-10 13:38:44 +01:00
Joonas Koivunen	8dd29f1e27	fix(pageserver): spawn all kinds of tenant shutdowns (#5841 ) Minor bugfix, something noticed while manual code-review. Use the same joinset for inprogress tenants so we can get the benefit of the buffering logging just as we get for attached tenants, and no single inprogress task can hold up shutdown of other tenants.	2023-11-09 21:36:57 +00:00
Joonas Koivunen	f5344fb85a	temp: log all layer loading errors while we lose them (#5816 ) Temporary workaround while some errors are not being logged. Cc: #5815.	2023-11-09 21:31:53 +00:00
Arpad Müller	f95f001b8b	Lsn for get_timestamp_of_lsn should be string, not integer (#5840 ) The `get_timestamp_of_lsn` pageserver endpoint has been added in #5497, but the yml it added was wrong: the lsn is expected in hex format, not in integer (decimal) format.	2023-11-09 16:12:18 +00:00
John Spray	e0821e1eab	pageserver: refined Timeline shutdown (#5833 ) ## Problem We have observed the shutdown of a timeline taking a long time when a deletion arrives at a busy time for the system. This suggests that we are not respecting cancellation tokens promptly enough. ## Summary of changes - Refactor timeline shutdown so that rather than having a shutdown() function that takes a flag for optionally flushing, there are two distinct functions, one for graceful flushing shutdown, and another that does the "normal" shutdown where we're just setting a cancellation token and then tearing down as fast as we can. This makes things a bit easier to reason about, and enables us to remove the hand-written variant of shutdown that was maintained in `delete.rs` - Layer flush task checks cancellation token more carefully - Logical size calculation's handling of cancellation tokens is simplified: rather than passing one in, it respects the Timeline's cancellation token. This PR doesn't touch RemoteTimelineClient, which will be a key thing to fix as well, so that a slow remote storage op doesn't hold up shutdown.	2023-11-09 16:02:59 +00:00
bojanserafimov	4469b1a62c	Fix blob_io test (#5818 )	2023-11-09 10:47:03 -05:00
Joonas Koivunen	842223b47f	fix(metric): remove pageserver_wal_redo_wait_seconds (#5791 ) the meaning of the values recorded in this histogram changed with #5560 and we never had it visualized as a histogram, just the `increase(_sum)`. The histogram is not too interesting to look at, so remove it per discussion in [slack thread](https://neondb.slack.com/archives/C063LJFF26S/p1699008316109999?thread_ts=1698852436.637559&cid=C063LJFF26S).	2023-11-09 16:40:52 +02:00
Sasha Krassovsky	87389bc933	Add test simulating bad connection between pageserver and compute (#5728 ) ## Problem We have a funny 3-day timeout for connections between the compute and pageserver. We want to get rid of it, so to do that we need to make sure the compute is resilient to connection failures. Closes: https://github.com/neondatabase/neon/issues/5518 ## Summary of changes This test makes the pageserver randomly drop the connection if the failpoint is enabled, and ensures we can keep querying the pageserver. This PR also reduces the default timeout to 10 minutes from 3 days.	2023-11-08 19:48:57 +00:00
Arpad Müller	ea118a238a	JWT logging improvements (#5823 ) * lower level on auth success from info to debug (fixes #5820) * don't log stacktraces on auth errors (as requested on slack). we do this by introducing an `AuthError` type instead of using `anyhow` and `bail`. * return errors that have been censored for improved security.	2023-11-08 16:56:53 +00:00
Christian Schwarz	e9b227a11e	cleanup unused RemoteStorage fields (#5830 ) Found this while working on #5771	2023-11-08 16:54:33 +00:00
John Spray	40441f8ada	pageserver: use `Gate` for stronger safety check in `SlotGuard` (#5793 ) ## Problem #5711 and #5367 raced -- the `SlotGuard` type needs `Gate` to properly enforce its invariant that we may not drop an `Arc<Tenant>` from a slot. ## Summary of changes Replace the TODO with the intended check of Gate.	2023-11-08 13:00:11 +00:00
duguorong009	11d9d801b5	pageserver: improve the shutdown log error (#5792 ) ## Problem - Close #5784 ## Summary of changes - Update the `GetActiveTenantError` -> `QueryError` conversion process in `pageserver/src/page_service.rs` - Update the pytest logging exceptions in `./test_runner/regress/test_tenant_detach.py`	2023-11-07 16:57:26 +00:00
Arpad Müller	e310533ed3	Support JWT key reload in pageserver (#5594 ) ## Problem For quickly rotating JWT secrets, we want to be able to reload the JWT public key file in the pageserver, and also support multiple JWT keys. See #4897. ## Summary of changes * Allow directories for the `auth_validation_public_key_path` config param instead of just files. for the safekeepers, all of their config options also support multiple JWT keys. * For the pageservers, make the JWT public keys easily globally swappable by using the `arc-swap` crate. * Add an endpoint to the pageserver, triggered by a POST to `/v1/reload_auth_validation_keys`, that reloads the JWT public keys from the pre-configured path (for security reasons, you cannot upload any keys yourself). Fixes #4897 --------- Co-authored-by: Heikki Linnakangas <heikki@neon.tech> Co-authored-by: Joonas Koivunen <joonas@neon.tech>	2023-11-07 15:43:29 +01:00
John Spray	1d68f52b57	pageserver: move deletion failpoint inside backoff (#5814 ) ## Problem When enabled, this failpoint would busy-spin in a loop that emits log messages. ## Summary of changes Move the failpoint inside a backoff::exponential block: it will still spam the log, but at much lower rate. --------- Co-authored-by: Joonas Koivunen <joonas@neon.tech>	2023-11-07 14:25:51 +00:00
Shany Pozin	0ac4cf67a6	Use self.tenants instead of TENANTS (#5811 )	2023-11-07 11:38:02 +00:00
Joonas Koivunen	4be6bc7251	refactor: remove unnecessary unsafe (#5802 ) unsafe impls for `Send` and `Sync` should not be added by default. in the case of `SlotGuard` removing them does not cause any issues, as the compiler automatically derives those. This PR adds requirement to document the unsafety (see [clippy::undocumented_unsafe_blocks]) and opportunistically adds `#![deny(unsafe_code)]` to most places where we don't have unsafe code right now. TRPL on Send and Sync: https://doc.rust-lang.org/book/ch16-04-extensible-concurrency-sync-and-send.html [clippy::undocumented_unsafe_blocks]: https://rust-lang.github.io/rust-clippy/master/#/undocumented_unsafe_blocks	2023-11-07 10:26:25 +00:00
John Spray	a394f49e0d	pageserver: avoid converting an error to anyhow::Error (#5803 ) This was preventing it getting cleanly converted to a CalculateLogicalSizeError::Cancelled, resulting in "Logical size calculation failed" errors in logs.	2023-11-07 09:35:45 +00:00
John Spray	c00651ff9b	pageserver: start refactoring into TenantManager (#5797 ) ## Problem See: https://github.com/neondatabase/neon/issues/5796 ## Summary of changes Completing the refactor is quite verbose and can be done in stages: each interface that is currently called directly from a top-level mgr.rs function can be moved into TenantManager once the relevant subsystems have access to it. Landing the initial change to create TenantManager is useful because it enables new code to use it without having to be altered later, and sets us up to incrementally fix the existing code to use an explicit Arc<TenantManager> instead of relying on the static TENANTS.	2023-11-07 09:06:53 +00:00
John Spray	85cd97af61	pageserver: add `InProgress` tenant map state, use a sync lock for the map (#5367 ) ## Problem Follows on from #5299 - We didn't have a generic way to protect a tenant undergoing changes: `Tenant` had states, but for our arbitrary transitions between secondary/attached, we need a general way to say "reserve this tenant ID, and don't allow any other ops on it, but don't try and report it as being in any particular state". - The TenantsMap structure was behind an async RwLock, but it was never correct to hold it across await points: that would block any other changes for all tenants. ## Summary of changes - Add the `TenantSlot::InProgress` value. This means: - Incoming administrative operations on the tenant should retry later - Anything trying to read the live state of the tenant (e.g. a page service reader) should retry later or block. - Store TenantsMap in `std::sync::RwLock` - Provide an extended `get_active_tenant_with_timeout` for page_service to use, which will wait on InProgress slots as well as non-active tenants. Closes: https://github.com/neondatabase/neon/issues/5378 --------- Co-authored-by: Christian Schwarz <christian@neon.tech>	2023-11-06 14:03:22 +00:00
bojanserafimov	dc72567288	Layer flush minor speedup (#5765 ) Convert keys to `i128` before sorting	2023-11-06 08:58:20 -05:00
John Spray	6defa2b5d5	pageserver: add `Gate` as a partner to CancellationToken for safe shutdown of `Tenant` & `Timeline` (#5711 ) ## Problem When shutting down a Tenant, it isn't just important to cause any background tasks to stop. It's also important to wait until they have stopped before declaring shutdown complete, in cases where we may re-use the tenant's local storage for something else, such as running in secondary mode, or creating a new tenant with the same ID. ## Summary of changes A `Gate` class is added, inspired by [seastar::gate](https://docs.seastar.io/master/classseastar_1_1gate.html). For types that have an important lifetime that corresponds to some physical resource, use of a Gate as well as a CancellationToken provides a robust pattern for async requests & shutdown: - Requests must always acquire the gate as long as they are using the object - Shutdown must set the cancellation token, and then `close()` the gate to wait for requests in progress before returning. This is not for memory safety: it's for expressing the difference between "Arc<Tenant> exists", and "This tenant's files on disk are eligible to be read/written". - Both Tenant and Timeline get a Gate & CancellationToken. - The Timeline gate is held during eviction of layers, and during page_service requests. - Existing cancellation support in page_service is refined to use the timeline-scope cancellation token instead of a process-scope cancellation token. This replaces the use of `task_mgr::associate_with`: tasks no longer change their tenant/timeline identity after being spawned. The Tenant's Gate is not yet used, but will be important for Tenant-scoped operations in secondary mode, where we must ensure that our secondary-mode downloads for a tenant are gated wrt the activity of an attached Tenant. 
This is part of a broader move away from using the global-state driven `task_mgr` shutdown tokens: - less global state where we rely on implicit knowledge of what task a given function is running in, and more explicit references to the cancellation token that a particular function/type will respect, making shutdown easier to reason about. - eventually avoid the big global TASKS mutex. --------- Co-authored-by: Joonas Koivunen <joonas@neon.tech>	2023-11-06 12:39:20 +00:00
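The gate pattern described in 6defa2b5d5 can be sketched with the standard library alone. This is a blocking analog built on `Mutex` and `Condvar`, not the async `Gate` added in the PR; the type layout and method signatures are assumptions, and the cancellation token that shutdown sets before closing the gate is left out.

```rust
use std::sync::{Arc, Condvar, Mutex};
use std::thread;
use std::time::Duration;

#[derive(Default)]
struct GateState {
    closed: bool,
    entered: usize,
}

/// Blocking sketch of a gate: requests hold a guard while they use the
/// object, and `close()` waits for all outstanding guards to be dropped.
#[derive(Clone, Default)]
struct Gate {
    inner: Arc<(Mutex<GateState>, Condvar)>,
}

struct GateGuard {
    inner: Arc<(Mutex<GateState>, Condvar)>,
}

impl Gate {
    /// Returns `None` once the gate has been closed.
    fn enter(&self) -> Option<GateGuard> {
        let mut state = self.inner.0.lock().unwrap();
        if state.closed {
            return None;
        }
        state.entered += 1;
        Some(GateGuard { inner: Arc::clone(&self.inner) })
    }

    /// Refuses new entries, then blocks until in-flight guards are gone.
    fn close(&self) {
        let (lock, cv) = &*self.inner;
        let mut state = lock.lock().unwrap();
        state.closed = true;
        while state.entered > 0 {
            state = cv.wait(state).unwrap();
        }
    }
}

impl Drop for GateGuard {
    fn drop(&mut self) {
        let (lock, cv) = &*self.inner;
        let mut state = lock.lock().unwrap();
        state.entered -= 1;
        if state.entered == 0 {
            cv.notify_all();
        }
    }
}

fn main() {
    let gate = Gate::default();
    let guard = gate.enter().expect("gate is open");
    // An in-flight "request" that finishes a little later.
    let request = thread::spawn(move || {
        thread::sleep(Duration::from_millis(50));
        drop(guard);
    });
    gate.close(); // returns only after the request dropped its guard
    assert!(gate.enter().is_none());
    request.join().unwrap();
    println!("gate closed after in-flight request finished");
}
```

As the commit message stresses, the guard is not about memory safety: the `Arc` keeps the object alive either way, while the gate tracks whether the underlying resource may still be in use.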
duguorong009	b3d3a2587d	feat: improve the serde impl for several types(`Lsn`, `TenantId`, `TimelineId` ...) (#5335 ) Improve the serde impl for several types (`Lsn`, `TenantId`, `TimelineId`) by making them sensitive to `Serializer::is_human_readable` (true for json, false for bincode). Fixes #3511 by: - Implement the custom serde for `Lsn` - Implement the custom serde for `Id` - Add the helper module `serde_as_u64` in `libs/utils/src/lsn.rs` - Remove the unnecessary attr `#[serde_as(as = "DisplayFromStr")]` in all possible structs Additionally some safekeeper types gained serde tests. --------- Co-authored-by: Joonas Koivunen <joonas@neon.tech>	2023-11-06 11:40:03 +02:00
John Spray	306c4f9967	s3_scrubber: prepare for scrubbing buckets with generation-aware content (#5700 ) ## Problem The scrubber didn't know how to find the latest index_part when generations were in use. ## Summary of changes - Teach the scrubber to do the same dance that pageserver does when finding the latest index_part.json - Teach the scrubber how to understand layer files with generation suffixes. - General improvement to testability: scan_metadata has a machine readable output that the testing `S3Scrubber` wrapper can read. - Existing test coverage of scrubber was false-passing because it just didn't see any data due to prefixing of data in the bucket. Fix that. This is incremental improvement: the more confidence we can have in the scrubber, the more we can use it in integration tests to validate the state of remote storage. --------- Co-authored-by: Arpad Müller <arpad-m@users.noreply.github.com>	2023-11-03 17:36:02 +00:00
Joonas Koivunen	27bdbf5e36	chore(layer): restore logging, doc changes (#5766 ) Some of the log messages were lost with the #4938. This PR adds some of them back, most notably: - starting to on-demand download - successful completion of on-demand download - ability to see when there were many waiters for the layer download - "unexpectedly on-demand downloading ..." is now `info!` Additionally some rare events are logged as error, which should never happen.	2023-11-02 19:05:33 +00:00
Joonas Koivunen	098d3111a5	fix(layer): get_and_upgrade and metrics (#5767 ) When introducing `get_and_upgrade` I forgot that an `evict_and_wait` would have already incremented the counter for started evictions, but an upgrade would just "silently" cancel the eviction as no drop would ever run. These metrics are likely sources for alerts with the next release, so it's important to keep them correct.	2023-11-02 13:06:14 +00:00
Joonas Koivunen	3737fe3a4b	fix(layer): error out early if layer path is non-file (#5756 ) In an earlier PR https://github.com/neondatabase/neon/pull/5743#discussion_r1378625244 I added a FIXME and there's a simple solution suggested by @jcsp, so implement it. Wondering why I did not implement this originally, there is no concept of a permanent failure, so this failure will happen quite often. I don't think the frequency is a problem however. Sadly for std::fs::FileType there is only decimal and hex formatting, no octal.	2023-11-02 11:03:38 +00:00
John Spray	5650138532	pageserver: helpers for explicitly dying on fatal I/O errors (#5651 ) Following from discussion on https://github.com/neondatabase/neon/pull/5436 where hacking an implicit die-on-fatal-io behavior into an Error type was a source of disagreement -- in this PR, dying on fatal I/O errors is explicit, with `fatal_err` and `maybe_fatal_err` helpers in the `MaybeFatalIo` trait, which is implemented for std::io::Result. To enable this approach with `crashsafe_overwrite`, the return type of that function is changed to std::io::Result -- the previous error enum for this function was not used for any logic, and the utility of saying exactly which step in the function failed is outweighed by the hygiene of having an I/O function return an io::Result. The initial use case for these helpers is the deletion queue.	2023-11-02 09:14:26 +00:00
Joonas Koivunen	2dca4c03fc	feat(layer): cancellable get_or_maybe_download (#5744 ) With the layer implementation as was done in #4938, it is possible via cancellation to cause two concurrent downloads on the same path, due to how `RemoteTimelineClient::download_remote_layer` does tempfiles. Thread the init semaphore through the spawned task of downloading to make this impossible to happen.	2023-11-02 08:06:32 +00:00
Joonas Koivunen	e82d1ad6b8	fix(layer): reinit on access before eviction happens (#5743 ) Right before merging, I added a loop to `fn LayerInner::get_or_maybe_download`, which was always supposed to be there. However I had forgotten to restart initialization instead of waiting for the eviction to happen to support original design goal of "eviction should always lose to redownload (or init)". This was wrong. After this fix, if `spawn_blocking` queue is blocked on something, nothing bad will happen. Part of #5737.	2023-11-01 17:38:32 +02:00
Muhammet Yazici	4f0a8e92ad	fix: Add bearer prefix to Authorization header (#5740 ) ## Problem Some requests with `Authorization` header did not properly set the `Bearer ` prefix. Problem explained here https://github.com/neondatabase/cloud/issues/6390. ## Summary of changes Added `Bearer ` prefix to missing requests.	2023-11-01 09:41:48 +03:00
Konstantin Knizhnik	5952f350cb	Always handle POLLHUP in walredo error poll loop (#5716 ) ## Problem test_stderr hangs on MacOS. See https://neondb.slack.com/archives/C036U0GRMRB/p1698438997903919 ## Summary of changes Always handle POLLHUP to prevent infinite loop. Co-authored-by: Konstantin Knizhnik <knizhnik@neon.tech>	2023-10-31 20:57:03 +02:00
Joonas Koivunen	896347f307	refactor(layer): remove version checking with atomics (#5742 ) The `LayerInner::version` never needed to be read in more than one place. Clarified while fixing #5737 of which this is the first step. This decrements possible wrong atomics usage in Layer, but does not really fix anything.	2023-10-31 18:40:08 +02:00

1666 Commits