rust/neon

mirror of https://github.com/neondatabase/neon.git synced 2026-07-05 13:10:37 +00:00

Author	SHA1	Message	Date
Joonas Koivunen	77a68326c5	Thin out TenantState metric, keep set of broken tenants (#4796 ) We currently have a timeseries for each of the tenants in different states. We only want this for Broken. Other states could be counters. Fix this by making the `pageserver_tenant_states_count` a counter without a `tenant_id` and add a `pageserver_broken_tenants_count` which has a `tenant_id` label, each broken tenant being 1.	2023-07-25 11:15:54 +03:00
Joonas Koivunen	a25504deae	Limit concurrent compactions (#4777 ) Compactions can create a lot of concurrent work right now with #4265. Limit compactions to use at most 6/8 background runtime threads.	2023-07-25 10:19:04 +03:00
Joonas Koivunen	294b8a8fde	Convert per timeline metrics to global (#4769 ) Cut down the per-(tenant, timeline) histograms by making them global: - `pageserver_getpage_get_reconstruct_data_seconds` - `pageserver_read_num_fs_layers` - `pageserver_remote_operation_seconds` - `pageserver_remote_timeline_client_calls_started` - `pageserver_wait_lsn_seconds` - `pageserver_io_operations_seconds` --------- Co-authored-by: Shany Pozin <shany@neon.tech>	2023-07-25 00:43:27 +03:00
arpad-m	e5b7ddfeee	Preparatory pageserver async conversions (#4773 ) In #4743, we'd like to convert the read path to use `async` rust. In preparation of that, this PR switches some functions that are calling lower level functions like `BlockReader::read_blk`, `BlockCursor::read_blob`, etc into `async`. The PR does not switch all functions however, and only focuses on the ones which are easy to switch. This leaves around some async functions that are (currently) unnecessarily `async`, but on the other hand it makes future changes smaller in diff. Part of #4743 (but does not completely address it).	2023-07-24 14:01:54 +02:00
Alex Chi Z	1685593f38	stable merge and sort in compaction (#4573 ) Per discussion at https://github.com/neondatabase/neon/pull/4537#discussion_r1242086217, it looks like a better idea to use `<` instead of `<=` for all these comparisons. --------- Signed-off-by: Alex Chi Z <chi@neon.tech>	2023-07-21 10:15:44 -04:00
Alex Chi Z	3fc3666df7	make flush frozen layer an atomic operation (#4720 ) ## Problem close https://github.com/neondatabase/neon/issues/4712 ## Summary of changes Previously, when flushing frozen layers, it was split into two operations: add delta layer to disk + remove frozen layer from memory. This would cause a short period of time where we will have the same data both in frozen and delta layer. In this PR, we merge them into one atomic operation in layer map manager, therefore simplifying the code. Note that if we decide to create image layers for L0 flush, it will still be split into two operations on layer map. --------- Signed-off-by: Alex Chi Z <chi@neon.tech> Co-authored-by: Joonas Koivunen <joonas@neon.tech>	2023-07-20 13:39:19 -04:00
Joonas Koivunen	89746a48c6	chore: fix copypaste caused flakyness (#4763 ) I introduced a copypaste error leading to flaky [test failure][report] in #4737. Solution is to use correct/unique test name. I also looked into providing a proper fn name via macro but ... Yeah, it's probably not a great idea. [report]: https://github.com/neondatabase/neon/actions/runs/5612473297/job/15206293430#step:15:197	2023-07-20 19:55:40 +03:00
Joonas Koivunen	8d27a9c54e	Less verbose eviction failures (#4737 ) As seen in staging logs with some massive compactions (create_image_layer), in addition to racing with compaction or gc or even between two invocations to `evict_layer_batch`. Cc: #4745 Fixes: #3851 (organic tech debt reduction) Solution is not to log the Not Found in such cases; it is perfectly natural to happen. Route to this is quite long, but implemented two cases of "race between two eviction processes" which are like our disk usage based eviction and eviction_task, both have the separate "lets figure out what to evict" and "lets evict" phases.	2023-07-20 17:45:10 +03:00
arpad-m	d98cb39978	pageserver: use tokio::time::timeout where possible (#4756 ) Removes a bunch of cases which used `tokio::select` to emulate the `tokio::time::timeout` function. I've done an additional review on the cancellation safety of these futures, all of them seem to be cancellation safe (not that `select!` allows non-cancellation-safe futures, but as we touch them, such a review makes sense). Furthermore, I correct a few mentions of a non-existent `tokio::timeout!` macro in the docs to the `tokio::time::timeout` function.	2023-07-20 16:19:38 +02:00
Joonas Koivunen	9e871318a0	Wait detaches or ignores on pageserver shutdown (#4678 ) Adds in a barrier for the duration of the `Tenant::shutdown`. `pageserver_shutdown` will join this await, `detach`es and `ignore`s will not. Fixes #4429. --------- Co-authored-by: Christian Schwarz <christian@neon.tech>	2023-07-20 13:14:13 +03:00
Kirill Bulatov	be271e3edf	Use upstream version of tokio-tar (#4722 ) tokio-tar 0.3.1 got released, including all changes from the fork currently used, switch over to that one.	2023-07-17 17:18:33 +01:00
Alex Chi Z	1066bca5e3	compaction: allow duplicated layers and skip in replacement (#4696 ) ## Problem Compactions might generate files of exactly the same name as before compaction due to our naming of layer files. This could have already caused some mess in the system, and is known to cause some issues like https://github.com/neondatabase/neon/issues/4088. Therefore, we now consider duplicated layers in the post-compaction process to avoid violating the layer map duplicate checks. related previous works: close https://github.com/neondatabase/neon/pull/4094 error reported in: https://github.com/neondatabase/neon/issues/4690, https://github.com/neondatabase/neon/issues/4088 ## Summary of changes If a file already exists in the layer map before the compaction, do not modify the layer map and do not delete the file. The file on disk at that time should be the new one overwritten by the compaction process. This PR also adds a test case with a fail point that produces exactly the same set of files. This bypassing behavior is safe because the produced layer files have the same content / are the same representation of the original file. An alternative might be directly removing the duplicate check in the layer map, but I feel it would be good if we can prevent that in the first place. --------- Signed-off-by: Alex Chi Z <chi@neon.tech> Co-authored-by: Konstantin Knizhnik <knizhnik@garret.ru> Co-authored-by: Heikki Linnakangas <heikki@neon.tech> Co-authored-by: Joonas Koivunen <joonas@neon.tech>	2023-07-17 17:26:29 +03:00
Christian Schwarz	966213f429	basebackup query metric: use same buckets as control plane (#4732 ) The `CRITICAL_OPS_BUCKETS` is not useful for getting an accurate picture of basebackup latency because all the observations that negatively affect our SLI fall into one bucket, i.e., 100ms-1s. Use the same buckets as control plane instead.	2023-07-17 13:46:13 +02:00
arpad-m	35e73759f5	Reword comment and add comment on race condition (#4725 ) The race condition that caused #4526 is still not fixed, so point it out in a comment. Also, reword a comment in upload.rs. Follow-up of #4694	2023-07-17 12:49:58 +02:00
arpad-m	982fce1e72	Fix rustdoc warnings and test cargo doc in CI (#4711 ) ## Problem `cargo +nightly doc` is giving a lot of warnings: broken links, naked URLs, etc. ## Summary of changes * update the `proc-macro2` dependency so that it can compile on latest Rust nightly, see https://github.com/dtolnay/proc-macro2/pull/391 and https://github.com/dtolnay/proc-macro2/issues/398 * allow the `private_intra_doc_links` lint, as linking to something that's private is always more useful than just mentioning it without a link: if the link breaks in the future, at least there is a warning due to that. Also, one might enable [`--document-private-items`](https://doc.rust-lang.org/cargo/commands/cargo-doc.html#documentation-options) in the future and make these links work in general. * fix all the remaining warnings given by `cargo +nightly doc` * make it possible to run `cargo doc` on stable Rust by updating `opentelemetry` and associated crates to version 0.19, pulling in a fix that previously broke `cargo doc` on stable: https://github.com/open-telemetry/opentelemetry-rust/pull/904 * Add `cargo doc` to CI to ensure that it won't get broken in the future. Fixes #2557 ## Future work * Potentially, it might make sense, for development purposes, to publish the generated rustdocs somewhere, like for example [how the rust compiler does it](https://doc.rust-lang.org/nightly/nightly-rustc/rustc_driver/index.html). I will file an issue for discussion.	2023-07-15 05:11:25 +03:00
Vadim Kharitonov	e767ced8d0	Update rust to 1.71.0 (#4718 ) Co-authored-by: Joonas Koivunen <joonas@neon.tech>	2023-07-14 18:34:01 +02:00
Joonas Koivunen	9a69b6cb94	Demote deletion warning, list files (#4688 ) Handle test failures like: ``` AssertionError: assert not ['$ts WARN delete_timeline{tenant_id=X timeline_id=Y}: About to remove 1 files\n'] ``` Instead of logging: ``` WARN delete_timeline{tenant_id=X timeline_id=Y}: Found 1 files not bound to index_file.json, proceeding with their deletion WARN delete_timeline{tenant_id=X timeline_id=Y}: About to remove 1 files ``` For each one operation of timeline deletion, list all unref files with `info!`, and then continue to delete them with the added spice of logging the rare/never happening non-utf8 name with `warn!`. Rationale for `info!` instead of `warn!`: this is a normal operation; like we had mentioned in `test_import.py` -- basically whenever we delete a timeline which is not idle. Rationale for N * (`info!`|`warn!`): symmetry for the layer deletions; if we could ever need those, we could also need these for layer files which are not yet mentioned in `index_part.json`. --------- Co-authored-by: Christian Schwarz <christian@neon.tech>	2023-07-14 18:59:16 +03:00
Joonas Koivunen	cc82cd1b07	spanchecks: Support testing without tracing (#4682 ) Tests cannot be run without configuring tracing. Split from #4678. Does not nag about the span checks when there is no subscriber configured, because then the spans will have no links and nothing can be checked. Sadly the `SpanTrace::status()` cannot be used for this. `tracing` is always configured in regress testing (running with `pageserver` binary), which should be enough. Additionally cleans up the test code in span checks to be in the test code. Fixes a `#[should_panic]` test which was flaky before these changes, but the `#[should_panic]` hid the flakiness. Rationale for need: Unit tests might not be testing only the public or `feature="testing"` APIs which are only testable within `regress` tests so not all spans might be configured.	2023-07-14 17:45:25 +03:00
Alex Chi Z	c76b74c50d	semantic layer map operations (#4618 ) ## Problem ref https://github.com/neondatabase/neon/issues/4373 ## Summary of changes A step towards immutable layer map. I decided to finish the refactor with this new approach and apply https://github.com/neondatabase/neon/pull/4455 on this patch later. In this PR, we moved all modifications of the layer map to one place with semantic operations like `initialize_local_layers`, `finish_compact_l0`, `finish_gc_timeline`, etc, which is now part of `LayerManager`. This makes it easier to build new features upon this PR: * For immutable storage state refactor, we can simply replace the layer map with `ArcSwap<LayerMap>` and remove the `layers` lock. Moving towards it requires us to put all layer map changes in a single place as in https://github.com/neondatabase/neon/pull/4455. * For manifest, we can write to manifest in each of the semantic functions. --------- Signed-off-by: Alex Chi Z <chi@neon.tech> Co-authored-by: Christian Schwarz <christian@neon.tech>	2023-07-13 17:35:27 +03:00
arpad-m	664d32eb7f	Don't propagate but log file not found error in layer uploading (#4694 ) This addresses the issue in #4526 by adding a test that reproduces the race condition that gave rise to the bug (or at least a race condition that gave rise to the same error message), and then implementing a fix that just prints a message to the log if a file could not be found for uploading. Even though the underlying race condition is not fixed yet, this will un-block the upload queue in that situation, greatly reducing the impact of such a (rare) race. Fixes #4526.	2023-07-12 18:10:49 +02:00
Alexander Bayandin	ed845b644b	Prevent unintentional Postgres submodule update (#4692 ) ## Problem Postgres submodule can be changed unintentionally, and these changes are easy to miss during the review. Adding a check that should prevent this from happening, the check fails `build-neon` job with the following message: ``` Expected postgres-v14 rev to be at '1414141414141414141414141414141414141414', but it is at '1144aee1661c79eec65e784a8dad8bd450d9df79' Expected postgres-v15 rev to be at '1515151515151515151515151515151515151515', but it is at '1984832c740a7fa0e468bb720f40c525b652835d' Please update vendors/revisions.json if these changes are intentional. ``` This is an alternative approach to https://github.com/neondatabase/neon/pull/4603 ## Summary of changes - Add `vendor/revisions.json` file with expected revisions - Add build-time check (to `build-neon` job) that Postgres submodules match revisions from `vendor/revisions.json` - A couple of small improvements for logs from https://github.com/neondatabase/neon/pull/4603 - Fixed GitHub autocomment for the case where no tests were run --------- Co-authored-by: Joonas Koivunen <joonas@neon.tech>	2023-07-12 15:12:37 +01:00
Joonas Koivunen	87dd37a2f2	pageserver: Align tenant, timeline id names in spans (#4687 ) Uses `(tenant|timeline)_id`. Not a statement about endorsing this naming style but it is better to be aligned.	2023-07-12 16:58:40 +03:00
arpad-m	1355bd0ac5	layer deletion: Improve a comment and fix TOCTOU (#4673 ) The comment referenced an issue that was already closed. Remove that reference and replace it with an explanation why we already don't print an error. See discussion in https://github.com/neondatabase/neon/issues/2934#issuecomment-1626505916 For the TOCTOU fixes, the two calls after the `.exists()` both didn't handle the situation well where the file was deleted after the initial `.exists()`: one would assume that the path wasn't a file, giving a bad error, the second would give an accurate error but that's not wanted either. We remove both racy `exists` and `is_file` checks, and instead just look for errors about files not being found.	2023-07-12 15:52:14 +02:00
bojanserafimov	92aee7e07f	cold starts: basebackup compression (#4482 ) Co-authored-by: Alex Chi Z <iskyzh@gmail.com>	2023-07-11 13:11:23 -04:00
Alex Chi Z	08bfe1c826	remove `LayerDescriptor` and use `LayerObject` for tests (#4637 ) ## Problem part of https://github.com/neondatabase/neon/pull/4340 ## Summary of changes Remove LayerDescriptor and remove `todo!`. At the same time, this PR adds `AsLayerDesc` trait for all persistent layers and changed `LayerFileManager` to have a generic type. For tests, we are now using `LayerObject`, which is a wrapper around `PersistentLayerDesc`. --------- Signed-off-by: Alex Chi Z <chi@neon.tech>	2023-07-10 19:40:37 +03:00
Christian Schwarz	65ff256bb8	page_service: add peer_addr span field, and set tenant_id / timeline_id fields earlier (#4638 ) Before this PR, during shutdown, we'd find naked logs like this one for every active page service connection: ``` 2023-07-05T14:13:50.791992Z INFO shutdown request received in run_message_loop ``` This PR 1. adds a peer_addr span field to distinguish the connections in logs 2. sets the tenant_id / timeline_id fields earlier It would be nice to have `tenant_id` and `timeline_id` directly on the `page_service_conn_main` span (empty, initially), then set them at the top of `process_query`. The problem is that the debug asserts for `tenant_id` and `timeline_id` presence in the tracing span doesn't support detecting empty values [1]. So, I'm a bit hesitant about over-using `Span::record`. [1] https://github.com/neondatabase/neon/issues/4676	2023-07-10 15:23:40 +02:00
Christian Schwarz	49efcc3773	walredo: add tenant_id to span of NoLeakChild::drop (#4640 ) We see the following log lines occasionally in prod: ``` kill_and_wait_impl{pid=1983042}: wait successful exit_status=signal: 9 (SIGKILL) ``` This PR makes it easier to find the tenant for the pid, by including the tenant id as a field in the span.	2023-07-10 12:49:22 +03:00
Dmitry Rodionov	76b1cdc17e	Order tenant_id argument before timeline_id, use references (#4671 ) It started from a few config methods that have various orderings and sometimes use references, sometimes not. So I unified path manipulation methods to always order tenant_id before timeline_id and use references because we don't need owned values. Similar changes happened to call-sites of config methods. I'd say it's a good idea to always order tenant_id before timeline_id so it is consistent across the whole codebase.	2023-07-10 10:23:37 +02:00
arpad-m	4f280c2953	Small pageserver cleanups (#4657 ) ## Problem I was reading the code of the page server today and found these minor things that I thought could be cleaned up. ## Summary of changes * remove a redundant indentation layer and continue in the flushing loop * use the builtin `PartialEq` check instead of hand-rolling a `range_eq` function * Add a missing `>` to a prominent doc comment	2023-07-07 16:53:14 +02:00
Dmitry Rodionov	20137d9588	Polish tracing helpers (#4651 ) Context: comments here: https://github.com/neondatabase/neon/pull/4645	2023-07-06 19:49:14 +03:00
Alex Chi Z	d340cf3721	dump more info in layer map (#4567 ) A simple commit extracted from https://github.com/neondatabase/neon/pull/4539 This PR adds more info for layer dumps (is_delta, is_incremental, size). --------- Signed-off-by: Alex Chi Z <chi@neon.tech>	2023-07-06 18:21:45 +03:00
Joonas Koivunen	269e20aeab	fix: filter out zero synthetic sizes (#4639 ) Apparently sending the synthetic_size == 0 is causing problems, so filter out sending zeros. Slack discussion: https://neondb.slack.com/archives/C03F5SM1N02/p1688574285718989?thread_ts=1687874910.681049&cid=C03F5SM1N02	2023-07-06 12:32:34 +03:00
Dmitry Rodionov	b263510866	move some logical size bits to separate logical_size.rs	2023-07-06 11:58:41 +03:00
Dmitry Rodionov	e418fc6dc3	move some tracing related assertions to separate module for tenant and timeline	2023-07-06 11:58:41 +03:00
Dmitry Rodionov	434eaadbe3	Move uninitialized timeline from tenant.rs to timeline/uninit.rs	2023-07-06 11:58:41 +03:00
Christian Schwarz	505aa242ac	page cache: add size metrics (#4629 ) Make them a member of `struct PageCache` to prepare for a future where there's no global state.	2023-07-05 15:36:42 +03:00
arpad-m	1c516906e7	Impl Display for LayerFileName and require it for Layer (#4630 ) Does three things: * add a `Display` impl for `LayerFileName` equal to the `short_id` * based on that, replace the `Layer::short_id` function by a requirement for a `Display` impl * use that `Display` impl in the places where the `short_id` and `file_name()` functions were used instead Fixes #4145	2023-07-05 14:27:50 +02:00
Christian Schwarz	7d7cd8375c	callers of task_mgr::spawn: some top-level async blocks were missing tenant/timeline id (#4283 ) Looking at logs from staging and prod, I found there are a bunch of log lines without tenant / timeline context. Manually walk through all task_mgr::spawn lines and fix that using the least amount of work required. While doing it, remove some redundant `shutting down` messages. refs https://github.com/neondatabase/neon/issues/4222	2023-07-05 14:04:05 +02:00
Christian Schwarz	3f9defbfb4	page cache: add access & hit rate metrics (#4628 ) Co-authored-by: Dmitry Rodionov <dmitry@neon.tech>	2023-07-05 10:38:32 +02:00
Joonas Koivunen	10aba174c9	metrics: Remove comments regarding upgradeable rwlocks (#4622 ) Closes #4001 by removing the comments alluding towards upgradeable/downgradeable RwLocks.	2023-07-04 17:40:51 +03:00
arpad-m	9c8c55e819	Add _cached and _bytes to pageserver_tenant_synthetic_size metric name (#4616 ) This renames the `pageserver_tenant_synthetic_size` metric to `pageserver_tenant_synthetic_cached_size_bytes`, as was requested on slack (link in the linked issue). * `_cached` to hint that it is not incrementally calculated * `_bytes` to indicate the unit the size is measured in Fixes #3748	2023-07-03 19:34:07 +02:00
Christian Schwarz	24eaa3b7ca	timeline creation: reflect failures due to ancestor LSN issues in status code (#4600 ) Before, it was a `500` and control plane would retry, whereas it actually should have stopped retrying. (Stacked on top of https://github.com/neondatabase/neon/pull/4597 ) fixes https://github.com/neondatabase/neon/issues/4595 part of https://github.com/neondatabase/cloud/issues/5626 --------- Co-authored-by: Shany Pozin <shany@neon.tech>	2023-07-03 15:21:10 +03:00
Shany Pozin	26828560a8	Add timeouts and retries to consumption metrics reporting client (#4563 ) ## Problem #4528 ## Summary of changes Add a 60 seconds default timeout to the reqwest client Add retries for up to 3 times to call into the metric consumption endpoint --------- Co-authored-by: Christian Schwarz <christian@neon.tech>	2023-07-03 15:20:49 +03:00
Christian Schwarz	f558f88a08	refactor: distinguished error type for timeline creation failure (#4597 ) refs https://github.com/neondatabase/neon/issues/4595	2023-06-30 14:53:21 +02:00
Alex Chi Z	7e20b49da4	refactor: use LayerDesc in LayerMap (part 2) (#4437 ) ## Problem part of https://github.com/neondatabase/neon/issues/4392, continuation of https://github.com/neondatabase/neon/pull/4408 ## Summary of changes This PR removes all layer objects from LayerMap and moves it to the timeline struct. In timeline struct, LayerFileManager maps a layer descriptor to a layer object, and it is stored in the same RwLock as LayerMap to avoid behavior difference. Key changes: * LayerMap now does not have generic, and only stores descriptors. * In Timeline, we add a new struct called layer mapping. * Currently, layer mapping is stored in the same lock with layer map. Every time we retrieve data from the layer map, we will need to map the descriptor to the actual object. * Replace_historic is moved to layer mapping's replace, and the return value behavior is different from before. I'm a little bit unsure about this part and it would be good to have some comments on that. * Some test cases are rewritten to adapt to the new interface, and we can decide whether to remove it in the future because it does not make much sense now. * LayerDescriptor is moved to `tests` module and should only be intended for unit testing / benchmarks. * Because we now have a usage pattern like "take the guard of lock, then get the reference of two fields", we want to avoid dropping the incorrect object when we intend to unlock the lock guard. Therefore, a new set of helper function `drop_r/wlock` is added. This can be removed in the future when we finish the refactor. TODOs after this PR: fully remove RemoteLayer, and move LayerMapping to a separate LayerCache. all refactor PRs: ``` #4437 --- #4479 ------------ #4510 (refactor done at this point) \-- #4455 -- #4502 --/ ``` --------- Signed-off-by: Alex Chi Z <chi@neon.tech>	2023-06-29 15:06:07 -04:00
Alek Westover	032b603011	Fix: Wrong Enum Variant (#4589 )	2023-06-29 10:55:02 -04:00
Alex Chi Z	ca0e0781c8	use const instead of magic number for repartition threshold (#4286 ) There is a magic number about how often we repartition and therefore affecting how often we compact. This PR makes this number `10` a global constant and add docs. --------- Signed-off-by: Alex Chi Z <chi@neon.tech>	2023-06-29 16:56:17 +03:00
Joonas Koivunen	02ef246db6	refactor: to pattern of await after timeout (#4432 ) Refactor the `!completed` to be about `Option<_>` instead, side-stepping any boolean true/false or false/true. As discussed on https://github.com/neondatabase/neon/pull/4399#discussion_r1219321848	2023-06-28 06:18:45 +00:00
bojanserafimov	ef2b9ffbcb	Basebackup forward compatibility (#4572 )	2023-06-27 12:05:27 -04:00
Shany Pozin	d748615c1f	RemoteTimelineClient::delete_all() to use s3::delete_objects (#4461 ) ## Problem [#4325](https://github.com/neondatabase/neon/issues/4325) ## Summary of changes Use delete_objects() method	2023-06-27 15:01:32 +03:00
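
Commit 1355bd0ac5 above replaces racy `exists` / `is_file` checks with handling of the "not found" error itself. The pattern can be sketched with the standard library alone. This is only an illustration under assumptions: `remove_if_present` is a hypothetical helper name, not a function from the pageserver.

```rust
use std::fs;
use std::io;
use std::path::Path;

/// Hypothetical helper (not from the neon codebase).
/// Returns Ok(true) if the file was removed and Ok(false) if it was already gone.
fn remove_if_present(path: &Path) -> io::Result<bool> {
    // Deliberately no `path.exists()` check up front: the file could be
    // deleted between the check and the removal (time-of-check to
    // time-of-use). Attempt the removal and classify the error instead.
    match fs::remove_file(path) {
        Ok(()) => Ok(true),
        Err(e) if e.kind() == io::ErrorKind::NotFound => Ok(false),
        // Any other error (permissions, path is a directory, ...) is still reported.
        Err(e) => Err(e),
    }
}
```

Calling it twice on the same path should yield `Ok(true)` and then `Ok(false)`, so a concurrent deleter winning the race is not treated as a failure.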
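
Commit 1c516906e7 above swaps a `short_id` method for a `Display` requirement on the `Layer` trait. A minimal sketch of that shape follows. The struct fields, the output format, and the `describe` function are assumptions made for illustration; the real `LayerFileName` and `Layer` in the pageserver look different.

```rust
use std::fmt;

// Hypothetical stand-in with made-up fields; not the real pageserver type.
struct LayerFileName {
    key_start: u32,
    key_end: u32,
    lsn: u64,
}

impl fmt::Display for LayerFileName {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        // The format string is invented for this sketch.
        write!(f, "{:08X}-{:08X}__{:016X}", self.key_start, self.key_end, self.lsn)
    }
}

// Requiring `Display` as a supertrait replaces a dedicated `short_id` method:
// every implementor can be formatted with `{}` or `.to_string()`.
trait Layer: fmt::Display {}

impl Layer for LayerFileName {}

// Hypothetical caller that only knows about the trait object.
fn describe(layer: &dyn Layer) -> String {
    format!("layer {}", layer)
}
```

With the supertrait bound, call sites that previously needed `short_id()` or `file_name()` can use ordinary formatting, including inside logging macros.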
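
Commit 26828560a8 above adds retries "for up to 3 times" to the consumption metrics client. The bounded-retry idea can be sketched without any HTTP library. Everything here is an assumption for illustration: `retry` is a hypothetical helper, it counts total attempts rather than retries after the first call, and it omits the timeout and any backoff the real client may use.

```rust
/// Hypothetical helper: call `op` until it succeeds or `max_attempts` calls
/// have been made. `op` receives the 1-based attempt number and is always
/// called at least once.
fn retry<T, E>(max_attempts: u32, mut op: impl FnMut(u32) -> Result<T, E>) -> Result<T, E> {
    let mut attempt = 1;
    loop {
        match op(attempt) {
            Ok(value) => return Ok(value),
            // Out of attempts: surface the last error to the caller.
            Err(e) if attempt >= max_attempts => return Err(e),
            // Otherwise drop the error and try again.
            Err(_) => attempt += 1,
        }
    }
}
```

A caller would wrap the request in the closure, for example `retry(3, |_| send_report())`, where `send_report` is again a hypothetical name.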

1375 Commits