rust/neon - neon - Gitea: Git with a cup of tea

rust/neon

mirror of https://github.com/neondatabase/neon.git synced 2026-01-15 01:12:56 +00:00

Author	SHA1	Message	Date
Heikki Linnakangas	9b5792b9bf	Silence test failure caused by expected error in log.	2023-01-13 10:26:15 +02:00
Heikki Linnakangas	be0dfa9d3a	Fix test_ondemand_download_large_rel if uploads are slow. If the uploads after compaction happen slowly, they might have finished before the pageserver is shut down. The L0 files have been uploaded, so no data is lost, but then the query later in the test will need to download all the L0 files, and causes the test to fail because it specifically checks that download happens on-demand, not all at once.	2023-01-12 23:40:46 +02:00
Heikki Linnakangas	5aaa5302eb	Introduce RequestContexts. RequestContext is used to track each "operation" or "task" in a way that's not tied to tokio tasks. It provides support for fine-grained cancellation of individual operations, or all tasks working on an active tenant or timeline. Most async functions now take a RequestContext argument. RequestContexts form a hierarchy, so that you have a top-level context e.g. for a TCP listener task, a child context for each task handling a connection, and perhaps a grandchild context for each individual client request. In addition to the hierarchy, each RequestContext can be associated with a Tenant or Timeline object. This is used to prevent a Tenant or Timeline from being deleted or detached while there are still tasks accessing it. This fixes a long-standing race conditions between GC/compaction and deletion (see issues #2914 and compiler in any way, but the functions like `get_active_timeline` make it easy to do the right thing. This replaces most of the machinery in `task_mgr.rs`. We don't track running tasks as such anymore, only RequestContexts. In practice, every task holds onto a RequestContext. In addition to supporting cancellation, the RequestContext specifies the desired behavior if a remote layer is needed for the operation. This replaces the `with_ondemand_download_sync` and `no_ondemand_download` macros. The on-demand download now happens deep in the call stack, in get_reconstruct_data(), and the caller is no longer involved in the download, except by passing a RequestContext that specifies whether to do on-demand download or not. The PageReconstructResult type is gone but the PageReconstructError::NeedsDownload variant remains. It's now returned if the context specified "don't do on-demand download", and a layer is missing. TODO: - Enforce better that you hold a RequestContext associated with a Tenant or Timeline. - All the fields in RequestContext are currently 'pub', but things will break if you modify the tenant/timeline fields directly. Make that more safe. - When you create a subcontext, should it inherit the Tenant / Timeline of its parent? - Can the walreceiver::TaskHandle stuff be replaced with this? - Extract smaller patches: - What else could we extract?	2023-01-12 19:24:25 +02:00
Arthur Petukhovsky	f49e923d87	Keep deleted timelines in memory of safekeeper (#3300 ) A temporal fix for https://github.com/neondatabase/neon/issues/3146, until we come up with a reliable way to create and delete timelines in all safekeepers.	2023-01-12 15:33:07 +03:00
Heikki Linnakangas	c1731bc4f0	Push on-demand download into Timeline::get() function itself. This makes Timeline::get() async, and all functions that call it directly or indirectly with it. The with_ondemand_download() mechanism is gone, Timeline::get() now always downloads files, whether you want it or not. That is what all the current callers want, so even though this loses the capability to get a page only if it's already in the pageserver, without downloading, we were not using that capability. There were some places that used 'no_ondemand_download' in the WAL ingestion code that would error out if a layer file was not found locally, but those were dubious. We do actually want to on-demand download in all of those places. Per discussion at https://github.com/neondatabase/neon/pull/3233#issuecomment-1368032358	2023-01-12 11:53:10 +02:00
Heikki Linnakangas	8c07ef413d	Minor cleanup of test_ondemand_download_timetravel test. - Fix and improve comments - Rename 'physical_size' local variable to 'resident_size' for clarity. - Remove one 'unnecessary wait_for_upload' call. The 'wait_for_sk_commit_lsn_to_reach_remote_storage' call after shutting down compute is sufficient.	2023-01-09 18:56:50 +02:00
Kirill Bulatov	23d5e2bdaa	Fix common pg port in the CLI basics test (#3283 ) Closes https://github.com/neondatabase/neon/issues/3282	2023-01-07 00:46:42 +02:00
Kirill Bulatov	b6237474d2	Fix README and basic startup example (#3275 ) Follow-up of https://github.com/neondatabase/neon/pull/3270 which made an example from main README.md not working. Fixes that, by adding a way to specify a default tenant now and modifies the basic neon_local test to start postgres and check branching. Not all neon_local commands are implemented, so not all README.md contents is tested yet.	2023-01-06 12:26:14 +02:00
Heikki Linnakangas	8b710b9753	Fix segfault if pageserver connection is lost during backend startup. It's not OK to return early from within a PG_TRY-CATCH block. The PG_TRY macro sets the global PG_exception_stack variable, and PG_END_TRY restores it. If we jump out in between with "return NULL", the PG_exception_stack is left to point to garbage. (I'm surprised the comments in PG_TRY_CATCH don't warn about this.) Add test that re-attaches tenant in pageserver while Postgres is running. If the tenant is detached while compute is connected and busy running queries, those queries will fail if they try to fetch any pages. But when the tenant is re-attached, things should start working again, without disconnecting the client <-> postgres connections. Without this fix, this reproduced the segfault. Fixes issue #3231	2023-01-05 18:51:47 +02:00
Kirill Bulatov	8712e1899e	Move initial timeline creation into pytest (#3270 ) For every Python test, we start the storage first, and expect that later, in the test, when we start a compute, it will work without specific timeline and tenant creation or their IDs specified. For that, we have a concept of "default" branch that was created on the control plane level first, but that's not needed at all, given that it's only Python tests that need it: let them create the initial timeline during set-up. Before, control plane started and stopped pageserver for timeline creation, now Python harness runs an extra tenant creation request on test env init. I had to adjust the metrics test, turns out it registered the metrics from the default tenant after an extra pageserver restart. New model does not sent the metrics before the collection time happens, and that was 30s before.	2023-01-05 17:48:27 +02:00
Christian Schwarz	d7f1e30112	remote_timeline_client: more metrics & metrics-related cleanups - Clean up redundant metric removal in TimelineMetrics::drop. RemoteTimelineClientMetrics is responsible for cleaning up REMOTE_OPERATION_TIME andREMOTE_UPLOAD_QUEUE_UNFINISHED_TASKS. - Rename `pageserver_remote_upload_queue_unfinished_tasks` to `pageserver_remote_timeline_client_calls_unfinished`. The new name reflects that the metric is with respect to the entire call to remote timeline client. This includes wait time in the upload queue and hence it's a longer span than what `pageserver_remote_OPERATION_seconds` measures. - Add the `pageserver_remote_timeline_client_calls_started` histogram. See the metric description for why we need it. - Add helper functions `call_begin` etc to `RemoteTimelineClientMetrics` to centralize the logic for updating the metrics above (they relate to each other, see comments in code). - Use these constructs to track ongoing downloads in `pageserver_remote_timeline_client_calls_unfinished` refs https://github.com/neondatabase/neon/issues/2029 fixes https://github.com/neondatabase/neon/issues/3249 closes https://github.com/neondatabase/neon/pull/3250	2023-01-05 11:50:17 +01:00
Kirill Bulatov	10dae79c6d	Tone down safekeeper and pageserver walreceiver errors (#3227 ) Closes https://github.com/neondatabase/neon/issues/3114 Adds more typization into errors that appear during protocol messages (`FeMessage`), postgres and walreceiver connections. Socket IO errors are now better detected and logged with lesser (INFO, DEBUG) error level, without traces that they were logged before, when they were wrapped in anyhow context.	2023-01-03 20:42:04 +00:00
Heikki Linnakangas	0a0e55c3d0	Replace 'tar' crate with 'tokio-tar' (#3202 ) The synchronous 'tar' crate has required us to use block_in_place and SyncIoBridge to work together with the async I/O in the client connection. Switch to 'tokio-tar' crate that uses async I/O natively. As part of this, move the CopyDataWriter implementation to postgres_backend_async.rs. Even though it's only used in one place currently, it's in principle generally applicable whenever you want to use COPY out. Unfortunately we cannot use the 'tokio-tar' as it is: the Builder implementation requires the writer to have 'static lifetime. So we have to use a modified version without that requirement. The 'static lifetime was required just for the Drop implementation that writes the end-of-archive sections if the Builder is dropped without calling `finish`. But we don't actually want that behavior anyway; in fact we had to jump through some hoops with the AbortableWrite hack to skip those. With the modified version of 'tokio-tar' without that Drop implementation, we don't need AbortableWrite either. Co-authored-by: Kirill Bulatov <kirill@neon.tech>	2023-01-03 12:39:11 +02:00
Shany Pozin	182dc785d6	Set PITR default to 7 days (#3245 ) https://github.com/neondatabase/cloud/issues/3406	2023-01-02 18:05:23 +02:00
Arseny Sher	6fd64cd5f6	Allow failure to report metrics in test_metric_collection. Per CI https://github.com/neondatabase/neon/actions/runs/3822039946/attempts/1 shutdown seems to be racy.	2023-01-02 15:37:05 +03:00
Dmitry Ivanov	c700c7db2e	[proxy] Add more labels to the pricing metrics	2022-12-29 22:25:52 +03:00
Anastasia Lubennikova	8ff7bc5df1	Add timleline_logical_size metric. Send this metric only when it is fully calculated. Make consumption metrics more stable: - Send per-timeline metrics only for active timelines. - Adjust test assertions to make test_metric_collection test more stable.	2022-12-29 19:13:54 +02:00
Shany Pozin	c0290467fa	Fix #2907 Remove missing_layers from IndexPart (#3217 ) #2907	2022-12-29 12:33:30 +02:00
Kirill Bulatov	b77c33ee06	Move tenant-related modules below `tenant` module (#3190 ) No real code changes besides moving code around and adjusting the imports.	2022-12-23 15:40:37 +02:00
Anastasia Lubennikova	8544c59329	Fix flaky test_metrics_collection.py Only check that all metrics are present on the first request, because pageserver doesn't send unchanged metrics.	2022-12-22 16:28:11 +02:00
Anastasia Lubennikova	63eb87bde3	Set default metric_collection_interval to 10 min, which is more reasonable for real usage	2022-12-22 16:24:09 +02:00
Heikki Linnakangas	7ff591ffbf	On-Demand Download The code in this change was extracted from #2595 (Heikki’s on-demand download draft PR). High-Level Changes - New RemoteLayer Type - On-Demand Download As An Effect Of Page Reconstruction - Breaking Semantics For Physical Size Metrics There are several follow-up work items planned. Refer to the Epic issue on GitHub: https://github.com/neondatabase/neon/issues/2029 closes https://github.com/neondatabase/neon/pull/3013 Co-authored-by: Kirill Bulatov <kirill@neon.tech> Co-authored-by: Christian Schwarz <christian@neon.tech> New RemoteLayer Type ==================== Instead of downloading all layers during tenant attach, we create RemoteLayer instances for each of them and add them to the layer map. On-Demand Download As An Effect Of Page Reconstruction ====================================================== At the heart of pageserver is Timeline::get_reconstruct_data(). It traverses the layer map until it has collected all the data it needs to produce the page image. Most code in the code base uses it, though many layers of indirection. Before this patch, the function would use synchronous filesystem IO to load data from disk-resident layer files if the data was not cached. That is not possible with RemoteLayer, because the layer file has not been downloaded yet. So, we do the download when get_reconstruct_data gets there, i.e., “on demand”. The mechanics of how the download is done are rather involved, because of the infamous async-sync-async sandwich problem that plagues the async Rust world. We use the new PageReconstructResult type to work around this. Its introduction is the cause for a good amount of code churn in this patch. Refer to the block comment on `with_ondemand_download()` for details. Breaking Semantics For Physical Size Metrics ============================================ We rename prometheus metric pageserver_{current,resident}_physical_size to reflect what this metric actually represents with on-demand download. This intentionally BREAKS existing grafana dashboard and the cost model data pipeline. Breaking is desirable because the meaning of this metrics has changed with on-demand download. See https://docs.google.com/document/d/12AFpvKY-7FZdR5a4CaD6Ir_rI3QokdCLSPJ6upHxJBo/edit# for how we will handle this breakage. Likewise, we rename the new billing_metrics’s PhysicalSize => ResidentSize. This is not yet used anywhere, so, this is not a breaking change. There is still a field called TimelineInfo::current_physical_size. It is now the sum of the layer sizes in layer map, regardless of whether local or remote. To compute that sum, we added a new trait method PersistentLayer::file_size(). When updating the Python tests, we got rid of current_physical_size_non_incremental. An earlier commit removed it from the OpenAPI spec already, so this is not a breaking change. test_timeline_size.py has grown additional assertions on the resident_physical_size metric.	2022-12-21 19:16:39 +01:00
Alexander Bayandin	486a985629	mypy: enable check_untyped_defs (#3142 ) Enable `check_untyped_defs` and fix warnings.	2022-12-21 09:38:42 +00:00
Anastasia Lubennikova	4235f97c6a	Implement consumption metrics collection. Add new background job to collect billing metrics for each tenant and send them to the HTTP endpoint. Metrics are cached, so we don't send non-changed metrics. Add metric collection config parameters: metric_collection_endpoint (default None, i.e. disabled) metric_collection_interval (default 60s) Add test_metric_collection.py to test metric collection and sending to the mocked HTTP endpoint. Use port distributor in metric_collection test review fixes: only update cache after metrics were send successfully, simplify code disable metric collection if metric_collection_endpoint is not provided in config	2022-12-20 22:59:52 +02:00
Heikki Linnakangas	8e2edfcf39	Retry remote downloads. Remote operations fail sometimes due to network failures or other external reasons. Add retry logic to all the remote downloads, so that a transient failure at pageserver startup or tenant attach doesn't cause the whole tenant to be marked as Broken. Like in the uploads retry logic, we print the failure to the log as a WARNing after three retries, but keep retrying. We will retry up to 10 times now, before returning the error to the caller. To test the retries, I created a new RemoteStorage wrapper that simulates failures, by returning an error for the first N times that a remote operation is performed. It can be enabled by setting a new "test_remote_failures" option in the pageserver config file. Fixes #3112	2022-12-20 14:27:24 +02:00
Heikki Linnakangas	39f58038d1	Don't upload index file in compaction, if there was nothing to do. (#3149 ) This splits the storage_sync2::schedule_index_file into two (public) functions: 1. `schedule_index_upload_for_metadata_update`, for when the metadata (e.g. disk_consistent_lsn or last_gc_cutoff) has changed, and 2. `schedule_index_upload_for_file_changes`, for when layer file uploads or deletions have been scheduled. We now keep track of whether there have been any uploads or deletions since the last index-file upload, and skip the upload in `schedule_index_upload_for_file_changes` if there haven't been any changes. That allows us to call the function liberally in timeline.rs, whenever layer file uploads or deletions might've been scheduled, without starting a lot of unnecessary index file uploads. GC was covered earlier by commit `c262390214`, but that missed that we have the same problem with compaction.	2022-12-19 23:58:24 +02:00
Kirill Bulatov	49a211c98a	Add neon_local test	2022-12-19 21:43:36 +02:00
Christian Schwarz	7db018e147	[4/4] the fix: do not leak spawn_blocking() tasks from logical size calculation code - Refactor logical_size_calculation_task, moving the pieces that are specific to try_spawn_size_init_task into that function. This allows us to spawn additional size calculation tasks that are not init size calculation tasks. - As part of this refactoring, stop logging cancellations as errors. They are part of regular operations. Logging them as errors was inadvertently introduced in earlier commit 427c1b2e9661161439e65aabc173d695cfc03ab4 initial logical size calculation: if it fails, retry on next call - Change tenant size model request code to spawn task_mgr tasks using the refactored logical_size_calculation_task function. Using a task_mgr task ensures that the calculation cannot outlive the timeline. - There are presumably still some subtle race conditions if a size requests comes in at exactly the same time as a detach / delete request. - But that's the concern of diferent area of the code (e.g., tenant_mgr) and requires holistic solutions, such as the proposed TenantGuard. - Make size calculation cancellable using CancellationToken. This is more of a cherry on top. NB: the test code doesn't use this because we _must_ return from the failpoint, because the failpoint lib doesn't allow to just continue execution in combination with executing the closure. This commit fixes the tests introduced earlier in this patch series.	2022-12-19 16:14:58 +01:00
Christian Schwarz	38ebd6e7a0	[3/4] make initial size estimation task sensitive to task_mgr shutdown requests This exacerbates the problem pointed out in the previous commit. Why? Because with this patch, deleting a timeline also exposes the issue. Extend the test to expose the problem.	2022-12-19 16:14:58 +01:00
Christian Schwarz	40a3d50883	[2/4] add test to show that tenant detach makes us leak running size calculation task	2022-12-19 16:14:58 +01:00
Dmitry Ivanov	83baf49487	[proxy] Forward compute connection params to client This fixes all kinds of problems related to missing params, like broken timestamps (due to `integer_datetimes`). This solution is not ideal, but it will help. Meanwhile, I'm going to dedicate some time to improving connection machinery. Note that this does not fix problems with passing certain parameters in a reverse direction, i.e. from client to compute. This is a separate matter and will be dealt with in an upcoming PR.	2022-12-16 21:37:50 +03:00
Heikki Linnakangas	c262390214	Don't upload index file when GC doesn't remove anything. I saw an excessive number of index file upload operations in production, even when nothing on the timeline changes. It was because our GC schedules index file upload if the GC cutoff LSN is advanced, even if the GC had nothing else to do. The GC cutoff LSN marches steadily forwards, even when there is no user activity on the timeline, when the cutoff is determined by the time-based PITR interval setting. To dial that down, only schedule index file upload when GC is about to actually remove something.	2022-12-16 11:05:55 +02:00
Heikki Linnakangas	6dec85b19d	Redefine the timeline_gc API to not perform a forced compaction Previously, the /v1/tenant/:tenant_id/timeline/:timeline_id/do_gc API call performed a flush and compaction on the timeline before GC. Change it not to do that, and change all the tests that used that API to perform compaction explicitly. The compaction happens at a slightly different point now. Previously, the code performed the `refresh_gc_info_internal` step first, and only then did compaction on all the timelines. I don't think that was what was originally intended here. Presumably the idea with compaction was to make some old layer files available for GC. But if we're going to flush the current in-memory layer to disk, surely you would want to include the newly-written layer in the compaction too. I guess this didn't make any difference to the tests in practice, but in any case, the tests now perform the flush and compaction before any of the GC steps. Some of the tests might not need the compaction at all, but I didn't try hard to determine which ones might need it. I left it out from a few tests that intentionally tested calling do_gc with an invalid tenant or timeline ID, though.	2022-12-16 11:05:55 +02:00
Christian Schwarz	397b60feab	common abstraction for waiting for SK commit_lsn to reach PS	2022-12-15 11:50:39 +01:00
Christian Schwarz	4132ae9dfe	always remove RemoteTimelineClient's metrics when dropping it	2022-12-14 19:25:29 +01:00
Alexander Bayandin	8fcba150db	test_seqscans: temporarily disable remote test (#3101 ) Temporarily disable `test_seqscans` for remote projects; they acquire too much space and time. We can try to reenable it back after switching to per-test projects.	2022-12-14 18:05:05 +00:00
Kirill Bulatov	4d201619ed	Remove large database files after every test suite (#3090 ) Closes https://github.com/neondatabase/neon/issues/1984 Closes https://github.com/neondatabase/neon/pull/2830 A follow-up of https://github.com/neondatabase/neon/pull/2830, I've noticed that benchmarks failed again due to out of space issues. Removes most of the pageserver and safekeeper files from disk after every pytest suite run. ``` $ poetry run pytest -vvsk "test_tenant_redownloads_truncated_file_on_startup[local_fs]" # ... $ du -h test_output/test_tenant_redownloads_truncated_file_on_startup\[local_fs\] # ... 104K test_output/test_tenant_redownloads_truncated_file_on_startup[local_fs] $ poetry run pytest -vvsk "test_tenant_redownloads_truncated_file_on_startup[local_fs]" --preserve-database-files # ... $ du -h test_output/test_tenant_redownloads_truncated_file_on_startup\[local_fs\] # ... 123M test_output/test_tenant_redownloads_truncated_file_on_startup[local_fs] ``` Co-authored-by: Bojan Serafimov <bojan.serafimov7@gmail.com>	2022-12-14 13:09:08 +00:00
Shany Pozin	ada5b7158f	Fix Issue #3014 (#3059 ) * TenantConfigRequest now supports tenant_id as hex string input instead of bytes array * Config file is truncated in each creation/update	2022-12-14 14:09:16 +02:00
Dmitry Ivanov	607c0facfc	[proxy] Propagate more console API errors to the user This patch aims to fix some of the inconsistencies in error reporting, for example "Internal error" or "Console request failed" instead of "password authentication failed for user '<NAME>'".	2022-12-13 16:16:31 +03:00
Christian Schwarz	22ae67af8d	refactor: use new type LayerFileName when referring to layer file names in PathBuf/RemotePath (#3026 ) refactor: use new type LayerFileName when referring to layer file names in PathBuf/RemotePath Before this patch, we would sometimes carry around plain file names in `Path` types and/or awkwardly "rebase" paths to have a unified representation of the layer file name between local and remote. This patch introduces a new type `LayerFileName` which replaces the use of `Path` / `PathBuf` / `RemotePath` in the `storage_sync2` APIs. Instead of holding a string, it contains the parsed representation of the image and delta file name. When we need the file name, e.g., to construct a local path or remote object key, we construct the name ad-hoc. `LayerFileName` is also serde {Dese,Se}rializable, and in an initial version of this patch, it was supposed to be used directly inside `IndexPart`, replacing `RemotePath`. However, commit `3122f3282f` Ignore backup files (ones with .n.old suffix) in download_missing fixed handling of `.old` backup file names in IndexPart, and we need to carry that behavior forward. The solution is to remove `.old` backup files names during deserialization. When we re-serialize the IndexPart, the `*.old` file will be gone. This leaks the `.old` file in the remote storage, but makes it safe to clean it up later. There is additional churn by a preliminary refactoring that got squashed into this change: split off LayerMap's needs from trait Layer into super trait That refactoring renames `Layer` to `PersistentLayer` and splits off a subset of the functions into a super-trait called `Layer`. The upser trait implements just the functions needed by `LayerMap`, whereas `PersisentLayer` adds the context of the pageserver. The naming is imperfect as some functions that reside in `PersistentLayer` have nothing persistence-specific to it. But it's a step in the right direction.	2022-12-13 01:27:59 +02:00
Arseny Sher	32662ff1c4	Replace etcd with storage_broker. This is the replacement itself, the binary landed earlier. See docs/storage_broker.md. ref https://github.com/neondatabase/neon/pull/2466 https://github.com/neondatabase/neon/issues/2394	2022-12-12 13:30:16 +03:00
Kirill Bulatov	700a36ee6b	Wait for certain tenant status in the remote storage test (#3055 ) Closes https://github.com/neondatabase/neon/issues/3052 From what I could understand from the PR, we did not wait enough before the attach failed. Extended the wait period a bit and put a check for a status instead of plain `sleep` to fail if we don't get the expected status.	2022-12-10 10:18:55 +02:00
Joonas Koivunen	b8a5664fb9	test: kill spawned postgres (#3054 ) Fixes #2604.	2022-12-10 00:35:05 +02:00
Dmitry Rodionov	3122f3282f	Ignore backup files (ones with .n.old suffix) in download_missing This is rather a hack to resolve immediate issue: https://github.com/neondatabase/neon/issues/3024 Properly cleaning this file from index part requires changes to initialization of remote queue. Because we need to clean it up earlier than we start warking around files. With on-demand there will be no walk around layer files becase download_missing is no longer needed, so I believe it will be natural to unify this with load_layer_map	2022-12-09 12:07:50 +03:00
Heikki Linnakangas	b513619503	Remove obsolete 'awaits_download' field. It used to be a separate piece of state, but after `9a6c0be823` it's just an alias for the Tenant being in Attaching state. It was only used in one assertion in a test, but that check doesn't make sense anymore, so just remove it. Fixes https://github.com/neondatabase/neon/issues/2930	2022-12-07 13:13:54 +02:00
Kirill Bulatov	6a57d5bbf9	Make the request tracing test more useful	2022-12-06 23:52:16 +02:00
Kirill Bulatov	d6bfe955c6	Add commands to unload and load the tenant in memory (#2977 ) Closes https://github.com/neondatabase/neon/issues/2537 Follow-up of https://github.com/neondatabase/neon/pull/2950 With the new model that prevents attaching without the remote storage, it has started to be even more odd to add attach-with-files functionality (in addition to the issues raised previously). Adds two separate commands: * `POST {tenant_id}/ignore` that places a mark file to skip such tenant on every start and removes it from memory * `POST {tenant_id}/schedule_load` that tries to load a tenant from local FS similar to what pageserver does now on startup, but without directory removals	2022-12-06 15:30:02 +00:00
Bojan Serafimov	edfebad3a1	Add test that importing an empty file fails. We used to have a bug where the pageserver just got stuck if the client sent a CopyDone message before reaching end of tar stream. That showed up with an empty tar file, as one example. That was inadvertently fixed by code refactorings, but let's add a regression test for it, so that we don't accidentally re-introduce the bug later. Co-authored-by: Heikki Linnakangas <heikki@neon.tech>	2022-12-01 12:08:56 +02:00
Heikki Linnakangas	9a6c0be823	storage_sync2 The code in this change was extracted from PR #2595, i.e., Heikki’s draft PR for on-demand download. High-Level Changes - storage_sync module rewrite - Changes to Tenant Loading - Changes to Timeline States - Crash-safe & Resumable Tenant Attach There are several follow-up work items planned. Refer to the Epic issue on GitHub: https://github.com/neondatabase/neon/issues/2029 Metadata: closes https://github.com/neondatabase/neon/pull/2785 unsquashed history of this patch: archive/pr-2785-storage-sync2/pre-squash Co-authored-by: Dmitry Rodionov <dmitry@neon.tech> Co-authored-by: Christian Schwarz <christian@neon.tech> =============================================================================== storage_sync module rewrite =========================== The storage_sync code is rewritten. New module name is storage_sync2, mostly to make a more reasonable git diff. The updated block comment in storage_sync2.rs describes the changes quite well, so, we will not reproduce that comment here. TL;DR: - Global sync queue and RemoteIndex are replaced with per-timeline `RemoteTimelineClient` structure that contains a queue for UploadOperations to ensure proper ordering and necessary metadata. - Before deleting local layer files, wait for ongoing UploadOps to finish (wait_completion()). - Download operations are not queued and executed immediately. Changes to Tenant Loading ========================= Initial sync part was rewritten as well and represents the other major change that serves as a foundation for on-demand downloads. Routines for attaching and loading shifted directly to Tenant struct and now are asynchronous and spawned into the background. Since this patch doesn’t introduce on-demand download of layers we fully synchronize with the remote during pageserver startup. See details in `Timeline::reconcile_with_remote` and `Timeline::download_missing`. Changes to Tenant States ======================== The “Active” state has lost its “background_jobs_running: bool” member. That variable indicated whether the GC & Compaction background loops are spawned or not. With this patch, they are now always spawned. Unit tests (#[test]) use the TenantConf::{gc_period,compaction_period} to disable their effect (`15db566`). This patch introduces a new tenant state, “Attaching”. A tenant that is being attached starts in this state and transitions to “Active” once it finishes download. The `GET /tenant` endpoints returns `TenantInfo::has_in_progress_downloads`. We derive the value for that field from the tenant state now, to remain backwards-compatible with cloud.git. We will remove that field when we switch to on-demand downloads. Changes to Timeline States ========================== The TimelineInfo::awaits_download field is now equivalent to the tenant being in Attaching state. Previously, download progress was tracked per timeline. With this change, it’s only tracked per tenant. When on-demand downloads arrive, the field will be completely obsolete. Deprecation is tracked in isuse #2930. Crash-safe & Resumable Tenant Attach ==================================== Previously, the attach operation was not persistent. I.e., when tenant attach was interrupted by a crash, the pageserver would not continue attaching after pageserver restart. In fact, the half-finished tenant directory on disk would simply be skipped by tenant_mgr because it lacked the metadata file (it’s written last). This patch introduces an “attaching” marker file inside that is present inside the tenant directory while the tenant is attaching. During pageserver startup, tenant_mgr will resume attach if that file is present. If not, it assumes that the local tenant state is consistent and tries to load the tenant. If that fails, the tenant transitions into Broken state.	2022-11-29 18:55:20 +01:00
Heikki Linnakangas	baa8d5a16a	Test that physical size is the same before and after re-attaching tenant.	2022-11-29 14:32:01 +02:00

1 2 3

130 Commits