
rust/neon

mirror of https://github.com/neondatabase/neon.git synced 2026-01-13 16:32:56 +00:00

Author	SHA1	Message	Date
Vadim Kharitonov	e9072ee178	Compile rdkit (#4442 ) `rdkit` extension ``` postgres=# create extension rdkit; CREATE EXTENSION postgres=# select 'c1[o,s]ncn1'::qmol; qmol ------------- c1[o,s]ncn1 (1 row) ```	2023-06-12 11:13:33 +02:00
Joonas Koivunen	7e17979d7a	feat: http request logging on safekeepers. With RequestSpan, successful GETs are not logged, but all others, errors and warns on cancellations are.	2023-06-11 22:53:08 +04:00
Arseny Sher	227271ccad	Switch safekeepers to async. This is a full switch, fs io operations are also tokio ones, working through thread pool. Similar to pageserver, we have multiple runtimes for easier `top` usage and isolation. Notable points: - Now that guts of safekeeper.rs are full of .await's, we need to be very careful not to drop task at random point, leaving timeline in unclear state. Currently the only writer is walreceiver and we don't have top level cancellation there, so we are good. But to be safe probably we should add a fuse panicking if task is being dropped while operation on a timeline is in progress. - Timeline lock is Tokio one now, as we do disk IO under it. - Collecting metrics got a crutch: since prometheus Collector is synchronous, it spawns a thread with current thread runtime collecting data. - Anything involving closures becomes significantly more complicated, as async fns are already kinda closures + 'async closures are unstable'. - Main thread now tracks other main tasks, which got much easier. - The only sync place left is initial data loading, as otherwise clippy complains on timeline map lock being held across await points -- which is not bad here as it happens only in single threaded runtime of main thread. But having it sync doesn't hurt either. I'm concerned about performance of thread pool io offloading, async traits and many await points; but we can try and see how it goes. fixes https://github.com/neondatabase/neon/issues/3036 fixes https://github.com/neondatabase/neon/issues/3966	2023-06-11 22:53:08 +04:00
dependabot[bot]	fbf0367e27	build(deps): bump cryptography from 39.0.1 to 41.0.0 (#4409 )	2023-06-11 19:14:30 +01:00
Arthur Petukhovsky	a21b55fe0b	Use connect_timeout for broker::connect (#4452 ) Use `storage_broker::connect` everywhere. Add a default 5 seconds timeout for opening new connection.	2023-06-09 17:38:53 +03:00
Shany Pozin	add51e1372	Add delete_objects to storage api (#4449 ) ## Summary of changes Add missing delete_objects API to support bulk deletes	2023-06-09 13:23:12 +03:00
Alex Chi Z	cdce04d721	pgserver: add local manifest for atomic operation (#4422 ) ## Problem Part of https://github.com/neondatabase/neon/issues/4418 ## Summary of changes This PR implements the local manifest interfaces. After the refactor of timeline is done, we can integrate this with the current storage. The reader will stop at the first corrupted record. --------- Signed-off-by: Alex Chi <iskyzh@gmail.com> Co-authored-by: bojanserafimov <bojan.serafimov7@gmail.com>	2023-06-08 19:34:25 -04:00
bojanserafimov	6bac770811	Add cold start test (#4436 )	2023-06-08 18:11:33 -04:00
Stas Kelvich	c82d19d8d6	Fix NULLs handling in proxy json endpoint There were a few problems with null handling: * query_raw_txt() accepted vector of string so it always (erroneously) treated "null" as a string instead of null. Change rust pg client to accept the vector of Option<String> instead of just Strings. Adopt coding here to pass nulls as None. * pg_text_to_json() had a check that always interpreted "NULL" string as null. That is wrong and nulls were already handled by match None. This bug appeared as a bad attempt to parse arrays containing NULL elements. Fix coding by checking presence of quotes while parsing an array (no quotes -> null, quoted -> "null" string). Array parser fix also slightly changes behavior by always cleaning current entry when pushing to the resulting vector. This seems to be an omission by previous coding, however looks like it was harmless as entry was not cleared only at the end of the nested or top-level array.	2023-06-08 16:00:18 +03:00
Stas Kelvich	d73639646e	Add more output options to proxy json endpoint With this commit a client can pass the following optional headers: `Neon-Raw-Text-Output: true`. Return postgres values as text, without parsing them. So numbers, objects, booleans, nulls and arrays will be returned as text. That can be useful in cases when client code wants to implement its own parsing or reuse parsing libraries from e.g. node-postgres. `Neon-Array-Mode: true`. Return postgres rows as arrays instead of objects. That is a more compact representation and also helps in some edge cases where it is hard to use rows represented as objects (e.g. when several fields have the same name).	2023-06-08 16:00:18 +03:00
Dmitry Rodionov	d53f9ab3eb	delete timelines from s3 (#4384 ) Delete data from s3 when timeline deletion is requested ## Summary of changes UploadQueue is altered to support scheduling of delete operations in stopped state. This looks weird, and I'm thinking whether there are better options/refactorings for upload client to make it look better. Probably can be part of https://github.com/neondatabase/neon/issues/4378 Deletion is implemented directly in existing endpoint because changes are not that significant. If we want more safety we can separate those or create feature flag for new behavior. resolves [#4193](https://github.com/neondatabase/neon/issues/4193) --------- Co-authored-by: Joonas Koivunen <joonas@neon.tech>	2023-06-08 15:01:22 +03:00
Dmitry Rodionov	8560a98d68	fix openapi spec to pass swagger editor validation (#4445 ) There shouldn't be a dash before `type: object`. Also added description.	2023-06-08 13:25:30 +03:00
Alex Chi Z	2e687bca5b	refactor: use LayerDesc in layer map (part 1) (#4408 ) ## Problem part of https://github.com/neondatabase/neon/issues/4392 ## Summary of changes This PR adds a new HashMap that maps persistent layer desc to the layer object inside LayerMap. Originally I directly went towards adding such layer cache in Timeline, but the changes are too many and cannot be reviewed as a reasonably-sized PR. Therefore, we take this intermediate step to change part of the codebase to use persistent layer desc, and come up with other PRs to move this hash map of layer desc to the timeline struct. Also, file_size is now part of the layer desc. --------- Signed-off-by: Alex Chi <iskyzh@gmail.com> Co-authored-by: bojanserafimov <bojan.serafimov7@gmail.com>	2023-06-07 18:28:18 +03:00
Dmitry Rodionov	1a1019990a	map TenantState::Broken to TenantAttachmentStatus::Failed (#4371 ) ## Problem Attach failures are not reported in the public part of the api (in the `attachment_status` field of TenantInfo). ## Summary of changes Expose TenantState::Broken as TenantAttachmentStatus::Failed. As written, the Failed status will be reported even if no attachment happened (i.e., if the tenant becomes broken on startup). This is in line with other members, i.e., Active will be resolved to Attached even if no actual attach took place. This can be tweaked if needed. At the current stage it would be overengineering without clear motivation. resolves #4344	2023-06-07 18:25:30 +03:00
Alex Chi Z	1c200bd15f	fix: break dev dependencies between wal_craft and pg_ffi (#4424 ) ## Problem close https://github.com/neondatabase/neon/issues/4266 ## Summary of changes With this PR, rust-analyzer should be able to give lints and auto complete in `mod tests`, and this makes writing tests easier. Previously, rust-analyzer cannot do auto completion. --------- Signed-off-by: Alex Chi <iskyzh@gmail.com>	2023-06-07 17:51:13 +03:00
Arseny Sher	37bf2cac4f	Persist safekeeper control file once in a while. It should make remote_consistent_lsn commonly up-to-date on non actively writing projects, which removes the spike of pageserver -> safekeeper reconnections on storage nodes restart.	2023-06-07 17:23:37 +04:00
Joonas Koivunen	5761190e0d	feat: three phased startup order (#4399 ) Initial logical size calculation could still hinder our fast startup efforts in #4397. See #4183. In deployment of 2023-06-06 about a 200 initial logical sizes were calculated on hosts which took the longest to complete initial load (12s). Implements the three step/tier initialization ordering described in #4397: 1. load local tenants 2. do initial logical sizes per walreceivers for 10s 3. background tasks Ordering is controlled by: - waiting on `utils::completion::Barrier`s on background tasks - having one attempt for each Timeline to do initial logical size calculation - `pageserver/src/bin/pageserver.rs` releasing background jobs after timeout or completion of initial logical size calculation The timeout is there just to safeguard in case a legitimate non-broken timeline initial logical size calculation goes long. The timeout is configurable, by default 10s, which I think would be fine for production systems. In the test cases I've been looking at, it seems that these steps are completed as fast as possible. Co-authored-by: Christian Schwarz <christian@neon.tech>	2023-06-07 14:29:23 +03:00
Vadim Kharitonov	88f0cfc575	Fix `pgx_ulid` extension (#4431 ) The issue was in the wrong `control` file name	2023-06-07 11:41:53 +02:00
Arseny Sher	6b3c020cd9	Don't warn on system id = 0 in walproposer greeting. sync-safekeepers doesn't know it and sends 0.	2023-06-07 12:39:20 +04:00
Arseny Sher	c058e1cec2	Quick exit in truncate_wal if nothing to do. ref https://github.com/neondatabase/neon/issues/4414	2023-06-07 12:39:20 +04:00
Arseny Sher	dc6a382873	Increase timeouts on compute -> sk connections. context: https://github.com/neondatabase/neon/issues/4414 And improve messages/comments here and there.	2023-06-07 12:39:20 +04:00
Heikki Linnakangas	df3bae2ce3	Use `compute_ctl` to manage Postgres in tests. (#3886 ) This adds test coverage for 'compute_ctl', as it is now used by all the python tests. There are a few differences in how 'compute_ctl' is called in the tests, compared to the real web console: - In the tests, the postgresql.conf file is included as one large string in the spec file, and it is written out as it is to the data directory. I added a new field for that to the spec file. The real web console, however, sets all the necessary settings in the 'settings' field, and 'compute_ctl' creates the postgresql.conf from those settings. - In the tests, the information needed to connect to the storage, i.e. tenant_id, timeline_id, connection strings to pageserver and safekeepers, are now passed as new fields in the spec file. The real web console includes them as the GUCs in the 'settings' field. (Both of these are different from what the test control plane used to do: It used to write the GUCs directly in the postgresql.conf file). The plan is to change the control plane to use the new method, and remove the old method, but for now, support both. Some tests that were sensitive to the amount of WAL generated needed small changes, to accommodate that compute_ctl runs the background health monitor which makes a few small updates. Also some tests shut down the pageserver, and now that the background health check can run some queries while the pageserver is down, that can produce a few extra errors in the logs, which needed to be allowlisted. Other changes: - remove obsolete comments about PostgresNode; - create standby.signal file for Static compute node; - log output of `compute_ctl` and `postgres` is merged into `endpoints/compute.log`. --------- Co-authored-by: Anastasia Lubennikova <anastasia@neon.tech>	2023-06-06 14:59:36 +01:00
Joonas Koivunen	0cef7e977d	refactor: just one way to shutdown a tenant (#4407 ) We have 2 ways of tenant shutdown, we should have just one. Changes are mostly mechanical simple refactorings. Added `warn!` on the "shutdown all remaining tasks" should trigger test failures in the between time of not having solved the "tenant/timeline owns all spawned tasks" issue. Cc: #4327.	2023-06-06 15:30:55 +03:00
Joonas Koivunen	18a9d47f8e	test: restore NotConnected being allowed globally (#4426 ) Flakiness introduced by #4402, evidence [^1]. I had assumed NotConnected would have been an expected io error, but it's not. Restore the global `allowed_error`. [^1]: https://neon-github-public-dev.s3.amazonaws.com/reports/pr-4407/5185897757/index.html#suites/82004ab4e3720b47bf78f312dabe7c55/14f636d0ecd3939d/	2023-06-06 13:51:39 +03:00
Sasha Krassovsky	ac11e7c32d	Remove arch-specific stuff from HNSW extension (#4423 )	2023-06-05 22:04:15 -08:00
Konstantin Knizhnik	8e1b5e1224	Remove -ftree-vectorizer-verbose=0 option notrecognized by MaxOS/X c… (#4412 ) …ompiler ## Problem ## Summary of changes ## Checklist before requesting a review - [ ] I have performed a self-review of my code. - [ ] If it is a core feature, I have added thorough tests. - [ ] Do we need to implement analytics? if so did you add the relevant metrics to the dashboard? - [ ] If this PR requires public announcement, mark it with /release-notes label and add several sentences in this section. ## Checklist before merging - [ ] Do not forget to reformat commit message to not include the above checklist	2023-06-05 20:10:19 +03:00
Joonas Koivunen	e0bd81ce1f	test: fix flaky warning on attach (#4415 ) added the `allowed_error` to the `positive_env` so any tests completing the attach are allowed have this print out. they are allowed to do so, because the `random_init_delay` can produce close to zero and thus the first run will be near attach. Though... Unsure if we ever really need the eviction task to run before it can evict something, as in after 20min or 24h. in the failed test case however period is 20s so interesting that we didn't run into this sooner. evidence of flaky: https://github.com/neondatabase/neon/actions/runs/5175677035/jobs/9323705929?pr=4399#step:4:38536	2023-06-05 18:12:58 +03:00
Joonas Koivunen	77598f5d0a	Better walreceiver logging (#4402 ) walreceiver logs are a bit hard to understand because of partial span usage, extra messages, ignored errors popping up as huge stacktraces. Fixes #3330 (by spans, also demote info -> debug). - arrange walreceivers spans into a hierarchy: - `wal_connection_manager{tenant_id, timeline_id}` -> `connection{node_id}` -> `poller` - unifies the error reporting inside `wal_receiver`: - All ok errors are now `walreceiver connection handling ended: {e:#}` - All unknown errors are still stacktraceful task_mgr reported errors with context `walreceiver connection handling failure` - Remove `connect` special casing, was: `DB connection stream finished` for ok errors - Remove `done replicating` special casing, was `Replication stream finished` for ok errors - lowered log levels for (non-exhaustive list): - `WAL receiver manager started, connecting to broker` (at startup) - `WAL receiver shutdown requested, shutting down` (at shutdown) - `Connection manager loop ended, shutting down` (at shutdown) - `sender is dropped while join handle is still alive` (at lucky shutdown, see #2885) - `timeline entered terminal state {:?}, stopping wal connection manager loop` (at shutdown) - `connected!` (at startup) - `Walreceiver db connection closed` (at disconnects?, was without span) - `Connection cancelled` (at shutdown, was without span) - `observed timeline state change, new state is {new_state:?}` (never after Timeline::activate was made infallible) - changed: - `Timeline dropped state updates sender, stopping wal connection manager loop` - was out of date; sender is not dropped but `Broken \| Stopping` state transition - also made `debug!` - `Timeline dropped state updates sender before becoming active, stopping wal connection manager loop` - was out of date: sender is again not dropped but `Broken \| Stopping` state transition - also made `debug!` - log fixes: - stop double reporting panics via JoinError	2023-06-05 17:35:23 +03:00
Joonas Koivunen	8142edda01	test: Less flaky gc (#4416 ) Solves a flaky test error in the wild[^1] by: - Make the gc shutdown signal reading an `allowed_error` - Note the gc shutdown signal readings as being in `allowed_error`s - Allow passing tenant conf to init_start to avoid unnecessary tenants [^1]: https://neon-github-public-dev.s3.amazonaws.com/reports/pr-4399/5176432780/index.html#suites/b97efae3a617afb71cb8142f5afa5224/2cd76021ea011f93	2023-06-05 15:43:52 +03:00
Vadim Kharitonov	b9871158ba	Compile PGX ULID extension (#4413 ) Create pgx_ulid extension ``` postgres=# create extension ulid; CREATE EXTENSION postgres=# CREATE TABLE users ( id ulid NOT NULL DEFAULT gen_ulid() PRIMARY KEY, name text NOT NULL ); CREATE TABLE postgres=# insert into users (name) values ('vadim'); INSERT 0 1 postgres=# select * from users; id \| name ----------------------------+------- 01H25DDG3KYMYZTNR41X38E256 \| vadim ```	2023-06-05 12:52:13 +03:00
Joonas Koivunen	8caef2c0c5	fix: delay `eviction_task` as well (#4397 ) As seen on deployment of 2023-06-01 release, times were improving but there were some outliers caused by: - timelines `eviction_task` starting while activating and running imitation - timelines `initial logical size` calculation This PR fixes it so that `eviction_task` is delayed like other background tasks fixing an oversight from earlier #4372. After this PR activation will be two phases: 1. load and activate tenants AND calculate some initial logical sizes 2. rest of initial logical sizes AND background tasks - compaction, gc, disk usage based eviction, timelines `eviction_task`, consumption metrics	2023-06-05 09:37:53 +03:00
Konstantin Knizhnik	04542826be	Add HNSW extension (#4227 ) ## Describe your changes Port HNSW implementation for ANN search top Postgres ## Issue ticket number and link https://www.pinecone.io/learn/hnsw ## Checklist before requesting a review - [ ] I have performed a self-review of my code. - [ ] If it is a core feature, I have added thorough tests. - [ ] Do we need to implement analytics? if so did you add the relevant metrics to the dashboard? - [ ] If this PR requires public announcement, mark it with /release-notes label and add several sentences in this section. ## Checklist before merging - [ ] Do not forget to reformat commit message to not include the above checklist	2023-06-04 11:41:38 +03:00
bojanserafimov	4ba950a35a	Add libcurl as dependency to readme (#4405 )	2023-06-02 18:07:45 -04:00
Joonas Koivunen	a55c663848	chore: comment marker fixes (#4406 ) Upgrading to rust 1.70 will require these.	2023-06-02 21:03:12 +03:00
Heikki Linnakangas	9787227c35	Shield HTTP request handlers from async cancellations. (#4314 ) We now spawn a new task for every HTTP request, and wait on the JoinHandle. If Hyper drops the Future, the spawned task will keep running. This protects the rest of the pageserver code from unexpected async cancellations. This creates a CancellationToken for each request and passes it to the handler function. If the HTTP request is dropped by the client, the CancellationToken is signaled. None of the handler functions make use of the CancellationToken currently, but now they could. The CancellationToken arguments also work like documentation. When you're looking at a function signature and you see that it takes a CancellationToken as argument, it's a nice hint that the function might run for a long time, and won't be async cancelled. The default assumption in the pageserver is now that async functions are not cancellation-safe anyway, unless explicitly marked as such, but this is a nice extra reminder. Spawning a task for each request is OK from a performance point of view because spawning is very cheap in Tokio, and none of our HTTP requests are very performance critical anyway. Fixes issue #3478	2023-06-02 08:28:13 -04:00
Joonas Koivunen	ef80a902c8	pg_sni_router: add session_id to more messages (#4403 ) See superseded #4390. - capture log in test - expand the span to cover init and error reporting - remove obvious logging by logging only unexpected	2023-06-02 14:59:10 +03:00
Alex Chi Z	66cdba990a	refactor: use PersistentLayerDesc for persistent layers (#4398 ) ## Problem Part of https://github.com/neondatabase/neon/issues/4373 ## Summary of changes This PR adds `PersistentLayerDesc`, which will be used in LayerMap mapping and probably layer cache. After this PR and after we change LayerMap to map to layer desc, we can safely drop RemoteLayerDesc. --------- Signed-off-by: Alex Chi <iskyzh@gmail.com> Co-authored-by: bojanserafimov <bojan.serafimov7@gmail.com>	2023-06-01 22:06:28 +03:00
Alex Chi Z	82484e8241	pgserver: add more metrics for better observability (#4323 ) ## Problem This PR includes doc changes to the current metrics as well as adding new metrics. With the new set of metrics, we can quantitatively analyze the read amp., write amp. and space amp. in the system, when used together with https://github.com/neondatabase/neonbench close https://github.com/neondatabase/neon/issues/4312 ref https://github.com/neondatabase/neon/issues/4347 compaction metrics TBD, a novel idea is to print L0 file number and number of layers in the system, and we can do this in the future when we start working on compaction. ## Summary of changes * Add `READ_NUM_FS_LAYERS` for computing read amp. * Add `MATERIALIZED_PAGE_CACHE_HIT_UPON_REQUEST`. * Add `GET_RECONSTRUCT_DATA_TIME`. GET_RECONSTRUCT_DATA_TIME + RECONSTRUCT_TIME + WAIT_LSN_TIME should be approximately total time of reads. * Add `5.0` and `10.0` to `STORAGE_IO_TIME_BUCKETS` given some fsync runs slow (i.e., > 1s) in some cases. * Some `WAL_REDO` metrics are only used when Postgres is involved in the redo process. --------- Signed-off-by: Alex Chi <iskyzh@gmail.com>	2023-06-01 21:46:04 +03:00
Joonas Koivunen	36fee50f4d	compute_ctl: enable tracing panic hook (#4375 ) compute_ctl can panic, but `tracing` is used for logging. panic stderr output can interleave with messages from normal logging. The fix is to use the established way (pageserver, safekeeper, storage_broker) of using `tracing` to report panics.	2023-06-01 20:12:07 +03:00
bojanserafimov	330083638f	Fix stale and misleading comment in LayerMap (#4297 )	2023-06-01 05:04:46 +03:00
Konstantin Knizhnik	952d6e43a2	Add pageserver parameter forced_image_creation_limit which can be used… (#4353 ) This parameter can be used to restrict the number of image layers generated because of GC requests (wanted image layers). When set to zero, it completely eliminates creation of such image layers, which avoids extra storage consumption after merging #3673. ## Problem PR #3673 forces generation of missed image layers, so in the short term it can increase the size of storage (in the worst case up to two times). It was intended (by me) that the GC period be comparable with the PiTR interval, but that does not seem to be the case now: GC is performed much more frequently. This may cause space exhaustion: GC forces new image creation while a large PiTR interval still prevents GC from collecting old layers. ## Summary of changes Add a new pageserver parameter `forced_image_creation_limit` which restricts the number of image layers created at the request of GC. ## Checklist before requesting a review - [ ] I have performed a self-review of my code. - [ ] If it is a core feature, I have added thorough tests. - [ ] Do we need to implement analytics? if so did you add the relevant metrics to the dashboard? - [ ] If this PR requires public announcement, mark it with /release-notes label and add several sentences in this section. ## Checklist before merging - [ ] Do not forget to reformat commit message to not include the above checklist	2023-05-31 21:37:20 +03:00
bojanserafimov	b6447462dc	Fix layer map correctness bug (#4342 )	2023-05-31 12:23:00 -04:00
Dmitry Rodionov	b190c3e6c3	reduce flakiness by allowing Compaction failed, retrying in X queue is in state Stopped. (#4379 ) resolves https://github.com/neondatabase/neon/issues/4374 by adding the error to allowed_errors	2023-05-30 20:11:44 +03:00
Joonas Koivunen	f4db85de40	Continued startup speedup (#4372 ) Startup continues to be slow, work towards to alleviate it. Summary of changes: - pretty the functional improvements from #4366 into `utils::completion::{Completion, Barrier}` - extend "initial load completion" usage up to tenant background tasks - previously only global background tasks - spawn_blocking the tenant load directory traversal - demote some logging - remove some unwraps - propagate some spans to `spawn_blocking` Runtime effects should be major speedup to loading, but after that, the `BACKGROUND_RUNTIME` will be blocked for a long time (minutes). Possible follow-ups: - complete initial tenant sizes before allowing background tasks to block the `BACKGROUND_RUNTIME`	2023-05-30 16:25:07 +03:00
Arthur Petukhovsky	210be6b6ab	Replace broker duration logs with metrics (#4370 ) I've added logs for broker push duration after every iteration in https://github.com/neondatabase/neon/pull/4142. This log has not found any real issues, so we can replace it with metrics, to slightly reduce log volume. LogQL query found that pushes longer that 500ms happened only 90 times for the last month. https://neonprod.grafana.net/goto/KTNj9UwVg?orgId=1 `{unit="safekeeper.service"} \|= "timeline updates to broker in" \| regexp "to broker in (?P<duration>.*)" \| duration > 500ms`	2023-05-30 16:08:02 +03:00
Alexander Bayandin	daa79b150f	Code Coverage: store lcov report (#4358 ) ## Problem In the future, we want to compare code coverage on a PR with coverage on the main branch. Currently, we store only code coverage HTML reports, I suggest we start storing reports in "lcov info" format that we can use/parse in the future. Currently, the file size is ~7Mb (it's a text-based format and could be compressed into a ~400Kb archive) - More about "lcov info" format: https://manpages.ubuntu.com/manpages/jammy/man1/geninfo.1.html#files - Part of https://github.com/neondatabase/neon/issues/3543 ## Summary of changes - Change `scripts/coverage` to output lcov coverage to `report/lcov.info` file instead of stdout (we already upload the whole `report/` directory to S3)	2023-05-30 14:05:41 +01:00
Joonas Koivunen	db14355367	revert: static global init logical size limiter (#4368 ) added in #4366. revert for testing without it; it may have unintended side-effects, and it's very difficult to understand the results from the 10k load testing environments. earlier results: https://github.com/neondatabase/neon/pull/4366#issuecomment-1567491064	2023-05-30 10:40:37 +03:00
Joonas Koivunen	cb83495744	try: startup speedup (#4366 ) Startup can take a long time. We suspect it's the initial logical size calculations. Long term solution is to not block the tokio executors but do most of I/O in spawn_blocking. See: #4025, #4183 Short-term solution to above: - Delay global background tasks until initial tenant loads complete - Just limit how many init logical size calculations can we have at the same time to `cores / 2` This PR is for trying in staging.	2023-05-29 21:48:38 +03:00
Christian Schwarz	f4f300732a	refactor TenantState transitions (#4321 ) This is preliminary work for/from #4220 (async `Layer::get_value_reconstruct_data`). The motivation is to avoid locking `Tenant::timelines` in places that can't be `async`, because in #4333 we want to convert Tenant::timelines from `std::sync::Mutex` to `tokio::sync::Mutex`. But, the changes here are useful in general because they clean up & document tenant state transitions. That also paves the way for #4350, which is an alternative to #4333 that refactors the pageserver code so that we can keep the `Tenant::timelines` mutex sync. This patch consists of the following core insights and changes: * spawn_load and spawn_attach own the tenant state until they're done * once load()/attach() calls are done ... * if they failed, transition them to Broken directly (we know that there's no background activity because we didn't call activate yet) * if they succeed, call activate. We can make it infallible. How? Later. * set_broken() and set_stopping() are changed to wait for spawn_load() / spawn_attach() to finish. * This sounds scary because it might hinder detach or shutdown, but actually, concurrent attach+detach, or attach+shutdown, or load+shutdown, or attach+shutdown were just racy before this PR. So, with this change, they're not anymore. In the future, we can add a `CancellationToken` stored in Tenant to cancel `load` and `attach` faster, i.e., make `spawn_load` / `spawn_attach` transition them to Broken state sooner. See the doc comments on TenantState for the state transitions that are now possible. It might seem scary, but actually, this patch reduces the possible state transitions. We introduce a new state `TenantState::Activating` to avoid grabbing the `Tenant::timelines` lock inside the `send_modify` closure. 
These were the humble beginnings of this PR (see Motivation section), and I think it's still the right thing to have this `Activating` state, even if we decide against async `Tenant::timelines` mutex. The reason is that `send_modify` locks internally, and by moving locking of Tenant::timelines out of the closure, the internal locking of `send_modify` becomes a leaf of the lock graph, and so, we eliminate deadlock risk. Co-authored-by: Joonas Koivunen <joonas@neon.tech>	2023-05-29 17:52:41 +03:00
Em Sharnoff	ccf653c1f4	re-enable file cache integration for VM compute node (#4338 ) #4155 inadvertently switched to a version of the VM builder that leaves the file cache integration disabled by default. This re-enables the vm-informant's file cache integration. (as a refresher: The vm-informant is the autoscaling component that sits inside the VM and manages postgres / compute_ctl) See also: https://github.com/neondatabase/autoscaling/pull/265	2023-05-28 10:22:45 -07:00
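
The array rule described in commit c82d19d8d6 above (no quotes -> null, quoted -> "null" string) can be illustrated with a small sketch. This is not the proxy's actual code: `parse_flat_pg_array` and `finish_entry` are hypothetical names, and the sketch assumes a flat, one-dimensional Postgres text array literal with no nesting. It also clears the current entry every time an element is pushed, as the commit message describes.

```rust
/// Parse a flat Postgres text array literal such as `{a,NULL,"NULL"}`.
/// Hypothetical helper for illustration; nested arrays are not handled.
fn parse_flat_pg_array(input: &str) -> Result<Vec<Option<String>>, String> {
    let inner = input
        .strip_prefix('{')
        .and_then(|s| s.strip_suffix('}'))
        .ok_or_else(|| "array literal must be wrapped in braces".to_string())?;

    let mut out = Vec::new();
    if inner.is_empty() {
        return Ok(out);
    }

    let mut entry = String::new();
    let mut in_quotes = false; // currently inside a "..." section
    let mut was_quoted = false; // the current element contained quotes
    let mut chars = inner.chars();

    while let Some(c) = chars.next() {
        match c {
            // A backslash makes the next character literal.
            '\\' => {
                let escaped = chars.next().ok_or("dangling backslash")?;
                entry.push(escaped);
            }
            '"' => {
                in_quotes = !in_quotes;
                was_quoted = true;
            }
            // An unquoted comma ends the current element.
            ',' if !in_quotes => out.push(finish_entry(&mut entry, &mut was_quoted)),
            other => entry.push(other),
        }
    }
    if in_quotes {
        return Err("unterminated quote".to_string());
    }
    out.push(finish_entry(&mut entry, &mut was_quoted));
    Ok(out)
}

/// Take the accumulated element, resetting both the buffer and the quote flag.
/// Only an unquoted NULL becomes `None`; a quoted "NULL" stays a string.
fn finish_entry(entry: &mut String, was_quoted: &mut bool) -> Option<String> {
    let text = std::mem::take(entry);
    let quoted = std::mem::replace(was_quoted, false);
    if !quoted && text.eq_ignore_ascii_case("null") {
        None
    } else {
        Some(text)
    }
}
```

Tracking `was_quoted` per element is what lets the parser tell `NULL` from `"NULL"` without comparing against the string contents alone.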


3276 Commits
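
The startup ordering in commit 5761190e0d (background tasks wait on `utils::completion::Barrier`s and are released after a timeout or once the initial work completes) can be sketched with the standard library alone. This is only an illustration under stated assumptions: the pageserver code is async, while this sketch is blocking, and `completion_pair`, `wait_timeout`, and `Released` are hypothetical names rather than the real `utils::completion` API. The idea is that the barrier opens once every `Completion` clone has been dropped.

```rust
use std::sync::mpsc::{channel, Receiver, RecvTimeoutError, Sender};
use std::time::Duration;

/// Held by each piece of startup work; dropping it signals "done".
#[derive(Clone)]
pub struct Completion {
    _tx: Sender<()>, // never sent on; only its drop is observed
}

/// Held by background tasks that must not start too early.
pub struct Barrier {
    rx: Receiver<()>,
}

/// Why the barrier let the waiter through.
#[derive(Debug, PartialEq, Eq)]
pub enum Released {
    Completed,
    TimedOut,
}

/// Create a linked pair (hypothetical constructor for this sketch).
pub fn completion_pair() -> (Completion, Barrier) {
    let (tx, rx) = channel();
    (Completion { _tx: tx }, Barrier { rx })
}

impl Barrier {
    /// Block until all `Completion` clones are dropped or `timeout` elapses.
    pub fn wait_timeout(&self, timeout: Duration) -> Released {
        match self.rx.recv_timeout(timeout) {
            // Every sender is gone: all startup work has finished.
            Err(RecvTimeoutError::Disconnected) => Released::Completed,
            // Safeguard path: some work is still running, proceed anyway.
            Err(RecvTimeoutError::Timeout) => Released::TimedOut,
            Ok(()) => unreachable!("completions never send a value"),
        }
    }
}
```

A caller would clone one `Completion` per timeline, drop each clone when its initial logical size calculation finishes, and have background tasks call `wait_timeout` with the configured limit (10s by default, per the commit message) before starting.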