Comparing 4e547e6274...daf8edd986 - neon

rust/neon

mirror of https://github.com/neondatabase/neon.git synced 2026-03-16 14:50:37 +00:00

Author	SHA1	Message	Date
Christian Schwarz	daf8edd986	Merge pull request #8468 from neondatabase/rc/2024-07-23-manual Storage release 2024-07-23 We did not deploy yesterday's * https://github.com/neondatabase/neon/pull/8451 because of CICD troubles with pre-prod. Also, it was missing * https://github.com/neondatabase/neon/pull/7766 which is low-risk and unblocks more cleanup work that would otherwise have to wait until after next week's release. So, this PR cherry-picks #7766 and creates a new storage release. Compute will release separately later this week. Back pointer to Slack thread: https://neondb.slack.com/archives/C03H1K0PGKH/p1721650191019099	2024-07-24 12:02:14 +02:00
Vlad Lazar	a1272b6ed8	pageserver: use identity file as node id authority and remove init command and config-override flags (#7766 ) Ansible will soon write the node id to `identity.toml` in the work dir for new pageservers. On the pageserver side, we read the node id from the identity file if it is present and use that as the source of truth. If the identity file is missing, cannot be read, or does not deserialise, start-up is aborted. This PR also removes the `--init` mode and the `--config-override` flag from the `pageserver` binary. The neon_local is already not using these flags anymore. Ansible still uses them until the linked change is merged & deployed, so, this PR has to land simultaneously or after the Ansible change due to that. Related Ansible change: https://github.com/neondatabase/aws/pull/1322 Cplane change to remove config-override usages: https://github.com/neondatabase/cloud/pull/13417 Closes: https://github.com/neondatabase/neon/issues/7736 Overall plan: https://www.notion.so/neondatabase/Rollout-Plan-simplified-pageserver-initialization-f935ae02b225444e8a41130b7d34e4ea?pvs=4 Co-authored-by: Christian Schwarz <christian@neon.tech>	2024-07-23 12:55:46 +02:00
Christian Schwarz	28ee7cdede	Merge pull request #8451 from neondatabase/rc/2024-07-22 ## Storage & Compute release 2024-07-22 This PR has so many commits because the release branch diverged from `main`. Details: https://neondb.slack.com/archives/C033A2WE6BZ/p1721650938949059?thread_ts=1721308848.034069&cid=C033A2WE6BZ The commits that are truly new since the last storage release are the `main` commits which I cherry-picked using this command ``` git cherry-pick 8a8b83df27383a07bb7dbba519325c15d2f46357..4e547e6 ```	2024-07-22 19:17:01 +02:00
Christian Schwarz	7b63092958	Merge commit '4e547e6' into rc/2024-07-22 See https://neondb.slack.com/archives/C033A2WE6BZ/p1721650938949059?thread_ts=1721308848.034069&cid=C033A2WE6BZ	2024-07-22 14:40:55 +02:00
Arpad Müller	31bfeaf934	Use DefaultCredentialsChain AWS authentication in remote_storage (#8440 ) PR #8299 has switched the storage scrubber to use `DefaultCredentialsChain`. Now we do this for `remote_storage`, as it allows us to use `remote_storage` from inside kubernetes. Most of the diff is due to `GenericRemoteStorage::from_config` becoming `async fn`.	2024-07-22 14:36:56 +02:00
Arpad Müller	21b3a191bf	Add archival_config endpoint to pageserver (#8414 ) This adds an archival_config endpoint to the pageserver. Currently it has no effect, and always "works", but later the intent is that it will make a timeline archived/unarchived. - [x] add yml spec - [x] add endpoint handler Part of https://github.com/neondatabase/neon/issues/8088	2024-07-22 14:36:56 +02:00
Shinya Kato	f7f9b4aaec	Fix openapi specification (#8273 ) ## Problem There are some swagger errors in `pageserver/src/http/openapi_spec.yml` ``` Error 431 15000 Object includes not allowed fields Error 569 3100401 should always have a 'required' Error 569 15000 Object includes not allowed fields Error 1111 10037 properties members must be schemas ``` ## Summary of changes Fixed the above errors.	2024-07-22 14:36:56 +02:00
John Spray	bba062e262	tests: longer timeouts in test_timeline_deletion_with_files_stuck_in_upload_queue (#8438 ) ## Problem This test had two locations with 2 second timeouts, which is rather low when we run on a highly contended test machine running lots of tests in parallel. It usually passes, but today I've seen both of these locations time out on separate PRs. Example failure: https://neon-github-public-dev.s3.amazonaws.com/reports/pr-8432/10007868041/index.html#suites/837740b64a53e769572c4ed7b7a7eeeb/6c6a092be083d27c ## Summary of changes - Change 2 second timeouts to 20 second timeouts	2024-07-22 14:36:56 +02:00
Shinya Kato	067363fe95	safekeeper: remove unused safekeeper runtimes (#8433 ) There are unused safekeeper runtimes `WAL_REMOVER_RUNTIME` and `METRICS_SHIFTER_RUNTIME`. `WAL_REMOVER_RUNTIME` was implemented in [#4119](https://github.com/neondatabase/neon/pull/4119) and removed in [#7887](https://github.com/neondatabase/neon/pull/7887). `METRICS_SHIFTER_RUNTIME` was also implemented in [#4119](https://github.com/neondatabase/neon/pull/4119) but has never been used. I removed unused safekeeper runtimes `WAL_REMOVER_RUNTIME` and `METRICS_SHIFTER_RUNTIME`.	2024-07-22 14:36:56 +02:00
John Spray	affe408433	storage scrubber: GC ancestor shard layers (#8196 ) ## Problem After a shard split, the pageserver leaves the ancestor shard's content in place. It may be referenced by child shards, but eventually child shards will de-reference most ancestor layers as they write their own data and do GC. We would like to eventually clean up those ancestor layers to reclaim space. ## Summary of changes - Extend the physical GC command with `--mode=full`, which includes cleaning up unreferenced ancestor shard layers - Add test `test_scrubber_physical_gc_ancestors` - Remove colored log output: in testing this is irritating ANSI code spam in logs, and in interactive use doesn't add much. - Refactor storage controller API client code out of storcon_client into a `storage_controller/client` crate - During physical GC of ancestors, call into the storage controller to check that the latest shards seen in S3 reflect the latest state of the tenant, and there is no shard split in progress.	2024-07-22 14:36:56 +02:00
Christian Schwarz	9b883e4651	pageserver: remove obsolete cached_metric_collection_interval (#8370 ) We're removing the usage of this long-meaningless config field in https://github.com/neondatabase/aws/pull/1599 Once that PR has been deployed to staging and prod, we can merge this PR.	2024-07-22 14:36:56 +02:00
Peter Bendel	b98b301d56	Bodobolero/fix root permissions (#8429 ) ## Problem My prior PR https://github.com/neondatabase/neon/pull/8422 caused leftovers in the GitHub action runner work directory with root permission. As an example see here https://github.com/neondatabase/neon/actions/runs/10001857641/job/27646237324#step:3:37 To work around this, we install vanilla postgres as non-root using deb packages in the /home/nonroot user directory ## Summary of changes - since we cannot use root we install the deb pkgs directly and create symbolic links for psql, pgbench and libs in expected places - continue jobs on aws even if azure jobs fail (because this region is currently unreliable)	2024-07-22 14:36:56 +02:00
Arpad Müller	ed7ee73cba	Enable zstd in tests (#8368 ) Successor of #8288 , just enable zstd in tests. Also adds a test that creates easily compressible data. Part of #5431 --------- Co-authored-by: John Spray <john@neon.tech> Co-authored-by: Joonas Koivunen <joonas@neon.tech>	2024-07-22 14:36:56 +02:00
Arthur Petukhovsky	fceace835b	Change log level for GuardDrop error (#8305 ) The error means that the manager exited earlier than `ResidenceGuard`, which is not unexpected with the current deletion implementation. This commit changes the log level to reduce noise.	2024-07-22 14:36:56 +02:00
Peter Bendel	1b508a6082	Temporarily use vanilla pgbench and psql (client) for running pgvector benchmark (#8422 ) ## Problem https://github.com/neondatabase/neon/issues/8275 is not yet fixed Periodic benchmarking fails with SIGABRT in pgvector step, see https://github.com/neondatabase/neon/actions/runs/9967453263/job/27541159738#step:7:393 ## Summary of changes Instead of using pgbench and psql from Neon artifacts, download vanilla postgres binaries into the container and use those to run the client side of the test.	2024-07-22 14:36:56 +02:00
Alex Chi Z.	f87b031876	pageserver: integrate k-merge with bottom-most compaction (#8415 ) Use the k-merge iterator in the compaction process to reduce memory footprint. part of https://github.com/neondatabase/neon/issues/8002 ## Summary of changes * refactor the bottom-most compaction code to use k-merge iterator * add Send bound on some structs as it is used across the await points --------- Signed-off-by: Alex Chi Z <chi@neon.tech> Co-authored-by: Arpad Müller <arpad-m@users.noreply.github.com>	2024-07-22 14:36:56 +02:00
Arthur Petukhovsky	9f1ba2c4bf	Fix partial upload bug with invalid remote state (#8383 ) We have an issue where some partially uploaded segments can actually be missing in remote storage. I found this issue when I was looking at the logs in staging, and it can be triggered by failed uploads: 1. Code tries to upload `SEG_TERM_LSN_LSN_sk5.partial`, but receives error from S3 2. The failed attempt is saved to `segments` vec 3. After some time, the code tries to upload `SEG_TERM_LSN_LSN_sk5.partial` again 4. This time the upload is successful and code calls `gc()` to delete previous uploads 5. Since new object and old object share the same name, uploaded data gets deleted from remote storage This commit fixes the issue by patching `gc()` not to delete objects with the same name as currently uploaded. --------- Co-authored-by: Arpad Müller <arpad-m@users.noreply.github.com>	2024-07-22 14:36:56 +02:00
John Spray	9868bb3346	tests: turn on safekeeper eviction by default (#8352 ) ## Problem Ahead of enabling eviction in the field, where it will become the normal/default mode, let's enable it by default throughout our tests in case any issues become visible there. ## Summary of changes - Make default `extra_opts` for safekeepers enable offload & deletion - Set low timeouts in `extra_opts` so that tests running for tens of seconds have a chance to hit some of these background operations.	2024-07-22 14:36:56 +02:00
John Spray	27da0e9cf5	tests: increase test_pg_regress and test_isolation timeouts (#8418 ) ## Problem These tests time out ~1 in 50 runs when in debug mode. There is no indication of a real issue: they're just wrappers that have large numbers of individual tests contained within one pytest case. ## Summary of changes - Bump pg_regress timeout from 600 to 900s - Bump test_isolation timeout from 300s (default) to 600s In future it would be nice to break out these tests to run individual cases (or batches thereof) as separate tests, rather than this monolith.	2024-07-22 14:36:56 +02:00
John Spray	de9bf2af6c	tests: fix metrics check in test_s3_eviction (#8419 ) ## Problem This test would occasionally fail its metric check. This could happen in the rare case that the nodes had all been restarted before their most recent eviction. The metric check was added in https://github.com/neondatabase/neon/pull/8348 ## Summary of changes - Check metrics before each restart, accumulate into a bool that we assert on at the end of the test	2024-07-22 14:36:56 +02:00
Christian Schwarz	3d2c2ce139	NeonEnv.from_repo_dir: use storage_controller_db instead of `attachments.json` (#8382 ) When `NeonEnv.from_repo_dir` was introduced, storage controller stored its state exclusively `attachments.json`. Since then, it has moved to using Postgres, which stores its state in `storage_controller_db`. But `NeonEnv.from_repo_dir` wasn't adjusted to do this. This PR rectifies the situation. Context for this is failures in `test_pageserver_characterize_throughput_with_n_tenants` CF: https://neondb.slack.com/archives/C033RQ5SPDH/p1721035799502239?thread_ts=1720901332.293769&cid=C033RQ5SPDH Notably, `from_repo_dir` is also used by the backwards- and forwards-compatibility. Thus, the changes in this PR affect those tests as well. However, it turns out that the compatibility snapshot already contains the `storage_controller_db`. Thus, it should just work and in fact we can remove hacks like `fixup_storage_controller`. Follow-ups created as part of this work: * https://github.com/neondatabase/neon/issues/8399 * https://github.com/neondatabase/neon/issues/8400	2024-07-22 14:36:56 +02:00
dotdister	82a2081d61	Fix comment in Control Plane (#8406 ) ## Problem There is something wrong in the comments of `control_plane/src/broker.rs` and `control_plane/src/pageserver.rs` ## Summary of changes Fixed the comments about component names and their data paths in `control_plane/src/broker.rs` and `control_plane/src/pageserver.rs`.	2024-07-22 14:36:56 +02:00
Joonas Koivunen	ff174a88c0	test: allow requests to any pageserver get cancelled (#8413 ) Fix flakiness on `test_sharded_timeline_detach_ancestor` which does not reproduce on a fast enough runner by allowing cancelled requests before completing on all pageservers. It was only allowed on half of the pageservers. Failure evidence: https://neon-github-public-dev.s3.amazonaws.com/reports/pr-8352/9972357040/index.html#suites/a1c2be32556270764423c495fad75d47/7cca3e3d94fe12f2	2024-07-22 14:36:56 +02:00
John Spray	ef3ebfaf67	pageserver: layer count & size metrics (#8410 ) ## Problem We lack insight into: - How much of a tenant's physical size is image vs. delta layers - Average sizes of image vs. delta layers - Total layer counts per timeline, indicating size of index_part object As well as general observability love, this is motivated by https://github.com/neondatabase/neon/issues/6738, where we need to define some sensible thresholds for storage amplification, and using total physical size may not work well (if someone does a lot of DROPs then it's legitimate for the physical-synthetic ratio to be huge), but the ratio between image layer size and delta layer size may be a better indicator of whether we're generating unreasonable quantities of image layers. ## Summary of changes - Add pageserver_layer_bytes and pageserver_layer_count metrics, labelled by timeline and `kind` (delta or image) - Add & subtract these with LayerInner's lifetime. I'm intentionally avoiding using a generic metric RAII guard object, to avoid bloating LayerInner: it already has all the information it needs to update metric on new+drop.	2024-07-22 14:36:56 +02:00
Yuchen Liang	ae1af558b4	docs: update storage controller db name in doc (#8411 ) The db name was renamed to storage_controller from attachment_service. Doc was stale.	2024-07-22 14:36:56 +02:00
John Spray	c150ad4ee2	tests: add test_compaction_l0_memory (#8403 ) This test reproduces the case of a writer creating a deep stack of L0 layers. It uses realistic layer sizes and writes several gigabytes of data, therefore runs as a performance test although it is validating memory footprint rather than performance per se. It acts a regression test for two recent fixes: - https://github.com/neondatabase/neon/pull/8401 - https://github.com/neondatabase/neon/pull/8391 In future it will demonstrate the larger improvement of using a k-merge iterator for L0 compaction (#8184) This test can be extended to enforce limits on the memory consumption of other housekeeping steps, by restarting the pageserver and then running other things to do the same "how much did RSS increase" measurement.	2024-07-22 14:36:56 +02:00
Alex Chi Z.	a98ccd185b	test(pageserver): more k-merge tests on duplicated keys (#8404 ) Existing tenants and some selection of layers might produce duplicated keys. Add tests to ensure the k-merge iterator handles it correctly. We also enforced ordering of the k-merge iterator to put images before deltas. part of https://github.com/neondatabase/neon/issues/8002 --------- Signed-off-by: Alex Chi Z <chi@neon.tech> Co-authored-by: Arpad Müller <arpad-m@users.noreply.github.com>	2024-07-22 14:36:56 +02:00
Peter Bendel	9f796ebba9	Bodobolero/pgbench compare azure (#8409 ) ## Problem We want to run performance tests on all supported cloud providers. We want to run most tests on the postgres version which is default for new projects in production, currently (July 24) this is postgres version 16 ## Summary of changes - change default postgres version for some (performance) tests to 16 (which is our default for new projects in prod anyhow) - add azure region to pgbench_compare jobs - add azure region to pgvector benchmarking jobs - re-used project `weathered-snowflake-88107345` was prepared with 1 million embeddings running on 7 minCU 7 maxCU in azure region to compare with AWS region (pgvector indexing and hnsw queries) - see job pgbench-pgvector - Note we now have 11 environment combinations where we run pgbench-compare and 5 are for k8s-pod (deprecated) which we can remove in the future once auto-scaling team approves. ## Logs A current run with the changes from this pull request is running here https://github.com/neondatabase/neon/actions/runs/9972096222 Note that we currently expect some failures due to - https://github.com/neondatabase/neon/issues/8275 - instability of projects on azure region	2024-07-22 14:36:56 +02:00
John Spray	d51ca338c4	docs/rfcs: timeline ancestor detach API (#6888 ) ## Problem When a tenant creates a new timeline that they will treat as their 'main' history, it is awkward to permanently retain an 'old main' timeline as its ancestor. Currently this is necessary because it is forbidden to delete a timeline which has descendents. ## Summary of changes A new pageserver API is proposed to 'adopt' data from a parent timeline into one of its children, such that the link between ancestor and child can be severed, leaving the parent in a state where it may then be deleted. --------- Co-authored-by: Joonas Koivunen <joonas@neon.tech>	2024-07-22 14:36:56 +02:00
John Spray	07e78102bf	pageserver: reduce size of delta layer ValueRef (#8401 ) ## Problem ValueRef is an unnecessarily large structure, because it carries a cursor. L0 compaction currently instantiates gigabytes of these under some circumstances. ## Summary of changes - Carry a ref to the parent layer instead of a cursor, and construct a cursor on demand. This reduces RSS high watermark during L0 compaction by about 20%.	2024-07-22 14:36:56 +02:00
John Spray	b21e131d11	pageserver: exclude un-read layers from short residence statistic (#8396 ) ## Problem The `evictions_with_low_residence_duration` is used as an indicator of cache thrashing. However, there are situations where it is quite legitimate to only have a short residence during compaction, where a delta is downloaded, used to generate an image layer, and then discarded. This can lead to false positive alerts. ## Summary of changes - Only track low residence duration for layers that have been accessed at least once (compaction doesn't count as an access). This will give us a metric that indicates thrashing on layers that the _user_ is using, rather than those we're downloading for housekeeping purposes. Once we add "layer visibility" as an explicit property of layers, this can also be used as a cleaner condition (residence of non-visible layers should never be alertable)	2024-07-22 14:36:56 +02:00
Alex Chi Z.	abe3b4e005	fix(pageserver): limit num of delta layers for l0 compaction (#8391 ) ## Problem close https://github.com/neondatabase/neon/issues/8389 ## Summary of changes A quick mitigation for tenants with fast writes. We compact at most 60 delta layers at a time, expecting a memory footprint of 15GB. We will pick the oldest 60 L0 layers. This should be a relatively safe change so no test is added. Question is whether to make this parameter configurable via tenant config. --------- Signed-off-by: Alex Chi Z <chi@neon.tech> Co-authored-by: John Spray <john@neon.tech>	2024-07-22 14:36:56 +02:00
Tristan Partin	18e7c2b7a1	Add some typing to Endpoint.respec()	2024-07-22 14:36:56 +02:00
Tristan Partin	ad5d784fb7	Hide import behind TYPE_CHECKING	2024-07-22 14:36:56 +02:00
Tristan Partin	85d47637ee	Run each migration in its own transaction Previously, every migration was run in the same transaction. This is preparatory work for fixing CVE-2024-4317.	2024-07-22 14:36:56 +02:00
Tristan Partin	7e818ee390	Rename compute migrations to start at 1 This matches what we put into the neon_migration.migration_id table.	2024-07-22 14:36:56 +02:00
John Spray	bff505426e	pageserver: clean up GcCutoffs names (#8379 ) - `horizon` is a confusing term, it's not at all obvious that this means space-based retention limit, rather than the total GC history limit. Rename to `GcCutoffs::space`. - `pitr` is less confusing, but still an unnecessary level of indirection from what we really mean: a time-based condition. The fact that we use that time-history for Point In Time Recovery doesn't mean we have to refer to time as "pitr" everywhere. Rename to `GcCutoffs::time`.	2024-07-22 14:36:56 +02:00
dependabot[bot]	bf7de92dc2	build(deps): bump setuptools from 65.5.1 to 70.0.0 (#8387 ) Bumps [setuptools](https://github.com/pypa/setuptools) from 65.5.1 to 70.0.0. Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> Co-authored-by: a-masterov <72613290+a-masterov@users.noreply.github.com>	2024-07-22 14:36:56 +02:00
Arpad Müller	9dc71f5a88	Avoid the storage controller in test_tenant_creation_fails (#8392 ) As described in #8385, the likely source for flakiness in test_tenant_creation_fails is the following sequence of events: 1. test instructs the storage controller to create the tenant 2. storage controller adds the tenant and persists it to the database. issues a creation request 3. the pageserver restarts with the failpoint disabled 4. storage controller's background reconciliation still wants to create the tenant 5. pageserver gets new request to create the tenant from background reconciliation This commit just avoids the storage controller entirely. It has its own set of issues, as the re-attach request will obviously not include the tenant, but it's still useful to test for non-existence of the tenant. The generation is also not optional any more during tenant attachment. If you omit it, the pageserver yields an error. We change the signature of `tenant_attach` to reflect that. Alternative to #8385 Fixes #8266	2024-07-22 14:36:56 +02:00
Anastasia Lubennikova	2ede9d7a25	Compute: add compatibility patch for rum Fixes #8251	2024-07-22 14:36:56 +02:00
John Spray	ea5460843c	pageserver: un-Arc Timeline::layers (#8386 ) ## Problem This structure was in an Arc<> unnecessarily, making it harder to reason about its lifetime (i.e. it was superficially possible for LayerManager to outlive timeline, even though no code used it that way) ## Summary of changes - Remove the Arc<>	2024-07-22 14:36:56 +02:00
Arpad Müller	5b16624bcc	Allow the new clippy::doc_lazy_continuation lint (#8388 ) The `doc_lazy_continuation` lint of clippy is still unknown on latest rust stable. Fixes fall-out from #8151.	2024-07-22 14:36:56 +02:00
Sasha Krassovsky	349373cb11	Allow reusing projects between runs of logical replication benchmarks (#8393 )	2024-07-22 14:36:56 +02:00
Joonas Koivunen	957f99cad5	feat(timeline_detach_ancestor): success idempotency (#8354 ) Right now timeline detach ancestor reports an error (409, "no ancestor") on a new attempt after successful completion. This makes it troublesome for storage controller retries. Fix it to respond with `200 OK` as if the operation had just completed quickly. Additionally, the returned timeline identifiers in the 200 OK response are now ordered so that responses between different nodes for error comparison are done by the storage controller added in #8353. Design-wise, this PR introduces a new strategy for accessing the latest uploaded IndexPart: `RemoteTimelineClient::initialized_upload_queue(&self) -> Result<UploadQueueAccessor<'_>, NotInitialized>`. It should be a more scalable way to query the latest uploaded `IndexPart` than to add a query method for each question directly on `RemoteTimelineClient`. GC blocking will need to be introduced to make the operation fully idempotent. However, it is idempotent for the cases demonstrated by tests. Cc: #6994	2024-07-22 14:36:56 +02:00
John Spray	2a3a136474	pageserver: use PITR GC cutoffs as authoritative (#8365 ) ## Problem Pageserver GC uses a size-based condition (GC "horizon" in addition to time-based "PITR"). Eventually we plan to retire the size-based condition: https://github.com/neondatabase/neon/issues/6374 Currently, we always apply the more conservative of the two, meaning that tenants always retain at least 64MB of history (default horizon), even after a very long time has passed. This is particularly acute in cases where someone has dropped tables/databases, and then leaves a database idle: the horizon can prevent GCing very large quantities of historical data (we already account for this in synthetic size by ignoring gc horizon). We're not entirely removing GC horizon right now because we don't want to 100% rely on standby_horizon for robustness of physical replication, but we can tweak our logic to avoid retaining that 64MB LSN length indefinitely. ## Summary of changes - Rework `Timeline::find_gc_cutoffs`, with new logic: - If there is no PITR set, then use `DEFAULT_PITR_INTERVAL` (1 week) to calculate a time threshold. Retain either the horizon or up to that threshold, whichever requires less data. - When there is a PITR set, and we have unambiguously resolved the timestamp to an LSN, then ignore the GC horizon entirely. For typical PITRs (1 day, 1 week), this will still easily retain enough data to avoid stressing read only replicas. The key property we end up with, whether a PITR is set or not, is that after enough time has passed, our GC cutoff on an idle timeline will catch up with the last_record_lsn. Using `DEFAULT_PITR_INTERVAL` is a bit of an arbitrary hack, but this feels like it isn't really worth the noise of exposing in TenantConfig. We could just make it a different named constant though. The end-end state will be that there is no gc_horizon at all, and that tenants with pitr_interval=0 would truly retain no history, so this constant would go away.	2024-07-22 14:36:56 +02:00
Joonas Koivunen	cfaf30f5e8	feat(storcon): timeline detach ancestor passthrough (#8353 ) Currently storage controller does not support forwarding timeline detach ancestor requests to pageservers. Add support for forwarding `PUT .../:tenant_id/timelines/:timeline_id/detach_ancestor`. Implement the support mostly as is, because the timeline detach ancestor will be made (mostly) idempotent in future PR. Cc: #6994	2024-07-22 14:36:56 +02:00
Christian Schwarz	72c2d0812e	remove page_service `show <tenant_id>` (#8372 ) This operation isn't used in practice, so let's remove it. Context: in https://github.com/neondatabase/neon/pull/8339	2024-07-22 14:36:56 +02:00
Arseny Sher	537ecf45f8	Fix test_timeline_copy flakiness. fixes https://github.com/neondatabase/neon/issues/8355	2024-07-22 14:31:12 +02:00
Luca Bruno	1637a6ee05	proxy/http: switch to typed_json (#8377 ) ## Summary of changes This switches JSON rendering logic to `typed_json` in order to reduce the number of allocations in the HTTP responder path. Followup from https://github.com/neondatabase/neon/pull/8319#issuecomment-2216991760. --------- Co-authored-by: Conrad Ludgate <conradludgate@gmail.com>	2024-07-22 14:30:53 +02:00
Alex Chi Z	d74fb7b879	Merge pull request #8374 from neondatabase/rc/2024-07-15 Storage & Compute release 2024-07-15	2024-07-15 11:02:18 -04:00
Konstantin Knizhnik	7973c3e941	Add neon.running_xacts_overflow_policy to make it possible for RO replica to startup without primary even in case running xacts overflow (#8323 ) ## Problem Right now, if there are too many running xacts to be restored from CLOG at replica startup, the replica does not try to restore them and waits for a non-overflown running-xacts WAL record from the primary. But if the primary is not active, then the replica will not start at all. Too many running xacts can be caused by transactions with a large number of subtransactions. But right now it can also be caused by two other reasons: - Lack of shutdown checkpoint which updates `oldestRunningXid` (because of immediate shutdown) - nextXid alignment on 1024 boundary (which causes losing ~1k XIDs on each restart) Both problems are somehow addressed now. But we have existing customers with "sparse" CLOG and lack of checkpoints. To be able to start RO replicas for such customers I suggest adding a GUC which allows the replica to start even in case of subxacts overflow. ## Summary of changes Add `neon.running_xacts_overflow_policy` with the following values: - ignore: restore from CLOG last N XIDs and accept connections - skip: do not restore any XIDs from CLOG but still accept connections - wait: wait for a non-overflown running xacts record from the primary node ## Checklist before requesting a review - [ ] I have performed a self-review of my code. - [ ] If it is a core feature, I have added thorough tests. - [ ] Do we need to implement analytics? if so did you add the relevant metrics to the dashboard? - [ ] If this PR requires public announcement, mark it with /release-notes label and add several sentences in this section. ## Checklist before merging - [ ] Do not forget to reformat commit message to not include the above checklist --------- Co-authored-by: Konstantin Knizhnik <knizhnik@neon.tech>	2024-07-15 09:34:35 -04:00
Vlad Lazar	085bbaf5f8	tests: allow list breaching min resident size in statvfs test (#8358 ) ## Problem This test would sometimes violate the min resident size during disk eviction and fail due to the generated warning log. Disk usage candidate collection only takes into account active tenants. However, the statvfs call takes into account the entire tenants directory, which includes tenants which haven't become active yet. After re-starting the pageserver, disk usage eviction may kick in before both tenants have become active. Hence, the logic will try to satisfy the disk usage requirements by evicting everything belonging to the active tenant, and hence violating the tenant minimum resident size. ## Summary of changes Allow the warning	2024-07-15 09:28:35 -04:00
Alex Chi Z	85b5219861	fix(pageserver): unique test harness name for merge_in_between (#8366 ) As title, there should be a way to detect duplicated harness names in the future :( Signed-off-by: Alex Chi Z <chi@neon.tech>	2024-07-15 09:28:35 -04:00
Conrad Ludgate	7472c69954	Fix nightly warnings 2024 june (#8151 ) ## Problem new clippy warnings on nightly. ## Summary of changes broken up each commit by warning type. 1. Remove some unnecessary refs. 2. In edition 2024, inference will default to `!` and not `()`. 3. Clippy complains about doc comment indentation 4. Fix `Trait + ?Sized` where `Trait: Sized`. 5. diesel_derives triggering `non_local_definitions`	2024-07-15 09:28:35 -04:00
John Spray	3f8819827c	pageserver: circuit breaker on compaction (#8359 ) ## Problem We already back off on compaction retries, but the impact of a failing compaction can be so great that backing off up to 300s isn't enough. The impact is consuming a lot of I/O+CPU in the case of image layer generation for large tenants, and potentially also leaking disk space. Compaction failures are extremely rare and almost always indicate a bug, frequently a bug that will not let compaction to proceed until it is fixed. Related: https://github.com/neondatabase/neon/issues/6738 ## Summary of changes - Introduce a CircuitBreaker type - Add a circuit breaker for compaction, with a policy that after 5 failures, compaction will not be attempted again for 24 hours. - Add metrics that we can alert on: any >0 value for `pageserver_circuit_breaker_broken_total` should generate an alert. - Add a test that checks this works as intended. Couple notes to reviewers: - Circuit breakers are intrinsically a defense-in-depth measure: this is not the solution to any underlying issues, it is just a general mitigation for "unknown unknowns" that might be encountered in future. - This PR isn't primarily about writing a perfect CircuitBreaker type: the one in this PR is meant to be just enough to mitigate issues in compaction, and make it easy to monitor/alert on these failures. We can refine this type in future as/when we want to use it elsewhere.	2024-07-15 09:28:35 -04:00
Japin Li	c440756410	Remove fs2 dependency (#8350 ) The fs2 dependency is not needed anymore after commit `d42700280`.	2024-07-15 09:28:35 -04:00
Arpad Müller	0e600eb921	Implement decompression for vectored reads (#8302 ) Implement decompression of images for vectored reads. This doesn't implement support for still treating blobs as uncompressed with the bits we reserved for compression, as we have removed that functionality in #8300 anyways. Part of #5431	2024-07-15 09:28:35 -04:00
Arpad Müller	a1df835e28	Pass configured compression param to image generation (#8363 ) We need to pass on the configured compression param during image layer generation. This was an oversight of #8106, and the likely cause why #8288 didn't bring any interesting regressions. Part of https://github.com/neondatabase/neon/issues/5431	2024-07-15 09:28:35 -04:00
Sasha Krassovsky	119ddf6ccf	Grant execute on snapshot functions to neon_superuser (#8346 ) ## Problem I need `neon_superuser` to be allowed to create snapshots for replication tests ## Summary of changes Adds a migration that grants these functions to neon_superuser	2024-07-15 09:28:35 -04:00
Joonas Koivunen	90f447b79d	test: limit `test_layer_download_timeouted` to MOCK_S3 (#8331 ) Requests against REAL_S3 on CI can consistently take longer than 1s; testing the short timeouts against it made no sense in hindsight, as MOCK_S3 works just as well. evidence: https://neon-github-public-dev.s3.amazonaws.com/reports/pr-8229/9857994025/index.html#suites/b97efae3a617afb71cb8142f5afa5224/6828a50921660a32	2024-07-15 09:28:35 -04:00
Alex Chi Z	7dd71f4126	feat(pageserver): rewrite streaming vectored read planner (#8242 ) Rewrite streaming vectored read planner to be a separate struct. The API is designed to produce batches around `max_read_size` instead of exactly less than that so that `handle_XX` returns one batch at a time. --------- Signed-off-by: Alex Chi Z <chi@neon.tech>	2024-07-15 09:28:35 -04:00
Arseny Sher	8532d72276	Fix memory context of NeonWALReader allocation. Allocating it in short living context is wrong because it is reused during backend lifetime.	2024-07-15 09:28:35 -04:00
John Spray	d3ff47f572	storage controller: add node deletion API (#8226 ) ## Problem In anticipation of later adding a really nice drain+delete API, I initially only added an intentionally basic `/drop` API that is just about usable for deleting nodes in a pinch, but requires some ugly storage controller restarts to persuade it to restart secondaries. ## Summary of changes I started making a few tiny fixes, and ended up writing the delete API... - Quality of life nit: ordering of node + tenant listings in storcon_cli - Papercut: Fix the attach_hook using the wrong operation type for reporting slow locks - Make Service::spawn tolerate `generation_pageserver` columns that point to nonexistent node IDs. I started out thinking of this as a general resilience thing, but when implementing the delete API I realized it was actually a legitimate end state after the delete API is called (as that API doesn't wait for all reconciles to succeed). - Add a `DELETE` API for nodes, which does not gracefully drain, but does reschedule everything. This becomes safe to use when the system is in any state, but will incur availability gaps for any tenants that weren't already live-migrated away. If tenants have already been drained, this becomes a totally clean + safe way to decom a node. - Add a test and a storcon_cli wrapper for it This is meant to be a robust initial API that lets us remove nodes without doing ugly things like restarting the storage controller -- it's not quite a totally graceful node-draining routine yet. There's more work in https://github.com/neondatabase/neon/issues/8333 to get to our end-end state.	2024-07-15 09:28:35 -04:00
John Spray	8cc768254f	safekeeper: eviction metrics (#8348 ) ## Problem Follow up to https://github.com/neondatabase/neon/pull/8335, to improve observability of how many evict/restores we are doing. ## Summary of changes - Add `safekeeper_eviction_events_started_total` and `safekeeper_eviction_events_completed_total`, with a "kind" label of evict or restore. This gives us rates, and also ability to calculate how many are in progress. - Generalize SafekeeperMetrics test type to use the same helpers as pageserver, and enable querying any metric. - Read the new metrics at the end of the eviction test.	2024-07-15 09:28:35 -04:00
Vlad Lazar	5c80743c9c	storage_controller: fix ReconcilerWaiter::get_status (#8341 ) ## Problem SeqWait::would_wait_for returns Ok in the case when we would not wait for the sequence number and Err otherwise. ReconcilerWaiter::get_status uses it the wrong way around. This can cause the storage controller to go into a busy loop and make it look unavailable to the k8s controller. ## Summary of changes Use `SeqWait::would_wait_for` correctly.	2024-07-15 09:28:35 -04:00
Christian Schwarz	5bba3e3c75	pageserver: remove `trace_read_requests` (#8338 ) `trace_read_requests` is a per `Tenant`-object option. But the `handle_pagerequests` loop doesn't know which `Tenant` object (i.e., which shard) the request is for. The remaining use of the `Tenant` object is to check `tenant.cancel`. That check is incorrect [if the pageserver hosts multiple shards](https://github.com/neondatabase/neon/issues/7427#issuecomment-2220577518). I'll fix that in a future PR where I completely eliminate the holding of `Tenant/Timeline` objects across requests. See [my code RFC](https://github.com/neondatabase/neon/pull/8286) for the high level idea. Note that we can always bring the tracing functionality if we need it. But since it's actually about logging the `page_service` wire bytes, it should be a `page_service`-level config option, not per-Tenant. And for enabling tracing on a single connection, we can implement a `set pageserver_trace_connection;` option.	2024-07-15 09:28:35 -04:00
Peter Bendel	6caf702417	Run Performance bench on more platforms (#8312 ) ## Problem https://github.com/neondatabase/cloud/issues/14721 ## Summary of changes add one more platform to benchmarking job `57535c039c/.github/workflows/benchmarking.yml (L57C3-L126)` Run with pg 16, provisioner k8-neonvm by default on the new platform. Adjust some test cases to - not depend on database client <-> database server latency by pushing loops into server side pl/pgSQL functions - increase statement and test timeouts First successful run of these job steps https://github.com/neondatabase/neon/actions/runs/9869817756/job/27254280428	2024-07-15 09:28:35 -04:00
John Spray	32f668f5e7	rfcs: add RFC for timeline archival (#8221 ) A design for a cheap low-resource state for idle timelines: - #8088	2024-07-15 09:28:35 -04:00
Stas Kelvich	a91f9d5832	Enable core dumps for postgres (#8272 ) Set core rlimit to unlimited in compute_ctl, so that all child processes inherit it. We could also set rlimit in relevant startup script, but that way we would depend on external setup and might inadvertently disable it again (core dumping worked in pods, but not in VMs with inittab-based startup).	2024-07-15 09:28:35 -04:00
John Spray	547acde6cd	safekeeper: add eviction_min_resident to stop evictions thrashing (#8335 ) ## Problem - The condition for eviction is not time-based: it is possible for a timeline to be restored in response to a client, that client times out, and then as soon as the timeline is restored it is immediately evicted again. - There is no delay on eviction at startup of the safekeeper, so when it starts up and sees many idle timelines, it does many evictions which will likely be immediately restored when someone uses the timeline. ## Summary of changes - Add `eviction_min_resident` parameter, and use it in `ready_for_eviction` to avoid evictions if the timeline has been resident for less than this period. - This also implicitly delays evictions at startup for `eviction_min_resident` - Set this to a very low number for the existing eviction test, which expects immediate eviction. The default period is 15 minutes. The general reasoning for that is that in the worst case where we thrash ~10k timelines on one safekeeper, downloading 16MB for each one, we should set a period that would not overwhelm the node's bandwidth.	2024-07-15 09:28:35 -04:00
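The time-based guard described in 547acde6cd could look roughly like the sketch below. Only the names `eviction_min_resident` and `ready_for_eviction` come from the commit message; the `TimelineState` struct and its fields are hypothetical stand-ins for whatever state the safekeeper actually tracks.

```rust
use std::time::{Duration, Instant};

/// Hypothetical stand-in for the per-timeline state the safekeeper consults.
pub struct TimelineState {
    /// When the timeline last became resident (restored, or process start).
    pub resident_since: Instant,
    /// Whether the other, non-time-based eviction conditions hold.
    pub idle: bool,
}

/// A timeline is only evictable once it has been resident for at least
/// `eviction_min_resident`, in addition to the existing conditions.
pub fn ready_for_eviction(
    tl: &TimelineState,
    now: Instant,
    eviction_min_resident: Duration,
) -> bool {
    tl.idle && now.duration_since(tl.resident_since) >= eviction_min_resident
}
```

Because `resident_since` would also be set at process start in this sketch, the same check implicitly delays evictions after a safekeeper restart, as the commit message notes.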
Alex Chi Z	bea6532881	feat(pageserver): add k-merge layer iterator with lazy loading (#8053 ) Part of https://github.com/neondatabase/neon/issues/8002. This pull request adds a k-merge iterator for bottom-most compaction. ## Summary of changes * Added back lsn_range / key_range in delta layer inner. This was removed due to https://github.com/neondatabase/neon/pull/8050, but added back because iterators need that information to process lazy loading. * Added lazy-loading k-merge iterator. * Added iterator wrapper as a unified iterator type for image+delta iterator. The current status and test should cover the use case for L0 compaction so that the L0 compaction process can bypass page cache and have a fixed amount of memory usage. The next step is to integrate this with the new bottom-most compaction. --------- Signed-off-by: Alex Chi Z <chi@neon.tech> Co-authored-by: Christian Schwarz <christian@neon.tech>	2024-07-15 09:28:35 -04:00
Arpad Müller	8e2fe6b22e	Remove ImageCompressionAlgorithm::DisabledNoDecompress (#8300 ) Removes the `ImageCompressionAlgorithm::DisabledNoDecompress` variant. We now assume any blob with the specific bits set is actually a compressed blob. The `ImageCompressionAlgorithm::Disabled` variant still remains and is the new default. Reverts large parts of #8238 , as originally intended in that PR. Part of #5431	2024-07-15 09:28:35 -04:00
dependabot[bot]	4d75e1ef81	build(deps-dev): bump zipp from 3.8.1 to 3.19.1 Bumps [zipp](https://github.com/jaraco/zipp) from 3.8.1 to 3.19.1. - [Release notes](https://github.com/jaraco/zipp/releases) - [Changelog](https://github.com/jaraco/zipp/blob/main/NEWS.rst) - [Commits](https://github.com/jaraco/zipp/compare/v3.8.1...v3.19.1) --- updated-dependencies: - dependency-name: zipp dependency-type: indirect ... Signed-off-by: dependabot[bot] <support@github.com>	2024-07-15 09:28:35 -04:00
Conrad Ludgate	4c7c00268c	proxy: remove some trace logs (#8334 )	2024-07-15 09:28:35 -04:00
John Spray	f28abb953d	tests: stabilize test_sharding_split_compaction (#8318 ) ## Problem This test incorrectly assumed that a post-split compaction would only drop content. This was easily destabilized by any changes to image generation rules. ## Summary of changes - Before split, do a full image layer generation pass, to guarantee that post-split compaction should only drop data, never create it. - Fix the force_image_layer_creation mode of compaction that we use from tests like this: previously it would try and generate image layers even if one already existed with the same layer key, which caused compaction to fail.	2024-07-15 09:28:35 -04:00
Conrad Ludgate	4df39d7304	proxy: pg17 fixes (#8321 ) ## Problem #7809 - we do not support sslnegotiation=direct #7810 - we do not support negotiating down the protocol extensions. ## Summary of changes 1. Same as postgres, check the first startup packet byte for tls header `0x16`, and check the ALPN. 2. Tell clients using protocol >3.0 to downgrade	2024-07-15 09:28:35 -04:00
Christian Schwarz	bfc7338246	pageserver: move `page_service`'s `import basebackup` / `import wal` to mgmt API (#8292 ) I want to fix bugs in `page_service` ([issue](https://github.com/neondatabase/neon/issues/7427)) and the `import basebackup` / `import wal` stand in the way / make the refactoring more complicated. We don't use these methods anyway in practice, but, there have been some objections to removing the functionality completely. So, this PR preserves the existing functionality but moves it into the HTTP management API. Note that I don't try to fix existing bugs in the code, specifically not fixing * it only ever worked correctly for unsharded tenants * it doesn't clean up on error All errors are mapped to `ApiError::InternalServerError`.	2024-07-15 09:28:35 -04:00
Christian Schwarz	35dac6e6c8	fix(l0_flush): drops permit before fsync, potential cause for OOMs (#8327 ) ## Problem Slack thread: https://neondb.slack.com/archives/C033RQ5SPDH/p1720511577862519 We're seeing OOMs in staging on a pageserver that has l0_flush.mode=Direct enabled. There's a strong correlation between jumps in `maxrss_kb` and `pageserver_timeline_ephemeral_bytes`, so, it's quite likely that l0_flush.mode=Direct is the culprit. Notably, the expected max memory usage on that staging server by the l0_flush.mode=Direct is ~2GiB but we're seeing as much as 24GiB max RSS before the OOM kill. One hypothesis is that we're dropping the semaphore permit before all the dirtied pages have been flushed to disk. (The flushing to disk likely happens in the fsync inside the `.finish()` call, because we're using ext4 in data=ordered mode). ## Summary of changes Hold the permit until after we're done with `.finish()`.	2024-07-15 09:28:35 -04:00
Christian Schwarz	e619e8703e	refactor: postgres_backend: replace abstract shutdown_watcher with CancellationToken (#8295 ) Preliminary refactoring while working on https://github.com/neondatabase/neon/issues/7427 and specifically https://github.com/neondatabase/neon/pull/8286	2024-07-15 09:28:35 -04:00
Tristan Partin	6fd35bfe32	Add an application_name to more Neon connections Helps identify connections in the logs.	2024-07-15 09:28:35 -04:00
Tristan Partin	547a431b0d	Refactor how migrations are ran Just a small improvement I noticed while looking at fixing CVE-2024-4317 in Neon.	2024-07-15 09:28:35 -04:00
Alex Chi Z	f8c01c6341	fix(storage-scrubber): use default AWS authentication (#8299 ) part of https://github.com/neondatabase/cloud/issues/14024 close https://github.com/neondatabase/neon/issues/7665 Things running in k8s container use this authentication: https://docs.aws.amazon.com/sdkref/latest/guide/feature-container-credentials.html while we did not configure the client to use it. This pull request simply uses the default s3 client credential chain for storage scrubber. It might break compatibility with minio. ## Summary of changes * Use default AWS credential provider chain. * Improvements for s3 errors, we now have detailed errors and correct backtrace on last trial of the operation. --------- Signed-off-by: Alex Chi Z <chi@neon.tech> Co-authored-by: Joonas Koivunen <joonas@neon.tech>	2024-07-15 09:28:35 -04:00
Conrad Ludgate	1145700f87	chore: fix nightly build (#8142 ) ## Problem `cargo +nightly check` fails ## Summary of changes Updates `measured`, `time`, and `crc32c`. * `measured`: updated to fix https://github.com/rust-lang/rust/issues/125763. * `time`: updated to fix https://github.com/rust-lang/rust/issues/125319 * `crc32c`: updated to remove some nightly feature detection with a removed nightly feature	2024-07-15 09:28:35 -04:00
Alex Chi Z	44339f5b70	chore(storage-scrubber): allow disable file logging (#8297 ) part of https://github.com/neondatabase/cloud/issues/14024, k8s does not always have a volume available for logging, and I'm running into weird permission errors... While I could spend time figuring out how to create temp directories for logging, I think it would be better to just disable file logging as k8s containers are ephemeral and we cannot retrieve anything on the fs after the container gets removed. ## Summary of changes `PAGESERVER_DISABLE_FILE_LOGGING=1` -> file logging disabled Signed-off-by: Alex Chi Z <chi@neon.tech>	2024-07-15 09:28:35 -04:00
Luca BRUNO	7b4a9c1d82	proxy/http: avoid spurious vector reallocations This tweaks the rows-to-JSON rendering logic in order to avoid allocating 0-sized temporary vectors and later growing them to insert elements. As the exact size is known in advance, both vectors can be built with an exact capacity upfront. This will avoid further vector growing/reallocation in the rendering hotpath. Signed-off-by: Luca BRUNO <lucab@lucabruno.net>	2024-07-15 09:28:35 -04:00
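The allocation pattern that 7b4a9c1d82 describes (sizing a vector up front when the final length is known) can be illustrated with a small sketch. The `render_row` helper and its signature are hypothetical and much simpler than the proxy's real rows-to-JSON code.

```rust
/// Hypothetical row-rendering helper: turns each column value into a String.
pub fn render_row(values: &[i64]) -> Vec<String> {
    // The final length is known in advance, so reserve it once instead of
    // starting from an empty vector and growing it on each push.
    let mut out = Vec::with_capacity(values.len());
    for v in values {
        out.push(v.to_string());
    }
    out
}
```

`Vec::with_capacity(n)` guarantees room for at least `n` elements, so none of the pushes in the loop needs to reallocate.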
Alexander Bayandin	3b2fc27de4	CI(promote-compatibility-data): take into account commit sha (#8283 ) ## Problem In https://github.com/neondatabase/neon/pull/8161, we changed the path to Neon artefacts by adding commit sha to it, but we missed adding these changes to `promote-compatibility-data` job that we use for backward/forward- compatibility testing. ## Summary of changes - Add commit sha to `promote-compatibility-data`	2024-07-15 09:28:35 -04:00
Yuchen Liang	0b6492e7d3	tests: increase approx size equal threshold to avoid `test_lsn_lease_size` flakiness (#8282 ) ## Summary of changes Increase the `assert_size_approx_equal` threshold to avoid flakiness of `test_lsn_lease_size`. Still needs more investigation to fully resolve #8293. - Also set `autovacuum=off` for the endpoint we are running in the test. Signed-off-by: Yuchen Liang <yuchen@neon.tech>	2024-07-15 09:28:35 -04:00
John Spray	7cfaecbeb6	tests: stabilize test_timeline_size_quota_on_startup (#8255 ) ## Problem `test_timeline_size_quota_on_startup` assumed that writing data beyond the size limit would always be blocked. This is not so: the limit is only enforced if feedback makes it back from the pageserver to the safekeeper + compute. Closes: https://github.com/neondatabase/neon/issues/6562 ## Summary of changes - Modify the test to wait for the pageserver to catch up. The size limit was never actually being enforced robustly, the original version of this test was just writing much more than 30MB and about 98% of the time getting lucky such that the feedback happened to arrive before the tests for loop was done. - If the test fails, log the logical size as seen by the pageserver.	2024-07-15 09:28:35 -04:00
Alex Chi Z	472acae615	fix(pageserver): write to both v1+v2 for aux tenant import (#8316 ) close https://github.com/neondatabase/neon/issues/8202 ref https://github.com/neondatabase/neon/pull/6560 For tenant imports, we now write the aux files into both v1+v2 storage, so that the test case can pick either one for testing. Given the API is only used for testing, this looks like a safe change. Signed-off-by: Alex Chi Z <chi@neon.tech>	2024-07-15 09:28:35 -04:00
John Spray	108bf56e44	tests: use smaller layers in test_pg_regress (#8232 ) ## Problem Debug-mode runs of test_pg_regress are rather slow since https://github.com/neondatabase/neon/pull/8105, and occasionally exceed their 600s timeout. ## Summary of changes - Use 8MiB layer files, avoiding large ephemeral layers On a hetzner AX102, this takes the runtime from 230s to 190s. Which hopefully will be enough to get the runtime on github runners more reliably below its 600s timeout. This has the side benefit of exercising more of the pageserver stack (including compaction) under a workload that exercises a more diverse set of postgres functionality than most of our tests.	2024-07-15 09:28:35 -04:00
Alexey Kondratov	e83a499ab4	compute_ctl: Use 'fast' shutdown for Postgres termination (#8289 ) ## Problem We currently use 'immediate' mode in the most commonly used shutdown path, when the control plane calls a `compute_ctl` API to terminate Postgres inside compute without waiting for the actual pod / VM termination. Yet, 'immediate' shutdown doesn't create a shutdown checkpoint and ROs have bad times figuring out the list of running xacts during next start. ## Summary of changes Use 'fast' mode, which creates a shutdown checkpoint that is important for ROs to get a list of running xacts faster instead of going through the CLOG. On the control plane side, we poll this `compute_ctl` termination API for 10s, it should be enough as we don't really write any data at checkpoint time. If it times out, we anyway switch to the slow k8s-based termination. See https://www.postgresql.org/docs/current/server-shutdown.html for the list of modes and signals. The default VM shutdown hook already uses `fast` mode, see [1] [1] `c9fd8d7693/vm-image-spec.yaml (L30-L31)` Related to #6211	2024-07-15 09:28:35 -04:00
Yuchen Liang	ebf3bfadde	refactor: move part of sharding API from `pageserver_api` to `utils` (#8254 ) ## Problem LSN Leases introduced in #8084 is a new API that is made shard-aware from day 1. To support ephemeral endpoint in #7994 without linking Postgres C API against `compute_ctl`, part of the sharding needs to reside in `utils`. ## Summary of changes - Create a new `shard` module in utils crate. - Move more interface related part of tenant sharding API to utils and re-export them in pageserver_api. Signed-off-by: Yuchen Liang <yuchen@neon.tech>	2024-07-15 09:28:35 -04:00
John Spray	ab06240fae	pageserver: respect has_relmap_file in collect_keyspace (#8276 ) ## Problem Rarely, a dbdir entry can exist with no `relmap_file_key` data. This causes compaction to fail, because it assumes that if the database exists, then so does the relmap file. Basebackup already handled this using a boolean to record whether such a key exists, but `collect_keyspace` didn't. ## Summary of changes - Respect the flag for whether a relfilemap exists in collect_keyspace - The reproducer for this issue will merge separately in https://github.com/neondatabase/neon/pull/8232	2024-07-15 09:28:35 -04:00
Tristan Partin	cec216c5c0	Add long running replication tests These tests will help verify that replication, both physical and logical, works as expected in Neon. Co-authored-by: Sasha Krassovsky <sasha@neon.tech>	2024-07-15 09:28:35 -04:00
Tristan Partin	930201e033	Add PgBin.run_nonblocking() Allows a process to run without blocking program execution, which can be useful for certain test scenarios. Co-authored-by: Sasha Krassovsky <sasha@neon.tech>	2024-07-15 09:28:35 -04:00
Tristan Partin	8328580dc2	Log PG environment variables when a PgBin runs Useful for debugging situations like connecting to databases. Co-authored-by: Sasha Krassovsky <sasha@neon.tech>	2024-07-15 09:28:35 -04:00
Tristan Partin	8d9b632f2a	Add Neon HTTP API test fixture This is a Python binding to the Neon HTTP API. It isn't complete, but can be extended as necessary. Co-authored-by: Sasha Krassovsky <sasha@neon.tech>	2024-07-15 09:28:35 -04:00
Tristan Partin	55d37c77b9	Hide import behind TYPE_CHECKING No need to import it if we aren't type checking anything.	2024-07-15 09:28:35 -04:00
John Spray	0948fb6bf1	pageserver: switch to jemalloc (#8307 ) ## Problem - Resident memory on long running pageserver processes tends to climb: memory fragmentation is suspected. - Total resident memory may be a limiting factor for running on smaller nodes. ## Summary of changes - As a low-energy experiment, switch the pageserver to use jemalloc (not a net-new dependency, proxy already uses it) - Decide at end of week whether to revert before next release.	2024-07-15 09:28:35 -04:00
Alex Chi Z	285c6d2974	fix(pageserver): ensure sparse keyspace is ordered (#8285 ) ## Problem Sparse keyspaces were constructed with ranges out of order: this didn't break things obviously, but meant that users of KeySpace functions that assume ordering would assert out. Closes https://github.com/neondatabase/neon/issues/8277 ## Summary of changes make sure the sparse keyspace has ordered keyspace parts	2024-07-15 09:28:35 -04:00
Vlad Lazar	a5491463e1	Merge pull request #8304 from neondatabase/rc/2024-07-08 Storage & Compute release 2024-07-08	2024-07-08 20:25:54 +01:00
dependabot[bot]	a58827f952	build(deps): bump certifi from 2023.7.22 to 2024.7.4 (#8301 )	2024-07-08 17:22:36 +01:00
Arpad Müller	36b790f282	Add concurrency to the find-large-objects scrubber subcommand (#8291 ) The find-large-objects scrubber subcommand is quite fast if you run it in an environment with low latency to the S3 bucket (say an EC2 instance in the same region). However, the higher the latency gets, the slower the command becomes. Therefore, add a concurrency param and make it parallelized. This doesn't change that general relationship, but at least lets us do multiple requests in parallel and therefore hopefully faster. Running with concurrency of 64 (default): ``` 2024-07-05T17:30:22.882959Z INFO lazy_load_identity [...] [...] 2024-07-05T17:30:28.289853Z INFO Scanned 500 shards. [...] ``` With concurrency of 1, simulating state before this PR: ``` 2024-07-05T17:31:43.375153Z INFO lazy_load_identity [...] [...] 2024-07-05T17:33:51.987092Z INFO Scanned 500 shards. [...] ``` In other words, to list 500 shards, speed is increased from 2:08 minutes to 6 seconds. Follow-up of #8257, part of #5431	2024-07-08 17:22:36 +01:00
Arpad Müller	3ef7748e6b	Improve parsing of `ImageCompressionAlgorithm` (#8281 ) Improve parsing of the `ImageCompressionAlgorithm` enum to allow level customization like `zstd(1)`, as strum only takes `Default::default()`, i.e. `None` as the level. Part of #5431	2024-07-08 17:22:36 +01:00
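To illustrate the kind of parsing 3ef7748e6b refers to (accepting a level suffix such as `zstd(1)`), here is a hand-written `FromStr` sketch. The enum name, its variants, and the accepted strings other than `zstd(1)` are assumptions made for illustration; they are not the actual `ImageCompressionAlgorithm` definition.

```rust
use std::str::FromStr;

/// Hypothetical, reduced version of a compression setting.
#[derive(Debug, PartialEq)]
pub enum Compression {
    Disabled,
    Zstd { level: Option<i8> },
}

impl FromStr for Compression {
    type Err = String;

    fn from_str(s: &str) -> Result<Self, Self::Err> {
        match s {
            "disabled" => Ok(Self::Disabled),
            // A bare name means "use the default level".
            "zstd" => Ok(Self::Zstd { level: None }),
            _ => {
                // Otherwise expect the form `zstd(<level>)`.
                let level = s
                    .strip_prefix("zstd(")
                    .and_then(|rest| rest.strip_suffix(')'))
                    .ok_or_else(|| format!("unknown compression setting: {s}"))?;
                let level = level
                    .parse::<i8>()
                    .map_err(|e| format!("invalid zstd level {level:?}: {e}"))?;
                Ok(Self::Zstd { level: Some(level) })
            }
        }
    }
}
```

A derive-based parser can only construct a variant with default field values, which is the limitation the commit message mentions; writing the parser by hand is one way to carry the level through.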
Christian Schwarz	f3310143e4	pageserver_live_connections: track as counter pair (#8227 ) Generally counter pairs are preferred over gauges. In this case, I found myself asking what the typical rate of accepted page_service connections on a pageserver is, and I couldn't answer it with the gauge metric. There are a few dashboards using this metric: https://github.com/search?q=repo%3Aneondatabase%2Fgrafana-dashboard-export%20pageserver_live_connections&type=code I'll convert them to use the new metric once this PR reaches prod. refs https://github.com/neondatabase/neon/issues/7427	2024-07-08 17:22:36 +01:00
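The "counter pair" idea in f3310143e4 can be sketched with two monotonic counters, from which both a rate and the current in-flight count can be derived. This is a minimal illustration using plain atomics; the `CounterPair` type and its method names here are hypothetical and do not reflect the metrics library the pageserver actually uses.

```rust
use std::sync::atomic::{AtomicU64, Ordering};

/// Hypothetical counter pair: two monotonically increasing counters.
#[derive(Default)]
pub struct CounterPair {
    started: AtomicU64,
    finished: AtomicU64,
}

impl CounterPair {
    /// Call when a connection is accepted.
    pub fn inc_started(&self) {
        self.started.fetch_add(1, Ordering::Relaxed);
    }

    /// Call when a connection is closed.
    pub fn inc_finished(&self) {
        self.finished.fetch_add(1, Ordering::Relaxed);
    }

    /// Total accepted so far; its rate of change is the accept rate,
    /// which a gauge alone cannot provide.
    pub fn started_total(&self) -> u64 {
        self.started.load(Ordering::Relaxed)
    }

    /// The old gauge value, derived as started minus finished.
    /// `saturating_sub` guards against a racy read that observes
    /// `finished` ahead of `started`.
    pub fn in_flight(&self) -> u64 {
        let finished = self.finished.load(Ordering::Relaxed);
        let started = self.started.load(Ordering::Relaxed);
        started.saturating_sub(finished)
    }
}
```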
Konstantin Knizhnik	05b4169644	Increase timeout for waiting subscriber caught-up (#8118 ) ## Problem test_subscriber_restart has quite a large failure rate: https://neonprod.grafana.net/d/fddp4rvg7k2dcf/regression-test-failures?orgId=1&var-test_name=test_subscriber_restart&var-max_count=100&var-restrict=false It can be caused by too small a timeout (5 seconds) to wait until changes are propagated. Related to #8097 ## Summary of changes Increase timeout to 30 seconds. ## Checklist before requesting a review - [ ] I have performed a self-review of my code. - [ ] If it is a core feature, I have added thorough tests. - [ ] Do we need to implement analytics? if so did you add the relevant metrics to the dashboard? - [ ] If this PR requires public announcement, mark it with /release-notes label and add several sentences in this section. ## Checklist before merging - [ ] Do not forget to reformat commit message to not include the above checklist Co-authored-by: Konstantin Knizhnik <knizhnik@neon.tech>	2024-07-08 17:22:36 +01:00
Alexander Bayandin	d1495755e7	SELECT 💣(); (#8270 ) ## Problem We want to be able to test how our infrastructure reacts to segfaults in Postgres (for example, we collect cores, and get some required logs/metrics, etc) ## Summary of changes - Add `trigger_segfauls` function to `neon_test_utils` to trigger a segfault in Postgres - Add `trigger_panic` function to `neon_test_utils` to trigger SIGABRT (by using `elog(PANIC, ...)`) - Fix cleanup logic in regression tests when an endpoint has crashed	2024-07-08 17:22:36 +01:00
Vlad Lazar	c8dd78c6c8	pageserver: add time based image layer creation check (#8247 ) ## Problem Assume a timeline with the following workload: very slow ingest of updates to a small number of keys that fit within the same partition (as decided by `KeySpace::partition`). These tenants will create small L0 layers due to time-based rolling, and, consequently, the L1 layers will also be small. Currently, by default, we need to ingest 512 MiB of WAL before checking if an image layer is required. This scheme works fine under the assumption that L1s are roughly of checkpoint distance size, but as the first paragraph explained, that's not the case for all workloads. ## Summary of changes Check if new image layers are required at least once every checkpoint timeout interval.	2024-07-08 17:22:36 +01:00
John Spray	b44ee3950a	safekeeper: add separate `tombstones` map for deleted timelines (#8253 ) ## Problem Safekeepers left running for a long time use a lot of memory (up to the point of OOMing, on small nodes) for deleted timelines, because the `Timeline` struct is kept alive as a guard against recreating deleted timelines. Closes: https://github.com/neondatabase/neon/issues/6810 ## Summary of changes - Create separate tombstones that just record a ttid and when the timeline was deleted. - Add a periodic housekeeping task that cleans up tombstones older than a hardcoded TTL (24h) I think this also makes https://github.com/neondatabase/neon/pull/6766 un-needed, as the tombstone is also checked during deletion. I considered making the overall timeline map use an enum type containing active or deleted, but having a separate map of tombstones avoids bloating that map, so that calls like `get()` can still go straight to a timeline without having to walk a hashmap that also contains tombstones.	2024-07-08 17:22:36 +01:00
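The separate tombstone map described in b44ee3950a could be sketched as below: a small map from timeline ID to deletion time, with a housekeeping pass that drops entries older than a TTL. The `Ttid` alias (a plain integer here), the struct, and the method names are hypothetical simplifications, not the safekeeper's real types.

```rust
use std::collections::HashMap;
use std::time::{Duration, Instant};

/// Hypothetical stand-in for a tenant+timeline ID.
type Ttid = u64;

/// Lightweight record of deleted timelines, kept apart from the live map.
pub struct Tombstones {
    deleted_at: HashMap<Ttid, Instant>,
    ttl: Duration,
}

impl Tombstones {
    pub fn new(ttl: Duration) -> Self {
        Self { deleted_at: HashMap::new(), ttl }
    }

    /// Record that a timeline was deleted at `now`.
    pub fn insert(&mut self, ttid: Ttid, now: Instant) {
        self.deleted_at.insert(ttid, now);
    }

    /// Consulted before (re)creating a timeline.
    pub fn contains(&self, ttid: Ttid) -> bool {
        self.deleted_at.contains_key(&ttid)
    }

    /// Periodic cleanup: keep only tombstones younger than the TTL.
    pub fn housekeeping(&mut self, now: Instant) {
        let ttl = self.ttl;
        self.deleted_at.retain(|_, at| now.duration_since(*at) < ttl);
    }
}
```

Keeping tombstones in their own map means a lookup in the live timeline map never has to skip over deleted entries, which is the design trade-off the commit message argues for.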
John Spray	64334f497d	tests: make location_conf_churn more robust (#8271 ) ## Problem This test directly manages locations on pageservers and configuration of an endpoint. However, it did not switch off the parts of the storage controller that attempt to do the same: occasionally, the test would fail in a strange way such as a compute failing to accept a reconfiguration request. ## Summary of changes - Wire up the storage controller's compute notification hook to a no-op handler - Configure the tenant's scheduling policy to Stop.	2024-07-08 17:22:35 +01:00
Peter Bendel	5ffcb688cc	correct error handling for periodic pagebench runner status (#8274 ) ## Problem the following periodic pagebench run failed but was still shown as successful https://github.com/neondatabase/neon/actions/runs/9798909458/job/27058179993#step:9:47 ## Summary of changes if the ec2 test runner reports a failure, fail the job step and thus the workflow --------- Co-authored-by: Alexander Bayandin <alexander@neon.tech>	2024-07-08 17:22:35 +01:00
John Spray	32fc2dd683	tests: extend allow list in deletion test (#8268 ) ## Problem `1ea5d8b132` tolerated this as an error message, but it can show up in logs as well. Example failure: https://neon-github-public-dev.s3.amazonaws.com/reports/pr-8201/9780147712/index.html#testresult/263422f5f5f292ea/retries ## Summary of changes - Tolerate "failed to delete 1 objects" in pageserver logs, this occurs occasionally when injected failures exhaust deletion's retries.	2024-07-08 17:22:35 +01:00
Peter Bendel	d35ddfbab7	add checkout depth1 to workflow to access local github actions like generate allure report (#8259 ) ## Problem job step to create allure report fails https://github.com/neondatabase/neon/actions/runs/9781886710/job/27006997416#step:11:1 ## Summary of changes Shallow checkout of sources to get access to local github action needed in the job step ## Example run example run with this change https://github.com/neondatabase/neon/actions/runs/9790647724 do not merge this PR until the job is clean --------- Co-authored-by: Alexander Bayandin <alexander@neon.tech>	2024-07-08 17:22:35 +01:00
Konstantin Knizhnik	3ee82a9895	implement rolling hyper-log-log algorithm (#8068 ) ## Problem See #7466 ## Summary of changes Implement the algorithm described in https://hal.science/hal-00465313/document A new GUC is added: `neon.wss_max_duration`, which specifies the size of the sliding window (in seconds). The default value is 1 hour. It is possible to request an estimate of the working set size within this window using the new function `approximate_working_set_size_seconds`. The old function `approximate_working_set_size` is preserved for backward compatibility, but its scope is also limited by `neon.wss_max_duration`. The version of the Neon extension is changed to 1.4 ## Checklist before requesting a review - [ ] I have performed a self-review of my code. - [ ] If it is a core feature, I have added thorough tests. - [ ] Do we need to implement analytics? if so did you add the relevant metrics to the dashboard? - [ ] If this PR requires public announcement, mark it with /release-notes label and add several sentences in this section. ## Checklist before merging - [ ] Do not forget to reformat commit message to not include the above checklist --------- Co-authored-by: Konstantin Knizhnik <knizhnik@neon.tech> Co-authored-by: Matthias van de Meent <matthias@neon.tech>	2024-07-08 17:22:35 +01:00
Arpad Müller	e770aeee92	Flatten compression algorithm setting (#8265 ) This flattens the compression algorithm setting, removing the `Option<_>` wrapping layer and making handling of the setting easier. It also adds a specific setting for disabled compression with the continued ability to read compressed data, giving us the option to more easily back out of a compression rollout, should the need arise, which was one of the limitations of #8238. Implements my suggestion from https://github.com/neondatabase/neon/pull/8238#issuecomment-2206181594 , inspired by Christian's review in https://github.com/neondatabase/neon/pull/8238#pullrequestreview-2156460268 . Part of #5431	2024-07-08 17:22:35 +01:00
Yuchen Liang	32828cddd6	feat(pageserver): integrate lsn lease into synthetic size (#8220 ) Part of #7497, closes #8071. (accidentally closed #8208, reopened here) ## Problem After the changes in #8084, we need synthetic size to also account for leased LSNs so that users do not get free retention by running a small ephemeral endpoint for a long time. ## Summary of changes This PR integrates LSN leases into the synthetic size calculation. We model leases as read-only branches started at the leased LSN (except it does not have a timeline id). Other changes: - Add new unit tests testing whether a lease behaves like a read-only branch. - Change `/size_debug` response to include lease point in the SVG visualization. - Fix `/lsn_lease` HTTP API to do proper parsing for POST. Signed-off-by: Yuchen Liang <yuchen@neon.tech> Co-authored-by: Joonas Koivunen <joonas@neon.tech> Co-authored-by: Christian Schwarz <christian@neon.tech>	2024-07-08 17:22:35 +01:00
Arpad Müller	bd2046e1ab	Add find-large-objects subcommand to scrubber (#8257 ) Adds a find-large-objects subcommand to the scrubber to allow listing layer objects larger than a specific size. To be used like: ``` AWS_PROFILE=dev REGION=us-east-2 BUCKET=neon-dev-storage-us-east-2 cargo run -p storage_scrubber -- find-large-objects --min-size 250000000 --ignore-deltas ``` Part of #5431	2024-07-08 17:22:35 +01:00
John Spray	7e2a3d2728	pageserver: downgrade stale generation messages to INFO (#8256 ) ## Problem When generations were new, these messages were an important way of noticing if something unexpected was going on. We found some real issues when investigating tests that unexpectedly tripped them. As time has gone on, this code is now pretty battle-tested, and as we do more live migrations etc, it's fairly normal to see the occasional message from a node with a stale generation. At this point the cognitive load on developers to selectively allow-list these logs outweighs the benefit of having them at warn severity. Closes: https://github.com/neondatabase/neon/issues/8080 ## Summary of changes - Downgrade "Dropped remote consistent LSN updates" and "Dropping stale deletions" messages to INFO - Remove all the allow-list entries for these logs.	2024-07-08 17:22:35 +01:00
Alexander Bayandin	0e4832308d	CI(pg-clients): unify workflow with build-and-test (#8160 ) ## Problem `pg-clients` workflow looks different from the main `build-and-test` workflow for historical reasons (it was my very first task at Neon, and back then I wasn't really familiar with the rest of the CI pipelines). This PR unifies `pg-clients` workflow with `build-and-test` ## Summary of changes - Rename `pg_clients.yml` to `pg-clients.yml` - Run the workflow on changes in relevant files - Create Allure report for tests - Send slack notifications to `#on-call-qa-staging-stream` channel (instead of `#on-call-staging-stream`) - Update Client libraries once we're here	2024-07-08 17:22:35 +01:00
Arpad Müller	0a63bc4818	Use bool param for round_trip_test_compressed (#8252 ) As per @koivunej 's request in https://github.com/neondatabase/neon/pull/8238#discussion_r1663892091 , use a runtime param instead of monomorphizing the function based on the value. Part of https://github.com/neondatabase/neon/issues/5431	2024-07-08 17:22:35 +01:00
Vlad Lazar	2897dcc9aa	pageserver: increase rate limit duration for layer visit log (#8263 ) ## Problem I'd like to keep this in the tree since it might be useful in prod as well. It's a bit too noisy as is and missing the lsn. ## Summary of changes Add an lsn field and increase the rate limit duration.	2024-07-08 17:22:35 +01:00
Alexander Bayandin	1d0ec50ddb	CI(build-and-test): add conclusion job (#8246 ) ## Problem Currently, if you need to rename a job and the job is listed in [branch protection rules](https://github.com/neondatabase/neon/settings/branch_protection_rules), the PR won't be allowed to merge. ## Summary of changes - Add `conclusion` job that fails if any of its dependencies don't finish successfully	2024-07-08 17:22:35 +01:00
Conrad Ludgate	a86b43fcd7	proxy: cache certain non-retriable console errors for a short time (#8201 ) ## Problem If there's a quota error, it makes sense to cache it for a short window of time. Many clients do not handle database connection errors gracefully, so just spam retry 🤡 ## Summary of changes Updates the node_info cache to support storing console errors. Store console errors if they cannot be retried (using our own heuristic. should only trigger for quota exceeded errors).	2024-07-08 17:22:35 +01:00
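The caching behaviour described in #8201 can be sketched as a TTL cache whose entries hold either a value or an error. This is a hedged Python illustration, not the proxy's Rust code: `ConsoleError`, its `retriable` flag, `NodeInfoCache`, `lookup`, and the TTL values are all hypothetical names and numbers chosen for the example.

```python
import time


class ConsoleError(Exception):
    """Hypothetical console error carrying a retriable/non-retriable flag."""

    def __init__(self, message, retriable):
        super().__init__(message)
        self.retriable = retriable


class NodeInfoCache:
    """TTL cache storing either a successful value or a cached error."""

    def __init__(self, ok_ttl=300.0, err_ttl=10.0, clock=time.monotonic):
        self._entries = {}
        self._ok_ttl = ok_ttl
        self._err_ttl = err_ttl   # errors are kept only for a short time
        self._clock = clock

    def get(self, key):
        entry = self._entries.get(key)
        if entry is None:
            return None
        expires_at, is_error, value = entry
        if self._clock() >= expires_at:
            del self._entries[key]
            return None
        return is_error, value

    def put(self, key, value, is_error):
        ttl = self._err_ttl if is_error else self._ok_ttl
        self._entries[key] = (self._clock() + ttl, is_error, value)


def lookup(cache, key, fetch):
    """Return node info for `key`, replaying a cached non-retriable error."""
    cached = cache.get(key)
    if cached is not None:
        is_error, value = cached
        if is_error:
            raise value
        return value
    try:
        info = fetch(key)
    except ConsoleError as err:
        # Only errors that cannot succeed on retry are worth caching.
        if not err.retriable:
            cache.put(key, err, is_error=True)
        raise
    cache.put(key, info, is_error=False)
    return info
```

With this shape, a client that retries in a tight loop after a quota error hits the cache instead of the console until the short error TTL expires.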
Vlad Lazar	b917868ada	tests: perform graceful rolling restarts in storcon scale test (#8173 ) ## Problem Scale test doesn't exercise drain & fill. ## Summary of changes Make scale test exercise drain & fill	2024-07-08 17:22:35 +01:00
John Spray	7b7d16f52e	pageserver: add supplementary branch usage stats (#8131 ) ## Problem The metrics we have today aren't convenient for planning around the impact of timeline archival on costs. Closes: https://github.com/neondatabase/neon/issues/8108 ## Summary of changes - Add metric `pageserver_archive_size`, which indicates the logical bytes of data which we would expect to write into an archived branch. - Add metric `pageserver_pitr_history_size`, which indicates the distance between last_record_lsn and the PITR cutoff. These metrics are somewhat temporary: when we implement #8088 and associated consumption metric changes, these will reach a final form. For now, an "archived" branch is just any branch outside of its parent's PITR window: later, archival will become an explicit state (which will _usually_ correspond to falling outside the parent's PITR window). The overall volume of timeline metrics is something to watch, but we are removing many more in https://github.com/neondatabase/neon/pull/8245 than this PR is adding.	2024-07-08 17:22:35 +01:00
Alex Chi Z	fee4169b6b	fix(pageserver): ensure test creates valid layer map (#8191 ) I'd like to add some constraints to the layer map we generate in tests. (1) is the layer map that the current compaction algorithm will produce. There is a property that for every delta layer, all delta layers that overlap with it on the LSN axis have the same LSN range. (2) is the layer map that cannot be produced with the legacy compaction algorithm. (3) is the layer map that will be produced by the future tiered-compaction algorithm. The current validator does not allow that but we can modify the algorithm to allow it in the future. ## Summary of changes Add a validator to check if the layer map is valid and refactor the test cases to include delta layer start/end LSN. --------- Signed-off-by: Alex Chi Z <chi@neon.tech> Co-authored-by: Christian Schwarz <christian@neon.tech>	2024-07-08 17:22:35 +01:00
Christian Schwarz	47e06a2cc6	page_service: stop exposing `get_last_record_rlsn` (#8244 ) Compute doesn't use it, let's eliminate it. Ref to Slack thread: https://neondb.slack.com/archives/C033RQ5SPDH/p1719920261995529	2024-07-08 17:22:35 +01:00
Japin Li	c4423c0623	Fix outdated comment (#8149 ) Commit `97b48c23f` changes the log wait timeout from 1 second to 100 milliseconds but forgets to update the comment.	2024-07-08 17:22:35 +01:00
John Spray	a11cf03123	pageserver: reduce ops tracked at per-timeline detail (#8245 ) ## Problem We record detailed histograms for all page_service op types, which mostly aren't very interesting, but make our prometheus scrapes huge. Closes: #8223 ## Summary of changes - Only track GetPageAtLsn histograms on a per-timeline granularity. For all other operation types, rely on existing node-wide histograms.	2024-07-08 17:22:35 +01:00
Peter Bendel	08b33adfee	add pagebench test cases for periodic pagebench on dedicated hardware (#8233 ) we want to run some specific pagebench test cases on dedicated hardware to get reproducible results run1: 1 client per tenant => characterize throughput with n tenants. - 500 tenants - scale 13 (200 MB database) - 1 hour duration - ca 380 GB layer snapshot files run2.singleclient: 1 client per tenant => characterize latencies run2.manyclient: N clients per tenant => characterize throughput scalability within one tenant. - 1 tenant with 1 client for latencies - 1 tenant with 64 clients because typically for a high number of connections we recommend the connection pooler which by default uses 64 connections (for scalability) - scale 136 (2048 MB database) - 20 minutes each	2024-07-08 17:22:35 +01:00
Arpad Müller	4fb50144dd	Only support compressed reads if the compression setting is present (#8238 ) PR #8106 was created with the assumption that no blob is larger than `256 MiB`. Due to #7852 we have checking for writes of blobs larger than that limit, but we didn't have checking for reads of such large blobs: in theory, we could be reading these blobs every day but we just don't happen to write the blobs for some reason. Therefore, we now add a warning for reads of such large blobs as well. To make deploying compression less dangerous, we therefore only assume a blob is compressed if the compression setting is present in the config. This also means that we can't back out of compression once we enabled it. Part of https://github.com/neondatabase/neon/issues/5431	2024-07-08 17:22:35 +01:00
John Spray	c500137ca9	pageserver: don't try to flush if shutdown during attach (#8235 ) ## Problem test_location_conf_churn fails on log errors when it tries to shutdown a pageserver immediately after starting a tenant attach, like this: https://neon-github-public-dev.s3.amazonaws.com/reports/pr-8224/9761000525/index.html#/testresult/15fb6beca5c7327c ``` shutdown:shutdown{tenant_id=35f5c55eb34e7e5e12288c5d8ab8b909 shard_id=0000}:timeline_shutdown{timeline_id=30936747043353a98661735ad09cbbfe shutdown_mode=FreezeAndFlush}: failed to freeze and flush: cannot flush frozen layers when flush_loop is not running, state is Exited\n') ``` This is happening because Tenant::shutdown fires its cancellation token early if the tenant is not fully attached by the time shutdown is called, so the flush loop is shutdown by the time we try and flush. ## Summary of changes - In the early-cancellation case, also set the shutdown mode to Hard to skip trying to do a flush that will fail.	2024-07-08 17:22:35 +01:00
Alexander Bayandin	252c4acec9	CI: update docker/* actions to latest versions (#7694 ) ## Problem GitHub Actions complain that we use actions that depend on deprecated Node 16: ``` Node.js 16 actions are deprecated. Please update the following actions to use Node.js 20: docker/setup-buildx-action@v2 ``` But also, the latest `docker/setup-buildx-action` fails with the following error: ``` /nvme/actions-runner/_work/_actions/docker/setup-buildx-action/v3/webpack:/docker-setup-buildx/node_modules/@actions/cache/lib/cache.js:175 throw new Error(`Path Validation Error: Path(s) specified in the action for caching do(es) not exist, hence no cache is being saved.`); ^ Error: Path Validation Error: Path(s) specified in the action for caching do(es) not exist, hence no cache is being saved. at Object.rejected (/nvme/actions-runner/_work/_actions/docker/setup-buildx-action/v3/webpack:/docker-setup-buildx/node_modules/@actions/cache/lib/cache.js:175:1) at Generator.next (<anonymous>) at fulfilled (/nvme/actions-runner/_work/_actions/docker/setup-buildx-action/v3/webpack:/docker-setup-buildx/node_modules/@actions/cache/lib/cache.js:29:1) ``` We can work this around by setting `cache-binary: false` for `uses: docker/setup-buildx-action@v3` ## Summary of changes - Update `docker/setup-buildx-action` from `v2` to `v3`, set `cache-binary: false` - Update `docker/login-action` from `v2` to `v3` - Update `docker/build-push-action` from `v4`/`v5` to `v6`	2024-07-08 17:22:35 +01:00
Heikki Linnakangas	db70c175e6	Simplify test_wal_page_boundary_start test (#8214 ) All the code to ensure the WAL record lands at a page boundary was unnecessary for reproducing the original problem. In fact, it's a pretty basic test that checks that outbound replication (= neon as publisher) still works after restarting the endpoint. It just used to be very broken before commit `5ceccdc7de`, which also added this test. To verify that: 1. Check out commit `f3af5f4660` (because the next commit, `7dd58e1449`, fixed the same bug in a different way, making it infeasible to revert the bug fix in an easy way) 2. Revert the bug fix from commit `5ceccdc7de` with this: ``` diff --git a/pgxn/neon/walproposer_pg.c b/pgxn/neon/walproposer_pg.c index 7debb6325..9f03bbd99 100644 --- a/pgxn/neon/walproposer_pg.c +++ b/pgxn/neon/walproposer_pg.c @@ -1437,8 +1437,10 @@ XLogWalPropWrite(WalProposer wp, char buf, Size nbytes, XLogRecPtr recptr) * * https://github.com/neondatabase/neon/issues/5749 */ +#if 0 if (!wp->config->syncSafekeepers) XLogUpdateWalBuffers(buf, recptr, nbytes); +#endif while (nbytes > 0) { ``` 3. Run the test_wal_page_boundary_start regression test. It fails, as expected 4. Apply this commit to the test, and run it again. It still fails, with the same error mentioned in issue #5749: ``` PG:2024-06-30 20:49:08.805 GMT [1248196] STATEMENT: START_REPLICATION SLOT "sub1" LOGICAL 0/0 (proto_version '4', origin 'any', publication_names '"pub1"') PG:2024-06-30 21:37:52.567 GMT [1467972] LOG: starting logical decoding for slot "sub1" PG:2024-06-30 21:37:52.567 GMT [1467972] DETAIL: Streaming transactions committing after 0/1532330, reading WAL from 0/1531C78. 
PG:2024-06-30 21:37:52.567 GMT [1467972] STATEMENT: START_REPLICATION SLOT "sub1" LOGICAL 0/0 (proto_version '4', origin 'any', publication_names '"pub1"') PG:2024-06-30 21:37:52.567 GMT [1467972] LOG: logical decoding found consistent point at 0/1531C78 PG:2024-06-30 21:37:52.567 GMT [1467972] DETAIL: There are no running transactions. PG:2024-06-30 21:37:52.567 GMT [1467972] STATEMENT: START_REPLICATION SLOT "sub1" LOGICAL 0/0 (proto_version '4', origin 'any', publication_names '"pub1"') PG:2024-06-30 21:37:52.568 GMT [1467972] ERROR: could not find record while sending logically-decoded data: invalid contrecord length 312 (expected 6) at 0/1533FD8 ```	2024-07-08 17:22:35 +01:00
Alex Chi Z	ed3b4a58b4	docker: add storage_scrubber into the docker image (#8239 ) ## Problem We will run this tool in the k8s cluster. To make it accessible from k8s, we need to package it into the docker image. part of https://github.com/neondatabase/cloud/issues/14024	2024-07-08 17:22:35 +01:00
Konstantin Knizhnik	2863d1df63	Add test for proper handling of connection failure to avoid 'cannot wait on socket event without a socket' error (#8231 ) ## Problem See https://github.com/neondatabase/cloud/issues/14289 and PR #8210 ## Summary of changes Add test for problems fixed in #8210 ## Checklist before requesting a review - [ ] I have performed a self-review of my code. - [ ] If it is a core feature, I have added thorough tests. - [ ] Do we need to implement analytics? if so did you add the relevant metrics to the dashboard? - [ ] If this PR requires public announcement, mark it with /release-notes label and add several sentences in this section. ## Checklist before merging - [ ] Do not forget to reformat commit message to not include the above checklist --------- Co-authored-by: Konstantin Knizhnik <knizhnik@neon.tech>	2024-07-08 17:22:35 +01:00
Alex Chi Z	320b24eab3	fix(pageserver): comments about metadata key range (#8236 ) Signed-off-by: Alex Chi Z <chi@neon.tech>	2024-07-08 17:22:35 +01:00
John Spray	13a8a5b09b	tense of errors (#8234 ) I forgot a commit when merging https://github.com/neondatabase/neon/pull/8177	2024-07-08 17:22:35 +01:00
Alexander Bayandin	64ccdf65e0	CI(benchmarking): move psql queries to actions/run-python-test-set (#8230 ) ## Problem Some of the Nightly benchmarks fail with the error ``` + /tmp/neon/pg_install/v14/bin/pgbench --version /tmp/neon/pg_install/v14/bin/pgbench: error while loading shared libraries: libpq.so.5: cannot open shared object file: No such file or directory ``` Originally, we added the `pgbench --version` call to check that `pgbench` is installed and to fail earlier if it's not. The failure happens because we don't have `LD_LIBRARY_PATH` set for every job, and it also affects the `psql` command. We can move it to `actions/run-python-test-set` so as not to duplicate code (as it already has `LD_LIBRARY_PATH` set). ## Summary of changes - Remove `pgbench --version` call - Move `psql` commands to common `actions/run-python-test-set`	2024-07-08 17:22:35 +01:00
Christian Schwarz	1ae6aa09dd	L0 flush: opt-in mechanism to bypass PageCache reads and writes (#8190 ) part of https://github.com/neondatabase/neon/issues/7418 # Motivation (reproducing #7418) When we do an `InMemoryLayer::write_to_disk`, there is a tremendous amount of random read I/O, as deltas from the ephemeral file (written in LSN order) are written out to the delta layer in key order. In benchmarks (https://github.com/neondatabase/neon/pull/7409) we can see that this delta layer writing phase is substantially more expensive than the initial ingest of data, and that within the delta layer write a significant amount of the CPU time is spent traversing the page cache. # High-Level Changes Add a new mode for L0 flush that works as follows: * Read the full ephemeral file into memory -- layers are much smaller than total memory, so this is affordable * Do all the random reads directly from this in memory buffer instead of using blob IO/page cache/disk reads. * Add a semaphore to limit how many timelines may concurrently do this (limit peak memory). * Make the semaphore configurable via PS config. # Implementation Details The new `BlobReaderRef::Slice` is a temporary hack until we can ditch `blob_io` for `InMemoryLayer` => Plan for this is laid out in https://github.com/neondatabase/neon/issues/8183 # Correctness The correctness of this change is quite obvious to me: we do what we did before (`blob_io`) but read from memory instead of going to disk. The highest bug potential is in doing owned-buffers IO. I refactored the API a bit in preliminary PR https://github.com/neondatabase/neon/pull/8186 to make it less error-prone, but still, careful review is requested. # Performance I manually measured single-client ingest performance from `pgbench -i ...`. 
Full report: https://neondatabase.notion.site/2024-06-28-benchmarking-l0-flush-performance-e98cff3807f94cb38f2054d8c818fe84?pvs=4 tl;dr: * no speed improvements during ingest, but * significantly lower pressure on PS PageCache (eviction rate drops to 1/3) * (that's why I'm working on this) * noticeable but modestly lower CPU time This is good enough for merging this PR because the changes require opt-in. We'll do more testing in staging & pre-prod. # Stability / Monitoring memory consumption: there's no _hard_ limit on max `InMemoryLayer` size (aka "checkpoint distance") , hence there's no hard limit on the memory allocation we do for flushing. In practice, we a) [log a warning](`23827c6b0d/pageserver/src/tenant/timeline.rs (L5741-L5743)`) when we flush oversized layers, so we'd know which tenant is to blame and b) if we were to put a hard limit in place, we would have to decide what to do if there is an InMemoryLayer that exceeds the limit. It seems like a better option to guarantee a max size for frozen layer, dependent on `checkpoint_distance`. Then limit concurrency based on that. metrics: we do have the [flush_time_histo](`23827c6b0d/pageserver/src/tenant/timeline.rs (L3725-L3726)`), but that includes the wait time for the semaphore. We could add a separate metric for the time spent after acquiring the semaphore, so one can infer the wait time. Seems unnecessary at this point, though.	2024-07-08 17:22:35 +01:00
Arpad Müller	aeb68e51df	Add support for reading and writing compressed blobs (#8106 ) Add support for reading and writing zstd-compressed blobs for use in image layer generation, but maybe one day useful also for delta layers. The reading of them is unconditional while the writing is controlled by the `image_compression` config variable allowing for experiments. For the on-disk format, we re-use some of the bitpatterns we currently keep reserved for blobs larger than 256 MiB. This assumes that we have never ever written any such large blobs to image layers. After the preparation in #7852, we now are unable to read blobs with a size larger than 256 MiB (or write them). A non-goal of this PR is to come up with good heuristics of when to compress a bitpattern. This is left for future work. Parts of the PR were inspired by #7091. cc #7879 Part of #5431	2024-07-08 17:22:35 +01:00
Vlad Lazar	c3e5223a5d	pageserver: rate limit log for loads of layers visited (#8228 ) ## Problem At high percentiles we see more than 800 layers being visited by the read path. We need the tenant/timeline to investigate. ## Summary of changes Add a rate limited log line when the average number of layers visited per key is in the last specified histogram bucket. I plan to use this to identify tenants in us-east-2 staging that exhibit this behaviour. Will revert before next week's release.	2024-07-08 17:22:35 +01:00
Christian Schwarz	daaa3211a4	fix: noisy logging when download gets cancelled during shutdown (#8224 ) Before this PR, during timeline shutdown, we'd occasionally see log lines like this one: ``` 2024-06-26T18:28:11.063402Z INFO initial_size_calculation{tenant_id=$TENANT,shard_id=0000 timeline_id=$TIMELINE}:logical_size_calculation_task:get_or_maybe_download{layer=000000000000000000000000000000000000-000000067F0001A3950001C1630100000000__0000000D88265898}: layer file download failed, and caller has been cancelled: Cancelled, shutting down Stack backtrace: 0: <core::result::Result<T,F> as core::ops::try_trait::FromResidual<core::result::Result<core::convert::Infallible,E>>>::from_residual at /rustc/129f3b9964af4d4a709d1383930ade12dfe7c081/library/core/src/result.rs:1964:27 pageserver::tenant::remote_timeline_client::RemoteTimelineClient::download_layer_file::{{closure}} at /home/nonroot/pageserver/src/tenant/remote_timeline_client.rs:531:13 pageserver::tenant::storage_layer::layer::LayerInner::download_and_init::{{closure}} at /home/nonroot/pageserver/src/tenant/storage_layer/layer.rs:1136:14 pageserver::tenant::storage_layer::layer::LayerInner::download_init_and_wait::{{closure}}::{{closure}} at /home/nonroot/pageserver/src/tenant/storage_layer/layer.rs:1082:74 ``` We can eliminate the anyhow backtrace with no loss of information because the conversion to anyhow::Error happens in exactly one place. refs #7427	2024-07-08 17:22:35 +01:00
John Spray	7ff9989dd5	pageserver: simpler, stricter config error handling (#8177 ) ## Problem Tenant attachment has error paths for failures to write local configuration, but these types of local storage I/O errors should be considered fatal for the process. Related thread on an earlier PR that touched this code: https://github.com/neondatabase/neon/pull/7947#discussion_r1655134114 ## Summary of changes - Make errors writing tenant config fatal (abort process) - When reading tenant config, make all I/O errors except ENOENT fatal - Replace use of bare anyhow errors with `LoadConfigError`	2024-07-08 17:22:35 +01:00
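The error-handling policy from #8177 (a missing file is an ordinary, typed error; any other I/O error is fatal) might look roughly like the Python sketch below. Only the name `LoadConfigError` comes from the commit message; `load_tenant_config`, `abort_process`, and the injectable `fatal` hook are hypothetical and exist so the policy can be exercised without killing the interpreter.

```python
import os
import sys


class LoadConfigError(Exception):
    """Typed error for recoverable config-load failures (name from #8177)."""


def abort_process(message):
    """Default fatal handler: log and abort the whole process."""
    sys.stderr.write(f"fatal: {message}\n")
    os.abort()


def load_tenant_config(path, fatal=abort_process):
    """Read a tenant config file.

    ENOENT is reported to the caller as LoadConfigError; every other
    I/O error is treated as fatal for the process.
    """
    try:
        with open(path, "r", encoding="utf-8") as f:
            return f.read()
    except FileNotFoundError as err:
        raise LoadConfigError(f"config not found: {path}") from err
    except OSError as err:
        # Local storage is misbehaving; continuing would be unsafe.
        fatal(f"I/O error reading {path}: {err}")
        raise  # only reached if a custom `fatal` hook returns
```

Keeping the fatal path behind a hook is purely a testing convenience in this sketch; per the commit's summary, the real change aborts the process on errors writing tenant config.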
Christian Schwarz	ed3b97604c	remote_storage config: move handling of empty inline table `{}` to callers (#8193 ) Before this PR, `RemoteStorageConfig::from_toml` would support deserializing an empty `{}` TOML inline table to a `None`, otherwise try `Some()`. We can instead let * in proxy: let clap derive handle the Option * in PS & SK: assume that if the field is specified, it must be a valid RemoteStorageConfig (This PR started with a much simpler goal of factoring out the `deserialize_item` function because I need that in another PR).	2024-07-08 17:22:35 +01:00
Konstantin Knizhnik	47c50ec460	Check status of connection after PQconnectStartParams (#8210 ) ## Problem See https://github.com/neondatabase/cloud/issues/14289 ## Summary of changes Check connection status after calling PQconnectStartParams ## Checklist before requesting a review - [ ] I have performed a self-review of my code. - [ ] If it is a core feature, I have added thorough tests. - [ ] Do we need to implement analytics? if so did you add the relevant metrics to the dashboard? - [ ] If this PR requires public announcement, mark it with /release-notes label and add several sentences in this section. ## Checklist before merging - [ ] Do not forget to reformat commit message to not include the above checklist --------- Co-authored-by: Konstantin Knizhnik <knizhnik@neon.tech>	2024-07-08 17:22:35 +01:00
Vlad Lazar	8c0ec2f681	docs: Graceful storage controller cluster restarts RFC (#7704 ) RFC for "Graceful Restarts of Storage Controller Managed Clusters". Related https://github.com/neondatabase/neon/issues/7387	2024-07-08 17:22:35 +01:00
Heikki Linnakangas	588bda98e7	tests: Make neon_xlogflush() flush all WAL, if you omit the LSN arg (#8215 ) This makes it much more convenient to use in the common case that you want to flush all the WAL. (Passing pg_current_wal_insert_lsn() as the argument doesn't work for the same reasons as explained in the comments: we need to back off to the beginning of a page if the previous record ended at page boundary.) I plan to use this to fix the issue that Arseny Sher called out at https://github.com/neondatabase/neon/pull/7288#discussion_r1660063852	2024-07-08 17:22:35 +01:00
Alexander Bayandin	504ca7720f	CI(gather-rust-build-stats): fix build with libpq (#8219 ) ## Problem I've missed setting `PQ_LIB_DIR` in https://github.com/neondatabase/neon/pull/8206 in `gather-rust-build-stats` job and it fails now: ``` = note: /usr/bin/ld: cannot find -lpq collect2: error: ld returned 1 exit status error: could not compile `storage_controller` (bin "storage_controller") due to 1 previous error ``` https://github.com/neondatabase/neon/actions/runs/9743960062/job/26888597735 ## Summary of changes - Set `PQ_LIB_DIR` for `gather-rust-build-stats` job	2024-07-08 17:22:35 +01:00
Alex Chi Z	cf4ea92aad	fix(pageserver): include aux file in basebackup only once (#8207 ) Extracted from https://github.com/neondatabase/neon/pull/6560, currently we include multiple copies of aux files in the basebackup. ## Summary of changes Fix the loop. Signed-off-by: Alex Chi Z <chi@neon.tech> Co-authored-by: Konstantin Knizhnik <knizhnik@neon.tech>	2024-07-08 17:22:35 +01:00
Alexander Bayandin	325294bced	CI(build-tools): Remove libpq from build image (#8206 ) ## Problem We use `build-tools` image as a base image to build other images, and it has a pretty old `libpq-dev` installed (v13; it wasn't that old until I removed system Postgres 14 from `build-tools` image in https://github.com/neondatabase/neon/pull/6540) ## Summary of changes - Remove `libpq-dev` from `build-tools` image - Set `LD_LIBRARY_PATH` for tests (for different Postgres binaries that we use, like psql and pgbench) - Set `PQ_LIB_DIR` to build Storage Controller - Set `LD_LIBRARY_PATH`/`DYLD_LIBRARY_PATH` in the Storage Controller where it calls Postgres binaries	2024-07-08 17:22:35 +01:00
John Spray	86c8ba2563	pageserver: add metric `pageserver_secondary_resident_physical_size` (#8204 ) ## Problem We lack visibility of how much local disk space is used by secondary tenant locations Close: https://github.com/neondatabase/neon/issues/8181 ## Summary of changes - Add `pageserver_secondary_resident_physical_size`, tagged by tenant - Register & de-register label sets from SecondaryTenant - Add+use wrappers in SecondaryDetail that update metrics when adding+removing layers/timelines	2024-07-08 17:22:35 +01:00
Arseny Sher	feeb2dc6fa	Merge pull request #8217 from neondatabase/rc/2024-07-01 Storage & Compute release 2024-07-01	2024-07-04 20:22:51 +03:00
Heikki Linnakangas	57f476ff5a	Restore running xacts from CLOG on replica startup (#7288 ) We have one pretty serious MVCC visibility bug with hot standby replicas. We incorrectly treat any transactions that are in progress in the primary, when the standby is started, as aborted. That can break MVCC for queries running concurrently in the standby. It can also lead to hint bits being set incorrectly, and that damage can last until the replica is restarted. The fundamental bug was that we treated any replica start as starting from a shut down server. The fix for that is straightforward: we need to set 'wasShutdown = false' in InitWalRecovery() (see changes in the postgres repo). However, that introduces a new problem: with wasShutdown = false, the standby will not open up for queries until it receives a running-xacts WAL record from the primary. That's correct, and that's how Postgres hot standby always works. But it's a problem for Neon, because: * It changes the historical behavior for existing users. Currently, the standby immediately opens up for queries, so if they now need to wait, we can break existing use cases that were working fine (assuming you don't hit the MVCC issues). * The problem is much worse for Neon than it is for standalone PostgreSQL, because in Neon, we can start a replica from an arbitrary LSN. In standalone PostgreSQL, the replica always starts WAL replay from a checkpoint record, and the primary arranges things so that there is always a running-xacts record soon after each checkpoint record. You can still hit this issue with PostgreSQL if you have a transaction with lots of subtransactions running in the primary, but it's pretty rare in practice. To mitigate that, we introduce another way to collect the running-xacts information at startup, without waiting for the running-xacts WAL record: We can scan the CLOG for XIDs that haven't been marked as committed or aborted. 
It has limitations with subtransactions too, but should mitigate the problem for most users. See https://github.com/neondatabase/neon/issues/7236. Co-authored-by: Konstantin Knizhnik <knizhnik@neon.tech>	2024-07-04 18:58:34 +03:00
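The CLOG scan mentioned in #7288 can be illustrated with a small Python sketch. It relies on PostgreSQL's commit-log encoding of two status bits per transaction, four transactions per byte (0 = in progress, 1 = committed, 2 = aborted, 3 = sub-committed). Everything else is an assumption for illustration: the function names are hypothetical, the CLOG is modelled as one flat byte buffer, and XID wraparound, page and segment boundaries, and subtransaction handling are ignored.

```python
# Two-bit transaction status codes as used by the PostgreSQL commit log.
IN_PROGRESS, COMMITTED, ABORTED, SUB_COMMITTED = 0, 1, 2, 3
XACTS_PER_BYTE = 4
BITS_PER_XACT = 2


def xid_status(clog, xid):
    """Return the 2-bit status of `xid` from a flat CLOG byte buffer."""
    byte = clog[xid // XACTS_PER_BYTE]
    shift = (xid % XACTS_PER_BYTE) * BITS_PER_XACT
    return (byte >> shift) & 0b11


def set_status(clog, xid, status):
    """Helper for building example data: write the status of `xid`."""
    i = xid // XACTS_PER_BYTE
    shift = (xid % XACTS_PER_BYTE) * BITS_PER_XACT
    clog[i] = (clog[i] & ~(0b11 << shift) & 0xFF) | (status << shift)


def unresolved_xids(clog, oldest_xid, next_xid):
    """XIDs in [oldest_xid, next_xid) not yet marked committed or aborted."""
    return [
        xid
        for xid in range(oldest_xid, next_xid)
        if xid_status(clog, xid) not in (COMMITTED, ABORTED)
    ]


# Example data: six transactions, two of them still unresolved.
clog = bytearray(2)
for xid, status in enumerate(
    [COMMITTED, ABORTED, IN_PROGRESS, COMMITTED, SUB_COMMITTED, COMMITTED]
):
    set_status(clog, xid, status)
running = unresolved_xids(clog, 0, 6)
```

In this toy model the replica would treat the XIDs in `running` as still in progress at startup instead of assuming they aborted.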
Heikki Linnakangas	7ee2bebdb7	tests: Make neon_xlogflush() flush all WAL, if you omit the LSN arg This makes it much more convenient to use in the common case that you want to flush all the WAL. (Passing pg_current_wal_insert_lsn() as the argument doesn't work for the same reasons as explained in the comments: we need to back off to the beginning of a page if the previous record ended at page boundary.) I plan to use this to fix the issue that Arseny Sher called out at https://github.com/neondatabase/neon/pull/7288#discussion_r1660063852	2024-07-04 18:58:28 +03:00
Heikki Linnakangas	be598f1bf4	tests: remove a leftover 'running' flag (#8216 ) The 'running' boolean was replaced with a semaphore in commit `f0e2bb79b2`, but this initialization was missed. Remove it so that if a test tries to access it, you get an error rather than always claiming that the endpoint is not running. Spotted by Arseny at https://github.com/neondatabase/neon/pull/7288#discussion_r1660068657	2024-07-04 18:58:20 +03:00
John Spray	939b5954a5	Merge pull request #8138 from neondatabase/rc/2024-06-24 Storage & Compute release 2024-06-24	2024-06-24 10:57:45 +01:00
Arpad Müller	371020fe6a	Merge pull request #8069 from neondatabase/rc/2024-06-17 Release 2024-06-17	2024-06-17 15:29:35 +02:00
Christian Schwarz	f45818abed	Merge pull request #7999 from neondatabase/rc/2024-06-10 Release 2024-06-10	2024-06-10 19:08:03 +02:00
Christian Schwarz	0384267d58	Revert "Include openssl and ICU statically linked" (#8003 ) Reverts neondatabase/neon#7956 Rationale: compute incompatibilities Slack thread: https://neondb.slack.com/archives/C033RQ5SPDH/p1718011276665839?thread_ts=1718008160.431869&cid=C033RQ5SPDH Relevant quotes from @hlinnaka > If we go through with the current release candidate, but the compute is pinned, people who create new projects will get that warning, which is silly. To them, it looks like the ICU version was downgraded, because initdb was run with newer version. > We should upgrade the ICU version eventually. And when we do that, users with old projects that use ICU will start to see that warning. I think that's acceptable, as long as we do homework, notify users, and communicate that properly. > When do that, we should to try to upgrade the storage and compute versions at roughly the same time.	2024-06-10 14:35:50 +02:00
Arseny Sher	62b3bd968a	Merge pull request #7936 from neondatabase/rc/2024-06-03 Release 2024-06-03	2024-06-04 05:41:36 +03:00
Anastasia Lubennikova	e3e3bc3542	Merge pull request #7920 from neondatabase/compute-only-may-31 Compute release 2024-05-31	2024-05-31 12:47:05 +01:00
Konstantin Knizhnik	be014a2222	Do not produce error if gin page is not restored in redo (#7876 ) ## Problem See https://github.com/neondatabase/cloud/issues/10845 ## Summary of changes Do not report error if GIN page is not restored --------- Co-authored-by: Konstantin Knizhnik <knizhnik@neon.tech>	2024-05-31 09:21:40 +01:00
Joonas Koivunen	2e1fe71cc0	Merge pull request #7888 from neondatabase/rc/2024-05-27 Release 2024-05-27	2024-05-27 20:30:48 +03:00
Konstantin Knizhnik	068c158ca5	Fix connect to PS on MacOS/X (#7885 ) ## Problem After [`0e4f182680`], which introduced async connect, Neon is not able to connect to the page server. ## Summary of changes Perform sync connect on MacOS/X --------- Co-authored-by: Konstantin Knizhnik <knizhnik@neon.tech>	2024-05-27 13:09:44 +00:00
Sasha Krassovsky	b16e4f689f	Merge pull request #7869 from neondatabase/rc/2024-05-23 Metrics hotfix release	2024-05-23 14:05:30 -07:00
Sasha Krassovsky	dbff725a0c	Remove apostrophe (#7868 )	2024-05-23 13:47:16 -07:00
Andreas Scherbaum	7fa4628434	Merge pull request #7837 from neondatabase/rc/2024-05-22 Compute-Only Release 2024-05-22	2024-05-22 19:34:39 +02:00
Arthur Petukhovsky	fc538a38b9	Merge pull request #7807 from neondatabase/rc/2024-05-20 Release 2024-05-20	2024-05-20 12:16:00 +01:00
Vlad Lazar	c2e7cb324f	Merge pull request #7735 from neondatabase/vlad/release-2024-05-13 Handmade Release 2024-05-13	2024-05-13 16:27:38 +01:00
Vlad Lazar	101043122e	Revert protocol version upgrade (#7727 ) ## Problem "John pointed out that the switch to protocol version 2 made test_gc_aggressive test flaky: https://github.com/neondatabase/neon/issues/7692. I tracked it down, and that is indeed an issue. Conditions for hitting the issue: The problem occurs in the primary GC horizon is set to a very low value, e.g. 0. If the primary is actively writing WAL, and GC runs in the pageserver at the same time that the primary sends a GetPage request, it's possible that the GC advances the GC horizon past the GetPage request's LSN. I'm working on a fix here: https://github.com/neondatabase/neon/pull/7708." - Heikki ## Summary of changes Use protocol version 1 as default.	2024-05-13 14:17:36 +01:00
Christian Schwarz	c4d7d59825	Merge pull request #7615 from neondatabase/rc/2024-05-06 Release 2024-05-06	2024-05-07 09:41:02 +02:00
Arpad Müller	0de1e1d664	Merge pull request #7530 from neondatabase/rc/2024-04-29 Release 2024-04-29	2024-04-29 15:09:58 +02:00
Joonas Koivunen	271598b77f	Merge pull request #7447 from neondatabase/rc/2024-04-22 Release 2024-04-22	2024-04-22 16:10:03 +03:00
John Spray	459bc479dc	pageserver: fix unlogged relations with sharding (#7454 ) ## Problem - #7451 INIT_FORKNUM blocks must be stored on shard 0 to enable including them in basebackup. This issue can be missed in simple tests because creating an unlogged table isn't sufficient -- to repro I had to create an _index_ on an unlogged table (then restart the endpoint). Closes: #7451 ## Summary of changes - Add a reproducer for the issue. - Tweak the condition for `key_is_shard0` to include anything that isn't a normal relation block _and_ any normal relation block whose forknum is INIT_FORKNUM. - To enable existing databases to recover from the issue, add a special case that omits relations if they were stored on the wrong INITFORK. This enables postgres to start and the user to drop the table and recreate it.	2024-04-22 11:55:24 +00:00
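The tweaked `key_is_shard0` condition described in the commit above can be sketched roughly as follows. This is only an illustration of the stated rule, not the pageserver's code: the `KeyKind` enum and its variants are hypothetical stand-ins for the real key type, and `INIT_FORKNUM = 3` is the fork number Postgres uses for the init fork.

```rust
/// Postgres fork numbers: MAIN = 0, FSM = 1, VISIBILITYMAP = 2, INIT = 3.
const INIT_FORKNUM: u8 = 3;

/// Hypothetical, simplified classification of a key (an assumption for this sketch).
#[derive(Debug, Clone, Copy)]
pub enum KeyKind {
    /// A normal relation block belonging to the given fork.
    RelBlock { forknum: u8 },
    /// Anything that is not a normal relation block.
    Other,
}

/// Returns true if the key must be stored on shard 0:
/// every non-relation-block key, plus relation blocks in the init fork.
pub fn key_is_shard0(kind: KeyKind) -> bool {
    match kind {
        KeyKind::RelBlock { forknum } => forknum == INIT_FORKNUM,
        KeyKind::Other => true,
    }
}
```

Under this rule, init-fork blocks of unlogged relations stay on shard 0, where basebackup can include them.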
Christian Schwarz	c213373a59	Merge pull request #7378 from neondatabase/rc/2024-04-15 Release 2024-04-15	2024-04-15 15:48:14 +03:00
Em Sharnoff	e0addc100d	Merge pull request #7356 from neondatabase/rc/2024-04-11-#7348 Release 2024-04-11 (cherry-pick #7348 only) See here for more: https://neondb.slack.com/archives/C04DGM6SMTM/p1712776981582679	2024-04-11 09:46:34 -07:00
Em Sharnoff	0519138b04	compute_ctl: Auto-set dynamic_shared_memory_type (#7348 ) Part of neondatabase/cloud#12047. The basic idea is that for our VMs, we want to enable swap and disable Linux memory overcommit. Alongside these, we should set postgres' dynamic_shared_memory_type to mmap, but we want to avoid setting it to mmap if swap is not enabled. Implementing this in the control plane would be fiddly, but it's relatively straightforward to add to compute_ctl.	2024-04-10 13:13:08 -07:00
Vlad Lazar	5da39b469c	Merge pull request #7338 from neondatabase/rc/2024-04-08 Release 2024-04-08	2024-04-08 13:10:24 +01:00
Arseny Sher	82027e22dd	Merge pull request #7284 from neondatabase/rc/2024-04-01 Release 2024-04-01	2024-04-02 18:15:28 +03:00
Alex Chi Z	c431e2f1c5	Merge pull request #7263 from neondatabase/rc/2024-03-27 Release 2024-03-27 - compute only release	2024-03-27 14:52:38 -04:00
John Spray	4e5724d9c3	Merge pull request #7248 from neondatabase/rc/2024-03-26 Release 2024-03-26	2024-03-26 15:17:00 +00:00
John Spray	0d3e499059	Merge pull request #7219 from neondatabase/rc/2024-03-25 Release 2024-03-25	2024-03-25 12:28:09 +00:00
Arpad Müller	7b860b837c	Merge pull request #7154 from neondatabase/rc/2024-03-18 Release 2024-03-18	2024-03-19 12:07:14 +01:00
Christian Schwarz	41fc96e20f	fixup(#7160 / tokio_epoll_uring_ext): double-panic caused by info! in thread-local's drop() (#7164 ) Manual testing of the changes in #7160 revealed that, if the thread-local destructor ever runs (it apparently doesn't in our test suite runs, otherwise #7160 would not have auto-merged), we can encounter an `abort()` due to a double-panic in the tracing code. This GitHub comment contains the stack trace: https://github.com/neondatabase/neon/pull/7160#issuecomment-2003778176 This PR reverts #7160 and uses an atomic counter to identify the thread-local in log messages, instead of the memory address of the thread local, which may be re-used.	2024-03-18 16:28:17 +01:00
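The idea of identifying a thread-local by an atomic counter rather than by its memory address can be sketched with the standard library alone. This is a minimal illustration, not the code from the PR: the `id` field, the `NEXT_ID` static, and `current_thread_state_id` are names assumed for this example.

```rust
use std::sync::atomic::{AtomicU64, Ordering};

/// Process-wide counter; each new thread-local takes the next value.
/// Unlike a memory address, a counter value is never handed out twice.
static NEXT_ID: AtomicU64 = AtomicU64::new(0);

/// Illustrative thread-local state carrying a stable id for log messages.
struct ThreadLocalState {
    id: u64,
}

impl ThreadLocalState {
    fn new() -> Self {
        Self {
            id: NEXT_ID.fetch_add(1, Ordering::Relaxed),
        }
    }
}

thread_local! {
    // Lazily initialized on first access from each thread.
    static STATE: ThreadLocalState = ThreadLocalState::new();
}

/// Returns the id of the calling thread's state.
pub fn current_thread_state_id() -> u64 {
    STATE.with(|s| s.id)
}
```

Repeated calls from one thread return the same id, while a newly spawned thread gets a fresh one, even if its thread-local happens to reuse the memory of a destroyed one.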
Christian Schwarz	fb2b1ce57b	fixup(#7141 / tokio_epoll_uring_ext): high frequency log message The PR #7141 added log message ``` ThreadLocalState is being dropped and id might be re-used in the future ``` which was supposed to be emitted when the thread-local is destroyed. Instead, it was emitted on _each_ call to `thread_local_system()`, i.e., on each tokio-epoll-uring operation.	2024-03-18 13:01:17 +01:00
Joonas Koivunen	464717451b	build: make procfs linux only dependency (#7156 ) the dependency refuses to build on macos so builds on `main` are broken right now, including the `release` PR.	2024-03-18 09:32:49 +00:00
Joonas Koivunen	c6ed86d3d0	Merge pull request #7081 from neondatabase/rc/2024-03-11 Release 2024-03-11	2024-03-11 14:41:39 +02:00
Roman Zaynetdinov	f0a9017008	Export db size, deadlocks and changed row metrics (#7050 ) ## Problem We want to report metrics for the oldest user database.	2024-03-11 11:55:06 +00:00
Christian Schwarz	bb7949ba00	Merge pull request #6993 from neondatabase/rc/2024-03-04 Release 2024-03-04	2024-03-04 13:08:44 +01:00
Arthur Petukhovsky	1df0f69664	Merge pull request #6973 from neondatabase/rc/2024-02-29-manual Release 2024-02-29	2024-02-29 17:26:33 +00:00
Vlad Lazar	970066a914	libs: fix expired token in auth decode test (#6963 ) The test token expired earlier today (1709200879). I regenerated the token, but without an expiration date this time.	2024-02-29 17:23:25 +00:00
Arthur Petukhovsky	1ebd3897c0	Merge pull request #6956 from neondatabase/rc/2024-02-28 Release 2024-02-28	2024-02-29 16:39:52 +00:00
Arthur Petukhovsky	6460beffcd	Merge pull request #6901 from neondatabase/rc/2024-02-26 Release 2024-02-26	2024-02-26 17:08:19 +00:00
John Spray	6f7f8958db	pageserver: only write out legacy tenant config if no generation (#6891 ) ## Problem Previously we always wrote out both legacy and modern tenant config files. The legacy write enabled rollbacks, but we are long past the point where that is needed. We still need the legacy format for situations where someone is running tenants without generations (that will be yanked as well eventually), but we can avoid writing it out at all if we do have a generation number set. We implicitly also avoid writing the legacy config if our mode is Secondary (secondary mode is newer than generations). ## Summary of changes - Make writing legacy tenant config conditional on there being no generation number set.	2024-02-26 10:25:25 +00:00
Christian Schwarz	936a00e077	pageserver: remove two obsolete/unused per-timeline metrics (#6893 ) over-compensating the addition of a new per-timeline metric in https://github.com/neondatabase/neon/pull/6834 part of https://github.com/neondatabase/neon/issues/6737	2024-02-26 09:16:24 +00:00
Nikita Kalyanov	96a4e8de66	Add /terminate API (#6745 ) (#6853 ) this is to speed up suspends, see https://github.com/neondatabase/cloud/issues/10284 Cherry-pick to release branch to build new compute images	2024-02-22 11:51:19 +02:00
Arseny Sher	01180666b0	Merge pull request #6803 from neondatabase/releases/2024-02-19 Release 2024-02-19	2024-02-19 16:38:35 +04:00
Conrad Ludgate	6c94269c32	Merge pull request #6758 from neondatabase/release-proxy-2024-02-14 2024-02-14 Proxy Release	2024-02-15 09:45:08 +00:00
Anna Khanova	edc691647d	Proxy: remove fail fast logic to connect to compute (#6759 ) ## Problem Flaky tests ## Summary of changes Remove failfast logic	2024-02-15 07:42:12 +00:00
Conrad Ludgate	855d7b4781	hold cancel session (#6750 ) ## Problem In a recent refactor, we accidentally dropped the cancel session early ## Summary of changes Hold the cancel session during proxy passthrough	2024-02-14 14:57:22 +00:00
Anna Khanova	c49c9707ce	Proxy: send cancel notifications to all instances (#6719 ) ## Problem If a cancel request ends up on the wrong proxy instance, it has no effect. ## Summary of changes Send redis notifications to all proxy pods about the cancel request. Related issue: https://github.com/neondatabase/neon/issues/5839, https://github.com/neondatabase/cloud/issues/10262	2024-02-14 14:57:22 +00:00
Anna Khanova	2227540a0d	Proxy refactor auth+connect (#6708 ) ## Problem Not really a problem, just refactoring. ## Summary of changes Separate authenticate from wake compute. Do not call wake compute second time if we managed to connect to postgres or if we got it not from cache.	2024-02-14 14:57:22 +00:00
Conrad Ludgate	f1347f2417	proxy: add more http logging (#6726 ) ## Problem hard to see where time is taken during HTTP flow. ## Summary of changes add a lot more for query state. add a conn_id field to the sql-over-http span	2024-02-14 14:57:22 +00:00
Conrad Ludgate	30b295b017	proxy: some more parquet data (#6711 ) ## Summary of changes add auth_method and database to the parquet logs	2024-02-14 14:57:22 +00:00
Anna Khanova	1cef395266	Proxy: copy bidirectional fork (#6720 ) ## Problem `tokio::io::copy_bidirectional` doesn't close the connection once one of the sides closes it. It's not really suitable for the postgres protocol. ## Summary of changes Fork `copy_bidirectional` and initiate a shutdown for both connections. --------- Co-authored-by: Conrad Ludgate <conradludgate@gmail.com>	2024-02-14 14:57:22 +00:00
John Spray	78d160f76d	Merge pull request #6721 from neondatabase/releases/2024-02-12 Release 2024-02-12	2024-02-12 09:35:30 +00:00
Vlad Lazar	b9238059d6	Merge pull request #6617 from neondatabase/releases/2024-02-05 Release 2024-02-05	2024-02-05 12:50:38 +00:00
Arpad Müller	d0cb4b88c8	Don't preserve temp files on creation errors of delta layers (#6612 ) There is currently no cleanup done after a delta layer creation error, so delta layers can accumulate. The problem gets worse as the operation gets retried and delta layers accumulate on the disk. Therefore, delete them from disk (if something has been written to disk).	2024-02-05 09:58:18 +00:00
John Spray	1ec3e39d4e	Merge pull request #6504 from neondatabase/releases/2024-01-29 Release 2024-01-29	2024-01-29 10:05:01 +00:00
John Spray	a1a74eef2c	Merge pull request #6420 from neondatabase/releases/2024-01-22 Release 2024-01-22	2024-01-22 17:24:11 +00:00
John Spray	90e689adda	pageserver: mark tenant broken when cancelling attach (#6430 ) ## Problem When a tenant is in Attaching state, and waiting for the `concurrent_tenant_warmup` semaphore, it also listens for the tenant cancellation token. When that token fires, Tenant::attach drops out. Meanwhile, Tenant::set_stopping waits forever for the tenant to exit Attaching state. Fixes: https://github.com/neondatabase/neon/issues/6423 ## Summary of changes - In the absence of a valid state for the tenant, it is set to Broken in this path. A more elegant solution will require more refactoring, beyond this minimal fix. (cherry picked from commit `93572a3e99`)	2024-01-22 16:20:57 +00:00
Christian Schwarz	f0b2d4b053	fixup(#6037 ): actually fix the issue, #6388 failed to do so (#6429 ) Before this patch, the select! still returned immediately if `futs` was empty. Must have tested a stale build in my manual testing of #6388. (cherry picked from commit `15c0df4de7`)	2024-01-22 15:23:12 +00:00
Anna Khanova	299d9474c9	Proxy: fix gc (#6426 ) ## Problem Gc currently doesn't work properly. ## Summary of changes Change statement on running gc.	2024-01-22 14:39:09 +01:00
Conrad Ludgate	7234208b36	bump shlex (#6421 ) ## Problem https://rustsec.org/advisories/RUSTSEC-2024-0006 ## Summary of changes `cargo update -p shlex` (cherry picked from commit `5559b16953`)	2024-01-22 09:49:33 +00:00
Christian Schwarz	93450f11f5	Merge pull request #6354 from neondatabase/releases/2024-01-15 Release 2024-01-15 NB: the previous release PR https://github.com/neondatabase/neon/pull/6286 was accidentally merged by merge-by-squash instead of merge-by-merge-commit. See https://github.com/neondatabase/neon/pull/6354#issuecomment-1891706321 for more context.	2024-01-15 14:30:25 +01:00
Christian Schwarz	2f0f9edf33	Merge remote-tracking branch 'origin/release' into releases/2024-01-15	2024-01-15 09:36:42 +00:00
Christian Schwarz	d424f2b7c8	empty commit so we can produce a merge commit	2024-01-15 09:36:22 +00:00
Christian Schwarz	21315e80bc	Merge branch 'releases/2024-01-08--not-squashed' into releases/2024-01-15	2024-01-15 09:31:07 +00:00
vipvap	483b66d383	Merge branch 'release' into releases/2024-01-08 (not-squashed merge of #6286 ) Release PR https://github.com/neondatabase/neon/pull/6286 got accidentally merged-by-squash instead of merge-by-merge-commit. This commit shows how things would look like if 6286 had been merged-by-squash. ``` git reset --hard `9f1327772` git merge --no-ff `5c0264b591` ``` Co-authored-by: Christian Schwarz <christian@neon.tech>	2024-01-15 09:28:08 +00:00
vipvap	aa72a22661	Release 2024-01-08 (#6286 ) Release 2024-01-08	2024-01-08 09:26:27 +00:00
Shany Pozin	5c0264b591	Merge branch 'release' into releases/2024-01-08	2024-01-08 09:34:06 +02:00
Arseny Sher	9f13277729	Merge pull request #6242 from neondatabase/releases/2024-01-02 Release 2024-01-02	2024-01-02 12:04:43 +04:00
Arseny Sher	54aa319805	Don't split WAL record across two XLogData's when sending from safekeepers. As protocol demands. Not following this makes standby complain about corrupted WAL in various ways. https://neondb.slack.com/archives/C05L7D1JAUS/p1703774799114719 closes https://github.com/neondatabase/cloud/issues/9057	2024-01-02 10:54:00 +04:00
Arseny Sher	4a227484bf	Add large insertion and slow WAL sending to test_hot_standby. To exercise MAX_SEND_SIZE sending from safekeeper; we've had a bug with WAL records torn across several XLogData messages. Add failpoint to safekeeper to slow down sending. Also check for corrupted WAL complaints in standby log. Make the test a bit simpler in passing, e.g. we don't need explicit commits as autocommit is enabled by default. https://neondb.slack.com/archives/C05L7D1JAUS/p1703774799114719 https://github.com/neondatabase/cloud/issues/9057	2024-01-02 10:54:00 +04:00
Arseny Sher	2f83f85291	Add failpoint support to safekeeper. Just a copy paste from pageserver.	2024-01-02 10:54:00 +04:00
Arseny Sher	d6cfcb0d93	Move failpoint support code to utils. To enable them in safekeeper as well.	2024-01-02 10:54:00 +04:00
Arseny Sher	392843ad2a	Fix safekeeper START_REPLICATION (term=n). It was giving WAL only up to commit_lsn instead of flush_lsn, so recovery of uncommitted WAL since `cdb08f03` hung. Add test for this.	2024-01-02 10:54:00 +04:00
Arseny Sher	bd4dae8f4a	compute_ctl: kill postgres and sync-safekeepers on exit. Otherwise they are left orphaned when compute_ctl is terminated with a signal. It was invisible most of the time because normally neon_local or k8s kills postgres directly and then compute_ctl finishes gracefully. However, in some tests compute_ctl gets stuck waiting for sync-safekeepers which intentionally never ends because safekeepers are offline, and we want to stop compute_ctl without leaving orphans behind. This is quite a rough approach which doesn't wait for children termination. A better way would be to convert compute_ctl to async which would make waiting easy.	2024-01-02 10:54:00 +04:00
Shany Pozin	b05fe53cfd	Merge pull request #6240 from neondatabase/releases/2024-01-01 Release 2024-01-01	2024-01-01 11:07:30 +02:00
Christian Schwarz	c13a2f0df1	Merge pull request #6192 from neondatabase/releases/2023-12-19 Release 2023-12-19 We need to do a config change that requires restarting the pageservers. Slip in two metrics-related commits that didn't make this week's regular release.	2023-12-19 14:52:47 +01:00
Christian Schwarz	39be366fc5	higher resolution histograms for getpage@lsn (#6177 ) part of https://github.com/neondatabase/cloud/issues/7811	2023-12-19 13:46:59 +00:00
Christian Schwarz	6eda0a3158	[PRE-MERGE] fix metric `pageserver_initial_logical_size_start_calculation` (This is a pre-merge cherry-pick of https://github.com/neondatabase/neon/pull/6191) It wasn't being incremented. Fixup of commit `1c88824ed0` Author: Christian Schwarz <christian@neon.tech> Date: Fri Dec 1 12:52:59 2023 +0100 initial logical size calculation: add a bunch of metrics (#5995)	2023-12-19 13:46:55 +00:00
Shany Pozin	306c7a1813	Merge pull request #6173 from neondatabase/sasha_release_bypassrls_replication Grant BYPASSRLS and REPLICATION explicitly to neon_superuser roles	2023-12-18 22:16:36 +02:00
Sasha Krassovsky	80be423a58	Grant BYPASSRLS and REPLICATION explicitly to neon_superuser roles	2023-12-18 10:22:36 -08:00
Shany Pozin	5dcfef82f2	Merge pull request #6163 from neondatabase/releases/2023-12-18 Release 2023-12-18-2	2023-12-18 15:34:17 +02:00
Christian Schwarz	e67b8f69c0	[PRE-MERGE] pageserver: Reduce tracing overhead in timeline::get #6115 Pre-merge `git merge --squash` of https://github.com/neondatabase/neon/pull/6115 Lowering the tracing level in get_value_reconstruct_data and get_or_maybe_download from info to debug reduces the overhead of span creation in non-debug environments.	2023-12-18 13:39:48 +01:00
Shany Pozin	e546872ab4	Merge pull request #6158 from neondatabase/releases/2023-12-18 Release 2023-12-18	2023-12-18 14:24:34 +02:00
John Spray	322ea1cf7c	pageserver: on-demand activation cleanups (#6157 ) ## Problem #6112 added some logs and metrics: clean these up a bit: - Avoid counting startup completions for tenants launched after startup - exclude no-op cases from timing histograms - remove a rogue log message	2023-12-18 11:14:19 +00:00
Vadim Kharitonov	3633742de9	Merge pull request #6121 from neondatabase/releases/2023-12-13 Release 2023-12-13	2023-12-13 12:39:43 +01:00
Joonas Koivunen	079d3a37ba	Merge remote-tracking branch 'origin/release' into releases/2023-12-13 this handles the hotfix introduced conflict.	2023-12-13 10:07:19 +00:00
Vadim Kharitonov	a46e77b476	Merge pull request #6090 from neondatabase/releases/2023-12-11 Release 2023-12-11	2023-12-12 12:10:35 +01:00
Tristan Partin	a92702b01e	Add submodule paths as safe directories as a precaution The check-codestyle-rust-arm job requires this for some reason, so let's just add them everywhere we do this workaround.	2023-12-11 22:00:35 +00:00
Tristan Partin	8ff3253f20	Fix git ownership issue in check-codestyle-rust-arm We have this workaround for other jobs. Looks like this one was forgotten about.	2023-12-11 22:00:35 +00:00
Joonas Koivunen	04b82c92a7	fix: accidental return Ok (#6106 ) An error indicating request cancellation OR timeline shutdown was deemed a reason to exit the background worker that calculated synthetic size. Fix it to only be considered for avoiding logging of such errors. This conflicted on tenant_shard_id having already replaced tenant_id on `main`.	2023-12-11 21:41:36 +00:00
Vadim Kharitonov	e5bf423e68	Merge branch 'release' into releases/2023-12-11	2023-12-11 11:55:48 +01:00
Vadim Kharitonov	60af392e45	Merge pull request #6057 from neondatabase/vk/patch_timescale_for_production Revert timescaledb for pg14 and pg15 (#6056)	2023-12-06 16:21:16 +01:00
Vadim Kharitonov	661fc41e71	Revert timescaledb for pg14 and pg15 (#6056 ) ``` could not start the compute node: compute is in state "failed": db error: ERROR: could not access file "$libdir/timescaledb-2.10.1": No such file or directory Caused by: ERROR: could not access file "$libdir/timescaledb-2.10.1": No such file or directory ```	2023-12-06 16:14:07 +01:00
Shany Pozin	702c488f32	Merge pull request #6022 from neondatabase/releases/2023-12-04 Release 2023-12-04	2023-12-05 17:03:28 +02:00
Sasha Krassovsky	45c5122754	Remove trusted from wal2json	2023-12-04 12:36:19 -08:00
Shany Pozin	558394f710	fix merge	2023-12-04 11:41:27 +02:00
Shany Pozin	73b0898608	Merge branch 'release' into releases/2023-12-04	2023-12-04 11:36:26 +02:00
Joonas Koivunen	e65be4c2dc	Merge pull request #6013 from neondatabase/releases/2023-12-01-hotfix fix: use create_new instead of create for mutex file	2023-12-01 15:35:56 +02:00
Joonas Koivunen	40087b8164	fix: use create_new instead of create for mutex file	2023-12-01 12:54:49 +00:00
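The difference between `create` and `create_new` that this fix relies on can be shown with the standard library's `OpenOptions`. This is a sketch under assumptions: the helper name `try_create_lock_file` and the idea of returning a boolean are illustrative only and say nothing about how the actual mutex file is handled in the repository.

```rust
use std::fs::OpenOptions;
use std::io;
use std::path::Path;

/// Try to claim `path` by creating it exclusively.
///
/// With `create(true)` the open succeeds even if the file already exists,
/// so two processes could both believe they own it. With `create_new(true)`
/// the open fails with `AlreadyExists` in that case.
///
/// Returns Ok(true) if this call created the file, Ok(false) if it existed.
pub fn try_create_lock_file(path: &Path) -> io::Result<bool> {
    match OpenOptions::new().write(true).create_new(true).open(path) {
        Ok(_file) => Ok(true),
        Err(e) if e.kind() == io::ErrorKind::AlreadyExists => Ok(false),
        Err(e) => Err(e),
    }
}
```

Calling the helper twice on the same path yields `Ok(true)` and then `Ok(false)`, which is the mutual-exclusion behavior a plain `create` cannot provide.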
Shany Pozin	c762b59483	Merge pull request #5986 from neondatabase/Release-11-30-hotfix Notify safekeeper readiness with systemd.	2023-11-30 10:01:05 +02:00
Arseny Sher	5d71601ca9	Notify safekeeper readiness with systemd. To avoid downtime during deploy, as in busy regions initial load can currently take ~30s.	2023-11-30 08:23:31 +03:00
Shany Pozin	a113c3e433	Merge pull request #5945 from neondatabase/release-2023-11-28-hotfix Release 2023 11 28 hotfix	2023-11-28 08:14:59 +02:00
Anastasia Lubennikova	e81fc598f4	Update neon extension relocatable for existing installations (#5943 )	2023-11-28 00:12:39 +00:00
Anastasia Lubennikova	48b845fa76	Make neon extension relocatable to allow SET SCHEMA (#5942 )	2023-11-28 00:12:32 +00:00
Shany Pozin	27096858dc	Merge pull request #5922 from neondatabase/releases/2023-11-27 Release 2023-11-27	2023-11-27 09:58:51 +02:00
Shany Pozin	4430d0ae7d	Merge pull request #5876 from neondatabase/releases/2023-11-17 Release 2023-11-17	2023-11-20 09:11:58 +02:00
Joonas Koivunen	6e183aa0de	Merge branch 'main' into releases/2023-11-17	2023-11-19 15:25:47 +00:00
Vadim Kharitonov	fd6d0b7635	Merge branch 'release' into releases/2023-11-17	2023-11-17 10:51:45 +01:00
Vadim Kharitonov	3710c32aae	Merge pull request #5778 from neondatabase/releases/2023-11-03 Release 2023-11-03	2023-11-03 16:06:58 +01:00
Vadim Kharitonov	be83bee49d	Merge branch 'release' into releases/2023-11-03	2023-11-03 11:18:15 +01:00
Alexander Bayandin	cf28e5922a	Merge pull request #5685 from neondatabase/releases/2023-10-26 Release 2023-10-26	2023-10-27 10:42:12 +01:00
Em Sharnoff	7d384d6953	Bump vm-builder v0.18.2 -> v0.18.4 (#5666 ) Only applicable change was neondatabase/autoscaling#584, setting pgbouncer auth_dbname=postgres in order to fix superuser connections from preventing dropping databases.	2023-10-26 20:15:45 +01:00
Em Sharnoff	4b3b37b912	Bump vm-builder v0.18.1 -> v0.18.2 (#5646 ) Only applicable change was neondatabase/autoscaling#571, removing the postgres_exporter flags `--auto-discover-databases` and `--exclude-databases=...`	2023-10-26 20:15:29 +01:00
Shany Pozin	1d8d200f4d	Merge pull request #5668 from neondatabase/sp/aux_files_cherry_pick Cherry pick: Ignore missed AUX_FILES_KEY when generating image layer (#5660)	2023-10-26 10:08:16 +03:00
Konstantin Knizhnik	0d80d6ce18	Ignore missed AUX_FILES_KEY when generating image layer (#5660 ) ## Problem Logical replication requires the new AUX_FILES_KEY, which is definitely absent in existing databases. We do not have a function to check if a key exists in our KV storage, so I have to handle the error in the `list_aux_files` method. But this key is also included in the key space range and accessed by the `create_image_layer` method. ## Summary of changes Check if AUX_FILES_KEY exists before including it in keyspace. --------- Co-authored-by: Konstantin Knizhnik <knizhnik@neon.tech> Co-authored-by: Shany Pozin <shany@neon.tech> Co-authored-by: Arpad Müller <arpad-m@users.noreply.github.com>	2023-10-26 09:30:28 +03:00
Shany Pozin	f653ee039f	Merge pull request #5638 from neondatabase/releases/2023-10-24 Release 2023-10-24	2023-10-24 12:10:52 +03:00
Em Sharnoff	e614a95853	Merge pull request #5610 from neondatabase/sharnoff/rc-2023-10-20-vm-monitor-fixes Release 2023-10-20: vm-monitor memory.high throttling fixes	2023-10-20 00:11:06 -07:00
Em Sharnoff	850db4cc13	vm-monitor: Deny not fail downscale if no memory stats yet (#5606 ) Fixes an issue we observed on staging that happens when the autoscaler-agent attempts to immediately downscale the VM after binding, which is typical for pooled computes. The issue was occurring because the autoscaler-agent was requesting downscaling before the vm-monitor had gathered sufficient cgroup memory stats to be confident in approving it. When the vm-monitor returned an internal error instead of denying downscaling, the autoscaler-agent retried the connection and immediately hit the same issue (in part because cgroup stats are collected per-connection, rather than globally).	2023-10-19 21:56:55 -07:00
Em Sharnoff	8a316b1277	vm-monitor: Log full error on message handling failure (#5604 ) There's currently an issue with the vm-monitor on staging that's not really feasible to debug because the current display impl gives no context to the errors (just says "failed to downscale"). Logging the full error should help. For communications with the autoscaler-agent, it's ok to only provide the outermost cause, because we can cross-reference with the VM logs. At some point in the future, we may want to change that.	2023-10-19 21:56:50 -07:00
Em Sharnoff	4d13bae449	vm-monitor: Switch from memory.high to polling memory.stat (#5524 ) tl;dr it's really hard to avoid throttling from memory.high, and it counts tmpfs & page cache usage, so it's also hard to make sense of. In the interest of fixing things quickly with something that should be good enough, this PR switches to instead periodically fetch memory statistics from the cgroup's memory.stat and use that data to determine if and when we should upscale. This PR fixes #5444, which has a lot more detail on the difficulties we've hit with memory.high. This PR also supersedes #5488.	2023-10-19 21:56:36 -07:00
Vadim Kharitonov	49377abd98	Merge pull request #5577 from neondatabase/releases/2023-10-17 Release 2023-10-17	2023-10-17 12:21:20 +02:00
Christian Schwarz	a6b2f4e54e	limit imitate accesses concurrency, using same semaphore as compactions (#5578 ) Before this PR, when we restarted pageserver, we'd see a rush of `$number_of_tenants` concurrent eviction tasks starting to do imitate accesses building up in the period of `[init_order allows activations, $random_access_delay + EvictionPolicyLayerAccessThreshold::period]`. We simply cannot handle that degree of concurrent IO. We already solved the problem for compactions by adding a semaphore. So, this PR shares that semaphore for use by evictions. Part of https://github.com/neondatabase/neon/issues/5479 Which is again part of https://github.com/neondatabase/neon/issues/4743 Risks / Changes In System Behavior ================================== * we don't do evictions as timely as we currently do * we log a bunch of warnings about eviction taking too long * imitate accesses and compactions compete for the same concurrency limit, so, they'll slow each other down through this shared semaphore Changes ======= - Move the `CONCURRENT_COMPACTIONS` semaphore into `tasks.rs` - Rename it to `CONCURRENT_BACKGROUND_TASKS` - Use it also for the eviction imitate accesses: - Imitate accesses are both per-TIMELINE and per-TENANT - The per-TENANT is done through coalescing all the per-TIMELINE tasks via a tokio mutex `eviction_task_tenant_state`. - We acquire the CONCURRENT_BACKGROUND_TASKS permit early, at the beginning of the eviction iteration, much before the imitate accesses start (and they may not even start at all in the given iteration, as they happen only every $threshold). - Acquiring early is sub-optimal because when the per-timeline tasks coalesce on the `eviction_task_tenant_state` mutex, they are already holding a CONCURRENT_BACKGROUND_TASKS permit. - It's also unfair because tenants with many timelines win the CONCURRENT_BACKGROUND_TASKS more often.
- I don't think there's another way though, without refactoring more of the imitate accesses logic, e.g, making it all per-tenant. - Add metrics for queue depth behind the semaphore. I found these very useful to understand what work is queued in the system. - The metrics are tagged by the new `BackgroundLoopKind`. - On a green slate, I would have used `TaskKind`, but we already had pre-existing labels whose names didn't map exactly to task kind. Also the task kind is kind of a lower-level detail, so, I think it's fine to have a separate enum to identify background work kinds. Future Work =========== I guess I could move the eviction tasks from a ticker to "sleep for $period". The benefit would be that the semaphore automatically "smears" the eviction task scheduling over time, so, we only have the rush on restart but a smeared-out rush afterward. The downside is that this perverts the meaning of "$period", as we'd actually not run the eviction at a fixed period. It also means that the "took too long" warning & metric becomes meaningless. Then again, that is already the case for the compaction and gc tasks, which do sleep for `$period` instead of using a ticker. (cherry picked from commit `9256788273`)	2023-10-17 12:16:26 +02:00
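The effect of sharing one concurrency limit across several kinds of background work can be sketched as follows. Note the swap: the commit uses a tokio semaphore in async code, whereas this example is a small blocking counting semaphore built on `std::sync::Mutex` and `Condvar` so that it runs without external crates. The `Semaphore` and `Permit` types here are written for this illustration and are not the pageserver's.

```rust
use std::sync::{Condvar, Mutex};

/// Minimal blocking counting semaphore (illustrative stand-in for an async one).
pub struct Semaphore {
    permits: Mutex<usize>,
    cv: Condvar,
}

/// RAII guard: the permit is returned to the semaphore when dropped.
pub struct Permit<'a> {
    sem: &'a Semaphore,
}

impl Semaphore {
    pub fn new(permits: usize) -> Self {
        Self {
            permits: Mutex::new(permits),
            cv: Condvar::new(),
        }
    }

    /// Block until a permit is available, then take it.
    pub fn acquire(&self) -> Permit<'_> {
        let mut available = self.permits.lock().unwrap();
        while *available == 0 {
            available = self.cv.wait(available).unwrap();
        }
        *available -= 1;
        Permit { sem: self }
    }
}

impl Drop for Permit<'_> {
    fn drop(&mut self) {
        // Give the permit back and wake one waiter, if any.
        *self.sem.permits.lock().unwrap() += 1;
        self.sem.cv.notify_one();
    }
}
```

If compaction-like and eviction-like tasks all call `acquire()` on the same instance, at most `permits` of them run at once regardless of kind, which is also why the two kinds of work slow each other down, as the commit message notes.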
Shany Pozin	face60d50b	Merge pull request #5526 from neondatabase/releases/2023-10-11 Release 2023-10-11	2023-10-11 11:16:39 +03:00
Shany Pozin	9768aa27f2	Merge pull request #5516 from neondatabase/releases/2023-10-10 Release 2023-10-10	2023-10-10 14:16:47 +03:00
Shany Pozin	96b2e575e1	Merge pull request #5445 from neondatabase/releases/2023-10-03 Release 2023-10-03	2023-10-04 13:53:37 +03:00
Alexander Bayandin	7222777784	Update checksums for pg_jsonschema & pg_graphql (#5455 ) ## Problem Folks have re-tagged releases for `pg_jsonschema` and `pg_graphql` (to increase timeouts on their CI). For us, these are no-op changes, but unfortunately, this will cause our builds to fail due to a checksum mismatch (this might not strike right away because of the build cache). - `8ba7c7be9d` - `aa7509370a` ## Summary of changes - `pg_jsonschema` update checksum - `pg_graphql` update checksum	2023-10-03 18:44:30 +01:00
Em Sharnoff	5469fdede0	Merge pull request #5422 from neondatabase/sharnoff/rc-2023-09-28-fix-restart-on-postmaster-SIGKILL Release 2023-09-28: Fix (lack of) restart on neonvm postmaster SIGKILL	2023-09-28 10:48:51 -07:00
MMeent	72aa6b9fdd	Fix neon_zeroextend's WAL logging (#5387 ) When you log more than a few blocks, you need to reserve the space in advance. We didn't do that, so we got errors. Now we do that, and shouldn't get errors.	2023-09-28 09:37:28 -07:00
Em Sharnoff	ae0634b7be	Bump vm-builder v0.17.11 -> v0.17.12 (#5407 ) Only relevant change is neondatabase/autoscaling#534 - refer there for more details.	2023-09-28 09:28:04 -07:00
Shany Pozin	70711f32fa	Merge pull request #5375 from neondatabase/releases/2023-09-26 Release 2023-09-26	2023-09-26 15:19:45 +03:00
Vadim Kharitonov	52a88af0aa	Merge pull request #5336 from neondatabase/releases/2023-09-19 Release 2023-09-19	2023-09-19 11:16:43 +02:00
Alexander Bayandin	b7a43bf817	Merge branch 'release' into releases/2023-09-19	2023-09-19 09:07:20 +01:00
Alexander Bayandin	dce91b33a4	Merge pull request #5318 from neondatabase/releases/2023-09-15-1 Postgres 14/15: Use previous extensions versions	2023-09-15 16:30:44 +01:00
Alexander Bayandin	23ee4f3050	Revert plv8 only	2023-09-15 15:45:23 +01:00
Alexander Bayandin	46857e8282	Postgres 14/15: Use previous extensions versions	2023-09-15 15:27:00 +01:00
Alexander Bayandin	368ab0ce54	Merge pull request #5313 from neondatabase/releases/2023-09-15 Release 2023-09-15	2023-09-15 10:39:56 +01:00
Konstantin Knizhnik	a5987eebfd	References to old and new blocks were mixed in xlog_heap_update handler (#5312 ) ## Problem See https://neondb.slack.com/archives/C05L7D1JAUS/p1694614585955029 https://www.notion.so/neondatabase/Duplicate-key-issue-651627ce843c45188fbdcb2d30fd2178 ## Summary of changes Swap old/new block references ## Checklist before requesting a review - [ ] I have performed a self-review of my code. - [ ] If it is a core feature, I have added thorough tests. - [ ] Do we need to implement analytics? if so did you add the relevant metrics to the dashboard? - [ ] If this PR requires public announcement, mark it with /release-notes label and add several sentences in this section. ## Checklist before merging - [ ] Do not forget to reformat commit message to not include the above checklist --------- Co-authored-by: Konstantin Knizhnik <knizhnik@neon.tech> Co-authored-by: Heikki Linnakangas <heikki@neon.tech>	2023-09-15 10:11:41 +01:00
Alexander Bayandin	6686ede30f	Update checksum for pg_hint_plan (#5309 ) ## Problem The checksum for `pg_hint_plan` doesn't match: ``` sha256sum: WARNING: 1 computed checksum did NOT match ``` Ref https://github.com/neondatabase/neon/actions/runs/6185715461/job/16793609251?pr=5307 It seems that the release was retagged yesterday: https://github.com/ossc-db/pg_hint_plan/releases/tag/REL16_1_6_0 I don't see any malicious changes from 15_1.5.1: https://github.com/ossc-db/pg_hint_plan/compare/REL15_1_5_1...REL16_1_6_0, so it should be ok to update. ## Summary of changes - Update checksum for `pg_hint_plan` 16_1.6.0	2023-09-15 09:54:42 +01:00
Em Sharnoff	373c7057cc	vm-monitor: Fix cgroup throttling (#5303 ) I believe this (not actual IO problems) is the cause of the "disk speed issue" that we've had for VMs recently. See e.g.: 1. https://neondb.slack.com/archives/C03H1K0PGKH/p1694287808046179?thread_ts=1694271790.580099&cid=C03H1K0PGKH 2. https://neondb.slack.com/archives/C03H1K0PGKH/p1694511932560659 The vm-informant (and now, the vm-monitor, its replacement) is supposed to gradually increase the `neon-postgres` cgroup's memory.high value, because otherwise the kernel will throttle all the processes in the cgroup. This PR fixes a bug with the vm-monitor's implementation of this behavior. --- Other references, for the vm-informant's implementation: - Original issue: neondatabase/autoscaling#44 - Original PR: neondatabase/autoscaling#223	2023-09-15 09:54:42 +01:00
Shany Pozin	7d6ec16166	Merge pull request #5296 from neondatabase/releases/2023-09-13 Release 2023-09-13	2023-09-13 13:49:14 +03:00
Shany Pozin	0e6fdc8a58	Merge pull request #5283 from neondatabase/releases/2023-09-12 Release 2023-09-12	2023-09-12 14:56:47 +03:00
Christian Schwarz	521438a5c6	fix deadlock around TENANTS (#5285 ) The sequence that can lead to a deadlock: 1. DELETE request gets all the way to `tenant.shutdown(progress, false).await.is_err() ` , while holding TENANTS.read() 2. POST request for tenant creation comes in, calls `tenant_map_insert`, it does `let mut guard = TENANTS.write().await;` 3. Something that `tenant.shutdown()` needs to wait for needs a `TENANTS.read().await`. The only case identified in exhaustive manual scanning of the code base is this one: Imitate size access does `get_tenant().await`, which does `TENANTS.read().await` under the hood. In the above case (1) waits for (3), (3)'s read-lock request is queued behind (2)'s write-lock, and (2) waits for (1). Deadlock. I made a reproducer/proof-that-above-hypothesis-holds in https://github.com/neondatabase/neon/pull/5281 , but, it's not ready for merge yet and we want the fix _now_. fixes https://github.com/neondatabase/neon/issues/5284	2023-09-12 14:13:13 +03:00
Vadim Kharitonov	07d7874bc8	Merge pull request #5202 from neondatabase/releases/2023-09-05 Release 2023-09-05	2023-09-05 12:16:06 +02:00
Anastasia Lubennikova	1804111a02	Merge pull request #5161 from neondatabase/rc-2023-08-31 Release 2023-08-31	2023-08-31 16:53:17 +03:00
Arthur Petukhovsky	cd0178efed	Merge pull request #5150 from neondatabase/release-sk-fix-active-timeline Release 2023-08-30	2023-08-30 11:43:39 +02:00
Shany Pozin	333574be57	Merge pull request #5133 from neondatabase/releases/2023-08-29 Release 2023-08-29	2023-08-29 14:02:58 +03:00
Alexander Bayandin	79a799a143	Merge branch 'release' into releases/2023-08-29	2023-08-29 11:17:57 +01:00
Conrad Ludgate	9da06af6c9	Merge pull request #5113 from neondatabase/release-http-connection-fix Release 2023-08-25	2023-08-25 17:21:35 +01:00
Conrad Ludgate	ce1753d036	proxy: dont return connection pending (#5107 ) ## Problem We were returning Pending when a connection had a notice/notification (introduced recently in #5020). When returning pending, the runtime assumes you will call `cx.waker().wake()` in order to continue processing. We weren't doing that, so the connection task would get stuck ## Summary of changes Don't return pending. Loop instead	2023-08-25 16:42:30 +01:00
Alek Westover	67db8432b4	Fix cargo deny errors (#5068 ) ## Problem cargo deny lint broken Links to the CVEs: [rustsec.org/advisories/RUSTSEC-2023-0052](https://rustsec.org/advisories/RUSTSEC-2023-0052) [rustsec.org/advisories/RUSTSEC-2023-0053](https://rustsec.org/advisories/RUSTSEC-2023-0053) One is fixed, the other one isn't so we allow it (for now), to unbreak CI. Then later we'll try to get rid of webpki in favour of the rustls fork. ## Summary of changes ``` +ignore = ["RUSTSEC-2023-0052"] ```	2023-08-25 16:42:30 +01:00
Vadim Kharitonov	4e2e44e524	Enable neon-pool-opt-in (#5062 )	2023-08-22 09:06:14 +01:00
Vadim Kharitonov	ed786104f3	Merge pull request #5060 from neondatabase/releases/2023-08-22 Release 2023-08-22	2023-08-22 09:41:02 +02:00
Stas Kelvich	84b74f2bd1	Merge pull request #4997 from neondatabase/sk/proxy-release-23-07-15 Fix lint	2023-08-15 18:54:20 +03:00
Arthur Petukhovsky	fec2ad6283	Fix lint	2023-08-15 18:49:02 +03:00
Stas Kelvich	98eebd4682	Merge pull request #4996 from neondatabase/sk/proxy_release Disable neon-pool-opt-in	2023-08-15 18:37:50 +03:00
Arthur Petukhovsky	2f74287c9b	Disable neon-pool-opt-in	2023-08-15 18:34:17 +03:00
Shany Pozin	aee1bf95e3	Merge pull request #4990 from neondatabase/releases/2023-08-15 Release 2023-08-15	2023-08-15 15:34:38 +03:00
Shany Pozin	b9de9d75ff	Merge branch 'release' into releases/2023-08-15	2023-08-15 14:35:00 +03:00
Stas Kelvich	7943b709e6	Merge pull request #4940 from neondatabase/sk/release-23-05-25-proxy-fixup Release: proxy retry fixup	2023-08-09 13:53:19 +03:00
Conrad Ludgate	d7d066d493	proxy: delay auth on retry (#4929 ) ## Problem When an endpoint is shutting down, it can take a few seconds. Currently when starting a new compute, this causes an "endpoint is in transition" error. We need to add delays before retrying to ensure that we allow time for the endpoint to shut down properly. ## Summary of changes Adds a delay before retrying in auth. connect_to_compute already has this delay	2023-08-09 12:54:24 +03:00
Felix Prasanna	e78ac22107	release fix: revert vm builder bump from 0.13.1 -> 0.15.0-alpha1 (#4932 ) This reverts commit `682dfb3a31`. hotfix for a CLI arg issue in the monitor	2023-08-08 21:08:46 +03:00
Vadim Kharitonov	76a8f2bb44	Merge pull request #4923 from neondatabase/releases/2023-08-08 Release 2023-08-08	2023-08-08 11:44:38 +02:00
Vadim Kharitonov	8d59a8581f	Merge branch 'release' into releases/2023-08-08	2023-08-08 10:54:34 +02:00
Vadim Kharitonov	b1ddd01289	Define NEON_SMGR to make it possible for extensions to use Neon SMG API (#4889 ) Co-authored-by: Konstantin Knizhnik <knizhnik@garret.ru> Co-authored-by: Konstantin Knizhnik <knizhnik@neon.tech>	2023-08-03 16:28:31 +03:00
Alexander Bayandin	6eae4fc9aa	Release 2023-08-02: update pg_embedding (#4877 ) Cherry-picking `ca4d71a954` from `main` into the `release` Co-authored-by: Vadim Kharitonov <vadim2404@users.noreply.github.com>	2023-08-03 08:48:09 +02:00
Christian Schwarz	765455bca2	Merge pull request #4861 from neondatabase/releases/2023-08-01--2-fix-pipeline ci: fix upload-postgres-extensions-to-s3 job	2023-08-01 13:22:07 +02:00
Christian Schwarz	4204960942	ci: fix upload-postgres-extensions-to-s3 job Commit `5f8fd640bf` (Author: Alek Westover <alek.westover@gmail.com>, Date: Wed Jul 26 08:24:03 2023 -0400, "Upload Test Remote Extensions (#4792)") switched to using the release tag instead of `latest`, but the `promote-images` job only uploads `latest` to the prod ECR. The switch to using the release tag was good in principle, but this commit reverts that part to make the release pipeline work. Note that a proper fix should abandon use of the `:latest` tag altogether: currently, if a `main` pipeline runs concurrently with a `release` pipeline, the `release` pipeline may end up using the `main` pipeline's images.	2023-08-01 12:01:45 +02:00
Christian Schwarz	67345d66ea	Merge pull request #4858 from neondatabase/releases/2023-08-01 Release 2023-08-01	2023-08-01 10:44:01 +02:00
Shany Pozin	2266ee5971	Merge pull request #4803 from neondatabase/releases/2023-07-25 Release 2023-07-25	2023-07-25 14:21:07 +03:00
Shany Pozin	b58445d855	Merge pull request #4746 from neondatabase/releases/2023-07-18 Release 2023-07-18	2023-07-18 14:45:39 +03:00
Conrad Ludgate	36050e7f3d	Merge branch 'release' into releases/2023-07-18	2023-07-18 12:00:09 +01:00
Alexander Bayandin	33360ed96d	Merge pull request #4705 from neondatabase/release-2023-07-12 Release 2023-07-12 (only proxy)	2023-07-12 19:44:36 +01:00
Conrad Ludgate	39a28d1108	proxy wake_compute loop (#4675 ) ## Problem If we fail to wake up the compute node, a subsequent connect attempt will definitely fail. However, kubernetes won't fail the connection immediately, instead it hangs until we timeout (10s). ## Summary of changes Refactor the loop to allow fast retries of compute_wake and to skip a connect attempt.	2023-07-12 18:40:11 +01:00
Conrad Ludgate	efa6aa134f	allow repeated IO errors from compute node (#4624 ) ## Problem #4598 compute nodes are not accessible some time after wake up due to kubernetes DNS not being fully propagated. ## Summary of changes Update connect retry mechanism to support handling IO errors and sleeping for 100ms ## Checklist before requesting a review - [x] I have performed a self-review of my code. - [ ] If it is a core feature, I have added thorough tests. - [ ] Do we need to implement analytics? if so did you add the relevant metrics to the dashboard? - [ ] If this PR requires public announcement, mark it with /release-notes label and add several sentences in this section.	2023-07-12 18:40:06 +01:00
Alexander Bayandin	2c724e56e2	Merge pull request #4646 from neondatabase/releases/2023-07-06-hotfix Release 2023-07-06 (add pg_embedding extension only)	2023-07-06 12:19:52 +01:00
Alexander Bayandin	feff887c6f	Compile `pg_embedding` extension (#4634 ) ``` CREATE EXTENSION embedding; CREATE TABLE t (val real[]); INSERT INTO t (val) VALUES ('{0,0,0}'), ('{1,2,3}'), ('{1,1,1}'), (NULL); CREATE INDEX ON t USING hnsw (val) WITH (maxelements = 10, dims=3, m=3); INSERT INTO t (val) VALUES (array[1,2,4]); SELECT * FROM t ORDER BY val <-> array[3,3,3]; val --------- {1,2,3} {1,2,4} {1,1,1} {0,0,0} (5 rows) ```	2023-07-06 09:39:41 +01:00
Vadim Kharitonov	353d915fcf	Merge pull request #4633 from neondatabase/releases/2023-07-05 Release 2023-07-05	2023-07-05 15:10:47 +02:00
Vadim Kharitonov	2e38098cbc	Merge branch 'release' into releases/2023-07-05	2023-07-05 12:41:48 +02:00
Vadim Kharitonov	a6fe5ea1ac	Merge pull request #4571 from neondatabase/releases/2023-06-27 Release 2023-06-27	2023-06-27 12:55:33 +02:00
Vadim Kharitonov	05b0aed0c1	Merge branch 'release' into releases/2023-06-27	2023-06-27 12:22:12 +02:00
Alex Chi Z	cd1705357d	Merge pull request #4561 from neondatabase/releases/2023-06-23-hotfix Release 2023-06-23 (pageserver-only)	2023-06-23 15:38:50 -04:00
Christian Schwarz	6bc7561290	don't use MGMT_REQUEST_RUNTIME for consumption metrics synthetic size worker The consumption metrics synthetic size worker does logical size calculation. Logical size calculation currently does synchronous disk IO. This blocks the MGMT_REQUEST_RUNTIME's executor threads, starving other futures. While there's work on the way to move the synchronous disk IO into spawn_blocking, the quickfix here is to use the BACKGROUND_RUNTIME instead of MGMT_REQUEST_RUNTIME. Actually it's not just a quickfix. We simply shouldn't be blocking MGMT_REQUEST_RUNTIME executor threads on CPU or sync disk IO. That work isn't done yet, as many of the mgmt tasks still _do_ disk IO. But it's not as intensive as the logical size calculations that we're fixing here. While we're at it, fix disk-usage-based eviction in a similar way. It wasn't the culprit here, according to prod logs, but it can theoretically be a little CPU-intensive. More context, including graphs from Prod: https://neondb.slack.com/archives/C03F5SM1N02/p1687541681336949 (cherry picked from commit `d6e35222ea`)	2023-06-23 20:54:07 +02:00
Christian Schwarz	fbd3ac14b5	Merge pull request #4544 from neondatabase/releases/2023-06-21-hotfix Release 2023-06-21 (fixup for post-merge failed 2023-06-20)	2023-06-21 16:54:34 +03:00
Christian Schwarz	e437787c8f	cargo update -p openssl (#4542 ) To unblock release https://github.com/neondatabase/neon/pull/4536#issuecomment-1600678054 Context: https://rustsec.org/advisories/RUSTSEC-2023-0044	2023-06-21 15:52:56 +03:00
Christian Schwarz	3460dbf90b	Merge pull request #4536 from neondatabase/releases/2023-06-20 Release 2023-06-20 (actually 2023-06-21)	2023-06-21 14:19:14 +03:00
Vadim Kharitonov	6b89d99677	Merge pull request #4521 from neondatabase/release_2023-06-15 Release 2023 06 15	2023-06-15 17:40:01 +02:00
Vadim Kharitonov	6cc8ea86e4	Merge branch 'main' into release_2023-06-15	2023-06-15 16:50:44 +02:00
Shany Pozin	e62a492d6f	Merge pull request #4486 from neondatabase/releases/2023-06-13 Release 2023-06-13	2023-06-13 15:21:35 +03:00
Alexey Kondratov	a475cdf642	[compute_ctl] Fix logging if catalog updates are skipped (#4480 ) Otherwise, it wasn't clear from the log when Postgres started up completely if catalog updates were skipped. Follow-up for `4936ab6`	2023-06-13 13:37:24 +02:00
Stas Kelvich	7002c79a47	Merge pull request #4447 from neondatabase/release_proxy_08-06-2023 Release proxy 08 06 2023	2023-06-08 21:02:54 +03:00
Vadim Kharitonov	ee6cf357b4	Merge pull request #4427 from neondatabase/releases/2023-06-06 Release 2023-06-06	2023-06-06 14:42:21 +02:00
Vadim Kharitonov	e5c2086b5f	Merge branch 'release' into releases/2023-06-06	2023-06-06 12:33:56 +02:00
Shany Pozin	5f1208296a	Merge pull request #4395 from neondatabase/releases/2023-06-01 Release 2023-06-01	2023-06-01 10:58:00 +03:00
Stas Kelvich	88e8e473cd	Merge pull request #4345 from neondatabase/release-23-05-25-proxy Release 23-05-25, take 3	2023-05-25 19:40:43 +03:00
Stas Kelvich	b0a77844f6	Add SQL-over-HTTP endpoint to Proxy This commit introduces an SQL-over-HTTP endpoint in the proxy, with a JSON response structure resembling that of the node-postgres driver. This method, using HTTP POST, achieves smaller amortized latencies in edge setups due to fewer round trips and an enhanced open connection reuse by the v8 engine. This update involves several intricacies: 1. SQL injection protection: We employed the extended query protocol, modifying the rust-postgres driver to send queries in one roundtrip using a text protocol rather than binary, bypassing potential issues like those identified in https://github.com/sfackler/rust-postgres/issues/1030. 2. Postgres type compatibility: As not all postgres types have binary representations (e.g., acl's in pg_class), we adjusted rust-postgres to respond with text protocol, simplifying serialization and fixing queries with text-only types in response. 3. Data type conversion: Considering JSON supports fewer data types than Postgres, we perform conversions where possible, passing all other types as strings. Key conversions include: - postgres int2, int4, float4, float8 -> json number (NaN and Inf remain text) - postgres bool, null, text -> json bool, null, string - postgres array -> json array - postgres json and jsonb -> json object 4. Alignment with node-postgres: To facilitate integration with js libraries, we've matched the response structure of node-postgres, returning command tags and column oids. Command tag capturing was added to the rust-postgres functionality as part of this change.	2023-05-25 17:59:17 +03:00
Vadim Kharitonov	1baf464307	Merge pull request #4309 from neondatabase/releases/2023-05-23 Release 2023-05-23	2023-05-24 11:56:54 +02:00
Alexander Bayandin	e9b8e81cea	Merge branch 'release' into releases/2023-05-23	2023-05-23 12:54:08 +01:00
Alexander Bayandin	85d6194aa4	Fix regress-tests job for Postgres 15 on release branch (#4254 ) ## Problem Compatibility tests don't support Postgres 15 yet, but we're still trying to upload compatibility snapshot (which we do not collect). Ref https://github.com/neondatabase/neon/actions/runs/4991394158/jobs/8940369368#step:4:38129 ## Summary of changes Add `pg_version` parameter to `run-python-test-set` actions and do not upload compatibility snapshot for Postgres 15	2023-05-16 17:19:12 +01:00
Vadim Kharitonov	333a7a68ef	Merge pull request #4245 from neondatabase/releases/2023-05-16 Release 2023-05-16	2023-05-16 13:38:40 +02:00
Vadim Kharitonov	6aa4e41bee	Merge branch 'release' into releases/2023-05-16	2023-05-16 12:48:23 +02:00
Joonas Koivunen	840183e51f	try: higher page_service timeouts to isolate an issue	2023-05-11 16:24:53 +03:00
Shany Pozin	cbccc94b03	Merge pull request #4184 from neondatabase/releases/2023-05-09 Release 2023-05-09	2023-05-09 15:30:36 +03:00
Stas Kelvich	fce227df22	Merge pull request #4163 from neondatabase/main Release 23-05-05	2023-05-05 15:56:23 +03:00
Stas Kelvich	bd787e800f	Merge pull request #4133 from neondatabase/main Release 23-04-01	2023-05-01 18:52:46 +03:00
Shany Pozin	4a7704b4a3	Merge pull request #4131 from neondatabase/sp/hotfix_adding_sks_us_west Hotfix: Adding 4 new pageservers and two sets of safekeepers to us west 2	2023-05-01 15:17:38 +03:00
Shany Pozin	ff1119da66	Add 2 new sets of safekeepers to us-west2	2023-05-01 14:35:31 +03:00
Shany Pozin	4c3ba1627b	Add 4 new Pageservers for retool launch	2023-05-01 14:34:38 +03:00
Vadim Kharitonov	1407174fb2	Merge pull request #4110 from neondatabase/vk/release_2023-04-28 Release 2023 04 28	2023-04-28 17:43:16 +02:00
Vadim Kharitonov	ec9dcb1889	Merge branch 'release' into vk/release_2023-04-28	2023-04-28 16:32:26 +02:00
Joonas Koivunen	d11d781afc	revert: "Add check for duplicates of generated image layers" (#4104 ) This reverts commit `732acc5`. Reverted PR: #3869 As noted in PR #4094, we do in fact try to insert duplicates to the layer map, if L0->L1 compaction is interrupted. We do not have a proper fix for that right now, and we are in a hurry to make a release to production, so revert the changes related to this to the state that we have in production currently. We know that we have a bug here, but better to live with the bug that we've had in production for a long time, than rush a fix to production without testing it in staging first. Cc: #4094, #4088	2023-04-28 16:31:35 +02:00
Anastasia Lubennikova	4e44565b71	Merge pull request #4000 from neondatabase/releases/2023-04-11 Release 2023-04-11	2023-04-11 17:47:41 +03:00
Stas Kelvich	4ed51ad33b	Add more proxy cnames	2023-04-11 15:59:35 +03:00
Arseny Sher	1c1ebe5537	Merge pull request #3946 from neondatabase/releases/2023-04-04 Release 2023-04-04	2023-04-04 14:38:40 +04:00
Christian Schwarz	c19cb7f386	Merge pull request #3935 from neondatabase/releases/2023-04-03 Release 2023-04-03	2023-04-03 16:19:49 +02:00
Vadim Kharitonov	4b97d31b16	Merge pull request #3896 from neondatabase/releases/2023-03-28 Release 2023-03-28	2023-03-28 17:58:06 +04:00
Shany Pozin	923ade3dd7	Merge pull request #3855 from neondatabase/releases/2023-03-21 Release 2023-03-21	2023-03-21 13:12:32 +02:00
Arseny Sher	b04e711975	Merge pull request #3825 from neondatabase/release-2023-03-15 Release 2023.03.15	2023-03-15 15:38:00 +03:00
Arseny Sher	afd0a6b39a	Forward framed read buf contents to compute before proxy pass. Otherwise they get lost. Normally the buffer is empty before proxy pass, but this is not the case with the pipeline mode of our npm driver; fixes connection hangup introduced by `b80fe41af3` for it. fixes https://github.com/neondatabase/neon/issues/3822	2023-03-15 15:36:06 +04:00
Lassi Pölönen	99752286d8	Use RollingUpdate strategy also for legacy proxy (#3814 ) ## Describe your changes We have previously changed the neon-proxy to use RollingUpdate. This should be enabled in legacy proxy too in order to avoid breaking connections for the clients and allow for example backups to run even during deployment. (https://github.com/neondatabase/neon/pull/3683) ## Issue ticket number and link https://github.com/neondatabase/neon/issues/3333	2023-03-15 15:35:51 +04:00
Arseny Sher	15df93363c	Merge pull request #3804 from neondatabase/release-2023-03-13 Release 2023.03.13	2023-03-13 20:25:40 +03:00
Vadim Kharitonov	bc0ab741af	Merge pull request #3758 from neondatabase/releases/2023-03-07 Release 2023-03-07	2023-03-07 12:38:47 +01:00
Christian Schwarz	51d9dfeaa3	Merge pull request #3743 from neondatabase/releases/2023-03-03 Release 2023-03-03	2023-03-03 19:20:21 +01:00
Shany Pozin	f63cb18155	Merge pull request #3713 from neondatabase/releases/2023-02-28 Release 2023-02-28	2023-02-28 12:52:24 +02:00
Arseny Sher	0de603d88e	Merge pull request #3707 from neondatabase/release-2023-02-24 Release 2023-02-24 Hotfix for UNLOGGED tables. Contains #3706 Also contains rebase on 14.7 and 15.2 #3581	2023-02-25 00:32:11 +04:00
Heikki Linnakangas	240913912a	Fix UNLOGGED tables. Instead of trying to create missing files on the way, send init fork contents as main fork from pageserver during basebackup. Add test for that. Call put_rel_drop for init forks; previously they weren't removed. Bump vendor/postgres to revert previous approach on Postgres side. Co-authored-by: Arseny Sher <sher-ars@yandex.ru> ref https://github.com/neondatabase/postgres/pull/264 ref https://github.com/neondatabase/postgres/pull/259 ref https://github.com/neondatabase/neon/issues/1222	2023-02-24 23:54:53 +04:00
MMeent	91a4ea0de2	Update vendored PostgreSQL versions to 14.7 and 15.2 (#3581 ) ## Describe your changes Rebase vendored PostgreSQL onto 14.7 and 15.2 ## Issue ticket number and link #3579 ## Checklist before requesting a review - [x] I have performed a self-review of my code. - [x] If it is a core feature, I have added thorough tests. - [ ] Do we need to implement analytics? if so did you add the relevant metrics to the dashboard? - [x] If this PR requires public announcement, mark it with /release-notes label and add several sentences in this section. ``` The version of PostgreSQL that we use is updated to 14.7 for PostgreSQL 14 and 15.2 for PostgreSQL 15. ```	2023-02-24 23:54:42 +04:00
Arseny Sher	8608704f49	Merge pull request #3691 from neondatabase/release-2023-02-23 Release 2023-02-23 Hotfix for the unlogged tables with indexes issue. neondatabase/postgres#259 neondatabase/postgres#262	2023-02-23 13:39:33 +04:00
Arseny Sher	efef68ce99	Bump vendor/postgres to include hotfix for unlogged tables with indexes. https://github.com/neondatabase/postgres/pull/259 https://github.com/neondatabase/postgres/pull/262	2023-02-23 08:49:43 +04:00
Joonas Koivunen	8daefd24da	Merge pull request #3679 from neondatabase/releases/2023-02-22 Releases/2023-02-22	2023-02-22 15:56:55 +02:00
Arthur Petukhovsky	46cc8b7982	Remove safekeeper-1.ap-southeast-1.aws.neon.tech (#3671 ) We migrated all timelines to `safekeeper-3.ap-southeast-1.aws.neon.tech`, now old instance can be removed.	2023-02-22 15:07:57 +02:00
Sergey Melnikov	38cd90dd0c	Add -v to ansible invocations (#3670 ) To get more debug output on failures	2023-02-22 15:07:57 +02:00
Joonas Koivunen	a51b269f15	fix: hold permit until GetObject eof (#3663 ) previously we applied the ratelimiting only up to receiving the headers from s3, or somewhere near it. the commit adds an adapter which carries the permit until the AsyncRead has been disposed. fixes #3662.	2023-02-22 15:07:57 +02:00
Joonas Koivunen	43bf6d0a0f	calculate_logical_size: no longer use spawn_blocking (#3664 ) Calculation of logical size is now async because of layer downloads, so we shouldn't use spawn_blocking for it. Use of `spawn_blocking` exhausted resources which are needed by `tokio::io::copy` when copying from a stream to a file, which led to a deadlock. Fixes: #3657	2023-02-22 15:07:57 +02:00
Joonas Koivunen	15273a9b66	chore: ignore all compaction inactive tenant errors (#3665 ) these are happening in tests because of #3655 but they sure took some time to appear. makes the `Compaction failed, retrying in 2s: Cannot run compaction iteration on inactive tenant` into a globally allowed error, because it has been seen failing on different test cases.	2023-02-22 15:07:57 +02:00
Joonas Koivunen	78aca668d0	fix: log download failed error (#3661 ) Fixes #3659	2023-02-22 15:07:57 +02:00
Vadim Kharitonov	acbf4148ea	Merge pull request #3656 from neondatabase/releases/2023-02-21 Release 2023-02-21	2023-02-21 16:03:48 +01:00
Vadim Kharitonov	6508540561	Merge branch 'release' into releases/2023-02-21	2023-02-21 15:31:16 +01:00
Arthur Petukhovsky	a41b5244a8	Add new safekeeper to ap-southeast-1 prod (#3645 ) (#3646 ) To trigger deployment of #3645 to production.	2023-02-20 15:22:49 +00:00
Shany Pozin	2b3189be95	Merge pull request #3600 from neondatabase/releases/2023-02-14 Release 2023-02-14	2023-02-15 13:31:30 +02:00
Vadim Kharitonov	248563c595	Merge pull request #3553 from neondatabase/releases/2023-02-07 Release 2023-02-07	2023-02-07 14:07:44 +01:00
Vadim Kharitonov	14cd6ca933	Merge branch 'release' into releases/2023-02-07	2023-02-07 12:11:56 +01:00
Vadim Kharitonov	eb36403e71	Release 2023 01 31 (#3497 ) Co-authored-by: Kirill Bulatov <kirill@neon.tech> Co-authored-by: Heikki Linnakangas <heikki@neon.tech> Co-authored-by: Anastasia Lubennikova <anastasia@neon.tech> Co-authored-by: bojanserafimov <bojan.serafimov7@gmail.com> Co-authored-by: Christian Schwarz <christian@neon.tech> Co-authored-by: Alexey Kondratov <kondratov.aleksey@gmail.com> Co-authored-by: Joonas Koivunen <joonas@neon.tech> Co-authored-by: Konstantin Knizhnik <knizhnik@garret.ru> Co-authored-by: Shany Pozin <shany@neon.tech> Co-authored-by: Sergey Melnikov <sergey@neon.tech> Co-authored-by: Dmitry Rodionov <dmitry@neon.tech> Co-authored-by: Rory de Zoete <33318916+zoete@users.noreply.github.com> Co-authored-by: Rory de Zoete <rdezoete@Rorys-Mac-Studio.fritz.box> Co-authored-by: Rory de Zoete <rdezoete@RorysMacStudio.fritz.box> Co-authored-by: Lassi Pölönen <lassi.polonen@iki.fi>	2023-01-31 15:06:35 +02:00
Anastasia Lubennikova	3c6f779698	Merge pull request #3411 from neondatabase/release_2023_01_23 Fix Release 2023 01 23	2023-01-23 20:10:03 +02:00
Joonas Koivunen	f67f0c1c11	More tenant size fixes (#3410 ) Small changes, but hopefully this will help with the panic detected in staging, for which we cannot get the debugging information right now (end-of-branch before branch-point).	2023-01-23 17:46:13 +02:00
Shany Pozin	edb02d3299	Adding pageserver3 to staging (#3403 )	2023-01-23 17:46:13 +02:00
Konstantin Knizhnik	664a69e65b	Fix slru_segment_key_range function: segno was assigned to incorrect Key field (#3354 )	2023-01-23 17:46:13 +02:00
Anastasia Lubennikova	478322ebf9	Fix tenant size orphans (#3377 ) Before, only the timelines which had passed the `gc_horizon` were processed, which failed with orphans at the tree_sort phase. Example input in added `test_branched_empty_timeline_size` test case. The PR changes iteration to happen through all timelines, and in addition to that, any learned branch points will be calculated as they would have been in the original implementation if the ancestor branch had been over the `gc_horizon`. This also changes how tenants where all timelines are below `gc_horizon` are handled. Previously tenant_size 0 was returned, but now they will have approximately `initdb_lsn` worth of tenant_size. The PR also adds several new tenant size tests that describe various corner cases of branching structure and `gc_horizon` setting. They are currently disabled to not consume time during CI. Co-authored-by: Joonas Koivunen <joonas@neon.tech> Co-authored-by: Anastasia Lubennikova <anastasia@neon.tech>	2023-01-23 17:46:13 +02:00
Joonas Koivunen	802f174072	fix: dont stop pageserver if we fail to calculate synthetic size	2023-01-23 17:46:13 +02:00
Alexey Kondratov	47f9890bae	[compute_ctl] Make role deletion spec processing idempotent (#3380 ) Previously, we were trying to re-assign owned objects of the already deleted role. This was causing a crash loop in the case when compute was restarted with a spec that includes a delta operation for role deletion. To avoid such cases, check that the role is still present before calling `reassign_owned_objects`. Resolves neondatabase/cloud#3553	2023-01-23 17:46:13 +02:00
Christian Schwarz	262265daad	Revert "Use actual temporary dir for pageserver unit tests" This reverts commit `826e89b9ce`. The problem with that commit was that it deletes the TempDir while there are still EphemeralFile instances open. At first I thought this could be fixed by simply adding Handle::current().block_on(task_mgr::shutdown(None, Some(tenant_id), None)) to TenantHarness::drop, but it turned out to be insufficient. So, reverting the commit until we find a proper solution. refs https://github.com/neondatabase/neon/issues/3385	2023-01-23 17:46:13 +02:00
bojanserafimov	300da5b872	Improve layer map docstrings (#3382 )	2023-01-23 17:46:13 +02:00
Heikki Linnakangas	7b22b5c433	Switch to 'tracing' for logging, restructure code to make use of spans. Refactors Compute::prepare_and_run. It's split into subroutines differently, to make it easier to attach tracing spans to the different stages. The high-level logic for waiting for Postgres to exit is moved to the caller. Replace 'env_logger' with 'tracing', and add `#instrument` directives to different stages of the startup process. This is a fairly mechanical change, except for the changes in 'spec.rs'. 'spec.rs' contained some complicated formatting, where parts of log messages were printed directly to stdout with `print`s. That was a bit messed up because the log normally goes to stderr, but those lines were printed to stdout. In our docker images, stderr and stdout both go to the same place so you wouldn't notice, but I don't think it was intentional. This changes the log format to the default 'tracing_subscriber::format' format. It's different from the Postgres log format, however, and because both compute_tools and Postgres print to the same log, it's now a mix of two different formats. I'm not sure how the Grafana log parsing pipeline can handle that. If it's a problem, we can build a custom formatter to change the compute_tools log format to be the same as Postgres's, like it was before this commit, or we can change the Postgres log format to match tracing_formatter's, or we can start printing compute_tool's log output to a different destination than Postgres	2023-01-23 17:46:12 +02:00
Kirill Bulatov	ffca97bc1e	Enable logs in unit tests	2023-01-23 17:46:12 +02:00
Kirill Bulatov	cb356f3259	Use actual temporary dir for pageserver unit tests	2023-01-23 17:46:12 +02:00
Vadim Kharitonov	c85374295f	Change SENTRY_ENVIRONMENT from "development" to "staging"	2023-01-23 17:46:12 +02:00
Anastasia Lubennikova	4992160677	Fix metric_collection_endpoint for prod. It was incorrectly set to staging url	2023-01-23 17:46:12 +02:00
Heikki Linnakangas	bd535b3371	If an error happens while checking for core dumps, don't panic. If we panic, we skip the 30s wait in 'main', and don't give the console a chance to observe the error. Which is not nice. Spotted by @ololobus at https://github.com/neondatabase/neon/pull/3352#discussion_r1072806981	2023-01-23 17:46:12 +02:00
Kirill Bulatov	d90c5a03af	Add more io::Error context when fail to operate on a path (#3254 ) I have a test failure that shows ``` Caused by: 0: Failed to reconstruct a page image: 1: Directory not empty (os error 39) ``` but does not really show where exactly that happens. https://neon-github-public-dev.s3.amazonaws.com/reports/pr-3227/release/3823785365/index.html#categories/c0057473fc9ec8fb70876fd29a171ce8/7088dab272f2c7b7/?attachment=60fe6ed2add4d82d The PR aims to add more context in debugging that issue.	2023-01-23 17:46:12 +02:00
Anastasia Lubennikova	2d02cc9079	Merge pull request #3365 from neondatabase/main Release 2023-01-17	2023-01-17 16:41:34 +02:00
Christian Schwarz	49ad94b99f	Merge pull request #3301 from neondatabase/release-2023-01-10 Release 2023-01-10	2023-01-10 16:42:26 +01:00
Christian Schwarz	948a217398	Merge commit '95bf19b85a06b27a7fc3118dee03d48648efab15' into release-2023-01-10 Conflicts: .github/helm-values/neon-stress.proxy-scram.yaml .github/helm-values/neon-stress.proxy.yaml .github/helm-values/staging.proxy-scram.yaml .github/helm-values/staging.proxy.yaml All of the above were deleted in `main` after we hotfixed them in `release`. Deleting them here. storage_broker/src/bin/storage_broker.rs Hotfix toned down logging, but `main` has since implemented a proper fix. Taken `main`'s side, see https://neondb.slack.com/archives/C033RQ5SPDH/p1673354385387479?thread_ts=1673354306.474729&cid=C033RQ5SPDH closes https://github.com/neondatabase/neon/issues/3287	2023-01-10 15:40:14 +01:00
Dmitry Rodionov	125381eae7	Merge pull request #3236 from neondatabase/dkr/retrofit-sk4-sk4-change Move zenith-1-sk-3 to zenith-1-sk-4 (#3164)	2022-12-30 14:13:50 +03:00
Arthur Petukhovsky	cd01bbc715	Move zenith-1-sk-3 to zenith-1-sk-4 (#3164 )	2022-12-30 12:32:52 +02:00
Dmitry Rodionov	d8b5e3b88d	Merge pull request #3229 from neondatabase/dkr/add-pageserver-for-release add pageserver to new region see https://github.com/neondatabase/aws/pull/116 decrease log volume for pageserver	2022-12-30 12:34:04 +03:00
Dmitry Rodionov	06d25f2186	switch to debug from info to produce less noise	2022-12-29 17:48:47 +02:00
Dmitry Rodionov	f759b561f3	add pageserver to new region see https://github.com/neondatabase/aws/pull/116	2022-12-29 17:17:35 +02:00
Sergey Melnikov	ece0555600	Push proxy metrics to Victoria Metrics (#3106 )	2022-12-16 14:44:49 +02:00
Joonas Koivunen	73ea0a0b01	fix(remote_storage): use cached credentials (#3128 ) IMDSv2 has limits, and if we query it on every s3 interaction we are going to go over those limits. Changes the s3_bucket client configuration to use: - ChainCredentialsProvider to handle env variables or imds usage - LazyCachingCredentialsProvider to actually cache any credentials Related: https://github.com/awslabs/aws-sdk-rust/issues/629 Possibly related: https://github.com/neondatabase/neon/issues/3118	2022-12-16 14:44:49 +02:00
Arseny Sher	d8f6d6fd6f	Merge pull request #3126 from neondatabase/broker-lb-release Deploy broker with L4 LB in new env.	2022-12-16 01:25:28 +03:00
Arseny Sher	d24de169a7	Deploy broker with L4 LB in new env. Seems to be fixing issue with missing keepalives.	2022-12-16 01:45:32 +04:00
Arseny Sher	0816168296	Hotfix: terminate subscription if channel is full. Might help as a hotfix, but need to understand root better.	2022-12-15 12:23:56 +03:00
Dmitry Rodionov	277b44d57a	Merge pull request #3102 from neondatabase/main Hotfix. See commits for details	2022-12-14 19:38:43 +03:00
MMeent	68c2c3880e	Merge pull request #3038 from neondatabase/main Release 22-12-14	2022-12-14 14:35:47 +01:00
Arthur Petukhovsky	49da498f65	Merge pull request #2833 from neondatabase/main Release 2022-11-16	2022-11-17 08:44:10 +01:00
Stas Kelvich	2c76ba3dd7	Merge pull request #2718 from neondatabase/main-rc-22-10-28 Release 22-10-28	2022-10-28 20:33:56 +03:00
Arseny Sher	dbe3dc69ad	Merge branch 'main' into main-rc-22-10-28 Release 22-10-28.	2022-10-28 19:10:11 +04:00
Arseny Sher	8e5bb3ed49	Enable etcd compaction in neon_local.	2022-10-27 12:53:20 +03:00
Stas Kelvich	ab0be7b8da	Avoid debian-testing packages in compute Dockerfiles plv8 can only be built with a fairly new gold linker version. We used to install it via binutils packages from testing, but it also updates libc and that causes troubles in the resulting image as different extensions were built against different libc versions. We could either use libc from debian-testing everywhere or restrain from using testing packages and install necessary programs manually. This patch uses the latter approach: gold for plv8 and cmake for h3 are installed manually. In a passing declare h3_postgis as a safe extension (previous omission).	2022-10-27 12:53:20 +03:00
bojanserafimov	b4c55f5d24	Move pagestream api to libs/pageserver_api (#2698 )	2022-10-27 12:53:20 +03:00
mikecaat	ede70d833c	Add a docker-compose example file (#1943 ) (#2666 ) Co-authored-by: Masahiro Ikeda <masahiro.ikeda.us@hco.ntt.co.jp>	2022-10-27 12:53:20 +03:00
Sergey Melnikov	70c3d18bb0	Do not release to new staging proxies on release (#2685 )	2022-10-27 12:53:20 +03:00
bojanserafimov	7a491f52c4	Add draw_timeline binary (#2688 )	2022-10-27 12:53:20 +03:00
Alexander Bayandin	323c4ecb4f	Add data format backward compatibility tests (#2626 )	2022-10-27 12:53:20 +03:00
Anastasia Lubennikova	3d2466607e	Merge pull request #2692 from neondatabase/main-rc Release 2022-10-25	2022-10-25 18:18:58 +03:00
Anastasia Lubennikova	ed478b39f4	Merge branch 'release' into main-rc	2022-10-25 17:06:33 +03:00
Stas Kelvich	91585a558d	Merge pull request #2678 from neondatabase/stas/hotfix_schema Hotfix to disable grant create on public schema	2022-10-22 02:54:31 +03:00
Stas Kelvich	93467eae1f	Hotfix to disable grant create on public schema `GRANT CREATE ON SCHEMA public` fails if there is no schema `public`. Disable it in release for now and make a better fix later (it is needed for v15 support).	2022-10-22 02:26:28 +03:00
Stas Kelvich	f3aac81d19	Merge pull request #2668 from neondatabase/main Release 2022-10-21	2022-10-21 15:21:42 +03:00
Stas Kelvich	979ad60c19	Merge pull request #2581 from neondatabase/main Release 2022-10-07	2022-10-07 16:50:55 +03:00
Stas Kelvich	9316cb1b1f	Merge pull request #2573 from neondatabase/main Release 2022-10-06	2022-10-07 11:07:06 +03:00
Anastasia Lubennikova	e7939a527a	Merge pull request #2377 from neondatabase/main Release 2022-09-01	2022-09-01 20:20:44 +03:00
Arthur Petukhovsky	36d26665e1	Merge pull request #2299 from neondatabase/main * Check for entire range during sasl validation (#2281) * Gen2 GH runner (#2128) * Re-add rustup override * Try s3 bucket * Set git version * Use v4 cache key to prevent problems * Switch to v5 for key * Add second rustup fix * Rebase * Add kaniko steps * Fix typo and set compress level * Disable global run default * Specify shell for step * Change approach with kaniko * Try less verbose shell spec * Add submodule pull * Add promote step * Adjust dependency chain * Try default swap again * Use env * Don't override aws key * Make kaniko build conditional * Specify runs on * Try without dependency link * Try soft fail * Use image with git * Try passing to next step * Fix duplicate * Try other approach * Try other approach * Fix typo * Try other syntax * Set env * Adjust setup * Try step 1 * Add link * Try global env * Fix mistake * Debug * Try other syntax * Try other approach * Change order * Move output one step down * Put output up one level * Try other syntax * Skip build * Try output * Re-enable build * Try other syntax * Skip middle step * Update check * Try first step of dockerhub push * Update needs dependency * Try explicit dir * Add missing package * Try other approach * Try other approach * Specify region * Use with * Try other approach * Add debug * Try other approach * Set region * Follow AWS example * Try github approach * Skip Qemu * Try stdin * Missing steps * Add missing close * Add echo debug * Try v2 endpoint * Use v1 endpoint * Try without quotes * Revert * Try crane * Add debug * Split steps * Fix duplicate * Add shell step * Conform to options * Add verbose flag * Try single step * Try workaround * First request fails hunch * Try bullseye image * Try other approach * Adjust verbose level * Try previous step * Add more debug * Remove debug step * Remove rogue indent * Try with larger image * Add build tag step * Update workflow for testing * Add tag step for test * 
Remove unused * Update dependency chain * Add ownership fix * Use matrix for promote * Force update * Force build * Remove unused * Add new image * Add missing argument * Update dockerfile copy * Update Dockerfile * Update clone * Update dockerfile * Go to correct folder * Use correct format * Update dockerfile * Remove cd * Debug find where we are * Add debug on first step * Changedir to postgres * Set workdir * Use v1 approach * Use other dependency * Try other approach * Try other approach * Update dockerfile * Update approach * Update dockerfile * Update approach * Update dockerfile * Update dockerfile * Add workspace hack * Update Dockerfile * Update Dockerfile * Update Dockerfile * Change last step * Cleanup pull in prep for review * Force build images * Add condition for latest tagging * Use pinned version * Try without name value * Remove more names * Shorten names * Add kaniko comments * Pin kaniko * Pin crane and ecr helper * Up one level * Switch to pinned tag for rust image * Force update for test Co-authored-by: Rory de Zoete <rdezoete@RorysMacStudio.fritz.box> Co-authored-by: Rory de Zoete <rdezoete@b04468bf-cdf4-41eb-9c94-aff4ca55e4bf.fritz.box> Co-authored-by: Rory de Zoete <rdezoete@Rorys-Mac-Studio.fritz.box> Co-authored-by: Rory de Zoete <rdezoete@4795e9ee-4f32-401f-85f3-f316263b62b8.fritz.box> Co-authored-by: Rory de Zoete <rdezoete@2f8bc4e5-4ec2-4ea2-adb1-65d863c4a558.fritz.box> Co-authored-by: Rory de Zoete <rdezoete@27565b2b-72d5-4742-9898-a26c9033e6f9.fritz.box> Co-authored-by: Rory de Zoete <rdezoete@ecc96c26-c6c4-4664-be6e-34f7c3f89a3c.fritz.box> Co-authored-by: Rory de Zoete <rdezoete@7caff3a5-bf03-4202-bd0e-f1a93c86bdae.fritz.box> * Add missing step output, revert one deploy step (#2285) * Add missing step output, revert one deploy step * Conform to syntax * Update approach * Add missing value * Add missing needs Co-authored-by: Rory de Zoete <rdezoete@RorysMacStudio.fritz.box> * Error for fatal not git repo (#2286) Co-authored-by: Rory 
de Zoete <rdezoete@RorysMacStudio.fritz.box> * Use main, not branch for ref check (#2288) * Use main, not branch for ref check * Add more debug * Count main, not head * Try new approach * Conform to syntax * Update approach * Get full history * Skip checkout * Cleanup debug * Remove more debug Co-authored-by: Rory de Zoete <rdezoete@RorysMacStudio.fritz.box> * Fix docker zombie process issue (#2289) * Fix docker zombie process issue * Init everywhere Co-authored-by: Rory de Zoete <rdezoete@RorysMacStudio.fritz.box> * Fix 1.63 clippy lints (#2282) * split out timeline metrics, track layer map loading and size calculation * reset rust cache for clippy run to avoid an ICE additionally remove trailing whitespaces * Rename pg_control_ffi.h to bindgen_deps.h, for clarity. The pg_control_ffi.h name implies that it only includes stuff related to pg_control.h. That's mostly true currently, but really the point of the file is to include everything that we need to generate Rust definitions from. * Make local mypy behave like CI mypy (#2291) * Fix flaky pageserver restarts in tests (#2261) * Remove extra type aliases (#2280) * Update cachepot endpoint (#2290) * Update cachepot endpoint * Update dockerfile & remove env * Update image building process * Cannot use metadata endpoint for this * Update workflow * Conform to kaniko syntax * Update syntax * Update approach * Update dockerfiles * Force update * Update dockerfiles * Update dockerfile * Cleanup dockerfiles * Update s3 test location * Revert s3 experiment * Add more debug * Specify aws region * Remove debug, add prefix * Remove one more debug Co-authored-by: Rory de Zoete <rdezoete@RorysMacStudio.fritz.box> * workflows/benchmarking: increase timeout (#2294) * Rework `init` in pageserver CLI (#2272) * Do not create initial tenant and timeline (adjust Python tests for that) * Rework config handling during init, add --update-config to manage local config updates * Fix: Always build images (#2296) * Always build images * 
Remove unused Co-authored-by: Rory de Zoete <rdezoete@RorysMacStudio.fritz.box> * Move auto-generated 'bindings' to a separate inner module. Re-export only things that are used by other modules. In the future, I'm imagining that we run bindgen twice, for Postgres v14 and v15. The two sets of bindings would go into separate 'bindings_v14' and 'bindings_v15' modules. Rearrange postgres_ffi modules. Move function, to avoid Postgres version dependency in timelines.rs Move function to generate a logical-message WAL record to postgres_ffi. * fix cargo test * Fix walreceiver and safekeeper bugs (#2295) - There was an issue with zero commit_lsn `reason: LaggingWal { current_commit_lsn: 0/0, new_commit_lsn: 1/6FD90D38, threshold: 10485760 } }`. The problem was in `send_wal.rs`, where we initialized `end_pos = Lsn(0)` and in some cases sent it to the pageserver. - IDENTIFY_SYSTEM previously returned `flush_lsn` as a physical end of WAL. Now it returns `flush_lsn` (as it was) to walproposer and `commit_lsn` to everyone else including pageserver. - There was an issue with backoff where connection was cancelled right after initialization: `connected!` -> `safekeeper_handle_db: Connection cancelled` -> `Backoff: waiting 3 seconds`. The problem was in sleeping before establishing the connection. This is fixed by reworking retry logic. - There was an issue with getting `NoKeepAlives` reason in a loop. The issue is probably the same as the previous. - There was an issue with filtering safekeepers based on retry attempts, which could filter some safekeepers indefinetely. This is fixed by using retry cooldown duration instead of retry attempts. - Some `send_wal.rs` connections failed with errors without context. This is fixed by adding a timeline to safekeepers errors. 
New retry logic works like this: - Every candidate has a `next_retry_at` timestamp and is not considered for connection until that moment - When walreceiver connection is closed, we update `next_retry_at` using exponential backoff, increasing the cooldown on every disconnect. - When `last_record_lsn` was advanced using the WAL from the safekeeper, we reset the retry cooldown and exponential backoff, allowing walreceiver to reconnect to the same safekeeper instantly. * on safekeeper registration pass availability zone param (#2292) Co-authored-by: Kirill Bulatov <kirill@neon.tech> Co-authored-by: Rory de Zoete <33318916+zoete@users.noreply.github.com> Co-authored-by: Rory de Zoete <rdezoete@RorysMacStudio.fritz.box> Co-authored-by: Rory de Zoete <rdezoete@b04468bf-cdf4-41eb-9c94-aff4ca55e4bf.fritz.box> Co-authored-by: Rory de Zoete <rdezoete@Rorys-Mac-Studio.fritz.box> Co-authored-by: Rory de Zoete <rdezoete@4795e9ee-4f32-401f-85f3-f316263b62b8.fritz.box> Co-authored-by: Rory de Zoete <rdezoete@2f8bc4e5-4ec2-4ea2-adb1-65d863c4a558.fritz.box> Co-authored-by: Rory de Zoete <rdezoete@27565b2b-72d5-4742-9898-a26c9033e6f9.fritz.box> Co-authored-by: Rory de Zoete <rdezoete@ecc96c26-c6c4-4664-be6e-34f7c3f89a3c.fritz.box> Co-authored-by: Rory de Zoete <rdezoete@7caff3a5-bf03-4202-bd0e-f1a93c86bdae.fritz.box> Co-authored-by: Dmitry Rodionov <dmitry@neon.tech> Co-authored-by: Heikki Linnakangas <heikki@neon.tech> Co-authored-by: bojanserafimov <bojan.serafimov7@gmail.com> Co-authored-by: Alexander Bayandin <alexander@neon.tech> Co-authored-by: Anastasia Lubennikova <anastasia@neon.tech> Co-authored-by: Anton Galitsyn <agalitsyn@users.noreply.github.com>	2022-08-18 15:32:33 +03:00
Arthur Petukhovsky	873347f977	Merge pull request #2275 from neondatabase/main * github/workflows: Fix git dubious ownership (#2223) * Move relation size cache from WalIngest to DatadirTimeline (#2094) * Move relation sie cache to layered timeline * Fix obtaining current LSN for relation size cache * Resolve merge conflicts * Resolve merge conflicts * Reestore 'lsn' field in DatadirModification * adjust DatadirModification lsn in ingest_record * Fix formatting * Pass lsn to get_relsize * Fix merge conflict * Update pageserver/src/pgdatadir_mapping.rs Co-authored-by: Heikki Linnakangas <heikki@zenith.tech> * Update pageserver/src/pgdatadir_mapping.rs Co-authored-by: Heikki Linnakangas <heikki@zenith.tech> Co-authored-by: Heikki Linnakangas <heikki@zenith.tech> * refactor: replace lazy-static with once-cell (#2195) - Replacing all the occurrences of lazy-static with `once-cell::sync::Lazy` - fixes #1147 Signed-off-by: Ankur Srivastava <best.ankur@gmail.com> * Add more buckets to pageserver latency metrics (#2225) * ignore record property warning to fix benchmarks * increase statement timeout * use event so it fires only if workload thread successfully finished * remove debug log * increase timeout to pass test with real s3 * avoid duplicate parameter, increase timeout * Major migration script (#2073) This script can be used to migrate a tenant across breaking storage versions, or (in the future) upgrading postgres versions. See the comment at the top for an overview. Co-authored-by: Anastasia Lubennikova <anastasia@neon.tech> * Fix etcd typos * Fix links to safekeeper protocol docs. (#2188) safekeeper/README_PROTO.md was moved to docs/safekeeper-protocol.md in commit `0b14fdb078`, as part of reorganizing the docs into 'mdbook' format. Fixes issue #1475. Thanks to @banks for spotting the outdated references. In addition to fixing the above issue, this patch also fixes other broken links as a result of `0b14fdb078`. 
See https://github.com/neondatabase/neon/pull/2188#pullrequestreview-1055918480. Co-authored-by: Heikki Linnakangas <heikki@neon.tech> Co-authored-by: Thang Pham <thang@neon.tech> * Update CONTRIBUTING.md * Update CONTRIBUTING.md * support node id and remote storage params in docker_entrypoint.sh * Safe truncate (#2218) * Move relation sie cache to layered timeline * Fix obtaining current LSN for relation size cache * Resolve merge conflicts * Resolve merge conflicts * Reestore 'lsn' field in DatadirModification * adjust DatadirModification lsn in ingest_record * Fix formatting * Pass lsn to get_relsize * Fix merge conflict * Update pageserver/src/pgdatadir_mapping.rs Co-authored-by: Heikki Linnakangas <heikki@zenith.tech> * Update pageserver/src/pgdatadir_mapping.rs Co-authored-by: Heikki Linnakangas <heikki@zenith.tech> * Check if relation exists before trying to truncat it refer #1932 * Add test reporducing FSM truncate problem Co-authored-by: Heikki Linnakangas <heikki@zenith.tech> * Fix exponential backoff values * Update back `vendor/postgres` back; it was changed accidentally. (#2251) Commit `4227cfc96e` accidentally reverted vendor/postgres to an older version. Update it back. * Add pageserver checkpoint_timeout option. To flush inmemory layer eventually when no new data arrives, which helps safekeepers to suspend activity (stop pushing to the broker). Default 10m should be ok. * Share exponential backoff code and fix logic for delete task failure (#2252) * Fix bug when import large (>1GB) relations (#2172) Resolves #2097 - use timeline modification's `lsn` and timeline's `last_record_lsn` to determine the corresponding LSN to query data in `DatadirModification::get` - update `test_import_from_pageserver`. Split the test into 2 variants: `small` and `multisegment`. + `small` is the old test + `multisegment` is to simulate #2097 by using a larger number of inserted rows to create multiple segment files of a relation. 
`multisegment` is configured to only run with a `release` build * Fix timeline physical size flaky tests (#2244) Resolves #2212. - use `wait_for_last_flush_lsn` in `test_timeline_physical_size_` tests ## Context Need to wait for the pageserver to catch up with the compute's last flush LSN because during the timeline physical size API call, it's possible that there are running `LayerFlushThread` threads. These threads flush new layers into disk and hence update the physical size. This results in a mismatch between the physical size reported by the API and the actual physical size on disk. ### Note The `LayerFlushThread` threads are processed concurrently, so it's possible that the above error still persists even with this patch. However, making the tests wait to finish processing all the WALs (not flushing) before calculating the physical size should help reduce the "flakiness" significantly postgres_ffi/waldecoder: validate more header fields * postgres_ffi/waldecoder: remove unused startlsn * postgres_ffi/waldecoder: introduce explicit `enum State` Previously it was emulated with a combination of nullable fields. This change should make the logic more readable. * disable `test_import_from_pageserver_multisegment` (#2258) This test failed consistently on `main` now. It's better to temporarily disable it to avoid blocking others' PRs while investigating the root cause for the test failure. See: #2255, #2256 * get_binaries uses DOCKER_TAG taken from docker image build step (#2260) * [proxy] Rework wire format of the password hack and some errors (#2236) The new format has a few benefits: it's shorter, simpler and human-readable as well. We don't use base64 anymore, since url encoding got us covered. We also show a better error in case we couldn't parse the payload; the users should know it's all about passing the correct project name. 
* test_runner/pg_clients: collect docker logs (#2259) * get_binaries script fix (#2263) * get_binaries uses DOCKER_TAG taken from docker image build step * remove docker tag discovery at all and fix get_binaries for version variable * Better storage sync logs (#2268) * Find end of WAL on safekeepers using WalStreamDecoder. We could make it inside wal_storage.rs, but taking into account that - wal_storage.rs reading is async - we don't need s3 here - error handling is different; error during decoding is normal I decided to put it separately. Test cargo test test_find_end_of_wal_last_crossing_segment prepared earlier by @yeputons passes now. Fixes https://github.com/neondatabase/neon/issues/544 https://github.com/neondatabase/cloud/issues/2004 Supersedes https://github.com/neondatabase/neon/pull/2066 * Improve walreceiver logic (#2253) This patch makes walreceiver logic more complicated, but it should work better in most cases. Added `test_wal_lagging` to test scenarios where alive safekeepers can lag behind other alive safekeepers. - There was a bug which looks like `etcd_info.timeline.commit_lsn > Some(self.local_timeline.get_last_record_lsn())` filtered all safekeepers in some strange cases. I removed this filter, it should probably help with #2237 - Now walreceiver_connection reports status, including commit_lsn. This allows keeping safekeeper connection even when etcd is down. - Safekeeper connection now fails if pageserver doesn't receive safekeeper messages for some time. Usually safekeeper sends messages at least once per second. - `LaggingWal` check now uses `commit_lsn` directly from safekeeper. This fixes the issue with often reconnects, when compute generates WAL really fast. - `NoWalTimeout` is rewritten to trigger only when we know about the new WAL and the connected safekeeper doesn't stream any WAL. This allows setting a small `lagging_wal_timeout` because it will trigger only when we observe that the connected safekeeper has stuck. 
* increase timeout in wait_for_upload to avoid spurious failures when testing with real s3 * Bump vendor/postgres to include XLP_FIRST_IS_CONTRECORD fix. (#2274) * Set up a workflow to run pgbench against captest (#2077) Signed-off-by: Ankur Srivastava <best.ankur@gmail.com> Co-authored-by: Alexander Bayandin <alexander@neon.tech> Co-authored-by: Konstantin Knizhnik <knizhnik@garret.ru> Co-authored-by: Heikki Linnakangas <heikki@zenith.tech> Co-authored-by: Ankur Srivastava <ansrivas@users.noreply.github.com> Co-authored-by: bojanserafimov <bojan.serafimov7@gmail.com> Co-authored-by: Dmitry Rodionov <dmitry@neon.tech> Co-authored-by: Anastasia Lubennikova <anastasia@neon.tech> Co-authored-by: Kirill Bulatov <kirill@neon.tech> Co-authored-by: Heikki Linnakangas <heikki@neon.tech> Co-authored-by: Thang Pham <thang@neon.tech> Co-authored-by: Stas Kelvich <stas.kelvich@gmail.com> Co-authored-by: Arseny Sher <sher-ars@yandex.ru> Co-authored-by: Egor Suvorov <egor@neon.tech> Co-authored-by: Andrey Taranik <andrey@cicd.team> Co-authored-by: Dmitry Ivanov <ivadmi5@gmail.com>	2022-08-15 21:30:45 +03:00
Arthur Petukhovsky	e814ac16f9	Merge pull request #2219 from neondatabase/main Release 2022-08-04	2022-08-04 20:06:34 +03:00
Heikki Linnakangas	ad3055d386	Merge pull request #2203 from neondatabase/release-uuid-ossp Deploy new storage and compute version to production Release 2022-08-02	2022-08-02 15:08:14 +03:00
Heikki Linnakangas	94e03eb452	Merge remote-tracking branch 'origin/main' into 'release' Release 2022-08-01	2022-08-02 12:43:49 +03:00
Sergey Melnikov	380f26ef79	Merge pull request #2170 from neondatabase/main (Release 2022-07-28) Release 2022-07-28	2022-07-28 14:16:52 +03:00
Arthur Petukhovsky	3c5b7f59d7	Merge pull request #2119 from neondatabase/main Release 2022-07-19	2022-07-19 11:58:48 +03:00
Arthur Petukhovsky	fee89f80b5	Merge pull request #2115 from neondatabase/main-2022-07-18 Release 2022-07-18	2022-07-18 19:21:11 +03:00
Arthur Petukhovsky	41cce8eaf1	Merge remote-tracking branch 'origin/release' into main-2022-07-18	2022-07-18 18:21:20 +03:00
Alexey Kondratov	f88fe0218d	Merge pull request #1842 from neondatabase/release-deploy-hotfix [HOTFIX] Release deploy fix This PR uses this branch neondatabase/postgres#171 and several required commits from the main to use only locally built compute-tools. This should allow us to rollout safekeepers sync issue fix on prod	2022-06-01 11:04:30 +03:00
Alexey Kondratov	cc856eca85	Install missing openssl packages in the Github Actions workflow	2022-05-31 21:31:31 +02:00
Alexey Kondratov	cf350c6002	Use :local compute-tools tag to build compute-node image	2022-05-31 21:31:16 +02:00
Arseny Sher	0ce6b6a0a3	Merge pull request #1836 from neondatabase/release-hotfix-basebackup-lsn-page-boundary Bump vendor/postgres to hotfix basebackup LSN comparison.	2022-05-31 16:54:03 +04:00
Arseny Sher	73f247d537	Bump vendor/postgres to hotfix basebackup LSN comparison.	2022-05-31 16:00:50 +04:00
Andrey Taranik	960be82183	Merge pull request #1792 from neondatabase/main Release 2202-05-25 (second)	2022-05-25 16:37:57 +03:00
Andrey Taranik	806e5a6c19	Merge pull request #1787 from neondatabase/main Release 2022-05-25	2022-05-25 13:34:11 +03:00
Alexey Kondratov	8d5df07cce	Merge pull request #1385 from zenithdb/main Release main 2022-03-22	2022-03-22 05:04:34 -05:00
Andrey Taranik	df7a9d1407	release fix 2022-03-16 (#1375 )	2022-03-17 00:43:28 +03:00
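The walreceiver retry logic described in the message of commit 36d26665e1 above (a per-candidate `next_retry_at`, a cooldown that grows exponentially on each disconnect, and a reset once `last_record_lsn` advances) can be sketched as follows. This is a minimal illustration, not the pageserver's code: `RetryState`, its methods, and the `BASE_COOLDOWN` / `MAX_COOLDOWN` values are hypothetical names and numbers chosen for the example.

```rust
use std::time::{Duration, Instant};

// Assumed values for illustration; the real cooldowns are not given in the commit message.
const BASE_COOLDOWN: Duration = Duration::from_millis(100);
const MAX_COOLDOWN: Duration = Duration::from_secs(30);

/// Hypothetical per-safekeeper retry bookkeeping.
#[derive(Debug, Clone, Default)]
pub struct RetryState {
    /// Consecutive disconnects since WAL progress was last observed.
    attempts: u32,
    /// The candidate is not considered for connection before this moment.
    next_retry_at: Option<Instant>,
}

impl RetryState {
    pub fn new() -> Self {
        Self::default()
    }

    /// Called when the walreceiver connection to this candidate is closed:
    /// doubles the cooldown on every disconnect, up to MAX_COOLDOWN.
    pub fn on_disconnect(&mut self, now: Instant) {
        self.attempts = self.attempts.saturating_add(1);
        // Cap the exponent so the shift below cannot overflow.
        let exp = self.attempts.saturating_sub(1).min(16);
        let cooldown = BASE_COOLDOWN
            .saturating_mul(1u32 << exp)
            .min(MAX_COOLDOWN);
        self.next_retry_at = Some(now + cooldown);
    }

    /// Called when `last_record_lsn` advanced using WAL from this candidate:
    /// resets the backoff so an instant reconnect is allowed.
    pub fn on_wal_progress(&mut self) {
        self.attempts = 0;
        self.next_retry_at = None;
    }

    /// Whether the candidate may be considered for a connection at `now`.
    pub fn is_eligible(&self, now: Instant) -> bool {
        self.next_retry_at.map_or(true, |t| now >= t)
    }
}
```

Keying eligibility on a timestamp instead of an attempt counter means a candidate always becomes eligible again once its cooldown expires, which matches the commit's stated goal of not filtering safekeepers indefinitely.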

9 changed files with 117 additions and 175 deletions

									
										18

Dockerfile
									
@@ -93,13 +93,14 @@ COPY --from=pg-build /home/nonroot/postgres_install.tar.gz /data/
 # By default, pageserver uses `.neon/` working directory in WORKDIR, so create one and fill it with the dummy config.
 # Now, when `docker run ... pageserver` is run, it can start without errors, yet will have some default dummy values.
-RUN mkdir -p /data/.neon/ && chown -R neon:neon /data/.neon/ \
-    && /usr/local/bin/pageserver -D /data/.neon/ --init \
-       -c "id=1234" \
-       -c "broker_endpoint='http://storage_broker:50051'" \
-       -c "pg_distrib_dir='/usr/local/'" \
-       -c "listen_pg_addr='0.0.0.0:6400'" \
-       -c "listen_http_addr='0.0.0.0:9898'"
+RUN mkdir -p /data/.neon/ && \
+  echo "id=1234" > "/data/.neon/identity.toml" && \
+  echo "broker_endpoint='http://storage_broker:50051'\n" \
+       "pg_distrib_dir='/usr/local/'\n" \
+       "listen_pg_addr='0.0.0.0:6400'\n" \
+       "listen_http_addr='0.0.0.0:9898'\n" \
+  > /data/.neon/pageserver.toml && \
+  chown -R neon:neon /data/.neon
 # When running a binary that links with libpq, default to using our most recent postgres version.  Binaries
 # that want a particular postgres version will select it explicitly: this is just a default.
@@ -110,3 +111,6 @@ VOLUME ["/data"]
 USER neon
 EXPOSE 6400
 EXPOSE 9898
+CMD /usr/local/bin/pageserver -D /data/.neon

									
										18

control_plane/src/pageserver.rs
									
@@ -25,6 +25,7 @@ use pageserver_client::mgmt_api;
 use postgres_backend::AuthType;
 use postgres_connection::{parse_host_port, PgConnectionConfig};
 use utils::auth::{Claims, Scope};
+use utils::id::NodeId;
 use utils::{
     id::{TenantId, TimelineId},
     lsn::Lsn,
@@ -74,6 +75,10 @@ impl PageServerNode {
         }
     }
+    fn pageserver_make_identity_toml(&self, node_id: NodeId) -> toml_edit::Document {
+        toml_edit::Document::from_str(&format!("id={node_id}")).unwrap()
+    }
     fn pageserver_init_make_toml(
         &self,
         conf: NeonLocalInitPageserverConf,
@@ -186,6 +191,19 @@ impl PageServerNode {
             .write_all(config.to_string().as_bytes())
             .context("write pageserver toml")?;
         drop(config_file);
+        let identity_file_path = datadir.join("identity.toml");
+        let mut identity_file = std::fs::OpenOptions::new()
+            .create_new(true)
+            .write(true)
+            .open(identity_file_path)
+            .with_context(|| format!("open identity toml for write: {config_file_path:?}"))?;
+        let identity_toml = self.pageserver_make_identity_toml(node_id);
+        identity_file
+            .write_all(identity_toml.to_string().as_bytes())
+            .context("write identity toml")?;
+        drop(identity_toml);
         // TODO: invoke a TBD config-check command to validate that pageserver will start with the written config
         // Write metadata file, used by pageserver on startup to register itself with
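The identity-file write in the hunk above can be sketched in isolation as follows. This is a minimal, std-only sketch and not the `neon_local` implementation: the function name `write_identity_toml` is hypothetical, the node id is assumed to fit a `u64`, and plain `std::io::Error` stands in for the `anyhow` context used in the diff. Note that the hunk's error message interpolates `config_file_path`; the sketch reports the identity file's own path instead, on the assumption that this is what a reader debugging a failure would want to see.

```rust
use std::fs::OpenOptions;
use std::io::{self, Write};
use std::path::{Path, PathBuf};

/// Hypothetical helper: writes `id=<node_id>` to `<datadir>/identity.toml`.
/// Fails if the file already exists, mirroring `create_new(true)` in the diff.
pub fn write_identity_toml(datadir: &Path, node_id: u64) -> io::Result<PathBuf> {
    let path = datadir.join("identity.toml");

    // create_new(true) refuses to overwrite an existing identity file.
    let mut file = OpenOptions::new()
        .create_new(true)
        .write(true)
        .open(&path)
        .map_err(|e| {
            io::Error::new(
                e.kind(),
                format!("open identity toml for write: {path:?}: {e}"),
            )
        })?;

    // A trailing newline is added here for readability; the diff writes none.
    file.write_all(format!("id={node_id}\n").as_bytes())
        .map_err(|e| io::Error::new(e.kind(), format!("write identity toml: {path:?}: {e}")))?;

    Ok(path)
}
```

Calling `write_identity_toml(datadir, 1234)` twice on the same directory is expected to fail the second time with `ErrorKind::AlreadyExists`, which keeps an existing node id from being silently replaced.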

									
										2

docker-compose/compute_wrapper/shell/compute.sh
									
@@ -33,7 +33,7 @@ echo $result | jq .
 generate_id timeline_id
 PARAMS=(
-     -sb
+     -sbf
      -X POST
      -H "Content-Type: application/json"
      -d "{\"new_timeline_id\": \"${timeline_id}\", \"pg_version\": ${PG_VERSION}}"

									
										15

docker-compose/docker-compose.yml
									
				@@ -31,25 +31,14 @@ services:

				    restart: always

				    image: ${REPOSITORY:-neondatabase}/neon:${TAG:-latest}

				    environment:

				      - BROKER_ENDPOINT='http://storage_broker:50051'

				      - AWS_ACCESS_KEY_ID=minio

				      - AWS_SECRET_ACCESS_KEY=password

				      #- RUST_BACKTRACE=1

				    ports:

				       #- 6400:6400  # pg protocol handler

				       - 9898:9898 # http endpoints

				-    entrypoint:

				-      - "/bin/sh"

				-      - "-c"

				-    command:

				-      - "/usr/local/bin/pageserver -D /data/.neon/

				-                                   -c \"broker_endpoint=$$BROKER_ENDPOINT\"

				-                                   -c \"listen_pg_addr='0.0.0.0:6400'\"

				-                                   -c \"listen_http_addr='0.0.0.0:9898'\"

				-                                   -c \"remote_storage={endpoint='http://minio:9000',

				-                                                        bucket_name='neon',

				-                                                        bucket_region='eu-north-1',

				-                                                        prefix_in_bucket='/pageserver/'}\""

				+    volumes:

				+      - ./pageserver_config:/data/.neon/

				    depends_on:

				      - storage_broker

				      - minio_create_buckets

									
1 docker-compose/pageserver_config/identity.toml Normal file

				@@ -0,0 +1 @@

				+id=1234

									
5 docker-compose/pageserver_config/pageserver.toml Normal file

				@@ -0,0 +1,5 @@

				+broker_endpoint='http://storage_broker:50051'

				+pg_distrib_dir='/usr/local/'

				+listen_pg_addr='0.0.0.0:6400'

				+listen_http_addr='0.0.0.0:9898'

				+remote_storage={ endpoint='http://minio:9000', bucket_name='neon', bucket_region='eu-north-1', prefix_in_bucket='/pageserver' }

									
101 pageserver/src/bin/pageserver.rs

				@@ -2,17 +2,18 @@

				//! Main entry point for the Page Server executable.

				+use std::env;

				use std::env::{var, VarError};

				use std::io::Read;

				use std::sync::Arc;

				use std::time::Duration;

				-use std::{env, ops::ControlFlow, str::FromStr};

				use anyhow::{anyhow, Context};

				use camino::Utf8Path;

				use clap::{Arg, ArgAction, Command};

				use metrics::launch_timestamp::{set_launch_timestamp_metric, LaunchTimestamp};

				+use pageserver::config::PageserverIdentity;

				use pageserver::control_plane_client::ControlPlaneClient;

				use pageserver::disk_usage_eviction_task::{self, launch_disk_usage_global_eviction_task};

				use pageserver::metrics::{STARTUP_DURATION, STARTUP_IS_LOADING};

				@@ -25,7 +26,7 @@ use tracing::*;

				use metrics::set_build_info_metric;

				use pageserver::{

				-    config::{defaults::*, PageServerConf},

				+    config::PageServerConf,

				    context::{DownloadBehavior, RequestContext},

				    deletion_queue::DeletionQueue,

				    http, page_cache, page_service, task_mgr,

				@@ -84,18 +85,13 @@ fn main() -> anyhow::Result<()> {

				        .with_context(|| format!("Error opening workdir '{workdir}'"))?;

				    let cfg_file_path = workdir.join("pageserver.toml");

				+    let identity_file_path = workdir.join("identity.toml");

				    // Set CWD to workdir for non-daemon modes

				    env::set_current_dir(&workdir)

				        .with_context(|| format!("Failed to set application's current dir to '{workdir}'"))?;

				-    let conf = match initialize_config(&cfg_file_path, arg_matches, &workdir)? {

				-        ControlFlow::Continue(conf) => conf,

				-        ControlFlow::Break(()) => {

				-            info!("Pageserver config init successful");

				-            return Ok(());

				-        }

				-    };

				+    let conf = initialize_config(&identity_file_path, &cfg_file_path, &workdir)?;

				    // Initialize logging.

				    //

				@@ -150,70 +146,55 @@ fn main() -> anyhow::Result<()> {

				}

				fn initialize_config(

				+    identity_file_path: &Utf8Path,

				    cfg_file_path: &Utf8Path,

				-    arg_matches: clap::ArgMatches,

				    workdir: &Utf8Path,

				-) -> anyhow::Result<ControlFlow<(), &'static PageServerConf>> {

				-    let init = arg_matches.get_flag("init");

				-    let file_contents: Option<toml_edit::Document> = match std::fs::File::open(cfg_file_path) {

				+) -> anyhow::Result<&'static PageServerConf> {

				+    // The deployment orchestrator writes out an identity file containing the node id

				+    // for all pageservers. This file is the source of truth for the node id. In order

				+    // to allow for rolling back pageserver releases, the node id is also included in

				+    // the pageserver config that the deployment orchestrator writes to disk for the pageserver.

				+    // A rolled back version of the pageserver will get the node id from the pageserver.toml

				+    // config file.

				+    let identity = match std::fs::File::open(identity_file_path) {

				        Ok(mut f) => {

				-            if init {

				-                anyhow::bail!("config file already exists: {cfg_file_path}");

				+            let md = f.metadata().context("stat config file")?;

				+            if !md.is_file() {

				+                anyhow::bail!("Pageserver found identity file but it is a dir entry: {identity_file_path}. Aborting start up ...");

				            }

				+            let mut s = String::new();

				+            f.read_to_string(&mut s).context("read identity file")?;

				+            toml_edit::de::from_str::<PageserverIdentity>(&s)?

				+        }

				+        Err(e) => {

				+            anyhow::bail!("Pageserver could not read identity file: {identity_file_path}: {e}. Aborting start up ...");

				+        }

				+    };

				+    let config: toml_edit::Document = match std::fs::File::open(cfg_file_path) {

				+        Ok(mut f) => {

				            let md = f.metadata().context("stat config file")?;

				            if md.is_file() {

				                let mut s = String::new();

				                f.read_to_string(&mut s).context("read config file")?;

				-                Some(s.parse().context("parse config file toml")?)

				+                s.parse().context("parse config file toml")?

				            } else {

				                anyhow::bail!("directory entry exists but is not a file: {cfg_file_path}");

				            }

				        }

				-        Err(e) if e.kind() == std::io::ErrorKind::NotFound => None,

				        Err(e) => {

				            anyhow::bail!("open pageserver config: {e}: {cfg_file_path}");

				        }

				    };

				-    let mut effective_config = file_contents.unwrap_or_else(|| {

				-        DEFAULT_CONFIG_FILE

				-            .parse()

				-            .expect("unit tests ensure this works")

				-    });

				-    // Patch with overrides from the command line

				-    if let Some(values) = arg_matches.get_many::<String>("config-override") {

				-        for option_line in values {

				-            let doc = toml_edit::Document::from_str(option_line).with_context(|| {

				-                format!("Option '{option_line}' could not be parsed as a toml document")

				-            })?;

				-            for (key, item) in doc.iter() {

				-                effective_config.insert(key, item.clone());

				-            }

				-        }

				-    }

				-    debug!("Resulting toml: {effective_config}");

				+    debug!("Using pageserver toml: {config}");

				    // Construct the runtime representation

				-    let conf = PageServerConf::parse_and_validate(&effective_config, workdir)

				+    let conf = PageServerConf::parse_and_validate(identity.id, &config, workdir)

				        .context("Failed to parse pageserver configuration")?;

				-    if init {

				-        info!("Writing pageserver config to '{cfg_file_path}'");

				-        std::fs::write(cfg_file_path, effective_config.to_string())

				-            .with_context(|| format!("Failed to write pageserver config to '{cfg_file_path}'"))?;

				-        info!("Config successfully written to '{cfg_file_path}'")

				-    }

				-    Ok(if init {

				-        ControlFlow::Break(())

				-    } else {

				-        ControlFlow::Continue(Box::leak(Box::new(conf)))

				-    })

				+    Ok(Box::leak(Box::new(conf)))

				}

				struct WaitForPhaseResult<F: std::future::Future + Unpin> {

				@@ -734,28 +715,12 @@ fn cli() -> Command {

				    Command::new("Neon page server")

				        .about("Materializes WAL stream to pages and serves them to the postgres")

				        .version(version())

				-        .arg(

				-            Arg::new("init")

				-                .long("init")

				-                .action(ArgAction::SetTrue)

				-                .help("Initialize pageserver with all given config overrides"),

				-        )

				        .arg(

				            Arg::new("workdir")

				                .short('D')

				                .long("workdir")

				                .help("Working directory for the pageserver"),

				        )

				-        // See `settings.md` for more details on the extra configuration parameters pageserver can process

				-        .arg(

				-            Arg::new("config-override")

				-                .long("config-override")

				-                .short('c')

				-                .num_args(1)

				-                .action(ArgAction::Append)

				-                .help("Additional configuration overrides of the ones from the toml config file (or new ones to add there). \

				-                Any option has to be a valid toml document, example: `-c=\"foo='hey'\"` `-c=\"foo={value=1}\"`"),

				-        )

				        .arg(

				            Arg::new("enabled-features")

				                .long("enabled-features")
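
The start-up path in this file treats `identity.toml` as mandatory: a missing file, a path that is not a regular file, or contents that do not deserialize all abort start-up. The following is a std-only sketch of that behaviour under stated assumptions. `Identity`, `parse_identity`, and `load_identity` are hypothetical names, and the hand-written `id=<u64>` parser merely stands in for `toml_edit::de::from_str::<PageserverIdentity>`; it is not a TOML parser.

```rust
use std::fs;
use std::path::Path;

/// Hypothetical stand-in for `PageserverIdentity`; the real struct is
/// deserialized with serde and holds a `NodeId`.
#[derive(Debug, PartialEq)]
struct Identity {
    id: u64,
}

/// Parses `id=<u64>` lines. Unknown keys are rejected, which approximates
/// `#[serde(deny_unknown_fields)]`.
fn parse_identity(contents: &str) -> Result<Identity, String> {
    let mut id: Option<u64> = None;
    for raw in contents.lines() {
        let line = raw.trim();
        if line.is_empty() || line.starts_with('#') {
            continue;
        }
        let (key, value) = line
            .split_once('=')
            .ok_or_else(|| format!("expected key=value, got: {line}"))?;
        match key.trim() {
            "id" if id.is_none() => {
                let parsed = value
                    .trim()
                    .parse::<u64>()
                    .map_err(|e| format!("invalid id: {e}"))?;
                id = Some(parsed);
            }
            "id" => return Err("duplicate field: id".to_string()),
            other => return Err(format!("unknown field: {other}")),
        }
    }
    id.map(|id| Identity { id })
        .ok_or_else(|| "missing field: id".to_string())
}

/// Every failure is returned as an error so the caller can abort start-up;
/// there is no fallback to a default identity.
fn load_identity(path: &Path) -> Result<Identity, String> {
    let md = fs::metadata(path)
        .map_err(|e| format!("could not read identity file {}: {e}", path.display()))?;
    if !md.is_file() {
        return Err(format!("identity path is not a regular file: {}", path.display()));
    }
    let contents = fs::read_to_string(path)
        .map_err(|e| format!("read identity file {}: {e}", path.display()))?;
    parse_identity(&contents)
}
```

The design point mirrored here is the absence of a `NotFound` fallback: the removed code tolerated a missing config file, whereas the identity file has no default.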

									
68 pageserver/src/config.rs

				@@ -7,8 +7,8 @@

				use anyhow::{anyhow, bail, ensure, Context, Result};

				use pageserver_api::{models::ImageCompressionAlgorithm, shard::TenantShardId};

				use remote_storage::{RemotePath, RemoteStorageConfig};

				-use serde;

				use serde::de::IntoDeserializer;

				+use serde::{self, Deserialize};

				use std::env;

				use storage_broker::Uri;

				use utils::crashsafe::path_with_suffix_extension;

				@@ -406,6 +406,13 @@ struct PageServerConfigBuilder {

				}

				impl PageServerConfigBuilder {

				+    fn new(node_id: NodeId) -> Self {

				+        let mut this = Self::default();

				+        this.id(node_id);

				+        this

				+    }

				    #[inline(always)]

				    fn default_values() -> Self {

				        use self::BuilderValue::*;

				@@ -881,8 +888,12 @@ impl PageServerConf {

				    /// validating the input and failing on errors.

				    ///

				    /// This leaves any options not present in the file in the built-in defaults.

				-    pub fn parse_and_validate(toml: &Document, workdir: &Utf8Path) -> anyhow::Result<Self> {

				-        let mut builder = PageServerConfigBuilder::default();

				+    pub fn parse_and_validate(

				+        node_id: NodeId,

				+        toml: &Document,

				+        workdir: &Utf8Path,

				+    ) -> anyhow::Result<Self> {

				+        let mut builder = PageServerConfigBuilder::new(node_id);

				        builder.workdir(workdir.to_owned());

				        let mut t_conf = TenantConfOpt::default();

				@@ -913,7 +924,8 @@ impl PageServerConf {

				                "tenant_config" => {

				                    t_conf = TenantConfOpt::try_from(item.to_owned()).context(format!("failed to parse: '{key}'"))?;

				                }

				-                "id" => builder.id(NodeId(parse_toml_u64(key, item)?)),

				+                "id" => {}, // Ignoring `id` field in pageserver.toml - using identity.toml as the source of truth

				+                            // Logging is not set up yet, so we can't do it.

				                "broker_endpoint" => builder.broker_endpoint(parse_toml_string(key, item)?.parse().context("failed to parse broker endpoint")?),

				                "broker_keepalive_interval" => builder.broker_keepalive_interval(parse_toml_duration(key, item)?),

				                "log_format" => builder.log_format(

				@@ -1090,6 +1102,12 @@ impl PageServerConf {

				    }

				}

				+#[derive(Deserialize)]

				+#[serde(deny_unknown_fields)]

				+pub struct PageserverIdentity {

				+    pub id: NodeId,

				+}

				// Helper functions to parse a toml Item

				fn parse_toml_string(name: &str, item: &Item) -> Result<String> {

				@@ -1259,7 +1277,7 @@ background_task_maximum_delay = '334 s'

				        );

				        let toml = config_string.parse()?;

				-        let parsed_config = PageServerConf::parse_and_validate(&toml, &workdir)

				+        let parsed_config = PageServerConf::parse_and_validate(NodeId(10), &toml, &workdir)

				            .unwrap_or_else(|e| panic!("Failed to parse config '{config_string}', reason: {e:?}"));

				        assert_eq!(

				@@ -1341,7 +1359,7 @@ background_task_maximum_delay = '334 s'

				        );

				        let toml = config_string.parse()?;

				-        let parsed_config = PageServerConf::parse_and_validate(&toml, &workdir)

				+        let parsed_config = PageServerConf::parse_and_validate(NodeId(10), &toml, &workdir)

				            .unwrap_or_else(|e| panic!("Failed to parse config '{config_string}', reason: {e:?}"));

				        assert_eq!(

				@@ -1431,12 +1449,13 @@ broker_endpoint = '{broker_endpoint}'

				            let toml = config_string.parse()?;

				-            let parsed_remote_storage_config = PageServerConf::parse_and_validate(&toml, &workdir)

				-                .unwrap_or_else(|e| {

				-                    panic!("Failed to parse config '{config_string}', reason: {e:?}")

				-                })

				-                .remote_storage_config

				-                .expect("Should have remote storage config for the local FS");

				+            let parsed_remote_storage_config =

				+                PageServerConf::parse_and_validate(NodeId(10), &toml, &workdir)

				+                    .unwrap_or_else(|e| {

				+                        panic!("Failed to parse config '{config_string}', reason: {e:?}")

				+                    })

				+                    .remote_storage_config

				+                    .expect("Should have remote storage config for the local FS");

				            assert_eq!(

				                parsed_remote_storage_config,

				@@ -1492,12 +1511,13 @@ broker_endpoint = '{broker_endpoint}'

				            let toml = config_string.parse()?;

				-            let parsed_remote_storage_config = PageServerConf::parse_and_validate(&toml, &workdir)

				-                .unwrap_or_else(|e| {

				-                    panic!("Failed to parse config '{config_string}', reason: {e:?}")

				-                })

				-                .remote_storage_config

				-                .expect("Should have remote storage config for S3");

				+            let parsed_remote_storage_config =

				+                PageServerConf::parse_and_validate(NodeId(10), &toml, &workdir)

				+                    .unwrap_or_else(|e| {

				+                        panic!("Failed to parse config '{config_string}', reason: {e:?}")

				+                    })

				+                    .remote_storage_config

				+                    .expect("Should have remote storage config for S3");

				            assert_eq!(

				                parsed_remote_storage_config,

				@@ -1576,7 +1596,7 @@ threshold = "20m"

				"#,

				        );

				        let toml: Document = pageserver_conf_toml.parse()?;

				-        let conf = PageServerConf::parse_and_validate(&toml, &workdir)?;

				+        let conf = PageServerConf::parse_and_validate(NodeId(333), &toml, &workdir)?;

				        assert_eq!(conf.pg_distrib_dir, pg_distrib_dir);

				        assert_eq!(

				@@ -1592,7 +1612,11 @@ threshold = "20m"

				                .evictions_low_residence_duration_metric_threshold,

				            Duration::from_secs(20 * 60)

				        );

				-        assert_eq!(conf.id, NodeId(222));

				+        // Assert that the node id provided by the identity file (threaded

				+        // through the call to [`PageServerConf::parse_and_validate`]) is

				+        // used.

				+        assert_eq!(conf.id, NodeId(333));

				        assert_eq!(

				            conf.disk_usage_based_eviction,

				            Some(DiskUsageEvictionTaskConfig {

				@@ -1637,7 +1661,7 @@ threshold = "20m"

				"#,

				        );

				        let toml: Document = pageserver_conf_toml.parse().unwrap();

				-        let conf = PageServerConf::parse_and_validate(&toml, &workdir).unwrap();

				+        let conf = PageServerConf::parse_and_validate(NodeId(222), &toml, &workdir).unwrap();

				        match &conf.default_tenant_conf.eviction_policy {

				            EvictionPolicy::OnlyImitiate(t) => {

				@@ -1656,7 +1680,7 @@ threshold = "20m"

				remote_storage = {}

				        "#;

				        let doc = toml_edit::Document::from_str(input).unwrap();

				-        let err = PageServerConf::parse_and_validate(&doc, &workdir)

				+        let err = PageServerConf::parse_and_validate(NodeId(222), &doc, &workdir)

				            .expect_err("empty remote_storage field should fail, don't specify it if you want no remote_storage");

				        assert!(format!("{err}").contains("remote_storage"), "{err}");

				    }
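
The config changes above make the node id an input to `parse_and_validate`: the builder is seeded with the id from `identity.toml`, and an `id` key in `pageserver.toml` is accepted but ignored so that rolled-back releases can still read it. The sketch below illustrates that precedence with std only. `Conf`, `build_conf`, the two-field config, and the default address are hypothetical placeholders, and rejecting unknown keys is a choice made for the sketch rather than a statement about the real parser.

```rust
use std::collections::BTreeMap;

/// Hypothetical, heavily reduced stand-in for `PageServerConf`.
#[derive(Debug, PartialEq)]
struct Conf {
    id: u64,
    listen_http_addr: String,
}

/// Builds a `Conf` from the identity-provided node id plus key/value pairs
/// that stand in for the parsed `pageserver.toml` document.
fn build_conf(identity_id: u64, config: &BTreeMap<&str, &str>) -> Result<Conf, String> {
    // Seed with the id from the identity file, as `PageServerConfigBuilder::new` does.
    let mut conf = Conf {
        id: identity_id,
        // Placeholder default chosen for this sketch.
        listen_http_addr: "127.0.0.1:9898".to_string(),
    };
    for (key, value) in config {
        match *key {
            // The config file may still carry `id`; it is deliberately ignored.
            "id" => {}
            "listen_http_addr" => conf.listen_http_addr = value.to_string(),
            other => return Err(format!("unrecognized option '{other}'")),
        }
    }
    Ok(conf)
}
```

This matches the updated unit test in the diff, where the TOML input still contains an `id` yet the assertion expects `NodeId(333)`, the value passed in by the caller.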

									
64 test_runner/regress/test_pageserver_api.py

				@@ -1,8 +1,5 @@

				-import subprocess

				-from pathlib import Path

				from typing import Optional

				-import toml

				from fixtures.common_types import Lsn, TenantId, TimelineId

				from fixtures.neon_fixtures import (

				    DEFAULT_BRANCH_NAME,

				@@ -13,67 +10,6 @@ from fixtures.pageserver.http import PageserverHttpClient

				from fixtures.utils import wait_until

				-def test_pageserver_init_node_id(neon_simple_env: NeonEnv, neon_binpath: Path):

				-    """

				-    NB: The neon_local doesn't use `--init` mode anymore, but our production

				-    deployment still does => https://github.com/neondatabase/aws/pull/1322

				-    """

				-    workdir = neon_simple_env.pageserver.workdir

				-    pageserver_config = workdir / "pageserver.toml"

				-    pageserver_bin = neon_binpath / "pageserver"

				-    def run_pageserver(args):

				-        return subprocess.run(

				-            [str(pageserver_bin), "-D", str(workdir), *args],

				-            check=False,

				-            universal_newlines=True,

				-            stdout=subprocess.PIPE,

				-            stderr=subprocess.PIPE,

				-        )

				-    neon_simple_env.pageserver.stop()

				-    with open(neon_simple_env.pageserver.config_toml_path, "r") as f:

				-        ps_config = toml.load(f)

				-    required_config_keys = [

				-        "pg_distrib_dir",

				-        "listen_pg_addr",

				-        "listen_http_addr",

				-        "pg_auth_type",

				-        "http_auth_type",

				-        # TODO: only needed for NEON_PAGESERVER_PANIC_ON_UNSPECIFIED_COMPACTION_ALGORITHM in https://github.com/neondatabase/neon/pull/7748

				-        # "tenant_config",

				-    ]

				-    required_config_overrides = [

				-        f"--config-override={toml.dumps({k: ps_config[k]})}" for k in required_config_keys

				-    ]

				-    pageserver_config.unlink()

				-    bad_init = run_pageserver(["--init", *required_config_overrides])

				-    assert (

				-        bad_init.returncode == 1

				-    ), "pageserver should not be able to init new config without the node id"

				-    assert 'missing config value "id"' in bad_init.stderr

				-    assert not pageserver_config.exists(), "config file should not be created after init error"

				-    good_init_cmd = [

				-        "--init",

				-        f"--config-override=id={ps_config['id']}",

				-        *required_config_overrides,

				-    ]

				-    completed_init = run_pageserver(good_init_cmd)

				-    assert (

				-        completed_init.returncode == 0

				-    ), "pageserver should be able to create a new config with the node id given"

				-    assert pageserver_config.exists(), "config file should be created successfully"

				-    bad_reinit = run_pageserver(good_init_cmd)

				-    assert bad_reinit.returncode == 1, "pageserver refuses to init if already exists"

				-    assert "config file already exists" in bad_reinit.stderr

				def check_client(env: NeonEnv, client: PageserverHttpClient):

				    pg_version = env.pg_version

				    initial_tenant = env.initial_tenant

Compare commits

464 Commits

hackathon/ ... release-61

18 Dockerfile

18 control_plane/src/pageserver.rs

2 docker-compose/compute_wrapper/shell/compute.sh

15 docker-compose/docker-compose.yml

1 docker-compose/pageserver_config/identity.toml Normal file

5 docker-compose/pageserver_config/pageserver.toml Normal file

101 pageserver/src/bin/pageserver.rs

68 pageserver/src/config.rs

64 test_runner/regress/test_pageserver_api.py
