rust/neon

mirror of https://github.com/neondatabase/neon.git synced 2026-01-08 14:02:55 +00:00

Author	SHA1	Message	Date
Shany Pozin	d19c5248c9	Add UUID header to mgmt API (#3708 ) ## Describe your changes ## Issue ticket number and link #3479 ## Checklist before requesting a review - [x] I have performed a self-review of my code. - [ ] If it is a core feature, I have added thorough tests. - [ ] Do we need to implement analytics? if so did you add the relevant metrics to the dashboard? - [ ] If this PR requires public announcement, mark it with /release-notes label and add several sentences in this section.	2023-03-01 18:09:08 +02:00
sharnoff	1360361f60	Fix missing VM cgconfig.conf (#3718 ) It was being added to the wrong stage in the dockerfile. This should fix it, and resolves an ongoing issue on staging.	2023-02-28 21:11:00 -08:00
Alexander Bayandin	000eb1b069	Bump tempfile from 3.3.0 to 3.4.0 (#3709 ) Update `tempfile` crate to get rid of `remove_dir_all` dependency Ref https://github.com/neondatabase/neon/security/dependabot/15	2023-02-27 12:44:08 +00:00
Heikki Linnakangas	f51b48fa49	Fix UNLOGGED tables. Instead of trying to create missing files on the way, send init fork contents as main fork from pageserver during basebackup. Add test for that. Call put_rel_drop for init forks; previously they weren't removed. Bump vendor/postgres to revert previous approach on Postgres side. Co-authored-by: Arseny Sher <sher-ars@yandex.ru> ref https://github.com/neondatabase/postgres/pull/264 ref https://github.com/neondatabase/postgres/pull/259 ref https://github.com/neondatabase/neon/issues/1222	2023-02-24 23:30:02 +04:00
Sergey Melnikov	9f906ff236	Add pageserver-2.us-east-2.aws.neon.tech (#3701 )	2023-02-23 19:56:21 +01:00
Sam Kleinman	c79dd8d458	compute_ctl: support for fetching spec from control plane (#3610 )	2023-02-23 13:19:39 -05:00
Vadim Kharitonov	ec4ecdd543	Enable postgres SPI extensions	2023-02-23 16:49:37 +01:00
MMeent	20a4d817ce	Update vendored PostgreSQL versions to 14.7 and 15.2 (#3581 ) ## Describe your changes Rebase vendored PostgreSQL onto 14.7 and 15.2 ## Issue ticket number and link #3579 ## Checklist before requesting a review - [x] I have performed a self-review of my code. - [x] If it is a core feature, I have added thorough tests. - [ ] Do we need to implement analytics? if so did you add the relevant metrics to the dashboard? - [x] If this PR requires public announcement, mark it with /release-notes label and add several sentences in this section. ``` The version of PostgreSQL that we use is updated to 14.7 for PostgreSQL 14 and 15.2 for PostgreSQL 15. ```	2023-02-23 16:10:22 +02:00
Vadim Kharitonov	5ebf7e5619	Fix `pg_jsonschema` and `pg_graphql`	2023-02-23 10:43:46 +01:00
Arseny Sher	0692fffbf3	Bump vendor/postgres to include hotfix for unlogged tables with indexes. https://github.com/neondatabase/postgres/pull/259 https://github.com/neondatabase/postgres/pull/262	2023-02-23 01:34:59 +04:00
Vadim Kharitonov	093570af20	Compile `pg_hashids` extension	2023-02-22 21:00:25 +01:00
Dmitry Rodionov	eb403da814	Use debug level for successful GET http requests (#3681) We started scraping some APIs for metadata rather frequently. This includes the layer eviction tester; I believe the console does that too. This should eliminate these logs: https://neonprod.grafana.net/goto/rr_ace1Vz?orgId=1 (note the rate of around 2k messages per minute).	2023-02-22 22:19:05 +03:00
Vadim Kharitonov	f3ad635911	Compile `pgrouting` extension	2023-02-22 20:16:11 +01:00
Vadim Kharitonov	a8d7360881	Compile `hypopg` extension	2023-02-22 20:14:30 +01:00
Lassi Pölönen	b0311cfdeb	Change the production neon-proxy-scram update strategy to RollingUpdate (#3683 ) ## Describe your changes The same change in production as was done in staging by https://github.com/neondatabase/neon/pull/3678 ## Issue ticket number and link https://github.com/neondatabase/neon/issues/3333	2023-02-22 20:15:37 +02:00
Konstantin Knizhnik	412e0aa985	Skip largest N holes during compaction (#3597) ## Describe your changes This is yet another attempt to address the problem of storage size ballooning (#2948). The previous PR #3348 tried to address this problem by maintaining a list of holes for each layer. The problem with that approach is that we have to load all layers on pageserver start, so lazy loading of layers is no longer possible. This PR collects information about the N largest holes at compaction time and excludes these holes from the produced layers. It can cause generation of a larger number of layers (up to 2 times) and produce small layers, but it requires minimal changes in code and doesn't affect the storage format. For a graphical explanation, please see the thread: https://github.com/neondatabase/neon/pull/3597#discussion_r1112704451 ## Issue ticket number and link #2948 #3348 ## Checklist before requesting a review - [ ] I have performed a self-review of my code. - [ ] If it is a core feature, I have added thorough tests. - [ ] Do we need to implement analytics? if so did you add the relevant metrics to the dashboard? - [ ] If this PR requires public announcement, mark it with /release-notes label and add several sentences in this section.	2023-02-22 18:28:01 +02:00
Lassi Pölönen	965b4f4ae2	Change the staging neon-proxy-scram update strategy to RollingUpdate (#3678 ) ## Describe your changes When we deploy the proxy with the default Recreate strategy, there's always some downtime and existing connections will be shut down. Change the strategy to RollingUpdate and delay the kill signal by one week. AWS Network Loadbalancer keeps the existing connections alive for as long as the pods are alive, but will direct new connections to new pods. ## Issue ticket number and link https://github.com/neondatabase/neon/issues/3333	2023-02-22 16:50:07 +02:00
Arthur Petukhovsky	95018672fa	Remove safekeeper-1.ap-southeast-1.aws.neon.tech (#3671 ) We migrated all timelines to `safekeeper-3.ap-southeast-1.aws.neon.tech`, now old instance can be removed.	2023-02-22 11:55:41 +02:00
Sergey Melnikov	2caece2077	Add -v to ansible invocations (#3670 ) To get more debug output on failures	2023-02-21 23:11:52 +03:00
Joonas Koivunen	b8b8c19fb4	fix: hold permit until GetObject eof (#3663 ) previously we applied the ratelimiting only up to receiving the headers from s3, or somewhere near it. the commit adds an adapter which carries the permit until the AsyncRead has been disposed. fixes #3662.	2023-02-21 21:14:08 +02:00
Joonas Koivunen	225add041f	calculate_logical_size: no longer use spawn_blocking (#3664) Calculation of logical size is now async because of layer downloads, so we shouldn't use spawn_blocking for it. Use of `spawn_blocking` exhausted resources which are needed by `tokio::io::copy` when copying from a stream to a file, which led to a deadlock. Fixes: #3657	2023-02-21 21:09:31 +02:00
Joonas Koivunen	5d001b1e5a	chore: ignore all compaction inactive tenant errors (#3665 ) these are happening in tests because of #3655 but they sure took some time to appear. makes the `Compaction failed, retrying in 2s: Cannot run compaction iteration on inactive tenant` into a globally allowed error, because it has been seen failing on different test cases.	2023-02-21 20:20:13 +02:00
Joonas Koivunen	fe462de85b	fix: log download failed error (#3661 ) Fixes #3659	2023-02-21 19:31:53 +02:00
Vadim Kharitonov	c0de7f5cd8	Build `pg_jsonschema` and `pg_graphql` extensions (#3535) ## Describe your changes Layer for building pg extensions written in Rust. It required forking: * `cargo-pgx` (in order not to catch an ABI mismatch error; `cargo-pgx` hardcoded ABI, tcdi/pgx#1032) * `pg_jsonschema` (to use forked `cargo-pgx` version) * `pgx-contrib-spiext` (to use forked `cargo-pgx`) * `pg_graphql` (to use forked `cargo-pgx` and `pgx-contrib-spiext` version) Before the patch: ``` postgres=# create extension pg_jsonschema; 2023-02-02 17:45:23.120 UTC [35] ERROR: incompatible library "/usr/local/lib/pg_jsonschema.so": ABI mismatch 2023-02-02 17:45:23.120 UTC [35] DETAIL: Server has ABI "Neon Postgres", library has "PostgreSQL". 2023-02-02 17:45:23.120 UTC [35] STATEMENT: create extension pg_jsonschema; ERROR: incompatible library "/usr/local/lib/pg_jsonschema.so": ABI mismatch DETAIL: Server has ABI "Neon Postgres", library has "PostgreSQL". ``` After ``` postgres=# create extension pg_jsonschema; CREATE EXTENSION postgres=# select json_matches_schema('{"type": "object"}', '{}'); json_matches_schema --------------------- t postgres=# create extension pg_graphql; CREATE EXTENSION postgres=# create table book(id int primary key, title text); CREATE TABLE postgres=# insert into book(id, title) values (1, 'book 1'); INSERT 0 1 postgres=# select graphql.resolve($$ query { bookCollection { edges { node { id } } } } $$); resolve ---------------------------------------------------------------- {"data": {"bookCollection": {"edges": [{"node": {"id": 1}}]}}} (1 row) ``` ## Issue ticket number and link Closes #3429, #3096 ## Checklist before requesting a review - [x] I have performed a self-review of my code. - [x] If it is a core feature, I have added thorough tests. - [x] Do we need to implement analytics? if so did you add the relevant metrics to the dashboard? - [x] If this PR requires public announcement, mark it with /release-notes label and add several sentences in this section. `pg_jsonschema` extension will be available for our customers	2023-02-21 17:31:23 +01:00
Joonas Koivunen	b220ba6cd1	add random init delay for background tasks (#3655 ) Fixes #3649.	2023-02-21 12:42:11 +01:00
Joonas Koivunen	7de373210d	Warn when background tasks exceed their configured period (#3654 ) Fixes #3648.	2023-02-21 13:02:19 +02:00
Vadim Kharitonov	5c5b03ce08	Compile xml2 extension	2023-02-21 10:34:45 +01:00
Joonas Koivunen	d7d3f451f0	Use tracing panic hook in all binaries (#3634 ) Enables tracing panic hook in addition to pageserver introduced in #3475: - proxy - safekeeper - storage_broker For proxy, a drop guard which resets the original std panic hook was added on the first commit. Other binaries don't need it so they never reset anything by `disarm`ing the drop guard. The aim of the change is to make sure all panics a) have span information b) are logged similar to other messages, not interleaved with other messages as happens right now. Interleaving happens right now because std prints panics to stderr, and other logging happens in stdout. If this was handled gracefully by some utility, the log message splitter would treat panics as belonging to the previous message because it expects a message to start with a timestamp. Cc: #3468	2023-02-21 10:03:55 +02:00
Keanu Ashwell	bc7d3c6476	docs: add dependency requirements for arch based systems (#3588 ) This pull request adds information on building neon on Arch based system such as Artix, Manjaro, Antergos, etc.	2023-02-20 22:51:54 +03:00
Sergey Melnikov	e3d75879c0	Use fqdn to access console management API on production (#3651) console-release.local is a legacy manual CNAME to neon-internal-api.aws.neon.tech in r53. We could use the neon-internal-api.aws.neon.tech name directly. This was already deployed to staging in https://github.com/neondatabase/neon/pull/3642	2023-02-20 18:11:06 +01:00
Christian Schwarz	485b269674	eviction: tone down logs to debug!() level if there were no evictions fixes #3647	2023-02-20 18:01:59 +01:00
Christian Schwarz	ee1eda9921	eviction: remove EvictionStats::not_considered_due_to_clock_skew Rationale: see the block comment added in this patch. fixes #3641	2023-02-20 18:01:59 +01:00
Christian Schwarz	e363911c85	timeline: propagate span to download_remote_layer (#3644 ) fixes #3643 refs #3604	2023-02-20 17:18:13 +02:00
Sergey Melnikov	d5d690c044	Use fqdn for staging console management API (#3642) `console-staging.local` is a legacy manual CNAME to `neon-internal-api.aws.neon.build` in r53. We could use the `neon-internal-api.aws.neon.build` name directly.	2023-02-20 16:05:21 +01:00
Arthur Petukhovsky	8f557477c6	Add new safekeeper to ap-southeast-1 prod (#3645 )	2023-02-20 17:51:27 +03:00
Shany Pozin	af210c8b42	Allow running do_gc in non testing env (#3639) ## Describe your changes Since the current default gc period is set to 1 hour, whenever there is an immediate need to reduce PITR and run gc, the user has to wait 1 hour for the PITR change to take effect. By enabling this API, the user can configure PITR and immediately call the do_gc API to trigger gc. ## Issue ticket number and link #3590 ## Checklist before requesting a review - [X] I have performed a self-review of my code. - [ ] If it is a core feature, I have added thorough tests. - [ ] Do we need to implement analytics? if so did you add the relevant metrics to the dashboard? - [ ] If this PR requires public announcement, mark it with /release-notes label and add several sentences in this section.	2023-02-20 13:23:13 +02:00
sharnoff	2153d2e00a	Run compute_ctl in a cgroup in VMs (#3577 )	2023-02-17 14:14:41 -08:00
Alexander Bayandin	564fa11244	Update Postgres extensions (#3615) - Update postgis from 3.3.1 to 3.3.2 - Update plv8 from 3.1.4 to 3.1.5 - Update h3-pg from 4.0.1 to 4.1.2 (and underlying h3 from 4.0.1 to 4.1.0)	2023-02-17 18:18:23 +00:00
Christian Schwarz	8d28a24b26	staging: enable automatic layer eviction at 20m threshold + period (#3636 ) What it says on the tin. Part of #2476	2023-02-17 18:32:01 +02:00
Anastasia Lubennikova	53128d56d9	Fix make clean: Use correct paths in neon-pg-ext-clean	2023-02-17 17:57:45 +02:00
Anastasia Lubennikova	40799d8ae7	Add debug messages to catch abnormal consumption metric values	2023-02-17 17:57:45 +02:00
Konstantin Knizhnik	b242b0ad67	Fix flaky tests (#3616) ## Describe your changes test_on_demand_download is flaky because it does not wait until the created image layer is transferred to S3. test_tenants_with_remote_storage just leaves garbage at the end of the overwritten file. The right solution for test_on_demand_download is to add an API call that waits for completion of synchronization with S3 (not just based on last record LSN), but right now it is solved using sleep. ## Issue ticket number and link #3209 ## Checklist before requesting a review - [ ] I have performed a self-review of my code. - [ ] If it is a core feature, I have added thorough tests. - [ ] Do we need to implement analytics? if so did you add the relevant metrics to the dashboard? - [ ] If this PR requires public announcement, mark it with /release-notes label and add several sentences in this section.	2023-02-17 15:56:56 +02:00
Dmitry Ivanov	d90cd36bcc	[proxy] Improve tracing spans here and there.	2023-02-17 15:32:14 +03:00
Dmitry Ivanov	956b6f17ca	[proxy] Handle some unix signals. On the surface, this doesn't add much, but there are some benefits: * We can do graceful shutdowns and thus record more code coverage data. * We now have a foundation for the more interesting behaviors, e.g. "stop accepting new connections after SIGTERM but keep serving the existing ones". * We give the otel machinery a chance to flush trace events before finally shutting down.	2023-02-17 15:32:14 +03:00
Heikki Linnakangas	6f9af0aa8c	[proxy] Enable OpenTelemetry tracing. This commit sets up OpenTelemetry tracing and exporter, so that tracing spans can be exported as OpenTelemetry traces as well. All outgoing HTTP requests will be traced. A separate (child) span is created for each outgoing HTTP request, and the tracing context is also propagated to the server in the HTTP headers. If tracing is enabled in the control plane and compute node too, you can now get an end-to-end distributed trace of what happens when a new connection is established, starting from the handshake with the client, creating the 'start_compute' operation in the control plane, starting the compute node, all the way down to fetching the base backup and the availability checks in compute_ctl. Co-authored-by: Dmitry Ivanov <dima@neon.tech>	2023-02-17 15:32:14 +03:00
Joonas Koivunen	8e6b27bf7c	fix: avoid busy loop on replacement failure (#3613 ) Add an AtomicBool per RemoteLayer, use it to mark together with closed semaphore that remotelayer is unusable until restart or ignore+load. https://github.com/neondatabase/neon/issues/3533#issuecomment-1431481554	2023-02-17 14:15:29 +02:00
Joonas Koivunen	ae3eff1ad2	Tracing panic hook (#3475) Fixes #3468. This changes how the panics look and, most importantly, makes sure they are not interleaved with other messages. Adds a `GET /v1/panic` endpoint for panic testing (useful for sentry dedup and this hook testing). The panics are now logged within a new error level span called `panic` which separates it from other error level events. The panic info is unpacked into span fields: - thread=mgmt request worker - location="pageserver/src/http/routes.rs:898:9" Co-authored-by: Christian Schwarz <christian@neon.tech>	2023-02-17 13:56:00 +02:00
Joonas Koivunen	501702b27c	fix: flaky test_compaction_downloads_on_demand_with_image_creation (#3629 ) fix is to stop postgres before the final checkpoint to ensure no inmemory layer gets created. Fixes #3627.	2023-02-17 13:34:26 +02:00
Alexander Bayandin	526f8b76aa	Bump werkzeug from 2.1.2 to 2.2.3 (#3631 ) ## Describe your changes ``` $ poetry add werkzeug@latest "moto[server]@latest" Using version ^2.2.3 for werkzeug Using version ^4.1.2 for moto Updating dependencies Resolving dependencies... (1.6s) Writing lock file Package operations: 0 installs, 2 updates, 1 removal • Removing pytz (2022.1) • Updating werkzeug (2.1.2 -> 2.2.3) • Updating moto (3.1.18 -> 4.1.2) ``` Resolves: - https://github.com/neondatabase/neon/security/dependabot/14 - https://github.com/neondatabase/neon/security/dependabot/13 `@dependabot` failed to create a PR for some reason (I guess because it also needed to handle `moto` dependency) ## Issue ticket number and link N/A ## Checklist before requesting a review - [x] I have performed a self-review of my code. - [x] If it is a core feature, I have added thorough tests. - [x] Do we need to implement analytics? if so did you add the relevant metrics to the dashboard? - [x] If this PR requires public announcement, mark it with /release-notes label and add several sentences in this section.	2023-02-17 08:29:52 +00:00
Sergey Melnikov	a1b062123b	Do not deploy storage to old account (#3630 ) It's gone	2023-02-16 20:28:53 +00:00
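
The idea described in commit 412e0aa985 (#3597), collecting the N largest holes at compaction time and excluding them from the produced layers, can be illustrated with a small standalone sketch. This is not the pageserver code: keys are simplified to plain `u64` values, and `Hole`, `largest_holes`, `split_at_holes`, `max_holes`, and `min_len` are hypothetical names chosen for illustration only.

```rust
use std::cmp::Reverse;
use std::collections::BinaryHeap;

/// A run of unused keys `start..end` (end exclusive) between two stored keys.
/// `len` comes first so the derived ordering compares holes by size.
#[derive(Debug, Clone, Copy, PartialEq, Eq, PartialOrd, Ord)]
struct Hole {
    len: u64,
    start: u64,
    end: u64,
}

/// Return up to `max_holes` largest gaps of at least `min_len` keys,
/// sorted by position. `sorted_keys` must be strictly increasing.
fn largest_holes(sorted_keys: &[u64], max_holes: usize, min_len: u64) -> Vec<Hole> {
    // Min-heap via `Reverse`: the smallest retained hole sits on top.
    let mut heap: BinaryHeap<Reverse<Hole>> = BinaryHeap::new();
    for pair in sorted_keys.windows(2) {
        let (prev, next) = (pair[0], pair[1]);
        let len = next.saturating_sub(prev).saturating_sub(1);
        if len == 0 || len < min_len {
            continue;
        }
        heap.push(Reverse(Hole { len, start: prev + 1, end: next }));
        if heap.len() > max_holes {
            heap.pop(); // evict the smallest hole seen so far
        }
    }
    let mut holes: Vec<Hole> = heap.into_iter().map(|Reverse(h)| h).collect();
    holes.sort_by_key(|h| h.start);
    holes
}

/// Split the keys into separate "layers" so that no layer spans a hole.
/// `holes` must be sorted by `start`, as returned by `largest_holes`.
fn split_at_holes(sorted_keys: &[u64], holes: &[Hole]) -> Vec<Vec<u64>> {
    let mut layers = Vec::new();
    let mut current = Vec::new();
    let mut next_hole = holes.iter().peekable();
    for &key in sorted_keys {
        // Crossing a hole closes the current layer and starts a new one.
        let crossed = matches!(next_hole.peek(), Some(hole) if key >= hole.end);
        if crossed {
            layers.push(std::mem::take(&mut current));
            next_hole.next();
        }
        current.push(key);
    }
    if !current.is_empty() {
        layers.push(current);
    }
    layers
}
```

Keeping only a bounded heap of the largest gaps means the scan needs a single pass and O(N) extra memory. The trade-off matches the one named in the commit message: each excluded hole can add one more, possibly small, layer.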

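
Commit b8b8c19fb4 (#3663) describes an adapter that carries a rate-limiting permit until the response body reader has been disposed, rather than releasing it once the headers arrive. A minimal sketch of that pattern follows, under loud assumptions: it uses the synchronous `std::io::Read` trait in place of `AsyncRead`, a plain atomic counter in place of a real semaphore, and the names `Permit`, `PermitHoldingRead`, and `open_download` are hypothetical and not taken from the repository.

```rust
use std::io::{self, Read};
use std::sync::atomic::{AtomicUsize, Ordering};
use std::sync::Arc;

/// Hypothetical stand-in for a semaphore permit: it only counts how many
/// permits are currently held.
struct Permit {
    in_flight: Arc<AtomicUsize>,
}

impl Permit {
    fn acquire(in_flight: &Arc<AtomicUsize>) -> Permit {
        in_flight.fetch_add(1, Ordering::SeqCst);
        Permit { in_flight: Arc::clone(in_flight) }
    }
}

impl Drop for Permit {
    fn drop(&mut self) {
        // Released only when the owner (the reader below) is dropped.
        self.in_flight.fetch_sub(1, Ordering::SeqCst);
    }
}

/// Reader adapter that owns the permit, so the permit lives exactly as long
/// as the body is still being consumed.
struct PermitHoldingRead<R> {
    inner: R,
    _permit: Permit,
}

impl<R: Read> Read for PermitHoldingRead<R> {
    fn read(&mut self, buf: &mut [u8]) -> io::Result<usize> {
        self.inner.read(buf)
    }
}

/// Simulated download: acquire a permit, then hand back the body wrapped in
/// the adapter instead of dropping the permit when this function returns.
fn open_download<'a>(
    in_flight: &Arc<AtomicUsize>,
    body: &'a [u8],
) -> PermitHoldingRead<&'a [u8]> {
    let permit = Permit::acquire(in_flight);
    PermitHoldingRead { inner: body, _permit: permit }
}
```

Tying the permit to the reader's lifetime through ownership means no explicit release call is needed: dropping the reader, whether after EOF or on an early error, returns the permit.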
2847 Commits