Comparing de0e96d2be...96b2e575e1 - neon

rust/neon

mirror of https://github.com/neondatabase/neon.git synced 2026-03-03 16:30:38 +00:00

Author	SHA1	Message	Date
Shany Pozin	96b2e575e1	Merge pull request #5445 from neondatabase/releases/2023-10-03 Release 2023-10-03	2023-10-04 13:53:37 +03:00
Alexander Bayandin	7222777784	Update checksums for pg_jsonschema & pg_graphql (#5455) ## Problem Folks have re-tagged releases for `pg_jsonschema` and `pg_graphql` (to increase timeouts on their CI). For us, these are no-op changes, but unfortunately this will cause our builds to fail due to a checksum mismatch (this might not strike right away because of the build cache). - `8ba7c7be9d` - `aa7509370a` ## Summary of changes - `pg_jsonschema` update checksum - `pg_graphql` update checksum	2023-10-03 18:44:30 +01:00
Em Sharnoff	5469fdede0	Merge pull request #5422 from neondatabase/sharnoff/rc-2023-09-28-fix-restart-on-postmaster-SIGKILL Release 2023-09-28: Fix (lack of) restart on neonvm postmaster SIGKILL	2023-09-28 10:48:51 -07:00
MMeent	72aa6b9fdd	Fix neon_zeroextend's WAL logging (#5387 ) When you log more than a few blocks, you need to reserve the space in advance. We didn't do that, so we got errors. Now we do that, and shouldn't get errors.	2023-09-28 09:37:28 -07:00
Em Sharnoff	ae0634b7be	Bump vm-builder v0.17.11 -> v0.17.12 (#5407 ) Only relevant change is neondatabase/autoscaling#534 - refer there for more details.	2023-09-28 09:28:04 -07:00
Shany Pozin	70711f32fa	Merge pull request #5375 from neondatabase/releases/2023-09-26 Release 2023-09-26	2023-09-26 15:19:45 +03:00
Vadim Kharitonov	52a88af0aa	Merge pull request #5336 from neondatabase/releases/2023-09-19 Release 2023-09-19	2023-09-19 11:16:43 +02:00
Alexander Bayandin	b7a43bf817	Merge branch 'release' into releases/2023-09-19	2023-09-19 09:07:20 +01:00
Alexander Bayandin	dce91b33a4	Merge pull request #5318 from neondatabase/releases/2023-09-15-1 Postgres 14/15: Use previous extensions versions	2023-09-15 16:30:44 +01:00
Alexander Bayandin	23ee4f3050	Revert plv8 only	2023-09-15 15:45:23 +01:00
Alexander Bayandin	46857e8282	Postgres 14/15: Use previous extensions versions	2023-09-15 15:27:00 +01:00
Alexander Bayandin	368ab0ce54	Merge pull request #5313 from neondatabase/releases/2023-09-15 Release 2023-09-15	2023-09-15 10:39:56 +01:00
Konstantin Knizhnik	a5987eebfd	References to old and new blocks were mixed in xlog_heap_update handler (#5312 ) ## Problem See https://neondb.slack.com/archives/C05L7D1JAUS/p1694614585955029 https://www.notion.so/neondatabase/Duplicate-key-issue-651627ce843c45188fbdcb2d30fd2178 ## Summary of changes Swap old/new block references ## Checklist before requesting a review - [ ] I have performed a self-review of my code. - [ ] If it is a core feature, I have added thorough tests. - [ ] Do we need to implement analytics? if so did you add the relevant metrics to the dashboard? - [ ] If this PR requires public announcement, mark it with /release-notes label and add several sentences in this section. ## Checklist before merging - [ ] Do not forget to reformat commit message to not include the above checklist --------- Co-authored-by: Konstantin Knizhnik <knizhnik@neon.tech> Co-authored-by: Heikki Linnakangas <heikki@neon.tech>	2023-09-15 10:11:41 +01:00
Alexander Bayandin	6686ede30f	Update checksum for pg_hint_plan (#5309 ) ## Problem The checksum for `pg_hint_plan` doesn't match: ``` sha256sum: WARNING: 1 computed checksum did NOT match ``` Ref https://github.com/neondatabase/neon/actions/runs/6185715461/job/16793609251?pr=5307 It seems that the release was retagged yesterday: https://github.com/ossc-db/pg_hint_plan/releases/tag/REL16_1_6_0 I don't see any malicious changes from 15_1.5.1: https://github.com/ossc-db/pg_hint_plan/compare/REL15_1_5_1...REL16_1_6_0, so it should be ok to update. ## Summary of changes - Update checksum for `pg_hint_plan` 16_1.6.0	2023-09-15 09:54:42 +01:00
Em Sharnoff	373c7057cc	vm-monitor: Fix cgroup throttling (#5303 ) I believe this (not actual IO problems) is the cause of the "disk speed issue" that we've had for VMs recently. See e.g.: 1. https://neondb.slack.com/archives/C03H1K0PGKH/p1694287808046179?thread_ts=1694271790.580099&cid=C03H1K0PGKH 2. https://neondb.slack.com/archives/C03H1K0PGKH/p1694511932560659 The vm-informant (and now, the vm-monitor, its replacement) is supposed to gradually increase the `neon-postgres` cgroup's memory.high value, because otherwise the kernel will throttle all the processes in the cgroup. This PR fixes a bug with the vm-monitor's implementation of this behavior. --- Other references, for the vm-informant's implementation: - Original issue: neondatabase/autoscaling#44 - Original PR: neondatabase/autoscaling#223	2023-09-15 09:54:42 +01:00
Shany Pozin	7d6ec16166	Merge pull request #5296 from neondatabase/releases/2023-09-13 Release 2023-09-13	2023-09-13 13:49:14 +03:00
Shany Pozin	0e6fdc8a58	Merge pull request #5283 from neondatabase/releases/2023-09-12 Release 2023-09-12	2023-09-12 14:56:47 +03:00
Christian Schwarz	521438a5c6	fix deadlock around TENANTS (#5285) The sequence that can lead to a deadlock: 1. DELETE request gets all the way to `tenant.shutdown(progress, false).await.is_err()`, while holding TENANTS.read() 2. POST request for tenant creation comes in, calls `tenant_map_insert`, it does `let mut guard = TENANTS.write().await;` 3. Something that `tenant.shutdown()` needs to wait for needs a `TENANTS.read().await`. The only case identified in exhaustive manual scanning of the code base is this one: Imitate size access does `get_tenant().await`, which does `TENANTS.read().await` under the hood. In the above case (1) waits for (3), (3)'s read-lock request is queued behind (2)'s write-lock, and (2) waits for (1). Deadlock. I made a reproducer/proof-that-above-hypothesis-holds in https://github.com/neondatabase/neon/pull/5281, but it's not ready for merge yet and we want the fix _now_. fixes https://github.com/neondatabase/neon/issues/5284	2023-09-12 14:13:13 +03:00
Vadim Kharitonov	07d7874bc8	Merge pull request #5202 from neondatabase/releases/2023-09-05 Release 2023-09-05	2023-09-05 12:16:06 +02:00
Anastasia Lubennikova	1804111a02	Merge pull request #5161 from neondatabase/rc-2023-08-31 Release 2023-08-31	2023-08-31 16:53:17 +03:00
Arthur Petukhovsky	cd0178efed	Merge pull request #5150 from neondatabase/release-sk-fix-active-timeline Release 2023-08-30	2023-08-30 11:43:39 +02:00
Shany Pozin	333574be57	Merge pull request #5133 from neondatabase/releases/2023-08-29 Release 2023-08-29	2023-08-29 14:02:58 +03:00
Alexander Bayandin	79a799a143	Merge branch 'release' into releases/2023-08-29	2023-08-29 11:17:57 +01:00
Conrad Ludgate	9da06af6c9	Merge pull request #5113 from neondatabase/release-http-connection-fix Release 2023-08-25	2023-08-25 17:21:35 +01:00
Conrad Ludgate	ce1753d036	proxy: dont return connection pending (#5107 ) ## Problem We were returning Pending when a connection had a notice/notification (introduced recently in #5020). When returning pending, the runtime assumes you will call `cx.waker().wake()` in order to continue processing. We weren't doing that, so the connection task would get stuck ## Summary of changes Don't return pending. Loop instead	2023-08-25 16:42:30 +01:00
Alek Westover	67db8432b4	Fix cargo deny errors (#5068 ) ## Problem cargo deny lint broken Links to the CVEs: [rustsec.org/advisories/RUSTSEC-2023-0052](https://rustsec.org/advisories/RUSTSEC-2023-0052) [rustsec.org/advisories/RUSTSEC-2023-0053](https://rustsec.org/advisories/RUSTSEC-2023-0053) One is fixed, the other one isn't so we allow it (for now), to unbreak CI. Then later we'll try to get rid of webpki in favour of the rustls fork. ## Summary of changes ``` +ignore = ["RUSTSEC-2023-0052"] ```	2023-08-25 16:42:30 +01:00
Vadim Kharitonov	4e2e44e524	Enable neon-pool-opt-in (#5062 )	2023-08-22 09:06:14 +01:00
Vadim Kharitonov	ed786104f3	Merge pull request #5060 from neondatabase/releases/2023-08-22 Release 2023-08-22	2023-08-22 09:41:02 +02:00
Stas Kelvich	84b74f2bd1	Merge pull request #4997 from neondatabase/sk/proxy-release-23-07-15 Fix lint	2023-08-15 18:54:20 +03:00
Arthur Petukhovsky	fec2ad6283	Fix lint	2023-08-15 18:49:02 +03:00
Stas Kelvich	98eebd4682	Merge pull request #4996 from neondatabase/sk/proxy_release Disable neon-pool-opt-in	2023-08-15 18:37:50 +03:00
Arthur Petukhovsky	2f74287c9b	Disable neon-pool-opt-in	2023-08-15 18:34:17 +03:00
Shany Pozin	aee1bf95e3	Merge pull request #4990 from neondatabase/releases/2023-08-15 Release 2023-08-15	2023-08-15 15:34:38 +03:00
Shany Pozin	b9de9d75ff	Merge branch 'release' into releases/2023-08-15	2023-08-15 14:35:00 +03:00
Stas Kelvich	7943b709e6	Merge pull request #4940 from neondatabase/sk/release-23-05-25-proxy-fixup Release: proxy retry fixup	2023-08-09 13:53:19 +03:00
Conrad Ludgate	d7d066d493	proxy: delay auth on retry (#4929) ## Problem When an endpoint is shutting down, it can take a few seconds. Currently when starting a new compute, this causes an "endpoint is in transition" error. We need to add delays before retrying to ensure that we allow time for the endpoint to shut down properly. ## Summary of changes Adds a delay before retrying in auth. connect_to_compute already has this delay	2023-08-09 12:54:24 +03:00
Felix Prasanna	e78ac22107	release fix: revert vm builder bump from 0.13.1 -> 0.15.0-alpha1 (#4932 ) This reverts commit `682dfb3a31`. hotfix for a CLI arg issue in the monitor	2023-08-08 21:08:46 +03:00
Vadim Kharitonov	76a8f2bb44	Merge pull request #4923 from neondatabase/releases/2023-08-08 Release 2023-08-08	2023-08-08 11:44:38 +02:00
Vadim Kharitonov	8d59a8581f	Merge branch 'release' into releases/2023-08-08	2023-08-08 10:54:34 +02:00
Vadim Kharitonov	b1ddd01289	Define NEON_SMGR to make it possible for extensions to use Neon SMG API (#4889 ) Co-authored-by: Konstantin Knizhnik <knizhnik@garret.ru> Co-authored-by: Konstantin Knizhnik <knizhnik@neon.tech>	2023-08-03 16:28:31 +03:00
Alexander Bayandin	6eae4fc9aa	Release 2023-08-02: update pg_embedding (#4877 ) Cherry-picking `ca4d71a954` from `main` into the `release` Co-authored-by: Vadim Kharitonov <vadim2404@users.noreply.github.com>	2023-08-03 08:48:09 +02:00
Christian Schwarz	765455bca2	Merge pull request #4861 from neondatabase/releases/2023-08-01--2-fix-pipeline ci: fix upload-postgres-extensions-to-s3 job	2023-08-01 13:22:07 +02:00
Christian Schwarz	4204960942	ci: fix upload-postgres-extensions-to-s3 job commit commit `5f8fd640bf` Author: Alek Westover <alek.westover@gmail.com> Date: Wed Jul 26 08:24:03 2023 -0400 Upload Test Remote Extensions (#4792) switched to using the release tag instead of `latest`, but, the `promote-images` job only uploads `latest` to the prod ECR. The switch to using release tag was good in principle, but, reverting that part to make the release pipeline work. Note that a proper fix should abandon use of `:latest` tag at all: currently, if a `main` pipeline runs concurrently with a `release` pipeline, the `release` pipeline may end up using the `main` pipeline's images.	2023-08-01 12:01:45 +02:00
Christian Schwarz	67345d66ea	Merge pull request #4858 from neondatabase/releases/2023-08-01 Release 2023-08-01	2023-08-01 10:44:01 +02:00
Shany Pozin	2266ee5971	Merge pull request #4803 from neondatabase/releases/2023-07-25 Release 2023-07-25	2023-07-25 14:21:07 +03:00
Shany Pozin	b58445d855	Merge pull request #4746 from neondatabase/releases/2023-07-18 Release 2023-07-18	2023-07-18 14:45:39 +03:00
Conrad Ludgate	36050e7f3d	Merge branch 'release' into releases/2023-07-18	2023-07-18 12:00:09 +01:00
Alexander Bayandin	33360ed96d	Merge pull request #4705 from neondatabase/release-2023-07-12 Release 2023-07-12 (only proxy)	2023-07-12 19:44:36 +01:00
Conrad Ludgate	39a28d1108	proxy wake_compute loop (#4675 ) ## Problem If we fail to wake up the compute node, a subsequent connect attempt will definitely fail. However, kubernetes won't fail the connection immediately, instead it hangs until we timeout (10s). ## Summary of changes Refactor the loop to allow fast retries of compute_wake and to skip a connect attempt.	2023-07-12 18:40:11 +01:00
Conrad Ludgate	efa6aa134f	allow repeated IO errors from compute node (#4624 ) ## Problem #4598 compute nodes are not accessible some time after wake up due to kubernetes DNS not being fully propagated. ## Summary of changes Update connect retry mechanism to support handling IO errors and sleeping for 100ms ## Checklist before requesting a review - [x] I have performed a self-review of my code. - [ ] If it is a core feature, I have added thorough tests. - [ ] Do we need to implement analytics? if so did you add the relevant metrics to the dashboard? - [ ] If this PR requires public announcement, mark it with /release-notes label and add several sentences in this section.	2023-07-12 18:40:06 +01:00
Alexander Bayandin	2c724e56e2	Merge pull request #4646 from neondatabase/releases/2023-07-06-hotfix Release 2023-07-06 (add pg_embedding extension only)	2023-07-06 12:19:52 +01:00
Alexander Bayandin	feff887c6f	Compile `pg_embedding` extension (#4634 ) ``` CREATE EXTENSION embedding; CREATE TABLE t (val real[]); INSERT INTO t (val) VALUES ('{0,0,0}'), ('{1,2,3}'), ('{1,1,1}'), (NULL); CREATE INDEX ON t USING hnsw (val) WITH (maxelements = 10, dims=3, m=3); INSERT INTO t (val) VALUES (array[1,2,4]); SELECT * FROM t ORDER BY val <-> array[3,3,3]; val --------- {1,2,3} {1,2,4} {1,1,1} {0,0,0} (5 rows) ```	2023-07-06 09:39:41 +01:00
Vadim Kharitonov	353d915fcf	Merge pull request #4633 from neondatabase/releases/2023-07-05 Release 2023-07-05	2023-07-05 15:10:47 +02:00
Vadim Kharitonov	2e38098cbc	Merge branch 'release' into releases/2023-07-05	2023-07-05 12:41:48 +02:00
Vadim Kharitonov	a6fe5ea1ac	Merge pull request #4571 from neondatabase/releases/2023-06-27 Release 2023-06-27	2023-06-27 12:55:33 +02:00
Vadim Kharitonov	05b0aed0c1	Merge branch 'release' into releases/2023-06-27	2023-06-27 12:22:12 +02:00
Alex Chi Z	cd1705357d	Merge pull request #4561 from neondatabase/releases/2023-06-23-hotfix Release 2023-06-23 (pageserver-only)	2023-06-23 15:38:50 -04:00
Christian Schwarz	6bc7561290	don't use MGMT_REQUEST_RUNTIME for consumption metrics synthetic size worker The consumption metrics synthetic size worker does logical size calculation. Logical size calculation currently does synchronous disk IO. This blocks the MGMT_REQUEST_RUNTIME's executor threads, starving other futures. While there's work on the way to move the synchronous disk IO into spawn_blocking, the quickfix here is to use the BACKGROUND_RUNTIME instead of MGMT_REQUEST_RUNTIME. Actually it's not just a quickfix. We simply shouldn't be blocking MGMT_REQUEST_RUNTIME executor threads on CPU or sync disk IO. That work isn't done yet, as many of the mgmt tasks still _do_ disk IO. But it's not as intensive as the logical size calculations that we're fixing here. While we're at it, fix disk-usage-based eviction in a similar way. It wasn't the culprit here, according to prod logs, but it can theoretically be a little CPU-intensive. More context, including graphs from Prod: https://neondb.slack.com/archives/C03F5SM1N02/p1687541681336949 (cherry picked from commit `d6e35222ea`)	2023-06-23 20:54:07 +02:00
Christian Schwarz	fbd3ac14b5	Merge pull request #4544 from neondatabase/releases/2023-06-21-hotfix Release 2023-06-21 (fixup for post-merge failed 2023-06-20)	2023-06-21 16:54:34 +03:00
Christian Schwarz	e437787c8f	cargo update -p openssl (#4542 ) To unblock release https://github.com/neondatabase/neon/pull/4536#issuecomment-1600678054 Context: https://rustsec.org/advisories/RUSTSEC-2023-0044	2023-06-21 15:52:56 +03:00
Christian Schwarz	3460dbf90b	Merge pull request #4536 from neondatabase/releases/2023-06-20 Release 2023-06-20 (actually 2023-06-21)	2023-06-21 14:19:14 +03:00
Vadim Kharitonov	6b89d99677	Merge pull request #4521 from neondatabase/release_2023-06-15 Release 2023 06 15	2023-06-15 17:40:01 +02:00
Vadim Kharitonov	6cc8ea86e4	Merge branch 'main' into release_2023-06-15	2023-06-15 16:50:44 +02:00
Shany Pozin	e62a492d6f	Merge pull request #4486 from neondatabase/releases/2023-06-13 Release 2023-06-13	2023-06-13 15:21:35 +03:00
Alexey Kondratov	a475cdf642	[compute_ctl] Fix logging if catalog updates are skipped (#4480 ) Otherwise, it wasn't clear from the log when Postgres started up completely if catalog updates were skipped. Follow-up for `4936ab6`	2023-06-13 13:37:24 +02:00
Stas Kelvich	7002c79a47	Merge pull request #4447 from neondatabase/release_proxy_08-06-2023 Release proxy 08 06 2023	2023-06-08 21:02:54 +03:00
Vadim Kharitonov	ee6cf357b4	Merge pull request #4427 from neondatabase/releases/2023-06-06 Release 2023-06-06	2023-06-06 14:42:21 +02:00
Vadim Kharitonov	e5c2086b5f	Merge branch 'release' into releases/2023-06-06	2023-06-06 12:33:56 +02:00
Shany Pozin	5f1208296a	Merge pull request #4395 from neondatabase/releases/2023-06-01 Release 2023-06-01	2023-06-01 10:58:00 +03:00
Stas Kelvich	88e8e473cd	Merge pull request #4345 from neondatabase/release-23-05-25-proxy Release 23-05-25, take 3	2023-05-25 19:40:43 +03:00
Stas Kelvich	b0a77844f6	Add SQL-over-HTTP endpoint to Proxy This commit introduces an SQL-over-HTTP endpoint in the proxy, with a JSON response structure resembling that of the node-postgres driver. This method, using HTTP POST, achieves smaller amortized latencies in edge setups due to fewer round trips and an enhanced open connection reuse by the v8 engine. This update involves several intricacies: 1. SQL injection protection: We employed the extended query protocol, modifying the rust-postgres driver to send queries in one roundtrip using a text protocol rather than binary, bypassing potential issues like those identified in https://github.com/sfackler/rust-postgres/issues/1030. 2. Postgres type compatibility: As not all postgres types have binary representations (e.g., acl's in pg_class), we adjusted rust-postgres to respond with text protocol, simplifying serialization and fixing queries with text-only types in response. 3. Data type conversion: Considering JSON supports fewer data types than Postgres, we perform conversions where possible, passing all other types as strings. Key conversions include: - postgres int2, int4, float4, float8 -> json number (NaN and Inf remain text) - postgres bool, null, text -> json bool, null, string - postgres array -> json array - postgres json and jsonb -> json object 4. Alignment with node-postgres: To facilitate integration with js libraries, we've matched the response structure of node-postgres, returning command tags and column oids. Command tag capturing was added to the rust-postgres functionality as part of this change.	2023-05-25 17:59:17 +03:00
Vadim Kharitonov	1baf464307	Merge pull request #4309 from neondatabase/releases/2023-05-23 Release 2023-05-23	2023-05-24 11:56:54 +02:00
Alexander Bayandin	e9b8e81cea	Merge branch 'release' into releases/2023-05-23	2023-05-23 12:54:08 +01:00
Alexander Bayandin	85d6194aa4	Fix regress-tests job for Postgres 15 on release branch (#4254 ) ## Problem Compatibility tests don't support Postgres 15 yet, but we're still trying to upload compatibility snapshot (which we do not collect). Ref https://github.com/neondatabase/neon/actions/runs/4991394158/jobs/8940369368#step:4:38129 ## Summary of changes Add `pg_version` parameter to `run-python-test-set` actions and do not upload compatibility snapshot for Postgres 15	2023-05-16 17:19:12 +01:00
Vadim Kharitonov	333a7a68ef	Merge pull request #4245 from neondatabase/releases/2023-05-16 Release 2023-05-16	2023-05-16 13:38:40 +02:00
Vadim Kharitonov	6aa4e41bee	Merge branch 'release' into releases/2023-05-16	2023-05-16 12:48:23 +02:00
Joonas Koivunen	840183e51f	try: higher page_service timeouts to isolate an issue	2023-05-11 16:24:53 +03:00
Shany Pozin	cbccc94b03	Merge pull request #4184 from neondatabase/releases/2023-05-09 Release 2023-05-09	2023-05-09 15:30:36 +03:00
Stas Kelvich	fce227df22	Merge pull request #4163 from neondatabase/main Release 23-05-05	2023-05-05 15:56:23 +03:00
Stas Kelvich	bd787e800f	Merge pull request #4133 from neondatabase/main Release 23-04-01	2023-05-01 18:52:46 +03:00
Shany Pozin	4a7704b4a3	Merge pull request #4131 from neondatabase/sp/hotfix_adding_sks_us_west Hotfix: Adding 4 new pageservers and two sets of safekeepers to us west 2	2023-05-01 15:17:38 +03:00
Shany Pozin	ff1119da66	Add 2 new sets of safekeepers to us-west2	2023-05-01 14:35:31 +03:00
Shany Pozin	4c3ba1627b	Add 4 new Pageservers for retool launch	2023-05-01 14:34:38 +03:00
Vadim Kharitonov	1407174fb2	Merge pull request #4110 from neondatabase/vk/release_2023-04-28 Release 2023 04 28	2023-04-28 17:43:16 +02:00
Vadim Kharitonov	ec9dcb1889	Merge branch 'release' into vk/release_2023-04-28	2023-04-28 16:32:26 +02:00
Joonas Koivunen	d11d781afc	revert: "Add check for duplicates of generated image layers" (#4104 ) This reverts commit `732acc5`. Reverted PR: #3869 As noted in PR #4094, we do in fact try to insert duplicates to the layer map, if L0->L1 compaction is interrupted. We do not have a proper fix for that right now, and we are in a hurry to make a release to production, so revert the changes related to this to the state that we have in production currently. We know that we have a bug here, but better to live with the bug that we've had in production for a long time, than rush a fix to production without testing it in staging first. Cc: #4094, #4088	2023-04-28 16:31:35 +02:00
Anastasia Lubennikova	4e44565b71	Merge pull request #4000 from neondatabase/releases/2023-04-11 Release 2023-04-11	2023-04-11 17:47:41 +03:00
Stas Kelvich	4ed51ad33b	Add more proxy cnames	2023-04-11 15:59:35 +03:00
Arseny Sher	1c1ebe5537	Merge pull request #3946 from neondatabase/releases/2023-04-04 Release 2023-04-04	2023-04-04 14:38:40 +04:00
Christian Schwarz	c19cb7f386	Merge pull request #3935 from neondatabase/releases/2023-04-03 Release 2023-04-03	2023-04-03 16:19:49 +02:00
Vadim Kharitonov	4b97d31b16	Merge pull request #3896 from neondatabase/releases/2023-03-28 Release 2023-03-28	2023-03-28 17:58:06 +04:00
Shany Pozin	923ade3dd7	Merge pull request #3855 from neondatabase/releases/2023-03-21 Release 2023-03-21	2023-03-21 13:12:32 +02:00
Arseny Sher	b04e711975	Merge pull request #3825 from neondatabase/release-2023-03-15 Release 2023.03.15	2023-03-15 15:38:00 +03:00
Arseny Sher	afd0a6b39a	Forward framed read buf contents to compute before proxy pass. Otherwise they get lost. Normally buffer is empty before proxy pass, but this is not the case with pipeline mode of our npm driver; fixes connection hangup introduced by `b80fe41af3` for it. fixes https://github.com/neondatabase/neon/issues/3822	2023-03-15 15:36:06 +04:00
Lassi Pölönen	99752286d8	Use RollingUpdate strategy also for legacy proxy (#3814 ) ## Describe your changes We have previously changed the neon-proxy to use RollingUpdate. This should be enabled in legacy proxy too in order to avoid breaking connections for the clients and allow for example backups to run even during deployment. (https://github.com/neondatabase/neon/pull/3683) ## Issue ticket number and link https://github.com/neondatabase/neon/issues/3333	2023-03-15 15:35:51 +04:00
Arseny Sher	15df93363c	Merge pull request #3804 from neondatabase/release-2023-03-13 Release 2023.03.13	2023-03-13 20:25:40 +03:00
Vadim Kharitonov	bc0ab741af	Merge pull request #3758 from neondatabase/releases/2023-03-07 Release 2023-03-07	2023-03-07 12:38:47 +01:00
Christian Schwarz	51d9dfeaa3	Merge pull request #3743 from neondatabase/releases/2023-03-03 Release 2023-03-03	2023-03-03 19:20:21 +01:00
Shany Pozin	f63cb18155	Merge pull request #3713 from neondatabase/releases/2023-02-28 Release 2023-02-28	2023-02-28 12:52:24 +02:00
Arseny Sher	0de603d88e	Merge pull request #3707 from neondatabase/release-2023-02-24 Release 2023-02-24 Hotfix for UNLOGGED tables. Contains #3706 Also contains rebase on 14.7 and 15.2 #3581	2023-02-25 00:32:11 +04:00
Heikki Linnakangas	240913912a	Fix UNLOGGED tables. Instead of trying to create missing files on the way, send init fork contents as main fork from pageserver during basebackup. Add test for that. Call put_rel_drop for init forks; previously they weren't removed. Bump vendor/postgres to revert previous approach on Postgres side. Co-authored-by: Arseny Sher <sher-ars@yandex.ru> ref https://github.com/neondatabase/postgres/pull/264 ref https://github.com/neondatabase/postgres/pull/259 ref https://github.com/neondatabase/neon/issues/1222	2023-02-24 23:54:53 +04:00
MMeent	91a4ea0de2	Update vendored PostgreSQL versions to 14.7 and 15.2 (#3581 ) ## Describe your changes Rebase vendored PostgreSQL onto 14.7 and 15.2 ## Issue ticket number and link #3579 ## Checklist before requesting a review - [x] I have performed a self-review of my code. - [x] If it is a core feature, I have added thorough tests. - [ ] Do we need to implement analytics? if so did you add the relevant metrics to the dashboard? - [x] If this PR requires public announcement, mark it with /release-notes label and add several sentences in this section. ``` The version of PostgreSQL that we use is updated to 14.7 for PostgreSQL 14 and 15.2 for PostgreSQL 15. ```	2023-02-24 23:54:42 +04:00
Arseny Sher	8608704f49	Merge pull request #3691 from neondatabase/release-2023-02-23 Release 2023-02-23 Hotfix for the unlogged tables with indexes issue. neondatabase/postgres#259 neondatabase/postgres#262	2023-02-23 13:39:33 +04:00
Arseny Sher	efef68ce99	Bump vendor/postgres to include hotfix for unlogged tables with indexes. https://github.com/neondatabase/postgres/pull/259 https://github.com/neondatabase/postgres/pull/262	2023-02-23 08:49:43 +04:00
Joonas Koivunen	8daefd24da	Merge pull request #3679 from neondatabase/releases/2023-02-22 Releases/2023-02-22	2023-02-22 15:56:55 +02:00
Arthur Petukhovsky	46cc8b7982	Remove safekeeper-1.ap-southeast-1.aws.neon.tech (#3671 ) We migrated all timelines to `safekeeper-3.ap-southeast-1.aws.neon.tech`, now old instance can be removed.	2023-02-22 15:07:57 +02:00
Sergey Melnikov	38cd90dd0c	Add -v to ansible invocations (#3670 ) To get more debug output on failures	2023-02-22 15:07:57 +02:00
Joonas Koivunen	a51b269f15	fix: hold permit until GetObject eof (#3663 ) previously we applied the ratelimiting only up to receiving the headers from s3, or somewhere near it. the commit adds an adapter which carries the permit until the AsyncRead has been disposed. fixes #3662.	2023-02-22 15:07:57 +02:00
Joonas Koivunen	43bf6d0a0f	calculate_logical_size: no longer use spawn_blocking (#3664) Calculation of logical size is now async because of layer downloads, so we shouldn't use spawn_blocking for it. Use of `spawn_blocking` exhausted resources which are needed by `tokio::io::copy` when copying from a stream to a file, which led to a deadlock. Fixes: #3657	2023-02-22 15:07:57 +02:00
Joonas Koivunen	15273a9b66	chore: ignore all compaction inactive tenant errors (#3665 ) these are happening in tests because of #3655 but they sure took some time to appear. makes the `Compaction failed, retrying in 2s: Cannot run compaction iteration on inactive tenant` into a globally allowed error, because it has been seen failing on different test cases.	2023-02-22 15:07:57 +02:00
Joonas Koivunen	78aca668d0	fix: log download failed error (#3661 ) Fixes #3659	2023-02-22 15:07:57 +02:00
Vadim Kharitonov	acbf4148ea	Merge pull request #3656 from neondatabase/releases/2023-02-21 Release 2023-02-21	2023-02-21 16:03:48 +01:00
Vadim Kharitonov	6508540561	Merge branch 'release' into releases/2023-02-21	2023-02-21 15:31:16 +01:00
Arthur Petukhovsky	a41b5244a8	Add new safekeeper to ap-southeast-1 prod (#3645 ) (#3646 ) To trigger deployment of #3645 to production.	2023-02-20 15:22:49 +00:00
Shany Pozin	2b3189be95	Merge pull request #3600 from neondatabase/releases/2023-02-14 Release 2023-02-14	2023-02-15 13:31:30 +02:00
Vadim Kharitonov	248563c595	Merge pull request #3553 from neondatabase/releases/2023-02-07 Release 2023-02-07	2023-02-07 14:07:44 +01:00
Vadim Kharitonov	14cd6ca933	Merge branch 'release' into releases/2023-02-07	2023-02-07 12:11:56 +01:00
Vadim Kharitonov	eb36403e71	Release 2023 01 31 (#3497 ) Co-authored-by: Kirill Bulatov <kirill@neon.tech> Co-authored-by: Heikki Linnakangas <heikki@neon.tech> Co-authored-by: Anastasia Lubennikova <anastasia@neon.tech> Co-authored-by: bojanserafimov <bojan.serafimov7@gmail.com> Co-authored-by: Christian Schwarz <christian@neon.tech> Co-authored-by: Alexey Kondratov <kondratov.aleksey@gmail.com> Co-authored-by: Joonas Koivunen <joonas@neon.tech> Co-authored-by: Konstantin Knizhnik <knizhnik@garret.ru> Co-authored-by: Shany Pozin <shany@neon.tech> Co-authored-by: Sergey Melnikov <sergey@neon.tech> Co-authored-by: Dmitry Rodionov <dmitry@neon.tech> Co-authored-by: Rory de Zoete <33318916+zoete@users.noreply.github.com> Co-authored-by: Rory de Zoete <rdezoete@Rorys-Mac-Studio.fritz.box> Co-authored-by: Rory de Zoete <rdezoete@RorysMacStudio.fritz.box> Co-authored-by: Lassi Pölönen <lassi.polonen@iki.fi>	2023-01-31 15:06:35 +02:00
Anastasia Lubennikova	3c6f779698	Merge pull request #3411 from neondatabase/release_2023_01_23 Fix Release 2023 01 23	2023-01-23 20:10:03 +02:00
Joonas Koivunen	f67f0c1c11	More tenant size fixes (#3410 ) Small changes, but hopefully this will help with the panic detected in staging, for which we cannot get the debugging information right now (end-of-branch before branch-point).	2023-01-23 17:46:13 +02:00
Shany Pozin	edb02d3299	Adding pageserver3 to staging (#3403 )	2023-01-23 17:46:13 +02:00
Konstantin Knizhnik	664a69e65b	Fix slru_segment_key_range function: segno was assigned to incorrect Key field (#3354 )	2023-01-23 17:46:13 +02:00
Anastasia Lubennikova	478322ebf9	Fix tenant size orphans (#3377) Before only the timelines which have passed the `gc_horizon` were processed which failed with orphans at the tree_sort phase. Example input in added `test_branched_empty_timeline_size` test case. The PR changes iteration to happen through all timelines, and in addition to that, any learned branch points will be calculated as they would have been in the original implementation if the ancestor branch had been over the `gc_horizon`. This also changes how tenants where all timelines are below `gc_horizon` are handled. Previously tenant_size 0 was returned, but now they will have approximately `initdb_lsn` worth of tenant_size. The PR also adds several new tenant size tests that describe various corner cases of branching structure and `gc_horizon` setting. They are currently disabled to not consume time during CI. Co-authored-by: Joonas Koivunen <joonas@neon.tech> Co-authored-by: Anastasia Lubennikova <anastasia@neon.tech>	2023-01-23 17:46:13 +02:00
Joonas Koivunen	802f174072	fix: dont stop pageserver if we fail to calculate synthetic size	2023-01-23 17:46:13 +02:00
Alexey Kondratov	47f9890bae	[compute_ctl] Make role deletion spec processing idempotent (#3380) Previously, we were trying to re-assign owned objects of the already deleted role. This was causing a crash loop in the case when compute was restarted with a spec that includes delta operation for role deletion. To avoid such cases, check that role is still present before calling `reassign_owned_objects`. Resolves neondatabase/cloud#3553	2023-01-23 17:46:13 +02:00
Christian Schwarz	262265daad	Revert "Use actual temporary dir for pageserver unit tests" This reverts commit `826e89b9ce`. The problem with that commit was that it deletes the TempDir while there are still EphemeralFile instances open. At first I thought this could be fixed by simply adding Handle::current().block_on(task_mgr::shutdown(None, Some(tenant_id), None)) to TenantHarness::drop, but it turned out to be insufficient. So, reverting the commit until we find a proper solution. refs https://github.com/neondatabase/neon/issues/3385	2023-01-23 17:46:13 +02:00
bojanserafimov	300da5b872	Improve layer map docstrings (#3382 )	2023-01-23 17:46:13 +02:00
Heikki Linnakangas	7b22b5c433	Switch to 'tracing' for logging, restructure code to make use of spans. Refactors Compute::prepare_and_run. It's split into subroutines differently, to make it easier to attach tracing spans to the different stages. The high-level logic for waiting for Postgres to exit is moved to the caller. Replace 'env_logger' with 'tracing', and add `#instrument` directives to different stages of the startup process. This is a fairly mechanical change, except for the changes in 'spec.rs'. 'spec.rs' contained some complicated formatting, where parts of log messages were printed directly to stdout with `print`s. That was a bit messed up because the log normally goes to stderr, but those lines were printed to stdout. In our docker images, stderr and stdout both go to the same place so you wouldn't notice, but I don't think it was intentional. This changes the log format to the default 'tracing_subscriber::format' format. It's different from the Postgres log format, however, and because both compute_tools and Postgres print to the same log, it's now a mix of two different formats. I'm not sure how the Grafana log parsing pipeline can handle that. If it's a problem, we can build a custom formatter to change the compute_tools log format to be the same as Postgres's, like it was before this commit, or we can change the Postgres log format to match tracing_formatter's, or we can start printing compute_tools' log output to a different destination than Postgres.	2023-01-23 17:46:12 +02:00
Kirill Bulatov	ffca97bc1e	Enable logs in unit tests	2023-01-23 17:46:12 +02:00
Kirill Bulatov	cb356f3259	Use actual temporary dir for pageserver unit tests	2023-01-23 17:46:12 +02:00
Vadim Kharitonov	c85374295f	Change SENTRY_ENVIRONMENT from "development" to "staging"	2023-01-23 17:46:12 +02:00
Anastasia Lubennikova	4992160677	Fix metric_collection_endpoint for prod. It was incorrectly set to staging url	2023-01-23 17:46:12 +02:00
Heikki Linnakangas	bd535b3371	If an error happens while checking for core dumps, don't panic. If we panic, we skip the 30s wait in 'main', and don't give the console a chance to observe the error. Which is not nice. Spotted by @ololobus at https://github.com/neondatabase/neon/pull/3352#discussion_r1072806981	2023-01-23 17:46:12 +02:00
Kirill Bulatov	d90c5a03af	Add more io::Error context when fail to operate on a path (#3254) I have a test failure that shows ``` Caused by: 0: Failed to reconstruct a page image: 1: Directory not empty (os error 39) ``` but does not really show where exactly that happens. https://neon-github-public-dev.s3.amazonaws.com/reports/pr-3227/release/3823785365/index.html#categories/c0057473fc9ec8fb70876fd29a171ce8/7088dab272f2c7b7/?attachment=60fe6ed2add4d82d The PR aims to add more context in debugging that issue.	2023-01-23 17:46:12 +02:00
Anastasia Lubennikova	2d02cc9079	Merge pull request #3365 from neondatabase/main Release 2023-01-17	2023-01-17 16:41:34 +02:00
Christian Schwarz	49ad94b99f	Merge pull request #3301 from neondatabase/release-2023-01-10 Release 2023-01-10	2023-01-10 16:42:26 +01:00
Christian Schwarz	948a217398	Merge commit '95bf19b85a06b27a7fc3118dee03d48648efab15' into release-2023-01-10 Conflicts: .github/helm-values/neon-stress.proxy-scram.yaml .github/helm-values/neon-stress.proxy.yaml .github/helm-values/staging.proxy-scram.yaml .github/helm-values/staging.proxy.yaml All of the above were deleted in `main` after we hotfixed them in `release`. Deleting them here. storage_broker/src/bin/storage_broker.rs: Hotfix toned down logging, but `main` has since implemented a proper fix. Taken `main`'s side, see https://neondb.slack.com/archives/C033RQ5SPDH/p1673354385387479?thread_ts=1673354306.474729&cid=C033RQ5SPDH closes https://github.com/neondatabase/neon/issues/3287	2023-01-10 15:40:14 +01:00
Dmitry Rodionov	125381eae7	Merge pull request #3236 from neondatabase/dkr/retrofit-sk4-sk4-change Move zenith-1-sk-3 to zenith-1-sk-4 (#3164)	2022-12-30 14:13:50 +03:00
Arthur Petukhovsky	cd01bbc715	Move zenith-1-sk-3 to zenith-1-sk-4 (#3164)	2022-12-30 12:32:52 +02:00
Dmitry Rodionov	d8b5e3b88d	Merge pull request #3229 from neondatabase/dkr/add-pageserver-for-release add pageserver to new region see https://github.com/neondatabase/aws/pull/116 decrease log volume for pageserver	2022-12-30 12:34:04 +03:00
Dmitry Rodionov	06d25f2186	switch to debug from info to produce less noise	2022-12-29 17:48:47 +02:00
Dmitry Rodionov	f759b561f3	add pageserver to new region see https://github.com/neondatabase/aws/pull/116	2022-12-29 17:17:35 +02:00
Sergey Melnikov	ece0555600	Push proxy metrics to Victoria Metrics (#3106)	2022-12-16 14:44:49 +02:00
Joonas Koivunen	73ea0a0b01	fix(remote_storage): use cached credentials (#3128) IMDSv2 has limits, and if we query it on every s3 interaction we are going to go over those limits. Changes the s3_bucket client configuration to use: - ChainCredentialsProvider to handle env variables or imds usage - LazyCachingCredentialsProvider to actually cache any credentials Related: https://github.com/awslabs/aws-sdk-rust/issues/629 Possibly related: https://github.com/neondatabase/neon/issues/3118	2022-12-16 14:44:49 +02:00
Arseny Sher	d8f6d6fd6f	Merge pull request #3126 from neondatabase/broker-lb-release Deploy broker with L4 LB in new env.	2022-12-16 01:25:28 +03:00
Arseny Sher	d24de169a7	Deploy broker with L4 LB in new env. Seems to be fixing issue with missing keepalives.	2022-12-16 01:45:32 +04:00
Arseny Sher	0816168296	Hotfix: terminate subscription if channel is full. Might help as a hotfix, but need to understand root better.	2022-12-15 12:23:56 +03:00
Dmitry Rodionov	277b44d57a	Merge pull request #3102 from neondatabase/main Hotfix. See commits for details	2022-12-14 19:38:43 +03:00
MMeent	68c2c3880e	Merge pull request #3038 from neondatabase/main Release 22-12-14	2022-12-14 14:35:47 +01:00
Arthur Petukhovsky	49da498f65	Merge pull request #2833 from neondatabase/main Release 2022-11-16	2022-11-17 08:44:10 +01:00
Stas Kelvich	2c76ba3dd7	Merge pull request #2718 from neondatabase/main-rc-22-10-28 Release 22-10-28	2022-10-28 20:33:56 +03:00
Arseny Sher	dbe3dc69ad	Merge branch 'main' into main-rc-22-10-28 Release 22-10-28.	2022-10-28 19:10:11 +04:00
Arseny Sher	8e5bb3ed49	Enable etcd compaction in neon_local.	2022-10-27 12:53:20 +03:00
Stas Kelvich	ab0be7b8da	Avoid debian-testing packages in compute Dockerfiles. plv8 can only be built with a fairly new gold linker version. We used to install it via binutils packages from testing, but it also updates libc and that causes troubles in the resulting image as different extensions were built against different libc versions. We could either use libc from debian-testing everywhere or refrain from using testing packages and install necessary programs manually. This patch uses the latter approach: gold for plv8 and cmake for h3 are installed manually. In passing, declare h3_postgis as a safe extension (previous omission).	2022-10-27 12:53:20 +03:00
bojanserafimov	b4c55f5d24	Move pagestream api to libs/pageserver_api (#2698)	2022-10-27 12:53:20 +03:00
mikecaat	ede70d833c	Add a docker-compose example file (#1943) (#2666) Co-authored-by: Masahiro Ikeda <masahiro.ikeda.us@hco.ntt.co.jp>	2022-10-27 12:53:20 +03:00
Sergey Melnikov	70c3d18bb0	Do not release to new staging proxies on release (#2685)	2022-10-27 12:53:20 +03:00
bojanserafimov	7a491f52c4	Add draw_timeline binary (#2688)	2022-10-27 12:53:20 +03:00
Alexander Bayandin	323c4ecb4f	Add data format backward compatibility tests (#2626)	2022-10-27 12:53:20 +03:00
Anastasia Lubennikova	3d2466607e	Merge pull request #2692 from neondatabase/main-rc Release 2022-10-25	2022-10-25 18:18:58 +03:00
Anastasia Lubennikova	ed478b39f4	Merge branch 'release' into main-rc	2022-10-25 17:06:33 +03:00
Stas Kelvich	91585a558d	Merge pull request #2678 from neondatabase/stas/hotfix_schema Hotfix to disable grant create on public schema	2022-10-22 02:54:31 +03:00
Stas Kelvich	93467eae1f	Hotfix to disable grant create on public schema `GRANT CREATE ON SCHEMA public` fails if there is no schema `public`. Disable it in release for now and make a better fix later (it is needed for v15 support).	2022-10-22 02:26:28 +03:00
Stas Kelvich	f3aac81d19	Merge pull request #2668 from neondatabase/main Release 2022-10-21	2022-10-21 15:21:42 +03:00
Stas Kelvich	979ad60c19	Merge pull request #2581 from neondatabase/main Release 2022-10-07	2022-10-07 16:50:55 +03:00
Stas Kelvich	9316cb1b1f	Merge pull request #2573 from neondatabase/main Release 2022-10-06	2022-10-07 11:07:06 +03:00
Anastasia Lubennikova	e7939a527a	Merge pull request #2377 from neondatabase/main Release 2022-09-01	2022-09-01 20:20:44 +03:00
Arthur Petukhovsky	36d26665e1	Merge pull request #2299 from neondatabase/main * Check for entire range during sasl validation (#2281) * Gen2 GH runner (#2128) * Re-add rustup override * Try s3 bucket * Set git version * Use v4 cache key to prevent problems * Switch to v5 for key * Add second rustup fix * Rebase * Add kaniko steps * Fix typo and set compress level * Disable global run default * Specify shell for step * Change approach with kaniko * Try less verbose shell spec * Add submodule pull * Add promote step * Adjust dependency chain * Try default swap again * Use env * Don't override aws key * Make kaniko build conditional * Specify runs on * Try without dependency link * Try soft fail * Use image with git * Try passing to next step * Fix duplicate * Try other approach * Try other approach * Fix typo * Try other syntax * Set env * Adjust setup * Try step 1 * Add link * Try global env * Fix mistake * Debug * Try other syntax * Try other approach * Change order * Move output one step down * Put output up one level * Try other syntax * Skip build * Try output * Re-enable build * Try other syntax * Skip middle step * Update check * Try first step of dockerhub push * Update needs dependency * Try explicit dir * Add missing package * Try other approach * Try other approach * Specify region * Use with * Try other approach * Add debug * Try other approach * Set region * Follow AWS example * Try github approach * Skip Qemu * Try stdin * Missing steps * Add missing close * Add echo debug * Try v2 endpoint * Use v1 endpoint * Try without quotes * Revert * Try crane * Add debug * Split steps * Fix duplicate * Add shell step * Conform to options * Add verbose flag * Try single step * Try workaround * First request fails hunch * Try bullseye image * Try other approach * Adjust verbose level * Try previous step * Add more debug * Remove debug step * Remove rogue indent * Try with larger image * Add build tag step * Update workflow for testing * Add tag step for test * 
Remove unused * Update dependency chain * Add ownership fix * Use matrix for promote * Force update * Force build * Remove unused * Add new image * Add missing argument * Update dockerfile copy * Update Dockerfile * Update clone * Update dockerfile * Go to correct folder * Use correct format * Update dockerfile * Remove cd * Debug find where we are * Add debug on first step * Changedir to postgres * Set workdir * Use v1 approach * Use other dependency * Try other approach * Try other approach * Update dockerfile * Update approach * Update dockerfile * Update approach * Update dockerfile * Update dockerfile * Add workspace hack * Update Dockerfile * Update Dockerfile * Update Dockerfile * Change last step * Cleanup pull in prep for review * Force build images * Add condition for latest tagging * Use pinned version * Try without name value * Remove more names * Shorten names * Add kaniko comments * Pin kaniko * Pin crane and ecr helper * Up one level * Switch to pinned tag for rust image * Force update for test Co-authored-by: Rory de Zoete <rdezoete@RorysMacStudio.fritz.box> Co-authored-by: Rory de Zoete <rdezoete@b04468bf-cdf4-41eb-9c94-aff4ca55e4bf.fritz.box> Co-authored-by: Rory de Zoete <rdezoete@Rorys-Mac-Studio.fritz.box> Co-authored-by: Rory de Zoete <rdezoete@4795e9ee-4f32-401f-85f3-f316263b62b8.fritz.box> Co-authored-by: Rory de Zoete <rdezoete@2f8bc4e5-4ec2-4ea2-adb1-65d863c4a558.fritz.box> Co-authored-by: Rory de Zoete <rdezoete@27565b2b-72d5-4742-9898-a26c9033e6f9.fritz.box> Co-authored-by: Rory de Zoete <rdezoete@ecc96c26-c6c4-4664-be6e-34f7c3f89a3c.fritz.box> Co-authored-by: Rory de Zoete <rdezoete@7caff3a5-bf03-4202-bd0e-f1a93c86bdae.fritz.box> * Add missing step output, revert one deploy step (#2285) * Add missing step output, revert one deploy step * Conform to syntax * Update approach * Add missing value * Add missing needs Co-authored-by: Rory de Zoete <rdezoete@RorysMacStudio.fritz.box> * Error for fatal not git repo (#2286) Co-authored-by: Rory 
de Zoete <rdezoete@RorysMacStudio.fritz.box> * Use main, not branch for ref check (#2288) * Use main, not branch for ref check * Add more debug * Count main, not head * Try new approach * Conform to syntax * Update approach * Get full history * Skip checkout * Cleanup debug * Remove more debug Co-authored-by: Rory de Zoete <rdezoete@RorysMacStudio.fritz.box> * Fix docker zombie process issue (#2289) * Fix docker zombie process issue * Init everywhere Co-authored-by: Rory de Zoete <rdezoete@RorysMacStudio.fritz.box> * Fix 1.63 clippy lints (#2282) * split out timeline metrics, track layer map loading and size calculation * reset rust cache for clippy run to avoid an ICE additionally remove trailing whitespaces * Rename pg_control_ffi.h to bindgen_deps.h, for clarity. The pg_control_ffi.h name implies that it only includes stuff related to pg_control.h. That's mostly true currently, but really the point of the file is to include everything that we need to generate Rust definitions from. * Make local mypy behave like CI mypy (#2291) * Fix flaky pageserver restarts in tests (#2261) * Remove extra type aliases (#2280) * Update cachepot endpoint (#2290) * Update cachepot endpoint * Update dockerfile & remove env * Update image building process * Cannot use metadata endpoint for this * Update workflow * Conform to kaniko syntax * Update syntax * Update approach * Update dockerfiles * Force update * Update dockerfiles * Update dockerfile * Cleanup dockerfiles * Update s3 test location * Revert s3 experiment * Add more debug * Specify aws region * Remove debug, add prefix * Remove one more debug Co-authored-by: Rory de Zoete <rdezoete@RorysMacStudio.fritz.box> * workflows/benchmarking: increase timeout (#2294) * Rework `init` in pageserver CLI (#2272) * Do not create initial tenant and timeline (adjust Python tests for that) * Rework config handling during init, add --update-config to manage local config updates * Fix: Always build images (#2296) * Always build images * 
Remove unused Co-authored-by: Rory de Zoete <rdezoete@RorysMacStudio.fritz.box> * Move auto-generated 'bindings' to a separate inner module. Re-export only things that are used by other modules. In the future, I'm imagining that we run bindgen twice, for Postgres v14 and v15. The two sets of bindings would go into separate 'bindings_v14' and 'bindings_v15' modules. Rearrange postgres_ffi modules. Move function, to avoid Postgres version dependency in timelines.rs Move function to generate a logical-message WAL record to postgres_ffi. * fix cargo test * Fix walreceiver and safekeeper bugs (#2295) - There was an issue with zero commit_lsn `reason: LaggingWal { current_commit_lsn: 0/0, new_commit_lsn: 1/6FD90D38, threshold: 10485760 } }`. The problem was in `send_wal.rs`, where we initialized `end_pos = Lsn(0)` and in some cases sent it to the pageserver. - IDENTIFY_SYSTEM previously returned `flush_lsn` as a physical end of WAL. Now it returns `flush_lsn` (as it was) to walproposer and `commit_lsn` to everyone else including pageserver. - There was an issue with backoff where connection was cancelled right after initialization: `connected!` -> `safekeeper_handle_db: Connection cancelled` -> `Backoff: waiting 3 seconds`. The problem was in sleeping before establishing the connection. This is fixed by reworking retry logic. - There was an issue with getting `NoKeepAlives` reason in a loop. The issue is probably the same as the previous. - There was an issue with filtering safekeepers based on retry attempts, which could filter some safekeepers indefinetely. This is fixed by using retry cooldown duration instead of retry attempts. - Some `send_wal.rs` connections failed with errors without context. This is fixed by adding a timeline to safekeepers errors. 
New retry logic works like this: - Every candidate has a `next_retry_at` timestamp and is not considered for connection until that moment - When walreceiver connection is closed, we update `next_retry_at` using exponential backoff, increasing the cooldown on every disconnect. - When `last_record_lsn` was advanced using the WAL from the safekeeper, we reset the retry cooldown and exponential backoff, allowing walreceiver to reconnect to the same safekeeper instantly. * on safekeeper registration pass availability zone param (#2292) Co-authored-by: Kirill Bulatov <kirill@neon.tech> Co-authored-by: Rory de Zoete <33318916+zoete@users.noreply.github.com> Co-authored-by: Rory de Zoete <rdezoete@RorysMacStudio.fritz.box> Co-authored-by: Rory de Zoete <rdezoete@b04468bf-cdf4-41eb-9c94-aff4ca55e4bf.fritz.box> Co-authored-by: Rory de Zoete <rdezoete@Rorys-Mac-Studio.fritz.box> Co-authored-by: Rory de Zoete <rdezoete@4795e9ee-4f32-401f-85f3-f316263b62b8.fritz.box> Co-authored-by: Rory de Zoete <rdezoete@2f8bc4e5-4ec2-4ea2-adb1-65d863c4a558.fritz.box> Co-authored-by: Rory de Zoete <rdezoete@27565b2b-72d5-4742-9898-a26c9033e6f9.fritz.box> Co-authored-by: Rory de Zoete <rdezoete@ecc96c26-c6c4-4664-be6e-34f7c3f89a3c.fritz.box> Co-authored-by: Rory de Zoete <rdezoete@7caff3a5-bf03-4202-bd0e-f1a93c86bdae.fritz.box> Co-authored-by: Dmitry Rodionov <dmitry@neon.tech> Co-authored-by: Heikki Linnakangas <heikki@neon.tech> Co-authored-by: bojanserafimov <bojan.serafimov7@gmail.com> Co-authored-by: Alexander Bayandin <alexander@neon.tech> Co-authored-by: Anastasia Lubennikova <anastasia@neon.tech> Co-authored-by: Anton Galitsyn <agalitsyn@users.noreply.github.com>	2022-08-18 15:32:33 +03:00
Arthur Petukhovsky	873347f977	Merge pull request #2275 from neondatabase/main * github/workflows: Fix git dubious ownership (#2223) * Move relation size cache from WalIngest to DatadirTimeline (#2094) * Move relation sie cache to layered timeline * Fix obtaining current LSN for relation size cache * Resolve merge conflicts * Resolve merge conflicts * Reestore 'lsn' field in DatadirModification * adjust DatadirModification lsn in ingest_record * Fix formatting * Pass lsn to get_relsize * Fix merge conflict * Update pageserver/src/pgdatadir_mapping.rs Co-authored-by: Heikki Linnakangas <heikki@zenith.tech> * Update pageserver/src/pgdatadir_mapping.rs Co-authored-by: Heikki Linnakangas <heikki@zenith.tech> Co-authored-by: Heikki Linnakangas <heikki@zenith.tech> * refactor: replace lazy-static with once-cell (#2195) - Replacing all the occurrences of lazy-static with `once-cell::sync::Lazy` - fixes #1147 Signed-off-by: Ankur Srivastava <best.ankur@gmail.com> * Add more buckets to pageserver latency metrics (#2225) * ignore record property warning to fix benchmarks * increase statement timeout * use event so it fires only if workload thread successfully finished * remove debug log * increase timeout to pass test with real s3 * avoid duplicate parameter, increase timeout * Major migration script (#2073) This script can be used to migrate a tenant across breaking storage versions, or (in the future) upgrading postgres versions. See the comment at the top for an overview. Co-authored-by: Anastasia Lubennikova <anastasia@neon.tech> * Fix etcd typos * Fix links to safekeeper protocol docs. (#2188) safekeeper/README_PROTO.md was moved to docs/safekeeper-protocol.md in commit `0b14fdb078`, as part of reorganizing the docs into 'mdbook' format. Fixes issue #1475. Thanks to @banks for spotting the outdated references. In addition to fixing the above issue, this patch also fixes other broken links as a result of `0b14fdb078`. 
See https://github.com/neondatabase/neon/pull/2188#pullrequestreview-1055918480. Co-authored-by: Heikki Linnakangas <heikki@neon.tech> Co-authored-by: Thang Pham <thang@neon.tech> * Update CONTRIBUTING.md * Update CONTRIBUTING.md * support node id and remote storage params in docker_entrypoint.sh * Safe truncate (#2218) * Move relation sie cache to layered timeline * Fix obtaining current LSN for relation size cache * Resolve merge conflicts * Resolve merge conflicts * Reestore 'lsn' field in DatadirModification * adjust DatadirModification lsn in ingest_record * Fix formatting * Pass lsn to get_relsize * Fix merge conflict * Update pageserver/src/pgdatadir_mapping.rs Co-authored-by: Heikki Linnakangas <heikki@zenith.tech> * Update pageserver/src/pgdatadir_mapping.rs Co-authored-by: Heikki Linnakangas <heikki@zenith.tech> * Check if relation exists before trying to truncat it refer #1932 * Add test reporducing FSM truncate problem Co-authored-by: Heikki Linnakangas <heikki@zenith.tech> * Fix exponential backoff values * Update back `vendor/postgres` back; it was changed accidentally. (#2251) Commit `4227cfc96e` accidentally reverted vendor/postgres to an older version. Update it back. * Add pageserver checkpoint_timeout option. To flush inmemory layer eventually when no new data arrives, which helps safekeepers to suspend activity (stop pushing to the broker). Default 10m should be ok. * Share exponential backoff code and fix logic for delete task failure (#2252) * Fix bug when import large (>1GB) relations (#2172) Resolves #2097 - use timeline modification's `lsn` and timeline's `last_record_lsn` to determine the corresponding LSN to query data in `DatadirModification::get` - update `test_import_from_pageserver`. Split the test into 2 variants: `small` and `multisegment`. + `small` is the old test + `multisegment` is to simulate #2097 by using a larger number of inserted rows to create multiple segment files of a relation. 
`multisegment` is configured to only run with a `release` build * Fix timeline physical size flaky tests (#2244) Resolves #2212. - use `wait_for_last_flush_lsn` in `test_timeline_physical_size_` tests ## Context Need to wait for the pageserver to catch up with the compute's last flush LSN because during the timeline physical size API call, it's possible that there are running `LayerFlushThread` threads. These threads flush new layers into disk and hence update the physical size. This results in a mismatch between the physical size reported by the API and the actual physical size on disk. ### Note The `LayerFlushThread` threads are processed concurrently, so it's possible that the above error still persists even with this patch. However, making the tests wait to finish processing all the WALs (not flushing) before calculating the physical size should help reduce the "flakiness" significantly postgres_ffi/waldecoder: validate more header fields * postgres_ffi/waldecoder: remove unused startlsn * postgres_ffi/waldecoder: introduce explicit `enum State` Previously it was emulated with a combination of nullable fields. This change should make the logic more readable. * disable `test_import_from_pageserver_multisegment` (#2258) This test failed consistently on `main` now. It's better to temporarily disable it to avoid blocking others' PRs while investigating the root cause for the test failure. See: #2255, #2256 * get_binaries uses DOCKER_TAG taken from docker image build step (#2260) * [proxy] Rework wire format of the password hack and some errors (#2236) The new format has a few benefits: it's shorter, simpler and human-readable as well. We don't use base64 anymore, since url encoding got us covered. We also show a better error in case we couldn't parse the payload; the users should know it's all about passing the correct project name. 
* test_runner/pg_clients: collect docker logs (#2259) * get_binaries script fix (#2263) * get_binaries uses DOCKER_TAG taken from docker image build step * remove docker tag discovery at all and fix get_binaries for version variable * Better storage sync logs (#2268) * Find end of WAL on safekeepers using WalStreamDecoder. We could make it inside wal_storage.rs, but taking into account that - wal_storage.rs reading is async - we don't need s3 here - error handling is different; error during decoding is normal I decided to put it separately. Test cargo test test_find_end_of_wal_last_crossing_segment prepared earlier by @yeputons passes now. Fixes https://github.com/neondatabase/neon/issues/544 https://github.com/neondatabase/cloud/issues/2004 Supersedes https://github.com/neondatabase/neon/pull/2066 * Improve walreceiver logic (#2253) This patch makes walreceiver logic more complicated, but it should work better in most cases. Added `test_wal_lagging` to test scenarios where alive safekeepers can lag behind other alive safekeepers. - There was a bug which looks like `etcd_info.timeline.commit_lsn > Some(self.local_timeline.get_last_record_lsn())` filtered all safekeepers in some strange cases. I removed this filter, it should probably help with #2237 - Now walreceiver_connection reports status, including commit_lsn. This allows keeping safekeeper connection even when etcd is down. - Safekeeper connection now fails if pageserver doesn't receive safekeeper messages for some time. Usually safekeeper sends messages at least once per second. - `LaggingWal` check now uses `commit_lsn` directly from safekeeper. This fixes the issue with often reconnects, when compute generates WAL really fast. - `NoWalTimeout` is rewritten to trigger only when we know about the new WAL and the connected safekeeper doesn't stream any WAL. This allows setting a small `lagging_wal_timeout` because it will trigger only when we observe that the connected safekeeper has stuck. 
* increase timeout in wait_for_upload to avoid spurious failures when testing with real s3 * Bump vendor/postgres to include XLP_FIRST_IS_CONTRECORD fix. (#2274) * Set up a workflow to run pgbench against captest (#2077) Signed-off-by: Ankur Srivastava <best.ankur@gmail.com> Co-authored-by: Alexander Bayandin <alexander@neon.tech> Co-authored-by: Konstantin Knizhnik <knizhnik@garret.ru> Co-authored-by: Heikki Linnakangas <heikki@zenith.tech> Co-authored-by: Ankur Srivastava <ansrivas@users.noreply.github.com> Co-authored-by: bojanserafimov <bojan.serafimov7@gmail.com> Co-authored-by: Dmitry Rodionov <dmitry@neon.tech> Co-authored-by: Anastasia Lubennikova <anastasia@neon.tech> Co-authored-by: Kirill Bulatov <kirill@neon.tech> Co-authored-by: Heikki Linnakangas <heikki@neon.tech> Co-authored-by: Thang Pham <thang@neon.tech> Co-authored-by: Stas Kelvich <stas.kelvich@gmail.com> Co-authored-by: Arseny Sher <sher-ars@yandex.ru> Co-authored-by: Egor Suvorov <egor@neon.tech> Co-authored-by: Andrey Taranik <andrey@cicd.team> Co-authored-by: Dmitry Ivanov <ivadmi5@gmail.com>	2022-08-15 21:30:45 +03:00
Arthur Petukhovsky	e814ac16f9	Merge pull request #2219 from neondatabase/main Release 2022-08-04	2022-08-04 20:06:34 +03:00
Heikki Linnakangas	ad3055d386	Merge pull request #2203 from neondatabase/release-uuid-ossp Deploy new storage and compute version to production Release 2022-08-02	2022-08-02 15:08:14 +03:00
Heikki Linnakangas	94e03eb452	Merge remote-tracking branch 'origin/main' into 'release' Release 2022-08-01	2022-08-02 12:43:49 +03:00
Sergey Melnikov	380f26ef79	Merge pull request #2170 from neondatabase/main (Release 2022-07-28) Release 2022-07-28	2022-07-28 14:16:52 +03:00
Arthur Petukhovsky	3c5b7f59d7	Merge pull request #2119 from neondatabase/main Release 2022-07-19	2022-07-19 11:58:48 +03:00
Arthur Petukhovsky	fee89f80b5	Merge pull request #2115 from neondatabase/main-2022-07-18 Release 2022-07-18	2022-07-18 19:21:11 +03:00
Arthur Petukhovsky	41cce8eaf1	Merge remote-tracking branch 'origin/release' into main-2022-07-18	2022-07-18 18:21:20 +03:00
Alexey Kondratov	f88fe0218d	Merge pull request #1842 from neondatabase/release-deploy-hotfix [HOTFIX] Release deploy fix This PR uses this branch neondatabase/postgres#171 and several required commits from the main to use only locally built compute-tools. This should allow us to rollout safekeepers sync issue fix on prod	2022-06-01 11:04:30 +03:00
Alexey Kondratov	cc856eca85	Install missing openssl packages in the Github Actions workflow	2022-05-31 21:31:31 +02:00
Alexey Kondratov	cf350c6002	Use :local compute-tools tag to build compute-node image	2022-05-31 21:31:16 +02:00
Arseny Sher	0ce6b6a0a3	Merge pull request #1836 from neondatabase/release-hotfix-basebackup-lsn-page-boundary Bump vendor/postgres to hotfix basebackup LSN comparison.	2022-05-31 16:54:03 +04:00
Arseny Sher	73f247d537	Bump vendor/postgres to hotfix basebackup LSN comparison.	2022-05-31 16:00:50 +04:00
Andrey Taranik	960be82183	Merge pull request #1792 from neondatabase/main Release 2202-05-25 (second)	2022-05-25 16:37:57 +03:00
Andrey Taranik	806e5a6c19	Merge pull request #1787 from neondatabase/main Release 2022-05-25	2022-05-25 13:34:11 +03:00
Alexey Kondratov	8d5df07cce	Merge pull request #1385 from zenithdb/main Release main 2022-03-22	2022-03-22 05:04:34 -05:00
Andrey Taranik	df7a9d1407	release fix 2022-03-16 (#1375)	2022-03-17 00:43:28 +03:00

1 changed file with 2 additions and 2 deletions

									
Dockerfile.compute-node
												
@@ -651,7 +651,7 @@ FROM rust-extensions-build AS pg-jsonschema-pg-build
 ARG PG_VERSION
 RUN wget https://github.com/supabase/pg_jsonschema/archive/refs/tags/v0.2.0.tar.gz -O pg_jsonschema.tar.gz && \
-    echo "b1bd95009c8809bd6cda9a37777f8b7df425ff1a34976c1e7a4b31cf838ace66 pg_jsonschema.tar.gz" | sha256sum --check && \
+    echo "9118fc508a6e231e7a39acaa6f066fcd79af17a5db757b47d2eefbe14f7794f0 pg_jsonschema.tar.gz" | sha256sum --check && \
     mkdir pg_jsonschema-src && cd pg_jsonschema-src && tar xvzf ../pg_jsonschema.tar.gz --strip-components=1 -C . && \
     sed -i 's/pgrx = "0.10.2"/pgrx = { version = "0.10.2", features = [ "unsafe-postgres" ] }/g' Cargo.toml && \
     cargo pgrx install --release && \
@@ -668,7 +668,7 @@ FROM rust-extensions-build AS pg-graphql-pg-build
 ARG PG_VERSION
 RUN wget https://github.com/supabase/pg_graphql/archive/refs/tags/v1.4.0.tar.gz -O pg_graphql.tar.gz && \
-    echo "ea85d45f8af1d2382e2af847f88102f930782c00e6c612308e6f08f27309d5f7 pg_graphql.tar.gz" | sha256sum --check && \
+    echo "bd8dc7230282b3efa9ae5baf053a54151ed0e66881c7c53750e2d0c765776edc pg_graphql.tar.gz" | sha256sum --check && \
     mkdir pg_graphql-src && cd pg_graphql-src && tar xvzf ../pg_graphql.tar.gz --strip-components=1 -C . && \
     sed -i 's/pgrx = "=0.10.2"/pgrx = { version = "0.10.2", features = [ "unsafe-postgres" ] }/g' Cargo.toml && \
     cargo pgrx install --release && \
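
Both hunks change only a pinned SHA-256, so the underlying check can be sketched outside Docker. This is a minimal illustration and not the project's build script: `verify_checksum` is a hypothetical helper, the file contents are made up, and it assumes a `sha256sum` that accepts `-c` (the short form of the `--check` flag used in the Dockerfile).

```shell
#!/bin/sh
# Hypothetical helper (not part of Dockerfile.compute-node): succeeds only
# when the SHA-256 of FILE equals EXPECTED.
# usage: verify_checksum EXPECTED FILE
verify_checksum() {
    # sha256sum -c reads "HASH  FILENAME" lines from stdin.
    printf '%s  %s\n' "$1" "$2" | sha256sum -c >/dev/null 2>&1
}

# Stand-in for a downloaded release archive (made-up contents).
archive=$(mktemp)
printf 'archive contents at first tagging' > "$archive"

# Pin the checksum once, as the Dockerfile does with a literal hash.
pinned=$(sha256sum "$archive" | cut -d ' ' -f 1)

if verify_checksum "$pinned" "$archive"; then
    echo "checksum matches"
fi

# Simulate an upstream re-tag: same file name, different bytes.
printf 'archive contents after re-tagging' > "$archive"

if ! verify_checksum "$pinned" "$archive"; then
    echo "checksum mismatch: the pinned hash must be updated"
fi

rm -f "$archive"
```

In the Dockerfile the same check is chained with `&&`, so a mismatch stops the `RUN` step before the archive is unpacked.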

Compare commits

185 Commits

http2 ... release-39
