rust/neon

mirror of https://github.com/neondatabase/neon.git synced 2026-01-07 21:42:56 +00:00

Author	SHA1	Message	Date
Sasha Krassovsky	7662df6ca0	Fix minimum backoff to 1ms	2024-01-03 21:09:19 -08:00
Arseny Sher	65b4e6e7d6	Remove empty safekeeper init since truncateLsn. It has caveats such as creating half empty segment which can't be offloaded. Instead we'll pursue approach of pull_timeline, seeding new state from some peer.	2024-01-03 18:20:19 +04:00
Arseny Sher	f71110383c	Remove second check for max_slot_wal_keep_size download size. Already checked in GetLogRepRestartLSN, a rebase artifact.	2024-01-03 13:13:32 +04:00
Arseny Sher	ae3eaf9995	Add [WP] prefix to all walproposer logging. - rename walpop_log to wp_log - create also wpg_log which is used in postgres-specific code - in passing format messages to start with lower case	2024-01-03 11:10:27 +04:00
Sasha Krassovsky	ce13281d54	MIN not MAX	2024-01-02 06:28:49 -08:00
Sasha Krassovsky	4e1d16f311	Switch to exponential rate-limiting	2024-01-02 06:28:49 -08:00
Sasha Krassovsky	091a0cda9d	Switch to rate-limiting strategy	2024-01-02 06:28:49 -08:00
Sasha Krassovsky	ea9fad419e	Add exponential backoff to page_server->send	2024-01-02 06:28:49 -08:00
Arseny Sher	ddc431fc8f	pgindent walproposer condvar comment	2023-12-26 14:12:53 +04:00
Arseny Sher	bfc98f36e3	Refactor handling responses in walproposer. Remove confirm_wal_streamed; we already apply both write and flush positions of the slot to commit_lsn which is fine because 1) we need to wake up waiters 2) committed WAL can be fetched from safekeepers by neon_walreader now.	2023-12-26 14:12:53 +04:00
Arseny Sher	1f1c50e8c7	Don't re-add neon_walreader socket to waiteventset if possible. Should make recovery slightly more efficient (likely negligibly).	2023-12-26 14:12:53 +04:00
Arseny Sher	854df0f566	Do PQgetCopyData before PQconsumeInput in libpqwp_async_read. To avoid a lot of redundant memmoves and bloated input buffer. fixes https://github.com/neondatabase/neon/issues/6055	2023-12-26 14:12:53 +04:00
Arseny Sher	9c493869c7	Perform synchronous WAL download in wp only for logical replication. wp -> sk communication now uses neon_walreader which will fetch missing WAL on demand from safekeepers, so doesn't need this anymore. Also, cap WAL download by max_slot_wal_keep_size to be able to start compute if lag is too high.	2023-12-26 14:12:53 +04:00
Arseny Sher	cdb08f0362	Introduce NeonWALReader downloading sk -> compute WAL on demand. It is similar to XLogReader, but when either requested segment is missing locally or requested LSN is before basebackup_lsn NeonWALReader asynchronously fetches WAL from one of safekeepers. Patch includes walproposer switch to NeonWALReader, splitting wouldn't make much sense as it is hard to test otherwise. This finally removes risk of pg_wal explosion (as well as slow start time) when one safekeeper is lagging, at the same time allowing to recover it. In the future reader should also be used by logical walsender for similar reasons (currently we download the tail on compute start synchronously). The main test is test_lagging_sk. However, I also run it manually a lot varying MAX_SEND_SIZE on both sides (on safekeeper and on walproposer), testing various fragmentations (one side having small buffer, another, both), which brought up https://github.com/neondatabase/neon/issues/6055 closes https://github.com/neondatabase/neon/issues/1012	2023-12-26 14:12:53 +04:00
Konstantin Knizhnik	572bc06011	Do not copy WAL for lagged slots (#6221 ) ## Problem See https://neondb.slack.com/archives/C026T7K2YP9/p1702813041997959 ## Summary of changes Do not take invalidated slots into account when calculating the restart_lsn position for basebackup at the page server --------- Co-authored-by: Konstantin Knizhnik <knizhnik@neon.tech>	2023-12-22 20:47:55 +02:00
Konstantin Knizhnik	8ff5387da1	eliminate GCC warning for unchecked result of fread (#6167 ) ## Problem GCC produces a warning that the fread result is not checked. It doesn't affect program logic, but it is better to live without warnings. ## Summary of changes Check the fread result.	2023-12-19 18:17:11 +02:00
Arseny Sher	b701394d7a	Fix WAL waiting in walproposer for v16. Just preparing cv right before waiting is not enough as we might have already missed the flushptr change & wakeup, so re-checked before sleep. https://neondb.slack.com/archives/C03QLRH7PPD/p1702830965396619?thread_ts=1702756761.836649&cid=C03QLRH7PPD	2023-12-19 15:34:14 +04:00
Heikki Linnakangas	6939fc3db6	Remove declarations of non-existent global variables and functions FileCacheMonitorMain was removed in commit `b497d0094e`.	2023-12-18 21:05:31 +02:00
Heikki Linnakangas	c4c48cfd63	Clean up #includes - No need to include c.h, port.h or pg_config.h, they are included in postgres.h - No need to include postgres.h in header files. Instead, the assumption in PostgreSQL is that all .c files include postgres.h. - Reorder includes to alphabetical order, and system headers before pgsql headers - Remove bunch of other unnecessary includes that got copy-pasted from one source file to another	2023-12-18 21:05:29 +02:00
Heikki Linnakangas	82215d20b0	Mark some variables 'static' Move initialization of neon_redo_read_buffer_filter. This allows marking it 'static', too.	2023-12-18 21:05:24 +02:00
Arseny Sher	e1935f42a1	Don't generate core dump when walproposer intentionally panics. Walproposer sometimes intentionally PANICs when its term is defeated as the basebackup is likely spoiled by that time. We don't want core dumped in this case.	2023-12-18 11:03:34 +04:00
Arseny Sher	193e60e2b8	Fix/edit pgindent confusing places in neon.	2023-12-08 14:03:13 +04:00
Arseny Sher	1bbd6cae24	pgindent pgxn/neon	2023-12-08 14:03:13 +04:00
Arseny Sher	65f48c7002	Make targets to run pgindent on core and neon extension.	2023-12-08 14:03:13 +04:00
Konstantin Knizhnik	7fab731f65	Track size of FSM fork while applying records at replica (#5901 ) ## Problem See https://neondb.slack.com/archives/C04DGM6SMTM/p1700560921471619 ## Summary of changes Update relation size cache for FSM fork in WAL records filter Co-authored-by: Konstantin Knizhnik <knizhnik@neon.tech>	2023-12-05 18:49:24 +02:00
Sasha Krassovsky	f290b27378	Fix check for if shmem is valid to take into account detached shmem (#5937 ) ## Problem We can segfault if we update connstr inside of a process that has detached from shmem (e.g. inside stats collector) ## Summary of changes Add a check to make sure we're not detached	2023-11-28 03:14:42 +00:00
Anastasia Lubennikova	3c56a4dd18	Make neon extension relocatable to allow SET SCHEMA (#5942 )	2023-11-27 21:45:41 +00:00
Konstantin Knizhnik	ea63b43009	Check if LFC was intialized in local_cache_pages function (#5911 ) ## Problem There is no check that LFC is initialised (`lfc_max_size != 0`) in the `local_cache_pages` function ## Summary of changes Add proper check. Co-authored-by: Konstantin Knizhnik <knizhnik@neon.tech>	2023-11-24 08:23:00 +02:00
Anastasia Lubennikova	04e6c09f14	Add pgxn/neon/README.md	2023-11-23 18:53:03 +00:00
Konstantin Knizhnik	6afbadc90e	LFC fixes + statistics (#5727 ) ## Problem ## Summary of changes See #5500 --------- Co-authored-by: Konstantin Knizhnik <knizhnik@neon.tech>	2023-11-23 08:59:19 +02:00
Sasha Krassovsky	81b2cefe10	Disallow CREATE DATABASE WITH OWNER neon_superuser (#5887 ) ## Problem Currently, control plane doesn't know about neon_superuser, so if a user creates a database with owner neon_superuser it causes an exception when it tries to forward it. It is also currently possible to ALTER ROLE neon_superuser. ## Summary of changes Disallow creating database with owner neon_superuser. This is probably fine, since I don't think you can create a database with owner normal superuser. Also forbids altering neon_superuser	2023-11-20 22:39:47 +00:00
Em Sharnoff	5d13a2e426	Improve error message when neon.max_cluster_size reached (#4173 ) Changes the error message encountered when the `neon.max_cluster_size` limit is reached. Reasoning is that this is user-visible, and so should probably use language that's closer to what users are familiar with.	2023-11-16 21:51:26 +00:00
John Spray	b989ad1922	extend test_change_pageserver for failure case, rework changing pageserver (#5693 ) Reproducer for https://github.com/neondatabase/neon/issues/5692 The test change in this PR intentionally fails, to demonstrate the issue. --------- Co-authored-by: Sasha Krassovsky <krassovskysasha@gmail.com>	2023-11-08 11:26:56 +00:00
Konstantin Knizhnik	5ceccdc7de	Logical replication startup fixes (#5750 ) ## Problem See https://neondb.slack.com/archives/C04DGM6SMTM/p1698226491736459 ## Summary of changes Update WAL affected buffers when restoring WAL from safekeeper --------- Co-authored-by: Konstantin Knizhnik <knizhnik@neon.tech> Co-authored-by: Arseny Sher <sher-ars@yandex.ru>	2023-11-03 18:40:27 +02:00
Konstantin Knizhnik	9cdffd164a	Prevent SIGSEGV in apply_error_callback when record was not decoded (#5703 ) ## Problem See https://neondb.slack.com/archives/C036U0GRMRB/p1698652221399419?thread_ts=1698438997.903919&cid=C036U0GRMRB ## Summary of changes Check if record pointer is not NULL before trying to print record descriptor Co-authored-by: Konstantin Knizhnik <knizhnik@neon.tech>	2023-10-30 12:06:08 +02:00
Sasha Krassovsky	116c342cad	Support changing pageserver dynamically (#5542 ) ## Problem We currently require full restart of compute if we change the pageserver url ## Summary of changes Makes it so that we don't have to do a full restart, but can just send SIGHUP	2023-10-26 10:56:07 -07:00
MMeent	6129077d31	WALRedo: Limit logging to log_level = ERROR and above (#5587 ) This fixes issues in pageserver's walredo process where WALRedo logs of loglevel=LOG are interpreted as errors. ## Problem See #5560 ## Summary of changes Set the log level to something that doesn't include LOG.	2023-10-26 12:21:41 +01:00
Arthur Petukhovsky	66f8f5f1c8	Call walproposer from Rust (#5403 ) Create Rust bindings for C functions from walproposer. This allows to write better tests with real walproposer code without spawning multiple processes and starting up the whole environment. `make walproposer-lib` stage was added to build static libraries `libwalproposer.a`, `libpgport.a`, `libpgcommon.a`. These libraries can be statically linked to any executable to call walproposer functions. `libs/walproposer/src/walproposer.rs` contains `test_simple_sync_safekeepers` to test that walproposer can be called from Rust to emulate sync_safekeepers logic. It can also be used as a usage example.	2023-10-19 14:17:15 +01:00
Konstantin Knizhnik	5c88213eaf	Logical replication (#5271 ) ## Problem See https://github.com/neondatabase/company_projects/issues/111 ## Summary of changes Save logical replication files in WAL at compute and include them in basebackup at page server. --------- Co-authored-by: Konstantin Knizhnik <knizhnik@neon.tech> Co-authored-by: Arseny Sher <sher-ars@yandex.ru>	2023-10-18 16:42:22 +03:00
Alexey Kondratov	0ca342260c	[compute_ctl+pgxn] Handle invalid databases after failed drop (#5561 ) ## Problem In `89275f6c1e` we fixed an issue, when we were dropping db in Postgres even though cplane request failed. Yet, it introduced a new problem that we now de-register db in cplane even if we didn't actually drop it in Postgres. ## Summary of changes Here we revert extension change, so we now again may leave db in invalid state after failed drop. Instead, `compute_ctl` is now responsible for cleaning up invalid databases during full configuration. Thus, there are two ways of recovering from failed DROP DATABASE: 1. User can just repeat DROP DATABASE, same as in Vanilla Postgres. 2. If they didn't, then on next full configuration (dbs / roles changes in the API; password reset; or data availability check) invalid db will be cleaned up in the Postgres and re-created by `compute_ctl`. So again it follows pretty much the same semantics as Vanilla Postgres -- you need to drop it again after failed drop. That way, we have a recovery trajectory for both problems. See this commit for info about `invalid` db state: `a4b4cc1d60` According to it: > An invalid database cannot be connected to anymore, but can still be dropped. While on it, this commit also fixes another issue, when `compute_ctl` was trying to connect to databases with `ALLOW CONNECTIONS false`. Now it will just skip them. Fixes #5435	2023-10-16 20:46:45 +02:00
Konstantin Knizhnik	fe74fac276	Fix handling flush error in prefetch (#5473 ) ## Problem See https://neondb.slack.com/archives/C05U648A9NJ If the flush in prefetch fails, the prefetch state is reset. We need to retry the register buffer attempt, otherwise we will get an assertion failure. --------- Co-authored-by: Konstantin Knizhnik <knizhnik@neon.tech>	2023-10-10 07:43:37 +03:00
Arthur Petukhovsky	8b15252f98	Move walproposer state into struct (#5364 ) This patch extracts all postgres-dependent functions in a separate `walproposer_api` functions struct. It helps to compile walproposer as static library without compiling all other postgres server code. This is useful to allow calling walproposer C code from Rust, or linking this library with anything else. All global variables containing walproposer state were extracted to a separate `WalProposer` struct. This makes it possible to run several walproposers in the same process, in separate threads. There were no logic changes and PR mostly consists of shuffling functions between several files. We have a good test coverage for walproposer code and I've seen no issues with tests while I was refactoring it, so I don't expect any issues after merge. ref https://github.com/neondatabase/neon/issues/547 --------- Co-authored-by: Arseny Sher <sher-ars@yandex.ru>	2023-10-05 18:48:01 +01:00
Sasha Krassovsky	89275f6c1e	Fix invalid database resulting from failed DROP DB (#5423 ) ## Problem If the control plane happened to respond to a DROP DATABASE request with a non-200 response, we'd abort the DROP DATABASE transaction in the usual spot. However, Postgres for some reason actually performs the drop inside of `standard_ProcessUtility`. As such, the database was left in a weird state after aborting the transaction. We had test coverage of a failed CREATE DATABASE but not a failed DROP DATABASE. ## Summary of changes Since DROP DATABASE can't be inside of a transaction block, we can just forward the DDL changes to the control plane inside of `ProcessUtility_hook`, and if we respond with 500 bail out of `ProcessUtility` before we perform the drop. This change also adds a test, which reproduced the invalid database issue before the fix was applied.	2023-09-29 19:39:28 +01:00
Em Sharnoff	b497d0094e	file cache: Remove free space monitor (#5406 ) This effectively reverts #3832. There's a couple issues we just discovered with the free space monitor, and to my knowledge, the fact we're putting the file cache on a separate filesystem (even when on disk) that's guaranteed to have more room than the maximum size means that this free space monitor should have no effect. More details: 1. The control plane sets the maximum file cache size based on max CU 2. The control plane sets the size of the filesystem underlying the file cache based on the maximum user selectable CU (or, if the endpoint is larger, then that size), so that there's always enough room 3. If postmaster gets SIGKILL'd, then the free space monitor process does not exit 4. If the free space monitor is acting on the cache file but not subject to locking or up-to-date metadata from a newer postgres instance, then this could lead to data corruption. So, in practice I believe the risk of data corruption is low but not nothing, and given the issues we hit because of (3), and given that the free space monitor shouldn't be necessary because of (1) and (2), it's best to just remove it outright. See also: neondatabase/autoscaling#534, #5405	2023-09-28 06:47:44 -07:00
MMeent	7038ce40ce	Fix neon_zeroextend's WAL logging (#5387 ) When you log more than a few blocks, you need to reserve the space in advance. We didn't do that, so we got errors. Now we do that, and shouldn't get errors.	2023-09-27 13:48:30 +02:00
Alexander Bayandin	3a2e6a03bc	Forbid installation of `hnsw` extension (#5346 ) ## Problem Do not allow new installation of deprecated `hnsw` extension. The same approach as in https://github.com/neondatabase/neon/pull/5345 ## Summary of changes - Remove `trusted = true` from `hnsw.control` - Remove `hnsw` related targets from Makefile	2023-09-23 16:47:57 +01:00
Konstantin Knizhnik	6723a79bec	Do not handle lfc_change_limit in processes not haing PGPROC structure (#5332 ) ## Problem See https://neondb.slack.com/archives/C05L7D1JAUS/p1693775881474019 ## Summary of changes Do not perform local file cache resizing in processes having no PGPROC --------- Co-authored-by: Konstantin Knizhnik <knizhnik@neon.tech>	2023-09-19 21:55:36 +03:00
Konstantin Knizhnik	1697e7b319	Fix lfc_ensure_function which now disables LFC (#5294 ) ## Problem There was a bug in lfc_ensure_opened which actually disables LFC ## Summary of changes Return true if the LFC file is opened normally Co-authored-by: Konstantin Knizhnik <knizhnik@neon.tech>	2023-09-13 08:56:03 +03:00
MMeent	3b6b847d76	Fixes for Pg16: (#5292 ) - pagestore_smgr.c had unnecessary WALSync() (see #5287 ) - Compute node dockerfile didn't build the neon_rmgr extension - Add PostgreSQL 16 image to docker-compose tests - Fix issue with high CPU usage in Safekeeper due to a bug in WALSender Co-authored-by: Alexander Bayandin <alexander@neon.tech>	2023-09-12 22:02:03 +03:00
MMeent	83e7e5dbbd	Feat/postgres 16 (#4761 ) This adds PostgreSQL 16 as a vendored postgresql version, and adapts the code to support this version. The important changes to PostgreSQL 16 compared to the PostgreSQL 15 changeset include the addition of a neon_rmgr instead of altering Postgres's original WAL format. Co-authored-by: Alexander Bayandin <alexander@neon.tech> Co-authored-by: Heikki Linnakangas <heikki@neon.tech>	2023-09-12 15:11:32 +02:00

125 Commits