rust/neon

mirror of https://github.com/neondatabase/neon.git synced 2025-12-25 23:29:59 +00:00

Author	SHA1	Message	Date
Arseny Sher	28ef1522d6	cosmetic fixes	2024-08-12 14:48:05 +03:00
Arseny Sher	c9d2b61195	fix term uniqueness	2024-08-12 14:48:05 +03:00
Arseny Sher	4d1cf2dc6f	tests, rollout	2024-08-12 14:48:05 +03:00
Arseny Sher	7b50c1a457	more wip ref https://github.com/neondatabase/cloud/issues/14668	2024-08-12 14:48:05 +03:00
Arseny Sher	1e789fb963	wipwip	2024-08-12 14:48:05 +03:00
Arseny Sher	162424ad77	wip	2024-08-12 14:48:05 +03:00
Alex Chi Z.	fd8a7a7223	fix(docs): race on monotonic rfc id (#8445) ## Problem We have two No. 34 RFCs. ## Summary of changes Signed-off-by: Alex Chi Z <chi@neon.tech>	2024-07-22 09:22:07 +01:00
John Spray	f7131834eb	docs/rfcs: timeline ancestor detach API (#6888) ## Problem When a tenant creates a new timeline that they will treat as their 'main' history, it is awkward to permanently retain an 'old main' timeline as its ancestor. Currently this is necessary because it is forbidden to delete a timeline which has descendants. ## Summary of changes A new pageserver API is proposed to 'adopt' data from a parent timeline into one of its children, such that the link between ancestor and child can be severed, leaving the parent in a state where it may then be deleted. --------- Co-authored-by: Joonas Koivunen <joonas@neon.tech>	2024-07-17 14:25:35 +00:00
John Spray	69b6675da0	rfcs: add RFC for timeline archival (#8221) A design for a cheap low-resource state for idle timelines: - #8088	2024-07-11 08:23:51 +01:00
Vlad Lazar	9882ac8e06	docs: Graceful storage controller cluster restarts RFC (#7704) RFC for "Graceful Restarts of Storage Controller Managed Clusters". Related https://github.com/neondatabase/neon/issues/7387	2024-07-01 18:44:28 +01:00
John Spray	67522ce83d	docs: shard splitting RFC (#6358) Extend the previous sharding RFC with functionality for dynamically splitting shards to increase the total shard count on existing tenants.	2024-03-15 16:00:04 +00:00
John Spray	23416cc358	docs: sharding phase 1 RFC (#5432) We need to shard our Tenants to support larger databases without those large databases dominating our pageservers and/or requiring dedicated pageservers. This RFC aims to define an initial capability that will permit creating large-capacity databases using a static configuration defined at time of Tenant creation. Online re-sharding is deferred as future work, as is offloading layers for historical reads. However, both of these capabilities would be implementable without further changes to the control plane or compute: this RFC aims to define the cross-component work needed to bootstrap sharding end-to-end.	2024-03-15 11:14:25 +00:00
Andreas Scherbaum	5c6d78d469	Rename "zenith" to "neon" (#6957) Usually RFC documents are not modified, but the many mentions of "zenith" in early RFC documents make it desirable to update the product name to today's name, to avoid confusion. ## Problem Early RFC documents use the old "zenith" product name a lot, which is not something everyone is aware of after the product was renamed. ## Summary of changes Replace occurrences of "zenith" with "neon". Images are excluded. --------- Co-authored-by: Andreas Scherbaum <andreas@neon.tech>	2024-03-04 13:02:18 +01:00
Clarence	3d1b08496a	Update words in docs for better readability (#6600) ## Problem Found typos while reading the docs ## Summary of changes Fixed the typos found	2024-02-03 00:59:39 +00:00
Arthur Petukhovsky	f2aa96f003	Console split RFC (#1997) [Rendered](https://github.com/neondatabase/neon/blob/rfc-console-split/docs/rfcs/017-console-split.md) Co-authored-by: Stas Kelvich <stas.kelvich@gmail.com>	2024-02-02 23:41:55 +02:00
Christian Schwarz	66c52a629a	RFC: vectored `Timeline::get` (#6250)	2024-01-08 15:00:01 +00:00
Christian Schwarz	c272c68e5c	RFC: Per-Tenant GetPage@LSN Throttling (#5648) Implementation epic: https://github.com/neondatabase/neon/issues/5899	2023-12-19 11:20:56 +01:00
Arpad Müller	3842773546	Correct RFC number for Pageserver WAL DR RFC (#5997) When I opened #5248, 27 was an unused RFC number. Since then, two RFCs have been merged, so now 27 is taken. 29 is free though, so move it there.	2023-11-30 21:01:25 +00:00
Arpad Müller	8ec6033ed8	Pageserver disaster recovery RFC (#5248) Enable the pageserver to recover from data corruption events by implementing a feature to re-apply historic WAL records in parallel to the already occurring WAL replay. The feature is outside of the user-visible backup and history story, and only serves as a second-level backup for the case that there is a bug in the pageservers that corrupted the served pages. The RFC proposes the addition of two new features: * recover a broken branch from WAL (downtime is allowed) * a test recovery system to recover random branches to make sure recovery works	2023-11-30 14:30:17 +01:00
Arpad Müller	31a54d663c	Migrate links from wiki to notion (#5862) See the slack discussion: https://neondb.slack.com/archives/C033A2WE6BZ/p1696429688621489?thread_ts=1695647103.117499	2023-11-14 15:36:47 +00:00
John Spray	6b4bb91d0a	docs/rfcs: add RFC for fast tenant migration/failover (#5029) ## Problem Currently we don't have a way to migrate tenants from one pageserver to another without a risk of gap in availability. ## Summary of changes This follows on from https://github.com/neondatabase/neon/pull/4919 Migrating tenants between pageservers is essential to operating a service at scale, in several contexts: 1. Responding to a pageserver node failure by migrating tenants to other pageservers 2. Balancing load and capacity across pageservers, for example when a user expands their database and they need to migrate to a pageserver with more capacity. 3. Restarting pageservers for upgrades and maintenance Currently, a tenant may be migrated by attaching to a new node, re-configuring endpoints to use the new node, and then later detaching from the old node. This is safe once [generation numbers](025-generation-numbers.md) are implemented, but does not meet our seamless/fast/efficient goals: Co-authored-by: Christian Schwarz <christian@neon.tech>	2023-09-28 10:07:11 +01:00
Christian Schwarz	5edae96a83	rfc: Crash-Consistent Layer Map Updates By Leveraging index_part.json (#5086) This RFC describes a simple scheme to make layer map updates crash consistent by leveraging the index_part.json in remote storage. Without such a mechanism, crashes can induce certain edge cases in which broadly held assumptions about system invariants don't hold.	2023-09-01 15:24:58 +02:00
John Spray	382473d9a5	docs: add RFC for remote storage generation numbers (#4919) ## Summary A scheme of logical "generation numbers" for pageservers and their attachments is proposed, along with changes to the remote storage format to include these generation numbers in S3 keys. Using the control plane as the issuer of these generation numbers enables strong anti-split-brain properties in the pageserver cluster without implementing a consensus mechanism directly in the pageservers. ## Motivation Currently, the pageserver's remote storage format does not provide a mechanism for addressing split brain conditions that may happen when replacing a node during failover or when migrating a tenant from one pageserver to another. From a remote storage perspective, a split brain condition occurs whenever two nodes both think they have the same tenant attached, and both can write to S3. This can happen in the case of a network partition, pathologically long delays (e.g. suspended VM), or software bugs. This blocks robust implementation of failover from unresponsive pageservers, due to the risk that the unresponsive pageserver is still writing to S3. --------- Co-authored-by: Christian Schwarz <christian@neon.tech> Co-authored-by: Arpad Müller <arpad-m@users.noreply.github.com> Co-authored-by: Heikki Linnakangas <heikki@neon.tech>	2023-08-30 09:49:55 +01:00
Christian Schwarz	ed5bce7cba	rfcs: archive my MVCC S3 Notion Proposal (#5040) This is a copy from the [original Notion page](https://www.notion.so/neondatabase/Proposal-Pageserver-MVCC-S3-Storage-8a424c0c7ec5459e89d3e3f00e87657c?pvs=4), taken on 2023-08-16. This is for archival mostly. The RFC that we're likely to go with is https://github.com/neondatabase/neon/pull/4919.	2023-08-18 19:34:29 +02:00
Alek Westover	d005c77ea3	Tar Remote Extensions (#4715) Add infrastructure to dynamically load postgres extensions and shared libraries from remote extension storage. Before postgres starts, the list of available remote extensions and libraries is downloaded, along with 'shared_preload_libraries'. After postgres is running, 'compute_ctl' listens for HTTP requests to load files. Postgres has a new GUC 'extension_server_port' to specify the port on which 'compute_ctl' listens for requests. When PostgreSQL requests a file, 'compute_ctl' downloads it. See more details about the feature design and remote extension storage layout in docs/rfcs/024-extension-loading.md --------- Co-authored-by: Anastasia Lubennikova <anastasia@neon.tech> Co-authored-by: Alek Westover <alek.westover@gmail.com>	2023-08-02 12:38:12 +03:00
Stas Kelvich	444d6e337f	add rfcs/022-user-mgmt.md (#3838) Co-authored-by: Vadim Kharitonov <vadim@neon.tech>	2023-07-12 19:58:55 +02:00
Dmitry Rodionov	7529ee2ec7	rfc: the state of pageserver tenant relocation (#3868) Summarize current state of tenant relocation related activities and implementation ideas	2023-05-19 14:35:33 +03:00
Dmitry Rodionov	4158e24e60	rfc: delete pageserver data from s3 (#3792) [Rendered](https://github.com/neondatabase/neon/blob/main/docs/rfcs/022-pageserver-delete-from-s3.md) --------- Co-authored-by: Joonas Koivunen <joonas@neon.tech>	2023-03-21 20:03:27 +02:00
Stas Kelvich	431e464c1e	Consumption metering RFC	2023-01-16 19:15:59 +02:00
Dmitry Rodionov	e56d11c8e1	fix style if possible (cannot really split long lines in mermaid)	2022-11-02 17:15:49 +02:00
Dmitry Rodionov	ccdc3188ed	update according to discussion and comments	2022-11-02 17:15:49 +02:00
Dmitry Rodionov	67401cbdb8	pageserver s3 coordination	2022-11-02 17:15:49 +02:00
Kirill Bulatov	50297bef9f	RFC about Tenant / Timeline guard objects (#2660) Co-authored-by: Heikki Linnakangas <heikki@neon.tech>	2022-10-20 12:49:54 +03:00
Arseny Sher	725be60bb7	Storage messaging rfc 2.	2022-10-07 21:22:17 +04:00
Kirill Bulatov	b8eb908a3d	Rename old project name references	2022-09-14 08:14:05 +03:00
Kirill Bulatov	32b7259d5e	Timeline data management RFC (#2152)	2022-09-13 22:37:20 +03:00
Kirill Bulatov	d8a37452c8	Rename ZenithFeedback (#1912)	2022-06-11 00:44:05 +03:00
Ryan Russell	c71faae2c6	Docs readability cont Signed-off-by: Ryan Russell <git@ryanrussell.org>	2022-06-02 15:05:12 +02:00
Ryan Russell	54e163ac03	Improve Readability in Docs Signed-off-by: Ryan Russell <ryanrussell@users.noreply.github.com>	2022-05-31 17:22:47 +03:00
Anastasia Lubennikova	3accde613d	Rename contrib/zenith to contrib/neon. Rename custom GUCs: - zenith.page_server_connstring -> neon.pageserver_connstring - zenith.zenith_tenant -> neon.tenantid - zenith.zenith_timeline -> neon.timelineid - zenith.max_cluster_size -> neon.max_cluster_size	2022-05-30 11:11:01 +03:00
Kian-Meng Ang	f1c51a1267	Fix typos	2022-05-28 14:02:05 +03:00
chaitanya sharma	bea84150b2	Fix the markdown rendering on 004-durability.md RFC	2022-05-17 00:16:42 +03:00
Heikki Linnakangas	87a6c4d051	RFC on connection routing and authentication. This documents how we want this to work. We're not quite there yet.	2022-05-02 23:39:06 +03:00
Anastasia Lubennikova	c15aa04714	Move Cluster size limit RFC from rfcs repo	2022-04-18 18:11:31 +03:00
Kirill Bulatov	81417788c8	walkeeper -> safekeeper	2022-04-18 12:52:31 +03:00
Heikki Linnakangas	07342f7519	Major storage format rewrite. This is a backwards-incompatible change. The new pageserver cannot read repositories created with an old pageserver binary, or vice versa. Simplify Repository to a value-store ------------------------------------ Move the responsibility of tracking relation metadata, like which relations exist and what are their sizes, from Repository to a new module, pgdatadir_mapping.rs. The interface to Repository is now simple key-value PUT/GET operations. It's still not any old key-value store though. A Repository is still responsible for handling branching, and every GET operation comes with an LSN. Mapping from Postgres data directory to keys/values --------------------------------------------------- All the data is now stored in the key-value store. The 'pgdatadir_mapping.rs' module handles mapping from PostgreSQL objects like relation pages and SLRUs, to key-value pairs. The key to the Repository key-value store is a Key struct, which consists of a few integer fields. It's wide enough to store a full RelFileNode, fork and block number, and to distinguish those from metadata keys. 'pgdatadir_mapping.rs' is also responsible for maintaining a "partitioning" of the keyspace. Partitioning means splitting the keyspace so that each partition holds a roughly equal number of keys. The partitioning is used when new image layer files are created, so that each image layer file is roughly the same size. The partitioning is also responsible for reclaiming space used by deleted keys. The Repository implementation doesn't have any explicit support for deleting keys. Instead, the deleted keys are simply omitted from the partitioning, and when a new image layer is created, the omitted keys are not copied over to the new image layer. We might want to implement tombstone keys in the future, to reclaim space faster, but this will work for now.
Changes to low-level layer file code ------------------------------------ The concept of a "segment" is gone. Each layer file can now store an arbitrary range of Keys. Checkpointing, compaction ------------------------- The background tasks are somewhat different now. Whenever checkpoint_distance is reached, the WAL receiver thread "freezes" the current in-memory layer, and creates a new one. This is a quick operation and doesn't perform any I/O yet. It then launches a background "layer flushing thread" to write the frozen layer to disk, as a new L0 delta layer. This mechanism takes care of durability. It replaces the checkpointing thread. Compaction is a new background operation that takes a bunch of L0 delta layers, and reshuffles the data in them. It runs in a separate compaction thread. Deployment ---------- This also contains changes to the ansible scripts that enable having multiple different pageservers running at the same time in the staging environment. We will use that to keep an old version of the pageserver running, for clusters created with the old version, at the same time with a new pageserver with the new binary. Author: Heikki Linnakangas Author: Konstantin Knizhnik <knizhnik@zenith.tech> Author: Andrey Taranik <andrey@zenith.tech> Reviewed-by: Matthias Van De Meent <matthias@zenith.tech> Reviewed-by: Bojan Serafimov <bojan@zenith.tech> Reviewed-by: Konstantin Knizhnik <knizhnik@zenith.tech> Reviewed-by: Anton Shyrabokau <antons@zenith.tech> Reviewed-by: Dhammika Pathirana <dham@zenith.tech> Reviewed-by: Kirill Bulatov <kirill@zenith.tech> Reviewed-by: Anastasia Lubennikova <anastasia@zenith.tech> Reviewed-by: Alexey Kondratov <alexey@zenith.tech>	2022-03-28 05:41:15 -05:00
Dmitry Rodionov	e13bdd77fe	add safekeepers gossip and storage messaging rfcs; they were in prs during the rfc repo import. In addition to just importing them, I've added sequence diagrams to the storage messaging rfc	2022-03-22 15:01:26 +04:00
Heikki Linnakangas	d93fc371f3	Import all existing RFCs documents from the separate 'rfcs' repository.	2022-03-11 18:49:36 +02:00

48 Commits