mirror of
https://github.com/neondatabase/neon.git
synced 2026-01-08 14:02:55 +00:00
The layered storage format is good enough that we don't need the rocksdb implementation anymore. There are a lot of known issues but we'll keep working on them.
116 lines
3.8 KiB
Plaintext
116 lines
3.8 KiB
Plaintext
## Page server architecture
|
|
|
|
The Page Server has a few different duties:
|
|
|
|
- Respond to GetPage@LSN requests from the Compute Nodes
|
|
- Receive WAL from WAL safekeeper
|
|
- Replay WAL that's applicable to the chunks that the Page Server maintains
|
|
- Backup to S3
|
|
|
|
|
|
The Page Server consists of multiple threads that operate on a shared
|
|
cache of page versions:
|
|
|
|
|
|
| WAL
|
|
V
|
|
+--------------+
|
|
| |
|
|
| WAL receiver |
|
|
| |
|
|
+--------------+
|
|
+----+
|
|
+---------+ .......... | |
|
|
| | . . | |
|
|
GetPage@LSN | | . backup . -------> | S3 |
|
|
-------------> | Page | page cache . . | |
|
|
| Service | .......... | |
|
|
page | | +----+
|
|
<------------- | |
|
|
+---------+
|
|
|
|
...................................
|
|
. .
|
|
. Garbage Collection / Compaction .
|
|
...................................
|
|
|
|
Legend:
|
|
|
|
+--+
|
|
| | A thread or multi-threaded service
|
|
+--+
|
|
|
|
....
|
|
. . Component that we will need, but doesn't exist at the moment. A TODO.
|
|
....
|
|
|
|
---> Data flow
|
|
<---
|
|
|
|
|
|
Page Service
|
|
------------
|
|
|
|
The Page Service listens for GetPage@LSN requests from the Compute Nodes,
|
|
and responds with pages from the page cache.
|
|
|
|
|
|
WAL Receiver
|
|
------------
|
|
|
|
The WAL receiver connects to the external WAL safekeeping service (or
|
|
directly to the primary) using PostgreSQL physical streaming
|
|
replication, and continuously receives WAL. It decodes the WAL records,
|
|
and stores them to the page cache repository.
|
|
|
|
|
|
Page Cache
|
|
----------
|
|
|
|
The Page Cache is a switchboard to access different Repositories.
|
|
|
|
#### Repository
|
|
Repository corresponds to one .zenith directory.
|
|
Repository is needed to manage Timelines.
|
|
Each repository has associated WAL redo service.
|
|
|
|
There is currently only one implementation of the Repository trait: LayeredRepository.
|
|
It uses custom storage format, described in layered_repository/README.md
|
|
|
|
#### Timeline
|
|
Timeline is a page cache workhorse that accepts page changes
|
|
and serves get_page_at_lsn() and get_rel_size() requests.
|
|
Note: this has nothing to do with PostgreSQL WAL timeline.
|
|
|
|
#### Branch
|
|
We can create branch at certain LSN.
|
|
Each Branch lives in a corresponding timeline and has an ancestor.
|
|
|
|
To get full snapshot of data at certain moment we need to traverse timeline and its ancestors.
|
|
|
|
#### WAL redo service
|
|
WAL redo service - service that runs PostgreSQL in a special wal_redo mode
|
|
to apply given WAL records over an old page image and return new page image.
|
|
|
|
|
|
TODO: Garbage Collection / Compaction
|
|
-------------------------------------
|
|
|
|
Periodically, the Garbage Collection / Compaction thread runs
|
|
and applies pending WAL records, and removes old page versions that
|
|
are no longer needed.
|
|
|
|
|
|
TODO: Backup service
|
|
--------------------
|
|
|
|
The backup service is responsible for periodically pushing the chunks to S3.
|
|
|
|
TODO: How/when do restore from S3? Whenever we get a GetPage@LSN request for
|
|
a chunk we don't currently have? Or when an external Control Plane tells us?
|
|
|
|
TODO: Sharding
|
|
--------------------
|
|
|
|
We should be able to run multiple Page Servers that handle sharded data.
|