mirror of
https://github.com/neondatabase/neon.git
synced 2025-12-26 23:59:58 +00:00
The integration tests, which depend on both walkeeper and pageserver, are moved into yet another directory, 'integration_tests'.
180 lines
6.2 KiB
Plaintext
180 lines
6.2 KiB
Plaintext
Page Server
|
|
===========
|
|
|
|
|
|
How to test
|
|
-----------
|
|
|
|
|
|
1. Compile and install Postgres from this repository (there are
|
|
modifications, so vanilla Postgres won't do)
|
|
|
|
./configure --prefix=/home/heikki/zenith-install
|
|
|
|
2. Compile the page server
|
|
|
|
cd pageserver
|
|
cargo build
|
|
|
|
3. Create another "dummy" cluster that will be used by the page server when it applies
|
|
the WAL records. (shouldn't really need this, getting rid of it is a TODO):
|
|
|
|
/home/heikki/zenith-install/bin/initdb -D /data/zenith-dummy
|
|
|
|
|
|
4. Initialize and start a new postgres cluster
|
|
|
|
/home/heikki/zenith-install/bin/initdb -D /data/zenith-test-db --username=postgres
|
|
/home/heikki/zenith-install/bin/postgres -D /data/zenith-test-db
|
|
|
|
5. In another terminal, start the page server.
|
|
|
|
PGDATA=/data/zenith-dummy PATH=/home/heikki/zenith-install/bin:$PATH ./target/debug/pageserver
|
|
|
|
It should connect to the postgres instance using streaming replication, and print something
|
|
like this:
|
|
|
|
$ PGDATA=/data/zenith-dummy PATH=/home/heikki/zenith-install/bin:$PATH ./target/debug/pageserver
|
|
Starting WAL receiver
|
|
connecting...
|
|
Starting page server on 127.0.0.1:5430
|
|
connected!
|
|
page cache is empty
|
|
|
|
6. You can now open another terminal and issue DDL commands. Generated WAL records will
|
|
be streamed to the page servers, and attached to blocks that they apply to in its
|
|
page cache
|
|
|
|
$ psql postgres -U postgres
|
|
psql (14devel)
|
|
Type "help" for help.
|
|
|
|
postgres=# create table mydata (i int4);
|
|
CREATE TABLE
|
|
postgres=# insert into mydata select g from generate_series(1,100) g;
|
|
INSERT 0 100
|
|
postgres=#
|
|
|
|
7. The GetPage@LSN interface to the compute nodes isn't working yet, but to simulate
|
|
that, the page server generates a test GetPage@LSN call every 5 seconds on a random
|
|
block that's in the page cache. In a few seconds, you should see output from that:
|
|
|
|
testing GetPage@LSN for block 0
|
|
WAL record at LSN 23584576 initializes the page
|
|
2021-03-19 11:03:13.791 EET [11439] LOG: applied WAL record at 0/167DF40
|
|
2021-03-19 11:03:13.791 EET [11439] LOG: applied WAL record at 0/167DF80
|
|
2021-03-19 11:03:13.791 EET [11439] LOG: applied WAL record at 0/167DFC0
|
|
2021-03-19 11:03:13.791 EET [11439] LOG: applied WAL record at 0/167E018
|
|
2021-03-19 11:03:13.791 EET [11439] LOG: applied WAL record at 0/167E058
|
|
2021-03-19 11:03:13.791 EET [11439] LOG: applied WAL record at 0/167E098
|
|
2021-03-19 11:03:13.791 EET [11439] LOG: applied WAL record at 0/167E0D8
|
|
2021-03-19 11:03:13.792 EET [11439] LOG: applied WAL record at 0/167E118
|
|
2021-03-19 11:03:13.792 EET [11439] LOG: applied WAL record at 0/167E158
|
|
2021-03-19 11:03:13.792 EET [11439] LOG: applied WAL record at 0/167E198
|
|
applied 10 WAL records to produce page image at LSN 18446744073709547246
|
|
|
|
|
|
|
|
Architecture
|
|
============
|
|
|
|
The Page Server is responsible for all operations on a number of
|
|
"chunks" of relation data. A chunk corresponds to a PostgreSQL
|
|
relation segment (i.e. one max. 1 GB file in the data directory), but
|
|
it holds all the different versions of every page in the segment that
|
|
are still needed by the system.
|
|
|
|
Determining which chunk each Page Server holds is handled elsewhere. (TODO:
|
|
currently, there is only one Page Server which holds all chunks)
|
|
|
|
The Page Server has a few different duties:
|
|
|
|
- Respond to GetPage@LSN requests from the Compute Nodes
|
|
- Receive WAL from WAL safekeeper
|
|
- Replay WAL that's applicable to the chunks that the Page Server maintains
|
|
- Backup to S3
|
|
|
|
|
|
The Page Server consists of multiple threads that operate on a shared
|
|
cache of page versions:
|
|
|
|
|
|
| WAL
|
|
V
|
|
+--------------+
|
|
| |
|
|
| WAL receiver |
|
|
| |
|
|
+--------------+
|
|
+----+
|
|
+---------+ .......... | |
|
|
| | . . | |
|
|
GetPage@LSN | | . backup . -------> | S3 |
|
|
-------------> | Page | page cache . . | |
|
|
| Service | .......... | |
|
|
page | | +----+
|
|
<------------- | |
|
|
+---------+
|
|
|
|
...................................
|
|
. .
|
|
. Garbage Collection / Compaction .
|
|
...................................
|
|
|
|
Legend:
|
|
|
|
+--+
|
|
| | A thread or multi-threaded service
|
|
+--+
|
|
|
|
....
|
|
. . Component that we will need, but doesn't exist at the moment. A TODO.
|
|
....
|
|
|
|
---> Data flow
|
|
<---
|
|
|
|
|
|
Page Service
|
|
------------
|
|
|
|
The Page Service listens for GetPage@LSN requests from the Compute Nodes,
|
|
and responds with pages from the page cache.
|
|
|
|
|
|
WAL Receiver
|
|
------------
|
|
|
|
The WAL receiver connects to the external WAL safekeeping service (or
|
|
directly to the primary) using PostgreSQL physical streaming
|
|
replication, and continuously receives WAL. It decodes the WAL records,
|
|
and stores them to the page cache.
|
|
|
|
|
|
Page Cache
|
|
----------
|
|
|
|
The Page Cache is a data structure, to hold all the different page versions.
|
|
It is accessed by all the other threads, to perform their duties.
|
|
|
|
Currently, the page cache is implemented fully in-memory. TODO: Store it
|
|
on disk. Define a file format.
|
|
|
|
|
|
TODO: Garbage Collection / Compaction
|
|
-------------------------------------
|
|
|
|
Periodically, the Garbage Collection / Compaction thread runs
|
|
and applies pending WAL records, and removes old page versions that
|
|
are no longer needed.
|
|
|
|
|
|
TODO: Backup service
|
|
--------------------
|
|
|
|
The backup service is responsible for periodically pushing the chunks to S3.
|
|
|
|
TODO: How/when do restore from S3? Whenever we get a GetPage@LSN request for
|
|
a chunk we don't currently have? Or when an external Control Plane tells us?
|
|
|