rust/neon

mirror of https://github.com/neondatabase/neon.git synced 2026-05-24 08:30:37 +00:00

Author	SHA1	Message	Date
Dmitry Rodionov	767590bbd5	support tenants this patch adds support for tenants. This touches mostly pageserver. Directory layout on disk is changed to contain new layer of indirection. Now path to particular repository has the following structure: <pageserver workdir>/tenants/<tenant id>. Tenant id has the same format as timeline id. Tenant id is included in pageserver commands when needed. Also new commands are available in pageserver: tenant_list, tenant_create. This is also reflected in the CLI. During init default tenant is created and its id is saved in CLI config, so following commands can use it without extra options. Tenant id is also included in compute postgres configuration, so it can be passed via ServerInfo to safekeeper and in connection string to pageserver. For more info see docs/multitenancy.md.	2021-07-22 20:54:20 +03:00
anastasia	c913404739	Redirect log to pageserver.log during zenith init. Add new module logger.rs that contains shared code to init logging	2021-07-21 18:56:34 +03:00
sharnoff	c4b2bf7ebd	Use 'zenith_admin' as superuser name in `initdb`	2021-07-21 17:22:22 +03:00
Konstantin Knizhnik	9838c71a47	Explicit compact (#341) * Do not perform compaction of RocksDB storage on each GC iteration * Increase GC timeout to let GC tests pass * Add comment to gc_iteration	2021-07-19 16:49:12 +03:00
Stas Kelvich	2b33894e7b	few more review fixes	2021-07-19 14:52:41 +03:00
Dmitry Rodionov	ed0fcfa9b7	replace parse_duration crate because of unpatched known vulnerability resolves #87	2021-07-16 14:30:27 +03:00
Dmitry Rodionov	75e717fe86	allow both domains and ip addresses in connection options for pageserver and wal keeper. Also updated PageServerNode definition in control plane to account for that. resolves #303	2021-07-09 16:46:21 +03:00
Patrick Insinger	cc169a6896	pageserver - config file To simplify cloud ops, allow configuration via file. toml is used as the config format, and the file is stored in the working directory. Arguments used at initialization are saved in the config file. Config file params may be overridden by CLI arguments.	2021-06-14 09:40:22 -07:00
Patrick Insinger	77366b7a76	pageserver - remove env variables Use CLI args instead of environment variables to parameterize the working directory and postgres distribution. Before this change, there was a mixture of environment variables and CLI arguments that needed to be set. Moving to a single input simplifies cloud configuration management.	2021-06-14 09:40:22 -07:00
Stas Kelvich	d45839879c	Bind to socket earlier during pageserver init. That allows printing reasonable error message instead of panicking if address is already in use.	2021-05-21 00:26:31 +03:00
Heikki Linnakangas	ecf2d181c4	Tidy up the code to create PageServerConf Parse all the command line options before calling "zenith init" and changing current working dir. The rest of the options don't make any difference if we're initializing a new repository, but it seems strange and error-prone to parse some arguments at different times.	2021-05-20 19:28:57 +03:00
Heikki Linnakangas	600e1a0080	Pass PageServerConf as static ref. It's created once early in server startup, after parsing the command-line options, and never modified afterwards. To simplify things, pass it around as static ref, instead of making copies in all the different structs. We still pass around a reference to it, rather than putting it in a global variable, to allow unit testing with different configs in the same process.	2021-05-20 09:11:36 +03:00
Heikki Linnakangas	1912546e52	Change the meaning of PageServerConf.workdir Commit `746f667311` added the 'workdir' field and the get_*_path() functions, with the idea that we cd into the directory at page server startup, so that the get_*_path() functions can always return paths relative to '.', but 'workdir' shows the original path to it. Change it so that 'conf.workdir' is always set to '.', too, and the get_*_path() functions include 'workdir' in the returned paths. Why? Because that allows writing unit tests without changing the current directory. When I was working on commit `97992226d3`, I initially wrote the test so that it changed the current working directory, just like commit `746f667311` did. But that was problematic, when I tried to add another unit test that also wants to change the current working dir, because they could then not run concurrently. In fact, they could not even run serially, unless the current directory was carefully reset after the test. So it is better to avoid changing the current directory in tests.	2021-05-19 08:49:16 +03:00
Heikki Linnakangas	a6178c135f	Fix starting page server in non-daemonize mode. Commit `746f667311` moved the "chdir" earlier in the startup sequence, before daemonizing. But it forgot to remove a corresponding chdir call later in the sequence when not in daemonize mode. As a result, if you tried to start the pageserver without the --daemonize option, it always failed with "No such file or directory" error.	2021-05-19 08:49:09 +03:00
Heikki Linnakangas	66bced0f36	Fix leftover comment about async I/O	2021-05-18 20:47:35 +03:00
Eric Seppanen	398d522d88	cargo fmt	2021-05-17 09:29:58 -07:00
Stas Kelvich	746f667311	Refactor CLI and CLI<->pageserver interfaces to support remote pageserver This patch started as an effort to support CLI working against remote pageserver, but turned into a pretty big refactoring. * CLI now does not look into repository files directly. New commands 'branch_create' and 'identify_system' were introduced into page_service to support that. * Branch management that was scattered between local_env and zenith/main.rs is moved into pageserver/branches.rs. That code could better fit in Repository/Timeline impl, but I'll leave that for a different patch. * All tests-related code from local_env went into integration_tests/src/lib.rs as an extension to PostgresNode trait. * Paths-generating functions were concentrated around corresponding config types (LocalEnv and PageserverConf).	2021-05-17 19:17:51 +03:00
Eric Seppanen	6ff3f1b9fd	don't open log files multiple times Multiple fds writing to the same file doesn't work. One fd will overwrite the output of the other fd. We were opening log files three times (stdout, stderr, and slog). The symptoms can be seen when the program panics; the final file will have truncated or lost messages. After this change, all messages are preserved. If panicking and logging are concurrent (and they definitely can be), some of the messages may be interleaved in slightly inconvenient ways. File::try_clone() is essentially `dup` underneath, meaning the two will share the same file offset.	2021-05-13 00:32:39 -07:00
Heikki Linnakangas	b484b896b6	Refactor the functionality page_cache.rs. This moves things around: - The PageCache is split into two structs: Repository and Timeline. A Repository holds multiple Timelines. In order to get a page version, you must first get a reference to the Repository, then the Timeline in the repository, and finally call the get_page_at_lsn() function on the Timeline object. This sounds complicated, but because each connection from a compute node, and each WAL receiver, only deals with one timeline at a time, the callers can get the reference to the Timeline object once and hold onto it. The Timeline corresponds most closely to the old PageCache object. - Repository and Timeline are now abstract traits, so that we can support multiple implementations. I don't actually expect us to have multiple implementations for long. We have the RocksDB implementation now, but as soon as we have a different implementation that's usable, I expect that we will retire the RocksDB implementation. But I think this abstraction works as good documentation in any case: it's now easier to see what the interface for storing and loading pages from the repository is, by looking at the Repository/Timeline traits. The abstract traits are in repository.rs, and the RocksDB implementation of them is in repository/rocksdb.rs. - page_cache.rs is now a "switchboard" to get a handle to the repository. Currently, the page server can only handle one repository at a time, so there isn't much there, but in the future we might do multi-tenancy there.	2021-05-05 10:37:36 +03:00
anastasia	1cdeba9db7	[issue #18] log module name and position in the file	2021-05-03 15:17:51 +03:00
Eric Seppanen	4acdcbe90f	clippy cleanup #3 Fix issues raised by clippy. Mostly trivial ones, though some allow 4-5 lines of code to be reduced to 1.	2021-04-26 12:35:35 -07:00
Konstantin Knizhnik	3e007b0eb9	Do not delete versions in GC	2021-04-24 22:32:22 +03:00
Konstantin Knizhnik	499b4f7eba	Log garbage collection statistics	2021-04-23 18:02:58 +03:00
Konstantin Knizhnik	52ee3a2bac	Support CREATE DATABASE command	2021-04-23 17:03:56 +03:00
Konstantin Knizhnik	4a0a9e748c	Enable garbage collector	2021-04-22 17:52:15 +03:00
Konstantin Knizhnik	2ca8fbb6ff	Fix DEFAULT_GC_PERIOD_SEC type	2021-04-22 12:01:25 +03:00
Konstantin Knizhnik	c5a8c31b8a	Update comments	2021-04-22 11:46:20 +03:00
Konstantin Knizhnik	ed30f2096c	Disable GC by default	2021-04-22 11:30:27 +03:00
Konstantin Knizhnik	2dbbb8c59b	Address issues from Eric's review	2021-04-22 10:12:22 +03:00
Konstantin Knizhnik	9e7c45cb72	Merge with master	2021-04-22 09:45:13 +03:00
Heikki Linnakangas	a4fd1e1a80	Cleanup more issues noted by 'clippy' Mostly stuff that was introduced by commit `3600b33f1c`.	2021-04-22 09:20:05 +03:00
Eric Seppanen	1f3f4cfaf5	clippy cleanup #2 - remove needless return - remove needless format! - remove a few more needless clone() - from_str_radix(_, 10) -> .parse() - remove needless reference - remove needless `mut` Also manually replaced a match statement with map_err() because after clippy was done with it, there was almost nothing left in the match expression.	2021-04-21 17:56:58 -07:00
Konstantin Knizhnik	c981f4ad66	Implement garbage collection of unused versions	2021-04-21 19:04:30 +03:00
Konstantin Knizhnik	d8fa2ec367	Merge with main branch	2021-04-21 16:10:05 +03:00
Konstantin Knizhnik	07507274c0	Merge branch 'main' into rocksdb_pageserver	2021-04-21 16:06:31 +03:00
Eric Seppanen	92e4f4b3b6	cargo fmt	2021-04-20 17:59:56 -07:00
Heikki Linnakangas	f69db17409	Make WAL safekeeper work with zenith timelines	2021-04-20 19:11:29 +03:00
Heikki Linnakangas	3600b33f1c	Implement "timelines" in page server This replaces the page server's "datadir" concept. The Page Server now always works with a "Zenith Repository". When you initialize a new repository with "zenith init", it runs initdb and loads an initial basebackup of the freshly-created cluster into the repository, on "main" branch. Repository can hold multiple "timelines", which can be given human-friendly names, making them "branches". One page server simultaneously serves all timelines stored in the repository, and you can have multiple Postgres compute nodes connected to the page server, as long as they all operate on a different timeline. There is a new command "zenith branch", which can be used to fork off new branches from existing branches. The repository uses the directory layout described as Repository format v1 in https://github.com/zenithdb/rfcs/pull/5. It is highly inefficient: - we never create new snapshots. So in practice, it's really just a base backup of the initial empty cluster, and everything else is reconstructed by redoing all WAL - when you create a new timeline, the base snapshot and all WAL is copied from the old timeline to the new one. There is no smarts about referencing the old snapshots/wal from the ancestor timeline. To support all this, this commit includes a bunch of other changes: - Implement "basebackup" functionality in page server. When you initialize a new compute node with "zenith pg create", it connects to the page server, and requests a base backup of the Postgres data directory on that timeline. (the base backup excludes user tables, so it's not as bad as it sounds). - Have page server's WAL receiver write the WAL into timeline dir. This allows running a Page Server and Compute Nodes without a WAL safekeeper, until we get around to integrate that properly into the system. (Even after we integrate WAL safekeeper, this is perhaps how this will operate when you want to run the system on your laptop.) - restore_datadir.rs was renamed to restore_local_repo.rs, and heavily modified to use the new format. It now also restores all WAL. - Page server no longer scans and restores everything into memory at startup. Instead, when the first request is made for a timeline, the timeline is slurped into memory at that point. - The responsibility for telling page server to "callmemaybe" was moved into Postgres libpqpagestore code. Also, WAL producer connstring cannot be specified in the pageserver's command line anymore. - Having multiple "system identifiers" in the same page server is no longer supported. I repurposed much of that code to support multiple timelines, instead. - Implemented very basic, incomplete, support for PostgreSQL's Extended Query Protocol in page_service.rs. Turns out that rust-postgres' copy_out() function always uses the extended query protocol to send out the command, and I'm using that to stream the base backup from the page server. TODO: I haven't fixed the WAL safekeeper for this scheme, so all the integration tests involving safekeepers are failing. My plan is to modify the safekeeper to know about Zenith timelines, too, and modify it to work with the same Zenith repository format. It only needs to care about the '.zenith/timelines/<timeline>/wal' directories.	2021-04-20 19:11:27 +03:00
Konstantin Knizhnik	95160dee6d	Merge with main branch	2021-04-19 17:00:30 +03:00
Konstantin Knizhnik	8aa3013ec2	Merge branch 'main' into rocksdb_pageserver	2021-04-19 16:28:29 +03:00
Eric Seppanen	3725815935	pageserver: propagate errors instead of calling .unwrap() Just a few more places where we can drop the .unwrap() call in favor of `?`. Also include a fix to the log file handling: don't open the file twice. Writing to two fds would result in one message overwriting another. Presumably `File.try_clone()` reduces down to `dup` on Linux.	2021-04-18 23:06:35 -07:00
Eric Seppanen	3c7f810849	clippy cleanup #1 Resolve some basic warnings from clippy: - useless conversion to the same type - redundant field names in struct initialization - redundant single-component path imports	2021-04-18 19:15:06 -07:00
Heikki Linnakangas	d2c3ad162a	Prefer passing PageServerConf by reference. It seems more idiomatic Rust.	2021-04-16 10:42:41 +03:00
Heikki Linnakangas	b4c5cb2773	Clean up error types a little bit. Don't use std::io::Error for errors that are not I/O related. Prefer anyhow::Result instead.	2021-04-16 10:42:25 +03:00
anastasia	d7eeaec706	add test for restore from local pgdata	2021-04-15 16:43:03 +03:00
anastasia	1190030872	handle SLRU in restore_datadir	2021-04-15 16:43:03 +03:00
Konstantin Knizhnik	24b925d528	Support truncate WAL record	2021-04-15 15:50:47 +03:00
lubennikovaav	82dc1e82ba	Restore pageserver from s3 or local datadir (#9) * change pageserver --skip-recovery option to --restore-from=[s3|local] * implement restore from local pgdata * add simple test for local restore	2021-04-14 21:14:10 +03:00
Eric Seppanen	3c4ebc4030	init_logging: return Result, print error on file create Instead of panicking if the file create fails, print the filename and error description to stderr; then propagate the error to our caller.	2021-04-13 14:06:14 -07:00
Konstantin Knizhnik	a606336074	Fix bug in WALRecord serializer	2021-04-09 20:31:34 +03:00
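
Commits `6ff3f1b9fd` and `3725815935` in the table above depend on one behavior: handles obtained with `File::try_clone()` share a single file offset, while opening the same log file several times gives each handle its own offset, so one write can overwrite another. The following is a minimal sketch of that difference, not the pageserver's logging code; the function names and the paths passed in are hypothetical.

```rust
use std::fs::{self, File, OpenOptions};
use std::io::{self, Write};
use std::path::Path;

/// Writes through a file handle and a clone of it, then returns the file contents.
/// (Hypothetical helper for illustration only.)
fn write_via_cloned_handle(path: &Path) -> io::Result<String> {
    let mut log = File::create(path)?;
    // try_clone duplicates the underlying handle; both share one file offset.
    let mut log_clone = log.try_clone()?;
    log.write_all(b"first\n")?;
    // This write continues where the previous one stopped.
    log_clone.write_all(b"second\n")?;
    fs::read_to_string(path)
}

/// Opens the same path twice, then returns the file contents.
/// (Hypothetical helper for illustration only.)
fn write_via_separate_opens(path: &Path) -> io::Result<String> {
    let mut a = File::create(path)?;
    // A second, independent open: this handle has its own offset, starting at 0.
    let mut b = OpenOptions::new().write(true).open(path)?;
    a.write_all(b"first\n")?;
    // This write also starts at offset 0 and overwrites the bytes written via `a`.
    b.write_all(b"second\n")?;
    fs::read_to_string(path)
}
```

Called on a scratch file, the first function preserves both messages, while the second loses the first message. This matches the truncated or lost log output described in commit `6ff3f1b9fd`.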

52 Commits
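
Two of the commits listed above concern how on-disk paths are built: `1912546e52` makes the path helper functions include `workdir` in the paths they return, so tests need not change the current directory, and `767590bbd5` introduces the layout `<pageserver workdir>/tenants/<tenant id>`. The sketch below combines the two ideas under stated assumptions: the `Conf` struct, its method names, and the use of a plain string for the tenant id are hypothetical and are not the actual `PageServerConf` API.

```rust
use std::path::PathBuf;

/// Hypothetical stand-in for the pageserver configuration; only the field needed here.
struct Conf {
    workdir: PathBuf,
}

impl Conf {
    /// Directory holding all tenants: `<workdir>/tenants`.
    fn tenants_path(&self) -> PathBuf {
        self.workdir.join("tenants")
    }

    /// Directory of one tenant's repository: `<workdir>/tenants/<tenant id>`.
    /// The tenant id is taken as an opaque string in this sketch.
    fn tenant_path(&self, tenant_id: &str) -> PathBuf {
        self.tenants_path().join(tenant_id)
    }
}
```

Because every returned path is derived from `workdir`, two tests in the same process can each build a `Conf` pointing at a different directory without calling `chdir`.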