rust/neon

mirror of https://github.com/neondatabase/neon.git synced 2026-01-08 22:12:56 +00:00

Author	SHA1	Message	Date
Arthur Petukhovsky	134eeeb096	Add more common storage metrics (#1722) - Enabled process exporter for storage services - Changed zenith_proxy prefix to just proxy - Removed old `monitoring` directory - Removed common prefix for metrics, now our common metrics have `libmetrics_` prefix, for example `libmetrics_serve_metrics_count` - Added `test_metrics_normal_work`	2022-05-17 19:29:01 +03:00
Heikki Linnakangas	55ea3f262e	Fix race condition leading to panic in remote storage sync thread. The SyncQueue consisted of a tokio mpsc channel, and an atomic counter to keep track of how many items there are in the channel. Updating the atomic counter was racy, and sometimes the consumer would decrement the counter before the producer had incremented it, leading to integer wraparound to usize::MAX. Calling Vec::with_capacity(usize::MAX) leads to a panic. To fix, replace the channel with a VecDeque protected by a Mutex, and a condition variable for signaling. Now that the queue is protected by a standard blocking Mutex and Condvar, refactor the functions touching it to be sync, not async. A theoretical downside of this is that the calls to push items to the queue and the storage sync thread that drains the queue might now need to wait, if another thread is busy manipulating the queue. I believe that's OK; the lock isn't held for very long, and these operations are made in background threads, not in the hot GetPage@LSN path, so they're not very latency-sensitive. Fixes #1719. Also add a test case.	2022-05-17 18:14:57 +03:00
Heikki Linnakangas	f03779bf1a	Fix wait_for_last_record_lsn() and wait_for_upload() python functions. The contract for wait_for() was not very clear. It waits until the given function returns successfully, without an exception, but the wait_for_last_record_lsn() and wait_for_upload() functions used "a < b" as the condition, i.e. they thought that wait_for() would poll until the function returns true. Inline the logic from wait_for() into those two functions, it's not that complicated, and you get a more specific error message too, if it fails. Also add a comment to wait_for() to make it more clear how it works. Also change remote_consistent_lsn() to return 0 instead of raising an exception, if remote is None. That can happen if nothing has been uploaded to remote storage for the timeline yet. It happened once in the CI, and I was able to reproduce that locally too by adding a sleep to the storage sync thread, to delay the first upload.	2022-05-17 18:14:10 +03:00
Kirill Bulatov	f2881bbd8a	Start and stop single etcd and mock s3 servers globally in python tests	2022-05-17 01:17:44 +03:00
Kirill Bulatov	a884f4cf6b	Add etcd to neon_local	2022-05-17 01:17:44 +03:00
Kirill Bulatov	9a0fed0880	Enable at least 1 safekeeper in every test	2022-05-17 01:17:44 +03:00
Thang Pham	e4a70faa08	Add more information to timeline-related APIs (#1673) Resolves #1488. - implemented `GET tenant/:tenant_id/timeline/:timeline_id/wal_receiver` endpoint - returned `thread_id` in `thread_mgr::spawn` - added `latest_gc_cutoff_lsn` field to `LocalTimelineInfo` struct	2022-05-16 11:05:43 -04:00
Heikki Linnakangas	51ea9c3053	Don't swallow panics when the pageserver is built with failpoints. It's very confusing, and because you don't get a stack trace and error message in the logs, makes debugging very hard. However, the 'test_pageserver_recovery' test relied on that behavior. To support that, add a new "exit" action to the pageserver 'failpoints' command, so that you can explicitly request to exit the process when a failpoint is hit.	2022-05-16 09:58:58 +03:00
Heikki Linnakangas	a10cac980f	Continue with pageserver startup, if loading some tenants fails. Fixes https://github.com/neondatabase/neon/issues/1664	2022-05-15 00:25:38 +03:00
Egor Suvorov	768c846eeb	Fix test_delete_force from #1653 conflicting with #1692	2022-05-13 17:36:18 +02:00
Anastasia Lubennikova	a2561f0a78	Use tenant's pitr_interval instead of hardcoded 0 in the command. Adjust python tests that use the	2022-05-13 18:32:14 +03:00
Anastasia Lubennikova	aa7c601eca	Fix pitr_interval check in GC: Use timestamp->LSN mapping instead of file modification time. Fix 'latest_gc_cutoff_lsn' - set it to the minimum of pitr_cutoff and gc_cutoff. Add new test: test_pitr_gc	2022-05-13 18:32:14 +03:00
Egor Suvorov	bf899a57d9	Safekeeper: add timeline/tenant force delete HTTP endpoints (closes #895) * There is no auth in Safekeeper HTTP at all currently, so simply calling `check_permission` is not enough. * There are no checks of Safekeeper still working with the data, as "still working" is blurry now: a timeline may be "active" while there are no compute nodes and all data is propagated. * Still, callmemaybe is deactivated, and timeline is removed from the internal map. It can easily sneak back in case of race conditions and implicit creations, though.	2022-05-13 15:43:52 +02:00
Kirill Bulatov	85884a1599	Disable tenant relocation python test	2022-05-13 01:26:38 +03:00
Thang Pham	ae20751724	update `ZenithCli::create_tenant` return signature (#1692) to include the initial timeline's ID in addition to the new tenant's ID. Context: follow-up of https://github.com/neondatabase/neon/pull/1689	2022-05-12 17:27:08 -04:00
Thang Pham	5812e26b90	Create an initial timeline on CLI tenant creation (#1689) Resolves #1655	2022-05-12 16:33:09 -04:00
Dmitry Rodionov	0f3ec83172	avoid detach with alive branches	2022-05-05 12:54:42 +03:00
Anastasia Lubennikova	e2cf77441d	Implement pg_database_size(). In this implementation dbsize equals sum of all relation sizes, excluding shared ones.	2022-05-04 18:14:45 +03:00
Arseny Sher	b9fd8a36ad	Remember timeline_start_lsn and local_start_lsn on safekeeper. Make it remember when timeline starts in general and on this safekeeper in particular (the point might be later on new safekeeper replacing failed one). Bumps control file and walproposer protocol versions. While protocol is bumped, also add safekeeper node id to AcceptorProposerGreeting. ref #1561	2022-05-04 14:32:03 +04:00
Anastasia Lubennikova	2f9b17b9e5	Add simple test of pageserver recovery after crash. To cause a crash, use failpoints in checkpointer	2022-05-03 17:13:09 +03:00
Heikki Linnakangas	9ede38b6c4	Support finding LSN from a commit timestamp. A new `get_lsn_by_timestamp` command is added to the libpq page service API. An extra timestamp field is now stored in an extra field after each Clog page. It is the timestamp of the latest commit, among all the transactions on the Clog page. To find the overall latest commit, we need to scan all Clog pages, but this isn't a very frequent operation so that's not too bad. To find the LSN that corresponds to a timestamp, we perform a binary search. The binary search starts with min = last LSN when GC ran, and max = latest LSN on the timeline. On each iteration of the search we check if there are any commits with a higher-than-requested timestamp at that LSN. Implements github issue 1361.	2022-05-03 09:28:57 +03:00
Konstantin Knizhnik	baa59512b8	Traverse frozen layer in get_reconstruct_data in reverse order (#1601) * Traverse frozen layer in get_reconstruct_data in reverse order * Fix comments on frozen layers. Note explicitly the order that the layers are in the queue. * Add fail point to reproduce failpoint iteration error Co-authored-by: Heikki Linnakangas <heikki@neon.tech>	2022-05-03 08:07:14 +03:00
Kirill Bulatov	5cb501c2b3	Make remote storage test less flaky	2022-05-02 20:04:48 +03:00
Dhammika Pathirana	3128e8c75c	Fix tenant conf test Signed-off-by: Dhammika Pathirana <dhammika@gmail.com>	2022-05-01 13:13:25 -07:00
Dmitry Rodionov	67b4e38092	temporarily disable test_backpressure_received_lsn_lag	2022-04-29 15:53:56 +03:00
Dmitry Rodionov	05f8e6a050	Use fsync+rename for atomic downloads from remote storage Use failpoint in test_remote_storage to check the behavior	2022-04-29 15:53:56 +03:00
Kirill Bulatov	2911eb084a	Remove timeline files on detach	2022-04-29 09:19:18 +03:00
Anastasia Lubennikova	5c5c3c64f3	Fix tenant config parsing. Add a test	2022-04-28 11:49:19 +03:00
Arthur Petukhovsky	29539b0561	Set wal_keep_size to zero (#1507) wal_keep_size is already set to 0 in our cloud setup, but we don't use this value in tests. This commit fixes wal_keep_size in control_plane and adds tests for WAL recycling and lagging safekeepers.	2022-04-27 19:09:28 +03:00
Dhammika Pathirana	66694e736a	Fix add ps tenant config Signed-off-by: Dhammika Pathirana <dhammika@gmail.com>	2022-04-27 00:05:13 -07:00
Dhammika Pathirana	091cefaa92	Fix add compaction for key partitioning Signed-off-by: Dhammika Pathirana <dhammika@gmail.com>	2022-04-27 00:05:13 -07:00
Dhammika Pathirana	6391862d8a	Add branch traversal test Signed-off-by: Dhammika Pathirana <dhammika@gmail.com>	2022-04-27 00:05:13 -07:00
Arseny Sher	8b9d523f3c	Remove old WAL on safekeepers. Remove when it is consumed by all of 1) pageserver (remote_consistent_lsn) 2) safekeeper peers 3) s3 WAL offloading. In test s3 offloading for now is mocked by directly bumping s3_wal_lsn. ref #1403	2022-04-26 23:02:23 +04:00
Konstantin Knizhnik	5f83c9290b	Make it possible to specify per-tenant configuration parameters Add tenant config API and 'zenith tenant config' CLI command. Add 'show' query to pageserver protocol for tenant-specific config parameters Refactoring: move tenant_config code to a separate module. Save tenant conf file to tenant's directory, when tenant is created to recover it on pageserver restart. Ignore error during tenant config loading, while it is not supported by console Define PiTR interval for GC. refer #1320	2022-04-22 11:24:29 +03:00
Arseny Sher	abcd7a4b1f	Insert less data in test_wal_restore. Otherwise it sometimes hits 2m statement timeout in CI.	2022-04-21 16:00:15 +04:00
Kirill Bulatov	81cad6277a	Move and library crates into a dedicated directory and rename them	2022-04-21 13:30:33 +03:00
Konstantin Knizhnik	ac52f4f2d6	Set superuser when initializing database for wal recovery (#1544)	2022-04-20 13:24:38 +03:00
Heikki Linnakangas	5e95338ee9	Improve logging in test_wal_restore.py - Capture the output of the restore_from_wal.sh in a log file - Kill "restored" Postgres server on test failure	2022-04-20 11:18:40 +03:00
Kirill Bulatov	3e6087a12f	Remove S3 archiving	2022-04-19 23:13:52 +03:00
Kirill Bulatov	52e0816fa5	wal_acceptor -> safekeeper	2022-04-18 12:52:31 +03:00
Heikki Linnakangas	4a8c663452	Refactor pgbench tests. - Remove batch_others/test_pgbench.py. It was a quick check that pgbench works, without actually recording any performance numbers, but that doesn't seem very interesting anymore. Remove it to avoid confusing it with the actual pgbench benchmarks - Run pgbench with "-n" and "-S" options, for two different workloads: simple-updates, and SELECT-only. Previously, we would only run it with the "default" TPCB-like workload. That's more or less the same as the simple-update (-n) workload, but I think the simple-update workload is more relevant for testing storage performance. The SELECT-only workload is a new thing to measure. - Merge test_perf_pgbench.py and test_perf_pgbench_remote.py. I added a new "remote" implementation of the PgCompare class, which allows running the same tests against an already-running Postgres instance. - Make the PgBenchRunResult.parse_from_output function more flexible. pgbench can print different lines depending on the command-line options, but the parsing function expected a particular set of lines.	2022-04-14 13:31:42 +03:00
Heikki Linnakangas	a009fe912a	Refactor connection option handling in python tests The PgProtocol.connect() function took extra options for username, database, etc. Remove those options, and have a generic way for each subclass of PgProtocol to provide some default options, with the capability to override them in the connect() call.	2022-04-14 13:31:40 +03:00
Heikki Linnakangas	19954dfd8a	Refactor proxy options test to not rely on the 'schema' argument. It was the only test that used the 'schema' argument to the connect() function. I'm about to refactor the option handling and will remove the special 'schema' argument altogether, so rewrite the test to not use it.	2022-04-14 13:31:37 +03:00
Konstantin Knizhnik	07a9553700	Add test for restore from WAL (#1366) * Add test for restore from WAL * Fix python formatting * Choose unused port in wal restore test * Move recovery tests to zenith_utils/scripts * Set LD_LIBRARY_PATH in wal recovery scripts * Fix python test formatting * Fix mypy warning * Bump postgres version * Bump postgres version	2022-04-11 22:30:08 +03:00
Arthur Petukhovsky	6bc78a0e77	Log more info in test_many_timelines asserts (#1473) It will help to debug #1470 as soon as it happens again	2022-04-07 01:44:26 +03:00
Arthur Petukhovsky	a40b7cd516	Fix timeouts in test_restarts_under_load (#1436) * Enable backpressure in test_restarts_under_load * Remove hacks because #644 is fixed now * Adjust config in test_restarts_under_load	2022-03-31 17:00:09 +03:00
Anton Shyrabokau	5c5629910f	Add a test case for reading historic page versions (#1314) * Add a test case for reading historic page versions Test read_page_at_lsn returns correct results when compared to page inspect. Validate possibility of reading pages from dropped relation. Ensure functions read latest version when null lsn supplied. Check that functions do not poison buffer cache with stale page versions.	2022-03-29 22:13:06 -07:00
Arseny Sher	ec3bc74165	Add safekeeper information exchange through etcd. Safekeepers now publish to and pull from etcd per-timeline data. Immediate goal is WAL truncation, for which every safekeeper must know remote_consistent_lsn; the next would be callmemaybe replacement. Adds corresponding '--broker' argument to safekeeper and ability to run etcd in tests. Adds test checking remote_consistent_lsn is indeed communicated.	2022-03-29 18:16:49 +04:00
Arseny Sher	75002adc14	Make shared_buffers large in test_pageserver_catchup. We intentionally write while pageserver is down, so we shouldn't query it. Noticed by @petuhovskiy at https://github.com/zenithdb/postgres/pull/141#issuecomment-1080261700	2022-03-28 20:34:06 +04:00
Heikki Linnakangas	07342f7519	Major storage format rewrite. This is a backwards-incompatible change. The new pageserver cannot read repositories created with an old pageserver binary, or vice versa. Simplify Repository to a value-store ------------------------------------ Move the responsibility of tracking relation metadata, like which relations exist and what are their sizes, from Repository to a new module, pgdatadir_mapping.rs. The interface to Repository is now a set of simple key-value PUT/GET operations. It's still not any old key-value store though. A Repository is still responsible for handling branching, and every GET operation comes with an LSN. Mapping from Postgres data directory to keys/values --------------------------------------------------- All the data is now stored in the key-value store. The 'pgdatadir_mapping.rs' module handles mapping from PostgreSQL objects like relation pages and SLRUs, to key-value pairs. The key to the Repository key-value store is a Key struct, which consists of a few integer fields. It's wide enough to store a full RelFileNode, fork and block number, and to distinguish those from metadata keys. 'pgdatadir_mapping.rs' is also responsible for maintaining a "partitioning" of the keyspace. Partitioning means splitting the keyspace so that each partition holds a roughly equal number of keys. The partitioning is used when new image layer files are created, so that each image layer file is roughly the same size. The partitioning is also responsible for reclaiming space used by deleted keys. The Repository implementation doesn't have any explicit support for deleting keys. Instead, the deleted keys are simply omitted from the partitioning, and when a new image layer is created, the omitted keys are not copied over to the new image layer. We might want to implement tombstone keys in the future, to reclaim space faster, but this will work for now.
Changes to low-level layer file code ------------------------------------ The concept of a "segment" is gone. Each layer file can now store an arbitrary range of Keys. Checkpointing, compaction ------------------------- The background tasks are somewhat different now. Whenever checkpoint_distance is reached, the WAL receiver thread "freezes" the current in-memory layer, and creates a new one. This is a quick operation and doesn't perform any I/O yet. It then launches a background "layer flushing thread" to write the frozen layer to disk, as a new L0 delta layer. This mechanism takes care of durability. It replaces the checkpointing thread. Compaction is a new background operation that takes a bunch of L0 delta layers, and reshuffles the data in them. It runs in a separate compaction thread. Deployment ---------- This also contains changes to the ansible scripts that enable having multiple different pageservers running at the same time in the staging environment. We will use that to keep an old version of the pageserver running, for clusters created with the old version, at the same time with a new pageserver with the new binary. Author: Heikki Linnakangas Author: Konstantin Knizhnik <knizhnik@zenith.tech> Author: Andrey Taranik <andrey@zenith.tech> Reviewed-by: Matthias Van De Meent <matthias@zenith.tech> Reviewed-by: Bojan Serafimov <bojan@zenith.tech> Reviewed-by: Konstantin Knizhnik <knizhnik@zenith.tech> Reviewed-by: Anton Shyrabokau <antons@zenith.tech> Reviewed-by: Dhammika Pathirana <dham@zenith.tech> Reviewed-by: Kirill Bulatov <kirill@zenith.tech> Reviewed-by: Anastasia Lubennikova <anastasia@zenith.tech> Reviewed-by: Alexey Kondratov <alexey@zenith.tech>	2022-03-28 05:41:15 -05:00
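
Commit 55ea3f262e above replaces a channel plus atomic counter with a VecDeque protected by a Mutex and a condition variable. The following is a minimal sketch of that pattern, not the pageserver's actual code: the `SyncQueue` fields and the `push`, `next_batch`, and `len` method names are assumptions made for illustration. Because the length is read from the deque while the lock is held, there is no separate counter that can drift or wrap around.

```rust
use std::collections::VecDeque;
use std::sync::{Condvar, Mutex};

/// Hypothetical blocking queue (illustrative names, not the real SyncQueue).
pub struct SyncQueue<T> {
    items: Mutex<VecDeque<T>>,
    not_empty: Condvar,
}

impl<T> SyncQueue<T> {
    pub fn new() -> Self {
        SyncQueue {
            items: Mutex::new(VecDeque::new()),
            not_empty: Condvar::new(),
        }
    }

    /// Append one item and wake one waiting consumer.
    pub fn push(&self, item: T) {
        let mut items = self.items.lock().unwrap();
        items.push_back(item);
        self.not_empty.notify_one();
    }

    /// Block until the queue is non-empty, then remove up to `max` items
    /// from the front, preserving insertion order.
    pub fn next_batch(&self, max: usize) -> Vec<T> {
        let mut items = self.items.lock().unwrap();
        // Loop to tolerate spurious wakeups.
        while items.is_empty() {
            items = self.not_empty.wait(items).unwrap();
        }
        let n = max.min(items.len());
        items.drain(..n).collect()
    }

    /// Current number of queued items, read under the same lock.
    pub fn len(&self) -> usize {
        self.items.lock().unwrap().len()
    }
}
```

A producer thread would call `push` and the draining thread would call `next_batch` in a loop; both are ordinary blocking calls, which matches the commit's note that the functions touching the queue became sync rather than async.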

178 Commits
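
Commit 9ede38b6c4 above describes finding the LSN for a commit timestamp by binary search between the last GC LSN and the latest LSN, checking at each step whether any commit with a higher-than-requested timestamp is visible. The sketch below illustrates only that search loop under stated assumptions: LSNs are modeled as plain `u64` values, and the `has_newer_commit_at` closure is a hypothetical stand-in for the real check against the timestamps stored with the Clog pages. It assumes the predicate is monotonic, that is, once a newer commit is visible at some LSN it stays visible at all higher LSNs.

```rust
/// Returns the largest LSN in `[min_lsn, max_lsn]` at which no commit newer
/// than the requested timestamp is visible, or `None` if a newer commit is
/// already visible at `min_lsn` (or the range is empty).
///
/// `has_newer_commit_at` is a hypothetical predicate supplied by the caller.
pub fn find_lsn_for_timestamp<F>(min_lsn: u64, max_lsn: u64, has_newer_commit_at: F) -> Option<u64>
where
    F: Fn(u64) -> bool,
{
    if min_lsn > max_lsn || has_newer_commit_at(min_lsn) {
        return None;
    }
    let (mut low, mut high) = (min_lsn, max_lsn);
    // Invariant: the predicate is false at `low`.
    while low < high {
        // Upper midpoint, so that `low = mid` always makes progress.
        let mid = high - (high - low) / 2;
        if has_newer_commit_at(mid) {
            high = mid - 1;
        } else {
            low = mid;
        }
    }
    Some(low)
}
```

The number of predicate calls grows with the logarithm of the LSN range, which is why the commit message can afford a relatively expensive check on each iteration.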