Commit Graph

1299 Commits

Author SHA1 Message Date
Bojan Serafimov
c404e65be4 WIP remote zenith fixture 2022-03-07 16:54:22 -05:00
Bojan Serafimov
29d72e8955 Add proxy test 2022-03-07 14:32:24 -05:00
Kirill Bulatov
66eb2a1dd3 Replace zenith/build build image with zimg/* ones 2022-03-04 13:46:44 +02:00
Kirill Bulatov
9424bfae22 Use a separate newtype for ZId that (de)serialize as hex strings 2022-03-04 10:58:40 +02:00
Dmitry Rodionov
1d90b1b205 add node id to pageserver (#1310)
* Add --id argument to safekeeper setting its unique u64 id.

In preparation for storage node messaging. IDs are supposed to be monotonically
assigned by the console. In tests it is issued by ZenithEnv; at the zenith cli
level and fixtures, string name is completely replaced by integer id. Example
TOML configs are adjusted accordingly.

Sequential ids are chosen over Zid mainly because they are compact and easy to
type/remember.

* add node id to pageserver

This adds node id parameter to pageserver configuration. Also I use a
simple builder to construct pageserver config struct to avoid setting
node id to some temporary invalid value. Some of the changes in test
fixtures are needed to split init and start operations for envrionment.

Co-authored-by: Arseny Sher <sher-ars@yandex.ru>
2022-03-04 01:10:42 +03:00
Kirill Bulatov
949f8b4633 Fix 1.59 rustc clippy warnings 2022-03-02 21:35:34 +02:00
Andrey Taranik
26a68612d9 safekeeper to cosnole call fix (#1333) 2022-02-27 01:36:40 +03:00
Andrey Taranik
850dfd02df Release deployment (#1331)
* new deployment flow for staging and production

* ansible playbooks and circleci config fixes

* cleanup before merge

* additional cleanup before merge

* debug deployment to staging env

* debug deployment to staging env

* debug deployment to staging env

* debug deployment to staging env

* debug deployment to staging env

* debug deployment to staging env

* bianries artifacts path fix for ansible playbooks

* deployment flow refactored

* base64 decode fix for ssh key

* fix for console notification and production deploy settings

* cleanup after deployment tests

* fix - trigger release binaries download for production deploy
2022-02-26 23:33:16 +03:00
Arthur Petukhovsky
c8a1192b53 Optimize WAL storage in safekeeper (#1318)
When several AppendRequest's can be read from socket without blocking,
they are processed together and fsync() to segment file is only called
once. Segment file is no longer opened for every write request, now
last opened file is cached inside PhysicalStorage. New metric for WAL
flushes was added to the storage, FLUSH_WAL_SECONDS. More errors were
added to storage for non-sequential WAL writes, now write_lsn can be
moved only with calls to truncate_lsn(new_lsn).

New messages have been added to ProposerAcceptorMessage enum. They
can't be deserialized directly and now are used only for optimizing
flushes. Existing protocol wasn't changed and flush will be called for
every AppendRequest, as it was before.
2022-02-25 18:52:21 +03:00
bojanserafimov
137d616e76 [proxy] Add pytest fixture (#1311) 2022-02-24 11:20:07 -05:00
Kirill Bulatov
917c640818 Fix mypy for the new Python 2022-02-24 14:24:36 +03:00
anastasia
c1b3836df1 Bump vendor/postgres 2022-02-24 12:52:12 +03:00
Heikki Linnakangas
5120ba4b5f Refactor the interface for using cached page image.
Instead of passing it as a separate argument to get_page_reconstruct_data,
the caller can fill it in the PageReconstructData struct.
2022-02-24 10:37:12 +02:00
Heikki Linnakangas
e4670a5f1e Remove the PageVersions abstraction.
Since commit fdd987c3ad, it was only used in InMemoryLayers. Let's
just "inline" the code into InMemoryLayer itself.

I originally did this as part of a bigger PR (#1267). With that PR,
one in-memory layer, and one ephemeral file, would hold page versions
belonging to multiple segments. Currently, PageVersions can only hold
versions for a single segment, so that would need to be changed.
Rather than modify PageVersions to support that, just remove it
altogether.
2022-02-23 21:04:39 +02:00
Heikki Linnakangas
7fae894648 Move a few unit tests specific to layered file format.
These tests have intimate knowledge of the directory layeout and layer
file names used by the LayeredRepository implementation of the
Repository trait. Move them, so that all the tests that remain in
repository.rs are expected to work without changes with any
implementation of Repository. Not that we have any plans to create
another Repository implementaiton any time soon, but as long as we
have the Repository interface, let's try to maintain that abstraction
in the tests too.
2022-02-23 20:32:06 +02:00
Stas Kelvich
058123f7ef Bump postgres to fix zenith_test_utils linkage on macOS. 2022-02-23 20:33:47 +03:00
anastasia
87edbd38c7 Add 'wait_lsn_timeout' and 'wal_redo_timeout' pageserver config options instead of hardcoded defaults 2022-02-23 19:59:35 +03:00
anastasia
58ee5d005f Add --pageserver-config-override to ZenithEnvBuilder to tune checkpointer and GC in tests.
Usage example:
zenith_env_builder.pageserver_config_override = "checkpoint_period = '100 s'; checkpoint_distance = 1073741824"
2022-02-23 19:59:35 +03:00
Heikki Linnakangas
468366a28f Fix wrong 'lsn' stored in test page image
The test creates a page version with a string like "foo 123 at 0/10"
as the content. But the LSN stored in that string was wrong: the page
version stored at LSN 0/20 would say "foo <blk> at 0/10".
2022-02-23 11:33:17 +02:00
Dhammika Pathirana
b815f5fb9f Add no_sync check in storage
Signed-off-by: Dhammika Pathirana <dhammika@gmail.com>
2022-02-22 12:01:12 -08:00
anastasia
74a0942a77 Fix zenith feedback processing at compute node.
Add test for backpressure
2022-02-22 13:56:21 +03:00
anastasia
1a4682a04a Add 'walreceiver-after-ingest' failpoint. Use sleep at this point to imitate slow walreceiver. 2022-02-22 13:56:21 +03:00
Heikki Linnakangas
993b544ad0 Change default parameters for back pressure
Fixes issue #1238 and #1189. Extracted from PR #1194, with some comment
editorialization by me.

Author: Konstantin Knizhnik <knizhnik@zenith.tech>
2022-02-22 13:56:21 +03:00
Arthur Petukhovsky
dba1d36a4a Refactor WAL utils in safekeeper (#1290)
wal_storage.rs was split up from timeline.rs, safekeeper.rs and send_wal.rs,
and now contains all WAL related code from the safekeeper. Now there are
PhysicalStorage for persisting WAL to disk and WalReader for reading it.
This allows optimizing PhysicalStorage without affecting too much of other
code.

Also there is a separate structure for persisting control file now in
control_file.rs.
2022-02-21 17:20:53 +03:00
Bojan Serafimov
ca81a550ef Fmt 2022-02-21 16:43:28 +03:00
Bojan Serafimov
65a0b2736b Add static router 2022-02-21 16:43:28 +03:00
Bojan Serafimov
cca886682b Undo cplane change 2022-02-21 16:43:28 +03:00
Bojan Serafimov
c8f47cd38e Fix param name 2022-02-21 16:43:28 +03:00
Bojan Serafimov
92787159f7 Add client auth method option 2022-02-21 16:43:28 +03:00
anastasia
abb422d5de Fix SafekeeperMetrics parsing in python tests 2022-02-21 13:45:22 +03:00
bojanserafimov
fdc15de8b2 Add perf test: test_random_writes (#1292) 2022-02-18 15:46:29 -05:00
Stas Kelvich
207286f2b8 Actualize branching parts of openapi spec.
Previous version of spec caused parsing errors in generated clients
as return type is object not array, also one field was missing. In
a passing set `format: hex` on ancestor_id too as value conforms to
that format.
2022-02-18 20:22:21 +02:00
Dhammika Pathirana
d2b896381a Add safekeeper tenant tags in lsn/wal metrics
Signed-off-by: Dhammika Pathirana <dhammika@gmail.com>

Add tenant_id in lsn/wal metrics (#1234)
2022-02-18 08:26:37 -08:00
Dhammika Pathirana
009f6d4ae8 Fix safekeeper metric tags
Signed-off-by: Dhammika Pathirana <dhammika@gmail.com>

Use separate tags in sk storage file histo (#1234)
2022-02-18 08:26:37 -08:00
Kirill Bulatov
1b31379456 Log postgres errors with ERROR level 2022-02-17 13:42:09 +02:00
Bojan Serafimov
4c64b10aec Revert removal of ignore hint 2022-02-17 13:41:49 +02:00
Bojan Serafimov
ad262a46ad Remove redundant pytest_plugins assignment 2022-02-17 13:41:49 +02:00
Kirill Bulatov
ce533835e5 Use uuid.UUID types for tenants and timelines more 2022-02-17 13:41:19 +02:00
Kirill Bulatov
e5bf520b18 Use types in zenith cli invocations in Python tests 2022-02-17 13:41:19 +02:00
Dmitry Rodionov
9512e21b9e fix python formatting 2022-02-17 13:22:14 +03:00
Dmitry Ivanov
a26d565282 [proxy] Replace private static map with a public CancelMap
This is a cleaner approach which might facilitate testing.
2022-02-17 11:54:27 +03:00
Dmitry Ivanov
a47dade622 [proxy] Migrate to async
This change makes most parts of the code asynchronous, except
for the `mgmt` subsystem (we're going to drop it anyway).

Co-authored-by: bojanserafimov <bojan.serafimov7@gmail.com>
2022-02-17 11:54:27 +03:00
Dmitry Rodionov
9cce430430 remove several obsolete management api commands from pageserver's libpq
api

these commands are now available via http api
2022-02-17 11:26:28 +03:00
Dhammika Pathirana
4bf4bacf01 Add cli start/stop test
Signed-off-by: Dhammika Pathirana <dhammika@gmail.com>

Add a test for #1260
2022-02-16 13:19:12 -08:00
bojanserafimov
335abfcc28 Add slow seqscan perf test (#1283) 2022-02-16 10:59:51 -05:00
bojanserafimov
afb3342e46 Add vanilla pg baseline tests (#1275) 2022-02-15 13:44:22 -05:00
Kirill Bulatov
5563ff123f Reuse tenant-timeline id struct from utils 2022-02-15 17:45:23 +02:00
Dhammika Pathirana
0a557b2fa9 Add cli v4 loopback listener ports test
Signed-off-by: Dhammika Pathirana <dhammika@gmail.com>

Add a test for #1247
2022-02-15 17:01:22 +02:00
Heikki Linnakangas
9632c352ab Avoid having multiple records for the same page and LSN.
If a heap UPDATE record modified two pages, and both pages needed to have
their VM bits cleared, and the VM bits were located on the same VM page,
we would emit two ZenithWalRecord::ClearVisibilityMapFlags records for
the same VM page. That produced warnings like this in the pageserver log:

    Page version Wal(ClearVisibilityMapFlags { heap_blkno: 18, flags: 3 }) of rel 1663/13949/2619_vm blk 0 at 2A/346046A0 already exists

To fix, change ClearVisibilityMapFlags so that it can update the bits
for both pages as one operation.

This was already covered by several python tests, so no need to add a
new one. Fixes #1125.

Co-authored-by: Konstantin Knizhnik <knizhnik@zenith.tech>
2022-02-15 14:26:16 +02:00
Arseny Sher
328e3b4189 bump vendor/postgres to fix compiler warnings 2022-02-15 06:51:16 +03:00