437 Commits

Author SHA1 Message Date
Kirill Bulatov
c5007d3916 Remove unused module 2022-06-03 00:23:13 +03:00
Kirill Bulatov
5b06599770 Simplify etcd key regex parsing 2022-06-03 00:23:13 +03:00
Kirill Bulatov
e5cb727572 Replace callmemaybe with etcd subscriptions on safekeeper timeline info 2022-06-01 16:07:04 +03:00
Ryan Russell
54e163ac03 Improve Readability in Docs
Signed-off-by: Ryan Russell <ryanrussell@users.noreply.github.com>
2022-05-31 17:22:47 +03:00
Arthur Petukhovsky
c3e0b6c839 Implement timeline-based metrics in safekeeper (#1823)
Now there's timelines metrics collector, which goes through all timelines and reports metrics only for active ones
2022-05-31 11:10:50 +03:00
Arseny Sher
36281e3b47 Extend test_wal_backup with compute restart. 2022-05-30 13:57:17 +04:00
Heikki Linnakangas
4b4d3073b8 Fix misc typos 2022-05-28 14:56:23 +03:00
Kian-Meng Ang
f1c51a1267 Fix typos 2022-05-28 14:02:05 +03:00
Arseny Sher
cb8bf1beb6 Prevent commit_lsn <= flush_lsn violation after a42eba3cd7.
Nothing complained about that yet, but we definitely don't hold at least one
assert, so let's keep it this way until better version.
2022-05-27 20:23:30 +04:00
Arseny Sher
54b75248ff s3 WAL offloading staging review.
- Uncomment accidently `self.keep_alive.abort()` commented line, due to this
  task never finished, which blocked launcher.
- Mess up with initialization one more time, to fix offloader trying to back up
  segment 0. Now we initialize all required LSNs in handle_elected,
  where we learn start LSN for the first time.
- Fix blind attempt to provide safekeeper service file with remote storage
  params.
2022-05-27 14:02:52 +04:00
Arseny Sher
0e1bd57c53 Add WAL offloading to s3 on safekeepers.
Separate task is launched for each timeline and stopped when timeline doesn't
need offloading. Decision who offloads is done through etcd leader election;
currently there is no pre condition for participating, that's a TODO.

neon_local and tests infrastructure for remote storage in safekeepers added,
along with the test itself.

ref #1009

Co-authored-by: Anton Shyrabokau <ahtoxa@Antons-MacBook-Pro.local>
2022-05-27 06:19:23 +04:00
chaitanya sharma
c584d90bb9 initial commit, renamed znodeid to nodeid. 2022-05-25 20:11:26 +03:00
KlimentSerafimov
8346aa3a29 Potential fix to #1626. Fixed typo is Makefile. (#1781)
* Potential fix to #1626. Fixed typo is Makefile.
* Completed fix to #1626.

Summary:
changed 'error' to 'bail' in start_pageserver and start_safekeeper.
2022-05-24 04:55:38 -04:00
chaitanya sharma
fbedd535c0 Replace a bunch of zenith references with neon. 2022-05-23 13:16:00 +03:00
Egor Suvorov
432907ff5f Safekeeper: avoid holding mutex when deleting a tenant (#1746)
Following discussion with @arssher after #1653
2022-05-18 23:02:17 +03:00
Arthur Petukhovsky
134eeeb096 Add more common storage metrics (#1722)
- Enabled process exporter for storage services
- Changed zenith_proxy prefix to just proxy
- Removed old `monitoring` directory
- Removed common prefix for metrics, now our common metrics have `libmetrics_` prefix, for example `libmetrics_serve_metrics_count`
- Added `test_metrics_normal_work`
2022-05-17 19:29:01 +03:00
Kirill Bulatov
a884f4cf6b Add etcd to neon_local 2022-05-17 01:17:44 +03:00
Kirill Bulatov
9a0fed0880 Enable at least 1 safekeeper in every test 2022-05-17 01:17:44 +03:00
Egor Suvorov
bf899a57d9 Safekeeper: add timeline/tenant force delete HTTP endpoings (closes #895)
* There is no auth in Safekeeper HTTP at all currently,
  so simply calling `check_permission` is not enough.
* There are no checks of Safekeeper still working with the data,
  as "still working" is burry now: a timeline may be "active"
  while there are no compute nodes and all data is propagated.
* Still, callmemaybe is deactivated, and timeline is removed from the
  internal map. It can easily sneak back in case of race conditions
  and implicit creations, though.
2022-05-13 15:43:52 +02:00
Egor Suvorov
07b85e7cfc Safekeeper refactor: move callmemaybe_tx from SafekeeperPostgresBackend to Timeline 2022-05-13 15:43:52 +02:00
Kirill Bulatov
b683308791 Return GIT_VERSION back to storage binaries 2022-05-13 16:34:32 +03:00
Kirill Bulatov
51c0f9ab2b Force git version to be up to date via decl macro 2022-05-13 16:34:32 +03:00
Kirill Bulatov
4538f1e1b8 Correctly operate etcd safekeeper timeline data 2022-05-12 18:47:31 +03:00
Arseny Sher
b338b5dffe Make callmemaybe less agressive until we fix it/migrate to bigger machines. 2022-05-11 22:16:13 +04:00
Arseny Sher
6cb14b4200 Optionally remove WAL on safekeepers without s3 offloading.
And do that on staging, until offloading is merged.
2022-05-10 22:41:02 +04:00
Kirill Bulatov
de37f982db Share the remote storage as a crate 2022-05-07 00:30:36 +03:00
Kirill Bulatov
d4e155aaa3 Librarify common etcd timeline logic 2022-05-06 22:32:57 +03:00
Arseny Sher
c46fe90010 Fix division by zero in WAL removal. 2022-05-05 10:41:43 +04:00
bojanserafimov
bc569dde51 Remove some unwraps from waldecoder (#1539) 2022-05-04 17:41:05 -04:00
Arseny Sher
b68e3b03ed Fix control file update for b9fd8a36ad 2022-05-04 17:11:22 +04:00
Arseny Sher
b9fd8a36ad Remember timeline_start_lsn and local_start_lsn on safekeeper.
Make it remember when timeline starts in general and on this safekeeper in
particular (the point might be later on new safekeeper replacing failed one).

Bumps control file and walproposer protocol versions.

While protocol is bumped, also add safekeeper node id to
AcceptorProposerGreeting.

ref #1561
2022-05-04 14:32:03 +04:00
chaitanya sharma
76388abeb6 Rename READMEs with .md extension, and fix links to them.
Commit edba2e97 renamed pageserver/README to pageserver/README.md, but
forgot to update links to it. Fix.

Rename libs/postgres_ffi/README and safekeeper/README files to also
have the the .md extension, so that github can render them nicely.

Quote ascii-diagram in safekeeper/README.md so that it renders
correctly.
2022-04-29 14:23:42 +03:00
Arseny Sher
8b9d523f3c Remove old WAL on safekeepers.
Remove when it is consumed by all of 1) pageserver (remote_consistent_lsn) 2)
safekeeper peers 3) s3 WAL offloading.

In test s3 offloading for now is mocked by directly bumping s3_wal_lsn.

ref #1403
2022-04-26 23:02:23 +04:00
Dmitry Ivanov
d3f356e7a8 Update rust-postgres project-wide (#1525)
* Update `rust-postgres` project-wide

This commit points to https://github.com/neondatabase/rust-postgres/commits/neon
in order to test our patches on top of the latest version of this crate.

* [proxy] Update `hmac` and `sha2`
2022-04-22 17:31:58 +03:00
Kirill Bulatov
81cad6277a Move and library crates into a dedicated directory and rename them 2022-04-21 13:30:33 +03:00
Kirill Bulatov
52e0816fa5 wal_acceptor -> safekeeper 2022-04-18 12:52:31 +03:00
Kirill Bulatov
81417788c8 walkeeper -> safekeeper 2022-04-18 12:52:31 +03:00