Commit Graph

1550 Commits

Author SHA1 Message Date
Kirill Bulatov
de37f982db Share the remote storage as a crate 2022-05-07 00:30:36 +03:00
Kirill Bulatov
d4e155aaa3 Librarify common etcd timeline logic 2022-05-06 22:32:57 +03:00
Arseny Sher
dd6dca9072 Bump vendor/postgres to shut down on wrong basebackup. 2022-05-06 20:07:26 +04:00
bojanserafimov
ef40e404cf Rename zenith crate to neon_local (#1625) 2022-05-05 19:06:53 -04:00
Sergey Melnikov
11a44eda0e Add TLS support in scram-proxy (#1643)
* Add TLS support in scram-proxy

* Fix authEndpoint
2022-05-05 23:48:16 +03:00
Heikki Linnakangas
30a7598172 Some copy-editing. 2022-05-05 22:35:15 +03:00
Heikki Linnakangas
1ad5658d9c Fix typos 2022-05-05 22:35:15 +03:00
Dmitry Rodionov
954859f6c5 add readme for performance tests with the current state of things 2022-05-05 22:35:15 +03:00
Andrey Taranik
4024bfe736 get_binaries script fix (#1638)
* get_binaries script fix

* minor improvment for get_binaries
2022-05-05 22:21:07 +03:00
Kirill Bulatov
2ef0e5c6ed Do not require metadata in every upload sync task 2022-05-05 18:26:39 +03:00
Kirill Bulatov
52a7e3155e Add local path to the Layer trait and historic layers 2022-05-05 18:26:39 +03:00
Thang Pham
ad5eaa6027 Use node's LSN for read-only nodes (#1642)
Fixes #1410.
2022-05-05 10:53:10 -04:00
Dmitry Rodionov
0f3ec83172 avoid detach with alive branches 2022-05-05 12:54:42 +03:00
Arseny Sher
c46fe90010 Fix division by zero in WAL removal. 2022-05-05 10:41:43 +04:00
bojanserafimov
bc569dde51 Remove some unwraps from waldecoder (#1539) 2022-05-04 17:41:05 -04:00
bojanserafimov
02e5083695 Add hot page test (#1479) 2022-05-04 12:45:01 -04:00
Thang Pham
c4bc604e5f Fix pg list table alignment #1633
Fixes #1628

- add [`comfy_table`](https://github.com/Nukesor/comfy-table/tree/main) and use it to construct table for `pg list` CLI command

Comparison

- Old:

```
NODE	ADDRESS	TIMELINE	BRANCH NAME	LSN		STATUS
main	127.0.0.1:55432	3823dd05e35d71f6ccf33049de366d70	main	0/16FB140	running
migration_check	127.0.0.1:55433	3823dd05e35d71f6ccf33049de366d70	main	0/16FB140	running
```

- New:

```
 NODE             ADDRESS          TIMELINE                          BRANCH NAME  LSN        STATUS
 main             127.0.0.1:55432  3823dd05e35d71f6ccf33049de366d70  main         0/16FB140  running
 migration_check  127.0.0.1:55433  3823dd05e35d71f6ccf33049de366d70  main         0/16FB140  running
```
2022-05-04 12:12:26 -04:00
Anastasia Lubennikova
b8880bfaab Bump vendor/postgres 2022-05-04 18:14:45 +03:00
Anastasia Lubennikova
e2cf77441d Implement pg_database_size().
In this implementation dbsize equals sum of all relation sizes, excluding shared ones.
2022-05-04 18:14:45 +03:00
Arseny Sher
b68e3b03ed Fix control file update for b9fd8a36ad 2022-05-04 17:11:22 +04:00
Arseny Sher
e58c83870f Bump vendor/postgres to to send timeline_start_lsn. 2022-05-04 14:32:03 +04:00
Arseny Sher
b9fd8a36ad Remember timeline_start_lsn and local_start_lsn on safekeeper.
Make it remember when timeline starts in general and on this safekeeper in
particular (the point might be later on new safekeeper replacing failed one).

Bumps control file and walproposer protocol versions.

While protocol is bumped, also add safekeeper node id to
AcceptorProposerGreeting.

ref #1561
2022-05-04 14:32:03 +04:00
Heikki Linnakangas
748c5a577b Bump vendor/postgres. (#1616)
Includes fix for https://github.com/neondatabase/neon/issues/1615
2022-05-04 10:54:44 +03:00
Stas Kelvich
51a0f2683b fix scram-proxy addresses 2022-05-04 01:35:30 +03:00
Dmitry Rodionov
9dfa145c7c tone down tenant not found error 2022-05-04 00:47:52 +03:00
Stas Kelvich
5642d0b2b8 Change shutdown_process_on_error thread spawn settings.
Now princeple is following: acceptor threads (libpq and http) error will
bring the pageserver down, but all per-tenant thread failures will be treated
as an error.
2022-05-04 00:42:57 +03:00
Dmitry Rodionov
2f83f793bc print more details when thread fails 2022-05-03 18:31:23 +03:00
Anastasia Lubennikova
2f9b17b9e5 Add simple test of pageserver recovery after crash. To cause a crash, use failpoints in checkpointer 2022-05-03 17:13:09 +03:00
Dmitry Rodionov
e7cba0b607 use thiserror instead of anyhow in disk_btree 2022-05-03 15:34:23 +03:00
Dmitry Rodionov
ff7e9a86c6 turn panic into an error with more details 2022-05-03 12:44:42 +03:00
Heikki Linnakangas
9ede38b6c4 Support finding LSN from a commit timestamp.
A new `get_lsn_by_timestamp` command is added to the libpq page service
API.

An extra timestamp field is now stored in an extra field after each
Clog page. It is the timestamp of the latest commit, among all the
transactions on the Clog page. To find the overall latest commit, we
need to scan all Clog pages, but this isn't a very frequent operation
so that's not too bad.

To find the LSN that corresponds to a timestamp, we perform a binary
search. The binary search starts with min = last LSN when GC ran, and
max = latest LSN on the timeline. On each iteration of the search we
check if there are any commits with a higher-than-requested timestamp
at that LSN.

Implements github issue 1361.
2022-05-03 09:28:57 +03:00
Heikki Linnakangas
62449d6068 Bump vendor/postgres (#1573)
This brings us the performance improvements to WAL redo from
https://github.com/neondatabase/postgres/pull/144
2022-05-03 09:25:12 +03:00
Konstantin Knizhnik
baa59512b8 Traverse frozen layer in get_reconstruct_data in reverse order (#1601)
* Traverse frozen layer in get_reconstruct_data in reverse order

* Fix comments on frozen layers.

Note explicitly the order that the layers are in the queue.

* Add fail point to reproduce failpoint iteration error

Co-authored-by: Heikki Linnakangas <heikki@neon.tech>
2022-05-03 08:07:14 +03:00
Heikki Linnakangas
87a6c4d051 RFC on connection routing and authentication.
This documents how we want this to work. We're not quite there yet.
2022-05-02 23:39:06 +03:00
Stas Kelvich
801b749e1d Set correct authEndpoint for the new proxy 2022-05-02 21:46:32 +03:00
Kirill Bulatov
5cb501c2b3 Make remote storage test less flacky 2022-05-02 20:04:48 +03:00
Dmitry Rodionov
ad25736f3a Exit pageserver process with correct error code
When we shutdown pageserver due to an error (e g one of th important
thrads panicked) use 1 exit code so systemd can properly restart it
2022-05-02 19:04:45 +03:00
Stas Kelvich
9a396e1feb Support SNI-based routing in proxy 2022-05-02 18:32:18 +03:00
Stas Kelvich
0323bb5870 [proxy] Refactor cplane API and add new console SCRAM auth API
Now proxy binary accepts `--auth-backend` CLI option, which determines
auth scheme and cluster routing method. Following backends are currently
implemented:

* legacy
    old method, when username ends with `@zenith` it uses md5 auth dbname as
    the cluster name; otherwise, it sends a login link and waits for the console
    to call back
* console
    new SCRAM-based console API; uses SNI info to select the destination
    cluster
* postgres
    uses postgres to select auth secrets of existing roles. Useful for local
    testing
* link
    sends login link for all usernames
2022-05-02 18:32:18 +03:00
Dmitry Ivanov
af0195b604 [proxy] Introduce cloud::Api for communication with Neon Cloud
* `cloud::legacy` talks to Cloud API V1.
* `cloud::api` defines Cloud API v2.
* `cloud::local` mocks the Cloud API V2 using a local postgres instance.
* It's possible to choose between API versions using the `--api-version` flag.
2022-05-02 18:32:18 +03:00
Dmitry Ivanov
9df8915b03 [proxy] sasl::Mechanism may return Output during exchange
This is needed to forward the `ClientKey` that's required
to connect the proxy to a compute.

Co-authored-by: bojanserafimov <bojan.serafimov7@gmail.com>
2022-05-02 18:32:18 +03:00
Dmitry Ivanov
4b1bd32e4a Drop Debug impl for ScramKey and ServerSecret
There's a notion that accidental misuse of those implementations
might reveal authentication secrets.
2022-05-02 18:32:18 +03:00
Andrey Taranik
68ba6a58a0 authEndpoint fix 2022-05-02 17:55:13 +03:00
Andrey Taranik
8f479a712f minor fixes in proxy deployment 2022-05-02 17:55:13 +03:00
Stas Kelvich
2477d2f9e2 Deploy standalone SRAM proxy on staging 2022-05-02 17:55:13 +03:00
Dhammika Pathirana
992874c916 Fix update ps settings doc
Signed-off-by: Dhammika Pathirana <dhammika@gmail.com>
2022-05-01 13:52:08 -07:00
Dhammika Pathirana
3128e8c75c Fix tenant conf test
Signed-off-by: Dhammika Pathirana <dhammika@gmail.com>
2022-05-01 13:13:25 -07:00
Dhammika Pathirana
f3f12db2cb Add gc churn threshold knob (#1594)
Signed-off-by: Dhammika Pathirana <dhammika@gmail.com>
2022-05-01 13:13:17 -07:00
Andrey Taranik
038ea4c128 proxy notice message update (#1600) 2022-04-30 22:04:08 +03:00
Kirill Bulatov
7e1db8c8a1 Show which virtual file got the deserialization errors 2022-04-29 21:40:57 +03:00