Commit Graph

154 Commits

Author SHA1 Message Date
Arthur Petukhovsky
6a00ad3aab Validate logs 2023-09-18 17:12:40 +00:00
Arthur Petukhovsky
44c7d96ed0 Cleanup resources better 2023-09-16 23:10:53 +00:00
Arthur Petukhovsky
eb2886b401 Add test for 1000 WAL messages 2023-09-16 14:58:26 +00:00
Arthur Petukhovsky
d801ba7248 Print network config 2023-08-29 14:10:57 +00:00
Arthur Petukhovsky
2fd351fd63 Hide debug logs 2023-08-29 10:05:18 +00:00
Arthur Petukhovsky
41b9750e81 Run many schedules 2023-08-24 23:42:11 +00:00
Arthur Petukhovsky
f8729f046d Fix excessive logs 2023-08-24 17:25:44 +00:00
Arthur Petukhovsky
420d3bc18f Add simulation schedule 2023-08-24 15:24:38 +00:00
Arthur Petukhovsky
7de94c959a Support walproposer recovery 2023-08-22 23:15:46 +00:00
Arthur Petukhovsky
413ce2cfe8 Crash safekeepers 2023-08-17 10:36:23 +00:00
Arthur Petukhovsky
89bd7ab8a3 Fix read/write in walproposer 2023-07-28 15:14:24 +00:00
Arthur Petukhovsky
5034a8cca0 WIP 2023-07-26 22:51:19 +02:00
Arthur Petukhovsky
d87e822169 Return LSN from sync safekeepers 2023-07-24 21:15:35 +00:00
Arthur Petukhovsky
aed14f52d5 Test sync safekeepers 2023-06-03 19:11:28 +00:00
Arthur Petukhovsky
909d7fadb8 Implement simlib sk server 2023-06-02 14:49:55 +00:00
Arthur Petukhovsky
3840d6b18b Clean up C API 2023-06-01 09:38:07 +00:00
Arthur Petukhovsky
0d4f987fc8 Implement full simlib C API 2023-05-31 20:25:25 +00:00
Arthur Petukhovsky
aa0763d49d Run simulator on C code 2023-05-31 16:55:16 +00:00
Arthur Petukhovsky
b55005d2c4 Build simple C func example 2023-05-26 14:44:48 +03:00
Arthur Petukhovsky
6436432a77 Showcase network failures 2023-05-25 12:53:20 +03:00
Arthur Petukhovsky
1b8918e665 Add accept, close and delays to the network 2023-05-25 12:26:57 +03:00
Arthur Petukhovsky
87c9edac7c Add basic support for network delays 2023-05-24 20:28:53 +03:00
Arthur Petukhovsky
5e0550a620 Add os.sleep and os.random 2023-05-24 15:51:30 +03:00
Arthur Petukhovsky
06f493f525 Extract simlib 2023-05-24 13:06:42 +03:00
Arthur Petukhovsky
f6b540ebfe Add initial support for virtual time 2023-05-22 15:00:56 +03:00
Arthur Petukhovsky
83f87af02b Remove sync debug 2023-03-10 00:10:09 +02:00
Arthur Petukhovsky
79823c38cd It looks deterministic now 2023-03-10 00:03:35 +02:00
Arthur Petukhovsky
072fb3d7e9 WIP 2023-03-09 14:59:03 +02:00
Arthur Petukhovsky
f2fb9f6be9 WIP 2023-03-09 14:51:29 +02:00
Arthur Petukhovsky
dd4c8fb568 WIP 2023-03-09 00:51:14 +02:00
Arthur Petukhovsky
9116c01614 WIP 2023-03-08 18:45:13 +02:00
Arthur Petukhovsky
17cd96e022 WIP 2023-03-03 20:33:55 +00:00
Arthur Petukhovsky
b23742e09c Create /v1/debug_dump safekeepers endpoint (#3710)
Add HTTP endpoint to get full safekeeper state of all existing timelines
(all in-memory values and info about all files stored on disk).

Example:
https://gist.github.com/petuhovskiy/3cbb8f870401e9f486731d145161c286
2023-03-03 14:01:05 +03:00
Joonas Koivunen
d7d3f451f0 Use tracing panic hook in all binaries (#3634)
Enables the tracing panic hook, introduced for pageserver in #3475, in
the remaining binaries:

- proxy
- safekeeper
- storage_broker

For proxy, a drop guard which resets the original std panic hook was
added on the first commit. Other binaries don't need it so they never
reset anything by `disarm`ing the drop guard.

The aim of the change is to make sure all panics a) have span
information b) are logged similarly to other messages, not interleaved
with other messages as happens right now. Interleaving happens right now
because std prints panics to stderr, and other logging happens in
stdout. If this was handled gracefully by some utility, the log message
splitter would treat panics as belonging to the previous message because
it expects a message to start with a timestamp.
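
A minimal sketch of the hook-plus-drop-guard idea described above, using only
the standard library. The commit itself logs through `tracing`; here an
in-memory `LOG` stands in for the log sink, and `PanicHookGuard`,
`install_logging_panic_hook` and the `ERROR panic:` prefix are hypothetical
names chosen for illustration, not the actual implementation.

```rust
use std::panic;
use std::sync::Mutex;

// Stand-in for a real log sink (assumption: the real code emits a tracing event).
static LOG: Mutex<Vec<String>> = Mutex::new(Vec::new());

/// Hypothetical guard: restores the default std hook on drop unless disarmed.
struct PanicHookGuard {
    armed: bool,
}

impl PanicHookGuard {
    /// Keep the custom hook installed for the rest of the process lifetime.
    #[allow(dead_code)]
    fn disarm(mut self) {
        self.armed = false;
    }
}

impl Drop for PanicHookGuard {
    fn drop(&mut self) {
        // `take_hook` must not be called from a panicking thread.
        if self.armed && !std::thread::panicking() {
            // `take_hook` removes the custom hook and re-registers the default one.
            drop(panic::take_hook());
        }
    }
}

fn install_logging_panic_hook() -> PanicHookGuard {
    panic::set_hook(Box::new(|info| {
        // Emit one single-line record per panic, so a line-oriented log
        // splitter sees it as an ordinary message.
        let line = format!("ERROR panic: {}", info).replace('\n', " ");
        if let Ok(mut log) = LOG.lock() {
            log.push(line);
        }
    }));
    PanicHookGuard { armed: true }
}

fn main() {
    let guard = install_logging_panic_hook();
    let result = panic::catch_unwind(|| {
        panic!("boom");
    });
    assert!(result.is_err());
    drop(guard); // the default hook is back from here on

    for line in LOG.lock().unwrap_or_else(|e| e.into_inner()).iter() {
        println!("{line}");
    }
}
```

A binary that never wants the original hook back would call `guard.disarm()`
instead of dropping the guard.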

Cc: #3468
2023-02-21 10:03:55 +02:00
Vadim Kharitonov
bc4f594ed6 Fix Sentry Version 2023-01-25 12:07:38 +01:00
Arseny Sher
84ffdc8b4f Don't keep FDs open on cancelled timelines in safekeepers.
Since PR #3300 we don't remove timelines completely until the next restart, so
this prevents FD leakage.

fixes https://github.com/neondatabase/neon/issues/3336
2023-01-16 19:03:38 +04:00
Kirill Bulatov
bce4233d3a Rework Cargo.toml dependencies (#3322)
* Use workspace variables from cargo, coming with rustc
[1.64](https://github.com/rust-lang/rust/blob/master/RELEASES.md#version-1640-2022-09-22)

See
https://doc.rust-lang.org/nightly/cargo/reference/workspaces.html#the-package-table
and
https://doc.rust-lang.org/nightly/cargo/reference/workspaces.html#the-dependencies-table
sections.

Now, all dependencies in all non-root `Cargo.toml` files are defined as 
```
clap.workspace = true
```

and sometimes, when extra features are needed, as
```
bytes = {workspace = true, features = ['serde'] }
```

The actual declarations (with shared features and version
numbers/file paths/etc.) live in the root Cargo.toml.
Features are additive:

https://doc.rust-lang.org/nightly/cargo/reference/specifying-dependencies.html#inheriting-a-dependency-from-a-workspace

* Uses the mechanism above to set a common edition (2021) and license across the
workspace

* Mechanically bumps a few dependencies

* Updates hakari format, as it suggested:
```
work/neon/neon kb/cargo-templated ❯ cargo hakari generate
info: no changes detected
info: new hakari format version available: 3 (current: 2)
(add or update `dep-format-version = "3"` in hakari.toml, then run `cargo hakari generate && cargo hakari manage-deps`)
```
2023-01-13 18:13:34 +02:00
Arthur Petukhovsky
f49e923d87 Keep deleted timelines in memory of safekeeper (#3300)
A temporary fix for https://github.com/neondatabase/neon/issues/3146,
until we come up with a reliable way to create and delete timelines in
all safekeepers.
2023-01-12 15:33:07 +03:00
Kirill Bulatov
10dae79c6d Tone down safekeeper and pageserver walreceiver errors (#3227)
Closes https://github.com/neondatabase/neon/issues/3114

Adds more specific types for errors that appear while handling protocol messages (`FeMessage`) and in postgres and walreceiver connections.

Socket IO errors are now detected more reliably and logged at a lower level (INFO, DEBUG), without the traces they carried before, when they were wrapped in anyhow context.
2023-01-03 20:42:04 +00:00
Vadim Kharitonov
0b428f7c41 Enable licenses check for 3rd-parties 2023-01-03 15:11:50 +01:00
Egor Suvorov
cb61944982 Safekeeper: refactor auth validation
* Load public auth key on startup and store it in the config.
* Get rid of a separate `auth` parameter which was passed all over the place.
2022-12-31 02:27:08 +03:00
Arseny Sher
f6bf7b2003 Add tenant_id to safekeeper spans.
Now that it's hard to map a timeline id to a project in the console, this should
help a little.
2022-12-27 20:19:12 +03:00
Arseny Sher
fee8bf3a17 Remove global_commit_lsn.
It is complicated and fragile to maintain and not really needed; update
commit_lsn locally only when we have enough WAL flushed.

ref https://github.com/neondatabase/neon/issues/3069
2022-12-27 20:19:12 +03:00
Arseny Sher
1ad6e186bc Refuse ProposerElected if it is going to truncate correct WAL.
Prevents commit_lsn monotonicity violation (otherwise harmless).

closes https://github.com/neondatabase/neon/issues/3069
2022-12-27 20:19:12 +03:00
Kirill Bulatov
fca25edae8 Fix 1.66 Clippy warnings (#3178)
The 1.66 release speeds up compile times by over 10% according to tests.

Its Clippy also finds plenty of old nits in our code:
* useless conversion, `foo as u8` where `foo: u8` and similar, removed
`as u8` and similar
* useless references and dereferences (that were automatically adjusted
by the compiler), removed various `&` and `*`
* bool -> u8 conversion via `if/else`, changed to `u8::from`
* Map `.iter()` calls where only values were used, changed to
`.values()` instead
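
Two of the fixes above, sketched on made-up helpers. `flag_to_byte` and
`total_size` are hypothetical names for illustration and are not taken from
the commit.

```rust
use std::collections::HashMap;

/// bool -> u8 via `u8::from`, instead of `if flag { 1 } else { 0 }`.
fn flag_to_byte(flag: bool) -> u8 {
    u8::from(flag)
}

/// Iterate `.values()` instead of `.iter()` when the keys are unused.
fn total_size(sizes: &HashMap<String, u64>) -> u64 {
    sizes.values().sum()
}

fn main() {
    let mut sizes = HashMap::new();
    sizes.insert("a".to_string(), 3);
    sizes.insert("b".to_string(), 4);
    println!("{} {}", flag_to_byte(true), total_size(&sizes));
}
```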

Lints that stand out:
* `Eq` is missing in our protoc generated structs. Silenced, does not
seem crucial for us.
* `fn default` looks like the one from `Default` trait, so I've
implemented that instead and replaced the `dummy_*` method in tests with
`::default()` invocation
* Clippy detected that
```
if retry_attempt < u32::MAX {
    retry_attempt += 1;
}
```
is a saturating add and proposed to replace it.
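
For reference, the replacement Clippy suggests presumably looks like the
sketch below. `next_attempt` is a hypothetical wrapper added for illustration;
`u32::saturating_add` is the standard library method.

```rust
/// Bump a retry counter, clamping at `u32::MAX` instead of overflowing.
fn next_attempt(retry_attempt: u32) -> u32 {
    retry_attempt.saturating_add(1)
}

fn main() {
    println!("{}", next_attempt(0));
    println!("{}", next_attempt(u32::MAX) == u32::MAX);
}
```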
2022-12-22 14:27:48 +02:00
Kirill Bulatov
3735aece56 Safekeeper: Always use workdir as a full path 2022-12-19 21:43:36 +02:00
Dmitry Ivanov
61194ab2f4 Update rust-postgres everywhere
I've rebased[1] Neon's fork of rust-postgres to incorporate
latest upstream changes (including dependabot's fixes),
so we need to advance revs here as well.

[1] https://github.com/neondatabase/rust-postgres/commits/neon
2022-12-17 00:26:10 +03:00
Dmitry Ivanov
83baf49487 [proxy] Forward compute connection params to client
This fixes all kinds of problems related to missing params,
like broken timestamps (due to `integer_datetimes`).

This solution is not ideal, but it will help. Meanwhile,
I'm going to dedicate some time to improving connection machinery.

Note that this **does not** fix problems with passing certain parameters
in a reverse direction, i.e. **from client to compute**. This is a
separate matter and will be dealt with in an upcoming PR.
2022-12-16 21:37:50 +03:00
Arseny Sher
e14bbb889a Enable broker client keepalives. (#3127)
Should fix stale connections.

ref https://github.com/neondatabase/neon/issues/3108
2022-12-16 11:55:12 +02:00
Kirill Bulatov
02c1c351dc Create initial timeline without remote storage (#3077)
Removes the race during pageserver initial timeline creation that led to partial layer uploads.
This race is only reproducible in test code, since we do not create initial timelines in cloud (yet, at least), but it is still nice to remove the non-deterministic behavior.
2022-12-13 15:42:59 +02:00