Commit Graph

736 Commits

Author SHA1 Message Date
Heikki Linnakangas
4a36d89247 Avoid spawning a layer-flush thread when there's no work to do.
The check_checkpoint_distance() always spawned a new thread, even if
there is no frozen layer to flush. That was a thinko, as @knizhnik
pointed out.
2022-05-19 00:51:48 +03:00
Arthur Petukhovsky
134eeeb096 Add more common storage metrics (#1722)
- Enabled process exporter for storage services
- Changed zenith_proxy prefix to just proxy
- Removed old `monitoring` directory
- Removed common prefix for metrics, now our common metrics have `libmetrics_` prefix, for example `libmetrics_serve_metrics_count`
- Added `test_metrics_normal_work`
2022-05-17 19:29:01 +03:00
Heikki Linnakangas
55ea3f262e Fix race condition leading to panic in remote storage sync thread.
The SyncQueue consisted of a tokio mpsc channel, and an atomic counter
to keep track of how many items there are in the channel. Updating the
atomic counter was racy, and sometimes the consumer would decrement
the counter before the producer had incremented it, leading to integer
wraparound to usize::MAX. Calling Vec::with_capacity(usize::MAX) leads
to a panic.

To fix, replace the channel with a VecDeque protected by a Mutex, and
a condition variable for signaling. Now that the queue is now
protected by standard blocking Mutex and Condvar, refactor the
functions touching it to be sync, not async.

A theoretical downside of this is that the calls to push items to the
queue and the storage sync thread that drains the queue might now need
to wait, if another thread is busy manipulating the queue. I believe
that's OK; the lock isn't held for very long, and these operations are
made in background threads, not in the hot GetPage@LSN path, so
they're not very latency-sensitive.

Fixes #1719. Also add a test case.
2022-05-17 18:14:57 +03:00
Kirill Bulatov
a884f4cf6b Add etcd to neon_local 2022-05-17 01:17:44 +03:00
Kirill Bulatov
9a0fed0880 Enable at least 1 safekeeper in every test 2022-05-17 01:17:44 +03:00
chaitanya sharma
85b5c0e989 List profiling as a feature with 'pageserver --enabled-features'
Fixes https://github.com/neondatabase/neon/issues/1627
2022-05-16 21:10:57 +03:00
Thang Pham
e4a70faa08 Add more information to timeline-related APIs (#1673)
Resolves #1488.

- implemented `GET tenant/:tenant_id/timeline/:timeline_id/wal_receiver` endpoint
- returned `thread_id` in `thread_mgr::spawn` 
- added `latest_gc_cutoff_lsn` field to `LocalTimelineInfo` struct
2022-05-16 11:05:43 -04:00
Heikki Linnakangas
51ea9c3053 Don't swallow panics when the pageserver is build with failpoints.
It's very confusing, and because you don't get a stack trace and error
message in the logs, makes debugging very hard. However, the
'test_pageserver_recovery' test relied on that behavior. To support that,
add a new "exit" action to the pageserver 'failpoints' command, so that
you can explicitly request to exit the process when a failpoint is hit.
2022-05-16 09:58:58 +03:00
Heikki Linnakangas
a10cac980f Continue with pageserver startup, if loading some tenants fail.
Fixes https://github.com/neondatabase/neon/issues/1664
2022-05-15 00:25:38 +03:00
Anastasia Lubennikova
a2561f0a78 Use tenant's pitr_interval instead of hardroded 0 in the command.
Adjust python tests that use the
2022-05-13 18:32:14 +03:00
Anastasia Lubennikova
aa7c601eca Fix pitr_interval check in GC:
Use timestamp->LSN mapping instead of file modification time.
Fix 'latest_gc_cutoff_lsn' - set it to the minimum of pitr_cutoff and gc_cutoff.
Add new test: test_pitr_gc
2022-05-13 18:32:14 +03:00
Kirill Bulatov
b683308791 Return GIT_VERSION back to storage binaries 2022-05-13 16:34:32 +03:00
Kirill Bulatov
51c0f9ab2b Force git version to be up to date via decl macro 2022-05-13 16:34:32 +03:00
Arthur Petukhovsky
ec8861b8cc Fix pageserver metrics names (#1682)
Try to follow Prometheus style-guide https://prometheus.io/docs/practices/naming/ for metrics names. More specifically:
- Use `pageserver_` prefix for all pagserver metrics
- Specify `_seconds` unit in time metrics
- Use unit as a suffix in other cases, such as `_hits`, `_bytes`, `_records`
- Use `_total` suffix for accumulating counters (note that Histograms append that suffix internally)
2022-05-12 19:53:07 +03:00
Heikki Linnakangas
5da4f3a4df Refactor DeltaLayer::dump() function
Put most of the code in a closure that returns Result, so that we can
use the ?-operator for error handling. That's simpler.
2022-05-12 10:31:04 +03:00
Konstantin Knizhnik
2bde77fced Do not apply records with LSN smaller than LSN of cached image in del… (#1672)
* Do not apply records with LSN smaller than LSN of cached image in delta layer

* Do not apply records with LSN smaller than LSN of cached image in delta layer
2022-05-12 07:56:02 +03:00
Dhammika Pathirana
c864091035 Fix err msg typo
Signed-off-by: Dhammika Pathirana <dham@neon.tech>
2022-05-11 16:13:26 -07:00
Konstantin Knizhnik
e6e883eb12 Do not set LSN for new FPI page (#1657)
* Do not set LSN for new FPI page

refer #1656

* Add page_is_new, page_get_lsn, page_set_lsn functions

* Fix page_is_new implementation

* Add comment from XLogReadBufferForRedoExtended
2022-05-11 15:23:17 +03:00
Thang Pham
87dfa99734 Update layered_repository REAMDE (#1659) 2022-05-10 09:55:14 -04:00
Kirill Bulatov
0a7735a656 Rework remote storage sync queue, general refactoring 2022-05-07 01:33:33 +03:00
Kirill Bulatov
64a602b8f3 Delete timeline layers 2022-05-07 01:33:33 +03:00
Kirill Bulatov
10e4da3997 Rework timeline batching 2022-05-07 01:33:33 +03:00
Kirill Bulatov
de37f982db Share the remote storage as a crate 2022-05-07 00:30:36 +03:00
Kirill Bulatov
2ef0e5c6ed Do not require metadata in every upload sync task 2022-05-05 18:26:39 +03:00
Kirill Bulatov
52a7e3155e Add local path to the Layer trait and historic layers 2022-05-05 18:26:39 +03:00
Dmitry Rodionov
0f3ec83172 avoid detach with alive branches 2022-05-05 12:54:42 +03:00
bojanserafimov
bc569dde51 Remove some unwraps from waldecoder (#1539) 2022-05-04 17:41:05 -04:00
Anastasia Lubennikova
e2cf77441d Implement pg_database_size().
In this implementation dbsize equals sum of all relation sizes, excluding shared ones.
2022-05-04 18:14:45 +03:00
Stas Kelvich
5642d0b2b8 Change shutdown_process_on_error thread spawn settings.
Now princeple is following: acceptor threads (libpq and http) error will
bring the pageserver down, but all per-tenant thread failures will be treated
as an error.
2022-05-04 00:42:57 +03:00
Dmitry Rodionov
2f83f793bc print more details when thread fails 2022-05-03 18:31:23 +03:00
Anastasia Lubennikova
2f9b17b9e5 Add simple test of pageserver recovery after crash. To cause a crash, use failpoints in checkpointer 2022-05-03 17:13:09 +03:00
Dmitry Rodionov
e7cba0b607 use thiserror instead of anyhow in disk_btree 2022-05-03 15:34:23 +03:00
Dmitry Rodionov
ff7e9a86c6 turn panic into an error with more details 2022-05-03 12:44:42 +03:00
Heikki Linnakangas
9ede38b6c4 Support finding LSN from a commit timestamp.
A new `get_lsn_by_timestamp` command is added to the libpq page service
API.

An extra timestamp field is now stored in an extra field after each
Clog page. It is the timestamp of the latest commit, among all the
transactions on the Clog page. To find the overall latest commit, we
need to scan all Clog pages, but this isn't a very frequent operation
so that's not too bad.

To find the LSN that corresponds to a timestamp, we perform a binary
search. The binary search starts with min = last LSN when GC ran, and
max = latest LSN on the timeline. On each iteration of the search we
check if there are any commits with a higher-than-requested timestamp
at that LSN.

Implements github issue 1361.
2022-05-03 09:28:57 +03:00
Konstantin Knizhnik
baa59512b8 Traverse frozen layer in get_reconstruct_data in reverse order (#1601)
* Traverse frozen layer in get_reconstruct_data in reverse order

* Fix comments on frozen layers.

Note explicitly the order that the layers are in the queue.

* Add fail point to reproduce failpoint iteration error

Co-authored-by: Heikki Linnakangas <heikki@neon.tech>
2022-05-03 08:07:14 +03:00
Dmitry Rodionov
ad25736f3a Exit pageserver process with correct error code
When we shutdown pageserver due to an error (e g one of th important
thrads panicked) use 1 exit code so systemd can properly restart it
2022-05-02 19:04:45 +03:00
Dhammika Pathirana
f3f12db2cb Add gc churn threshold knob (#1594)
Signed-off-by: Dhammika Pathirana <dhammika@gmail.com>
2022-05-01 13:13:17 -07:00
Kirill Bulatov
7e1db8c8a1 Show which virtual file got the deserialization errors 2022-04-29 21:40:57 +03:00
Dmitry Rodionov
05f8e6a050 Use fsync+rename for atomic downloads from remote storage
Use failpoint in test_remote_storage to check the behavior
2022-04-29 15:53:56 +03:00
Kirill Bulatov
2911eb084a Remove timeline files on detach 2022-04-29 09:19:18 +03:00
Kirill Bulatov
6cca57f95a Properly remove from the local timeline map 2022-04-29 09:19:18 +03:00
Kirill Bulatov
4a46b01caf Properly populate local timeline map 2022-04-29 09:19:18 +03:00
Anastasia Lubennikova
5c5c3c64f3 Fix tenant config parsing. Add a test 2022-04-28 11:49:19 +03:00
Dhammika Pathirana
aeb4f81c3b Add branch traversal unit test
Signed-off-by: Dhammika Pathirana <dhammika@gmail.com>
2022-04-27 00:05:13 -07:00
Dhammika Pathirana
b2e35fffa6 Fix ancestor layer traversal (#1484)
Signed-off-by: Dhammika Pathirana <dhammika@gmail.com>
2022-04-27 00:05:13 -07:00
Kirill Bulatov
778744d35c Limit concurrent S3 and IAM interactions 2022-04-26 13:49:37 +03:00
Dmitry Rodionov
eabf6f89e4 Use item.get for tenant config toml parsing
Previously we've used table interface, but there was no easy way to pass
it as an override to pageserver through cli. Use the same strategy as
for remote storage config parsing
2022-04-26 10:15:19 +03:00
Kirill Bulatov
fec050ce97 Fix macos clippy issues 2022-04-25 16:23:34 +03:00
Kirill Bulatov
8f6a161271 Show better layer load errors 2022-04-25 14:54:39 +03:00
Heikki Linnakangas
1fb3d08185 Use a 1-byte length header for short blobs.
Notably, this shaves 3 bytes from each small WAL record stored in
ephemeral or delta layers.
2022-04-22 21:31:27 +03:00