Commit Graph

9 Commits

Author SHA1 Message Date
Conrad Ludgate
0626e0bfd3 proxy: refactor some error handling and shutdowns (#4684)
## Problem

It took me a while to understand the purpose of all the tasks spawned in
the main functions.

## Summary of changes

Utilising the type system and less macros, plus much more comments,
document the shutdown procedure of each task in detail
2023-07-13 11:03:37 +01:00
Shany Pozin
26828560a8 Add timeouts and retries to consumption metrics reporting client (#4563)
## Problem
#4528
## Summary of changes
Add a 60 seconds default timeout to the reqwest client 
Add retries for up to 3 times to call into the metric consumption
endpoint

---------

Co-authored-by: Christian Schwarz <christian@neon.tech>
2023-07-03 15:20:49 +03:00
Anastasia Lubennikova
92214578af Fix proxy_io_bytes_per_client metric: use branch_id identifier properly. (#4084)
It fixes the miscalculation of the metric for projects that use multiple
branches for the same endpoint.
We were under billing users with such projects. So we need to
communicate the change in Release Notes.
2023-04-26 17:47:54 +03:00
Anastasia Lubennikova
6d01d835a8 [proxy] Report error if proxy_io_bytes_per_client metric has decreased 2023-04-06 23:14:07 +03:00
Anastasia Lubennikova
b7fddfa70d Add branch_id field to proxy_io_bytes_per_client metric.
Since we allow switching endpoints between different branches, it is important to use composite key.
Otherwise, we may try to calculate delta between metric values for two different branches.
2023-03-10 15:00:48 +02:00
Anastasia Lubennikova
40799d8ae7 Add debug messages to catch abnormal consumption metric values 2023-02-17 17:57:45 +02:00
Dmitry Ivanov
956b6f17ca [proxy] Handle some unix signals.
On the surface, this doesn't add much, but there are some benefits:

* We can do graceful shutdowns and thus record more code coverage data.

* We now have a foundation for the more interesting behaviors, e.g. "stop
  accepting new connections after SIGTERM but keep serving the existing ones".

* We give the otel machinery a chance to flush trace events before
  finally shutting down.
2023-02-17 15:32:14 +03:00
Heikki Linnakangas
6f9af0aa8c [proxy] Enable OpenTelemetry tracing.
This commit sets up OpenTelemetry tracing and exporter, so that they
can be exported as OpenTelemetry traces as well.

All outgoing HTTP requests will be traced. A separate (child)
span is created for each outgoing HTTP request, and the tracing
context is also propagated to the server in the HTTP headers.

If tracing is enabled in the control plane and compute node too, you
can now get an end-to-end distributed trace of what happens when a new
connection is established, starting from the handshake with the
client, creating the 'start_compute' operation in the control plane,
starting the compute node, all the way to down to fetching the base
backup and the availability checks in compute_ctl.

Co-authored-by: Dmitry Ivanov <dima@neon.tech>
2023-02-17 15:32:14 +03:00
Anastasia Lubennikova
2cbe84b78f Proxy metrics (#3290)
Implement proxy metrics collection.
Only collect metric for outbound traffic.

Add proxy CLI parameters:
- metric-collection-endpoint
- metric-collection-interval.

Add test_proxy_metric_collection test.

Move shared consumption metrics code to libs/consumption_metrics.
Refactor the code.
2023-01-16 15:17:28 +00:00