Compare commits

...

1934 Commits

Author SHA1 Message Date
Heikki Linnakangas
7eefad691c Improve the scaling, add Play/Stop buttons 2023-03-22 19:40:54 +02:00
Heikki Linnakangas
fe59a063ea WIP: Collect and draw layer trace 2023-03-22 19:40:54 +02:00
Heikki Linnakangas
ae8e5b3a8e Add test from PR #3673 2023-03-22 19:40:54 +02:00
Kirill Bulatov
8bd565e09e Ensure branches with no layers have their remote storage counterpart created eventually (#3857)
Discovered during writing a test for
https://github.com/neondatabase/neon/pull/3843
2023-03-22 17:42:31 +02:00
Joonas Koivunen
6033dfdf4a Re-access layers before threshold eviction (#3867)
To avoid re-downloading evicted files on restart, re-compute logical
size and partitioning before each threshold based eviction run.

Cc: #3802

Co-authored-by: Christian Schwarz <christian@neon.tech>
2023-03-22 16:26:27 +02:00
mikecaat
14a40c9ca6 Fix minor things for the docker-compose file (#3862)
* Add the REPOSITORY env to build args to avoid the following error when
executing without the credentials for the repository.

```
ERROR: Service 'compute' failed to build: Head
"https://369495373322.dkr.ecr.eu-central-1.amazonaws.com/v2/compute-node-v15/manifests/2221":
no basic auth credentials
```

* update the tag version in the documentation to support storage broker
2023-03-22 08:10:53 +00:00
Shany Pozin
0f7de84785 Allow calling detach on ignored tenant (#3834)
## Describe your changes
Added a query param to detach API
Allow to remove local state of a tenant even if its not in the memory
(following ignore API)
## Issue ticket number and link
#3828
## Checklist before requesting a review
- [x] I have performed a self-review of my code.
- [ ] If it is a core feature, I have added thorough tests.
- [ ] Do we need to implement analytics? if so did you add the relevant
metrics to the dashboard?
- [ ] If this PR requires public announcement, mark it with
/release-notes label and add several sentences in this section.

---------

Co-authored-by: Kirill Bulatov <kirill@neon.tech>
2023-03-22 07:17:00 +00:00
Kirill Bulatov
dd22c87100 Remove older layer metadata format support code (#3854)
The PR enforces current newest `index_part.json` format in the type
system (version `1`), not allowing any previous forms of it, that were
used in the past.
Similarly, the code to mitigate the
https://github.com/neondatabase/neon/issues/3024 issue is now also
removed.

Current code does not produce old formats and extra files in the
index_part.json, in the future we will be able to use
https://github.com/neondatabase/aversion or other approach to make
version transitions more explicit.

See https://neondb.slack.com/archives/C033RQ5SPDH/p1679134185248119 for
the justification on the breaking changes.
2023-03-21 23:33:28 +02:00
Heikki Linnakangas
6fdd9c10d1 Read storage auth token from spec file.
We read the pageserver connection string from the spec file, so let's
read the auth token from the same place.

We've been talking about pre-launching compute nodes that are not
associated with any particular tenant at startup, so that the spec
file is delivered to the compute node later. We cannot change the env
variables after the process has been launched.

We still pass the token to 'postgres' binary in the NEON_AUTH_TOKEN
env variable, but compute_ctl is now responsible for setting it.
2023-03-21 20:12:09 +02:00
Dmitry Rodionov
4158e24e60 rfc: delete pageserver data from s3 (#3792)
[Rendered](https://github.com/neondatabase/neon/blob/main/docs/rfcs/022-pageserver-delete-from-s3.md)

---------

Co-authored-by: Joonas Koivunen <joonas@neon.tech>
2023-03-21 20:03:27 +02:00
Shany Pozin
809acb5fa9 Move neon-image-depot to a larger runner (#3860)
## Describe your changes
https://neondb.slack.com/archives/C039YKBRZB4/p1679413279637059
## Issue ticket number and link

## Checklist before requesting a review
- [ ] I have performed a self-review of my code.
- [ ] If it is a core feature, I have added thorough tests.
- [ ] Do we need to implement analytics? if so did you add the relevant
metrics to the dashboard?
- [ ] If this PR requires public announcement, mark it with
/release-notes label and add several sentences in this section.
2023-03-21 19:32:36 +02:00
Heikki Linnakangas
299db9d028 Simplify and clean up the $NEON_AUTH_TOKEN stuff in compute
- Remove the neon.safekeeper_token_env GUC. It was used to set the
  name of an environment variable, which was then used in pageserver
  and safekeeper connection strings to in place of the
  password. Instead, always look up the environment variable called
  NEON_AUTH_TOKEN. That's what neon.safekeeper_token_env was always
  set to in practice, and I don't see the need for the extra level of
  indirection or configurability.

- Instead of substituting $NEON_AUTH_TOKEN in the connection strings,
  pass $NEON_AUTH_TOKEN "out-of-band" as the password, when we connect
  to the pageserver or safekeepers. That's simpler.

- Also use the password from $NEON_AUTH_TOKEN in compute_ctl, when it
  connects to the pageserver to get the "base backup".
2023-03-21 00:15:04 +02:00
Heikki Linnakangas
5a786fab4f Remove duplicated global variables in neon extension.
Walproposer used to live in the backend, while pagestore_smgr was an
extension. But now that both are part of the neon extension,
walproposer can access the same 'neon_tenant' and 'neon_timeline'
variables as the pageserver_smgr code.
2023-03-21 00:15:04 +02:00
Arseny Sher
699f200811 Send error context chain to the client when Copy stream errors. 2023-03-21 01:22:02 +04:00
Christian Schwarz
881356c417 add metrics to detect eviction-induced thrashing (#3837)
This patch adds two metrics that will enable us to detect *thrashing* of
layers, i.e., repetitions of `eviction, on-demand-download, eviction,
... ` for a given layer.

The first metric counts all layer evictions per timeline. It requires no
further explanation. The second metric counts the layer evictions where
the layer was resident for less than a given threshold.

We can alert on increments to the second metric. The first metric will
serve as a baseline, and further, it's generally interesting, outside of
thrashing.

The second metric's threshold is configurable in PageServerConf and
defaults to 24h. The threshold value is reproduced as a label in the
metric because the counter's value is semantically tied to that
threshold. Since changes to the config and hence the label value are
infrequent, this will have low storage overhead in the metrics storage.

The data source to determine the time that the layer was resident is the
file's `mtime`. Using `mtime` is more of a crutch. It would be better if
Pageserver did its own persistent bookkeeping of residence change events
instead of relying on the filesystem. We had some discussion about this:
https://github.com/neondatabase/neon/pull/3809#issuecomment-1470448900

My position is that `mtime` is good enough for now. It can theoretically
jump forward if someone copies files without resetting `mtime`. But that
shouldn't happen in practice. Note that moving files back and forth
doesn't change `mtime`, nor does `chown` or `chmod`. Lastly, `rsync -a`,
which is typically used for filesystem-level backup / restore, correctly
syncs `mtime`.

I've added a label that identifies the data source to keep options open
for a future, better data source than `mtime`. Since this value will
stay the same for the time being, it's not a problem for metrics
storage.

refs https://github.com/neondatabase/neon/issues/3728
2023-03-20 16:11:36 +01:00
Heikki Linnakangas
fea4b5f551 Switch to EdDSA algorithm for the storage JWT authentication tokens.
The control plane currently only supports EdDSA. We need to either teach
the storage to use EdDSA, or the control plane to use RSA. EdDSA is more
modern, so let's use that.

We could support both, but it would require a little more code and tests,
and we don't really need the flexibility since we control both sides.
2023-03-20 16:28:01 +02:00
Heikki Linnakangas
77107607f3 Allow JWT key generation to fail if authentication is not enabled.
This allows you to run without the 'openssl' binary as long as you
don't enable authentication. This becomes more important with the next
commit, which switches the JWT algorithm to EdDSA. LibreSSL does not
support EdDSA, and LibreSSL comes with macOS, so the next commit makes
it much more likely for the key generation to fail for macOS users.

To allow running without a keypair, don't generate the authentication
token in the 'neon_local init' step. Instead, generate a new token on
every request that needs one, using the private key.
2023-03-20 16:28:01 +02:00
Heikki Linnakangas
1da963b2f9 Remove some unused code in control plane. 2023-03-20 16:28:01 +02:00
Heikki Linnakangas
1ddb9249aa Reduce the # of histogram buckets in metrics. (#3850)
Shrinks the total number of metrics collected for each timeline by
about 50%.

See https://github.com/neondatabase/neon/issues/2848. This doesn't fully
solve the problem, we still collect a lot of metrics even with this, but
this gives us a lot of headroom.
2023-03-20 15:49:16 +02:00
Joonas Koivunen
0c1228c37a feat: store initial timeline in env fixture (#3839)
minor change, but will allow more use in future for the default
tenants.

Co-authored-by: Alexander Bayandin <alexander@neon.tech>
2023-03-20 11:57:27 +02:00
Christian Schwarz
3c15874c48 allow specifying eviction_policy in TenantCreateRequest
This was on oversight from 175a577ad4.

Nothing uses this AFAIK, but, let's fix it anyways.

Noticed while working on https://github.com/neondatabase/neon/issues/3728
2023-03-20 10:43:53 +01:00
Shany Pozin
93f3f4ab5f Return NotFound in mgmt API requests when tenant is not present in the pageserver (#3818)
## Describe your changes
Add Error enum for tenant state response to allow better error handling
in mgmt api
## Issue ticket number and link
#2238
## Checklist before requesting a review
- [x] I have performed a self-review of my code.
- [ ] If it is a core feature, I have added thorough tests.
- [ ] Do we need to implement analytics? if so did you add the relevant
metrics to the dashboard?
- [ ] If this PR requires public announcement, mark it with
/release-notes label and add several sentences in this section.
2023-03-19 10:44:42 +02:00
sharnoff
f6e2e0042d Fix + re-enable VM cgroup creation + running (#3820)
Re-enable cgroup shenanigans in VMs, with some special care taken to
make sure that our version of cgroup-tools supports cgroup v2 (debian
bullseye does not, and probably won't because it requires a breaking
change in libcgroup).

This involves manually building libcgroup  / cgroup-tools from source,
then copying the output into the final build stage.

We originally considered pulling the package from debian's testing repo
(which is up-to-date), but decided against it. Refer to the PR for more
details.

Prior work, for reference:

* 2153d2e0 - Run compute_ctl in a cgroup in VMs
* 1360361f - Fix missing VM cgconfig.conf
* 8dae8799 - Disable VM cgroup shenanigans
2023-03-16 17:09:45 -07:00
Christian Schwarz
b917270c67 remove unused TenantConfig::update function 2023-03-16 16:25:35 +01:00
Arthur Petukhovsky
b067378d0d Measure cross-AZ traffic in safekeepers (#3806)
Create `safekeeper_pg_io_bytes_total` metric to track total amount of
bytes written/read in a postgres connections to safekeepers. This metric
has the following labels:
- `client_az` – availability zone of the connection initiator, or
`"unknown"`
- `sk_az` – availability zone of the safekeeper, or `"unknown"`
- `app_name` – `application_name` of the postgres client
- `dir` – data direction, either `"read"` or `"write"`
- `same_az` – `"true"`, `"false"` or `"unknown"`. Can be derived from
`client_az` and `sk_az`, exists purely for convenience.

This is implemented by passing availability zone in the connection
string, like this: `-c tenant_id=AAA timeline_id=BBB
availability-zone=AZ-1`.

Update ansible deployment scripts to add availability_zone argument
to safekeeper and pageserver in systemd service files.
2023-03-16 17:24:01 +03:00
Joonas Koivunen
768c8d9972 test: allow gc to get unlucky (#3826)
this failure case was probably introduced by b220ba6, because earlier
the gc would always have run fast enough for restart every 1s. however,
test got added later, so we have just been lucky.

fixes #3824 by allowing this error to happen.
2023-03-16 11:03:21 +02:00
Rahul Patil
f1d960d2c2 Add new pageserver to eu-central-1 (#3829)
## Describe your changes

## Issue ticket number and link

## Checklist before requesting a review
- [x] I have performed a self-review of my code.
- [ ] If it is a core feature, I have added thorough tests.
- [ ] Do we need to implement analytics? if so did you add the relevant
metrics to the dashboard?
- [ ] If this PR requires public announcement, mark it with
/release-notes label and add several sentences in this section.
2023-03-15 16:58:28 +01:00
Arthur Petukhovsky
cd17802b1f Add pg_clients test for serverless driver (#3827)
Fixes #3819
2023-03-15 16:18:38 +03:00
Heikki Linnakangas
10a5d36af8 Separate mgmt and libpq authentication configs in pageserver. (#3773)
This makes it possible to enable authentication only for the mgmt HTTP
API or the compute API. The HTTP API doesn't need to be directly
accessible from compute nodes, and it can be secured through network
policies. This also allows rolling out authentication in a piecemeal
fashion.
2023-03-15 13:52:29 +02:00
Arseny Sher
a7ab53c80c Forward framed read buf contents to compute before proxy pass.
Otherwise they get lost. Normally buffer is empty before proxy pass, but this is
not the case with pipeline mode of out npm driver; fixes connection hangup
introduced by b80fe41af3 for it.

fixes https://github.com/neondatabase/neon/issues/3822
2023-03-15 14:32:41 +03:00
Heikki Linnakangas
2672fd09d8 Make test independent of the order of config lines. 2023-03-14 20:10:34 +02:00
Heikki Linnakangas
4a92799f24 Fix check for trailing garbage in basebackup import.
There was a warning for trailing garbage after end-of-tar archive, but
it didn't always work. The reason is that we created a StreamReader
over the original copyin-stream, but performed the check for garbage
on the copyin-stream. There could be some garbage bytes buffered in
the StreamReader, which were not caught by the warning.

I considered turning the the warning into a fatal error, aborting the
import, but I wasn't sure if we handle aborting the import properly.
Do we clean up the timeline directory on error? If we don't, we should
make that more robust, but that's a different story.

Also, normally a valid tar archive ends with two 512-byte blocks of zeros.
The tokio_tar crate stops at the first all-zeros block. Read and check
the second all-zeros block, and error out if it's not there, or contains
something unexpected.
2023-03-14 19:45:33 +02:00
Konstantin Knizhnik
5396273541 Avoid holes between generated image layers (#3771)
## Describe your changes

When we perform partitioning of the whole key space, we take in account
actual ranges of relation present in the database. So if we have
relation with relid=1 and size 100 and relation with relid=2 with size
200 then result of KeySpace::partition may contain partitions
<100000000..100000099> and <200000000..200000199>. Generated image
layers will contain the same boundaries.
But when GC is checking image coverage to find out of old layer is fully
covered by newer image layer and so can be deleted, it takes in account
only full key range. I.e. if there is delta layer <100000000..300000000>
then it never be garbage collected because image layers
<100000000..100000099> and <200000000..200000199> are not completely
covering it.This is how it looks in practice:


000000067F000032AC00000A300000000000-000000067F000032AC00000A330000000000__000000000F761828

000000067F000032AC00000A31000000001F-000000067F000032AC00000A620000000005__0000000001696070-000000000442A551

000000067F000032AC00000A3300FFFFFFFF-000000067F000032AC00000A650100000000__000000000F761828

So there are two image layers covering delta layer but ... there is a
hole: A330000000000...A3300FFFFFFFF and as a result delta layer is not
collected.

## Issue ticket number and link

This PR is deeply related with #3673 because it is addressing the same
problem: old layers are not utilized by GC.
The test test_gc_old_layers.py in #3673 can be used to see effect of
this patch.

## Checklist before requesting a review
- [ ] I have performed a self-review of my code.
- [ ] If it is a core feature, I have added thorough tests.
- [ ] Do we need to implement analytics? if so did you add the relevant
metrics to the dashboard?
- [ ] If this PR requires public announcement, mark it with
/release-notes label and add several sentences in this section.

---------

Co-authored-by: Joonas Koivunen <joonas@neon.tech>
2023-03-14 16:15:07 +02:00
Joonas Koivunen
c23c8946a3 chore: clippies introduced with rust 1.68 (#3781)
- handle automatically fixable future clippies
- tune run-clippy.sh to remove macos specifics which we no longer have

Co-authored-by: Alexander Bayandin <alexander@neon.tech>
2023-03-14 15:29:02 +02:00
Joonas Koivunen
15b692ccc9 test: more strict finding of WARN, ERROR lines (#3798)
this prevents flakyness when `WARN|ERROR` appears in some other part of
the line, for example in a random filename.
2023-03-14 15:27:52 +02:00
Alexander Bayandin
3d869cbcde Replace flake8 and isort with ruff (#3810)
- Introduce ruff (https://beta.ruff.rs/) to replace flake8 and isort
- Update mypy and black
2023-03-14 13:25:44 +00:00
Lassi Pölönen
68ae020b37 Use RollingUpdate strategy also for legacy proxy (#3814)
## Describe your changes
We have previously changed the neon-proxy to use RollingUpdate. This
should be enabled in legacy proxy too in order to avoid breaking
connections for the clients and allow for example backups to run even
during deployment. (https://github.com/neondatabase/neon/pull/3683)

## Issue ticket number and link
https://github.com/neondatabase/neon/issues/3333
2023-03-14 13:23:46 +00:00
Joonas Koivunen
d6bb8caad4 refactor: correct return value for not found L0's on LayerMap::replace (#3805)
in prev implementation, an `ok_or_else(...)?` is used to cause a
"precondition error" on LayerMap::replace, however we only see this
particular error if an L0 for which replace fails is not in the layermap
because it is not in `l0_delta_layers`. changes or fixes this to be
Replacement::NotFound instead, making it more clear that an error would
only be raised for actual preconditions, like trying to replace layer
with completly unrelated layer.
2023-03-14 13:18:26 +02:00
Alexander Bayandin
319402fc74 postgres_ffi: restore POSTGRES_INSTALL_DIR support (#3811)
Fix path construction to `pg_config`: `pg_install_dir_versioned` 
already includes `pg_version`
2023-03-14 10:25:48 +00:00
Dmitry Ivanov
2e4bf7cee4 [proxy] Immediately log all compute node connection errors. 2023-03-14 01:45:57 +03:00
Dmitry Ivanov
15ed6af5f2 Add descriptions to proxy's python tests. 2023-03-14 01:32:37 +03:00
Dmitry Rodionov
50476a7cc7 test: update to match current interfaces 2023-03-13 17:50:10 +02:00
andres
d7ab69f303 add test for getting branchpoints from an inactive timeline 2023-03-13 17:50:10 +02:00
sharnoff
582620274a Enable file cache handling by vm-informant (#3794)
Enables the VM informant's file cache integration.

See also: https://github.com/neondatabase/autoscaling/pull/47
2023-03-13 07:16:39 -07:00
Vadim Kharitonov
daeaa767c4 Add neondatabase/release team as a default reviewers for storage
releases
2023-03-13 13:40:15 +01:00
Konstantin Knizhnik
f0573f5991 Remove block cursor cache (#3740)
## Describe your changes

Do not pin current block in BlockCursor

## Issue ticket number and link

See #3712 

There are places (see get_reconstruct_data) in our code when thread is
holding read layers lock and then try to read file and so lock page
cache slot. So we have edge in dependency graph layers->page cache slot.
At the same time (as Christian noticed) we can lock page cache slot in
BlockCursor and then try obtain shard lock on layers.
So there is backward edge in dependency graph page cache slot>layers
which forms loop and may cause deadlock.

There are three possible fixes of the problem:
1. Perform compaction under `layers` shared lock. See PR #3732. It fixes
the problem but make it not possible to append any data to pageserver
until compaction is completed.
2. Do not hold `layers` lock while accessing layers (not sure if it is
possible to do because it definitely introduce some new race
conditions).
3. Do not pin current pages in BockCursor (this PR).

My experiments shows that this cache in BlockCursor is not so useful:
the number of hits/misses for cursor cache on pgbench workload (-i -s
10/-c 10 -T 100/-c 10 -S -T 100):
```
hits: 163011
misses: 1023602
```
So number of cache misses is 10x times  larger.
And results for read-only pgbench are mostly the same:
```
with cache: 14581
w/out cache: 14429
```
 
## Checklist before requesting a review
- [ ] I have performed a self-review of my code.
- [ ] If it is a core feature, I have added thorough tests.
- [ ] Do we need to implement analytics? if so did you add the relevant
metrics to the dashboard?
- [ ] If this PR requires public announcement, mark it with
/release-notes label and add several sentences in this section.
2023-03-13 14:07:35 +02:00
Nikita Kalyanov
07dcf679de set content type explicitly (#3799)
I moved management API v2 to ogen and the generated code seems to be
more strict about content type. Let's set it properly as it is json
after all

## Describe your changes

## Issue ticket number and link

## Checklist before requesting a review
- [x] I have performed a self-review of my code.
- [ ] If it is a core feature, I have added thorough tests.
- [ ] Do we need to implement analytics? if so did you add the relevant
metrics to the dashboard?
- [ ] If this PR requires public announcement, mark it with
/release-notes label and add several sentences in this section.
2023-03-13 14:00:01 +02:00
Heikki Linnakangas
e0ee138a8b Add a test for tokio-postgres client to the driver test suite.
It is fully supported. To enable TLS, though, it requires some extra
glue code, and a dependency to a TLS library.
2023-03-13 12:14:37 +02:00
Arthur Petukhovsky
d9a1329834 Make postgres_backend use generic IO type (#3789)
- Support measuring inbound and outbound traffic in MeasuredStream
- Start using MeasuredStream in safekeepers code
2023-03-13 12:18:10 +03:00
Joonas Koivunen
8699342249 Ondemand rx bytes and layer count (#3777)
Adds two new *global* metrics:
- pageserver_remote_ondemand_downloaded_layers_total
- pageserver_remote_ondemand_downloaded_bytes_total

An existing test is repurposed once more to check that we do get some
reasonable counts. These are to replace guessing from the nic RX bytes
metric how much was on-demand downloaded.

First part of #3745: This does not add the "(un)?avoidable" metric,
which I plan to add as a new metric, which will be a subset of the
counts of the metrics added here.
2023-03-13 09:26:49 +02:00
Joonas Koivunen
ce8fbbd910 Fix allowed error again (#3790)
Fixes #3360 again, this time checking all other "Error processing HTTP
request" messages and aligning the regex with the two others.
2023-03-10 17:44:12 +00:00
Vadim Kharitonov
1401021b21 Be able to get number of CPUs (#3774)
After enabling autoscaling, we faced the issue that customers are not
able to get the number of CPUs they use at this moment. Therefore I've
added these two options:

1. Postgresql function to allow customers to call it whenever they want
2. `compute_ctl` endpoint to show these number in console
2023-03-10 19:00:20 +02:00
Stas Kelvich
252b3685a2 Use unsafe-postgres feature to build pgx extension
Recently added `unsafe-postgres` feature allows to build pgx extensions
against postgres forks that decided to change their ABI name (like us).
With that we can build extensions without forking them and using stock
pgx. As this feature is new few manual version bumps were required.
2023-03-10 17:40:45 +02:00
Heikki Linnakangas
34d3385b2e Add unit tests for JWT encoding and decoding. 2023-03-10 16:09:32 +02:00
Heikki Linnakangas
b00530df2a Add section in internal docs on the JWT payload.
Just copied from the code comments. Could be improved, but this is a
start.
2023-03-10 16:09:32 +02:00
Heikki Linnakangas
bebf76c461 Accept RS384 and RS512 JWT tokens.
Previously, we only accepted RS256. Seems like a pointless limitation,
and when I was testing it with RS512 tokens, it took me a while to
understand why it wasn't working.
2023-03-10 16:09:32 +02:00
Vadim Kharitonov
2ceef91da1 Compile pg_tiktoken extension 2023-03-10 14:59:26 +01:00
Anastasia Lubennikova
b7fddfa70d Add branch_id field to proxy_io_bytes_per_client metric.
Since we allow switching endpoints between different branches, it is important to use composite key.
Otherwise, we may try to calculate delta between metric values for two different branches.
2023-03-10 15:00:48 +02:00
Heikki Linnakangas
d1537a49fa Fix escaping in postgresql.conf that we generate at compute startup
If there are any config options that contain single quotes or backslashes,
they need to be escaped
2023-03-10 14:59:21 +02:00
Heikki Linnakangas
856d01ff68 Add newline at end of postgresql.conf 2023-03-10 14:59:21 +02:00
Heikki Linnakangas
42ec79fb0d Make expected test output nicer to read.
By using Rust raw string literal.
2023-03-10 14:59:21 +02:00
Rory de Zoete
3c4f5af1b9 Try depot.dev for image building (#3768)
To see if it is faster. Run side-by-side for a while so we can gather
enough data.
2023-03-10 11:11:39 +01:00
Arseny Sher
290884ea3b Fix too many arguments in read_network clippy complain. 2023-03-10 10:50:03 +03:00
Arseny Sher
965837df53 Log connection ids in safekeeper instead of thread ids.
Fixes build on macOS (which doesn't have nix gettid) after 0d8ced8534.
2023-03-10 10:50:03 +03:00
Arseny Sher
d1a0f2f0eb Fix example why manual-range-contains is disabled. 2023-03-10 00:03:17 +03:00
Konstantin Knizhnik
a34e78d084 Retry attempt to connect to pageserver in order to make pageserver restart transparent for clients (#3700)
…start transparent for clients

## Describe your changes

Try to reestablish connection with pageserver if send is failed to be
able to make pageserver restart transparent for client

## Issue ticket number and link

https://github.com/neondatabase/neon/issues/1138

## Checklist before requesting a review
- [ ] I have performed a self-review of my code.
- [ ] If it is a core feature, I have added thorough tests.
- [ ] Do we need to implement analytics? if so did you add the relevant
metrics to the dashboard?
- [ ] If this PR requires public announcement, mark it with
/release-notes label and add several sentences in this section.

---------

Co-authored-by: Heikki Linnakangas <heikki@neon.tech>
2023-03-09 22:15:46 +02:00
Arseny Sher
b80fe41af3 Refactor postgres protocol parsing.
1) Remove allocation and data copy during each message read. Instead, parsing
functions now accept BytesMut from which data they form messages, with
pointers (e.g. in CopyData) pointing directly into BytesMut buffer. Accordingly,
move ConnectionError containing IO error subtype into framed.rs providing this
and leave in pq_proto only ProtocolError.

2) Remove anyhow from pq_proto.

3) Move FeStartupPacket out of FeMessage. Now FeStartupPacket::parse returns it
directly, eliminating dead code where user wants startup packet but has to match
for others.

proxy stream.rs is adapted to framed.rs with minimal changes. It also benefits
from framed.rs improvements described above.
2023-03-09 20:45:56 +03:00
Arseny Sher
0d8ced8534 Remove sync postgres_backend, tidy up its split usage.
- Add support for splitting async postgres_backend into read and write halfes.
  Safekeeper needs this for bidirectional streams. To this end, encapsulate
  reading-writing postgres messages to framed.rs with split support without any
  additional changes (relying on BufRead for reading and BytesMut out buffer for
  writing).
- Use async postgres_backend throughout safekeeper (and in proxy auth link
  part).
- In both safekeeper COPY streams, do read-write from the same thread/task with
  select! for easier error handling.
- Tidy up finishing CopyBoth streams in safekeeper sending and receiving WAL
  -- join split parts back catching errors from them before returning.

Initially I hoped to do that read-write without split at all, through polling
IO:
https://github.com/neondatabase/neon/pull/3522
However that turned out to be more complicated than I initially expected
due to 1) borrow checking and 2) anon Future types. 1) required Rc<Refcell<...>>
which is Send construct just to satisfy the checker; 2) can be workaround with
transmute. But this is so messy that I decided to leave split.
2023-03-09 20:45:56 +03:00
Arseny Sher
7627d85345 Move async postgres_backend to its own crate.
To untie cyclic dependency between sync and async versions of postgres_backend,
copy QueryError and some logging/error routines to postgres_backend.rs. This is
temporal glue to make commits smaller, sync version will be dropped by the
upcoming commit completely.
2023-03-09 20:45:56 +03:00
Arseny Sher
3f11a647c0 Rename write_message to write_message_noflush in postgres_backend_async.rs
To make it unifrom across the project; proxy stream.rs and older
postgres_backend uses write_message_noflush.
2023-03-09 20:45:56 +03:00
Alexey Kondratov
e43c413a3f [compute_tools] Add /insights endpoint to compute_ctl (#3704)
This commit adds a basic HTTP API endpoint that allows scraping the
`pg_stat_statements` data and getting a list of slow queries. New
insights like cache hit rate and so on could be added later.

Extension `pg_stat_statements` is checked / created only if compute
tries to load the corresponding shared library. The latter is configured
by control-plane and currently covered with feature flag.

Co-authored by Eduard Dyckman (bird.duskpoet@gmail.com)
2023-03-09 14:21:10 +01:00
Heikki Linnakangas
8459e0265e Add performance test for compaction and image layer creation 2023-03-09 14:30:12 +02:00
Kirill Bulatov
03a2ce9d13 Add tracing spans with request_id into pageserver management API handlers (#3755)
Adds a newtype that creates a span with request_id from
https://github.com/neondatabase/neon/pull/3708 for every HTTP request
served.

Moves request logging and error handlers under the new wrapper, so every request-related event now is logged under the request span.
For compatibility reasons, error handler is left on the general router, since not every service uses the new handler wrappers yet.
2023-03-09 09:24:01 +02:00
Heikki Linnakangas
ccf92df4da Remove deprecated support to handle ZENITH_AUTH_TOKEN.
It's not used anywhere anymore.
2023-03-09 00:53:13 +02:00
Vadim Kharitonov
37bc6d9be4 Compile plpgsql_check extension 2023-03-08 23:20:24 +01:00
Vadim Kharitonov
177f986795 Compile hll extension 2023-03-08 12:03:09 +01:00
Heikki Linnakangas
fb1581d0b9 Fix setting "image_creation_threshold" setting in tenant config. (#3762)
We have a few tests that try to set image_creation_threshold, but it
didn't actually have any effect because we were missing some critical
code to load the setting from config file into memory.

The two modified tests in `test_remote_storage.py perform
compaction and GC, and assert that GC removes some layers. That
only happens if new image layers are created by the
compaction. The tests explicitly disabled image layer creation by
setting image_creation_threshold to a high value, but it didn't
take effect because reading image_creation_threshold from config
file was broken, which is why the test worked. Fix the test to
set image_creation_threshold low, instead, so that GC has work to
do.

Change 'test_tenant_conf.py' so that it exercises the added code.

This might explain why we're apparently missing test coverage for GC
(issue #3415), although I didn't try to address that here, nor did I
check if this improves the it.
2023-03-08 11:39:30 +02:00
Sasha Krassovsky
02b8e0e5af Add OpenAPI spec for do_gc (#3756)
## Describe your changes
Adds a field to the OpenAPI spec for the page server which describes the
`do_gc` command.
## Issue ticket number and link
#3669
## Checklist before requesting a review
- [x] I have performed a self-review of my code.
- [ ] If it is a core feature, I have added thorough tests.
- [ ] Do we need to implement analytics? if so did you add the relevant
metrics to the dashboard?
- [ ] If this PR requires public announcement, mark it with
/release-notes label and add several sentences in this section.
2023-03-07 09:08:46 -08:00
Vadim Kharitonov
1b16de0d0f Compile prefix extension 2023-03-07 17:42:53 +01:00
Stas Kelvich
069b5b0a06 Make postgres --wal-redo more embeddable.
* Stop allocating and maintaining 128MB hash table for last written

  LSN cache as it is not needed in wal-redo.

* Do not require access to the initialized data directory. That
  saves few dozens megabytes of empty but initialized data directory.
  Currently such directories do occupy about 10% of the disk space
  on the pageservers as most of tenants are empty.

* Move shmem-initialization code to the extension instead of postgres
2023-03-07 15:01:14 +02:00
Joonas Koivunen
b05e94e4ff fix: allow ERROR log to appear per allowed failure (#3696)
The test already allows the background thread trying to checkpoint to
fail, however the resulting log message is currently not allowed thus
causing flakyness.
2023-03-07 12:44:04 +00:00
Arseny Sher
0acf9ace9a Return 404 if timeline is not found in safekeeper HTTP API. 2023-03-07 16:34:20 +04:00
Arseny Sher
ca85646df4 Max peer_horizon_lsn before adopting it.
Before this patch, persistent peer_horizon_lsn was always sent to walproposer,
making it initially calculate it equal to max of persistent values and in turn
pulling back the in memory value. Send instead in memory value and take max when
safekeeper sets it.

closes https://github.com/neondatabase/neon/issues/3752
2023-03-07 10:16:54 +04:00
Shany Pozin
7b9057ad01 Add timeout to download copy (#3675)
## Describe your changes
Adding a timeout handling for the remote download of layers of 120
seconds for each operation
Note that these downloads are being retried for N times 

## Issue ticket number and link
Fixes: #3672

## Checklist before requesting a review
- [x] I have performed a self-review of my code.
- [ ] If it is a core feature, I have added thorough tests.
- [ ] Do we need to implement analytics? if so did you add the relevant
metrics to the dashboard?
- [ ] If this PR requires public announcement, mark it with
/release-notes label and add several sentences in this section.

---------

Co-authored-by: Joonas Koivunen <joonas@neon.tech>
2023-03-06 18:52:59 +02:00
Konstantin Knizhnik
96f65fad68 Handle crash of walredo process and retry applying wal records (#3739)
## Describe your changes

Restart walredo process an d retry applying walredo records i case of
abnormal walredo process termination

## Issue ticket number and link

See #1700


## Checklist before requesting a review
- [ ] I have performed a self-review of my code.
- [ ] If it is a core feature, I have added thorough tests.
- [ ] Do we need to implement analytics? if so did you add the relevant
metrics to the dashboard?
- [ ] If this PR requires public announcement, mark it with
/release-notes label and add several sentences in this section.
2023-03-06 10:10:58 +02:00
Christian Schwarz
9cada8b59d fix benchmarks, broken by PR #3737
Benchmarks only run on `main` branch, so, the pre-commit tests didn't
catch these.
2023-03-03 18:47:57 +01:00
Christian Schwarz
66a5159511 fix: compaction: no index upload scheduled if no on-demand downloads
Commit

    0cf7fd0fb8
    Compaction with on-demand download (#3598)

introduced a subtle bug: if we don't have to do on-demand downloads,
we only take one ROUND in fn compact() and exit early.
Thereby, we miss scheduling the index part upload for any layers
created by fn compact_inner().

Before that commit, we didn't have this problem.
So, this patch fixes it.

Since no regression test caught this, I went ahead and extended the
timeline size tests to assert that, if remote storage is configured,
1. pageserver_remote_physical_size matches the other physical sizes
2. file sizes reported by the layer map info endpoint match the other
   physical size metrics

Without the pageserver code fix, the regression test would
fail at the physical size assertion, complaining that
any of the resident physical size != remote physical size metric
50790400.0 != 18399232.0
I figured out what the problem is by comparing the remote storage
and local directories like so, and noticed that the image layer
in the local directory wasn't present on the remote side.
It's size was exactly the difference
    50790400.0 - 18399232.0  =32391168.0

fixes https://github.com/neondatabase/neon/issues/3738
2023-03-03 16:11:54 +01:00
Christian Schwarz
d1a0a907ff tests: use parse_metrics everywhere (#3737)
- use parse_metrics() in all places where we parse Prometheus metrics
- query_all: make `filter` argument optional
- encourage using properly parsed, typed metrics by changing get_metrics()
  to return already-parsed metrics. The new get_metric_str() method,
  like in the Safekeeper type, returns the raw text response.
2023-03-03 14:53:27 +01:00
Christian Schwarz
1b780fa752 timeline_checkpoint_handler: add span with tenant and timeline id
Before this patch, the logs written by freeze_and_flush() and compact()
didn't have any span, which made the test logs annoying to read.
2023-03-03 12:10:24 +01:00
Christian Schwarz
38022ff11c gc: only decrement resident size if GC'd layer is resident
Before this patch, GC would call PersistentLayer::delete()
on every GC'ed layer.
RemoteLayer::delete() returned Ok(()) unconditionally.
GC would then proceed by decrementing the resident size metric,
even though the layer is a RemoteLayer.

This patch makes the following changes:
- Rename PersistentLayer::delete() to delete_resident_layer_file().
  That name is unambiguous.
- Make RemoteLayer::delete_resident_layer_file return an Err().
  We would have uncovered this bug if we had done that from the start.
- Change GC / Timeline::delete_historic_layer check whether
  the layer is remote or not, and only call delete_resident_layer_file()
  if it's not remote. This brings us in line with how eviction does it.
- Add a regression test.

fixes https://github.com/neondatabase/neon/issues/3722
2023-03-03 12:10:24 +01:00
Christian Schwarz
1b9b9d60d4 eviction: add comment explaining resident size decrement on error
https://github.com/neondatabase/neon/issues/3722
2023-03-03 12:10:24 +01:00
Christian Schwarz
68141a924d eviction: remove needless if-let around resident size decrement
The branch was always taken at runtime, so, this should not
constitute a behavioral change.

refs https://github.com/neondatabase/neon/issues/3722
2023-03-03 12:10:24 +01:00
Christian Schwarz
764d27f696 fix checkpoint_timeout serialization in TenantConf
Without this change, when actually setting this conf opt, the tenant
would become Broken next time we load it.
Why?
The serde_toml representation that persist_tenant_conf would write out
would be a TOML inline table of `secs` and `nsecs`.
But our hand-rolled TenantConf parser expects a TOML string.

I checked that all other `Duration` values in TenantConfOpt
use the humantime serialization.

Issues like this would likely be systematically prevent by
https://github.com/neondatabase/neon/issues/3682
2023-03-03 12:10:24 +01:00
Arthur Petukhovsky
b23742e09c Create /v1/debug_dump safekeepers endpoint (#3710)
Add HTTP endpoint to get full safekeeper state of all existing timelines
(all in-memory values and info about all files stored on disk).

Example:
https://gist.github.com/petuhovskiy/3cbb8f870401e9f486731d145161c286
2023-03-03 14:01:05 +03:00
Vadim Kharitonov
5e514b8465 Compile pgTAP extension 2023-03-03 10:14:38 +01:00
Vadim Kharitonov
a60f687ce2 Compile rum extension 2023-03-02 12:28:20 +01:00
sharnoff
8dae879994 Disable VM cgroup shenanigans (#3730)
As discussed - temporary, so it can unblock releasing autoscaling.
Cleaner to fully remove, then add back rather than commenting it out.
2023-03-01 23:58:43 -08:00
Shany Pozin
d19c5248c9 Add UUID header to mgmt API (#3708)
## Describe your changes

## Issue ticket number and link
#3479
## Checklist before requesting a review
- [x] I have performed a self-review of my code.
- [ ] If it is a core feature, I have added thorough tests.
- [ ] Do we need to implement analytics? if so did you add the relevant
metrics to the dashboard?
- [ ] If this PR requires public announcement, mark it with
/release-notes label and add several sentences in this section.
2023-03-01 18:09:08 +02:00
sharnoff
1360361f60 Fix missing VM cgconfig.conf (#3718)
It was being added to the wrong stage in the dockerfile. This should fix
it, and resolves an ongoing issue on staging.
2023-02-28 21:11:00 -08:00
Alexander Bayandin
000eb1b069 Bump tempfile from 3.3.0 to 3.4.0 (#3709)
Update `tempfile` crate to get rid of `remove_dir_all` dependency
Ref https://github.com/neondatabase/neon/security/dependabot/15
2023-02-27 12:44:08 +00:00
Heikki Linnakangas
f51b48fa49 Fix UNLOGGED tables.
Instead of trying to create missing files on the way, send init fork contents as
main fork from pageserver during basebackup. Add test for that. Call
put_rel_drop for init forks; previously they weren't removed. Bump
vendor/postgres to revert previous approach on Postgres side.

Co-authored-by: Arseny Sher <sher-ars@yandex.ru>

ref https://github.com/neondatabase/postgres/pull/264
ref https://github.com/neondatabase/postgres/pull/259
ref https://github.com/neondatabase/neon/issues/1222
2023-02-24 23:30:02 +04:00
Sergey Melnikov
9f906ff236 Add pageserver-2.us-east-2.aws.neon.tech (#3701) 2023-02-23 19:56:21 +01:00
Sam Kleinman
c79dd8d458 compute_ctl: support for fetching spec from control plane (#3610) 2023-02-23 13:19:39 -05:00
Vadim Kharitonov
ec4ecdd543 Enable postgres SPI extensions 2023-02-23 16:49:37 +01:00
MMeent
20a4d817ce Update vendored PostgreSQL versions to 14.7 and 15.2 (#3581)
## Describe your changes
Rebase vendored PostgreSQL onto 14.7 and 15.2

## Issue ticket number and link

#3579

## Checklist before requesting a review
- [x] I have performed a self-review of my code.
- [x] If it is a core feature, I have added thorough tests.
- [ ] Do we need to implement analytics? if so did you add the relevant
metrics to the dashboard?
- [x] If this PR requires public announcement, mark it with
/release-notes label and add several sentences in this section.
    ```
The version of PostgreSQL that we use is updated to 14.7 for PostgreSQL
14 and 15.2 for PostgreSQL 15.
    ```
2023-02-23 16:10:22 +02:00
Vadim Kharitonov
5ebf7e5619 Fix pg_jsonschema and pg_graphql 2023-02-23 10:43:46 +01:00
Arseny Sher
0692fffbf3 Bump vendor/postgres to include hotfix for unlogged tables with indexes.
https://github.com/neondatabase/postgres/pull/259
https://github.com/neondatabase/postgres/pull/262
2023-02-23 01:34:59 +04:00
Vadim Kharitonov
093570af20 Compile pg_hashids extension 2023-02-22 21:00:25 +01:00
Dmitry Rodionov
eb403da814 Use debug level for successful GET http requests (#3681)
We started rather frequently scrap some apis for metadata. This includes
layer eviction tester, I believe console does that too.

It should eliminate these logs:
https://neonprod.grafana.net/goto/rr_ace1Vz?orgId=1 (Note the rate
around 2k messages per minute)
2023-02-22 22:19:05 +03:00
Vadim Kharitonov
f3ad635911 Compile pgrouting extension 2023-02-22 20:16:11 +01:00
Vadim Kharitonov
a8d7360881 Compile hypopg extension 2023-02-22 20:14:30 +01:00
Lassi Pölönen
b0311cfdeb Change the production neon-proxy-scram update strategy to RollingUpdate (#3683)
## Describe your changes
The same change in production as was done in staging by
https://github.com/neondatabase/neon/pull/3678

## Issue ticket number and link
https://github.com/neondatabase/neon/issues/3333
2023-02-22 20:15:37 +02:00
Konstantin Knizhnik
412e0aa985 Skip largest N holes during compaction (#3597)
## Describe your changes

This is yet another attempt to address problem with storage size
ballooning #2948
Previous PR #3348 tries to address this problem by maintaining list of
holes for each layer.
The problem with this approach is that we have to load all layer on
pageserver start.
Lazy loading of layers is not possible any more.

This PR tries to collect information of N largest holes on compaction
time and exclude this holes from produced layers.
It can cause generation of larger number of layers (up to 2 times) and
producing small layers.
But it requires minimal changes in code and doesn't affect storage
format.

For graphical explanation please see thread:
https://github.com/neondatabase/neon/pull/3597#discussion_r1112704451

## Issue ticket number and link

#2948
#3348

## Checklist before requesting a review
- [ ] I have performed a self-review of my code.
- [ ] If it is a core feature, I have added thorough tests.
- [ ] Do we need to implement analytics? if so did you add the relevant
metrics to the dashboard?
- [ ] If this PR requires public announcement, mark it with
/release-notes label and add several sentences in this section.
2023-02-22 18:28:01 +02:00
Lassi Pölönen
965b4f4ae2 Change the staging neon-proxy-scram update strategy to RollingUpdate (#3678)
## Describe your changes
When we deploy the proxy with the default Recreate strategy, there's
always some downtime and existing connections will be shut down. Change
the strategy to RollingUpdate and delay the kill signal by one week. AWS
Network Loadbalancer keeps the existing connections alive for as long as
the pods are alive, but will direct new connections to new pods.


## Issue ticket number and link
https://github.com/neondatabase/neon/issues/3333
2023-02-22 16:50:07 +02:00
Arthur Petukhovsky
95018672fa Remove safekeeper-1.ap-southeast-1.aws.neon.tech (#3671)
We migrated all timelines to
`safekeeper-3.ap-southeast-1.aws.neon.tech`, now old instance can be
removed.
2023-02-22 11:55:41 +02:00
Sergey Melnikov
2caece2077 Add -v to ansible invocations (#3670)
To get more debug output on failures
2023-02-21 23:11:52 +03:00
Joonas Koivunen
b8b8c19fb4 fix: hold permit until GetObject eof (#3663)
previously we applied the ratelimiting only up to receiving the headers
from s3, or somewhere near it. the commit adds an adapter which carries
the permit until the AsyncRead has been disposed.

fixes #3662.
2023-02-21 21:14:08 +02:00
Joonas Koivunen
225add041f calculate_logical_size: no longer use spawn_blocking (#3664)
Calculation of logical size is now async because of layer downloads, so
we shouldn't use spawn_blocking for it. Use of `spawn_blocking`
exhausted resources which are needed by `tokio::io::copy` when copying
from a stream to a file which lead to deadlock.

Fixes: #3657
2023-02-21 21:09:31 +02:00
Joonas Koivunen
5d001b1e5a chore: ignore all compaction inactive tenant errors (#3665)
these are happening in tests because of #3655 but they sure took some
time to appear.

makes the `Compaction failed, retrying in 2s: Cannot run compaction
iteration on inactive tenant` into a globally allowed error, because it
has been seen failing on different test cases.
2023-02-21 20:20:13 +02:00
Joonas Koivunen
fe462de85b fix: log download failed error (#3661)
Fixes #3659
2023-02-21 19:31:53 +02:00
Vadim Kharitonov
c0de7f5cd8 Build pg_jsonschema and pg_graphql extensions (#3535)
## Describe your changes
Layer for building pg extensions written on Rust

It required forking:
* `cargo-pgx` (in order not to catch an ABI mismatch error (`cargo-pgx`
hardcoded ABI tcdi/pgx#1032)
* `pg_jsonschema` (to use forked `cargo-pgx` version)
* `pgx-contrib-spiext` (to use forked `cargo-pgx`)
* `pg_graphql` (to use forked `cargo-pgx` and `pgx-contrib-spiext`
version)

Before the patch:

```
postgres=# create extension pg_jsonschema;
2023-02-02 17:45:23.120 UTC [35] ERROR:  incompatible library "/usr/local/lib/pg_jsonschema.so": ABI mismatch
2023-02-02 17:45:23.120 UTC [35] DETAIL:  Server has ABI "Neon Postgres", library has "PostgreSQL".
2023-02-02 17:45:23.120 UTC [35] STATEMENT:  create extension pg_jsonschema;
ERROR:  incompatible library "/usr/local/lib/pg_jsonschema.so": ABI mismatch
DETAIL:  Server has ABI "Neon Postgres", library has "PostgreSQL".
```

After 

```
postgres=# create extension pg_jsonschema;
CREATE EXTENSION
postgres=# select json_matches_schema('{"type": "object"}', '{}');
 json_matches_schema
---------------------
 t
postgres=# create extension pg_graphql;
CREATE EXTENSION
postgres=# create table book(id int primary key, title text);
CREATE TABLE
postgres=# insert into book(id, title) values (1, 'book 1');
INSERT 0 1
postgres=# select graphql.resolve($$
query {
  bookCollection {
    edges {
      node {
        id
      }
    }
  }
}
$$);
                            resolve
----------------------------------------------------------------
 {"data": {"bookCollection": {"edges": [{"node": {"id": 1}}]}}}
(1 row)
```

## Issue ticket number and link
Closes #3429, #3096

## Checklist before requesting a review
- [x] I have performed a self-review of my code.
- [x] If it is a core feature, I have added thorough tests.
- [x] Do we need to implement analytics? if so did you add the relevant
metrics to the dashboard?
- [x] If this PR requires public announcement, mark it with
/release-notes label and add several sentences in this section.

`pg_jsonschema` extension will be available for our customers
2023-02-21 17:31:23 +01:00
Joonas Koivunen
b220ba6cd1 add random init delay for background tasks (#3655)
Fixes #3649.
2023-02-21 12:42:11 +01:00
Joonas Koivunen
7de373210d Warn when background tasks exceed their configured period (#3654)
Fixes #3648.
2023-02-21 13:02:19 +02:00
Vadim Kharitonov
5c5b03ce08 Compile xml2 extension 2023-02-21 10:34:45 +01:00
Joonas Koivunen
d7d3f451f0 Use tracing panic hook in all binaries (#3634)
Enables tracing panic hook in addition to pageserver introduced in
#3475:

- proxy
- safekeeper
- storage_broker

For proxy, a drop guard which resets the original std panic hook was
added on the first commit. Other binaries don't need it so they never
reset anything by `disarm`ing the drop guard.

The aim of the change is to make sure all panics a) have span
information b) are logged similar to other messages, not interleaved
with other messages as happens right now. Interleaving happens right now
because std prints panics to stderr, and other logging happens in
stdout. If this was handled gracefully by some utility, the log message
splitter would treat panics as belonging to the previous message because
it expects a message to start with a timestamp.

Cc: #3468
2023-02-21 10:03:55 +02:00
Keanu Ashwell
bc7d3c6476 docs: add dependency requirements for arch based systems (#3588)
This pull request adds information on building neon on Arch based system
such as Artix, Manjaro, Antergos, etc.
2023-02-20 22:51:54 +03:00
Sergey Melnikov
e3d75879c0 Use fqdn to access console management API on production (#3651)
console-release.local is legacy manual CNAME to
neon-internal-api.aws.neon.tech in r53
We could use neon-internal-api.aws.neon.tech name directly

This already was deployed to staging in
https://github.com/neondatabase/neon/pull/3642
2023-02-20 18:11:06 +01:00
Christian Schwarz
485b269674 eviction: tone down logs to debug!() level if there were no evictions
fixes #3647
2023-02-20 18:01:59 +01:00
Christian Schwarz
ee1eda9921 eviction: remove EvictionStats::not_considered_due_to_clock_skew
Rationale: see the block comment added in this patch.

fixes #3641
2023-02-20 18:01:59 +01:00
Christian Schwarz
e363911c85 timeline: propagate span to download_remote_layer (#3644)
fixes #3643
refs #3604
2023-02-20 17:18:13 +02:00
Sergey Melnikov
d5d690c044 Use fqdn for staging console management API (#3642)
`console-staging.local` is legacy manual CNAME to
`neon-internal-api.aws.neon.build` in r53
We could use `neon-internal-api.aws.neon.build` name directly
2023-02-20 16:05:21 +01:00
Arthur Petukhovsky
8f557477c6 Add new safekeeper to ap-southeast-1 prod (#3645) 2023-02-20 17:51:27 +03:00
Shany Pozin
af210c8b42 Allow running do_gc in non testing env (#3639)
## Describe your changes
Since the current default gc period is set to 1 hour, whenever there is
an immediate need to reduce PITR and run gc, the user has to wait 1 hour
for PITR change to take effect
By enabling this API the user can configure PITR and immediately call
the do_gc API to trigger gc
## Issue ticket number and link
#3590
## Checklist before requesting a review
- [X] I have performed a self-review of my code.
- [ ] If it is a core feature, I have added thorough tests.
- [ ] Do we need to implement analytics? if so did you add the relevant
metrics to the dashboard?
- [ ] If this PR requires public announcement, mark it with
/release-notes label and add several sentences in this section.
2023-02-20 13:23:13 +02:00
sharnoff
2153d2e00a Run compute_ctl in a cgroup in VMs (#3577) 2023-02-17 14:14:41 -08:00
Alexander Bayandin
564fa11244 Update Postgres extensions (#3615)
- Update postgis from 3.3.1 from 3.3.2
- Update plv8 from 3.1.4 to 3.1.5
- Update h3-pg from 4.0.1 to 4.1.2 (and underlying h3 from 4.0.1 to 4.1.0)
2023-02-17 18:18:23 +00:00
Christian Schwarz
8d28a24b26 staging: enable automatic layer eviction at 20m threshold + period (#3636)
What it says on the tin.

Part of #2476
2023-02-17 18:32:01 +02:00
Anastasia Lubennikova
53128d56d9 Fix make clean:
Use correct paths in neon-pg-ext-clean
2023-02-17 17:57:45 +02:00
Anastasia Lubennikova
40799d8ae7 Add debug messages to catch abnormal consumption metric values 2023-02-17 17:57:45 +02:00
Konstantin Knizhnik
b242b0ad67 Fix flaky tests (#3616)
## Describe your changes

test_on_demand_download is flaky because not waiting until created image
layer is transferred to S3.
test_tenants_with_remote_storage just leaves garbage at the end of
overwritten file.

Right solution for test_on_demand_download is to add some API call to
wait completion of synchronization with S3 (not just based on last
record LSN). But right now it is solved using sleep.

## Issue ticket number and link

#3209 

## Checklist before requesting a review
- [ ] I have performed a self-review of my code.
- [ ] If it is a core feature, I have added thorough tests.
- [ ] Do we need to implement analytics? if so did you add the relevant
metrics to the dashboard?
- [ ] If this PR requires public announcement, mark it with
/release-notes label and add several sentences in this section.
2023-02-17 15:56:56 +02:00
Dmitry Ivanov
d90cd36bcc [proxy] Improve tracing spans here and there. 2023-02-17 15:32:14 +03:00
Dmitry Ivanov
956b6f17ca [proxy] Handle some unix signals.
On the surface, this doesn't add much, but there are some benefits:

* We can do graceful shutdowns and thus record more code coverage data.

* We now have a foundation for the more interesting behaviors, e.g. "stop
  accepting new connections after SIGTERM but keep serving the existing ones".

* We give the otel machinery a chance to flush trace events before
  finally shutting down.
2023-02-17 15:32:14 +03:00
Heikki Linnakangas
6f9af0aa8c [proxy] Enable OpenTelemetry tracing.
This commit sets up OpenTelemetry tracing and exporter, so that they
can be exported as OpenTelemetry traces as well.

All outgoing HTTP requests will be traced. A separate (child)
span is created for each outgoing HTTP request, and the tracing
context is also propagated to the server in the HTTP headers.

If tracing is enabled in the control plane and compute node too, you
can now get an end-to-end distributed trace of what happens when a new
connection is established, starting from the handshake with the
client, creating the 'start_compute' operation in the control plane,
starting the compute node, all the way to down to fetching the base
backup and the availability checks in compute_ctl.

Co-authored-by: Dmitry Ivanov <dima@neon.tech>
2023-02-17 15:32:14 +03:00
Joonas Koivunen
8e6b27bf7c fix: avoid busy loop on replacement failure (#3613)
Add an AtomicBool per RemoteLayer, use it to mark together with closed
semaphore that remotelayer is unusable until restart or ignore+load.

https://github.com/neondatabase/neon/issues/3533#issuecomment-1431481554
2023-02-17 14:15:29 +02:00
Joonas Koivunen
ae3eff1ad2 Tracing panic hook (#3475)
Fixes #3468.

This does change how the panics look, and most importantly, make sure
they are not interleaved with other messages. Adds a `GET /v1/panic`
endpoint for panic testing (useful for sentry dedup and this hook
testing).

The panics are now logged within a new error level span called `panic`
which separates it from other error level events. The panic info is
unpacked into span fields:
- thread=mgmt request worker
- location="pageserver/src/http/routes.rs:898:9"

Co-authored-by: Christian Schwarz <christian@neon.tech>
2023-02-17 13:56:00 +02:00
Joonas Koivunen
501702b27c fix: flaky test_compaction_downloads_on_demand_with_image_creation (#3629)
fix is to stop postgres before the final checkpoint to ensure no
inmemory layer gets created.

Fixes #3627.
2023-02-17 13:34:26 +02:00
Alexander Bayandin
526f8b76aa Bump werkzeug from 2.1.2 to 2.2.3 (#3631)
## Describe your changes

```
$ poetry add werkzeug@latest "moto[server]@latest"
Using version ^2.2.3 for werkzeug
Using version ^4.1.2 for moto

Updating dependencies
Resolving dependencies... (1.6s)

Writing lock file

Package operations: 0 installs, 2 updates, 1 removal

  • Removing pytz (2022.1)
  • Updating werkzeug (2.1.2 -> 2.2.3)
  • Updating moto (3.1.18 -> 4.1.2)
```

Resolves:
- https://github.com/neondatabase/neon/security/dependabot/14
- https://github.com/neondatabase/neon/security/dependabot/13

`@dependabot` failed to create a PR for some reason (I guess because it
also needed to handle `moto` dependency)

## Issue ticket number and link
N/A

## Checklist before requesting a review
- [x] I have performed a self-review of my code.
- [x] If it is a core feature, I have added thorough tests.
- [x] Do we need to implement analytics? if so did you add the relevant
metrics to the dashboard?
- [x] If this PR requires public announcement, mark it with
/release-notes label and add several sentences in this section.
2023-02-17 08:29:52 +00:00
Sergey Melnikov
a1b062123b Do not deploy storage to old account (#3630)
It's gone
2023-02-16 20:28:53 +00:00
Dmitry Ivanov
a4d5c8085b Move hacks to a dedicated module. 2023-02-16 22:10:56 +03:00
Dmitry Ivanov
edffe0dd9d Extract password hack & cleartext hack 2023-02-16 22:10:56 +03:00
Heikki Linnakangas
d9c518b2cc Refactor use_cleartext_password_flow.
It's not a property of the credentials that we receive from the
client, so remove it from ClientCredentials. Instead, pass it as an
argument directly to 'authenticate' function, where it's actually
used. All the rest of the changes is just plumbing to pass it through
the call stack to 'authenticate'
2023-02-16 22:10:56 +03:00
Anastasia Lubennikova
0d3aefb274 Only use active timelines in synthetic_size calculation 2023-02-16 17:58:53 +02:00
Anastasia Lubennikova
6139e8e426 Revert "Add debug messages around sending cached metrics"
This reverts commit a839860c2e.
2023-02-16 17:46:15 +02:00
Anastasia Lubennikova
d9ba3c5f5e Revert "Add debug messages around timeline.get_current_logical_size"
This reverts commit a5ce2b5330.
2023-02-16 17:46:15 +02:00
Joonas Koivunen
0cf7fd0fb8 Compaction with on-demand download (#3598)
Repeatedly (twice) try to download the compaction targeted layers before actual
compaction. Adds tests for both L0 compaction downloading layers and image
creation downloading layers. Image creation support existed already.

Fixes #3591

Co-authored-by: Christian Schwarz <christian@neon.tech>
2023-02-16 15:36:13 +02:00
Kirill Bulatov
f0b41e7750 Propose less verbose way to build neon (#3624)
Closes https://github.com/neondatabase/neon/issues/3518 and might help
https://github.com/neondatabase/neon/issues/3611 and the future build
attempts.

Propose `-s` flag in the Readme when building via `make` command, to
help people to spot build errors easier.
2023-02-16 14:25:35 +02:00
Vadim Kharitonov
5082d84f5b Compile pgjwt extension 2023-02-16 10:54:22 +01:00
Anastasia Lubennikova
7991bd3b69 Fix periodic metric sending: don't reset timer on every iteration (#3617)
Previously timer was reset on every collect_metrics_iteration and sending of cached metrics was never triggered.

This is a follow-up for a69da4a7.
2023-02-16 10:56:42 +02:00
Heikki Linnakangas
ddbdcdddd7 Tenant size calculation: refactor, rewrite, and add SVG (#2817)
Refactor the tenant_size_model code. Segment now contains just the
minimum amount of information needed to calculate the size. Other
information that is useful for building up the segment tree, and for
display purposes, is now kept elsewhere. The code in 'main.rs' has a new
ScenarioBuilder struct for that.

Calculating which Segments are "needed" is now the responsibility of the
caller of tenant_size_mode, not part of the calculation itself. So it's
up to the caller to make all the decisions with retention periods for
each branch.

The output of the sizing calculation is now a Vec of SizeResults, rather
than a tree. It uses a tree representation internally, when doing the
calculation, but it's not exposed to the caller anymore.

Refactor the way the recursive calculation is performed.

Rewrite the code in size.rs that builds the Segment model. Get rid of
the intermediate representation with Update structs. Build the Segments
directly, with some local HashMaps and Vecs to track branch points to
help with that.

retention_period is now an input to gather_inputs(), rather than an
output.

Update pageserver http API: rename /size endpoint to /synthetic_size
with following parameters:
    - /synthetic_size?inputs_only to get debug info;
- /synthetic_size?retention_period=0 to override cutoff that is used to
calculate the size;
pass header -H "Accept: text/html" to get HTML output, otherwise JSON is
returned

Update python tests and openapi spec.

---------

Co-authored-by: Anastasia Lubennikova <anastasia@neon.tech>
Co-authored-by: Joonas Koivunen <joonas@neon.tech>
2023-02-16 10:53:46 +02:00
Shany Pozin
7b182e2605 Update settings.md with latest PITR and gc period values (#3618)
## Describe your changes
Updates PITR and GC_PERIOD default value doc
## Issue ticket number and link

## Checklist before requesting a review
- [ ] I have performed a self-review of my code.
- [ ] If it is a core feature, I have added thorough tests.
- [ ] Do we need to implement analytics? if so did you add the relevant
metrics to the dashboard?
- [ ] If this PR requires public announcement, mark it with
/release-notes label and add several sentences in this section.
2023-02-16 10:33:04 +02:00
Dmitry Ivanov
1d9d7c02db [proxy] Don't forward empty options to compute nodes
Clients may specify endpoint/project name via `options=project=...`,
so we should not only remove `project=` from `options` but also
drop `options` entirely, because connection pools don't support it.

Discussion: https://neondb.slack.com/archives/C033A2WE6BZ/p1676464382670119
2023-02-15 22:05:03 +03:00
Anna Stepanyan
a974602f9f fix the logical size term definition (#3609)
a size of a *database* cannot be a sum of the sizes of *all databases*
indicating that a logical size is calculated for a branch

## Describe your changes

## Issue ticket number and link

## Checklist before requesting a review
- [x] i checked the suggested changes
- [x] this is not a core feature
- [x] this is just a docs update, does not require analytics
- [x] this PR does not require a public announcement
2023-02-15 15:11:06 +01:00
Anastasia Lubennikova
a839860c2e Add debug messages around sending cached metrics 2023-02-15 16:02:02 +02:00
Anastasia Lubennikova
a5ce2b5330 Add debug messages around timeline.get_current_logical_size 2023-02-15 16:02:02 +02:00
Dmitry Ivanov
3569c1bacd [proxy] Fix: don't cache user & dbname in node info cache
Upstream proxy erroneously stores user & dbname in compute node info
cache entries, thus causing "funny" connection problems if such an entry
is reused while connecting to e.g. a different DB on the same compute node.

This PR fixes the problem but doesn't eliminate the root cause just yet.
I'll revisit this code and make it more type-safe in the upcoming PR.
2023-02-14 17:54:01 +03:00
Vadim Kharitonov
86681b92aa Enable plls and plcoffee extensions 2023-02-14 13:33:27 +01:00
Sergey Melnikov
eb21d9969d Add pageserver-3.us-west-2.aws.neon.tech (#3603) 2023-02-14 12:56:03 +01:00
Joonas Koivunen
e6618f1cc0 Update current logical size gauge (#3592)
Alternative to #3586.

Introduces usage of current_logical_size.current_size as a boundary after which we start to update the metric gauge on ingested wal. Previously any incremented value (ingested wal) would had updated the gauge, but this would had left the metric at zero for timelines which never receive any wal even if size had been calculated. Now the gauge is updated right away as the calculation completes, not requiring any wal to be received.
2023-02-14 13:17:34 +02:00
Dmitry Ivanov
eaff14da5f [proxy] Restore INFO as the default tracing level
Also move tracing init to its own function.
2023-02-13 17:09:43 +03:00
Arthur Petukhovsky
f383b4d540 Enable TCP_NODELAY for wss connections 2023-02-10 21:40:28 +03:00
Dmitry Ivanov
694150ce40 [proxy] Respect the magic RUST_LOG env variable
Usage: `RUST_LOG=trace proxy ...`
2023-02-10 18:49:32 +03:00
Vadim Kharitonov
f4359b688c Backport cargo fmt diff from release branch into main 2023-02-10 14:20:55 +01:00
Vadim Kharitonov
948f047f0a Compile pgvector extension 2023-02-10 13:37:52 +01:00
Vadim Kharitonov
4175cfbdac Create folder for file_cache 2023-02-10 13:03:12 +01:00
Dmitry Ivanov
9657459d80 [proxy] Fix possible unsoundness in the websocket machinery (#3569)
This PR replaces the ill-advised `unsafe Sync` impl with a de-facto
standard way to solve the underlying problem.

TLDR:
- tokio::task::spawn requires future to be Send
- ∀t. (t : Sync) <=> (&t : Send)
- ∀t. (t : Send + !Sync) => (&t : !Send)
2023-02-10 12:45:38 +03:00
Christian Schwarz
a4256b3250 allow on-demand downloads in walreceiver connection handler
Without this patch, basebackup fails if we evict all layers before that.

This slipped in as part of

   commit 01b4b0c2f3
   Author: Christian Schwarz <christian@neon.tech>
   Date:   Fri Jan 13 17:02:22 2023 +0100

       Introduce RequestContext
2023-02-09 13:39:04 +01:00
Christian Schwarz
175a577ad4 automatic layer eviction
This patch adds a per-timeline periodic task that executes an eviction
policy. The eviction policy is configurable per tenant.

Two policies exist:
- NoEviction (the default one)
- LayerAccessThreshold

The LayerAccessThreshold policy examines the last access timestamp per
layer in the layer map and evicts the layer if that last access is
further in the past than a configurable threshold value.
This policy kind is evaluated periodically at a configurable period.
It logs a summary statistic at `info!()` or `warn!()` level, depending
on whether any evictions failed.

This feature has no explicit killswitch since it's off by default.
2023-02-09 13:33:55 +01:00
Joonas Koivunen
1fdf01e3bc fix: readable Debug for Layers (#3575)
#3536 added the custom Debug implementations but it using derived Debug
on Key lead to too verbose output. Instead of making `Key`'s `Debug`
unconditionally or conditionally do the `Display` variant (for table
space'd keys), opted to build a newtype to provide `Debug` for
`Range<Key>` via `Display` which seemed to work unconditionally.

Also orders Key to have: 1. comment, 2. derive, 3. `struct Key`.
2023-02-09 13:55:37 +02:00
Christian Schwarz
446a39e969 make LayerAccesStatFullDetails Copy
Method to_api_model renamed to as_api_model because of Clippy complaint:
https://rust-lang.github.io/rust-clippy/master/index.html#wrong_self_convention
2023-02-09 12:35:45 +01:00
Arthur Petukhovsky
7ed9eb4a56 Add script for safekeeper tenants cleanup (#3452)
This script can be used to remove tenant directories on safekeepers for
projects which do not longer exist (deleted in the console).

To run this script you need to upload it to safekeeper (i.e. with SSH),
and run it with python3. Ansible can be used to run this script on
multiple safekeepers.

Fixes https://github.com/neondatabase/cloud/issues/3356
2023-02-09 13:28:20 +02:00
Joonas Koivunen
f07d6433b6 fix: one leftover Arc::ptr_eq (#3573)
@knizhnik noticed that one instance of `Arc::<dyn
PersistentLayer>::ptr_eq` was missed in #3558.

Now all `ptr_eq` which remain are in comments.
2023-02-09 13:02:07 +02:00
Heikki Linnakangas
2040db98ef Add docs for synthetic size calculation (#3328)
---------
Co-authored-by: Heikki Linnakangas <heikki@neon.tech>
Co-authored-by: Anastasia Lubennikova <anastasia@neon.tech>
2023-02-09 11:20:10 +02:00
dependabot[bot]
371493ae32 Bump cryptography from 38.0.3 to 39.0.1 (#3565) 2023-02-08 16:08:01 +00:00
Rory de Zoete
1b9e5e84aa Add new storage hosts for placement group test (#3561)
To test the placement group setup
2023-02-08 16:48:29 +01:00
Christian Schwarz
7ed93fff06 refactor: allow for eviction of layers in a batch
The auto-eviction PR (#3552) operates in two phaes:

  1. find candidate layers
  2. evict them.

For (2), a batch API like the one added in this commit is useful.

Note that this PR requires #3558 to be merged first.
Otherwise, the tests won't pass.
2023-02-08 14:40:47 +01:00
Joonas Koivunen
a6dffb6ef9 fix: stop using Arc::ptr_eq with dyn Trait (#3558)
This changes the way we compare `Arc<dyn PersistentLayer>` in Timeline's
`LayerMap` not to use `Arc::ptr_eq` which has been witnessed in
development of #3557 to yield wrong results. It gives wrong results
because it compares fat pointers, which are `(object, vtable)` tuples
for `dyn Trait` and there are no guarantees that the `vtable`s are
unique. As in there were multiple vtables for `RemoteLayer` which is why
the comparison failed in #3557.

This is a known issue in rust, clippy warns against it and rust std
might be moving to the solution which has been reproduced on this PR:
compare only object pointers by "casting out" the vtable pointer.
2023-02-08 12:25:25 +00:00
Sergey Melnikov
c5c14368e3 Fix deploy-prod.yml syntax (#3556) 2023-02-07 15:27:31 +01:00
Sergey Melnikov
1254dc7ee2 Fix production deploy: run as root to access docker (#3555) 2023-02-07 15:21:15 +01:00
Joonas Koivunen
fcb905f519 Use LayerMap::replace in eviction (#3544)
Follow-up to #3536, to actually use the new `Debug` in replacing the
layers, and use replacement with manual eviction endpoint.

Turns out the two paths share a lot of handling of `Replacement` but
didn't unify the two (need 3). There are also upcoming refactorings
from other PRs to this.
2023-02-07 11:08:55 +02:00
Christian Schwarz
58fa4f0eb7 maintain access stats for historic layers
This patch adds basic access statistics for historic layers
and exposes them in the management API's `LayerMapInfo`.

We record the accesses in the `{Delta,Image}Layer::load()` function
because it's the common path of
* page_service (`Timline::get_reconstruct_data()`)
* Compaction (`PersistentLayer::iter()` and `PersistentLayer::key_iter()`)

The stats survive residence status changes, and record these as well.

When scraping the layer map endpoint to record its evolution over time,
one must account for stat resets because they are in-memory only and
will reset on pageserver restart.
Use the launch timestamp header added by (#3527) to identify pageserver restarts.

This is PR https://github.com/neondatabase/neon/pull/3496
2023-02-06 17:01:38 +01:00
Anastasia Lubennikova
877a2d70e3 Periodically send cached consumption metrics (#3520)
Add new pageserver config setting `cached_metric_collection_interval`
with default `1 hour`.
This setting controls how often unchanged cached consumption metrics are sent to
the HTTP endpoint.

This is a workaround for billing service limitations.
fixes #3485
2023-02-06 17:53:10 +02:00
Sergey Melnikov
959f5c6f40 Do not deploy legacy scram proxy (*.cloud.neon.tech) to the old account (#3546)
We have migrated to the new proxy, which was setup in
https://github.com/neondatabase/neon/pull/3461
2023-02-06 15:51:20 +01:00
Joonas Koivunen
678fe0684f std::fmt::Debug for Layer implementations (#3536)
Follow-up to #3513.

This removes the old blanket `std::fmt::Debug` impl on `dyn Layer` which
did not seem to be used from anywhere (no compilation errors after
removing).

Adds `std::fmt::Debug` requirement and implementations for `trait Layer`
implementors:
- LayerDescriptor (derived)
- RemoteLayer (manual)
- DeltaLayer (manual)
- ImageLayer (manual)

Manual implementations are used to skip PageserverConf, tenant and
timeline ids, large collections.

Adds and adjusts some doc comments to be more rustdoc alike.
2023-02-06 14:21:51 +02:00
Shany Pozin
c9821f13e0 Expose the tenant calculated synthetic size as a Prometheus metric (#3541)
## Describe your changes
Expose the currently calculated synthetic size as a Prometheus metric
## Issue ticket number and link
#3509

## Checklist before requesting a review
- [X] I have performed a self-review of my code.
- [ ] If it is a core feature, I have added thorough tests.
- [ ] Do we need to implement analytics? if so did you add the relevant
metrics to the dashboard?
- [ ] If this PR requires public announcement, mark it with
/release-notes label and add several sentences in this section.
2023-02-06 09:25:15 +02:00
Heikki Linnakangas
121d535068 Add a test for the PostgresNIO Swift client library.
It supports SNI. See discussion at
https://community.neon.tech/t/postgresnio-swift-sni-support/419/4
2023-02-04 15:56:11 +01:00
Kirill Bulatov
ec3a3aed37 Dump current tenant config (#3534)
The PR adds an endpoint to show tenant's current config: `GET
/v1/tenant/:tenant_id/config`

Tenant's config consists of two parts: tenant overrides (could be
changed via other management API requests) and the default part,
substituting all missing overrides (constant, hardcoded in pageserver).
The API returns the custom overrides and the final tenant config, after
applying all the defaults.

Along the way, it had to fix two things in the config:

* allow to shorten the json version and omit all `null`'s (same as toml
serializer behaves by default), and to understand such shortened format
when deserialized. A unit test is added
* fix a bug, when `PUT /v1/tenant/config` endpoint rewritten the local
file with what had came in the request, but updating (not rewriting the
old values) the in-memory state instead.
That got uncovered during adjusting the e2e test and fixed to do the
replacement everywhere, otherwise there's no way to revert existing
overrides. Fixes #3471 (commit
dc688affe8)
* fixes https://github.com/neondatabase/neon/issues/3472 by reordering
the config saving operations
2023-02-04 01:32:29 +02:00
Christian Schwarz
87cd2bae77 introduce LaunchTimestamp to identify process restarts
This patch adds a LaunchTimestamp type to the `metrics` crate,
along with a `libmetric_` Prometheus metric.

The initial user is pageserver.
In addition to exposing the Prometheus metric, it also reproduces
the launch timestamp as a header in the API responses.

The motivation for this is that we plan to scrape the pageserver's
/v1/tenant/:tenant_id/timeline/:timeline_id/layer
HTTP endpoint over time. It will soon expose access metrics (#3496)
which reset upon process restart. We will use the pageserver's launch
ID to identify a restart between two scrape points.

However, there are other potential uses. For example, we could use
the Prometheus metric to annotate Grafana plots whenever the launch
timestamp changes.
2023-02-03 18:12:17 +01:00
bojanserafimov
be81db21b9 Revert accidental change (#3538) 2023-02-03 17:54:12 +02:00
Joonas Koivunen
f2d89761c2 feat: LayerMap::replace (#3513)
Cc: #3486

Adds a method to replace a particular layer from the LayerMap for the
purposes of remote layer download and layer eviction. In those use cases
read lock on layer map needs to be released after initial search, but
other operations could modify layermap before replacing thread gets to
run.

Co-authored-by: bojanserafimov <bojan.serafimov7@gmail.com>
2023-02-03 15:33:46 +02:00
Sergey Melnikov
a0372158a0 Add build_info metric to storage broker (#3525)
## Describe your changes
Add libmetrics_build_info metrics with commit sha to storage_broken
/metrics, to match behaviour of proxy, pageserver and safekeeper.
2023-02-03 12:23:27 +01:00
Anastasia Lubennikova
83048a4adc Handle errors during metric collection. (#3521)
Don't exit the loop if one of the tenants failed to scrape its metrics.
fixes #3490
2023-02-03 12:37:34 +02:00
Konstantin Knizhnik
f71b1b174d Check correctness of file_cache_size_limit (#3530)
## Describe your changes

## Issue ticket number and link

## Checklist before requesting a review
- [ ] I have performed a self-review of my code.
- [ ] If it is a core feature, I have added thorough tests.
- [ ] Do we need to implement analytics? if so did you add the relevant
metrics to the dashboard?
- [ ] If this PR requires public announcement, mark it with
/release-notes label and add several sentences in this section.
2023-02-03 08:01:26 +02:00
Dmitry Ivanov
96e78394f5 [proxy] Fix project (aka endpoint) init in the password hack handler (#3529)
The project/endpoint should be set in the original (non-as_ref'd) creds,
because we call `wake_compute` not only in `try_password_hack` but also
later in the connection retry logic.

This PR also removes the obsolete `as_ref` method and makes the code
simpler because we no longer need this complication after a recent
refactoring.

Further action points: finally introduce typestate in creds (planned).
2023-02-02 22:56:15 +02:00
bojanserafimov
ada933eb42 Pageserver read trace utils (#2795)
List, dump, and analyze read traces.
2023-02-02 15:33:40 -05:00
Kirill Bulatov
f6a10f4693 Use regular layer names instead of a special test ones (#3524)
We do not need special enum variant for testing the file names, neither
its special handling across the code.
Current tests are able to create regular layers with normal layer names,
as the PR shows.
2023-02-02 14:52:17 +02:00
Vadim Kharitonov
d25307dced Compile postgres with ICU support 2023-02-02 13:33:38 +01:00
Kirill Bulatov
2759f1a22e Evict layers on demand (#3486)
Closes https://github.com/neondatabase/neon/issues/3439

Adds a set of commands to manipulate the layer map:
* dump the layer map contents
* evict the layer form the layer map (remove the local file, put the
remote layer instead in the layer map)
* download the layer (operation, reversing the eviction)

The commands will change later, when the statistics is added on top, so
the swagger schema is not adjusted.

The commands might have issues with big amount of layers: no pagination
is done for the dump command, eviction and download commands look for
the layer to evict/download by iterating all layers sequentially and
comparing the layer names.
For now, that seems to be tolerable ("big" number of layers is ~2_000)
and further experiments are needed.

---------

Co-authored-by: Christian Schwarz <christian@neon.tech>
2023-02-02 12:14:44 +02:00
Kirill Bulatov
f474495ba0 Publish builds stats that are easy to browse (#3514)
Adds two new tags, `run-extra-build-macos` and `run-extra-build-stats`
to trigger corresponding build jobs on any PR.

On every build for `main` or PR with `run-extra-build-stats` tag, publish a GitHub commit status with the link to the `cargo build --all --release --timings` report.
2023-02-02 11:18:42 +02:00
Shany Pozin
bf1c36a30c Moving the template file location (#3523)
see
https://github.com/appsmithorg/appsmith/issus/826#issuecomment-703093426
for details
2023-02-02 11:02:47 +02:00
Alexander Bayandin
567b71c1d2 Require poetry 1.3; regenerate poetry.lock (#3508)
Ref https://python-poetry.org/blog/announcing-poetry-1.3.0/#new-lock-file-format
2023-02-01 18:11:00 +00:00
Sergey Melnikov
f3dadfb3d0 Confirm that there is an emergency before manual execution of prod deploy workflow (#3507)
![image](https://user-images.githubusercontent.com/7127190/215840037-69eda3ee-920e-4b90-bf7d-aa58f0bdfb50.png)
2023-02-01 16:01:27 +01:00
Dmitry Ivanov
ea0278cf27 [proxy] Implement compute node info cache (#3331)
This patch adds a timed LRU cache implementation and a compute node info cache on top of that.
Cache entries might expire on their own (default ttl=5mins) or become invalid due to real-world events,
e.g. compute node scale-to-zero event, so we add a connection retry loop with a wake-up call.

Solved problems:
- [x] Find a decent LRU implementation.
- [x] Implement timed LRU on top of that.
- [x] Cache results of `proxy_wake_compute` API call.
- [x] Don't invalidate newer cache entries for the same key.
- [x] Add cmdline configuration knobs (requires some refactoring).
- [x] Add failed connection estab metric.
- [x] Refactor auth backends to make things simpler (retries, cache
placement, etc).
- [x] Address review comments (add code comments + cleanup).
- [x] Retry `/proxy_wake_compute` if we couldn't connect to a compute
(e.g. stalled cache entry).
- [x] Add high-level description for `TimedLru`.

TODOs (will be addressed later):
- [ ] Add cache metrics (hit, spurious hit, miss).
- [ ] Synchronize http requests across concurrent per-client tasks
(https://github.com/neondatabase/neon/pull/3331#issuecomment-1399216069).
- [ ] Cache results of `proxy_get_role_secret` API call.
2023-02-01 17:11:41 +03:00
Christian Schwarz
f1aece1ba0 add RequestContext plumbing for layer access stats
In preparation for #3496  plumb through RequestContext to the data
access methods of `PersistentLayer`.

This is PR https://github.com/neondatabase/neon/pull/3504
2023-02-01 15:29:01 +02:00
Christian Schwarz
590695e845 improve query param parsing
- add parse_query_param()
- use Cow<> where possible
- move param parsing code to utils::http::request

This was originally PR https://github.com/neondatabase/neon/pull/3502
which targeted a different branch.

closes  #3510
2023-02-01 14:11:12 +01:00
Joonas Koivunen
9bb6a6c77c pysync: override PYTHON_KEYRING_BACKEND (#3480)
This bothers me everytime I have to call `pysync`.
2023-02-01 14:07:23 +02:00
Vadim Kharitonov
2309dd5646 Install postgis_sfcgal 2023-02-01 10:45:41 +01:00
Sergey Melnikov
847fc566fd Use the same runners/container for old prod deployments as for new prod 2023-01-31 17:40:24 +01:00
dependabot[bot]
a058bc6de8 Bump aiohttp from 3.7.0 to 3.7.4 (#3445)
Bumps [aiohttp](https://github.com/aio-libs/aiohttp) from 3.7.0 to
3.7.4.
<details>
<summary>Release notes</summary>
<p><em>Sourced from <a
href="https://github.com/aio-libs/aiohttp/releases">aiohttp's
releases</a>.</em></p>
<blockquote>
<h2>aiohttp 3.7.3 release</h2>
<h2>Features</h2>
<ul>
<li>Use Brotli instead of brotlipy
<code>[#3803](https://github.com/aio-libs/aiohttp/issues/3803)
&lt;https://github.com/aio-libs/aiohttp/issues/3803&gt;</code>_</li>
<li>Made exceptions pickleable. Also changed the repr of some
exceptions.
<code>[#4077](https://github.com/aio-libs/aiohttp/issues/4077)
&lt;https://github.com/aio-libs/aiohttp/issues/4077&gt;</code>_</li>
</ul>
<h2>Bugfixes</h2>
<ul>
<li>Raise a ClientResponseError instead of an AssertionError for a blank
HTTP Reason Phrase.
<code>[#3532](https://github.com/aio-libs/aiohttp/issues/3532)
&lt;https://github.com/aio-libs/aiohttp/issues/3532&gt;</code>_</li>
<li>Fix <code>web_middlewares.normalize_path_middleware</code> behavior
for patch without slash.
<code>[#3669](https://github.com/aio-libs/aiohttp/issues/3669)
&lt;https://github.com/aio-libs/aiohttp/issues/3669&gt;</code>_</li>
<li>Fix overshadowing of overlapped sub-applications prefixes.
<code>[#3701](https://github.com/aio-libs/aiohttp/issues/3701)
&lt;https://github.com/aio-libs/aiohttp/issues/3701&gt;</code>_</li>
<li>Make <code>BaseConnector.close()</code> a coroutine and wait until
the client closes all connections. Drop deprecated &quot;with
Connector():&quot; syntax.
<code>[#3736](https://github.com/aio-libs/aiohttp/issues/3736)
&lt;https://github.com/aio-libs/aiohttp/issues/3736&gt;</code>_</li>
<li>Reset the <code>sock_read</code> timeout each time data is received
for a <code>aiohttp.client</code> response.
<code>[#3808](https://github.com/aio-libs/aiohttp/issues/3808)
&lt;https://github.com/aio-libs/aiohttp/issues/3808&gt;</code>_</li>
<li>Fixed type annotation for add_view method of UrlDispatcher to accept
any subclass of View
<code>[#3880](https://github.com/aio-libs/aiohttp/issues/3880)
&lt;https://github.com/aio-libs/aiohttp/issues/3880&gt;</code>_</li>
<li>Fixed querying the address families from DNS that the current host
supports.
<code>[#5156](https://github.com/aio-libs/aiohttp/issues/5156)
&lt;https://github.com/aio-libs/aiohttp/issues/5156&gt;</code>_</li>
<li>Change return type of MultipartReader.<strong>aiter</strong>() and
BodyPartReader.<strong>aiter</strong>() to AsyncIterator.
<code>[#5163](https://github.com/aio-libs/aiohttp/issues/5163)
&lt;https://github.com/aio-libs/aiohttp/issues/5163&gt;</code>_</li>
<li>Provide x86 Windows wheels.
<code>[#5230](https://github.com/aio-libs/aiohttp/issues/5230)
&lt;https://github.com/aio-libs/aiohttp/issues/5230&gt;</code>_</li>
</ul>
<h2>Improved Documentation</h2>
<ul>
<li>Add documentation for <code>aiohttp.web.FileResponse</code>.
<code>[#3958](https://github.com/aio-libs/aiohttp/issues/3958)
&lt;https://github.com/aio-libs/aiohttp/issues/3958&gt;</code>_</li>
<li>Removed deprecation warning in tracing example docs
<code>[#3964](https://github.com/aio-libs/aiohttp/issues/3964)
&lt;https://github.com/aio-libs/aiohttp/issues/3964&gt;</code>_</li>
<li>Fixed wrong &quot;Usage&quot; docstring of
<code>aiohttp.client.request</code>.
<code>[#4603](https://github.com/aio-libs/aiohttp/issues/4603)
&lt;https://github.com/aio-libs/aiohttp/issues/4603&gt;</code>_</li>
<li>Add aiohttp-pydantic to third party libraries
<code>[#5228](https://github.com/aio-libs/aiohttp/issues/5228)
&lt;https://github.com/aio-libs/aiohttp/issues/5228&gt;</code>_</li>
</ul>
<h2>Misc</h2>
<!-- raw HTML omitted -->
</blockquote>
<p>... (truncated)</p>
</details>
<details>
<summary>Changelog</summary>
<p><em>Sourced from <a
href="https://github.com/aio-libs/aiohttp/blob/master/CHANGES.rst">aiohttp's
changelog</a>.</em></p>
<blockquote>
<h1>3.7.4 (2021-02-25)</h1>
<h2>Bugfixes</h2>
<ul>
<li>
<p><strong>(SECURITY BUG)</strong> Started preventing open redirects in
the
<code>aiohttp.web.normalize_path_middleware</code> middleware. For
more details, see
<a
href="https://github.com/aio-libs/aiohttp/security/advisories/GHSA-v6wp-4m6f-gcjg">https://github.com/aio-libs/aiohttp/security/advisories/GHSA-v6wp-4m6f-gcjg</a>.</p>
<p>Thanks to <code>Beast Glatisant
&lt;https://github.com/g147&gt;</code>__ for
finding the first instance of this issue and <code>Jelmer Vernooij
&lt;https://jelmer.uk/&gt;</code>__ for reporting and tracking it down
in aiohttp.
<code>[#5497](https://github.com/aio-libs/aiohttp/issues/5497)
&lt;https://github.com/aio-libs/aiohttp/issues/5497&gt;</code>_</p>
</li>
<li>
<p>Fix interpretation difference of the pure-Python and the Cython-based
HTTP parsers construct a <code>yarl.URL</code> object for HTTP
request-target.</p>
<p>Before this fix, the Python parser would turn the URI's absolute-path
for <code>//some-path</code> into <code>/</code> while the Cython code
preserved it as
<code>//some-path</code>. Now, both do the latter.
<code>[#5498](https://github.com/aio-libs/aiohttp/issues/5498)
&lt;https://github.com/aio-libs/aiohttp/issues/5498&gt;</code>_</p>
</li>
</ul>
<hr />
<h1>3.7.3 (2020-11-18)</h1>
<h2>Features</h2>
<ul>
<li>Use Brotli instead of brotlipy
<code>[#3803](https://github.com/aio-libs/aiohttp/issues/3803)
&lt;https://github.com/aio-libs/aiohttp/issues/3803&gt;</code>_</li>
<li>Made exceptions pickleable. Also changed the repr of some
exceptions.
<code>[#4077](https://github.com/aio-libs/aiohttp/issues/4077)
&lt;https://github.com/aio-libs/aiohttp/issues/4077&gt;</code>_</li>
</ul>
<h2>Bugfixes</h2>
<ul>
<li>Raise a ClientResponseError instead of an AssertionError for a blank
HTTP Reason Phrase.
<code>[#3532](https://github.com/aio-libs/aiohttp/issues/3532)
&lt;https://github.com/aio-libs/aiohttp/issues/3532&gt;</code>_</li>
<li>Fix <code>web_middlewares.normalize_path_middleware</code> behavior
for patch without slash.
<code>[#3669](https://github.com/aio-libs/aiohttp/issues/3669)
&lt;https://github.com/aio-libs/aiohttp/issues/3669&gt;</code>_</li>
<li>Fix overshadowing of overlapped sub-applications prefixes.
<code>[#3701](https://github.com/aio-libs/aiohttp/issues/3701)
&lt;https://github.com/aio-libs/aiohttp/issues/3701&gt;</code>_</li>
</ul>
<!-- raw HTML omitted -->
</blockquote>
<p>... (truncated)</p>
</details>
<details>
<summary>Commits</summary>
<ul>
<li><a
href="0a26acc1de"><code>0a26acc</code></a>
Bump aiohttp to v3.7.4 for a security release</li>
<li><a
href="021c416c18"><code>021c416</code></a>
Merge branch 'ghsa-v6wp-4m6f-gcjg' into master</li>
<li><a
href="4ed7c25b53"><code>4ed7c25</code></a>
Bump chardet from 3.0.4 to 4.0.0 (<a
href="https://github-redirect.dependabot.com/aio-libs/aiohttp/issues/5333">#5333</a>)</li>
<li><a
href="b61f0fdffc"><code>b61f0fd</code></a>
Fix how pure-Python HTTP parser interprets <code>//</code></li>
<li><a
href="5c1efbc32c"><code>5c1efbc</code></a>
Bump pre-commit from 2.9.2 to 2.9.3 (<a
href="https://github-redirect.dependabot.com/aio-libs/aiohttp/issues/5322">#5322</a>)</li>
<li><a
href="0075075801"><code>0075075</code></a>
Bump pygments from 2.7.2 to 2.7.3 (<a
href="https://github-redirect.dependabot.com/aio-libs/aiohttp/issues/5318">#5318</a>)</li>
<li><a
href="5085173d94"><code>5085173</code></a>
Bump multidict from 5.0.2 to 5.1.0 (<a
href="https://github-redirect.dependabot.com/aio-libs/aiohttp/issues/5308">#5308</a>)</li>
<li><a
href="5d1a75e68d"><code>5d1a75e</code></a>
Bump pre-commit from 2.9.0 to 2.9.2 (<a
href="https://github-redirect.dependabot.com/aio-libs/aiohttp/issues/5290">#5290</a>)</li>
<li><a
href="6724d0e7a9"><code>6724d0e</code></a>
Bump pre-commit from 2.8.2 to 2.9.0 (<a
href="https://github-redirect.dependabot.com/aio-libs/aiohttp/issues/5273">#5273</a>)</li>
<li><a
href="c688451ce3"><code>c688451</code></a>
Removed duplicate timeout parameter in ClientSession reference docs. (<a
href="https://github-redirect.dependabot.com/aio-libs/aiohttp/issues/5262">#5262</a>)
...</li>
<li>Additional commits viewable in <a
href="https://github.com/aio-libs/aiohttp/compare/v3.7.0...v3.7.4">compare
view</a></li>
</ul>
</details>
<br />


[![Dependabot compatibility
score](https://dependabot-badges.githubapp.com/badges/compatibility_score?dependency-name=aiohttp&package-manager=pip&previous-version=3.7.0&new-version=3.7.4)](https://docs.github.com/en/github/managing-security-vulnerabilities/about-dependabot-security-updates#about-compatibility-scores)

Dependabot will resolve any conflicts with this PR as long as you don't
alter it yourself. You can also trigger a rebase manually by commenting
`@dependabot rebase`.

[//]: # (dependabot-automerge-start)
[//]: # (dependabot-automerge-end)

---

<details>
<summary>Dependabot commands and options</summary>
<br />

You can trigger Dependabot actions by commenting on this PR:
- `@dependabot rebase` will rebase this PR
- `@dependabot recreate` will recreate this PR, overwriting any edits
that have been made to it
- `@dependabot merge` will merge this PR after your CI passes on it
- `@dependabot squash and merge` will squash and merge this PR after
your CI passes on it
- `@dependabot cancel merge` will cancel a previously requested merge
and block automerging
- `@dependabot reopen` will reopen this PR if it is closed
- `@dependabot close` will close this PR and stop Dependabot recreating
it. You can achieve the same result by closing it manually
- `@dependabot ignore this major version` will close this PR and stop
Dependabot creating any more for this major version (unless you reopen
the PR or upgrade to it yourself)
- `@dependabot ignore this minor version` will close this PR and stop
Dependabot creating any more for this minor version (unless you reopen
the PR or upgrade to it yourself)
- `@dependabot ignore this dependency` will close this PR and stop
Dependabot creating any more for this dependency (unless you reopen the
PR or upgrade to it yourself)
- `@dependabot use these labels` will set the current labels as the
default for future PRs for this repo and language
- `@dependabot use these reviewers` will set the current reviewers as
the default for future PRs for this repo and language
- `@dependabot use these assignees` will set the current assignees as
the default for future PRs for this repo and language
- `@dependabot use this milestone` will set the current milestone as the
default for future PRs for this repo and language

You can disable automated security fix PRs for this repo from the
[Security Alerts
page](https://github.com/neondatabase/neon/network/alerts).

</details>

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: Vadim Kharitonov <vadim@neon.tech>
2023-01-31 17:30:45 +02:00
Konstantin Knizhnik
895f929bce Add layer_map_analyzer tool (#3451)
See #3348
2023-01-31 15:50:52 +02:00
Vadim Kharitonov
a7d8bfa631 Fix create release PR 2023-01-31 14:36:04 +01:00
Sergey Melnikov
0806a46c0c Fix production deploy (#3498)
`get_binaries.sh` no longer use `RELEASE` environmental variable, it
just use `DOCKER_TAG`
2023-01-31 13:36:25 +01:00
Sergey Melnikov
5e08b35f53 Fix new deploy workflow (#3492)
Add 'branch' input to specify commit for deploy scripts/configs. Commit
can't be passed to workflow as ref, and we need to pin configs to
specific commit for main/release deploys
Update deploy input descriptions to match GH interface
2023-01-30 22:08:00 +01:00
Sergey Melnikov
82cbcb36ab Extract neon deploy jobs into separate workflows (#3424)
Extract deploy jobs from build_and_test.yml to deploy-dev and
deploy-prod workflows.
Add trigger to run this workflows after Neon is build and tested on main and
release branches.

This will allow us to redeploy/rollback/patch config without full
rebuild.
2023-01-30 20:10:54 +01:00
Vadim Kharitonov
ec0e641578 Create Release PR: review fixes 2023-01-30 16:15:22 +01:00
Lassi Pölönen
20b38acff0 Replace per timeline pageserver_storage_operations_seconds with a global one (#3409)
Related to: https://github.com/neondatabase/neon/issues/2848

`pageserver_storage_operations_seconds` is the most expensive metric we
have, as there are a lot of tenants/timelines and the histogram had 42
buckets. These are quite sparse too, so instead of having a histogram
per timeline, create a new histogram
`pageserver_storage_operations_seconds_global` without tenant and
timeline dimensions and replace `pageserver_storage_operations_seconds`
with sum and counter.

Co-authored-by: Joonas Koivunen <joonas@neon.tech>
2023-01-30 17:10:29 +02:00
Kirill Bulatov
c61bc25ef9 Clean up NeedsDownload error (#3464) 2023-01-30 16:08:23 +02:00
Rory de Zoete
7bb13569b3 Switch more jobs to small runner (#3483)
As these jobs don't benefit from additional cores

Co-authored-by: Rory de Zoete <rdezoete@RorysMacStudio.fritz.box>
2023-01-30 14:00:44 +01:00
Vadim Kharitonov
5fc233964a Create release PR 2023-01-30 12:44:48 +01:00
Heikki Linnakangas
5ee77c0b1f Fix holding tracing span guard over query execution.
I added these spans to trace how long the queries take, but I didn't realize
that there's a difference between:

    let _ = span.entered();

and

    let _guard = span.entered();

The former drops the guard immediately, while the latter holds it
until the end of the scope. As a result, the span was ended
immediately, and the query was executed outside the span.
2023-01-30 12:10:51 +02:00
Shany Pozin
ddb9c2fe94 Add metrics for tenants state (#3448)
## Describe your changes
Added a metric that allow to monitor tenants state 
## Issue ticket number and link
https://github.com/neondatabase/neon/issues/3161

## Checklist before requesting a review
- [X] I have performed a self-review of my code.
- [X] I have added an e2e test for it.
- [ ] Do we need to implement analytics? if so did you add the relevant
metrics to the dashboard?
- [ ] If this PR requires public announcement, mark it with
/release-notes label and add several sentences in this section.
2023-01-29 14:04:06 +02:00
Shany Pozin
67d418e91c Set the last_record_gauge to the value which was persisted metadata (#3460)
## Describe your changes
Whenever a tenant is detached or the pageserver is restarted the
pageserver_last_record_lsn metric is dropped
This fix resurrects the value from the metadata whenever the tenant is
attached again
## Issue ticket number and link
[3571](https://github.com/neondatabase/cloud/issues/3571)
## Checklist before requesting a review
- [X] I have performed a self-review of my code.
- [ ] If it is a core feature, I have added thorough tests.
- [ ] Do we need to implement analytics? if so did you add the relevant
metrics to the dashboard?
- [ ] If this PR requires public announcement, mark it with
/release-notes label and add several sentences in this section.
2023-01-29 12:40:50 +02:00
Rory de Zoete
4d291d0e90 Prevent assume error (#3476)
To fix `Error: The requested DurationSeconds exceeds the
MaxSessionDuration set for this role.`

Co-authored-by: Rory de Zoete <rdezoete@Rorys-Mac-Studio.fritz.box>
2023-01-27 19:27:23 +01:00
Rory de Zoete
4718c67c17 Update deploy steps (#3470)
First one isn't optimal, but as it was requested to run the runner as
nonroot ->
https://github.com/neondatabase/runner/pull/1#discussion_r1069909593
this job will need more significant refactoring. This should unblock the
deployment process.

---------

Co-authored-by: Rory de Zoete <rdezoete@Rorys-Mac-Studio.fritz.box>
2023-01-27 18:05:49 +01:00
Konstantin Knizhnik
c5ca7d0c68 Implement asynchronous pipe for communication with walredo process (#3368)
Co-authored-by: Christian Schwarz <christian@neon.tech>
2023-01-27 18:36:24 +02:00
Joonas Koivunen
0ec84e2f1f Allow creating config for attached tenant (#3446)
Currently `attach` doesn't write a tenant config, because we don't back
it up in the first place. The current implementation of
`Tenant::persist_tenant_config` does not allow changing tenant's
configuration through the http api which will fail because the file
wasn't created on attach and
`OpenOptions::truncate(true).write(true).create_new(false)` is used.

I think this patch allows for least controversial middle ground which
*enables* changing tenant configuration even for attached tenants (not
just created tenants).
2023-01-27 15:34:59 +02:00
Rory de Zoete
8342e9ea6f Update helm job (#3467)
As followup from https://github.com/neondatabase/build/pull/47

Co-authored-by: Rory de Zoete <rdezoete@Rorys-Mac-Studio.fritz.box>
2023-01-27 13:28:26 +01:00
Christian Schwarz
99399c112a move walreceiver module under timeline
Walreceiver is a per-timeline abstraction. Move it there to reflect
the hierarchy of abstractions and task_mgr tasks.
The code that sets up the global storage_broker client
is not timeline-scoped. So, break it out into a separate module.

The motivation for this change is to prepare the code base for replacing
the task_mgr global task registry with a more ownership-oriented
approach to manage task lifetimes.

I removed TaskStateUpdate::Init because, after doing the changes,
rustc warned that it was never constructed.
A quick search through the commit history shows that this
has always been true since

    commit fb68d01449
    Author: Dmitry Rodionov <dmitry@neon.tech>
    Date:   Mon Sep 26 23:57:02 2022 +0300

        Preserve task result in TaskHandle by keeping join handle around (#2521)

So, the warning is not an indication of some accidental code removal.

This is PR: https://github.com/neondatabase/neon/pull/3456
2023-01-27 12:23:17 +01:00
Rory de Zoete
2388981311 Add cleanup tasks for ansible and helm (#3465)
To fix:

https://github.com/neondatabase/neon/actions/runs/4023027504/jobs/6913421070

https://github.com/neondatabase/neon/actions/runs/4023027504/jobs/6913421268

Co-authored-by: Rory de Zoete <rdezoete@RorysMacStudio.fritz.box>
2023-01-27 11:20:51 +01:00
Sergey Melnikov
fb721cdfa5 Setup legacy scram proxy in the new account (#3461)
This setup proxies with *.cloud.neon.tech certificate in the us-west-2
region of the new account, we are not switching to them here yet
2023-01-27 11:05:05 +01:00
Heikki Linnakangas
bf63f129ae Make 'branch_timeline' function more clear.
Change the signature so that it takes an Arc<Timeline> reference to the
source timeline, instead of just the ID. All the callers have an Arc
reference at hand, so this is more convenient for everyone.

Reorder the code a bit and improve the comments, to make it more clear
what it does and why.
2023-01-27 02:12:07 +02:00
Sergey Melnikov
2ecd0e1f00 Decommission link proxy from old account (#3454) 2023-01-26 16:18:57 +01:00
Rory de Zoete
b858d70f19 Update promote job (#3455)
To fix errors such as:
`An error occurred (ImageAlreadyExistsException) when calling the
PutImage operation: Image with digest
'sha256:da6d8ad97d84e3aec4e6a240c3a35868b626692ee5d199cdd3fe45d29a8e54df'
and tag 'latest' already exists in the repository with name
'compute-node-v14' in registry with id '369495373322'`

Co-authored-by: Rory de Zoete <rdezoete@RorysMacStudio.fritz.box>
Co-authored-by: Rory de Zoete <rdezoete@Rorys-Mac-Studio.fritz.box>
2023-01-26 14:26:23 +01:00
Heikki Linnakangas
0c0e15b81d compute_ctl: Extract tracing context from incoming HTTP requests.
This allows tracing the handling of HTTP requests as part of the caller's
trace.
2023-01-26 15:20:03 +02:00
Heikki Linnakangas
3e94fd5af3 Inherit OpenTelemetry context for compute startup from cloud console.
This allows fine-grained distributed tracing of the 'start_compute'
operation from the cloud console. The startup actions performed by
'compute_ctl' are now performed in a child of the 'start_compute'
context, so you can trace through the whole compute start operation.

This needs a corresponding change in the cloud console to fill in the
'startup_tracing_context' field in the json spec. If it's missing, the
startup operations are simply traced as a separate trace, without
a parent.
2023-01-26 15:20:03 +02:00
Heikki Linnakangas
006ee5f94a Configure 'compute_ctl' to use OpenTelemetry exporter.
This allows tracing the startup actions e.g. with Jaeger
(https://www.jaegertracing.io/). We use the "tracing-opentelemetry"
crate, which turns tracing spans into OpenTelemetry spans, so you can
use the usual "#[instrument]" directives to add tracing.

I put the tracing initialization code to a separate crate,
`tracing-utils`, so that we can reuse it in other programs. We
probably want to set up tracing in the same way in all our programs.

Co-authored-by: Joonas Koivunen <joonas@neon.tech>
2023-01-26 15:20:03 +02:00
Rory de Zoete
4bcbb7793d Revert docker hub job (#3453)
Regression fix as permissions aren't configured properly on gen3 for
this job.

Co-authored-by: Rory de Zoete <rdezoete@RorysMacStudio.fritz.box>
2023-01-26 11:30:53 +01:00
Christian Schwarz
dc64962ffc tenant::mgr: explicit tracking of initializing & shutting-down states
This patch wrap the tenants hashmap into an enum that represents the
tenant manager's three major states:
- Initializing
- Open for business
- Shutting down.
See the enum doc comments for details.

In response, all the users of `TENANTS` are now forced to distinguish
those states.
The only major change is in `run_if_no_tenant_in_memory`, which,
before this patch, was used by the /attach and /load endpoints.
This patch rewrites that method under the name `tenant_map_insert`,
replacing the anyhow::Result with a std Result and a dedicated error
type.
Introducing this error types allows using `tenant_map_insert` in
`tenant_create`, thereby unifying all code paths that create tenants
objects to use `tenant_map_insert`.

This is beneficial because we can now systematically prevent tenants
from being created, attached, or `/load`ed during pageserver shutdown.
The management API remains available, but the endpoints that create
new tenants will fail with an error.
More work would need to be done to properly distinguish these errors
through HTTP status codes such as 503.
2023-01-26 11:24:48 +01:00
Rory de Zoete
cd5732d9d8 Gen3 runners (#3220)
https://github.com/neondatabase/cloud/issues/2738

Co-authored-by: Rory de Zoete <rdezoete@Rorys-Mac-Studio.fritz.box>
Co-authored-by: Rory de Zoete <rdezoete@RorysMacStudio.fritz.box>
2023-01-26 10:46:06 +01:00
bojanserafimov
0a09589403 Increase gc period to 1h (#3432) 2023-01-25 15:18:41 -05:00
Vadim Kharitonov
e3efb0d854 Fix bug while creating unit extension (#3447)
after executing

```sql
CREATE EXTENSION unit;
``` 

I saw such error

```
ERROR: could not open file "/usr/local/pgsql/share/extension/unit_prefixes.data" for reading: No such file or directory (SQLSTATE 58P01)
```

Co-authored-by: Anastasia Lubennikova <anastasia@neon.tech>
2023-01-25 17:29:06 +00:00
Sergey Melnikov
4b8dbea5c1 Add production link proxy to new account (#3444)
This PR setup link proxy in us-east-2 region, but do not redirect
pg.neon.tech DNS name to it
Will keep old link proxy for the time of migration
2023-01-25 17:15:56 +01:00
Kirill Bulatov
0c7276ae13 Expect timeline being stopped during the detach smoke test (#3442)
Found this error recently:
https://neon-github-public-dev.s3.amazonaws.com/reports/main/release/4005062867/index.html#categories/d2116d7e3b88302f27b3d646396b385b/18590f7063e91b53/?attachment=69e899c74f1cbfc5

I could not reproduce it locally, since always received `gc target
timeline does not exist` instead, so that test is quite lucky.

Still, the error is pretty valid to appear in this context, so do not fail the test if it's found in the logs.
2023-01-25 16:23:30 +02:00
Vadim Kharitonov
00f1f54b7a Leave one Dockerfile 2023-01-25 15:10:45 +01:00
Christian Schwarz
8963d830fb add script to download all remote layers (#3294)
For use in production in case on-demand download turns out to be
problematic during tenant_attach, or when we eventually introduce layer
eviction.

Co-authored-by: Dmitry Rodionov <dmitry@neon.tech>
2023-01-25 16:55:25 +03:00
Christian Schwarz
01b4b0c2f3 Introduce RequestContext
Motivation
==========

Layer Eviction Needs Context
----------------------------

Before we start implementing layer eviction, we need to collect some
access statistics per layer file or maybe even page.
Part of these statistics should be the initiator of a page read request
to answer the question of whether it was page_service vs. one of the
background loops, and if the latter, which of them?

Further, it would be nice to learn more about what activity in the pageserver
initiated an on-demand download of a layer file.
We will use this information to test out layer eviction policies.

Read more about the current plan for layer eviction here:
https://github.com/neondatabase/neon/issues/2476#issuecomment-1370822104

task_mgr problems + cancellation + tenant/timeline lifecycle
------------------------------------------------------------

Apart from layer eviction, we have long-standing problems with task_mgr,
task cancellation, and various races around tenant / timeline lifecycle
transitions.
One approach to solve these is to abandon task_mgr in favor of a
mechanism similar to Golang's context.Context, albeit extended to
support waiting for completion, and specialized to the needs in the
pageserver.

Heikki solves all of the above at once in PR
https://github.com/neondatabase/neon/pull/3228 , which is not yet
merged at the time of writing.

What Is This Patch About
========================

This patch addresses the immediate needs of layer eviction by
introducing a `RequestContext` structure that is plumbed through the
pageserver - all the way from the various entrypoints (page_service,
management API, tenant background loops) down to
Timeline::{get,get_reconstruct_data}.

The struct carries a description of the kind of activity that initiated
the call. We re-use task_mgr::TaskKind for this.

Also, it carries the desired on-demand download behavior of the entrypoint.
Timeline::get_reconstruct_data can then log the TaskKind that initiated
the on-demand download.

I developed this patch by git-checking-out Heikki's big RequestContext
PR https://github.com/neondatabase/neon/pull/3228 , then deleting all
the functionality that we do not need to address the needs for layer
eviction.

After that, I added a few things on top:

1. The concept of attached_child and detached_child in preparation for
   cancellation signalling through RequestContext, which will be added in
   a future patch.
2. A kill switch to turn DownloadBehavior::Error into a warning.
3. Renamed WalReceiverConnection to WalReceiverConnectionPoller and
   added an additional TaskKind WalReceiverConnectionHandler.These were
   necessary to create proper detached_child-type RequestContexts for the
   various tasks that walreceiver starts.

How To Review This Patch
========================

Start your review with the module-level comment in context.rs.
It explains the idea of RequestContext, what parts of it are implemented
in this patch, and the future plans for RequestContext.

Then review the various `task_mgr::spawn` call sites. At each of them,
we should be creating a new detached_child RequestContext.

Then review the (few) RequestContext::attached_child call sites and
ensure that the spawned tasks do not outlive the task that spawns them.
If they do, these call sites should use detached_child() instead.

Then review the todo_child() call sites and judge whether it's worth the
trouble of plumbing through a parent context from the caller(s).

Lastly, go through the bulk of mechanical changes that simply forwards
the &ctx.
2023-01-25 14:53:30 +01:00
Sergey Melnikov
dee71404a2 Use TLS for staging link proxy (#3443)
Fixes #3416 on staging

Adding domain parameter result in:
* Issuing TLS cert for that domain
* Passing that cert to proxy with `--tls-key`/`--tls-cert`
2023-01-25 14:39:55 +01:00
Kirill Bulatov
572332ab50 Tone down page_service timeouts (#3426)
Closes https://github.com/neondatabase/neon/issues/3341
2023-01-25 13:40:08 +02:00
Vadim Kharitonov
5223b62a19 Compile unit extension 2023-01-25 12:08:45 +01:00
Vadim Kharitonov
bc4f594ed6 Fix Sentry Version 2023-01-25 12:07:38 +01:00
Kirill Bulatov
ea6f41324a Tone down postgres client io errors (#3435)
Closes https://github.com/neondatabase/neon/issues/3343
2023-01-25 10:50:33 +00:00
Kirill Bulatov
3d5faa0295 Unify common image args in compute-node Dockerfiles (#3437)
Part of https://github.com/neondatabase/neon/issues/3436
2023-01-25 12:39:53 +02:00
Kirill Bulatov
9fbef1159f Tone down http error printing (#3434)
Only print backtraces for internal server error variants of the API
error.
2023-01-25 10:36:30 +00:00
Sergey Melnikov
aabca55d7e Migrate update version to management APIv2 (#3430) 2023-01-24 17:18:16 +01:00
Kirill Bulatov
1c3636d848 Tone down walreceiver connection timeout errors (#3425)
Closes https://github.com/neondatabase/neon/issues/3342
2023-01-24 18:03:33 +02:00
Kirill Bulatov
0c16ad8591 Tone down broker subscription errors 2023-01-24 17:23:33 +02:00
Christian Schwarz
0b673c12d7 timeline: don't transition Active=>Active during pageserver startup
Before this patch, when `initialize_with_lock` was called via
`timeline_init_and_sync`, we would transition the timeline like so:

    load_local_timeline/load_remote_timeline:
        timeline_init_and_sync
            Timeline::new
                () => Loading
            initialize_with_lock:
                set_state(Active)
                    Loading => Active
        timeline.activate()
            Active => Active
2023-01-24 15:56:02 +01:00
Christian Schwarz
7a333cfb12 be noisy about unexpected Timeline state transitions 2023-01-24 15:56:02 +01:00
Christian Schwarz
f7ec33970a add doc comment that outlines which tokio tasks walreceiver creates 2023-01-24 15:23:48 +01:00
Joonas Koivunen
98d0a0d242 fix(http): omit needless string allocs (#3421)
Drive-by fix noticed while #3419.
2023-01-24 14:53:39 +02:00
Joonas Koivunen
f74080cbad feat(http): support ?inputs_only=true for tenant_size (#3419)
this makes debugging problematic cases in the future easier, as we can
just request the model inputs, use them locally to reproduce the issue
with the model.
2023-01-24 13:57:13 +02:00
Christian Schwarz
55c184fcd7 fix some anyhow::Context::context calls that should use with_context(format!(...))
Noticed this while combing through some production logs.
2023-01-24 12:22:33 +01:00
Kirill Bulatov
fd18692dfb Output coloured pageserver logs for tty stdout 2023-01-24 09:58:08 +02:00
Alexey Kondratov
a4be54d21f [compute_ctl] Stop updating roles on each compute start (#3391)
I noticed that `compute_ctl` updates all roles on each start, search for
rows like

> - web_access:[FILTERED] -> update

in the compute startup log.

It happens since we had an adhoc hack for md5 hashes comparison, which
doesn't work with scram hashes stored in the `pg_authid`. It doesn't
really hurt, as nothing changes, but we just run >= 2 extra queries on
each start, so fix it.
2023-01-23 17:46:22 +01:00
Christian Schwarz
6b6570b580 remove TimelineState::Suspended, introduce TimelineState::Loading
The TimelineState::Suspsended was dubious to begin with. I suppose
that the intention was that timelines could transition back and
forth between Active and Suspended states.
But practically, the code before this patch never did that.
The transitions were:

    () ==Timeline::new==> Suspended ==*==> {Active,Broken,Stopping}

One exception: Tenant::set_stopping() could transition timelines like
so:

    !Broken ==Tenant::set_stopping()==> Suspended

But Tenant itself cannot transition from stopping state to any other
state.

Thus, this patch removes TimelineState::Suspended and introduces a new
state Loading. The aforementioned transitions change as follows:

    - () ==Timeline::new==> Suspended ==*==> {Active,Broken,Stopping}
    + () ==Timeline::new==> Loading   ==*==> {Active,Broken,Stopping}

    - !Broken ==Tenant::set_stopping()==> Suspended
    + !Broken ==Tenant::set_stopping()==> Stopping

Walreceiver's connection manager loop watches TimelineState to decide
whether it should retry connecting, or exit.
This patch changes the loop to exit when it observes the transition
into Stopping state.

Walreceiver isn't supposed to be started until the timeline transitions
into Active state. So, this patch also adds some warn!() messages
in case this happens anyways.
2023-01-23 17:22:49 +01:00
Joonas Koivunen
7704caa3ac More tenant size fixes (#3410)
Small changes, but hopefully this will help with the panic detected in
staging, for which we cannot get the debugging information right now
(end-of-branch before branch-point).
2023-01-23 17:12:51 +02:00
Shany Pozin
a44e5eda14 Adding pageserver3 to staging (#3403) 2023-01-23 14:08:48 +01:00
Konstantin Knizhnik
5c865f46ba Fix slru_segment_key_range function: segno was assigned to incorrect Key field (#3354) 2023-01-23 10:51:09 +02:00
bojanserafimov
a3d7ad2d52 Implement layer map using immutable BST (#2998) 2023-01-20 16:10:12 -05:00
Anastasia Lubennikova
36f048d6b0 Fix tenant size orphans (#3377)
Before only the timelines which have passed the `gc_horizon` were
processed which failed with orphans at the tree_sort phase. Example
input in added `test_branched_empty_timeline_size` test case.

The PR changes iteration to happen through all timelines, and in
addition to that, any learned branch points will be calculated as they
would had been in the original implementation if the ancestor branch had
been over the `gc_horizon`.

This also changes how tenants where all timelines are below `gc_horizon`
are handled. Previously tenant_size 0 was returned, but now they will
have approximately `initdb_lsn` worth of tenant_size.

The PR also adds several new tenant size tests that describe various corner
cases of branching structure and `gc_horizon` setting.
They are currently disabled to not consume time during CI.

Co-authored-by: Joonas Koivunen <joonas@neon.tech>
Co-authored-by: Anastasia Lubennikova <anastasia@neon.tech>
2023-01-20 20:21:36 +02:00
Joonas Koivunen
58fb6fe861 fix: dont stop pageserver if we fail to calculate synthetic size 2023-01-20 19:55:19 +02:00
Alexey Kondratov
20b1e26e74 [compute_ctl] Make role deletion spec processing idempotent (#3380)
Previously, we were trying to re-assign owned objects of the already
deleted role. This were causing a crash loop in the case when compute
was restarted with a spec that includes delta operation for role
deletion. To avoid such cases, check that role is still present before
calling `reassign_owned_objects`.

Resolves neondatabase/cloud#3553
2023-01-20 15:37:24 +01:00
Christian Schwarz
8ba1699937 Revert "Use actual temporary dir for pageserver unit tests"
This reverts commit 826e89b9ce.

The problem with that commit was that it deletes the TempDir while
there are still EphemeralFile instances open.

At first I thought this could be fixed by simply adding

  Handle::current().block_on(task_mgr::shutdown(None, Some(tenant_id), None))

to TenantHarness::drop, but it turned out to be insufficient.

So, reverting the commit until we find a proper solution.

refs https://github.com/neondatabase/neon/issues/3385
2023-01-19 20:16:56 +01:00
bojanserafimov
a9bd05760f Improve layer map docstrings (#3382) 2023-01-19 10:29:15 -05:00
Heikki Linnakangas
e5cc2f92c4 Switch to 'tracing' for logging, restructure code to make use of spans.
Refactors Compute::prepare_and_run. It's split into subroutines
differently, to make it easier to attach tracing spans to the
different stages. The high-level logic for waiting for Postgres to
exit is moved to the caller.

Replace 'env_logger' with 'tracing', and add `#instrument` directives
to different stages fo the startup process. This is a fairly
mechanical change, except for the changes in 'spec.rs'. 'spec.rs'
contained some complicated formatting, where parts of log messages
were printed directly to stdout with `print`s. That was a bit messed
up because the log normally goes to stderr, but those lines were
printed to stdout. In our docker images, stderr and stdout both go to
the same place so you wouldn't notice, but I don't think it was
intentional.

This changes the log format to the default
'tracing_subscriber::format' format. It's different from the Postgres
log format, however, and because both compute_tools and Postgres print
to the same log, it's now a mix of two different formats.  I'm not
sure how the Grafana log parsing pipeline can handle that. If it's a
problem, we can build custom formatter to change the compute_tools log
format to be the same as Postgres's, like it was before this commit,
or we can change the Postgres log format to match tracing_formatter's,
or we can start printing compute_tool's log output to a different
destination than Postgres
2023-01-18 19:42:47 +02:00
Kirill Bulatov
90f66aa51b Enable logs in unit tests 2023-01-18 17:43:27 +02:00
Kirill Bulatov
826e89b9ce Use actual temporary dir for pageserver unit tests 2023-01-18 17:43:27 +02:00
Vadim Kharitonov
e59d32ac5d Change SENTRY_ENVIRONMENT from "development" to "staging" 2023-01-18 16:34:49 +01:00
Anastasia Lubennikova
506086a3e2 Fix metric_collection_endpoint for prod.
It was incorrectly set to staging url
2023-01-18 16:35:43 +02:00
Heikki Linnakangas
3b58c61b33 If an error happens while checking for core dumps, don't panic.
If we panic, we skip the 30s wait in 'main', and don't give the
console a chance to observe the error. Which is not nice.

Spotted by @ololobus at
https://github.com/neondatabase/neon/pull/3352#discussion_r1072806981
2023-01-18 11:25:47 +02:00
Kirill Bulatov
c6b56d2967 Add more io::Error context when fail to operate on a path (#3254)
I have a test failure that shows 

```
Caused by:
    0: Failed to reconstruct a page image:
    1: Directory not empty (os error 39)
```

but does not really show where exactly that happens.

https://neon-github-public-dev.s3.amazonaws.com/reports/pr-3227/release/3823785365/index.html#categories/c0057473fc9ec8fb70876fd29a171ce8/7088dab272f2c7b7/?attachment=60fe6ed2add4d82d

The PR aims to add more context in debugging that issue.
2023-01-17 22:07:38 +02:00
Anastasia Lubennikova
9d3992ef48 Increase metric_collection_interval for proxy on prod
to not owerwhelm the service
2023-01-17 15:50:19 +02:00
Anastasia Lubennikova
7624963e13 Enable metric_collection_endpoint for proxy on prod
in all regions
2023-01-17 13:43:50 +02:00
Anastasia Lubennikova
63e3b815a2 Enable metric_collection_endpoint for pageserver on prod
in all regions
2023-01-17 13:43:50 +02:00
Kirill Bulatov
1ebd145c29 Actualize the comment (#3362)
Follow-up of
https://github.com/neondatabase/neon/pull/3326#issuecomment-1384265759
2023-01-17 13:30:42 +02:00
sharnoff
f8e887830a build: Use curl -f on vm-informant download (#3363)
Without this, we can silently fail
2023-01-17 10:38:33 +01:00
Christian Schwarz
48dd9565ac TaskHandle: tone down sender is dropped while join handle is still alive
Rationale: see comments added as part of this commit.

fixes https://github.com/neondatabase/neon/issues/3339
2023-01-17 09:42:22 +01:00
Anastasia Lubennikova
e067cd2947 Enable metric collection for proxy on staging 2023-01-16 21:15:42 +02:00
Christian Schwarz
58c8c1076c download_all_remote_layers API: require client to specify max_concurrent_downloads
Before this patch, we would start all layer downloads simultaneously.

There is at most one download_all_remote_layers task per timeline.
Hence, the specified limit is per timeline.

There is still no global concurrency limit for layer downloads.
We'll have to revisit that at some point and also prioritize on-demand
initiated downloads over download_all_remote_layers downloads.
But that's for another day.
2023-01-16 19:29:06 +01:00
Alexander Bayandin
4c6b507472 Update Postgres clients we test (#3359)
Update client libraries and runtimes for Postgres libraries we test.
- `pg8000` works with Neon now 🎉 
- `PostgresClientKit` still doesn't support SNI
2023-01-16 17:22:17 +00:00
Stas Kelvich
431e464c1e Consumption metering RFC 2023-01-16 19:15:59 +02:00
danieltprice
424fd0bd63 Update auth.rs (#3349)
Update SNI error message. Users now specify the endpoint ID when making
a connection to Neon. This should be reflected in the error message.
2023-01-16 12:32:00 -04:00
Joonas Koivunen
a8a9bee602 walredo: simple tests and bench updates (#3045)
Separated from #2875.

The microbenchmark has been validated to show similar difference as to
larger scale OLTP benchmark.
2023-01-16 18:24:45 +02:00
Vadim Kharitonov
6ac5656be5 Enable earthdistance extension 2023-01-16 17:04:51 +01:00
Anastasia Lubennikova
3c571ecde8 Update docs/consumption_metrics.md 2023-01-16 17:24:13 +02:00
Anastasia Lubennikova
5f1bd0e8a3 Add documentation for consumption metrics 2023-01-16 17:24:13 +02:00
Anastasia Lubennikova
2cbe84b78f Proxy metrics (#3290)
Implement proxy metrics collection.
Only collect metric for outbound traffic.

Add proxy CLI parameters:
- metric-collection-endpoint
- metric-collection-interval.

Add test_proxy_metric_collection test.

Move shared consumption metrics code to libs/consumption_metrics.
Refactor the code.
2023-01-16 15:17:28 +00:00
sharnoff
5c6a7a17cb Add VM informant to vm-compute-node (#3324)
The general idea is that the VM informant binary is added to the
vm-compute-node images only. `compute_tools` then will run whatever's at
`/bin/vm-informant`, if the path exists.
2023-01-16 07:05:29 -08:00
Arseny Sher
84ffdc8b4f Don't keep FDs open on cancelled timelines in safekeepers.
Since PR #3300 we don't remove timelines completely until next restart, so this
prevents leakage.

fixes https://github.com/neondatabase/neon/issues/3336
2023-01-16 19:03:38 +04:00
Kirill Bulatov
bce4233d3a Rework Cargo.toml dependencies (#3322)
* Use workspace variables from cargo, coming with rustc
[1.64](https://github.com/rust-lang/rust/blob/master/RELEASES.md#version-1640-2022-09-22)

See
https://doc.rust-lang.org/nightly/cargo/reference/workspaces.html#the-package-table
and
https://doc.rust-lang.org/nightly/cargo/reference/workspaces.html#the-dependencies-table
sections.

Now, all dependencies in all non-root `Cargo.toml` files are defined as 
```
clap.workspace = true
```

sometimes, when extra features are needed, as 
```
bytes = {workspace = true, features = ['serde'] }
```

With the actual declarations (with shared features and version
numbers/file paths/etc.) in the root Cargo.toml.
Features are additive:

https://doc.rust-lang.org/nightly/cargo/reference/specifying-dependencies.html#inheriting-a-dependency-from-a-workspace

* Uses the mechanism above to set common, 2021, edition and license across the
workspace

* Mechanically bumps a few dependencies

* Updates hakari format, as it suggested:
```
work/neon/neon kb/cargo-templated ❯ cargo hakari generate
info: no changes detected
info: new hakari format version available: 3 (current: 2)
(add or update `dep-format-version = "3"` in hakari.toml, then run `cargo hakari generate && cargo hakari manage-deps`)
```
2023-01-13 18:13:34 +02:00
Vadim Kharitonov
16baa91b2b Add more information about cargo deny 2023-01-13 13:24:34 +01:00
Kirill Bulatov
99808558de Avoid duplicate timeline insert (#3326)
`initialize_with_lock` inserts `Arc<Timeline>` before returning it:
c1731bc4f0/pageserver/src/tenant.rs (L222)

but `setup_timeline` function did another insert, which got removed in this PR:
c1731bc4f0/pageserver/src/tenant.rs (L486)


On top, a better comment and function renames are added.
2023-01-13 12:05:54 +00:00
Anastasia Lubennikova
c6d383e239 code cleanup 2023-01-13 11:51:28 +02:00
Anastasia Lubennikova
5e3e0fbf6f remove unneeded Cargo.lock changes 2023-01-13 11:51:28 +02:00
Anastasia Lubennikova
26f39c03f2 review code cleanup:
- handle errors in calculate_synthetic_size_worker. Don't exit the bgworker if one tenant failed.

- add cached_synthetic_tenant_size to cache values calculated by the bgworker

- code cleanup: remove unneeded info! messages, clean comments

- handle collect_metrics_task() error. Don't exit collect_metrics worker if one task failed.

 - add unit test to cover case when we have multiple branches at the same lsn
2023-01-13 11:51:28 +02:00
Anastasia Lubennikova
148e020fb9 Fix logical size calculation:
sort updates in topological order so that the parent timeline always preceeds its children.
    fixes #3179
2023-01-13 11:51:28 +02:00
Anastasia Lubennikova
0675859bb0 Add background worker that periodically spawns
synthetic size calculation.
Add new pageserver config param calculate_synthetic_size_interval
2023-01-13 11:51:28 +02:00
Anastasia Lubennikova
ba0190e3e8 Handle errors in tenant_size_model code 2023-01-13 11:51:28 +02:00
Konstantin Knizhnik
9ce5ada89e Do not report position in SMGR message (#3307)
refer #3277
2023-01-13 10:23:35 +02:00
Alexander Bayandin
c28bfd4c63 Nightly Benchmarks: add user provided example (#3308) 2023-01-12 23:03:21 +00:00
Vadim Kharitonov
dec875fee1 Disable postgis_sfcgal 2023-01-12 21:51:49 +01:00
Kirill Bulatov
fe8cef3427 Use ready! rustc 1.64 macro (#3315)
rustc
[1.64](https://github.com/rust-lang/rust/blob/master/RELEASES.md#version-1640-2022-09-22)
had brought `ready!` macro:
https://doc.rust-lang.org/stable/std/task/macro.ready.html

Use it to shorten the code slightly.
2023-01-12 21:27:34 +02:00
MMeent
bb406b21a8 Fix issue in compaction code (#3246)
If we ran `compact_prefetch_buffers` with exactly one hole in the
buffers, the code would fail to remove the last, now unused, entry from
the array.

This is now fixed. 

Also, add and adjust some comments in the compaction code so that the
algorithm used is a bit more clear.

Fixes #3192
2023-01-12 19:23:59 +01:00
Heikki Linnakangas
57a6e931ea Comment, formatting and other cosmetic cleanup. 2023-01-12 19:05:13 +02:00
Heikki Linnakangas
0cceb14e48 Add a FIXME on ugly error message parsing. 2023-01-12 19:05:13 +02:00
Konstantin Knizhnik
1983c4d4ad Explain prefetch (#3002)
Co-authored-by: Bojan Serafimov <bojan.serafimov7@gmail.com>
2023-01-12 18:12:40 +02:00
Heikki Linnakangas
d7c41cbbee Replace tokio::watch with CancellationToken.
PR #3228 starts to use CancellationTokens more widely, this is a small
part extracted from that.
2023-01-12 17:37:15 +02:00
Vadim Kharitonov
29a2465276 Update rust version in toolchain 2023-01-12 15:16:52 +01:00
Arthur Petukhovsky
f49e923d87 Keep deleted timelines in memory of safekeeper (#3300)
A temporal fix for https://github.com/neondatabase/neon/issues/3146,
until we come up with a reliable way to create and delete timelines in
all safekeepers.
2023-01-12 15:33:07 +03:00
Vadim Kharitonov
a0ee306c74 Enable safe contrib extensions 2023-01-12 12:41:53 +01:00
Heikki Linnakangas
c1731bc4f0 Push on-demand download into Timeline::get() function itself.
This makes Timeline::get() async, and all functions that call it
directly or indirectly with it. The with_ondemand_download() mechanism
is gone, Timeline::get() now always downloads files, whether you want
it or not. That is what all the current callers want, so even though
this loses the capability to get a page only if it's already in the
pageserver, without downloading, we were not using that capability.
There were some places that used 'no_ondemand_download' in the WAL
ingestion code that would error out if a layer file was not found
locally, but those were dubious. We do actually want to on-demand
download in all of those places.

Per discussion at
https://github.com/neondatabase/neon/pull/3233#issuecomment-1368032358
2023-01-12 11:53:10 +02:00
Sergey Melnikov
95bf19b85a Add --atomic to all helm upgrade operations (#3299)
When number of github actions workers is changed, some jobs get killed.
When helm if killed during the upgrade, release stuck in pending-upgrade
state. --atomic should initiate automatic rollback in this case.
2023-01-10 10:05:27 +00:00
Vadim Kharitonov
80d4afab0c Update tokio version (RUSTSEC-2023-0001) 2023-01-10 09:02:00 +01:00
Arthur Petukhovsky
0807522a64 Enable wss proxy in all regions (#3292)
Follow-up to https://github.com/neondatabase/helm-charts/pull/24 and
#3247
2023-01-09 19:56:12 +00:00
Christian Schwarz
8eebd5f039 run on-demand compaction in a task_mgr task
With this patch, tenant_detach and timeline_delete's
task_mgr::shutdown_tasks() call will wait for on-demand
compaction to finish.
Before this patch, the on-demand compaction would grab the
layer_removal_cs after tenant_detach / timeline_delete had
removed the timeline directory.
This resulted in error

  No such file or directory (os error 2)

NB: I already implemented this pattern for ondemand GC a while back.

fixes https://github.com/neondatabase/neon/issues/3136
2023-01-09 19:08:22 +01:00
Heikki Linnakangas
8c07ef413d Minor cleanup of test_ondemand_download_timetravel test.
- Fix and improve comments
- Rename 'physical_size' local variable to 'resident_size' for clarity.
- Remove one 'unnecessary wait_for_upload' call. The
  'wait_for_sk_commit_lsn_to_reach_remote_storage' call after shutting
  down compute is sufficient.
2023-01-09 18:56:50 +02:00
Sergey Melnikov
14df37c108 Use GHA environments for gradual prod rollout (#3295)
Each release will wait for manual approval for each region
2023-01-09 20:18:16 +04:00
Christian Schwarz
d4d0aa6ed6 gc_iteration_internal: better log message & debug log level if nothing to do
fixes https://github.com/neondatabase/neon/issues/3107
2023-01-09 13:53:59 +01:00
Kirill Bulatov
a457256fef Fix log message matching (#3291)
Spotted
https://neon-github-public-dev.s3.amazonaws.com/reports/main/debug/3871991071/index.html#suites/158be07438eb5188d40b466b6acfaeb3/22966d740e33b677/
failing on `main`, fixes that by using a proper regex match string.

Also removes one clippy lint suppression.
2023-01-09 14:25:12 +02:00
Shany Pozin
3a22e1335d Adding a PR template (#3288)
## Describe your changes
Added a PR template 
## Issue ticket number and link
#3162
## Checklist before requesting a review
- [ ] I have performed a self-review of my code
- [ ] If it is a core feature, I have added thorough tests.
- [ ] Do we need to implement analytics? if so did you add the relevant
metrics to the dashboard?
- [ ] If this PR requires public announcement, mark it with
/release-notes label and add several sentences in this section.
2023-01-09 12:15:53 +00:00
Sergey Melnikov
93c77b0383 Use GHA environment for per-region deploy approvals on staging (#3293)
Each main deploy will wait for manual approval for each region
2023-01-09 15:40:14 +04:00
Shany Pozin
7920b39a27 Adding transition reason to the log when a tenant is moved to Broken state (#3289)
#3160
2023-01-09 10:24:50 +02:00
Kirill Bulatov
23d5e2bdaa Fix common pg port in the CLI basics test (#3283)
Closes https://github.com/neondatabase/neon/issues/3282
2023-01-07 00:46:42 +02:00
Christian Schwarz
3526323bc4 prepare Timeline::get_reconstruct_data for becoming async (#3271)
This patch restructures the code so that PR
https://github.com/neondatabase/neon/pull/3228 can seamlessly
replace the return PageReconstructResult::NeedsDownload with
a download_remote_layer().await.

Background:

PR https://github.com/neondatabase/neon/pull/3228 will turn
get_reconstruct_data() async and do the on-demand
download right in place, instead of returning a
PageReconstructResult::NeedsDownload.

Current rustc requires that the layers lock guard be not in scope
across an await point.

For on-demand download inside get_reconstruct_data(), we need
to do download_remote_layer().await.

Supersedes https://github.com/neondatabase/neon/pull/3260

See my comment there:
https://github.com/neondatabase/neon/pull/3260#issuecomment-1370752407

Co-authored-by: Heikki Linnakangas <heikki@neon.tech>
2023-01-06 19:42:25 +02:00
Heikki Linnakangas
af9425394f Print time taken by CREATE/ALTER DATABASE at compute start.
Trying to investigate why the "apply_config" stage is taking longer
than expected. This proves or disproves that it's the CREATE DATABASE
statement.
2023-01-06 17:50:44 +02:00
Arthur Petukhovsky
debd134b15 Implement wss support in proxy (#3247)
This is a hacky implementation of WebSocket server, embedded into our
postgres proxy. The server is used to allow https://github.com/neondatabase/serverless 
to connect to our postgres from browser and serverless javascript functions.

How it will work (general schema):
- browser opens a websocket connection to
`wss://ep-abc-xyz-123.xx-central-1.aws.neon.tech/`
- proxy accepts this connection and terminates TLS (https)
- inside encrypted tunnel (HTTPS), browser initiates plain
(non-encrypted) postgres connection
- proxy performs auth as in usual plain pg connection and forwards
connection to the compute

Related issue: #3225
2023-01-06 18:34:18 +03:00
Heikki Linnakangas
df42213dbb Fix missing COMMIT in handle_role_deletions.
There was no COMMIT, so the DROP ROLE commands were always implicitly
rolled back.

Fixes issue #3279.
2023-01-06 17:07:46 +02:00
Kirill Bulatov
b6237474d2 Fix README and basic startup example (#3275)
Follow-up of https://github.com/neondatabase/neon/pull/3270 which made
an example from main README.md not working.

Fixes that, by adding a way to specify a default tenant now and modifies
the basic neon_local test to start postgres and check branching.
Not all neon_local commands are implemented, so not all README.md
contents is tested yet.
2023-01-06 12:26:14 +02:00
Heikki Linnakangas
8b710b9753 Fix segfault if pageserver connection is lost during backend startup.
It's not OK to return early from within a PG_TRY-CATCH block. The
PG_TRY macro sets the global PG_exception_stack variable, and
PG_END_TRY restores it. If we jump out in between with "return NULL",
the PG_exception_stack is left to point to garbage. (I'm surprised the
comments in PG_TRY_CATCH don't warn about this.)

Add test that re-attaches tenant in pageserver while Postgres is
running. If the tenant is detached while compute is connected and
busy running queries, those queries will fail if they try to fetch any
pages. But when the tenant is re-attached, things should start working
again, without disconnecting the client <-> postgres connections.
Without this fix, this reproduced the segfault.

Fixes issue #3231
2023-01-05 18:51:47 +02:00
Heikki Linnakangas
c187de1101 Copy error message before it's freed.
pageserver_disconnect() call invalidates 'pageserver_conn', including
the error message pointer we got from PQerrorMessage(pageserver_conn).
Copy the message to a temporary variable before disconnecting, like
we do in a few other places.

In the passing, clear 'pageserver_conn_wes' variable in a few places
where it was free'd. I didn't see any live bug from this, but since
pageserver_disconnect() checks if it's NULL, let's not leave it
dangling to already-free'd memory.
2023-01-05 18:51:47 +02:00
Kirill Bulatov
8712e1899e Move initial timeline creation into pytest (#3270)
For every Python test, we start the storage first, and expect that
later, in the test, when we start a compute, it will work without
specific timeline and tenant creation or their IDs specified.

For that, we have a concept of "default" branch that was created on the
control plane level first, but that's not needed at all, given that it's
only Python tests that need it: let them create the initial timeline
during set-up.

Before, control plane started and stopped pageserver for timeline
creation, now Python harness runs an extra tenant creation request on
test env init.

I had to adjust the metrics test, turns out it registered the metrics
from the default tenant after an extra pageserver restart.
New model does not sent the metrics before the collection time happens,
and that was 30s before.
2023-01-05 17:48:27 +02:00
Christian Schwarz
d7f1e30112 remote_timeline_client: more metrics & metrics-related cleanups
- Clean up redundant metric removal in TimelineMetrics::drop.
RemoteTimelineClientMetrics is responsible for cleaning up
REMOTE_OPERATION_TIME andREMOTE_UPLOAD_QUEUE_UNFINISHED_TASKS.

- Rename `pageserver_remote_upload_queue_unfinished_tasks` to
`pageserver_remote_timeline_client_calls_unfinished`. The new name
reflects that the metric is with respect to the entire call to remote
timeline client. This includes wait time in the upload queue and hence
it's a longer span than what `pageserver_remote_OPERATION_seconds`
measures.

- Add the `pageserver_remote_timeline_client_calls_started` histogram.
See the metric description for why we need it.

- Add helper functions `call_begin` etc to `RemoteTimelineClientMetrics`
to centralize the logic for updating the metrics above (they relate to
each other, see comments in code).

- Use these constructs to track ongoing downloads in
`pageserver_remote_timeline_client_calls_unfinished`

refs https://github.com/neondatabase/neon/issues/2029
fixes https://github.com/neondatabase/neon/issues/3249
closes https://github.com/neondatabase/neon/pull/3250
2023-01-05 11:50:17 +01:00
Christian Schwarz
6a9d1030a6 use RemoteTimelineClient for downloading index part during tenant_attach
Before this change, we would not .measure_remote_op for index part
downloads.

And more generally, it's good to pass not just uploads but also
downloads through RemoteTimelineClient, e.g., if we ever want to
implement some timeline-scoped policies there.

Found this while working on https://github.com/neondatabase/neon/pull/3250
where I add a metric to measure the degree of concurrent downloads.
Layer download was missing in a test that I added there.
2023-01-05 11:08:50 +01:00
Heikki Linnakangas
8c6e607327 Refactor send_tarball() (#3259)
The Basebackup struct is really just a convenient place to carry the
various parameters around in send_tarball and its subroutines. Make it
internal to the send_tarball function.
2023-01-04 23:03:16 +02:00
Vadim Kharitonov
f436fb2dfb Fix panics at compute_ctl:monitor 2023-01-04 17:26:42 +01:00
Kirill Bulatov
8932d14d50 Revert "Run Python tests in 8 threads (#3206)" (#3264)
This reverts commit 56a4466d0a.

Seems that flackiness increased after this commit, while the time
decrease was a couple of seconds.
With every regular Python test spawing 1 etcd, 3 safekeepers, 1
pageserver, few CLI commands and post-run cleanup hooks, it might be
hard to run many such tests in parallel.

We could return to this later, after we consider alternative test
structure and/or CI runner structure.
2023-01-04 17:31:51 +02:00
Kirill Bulatov
efad64bc7f Expect compute shutdown test log error (#3262)
https://neon-github-public-dev.s3.amazonaws.com/reports/pr-3261/debug/3833043374/index.html#suites/ffbb7f9930a77115316b58ff32b7c719/1f6ebaedc0a113a1/

Spotted a flacky test that appeared after
https://github.com/neondatabase/neon/pull/3227 changes
2023-01-04 10:45:11 +00:00
Kirill Bulatov
10dae79c6d Tone down safekeeper and pageserver walreceiver errors (#3227)
Closes https://github.com/neondatabase/neon/issues/3114

Adds more typization into errors that appear during protocol messages (`FeMessage`), postgres and walreceiver connections.

Socket IO errors are now better detected and logged with lesser (INFO, DEBUG) error level, without traces that they were logged before, when they were wrapped in anyhow context.
2023-01-03 20:42:04 +00:00
Heikki Linnakangas
e9583db73b Remove code and test to generate flamegraph on GetPage requests. (#3257)
It was nice to have and useful at the time, but unfortunately the method
used to gather the profiling data doesn't play nicely with 'async'. PR
#3228 will turn 'get_page_at_lsn' function async, which will break the
profiling support. Let's remove it, and re-introduce some kind of
profiling later, using some different method, if we feel like we need it
again.
2023-01-03 20:11:32 +02:00
Vadim Kharitonov
0b428f7c41 Enable licenses check for 3rd-parties 2023-01-03 15:11:50 +01:00
Heikki Linnakangas
8b692e131b Enable on-demand download in WalIngest. (#3233)
Makes the top-level functions in WalIngest async, and replaces
no_ondemand_download calls with with_ondemand_download.

This hopefully fixes the problem reported in issue #3230, although I
don't have a self-contained test case for it.
2023-01-03 14:44:42 +02:00
Heikki Linnakangas
0a0e55c3d0 Replace 'tar' crate with 'tokio-tar' (#3202)
The synchronous 'tar' crate has required us to use block_in_place and
SyncIoBridge to work together with the async I/O in the client
connection. Switch to 'tokio-tar' crate that uses async I/O natively.

As part of this, move the CopyDataWriter implementation to
postgres_backend_async.rs. Even though it's only used in one place
currently, it's in principle generally applicable whenever you want to
use COPY out.

Unfortunately we cannot use the 'tokio-tar' as it is: the Builder
implementation requires the writer to have 'static lifetime. So we
have to use a modified version without that requirement. The 'static
lifetime was required just for the Drop implementation that writes
the end-of-archive sections if the Builder is dropped without calling
`finish`. But we don't actually want that behavior anyway; in fact
we had to jump through some hoops with the AbortableWrite hack to skip
those. With the modified version of 'tokio-tar' without that Drop
implementation, we don't need AbortableWrite either.

Co-authored-by: Kirill Bulatov <kirill@neon.tech>
2023-01-03 12:39:11 +02:00
Christian Schwarz
5bc9f8eae0 README: Fedora needs protobuf-devel
Otherwise, common protobufs such as Google's empty.proto are missing,
resulting in storage_broker build.rs failure.

I encountered this on Fedora 36.
2023-01-03 11:05:23 +01:00
Sergey Melnikov
4c4d3dc87a Add new pageserver to us-east-2 staging (#3248) 2023-01-02 22:14:05 +04:00
Shany Pozin
182dc785d6 Set PITR default to 7 days (#3245)
https://github.com/neondatabase/cloud/issues/3406
2023-01-02 18:05:23 +02:00
Kirill Bulatov
a9cca7a0fd Use proper error code for BeMessage error responses (#3240)
Based on
https://github.com/neondatabase/neon/pull/3227#discussion_r1059430067

Seems that the constant, used for internal error during BeMessage error
response serialization is incorrect.
Currently used one is `CXX000`, yet all docs mention `XX000` instead:

* https://www.postgresql.org/docs/current/errcodes-appendix.html
* https://docs.rs/postgres/latest/postgres/error/struct.SqlState.html#associatedconstant.INTERNAL_ERROR

I have checked it with the patch and logs described in
https://github.com/neondatabase/neon/pull/3227#discussion_r1059949982
2023-01-02 16:51:05 +02:00
Arseny Sher
6fd64cd5f6 Allow failure to report metrics in test_metric_collection.
Per CI
https://github.com/neondatabase/neon/actions/runs/3822039946/attempts/1
shutdown seems to be racy.
2023-01-02 15:37:05 +03:00
Kirill Bulatov
56a4466d0a Run Python tests in 8 threads (#3206)
I have experimented with the runner threads number, and looks like 8
threads win us a few seconds.

Bumping the thread count more did not improve the situation much:
* 20 threads were not allowed by pytest
* 16 threads were flacking quite notably

My guess would be that all pageservers, safekeepers, and other nodes we
start occupy quite much of the CPU and other resources to make this
approach more scalable.
2023-01-02 14:34:06 +02:00
Arseny Sher
41b8e67305 Fix 81afd7011 by enabling reqwest feature for sentry.
It disabled transport altogether.
2023-01-02 15:29:33 +03:00
Heikki Linnakangas
81afd7011c Use rustls for everything.
I looked at "cargo tree" output and noticed that through various
dependencies, we are depending on both native-tls and rustls. We have
tried to standardize on rustls for everything, but dependencies on
native-tls have crept in recently. One such dependency came from
'reqwest' with default features in pageserver, used for
consumption_metrics. Another dependency was from 'sentry'. Both
'reqwest' and 'sentry' use native-tls by default, but can use 'rustls'
if compiled with the right feature flags.
2023-01-02 11:14:35 +02:00
dependabot[bot]
3468db8a2b Bump setuptools from 65.5.0 to 65.5.1 (#3212) 2023-01-02 08:47:28 +01:00
Egor Suvorov
9f94d098aa Remove unused AuthType::MD5 2022-12-31 02:27:08 +03:00
Egor Suvorov
cb61944982 Safekeeper: refactor auth validation
* Load public auth key on startup and store it in the config.
* Get rid of a separate `auth` parameter which was passed all over the place.
2022-12-31 02:27:08 +03:00
Dmitry Ivanov
c700c7db2e [proxy] Add more labels to the pricing metrics 2022-12-29 22:25:52 +03:00
Dmitry Rodionov
7c7d225d98 add pageserver to new region see https://github.com/neondatabase/aws/pull/116 2022-12-29 20:29:42 +03:00
Anastasia Lubennikova
8ff7bc5df1 Add timleline_logical_size metric.
Send this metric only when it is fully calculated.

Make consumption metrics more stable:
- Send per-timeline metrics only for active timelines.
- Adjust test assertions to make test_metric_collection test more stable.
2022-12-29 19:13:54 +02:00
Heikki Linnakangas
890ff3803e Allow update_gc_info to download files on-demand. 2022-12-29 16:23:25 +02:00
Heikki Linnakangas
fefe19a284 Avoid calling find_lsn_for_timestamp call while holding lock.
Refactor update_gc_info function so that it calls the potentially
expensive find_lsn_for_timestamp() function before acquiring the
lock. This will also be needed if we make find_lsn_for_timestamp()
async in the future; it cannot be awaited while holding the lock.
2022-12-29 16:23:25 +02:00
Vadim Kharitonov
434fcac357 Remove unused HTTP endpoints from compute_ctl 2022-12-29 13:59:40 +01:00
Anastasia Lubennikova
894ac30734 Rename billing_metrics to consumption_metrics.
Use more appropriate term, because not all of these metrics are used for billing.
2022-12-29 14:25:47 +02:00
Shany Pozin
c0290467fa Fix #2907 Remove missing_layers from IndexPart (#3217)
#2907
2022-12-29 12:33:30 +02:00
Konstantin Knizhnik
0e7c03370e Lazy calculation of traversal_id which is needed only for error repoting (#3221)
See 
https://neondb.slack.com/archives/C0277TKAJCA/p1672245908989789
and 
https://neondb.slack.com/archives/C033RQ5SPDH/p1671885245981359
2022-12-29 12:20:28 +02:00
Anastasia Lubennikova
f731e9b3de Fix serialization of billing metrics (#3215)
Fixes:
- serialize TenantId and TimelineId as strings,
- skip TimelineId if none
- serialize `metric_type` field as `type`
- add `idempotency_key` field to uniquely identify metrics
2022-12-29 12:11:04 +02:00
Dmitry Rodionov
bd7a9e6274 switch to debug from info to produce less noise 2022-12-29 13:01:17 +03:00
Egor Suvorov
42c6ddef8e Rename ZENITH_AUTH_TOKEN to NEON_AUTH_TOKEN
Changes are:

* Pageserver: start reading from NEON_AUTH_TOKEN by default.
  Warn if ZENITH_AUTH_TOKEN is used instead.
* Compute, Docs: fix the default token name.
* Control plane: change name of the token in configs and start
  sequences.

Compatibility:

* Control plane in tests: works, no compatibility expected.
* Control plane for local installations: never officially supported
  auth anyways. If someone did enable it, `pageserver.toml` should be updated
  with the new `neon.pageserver_connstring` and `neon.safekeeper_token_env`.
* Pageserver is backward compatible: you can run new Pageserver with old
  commands and environment configurations, but not vice-versa.
  The culprit is the hard-coded `NEON_AUTH_TOKEN`.
* Compute has no code changes. As long as you update its configuration
  file with `pageserver_connstring` in sync with the start up scripts,
  you are good to go.
* Safekeeper has no code changes and has never used `ZENITH_AUTH_TOKEN` in
  the first place.
2022-12-29 03:33:43 +03:00
Shany Pozin
172c7e5f92 Split upload queue code from storage_sync.rs (#3216)
https://github.com/neondatabase/neon/issues/3208
2022-12-28 15:12:06 +02:00
Shany Pozin
0c7b02ebc3 Move tenant related files to tenant directory (#3214)
Related to https://github.com/neondatabase/neon/issues/3208
2022-12-28 09:20:01 +02:00
Arseny Sher
f6bf7b2003 Add tenant_id to safekeeper spans.
Now that it's hard to map timeline id into project in the console, this should
help a little.
2022-12-27 20:19:12 +03:00
Arseny Sher
fee8bf3a17 Remove global_commit_lsn.
It is complicated and fragile to maintain and not really needed; update
commit_lsn locally only when we have enough WAL flushed.

ref https://github.com/neondatabase/neon/issues/3069
2022-12-27 20:19:12 +03:00
Arseny Sher
1ad6e186bc Refuse ProposerElected if it is going to truncate correct WAL.
Prevents commit_lsn monotonicity violation (otherwise harmless).

closes https://github.com/neondatabase/neon/issues/3069
2022-12-27 20:19:12 +03:00
Konstantin Knizhnik
140c0edac8 Yet another port of local file system cache (#2622) 2022-12-27 14:42:51 +02:00
Anna Stepanyan
5826e19b56 update the grafana links in the PR release template (#3156) 2022-12-27 10:25:19 +01:00
Konstantin Knizhnik
1137b58b4d Fix LayerMap::search to not return delta layer preceeding image layer (#3197)
While @bojanserafimov is still working on best replacement of R-Tree in
layer_map.rs there is obvious pitfall in the current `search` method
implementation: is returns delta layer even if there is image layer if
greater LSN. I think that it should be fixed.
2022-12-26 18:21:41 +02:00
Anastasia Lubennikova
1468c65ffb Enable billing metric_collection_endpoint on staging 2022-12-23 18:04:45 +02:00
Kirill Bulatov
b77c33ee06 Move tenant-related modules below tenant module (#3190)
No real code changes besides moving code around and adjusting the
imports.
2022-12-23 15:40:37 +02:00
Kirill Bulatov
0bafb2a6c7 Do more on-demand downloads where needed (#3194)
The PR aims to fix two missing redownloads in a flacky
test_remote_storage_upload_queue_retries[local_fs]
([example](https://neon-github-public-dev.s3.amazonaws.com/reports/pr-3190/release/3759194738/index.html#categories/80f1dcdd7c08252126be7e9f44fe84e6/8a70800f7ab13620/))

1. missing redownload during walreceiver work
```
2022-12-22T16:09:51.509891Z ERROR wal_connection_manager{tenant=fb62b97553e40f949de8bdeab7f93563 timeline=4f153bf6a58fd63832f6ee175638d049}: wal receiver task finished with an error: walreceiver connection handling failure

Caused by:
    Layer needs downloading

Stack backtrace:
   0: pageserver::tenant::timeline::PageReconstructResult<T>::no_ondemand_download
             at /__w/neon/neon/pageserver/src/tenant/timeline.rs:467:59
   1: pageserver::walingest::WalIngest::new
             at /__w/neon/neon/pageserver/src/walingest.rs:61:32
   2: pageserver::walreceiver::walreceiver_connection::handle_walreceiver_connection::{{closure}}
             at /__w/neon/neon/pageserver/src/walreceiver/walreceiver_connection.rs:178:25
....
```

That looks sad, but inevitable during the current approach: seems that
we need to wait for old layers to arrive in order to accept new data.

For that, `WalIngest::new` now started to return the
`PageReconstructResult`.
Sync methods from `import_datadir.rs` use `WalIngest::new` too, but both
of them import WAL during timeline creation, so no layers to download
are needed there, ergo the `PageReconstructResult` is converted to
`anyhow::Result` with `no_ondemand_download`.

2. missing redownload during compaction work
```
2022-12-22T16:09:51.090296Z ERROR compaction_loop{tenant_id=fb62b97553e40f949de8bdeab7f93563}:compact_timeline{timeline=4f153bf6a58fd63832f6ee175638d049}: could not compact, repartitioning keyspace failed: Layer needs downloading

Stack backtrace:
   0: pageserver::tenant::timeline::PageReconstructResult<T>::no_ondemand_download
             at /__w/neon/neon/pageserver/src/tenant/timeline.rs:467:59
   1: pageserver::pgdatadir_mapping::<impl pageserver::tenant::timeline::Timeline>::collect_keyspace::{{closure}}
             at /__w/neon/neon/pageserver/src/pgdatadir_mapping.rs:506:41
      <core::future::from_generator::GenFuture<T> as core::future::future::Future>::poll
             at /rustc/e092d0b6b43f2de967af0887873151bb1c0b18d3/library/core/src/future/mod.rs:91:19
      pageserver::tenant::timeline::Timeline::repartition::{{closure}}
             at /__w/neon/neon/pageserver/src/tenant/timeline.rs:2161:50
      <core::future::from_generator::GenFuture<T> as core::future::future::Future>::poll
             at /rustc/e092d0b6b43f2de967af0887873151bb1c0b18d3/library/core/src/future/mod.rs:91:19
   2: pageserver::tenant::timeline::Timeline::compact::{{closure}}
             at /__w/neon/neon/pageserver/src/tenant/timeline.rs:700:14
      <core::future::from_generator::GenFuture<T> as core::future::future::Future>::poll
             at /rustc/e092d0b6b43f2de967af0887873151bb1c0b18d3/library/core/src/future/mod.rs:91:19
   3: <tracing::instrument::Instrumented<T> as core::future::future::Future>::poll
             at /github/home/.cargo/registry/src/github.com-1ecc6299db9ec823/tracing-0.1.37/src/instrument.rs:272:9
   4: pageserver::tenant::Tenant::compaction_iteration::{{closure}}
             at /__w/neon/neon/pageserver/src/tenant.rs:1232:85
      <core::future::from_generator::GenFuture<T> as core::future::future::Future>::poll
             at /rustc/e092d0b6b43f2de967af0887873151bb1c0b18d3/library/core/src/future/mod.rs:91:19
      pageserver::tenant_tasks::compaction_loop::{{closure}}::{{closure}}
             at /__w/neon/neon/pageserver/src/tenant_tasks.rs:76:62
      <core::future::from_generator::GenFuture<T> as core::future::future::Future>::poll
             at /rustc/e092d0b6b43f2de967af0887873151bb1c0b18d3/library/core/src/future/mod.rs:91:19
      pageserver::tenant_tasks::compaction_loop::{{closure}}
             at /__w/neon/neon/pageserver/src/tenant_tasks.rs:91:6
```
2022-12-23 15:39:59 +02:00
Sergey Melnikov
c01f92c081 Fully remove old staging deploy (#3191) 2022-12-22 20:09:45 +01:00
Sergey Melnikov
7bc17b373e Fix calculate-deploy-targets (#3189)
Was broken in https://github.com/neondatabase/neon/pull/3180
2022-12-22 16:28:36 +01:00
Arthur Petukhovsky
72ab104733 Move zenith-1-sk-3 to zenith-1-sk-4 (#3164) 2022-12-22 15:21:53 +00:00
Sergey Melnikov
5a496d82b0 Do not deploy storage and proxies to old staging (#3180)
We fully migrated out, this nodes will be soon decommissioned
2022-12-22 15:37:17 +01:00
Anastasia Lubennikova
8544c59329 Fix flaky test_metrics_collection.py
Only check that all metrics are present on the first request,
because pageserver doesn't send unchanged metrics.
2022-12-22 16:28:11 +02:00
Anastasia Lubennikova
63eb87bde3 Set default metric_collection_interval to 10 min,
which is more reasonable for real usage
2022-12-22 16:24:09 +02:00
Vadim Kharitonov
9b71215906 Simplify some functions in compute_tools and fix typo errors in func
name
2022-12-22 15:05:43 +01:00
Stas Kelvich
5a762744c7 Collect core dump backtraces in compute_ctl.
Scan core dumps directory on exit. In case of existing core dumps
call gdb/lldb to get a backtrace and log it. By default look for
core dumps in postgres data directory as core.<pid>. That is how
core collection is configured in our k8s nodes (and a reasonable
convention in general).
2022-12-22 16:01:49 +02:00
Alexander Bayandin
201fedd65c tpch-compare: use rust image instead of rustlegacy (#3182) 2022-12-22 12:40:39 +00:00
Sergey Melnikov
707d1c1c94 Fix vm-compute-image upload to dockerhub (#3181) 2022-12-22 13:34:16 +01:00
Kirill Bulatov
fca25edae8 Fix 1.66 Clippy warnings (#3178)
1.66 release speeds up compile times for over 10% according to tests.

Also its Clippy finds plenty of old nits in our code:
* useless conversion, `foo as u8` where `foo: u8` and similar, removed
`as u8` and similar
* useless references and dereferenced (that were automatically adjusted
by the compiler), removed various `&` and `*`
* bool -> u8 conversion via `if/else`, changed to `u8::from`
* Map `.iter()` calls where only values were used, changed to
`.values()` instead

Standing out lints:
* `Eq` is missing in our protoc generated structs. Silenced, does not
seem crucial for us.
* `fn default` looks like the one from `Default` trait, so I've
implemented that instead and replaced the `dummy_*` method in tests with
`::default()` invocation
* Clippy detected that
```
if retry_attempt < u32::MAX {
    retry_attempt += 1;
}
```
is a saturating add and proposed to replace it.
2022-12-22 14:27:48 +02:00
Sergey Melnikov
f5f1197e15 Build vm-compute-node images (#3174) 2022-12-22 11:25:56 +01:00
Heikki Linnakangas
7ff591ffbf On-Demand Download
The code in this change was extracted from #2595 (Heikki’s on-demand
download draft PR).

High-Level Changes

- New RemoteLayer Type
- On-Demand Download As An Effect Of Page Reconstruction
- Breaking Semantics For Physical Size Metrics

There are several follow-up work items planned.
Refer to the Epic issue on GitHub: https://github.com/neondatabase/neon/issues/2029

closes https://github.com/neondatabase/neon/pull/3013

Co-authored-by: Kirill Bulatov <kirill@neon.tech>
Co-authored-by: Christian Schwarz <christian@neon.tech>

New RemoteLayer Type
====================

Instead of downloading all layers during tenant attach, we create
RemoteLayer instances for each of them and add them to the layer map.

On-Demand Download As An Effect Of Page Reconstruction
======================================================

At the heart of pageserver is Timeline::get_reconstruct_data(). It
traverses the layer map until it has collected all the data it needs to
produce the page image. Most code in the code base uses it, though many
layers of indirection.

Before this patch, the function would use synchronous filesystem IO to
load data from disk-resident layer files if the data was not cached.

That is not possible with RemoteLayer, because the layer file has not
been downloaded yet. So, we do the download when get_reconstruct_data
gets there, i.e., “on demand”.

The mechanics of how the download is done are rather involved, because
of the infamous async-sync-async sandwich problem that plagues the async
Rust world. We use the new PageReconstructResult type to work around
this. Its introduction is the cause for a good amount of code churn in
this patch. Refer to the block comment on `with_ondemand_download()`
for details.

Breaking Semantics For Physical Size Metrics
============================================

We rename prometheus metric pageserver_{current,resident}_physical_size to
reflect what this metric actually represents with on-demand download.
This intentionally BREAKS existing grafana dashboard and the cost model data
pipeline. Breaking is desirable because the meaning of this metrics has changed
with on-demand download. See
 https://docs.google.com/document/d/12AFpvKY-7FZdR5a4CaD6Ir_rI3QokdCLSPJ6upHxJBo/edit#
for how we will handle this breakage.

Likewise, we rename the new billing_metrics’s PhysicalSize => ResidentSize.
This is not yet used anywhere, so, this is not a breaking change.

There is still a field called TimelineInfo::current_physical_size. It
is now the sum of the layer sizes in layer map, regardless of whether
local or remote. To compute that sum, we added a new trait method
PersistentLayer::file_size().

When updating the Python tests, we got rid of
current_physical_size_non_incremental. An earlier commit removed it from
the OpenAPI spec already, so this is not a breaking change.

test_timeline_size.py has grown additional assertions on the
resident_physical_size metric.
2022-12-21 19:16:39 +01:00
Christian Schwarz
31543c4acc refactor: make update_gc_info and transitive callers async
This is so that in the next commit, we can add a retry_get to
find_lsn_for_timestamp.
2022-12-21 19:16:39 +01:00
Christian Schwarz
1da03141a7 refactor: make Layer::local_path return Option<PathBuf> instead of PathBuf
This is in preparation for RemoteLayer, which by definition doesn't have
a local path.
2022-12-21 19:16:39 +01:00
Christian Schwarz
2460987328 no-op: pgdatadir_mapping: qualified use of anyhow::Result 2022-12-21 19:16:39 +01:00
Christian Schwarz
749a2f00d7 no-op: distinguished error types for Timeline::get_reconstruct_data 2022-12-21 19:16:39 +01:00
Christian Schwarz
e94b451430 no-op: storage_layer::Iter::{iter, key_iter}: make them fallible 2022-12-21 19:16:39 +01:00
Christian Schwarz
f5b424b96c no-op: type aliases for Layer::iter and Layer::key_iter return types
Not needed by anything right now, but the next commit adds a `Result<>`
around iter() and key_iter()'s return types, and that makes clippy
complain.
2022-12-21 19:16:39 +01:00
Christian Schwarz
91e8937112 no-op: add Timeline::myself member 2022-12-21 19:16:39 +01:00
Christian Schwarz
f637f6e77e stop exposing non-incremental sizes in API spec
Console doesn't use them, so, don't expose them.

refs https://github.com/neondatabase/cloud/pull/3358
refs https://github.com/neondatabase/cloud/pull/3366
2022-12-21 15:37:29 +01:00
Christian Schwarz
a3f0111726 LayerMap::search is actually infallible
Found this while investigating failure modes of on-demand download.

I think it's a nice cleanup.
2022-12-21 12:14:55 +01:00
Alexander Bayandin
486a985629 mypy: enable check_untyped_defs (#3142)
Enable `check_untyped_defs` and fix warnings.
2022-12-21 09:38:42 +00:00
Kirill Bulatov
5d4774491f Exclude macOs fork files from tar processing (#3165)
When running tenant relocation tests, we use
https://github.com/neondatabase/neon/blob/main/scripts/export_import_between_pageservers.py
script to export and import basebackup between pageservers.

When pageserver runs on macOs and reuses the `tar` library for creating
the basebackup archive, it gets the fork files

https://superuser.com/questions/61185/why-do-i-get-files-like-foo-in-my-tarball-on-os-x

We might be able to fix our code to fix the issue, but if we get such
(valid) archive as an input, we
[fail](https://github.com/neondatabase/neon/pull/3013#issuecomment-1360093900).
This does not seem optimal, given that we can ignore such files.
2022-12-21 00:52:07 +02:00
Anastasia Lubennikova
4235f97c6a Implement consumption metrics collection.
Add new background job to collect billing metrics for each tenant and
send them to the HTTP endpoint.
Metrics are cached, so we don't send non-changed metrics.

Add metric collection config parameters:
metric_collection_endpoint (default None, i.e. disabled)
metric_collection_interval (default 60s)

Add test_metric_collection.py to test metric collection
and sending to the mocked HTTP endpoint.

Use port distributor in metric_collection test

review fixes: only update cache after metrics were send successfully, simplify code

disable metric collection if metric_collection_endpoint is not provided in config
2022-12-20 22:59:52 +02:00
Heikki Linnakangas
43fd89eaa7 Improve comments, formatting around layer_removal_cs lock. 2022-12-20 20:50:55 +02:00
Heikki Linnakangas
9a049aa846 Move code from tenant_mgr::delete_timeline to Tenant::delete_timeline.
It's better to request the tasks to shut down only after setting the
timeline state to Stopping. Otherwise, it's possible that a new task
spawns after we have waited for the existing tasks to shut down, but
before we have changed the state. We would fail to wait for them.

Feels nicer from a readability point of view too.
2022-12-20 20:50:55 +02:00
Kirill Bulatov
0c71dc627b Tidy up walreceiver logs (#3147)
Closes https://github.com/neondatabase/neon/issues/3114

Improves walrecevier logs and remove `clone()` calls.
2022-12-20 15:54:02 +02:00
Heikki Linnakangas
8e2edfcf39 Retry remote downloads.
Remote operations fail sometimes due to network failures or other
external reasons. Add retry logic to all the remote downloads, so that
a transient failure at pageserver startup or tenant attach doesn't
cause the whole tenant to be marked as Broken.

Like in the uploads retry logic, we print the failure to the log as a
WARNing after three retries, but keep retrying. We will retry up to 10
times now, before returning the error to the caller.

To test the retries, I created a new RemoteStorage wrapper that simulates
failures, by returning an error for the first N times that a remote
operation is performed. It can be enabled by setting a new
"test_remote_failures" option in the pageserver config file.

Fixes #3112
2022-12-20 14:27:24 +02:00
Heikki Linnakangas
4cda9919bf Use Self to emphasize this is a constructor 2022-12-20 14:27:24 +02:00
Heikki Linnakangas
eefb1d46f4 Replace Timeline::checkpoint with Timeline::freeze_and_flush
The new Timeline::freeze_and_flush function is equivalent to calling
Timeline::checkpoint(CheckpointConfig::Flush). There were only one
non-test caller that used CheckpointConfig::Forced, so replace that
with a call to the new Timeline::freeze_and_flush, followed by an
explicit call to Timeline::compact.

That only caller was to handle the mgmt API's 'checkpoint' endpoint.
Perhaps we should split that into separate 'flush' and 'compact'
endpoints too, but I didn't go that far yet.
2022-12-20 13:45:47 +02:00
Kirill Bulatov
2c11f1fa95 Use separate broker per Python test (#3158)
And add its logs to Allure reports per test
2022-12-20 11:06:21 +00:00
Sergey Melnikov
cd7fdf2587 Remove neon-stress configs (#3121) 2022-12-20 12:03:42 +01:00
Heikki Linnakangas
7b0d28bbdc Update outdated comment on Tenant::gc_iteration.
Commit 6dec85b19d remove the `checkpoint_before_gc` argument, but failed
to update the comment. Remove its description, and while we're at it, try
to explain better how the `horizon` and `pitr` arguments are used.
2022-12-20 12:52:26 +02:00
Heikki Linnakangas
6ac9ecb074 Remove a few unnecessary checkpoint calls from unit tests.
The `make_some_layers' function performs a checkpoint already.
2022-12-20 12:52:26 +02:00
Kirill Bulatov
56d8c25dc8 Revert "Use local brokers"
This reverts commit f9f57e211a.
2022-12-20 01:57:36 +02:00
Kirill Bulatov
f9f57e211a Use local brokers 2022-12-20 01:55:59 +02:00
Heikki Linnakangas
39f58038d1 Don't upload index file in compaction, if there was nothing to do. (#3149)
This splits the storage_sync2::schedule_index_file into two (public)
functions:
1. `schedule_index_upload_for_metadata_update`, for when the metadata
(e.g. disk_consistent_lsn or last_gc_cutoff) has changed, and

2. `schedule_index_upload_for_file_changes`, for when layer file uploads
or deletions have been scheduled.

We now keep track of whether there have been any uploads or deletions
since the last index-file upload, and skip the upload in
`schedule_index_upload_for_file_changes` if there haven't been any
changes. That allows us to call the function liberally in timeline.rs,
whenever layer file uploads or deletions might've been scheduled,
without starting a lot of unnecessary index file uploads.

GC was covered earlier by commit c262390214, but that missed that we
have the same problem with compaction.
2022-12-19 23:58:24 +02:00
Kirill Bulatov
3735aece56 Safekeeper: Always use workdir as a full path 2022-12-19 21:43:36 +02:00
Kirill Bulatov
9ddd1d7522 Stop all storage nodes on startup failure 2022-12-19 21:43:36 +02:00
Kirill Bulatov
49a211c98a Add neon_local test 2022-12-19 21:43:36 +02:00
Christian Schwarz
7db018e147 [4/4] the fix: do not leak spawn_blocking() tasks from logical size calculation code
- Refactor logical_size_calculation_task, moving the pieces that are
  specific to try_spawn_size_init_task into that function.
  This allows us to spawn additional size calculation tasks that are not
  init size calculation tasks.

  - As part of this refactoring, stop logging cancellations as errors.
    They are part of regular operations.
    Logging them as errors was inadvertently introduced in earlier commit

      427c1b2e9661161439e65aabc173d695cfc03ab4
      initial logical size calculation: if it fails, retry on next call

- Change tenant size model request code to spawn task_mgr tasks using
  the refactored logical_size_calculation_task function.
  Using a task_mgr task ensures that the calculation cannot outlive
  the timeline.
  - There are presumably still some subtle race conditions if a size
    requests comes in at exactly the same time as a detach / delete
    request.
  - But that's the concern of diferent area of the code (e.g., tenant_mgr)
    and requires holistic solutions, such as the proposed TenantGuard.

- Make size calculation cancellable using CancellationToken.
  This is more of a cherry on top.
  NB: the test code doesn't use this because we _must_ return from
  the failpoint, because the failpoint lib doesn't allow to just
  continue execution in combination with executing the closure.

This commit fixes the tests introduced earlier in this patch series.
2022-12-19 16:14:58 +01:00
Christian Schwarz
38ebd6e7a0 [3/4] make initial size estimation task sensitive to task_mgr shutdown requests
This exacerbates the problem pointed out in the previous commit.
Why? Because with this patch, deleting a timeline also exposes the issue.

Extend the test to expose the problem.
2022-12-19 16:14:58 +01:00
Christian Schwarz
40a3d50883 [2/4] add test to show that tenant detach makes us leak running size calculation task 2022-12-19 16:14:58 +01:00
Christian Schwarz
ee2b5dc9ac [1/4] initial logical size calculation: if it fails, retry on next call
Before this patch, if the task fails, we would not reset
self.initial_size_computation_started.
So, if it fails, we will return the approximate value forever.

In practice, it probably never failed because the local filesystem
is quite reliable.

But with on-demand download, the logical size calculation may need
to download layers, which is more likely to fail at times.
There will be internal retires with a timeout, but eventually,
the downloads will give up.
We want to retry in those cases.

While we're at it, also change the handling of the timeline state
watch so that we treat it as an error. Most likely, we'll not be
called again, but if we are, retrying is the right thing.
2022-12-19 16:14:58 +01:00
Christian Schwarz
c785a516aa remove TimelineInfo.{Remote,Local} along with their types
follow-up of https://github.com/neondatabase/neon/pull/2615
which is neon.git: 538876650a

must be deployed after cloud.git change
https://github.com/neondatabase/cloud/issues/3232

fixes https://github.com/neondatabase/neon/issues/3041
2022-12-19 14:37:40 +01:00
Heikki Linnakangas
e23d5da51c Tidy up and add comments to the pageserver startup code.
To make it more readable.
2022-12-19 14:03:22 +02:00
Alexander Bayandin
12e6f443da test_perf_pgbench: switch to server-side data generation (#3058)
To offload the network and reduce its impact, I suggest switching to
server-side data generation for the pgbench initialize workflow.
2022-12-18 00:02:04 +00:00
Dmitry Ivanov
61194ab2f4 Update rust-postgres everywhere
I've rebased[1] Neon's fork of rust-postgres to incorporate
latest upstream changes (including dependabot's fixes),
so we need to advance revs here as well.

[1] https://github.com/neondatabase/rust-postgres/commits/neon
2022-12-17 00:26:10 +03:00
MMeent
3514e6e89a Use neon_nblocks instead of get_cached_relsize (#3132)
This prevents us from overwriting all blocks of a relation when we
extend the relation without first caching the size - get_cached_relsize
does not guarantee a correct result when it returns `false`.
2022-12-16 21:14:57 +01:00
Dmitry Ivanov
83baf49487 [proxy] Forward compute connection params to client
This fixes all kinds of problems related to missing params,
like broken timestamps (due to `integer_datetimes`).

This solution is not ideal, but it will help. Meanwhile,
I'm going to dedicate some time to improving connection machinery.

Note that this **does not** fix problems with passing certain parameters
in a reverse direction, i.e. **from client to compute**. This is a
separate matter and will be dealt with in an upcoming PR.
2022-12-16 21:37:50 +03:00
Alexander Bayandin
64775a0a75 test_runner/performance: fix flush for NeonCompare (#3135)
Fix performance tests:
```
AttributeError: 'NeonCompare' object has no attribute 'pageserver_http'
```
2022-12-16 17:45:38 +00:00
Joonas Koivunen
c86c0c08ef task_mgr: use CancellationToken instead of shutdown_rx (#3124)
this should help us in the future to have more freedom with spawning
tasks and cancelling things, most importantly blocking tasks (assuming
the CancellationToken::is_cancelled is performant enough).
CancellationToken allows creation of hierarchical cancellations, which
would also simplify the task_mgr shutdown operation, rendering it
unnecessary.
2022-12-16 17:19:47 +02:00
Alexander Bayandin
8d39fcdf72 pgbench-compare: don't run neon-captest-new (#3130)
Do not run Nightly Benchmarks on `neon-captest-new`.
This is a temporary solution to avoid spikes in the storage we consume
during the test run. To collect data for the default instance, we could
run tests weekly (i.e. not daily).
2022-12-16 13:23:36 +00:00
Joonas Koivunen
b688a538e3 fix(remote_storage): use cached credentials (#3128)
IMDSv2 has limits, and if we query it on every s3 interaction we are
going to go over those limits. Changes the s3_bucket client
configuration to use:
- ChainCredentialsProvider to handle env variables or imds usage
- LazyCachingCredentialsProvider to actually cache any credentials

Related: https://github.com/awslabs/aws-sdk-rust/issues/629
Possibly related: https://github.com/neondatabase/neon/issues/3118
2022-12-16 13:40:01 +02:00
Arseny Sher
e14bbb889a Enable broker client keepalives. (#3127)
Should fix stale connections.

ref https://github.com/neondatabase/neon/issues/3108
2022-12-16 11:55:12 +02:00
Heikki Linnakangas
c262390214 Don't upload index file when GC doesn't remove anything.
I saw an excessive number of index file upload operations in
production, even when nothing on the timeline changes. It was because
our GC schedules index file upload if the GC cutoff LSN is advanced,
even if the GC had nothing else to do. The GC cutoff LSN marches
steadily forwards, even when there is no user activity on the
timeline, when the cutoff is determined by the time-based PITR
interval setting. To dial that down, only schedule index file upload
when GC is about to actually remove something.
2022-12-16 11:05:55 +02:00
Heikki Linnakangas
6dec85b19d Redefine the timeline_gc API to not perform a forced compaction
Previously, the /v1/tenant/:tenant_id/timeline/:timeline_id/do_gc API
call performed a flush and compaction on the timeline before
GC. Change it not to do that, and change all the tests that used that
API to perform compaction explicitly.

The compaction happens at a slightly different point now. Previously,
the code performed the `refresh_gc_info_internal` step first, and only
then did compaction on all the timelines. I don't think that was what
was originally intended here. Presumably the idea with compaction was
to make some old layer files available for GC. But if we're going to
flush the current in-memory layer to disk, surely you would want to
include the newly-written layer in the compaction too. I guess this
didn't make any difference to the tests in practice, but in any case,
the tests now perform the flush and compaction before any of the GC
steps.

Some of the tests might not need the compaction at all, but I didn't
try hard to determine which ones might need it. I left it out from a
few tests that intentionally tested calling do_gc with an invalid
tenant or timeline ID, though.
2022-12-16 11:05:55 +02:00
Arseny Sher
70ce01d84d Deploy broker with L4 LB in new env. (#3125)
Seems to be fixing issue with missing keepalives.
2022-12-15 22:42:30 +01:00
Christian Schwarz
b58f7710ff seqwait: different error messages per variant
Would have been handy to get slightly more details in
https://github.com/neondatabase/neon/issues/3109

refs https://github.com/neondatabase/neon/issues/3109
2022-12-15 18:19:43 +01:00
MMeent
807b110946 Update Makefile configuration: (#3011)
- Use only one templated section for most postgres-versioned steps
- Clean up neon_walredo, too, when running neon-pg-ext-clean
- Depend on the various cleanup steps for `clean` instead of manually
executing those cleanup steps.
2022-12-15 17:06:17 +00:00
Christian Schwarz
397b60feab common abstraction for waiting for SK commit_lsn to reach PS 2022-12-15 11:50:39 +01:00
Christian Schwarz
10cd64cf8d make TaskHandle::next_task_event cancellation-safe
If we get cancelled before jh.await returns we've take()n the join handle but
drop the result on the floor.

Fix it by setting self.join_handle = None after the .await

fixes https://github.com/neondatabase/neon/issues/3104
2022-12-15 10:26:17 +01:00
Christian Schwarz
bf3ac2be2d add remote_physical_size metric
We do the accounting exclusively after updating remote IndexPart successfully.
This is cleaner & more robust than doing it upon completion of
individual layer file uploads / deletions since we can uset .set()
insteaf of add()/sub().

NB: Originally, this work was intended to be part of #3013 but it
turns out that it's completely orthogonal.
So, spin it out into this PR for easier review.
Since this change is additive, it won't break anything.
2022-12-15 09:48:35 +01:00
Sergey Melnikov
c04c201520 Push proxy metrics to Victoria Metrics (#3106) 2022-12-14 21:28:14 +01:00
Christian Schwarz
4132ae9dfe always remove RemoteTimelineClient's metrics when dropping it 2022-12-14 19:25:29 +01:00
Alexander Bayandin
8fcba150db test_seqscans: temporarily disable remote test (#3101)
Temporarily disable `test_seqscans` for remote projects; they acquire
too much space and time. We can try to reenable it back after switching
to per-test projects.
2022-12-14 18:05:05 +00:00
Dmitry Rodionov
df09d0375b ignore metadata_backup files in index_part 2022-12-14 19:00:19 +03:00
Vadim Kharitonov
62f6e969e7 Fix helm value for proxy 2022-12-14 16:41:26 +01:00
Kirill Bulatov
4d201619ed Remove large database files after every test suite (#3090)
Closes https://github.com/neondatabase/neon/issues/1984 
Closes https://github.com/neondatabase/neon/pull/2830

A follow-up of https://github.com/neondatabase/neon/pull/2830, I've
noticed that benchmarks failed again due to out of space issues.

Removes most of the pageserver and safekeeper files from disk after
every pytest suite run.

```
$ poetry run pytest -vvsk "test_tenant_redownloads_truncated_file_on_startup[local_fs]" 
# ...
$ du -h test_output/test_tenant_redownloads_truncated_file_on_startup\[local_fs\]  
# ...
104K    test_output/test_tenant_redownloads_truncated_file_on_startup[local_fs]

$ poetry run pytest -vvsk "test_tenant_redownloads_truncated_file_on_startup[local_fs]" --preserve-database-files
# ...
$ du -h test_output/test_tenant_redownloads_truncated_file_on_startup\[local_fs\]  
# ...
123M    test_output/test_tenant_redownloads_truncated_file_on_startup[local_fs]
```

Co-authored-by: Bojan Serafimov <bojan.serafimov7@gmail.com>
2022-12-14 13:09:08 +00:00
Alexander Bayandin
d3787f9b47 neon-project-create/delete: print project id to stdout (#3073)
Print project_id to GitHub Actions stdout
2022-12-14 13:04:04 +00:00
Shany Pozin
ada5b7158f Fix Issue #3014 (#3059)
* TenantConfigRequest now supports tenant_id as hex string input instead
of bytes array

* Config file is truncated in each creation/update
2022-12-14 14:09:16 +02:00
Arseny Sher
f8ab5ef3b5 Update broker endpoint for prod-us-west-2. (#3095) 2022-12-14 12:58:12 +01:00
Sergey Melnikov
827ee10b5a Disable neon-stress deploy (#3093) 2022-12-14 01:51:42 +01:00
Alexander Bayandin
c819b699be Nightly Benchmark: run neon-captest-reuse from staging (#3086)
The project has been migrated (now it is `restless-king-632302`), and
now we should run tests from staging runners.

Test run:
https://github.com/neondatabase/neon/actions/runs/3686865543/jobs/6241367161

Ref https://github.com/neondatabase/cloud/issues/2836
2022-12-13 23:02:45 +00:00
Sergey Melnikov
228f9e4322 Use default folder for ansible collections (#3092) 2022-12-13 23:59:49 +01:00
Sergey Melnikov
826214ae56 Force ansible-galaxy to also use local ansible.cfg (#3091) 2022-12-13 21:06:18 +01:00
Sergey Melnikov
b39d6126bb Force ansible to use local ansible.cfg (#3089) 2022-12-13 21:57:39 +03:00
Vadim Kharitonov
0bc488b723 Add sentry environment for pageserver and safekeepers in new region
(us-west-2)
2022-12-13 16:26:28 +01:00
Christian Schwarz
0c915dcb1d Timeline::download_missing: fix handling of mismatched layer size
Before this patch, when we decide to rename a layer file to backup
because of layer file size mismatch, we would not remove the layer from
the layer map, but remote the on-disk file.

Because we re-download the file immediately after, we simply end up with
two layer objects in memory that reference the same file in the layer
map. So, GetPage() would work fine until one of the layers gets
delete()'d. The other layer's delete() would then fail.

Future work: prevent insertion of the same layer at LayerMap level
so that we notice such bugs sooner.
2022-12-13 15:53:08 +01:00
Alexander Bayandin
feb07ed510 deploy (old): replace actions/setup-python@v4 with ansible image (#3081)
Replace actions/setup-python@v4 with the ansible image to fix
```
Version 3.10 was not found in the local cache
Error: The version '3.10' with architecture 'x64' was not found for this operating system.
```
2022-12-13 14:01:29 +00:00
Vadim Kharitonov
4603a4cbb5 Bypass SENTRY_ENVIRONMENT variable in order to filter panics in sentry
by environment.
2022-12-13 14:52:04 +01:00
Kirill Bulatov
02c1c351dc Create initial timeline without remote storage (#3077)
Removes the race during pageserver initial timeline creation that lead to partial layer uploads.
This race is only reproducible in test code, we do not create initial timelines in cloud (yet, at least), but still nice to remove the non-deterministic behavior.
2022-12-13 15:42:59 +02:00
Dmitry Ivanov
607c0facfc [proxy] Propagate more console API errors to the user
This patch aims to fix some of the inconsistencies in error reporting,
for example "Internal error" or "Console request failed" instead of
"password authentication failed for user '<NAME>'".
2022-12-13 16:16:31 +03:00
Sergey Melnikov
e5d523c86a Add new us-west-2 region (#3071) 2022-12-13 14:11:40 +01:00
Kirill Bulatov
7a16cde737 Remove useless pub trait method (#3076) 2022-12-13 12:06:20 +00:00
Arseny Sher
d6325aa79d Disable body size limit in ingress broker deploy.
We have infinite streams.
2022-12-13 13:06:30 +03:00
Arseny Sher
544777e86b Fix storage_broker deploy typo. 2022-12-13 10:57:26 +03:00
Arseny Sher
e2ae4c09a6 Put e2e tag back.
32662ff1c4 required running e2e tests on patched branch of cloud repo; not
that it is merged, put the tag back.
2022-12-13 09:53:22 +03:00
Christian Schwarz
22ae67af8d refactor: use new type LayerFileName when referring to layer file names in PathBuf/RemotePath (#3026)
refactor: use new type LayerFileName when referring to layer file names in PathBuf/RemotePath

Before this patch, we would sometimes carry around plain file names in
`Path` types and/or awkwardly "rebase" paths to have a unified
representation of the layer file name between local and remote.

This patch introduces a new type `LayerFileName` which replaces the use
of `Path` / `PathBuf` / `RemotePath` in the `storage_sync2` APIs.

Instead of holding a string, it contains the parsed representation of
the image and delta file name.
When we need the file name, e.g., to construct a local path or
remote object key, we construct the name ad-hoc.

`LayerFileName` is also serde {Dese,Se}rializable, and in an initial   
version of this patch, it was supposed to be used directly inside      
`IndexPart`, replacing `RemotePath`.                                   
However,                                                               
  commit 3122f3282f                      
      Ignore backup files (ones with .n.old suffix) in download_missing
fixed handling of `*.old` backup file names in IndexPart, and we need  
to carry that behavior forward.                                        
The solution is to remove `*.old` backup files names during            
deserialization. When we re-serialize the IndexPart, the `*.old` file  
will be gone.                                                          
This leaks the `.old` file in the remote storage, but makes it safe    
to clean it up later.           

There is additional churn by a preliminary refactoring that got squashed
into this change:

   split off LayerMap's needs from trait Layer into super trait

That refactoring renames `Layer` to `PersistentLayer` and splits off a subset
of the functions into a super-trait called `Layer`.
The upser trait implements just the functions needed by `LayerMap`, whereas
`PersisentLayer` adds the context of the pageserver.

The naming is imperfect as some functions that reside in `PersistentLayer`
have nothing persistence-specific to it. But it's a step in the right direction.
2022-12-13 01:27:59 +02:00
Rory de Zoete
d1edc8aa00 Deprecate old runner for deploy job (#3070)
As we plan to no longer use them

Co-authored-by: Rory de Zoete <rdezoete@RorysMacStudio.fritz.box>
Co-authored-by: Rory de Zoete <rdezoete@Rorys-Mac-Studio.fritz.box>
2022-12-12 16:55:40 +01:00
Arseny Sher
f013d53230 Switch to clap derive API in safekeeper.
Less lines and easier to read/modify. Practically no functional changes.
2022-12-12 16:25:23 +03:00
Kirill Bulatov
0aa2f5c9a5 Regroup CI testing (#3049)
Part of https://github.com/neondatabase/neon/pull/2410 and
https://github.com/neondatabase/neon/pull/2407

* adds `hashFiles('rust-toolchain.toml')` into Rust cache keys, thus
removing one of the manual steps to do when upgrading rustc
* copies Python and Rust style checks from the `codestyle.yml` workflow
* adjusts shell defaults in the main workflow
* replaces `codestyle.yml` with a `neon_extra_builds.yml` worlflow

The new workflow runs on commits to `main` (`codestyle.yml` was run per
PR), and runs two custom builds on GH agents:

* macos-latest, to ensure the entire project compiles on it (no tests
run)

There were no frequent breakages on macOs in our builds, so we can check
it rarely without making every storage PR to wait for it to complete.
The updated mac build use release builds now, so presumably should work
a bit faster due to overall smaller files to cache between builds.

* ubuntu-latest, without caches, to produce full compilation stats for
Rust builds and upload it as an artifact to GitHub

Old `clippy build --timings` stats were collected from the builds that
use caches and incremental calculation hence never could produce a full
report, it got removed.
2022-12-12 12:58:55 +02:00
Vadim Kharitonov
26f4ff949a Add sentry to storage_broker. 2022-12-12 13:30:16 +03:00
Arseny Sher
a1fd0ba23b set tag to make proper e2e tests run 2022-12-12 13:30:16 +03:00
Arseny Sher
32662ff1c4 Replace etcd with storage_broker.
This is the replacement itself, the binary landed earlier. See
docs/storage_broker.md.

ref
https://github.com/neondatabase/neon/pull/2466
https://github.com/neondatabase/neon/issues/2394
2022-12-12 13:30:16 +03:00
Arseny Sher
249d77c720 Deploy broker with L4 LB on old envs.
To avoid having to configure MAX_CONCURRENT_STREAMS on L7 LB (as well as TLS &
public DNS).
2022-12-12 13:00:37 +03:00
Alexander Bayandin
0f445827f5 test_seqscans: increase table size for remote test (#3057)
Increase table size four times to fix the following error:

```
______________________ test_seqscans[remote-100000-100-0] ______________________
test_runner/performance/test_seqscans.py:57: in test_seqscans
    assert int(shared_buffers) < int(table_size)
E   assert 536870912 < 181239808
E    +  where 536870912 = int(536870912)
E    +  and   181239808 = int(181239808)
```

536870912 / 181239808 ≈ 2.96
2022-12-10 23:35:05 +00:00
Kirill Bulatov
700a36ee6b Wait for certain tenant status in the remote storage test (#3055)
Closes https://github.com/neondatabase/neon/issues/3052

From what I could understand from the PR, we did not wait enough before
the attach failed.
Extended the wait period a bit and put a check for a status instead of
plain `sleep` to fail if we don't get the expected status.
2022-12-10 10:18:55 +02:00
Joonas Koivunen
b8a5664fb9 test: kill spawned postgres (#3054)
Fixes #2604.
2022-12-10 00:35:05 +02:00
Kirill Bulatov
861dc8e64e Remove redundant once_cell usages 2022-12-09 22:14:32 +02:00
Arseny Sher
4d6137e0e6 Try to fix docker image tag in broker deploy. 2022-12-09 20:43:54 +03:00
Lassi Pölönen
8684b1b582 Reduce the storage-broker deployment timeout to 5 minutes. 15 minutes is (#3047)
15 minutes is way too long, at least at this point and we want to see
the possible errors quicker. Hence drop it to 5min to have some safety
margin.
2022-12-09 14:37:53 +00:00
MMeent
3321eea679 Fix for #3043 (#3048) 2022-12-09 14:26:05 +01:00
Arseny Sher
28667ce724 Make safekeeper exit code 0.
We don't have any useful graceful shutdown mode, so immediate one is normal.

https://github.com/neondatabase/neon/issues/2956
2022-12-09 12:35:36 +03:00
Lassi Pölönen
6c8b2af1f8 Change storage brokers to internal subdomain (#3039)
There's a bit of a clash with the naming, so dedicate a subdomain for
storage brokers. Back to subdomain separation just to be consistent.
2022-12-09 11:12:42 +02:00
Dmitry Rodionov
3122f3282f Ignore backup files (ones with .n.old suffix) in download_missing
This is rather a hack to resolve immediate issue:
https://github.com/neondatabase/neon/issues/3024

Properly cleaning this file from index part requires changes to
initialization of remote queue. Because we need to clean it up earlier
than we start warking around files.

With on-demand there will be no walk around layer files becase
download_missing is no longer needed, so I believe it will be
natural to unify this with load_layer_map
2022-12-09 12:07:50 +03:00
MMeent
4752385470 Update PostgreSQL to latest vendored releases (#3037)
Several fixes are included, with among others:

- Prefetching for index bulkdelete calls (e.g. during vacuum), plus v14 compiler warning fix
- A fix for setting LSN on heap pages while setting vm bits
- Some style updates that were lost in the previous wave (v15 only)
2022-12-09 11:02:23 +02:00
Alexander Bayandin
9747e90f3a Nightly Benchmarks: Move from captest to staging (#2838)
Migrate Nightly Benchmarks from captest to staging.

- Migrate GitHub Workflows
- Replace `zenith-benchmarker` with regular runners
- Remove `environment` parameter from Neon GitHub Actions, add
`postgres_version`
- The only job left on captest is `neon-captest-reuse`, which will be
moved to staging after its project migration.

Ref https://github.com/neondatabase/cloud/issues/2836
2022-12-08 22:28:25 +00:00
Alexander Bayandin
a19c487766 Nightly Benchmarks: add TPC-H benchmark (#2978)
Ref: https://www.tpc.org/tpch/
2022-12-08 15:32:49 +00:00
Alexander Bayandin
5c701f9a75 merge-allure-report: create report even if benchmarks is skipped (#3029) 2022-12-08 15:13:40 +00:00
dependabot[bot]
4de4217247 Bump certifi from 2022.9.24 to 2022.12.7 (#3033) 2022-12-08 14:50:59 +00:00
Arseny Sher
2baf6c09a8 Some more allowed pageserver errors.
https://neondb.slack.com/archives/C033RQ5SPDH/p1670497680293859
2022-12-08 15:54:59 +03:00
Sergey Melnikov
f5a735ac3b Add proxy and broker to us-west-2 (#3027)
Co-authored-by: Lassi Pölönen <lassi.polonen@iki.fi>
2022-12-08 12:24:24 +01:00
MMeent
0d04cd0b99 Run compaction on the buffer holding received buffers when useful (#3028)
This cleans up unused entries and reduces the chance of prefetch buffer
thrashing.
2022-12-08 09:49:43 +01:00
Konstantin Knizhnik
e1ef62f086 Print more information about context of failed walredo requests (#3003) 2022-12-08 09:12:38 +02:00
Kirill Bulatov
b50e0793cf Rework remote_storage interface (#2993)
Changes:

* Remove `RemoteObjectId` concept from remote_storage.
Operate directly on /-separated names instead.
These names are now represented by struct `RemotePath` which was renamed from struct `RelativePath`

* Require remote storage to operate on relative paths for its contents, thus simplifying the way to derive them in pageserver and safekeeper

* Make `IndexPart` to use `String` instead of `RelativePath` for its entries, since those are just the layer names
2022-12-07 23:11:02 +02:00
Christian Schwarz
ac0c167a85 improve pidfile handling
This patch centralize the logic of creating & reading pid files into the
new pid_file module and improves upon / makes explicit a few race conditions
that existed with the previous code.

Starting Processes / Creating Pidfiles
======================================

Before this patch, we had three places that had very similar-looking
    match lock_file::create_lock_file { ... }
blocks.
After this change, they can use a straight-forward call provided
by the pid_file:
    pid_file::claim_pid_file_for_pid()

Stopping Processes / Reading Pidfiles
=====================================

The new pid_file module provides a function to read a pidfile,
called read_pidfile(), that returns a

  pub enum PidFileRead {
      NotExist,
      NotHeldByAnyProcess(PidFileGuard),
      LockedByOtherProcess(Pid),
  }

If we get back NotExist, there is nothing to kill.

If we get back NotHeldByAnyProcess, the pid file is stale and we must
ignore its contents.

If it's LockedByOtherProcess, it's either another pidfile reader
or, more likely, the daemon that is still running.
In this case, we can read the pid in the pidfile and kill it.
There's still a small window where this is racy, but it's not a
regression compared to what we have before.

The NotHeldByAnyProcess is an improvement over what we had before
this patch. Before, we would blindly read the pidfile contents
and kill, even if no other process held the flock.
If the pidfile was stale (NotHeldByAnyProcess), then that kill
would either result in ESRCH or hit some other unrelated process
on the system. This patch avoids the latter cacse by grabbing
an exclusive flock before reading the pidfile, and returning the
flock to the caller in the form of a guard object, to avoid
concurrent reads / kills.
It's hopefully irrelevant in practice, but it's a little robustness
that we get for free here.

Maintain flock on Pidfile of ETCD / any InitialPidFile::Create()
================================================================

Pageserver and safekeeper create their pidfiles themselves.
But for etcd, neon_local creates the pidfile (InitialPidFile::Create()).

Before this change, we would unlock the etcd pidfile as soon as
`neon_local start` exits, simply because no-one else kept the FD open.

During `neon_local stop`, that results in a stale pid file,
aka, NotHeldByAnyProcess, and it would henceforth not trust that
the PID stored in the file is still valid.

With this patch, we make the etcd process inherit the pidfile FD,
thereby keeping the flock held until it exits.
2022-12-07 18:24:12 +01:00
Lassi Pölönen
6dfd7cb1d0 Neon storage broker helm value fixes (#3025)
* We were missing one cluster in production:
`prod-ap-southeast-1-epsilon` configs.
* We had `metrics` enabled. This means creating `ServiceScrape` objects,
but since those clusters don't have `kube-prometheus-stack` like older
ones, we are missing the CRDs, so the helm deploy fails.
2022-12-07 17:15:51 +02:00
Heikki Linnakangas
a46a81b5cb Fix updating "trace_read_requests" with /v1/tenant/config mgmt API.
The new "trace_read_requests" option was missing from the
parse_toml_tenant_conf function that reads the config file. Because of
that, the option was ignored, which caused the test_read_trace.py test
to fail. It used to work before commit 9a6c0be823, because the
TenantConfigOpt struct was constructed directly in tenant_create_handler,
but now it is saved and read back from disk even for a newly created
tenant.

The abovementioned bug was fixed in commit 09393279c6 already, which
added the missing code to parse_toml_tenant_conf() to parse the
new "trace_read_requests" option. This commit fixes one more function
that was missed earlier, and adds more detail to the error message if
parsing the config file fails.
2022-12-07 15:03:39 +02:00
Lassi Pölönen
c74dca95fc Helm values for old staging and one region in new staging (#2922)
helm values for the new `storage-broker`. gRPC, over secure connection
with a proper certificate, but no authentication.

Uses alb ingress in the old cluster and nginx ingress for the new one.

The chart is deployed and the addresses are functional, while the
pipeline doesn't exist yet.
2022-12-07 14:24:07 +02:00
Heikki Linnakangas
b513619503 Remove obsolete 'awaits_download' field.
It used to be a separate piece of state, but after 9a6c0be823 it's just
an alias for the Tenant being in Attaching state. It was only used in
one assertion in a test, but that check doesn't make sense anymore, so
just remove it.

Fixes https://github.com/neondatabase/neon/issues/2930
2022-12-07 13:13:54 +02:00
Shany Pozin
b447eb4d1e Add postgres-v15 to source tree documentation (#3023) 2022-12-07 12:56:42 +02:00
Kirill Bulatov
6a57d5bbf9 Make the request tracing test more useful 2022-12-06 23:52:16 +02:00
Kirill Bulatov
09393279c6 Fix tenant config parsing 2022-12-06 23:52:16 +02:00
Nikita Kalyanov
634d0eab68 pass availability zone to console during pageserver registration (#2991)
this is safe because unknown fields are ignored. After the corresponding PR in control plane is merged this field
is going to be required

Part of https://github.com/neondatabase/cloud/issues/3131
2022-12-06 21:09:54 +02:00
Kliment Serafimov
8f2b3cbded Sentry integration for storage. (#2926)
Added basic instrumentation to integrate sentry with the proxy, pageserver, and safekeeper processes.
Currently in sentry there are three projects, one for each process. Sentry url is sent to all three processes separately via cli args.
2022-12-06 18:57:54 +00:00
Christian Schwarz
4530544bb8 draw_timeline_dirs: accept paths as input 2022-12-06 18:17:48 +01:00
Dmitry Rodionov
98ff0396f8 tone down error log for successful process termination 2022-12-06 18:44:07 +03:00
Kirill Bulatov
d6bfe955c6 Add commands to unload and load the tenant in memory (#2977)
Closes https://github.com/neondatabase/neon/issues/2537

Follow-up of https://github.com/neondatabase/neon/pull/2950
With the new model that prevents attaching without the remote storage,
it has started to be even more odd to add attach-with-files
functionality (in addition to the issues raised previously).

Adds two separate commands:
* `POST {tenant_id}/ignore` that places a mark file to skip such tenant
on every start and removes it from memory
* `POST {tenant_id}/schedule_load` that tries to load a tenant from
local FS similar to what pageserver does now on startup, but without
directory removals
2022-12-06 15:30:02 +00:00
danieltprice
046ba67d68 Update README.md (#3015)
Update readme to remove reference to the invite gate.
2022-12-06 11:27:46 -04:00
Alexander Bayandin
61825dfb57 Update chrono to 0.4.23; use only clock feature from it 2022-12-06 15:45:58 +01:00
Kirill Bulatov
c0480facc1 Rename RelativePath to RemotePath
Improve rustdocs a bit
2022-12-05 22:52:42 +02:00
Kirill Bulatov
b38473d367 Remove RelativePath conversions
Function was unused, but publicly exported from the module lib,
so not reported by rustc as unused
2022-12-05 22:52:42 +02:00
Kirill Bulatov
7a9cb75e02 Replace dynamic dispatch with static dispatch 2022-12-05 22:52:42 +02:00
Kirill Bulatov
38af453553 Use async RwLock around tenants (#3009)
A step towards more async code in our repo, to help avoid most of the
odd blocking calls, that might deadlock, as mentioned in
https://github.com/neondatabase/neon/issues/2975
2022-12-05 22:48:45 +02:00
Shany Pozin
79fdd3d51b Fix #2907: Change missing_layers property to optional in the IndexPart struct (#3005)
Move missing_layers property to Option<HashSet<RelativePath>> 
This will allow the safe removal of it once the upgrade of all page servers is done with this new code
2022-12-05 13:56:04 +02:00
Alexander Bayandin
ab073696d0 test_bulk_update: use new prefetch settings (#3007)
Replace `seqscan_prefetch_buffers` with `effective_io_concurrency` &
`maintenance_io_concurrency` in one more place (the last one!)
2022-12-05 10:56:01 +00:00
Kirill Bulatov
4f443c339d Tone down retry error logs (#2999)
Closes https://github.com/neondatabase/neon/issues/2990
2022-12-03 15:30:55 +00:00
Alexander Bayandin
ed27c98022 Nightly Benchmarks: use new prefetch settings (#3000)
- Replace `seqscan_prefetch_buffers` with `effective_io_concurrency` and
`maintenance_io_concurrency` for `clickbench-compare` job (see
https://github.com/neondatabase/neon/pull/2876)
- Get the database name in a runtime (it can be `main` or `neondb` or
something else)
2022-12-03 13:11:02 +00:00
Alexander Bayandin
788823ebe3 Fix named_arguments_used_positionally warnings (#2987)
```
warning: named argument `file` is not used by name
    --> pageserver/src/tenant/timeline.rs:1078:54
     |
1078 |                 trace!("downloading image file: {}", file = path.display());
     |                                                 --   ^^^^ this named argument is referred to by position in formatting string
     |                                                 |
     |                                                 this formatting argument uses named argument `file` by position
     |
     = note: `#[warn(named_arguments_used_positionally)]` on by default
help: use the named argument by name to avoid ambiguity
     |
1078 |                 trace!("downloading image file: {file}", file = path.display());
     |                                                  ++++

```

Co-authored-by: Heikki Linnakangas <heikki@neon.tech>
2022-12-02 17:59:26 +00:00
MMeent
145e7e4b96 Prefetch cleanup: (#2876)
- **Enable `enable_seqscan_prefetch` by default**
- Drop use of `seqscan_prefetch_buffers` in favor of
`[maintenance,effective]_io_concurrency`
This includes adding some fields to the HeapScan execution node, and
vacuum state.
- Cleanup some conditionals in vacuumlazy.c
- Clarify enable_seqscan_prefetch GUC description
- Fix issues in heap SeqScan prefetching where synchronize_seqscan
machinery wasn't handled properly.
2022-12-02 13:35:01 +01:00
Heikki Linnakangas
d90b52b405 Update README
- Change "WAL service" to "safekeepers" in the architecture section. The
  safekeepers together form the WAL service, but we don't use that term
  much in the code.
- Replace the short list of pageserver components with a link /docs. We
  have more details there.
- Add "Other resources" to Documention section, with links to some blog
  posts and a video presentation.
- Remove notice at the top about the Zenith -> Neon rename. There are
  still a few references to Zenith in the codebase, but not so many that
  we would need to call it out at the top anymore.
2022-12-02 11:47:50 +01:00
Konstantin Knizhnik
c21104465e Fix copying relation in walloged create database in PG15 (#2986)
refer #2904
2022-12-01 22:27:18 +02:00
bojanserafimov
fe280f70aa Add synthetic layer map bench (#2979) 2022-12-01 13:29:21 -05:00
Heikki Linnakangas
faf1d20e6a Don't remove PID file in neon_local, and wait after "pageserver init". (#2983)
Our shutdown procedure for "pageserver init" was buggy. Firstly, it
merely sent the process a SIGKILL, but did not wait for it to actually
exit. Normally, it should exit quickly as SIGKILL cannot be caught or
ignored by the target process, but it's still asynchronous and the
process can still be alive when the kill(2) call returns. Secondly,
"neon_local" removed the PID file after sending SIGKILL, even though the
process was still running. That hid the first problem: if we didn't
remove the PID file, and you start a new pageserver process while the
old one is still running, you would get an error when the new process
tries to lock the PID file.

We've been seeing a lot of "Cannot assign requested address" failures in
the CI lately. Our theory is that when we run "pageserver init"
immediately followed by "pageserver start", the first process is still
running and listening on the port when the second invocation starts up.
This commit hopefully fixes the problem.

It is generally a bad idea for the "neon_local" to remove the PID file
on the child process's behalf. The correct way would be for the server
process to remove the PID file, after it has fully shutdown everything
else. We don't currently have a robust way to ensure that everything has
truly shut down and closed, however.

A simpler way is to simply never remove the PID file. It's not necessary
to remove the PID file for correctness: we cannot rely on the cleanup to
happen anyway, if the server process crashes for example. Because of
that, we already have all the logic in place to deal with a stale PID
file that belonged to a process that already exited. Let's rely on that
on normal shutdown too.
2022-12-01 16:38:52 +02:00
Konstantin Knizhnik
d9ab42013f Resend prefetch request in case of pageserver restart (#2974)
refer #2819

Co-authored-by: MMeent <matthias@neon.tech>
2022-12-01 12:16:15 +02:00
Bojan Serafimov
edfebad3a1 Add test that importing an empty file fails.
We used to have a bug where the pageserver just got stuck if the
client sent a CopyDone message before reaching end of tar stream. That
showed up with an empty tar file, as one example. That was
inadvertently fixed by code refactorings, but let's add a regression
test for it, so that we don't accidentally re-introduce the bug later.

Co-authored-by: Heikki Linnakangas <heikki@neon.tech>
2022-12-01 12:08:56 +02:00
bojanserafimov
b9544adcb4 Add layer map search benchmark (#2957) 2022-11-30 13:48:07 -05:00
Andrés
ebb51f16e0 Re-introduce aws-sdk-rust as rusoto S3 replacement (#2841)
Part of https://github.com/neondatabase/neon/issues/2683
Initial PR: https://github.com/neondatabase/neon/pull/2802, revert: https://github.com/neondatabase/neon/pull/2837

Co-authored-by: andres <andres.rodriguez@outlook.es>
2022-11-30 17:47:32 +02:00
Alexander Bayandin
136b029d7a neon-project-create: fix project creation (#2954)
Update api/v2 call to support changes from
https://github.com/neondatabase/cloud/pull/2929
2022-11-30 09:19:59 +00:00
Heikki Linnakangas
33834c01ec Rename Paused states to Stopping.
I'm not a fan of "Paused", for two reasons:

- Paused implies that the tenant/timeline with no activity on it. That's
  not true; the tenant/timeline can still have active tasks working on it.

- Paused implies that it can be resumed later. It can not. A tenant or
  timeline in this state cannot be switched back to Active state anymore.
  A completely new Tenant or Timeline struct can be constructed for the
  same tenant or timeline later, e.g. if you detach and later re-attach
  the same tenant, but that's a different thing.

Stopping describes the state better. I also considered "ShuttingDown",
but Stopping is simpler as it's a single word.
2022-11-30 01:10:16 +02:00
Heikki Linnakangas
9a6c0be823 storage_sync2
The code in this change was extracted from PR #2595, i.e., Heikki’s draft
PR for on-demand download.

High-Level Changes

- storage_sync module rewrite
- Changes to Tenant Loading
- Changes to Timeline States
- Crash-safe & Resumable Tenant Attach

There are several follow-up work items planned.
Refer to the Epic issue on GitHub:
https://github.com/neondatabase/neon/issues/2029

Metadata:

closes https://github.com/neondatabase/neon/pull/2785

unsquashed history of this patch: archive/pr-2785-storage-sync2/pre-squash

Co-authored-by: Dmitry Rodionov <dmitry@neon.tech>
Co-authored-by: Christian Schwarz <christian@neon.tech>

===============================================================================

storage_sync module rewrite
===========================

The storage_sync code is rewritten. New module name is storage_sync2, mostly to
make a more reasonable git diff.

The updated block comment in storage_sync2.rs describes the changes quite well,
so, we will not reproduce that comment here. TL;DR:
- Global sync queue and RemoteIndex are replaced with per-timeline
  `RemoteTimelineClient` structure that contains a queue for UploadOperations
  to ensure proper ordering and necessary metadata.
- Before deleting local layer files, wait for ongoing UploadOps to finish
  (wait_completion()).
- Download operations are not queued and executed immediately.

Changes to Tenant Loading
=========================

Initial sync part was rewritten as well and represents the other major change
that serves as a foundation for on-demand downloads. Routines for attaching and
loading shifted directly to Tenant struct and now are asynchronous and spawned
into the background.

Since this patch doesn’t introduce on-demand download of layers we fully
synchronize with the remote during pageserver startup. See details in
`Timeline::reconcile_with_remote` and `Timeline::download_missing`.

Changes to Tenant States
========================

The “Active” state has lost its “background_jobs_running: bool” member. That
variable indicated whether the GC & Compaction background loops are spawned or
not. With this patch, they are now always spawned. Unit tests (#[test]) use the
TenantConf::{gc_period,compaction_period} to disable their effect (15db566).

This patch introduces a new tenant state, “Attaching”. A tenant that is being
attached starts in this state and transitions to “Active” once it finishes
download.

The `GET /tenant` endpoints returns `TenantInfo::has_in_progress_downloads`. We
derive the value for that field from the tenant state now, to remain
backwards-compatible with cloud.git. We will remove that field when we switch
to on-demand downloads.

Changes to Timeline States
==========================

The TimelineInfo::awaits_download field is now equivalent to the tenant being
in Attaching state.  Previously, download progress was tracked per timeline.
With this change, it’s only tracked per tenant. When on-demand downloads
arrive, the field will be completely obsolete.  Deprecation is tracked in
isuse #2930.

Crash-safe & Resumable Tenant Attach
====================================

Previously, the attach operation was not persistent. I.e., when tenant attach
was interrupted by a crash, the pageserver would not continue attaching after
pageserver restart. In fact, the half-finished tenant directory on disk would
simply be skipped by tenant_mgr because it lacked the metadata file (it’s
written last). This patch introduces an “attaching” marker file inside that is
present inside the tenant directory while the tenant is attaching. During
pageserver startup, tenant_mgr will resume attach if that file is present. If
not, it assumes that the local tenant state is consistent and tries to load the
tenant. If that fails, the tenant transitions into Broken state.
2022-11-29 18:55:20 +01:00
Heikki Linnakangas
baa8d5a16a Test that physical size is the same before and after re-attaching tenant. 2022-11-29 14:32:01 +02:00
Heikki Linnakangas
fbd5f65938 Misc cosmetic fixes in comments, messages.
Most of these were extracted from PR #2785.
2022-11-29 14:10:45 +02:00
Heikki Linnakangas
1f1324ebed Require tenant to be active when calculating tenant size.
It's not clear if the calculation would work or make sense, if the
tenant is only partially loaded. Let's play it safe, and require it to
be Active.
2022-11-29 14:10:45 +02:00
Alexander Bayandin
fb633b16ac neon-project-create: change default region for staging (#2951)
Change the default region for staging from `us-east-1` to `us-east-2`
for project creation.
Remove REGION_ID from `neon-branch-create` since we don't use it.
2022-11-29 11:38:24 +00:00
Joonas Koivunen
f277140234 Small fixes (#2949)
Nothing interesting in these changes. Passing through the
RUST_BACKTRACE=full will hopefully save someone else panick reproduction
time.

Co-authored-by: Heikki Linnakangas <heikki@neon.tech>
2022-11-29 10:29:25 +02:00
Arseny Sher
52166799bd Put .proto compilation result to $OUT_DIR/
Sometimes CI build fails with

error: couldn't read storage_broker/src/../proto/storage_broker.rs: No such file or directory (os error 2)
  --> storage_broker/src/lib.rs:14:5
   |
14 |     include!("../proto/storage_broker.rs");
   |     ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

The root cause is not clear, but it looks like interference with cachepot. Per
cargo docs, build scripts shouldn't output to anywhere but OUT_DIR; let's follow
this and see if it helps.
2022-11-28 20:27:43 +04:00
Sergey Melnikov
0a4e5f8aa3 Setup legacy scram proxy in us-east-2 (#2943) 2022-11-28 17:21:35 +01:00
MMeent
0c1195c30d Fix #2937 (#2940)
Co-authored-by: Konstantin Knizhnik <knizhnik@neon.tech>
2022-11-28 15:34:07 +01:00
Alexander Bayandin
3ba92d238e Nightly Benchmarks: Fix default db name and clickbench-compare trigger (#2938)
- Fix database name: `main` -> `neondb`
- Fix `clickbench-compare` trigger; the job should be triggered even if
`pgbench-compare` fails
2022-11-28 12:08:04 +00:00
Heikki Linnakangas
67469339fa When new timeline is created, don't wait for compaction. (#2931)
When a new root timeline is created, we want to flush all the data to
disk before we return success to the caller. We were using
checkpoint(CheckpointConfig::Forced) for that, but that also performs
compaction. With the default settings, compaction will have no work
after we have imported an empty database, as the image of that is too
small to require compaction. However, with very small
checkpoint_distance and compaction_target_size, compaction will run, and
it can take a while.

PR #2785 adds new tests that use very small checkpoint_distance and
compaction_target_size settings, and the test sometimes failed with
"operation timed out" error in the client, when the create_timeline step
took too long.
2022-11-28 11:05:20 +02:00
Heikki Linnakangas
0205a44265 Remove obsolete TODO and settings in test
The GC and compaction loops have reacted quickly to shutdown request
since commit 40c845e57d.
2022-11-28 11:04:25 +02:00
Alexander Bayandin
480175852f Nightly Benchmarks: add OLAP-style benchmark (clickbench) (#2855)
Add ClickBench benchmark, an OLAP-style benchmark, to Nightly
Benchmarks.

The full run of 43 queries on the original dataset takes more than 6h
(only 34 queries got processed on in 6h) on our default-sized compute.
Having this, currently, would mean having some really unstable tests
because of our regular deployment to staging/captest environment (see
https://github.com/neondatabase/cloud/issues/1872).

I've reduced the dataset size to the first 10^7 rows from the original
10^8 rows. Now it takes ~30-40 minutes to pass.

Ref https://github.com/ClickHouse/ClickBench/tree/main/aurora-postgresql
Ref https://benchmark.clickhouse.com/
2022-11-25 18:41:26 +00:00
Alexander Bayandin
9fdd228dee GitHub Actions: Add branch related actions (#2877)
Add `neon-branch-create` / `neon-branch-delete` to allow using branches
in tests.
I have a couple of use cases in mind:
- For destructive tests with a big DB, we can create the DB once in
advance and then use branches without the need to recreate the DB itself
after tests change it.
- We can run tests in parallel (if there're compute-bound).

Also migrate API v2 for `neon-project-create` / `neon-project-delete`
2022-11-25 18:18:08 +00:00
Heikki Linnakangas
15db566420 Allow setting gc/compaction_period to 0, to disable automatic GC/compaction
Many python tests were setting the GC/compaction period to large
values, to effectively disable GC / compaction. Reserve value 0 to
mean "explicitly disabled". We also set them to 0 in unit tests now,
although currently, unit tests don't launch the background jobs at
all, so it won't have any effect.

Fixes https://github.com/neondatabase/neon/issues/2917
2022-11-25 20:14:06 +02:00
Alexander Bayandin
1a316a264d Disable statement timeout for performance tests (#2891)
Fix `test_seqscans` by disabling statement timeout.

Also, replace increasing statement timeout with disabling it for
performance tests. This should make tests more stable and allow us to
observe performance degradation instead of test failures.
2022-11-25 16:05:45 +00:00
Alexander Bayandin
aeeb782342 Make test runner compatible with Python 3.11 (#2915)
**NB**: this PR doesn't update Python to 3.11; it makes tests
compatible with it and fixes a couple of warnings by updating 
dependencies.

- `poetry add asyncpg@latest` to fix `./scripts/pysync`
- `poetry add boto3@latest "boto3-stubs[s3]@latest"` to fix
```
DeprecationWarning: 'cgi' is deprecated and slated for removal in Python 3.13
```
- `poetry update certifi` to fix
```
DeprecationWarning: path is deprecated. Use files() instead. Refer to https://importlib-resources.readthedocs.io/en/latest/using.html#migrating-from-legacy for migration advice.
```
- Move `types-toml` from `dep-dependencies` to `dependencies` to keep it
aligned with other `types-*` deps
2022-11-25 15:59:15 +00:00
Egor Suvorov
ae53dc3326 Add authentication between Safekeeper and Pageserver/Compute
* Fix https://github.com/neondatabase/neon/issues/1854
* Never log Safekeeper::conninfo in walproposer as it now contains a secret token
* control_panel, test_runner: generate and pass JWT tokens for Safekeeper to compute and pageserver
* Compute: load JWT token for Safekepeer from the environment variable. Do not reuse the token from
  pageserver_connstring because it's embedded in there weirdly.
* Pageserver: load JWT token for Safekeeper from the environment variable.
* Rewrite docs/authentication.md
2022-11-25 04:17:42 +03:00
Egor Suvorov
1ca76776d0 pageserver: require management permissions on HTTP /status 2022-11-25 04:17:42 +03:00
Egor Suvorov
10d554fcbb walproposer: refactor safekeeper::conninfo initialization
It is used both in WalProposerInit and ResetConnection.
In the future the logic will become more complicated due to authentication with Safekeeper.
2022-11-25 04:17:42 +03:00
Egor Suvorov
2ce5d8137d Separate permission checks for Pageserver and Safekeeper
There will be different scopes for those two, so authorization code should be different.

The `check_permission` function is now not in the shared library. Its implementation
is very similar to the one which will be added for Safekeeper. In fact, we may reuse
the same existing root-like 'PageServerApi' scope, but I would prefer to have separate
root-like scopes for services.

Also, generate_management_token in tests is generate_pageserver_token now.
2022-11-25 04:17:42 +03:00
Egor Suvorov
a406783098 neon_fixtures: refactor AuthKeys to support more scopes 2022-11-25 04:17:42 +03:00
Alexey Kondratov
e6db4b63eb [safekeeper] Serialize LSN in the term_history according to the spec (#2896)
Use string format in the timeline status HTTP API reponse.
2022-11-24 17:19:01 +01:00
Arseny Sher
0b0cb77da4 Fix deploy after 2d42f84389. 2022-11-24 20:07:41 +04:00
Dmitry Ivanov
47734fdb0a [proxy] Move some tests to a dedicated module
This unclutters the pivotal `proxy.rs` module.
2022-11-24 18:43:34 +03:00
Sergey Melnikov
9c886ac0a0 Use per-cluster DNS name for link proxy (#2911) 2022-11-24 12:41:38 +01:00
Egor Suvorov
b6989e8928 pageserver: make wal_source_connstring: String a 'wal_source_connconf: PgConnectionConfig` 2022-11-24 14:02:23 +03:00
Egor Suvorov
46ea2a8e96 Continue #2724: replace Url-based PgConnectionConfig with a hand-crafted struct
Downsides are:

* We store all components of the config separately. `Url` stores them inside a single
  `String` and a bunch of ints which point to different parts of the URL, which is
  probably more efficient.
* It is now impossible to pass arbitrary connection strings to the configuration file,
  one has to support all components explicitly. However, we never supported anything
  except for `host:port` anyway.

Upsides are:

* This significantly restricts the space of possible connection strings, some of which
  may be either invalid or unsupported. E.g. Postgres' connection strings may include
  a bunch of parameters as query (e.g. `connect_timeout=`, `options=`). These are nether
  validated by the current implementation, nor passed to the postgres client library,
  Hence, storing separate fields expresses the intention better.
* The same connection configuration may be represented as a URL in multiple ways
  (e.g. either `password=` in the query part or a standard URL password).
  Now we have a single canonical way.
* Escaping is provided for `options=`.

Other possibilities considered:

* `newtype` with a `String` inside and some validation on creation.
  This is more efficient, but harder to log for two reasons:
  * Passwords should never end up in logs, so we have to somehow
  * Escaped `options=` are harder to read, especially if URL-encoded,
    and we use `options=` a lot.
2022-11-24 14:02:23 +03:00
Heikki Linnakangas
5bca7713c1 Improve comments on TenantStates 2022-11-24 12:26:15 +02:00
Heikki Linnakangas
99d9c23df5 Gather path-related consts and functions to one place.
Feels more organized this way.
2022-11-24 12:26:15 +02:00
Dmitry Ivanov
05db6458df [proxy] Fix project (endpoint) -related error messages 2022-11-23 23:03:29 +03:00
Arseny Sher
2d42f84389 Add storage_broker binary.
Which ought to replace etcd. This patch only adds the binary and adjusts
Dockerfile to include it; subsequent ones will add deploy of helm chart and the
actual replacement.

It is a simple and fast pub-sub message bus. In this patch only safekeeper
message is supported, but others can be easily added.

Compilation now requires protoc to be installed. Installing protobuf-compiler
package is fine for Debian/Ubuntu.

ref
https://github.com/neondatabase/neon/pull/2733
https://github.com/neondatabase/neon/issues/2394
2022-11-23 22:05:59 +04:00
Sergey Melnikov
aee3eb6d19 Deploy link proxy to us-east-2 (#2905) 2022-11-23 18:11:44 +01:00
Konstantin Knizhnik
a6e4a3c3ef Implement corrent truncation of FSM/VM forks on arbitrary position (#2609)
refer #2601

Co-authored-by: Anastasia Lubennikova <anastasia@neon.tech>
2022-11-23 18:46:07 +02:00
Konstantin Knizhnik
21ec28d9bc Add bulk update test (#2902) 2022-11-23 17:51:35 +02:00
Heikki Linnakangas
de8f24583f Remove obsolete 'zenith_ctl' alias from compute images 2022-11-23 16:58:31 +02:00
Sergey Melnikov
85f0975c5a Setup eu-west-1 as region for PR testing (#2757) 2022-11-23 10:54:39 +01:00
Konstantin Knizhnik
1af087449a Reduce max_replication_write_lag to 10Mb (#1793) 2022-11-23 08:41:22 +02:00
Heikki Linnakangas
37625c4433 Remove obsolete design doc.
I considered archiving this under docs/rfcs, but looking at the contents,
I don't think it's relevant at all anymore. So let's just remove it.
2022-11-23 00:40:17 +02:00
Heikki Linnakangas
e9f4ca5972 Remove references to obsolete files in .gitignore 2022-11-23 00:40:17 +02:00
Alexey Kondratov
4bf3087aed [pageserver] list latest_gc_cutoff_lsn in the OpenAPI spec (#2894)
It seems that it's present in the API response for quite a while. It's
just not listed in the spec, fix it.
2022-11-22 21:10:49 +01:00
Dmitry Ivanov
9470bc9fe0 [proxy] Implement per-tenant traffic metrics 2022-11-22 18:50:57 +03:00
Heikki Linnakangas
86e483f87b Fix tenant size modeling code to include WAL at end of branch
Imagine that you have a tenant with a single branch like this:

---------------==========>
               ^
	    gc horizon
where:

----  is the portion of the branch that is older than retention period
====  is the portion of the branch that is newer than retention period.

Before this commit, the sizing model included the logical size at the
GC horizon, but not the WAL after that. In particular, that meant that
on a newly created tenant with just one timeline, where the retention
period covered the whole history of the timeline, i.e. gc_cutoff was 0,
the calculated tenant size was always zero.

We now include the WAL after the GC horizon in the size. So in the
above example, the calculated tenant size would be the logical size
of the database the GC horizon, plus all the WAL after it (marked with
===).

This adds a new `insert_point` function to the sizing model, alongside
`modify_branch`, and changes the code in size.rs to use the new
function. The new function takes an absolute lsn and logical size as
argument, so we no longer need to calculate the difference to the
previous point. Also, the end-size is now optional, because we now
need to add a point to represent the end of each branch to the model,
but we don't want to or need to calculate the logical size at that
point.
2022-11-22 17:11:27 +02:00
Christian Schwarz
f50d0ec0c9 test_runner: ignore 'sender is dropped while join handle is still alive' warnings
The need for a proper solution to this is tracked in
https://github.com/neondatabase/neon/issues/2885
2022-11-22 11:30:34 +01:00
Sergey Melnikov
74ec36a1bf Add pageserver-1.us-east-2.aws.neon.build (#2881) 2022-11-22 10:55:02 +01:00
Anastasia Lubennikova
a63ebb6446 Update vendor postgres to 14.6 and 15.1 2022-11-22 10:46:21 +02:00
Alexander Stanovoy
a5b898a31c Fix the order of checks in LSN (#2882)
We should check if LSN is in the lower range because it's constant and
only after wait for LSN to arrive if needed.
2022-11-22 02:28:41 +02:00
bojanserafimov
c6f095a821 Fix remote seqscan test (#2878) 2022-11-21 17:21:47 -05:00
Alexander Bayandin
6b2bc7f775 Nightly Benchmarks: Add RDS Postgres (#2859)
Add RDS Postgres `db.m5.large` instance to Nightly Benchmarks
2022-11-21 15:25:09 +00:00
Heikki Linnakangas
6c97fc941a Enable passing FAILPOINTS at startup.
- Pass through FAILPOINTS environment variable to the pageserver in
  "neon_local pageserver start" command

- On startup, list any failpoints that were set with FAILPOINTS to the log

- Add optional "extra_env_vars" argument to the NeonPageserver.start()
  function in the python fixture, so that you can pass FAILPOINTS

None of the tests use this functionality yet; that comes in a separate
commit.

closes https://github.com/neondatabase/neon/pull/2865
2022-11-21 16:24:19 +01:00
Alexander Bayandin
cb9b26776e Fix test_seqscans on remote cluster (#2869)
A remote project is reused between tests, so we need to ensure that we
don't have a table with the same name already created.
2022-11-19 23:39:42 +00:00
Heikki Linnakangas
684329d4d2 Another attempt at silencing test_gc_cutoff failures.
Increse the pgbench runtimes even further. The theory is that when
there are many other tests running at the same time, one pgbench run
could take a long time until it generates enough layers for GC to kick
in.
2022-11-19 19:28:56 +02:00
Heikki Linnakangas
ed40a045c0 Add more logging to track down test_gc_cutoff failure.
see https://github.com/neondatabase/neon/issues/2856
2022-11-19 14:12:21 +02:00
Heikki Linnakangas
3f39327622 Silence a few compiler warnings
I saw these from the build of the compute docker image in the CI
(compute-node-image-v15):

    pagestore_smgr.c: In function 'neon_prefetch':
    pagestore_smgr.c:1654:2: warning: ISO C90 forbids mixed declarations and code [-Wdeclaration-after-statement]
     1654 |  BufferTag tag = (BufferTag) {
          |  ^~~~~~~~~
    walproposer.c:197:1: warning: no previous prototype for 'WalProposerSync' [-Wmissing-prototypes]
      197 | WalProposerSync(int argc, char *argv[])
          | ^~~~~~~~~~~~~~~
    libpagestore.c: In function 'pageserver_connect':
    libpagestore.c💯9: warning: variable 'wc' set but not used [-Wunused-but-set-variable]
      100 |   int   wc;
          |         ^~
    libpagestore.c: In function 'call_PQgetCopyData':
    libpagestore.c:144:9: warning: variable 'wc' set but not used [-Wunused-but-set-variable]
      144 |   int   wc;
          |         ^~

Harmless warnings, but let's be tidy.

In the passing, I added some "extern" to a few function declarations
that were missing them, and marked WalProposerSync as "static". Those
changes are also purely cosmetic.
2022-11-19 14:11:04 +02:00
Heikki Linnakangas
a50a7e8ac0 Try to silence test_gc_cutoff flakiness.
Commit d013a2b227 changed the test, so that it fails if pgbench runs
to completion without triggering the failpoint. That has now happened
several times in the CI. That's not expected, so this needs some
investigation, but as a quick fix just make the pgbench runs longer so
that we're closer to the situation before commit d013a2b227.

See https://github.com/neondatabase/neon/issues/2856
2022-11-19 01:19:09 +02:00
Egor Suvorov
e28eda7939 sourcetree/docs: mention hakari generate (#2864) 2022-11-18 22:30:41 +00:00
Christian Schwarz
f564dff0e3 make test_tenant_detach_smoke fail reproducibly
Add failpoint that triggers the race condition.
Skip test until we'll land the fix from
https://github.com/neondatabase/neon/pull/2851
with
https://github.com/neondatabase/neon/pull/2785
2022-11-18 17:15:34 +01:00
Christian Schwarz
d783889a1f timeline: explicit tracking of flush loop state: NotStarted, Running, Exited
This allows us to error out in the case where we request flush but the
flush loop is not running.
Before, we would only track whether it was started, but not when it
exited.
Better to use an enum with 3 states than a 2-state bool because then
the error message can answer the question whether we ever started
the flush loop or not.
2022-11-18 17:15:34 +01:00
bojanserafimov
2655bdbb2e Add remote seqscans test (#2840) 2022-11-18 09:05:13 -05:00
Konstantin Knizhnik
b9152f1ef4 Correctly terminate prefetch in case of pageserver restart (#2850)
refer #2819

This patch requires deep knowledge of prefetch internals.
So @MMeent  please review it or suggest better solution.
2022-11-18 15:04:58 +02:00
Heikki Linnakangas
328ec1ce24 Print a more full error message, with stack trace, on GC failure.
In a CI run, I got a test failure because of this error in the log,
from the test_get_tenant_size_with_multiple_branches test:

    ERROR gc_loop{tenant_id=f1630516d4b526139836ced93be0c878}: Gc failed, retrying in 2s: No such file or directory (os error 2)

There are known race conditions between GC and timeline deletion,
which surely caused that error. But if we didn't know the cause, it
would be pretty hard to debug without a stack trace.
2022-11-18 11:44:00 +02:00
Heikki Linnakangas
dcb79ef08f Silence yet another test failure from race condition between GC and delete.
Another similar case to commit 9ae4da4f31.
2022-11-18 10:18:15 +02:00
Konstantin Knizhnik
fd99e0fbc4 Build pg_prewrm extension (#2794) 2022-11-18 09:10:32 +02:00
Kirill Bulatov
60ac227196 Use modern flex and bison in macOS compilations (#2847) 2022-11-17 14:48:21 +00:00
MMeent
4a60051b0d Add codeowners section for /vendor/ (#2849)
After this, consent of @neondatabase/compute is required to update the
vendored PostgreSQL versions.
2022-11-17 14:31:34 +00:00
Heikki Linnakangas
24d3ed0952 Ignore another ERROR that's expected in test.
Got a test failure in CI because of this.
2022-11-17 12:42:56 +02:00
Alexander Bayandin
0a87d71294 test_runner: make proxy mgmt port mandatory (#2839)
Make `mgmt` port mandatory argument for `NeonProxy` (and set it for
`static_proxy`) to avoid port collision when tests run in parallel.
2022-11-16 17:57:48 +00:00
Heikki Linnakangas
150bddb929 Clean up process start/stop handling
* Poll more frequently when waiting for process start/stop. This
  speeds up startup and shutdown in tests. We did this already in
  commit 52ce1c9d53, which reduced the interval to 100 ms, but it was
  inadvertently increased back to 500 ms in commit d42700280f. Reduce
  it to 100 ms again, for both start and stop operations.

* Harmonize the start and stop loops, printing the dots and notices
  the same way in both. I considered extracting the logic to a
  separate retry-function that takes a closure as argument that does
  the polling, but as long as we only have two copies, the code
  duplication isn't that bad.

* Remove newline after "Starting pageserver" and "Starting etcd"
  messages, so that the progress-indicator dots that are printed once
  a second are printed on the same line. Before:

    Starting pageserver at '127.0.0.1:64000' in '.neon'
    ...
    pageserver started, pid: 2538937

  After:

    Starting pageserver at '127.0.0.1:64000' in '.neon'...
    pageserver started, pid: 2538937

  The "Starting safekeeper" message already got this right.

* Update example output in README.md to match
2022-11-16 19:51:37 +02:00
Alexander Bayandin
2b728bc69e test_forward_compatibility: fix path to pg_distrib_dir (#2826)
Set correct `pg_distrib_dir` in `pageserver.toml` and in neon_local
`config`.

`test_forward_compatibility` shows flakiness during `neon_local pg
start`, so hopefully, the patch will help.

```
2022-11-15 16:07:34.091 GMT [13338] LOG:  starting with zenith basebackup at LSN 0/A6A9310, prev 0/0
2022-11-15 16:07:34.091 GMT [13338] FATAL:  cannot start in read-write mode from this base backup
2022-11-15 16:07:34.091 GMT [13337] LOG:  startup process (PID 13338) exited with exit code 1
```
2022-11-16 15:14:36 +00:00
Kirill Bulatov
5184685ced Revert "Introduce aws-sdk-rust as rusoto S3 replacement (#2802)" (#2837)
Despite tests working, on staging the library started to fail with the
following error:

```
Nov 16 11:53:37 pageserver-0.us-east-2.aws.neon.build pageserver[481974]: 2022-11-16T11:53:37.191211Z  INFO init_tenant_mgr:local_tenant_timeline_files: Collected files for 16 tenants
Nov 16 11:53:37 pageserver-0.us-east-2.aws.neon.build pageserver[481974]: thread 'main' panicked at 'A connector was not available. Either set a custom connector or enable the `rustls` and `native-tls` crate featu>
Nov 16 11:53:37 pageserver-0.us-east-2.aws.neon.build pageserver[481974]: stack backtrace:
Nov 16 11:53:37 pageserver-0.us-east-2.aws.neon.build pageserver[481974]:    0: rust_begin_unwind
Nov 16 11:53:37 pageserver-0.us-east-2.aws.neon.build pageserver[481974]:              at /rustc/e092d0b6b43f2de967af0887873151bb1c0b18d3/library/std/src/panicking.rs:584:5
Nov 16 11:53:37 pageserver-0.us-east-2.aws.neon.build pageserver[481974]:    1: core::panicking::panic_fmt
Nov 16 11:53:37 pageserver-0.us-east-2.aws.neon.build pageserver[481974]:              at /rustc/e092d0b6b43f2de967af0887873151bb1c0b18d3/library/core/src/panicking.rs:142:14
Nov 16 11:53:37 pageserver-0.us-east-2.aws.neon.build pageserver[481974]:    2: core::panicking::panic_display
Nov 16 11:53:37 pageserver-0.us-east-2.aws.neon.build pageserver[481974]:              at /rustc/e092d0b6b43f2de967af0887873151bb1c0b18d3/library/core/src/panicking.rs:72:5
Nov 16 11:53:37 pageserver-0.us-east-2.aws.neon.build pageserver[481974]:    3: core::panicking::panic_str
Nov 16 11:53:37 pageserver-0.us-east-2.aws.neon.build pageserver[481974]:              at /rustc/e092d0b6b43f2de967af0887873151bb1c0b18d3/library/core/src/panicking.rs:56:5
Nov 16 11:53:37 pageserver-0.us-east-2.aws.neon.build pageserver[481974]:    4: core::option::expect_failed
Nov 16 11:53:37 pageserver-0.us-east-2.aws.neon.build pageserver[481974]:              at /rustc/e092d0b6b43f2de967af0887873151bb1c0b18d3/library/core/src/option.rs:1854:5
Nov 16 11:53:37 pageserver-0.us-east-2.aws.neon.build pageserver[481974]:    5: <core::future::from_generator::GenFuture<T> as core::future::future::Future>::poll
Nov 16 11:53:37 pageserver-0.us-east-2.aws.neon.build pageserver[481974]:    6: <core::future::from_generator::GenFuture<T> as core::future::future::Future>::poll
Nov 16 11:53:37 pageserver-0.us-east-2.aws.neon.build pageserver[481974]:    7: <core::future::from_generator::GenFuture<T> as core::future::future::Future>::poll
Nov 16 11:53:37 pageserver-0.us-east-2.aws.neon.build pageserver[481974]:    8: <aws_types::credentials::provider::future::ProvideCredentials as core::future::future::Future>::poll
Nov 16 11:53:37 pageserver-0.us-east-2.aws.neon.build pageserver[481974]:    9: <tracing::instrument::Instrumented<T> as core::future::future::Future>::poll
Nov 16 11:53:37 pageserver-0.us-east-2.aws.neon.build pageserver[481974]:   10: <core::future::from_generator::GenFuture<T> as core::future::future::Future>::poll
Nov 16 11:53:37 pageserver-0.us-east-2.aws.neon.build pageserver[481974]:   11: <aws_types::credentials::provider::future::ProvideCredentials as core::future::future::Future>::poll
Nov 16 11:53:37 pageserver-0.us-east-2.aws.neon.build pageserver[481974]:   12: <core::future::from_generator::GenFuture<T> as core::future::future::Future>::poll
Nov 16 11:53:37 pageserver-0.us-east-2.aws.neon.build pageserver[481974]:   13: <core::future::from_generator::GenFuture<T> as core::future::future::Future>::poll
Nov 16 11:53:37 pageserver-0.us-east-2.aws.neon.build pageserver[481974]:   14: <aws_smithy_http_tower::map_request::MapRequestFuture<F,E> as core::future::future::Future>::poll
Nov 16 11:53:37 pageserver-0.us-east-2.aws.neon.build pageserver[481974]:   15: <core::pin::Pin<P> as core::future::future::Future>::poll
Nov 16 11:53:37 pageserver-0.us-east-2.aws.neon.build pageserver[481974]:              at /rustc/e092d0b6b43f2de967af0887873151bb1c0b18d3/library/core/src/future/future.rs:124:9
Nov 16 11:53:37 pageserver-0.us-east-2.aws.neon.build pageserver[481974]:   16: <aws_smithy_http_tower::parse_response::ParseResponseService<InnerService,ResponseHandler,RetryPolicy> as tower_service::Service<aws_>
Nov 16 11:53:37 pageserver-0.us-east-2.aws.neon.build pageserver[481974]:              at /home/nonroot/.cargo/registry/src/github.com-1ecc6299db9ec823/aws-smithy-http-tower-0.51.0/src/parse_response.rs:109:34
Nov 16 11:53:37 pageserver-0.us-east-2.aws.neon.build pageserver[481974]:   17: <core::future::from_generator::GenFuture<T> as core::future::future::Future>::poll
Nov 16 11:53:37 pageserver-0.us-east-2.aws.neon.build pageserver[481974]:              at /rustc/e092d0b6b43f2de967af0887873151bb1c0b18d3/library/core/src/future/mod.rs:91:19
Nov 16 11:53:37 pageserver-0.us-east-2.aws.neon.build pageserver[481974]:   18: <tracing::instrument::Instrumented<T> as core::future::future::Future>::poll
Nov 16 11:53:37 pageserver-0.us-east-2.aws.neon.build pageserver[481974]:              at /home/nonroot/.cargo/registry/src/github.com-1ecc6299db9ec823/tracing-0.1.37/src/instrument.rs:272:9
Nov 16 11:53:37 pageserver-0.us-east-2.aws.neon.build pageserver[481974]:   19: <core::pin::Pin<P> as core::future::future::Future>::poll
Nov 16 11:53:37 pageserver-0.us-east-2.aws.neon.build pageserver[481974]:              at /rustc/e092d0b6b43f2de967af0887873151bb1c0b18d3/library/core/src/future/future.rs:124:9
Nov 16 11:53:37 pageserver-0.us-east-2.aws.neon.build pageserver[481974]:   20: <aws_smithy_client::timeout::TimeoutServiceFuture<InnerFuture> as core::future::future::Future>::poll
Nov 16 11:53:37 pageserver-0.us-east-2.aws.neon.build pageserver[481974]:              at /home/nonroot/.cargo/registry/src/github.com-1ecc6299db9ec823/aws-smithy-client-0.51.0/src/timeout.rs:189:70
Nov 16 11:53:37 pageserver-0.us-east-2.aws.neon.build pageserver[481974]:   21: <tower::retry::future::ResponseFuture<P,S,Request> as core::future::future::Future>::poll
Nov 16 11:53:37 pageserver-0.us-east-2.aws.neon.build pageserver[481974]:              at /home/nonroot/.cargo/registry/src/github.com-1ecc6299db9ec823/tower-0.4.13/src/retry/future.rs:77:41
Nov 16 11:53:37 pageserver-0.us-east-2.aws.neon.build pageserver[481974]:   22: <aws_smithy_client::timeout::TimeoutServiceFuture<InnerFuture> as core::future::future::Future>::poll
Nov 16 11:53:37 pageserver-0.us-east-2.aws.neon.build pageserver[481974]:              at /home/nonroot/.cargo/registry/src/github.com-1ecc6299db9ec823/aws-smithy-client-0.51.0/src/timeout.rs:189:70
Nov 16 11:53:37 pageserver-0.us-east-2.aws.neon.build pageserver[481974]:   23: aws_smithy_client::Client<C,M,R>::call_raw::{{closure}}
Nov 16 11:53:37 pageserver-0.us-east-2.aws.neon.build pageserver[481974]:              at /home/nonroot/.cargo/registry/src/github.com-1ecc6299db9ec823/aws-smithy-client-0.51.0/src/lib.rs:227:56
Nov 16 11:53:37 pageserver-0.us-east-2.aws.neon.build pageserver[481974]:   24: <core::future::from_generator::GenFuture<T> as core::future::future::Future>::poll
Nov 16 11:53:37 pageserver-0.us-east-2.aws.neon.build pageserver[481974]:              at /rustc/e092d0b6b43f2de967af0887873151bb1c0b18d3/library/core/src/future/mod.rs:91:19
Nov 16 11:53:37 pageserver-0.us-east-2.aws.neon.build pageserver[481974]:   25: aws_smithy_client::Client<C,M,R>::call::{{closure}}
Nov 16 11:53:37 pageserver-0.us-east-2.aws.neon.build pageserver[481974]:              at /home/nonroot/.cargo/registry/src/github.com-1ecc6299db9ec823/aws-smithy-client-0.51.0/src/lib.rs:184:29
Nov 16 11:53:37 pageserver-0.us-east-2.aws.neon.build pageserver[481974]:   26: <core::future::from_generator::GenFuture<T> as core::future::future::Future>::poll
Nov 16 11:53:37 pageserver-0.us-east-2.aws.neon.build pageserver[481974]:              at /rustc/e092d0b6b43f2de967af0887873151bb1c0b18d3/library/core/src/future/mod.rs:91:19
Nov 16 11:53:37 pageserver-0.us-east-2.aws.neon.build pageserver[481974]:   27: aws_sdk_s3::client::fluent_builders::GetObject::send::{{closure}}
Nov 16 11:53:37 pageserver-0.us-east-2.aws.neon.build pageserver[481974]:              at /home/nonroot/.cargo/registry/src/github.com-1ecc6299db9ec823/aws-sdk-s3-0.21.0/src/client.rs:7735:40
Nov 16 11:53:37 pageserver-0.us-east-2.aws.neon.build pageserver[481974]:   28: <core::future::from_generator::GenFuture<T> as core::future::future::Future>::poll
Nov 16 11:53:37 pageserver-0.us-east-2.aws.neon.build pageserver[481974]:              at /rustc/e092d0b6b43f2de967af0887873151bb1c0b18d3/library/core/src/future/mod.rs:91:19
Nov 16 11:53:37 pageserver-0.us-east-2.aws.neon.build pageserver[481974]:   29: remote_storage::s3_bucket::S3Bucket::download_object::{{closure}}
Nov 16 11:53:37 pageserver-0.us-east-2.aws.neon.build pageserver[481974]:              at libs/remote_storage/src/s3_bucket.rs:205:20
Nov 16 11:53:37 pageserver-0.us-east-2.aws.neon.build pageserver[481974]:   30: <core::future::from_generator::GenFuture<T> as core::future::future::Future>::poll
Nov 16 11:53:37 pageserver-0.us-east-2.aws.neon.build pageserver[481974]:              at /rustc/e092d0b6b43f2de967af0887873151bb1c0b18d3/library/core/src/future/mod.rs:91:19
Nov 16 11:53:37 pageserver-0.us-east-2.aws.neon.build pageserver[481974]:   31: <remote_storage::s3_bucket::S3Bucket as remote_storage::RemoteStorage>::download::{{closure}}
Nov 16 11:53:37 pageserver-0.us-east-2.aws.neon.build pageserver[481974]:              at libs/remote_storage/src/s3_bucket.rs:399:11
Nov 16 11:53:37 pageserver-0.us-east-2.aws.neon.build pageserver[481974]:   32: <core::future::from_generator::GenFuture<T> as core::future::future::Future>::poll
Nov 16 11:53:37 pageserver-0.us-east-2.aws.neon.build pageserver[481974]:              at /rustc/e092d0b6b43f2de967af0887873151bb1c0b18d3/library/core/src/future/mod.rs:91:19
Nov 16 11:53:37 pageserver-0.us-east-2.aws.neon.build pageserver[481974]:   33: <core::pin::Pin<P> as core::future::future::Future>::poll
Nov 16 11:53:37 pageserver-0.us-east-2.aws.neon.build pageserver[481974]:              at /rustc/e092d0b6b43f2de967af0887873151bb1c0b18d3/library/core/src/future/future.rs:124:9
Nov 16 11:53:37 pageserver-0.us-east-2.aws.neon.build pageserver[481974]:   34: remote_storage::GenericRemoteStorage::download_storage_object::{{closure}}
Nov 16 11:53:37 pageserver-0.us-east-2.aws.neon.build pageserver[481974]:              at libs/remote_storage/src/lib.rs:264:55
Nov 16 11:53:37 pageserver-0.us-east-2.aws.neon.build pageserver[481974]:   35: <core::future::from_generator::GenFuture<T> as core::future::future::Future>::poll
Nov 16 11:53:37 pageserver-0.us-east-2.aws.neon.build pageserver[481974]:              at /rustc/e092d0b6b43f2de967af0887873151bb1c0b18d3/library/core/src/future/mod.rs:91:19
Nov 16 11:53:37 pageserver-0.us-east-2.aws.neon.build pageserver[481974]:   36: pageserver::storage_sync::download::download_index_part::{{closure}}
Nov 16 11:53:37 pageserver-0.us-east-2.aws.neon.build pageserver[481974]:              at pageserver/src/storage_sync/download.rs:148:57
Nov 16 11:53:37 pageserver-0.us-east-2.aws.neon.build pageserver[481974]:   37: <core::future::from_generator::GenFuture<T> as core::future::future::Future>::poll
Nov 16 11:53:37 pageserver-0.us-east-2.aws.neon.build pageserver[481974]:              at /rustc/e092d0b6b43f2de967af0887873151bb1c0b18d3/library/core/src/future/mod.rs:91:19
Nov 16 11:53:37 pageserver-0.us-east-2.aws.neon.build pageserver[481974]:   38: pageserver::storage_sync::download::download_index_parts::{{closure}}::{{closure}}::{{closure}}
Nov 16 11:53:37 pageserver-0.us-east-2.aws.neon.build pageserver[481974]:              at pageserver/src/storage_sync/download.rs:77:75
Nov 16 11:53:37 pageserver-0.us-east-2.aws.neon.build pageserver[481974]:   39: <core::future::from_generator::GenFuture<T> as core::future::future::Future>::poll
Nov 16 11:53:37 pageserver-0.us-east-2.aws.neon.build pageserver[481974]:              at /rustc/e092d0b6b43f2de967af0887873151bb1c0b18d3/library/core/src/future/mod.rs:91:19
Nov 16 11:53:37 pageserver-0.us-east-2.aws.neon.build pageserver[481974]:   40: <futures_util::stream::futures_unordered::FuturesUnordered<Fut> as futures_core::stream::Stream>::poll_next
Nov 16 11:53:37 pageserver-0.us-east-2.aws.neon.build pageserver[481974]:              at /home/nonroot/.cargo/registry/src/github.com-1ecc6299db9ec823/futures-util-0.3.24/src/stream/futures_unordered/mod.rs:514:17
Nov 16 11:53:37 pageserver-0.us-east-2.aws.neon.build pageserver[481974]:   41: futures_util::stream::stream::StreamExt::poll_next_unpin
Nov 16 11:53:37 pageserver-0.us-east-2.aws.neon.build pageserver[481974]:              at /home/nonroot/.cargo/registry/src/github.com-1ecc6299db9ec823/futures-util-0.3.24/src/stream/stream/mod.rs:1626:9
Nov 16 11:53:37 pageserver-0.us-east-2.aws.neon.build pageserver[481974]:   42: <futures_util::stream::stream::next::Next<St> as core::future::future::Future>::poll
Nov 16 11:53:37 pageserver-0.us-east-2.aws.neon.build pageserver[481974]:              at /home/nonroot/.cargo/registry/src/github.com-1ecc6299db9ec823/futures-util-0.3.24/src/stream/stream/next.rs:32:9
Nov 16 11:53:37 pageserver-0.us-east-2.aws.neon.build pageserver[481974]:   43: pageserver::storage_sync::download::download_index_parts::{{closure}}
Nov 16 11:53:37 pageserver-0.us-east-2.aws.neon.build pageserver[481974]:              at pageserver/src/storage_sync/download.rs:80:69
Nov 16 11:53:37 pageserver-0.us-east-2.aws.neon.build pageserver[481974]:   44: <core::future::from_generator::GenFuture<T> as core::future::future::Future>::poll
Nov 16 11:53:37 pageserver-0.us-east-2.aws.neon.build pageserver[481974]:              at /rustc/e092d0b6b43f2de967af0887873151bb1c0b18d3/library/core/src/future/mod.rs:91:19
Nov 16 11:53:37 pageserver-0.us-east-2.aws.neon.build pageserver[481974]:   45: tokio::park:🧵:CachedParkThread::block_on::{{closure}}
Nov 16 11:53:37 pageserver-0.us-east-2.aws.neon.build pageserver[481974]:              at /home/nonroot/.cargo/registry/src/github.com-1ecc6299db9ec823/tokio-1.21.1/src/park/thread.rs:267:54
Nov 16 11:53:37 pageserver-0.us-east-2.aws.neon.build pageserver[481974]:   46: tokio::coop::with_budget::{{closure}}
Nov 16 11:53:37 pageserver-0.us-east-2.aws.neon.build pageserver[481974]:              at /home/nonroot/.cargo/registry/src/github.com-1ecc6299db9ec823/tokio-1.21.1/src/coop.rs:102:9
Nov 16 11:53:37 pageserver-0.us-east-2.aws.neon.build pageserver[481974]:   47: std:🧵:local::LocalKey<T>::try_with
Nov 16 11:53:37 pageserver-0.us-east-2.aws.neon.build pageserver[481974]:              at /rustc/e092d0b6b43f2de967af0887873151bb1c0b18d3/library/std/src/thread/local.rs:445:16
Nov 16 11:53:37 pageserver-0.us-east-2.aws.neon.build pageserver[481974]:   48: std:🧵:local::LocalKey<T>::with
Nov 16 11:53:37 pageserver-0.us-east-2.aws.neon.build pageserver[481974]:              at /rustc/e092d0b6b43f2de967af0887873151bb1c0b18d3/library/std/src/thread/local.rs:421:9
Nov 16 11:53:37 pageserver-0.us-east-2.aws.neon.build pageserver[481974]:   49: tokio::coop::with_budget
Nov 16 11:53:37 pageserver-0.us-east-2.aws.neon.build pageserver[481974]:              at /home/nonroot/.cargo/registry/src/github.com-1ecc6299db9ec823/tokio-1.21.1/src/coop.rs:95:5
Nov 16 11:53:37 pageserver-0.us-east-2.aws.neon.build pageserver[481974]:   50: tokio::coop::budget
Nov 16 11:53:37 pageserver-0.us-east-2.aws.neon.build pageserver[481974]:              at /home/nonroot/.cargo/registry/src/github.com-1ecc6299db9ec823/tokio-1.21.1/src/coop.rs:72:5
Nov 16 11:53:37 pageserver-0.us-east-2.aws.neon.build pageserver[481974]:   51: tokio::park:🧵:CachedParkThread::block_on
Nov 16 11:53:37 pageserver-0.us-east-2.aws.neon.build pageserver[481974]:              at /home/nonroot/.cargo/registry/src/github.com-1ecc6299db9ec823/tokio-1.21.1/src/park/thread.rs:267:31
Nov 16 11:53:37 pageserver-0.us-east-2.aws.neon.build pageserver[481974]:   52: tokio::runtime::enter::Enter::block_on
Nov 16 11:53:37 pageserver-0.us-east-2.aws.neon.build pageserver[481974]:              at /home/nonroot/.cargo/registry/src/github.com-1ecc6299db9ec823/tokio-1.21.1/src/runtime/enter.rs:152:13
Nov 16 11:53:37 pageserver-0.us-east-2.aws.neon.build pageserver[481974]:   53: tokio::runtime::scheduler::multi_thread::MultiThread::block_on
Nov 16 11:53:37 pageserver-0.us-east-2.aws.neon.build pageserver[481974]:              at /home/nonroot/.cargo/registry/src/github.com-1ecc6299db9ec823/tokio-1.21.1/src/runtime/scheduler/multi_thread/mod.rs:79:9
Nov 16 11:53:37 pageserver-0.us-east-2.aws.neon.build pageserver[481974]:   54: tokio::runtime::Runtime::block_on
Nov 16 11:53:37 pageserver-0.us-east-2.aws.neon.build pageserver[481974]:              at /home/nonroot/.cargo/registry/src/github.com-1ecc6299db9ec823/tokio-1.21.1/src/runtime/mod.rs:492:44
Nov 16 11:53:37 pageserver-0.us-east-2.aws.neon.build pageserver[481974]:   55: pageserver::storage_sync::spawn_storage_sync_task
Nov 16 11:53:37 pageserver-0.us-east-2.aws.neon.build pageserver[481974]:              at pageserver/src/storage_sync.rs:656:34
Nov 16 11:53:37 pageserver-0.us-east-2.aws.neon.build pageserver[481974]:   56: pageserver::tenant_mgr::init_tenant_mgr
Nov 16 11:53:37 pageserver-0.us-east-2.aws.neon.build pageserver[481974]:              at pageserver/src/tenant_mgr.rs:88:13
Nov 16 11:53:37 pageserver-0.us-east-2.aws.neon.build pageserver[481974]:   57: pageserver::start_pageserver
Nov 16 11:53:37 pageserver-0.us-east-2.aws.neon.build pageserver[481974]:              at pageserver/src/bin/pageserver.rs:269:9
Nov 16 11:53:37 pageserver-0.us-east-2.aws.neon.build pageserver[481974]:   58: pageserver::main
Nov 16 11:53:37 pageserver-0.us-east-2.aws.neon.build pageserver[481974]:              at pageserver/src/bin/pageserver.rs:103:5
Nov 16 11:53:37 pageserver-0.us-east-2.aws.neon.build pageserver[481974]:   59: core::ops::function::FnOnce::call_once
Nov 16 11:53:37 pageserver-0.us-east-2.aws.neon.build pageserver[481974]:              at /rustc/e092d0b6b43f2de967af0887873151bb1c0b18d3/library/core/src/ops/function.rs:248:5
Nov 16 11:53:37 pageserver-0.us-east-2.aws.neon.build pageserver[481974]: note: Some details are omitted, run with `RUST_BACKTRACE=full` for a verbose backtrace.
```

Feels like better testing on the env is needed later, maybe more e2e
tests have to be written (albeit we have download tests, so something
else happens here, tls issues?)
2022-11-16 15:10:36 +00:00
Heikki Linnakangas
9ae4da4f31 Silence test failure caused by race condition between GC and detach.
Thanks to the race condition, GC sometimes fails with "no such file or
directory" error, if the tenant is detached concurrently. That's a
known issue, but it didn't cause test failures until we started to
check for unexpected ERRORs in the log in commit 46d30bf054. We should
fix the race condition, of course, but until we do, let's silence the
failures.
2022-11-16 15:50:49 +02:00
Sergey Melnikov
aca221ac8b Switch old staging to new etcd (#2834) 2022-11-16 16:54:55 +04:00
Heikki Linnakangas
d013a2b227 Make test_gc_cutoff test more robust.
Previously, if the failpoint was not reached for some reason, the test
would only fail because it would reach the 5 minute timeout we have on
all python tests. That's very subtle. Make it fail explicitly, if the
failpoint is not hit on each iteration of the loop.

Extracted from a larger PR, see
https://github.com/neondatabase/neon/pull/2785/files#r1022765794
2022-11-16 13:24:02 +02:00
Heikki Linnakangas
3f93c6c6f0 Improve checks for broken tenants in test_broken_timeline.py
- Refactor the code a little bit, removing the silly for-loop over a
  single element.

- Make it more clear in log messages that the errors are expectd

- Check for a more precise error message "Failed to load delta layer"
  instead of just "extracting base backup failed".
2022-11-16 13:16:00 +02:00
Rory de Zoete
53267969d7 Preparation for ARM runners (#2751)
Need to make the runner tag more specific else we inadvertently might
run workloads on the wrong arch

Co-authored-by: Rory de Zoete <rdezoete@Rorys-Mac-Studio.fritz.box>
Co-authored-by: Rory de Zoete <rdezoete@RorysMacStudio.fritz.box>
2022-11-16 11:28:57 +01:00
Andrés
c4b417ecdb Introduce aws-sdk-rust as rusoto S3 replacement (#2802)
- `aws-smithy-http`: Needed because of `SdkBody` see
https://github.com/awslabs/smithy-rs/issues/1759
- `aws-types`: Needed because of `SharedCredentialsProvider`, the
recommended way from aws is something like
`aws_config::from_env().region("us-east-1").load().await` but that is
problematic because of:

- `sync -> async ` in the creation of S3Client and i don't want to
change the signature of any method in this class.
- We do not need the four default steps in
https://github.com/awslabs/aws-sdk-rust/blob/main/sdk/aws-config/src/default_provider/credentials.rs#L235

- `Hyper`: Similar to what's currently doing Rusoto in
https://github.com/rusoto/rusoto/blob/master/rusoto/signature/src/signature.rs#L59
to stream the body, see also
https://github.com/awslabs/aws-sdk-rust/discussions/361

Co-authored-by: andres <andres.rodriguez@outlook.es>
2022-11-16 11:28:37 +02:00
Joonas Koivunen
1d105727cb perf: simple walredo bench (#2816)
adds a simple walredo bench to allow some comparison of the walredo
throughput.

Cc: #1339, #2778
2022-11-16 11:13:56 +02:00
Heikki Linnakangas
4787a744c2 Add documentation page about error handling and logging. (#2681)
Add a page to the internal documentation, on how we do error
handling and logging.
2022-11-16 10:38:03 +02:00
Sergey Melnikov
ac3ccac56c Add zenith-1-ps-4 and zenith-1-ps-5 (#2815) 2022-11-16 11:25:24 +04:00
Alexander Bayandin
638af96c51 postgres-v15: fix expected results for regress tests (#2822)
Fix expected output for regress tests for Postgres 15.

Required for https://github.com/neondatabase/neon/pull/2809
2022-11-15 22:32:12 +00:00
Kirill Bulatov
1e21ca1afe Trim whitespaces off Lsn strings when parsing (#2827) 2022-11-15 22:39:44 +02:00
Heikki Linnakangas
46d30bf054 Check for errors in pageserver log after each test.
If there are any unexpected ERRORs or WARNs in pageserver.log after test
finishes, fail the test. This requires whitelisting the errors that *are*
expected in each test, and there's also a few common errors that are
printed by most tests, which are whitelisted in the fixture itself.

With this, we don't need the special abort() call in testing mode, when
compaction or GC fails. Those failures will print ERRORs to the logs,
which will be picked up by this new mechanisms.

A bunch of errors are currently whitelisted that we probably shouldn't
be emitting in the first place, but fixing those is out of scope for this
commit, so I just left FIXME comments on them.
2022-11-15 18:47:28 +02:00
Heikki Linnakangas
d0105cea1f Avoid errors when removing a timeline that's still active 2022-11-15 18:47:28 +02:00
Heikki Linnakangas
e44e4a699b Downgrade log message, if client terminates COPY during basebackup import
It's more or less expected from pageserver's point of view. Change the
error kind to ConnectionReset, so that it gets logged at INFO level
instead of ERROR.
2022-11-15 18:47:28 +02:00
Heikki Linnakangas
223834a420 Fix confusion between Postgres and pageserver connection string in test.
We passed the pageserver's libpq endpoint URL as the 'compute_ctl
--connstr' argument, but that was bogus: the --connstr URL is supposed
to be the URL to the *Postgres* instance that compute_ctl launches and
monitors, not to the pageserver. compute_ctl does need the pageserver
URL too, but it is read from the cluster spec JSON, not --connstr.

That was pretty confusing, as you got a lot of "unknown command"
errors in the pageserver log, when compute_tools tries to run regular
SQL commands on the pageserver. The test still passed, however, as it
doesn't require the SQL commands to succeed. But to make this less
confusing, use an invalid hostname instead, so that the queries will
fail to even connect.
2022-11-15 18:47:28 +02:00
MMeent
01778e37cc Address issues in the pagestore prefetch mechanism: (#2790)
- Update vendored PostgreSQL to address prefetch issues
 - Make flushed state explicit in PrefetchState
 - Move flush logic into prefetch_wait_for, where possible
 - Clean up some prefetch state handling code in the various code
elements handling state transitions.
 - Fix a race condition in neon_read_at_lsn where a hash entry pointer
was used after the hash table was updated. This could result in
incorrect state transitions and assertion failures after disconnects
during prefetch_wait_for in that neon_read_at_lsn.
 
Fixes #2780
2022-11-15 15:12:38 +01:00
Alexander Bayandin
03190a2161 GitHub Actions: Do not create Allure report for cancelled jobs (#2813)
If a workflow is cancelled, do not delay its finishing by creating an allure
report.
2022-11-15 10:27:59 +00:00
Kirill Bulatov
f87017c04d Omit dependencies' debug info (#2803)
Based on https://neondb.slack.com/archives/C0277TKAJCA/p1668079753506749

Co-authored-by: Arseny Sher <sher-ars@yandex.ru>
2022-11-14 12:44:41 +00:00
andres
c11cbf0f5c fix test_compare_child_and_root_pgbench_perf to do a fair comparison 2022-11-13 21:03:54 +02:00
Heikki Linnakangas
f30ef00439 Stop building the legacy "compute-node" docker image.
Before we had separate images for v14 and v15, the compute node image
was called just "neondatabase/compute-node". It has been superseded by
the "neondatabase/compute-node-v14" and "neondatabase/compute-node-v15"
images. The old image is not used by the cloud console build or tests
anymore.
2022-11-12 20:48:10 +02:00
Heikki Linnakangas
dbe5b52494 Avoid some vector-growing overhead.
I saw this in 'perf' profile of a sequential scan:

> -   31.93%     0.21%  compute request  pageserver         [.] <pageserver::walredo::PostgresRedoManager as pageserver::walredo::WalRedoManager>::request_redo
>    - 31.72% <pageserver::walredo::PostgresRedoManager as pageserver::walredo::WalRedoManager>::request_redo
>       - 31.26% pageserver::walredo::PostgresRedoManager::apply_batch_postgres
>          + 7.64% <std::process::ChildStdin as std::io::Write>::write
>          + 6.17% nix::poll::poll
>          + 3.58% <std::process::ChildStderr as std::io::Read>::read
>          + 2.96% std::sync::condvar::Condvar::notify_one
>          + 2.48% std::sys::unix::locks::futex::Condvar::wait
>          + 2.19% alloc::raw_vec::RawVec<T,A>::reserve::do_reserve_and_handle
>          + 1.14% std::sys::unix::locks::futex::Mutex::lock_contended
>            0.67% __rust_alloc_zeroed
>            0.62% __stpcpy_ssse3
>            0.56% std::sys::unix::locks::futex::Mutex::wake

Note the 'do_reserve_handle' overhead. That's caused by having to grow
the buffer used to construct the WAL redo request. This commit
eliminates that overhead. It's only about 2% of the overall CPU usage,
but every little helps.

Also reuse the temp buffer when reading records from a DeltaLayer, and
call Vec::reserve to avoid growing a buffer when reading a blob across
pages. I saw a reduction from 2% to 1% of CPU spent in
do_reserve_and_handle in that codepath, but that's such a small change
that it could be just noise. Seems like it shouldn't hurt though.
2022-11-12 18:52:25 +02:00
Heikki Linnakangas
4131a6efae Remove unused Dockerfile.compute-node.legacy.
The cloud end-to-end tests use the docker images built by the neon PR
now, and don't need this legacy Dockerfile anymore.
2022-11-12 18:51:51 +02:00
Kirill Bulatov
03695261fc Test storage Docker images (#2767)
Closes https://github.com/neondatabase/neon/issues/2697
Example:
https://github.com/neondatabase/neon/actions/runs/3416774593/jobs/5688394855

Adds a set of tests on the storage Docker images before they are pushed
to the public registries:
* tests that pageserver binary has the correct version string (other
binaries are built with the same library, so it should be enough to test
one)
* tests that the compose file set-up works and all components are able
to start and perform a single SQL query (CREATE TABLE)
2022-11-11 19:42:26 +02:00
bojanserafimov
7fd88fab59 Trace read requests (#2762) 2022-11-10 16:43:04 -05:00
bojanserafimov
7edc098c40 Add perf test instructions (#2777) 2022-11-10 16:05:57 -05:00
Vadim Kharitonov
8421218152 Change the branch name for V14 as it does for V15 2022-11-10 17:41:36 +01:00
Alexander Bayandin
d5b7832c21 Fix test_wal_backpressure tests (#2792)
Fix expected return type for `fetchone `:

```
AssertionError: assert False
 +  where False = isinstance((Decimal('56048'), '55 kB', '0/1CF52D8', '0/1CE77E8'), list)
```
2022-11-10 16:15:04 +00:00
Arthur Petukhovsky
c6072d38c2 Remove debug logs in should_walsender_stop (#2791) 2022-11-10 15:49:00 +00:00
Alexander Bayandin
175779c0ef GitHub Actions: fix non-parallel benchmarks on CI (#2787)
Fix non-parallel pytest run by setting `--dist=loadgroup` only for
pytest command with xdist enabled (`-n` is set)
2022-11-10 12:51:47 +00:00
Christian Schwarz
8654e95fae walredo: fix zombie processes ([postgres] <defunct>)
This change wraps the std::process:Child that we spawn for WAL redo
into a type that ensures that we try to SIGKILL + waitpid() on it.

If there is no explicit call to kill_and_wait(), the Drop implementation
will spawns a task that does it in the BACKGROUND_RUNTIME.
That's an ugly hack but I think it's better than doing kill+wait
synchronously from Drop, since I think the general assumption in the
Rust ecosystem is that Drop doesn't block.
Especially since the drop sites can be _any_ place that drops the last
Arc<PostgresRedoManager>, e.g., compaction or GC.

The benefit of having the new type over just adding a Drop impl to
PostgresRedoProcess is that we can construct it earlier than the full
PostgresRedoProcess in PostgresRedoProcess::launch().
That allows us to correctly kill+wait the child if there is an error in
PostgresRedoProcess::launch() after spawning it.

I also took a stab at a regression test. I manually verified
that it fails before the fix to walredo.rs.

fixes https://github.com/neondatabase/neon/issues/2761
closes https://github.com/neondatabase/neon/pull/2776
2022-11-10 12:50:50 +01:00
Vadim Kharitonov
f720dd735e Stricter mypy linters for test_runner/fixtures/* 2022-11-10 12:47:27 +01:00
Alexander Bayandin
c4f9f1dc6d Add data format forward compatibility tests (#2766)
Add `test_forward_compatibility`, which checks if it's going to
be possible to roll back a release to the previous version.
The test uses artifacts (Neon & Postgres binaries) from the previous
release to start Neon on the repo created by the current version. It
performs exactly the same checks as `test_backward_compatibility` does.

Single `ALLOW_BREAKING_CHANGES` env var got replaced by
`ALLOW_BACKWARD_COMPATIBILITY_BREAKAGE` &
`ALLOW_FORWARD_COMPATIBILITY_BREAKAGE` and can be set by `backward
compatibility breakage` and `forward compatibility breakage` labels
respectively.
2022-11-10 09:06:34 +00:00
Kirill Bulatov
4a10e1b066 Pass pushed storage Docker tag to e2e jobs 2022-11-10 08:50:42 +02:00
Vadim Kharitonov
b55466045e Introduce codeowners 2022-11-09 11:43:10 +01:00
Heikki Linnakangas
e999f66b01 Use a cached WaitEventSet instead of WaitLatchOrSocket.
When we repeatedly wait for the same events, it's faster to create the
event set once and reuse it. While testing with a sequential scan test
case, I saw WaitLatchOrSocket consuming a lot of CPU:

> -   40.52%     0.14%  postgres  postgres           [.] WaitLatchOrSocket
>    - 40.38% WaitLatchOrSocket
>       + 17.83% AddWaitEventToSet
>       + 9.47% close@plt
>       + 8.29% CreateWaitEventSet
>       + 4.57% WaitEventSetWait

This eliminates most of that overhead.
2022-11-08 19:45:14 +02:00
andres
1cf257bc4a feedback 2022-11-08 20:15:54 +04:00
andres
40164bd589 Use latestMsgReceivedAt in walproposer 2022-11-08 20:15:54 +04:00
Christian Schwarz
c3a470a29b walredo process management: handle every error on the kill() and drop path
If we're not calling kill() before dropping the PostgresRedoProcess, we
currently leak it.
That's most likely the root cause for #2761.
This patch
1. adds an error log message for that case and
2. adds error handling for all errors on the kill() path. If we're a
`testing` build, we panic. Otherwise, we log an error and leak the
process.

The error handling changes (2) are necessary to conclusively state that
the root cause for #2761 is indeed (1). If we didn't have them, the root
cause could be missing error handling instead.

To make the log messages useful, I've added tracing::instrument
attributes that log the tenant_id and PID. That helps mapping back the
PID of `defunct` processes to pageserver log messages. Note that a
defunct process's `/proc/$PID/` directory isn't very useful. We have
left little more than its PID.

Once we have validated the root cause, we'll find a fix, but that's
still an ongoing discussion.

refs https://github.com/neondatabase/neon/issues/2761
closes https://github.com/neondatabase/neon/pull/2769
2022-11-08 14:03:13 +01:00
Alexander Bayandin
c1a76eb0e5 test_runner: replace global variables with fixtures (#2754)
This PR replaces the following global variables in the test framework
with fixtures to make tests more configurable. I mainly need this for
the forward compatibility tests (draft in
https://github.com/neondatabase/neon/pull/2766).

```
base_dir
neon_binpath 
pg_distrib_dir
top_output_dir
default_pg_version (this one got replaced with a fixture named pg_version)
```

Also, this PR adds more `Path` type where the code implies it.
2022-11-07 18:39:51 +00:00
MMeent
d5b6471fa9 Update prefetch mechanism: (#2687)
Prefetch requests and responses are stored in a ringbuffer instead of a
queue, which means we can utilize prefetches of many relations
concurrently -- page reads of un-prefetched relations now don't imply
dropping prefetches.

In a future iteration, this may detect sequential scans based on the
read behavior of sequential scans, and will dynamically prefetch buffers
for such relations as needed. Right now, it still depends on explicit
prefetch requests from PostgreSQL.

The main improvement here is that we now have a buffer for prefetched
pages of 128 entries with random access. Before, we had a similarly sized
cache, but this cache did not allow for random access, which resulted in
dropped entries when multiple systems used the prefetching subsystem
concurrently.

See also: #2544
2022-11-07 18:13:24 +02:00
Joonas Koivunen
548d472b12 fix: logical size query at before initdb_lsn (#2755)
With more realistic selection of gc_horizon in tests there is an
immediate failure with trying to query logical size with lsn <
initdb_lsn. Fixes that, adds illustration gathered from clarity of
explaining this tenant size calculation to more people.

Cc: #2748, #2599.
2022-11-07 12:03:57 +02:00
Dmitry Rodionov
99e745a760 review adjustments 2022-11-06 13:42:18 +02:00
Dmitry Rodionov
15d970f731 decrease diff by moving check_checkpoint_distance back
Co-authored-by: Christian Schwarz <me@cschwarz.com>
2022-11-06 13:42:18 +02:00
Heikki Linnakangas
7b7f84f1b4 Refactor layer flushing task
Extracted from https://github.com/neondatabase/neon/pull/2595
2022-11-06 13:42:18 +02:00
Alexander Bayandin
bc40a5595f test_runner: update cryptography (#2753)
Bump `cryptography` package from 37.0.4 to 38.0.3 to fix
https://github.com/neondatabase/neon/security/dependabot/6 (rather just
in case).

Ref https://www.openssl.org/news/secadv/20221101.txt
2022-11-04 21:29:51 +00:00
Heikki Linnakangas
07b3ba5ce3 Bump postgres submodules, for some cosmetic cleanups. 2022-11-04 20:30:13 +02:00
Dmitry Ivanov
c38f38dab7 Move pq_proto to its own crate 2022-11-03 22:56:04 +03:00
bojanserafimov
71d268c7c4 Write message serialization test (#2746) 2022-11-03 14:24:15 +00:00
Joonas Koivunen
cf68963b18 Add initial tenant sizing model and a http route to query it (#2714)
Tenant size information is gathered by using existing parts of
`Tenant::gc_iteration` which are now separated as
`Tenant::refresh_gc_info`. `Tenant::refresh_gc_info` collects branch
points, and invokes `Timeline::update_gc_info`; nothing was supposed to
be changed there. The gathered branch points (through Timeline's
`GcInfo::retain_lsns`), `GcInfo::horizon_cutoff`, and
`GcInfo::pitr_cutoff` are used to build up a Vec of updates fed into the
`libs/tenant_size_model` to calculate the history size.

The gathered information is now exposed using `GET
/v1/tenant/{tenant_id}/size`, which which will respond with the actual
calculated size. Initially the idea was to have this delivered as tenant
background task and exported via metric, but it might be too
computationally expensive to run it periodically as we don't yet know if
the returned values are any good.

Adds one new metric:
- pageserver_storage_operations_seconds with label `logical_size`
    - separating from original `init_logical_size`

Adds a pageserver wide configuration variable:
- `concurrent_tenant_size_logical_size_queries` with default 1

This leaves a lot of TODO's, tracked on issue #2748.
2022-11-03 12:39:19 +00:00
Arseny Sher
63221e4b42 Fix sk->ps walsender shutdown on sk side on caughtup.
This will fix many threads issue, but code around awfully still wants
improvement.

https://github.com/neondatabase/neon/issues/2722
2022-11-03 16:20:55 +04:00
bojanserafimov
d7eeb73f6f Impl serialize for pagestream FeMessage (#2741) 2022-11-02 23:44:07 -04:00
Joonas Koivunen
5112142997 fix: use different port for temporary postgres (#2743)
`test_tenant_relocation` ends up starting a temporary postgres instance with a fixed port. the change makes the port configurable at scripts/export_import_between_pageservers.py and uses that in test_tenant_relocation.
2022-11-02 18:37:48 +00:00
bojanserafimov
a0a74868a4 Fix clippy (#2742) 2022-11-02 12:30:09 -04:00
Christian Schwarz
b154992510 timeline_list_handler: avoid spawn_blocking
As per https://github.com/neondatabase/neon/issues/2731#issuecomment-1299335813

refs https://github.com/neondatabase/neon/issues/2731
2022-11-02 16:22:58 +01:00
Christian Schwarz
a86a38c96e README: fix instructions on how to run tests
The `make debug` target doesn't exist, and I can't find it in the Git
history.
2022-11-02 16:22:58 +01:00
Christian Schwarz
590f894db8 tenant_status: remove unnecessary spawn_blocking
The spawn_blocking is pointless in this cases: get_tenant is not
expected to block for any meaningful amount of time. There are
get_tenant calls in most other functions in the file too, and they don't
bother with spawn_blocking. Let's remove the spawn_blocking from
tenant_status, too, to be consistent.

fixes https://github.com/neondatabase/neon/issues/2731
2022-11-02 16:22:58 +01:00
Alexander Bayandin
0a0595b98d test_backward_compatibility: assign random port to compute (#2738) 2022-11-02 15:22:38 +00:00
Dmitry Rodionov
e56d11c8e1 fix style if possible (cannot really split long lines in mermaid) 2022-11-02 17:15:49 +02:00
Dmitry Rodionov
ccdc3188ed update according to discussion and comments 2022-11-02 17:15:49 +02:00
Dmitry Rodionov
67401cbdb8 pageserver s3 coordination 2022-11-02 17:15:49 +02:00
Kirill Bulatov
d42700280f Remove daemonize from storage components (#2677)
Move daemonization logic into `control_plane`.
Storage binaries now only crate a lockfile to avoid concurrent services running in the same directory.
2022-11-02 02:26:37 +02:00
Kirill Bulatov
6df4d5c911 Bump rustc to 1.62.1 (#2728)
Changelog: https://github.com/rust-lang/rust/blob/master/RELEASES.md#version-1621-2022-07-19
2022-11-02 01:21:33 +02:00
Dmitry Rodionov
32d14403bd remove wrong is_active filter for timelines in compaction/gc
Gc needs to know about all branch points, not only ones for
timelines that are active at the moment of gc. If timeline
is inactive then we wont know about branch point. In this
case gc can delete data that is needed by child timeline.

For compaction it is less severe. Delaying compaction can
cause an effect on performance. So it is still better to run
it. There is a logic to exit it quickly if there is nothing
to compact
2022-11-01 18:07:08 +02:00
Dmitry Ivanov
0df3467146 Refactoring: replace utils::connstring with Url-based APIs 2022-11-01 18:17:36 +03:00
Dmitry Rodionov
c64a121aa8 do not nest wal_connection_manager span inside parent one 2022-11-01 15:08:23 +02:00
Heikki Linnakangas
22cc8760b9 Move walredo process code under pgxn in the main 'neon' repository.
- Refactor the way the WalProposerMain function is called when started
  with --sync-safekeepers. The postgres binary now explicitly loads
  the 'neon.so' library and calls the WalProposerMain in it. This is
  simpler than the global function callback "hook" we previously used.

- Move the WAL redo process code to a new library, neon_walredo.so,
  and use the same mechanism as for --sync-safekeepers to call the
  WalRedoMain function, when launched with --walredo argument.

- Also move the seccomp code to neon_walredo.so library. I kept the
  configure check in the postgres side for now, though.
2022-10-31 01:11:50 +01:00
Arseny Sher
596d622a82 Fix test_prepare_snapshot.
It should checkpoint pageserver after waiting for all data arrival, not before.
2022-10-28 22:12:31 +04:00
Sergey Melnikov
7481fb082c Fix bugs in #2713 (#2716) 2022-10-28 14:12:49 +00:00
Arseny Sher
1eb9bd052a Bump vendor/postgres-v15 to fix XLP_FIRST_IS_CONTRECORD issue.
ref https://github.com/neondatabase/cloud/issues/2688
2022-10-28 16:45:11 +03:00
Sergey Melnikov
59a3ca4ec6 Deploy proxy to new prod regions (#2713)
* Refactor proxy deploy

* Test new prod deploy

* Remove assume role

* Add new values

* Add all regions
2022-10-28 16:25:28 +03:00
Sergey Melnikov
e86a9105a4 Deploy storage to new prod regions (#2709) 2022-10-28 10:17:27 +00:00
Stas Kelvich
d3c8749da5 Build compute postgres with openssl support
The main reason for that change is that Postgres 15 requires OpenSSL
for `pgcrypto` to work. Also not a bad idea to have SSL-enabled
Postgres in general.
2022-10-28 10:39:22 +03:00
Alexander Bayandin
128dc8d405 Nightly Benchmarks: fix workflow (#2708) 2022-10-27 19:26:10 +03:00
Alexander Bayandin
0cbae6e8f3 test_backward_compatibility: friendlier error message (#2707) 2022-10-27 15:54:49 +00:00
Alexander Stanovoy
78e412b84b The fix of #2650. (#2686)
* Wrappers and drop implementations for image and delta layer writers.
* Two regression tests for the image and delta layer files.
2022-10-27 14:02:55 +00:00
Rory de Zoete
6dbf202e0d Update crane copy target (#2704)
Co-authored-by: Rory de Zoete <rdezoete@Rorys-Mac-Studio.fritz.box>
2022-10-27 16:00:40 +02:00
Arseny Sher
b42bf9265a Enable etcd compaction in neon_local. 2022-10-27 10:47:08 +03:00
Stas Kelvich
1f08ba5790 Avoid debian-testing packages in compute Dockerfiles
plv8 can only be built with a fairly new gold linker version. We used to install
it via binutils packages from testing, but it also updates libc and that causes
troubles in the resulting image as different extensions were built against
different libc versions. We could either use libc from debian-testing everywhere
or restrain from using testing packages and install necessary programs manually.
This patch uses the latter approach: gold for plv8 and cmake for h3 are
installed manually.

In a passing declare h3_postgis as a safe extension (previous omission).
2022-10-27 09:44:16 +03:00
bojanserafimov
0c54eb65fb Move pagestream api to libs/pageserver_api (#2698) 2022-10-26 17:32:31 -04:00
mikecaat
259a5f356e Add a docker-compose example file (#1943) (#2666)
Co-authored-by: Masahiro Ikeda <masahiro.ikeda.us@hco.ntt.co.jp>
2022-10-26 13:59:25 +03:00
Sergey Melnikov
a3cb8c11e0 Do not release to new staging proxies on release (#2685) 2022-10-25 23:51:23 +00:00
bojanserafimov
9fb2287f87 Add draw_timeline binary (#2688) 2022-10-25 11:25:22 -04:00
Alexander Bayandin
834ffe1bac Add data format backward compatibility tests (#2626) 2022-10-25 16:41:50 +02:00
Stas Kelvich
df18b041c0 Use apt version pinning instead of repo priorities
Higher `bullseye` priority doesn't works for packages installed
via `bullseye-updates`, e.g.:

```
libc-bin:
  Installed: 2.31-13+deb11u5
  Candidate: 2.35-3
  Version table:
     2.35-3 500
        500 http://ftp.debian.org/debian testing/main amd64 Packages
 *** 2.31-13+deb11u5 500
        500 http://deb.debian.org/debian bullseye-updates/main amd64 Packages
        100 /var/lib/dpkg/status
     2.31-13+deb11u4 990
        990 http://deb.debian.org/debian bullseye/main amd64 Packages
```

Try version pinning instead
2022-10-25 14:29:11 +03:00
Anastasia Lubennikova
39897105b2 Check postgres version and ensure that public schema exists
before running GRANT query on it
2022-10-25 09:55:24 +03:00
Stas Kelvich
2f399f08b2 Hotfix to disable grant create on public schema
`GRANT CREATE ON SCHEMA public` fails if there is no schema `public`.
Disable it in release for now and make a better fix later (it is
needed for v15 support).
2022-10-25 09:55:24 +03:00
Arseny Sher
9f49605041 Fix division by zero panic in determine_offloader. 2022-10-22 18:25:12 +03:00
Konstantin Knizhnik
7b6431cbd7 Disable wal_log_hints by default (#2598)
* Disable wal_log_hints by default

* Remove obsolete comment anbout wal_log_hints
2022-10-22 14:59:18 +03:00
Lassi Pölönen
321aeac3d4 Json logging capability (#2624)
* Support configuring the log format as json or plain.

Separately test json and plain logger. They would be competing on the
same global subscriber otherwise.

* Implement log_format for pageserver config

* Implement configurable log format for safekeeper.
2022-10-21 17:30:20 +00:00
Andrés
71ef7b6663 Remove cached_property package (#2673)
Co-authored-by: andres <andres.rodriguez@outlook.es>
2022-10-21 20:02:31 +03:00
Kirill Bulatov
5928cb33c5 Introduce timeline state (#2651)
Similar to https://github.com/neondatabase/neon/pull/2395, introduces a state field in Timeline, that's possible to subscribe to.

Adjusts

* walreceiver to not to have any connections if timeline is not Active
* remote storage sync to not to schedule uploads if timeline is Broken
* not to create timelines if a tenant/timeline is broken
* automatically switches timelines' states based on tenant state

Does not adjust timeline's gc, checkpointing and layer flush behaviour much, since it's not safe to cancel these processes abruptly and there's task_mgr::shutdown_tasks that does similar thing.
2022-10-21 15:51:48 +00:00
Sergey Melnikov
6ff2c61ae0 Refactor safekeeper s3 config and change it for new account (#2672) 2022-10-21 13:44:08 +00:00
Arseny Sher
7480a0338a Determine safekeeper for offloading WAL without etcd election API.
This API is rather pointless, as sane choice anyway requires knowledge of peers
status and leaders lifetime in any case can intersect, which is fine for us --
so manual elections are straightforward. Here, we deterministically choose among
the reasonably caught up safekeepers, shifting by timeline id to spread the
load.

A step towards custom broker https://github.com/neondatabase/neon/issues/2394
2022-10-21 15:33:27 +03:00
Sergey Melnikov
2709878b8b Deploy scram proxies into new account (#2643) 2022-10-21 14:21:22 +03:00
Kirill Bulatov
39e4bdb99e Actualize tenant and timeline API modifiers (#2661)
* Actualize tenant and timeline API modifiers
* Use anyhow::Result explicitly
2022-10-21 10:58:43 +00:00
Anastasia Lubennikova
52e75fead9 Use anyhow::Result explicitly 2022-10-21 12:47:06 +03:00
Anastasia Lubennikova
a347d2b6ac #2616 handle 'Unsupported pg_version' error properly 2022-10-21 12:47:06 +03:00
Heikki Linnakangas
fc4ea3553e test_gc_cutoff.py fixes (#2655)
* Fix bogus early exit from GC.

Commit 91411c415a added this failpoint, but the early exit was not
intentional.

* Cleanup test_gc_cutoff.py test.

- Remove the 'scale' parameter, this isn't a benchmark
- Tweak pgbench and pageserver options to create garbage faster that the
  the GC can collect away. The test used to take just under 5 minutes,
  which was uncomfortably close to the default 5 minute test timeout, and
  annoyingly even without the hard limit. These changes bring it down to
  about 1-2 minutes.
- Improve comments, fix typos
- Rename the failpoint. The old name, 'gc-before-save-metadata' implied
  that the failpoint was before the metadata update, but it was in fact
  much later in the function.
- Move the call to persist the metadata outside the lock, to avoid
  holding it for too long.

To verify that this test still covers the original bug,
https://github.com/neondatabase/neon/issues/2539, I commenting out
updating the metadata file like this:
```
diff --git a/pageserver/src/tenant/timeline.rs b/pageserver/src/tenant/timeline.rs
index 1e857a9a..f8a9f34a 100644
--- a/pageserver/src/tenant/timeline.rs
+++ b/pageserver/src/tenant/timeline.rs
@@ -1962,7 +1962,7 @@ impl Timeline {
         }
         // Persist the new GC cutoff value in the metadata file, before
         // we actually remove anything.
-        self.update_metadata_file(self.disk_consistent_lsn.load(), HashMap::new())?;
+        //self.update_metadata_file(self.disk_consistent_lsn.load(), HashMap::new())?;

         info!("GC starting");

```
It doesn't fail every time with that, but it did fail after about 5
runs.
2022-10-21 02:39:55 +03:00
Dmitry Rodionov
cca1ace651 make launch_wal_receiver infallible 2022-10-21 00:40:12 +03:00
Sergey Melnikov
30984c163c Fix race between pushing image to ECR and copying to dockerhub (#2662) 2022-10-20 23:01:01 +03:00
Konstantin Knizhnik
7404777efc Pin pages with speculative insert tuples to prevent their reconstruction because spec_token is not wal logged (#2657)
* Pin pages with speculative insert tuples to prevent their reconstruction because spec_token is not wal logged

refer ##2587

* Bump postgres versions
2022-10-20 20:06:05 +03:00
Heikki Linnakangas
eb1bdcc6cf If an FSM or VM page cannot be reconstructed, fill it with zeros.
If we cannot reconstruct an FSM or VM page, while creating image
layers, fill it with zeros instead. That should always be safe, for
the FSM and VM, in the sense that you won't lose actual user data. It
will get cleaned up by VACUUM later.

We had a bug with FSM/VM truncation, where we truncated the FSM and VM
at WAL replay to a smaller size than PostgreSQL originally did. We
thought was harmless, as the FSM and VM are not critical for
correctness and can be zeroed out or truncated without affecting user
data. However, it lead to a situation where PostgreSQL created
incremental WAL records for pages that we had already truncated away
in the pageserver, and when we tried to replay those WAL records, that
failed. That lead to a permanent error in image layer creation, and
prevented it from ever finishing. See
https://github.com/neondatabase/neon/issues/2601. With this patch,
those pages will be filled with zeros in the image layer, which allows
the image layer creation to finish.
2022-10-20 17:27:01 +03:00
Arthur Petukhovsky
f5ab9f761b Remove flaky checks in test_delete_force (#2567) 2022-10-20 17:14:32 +04:00
Kirill Bulatov
306a47c4fa Use uninit mark files during timeline init for atomic creation (#2489)
Part of https://github.com/neondatabase/neon/pull/2239

Regular, from scratch, timeline creation involves initdb to be run in a separate directory, data from this directory to be imported into pageserver and, finally, timeline-related background tasks to start.

This PR ensures we don't leave behind any directories that are not marked as temporary and that pageserver removes such directories on restart, allowing timeline creation to be retried with the same IDs, if needed.

It would be good to later rewrite the logic to use a temporary directory, similar what tenant creation does.
Yet currently it's harder than this change, so not done.
2022-10-20 14:19:17 +03:00
Kirill Bulatov
84c5f681b0 Fix test feature detection (#2659)
Follow-up of #2636 and #2654 , fixing the test detection feature.

Pageserver currently outputs features as

```
/target/debug/pageserver --version
Neon page server git:7734929a8202c8cc41596a861ffbe0b51b5f3cb9 failpoints: true, features: ["testing", "profiling"]
```
2022-10-20 13:44:03 +03:00
Kirill Bulatov
50297bef9f RFC about Tenant / Timeline guard objects (#2660)
Co-authored-by: Heikki Linnakangas <heikki@neon.tech>
2022-10-20 12:49:54 +03:00
Andrés
9211923bef Pageserver Python tests should not fail if the server is built with no testing feature (#2636)
Co-authored-by: andres <andres.rodriguez@outlook.es>
2022-10-20 10:46:57 +03:00
bojanserafimov
7734929a82 Remove stale todos (#2630) 2022-10-19 22:59:22 +00:00
Heikki Linnakangas
bc5ec43056 Fix flaky physical-size tests in test_timeline_size.py.
These two tests, test_timeline_physical_size_post_compaction and
test_timeline_physical_size_post_gc, assumed that after you have
waited for the WAL from a bulk insertion to arrive, and you run a
cycle of checkpoint and compaction, no new layer files are created.
Because if a new layer file is created while we are calculating the
incremental and non-incremental physical sizes, they might differ.

However, the tests used a very small checkpoint_distance, so even a
small amount of WAL generated in PostgreSQL could cause a new layer
file to be created. Autovacuum can kick in at any time, and do that.
That caused occasional failues in the test. I was able to reproduce it
reliably by adding a long delay between the incremental and
non-incremental size calculations:

```
--- a/pageserver/src/http/routes.rs
+++ b/pageserver/src/http/routes.rs
@@ -129,6 +129,9 @@ async fn build_timeline_info(
         }
     };
     let current_physical_size = Some(timeline.get_physical_size());
+    if include_non_incremental_physical_size {
+        std:🧵:sleep(std::time::Duration::from_millis(60000));
+    }

     let info = TimelineInfo {
         tenant_id: timeline.tenant_id,
```

To fix, disable autovacuum for the table. Autovacuum could still kick
in for other tables, e.g. catalog tables, but that seems less likely
to generate enough WAL to causea new layer file to be flushed.

If this continues to be a problem in the future, we could simply retry
the physical size call a few times, if there's a mismatch. A mismatch
could happen every once in a while, but it's very unlikely to happen
more than once or twice in a row.

Fixes https://github.com/neondatabase/neon/issues/2212
2022-10-19 23:50:21 +03:00
MMeent
b237feedab Add more redo metrics: (#2645)
- Measure size of redo WAL (new histogram), with bounds between 24B-32kB
- Add 2 more buckets at the upper end of the redo time histogram
  We often (>0.1% of several hours each day) take more than 250ms to do the
  redo round-trip to the postgres process. We need to measure these redo
  times more precisely.
2022-10-19 22:47:11 +02:00
Alexey Kondratov
4d1e48f3b9 [compute_ctl] Use postgres::config to properly escape database names (#2652)
We've got at least one user in production that cannot create a
database with a trailing space in the name.

This happens because we use `url` crate for manipulating the
DATABASE_URL, but it follows a standard that doesn't fit really
well with Postgres. For example, it trims all trailing spaces
from the path:

  > Remove any leading and trailing C0 control or space from input.
  > See: https://url.spec.whatwg.org/#url-parsing

But we used `set_path()` to set database name and it's totally valid
to have trailing spaces in the database name in Postgres.

Thus, use `postgres::config::Config` to modify database name in the
connection details.
2022-10-19 19:20:06 +02:00
Anastasia Lubennikova
7576b18b14 [compute_tools] fix GRANT CREATE ON SCHEMA public -
run the grant query in each database
2022-10-19 18:37:52 +03:00
Konstantin Knizhnik
6b49b370fc Fix build after applying PR #2558 2022-10-19 13:55:30 +03:00
Konstantin Knizhnik
91411c415a Persists latest_gc_cutoff_lsn before performing GC (#2558)
* Persists latest_gc_cutoff_lsn before performing GC

* Peform some refactoring and code deduplication

refer #2539

* Add test for persisting GC cutoff

* Fix python test style warnings

* Bump postgres version

* Reduce number of iterations in test_gc_cutoff test

* Bump postgres version

* Undo bumping postgres version
2022-10-19 12:32:03 +03:00
Kirill Bulatov
c67cf34040 Update GH Action version (#2646) 2022-10-19 11:16:36 +03:00
bojanserafimov
8fbe437768 Improve pageserver IO metrics (#2629) 2022-10-18 11:53:28 -04:00
Heikki Linnakangas
989d78aac8 Buffer the TCP incoming stream on libpq connections.
Reduces the number of syscalls needed to read the commands from the
compute.

Here's a snippet of strace output from the pageserver, when performing
a sequential scan on a table, with prefetch:

    3084934 recvfrom(47, "d", 1, 0, NULL, NULL) = 1
    3084934 recvfrom(47, "\0\0\0\37", 4, 0, NULL, NULL) = 4
    3084934 recvfrom(47, "\2\1\0\0\0\0\362\302\360\0\0\0\6\177\0\0002\276\0\0@\f\0\0\0\0\3", 27, 0, NULL, NULL) = 27
    3084934 pread64(28, "\0\0\0\1\0\0\0\0\0\0\0\253                    "..., 8192, 25190400) = 8192
    3084934 write(45, "B\0\0\0\25\0\0\0\6\177\0\0002\276\0\0@\f\0\0\0\3A\0\0\32\355\0\0\0\0\1"..., 7010) = 7010
    3084934 poll([{fd=46, events=POLLIN}, {fd=48, events=POLLIN}], 2, 60000) = 1 ([{fd=46, revents=POLLIN}])
    3084934 read(46, "\0\0\0\0p\311q\1\0\0\4\0\f\1\200\1\0 \4 \0\0\0\0\200\237\362\0\0\237\362\0"..., 8192) = 8192
    3084934 sendto(47, "d\0\0 \5f\0\0\0\0p\311q\1\0\0\4\0\f\1\200\1\0 \4 \0\0\0\0\200\237"..., 8198, MSG_NOSIGNAL, NULL, 0) = 8198
    3084934 recvfrom(47, "d", 1, 0, NULL, NULL) = 1
    3084934 recvfrom(47, "\0\0\0\37", 4, 0, NULL, NULL) = 4
    3084934 recvfrom(47, "\2\1\0\0\0\0\362\302\360\0\0\0\6\177\0\0002\276\0\0@\f\0\0\0\0\4", 27, 0, NULL, NULL) = 27
    3084934 pread64(28, "    \0=\0L\0\0\0\1\0\0\0\0\0\0\0\0\0\0\0\0;;\0\0\0\4\4\0"..., 8192, 25198592) = 8192
    3084934 write(45, "B\0\0\0\25\0\0\0\6\177\0\0002\276\0\0@\f\0\0\0\4A\0\0\32\355\0\0\0\0\1"..., 7010) = 7010
    3084934 poll([{fd=46, events=POLLIN}, {fd=48, events=POLLIN}], 2, 60000) = 1 ([{fd=46, revents=POLLIN}])
    3084934 read(46, "\0\0\0\0\260\344q\1\0\0\4\0\f\1\200\1\0 \4 \0\0\0\0\200\237\362\0\0\237\362\0"..., 8192) = 8192
    3084934 sendto(47, "d\0\0 \5f\0\0\0\0\260\344q\1\0\0\4\0\f\1\200\1\0 \4 \0\0\0\0\200\237"..., 8198, MSG_NOSIGNAL, NULL, 0) = 8198
    3084934 recvfrom(47, "d", 1, 0, NULL, NULL) = 1
    3084934 recvfrom(47, "\0\0\0\37", 4, 0, NULL, NULL) = 4
    3084934 recvfrom(47, "\2\1\0\0\0\0\362\302\360\0\0\0\6\177\0\0002\276\0\0@\f\0\0\0\0\5", 27, 0, NULL, NULL) = 27
    3084934 write(45, "B\0\0\0\25\0\0\0\6\177\0\0002\276\0\0@\f\0\0\0\5A\0\0\32\355\0\0\0\0\1"..., 7010) = 7010
    3084934 poll([{fd=46, events=POLLIN}, {fd=48, events=POLLIN}], 2, 60000) = 1 ([{fd=46, revents=POLLIN}])
    3084934 read(46, "\0\0\0\0\330\377q\1\0\0\4\0\f\1\200\1\0 \4 \0\0\0\0\200\237\362\0\0\237\362\0"..., 8192) = 8192
    3084934 sendto(47, "d\0\0 \5f\0\0\0\0\330\377q\1\0\0\4\0\f\1\200\1\0 \4 \0\0\0\0\200\237"..., 8198, MSG_NOSIGNAL, NULL, 0) = 8198

This shows the interaction for three get_page_at_lsn requests. For
each request, the pageserver performs three recvfrom syscalls to read
the incoming request from the socket. After this patch, those recvfrom
calls are gone:

    3086123 read(47, "\0\0\0\0\360\222q\1\0\0\4\0\f\1\200\1\0 \4 \0\0\0\0\200\237\362\0\0\237\362\0"..., 8192) = 8192
    3086123 sendto(45, "d\0\0 \5f\0\0\0\0\360\222q\1\0\0\4\0\f\1\200\1\0 \4 \0\0\0\0\200\237"..., 8198, MSG_NOSIGNAL, NULL, 0) = 8198
    3086123 pread64(29, "                                "..., 8192, 25182208) = 8192
    3086123 write(46, "B\0\0\0\25\0\0\0\6\177\0\0002\276\0\0@\f\0\0\0\2A\0\0\32\355\0\0\0\0\1"..., 7010) = 7010
    3086123 poll([{fd=47, events=POLLIN}, {fd=49, events=POLLIN}], 2, 60000) = 1 ([{fd=47, revents=POLLIN}])
    3086123 read(47, "\0\0\0\0000\256q\1\0\0\4\0\f\1\200\1\0 \4 \0\0\0\0\200\237\362\0\0\237\362\0"..., 8192) = 8192
    3086123 sendto(45, "d\0\0 \5f\0\0\0\0000\256q\1\0\0\4\0\f\1\200\1\0 \4 \0\0\0\0\200\237"..., 8198, MSG_NOSIGNAL, NULL, 0) = 8198
    3086123 pread64(29, "\0\0\0\1\0\0\0\0\0\0\0\253                    "..., 8192, 25190400) = 8192
    3086123 write(46, "B\0\0\0\25\0\0\0\6\177\0\0002\276\0\0@\f\0\0\0\3A\0\0\32\355\0\0\0\0\1"..., 7010) = 7010
    3086123 poll([{fd=47, events=POLLIN}, {fd=49, events=POLLIN}], 2, 60000) = 1 ([{fd=47, revents=POLLIN}])
    3086123 read(47, "\0\0\0\0p\311q\1\0\0\4\0\f\1\200\1\0 \4 \0\0\0\0\200\237\362\0\0\237\362\0"..., 8192) = 8192
    3086123 sendto(45, "d\0\0 \5f\0\0\0\0p\311q\1\0\0\4\0\f\1\200\1\0 \4 \0\0\0\0\200\237"..., 8198, MSG_NOSIGNAL, NULL, 0) = 8198
    3086123 pread64(29, "    \0=\0L\0\0\0\1\0\0\0\0\0\0\0\0\0\0\0\0;;\0\0\0\4\4\0"..., 8192, 25198592) = 8192
    3086123 write(46, "B\0\0\0\25\0\0\0\6\177\0\0002\276\0\0@\f\0\0\0\4A\0\0\32\355\0\0\0\0\1"..., 7010) = 7010
    3086123 poll([{fd=47, events=POLLIN}, {fd=49, events=POLLIN}], 2, 60000) = 1 ([{fd=47, revents=POLLIN}])

In this test, the compute sends a batch of prefetch requests, and they
are read from the socket in one syscall. That syscall was not captured
by the strace snippet above, but there are much fewer of them than
before.
2022-10-18 18:46:07 +03:00
Stas Kelvich
7ca72578f9 Enable plv8 again
Now with quickfix for https://github.com/plv8/plv8/issues/503
2022-10-18 18:34:27 +03:00
Heikki Linnakangas
41550ec8bf Remove unnecessary indirections of libpqwalproposer functions
In the Postgres backend, we cannot link directly with libpq (check the
pgsql-hackers arhive for all kinds of fun that ensued when we tried to
do that). Therefore, the libpq functions are used through the thin
wrapper functions in libpqwalreceiver.so, and libpqwalreceiver.so is
loaded dynamically. To hide the dynamic loading and make the calls
look like regular functions, we use macros to hide the function
pointers.

We had inherited the same indirections in libpqwalproposer, but it's
not needed since the neon extension is already a shared library that's
loaded dynamically. There's no problem calling the functions directly
there. Remove the indirections.
2022-10-18 18:25:30 +03:00
Sergey Melnikov
0cd2d91b9d Fix deploy-new job by installing sivel.toiletwater (#2641) 2022-10-18 14:44:19 +00:00
Sergey Melnikov
546e9bdbec Deploy storage into new account and migrate to management API v2 (#2619)
Deploy storage into new account
Migrate safekeeper and pageserver initialisation to management api v2
2022-10-18 15:52:15 +03:00
Heikki Linnakangas
59bc7e67e0 Use an optimized version of amplify_num.
Speeds up layer_map::search somewhat. I also opened a PR in the upstream
rust-amplify repository with these changes,
see https://github.com/rust-amplify/rust-amplify/pull/148. We can switch
back to upstream version when that's merged.
2022-10-18 15:00:10 +03:00
Heikki Linnakangas
2418e72649 Speed up layer_map::search, by remembering the "envelope" for each layer.
Lookups in the R-tree call the "envelope" function for every comparison,
and our envelope function isn't very cheap, so that overhead adds up.
Create the envelope once, when the layer is inserted into the tree, and
store it along with the layer. That uses some more memory per layer, but
that's not very significant.

Speeds up the search operation 2x
2022-10-18 15:00:10 +03:00
Heikki Linnakangas
80746b1c7a Add micro-benchmark for layer map search function
The test data was extracted from our pgbench benchmark project on the
captest environment, the one we use for the 'neon-captest-reuse' test.
2022-10-18 15:00:10 +03:00
Dmitry Rodionov
129f7c82b7 remove redundant expect_tenant_to_download_timeline 2022-10-18 11:21:48 +03:00
Anastasia Lubennikova
0ec5ddea0b GRANT CREATE ON SCHEMA public TO web_access 2022-10-17 22:42:51 +03:00
Kirill Bulatov
c4ee62d427 Bump clap and other minor dependencies (#2623) 2022-10-17 12:58:40 +03:00
Joonas Koivunen
c709354579 Add layer sizes to index_part.json (#2582)
This is the first step in verifying layer files. Next up on the road is
hashing the files and verifying the hashes.

The metadata additions do not require any migration. The idea is that
the change is backward and forward-compatible with regard to
`index_part.json` due to the softness of JSON schema and the
deserialization options in use.

New types added:

- LayerFileMetadata for tracking the file metadata
    - starting with only the file size
    - in future hopefully a sha256 as well
- IndexLayerMetadata, the serialized counterpart of LayerFileMetadata

LayerFileMetadata needing to have all fields Option is a problem but
that is not possible to handle without conflicting a lot more with other
ongoing work.

Co-authored-by: Kirill Bulatov <kirill@neon.tech>
2022-10-17 12:21:04 +03:00
Lassi Pölönen
5d6553d41d Fix pageserver configuration generation bug (#2584)
* We had an issue with `lineinfile` usage for pageserver configuration
file: if the S3 bucket related values were changed, it would have
resulted in duplicate keys, resulting in invalid toml.

So to fix the issue, we should keep the configuration in structured
format (yaml in this case) so we can always generate syntactically
correct toml.

Inventories are converted to yaml just so that it's easier to maintain
the configuration there. Another alternative would have been a separate
variable files.

* Keep the ansible collections dir, but locally installed collections
should not be tracked.
2022-10-16 11:37:10 +00:00
Kirill Bulatov
f03b7c3458 Bump regular dependencies (#2618)
* etcd-client is not updated, since we plan to replace it with another client and the new version errors with some missing prost library error
* clap had released another major update that requires changing every CLI declaration again, deserves a separate PR
2022-10-15 01:55:31 +03:00
Heikki Linnakangas
9c24de254f Add description and license fields to OpenAPI spec.
These were added earlier to the control plane's copy of this file.
This is the master version of this file, so let's keep it in sync.
2022-10-14 18:37:58 +03:00
Heikki Linnakangas
538876650a Merge 'local' and 'remote' parts of TimelineInfo into one struct.
The 'local' part was always filled in, so that was easy to merge into
into the TimelineInfo itself. 'remote' only contained two fields,
'remote_consistent_lsn' and 'awaits_download'. I made
'remote_consistent_lsn' an optional field, and 'awaits_download' is now
false if the timeline is not present remotely.

However, I kept stub versions of the 'local' and 'remote' structs for
backwards-compatibility, with a few fields that are actively used by
the control plane. They just duplicate the fields from TimelineInfo
now. They can be removed later, once the control plane has been
updated to use the new fields.
2022-10-14 18:37:14 +03:00
Heikki Linnakangas
500239176c Make TimelineInfo.local field mandatory.
It was only None when you queried the status of a timeline with
'timeline_detail' mgmt API call, and it was still being downloaded. You
can check for that status with the 'tenant_status' API call instead,
checking for has_in_progress_downloads field.

Anothere case was if an error happened while trying to get the current
logical size, in a 'timeline_detail' request. It might make sense to
tolerate such errors, and leave the fields we cannot fill in as empty,
None, 0 or similar, but it doesn't make sense to me to leave the whole
'local' struct empty in tht case.
2022-10-14 18:37:14 +03:00
Anastasia Lubennikova
ee64a6b80b Fix CI: push versioned compute images to production ECR 2022-10-14 18:12:50 +03:00
Anastasia Lubennikova
a13b486943 Bump vendor/postgres-v15. Rebase to 15.0 2022-10-14 18:12:50 +03:00
Arseny Sher
9fe4548e13 Reimplement explicit timeline creation on safekeepers.
With the ability to pass commit_lsn. This allows to perform project WAL recovery
through different (from the original) set of safekeepers (or under different
ttid) by
1) moving WAL files to s3 under proper ttid;
2) explicitly creating timeline on safekeepers, setting commit_lsn to the
latest point;
3) putting the lastest .parital file to the timeline directory on safekeepers, if
desired.

Extend test_s3_wal_replay to exersise this behaviour.

Also extends timeline_status endpoint to return postgres information.
2022-10-13 21:43:10 +04:00
Heikki Linnakangas
14c623b254 Make it possible to build with old cargo version.
I'm using the Rust compiler and cargo versions from Debian packages,
but the latest available cargo Debian package is quite old, version
1.57.  The 'named-profiles' features was not stabilized at that
version yet, so ever since commit a463749f5, I've had to manually add
this line to the Cargo.toml file to compile. I've been wishing that
someone would update the cargo Debian package, but it doesn't seem to
be happening any time soon.

This doesn't seem to bother anyone else but me, but it shouldn't hurt
anyone else either. If there was a good reason, I could install a
newer cargo version with 'rustup', but if all we need is this one line
in Cargo.toml, I'd prefer to continue using the Debian packages.
2022-10-13 15:17:00 +03:00
Alexander Bayandin
ebf54b0de0 Nightly Benchmarks: Add 50 GB projects (#2612) 2022-10-13 10:00:29 +01:00
Andrés
09dda35dac Return broken tenants due to non existing timelines dir (#2552) (#2575)
Co-authored-by: andres <andres.rodriguez@outlook.es>
2022-10-12 22:28:39 +03:00
Dmitry Ivanov
6ace79345d [proxy] Add more context to console requests logging (#2583) 2022-10-12 21:00:44 +03:00
danieltprice
771e61425e Update release-pr.md (#2600)
Update the Release Notes PR example that is referenced from the checklist. The Release Notes file structure changed recently.
2022-10-12 08:38:28 -03:00
Alexander Bayandin
93775f6ca7 GitHub Actions: replace deprecated set-output with GITHUB_OUTPUT (#2608) 2022-10-12 10:22:24 +01:00
Arseny Sher
6d0dacc4ce Recreate timeline on pageserver in s3_wal_replay test.
That's closer to real usage than switching to brand new pageserver.
2022-10-12 11:46:21 +04:00
Heikki Linnakangas
e5e40a31f4 Clean up terms "delete timeline" and "detach tenant".
You cannot attach/detach an individual timeline, attach/detach always
applies to the whole tenant. However, you can *delete* a single timeline
from a tenant. Fix some comments and error messages that confused these
two operations.
2022-10-11 17:47:41 +03:00
Heikki Linnakangas
676c63c329 Improve comments. 2022-10-11 17:47:41 +03:00
Heikki Linnakangas
47366522a8 Make the return type of 'list_timelines' simpler.
It's enough to return just the Timeline references. You can get the
timeline's ID easily from Timeline.
2022-10-11 17:47:41 +03:00
Heikki Linnakangas
db26bc49cc Remove obsolete FIXME comment.
Commit c634cb1d36 removed the trait and changed the function to return
a &TimelineWriter, as the FIXME said we should do, but forgot to remove
the FIXME.
2022-10-11 17:47:41 +03:00
Lassi Pölönen
e520293090 Add build info metric to pageserver, safekeeper and proxy (#2596)
* Test that we emit build info metric for pageserver, safekeeper and proxy with some non-zero length revision label

* Emit libmetrics_build_info on startup of pageserver, safekeeper and
proxy with label "revision" which tells the git revision.
2022-10-11 09:54:32 +03:00
Sergey Melnikov
241e549757 Switch neon-stress etcd to dedicatd instance (#2602) 2022-10-10 22:07:19 +00:00
Sergey Melnikov
34bea270f0 Fix POSTGRES_DISTRIB_DIR for benchmarks on ec2 runner (#2594) 2022-10-10 09:12:50 +00:00
Kirill Bulatov
13f0e7a5b4 Deploy pageserver_binutils to the envs 2022-10-09 08:21:11 +03:00
Kirill Bulatov
3e35f10adc Add a script to reformat the project 2022-10-09 08:21:11 +03:00
Kirill Bulatov
3be3bb7730 Be more verbose with initdb for pageserver timeline creation 2022-10-09 08:21:11 +03:00
Kirill Bulatov
01d2c52c82 Tidy up feature reporting 2022-10-09 08:21:11 +03:00
Kirill Bulatov
9f79e7edea Merge pageserver helper binaries and provide it for deployment (#2590) 2022-10-08 12:42:17 +00:00
Heikki Linnakangas
a22165d41e Add tests for comparing root and child branch performance.
Author: Thang Pham <thang@neon.tech>
2022-10-08 10:07:33 +03:00
Arseny Sher
725be60bb7 Storage messaging rfc 2. 2022-10-07 21:22:17 +04:00
Dmitry Ivanov
e516c376d6 [proxy] Improve logging (#2554)
* [proxy] Use `tracing::*` instead of `println!` for logging

* Fix a minor misnomer

* Log more stuff
2022-10-07 14:34:57 +03:00
Kirill Bulatov
8e51c27e1a Restore artifact versions (#2578)
Context: https://github.com/neondatabase/neon/pull/2128/files#r989489965

Co-authored-by: Rory de Zoete <rory@neon.tech>
2022-10-07 10:58:31 +00:00
Heikki Linnakangas
9e1eb69d55 Increase default compaction_period setting to 20 s.
The previous default of 1 s caused excessive CPU usage when there were
a lot of projects. Polling every timeline once a second was too aggressive
so let's reduce it.

Fixes https://github.com/neondatabase/neon/issues/2542, but we
probably also want do to something so that we don't poll timelines
that have received no new WAL or layers since last check.
2022-10-07 13:55:19 +03:00
Arthur Petukhovsky
687ba81366 Display sync safekeepers output in compute_ctl (#2571)
Pipe postgres output to compute_ctl stdout and create a test to check that compute_ctl works and prints postgres logs.
2022-10-06 13:53:52 +00:00
Andrés
47bae68a2e Make get_lsn_by_timestamp available in mgmt API (#2536) (#2560)
Co-authored-by: andres <andres.rodriguez@outlook.es>
2022-10-06 12:42:50 +03:00
Joonas Koivunen
e8b195acb7 fix: apply notify workaround on m1 mac docker (#2564)
workaround as discussed in the notify repository.
2022-10-06 11:13:40 +03:00
Anastasia Lubennikova
254cb7dc4f Update CI script to push compute-node-v15 to dockerhub 2022-10-06 10:50:08 +03:00
Anastasia Lubennikova
ed85d97f17 bump vendor/postgres-v15. Rebase it to Stamp 15rc2 2022-10-06 10:50:08 +03:00
Anastasia Lubennikova
4a216c5f7f Use PostGIS 3.3.1 that is compatible with pg 15 2022-10-06 10:50:08 +03:00
Anastasia Lubennikova
c5a428a61a Update Dockerfile.compute-node-v15 to match v14 version.
Fix build script to promote the image for v15 to neon dockerhub
2022-10-06 10:50:08 +03:00
Konstantin Knizhnik
ff8c481777 Normalize last_record LSN in wal receiver (#2529)
* Add test for branching on page boundary

* Normalize start recovery point

Co-authored-by: Heikki Linnakangas <heikki@neon.tech>

Co-authored-by:  Thang Pham <thang@neon.tech>
2022-10-06 09:01:56 +03:00
Arthur Petukhovsky
f25dd75be9 Fix deadlock in safekeeper metrics (#2566)
We had a problem where almost all of the threads were waiting on a futex syscall. More specifically:
- `/metrics` handler was inside `TimelineCollector::collect()`, waiting on a mutex for a single Timeline
- This exact timeline was inside `control_file::FileStorage::persist()`, waiting on a mutex for Lazy initialization of `PERSIST_CONTROL_FILE_SECONDS`
- `PERSIST_CONTROL_FILE_SECONDS: Lazy<Histogram>` was blocked on `prometheus::register`
- `prometheus::register` calls `DEFAULT_REGISTRY.write().register()` to take a write lock on Registry and add a new metric
- `DEFAULT_REGISTRY` lock was already taken inside `DEFAULT_REGISTRY.gather()`, which was called by `/metrics` handler to collect all metrics

This commit creates another Registry with a separate lock, to avoid deadlock in a case where `TimelineCollector` triggers registration of new metrics inside default registry.
2022-10-06 01:07:02 +03:00
Sergey Melnikov
b99bed510d Move proxies to neon-proxy namespace (#2555) 2022-10-05 16:14:09 +03:00
sharnoff
580584c8fc Remove control_plane deps on pageserver/safekeeper (#2513)
Creates new `pageserver_api` and `safekeeper_api` crates to serve as the
shared dependencies. Should reduce both recompile times and cold compile
times.

Decreases the size of the optimized `neon_local` binary: 380M -> 179M.
No significant changes for anything else (mostly as expected).
2022-10-04 11:14:45 -07:00
Kirill Bulatov
d823e84ed5 Allow attaching tenants with zero timelines 2022-10-04 18:13:51 +03:00
Kirill Bulatov
231dfbaed6 Do not remove empty timelines/ directory for tenants 2022-10-04 18:13:51 +03:00
Dmitry Rodionov
5cf53786f9 Improve pytest ergonomics
1. Disable perf tests by default
2. Add instruction to run tests in parallel
2022-10-04 14:53:01 +03:00
Heikki Linnakangas
9b9bbad462 Use 'notify' crate to wait for PostgreSQL startup.
Compute node startup time is very important. After launching
PostgreSQL, use 'notify' to be notified immediately when it has
updated the PID file, instead of polling. The polling loop had 100 ms
interval so this shaves up to 100 ms from the startup time.
2022-10-04 13:00:15 +03:00
Heikki Linnakangas
537b2c1ae6 Remove unnecessary check for open PostgreSQL TCP port.
The loop checked if the TCP port is open for connections, by trying to
connect to it. That seems unnecessary. By the time the postmaster.pid
file says that it's ready, the port should be open. Remove that check.
2022-10-04 12:09:13 +03:00
Joonas Koivunen
31123d1fa8 Silence clippies, minor doc fix (#2543)
* doc: remove stray backtick

* chore: clippy::let_unit_value

* chore: silence useless_transmute, duplicate_mod

* chore: remove allowing deref_nullptr

not needed since bindgen 0.60.0.

* chore: remove repeated allowed lints

they are already allowed from the crate root.
2022-10-03 17:44:17 +03:00
Kirill Bulatov
4f2ac51bdd Bump rustc to 1.61 2022-10-03 16:36:03 +03:00
Kirill Bulatov
7b2f9dc908 Reuse existing tenants during attach (#2540) 2022-10-03 13:33:55 +03:00
Arthur Petukhovsky
dabb6d2675 Fix log level for sk startup logs (#2526) 2022-09-27 10:36:17 +00:00
Arthur Petukhovsky
fc7087b16f Add metric for loaded safekeeper timelines (#2509) 2022-09-27 11:57:59 +03:00
Vadim Kharitonov
2233ca2a39 seqwait.rs unit tests don't check return value 2022-09-27 11:47:59 +03:00
Dmitry Rodionov
fb68d01449 Preserve task result in TaskHandle by keeping join handle around (#2521)
* Preserve task result in TaskHandle by keeping join handle around

The solution is not great, but it should hep to debug staging issue
I tried to do it in a least destructive way. TaskHandle used only in
one place so it is ok to use something less generic unless we want
to extend its usage across the codebase. In its current current form
for its single usage place it looks too abstract

Some problems around this code:
1. Task can drop event sender and continue running
2. Task cannot be joined several times (probably not needed,
    but still, can be surprising)
3. Had to split task event into two types because ahyhow::Error
    does not implement clone. So TaskContinueEvent derives clone
    but usual task evend does not. Clone requirement appears
    because we clone the current value in next_task_event.
    Taking it by reference is complicated.
4. Split between Init and Started is artificial and comes from
    watch::channel requirement to have some initial value.

    To summarize from 3 and 4. It may be a better idea to use
    RWLock or a bounded channel instead
2022-09-26 20:57:02 +00:00
Arthur Petukhovsky
d15116f2cc Update pg_version for old timelines 2022-09-26 19:57:03 +03:00
Stas Kelvich
df45c0d0e5 Disable plv8 again 2022-09-26 12:22:46 +03:00
Stas Kelvich
367cc01290 Fix deploy paths 2022-09-26 10:08:01 +03:00
Anastasia Lubennikova
1165686201 fix deploy lib paths for postgres 2022-09-23 22:23:43 +03:00
Anastasia Lubennikova
093264a695 Fix deploy bin and lib paths for postgres 2022-09-23 22:23:43 +03:00
sharnoff
805bb198c2 Miscellaneous small fixups (#2503)
Changes are:

 * Correct typo "firts" -> "first"
 * Change <empty panic with comment explaining> to <panic with message
   taken from the comment>
 * Fix weird indentation that rustfmt was failing to handle
 * Use existing `anyhow::{anyhow,bail}!` as `{anyhow,bail}!` if it's
   already in scope
 * Spell `Result<T, anyhow::Error>` as `anyhow::Result<T>`
   * In general, closer to matching the rest of the codebase
 * Change usages of `hash_map::Entry` to `Entry` when it's already in
   scope
   * A quick search shows our style on this one varies across the files
     it's used in
2022-09-23 11:49:28 -07:00
Stas Kelvich
5ccd54c699 Add support for h3-pg and re-enable plv8 2022-09-23 19:06:57 +03:00
Dmitry Ivanov
1dffba9de6 Write more tests for the proxy... (#1918)
And change a few more things in the process.
2022-09-23 18:30:44 +03:00
Alexander Bayandin
ebab89ebd2 test_runner: pass password to pgbench via PGPASSWORD (#2468) 2022-09-23 12:51:33 +00:00
MMeent
bc3ba23e0a Fix extreme metrics bloat in storage sync (#2506)
* Fix extreme metrics bloat in storage sync

From 78 metrics per (timeline, tenant) pair down to (max) 10 metrics per
(timeline, tenant) pair, plus another 117 metrics in a global histogram that
replaces the previous per-timeline histogram.

* Drop image sync operation metric series when dropping TimelineMetrics.
2022-09-23 14:35:36 +02:00
Alexander Bayandin
3e65209a06 Nightly Benchmarks: use Postgres binaries from artifacts (#2501) 2022-09-23 12:50:36 +01:00
Dmitry Rodionov
eb0c6bcf1a reenable storage deployments 2022-09-23 14:12:39 +03:00
Rory de Zoete
52819898e4 Extend image push step with production ECR (#2465)
* Extend image push step with production ECR

* Put copy step before auth change

* Use correct name

* Only push on main

* Fix typo
2022-09-23 11:25:29 +02:00
Sergey Melnikov
b0377f750a Add staging-test region to normal staging rollouts (#2500) 2022-09-23 11:25:26 +03:00
Dmitry Rodionov
43560506c0 remove duplicate walreceiver connection span 2022-09-23 00:27:24 +03:00
Anastasia Lubennikova
c81ede8644 Hotfix for safekeeper timelines with unknown pg_version.
Assume DEFAULT_PG_VERSION = 14
2022-09-22 21:17:36 +03:00
Anastasia Lubennikova
eb9200abc8 Use version-specific path in pytest CI script 2022-09-22 18:12:41 +03:00
Anastasia Lubennikova
7c1695e87d fix psql path in export_import_between_pageservers script 2022-09-22 18:12:41 +03:00
Anastasia Lubennikova
8b42c184e7 Update LD_LIBRARY_PATH in deploy scripts 2022-09-22 18:12:41 +03:00
Anastasia Lubennikova
7138db9279 Fix paths to postgres binaries in the deploy script 2022-09-22 18:12:41 +03:00
Egor Suvorov
262fa3be09 pageserver pg proto: add missing auth checks (#2494)
Fixes #1858
2022-09-22 17:07:08 +03:00
Anastasia Lubennikova
5e151192f5 Fix rebase conflicts in safekeeper code 2022-09-22 14:15:13 +03:00
Anastasia Lubennikova
2d012f0d32 Fix rebase conflicts in pageserver code 2022-09-22 14:15:13 +03:00
Anastasia Lubennikova
64f64d5637 Fix after rebase: bump vendor/postgres-v14 to match main 2022-09-22 14:15:13 +03:00
Anastasia Lubennikova
1fa7d6aebf Use DEFAULT_PG_VERSION env in CI pytest 2022-09-22 14:15:13 +03:00
Anastasia Lubennikova
d098542dde Make test_timeline_size_metrics more stable:
Compare size with Vanilla postgres size instead of hardcoded value
2022-09-22 14:15:13 +03:00
Anastasia Lubennikova
eba419fda3 Clean up the pg_version choice code 2022-09-22 14:15:13 +03:00
Anastasia Lubennikova
d8d3cd49f4 Update libs/postgres_ffi/wal_craft/src/bin/wal_craft.rs
Co-authored-by: MMeent <matthias@neon.tech>
2022-09-22 14:15:13 +03:00
Anastasia Lubennikova
3618c242b9 use version specific find_end_of_wal function 2022-09-22 14:15:13 +03:00
Anastasia Lubennikova
ed6b75e301 show pg_version in create_timeline info span 2022-09-22 14:15:13 +03:00
Anastasia Lubennikova
862902f9e5 Update readme and openapi spec 2022-09-22 14:15:13 +03:00
Anastasia Lubennikova
8d890b3cbb fix clippy warnings 2022-09-22 14:15:13 +03:00
Anastasia Lubennikova
0fde59aa46 use pg_version in python tests 2022-09-22 14:15:13 +03:00
Anastasia Lubennikova
1255ef806f pass version to wal_craft.rs 2022-09-22 14:15:13 +03:00
Anastasia Lubennikova
5dddeb8d88 Use non-versioned pg_distrib dir 2022-09-22 14:15:13 +03:00
Anastasia Lubennikova
d45de3d58f update build scripts to match pg_distrib_dir versioning schema 2022-09-22 14:15:13 +03:00
Anastasia Lubennikova
a69e060f0f fix clippy warning 2022-09-22 14:15:13 +03:00
Anastasia Lubennikova
a4397d43e9 Rename waldecoder -> waldecoder_handler.rs. Add comments 2022-09-22 14:15:13 +03:00
Anastasia Lubennikova
03c606f7c5 Pass pg_version parameter to timeline import command.
Add pg_version field to LocalTimelineInfo.
Use pg_version in the export_import_between_pageservers script
2022-09-22 14:15:13 +03:00
Anastasia Lubennikova
9dfede8146 Handle backwards-compatibility of TimelineMetadata.
This commit bumps TimelineMetadata format version and makes it independent from STORAGE_FORMAT_VERSION.
2022-09-22 14:15:13 +03:00
Anastasia Lubennikova
86bf491981 Support pg 15
- Split postgres_ffi into two version specific files.
- Preserve pg_version in timeline metadata.
- Use pg_version in safekeeper code. Check for postgres major version mismatch.
- Clean up the code to use DEFAULT_PG_VERSION constant everywhere, instead of hardcoding.

-  Parameterize python tests: use DEFAULT_PG_VERSION env and pg_version fixture.
   To run tests using a specific PostgreSQL version, pass the DEFAULT_PG_VERSION environment variable:
   'DEFAULT_PG_VERSION='15' ./scripts/pytest test_runner/regress'
 Currently don't all tests pass, because rust code relies on the default version of PostgreSQL in a few places.
2022-09-22 14:15:13 +03:00
Dmitry Rodionov
e764c1e60f remove self argument from several spans 2022-09-22 11:41:04 +03:00
Konstantin Knizhnik
f3073a4db9 R-Tree layer map (#2317)
Replace the layer array and linear search with R-tree

So far, the in-memory layer map that holds information about layer
files that exist, has used a simple Vec, in no particular order, to
hold information about all the layers. That obviously doesn't scale
very well; with thousands of layer files the linear search was
consuming a lot of CPU. Replace it with a two-dimensional R-tree, with
Key and LSN ranges as the dimensions.

For the R-tree, use the 'rstar' crate. To be able to use that, we
convert the Keys and LSNs into 256-bit integers. 64 bits would be
enough to represent LSNs, and 128 bits would be enough to represent
Keys. However, we use 256 bits, because rstar internally performs
multiplication to calculate the area of rectangles, and the result of
multiplying two 128 bit integers doesn't necessarily fit in 128 bits,
causing integer overflow and, if overflow-checks are enabled,
panic. To avoid that, we use 256 bit integers.

Add a performance test that creates a lot of layer files, to
demonstrate the benefit.
2022-09-22 08:35:06 +03:00
Dmitry Ivanov
e9a103c09f [proxy] Pass extra parameters to the console (#2467)
With this change we now pass additional params
to the console's auth methods.
2022-09-21 21:42:47 +03:00
Arthur Petukhovsky
7eebb45ea6 Reduce metrics footprint in safekeeper (#2491)
Fixes bugs with metrics in control_file and wal_storage, where we haven't deleted metrics for inactive timelines.
2022-09-21 19:13:30 +03:00
Alexander Bayandin
19fa410ff8 NeonCompare: switch to new pageserver HTTP API 2022-09-21 15:17:55 +01:00
Heikki Linnakangas
b82e2e3f18 Bump postgres submodules and update docs/core_changes.md.
The old change to downgrade a WARNING in postgres vacuumlazy.c was
reverted.
2022-09-21 17:14:29 +03:00
Kirill Bulatov
71c92e0db1 Use prebuilt image with Hakari for CI style checks (#2488) 2022-09-21 10:13:11 +00:00
sharnoff
6f949e1556 Improve pageserver/safekeepeer HTTP API errors (#2461)
Part of the general work on improving pageserver logs.

Brief summary of changes:

* Remove `ApiError::from_err`
* Remove `impl From<anyhow::Error> for ApiError`
* Convert `ApiError::{BadRequest, NotFound}` to use `anyhow::Error`
  * Note: `NotFound` has more verbose formatting because it's more
    likely to have useful information for the receiving "user"
* Explicitly convert from `tokio::task::JoinError`s into
  `InternalServerError`s where appropriate

Also note: many of the places where errors were implicitly converted to
500s have now been updated to return a more appropriate error. Some
places where it's not yet possible to distinguish the error types have
been left as 500s.
2022-09-20 17:02:10 -07:00
Kirill Bulatov
8d7024a8c2 Move path manipulation function to utils 2022-09-20 23:43:52 +03:00
Kirill Bulatov
6b8dcad1bb Unify timeline creation steps 2022-09-20 23:43:52 +03:00
Kirill Bulatov
310c507303 Merge path retrieval methods in config.rs 2022-09-20 23:43:52 +03:00
Kirill Bulatov
6fc719db13 Merge timelines.rs with tenant.rs 2022-09-20 23:43:52 +03:00
sharnoff
4a3b3ff11d Move testing pageserver libpq cmds to HTTP api (#2429)
Closes #2422.

The APIs have been feature gated with the `testing_api!` macro so that
they return 400s when support hasn't been compiled in.
2022-09-20 11:28:12 -07:00
sharnoff
4b25b9652a Rename more zid-like idents (#2480)
Follow-up to PR #2433 (b8eb908a). There's still a few more unresolved
locations that have been left as-is for the same compatibility reasons
in the original PR.
2022-09-20 11:06:31 -07:00
Heikki Linnakangas
a5019bf771 Use a simpler way to set extra options for benchmark test.
Commit 43a4f7173e fixed the case that there are extra options in the
connection string, but broke it in the case when there are not. Fix
that. But on second thoughts, it's more straightforward set the
options with ALTER DATABASE, so change the workflow yaml file to do
that instead.
2022-09-20 13:48:50 +03:00
Kirill Bulatov
7863c4a702 Regenerate Hakari files, add a CI check for that 2022-09-20 11:39:10 +03:00
Arthur Petukhovsky
566e816298 Refactor safekeeper timelines handling (#2329)
See https://github.com/neondatabase/neon/pull/2329 for details
2022-09-20 07:42:39 +00:00
Heikki Linnakangas
e4f775436f Don't override other options than statement_timeout in test conn string.
In commit 6985f6cd6c, I tried passing extra GUCs in the 'options' part
of the connection string, but it didn't work because the pgbench test
overrode it with the statement_timeout. Change it so that it adds the
statement_timeout to any other options, instead of replacing them.
2022-09-20 09:46:15 +03:00
Alexander Bayandin
bb3c66d86f github/workflows: Make publishing perf reports more configurable (#2440) 2022-09-19 22:28:51 +00:00
Heikki Linnakangas
6985f6cd6c Add a new benchmark data series for prefetching.
Also run benchmarks with the seqscan prefetching (commit f44afbaf62)
enabled.

Renames the 'neon-captest' test to 'neon-captest-reuse', for clarity
2022-09-19 20:56:11 +03:00
Dmitry Rodionov
fcb4a61a12 Adjust spans around gc and compaction
So compaction and gc loops have their own span to always show tenant id
in log messages.
2022-09-19 20:08:38 +03:00
Dmitry Rodionov
4b5e7f2f82 Temporarily disable storage deployments
Do not update configs
Do not restart servieces
Still update binaries
2022-09-19 17:03:20 +03:00
Anastasia Lubennikova
d11cb4b2f1 Bump vendor/postgres-v15 to the latest state of REL_15_STABLE_neon branch 2022-09-19 15:12:05 +03:00
Sergey Melnikov
90ed12630e Add zenith-us-stage-ps-4 and undo changes in prefix_in_bucket in pageserver config (#2473)
* Add zenith-us-stage-ps-4

* Undo changes in prefix_in_bucket in pageserver config (Rollback #2449)
2022-09-19 12:57:44 +02:00
Konstantin Knizhnik
846d126579 Set last written lsn for created relation (#2398)
* Set last written lsn for created relation

* use current LSN for updating last written LSN of relation metadata

* Update LSN for the extended blocks even for pges without LSN (zeroed)

* Update pgxn/neon/pagestore_smgr.c

Co-authored-by: Heikki Linnakangas <heikki@neon.tech>

Co-authored-by: Heikki Linnakangas <heikki@neon.tech>
2022-09-19 12:56:08 +03:00
Egor Suvorov
c9c3c77c31 Fix Docker image builds (follow-up for #2458) (#2469)
Put ninstall.sh inside Docker images for building
2022-09-16 20:51:35 +03:00
Kirill Bulatov
b46c8b4ae0 Add an alias to build test images simply 2022-09-16 18:58:41 +03:00
Egor Suvorov
65a5010e25 Use custom install command in Makefile to speed up incremental builds (#2458)
Fixes #1873: previously any run of `make` caused the `postgres-v15-headers`
target to build. It copied a bunch of headers via `install -C`. Unfortunately,
some origins were symlinks in the `./pg_install/build` directory pointing
inside `./vendor/postgres-v15` (e.g. `pg_config_os.h` pointing to `linux.h`).

GNU coreutils' `install` ignores the `-C` key for non-regular files and
always overwrites the destination if the origin is a symlink. That in turn
made Cargo rebuild the `postgres_ffi` crate and all its dependencies because
it thinks that Postgres headers changed, even if they did not. That was slow.

Now we use a custom script that wraps the `install` program. It handles one
specific case and makes sure individual headers are never copied if their
content did not change. Hence, `postgres_ffi` is not rebuilt unless there were
some changes to the C code.

One may still have slow incremental single-threaded builds because Postgres
Makefiles spawn about 2800 sub-makes even if no files have been changed.
A no-op build takes "only" 3-4 seconds on my machine now when run with `-j30`,
and 20 seconds when run with `-j1`.
2022-09-16 15:44:02 +00:00
sharnoff
9c35a09452 Improve build errors when postgres_ffi fails (#2460)
This commit does two things of note:

 1. Bumps the bindgen dependency from `0.59.1` to `0.60.1`. This gets us
    an actual error type from bindgen, so we can display what's wrong.
 2. Adds `anyhow` as a build dependency, so our error message can be
    prettier. It's already used heavily elsewhere in the crates in this
    repo, so I figured the fact it's a build dependency doesn't matter
    much.

I ran into this from running `cargo <cmd>` without running `make` first.
Here's a comparison of the compiler output in those two cases.

Before this commit:

```
error: failed to run custom build command for `postgres_ffi v0.1.0 ($repo_path/libs/postgres_ffi)`

Caused by:
  process didn't exit successfully: `$repo_path/target/debug/build/postgres_ffi-2f7253b3ad3ca840/build-script-build` (exit status: 101)
  --- stdout
  cargo:rerun-if-changed=bindgen_deps.h

  --- stderr
  bindgen_deps.h:7:10: fatal error: 'c.h' file not found
  bindgen_deps.h:7:10: fatal error: 'c.h' file not found, err: true
  thread 'main' panicked at 'Unable to generate bindings: ()', libs/postgres_ffi/build.rs:135:14
  note: run with `RUST_BACKTRACE=1` environment variable to display a backtrace
```

After this commit:

```
error: failed to run custom build command for `postgres_ffi v0.1.0 ($repo_path/libs/postgres_ffi)`

Caused by:
  process didn't exit successfully: `$repo_path/target/debug/build/postgres_ffi-e01fb59602596748/build-script-build` (exit status: 1)
  --- stdout
  cargo:rerun-if-changed=bindgen_deps.h

  --- stderr
  bindgen_deps.h:7:10: fatal error: 'c.h' file not found
  Error: Unable to generate bindings

  Caused by:
      clang diagnosed error: bindgen_deps.h:7:10: fatal error: 'c.h' file not found
```
2022-09-16 08:37:44 -07:00
Dmitry Rodionov
44fd4e3c9f add more logs 2022-09-16 18:14:05 +03:00
Dmitry Rodionov
4db15d3c7c change prefix_in_bucket in pageserver config 2022-09-16 18:14:05 +03:00
Alexander Bayandin
72b33997c7 Nightly Benchmarks: trigger tests earlier (#2463) 2022-09-16 09:09:54 +00:00
Kirill Bulatov
74312e268f Tidy up storege artifact build flags
* Simplify test build features handling
* Build only necessary binaries during the release build
2022-09-16 11:17:41 +03:00
sharnoff
db5ec0dae7 Cleanup/simplify logical size calculation (#2459)
Should produce identical results; replaces an error case that shouldn't
be possible with `expect`.
2022-09-15 23:50:46 -07:00
Kirill Bulatov
031e57a973 Disable failpoints by default 2022-09-16 09:26:29 +03:00
bojanserafimov
96e867642f Validate tenant create options (#2450)
Co-authored-by: Kirill Bulatov <kirill@neon.tech>
2022-09-15 18:20:23 -04:00
Egor Suvorov
e968b5e502 tests: do not set num_safekeepers = 1, it's the default (#2457)
Also get rid if `with_safekeepers` parameter in tests.
Its meaning has changed: `False` meant "no safekeepers" which is not
supported anymore, so we assume it's always `True`.

See #1648
2022-09-15 21:43:51 +03:00
Egor Suvorov
9d9d8e9519 docs/sourcetree: update CLion set up instructions (#2454)
After #2325 the old method no longer works as our Makefile does not print compilation commands when run with --dry-run, see https://github.com/neondatabase/neon/issues/2378#issuecomment-1241421325

This method is much slower but is hopefully robust.

Add some more notes while we're here.
2022-09-15 17:16:07 +00:00
Heikki Linnakangas
1062e57fee Don't run codestyle checks separately for Postgres v14 and v15.
Previously, we compiled neon separately for Postgres v14 and v15, for
the codestyle checks. But that was bogus; we actually just ran "make
postgres", which always compiled both versions. The version really only
affected the caching.

Fix that, by copying the build steps from the main build_and_test.yml
workflow.
2022-09-15 16:33:42 +03:00
dependabot[bot]
a8d9732529 Bump axum-core from 0.2.7 to 0.2.8
Bumps [axum-core](https://github.com/tokio-rs/axum) from 0.2.7 to 0.2.8.
- [Release notes](https://github.com/tokio-rs/axum/releases)
- [Changelog](https://github.com/tokio-rs/axum/blob/main/CHANGELOG.md)
- [Commits](https://github.com/tokio-rs/axum/compare/axum-core-v0.2.7...axum-core-v0.2.8)

---
updated-dependencies:
- dependency-name: axum-core
  dependency-type: indirect
...

Signed-off-by: dependabot[bot] <support@github.com>
2022-09-15 14:07:00 +01:00
Alexey Kondratov
757e2147c1 Follow-up for neondatabase/neon#2448 (#2452)
* remove `legacy` mode from the proxy readme
* explicitly specify `authBackend` in the link auth proxy helm-values
  for all envs
2022-09-15 15:21:22 +03:00
Dmitry Ivanov
87bf7be537 [proxy] Drop support for legacy cloud API (#2448)
Apparently, it no longer exists in the cloud.
2022-09-14 21:27:47 +03:00
Heikki Linnakangas
f86ea09323 Avoid recompiling postgres_ffi every time you run "make".
Running "make" at the top level calls "make install" to install the
PostgreSQL headers into the pg_install/ directory. That always updated
the modification time of the headers even if there were no changes,
triggering recompilation of the postgres_ffi bindings. To avoid that,
use 'install -C', to install the PostgreSQL headers.

However, there was an upstream PostgreSQL issue that the
src/include/Makefile didn't respect the INSTALL configure option. That
was just fixed in upstream PostgreSQL, so cherry-pick that fix to our
vendor/postgres repositories.

Fixes https://github.com/neondatabase/neon/issues/1873.
2022-09-14 15:18:18 +03:00
Alexander Bayandin
d87c9e62d6 Nightly Benchmarks: perform tests on both pre-created and fresh projects (#2443) 2022-09-14 10:53:34 +00:00
Heikki Linnakangas
c3096532f9 Fix vendor/postgres-v15 to point to correct v15 branch.
Commit f44afbaf62 updated vendor/postgres-v15 to point to a commit that
was built on top of PostgreSQL 14 rather than 15. So we accidentally had
two copies of PostgreSQL v14 in the repository. Oops. This updates
it to point to the correct version.
2022-09-14 09:23:51 +03:00
Kirill Bulatov
6db6e7ddda Use backward-compatible safekeeper code 2022-09-14 08:14:05 +03:00
Kirill Bulatov
b8eb908a3d Rename old project name references 2022-09-14 08:14:05 +03:00
Kirill Bulatov
260ec20a02 Refotmat pgxn code, add typedefs.list that was used 2022-09-14 08:14:05 +03:00
Dmitry Rodionov
ba8698bbcb update neon_local output in readme 2022-09-14 01:41:27 +03:00
Egor Suvorov
35761ac6b6 docs/sourcetree: add info about IDE config (#2332) 2022-09-13 21:55:18 +00:00
Kirill Bulatov
32b7259d5e Timeline data management RFC (#2152) 2022-09-13 22:37:20 +03:00
Dmitry Rodionov
1d53173e62 update openapi spec (tenant state has changed) 2022-09-13 21:53:59 +03:00
Alexander Bayandin
d4d57ea2dd github/workflows: fix project creation via API (#2437) 2022-09-13 20:26:26 +02:00
Dmitry Rodionov
db0c49148d clean up metrics in handle_pagerequests 2022-09-13 21:21:45 +03:00
Alexander Bayandin
59d04ab66a test_runner: redact passwords from log messages (#2434) 2022-09-13 17:24:11 +00:00
Kirill Bulatov
1a8c8b04d7 Merge Repository and Tenant entities, rework tenant background jobs 2022-09-13 15:39:39 +03:00
Konstantin Knizhnik
f44afbaf62 Changes of neon extension to support local prefetch (#2369)
* Changes of neon extension to support local prefetch

* Catch exceptions in pageserver_receive

* Bump posgres version

* Bump posgres version

* Bump posgres version

* Bump posgres version
2022-09-13 12:26:20 +03:00
Alexander Bayandin
4f7557fb58 github/workflows: Create projects using API (#2403)
* github/actions: add neon projects related actions
* workflows/benchmarking: create projects using API
* workflows/pg_clients: create projects using API
2022-09-13 09:45:45 +01:00
Kirill Bulatov
2a837d7de7 Create tenants in temporary directory first (#2426) 2022-09-12 21:04:33 +00:00
Heikki Linnakangas
40c845e57d Switch to async for all concurrency in the pageserver.
Instead of spawning helper threads, we now use Tokio tasks. There
are multiple Tokio runtimes, for different kinds of tasks. One for
serving libpq client connections, another for background operations
like GC and compaction, and so on. That's not strictly required, we
could use just one runtime, but with this you can still get an
overview of what's happening with "top -H".

There's one subtle behavior in how TenantState is updated. Before this
patch, if you deleted all timelines from a tenant, its GC and
compaction loops were stopped, and the tenant went back to Idle
state. We no longer do that. The empty tenant stays Active. The
changes to test_tenant_tasks.py are related to that.

There's still plenty of synchronous code and blocking. For example, we
still use blocking std::io functions for all file I/O, and the
communication with WAL redo processes is still uses low-level unix
poll(). We might want to rewrite those later, but this will do for
now. The model is that local file I/O is considered to be fast enough
that blocking - and preventing other tasks running in the same thread -
is acceptable.
2022-09-12 14:21:00 +03:00
Kirill Bulatov
698d6d0bad Use stable coverage API with rustc 1.60 2022-09-12 13:44:54 +03:00
Heikki Linnakangas
a48f9f377d Fix typo in issue template 2022-09-10 01:23:19 +03:00
Kirill Bulatov
18dafbb9ba Remove deceiving rust version from the CI files 2022-09-09 22:32:00 +03:00
Kirill Bulatov
648e86e9df Use Debian images with libc 2.31 to build legacy compute tools 2022-09-09 22:32:00 +03:00
Kirill Bulatov
923f642549 Collect cargo build timings 2022-09-09 22:32:00 +03:00
Kirill Bulatov
31ec3b7906 Use the toolchain file to define current rustc version used 2022-09-09 22:32:00 +03:00
Kirill Bulatov
c9e7c2f014 Ensure all temporary and empty directories and files are cleansed on pageserver startup 2022-09-09 16:36:45 +03:00
Kirill Bulatov
d3f83eda52 Use regular agent for triggering e2e tests 2022-09-09 16:07:43 +03:00
Dmitry Rodionov
0b76b82e0e review clean up 2022-09-08 19:59:42 +03:00
Heikki Linnakangas
35b4816f09 Turn GenericRemoteStorage into just a newtype around 'Arc<dyn RemoteStorage>'
We had a pattern like this:

    match remote_storage {
        GenericRemoteStorage::Local(storage) => {
            let source = storage.remote_object_id(&file_path)?;
            ...
            storage
                .function(&source, ...)
                .await
       },
       GenericRemoteStorage::S3(storage) => {
	    ... exact same code as for the Local case ...
       },

This removes the code duplication, by allowing you to call the functions
directly on GenericRemoteStorage.

Also change RemoveObjectId to be just a type alias for String. Now that
the callers of GenericRemoteStorage functions don't know whether they're
dealing with the LocalFs or S3 implementation, RemoveObjectId must be the
same type for both.
2022-09-08 19:59:42 +03:00
Alexander Bayandin
171385ac14 Pass COPT and PG_CFLAGS to Extension's CFLAGS (#2405)
* fix incompatible-function-pointer-types warning
* Pass COPT and PG_CFLAGS  to Extension's CFLAGS
2022-09-08 16:02:11 +01:00
MMeent
1351beae19 Fix race condition in ginHeapTupleFastInsert (#2412)
Because the metadata was not locked, it could be updated concurrently
such that we wouldn't actually have the tail block.

The current ordering works better, as we still only start XLogBeginInsert()
once we have all potentially interesting buffers loaded in memory, but
still have correct lock lifetimes.

See also: access/transam/README section Write-Ahead Log Coding
2022-09-08 10:57:30 +00:00
Alexander Bayandin
9e3136ea37 scripts/ingest_regress_test_result.py: fix json data insertion (#2408) 2022-09-07 21:40:08 +01:00
Alexander Bayandin
83dca73f85 Store Allure tests statistics in database (#2367) 2022-09-07 14:16:48 +01:00
Lassi Pölönen
dc2150a90e Add built files to gitignore (#2404) 2022-09-07 12:11:03 +00:00
Anastasia Lubennikova
2794cd83c7 Prepare pg 15 support (generate bindings for pg15) (#2396)
Another preparatory commit for pg15 support:
* generate bindings for both pg14 and pg15;
* update Makefile and CI scripts: now neon build depends on both PostgreSQL versions;
* some code refactoring to decrease version-specific dependencies.
2022-09-07 12:40:48 +03:00
Heikki Linnakangas
65b592d4bd Remove deprecated management API for timeline detach.
It is no longer used anywhere.
2022-09-06 18:54:04 +03:00
Heikki Linnakangas
f441fe57d4 Register prometheus counters correctly.
Commit f081419e68 moved all the prometheus counters to `metrics.rs`,
but accidentally replaced a couple of `register_int_counter!(...)`
calls with just `IntCounter::new(...)`. Because of that, the counters
were not registered in the metrics registry, and were not exposed
through the metrics HTTP endpoint.

Fixes failures we're seeing in a bunch of 'performance' tests because
of the missing metrics.
2022-09-06 17:38:17 +03:00
Heikki Linnakangas
cf157ad8e4 Add test that repeatedly kills and restarts the pageserver.
This caught or reproduced several bugs when I originally wrote this test
back in May, including #1731, #1740, #1751, and #707. I believe all the
issues have been fixed now, but since this was a very fruitful test,
let's add it to the test suite.

We didn't commit this earlier, because the test was very slow especially
with a debug build. We've since changed the build options so that even
the debug builds are not quite so slow anymore.
2022-09-06 13:00:40 +03:00
Lassi Pölönen
f081419e68 Cleanup tenant specific metrics once a tenant is detached. (#2328)
* Add test for pageserver metric cleanup once a tenant is detached.

* Remove tenant specific timeline metrics on detach.

* Use definitions from timeline_metrics in page service.

* Move metrics to own file from layered_repository/timeline.rs

* TIMELINE_METRICS: define smgr metrics

* REMOVE SMGR cleanup from timeline_metrics. Doesn't seem to work as
expected.

* Vritual file centralized metrics, except for evicted file as there's no
tenat id or timeline id.

* Use STORAGE_TIME from timeline_metrics in layered_repository.

* Remove timelineless gc metrics for tenant on detach.

* Rename timeline metrics -> metrics as it's more generic.

* Don't create a TimelineMetrics instance for VirtualFile

* Move the rest of the metric definitions to metrics.rs too.

* UUID -> ZTenantId

* Use consistent style for dict.

* Use Repository's Drop trait for dropping STORAGE_TIME metrics.

* No need for Arc, TimelineMetrics is used in just one place. Due to that,
we can fall back using ZTenantId and ZTimelineId too to avoid additional
string allocation.
2022-09-06 11:30:20 +03:00
Anastasia Lubennikova
05e263d0d3 Prepare pg 15 support (build system and submodules) (#2337)
* Add submodule postgres-15

* Support pg_15 in pgxn/neon

* Renamed zenith -> neon in Makefile

* fix name of codestyle check

* Refactor build system to prepare for building multiple Postgres versions.

Rename "vendor/postgres" to "vendor/postgres-v14"

Change Postgres build and install directory paths to be version-specific:

- tmp_install/build -> pg_install/build/14
- tmp_install/* -> pg_install/14/*

And Makefile targets:

- "make postgres" -> "make postgres-v14"
- "make postgres-headers" -> "make postgres-v14-headers"
- etc.

Add Makefile aliases:

- "make postgres" to build "postgres-v14" and in future, "postgres-v15"
- similarly for "make postgres-headers"

Fix POSTGRES_DISTRIB_DIR path in pytest scripts

* Make postgres version a variable in codestyle workflow

* Support vendor/postgres-v15 in codestyle check workflow

* Support postgres-v15 building in Makefile

* fix pg version in Dockerfile.compute-node

* fix kaniko path

* Build neon extensions in version-specific directories

* fix obsolete mentions of vendor/postgres

* use vendor/postgres-v14 in Dockerfile.compute-node.legacy

* Use PG_VERSION_NUM to gate dependencies in inmem_smgr.c

* Use versioned ECR repositories and image names for compute-node.
The image name format is compute-node-vXX, where XX is postgres major version number.
For now only v14 is supported.
Old format unversioned name (compute-node) is left, because cloud repo depends on it.

* update vendor/postgres submodule url (zenith->neondatabase rename)

* Fix postgres path in python tests after rebase

* fix path in regress test

* Use separate dockerfiles to build compute-node:
Dockerfile.compute-node-v15 should be identical to Dockerfile.compute-node-v14 except for the version number.
This is a hack, because Kaniko doesn't support build ARGs properly

* bump vendor/postgres-v14 and vendor/postgres-v15

* Don't use Kaniko cache for v14 and v15 compute-node images

* Build compute-node images for different versions in different jobs

Co-authored-by: Heikki Linnakangas <heikki@neon.tech>
2022-09-05 18:30:54 +03:00
Alexander Bayandin
ee0071e90d Fix nightly benchmark reports (#2392) 2022-09-05 14:30:37 +01:00
Stas Kelvich
772078eb5c Reword proxy SNI error message
Be more strict with project id/name difference and explain how to
get project id out of the domain name.
2022-09-05 15:13:29 +03:00
Konstantin Knizhnik
ad057124be Update relation size cache only when latest LSN is requested (#2310)
* Update relation size cache only when latest LSN is requested

* Fix tests

* Add a test case for timetravel query after pageserver restart.

This test is currently failing, the queries return incorrect results.
I don't know why, needs to be investigated.

    FAILED test_runner/batch_others/test_readonly_node.py::test_timetravel - assert 85 == 100000

If you remove the pageserver restart from the test, it passes.

* yapf3 test_readonly_node.py

* Add comment about cache correction in case of setting incorrect latest flag

* Fix formatting for test_readonly_node.py

* Remove unused imports

* Fix mypy warning for test_readonly_node.py

* Fix formatting of test_readonly_node.py

* Bump postgres version

Co-authored-by: Heikki Linnakangas <heikki@neon.tech>
2022-09-05 13:12:02 +03:00
Heikki Linnakangas
aeb1cf9c36 Fix misc typos and grammar in comments. 2022-09-05 11:09:32 +03:00
Heikki Linnakangas
7a3e8bb7fb Make tracing span names consistent for mgmt API handlers. 2022-09-05 11:02:13 +03:00
Konstantin Knizhnik
846d71b948 Add test for last written lsn cache (#1949)
* Fix pythin style

* Fix iport of test_backpressure in test_latency

* Apply changed to moved neon extension

* Apply changed to moved neon extension

* Merge with main

* Update pgxn/neon/pagestore_smgr.c

Co-authored-by: Heikki Linnakangas <heikki@zenith.tech>

* Bump postgres version

Co-authored-by: Heikki Linnakangas <heikki@zenith.tech>
2022-09-04 22:25:32 +03:00
Kirill Bulatov
2b6c49b2ea Fix negative usize parsing 2022-09-03 23:54:00 +03:00
Konstantin Knizhnik
eef7475408 Add tests for measuring effect of lsn caching (#2384)
* Add tests for measurif effet of lsn caching

* Fix formatting of test_latency.py

* Fix test_lsn_mapping test
2022-09-03 17:06:19 +03:00
Konstantin Knizhnik
71c965b0e1 Move backpressure throttling implementation to neon extension and function for monitoring throttling time (#2380)
* Move backpressure throttling implementation to neon extension and function for monitoring throttling time

* Add missing includes

* Bump postgres version
2022-09-03 08:48:28 +03:00
Heikki Linnakangas
a4e79db348 Move neon_local to control_plane.
Seems a bit silly to have a separate crate just for the executable. It
relies on the control plane for everything it does, and it's the only
user of the control plane.
2022-09-02 16:34:33 +03:00
MMeent
a463749f59 Slim down compute-node images (#2346)
Slim down compute-node images:

- Optimize compute_ctl build for size, not performance & debug-ability
- Don't run unused stages. Saves time in not building the PLV8 extension.
- Do not include static libraries in clean postgres
- Do the installation and finishing touches in the final layer in one job
    This allows docker (and kaniko) to only register one change to the files,
    removing potentially duplicate changed files.
- The runtime library for libreadline-dev is libreadline8, changing the dependency saves 45 MB
- libprotobuf-c-dev -> libprotobuf-c1, saving 100 kB
- libossp-uuid-dev -> libossp-uuid16, saving 150 kB
- gdal-bin + libgdal-dev -> libgeos-c1v5 + libgdal28 + libproj19, saving 747MB
- binutils @ testing -> libc6 @ testing, saving 32 MB
2022-09-02 14:34:40 +02:00
Kirill Bulatov
73f926c39a Return safekeeper remote storage logging during downloads 2022-09-02 15:08:18 +03:00
Kirill Bulatov
8b28adb6a6 Merge file name and extension for index part files 2022-09-02 14:57:09 +03:00
Kirill Bulatov
827c3013bd Adjust benchmark code to Ids 2022-09-02 14:57:09 +03:00
Kirill Bulatov
2db20e5587 Remove [Un]Loaded timeline code (#2359) 2022-09-02 14:31:28 +03:00
Kirill Bulatov
f78a542cba Calculate timeline initial logical size in the background
Start the calculation on the first size request, return
partially calculated size during calculation, retry if failed.

Remove "fast" size init through the ancestor: the current approach is
fast enough for now and there are better ways to optimize the
calculation via incremental ancestor size computation
2022-09-02 14:31:28 +03:00
Kirill Bulatov
8a7333438a Extract common remote storage operations into GenericRemoteStorage (#2373) 2022-09-02 11:58:28 +03:00
Heikki Linnakangas
47bd307cb8 Add python types to represent LSNs, tenant IDs and timeline IDs. (#2351)
For better ergonomics. I always found it weird that we used UUID to
actually mean a tenant or timeline ID. It worked because it happened
to have the same length, 16 bytes, but it was hacky.
2022-09-02 10:16:47 +03:00
Heikki Linnakangas
f0a0d7bb7a Split RcuWriteGuard::store() into two stages: store and wait.
This makes it easier to explain which stages allow concurrent readers and
writers. Expand the comments with examples, too.
2022-09-02 00:34:37 +03:00
Konstantin Knizhnik
40813adba2 Pevent creation of empty layers with duplicates (#2327)
* Pevent creation of empty layers with duplicates

* Add comments
2022-09-01 21:51:48 +03:00
Heikki Linnakangas
15c5f3e6cf Fix misc typos in comments and variable names. 2022-09-01 20:04:08 +03:00
Sergey Melnikov
46c8a93976 Fix PERF_TEST_RESULT_CONNSTR for benchmark init (#2375) 2022-09-01 15:06:52 +03:00
MMeent
13beeb59cd Update extensions included in compute-node
Update PLV8 to 3.1.4 - which is the latest release. 
Update PostGIS to 3.3.0

Remove PLV8 from the final image -- there is an issue we hit when installing PLV8, and we don't quite know what it is yet.
2022-09-01 12:53:17 +02:00
Alexander Bayandin
d7c9cfe7bb Create Allure report for perf tests (#2326) 2022-08-31 16:15:26 +01:00
Rory de Zoete
5745dbdd33 Remove deprecated notification channel (#2330)
Co-authored-by: Rory de Zoete <rdezoete@RorysMacStudio.fritz.box>
2022-08-31 14:36:24 +02:00
Kirill Bulatov
a4803233bb Remove RemoteObjectName and many remote storage generics in pageserver (#2360) 2022-08-30 22:19:52 +03:00
Heikki Linnakangas
f09bd6bc88 Fix size checks in the "local" remote storage implementation.
The code correctly detected too short and too long inputs, but the error
message was bogus for the case the input stream was too long:

    Error: Provided stream has actual size 5 fthat is smaller than the given stream size 4

That check was only supposed to check for too small inputs, but it in
fact caught too long inputs too. That was good, because the check
below that that was supposed to check for too long inputs was in fact
broken, and never did anything. It tried to read input a buffer of
size 0, to check if there is any extra data, but reading to a
zero-sized buffer always returns 0.
2022-08-30 18:44:06 +03:00
Heikki Linnakangas
3aca717f3d Reorganize python tests.
Merge batch_others and batch_pg_regress. The original idea was to
split all the python tests into multiple "batches" and run each batch
in parallel as a separate CI job. However, the batch_pg_regress batch
was pretty short compared to all the tests in batch_others. We could
split batch_others into multiple batches, but it actually seems better
to just treat them as one big pool of tests and use pytest's handle
the parallelism on its own. If we need to split them across multiple
nodes in the future, we could use pytest-shard or something else,
instead of managing the batches ourselves.

Merge test_neon_regress.py, test_pg_regress.py and test_isolation.py
into one file, test_pg_regress.py. Seems more clear to group all
pg_regress-based tests into one file, now that they would all be in
the same directory.
2022-08-30 18:25:38 +03:00
Dmitry Ivanov
96a50e99cf Forward various connection params to compute nodes. (#2336)
Previously, proxy didn't forward auxiliary `options` parameter
and other ones to the client's compute node, e.g.

```
$ psql "user=john host=localhost dbname=postgres options='-cgeqo=off'"
postgres=# show geqo;
┌──────┐
│ geqo │
├──────┤
│ on   │
└──────┘
(1 row)
```

With this patch we now forward `options`, `application_name` and `replication`.

Further reading: https://www.postgresql.org/docs/current/libpq-connect.html

Fixes #1287.
2022-08-30 17:36:21 +03:00
Arseny Sher
60408db101 Fix logging scopes in safekeeper. 2022-08-30 11:32:36 +03:00
Kirill Bulatov
07b4ace52f Use more restrictive .dockerignore 2022-08-29 22:01:53 +03:00
Konstantin Knizhnik
ee8b5f967d Add fork_at_current_lsn function which creates branch at current LSN (#2344)
* Add fork_at_current_lsn function which creates branch at current LSN

* Undo use of fork_at_current_lsn in test_branching because of short GC period

* Add missed return in fork_at_current_lsn

* Add missed return in fork_at_current_lsn

* Update test_runner/fixtures/neon_fixtures.py

Co-authored-by: Heikki Linnakangas <heikki@zenith.tech>

* Update test_runner/fixtures/neon_fixtures.py

Co-authored-by: Heikki Linnakangas <heikki@zenith.tech>

* Update test_runner/fixtures/neon_fixtures.py

Co-authored-by: Heikki Linnakangas <heikki@zenith.tech>

Co-authored-by: Heikki Linnakangas <heikki@zenith.tech>
2022-08-29 17:59:04 +03:00
MMeent
1324dd89ed Mark PostGIS and PLV8 as trusted extensions (#2355)
Now, users can install these extensions themselves if they are owner
of the database they try to install the extension in.
2022-08-29 13:44:56 +02:00
Heikki Linnakangas
bfa1d91612 Introduce RCU, and use it to protect latest_gc_cutoff_lsn.
`latest_gc_cutoff_lsn` tracks the cutoff point where GC has been
performed. Anything older than the cutoff might already have been GC'd
away, and cannot be queried by get_page_at_lsn requests. It's
protected by an RWLock. Whenever a get_page_at_lsn requests comes in,
it first grabs the lock and reads the current `latest_gc_cutoff`, and
holds the lock it until the request has been served. The lock ensures
that GC doesn't start concurrently and remove page versions that we
still need to satisfy the request.

With the lock, get_page_at_lsn request could potentially be blocked
for a long time.  GC only holds the lock in exclusive mode for a short
duration, but depending on how whether the RWLock is "fair", a read
request might be queued behind the GC's exclusive request, which in
turn might be queued behind a long-running read operation, like a
basebackup. If the lock implementation is not fair, i.e. if a reader
can always jump the queue if the lock is already held in read mode,
then another problem arises: GC might be starved if a constant stream
of GetPage requests comes in.

To avoid the long wait or starvation, introduce a Read-Copy-Update
mechanism to replace the lock on `latest_gc_cutoff_lsn`. With the RCU,
reader can always read the latest value without blocking (except for a
very short duration if the lock protecting the RCU is contended;
that's comparable to a spinlock). And a writer can always write a new
value without waiting for readers to finish using the old value. The
old readers will continue to see the old value through their guard
object, while new readers will see the new value.

This is purely theoretical ATM, we don't have any reports of either
starvation or blocking behind GC happening in practice. But it's
simple to fix, so let's nip that problem in the bud.
2022-08-29 11:23:37 +03:00
Heikki Linnakangas
7a840ec60c Move save_metadata function.
`timeline.rs` seems like a better home for it.
2022-08-27 18:14:40 +03:00
Heikki Linnakangas
5f189cd385 Remove some unnecessary derives.
Doesn't make much difference, but let's be tidy.
2022-08-27 18:14:38 +03:00
Heikki Linnakangas
f8188e679c Downgrade a few panics into plain errors.
Let's not bring down the whole pageserver if you import a bogus tar
archive to one timeline.
2022-08-27 18:14:35 +03:00
Heikki Linnakangas
34b5d7aa9f Remove unused dependency 2022-08-27 18:14:33 +03:00
Heikki Linnakangas
88a339ed73 Update a few crates
"cargo tree -d" showed that we're building multiple versions of some
crates. Update some crates, to avoid depending on multiple versions.
2022-08-27 18:14:30 +03:00
Heikki Linnakangas
ec20534173 Fix minor typos and leftover comments. 2022-08-27 17:54:56 +03:00
MMeent
c0a867d86f Include neon extensions in the main neon images (#2341)
Oversight in #2325 - apparently this area wasn't well-covered by tests in the neon repo.

Fixes #2340
2022-08-26 19:58:08 +02:00
Dmitry Ivanov
6d30e21a32 Fix proxy tests (#2343)
There might be different psql & locale configurations,
therefore we should explicitly reset them to defaults.
2022-08-26 20:42:32 +03:00
Kirill Bulatov
a56ae15edf Lock cargo dependencies during CI builds 2022-08-26 17:29:01 +03:00
Alexey Kondratov
a5ca6a9d2b Move legacy version of compute-node Dockerfile from postgres repo (#2339)
It's used by e2e CI. Building Dockerfile.compute-node will take
unreasonable ammount of time without v2 runners.

TODO: remove once cloud repo CI is moved to v2 runners.
2022-08-26 13:59:04 +02:00
MMeent
04a018a5b1 Extract neon and neon_test_utils from postgres repo (#2325)
* Extract neon and neon_test_utils from postgres repo
* Remove neon from vendored postgres repo, and fix build_and_test.yml
* Move EmitWarningsOnPlaceholders to end of _PG_init in neon.c (from libpagestore.c)
* Fix Makefile location comments
* remove Makefile EXTRA_INSTALL flag
* Update Dockerfile.compute-node to build and include the neon extension
2022-08-25 18:48:09 +02:00
MMeent
bc588f3a53 Update WAL redo histograms (#2323)
Previously, it could only distinguish REDO task durations down to 5ms, which
equates to approx. 200pages/sec or 1.6MB/sec getpage@LSN traffic. 
This patch improves to 200'000 pages/sec or 1.6GB/sec, allowing for
much more precise performance measurement of the redo process.
2022-08-25 17:17:32 +03:00
Egor Suvorov
c952f022bb waldecoder: fix comment 2022-08-25 15:03:22 +02:00
Rory de Zoete
f67d109e6e Copy binaries to /usr/local (#2335)
* Add extra symlink

* Take other approach

Co-authored-by: Rory de Zoete <rdezoete@RorysMacStudio.fritz.box>
2022-08-25 14:35:01 +02:00
Rory de Zoete
344db0b4aa Re-add temporary symlink (#2331)
Co-authored-by: Rory de Zoete <rdezoete@RorysMacStudio.fritz.box>
2022-08-25 11:17:09 +02:00
Rory de Zoete
0c8ee6bd1d Add postgis & plv8 extensions (#2298)
* Add postgis & plv8 extensions

* Update Dockerfile & Fix typo's

* Update dockerfile

* Update Dockerfile

* Update dockerfile

* Use plv8 step

* Reduce giga layer

* Reduce layer size further

* Prepare for rollout

* Fix dependency

* Pass on correct build tag

* No longer dependent on building tools

* Use version from vendor

* Revert "Use version from vendor"

This reverts commit 7c6670c477.

* Revert and push correct set

* Add configure step for new approach

* Re-add configure flags

Co-authored-by: Rory de Zoete <rdezoete@RorysMacStudio.fritz.box>
Co-authored-by: Rory de Zoete <rdezoete@Rorys-Mac-Studio.fritz.box>
2022-08-25 09:46:52 +02:00
Dmitry Ivanov
8e1d6dd848 Minor cleanup in pq_proto (#2322) 2022-08-23 18:00:02 +03:00
Heikki Linnakangas
4013290508 Fix module doc comment.
`///` is used for comments on the *next* code that follows, so the comment
actually applied to the `use std::collections::BTreeMap;` line that follows.

rustfmt complained about that:

    error: an inner attribute is not permitted following an outer doc comment
     --> /home/heikki/git-sandbox/neon/libs/utils/src/seqwait_async.rs:7:1
      |
    5 | ///
      | --- previous doc comment
    6 |
    7 | #![warn(missing_docs)]
      | ^^^^^^^^^^^^^^^^^^^^^^ not permitted following an outer attribute
    8 |
    9 | use std::collections::BTreeMap;
      | ------------------------------- the inner attribute doesn't annotate this `use` import
      |
      = note: inner attributes, like `#![no_std]`, annotate the item enclosing them, and are usually found at the beginning of source files
help: to annotate the `use` import, change the attribute from inner to outer style
      |
    7 - #![warn(missing_docs)]
    7 + #[warn(missing_docs)]
      |

`//!` is the correct syntax for comments that apply to the whole file.
2022-08-23 12:58:54 +03:00
Heikki Linnakangas
5f0c95182d Minor cleanup, to pass by reference where possible. 2022-08-23 12:58:54 +03:00
Heikki Linnakangas
63b9dfb2f2 Remove unnecessary 'pub' from test module, and remove dead constant.
After making the test module private, the compiler noticed and warned
that the constant is unused.
2022-08-23 12:58:54 +03:00
Heikki Linnakangas
1a666a01d6 Improve comments a little. 2022-08-23 12:58:54 +03:00
Heikki Linnakangas
d110d2c2fd Reorder permission checks in HTTP API call handlers.
Every handler function now follows the same pattern:

1. extract parameters from the call
2. check permissions
3. execute command.

Previously, we extracted some parameters before permission check and
some after. Let's be consistent.
2022-08-23 12:14:06 +03:00
KlimentSerafimov
b98fa5d6b0 Added a new test for making sure the proxy displays a session_id when using link auth. (#2039)
Added pytest to check correctness of the link authentication pipeline.

Context: this PR is the first step towards refactoring the link authentication pipeline to use https (instead of psql) to send the db info to the proxy. There was a test missing for this pipeline in this repo, so this PR adds that test as preparation for the actual change of psql -> https.
Co-authored-by: Bojan Serafimov <bojan.serafimov7@gmail.com>
Co-authored-by: Dmitry Rodionov <dmitry@neon.tech>
Co-authored-by: Stas Kelvic <stas@neon.tech>
Co-authored-by: Dimitrii Ivanov <dima@neon.tech>
2022-08-22 20:02:45 -04:00
Dmitry Rodionov
9dd19ec397 Remove interferring proc check
We do not need it anymore because ports_distributor checks
whether the port can be used before giving it to service
2022-08-22 20:59:32 +03:00
Alexander Bayandin
832e60c2b4 Add .git-blame-ignore-revs file (#2318) 2022-08-22 16:38:31 +01:00
Alexey Kondratov
6dc56a9be1 Add GitHub templates for epics, bugs and release PRs (neondatabase/cloud#2079)
After merging this we will be able to:
- Pick Epic or Bug template in the GitHub UI, when creating an issue
- Use this link to open a release PR formatted in a unified way and
  containing a checklist with useful links: https://github.com/neondatabase/neon/compare/release...main?template=release-pr.md&title=Release%20202Y-MM-DD
2022-08-22 16:29:42 +02:00
Alexander Bayandin
39a3bcac36 test_runner: fix flake8 warnings 2022-08-22 14:57:09 +01:00
Alexander Bayandin
ae3227509c test_runner: revive flake8 2022-08-22 14:57:09 +01:00
Alexander Bayandin
4c2bb43775 Reformat all python files by black & isort 2022-08-22 14:57:09 +01:00
Alexander Bayandin
6b2e1d9065 test_runner: replace yapf with black and isort 2022-08-22 14:57:09 +01:00
Alexander Bayandin
277f2d6d3d Report test results to Allure (#2229) 2022-08-22 11:21:50 +01:00
Kirill Bulatov
7779308985 Ensure timeline logical size is initialized once 2022-08-22 11:51:37 +03:00
Kirill Bulatov
32be8739b9 Move walreceiver timeline registration into layered_repository 2022-08-22 11:51:37 +03:00
Kirill Bulatov
631cbf5b1b Use single map to manage timeline data 2022-08-22 11:51:37 +03:00
Heikki Linnakangas
5522fbab25 Move all unit tests related to Repository/Timeline to layered_repository.rs
There was a nominal split between the tests in layered_repository.rs and
repository.rs, such that tests specific to the layered implementation were
supposed to be in layered_repository.rs, and tests that should work with
any implementation of the traits were supposed to be in repository.rs.
In practice, the line was quite muddled. With minor tweaks, many of the
tests in layered_repository.rs should work with other implementations too,
and vice versa. And in practice we only have one implementation, so it's
more straightforward to gather all unit tests in one place.
2022-08-20 01:21:18 +03:00
Heikki Linnakangas
d48177d0d8 Expose timeline logical size as a prometheus metric.
Physical size was already exposed, and it'd be nice to show both
logical and physical size side by side in our graphana dashboards.
2022-08-19 22:21:33 +03:00
Heikki Linnakangas
84cd40b416 rustfmt fixes.
Not sure why these don't show up as CI failures, but on my laptop,
rustfmt insists.
2022-08-19 22:21:15 +03:00
Heikki Linnakangas
daba4c7405 Add a section in glossary to explain what "logical size" means. (#2306) 2022-08-19 21:57:00 +03:00
MMeent
8ac5a285a1 Update vendor/postgres to one that is rebased onto REL_14_5 (#2312)
This was previously based on REL_14_4
Protected tag of main before rebase is at main-before-rebase-REL_14_5
2022-08-19 20:02:36 +02:00
Heikki Linnakangas
aaa60c92ca Use u64/i64 for logical size, comment on why to use signed i64.
usize/isize type corresponds to the CPU architecture's pointer width,
i.e. 64 bits on a 64-bit platform and 32 bits on a 32-bit platform.
The logical size of a database has nothing to do with the that, so
u64/i64 is more appropriate.

It doesn't make any difference in practice as long as you're on a
64-bit platform, and it's hard to imagine anyone wanting to run the
pageserver on a 32-bit platform, but let's be tidy.

Also add a comment on why we use signed i64 for the logical size
variable, even though size should never be negative. I'm not sure the
reasons are very good, but at least this documents them, and hints at
some possible better solutions.
2022-08-19 16:44:16 +03:00
Kirill Bulatov
187a760409 Reset codestyle cargo cache 2022-08-19 16:40:37 +03:00
Kirill Bulatov
c634cb1d36 Remove TimelineWriter trait, rename LayeredTimelineWriter struct into TimelineWriter 2022-08-19 16:40:37 +03:00
Kirill Bulatov
c19b4a65f9 Remove Repository trait, rename LayeredRepository struct into Repository 2022-08-19 16:40:37 +03:00
Kirill Bulatov
8043612334 Remove Timeline trait, rename LayeredTimeline struct into Timeline 2022-08-19 16:40:37 +03:00
Rory de Zoete
12e87f0df3 Update workflow to fix dependency issue (#2309)
* Update workflow to fix dependency issue

* Update workflow

* Update workflow and dockerfile

* Specify tag

* Update main dockerfile as well

* Mirror rust image to docker hub

* Update submodule ref

Co-authored-by: Rory de Zoete <rdezoete@Rorys-Mac-Studio.fritz.box>
2022-08-19 12:07:46 +02:00
Kirill Bulatov
6b9cef02a1 Use better defaults for pageserver Docker image 2022-08-19 12:41:00 +03:00
MMeent
37d90dc3b3 Fix dependencies issue between compute-tools and compute node docker images (#2304)
Compute node docker image requires compute-tools to build, but this
dependency (and the argument for which image to pick) weren't described in the
workflow file. This lead to out-of-date binaries in latest builds, which
subsequently broke these images.
2022-08-18 21:51:33 +02:00
Kirill Bulatov
a185821d6f Explicitly error on cache issues during I/O (#2303) 2022-08-18 22:37:20 +03:00
MMeent
f99ccb5041 Extract WalProposer into the neon extension (#2217)
Including, but not limited to:

* Fixes to neon management code to support walproposer-as-an-extension

* Fix issue in expected output of pg settings serialization.

* Show the logs of a failed --sync-safekeepers process in CI

* Add compat layer for renamed GUCs in postgres.conf

* Update vendor/postgres to the latest origin/main
2022-08-18 17:12:28 +02:00
Rory de Zoete
2db675a2f2 Re-enable test dependency for deploy (#2300)
Co-authored-by: Rory de Zoete <rdezoete@Rorys-Mac-Studio.fritz.box>
2022-08-18 15:18:59 +02:00
Anton Galitsyn
77a2bdf3d7 on safekeeper registration pass availability zone param (#2292) 2022-08-18 15:05:40 +03:00
Arthur Petukhovsky
976576ae59 Fix walreceiver and safekeeper bugs (#2295)
- There was an issue with zero commit_lsn `reason: LaggingWal { current_commit_lsn: 0/0, new_commit_lsn: 1/6FD90D38, threshold: 10485760 } }`. The problem was in `send_wal.rs`, where we initialized `end_pos = Lsn(0)` and in some cases sent it to the pageserver.
- IDENTIFY_SYSTEM previously returned `flush_lsn` as a physical end of WAL. Now it returns `flush_lsn` (as it was) to walproposer and `commit_lsn` to everyone else including pageserver.
- There was an issue with backoff where connection was cancelled right after initialization: `connected!` -> `safekeeper_handle_db: Connection cancelled` -> `Backoff: waiting 3 seconds`. The problem was in sleeping before establishing the connection. This is fixed by reworking retry logic.
- There was an issue with getting `NoKeepAlives` reason in a loop. The issue is probably the same as the previous.
- There was an issue with filtering safekeepers based on retry attempts, which could filter some safekeepers indefinetely. This is fixed by using retry cooldown duration instead of retry attempts.
- Some `send_wal.rs` connections failed with errors without context. This is fixed by adding a timeline to safekeepers errors.

New retry logic works like this:
- Every candidate has a `next_retry_at` timestamp and is not considered for connection until that moment
- When walreceiver connection is closed, we update `next_retry_at` using exponential backoff, increasing the cooldown on every disconnect.
- When `last_record_lsn` was advanced using the WAL from the safekeeper, we reset the retry cooldown and exponential backoff, allowing walreceiver to reconnect to the same safekeeper instantly.
2022-08-18 13:38:23 +03:00
Anastasia Lubennikova
1a07ddae5f fix cargo test 2022-08-18 13:25:00 +03:00
Heikki Linnakangas
9bc12f7444 Move auto-generated 'bindings' to a separate inner module.
Re-export only things that are used by other modules.

In the future, I'm imagining that we run bindgen twice, for Postgres
v14 and v15. The two sets of bindings would go into separate
'bindings_v14' and 'bindings_v15' modules.

Rearrange postgres_ffi modules.

Move function, to avoid Postgres version dependency in timelines.rs
Move function to generate a logical-message WAL record to postgres_ffi.
2022-08-18 13:25:00 +03:00
Rory de Zoete
92bdf04758 Fix: Always build images (#2296)
* Always build images

* Remove unused

Co-authored-by: Rory de Zoete <rdezoete@RorysMacStudio.fritz.box>
2022-08-18 09:41:24 +02:00
Kirill Bulatov
67e091c906 Rework init in pageserver CLI (#2272)
* Do not create initial tenant and timeline (adjust Python tests for that)
* Rework config handling during init, add --update-config to manage local config updates
2022-08-17 23:24:47 +03:00
Alexander Bayandin
dc102197df workflows/benchmarking: increase timeout (#2294) 2022-08-17 17:16:26 +01:00
Rory de Zoete
262cdf8344 Update cachepot endpoint (#2290)
* Update cachepot endpoint

* Update dockerfile & remove env

* Update image building process

* Cannot use metadata endpoint for this

* Update workflow

* Conform to kaniko syntax

* Update syntax

* Update approach

* Update dockerfiles

* Force update

* Update dockerfiles

* Update dockerfile

* Cleanup dockerfiles

* Update s3 test location

* Revert s3 experiment

* Add more debug

* Specify aws region

* Remove debug, add prefix

* Remove one more debug

Co-authored-by: Rory de Zoete <rdezoete@RorysMacStudio.fritz.box>
2022-08-17 18:02:03 +02:00
Kirill Bulatov
3b819ee159 Remove extra type aliases (#2280) 2022-08-17 17:51:53 +03:00
bojanserafimov
e9a3499e87 Fix flaky pageserver restarts in tests (#2261) 2022-08-17 08:17:35 -04:00
bojanserafimov
3414feae03 Make local mypy behave like CI mypy (#2291) 2022-08-17 08:17:09 -04:00
Heikki Linnakangas
e94a5ce360 Rename pg_control_ffi.h to bindgen_deps.h, for clarity.
The pg_control_ffi.h name implies that it only includes stuff related to
pg_control.h. That's mostly true currently, but really the point of the
file is to include everything that we need to generate Rust definitions
from.
2022-08-16 19:37:36 +03:00
Dmitry Rodionov
d5ec84b87b reset rust cache for clippy run to avoid an ICE
additionally remove trailing whitespaces
2022-08-16 18:49:32 +03:00
Dmitry Rodionov
b21f7382cc split out timeline metrics, track layer map loading and size calculation 2022-08-16 18:49:32 +03:00
Kirill Bulatov
648e8bbefe Fix 1.63 clippy lints (#2282) 2022-08-16 18:49:22 +03:00
Rory de Zoete
9218426e41 Fix docker zombie process issue (#2289)
* Fix docker zombie process issue

* Init everywhere

Co-authored-by: Rory de Zoete <rdezoete@RorysMacStudio.fritz.box>
2022-08-16 17:24:58 +02:00
Rory de Zoete
1d4114183c Use main, not branch for ref check (#2288)
* Use main, not branch for ref check

* Add more debug

* Count main, not head

* Try new approach

* Conform to syntax

* Update approach

* Get full history

* Skip checkout

* Cleanup debug

* Remove more debug

Co-authored-by: Rory de Zoete <rdezoete@RorysMacStudio.fritz.box>
2022-08-16 15:41:31 +02:00
Rory de Zoete
4cde0e7a37 Error for fatal not git repo (#2286)
Co-authored-by: Rory de Zoete <rdezoete@RorysMacStudio.fritz.box>
2022-08-16 13:59:41 +02:00
Rory de Zoete
83f7b8ed22 Add missing step output, revert one deploy step (#2285)
* Add missing step output, revert one deploy step

* Conform to syntax

* Update approach

* Add missing value

* Add missing needs

Co-authored-by: Rory de Zoete <rdezoete@RorysMacStudio.fritz.box>
2022-08-16 13:41:51 +02:00
Rory de Zoete
b8f0f37de2 Gen2 GH runner (#2128)
* Re-add rustup override

* Try s3 bucket

* Set git version

* Use v4 cache key to prevent problems

* Switch to v5 for key

* Add second rustup fix

* Rebase

* Add kaniko steps

* Fix typo and set compress level

* Disable global run default

* Specify shell for step

* Change approach with kaniko

* Try less verbose shell spec

* Add submodule pull

* Add promote step

* Adjust dependency chain

* Try default swap again

* Use env

* Don't override aws key

* Make kaniko build conditional

* Specify runs on

* Try without dependency link

* Try soft fail

* Use image with git

* Try passing to next step

* Fix duplicate

* Try other approach

* Try other approach

* Fix typo

* Try other syntax

* Set env

* Adjust setup

* Try step 1

* Add link

* Try global env

* Fix mistake

* Debug

* Try other syntax

* Try other approach

* Change order

* Move output one step down

* Put output up one level

* Try other syntax

* Skip build

* Try output

* Re-enable build

* Try other syntax

* Skip middle step

* Update check

* Try first step of dockerhub push

* Update needs dependency

* Try explicit dir

* Add missing package

* Try other approach

* Try other approach

* Specify region

* Use with

* Try other approach

* Add debug

* Try other approach

* Set region

* Follow AWS example

* Try github approach

* Skip Qemu

* Try stdin

* Missing steps

* Add missing close

* Add echo debug

* Try v2 endpoint

* Use v1 endpoint

* Try without quotes

* Revert

* Try crane

* Add debug

* Split steps

* Fix duplicate

* Add shell step

* Conform to options

* Add verbose flag

* Try single step

* Try workaround

* First request fails hunch

* Try bullseye image

* Try other approach

* Adjust verbose level

* Try previous step

* Add more debug

* Remove debug step

* Remove rogue indent

* Try with larger image

* Add build tag step

* Update workflow for testing

* Add tag step for test

* Remove unused

* Update dependency chain

* Add ownership fix

* Use matrix for promote

* Force update

* Force build

* Remove unused

* Add new image

* Add missing argument

* Update dockerfile copy

* Update Dockerfile

* Update clone

* Update dockerfile

* Go to correct folder

* Use correct format

* Update dockerfile

* Remove cd

* Debug find where we are

* Add debug on first step

* Changedir to postgres

* Set workdir

* Use v1 approach

* Use other dependency

* Try other approach

* Try other approach

* Update dockerfile

* Update approach

* Update dockerfile

* Update approach

* Update dockerfile

* Update dockerfile

* Add workspace hack

* Update Dockerfile

* Update Dockerfile

* Update Dockerfile

* Change last step

* Cleanup pull in prep for review

* Force build images

* Add condition for latest tagging

* Use pinned version

* Try without name value

* Remove more names

* Shorten names

* Add kaniko comments

* Pin kaniko

* Pin crane and ecr helper

* Up one level

* Switch to pinned tag for rust image

* Force update for test

Co-authored-by: Rory de Zoete <rdezoete@RorysMacStudio.fritz.box>
Co-authored-by: Rory de Zoete <rdezoete@b04468bf-cdf4-41eb-9c94-aff4ca55e4bf.fritz.box>
Co-authored-by: Rory de Zoete <rdezoete@Rorys-Mac-Studio.fritz.box>
Co-authored-by: Rory de Zoete <rdezoete@4795e9ee-4f32-401f-85f3-f316263b62b8.fritz.box>
Co-authored-by: Rory de Zoete <rdezoete@2f8bc4e5-4ec2-4ea2-adb1-65d863c4a558.fritz.box>
Co-authored-by: Rory de Zoete <rdezoete@27565b2b-72d5-4742-9898-a26c9033e6f9.fritz.box>
Co-authored-by: Rory de Zoete <rdezoete@ecc96c26-c6c4-4664-be6e-34f7c3f89a3c.fritz.box>
Co-authored-by: Rory de Zoete <rdezoete@7caff3a5-bf03-4202-bd0e-f1a93c86bdae.fritz.box>
2022-08-16 11:15:35 +02:00
Kirill Bulatov
18f251384d Check for entire range during sasl validation (#2281) 2022-08-16 11:10:38 +03:00
Alexander Bayandin
4cddb0f1a4 Set up a workflow to run pgbench against captest (#2077) 2022-08-15 18:54:31 +01:00
Arseny Sher
7b12deead7 Bump vendor/postgres to include XLP_FIRST_IS_CONTRECORD fix. (#2274) 2022-08-15 18:24:24 +03:00
Dmitry Rodionov
63a72d99bb increase timeout in wait_for_upload to avoid spurious failures when testing with real s3 2022-08-15 18:02:27 +03:00
Arthur Petukhovsky
116ecdf87a Improve walreceiver logic (#2253)
This patch makes walreceiver logic more complicated, but it should work better in most cases. Added `test_wal_lagging` to test scenarios where alive safekeepers can lag behind other alive safekeepers.

- There was a bug which looks like `etcd_info.timeline.commit_lsn > Some(self.local_timeline.get_last_record_lsn())` filtered all safekeepers in some strange cases. I removed this filter, it should probably help with #2237
- Now walreceiver_connection reports status, including commit_lsn. This allows keeping safekeeper connection even when etcd is down.
- Safekeeper connection now fails if pageserver doesn't receive safekeeper messages for some time. Usually safekeeper sends messages at least once per second.
- `LaggingWal` check now uses `commit_lsn` directly from safekeeper. This fixes the issue with often reconnects, when compute generates WAL really fast.
- `NoWalTimeout` is rewritten to trigger only when we know about the new WAL and the connected safekeeper doesn't stream any WAL. This allows setting a small `lagging_wal_timeout` because it will trigger only when we observe that the connected safekeeper has stuck.
2022-08-15 13:31:26 +03:00
Arseny Sher
431393e361 Find end of WAL on safekeepers using WalStreamDecoder.
We could make it inside wal_storage.rs, but taking into account that
 - wal_storage.rs reading is async
 - we don't need s3 here
 - error handling is different; error during decoding is normal
I decided to put it separately.

Test
cargo test test_find_end_of_wal_last_crossing_segment
prepared earlier by @yeputons passes now.

Fixes https://github.com/neondatabase/neon/issues/544
      https://github.com/neondatabase/cloud/issues/2004
Supersedes https://github.com/neondatabase/neon/pull/2066
2022-08-14 14:47:14 +03:00
Kirill Bulatov
f38f45b01d Better storage sync logs (#2268) 2022-08-13 10:58:14 +03:00
Andrey Taranik
a5154dce3e get_binaries script fix (#2263)
* get_binaries uses DOCKER_TAG taken from docker image build step

* remove docker tag discovery at all and fix get_binaries for version variable
2022-08-12 20:35:26 +03:00
Alexander Bayandin
da5f8486ce test_runner/pg_clients: collect docker logs (#2259) 2022-08-12 17:03:09 +01:00
Dmitry Ivanov
ad08c273d3 [proxy] Rework wire format of the password hack and some errors (#2236)
The new format has a few benefits: it's shorter, simpler and
human-readable as well. We don't use base64 anymore, since
url encoding got us covered.

We also show a better error in case we couldn't parse the
payload; the users should know it's all about passing the
correct project name.
2022-08-12 17:38:43 +03:00
Andrey Taranik
7f97269277 get_binaries uses DOCKER_TAG taken from docker image build step (#2260) 2022-08-12 16:01:22 +03:00
Thang Pham
6d99b4f1d8 disable test_import_from_pageserver_multisegment (#2258)
This test failed consistently on `main` now. It's better to temporarily disable it to avoid blocking others' PRs while investigating the root cause for the test failure.

See: #2255, #2256
2022-08-12 19:13:42 +07:00
Egor Suvorov
a7bf60631f postgres_ffi/waldecoder: introduce explicit enum State
Previously it was emulated with a combination of nullable fields.
This change should make the logic more readable.
2022-08-12 11:40:46 +03:00
Egor Suvorov
07bb7a2afe postgres_ffi/waldecoder: remove unused startlsn 2022-08-12 11:40:46 +03:00
Egor Suvorov
142e247e85 postgres_ffi/waldecoder: validate more header fields 2022-08-12 11:40:46 +03:00
Thang Pham
7da47d8a0a Fix timeline physical size flaky tests (#2244)
Resolves #2212.

- use `wait_for_last_flush_lsn` in `test_timeline_physical_size_*` tests

## Context
Need to wait for the pageserver to catch up with the compute's last flush LSN because during the timeline physical size API call, it's possible that there are running `LayerFlushThread` threads. These threads flush new layers into disk and hence update the physical size. This results in a mismatch between the physical size reported by the API and the actual physical size on disk.

### Note
The `LayerFlushThread` threads are processed **concurrently**, so it's possible that the above error still persists even with this patch. However, making the tests wait to finish processing all the WALs (not flushing) before calculating the physical size should help reduce the "flakiness" significantly
2022-08-12 14:28:50 +07:00
Thang Pham
dc52436a8f Fix bug when import large (>1GB) relations (#2172)
Resolves #2097 

- use timeline modification's `lsn` and timeline's `last_record_lsn` to determine the corresponding LSN to query data in `DatadirModification::get`
- update `test_import_from_pageserver`. Split the test into 2 variants: `small` and `multisegment`. 
  + `small` is the old test
  + `multisegment` is to simulate #2097 by using a larger number of inserted rows to create multiple segment files of a relation. `multisegment` is configured to only run with a `release` build
2022-08-12 09:24:20 +07:00
Kirill Bulatov
995a2de21e Share exponential backoff code and fix logic for delete task failure (#2252) 2022-08-11 23:21:06 +03:00
Arseny Sher
e593cbaaba Add pageserver checkpoint_timeout option.
To flush inmemory layer eventually when no new data arrives, which helps
safekeepers to suspend activity (stop pushing to the broker). Default 10m should
be ok.
2022-08-11 22:54:09 +03:00
Heikki Linnakangas
4b9e02be45 Update back vendor/postgres back; it was changed accidentally. (#2251)
Commit 4227cfc96e accidentally reverted vendor/postgres to an older
version. Update it back.
2022-08-11 19:25:08 +03:00
Kirill Bulatov
7a36d06cc2 Fix exponential backoff values 2022-08-11 08:34:57 +03:00
Konstantin Knizhnik
4227cfc96e Safe truncate (#2218)
* Move relation sie cache to layered timeline

* Fix obtaining current LSN for relation size cache

* Resolve merge conflicts

* Resolve merge conflicts

* Reestore 'lsn' field in DatadirModification

* adjust DatadirModification lsn in ingest_record

* Fix formatting

* Pass lsn to get_relsize

* Fix merge conflict

* Update pageserver/src/pgdatadir_mapping.rs

Co-authored-by: Heikki Linnakangas <heikki@zenith.tech>

* Update pageserver/src/pgdatadir_mapping.rs

Co-authored-by: Heikki Linnakangas <heikki@zenith.tech>

* Check if relation exists before trying to truncat it

refer #1932

* Add test reporducing FSM truncate problem

Co-authored-by: Heikki Linnakangas <heikki@zenith.tech>
2022-08-09 22:45:33 +03:00
Dmitry Rodionov
1fc761983f support node id and remote storage params in docker_entrypoint.sh 2022-08-09 18:59:00 +03:00
Stas Kelvich
227d47d2f3 Update CONTRIBUTING.md 2022-08-09 14:18:25 +03:00
Stas Kelvich
0290893bcc Update CONTRIBUTING.md 2022-08-09 14:18:25 +03:00
Heikki Linnakangas
32fd709b34 Fix links to safekeeper protocol docs. (#2188)
safekeeper/README_PROTO.md was moved to docs/safekeeper-protocol.md in
commit 0b14fdb078, as part of reorganizing the docs into 'mdbook' format.

Fixes issue #1475. Thanks to @banks for spotting the outdated references.

In addition to fixing the above issue, this patch also fixes other broken links as a result of 0b14fdb078. See https://github.com/neondatabase/neon/pull/2188#pullrequestreview-1055918480.

Co-authored-by: Heikki Linnakangas <heikki@neon.tech>
Co-authored-by: Thang Pham <thang@neon.tech>
2022-08-09 10:19:18 +07:00
Kirill Bulatov
3a9bff81db Fix etcd typos 2022-08-08 19:04:46 +03:00
bojanserafimov
743370de98 Major migration script (#2073)
This script can be used to migrate a tenant across breaking storage versions, or (in the future) upgrading postgres versions. See the comment at the top for an overview.

Co-authored-by: Anastasia Lubennikova <anastasia@neon.tech>
2022-08-08 17:52:28 +02:00
Dmitry Rodionov
cdfa9fe705 avoid duplicate parameter, increase timeout 2022-08-08 12:15:16 +03:00
Dmitry Rodionov
7cd68a0c27 increase timeout to pass test with real s3 2022-08-08 12:15:16 +03:00
Dmitry Rodionov
beaa991f81 remove debug log 2022-08-08 12:15:16 +03:00
Dmitry Rodionov
9430abae05 use event so it fires only if workload thread successfully finished 2022-08-08 12:15:16 +03:00
Dmitry Rodionov
4da4c7f769 increase statement timeout 2022-08-08 12:15:16 +03:00
Dmitry Rodionov
0d14d4a1a8 ignore record property warning to fix benchmarks 2022-08-08 12:15:16 +03:00
bojanserafimov
8c8431ebc6 Add more buckets to pageserver latency metrics (#2225) 2022-08-06 11:45:47 +02:00
Ankur Srivastava
84d1bc06a9 refactor: replace lazy-static with once-cell (#2195)
- Replacing all the occurrences of lazy-static with `once-cell::sync::Lazy`
- fixes #1147

Signed-off-by: Ankur Srivastava <best.ankur@gmail.com>
2022-08-05 19:34:04 +02:00
Konstantin Knizhnik
5133db44e1 Move relation size cache from WalIngest to DatadirTimeline (#2094)
* Move relation sie cache to layered timeline

* Fix obtaining current LSN for relation size cache

* Resolve merge conflicts

* Resolve merge conflicts

* Reestore 'lsn' field in DatadirModification

* adjust DatadirModification lsn in ingest_record

* Fix formatting

* Pass lsn to get_relsize

* Fix merge conflict

* Update pageserver/src/pgdatadir_mapping.rs

Co-authored-by: Heikki Linnakangas <heikki@zenith.tech>

* Update pageserver/src/pgdatadir_mapping.rs

Co-authored-by: Heikki Linnakangas <heikki@zenith.tech>

Co-authored-by: Heikki Linnakangas <heikki@zenith.tech>
2022-08-05 16:28:59 +03:00
Alexander Bayandin
4cb1074fe5 github/workflows: Fix git dubious ownership (#2223) 2022-08-05 13:44:57 +01:00
Arthur Petukhovsky
0a958b0ea1 Check find_end_of_wal errors instead of unwrap 2022-08-04 17:56:19 +03:00
Vadim Kharitonov
1bbc8090f3 [issue #1591] Add neon_local pageserver status handler 2022-08-04 16:38:29 +03:00
Dmitry Rodionov
f7d8db7e39 silence https://github.com/neondatabase/neon/issues/2211 2022-08-04 16:32:19 +03:00
Dmitry Rodionov
e54941b811 treat pytest warnings as errors 2022-08-04 16:32:19 +03:00
Heikki Linnakangas
52ce1c9d53 Speed up test shutdown, by polling more frequently.
A fair amount of the time in our python tests is spent waiting for the
pageserver and safekeeper processes to shut down. It doesn't matter so
much when you're running a lot of tests in parallel, but it's quite
noticeable when running them sequentially.

A big part of the slowness is that is that after sending the SIGTERM
signal, we poll to see if the process is still running, and the
polling happened at 1 s interval. Reduce it to 0.1 s.
2022-08-04 12:57:15 +03:00
Dmitry Rodionov
bc2cb5382b run real s3 tests in CI 2022-08-04 11:14:05 +03:00
Dmitry Rodionov
5f71aa09d3 support running tests against real s3 implementation without mocking 2022-08-04 11:14:05 +03:00
Dmitry Rodionov
b4f2c5b514 run benchmarks conditionally, on main or if run_benchmarks label is set 2022-08-03 01:36:14 +03:00
Alexander Bayandin
71f39bac3d github/workflows: upload artifacts to S3 (#2071) 2022-08-02 13:57:26 +01:00
Stas Kelvich
177d5b1f22 Bump postgres to get uuid extension 2022-08-02 11:16:26 +03:00
dependabot[bot]
8ba41b8c18 Bump pywin32 from 227 to 301 (#2202) 2022-08-01 19:08:09 +01:00
Dmitry Rodionov
1edf3eb2c8 increase timeout so mac os job can finish the build with all cache misses 2022-08-01 18:28:49 +03:00
Dmitry Rodionov
0ebb6bc4b0 Temporary pin Werkzeug version because moto hangs with newer one. See https://github.com/spulec/moto/issues/5341 2022-08-01 18:28:49 +03:00
Dmitry Rodionov
092a9b74d3 use only s3 in boto3-stubs and update mypy
Newer version of mypy fixes buggy error when trying to update only boto3 stubs.
However it brings new checks and starts to yell when we index into
cusror.fetchone without checking for None first. So this introduces a wrapper
to simplify quering for scalar values. I tried to use cursor_factory connection
argument but without success. There can be a better way to do that,
but this looks the simplest
2022-08-01 18:28:49 +03:00
Ankur Srivastava
e73b95a09d docs: linked poetry related step in tests section
Added the link to the dependencies which should be installed
before running the tests.
2022-08-01 18:13:01 +03:00
Alexander Bayandin
539007c173 github/workflows: make bash more strict (#2197) 2022-08-01 12:54:39 +01:00
Heikki Linnakangas
d0494c391a Remove wal_receiver mgmt API endpoint
Move all the fields that were returned by the wal_receiver endpoint into
timeline_detail. Internally, move those fields from the separate global
WAL_RECEIVERS hash into the LayeredTimeline struct. That way, all the
information about a timeline is kept in one place.

In the passing, I noted that the 'thread_id' field was removed from
WalReceiverEntry in commit e5cb727572, but it forgot to update
openapi_spec.yml. This commit removes that too.
2022-07-29 20:51:37 +03:00
Kirill Bulatov
2af5a96f0d Back off when reenqueueing delete tasks 2022-07-29 19:04:40 +03:00
Vadim Kharitonov
9733b24f4a Fix README.md: Fixed several typos and changed a bit documentation for
OSX
2022-07-29 19:03:57 +03:00
Heikki Linnakangas
d865892a06 Print full error with stacktrace, if compute node startup fails.
It failed in staging environment a few times, and all we got in the
logs was:

    ERROR could not start the compute node: failed to get basebackup@0/2D6194F8 from pageserver host=zenith-us-stage-ps-2.local port=6400
    giving control plane 30s to collect the error before shutdown

That's missing all the detail on *why* it failed.
2022-07-29 16:41:55 +03:00
Heikki Linnakangas
a0f76253f8 Bump Postgres version.
This brings in the inclusion of 'uuid-ossp' extension.
2022-07-29 16:30:39 +03:00
Heikki Linnakangas
02afa2762c Move Tenant- and TimelineInfo structs to models.rs.
They are part of the management API response structs. Let's try to
concentrate everything that's part of the API in models.rs.
2022-07-29 15:02:15 +03:00
Heikki Linnakangas
d903dd61bd Rename 'wal_producer_connstr' to 'wal_source_connstr'.
What the WAL receiver really connects to is the safekeeper. The
"producer" term is a bit misleading, as the safekeeper doesn't produce
the WAL, the compute node does.

This change also applies to the name of the field used in the mgmt API
in in the response of the
'/v1/tenant/:tenant_id/timeline/:timeline_id/wal_receiver' endpoint.
AFAICS that's not used anywhere else than one python test, so it
should be OK to change it.
2022-07-29 09:09:22 +03:00
Thang Pham
417d9e9db2 Add current physical size to tenant status endpoint (#2173)
Ref #1902
2022-07-28 13:59:20 -04:00
Alexander Bayandin
6ace347175 github/workflows: unpause stress env deployment (#2180)
This reverts commit 4446791397.
2022-07-28 18:37:21 +01:00
Alexander Bayandin
14a027cce5 Makefile: get openssl prefix dynamically (#2179) 2022-07-28 17:05:30 +01:00
Arthur Petukhovsky
09ddd34b2a Fix checkpoints race condition in safekeeper tests (#2175)
We should wait for WAL to arrive to pageserver before calling CHECKPOINT
2022-07-28 15:44:02 +03:00
Arthur Petukhovsky
aeb3f0ea07 Refactor test_race_conditions (#2162)
Do not use python multiprocessing, make the test async
2022-07-28 14:38:37 +03:00
Kirill Bulatov
58b04438f0 Tweak backoff numbers to avoid no wal connection threshold trigger 2022-07-27 22:16:40 +03:00
Alexey Kondratov
01f1f1c1bf Add OpenAPI spec for safekeeper HTTP API (neondatabase/cloud#1264, #2061)
This spec is used in the `cloud` repo to generate HTTP client.
2022-07-27 21:29:22 +03:00
Thang Pham
6a664629fa Add timeline physical size tracking (#2126)
Ref #1902.

- Track the layered timeline's `physical_size` using `pageserver_current_physical_size` metric when updating the layer map.
- Report the local timeline's `physical_size` in timeline GET APIs.
- Add `include-non-incremental-physical-size` URL flag to also report the local timeline's `physical_size_non_incremental` (similar to `logical_size_non_incremental`)
- Add a `UIntGaugeVec` and `UIntGauge` to represent `u64` prometheus metrics

Co-authored-by: Dmitry Rodionov <dmitry@neon.tech>
2022-07-27 12:36:46 -04:00
Sergey Melnikov
f6f29f58cd Switch production storage to dedicated etcd (#2169) 2022-07-27 16:41:25 +03:00
Sergey Melnikov
fd46e52e00 Switch staging storage to dedicated etcd (#2164) 2022-07-27 12:28:05 +03:00
Heikki Linnakangas
d6f12cff8e Make DatadirTimeline a trait, implemented by LayeredTimeline.
Previously DatadirTimeline was a separate struct, and there was a 1:1
relationship between each DatadirTimeline and LayeredTimeline. That was
a bit awkward; whenever you created a timeline, you also needed to create
the DatadirTimeline wrapper around it, and if you only had a reference
to the LayeredTimeline, you would need to look up the corresponding
DatadirTimeline struct through tenant_mgr::get_local_timeline_with_load().
There were a couple of calls like that from LayeredTimeline itself.

Refactor DatadirTimeline, so that it's a trait, and mark LayeredTimeline
as implementing that trait. That way, there's only one object,
LayeredTimeline, and you can call both Timeline and DatadirTimeline
functions on that. You can now also call DatadirTimeline functions from
LayeredTimeline itself.

I considered just moving all the functions from DatadirTimeline directly
to Timeline/LayeredTimeline, but I still like to have some separation.
Timeline provides a simple key-value API, and handles durably storing
key/value pairs, and branching. Whereas DatadirTimeline is stateless, and
provides an abstraction over the key-value store, to present an interface
with relations, databases, etc. Postgres concepts.

This simplified the logical size calculation fast-path for branch
creation, introduced in commit 28243d68e6. LayerTimeline can now
access the ancestor's logical size directly, so it doesn't need the
caller to pass it to it. I moved the fast-path to init_logical_size()
function itself. It now checks if the ancestor's last LSN is the same
as the branch point, i.e. if there haven't been any changes on the
ancestor after the branch, and copies the size from there. An
additional bonus is that the optimization will now work any time you
have a branch of another branch, with no changes from the ancestor,
not only at a create-branch command.
2022-07-27 10:26:21 +03:00
Konstantin Knizhnik
5a4394a8df Do not hold timelines lock while calling update_gc_info to avoid recusrive mutex lock and so deadlock (#2163) 2022-07-26 22:21:05 +03:00
Heikki Linnakangas
d301b8364c Move LayeredTimeline and related code to separate source file.
The layered_repository.rs file had grown to be very large. Split off
the LayeredTimeline struct and related code to a separate source file to
make it more manageable.

There are plans to move much of the code to track timelines from
tenant_mgr.rs to LayeredRepository. That will make layered_repository.rs
grow again, so now is a good time to split it.

There's a lot more cleanup to do, but this commit intentionally only
moves existing code and avoids doing anything else, for easier review.
2022-07-26 11:47:04 +03:00
Kirill Bulatov
172314155e Compact only once on psql checkpoint call 2022-07-26 11:37:16 +03:00
Konstantin Knizhnik
28243d68e6 Yet another apporach of copying logical timeline size during branch creation (#2139)
* Yet another apporach of copying logical timeline size during branch creation

* Fix unit tests

* Update pageserver/src/layered_repository.rs

Co-authored-by: Thang Pham <thang@neon.tech>

* Update pageserver/src/layered_repository.rs

Co-authored-by: Thang Pham <thang@neon.tech>

* Update pageserver/src/layered_repository.rs

Co-authored-by: Thang Pham <thang@neon.tech>

Co-authored-by: Thang Pham <thang@neon.tech>
2022-07-26 09:11:10 +03:00
Kirill Bulatov
45680f9a2d Drop CircleCI runs (#2082) 2022-07-25 18:30:30 +03:00
Dmitry Ivanov
5f4ccae5c5 [proxy] Add the password hack authentication flow (#2095)
[proxy] Add the `password hack` authentication flow

This lets us authenticate users which can use neither
SNI (due to old libpq) nor connection string `options`
(due to restrictions in other client libraries).

Note: `PasswordHack` will accept passwords which are not
encoded in base64 via the "password" field. The assumption
is that most user passwords will be valid utf-8 strings,
and the rest may still be passed via "password_".
2022-07-25 17:23:10 +03:00
Thang Pham
39c59b8df5 Fix flaky test_branch_creation_before_gc test (#2142) 2022-07-22 12:44:20 +01:00
Alexander Bayandin
9dcb9ca3da test/performance: ensure we don't have tables that we're creating (#2135) 2022-07-22 11:00:05 +01:00
Dmitry Rodionov
e308265e42 register tenants task thread pool threads in thread_mgr
needed to avoid this warning: is_shutdown_requested() called in an unexpected thread
2022-07-22 11:43:38 +03:00
Thang Pham
ed102f44d9 Reduce memory allocations for page server (#2010)
## Overview

This patch reduces the number of memory allocations when running the page server under a heavy write workload. This mostly helps improve the speed of WAL record ingestion. 

## Changes
- modified `DatadirModification` to allow reuse the struct's allocated memory after each modification
- modified `decode_wal_record` to allow passing a `DecodedWALRecord` reference. This helps reuse the struct in each `decode_wal_record` call
- added a reusable buffer for serializing object inside the `InMemoryLayer::put_value` function
- added a performance test simulating a heavy write workload for testing the changes in this patch

### Semi-related changes
- remove redundant serializations when calling `DeltaLayer::put_value` during `InMemoryLayer::write_to_disk` function call [1]
- removed the info span `info_span!("processing record", lsn = %lsn)` during each WAL ingestion [2]

## Notes
- [1]: in `InMemoryLayer::write_to_disk`, a deserialization is called
  ```
  let val = Value::des(&buf)?;
  delta_layer_writer.put_value(key, *lsn, val)?;
  ``` 
  `DeltaLayer::put_value` then creates a serialization based on the previous deserialization
  ```
  let off = self.blob_writer.write_blob(&Value::ser(&val)?)?;
  ```
- [2]: related: https://github.com/neondatabase/neon/issues/733
2022-07-21 12:08:26 -04:00
Konstantin Knizhnik
572ae74388 More precisely control size of inmem layer (#1927)
* More precisely control size of inmem layer

* Force recompaction of L0 layers if them contains large non-wallogged BLOBs to avoid too large layers

* Add modified version of test_hot_update test (test_dup_key.py) which should generate large layers without large number of tables

* Change test name in test_dup_key

* Add Layer::get_max_key_range function

* Add layer::key_iter method and implement new approach of splitting layers during compaction based on total size of all key values

* Add test_large_schema test for checking layer file size after compaction

* Make clippy happy

* Restore checking LSN distance threshold for checkpoint in-memory layer

* Optimize stoage keys iterator

* Update pageserver/src/layered_repository.rs

Co-authored-by: Heikki Linnakangas <heikki@zenith.tech>

* Update pageserver/src/layered_repository.rs

Co-authored-by: Heikki Linnakangas <heikki@zenith.tech>

* Update pageserver/src/layered_repository.rs

Co-authored-by: Heikki Linnakangas <heikki@zenith.tech>

* Update pageserver/src/layered_repository.rs

Co-authored-by: Heikki Linnakangas <heikki@zenith.tech>

* Update pageserver/src/layered_repository.rs

Co-authored-by: Heikki Linnakangas <heikki@zenith.tech>

* Fix code style

* Reduce number of tables in test_large_schema to make it fit in timeout with debug build

* Fix style of test_large_schema.py

* Fix handlng of duplicates layers

Co-authored-by: Heikki Linnakangas <heikki@zenith.tech>
2022-07-21 07:45:11 +03:00
Arthur Petukhovsky
b445cf7665 Refactor test_unavailability (#2134)
Now test_unavailability uses async instead of Process. The test is refactored to fix a possible race condition.
2022-07-20 22:13:05 +03:00
Kirill Bulatov
cc680dd81c Explicitly enable cachepot in Docker builds only 2022-07-20 17:09:36 +03:00
Heikki Linnakangas
f4233fde39 Silence "Module already imported" warning in python tests
We were getting a warning like this from the pg_regress tests:

    =================== warnings summary ===================
    /usr/lib/python3/dist-packages/_pytest/config/__init__.py:663
      /usr/lib/python3/dist-packages/_pytest/config/__init__.py:663: PytestAssertRewriteWarning: Module already imported so cannot be rewritten: fixtures.pg_stats
        self.import_plugin(import_spec)

    -- Docs: https://docs.pytest.org/en/stable/warnings.html
    ------------------ Benchmark results -------------------

To fix, reorder the imports in conftest.py. I'm not sure what exactly
the problem was or why the order matters, but the warning is gone and
that's good enough for me.
2022-07-20 16:55:41 +03:00
Heikki Linnakangas
b4c74c0ecd Clean up unnecessary dependencies.
Just to be tidy.
2022-07-20 16:31:25 +03:00
Heikki Linnakangas
abff15dd7c Fix test to be more robust with slow pageserver.
If the WAL arrives at the pageserver slowly, it's possible that the
branch is created before all the data on the parent branch have
arrived. That results in a failure:

    test_runner/batch_others/test_tenant_relocation.py:259: in test_tenant_relocation
        timeline_id_second, current_lsn_second = populate_branch(pg_second, create_table=False, expected_sum=1001000)
    test_runner/batch_others/test_tenant_relocation.py:133: in populate_branch
        assert cur.fetchone() == (expected_sum, )
    E   assert (500500,) == (1001000,)
    E     At index 0 diff: 500500 != 1001000
    E     Full diff:
    E     - (1001000,)
    E     + (500500,)

To fix, specify the LSN to branch at, so that the pageserver will wait
for it arrive.

See https://github.com/neondatabase/neon/issues/2063
2022-07-20 15:59:46 +03:00
Thang Pham
160e52ec7e Optimize branch creation (#2101)
Resolves #2054

**Context**: branch creation needs to wait for GC to acquire `gc_cs` lock, which prevents creating new timelines during GC. However, because individual timeline GC iteration also requires `compaction_cs` lock, branch creation may also need to wait for compactions of multiple timelines. This results in large latency when creating a new branch, which we advertised as *"instantly"*.

This PR optimizes the latency of branch creation by separating GC into two phases:
1. Collect GC data (branching points, cutoff LSNs, etc)
2. Perform GC for each timeline

The GC bottleneck comes from step 2, which must wait for compaction of multiple timelines. This PR modifies the branch creation and GC functions to allow GC to hold the GC lock only in step 1. As a result, branch creation doesn't need to wait for compaction to finish but only needs to wait for GC data collection step, which is fast.
2022-07-19 14:56:25 -04:00
Heikki Linnakangas
98dd2e4f52 Use zstd and multiple threads to compress artifact tarball.
For faster and better compression.
2022-07-19 21:31:34 +03:00
Heikki Linnakangas
71753dd947 Remove github CI 'build_postgres' job, merging it with 'build_neon'
Simplifies the workflow. Makes the overall build a little faster, as
the build_postgres step doesn't need to upload the pg.tgz artifact,
and the build_neon step doesn't need to download it again.

This effectively reverts commit a490f64a68. That commit changed the
workflow so that the Postgres binaries were not included in the
neon.tgz artifact. With this commit, the pg.tgz artifact is gone, and
the Postgres binaries are part of neon.tgz again.
2022-07-19 21:31:22 +03:00
Alexander Bayandin
4446791397 github/workflows: pause stress env deployment (#2122) 2022-07-19 17:40:58 +01:00
Alexander Bayandin
5ff7a7dd8b github/workflows: run periodic benchmarks earlier (#2121) 2022-07-19 16:33:33 +01:00
Heikki Linnakangas
3dce394197 Use the same cargo options for every cargo call.
The "cargo metadata" and "cargo test --no-run" are used in the workflow
to just list names of the final binaries, but unless the same cargo
options like --release or --debug are used in those calls, they will in
fact recompile everything.
2022-07-19 16:36:59 +03:00
Heikki Linnakangas
df7f644822 Move things around in github yml file, for clarity.
Also, this avoids building the list of test binaries in release mode.
They are not included in the neon.tgz tarball in release mode.
2022-07-19 16:36:59 +03:00
Arthur Petukhovsky
bf5333544f Fix missing quotes in GitHub Actions (#2116) 2022-07-19 10:57:24 +03:00
Heikki Linnakangas
0b8049c283 Update core_changes.md, describing Postgres changes.
I went through "git diff REL_14_2" and updated the doc to list all the
changes, categorized into what I think could form a logical set of
patches.
2022-07-19 09:53:12 +03:00
Heikki Linnakangas
f384e20d78 Minor cleanup in layer_repository.rs. 2022-07-19 07:50:55 +03:00
Heikki Linnakangas
0b14fdb078 Reorganize, expand, improve internal documentation
Reorganize existing READMEs and other documentation files into mdbook
format. The resulting Table of Contents is a mix of placeholders for
docs that we should write, and documentation files that we already had,
dropped into the most appropriate place.

Update the Pageserver overview diagram. Add sections on thread
management and WAL redo processes.

Add all the RFCs to the mdbook Table of Content too.

Per github issue #1979
2022-07-18 17:39:12 +03:00
Arseny Sher
a69fdb0e8e Fix commit_lsn monotonicity violation.
On ProposerElected message receival WAL is truncated at streaming point; this
code expected that, once vote is given for the proposer / term switch happened,
flush_lsn can be advanced only by this proposer (or higher one). However, that
didn't take into account possibility of accumulating written WAL and flushing it
after vote is given -- flushing goes without term checks. Which eventually led
to the violation in question.

ref #2048
2022-07-18 15:15:51 +03:00
Arseny Sher
eeff56aeb7 Make get_dir_size robust to concurrent deletions.
ref #2055
2022-07-18 15:13:10 +03:00
Dmitry Rodionov
7987889cb3 keep successfully downloaded index parts 2022-07-18 12:27:04 +03:00
Dmitry Rodionov
912a08317b do not ignore errors during downloading of tenant index parts 2022-07-18 12:27:04 +03:00
Kirill Bulatov
c4b2347e21 Use less restricrtive lock guard during storage sync 2022-07-17 12:49:18 +03:00
dependabot[bot]
373bc59ebe Bump pywin32 from 227 to 301 (#2102) 2022-07-16 16:05:12 +01:00
Egor Suvorov
94003e1ebc postgres_ffi: test restoring from intermediate LSNs by wal_craft 2022-07-15 19:06:50 +03:00
Egor Suvorov
19ea486cde postgres_ffi/xlog_utils: refactor find_end_of_wal test
* Deduce `last_segment` automatically
* Get rid of local `wal_dir`/`wal_seg_size` variables
* Prepare to test parsing of WAL from multiple specific points, not just the start;
  extract `check_end_of_wal` function to check both partial and non-partial WAL segments.
2022-07-15 19:06:50 +03:00
Alexander Bayandin
95c40334b8 github/workflows: post periodic benchmark failures to slack (#2105) 2022-07-15 15:39:49 +01:00
Sergey Melnikov
a68d5a0173 Run workflow on release branch (#2085) 2022-07-15 13:18:55 +02:00
Alexey Kondratov
c690522870 [compute_tools] Change owner of the schema public only once (#2058)
Otherwise, we will change it back to the db owner on each restart. Even
if user already changed schema owner to some other user.
2022-07-15 12:25:07 +02:00
Heikki Linnakangas
eaa550afcc Reduce size of cargo deps cache, by excluding ~/.cargo/registry/src. 2022-07-15 13:18:48 +03:00
Heikki Linnakangas
a490f64a68 Don't include Postgres binaries in neon.tgz
neon.tgz artifact in the github workflow included the contents of
'tmp_install', but that seems pointless, because the same files are
included earlier already in the pg.tgz artifact.
2022-07-15 12:33:13 +03:00
Thang Pham
fe65d1df74 reduce concurrent tasks in test_branching_with_pgbench.py
- add thread limit
- run `pgbench` with 1 client
2022-07-15 12:30:09 +03:00
Heikki Linnakangas
c68336a246 Strip debug symbols from test binaries, to make the artifact smaller.
Uploading large artifacts is slow in github actions. To speed that up,
make the artifact smaller.

The code coverage tool doesn't require debug symbols, so remove them.

We've discussed doing the same for *all* binaries, but it's nice to
have debugging symbols for debugging purposes, and so that you get
more complete stack traces. The discussion is ongoing, but let's at
least do this for the test symbols now.
2022-07-14 23:08:57 +03:00
Heikki Linnakangas
0886aced86 Update dependencies.
- Updated dependencies with "cargo update"
- Updated workspace_hack with "cargo hakari generate"

There's no particular reason to do this now, just a periodic refresh.
2022-07-14 22:13:51 +03:00
Heikki Linnakangas
a342957aee Use ok_or_else() instead of ok_or(), to silence clippy warnings.
"cargo clippy" started to complain about these, after running "cargo
update". Not sure why it didn't complain before, but seems reasonable to
fix these. (The "cargo update" is not included in this commit)
2022-07-14 22:13:51 +03:00
Heikki Linnakangas
79f5685d00 Enable basic optimizations even in 'dev' builds.
Change the build options to enable basic optimizations even in debug
mode, and always build dependencies with more optimizations. That
makes the debug-mode binaries somewhat faster, without messing up
stack traces and line-by-line debugging too much.
2022-07-14 20:46:35 +03:00
Egor Suvorov
c004a6d62f Do not cancel in-progress checks on the main branch
See https://docs.github.com/en/actions/using-workflows/workflow-syntax-for-github-actions#concurrency

* Previously there was a single concurrency group per each branch.
  As the `main` branch got pushed into frequently, very few commits got
  tested to the end. It resulted in "broken" `main` branch as there were
  no fully successful workflow runs.
  Now the `main` branch gets a separate concurrency group for each commit.
* As GitHub Actions syntax does not have the conditional operator, it is
  emulated via logical and/or operations. Although undocumented, they
  return one of their operands instead of plain true/false.
* Replace 3-space indentation with 2-space indentation while we are here
  to be consistent with the rest of the file.
2022-07-14 17:20:00 +03:00
Egor Suvorov
1b6a80a38f Fix flaky test_concurrent_computes
* Wait for all computes (except one) to complete before proceeding with
  the single compute.
* It previously waited for too few seconds. As the test is randomized, it was
  not failing all the time, but only in specific unlucky cases.
  E.g. when there were no successfuly queries by concurrent computes,
  and the single node had big timeouts and spent lots of time making the
  transaction.
  See https://github.com/neondatabase/neon/runs/7234456482?check_suite_focus=true
  (around line 980).
* Wait for exactly one extra transaction by the single compute.
2022-07-14 16:23:39 +03:00
Alexey Kondratov
12bac9c12b Wait for compute image before deploy in GitHub Action
We need both storage **and** compute images for deploy, because control plane
picks the compute version based on the storage version. If it notices a fresh
storage it may bump the compute version. And if compute image failed to build
it may break things badly.
2022-07-14 11:27:16 +02:00
Kirill Bulatov
9a7427c203 Fill build-args for Docker builds via GH Actions context 2022-07-14 10:28:15 +03:00
Arthur Petukhovsky
968c20ca5f Add zenith-1-ps-3 to prod inventory (#2084) 2022-07-13 21:22:44 +03:00
Alexey Kondratov
f8a64512df [compute_tools] Set public schema owner to db owner (#2058)
Otherwise, it does not have a control on it, which is reasonable thing
to have and some users already hit it.
2022-07-13 15:38:22 +02:00
Alexander Bayandin
07acd6ddde Fix clippy warnings in postgres_ffi/build.rs (#2081) 2022-07-13 14:12:11 +01:00
Sergey Melnikov
2b21d7b5bc Migrate from CircleCI to Github Actions: docker build and deploy (#1986) 2022-07-13 12:51:20 +03:00
Alexander Bayandin
61cc562822 Make POSTGRES_INSTALL_DIR configurable for build (#2067) 2022-07-13 09:18:11 +01:00
dhammika
7c041d9939 Add a test for gc dropping active layers (#707) (#1484)
This PR adds `test_branch_and_gc` test that reproduces https://github.com/neondatabase/neon/issues/707. It tests GC when running with branching.

Co-authored-by: Thang Pham <thang@neon.tech>
2022-07-12 15:53:22 -04:00
Thang Pham
7f048abf3b Add close_fds for initdb command and add close fd test (#2060)
This PR adds a test for https://github.com/neondatabase/neon/pull/1834 and fixes the error in https://app.circleci.com/pipelines/github/neondatabase/neon/7753/workflows/94d1b796-10a3-4989-b23c-4c1eb4a49cf5/jobs/79586, which happens because `pageserver.pid` is held by `initdb` command on restart.

Because the test requires `lsof` to be installed in the docker image, this PR also updates the caches and docker image specified in CircleCI config file.
2022-07-12 15:04:40 -04:00
Konstantin Knizhnik
5cf94a5848 Add test for cascade/flat branching (#1569) 2022-07-12 15:01:44 -04:00
bojanserafimov
5cf597044d Allow prev_lsn hint for fullbackup (#2052) 2022-07-11 10:31:14 -04:00
Heikki Linnakangas
95452e605a Optimize importing a physical backup
Before this patch, importing a physical backup followed the same path
as ingesting any WAL records:

1. All the data pages from the backup are first collected in the
   DatadirModification object.
2. Then, they are "committed" to the Repository. They are written to
   the in-memory layer
3. Finally, the in-memory layer is frozen, and flushed to disk as a
   L0 delta layer file.

This was pretty inefficient. In step 1, the whole physical backup was
held in memory. If the backup is large, you simply run out of
memory. And in step 3, the resulting L0 delta layer file is large,
holding all the data again. That's a problem if the backup is larger
than 5 GB: Amazon S3 doesn't allow uploading files larger than 5 GB
(without using multi-part upload, see github issue #1910). So we want
to avoid that.

To alleviate those problems, optimize the codepath for importing a
physical backup. The basic flow is the same as before, but step 1
is optimized so that it doesn't accumulate all the data in memory,
and step 3 writes the data in image layers instead of one large delta
layer.
2022-07-11 17:03:58 +03:00
Dmitry Rodionov
21da9199fa take Value by reference to avoid calling .clone 2022-07-11 17:03:58 +03:00
Dmitry Rodionov
39d86ed29e debug branch failure 2022-07-09 00:42:45 +03:00
Egor Suvorov
f540f115a3 postgres_ffi/wal_craft: simplify API 2022-07-08 18:30:56 +02:00
Egor Suvorov
0b5b2e8e0b postgres_ffi/wal_craft: extract trait Crafter
Make the intent of the code clearer.
2022-07-08 18:30:56 +02:00
Egor Suvorov
60e5dc10e6 postgres_ffi/wal_generate: use 'craft' instead of 'generate'
It does very fine-tuned byte-to-byte WAL crafting, not a sloppy generation.
Hence 'craft' sounds like a better description.
2022-07-08 18:30:56 +02:00
Thang Pham
1f5918b36d Delay calculating the starting LSN when doing timeline branching (#2053)
Previously, upon branching, if no starting LSN is specified, we
determine the start LSN based on the source timeline's last record LSN
in `timelines::create_timeline` function, which then calls `Repository::branch_timeline`
to create the timeline.

Inside the `LayeredRepository::branch_timeline` function, to start branching,
we try to acquire a GC lock to prevent GC from removing data needed
for the new timeline. However, a GC iteration takes time, so the GC lock 
can be held for a long period of time. As a result, the previously determined 
starting LSN can become invalid because of GC.

This PR fixes the above issue by delaying the LSN calculation part and moving it to be 
inside `LayeredRepository::branch_timeline` function.
2022-07-08 10:29:29 -04:00
Egor Suvorov
80b7a3b51a Test what happens when XLOG_SWITCH ends on page boundary, fix #1991 2022-07-08 15:37:26 +02:00
Egor Suvorov
85bda437de postgres_ffi/wal_generate: add last_wal_record_xlog_switch and use it in tests
Fix #1190: WalDecoder did not return correct LSN of the next record after
processing a XLOG_SWITCH record
2022-07-08 15:37:26 +02:00
MMeent
52f445094a Update vendor/postgres to 14.4 (#2049)
Co-authored-by: Matthias van de Meent <matthias@neon.tech>
2022-07-08 14:51:44 +02:00
Egor Suvorov
bcdee3d3b5 test_runner: add test_crafted_wal_end.py
For some reason both non-`simple` tests spend about 10 seconds
in the post-restart `INSERT INTO` query on my machine, see #2023
2022-07-08 13:56:37 +02:00
Egor Suvorov
c08fa9d562 postgres_ffi/wal_generate: support generating WAL for an already running Postgres server
* ensure_server_config() function is added to ensure the server does not have background processes
  which intervene with WAL generation
* Rework command line syntax
* Add `print-postgres-config` subcommand which prints the required server configuration
2022-07-08 13:56:37 +02:00
Alexander Bayandin
00c26ff3a3 Bring periodic perf tests on GitHub back (#2037)
* test/fixtures: fix DeprecationWarning
* workflows/benchmarking: increase timeout
* test: switch pgbench to default(simple) query mode
* test/performance: ensure we don't have tables that we're creating
* workflows/pg_clients: remove unused env var
* workflows/benchmarking: change platform name
2022-07-07 19:53:23 +01:00
Dmitry Rodionov
ec0faf3ac6 retry timeline delete 2022-07-07 21:20:04 +03:00
Dmitry Rodionov
1a5af6d7a5 extend detach/delete tests 2022-07-07 21:20:04 +03:00
Dmitry Rodionov
520ffb341b fix pageserver openapi spec 2022-07-07 21:20:04 +03:00
Dmitry Rodionov
9f2b40645d review cleanup, point timeline/detach to timeline/delete 2022-07-07 21:20:04 +03:00
Dmitry Rodionov
168214e0b6 use tenant status endpoint to check whether timelines were downloaded or not 2022-07-07 21:20:04 +03:00
Dmitry Rodionov
d9d4ef12c3 review cleanup 2022-07-07 21:20:04 +03:00
Dmitry Rodionov
e1e24336b7 review adjustments, bring back timeline_detach and rename it to timeline_delete 2022-07-07 21:20:04 +03:00
Dmitry Rodionov
4c54e4b37d switch to per-tenant attach/detach
download operations of all timelines for one tenant are now grouped
together so when attach is invoked pageserver downloads all of them
and registers them in a single apply_sync_status_update call so
branches can be used safely with attach/detach
2022-07-07 21:20:04 +03:00
Andrey Taranik
ae116ff0a9 update timeout for proxy deploy (#2047) 2022-07-07 18:09:57 +03:00
Heikki Linnakangas
e6ea049165 If an error happens during import of base backup or WAL, log it.
We only sent the error to the client, with no trace in the pageserver log.
Log it, similar to how we log errors in GetPage@LSN requests.
2022-07-07 16:05:13 +03:00
Alexey Kondratov
747d009bb4 Fix panic while waiting for Postgres readiness in the compute_ctl (#2021)
We were reading Postgres pid file and looking for the 'ready' status,
but it could be empty or we could not read it. So add all the checks.
2022-07-07 11:56:58 +02:00
Alexander Bayandin
cb5df3c627 github/actions: set missing VIP_VAP_ACCESS_TOKEN (#2045) 2022-07-07 10:47:03 +01:00
Heikki Linnakangas
0e3456351f Shrink thread pools used for WAL receivers and background tasks.
I noticed that the pageserver has a very large virtual memory size,
several GB, even though it doesn't actually use that much
memory. That's not much of a problem normally, but I hit it because I
wanted to run tests with a limited virtual memory size, by calling
setrlimit(RLIMIT_AS), but the highest limit you can set is 2 GB. I was
not able to start pageserver with a limit of 2 GB.

On Linux, each thread allocates 32 MB of virtual memory. I read this
on some random forum on the Internet, but unfortunately could not find
the source again now. Empirically, reducing the number of threads clearly
helps to bring down the virtual memory size.

Aside from the virtual memory usage, it seems excessive to launch 40
threads in both of those thread pools. The tokio default is to have as
many worker threads as there are CPU cores in the system. That seems
like a fine heuristic for us, too, so remove the explicit setting of
the pool size and rely on the default. Note that the GC and compaction
tasks are actually run with tokio spawn_blocking, so the threads that
are actually doing the work, and possibly waiting on I/O, are not
consuming threads from the thread pool. The WAL receiver work is done
in the tokio worker threads, but the WAL receivers are more CPU bound
so that seems OK.

Also remove the explicit maxinum on blocking tasks. I'm not sure what
the right value for that would be, or whether the value we set (100)
would be better than the tokio default (512). Since the value was
arbitrary, let's just rely on the tokio default for that, too.
2022-07-06 22:36:38 +03:00
Alexander Bayandin
1faf49da0f github/actions: set PERF_TEST_RESULT_CONNSTR from secrets (#2040) 2022-07-06 19:24:06 +01:00
bojanserafimov
4a96259bdd Add export/import test (#2036) 2022-07-06 13:45:26 -04:00
bojanserafimov
242af75653 Fix signal file parsing (#2042) 2022-07-06 13:45:02 -04:00
Arthur Petukhovsky
8fabdc6708 Add tests with concurrent computes.
Removes test_restart_compute, as added test_compute_restarts is stronger.
2022-07-06 18:07:29 +04:00
Alexander Bayandin
07df7c2edd github/actions: fix storing perf data for main (#2038) 2022-07-06 13:15:15 +01:00
Kirill Bulatov
50821c0a3c Return download stream directly from the remote storage API 2022-07-05 21:45:15 +03:00
Andrey Taranik
68adfe0fc8 inventory file fix for neon-stress env 2022-07-05 21:29:03 +04:00
Dmitry Rodionov
cfdf79aceb harden create_empty_timeline
Reorder checks so it checks whether the timeline exists
before writing something to disk, possibly replacing valid content
2022-07-05 16:44:18 +03:00
bojanserafimov
32560e75d2 Enable relocation test (#1974) 2022-07-05 08:27:57 -04:00
Heikki Linnakangas
bb69e0920c Do not overwrite an existing image layer.
See github issues #1594 and #1690

Co-authored-by: Konstantin Knizhnik <knizhnik@neon.tech>
2022-07-05 14:45:31 +03:00
Alexander Bayandin
05f6a1394d Add tests for different Postgres client libraries (#2008)
* Add tests for different postgres clients
* test/fixtures: sanitize test name for test_output_dir
* test/fixtures: do not look for etcd before runtime
* Add workflow for testing Postgres client libraries
2022-07-05 12:22:58 +01:00
Heikki Linnakangas
844832ffe4 Bump vendor/postgres
Contains changes from two PRs in vendor/postgres:
- https://github.com/neondatabase/postgres/pull/163
- https://github.com/neondatabase/postgres/pull/176
2022-07-05 10:55:03 +03:00
bojanserafimov
d29c545b5d Gc/compaction thread pool, take 2 (#1933)
Decrease the number of pageserver threads by running gc and compaction in a blocking tokio thread pool
2022-07-05 02:06:40 -04:00
Kirill Bulatov
6abdb12724 Fix 1.62 Clippy errors 2022-07-04 23:46:37 +03:00
Alexander Bayandin
7898e72990 Remove duplicated checks from LocalEnv 2022-07-04 22:35:00 +03:00
Dmitry Rodionov
65704708fa remove unused imports, make more use of pathlib.Path 2022-07-01 18:56:51 +03:00
Arseny Sher
6100a02d0f Prefix WAL files in s3 with environment name.
It wasn't merged to prod yet, so safe to enable.
2022-07-01 19:21:28 +04:00
Arseny Sher
97fed38213 Fix cadaca010c for older ssh clients. 2022-07-01 19:20:59 +04:00
Arseny Sher
cadaca010c Make ansible to work with storage nodes through teleport from local box. 2022-07-01 16:58:34 +03:00
Bojan Serafimov
f09c09438a Fix gc after import 2022-07-01 11:10:49 +03:00
Dmitry Rodionov
00fc696606 replace extra urlencode dependency with already present url library 2022-06-30 14:32:15 +03:00
Kirill Bulatov
1d0706cf25 Fix walreceiver connection selection mechanism
* Avoid reconnecting to safekeeper immediately after its failure by limiting candidates to those with fewest connection attempts. Thus we don't have to wait lagging_wal_timeout (10s by default) before switch happens even if no new changes are generated, and current test_restarts_under_load expects some commits to happen within 4s.
* Make default max_lsn_wal_lag larger, otherwise we constant reconnections happen during normal work.
* Fix wal_connection_attempts maintanance, preventing busy loop of reconnections.
2022-06-30 00:40:12 +03:00
Dmitry Ivanov
5ee19b0758 Fix bloated coverage uploads (#2005)
Move coverage data to a better directory, merge it better and don't publish it from CircleCI pipeline
2022-06-29 17:59:19 +03:00
Kirill Bulatov
cef90d9220 Disable cachepot for GH Actions builds (#2007) 2022-06-29 17:56:02 +03:00
Kirill Bulatov
4a05413a4c More code coverage fixes in GH Actions (#2002) 2022-06-27 22:40:20 +03:00
Kirill Bulatov
dd61f3558f Fix coverage upload credentials retrieval (#2001) 2022-06-27 20:41:09 +03:00
Kirill Bulatov
8a714f1ebf Add coverage to GH actions and rework part of them (#1987) 2022-06-27 19:15:56 +03:00
Arseny Sher
137291dc24 Push to etcd from safekeeper many timelines concurrently.
Mitigates latency fee, making push throughput 1-1.5 order of magnitude bigger.

Also make leases per timeline, not per whole safekeeper, avoiding storing
garbage in etcd for deleted timelines while safekeeper is alive.
2022-06-27 16:30:21 +03:00
Kirill Bulatov
eb8926083e Use the updated base build Docker image (#1972) 2022-06-27 13:12:58 +03:00
Johan Eliasson
26bca6ddba Add openssl to OSX dependencies (#1994) 2022-06-26 21:54:07 +03:00
Arthur Petukhovsky
55192384c3 Fix zero timeline_start_lsn (#1981)
* Fix zero timeline_start_lsn

* Log more info on control file upgrade

* Fix formatting

Co-authored-by: Anastasia Lubennikova <anastasia@neon.tech>
2022-06-24 13:59:37 +03:00
KlimentSerafimov
392cd8b1fc Refactored extracting project_name in console.rs. (#1982) 2022-06-24 05:57:33 -04:00
Alexey Kondratov
3cc531d093 Fix CREATE EXTENSION for non-db-owner users (#1408)
Previously, we were granting create only to db owner, but now we have a
dedicated 'web_access' role to connect via web UI and proxy link auth.

We anyway grant read / write all data to all roles, so let's grant
create to everyone too. This creates some provelege objects in each db,
which we need to drop before deleting the role. So now we reassign all
owned objects to each db owner before deletion. This also fixes deletion
of roles that created some data in any db previously. Will be tested by
https://github.com/neondatabase/cloud/pull/1673

Later we should stop messing with Postgres ACL that much.
2022-06-23 21:36:53 +02:00
bojanserafimov
84b9fcbbd5 Increase a few test timeouts (#1977) 2022-06-23 11:51:56 -04:00
Bojan Serafimov
93e050afe3 Don't require project name for link auth 2022-06-23 15:38:05 +03:00
Anastasia Lubennikova
6d7dc384a5 Add zenith-us-stage-ps-3 to deploy 2022-06-23 14:52:32 +03:00
Anastasia Lubennikova
3c2b03cd87 Update timeline size on dropdb. Add the test (#1973)
In addition, fix database size calculation:
count not only main fork of the relation, but also vm and fsm.
2022-06-23 12:28:12 +03:00
Kirill Bulatov
7c49abe7d1 Rework etcd timeline updates and their handling 2022-06-23 09:11:27 +03:00
KlimentSerafimov
d059e588a6 Added invariant check for project name. (#1921)
Summary: Added invariant checking for project name. Refactored ClientCredentials and TlsConfig.

* Added formatting invariant check for project name:
**\forall c \in project_name . c \in [alnum] U {'-'}. 
** sni_data == <project_name>.<common_name>
* Added exhaustive tests for get_project_name.
* Refactored TlsConfig to contain common_name : Option<String>.
* Refactored ClientCredentials construction to construct project_name directly.
* Merged ProjectNameError into ClientCredsParseError.
* Tweaked proxy tests to accommodate refactored ClientCredentials construction semantics. 
* [Pytests] Added project option argument to test_proxy_select_1.
* Removed project param from Api since now it's contained in creds.
* Refactored &Option<String> -> Option<&str>.

Co-authored-by: Dmitrii Ivanov <dima@neon.tech>.
2022-06-22 09:34:24 -04:00
Sergey Melnikov
6222a0012b Migrate from CircleCI to Github Actions: python codestyle, build and tests (#1647)
Duplicate postgres and neon build and test jobs from CircleCI to Github actions.
2022-06-22 11:40:59 +03:00
bojanserafimov
1ca28e6f3c Import basebackup into pageserver (#1925)
Allow importing basebackup taken from vanilla postgres or another pageserver via psql copy in protocol.
2022-06-21 11:04:10 -04:00
Arthur Petukhovsky
6c4d6a2183 Remove timeline_start_lsn check temporary. (#1964) 2022-06-21 02:02:24 +03:00
Thang Pham
37465dafe3 Add wal backpressure tests (#1919)
Resolves #1889.

This PR adds new tests to measure the WAL backpressure's performance under different workloads.

## Changes
- add new performance tests in `test_wal_backpressure.py`
- allow safekeeper's fsync to be configurable when running tests
2022-06-20 11:40:55 -04:00
Joshua D. Drake
ec0064c442 Small README.md changes (#1957)
* Update make instructions for release and debug build. Update PostgreSQL glossary to proper version (14)

* Continued cleanup of build instructions including removal of redundancies
2022-06-20 10:05:10 -04:00
Heikki Linnakangas
83c7e6ce52 Bump vendor/postgres.
This brings in the change to not use a shared memory in the WAL redo
process, to avoid running out of sysv shmem segments in the page server.

Also, removal of callmemaybe bits.
2022-06-20 15:28:43 +03:00
Arthur Petukhovsky
f862373ac0 Fix WAL timeout in test_s3_wal_replay (#1953) 2022-06-17 20:43:54 +03:00
Arthur Petukhovsky
699f46cd84 Download WAL from S3 if it's not available in safekeeper dir (#1932)
`send_wal.rs` and `WalReader` are now async. `test_s3_wal_replay` checks that WAL can be replayed after offloaded.
2022-06-17 15:33:39 +03:00
Anastasia Lubennikova
36ee182d26 Implement page servise 'fullbackup' endpoint (#1923)
* Implement page servise 'fullbackup' endpoint that works like basebackup, but also sends relational files

* Add test_runner/batch_others/test_fullbackup.py

Co-authored-by: bojanserafimov <bojan.serafimov7@gmail.com>
2022-06-16 14:07:11 +03:00
Anastasia Lubennikova
d11c9f9fcb Use random ports for the proxy and local pg in tests
Fixes #1931
Author: Dmitry Ivanov
2022-06-15 20:21:58 +03:00
Kirill Bulatov
d8a37452c8 Rename ZenithFeedback (#1912) 2022-06-11 00:44:05 +03:00
chaitanya sharma
e1336f451d renamed .zenith data-dir to .neon. 2022-06-09 18:19:18 +02:00
Arseny Sher
a4d8261390 Save Postgres log in test_find_end_of_wal_* tests. 2022-06-09 19:16:43 +04:00
Egor Suvorov
e2a5a31595 Safekeeper HTTP router: add comment about /v1/timeline 2022-06-09 17:14:46 +02:00
Egor Suvorov
0ac0fba77a test_runner: test Safekeeper HTTP API Auth
All endpoints except for POST /v1/timeline are tested, this one is not tested in any way yet.
Three attempts for each endpoint: correctly authenticated, badly authenticated, unauthenticated.
2022-06-09 17:14:46 +02:00
Egor Suvorov
a001052cdd test_runner: SafekeeperHttpClient: support auth 2022-06-09 17:14:46 +02:00
Egor Suvorov
1f1d852204 ZenithEnvBuilder: rename pageserver_auth_enabled --> auth_enabled 2022-06-09 17:14:46 +02:00
Egor Suvorov
f7b878611a Implement JWT authentication in Safekeeper HTTP API (#1753)
* `control_plane` crate (used by `neon_local`) now parses an `auth_enabled` bool for each Safekeeper
* If auth is enabled, a Safekeeper is passed a path to a public key via a new command line argument
* Added TODO comments to other places needing auth
2022-06-09 17:14:46 +02:00
Arseny Sher
a51b2dac9a Don't s3 offload from newly joined safekeeper not having required WAL.
I made the check at launcher level with the perspective of generally moving
election (decision who offloads) there.

Also log timeline 'active' changes.
2022-06-09 18:30:16 +04:00
Thang Pham
e22d9cee3a fix ZeroDivisionError in scripts/generate_perf_report_page (#1906)
Fixes the `ZeroDivisionError` error by adding `EPS=1e-6` when doing the calculation.
2022-06-08 09:15:12 -04:00
Arthur Petukhovsky
a01999bc4a Replace most common remote logs with metrics (#1909) 2022-06-08 13:36:49 +03:00
chaitanya sharma
32e64afd54 Use better parallel build instructions in readme.md (#1908) 2022-06-08 11:25:37 +03:00
Kirill Bulatov
8a53472e4f Force etcd broker keys to not to intersect 2022-06-08 11:21:05 +03:00
Dmitry Rodionov
6e26588d17 Allow to customize shutdown condition in PostgresBackend
Use it in PageServerHandler to check per thread shutdown condition
from thread_mgr which takes into account tenants and timelines
2022-06-07 22:11:54 +03:00
Arseny Sher
0b93253b3c Fix leaked keepalive task in s3 offloading leader election.
I still don't like the surroundings and feel we'd better get away without using
election API at all, but this is a quick fix to keep CI green.

ref #1815
2022-06-07 15:17:57 +04:00
Dmitry Rodionov
7dc6beacbd make it possible to associate thread with a tenant after thread start 2022-06-07 12:59:35 +03:00
Thang Pham
6cfebc096f Add read/write throughput performance tests (#1883)
Part of #1467 

This PR adds several performance tests that compare the [PG statistics](https://www.postgresql.org/docs/current/monitoring-stats.html) obtained when running PG benchmarks against Neon and vanilla PG to measure the read/write throughput of the DB.
2022-06-06 12:32:10 -04:00
KlimentSerafimov
fecad1ca34 Resolving issue #1745. Added cluster option for SNI data (#1813)
* Added project option in case SNI data is missing. Resolving issue #1745.

* Added invariant checking for project name: if both sni_data and project_name are available then they should match.
2022-06-06 08:14:41 -04:00
bojanserafimov
92de8423af Remove dead code (#1886) 2022-06-05 09:18:11 -04:00
Dmitry Rodionov
e442f5357b unify two identical failpoints in flush_frozen_layer
probably is a merge artfact
2022-06-03 19:36:09 +03:00
Arseny Sher
5a723d44cd Parametrize test_normal_work.
I like to run small test locally, but let's avoid duplication.
2022-06-03 20:32:53 +04:00
Kirill Bulatov
2623193876 Remove pageserver_connstr from WAL stream logic 2022-06-03 17:30:36 +03:00
Arseny Sher
70a53c4b03 Get backup test_safekeeper_normal_work, but skip by default.
It is handy for development.
2022-06-03 16:12:14 +04:00
Arseny Sher
9e108102b3 Silence etcd safekeeper info key parse errors.
When we subscribe to everything, it is ok to receive not only safekeeper
timeline updates.
2022-06-03 16:12:14 +04:00
huming
9c846a93e8 chore(doc) 2022-06-03 14:24:27 +03:00
Kirill Bulatov
c5007d3916 Remove unused module 2022-06-03 00:23:13 +03:00
Kirill Bulatov
5b06599770 Simplify etcd key regex parsing 2022-06-03 00:23:13 +03:00
Kirill Bulatov
1d16ee92d4 Fix the Lsn difference reconnection 2022-06-03 00:23:13 +03:00
Kirill Bulatov
7933804284 Fix and test regex parsing 2022-06-03 00:23:13 +03:00
Kirill Bulatov
a91e0c299d Reproduce etcd parsing bug in Python tests 2022-06-03 00:23:13 +03:00
Kirill Bulatov
b0c4ec0594 Log storage sync and etcd events a bit better 2022-06-03 00:23:13 +03:00
bojanserafimov
90e2c9ee1f Rename zenith to neon in python tests (#1871) 2022-06-02 16:21:28 -04:00
Egor Suvorov
aba5e5f8b5 GitHub Actions: pin Rust version to 1.58 like on CircleCI
* Fix failing `cargo clippy` while we're here.
  The behavior has been changed in Rust 1.60: https://github.com/rust-lang/rust-clippy/issues/8928
* Add Rust version to the Cargo deps cache key
2022-06-02 17:45:53 +02:00
Dmitry Rodionov
b155fe0e2f avoid perf test result context for pg regress 2022-06-02 17:41:34 +03:00
Ryan Russell
c71faae2c6 Docs readability cont
Signed-off-by: Ryan Russell <git@ryanrussell.org>
2022-06-02 15:05:12 +02:00
Kirill Bulatov
de7eda2dc6 Fix url path printing 2022-06-02 00:48:10 +03:00
Dmitry Rodionov
1188c9a95c remove extra span as this code is already covered by create timeline span
E g this log line contains duplicated data:
INFO /timeline_create{tenant=8d367870988250a755101b5189bbbc17
  new_timeline=Some(27e2580f51f5660642d8ce124e9ee4ac) lsn=None}:
  bootstrapping{timeline=27e2580f51f5660642d8ce124e9ee4ac
  tenant=8d367870988250a755101b5189bbbc17}:
  created root timeline 27e2580f51f5660642d8ce124e9ee4ac
  timeline.lsn 0/16960E8

this avoids variable duplication in `bootstrapping` subspan
2022-06-01 19:29:17 +03:00
Kirill Bulatov
e5cb727572 Replace callmemaybe with etcd subscriptions on safekeeper timeline info 2022-06-01 16:07:04 +03:00
Dmitry Rodionov
6623c5b9d5 add installation instructions for Fedora Linux 2022-06-01 15:59:53 +03:00
Anton Chaporgin
e5a2b0372d remove sk1 from inventory (#1845)
https://github.com/neondatabase/cloud/issues/1454
2022-06-01 15:40:45 +03:00
Alexey Kondratov
af6143ea1f Install missing openssl packages in the Github Actions workflow 2022-05-31 23:12:30 +03:00
Alexey Kondratov
ff233cf4c2 Use :local compute-tools tag to build compute-node image 2022-05-31 23:12:30 +03:00
Dmitry Rodionov
b1b67cc5a0 improve test normal work to start several computes 2022-05-31 22:42:11 +03:00
bojanserafimov
ca10cc12c1 Close file descriptors for redo process (#1834) 2022-05-31 14:14:09 -04:00
Thang Pham
c97cd684e0 Use HOMEBREW_PREFIX instead of hard-coded path (#1833) 2022-05-31 11:20:51 -04:00
Ryan Russell
54e163ac03 Improve Readability in Docs
Signed-off-by: Ryan Russell <ryanrussell@users.noreply.github.com>
2022-05-31 17:22:47 +03:00
Konstantin Knizhnik
595a6bc1e1 Bump vendor/postgres to fix basebackup LSN comparison. (#1835)
Co-authored-by: Arseny Sher <sher-ars@yandex.ru>
2022-05-31 14:47:06 +03:00
Arthur Petukhovsky
c3e0b6c839 Implement timeline-based metrics in safekeeper (#1823)
Now there's timelines metrics collector, which goes through all timelines and reports metrics only for active ones
2022-05-31 11:10:50 +03:00
Arseny Sher
36281e3b47 Extend test_wal_backup with compute restart. 2022-05-30 13:57:17 +04:00
Anastasia Lubennikova
e014cb6026 rename zenith.zenith_tenant to neon.tenant_id in test 2022-05-30 12:24:44 +03:00
Anastasia Lubennikova
915e5c9114 Rename 'zenith_admin' to 'cloud_admin' on compute node start 2022-05-30 11:11:01 +03:00
Anastasia Lubennikova
67d6ff4100 Rename custom GUCs:
- zenith.zenith_tenant -> neon.tenant_id
- zenith.zenith_timeline -> neon.timeline_id
2022-05-30 11:11:01 +03:00
Anastasia Lubennikova
6a867bce6d Rename 'zenith_admin' role to 'cloud_admin' 2022-05-30 11:11:01 +03:00
Anastasia Lubennikova
751f1191b4 Rename 'wal_acceptors' GUC to 'safekeepers' 2022-05-30 11:11:01 +03:00
Anastasia Lubennikova
3accde613d Rename contrib/zenith to contrib/neon. Rename custom GUCs:
- zenith.page_server_connstring -> neon.pageserver_connstring
- zenith.zenith_tenant -> neon.tenantid
- zenith.zenith_timeline -> neon.timelineid
- zenith.max_cluster_size -> neon.max_cluster_size
2022-05-30 11:11:01 +03:00
Heikki Linnakangas
e3b320daab Remove obsolete Dockerfile.alpine
It hasn't been used for anything for a long time. The comments still
talked about librocksdb, which we also haven't used for a long time.
2022-05-28 21:22:19 +03:00
Heikki Linnakangas
4b4d3073b8 Fix misc typos 2022-05-28 14:56:23 +03:00
Kian-Meng Ang
f1c51a1267 Fix typos 2022-05-28 14:02:05 +03:00
bojanserafimov
500e8772f0 Add quick-start guide in readme (#1816) 2022-05-27 17:48:11 -04:00
Dmitry Ivanov
b3ec6e0661 [proxy] Propagate SASL/SCRAM auth errors to the user
This will replace the vague (and incorrect) "Internal error" with a nice
and helpful authentication error, e.g. "password doesn't match".
2022-05-27 21:50:43 +03:00
Dmitry Ivanov
5d813f9738 [proxy] Refactoring
This patch attempts to fix some of the technical debt
we had to introduce in previous patches.
2022-05-27 21:50:43 +03:00
Thang Pham
757746b571 Fix test_pageserver_http_get_wal_receiver_success flaky test. (#1786)
Fixes #1768.

## Context

Previously, to test `get_wal_receiver` API, we make run some DB transactions then call the API to check the latest message's LSN from the WAL receiver. However, this test won't work because it's not guaranteed that the WAL receiver will get the latest WAL from the postgres/safekeeper at the time of making the API call. 

This PR resolves the above issue by adding a "poll and wait" code that waits to retrieve the latest data from the WAL receiver. 

This PR also fixes a bug that tries to compare two hex LSNs, should convert to number before the comparison. See: https://github.com/neondatabase/neon/issues/1768#issuecomment-1133752122.
2022-05-27 13:33:53 -04:00
Arseny Sher
cb8bf1beb6 Prevent commit_lsn <= flush_lsn violation after a42eba3cd7.
Nothing complained about that yet, but we definitely don't hold at least one
assert, so let's keep it this way until better version.
2022-05-27 20:23:30 +04:00
Thang Pham
75f71a6380 Handle broken timelines on startup (#1809)
Resolve #1663.

## Changes

- ignore a "broken" [1] timeline on page server startup
- fix the race condition when creating multiple timelines in parallel for a tenant
- added tests for the above changes

[1]: a timeline is marked as "broken" if either
- failed to load the timeline's metadata or
- the timeline's disk consistent LSN is zero
2022-05-27 11:43:06 -04:00
Arseny Sher
54b75248ff s3 WAL offloading staging review.
- Uncomment accidently `self.keep_alive.abort()` commented line, due to this
  task never finished, which blocked launcher.
- Mess up with initialization one more time, to fix offloader trying to back up
  segment 0. Now we initialize all required LSNs in handle_elected,
  where we learn start LSN for the first time.
- Fix blind attempt to provide safekeeper service file with remote storage
  params.
2022-05-27 14:02:52 +04:00
Arseny Sher
0e1bd57c53 Add WAL offloading to s3 on safekeepers.
Separate task is launched for each timeline and stopped when timeline doesn't
need offloading. Decision who offloads is done through etcd leader election;
currently there is no pre condition for participating, that's a TODO.

neon_local and tests infrastructure for remote storage in safekeepers added,
along with the test itself.

ref #1009

Co-authored-by: Anton Shyrabokau <ahtoxa@Antons-MacBook-Pro.local>
2022-05-27 06:19:23 +04:00
bojanserafimov
1d71949c51 Change proxy welcome message (#1808)
Remove zenith sun and outdated instructions around .pgpass
2022-05-26 14:59:03 -04:00
Thang Pham
7d565aa4b9 Reduce the logging level when PG client disconnected to INFO (#1713)
Fixes #1683.
2022-05-26 12:21:15 -04:00
Dmitry Rodionov
72a7220dc8 Tidy up some log messages
* turn println into an info with proper message
* rename new_local_timeline to load_local_timeline because it does not
  create new timeline, it registers timeline that exists on disk in
  pageserver in-memory structures
2022-05-26 18:37:40 +03:00
Konstantin Knizhnik
b0d114ee3f Initialize last_freeze_at with disk consistent LSN to avoid creation of small L0 delta layer on startup
refer #1736
2022-05-26 15:42:18 +03:00
Dmitry Rodionov
38f2d165b7 allow TLS 1.2 in proxy to be compatible with older client libraries 2022-05-26 13:21:29 +03:00
Dmitry Rodionov
5a5737278e add simple metrics for remote storage operations
track number of operations and number of their failures
2022-05-26 01:24:52 +03:00
Kirill Bulatov
06f5e017a1 Move rustfmt check to GH Action 2022-05-26 01:03:48 +03:00
Kirill Bulatov
887b0e14d9 Run basic checks on PRs and pushes to main only 2022-05-26 01:03:48 +03:00
chaitanya sharma
c584d90bb9 initial commit, renamed znodeid to nodeid. 2022-05-25 20:11:26 +03:00
Heikki Linnakangas
7997fc2932 Fix error handling with 'basebackup' command.
If the 'basebackup' command failed in the middle of building the tar
archive, the client would not report the error, but would attempt to
to start up postgres with the partial contents of the data directory.
That fails because the control file is missing (it's added to the
archive last, precisly to make sure that you cannot start postgres
from a partial archive). But the client doesn't see the proper error
message that caused the basebackup to fail in the server, which is
confusing.

Two issues conspired to cause that:

1. The tar::Builder object that we use in the pageserver to construct
the tar stream has a Drop handler that automatically writes a valid
end-of-archive marker on drop. Because of that, the resulting tarball
looks complete, even if an error happens while we're building it. The
pageserver does send an ErrorResponse after the seemingly-valid
tarball, but:

2. The client stops reading the Copy stream, as soon as it sees the
tar end-of-archive marker. Therefore, it doesn't read the
ErrorResponse that comes after it.

We have two clients that call 'basebackup', one in `control_plane`
used by the `neon_local` binary, and another one in
`compute_tools`. Both had the same issue.

This PR fixes both issues, even though fixing either one would be
enough to fix the problem at hand. The pageserver now doesn't send the
end-of-archive marker on error, and the client now reads the copy
stream to the end, even if it sees an end-of-archive marker.

Fixes github issue #1715

In the passing, change Basebackup to use generic Write rather than
'dyn'.
2022-05-25 18:14:44 +03:00
Heikki Linnakangas
24d2313d0b Set --quota-backend-bytes when launching etcd in tests.
By default, etcd makes a huge 10 GB mmap() allocation when it starts up.
It doesn't actually use that much memory, it's just address space, but
it caused me grief when I tried to use 'rr' to debug a python test run.
Apparently, when you replay the 'rr' trace, it does allocate memory for
all that address space.

The size of the initial mmap depends on the --quota-backend-bytes setting.
Our etcd clusters are very small, so let's set --quota-backend-bytes to
keep the virtual memory size small, to make debugging with 'rr' easier.

See https://github.com/etcd-io/etcd/issues/7910 and
5e4b008106
2022-05-25 16:57:45 +03:00
Andrey Taranik
9ab52e2186 helm repository name fix for production proxy deploy (#1790) 2022-05-25 15:41:18 +03:00
Heikki Linnakangas
6f1f33ef42 Improve error messages on seccomp loading errors.
Bump vendor/postgres for https://github.com/neondatabase/postgres/pull/166
2022-05-25 14:33:06 +03:00
Andrey Taranik
703f691df8 production inventory update (#1779) 2022-05-25 14:30:50 +03:00
Arseny Sher
2b265fd6dc Disable restart_after_crash in neon_local.
It is pointless when basebackup is invalid.
2022-05-25 14:48:11 +04:00
Sergey Melnikov
d32b491a53 Add zenith-us-stage-sk-6 to deploy (#1728) 2022-05-25 10:31:10 +03:00
Kirill Bulatov
541ec25875 Properly shutdown test mock S3 server 2022-05-24 19:09:31 +03:00
KlimentSerafimov
8346aa3a29 Potential fix to #1626. Fixed typo is Makefile. (#1781)
* Potential fix to #1626. Fixed typo is Makefile.
* Completed fix to #1626.

Summary:
changed 'error' to 'bail' in start_pageserver and start_safekeeper.
2022-05-24 04:55:38 -04:00
Heikki Linnakangas
2aceb6a309 Fix garbage collection to not remove image layers that are still needed.
The logic would incorrectly remove an image layer, if a new image layer
existed, even though the older image layer was still needed by some
delta layers after it. See example given in the comment this adds.

Without this fix, I was getting a lot of "could not find data for key
010000000000000000000000000000000000" errors from GC, with the new test
case being added in PR #1735.

Fixes #707
2022-05-23 20:58:27 +03:00
KlimentSerafimov
3ff5caf786 Add to readme install protobuf etcd (#1777)
* Update installation instructions
* Added libprotobuf-dev etcd to apt install

Added "brew install protobuf etcd" to OSX installation instructions.
Added "sudo apt install libprotobuf-dev etcd" to Linux installation instructions.
Without these, cargo build complains. 
Figured out in collaboration with Bojan.
2022-05-23 13:11:59 -04:00
chaitanya sharma
fbedd535c0 Replace a bunch of zenith references with neon. 2022-05-23 13:16:00 +03:00
Egor Suvorov
89e5659f3f Replace COPYRIGHT file from the root with NOTICE file
The primary reason: make GitHub detect that we use Apache License 2.0
They do it via https://github.com/licensee/licensee Ruby library (gem).

Our COPYRIGHT file contains a part of the Apache License, which should
be added to a source file, not the license or copyright information itself,
which confuses the library.

Instead, the recommended way is to create a NOTICE file which references
license of the code and its bundled dependencies.
2022-05-23 01:03:03 +02:00
Egor Suvorov
ef7cdb13e2 Remove unused dependencies from poetry.lock via poetry lock --no-update
There were a bunch of dependencies for Python <3.9. They are not needed
after #1254. This commit makes it easier to add/remove dependencies because
lock file will be updated like this on any such operation.

Do not update dependencies yet to not break anything.
2022-05-21 12:21:45 +02:00
Egor Suvorov
73187bfef1 postgres_ffi: find_end_of_wal_segment: clarify code around xl_crc retrieval
It would be better to not update xl_crc/rec_hdr at all when skipping contrecord,
but I would prefer to keep PR #1574 small.
Better audit of `find_end_of_wal_segment` is coming anyway in #544.
2022-05-21 05:25:17 +02:00
Egor Suvorov
967eb38e81 postgres_ffi: find_end_of_wal_segment: fix contrecord skipping
Also enable corresponding test.
2022-05-21 05:25:17 +02:00
Egor Suvorov
a124e44866 postgres_ffi: find_end_of_wal_segment: add lots of trace 2022-05-21 05:25:17 +02:00
Egor Suvorov
c4b77084af utils: add const_assert! macro 2022-05-21 05:25:17 +02:00
Egor Suvorov
c9efdec8db postgres_ffi: find_end_of_wal_segment: improve name of wal_crc variable
Now it reflects the field it's mirroring.
2022-05-21 05:25:17 +02:00
Egor Suvorov
12b7c793b3 postgres_ffi: find_end_of_wal_segment: remove redundant CRC operations
Previous invariant: `crc` contains an "unfinalized" CRC32 value,
its one complement, like in postgres before FIN_CRC32C.

New invariant: `crc` always contains a "finalized" CRC32 value,
this is the semantics of crc32c_append, so we don't need to invert CRC manually.
2022-05-21 05:25:17 +02:00
Egor Suvorov
3c6890bf1d postgres_ffi: add complex WAL tests for find_end_of_wal
* Actual generation logic is in a separate crate `postgres_ffi/wal_generate`
* The create also provides a binary for debug purposes akin to `initdb`
* Two tests currently fail and are ignored
* There is no easy way to test this directly in Safekeeper as it starts restoring from commit_lsn.
  So testing would require disconnecting Safekeeper just after it has received the WAL,
  but before it is committed.
2022-05-21 05:25:17 +02:00
Andrey Taranik
d97617ed3a updated proxy and proxy scram deployment for prod and stress environments (#1758) 2022-05-20 23:12:30 +03:00
KlimentSerafimov
65cf1a3221 Added paths to openssl includes and libraries for OSX because make complained that it couldn't find them. (#1761) 2022-05-20 12:02:51 -04:00
bojanserafimov
a4aef5d8dc Compile psql with openssl (#1725) 2022-05-19 12:25:31 -04:00
Heikki Linnakangas
ffbb9dd155 Add a 5 minute timeout to python tests.
The CI times out after 10 minutes of no output. It's annoying if a
test hangs and is killed by the CI timeout, because you don't get
information about which test was running. Try to avoid that, by adding
a slightly smaller timeout in pytest itself. You can override it on a
per-test basis if needed, but let's try to keep our tests shorter than
that.

For the Postgres regression tests, use a longer 30 minute timeout.
They're not really a single test, but many tests wrapped in a single
pytest test. It's OK for them to run longer in aggregate, each
Postgres test is still fairly short.
2022-05-19 14:04:14 +03:00
Egor Suvorov
baf7a81dce git-upload: pass committer to 'git rebase' (fix #1749) (#1750)
No committer was specified, which resulted in failing `git rebase` if
the branch is not up-to-date.
2022-05-19 14:01:03 +03:00
Heikki Linnakangas
ee3bcf108d Fix compact_level0 for delta layers with overlap or gaps
We saw a case in staging, where there was a gap in the LSN ranges of
level 0 files, like this:

    000000000000000000000000000000000000-FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF__0000000001696070-00000000016960E9
    000000000000000000000000000000000000-FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF__00000000016960E9-00000000016E4DB9
    000000000000000000000000000000000000-FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF__00000000016E4DB9-000000000BFCE3E1
    000000000000000000000000000000000000-FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF__000000000BFCE3E1-000000000BFD0FE9
    000000000000000000000000000000000000-FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF__0000000060045901-000000007005EAC1
    000000000000000000000000000000000000-FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF__000000007005EAC1-0000000080062E99
    000000000000000000000000000000000000-FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF__0000000080062E99-000000009007F481
    000000000000000000000000000000000000-FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF__000000009007F481-00000000A009F7C9
    000000000000000000000000000000000000-FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF__00000000A009F7C9-00000000AA284EB9
    000000000000000000000000000000000000-FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF__00000000AA286471-00000000AA2886B9

Note that gap between 000000000BFD0FE9 and 0000000060045901. I don't
know how that happened, but in general the pageserver should be robust
if there are gaps like that, or overlapping files etc. In theory they
could happen as result of crashes, partial downloads from S3 etc.,
although it is mystery what caused it in this case.

Looking at the compaction code, it was not safe in the face of gaps
like that. The compaction routine collected all the level 0 files, and
took their min(start)..max(end) as the range of the new files it
builds. That's wrong, if the level 0 files don't cover the whole LSN
range; the newly created files will miss any records in the gap. Fix
that, by only collecting contiguous sequences of level 0 files, so
that the end LSN of previous delta file is equal to the start of the
next one.

Fixes issue #1730
2022-05-19 10:19:38 +03:00
Heikki Linnakangas
0da4046704 Include traversal path in error message.
Previously, the path was printed to the log with separate error!() calls.
It's better to include the whole path in the error object and have it
printed to the log as one message.

Also print the path in the ValueReconstructResult::Missing case.

This is what it looks like now:

    2022-05-17T21:53:53.611801Z ERROR pagestream{timeline=5adcb4af3e95f00a31550d266aab7a37 tenant=74d9f9ad3293c030c6a6e196dd91c60f}: error reading relation or page version: could not find data for key 000000067F000032BE000000000000000001 at LSN 0/1698C48, for request at LSN 0/1698CF8

    Caused by:
        0: layer traversal: result Complete, cont_lsn 0/1698C48, layer: 000000000000000000000000000000000000-FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF__0000000001698C48-0000000001698CC1
        1: layer traversal: result Continue, cont_lsn 0/1698CC1, layer: inmem-0000000001698CC1-FFFFFFFFFFFFFFFF

    Stack backtrace:
2022-05-19 10:19:38 +03:00
Anastasia Lubennikova
cbd00d7ed9 Remove temp layer files during timeline initialization on pageserver start 2022-05-19 10:11:12 +03:00
Anastasia Lubennikova
4c30ae8ba3 Add random string as a part of tempfile name 2022-05-19 10:11:12 +03:00
Anastasia Lubennikova
3da4b3165e Fsync layer files before rename 2022-05-19 10:11:12 +03:00
Anastasia Lubennikova
c1b365fdf7 Use temp filename while writing ImageLayer file 2022-05-19 10:11:12 +03:00
Egor Suvorov
fab104d5f3 docs/sourcetree: add note about exact Python version used and how to choose it 2022-05-19 00:09:13 +02:00
Egor Suvorov
7dd27ecd20 Bump minimal supported Python version to 3.9
Most of the CI already run with Python 3.9 since https://github.com/neondatabase/docker-images/pull/1
2022-05-19 00:09:13 +02:00
Egor Suvorov
bd2979d02c CirleCI/check-codestyle-python: print versions 2022-05-19 00:09:13 +02:00
Dmitry Rodionov
5914aab78a add comments, use expect instead of unwrap 2022-05-19 00:54:14 +03:00
Heikki Linnakangas
4a36d89247 Avoid spawning a layer-flush thread when there's no work to do.
The check_checkpoint_distance() always spawned a new thread, even if
there is no frozen layer to flush. That was a thinko, as @knizhnik
pointed out.
2022-05-19 00:51:48 +03:00
Egor Suvorov
432907ff5f Safekeeper: avoid holding mutex when deleting a tenant (#1746)
Following discussion with @arssher after #1653
2022-05-18 23:02:17 +03:00
Arthur Petukhovsky
98da0aa159 Add _total suffix to metrics name (#1741) 2022-05-18 15:17:04 +03:00
Alexey Kondratov
772c2fb4ff Report startup metrics and failure reason from compute_ctl (#1581)
+ neondatabase/cloud#1103

This adds a couple of control endpoints to simplify compute state
discovery for control-plane. For example, now we may figure out
that Postgres wasn't able to start or basebackup failed within
seconds instead of just blindly polling the compute readiness
for a minute or two.

Also we now expose startup metrics (time of the each step: basebackup,
sync safekeepers, config, total). Console grabs them after each
successful start and report as histogram to prometheus and grafana.

OpenAPI spec is added and up-tp date, but is not currently used in the
console yet.
2022-05-18 13:03:29 +04:00
Andrey Taranik
b9f84f4a83 trun on storage deployment to neon-stress enviroment (#1729) 2022-05-17 23:04:04 +03:00
Arthur Petukhovsky
134eeeb096 Add more common storage metrics (#1722)
- Enabled process exporter for storage services
- Changed zenith_proxy prefix to just proxy
- Removed old `monitoring` directory
- Removed common prefix for metrics, now our common metrics have `libmetrics_` prefix, for example `libmetrics_serve_metrics_count`
- Added `test_metrics_normal_work`
2022-05-17 19:29:01 +03:00
Heikki Linnakangas
55ea3f262e Fix race condition leading to panic in remote storage sync thread.
The SyncQueue consisted of a tokio mpsc channel, and an atomic counter
to keep track of how many items there are in the channel. Updating the
atomic counter was racy, and sometimes the consumer would decrement
the counter before the producer had incremented it, leading to integer
wraparound to usize::MAX. Calling Vec::with_capacity(usize::MAX) leads
to a panic.

To fix, replace the channel with a VecDeque protected by a Mutex, and
a condition variable for signaling. Now that the queue is now
protected by standard blocking Mutex and Condvar, refactor the
functions touching it to be sync, not async.

A theoretical downside of this is that the calls to push items to the
queue and the storage sync thread that drains the queue might now need
to wait, if another thread is busy manipulating the queue. I believe
that's OK; the lock isn't held for very long, and these operations are
made in background threads, not in the hot GetPage@LSN path, so
they're not very latency-sensitive.

Fixes #1719. Also add a test case.
2022-05-17 18:14:57 +03:00
Heikki Linnakangas
f03779bf1a Fix wait_for_last_record_lsn() and wait_for_upload() python functions.
The contract for wait_for() was not very clear. It waits until the
given function returns successfully, without an exception, but the
wait_for_last_record_lsn() and wait_for_upload() functions used "a <
b" as the condition, i.e. they thought that wait_for() would poll
until the function returns true.

Inline the logic from wait_for() into those two functions, it's not
that complicated, and you get a more specific error message too, if it
fails. Also add a comment to wait_for() to make it more clear how it
works.

Also change remote_consistent_lsn() to return 0 instead of raising an
exception, if remote is None. That can happen if nothing has been
uploaded to remote storage for the timeline yet. It happened once in
the CI, and I was able to reproduce that locally too by adding a sleep
to the storage sync thread, to delay the first upload.
2022-05-17 18:14:10 +03:00
Andrey Taranik
070c255522 Neon stress deploy (#1720)
* storage and proxy deployment for neon stress environment

* neon stress inventory fix
2022-05-17 18:03:01 +03:00
Heikki Linnakangas
9ccbb8d331 Make "neon_local stop" less verbose.
I got annoyed by all the noise in CI test output.

Before:

    $ ./target/release/neon_local stop
    Stop pageserver gracefully
    Pageserver still receives connections
    Pageserver stopped receiving connections
    Pageserver status is: Reqwest error: error sending request for url (http://127.0.0.1:9898/v1/status): error trying to connect: tcp connect error: Connection refused (os error 111)
    initializing for sk 1 for 7676
    Stop safekeeper gracefully
    Safekeeper still receives connections
    Safekeeper stopped receiving connections
    Safekeeper status is: Reqwest error: error sending request for url (http://127.0.0.1:7676/v1/status): error trying to connect: tcp connect error: Connection refused (os error 111)

After:

    $ ./target/release/neon_local stop
    Stopping pageserver gracefully...done!
    Stopping safekeeper 1 gracefully...done!

Also removes the spurious "initializing for sk 1 for 7676" message from
"neon_local start"
2022-05-17 10:31:13 +03:00
Kirill Bulatov
f2881bbd8a Start and stop single etcd and mock s3 servers globally in python tests 2022-05-17 01:17:44 +03:00
Kirill Bulatov
a884f4cf6b Add etcd to neon_local 2022-05-17 01:17:44 +03:00
Kirill Bulatov
9a0fed0880 Enable at least 1 safekeeper in every test 2022-05-17 01:17:44 +03:00
chaitanya sharma
bea84150b2 Fix the markdown rendering on 004-durability.md RFC 2022-05-17 00:16:42 +03:00
chaitanya sharma
85b5c0e989 List profiling as a feature with 'pageserver --enabled-features'
Fixes https://github.com/neondatabase/neon/issues/1627
2022-05-16 21:10:57 +03:00
Thang Pham
e4a70faa08 Add more information to timeline-related APIs (#1673)
Resolves #1488.

- implemented `GET tenant/:tenant_id/timeline/:timeline_id/wal_receiver` endpoint
- returned `thread_id` in `thread_mgr::spawn` 
- added `latest_gc_cutoff_lsn` field to `LocalTimelineInfo` struct
2022-05-16 11:05:43 -04:00
chaitanya sharma
c41549f630 Update readme build for osx (#1709) 2022-05-16 10:42:08 -04:00
Heikki Linnakangas
c700032dd2 Run the regression tests in CI also for PRs opened from forked repos. 2022-05-16 14:40:49 +03:00
Kirill Bulatov
33cac863d7 Test simple.conf and handle broker_endpoints better 2022-05-16 12:07:35 +03:00
Heikki Linnakangas
51ea9c3053 Don't swallow panics when the pageserver is build with failpoints.
It's very confusing, and because you don't get a stack trace and error
message in the logs, makes debugging very hard. However, the
'test_pageserver_recovery' test relied on that behavior. To support that,
add a new "exit" action to the pageserver 'failpoints' command, so that
you can explicitly request to exit the process when a failpoint is hit.
2022-05-16 09:58:58 +03:00
Heikki Linnakangas
a10cac980f Continue with pageserver startup, if loading some tenants fail.
Fixes https://github.com/neondatabase/neon/issues/1664
2022-05-15 00:25:38 +03:00
Heikki Linnakangas
081d5dac5e Bump vendor/postgres.
Includes change to reduce log noise from inmem_smgr.
2022-05-13 21:41:00 +03:00
Andrey Taranik
cded72a580 remove sk-2 from staging inventory list (#1699) 2022-05-13 20:41:54 +03:00
Egor Suvorov
768c846eeb Fix test_delete_force from #1653 conflicting with #1692 2022-05-13 17:36:18 +02:00
Anastasia Lubennikova
a2561f0a78 Use tenant's pitr_interval instead of hardroded 0 in the command.
Adjust python tests that use the
2022-05-13 18:32:14 +03:00
Anastasia Lubennikova
aa7c601eca Fix pitr_interval check in GC:
Use timestamp->LSN mapping instead of file modification time.
Fix 'latest_gc_cutoff_lsn' - set it to the minimum of pitr_cutoff and gc_cutoff.
Add new test: test_pitr_gc
2022-05-13 18:32:14 +03:00
Egor Suvorov
bf899a57d9 Safekeeper: add timeline/tenant force delete HTTP endpoings (closes #895)
* There is no auth in Safekeeper HTTP at all currently,
  so simply calling `check_permission` is not enough.
* There are no checks of Safekeeper still working with the data,
  as "still working" is burry now: a timeline may be "active"
  while there are no compute nodes and all data is propagated.
* Still, callmemaybe is deactivated, and timeline is removed from the
  internal map. It can easily sneak back in case of race conditions
  and implicit creations, though.
2022-05-13 15:43:52 +02:00
Egor Suvorov
07b85e7cfc Safekeeper refactor: move callmemaybe_tx from SafekeeperPostgresBackend to Timeline 2022-05-13 15:43:52 +02:00
Egor Suvorov
22d997049c libs/utils/http/request: add ensure_no_body 2022-05-13 15:43:52 +02:00
Kirill Bulatov
b683308791 Return GIT_VERSION back to storage binaries 2022-05-13 16:34:32 +03:00
Kirill Bulatov
51c0f9ab2b Force git version to be up to date via decl macro 2022-05-13 16:34:32 +03:00
Stas Kelvich
0030da57a8 compute-tools: grant rw priveleges to the all created users 2022-05-13 11:27:00 +03:00
Kirill Bulatov
85884a1599 Disable tenant relocation python test 2022-05-13 01:26:38 +03:00
Thang Pham
ae20751724 update ZenithCli::create_tenant return signature (#1692)
to include the initial timeline's ID in addition to the new tenant's ID.

Context: follow-up of https://github.com/neondatabase/neon/pull/1689
2022-05-12 17:27:08 -04:00
Thang Pham
5812e26b90 Create an initial timeline on CLI tenant creation (#1689)
Resolves #1655
2022-05-12 16:33:09 -04:00
Arthur Petukhovsky
ec8861b8cc Fix pageserver metrics names (#1682)
Try to follow Prometheus style-guide https://prometheus.io/docs/practices/naming/ for metrics names. More specifically:
- Use `pageserver_` prefix for all pagserver metrics
- Specify `_seconds` unit in time metrics
- Use unit as a suffix in other cases, such as `_hits`, `_bytes`, `_records`
- Use `_total` suffix for accumulating counters (note that Histograms append that suffix internally)
2022-05-12 19:53:07 +03:00
Kirill Bulatov
4538f1e1b8 Correctly operate etcd safekeeper timeline data 2022-05-12 18:47:31 +03:00
Stas Kelvich
b10ae195b7 Set vendor/postgres back to the main branch
I accidentally merged postgres PR that was referencing non-main branch.
2022-05-12 15:05:49 +03:00
Alexey Kondratov
b426775aa0 Use compute-tools from the new neondatabase Docker Hub repo 2022-05-12 12:26:24 +03:00
Heikki Linnakangas
5da4f3a4df Refactor DeltaLayer::dump() function
Put most of the code in a closure that returns Result, so that we can
use the ?-operator for error handling. That's simpler.
2022-05-12 10:31:04 +03:00
Konstantin Knizhnik
2bde77fced Do not apply records with LSN smaller than LSN of cached image in del… (#1672)
* Do not apply records with LSN smaller than LSN of cached image in delta layer

* Do not apply records with LSN smaller than LSN of cached image in delta layer
2022-05-12 07:56:02 +03:00
Dhammika Pathirana
c864091035 Fix err msg typo
Signed-off-by: Dhammika Pathirana <dham@neon.tech>
2022-05-11 16:13:26 -07:00
Anton Shyrabokau
20361395bb Add zenith-us-stage-sk-5 to circleci inventory (#1665)
Co-authored-by: Debian <admin@ip-10-0-5-32.us-west-2.compute.internal>
2022-05-11 21:36:53 +03:00
Arseny Sher
b338b5dffe Make callmemaybe less agressive until we fix it/migrate to bigger machines. 2022-05-11 22:16:13 +04:00
Stas Kelvich
5bd879f641 Proxy: update protocol after cluster->project rename 2022-05-11 15:50:36 +03:00
Konstantin Knizhnik
e6e883eb12 Do not set LSN for new FPI page (#1657)
* Do not set LSN for new FPI page

refer #1656

* Add page_is_new, page_get_lsn, page_set_lsn functions

* Fix page_is_new implementation

* Add comment from XLogReadBufferForRedoExtended
2022-05-11 15:23:17 +03:00
Heikki Linnakangas
d710dff975 Remove unnecessary Serialize/Deserialize traits from VecMap.
It's never stored on disk. Let's be tidy.
2022-05-10 23:47:40 +03:00
Arseny Sher
6cb14b4200 Optionally remove WAL on safekeepers without s3 offloading.
And do that on staging, until offloading is merged.
2022-05-10 22:41:02 +04:00
Thang Pham
87dfa99734 Update layered_repository REAMDE (#1659) 2022-05-10 09:55:14 -04:00
Thang Pham
cf59b51519 Update README (Running local installation section) (#1649) 2022-05-09 11:11:46 -04:00
Kirill Bulatov
0a7735a656 Rework remote storage sync queue, general refactoring 2022-05-07 01:33:33 +03:00
Kirill Bulatov
64a602b8f3 Delete timeline layers 2022-05-07 01:33:33 +03:00
Kirill Bulatov
10e4da3997 Rework timeline batching 2022-05-07 01:33:33 +03:00
Kirill Bulatov
de37f982db Share the remote storage as a crate 2022-05-07 00:30:36 +03:00
Kirill Bulatov
d4e155aaa3 Librarify common etcd timeline logic 2022-05-06 22:32:57 +03:00
Arseny Sher
dd6dca9072 Bump vendor/postgres to shut down on wrong basebackup. 2022-05-06 20:07:26 +04:00
bojanserafimov
ef40e404cf Rename zenith crate to neon_local (#1625) 2022-05-05 19:06:53 -04:00
Sergey Melnikov
11a44eda0e Add TLS support in scram-proxy (#1643)
* Add TLS support in scram-proxy

* Fix authEndpoint
2022-05-05 23:48:16 +03:00
Heikki Linnakangas
30a7598172 Some copy-editing. 2022-05-05 22:35:15 +03:00
Heikki Linnakangas
1ad5658d9c Fix typos 2022-05-05 22:35:15 +03:00
Dmitry Rodionov
954859f6c5 add readme for performance tests with the current state of things 2022-05-05 22:35:15 +03:00
Andrey Taranik
4024bfe736 get_binaries script fix (#1638)
* get_binaries script fix

* minor improvment for get_binaries
2022-05-05 22:21:07 +03:00
Kirill Bulatov
2ef0e5c6ed Do not require metadata in every upload sync task 2022-05-05 18:26:39 +03:00
Kirill Bulatov
52a7e3155e Add local path to the Layer trait and historic layers 2022-05-05 18:26:39 +03:00
Thang Pham
ad5eaa6027 Use node's LSN for read-only nodes (#1642)
Fixes #1410.
2022-05-05 10:53:10 -04:00
Dmitry Rodionov
0f3ec83172 avoid detach with alive branches 2022-05-05 12:54:42 +03:00
Arseny Sher
c46fe90010 Fix division by zero in WAL removal. 2022-05-05 10:41:43 +04:00
bojanserafimov
bc569dde51 Remove some unwraps from waldecoder (#1539) 2022-05-04 17:41:05 -04:00
bojanserafimov
02e5083695 Add hot page test (#1479) 2022-05-04 12:45:01 -04:00
Thang Pham
c4bc604e5f Fix pg list table alignment #1633
Fixes #1628

- add [`comfy_table`](https://github.com/Nukesor/comfy-table/tree/main) and use it to construct table for `pg list` CLI command

Comparison

- Old:

```
NODE	ADDRESS	TIMELINE	BRANCH NAME	LSN		STATUS
main	127.0.0.1:55432	3823dd05e35d71f6ccf33049de366d70	main	0/16FB140	running
migration_check	127.0.0.1:55433	3823dd05e35d71f6ccf33049de366d70	main	0/16FB140	running
```

- New:

```
 NODE             ADDRESS          TIMELINE                          BRANCH NAME  LSN        STATUS
 main             127.0.0.1:55432  3823dd05e35d71f6ccf33049de366d70  main         0/16FB140  running
 migration_check  127.0.0.1:55433  3823dd05e35d71f6ccf33049de366d70  main         0/16FB140  running
```
2022-05-04 12:12:26 -04:00
Anastasia Lubennikova
b8880bfaab Bump vendor/postgres 2022-05-04 18:14:45 +03:00
Anastasia Lubennikova
e2cf77441d Implement pg_database_size().
In this implementation dbsize equals sum of all relation sizes, excluding shared ones.
2022-05-04 18:14:45 +03:00
Arseny Sher
b68e3b03ed Fix control file update for b9fd8a36ad 2022-05-04 17:11:22 +04:00
Arseny Sher
e58c83870f Bump vendor/postgres to to send timeline_start_lsn. 2022-05-04 14:32:03 +04:00
Arseny Sher
b9fd8a36ad Remember timeline_start_lsn and local_start_lsn on safekeeper.
Make it remember when timeline starts in general and on this safekeeper in
particular (the point might be later on new safekeeper replacing failed one).

Bumps control file and walproposer protocol versions.

While protocol is bumped, also add safekeeper node id to
AcceptorProposerGreeting.

ref #1561
2022-05-04 14:32:03 +04:00
Heikki Linnakangas
748c5a577b Bump vendor/postgres. (#1616)
Includes fix for https://github.com/neondatabase/neon/issues/1615
2022-05-04 10:54:44 +03:00
Stas Kelvich
51a0f2683b fix scram-proxy addresses 2022-05-04 01:35:30 +03:00
Dmitry Rodionov
9dfa145c7c tone down tenant not found error 2022-05-04 00:47:52 +03:00
Stas Kelvich
5642d0b2b8 Change shutdown_process_on_error thread spawn settings.
Now princeple is following: acceptor threads (libpq and http) error will
bring the pageserver down, but all per-tenant thread failures will be treated
as an error.
2022-05-04 00:42:57 +03:00
Dmitry Rodionov
2f83f793bc print more details when thread fails 2022-05-03 18:31:23 +03:00
Anastasia Lubennikova
2f9b17b9e5 Add simple test of pageserver recovery after crash. To cause a crash, use failpoints in checkpointer 2022-05-03 17:13:09 +03:00
Dmitry Rodionov
e7cba0b607 use thiserror instead of anyhow in disk_btree 2022-05-03 15:34:23 +03:00
Dmitry Rodionov
ff7e9a86c6 turn panic into an error with more details 2022-05-03 12:44:42 +03:00
Heikki Linnakangas
9ede38b6c4 Support finding LSN from a commit timestamp.
A new `get_lsn_by_timestamp` command is added to the libpq page service
API.

An extra timestamp field is now stored in an extra field after each
Clog page. It is the timestamp of the latest commit, among all the
transactions on the Clog page. To find the overall latest commit, we
need to scan all Clog pages, but this isn't a very frequent operation
so that's not too bad.

To find the LSN that corresponds to a timestamp, we perform a binary
search. The binary search starts with min = last LSN when GC ran, and
max = latest LSN on the timeline. On each iteration of the search we
check if there are any commits with a higher-than-requested timestamp
at that LSN.

Implements github issue 1361.
2022-05-03 09:28:57 +03:00
Heikki Linnakangas
62449d6068 Bump vendor/postgres (#1573)
This brings us the performance improvements to WAL redo from
https://github.com/neondatabase/postgres/pull/144
2022-05-03 09:25:12 +03:00
Konstantin Knizhnik
baa59512b8 Traverse frozen layer in get_reconstruct_data in reverse order (#1601)
* Traverse frozen layer in get_reconstruct_data in reverse order

* Fix comments on frozen layers.

Note explicitly the order that the layers are in the queue.

* Add fail point to reproduce failpoint iteration error

Co-authored-by: Heikki Linnakangas <heikki@neon.tech>
2022-05-03 08:07:14 +03:00
Heikki Linnakangas
87a6c4d051 RFC on connection routing and authentication.
This documents how we want this to work. We're not quite there yet.
2022-05-02 23:39:06 +03:00
Stas Kelvich
801b749e1d Set correct authEndpoint for the new proxy 2022-05-02 21:46:32 +03:00
Kirill Bulatov
5cb501c2b3 Make remote storage test less flacky 2022-05-02 20:04:48 +03:00
Dmitry Rodionov
ad25736f3a Exit pageserver process with correct error code
When we shutdown pageserver due to an error (e g one of th important
thrads panicked) use 1 exit code so systemd can properly restart it
2022-05-02 19:04:45 +03:00
Stas Kelvich
9a396e1feb Support SNI-based routing in proxy 2022-05-02 18:32:18 +03:00
Stas Kelvich
0323bb5870 [proxy] Refactor cplane API and add new console SCRAM auth API
Now proxy binary accepts `--auth-backend` CLI option, which determines
auth scheme and cluster routing method. Following backends are currently
implemented:

* legacy
    old method, when username ends with `@zenith` it uses md5 auth dbname as
    the cluster name; otherwise, it sends a login link and waits for the console
    to call back
* console
    new SCRAM-based console API; uses SNI info to select the destination
    cluster
* postgres
    uses postgres to select auth secrets of existing roles. Useful for local
    testing
* link
    sends login link for all usernames
2022-05-02 18:32:18 +03:00
Dmitry Ivanov
af0195b604 [proxy] Introduce cloud::Api for communication with Neon Cloud
* `cloud::legacy` talks to Cloud API V1.
* `cloud::api` defines Cloud API v2.
* `cloud::local` mocks the Cloud API V2 using a local postgres instance.
* It's possible to choose between API versions using the `--api-version` flag.
2022-05-02 18:32:18 +03:00
Dmitry Ivanov
9df8915b03 [proxy] sasl::Mechanism may return Output during exchange
This is needed to forward the `ClientKey` that's required
to connect the proxy to a compute.

Co-authored-by: bojanserafimov <bojan.serafimov7@gmail.com>
2022-05-02 18:32:18 +03:00
Dmitry Ivanov
4b1bd32e4a Drop Debug impl for ScramKey and ServerSecret
There's a notion that accidental misuse of those implementations
might reveal authentication secrets.
2022-05-02 18:32:18 +03:00
Andrey Taranik
68ba6a58a0 authEndpoint fix 2022-05-02 17:55:13 +03:00
Andrey Taranik
8f479a712f minor fixes in proxy deployment 2022-05-02 17:55:13 +03:00
Stas Kelvich
2477d2f9e2 Deploy standalone SRAM proxy on staging 2022-05-02 17:55:13 +03:00
Dhammika Pathirana
992874c916 Fix update ps settings doc
Signed-off-by: Dhammika Pathirana <dhammika@gmail.com>
2022-05-01 13:52:08 -07:00
Dhammika Pathirana
3128e8c75c Fix tenant conf test
Signed-off-by: Dhammika Pathirana <dhammika@gmail.com>
2022-05-01 13:13:25 -07:00
Dhammika Pathirana
f3f12db2cb Add gc churn threshold knob (#1594)
Signed-off-by: Dhammika Pathirana <dhammika@gmail.com>
2022-05-01 13:13:17 -07:00
Andrey Taranik
038ea4c128 proxy notice message update (#1600) 2022-04-30 22:04:08 +03:00
Kirill Bulatov
7e1db8c8a1 Show which virtual file got the deserialization errors 2022-04-29 21:40:57 +03:00
Andrey Taranik
aa933d3961 proxy settings update for new domain (#1597) 2022-04-29 20:05:14 +03:00
Dmitry Rodionov
67b4e38092 remporarily disable test_backpressure_received_lsn_lag 2022-04-29 15:53:56 +03:00
Dmitry Rodionov
05f8e6a050 Use fsync+rename for atomic downloads from remote storage
Use failpoint in test_remote_storage to check the behavior
2022-04-29 15:53:56 +03:00
chaitanya sharma
76388abeb6 Rename READMEs with .md extension, and fix links to them.
Commit edba2e97 renamed pageserver/README to pageserver/README.md, but
forgot to update links to it. Fix.

Rename libs/postgres_ffi/README and safekeeper/README files to also
have the the .md extension, so that github can render them nicely.

Quote ascii-diagram in safekeeper/README.md so that it renders
correctly.
2022-04-29 14:23:42 +03:00
Kirill Bulatov
2911eb084a Remove timeline files on detach 2022-04-29 09:19:18 +03:00
Kirill Bulatov
6cca57f95a Properly remove from the local timeline map 2022-04-29 09:19:18 +03:00
Kirill Bulatov
4a46b01caf Properly populate local timeline map 2022-04-29 09:19:18 +03:00
Anastasia Lubennikova
5c5c3c64f3 Fix tenant config parsing. Add a test 2022-04-28 11:49:19 +03:00
Arthur Petukhovsky
29539b0561 Set wal_keep_size to zero (#1507)
wal_keep_size is already set to 0 in our cloud setup, but we don't use this value in tests. This commit fixes wal_keep_size in control_plane and adds tests for WAL recycling and lagging safekeepers.
2022-04-27 19:09:28 +03:00
Dmitry Rodionov
695b5f9d88 Remove obsolete failpoint in proxy
When failpoint feature is disabled it throws away passed code so code
inside is not guaranteed to compile when feature is disabled. In this
particular case code is obsolete so removing it.
2022-04-27 14:34:33 +03:00
Dhammika Pathirana
66694e736a Fix add ps tenant config
Signed-off-by: Dhammika Pathirana <dhammika@gmail.com>
2022-04-27 00:05:13 -07:00
Dhammika Pathirana
091cefaa92 Fix add compaction for key partitioning
Signed-off-by: Dhammika Pathirana <dhammika@gmail.com>
2022-04-27 00:05:13 -07:00
Dhammika Pathirana
aeb4f81c3b Add branch traversal unit test
Signed-off-by: Dhammika Pathirana <dhammika@gmail.com>
2022-04-27 00:05:13 -07:00
Dhammika Pathirana
6391862d8a Add branch traversal test
Signed-off-by: Dhammika Pathirana <dhammika@gmail.com>
2022-04-27 00:05:13 -07:00
Dhammika Pathirana
b2e35fffa6 Fix ancestor layer traversal (#1484)
Signed-off-by: Dhammika Pathirana <dhammika@gmail.com>
2022-04-27 00:05:13 -07:00
Arseny Sher
8b9d523f3c Remove old WAL on safekeepers.
Remove when it is consumed by all of 1) pageserver (remote_consistent_lsn) 2)
safekeeper peers 3) s3 WAL offloading.

In test s3 offloading for now is mocked by directly bumping s3_wal_lsn.

ref #1403
2022-04-26 23:02:23 +04:00
Arseny Sher
3fd234da07 Enable etcd for safekeepers in deploy. 2022-04-26 18:13:50 +04:00
Kirill Bulatov
778744d35c Limit concurrent S3 and IAM interactions 2022-04-26 13:49:37 +03:00
Dmitry Rodionov
eabf6f89e4 Use item.get for tenant config toml parsing
Previously we've used table interface, but there was no easy way to pass
it as an override to pageserver through cli. Use the same strategy as
for remote storage config parsing
2022-04-26 10:15:19 +03:00
Kirill Bulatov
fec050ce97 Fix macos clippy issues 2022-04-25 16:23:34 +03:00
Kirill Bulatov
d060a97c54 Simplify clippy runs 2022-04-25 16:23:34 +03:00
Anastasia Lubennikova
78a6cb247f allow the users to create extensions: GRANT CREATE ON DATABASE 2022-04-25 15:35:44 +03:00
Kirill Bulatov
8f6a161271 Show better layer load errors 2022-04-25 14:54:39 +03:00
Andrey Taranik
56f6269a8e rename docker images to neondatabase docker account (#1570)
* rename docker images to neondatabase docker account

* docker images build fix (permisions for Cargo.lock)
2022-04-25 11:34:51 +03:00
Heikki Linnakangas
1fb3d08185 Use a 1-byte length header for short blobs.
Notably, this shaves 3 bytes from each small WAL record stored in
ephemeral or delta layers.
2022-04-22 21:31:27 +03:00
bojanserafimov
867aede715 Add idle compute restart time test (#1514) 2022-04-22 10:45:47 -04:00
Dmitry Ivanov
d3f356e7a8 Update rust-postgres project-wide (#1525)
* Update `rust-postgres` project-wide

This commit points to https://github.com/neondatabase/rust-postgres/commits/neon
in order to test our patches on top of the latest version of this crate.

* [proxy] Update `hmac` and `sha2`
2022-04-22 17:31:58 +03:00
Konstantin Knizhnik
5f83c9290b Make it possible to specify per-tenant configuration parameters
Add tenant config API and 'zenith tenant config' CLI command.
Add 'show' query to pageserver protocol for tenantspecific config parameters

Refactoring: move tenant_config code to a separate module.
Save tenant conf file to tenant's directory, when tenant is created to recover it on pageserver restart.
Ignore error during tenant config loading, while it is not supported by console

Define PiTR interval for GC.

refer #1320
2022-04-22 11:24:29 +03:00
Heikki Linnakangas
a4700c9bbe Use pprof to get flamegraph of get_page and get_relsize requests.
This depends on a hacked version of the 'pprof-rs' crate. Because of
that, it's under an optional 'profiling' feature. It is disabled by
default, but enabled for release builds in CircleCI config. It doesn't
currently work on macOS.

The flamegraph is written to 'flamegraph.svg' in the pageserver
workdir when the 'pageserver' process exits.

Add a performance test that runs the perf_pgbench test, with profiling
enabled.
2022-04-21 20:32:48 +03:00
Heikki Linnakangas
dafdf9b952 Handle EINTR 2022-04-21 16:37:36 +03:00
Heikki Linnakangas
263d60f12d Add prometheus metric for time spent waiting for WAL to arrive 2022-04-21 16:37:32 +03:00
Arseny Sher
abcd7a4b1f Insert less data in test_wal_restore.
Otherwise it sometimes hits 2m statement timeout in CI.
2022-04-21 16:00:15 +04:00
Kirill Bulatov
81cad6277a Move and library crates into a dedicated directory and rename them 2022-04-21 13:30:33 +03:00
Kirill Bulatov
629688fd6c Drop redundant resolver setting for 2021 edition 2022-04-21 13:30:33 +03:00
Heikki Linnakangas
9d3779c124 Add a counter for materialized page cache hits. 2022-04-20 21:26:03 +03:00
Heikki Linnakangas
334a1d6b5d Fix materialized page caching with delta layers.
We only checked the cache page version when collecting WAL records in
an in-memory layer, not in a delta layer. Refactor the code so that we
always stop collecting WAL records when we reach a cached materialized
page.

Fix the assertion on the LSN range in
InMemoryLayer::get_value_reconstruct_data. It was supposed to check
that the requested LSN range is within the layer's LSN range, but the
inequality was backwards. That went unnoticed before, because the
caller always passed the layer's start LSN as the requested LSN
range's start LSN, but now we might stop the search earlier, if we have
a cached page version.

Co-authored-by: Konstantin Knizhnik <knizhnik@zenith.tech>
2022-04-20 21:25:59 +03:00
Dmitry Rodionov
e41ad3be0f add more context to writeback error 2022-04-20 17:07:07 +03:00
Heikki Linnakangas
e113c6fa8d Print a warning if unlinking an ephemeral file fails.
Unlink failure isn't serious on its own, we were about to remove the
file anyway, but it shouldn't happen and could be a symptom of
something more serious.

We just saw "No such file or directory" errors happening from
ephemeral file writeback in staging, and I suspect if we had this
warning in place, we would have seen these warnings too, if the
problem was that the ephemeral file was removed before dropping the
EphemeralFile struct. Next time it happens, we'll have more
information.
2022-04-20 16:23:16 +03:00
Heikki Linnakangas
cbdfd8c719 Update 'routerify' dependency in proxy.
routerify version 3 is used in zenith_utils, use the same version in proxy
to avoid having to build two versions.
2022-04-20 14:42:05 +03:00
Heikki Linnakangas
86bf4301b7 Remove unnecessary dependency on 'webpki' 2022-04-20 14:36:54 +03:00
Heikki Linnakangas
9eaa21317c Update jsonwebtoken crate.
With this, we no longer need to build two versions of 'pem' and 'base64'
crates. Introduces a duplicate version of 'time' crate, though, but it's
still progress.
2022-04-20 14:27:49 +03:00
Heikki Linnakangas
e660e12f79 Update rustls-split and rustls versions.
All dependencies now use rustls 0.20.2, so we no longer need to build two
versions of it.
2022-04-20 14:07:55 +03:00
Konstantin Knizhnik
ac52f4f2d6 Set superuser when initializing database for wal recovery (#1544) 2022-04-20 13:24:38 +03:00
Heikki Linnakangas
5e95338ee9 Improve logging in test_wal_restore.py
- Capture the output of the restore_from_wal.sh in a log file
- Kill "restored" Postgres server on test failure
2022-04-20 11:18:40 +03:00
Heikki Linnakangas
170badd626 Capture the postgres log in all tests that start a vanilla Postgres. 2022-04-20 11:18:40 +03:00
Kirill Bulatov
91fb21225a Show more logs during S3 sync 2022-04-20 02:57:03 +03:00
Kirill Bulatov
3e6087a12f Remove S3 archiving 2022-04-19 23:13:52 +03:00
Kirill Bulatov
44bfc529f6 Require specifying the upload size in remote storage 2022-04-19 23:13:52 +03:00
bojanserafimov
ef72eb84cf Remove zenfixture (#1534) 2022-04-19 09:46:47 -04:00
Kirill Bulatov
a1e34772e5 Improve compute error logging 2022-04-19 00:20:08 +03:00
Stas Kelvich
389bd1faeb Support for SCRAM-SHA-256 in compute tools 2022-04-18 22:19:01 +03:00
Anastasia Lubennikova
c15aa04714 Move Cluster size limit RFC from rfcs repo 2022-04-18 18:11:31 +03:00
Kirill Bulatov
52e0816fa5 wal_acceptor -> safekeeper 2022-04-18 12:52:31 +03:00
Kirill Bulatov
81417788c8 walkeeper -> safekeeper 2022-04-18 12:52:31 +03:00
Kirill Bulatov
81879f8137 Restore missing cachepot env vars 2022-04-18 12:32:04 +03:00
Arseny Sher
5b29774532 Small refactoring after ec3bc74165.
Move record_safekeeper_info inside safekeeper.rs, fix commit_lsn update, sync
control file.
2022-04-18 13:11:34 +04:00
Kirill Bulatov
0ca2bd929b Remove log crate from pageserver 2022-04-18 00:00:36 +03:00
Kirill Bulatov
9b7dcc2bae Use proper cachepot bucket 2022-04-17 16:35:40 +03:00
Kirill Bulatov
3136a0754a Use mold in Docker images 2022-04-17 00:50:28 +03:00
Kirill Bulatov
787f0d33f0 Use another cachepot bucket for rust Docker build caches 2022-04-16 23:36:42 +03:00
Kirill Bulatov
ed5f9acca9 Revert "Revert libc upgrade" (#1527)
This reverts commit 4bc338babc.
2022-04-16 13:38:48 +03:00
Kirill Bulatov
4bc338babc Revert libc upgrade 2022-04-16 10:03:26 +03:00
Kirill Bulatov
3ab090b43a Fix compute tools build 2022-04-15 23:12:35 +03:00
Kirill Bulatov
7126979950 Remove custom neon Docker build image 2022-04-15 20:08:22 +03:00
Arseny Sher
9946cd1125 Bump vendor/postgres to add safekeeper connection timeout. 2022-04-15 20:44:56 +04:00
Dmitry Ivanov
ab20f2c491 Use the same version of rust-postgres everywhere. (#1516)
Turns out we still had a stale dep in `compute_tools`.
2022-04-15 18:36:11 +03:00
Dmitry Ivanov
c9d897f9b6 [proxy] Update rustls (#1510) 2022-04-15 12:06:25 +03:00
Kirill Bulatov
e97f94cc30 Bump rustc version 2022-04-14 23:01:06 +03:00
Dmitry Rodionov
2cb39a1624 add missing files, update workspace hack 2022-04-14 20:41:21 +03:00
Heikki Linnakangas
93e0ac2b7a Remove a couple of unused dependencies.
Found by "cargo-udeps"
2022-04-14 17:38:26 +03:00
bojanserafimov
d5ae9db997 Add s3 cost estimate to tests (#1478) 2022-04-14 10:09:03 -04:00
Heikki Linnakangas
9e4de6bed0 Use RwLock instad of Mutex for layer map lock.
For more concurrency
2022-04-14 13:34:01 +03:00
Heikki Linnakangas
4a8c663452 Refactor pgbench tests.
- Remove batch_others/test_pgbench.py. It was a quick check that pgbench
  works, without actually recording any performance numbers, but that
  doesn't seem very interesting anymore. Remove it to avoid confusing it
  with the actual pgbench benchmarks

- Run pgbench with "-n" and "-S" options, for two different workloads:
  simple-updates, and SELECT-only. Previously, we would only run it with
  the "default" TPCB-like workload. That's more or less the same as the
  simple-update (-n) workload, but I think the simple-upload workload
  is more relevant for testing storage performance. The SELECT-only
  workload is a new thing to measure.

- Merge test_perf_pgbench.py and test_perf_pgbench_remote.py. I added
  a new "remote" implementation of the PgCompare class, which allows
  running the same tests against an already-running Postgres instance.

- Make the PgBenchRunResult.parse_from_output function more
  flexible. pgbench can print different lines depending on the
  command-line options, but the parsing function expected a particular
  set of lines.
2022-04-14 13:31:42 +03:00
Heikki Linnakangas
a009fe912a Refactor connection option handling in python tests
The PgProtocol.connect() function took extra options for username,
database, etc. Remove those options, and have a generic way for each
subclass of PgProtocol to provide some default options, with the
capability override them in the connect() call.
2022-04-14 13:31:40 +03:00
Heikki Linnakangas
19954dfd8a Refactor proxy options test to not rely on the 'schema' argument.
It was the only test that used the 'schema' argument to the connect()
function. I'm about to refactor the option handling and will remove
the special 'schema' argument altogether, so rewrite the test to not
use it.
2022-04-14 13:31:37 +03:00
Heikki Linnakangas
570db6f168 Update README for Zenith -> Neon renaming.
There's a lot of renaming left to do in the code and docs, but this is
a start. Our binaries and many other things are still called "zenith",
but I didn't change those in the README, because otherwise the
examples won't work. I added a brief note at the top of the README to
explain that we're in the process of renaming, until we've renamed
everything.
2022-04-14 11:30:01 +03:00
Arthur Petukhovsky
cdf04b6a9f Fix control file updates in safekeeper (#1452)
Now control_file::Storage implements Deref for read-only access to the state. All updates should clone the state before modifying and persisting.
2022-04-14 09:31:35 +03:00
Dhammika Pathirana
a0781f229c Add ps compact command
Signed-off-by: Dhammika Pathirana <dhammika@gmail.com>

Add ps compact command to api (#707) (#1484)
2022-04-13 22:47:13 -07:00
Dmitry Rodionov
1d36c5a39e reenable s3 on staging pagservers by default
After deadlockk fix in https://github.com/neondatabase/neon/pull/1496 s3
seems to work normally. There is one more discovered issue but it is not
a blocker so can be fixed separately.
2022-04-13 20:10:39 +03:00
Dmitry Rodionov
49da76237b remove noisy debug log message 2022-04-13 19:50:31 +03:00
Dhammika Pathirana
1fd08107ca Add ps compaction_threshold config
Signed-off-by: Dhammika Pathirana <dhammika@gmail.com>

Add ps compaction_threadhold knob for (#707) (#1484)
2022-04-13 07:42:58 -07:00
Daniil
58d5136a61 compute_tools: check writability handler (#941) 2022-04-13 17:16:25 +03:00
Arthur Petukhovsky
87020f8126 Fix CI staging deploy (#1499)
- Remove stopped safekeeper from inventory
- Fix github pages address after neon rename
2022-04-13 10:59:29 +03:00
Dmitry Rodionov
20414c4b16 defuse possible deadlock in download_timeline too 2022-04-13 10:05:19 +03:00
Dmitry Rodionov
9b7a8e67a4 fix deadlock in upload_timeline_checkpoint
It originated from the fact that we were calling to fetch_full_index
without releasing the read guard, and fetch_full_index tries to acquire
read again. For plain mutex it is already a deeadlock, for RW lock
deadlock was achieved by an attempt to acquire write access later in the
code while still having active read guard up in the stack

This is sort of a bandaid because Kirill plans to change this code
during removal of an archiving mechanism
2022-04-13 10:05:19 +03:00
Dmitry Ivanov
4af87f3d60 [proxy] Add SCRAM auth mechanism implementation (#1050)
* [proxy] Add SCRAM auth

* [proxy] Implement some tests for SCRAM

* Refactoring + test fixes

* Hide SCRAM mechanism behind `#[cfg(test)]`

Currently we only use it in tests, so we hide all relevant
module behind `#[cfg(test)]` to prevent "unused item" warnings.
2022-04-13 03:00:32 +03:00
Alexey Kondratov
0fbe657b2f Fix remote e2e tests after repository rename (#1434)
Also start them after release build instead of debug. It saves 3-5
minutes and we anyway use release mode in Docker images.
2022-04-13 00:02:06 +03:00
Konstantin Knizhnik
07a9553700 Add test for restore from WAL (#1366)
* Add test for restore from WAL

* Fix python formatting

* Choose unused port in wal restore test

* Move recovery tests to zenith_utils/scripts

* Set LD_LIBRARY_PATH in wal recovery scripts

* Fix python test formatting

* Fix mypy warning

* Bump postgres version

* Bump postgres version
2022-04-11 22:30:08 +03:00
Kirill Bulatov
dc7e3ff05a Fix rustc 1.60 clippy warnings 2022-04-11 21:34:04 +03:00
Kirill Bulatov
4f172e7612 Replicate S3 blob metadata in the remote storage 2022-04-11 21:34:04 +03:00
Kirill Bulatov
0e9ee772af Use rusoto in safekeeper 2022-04-11 21:34:04 +03:00
Kirill Bulatov
db63fa64ae Use rusoto lib for S3 relish_storage impl 2022-04-11 21:34:04 +03:00
Arthur Petukhovsky
8e2a6661e9 Make wal_storage initialization eager (#1489) 2022-04-11 20:36:26 +03:00
Heikki Linnakangas
214567bf8f Use B-tree for the index in image and delta layers.
We now use a page cache for those, instead of slurping the whole index into
memory.

Fixes https://github.com/zenithdb/zenith/issues/1356

This is a backwards-incompatible change to the storage format, so
bump STORAGE_FORMAT_VERSION.
2022-04-07 20:58:55 +03:00
Heikki Linnakangas
c4b57e4b8f Move BlobRef
It's not needed in image layers anymore, so move it into delta_layer.rs
2022-04-07 20:58:55 +03:00
Heikki Linnakangas
5d9851f5d1 Refactor the I/O functions.
This introduces two new abstraction layers for I/O:

- Block I/O, and
- Blob I/O.

The BlockReader trait abstracts a file or something else that can be read
in 8kB pages. It is implemented by EphemeralFiles, and by a new
FileBlockReader struct that allows reading arbitrary VirtualFiles in that
manner, utilizing the page cache.

There is also a new BlockCursor struct that works as a cursor over a
BlockReader. When you create a BlockCursor and read the first page using
it, it keeps the reference to the page. If you access the same page again,
it avoids going to page cache and quickly returns the same page again.
That can save a lot of lookups in the page cache if you perform multiple
reads.

The Blob-oriented API allows reading and writing "blobs" of arbitrary
length. It is a layer on top of the block-oriented API. When you write
a blob with the write_blob() function, it writes a length field
followed by the actual data to the underlying block storage, and
returns the offset where the blob was stored. The blob can be
retrieved later using the offset.

Finally, this replaces the I/O code in image-, delta-, and in-memory
layers to use the new abstractions. These replace the 'bookfile'
crate.

This is a backwards-incompatible change to the storage format.
2022-04-07 20:58:54 +03:00
Arthur Petukhovsky
81ba23094e Fix scripts to deploy sk4 on staging (#1476)
Adjust ansible scripts and inventory for sk4 on staging
2022-04-07 20:38:26 +03:00
bojanserafimov
d5258cdc4d [proxy] Don't print passwords (#1298) 2022-04-06 20:05:24 -04:00
Arthur Petukhovsky
6bc78a0e77 Log more info in test_many_timelines asserts (#1473)
It will help to debug #1470 as soon as it happens again
2022-04-07 01:44:26 +03:00
bojanserafimov
6fe443e239 Improve random_writes test (#1469)
If you want to test with a 3GB database by tweaking some constants you'll hit a query timeout. I fix that by batching the inserts.
2022-04-06 18:32:10 -04:00
Alexey Kondratov
d0c246ac3c Update pageserver OpenAPI spec with missing attach/detach methods (#1463)
We have these methods for some time in the API, so mentioning them in the
spec could be useful for console (see zenithdb/console#867), as we generate
pageserver HTTP API golang client there.
2022-04-05 20:01:57 +03:00
Heikki Linnakangas
2f784144fe Avoid deadlock when locking two buffers.
It happened in unit tests. If a thread tries to read a buffer while
already holding a lock on one buffer, the code to find a victim buffer
to evict could try to evict the buffer that's already locked. To fix,
skip locked buffers.
2022-04-04 20:12:31 +03:00
Heikki Linnakangas
222b723354 Handle read errors when dumping a delta layer file.
If a file is corrupt, let's not stop on first read error, but continue
dumping.
2022-04-04 20:12:28 +03:00
Heikki Linnakangas
089ba6abfe Clean up some comments that still referred to 'segments' 2022-04-04 20:12:25 +03:00
Arthur Petukhovsky
a5a478c321 Bump vendor/postgres to store WAL on disk only (#1342)
Now WAL is no longer held in compute memory
2022-04-04 16:32:30 +03:00
Konstantin Knizhnik
fcf613b6e3 Fix unit tests build 2022-04-04 10:43:27 +03:00
Konstantin Knizhnik
572b3f48cf Add compaction_target_size parameter 2022-04-04 10:43:27 +03:00
Konstantin Knizhnik
bef9b837f1 Replace rwlock with mutex in repartition 2022-04-04 10:43:27 +03:00
Konstantin Knizhnik
232fe14297 Refactor partitioning 2022-04-04 10:43:27 +03:00
Konstantin Knizhnik
92031d376a Fix unit tests 2022-04-04 10:43:27 +03:00
Konstantin Knizhnik
1f0b406b63 Perform repartitioning in compaction thread
refer #1441
2022-04-04 10:43:27 +03:00
Kirill Bulatov
4c9447589a Place an info span into gc loop step 2022-04-03 19:30:36 +03:00
Kirill Bulatov
9e5423c867 Assert in a more informative way 2022-04-03 19:30:36 +03:00
Kirill Bulatov
43c16c5145 Don't log ZIds in the timeline load span 2022-04-03 19:30:36 +03:00
bojanserafimov
af712798e7 Fix pageserver readme formatting
I put the diagram in a fixed-width block, since it wasn't rendering correctly on github.
2022-04-02 00:36:54 +03:00
Dmitry Ivanov
f5da652388 [proxy] Enable keepalives for all tcp connections (#1448) 2022-03-31 20:44:57 +03:00
Anastasia Lubennikova
8745b022a9 Extend LayerMap dump() function to print also open_layers and frozen_layers.
Add verbose option to chose if we need to print all layer's keys or not.
2022-03-31 17:26:24 +03:00
Arthur Petukhovsky
a40b7cd516 Fix timeouts in test_restarts_under_load (#1436)
* Enable backpressure in test_restarts_under_load

* Remove hacks because #644 is fixed now

* Adjust config in test_restarts_under_load
2022-03-31 17:00:09 +03:00
Konstantin Knizhnik
1aa8fe43cf Fix race condition in image layer (#1440)
* Fix race condition in image layer

refer #1439

* Add explicit drop(inner) in layer load method

* Add explicit drop(inner) in layer load method
2022-03-31 15:47:59 +03:00
Dmitry Rodionov
649f324fe3 make logging in basebackup more consistent 2022-03-30 17:58:51 +03:00
Dmitry Rodionov
8609234204 decrease the log level to debug because it is too noisy 2022-03-30 10:13:38 +03:00
Anton Shyrabokau
5c5629910f Add a test case for reading historic page versions (#1314)
* Add a test case for reading historic page versions

 Test read_page_at_lsn returns correct results when compared to page inspect.
 Validate possiblity of reading pages from dropped relation.
 Ensure funcitons read latest version when null lsn supplied.
 Check that functions do not poison buffer cache with stale page versions.
2022-03-29 22:13:06 -07:00
Kirill Bulatov
277e41f4b7 Show s3 spans in logs and improve the log messages 2022-03-29 19:21:31 +03:00
Arthur Petukhovsky
ce0243bc12 Add metric for last_record_lsn (#1430) 2022-03-29 18:54:24 +03:00
Arseny Sher
ec3bc74165 Add safekeeper information exchange through etcd.
Safekeers now publish to and pull from etcd per-timeline data. Immediate goal is
WAL truncation, for which every safekeeper must know remote_consistent_lsn; the
next would be callmemaybe replacement.

Adds corresponding '--broker' argument to safekeeper and ability to run etcd in
tests.

Adds test checking remote_consistent_lsn is indeed communicated.
2022-03-29 18:16:49 +04:00
Dmitry Rodionov
9594362f74 change python cache version to 2 (fixes python cache in circle CI) 2022-03-29 10:42:30 +03:00
Dmitry Rodionov
eee0f51e0c use cargo-hakari to manage workspace_hack crate
workspace_hack is needed to avoid recompilation when different crates
inside the workspace depend on the same packages but with different
features being enabled. Problem occurs when you build crates separately
one by one. So this is irrelevant to our CI setup because there we build
all binaries at once, but it may be relevant for local development.

this also changes cargo's resolver version to 2
2022-03-29 10:42:04 +03:00
Arthur Petukhovsky
fd78110c2b Add default statement_timeout for tests (#1423) 2022-03-29 09:57:00 +03:00
Anton Shyrabokau
be6a6958e2 CI: rebuild postgres when Makefile changes (#1429) 2022-03-28 18:19:20 -07:00
Kirill Bulatov
0e44887929 Show more S3 logs and less verbove WAL logs 2022-03-29 00:36:06 +03:00
Dhammika Pathirana
1aa57fc262 Fix tone down compact log chatter
Signed-off-by: Dhammika Pathirana <dhammika@gmail.com>
2022-03-28 13:24:13 -07:00
Alexey Kondratov
9a4f0930c0 Turn off S3 for pageserver on staging 2022-03-28 14:14:17 -05:00
Alexey Kondratov
d88f8b4a7e Fix storage deploy condition in ansible playbook 2022-03-28 13:30:40 -05:00
Arthur Petukhovsky
8a901de52a Refactor control file update at safekeeper.
Record global_commit_lsn, have common routine for control file update, add
SafekeeperMemstate.
2022-03-28 21:52:12 +04:00
Alexey Kondratov
a883202495 Enable S3 for pageserver on staging
Follow-up for #1417. Previously we had a problem uploading to S3
due to huge ammount of existing not yet uploaded data. Now we have a
fresh pageserver with LSM storage on staging, so we can try enabling it
once again.
2022-03-28 12:04:40 -05:00
Arseny Sher
780b46ad27 Bump vendor/postgres to fix commit_lsn going backwards. 2022-03-28 20:37:33 +04:00
Arseny Sher
75002adc14 Make shared_buffers large in test_pageserver_catchup.
We intentionally write while pageserver is down, so we shouldn't query it.

Noticed by @petuhovskiy at
https://github.com/zenithdb/postgres/pull/141#issuecomment-1080261700
2022-03-28 20:34:06 +04:00
Heikki Linnakangas
07342f7519 Major storage format rewrite.
This is a backwards-incompatible change. The new pageserver cannot
read repositories created with an old pageserver binary, or vice
versa.

Simplify Repository to a value-store
------------------------------------

Move the responsibility of tracking relation metadata, like which
relations exist and what are their sizes, from Repository to a new
module, pgdatadir_mapping.rs. The interface to Repository is now a
simple key-value PUT/GET operations.

It's still not any old key-value store though. A Repository is still
responsible from handling branching, and every GET operation comes
with an LSN.

Mapping from Postgres data directory to keys/values
---------------------------------------------------

All the data is now stored in the key-value store. The
'pgdatadir_mapping.rs' module handles mapping from PostgreSQL objects
like relation pages and SLRUs, to key-value pairs.

The key to the Repository key-value store is a Key struct, which
consists of a few integer fields. It's wide enough to store a full
RelFileNode, fork and block number, and to distinguish those from
metadata keys.

'pgdatadir_mapping.rs' is also responsible for maintaining a
"partitioning" of the keyspace. Partitioning means splitting the
keyspace so that each partition holds a roughly equal number of keys.
The partitioning is used when new image layer files are created, so
that each image layer file is roughly the same size.

The partitioning is also responsible for reclaiming space used by
deleted keys. The Repository implementation doesn't have any explicit
support for deleting keys. Instead, the deleted keys are simply
omitted from the partitioning, and when a new image layer is created,
the omitted keys are not copied over to the new image layer. We might
want to implement tombstone keys in the future, to reclaim space
faster, but this will work for now.

Changes to low-level layer file code
------------------------------------

The concept of a "segment" is gone. Each layer file can now store an
arbitrary range of Keys.

Checkpointing, compaction
-------------------------

The background tasks are somewhat different now. Whenever
checkpoint_distance is reached, the WAL receiver thread "freezes" the
current in-memory layer, and creates a new one. This is a quick
operation and doesn't perform any I/O yet. It then launches a
background "layer flushing thread" to write the frozen layer to disk,
as a new L0 delta layer. This mechanism takes care of durability. It
replaces the checkpointing thread.

Compaction is a new background operation that takes a bunch of L0
delta layers, and reshuffles the data in them. It runs in a separate
compaction thread.

Deployment
----------

This also contains changes to the ansible scripts that enable having
multiple different pageservers running at the same time in the staging
environment. We will use that to keep an old version of the pageserver
running, for clusters created with the old version, at the same time
with a new pageserver with the new binary.

Author: Heikki Linnakangas
Author: Konstantin Knizhnik <knizhnik@zenith.tech>
Author: Andrey Taranik <andrey@zenith.tech>
Reviewed-by: Matthias Van De Meent <matthias@zenith.tech>
Reviewed-by: Bojan Serafimov <bojan@zenith.tech>
Reviewed-by: Konstantin Knizhnik <knizhnik@zenith.tech>
Reviewed-by: Anton Shyrabokau <antons@zenith.tech>
Reviewed-by: Dhammika Pathirana <dham@zenith.tech>
Reviewed-by: Kirill Bulatov <kirill@zenith.tech>
Reviewed-by: Anastasia Lubennikova <anastasia@zenith.tech>
Reviewed-by: Alexey Kondratov <alexey@zenith.tech>
2022-03-28 05:41:15 -05:00
Kirill Bulatov
55de0b88f5 Hide remote timeline index access details 2022-03-28 12:36:01 +03:00
Kirill Bulatov
d56a0ee19a Avoid recompiling tests for release profile 2022-03-26 08:38:45 +02:00
Kirill Bulatov
18dfc769d8 Use cachepot to cache more rustc builds 2022-03-26 08:38:45 +02:00
Heikki Linnakangas
5e04dad360 Add more variants of the sequential scan performance tests.
More rows, and test with serial and parallel plans. But fewer iterations,
so that the tests run in < 1 minutes, and we don't need to mark them as
"slow".
2022-03-25 23:42:13 +02:00
Dmitry Rodionov
b8cba059a5 temporary disable s3 integration on staging until LSM storge rewrite lands 2022-03-26 00:19:25 +04:00
Heikki Linnakangas
e3fa00972e Use RwLocks in image and delta layers for more concurrency.
With a Mutex, only one thread could read from the layer at a time. I did
some ad hoc profiling with pgbench and saw that a fair amout of time was
spent blocked on these Mutexes.
2022-03-25 15:34:38 +02:00
Kirill Bulatov
b39d1b1717 Exit only on important thread failures 2022-03-25 11:58:54 +02:00
Kirill Bulatov
28bc8e3f5c Log pageserver threads better and shut down on errors in them 2022-03-25 11:58:54 +02:00
Kirill Bulatov
6244fd9e7e Better error messages on zenith cli subcommand invocations 2022-03-25 11:58:54 +02:00
Kirill Bulatov
f6b1d76c30 Replace assert! with ensure! for anyhow::Result functions 2022-03-25 11:58:54 +02:00
Kirill Bulatov
edc7bebcb5 Remove obvious panic sources 2022-03-25 11:58:54 +02:00
Kirill Bulatov
a201d33edc Properly print cachepot stats 2022-03-24 21:11:02 +02:00
Heikki Linnakangas
825d363170 Remove some unnecessary Ord etc. trait implementations.
It doesn't make much sense to compare TimelineMetadata structs with
< or >. But we depended on that in the remote storage upload code,
so replace BTreeSets with Vecs there.
2022-03-24 12:20:06 +02:00
Dmitry Rodionov
b9a1a75b0d clean up unused imports in python tests 2022-03-24 12:47:22 +04:00
Dmitry Rodionov
d3a9cb44a6 tweak timeouts for tenant relocation test 2022-03-24 12:47:22 +04:00
Heikki Linnakangas
c718870517 Tiny refactoring of page_cache::init function.
The init function only needs the 'page_cache_size' from the config, so
seems slightly nicer to pass just that.
2022-03-24 09:46:07 +02:00
Dmitry Rodionov
8437fc056e some follow ups after s3 integration was enabled on staging
* do not error out when upload file list is empty
* ignore ephemeral files during sync initialization
2022-03-23 23:35:36 +04:00
Dmitry Rodionov
8b8d78a3a0 use main branch of our bookfile crate 2022-03-23 22:05:43 +04:00
Dmitry Rodionov
8a86276a6e add more context to error 2022-03-23 18:38:15 +04:00
Dmitry Rodionov
0be7ed0cb5 decrease log message severity for timeline checkpoint internals 2022-03-23 18:20:43 +04:00
Dmitry Rodionov
e80ae4306a change log level from info to debug for timeline gc messages 2022-03-23 18:20:43 +04:00
Heikki Linnakangas
123fcd5d0d Revert accidental bump of vendor/postgres submodule
I accidentally bumped it in commit 3b069f5aef. It didn't seem to cause
any harm, but it was not intentional.
2022-03-23 15:45:29 +02:00
Kirill Bulatov
15434ba7e0 Show cachepot build stats 2022-03-23 14:12:59 +02:00
Andrey Taranik
a4d0d78e9e s3 settings for pageserver (#1388) 2022-03-23 13:39:55 +03:00
Dmitry Rodionov
e13bdd77fe add safekepeers gossip annd storage messaging rfcs
they were in prs during rfc repo import

in addition to just import I've added sequence diagrams to storage
messaging rfc
2022-03-22 15:01:26 +04:00
Kirill Bulatov
bd6bef468c Provide single list timelines HTTP API handle 2022-03-21 13:42:21 +02:00
Kirill Bulatov
77ed2a0fa0 Run GitHub testing workflow on every push 2022-03-21 12:46:33 +02:00
Kirill Bulatov
37ebbb598d Add a macOs build 2022-03-21 12:46:33 +02:00
Kirill Bulatov
063f9ba81d Use serde_with to (de)serialize ZId and Lsn to hex 2022-03-21 12:46:07 +02:00
Heikki Linnakangas
3b069f5aef Fix name of directory used in unit test.
There's another test called 'timeline_load'. If the two tests run in
parallel, they would conflict and fail.
2022-03-18 21:27:48 +02:00
Dmitry Rodionov
b19870cd88 guard against partial uploads to local storage 2022-03-18 18:14:57 +03:00
Dmitry Rodionov
7738254f83 refactor timeline memory state management 2022-03-18 18:14:57 +03:00
Dmitry Ivanov
a7544eead5 Remove the last non-borrowed string from BeMessage (#1376) 2022-03-17 16:46:58 +03:00
Andrey Taranik
ab124c161b Merge branch 'release' into main 2022-03-17 00:05:51 +03:00
Andrey Taranik
1fddb0556f deploy playbook fix - interaction with console (#1374) 2022-03-17 00:01:17 +03:00
Andrey Taranik
15a2a2bf04 release 2202-03-16 (#1373)
production deploy
2022-03-16 23:00:01 +03:00
Dmitry Ivanov
705f51db27 [proxy] Propagate some errors to user (#1329)
* [proxy] Propagate most errors to user

This change enables propagation of most errors to the user
(e.g. auth and connectivity errors). Some of them will be
stripped of sensitive information.

As a side effect, most occurrences of `anyhow::Error` were
replaced with concrete error types.

* [proxy] Box weighty errors
2022-03-16 21:20:04 +03:00
Heikki Linnakangas
9c1a9a1d9f Update Cargo.lock for new dependencies (#1354)
Commit b2ad8342d2 added dependency on 'criterion', which pulled along
some other crates.
2022-03-14 21:06:25 +03:00
Arseny Sher
d5a96d3d50 Fix finding end of WAL on safekeepers after f86cf93435.
That commit dropped wal_start_lsn, now we're looking since commit_lsn, which is
the real end of WAL if no records follow it.

ref #1351
2022-03-14 18:54:59 +03:00
Heikki Linnakangas
d93fc371f3 Import all existing RFCs documents from the separate 'rfcs' repository. 2022-03-11 18:49:36 +02:00
Dhammika Pathirana
5d7bd8643a Fix page reconstruct time histo
Signed-off-by: Dhammika Pathirana <dhammika@gmail.com>
2022-03-10 14:42:28 -08:00
Dhammika Pathirana
a8a7dc9ca6 Fix zid encoding
Signed-off-by: Dhammika Pathirana <dhammika@gmail.com>
2022-03-10 14:42:28 -08:00
Dhammika Pathirana
b2ad8342d2 Add zid stringify bench test
Signed-off-by: Dhammika Pathirana <dhammika@gmail.com>
2022-03-10 14:42:28 -08:00
Dhammika Pathirana
27dadba52c Fix retain references to layer histograms
Signed-off-by: Dhammika Pathirana <dhammika@gmail.com>
2022-03-10 14:42:28 -08:00
Dhammika Pathirana
f67d010d1b Add ps smgr/storage metrics tenant tags
Signed-off-by: Dhammika Pathirana <dhammika@gmail.com>

Add tenant_id,timeline_id in smgr/storage metrics (#1234)
2022-03-10 14:42:28 -08:00
Kirill Bulatov
093ad8ab59 Send 409 HTTP responses on timeline and tenant creation for existing entity 2022-03-10 19:38:58 +02:00
Kirill Bulatov
c51d545fd9 Serialize Lsn as strings in http api 2022-03-10 19:38:58 +02:00
Kirill Bulatov
fe6fccfdae Allow already existing repo when creating a tenant 2022-03-10 19:38:58 +02:00
Kirill Bulatov
dd74c66ef0 Do not create timeline along with tenant 2022-03-10 19:38:58 +02:00
Kirill Bulatov
a5e10c4f64 Tidy up pageserver's endpoints 2022-03-10 19:38:58 +02:00
Kirill Bulatov
7b5482bac0 Properly store the branch name mappings 2022-03-10 19:38:58 +02:00
Kirill Bulatov
c7569dce47 Allow passing initial timeline id into zenith CLI commands 2022-03-10 19:38:58 +02:00
Kirill Bulatov
4d0f7fd1e4 Update Zenith CLI config between runs 2022-03-10 19:38:58 +02:00
Kirill Bulatov
f49990ed43 Allow creating timelines by branching off ancestors 2022-03-10 19:38:58 +02:00
Kirill Bulatov
0c91091c63 Avoid point in time concept on pageserver level 2022-03-10 19:38:58 +02:00
Kirill Bulatov
10f811e886 Use timeline instead of branch in pageserver's API 2022-03-10 19:38:58 +02:00
Anastasia Lubennikova
2883a25761 Bump vendor/postgres to use local relation cache for smgr_exists 2022-03-10 17:36:09 +04:00
anastasia
87f306c516 Tune backpressure in python tests to make them more stable 2022-03-10 17:36:09 +04:00
anastasia
5b34afe893 Bump vendor/postgres to use local relation cache for smgr_exists 2022-03-10 17:36:09 +04:00
bojanserafimov
15b19a0a57 [proxy] Test connstr options (#1344)
* Add proxy test
* Fix typo
2022-03-09 22:47:06 +03:00
Andrey Taranik
934bbcba0f revert docker build to debian:buster based rust (#1347)
* dockerfile fix, rust cache in docker build flow

* check rust cachepot

* another check rust cachepot

* cleanup

* revert docker build to debian:buster based rust to avoid libc6 version mismatch
2022-03-09 10:13:46 +03:00
Andrey Taranik
cffac59a41 Docker improvement (#1345)
* dockerfile fix, rust cache in docker build flow

* check rust cachepot

* another check rust cachepot

* cleanup
2022-03-08 23:19:49 +03:00
Arseny Sher
8e37d345a8 Adjust safekeeper detailed logging to batch fsyncing. 2022-03-08 08:07:00 +03:00
Arseny Sher
f86cf93435 Refactor timeline creation on safekeepers, allowing storing peer ids.
Have separate routine and http endpoint to create timeline on safekeepers. It is
not used yet, i.e. timeline is still created implicitly, but we'll change that
once infrastructure for learning which tlis are assigned to which safekeepers
will be ready, preventing accidental creation by compute.

Changes format of safekeeper control file, allowing to store set of
peers. Knowing peers provides a part of foundation for peer
recovery (calculating min horizons like truncate_lsn for WAL truncation and
commit_lsn for sync-safekeepers replacement) and proper membership change;
similarly, we don't yet use it for now.

Employing cf file version bump, extracts tenant_id and timeline_id to top level
where it is more suitable. Also adds a bunch of LSNs there and rename
truncate_lsn to more specific peer_horizon_lsn.
2022-03-06 08:06:38 +03:00
Kirill Bulatov
66eb2a1dd3 Replace zenith/build build image with zimg/* ones 2022-03-04 13:46:44 +02:00
Kirill Bulatov
9424bfae22 Use a separate newtype for ZId that (de)serialize as hex strings 2022-03-04 10:58:40 +02:00
Dmitry Rodionov
1d90b1b205 add node id to pageserver (#1310)
* Add --id argument to safekeeper setting its unique u64 id.

In preparation for storage node messaging. IDs are supposed to be monotonically
assigned by the console. In tests it is issued by ZenithEnv; at the zenith cli
level and fixtures, string name is completely replaced by integer id. Example
TOML configs are adjusted accordingly.

Sequential ids are chosen over Zid mainly because they are compact and easy to
type/remember.

* add node id to pageserver

This adds node id parameter to pageserver configuration. Also I use a
simple builder to construct pageserver config struct to avoid setting
node id to some temporary invalid value. Some of the changes in test
fixtures are needed to split init and start operations for envrionment.

Co-authored-by: Arseny Sher <sher-ars@yandex.ru>
2022-03-04 01:10:42 +03:00
Kirill Bulatov
949f8b4633 Fix 1.59 rustc clippy warnings 2022-03-02 21:35:34 +02:00
Andrey Taranik
a0f9a0d350 safekeeper to cosnole call fix (#1333) (#1334) 2022-02-27 01:52:33 +03:00
Andrey Taranik
26a68612d9 safekeeper to cosnole call fix (#1333) 2022-02-27 01:36:40 +03:00
Andrey Taranik
850dfd02df Release deployment (#1331)
* new deployment flow for staging and production

* ansible playbooks and circleci config fixes

* cleanup before merge

* additional cleanup before merge

* debug deployment to staging env

* debug deployment to staging env

* debug deployment to staging env

* debug deployment to staging env

* debug deployment to staging env

* debug deployment to staging env

* bianries artifacts path fix for ansible playbooks

* deployment flow refactored

* base64 decode fix for ssh key

* fix for console notification and production deploy settings

* cleanup after deployment tests

* fix - trigger release binaries download for production deploy
2022-02-26 23:33:16 +03:00
Arthur Petukhovsky
c8a1192b53 Optimize WAL storage in safekeeper (#1318)
When several AppendRequest's can be read from socket without blocking,
they are processed together and fsync() to segment file is only called
once. Segment file is no longer opened for every write request, now
last opened file is cached inside PhysicalStorage. New metric for WAL
flushes was added to the storage, FLUSH_WAL_SECONDS. More errors were
added to storage for non-sequential WAL writes, now write_lsn can be
moved only with calls to truncate_lsn(new_lsn).

New messages have been added to ProposerAcceptorMessage enum. They
can't be deserialized directly and now are used only for optimizing
flushes. Existing protocol wasn't changed and flush will be called for
every AppendRequest, as it was before.
2022-02-25 18:52:21 +03:00
bojanserafimov
137d616e76 [proxy] Add pytest fixture (#1311) 2022-02-24 11:20:07 -05:00
Kirill Bulatov
917c640818 Fix mypy for the new Python 2022-02-24 14:24:36 +03:00
anastasia
c1b3836df1 Bump vendor/postgres 2022-02-24 12:52:12 +03:00
Heikki Linnakangas
5120ba4b5f Refactor the interface for using cached page image.
Instead of passing it as a separate argument to get_page_reconstruct_data,
the caller can fill it in the PageReconstructData struct.
2022-02-24 10:37:12 +02:00
Heikki Linnakangas
e4670a5f1e Remove the PageVersions abstraction.
Since commit fdd987c3ad, it was only used in InMemoryLayers. Let's
just "inline" the code into InMemoryLayer itself.

I originally did this as part of a bigger PR (#1267). With that PR,
one in-memory layer, and one ephemeral file, would hold page versions
belonging to multiple segments. Currently, PageVersions can only hold
versions for a single segment, so that would need to be changed.
Rather than modify PageVersions to support that, just remove it
altogether.
2022-02-23 21:04:39 +02:00
Heikki Linnakangas
7fae894648 Move a few unit tests specific to layered file format.
These tests have intimate knowledge of the directory layeout and layer
file names used by the LayeredRepository implementation of the
Repository trait. Move them, so that all the tests that remain in
repository.rs are expected to work without changes with any
implementation of Repository. Not that we have any plans to create
another Repository implementaiton any time soon, but as long as we
have the Repository interface, let's try to maintain that abstraction
in the tests too.
2022-02-23 20:32:06 +02:00
Stas Kelvich
058123f7ef Bump postgres to fix zenith_test_utils linkage on macOS. 2022-02-23 20:33:47 +03:00
anastasia
87edbd38c7 Add 'wait_lsn_timeout' and 'wal_redo_timeout' pageserver config options instead of hardcoded defaults 2022-02-23 19:59:35 +03:00
anastasia
58ee5d005f Add --pageserver-config-override to ZenithEnvBuilder to tune checkpointer and GC in tests.
Usage example:
zenith_env_builder.pageserver_config_override = "checkpoint_period = '100 s'; checkpoint_distance = 1073741824"
2022-02-23 19:59:35 +03:00
Heikki Linnakangas
468366a28f Fix wrong 'lsn' stored in test page image
The test creates a page version with a string like "foo 123 at 0/10"
as the content. But the LSN stored in that string was wrong: the page
version stored at LSN 0/20 would say "foo <blk> at 0/10".
2022-02-23 11:33:17 +02:00
Dhammika Pathirana
b815f5fb9f Add no_sync check in storage
Signed-off-by: Dhammika Pathirana <dhammika@gmail.com>
2022-02-22 12:01:12 -08:00
anastasia
74a0942a77 Fix zenith feedback processing at compute node.
Add test for backpressure
2022-02-22 13:56:21 +03:00
anastasia
1a4682a04a Add 'walreceiver-after-ingest' failpoint. Use sleep at this point to imitate slow walreceiver. 2022-02-22 13:56:21 +03:00
Heikki Linnakangas
993b544ad0 Change default parameters for back pressure
Fixes issue #1238 and #1189. Extracted from PR #1194, with some comment
editorialization by me.

Author: Konstantin Knizhnik <knizhnik@zenith.tech>
2022-02-22 13:56:21 +03:00
Arthur Petukhovsky
dba1d36a4a Refactor WAL utils in safekeeper (#1290)
wal_storage.rs was split up from timeline.rs, safekeeper.rs and send_wal.rs,
and now contains all WAL related code from the safekeeper. Now there are
PhysicalStorage for persisting WAL to disk and WalReader for reading it.
This allows optimizing PhysicalStorage without affecting too much of other
code.

Also there is a separate structure for persisting control file now in
control_file.rs.
2022-02-21 17:20:53 +03:00
Bojan Serafimov
ca81a550ef Fmt 2022-02-21 16:43:28 +03:00
Bojan Serafimov
65a0b2736b Add static router 2022-02-21 16:43:28 +03:00
Bojan Serafimov
cca886682b Undo cplane change 2022-02-21 16:43:28 +03:00
Bojan Serafimov
c8f47cd38e Fix param name 2022-02-21 16:43:28 +03:00
Bojan Serafimov
92787159f7 Add client auth method option 2022-02-21 16:43:28 +03:00
anastasia
abb422d5de Fix SafekeeperMetrics parsing in python tests 2022-02-21 13:45:22 +03:00
bojanserafimov
fdc15de8b2 Add perf test: test_random_writes (#1292) 2022-02-18 15:46:29 -05:00
Stas Kelvich
207286f2b8 Actualize branching parts of openapi spec.
Previous version of spec caused parsing errors in generated clients
as return type is object not array, also one field was missing. In
a passing set `format: hex` on ancestor_id too as value conforms to
that format.
2022-02-18 20:22:21 +02:00
Dhammika Pathirana
d2b896381a Add safekeeper tenant tags in lsn/wal metrics
Signed-off-by: Dhammika Pathirana <dhammika@gmail.com>

Add tenant_id in lsn/wal metrics (#1234)
2022-02-18 08:26:37 -08:00
Dhammika Pathirana
009f6d4ae8 Fix safekeeper metric tags
Signed-off-by: Dhammika Pathirana <dhammika@gmail.com>

Use separate tags in sk storage file histo (#1234)
2022-02-18 08:26:37 -08:00
Kirill Bulatov
1b31379456 Log postgres errors with ERROR level 2022-02-17 13:42:09 +02:00
Bojan Serafimov
4c64b10aec Revert removal of ignore hint 2022-02-17 13:41:49 +02:00
Bojan Serafimov
ad262a46ad Remove redundant pytest_plugins assignment 2022-02-17 13:41:49 +02:00
Kirill Bulatov
ce533835e5 Use uuid.UUID types for tenants and timelines more 2022-02-17 13:41:19 +02:00
Kirill Bulatov
e5bf520b18 Use types in zenith cli invocations in Python tests 2022-02-17 13:41:19 +02:00
Dmitry Rodionov
9512e21b9e fix python formatting 2022-02-17 13:22:14 +03:00
Dmitry Ivanov
a26d565282 [proxy] Replace private static map with a public CancelMap
This is a cleaner approach which might facilitate testing.
2022-02-17 11:54:27 +03:00
Dmitry Ivanov
a47dade622 [proxy] Migrate to async
This change makes most parts of the code asynchronous, except
for the `mgmt` subsystem (we're going to drop it anyway).

Co-authored-by: bojanserafimov <bojan.serafimov7@gmail.com>
2022-02-17 11:54:27 +03:00
Dmitry Rodionov
9cce430430 remove several obsolete management api commands from pageserver's libpq
api

these commands are now available via http api
2022-02-17 11:26:28 +03:00
Dhammika Pathirana
4bf4bacf01 Add cli start/stop test
Signed-off-by: Dhammika Pathirana <dhammika@gmail.com>

Add a test for #1260
2022-02-16 13:19:12 -08:00
bojanserafimov
335abfcc28 Add slow seqscan perf test (#1283) 2022-02-16 10:59:51 -05:00
bojanserafimov
afb3342e46 Add vanilla pg baseline tests (#1275) 2022-02-15 13:44:22 -05:00
Kirill Bulatov
5563ff123f Reuse tenant-timeline id struct from utils 2022-02-15 17:45:23 +02:00
Dhammika Pathirana
0a557b2fa9 Add cli v4 loopback listener ports test
Signed-off-by: Dhammika Pathirana <dhammika@gmail.com>

Add a test for #1247
2022-02-15 17:01:22 +02:00
Heikki Linnakangas
9632c352ab Avoid having multiple records for the same page and LSN.
If a heap UPDATE record modified two pages, and both pages needed to have
their VM bits cleared, and the VM bits were located on the same VM page,
we would emit two ZenithWalRecord::ClearVisibilityMapFlags records for
the same VM page. That produced warnings like this in the pageserver log:

    Page version Wal(ClearVisibilityMapFlags { heap_blkno: 18, flags: 3 }) of rel 1663/13949/2619_vm blk 0 at 2A/346046A0 already exists

To fix, change ClearVisibilityMapFlags so that it can update the bits
for both pages as one operation.

This was already covered by several python tests, so no need to add a
new one. Fixes #1125.

Co-authored-by: Konstantin Knizhnik <knizhnik@zenith.tech>
2022-02-15 14:26:16 +02:00
Arseny Sher
328e3b4189 bump vendor/postgres to fix compiler warnings 2022-02-15 06:51:16 +03:00
Arseny Sher
47f6a1f9a8 Add -Werror to CI builds. 2022-02-15 06:51:16 +03:00
Dmitry Rodionov
a4829712f4 merge directories in git-upload instead of removing existing files for perf test result uploads 2022-02-15 03:47:06 +03:00
Arseny Sher
d4d26f619d bump vendor/postgres to fix compilation warning 2022-02-14 21:00:11 +03:00
Arseny Sher
36481f3374 bump vendor/postgres to init pgxactoff in walproposer
ref #1244
2022-02-14 15:57:38 +03:00
Dhammika Pathirana
d951dd8977 Fix cli start (#1260)
Signed-off-by: Dhammika Pathirana <dhammika@gmail.com>
2022-02-10 18:36:02 -05:00
bojanserafimov
ea13838be7 Add pgbench baseline test (#1204)
Co-authored-by: Heikki Linnakangas <heikki.linnakangas@iki.fi>
2022-02-10 15:33:36 -05:00
Dmitry Rodionov
b51f23cdf0 pass perf test cluster connstr to circle ci jobs 2022-02-10 17:49:54 +03:00
Kirill Bulatov
3cfcdb92ed Fix tokio features in zenith utils to enable its standalone compilation 2022-02-10 08:33:22 -05:00
Kirill Bulatov
d7af965982 Do not leak decoding_key in JwtAuth's Debug representation 2022-02-10 08:33:22 -05:00
Kirill Bulatov
7c1c7702d2 Code review fixes 2022-02-10 08:33:22 -05:00
Kirill Bulatov
6eef401602 Move routerify behind zenith_utils 2022-02-10 08:33:22 -05:00
Kirill Bulatov
c5b5905ed3 Remove parking_lot dependency from workspace 2022-02-10 08:33:22 -05:00
Kirill Bulatov
76b74349cb Bump pageserver dependencies 2022-02-10 08:33:22 -05:00
Dmitry Rodionov
b08e340f60 point perf results back from testing to master 2022-02-10 14:18:34 +03:00
Dmitry Rodionov
a25fa29bc9 modify git-upload for generate_and_push_perf_report.sh needs 2022-02-10 13:12:19 +03:00
Dmitry Rodionov
ccf3c8cc30 store performance test results in our staging cluster to be able to
visualize them in grafana
2022-02-10 13:12:19 +03:00
Heikki Linnakangas
c45ee13b4e Bump vendor/postgres, to fix memory leak.
See https://github.com/zenithdb/postgres/pull/129
2022-02-10 11:29:38 +02:00
anastasia
f1e7db9d0d Bump vendor/postgres rebased to 14.2 2022-02-10 11:19:10 +03:00
Heikki Linnakangas
fa8a6c0e94 Reduce logging of walkeeper normal operations.
It was printing a lot of stuff to the log with INFO level, for routine
things like receiving or sending messages. Reduce the noise. The amount
of logging was excessive, and it was also consuming a fair amount of CPU
(about 20% of safekeeper's CPU usage in a little test I ran).
2022-02-10 08:34:30 +02:00
Dhammika Pathirana
1e8ca497e0 Fix safekeeper loopback addr (#1247)
Signed-off-by: Dhammika Pathirana <dhammika@gmail.com>
2022-02-10 09:23:53 +03:00
Heikki Linnakangas
a504cc87ab Bump vendor/postgres for "Make getpage requests interruptible"
See https://github.com/zenithdb/zenith/issues/1224
2022-02-09 16:13:46 +02:00
Heikki Linnakangas
5268bbc840 Bump vendor/postgres for fixes to cluster size limit.
See https://github.com/zenithdb/postgres/pull/126
2022-02-09 15:52:21 +02:00
Arseny Sher
e1d770939b Bump vendor/postgres to fix recent CI failure.
See zenithdb/postgres#127
2022-02-09 08:50:45 -05:00
Egor Suvorov
2866a9e82e Fix safekeeper LSN metrics (#1216)
* Always initialize flush_lsn/commit_lsn metrics on a specific timeline, no more `n/a`
* Update flush_lsn metrics missing from cba4da3f4d
* Ensure that flush_lsn found on load is >= than both commit_lsn and truncate_lsn
* Add some debug logging
2022-02-07 20:05:16 +03:00
Kirill Bulatov
b67cddb303 Implement EphemeralFile flush in a least dangerous way 2022-02-05 22:02:59 -05:00
anastasia
cb1d84d980 Make test_timeline_size_quota more deterministic 2022-02-06 02:16:36 +03:00
anastasia
642797b69e Implement cluster size quota for zenith compute node.
Use GUC zenith.max_cluster_size to set the limit.

If limit is reached, extend requests will throw out-of-space error.
When current size is too close to the limit - throw a warning.

Add new test: test_timeline_size_quota.
2022-02-06 02:16:36 +03:00
Kirill Bulatov
3ed156a5b6 Add a CLI tool to manipulate remote storage blob files 2022-02-05 15:48:08 -05:00
Heikki Linnakangas
2d93b129a0 Avoid eprintln() in pageserver and walkeeper.
Use log::error!() instead. I spotted a few of these "connection error"
lines in the logs, without timestamps and the other stuff we print for
all other log messages.
2022-02-05 17:59:31 +02:00
Arseny Sher
32c7859659 bump vendor/postgres 2022-02-05 01:27:31 +03:00
Arseny Sher
729ac38ea8 Centralize suspending/resuming timeline activity on safekeepers.
Timeline is active whenever there is at least 1 connection from compute or
pageserver is not caught up. Currently 'active' means callmemaybes are being
sent.

Fixes race: now suspend condition checking and callmemaybe unsubscribe happen
under the same lock.
2022-02-03 02:34:10 +03:00
Andrey Taranik
d69b0539ba proxy chart staging values update for labels (#1202) 2022-02-01 13:31:05 +03:00
Dmitry Ivanov
ec78babad2 Use mold instead of default linker 2022-01-28 20:40:50 +03:00
Dmitry Ivanov
9350dfb215 [CI] Merge *.profraw files prior to uploading workspace
Hopefully, this will make CI pipeline a bit faster.
2022-01-28 19:56:28 +03:00
Dmitry Ivanov
8ac8be5206 [scripts/coverage] Implement merge command
This will drastically decrease the size of CI workspace uploads.
2022-01-28 19:56:28 +03:00
Dmitry Ivanov
c2927353a5 Enable async deserialization of FeMessage
Now it's possible to call Fe{Startup,}Message in both
sync and async contexts, which is good for proxy.

Co-authored-by: bojanserafimov <bojan.serafimov7@gmail.com>
2022-01-28 19:40:37 +03:00
Kirill Bulatov
33251a9d8f Disable failing remote storage tests for now 2022-01-28 18:35:46 +03:00
Konstantin Knizhnik
c045ae7a9b Fix random range for keys in test_gc_aggressive.py (#1199) 2022-01-28 16:29:55 +03:00
Dmitry Rodionov
602ccb7d5f distinguish failures for pre-initdb lsn and pre-ancestor lsn branching in test_branch_behind 2022-01-28 12:31:15 +03:00
Dmitry Rodionov
5df21e1058 remove Timeline::start_lsn in favor of ancestor_lsn 2022-01-28 12:31:15 +03:00
Konstantin Knizhnik
08135910a5 Fix checkpoint.nextXid update (#1166)
* Fix checkpoint.nextXid update

* Add test for cehckpoint.nextXid

* Fix indentation of test_next_xid.py

* Fix mypy error in test_next_xid.py

* Tidy up the test case.

* Add a unit test

Co-authored-by: Heikki Linnakangas <heikki@zenith.tech>
2022-01-27 18:21:51 +03:00
Konstantin Knizhnik
f58a22d07e Freeze layers at the same end LSN (#1182)
* Freeze vectors at the same end LSN

* Fix calculation of last LSN for inmem layer

* Do not advance disk_consistent_lsn is no open layer was evicted

* Fix calculation of freeze_end_lsn

* Let start_lsn be larger than oldest_pending_lsn

* Rename 'oldest_pending_lsn' and 'last_lsn', add comments.

* Fix future_layerfiles test

* Update comments conserning olest_lsn

* Update comments conserning olest_lsn

Co-authored-by: Heikki Linnakangas <heikki@zenith.tech>
2022-01-27 18:21:00 +03:00
Arthur Petukhovsky
cedde559b8 Add test for replacement of the failed safekeeper (#1179)
* Add test to replace failed safekeeper

* Restart safekeepers in test_replace_safekeeper

* Update vendor/postgres
2022-01-27 17:26:55 +03:00
Arthur Petukhovsky
49d1d1ddf9 Don't call adjust_for_wal_acceptors after pg create (#1178)
Now zenith_cli handles wal_acceptors config internally, and if we
will append wal_acceptors to postgresql.conf in python tests, then
it will contain duplicate wal_acceptors config.
2022-01-27 17:23:14 +03:00
Arseny Sher
86045ac36c Prefix per-cluster directory with ztenant_id in safekeeper.
Currently ztimelineids are unique, but all APIs accept the pair, so let's keep
it everywhere for uniformity.

Carry around ZTTId containing both ZTenantId and ZTimelineId for simplicity.

(existing clusters on staging ought to be preprocessed for that)
2022-01-27 17:22:07 +03:00
Konstantin Knizhnik
79f0e44a20 Gc cutoff rwlock (#1139)
* Reproduce github issue #1047.

* Use RwLock to protect gc_cuttof_lsn

* Eeduce number of updates in test_gc_aggressive

* Change  test_prohibit_get_page_at_lsn_for_garbage_collected_pages test

* Change  test_prohibit_get_page_at_lsn_for_garbage_collected_pages

* Lock latest_gc_cutoff_lsn in all operations accessing storage to prevent race conditions with GC

* Remove random sleep between wait_for_lsn and get_page_at_lsn

* Initialize latest_gc_cutoff with initdb_lsn and remove separate check that lsn >= initdb_lsn

* Update test_prohibit_branch_creation_on_pre_initdb_lsn test

Co-authored-by: Heikki Linnakangas <heikki@zenith.tech>
2022-01-27 14:41:16 +03:00
anastasia
c44695f34b bump vendor/postgres 2022-01-27 11:20:45 +03:00
anastasia
5abe2129c6 Extend replication protocol with ZentihFeedback message
to pass current_timeline_size to compute node

Put standby_status_update fields into ZenithFeedback and send them as one message.
Pass values sizes together with keys in ZenithFeedback message.
2022-01-27 11:20:45 +03:00
Dmitry Rodionov
63dd7bce7e bandaid to avoid concurrent timeline downloading until proper refactoring/fix 2022-01-26 19:54:09 +03:00
Dmitry Rodionov
f3c73f5797 cache python deps in circle ci 2022-01-26 13:01:12 +03:00
Dmitry Rodionov
e6f2d70517 use 2021 rust edition 2022-01-25 18:48:49 +03:00
Andrey Taranik
be6d1cc360 Use zimg as builders (#1165)
* try use own builder images

* add postgres headers before build zenith

* checkout submodule before zenith build

* circleci cleanup
2022-01-25 00:58:37 +03:00
Dmitry Ivanov
703716228e Use &str instead of String in BeMessage::ErrorResponse
There's no need in allocating string literals in the heap.
2022-01-24 18:49:05 +03:00
Dmitry Rodionov
458bc0c838 walkeeper: use named type as a key in callmemaybe subscriptions hashmap 2022-01-24 17:20:15 +03:00
Dmitry Rodionov
39591ef627 reduce flakiness 2022-01-24 17:20:15 +03:00
Dmitry Rodionov
37c440c5d3 Introduce first version of tenant migraiton between pageservers
This patch includes attach/detach http endpoints in pageservers. Some
changes in callmemaybe handling inside safekeeper and an integrational
test to check migration with and without load. There are still some
rough edges that will be addressed in follow up patches
2022-01-24 17:20:15 +03:00
anastasia
81e94d1897 Add LSN and Backpressure descriptions to glossary.md 2022-01-24 12:52:30 +03:00
Konstantin Knizhnik
7bc1274a03 Fix comparison with disk_consistent_lsn in newer_image_layer_exists (#1167) 2022-01-24 12:19:18 +03:00
Dmitry Rodionov
5f5a11525c Switch our python package management solution to poetry.
Mainly because it has better support for installing the packages from
different python versions.

It also has better dependency resolver than Pipenv. And supports modern
standard for python dependency management. This includes usage of
pyproject.toml for project specific configuration instead of per
tool conf files. See following links for details:
 https://pip.pypa.io/en/stable/reference/build-system/pyproject-toml/
 https://www.python.org/dev/peps/pep-0518/
2022-01-24 11:33:47 +03:00
Konstantin Knizhnik
e209764877 Do not delete layers beyand cutoff LSN (#1128)
* Do not delete layers beyand cutoff LSN

* Update pageserver/src/layered_repository/layer_map.rs

Co-authored-by: Heikki Linnakangas <heikki.linnakangas@iki.fi>

Co-authored-by: Heikki Linnakangas <heikki.linnakangas@iki.fi>
2022-01-24 10:42:40 +03:00
Kirill Bulatov
65290b2e96 Ensure every submodule compiles on its own 2022-01-21 17:34:15 +03:00
Dmitry Ivanov
127df96635 [proxy] Make NUM_BYTES_PROXIED_COUNTER more precise 2022-01-21 17:31:19 +03:00
Kirill Bulatov
924d8d489a Allow enabling S3 mock in all existing tests with an env var 2022-01-20 18:42:47 +02:00
Dmitry Rodionov
026eb64a83 Use python lib to mock s3 2022-01-20 18:42:47 +02:00
Kirill Bulatov
45124856b1 Better S3 remote storage logging 2022-01-20 18:42:47 +02:00
Kirill Bulatov
38c6f6ce16 Allow specifying custom endpoint in s3 2022-01-20 18:42:47 +02:00
Heikki Linnakangas
caa62eff2a Fix description of proxy --auth-endpoint option. 2022-01-20 14:50:27 +03:00
Dmitry Ivanov
d3542c34f1 Refactoring: use anyhow::Context's methods where possible 2022-01-19 16:33:48 +03:00
Kirill Bulatov
7fb62fc849 Fix macos compilation 2022-01-18 23:01:04 +02:00
Andrey Taranik
9d6ae06663 monitoring turn on for proxy (#1146) 2022-01-18 19:23:53 +03:00
Alexey Kondratov
06c28174c2 Integrate compute_tools into zenith workspace and improve logging (zenithdb/console#487) 2022-01-18 18:47:31 +03:00
bojanserafimov
8af1b43074 proxy: Add new metrics (#1132) 2022-01-14 19:12:43 -05:00
Heikki Linnakangas
17b7caddcb Update vendor/postgres: silence excessive logging from walproposer. 2022-01-14 20:51:02 +02:00
Heikki Linnakangas
dab30c27b6 Refactor thread management and shutdown
This introduces a new module to handle thread creation and shutdown.
All page server threads are now registered in a global hash map, and
there's a function to request individual threads to shut down gracefully.

Thread shutdown request is signalled to the thread with a flag, as well
as a Future that can be used to wake up async operations if shutdown is
requested. Use that facility to have the libpq listener thread respond
to pageserver shutdown, based on Kirill's earlier prototype
(https://github.com/zenithdb/zenith/pull/1088). That addresses
https://github.com/zenithdb/zenith/issues/1036, previously the libpq
listener thread would not exit until one more connection arrives.

This also eliminates a resource leak in the accept() loop. Previously,
we added the JoinHanlde of each new thread to a vector but old handles
for threads that had already exited were never removed.
2022-01-14 18:36:10 +02:00
Heikki Linnakangas
bad1dd9759 Don't panic if spawning a new WAL receiver thread fails.
The panic would kill the page service thread. That's not too bad, but
still let's try to handle it more gracefully.
2022-01-14 18:02:34 +02:00
Heikki Linnakangas
d29836d0d5 Don't panic if spawning a thread to handle a connection fails.
Log the error and continue. Hopefully it's a transient failure.

This might have been happening in staging earlier, when the safekeeper
had a problem where it opened connections very frequently to issue
"callmemaybe" commands. If you launch too many threads too fast, you might
run out of file descriptors or something. It's not totally clear what
happened, but with commit, at least the page server will continue to run
and accept new connections, if a transient error happens.
2022-01-14 18:02:30 +02:00
Heikki Linnakangas
adb0b3dada Include backtrace in error messages in the log.
'anyhow' crate can include a backtrace in all errors, when the
'backtrace' feature is enabled. Enable it, and change the places that used
'{:#}' or '{}' to '{:?}', so that the backtrace is printed.
2022-01-14 10:10:17 +02:00
bojanserafimov
5e0f39cc9e Add proxy metrics (#1093) 2022-01-13 20:34:30 -05:00
Arthur Petukhovsky
0a34a592d5 Bump vendor/postgres (#1120) 2022-01-13 20:28:37 +03:00
Heikki Linnakangas
19aaa91f6d Timeline IDs are not globally unique, fix some code that assumed that.
A timeline ID is only guaranteed to be unique for a particular tenant,
so you need to use tenant ID + timeline ID as the key, rather than just
timeline ID.

The safekeeper currently makes the same assumption, and we should fix that
too, but this commit just addresses this one case in the page server.

In the passing, reorder some function arguments to be more consistent.
2022-01-13 18:45:30 +02:00
Konstantin Knizhnik
404aab9373 Use mutex to prevent concurrent checkpoints (#1115)
* Use mutex to prevent concurrent checkpoints

* Fix comment
2022-01-13 17:48:24 +03:00
Konstantin Knizhnik
bc6db2c10e Implement IO metrics in VirtualFile (#1112)
* Implement IO metrics in VirtualFile

* Do not group virtual file close statistics by tenantid/timelineid

* Add comments concenring close metrics
2022-01-13 17:36:53 +03:00
Heikki Linnakangas
772d853dcf Fix race condition leading to panic in walkeeper.
The walkeeper launch two threads for each connection, and uses a guard
object to remove entry from 'replicas' array, when finishes. But only
the background thread held onto the guard object, so if the background
thread finished before the other thread, the array entry would be
removed prematurely, which lead to panic in the check_stop_streaming()
call.

Fixes https://github.com/zenithdb/zenith/issues/1103
2022-01-13 11:21:11 +02:00
Arseny Sher
ab4d272149 Add safekeeper --dump-control-file option.
Hexalize zids there for better output; since Serde doesn't support several
formats for one struct, on-disk representation is changed as well, make
upgrade.rs cope with it.
2022-01-12 19:47:24 +03:00
Konstantin Knizhnik
f70a5cad61 Fix releasing of timelines lock (#1100)
refer #1087
2022-01-12 15:05:08 +03:00
anastasia
7aba299dbd Use safekeeper in test_branch_behind (#1068)
to avoid a subtle race condition.

Without safekeeper, walreceiver reconnection can stuck,
because of IO deadlock between walsender auth and regular backend.
2022-01-12 14:38:04 +03:00
Kirill Bulatov
4b3b19f444 Support prefixes when working with s3 buckets 2022-01-11 15:44:50 +02:00
Kirill Bulatov
8ab4c8a050 Code review fixes 2022-01-11 15:44:23 +02:00
Kirill Bulatov
7c4a653230 Propagate Zenith CLI's RUST_LOG env var to subprocesses 2022-01-11 15:44:23 +02:00
Kirill Bulatov
a3cd8f0e6d Add the remote storage test 2022-01-11 15:44:23 +02:00
Kirill Bulatov
65c851a451 Test pageserver's timeline http methods
z
2022-01-11 15:44:23 +02:00
Kirill Bulatov
23cf2fa984 Properly shutdown storage sync loop 2022-01-11 15:44:23 +02:00
Kirill Bulatov
ce8d6ae958 Allow using remote storage in tests 2022-01-11 15:44:23 +02:00
Kirill Bulatov
384b2a91fa Pass generic pageserver params through zenith cli 2022-01-11 15:44:23 +02:00
Arseny Sher
233c4811db Fix default safekeeper http port. 2022-01-11 10:13:27 +03:00
Konstantin Knizhnik
2fd4c390cb Do not hold timelines lock during GC (#1089)
* Do not hold timelines lock during GC
refer #1087

* Add gc_cs mutex for preveting creation of new timelines during GC

* Make clippy happy

* Use Mutex<()> instead of Mutex<i32> for GC critical section
2022-01-10 14:41:15 +03:00
bojanserafimov
5b9391b51d Support "query cancel" in proxy (#1052) 2022-01-05 17:27:12 -05:00
Arthur Petukhovsky
5a6405848d Bump vendor/postgres (#1086) 2022-01-05 14:27:51 +03:00
Patrick Insinger
191d9d2b74 par_fsync - use VirtualFile 2022-01-04 20:40:57 -08:00
Patrick Insinger
24c8dab86f pageserver - parallelize checkpoint fsyncs 2022-01-04 20:40:57 -08:00
Heikki Linnakangas
55a4cf64a1 Refactor WAL record handling.
Introduce the concept of a "ZenithWalRecord", which can be a Postgres WAL
record that is replayed with the Postgres WAL redo process, or a built-in
type that is handled entirely by pageserver code.

Replace the special code to replay Postgres XACT commit/abort records
with new Zenith WAL records. A separate zenith WAL record is created for
each modified CLOG page. This allows removing the 'main_data_offset'
field from stored PostgreSQL WAL records, which saves some memory and
some disk space in delta layers.

Introduce zenith WAL records for updating bits in the visibility map.
Previously, when e.g. a heap insert cleared the VM bit, we duplicated the
heap insert WAL record for the affected VM page. That was very wasteful.
The heap WAL record could be massive, containing a full page image in
the worst case. This addresses github issue #941.
2022-01-04 11:26:37 +02:00
Heikki Linnakangas
722667f189 Add test case for performance issue #941.
The first COPY generates about 230 MB of write I/O, but the second
COPY, after deleting most of the rows and vacuuming the rows away,
generates 370 MB of writes. Both COPYs insert the same amount of data,
so they should generate roughly the same amount of I/O. This commit
doesn't try to fix the issue, just adds a test case to demonstrate it.

Add a new 'checkpoint' command to the pageserver API. Previously,
we've used 'do_gc' for that, but many tests, including this new one,
really only want to perform a checkpoint and don't care about GC. For
now, I only used the command in the new test, though, and didn't
convert any existing tests to use it.
2022-01-04 11:26:37 +02:00
Arseny Sher
25a515b968 Don't call immediately on resume in callmemaybe.
It creates busy loop if pageserver <-> safekeeper connection fails after it was
established (e.g. currently due to 'segment checkpoint not found' error on
pageserver).

Also wake up callmemaybe thread regularly once in recall_period regardless of
channel activity.
2022-01-03 20:44:36 +03:00
Konstantin Knizhnik
1c47fbae81 Do not write image layers during enforced checkpoint (#1057)
* Do not write image layers during enforced checkpoint
refer #1056

* Add Flush option to CheckpointConfig

refer #1057
2022-01-01 19:08:09 +03:00
Alexey Kondratov
8f0cd7fb9f [compute_tools] Switch cluster_id in spec to string (zenithdb/console#72) 2021-12-29 16:35:29 +03:00
Dmitry Rodionov
c910132d4b Fix wal receiver shutdown
This patch allows to shutdown wal receiver when there are no messages
and wal receiver is blocked inside tokio-postgres. In this case it
cannot check the shutdown flag.

This patch switches to use async interface of tokio-postgres directly
without sync wrappers. It opens the possibility to use tokio::select!
between the phsycal_stream.next() and a shutdown channel readiness to
interrupt replication process.

Also this allows to shutdown only particular wal receiver without
using global shutdown_requested flag.
2021-12-29 14:42:29 +03:00
Arthur Petukhovsky
70778058d9 Add test for safekeeper setup without pageserver (#1000) 2021-12-29 12:58:27 +03:00
nikitashamgunov
a379b45257 Update README.md 2021-12-28 14:26:42 -08:00
bojanserafimov
24eca8d58b Parse cancel message in pq_proto (#1060) 2021-12-28 16:43:44 -05:00
Bojan Serafimov
1e3ddd43bc Add struct for key data 2021-12-28 22:40:22 +03:00
Bojan Serafimov
989371493b Add BeMessage::BackendKeyData variant 2021-12-28 22:40:22 +03:00
Alexey Kondratov
f64074c609 Move compute_tools from console repo (zenithdb/console#383)
Currently it's included with minimal changes and lives aside of the main
workspace. Later we may re-use and combine common parts with zenith
control_plane.

This change is mostly needed to unify cloud deployment pipeline:
1.1. build compute-tools image
1.2. build compute-node image based on the freshly built compute-tools
2. build zenith image

So we can roll new compute image and new storage required by it to
operate properly. Also it becomes easier to test console against some
specific version of compute-node/-tools.
2021-12-28 20:17:29 +03:00
anastasia
eba897ffe7 send CallmeEvent::Unsubscribe request only when pageserver is caught up with safekeeper and it's time to stop streaming 2021-12-28 17:50:48 +03:00
anastasia
5ef2b1baf7 Add new test illustrating issue with sync-safekeepers.
If safekeepers sync fast enough, callmemaybe thread may never make a call before receiving Unsubscribe request. This leads to the situation, when pageserver lacks data that exists on safekeepers.
2021-12-28 17:50:48 +03:00
Kirill Bulatov
f0afd08667 Fix zenith init defaults 2021-12-28 00:21:48 +02:00
Kirill Bulatov
b494ac1ea0 Remove redundant pageserver cli params 2021-12-27 18:38:54 +02:00
Arseny Sher
a163650a99 Refactor Postgres command parsing in safekeeper.
Do it separately with SafekeeperPostgresCommand enum as a result. Since query is
always C string, switch postgres_backend process_query argument from Bytes to
&str.

Make passing ztli/ztenant id in safekeeper connection string optional; this is
needed for upcoming intra-safekeeper heartbeat cmd which is not bound to any
timeline.
2021-12-24 15:48:13 +03:00
anastasia
980f5f8440 Propagate remote_consistent_lsn to safekeepers.
Change meaning of lsns in HOT_STANDBY_FEEDBACK:
flush_lsn = disk_consistent_lsn,
apply_lsn = remote_consistent_lsn
Update compute node backpressure configuration respectively.

Update compute node configuration:
set 'synchronous_commit=remote_write' in setup without safekeepers.
This way compute node doesn't have to wait for data checkpoint on pageserver.
This doesn't guarantee data durability, but we only use this setup for tests, so it's fine.
2021-12-24 15:32:54 +03:00
Kirill Bulatov
42647f606e Use correct pageserver CLI parameters in docker entrypoint 2021-12-24 03:41:45 +02:00
bojanserafimov
b807570f46 Use parking_lot::Mutex instead of std::Mutex in walreceiver (#1045) 2021-12-23 14:25:44 -05:00
Kirill Bulatov
114a757d1c Use generic config parameters in pageserver cli
Co-authored-by: Heikki Linnakangas <heikki.linnakangas@iki.fi>
2021-12-23 18:58:28 +02:00
Andrey Taranik
9854ded56b Feature/proxy deploy (#1046)
* zenith proxy deployment

* proxy deploy ci fix

* ci cleanup or zenith proxy deploy
2021-12-23 15:53:28 +03:00
Heikki Linnakangas
fdd987c3ad Refactor the way Image- and DeltaLayers are created
Introduce builder objects, DeltaLayerWriter and ImageLayerWriter.
This gives more flexibility, as the DeltaLayer::create and
ImageLayer::create functions don't need to know about the details of
the format of where the page versions are coming from. This allows us
to change the format used in InMemoryLayer more easily, without having
to modify Delta- and ImageLayer code.

Also refactor the code in InMemoryLayer::write_to_disk for clarity.
2021-12-23 00:33:16 +02:00
Heikki Linnakangas
da62407fce Change the meaning of 'blknum' argument in Layer trait
Previously, the 'blknum' argument of various Layer functions was the
block number within the overall relation. That was pretty confusing,
because an individual layer only holds data from a one segment of the
relation. Furthermore, the 'put_truncation' function already dealt
with per-segment size, not overall relation size, adding to the
confusion.

Change the meaning of the 'blknum' argument to mean the block number
within the segment, not the overall relation.
2021-12-22 16:55:37 +02:00
Heikki Linnakangas
1cc181ca32 Fix WAL redo of commit records with subtransactions.
If a commit record contains XIDs that are stored on different CLOG pages,
we duplicate the commit record for each affected CLOG page. In the redo
routine, we must only apply the parts of the record that apply to the
CLOG page being restored. We got that right in the loop that handles the
sub-XIDs, but incorrectly always set the bit that corresponds to the main
XID.
2021-12-21 23:08:01 +02:00
Heikki Linnakangas
927587cec8 Fix comments in tests 2021-12-21 22:38:33 +02:00
Heikki Linnakangas
bcf80eaa95 Fix multixacts members WAL redo.
The logic to compute the page number was broken, and as a result, only
the first page of multixact members was updated correctly. All the
rest were left as zeros. Improve test_multixact.py to generate more
multixacts, to cover this case.

Also fix the check that the restored PG data directory matches the
original one. Previously, the test compared the 'pg_new' cluster,
which is a bit silly because the test restored the 'pg_new' cluster
only a few lines earlier, so if the multixact WAL redo is somehow
broken, the comparison will just compare two broken data directories
and report success. Change it to compare the original datadir, the one
where the multixacts were originally created, with a restored image of
the same.
2021-12-21 17:50:06 +02:00
Arthur Petukhovsky
f56db3da68 Bump vendor/postgres (#996) 2021-12-21 16:53:08 +03:00
Konstantin Knizhnik
68aa9d2715 Set utf8 encoding in initdb (#993)
refer #992
2021-12-21 15:43:34 +03:00
Konstantin Knizhnik
76777f5812 Add utility for dumping/editing metadata file (#1031) 2021-12-21 15:43:15 +03:00
Arseny Sher
56312522f9 Make safekeeper namings more consistent with reality.
s/send_wal.rs/handler.rs
s/SendWalHandler/SafekeeperPostgresHandler
s/replication.rs/send_wal.rs
2021-12-21 13:24:23 +03:00
Dmitry Rodionov
2d9d0658e8 adjust benchmarking script for go console 2021-12-20 13:54:10 +03:00
anastasia
3b61f364f7 Stop WAL streaming threads, when compute node is shut down.
WAL stream uses the 2 connections:
1. Compute node (walproposer) -> Safekeeper (ReceiveWalConn module)

When compute node is shut down, safekeeper needs to stop the respective receiving thread.
Prior to this PR it didn't work because PostgresBackend haven't handled disconnection properly.

2. Safekeeper (ReplicationConn module) -> pageserver (walreceiver thread)

When incoming WAL stream is gone, safekeeper can stop streaming WAL and cancel connection as soon as replica is caught up.
Note that the WAL can be streamed to multiple replicas simultaneously, only disconnect ones that are caught up to the last_recieved_lsn.
2021-12-20 12:34:28 +03:00
anastasia
90e5b6f983 Don't try to reconnect failed walreceiver. If necessary, wal service will send new callmemaybe request 2021-12-20 12:34:28 +03:00
Heikki Linnakangas
75cbaafb96 Remove old ephemeral files on pageserver restart.
The ephemeral files are not usable after restart, so just delete them.
Before this, you got "unrecognized filename in timeline dir" warnings
about them, as Konstantin noted at:
https://github.com/zenithdb/zenith/issues/906#issuecomment-995530870.

While we're at it, refactor away the list_files() function, moving the
logic fully into the caller. Seems more straightforward.
2021-12-17 00:00:02 +02:00
Andrey Taranik
5d5c2738a6 staging deployment flow fix (#1029) 2021-12-16 22:54:01 +03:00
Andrey Taranik
cbe155ff48 storage CI flow for staging environment (#1003)
* storage CI flow for staging environment

* prevent deploy version older than already deployed
2021-12-16 17:05:20 +03:00
Kirill Bulatov
29143b018e Disable rustc incremental compilation to avoid ICEs 2021-12-15 21:57:34 +03:00
Heikki Linnakangas
d8a367dd32 Remove dead code, fix typos. 2021-12-15 19:58:03 +02:00
Kirill Bulatov
ca60561a01 Propagate disk consistent lsn in timeline sync statuses 2021-12-15 15:13:21 +02:00
Andrey Taranik
86a409a174 cleanup circleci config after test 2021-12-15 16:08:31 +03:00
Andrey Taranik
66242f0d0e tag docker image by commit sha and add docker build for compute 2021-12-15 16:08:31 +03:00
Heikki Linnakangas
7f78e80c51 Refactor WAL ingestion code.
Rename save_decoded_record() to ingest_record(), and move the
responsibility for decoding the record into ingest_record().

Also move the responsibility of updating the CheckPoint relish to
ingest_record(). Put it in a new WalIngest struct, to help with tracking
that.
2021-12-14 20:24:03 +02:00
Heikki Linnakangas
f8f88154d5 Split restore_local_repo.rs into two files, with more descriptive names. 2021-12-14 20:24:03 +02:00
Kirill Bulatov
5cff7d1de9 Use proper download order 2021-12-14 15:32:22 +02:00
Arseny Sher
8f0cafd508 Grab safekeeper.lock on the whole directory instead of per tli.
closes #976
2021-12-13 22:11:04 +03:00
Heikki Linnakangas
e0d41ac6a3 Move constants related to metadata file to metadata.rs.
They're not used anywhere else, so seems like a better place.
2021-12-13 16:57:16 +02:00
Heikki Linnakangas
72ef59c378 Fix small typos in comments, add a comment.
The introducing paragraph README could use some more love, but let's at
least fix the typos.
2021-12-13 13:44:08 +02:00
Kirill Bulatov
673c297949 Download timelines on demand 2021-12-10 17:23:35 +02:00
Kirill Bulatov
e61732ca7c Compress checkpoint files before streaming into S3 2021-12-10 17:23:35 +02:00
Heikki Linnakangas
cb4a8396fb Use rustls rather than native-tls in all dependencies.
We depends on rustls in postgres_backend anyway, so might as well use it
for all TLS stuff. Seems better to depend on only one library both from a
security point of view, and because fewer dependencies means less code to
compile. With this commit, we no longer depend on OpenSSL.
2021-12-10 15:14:27 +02:00
Heikki Linnakangas
c77e30116e Split waldecoder.rs into two source files.
Move the code for decoding a WAL stream into WAL records into
'postgres_ffi', and keep the code to parse the WAL records deeper in
'pageserver' crate, renamed to walrecord.rs.

This tidies up the dependencies a bit. 'walkeeper' reuses the same
waldecoder routines, and it used to depend on 'pageserver' because of
that. Now it only depends on 'postgres_ffi'.

(The comment in walkeeper/Cargo.toml that claimed that the dependency was
needed for ZTimelineId was obsolete. ZTimelineId is defined in
'zenith_utils', the dependency was actually needed for the waldecoder.)
2021-12-10 15:14:13 +02:00
Heikki Linnakangas
9d369f158c Update rust-s3 to version 0.28.0
0.28.0 includes two changes I submitted to upstream:

- Add support for older ListObjects API, needed to use rust-s3 with Google
  Cloud Storage: https://github.com/durch/rust-s3/pull/229

- If file is smaller than one chunk, don't initiate multi-part upload.
  https://github.com/durch/rust-s3/pull/228

These are not critical for Zenith right now, but let's stay up-to-date.
2021-12-10 14:52:08 +02:00
Heikki Linnakangas
6ecd442fb9 Remove a bunch of unnecessary dependencies. 2021-12-10 14:24:33 +02:00
Heikki Linnakangas
f3f059c1f8 Fix a few cases where request beyond end of rel would error out.
Currently, we return an all-zeros page if you request a block beyond end of
a relation. That has been implemented in LayeredTimeline::materialize_page,
so that if Layer::get_page_reconstruct_data returns Missing, it returns
and all-zeros page.

However InMemoryLayer and DeltaLayer would return Continue, not Missing,
in that case, and materialize_page would try to find the predecessor
layer. If there was a preceding image layer, then everything would still
work, but if there wasn't, it would return a "could not find predecessor
of layer" error. Fix that in InMemoryLayer and DeltaLayer, making them
check the size of the relation and return Missing in that case.

This is hard to reproduce at the moment, but it happened quickly with
pgbench when I modified InMemoryLayer::write_to_disk so that it didn't
always create a new ImageLayer.
2021-12-09 17:46:48 +02:00
Dmitry Ivanov
8388e14bbd [scripts/git-upload] Fix logic of --forbid-overwrite 2021-12-09 14:06:17 +03:00
anastasia
5293e183c5 callmemaybe. review code cleanup 2021-12-09 13:31:49 +03:00
anastasia
93ff5f7ff0 Add default value for safekeeper --recall option. DEFAULT_RECALL_PERIOD is 1 second. 2021-12-09 13:31:49 +03:00
anastasia
41dce68bdd callmemaybe refactoring
- Don't spawn a separate thread for each connection.
Instead use one thread per safekeeper, that iterates over all connections and sends callback requests for them.

-Use tokio postgres to connect to the pageserver, to avoid spawning a new thread for each connection.

callmemaybe review fixes:
- Spawn all request_callback tasks separately.
- Remember 'last_call_time' and only send request_callback if 'recall_period' has passed.
- If task hasn't finished till next recall, abort it and try again.
- Add pause/resume CallmeEvents to avoid spamming pageserver when connection already established.
2021-12-09 13:31:49 +03:00
Dmitry Rodionov
7dece8e4a0 skip temporary table files when comparing directories in regress tests 2021-12-09 12:53:26 +03:00
Arseny Sher
37c85d5fd9 Switch safekeeper from log to tracing logging.
Add context to wal acceptor and wal sender threads showing timeline id and
unique id differentiating them.
2021-12-09 06:57:46 +03:00
nikitashamgunov
6094236171 Update README.md 2021-12-08 11:55:54 -08:00
anastasia
bb5aba42eb bump vendor/postgres to use correct backpressure commit 2021-12-08 18:57:18 +03:00
Arthur Petukhovsky
450fb9eafe Don't persist control file without sync (#966) 2021-12-07 15:02:44 +03:00
Dmitry Rodionov
557e3024cd Forward pageserver connection string from compute to safekeeper
This is needed for implementation of tenant rebalancing. With this
change safekeeper becomes aware of which pageserver is supposed to be
used for replication from this particular compute.
2021-12-06 21:28:49 +03:00
Arseny Sher
bd34d7ecfc Bump safekeeper control file version and allow reading the previous one.
Should have been a part of cba4da3f4d to provide upgrade for previously
existing clusters. Separates version independent header (magic + version) out of
SafeKeeperState to choose what to deserialize.
2021-12-06 19:47:55 +03:00
Dmitry Ivanov
0a8c672630 [CI] Fix benchmarks
Too bad we don't have a --dry-run in PRs :(
2021-12-06 13:52:28 +03:00
Dmitry Ivanov
b87ab17d05 Bump rust version to 1.56.1
Apparently, code coverage doesn't work that well in 1.55.
2021-12-06 13:27:52 +03:00
Dmitry Ivanov
d874675955 Collect coverage in CI 2021-12-06 13:27:52 +03:00
Dmitry Ivanov
5d37560308 Add bespoke glue script leveraging LLVM coverage tools 2021-12-06 13:27:52 +03:00
Dmitry Ivanov
7cec13d1df Improve shutdown story for code coverage
This patch introduces fixes for several problems affecting
LLVM-based code coverage:

* Daemonizing parent processes should call _exit() to prevent
coverage data file corruption (*.profraw) due to concurrent writes.

* Implement proper shutdown handlers in safekeeper.
2021-12-06 13:27:52 +03:00
anastasia
b7685eb6ba Enable backpressure 2021-12-06 12:49:42 +03:00
anastasia
c7f3b4e62c Clarify the meaning of StandbyReply LSNs:
write_lsn - The last LSN received and processed by pageserver's walreceiver.
flush_lsn - same as write_lsn. At pageserver it doesn't guarantees data persistence, but it's fine. We rely on safekeepers.
apply_lsn - The LSN at which pageserver guaranteed persistence of all received data (disk_consistent_lsn).
2021-12-06 12:49:42 +03:00
Heikki Linnakangas
5bad2deff8 Don't hold 'timelines' lock over checkpoint.
It was very noticeable that you while the checkpointer was busy, you
could not e.g. open a new connection.
2021-12-03 07:42:10 -05:00
Arseny Sher
d39608c367 Fix passing start_offset to find_end_of_wal_segment. 2021-12-03 12:43:57 +03:00
Arseny Sher
cba4da3f4d Add term history to safekeepers.
Persist full history of term switches on safekeepers instead of storing only the
single term of the highest entry (called epoch). This allows easily and
correctly find the divergence point of two logs and truncate the obsolete part
before overwriting it with entries of the newer proposer(s).

Full history of the proposer is transferred in separate message before proposer
starts streaming; it is immediately persisted by safekeeper, though he might not
yet have entries for some older terms there. That's because we can't atomically
append to WAL and update the control file anyway, so locally available WAL must
be taken into account when looking at the history.

We should sometimes purge term history entries beyond truncate_lsn; this is not
done here.

Per https://github.com/zenithdb/rfcs/pull/12

Closes #296.

Bumps vendor/postgres.
2021-12-03 12:43:57 +03:00
Dmitry Rodionov
2669d140f8 use full commit sha for version info
for builds in docker this is not needed, since environment variable
with commit sha already contains full version
2021-12-01 17:35:57 +03:00
Heikki Linnakangas
f49ad33f1b Initialize 'loaded' correctly in DeltaLayer.
While we're at it, reuse the Book and the VirtualFile that's backing
it even over unload() calls. Previously, we would keep the Book open,
but on load(), we would re-open it anyway, which didn't make much
sense. Now we reuse it it. Alternatively, perhaps we should close it
on unload() to save some memory, but this I'm not going to think too
hard about it right now as the whole load/unload thing is a bit of a
hack and needs to be rewritten.

This is hard to reproduce ATM, because the incorrect state would get
fixed by an unload(). A checkpoint creates the DeltaLayer, and it also
calls unload() afterwards, so the window is not very large. I hit it
occasionally with a scale 1000 pgbench test, after I had modified
InMemoryLayer::write_to_disk() to not write an image layer every time,
which made the DeltaLayers be accessed more often.
2021-11-30 22:23:59 +02:00
Kirill Bulatov
670205e17a Evict excessively failing sync tasks, improve processing for the rest of
the tasks
2021-11-30 13:58:49 +02:00
Konstantin Knizhnik
f72d4814b1 Extract page images from FPI WAL records (#949)
* Extract page images from FPI WAL records

* Fix issues reported in review
2021-11-30 12:57:26 +03:00
Heikki Linnakangas
5ecf0664cc Fix off-by-one error in check for future delta layers.
This doesnt show up at the moment, because we never create a delta
layer with end-LSN equal to the last LSN. We always create an image
layer at that LSN instead. For example, if the latest processed LSN is
100, we would create a delta layer with end LSN 100 (exclusive), and
an image layer at 100. But that's just how InMemoryLayer::write_to_disk
happens to work at the moment, there's no fundamental reason it needs
to always create that image layer. I noticed this bug when I tried to
change the logic in InMemoryLayer::write_to_disk to only create an
image layer after a few delta layers.
2021-11-29 14:35:24 +02:00
Heikki Linnakangas
7cae265447 Fix dump_layerfile.
The VirtualFile machinery panics if it's not initialized
2021-11-29 11:26:54 +02:00
Heikki Linnakangas
5aa969a588 Replace in-memory layers and OOM-triggered eviction with temp files.
The "in-memory layer" is misnomer now, each in-memory layer is now actually
backed by a file. The files are ephemeral, in that they don't survive page
server crash or shutdown.

To avoid reading the file for every operation,
"ephemeral files" are cached in a page cache.

This includes changes from 'inmemory-layer-chunks' branch to serialize /
the page versions when they are added to the open layer. The difference is
that they are not serialized to the expandable in-memory "chunk buffer", but
written out to the file.
2021-11-26 17:25:17 +03:00
Arthur Petukhovsky
93cc40584d Shutdown socket on CopyFail (#938)
Fixes #935
2021-11-26 16:48:27 +03:00
Dmitry Rodionov
130184fee9 Prohibit branch creation and basebackup at out of scope lsns
Out of scope LSNs include pre initdb LSNs, and LSNs prior to
latest_gc_cutoff.

To get there there was also two cleanups:
* Fix error handling in Execute message handler. This fixes behaviour
  when basebackup retured an error. Previously pageserver thread just
  died.
* Remove "ancestor" file which previously contained ancestor id and
  branch lsn. Currently the same data can be obtained from metadata file.
  And just the way we handled ancestor file in the code introduced the
  case when branching fails timeline directory is created but there is no data in it
  except ancestor file. And this confused gc because it scans
  directories. So it is better to just remove ancestor file and clean up
  this timeline directory creation so it happens after all validity
  checks have passed
2021-11-25 15:27:16 +03:00
Heikki Linnakangas
d47f610606 Fix pageserver CLI parameter names and document them 2021-11-25 13:31:52 +02:00
Dmitry Rodionov
0650e51f0b add test one more case for layer visibility 2021-11-22 11:39:20 +03:00
Dmitry Rodionov
737a557f09 add check to python tests that afteer gc number of rows is unchanged in all branches 2021-11-22 11:39:20 +03:00
Dmitry Rodionov
6f7ebe6e01 preserve data in parent branch that might be referenced in child branch 2021-11-22 11:39:20 +03:00
Dmitry Rodionov
70ab0d5b1f add missing script 2021-11-19 00:10:40 +03:00
Dmitry Rodionov
6ac76248cf Save performance test results from perfirmance test suit runs.
Also render reports for both staging and local runs.
2021-11-19 00:00:19 +03:00
Kirill Bulatov
b32da3b42e Use less pageserver-specific method in RemoteStorage trait 2021-11-18 22:53:40 +02:00
Dmitry Ivanov
0ccfc62e88 [proxy] Pass PostgreSQL version to client
Fixes #779
2021-11-17 16:28:44 +03:00
Dmitry Ivanov
b55cf773a8 [proxy] Streamline control- and dataflow 2021-11-17 16:28:44 +03:00
Dmitry Ivanov
43ded1c54b [proxy] Minor cleanup 2021-11-17 16:28:44 +03:00
Heikki Linnakangas
f8702d4625 Fix checking for whether segment exists on a frozen in-memory layer.
Ever since we've had frozen in-memory layers, having an 'end_lsn' no
longer means that the layer has been dropped. Need to check the 'dropped'
flag explicitly.

This was reliably causing a failure on the new 'test_parallel_copy' test
in https://github.com/zenithdb/zenith/pull/864. I'm not sure why it
doesn't happen on main branch, but the bug is pretty straightforward when
you see it.
2021-11-15 20:19:15 +02:00
Dmitry Rodionov
44111e3ba3 Prohibit branch creation at lsn that was already garbage collected.
This introduces new timeline field latest_gc_cutoff. It is updated
before each gc iteration. New check is added to branch_timelines to
prevent branch creation with start point less than latest_gc_cutoff.
Also this adds a check to get_page_at_lsn which asserts that lsn at
which the page is requested was not garbage collected. This check
currently is triggered for readonly nodes which are pinned to specific
lsn and because they are not tracked in pageserver garbage collection
can remove data that still might be referenced. This is a bug and will
be fixed separately.
2021-11-15 20:03:16 +03:00
Patrick Insinger
298bc588f9 pageserver - don't try to GC InMemoryLayers 2021-11-15 09:01:45 -08:00
Heikki Linnakangas
4ba521f53f Add performance test case for parallel COPY TO 2021-11-15 14:49:53 +02:00
Heikki Linnakangas
431d32756b Add a buffer cache, and use it to store materialized pages.
The buffer cache is shared across all tenants, allowing memory to be
dynamically allocated where it's needed the most. The cache works on 8 kB
pages, and uses the clock algorithm for replacement policy; same as the
PostgreSQL buffer cache.

One peculiarity is that the materialized page versions can be looked up
by an inexact LSN, to find the latest page version with an LSN >= the
search key.

The code is structured to support caching other kinds of pages in the same
cache in the future, but with a different mapping key.

Co-authored-by: Patrick Insinger <patrick@zenith.tech>
2021-11-12 11:02:12 -08:00
Heikki Linnakangas
3d172d98a3 Improve layered repo README.
Add an informal overview of how it works.
2021-11-12 19:59:31 +02:00
Heikki Linnakangas
849ac791a6 Bandaid fix for "page not found" errors, when a table is loaded.
During parallel load of a table, Postgres sometimes requests a page from
the page server for which no WAL has been generated yet. That's normal;
Postgres expects the page to be full of zeros. There was a special case
for that in LayeredTimeline::materialize_page, but the problem remained
when you're crossing a segment boundary, so that there's no layer for
the segment at all.

It would be nice to have a more robust cross-check for this case. That
might need help from the Postgres side. But this extends the bandaid fix
we had in materialize_page() to the case where cross segment boundary.

Fixes https://github.com/zenithdb/zenith/issues/841
2021-11-12 18:47:39 +02:00
Alexey Kondratov
de5e6a15ae Set LD_LIBRARY_PATH in the check_restored_datadir_content() psql call
Otherwise we may use outdated system libpq.
Also print stdout/stderr if basebackup failed in check_restored_datadir_content()
2021-11-12 16:27:43 +03:00
Alexey Kondratov
0d6bf14ecb Use vendor/postgres rebased on top of REL_14_1 2021-11-12 16:27:43 +03:00
Heikki Linnakangas
d1e79c4af3 Fix locking issues in VirtualFile machinery.
There were two separate locking issues that could lead to a deadlock,
both related to holding a lock for longer than necessary:

1. In the loop in `VirtualFile::with_file`, the "handle_guard" was
held across iterations of the loop. Because of that, if the handle was
changed by a concurrent thread, the loop would try to acquire the
handle lock, when it was still holding the lock from previous
iteration. To fix, release the lock earlier. There was no need to hold
it across iterations, it was just accidental.

2. In the same function, we also held the "slot_guard" longer than
necessary. It's only needed in the first part of the loop, where we
check if the current handle is valid. If it's not, the slot lock can
be immediately released. But it was not, it was kept over the
acquisition of the handle lock. I'm not sure if that alone could cause
problems, but let's release the lock as soon as possible anyway.

Add a test case, based on Konstantin's test program to demonstrate the
deadlock.
2021-11-11 20:12:59 +02:00
Kirill Bulatov
abb2ac5246 Better context when erroring 2021-11-11 19:22:05 +02:00
Kirill Bulatov
99dbbe5f18 Allow downloading remote files partially 2021-11-11 18:51:34 +02:00
Arseny Sher
e7ca8ef5a8 Use PG timelineid 1 everywhere.
As changing it doesn't have useful meaning in Zenith.

ref #824
2021-11-11 13:53:39 +03:00
Patrick Insinger
1ce4976e36 pageserver - track size of VecMaps 2021-11-10 11:09:34 -08:00
Heikki Linnakangas
9300107cdf Cache Book objects, use virtual files to avoid running out of fds.
Currently, whenever a page version is needed from an image or delta
layer, we open the file and read and parse the bookfile headers. That's
pretty expensive. To reduce the overhead, introduce a cache of open file
descriptors, and use that to cache the Book objects so that we don't need
to read the metadata on every access.
2021-11-10 17:19:37 +02:00
Arthur Petukhovsky
9aaa02bc9a Fix high CPU usage in walproposer (#860)
* Bump vendor/postgres

* Update time limits for test_restarts_under_load
2021-11-10 17:18:07 +03:00
Arseny Sher
5603259c53 In wal_proposer_recovery, don't wait outcoming WAL to be committed.
Otherwise we're deadlocking ourselves. Oversight of 33007cc.
2021-11-10 01:38:25 +03:00
Arseny Sher
ce15c62f35 Fix 'send WAL up to' debug logging. 2021-11-10 01:38:25 +03:00
Egor Suvorov
eaff0cd568 Check python for the whole repository and improve docs (#813) 2021-11-09 22:23:29 +03:00
Egor Suvorov
587935ebed Add Safekeeper metrics tests (#746)
* zenith_fixtures.py: add SafekeeperHttpClient.get_metrics()
* Ensure that `collect_lsn` and `flush_lsn`'s reported values look reasonable in `test_many_timelines`
2021-11-09 22:18:59 +03:00
Dmitry Rodionov
07dddfed28 Use more robust way to persist safekeeper control file.
Now safekeeper control file updated in a following way:
1. Write data to temp file
2. Fsync the temporary file (if sync option is specified)
3. Rename temporary file to actual control file
4. Fsync containing directory (if sync option is specified)
5. Fsync file after rename (if sync option is specified).

Note that action 5 is not mentioned anywhere as required but it is done
in postgres this way (see durable_rename).

Also because of the rename machinery switch to use dedicated lock file
to prevent running several safekeepers concurrently on the same data

cleanup

fsync control file after rename to match postgres behaviour
2021-11-09 17:51:46 +03:00
Arseny Sher
229dc7704f Bump vendor/postgres. 2021-11-08 17:32:13 +03:00
Dmitry Rodionov
067f2ac814 fix perf repo branch name 2021-11-08 13:27:23 +03:00
Dmitry Rodionov
865870a8e5 Follow up staging benchmarking
* change zenith-perf-data checkout ref to be main
* set cluster id through secrets so there is no code changes required
  when we wipe out clusters on staging
* display full pgbench output on error
2021-11-05 14:07:11 +03:00
Arthur Petukhovsky
d19263aec8 Adjust timeouts for test_restarts_under_load (#830)
* Adjust timeouts for test_restarts_under_load

* Add test timeout for test_restarts_under_load
2021-11-04 19:58:40 +03:00
Heikki Linnakangas
6d742719a1 Fix infinite loop in looking up predecessor layer
Commit 960c7d69a8 changed the LSN returned in the Continue case in
InMemoryLayer::get_page_reconstruct_data(), but neglected to make the
same change in DeltaLayer.

Also add an escape hatch to the loop in materialize_page() to avoid
getting stuck in an infinite loop, if a bug like this reoccurs.
2021-11-04 16:07:12 +02:00
Dmitry Rodionov
c75bc9b8b0 Change benchmark plugin layout so pytest loads it properly when running
all tests (not necessary performance ones)

resolves #837
2021-11-04 16:33:31 +03:00
Egor Suvorov
33007cc0bb Safekeeper's START_REPLICATION handler: remove stop_point, do not handle start_point == 0 (#777) 2021-11-04 14:50:33 +03:00
Dmitry Rodionov
987833e0b9 Propagate git SHA to zenith binaries
Git commit sha is displayed when --version flag is used and is written
to logs during service startup. Uses git_version crate when git is
available, and GIT_VERSION environment variable otherwise which is the case for docker
builds.
2021-11-04 14:22:29 +03:00
Kirill Bulatov
f36acf00de Reduce "relish" word usages in remote storage 2021-11-04 12:53:42 +02:00
Kirill Bulatov
956fc3dec9 Tidy up and make consistent the remote storate API 2021-11-04 12:53:42 +02:00
Heikki Linnakangas
b38e841f2d Use poll() in communication with WAL redo process.
The tokio futures added some overhead, so switch to plain non-blocking
I/O with poll(). In a simple pgbench test on my laptop (select-only
queries, scale-factor 1 `pgbench -P1 -T50 -S`), this gives about 10%
improvement, from about 4300 TPS to 4800 TPS.
2021-11-04 10:39:04 +02:00
Heikki Linnakangas
3a0111c75e Refactor functions for constructing WAL redo messages.
Instead of building a separate Vec<u8> to hold each message, serialize all
the messages to one big Vec<u8>. This eliminates some Vec allocation and
memcpy() overhead. The downside is that if there are a lot of records to
replay, we have to serialize them all into one big chunk of memory.
That shouldn't be a problem in practice. If you need to replay millions
of records to reconstruct a page, we should've materialized a new image
of that page earlier already.
2021-11-04 10:39:00 +02:00
Heikki Linnakangas
086a02ab92 Add performance test for simple seq scans.
Fixes https://github.com/zenithdb/zenith/issues/831
2021-11-04 10:36:45 +02:00
Heikki Linnakangas
7ed39655dc Bump vendor/postgres 2021-11-04 10:35:50 +02:00
Dmitry Rodionov
c6172dae47 implement performance tests against our staging environment
tests are based on self-hosted runner which is physically close
to our staging deployment in aws, currently tests consist of
various configurations of pgbenchi runs.

Also these changes rework benchmark fixture by removing globals and
allowing to collect reports with desired metrics and dump them to json
for further analysis. This is also applicable to usual performance tests
which use local zenith binaries.
2021-11-04 02:15:46 +03:00
Heikki Linnakangas
4ba783d0af Remove a couple of unused functions.
We might want to have custom serialize/deserialize functions for
WALRecords and PageVersions for performance reasons, see github issue 832.
But that would probably look a bit different from this, and currently
these functions are just dead.
2021-11-03 19:10:23 +02:00
Patrick Insinger
0457fe81a9 pageserver - make PageVersion an enum 2021-11-03 09:28:49 -07:00
Heikki Linnakangas
fb524dd973 Put a global limit on memory used by in-memory layers.
Adds simple global tracking of memory used by the in-memory layers. It's
very approximate, it doesn't take into account allocator, memory
fragmentation or many other things, but it's a good first step.

After storing a WAL record in the repository, the WAL receiver checks
if the global memory usage. If it's above a configurable threshold (hard
coded at 128 MB at the moment), it evicts a layer. The victim layer is
chosen by GClock algorithm, similar to that used in the Postgres buffer
cache.

This stops the page server from using an unbounded amount of memory. It's
pretty crude, the eviction and materializing and writing a layer to disk
happens now in the WAL receiver thread. It would be nice to move that
to a background thread, and it would be nice to have a smarter policy on
when to materialize a new image layer and when to just write out a delta
layer, and it would be nice to have more accurate accounting of memory.
But this should fix the most pressing OOM issues, and is a step in the
right direction.

Co-authored-by: Patrick Insinger <patrickinsinger@gmail.com>
2021-11-02 15:49:39 +02:00
Heikki Linnakangas
8c6d2664c0 Support removing arbitrary open layers, not just the oldest one 2021-11-02 15:43:16 +02:00
Patrick Insinger
cdbbd15eb9 pageserver - add InMemoryLayer global map (#817) 2021-11-01 12:20:24 -07:00
anastasia
85f8bf97f5 Name walkeeper threads to make debugging more convenient 2021-11-01 19:09:57 +03:00
anastasia
83ed930bc2 WIP. Launch and shutdown tenant threads together with walreceiver.
TODO: now walreceiver only disconnects if safekeeper was shut down. Implemnt proper walreceiver disconnection.
2021-11-01 18:04:00 +03:00
anastasia
071e30cc53 Expose TENANT_THREADS_COUNT metric to observe number of currently active checkpointer and GC threads 2021-11-01 18:04:00 +03:00
757 changed files with 162423 additions and 29397 deletions

16
.cargo/config.toml Normal file
View File

@@ -0,0 +1,16 @@
# The binaries are really slow, if you compile them in 'dev' mode with the defaults.
# Enable some optimizations even in 'dev' mode, to make tests faster. The basic
# optimizations enabled by "opt-level=1" don't affect debuggability too much.
#
# See https://www.reddit.com/r/rust/comments/gvrgca/this_is_a_neat_trick_for_getting_good_runtime/
#
[profile.dev.package."*"]
# Set the default for dependencies in Development mode.
opt-level = 3
[profile.dev]
# Turn on a small amount of optimization in Development mode.
opt-level = 1
[alias]
build_testing = ["build", "--features", "testing"]

View File

@@ -1,410 +0,0 @@
version: 2.1
executors:
zenith-build-executor:
resource_class: xlarge
docker:
- image: cimg/rust:1.55.0
zenith-python-executor:
docker:
- image: cimg/python:3.7.10 # Oldest available 3.7 with Ubuntu 20.04 (for GLIBC and Rust) at CirlceCI
jobs:
check-codestyle:
executor: zenith-build-executor
steps:
- checkout
- run:
name: rustfmt
when: always
command: |
cargo fmt --all -- --check
# A job to build postgres
build-postgres:
executor: zenith-build-executor
parameters:
build_type:
type: enum
enum: ["debug", "release"]
environment:
BUILD_TYPE: << parameters.build_type >>
steps:
# Checkout the git repo (circleci doesn't have a flag to enable submodules here)
- checkout
# Grab the postgres git revision to build a cache key.
# Note this works even though the submodule hasn't been checkout out yet.
- run:
name: Get postgres cache key
command: |
git rev-parse HEAD:vendor/postgres > /tmp/cache-key-postgres
- restore_cache:
name: Restore postgres cache
keys:
# Restore ONLY if the rev key matches exactly
- v04-postgres-cache-<< parameters.build_type >>-{{ checksum "/tmp/cache-key-postgres" }}
# FIXME We could cache our own docker container, instead of installing packages every time.
- run:
name: apt install dependencies
command: |
if [ ! -e tmp_install/bin/postgres ]; then
sudo apt update
sudo apt install build-essential libreadline-dev zlib1g-dev flex bison libseccomp-dev
fi
# Build postgres if the restore_cache didn't find a build.
# `make` can't figure out whether the cache is valid, since
# it only compares file timestamps.
- run:
name: build postgres
command: |
if [ ! -e tmp_install/bin/postgres ]; then
# "depth 1" saves some time by not cloning the whole repo
git submodule update --init --depth 1
make postgres -j8
fi
- save_cache:
name: Save postgres cache
key: v04-postgres-cache-<< parameters.build_type >>-{{ checksum "/tmp/cache-key-postgres" }}
paths:
- tmp_install
# A job to build zenith rust code
build-zenith:
executor: zenith-build-executor
parameters:
build_type:
type: enum
enum: ["debug", "release"]
steps:
- run:
name: apt install dependencies
command: |
sudo apt update
sudo apt install libssl-dev clang
# Checkout the git repo (without submodules)
- checkout
# Grab the postgres git revision to build a cache key.
# Note this works even though the submodule hasn't been checkout out yet.
- run:
name: Get postgres cache key
command: |
git rev-parse HEAD:vendor/postgres > /tmp/cache-key-postgres
- restore_cache:
name: Restore postgres cache
keys:
# Restore ONLY if the rev key matches exactly
- v04-postgres-cache-<< parameters.build_type >>-{{ checksum "/tmp/cache-key-postgres" }}
- restore_cache:
name: Restore rust cache
keys:
# Require an exact match. While an out of date cache might speed up the build,
# there's no way to clean out old packages, so the cache grows every time something
# changes.
- v04-rust-cache-deps-<< parameters.build_type >>-{{ checksum "Cargo.lock" }}
# Build the rust code, including test binaries
- run:
name: Rust build << parameters.build_type >>
command: |
export CARGO_INCREMENTAL=0
BUILD_TYPE="<< parameters.build_type >>"
if [[ $BUILD_TYPE == "debug" ]]; then
echo "Build in debug mode"
cargo build --bins --tests
elif [[ $BUILD_TYPE == "release" ]]; then
echo "Build in release mode"
cargo build --release --bins --tests
fi
- save_cache:
name: Save rust cache
key: v04-rust-cache-deps-<< parameters.build_type >>-{{ checksum "Cargo.lock" }}
paths:
- ~/.cargo/registry
- ~/.cargo/git
- target
# Run style checks
# has to run separately from cargo fmt section
# since needs to run with dependencies
- run:
name: clippy
command: |
./run_clippy.sh
# Run rust unit tests
- run: cargo test
# Install the rust binaries, for use by test jobs
# `--locked` is required; otherwise, `cargo install` will ignore Cargo.lock.
# FIXME: this is a really silly way to install; maybe we should just output
# a tarball as an artifact? Or a .deb package?
- run:
name: cargo install
command: |
export CARGO_INCREMENTAL=0
BUILD_TYPE="<< parameters.build_type >>"
if [[ $BUILD_TYPE == "debug" ]]; then
echo "Install debug mode"
CARGO_FLAGS="--debug"
elif [[ $BUILD_TYPE == "release" ]]; then
echo "Install release mode"
# The default is release mode; there is no --release flag.
CARGO_FLAGS=""
fi
cargo install $CARGO_FLAGS --locked --root /tmp/zenith --path pageserver
cargo install $CARGO_FLAGS --locked --root /tmp/zenith --path walkeeper
cargo install $CARGO_FLAGS --locked --root /tmp/zenith --path zenith
# Install the postgres binaries, for use by test jobs
# FIXME: this is a silly way to do "install"; maybe just output a standard
# postgres package, whatever the favored form is (tarball? .deb package?)
# Note that pg_regress needs some build artifacts that probably aren't
# in the usual package...?
- run:
name: postgres install
command: |
cp -a tmp_install /tmp/zenith/pg_install
# Save the rust output binaries for other jobs in this workflow.
- persist_to_workspace:
root: /tmp/zenith
paths:
- "*"
check-python:
executor: zenith-python-executor
steps:
- checkout
- run:
name: Install deps
working_directory: test_runner
command: pipenv --python 3.7 install --dev
- run:
name: Run yapf to ensure code format
when: always
working_directory: test_runner
command: pipenv run yapf --recursive --diff .
- run:
name: Run mypy to check types
when: always
working_directory: test_runner
command: pipenv run mypy .
run-pytest:
executor: zenith-python-executor
parameters:
# pytest args to specify the tests to run.
#
# This can be a test file name, e.g. 'test_pgbench.py, or a subdirectory,
# or '-k foobar' to run tests containing string 'foobar'. See pytest man page
# section SPECIFYING TESTS / SELECTING TESTS for details.
#
# Select the type of Rust build. Must be "release" or "debug".
build_type:
type: string
default: "debug"
# This parameter is required, to prevent the mistake of running all tests in one job.
test_selection:
type: string
default: ""
# Arbitrary parameters to pytest. For example "-s" to prevent capturing stdout/stderr
extra_params:
type: string
default: ""
needs_postgres_source:
type: boolean
default: false
run_in_parallel:
type: boolean
default: true
steps:
- attach_workspace:
at: /tmp/zenith
- checkout
- when:
condition: << parameters.needs_postgres_source >>
steps:
- run: git submodule update --init --depth 1
- run:
name: Install deps
working_directory: test_runner
command: pipenv --python 3.7 install
- run:
name: Run pytest
working_directory: test_runner
environment:
- ZENITH_BIN: /tmp/zenith/bin
- POSTGRES_DISTRIB_DIR: /tmp/zenith/pg_install
- TEST_OUTPUT: /tmp/test_output
command: |
TEST_SELECTION="<< parameters.test_selection >>"
EXTRA_PARAMS="<< parameters.extra_params >>"
if [ -z "$TEST_SELECTION" ]; then
echo "test_selection must be set"
exit 1
fi
if << parameters.run_in_parallel >>; then
EXTRA_PARAMS="-n4 $EXTRA_PARAMS"
fi;
# Run the tests.
#
# The junit.xml file allows CircleCI to display more fine-grained test information
# in its "Tests" tab in the results page.
# --verbose prints name of each test (helpful when there are
# multiple tests in one file)
# -rA prints summary in the end
# -n4 uses four processes to run tests via pytest-xdist
# -s is not used to prevent pytest from capturing output, because tests are running
# in parallel and logs are mixed between different tests
pipenv run pytest --junitxml=$TEST_OUTPUT/junit.xml --tb=short --verbose -rA $TEST_SELECTION $EXTRA_PARAMS
- run:
# CircleCI artifacts are preserved one file at a time, so skipping
# this step isn't a good idea. If you want to extract the
# pageserver state, perhaps a tarball would be a better idea.
name: Delete all data but logs
when: always
command: |
du -sh /tmp/test_output/*
find /tmp/test_output -type f ! -name "pg.log" ! -name "pageserver.log" ! -name "safekeeper.log" ! -name "regression.diffs" ! -name "junit.xml" ! -name "*.filediff" ! -name "*.stdout" ! -name "*.stderr" -delete
du -sh /tmp/test_output/*
- store_artifacts:
path: /tmp/test_output
# The store_test_results step tells CircleCI where to find the junit.xml file.
- store_test_results:
path: /tmp/test_output
# Build zenithdb/zenith:latest image and push it to Docker hub
docker-image:
docker:
- image: cimg/base:2021.04
steps:
- checkout
- setup_remote_docker:
docker_layer_caching: true
- run:
name: Init postgres submodule
command: git submodule update --init --depth 1
- run:
name: Build and push Docker image
command: |
echo $DOCKER_PWD | docker login -u $DOCKER_LOGIN --password-stdin
docker build -t zenithdb/zenith:latest . && docker push zenithdb/zenith:latest
# Trigger a new remote CI job
remote-ci-trigger:
docker:
- image: cimg/base:2021.04
parameters:
remote_repo:
type: string
environment:
REMOTE_REPO: << parameters.remote_repo >>
steps:
- run:
name: Set PR's status to pending
command: |
LOCAL_REPO=$CIRCLE_PROJECT_USERNAME/$CIRCLE_PROJECT_REPONAME
curl -f -X POST \
https://api.github.com/repos/$LOCAL_REPO/statuses/$CIRCLE_SHA1 \
-H "Accept: application/vnd.github.v3+json" \
--user "$CI_ACCESS_TOKEN" \
--data \
"{
\"state\": \"pending\",
\"context\": \"zenith-remote-ci\",
\"description\": \"[$REMOTE_REPO] Remote CI job is about to start\"
}"
- run:
name: Request a remote CI test
command: |
LOCAL_REPO=$CIRCLE_PROJECT_USERNAME/$CIRCLE_PROJECT_REPONAME
curl -f -X POST \
https://api.github.com/repos/$REMOTE_REPO/actions/workflows/testing.yml/dispatches \
-H "Accept: application/vnd.github.v3+json" \
--user "$CI_ACCESS_TOKEN" \
--data \
"{
\"ref\": \"main\",
\"inputs\": {
\"ci_job_name\": \"zenith-remote-ci\",
\"commit_hash\": \"$CIRCLE_SHA1\",
\"remote_repo\": \"$LOCAL_REPO\"
}
}"
workflows:
build_and_test:
jobs:
- check-codestyle
- check-python
- build-postgres:
name: build-postgres-<< matrix.build_type >>
matrix:
parameters:
build_type: ["debug", "release"]
- build-zenith:
name: build-zenith-<< matrix.build_type >>
matrix:
parameters:
build_type: ["debug", "release"]
requires:
- build-postgres-<< matrix.build_type >>
- run-pytest:
name: pg_regress-tests-<< matrix.build_type >>
matrix:
parameters:
build_type: ["debug", "release"]
test_selection: batch_pg_regress
needs_postgres_source: true
requires:
- build-zenith-<< matrix.build_type >>
- run-pytest:
name: other-tests-<< matrix.build_type >>
matrix:
parameters:
build_type: ["debug", "release"]
test_selection: batch_others
requires:
- build-zenith-<< matrix.build_type >>
- run-pytest:
name: benchmarks
build_type: release
test_selection: performance
run_in_parallel: false
requires:
- build-zenith-release
- docker-image:
# Context gives an ability to login
context: Docker Hub
# Build image only for commits to main
filters:
branches:
only:
- main
requires:
- pg_regress-tests-release
- other-tests-release
- remote-ci-trigger:
# Context passes credentials for gh api
context: CI_ACCESS_TOKEN
remote_repo: "zenithdb/console"
requires:
# XXX: Successful build doesn't mean everything is OK, but
# the job to be triggered takes so much time to complete (~22 min)
# that it's better not to wait for the commented-out steps
- build-zenith-debug
# - pg_regress-tests-release
# - other-tests-release

26
.config/hakari.toml Normal file
View File

@@ -0,0 +1,26 @@
# This file contains settings for `cargo hakari`.
# See https://docs.rs/cargo-hakari/latest/cargo_hakari/config for a full list of options.
hakari-package = "workspace_hack"
# Format for `workspace-hack = ...` lines in other Cargo.tomls. Requires cargo-hakari 0.9.8 or above.
dep-format-version = "3"
# Setting workspace.resolver = "2" in the root Cargo.toml is HIGHLY recommended.
# Hakari works much better with the new feature resolver.
# For more about the new feature resolver, see:
# https://blog.rust-lang.org/2021/03/25/Rust-1.51.0.html#cargos-new-feature-resolver
# Have to keep the resolver still here since hakari requires this field,
# despite it's now the default for 2021 edition & cargo.
resolver = "2"
# Add triples corresponding to platforms commonly used by developers here.
# https://doc.rust-lang.org/rustc/platform-support.html
platforms = [
# "x86_64-unknown-linux-gnu",
# "x86_64-apple-darwin",
# "x86_64-pc-windows-msvc",
]
# Write out exact versions rather than a semver range. (Defaults to false.)
# exact-versions = true

View File

@@ -1,18 +1,24 @@
**/.git/
**/__pycache__
**/.pytest_cache
*
.git
target
tmp_check
tmp_install
tmp_check_cli
test_output
.vscode
.zenith
integration_tests/.zenith
.mypy_cache
Dockerfile
.dockerignore
!rust-toolchain.toml
!Cargo.toml
!Cargo.lock
!Makefile
!.cargo/
!.config/
!control_plane/
!compute_tools/
!libs/
!pageserver/
!pgxn/
!proxy/
!safekeeper/
!storage_broker/
!trace/
!vendor/postgres-v14/
!vendor/postgres-v15/
!workspace_hack/
!neon_local/
!scripts/ninstall.sh
!vm-cgconfig.conf

1
.git-blame-ignore-revs Normal file
View File

@@ -0,0 +1 @@
4c2bb43775947775401cbb9d774823c5723a91f8

23
.github/ISSUE_TEMPLATE/bug-template.md vendored Normal file
View File

@@ -0,0 +1,23 @@
---
name: Bug Template
about: Used for describing bugs
title: ''
labels: t/bug
assignees: ''
---
## Steps to reproduce
## Expected result
## Actual result
## Environment
## Logs, links
-

25
.github/ISSUE_TEMPLATE/epic-template.md vendored Normal file
View File

@@ -0,0 +1,25 @@
---
name: Epic Template
about: A set of related tasks contributing towards specific outcome, comprising of
more than 1 week of work.
title: 'Epic: '
labels: t/Epic
assignees: ''
---
## Motivation
## DoD
## Implementation ideas
## Tasks
- [ ]
## Other related tasks and Epics
-

View File

@@ -0,0 +1,20 @@
## Release 202Y-MM-DD
**NB: this PR must be merged only by 'Create a merge commit'!**
### Checklist when preparing for release
- [ ] Read or refresh [the release flow guide](https://github.com/neondatabase/cloud/wiki/Release:-general-flow)
- [ ] Ask in the [cloud Slack channel](https://neondb.slack.com/archives/C033A2WE6BZ) that you are going to rollout the release. Any blockers?
- [ ] Does this release contain any db migrations? Destructive ones? What is the rollback plan?
<!-- List everything that should be done **before** release, any issues / setting changes / etc -->
### Checklist after release
- [ ] Based on the merged commits write release notes and open a PR into `website` repo ([example](https://github.com/neondatabase/website/pull/219/files))
- [ ] Check [#dev-production-stream](https://neondb.slack.com/archives/C03F5SM1N02) Slack channel
- [ ] Check [stuck projects page](https://console.neon.tech/admin/projects?sort=last_active&order=desc&stuck=true)
- [ ] Check [recent operation failures](https://console.neon.tech/admin/operations?action=create_timeline%2Cstart_compute%2Cstop_compute%2Csuspend_compute%2Capply_config%2Cdelete_timeline%2Cdelete_tenant%2Ccreate_branch%2Ccheck_availability&sort=updated_at&order=desc&had_retries=some)
- [ ] Check [cloud SLO dashboard](https://neonprod.grafana.net/d/_oWcBMJ7k/cloud-slos?orgId=1)
- [ ] Check [compute startup metrics dashboard](https://neonprod.grafana.net/d/5OkYJEmVz/compute-startup-time)
<!-- List everything that should be done **after** release, any admin UI configuration / Grafana dashboard / alert changes / setting changes / etc -->

232
.github/actions/allure-report/action.yml vendored Normal file
View File

@@ -0,0 +1,232 @@
name: 'Create Allure report'
description: 'Create and publish Allure report'
inputs:
action:
desctiption: 'generate or store'
required: true
build_type:
description: '`build_type` from run-python-test-set action'
required: true
test_selection:
description: '`test_selector` from run-python-test-set action'
required: false
outputs:
report-url:
description: 'Allure report URL'
value: ${{ steps.generate-report.outputs.report-url }}
runs:
using: "composite"
steps:
- name: Validate input parameters
shell: bash -euxo pipefail {0}
run: |
if [ "${{ inputs.action }}" != "store" ] && [ "${{ inputs.action }}" != "generate" ]; then
echo 2>&1 "Unknown inputs.action type '${{ inputs.action }}'; allowed 'generate' or 'store' only"
exit 1
fi
if [ -z "${{ inputs.test_selection }}" ] && [ "${{ inputs.action }}" == "store" ]; then
echo 2>&1 "inputs.test_selection must be set for 'store' action"
exit 2
fi
- name: Calculate variables
id: calculate-vars
shell: bash -euxo pipefail {0}
run: |
# TODO: for manually triggered workflows (via workflow_dispatch) we need to have a separate key
pr_number=$(jq --raw-output .pull_request.number "$GITHUB_EVENT_PATH" || true)
if [ "${pr_number}" != "null" ]; then
key=pr-${pr_number}
elif [ "${GITHUB_REF_NAME}" = "main" ]; then
# Shortcut for a special branch
key=main
elif [ "${GITHUB_REF_NAME}" = "release" ]; then
# Shortcut for a special branch
key=release
else
key=branch-$(printf "${GITHUB_REF_NAME}" | tr -c "[:alnum:]._-" "-")
fi
echo "KEY=${key}" >> $GITHUB_OUTPUT
# Sanitize test selection to remove `/` and any other special characters
# Use printf instead of echo to avoid having `\n` at the end of the string
test_selection=$(printf "${{ inputs.test_selection }}" | tr -c "[:alnum:]._-" "-" )
echo "TEST_SELECTION=${test_selection}" >> $GITHUB_OUTPUT
- uses: actions/setup-java@v3
if: ${{ inputs.action == 'generate' }}
with:
distribution: 'temurin'
java-version: '17'
- name: Install Allure
if: ${{ inputs.action == 'generate' }}
shell: bash -euxo pipefail {0}
run: |
if ! which allure; then
ALLURE_ZIP=allure-${ALLURE_VERSION}.zip
wget -q https://github.com/allure-framework/allure2/releases/download/${ALLURE_VERSION}/${ALLURE_ZIP}
echo "${ALLURE_ZIP_MD5} ${ALLURE_ZIP}" | md5sum -c
unzip -q ${ALLURE_ZIP}
echo "$(pwd)/allure-${ALLURE_VERSION}/bin" >> $GITHUB_PATH
rm -f ${ALLURE_ZIP}
fi
env:
ALLURE_VERSION: 2.19.0
ALLURE_ZIP_MD5: ced21401a1a8b9dfb68cee9e4c210464
- name: Upload Allure results
if: ${{ inputs.action == 'store' }}
env:
REPORT_PREFIX: reports/${{ steps.calculate-vars.outputs.KEY }}/${{ inputs.build_type }}
RAW_PREFIX: reports-raw/${{ steps.calculate-vars.outputs.KEY }}/${{ inputs.build_type }}
TEST_OUTPUT: /tmp/test_output
BUCKET: neon-github-public-dev
TEST_SELECTION: ${{ steps.calculate-vars.outputs.TEST_SELECTION }}
shell: bash -euxo pipefail {0}
run: |
# Add metadata
cat <<EOF > $TEST_OUTPUT/allure/results/executor.json
{
"name": "GitHub Actions",
"type": "github",
"url": "https://${BUCKET}.s3.amazonaws.com/${REPORT_PREFIX}/latest/index.html",
"buildOrder": ${GITHUB_RUN_ID},
"buildName": "GitHub Actions Run #${{ github.run_number }}/${GITHUB_RUN_ATTEMPT}",
"buildUrl": "${GITHUB_SERVER_URL}/${GITHUB_REPOSITORY}/actions/runs/${GITHUB_RUN_ID}/attempts/${GITHUB_RUN_ATTEMPT}",
"reportUrl": "https://${BUCKET}.s3.amazonaws.com/${REPORT_PREFIX}/${GITHUB_RUN_ID}/index.html",
"reportName": "Allure Report"
}
EOF
cat <<EOF > $TEST_OUTPUT/allure/results/environment.properties
TEST_SELECTION=${{ inputs.test_selection }}
BUILD_TYPE=${{ inputs.build_type }}
EOF
ARCHIVE="${GITHUB_RUN_ID}-${TEST_SELECTION}-${GITHUB_RUN_ATTEMPT}-$(date +%s).tar.zst"
ZSTD_NBTHREADS=0
tar -C ${TEST_OUTPUT}/allure/results -cf ${ARCHIVE} --zstd .
aws s3 mv --only-show-errors ${ARCHIVE} "s3://${BUCKET}/${RAW_PREFIX}/${ARCHIVE}"
# Potentially we could have several running build for the same key (for example for the main branch), so we use improvised lock for this
- name: Acquire Allure lock
if: ${{ inputs.action == 'generate' }}
shell: bash -euxo pipefail {0}
env:
LOCK_FILE: reports/${{ steps.calculate-vars.outputs.KEY }}/lock.txt
BUCKET: neon-github-public-dev
TEST_SELECTION: ${{ steps.calculate-vars.outputs.TEST_SELECTION }}
run: |
LOCK_TIMEOUT=300 # seconds
for _ in $(seq 1 5); do
for i in $(seq 1 ${LOCK_TIMEOUT}); do
LOCK_ADDED=$(aws s3api head-object --bucket neon-github-public-dev --key ${LOCK_FILE} | jq --raw-output '.LastModified' || true)
# `date --date="..."` is supported only by gnu date (i.e. it doesn't work on BSD/macOS)
if [ -z "${LOCK_ADDED}" ] || [ "$(( $(date +%s) - $(date --date="${LOCK_ADDED}" +%s) ))" -gt "${LOCK_TIMEOUT}" ]; then
break
fi
sleep 1
done
echo "${GITHUB_RUN_ID}-${GITHUB_RUN_ATTEMPT}-${TEST_SELECTION}" > lock.txt
aws s3 mv --only-show-errors lock.txt "s3://${BUCKET}/${LOCK_FILE}"
# A double-check that exactly WE have acquired the lock
aws s3 cp --only-show-errors "s3://${BUCKET}/${LOCK_FILE}" ./lock.txt
if [ "$(cat lock.txt)" = "${GITHUB_RUN_ID}-${GITHUB_RUN_ATTEMPT}-${TEST_SELECTION}" ]; then
break
fi
done
- name: Generate and publish final Allure report
if: ${{ inputs.action == 'generate' }}
id: generate-report
env:
REPORT_PREFIX: reports/${{ steps.calculate-vars.outputs.KEY }}/${{ inputs.build_type }}
RAW_PREFIX: reports-raw/${{ steps.calculate-vars.outputs.KEY }}/${{ inputs.build_type }}
TEST_OUTPUT: /tmp/test_output
BUCKET: neon-github-public-dev
shell: bash -euxo pipefail {0}
run: |
# Get previously uploaded data for this run
ZSTD_NBTHREADS=0
s3_filepaths=$(aws s3api list-objects-v2 --bucket ${BUCKET} --prefix ${RAW_PREFIX}/${GITHUB_RUN_ID}- | jq --raw-output '.Contents[].Key')
if [ -z "$s3_filepaths" ]; then
# There's no previously uploaded data for this run
exit 0
fi
for s3_filepath in ${s3_filepaths}; do
aws s3 cp --only-show-errors "s3://${BUCKET}/${s3_filepath}" "${TEST_OUTPUT}/allure/"
archive=${TEST_OUTPUT}/allure/$(basename $s3_filepath)
mkdir -p ${archive%.tar.zst}
tar -xf ${archive} -C ${archive%.tar.zst}
rm -f ${archive}
done
# Get history trend
aws s3 cp --recursive --only-show-errors "s3://${BUCKET}/${REPORT_PREFIX}/latest/history" "${TEST_OUTPUT}/allure/latest/history" || true
# Generate report
allure generate --clean --output $TEST_OUTPUT/allure/report $TEST_OUTPUT/allure/*
# Replace a logo link with a redirect to the latest version of the report
sed -i 's|<a href="." class=|<a href="https://'${BUCKET}'.s3.amazonaws.com/'${REPORT_PREFIX}'/latest/index.html" class=|g' $TEST_OUTPUT/allure/report/app.js
# Upload a history and the final report (in this particular order to not to have duplicated history in 2 places)
aws s3 mv --recursive --only-show-errors "${TEST_OUTPUT}/allure/report/history" "s3://${BUCKET}/${REPORT_PREFIX}/latest/history"
aws s3 mv --recursive --only-show-errors "${TEST_OUTPUT}/allure/report" "s3://${BUCKET}/${REPORT_PREFIX}/${GITHUB_RUN_ID}"
REPORT_URL=https://${BUCKET}.s3.amazonaws.com/${REPORT_PREFIX}/${GITHUB_RUN_ID}/index.html
# Generate redirect
cat <<EOF > ./index.html
<!DOCTYPE html>
<meta charset="utf-8">
<title>Redirecting to ${REPORT_URL}</title>
<meta http-equiv="refresh" content="0; URL=${REPORT_URL}">
EOF
aws s3 cp --only-show-errors ./index.html "s3://${BUCKET}/${REPORT_PREFIX}/latest/index.html"
echo "[Allure Report](${REPORT_URL})" >> ${GITHUB_STEP_SUMMARY}
echo "report-url=${REPORT_URL}" >> $GITHUB_OUTPUT
- name: Release Allure lock
if: ${{ inputs.action == 'generate' && always() }}
shell: bash -euxo pipefail {0}
env:
LOCK_FILE: reports/${{ steps.calculate-vars.outputs.KEY }}/lock.txt
BUCKET: neon-github-public-dev
TEST_SELECTION: ${{ steps.calculate-vars.outputs.TEST_SELECTION }}
run: |
aws s3 cp --only-show-errors "s3://${BUCKET}/${LOCK_FILE}" ./lock.txt || exit 0
if [ "$(cat lock.txt)" = "${GITHUB_RUN_ID}-${GITHUB_RUN_ATTEMPT}-${TEST_SELECTION}" ]; then
aws s3 rm "s3://${BUCKET}/${LOCK_FILE}"
fi
- uses: actions/github-script@v6
if: ${{ inputs.action == 'generate' && always() }}
env:
REPORT_URL: ${{ steps.generate-report.outputs.report-url }}
BUILD_TYPE: ${{ inputs.build_type }}
SHA: ${{ github.event.pull_request.head.sha || github.sha }}
with:
script: |
const { REPORT_URL, BUILD_TYPE, SHA } = process.env
await github.rest.repos.createCommitStatus({
owner: context.repo.owner,
repo: context.repo.repo,
sha: `${SHA}`,
state: 'success',
target_url: `${REPORT_URL}`,
context: `Allure report / ${BUILD_TYPE}`,
})

59
.github/actions/download/action.yml vendored Normal file
View File

@@ -0,0 +1,59 @@
name: "Download an artifact"
description: "Custom download action"
inputs:
name:
description: "Artifact name"
required: true
path:
description: "A directory to put artifact into"
default: "."
required: false
skip-if-does-not-exist:
description: "Allow to skip if file doesn't exist, fail otherwise"
default: false
required: false
prefix:
description: "S3 prefix. Default is '${GITHUB_RUN_ID}/${GITHUB_RUN_ATTEMPT}'"
required: false
runs:
using: "composite"
steps:
- name: Download artifact
id: download-artifact
shell: bash -euxo pipefail {0}
env:
TARGET: ${{ inputs.path }}
ARCHIVE: /tmp/downloads/${{ inputs.name }}.tar.zst
SKIP_IF_DOES_NOT_EXIST: ${{ inputs.skip-if-does-not-exist }}
PREFIX: artifacts/${{ inputs.prefix || format('{0}/{1}', github.run_id, github.run_attempt) }}
run: |
BUCKET=neon-github-public-dev
FILENAME=$(basename $ARCHIVE)
S3_KEY=$(aws s3api list-objects-v2 --bucket ${BUCKET} --prefix ${PREFIX%$GITHUB_RUN_ATTEMPT} | jq -r '.Contents[].Key' | grep ${FILENAME} | sort --version-sort | tail -1 || true)
if [ -z "${S3_KEY}" ]; then
if [ "${SKIP_IF_DOES_NOT_EXIST}" = "true" ]; then
echo 'SKIPPED=true' >> $GITHUB_OUTPUT
exit 0
else
echo 2>&1 "Neither s3://${BUCKET}/${PREFIX}/${FILENAME} nor its version from previous attempts exist"
exit 1
fi
fi
echo 'SKIPPED=false' >> $GITHUB_OUTPUT
mkdir -p $(dirname $ARCHIVE)
time aws s3 cp --only-show-errors s3://${BUCKET}/${S3_KEY} ${ARCHIVE}
- name: Extract artifact
if: ${{ steps.download-artifact.outputs.SKIPPED == 'false' }}
shell: bash -euxo pipefail {0}
env:
TARGET: ${{ inputs.path }}
ARCHIVE: /tmp/downloads/${{ inputs.name }}.tar.zst
run: |
mkdir -p ${TARGET}
time tar -xf ${ARCHIVE} -C ${TARGET}
rm -f ${ARCHIVE}

View File

@@ -0,0 +1,138 @@
name: 'Create Branch'
description: 'Create Branch using API'
inputs:
api_key:
desctiption: 'Neon API key'
required: true
project_id:
desctiption: 'ID of the Project to create Branch in'
required: true
api_host:
desctiption: 'Neon API host'
default: console.stage.neon.tech
outputs:
dsn:
description: 'Created Branch DSN (for main database)'
value: ${{ steps.change-password.outputs.dsn }}
branch_id:
description: 'Created Branch ID'
value: ${{ steps.create-branch.outputs.branch_id }}
runs:
using: "composite"
steps:
- name: Create New Branch
id: create-branch
shell: bash -euxo pipefail {0}
run: |
for i in $(seq 1 10); do
branch=$(curl \
"https://${API_HOST}/api/v2/projects/${PROJECT_ID}/branches" \
--header "Accept: application/json" \
--header "Content-Type: application/json" \
--header "Authorization: Bearer ${API_KEY}" \
--data "{
\"branch\": {
\"name\": \"Created by actions/neon-branch-create; GITHUB_RUN_ID=${GITHUB_RUN_ID} at $(date +%s)\"
},
\"endpoints\": [
{
\"type\": \"read_write\"
}
]
}")
if [ -z "${branch}" ]; then
sleep 1
continue
fi
branch_id=$(echo $branch | jq --raw-output '.branch.id')
if [ "${branch_id}" == "null" ]; then
sleep 1
continue
fi
break
done
if [ -z "${branch_id}" ] || [ "${branch_id}" == "null" ]; then
echo 2>&1 "Failed to create branch after 10 attempts, the latest response was: ${branch}"
exit 1
fi
branch_id=$(echo $branch | jq --raw-output '.branch.id')
echo "branch_id=${branch_id}" >> $GITHUB_OUTPUT
host=$(echo $branch | jq --raw-output '.endpoints[0].host')
echo "host=${host}" >> $GITHUB_OUTPUT
env:
API_HOST: ${{ inputs.api_host }}
API_KEY: ${{ inputs.api_key }}
PROJECT_ID: ${{ inputs.project_id }}
- name: Get Role name
id: role-name
shell: bash -euxo pipefail {0}
run: |
roles=$(curl \
"https://${API_HOST}/api/v2/projects/${PROJECT_ID}/branches/${BRANCH_ID}/roles" \
--fail \
--header "Accept: application/json" \
--header "Content-Type: application/json" \
--header "Authorization: Bearer ${API_KEY}"
)
role_name=$(echo $roles | jq --raw-output '.roles[] | select(.protected == false) | .name')
echo "role_name=${role_name}" >> $GITHUB_OUTPUT
env:
API_HOST: ${{ inputs.api_host }}
API_KEY: ${{ inputs.api_key }}
PROJECT_ID: ${{ inputs.project_id }}
BRANCH_ID: ${{ steps.create-branch.outputs.branch_id }}
- name: Change Password
id: change-password
# A shell without `set -x` to not to expose password/dsn in logs
shell: bash -euo pipefail {0}
run: |
for i in $(seq 1 10); do
reset_password=$(curl \
"https://${API_HOST}/api/v2/projects/${PROJECT_ID}/branches/${BRANCH_ID}/roles/${ROLE_NAME}/reset_password" \
--request POST \
--header "Accept: application/json" \
--header "Content-Type: application/json" \
--header "Authorization: Bearer ${API_KEY}"
)
if [ -z "${reset_password}" ]; then
sleep 1
continue
fi
password=$(echo $reset_password | jq --raw-output '.role.password')
if [ "${password}" == "null" ]; then
sleep 1
continue
fi
echo "::add-mask::${password}"
break
done
if [ -z "${password}" ] || [ "${password}" == "null" ]; then
echo 2>&1 "Failed to reset password after 10 attempts, the latest response was: ${reset_password}"
exit 1
fi
dsn="postgres://${ROLE_NAME}:${password}@${HOST}/neondb"
echo "::add-mask::${dsn}"
echo "dsn=${dsn}" >> $GITHUB_OUTPUT
env:
API_HOST: ${{ inputs.api_host }}
API_KEY: ${{ inputs.api_key }}
PROJECT_ID: ${{ inputs.project_id }}
BRANCH_ID: ${{ steps.create-branch.outputs.branch_id }}
ROLE_NAME: ${{ steps.role-name.outputs.role_name }}
HOST: ${{ steps.create-branch.outputs.host }}

View File

@@ -0,0 +1,58 @@
name: 'Delete Branch'
description: 'Delete Branch using API'
inputs:
api_key:
desctiption: 'Neon API key'
required: true
project_id:
desctiption: 'ID of the Project which should be deleted'
required: true
branch_id:
desctiption: 'ID of the branch to delete'
required: true
api_host:
desctiption: 'Neon API host'
default: console.stage.neon.tech
runs:
using: "composite"
steps:
- name: Delete Branch
# Do not try to delete a branch if .github/actions/neon-project-create
# or .github/actions/neon-branch-create failed before
if: ${{ inputs.project_id != '' && inputs.branch_id != '' }}
shell: bash -euxo pipefail {0}
run: |
for i in $(seq 1 10); do
deleted_branch=$(curl \
"https://${API_HOST}/api/v2/projects/${PROJECT_ID}/branches/${BRANCH_ID}" \
--request DELETE \
--header "Accept: application/json" \
--header "Content-Type: application/json" \
--header "Authorization: Bearer ${API_KEY}"
)
if [ -z "${deleted_branch}" ]; then
sleep 1
continue
fi
branch_id=$(echo $deleted_branch | jq --raw-output '.branch.id')
if [ "${branch_id}" == "null" ]; then
sleep 1
continue
fi
break
done
if [ -z "${branch_id}" ] || [ "${branch_id}" == "null" ]; then
echo 2>&1 "Failed to delete branch after 10 attempts, the latest response was: ${deleted_branch}"
exit 1
fi
env:
API_HOST: ${{ inputs.api_host }}
API_KEY: ${{ inputs.api_key }}
PROJECT_ID: ${{ inputs.project_id }}
BRANCH_ID: ${{ inputs.branch_id }}

View File

@@ -0,0 +1,64 @@
name: 'Create Neon Project'
description: 'Create Neon Project using API'
inputs:
api_key:
desctiption: 'Neon API key'
required: true
region_id:
desctiption: 'Region ID, if not set the project will be created in the default region'
default: aws-us-east-2
postgres_version:
desctiption: 'Postgres version; default is 15'
default: 15
api_host:
desctiption: 'Neon API host'
default: console.stage.neon.tech
outputs:
dsn:
description: 'Created Project DSN (for main database)'
value: ${{ steps.create-neon-project.outputs.dsn }}
project_id:
description: 'Created Project ID'
value: ${{ steps.create-neon-project.outputs.project_id }}
runs:
using: "composite"
steps:
- name: Create Neon Project
id: create-neon-project
# A shell without `set -x` to not to expose password/dsn in logs
shell: bash -euo pipefail {0}
run: |
project=$(curl \
"https://${API_HOST}/api/v2/projects" \
--fail \
--header "Accept: application/json" \
--header "Content-Type: application/json" \
--header "Authorization: Bearer ${API_KEY}" \
--data "{
\"project\": {
\"name\": \"Created by actions/neon-project-create; GITHUB_RUN_ID=${GITHUB_RUN_ID}\",
\"pg_version\": ${POSTGRES_VERSION},
\"region_id\": \"${REGION_ID}\",
\"settings\": { }
}
}")
# Mask password
echo "::add-mask::$(echo $project | jq --raw-output '.roles[] | select(.name != "web_access") | .password')"
dsn=$(echo $project | jq --raw-output '.connection_uris[0].connection_uri')
echo "::add-mask::${dsn}"
echo "dsn=${dsn}" >> $GITHUB_OUTPUT
project_id=$(echo $project | jq --raw-output '.project.id')
echo "project_id=${project_id}" >> $GITHUB_OUTPUT
echo "Project ${project_id} has been created"
env:
API_HOST: ${{ inputs.api_host }}
API_KEY: ${{ inputs.api_key }}
REGION_ID: ${{ inputs.region_id }}
POSTGRES_VERSION: ${{ inputs.postgres_version }}

View File

@@ -0,0 +1,35 @@
name: 'Delete Neon Project'
description: 'Delete Neon Project using API'
inputs:
api_key:
desctiption: 'Neon API key'
required: true
project_id:
desctiption: 'ID of the Project to delete'
required: true
api_host:
desctiption: 'Neon API host'
default: console.stage.neon.tech
runs:
using: "composite"
steps:
- name: Delete Neon Project
# Do not try to delete a project if .github/actions/neon-project-create failed before
if: ${{ inputs.project_id != '' }}
shell: bash -euxo pipefail {0}
run: |
curl \
"https://${API_HOST}/api/v2/projects/${PROJECT_ID}" \
--fail \
--request DELETE \
--header "Accept: application/json" \
--header "Content-Type: application/json" \
--header "Authorization: Bearer ${API_KEY}"
echo "Project ${PROJECT_ID} has been deleted"
env:
API_HOST: ${{ inputs.api_host }}
API_KEY: ${{ inputs.api_key }}
PROJECT_ID: ${{ inputs.project_id }}

View File

@@ -0,0 +1,198 @@
name: 'Run python test'
description: 'Runs a Neon python test set, performing all the required preparations before'
inputs:
build_type:
description: 'Type of Rust (neon) and C (postgres) builds. Must be "release" or "debug", or "remote" for the remote cluster'
required: true
test_selection:
description: 'A python test suite to run'
required: true
extra_params:
description: 'Arbitrary parameters to pytest. For example "-s" to prevent capturing stdout/stderr'
required: false
default: ''
needs_postgres_source:
description: 'Set to true if the test suite requires postgres source checked out'
required: false
default: 'false'
run_in_parallel:
description: 'Whether to run tests in parallel'
required: false
default: 'true'
save_perf_report:
description: 'Whether to upload the performance report, if true PERF_TEST_RESULT_CONNSTR env variable should be set'
required: false
default: 'false'
run_with_real_s3:
description: 'Whether to pass real s3 credentials to the test suite'
required: false
default: 'false'
real_s3_bucket:
description: 'Bucket name for real s3 tests'
required: false
default: ''
real_s3_region:
description: 'Region name for real s3 tests'
required: false
default: ''
real_s3_access_key_id:
description: 'Access key id'
required: false
default: ''
real_s3_secret_access_key:
description: 'Secret access key'
required: false
default: ''
runs:
using: "composite"
steps:
- name: Get Neon artifact
if: inputs.build_type != 'remote'
uses: ./.github/actions/download
with:
name: neon-${{ runner.os }}-${{ inputs.build_type }}-artifact
path: /tmp/neon
- name: Download Neon binaries for the previous release
if: inputs.build_type != 'remote'
uses: ./.github/actions/download
with:
name: neon-${{ runner.os }}-${{ inputs.build_type }}-artifact
path: /tmp/neon-previous
prefix: latest
- name: Download compatibility snapshot for Postgres 14
if: inputs.build_type != 'remote'
uses: ./.github/actions/download
with:
name: compatibility-snapshot-${{ inputs.build_type }}-pg14
path: /tmp/compatibility_snapshot_pg14
prefix: latest
- name: Checkout
if: inputs.needs_postgres_source == 'true'
uses: actions/checkout@v3
with:
submodules: true
fetch-depth: 1
- name: Cache poetry deps
id: cache_poetry
uses: actions/cache@v3
with:
path: ~/.cache/pypoetry/virtualenvs
key: v1-${{ runner.os }}-python-deps-${{ hashFiles('poetry.lock') }}
- name: Install Python deps
shell: bash -euxo pipefail {0}
run: ./scripts/pysync
- name: Run pytest
env:
NEON_BIN: /tmp/neon/bin
COMPATIBILITY_NEON_BIN: /tmp/neon-previous/bin
COMPATIBILITY_POSTGRES_DISTRIB_DIR: /tmp/neon-previous/pg_install
TEST_OUTPUT: /tmp/test_output
BUILD_TYPE: ${{ inputs.build_type }}
AWS_ACCESS_KEY_ID: ${{ inputs.real_s3_access_key_id }}
AWS_SECRET_ACCESS_KEY: ${{ inputs.real_s3_secret_access_key }}
COMPATIBILITY_SNAPSHOT_DIR: /tmp/compatibility_snapshot_pg14
ALLOW_BACKWARD_COMPATIBILITY_BREAKAGE: contains(github.event.pull_request.labels.*.name, 'backward compatibility breakage')
ALLOW_FORWARD_COMPATIBILITY_BREAKAGE: contains(github.event.pull_request.labels.*.name, 'forward compatibility breakage')
shell: bash -euxo pipefail {0}
run: |
# PLATFORM will be embedded in the perf test report
# and it is needed to distinguish different environments
export PLATFORM=${PLATFORM:-github-actions-selfhosted}
export POSTGRES_DISTRIB_DIR=${POSTGRES_DISTRIB_DIR:-/tmp/neon/pg_install}
export DEFAULT_PG_VERSION=${DEFAULT_PG_VERSION:-14}
if [ "${BUILD_TYPE}" = "remote" ]; then
export REMOTE_ENV=1
fi
PERF_REPORT_DIR="$(realpath test_runner/perf-report-local)"
rm -rf $PERF_REPORT_DIR
TEST_SELECTION="test_runner/${{ inputs.test_selection }}"
EXTRA_PARAMS="${{ inputs.extra_params }}"
if [ -z "$TEST_SELECTION" ]; then
echo "test_selection must be set"
exit 1
fi
if [[ "${{ inputs.run_in_parallel }}" == "true" ]]; then
# -n16 uses sixteen processes to run tests via pytest-xdist
EXTRA_PARAMS="-n16 $EXTRA_PARAMS"
# --dist=loadgroup points tests marked with @pytest.mark.xdist_group
# to the same worker to make @pytest.mark.order work with xdist
EXTRA_PARAMS="--dist=loadgroup $EXTRA_PARAMS"
fi
if [[ "${{ inputs.run_with_real_s3 }}" == "true" ]]; then
echo "REAL S3 ENABLED"
export ENABLE_REAL_S3_REMOTE_STORAGE=nonempty
export REMOTE_STORAGE_S3_BUCKET=${{ inputs.real_s3_bucket }}
export REMOTE_STORAGE_S3_REGION=${{ inputs.real_s3_region }}
fi
if [[ "${{ inputs.save_perf_report }}" == "true" ]]; then
mkdir -p "$PERF_REPORT_DIR"
EXTRA_PARAMS="--out-dir $PERF_REPORT_DIR $EXTRA_PARAMS"
fi
if [[ "${{ inputs.build_type }}" == "debug" ]]; then
cov_prefix=(scripts/coverage "--profraw-prefix=$GITHUB_JOB" --dir=/tmp/coverage run)
elif [[ "${{ inputs.build_type }}" == "release" ]]; then
cov_prefix=()
else
cov_prefix=()
fi
# Wake up the cluster if we use remote neon instance
if [ "${{ inputs.build_type }}" = "remote" ] && [ -n "${BENCHMARK_CONNSTR}" ]; then
${POSTGRES_DISTRIB_DIR}/v${DEFAULT_PG_VERSION}/bin/psql ${BENCHMARK_CONNSTR} -c "SELECT version();"
fi
# Run the tests.
#
# The junit.xml file allows CI tools to display more fine-grained test information
# in its "Tests" tab in the results page.
# --verbose prints name of each test (helpful when there are
# multiple tests in one file)
# -rA prints summary in the end
# -s is not used to prevent pytest from capturing output, because tests are running
# in parallel and logs are mixed between different tests
#
mkdir -p $TEST_OUTPUT/allure/results
"${cov_prefix[@]}" ./scripts/pytest \
--junitxml=$TEST_OUTPUT/junit.xml \
--alluredir=$TEST_OUTPUT/allure/results \
--tb=short \
--verbose \
-rA $TEST_SELECTION $EXTRA_PARAMS
if [[ "${{ inputs.save_perf_report }}" == "true" ]]; then
export REPORT_FROM="$PERF_REPORT_DIR"
export REPORT_TO="$PLATFORM"
scripts/generate_and_push_perf_report.sh
fi
- name: Upload compatibility snapshot for Postgres 14
if: github.ref_name == 'release'
uses: ./.github/actions/upload
with:
name: compatibility-snapshot-${{ inputs.build_type }}-pg14-${{ github.run_id }}
# The path includes a test name (test_create_snapshot) and directory that the test creates (compatibility_snapshot_pg14), keep the path in sync with the test
path: /tmp/test_output/test_create_snapshot/compatibility_snapshot_pg14/
prefix: latest
- name: Create Allure report
if: success() || failure()
uses: ./.github/actions/allure-report
with:
action: store
build_type: ${{ inputs.build_type }}
test_selection: ${{ inputs.test_selection }}

View File

@@ -0,0 +1,22 @@
name: 'Merge and upload coverage data'
description: 'Compresses and uploads the coverage data as an artifact'
runs:
using: "composite"
steps:
- name: Merge coverage data
shell: bash -euxo pipefail {0}
run: scripts/coverage "--profraw-prefix=$GITHUB_JOB" --dir=/tmp/coverage merge
- name: Download previous coverage data into the same directory
uses: ./.github/actions/download
with:
name: coverage-data-artifact
path: /tmp/coverage
skip-if-does-not-exist: true # skip if there's no previous coverage to download
- name: Upload coverage data
uses: ./.github/actions/upload
with:
name: coverage-data-artifact
path: /tmp/coverage

58
.github/actions/upload/action.yml vendored Normal file
View File

@@ -0,0 +1,58 @@
name: "Upload an artifact"
description: "Custom upload action"
inputs:
name:
description: "Artifact name"
required: true
path:
description: "A directory or file to upload"
required: true
prefix:
description: "S3 prefix. Default is '${GITHUB_RUN_ID}/${GITHUB_RUN_ATTEMPT}'"
required: false
runs:
using: "composite"
steps:
- name: Prepare artifact
shell: bash -euxo pipefail {0}
env:
SOURCE: ${{ inputs.path }}
ARCHIVE: /tmp/uploads/${{ inputs.name }}.tar.zst
run: |
mkdir -p $(dirname $ARCHIVE)
if [ -f ${ARCHIVE} ]; then
echo 2>&1 "File ${ARCHIVE} already exist. Something went wrong before"
exit 1
fi
ZSTD_NBTHREADS=0
if [ -d ${SOURCE} ]; then
time tar -C ${SOURCE} -cf ${ARCHIVE} --zstd .
elif [ -f ${SOURCE} ]; then
time tar -cf ${ARCHIVE} --zstd ${SOURCE}
elif ! ls ${SOURCE} > /dev/null 2>&1; then
echo 2>&1 "${SOURCE} does not exist"
exit 2
else
echo 2>&1 "${SOURCE} is neither a directory nor a file, do not know how to handle it"
exit 3
fi
- name: Upload artifact
shell: bash -euxo pipefail {0}
env:
SOURCE: ${{ inputs.path }}
ARCHIVE: /tmp/uploads/${{ inputs.name }}.tar.zst
PREFIX: artifacts/${{ inputs.prefix || format('{0}/{1}', github.run_id, github.run_attempt) }}
run: |
BUCKET=neon-github-public-dev
FILENAME=$(basename $ARCHIVE)
FILESIZE=$(du -sh ${ARCHIVE} | cut -f1)
time aws s3 mv --only-show-errors ${ARCHIVE} s3://${BUCKET}/${PREFIX}/${FILENAME}
# Ref https://docs.github.com/en/actions/using-workflows/workflow-commands-for-github-actions#adding-a-job-summary
echo "[${FILENAME}](https://${BUCKET}.s3.amazonaws.com/${PREFIX}/${FILENAME}) ${FILESIZE}" >> ${GITHUB_STEP_SUMMARY}

5
.github/ansible/.gitignore vendored Normal file
View File

@@ -0,0 +1,5 @@
neon_install.tar.gz
.neon_current_version
collections/*
!collections/.keep

12
.github/ansible/ansible.cfg vendored Normal file
View File

@@ -0,0 +1,12 @@
[defaults]
localhost_warning = False
host_key_checking = False
timeout = 30
[ssh_connection]
ssh_args = -F ./ansible.ssh.cfg
# teleport doesn't support sftp yet https://github.com/gravitational/teleport/issues/7127
# and scp neither worked for me
transfer_method = piped
pipelining = True

15
.github/ansible/ansible.ssh.cfg vendored Normal file
View File

@@ -0,0 +1,15 @@
# Remove this once https://github.com/gravitational/teleport/issues/10918 is fixed
# (use pre 8.5 option name to cope with old ssh in CI)
PubkeyAcceptedKeyTypes +ssh-rsa-cert-v01@openssh.com
Host tele.zenith.tech
User admin
Port 3023
StrictHostKeyChecking no
UserKnownHostsFile /dev/null
Host * !tele.zenith.tech
User admin
StrictHostKeyChecking no
UserKnownHostsFile /dev/null
ProxyJump tele.zenith.tech

0
.github/ansible/collections/.keep vendored Normal file
View File

211
.github/ansible/deploy.yaml vendored Normal file
View File

@@ -0,0 +1,211 @@
- name: Upload Neon binaries
hosts: storage
gather_facts: False
remote_user: "{{ remote_user }}"
tasks:
- name: get latest version of Neon binaries
register: current_version_file
set_fact:
current_version: "{{ lookup('file', '.neon_current_version') | trim }}"
tags:
- pageserver
- safekeeper
- name: inform about versions
debug:
msg: "Version to deploy - {{ current_version }}"
tags:
- pageserver
- safekeeper
- name: upload and extract Neon binaries to /usr/local
ansible.builtin.unarchive:
owner: root
group: root
src: neon_install.tar.gz
dest: /usr/local
become: true
tags:
- pageserver
- safekeeper
- binaries
- putbinaries
- name: Deploy pageserver
hosts: pageservers
gather_facts: False
remote_user: "{{ remote_user }}"
tasks:
- name: upload init script
when: console_mgmt_base_url is defined
ansible.builtin.template:
src: scripts/init_pageserver.sh
dest: /tmp/init_pageserver.sh
owner: root
group: root
mode: '0755'
become: true
tags:
- pageserver
- name: init pageserver
shell:
cmd: /tmp/init_pageserver.sh
args:
creates: "/storage/pageserver/data/tenants"
environment:
NEON_REPO_DIR: "/storage/pageserver/data"
LD_LIBRARY_PATH: "/usr/local/v14/lib"
become: true
tags:
- pageserver
- name: read the existing remote pageserver config
ansible.builtin.slurp:
src: /storage/pageserver/data/pageserver.toml
register: _remote_ps_config
tags:
- pageserver
- name: parse the existing pageserver configuration
ansible.builtin.set_fact:
_existing_ps_config: "{{ _remote_ps_config['content'] | b64decode | sivel.toiletwater.from_toml }}"
tags:
- pageserver
- name: construct the final pageserver configuration dict
ansible.builtin.set_fact:
pageserver_config: "{{ pageserver_config_stub | combine({'id': _existing_ps_config.id }) }}"
tags:
- pageserver
- name: template the pageserver config
template:
src: templates/pageserver.toml.j2
dest: /storage/pageserver/data/pageserver.toml
become: true
tags:
- pageserver
# used in `pageserver.service` template
- name: learn current availability_zone
shell:
cmd: "curl -s http://169.254.169.254/latest/meta-data/placement/availability-zone"
register: ec2_availability_zone
- set_fact:
ec2_availability_zone={{ ec2_availability_zone.stdout }}
- name: upload systemd service definition
ansible.builtin.template:
src: systemd/pageserver.service
dest: /etc/systemd/system/pageserver.service
owner: root
group: root
mode: '0644'
become: true
tags:
- pageserver
- name: start systemd service
ansible.builtin.systemd:
daemon_reload: yes
name: pageserver
enabled: yes
state: restarted
become: true
tags:
- pageserver
- name: post version to console
when: console_mgmt_base_url is defined
shell:
cmd: |
INSTANCE_ID=$(curl -s http://169.254.169.254/latest/meta-data/instance-id)
curl -sfS -H "Authorization: Bearer {{ CONSOLE_API_TOKEN }}" {{ console_mgmt_base_url }}/management/api/v2/pageservers/$INSTANCE_ID | jq '.version = {{ current_version }}' > /tmp/new_version
curl -sfS -H "Authorization: Bearer {{ CONSOLE_API_TOKEN }}" -H "Content-Type: application/json" -X POST -d@/tmp/new_version {{ console_mgmt_base_url }}/management/api/v2/pageservers
tags:
- pageserver
- name: Deploy safekeeper
hosts: safekeepers
gather_facts: False
remote_user: "{{ remote_user }}"
tasks:
- name: upload init script
when: console_mgmt_base_url is defined
ansible.builtin.template:
src: scripts/init_safekeeper.sh
dest: /tmp/init_safekeeper.sh
owner: root
group: root
mode: '0755'
become: true
tags:
- safekeeper
- name: init safekeeper
shell:
cmd: /tmp/init_safekeeper.sh
args:
creates: "/storage/safekeeper/data/safekeeper.id"
environment:
NEON_REPO_DIR: "/storage/safekeeper/data"
LD_LIBRARY_PATH: "/usr/local/v14/lib"
become: true
tags:
- safekeeper
# used in `safekeeper.service` template
- name: learn current availability_zone
shell:
cmd: "curl -s http://169.254.169.254/latest/meta-data/placement/availability-zone"
register: ec2_availability_zone
- set_fact:
ec2_availability_zone={{ ec2_availability_zone.stdout }}
# in the future safekeepers should discover pageservers byself
# but currently use first pageserver that was discovered
- name: set first pageserver var for safekeepers
set_fact:
first_pageserver: "{{ hostvars[groups['pageservers'][0]]['inventory_hostname'] }}"
tags:
- safekeeper
- name: upload systemd service definition
ansible.builtin.template:
src: systemd/safekeeper.service
dest: /etc/systemd/system/safekeeper.service
owner: root
group: root
mode: '0644'
become: true
tags:
- safekeeper
- name: start systemd service
ansible.builtin.systemd:
daemon_reload: yes
name: safekeeper
enabled: yes
state: restarted
become: true
tags:
- safekeeper
- name: post version to console
when: console_mgmt_base_url is defined
shell:
cmd: |
INSTANCE_ID=$(curl -s http://169.254.169.254/latest/meta-data/instance-id)
curl -sfS -H "Authorization: Bearer {{ CONSOLE_API_TOKEN }}" {{ console_mgmt_base_url }}/management/api/v2/safekeepers/$INSTANCE_ID | jq '.version = {{ current_version }}' > /tmp/new_version
curl -sfS -H "Authorization: Bearer {{ CONSOLE_API_TOKEN }}" -H "Content-Type: application/json" -X POST -d@/tmp/new_version {{ console_mgmt_base_url }}/management/api/v2/safekeepers
tags:
- safekeeper

42
.github/ansible/get_binaries.sh vendored Executable file
View File

@@ -0,0 +1,42 @@
#!/bin/bash
set -e
if [ -n "${DOCKER_TAG}" ]; then
# Verson is DOCKER_TAG but without prefix
VERSION=$(echo $DOCKER_TAG | sed 's/^.*-//g')
else
echo "Please set DOCKER_TAG environment variable"
exit 1
fi
# do initial cleanup
rm -rf neon_install postgres_install.tar.gz neon_install.tar.gz .neon_current_version
mkdir neon_install
# retrieve binaries from docker image
echo "getting binaries from docker image"
docker pull --quiet neondatabase/neon:${DOCKER_TAG}
ID=$(docker create neondatabase/neon:${DOCKER_TAG})
docker cp ${ID}:/data/postgres_install.tar.gz .
tar -xzf postgres_install.tar.gz -C neon_install
mkdir neon_install/bin/
docker cp ${ID}:/usr/local/bin/pageserver neon_install/bin/
docker cp ${ID}:/usr/local/bin/pageserver_binutils neon_install/bin/
docker cp ${ID}:/usr/local/bin/safekeeper neon_install/bin/
docker cp ${ID}:/usr/local/bin/storage_broker neon_install/bin/
docker cp ${ID}:/usr/local/bin/proxy neon_install/bin/
docker cp ${ID}:/usr/local/v14/bin/ neon_install/v14/bin/
docker cp ${ID}:/usr/local/v15/bin/ neon_install/v15/bin/
docker cp ${ID}:/usr/local/v14/lib/ neon_install/v14/lib/
docker cp ${ID}:/usr/local/v15/lib/ neon_install/v15/lib/
docker rm -vf ${ID}
# store version to file (for ansible playbooks) and create binaries tarball
echo ${VERSION} > neon_install/.neon_current_version
echo ${VERSION} > .neon_current_version
tar -czf neon_install.tar.gz -C neon_install .
# do final cleaup
rm -rf neon_install postgres_install.tar.gz

View File

@@ -0,0 +1,38 @@
storage:
vars:
bucket_name: neon-prod-storage-ap-southeast-1
bucket_region: ap-southeast-1
console_mgmt_base_url: http://neon-internal-api.aws.neon.tech
broker_endpoint: http://storage-broker-lb.epsilon.ap-southeast-1.internal.aws.neon.tech:50051
pageserver_config_stub:
pg_distrib_dir: /usr/local
metric_collection_endpoint: http://neon-internal-api.aws.neon.tech/billing/api/v1/usage_events
metric_collection_interval: 10min
remote_storage:
bucket_name: "{{ bucket_name }}"
bucket_region: "{{ bucket_region }}"
prefix_in_bucket: "pageserver/v1"
safekeeper_s3_prefix: safekeeper/v1/wal
hostname_suffix: ""
remote_user: ssm-user
ansible_aws_ssm_region: ap-southeast-1
ansible_aws_ssm_bucket_name: neon-prod-storage-ap-southeast-1
console_region_id: aws-ap-southeast-1
sentry_environment: production
children:
pageservers:
hosts:
pageserver-0.ap-southeast-1.aws.neon.tech:
ansible_host: i-064de8ea28bdb495b
pageserver-1.ap-southeast-1.aws.neon.tech:
ansible_host: i-0b180defcaeeb6b93
safekeepers:
hosts:
safekeeper-0.ap-southeast-1.aws.neon.tech:
ansible_host: i-0d6f1dc5161eef894
safekeeper-2.ap-southeast-1.aws.neon.tech:
ansible_host: i-04fb63634e4679eb9
safekeeper-3.ap-southeast-1.aws.neon.tech:
ansible_host: i-05481f3bc88cfc2d4

View File

@@ -0,0 +1,40 @@
storage:
vars:
bucket_name: neon-prod-storage-eu-central-1
bucket_region: eu-central-1
console_mgmt_base_url: http://neon-internal-api.aws.neon.tech
broker_endpoint: http://storage-broker-lb.gamma.eu-central-1.internal.aws.neon.tech:50051
pageserver_config_stub:
pg_distrib_dir: /usr/local
metric_collection_endpoint: http://neon-internal-api.aws.neon.tech/billing/api/v1/usage_events
metric_collection_interval: 10min
remote_storage:
bucket_name: "{{ bucket_name }}"
bucket_region: "{{ bucket_region }}"
prefix_in_bucket: "pageserver/v1"
safekeeper_s3_prefix: safekeeper/v1/wal
hostname_suffix: ""
remote_user: ssm-user
ansible_aws_ssm_region: eu-central-1
ansible_aws_ssm_bucket_name: neon-prod-storage-eu-central-1
console_region_id: aws-eu-central-1
sentry_environment: production
children:
pageservers:
hosts:
pageserver-0.eu-central-1.aws.neon.tech:
ansible_host: i-0cd8d316ecbb715be
pageserver-1.eu-central-1.aws.neon.tech:
ansible_host: i-090044ed3d383fef0
pageserver-2.eu-central-1.aws.neon.tech:
ansible_host: i-033584edf3f4b6742
safekeepers:
hosts:
safekeeper-0.eu-central-1.aws.neon.tech:
ansible_host: i-0b238612d2318a050
safekeeper-1.eu-central-1.aws.neon.tech:
ansible_host: i-07b9c45e5c2637cd4
safekeeper-2.eu-central-1.aws.neon.tech:
ansible_host: i-020257302c3c93d88

View File

@@ -0,0 +1,41 @@
storage:
vars:
bucket_name: neon-prod-storage-us-east-2
bucket_region: us-east-2
console_mgmt_base_url: http://neon-internal-api.aws.neon.tech
broker_endpoint: http://storage-broker-lb.delta.us-east-2.internal.aws.neon.tech:50051
pageserver_config_stub:
pg_distrib_dir: /usr/local
metric_collection_endpoint: http://neon-internal-api.aws.neon.tech/billing/api/v1/usage_events
metric_collection_interval: 10min
remote_storage:
bucket_name: "{{ bucket_name }}"
bucket_region: "{{ bucket_region }}"
prefix_in_bucket: "pageserver/v1"
safekeeper_s3_prefix: safekeeper/v1/wal
hostname_suffix: ""
remote_user: ssm-user
ansible_aws_ssm_region: us-east-2
ansible_aws_ssm_bucket_name: neon-prod-storage-us-east-2
console_region_id: aws-us-east-2
sentry_environment: production
children:
pageservers:
hosts:
pageserver-0.us-east-2.aws.neon.tech:
ansible_host: i-062227ba7f119eb8c
pageserver-1.us-east-2.aws.neon.tech:
ansible_host: i-0b3ec0afab5968938
pageserver-2.us-east-2.aws.neon.tech:
ansible_host: i-0d7a1c4325e71421d
safekeepers:
hosts:
safekeeper-0.us-east-2.aws.neon.tech:
ansible_host: i-0e94224750c57d346
safekeeper-1.us-east-2.aws.neon.tech:
ansible_host: i-06d113fb73bfddeb0
safekeeper-2.us-east-2.aws.neon.tech:
ansible_host: i-09f66c8e04afff2e8

View File

@@ -0,0 +1,43 @@
storage:
vars:
bucket_name: neon-prod-storage-us-west-2
bucket_region: us-west-2
console_mgmt_base_url: http://neon-internal-api.aws.neon.tech
broker_endpoint: http://storage-broker-lb.eta.us-west-2.internal.aws.neon.tech:50051
pageserver_config_stub:
pg_distrib_dir: /usr/local
metric_collection_endpoint: http://neon-internal-api.aws.neon.tech/billing/api/v1/usage_events
metric_collection_interval: 10min
remote_storage:
bucket_name: "{{ bucket_name }}"
bucket_region: "{{ bucket_region }}"
prefix_in_bucket: "pageserver/v1"
safekeeper_s3_prefix: safekeeper/v1/wal
hostname_suffix: ""
remote_user: ssm-user
ansible_aws_ssm_region: us-west-2
ansible_aws_ssm_bucket_name: neon-prod-storage-us-west-2
console_region_id: aws-us-west-2-new
sentry_environment: production
children:
pageservers:
hosts:
pageserver-0.us-west-2.aws.neon.tech:
ansible_host: i-0d9f6dfae0e1c780d
pageserver-1.us-west-2.aws.neon.tech:
ansible_host: i-0c834be1dddba8b3f
pageserver-2.us-west-2.aws.neon.tech:
ansible_host: i-051642d372c0a4f32
pageserver-3.us-west-2.aws.neon.tech:
ansible_host: i-00c3844beb9ad1c6b
safekeepers:
hosts:
safekeeper-0.us-west-2.aws.neon.tech:
ansible_host: i-00719d8a74986fda6
safekeeper-1.us-west-2.aws.neon.tech:
ansible_host: i-074682f9d3c712e7c
safekeeper-2.us-west-2.aws.neon.tech:
ansible_host: i-042b7efb1729d7966

View File

@@ -0,0 +1,33 @@
#!/bin/sh
# fetch params from meta-data service
INSTANCE_ID=$(curl -s http://169.254.169.254/latest/meta-data/instance-id)
AZ_ID=$(curl -s http://169.254.169.254/latest/meta-data/placement/availability-zone)
# store fqdn hostname in var
HOST=$(hostname -f)
cat <<EOF | tee /tmp/payload
{
"version": 1,
"host": "${HOST}",
"port": 6400,
"region_id": "{{ console_region_id }}",
"instance_id": "${INSTANCE_ID}",
"http_host": "${HOST}",
"http_port": 9898,
"active": false,
"availability_zone_id": "${AZ_ID}"
}
EOF
# check if pageserver already registered or not
if ! curl -sf -H "Authorization: Bearer {{ CONSOLE_API_TOKEN }}" {{ console_mgmt_base_url }}/management/api/v2/pageservers/${INSTANCE_ID} -o /dev/null; then
# not registered, so register it now
ID=$(curl -sf -X POST -H "Authorization: Bearer {{ CONSOLE_API_TOKEN }}" -H "Content-Type: application/json" {{ console_mgmt_base_url }}/management/api/v2/pageservers -d@/tmp/payload | jq -r '.id')
# init pageserver
sudo -u pageserver /usr/local/bin/pageserver -c "id=${ID}" -c "pg_distrib_dir='/usr/local'" --init -D /storage/pageserver/data
fi

View File

@@ -0,0 +1,31 @@
#!/bin/sh
# fetch params from meta-data service
INSTANCE_ID=$(curl -s http://169.254.169.254/latest/meta-data/instance-id)
AZ_ID=$(curl -s http://169.254.169.254/latest/meta-data/placement/availability-zone)
# store fqdn hostname in var
HOST=$(hostname -f)
cat <<EOF | tee /tmp/payload
{
"version": 1,
"host": "${HOST}",
"port": 6500,
"http_port": 7676,
"region_id": "{{ console_region_id }}",
"instance_id": "${INSTANCE_ID}",
"availability_zone_id": "${AZ_ID}",
"active": false
}
EOF
# check if safekeeper already registered or not
if ! curl -sf -H "Authorization: Bearer {{ CONSOLE_API_TOKEN }}" {{ console_mgmt_base_url }}/management/api/v2/safekeepers/${INSTANCE_ID} -o /dev/null; then
# not registered, so register it now
ID=$(curl -sf -X POST -H "Authorization: Bearer {{ CONSOLE_API_TOKEN }}" -H "Content-Type: application/json" {{ console_mgmt_base_url }}/management/api/v2/safekeepers -d@/tmp/payload | jq -r '.id')
# init safekeeper
sudo -u safekeeper /usr/local/bin/safekeeper --id ${ID} --init -D /storage/safekeeper/data
fi

2
.github/ansible/ssm_config vendored Normal file
View File

@@ -0,0 +1,2 @@
ansible_connection: aws_ssm
ansible_python_interpreter: /usr/bin/python3

View File

@@ -0,0 +1,41 @@
storage:
vars:
bucket_name: neon-dev-storage-eu-west-1
bucket_region: eu-west-1
console_mgmt_base_url: http://neon-internal-api.aws.neon.build
broker_endpoint: http://storage-broker-lb.zeta.eu-west-1.internal.aws.neon.build:50051
pageserver_config_stub:
pg_distrib_dir: /usr/local
metric_collection_endpoint: http://neon-internal-api.aws.neon.build/billing/api/v1/usage_events
metric_collection_interval: 10min
tenant_config:
eviction_policy:
kind: "LayerAccessThreshold"
period: "20m"
threshold: "20m"
remote_storage:
bucket_name: "{{ bucket_name }}"
bucket_region: "{{ bucket_region }}"
prefix_in_bucket: "pageserver/v1"
safekeeper_s3_prefix: safekeeper/v1/wal
hostname_suffix: ""
remote_user: ssm-user
ansible_aws_ssm_region: eu-west-1
ansible_aws_ssm_bucket_name: neon-dev-storage-eu-west-1
console_region_id: aws-eu-west-1
sentry_environment: staging
children:
pageservers:
hosts:
pageserver-0.eu-west-1.aws.neon.build:
ansible_host: i-01d496c5041c7f34c
safekeepers:
hosts:
safekeeper-0.eu-west-1.aws.neon.build:
ansible_host: i-05226ef85722831bf
safekeeper-1.eu-west-1.aws.neon.build:
ansible_host: i-06969ee1bf2958bfc
safekeeper-2.eu-west-1.aws.neon.build:
ansible_host: i-087892e9625984a0b

View File

@@ -0,0 +1,51 @@
storage:
vars:
bucket_name: neon-staging-storage-us-east-2
bucket_region: us-east-2
console_mgmt_base_url: http://neon-internal-api.aws.neon.build
broker_endpoint: http://storage-broker-lb.beta.us-east-2.internal.aws.neon.build:50051
pageserver_config_stub:
pg_distrib_dir: /usr/local
metric_collection_endpoint: http://neon-internal-api.aws.neon.build/billing/api/v1/usage_events
metric_collection_interval: 10min
tenant_config:
eviction_policy:
kind: "LayerAccessThreshold"
period: "20m"
threshold: "20m"
remote_storage:
bucket_name: "{{ bucket_name }}"
bucket_region: "{{ bucket_region }}"
prefix_in_bucket: "pageserver/v1"
safekeeper_s3_prefix: safekeeper/v1/wal
hostname_suffix: ""
remote_user: ssm-user
ansible_aws_ssm_region: us-east-2
ansible_aws_ssm_bucket_name: neon-staging-storage-us-east-2
console_region_id: aws-us-east-2
sentry_environment: staging
children:
pageservers:
hosts:
pageserver-0.us-east-2.aws.neon.build:
ansible_host: i-0c3e70929edb5d691
pageserver-1.us-east-2.aws.neon.build:
ansible_host: i-0565a8b4008aa3f40
pageserver-2.us-east-2.aws.neon.build:
ansible_host: i-01e31cdf7e970586a
pageserver-3.us-east-2.aws.neon.build:
ansible_host: i-0602a0291365ef7cc
pageserver-99.us-east-2.aws.neon.build:
ansible_host: i-0c39491109bb88824
safekeepers:
hosts:
safekeeper-0.us-east-2.aws.neon.build:
ansible_host: i-027662bd552bf5db0
safekeeper-1.us-east-2.aws.neon.build:
ansible_host: i-0171efc3604a7b907
safekeeper-2.us-east-2.aws.neon.build:
ansible_host: i-0de0b03a51676a6ce
safekeeper-99.us-east-2.aws.neon.build:
ansible_host: i-0d61b6a2ea32028d5

View File

@@ -0,0 +1,18 @@
[Unit]
Description=Neon pageserver
After=network.target auditd.service
[Service]
Type=simple
User=pageserver
Environment=RUST_BACKTRACE=1 NEON_REPO_DIR=/storage/pageserver LD_LIBRARY_PATH=/usr/local/v14/lib SENTRY_DSN={{ SENTRY_URL_PAGESERVER }} SENTRY_ENVIRONMENT={{ sentry_environment }}
ExecStart=/usr/local/bin/pageserver -c "pg_distrib_dir='/usr/local'" -c "listen_pg_addr='0.0.0.0:6400'" -c "listen_http_addr='0.0.0.0:9898'" -c "broker_endpoint='{{ broker_endpoint }}'" -c "availability_zone='{{ ec2_availability_zone }}'" -D /storage/pageserver/data
ExecReload=/bin/kill -HUP $MAINPID
KillMode=mixed
KillSignal=SIGINT
Restart=on-failure
TimeoutSec=10
LimitNOFILE=30000000
[Install]
WantedBy=multi-user.target

View File

@@ -0,0 +1,18 @@
[Unit]
Description=Neon safekeeper
After=network.target auditd.service
[Service]
Type=simple
User=safekeeper
Environment=RUST_BACKTRACE=1 NEON_REPO_DIR=/storage/safekeeper/data LD_LIBRARY_PATH=/usr/local/v14/lib SENTRY_DSN={{ SENTRY_URL_SAFEKEEPER }} SENTRY_ENVIRONMENT={{ sentry_environment }}
ExecStart=/usr/local/bin/safekeeper -l {{ inventory_hostname }}{{ hostname_suffix }}:6500 --listen-http {{ inventory_hostname }}{{ hostname_suffix }}:7676 -D /storage/safekeeper/data --broker-endpoint={{ broker_endpoint }} --remote-storage='{bucket_name="{{bucket_name}}", bucket_region="{{bucket_region}}", prefix_in_bucket="{{ safekeeper_s3_prefix }}"}' --availability-zone={{ ec2_availability_zone }}
ExecReload=/bin/kill -HUP $MAINPID
KillMode=mixed
KillSignal=SIGINT
Restart=on-failure
TimeoutSec=10
LimitNOFILE=30000000
[Install]
WantedBy=multi-user.target

View File

@@ -0,0 +1 @@
{{ pageserver_config | sivel.toiletwater.to_toml }}

View File

@@ -0,0 +1,76 @@
# Helm chart values for neon-proxy-scram.
# This is a YAML-formatted file.
deploymentStrategy:
type: RollingUpdate
rollingUpdate:
maxSurge: 100%
maxUnavailable: 50%
# Delay the kill signal by 7 days (7 * 24 * 60 * 60)
# The pod(s) will stay in Terminating, keeps the existing connections
# but doesn't receive new ones
containerLifecycle:
preStop:
exec:
command: ["/bin/sh", "-c", "sleep 604800"]
terminationGracePeriodSeconds: 604800
image:
repository: neondatabase/neon
settings:
authBackend: "console"
authEndpoint: "http://neon-internal-api.aws.neon.build/management/api/v2"
domain: "*.eu-west-1.aws.neon.build"
sentryEnvironment: "staging"
wssPort: 8443
metricCollectionEndpoint: "http://neon-internal-api.aws.neon.build/billing/api/v1/usage_events"
metricCollectionInterval: "1min"
# -- Additional labels for neon-proxy pods
podLabels:
zenith_service: proxy-scram
zenith_env: dev
zenith_region: eu-west-1
zenith_region_slug: eu-west-1
exposedService:
annotations:
service.beta.kubernetes.io/aws-load-balancer-type: external
service.beta.kubernetes.io/aws-load-balancer-nlb-target-type: ip
service.beta.kubernetes.io/aws-load-balancer-scheme: internet-facing
external-dns.alpha.kubernetes.io/hostname: eu-west-1.aws.neon.build
httpsPort: 443
#metrics:
# enabled: true
# serviceMonitor:
# enabled: true
# selector:
# release: kube-prometheus-stack
extraManifests:
- apiVersion: operator.victoriametrics.com/v1beta1
kind: VMServiceScrape
metadata:
name: "{{ include \"neon-proxy.fullname\" . }}"
labels:
helm.sh/chart: neon-proxy-{{ .Chart.Version }}
app.kubernetes.io/name: neon-proxy
app.kubernetes.io/instance: "{{ include \"neon-proxy.fullname\" . }}"
app.kubernetes.io/version: "{{ .Chart.AppVersion }}"
app.kubernetes.io/managed-by: Helm
namespace: "{{ .Release.Namespace }}"
spec:
selector:
matchLabels:
app.kubernetes.io/name: "neon-proxy"
endpoints:
- port: http
path: /metrics
interval: 10s
scrapeTimeout: 10s
namespaceSelector:
matchNames:
- "{{ .Release.Namespace }}"

View File

@@ -0,0 +1,52 @@
# Helm chart values for neon-storage-broker
podLabels:
neon_env: staging
neon_service: storage-broker
# Use L4 LB
service:
# service.annotations -- Annotations to add to the service
annotations:
service.beta.kubernetes.io/aws-load-balancer-type: external # use newer AWS Load Balancer Controller
service.beta.kubernetes.io/aws-load-balancer-nlb-target-type: ip
service.beta.kubernetes.io/aws-load-balancer-scheme: internal # deploy LB to private subnet
# assign service to this name at external-dns
external-dns.alpha.kubernetes.io/hostname: storage-broker-lb.zeta.eu-west-1.internal.aws.neon.build
# service.type -- Service type
type: LoadBalancer
# service.port -- broker listen port
port: 50051
ingress:
enabled: false
metrics:
enabled: false
extraManifests:
- apiVersion: operator.victoriametrics.com/v1beta1
kind: VMServiceScrape
metadata:
name: "{{ include \"neon-storage-broker.fullname\" . }}"
labels:
helm.sh/chart: neon-storage-broker-{{ .Chart.Version }}
app.kubernetes.io/name: neon-storage-broker
app.kubernetes.io/instance: neon-storage-broker
app.kubernetes.io/version: "{{ .Chart.AppVersion }}"
app.kubernetes.io/managed-by: Helm
namespace: "{{ .Release.Namespace }}"
spec:
selector:
matchLabels:
app.kubernetes.io/name: "neon-storage-broker"
endpoints:
- port: broker
path: /metrics
interval: 10s
scrapeTimeout: 10s
namespaceSelector:
matchNames:
- "{{ .Release.Namespace }}"
settings:
sentryEnvironment: "staging"

View File

@@ -0,0 +1,68 @@
# Helm chart values for neon-proxy-link.
# This is a YAML-formatted file.
image:
repository: neondatabase/neon
settings:
authBackend: "link"
authEndpoint: "https://console.stage.neon.tech/authenticate_proxy_request/"
uri: "https://console.stage.neon.tech/psql_session/"
domain: "pg.neon.build"
sentryEnvironment: "staging"
metricCollectionEndpoint: "http://neon-internal-api.aws.neon.build/billing/api/v1/usage_events"
metricCollectionInterval: "1min"
# -- Additional labels for neon-proxy-link pods
podLabels:
zenith_service: proxy
zenith_env: dev
zenith_region: us-east-2
zenith_region_slug: us-east-2
service:
type: LoadBalancer
annotations:
service.beta.kubernetes.io/aws-load-balancer-type: external
service.beta.kubernetes.io/aws-load-balancer-nlb-target-type: ip
service.beta.kubernetes.io/aws-load-balancer-scheme: internal
external-dns.alpha.kubernetes.io/hostname: neon-proxy-link-mgmt.beta.us-east-2.aws.neon.build
exposedService:
annotations:
service.beta.kubernetes.io/aws-load-balancer-type: external
service.beta.kubernetes.io/aws-load-balancer-nlb-target-type: ip
service.beta.kubernetes.io/aws-load-balancer-scheme: internet-facing
external-dns.alpha.kubernetes.io/hostname: neon-proxy-link.beta.us-east-2.aws.neon.build
#metrics:
# enabled: true
# serviceMonitor:
# enabled: true
# selector:
# release: kube-prometheus-stack
extraManifests:
- apiVersion: operator.victoriametrics.com/v1beta1
kind: VMServiceScrape
metadata:
name: "{{ include \"neon-proxy.fullname\" . }}"
labels:
helm.sh/chart: neon-proxy-{{ .Chart.Version }}
app.kubernetes.io/name: neon-proxy
app.kubernetes.io/instance: "{{ include \"neon-proxy.fullname\" . }}"
app.kubernetes.io/version: "{{ .Chart.AppVersion }}"
app.kubernetes.io/managed-by: Helm
namespace: "{{ .Release.Namespace }}"
spec:
selector:
matchLabels:
app.kubernetes.io/name: "neon-proxy"
endpoints:
- port: http
path: /metrics
interval: 10s
scrapeTimeout: 10s
namespaceSelector:
matchNames:
- "{{ .Release.Namespace }}"

View File

@@ -0,0 +1,61 @@
# Helm chart values for neon-proxy-scram.
# This is a YAML-formatted file.
image:
repository: neondatabase/neon
settings:
authBackend: "console"
authEndpoint: "http://neon-internal-api.aws.neon.build/management/api/v2"
domain: "*.cloud.stage.neon.tech"
sentryEnvironment: "staging"
wssPort: 8443
metricCollectionEndpoint: "http://neon-internal-api.aws.neon.build/billing/api/v1/usage_events"
metricCollectionInterval: "1min"
# -- Additional labels for neon-proxy pods
podLabels:
zenith_service: proxy-scram-legacy
zenith_env: dev
zenith_region: us-east-2
zenith_region_slug: us-east-2
exposedService:
annotations:
service.beta.kubernetes.io/aws-load-balancer-type: external
service.beta.kubernetes.io/aws-load-balancer-nlb-target-type: ip
service.beta.kubernetes.io/aws-load-balancer-scheme: internet-facing
external-dns.alpha.kubernetes.io/hostname: neon-proxy-scram-legacy.beta.us-east-2.aws.neon.build
httpsPort: 443
#metrics:
# enabled: true
# serviceMonitor:
# enabled: true
# selector:
# release: kube-prometheus-stack
extraManifests:
- apiVersion: operator.victoriametrics.com/v1beta1
kind: VMServiceScrape
metadata:
name: "{{ include \"neon-proxy.fullname\" . }}"
labels:
helm.sh/chart: neon-proxy-{{ .Chart.Version }}
app.kubernetes.io/name: neon-proxy
app.kubernetes.io/instance: "{{ include \"neon-proxy.fullname\" . }}"
app.kubernetes.io/version: "{{ .Chart.AppVersion }}"
app.kubernetes.io/managed-by: Helm
namespace: "{{ .Release.Namespace }}"
spec:
selector:
matchLabels:
app.kubernetes.io/name: "neon-proxy"
endpoints:
- port: http
path: /metrics
interval: 10s
scrapeTimeout: 10s
namespaceSelector:
matchNames:
- "{{ .Release.Namespace }}"

View File

@@ -0,0 +1,76 @@
# Helm chart values for neon-proxy-scram.
# This is a YAML-formatted file.
deploymentStrategy:
type: RollingUpdate
rollingUpdate:
maxSurge: 100%
maxUnavailable: 50%
# Delay the kill signal by 7 days (7 * 24 * 60 * 60)
# The pod(s) will stay in Terminating, keeps the existing connections
# but doesn't receive new ones
containerLifecycle:
preStop:
exec:
command: ["/bin/sh", "-c", "sleep 604800"]
terminationGracePeriodSeconds: 604800
image:
repository: neondatabase/neon
settings:
authBackend: "console"
authEndpoint: "http://neon-internal-api.aws.neon.build/management/api/v2"
domain: "*.us-east-2.aws.neon.build"
sentryEnvironment: "staging"
wssPort: 8443
metricCollectionEndpoint: "http://neon-internal-api.aws.neon.build/billing/api/v1/usage_events"
metricCollectionInterval: "1min"
# -- Additional labels for neon-proxy pods
podLabels:
zenith_service: proxy-scram
zenith_env: dev
zenith_region: us-east-2
zenith_region_slug: us-east-2
exposedService:
annotations:
service.beta.kubernetes.io/aws-load-balancer-type: external
service.beta.kubernetes.io/aws-load-balancer-nlb-target-type: ip
service.beta.kubernetes.io/aws-load-balancer-scheme: internet-facing
external-dns.alpha.kubernetes.io/hostname: us-east-2.aws.neon.build
httpsPort: 443
#metrics:
# enabled: true
# serviceMonitor:
# enabled: true
# selector:
# release: kube-prometheus-stack
extraManifests:
- apiVersion: operator.victoriametrics.com/v1beta1
kind: VMServiceScrape
metadata:
name: "{{ include \"neon-proxy.fullname\" . }}"
labels:
helm.sh/chart: neon-proxy-{{ .Chart.Version }}
app.kubernetes.io/name: neon-proxy
app.kubernetes.io/instance: "{{ include \"neon-proxy.fullname\" . }}"
app.kubernetes.io/version: "{{ .Chart.AppVersion }}"
app.kubernetes.io/managed-by: Helm
namespace: "{{ .Release.Namespace }}"
spec:
selector:
matchLabels:
app.kubernetes.io/name: "neon-proxy"
endpoints:
- port: http
path: /metrics
interval: 10s
scrapeTimeout: 10s
namespaceSelector:
matchNames:
- "{{ .Release.Namespace }}"

View File

@@ -0,0 +1,52 @@
# Helm chart values for neon-storage-broker
podLabels:
neon_env: staging
neon_service: storage-broker
# Use L4 LB
service:
# service.annotations -- Annotations to add to the service
annotations:
service.beta.kubernetes.io/aws-load-balancer-type: external # use newer AWS Load Balancer Controller
service.beta.kubernetes.io/aws-load-balancer-nlb-target-type: ip
service.beta.kubernetes.io/aws-load-balancer-scheme: internal # deploy LB to private subnet
# assign service to this name at external-dns
external-dns.alpha.kubernetes.io/hostname: storage-broker-lb.beta.us-east-2.internal.aws.neon.build
# service.type -- Service type
type: LoadBalancer
# service.port -- broker listen port
port: 50051
ingress:
enabled: false
metrics:
enabled: false
extraManifests:
- apiVersion: operator.victoriametrics.com/v1beta1
kind: VMServiceScrape
metadata:
name: "{{ include \"neon-storage-broker.fullname\" . }}"
labels:
helm.sh/chart: neon-storage-broker-{{ .Chart.Version }}
app.kubernetes.io/name: neon-storage-broker
app.kubernetes.io/instance: neon-storage-broker
app.kubernetes.io/version: "{{ .Chart.AppVersion }}"
app.kubernetes.io/managed-by: Helm
namespace: "{{ .Release.Namespace }}"
spec:
selector:
matchLabels:
app.kubernetes.io/name: "neon-storage-broker"
endpoints:
- port: broker
path: /metrics
interval: 10s
scrapeTimeout: 10s
namespaceSelector:
matchNames:
- "{{ .Release.Namespace }}"
settings:
sentryEnvironment: "staging"

View File

@@ -0,0 +1,77 @@
# Helm chart values for neon-proxy-scram.
# This is a YAML-formatted file.
deploymentStrategy:
type: RollingUpdate
rollingUpdate:
maxSurge: 100%
maxUnavailable: 50%
# Delay the kill signal by 7 days (7 * 24 * 60 * 60)
# The pod(s) will stay in Terminating, keeps the existing connections
# but doesn't receive new ones
containerLifecycle:
preStop:
exec:
command: ["/bin/sh", "-c", "sleep 604800"]
terminationGracePeriodSeconds: 604800
image:
repository: neondatabase/neon
settings:
authBackend: "console"
authEndpoint: "http://neon-internal-api.aws.neon.tech/management/api/v2"
domain: "*.ap-southeast-1.aws.neon.tech"
sentryEnvironment: "production"
wssPort: 8443
metricCollectionEndpoint: "http://neon-internal-api.aws.neon.tech/billing/api/v1/usage_events"
metricCollectionInterval: "10min"
# -- Additional labels for neon-proxy pods
podLabels:
zenith_service: proxy-scram
zenith_env: prod
zenith_region: ap-southeast-1
zenith_region_slug: ap-southeast-1
exposedService:
annotations:
service.beta.kubernetes.io/aws-load-balancer-type: external
service.beta.kubernetes.io/aws-load-balancer-nlb-target-type: ip
service.beta.kubernetes.io/aws-load-balancer-scheme: internet-facing
external-dns.alpha.kubernetes.io/hostname: ap-southeast-1.aws.neon.tech
httpsPort: 443
#metrics:
# enabled: true
# serviceMonitor:
# enabled: true
# selector:
# release: kube-prometheus-stack
extraManifests:
- apiVersion: operator.victoriametrics.com/v1beta1
kind: VMServiceScrape
metadata:
name: "{{ include \"neon-proxy.fullname\" . }}"
labels:
helm.sh/chart: neon-proxy-{{ .Chart.Version }}
app.kubernetes.io/name: neon-proxy
app.kubernetes.io/instance: "{{ include \"neon-proxy.fullname\" . }}"
app.kubernetes.io/version: "{{ .Chart.AppVersion }}"
app.kubernetes.io/managed-by: Helm
namespace: "{{ .Release.Namespace }}"
spec:
selector:
matchLabels:
app.kubernetes.io/name: "neon-proxy"
endpoints:
- port: http
path: /metrics
interval: 10s
scrapeTimeout: 10s
namespaceSelector:
matchNames:
- "{{ .Release.Namespace }}"

View File

@@ -0,0 +1,52 @@
# Helm chart values for neon-storage-broker
podLabels:
neon_env: production
neon_service: storage-broker
# Use L4 LB
service:
# service.annotations -- Annotations to add to the service
annotations:
service.beta.kubernetes.io/aws-load-balancer-type: external # use newer AWS Load Balancer Controller
service.beta.kubernetes.io/aws-load-balancer-nlb-target-type: ip
service.beta.kubernetes.io/aws-load-balancer-scheme: internal # deploy LB to private subnet
# assign service to this name at external-dns
external-dns.alpha.kubernetes.io/hostname: storage-broker-lb.epsilon.ap-southeast-1.internal.aws.neon.tech
# service.type -- Service type
type: LoadBalancer
# service.port -- broker listen port
port: 50051
ingress:
enabled: false
metrics:
enabled: false
extraManifests:
- apiVersion: operator.victoriametrics.com/v1beta1
kind: VMServiceScrape
metadata:
name: "{{ include \"neon-storage-broker.fullname\" . }}"
labels:
helm.sh/chart: neon-storage-broker-{{ .Chart.Version }}
app.kubernetes.io/name: neon-storage-broker
app.kubernetes.io/instance: neon-storage-broker
app.kubernetes.io/version: "{{ .Chart.AppVersion }}"
app.kubernetes.io/managed-by: Helm
namespace: "{{ .Release.Namespace }}"
spec:
selector:
matchLabels:
app.kubernetes.io/name: "neon-storage-broker"
endpoints:
- port: broker
path: /metrics
interval: 10s
scrapeTimeout: 10s
namespaceSelector:
matchNames:
- "{{ .Release.Namespace }}"
settings:
sentryEnvironment: "production"

View File

@@ -0,0 +1,77 @@
# Helm chart values for neon-proxy-scram.
# This is a YAML-formatted file.
deploymentStrategy:
type: RollingUpdate
rollingUpdate:
maxSurge: 100%
maxUnavailable: 50%
# Delay the kill signal by 7 days (7 * 24 * 60 * 60)
# The pod(s) will stay in Terminating, keeps the existing connections
# but doesn't receive new ones
containerLifecycle:
preStop:
exec:
command: ["/bin/sh", "-c", "sleep 604800"]
terminationGracePeriodSeconds: 604800
image:
repository: neondatabase/neon
settings:
authBackend: "console"
authEndpoint: "http://neon-internal-api.aws.neon.tech/management/api/v2"
domain: "*.eu-central-1.aws.neon.tech"
sentryEnvironment: "production"
wssPort: 8443
metricCollectionEndpoint: "http://neon-internal-api.aws.neon.tech/billing/api/v1/usage_events"
metricCollectionInterval: "10min"
# -- Additional labels for neon-proxy pods
podLabels:
zenith_service: proxy-scram
zenith_env: prod
zenith_region: eu-central-1
zenith_region_slug: eu-central-1
exposedService:
annotations:
service.beta.kubernetes.io/aws-load-balancer-type: external
service.beta.kubernetes.io/aws-load-balancer-nlb-target-type: ip
service.beta.kubernetes.io/aws-load-balancer-scheme: internet-facing
external-dns.alpha.kubernetes.io/hostname: eu-central-1.aws.neon.tech
httpsPort: 443
#metrics:
# enabled: true
# serviceMonitor:
# enabled: true
# selector:
# release: kube-prometheus-stack
extraManifests:
- apiVersion: operator.victoriametrics.com/v1beta1
kind: VMServiceScrape
metadata:
name: "{{ include \"neon-proxy.fullname\" . }}"
labels:
helm.sh/chart: neon-proxy-{{ .Chart.Version }}
app.kubernetes.io/name: neon-proxy
app.kubernetes.io/instance: "{{ include \"neon-proxy.fullname\" . }}"
app.kubernetes.io/version: "{{ .Chart.AppVersion }}"
app.kubernetes.io/managed-by: Helm
namespace: "{{ .Release.Namespace }}"
spec:
selector:
matchLabels:
app.kubernetes.io/name: "neon-proxy"
endpoints:
- port: http
path: /metrics
interval: 10s
scrapeTimeout: 10s
namespaceSelector:
matchNames:
- "{{ .Release.Namespace }}"

View File

@@ -0,0 +1,52 @@
# Helm chart values for neon-storage-broker
podLabels:
neon_env: production
neon_service: storage-broker
# Use L4 LB
service:
# service.annotations -- Annotations to add to the service
annotations:
service.beta.kubernetes.io/aws-load-balancer-type: external # use newer AWS Load Balancer Controller
service.beta.kubernetes.io/aws-load-balancer-nlb-target-type: ip
service.beta.kubernetes.io/aws-load-balancer-scheme: internal # deploy LB to private subnet
# assign service to this name at external-dns
external-dns.alpha.kubernetes.io/hostname: storage-broker-lb.gamma.eu-central-1.internal.aws.neon.tech
# service.type -- Service type
type: LoadBalancer
# service.port -- broker listen port
port: 50051
ingress:
enabled: false
metrics:
enabled: false
extraManifests:
- apiVersion: operator.victoriametrics.com/v1beta1
kind: VMServiceScrape
metadata:
name: "{{ include \"neon-storage-broker.fullname\" . }}"
labels:
helm.sh/chart: neon-storage-broker-{{ .Chart.Version }}
app.kubernetes.io/name: neon-storage-broker
app.kubernetes.io/instance: neon-storage-broker
app.kubernetes.io/version: "{{ .Chart.AppVersion }}"
app.kubernetes.io/managed-by: Helm
namespace: "{{ .Release.Namespace }}"
spec:
selector:
matchLabels:
app.kubernetes.io/name: "neon-storage-broker"
endpoints:
- port: broker
path: /metrics
interval: 10s
scrapeTimeout: 10s
namespaceSelector:
matchNames:
- "{{ .Release.Namespace }}"
settings:
sentryEnvironment: "production"

View File

@@ -0,0 +1,59 @@
# Helm chart values for neon-proxy-link.
# This is a YAML-formatted file.
image:
repository: neondatabase/neon
settings:
authBackend: "link"
authEndpoint: "https://console.neon.tech/authenticate_proxy_request/"
uri: "https://console.neon.tech/psql_session/"
domain: "pg.neon.tech"
sentryEnvironment: "production"
# -- Additional labels for zenith-proxy pods
podLabels:
zenith_service: proxy
zenith_env: production
zenith_region: us-east-2
zenith_region_slug: us-east-2
service:
type: LoadBalancer
annotations:
service.beta.kubernetes.io/aws-load-balancer-type: external
service.beta.kubernetes.io/aws-load-balancer-nlb-target-type: ip
service.beta.kubernetes.io/aws-load-balancer-scheme: internal
external-dns.alpha.kubernetes.io/hostname: neon-proxy-link-mgmt.delta.us-east-2.aws.neon.tech
exposedService:
annotations:
service.beta.kubernetes.io/aws-load-balancer-type: external
service.beta.kubernetes.io/aws-load-balancer-nlb-target-type: ip
service.beta.kubernetes.io/aws-load-balancer-scheme: internet-facing
external-dns.alpha.kubernetes.io/hostname: neon-proxy-link.delta.us-east-2.aws.neon.tech
extraManifests:
- apiVersion: operator.victoriametrics.com/v1beta1
kind: VMServiceScrape
metadata:
name: "{{ include \"neon-proxy.fullname\" . }}"
labels:
helm.sh/chart: neon-proxy-{{ .Chart.Version }}
app.kubernetes.io/name: neon-proxy
app.kubernetes.io/instance: "{{ include \"neon-proxy.fullname\" . }}"
app.kubernetes.io/version: "{{ .Chart.AppVersion }}"
app.kubernetes.io/managed-by: Helm
namespace: "{{ .Release.Namespace }}"
spec:
selector:
matchLabels:
app.kubernetes.io/name: "neon-proxy"
endpoints:
- port: http
path: /metrics
interval: 10s
scrapeTimeout: 10s
namespaceSelector:
matchNames:
- "{{ .Release.Namespace }}"

View File

@@ -0,0 +1,77 @@
# Helm chart values for neon-proxy-scram.
# This is a YAML-formatted file.
deploymentStrategy:
type: RollingUpdate
rollingUpdate:
maxSurge: 100%
maxUnavailable: 50%
# Delay the kill signal by 7 days (7 * 24 * 60 * 60)
# The pod(s) will stay in Terminating, keeps the existing connections
# but doesn't receive new ones
containerLifecycle:
preStop:
exec:
command: ["/bin/sh", "-c", "sleep 604800"]
terminationGracePeriodSeconds: 604800
image:
repository: neondatabase/neon
settings:
authBackend: "console"
authEndpoint: "http://neon-internal-api.aws.neon.tech/management/api/v2"
domain: "*.us-east-2.aws.neon.tech"
sentryEnvironment: "production"
wssPort: 8443
metricCollectionEndpoint: "http://neon-internal-api.aws.neon.tech/billing/api/v1/usage_events"
metricCollectionInterval: "10min"
# -- Additional labels for neon-proxy pods
podLabels:
zenith_service: proxy-scram
zenith_env: prod
zenith_region: us-east-2
zenith_region_slug: us-east-2
exposedService:
annotations:
service.beta.kubernetes.io/aws-load-balancer-type: external
service.beta.kubernetes.io/aws-load-balancer-nlb-target-type: ip
service.beta.kubernetes.io/aws-load-balancer-scheme: internet-facing
external-dns.alpha.kubernetes.io/hostname: us-east-2.aws.neon.tech
httpsPort: 443
#metrics:
# enabled: true
# serviceMonitor:
# enabled: true
# selector:
# release: kube-prometheus-stack
extraManifests:
- apiVersion: operator.victoriametrics.com/v1beta1
kind: VMServiceScrape
metadata:
name: "{{ include \"neon-proxy.fullname\" . }}"
labels:
helm.sh/chart: neon-proxy-{{ .Chart.Version }}
app.kubernetes.io/name: neon-proxy
app.kubernetes.io/instance: "{{ include \"neon-proxy.fullname\" . }}"
app.kubernetes.io/version: "{{ .Chart.AppVersion }}"
app.kubernetes.io/managed-by: Helm
namespace: "{{ .Release.Namespace }}"
spec:
selector:
matchLabels:
app.kubernetes.io/name: "neon-proxy"
endpoints:
- port: http
path: /metrics
interval: 10s
scrapeTimeout: 10s
namespaceSelector:
matchNames:
- "{{ .Release.Namespace }}"

View File

@@ -0,0 +1,52 @@
# Helm chart values for neon-storage-broker
podLabels:
neon_env: production
neon_service: storage-broker
# Use L4 LB
service:
# service.annotations -- Annotations to add to the service
annotations:
service.beta.kubernetes.io/aws-load-balancer-type: external # use newer AWS Load Balancer Controller
service.beta.kubernetes.io/aws-load-balancer-nlb-target-type: ip
service.beta.kubernetes.io/aws-load-balancer-scheme: internal # deploy LB to private subnet
# assign service to this name at external-dns
external-dns.alpha.kubernetes.io/hostname: storage-broker-lb.delta.us-east-2.internal.aws.neon.tech
# service.type -- Service type
type: LoadBalancer
# service.port -- broker listen port
port: 50051
ingress:
enabled: false
metrics:
enabled: false
extraManifests:
- apiVersion: operator.victoriametrics.com/v1beta1
kind: VMServiceScrape
metadata:
name: "{{ include \"neon-storage-broker.fullname\" . }}"
labels:
helm.sh/chart: neon-storage-broker-{{ .Chart.Version }}
app.kubernetes.io/name: neon-storage-broker
app.kubernetes.io/instance: neon-storage-broker
app.kubernetes.io/version: "{{ .Chart.AppVersion }}"
app.kubernetes.io/managed-by: Helm
namespace: "{{ .Release.Namespace }}"
spec:
selector:
matchLabels:
app.kubernetes.io/name: "neon-storage-broker"
endpoints:
- port: broker
path: /metrics
interval: 10s
scrapeTimeout: 10s
namespaceSelector:
matchNames:
- "{{ .Release.Namespace }}"
settings:
sentryEnvironment: "production"

View File

@@ -0,0 +1,77 @@
# Helm chart values for neon-proxy-scram.
# This is a YAML-formatted file.
deploymentStrategy:
type: RollingUpdate
rollingUpdate:
maxSurge: 100%
maxUnavailable: 50%
# Delay the kill signal by 7 days (7 * 24 * 60 * 60)
# The pod(s) will stay in Terminating, keeps the existing connections
# but doesn't receive new ones
containerLifecycle:
preStop:
exec:
command: ["/bin/sh", "-c", "sleep 604800"]
terminationGracePeriodSeconds: 604800
image:
repository: neondatabase/neon
settings:
authBackend: "console"
authEndpoint: "http://neon-internal-api.aws.neon.tech/management/api/v2"
domain: "*.cloud.neon.tech"
sentryEnvironment: "production"
wssPort: 8443
metricCollectionEndpoint: "http://neon-internal-api.aws.neon.tech/billing/api/v1/usage_events"
metricCollectionInterval: "10min"
# -- Additional labels for neon-proxy pods
podLabels:
zenith_service: proxy-scram
zenith_env: prod
zenith_region: us-west-2
zenith_region_slug: us-west-2
exposedService:
annotations:
service.beta.kubernetes.io/aws-load-balancer-type: external
service.beta.kubernetes.io/aws-load-balancer-nlb-target-type: ip
service.beta.kubernetes.io/aws-load-balancer-scheme: internet-facing
external-dns.alpha.kubernetes.io/hostname: neon-proxy-scram-legacy.eta.us-west-2.aws.neon.tech
httpsPort: 443
#metrics:
# enabled: true
# serviceMonitor:
# enabled: true
# selector:
# release: kube-prometheus-stack
extraManifests:
- apiVersion: operator.victoriametrics.com/v1beta1
kind: VMServiceScrape
metadata:
name: "{{ include \"neon-proxy.fullname\" . }}"
labels:
helm.sh/chart: neon-proxy-{{ .Chart.Version }}
app.kubernetes.io/name: neon-proxy
app.kubernetes.io/instance: "{{ include \"neon-proxy.fullname\" . }}"
app.kubernetes.io/version: "{{ .Chart.AppVersion }}"
app.kubernetes.io/managed-by: Helm
namespace: "{{ .Release.Namespace }}"
spec:
selector:
matchLabels:
app.kubernetes.io/name: "neon-proxy"
endpoints:
- port: http
path: /metrics
interval: 10s
scrapeTimeout: 10s
namespaceSelector:
matchNames:
- "{{ .Release.Namespace }}"

View File

@@ -0,0 +1,77 @@
# Helm chart values for neon-proxy-scram.
# This is a YAML-formatted file.
deploymentStrategy:
type: RollingUpdate
rollingUpdate:
maxSurge: 100%
maxUnavailable: 50%
# Delay the kill signal by 7 days (7 * 24 * 60 * 60)
# The pod(s) will stay in Terminating, keeps the existing connections
# but doesn't receive new ones
containerLifecycle:
preStop:
exec:
command: ["/bin/sh", "-c", "sleep 604800"]
terminationGracePeriodSeconds: 604800
image:
repository: neondatabase/neon
settings:
authBackend: "console"
authEndpoint: "http://neon-internal-api.aws.neon.tech/management/api/v2"
domain: "*.us-west-2.aws.neon.tech"
sentryEnvironment: "production"
wssPort: 8443
metricCollectionEndpoint: "http://neon-internal-api.aws.neon.tech/billing/api/v1/usage_events"
metricCollectionInterval: "10min"
# -- Additional labels for neon-proxy pods
podLabels:
zenith_service: proxy-scram
zenith_env: prod
zenith_region: us-west-2
zenith_region_slug: us-west-2
exposedService:
annotations:
service.beta.kubernetes.io/aws-load-balancer-type: external
service.beta.kubernetes.io/aws-load-balancer-nlb-target-type: ip
service.beta.kubernetes.io/aws-load-balancer-scheme: internet-facing
external-dns.alpha.kubernetes.io/hostname: us-west-2.aws.neon.tech
httpsPort: 443
#metrics:
# enabled: true
# serviceMonitor:
# enabled: true
# selector:
# release: kube-prometheus-stack
extraManifests:
- apiVersion: operator.victoriametrics.com/v1beta1
kind: VMServiceScrape
metadata:
name: "{{ include \"neon-proxy.fullname\" . }}"
labels:
helm.sh/chart: neon-proxy-{{ .Chart.Version }}
app.kubernetes.io/name: neon-proxy
app.kubernetes.io/instance: "{{ include \"neon-proxy.fullname\" . }}"
app.kubernetes.io/version: "{{ .Chart.AppVersion }}"
app.kubernetes.io/managed-by: Helm
namespace: "{{ .Release.Namespace }}"
spec:
selector:
matchLabels:
app.kubernetes.io/name: "neon-proxy"
endpoints:
- port: http
path: /metrics
interval: 10s
scrapeTimeout: 10s
namespaceSelector:
matchNames:
- "{{ .Release.Namespace }}"

View File

@@ -0,0 +1,52 @@
# Helm chart values for neon-storage-broker
podLabels:
neon_env: production
neon_service: storage-broker
# Use L4 LB
service:
# service.annotations -- Annotations to add to the service
annotations:
service.beta.kubernetes.io/aws-load-balancer-type: external # use newer AWS Load Balancer Controller
service.beta.kubernetes.io/aws-load-balancer-nlb-target-type: ip
service.beta.kubernetes.io/aws-load-balancer-scheme: internal # deploy LB to private subnet
# assign service to this name at external-dns
external-dns.alpha.kubernetes.io/hostname: storage-broker-lb.eta.us-west-2.internal.aws.neon.tech
# service.type -- Service type
type: LoadBalancer
# service.port -- broker listen port
port: 50051
ingress:
enabled: false
metrics:
enabled: false
extraManifests:
- apiVersion: operator.victoriametrics.com/v1beta1
kind: VMServiceScrape
metadata:
name: "{{ include \"neon-storage-broker.fullname\" . }}"
labels:
helm.sh/chart: neon-storage-broker-{{ .Chart.Version }}
app.kubernetes.io/name: neon-storage-broker
app.kubernetes.io/instance: neon-storage-broker
app.kubernetes.io/version: "{{ .Chart.AppVersion }}"
app.kubernetes.io/managed-by: Helm
namespace: "{{ .Release.Namespace }}"
spec:
selector:
matchLabels:
app.kubernetes.io/name: "neon-storage-broker"
endpoints:
- port: broker
path: /metrics
interval: 10s
scrapeTimeout: 10s
namespaceSelector:
matchNames:
- "{{ .Release.Namespace }}"
settings:
sentryEnvironment: "production"

10
.github/pull_request_template.md vendored Normal file
View File

@@ -0,0 +1,10 @@
## Describe your changes
## Issue ticket number and link
## Checklist before requesting a review
- [ ] I have performed a self-review of my code.
- [ ] If it is a core feature, I have added thorough tests.
- [ ] Do we need to implement analytics? if so did you add the relevant metrics to the dashboard?
- [ ] If this PR requires public announcement, mark it with /release-notes label and add several sentences in this section.

596
.github/workflows/benchmarking.yml vendored Normal file
View File

@@ -0,0 +1,596 @@
name: Benchmarking
on:
# uncomment to run on push for debugging your PR
# push:
# branches: [ your branch ]
schedule:
# * is a special character in YAML so you have to quote this string
# ┌───────────── minute (0 - 59)
# │ ┌───────────── hour (0 - 23)
# │ │ ┌───────────── day of the month (1 - 31)
# │ │ │ ┌───────────── month (1 - 12 or JAN-DEC)
# │ │ │ │ ┌───────────── day of the week (0 - 6 or SUN-SAT)
- cron: '0 3 * * *' # run once a day, timezone is utc
workflow_dispatch: # adds ability to run this manually
inputs:
region_id:
description: 'Use a particular region. If not set the default region will be used'
required: false
default: 'aws-us-east-2'
save_perf_report:
type: boolean
description: 'Publish perf report or not. If not set, the report is published only for the main branch'
required: false
defaults:
run:
shell: bash -euxo pipefail {0}
concurrency:
# Allow only one workflow per any non-`main` branch.
group: ${{ github.workflow }}-${{ github.ref }}-${{ github.ref == 'refs/heads/main' && github.sha || 'anysha' }}
cancel-in-progress: true
jobs:
bench:
env:
TEST_PG_BENCH_DURATIONS_MATRIX: "300"
TEST_PG_BENCH_SCALES_MATRIX: "10,100"
POSTGRES_DISTRIB_DIR: /tmp/neon/pg_install
DEFAULT_PG_VERSION: 14
TEST_OUTPUT: /tmp/test_output
BUILD_TYPE: remote
SAVE_PERF_REPORT: ${{ github.event.inputs.save_perf_report || ( github.ref == 'refs/heads/main' ) }}
PLATFORM: "neon-staging"
runs-on: [ self-hosted, us-east-2, x64 ]
container:
image: 369495373322.dkr.ecr.eu-central-1.amazonaws.com/rust:pinned
options: --init
steps:
- uses: actions/checkout@v3
- name: Download Neon artifact
uses: ./.github/actions/download
with:
name: neon-${{ runner.os }}-release-artifact
path: /tmp/neon/
prefix: latest
- name: Create Neon Project
id: create-neon-project
uses: ./.github/actions/neon-project-create
with:
region_id: ${{ github.event.inputs.region_id || 'aws-us-east-2' }}
postgres_version: ${{ env.DEFAULT_PG_VERSION }}
api_key: ${{ secrets.NEON_STAGING_API_KEY }}
- name: Run benchmark
uses: ./.github/actions/run-python-test-set
with:
build_type: ${{ env.BUILD_TYPE }}
test_selection: performance
run_in_parallel: false
save_perf_report: ${{ env.SAVE_PERF_REPORT }}
# Set --sparse-ordering option of pytest-order plugin
# to ensure tests are running in order of appears in the file.
# It's important for test_perf_pgbench.py::test_pgbench_remote_* tests
extra_params: -m remote_cluster --sparse-ordering --timeout 5400 --ignore test_runner/performance/test_perf_olap.py
env:
BENCHMARK_CONNSTR: ${{ steps.create-neon-project.outputs.dsn }}
VIP_VAP_ACCESS_TOKEN: "${{ secrets.VIP_VAP_ACCESS_TOKEN }}"
PERF_TEST_RESULT_CONNSTR: "${{ secrets.PERF_TEST_RESULT_CONNSTR }}"
- name: Delete Neon Project
if: ${{ always() }}
uses: ./.github/actions/neon-project-delete
with:
project_id: ${{ steps.create-neon-project.outputs.project_id }}
api_key: ${{ secrets.NEON_STAGING_API_KEY }}
- name: Create Allure report
if: success() || failure()
uses: ./.github/actions/allure-report
with:
action: generate
build_type: ${{ env.BUILD_TYPE }}
- name: Post to a Slack channel
if: ${{ github.event.schedule && failure() }}
uses: slackapi/slack-github-action@v1
with:
channel-id: "C033QLM5P7D" # dev-staging-stream
slack-message: "Periodic perf testing: ${{ job.status }}\n${{ github.server_url }}/${{ github.repository }}/actions/runs/${{ github.run_id }}"
env:
SLACK_BOT_TOKEN: ${{ secrets.SLACK_BOT_TOKEN }}
pgbench-compare:
strategy:
fail-fast: false
matrix:
# neon-captest-new: Run pgbench in a freshly created project
# neon-captest-reuse: Same, but reusing existing project
# neon-captest-prefetch: Same, with prefetching enabled (new project)
# rds-aurora: Aurora Postgres Serverless v2 with autoscaling from 0.5 to 2 ACUs
# rds-postgres: RDS Postgres db.m5.large instance (2 vCPU, 8 GiB) with gp3 EBS storage
platform: [ neon-captest-reuse, neon-captest-prefetch, rds-postgres ]
db_size: [ 10gb ]
runner: [ us-east-2 ]
include:
- platform: neon-captest-prefetch
db_size: 50gb
runner: us-east-2
- platform: rds-aurora
db_size: 50gb
runner: us-east-2
env:
TEST_PG_BENCH_DURATIONS_MATRIX: "60m"
TEST_PG_BENCH_SCALES_MATRIX: ${{ matrix.db_size }}
POSTGRES_DISTRIB_DIR: /tmp/neon/pg_install
DEFAULT_PG_VERSION: 14
TEST_OUTPUT: /tmp/test_output
BUILD_TYPE: remote
SAVE_PERF_REPORT: ${{ github.event.inputs.save_perf_report || ( github.ref == 'refs/heads/main' ) }}
PLATFORM: ${{ matrix.platform }}
runs-on: [ self-hosted, "${{ matrix.runner }}", x64 ]
container:
image: 369495373322.dkr.ecr.eu-central-1.amazonaws.com/rust:pinned
options: --init
timeout-minutes: 360 # 6h
steps:
- uses: actions/checkout@v3
- name: Download Neon artifact
uses: ./.github/actions/download
with:
name: neon-${{ runner.os }}-release-artifact
path: /tmp/neon/
prefix: latest
- name: Add Postgres binaries to PATH
run: |
${POSTGRES_DISTRIB_DIR}/v${DEFAULT_PG_VERSION}/bin/pgbench --version
echo "${POSTGRES_DISTRIB_DIR}/v${DEFAULT_PG_VERSION}/bin" >> $GITHUB_PATH
- name: Create Neon Project
if: contains(fromJson('["neon-captest-new", "neon-captest-prefetch"]'), matrix.platform)
id: create-neon-project
uses: ./.github/actions/neon-project-create
with:
region_id: ${{ github.event.inputs.region_id || 'aws-us-east-2' }}
postgres_version: ${{ env.DEFAULT_PG_VERSION }}
api_key: ${{ secrets.NEON_STAGING_API_KEY }}
- name: Set up Connection String
id: set-up-connstr
run: |
case "${PLATFORM}" in
neon-captest-reuse)
CONNSTR=${{ secrets.BENCHMARK_CAPTEST_CONNSTR }}
;;
neon-captest-new | neon-captest-prefetch)
CONNSTR=${{ steps.create-neon-project.outputs.dsn }}
;;
rds-aurora)
CONNSTR=${{ secrets.BENCHMARK_RDS_AURORA_CONNSTR }}
;;
rds-postgres)
CONNSTR=${{ secrets.BENCHMARK_RDS_POSTGRES_CONNSTR }}
;;
*)
echo 2>&1 "Unknown PLATFORM=${PLATFORM}. Allowed only 'neon-captest-reuse', 'neon-captest-new', 'neon-captest-prefetch', 'rds-aurora', or 'rds-postgres'"
exit 1
;;
esac
echo "connstr=${CONNSTR}" >> $GITHUB_OUTPUT
psql ${CONNSTR} -c "SELECT version();"
- name: Set database options
if: matrix.platform == 'neon-captest-prefetch'
run: |
DB_NAME=$(psql ${BENCHMARK_CONNSTR} --no-align --quiet -t -c "SELECT current_database()")
psql ${BENCHMARK_CONNSTR} -c "ALTER DATABASE ${DB_NAME} SET enable_seqscan_prefetch=on"
psql ${BENCHMARK_CONNSTR} -c "ALTER DATABASE ${DB_NAME} SET effective_io_concurrency=32"
psql ${BENCHMARK_CONNSTR} -c "ALTER DATABASE ${DB_NAME} SET maintenance_io_concurrency=32"
env:
BENCHMARK_CONNSTR: ${{ steps.set-up-connstr.outputs.connstr }}
- name: Benchmark init
uses: ./.github/actions/run-python-test-set
with:
build_type: ${{ env.BUILD_TYPE }}
test_selection: performance
run_in_parallel: false
save_perf_report: ${{ env.SAVE_PERF_REPORT }}
extra_params: -m remote_cluster --timeout 21600 -k test_pgbench_remote_init
env:
BENCHMARK_CONNSTR: ${{ steps.set-up-connstr.outputs.connstr }}
VIP_VAP_ACCESS_TOKEN: "${{ secrets.VIP_VAP_ACCESS_TOKEN }}"
PERF_TEST_RESULT_CONNSTR: "${{ secrets.PERF_TEST_RESULT_CONNSTR }}"
- name: Benchmark simple-update
uses: ./.github/actions/run-python-test-set
with:
build_type: ${{ env.BUILD_TYPE }}
test_selection: performance
run_in_parallel: false
save_perf_report: ${{ env.SAVE_PERF_REPORT }}
extra_params: -m remote_cluster --timeout 21600 -k test_pgbench_remote_simple_update
env:
BENCHMARK_CONNSTR: ${{ steps.set-up-connstr.outputs.connstr }}
VIP_VAP_ACCESS_TOKEN: "${{ secrets.VIP_VAP_ACCESS_TOKEN }}"
PERF_TEST_RESULT_CONNSTR: "${{ secrets.PERF_TEST_RESULT_CONNSTR }}"
- name: Benchmark select-only
uses: ./.github/actions/run-python-test-set
with:
build_type: ${{ env.BUILD_TYPE }}
test_selection: performance
run_in_parallel: false
save_perf_report: ${{ env.SAVE_PERF_REPORT }}
extra_params: -m remote_cluster --timeout 21600 -k test_pgbench_remote_select_only
env:
BENCHMARK_CONNSTR: ${{ steps.set-up-connstr.outputs.connstr }}
VIP_VAP_ACCESS_TOKEN: "${{ secrets.VIP_VAP_ACCESS_TOKEN }}"
PERF_TEST_RESULT_CONNSTR: "${{ secrets.PERF_TEST_RESULT_CONNSTR }}"
- name: Delete Neon Project
if: ${{ steps.create-neon-project.outputs.project_id && always() }}
uses: ./.github/actions/neon-project-delete
with:
project_id: ${{ steps.create-neon-project.outputs.project_id }}
api_key: ${{ secrets.NEON_STAGING_API_KEY }}
- name: Create Allure report
if: success() || failure()
uses: ./.github/actions/allure-report
with:
action: generate
build_type: ${{ env.BUILD_TYPE }}
- name: Post to a Slack channel
if: ${{ github.event.schedule && failure() }}
uses: slackapi/slack-github-action@v1
with:
channel-id: "C033QLM5P7D" # dev-staging-stream
slack-message: "Periodic perf testing ${{ matrix.platform }}: ${{ job.status }}\n${{ github.server_url }}/${{ github.repository }}/actions/runs/${{ github.run_id }}"
env:
SLACK_BOT_TOKEN: ${{ secrets.SLACK_BOT_TOKEN }}
clickbench-compare:
# ClichBench DB for rds-aurora and rds-Postgres deployed to the same clusters
# we use for performance testing in pgbench-compare.
# Run this job only when pgbench-compare is finished to avoid the intersection.
# We might change it after https://github.com/neondatabase/neon/issues/2900.
#
# *_CLICKBENCH_CONNSTR: Genuine ClickBench DB with ~100M rows
# *_CLICKBENCH_10M_CONNSTR: DB with the first 10M rows of ClickBench DB
if: success() || failure()
needs: [ pgbench-compare ]
strategy:
fail-fast: false
matrix:
# neon-captest-prefetch: We have pre-created projects with prefetch enabled
# rds-aurora: Aurora Postgres Serverless v2 with autoscaling from 0.5 to 2 ACUs
# rds-postgres: RDS Postgres db.m5.large instance (2 vCPU, 8 GiB) with gp3 EBS storage
platform: [ neon-captest-prefetch, rds-postgres, rds-aurora ]
env:
POSTGRES_DISTRIB_DIR: /tmp/neon/pg_install
DEFAULT_PG_VERSION: 14
TEST_OUTPUT: /tmp/test_output
BUILD_TYPE: remote
SAVE_PERF_REPORT: ${{ github.event.inputs.save_perf_report || ( github.ref == 'refs/heads/main' ) }}
PLATFORM: ${{ matrix.platform }}
runs-on: [ self-hosted, us-east-2, x64 ]
container:
image: 369495373322.dkr.ecr.eu-central-1.amazonaws.com/rust:pinned
options: --init
timeout-minutes: 360 # 6h
steps:
- uses: actions/checkout@v3
- name: Download Neon artifact
uses: ./.github/actions/download
with:
name: neon-${{ runner.os }}-release-artifact
path: /tmp/neon/
prefix: latest
- name: Add Postgres binaries to PATH
run: |
${POSTGRES_DISTRIB_DIR}/v${DEFAULT_PG_VERSION}/bin/pgbench --version
echo "${POSTGRES_DISTRIB_DIR}/v${DEFAULT_PG_VERSION}/bin" >> $GITHUB_PATH
- name: Set up Connection String
id: set-up-connstr
run: |
case "${PLATFORM}" in
neon-captest-prefetch)
CONNSTR=${{ secrets.BENCHMARK_CAPTEST_CLICKBENCH_10M_CONNSTR }}
;;
rds-aurora)
CONNSTR=${{ secrets.BENCHMARK_RDS_AURORA_CLICKBENCH_10M_CONNSTR }}
;;
rds-postgres)
CONNSTR=${{ secrets.BENCHMARK_RDS_POSTGRES_CLICKBENCH_10M_CONNSTR }}
;;
*)
echo 2>&1 "Unknown PLATFORM=${PLATFORM}. Allowed only 'neon-captest-prefetch', 'rds-aurora', or 'rds-postgres'"
exit 1
;;
esac
echo "connstr=${CONNSTR}" >> $GITHUB_OUTPUT
psql ${CONNSTR} -c "SELECT version();"
- name: Set database options
if: matrix.platform == 'neon-captest-prefetch'
run: |
DB_NAME=$(psql ${BENCHMARK_CONNSTR} --no-align --quiet -t -c "SELECT current_database()")
psql ${BENCHMARK_CONNSTR} -c "ALTER DATABASE ${DB_NAME} SET enable_seqscan_prefetch=on"
psql ${BENCHMARK_CONNSTR} -c "ALTER DATABASE ${DB_NAME} SET effective_io_concurrency=32"
psql ${BENCHMARK_CONNSTR} -c "ALTER DATABASE ${DB_NAME} SET maintenance_io_concurrency=32"
env:
BENCHMARK_CONNSTR: ${{ steps.set-up-connstr.outputs.connstr }}
- name: ClickBench benchmark
uses: ./.github/actions/run-python-test-set
with:
build_type: ${{ env.BUILD_TYPE }}
test_selection: performance/test_perf_olap.py
run_in_parallel: false
save_perf_report: ${{ env.SAVE_PERF_REPORT }}
extra_params: -m remote_cluster --timeout 21600 -k test_clickbench
env:
VIP_VAP_ACCESS_TOKEN: "${{ secrets.VIP_VAP_ACCESS_TOKEN }}"
PERF_TEST_RESULT_CONNSTR: "${{ secrets.PERF_TEST_RESULT_CONNSTR }}"
BENCHMARK_CONNSTR: ${{ steps.set-up-connstr.outputs.connstr }}
- name: Create Allure report
if: success() || failure()
uses: ./.github/actions/allure-report
with:
action: generate
build_type: ${{ env.BUILD_TYPE }}
- name: Post to a Slack channel
if: ${{ github.event.schedule && failure() }}
uses: slackapi/slack-github-action@v1
with:
channel-id: "C033QLM5P7D" # dev-staging-stream
slack-message: "Periodic OLAP perf testing ${{ matrix.platform }}: ${{ job.status }}\n${{ github.server_url }}/${{ github.repository }}/actions/runs/${{ github.run_id }}"
env:
SLACK_BOT_TOKEN: ${{ secrets.SLACK_BOT_TOKEN }}
tpch-compare:
# TCP-H DB for rds-aurora and rds-Postgres deployed to the same clusters
# we use for performance testing in pgbench-compare & clickbench-compare.
# Run this job only when clickbench-compare is finished to avoid the intersection.
# We might change it after https://github.com/neondatabase/neon/issues/2900.
#
# *_TPCH_S10_CONNSTR: DB generated with scale factor 10 (~10 GB)
if: success() || failure()
needs: [ clickbench-compare ]
strategy:
fail-fast: false
matrix:
# neon-captest-prefetch: We have pre-created projects with prefetch enabled
# rds-aurora: Aurora Postgres Serverless v2 with autoscaling from 0.5 to 2 ACUs
# rds-postgres: RDS Postgres db.m5.large instance (2 vCPU, 8 GiB) with gp3 EBS storage
platform: [ neon-captest-prefetch, rds-postgres, rds-aurora ]
env:
POSTGRES_DISTRIB_DIR: /tmp/neon/pg_install
DEFAULT_PG_VERSION: 14
TEST_OUTPUT: /tmp/test_output
BUILD_TYPE: remote
SAVE_PERF_REPORT: ${{ github.event.inputs.save_perf_report || ( github.ref == 'refs/heads/main' ) }}
PLATFORM: ${{ matrix.platform }}
runs-on: [ self-hosted, us-east-2, x64 ]
container:
image: 369495373322.dkr.ecr.eu-central-1.amazonaws.com/rust:pinned
options: --init
timeout-minutes: 360 # 6h
steps:
- uses: actions/checkout@v3
- name: Download Neon artifact
uses: ./.github/actions/download
with:
name: neon-${{ runner.os }}-release-artifact
path: /tmp/neon/
prefix: latest
- name: Add Postgres binaries to PATH
run: |
${POSTGRES_DISTRIB_DIR}/v${DEFAULT_PG_VERSION}/bin/pgbench --version
echo "${POSTGRES_DISTRIB_DIR}/v${DEFAULT_PG_VERSION}/bin" >> $GITHUB_PATH
- name: Set up Connection String
id: set-up-connstr
run: |
case "${PLATFORM}" in
neon-captest-prefetch)
CONNSTR=${{ secrets.BENCHMARK_CAPTEST_TPCH_S10_CONNSTR }}
;;
rds-aurora)
CONNSTR=${{ secrets.BENCHMARK_RDS_AURORA_TPCH_S10_CONNSTR }}
;;
rds-postgres)
CONNSTR=${{ secrets.BENCHMARK_RDS_POSTGRES_TPCH_S10_CONNSTR }}
;;
*)
echo 2>&1 "Unknown PLATFORM=${PLATFORM}. Allowed only 'neon-captest-prefetch', 'rds-aurora', or 'rds-postgres'"
exit 1
;;
esac
echo "connstr=${CONNSTR}" >> $GITHUB_OUTPUT
psql ${CONNSTR} -c "SELECT version();"
- name: Set database options
if: matrix.platform == 'neon-captest-prefetch'
run: |
DB_NAME=$(psql ${BENCHMARK_CONNSTR} --no-align --quiet -t -c "SELECT current_database()")
psql ${BENCHMARK_CONNSTR} -c "ALTER DATABASE ${DB_NAME} SET enable_seqscan_prefetch=on"
psql ${BENCHMARK_CONNSTR} -c "ALTER DATABASE ${DB_NAME} SET effective_io_concurrency=32"
psql ${BENCHMARK_CONNSTR} -c "ALTER DATABASE ${DB_NAME} SET maintenance_io_concurrency=32"
env:
BENCHMARK_CONNSTR: ${{ steps.set-up-connstr.outputs.connstr }}
- name: Run TPC-H benchmark
uses: ./.github/actions/run-python-test-set
with:
build_type: ${{ env.BUILD_TYPE }}
test_selection: performance/test_perf_olap.py
run_in_parallel: false
save_perf_report: ${{ env.SAVE_PERF_REPORT }}
extra_params: -m remote_cluster --timeout 21600 -k test_tpch
env:
VIP_VAP_ACCESS_TOKEN: "${{ secrets.VIP_VAP_ACCESS_TOKEN }}"
PERF_TEST_RESULT_CONNSTR: "${{ secrets.PERF_TEST_RESULT_CONNSTR }}"
BENCHMARK_CONNSTR: ${{ steps.set-up-connstr.outputs.connstr }}
- name: Create Allure report
if: success() || failure()
uses: ./.github/actions/allure-report
with:
action: generate
build_type: ${{ env.BUILD_TYPE }}
- name: Post to a Slack channel
if: ${{ github.event.schedule && failure() }}
uses: slackapi/slack-github-action@v1
with:
channel-id: "C033QLM5P7D" # dev-staging-stream
slack-message: "Periodic TPC-H perf testing ${{ matrix.platform }}: ${{ job.status }}\n${{ github.server_url }}/${{ github.repository }}/actions/runs/${{ github.run_id }}"
env:
SLACK_BOT_TOKEN: ${{ secrets.SLACK_BOT_TOKEN }}
user-examples-compare:
if: success() || failure()
needs: [ tpch-compare ]
strategy:
fail-fast: false
matrix:
# neon-captest-prefetch: We have pre-created projects with prefetch enabled
# rds-aurora: Aurora Postgres Serverless v2 with autoscaling from 0.5 to 2 ACUs
# rds-postgres: RDS Postgres db.m5.large instance (2 vCPU, 8 GiB) with gp3 EBS storage
platform: [ neon-captest-prefetch, rds-postgres, rds-aurora ]
env:
POSTGRES_DISTRIB_DIR: /tmp/neon/pg_install
DEFAULT_PG_VERSION: 14
TEST_OUTPUT: /tmp/test_output
BUILD_TYPE: remote
SAVE_PERF_REPORT: ${{ github.event.inputs.save_perf_report || ( github.ref == 'refs/heads/main' ) }}
PLATFORM: ${{ matrix.platform }}
runs-on: [ self-hosted, us-east-2, x64 ]
container:
image: 369495373322.dkr.ecr.eu-central-1.amazonaws.com/rust:pinned
options: --init
timeout-minutes: 360 # 6h
steps:
- uses: actions/checkout@v3
- name: Download Neon artifact
uses: ./.github/actions/download
with:
name: neon-${{ runner.os }}-release-artifact
path: /tmp/neon/
prefix: latest
- name: Add Postgres binaries to PATH
run: |
${POSTGRES_DISTRIB_DIR}/v${DEFAULT_PG_VERSION}/bin/pgbench --version
echo "${POSTGRES_DISTRIB_DIR}/v${DEFAULT_PG_VERSION}/bin" >> $GITHUB_PATH
- name: Set up Connection String
id: set-up-connstr
run: |
case "${PLATFORM}" in
neon-captest-prefetch)
CONNSTR=${{ secrets.BENCHMARK_USER_EXAMPLE_CAPTEST_CONNSTR }}
;;
rds-aurora)
CONNSTR=${{ secrets.BENCHMARK_USER_EXAMPLE_RDS_AURORA_CONNSTR }}
;;
rds-postgres)
CONNSTR=${{ secrets.BENCHMARK_USER_EXAMPLE_RDS_POSTGRES_CONNSTR }}
;;
*)
echo 2>&1 "Unknown PLATFORM=${PLATFORM}. Allowed only 'neon-captest-prefetch', 'rds-aurora', or 'rds-postgres'"
exit 1
;;
esac
echo "connstr=${CONNSTR}" >> $GITHUB_OUTPUT
psql ${CONNSTR} -c "SELECT version();"
- name: Set database options
if: matrix.platform == 'neon-captest-prefetch'
run: |
DB_NAME=$(psql ${BENCHMARK_CONNSTR} --no-align --quiet -t -c "SELECT current_database()")
psql ${BENCHMARK_CONNSTR} -c "ALTER DATABASE ${DB_NAME} SET enable_seqscan_prefetch=on"
psql ${BENCHMARK_CONNSTR} -c "ALTER DATABASE ${DB_NAME} SET effective_io_concurrency=32"
psql ${BENCHMARK_CONNSTR} -c "ALTER DATABASE ${DB_NAME} SET maintenance_io_concurrency=32"
env:
BENCHMARK_CONNSTR: ${{ steps.set-up-connstr.outputs.connstr }}
- name: Run user examples
uses: ./.github/actions/run-python-test-set
with:
build_type: ${{ env.BUILD_TYPE }}
test_selection: performance/test_perf_olap.py
run_in_parallel: false
save_perf_report: ${{ env.SAVE_PERF_REPORT }}
extra_params: -m remote_cluster --timeout 21600 -k test_user_examples
env:
VIP_VAP_ACCESS_TOKEN: "${{ secrets.VIP_VAP_ACCESS_TOKEN }}"
PERF_TEST_RESULT_CONNSTR: "${{ secrets.PERF_TEST_RESULT_CONNSTR }}"
BENCHMARK_CONNSTR: ${{ steps.set-up-connstr.outputs.connstr }}
- name: Create Allure report
if: success() || failure()
uses: ./.github/actions/allure-report
with:
action: generate
build_type: ${{ env.BUILD_TYPE }}
- name: Post to a Slack channel
if: ${{ github.event.schedule && failure() }}
uses: slackapi/slack-github-action@v1
with:
channel-id: "C033QLM5P7D" # dev-staging-stream
slack-message: "Periodic TPC-H perf testing ${{ matrix.platform }}: ${{ job.status }}\n${{ github.server_url }}/${{ github.repository }}/actions/runs/${{ github.run_id }}"
env:
SLACK_BOT_TOKEN: ${{ secrets.SLACK_BOT_TOKEN }}

946
.github/workflows/build_and_test.yml vendored Normal file
View File

@@ -0,0 +1,946 @@
name: Build and Test
on:
push:
branches:
- main
- release
pull_request:
defaults:
run:
shell: bash -euxo pipefail {0}
concurrency:
# Allow only one workflow per any non-`main` branch.
group: ${{ github.workflow }}-${{ github.ref }}-${{ github.ref == 'refs/heads/main' && github.sha || 'anysha' }}
cancel-in-progress: true
env:
RUST_BACKTRACE: 1
COPT: '-Werror'
AWS_ACCESS_KEY_ID: ${{ secrets.AWS_ACCESS_KEY_DEV }}
AWS_SECRET_ACCESS_KEY: ${{ secrets.AWS_SECRET_KEY_DEV }}
jobs:
tag:
runs-on: [ self-hosted, gen3, small ]
container: 369495373322.dkr.ecr.eu-central-1.amazonaws.com/base:pinned
outputs:
build-tag: ${{steps.build-tag.outputs.tag}}
steps:
- name: Checkout
uses: actions/checkout@v3
with:
fetch-depth: 0
- name: Get build tag
run: |
echo run:$GITHUB_RUN_ID
echo ref:$GITHUB_REF_NAME
echo rev:$(git rev-list --count HEAD)
if [[ "$GITHUB_REF_NAME" == "main" ]]; then
echo "tag=$(git rev-list --count HEAD)" >> $GITHUB_OUTPUT
elif [[ "$GITHUB_REF_NAME" == "release" ]]; then
echo "tag=release-$(git rev-list --count HEAD)" >> $GITHUB_OUTPUT
else
echo "GITHUB_REF_NAME (value '$GITHUB_REF_NAME') is not set to either 'main' or 'release'"
echo "tag=$GITHUB_RUN_ID" >> $GITHUB_OUTPUT
fi
shell: bash
id: build-tag
check-codestyle-python:
runs-on: [ self-hosted, gen3, small ]
container:
image: 369495373322.dkr.ecr.eu-central-1.amazonaws.com/rust:pinned
options: --init
steps:
- name: Checkout
uses: actions/checkout@v3
with:
submodules: false
fetch-depth: 1
- name: Cache poetry deps
id: cache_poetry
uses: actions/cache@v3
with:
path: ~/.cache/pypoetry/virtualenvs
key: v1-codestyle-python-deps-${{ hashFiles('poetry.lock') }}
- name: Install Python deps
run: ./scripts/pysync
- name: Run ruff to ensure code format
run: poetry run ruff .
- name: Run black to ensure code format
run: poetry run black --diff --check .
- name: Run mypy to check types
run: poetry run mypy .
check-codestyle-rust:
runs-on: [ self-hosted, gen3, large ]
container:
image: 369495373322.dkr.ecr.eu-central-1.amazonaws.com/rust:pinned
options: --init
steps:
- name: Checkout
uses: actions/checkout@v3
with:
submodules: true
fetch-depth: 1
# Disabled for now
# - name: Restore cargo deps cache
# id: cache_cargo
# uses: actions/cache@v3
# with:
# path: |
# !~/.cargo/registry/src
# ~/.cargo/git/
# target/
# key: v1-${{ runner.os }}-cargo-clippy-${{ hashFiles('rust-toolchain.toml') }}-${{ hashFiles('Cargo.lock') }}
# Some of our rust modules use FFI and need those to be checked
- name: Get postgres headers
run: make postgres-headers -j$(nproc)
- name: Run cargo clippy
run: ./run_clippy.sh
# Use `${{ !cancelled() }}` to run quck tests after the longer clippy run
- name: Check formatting
if: ${{ !cancelled() }}
run: cargo fmt --all -- --check
# https://github.com/facebookincubator/cargo-guppy/tree/bec4e0eb29dcd1faac70b1b5360267fc02bf830e/tools/cargo-hakari#2-keep-the-workspace-hack-up-to-date-in-ci
- name: Check rust dependencies
if: ${{ !cancelled() }}
run: |
cargo hakari generate --diff # workspace-hack Cargo.toml is up-to-date
cargo hakari manage-deps --dry-run # all workspace crates depend on workspace-hack
# https://github.com/EmbarkStudios/cargo-deny
- name: Check rust licenses/bans/advisories/sources
if: ${{ !cancelled() }}
run: cargo deny check
build-neon:
runs-on: [ self-hosted, gen3, large ]
container:
image: 369495373322.dkr.ecr.eu-central-1.amazonaws.com/rust:pinned
options: --init
strategy:
fail-fast: false
matrix:
build_type: [ debug, release ]
env:
BUILD_TYPE: ${{ matrix.build_type }}
GIT_VERSION: ${{ github.sha }}
steps:
- name: Fix git ownership
run: |
# Workaround for `fatal: detected dubious ownership in repository at ...`
#
# Use both ${{ github.workspace }} and ${GITHUB_WORKSPACE} because they're different on host and in containers
# Ref https://github.com/actions/checkout/issues/785
#
git config --global --add safe.directory ${{ github.workspace }}
git config --global --add safe.directory ${GITHUB_WORKSPACE}
- name: Checkout
uses: actions/checkout@v3
with:
submodules: true
fetch-depth: 1
- name: Set pg 14 revision for caching
id: pg_v14_rev
run: echo pg_rev=$(git rev-parse HEAD:vendor/postgres-v14) >> $GITHUB_OUTPUT
- name: Set pg 15 revision for caching
id: pg_v15_rev
run: echo pg_rev=$(git rev-parse HEAD:vendor/postgres-v15) >> $GITHUB_OUTPUT
# Set some environment variables used by all the steps.
#
# CARGO_FLAGS is extra options to pass to "cargo build", "cargo test" etc.
# It also includes --features, if any
#
# CARGO_FEATURES is passed to "cargo metadata". It is separate from CARGO_FLAGS,
# because "cargo metadata" doesn't accept --release or --debug options
#
# We run tests with addtional features, that are turned off by default (e.g. in release builds), see
# corresponding Cargo.toml files for their descriptions.
- name: Set env variables
run: |
CARGO_FEATURES="--features testing"
if [[ $BUILD_TYPE == "debug" ]]; then
cov_prefix="scripts/coverage --profraw-prefix=$GITHUB_JOB --dir=/tmp/coverage run"
CARGO_FLAGS="--locked $CARGO_FEATURES"
elif [[ $BUILD_TYPE == "release" ]]; then
cov_prefix=""
CARGO_FLAGS="--locked --release $CARGO_FEATURES"
fi
echo "cov_prefix=${cov_prefix}" >> $GITHUB_ENV
echo "CARGO_FEATURES=${CARGO_FEATURES}" >> $GITHUB_ENV
echo "CARGO_FLAGS=${CARGO_FLAGS}" >> $GITHUB_ENV
echo "CARGO_HOME=${GITHUB_WORKSPACE}/.cargo" >> $GITHUB_ENV
# Disabled for now
# Don't include the ~/.cargo/registry/src directory. It contains just
# uncompressed versions of the crates in ~/.cargo/registry/cache
# directory, and it's faster to let 'cargo' to rebuild it from the
# compressed crates.
# - name: Cache cargo deps
# id: cache_cargo
# uses: actions/cache@v3
# with:
# path: |
# ~/.cargo/registry/
# !~/.cargo/registry/src
# ~/.cargo/git/
# target/
# # Fall back to older versions of the key, if no cache for current Cargo.lock was found
# key: |
# v1-${{ runner.os }}-${{ matrix.build_type }}-cargo-${{ hashFiles('rust-toolchain.toml') }}-${{ hashFiles('Cargo.lock') }}
# v1-${{ runner.os }}-${{ matrix.build_type }}-cargo-${{ hashFiles('rust-toolchain.toml') }}-
- name: Cache postgres v14 build
id: cache_pg_14
uses: actions/cache@v3
with:
path: pg_install/v14
key: v1-${{ runner.os }}-${{ matrix.build_type }}-pg-${{ steps.pg_v14_rev.outputs.pg_rev }}-${{ hashFiles('Makefile') }}
- name: Cache postgres v15 build
id: cache_pg_15
uses: actions/cache@v3
with:
path: pg_install/v15
key: v1-${{ runner.os }}-${{ matrix.build_type }}-pg-${{ steps.pg_v15_rev.outputs.pg_rev }}-${{ hashFiles('Makefile') }}
- name: Build postgres v14
if: steps.cache_pg_14.outputs.cache-hit != 'true'
run: mold -run make postgres-v14 -j$(nproc)
- name: Build postgres v15
if: steps.cache_pg_15.outputs.cache-hit != 'true'
run: mold -run make postgres-v15 -j$(nproc)
- name: Build neon extensions
run: mold -run make neon-pg-ext -j$(nproc)
- name: Run cargo build
run: |
${cov_prefix} mold -run cargo build $CARGO_FLAGS --bins --tests
- name: Run cargo test
run: |
${cov_prefix} cargo test $CARGO_FLAGS
- name: Install rust binaries
run: |
# Install target binaries
mkdir -p /tmp/neon/bin/
binaries=$(
${cov_prefix} cargo metadata $CARGO_FEATURES --format-version=1 --no-deps |
jq -r '.packages[].targets[] | select(.kind | index("bin")) | .name'
)
for bin in $binaries; do
SRC=target/$BUILD_TYPE/$bin
DST=/tmp/neon/bin/$bin
cp "$SRC" "$DST"
done
# Install test executables and write list of all binaries (for code coverage)
if [[ $BUILD_TYPE == "debug" ]]; then
# Keep bloated coverage data files away from the rest of the artifact
mkdir -p /tmp/coverage/
mkdir -p /tmp/neon/test_bin/
test_exe_paths=$(
${cov_prefix} cargo test $CARGO_FLAGS --message-format=json --no-run |
jq -r '.executable | select(. != null)'
)
for bin in $test_exe_paths; do
SRC=$bin
DST=/tmp/neon/test_bin/$(basename $bin)
# We don't need debug symbols for code coverage, so strip them out to make
# the artifact smaller.
strip "$SRC" -o "$DST"
echo "$DST" >> /tmp/coverage/binaries.list
done
for bin in $binaries; do
echo "/tmp/neon/bin/$bin" >> /tmp/coverage/binaries.list
done
fi
- name: Install postgres binaries
run: cp -a pg_install /tmp/neon/pg_install
- name: Upload Neon artifact
uses: ./.github/actions/upload
with:
name: neon-${{ runner.os }}-${{ matrix.build_type }}-artifact
path: /tmp/neon
# XXX: keep this after the binaries.list is formed, so the coverage can properly work later
- name: Merge and upload coverage data
if: matrix.build_type == 'debug'
uses: ./.github/actions/save-coverage-data
regress-tests:
runs-on: [ self-hosted, gen3, large ]
container:
image: 369495373322.dkr.ecr.eu-central-1.amazonaws.com/rust:pinned
options: --init
needs: [ build-neon ]
strategy:
fail-fast: false
matrix:
build_type: [ debug, release ]
steps:
- name: Checkout
uses: actions/checkout@v3
with:
submodules: true
fetch-depth: 1
- name: Pytest regression tests
uses: ./.github/actions/run-python-test-set
with:
build_type: ${{ matrix.build_type }}
test_selection: regress
needs_postgres_source: true
run_with_real_s3: true
real_s3_bucket: ci-tests-s3
real_s3_region: us-west-2
real_s3_access_key_id: "${{ secrets.AWS_ACCESS_KEY_ID_CI_TESTS_S3 }}"
real_s3_secret_access_key: "${{ secrets.AWS_SECRET_ACCESS_KEY_CI_TESTS_S3 }}"
- name: Merge and upload coverage data
if: matrix.build_type == 'debug'
uses: ./.github/actions/save-coverage-data
benchmarks:
runs-on: [ self-hosted, gen3, small ]
container:
image: 369495373322.dkr.ecr.eu-central-1.amazonaws.com/rust:pinned
options: --init
needs: [ build-neon ]
if: github.ref_name == 'main' || contains(github.event.pull_request.labels.*.name, 'run-benchmarks')
strategy:
fail-fast: false
matrix:
build_type: [ release ]
steps:
- name: Checkout
uses: actions/checkout@v3
with:
submodules: true
fetch-depth: 1
- name: Pytest benchmarks
uses: ./.github/actions/run-python-test-set
with:
build_type: ${{ matrix.build_type }}
test_selection: performance
run_in_parallel: false
save_perf_report: ${{ github.ref == 'refs/heads/main' }}
env:
VIP_VAP_ACCESS_TOKEN: "${{ secrets.VIP_VAP_ACCESS_TOKEN }}"
PERF_TEST_RESULT_CONNSTR: "${{ secrets.PERF_TEST_RESULT_CONNSTR }}"
# XXX: no coverage data handling here, since benchmarks are run on release builds,
# while coverage is currently collected for the debug ones
merge-allure-report:
runs-on: [ self-hosted, gen3, small ]
container:
image: 369495373322.dkr.ecr.eu-central-1.amazonaws.com/rust:pinned
options: --init
needs: [ regress-tests, benchmarks ]
if: ${{ !cancelled() }}
strategy:
fail-fast: false
matrix:
build_type: [ debug, release ]
steps:
- name: Checkout
uses: actions/checkout@v3
with:
submodules: false
- name: Create Allure report
id: create-allure-report
uses: ./.github/actions/allure-report
with:
action: generate
build_type: ${{ matrix.build_type }}
- name: Store Allure test stat in the DB
if: ${{ steps.create-allure-report.outputs.report-url }}
env:
BUILD_TYPE: ${{ matrix.build_type }}
SHA: ${{ github.event.pull_request.head.sha || github.sha }}
REPORT_URL: ${{ steps.create-allure-report.outputs.report-url }}
TEST_RESULT_CONNSTR: ${{ secrets.REGRESS_TEST_RESULT_CONNSTR }}
run: |
curl --fail --output suites.json ${REPORT_URL%/index.html}/data/suites.json
./scripts/pysync
DATABASE_URL="$TEST_RESULT_CONNSTR" poetry run python3 scripts/ingest_regress_test_result.py --revision ${SHA} --reference ${GITHUB_REF} --build-type ${BUILD_TYPE} --ingest suites.json
coverage-report:
runs-on: [ self-hosted, gen3, small ]
container:
image: 369495373322.dkr.ecr.eu-central-1.amazonaws.com/rust:pinned
options: --init
needs: [ regress-tests ]
strategy:
fail-fast: false
matrix:
build_type: [ debug ]
steps:
- name: Checkout
uses: actions/checkout@v3
with:
submodules: true
fetch-depth: 1
# Disabled for now
# - name: Restore cargo deps cache
# id: cache_cargo
# uses: actions/cache@v3
# with:
# path: |
# ~/.cargo/registry/
# !~/.cargo/registry/src
# ~/.cargo/git/
# target/
# key: v1-${{ runner.os }}-${{ matrix.build_type }}-cargo-${{ hashFiles('rust-toolchain.toml') }}-${{ hashFiles('Cargo.lock') }}
- name: Get Neon artifact
uses: ./.github/actions/download
with:
name: neon-${{ runner.os }}-${{ matrix.build_type }}-artifact
path: /tmp/neon
- name: Get coverage artifact
uses: ./.github/actions/download
with:
name: coverage-data-artifact
path: /tmp/coverage
- name: Merge coverage data
run: scripts/coverage "--profraw-prefix=$GITHUB_JOB" --dir=/tmp/coverage merge
- name: Build and upload coverage report
run: |
COMMIT_SHA=${{ github.event.pull_request.head.sha }}
COMMIT_SHA=${COMMIT_SHA:-${{ github.sha }}}
COMMIT_URL=https://github.com/${{ github.repository }}/commit/$COMMIT_SHA
scripts/coverage \
--dir=/tmp/coverage report \
--input-objects=/tmp/coverage/binaries.list \
--commit-url=$COMMIT_URL \
--format=github
REPORT_URL=https://${{ github.repository_owner }}.github.io/zenith-coverage-data/$COMMIT_SHA
scripts/git-upload \
--repo=https://${{ secrets.VIP_VAP_ACCESS_TOKEN }}@github.com/${{ github.repository_owner }}/zenith-coverage-data.git \
--message="Add code coverage for $COMMIT_URL" \
copy /tmp/coverage/report $COMMIT_SHA # COPY FROM TO_RELATIVE
# Add link to the coverage report to the commit
curl -f -X POST \
https://api.github.com/repos/${{ github.repository }}/statuses/$COMMIT_SHA \
-H "Accept: application/vnd.github.v3+json" \
--user "${{ secrets.CI_ACCESS_TOKEN }}" \
--data \
"{
\"state\": \"success\",
\"context\": \"neon-coverage\",
\"description\": \"Coverage report is ready\",
\"target_url\": \"$REPORT_URL\"
}"
trigger-e2e-tests:
runs-on: [ self-hosted, gen3, small ]
container:
image: 369495373322.dkr.ecr.eu-central-1.amazonaws.com/base:pinned
options: --init
needs: [ push-docker-hub, tag ]
steps:
- name: Set PR's status to pending and request a remote CI test
run: |
# For pull requests, GH Actions set "github.sha" variable to point at a fake merge commit
# but we need to use a real sha of a latest commit in the PR's branch for the e2e job,
# to place a job run status update later.
COMMIT_SHA=${{ github.event.pull_request.head.sha }}
# For non-PR kinds of runs, the above will produce an empty variable, pick the original sha value for those
COMMIT_SHA=${COMMIT_SHA:-${{ github.sha }}}
REMOTE_REPO="${{ github.repository_owner }}/cloud"
curl -f -X POST \
https://api.github.com/repos/${{ github.repository }}/statuses/$COMMIT_SHA \
-H "Accept: application/vnd.github.v3+json" \
--user "${{ secrets.CI_ACCESS_TOKEN }}" \
--data \
"{
\"state\": \"pending\",
\"context\": \"neon-cloud-e2e\",
\"description\": \"[$REMOTE_REPO] Remote CI job is about to start\"
}"
curl -f -X POST \
https://api.github.com/repos/$REMOTE_REPO/actions/workflows/testing.yml/dispatches \
-H "Accept: application/vnd.github.v3+json" \
--user "${{ secrets.CI_ACCESS_TOKEN }}" \
--data \
"{
\"ref\": \"main\",
\"inputs\": {
\"ci_job_name\": \"neon-cloud-e2e\",
\"commit_hash\": \"$COMMIT_SHA\",
\"remote_repo\": \"${{ github.repository }}\",
\"storage_image_tag\": \"${{ needs.tag.outputs.build-tag }}\",
\"compute_image_tag\": \"${{ needs.tag.outputs.build-tag }}\"
}
}"
neon-image:
runs-on: [ self-hosted, gen3, large ]
needs: [ tag ]
# https://github.com/GoogleContainerTools/kaniko/issues/2005
container: gcr.io/kaniko-project/executor:v1.7.0-debug
defaults:
run:
shell: sh -eu {0}
steps:
- name: Checkout
uses: actions/checkout@v1 # v3 won't work with kaniko
with:
submodules: true
fetch-depth: 0
- name: Configure ECR login
run: echo "{\"credsStore\":\"ecr-login\"}" > /kaniko/.docker/config.json
- name: Kaniko build neon
run: /kaniko/executor --reproducible --snapshotMode=redo --skip-unused-stages --cache=true --cache-repo 369495373322.dkr.ecr.eu-central-1.amazonaws.com/cache --context . --build-arg GIT_VERSION=${{ github.sha }} --destination 369495373322.dkr.ecr.eu-central-1.amazonaws.com/neon:${{needs.tag.outputs.build-tag}}
# Cleanup script fails otherwise - rm: cannot remove '/nvme/actions-runner/_work/_temp/_github_home/.ecr': Permission denied
- name: Cleanup ECR folder
run: rm -rf ~/.ecr
neon-image-depot:
# For testing this will run side-by-side for a few merges.
# This action is not really optimized yet, but gets the job done
runs-on: [ self-hosted, gen3, large ]
needs: [ tag ]
container: 369495373322.dkr.ecr.eu-central-1.amazonaws.com/base:pinned
permissions:
contents: read
id-token: write
steps:
- name: Checkout
uses: actions/checkout@v3
with:
submodules: true
fetch-depth: 0
- name: Setup go
uses: actions/setup-go@v3
with:
go-version: '1.19'
- name: Set up Depot CLI
uses: depot/setup-action@v1
- name: Install Crane & ECR helper
run: go install github.com/awslabs/amazon-ecr-credential-helper/ecr-login/cli/docker-credential-ecr-login@69c85dc22db6511932bbf119e1a0cc5c90c69a7f # v0.6.0
- name: Configure ECR login
run: |
mkdir /github/home/.docker/
echo "{\"credsStore\":\"ecr-login\"}" > /github/home/.docker/config.json
- name: Build and push
uses: depot/build-push-action@v1
with:
# if no depot.json file is at the root of your repo, you must specify the project id
project: nrdv0s4kcs
push: true
tags: 369495373322.dkr.ecr.eu-central-1.amazonaws.com/neon:depot-${{needs.tag.outputs.build-tag}}
compute-tools-image:
runs-on: [ self-hosted, gen3, large ]
needs: [ tag ]
container: gcr.io/kaniko-project/executor:v1.7.0-debug
defaults:
run:
shell: sh -eu {0}
steps:
- name: Checkout
uses: actions/checkout@v1 # v3 won't work with kaniko
- name: Configure ECR login
run: echo "{\"credsStore\":\"ecr-login\"}" > /kaniko/.docker/config.json
- name: Kaniko build compute tools
run: /kaniko/executor --reproducible --snapshotMode=redo --skip-unused-stages --cache=true --cache-repo 369495373322.dkr.ecr.eu-central-1.amazonaws.com/cache --context . --build-arg GIT_VERSION=${{ github.sha }} --dockerfile Dockerfile.compute-tools --destination 369495373322.dkr.ecr.eu-central-1.amazonaws.com/compute-tools:${{needs.tag.outputs.build-tag}}
- name: Cleanup ECR folder
run: rm -rf ~/.ecr
compute-node-image:
runs-on: [ self-hosted, gen3, large ]
container: gcr.io/kaniko-project/executor:v1.7.0-debug
needs: [ tag ]
strategy:
fail-fast: false
matrix:
version: [ v14, v15 ]
defaults:
run:
shell: sh -eu {0}
steps:
- name: Checkout
uses: actions/checkout@v1 # v3 won't work with kaniko
with:
submodules: true
fetch-depth: 0
- name: Configure ECR login
run: echo "{\"credsStore\":\"ecr-login\"}" > /kaniko/.docker/config.json
- name: Kaniko build compute node with extensions
run: /kaniko/executor --reproducible --snapshotMode=redo --skip-unused-stages --cache=true --cache-repo 369495373322.dkr.ecr.eu-central-1.amazonaws.com/cache --context . --build-arg GIT_VERSION=${{ github.sha }} --build-arg PG_VERSION=${{ matrix.version }} --dockerfile Dockerfile.compute-node --destination 369495373322.dkr.ecr.eu-central-1.amazonaws.com/compute-node-${{ matrix.version }}:${{needs.tag.outputs.build-tag}}
- name: Cleanup ECR folder
run: rm -rf ~/.ecr
vm-compute-node-image:
runs-on: [ self-hosted, gen3, large ]
needs: [ tag, compute-node-image ]
strategy:
fail-fast: false
matrix:
version: [ v14, v15 ]
defaults:
run:
shell: sh -eu {0}
env:
VM_BUILDER_VERSION: v0.4.6
steps:
- name: Checkout
uses: actions/checkout@v1
with:
fetch-depth: 0
- name: Downloading vm-builder
run: |
curl -L https://github.com/neondatabase/neonvm/releases/download/$VM_BUILDER_VERSION/vm-builder -o vm-builder
chmod +x vm-builder
- name: Pulling compute-node image
run: |
docker pull 369495373322.dkr.ecr.eu-central-1.amazonaws.com/compute-node-${{ matrix.version }}:${{needs.tag.outputs.build-tag}}
- name: Building VM compute-node rootfs
run: |
docker build -t temp-vm-compute-node --build-arg SRC_IMAGE=369495373322.dkr.ecr.eu-central-1.amazonaws.com/compute-node-${{ matrix.version }}:${{needs.tag.outputs.build-tag}} -f Dockerfile.vm-compute-node .
- name: Build vm image
run: |
# note: as of 2023-01-12, vm-builder requires a trailing ":latest" for local images
./vm-builder -use-inittab -src=temp-vm-compute-node:latest -dst=369495373322.dkr.ecr.eu-central-1.amazonaws.com/vm-compute-node-${{ matrix.version }}:${{needs.tag.outputs.build-tag}}
- name: Pushing vm-compute-node image
run: |
docker push 369495373322.dkr.ecr.eu-central-1.amazonaws.com/vm-compute-node-${{ matrix.version }}:${{needs.tag.outputs.build-tag}}
test-images:
needs: [ tag, neon-image, compute-node-image, compute-tools-image ]
runs-on: [ self-hosted, gen3, small ]
steps:
- name: Checkout
uses: actions/checkout@v3
with:
fetch-depth: 0
# `neondatabase/neon` contains multiple binaries, all of them use the same input for the version into the same version formatting library.
# Pick pageserver as currently the only binary with extra "version" features printed in the string to verify.
# Regular pageserver version string looks like
# Neon page server git-env:32d14403bd6ab4f4520a94cbfd81a6acef7a526c failpoints: true, features: []
# Bad versions might loop like:
# Neon page server git-env:local failpoints: true, features: ["testing"]
# Ensure that we don't have bad versions.
- name: Verify image versions
shell: bash # ensure no set -e for better error messages
run: |
pageserver_version=$(docker run --rm 369495373322.dkr.ecr.eu-central-1.amazonaws.com/neon:${{needs.tag.outputs.build-tag}} "/bin/sh" "-c" "/usr/local/bin/pageserver --version")
echo "Pageserver version string: $pageserver_version"
if ! echo "$pageserver_version" | grep -qv 'git-env:local' ; then
echo "Pageserver version should not be the default Dockerfile one"
exit 1
fi
if ! echo "$pageserver_version" | grep -qv '"testing"' ; then
echo "Pageserver version should have no testing feature enabled"
exit 1
fi
- name: Verify docker-compose example
run: env REPOSITORY=369495373322.dkr.ecr.eu-central-1.amazonaws.com TAG=${{needs.tag.outputs.build-tag}} ./docker-compose/docker_compose_test.sh
- name: Print logs and clean up
if: always()
run: |
docker compose -f ./docker-compose/docker-compose.yml logs || 0
docker compose -f ./docker-compose/docker-compose.yml down
promote-images:
runs-on: [ self-hosted, gen3, small ]
needs: [ tag, test-images, vm-compute-node-image ]
container: golang:1.19-bullseye
if: github.event_name != 'workflow_dispatch'
steps:
- name: Install Crane & ECR helper
if: |
(github.ref_name == 'main' || github.ref_name == 'release') &&
github.event_name != 'workflow_dispatch'
run: |
go install github.com/google/go-containerregistry/cmd/crane@31786c6cbb82d6ec4fb8eb79cd9387905130534e # v0.11.0
go install github.com/awslabs/amazon-ecr-credential-helper/ecr-login/cli/docker-credential-ecr-login@69c85dc22db6511932bbf119e1a0cc5c90c69a7f # v0.6.0
- name: Configure ECR login
run: |
mkdir /github/home/.docker/
echo "{\"credsStore\":\"ecr-login\"}" > /github/home/.docker/config.json
- name: Add latest tag to images
if: |
(github.ref_name == 'main' || github.ref_name == 'release') &&
github.event_name != 'workflow_dispatch'
run: |
crane tag 369495373322.dkr.ecr.eu-central-1.amazonaws.com/neon:${{needs.tag.outputs.build-tag}} latest
crane tag 369495373322.dkr.ecr.eu-central-1.amazonaws.com/compute-tools:${{needs.tag.outputs.build-tag}} latest
crane tag 369495373322.dkr.ecr.eu-central-1.amazonaws.com/compute-node-v14:${{needs.tag.outputs.build-tag}} latest
crane tag 369495373322.dkr.ecr.eu-central-1.amazonaws.com/vm-compute-node-v14:${{needs.tag.outputs.build-tag}} latest
crane tag 369495373322.dkr.ecr.eu-central-1.amazonaws.com/compute-node-v15:${{needs.tag.outputs.build-tag}} latest
crane tag 369495373322.dkr.ecr.eu-central-1.amazonaws.com/vm-compute-node-v15:${{needs.tag.outputs.build-tag}} latest
- name: Cleanup ECR folder
run: rm -rf ~/.ecr
push-docker-hub:
runs-on: [ self-hosted, dev, x64 ]
needs: [ promote-images, tag ]
container: golang:1.19-bullseye
steps:
- name: Install Crane & ECR helper
run: |
go install github.com/google/go-containerregistry/cmd/crane@31786c6cbb82d6ec4fb8eb79cd9387905130534e # v0.11.0
go install github.com/awslabs/amazon-ecr-credential-helper/ecr-login/cli/docker-credential-ecr-login@69c85dc22db6511932bbf119e1a0cc5c90c69a7f # v0.6.0
- name: Configure ECR login
run: |
mkdir /github/home/.docker/
echo "{\"credsStore\":\"ecr-login\"}" > /github/home/.docker/config.json
- name: Pull neon image from ECR
run: crane pull 369495373322.dkr.ecr.eu-central-1.amazonaws.com/neon:${{needs.tag.outputs.build-tag}} neon
- name: Pull compute tools image from ECR
run: crane pull 369495373322.dkr.ecr.eu-central-1.amazonaws.com/compute-tools:${{needs.tag.outputs.build-tag}} compute-tools
- name: Pull compute node v14 image from ECR
run: crane pull 369495373322.dkr.ecr.eu-central-1.amazonaws.com/compute-node-v14:${{needs.tag.outputs.build-tag}} compute-node-v14
- name: Pull vm compute node v14 image from ECR
run: crane pull 369495373322.dkr.ecr.eu-central-1.amazonaws.com/vm-compute-node-v14:${{needs.tag.outputs.build-tag}} vm-compute-node-v14
- name: Pull compute node v15 image from ECR
run: crane pull 369495373322.dkr.ecr.eu-central-1.amazonaws.com/compute-node-v15:${{needs.tag.outputs.build-tag}} compute-node-v15
- name: Pull vm compute node v15 image from ECR
run: crane pull 369495373322.dkr.ecr.eu-central-1.amazonaws.com/vm-compute-node-v15:${{needs.tag.outputs.build-tag}} vm-compute-node-v15
- name: Pull rust image from ECR
run: crane pull 369495373322.dkr.ecr.eu-central-1.amazonaws.com/rust:pinned rust
- name: Push images to production ECR
if: |
(github.ref_name == 'main' || github.ref_name == 'release') &&
github.event_name != 'workflow_dispatch'
run: |
crane copy 369495373322.dkr.ecr.eu-central-1.amazonaws.com/neon:${{needs.tag.outputs.build-tag}} 093970136003.dkr.ecr.eu-central-1.amazonaws.com/neon:latest
crane copy 369495373322.dkr.ecr.eu-central-1.amazonaws.com/compute-tools:${{needs.tag.outputs.build-tag}} 093970136003.dkr.ecr.eu-central-1.amazonaws.com/compute-tools:latest
crane copy 369495373322.dkr.ecr.eu-central-1.amazonaws.com/compute-node-v14:${{needs.tag.outputs.build-tag}} 093970136003.dkr.ecr.eu-central-1.amazonaws.com/compute-node-v14:latest
crane copy 369495373322.dkr.ecr.eu-central-1.amazonaws.com/vm-compute-node-v14:${{needs.tag.outputs.build-tag}} 093970136003.dkr.ecr.eu-central-1.amazonaws.com/vm-compute-node-v14:latest
crane copy 369495373322.dkr.ecr.eu-central-1.amazonaws.com/compute-node-v15:${{needs.tag.outputs.build-tag}} 093970136003.dkr.ecr.eu-central-1.amazonaws.com/compute-node-v15:latest
crane copy 369495373322.dkr.ecr.eu-central-1.amazonaws.com/vm-compute-node-v15:${{needs.tag.outputs.build-tag}} 093970136003.dkr.ecr.eu-central-1.amazonaws.com/vm-compute-node-v15:latest
- name: Configure Docker Hub login
run: |
# ECR Credential Helper & Docker Hub don't work together in config, hence reset
echo "" > /github/home/.docker/config.json
crane auth login -u ${{ secrets.NEON_DOCKERHUB_USERNAME }} -p ${{ secrets.NEON_DOCKERHUB_PASSWORD }} index.docker.io
- name: Push neon image to Docker Hub
run: crane push neon neondatabase/neon:${{needs.tag.outputs.build-tag}}
- name: Push compute tools image to Docker Hub
run: crane push compute-tools neondatabase/compute-tools:${{needs.tag.outputs.build-tag}}
- name: Push compute node v14 image to Docker Hub
run: crane push compute-node-v14 neondatabase/compute-node-v14:${{needs.tag.outputs.build-tag}}
- name: Push vm compute node v14 image to Docker Hub
run: crane push vm-compute-node-v14 neondatabase/vm-compute-node-v14:${{needs.tag.outputs.build-tag}}
- name: Push compute node v15 image to Docker Hub
run: crane push compute-node-v15 neondatabase/compute-node-v15:${{needs.tag.outputs.build-tag}}
- name: Push vm compute node v15 image to Docker Hub
run: crane push vm-compute-node-v15 neondatabase/vm-compute-node-v15:${{needs.tag.outputs.build-tag}}
- name: Push rust image to Docker Hub
run: crane push rust neondatabase/rust:pinned
- name: Add latest tag to images in Docker Hub
if: |
(github.ref_name == 'main' || github.ref_name == 'release') &&
github.event_name != 'workflow_dispatch'
run: |
crane tag neondatabase/neon:${{needs.tag.outputs.build-tag}} latest
crane tag neondatabase/compute-tools:${{needs.tag.outputs.build-tag}} latest
crane tag neondatabase/compute-node-v14:${{needs.tag.outputs.build-tag}} latest
crane tag neondatabase/vm-compute-node-v14:${{needs.tag.outputs.build-tag}} latest
crane tag neondatabase/compute-node-v15:${{needs.tag.outputs.build-tag}} latest
crane tag neondatabase/vm-compute-node-v15:${{needs.tag.outputs.build-tag}} latest
- name: Cleanup ECR folder
run: rm -rf ~/.ecr
deploy-pr-test-new:
runs-on: [ self-hosted, gen3, small ]
container: 369495373322.dkr.ecr.eu-central-1.amazonaws.com/ansible:pinned
# We need both storage **and** compute images for deploy, because control plane picks the compute version based on the storage version.
# If it notices a fresh storage it may bump the compute version. And if compute image failed to build it may break things badly
needs: [ push-docker-hub, tag, regress-tests ]
if: |
contains(github.event.pull_request.labels.*.name, 'deploy-test-storage') &&
github.event_name != 'workflow_dispatch'
defaults:
run:
shell: bash
strategy:
matrix:
target_region: [ eu-west-1 ]
steps:
- name: Checkout
uses: actions/checkout@v3
with:
submodules: true
fetch-depth: 0
- name: Redeploy
run: |
export DOCKER_TAG=${{needs.tag.outputs.build-tag}}
cd "$(pwd)/.github/ansible"
./get_binaries.sh
ansible-galaxy collection install sivel.toiletwater
ansible-playbook deploy.yaml -i staging.${{ matrix.target_region }}.hosts.yaml -e @ssm_config -e CONSOLE_API_TOKEN=${{ secrets.NEON_STAGING_API_KEY }} -e SENTRY_URL_PAGESERVER=${{ secrets.SENTRY_URL_PAGESERVER }} -e SENTRY_URL_SAFEKEEPER=${{ secrets.SENTRY_URL_SAFEKEEPER }}
rm -f neon_install.tar.gz .neon_current_version
- name: Cleanup ansible folder
run: rm -rf ~/.ansible
deploy:
runs-on: [ self-hosted, gen3, small ]
container: 369495373322.dkr.ecr.eu-central-1.amazonaws.com/ansible:latest
needs: [ push-docker-hub, tag, regress-tests ]
if: ( github.ref_name == 'main' || github.ref_name == 'release' ) && github.event_name != 'workflow_dispatch'
steps:
- name: Checkout
uses: actions/checkout@v3
with:
submodules: false
fetch-depth: 0
- name: Trigger deploy workflow
env:
GH_TOKEN: ${{ github.token }}
run: |
if [[ "$GITHUB_REF_NAME" == "main" ]]; then
gh workflow run deploy-dev.yml --ref main -f branch=${{ github.sha }} -f dockerTag=${{needs.tag.outputs.build-tag}}
elif [[ "$GITHUB_REF_NAME" == "release" ]]; then
gh workflow run deploy-prod.yml --ref release -f branch=${{ github.sha }} -f dockerTag=${{needs.tag.outputs.build-tag}} -f disclamerAcknowledged=true
else
echo "GITHUB_REF_NAME (value '$GITHUB_REF_NAME') is not set to either 'main' or 'release'"
exit 1
fi
promote-compatibility-data:
runs-on: [ self-hosted, gen3, small ]
container:
image: 369495373322.dkr.ecr.eu-central-1.amazonaws.com/rust:pinned
options: --init
needs: [ push-docker-hub, tag, regress-tests ]
if: github.ref_name == 'release' && github.event_name != 'workflow_dispatch'
steps:
- name: Promote compatibility snapshot for the release
env:
BUCKET: neon-github-public-dev
PREFIX: artifacts/latest
run: |
# Update compatibility snapshot for the release
for build_type in debug release; do
OLD_FILENAME=compatibility-snapshot-${build_type}-pg14-${GITHUB_RUN_ID}.tar.zst
NEW_FILENAME=compatibility-snapshot-${build_type}-pg14.tar.zst
time aws s3 mv --only-show-errors s3://${BUCKET}/${PREFIX}/${OLD_FILENAME} s3://${BUCKET}/${PREFIX}/${NEW_FILENAME}
done
# Update Neon artifact for the release (reuse already uploaded artifact)
for build_type in debug release; do
OLD_PREFIX=artifacts/${GITHUB_RUN_ID}
FILENAME=neon-${{ runner.os }}-${build_type}-artifact.tar.zst
S3_KEY=$(aws s3api list-objects-v2 --bucket ${BUCKET} --prefix ${OLD_PREFIX} | jq -r '.Contents[].Key' | grep ${FILENAME} | sort --version-sort | tail -1 || true)
if [ -z "${S3_KEY}" ]; then
echo 2>&1 "Neither s3://${BUCKET}/${OLD_PREFIX}/${FILENAME} nor its version from previous attempts exist"
exit 1
fi
time aws s3 cp --only-show-errors s3://${BUCKET}/${S3_KEY} s3://${BUCKET}/${PREFIX}/${FILENAME}
done

179
.github/workflows/deploy-dev.yml vendored Normal file
View File

@@ -0,0 +1,179 @@
name: Neon Deploy dev
on:
workflow_dispatch:
inputs:
dockerTag:
description: 'Docker tag to deploy'
required: true
type: string
branch:
description: 'Branch or commit used for deploy scripts and configs'
required: true
type: string
default: 'main'
deployStorage:
description: 'Deploy storage'
required: true
type: boolean
default: true
deployProxy:
description: 'Deploy proxy'
required: true
type: boolean
default: true
deployStorageBroker:
description: 'Deploy storage-broker'
required: true
type: boolean
default: true
env:
AWS_ACCESS_KEY_ID: ${{ secrets.AWS_ACCESS_KEY_DEV }}
AWS_SECRET_ACCESS_KEY: ${{ secrets.AWS_SECRET_KEY_DEV }}
concurrency:
group: deploy-dev
cancel-in-progress: false
jobs:
deploy-storage-new:
runs-on: [ self-hosted, gen3, small ]
container:
image: 369495373322.dkr.ecr.eu-central-1.amazonaws.com/ansible:pinned
options: --user root --privileged
if: inputs.deployStorage
defaults:
run:
shell: bash
strategy:
matrix:
target_region: [ eu-west-1, us-east-2 ]
environment:
name: dev-${{ matrix.target_region }}
steps:
- name: Checkout
uses: actions/checkout@v3
with:
submodules: true
fetch-depth: 0
ref: ${{ inputs.branch }}
- name: Redeploy
run: |
export DOCKER_TAG=${{ inputs.dockerTag }}
cd "$(pwd)/.github/ansible"
./get_binaries.sh
ansible-galaxy collection install sivel.toiletwater
ansible-playbook -v deploy.yaml -i staging.${{ matrix.target_region }}.hosts.yaml -e @ssm_config -e CONSOLE_API_TOKEN=${{ secrets.NEON_STAGING_API_KEY }} -e SENTRY_URL_PAGESERVER=${{ secrets.SENTRY_URL_PAGESERVER }} -e SENTRY_URL_SAFEKEEPER=${{ secrets.SENTRY_URL_SAFEKEEPER }}
rm -f neon_install.tar.gz .neon_current_version
- name: Cleanup ansible folder
run: rm -rf ~/.ansible
deploy-proxy-new:
runs-on: [ self-hosted, gen3, small ]
container: 369495373322.dkr.ecr.eu-central-1.amazonaws.com/ansible:pinned
if: inputs.deployProxy
defaults:
run:
shell: bash
strategy:
matrix:
include:
- target_region: us-east-2
target_cluster: dev-us-east-2-beta
deploy_link_proxy: true
deploy_legacy_scram_proxy: true
- target_region: eu-west-1
target_cluster: dev-eu-west-1-zeta
deploy_link_proxy: false
deploy_legacy_scram_proxy: false
environment:
name: dev-${{ matrix.target_region }}
steps:
- name: Checkout
uses: actions/checkout@v3
with:
submodules: true
fetch-depth: 0
ref: ${{ inputs.branch }}
- name: Configure AWS Credentials
uses: aws-actions/configure-aws-credentials@v1-node16
with:
role-to-assume: arn:aws:iam::369495373322:role/github-runner
aws-region: eu-central-1
role-skip-session-tagging: true
role-duration-seconds: 1800
- name: Configure environment
run: |
helm repo add neondatabase https://neondatabase.github.io/helm-charts
aws --region ${{ matrix.target_region }} eks update-kubeconfig --name ${{ matrix.target_cluster }}
- name: Re-deploy scram proxy
run: |
DOCKER_TAG=${{ inputs.dockerTag }}
helm upgrade neon-proxy-scram neondatabase/neon-proxy --namespace neon-proxy --create-namespace --install --atomic -f .github/helm-values/${{ matrix.target_cluster }}.neon-proxy-scram.yaml --set image.tag=${DOCKER_TAG} --set settings.sentryUrl=${{ secrets.SENTRY_URL_PROXY }} --wait --timeout 15m0s
- name: Re-deploy link proxy
if: matrix.deploy_link_proxy
run: |
DOCKER_TAG=${{ inputs.dockerTag }}
helm upgrade neon-proxy-link neondatabase/neon-proxy --namespace neon-proxy --create-namespace --install --atomic -f .github/helm-values/${{ matrix.target_cluster }}.neon-proxy-link.yaml --set image.tag=${DOCKER_TAG} --set settings.sentryUrl=${{ secrets.SENTRY_URL_PROXY }} --wait --timeout 15m0s
- name: Re-deploy legacy scram proxy
if: matrix.deploy_legacy_scram_proxy
run: |
DOCKER_TAG=${{ inputs.dockerTag }}
helm upgrade neon-proxy-scram-legacy neondatabase/neon-proxy --namespace neon-proxy --create-namespace --install --atomic -f .github/helm-values/${{ matrix.target_cluster }}.neon-proxy-scram-legacy.yaml --set image.tag=${DOCKER_TAG} --set settings.sentryUrl=${{ secrets.SENTRY_URL_PROXY }} --wait --timeout 15m0s
- name: Cleanup helm folder
run: rm -rf ~/.cache
deploy-storage-broker-new:
runs-on: [ self-hosted, gen3, small ]
container: 369495373322.dkr.ecr.eu-central-1.amazonaws.com/ansible:pinned
if: inputs.deployStorageBroker
defaults:
run:
shell: bash
strategy:
matrix:
include:
- target_region: us-east-2
target_cluster: dev-us-east-2-beta
- target_region: eu-west-1
target_cluster: dev-eu-west-1-zeta
environment:
name: dev-${{ matrix.target_region }}
steps:
- name: Checkout
uses: actions/checkout@v3
with:
submodules: true
fetch-depth: 0
ref: ${{ inputs.branch }}
- name: Configure AWS Credentials
uses: aws-actions/configure-aws-credentials@v1-node16
with:
role-to-assume: arn:aws:iam::369495373322:role/github-runner
aws-region: eu-central-1
role-skip-session-tagging: true
role-duration-seconds: 1800
- name: Configure environment
run: |
helm repo add neondatabase https://neondatabase.github.io/helm-charts
aws --region ${{ matrix.target_region }} eks update-kubeconfig --name ${{ matrix.target_cluster }}
- name: Deploy storage-broker
run:
helm upgrade neon-storage-broker-lb neondatabase/neon-storage-broker --namespace neon-storage-broker-lb --create-namespace --install --atomic -f .github/helm-values/${{ matrix.target_cluster }}.neon-storage-broker.yaml --set image.tag=${{ inputs.dockerTag }} --set settings.sentryUrl=${{ secrets.SENTRY_URL_BROKER }} --wait --timeout 5m0s
- name: Cleanup helm folder
run: rm -rf ~/.cache

167
.github/workflows/deploy-prod.yml vendored Normal file
View File

@@ -0,0 +1,167 @@
name: Neon Deploy prod
on:
workflow_dispatch:
inputs:
dockerTag:
description: 'Docker tag to deploy'
required: true
type: string
branch:
description: 'Branch or commit used for deploy scripts and configs'
required: true
type: string
default: 'release'
deployStorage:
description: 'Deploy storage'
required: true
type: boolean
default: true
deployProxy:
description: 'Deploy proxy'
required: true
type: boolean
default: true
deployStorageBroker:
description: 'Deploy storage-broker'
required: true
type: boolean
default: true
disclamerAcknowledged:
description: 'I confirm that there is an emergency and I can not use regular release workflow'
required: true
type: boolean
default: false
concurrency:
group: deploy-prod
cancel-in-progress: false
jobs:
deploy-prod-new:
runs-on: prod
container:
image: 093970136003.dkr.ecr.eu-central-1.amazonaws.com/ansible:latest
options: --user root --privileged
if: inputs.deployStorage && inputs.disclamerAcknowledged
defaults:
run:
shell: bash
strategy:
matrix:
target_region: [ us-east-2, us-west-2, eu-central-1, ap-southeast-1 ]
environment:
name: prod-${{ matrix.target_region }}
steps:
- name: Checkout
uses: actions/checkout@v3
with:
submodules: true
fetch-depth: 0
ref: ${{ inputs.branch }}
- name: Redeploy
run: |
export DOCKER_TAG=${{ inputs.dockerTag }}
cd "$(pwd)/.github/ansible"
./get_binaries.sh
ansible-galaxy collection install sivel.toiletwater
ansible-playbook -v deploy.yaml -i prod.${{ matrix.target_region }}.hosts.yaml -e @ssm_config -e CONSOLE_API_TOKEN=${{ secrets.NEON_PRODUCTION_API_KEY }} -e SENTRY_URL_PAGESERVER=${{ secrets.SENTRY_URL_PAGESERVER }} -e SENTRY_URL_SAFEKEEPER=${{ secrets.SENTRY_URL_SAFEKEEPER }}
rm -f neon_install.tar.gz .neon_current_version
deploy-proxy-prod-new:
runs-on: prod
container: 093970136003.dkr.ecr.eu-central-1.amazonaws.com/ansible:latest
if: inputs.deployProxy && inputs.disclamerAcknowledged
defaults:
run:
shell: bash
strategy:
matrix:
include:
- target_region: us-east-2
target_cluster: prod-us-east-2-delta
deploy_link_proxy: true
deploy_legacy_scram_proxy: false
- target_region: us-west-2
target_cluster: prod-us-west-2-eta
deploy_link_proxy: false
deploy_legacy_scram_proxy: true
- target_region: eu-central-1
target_cluster: prod-eu-central-1-gamma
deploy_link_proxy: false
deploy_legacy_scram_proxy: false
- target_region: ap-southeast-1
target_cluster: prod-ap-southeast-1-epsilon
deploy_link_proxy: false
deploy_legacy_scram_proxy: false
environment:
name: prod-${{ matrix.target_region }}
steps:
- name: Checkout
uses: actions/checkout@v3
with:
submodules: true
fetch-depth: 0
ref: ${{ inputs.branch }}
- name: Configure environment
run: |
helm repo add neondatabase https://neondatabase.github.io/helm-charts
aws --region ${{ matrix.target_region }} eks update-kubeconfig --name ${{ matrix.target_cluster }}
- name: Re-deploy scram proxy
run: |
DOCKER_TAG=${{ inputs.dockerTag }}
helm upgrade neon-proxy-scram neondatabase/neon-proxy --namespace neon-proxy --create-namespace --install --atomic -f .github/helm-values/${{ matrix.target_cluster }}.neon-proxy-scram.yaml --set image.tag=${DOCKER_TAG} --set settings.sentryUrl=${{ secrets.SENTRY_URL_PROXY }} --wait --timeout 15m0s
- name: Re-deploy link proxy
if: matrix.deploy_link_proxy
run: |
DOCKER_TAG=${{ inputs.dockerTag }}
helm upgrade neon-proxy-link neondatabase/neon-proxy --namespace neon-proxy --create-namespace --install --atomic -f .github/helm-values/${{ matrix.target_cluster }}.neon-proxy-link.yaml --set image.tag=${DOCKER_TAG} --set settings.sentryUrl=${{ secrets.SENTRY_URL_PROXY }} --wait --timeout 15m0s
- name: Re-deploy legacy scram proxy
if: matrix.deploy_legacy_scram_proxy
run: |
DOCKER_TAG=${{ inputs.dockerTag }}
helm upgrade neon-proxy-scram-legacy neondatabase/neon-proxy --namespace neon-proxy --create-namespace --install --atomic -f .github/helm-values/${{ matrix.target_cluster }}.neon-proxy-scram-legacy.yaml --set image.tag=${DOCKER_TAG} --set settings.sentryUrl=${{ secrets.SENTRY_URL_PROXY }} --wait --timeout 15m0s
deploy-storage-broker-prod-new:
runs-on: prod
container: 093970136003.dkr.ecr.eu-central-1.amazonaws.com/ansible:latest
if: inputs.deployStorageBroker && inputs.disclamerAcknowledged
defaults:
run:
shell: bash
strategy:
matrix:
include:
- target_region: us-east-2
target_cluster: prod-us-east-2-delta
- target_region: us-west-2
target_cluster: prod-us-west-2-eta
- target_region: eu-central-1
target_cluster: prod-eu-central-1-gamma
- target_region: ap-southeast-1
target_cluster: prod-ap-southeast-1-epsilon
environment:
name: prod-${{ matrix.target_region }}
steps:
- name: Checkout
uses: actions/checkout@v3
with:
submodules: true
fetch-depth: 0
ref: ${{ inputs.branch }}
- name: Configure environment
run: |
helm repo add neondatabase https://neondatabase.github.io/helm-charts
aws --region ${{ matrix.target_region }} eks update-kubeconfig --name ${{ matrix.target_cluster }}
- name: Deploy storage-broker
run:
helm upgrade neon-storage-broker-lb neondatabase/neon-storage-broker --namespace neon-storage-broker-lb --create-namespace --install --atomic -f .github/helm-values/${{ matrix.target_cluster }}.neon-storage-broker.yaml --set image.tag=${{ inputs.dockerTag }} --set settings.sentryUrl=${{ secrets.SENTRY_URL_BROKER }} --wait --timeout 5m0s

154
.github/workflows/neon_extra_builds.yml vendored Normal file
View File

@@ -0,0 +1,154 @@
name: Check neon with extra platform builds
on:
push:
branches:
- main
pull_request:
defaults:
run:
shell: bash -euxo pipefail {0}
concurrency:
# Allow only one workflow per any non-`main` branch.
group: ${{ github.workflow }}-${{ github.ref }}-${{ github.ref == 'refs/heads/main' && github.sha || 'anysha' }}
cancel-in-progress: true
env:
RUST_BACKTRACE: 1
COPT: '-Werror'
jobs:
check-macos-build:
if: github.ref_name == 'main' || contains(github.event.pull_request.labels.*.name, 'run-extra-build-macos')
timeout-minutes: 90
runs-on: macos-latest
env:
# Use release build only, to have less debug info around
# Hence keeping target/ (and general cache size) smaller
BUILD_TYPE: release
steps:
- name: Checkout
uses: actions/checkout@v3
with:
submodules: true
fetch-depth: 1
- name: Install macOS postgres dependencies
run: brew install flex bison openssl protobuf
- name: Set pg 14 revision for caching
id: pg_v14_rev
run: echo pg_rev=$(git rev-parse HEAD:vendor/postgres-v14) >> $GITHUB_OUTPUT
- name: Set pg 15 revision for caching
id: pg_v15_rev
run: echo pg_rev=$(git rev-parse HEAD:vendor/postgres-v15) >> $GITHUB_OUTPUT
- name: Cache postgres v14 build
id: cache_pg_14
uses: actions/cache@v3
with:
path: pg_install/v14
key: v1-${{ runner.os }}-${{ matrix.build_type }}-pg-${{ steps.pg_v14_rev.outputs.pg_rev }}-${{ hashFiles('Makefile') }}
- name: Cache postgres v15 build
id: cache_pg_15
uses: actions/cache@v3
with:
path: pg_install/v15
key: v1-${{ runner.os }}-${{ matrix.build_type }}-pg-${{ steps.pg_v15_rev.outputs.pg_rev }}-${{ hashFiles('Makefile') }}
- name: Set extra env for macOS
run: |
echo 'LDFLAGS=-L/usr/local/opt/openssl@3/lib' >> $GITHUB_ENV
echo 'CPPFLAGS=-I/usr/local/opt/openssl@3/include' >> $GITHUB_ENV
- name: Cache cargo deps
uses: actions/cache@v3
with:
path: |
~/.cargo/registry
!~/.cargo/registry/src
~/.cargo/git
target
key: v1-${{ runner.os }}-cargo-${{ hashFiles('./Cargo.lock') }}-${{ hashFiles('./rust-toolchain.toml') }}-rust
- name: Build postgres v14
if: steps.cache_pg_14.outputs.cache-hit != 'true'
run: make postgres-v14 -j$(nproc)
- name: Build postgres v15
if: steps.cache_pg_15.outputs.cache-hit != 'true'
run: make postgres-v15 -j$(nproc)
- name: Build neon extensions
run: make neon-pg-ext -j$(nproc)
- name: Run cargo build
run: cargo build --all --release
- name: Check that no warnings are produced
run: ./run_clippy.sh
gather-rust-build-stats:
if: github.ref_name == 'main' || contains(github.event.pull_request.labels.*.name, 'run-extra-build-stats')
runs-on: [ self-hosted, gen3, large ]
container:
image: 369495373322.dkr.ecr.eu-central-1.amazonaws.com/rust:pinned
options: --init
env:
BUILD_TYPE: release
# remove the cachepot wrapper and build without crate caches
RUSTC_WRAPPER: ""
# build with incremental compilation produce partial results
# so do not attempt to cache this build, also disable the incremental compilation
CARGO_INCREMENTAL: 0
steps:
- name: Checkout
uses: actions/checkout@v3
with:
submodules: true
fetch-depth: 1
# Some of our rust modules use FFI and need those to be checked
- name: Get postgres headers
run: make postgres-headers -j$(nproc)
- name: Produce the build stats
run: cargo build --all --release --timings
- name: Upload the build stats
id: upload-stats
env:
BUCKET: neon-github-public-dev
SHA: ${{ github.event.pull_request.head.sha || github.sha }}
AWS_ACCESS_KEY_ID: ${{ secrets.AWS_ACCESS_KEY_DEV }}
AWS_SECRET_ACCESS_KEY: ${{ secrets.AWS_SECRET_KEY_DEV }}
run: |
REPORT_URL=https://${BUCKET}.s3.amazonaws.com/build-stats/${SHA}/${GITHUB_RUN_ID}/cargo-timing.html
aws s3 cp --only-show-errors ./target/cargo-timings/cargo-timing.html "s3://${BUCKET}/build-stats/${SHA}/${GITHUB_RUN_ID}/"
echo "report-url=${REPORT_URL}" >> $GITHUB_OUTPUT
- name: Publish build stats report
uses: actions/github-script@v6
env:
REPORT_URL: ${{ steps.upload-stats.outputs.report-url }}
SHA: ${{ github.event.pull_request.head.sha || github.sha }}
with:
script: |
const { REPORT_URL, SHA } = process.env
await github.rest.repos.createCommitStatus({
owner: context.repo.owner,
repo: context.repo.repo,
sha: `${SHA}`,
state: 'success',
target_url: `${REPORT_URL}`,
context: `Build stats (release)`,
})

View File

@@ -1,45 +0,0 @@
name: Send Notifications
on:
push:
branches: [ main ]
jobs:
send-notifications:
timeout-minutes: 30
name: send commit notifications
runs-on: ubuntu-latest
steps:
- name: Checkout
uses: actions/checkout@v2
with:
submodules: true
fetch-depth: 2
- name: Form variables for notification message
id: git_info_grab
run: |
git_stat=$(git show --stat=50)
git_stat="${git_stat//'%'/'%25'}"
git_stat="${git_stat//$'\n'/'%0A'}"
git_stat="${git_stat//$'\r'/'%0D'}"
git_stat="${git_stat// /}" # space -> 'Space En', as github tends to eat ordinary spaces
echo "::set-output name=git_stat::$git_stat"
echo "::set-output name=sha_short::$(git rev-parse --short HEAD)"
echo "##[set-output name=git_branch;]$(echo ${GITHUB_REF#refs/heads/})"
- name: Send notification
uses: appleboy/telegram-action@master
with:
to: ${{ secrets.TELEGRAM_TO }}
token: ${{ secrets.TELEGRAM_TOKEN }}
format: markdown
args: |
*@${{ github.actor }} pushed to* [${{ github.repository }}:${{steps.git_info_grab.outputs.git_branch}}](github.com/${{ github.repository }}/commit/${{steps.git_info_grab.outputs.sha_short }})
```
${{ steps.git_info_grab.outputs.git_stat }}
```

99
.github/workflows/pg_clients.yml vendored Normal file
View File

@@ -0,0 +1,99 @@
name: Test Postgres client libraries
on:
schedule:
# * is a special character in YAML so you have to quote this string
# ┌───────────── minute (0 - 59)
# │ ┌───────────── hour (0 - 23)
# │ │ ┌───────────── day of the month (1 - 31)
# │ │ │ ┌───────────── month (1 - 12 or JAN-DEC)
# │ │ │ │ ┌───────────── day of the week (0 - 6 or SUN-SAT)
- cron: '23 02 * * *' # run once a day, timezone is utc
workflow_dispatch:
concurrency:
# Allow only one workflow per any non-`main` branch.
group: ${{ github.workflow }}-${{ github.ref }}-${{ github.ref == 'refs/heads/main' && github.sha || 'anysha' }}
cancel-in-progress: true
jobs:
test-postgres-client-libs:
# TODO: switch to gen2 runner, requires docker
runs-on: [ ubuntu-latest ]
env:
DEFAULT_PG_VERSION: 14
TEST_OUTPUT: /tmp/test_output
steps:
- name: Checkout
uses: actions/checkout@v3
- uses: actions/setup-python@v4
with:
python-version: 3.9
- name: Install Poetry
uses: snok/install-poetry@v1
- name: Cache poetry deps
id: cache_poetry
uses: actions/cache@v3
with:
path: ~/.cache/pypoetry/virtualenvs
key: v1-${{ runner.os }}-python-deps-${{ hashFiles('poetry.lock') }}
- name: Install Python deps
shell: bash -euxo pipefail {0}
run: ./scripts/pysync
- name: Create Neon Project
id: create-neon-project
uses: ./.github/actions/neon-project-create
with:
api_key: ${{ secrets.NEON_STAGING_API_KEY }}
postgres_version: ${{ env.DEFAULT_PG_VERSION }}
- name: Run pytest
env:
REMOTE_ENV: 1
BENCHMARK_CONNSTR: ${{ steps.create-neon-project.outputs.dsn }}
POSTGRES_DISTRIB_DIR: /tmp/neon/pg_install
shell: bash -euxo pipefail {0}
run: |
# Test framework expects we have psql binary;
# but since we don't really need it in this test, let's mock it
mkdir -p "$POSTGRES_DISTRIB_DIR/v${DEFAULT_PG_VERSION}/bin" && touch "$POSTGRES_DISTRIB_DIR/v${DEFAULT_PG_VERSION}/bin/psql";
./scripts/pytest \
--junitxml=$TEST_OUTPUT/junit.xml \
--tb=short \
--verbose \
-m "remote_cluster" \
-rA "test_runner/pg_clients"
- name: Delete Neon Project
if: ${{ always() }}
uses: ./.github/actions/neon-project-delete
with:
project_id: ${{ steps.create-neon-project.outputs.project_id }}
api_key: ${{ secrets.NEON_STAGING_API_KEY }}
# We use GitHub's action upload-artifact because `ubuntu-latest` doesn't have configured AWS CLI.
# It will be fixed after switching to gen2 runner
- name: Upload python test logs
if: always()
uses: actions/upload-artifact@v3
with:
retention-days: 7
name: python-test-pg_clients-${{ runner.os }}-stage-logs
path: ${{ env.TEST_OUTPUT }}
- name: Post to a Slack channel
if: ${{ github.event.schedule && failure() }}
uses: slackapi/slack-github-action@v1
with:
channel-id: "C033QLM5P7D" # dev-staging-stream
slack-message: "Testing Postgres clients: ${{ job.status }}\n${{ github.server_url }}/${{ github.repository }}/actions/runs/${{ github.run_id }}"
env:
SLACK_BOT_TOKEN: ${{ secrets.SLACK_BOT_TOKEN }}

34
.github/workflows/release.yml vendored Normal file
View File

@@ -0,0 +1,34 @@
name: Create Release Branch
on:
schedule:
- cron: '0 10 * * 2'
jobs:
create_release_branch:
runs-on: [ubuntu-latest]
steps:
- name: Check out code
uses: actions/checkout@v3
with:
ref: main
- name: Get current date
id: date
run: echo "date=$(date +'%Y-%m-%d')" >> $GITHUB_OUTPUT
- name: Create release branch
run: git checkout -b releases/${{ steps.date.outputs.date }}
- name: Push new branch
run: git push origin releases/${{ steps.date.outputs.date }}
- name: Create pull request into release
uses: thomaseizinger/create-pull-request@e3972219c86a56550fb70708d96800d8e24ba862 # 1.3.0
with:
GITHUB_TOKEN: ${{ secrets.GITHUB_TOKEN }}
head: releases/${{ steps.date.outputs.date }}
base: release
title: Release ${{ steps.date.outputs.date }}
team_reviewers: release

View File

@@ -1,73 +0,0 @@
name: Build and Test
on:
push:
branches: [ main ]
pull_request:
branches: [ main ]
jobs:
regression-check:
strategy:
matrix:
# If we want to duplicate this job for different
# Rust toolchains (e.g. nightly or 1.37.0), add them here.
rust_toolchain: [stable]
os: [ubuntu-latest]
timeout-minutes: 30
name: run regression test suite
runs-on: ${{ matrix.os }}
steps:
- name: Checkout
uses: actions/checkout@v2
with:
submodules: true
fetch-depth: 2
- name: install rust toolchain ${{ matrix.rust_toolchain }}
uses: actions-rs/toolchain@v1
with:
profile: minimal
toolchain: ${{ matrix.rust_toolchain }}
override: true
- name: Install postgres dependencies
run: |
sudo apt update
sudo apt install build-essential libreadline-dev zlib1g-dev flex bison libseccomp-dev
- name: Set pg revision for caching
id: pg_ver
run: echo ::set-output name=pg_rev::$(git rev-parse HEAD:vendor/postgres)
- name: Cache postgres build
id: cache_pg
uses: actions/cache@v2
with:
path: |
tmp_install/
key: ${{ runner.os }}-pg-${{ steps.pg_ver.outputs.pg_rev }}
- name: Build postgres
if: steps.cache_pg.outputs.cache-hit != 'true'
run: |
make postgres
- name: Cache cargo deps
id: cache_cargo
uses: actions/cache@v2
with:
path: |
~/.cargo/registry
~/.cargo/git
target
key: ${{ runner.os }}-cargo-${{ hashFiles('**/Cargo.lock') }}
- name: Run cargo build
run: |
cargo build --workspace --bins --examples --tests
- name: Run cargo test
run: |
cargo test -- --nocapture --test-threads=1

17
.gitignore vendored
View File

@@ -1,9 +1,20 @@
/pg_install
/target
/tmp_check
/tmp_install
/tmp_check_cli
__pycache__/
test_output/
.vscode
/.zenith
/integration_tests/.zenith
.idea
/.neon
/integration_tests/.neon
# Coverage
*.profraw
*.profdata
*.key
*.crt
*.o
*.so
*.Po

12
.gitmodules vendored
View File

@@ -1,4 +1,8 @@
[submodule "vendor/postgres"]
path = vendor/postgres
url = https://github.com/zenithdb/postgres
branch = main
[submodule "vendor/postgres-v14"]
path = vendor/postgres-v14
url = https://github.com/neondatabase/postgres.git
branch = REL_14_STABLE_neon
[submodule "vendor/postgres-v15"]
path = vendor/postgres-v15
url = https://github.com/neondatabase/postgres.git
branch = REL_15_STABLE_neon

11
CODEOWNERS Normal file
View File

@@ -0,0 +1,11 @@
/compute_tools/ @neondatabase/control-plane
/control_plane/ @neondatabase/compute @neondatabase/storage
/libs/pageserver_api/ @neondatabase/compute @neondatabase/storage
/libs/postgres_ffi/ @neondatabase/compute
/libs/remote_storage/ @neondatabase/storage
/libs/safekeeper_api/ @neondatabase/safekeepers
/pageserver/ @neondatabase/compute @neondatabase/storage
/pgxn/ @neondatabase/compute
/proxy/ @neondatabase/control-plane
/safekeeper/ @neondatabase/safekeepers
/vendor/ @neondatabase/compute

View File

@@ -11,17 +11,15 @@ than it was before.
## Submitting changes
1. Make a PR for every change.
Even seemingly trivial patches can break things in surprising ways.
Use of common sense is OK. If you're only fixing a typo in a comment,
it's probably fine to just push it. But if in doubt, open a PR.
2. Get at least one +1 on your PR before you push.
1. Get at least one +1 on your PR before you push.
For simple patches, it will only take a minute for someone to review
it.
2. Don't force push small changes after making the PR ready for review.
Doing so will force readers to re-read your entire PR, which will delay
the review process.
3. Always keep the CI green.
Do not push, if the CI failed on your PR. Even if you think it's not

View File

@@ -1,20 +0,0 @@
This software is licensed under the Apache 2.0 License:
----------------------------------------------------------------------------
Copyright 2021 Zenith Labs, Inc
Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.
----------------------------------------------------------------------------
The PostgreSQL submodule in vendor/postgres is licensed under the
PostgreSQL license. See vendor/postgres/COPYRIGHT.

4179
Cargo.lock generated

File diff suppressed because it is too large Load Diff

View File

@@ -1,17 +1,224 @@
[workspace]
members = [
"compute_tools",
"control_plane",
"pageserver",
"postgres_ffi",
"proxy",
"walkeeper",
"safekeeper",
"storage_broker",
"workspace_hack",
"zenith",
"zenith_metrics",
"zenith_utils",
"trace",
"libs/*",
]
[workspace.package]
edition = "2021"
license = "Apache-2.0"
## All dependency versions, used in the project
[workspace.dependencies]
anyhow = { version = "1.0", features = ["backtrace"] }
async-stream = "0.3"
async-trait = "0.1"
atty = "0.2.14"
aws-config = { version = "0.51.0", default-features = false, features=["rustls"] }
aws-sdk-s3 = "0.21.0"
aws-smithy-http = "0.51.0"
aws-types = "0.51.0"
base64 = "0.13.0"
bincode = "1.3"
bindgen = "0.61"
bstr = "1.0"
byteorder = "1.4"
bytes = "1.0"
chrono = { version = "0.4", default-features = false, features = ["clock"] }
clap = { version = "4.0", features = ["derive"] }
close_fds = "0.3.2"
comfy-table = "6.1"
const_format = "0.2"
crc32c = "0.6"
crossbeam-utils = "0.8.5"
either = "1.8"
enum-map = "2.4.2"
enumset = "1.0.12"
fail = "0.5.0"
fs2 = "0.4.3"
futures = "0.3"
futures-core = "0.3"
futures-util = "0.3"
git-version = "0.3"
hashbrown = "0.13"
hashlink = "0.8.1"
hex = "0.4"
hex-literal = "0.3"
hmac = "0.12.1"
hostname = "0.3.1"
humantime = "2.1"
humantime-serde = "1.1.1"
hyper = "0.14"
hyper-tungstenite = "0.9"
itertools = "0.10"
jsonwebtoken = "8"
libc = "0.2"
md5 = "0.7.0"
memoffset = "0.8"
nix = "0.26"
notify = "5.0.0"
num_cpus = "1.15"
num-traits = "0.2.15"
once_cell = "1.13"
opentelemetry = "0.18.0"
opentelemetry-otlp = { version = "0.11.0", default_features=false, features = ["http-proto", "trace", "http", "reqwest-client"] }
opentelemetry-semantic-conventions = "0.10.0"
parking_lot = "0.12"
pin-project-lite = "0.2"
prometheus = {version = "0.13", default_features=false, features = ["process"]} # removes protobuf dependency
prost = "0.11"
rand = "0.8"
regex = "1.4"
reqwest = { version = "0.11", default-features = false, features = ["rustls-tls"] }
reqwest-tracing = { version = "0.4.0", features = ["opentelemetry_0_18"] }
reqwest-middleware = "0.2.0"
routerify = "3"
rpds = "0.12.0"
rustls = "0.20"
rustls-pemfile = "1"
rustls-split = "0.3"
scopeguard = "1.1"
sentry = { version = "0.29", default-features = false, features = ["backtrace", "contexts", "panic", "rustls", "reqwest" ] }
serde = { version = "1.0", features = ["derive"] }
serde_json = "1"
serde_with = "2.0"
sha2 = "0.10.2"
signal-hook = "0.3"
socket2 = "0.4.4"
strum = "0.24"
strum_macros = "0.24"
svg_fmt = "0.4.1"
sync_wrapper = "0.1.2"
tar = "0.4"
thiserror = "1.0"
tls-listener = { version = "0.6", features = ["rustls", "hyper-h1"] }
tokio = { version = "1.17", features = ["macros"] }
tokio-postgres-rustls = "0.9.0"
tokio-rustls = "0.23"
tokio-stream = "0.1"
tokio-util = { version = "0.7", features = ["io"] }
toml = "0.5"
toml_edit = { version = "0.17", features = ["easy"] }
tonic = {version = "0.8", features = ["tls", "tls-roots"]}
tracing = "0.1"
tracing-opentelemetry = "0.18.0"
tracing-subscriber = { version = "0.3", features = ["env-filter"] }
url = "2.2"
uuid = { version = "1.2", features = ["v4", "serde"] }
walkdir = "2.3.2"
webpki-roots = "0.22.5"
x509-parser = "0.14"
## TODO replace this with tracing
env_logger = "0.10"
log = "0.4"
## Libraries from neondatabase/ git forks, ideally with changes to be upstreamed
postgres = { git = "https://github.com/neondatabase/rust-postgres.git", rev="43e6db254a97fdecbce33d8bc0890accfd74495e" }
postgres-protocol = { git = "https://github.com/neondatabase/rust-postgres.git", rev="43e6db254a97fdecbce33d8bc0890accfd74495e" }
postgres-types = { git = "https://github.com/neondatabase/rust-postgres.git", rev="43e6db254a97fdecbce33d8bc0890accfd74495e" }
tokio-postgres = { git = "https://github.com/neondatabase/rust-postgres.git", rev="43e6db254a97fdecbce33d8bc0890accfd74495e" }
tokio-tar = { git = "https://github.com/neondatabase/tokio-tar.git", rev="404df61437de0feef49ba2ccdbdd94eb8ad6e142" }
## Other git libraries
heapless = { default-features=false, features=[], git = "https://github.com/japaric/heapless.git", rev = "644653bf3b831c6bb4963be2de24804acf5e5001" } # upstream release pending
## Local libraries
consumption_metrics = { version = "0.1", path = "./libs/consumption_metrics/" }
metrics = { version = "0.1", path = "./libs/metrics/" }
pageserver_api = { version = "0.1", path = "./libs/pageserver_api/" }
postgres_backend = { version = "0.1", path = "./libs/postgres_backend/" }
postgres_connection = { version = "0.1", path = "./libs/postgres_connection/" }
postgres_ffi = { version = "0.1", path = "./libs/postgres_ffi/" }
pq_proto = { version = "0.1", path = "./libs/pq_proto/" }
remote_storage = { version = "0.1", path = "./libs/remote_storage/" }
safekeeper_api = { version = "0.1", path = "./libs/safekeeper_api" }
storage_broker = { version = "0.1", path = "./storage_broker/" } # Note: main broker code is inside the binary crate, so linking with the library shouldn't be heavy.
tenant_size_model = { version = "0.1", path = "./libs/tenant_size_model/" }
tracing-utils = { version = "0.1", path = "./libs/tracing-utils/" }
utils = { version = "0.1", path = "./libs/utils/" }
## Common library dependency
workspace_hack = { version = "0.1", path = "./workspace_hack/" }
## Build dependencies
criterion = "0.4"
rcgen = "0.10"
rstest = "0.16"
tempfile = "3.4"
tonic-build = "0.8"
# This is only needed for proxy's tests.
# TODO: we should probably fork `tokio-postgres-rustls` instead.
[patch.crates-io]
tokio-postgres = { git = "https://github.com/neondatabase/rust-postgres.git", rev="43e6db254a97fdecbce33d8bc0890accfd74495e" }
################# Binary contents sections
[profile.release]
# This is useful for profiling and, to some extent, debug.
# Besides, debug info should not affect the performance.
debug = true
# disable debug symbols for all packages except this one to decrease binaries size
[profile.release.package."*"]
debug = false
[profile.release-line-debug]
inherits = "release"
debug = 1 # true = 2 = all symbols, 1 = line only
[profile.release-line-debug-lto]
inherits = "release"
debug = 1 # true = 2 = all symbols, 1 = line only
lto = true
[profile.release-line-debug-size]
inherits = "release"
debug = 1 # true = 2 = all symbols, 1 = line only
opt-level = "s"
[profile.release-line-debug-zize]
inherits = "release"
debug = 1 # true = 2 = all symbols, 1 = line only
opt-level = "z"
[profile.release-line-debug-size-lto]
inherits = "release"
debug = 1 # true = 2 = all symbols, 1 = line only
opt-level = "s"
lto = true
[profile.release-line-debug-zize-lto]
inherits = "release"
debug = 1 # true = 2 = all symbols, 1 = line only
opt-level = "z"
lto = true
[profile.release-no-debug]
inherits = "release"
debug = false # true = 2 = all symbols, 1 = line only
[profile.release-no-debug-size]
inherits = "release"
debug = false # true = 2 = all symbols, 1 = line only
opt-level = "s"
[profile.release-no-debug-zize]
inherits = "release"
debug = false # true = 2 = all symbols, 1 = line only
opt-level = "z"
[profile.release-no-debug-size-lto]
inherits = "release"
debug = false # true = 2 = all symbols, 1 = line only
opt-level = "s"
lto = true
[profile.release-no-debug-zize-lto]
inherits = "release"
debug = false # true = 2 = all symbols, 1 = line only
opt-level = "z"
lto = true

View File

@@ -1,60 +1,90 @@
#
# Docker image for console integration testing.
#
### Creates a storage Docker image with postgres, pageserver, safekeeper and proxy binaries.
### The image itself is mainly used as a container for the binaries and for starting e2e tests with custom parameters.
### By default, the binaries inside the image have some mock parameters and can start, but are not intended to be used
### inside this image in the real deployments.
ARG REPOSITORY=369495373322.dkr.ecr.eu-central-1.amazonaws.com
ARG IMAGE=rust
ARG TAG=pinned
# Build Postgres
FROM $REPOSITORY/$IMAGE:$TAG AS pg-build
WORKDIR /home/nonroot
COPY --chown=nonroot vendor/postgres-v14 vendor/postgres-v14
COPY --chown=nonroot vendor/postgres-v15 vendor/postgres-v15
COPY --chown=nonroot pgxn pgxn
COPY --chown=nonroot Makefile Makefile
COPY --chown=nonroot scripts/ninstall.sh scripts/ninstall.sh
#
# Build Postgres separately --- this layer will be rebuilt only if one of
# mentioned paths will get any changes.
#
FROM zenithdb/build:buster AS pg-build
WORKDIR /zenith
COPY ./vendor/postgres vendor/postgres
COPY ./Makefile Makefile
ENV BUILD_TYPE release
RUN make -j $(getconf _NPROCESSORS_ONLN) -s postgres
RUN rm -rf postgres_install/build
RUN set -e \
&& mold -run make -j $(nproc) -s neon-pg-ext \
&& rm -rf pg_install/build \
&& tar -C pg_install -czf /home/nonroot/postgres_install.tar.gz .
#
# Build zenith binaries
#
# TODO: build cargo deps as separate layer. We used cargo-chef before but that was
# net time waste in a lot of cases. Copying Cargo.lock with empty lib.rs should do the work.
#
FROM zenithdb/build:buster AS build
WORKDIR /zenith
COPY --from=pg-build /zenith/tmp_install/include/postgresql/server tmp_install/include/postgresql/server
# Build neon binaries
FROM $REPOSITORY/$IMAGE:$TAG AS build
WORKDIR /home/nonroot
ARG GIT_VERSION=local
COPY . .
RUN cargo build --release
# Enable https://github.com/paritytech/cachepot to cache Rust crates' compilation results in Docker builds.
# Set up cachepot to use an AWS S3 bucket for cache results, to reuse it between `docker build` invocations.
# cachepot falls back to local filesystem if S3 is misconfigured, not failing the build
ARG RUSTC_WRAPPER=cachepot
ENV AWS_REGION=eu-central-1
ENV CACHEPOT_S3_KEY_PREFIX=cachepot
ARG CACHEPOT_BUCKET=neon-github-dev
#ARG AWS_ACCESS_KEY_ID
#ARG AWS_SECRET_ACCESS_KEY
COPY --from=pg-build /home/nonroot/pg_install/v14/include/postgresql/server pg_install/v14/include/postgresql/server
COPY --from=pg-build /home/nonroot/pg_install/v15/include/postgresql/server pg_install/v15/include/postgresql/server
COPY --chown=nonroot . .
# Show build caching stats to check if it was used in the end.
# Has to be the part of the same RUN since cachepot daemon is killed in the end of this RUN, losing the compilation stats.
RUN set -e \
&& mold -run cargo build --bin pageserver --bin pageserver_binutils --bin draw_timeline_dir --bin safekeeper --bin storage_broker --bin proxy --locked --release \
&& cachepot -s
# Build final image
#
# Copy binaries to resulting image.
#
FROM debian:buster-slim
FROM debian:bullseye-slim
WORKDIR /data
RUN apt-get update && apt-get -yq install libreadline-dev libseccomp-dev openssl ca-certificates && \
mkdir zenith_install
RUN set -e \
&& apt update \
&& apt install -y \
libreadline-dev \
libseccomp-dev \
openssl \
ca-certificates \
&& rm -rf /var/lib/apt/lists/* /tmp/* /var/tmp/* \
&& useradd -d /data neon \
&& chown -R neon:neon /data
COPY --from=build /zenith/target/release/pageserver /usr/local/bin
COPY --from=build /zenith/target/release/safekeeper /usr/local/bin
COPY --from=build /zenith/target/release/proxy /usr/local/bin
COPY --from=pg-build /zenith/tmp_install postgres_install
COPY docker-entrypoint.sh /docker-entrypoint.sh
COPY --from=build --chown=neon:neon /home/nonroot/target/release/pageserver /usr/local/bin
COPY --from=build --chown=neon:neon /home/nonroot/target/release/pageserver_binutils /usr/local/bin
COPY --from=build --chown=neon:neon /home/nonroot/target/release/draw_timeline_dir /usr/local/bin
COPY --from=build --chown=neon:neon /home/nonroot/target/release/safekeeper /usr/local/bin
COPY --from=build --chown=neon:neon /home/nonroot/target/release/storage_broker /usr/local/bin
COPY --from=build --chown=neon:neon /home/nonroot/target/release/proxy /usr/local/bin
# Remove build artifacts (~ 500 MB)
RUN rm -rf postgres_install/build && \
# 'Install' Postgres binaries locally
cp -r postgres_install/* /usr/local/ && \
# Prepare an archive of Postgres binaries (should be around 11 MB)
# and keep it inside container for an ease of deploy pipeline.
cd postgres_install && tar -czf /data/postgres_install.tar.gz . && cd .. && \
rm -rf postgres_install
COPY --from=pg-build /home/nonroot/pg_install/v14 /usr/local/v14/
COPY --from=pg-build /home/nonroot/pg_install/v15 /usr/local/v15/
COPY --from=pg-build /home/nonroot/postgres_install.tar.gz /data/
RUN useradd -d /data zenith && chown -R zenith:zenith /data
# By default, pageserver uses `.neon/` working directory in WORKDIR, so create one and fill it with the dummy config.
# Now, when `docker run ... pageserver` is run, it can start without errors, yet will have some default dummy values.
RUN mkdir -p /data/.neon/ && chown -R neon:neon /data/.neon/ \
&& /usr/local/bin/pageserver -D /data/.neon/ --init \
-c "id=1234" \
-c "broker_endpoint='http://storage_broker:50051'" \
-c "pg_distrib_dir='/usr/local/'" \
-c "listen_pg_addr='0.0.0.0:6400'" \
-c "listen_http_addr='0.0.0.0:9898'"
VOLUME ["/data"]
USER zenith
USER neon
EXPOSE 6400
ENTRYPOINT ["/docker-entrypoint.sh"]
CMD ["pageserver"]
EXPOSE 9898

View File

@@ -1,95 +0,0 @@
#
# Docker image for console integration testing.
#
# We may also reuse it in CI to unify installation process and as a general binaries building
# tool for production servers.
#
# Dynamic linking is used for librocksdb and libstdc++ bacause librocksdb-sys calls
# bindgen with "dynamic" feature flag. This also prevents usage of dockerhub alpine-rust
# images which are statically linked and have guards against any dlopen. I would rather
# prefer all static binaries so we may change the way librocksdb-sys builds or wait until
# we will have our own storage and drop rockdb dependency.
#
# Cargo-chef is used to separate dependencies building from main binaries building. This
# way `docker build` will download and install dependencies only of there are changes to
# out Cargo.toml files.
#
#
# build postgres separately -- this layer will be rebuilt only if one of
# mentioned paths will get any changes
#
FROM alpine:3.13 as pg-build
RUN apk add --update clang llvm compiler-rt compiler-rt-static lld musl-dev binutils \
make bison flex readline-dev zlib-dev perl linux-headers libseccomp-dev
WORKDIR zenith
COPY ./vendor/postgres vendor/postgres
COPY ./Makefile Makefile
# Build using clang and lld
RUN CC='clang' LD='lld' CFLAGS='-fuse-ld=lld --rtlib=compiler-rt' make postgres -j4
#
# Calculate cargo dependencies.
# This will always run, but only generate recipe.json with list of dependencies without
# installing them.
#
FROM alpine:20210212 as cargo-deps-inspect
RUN apk add --update rust cargo
RUN cargo install cargo-chef
WORKDIR zenith
COPY . .
RUN cargo chef prepare --recipe-path recipe.json
#
# Build cargo dependencies.
# This temp cantainner would be build only if recipe.json was changed.
#
FROM alpine:20210212 as deps-build
RUN apk add --update rust cargo openssl-dev clang build-base
# rust-rocksdb can be built against system-wide rocksdb -- that saves about
# 10 minutes during build. Rocksdb apk package is in testing now, but use it
# anyway. In case of any troubles we can download and build rocksdb here manually
# (to cache it as a docker layer).
RUN apk --no-cache --update --repository https://dl-cdn.alpinelinux.org/alpine/edge/testing add rocksdb-dev
WORKDIR zenith
COPY --from=pg-build /zenith/tmp_install/include/postgresql/server tmp_install/include/postgresql/server
COPY --from=cargo-deps-inspect /root/.cargo/bin/cargo-chef /root/.cargo/bin/
COPY --from=cargo-deps-inspect /zenith/recipe.json recipe.json
RUN ROCKSDB_LIB_DIR=/usr/lib/ cargo chef cook --release --recipe-path recipe.json
#
# Build zenith binaries
#
FROM alpine:20210212 as build
RUN apk add --update rust cargo openssl-dev clang build-base
RUN apk --no-cache --update --repository https://dl-cdn.alpinelinux.org/alpine/edge/testing add rocksdb-dev
WORKDIR zenith
COPY . .
# Copy cached dependencies
COPY --from=pg-build /zenith/tmp_install/include/postgresql/server tmp_install/include/postgresql/server
COPY --from=deps-build /zenith/target target
COPY --from=deps-build /root/.cargo /root/.cargo
RUN cargo build --release
#
# Copy binaries to resulting image.
# build-base hare to provide libstdc++ (it will also bring gcc, but leave it this way until we figure
# out how to statically link rocksdb or avoid it at all).
#
FROM alpine:3.13
RUN apk add --update openssl build-base libseccomp-dev
RUN apk --no-cache --update --repository https://dl-cdn.alpinelinux.org/alpine/edge/testing add rocksdb
COPY --from=build /zenith/target/release/pageserver /usr/local/bin
COPY --from=build /zenith/target/release/safekeeper /usr/local/bin
COPY --from=build /zenith/target/release/proxy /usr/local/bin
COPY --from=pg-build /zenith/tmp_install /usr/local
COPY docker-entrypoint.sh /docker-entrypoint.sh
RUN addgroup zenith && adduser -h /data -D -G zenith zenith
VOLUME ["/data"]
WORKDIR /data
USER zenith
EXPOSE 6400
ENTRYPOINT ["/docker-entrypoint.sh"]
CMD ["pageserver"]

View File

@@ -1,15 +0,0 @@
#
# Image with all the required dependencies to build https://github.com/zenithdb/zenith
# and Postgres from https://github.com/zenithdb/postgres
# Also includes some rust development and build tools.
#
FROM rust:slim-buster
WORKDIR /zenith
# Install postgres and zenith build dependencies
# clang is for rocksdb
RUN apt-get update && apt-get -yq install automake libtool build-essential bison flex libreadline-dev zlib1g-dev libxml2-dev \
libseccomp-dev pkg-config libssl-dev clang
# Install rust tools
RUN rustup component add clippy && cargo install cargo-audit

493
Dockerfile.compute-node Normal file
View File

@@ -0,0 +1,493 @@
ARG PG_VERSION
ARG REPOSITORY=369495373322.dkr.ecr.eu-central-1.amazonaws.com
ARG IMAGE=rust
ARG TAG=pinned
#########################################################################################
#
# Layer "build-deps"
#
#########################################################################################
FROM debian:bullseye-slim AS build-deps
RUN apt update && \
apt install -y git autoconf automake libtool build-essential bison flex libreadline-dev \
zlib1g-dev libxml2-dev libcurl4-openssl-dev libossp-uuid-dev wget pkg-config libssl-dev \
libicu-dev libxslt1-dev
#########################################################################################
#
# Layer "pg-build"
# Build Postgres from the neon postgres repository.
#
#########################################################################################
FROM build-deps AS pg-build
ARG PG_VERSION
COPY vendor/postgres-${PG_VERSION} postgres
RUN cd postgres && \
./configure CFLAGS='-O2 -g3' --enable-debug --with-openssl --with-uuid=ossp --with-icu \
--with-libxml --with-libxslt && \
make MAKELEVEL=0 -j $(getconf _NPROCESSORS_ONLN) -s install && \
make MAKELEVEL=0 -j $(getconf _NPROCESSORS_ONLN) -s -C contrib/ install && \
# Install headers
make MAKELEVEL=0 -j $(getconf _NPROCESSORS_ONLN) -s -C src/include install && \
make MAKELEVEL=0 -j $(getconf _NPROCESSORS_ONLN) -s -C src/interfaces/libpq install && \
# Enable some of contrib extensions
echo 'trusted = true' >> /usr/local/pgsql/share/extension/autoinc.control && \
echo 'trusted = true' >> /usr/local/pgsql/share/extension/bloom.control && \
echo 'trusted = true' >> /usr/local/pgsql/share/extension/earthdistance.control && \
echo 'trusted = true' >> /usr/local/pgsql/share/extension/insert_username.control && \
echo 'trusted = true' >> /usr/local/pgsql/share/extension/intagg.control && \
echo 'trusted = true' >> /usr/local/pgsql/share/extension/moddatetime.control && \
echo 'trusted = true' >> /usr/local/pgsql/share/extension/pgrowlocks.control && \
echo 'trusted = true' >> /usr/local/pgsql/share/extension/pgstattuple.control && \
echo 'trusted = true' >> /usr/local/pgsql/share/extension/refint.control && \
echo 'trusted = true' >> /usr/local/pgsql/share/extension/xml2.control
#########################################################################################
#
# Layer "postgis-build"
# Build PostGIS from the upstream PostGIS mirror.
#
#########################################################################################
FROM build-deps AS postgis-build
COPY --from=pg-build /usr/local/pgsql/ /usr/local/pgsql/
RUN apt update && \
apt install -y cmake gdal-bin libboost-dev libboost-thread-dev libboost-filesystem-dev \
libboost-system-dev libboost-iostreams-dev libboost-program-options-dev libboost-timer-dev \
libcgal-dev libgdal-dev libgmp-dev libmpfr-dev libopenscenegraph-dev libprotobuf-c-dev \
protobuf-c-compiler xsltproc
# SFCGAL > 1.3 requires CGAL > 5.2, Bullseye's libcgal-dev is 5.2
RUN wget https://gitlab.com/Oslandia/SFCGAL/-/archive/v1.3.10/SFCGAL-v1.3.10.tar.gz -O SFCGAL.tar.gz && \
mkdir sfcgal-src && cd sfcgal-src && tar xvzf ../SFCGAL.tar.gz --strip-components=1 -C . && \
cmake . && make -j $(getconf _NPROCESSORS_ONLN) && \
DESTDIR=/sfcgal make install -j $(getconf _NPROCESSORS_ONLN) && \
make clean && cp -R /sfcgal/* /
ENV PATH "/usr/local/pgsql/bin:$PATH"
RUN wget https://download.osgeo.org/postgis/source/postgis-3.3.2.tar.gz -O postgis.tar.gz && \
mkdir postgis-src && cd postgis-src && tar xvzf ../postgis.tar.gz --strip-components=1 -C . && \
./autogen.sh && \
./configure --with-sfcgal=/usr/local/bin/sfcgal-config && \
make -j $(getconf _NPROCESSORS_ONLN) install && \
cd extensions/postgis && \
make clean && \
make -j $(getconf _NPROCESSORS_ONLN) install && \
echo 'trusted = true' >> /usr/local/pgsql/share/extension/postgis.control && \
echo 'trusted = true' >> /usr/local/pgsql/share/extension/postgis_raster.control && \
echo 'trusted = true' >> /usr/local/pgsql/share/extension/postgis_sfcgal.control && \
echo 'trusted = true' >> /usr/local/pgsql/share/extension/postgis_tiger_geocoder.control && \
echo 'trusted = true' >> /usr/local/pgsql/share/extension/postgis_topology.control && \
echo 'trusted = true' >> /usr/local/pgsql/share/extension/address_standardizer.control && \
echo 'trusted = true' >> /usr/local/pgsql/share/extension/address_standardizer_data_us.control
RUN wget https://github.com/pgRouting/pgrouting/archive/v3.4.2.tar.gz -O pgrouting.tar.gz && \
mkdir pgrouting-src && cd pgrouting-src && tar xvzf ../pgrouting.tar.gz --strip-components=1 -C . && \
mkdir build && \
cd build && \
cmake .. && \
make -j $(getconf _NPROCESSORS_ONLN) && \
make -j $(getconf _NPROCESSORS_ONLN) install && \
echo 'trusted = true' >> /usr/local/pgsql/share/extension/pgrouting.control
#########################################################################################
#
# Layer "plv8-build"
# Build plv8
#
#########################################################################################
FROM build-deps AS plv8-build
COPY --from=pg-build /usr/local/pgsql/ /usr/local/pgsql/
RUN apt update && \
apt install -y ninja-build python3-dev libncurses5 binutils clang
RUN wget https://github.com/plv8/plv8/archive/refs/tags/v3.1.5.tar.gz -O plv8.tar.gz && \
mkdir plv8-src && cd plv8-src && tar xvzf ../plv8.tar.gz --strip-components=1 -C . && \
export PATH="/usr/local/pgsql/bin:$PATH" && \
make DOCKER=1 -j $(getconf _NPROCESSORS_ONLN) install && \
rm -rf /plv8-* && \
find /usr/local/pgsql/ -name "plv8-*.so" | xargs strip && \
echo 'trusted = true' >> /usr/local/pgsql/share/extension/plv8.control && \
echo 'trusted = true' >> /usr/local/pgsql/share/extension/plcoffee.control && \
echo 'trusted = true' >> /usr/local/pgsql/share/extension/plls.control
#########################################################################################
#
# Layer "h3-pg-build"
# Build h3_pg
#
#########################################################################################
FROM build-deps AS h3-pg-build
COPY --from=pg-build /usr/local/pgsql/ /usr/local/pgsql/
# packaged cmake is too old
RUN wget https://github.com/Kitware/CMake/releases/download/v3.24.2/cmake-3.24.2-linux-x86_64.sh \
-q -O /tmp/cmake-install.sh \
&& chmod u+x /tmp/cmake-install.sh \
&& /tmp/cmake-install.sh --skip-license --prefix=/usr/local/ \
&& rm /tmp/cmake-install.sh
RUN wget https://github.com/uber/h3/archive/refs/tags/v4.1.0.tar.gz -O h3.tar.gz && \
mkdir h3-src && cd h3-src && tar xvzf ../h3.tar.gz --strip-components=1 -C . && \
mkdir build && cd build && \
cmake .. -DCMAKE_BUILD_TYPE=Release && \
make -j $(getconf _NPROCESSORS_ONLN) && \
DESTDIR=/h3 make install && \
cp -R /h3/usr / && \
rm -rf build
RUN wget https://github.com/zachasme/h3-pg/archive/refs/tags/v4.1.2.tar.gz -O h3-pg.tar.gz && \
mkdir h3-pg-src && cd h3-pg-src && tar xvzf ../h3-pg.tar.gz --strip-components=1 -C . && \
export PATH="/usr/local/pgsql/bin:$PATH" && \
make -j $(getconf _NPROCESSORS_ONLN) && \
make -j $(getconf _NPROCESSORS_ONLN) install && \
echo 'trusted = true' >> /usr/local/pgsql/share/extension/h3.control && \
echo 'trusted = true' >> /usr/local/pgsql/share/extension/h3_postgis.control
#########################################################################################
#
# Layer "unit-pg-build"
# compile unit extension
#
#########################################################################################
FROM build-deps AS unit-pg-build
COPY --from=pg-build /usr/local/pgsql/ /usr/local/pgsql/
RUN wget https://github.com/df7cb/postgresql-unit/archive/refs/tags/7.7.tar.gz -O postgresql-unit.tar.gz && \
mkdir postgresql-unit-src && cd postgresql-unit-src && tar xvzf ../postgresql-unit.tar.gz --strip-components=1 -C . && \
make -j $(getconf _NPROCESSORS_ONLN) PG_CONFIG=/usr/local/pgsql/bin/pg_config && \
make -j $(getconf _NPROCESSORS_ONLN) install PG_CONFIG=/usr/local/pgsql/bin/pg_config && \
# unit extension's "create extension" script relies on absolute install path to fill some reference tables.
# We move the extension from '/usr/local/pgsql/' to '/usr/local/' after it is build. So we need to adjust the path.
# This one-liner removes pgsql/ part of the path.
# NOTE: Other extensions that rely on MODULEDIR variable after building phase will need the same fix.
find /usr/local/pgsql/share/extension/ -name "unit*.sql" -print0 | xargs -0 sed -i "s|pgsql/||g" && \
echo 'trusted = true' >> /usr/local/pgsql/share/extension/unit.control
#########################################################################################
#
# Layer "vector-pg-build"
# compile pgvector extension
#
#########################################################################################
FROM build-deps AS vector-pg-build
COPY --from=pg-build /usr/local/pgsql/ /usr/local/pgsql/
RUN wget https://github.com/pgvector/pgvector/archive/refs/tags/v0.4.0.tar.gz -O pgvector.tar.gz && \
mkdir pgvector-src && cd pgvector-src && tar xvzf ../pgvector.tar.gz --strip-components=1 -C . && \
make -j $(getconf _NPROCESSORS_ONLN) PG_CONFIG=/usr/local/pgsql/bin/pg_config && \
make -j $(getconf _NPROCESSORS_ONLN) install PG_CONFIG=/usr/local/pgsql/bin/pg_config && \
echo 'trusted = true' >> /usr/local/pgsql/share/extension/vector.control
#########################################################################################
#
# Layer "pgjwt-pg-build"
# compile pgjwt extension
#
#########################################################################################
FROM build-deps AS pgjwt-pg-build
COPY --from=pg-build /usr/local/pgsql/ /usr/local/pgsql/
# 9742dab1b2f297ad3811120db7b21451bca2d3c9 made on 13/11/2021
RUN wget https://github.com/michelp/pgjwt/archive/9742dab1b2f297ad3811120db7b21451bca2d3c9.tar.gz -O pgjwt.tar.gz && \
mkdir pgjwt-src && cd pgjwt-src && tar xvzf ../pgjwt.tar.gz --strip-components=1 -C . && \
make -j $(getconf _NPROCESSORS_ONLN) install PG_CONFIG=/usr/local/pgsql/bin/pg_config && \
echo 'trusted = true' >> /usr/local/pgsql/share/extension/pgjwt.control
#########################################################################################
#
# Layer "hypopg-pg-build"
# compile hypopg extension
#
#########################################################################################
FROM build-deps AS hypopg-pg-build
COPY --from=pg-build /usr/local/pgsql/ /usr/local/pgsql/
RUN wget https://github.com/HypoPG/hypopg/archive/refs/tags/1.3.1.tar.gz -O hypopg.tar.gz && \
mkdir hypopg-src && cd hypopg-src && tar xvzf ../hypopg.tar.gz --strip-components=1 -C . && \
make -j $(getconf _NPROCESSORS_ONLN) PG_CONFIG=/usr/local/pgsql/bin/pg_config && \
make -j $(getconf _NPROCESSORS_ONLN) install PG_CONFIG=/usr/local/pgsql/bin/pg_config && \
echo 'trusted = true' >> /usr/local/pgsql/share/extension/hypopg.control
#########################################################################################
#
# Layer "pg-hashids-pg-build"
# compile pg_hashids extension
#
#########################################################################################
FROM build-deps AS pg-hashids-pg-build
COPY --from=pg-build /usr/local/pgsql/ /usr/local/pgsql/
RUN wget https://github.com/iCyberon/pg_hashids/archive/refs/tags/v1.2.1.tar.gz -O pg_hashids.tar.gz && \
mkdir pg_hashids-src && cd pg_hashids-src && tar xvzf ../pg_hashids.tar.gz --strip-components=1 -C . && \
make -j $(getconf _NPROCESSORS_ONLN) PG_CONFIG=/usr/local/pgsql/bin/pg_config USE_PGXS=1 && \
make -j $(getconf _NPROCESSORS_ONLN) install PG_CONFIG=/usr/local/pgsql/bin/pg_config USE_PGXS=1 && \
echo 'trusted = true' >> /usr/local/pgsql/share/extension/pg_hashids.control
#########################################################################################
#
# Layer "rum-pg-build"
# compile rum extension
#
#########################################################################################
FROM build-deps AS rum-pg-build
COPY --from=pg-build /usr/local/pgsql/ /usr/local/pgsql/
RUN wget https://github.com/postgrespro/rum/archive/refs/tags/1.3.13.tar.gz -O rum.tar.gz && \
mkdir rum-src && cd rum-src && tar xvzf ../rum.tar.gz --strip-components=1 -C . && \
make -j $(getconf _NPROCESSORS_ONLN) PG_CONFIG=/usr/local/pgsql/bin/pg_config USE_PGXS=1 && \
make -j $(getconf _NPROCESSORS_ONLN) install PG_CONFIG=/usr/local/pgsql/bin/pg_config USE_PGXS=1 && \
echo 'trusted = true' >> /usr/local/pgsql/share/extension/rum.control
#########################################################################################
#
# Layer "pgtap-pg-build"
# compile pgTAP extension
#
#########################################################################################
FROM build-deps AS pgtap-pg-build
COPY --from=pg-build /usr/local/pgsql/ /usr/local/pgsql/
RUN wget https://github.com/theory/pgtap/archive/refs/tags/v1.2.0.tar.gz -O pgtap.tar.gz && \
mkdir pgtap-src && cd pgtap-src && tar xvzf ../pgtap.tar.gz --strip-components=1 -C . && \
make -j $(getconf _NPROCESSORS_ONLN) PG_CONFIG=/usr/local/pgsql/bin/pg_config && \
make -j $(getconf _NPROCESSORS_ONLN) install PG_CONFIG=/usr/local/pgsql/bin/pg_config && \
echo 'trusted = true' >> /usr/local/pgsql/share/extension/pgtap.control
#########################################################################################
#
# Layer "prefix-pg-build"
# compile Prefix extension
#
#########################################################################################
FROM build-deps AS prefix-pg-build
COPY --from=pg-build /usr/local/pgsql/ /usr/local/pgsql/
RUN wget https://github.com/dimitri/prefix/archive/refs/tags/v1.2.9.tar.gz -O prefix.tar.gz && \
mkdir prefix-src && cd prefix-src && tar xvzf ../prefix.tar.gz --strip-components=1 -C . && \
make -j $(getconf _NPROCESSORS_ONLN) PG_CONFIG=/usr/local/pgsql/bin/pg_config && \
make -j $(getconf _NPROCESSORS_ONLN) install PG_CONFIG=/usr/local/pgsql/bin/pg_config && \
echo 'trusted = true' >> /usr/local/pgsql/share/extension/prefix.control
#########################################################################################
#
# Layer "hll-pg-build"
# compile hll extension
#
#########################################################################################
FROM build-deps AS hll-pg-build
COPY --from=pg-build /usr/local/pgsql/ /usr/local/pgsql/
RUN wget https://github.com/citusdata/postgresql-hll/archive/refs/tags/v2.17.tar.gz -O hll.tar.gz && \
mkdir hll-src && cd hll-src && tar xvzf ../hll.tar.gz --strip-components=1 -C . && \
make -j $(getconf _NPROCESSORS_ONLN) PG_CONFIG=/usr/local/pgsql/bin/pg_config && \
make -j $(getconf _NPROCESSORS_ONLN) install PG_CONFIG=/usr/local/pgsql/bin/pg_config && \
echo 'trusted = true' >> /usr/local/pgsql/share/extension/hll.control
#########################################################################################
#
# Layer "plpgsql-check-pg-build"
# compile plpgsql_check extension
#
#########################################################################################
FROM build-deps AS plpgsql-check-pg-build
COPY --from=pg-build /usr/local/pgsql/ /usr/local/pgsql/
RUN wget https://github.com/okbob/plpgsql_check/archive/refs/tags/v2.3.2.tar.gz -O plpgsql_check.tar.gz && \
mkdir plpgsql_check-src && cd plpgsql_check-src && tar xvzf ../plpgsql_check.tar.gz --strip-components=1 -C . && \
make -j $(getconf _NPROCESSORS_ONLN) PG_CONFIG=/usr/local/pgsql/bin/pg_config USE_PGXS=1 && \
make -j $(getconf _NPROCESSORS_ONLN) install PG_CONFIG=/usr/local/pgsql/bin/pg_config USE_PGXS=1 && \
echo 'trusted = true' >> /usr/local/pgsql/share/extension/plpgsql_check.control
#########################################################################################
#
# Layer "rust extensions"
# This layer is used to build `pgx` deps
#
#########################################################################################
FROM build-deps AS rust-extensions-build
COPY --from=pg-build /usr/local/pgsql/ /usr/local/pgsql/
RUN apt-get update && \
apt-get install -y curl libclang-dev cmake && \
useradd -ms /bin/bash nonroot -b /home
ENV HOME=/home/nonroot
ENV PATH="/home/nonroot/.cargo/bin:/usr/local/pgsql/bin/:$PATH"
USER nonroot
WORKDIR /home/nonroot
ARG PG_VERSION
RUN curl -sSO https://static.rust-lang.org/rustup/dist/$(uname -m)-unknown-linux-gnu/rustup-init && \
chmod +x rustup-init && \
./rustup-init -y --no-modify-path --profile minimal --default-toolchain stable && \
rm rustup-init && \
cargo install --locked --version 0.7.3 cargo-pgx && \
/bin/bash -c 'cargo pgx init --pg${PG_VERSION:1}=/usr/local/pgsql/bin/pg_config'
USER root
#########################################################################################
#
# Layer "pg-jsonschema-pg-build"
# Compile "pg_jsonschema" extension
#
#########################################################################################
FROM rust-extensions-build AS pg-jsonschema-pg-build
# there is no release tag yet, but we need it due to the superuser fix in the control file
RUN wget https://github.com/supabase/pg_jsonschema/archive/caeab60d70b2fd3ae421ec66466a3abbb37b7ee6.tar.gz -O pg_jsonschema.tar.gz && \
mkdir pg_jsonschema-src && cd pg_jsonschema-src && tar xvzf ../pg_jsonschema.tar.gz --strip-components=1 -C . && \
sed -i 's/pgx = "0.7.1"/pgx = { version = "0.7.3", features = [ "unsafe-postgres" ] }/g' Cargo.toml && \
cargo pgx install --release && \
echo "trusted = true" >> /usr/local/pgsql/share/extension/pg_jsonschema.control
#########################################################################################
#
# Layer "pg-graphql-pg-build"
# Compile "pg_graphql" extension
#
#########################################################################################
FROM rust-extensions-build AS pg-graphql-pg-build
# Currently pgx version bump to >= 0.7.2 causes "call to unsafe function" compliation errors in
# pgx-contrib-spiext. There is a branch that removes that dependency, so use it. It is on the
# same 1.1 version we've used before.
RUN git clone -b remove-pgx-contrib-spiext --single-branch https://github.com/yrashk/pg_graphql && \
cd pg_graphql && \
sed -i 's/pgx = "~0.7.1"/pgx = { version = "0.7.3", features = [ "unsafe-postgres" ] }/g' Cargo.toml && \
sed -i 's/pgx-tests = "~0.7.1"/pgx-tests = "0.7.3"/g' Cargo.toml && \
cargo pgx install --release && \
# it's needed to enable extension because it uses untrusted C language
sed -i 's/superuser = false/superuser = true/g' /usr/local/pgsql/share/extension/pg_graphql.control && \
echo "trusted = true" >> /usr/local/pgsql/share/extension/pg_graphql.control
#########################################################################################
#
# Layer "pg-tiktoken-build"
# Compile "pg_tiktoken" extension
#
#########################################################################################
FROM rust-extensions-build AS pg-tiktoken-pg-build
RUN git clone --depth=1 --single-branch https://github.com/kelvich/pg_tiktoken && \
cd pg_tiktoken && \
cargo pgx install --release && \
echo "trusted = true" >> /usr/local/pgsql/share/extension/pg_tiktoken.control
#########################################################################################
#
# Layer "neon-pg-ext-build"
# compile neon extensions
#
#########################################################################################
FROM build-deps AS neon-pg-ext-build
COPY --from=postgis-build /usr/local/pgsql/ /usr/local/pgsql/
COPY --from=postgis-build /sfcgal/* /
COPY --from=plv8-build /usr/local/pgsql/ /usr/local/pgsql/
COPY --from=h3-pg-build /usr/local/pgsql/ /usr/local/pgsql/
COPY --from=h3-pg-build /h3/usr /
COPY --from=unit-pg-build /usr/local/pgsql/ /usr/local/pgsql/
COPY --from=vector-pg-build /usr/local/pgsql/ /usr/local/pgsql/
COPY --from=pgjwt-pg-build /usr/local/pgsql/ /usr/local/pgsql/
COPY --from=pg-jsonschema-pg-build /usr/local/pgsql/ /usr/local/pgsql/
COPY --from=pg-graphql-pg-build /usr/local/pgsql/ /usr/local/pgsql/
COPY --from=pg-tiktoken-pg-build /usr/local/pgsql/ /usr/local/pgsql/
COPY --from=hypopg-pg-build /usr/local/pgsql/ /usr/local/pgsql/
COPY --from=pg-hashids-pg-build /usr/local/pgsql/ /usr/local/pgsql/
COPY --from=rum-pg-build /usr/local/pgsql/ /usr/local/pgsql/
COPY --from=pgtap-pg-build /usr/local/pgsql/ /usr/local/pgsql/
COPY --from=prefix-pg-build /usr/local/pgsql/ /usr/local/pgsql/
COPY --from=hll-pg-build /usr/local/pgsql/ /usr/local/pgsql/
COPY --from=plpgsql-check-pg-build /usr/local/pgsql/ /usr/local/pgsql/
COPY pgxn/ pgxn/
RUN make -j $(getconf _NPROCESSORS_ONLN) \
PG_CONFIG=/usr/local/pgsql/bin/pg_config \
-C pgxn/neon \
-s install && \
make -j $(getconf _NPROCESSORS_ONLN) \
PG_CONFIG=/usr/local/pgsql/bin/pg_config \
-C pgxn/neon_utils \
-s install
#########################################################################################
#
# Compile and run the Neon-specific `compute_ctl` binary
#
#########################################################################################
FROM $REPOSITORY/$IMAGE:$TAG AS compute-tools
USER nonroot
# Copy entire project to get Cargo.* files with proper dependencies for the whole project
COPY --chown=nonroot . .
RUN cd compute_tools && cargo build --locked --profile release-line-debug-size-lto
#########################################################################################
#
# Clean up postgres folder before inclusion
#
#########################################################################################
FROM neon-pg-ext-build AS postgres-cleanup-layer
COPY --from=neon-pg-ext-build /usr/local/pgsql /usr/local/pgsql
# Remove binaries from /bin/ that we won't use (or would manually copy & install otherwise)
RUN cd /usr/local/pgsql/bin && rm ecpg raster2pgsql shp2pgsql pgtopo_export pgtopo_import pgsql2shp
# Remove headers that we won't need anymore - we've completed installation of all extensions
RUN rm -r /usr/local/pgsql/include
# Remove static postgresql libraries - all compilation is finished, so we
# can now remove these files - they must be included in other binaries by now
# if they were to be used by other libraries.
RUN rm /usr/local/pgsql/lib/lib*.a
#########################################################################################
#
# Final layer
# Put it all together into the final image
#
#########################################################################################
FROM debian:bullseye-slim
# Add user postgres
RUN mkdir /var/db && useradd -m -d /var/db/postgres postgres && \
echo "postgres:test_console_pass" | chpasswd && \
mkdir /var/db/postgres/compute && mkdir /var/db/postgres/specs && \
chown -R postgres:postgres /var/db/postgres && \
chmod 0750 /var/db/postgres/compute && \
echo '/usr/local/lib' >> /etc/ld.so.conf && /sbin/ldconfig && \
# create folder for file cache
mkdir -p -m 777 /neon/cache
COPY --from=postgres-cleanup-layer --chown=postgres /usr/local/pgsql /usr/local
COPY --from=compute-tools --chown=postgres /home/nonroot/target/release-line-debug-size-lto/compute_ctl /usr/local/bin/compute_ctl
# Install:
# libreadline8 for psql
# libicu67, locales for collations (including ICU and plpgsql_check)
# libossp-uuid16 for extension ossp-uuid
# libgeos, libgdal, libsfcgal1, libproj and libprotobuf-c1 for PostGIS
# libxml2, libxslt1.1 for xml2
RUN apt update && \
apt install --no-install-recommends -y \
locales \
libicu67 \
libreadline8 \
libossp-uuid16 \
libgeos-c1v5 \
libgdal28 \
libproj19 \
libprotobuf-c1 \
libsfcgal1 \
libxml2 \
libxslt1.1 \
gdb && \
rm -rf /var/lib/apt/lists/* /tmp/* /var/tmp/* && \
localedef -i en_US -c -f UTF-8 -A /usr/share/locale/locale.alias en_US.UTF-8
ENV LANG en_US.utf8
USER postgres
ENTRYPOINT ["/usr/local/bin/compute_ctl"]

29
Dockerfile.compute-tools Normal file
View File

@@ -0,0 +1,29 @@
# First transient image to build compute_tools binaries
# NB: keep in sync with rust image version in .github/workflows/build_and_test.yml
ARG REPOSITORY=369495373322.dkr.ecr.eu-central-1.amazonaws.com
ARG IMAGE=rust
ARG TAG=pinned
FROM $REPOSITORY/$IMAGE:$TAG AS rust-build
WORKDIR /home/nonroot
# Enable https://github.com/paritytech/cachepot to cache Rust crates' compilation results in Docker builds.
# Set up cachepot to use an AWS S3 bucket for cache results, to reuse it between `docker build` invocations.
# cachepot falls back to local filesystem if S3 is misconfigured, not failing the build.
ARG RUSTC_WRAPPER=cachepot
ENV AWS_REGION=eu-central-1
ENV CACHEPOT_S3_KEY_PREFIX=cachepot
ARG CACHEPOT_BUCKET=neon-github-dev
#ARG AWS_ACCESS_KEY_ID
#ARG AWS_SECRET_ACCESS_KEY
COPY . .
RUN set -e \
&& mold -run cargo build -p compute_tools --locked --release \
&& cachepot -s
# Final image that only has one binary
FROM debian:bullseye-slim
COPY --from=rust-build /home/nonroot/target/release/compute_ctl /usr/local/bin/compute_ctl

View File

@@ -0,0 +1,70 @@
# Note: this file *mostly* just builds on Dockerfile.compute-node
ARG SRC_IMAGE
ARG VM_INFORMANT_VERSION=v0.1.14
# on libcgroup update, make sure to check bootstrap.sh for changes
ARG LIBCGROUP_VERSION=v2.0.3
# Pull VM informant, to copy from later
FROM neondatabase/vm-informant:$VM_INFORMANT_VERSION as informant
# Build cgroup-tools
#
# At time of writing (2023-03-14), debian bullseye has a version of cgroup-tools (technically
# libcgroup) that doesn't support cgroup v2 (version 0.41-11). Unfortunately, the vm-informant
# requires cgroup v2, so we'll build cgroup-tools ourselves.
FROM debian:bullseye-slim as libcgroup-builder
ARG LIBCGROUP_VERSION
RUN set -exu \
&& apt update \
&& apt install --no-install-recommends -y \
git \
ca-certificates \
automake \
cmake \
make \
gcc \
byacc \
flex \
libtool \
libpam0g-dev \
&& git clone --depth 1 -b $LIBCGROUP_VERSION https://github.com/libcgroup/libcgroup \
&& INSTALL_DIR="/libcgroup-install" \
&& mkdir -p "$INSTALL_DIR/bin" "$INSTALL_DIR/include" \
&& cd libcgroup \
# extracted from bootstrap.sh, with modified flags:
&& (test -d m4 || mkdir m4) \
&& autoreconf -fi \
&& rm -rf autom4te.cache \
&& CFLAGS="-O3" ./configure --prefix="$INSTALL_DIR" --sysconfdir=/etc --localstatedir=/var --enable-opaque-hierarchy="name=systemd" \
# actually build the thing...
&& make install
# Combine, starting from non-VM compute node image.
FROM $SRC_IMAGE as base
# Temporarily set user back to root so we can run adduser, set inittab
USER root
RUN adduser vm-informant --disabled-password --no-create-home
RUN set -e \
&& rm -f /etc/inittab \
&& touch /etc/inittab
RUN set -e \
&& echo "::sysinit:cgconfigparser -l /etc/cgconfig.conf -s 1664" >> /etc/inittab \
&& CONNSTR="dbname=neondb user=cloud_admin sslmode=disable" \
&& ARGS="--auto-restart --cgroup=neon-postgres --pgconnstr=\"$CONNSTR\"" \
&& echo "::respawn:su vm-informant -c '/usr/local/bin/vm-informant $ARGS'" >> /etc/inittab
USER postgres
ADD vm-cgconfig.conf /etc/cgconfig.conf
COPY --from=informant /usr/bin/vm-informant /usr/local/bin/vm-informant
COPY --from=libcgroup-builder /libcgroup-install/bin/* /usr/bin/
COPY --from=libcgroup-builder /libcgroup-install/lib/* /usr/lib/
COPY --from=libcgroup-builder /libcgroup-install/sbin/* /usr/sbin/
ENTRYPOINT ["/usr/sbin/cgexec", "-g", "*:neon-postgres", "/usr/local/bin/compute_ctl"]

188
Makefile
View File

@@ -1,10 +1,7 @@
# Seccomp BPF is only available for Linux
UNAME_S := $(shell uname -s)
ifeq ($(UNAME_S),Linux)
SECCOMP = --with-libseccomp
else
SECCOMP =
endif
ROOT_PROJECT_DIR := $(dir $(abspath $(lastword $(MAKEFILE_LIST))))
# Where to install Postgres, default is ./pg_install, maybe useful for package managers
POSTGRES_INSTALL_DIR ?= $(ROOT_PROJECT_DIR)/pg_install/
#
# We differentiate between release / debug build types using the BUILD_TYPE
@@ -12,17 +9,37 @@ endif
#
BUILD_TYPE ?= debug
ifeq ($(BUILD_TYPE),release)
PG_CONFIGURE_OPTS = --enable-debug
PG_CONFIGURE_OPTS = --enable-debug --with-openssl
PG_CFLAGS = -O2 -g3 $(CFLAGS)
# Unfortunately, `--profile=...` is a nightly feature
CARGO_BUILD_FLAGS += --release
else ifeq ($(BUILD_TYPE),debug)
PG_CONFIGURE_OPTS = --enable-debug --enable-cassert --enable-depend
PG_CONFIGURE_OPTS = --enable-debug --with-openssl --enable-cassert --enable-depend
PG_CFLAGS = -O0 -g3 $(CFLAGS)
else
$(error Bad build type `$(BUILD_TYPE)', see Makefile for options)
$(error Bad build type '$(BUILD_TYPE)', see Makefile for options)
endif
UNAME_S := $(shell uname -s)
ifeq ($(UNAME_S),Linux)
# Seccomp BPF is only available for Linux
PG_CONFIGURE_OPTS += --with-libseccomp
else ifeq ($(UNAME_S),Darwin)
# macOS with brew-installed openssl requires explicit paths
# It can be configured with OPENSSL_PREFIX variable
OPENSSL_PREFIX ?= $(shell brew --prefix openssl@3)
PG_CONFIGURE_OPTS += --with-includes=$(OPENSSL_PREFIX)/include --with-libraries=$(OPENSSL_PREFIX)/lib
# macOS already has bison and flex in the system, but they are old and result in postgres-v14 target failure
# brew formulae are keg-only and not symlinked into HOMEBREW_PREFIX, force their usage
EXTRA_PATH_OVERRIDES += $(shell brew --prefix bison)/bin/:$(shell brew --prefix flex)/bin/:
endif
# Use -C option so that when PostgreSQL "make install" installs the
# headers, the mtime of the headers are not changed when there have
# been no changes to the files. Changing the mtime triggers an
# unnecessary rebuild of 'postgres_ffi'.
PG_CONFIGURE_OPTS += INSTALL='$(ROOT_PROJECT_DIR)/scripts/ninstall.sh -C'
# Choose whether we should be silent or verbose
CARGO_BUILD_FLAGS += --$(if $(filter s,$(MAKEFLAGS)),quiet,verbose)
# Fix for a corner case when make doesn't pass a jobserver
@@ -35,64 +52,143 @@ CARGO_CMD_PREFIX += $(if $(filter n,$(MAKEFLAGS)),,+)
CARGO_CMD_PREFIX += CARGO_TERM_PROGRESS_WHEN=never CI=1
#
# Top level Makefile to build Zenith and PostgreSQL
# Top level Makefile to build Neon and PostgreSQL
#
.PHONY: all
all: zenith postgres
all: neon postgres neon-pg-ext
### Zenith Rust bits
### Neon Rust bits
#
# The 'postgres_ffi' depends on the Postgres headers.
.PHONY: zenith
zenith: postgres-headers
+@echo "Compiling Zenith"
.PHONY: neon
neon: postgres-headers
+@echo "Compiling Neon"
$(CARGO_CMD_PREFIX) cargo build $(CARGO_BUILD_FLAGS)
### PostgreSQL parts
tmp_install/build/config.status:
+@echo "Configuring postgres build"
mkdir -p tmp_install/build
(cd tmp_install/build && \
../../vendor/postgres/configure CFLAGS='$(PG_CFLAGS)' \
# Some rules are duplicated for Postgres v14 and 15. We may want to refactor
# to avoid the duplication in the future, but it's tolerable for now.
#
$(POSTGRES_INSTALL_DIR)/build/%/config.status:
+@echo "Configuring Postgres $* build"
mkdir -p $(POSTGRES_INSTALL_DIR)/build/$*
(cd $(POSTGRES_INSTALL_DIR)/build/$* && \
env PATH="$(EXTRA_PATH_OVERRIDES):$$PATH" $(ROOT_PROJECT_DIR)/vendor/postgres-$*/configure \
CFLAGS='$(PG_CFLAGS)' \
$(PG_CONFIGURE_OPTS) \
$(SECCOMP) \
--prefix=$(abspath tmp_install) > configure.log)
--prefix=$(abspath $(POSTGRES_INSTALL_DIR))/$* > configure.log)
# nicer alias for running 'configure'
.PHONY: postgres-configure
postgres-configure: tmp_install/build/config.status
# nicer alias to run 'configure'
# Note: I've been unable to use templates for this part of our configuration.
# I'm not sure why it wouldn't work, but this is the only place (apart from
# the "build-all-versions" entry points) where direct mention of PostgreSQL
# versions is used.
.PHONY: postgres-configure-v15
postgres-configure-v15: $(POSTGRES_INSTALL_DIR)/build/v15/config.status
.PHONY: postgres-configure-v14
postgres-configure-v14: $(POSTGRES_INSTALL_DIR)/build/v14/config.status
# Install the PostgreSQL header files into tmp_install/include
.PHONY: postgres-headers
postgres-headers: postgres-configure
+@echo "Installing PostgreSQL headers"
$(MAKE) -C tmp_install/build/src/include MAKELEVEL=0 install
# Install the PostgreSQL header files into $(POSTGRES_INSTALL_DIR)/<version>/include
.PHONY: postgres-headers-%
postgres-headers-%: postgres-configure-%
+@echo "Installing PostgreSQL $* headers"
$(MAKE) -C $(POSTGRES_INSTALL_DIR)/build/$*/src/include MAKELEVEL=0 install
# Compile and install PostgreSQL and contrib/zenith
# Compile and install PostgreSQL
.PHONY: postgres-%
postgres-%: postgres-configure-% \
postgres-headers-% # to prevent `make install` conflicts with neon's `postgres-headers`
+@echo "Compiling PostgreSQL $*"
$(MAKE) -C $(POSTGRES_INSTALL_DIR)/build/$* MAKELEVEL=0 install
+@echo "Compiling libpq $*"
$(MAKE) -C $(POSTGRES_INSTALL_DIR)/build/$*/src/interfaces/libpq install
+@echo "Compiling pg_prewarm $*"
$(MAKE) -C $(POSTGRES_INSTALL_DIR)/build/$*/contrib/pg_prewarm install
+@echo "Compiling pg_buffercache $*"
$(MAKE) -C $(POSTGRES_INSTALL_DIR)/build/$*/contrib/pg_buffercache install
+@echo "Compiling pageinspect $*"
$(MAKE) -C $(POSTGRES_INSTALL_DIR)/build/$*/contrib/pageinspect install
.PHONY: postgres-clean-%
postgres-clean-%:
$(MAKE) -C $(POSTGRES_INSTALL_DIR)/build/$* MAKELEVEL=0 clean
$(MAKE) -C $(POSTGRES_INSTALL_DIR)/build/$*/contrib/pg_buffercache clean
$(MAKE) -C $(POSTGRES_INSTALL_DIR)/build/$*/contrib/pageinspect clean
$(MAKE) -C $(POSTGRES_INSTALL_DIR)/build/$*/src/interfaces/libpq clean
.PHONY: neon-pg-ext-%
neon-pg-ext-%: postgres-%
+@echo "Compiling neon $*"
mkdir -p $(POSTGRES_INSTALL_DIR)/build/neon-$*
$(MAKE) PG_CONFIG=$(POSTGRES_INSTALL_DIR)/$*/bin/pg_config CFLAGS='$(PG_CFLAGS) $(COPT)' \
-C $(POSTGRES_INSTALL_DIR)/build/neon-$* \
-f $(ROOT_PROJECT_DIR)/pgxn/neon/Makefile install
+@echo "Compiling neon_walredo $*"
mkdir -p $(POSTGRES_INSTALL_DIR)/build/neon-walredo-$*
$(MAKE) PG_CONFIG=$(POSTGRES_INSTALL_DIR)/$*/bin/pg_config CFLAGS='$(PG_CFLAGS) $(COPT)' \
-C $(POSTGRES_INSTALL_DIR)/build/neon-walredo-$* \
-f $(ROOT_PROJECT_DIR)/pgxn/neon_walredo/Makefile install
+@echo "Compiling neon_test_utils $*"
mkdir -p $(POSTGRES_INSTALL_DIR)/build/neon-test-utils-$*
$(MAKE) PG_CONFIG=$(POSTGRES_INSTALL_DIR)/$*/bin/pg_config CFLAGS='$(PG_CFLAGS) $(COPT)' \
-C $(POSTGRES_INSTALL_DIR)/build/neon-test-utils-$* \
-f $(ROOT_PROJECT_DIR)/pgxn/neon_test_utils/Makefile install
+@echo "Compiling neon_utils $*"
mkdir -p $(POSTGRES_INSTALL_DIR)/build/neon-utils-$*
$(MAKE) PG_CONFIG=$(POSTGRES_INSTALL_DIR)/$*/bin/pg_config CFLAGS='$(PG_CFLAGS) $(COPT)' \
-C $(POSTGRES_INSTALL_DIR)/build/neon-utils-$* \
-f $(ROOT_PROJECT_DIR)/pgxn/neon_utils/Makefile install
.PHONY: neon-pg-ext-clean-%
neon-pg-ext-clean-%:
$(MAKE) PG_CONFIG=$(POSTGRES_INSTALL_DIR)/$*/bin/pg_config \
-C $(POSTGRES_INSTALL_DIR)/build/neon-$* \
-f $(ROOT_PROJECT_DIR)/pgxn/neon/Makefile clean
$(MAKE) PG_CONFIG=$(POSTGRES_INSTALL_DIR)/$*/bin/pg_config \
-C $(POSTGRES_INSTALL_DIR)/build/neon-walredo-$* \
-f $(ROOT_PROJECT_DIR)/pgxn/neon_walredo/Makefile clean
$(MAKE) PG_CONFIG=$(POSTGRES_INSTALL_DIR)/$*/bin/pg_config \
-C $(POSTGRES_INSTALL_DIR)/build/neon-test-utils-$* \
-f $(ROOT_PROJECT_DIR)/pgxn/neon_test_utils/Makefile clean
$(MAKE) PG_CONFIG=$(POSTGRES_INSTALL_DIR)/$*/bin/pg_config \
-C $(POSTGRES_INSTALL_DIR)/build/neon-utils-$* \
-f $(ROOT_PROJECT_DIR)/pgxn/neon_utils/Makefile clean
.PHONY: neon-pg-ext
neon-pg-ext: \
neon-pg-ext-v14 \
neon-pg-ext-v15
.PHONY: neon-pg-ext-clean
neon-pg-ext-clean: \
neon-pg-ext-clean-v14 \
neon-pg-ext-clean-v15
# shorthand to build all Postgres versions
.PHONY: postgres
postgres: postgres-configure \
postgres-headers # to prevent `make install` conflicts with zenith's `postgres-headers`
+@echo "Compiling PostgreSQL"
$(MAKE) -C tmp_install/build MAKELEVEL=0 install
+@echo "Compiling contrib/zenith"
$(MAKE) -C tmp_install/build/contrib/zenith install
+@echo "Compiling contrib/zenith_test_utils"
$(MAKE) -C tmp_install/build/contrib/zenith_test_utils install
postgres: \
postgres-v14 \
postgres-v15
.PHONY: postgres-headers
postgres-headers: \
postgres-headers-v14 \
postgres-headers-v15
.PHONY: postgres-clean
postgres-clean:
$(MAKE) -C tmp_install/build MAKELEVEL=0 clean
postgres-clean: \
postgres-clean-v14 \
postgres-clean-v15
# This doesn't remove the effects of 'configure'.
.PHONY: clean
clean:
cd tmp_install/build && $(MAKE) clean
clean: postgres-clean neon-pg-ext-clean
$(CARGO_CMD_PREFIX) cargo clean
# This removes everything
.PHONY: distclean
distclean:
rm -rf tmp_install
rm -rf $(POSTGRES_INSTALL_DIR)
$(CARGO_CMD_PREFIX) cargo clean
.PHONY: fmt
@@ -101,4 +197,4 @@ fmt:
.PHONY: setup-pre-commit-hook
setup-pre-commit-hook:
ln -s -f ../../pre-commit.py .git/hooks/pre-commit
ln -s -f $(ROOT_PROJECT_DIR)/pre-commit.py .git/hooks/pre-commit

5
NOTICE Normal file
View File

@@ -0,0 +1,5 @@
Neon
Copyright 2022 Neon Inc.
The PostgreSQL submodules in vendor/postgres-v14 and vendor/postgres-v15 are licensed under the
PostgreSQL license. See vendor/postgres-v14/COPYRIGHT and vendor/postgres-v15/COPYRIGHT.

View File

@@ -1 +0,0 @@
./test_runner/Pipfile

1
Pipfile.lock generated
View File

@@ -1 +0,0 @@
./test_runner/Pipfile.lock

262
README.md
View File

@@ -1,82 +1,164 @@
# Zenith
# Neon
Zenith substitutes PostgreSQL storage layer and redistributes data across a cluster of nodes
Neon is a serverless open-source alternative to AWS Aurora Postgres. It separates storage and compute and substitutes the PostgreSQL storage layer by redistributing data across a cluster of nodes.
## Quick start
Try the [Neon Free Tier](https://neon.tech/docs/introduction/technical-preview-free-tier/) to create a serverless Postgres instance. Then connect to it with your preferred Postgres client (psql, dbeaver, etc) or use the online [SQL Editor](https://neon.tech/docs/get-started-with-neon/query-with-neon-sql-editor/). See [Connect from any application](https://neon.tech/docs/connect/connect-from-any-app/) for connection instructions.
Alternatively, compile and run the project [locally](#running-local-installation).
## Architecture overview
A Zenith installation consists of Compute nodes and Storage engine.
A Neon installation consists of compute nodes and the Neon storage engine. Compute nodes are stateless PostgreSQL nodes backed by the Neon storage engine.
Compute nodes are stateless PostgreSQL nodes, backed by zenith storage.
The Neon storage engine consists of two major components:
- Pageserver. Scalable storage backend for the compute nodes.
- Safekeepers. The safekeepers form a redundant WAL service that received WAL from the compute node, and stores it durably until it has been processed by the pageserver and uploaded to cloud storage.
Zenith storage engine consists of two major components:
- Pageserver. Scalable storage backend for compute nodes.
- WAL service. The service that receives WAL from compute node and ensures that it is stored durably.
Pageserver consists of:
- Repository - Zenith storage implementation.
- WAL receiver - service that receives WAL from WAL service and stores it in the repository.
- Page service - service that communicates with compute nodes and responds with pages from the repository.
- WAL redo - service that builds pages from base images and WAL records on Page service request.
See developer documentation in [/docs/SUMMARY.md](/docs/SUMMARY.md) for more information.
## Running local installation
1. Install build dependencies and other useful packages
On Ubuntu or Debian this set of packages should be sufficient to build the code:
```text
#### Installing dependencies on Linux
1. Install build dependencies and other applicable packages
* On Ubuntu or Debian, this set of packages should be sufficient to build the code:
```bash
apt install build-essential libtool libreadline-dev zlib1g-dev flex bison libseccomp-dev \
libssl-dev clang pkg-config libpq-dev
libssl-dev clang pkg-config libpq-dev cmake postgresql-client protobuf-compiler
```
* On Fedora, these packages are needed:
```bash
dnf install flex bison readline-devel zlib-devel openssl-devel \
libseccomp-devel perl clang cmake postgresql postgresql-contrib protobuf-compiler \
protobuf-devel
```
* On Arch based systems, these packages are needed:
```bash
pacman -S base-devel readline zlib libseccomp openssl clang \
postgresql-libs cmake postgresql protobuf
```
[Rust] 1.55 or later is also required.
To run the `psql` client, install the `postgresql-client` package or modify `PATH` and `LD_LIBRARY_PATH` to include `tmp_install/bin` and `tmp_install/lib`, respectively.
To run the integration tests (not required to use the code), install
Python (3.7 or higher), and install python3 packages with `pipenv` using `pipenv install` in the project directory.
2. Build zenith and patched postgres
```sh
git clone --recursive https://github.com/zenithdb/zenith.git
cd zenith
make -j5
2. [Install Rust](https://www.rust-lang.org/tools/install)
```
# recommended approach from https://www.rust-lang.org/tools/install
curl --proto '=https' --tlsv1.2 -sSf https://sh.rustup.rs | sh
```
3. Start pageserver and postgres on top of it (should be called from repo root):
#### Installing dependencies on macOS (12.3.1)
1. Install XCode and dependencies
```
xcode-select --install
brew install protobuf openssl flex bison
# add openssl to PATH, required for ed25519 keys generation in neon_local
echo 'export PATH="$(brew --prefix openssl)/bin:$PATH"' >> ~/.zshrc
```
2. [Install Rust](https://www.rust-lang.org/tools/install)
```
# recommended approach from https://www.rust-lang.org/tools/install
curl --proto '=https' --tlsv1.2 -sSf https://sh.rustup.rs | sh
```
3. Install PostgreSQL Client
```
# from https://stackoverflow.com/questions/44654216/correct-way-to-install-psql-without-full-postgres-on-macos
brew install libpq
brew link --force libpq
```
#### Rustc version
The project uses [rust toolchain file](./rust-toolchain.toml) to define the version it's built with in CI for testing and local builds.
This file is automatically picked up by [`rustup`](https://rust-lang.github.io/rustup/overrides.html#the-toolchain-file) that installs (if absent) and uses the toolchain version pinned in the file.
rustup users who want to build with another toolchain can use [`rustup override`](https://rust-lang.github.io/rustup/overrides.html#directory-overrides) command to set a specific toolchain for the project's directory.
non-rustup users most probably are not getting the same toolchain automatically from the file, so are responsible to manually verify their toolchain matches the version in the file.
Newer rustc versions most probably will work fine, yet older ones might not be supported due to some new features used by the project or the crates.
#### Building on Linux
1. Build neon and patched postgres
```
# Note: The path to the neon sources can not contain a space.
git clone --recursive https://github.com/neondatabase/neon.git
cd neon
# The preferred and default is to make a debug build. This will create a
# demonstrably slower build than a release build. For a release build,
# use "BUILD_TYPE=release make -j`nproc` -s"
# Remove -s for the verbose build log
make -j`nproc` -s
```
#### Building on OSX
1. Build neon and patched postgres
```
# Note: The path to the neon sources can not contain a space.
git clone --recursive https://github.com/neondatabase/neon.git
cd neon
# The preferred and default is to make a debug build. This will create a
# demonstrably slower build than a release build. For a release build,
# use "BUILD_TYPE=release make -j`sysctl -n hw.logicalcpu` -s"
# Remove -s for the verbose build log
make -j`sysctl -n hw.logicalcpu` -s
```
#### Dependency installation notes
To run the `psql` client, install the `postgresql-client` package or modify `PATH` and `LD_LIBRARY_PATH` to include `pg_install/bin` and `pg_install/lib`, respectively.
To run the integration tests or Python scripts (not required to use the code), install
Python (3.9 or higher), and install python3 packages using `./scripts/pysync` (requires [poetry>=1.3](https://python-poetry.org/)) in the project directory.
#### Running neon database
1. Start pageserver and postgres on top of it (should be called from repo root):
```sh
# Create repository in .zenith with proper paths to binaries and data
# Create repository in .neon with proper paths to binaries and data
# Later that would be responsibility of a package install script
> ./target/debug/zenith init
initializing tenantid c03ba6b7ad4c5e9cf556f059ade44229
created initial timeline 5b014a9e41b4b63ce1a1febc04503636 timeline.lsn 0/169C3C8
created main branch
pageserver init succeeded
> ./target/debug/neon_local init
Starting pageserver at '127.0.0.1:64000' in '.neon'.
# start pageserver and safekeeper
> ./target/debug/zenith start
Starting pageserver at 'localhost:64000' in '.zenith'
Pageserver started
initializing for single for 7676
Starting safekeeper at 'localhost:5454' in '.zenith/safekeepers/single'
Safekeeper started
# start pageserver, safekeeper, and broker for their intercommunication
> ./target/debug/neon_local start
Starting neon broker at 127.0.0.1:50051
storage_broker started, pid: 2918372
Starting pageserver at '127.0.0.1:64000' in '.neon'.
pageserver started, pid: 2918386
Starting safekeeper at '127.0.0.1:5454' in '.neon/safekeepers/sk1'.
safekeeper 1 started, pid: 2918437
# create initial tenant and use it as a default for every future neon_local invocation
> ./target/debug/neon_local tenant create --set-default
tenant 9ef87a5bf0d92544f6fafeeb3239695c successfully created on the pageserver
Created an initial timeline 'de200bd42b49cc1814412c7e592dd6e9' at Lsn 0/16B5A50 for tenant: 9ef87a5bf0d92544f6fafeeb3239695c
Setting tenant 9ef87a5bf0d92544f6fafeeb3239695c as a default one
# start postgres compute node
> ./target/debug/zenith pg start main
Starting new postgres main on main...
Extracting base backup to create postgres instance: path=.zenith/pgdatadirs/tenants/c03ba6b7ad4c5e9cf556f059ade44229/main port=55432
Starting postgres node at 'host=127.0.0.1 port=55432 user=zenith_admin dbname=postgres'
waiting for server to start.... done
server started
> ./target/debug/neon_local pg start main
Starting new postgres (v14) main on timeline de200bd42b49cc1814412c7e592dd6e9 ...
Extracting base backup to create postgres instance: path=.neon/pgdatadirs/tenants/9ef87a5bf0d92544f6fafeeb3239695c/main port=55432
Starting postgres node at 'host=127.0.0.1 port=55432 user=cloud_admin dbname=postgres'
# check list of running postgres instances
> ./target/debug/zenith pg list
BRANCH ADDRESS LSN STATUS
main 127.0.0.1:55432 0/1609610 running
> ./target/debug/neon_local pg list
NODE ADDRESS TIMELINE BRANCH NAME LSN STATUS
main 127.0.0.1:55432 de200bd42b49cc1814412c7e592dd6e9 main 0/16B5BA8 running
```
4. Now it is possible to connect to postgres and run some queries:
2. Now, it is possible to connect to postgres and run some queries:
```text
> psql -p55432 -h 127.0.0.1 -U zenith_admin postgres
> psql -p55432 -h 127.0.0.1 -U cloud_admin postgres
postgres=# CREATE TABLE t(key int primary key, value text);
CREATE TABLE
postgres=# insert into t values(1,1);
@@ -88,25 +170,32 @@ postgres=# select * from t;
(1 row)
```
5. And create branches and run postgres on them:
3. And create branches and run postgres on them:
```sh
# create branch named migration_check
> ./target/debug/zenith branch migration_check main
Created branch 'migration_check' at 0/1609610
> ./target/debug/neon_local timeline branch --branch-name migration_check
Created timeline 'b3b863fa45fa9e57e615f9f2d944e601' at Lsn 0/16F9A00 for tenant: 9ef87a5bf0d92544f6fafeeb3239695c. Ancestor timeline: 'main'
# check branches tree
> ./target/debug/zenith branch
main
┗━ @0/1609610: migration_check
> ./target/debug/neon_local timeline list
(L) main [de200bd42b49cc1814412c7e592dd6e9]
(L) ┗━ @0/16F9A00: migration_check [b3b863fa45fa9e57e615f9f2d944e601]
# start postgres on that branch
> ./target/debug/zenith pg start migration_check
Starting postgres node at 'host=127.0.0.1 port=55433 user=stas'
waiting for server to start.... done
> ./target/debug/neon_local pg start migration_check --branch-name migration_check
Starting new postgres migration_check on timeline b3b863fa45fa9e57e615f9f2d944e601 ...
Extracting base backup to create postgres instance: path=.neon/pgdatadirs/tenants/9ef87a5bf0d92544f6fafeeb3239695c/migration_check port=55433
Starting postgres node at 'host=127.0.0.1 port=55433 user=cloud_admin dbname=postgres'
# check the new list of running postgres instances
> ./target/debug/neon_local pg list
NODE ADDRESS TIMELINE BRANCH NAME LSN STATUS
main 127.0.0.1:55432 de200bd42b49cc1814412c7e592dd6e9 main 0/16F9A38 running
migration_check 127.0.0.1:55433 b3b863fa45fa9e57e615f9f2d944e601 migration_check 0/16F9A70 running
# this new postgres instance will have all the data from 'main' postgres,
# but all modifications would not affect data in original postgres
> psql -p55433 -h 127.0.0.1 -U zenith_admin postgres
> psql -p55433 -h 127.0.0.1 -U cloud_admin postgres
postgres=# select * from t;
key | value
-----+-------
@@ -115,41 +204,60 @@ postgres=# select * from t;
postgres=# insert into t values(2,2);
INSERT 0 1
# check that the new change doesn't affect the 'main' postgres
> psql -p55432 -h 127.0.0.1 -U cloud_admin postgres
postgres=# select * from t;
key | value
-----+-------
1 | 1
(1 row)
```
6. If you want to run tests afterwards (see below), you have to stop all the running the pageserver, safekeeper and postgres instances
you have just started. You can stop them all with one command:
4. If you want to run tests afterward (see below), you must stop all the running of the pageserver, safekeeper, and postgres instances
you have just started. You can terminate them all with one command:
```sh
> ./target/debug/zenith stop
> ./target/debug/neon_local stop
```
## Running tests
Ensure your dependencies are installed as described [here](https://github.com/neondatabase/neon#dependency-installation-notes).
```sh
git clone --recursive https://github.com/zenithdb/zenith.git
make # builds also postgres and installs it to ./tmp_install
cd test_runner
pytest
git clone --recursive https://github.com/neondatabase/neon.git
CARGO_BUILD_FLAGS="--features=testing" make
./scripts/pytest
```
## Documentation
Now we use README files to cover design ideas and overall architecture for each module and `rustdoc` style documentation comments. See also [/docs/](/docs/) a top-level overview of all available markdown documentation.
[/docs/](/docs/) Contains a top-level overview of all available markdown documentation.
- [/docs/sourcetree.md](/docs/sourcetree.md) contains overview of source tree layout.
To view your `rustdoc` documentation in a browser, try running `cargo doc --no-deps --open`
See also README files in some source directories, and `rustdoc` style documentation comments.
Other resources:
- [SELECT 'Hello, World'](https://neon.tech/blog/hello-world/): Blog post by Nikita Shamgunov on the high level architecture
- [Architecture decisions in Neon](https://neon.tech/blog/architecture-decisions-in-neon/): Blog post by Heikki Linnakangas
- [Neon: Serverless PostgreSQL!](https://www.youtube.com/watch?v=rES0yzeERns): Presentation on storage system by Heikki Linnakangas in the CMU Database Group seminar series
### Postgres-specific terms
Due to Zenith's very close relation with PostgreSQL internals, there are numerous specific terms used.
Same applies to certain spelling: i.e. we use MB to denote 1024 * 1024 bytes, while MiB would be technically more correct, it's inconsistent with what PostgreSQL code and its documentation use.
Due to Neon's very close relation with PostgreSQL internals, numerous specific terms are used.
The same applies to certain spelling: i.e. we use MB to denote 1024 * 1024 bytes, while MiB would be technically more correct, it's inconsistent with what PostgreSQL code and its documentation use.
To get more familiar with this aspect, refer to:
- [Zenith glossary](/docs/glossary.md)
- [PostgreSQL glossary](https://www.postgresql.org/docs/13/glossary.html)
- Other PostgreSQL documentation and sources (Zenith fork sources can be found [here](https://github.com/zenithdb/postgres))
- [Neon glossary](/docs/glossary.md)
- [PostgreSQL glossary](https://www.postgresql.org/docs/14/glossary.html)
- Other PostgreSQL documentation and sources (Neon fork sources can be found [here](https://github.com/neondatabase/postgres))
## Join the development

View File

@@ -1,188 +0,0 @@
Create a new Zenith repository in the current directory:
~/git-sandbox/zenith (cli-v2)$ ./target/debug/cli init
The files belonging to this database system will be owned by user "heikki".
This user must also own the server process.
The database cluster will be initialized with locale "en_GB.UTF-8".
The default database encoding has accordingly been set to "UTF8".
The default text search configuration will be set to "english".
Data page checksums are disabled.
creating directory tmp ... ok
creating subdirectories ... ok
selecting dynamic shared memory implementation ... posix
selecting default max_connections ... 100
selecting default shared_buffers ... 128MB
selecting default time zone ... Europe/Helsinki
creating configuration files ... ok
running bootstrap script ... ok
performing post-bootstrap initialization ... ok
syncing data to disk ... ok
initdb: warning: enabling "trust" authentication for local connections
You can change this by editing pg_hba.conf or using the option -A, or
--auth-local and --auth-host, the next time you run initdb.
new zenith repository was created in .zenith
Initially, there is only one branch:
~/git-sandbox/zenith (cli-v2)$ ./target/debug/cli branch
main
Start a local Postgres instance on the branch:
~/git-sandbox/zenith (cli-v2)$ ./target/debug/cli start main
Creating data directory from snapshot at 0/15FFB08...
waiting for server to start....2021-04-13 09:27:43.919 EEST [984664] LOG: starting PostgreSQL 14devel on x86_64-pc-linux-gnu, compiled by gcc (Debian 10.2.1-6) 10.2.1 20210110, 64-bit
2021-04-13 09:27:43.920 EEST [984664] LOG: listening on IPv6 address "::1", port 5432
2021-04-13 09:27:43.920 EEST [984664] LOG: listening on IPv4 address "127.0.0.1", port 5432
2021-04-13 09:27:43.927 EEST [984664] LOG: listening on Unix socket "/tmp/.s.PGSQL.5432"
2021-04-13 09:27:43.939 EEST [984665] LOG: database system was interrupted; last known up at 2021-04-13 09:27:33 EEST
2021-04-13 09:27:43.939 EEST [984665] LOG: creating missing WAL directory "pg_wal/archive_status"
2021-04-13 09:27:44.189 EEST [984665] LOG: database system was not properly shut down; automatic recovery in progress
2021-04-13 09:27:44.195 EEST [984665] LOG: invalid record length at 0/15FFB80: wanted 24, got 0
2021-04-13 09:27:44.195 EEST [984665] LOG: redo is not required
2021-04-13 09:27:44.225 EEST [984664] LOG: database system is ready to accept connections
done
server started
Run some commands against it:
~/git-sandbox/zenith (cli-v2)$ psql postgres -c "create table foo (t text);"
CREATE TABLE
~/git-sandbox/zenith (cli-v2)$ psql postgres -c "insert into foo values ('inserted on the main branch');"
INSERT 0 1
~/git-sandbox/zenith (cli-v2)$ psql postgres -c "select * from foo"
t
-----------------------------
inserted on the main branch
(1 row)
Create a new branch called 'experimental'. We create it from the
current end of the 'main' branch, but you could specify a different
LSN as the start point instead.
~/git-sandbox/zenith (cli-v2)$ ./target/debug/cli branch experimental main
branching at end of WAL: 0/161F478
~/git-sandbox/zenith (cli-v2)$ ./target/debug/cli branch
experimental
main
Start another Postgres instance off the 'experimental' branch:
~/git-sandbox/zenith (cli-v2)$ ./target/debug/cli start experimental -- -o -p5433
Creating data directory from snapshot at 0/15FFB08...
waiting for server to start....2021-04-13 09:28:41.874 EEST [984766] LOG: starting PostgreSQL 14devel on x86_64-pc-linux-gnu, compiled by gcc (Debian 10.2.1-6) 10.2.1 20210110, 64-bit
2021-04-13 09:28:41.875 EEST [984766] LOG: listening on IPv6 address "::1", port 5433
2021-04-13 09:28:41.875 EEST [984766] LOG: listening on IPv4 address "127.0.0.1", port 5433
2021-04-13 09:28:41.883 EEST [984766] LOG: listening on Unix socket "/tmp/.s.PGSQL.5433"
2021-04-13 09:28:41.896 EEST [984767] LOG: database system was interrupted; last known up at 2021-04-13 09:27:33 EEST
2021-04-13 09:28:42.265 EEST [984767] LOG: database system was not properly shut down; automatic recovery in progress
2021-04-13 09:28:42.269 EEST [984767] LOG: redo starts at 0/15FFB80
2021-04-13 09:28:42.272 EEST [984767] LOG: invalid record length at 0/161F4B0: wanted 24, got 0
2021-04-13 09:28:42.272 EEST [984767] LOG: redo done at 0/161F478 system usage: CPU: user: 0.00 s, system: 0.00 s, elapsed: 0.00 s
2021-04-13 09:28:42.321 EEST [984766] LOG: database system is ready to accept connections
done
server started
Insert some a row on the 'experimental' branch:
~/git-sandbox/zenith (cli-v2)$ psql postgres -p5433 -c "select * from foo"
t
-----------------------------
inserted on the main branch
(1 row)
~/git-sandbox/zenith (cli-v2)$ psql postgres -p5433 -c "insert into foo values ('inserted on experimental')"
INSERT 0 1
~/git-sandbox/zenith (cli-v2)$ psql postgres -p5433 -c "select * from foo"
t
-----------------------------
inserted on the main branch
inserted on experimental
(2 rows)
See that the other Postgres instance is still running on 'main' branch on port 5432:
~/git-sandbox/zenith (cli-v2)$ psql postgres -p5432 -c "select * from foo"
t
-----------------------------
inserted on the main branch
(1 row)
Everything is stored in the .zenith directory:
~/git-sandbox/zenith (cli-v2)$ ls -l .zenith/
total 12
drwxr-xr-x 4 heikki heikki 4096 Apr 13 09:28 datadirs
drwxr-xr-x 4 heikki heikki 4096 Apr 13 09:27 refs
drwxr-xr-x 4 heikki heikki 4096 Apr 13 09:28 timelines
The 'datadirs' directory contains the datadirs of the running instances:
~/git-sandbox/zenith (cli-v2)$ ls -l .zenith/datadirs/
total 8
drwx------ 18 heikki heikki 4096 Apr 13 09:27 3c0c634c1674079b2c6d4edf7c91523e
drwx------ 18 heikki heikki 4096 Apr 13 09:28 697e3c103d4b1763cd6e82e4ff361d76
~/git-sandbox/zenith (cli-v2)$ ls -l .zenith/datadirs/3c0c634c1674079b2c6d4edf7c91523e/
total 124
drwxr-xr-x 5 heikki heikki 4096 Apr 13 09:27 base
drwxr-xr-x 2 heikki heikki 4096 Apr 13 09:27 global
drwxr-xr-x 2 heikki heikki 4096 Apr 13 09:27 pg_commit_ts
drwxr-xr-x 2 heikki heikki 4096 Apr 13 09:27 pg_dynshmem
-rw------- 1 heikki heikki 4760 Apr 13 09:27 pg_hba.conf
-rw------- 1 heikki heikki 1636 Apr 13 09:27 pg_ident.conf
drwxr-xr-x 4 heikki heikki 4096 Apr 13 09:32 pg_logical
drwxr-xr-x 4 heikki heikki 4096 Apr 13 09:27 pg_multixact
drwxr-xr-x 2 heikki heikki 4096 Apr 13 09:27 pg_notify
drwxr-xr-x 2 heikki heikki 4096 Apr 13 09:27 pg_replslot
drwxr-xr-x 2 heikki heikki 4096 Apr 13 09:27 pg_serial
drwxr-xr-x 2 heikki heikki 4096 Apr 13 09:27 pg_snapshots
drwxr-xr-x 2 heikki heikki 4096 Apr 13 09:27 pg_stat
drwxr-xr-x 2 heikki heikki 4096 Apr 13 09:34 pg_stat_tmp
drwxr-xr-x 2 heikki heikki 4096 Apr 13 09:27 pg_subtrans
drwxr-xr-x 2 heikki heikki 4096 Apr 13 09:27 pg_tblspc
drwxr-xr-x 2 heikki heikki 4096 Apr 13 09:27 pg_twophase
-rw------- 1 heikki heikki 3 Apr 13 09:27 PG_VERSION
lrwxrwxrwx 1 heikki heikki 52 Apr 13 09:27 pg_wal -> ../../timelines/3c0c634c1674079b2c6d4edf7c91523e/wal
drwxr-xr-x 2 heikki heikki 4096 Apr 13 09:27 pg_xact
-rw------- 1 heikki heikki 88 Apr 13 09:27 postgresql.auto.conf
-rw------- 1 heikki heikki 28688 Apr 13 09:27 postgresql.conf
-rw------- 1 heikki heikki 96 Apr 13 09:27 postmaster.opts
-rw------- 1 heikki heikki 149 Apr 13 09:27 postmaster.pid
Note how 'pg_wal' is just a symlink to the 'timelines' directory. The
datadir is ephemeral, you can delete it at any time, and it can be reconstructed
from the snapshots and WAL stored in the 'timelines' directory. So if you push/pull
the repository, the 'datadirs' are not included. (They are like git working trees)
~/git-sandbox/zenith (cli-v2)$ killall -9 postgres
~/git-sandbox/zenith (cli-v2)$ rm -rf .zenith/datadirs/*
~/git-sandbox/zenith (cli-v2)$ ./target/debug/cli start experimental -- -o -p5433
Creating data directory from snapshot at 0/15FFB08...
waiting for server to start....2021-04-13 09:37:05.476 EEST [985340] LOG: starting PostgreSQL 14devel on x86_64-pc-linux-gnu, compiled by gcc (Debian 10.2.1-6) 10.2.1 20210110, 64-bit
2021-04-13 09:37:05.477 EEST [985340] LOG: listening on IPv6 address "::1", port 5433
2021-04-13 09:37:05.477 EEST [985340] LOG: listening on IPv4 address "127.0.0.1", port 5433
2021-04-13 09:37:05.487 EEST [985340] LOG: listening on Unix socket "/tmp/.s.PGSQL.5433"
2021-04-13 09:37:05.498 EEST [985341] LOG: database system was interrupted; last known up at 2021-04-13 09:27:33 EEST
2021-04-13 09:37:05.808 EEST [985341] LOG: database system was not properly shut down; automatic recovery in progress
2021-04-13 09:37:05.813 EEST [985341] LOG: redo starts at 0/15FFB80
2021-04-13 09:37:05.815 EEST [985341] LOG: invalid record length at 0/161F770: wanted 24, got 0
2021-04-13 09:37:05.815 EEST [985341] LOG: redo done at 0/161F738 system usage: CPU: user: 0.00 s, system: 0.00 s, elapsed: 0.00 s
2021-04-13 09:37:05.866 EEST [985340] LOG: database system is ready to accept connections
done
server started
~/git-sandbox/zenith (cli-v2)$ psql postgres -p5433 -c "select * from foo"
t
-----------------------------
inserted on the main branch
inserted on experimental
(2 rows)

View File

@@ -0,0 +1 @@
target

1
compute_tools/.gitignore vendored Normal file
View File

@@ -0,0 +1 @@
target

30
compute_tools/Cargo.toml Normal file
View File

@@ -0,0 +1,30 @@
[package]
name = "compute_tools"
version = "0.1.0"
edition.workspace = true
license.workspace = true
[dependencies]
anyhow.workspace = true
chrono.workspace = true
clap.workspace = true
futures.workspace = true
hyper = { workspace = true, features = ["full"] }
notify.workspace = true
num_cpus.workspace = true
opentelemetry.workspace = true
postgres.workspace = true
regex.workspace = true
serde.workspace = true
serde_json.workspace = true
tar.workspace = true
reqwest = { workspace = true, features = ["json"] }
tokio = { workspace = true, features = ["rt", "rt-multi-thread"] }
tokio-postgres.workspace = true
tracing.workspace = true
tracing-opentelemetry.workspace = true
tracing-subscriber.workspace = true
tracing-utils.workspace = true
url.workspace = true
workspace_hack.workspace = true

85
compute_tools/README.md Normal file
View File

@@ -0,0 +1,85 @@
# Compute node tools
Postgres wrapper (`compute_ctl`) is intended to be run as a Docker entrypoint or as a `systemd`
`ExecStart` option. It will handle all the `Neon` specifics during compute node
initialization:
- `compute_ctl` accepts cluster (compute node) specification as a JSON file.
- Every start is a fresh start, so the data directory is removed and
initialized again on each run.
- Next it will put configuration files into the `PGDATA` directory.
- Sync safekeepers and get commit LSN.
- Get `basebackup` from pageserver using the returned on the previous step LSN.
- Try to start `postgres` and wait until it is ready to accept connections.
- Check and alter/drop/create roles and databases.
- Hang waiting on the `postmaster` process to exit.
Also `compute_ctl` spawns two separate service threads:
- `compute-monitor` checks the last Postgres activity timestamp and saves it
into the shared `ComputeNode`;
- `http-endpoint` runs a Hyper HTTP API server, which serves readiness and the
last activity requests.
If the `vm-informant` binary is present at `/bin/vm-informant`, it will also be started. For VM
compute nodes, `vm-informant` communicates with the VM autoscaling system. It coordinates
downscaling and (eventually) will request immediate upscaling under resource pressure.
Usage example:
```sh
compute_ctl -D /var/db/postgres/compute \
-C 'postgresql://cloud_admin@localhost/postgres' \
-S /var/db/postgres/specs/current.json \
-b /usr/local/bin/postgres
```
## Tests
Cargo formatter:
```sh
cargo fmt
```
Run tests:
```sh
cargo test
```
Clippy linter:
```sh
cargo clippy --all --all-targets -- -Dwarnings -Drust-2018-idioms
```
## Cross-platform compilation
Imaging that you are on macOS (x86) and you want a Linux GNU (`x86_64-unknown-linux-gnu` platform in `rust` terminology) executable.
### Using docker
You can use a throw-away Docker container ([rustlang/rust](https://hub.docker.com/r/rustlang/rust/) image) for doing that:
```sh
docker run --rm \
-v $(pwd):/compute_tools \
-w /compute_tools \
-t rustlang/rust:nightly cargo build --release --target=x86_64-unknown-linux-gnu
```
or one-line:
```sh
docker run --rm -v $(pwd):/compute_tools -w /compute_tools -t rust:latest cargo build --release --target=x86_64-unknown-linux-gnu
```
### Using rust native cross-compilation
Another way is to add `x86_64-unknown-linux-gnu` target on your host system:
```sh
rustup target add x86_64-unknown-linux-gnu
```
Install macOS cross-compiler toolchain:
```sh
brew tap SergioBenitez/osxct
brew install x86_64-unknown-linux-gnu
```
And finally run `cargo build`:
```sh
CARGO_TARGET_X86_64_UNKNOWN_LINUX_GNU_LINKER=x86_64-unknown-linux-gnu-gcc cargo build --target=x86_64-unknown-linux-gnu --release
```

View File

@@ -0,0 +1 @@
max_width = 100

View File

@@ -0,0 +1,271 @@
//!
//! Postgres wrapper (`compute_ctl`) is intended to be run as a Docker entrypoint or as a `systemd`
//! `ExecStart` option. It will handle all the `Neon` specifics during compute node
//! initialization:
//! - `compute_ctl` accepts cluster (compute node) specification as a JSON file.
//! - Every start is a fresh start, so the data directory is removed and
//! initialized again on each run.
//! - Next it will put configuration files into the `PGDATA` directory.
//! - Sync safekeepers and get commit LSN.
//! - Get `basebackup` from pageserver using the returned on the previous step LSN.
//! - Try to start `postgres` and wait until it is ready to accept connections.
//! - Check and alter/drop/create roles and databases.
//! - Hang waiting on the `postmaster` process to exit.
//!
//! Also `compute_ctl` spawns two separate service threads:
//! - `compute-monitor` checks the last Postgres activity timestamp and saves it
//! into the shared `ComputeNode`;
//! - `http-endpoint` runs a Hyper HTTP API server, which serves readiness and the
//! last activity requests.
//!
//! If the `vm-informant` binary is present at `/bin/vm-informant`, it will also be started. For VM
//! compute nodes, `vm-informant` communicates with the VM autoscaling system. It coordinates
//! downscaling and (eventually) will request immediate upscaling under resource pressure.
//!
//! Usage example:
//! ```sh
//! compute_ctl -D /var/db/postgres/compute \
//! -C 'postgresql://cloud_admin@localhost/postgres' \
//! -S /var/db/postgres/specs/current.json \
//! -b /usr/local/bin/postgres
//! ```
//!
use std::fs::File;
use std::panic;
use std::path::Path;
use std::process::exit;
use std::sync::{Arc, RwLock};
use std::{thread, time::Duration};
use anyhow::{Context, Result};
use chrono::Utc;
use clap::Arg;
use tracing::{error, info};
use compute_tools::compute::{ComputeMetrics, ComputeNode, ComputeState, ComputeStatus};
use compute_tools::http::api::launch_http_server;
use compute_tools::logger::*;
use compute_tools::monitor::launch_monitor;
use compute_tools::params::*;
use compute_tools::pg_helpers::*;
use compute_tools::spec::*;
use url::Url;
fn main() -> Result<()> {
init_tracing_and_logging(DEFAULT_LOG_LEVEL)?;
let matches = cli().get_matches();
let pgdata = matches
.get_one::<String>("pgdata")
.expect("PGDATA path is required");
let connstr = matches
.get_one::<String>("connstr")
.expect("Postgres connection string is required");
let spec = matches.get_one::<String>("spec");
let spec_path = matches.get_one::<String>("spec-path");
let compute_id = matches.get_one::<String>("compute-id");
let control_plane_uri = matches.get_one::<String>("control-plane-uri");
// Try to use just 'postgres' if no path is provided
let pgbin = matches.get_one::<String>("pgbin").unwrap();
let spec: ComputeSpec = match spec {
// First, try to get cluster spec from the cli argument
Some(json) => serde_json::from_str(json)?,
None => {
// Second, try to read it from the file if path is provided
if let Some(sp) = spec_path {
let path = Path::new(sp);
let file = File::open(path)?;
serde_json::from_reader(file)?
} else if let Some(id) = compute_id {
if let Some(cp_base) = control_plane_uri {
let cp_uri = format!("{cp_base}/management/api/v1/{id}/spec");
let jwt: String = match std::env::var("NEON_CONSOLE_JWT") {
Ok(v) => v,
Err(_) => "".to_string(),
};
reqwest::blocking::Client::new()
.get(cp_uri)
.header("Authorization", jwt)
.send()?
.json()?
} else {
panic!(
"must specify --control-plane-uri \"{:#?}\" and --compute-id \"{:#?}\"",
control_plane_uri, compute_id
);
}
} else {
panic!("compute spec should be provided via --spec or --spec-path argument");
}
}
};
// Extract OpenTelemetry context for the startup actions from the spec, and
// attach it to the current tracing context.
//
// This is used to propagate the context for the 'start_compute' operation
// from the neon control plane. This allows linking together the wider
// 'start_compute' operation that creates the compute container, with the
// startup actions here within the container.
//
// Switch to the startup context here, and exit it once the startup has
// completed and Postgres is up and running.
//
// NOTE: This is supposed to only cover the *startup* actions. Once
// postgres is configured and up-and-running, we exit this span. Any other
// actions that are performed on incoming HTTP requests, for example, are
// performed in separate spans.
let startup_context_guard = if let Some(ref carrier) = spec.startup_tracing_context {
use opentelemetry::propagation::TextMapPropagator;
use opentelemetry::sdk::propagation::TraceContextPropagator;
Some(TraceContextPropagator::new().extract(carrier).attach())
} else {
None
};
let pageserver_connstr = spec
.cluster
.settings
.find("neon.pageserver_connstring")
.expect("pageserver connstr should be provided");
let storage_auth_token = spec.storage_auth_token.clone();
let tenant = spec
.cluster
.settings
.find("neon.tenant_id")
.expect("tenant id should be provided");
let timeline = spec
.cluster
.settings
.find("neon.timeline_id")
.expect("tenant id should be provided");
let compute_state = ComputeNode {
start_time: Utc::now(),
connstr: Url::parse(connstr).context("cannot parse connstr as a URL")?,
pgdata: pgdata.to_string(),
pgbin: pgbin.to_string(),
spec,
tenant,
timeline,
pageserver_connstr,
storage_auth_token,
metrics: ComputeMetrics::default(),
state: RwLock::new(ComputeState::new()),
};
let compute = Arc::new(compute_state);
// Launch service threads first, so we were able to serve availability
// requests, while configuration is still in progress.
let _http_handle = launch_http_server(&compute).expect("cannot launch http endpoint thread");
let _monitor_handle = launch_monitor(&compute).expect("cannot launch compute monitor thread");
// Start Postgres
let mut delay_exit = false;
let mut exit_code = None;
let pg = match compute.start_compute() {
Ok(pg) => Some(pg),
Err(err) => {
error!("could not start the compute node: {:?}", err);
let mut state = compute.state.write().unwrap();
state.error = Some(format!("{:?}", err));
state.status = ComputeStatus::Failed;
drop(state);
delay_exit = true;
None
}
};
// Wait for the child Postgres process forever. In this state Ctrl+C will
// propagate to Postgres and it will be shut down as well.
if let Some(mut pg) = pg {
// Startup is finished, exit the startup tracing span
drop(startup_context_guard);
let ecode = pg
.wait()
.expect("failed to start waiting on Postgres process");
info!("Postgres exited with code {}, shutting down", ecode);
exit_code = ecode.code()
}
if let Err(err) = compute.check_for_core_dumps() {
error!("error while checking for core dumps: {err:?}");
}
// If launch failed, keep serving HTTP requests for a while, so the cloud
// control plane can get the actual error.
if delay_exit {
info!("giving control plane 30s to collect the error before shutdown");
thread::sleep(Duration::from_secs(30));
info!("shutting down");
}
// Shutdown trace pipeline gracefully, so that it has a chance to send any
// pending traces before we exit.
tracing_utils::shutdown_tracing();
exit(exit_code.unwrap_or(1))
}
fn cli() -> clap::Command {
// Env variable is set by `cargo`
let version = option_env!("CARGO_PKG_VERSION").unwrap_or("unknown");
clap::Command::new("compute_ctl")
.version(version)
.arg(
Arg::new("connstr")
.short('C')
.long("connstr")
.value_name("DATABASE_URL")
.required(true),
)
.arg(
Arg::new("pgdata")
.short('D')
.long("pgdata")
.value_name("DATADIR")
.required(true),
)
.arg(
Arg::new("pgbin")
.short('b')
.long("pgbin")
.default_value("postgres")
.value_name("POSTGRES_PATH"),
)
.arg(
Arg::new("spec")
.short('s')
.long("spec")
.value_name("SPEC_JSON"),
)
.arg(
Arg::new("spec-path")
.short('S')
.long("spec-path")
.value_name("SPEC_PATH"),
)
.arg(
Arg::new("compute-id")
.short('i')
.long("compute-id")
.value_name("COMPUTE_ID"),
)
.arg(
Arg::new("control-plane-uri")
.short('p')
.long("control-plane-uri")
.value_name("CONTROL_PLANE"),
)
}
#[test]
fn verify_cli() {
cli().debug_assert()
}

View File

@@ -0,0 +1,45 @@
use anyhow::{anyhow, Result};
use postgres::Client;
use tokio_postgres::NoTls;
use tracing::{error, instrument};
use crate::compute::ComputeNode;
#[instrument(skip_all)]
pub fn create_writability_check_data(client: &mut Client) -> Result<()> {
let query = "
CREATE TABLE IF NOT EXISTS health_check (
id serial primary key,
updated_at timestamptz default now()
);
INSERT INTO health_check VALUES (1, now())
ON CONFLICT (id) DO UPDATE
SET updated_at = now();";
let result = client.simple_query(query)?;
if result.len() < 2 {
return Err(anyhow::format_err!("executed {} queries", result.len()));
}
Ok(())
}
#[instrument(skip_all)]
pub async fn check_writability(compute: &ComputeNode) -> Result<()> {
let (client, connection) = tokio_postgres::connect(compute.connstr.as_str(), NoTls).await?;
if client.is_closed() {
return Err(anyhow!("connection to postgres closed"));
}
tokio::spawn(async move {
if let Err(e) = connection.await {
error!("connection error: {}", e);
}
});
let result = client
.simple_query("UPDATE health_check SET updated_at = now() WHERE id = 1;")
.await?;
if result.len() != 1 {
return Err(anyhow!("statement can't be executed"));
}
Ok(())
}

View File

@@ -0,0 +1,467 @@
//
// XXX: This starts to be scarry similar to the `PostgresNode` from `control_plane`,
// but there are several things that makes `PostgresNode` usage inconvenient in the
// cloud:
// - it inherits from `LocalEnv`, which contains **all-all** the information about
// a complete service running
// - it uses `PageServerNode` with information about http endpoint, which we do not
// need in the cloud again
// - many tiny pieces like, for example, we do not use `pg_ctl` in the cloud
//
// Thus, to use `PostgresNode` in the cloud, we need to 'mock' a bunch of required
// attributes (not required for the cloud). Yet, it is still tempting to unify these
// `PostgresNode` and `ComputeNode` and use one in both places.
//
// TODO: stabilize `ComputeNode` and think about using it in the `control_plane`.
//
use std::fs;
use std::os::unix::fs::PermissionsExt;
use std::path::Path;
use std::process::{Command, Stdio};
use std::str::FromStr;
use std::sync::atomic::{AtomicU64, Ordering};
use std::sync::RwLock;
use anyhow::{Context, Result};
use chrono::{DateTime, Utc};
use postgres::{Client, NoTls};
use serde::{Serialize, Serializer};
use tokio_postgres;
use tracing::{info, instrument, warn};
use crate::checker::create_writability_check_data;
use crate::config;
use crate::pg_helpers::*;
use crate::spec::*;
/// Compute node info shared across several `compute_ctl` threads.
pub struct ComputeNode {
pub start_time: DateTime<Utc>,
// Url type maintains proper escaping
pub connstr: url::Url,
pub pgdata: String,
pub pgbin: String,
pub spec: ComputeSpec,
pub tenant: String,
pub timeline: String,
pub pageserver_connstr: String,
pub storage_auth_token: Option<String>,
pub metrics: ComputeMetrics,
/// Volatile part of the `ComputeNode` so should be used under `RwLock`
/// to allow HTTP API server to serve status requests, while configuration
/// is in progress.
pub state: RwLock<ComputeState>,
}
fn rfc3339_serialize<S>(x: &DateTime<Utc>, s: S) -> Result<S::Ok, S::Error>
where
S: Serializer,
{
x.to_rfc3339().serialize(s)
}
#[derive(Serialize)]
#[serde(rename_all = "snake_case")]
pub struct ComputeState {
pub status: ComputeStatus,
/// Timestamp of the last Postgres activity
#[serde(serialize_with = "rfc3339_serialize")]
pub last_active: DateTime<Utc>,
pub error: Option<String>,
}
impl ComputeState {
pub fn new() -> Self {
Self {
status: ComputeStatus::Init,
last_active: Utc::now(),
error: None,
}
}
}
impl Default for ComputeState {
fn default() -> Self {
Self::new()
}
}
#[derive(Serialize, Clone, Copy, PartialEq, Eq)]
#[serde(rename_all = "snake_case")]
pub enum ComputeStatus {
Init,
Running,
Failed,
}
#[derive(Default, Serialize)]
pub struct ComputeMetrics {
pub sync_safekeepers_ms: AtomicU64,
pub basebackup_ms: AtomicU64,
pub config_ms: AtomicU64,
pub total_startup_ms: AtomicU64,
}
impl ComputeNode {
pub fn set_status(&self, status: ComputeStatus) {
self.state.write().unwrap().status = status;
}
pub fn get_status(&self) -> ComputeStatus {
self.state.read().unwrap().status
}
// Remove `pgdata` directory and create it again with right permissions.
fn create_pgdata(&self) -> Result<()> {
// Ignore removal error, likely it is a 'No such file or directory (os error 2)'.
// If it is something different then create_dir() will error out anyway.
let _ok = fs::remove_dir_all(&self.pgdata);
fs::create_dir(&self.pgdata)?;
fs::set_permissions(&self.pgdata, fs::Permissions::from_mode(0o700))?;
Ok(())
}
// Get basebackup from the libpq connection to pageserver using `connstr` and
// unarchive it to `pgdata` directory overriding all its previous content.
#[instrument(skip(self))]
fn get_basebackup(&self, lsn: &str) -> Result<()> {
let start_time = Utc::now();
let mut config = postgres::Config::from_str(&self.pageserver_connstr)?;
// Use the storage auth token from the config file, if given.
// Note: this overrides any password set in the connection string.
if let Some(storage_auth_token) = &self.storage_auth_token {
info!("Got storage auth token from spec file");
config.password(storage_auth_token);
} else {
info!("Storage auth token not set");
}
let mut client = config.connect(NoTls)?;
let basebackup_cmd = match lsn {
"0/0" => format!("basebackup {} {}", &self.tenant, &self.timeline), // First start of the compute
_ => format!("basebackup {} {} {}", &self.tenant, &self.timeline, lsn),
};
let copyreader = client.copy_out(basebackup_cmd.as_str())?;
// Read the archive directly from the `CopyOutReader`
//
// Set `ignore_zeros` so that unpack() reads all the Copy data and
// doesn't stop at the end-of-archive marker. Otherwise, if the server
// sends an Error after finishing the tarball, we will not notice it.
let mut ar = tar::Archive::new(copyreader);
ar.set_ignore_zeros(true);
ar.unpack(&self.pgdata)?;
self.metrics.basebackup_ms.store(
Utc::now()
.signed_duration_since(start_time)
.to_std()
.unwrap()
.as_millis() as u64,
Ordering::Relaxed,
);
Ok(())
}
// Run `postgres` in a special mode with `--sync-safekeepers` argument
// and return the reported LSN back to the caller.
#[instrument(skip(self))]
fn sync_safekeepers(&self) -> Result<String> {
let start_time = Utc::now();
let sync_handle = Command::new(&self.pgbin)
.args(["--sync-safekeepers"])
.env("PGDATA", &self.pgdata) // we cannot use -D in this mode
.envs(if let Some(storage_auth_token) = &self.storage_auth_token {
vec![("NEON_AUTH_TOKEN", storage_auth_token)]
} else {
vec![]
})
.stdout(Stdio::piped())
.spawn()
.expect("postgres --sync-safekeepers failed to start");
// `postgres --sync-safekeepers` will print all log output to stderr and
// final LSN to stdout. So we pipe only stdout, while stderr will be automatically
// redirected to the caller output.
let sync_output = sync_handle
.wait_with_output()
.expect("postgres --sync-safekeepers failed");
if !sync_output.status.success() {
anyhow::bail!(
"postgres --sync-safekeepers exited with non-zero status: {}. stdout: {}",
sync_output.status,
String::from_utf8(sync_output.stdout)
.expect("postgres --sync-safekeepers exited, and stdout is not utf-8"),
);
}
self.metrics.sync_safekeepers_ms.store(
Utc::now()
.signed_duration_since(start_time)
.to_std()
.unwrap()
.as_millis() as u64,
Ordering::Relaxed,
);
let lsn = String::from(String::from_utf8(sync_output.stdout)?.trim());
Ok(lsn)
}
/// Do all the preparations like PGDATA directory creation, configuration,
/// safekeepers sync, basebackup, etc.
#[instrument(skip(self))]
pub fn prepare_pgdata(&self) -> Result<()> {
let spec = &self.spec;
let pgdata_path = Path::new(&self.pgdata);
// Remove/create an empty pgdata directory and put configuration there.
self.create_pgdata()?;
config::write_postgres_conf(&pgdata_path.join("postgresql.conf"), spec)?;
info!("starting safekeepers syncing");
let lsn = self
.sync_safekeepers()
.with_context(|| "failed to sync safekeepers")?;
info!("safekeepers synced at LSN {}", lsn);
info!(
"getting basebackup@{} from pageserver {}",
lsn, &self.pageserver_connstr
);
self.get_basebackup(&lsn).with_context(|| {
format!(
"failed to get basebackup@{} from pageserver {}",
lsn, &self.pageserver_connstr
)
})?;
// Update pg_hba.conf received with basebackup.
update_pg_hba(pgdata_path)?;
Ok(())
}
/// Start Postgres as a child process and manage DBs/roles.
/// After that this will hang waiting on the postmaster process to exit.
#[instrument(skip(self))]
pub fn start_postgres(&self) -> Result<std::process::Child> {
let pgdata_path = Path::new(&self.pgdata);
// Run postgres as a child process.
let mut pg = Command::new(&self.pgbin)
.args(["-D", &self.pgdata])
.envs(if let Some(storage_auth_token) = &self.storage_auth_token {
vec![("NEON_AUTH_TOKEN", storage_auth_token)]
} else {
vec![]
})
.spawn()
.expect("cannot start postgres process");
wait_for_postgres(&mut pg, pgdata_path)?;
Ok(pg)
}
#[instrument(skip(self))]
pub fn apply_config(&self) -> Result<()> {
// If connection fails,
// it may be the old node with `zenith_admin` superuser.
//
// In this case we need to connect with old `zenith_admin` name
// and create new user. We cannot simply rename connected user,
// but we can create a new one and grant it all privileges.
let mut client = match Client::connect(self.connstr.as_str(), NoTls) {
Err(e) => {
info!(
"cannot connect to postgres: {}, retrying with `zenith_admin` username",
e
);
let mut zenith_admin_connstr = self.connstr.clone();
zenith_admin_connstr
.set_username("zenith_admin")
.map_err(|_| anyhow::anyhow!("invalid connstr"))?;
let mut client = Client::connect(zenith_admin_connstr.as_str(), NoTls)?;
client.simple_query("CREATE USER cloud_admin WITH SUPERUSER")?;
client.simple_query("GRANT zenith_admin TO cloud_admin")?;
drop(client);
// reconnect with connsting with expected name
Client::connect(self.connstr.as_str(), NoTls)?
}
Ok(client) => client,
};
// Proceed with post-startup configuration. Note, that order of operations is important.
handle_roles(&self.spec, &mut client)?;
handle_databases(&self.spec, &mut client)?;
handle_role_deletions(self, &mut client)?;
handle_grants(self, &mut client)?;
create_writability_check_data(&mut client)?;
handle_extensions(&self.spec, &mut client)?;
// 'Close' connection
drop(client);
info!(
"finished configuration of compute for project {}",
self.spec.cluster.cluster_id
);
Ok(())
}
#[instrument(skip(self))]
pub fn start_compute(&self) -> Result<std::process::Child> {
info!(
"starting compute for project {}, operation {}, tenant {}, timeline {}",
self.spec.cluster.cluster_id,
self.spec.operation_uuid.as_ref().unwrap(),
self.tenant,
self.timeline,
);
self.prepare_pgdata()?;
let start_time = Utc::now();
let pg = self.start_postgres()?;
self.apply_config()?;
let startup_end_time = Utc::now();
self.metrics.config_ms.store(
startup_end_time
.signed_duration_since(start_time)
.to_std()
.unwrap()
.as_millis() as u64,
Ordering::Relaxed,
);
self.metrics.total_startup_ms.store(
startup_end_time
.signed_duration_since(self.start_time)
.to_std()
.unwrap()
.as_millis() as u64,
Ordering::Relaxed,
);
self.set_status(ComputeStatus::Running);
Ok(pg)
}
// Look for core dumps and collect backtraces.
//
// EKS worker nodes have following core dump settings:
// /proc/sys/kernel/core_pattern -> core
// /proc/sys/kernel/core_uses_pid -> 1
// ulimint -c -> unlimited
// which results in core dumps being written to postgres data directory as core.<pid>.
//
// Use that as a default location and pattern, except macos where core dumps are written
// to /cores/ directory by default.
pub fn check_for_core_dumps(&self) -> Result<()> {
let core_dump_dir = match std::env::consts::OS {
"macos" => Path::new("/cores/"),
_ => Path::new(&self.pgdata),
};
// Collect core dump paths if any
info!("checking for core dumps in {}", core_dump_dir.display());
let files = fs::read_dir(core_dump_dir)?;
let cores = files.filter_map(|entry| {
let entry = entry.ok()?;
let _ = entry.file_name().to_str()?.strip_prefix("core.")?;
Some(entry.path())
});
// Print backtrace for each core dump
for core_path in cores {
warn!(
"core dump found: {}, collecting backtrace",
core_path.display()
);
// Try first with gdb
let backtrace = Command::new("gdb")
.args(["--batch", "-q", "-ex", "bt", &self.pgbin])
.arg(&core_path)
.output();
// Try lldb if no gdb is found -- that is handy for local testing on macOS
let backtrace = match backtrace {
Err(ref e) if e.kind() == std::io::ErrorKind::NotFound => {
warn!("cannot find gdb, trying lldb");
Command::new("lldb")
.arg("-c")
.arg(&core_path)
.args(["--batch", "-o", "bt all", "-o", "quit"])
.output()
}
_ => backtrace,
}?;
warn!(
"core dump backtrace: {}",
String::from_utf8_lossy(&backtrace.stdout)
);
warn!(
"debugger stderr: {}",
String::from_utf8_lossy(&backtrace.stderr)
);
}
Ok(())
}
/// Select `pg_stat_statements` data and return it as a stringified JSON
pub async fn collect_insights(&self) -> String {
let mut result_rows: Vec<String> = Vec::new();
let connect_result = tokio_postgres::connect(self.connstr.as_str(), NoTls).await;
let (client, connection) = connect_result.unwrap();
tokio::spawn(async move {
if let Err(e) = connection.await {
eprintln!("connection error: {}", e);
}
});
let result = client
.simple_query(
"SELECT
row_to_json(pg_stat_statements)
FROM
pg_stat_statements
WHERE
userid != 'cloud_admin'::regrole::oid
ORDER BY
(mean_exec_time + mean_plan_time) DESC
LIMIT 100",
)
.await;
if let Ok(raw_rows) = result {
for message in raw_rows.iter() {
if let postgres::SimpleQueryMessage::Row(row) = message {
if let Some(json) = row.get(0) {
result_rows.push(json.to_string());
}
}
}
format!("{{\"pg_stat_statements\": [{}]}}", result_rows.join(","))
} else {
"{{\"pg_stat_statements\": []}}".to_string()
}
}
}

View File

@@ -0,0 +1,51 @@
use std::fs::{File, OpenOptions};
use std::io;
use std::io::prelude::*;
use std::path::Path;
use anyhow::Result;
use crate::pg_helpers::PgOptionsSerialize;
use crate::spec::ComputeSpec;
/// Check that `line` is inside a text file and put it there if it is not.
/// Create file if it doesn't exist.
pub fn line_in_file(path: &Path, line: &str) -> Result<bool> {
let mut file = OpenOptions::new()
.read(true)
.write(true)
.create(true)
.append(false)
.open(path)?;
let buf = io::BufReader::new(&file);
let mut count: usize = 0;
for l in buf.lines() {
if l? == line {
return Ok(false);
}
count = 1;
}
write!(file, "{}{}", "\n".repeat(count), line)?;
Ok(true)
}
/// Create or completely rewrite configuration file specified by `path`
pub fn write_postgres_conf(path: &Path, spec: &ComputeSpec) -> Result<()> {
// File::create() destroys the file content if it exists.
let mut postgres_conf = File::create(path)?;
write_auto_managed_block(&mut postgres_conf, &spec.cluster.settings.as_pg_settings())?;
Ok(())
}
// Write Postgres config block wrapped with generated comment section
fn write_auto_managed_block(file: &mut File, buf: &str) -> Result<()> {
writeln!(file, "# Managed by compute_ctl: begin")?;
writeln!(file, "{}", buf)?;
writeln!(file, "# Managed by compute_ctl: end")?;
Ok(())
}

View File

@@ -0,0 +1,117 @@
use std::convert::Infallible;
use std::net::SocketAddr;
use std::sync::Arc;
use std::thread;
use crate::compute::ComputeNode;
use anyhow::Result;
use hyper::service::{make_service_fn, service_fn};
use hyper::{Body, Method, Request, Response, Server, StatusCode};
use num_cpus;
use serde_json;
use tracing::{error, info};
use tracing_utils::http::OtelName;
// Service function to handle all available routes.
async fn routes(req: Request<Body>, compute: &Arc<ComputeNode>) -> Response<Body> {
//
// NOTE: The URI path is currently included in traces. That's OK because
// it doesn't contain any variable parts or sensitive information. But
// please keep that in mind if you change the routing here.
//
match (req.method(), req.uri().path()) {
// Serialized compute state.
(&Method::GET, "/status") => {
info!("serving /status GET request");
let state = compute.state.read().unwrap();
Response::new(Body::from(serde_json::to_string(&*state).unwrap()))
}
// Startup metrics in JSON format. Keep /metrics reserved for a possible
// future use for Prometheus metrics format.
(&Method::GET, "/metrics.json") => {
info!("serving /metrics.json GET request");
Response::new(Body::from(serde_json::to_string(&compute.metrics).unwrap()))
}
// Collect Postgres current usage insights
(&Method::GET, "/insights") => {
info!("serving /insights GET request");
let insights = compute.collect_insights().await;
Response::new(Body::from(insights))
}
(&Method::POST, "/check_writability") => {
info!("serving /check_writability POST request");
let res = crate::checker::check_writability(compute).await;
match res {
Ok(_) => Response::new(Body::from("true")),
Err(e) => Response::new(Body::from(e.to_string())),
}
}
(&Method::GET, "/info") => {
let num_cpus = num_cpus::get_physical();
info!("serving /info GET request. num_cpus: {}", num_cpus);
Response::new(Body::from(
serde_json::json!({
"num_cpus": num_cpus,
})
.to_string(),
))
}
// Return the `404 Not Found` for any other routes.
_ => {
let mut not_found = Response::new(Body::from("404 Not Found"));
*not_found.status_mut() = StatusCode::NOT_FOUND;
not_found
}
}
}
// Main Hyper HTTP server function that runs it and blocks waiting on it forever.
#[tokio::main]
async fn serve(state: Arc<ComputeNode>) {
let addr = SocketAddr::from(([0, 0, 0, 0], 3080));
let make_service = make_service_fn(move |_conn| {
let state = state.clone();
async move {
Ok::<_, Infallible>(service_fn(move |req: Request<Body>| {
let state = state.clone();
async move {
Ok::<_, Infallible>(
// NOTE: We include the URI path in the string. It
// doesn't contain any variable parts or sensitive
// information in this API.
tracing_utils::http::tracing_handler(
req,
|req| routes(req, &state),
OtelName::UriPath,
)
.await,
)
}
}))
}
});
info!("starting HTTP server on {}", addr);
let server = Server::bind(&addr).serve(make_service);
// Run this server forever
if let Err(e) = server.await {
error!("server error: {}", e);
}
}
/// Launch a separate Hyper HTTP API server thread and return its `JoinHandle`.
pub fn launch_http_server(state: &Arc<ComputeNode>) -> Result<thread::JoinHandle<()>> {
let state = Arc::clone(state);
Ok(thread::Builder::new()
.name("http-endpoint".into())
.spawn(move || serve(state))?)
}

View File

@@ -0,0 +1 @@
pub mod api;

View File

@@ -0,0 +1,156 @@
openapi: "3.0.2"
info:
title: Compute node control API
version: "1.0"
servers:
- url: "http://localhost:3080"
paths:
/status:
get:
tags:
- Info
summary: Get compute node internal status
description: ""
operationId: getComputeStatus
responses:
200:
description: ComputeState
content:
application/json:
schema:
$ref: "#/components/schemas/ComputeState"
/metrics.json:
get:
tags:
- Info
summary: Get compute node startup metrics in JSON format
description: ""
operationId: getComputeMetricsJSON
responses:
200:
description: ComputeMetrics
content:
application/json:
schema:
$ref: "#/components/schemas/ComputeMetrics"
/insights:
get:
tags:
- Info
summary: Get current compute insights in JSON format
description: |
Note, that this doesn't include any historical data
operationId: getComputeInsights
responses:
200:
description: Compute insights
content:
application/json:
schema:
$ref: "#/components/schemas/ComputeInsights"
/info:
get:
tags:
- "info"
summary: Get info about the compute Pod/VM
description: ""
operationId: getInfo
responses:
"200":
description: Info
content:
application/json:
schema:
$ref: "#/components/schemas/Info"
/check_writability:
post:
tags:
- Check
summary: Check that we can write new data on this compute
description: ""
operationId: checkComputeWritability
responses:
200:
description: Check result
content:
text/plain:
schema:
type: string
description: Error text or 'true' if check passed
example: "true"
components:
securitySchemes:
JWT:
type: http
scheme: bearer
bearerFormat: JWT
schemas:
ComputeMetrics:
type: object
description: Compute startup metrics
required:
- sync_safekeepers_ms
- basebackup_ms
- config_ms
- total_startup_ms
properties:
sync_safekeepers_ms:
type: integer
basebackup_ms:
type: integer
config_ms:
type: integer
total_startup_ms:
type: integer
Info:
type: object
description: Information about VM/Pod
required:
- num_cpus
properties:
num_cpus:
type: integer
ComputeState:
type: object
required:
- status
- last_active
properties:
status:
$ref: '#/components/schemas/ComputeStatus'
last_active:
type: string
description: The last detected compute activity timestamp in UTC and RFC3339 format
example: "2022-10-12T07:20:50.52Z"
error:
type: string
description: Text of the error during compute startup, if any
ComputeInsights:
type: object
properties:
pg_stat_statements:
description: Contains raw output from pg_stat_statements in JSON format
type: array
items:
type: object
ComputeStatus:
type: string
enum:
- init
- failed
- running
security:
- JWT: []

14
compute_tools/src/lib.rs Normal file
View File

@@ -0,0 +1,14 @@
//!
//! Various tools and helpers to handle cluster / compute node (Postgres)
//! configuration.
//!
pub mod checker;
pub mod config;
pub mod http;
#[macro_use]
pub mod logger;
pub mod compute;
pub mod monitor;
pub mod params;
pub mod pg_helpers;
pub mod spec;

View File

@@ -0,0 +1,37 @@
use tracing_opentelemetry::OpenTelemetryLayer;
use tracing_subscriber::layer::SubscriberExt;
use tracing_subscriber::prelude::*;
/// Initialize logging to stderr, and OpenTelemetry tracing and exporter.
///
/// Logging is configured using either `default_log_level` or
/// `RUST_LOG` environment variable as default log level.
///
/// OpenTelemetry is configured with OTLP/HTTP exporter. It picks up
/// configuration from environment variables. For example, to change the destination,
/// set `OTEL_EXPORTER_OTLP_ENDPOINT=http://jaeger:4318`. See
/// `tracing-utils` package description.
///
pub fn init_tracing_and_logging(default_log_level: &str) -> anyhow::Result<()> {
// Initialize Logging
let env_filter = tracing_subscriber::EnvFilter::try_from_default_env()
.unwrap_or_else(|_| tracing_subscriber::EnvFilter::new(default_log_level));
let fmt_layer = tracing_subscriber::fmt::layer()
.with_target(false)
.with_writer(std::io::stderr);
// Initialize OpenTelemetry
let otlp_layer =
tracing_utils::init_tracing_without_runtime("compute_ctl").map(OpenTelemetryLayer::new);
// Put it all together
tracing_subscriber::registry()
.with(env_filter)
.with(otlp_layer)
.with(fmt_layer)
.init();
tracing::info!("logging and tracing started");
Ok(())
}

View File

@@ -0,0 +1,113 @@
use std::sync::Arc;
use std::{thread, time};
use anyhow::Result;
use chrono::{DateTime, Utc};
use postgres::{Client, NoTls};
use tracing::{debug, info};
use crate::compute::ComputeNode;
const MONITOR_CHECK_INTERVAL: u64 = 500; // milliseconds
// Spin in a loop and figure out the last activity time in the Postgres.
// Then update it in the shared state. This function never errors out.
// XXX: the only expected panic is at `RwLock` unwrap().
fn watch_compute_activity(compute: &ComputeNode) {
// Suppose that `connstr` doesn't change
let connstr = compute.connstr.as_str();
// Define `client` outside of the loop to reuse existing connection if it's active.
let mut client = Client::connect(connstr, NoTls);
let timeout = time::Duration::from_millis(MONITOR_CHECK_INTERVAL);
info!("watching Postgres activity at {}", connstr);
loop {
// Should be outside of the write lock to allow others to read while we sleep.
thread::sleep(timeout);
match &mut client {
Ok(cli) => {
if cli.is_closed() {
info!("connection to postgres closed, trying to reconnect");
// Connection is closed, reconnect and try again.
client = Client::connect(connstr, NoTls);
continue;
}
// Get all running client backends except ourself, use RFC3339 DateTime format.
let backends = cli
.query(
"SELECT state, to_char(state_change, 'YYYY-MM-DD\"T\"HH24:MI:SS.US\"Z\"') AS state_change
FROM pg_stat_activity
WHERE backend_type = 'client backend'
AND pid != pg_backend_pid()
AND usename != 'cloud_admin';", // XXX: find a better way to filter other monitors?
&[],
);
let mut last_active = compute.state.read().unwrap().last_active;
if let Ok(backs) = backends {
let mut idle_backs: Vec<DateTime<Utc>> = vec![];
for b in backs.into_iter() {
let state: String = match b.try_get("state") {
Ok(state) => state,
Err(_) => continue,
};
if state == "idle" {
let change: String = match b.try_get("state_change") {
Ok(state_change) => state_change,
Err(_) => continue,
};
let change = DateTime::parse_from_rfc3339(&change);
match change {
Ok(t) => idle_backs.push(t.with_timezone(&Utc)),
Err(e) => {
info!("cannot parse backend state_change DateTime: {}", e);
continue;
}
}
} else {
// Found non-idle backend, so the last activity is NOW.
// Save it and exit the for loop. Also clear the idle backend
// `state_change` timestamps array as it doesn't matter now.
last_active = Utc::now();
idle_backs.clear();
break;
}
}
// Get idle backend `state_change` with the max timestamp.
if let Some(last) = idle_backs.iter().max() {
last_active = *last;
}
}
// Update the last activity in the shared state if we got a more recent one.
let mut state = compute.state.write().unwrap();
if last_active > state.last_active {
state.last_active = last_active;
debug!("set the last compute activity time to: {}", last_active);
}
}
Err(e) => {
debug!("cannot connect to postgres: {}, retrying", e);
// Establish a new connection and try again.
client = Client::connect(connstr, NoTls);
}
}
}
}
/// Launch a separate compute monitor thread and return its `JoinHandle`.
pub fn launch_monitor(state: &Arc<ComputeNode>) -> Result<thread::JoinHandle<()>> {
let state = Arc::clone(state);
Ok(thread::Builder::new()
.name("compute-monitor".into())
.spawn(move || watch_compute_activity(&state))?)
}

View File

@@ -0,0 +1,9 @@
pub const DEFAULT_LOG_LEVEL: &str = "info";
// From Postgres docs:
// To ease transition from the md5 method to the newer SCRAM method, if md5 is specified
// as a method in pg_hba.conf but the user's password on the server is encrypted for SCRAM
// (see below), then SCRAM-based authentication will automatically be chosen instead.
// https://www.postgresql.org/docs/15/auth-password.html
//
// So it's safe to set md5 here, as `control-plane` anyway uses SCRAM for all roles.
pub const PG_HBA_ALL_MD5: &str = "host\tall\t\tall\t\t0.0.0.0/0\t\tmd5";

View File

@@ -0,0 +1,363 @@
use std::fmt::Write;
use std::fs;
use std::fs::File;
use std::io::{BufRead, BufReader};
use std::os::unix::fs::PermissionsExt;
use std::path::Path;
use std::process::Child;
use std::time::{Duration, Instant};
use anyhow::{bail, Result};
use notify::{RecursiveMode, Watcher};
use postgres::{Client, Transaction};
use serde::Deserialize;
use tracing::{debug, instrument};
const POSTGRES_WAIT_TIMEOUT: Duration = Duration::from_millis(60 * 1000); // milliseconds
/// Rust representation of Postgres role info with only those fields
/// that matter for us.
#[derive(Clone, Deserialize)]
pub struct Role {
pub name: PgIdent,
pub encrypted_password: Option<String>,
pub options: GenericOptions,
}
/// Rust representation of Postgres database info with only those fields
/// that matter for us.
#[derive(Clone, Deserialize)]
pub struct Database {
pub name: PgIdent,
pub owner: PgIdent,
pub options: GenericOptions,
}
/// Common type representing both SQL statement params with or without value,
/// like `LOGIN` or `OWNER username` in the `CREATE/ALTER ROLE`, and config
/// options like `wal_level = logical`.
#[derive(Clone, Deserialize)]
pub struct GenericOption {
pub name: String,
pub value: Option<String>,
pub vartype: String,
}
/// Optional collection of `GenericOption`'s. Type alias allows us to
/// declare a `trait` on it.
pub type GenericOptions = Option<Vec<GenericOption>>;
/// Escape a string for including it in a SQL literal
fn escape_literal(s: &str) -> String {
s.replace('\'', "''").replace('\\', "\\\\")
}
/// Escape a string so that it can be used in postgresql.conf.
/// Same as escape_literal, currently.
fn escape_conf_value(s: &str) -> String {
s.replace('\'', "''").replace('\\', "\\\\")
}
impl GenericOption {
/// Represent `GenericOption` as SQL statement parameter.
pub fn to_pg_option(&self) -> String {
if let Some(val) = &self.value {
match self.vartype.as_ref() {
"string" => format!("{} '{}'", self.name, escape_literal(val)),
_ => format!("{} {}", self.name, val),
}
} else {
self.name.to_owned()
}
}
/// Represent `GenericOption` as configuration option.
pub fn to_pg_setting(&self) -> String {
if let Some(val) = &self.value {
// TODO: check in the console DB that we don't have these settings
// set for any non-deleted project and drop this override.
let name = match self.name.as_str() {
"safekeepers" => "neon.safekeepers",
"wal_acceptor_reconnect" => "neon.safekeeper_reconnect_timeout",
"wal_acceptor_connection_timeout" => "neon.safekeeper_connection_timeout",
it => it,
};
match self.vartype.as_ref() {
"string" => format!("{} = '{}'", name, escape_conf_value(val)),
_ => format!("{} = {}", name, val),
}
} else {
self.name.to_owned()
}
}
}
pub trait PgOptionsSerialize {
fn as_pg_options(&self) -> String;
fn as_pg_settings(&self) -> String;
}
impl PgOptionsSerialize for GenericOptions {
/// Serialize an optional collection of `GenericOption`'s to
/// Postgres SQL statement arguments.
fn as_pg_options(&self) -> String {
if let Some(ops) = &self {
ops.iter()
.map(|op| op.to_pg_option())
.collect::<Vec<String>>()
.join(" ")
} else {
"".to_string()
}
}
/// Serialize an optional collection of `GenericOption`'s to
/// `postgresql.conf` compatible format.
fn as_pg_settings(&self) -> String {
if let Some(ops) = &self {
ops.iter()
.map(|op| op.to_pg_setting())
.collect::<Vec<String>>()
.join("\n")
+ "\n" // newline after last setting
} else {
"".to_string()
}
}
}
pub trait GenericOptionsSearch {
fn find(&self, name: &str) -> Option<String>;
}
impl GenericOptionsSearch for GenericOptions {
/// Lookup option by name
fn find(&self, name: &str) -> Option<String> {
let ops = self.as_ref()?;
let op = ops.iter().find(|s| s.name == name)?;
op.value.clone()
}
}
impl Role {
/// Serialize a list of role parameters into a Postgres-acceptable
/// string of arguments.
pub fn to_pg_options(&self) -> String {
// XXX: consider putting LOGIN as a default option somewhere higher, e.g. in control-plane.
// For now, we do not use generic `options` for roles. Once used, add
// `self.options.as_pg_options()` somewhere here.
let mut params: String = "LOGIN".to_string();
if let Some(pass) = &self.encrypted_password {
// Some time ago we supported only md5 and treated all encrypted_password as md5.
// Now we also support SCRAM-SHA-256 and to preserve compatibility
// we treat all encrypted_password as md5 unless they starts with SCRAM-SHA-256.
if pass.starts_with("SCRAM-SHA-256") {
write!(params, " PASSWORD '{pass}'")
.expect("String is documented to not to error during write operations");
} else {
write!(params, " PASSWORD 'md5{pass}'")
.expect("String is documented to not to error during write operations");
}
} else {
params.push_str(" PASSWORD NULL");
}
params
}
}
impl Database {
pub fn new(name: PgIdent, owner: PgIdent) -> Self {
Self {
name,
owner,
options: None,
}
}
/// Serialize a list of database parameters into a Postgres-acceptable
/// string of arguments.
/// NB: `TEMPLATE` is actually also an identifier, but so far we only need
/// to use `template0` and `template1`, so it is not a problem. Yet in the future
/// it may require a proper quoting too.
pub fn to_pg_options(&self) -> String {
let mut params: String = self.options.as_pg_options();
write!(params, " OWNER {}", &self.owner.pg_quote())
.expect("String is documented to not to error during write operations");
params
}
}
/// String type alias representing Postgres identifier and
/// intended to be used for DB / role names.
pub type PgIdent = String;
/// Generic trait used to provide quoting / encoding for strings used in the
/// Postgres SQL queries and DATABASE_URL.
pub trait Escaping {
fn pg_quote(&self) -> String;
}
impl Escaping for PgIdent {
/// This is intended to mimic Postgres quote_ident(), but for simplicity it
/// always quotes provided string with `""` and escapes every `"`.
/// **Not idempotent**, i.e. if string is already escaped it will be escaped again.
fn pg_quote(&self) -> String {
let result = format!("\"{}\"", self.replace('"', "\"\""));
result
}
}
/// Build a list of existing Postgres roles
pub fn get_existing_roles(xact: &mut Transaction<'_>) -> Result<Vec<Role>> {
let postgres_roles = xact
.query("SELECT rolname, rolpassword FROM pg_catalog.pg_authid", &[])?
.iter()
.map(|row| Role {
name: row.get("rolname"),
encrypted_password: row.get("rolpassword"),
options: None,
})
.collect();
Ok(postgres_roles)
}
/// Build a list of existing Postgres databases
pub fn get_existing_dbs(client: &mut Client) -> Result<Vec<Database>> {
let postgres_dbs = client
.query(
"SELECT datname, datdba::regrole::text as owner
FROM pg_catalog.pg_database;",
&[],
)?
.iter()
.map(|row| Database::new(row.get("datname"), row.get("owner")))
.collect();
Ok(postgres_dbs)
}
/// Wait for Postgres to become ready to accept connections. It's ready to
/// accept connections when the state-field in `pgdata/postmaster.pid` says
/// 'ready'.
#[instrument(skip(pg))]
pub fn wait_for_postgres(pg: &mut Child, pgdata: &Path) -> Result<()> {
let pid_path = pgdata.join("postmaster.pid");
// PostgreSQL writes line "ready" to the postmaster.pid file, when it has
// completed initialization and is ready to accept connections. We want to
// react quickly and perform the rest of our initialization as soon as
// PostgreSQL starts accepting connections. Use 'notify' to be notified
// whenever the PID file is changed, and whenever it changes, read it to
// check if it's now "ready".
//
// You cannot actually watch a file before it exists, so we first watch the
// data directory, and once the postmaster.pid file appears, we switch to
// watch the file instead. We also wake up every 100 ms to poll, just in
// case we miss some events for some reason. Not strictly necessary, but
// better safe than sorry.
let (tx, rx) = std::sync::mpsc::channel();
let (mut watcher, rx): (Box<dyn Watcher>, _) = match notify::recommended_watcher(move |res| {
let _ = tx.send(res);
}) {
Ok(watcher) => (Box::new(watcher), rx),
Err(e) => {
match e.kind {
notify::ErrorKind::Io(os) if os.raw_os_error() == Some(38) => {
// docker on m1 macs does not support recommended_watcher
// but return "Function not implemented (os error 38)"
// see https://github.com/notify-rs/notify/issues/423
let (tx, rx) = std::sync::mpsc::channel();
// let's poll it faster than what we check the results for (100ms)
let config =
notify::Config::default().with_poll_interval(Duration::from_millis(50));
let watcher = notify::PollWatcher::new(
move |res| {
let _ = tx.send(res);
},
config,
)?;
(Box::new(watcher), rx)
}
_ => return Err(e.into()),
}
}
};
watcher.watch(pgdata, RecursiveMode::NonRecursive)?;
let started_at = Instant::now();
let mut postmaster_pid_seen = false;
loop {
if let Ok(Some(status)) = pg.try_wait() {
// Postgres exited, that is not what we expected, bail out earlier.
let code = status.code().unwrap_or(-1);
bail!("Postgres exited unexpectedly with code {}", code);
}
let res = rx.recv_timeout(Duration::from_millis(100));
debug!("woken up by notify: {res:?}");
// If there are multiple events in the channel already, we only need to be
// check once. Swallow the extra events before we go ahead to check the
// pid file.
while let Ok(res) = rx.try_recv() {
debug!("swallowing extra event: {res:?}");
}
// Check that we can open pid file first.
if let Ok(file) = File::open(&pid_path) {
if !postmaster_pid_seen {
debug!("postmaster.pid appeared");
watcher
.unwatch(pgdata)
.expect("Failed to remove pgdata dir watch");
watcher
.watch(&pid_path, RecursiveMode::NonRecursive)
.expect("Failed to add postmaster.pid file watch");
postmaster_pid_seen = true;
}
let file = BufReader::new(file);
let last_line = file.lines().last();
// Pid file could be there and we could read it, but it could be empty, for example.
if let Some(Ok(line)) = last_line {
let status = line.trim();
debug!("last line of postmaster.pid: {status:?}");
// Now Postgres is ready to accept connections
if status == "ready" {
break;
}
}
}
// Give up after POSTGRES_WAIT_TIMEOUT.
let duration = started_at.elapsed();
if duration >= POSTGRES_WAIT_TIMEOUT {
bail!("timed out while waiting for Postgres to start");
}
}
tracing::info!("PostgreSQL is now running, continuing to configure it");
Ok(())
}
/// Remove `pgdata` directory and create it again with right permissions.
pub fn create_pgdata(pgdata: &str) -> Result<()> {
// Ignore removal error, likely it is a 'No such file or directory (os error 2)'.
// If it is something different then create_dir() will error out anyway.
let _ok = fs::remove_dir_all(pgdata);
fs::create_dir(pgdata)?;
fs::set_permissions(pgdata, fs::Permissions::from_mode(0o700))?;
Ok(())
}

534
compute_tools/src/spec.rs Normal file
View File

@@ -0,0 +1,534 @@
use std::collections::HashMap;
use std::path::Path;
use std::str::FromStr;
use anyhow::Result;
use postgres::config::Config;
use postgres::{Client, NoTls};
use serde::Deserialize;
use tracing::{info, info_span, instrument, span_enabled, warn, Level};
use crate::compute::ComputeNode;
use crate::config;
use crate::params::PG_HBA_ALL_MD5;
use crate::pg_helpers::*;
/// Cluster spec or configuration represented as an optional number of
/// delta operations + final cluster state description.
#[derive(Clone, Deserialize)]
pub struct ComputeSpec {
pub format_version: f32,
pub timestamp: String,
pub operation_uuid: Option<String>,
/// Expected cluster state at the end of transition process.
pub cluster: Cluster,
pub delta_operations: Option<Vec<DeltaOp>>,
pub storage_auth_token: Option<String>,
pub startup_tracing_context: Option<HashMap<String, String>>,
}
/// Cluster state seen from the perspective of the external tools
/// like Rails web console.
#[derive(Clone, Deserialize)]
pub struct Cluster {
pub cluster_id: String,
pub name: String,
pub state: Option<String>,
pub roles: Vec<Role>,
pub databases: Vec<Database>,
pub settings: GenericOptions,
}
/// Single cluster state changing operation that could not be represented as
/// a static `Cluster` structure. For example:
/// - DROP DATABASE
/// - DROP ROLE
/// - ALTER ROLE name RENAME TO new_name
/// - ALTER DATABASE name RENAME TO new_name
#[derive(Clone, Deserialize)]
pub struct DeltaOp {
pub action: String,
pub name: PgIdent,
pub new_name: Option<PgIdent>,
}
/// It takes cluster specification and does the following:
/// - Serialize cluster config and put it into `postgresql.conf` completely rewriting the file.
/// - Update `pg_hba.conf` to allow external connections.
pub fn handle_configuration(spec: &ComputeSpec, pgdata_path: &Path) -> Result<()> {
// File `postgresql.conf` is no longer included into `basebackup`, so just
// always write all config into it creating new file.
config::write_postgres_conf(&pgdata_path.join("postgresql.conf"), spec)?;
update_pg_hba(pgdata_path)?;
Ok(())
}
/// Check `pg_hba.conf` and update if needed to allow external connections.
pub fn update_pg_hba(pgdata_path: &Path) -> Result<()> {
// XXX: consider making it a part of spec.json
info!("checking pg_hba.conf");
let pghba_path = pgdata_path.join("pg_hba.conf");
if config::line_in_file(&pghba_path, PG_HBA_ALL_MD5)? {
info!("updated pg_hba.conf to allow external connections");
} else {
info!("pg_hba.conf is up-to-date");
}
Ok(())
}
/// Given a cluster spec json and open transaction it handles roles creation,
/// deletion and update.
#[instrument(skip_all)]
pub fn handle_roles(spec: &ComputeSpec, client: &mut Client) -> Result<()> {
let mut xact = client.transaction()?;
let existing_roles: Vec<Role> = get_existing_roles(&mut xact)?;
// Print a list of existing Postgres roles (only in debug mode)
if span_enabled!(Level::INFO) {
info!("postgres roles:");
for r in &existing_roles {
info!(
" - {}:{}",
r.name,
if r.encrypted_password.is_some() {
"[FILTERED]"
} else {
"(null)"
}
);
}
}
// Process delta operations first
if let Some(ops) = &spec.delta_operations {
info!("processing role renames");
for op in ops {
match op.action.as_ref() {
"delete_role" => {
// no-op now, roles will be deleted at the end of configuration
}
// Renaming role drops its password, since role name is
// used as a salt there. It is important that this role
// is recorded with a new `name` in the `roles` list.
// Follow up roles update will set the new password.
"rename_role" => {
let new_name = op.new_name.as_ref().unwrap();
// XXX: with a limited number of roles it is fine, but consider making it a HashMap
if existing_roles.iter().any(|r| r.name == op.name) {
let query: String = format!(
"ALTER ROLE {} RENAME TO {}",
op.name.pg_quote(),
new_name.pg_quote()
);
warn!("renaming role '{}' to '{}'", op.name, new_name);
xact.execute(query.as_str(), &[])?;
}
}
_ => {}
}
}
}
// Refresh Postgres roles info to handle possible roles renaming
let existing_roles: Vec<Role> = get_existing_roles(&mut xact)?;
info!("cluster spec roles:");
for role in &spec.cluster.roles {
let name = &role.name;
// XXX: with a limited number of roles it is fine, but consider making it a HashMap
let pg_role = existing_roles.iter().find(|r| r.name == *name);
enum RoleAction {
None,
Update,
Create,
}
let action = if let Some(r) = pg_role {
if (r.encrypted_password.is_none() && role.encrypted_password.is_some())
|| (r.encrypted_password.is_some() && role.encrypted_password.is_none())
{
RoleAction::Update
} else if let Some(pg_pwd) = &r.encrypted_password {
// Check whether password changed or not (trim 'md5' prefix first if any)
//
// This is a backward compatibility hack, which comes from the times when we were using
// md5 for everyone and hashes were stored in the console db without md5 prefix. So when
// role comes from the control-plane (json spec) `Role.encrypted_password` doesn't have md5 prefix,
// but when role comes from Postgres (`get_existing_roles` / `existing_roles`) it has this prefix.
// Here is the only place so far where we compare hashes, so it seems to be the best candidate
// to place this compatibility layer.
let pg_pwd = if let Some(stripped) = pg_pwd.strip_prefix("md5") {
stripped
} else {
pg_pwd
};
if pg_pwd != *role.encrypted_password.as_ref().unwrap() {
RoleAction::Update
} else {
RoleAction::None
}
} else {
RoleAction::None
}
} else {
RoleAction::Create
};
match action {
RoleAction::None => {}
RoleAction::Update => {
let mut query: String = format!("ALTER ROLE {} ", name.pg_quote());
query.push_str(&role.to_pg_options());
xact.execute(query.as_str(), &[])?;
}
RoleAction::Create => {
let mut query: String = format!("CREATE ROLE {} ", name.pg_quote());
info!("role create query: '{}'", &query);
query.push_str(&role.to_pg_options());
xact.execute(query.as_str(), &[])?;
let grant_query = format!(
"GRANT pg_read_all_data, pg_write_all_data TO {}",
name.pg_quote()
);
xact.execute(grant_query.as_str(), &[])?;
info!("role grant query: '{}'", &grant_query);
}
}
if span_enabled!(Level::INFO) {
let pwd = if role.encrypted_password.is_some() {
"[FILTERED]"
} else {
"(null)"
};
let action_str = match action {
RoleAction::None => "",
RoleAction::Create => " -> create",
RoleAction::Update => " -> update",
};
info!(" - {}:{}{}", name, pwd, action_str);
}
}
xact.commit()?;
Ok(())
}
/// Reassign all dependent objects and delete requested roles.
#[instrument(skip_all)]
pub fn handle_role_deletions(node: &ComputeNode, client: &mut Client) -> Result<()> {
if let Some(ops) = &node.spec.delta_operations {
// First, reassign all dependent objects to db owners.
info!("reassigning dependent objects of to-be-deleted roles");
// Fetch existing roles. We could've exported and used `existing_roles` from
// `handle_roles()`, but we only make this list there before creating new roles.
// Which is probably fine as we never create to-be-deleted roles, but that'd
// just look a bit untidy. Anyway, the entire `pg_roles` should be in shared
// buffers already, so this shouldn't be a big deal.
let mut xact = client.transaction()?;
let existing_roles: Vec<Role> = get_existing_roles(&mut xact)?;
xact.commit()?;
for op in ops {
// Check that role is still present in Postgres, as this could be a
// restart with the same spec after role deletion.
if op.action == "delete_role" && existing_roles.iter().any(|r| r.name == op.name) {
reassign_owned_objects(node, &op.name)?;
}
}
// Second, proceed with role deletions.
info!("processing role deletions");
let mut xact = client.transaction()?;
for op in ops {
// We do not check either role exists or not,
// Postgres will take care of it for us
if op.action == "delete_role" {
let query: String = format!("DROP ROLE IF EXISTS {}", &op.name.pg_quote());
warn!("deleting role '{}'", &op.name);
xact.execute(query.as_str(), &[])?;
}
}
xact.commit()?;
}
Ok(())
}
// Reassign all owned objects in all databases to the owner of the database.
fn reassign_owned_objects(node: &ComputeNode, role_name: &PgIdent) -> Result<()> {
for db in &node.spec.cluster.databases {
if db.owner != *role_name {
let mut conf = Config::from_str(node.connstr.as_str())?;
conf.dbname(&db.name);
let mut client = conf.connect(NoTls)?;
// This will reassign all dependent objects to the db owner
let reassign_query = format!(
"REASSIGN OWNED BY {} TO {}",
role_name.pg_quote(),
db.owner.pg_quote()
);
info!(
"reassigning objects owned by '{}' in db '{}' to '{}'",
role_name, &db.name, &db.owner
);
client.simple_query(&reassign_query)?;
// This now will only drop privileges of the role
let drop_query = format!("DROP OWNED BY {}", role_name.pg_quote());
client.simple_query(&drop_query)?;
}
}
Ok(())
}
/// It follows mostly the same logic as `handle_roles()` excepting that we
/// does not use an explicit transactions block, since major database operations
/// like `CREATE DATABASE` and `DROP DATABASE` do not support it. Statement-level
/// atomicity should be enough here due to the order of operations and various checks,
/// which together provide us idempotency.
#[instrument(skip_all)]
pub fn handle_databases(spec: &ComputeSpec, client: &mut Client) -> Result<()> {
let existing_dbs: Vec<Database> = get_existing_dbs(client)?;
// Print a list of existing Postgres databases (only in debug mode)
if span_enabled!(Level::INFO) {
info!("postgres databases:");
for r in &existing_dbs {
info!(" {}:{}", r.name, r.owner);
}
}
// Process delta operations first
if let Some(ops) = &spec.delta_operations {
info!("processing delta operations on databases");
for op in ops {
match op.action.as_ref() {
// We do not check either DB exists or not,
// Postgres will take care of it for us
"delete_db" => {
let query: String = format!("DROP DATABASE IF EXISTS {}", &op.name.pg_quote());
warn!("deleting database '{}'", &op.name);
client.execute(query.as_str(), &[])?;
}
"rename_db" => {
let new_name = op.new_name.as_ref().unwrap();
// XXX: with a limited number of roles it is fine, but consider making it a HashMap
if existing_dbs.iter().any(|r| r.name == op.name) {
let query: String = format!(
"ALTER DATABASE {} RENAME TO {}",
op.name.pg_quote(),
new_name.pg_quote()
);
warn!("renaming database '{}' to '{}'", op.name, new_name);
client.execute(query.as_str(), &[])?;
}
}
_ => {}
}
}
}
// Refresh Postgres databases info to handle possible renames
let existing_dbs: Vec<Database> = get_existing_dbs(client)?;
info!("cluster spec databases:");
for db in &spec.cluster.databases {
let name = &db.name;
// XXX: with a limited number of databases it is fine, but consider making it a HashMap
let pg_db = existing_dbs.iter().find(|r| r.name == *name);
enum DatabaseAction {
None,
Update,
Create,
}
let action = if let Some(r) = pg_db {
// XXX: db owner name is returned as quoted string from Postgres,
// when quoting is needed.
let new_owner = if r.owner.starts_with('"') {
db.owner.pg_quote()
} else {
db.owner.clone()
};
if new_owner != r.owner {
// Update the owner
DatabaseAction::Update
} else {
DatabaseAction::None
}
} else {
DatabaseAction::Create
};
match action {
DatabaseAction::None => {}
DatabaseAction::Update => {
let query: String = format!(
"ALTER DATABASE {} OWNER TO {}",
name.pg_quote(),
db.owner.pg_quote()
);
let _guard = info_span!("executing", query).entered();
client.execute(query.as_str(), &[])?;
}
DatabaseAction::Create => {
let mut query: String = format!("CREATE DATABASE {} ", name.pg_quote());
query.push_str(&db.to_pg_options());
let _guard = info_span!("executing", query).entered();
client.execute(query.as_str(), &[])?;
}
};
if span_enabled!(Level::INFO) {
let action_str = match action {
DatabaseAction::None => "",
DatabaseAction::Create => " -> create",
DatabaseAction::Update => " -> update",
};
info!(" - {}:{}{}", db.name, db.owner, action_str);
}
}
Ok(())
}
/// Grant CREATE ON DATABASE to the database owner and do some other alters and grants
/// to allow users creating trusted extensions and re-creating `public` schema, for example.
#[instrument(skip_all)]
pub fn handle_grants(node: &ComputeNode, client: &mut Client) -> Result<()> {
let spec = &node.spec;
info!("cluster spec grants:");
// We now have a separate `web_access` role to connect to the database
// via the web interface and proxy link auth. And also we grant a
// read / write all data privilege to every role. So also grant
// create to everyone.
// XXX: later we should stop messing with Postgres ACL in such horrible
// ways.
let roles = spec
.cluster
.roles
.iter()
.map(|r| r.name.pg_quote())
.collect::<Vec<_>>();
for db in &spec.cluster.databases {
let dbname = &db.name;
let query: String = format!(
"GRANT CREATE ON DATABASE {} TO {}",
dbname.pg_quote(),
roles.join(", ")
);
info!("grant query {}", &query);
client.execute(query.as_str(), &[])?;
}
// Do some per-database access adjustments. We'd better do this at db creation time,
// but CREATE DATABASE isn't transactional. So we cannot create db + do some grants
// atomically.
for db in &node.spec.cluster.databases {
let mut conf = Config::from_str(node.connstr.as_str())?;
conf.dbname(&db.name);
let mut db_client = conf.connect(NoTls)?;
// This will only change ownership on the schema itself, not the objects
// inside it. Without it owner of the `public` schema will be `cloud_admin`
// and database owner cannot do anything with it. SQL procedure ensures
// that it won't error out if schema `public` doesn't exist.
let alter_query = format!(
"DO $$\n\
DECLARE\n\
schema_owner TEXT;\n\
BEGIN\n\
IF EXISTS(\n\
SELECT nspname\n\
FROM pg_catalog.pg_namespace\n\
WHERE nspname = 'public'\n\
)\n\
THEN\n\
SELECT nspowner::regrole::text\n\
FROM pg_catalog.pg_namespace\n\
WHERE nspname = 'public'\n\
INTO schema_owner;\n\
\n\
IF schema_owner = 'cloud_admin' OR schema_owner = 'zenith_admin'\n\
THEN\n\
ALTER SCHEMA public OWNER TO {};\n\
END IF;\n\
END IF;\n\
END\n\
$$;",
db.owner.pg_quote()
);
db_client.simple_query(&alter_query)?;
// Explicitly grant CREATE ON SCHEMA PUBLIC to the web_access user.
// This is needed because since postgres 15 this privilege is removed by default.
let grant_query = "DO $$\n\
BEGIN\n\
IF EXISTS(\n\
SELECT nspname\n\
FROM pg_catalog.pg_namespace\n\
WHERE nspname = 'public'\n\
) AND\n\
current_setting('server_version_num')::int/10000 >= 15\n\
THEN\n\
IF EXISTS(\n\
SELECT rolname\n\
FROM pg_catalog.pg_roles\n\
WHERE rolname = 'web_access'\n\
)\n\
THEN\n\
GRANT CREATE ON SCHEMA public TO web_access;\n\
END IF;\n\
END IF;\n\
END\n\
$$;"
.to_string();
info!("grant query for db {} : {}", &db.name, &grant_query);
db_client.simple_query(&grant_query)?;
}
Ok(())
}
/// Create required system extensions
#[instrument(skip_all)]
pub fn handle_extensions(spec: &ComputeSpec, client: &mut Client) -> Result<()> {
if let Some(libs) = spec.cluster.settings.find("shared_preload_libraries") {
if libs.contains("pg_stat_statements") {
// Create extension only if this compute really needs it
let query = "CREATE EXTENSION IF NOT EXISTS pg_stat_statements";
info!("creating system extensions with query: {}", query);
client.simple_query(query)?;
}
}
Ok(())
}

View File

@@ -0,0 +1,209 @@
{
"format_version": 1.0,
"timestamp": "2021-05-23T18:25:43.511Z",
"operation_uuid": "0f657b36-4b0f-4a2d-9c2e-1dcd615e7d8b",
"cluster": {
"cluster_id": "test-cluster-42",
"name": "Zenith Test",
"state": "restarted",
"roles": [
{
"name": "postgres",
"encrypted_password": "6b1d16b78004bbd51fa06af9eda75972",
"options": null
},
{
"name": "alexk",
"encrypted_password": null,
"options": null
},
{
"name": "zenith \"new\"",
"encrypted_password": "5b1d16b78004bbd51fa06af9eda75972",
"options": null
},
{
"name": "zen",
"encrypted_password": "9b1d16b78004bbd51fa06af9eda75972"
},
{
"name": "\"name\";\\n select 1;",
"encrypted_password": "5b1d16b78004bbd51fa06af9eda75972"
},
{
"name": "MyRole",
"encrypted_password": "5b1d16b78004bbd51fa06af9eda75972"
}
],
"databases": [
{
"name": "DB2",
"owner": "alexk",
"options": [
{
"name": "LC_COLLATE",
"value": "C",
"vartype": "string"
},
{
"name": "LC_CTYPE",
"value": "C",
"vartype": "string"
},
{
"name": "TEMPLATE",
"value": "template0",
"vartype": "enum"
}
]
},
{
"name": "zenith",
"owner": "MyRole"
},
{
"name": "zen",
"owner": "zen"
}
],
"settings": [
{
"name": "fsync",
"value": "off",
"vartype": "bool"
},
{
"name": "wal_level",
"value": "replica",
"vartype": "enum"
},
{
"name": "hot_standby",
"value": "on",
"vartype": "bool"
},
{
"name": "neon.safekeepers",
"value": "127.0.0.1:6502,127.0.0.1:6503,127.0.0.1:6501",
"vartype": "string"
},
{
"name": "wal_log_hints",
"value": "on",
"vartype": "bool"
},
{
"name": "log_connections",
"value": "on",
"vartype": "bool"
},
{
"name": "shared_buffers",
"value": "32768",
"vartype": "integer"
},
{
"name": "port",
"value": "55432",
"vartype": "integer"
},
{
"name": "max_connections",
"value": "100",
"vartype": "integer"
},
{
"name": "max_wal_senders",
"value": "10",
"vartype": "integer"
},
{
"name": "listen_addresses",
"value": "0.0.0.0",
"vartype": "string"
},
{
"name": "wal_sender_timeout",
"value": "0",
"vartype": "integer"
},
{
"name": "password_encryption",
"value": "md5",
"vartype": "enum"
},
{
"name": "maintenance_work_mem",
"value": "65536",
"vartype": "integer"
},
{
"name": "max_parallel_workers",
"value": "8",
"vartype": "integer"
},
{
"name": "max_worker_processes",
"value": "8",
"vartype": "integer"
},
{
"name": "neon.tenant_id",
"value": "b0554b632bd4d547a63b86c3630317e8",
"vartype": "string"
},
{
"name": "max_replication_slots",
"value": "10",
"vartype": "integer"
},
{
"name": "neon.timeline_id",
"value": "2414a61ffc94e428f14b5758fe308e13",
"vartype": "string"
},
{
"name": "shared_preload_libraries",
"value": "neon",
"vartype": "string"
},
{
"name": "synchronous_standby_names",
"value": "walproposer",
"vartype": "string"
},
{
"name": "neon.pageserver_connstring",
"value": "host=127.0.0.1 port=6400",
"vartype": "string"
},
{
"name": "test.escaping",
"value": "here's a backslash \\ and a quote ' and a double-quote \" hooray",
"vartype": "string"
}
]
},
"delta_operations": [
{
"action": "delete_db",
"name": "zenith_test"
},
{
"action": "rename_db",
"name": "DB",
"new_name": "DB2"
},
{
"action": "delete_role",
"name": "zenith2"
},
{
"action": "rename_role",
"name": "zenith new",
"new_name": "zenith \"new\""
}
]
}

View File

@@ -0,0 +1,48 @@
#[cfg(test)]
mod config_tests {
use std::fs::{remove_file, File};
use std::io::{Read, Write};
use std::path::Path;
use compute_tools::config::*;
fn write_test_file(path: &Path, content: &str) {
let mut file = File::create(path).unwrap();
file.write_all(content.as_bytes()).unwrap();
}
fn check_file_content(path: &Path, expected_content: &str) {
let mut file = File::open(path).unwrap();
let mut content = String::new();
file.read_to_string(&mut content).unwrap();
assert_eq!(content, expected_content);
}
#[test]
fn test_line_in_file() {
let path = Path::new("./tests/tmp/config_test.txt");
write_test_file(path, "line1\nline2.1\t line2.2\nline3");
let line = "line2.1\t line2.2";
let result = line_in_file(path, line).unwrap();
assert!(!result);
check_file_content(path, "line1\nline2.1\t line2.2\nline3");
let line = "line4";
let result = line_in_file(path, line).unwrap();
assert!(result);
check_file_content(path, "line1\nline2.1\t line2.2\nline3\nline4");
remove_file(path).unwrap();
let path = Path::new("./tests/tmp/new_config_test.txt");
let line = "line4";
let result = line_in_file(path, line).unwrap();
assert!(result);
check_file_content(path, "line4");
remove_file(path).unwrap();
}
}

Some files were not shown because too many files have changed in this diff Show More