Compare commits


221 Commits

Author SHA1 Message Date
Konstantin Knizhnik
10b90506a0 Use B-Tree instead of R-Tree 2022-10-07 14:57:25 +03:00
Konstantin Knizhnik
5ee4524caa Fix indentation 2022-10-06 22:31:11 +03:00
Konstantin Knizhnik
c5245a9e4f Make clippy happy 2022-10-06 21:47:26 +03:00
Konstantin Knizhnik
9f10195d7b Change LSN range assignment rule for partial image layers 2022-10-06 20:16:13 +03:00
Konstantin Knizhnik
51aa53ab90 Check that image contains search key in LayerMap::search 2022-10-06 16:06:39 +03:00
Konstantin Knizhnik
2359106a9d Fix writer creation in reconstruct_level0 2022-10-06 10:05:35 +03:00
Konstantin Knizhnik
885033ad42 Make reconstruction more intensive 2022-10-06 09:37:37 +03:00
Konstantin Knizhnik
487ec20085 Fix indentation 2022-10-05 15:30:24 +03:00
Konstantin Knizhnik
898937d500 Store partial image layers instead of compaction 2022-10-04 20:42:34 +03:00
Joonas Koivunen
31123d1fa8 Silence clippies, minor doc fix (#2543)
* doc: remove stray backtick

* chore: clippy::let_unit_value

* chore: silence useless_transmute, duplicate_mod

* chore: remove allowing deref_nullptr

not needed since bindgen 0.60.0.

* chore: remove repeated allowed lints

they are already allowed from the crate root.
2022-10-03 17:44:17 +03:00
Kirill Bulatov
4f2ac51bdd Bump rustc to 1.61 2022-10-03 16:36:03 +03:00
Kirill Bulatov
7b2f9dc908 Reuse existing tenants during attach (#2540) 2022-10-03 13:33:55 +03:00
Arthur Petukhovsky
dabb6d2675 Fix log level for sk startup logs (#2526) 2022-09-27 10:36:17 +00:00
Arthur Petukhovsky
fc7087b16f Add metric for loaded safekeeper timelines (#2509) 2022-09-27 11:57:59 +03:00
Vadim Kharitonov
2233ca2a39 seqwait.rs unit tests don't check return value 2022-09-27 11:47:59 +03:00
Dmitry Rodionov
fb68d01449 Preserve task result in TaskHandle by keeping join handle around (#2521)
* Preserve task result in TaskHandle by keeping join handle around

The solution is not great, but it should help to debug the staging issue.
I tried to do it in the least destructive way. TaskHandle is used in only
one place, so it is ok to use something less generic unless we want
to extend its usage across the codebase. In its current form,
for its single usage place, it looks too abstract.

Some problems around this code:
1. A task can drop the event sender and continue running
2. A task cannot be joined several times (probably not needed,
    but still, it can be surprising)
3. Had to split the task event into two types because anyhow::Error
    does not implement Clone. So TaskContinueEvent derives Clone
    but the usual task event does not. The Clone requirement appears
    because we clone the current value in next_task_event.
    Taking it by reference is complicated.
4. The split between Init and Started is artificial and comes from
    watch::channel's requirement to have some initial value.

    To summarize 3 and 4: it may be a better idea to use an
    RwLock or a bounded channel instead.
2022-09-26 20:57:02 +00:00
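The watch-channel-plus-JoinHandle pattern described in the commit above could look roughly like the sketch below. All names here (TaskEvent, spawn_task, wait_for_result) are illustrative, not the actual TaskHandle code; the real event types differ because anyhow::Error is not Clone.

```
use tokio::sync::watch;
use tokio::task::JoinHandle;

// Hypothetical event type. The real code splits "continue" events from the
// final result, since anyhow::Error does not implement Clone.
#[derive(Clone, Debug)]
enum TaskEvent {
    Init, // artificial initial value required by watch::channel
    Started,
    Progress(String),
}

struct TaskHandle {
    events: watch::Receiver<TaskEvent>,
    // Keeping the JoinHandle lets the owner observe the task's final result
    // (or panic) even after the event sender has been dropped.
    join: JoinHandle<anyhow::Result<()>>,
}

fn spawn_task() -> TaskHandle {
    let (tx, rx) = watch::channel(TaskEvent::Init);
    let join = tokio::spawn(async move {
        tx.send(TaskEvent::Started).ok();
        // ... do the actual work, reporting progress along the way ...
        tx.send(TaskEvent::Progress("halfway".into())).ok();
        anyhow::Ok(())
    });
    TaskHandle { events: rx, join }
}

impl TaskHandle {
    // The Clone bound on TaskEvent comes from cloning the current value out
    // of the watch channel here.
    async fn next_task_event(&mut self) -> TaskEvent {
        self.events.changed().await.ok();
        self.events.borrow().clone()
    }

    // Joining consumes the handle, so a task can only be joined once.
    async fn wait_for_result(self) -> anyhow::Result<()> {
        self.join.await?
    }
}
```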
Arthur Petukhovsky
d15116f2cc Update pg_version for old timelines 2022-09-26 19:57:03 +03:00
Stas Kelvich
df45c0d0e5 Disable plv8 again 2022-09-26 12:22:46 +03:00
Stas Kelvich
367cc01290 Fix deploy paths 2022-09-26 10:08:01 +03:00
Anastasia Lubennikova
1165686201 fix deploy lib paths for postgres 2022-09-23 22:23:43 +03:00
Anastasia Lubennikova
093264a695 Fix deploy bin and lib paths for postgres 2022-09-23 22:23:43 +03:00
sharnoff
805bb198c2 Miscellaneous small fixups (#2503)
Changes are:

 * Correct typo "firts" -> "first"
 * Change <empty panic with comment explaining> to <panic with message
   taken from the comment>
 * Fix weird indentation that rustfmt was failing to handle
 * Use existing `anyhow::{anyhow,bail}!` as `{anyhow,bail}!` if it's
   already in scope
 * Spell `Result<T, anyhow::Error>` as `anyhow::Result<T>`
   * In general, closer to matching the rest of the codebase
 * Change usages of `hash_map::Entry` to `Entry` when it's already in
   scope
   * A quick search shows our style on this one varies across the files
     it's used in
2022-09-23 11:49:28 -07:00
Stas Kelvich
5ccd54c699 Add support for h3-pg and re-enable plv8 2022-09-23 19:06:57 +03:00
Dmitry Ivanov
1dffba9de6 Write more tests for the proxy... (#1918)
And change a few more things in the process.
2022-09-23 18:30:44 +03:00
Alexander Bayandin
ebab89ebd2 test_runner: pass password to pgbench via PGPASSWORD (#2468) 2022-09-23 12:51:33 +00:00
MMeent
bc3ba23e0a Fix extreme metrics bloat in storage sync (#2506)
* Fix extreme metrics bloat in storage sync

From 78 metrics per (timeline, tenant) pair down to (max) 10 metrics per
(timeline, tenant) pair, plus another 117 metrics in a global histogram that
replaces the previous per-timeline histogram.

* Drop image sync operation metric series when dropping TimelineMetrics.
2022-09-23 14:35:36 +02:00
Alexander Bayandin
3e65209a06 Nightly Benchmarks: use Postgres binaries from artifacts (#2501) 2022-09-23 12:50:36 +01:00
Dmitry Rodionov
eb0c6bcf1a reenable storage deployments 2022-09-23 14:12:39 +03:00
Rory de Zoete
52819898e4 Extend image push step with production ECR (#2465)
* Extend image push step with production ECR

* Put copy step before auth change

* Use correct name

* Only push on main

* Fix typo
2022-09-23 11:25:29 +02:00
Sergey Melnikov
b0377f750a Add staging-test region to normal staging rollouts (#2500) 2022-09-23 11:25:26 +03:00
Dmitry Rodionov
43560506c0 remove duplicate walreceiver connection span 2022-09-23 00:27:24 +03:00
Anastasia Lubennikova
c81ede8644 Hotfix for safekeeper timelines with unknown pg_version.
Assume DEFAULT_PG_VERSION = 14
2022-09-22 21:17:36 +03:00
Anastasia Lubennikova
eb9200abc8 Use version-specific path in pytest CI script 2022-09-22 18:12:41 +03:00
Anastasia Lubennikova
7c1695e87d fix psql path in export_import_between_pageservers script 2022-09-22 18:12:41 +03:00
Anastasia Lubennikova
8b42c184e7 Update LD_LIBRARY_PATH in deploy scripts 2022-09-22 18:12:41 +03:00
Anastasia Lubennikova
7138db9279 Fix paths to postgres binaries in the deploy script 2022-09-22 18:12:41 +03:00
Egor Suvorov
262fa3be09 pageserver pg proto: add missing auth checks (#2494)
Fixes #1858
2022-09-22 17:07:08 +03:00
Anastasia Lubennikova
5e151192f5 Fix rebase conflicts in safekeeper code 2022-09-22 14:15:13 +03:00
Anastasia Lubennikova
2d012f0d32 Fix rebase conflicts in pageserver code 2022-09-22 14:15:13 +03:00
Anastasia Lubennikova
64f64d5637 Fix after rebase: bump vendor/postgres-v14 to match main 2022-09-22 14:15:13 +03:00
Anastasia Lubennikova
1fa7d6aebf Use DEFAULT_PG_VERSION env in CI pytest 2022-09-22 14:15:13 +03:00
Anastasia Lubennikova
d098542dde Make test_timeline_size_metrics more stable:
Compare size with Vanilla postgres size instead of hardcoded value
2022-09-22 14:15:13 +03:00
Anastasia Lubennikova
eba419fda3 Clean up the pg_version choice code 2022-09-22 14:15:13 +03:00
Anastasia Lubennikova
d8d3cd49f4 Update libs/postgres_ffi/wal_craft/src/bin/wal_craft.rs
Co-authored-by: MMeent <matthias@neon.tech>
2022-09-22 14:15:13 +03:00
Anastasia Lubennikova
3618c242b9 use version specific find_end_of_wal function 2022-09-22 14:15:13 +03:00
Anastasia Lubennikova
ed6b75e301 show pg_version in create_timeline info span 2022-09-22 14:15:13 +03:00
Anastasia Lubennikova
862902f9e5 Update readme and openapi spec 2022-09-22 14:15:13 +03:00
Anastasia Lubennikova
8d890b3cbb fix clippy warnings 2022-09-22 14:15:13 +03:00
Anastasia Lubennikova
0fde59aa46 use pg_version in python tests 2022-09-22 14:15:13 +03:00
Anastasia Lubennikova
1255ef806f pass version to wal_craft.rs 2022-09-22 14:15:13 +03:00
Anastasia Lubennikova
5dddeb8d88 Use non-versioned pg_distrib dir 2022-09-22 14:15:13 +03:00
Anastasia Lubennikova
d45de3d58f update build scripts to match pg_distrib_dir versioning schema 2022-09-22 14:15:13 +03:00
Anastasia Lubennikova
a69e060f0f fix clippy warning 2022-09-22 14:15:13 +03:00
Anastasia Lubennikova
a4397d43e9 Rename waldecoder -> waldecoder_handler.rs. Add comments 2022-09-22 14:15:13 +03:00
Anastasia Lubennikova
03c606f7c5 Pass pg_version parameter to timeline import command.
Add pg_version field to LocalTimelineInfo.
Use pg_version in the export_import_between_pageservers script
2022-09-22 14:15:13 +03:00
Anastasia Lubennikova
9dfede8146 Handle backwards-compatibility of TimelineMetadata.
This commit bumps TimelineMetadata format version and makes it independent from STORAGE_FORMAT_VERSION.
2022-09-22 14:15:13 +03:00
Anastasia Lubennikova
86bf491981 Support pg 15
- Split postgres_ffi into two version specific files.
- Preserve pg_version in timeline metadata.
- Use pg_version in safekeeper code. Check for postgres major version mismatch.
- Clean up the code to use DEFAULT_PG_VERSION constant everywhere, instead of hardcoding.

-  Parameterize python tests: use DEFAULT_PG_VERSION env and pg_version fixture.
   To run tests using a specific PostgreSQL version, pass the DEFAULT_PG_VERSION environment variable:
   'DEFAULT_PG_VERSION='15' ./scripts/pytest test_runner/regress'
 Currently not all tests pass, because the Rust code relies on the default version of PostgreSQL in a few places.
2022-09-22 14:15:13 +03:00
Dmitry Rodionov
e764c1e60f remove self argument from several spans 2022-09-22 11:41:04 +03:00
Konstantin Knizhnik
f3073a4db9 R-Tree layer map (#2317)
Replace the layer array and linear search with R-tree

So far, the in-memory layer map that holds information about layer
files that exist, has used a simple Vec, in no particular order, to
hold information about all the layers. That obviously doesn't scale
very well; with thousands of layer files the linear search was
consuming a lot of CPU. Replace it with a two-dimensional R-tree, with
Key and LSN ranges as the dimensions.

For the R-tree, use the 'rstar' crate. To be able to use that, we
convert the Keys and LSNs into 256-bit integers. 64 bits would be
enough to represent LSNs, and 128 bits would be enough to represent
Keys. However, we use 256 bits, because rstar internally performs
multiplication to calculate the area of rectangles, and the result of
multiplying two 128 bit integers doesn't necessarily fit in 128 bits,
causing integer overflow and, if overflow-checks are enabled,
panic. To avoid that, we use 256 bit integers.

Add a performance test that creates a lot of layer files, to
demonstrate the benefit.
2022-09-22 08:35:06 +03:00
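A rough sketch of how layer entries can be indexed with the 'rstar' crate as described above. LayerEntry and the i64 coordinates are simplifications for illustration; as the commit explains, the real code widens Keys and LSNs to 256-bit integers to avoid overflow inside rstar.

```
use rstar::{RTree, RTreeObject, AABB};

// Hypothetical stand-in for a layer file descriptor: a rectangle in
// (key, lsn) space. i64 is used here only for brevity.
struct LayerEntry {
    key_range: (i64, i64),
    lsn_range: (i64, i64),
    path: String,
}

impl RTreeObject for LayerEntry {
    type Envelope = AABB<[i64; 2]>;

    fn envelope(&self) -> Self::Envelope {
        AABB::from_corners(
            [self.key_range.0, self.lsn_range.0],
            [self.key_range.1, self.lsn_range.1],
        )
    }
}

fn search(tree: &RTree<LayerEntry>, key: i64, lsn: i64) -> Option<&LayerEntry> {
    // All layers whose rectangle covers the query point; picking the one
    // whose LSN range starts latest is a stand-in for the real search policy.
    tree.locate_in_envelope_intersecting(&AABB::from_point([key, lsn]))
        .max_by_key(|layer| layer.lsn_range.0)
}
```

RTree::bulk_load(entries) can then build the index from all known layer files at startup instead of inserting them one by one.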
Dmitry Ivanov
e9a103c09f [proxy] Pass extra parameters to the console (#2467)
With this change we now pass additional params
to the console's auth methods.
2022-09-21 21:42:47 +03:00
Arthur Petukhovsky
7eebb45ea6 Reduce metrics footprint in safekeeper (#2491)
Fixes bugs with metrics in control_file and wal_storage, where we haven't deleted metrics for inactive timelines.
2022-09-21 19:13:30 +03:00
Alexander Bayandin
19fa410ff8 NeonCompare: switch to new pageserver HTTP API 2022-09-21 15:17:55 +01:00
Heikki Linnakangas
b82e2e3f18 Bump postgres submodules and update docs/core_changes.md.
The old change to downgrade a WARNING in postgres vacuumlazy.c was
reverted.
2022-09-21 17:14:29 +03:00
Kirill Bulatov
71c92e0db1 Use prebuilt image with Hakari for CI style checks (#2488) 2022-09-21 10:13:11 +00:00
sharnoff
6f949e1556 Improve pageserver/safekeepeer HTTP API errors (#2461)
Part of the general work on improving pageserver logs.

Brief summary of changes:

* Remove `ApiError::from_err`
* Remove `impl From<anyhow::Error> for ApiError`
* Convert `ApiError::{BadRequest, NotFound}` to use `anyhow::Error`
  * Note: `NotFound` has more verbose formatting because it's more
    likely to have useful information for the receiving "user"
* Explicitly convert from `tokio::task::JoinError`s into
  `InternalServerError`s where appropriate

Also note: many of the places where errors were implicitly converted to
500s have now been updated to return a more appropriate error. Some
places where it's not yet possible to distinguish the error types have
been left as 500s.
2022-09-20 17:02:10 -07:00
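Sketch of the error shape the commit above describes. The variant names mirror the commit text, but this is not the actual pageserver ApiError; the thiserror derive and the hyper/http StatusCode are assumptions made for illustration.

```
use hyper::StatusCode;

#[derive(thiserror::Error, Debug)]
enum ApiError {
    #[error("Bad request: {0:#}")]
    BadRequest(anyhow::Error),
    // NotFound keeps the more verbose formatting, since it is the variant
    // most likely to carry information useful to the receiving "user".
    #[error("Not found: {0:?}")]
    NotFound(anyhow::Error),
    #[error("Internal server error: {0:#}")]
    InternalServerError(anyhow::Error),
}

impl ApiError {
    fn status(&self) -> StatusCode {
        match self {
            ApiError::BadRequest(_) => StatusCode::BAD_REQUEST,
            ApiError::NotFound(_) => StatusCode::NOT_FOUND,
            ApiError::InternalServerError(_) => StatusCode::INTERNAL_SERVER_ERROR,
        }
    }
}

// JoinErrors from spawned tasks are converted explicitly, instead of relying
// on a blanket `impl From<anyhow::Error>` that turns everything into a 500.
impl From<tokio::task::JoinError> for ApiError {
    fn from(e: tokio::task::JoinError) -> Self {
        ApiError::InternalServerError(anyhow::Error::new(e))
    }
}
```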
Kirill Bulatov
8d7024a8c2 Move path manipulation function to utils 2022-09-20 23:43:52 +03:00
Kirill Bulatov
6b8dcad1bb Unify timeline creation steps 2022-09-20 23:43:52 +03:00
Kirill Bulatov
310c507303 Merge path retrieval methods in config.rs 2022-09-20 23:43:52 +03:00
Kirill Bulatov
6fc719db13 Merge timelines.rs with tenant.rs 2022-09-20 23:43:52 +03:00
sharnoff
4a3b3ff11d Move testing pageserver libpq cmds to HTTP api (#2429)
Closes #2422.

The APIs have been feature gated with the `testing_api!` macro so that
they return 400s when support hasn't been compiled in.
2022-09-20 11:28:12 -07:00
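A minimal sketch of what a feature-gating macro in the spirit of `testing_api!` could look like; the real macro and its error handling in the pageserver may differ, and it presumably removes the body entirely with `#[cfg]`, whereas `cfg!` is used here only to keep the sketch short.

```
// `$name` and the error text are illustrative.
macro_rules! testing_api {
    ($name:literal, $body:expr) => {
        if cfg!(feature = "testing") {
            $body
        } else {
            Err(anyhow::anyhow!(
                "{} is only available when built with the 'testing' feature",
                $name
            ))
        }
    };
}

// Hypothetical usage inside a handler returning anyhow::Result<Response>:
// let response = testing_api!("failpoint", configure_failpoint(request))?;
```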
sharnoff
4b25b9652a Rename more zid-like idents (#2480)
Follow-up to PR #2433 (b8eb908a). There's still a few more unresolved
locations that have been left as-is for the same compatibility reasons
in the original PR.
2022-09-20 11:06:31 -07:00
Heikki Linnakangas
a5019bf771 Use a simpler way to set extra options for benchmark test.
Commit 43a4f7173e fixed the case that there are extra options in the
connection string, but broke it in the case when there are not. Fix
that. But on second thought, it's more straightforward to set the
options with ALTER DATABASE, so change the workflow yaml file to do
that instead.
2022-09-20 13:48:50 +03:00
Kirill Bulatov
7863c4a702 Regenerate Hakari files, add a CI check for that 2022-09-20 11:39:10 +03:00
Arthur Petukhovsky
566e816298 Refactor safekeeper timelines handling (#2329)
See https://github.com/neondatabase/neon/pull/2329 for details
2022-09-20 07:42:39 +00:00
Heikki Linnakangas
e4f775436f Don't override other options than statement_timeout in test conn string.
In commit 6985f6cd6c, I tried passing extra GUCs in the 'options' part
of the connection string, but it didn't work because the pgbench test
overrode it with the statement_timeout. Change it so that it adds the
statement_timeout to any other options, instead of replacing them.
2022-09-20 09:46:15 +03:00
Alexander Bayandin
bb3c66d86f github/workflows: Make publishing perf reports more configurable (#2440) 2022-09-19 22:28:51 +00:00
Heikki Linnakangas
6985f6cd6c Add a new benchmark data series for prefetching.
Also run benchmarks with the seqscan prefetching (commit f44afbaf62)
enabled.

Renames the 'neon-captest' test to 'neon-captest-reuse', for clarity
2022-09-19 20:56:11 +03:00
Dmitry Rodionov
fcb4a61a12 Adjust spans around gc and compaction
So compaction and gc loops have their own span to always show tenant id
in log messages.
2022-09-19 20:08:38 +03:00
Dmitry Rodionov
4b5e7f2f82 Temporarily disable storage deployments
Do not update configs
Do not restart services
Still update binaries
2022-09-19 17:03:20 +03:00
Anastasia Lubennikova
d11cb4b2f1 Bump vendor/postgres-v15 to the latest state of REL_15_STABLE_neon branch 2022-09-19 15:12:05 +03:00
Sergey Melnikov
90ed12630e Add zenith-us-stage-ps-4 and undo changes in prefix_in_bucket in pageserver config (#2473)
* Add zenith-us-stage-ps-4

* Undo changes in prefix_in_bucket in pageserver config (Rollback #2449)
2022-09-19 12:57:44 +02:00
Konstantin Knizhnik
846d126579 Set last written lsn for created relation (#2398)
* Set last written lsn for created relation

* use current LSN for updating last written LSN of relation metadata

* Update LSN for the extended blocks even for pages without LSN (zeroed)

* Update pgxn/neon/pagestore_smgr.c

Co-authored-by: Heikki Linnakangas <heikki@neon.tech>

Co-authored-by: Heikki Linnakangas <heikki@neon.tech>
2022-09-19 12:56:08 +03:00
Egor Suvorov
c9c3c77c31 Fix Docker image builds (follow-up for #2458) (#2469)
Put ninstall.sh inside Docker images for building
2022-09-16 20:51:35 +03:00
Kirill Bulatov
b46c8b4ae0 Add an alias to build test images simply 2022-09-16 18:58:41 +03:00
Egor Suvorov
65a5010e25 Use custom install command in Makefile to speed up incremental builds (#2458)
Fixes #1873: previously any run of `make` caused the `postgres-v15-headers`
target to build. It copied a bunch of headers via `install -C`. Unfortunately,
some origins were symlinks in the `./pg_install/build` directory pointing
inside `./vendor/postgres-v15` (e.g. `pg_config_os.h` pointing to `linux.h`).

GNU coreutils' `install` ignores the `-C` key for non-regular files and
always overwrites the destination if the origin is a symlink. That in turn
made Cargo rebuild the `postgres_ffi` crate and all its dependencies because
it thinks that Postgres headers changed, even if they did not. That was slow.

Now we use a custom script that wraps the `install` program. It handles one
specific case and makes sure individual headers are never copied if their
content did not change. Hence, `postgres_ffi` is not rebuilt unless there were
some changes to the C code.

One may still have slow incremental single-threaded builds because Postgres
Makefiles spawn about 2800 sub-makes even if no files have been changed.
A no-op build takes "only" 3-4 seconds on my machine now when run with `-j30`,
and 20 seconds when run with `-j1`.
2022-09-16 15:44:02 +00:00
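The idea behind the wrapper, sketched in Rust for illustration: never touch a destination file whose content is already identical, so cargo sees no change. The real scripts/ninstall.sh is a shell script; install_if_changed is a hypothetical helper.

```
use std::fs;
use std::io;
use std::path::Path;

// Copy src to dst only if dst is missing or its bytes differ. fs::read
// follows symlinks, so symlinked headers are compared by content, which is
// exactly the case `install -C` mishandles.
fn install_if_changed(src: &Path, dst: &Path) -> io::Result<bool> {
    let new = fs::read(src)?;
    match fs::read(dst) {
        Ok(old) if old == new => Ok(false), // identical: leave mtime alone
        _ => {
            fs::write(dst, &new)?;
            Ok(true)
        }
    }
}
```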
sharnoff
9c35a09452 Improve build errors when postgres_ffi fails (#2460)
This commit does two things of note:

 1. Bumps the bindgen dependency from `0.59.1` to `0.60.1`. This gets us
    an actual error type from bindgen, so we can display what's wrong.
 2. Adds `anyhow` as a build dependency, so our error message can be
    prettier. It's already used heavily elsewhere in the crates in this
    repo, so I figured the fact it's a build dependency doesn't matter
    much.

I ran into this from running `cargo <cmd>` without running `make` first.
Here's a comparison of the compiler output in those two cases.

Before this commit:

```
error: failed to run custom build command for `postgres_ffi v0.1.0 ($repo_path/libs/postgres_ffi)`

Caused by:
  process didn't exit successfully: `$repo_path/target/debug/build/postgres_ffi-2f7253b3ad3ca840/build-script-build` (exit status: 101)
  --- stdout
  cargo:rerun-if-changed=bindgen_deps.h

  --- stderr
  bindgen_deps.h:7:10: fatal error: 'c.h' file not found
  bindgen_deps.h:7:10: fatal error: 'c.h' file not found, err: true
  thread 'main' panicked at 'Unable to generate bindings: ()', libs/postgres_ffi/build.rs:135:14
  note: run with `RUST_BACKTRACE=1` environment variable to display a backtrace
```

After this commit:

```
error: failed to run custom build command for `postgres_ffi v0.1.0 ($repo_path/libs/postgres_ffi)`

Caused by:
  process didn't exit successfully: `$repo_path/target/debug/build/postgres_ffi-e01fb59602596748/build-script-build` (exit status: 1)
  --- stdout
  cargo:rerun-if-changed=bindgen_deps.h

  --- stderr
  bindgen_deps.h:7:10: fatal error: 'c.h' file not found
  Error: Unable to generate bindings

  Caused by:
      clang diagnosed error: bindgen_deps.h:7:10: fatal error: 'c.h' file not found
```
2022-09-16 08:37:44 -07:00
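A sketch of the kind of build.rs change described above, assuming bindgen >= 0.60 so that `generate()` returns a real error type that anyhow can wrap with context; this is not the actual postgres_ffi build script.

```
// build.rs (sketch)
use anyhow::Context;

fn main() -> anyhow::Result<()> {
    println!("cargo:rerun-if-changed=bindgen_deps.h");

    // With bindgen >= 0.60, generate() returns Result<Bindings, BindgenError>
    // instead of Result<_, ()>, so the clang diagnostics survive into the
    // error chain instead of ending in a bare panic.
    let bindings = bindgen::Builder::default()
        .header("bindgen_deps.h")
        .generate()
        .context("Unable to generate bindings")?;

    let out_dir = std::path::PathBuf::from(std::env::var("OUT_DIR")?);
    bindings
        .write_to_file(out_dir.join("bindings.rs"))
        .context("Couldn't write bindings")?;
    Ok(())
}
```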
Dmitry Rodionov
44fd4e3c9f add more logs 2022-09-16 18:14:05 +03:00
Dmitry Rodionov
4db15d3c7c change prefix_in_bucket in pageserver config 2022-09-16 18:14:05 +03:00
Alexander Bayandin
72b33997c7 Nightly Benchmarks: trigger tests earlier (#2463) 2022-09-16 09:09:54 +00:00
Kirill Bulatov
74312e268f Tidy up storage artifact build flags
* Simplify test build features handling
* Build only necessary binaries during the release build
2022-09-16 11:17:41 +03:00
sharnoff
db5ec0dae7 Cleanup/simplify logical size calculation (#2459)
Should produce identical results; replaces an error case that shouldn't
be possible with `expect`.
2022-09-15 23:50:46 -07:00
Kirill Bulatov
031e57a973 Disable failpoints by default 2022-09-16 09:26:29 +03:00
bojanserafimov
96e867642f Validate tenant create options (#2450)
Co-authored-by: Kirill Bulatov <kirill@neon.tech>
2022-09-15 18:20:23 -04:00
Egor Suvorov
e968b5e502 tests: do not set num_safekeepers = 1, it's the default (#2457)
Also get rid of the `with_safekeepers` parameter in tests.
Its meaning has changed: `False` meant "no safekeepers" which is not
supported anymore, so we assume it's always `True`.

See #1648
2022-09-15 21:43:51 +03:00
Egor Suvorov
9d9d8e9519 docs/sourcetree: update CLion set up instructions (#2454)
After #2325 the old method no longer works as our Makefile does not print compilation commands when run with --dry-run, see https://github.com/neondatabase/neon/issues/2378#issuecomment-1241421325

This method is much slower but is hopefully robust.

Add some more notes while we're here.
2022-09-15 17:16:07 +00:00
Heikki Linnakangas
1062e57fee Don't run codestyle checks separately for Postgres v14 and v15.
Previously, we compiled neon separately for Postgres v14 and v15, for
the codestyle checks. But that was bogus; we actually just ran "make
postgres", which always compiled both versions. The version really only
affected the caching.

Fix that, by copying the build steps from the main build_and_test.yml
workflow.
2022-09-15 16:33:42 +03:00
dependabot[bot]
a8d9732529 Bump axum-core from 0.2.7 to 0.2.8
Bumps [axum-core](https://github.com/tokio-rs/axum) from 0.2.7 to 0.2.8.
- [Release notes](https://github.com/tokio-rs/axum/releases)
- [Changelog](https://github.com/tokio-rs/axum/blob/main/CHANGELOG.md)
- [Commits](https://github.com/tokio-rs/axum/compare/axum-core-v0.2.7...axum-core-v0.2.8)

---
updated-dependencies:
- dependency-name: axum-core
  dependency-type: indirect
...

Signed-off-by: dependabot[bot] <support@github.com>
2022-09-15 14:07:00 +01:00
Alexey Kondratov
757e2147c1 Follow-up for neondatabase/neon#2448 (#2452)
* remove `legacy` mode from the proxy readme
* explicitly specify `authBackend` in the link auth proxy helm-values
  for all envs
2022-09-15 15:21:22 +03:00
Dmitry Ivanov
87bf7be537 [proxy] Drop support for legacy cloud API (#2448)
Apparently, it no longer exists in the cloud.
2022-09-14 21:27:47 +03:00
Heikki Linnakangas
f86ea09323 Avoid recompiling postgres_ffi every time you run "make".
Running "make" at the top level calls "make install" to install the
PostgreSQL headers into the pg_install/ directory. That always updated
the modification time of the headers even if there were no changes,
triggering recompilation of the postgres_ffi bindings. To avoid that,
use 'install -C', to install the PostgreSQL headers.

However, there was an upstream PostgreSQL issue that the
src/include/Makefile didn't respect the INSTALL configure option. That
was just fixed in upstream PostgreSQL, so cherry-pick that fix to our
vendor/postgres repositories.

Fixes https://github.com/neondatabase/neon/issues/1873.
2022-09-14 15:18:18 +03:00
Alexander Bayandin
d87c9e62d6 Nightly Benchmarks: perform tests on both pre-created and fresh projects (#2443) 2022-09-14 10:53:34 +00:00
Heikki Linnakangas
c3096532f9 Fix vendor/postgres-v15 to point to correct v15 branch.
Commit f44afbaf62 updated vendor/postgres-v15 to point to a commit that
was built on top of PostgreSQL 14 rather than 15. So we accidentally had
two copies of PostgreSQL v14 in the repository. Oops. This updates
it to point to the correct version.
2022-09-14 09:23:51 +03:00
Kirill Bulatov
6db6e7ddda Use backward-compatible safekeeper code 2022-09-14 08:14:05 +03:00
Kirill Bulatov
b8eb908a3d Rename old project name references 2022-09-14 08:14:05 +03:00
Kirill Bulatov
260ec20a02 Reformat pgxn code, add typedefs.list that was used 2022-09-14 08:14:05 +03:00
Dmitry Rodionov
ba8698bbcb update neon_local output in readme 2022-09-14 01:41:27 +03:00
Egor Suvorov
35761ac6b6 docs/sourcetree: add info about IDE config (#2332) 2022-09-13 21:55:18 +00:00
Kirill Bulatov
32b7259d5e Timeline data management RFC (#2152) 2022-09-13 22:37:20 +03:00
Dmitry Rodionov
1d53173e62 update openapi spec (tenant state has changed) 2022-09-13 21:53:59 +03:00
Alexander Bayandin
d4d57ea2dd github/workflows: fix project creation via API (#2437) 2022-09-13 20:26:26 +02:00
Dmitry Rodionov
db0c49148d clean up metrics in handle_pagerequests 2022-09-13 21:21:45 +03:00
Alexander Bayandin
59d04ab66a test_runner: redact passwords from log messages (#2434) 2022-09-13 17:24:11 +00:00
Kirill Bulatov
1a8c8b04d7 Merge Repository and Tenant entities, rework tenant background jobs 2022-09-13 15:39:39 +03:00
Konstantin Knizhnik
f44afbaf62 Changes of neon extension to support local prefetch (#2369)
* Changes of neon extension to support local prefetch

* Catch exceptions in pageserver_receive

* Bump postgres version

* Bump postgres version

* Bump postgres version

* Bump postgres version
2022-09-13 12:26:20 +03:00
Alexander Bayandin
4f7557fb58 github/workflows: Create projects using API (#2403)
* github/actions: add neon projects related actions
* workflows/benchmarking: create projects using API
* workflows/pg_clients: create projects using API
2022-09-13 09:45:45 +01:00
Kirill Bulatov
2a837d7de7 Create tenants in temporary directory first (#2426) 2022-09-12 21:04:33 +00:00
Heikki Linnakangas
40c845e57d Switch to async for all concurrency in the pageserver.
Instead of spawning helper threads, we now use Tokio tasks. There
are multiple Tokio runtimes, for different kinds of tasks. One for
serving libpq client connections, another for background operations
like GC and compaction, and so on. That's not strictly required, we
could use just one runtime, but with this you can still get an
overview of what's happening with "top -H".

There's one subtle behavior in how TenantState is updated. Before this
patch, if you deleted all timelines from a tenant, its GC and
compaction loops were stopped, and the tenant went back to Idle
state. We no longer do that. The empty tenant stays Active. The
changes to test_tenant_tasks.py are related to that.

There's still plenty of synchronous code and blocking. For example, we
still use blocking std::io functions for all file I/O, and the
communication with WAL redo processes is still uses low-level unix
poll(). We might want to rewrite those later, but this will do for
now. The model is that local file I/O is considered to be fast enough
that blocking - and preventing other tasks running in the same thread -
is acceptable.
2022-09-12 14:21:00 +03:00
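The "one runtime per task kind" setup could be wired up roughly as follows; the runtime names and once_cell statics are assumptions for illustration, not the pageserver's actual code. Naming the worker threads is what makes the runtimes distinguishable in "top -H".

```
use once_cell::sync::Lazy;
use tokio::runtime::{Builder, Runtime};

static COMPUTE_REQUEST_RUNTIME: Lazy<Runtime> = Lazy::new(|| {
    Builder::new_multi_thread()
        .thread_name("compute request worker")
        .enable_all()
        .build()
        .expect("failed to create compute request runtime")
});

static BACKGROUND_RUNTIME: Lazy<Runtime> = Lazy::new(|| {
    Builder::new_multi_thread()
        .thread_name("background op worker")
        .enable_all()
        .build()
        .expect("failed to create background runtime")
});

fn start_background_loops() {
    // GC and compaction run on their own runtime, so they never compete for
    // the same worker threads as libpq client connections.
    BACKGROUND_RUNTIME.spawn(async {
        // ... GC / compaction loop ...
    });
}
```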
Kirill Bulatov
698d6d0bad Use stable coverage API with rustc 1.60 2022-09-12 13:44:54 +03:00
Heikki Linnakangas
a48f9f377d Fix typo in issue template 2022-09-10 01:23:19 +03:00
Kirill Bulatov
18dafbb9ba Remove deceiving rust version from the CI files 2022-09-09 22:32:00 +03:00
Kirill Bulatov
648e86e9df Use Debian images with libc 2.31 to build legacy compute tools 2022-09-09 22:32:00 +03:00
Kirill Bulatov
923f642549 Collect cargo build timings 2022-09-09 22:32:00 +03:00
Kirill Bulatov
31ec3b7906 Use the toolchain file to define current rustc version used 2022-09-09 22:32:00 +03:00
Kirill Bulatov
c9e7c2f014 Ensure all temporary and empty directories and files are cleansed on pageserver startup 2022-09-09 16:36:45 +03:00
Kirill Bulatov
d3f83eda52 Use regular agent for triggering e2e tests 2022-09-09 16:07:43 +03:00
Dmitry Rodionov
0b76b82e0e review clean up 2022-09-08 19:59:42 +03:00
Heikki Linnakangas
35b4816f09 Turn GenericRemoteStorage into just a newtype around 'Arc<dyn RemoteStorage>'
We had a pattern like this:

    match remote_storage {
        GenericRemoteStorage::Local(storage) => {
            let source = storage.remote_object_id(&file_path)?;
            ...
            storage
                .function(&source, ...)
                .await
       },
       GenericRemoteStorage::S3(storage) => {
	    ... exact same code as for the Local case ...
       },

This removes the code duplication, by allowing you to call the functions
directly on GenericRemoteStorage.

Also change RemoteObjectId to be just a type alias for String. Now that
the callers of GenericRemoteStorage functions don't know whether they're
dealing with the LocalFs or S3 implementation, RemoteObjectId must be the
same type for both.
2022-09-08 19:59:42 +03:00
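Sketch of the newtype shape described above; the RemoteStorage trait methods shown here are illustrative, not the exact remote_storage API.

```
use std::path::Path;
use std::sync::Arc;

type RemoteObjectId = String; // now a plain alias shared by both backends

#[async_trait::async_trait]
trait RemoteStorage: Send + Sync {
    fn remote_object_id(&self, local_path: &Path) -> anyhow::Result<RemoteObjectId>;
    async fn download(&self, from: &RemoteObjectId) -> anyhow::Result<Vec<u8>>;
}

// The newtype: callers no longer match on LocalFs vs S3, they call straight
// through the trait object.
#[derive(Clone)]
struct GenericRemoteStorage(Arc<dyn RemoteStorage>);

impl std::ops::Deref for GenericRemoteStorage {
    type Target = dyn RemoteStorage;
    fn deref(&self) -> &Self::Target {
        self.0.as_ref()
    }
}

async fn fetch(storage: &GenericRemoteStorage, path: &Path) -> anyhow::Result<Vec<u8>> {
    let source = storage.remote_object_id(path)?;
    storage.download(&source).await
}
```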
Alexander Bayandin
171385ac14 Pass COPT and PG_CFLAGS to Extension's CFLAGS (#2405)
* fix incompatible-function-pointer-types warning
* Pass COPT and PG_CFLAGS  to Extension's CFLAGS
2022-09-08 16:02:11 +01:00
MMeent
1351beae19 Fix race condition in ginHeapTupleFastInsert (#2412)
Because the metadata was not locked, it could be updated concurrently
such that we wouldn't actually have the tail block.

The current ordering works better, as we still only start XLogBeginInsert()
once we have all potentially interesting buffers loaded in memory, but
still have correct lock lifetimes.

See also: access/transam/README section Write-Ahead Log Coding
2022-09-08 10:57:30 +00:00
Alexander Bayandin
9e3136ea37 scripts/ingest_regress_test_result.py: fix json data insertion (#2408) 2022-09-07 21:40:08 +01:00
Alexander Bayandin
83dca73f85 Store Allure tests statistics in database (#2367) 2022-09-07 14:16:48 +01:00
Lassi Pölönen
dc2150a90e Add built files to gitignore (#2404) 2022-09-07 12:11:03 +00:00
Anastasia Lubennikova
2794cd83c7 Prepare pg 15 support (generate bindings for pg15) (#2396)
Another preparatory commit for pg15 support:
* generate bindings for both pg14 and pg15;
* update Makefile and CI scripts: now neon build depends on both PostgreSQL versions;
* some code refactoring to decrease version-specific dependencies.
2022-09-07 12:40:48 +03:00
Heikki Linnakangas
65b592d4bd Remove deprecated management API for timeline detach.
It is no longer used anywhere.
2022-09-06 18:54:04 +03:00
Heikki Linnakangas
f441fe57d4 Register prometheus counters correctly.
Commit f081419e68 moved all the prometheus counters to `metrics.rs`,
but accidentally replaced a couple of `register_int_counter!(...)`
calls with just `IntCounter::new(...)`. Because of that, the counters
were not registered in the metrics registry, and were not exposed
through the metrics HTTP endpoint.

Fixes failures we're seeing in a bunch of 'performance' tests because
of the missing metrics.
2022-09-06 17:38:17 +03:00
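The difference the fix addresses, sketched with the prometheus crate (metric names and statics are made up):

```
use once_cell::sync::Lazy;
use prometheus::{register_int_counter, IntCounter};

// IntCounter::new only constructs the counter. It is never added to the
// default registry, so it never shows up on the metrics HTTP endpoint.
static NOT_EXPORTED: Lazy<IntCounter> = Lazy::new(|| {
    IntCounter::new("pageserver_example_total", "constructed but not registered").unwrap()
});

// register_int_counter! both constructs and registers the counter with the
// default registry, which is what the metrics HTTP endpoint serves.
static EXPORTED: Lazy<IntCounter> = Lazy::new(|| {
    register_int_counter!("pageserver_example_registered_total", "registered counter").unwrap()
});
```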
Heikki Linnakangas
cf157ad8e4 Add test that repeatedly kills and restarts the pageserver.
This caught or reproduced several bugs when I originally wrote this test
back in May, including #1731, #1740, #1751, and #707. I believe all the
issues have been fixed now, but since this was a very fruitful test,
let's add it to the test suite.

We didn't commit this earlier, because the test was very slow especially
with a debug build. We've since changed the build options so that even
the debug builds are not quite so slow anymore.
2022-09-06 13:00:40 +03:00
Lassi Pölönen
f081419e68 Cleanup tenant specific metrics once a tenant is detached. (#2328)
* Add test for pageserver metric cleanup once a tenant is detached.

* Remove tenant specific timeline metrics on detach.

* Use definitions from timeline_metrics in page service.

* Move metrics to own file from layered_repository/timeline.rs

* TIMELINE_METRICS: define smgr metrics

* REMOVE SMGR cleanup from timeline_metrics. Doesn't seem to work as
expected.

* Virtual file centralized metrics, except for evicted file as there's no
tenant id or timeline id.

* Use STORAGE_TIME from timeline_metrics in layered_repository.

* Remove timelineless gc metrics for tenant on detach.

* Rename timeline metrics -> metrics as it's more generic.

* Don't create a TimelineMetrics instance for VirtualFile

* Move the rest of the metric definitions to metrics.rs too.

* UUID -> ZTenantId

* Use consistent style for dict.

* Use Repository's Drop trait for dropping STORAGE_TIME metrics.

* No need for Arc, TimelineMetrics is used in just one place. Due to that,
we can fall back to using ZTenantId and ZTimelineId too to avoid additional
string allocation.
2022-09-06 11:30:20 +03:00
Anastasia Lubennikova
05e263d0d3 Prepare pg 15 support (build system and submodules) (#2337)
* Add submodule postgres-15

* Support pg_15 in pgxn/neon

* Renamed zenith -> neon in Makefile

* fix name of codestyle check

* Refactor build system to prepare for building multiple Postgres versions.

Rename "vendor/postgres" to "vendor/postgres-v14"

Change Postgres build and install directory paths to be version-specific:

- tmp_install/build -> pg_install/build/14
- tmp_install/* -> pg_install/14/*

And Makefile targets:

- "make postgres" -> "make postgres-v14"
- "make postgres-headers" -> "make postgres-v14-headers"
- etc.

Add Makefile aliases:

- "make postgres" to build "postgres-v14" and in future, "postgres-v15"
- similarly for "make postgres-headers"

Fix POSTGRES_DISTRIB_DIR path in pytest scripts

* Make postgres version a variable in codestyle workflow

* Support vendor/postgres-v15 in codestyle check workflow

* Support postgres-v15 building in Makefile

* fix pg version in Dockerfile.compute-node

* fix kaniko path

* Build neon extensions in version-specific directories

* fix obsolete mentions of vendor/postgres

* use vendor/postgres-v14 in Dockerfile.compute-node.legacy

* Use PG_VERSION_NUM to gate dependencies in inmem_smgr.c

* Use versioned ECR repositories and image names for compute-node.
The image name format is compute-node-vXX, where XX is the postgres major version number.
For now only v14 is supported.
The old unversioned name (compute-node) is kept, because the cloud repo depends on it.

* update vendor/postgres submodule url (zenith->neondatabase rename)

* Fix postgres path in python tests after rebase

* fix path in regress test

* Use separate dockerfiles to build compute-node:
Dockerfile.compute-node-v15 should be identical to Dockerfile.compute-node-v14 except for the version number.
This is a hack, because Kaniko doesn't support build ARGs properly

* bump vendor/postgres-v14 and vendor/postgres-v15

* Don't use Kaniko cache for v14 and v15 compute-node images

* Build compute-node images for different versions in different jobs

Co-authored-by: Heikki Linnakangas <heikki@neon.tech>
2022-09-05 18:30:54 +03:00
Alexander Bayandin
ee0071e90d Fix nightly benchmark reports (#2392) 2022-09-05 14:30:37 +01:00
Stas Kelvich
772078eb5c Reword proxy SNI error message
Be more strict with project id/name difference and explain how to
get project id out of the domain name.
2022-09-05 15:13:29 +03:00
Konstantin Knizhnik
ad057124be Update relation size cache only when latest LSN is requested (#2310)
* Update relation size cache only when latest LSN is requested

* Fix tests

* Add a test case for timetravel query after pageserver restart.

This test is currently failing, the queries return incorrect results.
I don't know why, needs to be investigated.

    FAILED test_runner/batch_others/test_readonly_node.py::test_timetravel - assert 85 == 100000

If you remove the pageserver restart from the test, it passes.

* yapf3 test_readonly_node.py

* Add comment about cache correction in case of setting incorrect latest flag

* Fix formatting for test_readonly_node.py

* Remove unused imports

* Fix mypy warning for test_readonly_node.py

* Fix formatting of test_readonly_node.py

* Bump postgres version

Co-authored-by: Heikki Linnakangas <heikki@neon.tech>
2022-09-05 13:12:02 +03:00
Heikki Linnakangas
aeb1cf9c36 Fix misc typos and grammar in comments. 2022-09-05 11:09:32 +03:00
Heikki Linnakangas
7a3e8bb7fb Make tracing span names consistent for mgmt API handlers. 2022-09-05 11:02:13 +03:00
Konstantin Knizhnik
846d71b948 Add test for last written lsn cache (#1949)
* Fix python style

* Fix import of test_backpressure in test_latency

* Apply changes to moved neon extension

* Apply changes to moved neon extension

* Merge with main

* Update pgxn/neon/pagestore_smgr.c

Co-authored-by: Heikki Linnakangas <heikki@zenith.tech>

* Bump postgres version

Co-authored-by: Heikki Linnakangas <heikki@zenith.tech>
2022-09-04 22:25:32 +03:00
Kirill Bulatov
2b6c49b2ea Fix negative usize parsing 2022-09-03 23:54:00 +03:00
Konstantin Knizhnik
eef7475408 Add tests for measuring effect of lsn caching (#2384)
* Add tests for measuring effect of lsn caching

* Fix formatting of test_latency.py

* Fix test_lsn_mapping test
2022-09-03 17:06:19 +03:00
Konstantin Knizhnik
71c965b0e1 Move backpressure throttling implementation to neon extension and add a function for monitoring throttling time (#2380)
* Move backpressure throttling implementation to neon extension and add a function for monitoring throttling time

* Add missing includes

* Bump postgres version
2022-09-03 08:48:28 +03:00
Heikki Linnakangas
a4e79db348 Move neon_local to control_plane.
Seems a bit silly to have a separate crate just for the executable. It
relies on the control plane for everything it does, and it's the only
user of the control plane.
2022-09-02 16:34:33 +03:00
MMeent
a463749f59 Slim down compute-node images (#2346)
Slim down compute-node images:

- Optimize compute_ctl build for size, not performance & debug-ability
- Don't run unused stages. Saves time in not building the PLV8 extension.
- Do not include static libraries in clean postgres
- Do the installation and finishing touches in the final layer in one job
    This allows docker (and kaniko) to only register one change to the files,
    removing potentially duplicate changed files.
- The runtime library for libreadline-dev is libreadline8, changing the dependency saves 45 MB
- libprotobuf-c-dev -> libprotobuf-c1, saving 100 kB
- libossp-uuid-dev -> libossp-uuid16, saving 150 kB
- gdal-bin + libgdal-dev -> libgeos-c1v5 + libgdal28 + libproj19, saving 747MB
- binutils @ testing -> libc6 @ testing, saving 32 MB
2022-09-02 14:34:40 +02:00
Kirill Bulatov
73f926c39a Return safekeeper remote storage logging during downloads 2022-09-02 15:08:18 +03:00
Kirill Bulatov
8b28adb6a6 Merge file name and extension for index part files 2022-09-02 14:57:09 +03:00
Kirill Bulatov
827c3013bd Adjust benchmark code to Ids 2022-09-02 14:57:09 +03:00
Kirill Bulatov
2db20e5587 Remove [Un]Loaded timeline code (#2359) 2022-09-02 14:31:28 +03:00
Kirill Bulatov
f78a542cba Calculate timeline initial logical size in the background
Start the calculation on the first size request, return
partially calculated size during calculation, retry if failed.

Remove "fast" size init through the ancestor: the current approach is
fast enough for now and there are better ways to optimize the
calculation via incremental ancestor size computation
2022-09-02 14:31:28 +03:00
Kirill Bulatov
8a7333438a Extract common remote storage operations into GenericRemoteStorage (#2373) 2022-09-02 11:58:28 +03:00
Heikki Linnakangas
47bd307cb8 Add python types to represent LSNs, tenant IDs and timeline IDs. (#2351)
For better ergonomics. I always found it weird that we used UUID to
actually mean a tenant or timeline ID. It worked because it happened
to have the same length, 16 bytes, but it was hacky.
2022-09-02 10:16:47 +03:00
Heikki Linnakangas
f0a0d7bb7a Split RcuWriteGuard::store() into two stages: store and wait.
This makes it easier to explain which stages allow concurrent readers and
writers. Expand the comments with examples, too.
2022-09-02 00:34:37 +03:00
Konstantin Knizhnik
40813adba2 Prevent creation of empty layers with duplicates (#2327)
* Prevent creation of empty layers with duplicates

* Add comments
2022-09-01 21:51:48 +03:00
Heikki Linnakangas
15c5f3e6cf Fix misc typos in comments and variable names. 2022-09-01 20:04:08 +03:00
Sergey Melnikov
46c8a93976 Fix PERF_TEST_RESULT_CONNSTR for benchmark init (#2375) 2022-09-01 15:06:52 +03:00
MMeent
13beeb59cd Update extensions included in compute-node
Update PLV8 to 3.1.4 - which is the latest release. 
Update PostGIS to 3.3.0

Remove PLV8 from the final image -- there is an issue we hit when installing PLV8, and we don't quite know what it is yet.
2022-09-01 12:53:17 +02:00
Alexander Bayandin
d7c9cfe7bb Create Allure report for perf tests (#2326) 2022-08-31 16:15:26 +01:00
Rory de Zoete
5745dbdd33 Remove deprecated notification channel (#2330)
Co-authored-by: Rory de Zoete <rdezoete@RorysMacStudio.fritz.box>
2022-08-31 14:36:24 +02:00
Kirill Bulatov
a4803233bb Remove RemoteObjectName and many remote storage generics in pageserver (#2360) 2022-08-30 22:19:52 +03:00
Heikki Linnakangas
f09bd6bc88 Fix size checks in the "local" remote storage implementation.
The code correctly detected too short and too long inputs, but the error
message was bogus for the case the input stream was too long:

    Error: Provided stream has actual size 5 fthat is smaller than the given stream size 4

That check was only supposed to check for too small inputs, but it in
fact caught too long inputs too. That was good, because the check
below it, which was supposed to check for too long inputs, was in fact
broken and never did anything. It tried to read input into a buffer of
size 0, to check if there is any extra data, but reading into a
zero-sized buffer always returns 0.
2022-08-30 18:44:06 +03:00
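A sketch of the corrected shape of that check: copy exactly the declared number of bytes, then probe with a one-byte read to detect oversized input. Names and the anyhow-based error handling are illustrative, not the actual LocalFs code.

```
use std::io::Read;

fn read_exact_stream(mut input: impl Read, expected_len: u64) -> anyhow::Result<Vec<u8>> {
    let mut data = Vec::with_capacity(expected_len as usize);
    let copied = (&mut input).take(expected_len).read_to_end(&mut data)?;
    if (copied as u64) < expected_len {
        anyhow::bail!("stream has actual size {copied} that is smaller than the given size {expected_len}");
    }
    // Reading into a zero-sized buffer always returns 0, so a 1-byte probe is
    // needed to notice that the stream is longer than declared.
    let mut probe = [0u8; 1];
    if input.read(&mut probe)? != 0 {
        anyhow::bail!("stream is larger than the given size {expected_len}");
    }
    Ok(data)
}
```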
Heikki Linnakangas
3aca717f3d Reorganize python tests.
Merge batch_others and batch_pg_regress. The original idea was to
split all the python tests into multiple "batches" and run each batch
in parallel as a separate CI job. However, the batch_pg_regress batch
was pretty short compared to all the tests in batch_others. We could
split batch_others into multiple batches, but it actually seems better
to just treat them as one big pool of tests and let pytest handle
the parallelism on its own. If we need to split them across multiple
nodes in the future, we could use pytest-shard or something else,
instead of managing the batches ourselves.

Merge test_neon_regress.py, test_pg_regress.py and test_isolation.py
into one file, test_pg_regress.py. Seems more clear to group all
pg_regress-based tests into one file, now that they would all be in
the same directory.
2022-08-30 18:25:38 +03:00
Dmitry Ivanov
96a50e99cf Forward various connection params to compute nodes. (#2336)
Previously, proxy didn't forward auxiliary `options` parameter
and other ones to the client's compute node, e.g.

```
$ psql "user=john host=localhost dbname=postgres options='-cgeqo=off'"
postgres=# show geqo;
┌──────┐
│ geqo │
├──────┤
│ on   │
└──────┘
(1 row)
```

With this patch we now forward `options`, `application_name` and `replication`.

Further reading: https://www.postgresql.org/docs/current/libpq-connect.html

Fixes #1287.
2022-08-30 17:36:21 +03:00
Arseny Sher
60408db101 Fix logging scopes in safekeeper. 2022-08-30 11:32:36 +03:00
Kirill Bulatov
07b4ace52f Use more restrictive .dockerignore 2022-08-29 22:01:53 +03:00
Konstantin Knizhnik
ee8b5f967d Add fork_at_current_lsn function which creates branch at current LSN (#2344)
* Add fork_at_current_lsn function which creates branch at current LSN

* Undo use of fork_at_current_lsn in test_branching because of short GC period

* Add missed return in fork_at_current_lsn

* Add missed return in fork_at_current_lsn

* Update test_runner/fixtures/neon_fixtures.py

Co-authored-by: Heikki Linnakangas <heikki@zenith.tech>

* Update test_runner/fixtures/neon_fixtures.py

Co-authored-by: Heikki Linnakangas <heikki@zenith.tech>

* Update test_runner/fixtures/neon_fixtures.py

Co-authored-by: Heikki Linnakangas <heikki@zenith.tech>

Co-authored-by: Heikki Linnakangas <heikki@zenith.tech>
2022-08-29 17:59:04 +03:00
MMeent
1324dd89ed Mark PostGIS and PLV8 as trusted extensions (#2355)
Now, users can install these extensions themselves if they are owner
of the database they try to install the extension in.
2022-08-29 13:44:56 +02:00
Heikki Linnakangas
bfa1d91612 Introduce RCU, and use it to protect latest_gc_cutoff_lsn.
`latest_gc_cutoff_lsn` tracks the cutoff point where GC has been
performed. Anything older than the cutoff might already have been GC'd
away, and cannot be queried by get_page_at_lsn requests. It's
protected by an RWLock. Whenever a get_page_at_lsn request comes in,
it first grabs the lock and reads the current `latest_gc_cutoff`, and
holds the lock until the request has been served. The lock ensures
that GC doesn't start concurrently and remove page versions that we
still need to satisfy the request.

With the lock, a get_page_at_lsn request could potentially be blocked
for a long time. GC only holds the lock in exclusive mode for a short
duration, but depending on whether the RWLock is "fair", a read
request might be queued behind the GC's exclusive request, which in
turn might be queued behind a long-running read operation, like a
basebackup. If the lock implementation is not fair, i.e. if a reader
can always jump the queue if the lock is already held in read mode,
then another problem arises: GC might be starved if a constant stream
of GetPage requests comes in.

To avoid the long wait or starvation, introduce a Read-Copy-Update
mechanism to replace the lock on `latest_gc_cutoff_lsn`. With the RCU,
a reader can always read the latest value without blocking (except for a
very short duration if the lock protecting the RCU is contended;
that's comparable to a spinlock). And a writer can always write a new
value without waiting for readers to finish using the old value. The
old readers will continue to see the old value through their guard
object, while new readers will see the new value.

This is purely theoretical ATM, we don't have any reports of either
starvation or blocking behind GC happening in practice. But it's
simple to fix, so let's nip that problem in the bud.
2022-08-29 11:23:37 +03:00
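A minimal sketch of the read-copy-update idea described above, using a plain Mutex<Arc<T>>; the pageserver's actual Rcu and guard types are more elaborate (for example the store/wait split from the RcuWriteGuard::store() commit earlier in this list).

```
use std::sync::{Arc, Mutex};

struct Rcu<T> {
    current: Mutex<Arc<T>>,
}

impl<T> Rcu<T> {
    fn new(value: T) -> Self {
        Rcu { current: Mutex::new(Arc::new(value)) }
    }

    // Read guard: the lock is held only long enough to clone the Arc
    // (comparable to a spinlock). The returned Arc keeps the old value alive
    // even if a writer installs a new one while the reader is still using it.
    fn read(&self) -> Arc<T> {
        Arc::clone(&self.current.lock().unwrap())
    }

    // Install a new value without waiting for old readers. The previous Arc
    // is returned so the writer can, if it needs to, wait until its strong
    // count drops to 1, i.e. until all old readers are done with it.
    fn store(&self, new: T) -> Arc<T> {
        std::mem::replace(&mut *self.current.lock().unwrap(), Arc::new(new))
    }
}

// e.g. latest_gc_cutoff_lsn: Rcu<Lsn>. GetPage handlers call read() and hold
// the guard for the duration of the request; GC calls store().
```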
Heikki Linnakangas
7a840ec60c Move save_metadata function.
`timeline.rs` seems like a better home for it.
2022-08-27 18:14:40 +03:00
Heikki Linnakangas
5f189cd385 Remove some unnecessary derives.
Doesn't make much difference, but let's be tidy.
2022-08-27 18:14:38 +03:00
Heikki Linnakangas
f8188e679c Downgrade a few panics into plain errors.
Let's not bring down the whole pageserver if you import a bogus tar
archive to one timeline.
2022-08-27 18:14:35 +03:00
Heikki Linnakangas
34b5d7aa9f Remove unused dependency 2022-08-27 18:14:33 +03:00
Heikki Linnakangas
88a339ed73 Update a few crates
"cargo tree -d" showed that we're building multiple versions of some
crates. Update some crates, to avoid depending on multiple versions.
2022-08-27 18:14:30 +03:00
Heikki Linnakangas
ec20534173 Fix minor typos and leftover comments. 2022-08-27 17:54:56 +03:00
MMeent
c0a867d86f Include neon extensions in the main neon images (#2341)
Oversight in #2325 - apparently this area wasn't well-covered by tests in the neon repo.

Fixes #2340
2022-08-26 19:58:08 +02:00
Dmitry Ivanov
6d30e21a32 Fix proxy tests (#2343)
There might be different psql & locale configurations,
therefore we should explicitly reset them to defaults.
2022-08-26 20:42:32 +03:00
Kirill Bulatov
a56ae15edf Lock cargo dependencies during CI builds 2022-08-26 17:29:01 +03:00
Alexey Kondratov
a5ca6a9d2b Move legacy version of compute-node Dockerfile from postgres repo (#2339)
It's used by e2e CI. Building Dockerfile.compute-node will take
an unreasonable amount of time without v2 runners.

TODO: remove once cloud repo CI is moved to v2 runners.
2022-08-26 13:59:04 +02:00
MMeent
04a018a5b1 Extract neon and neon_test_utils from postgres repo (#2325)
* Extract neon and neon_test_utils from postgres repo
* Remove neon from vendored postgres repo, and fix build_and_test.yml
* Move EmitWarningsOnPlaceholders to end of _PG_init in neon.c (from libpagestore.c)
* Fix Makefile location comments
* remove Makefile EXTRA_INSTALL flag
* Update Dockerfile.compute-node to build and include the neon extension
2022-08-25 18:48:09 +02:00
MMeent
bc588f3a53 Update WAL redo histograms (#2323)
Previously, it could only distinguish REDO task durations down to 5ms, which
equates to approx. 200 pages/sec or 1.6MB/sec getpage@LSN traffic.
This patch improves that to 200,000 pages/sec or 1.6GB/sec, allowing for
much more precise performance measurement of the redo process.
2022-08-25 17:17:32 +03:00
Egor Suvorov
c952f022bb waldecoder: fix comment 2022-08-25 15:03:22 +02:00
Rory de Zoete
f67d109e6e Copy binaries to /usr/local (#2335)
* Add extra symlink

* Take other approach

Co-authored-by: Rory de Zoete <rdezoete@RorysMacStudio.fritz.box>
2022-08-25 14:35:01 +02:00
Rory de Zoete
344db0b4aa Re-add temporary symlink (#2331)
Co-authored-by: Rory de Zoete <rdezoete@RorysMacStudio.fritz.box>
2022-08-25 11:17:09 +02:00
Rory de Zoete
0c8ee6bd1d Add postgis & plv8 extensions (#2298)
* Add postgis & plv8 extensions

* Update Dockerfile & fix typos

* Update dockerfile

* Update Dockerfile

* Update dockerfile

* Use plv8 step

* Reduce giga layer

* Reduce layer size further

* Prepare for rollout

* Fix dependency

* Pass on correct build tag

* No longer dependent on building tools

* Use version from vendor

* Revert "Use version from vendor"

This reverts commit 7c6670c477.

* Revert and push correct set

* Add configure step for new approach

* Re-add configure flags

Co-authored-by: Rory de Zoete <rdezoete@RorysMacStudio.fritz.box>
Co-authored-by: Rory de Zoete <rdezoete@Rorys-Mac-Studio.fritz.box>
2022-08-25 09:46:52 +02:00
Dmitry Ivanov
8e1d6dd848 Minor cleanup in pq_proto (#2322) 2022-08-23 18:00:02 +03:00
Heikki Linnakangas
4013290508 Fix module doc comment.
`///` is used for comments on the *next* code that follows, so the comment
actually applied to the `use std::collections::BTreeMap;` line that follows.

rustfmt complained about that:

    error: an inner attribute is not permitted following an outer doc comment
     --> /home/heikki/git-sandbox/neon/libs/utils/src/seqwait_async.rs:7:1
      |
    5 | ///
      | --- previous doc comment
    6 |
    7 | #![warn(missing_docs)]
      | ^^^^^^^^^^^^^^^^^^^^^^ not permitted following an outer attribute
    8 |
    9 | use std::collections::BTreeMap;
      | ------------------------------- the inner attribute doesn't annotate this `use` import
      |
      = note: inner attributes, like `#![no_std]`, annotate the item enclosing them, and are usually found at the beginning of source files
help: to annotate the `use` import, change the attribute from inner to outer style
      |
    7 - #![warn(missing_docs)]
    7 + #[warn(missing_docs)]
      |

`//!` is the correct syntax for comments that apply to the whole file.
2022-08-23 12:58:54 +03:00
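A tiny illustration of the distinction the commit above describes (file contents are made up): `//!` documents the enclosing file or module and may be followed by inner attributes, while `///` documents the next item.

```
//! Module-level docs: `//!` attaches to the enclosing file/module, so an
//! inner attribute like the one below may legally follow it.

#![warn(missing_docs)]

use std::collections::BTreeMap;

/// Item-level docs: `///` attaches to the *next* item, here this struct.
pub struct Waiters {
    /// Pending waiters keyed by sequence number.
    pub pending: BTreeMap<u64, ()>,
}
```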
Heikki Linnakangas
5f0c95182d Minor cleanup, to pass by reference where possible. 2022-08-23 12:58:54 +03:00
Heikki Linnakangas
63b9dfb2f2 Remove unnecessary 'pub' from test module, and remove dead constant.
After making the test module private, the compiler noticed and warned
that the constant is unused.
2022-08-23 12:58:54 +03:00
Heikki Linnakangas
1a666a01d6 Improve comments a little. 2022-08-23 12:58:54 +03:00
Heikki Linnakangas
d110d2c2fd Reorder permission checks in HTTP API call handlers.
Every handler function now follows the same pattern:

1. extract parameters from the call
2. check permissions
3. execute command.

Previously, we extracted some parameters before permission check and
some after. Let's be consistent.
2022-08-23 12:14:06 +03:00
KlimentSerafimov
b98fa5d6b0 Added a new test for making sure the proxy displays a session_id when using link auth. (#2039)
Added pytest to check correctness of the link authentication pipeline.

Context: this PR is the first step towards refactoring the link authentication pipeline to use https (instead of psql) to send the db info to the proxy. There was a test missing for this pipeline in this repo, so this PR adds that test as preparation for the actual change of psql -> https.
Co-authored-by: Bojan Serafimov <bojan.serafimov7@gmail.com>
Co-authored-by: Dmitry Rodionov <dmitry@neon.tech>
Co-authored-by: Stas Kelvic <stas@neon.tech>
Co-authored-by: Dimitrii Ivanov <dima@neon.tech>
2022-08-22 20:02:45 -04:00
Dmitry Rodionov
9dd19ec397 Remove interfering proc check
We do not need it anymore because ports_distributor checks
whether the port can be used before giving it to a service
2022-08-22 20:59:32 +03:00
Alexander Bayandin
832e60c2b4 Add .git-blame-ignore-revs file (#2318) 2022-08-22 16:38:31 +01:00
Alexey Kondratov
6dc56a9be1 Add GitHub templates for epics, bugs and release PRs (neondatabase/cloud#2079)
After merging this we will be able to:
- Pick Epic or Bug template in the GitHub UI, when creating an issue
- Use this link to open a release PR formatted in a unified way and
  containing a checklist with useful links: https://github.com/neondatabase/neon/compare/release...main?template=release-pr.md&title=Release%20202Y-MM-DD
2022-08-22 16:29:42 +02:00
Alexander Bayandin
39a3bcac36 test_runner: fix flake8 warnings 2022-08-22 14:57:09 +01:00
Alexander Bayandin
ae3227509c test_runner: revive flake8 2022-08-22 14:57:09 +01:00
Alexander Bayandin
4c2bb43775 Reformat all python files by black & isort 2022-08-22 14:57:09 +01:00
Alexander Bayandin
6b2e1d9065 test_runner: replace yapf with black and isort 2022-08-22 14:57:09 +01:00
Alexander Bayandin
277f2d6d3d Report test results to Allure (#2229) 2022-08-22 11:21:50 +01:00
Kirill Bulatov
7779308985 Ensure timeline logical size is initialized once 2022-08-22 11:51:37 +03:00
Kirill Bulatov
32be8739b9 Move walreceiver timeline registration into layered_repository 2022-08-22 11:51:37 +03:00
Kirill Bulatov
631cbf5b1b Use single map to manage timeline data 2022-08-22 11:51:37 +03:00
Heikki Linnakangas
5522fbab25 Move all unit tests related to Repository/Timeline to layered_repository.rs
There was a nominal split between the tests in layered_repository.rs and
repository.rs, such that tests specific to the layered implementation were
supposed to be in layered_repository.rs, and tests that should work with
any implementation of the traits were supposed to be in repository.rs.
In practice, the line was quite muddled. With minor tweaks, many of the
tests in layered_repository.rs should work with other implementations too,
and vice versa. And in practice we only have one implementation, so it's
more straightforward to gather all unit tests in one place.
2022-08-20 01:21:18 +03:00
Heikki Linnakangas
d48177d0d8 Expose timeline logical size as a prometheus metric.
Physical size was already exposed, and it'd be nice to show both
logical and physical size side by side in our Grafana dashboards.
2022-08-19 22:21:33 +03:00
Heikki Linnakangas
84cd40b416 rustfmt fixes.
Not sure why these don't show up as CI failures, but on my laptop,
rustfmt insists.
2022-08-19 22:21:15 +03:00
Heikki Linnakangas
daba4c7405 Add a section in glossary to explain what "logical size" means. (#2306) 2022-08-19 21:57:00 +03:00
MMeent
8ac5a285a1 Update vendor/postgres to one that is rebased onto REL_14_5 (#2312)
This was previously based on REL_14_4
Protected tag of main before rebase is at main-before-rebase-REL_14_5
2022-08-19 20:02:36 +02:00
Heikki Linnakangas
aaa60c92ca Use u64/i64 for logical size, comment on why to use signed i64.
usize/isize type corresponds to the CPU architecture's pointer width,
i.e. 64 bits on a 64-bit platform and 32 bits on a 32-bit platform.
The logical size of a database has nothing to do with that, so
u64/i64 is more appropriate.

It doesn't make any difference in practice as long as you're on a
64-bit platform, and it's hard to imagine anyone wanting to run the
pageserver on a 32-bit platform, but let's be tidy.

Also add a comment on why we use signed i64 for the logical size
variable, even though size should never be negative. I'm not sure the
reasons are very good, but at least this documents them, and hints at
some possible better solutions.
2022-08-19 16:44:16 +03:00
Kirill Bulatov
187a760409 Reset codestyle cargo cache 2022-08-19 16:40:37 +03:00
Kirill Bulatov
c634cb1d36 Remove TimelineWriter trait, rename LayeredTimelineWriter struct into TimelineWriter 2022-08-19 16:40:37 +03:00
Kirill Bulatov
c19b4a65f9 Remove Repository trait, rename LayeredRepository struct into Repository 2022-08-19 16:40:37 +03:00
Kirill Bulatov
8043612334 Remove Timeline trait, rename LayeredTimeline struct into Timeline 2022-08-19 16:40:37 +03:00
Rory de Zoete
12e87f0df3 Update workflow to fix dependency issue (#2309)
* Update workflow to fix dependency issue

* Update workflow

* Update workflow and dockerfile

* Specify tag

* Update main dockerfile as well

* Mirror rust image to docker hub

* Update submodule ref

Co-authored-by: Rory de Zoete <rdezoete@Rorys-Mac-Studio.fritz.box>
2022-08-19 12:07:46 +02:00
Kirill Bulatov
6b9cef02a1 Use better defaults for pageserver Docker image 2022-08-19 12:41:00 +03:00
MMeent
37d90dc3b3 Fix dependencies issue between compute-tools and compute node docker images (#2304)
Compute node docker image requires compute-tools to build, but this
dependency (and the argument for which image to pick) weren't described in the
workflow file. This led to out-of-date binaries in the latest builds, which
subsequently broke these images.
2022-08-18 21:51:33 +02:00
Kirill Bulatov
a185821d6f Explicitly error on cache issues during I/O (#2303) 2022-08-18 22:37:20 +03:00
MMeent
f99ccb5041 Extract WalProposer into the neon extension (#2217)
Including, but not limited to:

* Fixes to neon management code to support walproposer-as-an-extension

* Fix issue in expected output of pg settings serialization.

* Show the logs of a failed --sync-safekeepers process in CI

* Add compat layer for renamed GUCs in postgres.conf

* Update vendor/postgres to the latest origin/main
2022-08-18 17:12:28 +02:00
369 changed files with 32709 additions and 15933 deletions

View File

@@ -11,3 +11,6 @@ opt-level = 3
[profile.dev]
# Turn on a small amount of optimization in Development mode.
opt-level = 1
[alias]
build_testing = ["build", "--features", "testing"]

View File

@@ -1,18 +1,21 @@
**/.git/
**/__pycache__
**/.pytest_cache
*
.git
target
tmp_check
tmp_install
tmp_check_cli
test_output
.vscode
.neon
integration_tests/.neon
.mypy_cache
Dockerfile
.dockerignore
!rust-toolchain.toml
!Cargo.toml
!Cargo.lock
!Makefile
!.cargo/
!.config/
!control_plane/
!compute_tools/
!libs/
!pageserver/
!pgxn/
!proxy/
!safekeeper/
!vendor/postgres-v14/
!vendor/postgres-v15/
!workspace_hack/
!neon_local/
!scripts/ninstall.sh

1
.git-blame-ignore-revs Normal file
View File

@@ -0,0 +1 @@
4c2bb43775947775401cbb9d774823c5723a91f8

23
.github/ISSUE_TEMPLATE/bug-template.md vendored Normal file
View File

@@ -0,0 +1,23 @@
---
name: Bug Template
about: Used for describing bugs
title: ''
labels: t/bug
assignees: ''
---
## Steps to reproduce
## Expected result
## Actual result
## Environment
## Logs, links
-

25
.github/ISSUE_TEMPLATE/epic-template.md vendored Normal file
View File

@@ -0,0 +1,25 @@
---
name: Epic Template
about: A set of related tasks contributing towards a specific outcome, comprising
more than 1 week of work.
title: 'Epic: '
labels: t/Epic
assignees: ''
---
## Motivation
## DoD
## Implementation ideas
## Tasks
- [ ]
## Other related tasks and Epics
-

View File

@@ -0,0 +1,20 @@
## Release 202Y-MM-DD
**NB: this PR must be merged only by 'Create a merge commit'!**
### Checklist when preparing for release
- [ ] Read or refresh [the release flow guide](https://github.com/neondatabase/cloud/wiki/Release:-general-flow)
- [ ] Ask in the [cloud Slack channel](https://neondb.slack.com/archives/C033A2WE6BZ) that you are going to rollout the release. Any blockers?
- [ ] Does this release contain any db migrations? Destructive ones? What is the rollback plan?
<!-- List everything that should be done **before** release, any issues / setting changes / etc -->
### Checklist after release
- [ ] Based on the merged commits write release notes and open a PR into `website` repo ([example](https://github.com/neondatabase/website/pull/120/files))
- [ ] Check [#dev-production-stream](https://neondb.slack.com/archives/C03F5SM1N02) Slack channel
- [ ] Check [stuck projects page](https://console.neon.tech/admin/projects?sort=last_active&order=desc&stuck=true)
- [ ] Check [recent operation failures](https://console.neon.tech/admin/operations?action=create_timeline%2Cstart_compute%2Cstop_compute%2Csuspend_compute%2Capply_config%2Cdelete_timeline%2Cdelete_tenant%2Ccreate_branch%2Ccheck_availability&sort=updated_at&order=desc&had_retries=some)
- [ ] Check [cloud SLO dashboard](https://observer.zenith.tech/d/_oWcBMJ7k/cloud-slos?orgId=1)
- [ ] Check [compute startup metrics dashboard](https://observer.zenith.tech/d/5OkYJEmVz/compute-startup-time)
<!-- List everything that should be done **after** release, any admin UI configuration / Grafana dashboard / alert changes / setting changes / etc -->

221
.github/actions/allure-report/action.yml vendored Normal file
View File

@@ -0,0 +1,221 @@
name: 'Create Allure report'
description: 'Create and publish Allure report'
inputs:
action:
description: 'generate or store'
required: true
build_type:
description: '`build_type` from run-python-test-set action'
required: true
test_selection:
description: '`test_selection` from run-python-test-set action'
required: false
outputs:
report-url:
description: 'Allure report URL'
value: ${{ steps.generate-report.outputs.report-url }}
runs:
using: "composite"
steps:
- name: Validate input parameters
shell: bash -euxo pipefail {0}
run: |
if [ "${{ inputs.action }}" != "store" ] && [ "${{ inputs.action }}" != "generate" ]; then
echo 2>&1 "Unknown inputs.action type '${{ inputs.action }}'; allowed 'generate' or 'store' only"
exit 1
fi
if [ -z "${{ inputs.test_selection }}" ] && [ "${{ inputs.action }}" == "store" ]; then
echo 2>&1 "inputs.test_selection must be set for 'store' action"
exit 2
fi
- name: Calculate key
id: calculate-key
shell: bash -euxo pipefail {0}
run: |
# TODO: for manually triggered workflows (via workflow_dispatch) we need to have a separate key
pr_number=$(jq --raw-output .pull_request.number "$GITHUB_EVENT_PATH" || true)
if [ "${pr_number}" != "null" ]; then
key=pr-${pr_number}
elif [ "${GITHUB_REF}" = "refs/heads/main" ]; then
# Shortcut for a special branch
key=main
else
key=branch-$(echo ${GITHUB_REF#refs/heads/} | tr -c "[:alnum:]._-" "-")
fi
echo "::set-output name=KEY::${key}"
- uses: actions/setup-java@v3
if: ${{ inputs.action == 'generate' }}
with:
distribution: 'temurin'
java-version: '17'
- name: Install Allure
if: ${{ inputs.action == 'generate' }}
shell: bash -euxo pipefail {0}
run: |
if ! which allure; then
ALLURE_ZIP=allure-${ALLURE_VERSION}.zip
wget -q https://github.com/allure-framework/allure2/releases/download/${ALLURE_VERSION}/${ALLURE_ZIP}
echo "${ALLURE_ZIP_MD5} ${ALLURE_ZIP}" | md5sum -c
unzip -q ${ALLURE_ZIP}
echo "$(pwd)/allure-${ALLURE_VERSION}/bin" >> $GITHUB_PATH
rm -f ${ALLURE_ZIP}
fi
env:
ALLURE_VERSION: 2.19.0
ALLURE_ZIP_MD5: ced21401a1a8b9dfb68cee9e4c210464
- name: Upload Allure results
if: ${{ inputs.action == 'store' }}
env:
REPORT_PREFIX: reports/${{ steps.calculate-key.outputs.KEY }}/${{ inputs.build_type }}
RAW_PREFIX: reports-raw/${{ steps.calculate-key.outputs.KEY }}/${{ inputs.build_type }}
TEST_OUTPUT: /tmp/test_output
BUCKET: neon-github-public-dev
shell: bash -euxo pipefail {0}
run: |
# Add metadata
cat <<EOF > $TEST_OUTPUT/allure/results/executor.json
{
"name": "GitHub Actions",
"type": "github",
"url": "https://${BUCKET}.s3.amazonaws.com/${REPORT_PREFIX}/latest/index.html",
"buildOrder": ${GITHUB_RUN_ID},
"buildName": "GitHub Actions Run #${{ github.run_number }}/${GITHUB_RUN_ATTEMPT}",
"buildUrl": "${GITHUB_SERVER_URL}/${GITHUB_REPOSITORY}/actions/runs/${GITHUB_RUN_ID}/attempts/${GITHUB_RUN_ATTEMPT}",
"reportUrl": "https://${BUCKET}.s3.amazonaws.com/${REPORT_PREFIX}/${GITHUB_RUN_ID}/index.html",
"reportName": "Allure Report"
}
EOF
cat <<EOF > $TEST_OUTPUT/allure/results/environment.properties
TEST_SELECTION=${{ inputs.test_selection }}
BUILD_TYPE=${{ inputs.build_type }}
EOF
ARCHIVE="${GITHUB_RUN_ID}-${{ inputs.test_selection }}-${GITHUB_RUN_ATTEMPT}-$(date +%s).tar.zst"
ZSTD_NBTHREADS=0
tar -C ${TEST_OUTPUT}/allure/results -cf ${ARCHIVE} --zstd .
aws s3 mv --only-show-errors ${ARCHIVE} "s3://${BUCKET}/${RAW_PREFIX}/${ARCHIVE}"
# Potentially we could have several running builds for the same key (for example, for the main branch), so we use an improvised lock for this
- name: Acquire Allure lock
if: ${{ inputs.action == 'generate' }}
shell: bash -euxo pipefail {0}
env:
LOCK_FILE: reports/${{ steps.calculate-key.outputs.KEY }}/lock.txt
BUCKET: neon-github-public-dev
run: |
LOCK_TIMEOUT=300 # seconds
for _ in $(seq 1 5); do
for i in $(seq 1 ${LOCK_TIMEOUT}); do
LOCK_ADDED=$(aws s3api head-object --bucket neon-github-public-dev --key ${LOCK_FILE} | jq --raw-output '.LastModified' || true)
# `date --date="..."` is supported only by gnu date (i.e. it doesn't work on BSD/macOS)
if [ -z "${LOCK_ADDED}" ] || [ "$(( $(date +%s) - $(date --date="${LOCK_ADDED}" +%s) ))" -gt "${LOCK_TIMEOUT}" ]; then
break
fi
sleep 1
done
echo "${GITHUB_RUN_ID}-${GITHUB_RUN_ATTEMPT}-${{ inputs.test_selection }}" > lock.txt
aws s3 mv --only-show-errors lock.txt "s3://${BUCKET}/${LOCK_FILE}"
# A double-check that exactly WE have acquired the lock
aws s3 cp --only-show-errors "s3://${BUCKET}/${LOCK_FILE}" ./lock.txt
if [ "$(cat lock.txt)" = "${GITHUB_RUN_ID}-${GITHUB_RUN_ATTEMPT}-${{ inputs.test_selection }}" ]; then
break
fi
done
- name: Generate and publish final Allure report
if: ${{ inputs.action == 'generate' }}
id: generate-report
env:
REPORT_PREFIX: reports/${{ steps.calculate-key.outputs.KEY }}/${{ inputs.build_type }}
RAW_PREFIX: reports-raw/${{ steps.calculate-key.outputs.KEY }}/${{ inputs.build_type }}
TEST_OUTPUT: /tmp/test_output
BUCKET: neon-github-public-dev
shell: bash -euxo pipefail {0}
run: |
# Get previously uploaded data for this run
ZSTD_NBTHREADS=0
s3_filepaths=$(aws s3api list-objects-v2 --bucket ${BUCKET} --prefix ${RAW_PREFIX}/${GITHUB_RUN_ID}- | jq --raw-output '.Contents[].Key')
if [ -z "$s3_filepaths" ]; then
# There's no previously uploaded data for this run
exit 0
fi
for s3_filepath in ${s3_filepaths}; do
aws s3 cp --only-show-errors "s3://${BUCKET}/${s3_filepath}" "${TEST_OUTPUT}/allure/"
archive=${TEST_OUTPUT}/allure/$(basename $s3_filepath)
mkdir -p ${archive%.tar.zst}
tar -xf ${archive} -C ${archive%.tar.zst}
rm -f ${archive}
done
# Get history trend
aws s3 cp --recursive --only-show-errors "s3://${BUCKET}/${REPORT_PREFIX}/latest/history" "${TEST_OUTPUT}/allure/latest/history" || true
# Generate report
allure generate --clean --output $TEST_OUTPUT/allure/report $TEST_OUTPUT/allure/*
# Replace a logo link with a redirect to the latest version of the report
sed -i 's|<a href="." class=|<a href="https://'${BUCKET}'.s3.amazonaws.com/'${REPORT_PREFIX}'/latest/index.html" class=|g' $TEST_OUTPUT/allure/report/app.js
# Upload the history and the final report (in this particular order to avoid having duplicated history in two places)
aws s3 mv --recursive --only-show-errors "${TEST_OUTPUT}/allure/report/history" "s3://${BUCKET}/${REPORT_PREFIX}/latest/history"
aws s3 mv --recursive --only-show-errors "${TEST_OUTPUT}/allure/report" "s3://${BUCKET}/${REPORT_PREFIX}/${GITHUB_RUN_ID}"
REPORT_URL=https://${BUCKET}.s3.amazonaws.com/${REPORT_PREFIX}/${GITHUB_RUN_ID}/index.html
# Generate redirect
cat <<EOF > ./index.html
<!DOCTYPE html>
<meta charset="utf-8">
<title>Redirecting to ${REPORT_URL}</title>
<meta http-equiv="refresh" content="0; URL=${REPORT_URL}">
EOF
aws s3 cp --only-show-errors ./index.html "s3://${BUCKET}/${REPORT_PREFIX}/latest/index.html"
echo "[Allure Report](${REPORT_URL})" >> ${GITHUB_STEP_SUMMARY}
echo "::set-output name=report-url::${REPORT_URL}"
- name: Release Allure lock
if: ${{ inputs.action == 'generate' && always() }}
shell: bash -euxo pipefail {0}
env:
LOCK_FILE: reports/${{ steps.calculate-key.outputs.KEY }}/lock.txt
BUCKET: neon-github-public-dev
run: |
aws s3 cp --only-show-errors "s3://${BUCKET}/${LOCK_FILE}" ./lock.txt || exit 0
if [ "$(cat lock.txt)" = "${GITHUB_RUN_ID}-${GITHUB_RUN_ATTEMPT}-${{ inputs.test_selection }}" ]; then
aws s3 rm "s3://${BUCKET}/${LOCK_FILE}"
fi
- uses: actions/github-script@v6
if: ${{ inputs.action == 'generate' && always() }}
env:
REPORT_URL: ${{ steps.generate-report.outputs.report-url }}
BUILD_TYPE: ${{ inputs.build_type }}
SHA: ${{ github.event.pull_request.head.sha || github.sha }}
with:
script: |
const { REPORT_URL, BUILD_TYPE, SHA } = process.env
await github.rest.repos.createCommitStatus({
owner: context.repo.owner,
repo: context.repo.repo,
sha: `${SHA}`,
state: 'success',
target_url: `${REPORT_URL}`,
context: `Allure report / ${BUILD_TYPE}`,
})

View File

@@ -12,6 +12,9 @@ inputs:
description: "Allow to skip if file doesn't exist, fail otherwise"
default: false
required: false
prefix:
description: "S3 prefix. Default is '${GITHUB_RUN_ID}/${GITHUB_RUN_ATTEMPT}'"
required: false
runs:
using: "composite"
@@ -23,18 +26,18 @@ runs:
TARGET: ${{ inputs.path }}
ARCHIVE: /tmp/downloads/${{ inputs.name }}.tar.zst
SKIP_IF_DOES_NOT_EXIST: ${{ inputs.skip-if-does-not-exist }}
PREFIX: artifacts/${{ inputs.prefix || format('{0}/{1}', github.run_id, github.run_attempt) }}
run: |
BUCKET=neon-github-public-dev
PREFIX=artifacts/${GITHUB_RUN_ID}
FILENAME=$(basename $ARCHIVE)
S3_KEY=$(aws s3api list-objects-v2 --bucket ${BUCKET} --prefix ${PREFIX} | jq -r '.Contents[].Key' | grep ${FILENAME} | sort --version-sort | tail -1 || true)
S3_KEY=$(aws s3api list-objects-v2 --bucket ${BUCKET} --prefix ${PREFIX%$GITHUB_RUN_ATTEMPT} | jq -r '.Contents[].Key' | grep ${FILENAME} | sort --version-sort | tail -1 || true)
if [ -z "${S3_KEY}" ]; then
if [ "${SKIP_IF_DOES_NOT_EXIST}" = "true" ]; then
echo '::set-output name=SKIPPED::true'
exit 0
else
echo 2>&1 "Neither s3://${BUCKET}/${PREFIX}/${GITHUB_RUN_ATTEMPT}/${FILENAME} nor its version from previous attempts exist"
echo 2>&1 "Neither s3://${BUCKET}/${PREFIX}/${FILENAME} nor its version from previous attempts exist"
exit 1
fi
fi

View File

@@ -0,0 +1,82 @@
name: 'Create Neon Project'
description: 'Create Neon Project using API'
inputs:
api_key:
description: 'Neon API key'
required: true
environment:
description: 'dev (aka captest) or staging'
required: true
region_id:
description: 'Region ID; if not set, the project will be created in the default region'
required: false
outputs:
dsn:
description: 'Created Project DSN (for main database)'
value: ${{ steps.create-neon-project.outputs.dsn }}
project_id:
description: 'Created Project ID'
value: ${{ steps.create-neon-project.outputs.project_id }}
runs:
using: "composite"
steps:
- name: Parse Input
id: parse-input
shell: bash -euxo pipefail {0}
run: |
case "${ENVIRONMENT}" in
dev)
API_HOST=console.dev.neon.tech
REGION_ID=${REGION_ID:-eu-west-1}
;;
staging)
API_HOST=console.stage.neon.tech
REGION_ID=${REGION_ID:-us-east-1}
;;
*)
echo 2>&1 "Unknown environment=${ENVIRONMENT}. Allowed 'dev' or 'staging' only"
exit 1
;;
esac
echo "::set-output name=api_host::${API_HOST}"
echo "::set-output name=region_id::${REGION_ID}"
env:
ENVIRONMENT: ${{ inputs.environment }}
REGION_ID: ${{ inputs.region_id }}
- name: Create Neon Project
id: create-neon-project
# A shell without `set -x`, so as not to expose the password/dsn in logs
shell: bash -euo pipefail {0}
run: |
project=$(curl \
"https://${API_HOST}/api/v1/projects" \
--fail \
--header "Accept: application/json" \
--header "Content-Type: application/json" \
--header "Authorization: Bearer ${API_KEY}" \
--data "{
\"project\": {
\"name\": \"Created by actions/neon-project-create; GITHUB_RUN_ID=${GITHUB_RUN_ID}\",
\"platform_id\": \"aws\",
\"region_id\": \"${REGION_ID}\",
\"settings\": { }
}
}")
# Mask password
echo "::add-mask::$(echo $project | jq --raw-output '.roles[] | select(.name != "web_access") | .password')"
dsn=$(echo $project | jq --raw-output '.roles[] | select(.name != "web_access") | .dsn')/main
echo "::add-mask::${dsn}"
echo "::set-output name=dsn::${dsn}"
project_id=$(echo $project | jq --raw-output '.id')
echo "::set-output name=project_id::${project_id}"
env:
API_KEY: ${{ inputs.api_key }}
API_HOST: ${{ steps.parse-input.outputs.api_host }}
REGION_ID: ${{ steps.parse-input.outputs.region_id }}

View File

@@ -0,0 +1,54 @@
name: 'Delete Neon Project'
description: 'Delete Neon Project using API'
inputs:
api_key:
description: 'Neon API key'
required: true
environment:
description: 'dev (aka captest) or staging'
required: true
project_id:
description: 'ID of the Project to delete'
required: true
runs:
using: "composite"
steps:
- name: Parse Input
id: parse-input
shell: bash -euxo pipefail {0}
run: |
case "${ENVIRONMENT}" in
dev)
API_HOST=console.dev.neon.tech
;;
staging)
API_HOST=console.stage.neon.tech
;;
*)
echo 2>&1 "Unknown environment=${ENVIRONMENT}. Allowed 'dev' or 'staging' only"
exit 1
;;
esac
echo "::set-output name=api_host::${API_HOST}"
env:
ENVIRONMENT: ${{ inputs.environment }}
- name: Delete Neon Project
shell: bash -euxo pipefail {0}
run: |
# Allow PROJECT_ID to be empty/null for cases when .github/actions/neon-project-create failed
if [ -n "${PROJECT_ID}" ]; then
curl -X "POST" \
"https://${API_HOST}/api/v1/projects/${PROJECT_ID}/delete" \
--fail \
--header "Accept: application/json" \
--header "Content-Type: application/json" \
--header "Authorization: Bearer ${API_KEY}"
fi
env:
API_KEY: ${{ inputs.api_key }}
PROJECT_ID: ${{ inputs.project_id }}
API_HOST: ${{ steps.parse-input.outputs.api_host }}

View File

@@ -3,10 +3,7 @@ description: 'Runs a Neon python test set, performing all the required preparati
inputs:
build_type:
description: 'Type of Rust (neon) and C (postgres) builds. Must be "release" or "debug".'
required: true
rust_toolchain:
description: 'Rust toolchain version to fetch the caches'
description: 'Type of Rust (neon) and C (postgres) builds. Must be "release" or "debug", or "remote" for the remote cluster'
required: true
test_selection:
description: 'A python test suite to run'
@@ -24,7 +21,7 @@ inputs:
required: false
default: 'true'
save_perf_report:
description: 'Whether to upload the performance report'
description: 'Whether to upload the performance report; if true, the PERF_TEST_RESULT_CONNSTR env variable should be set'
required: false
default: 'false'
run_with_real_s3:
@@ -52,9 +49,10 @@ runs:
using: "composite"
steps:
- name: Get Neon artifact
if: inputs.build_type != 'remote'
uses: ./.github/actions/download
with:
name: neon-${{ runner.os }}-${{ inputs.build_type }}-${{ inputs.rust_toolchain }}-artifact
name: neon-${{ runner.os }}-${{ inputs.build_type }}-artifact
path: /tmp/neon
- name: Checkout
@@ -78,16 +76,22 @@ runs:
- name: Run pytest
env:
NEON_BIN: /tmp/neon/bin
POSTGRES_DISTRIB_DIR: /tmp/neon/pg_install
TEST_OUTPUT: /tmp/test_output
# this variable will be embedded in perf test report
# and is needed to distinguish different environments
PLATFORM: github-actions-selfhosted
BUILD_TYPE: ${{ inputs.build_type }}
AWS_ACCESS_KEY_ID: ${{ inputs.real_s3_access_key_id }}
AWS_SECRET_ACCESS_KEY: ${{ inputs.real_s3_secret_access_key }}
shell: bash -euxo pipefail {0}
run: |
# PLATFORM will be embedded in the perf test report
# and it is needed to distinguish different environments
export PLATFORM=${PLATFORM:-github-actions-selfhosted}
export POSTGRES_DISTRIB_DIR=${POSTGRES_DISTRIB_DIR:-/tmp/neon/pg_install}
export DEFAULT_PG_VERSION=${DEFAULT_PG_VERSION:-14}
if [ "${BUILD_TYPE}" = "remote" ]; then
export REMOTE_ENV=1
fi
PERF_REPORT_DIR="$(realpath test_runner/perf-report-local)"
rm -rf $PERF_REPORT_DIR
@@ -109,16 +113,21 @@ runs:
fi
if [[ "${{ inputs.save_perf_report }}" == "true" ]]; then
if [[ "$GITHUB_REF" == "refs/heads/main" ]]; then
mkdir -p "$PERF_REPORT_DIR"
EXTRA_PARAMS="--out-dir $PERF_REPORT_DIR $EXTRA_PARAMS"
fi
mkdir -p "$PERF_REPORT_DIR"
EXTRA_PARAMS="--out-dir $PERF_REPORT_DIR $EXTRA_PARAMS"
fi
if [[ "${{ inputs.build_type }}" == "debug" ]]; then
cov_prefix=(scripts/coverage "--profraw-prefix=$GITHUB_JOB" --dir=/tmp/coverage run)
elif [[ "${{ inputs.build_type }}" == "release" ]]; then
cov_prefix=()
else
cov_prefix=()
fi
# Wake up the cluster if we use a remote neon instance
if [ "${{ inputs.build_type }}" = "remote" ] && [ -n "${BENCHMARK_CONNSTR}" ]; then
${POSTGRES_DISTRIB_DIR}/v${DEFAULT_PG_VERSION}/bin/psql ${BENCHMARK_CONNSTR} -c "SELECT version();"
fi
# Run the tests.
@@ -131,32 +140,24 @@ runs:
# -n4 uses four processes to run tests via pytest-xdist
# -s is not used to prevent pytest from capturing output, because tests are running
# in parallel and logs are mixed between different tests
mkdir -p $TEST_OUTPUT/allure/results
"${cov_prefix[@]}" ./scripts/pytest \
--junitxml=$TEST_OUTPUT/junit.xml \
--alluredir=$TEST_OUTPUT/allure/results \
--tb=short \
--verbose \
-m "not remote_cluster" \
-rA $TEST_SELECTION $EXTRA_PARAMS
if [[ "${{ inputs.save_perf_report }}" == "true" ]]; then
if [[ "$GITHUB_REF" == "refs/heads/main" ]]; then
export REPORT_FROM="$PERF_REPORT_DIR"
export REPORT_TO=local
scripts/generate_and_push_perf_report.sh
fi
export REPORT_FROM="$PERF_REPORT_DIR"
export REPORT_TO="$PLATFORM"
scripts/generate_and_push_perf_report.sh
fi
- name: Delete all data but logs
shell: bash -euxo pipefail {0}
- name: Create Allure report
if: always()
run: |
du -sh /tmp/test_output/*
find /tmp/test_output -type f ! -name "*.log" ! -name "regression.diffs" ! -name "junit.xml" ! -name "*.filediff" ! -name "*.stdout" ! -name "*.stderr" ! -name "flamegraph.svg" ! -name "*.metrics" -delete
du -sh /tmp/test_output/*
- name: Upload python test logs
if: always()
uses: ./.github/actions/upload
uses: ./.github/actions/allure-report
with:
name: python-test-${{ inputs.test_selection }}-${{ runner.os }}-${{ inputs.build_type }}-${{ inputs.rust_toolchain }}-logs
path: /tmp/test_output/
action: store
build_type: ${{ inputs.build_type }}
test_selection: ${{ inputs.test_selection }}

View File

@@ -7,6 +7,9 @@ inputs:
path:
description: "A directory or file to upload"
required: true
prefix:
description: "S3 prefix. Default is '${GITHUB_RUN_ID}/${GITHUB_RUN_ATTEMPT}'"
required: false
runs:
using: "composite"
@@ -42,14 +45,14 @@ runs:
env:
SOURCE: ${{ inputs.path }}
ARCHIVE: /tmp/uploads/${{ inputs.name }}.tar.zst
PREFIX: artifacts/${{ inputs.prefix || format('{0}/{1}', github.run_id, github.run_attempt) }}
run: |
BUCKET=neon-github-public-dev
PREFIX=artifacts/${GITHUB_RUN_ID}
FILENAME=$(basename $ARCHIVE)
FILESIZE=$(du -sh ${ARCHIVE} | cut -f1)
time aws s3 mv --only-show-errors ${ARCHIVE} s3://${BUCKET}/${PREFIX}/${GITHUB_RUN_ATTEMPT}/${FILENAME}
time aws s3 mv --only-show-errors ${ARCHIVE} s3://${BUCKET}/${PREFIX}/${FILENAME}
# Ref https://docs.github.com/en/actions/using-workflows/workflow-commands-for-github-actions#adding-a-job-summary
echo "[${FILENAME}](https://${BUCKET}.s3.amazonaws.com/${PREFIX}/${GITHUB_RUN_ATTEMPT}/${FILENAME}) ${FILESIZE}" >> ${GITHUB_STEP_SUMMARY}
echo "[${FILENAME}](https://${BUCKET}.s3.amazonaws.com/${PREFIX}/${FILENAME}) ${FILESIZE}" >> ${GITHUB_STEP_SUMMARY}

View File

@@ -58,7 +58,7 @@
creates: "/storage/pageserver/data/tenants"
environment:
NEON_REPO_DIR: "/storage/pageserver/data"
LD_LIBRARY_PATH: "/usr/local/lib"
LD_LIBRARY_PATH: "/usr/local/v14/lib"
become: true
tags:
- pageserver
@@ -132,7 +132,7 @@
creates: "/storage/safekeeper/data/safekeeper.id"
environment:
NEON_REPO_DIR: "/storage/safekeeper/data"
LD_LIBRARY_PATH: "/usr/local/lib"
LD_LIBRARY_PATH: "/usr/local/v14/lib"
become: true
tags:
- safekeeper

View File

@@ -21,10 +21,14 @@ docker pull --quiet neondatabase/neon:${DOCKER_TAG}
ID=$(docker create neondatabase/neon:${DOCKER_TAG})
docker cp ${ID}:/data/postgres_install.tar.gz .
tar -xzf postgres_install.tar.gz -C neon_install
mkdir neon_install/bin/
docker cp ${ID}:/usr/local/bin/pageserver neon_install/bin/
docker cp ${ID}:/usr/local/bin/safekeeper neon_install/bin/
docker cp ${ID}:/usr/local/bin/proxy neon_install/bin/
docker cp ${ID}:/usr/local/bin/postgres neon_install/bin/
docker cp ${ID}:/usr/local/v14/bin/ neon_install/v14/bin/
docker cp ${ID}:/usr/local/v15/bin/ neon_install/v15/bin/
docker cp ${ID}:/usr/local/v14/lib/ neon_install/v14/lib/
docker cp ${ID}:/usr/local/v15/lib/ neon_install/v15/lib/
docker rm -vf ${ID}
# store version to file (for ansible playbooks) and create binaries tarball

View File

@@ -2,11 +2,16 @@
#zenith-us-stage-ps-1 console_region_id=27
zenith-us-stage-ps-2 console_region_id=27
zenith-us-stage-ps-3 console_region_id=27
zenith-us-stage-ps-4 console_region_id=27
zenith-us-stage-test-ps-1 console_region_id=28
[safekeepers]
zenith-us-stage-sk-4 console_region_id=27
zenith-us-stage-sk-5 console_region_id=27
zenith-us-stage-sk-6 console_region_id=27
zenith-us-stage-test-sk-1 console_region_id=28
zenith-us-stage-test-sk-2 console_region_id=28
zenith-us-stage-test-sk-3 console_region_id=28
[storage:children]
pageservers

View File

@@ -5,7 +5,7 @@ After=network.target auditd.service
[Service]
Type=simple
User=pageserver
Environment=RUST_BACKTRACE=1 NEON_REPO_DIR=/storage/pageserver LD_LIBRARY_PATH=/usr/local/lib
Environment=RUST_BACKTRACE=1 NEON_REPO_DIR=/storage/pageserver LD_LIBRARY_PATH=/usr/local/v14/lib
ExecStart=/usr/local/bin/pageserver -c "pg_distrib_dir='/usr/local'" -c "listen_pg_addr='0.0.0.0:6400'" -c "listen_http_addr='0.0.0.0:9898'" -c "broker_endpoints=['{{ etcd_endpoints }}']" -D /storage/pageserver/data
ExecReload=/bin/kill -HUP $MAINPID
KillMode=mixed

View File

@@ -5,7 +5,7 @@ After=network.target auditd.service
[Service]
Type=simple
User=safekeeper
Environment=RUST_BACKTRACE=1 NEON_REPO_DIR=/storage/safekeeper/data LD_LIBRARY_PATH=/usr/local/lib
Environment=RUST_BACKTRACE=1 NEON_REPO_DIR=/storage/safekeeper/data LD_LIBRARY_PATH=/usr/local/v14/lib
ExecStart=/usr/local/bin/safekeeper -l {{ inventory_hostname }}.local:6500 --listen-http {{ inventory_hostname }}.local:7676 -D /storage/safekeeper/data --broker-endpoints={{ etcd_endpoints }} --remote-storage='{bucket_name="{{bucket_name}}", bucket_region="{{bucket_region}}", prefix_in_bucket="{{ env_name }}/wal"}'
ExecReload=/bin/kill -HUP $MAINPID
KillMode=mixed

View File

@@ -1,6 +1,7 @@
fullnameOverride: "neon-stress-proxy"
settings:
authBackend: "link"
authEndpoint: "https://console.dev.neon.tech/authenticate_proxy_request/"
uri: "https://console.dev.neon.tech/psql_session/"

View File

@@ -1,4 +1,5 @@
settings:
authBackend: "link"
authEndpoint: "https://console.neon.tech/authenticate_proxy_request/"
uri: "https://console.neon.tech/psql_session/"

View File

@@ -5,6 +5,7 @@ image:
repository: neondatabase/neon
settings:
authBackend: "link"
authEndpoint: "https://console.stage.neon.tech/authenticate_proxy_request/"
uri: "https://console.stage.neon.tech/psql_session/"

View File

@@ -11,9 +11,20 @@ on:
# │ │ ┌───────────── day of the month (1 - 31)
# │ │ │ ┌───────────── month (1 - 12 or JAN-DEC)
# │ │ │ │ ┌───────────── day of the week (0 - 6 or SUN-SAT)
- cron: '36 4 * * *' # run once a day, timezone is utc
- cron: '0 3 * * *' # run once a day, timezone is utc
workflow_dispatch: # adds ability to run this manually
inputs:
environment:
description: 'Environment to run remote tests on (dev or staging)'
required: false
region_id:
description: 'Use a particular region. If not set the default region will be used'
required: false
save_perf_report:
type: boolean
description: 'Publish perf report or not. If not set, the report is published only for the main branch'
required: false
defaults:
run:
@@ -35,7 +46,8 @@ jobs:
runs-on: [self-hosted, zenith-benchmarker]
env:
POSTGRES_DISTRIB_DIR: "/usr/pgsql-14"
POSTGRES_DISTRIB_DIR: /tmp/pg_install
DEFAULT_PG_VERSION: 14
steps:
- name: Checkout zenith repo
@@ -60,21 +72,14 @@ jobs:
echo Poetry
poetry --version
echo Pgbench
$POSTGRES_DISTRIB_DIR/bin/pgbench --version
${POSTGRES_DISTRIB_DIR}/v${DEFAULT_PG_VERSION}/bin/pgbench --version
# FIXME cluster setup is skipped due to various changes in console API
# for now pre created cluster is used. When API gain some stability
# after massive changes dynamic cluster setup will be revived.
# So use pre created cluster. It needs to be started manually, but stop is automatic after 5 minutes of inactivity
- name: Setup cluster
env:
BENCHMARK_CONNSTR: "${{ secrets.BENCHMARK_STAGING_CONNSTR }}"
run: |
set -e
echo "Starting cluster"
# wake up the cluster
$POSTGRES_DISTRIB_DIR/bin/psql $BENCHMARK_CONNSTR -c "SELECT 1"
- name: Create Neon Project
id: create-neon-project
uses: ./.github/actions/neon-project-create
with:
environment: ${{ github.event.inputs.environment || 'staging' }}
api_key: ${{ ( github.event.inputs.environment || 'staging' ) == 'staging' && secrets.NEON_STAGING_API_KEY || secrets.NEON_CAPTEST_API_KEY }}
- name: Run benchmark
# pgbench is installed system wide from official repo
@@ -97,7 +102,7 @@ jobs:
TEST_PG_BENCH_DURATIONS_MATRIX: "300"
TEST_PG_BENCH_SCALES_MATRIX: "10,100"
PLATFORM: "neon-staging"
BENCHMARK_CONNSTR: "${{ secrets.BENCHMARK_STAGING_CONNSTR }}"
BENCHMARK_CONNSTR: ${{ steps.create-neon-project.outputs.dsn }}
REMOTE_ENV: "1" # indicate to test harness that we do not have zenith binaries locally
run: |
# just to be sure that no data was cached on self hosted runner
@@ -106,7 +111,7 @@ jobs:
mkdir -p perf-report-staging
# Set the --sparse-ordering option of the pytest-order plugin to ensure tests run in the order they appear in the file;
# it's important for the test_perf_pgbench.py::test_pgbench_remote_* tests
./scripts/pytest test_runner/performance/ -v -m "remote_cluster" --sparse-ordering --skip-interfering-proc-check --out-dir perf-report-staging --timeout 5400
./scripts/pytest test_runner/performance/ -v -m "remote_cluster" --sparse-ordering --out-dir perf-report-staging --timeout 5400
- name: Submit result
env:
@@ -115,6 +120,14 @@ jobs:
run: |
REPORT_FROM=$(realpath perf-report-staging) REPORT_TO=staging scripts/generate_and_push_perf_report.sh
- name: Delete Neon Project
if: ${{ always() }}
uses: ./.github/actions/neon-project-delete
with:
environment: staging
project_id: ${{ steps.create-neon-project.outputs.project_id }}
api_key: ${{ secrets.NEON_STAGING_API_KEY }}
- name: Post to a Slack channel
if: ${{ github.event.schedule && failure() }}
uses: slackapi/slack-github-action@v1
@@ -128,101 +141,145 @@ jobs:
env:
TEST_PG_BENCH_DURATIONS_MATRIX: "60m"
TEST_PG_BENCH_SCALES_MATRIX: "10gb"
REMOTE_ENV: "1"
POSTGRES_DISTRIB_DIR: /usr
POSTGRES_DISTRIB_DIR: /tmp/neon/pg_install
DEFAULT_PG_VERSION: 14
TEST_OUTPUT: /tmp/test_output
BUILD_TYPE: remote
SAVE_PERF_REPORT: ${{ github.event.inputs.save_perf_report || ( github.ref == 'refs/heads/main' ) }}
strategy:
fail-fast: false
matrix:
connstr: [ BENCHMARK_CAPTEST_CONNSTR, BENCHMARK_RDS_CONNSTR ]
# neon-captest-new: Run pgbench in a freshly created project
# neon-captest-reuse: Same, but reusing existing project
# neon-captest-prefetch: Same, with prefetching enabled (new project)
platform: [ neon-captest-new, neon-captest-reuse, neon-captest-prefetch, rds-aurora ]
runs-on: dev
container: 369495373322.dkr.ecr.eu-central-1.amazonaws.com/rustlegacy:2817580636
container:
image: 369495373322.dkr.ecr.eu-central-1.amazonaws.com/rustlegacy:pinned
options: --init
timeout-minutes: 360 # 6h
steps:
- uses: actions/checkout@v3
- name: Cache poetry deps
id: cache_poetry
uses: actions/cache@v3
- name: Download Neon artifact
uses: ./.github/actions/download
with:
path: ~/.cache/pypoetry/virtualenvs
key: v2-${{ runner.os }}-python-deps-${{ hashFiles('poetry.lock') }}
name: neon-${{ runner.os }}-release-artifact
path: /tmp/neon/
prefix: latest
- name: Install Python deps
run: ./scripts/pysync
- name: Add Postgres binaries to PATH
run: |
${POSTGRES_DISTRIB_DIR}/v${DEFAULT_PG_VERSION}/bin/pgbench --version
echo "${POSTGRES_DISTRIB_DIR}/v${DEFAULT_PG_VERSION}/bin" >> $GITHUB_PATH
- name: Calculate platform
id: calculate-platform
- name: Create Neon Project
if: matrix.platform != 'neon-captest-reuse'
id: create-neon-project
uses: ./.github/actions/neon-project-create
with:
environment: ${{ github.event.inputs.environment || 'dev' }}
api_key: ${{ ( github.event.inputs.environment || 'dev' ) == 'staging' && secrets.NEON_STAGING_API_KEY || secrets.NEON_CAPTEST_API_KEY }}
- name: Set up Connection String
id: set-up-connstr
run: |
case "${PLATFORM}" in
neon-captest-reuse)
CONNSTR=${{ secrets.BENCHMARK_CAPTEST_CONNSTR }}
;;
neon-captest-new | neon-captest-prefetch)
CONNSTR=${{ steps.create-neon-project.outputs.dsn }}
;;
rds-aurora)
CONNSTR=${{ secrets.BENCHMARK_RDS_CONNSTR }}
;;
*)
echo 2>&1 "Unknown PLATFORM=${PLATFORM}. Allowed only 'neon-captest-reuse', 'neon-captest-new', 'neon-captest-prefetch' or 'rds-aurora'"
exit 1
;;
esac
echo "::set-output name=connstr::${CONNSTR}"
psql ${CONNSTR} -c "SELECT version();"
env:
CONNSTR: ${{ matrix.connstr }}
run: |
if [ "${CONNSTR}" = "BENCHMARK_CAPTEST_CONNSTR" ]; then
PLATFORM=neon-captest
elif [ "${CONNSTR}" = "BENCHMARK_RDS_CONNSTR" ]; then
PLATFORM=rds-aurora
else
echo 2>&1 "Unknown CONNSTR=${CONNSTR}. Allowed are BENCHMARK_CAPTEST_CONNSTR, and BENCHMARK_RDS_CONNSTR only"
exit 1
fi
PLATFORM: ${{ matrix.platform }}
echo "::set-output name=PLATFORM::${PLATFORM}"
- name: Install Deps
- name: Set database options
if: matrix.platform == 'neon-captest-prefetch'
run: |
echo "deb http://apt.postgresql.org/pub/repos/apt focal-pgdg main" | sudo tee /etc/apt/sources.list.d/pgdg.list
wget --quiet -O - https://www.postgresql.org/media/keys/ACCC4CF8.asc | sudo apt-key add -
sudo apt -y update
sudo apt install -y postgresql-14 postgresql-client-14
psql ${BENCHMARK_CONNSTR} -c "ALTER DATABASE main SET enable_seqscan_prefetch=on"
psql ${BENCHMARK_CONNSTR} -c "ALTER DATABASE main SET seqscan_prefetch_buffers=10"
env:
BENCHMARK_CONNSTR: ${{ steps.set-up-connstr.outputs.connstr }}
- name: Benchmark init
uses: ./.github/actions/run-python-test-set
with:
build_type: ${{ env.BUILD_TYPE }}
test_selection: performance
run_in_parallel: false
save_perf_report: ${{ env.SAVE_PERF_REPORT }}
extra_params: -m remote_cluster --timeout 21600 -k test_pgbench_remote_init
env:
PLATFORM: ${{ steps.calculate-platform.outputs.PLATFORM }}
BENCHMARK_CONNSTR: ${{ secrets[matrix.connstr] }}
run: |
mkdir -p perf-report-captest
psql $BENCHMARK_CONNSTR -c "SELECT 1;"
./scripts/pytest test_runner/performance/test_perf_pgbench.py::test_pgbench_remote_init -v -m "remote_cluster" --skip-interfering-proc-check --out-dir perf-report-captest --timeout 21600
- name: Benchmark simple-update
env:
PLATFORM: ${{ steps.calculate-platform.outputs.PLATFORM }}
BENCHMARK_CONNSTR: ${{ secrets[matrix.connstr] }}
run: |
psql $BENCHMARK_CONNSTR -c "SELECT 1;"
./scripts/pytest test_runner/performance/test_perf_pgbench.py::test_pgbench_remote_simple_update -v -m "remote_cluster" --skip-interfering-proc-check --out-dir perf-report-captest --timeout 21600
- name: Benchmark select-only
env:
PLATFORM: ${{ steps.calculate-platform.outputs.PLATFORM }}
BENCHMARK_CONNSTR: ${{ secrets[matrix.connstr] }}
run: |
psql $BENCHMARK_CONNSTR -c "SELECT 1;"
./scripts/pytest test_runner/performance/test_perf_pgbench.py::test_pgbench_remote_select_only -v -m "remote_cluster" --skip-interfering-proc-check --out-dir perf-report-captest --timeout 21600
- name: Submit result
env:
PLATFORM: ${{ matrix.platform }}
BENCHMARK_CONNSTR: ${{ steps.set-up-connstr.outputs.connstr }}
VIP_VAP_ACCESS_TOKEN: "${{ secrets.VIP_VAP_ACCESS_TOKEN }}"
PERF_TEST_RESULT_CONNSTR: "${{ secrets.PERF_TEST_RESULT_CONNSTR }}"
run: |
REPORT_FROM=$(realpath perf-report-captest) REPORT_TO=staging scripts/generate_and_push_perf_report.sh
- name: Upload logs
if: always()
uses: ./.github/actions/upload
- name: Benchmark simple-update
uses: ./.github/actions/run-python-test-set
with:
name: bench-captest-${{ steps.calculate-platform.outputs.PLATFORM }}
path: /tmp/test_output/
build_type: ${{ env.BUILD_TYPE }}
test_selection: performance
run_in_parallel: false
save_perf_report: ${{ env.SAVE_PERF_REPORT }}
extra_params: -m remote_cluster --timeout 21600 -k test_pgbench_remote_simple_update
env:
PLATFORM: ${{ matrix.platform }}
BENCHMARK_CONNSTR: ${{ steps.set-up-connstr.outputs.connstr }}
VIP_VAP_ACCESS_TOKEN: "${{ secrets.VIP_VAP_ACCESS_TOKEN }}"
PERF_TEST_RESULT_CONNSTR: "${{ secrets.PERF_TEST_RESULT_CONNSTR }}"
- name: Benchmark select-only
uses: ./.github/actions/run-python-test-set
with:
build_type: ${{ env.BUILD_TYPE }}
test_selection: performance
run_in_parallel: false
save_perf_report: ${{ env.SAVE_PERF_REPORT }}
extra_params: -m remote_cluster --timeout 21600 -k test_pgbench_remote_select_only
env:
PLATFORM: ${{ matrix.platform }}
BENCHMARK_CONNSTR: ${{ steps.set-up-connstr.outputs.connstr }}
VIP_VAP_ACCESS_TOKEN: "${{ secrets.VIP_VAP_ACCESS_TOKEN }}"
PERF_TEST_RESULT_CONNSTR: "${{ secrets.PERF_TEST_RESULT_CONNSTR }}"
- name: Create Allure report
if: always()
uses: ./.github/actions/allure-report
with:
action: generate
build_type: ${{ env.BUILD_TYPE }}
- name: Delete Neon Project
if: ${{ matrix.platform != 'neon-captest-reuse' && always() }}
uses: ./.github/actions/neon-project-delete
with:
environment: dev
project_id: ${{ steps.create-neon-project.outputs.project_id }}
api_key: ${{ secrets.NEON_CAPTEST_API_KEY }}
- name: Post to a Slack channel
if: ${{ github.event.schedule && failure() }}
uses: slackapi/slack-github-action@v1
with:
channel-id: "C033QLM5P7D" # dev-staging-stream
slack-message: "Periodic perf testing: ${{ job.status }}\n${{ github.server_url }}/${{ github.repository }}/actions/runs/${{ github.run_id }}"
slack-message: "Periodic perf testing ${{ matrix.platform }}: ${{ job.status }}\n${{ github.server_url }}/${{ github.repository }}/actions/runs/${{ github.run_id }}"
env:
SLACK_BOT_TOKEN: ${{ secrets.SLACK_BOT_TOKEN }}

View File

@@ -54,7 +54,6 @@ jobs:
fail-fast: false
matrix:
build_type: [ debug, release ]
rust_toolchain: [ 1.58 ]
env:
BUILD_TYPE: ${{ matrix.build_type }}
@@ -77,9 +76,14 @@ jobs:
submodules: true
fetch-depth: 1
- name: Set pg revision for caching
id: pg_ver
run: echo ::set-output name=pg_rev::$(git rev-parse HEAD:vendor/postgres)
- name: Set pg 14 revision for caching
id: pg_v14_rev
run: echo ::set-output name=pg_rev::$(git rev-parse HEAD:vendor/postgres-v14)
shell: bash -euxo pipefail {0}
- name: Set pg 15 revision for caching
id: pg_v15_rev
run: echo ::set-output name=pg_rev::$(git rev-parse HEAD:vendor/postgres-v15)
shell: bash -euxo pipefail {0}
# Set some environment variables used by all the steps.
@@ -90,16 +94,18 @@ jobs:
# CARGO_FEATURES is passed to "cargo metadata". It is separate from CARGO_FLAGS,
# because "cargo metadata" doesn't accept --release or --debug options
#
# We run tests with additional features that are turned off by default (e.g. in release builds), see
# corresponding Cargo.toml files for their descriptions.
- name: Set env variables
run: |
if [[ $BUILD_TYPE == "debug" ]]; then
cov_prefix="scripts/coverage --profraw-prefix=$GITHUB_JOB --dir=/tmp/coverage run"
CARGO_FEATURES=""
CARGO_FLAGS=""
CARGO_FEATURES="--features testing"
CARGO_FLAGS="--locked --timings $CARGO_FEATURES"
elif [[ $BUILD_TYPE == "release" ]]; then
cov_prefix=""
CARGO_FEATURES="--features profiling"
CARGO_FLAGS="--release $CARGO_FEATURES"
CARGO_FEATURES="--features testing,profiling"
CARGO_FLAGS="--locked --timings --release $CARGO_FEATURES"
fi
echo "cov_prefix=${cov_prefix}" >> $GITHUB_ENV
echo "CARGO_FEATURES=${CARGO_FEATURES}" >> $GITHUB_ENV
@@ -121,24 +127,40 @@ jobs:
target/
# Fall back to older versions of the key, if no cache for current Cargo.lock was found
key: |
v6-${{ runner.os }}-${{ matrix.build_type }}-cargo-${{ matrix.rust_toolchain }}-${{ hashFiles('Cargo.lock') }}
v6-${{ runner.os }}-${{ matrix.build_type }}-cargo-${{ matrix.rust_toolchain }}-
v9-${{ runner.os }}-${{ matrix.build_type }}-cargo-${{ hashFiles('Cargo.lock') }}
v9-${{ runner.os }}-${{ matrix.build_type }}-cargo-
- name: Cache postgres build
id: cache_pg
- name: Cache postgres v14 build
id: cache_pg_14
uses: actions/cache@v3
with:
path: tmp_install/
key: v1-${{ runner.os }}-${{ matrix.build_type }}-pg-${{ steps.pg_ver.outputs.pg_rev }}-${{ hashFiles('Makefile') }}
path: pg_install/v14
key: v1-${{ runner.os }}-${{ matrix.build_type }}-pg-${{ steps.pg_v14_rev.outputs.pg_rev }}-${{ hashFiles('Makefile') }}
- name: Build postgres
if: steps.cache_pg.outputs.cache-hit != 'true'
run: mold -run make postgres -j$(nproc)
- name: Cache postgres v15 build
id: cache_pg_15
uses: actions/cache@v3
with:
path: pg_install/v15
key: v1-${{ runner.os }}-${{ matrix.build_type }}-pg-${{ steps.pg_v15_rev.outputs.pg_rev }}-${{ hashFiles('Makefile') }}
- name: Build postgres v14
if: steps.cache_pg_14.outputs.cache-hit != 'true'
run: mold -run make postgres-v14 -j$(nproc)
shell: bash -euxo pipefail {0}
- name: Build postgres v15
if: steps.cache_pg_15.outputs.cache-hit != 'true'
run: mold -run make postgres-v15 -j$(nproc)
shell: bash -euxo pipefail {0}
- name: Build neon extensions
run: mold -run make neon-pg-ext -j$(nproc)
shell: bash -euxo pipefail {0}
- name: Run cargo build
run: |
${cov_prefix} mold -run cargo build $CARGO_FLAGS --features failpoints --bins --tests
${cov_prefix} mold -run cargo build $CARGO_FLAGS --bins --tests
shell: bash -euxo pipefail {0}
- name: Run cargo test
@@ -188,21 +210,32 @@ jobs:
shell: bash -euxo pipefail {0}
- name: Install postgres binaries
run: cp -a tmp_install /tmp/neon/pg_install
run: cp -a pg_install /tmp/neon/pg_install
shell: bash -euxo pipefail {0}
- name: Upload Neon artifact
uses: ./.github/actions/upload
with:
name: neon-${{ runner.os }}-${{ matrix.build_type }}-${{ matrix.rust_toolchain }}-artifact
name: neon-${{ runner.os }}-${{ matrix.build_type }}-artifact
path: /tmp/neon
- name: Prepare cargo build timing stats for storing
run: |
mkdir -p "/tmp/neon/cargo-timings/$BUILD_TYPE/"
cp -r ./target/cargo-timings/* "/tmp/neon/cargo-timings/$BUILD_TYPE/"
shell: bash -euxo pipefail {0}
- name: Upload cargo build stats
uses: ./.github/actions/upload
with:
name: neon-${{ runner.os }}-${{ matrix.build_type }}-build-stats
path: /tmp/neon/cargo-timings/
# XXX: keep this after the binaries.list is formed, so the coverage can properly work later
- name: Merge and upload coverage data
if: matrix.build_type == 'debug'
uses: ./.github/actions/save-coverage-data
pg_regress-tests:
regress-tests:
runs-on: dev
container:
image: 369495373322.dkr.ecr.eu-central-1.amazonaws.com/rust:pinned
@@ -212,7 +245,6 @@ jobs:
fail-fast: false
matrix:
build_type: [ debug, release ]
rust_toolchain: [ 1.58 ]
steps:
- name: Checkout
uses: actions/checkout@v3
@@ -220,51 +252,48 @@ jobs:
submodules: true
fetch-depth: 2
- name: Pytest regress tests
- name: Pytest regression tests
uses: ./.github/actions/run-python-test-set
with:
build_type: ${{ matrix.build_type }}
rust_toolchain: ${{ matrix.rust_toolchain }}
test_selection: batch_pg_regress
test_selection: regress
needs_postgres_source: true
- name: Merge and upload coverage data
if: matrix.build_type == 'debug'
uses: ./.github/actions/save-coverage-data
other-tests:
runs-on: dev
container:
image: 369495373322.dkr.ecr.eu-central-1.amazonaws.com/rust:pinned
options: --init
needs: [ build-neon ]
strategy:
fail-fast: false
matrix:
build_type: [ debug, release ]
rust_toolchain: [ 1.58 ]
steps:
- name: Checkout
uses: actions/checkout@v3
with:
submodules: true
fetch-depth: 2
- name: Pytest other tests
uses: ./.github/actions/run-python-test-set
with:
build_type: ${{ matrix.build_type }}
rust_toolchain: ${{ matrix.rust_toolchain }}
test_selection: batch_others
run_with_real_s3: true
real_s3_bucket: ci-tests-s3
real_s3_region: us-west-2
real_s3_access_key_id: "${{ secrets.AWS_ACCESS_KEY_ID_CI_TESTS_S3 }}"
real_s3_secret_access_key: "${{ secrets.AWS_SECRET_ACCESS_KEY_CI_TESTS_S3 }}"
- name: Merge and upload coverage data
if: matrix.build_type == 'debug'
uses: ./.github/actions/save-coverage-data
upload-latest-artifacts:
runs-on: dev
container:
image: 369495373322.dkr.ecr.eu-central-1.amazonaws.com/rust:pinned
options: --init
needs: [ regress-tests ]
if: github.ref_name == 'main'
steps:
- name: Copy Neon artifact to the latest directory
shell: bash -euxo pipefail {0}
env:
BUCKET: neon-github-public-dev
PREFIX: artifacts/${{ github.run_id }}
run: |
for build_type in debug release; do
FILENAME=neon-${{ runner.os }}-${build_type}-artifact.tar.zst
S3_KEY=$(aws s3api list-objects-v2 --bucket ${BUCKET} --prefix ${PREFIX} | jq -r '.Contents[].Key' | grep ${FILENAME} | sort --version-sort | tail -1 || true)
if [ -z "${S3_KEY}" ]; then
echo 2>&1 "Neither s3://${BUCKET}/${PREFIX}/${FILENAME} nor its version from previous attempts exist"
exit 1
fi
time aws s3 cp --only-show-errors s3://${BUCKET}/${S3_KEY} s3://${BUCKET}/artifacts/latest/${FILENAME}
done
benchmarks:
runs-on: dev
container:
@@ -276,7 +305,6 @@ jobs:
fail-fast: false
matrix:
build_type: [ release ]
rust_toolchain: [ 1.58 ]
steps:
- name: Checkout
uses: actions/checkout@v3
@@ -288,27 +316,63 @@ jobs:
uses: ./.github/actions/run-python-test-set
with:
build_type: ${{ matrix.build_type }}
rust_toolchain: ${{ matrix.rust_toolchain }}
test_selection: performance
run_in_parallel: false
save_perf_report: true
save_perf_report: ${{ github.ref == 'refs/heads/main' }}
env:
VIP_VAP_ACCESS_TOKEN: "${{ secrets.VIP_VAP_ACCESS_TOKEN }}"
PERF_TEST_RESULT_CONNSTR: "${{ secrets.PERF_TEST_RESULT_CONNSTR }}"
# XXX: no coverage data handling here, since benchmarks are run on release builds,
# while coverage is currently collected for the debug ones
merge-allure-report:
runs-on: dev
container:
image: 369495373322.dkr.ecr.eu-central-1.amazonaws.com/rust:pinned
options: --init
needs: [ regress-tests, benchmarks ]
if: always()
strategy:
fail-fast: false
matrix:
build_type: [ debug, release ]
steps:
- name: Checkout
uses: actions/checkout@v3
with:
submodules: false
- name: Create Allure report
id: create-allure-report
uses: ./.github/actions/allure-report
with:
action: generate
build_type: ${{ matrix.build_type }}
- name: Store Allure test stat in the DB
if: ${{ steps.create-allure-report.outputs.report-url }}
env:
BUILD_TYPE: ${{ matrix.build_type }}
SHA: ${{ github.event.pull_request.head.sha || github.sha }}
REPORT_URL: ${{ steps.create-allure-report.outputs.report-url }}
TEST_RESULT_CONNSTR: ${{ secrets.REGRESS_TEST_RESULT_CONNSTR }}
shell: bash -euxo pipefail {0}
run: |
curl --fail --output suites.json ${REPORT_URL%/index.html}/data/suites.json
./scripts/pysync
DATABASE_URL="$TEST_RESULT_CONNSTR" poetry run python3 scripts/ingest_regress_test_result.py --revision ${SHA} --reference ${GITHUB_REF} --build-type ${BUILD_TYPE} --ingest suites.json
coverage-report:
runs-on: dev
container:
image: 369495373322.dkr.ecr.eu-central-1.amazonaws.com/rust:pinned
options: --init
needs: [ other-tests, pg_regress-tests ]
needs: [ regress-tests ]
strategy:
fail-fast: false
matrix:
build_type: [ debug ]
rust_toolchain: [ 1.58 ]
steps:
- name: Checkout
uses: actions/checkout@v3
@@ -325,12 +389,12 @@ jobs:
!~/.cargo/registry/src
~/.cargo/git/
target/
key: v5-${{ runner.os }}-${{ matrix.build_type }}-cargo-${{ matrix.rust_toolchain }}-${{ hashFiles('Cargo.lock') }}
key: v9-${{ runner.os }}-${{ matrix.build_type }}-cargo-${{ hashFiles('Cargo.lock') }}
- name: Get Neon artifact
uses: ./.github/actions/download
with:
name: neon-${{ runner.os }}-${{ matrix.build_type }}-${{ matrix.rust_toolchain }}-artifact
name: neon-${{ runner.os }}-${{ matrix.build_type }}-artifact
path: /tmp/neon
- name: Get coverage artifact
@@ -379,7 +443,7 @@ jobs:
trigger-e2e-tests:
runs-on: dev
container:
image: 369495373322.dkr.ecr.eu-central-1.amazonaws.com/rust:pinned
image: 369495373322.dkr.ecr.eu-central-1.amazonaws.com/base:pinned
options: --init
needs: [ build-neon ]
steps:
@@ -449,7 +513,6 @@ jobs:
compute-node-image:
runs-on: dev
container: gcr.io/kaniko-project/executor:v1.9.0-debug
steps:
- name: Checkout
uses: actions/checkout@v1 # v3 won't work with kaniko
@@ -460,19 +523,57 @@ jobs:
- name: Configure ECR login
run: echo "{\"credsStore\":\"ecr-login\"}" > /kaniko/.docker/config.json
- name: Kaniko build compute node
working-directory: ./vendor/postgres/
run: /kaniko/executor --snapshotMode=redo --cache=true --cache-repo 369495373322.dkr.ecr.eu-central-1.amazonaws.com/cache --snapshotMode=redo --context . --destination 369495373322.dkr.ecr.eu-central-1.amazonaws.com/compute-node:$GITHUB_RUN_ID
# compute-node uses postgres 14, which is the default now
# cloud repo depends on this image name, thus duplicating it
# remove compute-node when cloud repo is updated
- name: Kaniko build compute node with extensions v14 (compatibility)
run: /kaniko/executor --skip-unused-stages --snapshotMode=redo --cache=true --cache-repo 369495373322.dkr.ecr.eu-central-1.amazonaws.com/cache --snapshotMode=redo --context . --dockerfile Dockerfile.compute-node-v14 --destination 369495373322.dkr.ecr.eu-central-1.amazonaws.com/compute-node:$GITHUB_RUN_ID
compute-node-image-v14:
runs-on: dev
container: gcr.io/kaniko-project/executor:v1.9.0-debug
steps:
- name: Checkout
uses: actions/checkout@v1 # v3 won't work with kaniko
with:
submodules: true
fetch-depth: 0
- name: Configure ECR login
run: echo "{\"credsStore\":\"ecr-login\"}" > /kaniko/.docker/config.json
- name: Kaniko build compute node with extensions v14
run: /kaniko/executor --skip-unused-stages --snapshotMode=redo --cache=true --cache-repo 369495373322.dkr.ecr.eu-central-1.amazonaws.com/cache --context . --dockerfile Dockerfile.compute-node-v14 --destination 369495373322.dkr.ecr.eu-central-1.amazonaws.com/compute-node-v14:$GITHUB_RUN_ID
compute-node-image-v15:
runs-on: dev
container: gcr.io/kaniko-project/executor:v1.9.0-debug
steps:
- name: Checkout
uses: actions/checkout@v1 # v3 won't work with kaniko
with:
submodules: true
fetch-depth: 0
- name: Configure ECR login
run: echo "{\"credsStore\":\"ecr-login\"}" > /kaniko/.docker/config.json
- name: Kaniko build compute node with extensions v15
run: /kaniko/executor --skip-unused-stages --snapshotMode=redo --cache=true --cache-repo 369495373322.dkr.ecr.eu-central-1.amazonaws.com/cache --context . --dockerfile Dockerfile.compute-node-v15 --destination 369495373322.dkr.ecr.eu-central-1.amazonaws.com/compute-node-v15:$GITHUB_RUN_ID
promote-images:
runs-on: dev
needs: [ neon-image, compute-tools-image, compute-node-image ]
needs: [ neon-image, compute-node-image, compute-node-image-v14, compute-tools-image ]
if: github.event_name != 'workflow_dispatch'
container: amazon/aws-cli
strategy:
fail-fast: false
matrix:
name: [ neon, compute-tools, compute-node ]
# compute-node uses postgres 14, which is the default now
# cloud repo depends on this image name, thus duplicating it
# remove compute-node when cloud repo is updated
name: [ neon, compute-node, compute-node-v14, compute-tools ]
steps:
- name: Promote image to latest
@@ -489,18 +590,6 @@ jobs:
run: |
go install github.com/google/go-containerregistry/cmd/crane@31786c6cbb82d6ec4fb8eb79cd9387905130534e # v0.11.0
go install github.com/awslabs/amazon-ecr-credential-helper/ecr-login/cli/docker-credential-ecr-login@69c85dc22db6511932bbf119e1a0cc5c90c69a7f # v0.6.0
# - name: Get build tag
# run: |
# if [[ "$GITHUB_REF_NAME" == "main" ]]; then
# echo "::set-output name=tag::$(git rev-list --count HEAD)"
# elif [[ "$GITHUB_REF_NAME" == "release" ]]; then
# echo "::set-output name=tag::release-$(git rev-list --count HEAD)"
# else
# echo "GITHUB_REF_NAME (value '$GITHUB_REF_NAME') is not set to either 'main' or 'release' "
# echo "::set-output name=tag::$GITHUB_RUN_ID"
# fi
# id: build-tag
- name: Configure ECR login
run: |
@@ -516,7 +605,22 @@ jobs:
- name: Pull compute node image from ECR
run: crane pull 369495373322.dkr.ecr.eu-central-1.amazonaws.com/compute-node:latest compute-node
- name: Configure docker login
- name: Pull compute node v14 image from ECR
run: crane pull 369495373322.dkr.ecr.eu-central-1.amazonaws.com/compute-node-v14:latest compute-node-v14
- name: Pull rust image from ECR
run: crane pull 369495373322.dkr.ecr.eu-central-1.amazonaws.com/rust:pinned rust
- name: Push images to production ECR
if: |
(github.ref_name == 'main' || github.ref_name == 'release') &&
github.event_name != 'workflow_dispatch'
run: |
crane copy 369495373322.dkr.ecr.eu-central-1.amazonaws.com/neon:$GITHUB_RUN_ID 093970136003.dkr.ecr.us-east-2.amazonaws.com/neon:latest
crane copy 369495373322.dkr.ecr.eu-central-1.amazonaws.com/compute-tools:$GITHUB_RUN_ID 093970136003.dkr.ecr.us-east-2.amazonaws.com/compute-tools:latest
crane copy 369495373322.dkr.ecr.eu-central-1.amazonaws.com/compute-node:$GITHUB_RUN_ID 093970136003.dkr.ecr.us-east-2.amazonaws.com/compute-node:latest
- name: Configure Docker Hub login
run: |
# ECR Credential Helper & Docker Hub don't work together in config, hence reset
echo "" > /github/home/.docker/config.json
@@ -531,7 +635,13 @@ jobs:
- name: Push compute node image to Docker Hub
run: crane push compute-node neondatabase/compute-node:${{needs.tag.outputs.build-tag}}
- name: Add latest tag to images
- name: Push compute node v14 image to Docker Hub
run: crane push compute-node-v14 neondatabase/compute-node-v14:${{needs.tag.outputs.build-tag}}
- name: Push rust image to Docker Hub
run: crane push rust neondatabase/rust:pinned
- name: Add latest tag to images in Docker Hub
if: |
(github.ref_name == 'main' || github.ref_name == 'release') &&
github.event_name != 'workflow_dispatch'
@@ -539,6 +649,7 @@ jobs:
crane tag neondatabase/neon:${{needs.tag.outputs.build-tag}} latest
crane tag neondatabase/compute-tools:${{needs.tag.outputs.build-tag}} latest
crane tag neondatabase/compute-node:${{needs.tag.outputs.build-tag}} latest
crane tag neondatabase/compute-node-v14:${{needs.tag.outputs.build-tag}} latest
calculate-deploy-targets:
runs-on: [ self-hosted, Linux, k8s-runner ]
@@ -567,7 +678,7 @@ jobs:
#container: 369495373322.dkr.ecr.eu-central-1.amazonaws.com/base:latest
# We need both storage **and** compute images for deploy, because control plane picks the compute version based on the storage version.
# If it notices a fresh storage version it may bump the compute version. And if the compute image failed to build, it may break things badly
needs: [ push-docker-hub, calculate-deploy-targets, tag, other-tests, pg_regress-tests ]
needs: [ push-docker-hub, calculate-deploy-targets, tag, regress-tests ]
if: |
(github.ref_name == 'main' || github.ref_name == 'release') &&
github.event_name != 'workflow_dispatch'
@@ -622,7 +733,7 @@ jobs:
runs-on: dev
container: 369495373322.dkr.ecr.eu-central-1.amazonaws.com/base:latest
# Compute image isn't strictly required for proxy deploy, but let's still wait for it to run all deploy jobs consistently.
needs: [ push-docker-hub, calculate-deploy-targets, tag, other-tests, pg_regress-tests ]
needs: [ push-docker-hub, calculate-deploy-targets, tag, regress-tests ]
if: |
(github.ref_name == 'main' || github.ref_name == 'release') &&
github.event_name != 'workflow_dispatch'

View File

@@ -17,18 +17,21 @@ concurrency:
env:
RUST_BACKTRACE: 1
COPT: '-Werror'
jobs:
check-codestyle-rust:
strategy:
fail-fast: false
matrix:
# If we want to duplicate this job for different
# Rust toolchains (e.g. nightly or 1.37.0), add them here.
rust_toolchain: [1.58]
# XXX: both OSes have rustup
# * https://github.com/actions/runner-images/blob/main/images/macos/macos-12-Readme.md#rust-tools
# * https://github.com/actions/runner-images/blob/main/images/linux/Ubuntu2204-Readme.md#rust-tools
# this is all we need to install our toolchain later via rust-toolchain.toml
# so don't install any toolchain explicitly.
os: [ubuntu-latest, macos-latest]
timeout-minutes: 60
name: run regression test suite
timeout-minutes: 90
name: check codestyle rust and postgres
runs-on: ${{ matrix.os }}
steps:
@@ -38,14 +41,6 @@ jobs:
submodules: true
fetch-depth: 2
- name: Install rust toolchain ${{ matrix.rust_toolchain }}
uses: actions-rs/toolchain@v1
with:
profile: minimal
toolchain: ${{ matrix.rust_toolchain }}
components: rustfmt, clippy
override: true
- name: Check formatting
run: cargo fmt --all -- --check
@@ -59,17 +54,29 @@ jobs:
if: matrix.os == 'macos-latest'
run: brew install flex bison openssl
- name: Set pg revision for caching
id: pg_ver
run: echo ::set-output name=pg_rev::$(git rev-parse HEAD:vendor/postgres)
- name: Set pg 14 revision for caching
id: pg_v14_rev
run: echo ::set-output name=pg_rev::$(git rev-parse HEAD:vendor/postgres-v14)
shell: bash -euxo pipefail {0}
- name: Cache postgres build
id: cache_pg
uses: actions/cache@v2
- name: Set pg 15 revision for caching
id: pg_v15_rev
run: echo ::set-output name=pg_rev::$(git rev-parse HEAD:vendor/postgres-v15)
shell: bash -euxo pipefail {0}
- name: Cache postgres v14 build
id: cache_pg_14
uses: actions/cache@v3
with:
path: |
tmp_install/
key: ${{ runner.os }}-pg-${{ steps.pg_ver.outputs.pg_rev }}
path: pg_install/v14
key: v1-${{ runner.os }}-${{ matrix.build_type }}-pg-${{ steps.pg_v14_rev.outputs.pg_rev }}-${{ hashFiles('Makefile') }}
- name: Cache postgres v15 build
id: cache_pg_15
uses: actions/cache@v3
with:
path: pg_install/v15
key: v1-${{ runner.os }}-${{ matrix.build_type }}-pg-${{ steps.pg_v15_rev.outputs.pg_rev }}-${{ hashFiles('Makefile') }}
- name: Set extra env for macOS
if: matrix.os == 'macos-latest'
@@ -77,37 +84,55 @@ jobs:
echo 'LDFLAGS=-L/usr/local/opt/openssl@3/lib' >> $GITHUB_ENV
echo 'CPPFLAGS=-I/usr/local/opt/openssl@3/include' >> $GITHUB_ENV
- name: Build postgres
if: steps.cache_pg.outputs.cache-hit != 'true'
run: make postgres
- name: Build postgres v14
if: steps.cache_pg_14.outputs.cache-hit != 'true'
run: make postgres-v14
shell: bash -euxo pipefail {0}
# Plain configure output can contain weird errors like 'error: C compiler cannot create executables'
# and the real cause will be inside config.log
- name: Print configure logs in case of failure
if: failure()
continue-on-error: true
run: |
echo '' && echo '=== config.log ===' && echo ''
cat tmp_install/build/config.log
echo '' && echo '=== configure.log ===' && echo ''
cat tmp_install/build/configure.log
- name: Build postgres v15
if: steps.cache_pg_15.outputs.cache-hit != 'true'
run: make postgres-v15
shell: bash -euxo pipefail {0}
- name: Build neon extensions
run: make neon-pg-ext
- name: Cache cargo deps
id: cache_cargo
uses: actions/cache@v2
uses: actions/cache@v3
with:
path: |
~/.cargo/registry
!~/.cargo/registry/src
~/.cargo/git
target
key: v2-${{ runner.os }}-cargo-${{ hashFiles('./Cargo.lock') }}-rust-${{ matrix.rust_toolchain }}
key: v5-${{ runner.os }}-cargo-${{ hashFiles('./Cargo.lock') }}-rust
- name: Run cargo clippy
run: ./run_clippy.sh
- name: Ensure all project builds
run: cargo build --all --all-targets
run: cargo build --locked --all --all-targets
check-rust-dependencies:
runs-on: dev
container:
image: 369495373322.dkr.ecr.eu-central-1.amazonaws.com/rust:pinned
options: --init
steps:
- name: Checkout
uses: actions/checkout@v3
with:
submodules: false
fetch-depth: 1
# https://github.com/facebookincubator/cargo-guppy/tree/bec4e0eb29dcd1faac70b1b5360267fc02bf830e/tools/cargo-hakari#2-keep-the-workspace-hack-up-to-date-in-ci
- name: Check every project module is covered by Hakari
run: |
cargo hakari generate --diff # workspace-hack Cargo.toml is up-to-date
cargo hakari manage-deps --dry-run # all workspace crates depend on workspace-hack
shell: bash -euxo pipefail {0}
check-codestyle-python:
runs-on: [ self-hosted, Linux, k8s-runner ]
@@ -128,8 +153,14 @@ jobs:
- name: Install Python deps
run: ./scripts/pysync
- name: Run yapf to ensure code format
run: poetry run yapf --recursive --diff .
- name: Run isort to ensure code format
run: poetry run isort --diff --check .
- name: Run black to ensure code format
run: poetry run black --diff --check .
- name: Run flake8 to ensure code format
run: poetry run flake8 .
- name: Run mypy to check types
run: poetry run mypy .
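The same checks can be run locally after installing the Python deps with `./scripts/pysync`:
```sh
poetry run isort --diff --check .
poetry run black --diff --check .
poetry run flake8 .
poetry run mypy .
```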

View File

@@ -1,45 +0,0 @@
name: Send Notifications
on:
push:
branches: [ main ]
jobs:
send-notifications:
timeout-minutes: 30
name: send commit notifications
runs-on: ubuntu-latest
steps:
- name: Checkout
uses: actions/checkout@v2
with:
submodules: true
fetch-depth: 2
- name: Form variables for notification message
id: git_info_grab
run: |
git_stat=$(git show --stat=50)
git_stat="${git_stat//'%'/'%25'}"
git_stat="${git_stat//$'\n'/'%0A'}"
git_stat="${git_stat//$'\r'/'%0D'}"
git_stat="${git_stat// /}" # space -> 'Space En', as github tends to eat ordinary spaces
echo "::set-output name=git_stat::$git_stat"
echo "::set-output name=sha_short::$(git rev-parse --short HEAD)"
echo "##[set-output name=git_branch;]$(echo ${GITHUB_REF#refs/heads/})"
- name: Send notification
uses: appleboy/telegram-action@master
with:
to: ${{ secrets.TELEGRAM_TO }}
token: ${{ secrets.TELEGRAM_TOKEN }}
format: markdown
args: |
*@${{ github.actor }} pushed to* [${{ github.repository }}:${{steps.git_info_grab.outputs.git_branch}}](github.com/${{ github.repository }}/commit/${{steps.git_info_grab.outputs.sha_short }})
```
${{ steps.git_info_grab.outputs.git_stat }}
```

View File

@@ -47,17 +47,23 @@ jobs:
shell: bash -euxo pipefail {0}
run: ./scripts/pysync
- name: Create Neon Project
id: create-neon-project
uses: ./.github/actions/neon-project-create
with:
environment: staging
api_key: ${{ secrets.NEON_STAGING_API_KEY }}
- name: Run pytest
env:
REMOTE_ENV: 1
BENCHMARK_CONNSTR: "${{ secrets.BENCHMARK_STAGING_CONNSTR }}"
BENCHMARK_CONNSTR: ${{ steps.create-neon-project.outputs.dsn }}
POSTGRES_DISTRIB_DIR: /tmp/neon/pg_install
shell: bash -euxo pipefail {0}
run: |
# The test framework expects a psql binary to be present;
# but since we don't really need it in this test, let's mock it out
mkdir -p "$POSTGRES_DISTRIB_DIR/bin" && touch "$POSTGRES_DISTRIB_DIR/bin/psql";
mkdir -p "$POSTGRES_DISTRIB_DIR/v14/bin" && touch "$POSTGRES_DISTRIB_DIR/v14/bin/psql";
./scripts/pytest \
--junitxml=$TEST_OUTPUT/junit.xml \
--tb=short \
@@ -65,6 +71,14 @@ jobs:
-m "remote_cluster" \
-rA "test_runner/pg_clients"
- name: Delete Neon Project
if: ${{ always() }}
uses: ./.github/actions/neon-project-delete
with:
environment: staging
project_id: ${{ steps.create-neon-project.outputs.project_id }}
api_key: ${{ secrets.NEON_STAGING_API_KEY }}
# We use GitHub's action upload-artifact because `ubuntu-latest` doesn't have configured AWS CLI.
# It will be fixed after switching to gen2 runner
- name: Upload python test logs

6
.gitignore vendored
View File

@@ -1,7 +1,6 @@
/pg_install
/target
/bindings/python/neon-dev-utils/target
/tmp_check
/tmp_install
/tmp_check_cli
__pycache__/
test_output/
@@ -16,3 +15,6 @@ test_output/
*.key
*.crt
*.o
*.so
*.Po

10
.gitmodules vendored
View File

@@ -1,4 +1,8 @@
[submodule "vendor/postgres"]
path = vendor/postgres
url = https://github.com/zenithdb/postgres
[submodule "vendor/postgres-v14"]
path = vendor/postgres-v14
url = https://github.com/neondatabase/postgres.git
branch = main
[submodule "vendor/postgres-v15"]
path = vendor/postgres-v15
url = https://github.com/neondatabase/postgres.git
branch = REL_15_STABLE_neon
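After this change an existing checkout needs both pinned Postgres submodules; for example (a sketch):
```sh
git submodule sync
git submodule update --init --recursive vendor/postgres-v14 vendor/postgres-v15
```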

View File

@@ -1,10 +0,0 @@
# This file is only read when `yapf` is run from this directory.
# Hence we only list top-level directories here to avoid confusion.
# See source code for the exact file format: https://github.com/google/yapf/blob/c6077954245bc3add82dafd853a1c7305a6ebd20/yapf/yapflib/file_resources.py#L40-L43
vendor/
target/
tmp_install/
__pycache__/
test_output/
.neon/
.git/

562
Cargo.lock generated

File diff suppressed because it is too large

View File

@@ -6,21 +6,68 @@ members = [
"proxy",
"safekeeper",
"workspace_hack",
"neon_local",
"integration_tests",
"libs/*",
]
exclude = [
"bindings/python/neon-dev-utils",
]
[profile.release]
# This is useful for profiling and, to some extent, debug.
# Besides, debug info should not affect the performance.
debug = true
[profile.release-line-debug]
inherits = "release"
debug = 1 # true = 2 = all symbols, 1 = line only
[profile.release-line-debug-lto]
inherits = "release"
debug = 1 # true = 2 = all symbols, 1 = line only
lto = true
[profile.release-line-debug-size]
inherits = "release"
debug = 1 # true = 2 = all symbols, 1 = line only
opt-level = "s"
[profile.release-line-debug-zize]
inherits = "release"
debug = 1 # true = 2 = all symbols, 1 = line only
opt-level = "z"
[profile.release-line-debug-size-lto]
inherits = "release"
debug = 1 # true = 2 = all symbols, 1 = line only
opt-level = "s"
lto = true
[profile.release-line-debug-zize-lto]
inherits = "release"
debug = 1 # true = 2 = all symbols, 1 = line only
opt-level = "z"
lto = true
[profile.release-no-debug]
inherits = "release"
debug = false # true = 2 = all symbols, 1 = line only
[profile.release-no-debug-size]
inherits = "release"
debug = false # true = 2 = all symbols, 1 = line only
opt-level = "s"
[profile.release-no-debug-zize]
inherits = "release"
debug = false # true = 2 = all symbols, 1 = line only
opt-level = "z"
[profile.release-no-debug-size-lto]
inherits = "release"
debug = false # true = 2 = all symbols, 1 = line only
opt-level = "s"
lto = true
[profile.release-no-debug-zize-lto]
inherits = "release"
debug = false # true = 2 = all symbols, 1 = line only
opt-level = "z"
lto = true
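Any of the custom profiles above can be selected with `--profile`; for instance, the compute-node Docker builds later in this change use roughly:
```sh
cargo build -p compute_tools --locked --profile release-line-debug-size-lto
```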
# This is only needed for proxy's tests.
# TODO: we should probably fork `tokio-postgres-rustls` instead.
[patch.crates-io]
tokio-postgres = { git = "https://github.com/zenithdb/rust-postgres.git", rev="d052ee8b86fff9897c77b0fe89ea9daba0e1fa38" }
tokio-postgres = { git = "https://github.com/neondatabase/rust-postgres.git", rev="d052ee8b86fff9897c77b0fe89ea9daba0e1fa38" }

View File

@@ -1,18 +1,29 @@
### Creates a storage Docker image with postgres, pageserver, safekeeper and proxy binaries.
### The image itself is mainly used as a container for the binaries and for starting e2e tests with custom parameters.
### By default, the binaries inside the image have some mock parameters and can start, but are not intended to be used
### inside this image in the real deployments.
ARG REPOSITORY=369495373322.dkr.ecr.eu-central-1.amazonaws.com
ARG IMAGE=rust
ARG TAG=pinned
# Build Postgres
FROM 369495373322.dkr.ecr.eu-central-1.amazonaws.com/rust:pinned AS pg-build
FROM $REPOSITORY/$IMAGE:$TAG AS pg-build
WORKDIR /home/nonroot
COPY vendor/postgres vendor/postgres
COPY Makefile Makefile
COPY --chown=nonroot vendor/postgres-v14 vendor/postgres-v14
COPY --chown=nonroot vendor/postgres-v15 vendor/postgres-v15
COPY --chown=nonroot pgxn pgxn
COPY --chown=nonroot Makefile Makefile
COPY --chown=nonroot scripts/ninstall.sh scripts/ninstall.sh
ENV BUILD_TYPE release
RUN set -e \
&& mold -run make -j $(nproc) -s postgres \
&& rm -rf tmp_install/build \
&& tar -C tmp_install -czf /home/nonroot/postgres_install.tar.gz .
&& mold -run make -j $(nproc) -s neon-pg-ext \
&& rm -rf pg_install/build \
&& tar -C pg_install -czf /home/nonroot/postgres_install.tar.gz .
# Build zenith binaries
FROM 369495373322.dkr.ecr.eu-central-1.amazonaws.com/rust:pinned AS build
# Build neon binaries
FROM $REPOSITORY/$IMAGE:$TAG AS build
WORKDIR /home/nonroot
ARG GIT_VERSION=local
@@ -26,13 +37,14 @@ ARG CACHEPOT_BUCKET=neon-github-dev
#ARG AWS_ACCESS_KEY_ID
#ARG AWS_SECRET_ACCESS_KEY
COPY --from=pg-build /home/nonroot/tmp_install/include/postgresql/server tmp_install/include/postgresql/server
COPY --from=pg-build /home/nonroot/pg_install/v14/include/postgresql/server pg_install/v14/include/postgresql/server
COPY --from=pg-build /home/nonroot/pg_install/v15/include/postgresql/server pg_install/v15/include/postgresql/server
COPY . .
# Show build caching stats to check if it was used in the end.
# Has to be the part of the same RUN since cachepot daemon is killed in the end of this RUN, losing the compilation stats.
RUN set -e \
&& mold -run cargo build --release \
&& mold -run cargo build --bin pageserver --bin safekeeper --bin proxy --locked --release \
&& cachepot -s
# Build final image
@@ -48,17 +60,29 @@ RUN set -e \
openssl \
ca-certificates \
&& rm -rf /var/lib/apt/lists/* /tmp/* /var/tmp/* \
&& useradd -d /data zenith \
&& chown -R zenith:zenith /data
&& useradd -d /data neon \
&& chown -R neon:neon /data
COPY --from=build --chown=zenith:zenith /home/nonroot/target/release/pageserver /usr/local/bin
COPY --from=build --chown=zenith:zenith /home/nonroot/target/release/safekeeper /usr/local/bin
COPY --from=build --chown=zenith:zenith /home/nonroot/target/release/proxy /usr/local/bin
COPY --from=build --chown=neon:neon /home/nonroot/target/release/pageserver /usr/local/bin
COPY --from=build --chown=neon:neon /home/nonroot/target/release/safekeeper /usr/local/bin
COPY --from=build --chown=neon:neon /home/nonroot/target/release/proxy /usr/local/bin
COPY --from=pg-build /home/nonroot/tmp_install/ /usr/local/
COPY --from=pg-build /home/nonroot/pg_install/v14 /usr/local/v14/
COPY --from=pg-build /home/nonroot/pg_install/v15 /usr/local/v15/
COPY --from=pg-build /home/nonroot/postgres_install.tar.gz /data/
# By default, pageserver uses `.neon/` working directory in WORKDIR, so create one and fill it with the dummy config.
# Now, when `docker run ... pageserver` is run, it can start without errors, yet will have some default dummy values.
RUN mkdir -p /data/.neon/ && chown -R neon:neon /data/.neon/ \
&& /usr/local/bin/pageserver -D /data/.neon/ --init \
-c "id=1234" \
-c "broker_endpoints=['http://etcd:2379']" \
-c "pg_distrib_dir='/usr/local/'" \
-c "listen_pg_addr='0.0.0.0:6400'" \
-c "listen_http_addr='0.0.0.0:9898'"
VOLUME ["/data"]
USER zenith
USER neon
EXPOSE 6400
CMD ["pageserver"]
EXPOSE 9898
CMD ["/bin/bash"]

200
Dockerfile.compute-node-v14 Normal file
View File

@@ -0,0 +1,200 @@
ARG TAG=pinned
# apparently, ARGs don't get replaced in RUN commands in kaniko
# ARG POSTGIS_VERSION=3.3.0
# ARG PLV8_VERSION=3.1.4
# ARG PG_VERSION=v14
#
# Layer "build-deps"
#
FROM debian:bullseye-slim AS build-deps
RUN echo "deb http://ftp.debian.org/debian testing main" >> /etc/apt/sources.list && \
echo "APT::Default-Release \"stable\";" > /etc/apt/apt.conf.d/default-release && \
apt update
RUN apt update && \
apt install -y git autoconf automake libtool build-essential bison flex libreadline-dev zlib1g-dev libxml2-dev \
libcurl4-openssl-dev libossp-uuid-dev wget pkg-config libglib2.0-dev
#
# Layer "pg-build"
# Build Postgres from the neon postgres repository.
#
FROM build-deps AS pg-build
COPY vendor/postgres-v14 postgres
RUN cd postgres && \
./configure CFLAGS='-O2 -g3' --enable-debug --with-uuid=ossp && \
make MAKELEVEL=0 -j $(getconf _NPROCESSORS_ONLN) -s install && \
make MAKELEVEL=0 -j $(getconf _NPROCESSORS_ONLN) -s -C contrib/ install && \
# Install headers
make MAKELEVEL=0 -j $(getconf _NPROCESSORS_ONLN) -s -C src/include install && \
make MAKELEVEL=0 -j $(getconf _NPROCESSORS_ONLN) -s -C src/interfaces/libpq install
#
# Layer "postgis-build"
# Build PostGIS from the upstream PostGIS mirror.
#
# PostGIS compiles against neon postgres sources without changes. Perhaps we
# could even use the upstream binaries, compiled against vanilla Postgres, but
# it would require some investigation to check that it works, and also keeps
# working in the future. So for now, we compile our own binaries.
FROM build-deps AS postgis-build
COPY --from=pg-build /usr/local/pgsql/ /usr/local/pgsql/
RUN apt update && \
apt install -y gdal-bin libgdal-dev libprotobuf-c-dev protobuf-c-compiler xsltproc
RUN wget https://download.osgeo.org/postgis/source/postgis-3.3.0.tar.gz && \
tar xvzf postgis-3.3.0.tar.gz && \
cd postgis-3.3.0 && \
./autogen.sh && \
export PATH="/usr/local/pgsql/bin:$PATH" && \
./configure && \
make -j $(getconf _NPROCESSORS_ONLN) install && \
cd extensions/postgis && \
make clean && \
make -j $(getconf _NPROCESSORS_ONLN) install && \
echo 'trusted = true' >> /usr/local/pgsql/share/extension/postgis.control && \
echo 'trusted = true' >> /usr/local/pgsql/share/extension/postgis_raster.control && \
echo 'trusted = true' >> /usr/local/pgsql/share/extension/postgis_tiger_geocoder.control && \
echo 'trusted = true' >> /usr/local/pgsql/share/extension/postgis_topology.control
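The `trusted = true` lines let non-superuser roles install these extensions; e.g., once a compute node is up (a sketch):
```sh
psql -c "CREATE EXTENSION postgis;"
```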
#
# Layer "plv8-build"
# Build plv8
#
FROM build-deps AS plv8-build
COPY --from=pg-build /usr/local/pgsql/ /usr/local/pgsql/
RUN apt update && \
apt install -y ninja-build python3-dev libc++-dev libc++abi-dev libncurses5
# https://github.com/plv8/plv8/issues/475
# Debian bullseye provides binutils 2.35 when >= 2.38 is necessary
RUN apt update && \
apt install -y --no-install-recommends -t testing binutils
RUN wget https://github.com/plv8/plv8/archive/refs/tags/v3.1.4.tar.gz && \
tar xvzf v3.1.4.tar.gz && \
cd plv8-3.1.4 && \
export PATH="/usr/local/pgsql/bin:$PATH" && \
make -j $(getconf _NPROCESSORS_ONLN) && \
make -j $(getconf _NPROCESSORS_ONLN) install && \
rm -rf /plv8-* && \
echo 'trusted = true' >> /usr/local/pgsql/share/extension/plv8.control
#
# Layer "h3-pg-build"
# Build h3_pg
#
FROM build-deps AS h3-pg-build
COPY --from=pg-build /usr/local/pgsql/ /usr/local/pgsql/
# packaged cmake is too old
RUN apt update && \
apt install -y --no-install-recommends -t testing cmake
RUN wget https://github.com/uber/h3/archive/refs/tags/v4.0.1.tar.gz -O h3.tgz && \
tar xvzf h3.tgz && \
cd h3-4.0.1 && \
mkdir build && \
cd build && \
cmake .. -DCMAKE_BUILD_TYPE=Release && \
make -j $(getconf _NPROCESSORS_ONLN) && \
DESTDIR=/h3 make install && \
cp -R /h3/usr / && \
rm -rf build
RUN wget https://github.com/zachasme/h3-pg/archive/refs/tags/v4.0.1.tar.gz -O h3-pg.tgz && \
tar xvzf h3-pg.tgz && \
cd h3-pg-4.0.1 && \
export PATH="/usr/local/pgsql/bin:$PATH" && \
make -j $(getconf _NPROCESSORS_ONLN) && \
make -j $(getconf _NPROCESSORS_ONLN) install && \
echo 'trusted = true' >> /usr/local/pgsql/share/extension/h3.control
#
# Layer "neon-pg-ext-build"
# compile neon extensions
#
FROM build-deps AS neon-pg-ext-build
COPY --from=postgis-build /usr/local/pgsql/ /usr/local/pgsql/
# plv8 still sometimes crashes during the creation
# COPY --from=plv8-build /usr/local/pgsql/ /usr/local/pgsql/
COPY --from=h3-pg-build /usr/local/pgsql/ /usr/local/pgsql/
COPY --from=h3-pg-build /h3/usr /
COPY pgxn/ pgxn/
RUN make -j $(getconf _NPROCESSORS_ONLN) \
PG_CONFIG=/usr/local/pgsql/bin/pg_config \
-C pgxn/neon \
-s install
# Compile and run the Neon-specific `compute_ctl` binary
FROM 369495373322.dkr.ecr.eu-central-1.amazonaws.com/rust:$TAG AS compute-tools
USER nonroot
# Copy entire project to get Cargo.* files with proper dependencies for the whole project
COPY --chown=nonroot . .
RUN cd compute_tools && cargo build --locked --profile release-line-debug-size-lto
#
# Clean up postgres folder before inclusion
#
FROM neon-pg-ext-build AS postgres-cleanup-layer
COPY --from=neon-pg-ext-build /usr/local/pgsql /usr/local/pgsql
# Remove binaries from /bin/ that we won't use (or would manually copy & install otherwise)
RUN cd /usr/local/pgsql/bin && rm ecpg raster2pgsql shp2pgsql pgtopo_export pgtopo_import pgsql2shp
# Remove headers that we won't need anymore - we've completed installation of all extensions
RUN rm -r /usr/local/pgsql/include
# Remove now-useless PGXS src infrastructure
RUN rm -r /usr/local/pgsql/lib/pgxs/src
# Remove static postgresql libraries - all compilation is finished, so we
# can now remove these files - they must be included in other binaries by now
# if they were to be used by other libraries.
RUN rm /usr/local/pgsql/lib/lib*.a
#
# Final layer
# Put it all together into the final image
#
FROM debian:bullseye-slim
# Add user postgres
RUN mkdir /var/db && useradd -m -d /var/db/postgres postgres && \
echo "postgres:test_console_pass" | chpasswd && \
mkdir /var/db/postgres/compute && mkdir /var/db/postgres/specs && \
chown -R postgres:postgres /var/db/postgres && \
chmod 0750 /var/db/postgres/compute && \
echo '/usr/local/lib' >> /etc/ld.so.conf && /sbin/ldconfig
COPY --from=postgres-cleanup-layer --chown=postgres /usr/local/pgsql /usr/local
COPY --from=compute-tools --chown=postgres /home/nonroot/target/release-line-debug-size-lto/compute_ctl /usr/local/bin/compute_ctl
# Install:
# libreadline8 for psql
# libossp-uuid16 for extension ossp-uuid
# libgeos, libgdal, libproj and libprotobuf-c1 for PostGIS
# GLIBC 2.34 for plv8.
# Debian bullseye provides GLIBC 2.31, so we install the library from testing
#
# Lastly, link compute_ctl into zenith_ctl while we're at it,
# so that we don't need to put this in another layer.
RUN apt update && \
apt install --no-install-recommends -y \
libreadline8 \
libossp-uuid16 \
libgeos-c1v5 \
libgdal28 \
libproj19 \
libprotobuf-c1 && \
rm -rf /var/lib/apt/lists/* /tmp/* /var/tmp/* && \
echo "Installing GLIBC 2.34" && \
echo "deb http://ftp.debian.org/debian testing main" >> /etc/apt/sources.list && \
echo "APT::Default-Release \"stable\";" > /etc/apt/apt.conf.d/default-release && \
apt update && \
apt install -y --no-install-recommends -t testing libc6 && \
rm -rf /var/lib/apt/lists/* /tmp/* /var/tmp/* && \
ln /usr/local/bin/compute_ctl /usr/local/bin/zenith_ctl
USER postgres
ENTRYPOINT ["/usr/local/bin/compute_ctl"]
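The image is built from the repository root against the v14 submodule; hypothetically (tag name is made up):
```sh
docker build -f Dockerfile.compute-node-v14 -t compute-node-v14 .
```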

172
Dockerfile.compute-node-v15 Normal file
View File

@@ -0,0 +1,172 @@
#
# This file is identical to the Dockerfile.compute-node-v14 file
# except for the version of Postgres that is built.
#
ARG TAG=pinned
# apparently, ARGs don't get replaced in RUN commands in kaniko
# ARG POSTGIS_VERSION=3.3.0
# ARG PLV8_VERSION=3.1.4
# ARG PG_VERSION=v15
#
# Layer "build-deps"
#
FROM debian:bullseye-slim AS build-deps
RUN apt update && \
apt install -y git autoconf automake libtool build-essential bison flex libreadline-dev zlib1g-dev libxml2-dev \
libcurl4-openssl-dev libossp-uuid-dev
#
# Layer "pg-build"
# Build Postgres from the neon postgres repository.
#
FROM build-deps AS pg-build
COPY vendor/postgres-v15 postgres
RUN cd postgres && \
./configure CFLAGS='-O2 -g3' --enable-debug --with-uuid=ossp && \
make MAKELEVEL=0 -j $(getconf _NPROCESSORS_ONLN) -s install && \
make MAKELEVEL=0 -j $(getconf _NPROCESSORS_ONLN) -s -C contrib/ install && \
# Install headers
make MAKELEVEL=0 -j $(getconf _NPROCESSORS_ONLN) -s -C src/include install && \
make MAKELEVEL=0 -j $(getconf _NPROCESSORS_ONLN) -s -C src/interfaces/libpq install
#
# Layer "postgis-build"
# Build PostGIS from the upstream PostGIS mirror.
#
# PostGIS compiles against neon postgres sources without changes. Perhaps we
# could even use the upstream binaries, compiled against vanilla Postgres, but
# it would require some investigation to check that it works, and also keeps
# working in the future. So for now, we compile our own binaries.
FROM build-deps AS postgis-build
COPY --from=pg-build /usr/local/pgsql/ /usr/local/pgsql/
RUN apt update && \
apt install -y gdal-bin libgdal-dev libprotobuf-c-dev protobuf-c-compiler xsltproc wget
RUN wget https://download.osgeo.org/postgis/source/postgis-3.3.0.tar.gz && \
tar xvzf postgis-3.3.0.tar.gz && \
cd postgis-3.3.0 && \
./autogen.sh && \
export PATH="/usr/local/pgsql/bin:$PATH" && \
./configure && \
make -j $(getconf _NPROCESSORS_ONLN) install && \
cd extensions/postgis && \
make clean && \
make -j $(getconf _NPROCESSORS_ONLN) install && \
echo 'trusted = true' >> /usr/local/pgsql/share/extension/postgis.control && \
echo 'trusted = true' >> /usr/local/pgsql/share/extension/postgis_raster.control && \
echo 'trusted = true' >> /usr/local/pgsql/share/extension/postgis_tiger_geocoder.control && \
echo 'trusted = true' >> /usr/local/pgsql/share/extension/postgis_topology.control
#
# Layer "plv8-build"
# Build plv8
#
FROM build-deps AS plv8-build
COPY --from=postgis-build /usr/local/pgsql/ /usr/local/pgsql/
RUN apt update && \
apt install -y git curl wget make ninja-build build-essential libncurses5 python3-dev pkg-config libc++-dev libc++abi-dev libglib2.0-dev
# https://github.com/plv8/plv8/issues/475
# Debian bullseye provides binutils 2.35 when >= 2.38 is necessary
RUN echo "deb http://ftp.debian.org/debian testing main" >> /etc/apt/sources.list && \
echo "APT::Default-Release \"stable\";" > /etc/apt/apt.conf.d/default-release && \
apt update && \
apt install -y --no-install-recommends -t testing binutils
RUN wget https://github.com/plv8/plv8/archive/refs/tags/v3.1.4.tar.gz && \
tar xvzf v3.1.4.tar.gz && \
cd plv8-3.1.4 && \
export PATH="/usr/local/pgsql/bin:$PATH" && \
make -j $(getconf _NPROCESSORS_ONLN) && \
make -j $(getconf _NPROCESSORS_ONLN) install && \
rm -rf /plv8-* && \
echo 'trusted = true' >> /usr/local/pgsql/share/extension/plv8.control
#
# Layer "neon-pg-ext-build"
# compile neon extensions
#
FROM build-deps AS neon-pg-ext-build
COPY --from=postgis-build /usr/local/pgsql/ /usr/local/pgsql/
COPY pgxn/ pgxn/
RUN make -j $(getconf _NPROCESSORS_ONLN) \
PG_CONFIG=/usr/local/pgsql/bin/pg_config \
-C pgxn/neon \
-s install
# Compile and run the Neon-specific `compute_ctl` binary
FROM 369495373322.dkr.ecr.eu-central-1.amazonaws.com/rust:$TAG AS compute-tools
USER nonroot
# Copy entire project to get Cargo.* files with proper dependencies for the whole project
COPY --chown=nonroot . .
RUN cd compute_tools && cargo build --locked --profile release-line-debug-size-lto
#
# Clean up postgres folder before inclusion
#
FROM neon-pg-ext-build AS postgres-cleanup-layer
COPY --from=neon-pg-ext-build /usr/local/pgsql /usr/local/pgsql
# Remove binaries from /bin/ that we won't use (or would manually copy & install otherwise)
RUN cd /usr/local/pgsql/bin && rm ecpg raster2pgsql shp2pgsql pgtopo_export pgtopo_import pgsql2shp
# Remove headers that we won't need anymore - we've completed installation of all extensions
RUN rm -r /usr/local/pgsql/include
# Remove now-useless PGXS src infrastructure
RUN rm -r /usr/local/pgsql/lib/pgxs/src
# Remove static postgresql libraries - all compilation is finished, so we
# can now remove these files - they must be included in other binaries by now
# if they were to be used by other libraries.
RUN rm /usr/local/pgsql/lib/lib*.a
#
# Final layer
# Put it all together into the final image
#
FROM debian:bullseye-slim
# Add user postgres
RUN mkdir /var/db && useradd -m -d /var/db/postgres postgres && \
echo "postgres:test_console_pass" | chpasswd && \
mkdir /var/db/postgres/compute && mkdir /var/db/postgres/specs && \
chown -R postgres:postgres /var/db/postgres && \
chmod 0750 /var/db/postgres/compute && \
echo '/usr/local/lib' >> /etc/ld.so.conf && /sbin/ldconfig
# TODO: Check if we can make the extension setup more modular versus a linear build
# currently plv8-build copies the output /usr/local/pgsql from postgis-build, etc.
COPY --from=postgres-cleanup-layer --chown=postgres /usr/local/pgsql /usr/local
COPY --from=compute-tools --chown=postgres /home/nonroot/target/release-line-debug-size-lto/compute_ctl /usr/local/bin/compute_ctl
# Install:
# libreadline8 for psql
# libossp-uuid16 for extension ossp-uuid
# libgeos, libgdal, libproj and libprotobuf-c1 for PostGIS
# GLIBC 2.34 for plv8.
# Debian bullseye provides GLIBC 2.31, so we install the library from testing
#
# Lastly, link compute_ctl into zenith_ctl while we're at it,
# so that we don't need to put this in another layer.
RUN apt update && \
apt install --no-install-recommends -y \
libreadline8 \
libossp-uuid16 \
libgeos-c1v5 \
libgdal28 \
libproj19 \
libprotobuf-c1 && \
rm -rf /var/lib/apt/lists/* /tmp/* /var/tmp/* && \
echo "Installing GLIBC 2.34" && \
echo "deb http://ftp.debian.org/debian testing main" >> /etc/apt/sources.list && \
echo "APT::Default-Release \"stable\";" > /etc/apt/apt.conf.d/default-release && \
apt update && \
apt install -y --no-install-recommends -t testing libc6 && \
rm -rf /var/lib/apt/lists/* /tmp/* /var/tmp/* && \
ln /usr/local/bin/compute_ctl /usr/local/bin/zenith_ctl
USER postgres
ENTRYPOINT ["/usr/local/bin/compute_ctl"]

View File

@@ -0,0 +1,88 @@
#
# Legacy version of the Dockerfile for the compute node.
# Used by e2e CI. Building Dockerfile.compute-node will take
# unreasonable amount of time without v2 runners.
#
# TODO: remove once cloud repo CI is moved to v2 runners.
#
# Allow specifying a different compute-tools tag and image repo, so we are
# able to use different images
ARG REPOSITORY=369495373322.dkr.ecr.eu-central-1.amazonaws.com
ARG IMAGE=compute-tools
ARG TAG=latest
#
# Image with pre-built tools
#
FROM $REPOSITORY/$IMAGE:$TAG AS compute-deps
# Only to get the ready compute_ctl binary as a dependency
#
# Image with Postgres build deps
#
FROM debian:bullseye-slim AS build-deps
RUN apt-get update && apt-get -yq install automake libtool build-essential bison flex libreadline-dev zlib1g-dev libxml2-dev \
libcurl4-openssl-dev libossp-uuid-dev
#
# Image with built Postgres
#
FROM build-deps AS pg-build
# Add user postgres
RUN adduser postgres
RUN mkdir /pg && chown postgres:postgres /pg
# Copy source files
# version 14 is default for now
COPY ./vendor/postgres-v14 /pg/
COPY ./pgxn /pg/
# Build and install Postgres locally
RUN mkdir /pg/compute_build && cd /pg/compute_build && \
../configure CFLAGS='-O2 -g3' --prefix=$(pwd)/postgres_bin --enable-debug --with-uuid=ossp && \
# Install main binaries and contribs
make MAKELEVEL=0 -j $(getconf _NPROCESSORS_ONLN) -s install && \
make MAKELEVEL=0 -j $(getconf _NPROCESSORS_ONLN) -s -C contrib/ install && \
# Install headers
make MAKELEVEL=0 -j $(getconf _NPROCESSORS_ONLN) -s -C src/include install
# Install neon contrib
RUN make MAKELEVEL=0 PG_CONFIG=/pg/compute_build/postgres_bin/bin/pg_config -j $(getconf _NPROCESSORS_ONLN) -C /pg/neon install
USER postgres
WORKDIR /pg
#
# Final compute node image to be exported
#
FROM debian:bullseye-slim
# libreadline-dev is required to run psql
RUN apt-get update && apt-get -yq install libreadline-dev libossp-uuid-dev
# Add user postgres
RUN mkdir /var/db && useradd -m -d /var/db/postgres postgres && \
echo "postgres:test_console_pass" | chpasswd && \
mkdir /var/db/postgres/compute && mkdir /var/db/postgres/specs && \
chown -R postgres:postgres /var/db/postgres && \
chmod 0750 /var/db/postgres/compute
# Copy ready Postgres binaries
COPY --from=pg-build /pg/compute_build/postgres_bin /usr/local
# Copy binaries from compute-tools
COPY --from=compute-deps /usr/local/bin/compute_ctl /usr/local/bin/compute_ctl
# XXX: temporary symlink for compatibility with old control-plane
RUN ln -s /usr/local/bin/compute_ctl /usr/local/bin/zenith_ctl
# Add postgres shared objects to the search path
RUN echo '/usr/local/lib' >> /etc/ld.so.conf && /sbin/ldconfig
USER postgres
ENTRYPOINT ["/usr/local/bin/compute_ctl"]

View File

@@ -1,6 +1,10 @@
# First transient image to build compute_tools binaries
# NB: keep in sync with rust image version in .github/workflows/build_and_test.yml
FROM 369495373322.dkr.ecr.eu-central-1.amazonaws.com/rust:pinned AS rust-build
ARG REPOSITORY=369495373322.dkr.ecr.eu-central-1.amazonaws.com
ARG IMAGE=rust
ARG TAG=pinned
FROM $REPOSITORY/$IMAGE:$TAG AS rust-build
WORKDIR /home/nonroot
# Enable https://github.com/paritytech/cachepot to cache Rust crates' compilation results in Docker builds.
@@ -16,7 +20,7 @@ ARG CACHEPOT_BUCKET=neon-github-dev
COPY . .
RUN set -e \
&& mold -run cargo build -p compute_tools --release \
&& mold -run cargo build -p compute_tools --locked --release \
&& cachepot -s
# Final image that only has one binary

177
Makefile
View File

@@ -1,15 +1,7 @@
ROOT_PROJECT_DIR := $(dir $(abspath $(lastword $(MAKEFILE_LIST))))
# Where to install Postgres, default is ./tmp_install, maybe useful for package managers
POSTGRES_INSTALL_DIR ?= $(ROOT_PROJECT_DIR)/tmp_install
# Seccomp BPF is only available for Linux
UNAME_S := $(shell uname -s)
ifeq ($(UNAME_S),Linux)
SECCOMP = --with-libseccomp
else
SECCOMP =
endif
# Where to install Postgres, default is ./pg_install, maybe useful for package managers
POSTGRES_INSTALL_DIR ?= $(ROOT_PROJECT_DIR)/pg_install/
#
# We differentiate between release / debug build types using the BUILD_TYPE
@@ -28,6 +20,12 @@ else
$(error Bad build type '$(BUILD_TYPE)', see Makefile for options)
endif
# Seccomp BPF is only available for Linux
UNAME_S := $(shell uname -s)
ifeq ($(UNAME_S),Linux)
PG_CONFIGURE_OPTS += --with-libseccomp
endif
# macOS with brew-installed openssl requires explicit paths
# It can be configured with OPENSSL_PREFIX variable
UNAME_S := $(shell uname -s)
@@ -36,6 +34,12 @@ ifeq ($(UNAME_S),Darwin)
PG_CONFIGURE_OPTS += --with-includes=$(OPENSSL_PREFIX)/include --with-libraries=$(OPENSSL_PREFIX)/lib
endif
# Use the -C option so that when PostgreSQL "make install" installs the
# headers, the mtime of the headers is not changed when there have
# been no changes to the files. Changing the mtime triggers an
# unnecessary rebuild of 'postgres_ffi'.
PG_CONFIGURE_OPTS += INSTALL='$(ROOT_PROJECT_DIR)/scripts/ninstall.sh -C'
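For reference, `install -C` compares source and destination and skips the copy when they are identical, which is what keeps the header mtimes stable here. A minimal illustration (DEST_DIR is a hypothetical target directory):
```sh
install -C -m 644 pg_config.h "$DEST_DIR/"
```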
# Choose whether we should be silent or verbose
CARGO_BUILD_FLAGS += --$(if $(filter s,$(MAKEFLAGS)),quiet,verbose)
# Fix for a corner case when make doesn't pass a jobserver
@@ -48,64 +52,139 @@ CARGO_CMD_PREFIX += $(if $(filter n,$(MAKEFLAGS)),,+)
CARGO_CMD_PREFIX += CARGO_TERM_PROGRESS_WHEN=never CI=1
#
# Top level Makefile to build Zenith and PostgreSQL
# Top level Makefile to build Neon and PostgreSQL
#
.PHONY: all
all: zenith postgres
all: neon postgres neon-pg-ext
### Zenith Rust bits
### Neon Rust bits
#
# The 'postgres_ffi' depends on the Postgres headers.
.PHONY: zenith
zenith: postgres-headers
+@echo "Compiling Zenith"
.PHONY: neon
neon: postgres-v14-headers postgres-v15-headers
+@echo "Compiling Neon"
$(CARGO_CMD_PREFIX) cargo build $(CARGO_BUILD_FLAGS)
### PostgreSQL parts
$(POSTGRES_INSTALL_DIR)/build/config.status:
+@echo "Configuring postgres build"
mkdir -p $(POSTGRES_INSTALL_DIR)/build
(cd $(POSTGRES_INSTALL_DIR)/build && \
$(ROOT_PROJECT_DIR)/vendor/postgres/configure CFLAGS='$(PG_CFLAGS)' \
# The rules are duplicated for Postgres v14 and 15. We may want to refactor
# to avoid the duplication in the future, but it's tolerable for now.
#
$(POSTGRES_INSTALL_DIR)/build/v14/config.status:
+@echo "Configuring Postgres v14 build"
mkdir -p $(POSTGRES_INSTALL_DIR)/build/v14
(cd $(POSTGRES_INSTALL_DIR)/build/v14 && \
$(ROOT_PROJECT_DIR)/vendor/postgres-v14/configure CFLAGS='$(PG_CFLAGS)' \
$(PG_CONFIGURE_OPTS) \
$(SECCOMP) \
--prefix=$(abspath $(POSTGRES_INSTALL_DIR)) > configure.log)
--prefix=$(abspath $(POSTGRES_INSTALL_DIR))/v14 > configure.log)
# nicer alias for running 'configure'
.PHONY: postgres-configure
postgres-configure: $(POSTGRES_INSTALL_DIR)/build/config.status
$(POSTGRES_INSTALL_DIR)/build/v15/config.status:
+@echo "Configuring Postgres v15 build"
mkdir -p $(POSTGRES_INSTALL_DIR)/build/v15
(cd $(POSTGRES_INSTALL_DIR)/build/v15 && \
$(ROOT_PROJECT_DIR)/vendor/postgres-v15/configure CFLAGS='$(PG_CFLAGS)' \
$(PG_CONFIGURE_OPTS) \
--prefix=$(abspath $(POSTGRES_INSTALL_DIR))/v15 > configure.log)
# Install the PostgreSQL header files into $(POSTGRES_INSTALL_DIR)/include
.PHONY: postgres-headers
postgres-headers: postgres-configure
+@echo "Installing PostgreSQL headers"
$(MAKE) -C $(POSTGRES_INSTALL_DIR)/build/src/include MAKELEVEL=0 install
# nicer alias to run 'configure'
.PHONY: postgres-v14-configure
postgres-v14-configure: $(POSTGRES_INSTALL_DIR)/build/v14/config.status
# Compile and install PostgreSQL and contrib/neon
.PHONY: postgres
postgres: postgres-configure \
postgres-headers # to prevent `make install` conflicts with zenith's `postgres-headers`
+@echo "Compiling PostgreSQL"
$(MAKE) -C $(POSTGRES_INSTALL_DIR)/build MAKELEVEL=0 install
+@echo "Compiling contrib/neon"
$(MAKE) -C $(POSTGRES_INSTALL_DIR)/build/contrib/neon install
+@echo "Compiling contrib/neon_test_utils"
$(MAKE) -C $(POSTGRES_INSTALL_DIR)/build/contrib/neon_test_utils install
+@echo "Compiling pg_buffercache"
$(MAKE) -C $(POSTGRES_INSTALL_DIR)/build/contrib/pg_buffercache install
+@echo "Compiling pageinspect"
$(MAKE) -C $(POSTGRES_INSTALL_DIR)/build/contrib/pageinspect install
.PHONY: postgres-v15-configure
postgres-v15-configure: $(POSTGRES_INSTALL_DIR)/build/v15/config.status
# Install the PostgreSQL header files into $(POSTGRES_INSTALL_DIR)/<version>/include
.PHONY: postgres-v14-headers
postgres-v14-headers: postgres-v14-configure
+@echo "Installing PostgreSQL v14 headers"
$(MAKE) -C $(POSTGRES_INSTALL_DIR)/build/v14/src/include MAKELEVEL=0 install
.PHONY: postgres-clean
postgres-clean:
$(MAKE) -C $(POSTGRES_INSTALL_DIR)/build MAKELEVEL=0 clean
.PHONY: postgres-v15-headers
postgres-v15-headers: postgres-v15-configure
+@echo "Installing PostgreSQL v15 headers"
$(MAKE) -C $(POSTGRES_INSTALL_DIR)/build/v15/src/include MAKELEVEL=0 install
# Compile and install PostgreSQL
.PHONY: postgres-v14
postgres-v14: postgres-v14-configure \
postgres-v14-headers # to prevent `make install` conflicts with neon's `postgres-headers`
+@echo "Compiling PostgreSQL v14"
$(MAKE) -C $(POSTGRES_INSTALL_DIR)/build/v14 MAKELEVEL=0 install
+@echo "Compiling libpq v14"
$(MAKE) -C $(POSTGRES_INSTALL_DIR)/build/v14/src/interfaces/libpq install
+@echo "Compiling pg_buffercache v14"
$(MAKE) -C $(POSTGRES_INSTALL_DIR)/build/v14/contrib/pg_buffercache install
+@echo "Compiling pageinspect v14"
$(MAKE) -C $(POSTGRES_INSTALL_DIR)/build/v14/contrib/pageinspect install
.PHONY: postgres-v15
postgres-v15: postgres-v15-configure \
postgres-v15-headers # to prevent `make install` conflicts with neon's `postgres-headers`
+@echo "Compiling PostgreSQL v15"
$(MAKE) -C $(POSTGRES_INSTALL_DIR)/build/v15 MAKELEVEL=0 install
+@echo "Compiling libpq v15"
$(MAKE) -C $(POSTGRES_INSTALL_DIR)/build/v15/src/interfaces/libpq install
+@echo "Compiling pg_buffercache v15"
$(MAKE) -C $(POSTGRES_INSTALL_DIR)/build/v15/contrib/pg_buffercache install
+@echo "Compiling pageinspect v15"
$(MAKE) -C $(POSTGRES_INSTALL_DIR)/build/v15/contrib/pageinspect install
# shorthand to build all Postgres versions
postgres: postgres-v14 postgres-v15
.PHONY: postgres-v14-clean
postgres-v14-clean:
$(MAKE) -C $(POSTGRES_INSTALL_DIR)/build/v14 MAKELEVEL=0 clean
$(MAKE) -C $(POSTGRES_INSTALL_DIR)/build/v14/contrib/pg_buffercache clean
$(MAKE) -C $(POSTGRES_INSTALL_DIR)/build/v14/contrib/pageinspect clean
$(MAKE) -C $(POSTGRES_INSTALL_DIR)/build/v14/src/interfaces/libpq clean
.PHONY: postgres-v15-clean
postgres-v15-clean:
$(MAKE) -C $(POSTGRES_INSTALL_DIR)/build/v15 MAKELEVEL=0 clean
$(MAKE) -C $(POSTGRES_INSTALL_DIR)/build/v15/contrib/pg_buffercache clean
$(MAKE) -C $(POSTGRES_INSTALL_DIR)/build/v15/contrib/pageinspect clean
$(MAKE) -C $(POSTGRES_INSTALL_DIR)/build/v15/src/interfaces/libpq clean
neon-pg-ext-v14: postgres-v14
+@echo "Compiling neon v14"
mkdir -p $(POSTGRES_INSTALL_DIR)/build/neon-v14
(cd $(POSTGRES_INSTALL_DIR)/build/neon-v14 && \
$(MAKE) PG_CONFIG=$(POSTGRES_INSTALL_DIR)/v14/bin/pg_config CFLAGS='$(PG_CFLAGS) $(COPT)' \
-f $(ROOT_PROJECT_DIR)/pgxn/neon/Makefile install)
+@echo "Compiling neon_test_utils" v14
mkdir -p $(POSTGRES_INSTALL_DIR)/build/neon-test-utils-v14
(cd $(POSTGRES_INSTALL_DIR)/build/neon-test-utils-v14 && \
$(MAKE) PG_CONFIG=$(POSTGRES_INSTALL_DIR)/v14/bin/pg_config CFLAGS='$(PG_CFLAGS) $(COPT)' \
-f $(ROOT_PROJECT_DIR)/pgxn/neon_test_utils/Makefile install)
neon-pg-ext-v15: postgres-v15
+@echo "Compiling neon v15"
mkdir -p $(POSTGRES_INSTALL_DIR)/build/neon-v15
(cd $(POSTGRES_INSTALL_DIR)/build/neon-v15 && \
$(MAKE) PG_CONFIG=$(POSTGRES_INSTALL_DIR)/v15/bin/pg_config CFLAGS='$(PG_CFLAGS) $(COPT)' \
-f $(ROOT_PROJECT_DIR)/pgxn/neon/Makefile install)
+@echo "Compiling neon_test_utils" v15
mkdir -p $(POSTGRES_INSTALL_DIR)/build/neon-test-utils-v15
(cd $(POSTGRES_INSTALL_DIR)/build/neon-test-utils-v15 && \
$(MAKE) PG_CONFIG=$(POSTGRES_INSTALL_DIR)/v15/bin/pg_config CFLAGS='$(PG_CFLAGS) $(COPT)' \
-f $(ROOT_PROJECT_DIR)/pgxn/neon_test_utils/Makefile install)
.PHONY: neon-pg-ext-clean
$(MAKE) -C $(ROOT_PROJECT_DIR)/pgxn/neon clean
$(MAKE) -C $(ROOT_PROJECT_DIR)/pgxn/neon_test_utils clean
neon-pg-ext: neon-pg-ext-v14 neon-pg-ext-v15
postgres-headers: postgres-v14-headers postgres-v15-headers
postgres-clean: postgres-v14-clean postgres-v15-clean
# This doesn't remove the effects of 'configure'.
.PHONY: clean
clean:
cd $(POSTGRES_INSTALL_DIR)/build && $(MAKE) clean
cd $(POSTGRES_INSTALL_DIR)/build/v14 && $(MAKE) clean
cd $(POSTGRES_INSTALL_DIR)/build/v15 && $(MAKE) clean
$(CARGO_CMD_PREFIX) cargo clean
cd pgxn/neon && $(MAKE) clean
cd pgxn/neon_test_utils && $(MAKE) clean
# This removes everything
.PHONY: distclean
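Putting the per-version targets above together, typical invocations look like:
```sh
make postgres-v15                          # configure, build and install v15 only
make neon-pg-ext-v14                       # build the Neon extensions for v14
BUILD_TYPE=release make -j"$(nproc)" all   # neon binaries + both Postgres versions + extensions
```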

4
NOTICE
View File

@@ -1,5 +1,5 @@
Neon
Copyright 2022 Neon Inc.
The PostgreSQL submodule in vendor/postgres is licensed under the
PostgreSQL license. See vendor/postgres/COPYRIGHT.
The PostgreSQL submodules in vendor/postgres-v14 and vendor/postgres-v15 are licensed under the
PostgreSQL license. See vendor/postgres-v14/COPYRIGHT and vendor/postgres-v15/COPYRIGHT.

View File

@@ -25,6 +25,7 @@ Pageserver consists of:
- WAL receiver - service that receives WAL from WAL service and stores it in the repository.
- Page service - service that communicates with compute nodes and responds with pages from the repository.
- WAL redo - service that builds pages from base images and WAL records on Page service request
## Running local installation
@@ -68,6 +69,17 @@ brew install libpq
brew link --force libpq
```
#### Rustc version
The project uses a [rust toolchain file](./rust-toolchain.toml) to define the version it is built with in CI, both for testing and for local builds.
This file is picked up automatically by [`rustup`](https://rust-lang.github.io/rustup/overrides.html#the-toolchain-file), which installs (if absent) and uses the toolchain version pinned in the file.
rustup users who want to build with another toolchain can use the [`rustup override`](https://rust-lang.github.io/rustup/overrides.html#directory-overrides) command to set a specific toolchain for the project's directory.
Non-rustup users most likely do not get the pinned toolchain automatically from the file, so they are responsible for manually verifying that their toolchain matches the version in it.
Newer rustc versions will most likely work fine, yet older ones might not be supported due to new features used by the project or its crates.
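For rustup users, a directory override might look like this (the toolchain version is hypothetical):
```sh
rustup override set 1.64.0    # pin a different toolchain for this directory only
rustup override unset         # fall back to the rust-toolchain.toml pin
```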
#### Building on Linux
1. Build neon and patched postgres
@@ -77,9 +89,9 @@ brew link --force libpq
git clone --recursive https://github.com/neondatabase/neon.git
cd neon
# The preferred and default is to make a debug build. This will create a
# demonstrably slower build than a release build. If you want to use a release
# build, utilize "BUILD_TYPE=release make -j`nproc`"
# The preferred and default is to make a debug build. This will create a
# demonstrably slower build than a release build. For a release build,
# use "BUILD_TYPE=release make -j`nproc`"
make -j`nproc`
```
@@ -93,15 +105,15 @@ make -j`nproc`
git clone --recursive https://github.com/neondatabase/neon.git
cd neon
# The preferred and default is to make a debug build. This will create a
# demonstrably slower build than a release build. If you want to use a release
# build, utilize "BUILD_TYPE=release make -j`sysctl -n hw.logicalcpu`"
# The preferred and default is to make a debug build. This will create a
# demonstrably slower build than a release build. For a release build,
# use "BUILD_TYPE=release make -j`sysctl -n hw.logicalcpu`"
make -j`sysctl -n hw.logicalcpu`
```
#### Dependency installation notes
To run the `psql` client, install the `postgresql-client` package or modify `PATH` and `LD_LIBRARY_PATH` to include `tmp_install/bin` and `tmp_install/lib`, respectively.
To run the `psql` client, install the `postgresql-client` package or modify `PATH` and `LD_LIBRARY_PATH` to include `pg_install/bin` and `pg_install/lib`, respectively.
To run the integration tests or Python scripts (not required to use the code), install
Python (3.9 or higher), and install python3 packages using `./scripts/pysync` (requires [poetry](https://python-poetry.org/)) in the project directory.
@@ -113,16 +125,18 @@ Python (3.9 or higher), and install python3 packages using `./scripts/pysync` (r
# Create repository in .neon with proper paths to binaries and data
# Later that would be responsibility of a package install script
> ./target/debug/neon_local init
initializing tenantid 9ef87a5bf0d92544f6fafeeb3239695c
created initial timeline de200bd42b49cc1814412c7e592dd6e9 timeline.lsn 0/16B5A50
initial timeline de200bd42b49cc1814412c7e592dd6e9 created
pageserver init succeeded
Starting pageserver at '127.0.0.1:64000' in '.neon'
Pageserver started
Successfully initialized timeline 7dd0907914ac399ff3be45fb252bfdb7
Stopping pageserver gracefully...done!
# start pageserver and safekeeper
> ./target/debug/neon_local start
Starting etcd broker using /usr/bin/etcd
Starting pageserver at '127.0.0.1:64000' in '.neon'
Pageserver started
initializing for sk 1 for 7676
Starting safekeeper at '127.0.0.1:5454' in '.neon/safekeepers/sk1'
Safekeeper started
@@ -208,7 +222,12 @@ Ensure your dependencies are installed as described [here](https://github.com/ne
```sh
git clone --recursive https://github.com/neondatabase/neon.git
make # builds also postgres and installs it to ./tmp_install
# either:
CARGO_BUILD_FLAGS="--features=testing" make
# or:
make debug
./scripts/pytest
```

View File

@@ -1,264 +0,0 @@
# This file is automatically @generated by Cargo.
# It is not intended for manual editing.
version = 3
[[package]]
name = "autocfg"
version = "1.1.0"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "d468802bab17cbc0cc575e9b053f41e72aa36bfa6b7f55e3529ffa43161b97fa"
[[package]]
name = "bitflags"
version = "1.3.2"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "bef38d45163c2f1dde094a7dfd33ccf595c92905c8f8f4fdc18d06fb1037718a"
[[package]]
name = "cfg-if"
version = "1.0.0"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "baf1de4339761588bc0619e3cbc0120ee582ebb74b53b4efbf79117bd2da40fd"
[[package]]
name = "indoc"
version = "0.3.6"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "47741a8bc60fb26eb8d6e0238bbb26d8575ff623fdc97b1a2c00c050b9684ed8"
dependencies = [
"indoc-impl",
"proc-macro-hack",
]
[[package]]
name = "indoc-impl"
version = "0.3.6"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "ce046d161f000fffde5f432a0d034d0341dc152643b2598ed5bfce44c4f3a8f0"
dependencies = [
"proc-macro-hack",
"proc-macro2",
"quote",
"syn",
"unindent",
]
[[package]]
name = "instant"
version = "0.1.12"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "7a5bbe824c507c5da5956355e86a746d82e0e1464f65d862cc5e71da70e94b2c"
dependencies = [
"cfg-if",
]
[[package]]
name = "libc"
version = "0.2.132"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "8371e4e5341c3a96db127eb2465ac681ced4c433e01dd0e938adbef26ba93ba5"
[[package]]
name = "lock_api"
version = "0.4.8"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "9f80bf5aacaf25cbfc8210d1cfb718f2bf3b11c4c54e5afe36c236853a8ec390"
dependencies = [
"autocfg",
"scopeguard",
]
[[package]]
name = "neon-dev-utils"
version = "0.1.0"
dependencies = [
"pyo3",
]
[[package]]
name = "once_cell"
version = "1.13.1"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "074864da206b4973b84eb91683020dbefd6a8c3f0f38e054d93954e891935e4e"
[[package]]
name = "parking_lot"
version = "0.11.2"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "7d17b78036a60663b797adeaee46f5c9dfebb86948d1255007a1d6be0271ff99"
dependencies = [
"instant",
"lock_api",
"parking_lot_core",
]
[[package]]
name = "parking_lot_core"
version = "0.8.5"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "d76e8e1493bcac0d2766c42737f34458f1c8c50c0d23bcb24ea953affb273216"
dependencies = [
"cfg-if",
"instant",
"libc",
"redox_syscall",
"smallvec",
"winapi",
]
[[package]]
name = "paste"
version = "0.1.18"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "45ca20c77d80be666aef2b45486da86238fabe33e38306bd3118fe4af33fa880"
dependencies = [
"paste-impl",
"proc-macro-hack",
]
[[package]]
name = "paste-impl"
version = "0.1.18"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "d95a7db200b97ef370c8e6de0088252f7e0dfff7d047a28528e47456c0fc98b6"
dependencies = [
"proc-macro-hack",
]
[[package]]
name = "proc-macro-hack"
version = "0.5.19"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "dbf0c48bc1d91375ae5c3cd81e3722dff1abcf81a30960240640d223f59fe0e5"
[[package]]
name = "proc-macro2"
version = "1.0.43"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "0a2ca2c61bc9f3d74d2886294ab7b9853abd9c1ad903a3ac7815c58989bb7bab"
dependencies = [
"unicode-ident",
]
[[package]]
name = "pyo3"
version = "0.15.2"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "d41d50a7271e08c7c8a54cd24af5d62f73ee3a6f6a314215281ebdec421d5752"
dependencies = [
"cfg-if",
"indoc",
"libc",
"parking_lot",
"paste",
"pyo3-build-config",
"pyo3-macros",
"unindent",
]
[[package]]
name = "pyo3-build-config"
version = "0.15.2"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "779239fc40b8e18bc8416d3a37d280ca9b9fb04bda54b98037bb6748595c2410"
dependencies = [
"once_cell",
]
[[package]]
name = "pyo3-macros"
version = "0.15.2"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "00b247e8c664be87998d8628e86f282c25066165f1f8dda66100c48202fdb93a"
dependencies = [
"pyo3-macros-backend",
"quote",
"syn",
]
[[package]]
name = "pyo3-macros-backend"
version = "0.15.2"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "5a8c2812c412e00e641d99eeb79dd478317d981d938aa60325dfa7157b607095"
dependencies = [
"proc-macro2",
"pyo3-build-config",
"quote",
"syn",
]
[[package]]
name = "quote"
version = "1.0.21"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "bbe448f377a7d6961e30f5955f9b8d106c3f5e449d493ee1b125c1d43c2b5179"
dependencies = [
"proc-macro2",
]
[[package]]
name = "redox_syscall"
version = "0.2.16"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "fb5a58c1855b4b6819d59012155603f0b22ad30cad752600aadfcb695265519a"
dependencies = [
"bitflags",
]
[[package]]
name = "scopeguard"
version = "1.1.0"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "d29ab0c6d3fc0ee92fe66e2d99f700eab17a8d57d1c1d3b748380fb20baa78cd"
[[package]]
name = "smallvec"
version = "1.9.0"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "2fd0db749597d91ff862fd1d55ea87f7855a744a8425a64695b6fca237d1dad1"
[[package]]
name = "syn"
version = "1.0.99"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "58dbef6ec655055e20b86b15a8cc6d439cca19b667537ac6a1369572d151ab13"
dependencies = [
"proc-macro2",
"quote",
"unicode-ident",
]
[[package]]
name = "unicode-ident"
version = "1.0.3"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "c4f5b37a154999a8f3f98cc23a628d850e154479cd94decf3414696e12e31aaf"
[[package]]
name = "unindent"
version = "0.1.10"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "58ee9362deb4a96cef4d437d1ad49cffc9b9e92d202b6995674e928ce684f112"
[[package]]
name = "winapi"
version = "0.3.9"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "5c839a674fcd7a98952e593242ea400abe93992746761e38641405d28b00f419"
dependencies = [
"winapi-i686-pc-windows-gnu",
"winapi-x86_64-pc-windows-gnu",
]
[[package]]
name = "winapi-i686-pc-windows-gnu"
version = "0.4.0"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "ac3b87c63620426dd9b991e5ce0329eff545bccbbb34f3be09ff6fb6ab51b7b6"
[[package]]
name = "winapi-x86_64-pc-windows-gnu"
version = "0.4.0"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "712e227841d057c1ee1cd2fb22fa7e5a5461ae8e48fa2ca79ec42cfc1931183f"

View File

@@ -1,16 +0,0 @@
[package]
name = "neon-dev-utils"
version = "0.1.0"
edition = "2021"
[lib]
name = "neon_dev_utils"
# "cdylib" is necessary to produce a shared library for Python to import from.
#
# Downstream Rust code (including code in `bin/`, `examples/`, and `tests/`) will not be able
# to `use string_sum;` unless the "rlib" or "lib" crate type is also included, e.g.:
# crate-type = ["cdylib", "rlib"]
crate-type = ["cdylib"]
[dependencies]
pyo3 = { version = "0.15.1", features = ["extension-module"] }

View File

@@ -1,31 +0,0 @@
[[package]]
name = "maturin"
version = "0.13.2"
description = "Build and publish crates with pyo3, rust-cpython and cffi bindings as well as rust binaries as python packages"
category = "dev"
optional = false
python-versions = ">=3.7"
[package.dependencies]
tomli = {version = ">=1.1.0", markers = "python_version < \"3.11\""}
[package.extras]
zig = ["ziglang (>=0.9.0,<0.10.0)"]
patchelf = ["patchelf"]
[[package]]
name = "tomli"
version = "2.0.1"
description = "A lil' TOML parser"
category = "dev"
optional = false
python-versions = ">=3.7"
[metadata]
lock-version = "1.1"
python-versions = "^3.10"
content-hash = "4e177514d6cf74b58bcd8ca30ef300c10a833b3e6b1d809aa57337ee20efeb47"
[metadata.files]
maturin = []
tomli = []

View File

@@ -1,15 +0,0 @@
[tool.poetry]
name = "neon-dev-utils"
version = "0.1.0"
description = "Python bindings for common neon development utils"
authors = ["Your Name <you@example.com>"]
[tool.poetry.dependencies]
python = "^3.10"
[tool.poetry.dev-dependencies]
maturin = "^0.13.2"
[build-system]
requires = ["maturin>=0.13.2", "poetry-core>=1.0.0"]
build-backend = "maturin"

View File

@@ -1,17 +0,0 @@
use pyo3::prelude::*;
/// Formats the sum of two numbers as string.
#[pyfunction]
fn sum_as_string(a: usize, b: usize) -> PyResult<String> {
Ok((a + b).to_string())
}
/// A Python module implemented in Rust. The name of this function must match
/// the `lib.name` setting in the `Cargo.toml`, else Python will not be able to
/// import the module.
#[pymodule]
fn neon_dev_utils(_py: Python, m: &PyModule) -> PyResult<()> {
m.add_function(wrap_pyfunction!(sum_as_string, m)?)?;
Ok(())
}

View File

@@ -10,12 +10,12 @@ clap = "3.0"
env_logger = "0.9"
hyper = { version = "0.14", features = ["full"] }
log = { version = "0.4", features = ["std", "serde"] }
postgres = { git = "https://github.com/zenithdb/rust-postgres.git", rev="d052ee8b86fff9897c77b0fe89ea9daba0e1fa38" }
postgres = { git = "https://github.com/neondatabase/rust-postgres.git", rev="d052ee8b86fff9897c77b0fe89ea9daba0e1fa38" }
regex = "1"
serde = { version = "1.0", features = ["derive"] }
serde_json = "1"
tar = "0.4"
tokio = { version = "1.17", features = ["macros", "rt", "rt-multi-thread"] }
tokio-postgres = { git = "https://github.com/zenithdb/rust-postgres.git", rev="d052ee8b86fff9897c77b0fe89ea9daba0e1fa38" }
tokio-postgres = { git = "https://github.com/neondatabase/rust-postgres.git", rev="d052ee8b86fff9897c77b0fe89ea9daba0e1fa38" }
url = "2.2.2"
workspace_hack = { version = "0.1", path = "../workspace_hack" }

View File

@@ -178,6 +178,7 @@ impl ComputeNode {
.args(&["--sync-safekeepers"])
.env("PGDATA", &self.pgdata) // we cannot use -D in this mode
.stdout(Stdio::piped())
.stderr(Stdio::piped())
.spawn()
.expect("postgres --sync-safekeepers failed to start");
@@ -187,10 +188,13 @@ impl ComputeNode {
let sync_output = sync_handle
.wait_with_output()
.expect("postgres --sync-safekeepers failed");
if !sync_output.status.success() {
anyhow::bail!(
"postgres --sync-safekeepers exited with non-zero status: {}",
"postgres --sync-safekeepers exited with non-zero status: {}. stdout: {}, stderr: {}",
sync_output.status,
String::from_utf8(sync_output.stdout).expect("postgres --sync-safekeepers exited, and stdout is not utf-8"),
String::from_utf8(sync_output.stderr).expect("postgres --sync-safekeepers exited, and stderr is not utf-8"),
);
}

View File

@@ -62,9 +62,16 @@ impl GenericOption {
/// Represent `GenericOption` as configuration option.
pub fn to_pg_setting(&self) -> String {
if let Some(val) = &self.value {
let name = match self.name.as_str() {
"safekeepers" => "neon.safekeepers",
"wal_acceptor_reconnect" => "neon.safekeeper_reconnect_timeout",
"wal_acceptor_connect_timeout" => "neon.safekeeper_connect_timeout",
it => it,
};
match self.vartype.as_ref() {
"string" => format!("{} = '{}'", self.name, val),
_ => format!("{} = {}", self.name, val),
"string" => format!("{} = '{}'", name, val),
_ => format!("{} = {}", name, val),
}
} else {
self.name.to_owned()

View File

@@ -85,7 +85,7 @@
"vartype": "bool"
},
{
"name": "safekeepers",
"name": "neon.safekeepers",
"value": "127.0.0.1:6502,127.0.0.1:6503,127.0.0.1:6501",
"vartype": "string"
},
@@ -181,7 +181,6 @@
}
]
},
"delta_operations": [
{
"action": "delete_db",

View File

@@ -28,7 +28,7 @@ mod pg_helpers_tests {
assert_eq!(
spec.cluster.settings.as_pg_settings(),
"fsync = off\nwal_level = replica\nhot_standby = on\nsafekeepers = '127.0.0.1:6502,127.0.0.1:6503,127.0.0.1:6501'\nwal_log_hints = on\nlog_connections = on\nshared_buffers = 32768\nport = 55432\nmax_connections = 100\nmax_wal_senders = 10\nlisten_addresses = '0.0.0.0'\nwal_sender_timeout = 0\npassword_encryption = md5\nmaintenance_work_mem = 65536\nmax_parallel_workers = 8\nmax_worker_processes = 8\nneon.tenant_id = 'b0554b632bd4d547a63b86c3630317e8'\nmax_replication_slots = 10\nneon.timeline_id = '2414a61ffc94e428f14b5758fe308e13'\nshared_preload_libraries = 'neon'\nsynchronous_standby_names = 'walproposer'\nneon.pageserver_connstring = 'host=127.0.0.1 port=6400'"
"fsync = off\nwal_level = replica\nhot_standby = on\nneon.safekeepers = '127.0.0.1:6502,127.0.0.1:6503,127.0.0.1:6501'\nwal_log_hints = on\nlog_connections = on\nshared_buffers = 32768\nport = 55432\nmax_connections = 100\nmax_wal_senders = 10\nlisten_addresses = '0.0.0.0'\nwal_sender_timeout = 0\npassword_encryption = md5\nmaintenance_work_mem = 65536\nmax_parallel_workers = 8\nmax_worker_processes = 8\nneon.tenant_id = 'b0554b632bd4d547a63b86c3630317e8'\nmax_replication_slots = 10\nneon.timeline_id = '2414a61ffc94e428f14b5758fe308e13'\nshared_preload_libraries = 'neon'\nsynchronous_standby_names = 'walproposer'\nneon.pageserver_connstring = 'host=127.0.0.1 port=6400'"
);
}

View File

@@ -4,8 +4,11 @@ version = "0.1.0"
edition = "2021"
[dependencies]
clap = "3.0"
comfy-table = "5.0.1"
git-version = "0.3.5"
tar = "0.4.38"
postgres = { git = "https://github.com/zenithdb/rust-postgres.git", rev="d052ee8b86fff9897c77b0fe89ea9daba0e1fa38" }
postgres = { git = "https://github.com/neondatabase/rust-postgres.git", rev="d052ee8b86fff9897c77b0fe89ea9daba0e1fa38" }
serde = { version = "1.0", features = ["derive"] }
serde_with = "1.12.0"
toml = "0.5"

View File

@@ -1,4 +1,4 @@
# Minimal zenith environment with one safekeeper. This is equivalent to the built-in
# Minimal neon environment with one safekeeper. This is equivalent to the built-in
# defaults that you get with no --config
[pageserver]
listen_pg_addr = '127.0.0.1:64000'

View File

@@ -1,3 +1,10 @@
//!
//! `neon_local` is an executable that can be used to create a local
//! Neon environment, for testing purposes. The local environment is
//! quite different from the cloud environment with Kubernetes, but it is
//! easier to work with locally. The python tests in `test_runner`
//! rely on `neon_local` to set up the environment for each test.
//!
use anyhow::{anyhow, bail, Context, Result};
use clap::{App, AppSettings, Arg, ArgMatches};
use control_plane::compute::ComputeControlPlane;
@@ -20,10 +27,10 @@ use std::process::exit;
use std::str::FromStr;
use utils::{
auth::{Claims, Scope},
id::{NodeId, TenantId, TenantTimelineId, TimelineId},
lsn::Lsn,
postgres_backend::AuthType,
project_git_version,
zid::{NodeId, ZTenantId, ZTenantTimelineId, ZTimelineId},
};
// Default id of a safekeeper node, if not specified on the command line.
@@ -32,6 +39,8 @@ const DEFAULT_PAGESERVER_ID: NodeId = NodeId(1);
const DEFAULT_BRANCH_NAME: &str = "main";
project_git_version!(GIT_VERSION);
const DEFAULT_PG_VERSION: &str = "14";
fn default_conf(etcd_binary_path: &Path) -> String {
format!(
r#"
@@ -65,7 +74,7 @@ struct TimelineTreeEl {
/// Name, recovered from neon config mappings
pub name: Option<String>,
/// Holds all direct children of this timeline referenced using `timeline_id`.
pub children: BTreeSet<ZTimelineId>,
pub children: BTreeSet<TimelineId>,
}
// Main entry point for the 'neon_local' CLI utility
@@ -98,6 +107,13 @@ fn main() -> Result<()> {
.takes_value(true)
.required(false);
let pg_version_arg = Arg::new("pg-version")
.long("pg-version")
.help("Postgres version to use for the initial tenant")
.required(false)
.takes_value(true)
.default_value(DEFAULT_PG_VERSION);
let port_arg = Arg::new("port")
.long("port")
.required(false)
@@ -139,6 +155,7 @@ fn main() -> Result<()> {
.required(false)
.value_name("config"),
)
.arg(pg_version_arg.clone())
)
.subcommand(
App::new("timeline")
@@ -157,7 +174,9 @@ fn main() -> Result<()> {
.subcommand(App::new("create")
.about("Create a new blank timeline")
.arg(tenant_id_arg.clone())
.arg(branch_name_arg.clone()))
.arg(branch_name_arg.clone())
.arg(pg_version_arg.clone())
)
.subcommand(App::new("import")
.about("Import timeline from basebackup directory")
.arg(tenant_id_arg.clone())
@@ -171,7 +190,9 @@ fn main() -> Result<()> {
.arg(Arg::new("wal-tarfile").long("wal-tarfile").takes_value(true)
.help("Wal to add after base"))
.arg(Arg::new("end-lsn").long("end-lsn").takes_value(true)
.help("Lsn the basebackup ends at")))
.help("Lsn the basebackup ends at"))
.arg(pg_version_arg.clone())
)
).subcommand(
App::new("tenant")
.setting(AppSettings::ArgRequiredElseHelp)
@@ -181,6 +202,7 @@ fn main() -> Result<()> {
.arg(tenant_id_arg.clone())
.arg(timeline_id_arg.clone().help("Use a specific timeline id when creating a tenant and its initial timeline"))
.arg(Arg::new("config").short('c').takes_value(true).multiple_occurrences(true).required(false))
.arg(pg_version_arg.clone())
)
.subcommand(App::new("config")
.arg(tenant_id_arg.clone())
@@ -232,8 +254,9 @@ fn main() -> Result<()> {
Arg::new("config-only")
.help("Don't do basebackup, create compute node with only config files")
.long("config-only")
.required(false)
))
.required(false))
.arg(pg_version_arg.clone())
)
.subcommand(App::new("start")
.about("Start a postgres compute node.\n This command actually creates new node from scratch, but preserves existing config files")
.arg(pg_node_arg.clone())
@@ -241,7 +264,9 @@ fn main() -> Result<()> {
.arg(branch_name_arg.clone())
.arg(timeline_id_arg.clone())
.arg(lsn_arg.clone())
.arg(port_arg.clone()))
.arg(port_arg.clone())
.arg(pg_version_arg.clone())
)
.subcommand(
App::new("stop")
.arg(pg_node_arg.clone())
@@ -314,7 +339,7 @@ fn main() -> Result<()> {
///
fn print_timelines_tree(
timelines: Vec<TimelineInfo>,
mut timeline_name_mappings: HashMap<ZTenantTimelineId, String>,
mut timeline_name_mappings: HashMap<TenantTimelineId, String>,
) -> Result<()> {
let mut timelines_hash = timelines
.iter()
@@ -325,7 +350,7 @@ fn print_timelines_tree(
info: t.clone(),
children: BTreeSet::new(),
name: timeline_name_mappings
.remove(&ZTenantTimelineId::new(t.tenant_id, t.timeline_id)),
.remove(&TenantTimelineId::new(t.tenant_id, t.timeline_id)),
},
)
})
@@ -367,7 +392,7 @@ fn print_timeline(
nesting_level: usize,
is_last: &[bool],
timeline: &TimelineTreeEl,
timelines: &HashMap<ZTimelineId, TimelineTreeEl>,
timelines: &HashMap<TimelineId, TimelineTreeEl>,
) -> Result<()> {
let local_remote = match (timeline.info.local.as_ref(), timeline.info.remote.as_ref()) {
(None, None) => unreachable!("in this case no info for a timeline is found"),
@@ -445,8 +470,8 @@ fn print_timeline(
/// Connects to the pageserver to query this information.
fn get_timeline_infos(
env: &local_env::LocalEnv,
tenant_id: &ZTenantId,
) -> Result<HashMap<ZTimelineId, TimelineInfo>> {
tenant_id: &TenantId,
) -> Result<HashMap<TimelineId, TimelineInfo>> {
Ok(PageServerNode::from_env(env)
.timeline_list(tenant_id)?
.into_iter()
@@ -455,7 +480,7 @@ fn get_timeline_infos(
}
// Helper function to parse --tenant_id option, or get the default from config file
fn get_tenant_id(sub_match: &ArgMatches, env: &local_env::LocalEnv) -> anyhow::Result<ZTenantId> {
fn get_tenant_id(sub_match: &ArgMatches, env: &local_env::LocalEnv) -> anyhow::Result<TenantId> {
if let Some(tenant_id_from_arguments) = parse_tenant_id(sub_match).transpose() {
tenant_id_from_arguments
} else if let Some(default_id) = env.default_tenant_id {
@@ -465,18 +490,18 @@ fn get_tenant_id(sub_match: &ArgMatches, env: &local_env::LocalEnv) -> anyhow::R
}
}
fn parse_tenant_id(sub_match: &ArgMatches) -> anyhow::Result<Option<ZTenantId>> {
fn parse_tenant_id(sub_match: &ArgMatches) -> anyhow::Result<Option<TenantId>> {
sub_match
.value_of("tenant-id")
.map(ZTenantId::from_str)
.map(TenantId::from_str)
.transpose()
.context("Failed to parse tenant id from the argument string")
}
fn parse_timeline_id(sub_match: &ArgMatches) -> anyhow::Result<Option<ZTimelineId>> {
fn parse_timeline_id(sub_match: &ArgMatches) -> anyhow::Result<Option<TimelineId>> {
sub_match
.value_of("timeline-id")
.map(ZTimelineId::from_str)
.map(TimelineId::from_str)
.transpose()
.context("Failed to parse timeline id from the argument string")
}
@@ -494,12 +519,19 @@ fn handle_init(init_match: &ArgMatches) -> anyhow::Result<LocalEnv> {
default_conf(&EtcdBroker::locate_etcd()?)
};
let pg_version = init_match
.value_of("pg-version")
.unwrap()
.parse::<u32>()
.context("Failed to parse postgres version from the argument string")?;
let mut env =
LocalEnv::parse_config(&toml_file).context("Failed to create neon configuration")?;
env.init().context("Failed to initialize neon repository")?;
// default_tenantid was generated by the `env.init()` call above
let initial_tenant_id = env.default_tenant_id.unwrap();
env.init(pg_version)
.context("Failed to initialize neon repository")?;
let initial_tenant_id = env
.default_tenant_id
.expect("default_tenant_id should be generated by the `env.init()` call above");
// Initialize pageserver, create initial tenant and timeline.
let pageserver = PageServerNode::from_env(&env);
@@ -508,6 +540,7 @@ fn handle_init(init_match: &ArgMatches) -> anyhow::Result<LocalEnv> {
Some(initial_tenant_id),
initial_timeline_id_arg,
&pageserver_config_overrides(init_match),
pg_version,
)
.unwrap_or_else(|e| {
eprintln!("pageserver init failed: {e}");
@@ -536,13 +569,7 @@ fn handle_tenant(tenant_match: &ArgMatches, env: &mut local_env::LocalEnv) -> an
match tenant_match.subcommand() {
Some(("list", _)) => {
for t in pageserver.tenant_list()? {
println!(
"{} {}",
t.id,
t.state
.map(|s| s.to_string())
.unwrap_or_else(|| String::from(""))
);
println!("{} {:?}", t.id, t.state);
}
}
Some(("create", create_match)) => {
@@ -556,8 +583,19 @@ fn handle_tenant(tenant_match: &ArgMatches, env: &mut local_env::LocalEnv) -> an
// Create an initial timeline for the new tenant
let new_timeline_id = parse_timeline_id(create_match)?;
let timeline_info =
pageserver.timeline_create(new_tenant_id, new_timeline_id, None, None)?;
let pg_version = create_match
.value_of("pg-version")
.unwrap()
.parse::<u32>()
.context("Failed to parse postgres version from the argument string")?;
let timeline_info = pageserver.timeline_create(
new_tenant_id,
new_timeline_id,
None,
None,
Some(pg_version),
)?;
let new_timeline_id = timeline_info.timeline_id;
let last_record_lsn = timeline_info
.local
@@ -606,7 +644,15 @@ fn handle_timeline(timeline_match: &ArgMatches, env: &mut local_env::LocalEnv) -
let new_branch_name = create_match
.value_of("branch-name")
.ok_or_else(|| anyhow!("No branch name provided"))?;
let timeline_info = pageserver.timeline_create(tenant_id, None, None, None)?;
let pg_version = create_match
.value_of("pg-version")
.unwrap()
.parse::<u32>()
.context("Failed to parse postgres version from the argument string")?;
let timeline_info =
pageserver.timeline_create(tenant_id, None, None, None, Some(pg_version))?;
let new_timeline_id = timeline_info.timeline_id;
let last_record_lsn = timeline_info
@@ -649,12 +695,19 @@ fn handle_timeline(timeline_match: &ArgMatches, env: &mut local_env::LocalEnv) -
// TODO validate both or none are provided
let pg_wal = end_lsn.zip(wal_tarfile);
let pg_version = import_match
.value_of("pg-version")
.unwrap()
.parse::<u32>()
.context("Failed to parse postgres version from the argument string")?;
let mut cplane = ComputeControlPlane::load(env.clone())?;
println!("Importing timeline into pageserver ...");
pageserver.timeline_import(tenant_id, timeline_id, base, pg_wal)?;
pageserver.timeline_import(tenant_id, timeline_id, base, pg_wal, pg_version)?;
println!("Creating node for imported timeline ...");
env.register_branch_mapping(name.to_string(), tenant_id, timeline_id)?;
cplane.new_node(tenant_id, name, timeline_id, None, None)?;
cplane.new_node(tenant_id, name, timeline_id, None, None, pg_version)?;
println!("Done");
}
Some(("branch", branch_match)) => {
@@ -681,6 +734,7 @@ fn handle_timeline(timeline_match: &ArgMatches, env: &mut local_env::LocalEnv) -
None,
start_lsn,
Some(ancestor_timeline_id),
None,
)?;
let new_timeline_id = timeline_info.timeline_id;
@@ -758,7 +812,7 @@ fn handle_pg(pg_match: &ArgMatches, env: &local_env::LocalEnv) -> Result<()> {
};
let branch_name = timeline_name_mappings
.get(&ZTenantTimelineId::new(tenant_id, node.timeline_id))
.get(&TenantTimelineId::new(tenant_id, node.timeline_id))
.map(|name| name.as_str())
.unwrap_or("?");
@@ -796,7 +850,14 @@ fn handle_pg(pg_match: &ArgMatches, env: &local_env::LocalEnv) -> Result<()> {
Some(p) => Some(p.parse()?),
None => None,
};
cplane.new_node(tenant_id, &node_name, timeline_id, lsn, port)?;
let pg_version = sub_args
.value_of("pg-version")
.unwrap()
.parse::<u32>()
.context("Failed to parse postgres version from the argument string")?;
cplane.new_node(tenant_id, &node_name, timeline_id, lsn, port, pg_version)?;
}
"start" => {
let port: Option<u16> = match sub_args.value_of("port") {
@@ -809,7 +870,7 @@ fn handle_pg(pg_match: &ArgMatches, env: &local_env::LocalEnv) -> Result<()> {
let node = cplane.nodes.get(&(tenant_id, node_name.to_owned()));
let auth_token = if matches!(env.pageserver.auth_type, AuthType::ZenithJWT) {
let auth_token = if matches!(env.pageserver.auth_type, AuthType::NeonJWT) {
let claims = Claims::new(Some(tenant_id), Scope::Tenant);
Some(env.generate_auth_token(&claims)?)
@@ -834,16 +895,23 @@ fn handle_pg(pg_match: &ArgMatches, env: &local_env::LocalEnv) -> Result<()> {
.map(Lsn::from_str)
.transpose()
.context("Failed to parse Lsn from the request")?;
let pg_version = sub_args
.value_of("pg-version")
.unwrap()
.parse::<u32>()
.context("Failed to parse postgres version from the argument string")?;
// when used with a custom port this results in non-obvious behaviour
// the port is remembered from the first start command, i.e.
// start --port X
// stop
// start <-- will also use port X even without explicit port argument
println!(
"Starting new postgres {} on timeline {} ...",
node_name, timeline_id
"Starting new postgres (v{}) {} on timeline {} ...",
pg_version, node_name, timeline_id
);
let node = cplane.new_node(tenant_id, node_name, timeline_id, lsn, port)?;
let node =
cplane.new_node(tenant_id, node_name, timeline_id, lsn, port, pg_version)?;
node.start(&auth_token)?;
}
}

View File

@@ -13,12 +13,12 @@ use std::time::Duration;
use anyhow::{Context, Result};
use utils::{
connstring::connection_host_port,
id::{TenantId, TimelineId},
lsn::Lsn,
postgres_backend::AuthType,
zid::{ZTenantId, ZTimelineId},
};
use crate::local_env::LocalEnv;
use crate::local_env::{LocalEnv, DEFAULT_PG_VERSION};
use crate::postgresql_conf::PostgresConf;
use crate::storage::PageServerNode;
@@ -28,7 +28,7 @@ use crate::storage::PageServerNode;
pub struct ComputeControlPlane {
base_port: u16,
pageserver: Arc<PageServerNode>,
pub nodes: BTreeMap<(ZTenantId, String), Arc<PostgresNode>>,
pub nodes: BTreeMap<(TenantId, String), Arc<PostgresNode>>,
env: LocalEnv,
}
@@ -76,11 +76,12 @@ impl ComputeControlPlane {
pub fn new_node(
&mut self,
tenant_id: ZTenantId,
tenant_id: TenantId,
name: &str,
timeline_id: ZTimelineId,
timeline_id: TimelineId,
lsn: Option<Lsn>,
port: Option<u16>,
pg_version: u32,
) -> Result<Arc<PostgresNode>> {
let port = port.unwrap_or_else(|| self.get_port());
let node = Arc::new(PostgresNode {
@@ -93,6 +94,7 @@ impl ComputeControlPlane {
lsn,
tenant_id,
uses_wal_proposer: false,
pg_version,
});
node.create_pgdata()?;
@@ -114,10 +116,11 @@ pub struct PostgresNode {
pub env: LocalEnv,
pageserver: Arc<PageServerNode>,
is_test: bool,
pub timeline_id: ZTimelineId,
pub timeline_id: TimelineId,
pub lsn: Option<Lsn>, // if it's a read-only node. None for primary
pub tenant_id: ZTenantId,
pub tenant_id: TenantId,
uses_wal_proposer: bool,
pg_version: u32,
}
impl PostgresNode {
@@ -148,9 +151,17 @@ impl PostgresNode {
// Read a few options from the config file
let context = format!("in config file {}", cfg_path_str);
let port: u16 = conf.parse_field("port", &context)?;
let timeline_id: ZTimelineId = conf.parse_field("neon.timeline_id", &context)?;
let tenant_id: ZTenantId = conf.parse_field("neon.tenant_id", &context)?;
let uses_wal_proposer = conf.get("safekeepers").is_some();
let timeline_id: TimelineId = conf.parse_field("neon.timeline_id", &context)?;
let tenant_id: TenantId = conf.parse_field("neon.tenant_id", &context)?;
let uses_wal_proposer = conf.get("neon.safekeepers").is_some();
// Read postgres version from PG_VERSION file to determine which postgres version binary to use.
// If it doesn't exist, assume broken data directory and use default pg version.
let pg_version_path = entry.path().join("PG_VERSION");
let pg_version_str =
fs::read_to_string(pg_version_path).unwrap_or_else(|_| DEFAULT_PG_VERSION.to_string());
let pg_version = u32::from_str(&pg_version_str)?;
// parse recovery_target_lsn, if any
let recovery_target_lsn: Option<Lsn> =
@@ -167,17 +178,24 @@ impl PostgresNode {
lsn: recovery_target_lsn,
tenant_id,
uses_wal_proposer,
pg_version,
})
}
fn sync_safekeepers(&self, auth_token: &Option<String>) -> Result<Lsn> {
let pg_path = self.env.pg_bin_dir().join("postgres");
fn sync_safekeepers(&self, auth_token: &Option<String>, pg_version: u32) -> Result<Lsn> {
let pg_path = self.env.pg_bin_dir(pg_version).join("postgres");
let mut cmd = Command::new(&pg_path);
cmd.arg("--sync-safekeepers")
.env_clear()
.env("LD_LIBRARY_PATH", self.env.pg_lib_dir().to_str().unwrap())
.env("DYLD_LIBRARY_PATH", self.env.pg_lib_dir().to_str().unwrap())
.env(
"LD_LIBRARY_PATH",
self.env.pg_lib_dir(pg_version).to_str().unwrap(),
)
.env(
"DYLD_LIBRARY_PATH",
self.env.pg_lib_dir(pg_version).to_str().unwrap(),
)
.env("PGDATA", self.pgdata().to_str().unwrap())
.stdout(Stdio::piped())
// Comment this to avoid capturing stderr (useful if command hangs)
@@ -259,14 +277,14 @@ impl PostgresNode {
})
}
// Connect to a page server, get base backup, and untar it to initialize a
// new data directory
// Write postgresql.conf with default configuration
// and PG_VERSION file to the data directory of a new node.
fn setup_pg_conf(&self, auth_type: AuthType) -> Result<()> {
let mut conf = PostgresConf::new();
conf.append("max_wal_senders", "10");
// wal_log_hints is mandatory when running against pageserver (see gh issue#192)
// TODO: is it possible to check wal_log_hints at pageserver side via XLOG_PARAMETER_CHANGE?
conf.append("wal_log_hints", "on");
conf.append("wal_log_hints", "off");
conf.append("max_replication_slots", "10");
conf.append("hot_standby", "on");
conf.append("shared_buffers", "1MB");
@@ -292,7 +310,7 @@ impl PostgresNode {
// variable during compute pg startup. It is done this way because
// otherwise user will be able to retrieve the value using SHOW
// command or pg_settings
let password = if let AuthType::ZenithJWT = auth_type {
let password = if let AuthType::NeonJWT = auth_type {
"$ZENITH_AUTH_TOKEN"
} else {
""
@@ -301,7 +319,7 @@ impl PostgresNode {
// Also note that not all parameters are supported here. Because in compute we substitute $ZENITH_AUTH_TOKEN
// We parse this string and build it back with token from env var, and for simplicity rebuild
// uses only needed variables namely host, port, user, password.
format!("postgresql://no_user:{}@{}:{}", password, host, port)
format!("postgresql://no_user:{password}@{host}:{port}")
};
conf.append("shared_preload_libraries", "neon");
conf.append_line("");
@@ -341,7 +359,7 @@ impl PostgresNode {
.map(|sk| format!("localhost:{}", sk.pg_port))
.collect::<Vec<String>>()
.join(",");
conf.append("safekeepers", &safekeepers);
conf.append("neon.safekeepers", &safekeepers);
} else {
// We only use setup without safekeepers for tests,
// and don't care about data durability on pageserver,
@@ -357,6 +375,9 @@ impl PostgresNode {
let mut file = File::create(self.pgdata().join("postgresql.conf"))?;
file.write_all(conf.to_string().as_bytes())?;
let mut file = File::create(self.pgdata().join("PG_VERSION"))?;
file.write_all(self.pg_version.to_string().as_bytes())?;
Ok(())
}
@@ -368,7 +389,7 @@ impl PostgresNode {
// latest data from the pageserver. That is a bit clumsy but whole bootstrap
// procedure evolves quite actively right now, so let's think about it again
// when things would be more stable (TODO).
let lsn = self.sync_safekeepers(auth_token)?;
let lsn = self.sync_safekeepers(auth_token, self.pg_version)?;
if lsn == Lsn(0) {
None
} else {
@@ -401,7 +422,7 @@ impl PostgresNode {
}
fn pg_ctl(&self, args: &[&str], auth_token: &Option<String>) -> Result<()> {
let pg_ctl_path = self.env.pg_bin_dir().join("pg_ctl");
let pg_ctl_path = self.env.pg_bin_dir(self.pg_version).join("pg_ctl");
let mut cmd = Command::new(pg_ctl_path);
cmd.args(
[
@@ -417,8 +438,14 @@ impl PostgresNode {
.concat(),
)
.env_clear()
.env("LD_LIBRARY_PATH", self.env.pg_lib_dir().to_str().unwrap())
.env("DYLD_LIBRARY_PATH", self.env.pg_lib_dir().to_str().unwrap());
.env(
"LD_LIBRARY_PATH",
self.env.pg_lib_dir(self.pg_version).to_str().unwrap(),
)
.env(
"DYLD_LIBRARY_PATH",
self.env.pg_lib_dir(self.pg_version).to_str().unwrap(),
);
if let Some(token) = auth_token {
cmd.env("ZENITH_AUTH_TOKEN", token);
}

View File

@@ -14,12 +14,14 @@ use std::path::{Path, PathBuf};
use std::process::{Command, Stdio};
use utils::{
auth::{encode_from_key_file, Claims, Scope},
id::{NodeId, TenantId, TenantTimelineId, TimelineId},
postgres_backend::AuthType,
zid::{NodeId, ZTenantId, ZTenantTimelineId, ZTimelineId},
};
use crate::safekeeper::SafekeeperNode;
pub const DEFAULT_PG_VERSION: u32 = 14;
//
// This data structures represents neon_local CLI config
//
@@ -48,13 +50,13 @@ pub struct LocalEnv {
// Path to pageserver binary.
#[serde(default)]
pub zenith_distrib_dir: PathBuf,
pub neon_distrib_dir: PathBuf,
// Default tenant ID to use with the 'zenith' command line utility, when
// --tenantid is not explicitly specified.
// Default tenant ID to use with the 'neon_local' command line utility, when
// --tenant_id is not explicitly specified.
#[serde(default)]
#[serde_as(as = "Option<DisplayFromStr>")]
pub default_tenant_id: Option<ZTenantId>,
pub default_tenant_id: Option<TenantId>,
// used to issue tokens during e.g pg start
#[serde(default)]
@@ -69,11 +71,11 @@ pub struct LocalEnv {
/// Keep human-readable aliases in memory (and persist them to config), to hide ZId hex strings from the user.
#[serde(default)]
// A `HashMap<String, HashMap<ZTenantId, ZTimelineId>>` would be more appropriate here,
// A `HashMap<String, HashMap<TenantId, TimelineId>>` would be more appropriate here,
// but deserialization into a generic toml object as `toml::Value::try_from` fails with an error.
// https://toml.io/en/v1.0.0 does not contain a concept of "a table inside another table".
#[serde_as(as = "HashMap<_, Vec<(DisplayFromStr, DisplayFromStr)>>")]
branch_name_mappings: HashMap<String, Vec<(ZTenantId, ZTimelineId)>>,
branch_name_mappings: HashMap<String, Vec<(TenantId, TimelineId)>>,
}
/// Etcd broker config for cluster internal communication.
@@ -195,29 +197,50 @@ impl Default for SafekeeperConf {
}
impl LocalEnv {
// postgres installation paths
pub fn pg_bin_dir(&self) -> PathBuf {
self.pg_distrib_dir.join("bin")
pub fn pg_distrib_dir_raw(&self) -> PathBuf {
self.pg_distrib_dir.clone()
}
pub fn pg_lib_dir(&self) -> PathBuf {
self.pg_distrib_dir.join("lib")
pub fn pg_distrib_dir(&self, pg_version: u32) -> PathBuf {
let path = self.pg_distrib_dir.clone();
match pg_version {
14 => path.join(format!("v{pg_version}")),
15 => path.join(format!("v{pg_version}")),
_ => panic!("Unsupported postgres version: {}", pg_version),
}
}
pub fn pg_bin_dir(&self, pg_version: u32) -> PathBuf {
match pg_version {
14 => self.pg_distrib_dir(pg_version).join("bin"),
15 => self.pg_distrib_dir(pg_version).join("bin"),
_ => panic!("Unsupported postgres version: {}", pg_version),
}
}
pub fn pg_lib_dir(&self, pg_version: u32) -> PathBuf {
match pg_version {
14 => self.pg_distrib_dir(pg_version).join("lib"),
15 => self.pg_distrib_dir(pg_version).join("lib"),
_ => panic!("Unsupported postgres version: {}", pg_version),
}
}
pub fn pageserver_bin(&self) -> anyhow::Result<PathBuf> {
Ok(self.zenith_distrib_dir.join("pageserver"))
Ok(self.neon_distrib_dir.join("pageserver"))
}
pub fn safekeeper_bin(&self) -> anyhow::Result<PathBuf> {
Ok(self.zenith_distrib_dir.join("safekeeper"))
Ok(self.neon_distrib_dir.join("safekeeper"))
}
pub fn pg_data_dirs_path(&self) -> PathBuf {
self.base_data_dir.join("pgdatadirs").join("tenants")
}
pub fn pg_data_dir(&self, tenantid: &ZTenantId, branch_name: &str) -> PathBuf {
pub fn pg_data_dir(&self, tenant_id: &TenantId, branch_name: &str) -> PathBuf {
self.pg_data_dirs_path()
.join(tenantid.to_string())
.join(tenant_id.to_string())
.join(branch_name)
}
@@ -233,8 +256,8 @@ impl LocalEnv {
pub fn register_branch_mapping(
&mut self,
branch_name: String,
tenant_id: ZTenantId,
timeline_id: ZTimelineId,
tenant_id: TenantId,
timeline_id: TimelineId,
) -> anyhow::Result<()> {
let existing_values = self
.branch_name_mappings
@@ -260,22 +283,22 @@ impl LocalEnv {
pub fn get_branch_timeline_id(
&self,
branch_name: &str,
tenant_id: ZTenantId,
) -> Option<ZTimelineId> {
tenant_id: TenantId,
) -> Option<TimelineId> {
self.branch_name_mappings
.get(branch_name)?
.iter()
.find(|(mapped_tenant_id, _)| mapped_tenant_id == &tenant_id)
.map(|&(_, timeline_id)| timeline_id)
.map(ZTimelineId::from)
.map(TimelineId::from)
}
pub fn timeline_name_mappings(&self) -> HashMap<ZTenantTimelineId, String> {
pub fn timeline_name_mappings(&self) -> HashMap<TenantTimelineId, String> {
self.branch_name_mappings
.iter()
.flat_map(|(name, tenant_timelines)| {
tenant_timelines.iter().map(|&(tenant_id, timeline_id)| {
(ZTenantTimelineId::new(tenant_id, timeline_id), name.clone())
(TenantTimelineId::new(tenant_id, timeline_id), name.clone())
})
})
.collect()
@@ -289,24 +312,26 @@ impl LocalEnv {
let mut env: LocalEnv = toml::from_str(toml)?;
// Find postgres binaries.
// Follow POSTGRES_DISTRIB_DIR if set, otherwise look in "tmp_install".
// Follow POSTGRES_DISTRIB_DIR if set, otherwise look in "pg_install".
// Note that later in the code we assume, that distrib dirs follow the same pattern
// for all postgres versions.
if env.pg_distrib_dir == Path::new("") {
if let Some(postgres_bin) = env::var_os("POSTGRES_DISTRIB_DIR") {
env.pg_distrib_dir = postgres_bin.into();
} else {
let cwd = env::current_dir()?;
env.pg_distrib_dir = cwd.join("tmp_install")
env.pg_distrib_dir = cwd.join("pg_install")
}
}
// Find zenith binaries.
if env.zenith_distrib_dir == Path::new("") {
env.zenith_distrib_dir = env::current_exe()?.parent().unwrap().to_owned();
// Find neon binaries.
if env.neon_distrib_dir == Path::new("") {
env.neon_distrib_dir = env::current_exe()?.parent().unwrap().to_owned();
}
// If no initial tenant ID was given, generate it.
if env.default_tenant_id.is_none() {
env.default_tenant_id = Some(ZTenantId::generate());
env.default_tenant_id = Some(TenantId::generate());
}
env.base_data_dir = base_path();
@@ -320,12 +345,12 @@ impl LocalEnv {
if !repopath.exists() {
bail!(
"Zenith config is not found in {}. You need to run 'neon_local init' first",
"Neon config is not found in {}. You need to run 'neon_local init' first",
repopath.to_str().unwrap()
);
}
// TODO: check that it looks like a zenith repository
// TODO: check that it looks like a neon repository
// load and parse file
let config = fs::read_to_string(repopath.join("config"))?;
@@ -384,7 +409,7 @@ impl LocalEnv {
//
// Initialize a new Neon repository
//
pub fn init(&mut self) -> anyhow::Result<()> {
pub fn init(&mut self, pg_version: u32) -> anyhow::Result<()> {
// check if config already exists
let base_path = &self.base_data_dir;
ensure!(
@@ -397,17 +422,17 @@ impl LocalEnv {
"directory '{}' already exists. Perhaps already initialized?",
base_path.display()
);
if !self.pg_distrib_dir.join("bin/postgres").exists() {
if !self.pg_bin_dir(pg_version).join("postgres").exists() {
bail!(
"Can't find postgres binary at {}",
self.pg_distrib_dir.display()
self.pg_bin_dir(pg_version).display()
);
}
for binary in ["pageserver", "safekeeper"] {
if !self.zenith_distrib_dir.join(binary).exists() {
if !self.neon_distrib_dir.join(binary).exists() {
bail!(
"Can't find binary '{binary}' in zenith distrib dir '{}'",
self.zenith_distrib_dir.display()
"Can't find binary '{binary}' in neon distrib dir '{}'",
self.neon_distrib_dir.display()
);
}
}

View File

@@ -2,7 +2,7 @@
/// Module for parsing postgresql.conf file.
///
/// NOTE: This doesn't implement the full, correct postgresql.conf syntax. Just
/// enough to extract a few settings we need in Zenith, assuming you don't do
/// enough to extract a few settings we need in Neon, assuming you don't do
/// funny stuff like include-directives or funny escaping.
use anyhow::{bail, Context, Result};
use once_cell::sync::Lazy;

View File

@@ -17,7 +17,7 @@ use thiserror::Error;
use utils::{
connstring::connection_address,
http::error::HttpErrorBody,
zid::{NodeId, ZTenantId, ZTimelineId},
id::{NodeId, TenantId, TimelineId},
};
use crate::local_env::{LocalEnv, SafekeeperConf};
@@ -46,7 +46,7 @@ impl ResponseErrorMessageExt for Response {
return Ok(self);
}
// reqwest do not export it's error construction utility functions, so lets craft the message ourselves
// reqwest does not export its error construction utility functions, so let's craft the message ourselves
let url = self.url().to_owned();
Err(SafekeeperHttpError::Response(
match self.json::<HttpErrorBody>() {
@@ -269,7 +269,7 @@ impl SafekeeperNode {
fn http_request<U: IntoUrl>(&self, method: Method, url: U) -> RequestBuilder {
// TODO: authentication
//if self.env.auth_type == AuthType::ZenithJWT {
//if self.env.auth_type == AuthType::NeonJWT {
// builder = builder.bearer_auth(&self.env.safekeeper_auth_token)
//}
self.http_client.request(method, url)
@@ -284,8 +284,8 @@ impl SafekeeperNode {
pub fn timeline_create(
&self,
tenant_id: ZTenantId,
timeline_id: ZTimelineId,
tenant_id: TenantId,
timeline_id: TimelineId,
peer_ids: Vec<NodeId>,
) -> Result<()> {
Ok(self

View File

@@ -21,9 +21,9 @@ use thiserror::Error;
use utils::{
connstring::connection_address,
http::error::HttpErrorBody,
id::{TenantId, TimelineId},
lsn::Lsn,
postgres_backend::AuthType,
zid::{ZTenantId, ZTimelineId},
};
use crate::local_env::LocalEnv;
@@ -57,7 +57,7 @@ impl ResponseErrorMessageExt for Response {
return Ok(self);
}
// reqwest do not export it's error construction utility functions, so lets craft the message ourselves
// reqwest does not export its error construction utility functions, so let's craft the message ourselves
let url = self.url().to_owned();
Err(PageserverHttpError::Response(
match self.json::<HttpErrorBody>() {
@@ -83,7 +83,7 @@ pub struct PageServerNode {
impl PageServerNode {
pub fn from_env(env: &LocalEnv) -> PageServerNode {
let password = if env.pageserver.auth_type == AuthType::ZenithJWT {
let password = if env.pageserver.auth_type == AuthType::NeonJWT {
&env.pageserver.auth_token
} else {
""
@@ -109,14 +109,18 @@ impl PageServerNode {
pub fn initialize(
&self,
create_tenant: Option<ZTenantId>,
initial_timeline_id: Option<ZTimelineId>,
create_tenant: Option<TenantId>,
initial_timeline_id: Option<TimelineId>,
config_overrides: &[&str],
) -> anyhow::Result<ZTimelineId> {
pg_version: u32,
) -> anyhow::Result<TimelineId> {
let id = format!("id={}", self.env.pageserver.id);
// FIXME: the paths should be shell-escaped to handle paths with spaces, quotes etc.
let pg_distrib_dir_param =
format!("pg_distrib_dir='{}'", self.env.pg_distrib_dir.display());
let pg_distrib_dir_param = format!(
"pg_distrib_dir='{}'",
self.env.pg_distrib_dir_raw().display()
);
let authg_type_param = format!("auth_type='{}'", self.env.pageserver.auth_type);
let listen_http_addr_param = format!(
"listen_http_addr='{}'",
@@ -159,7 +163,7 @@ impl PageServerNode {
self.start_node(&init_config_overrides, &self.env.base_data_dir, true)?;
let init_result = self
.try_init_timeline(create_tenant, initial_timeline_id)
.try_init_timeline(create_tenant, initial_timeline_id, pg_version)
.context("Failed to create initial tenant and timeline for pageserver");
match &init_result {
Ok(initial_timeline_id) => {
@@ -173,12 +177,18 @@ impl PageServerNode {
fn try_init_timeline(
&self,
new_tenant_id: Option<ZTenantId>,
new_timeline_id: Option<ZTimelineId>,
) -> anyhow::Result<ZTimelineId> {
new_tenant_id: Option<TenantId>,
new_timeline_id: Option<TimelineId>,
pg_version: u32,
) -> anyhow::Result<TimelineId> {
let initial_tenant_id = self.tenant_create(new_tenant_id, HashMap::new())?;
let initial_timeline_info =
self.timeline_create(initial_tenant_id, new_timeline_id, None, None)?;
let initial_timeline_info = self.timeline_create(
initial_tenant_id,
new_timeline_id,
None,
None,
Some(pg_version),
)?;
Ok(initial_timeline_info.timeline_id)
}
@@ -345,7 +355,7 @@ impl PageServerNode {
fn http_request<U: IntoUrl>(&self, method: Method, url: U) -> RequestBuilder {
let mut builder = self.http_client.request(method, url);
if self.env.pageserver.auth_type == AuthType::ZenithJWT {
if self.env.pageserver.auth_type == AuthType::NeonJWT {
builder = builder.bearer_auth(&self.env.pageserver.auth_token)
}
builder
@@ -368,46 +378,53 @@ impl PageServerNode {
pub fn tenant_create(
&self,
new_tenant_id: Option<ZTenantId>,
new_tenant_id: Option<TenantId>,
settings: HashMap<&str, &str>,
) -> anyhow::Result<ZTenantId> {
) -> anyhow::Result<TenantId> {
let mut settings = settings.clone();
let request = TenantCreateRequest {
new_tenant_id,
checkpoint_distance: settings
.remove("checkpoint_distance")
.map(|x| x.parse::<u64>())
.transpose()?,
checkpoint_timeout: settings.remove("checkpoint_timeout").map(|x| x.to_string()),
compaction_target_size: settings
.remove("compaction_target_size")
.map(|x| x.parse::<u64>())
.transpose()?,
compaction_period: settings.remove("compaction_period").map(|x| x.to_string()),
compaction_threshold: settings
.remove("compaction_threshold")
.map(|x| x.parse::<usize>())
.transpose()?,
gc_horizon: settings
.remove("gc_horizon")
.map(|x| x.parse::<u64>())
.transpose()?,
gc_period: settings.remove("gc_period").map(|x| x.to_string()),
image_creation_threshold: settings
.remove("image_creation_threshold")
.map(|x| x.parse::<usize>())
.transpose()?,
pitr_interval: settings.remove("pitr_interval").map(|x| x.to_string()),
walreceiver_connect_timeout: settings
.remove("walreceiver_connect_timeout")
.map(|x| x.to_string()),
lagging_wal_timeout: settings
.remove("lagging_wal_timeout")
.map(|x| x.to_string()),
max_lsn_wal_lag: settings
.remove("max_lsn_wal_lag")
.map(|x| x.parse::<NonZeroU64>())
.transpose()
.context("Failed to parse 'max_lsn_wal_lag' as non zero integer")?,
};
if !settings.is_empty() {
bail!("Unrecognized tenant settings: {settings:?}")
}
self.http_request(Method::POST, format!("{}/tenant", self.http_base_url))
.json(&TenantCreateRequest {
new_tenant_id,
checkpoint_distance: settings
.get("checkpoint_distance")
.map(|x| x.parse::<u64>())
.transpose()?,
checkpoint_timeout: settings.get("checkpoint_timeout").map(|x| x.to_string()),
compaction_target_size: settings
.get("compaction_target_size")
.map(|x| x.parse::<u64>())
.transpose()?,
compaction_period: settings.get("compaction_period").map(|x| x.to_string()),
compaction_threshold: settings
.get("compaction_threshold")
.map(|x| x.parse::<usize>())
.transpose()?,
gc_horizon: settings
.get("gc_horizon")
.map(|x| x.parse::<u64>())
.transpose()?,
gc_period: settings.get("gc_period").map(|x| x.to_string()),
image_creation_threshold: settings
.get("image_creation_threshold")
.map(|x| x.parse::<usize>())
.transpose()?,
pitr_interval: settings.get("pitr_interval").map(|x| x.to_string()),
walreceiver_connect_timeout: settings
.get("walreceiver_connect_timeout")
.map(|x| x.to_string()),
lagging_wal_timeout: settings.get("lagging_wal_timeout").map(|x| x.to_string()),
max_lsn_wal_lag: settings
.get("max_lsn_wal_lag")
.map(|x| x.parse::<NonZeroU64>())
.transpose()
.context("Failed to parse 'max_lsn_wal_lag' as non zero integer")?,
})
.json(&request)
.send()?
.error_from_body()?
.json::<Option<String>>()
@@ -422,7 +439,7 @@ impl PageServerNode {
})
}
pub fn tenant_config(&self, tenant_id: ZTenantId, settings: HashMap<&str, &str>) -> Result<()> {
pub fn tenant_config(&self, tenant_id: TenantId, settings: HashMap<&str, &str>) -> Result<()> {
self.http_request(Method::PUT, format!("{}/tenant/config", self.http_base_url))
.json(&TenantConfigRequest {
tenant_id,
@@ -471,7 +488,7 @@ impl PageServerNode {
Ok(())
}
pub fn timeline_list(&self, tenant_id: &ZTenantId) -> anyhow::Result<Vec<TimelineInfo>> {
pub fn timeline_list(&self, tenant_id: &TenantId) -> anyhow::Result<Vec<TimelineInfo>> {
let timeline_infos: Vec<TimelineInfo> = self
.http_request(
Method::GET,
@@ -486,10 +503,11 @@ impl PageServerNode {
pub fn timeline_create(
&self,
tenant_id: ZTenantId,
new_timeline_id: Option<ZTimelineId>,
tenant_id: TenantId,
new_timeline_id: Option<TimelineId>,
ancestor_start_lsn: Option<Lsn>,
ancestor_timeline_id: Option<ZTimelineId>,
ancestor_timeline_id: Option<TimelineId>,
pg_version: Option<u32>,
) -> anyhow::Result<TimelineInfo> {
self.http_request(
Method::POST,
@@ -499,6 +517,7 @@ impl PageServerNode {
new_timeline_id,
ancestor_start_lsn,
ancestor_timeline_id,
pg_version,
})
.send()?
.error_from_body()?
@@ -524,10 +543,11 @@ impl PageServerNode {
/// * `pg_wal` - if there's any wal to import: (end lsn, path to `pg_wal.tar`)
pub fn timeline_import(
&self,
tenant_id: ZTenantId,
timeline_id: ZTimelineId,
tenant_id: TenantId,
timeline_id: TimelineId,
base: (Lsn, PathBuf),
pg_wal: Option<(Lsn, PathBuf)>,
pg_version: u32,
) -> anyhow::Result<()> {
let mut client = self.pg_connection_config.connect(NoTls).unwrap();
@@ -546,8 +566,9 @@ impl PageServerNode {
};
// Import base
let import_cmd =
format!("import basebackup {tenant_id} {timeline_id} {start_lsn} {end_lsn}");
let import_cmd = format!(
"import basebackup {tenant_id} {timeline_id} {start_lsn} {end_lsn} {pg_version}"
);
let mut writer = client.copy_in(&import_cmd)?;
io::copy(&mut base_reader, &mut writer)?;
writer.finish()?;

View File

@@ -79,4 +79,5 @@
- [014-storage-lsm](rfcs/014-storage-lsm.md)
- [015-storage-messaging](rfcs/015-storage-messaging.md)
- [016-connection-routing](rfcs/016-connection-routing.md)
- [017-timeline-data-management](rfcs/017-timeline-data-management.md)
- [cluster-size-limits](rfcs/cluster-size-limits.md)

View File

@@ -2,14 +2,14 @@
### Overview
Current state of authentication includes usage of JWT tokens in communication between compute and pageserver and between CLI and pageserver. JWT token is signed using RSA keys. CLI generates a key pair during call to `zenith init`. Using following openssl commands:
Current state of authentication includes usage of JWT tokens in communication between compute and pageserver and between CLI and pageserver. JWT token is signed using RSA keys. CLI generates a key pair during call to `neon_local init`. Using following openssl commands:
```bash
openssl genrsa -out private_key.pem 2048
openssl rsa -in private_key.pem -pubout -outform PEM -out public_key.pem
```
CLI also generates signed token and saves it in the config for later access to pageserver. Now authentication is optional. Pageserver has two variables in config: `auth_validation_public_key_path` and `auth_type`, so when auth type present and set to `ZenithJWT` pageserver will require authentication for connections. Actual JWT is passed in password field of connection string. There is a caveat for psql, it silently truncates passwords to 100 symbols, so to correctly pass JWT via psql you have to either use PGPASSWORD environment variable, or store password in psql config file.
CLI also generates signed token and saves it in the config for later access to pageserver. Now authentication is optional. Pageserver has two variables in config: `auth_validation_public_key_path` and `auth_type`, so when auth type present and set to `NeonJWT` pageserver will require authentication for connections. Actual JWT is passed in password field of connection string. There is a caveat for psql, it silently truncates passwords to 100 symbols, so to correctly pass JWT via psql you have to either use PGPASSWORD environment variable, or store password in psql config file.
Currently there is no authentication between compute and safekeepers, because this communication layer is under heavy refactoring. After this refactoring support for authentication will be added there too. Now safekeeper supports "hardcoded" token passed via environment variable to be able to use callmemaybe command in pageserver.
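For illustration only, here is a minimal Rust sketch of the JWT-as-password approach described above, using the `postgres` crate pinned in the Cargo.toml diff earlier; the host, port and user values are placeholders, not the real configuration:
```rust
use postgres::{Client, NoTls};

// Connect to the pageserver, passing the JWT in the password field of the
// connection string. Unlike psql, a programmatic client does not truncate
// long passwords, so the full token reaches the server.
fn connect_with_jwt(host: &str, port: u16, jwt: &str) -> Result<Client, postgres::Error> {
    let connstr = format!("host={host} port={port} user=some_user password={jwt}");
    Client::connect(&connstr, NoTls)
}
```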

View File

@@ -148,31 +148,6 @@ relcache? (I think we do cache nblocks in relcache already, check why that's not
Neon)
## Misc change in vacuumlazy.c
```
index 8aab6e324e..c684c4fbee 100644
--- a/src/backend/access/heap/vacuumlazy.c
+++ b/src/backend/access/heap/vacuumlazy.c
@@ -1487,7 +1487,10 @@ lazy_scan_heap(LVRelState *vacrel, VacuumParams *params, bool aggressive)
else if (all_visible_according_to_vm && !PageIsAllVisible(page)
&& VM_ALL_VISIBLE(vacrel->rel, blkno, &vmbuffer))
{
- elog(WARNING, "page is not marked all-visible but visibility map bit is set in relation \"%s\" page %u",
+ /* ZENITH-XXX: all visible hint is not wal-logged
+ * FIXME: Replay visibilitymap changes in pageserver
+ */
+ elog(DEBUG1, "page is not marked all-visible but visibility map bit is set in relation \"%s\" page %u",
vacrel->relname, blkno);
visibilitymap_clear(vacrel->rel, blkno, vmbuffer,
VISIBILITYMAP_VALID_BITS);
```
Is this still needed? If that WARNING happens, it looks like potential corruption that we should
fix!
## Use buffer manager when extending VM or FSM
```

View File

@@ -92,6 +92,7 @@ The layer map tracks what layers exist in a timeline.
### Layered repository
Neon repository implementation that keeps data in layers.
### LSN
The Log Sequence Number (LSN) is a unique identifier of the WAL record[] in the WAL log.
@@ -125,6 +126,26 @@ TODO: use this name consistently in remote storage code. Now `disk_consistent_ls
* `ancestor_lsn` - LSN of the branch point (the LSN at which this branch was created)
TODO: add table that describes mapping between PostgreSQL (compute), safekeeper and pageserver LSNs.
### Logical size
The pageserver tracks the "logical size" of a timeline. It is the
total size of all relations in all Postgres databases on the
timeline. It includes all user and system tables, including their FSM
and VM forks. But it does not include SLRUs, twophase files or any
other such data or metadata that lives outside relations.
The logical size is calculated by the pageserver, and is sent to
PostgreSQL via feedback messages to the safekeepers. PostgreSQL uses
the logical size to enforce the size limit in the free tier. The
logical size is also shown to users in the web console.
The logical size is not affected by branches or the physical layout of
layer files in the pageserver. If you have a database with 1 GB
logical size and you create a branch of it, both branches will have 1
GB logical size, even though the branch is copy-on-write and won't
consume any extra physical disk space until you make changes to it.
### Page (block)
The basic structure used to store relation data. All pages are of the same size.

View File

@@ -2,26 +2,26 @@
### Overview
Zenith supports multitenancy. One pageserver can serve multiple tenants at once. Tenants can be managed via zenith CLI. During page server setup tenant can be created using ```zenith init --create-tenant``` Also tenants can be added into the system on the fly without pageserver restart. This can be done using the following cli command: ```zenith tenant create``` Tenants use random identifiers which can be represented as a 32 symbols hexadecimal string. So zenith tenant create accepts desired tenant id as an optional argument. The concept of timelines/branches is working independently per tenant.
Neon supports multitenancy. One pageserver can serve multiple tenants at once. Tenants can be managed via neon_local CLI. During page server setup tenant can be created using ```neon_local init --create-tenant``` Also tenants can be added into the system on the fly without pageserver restart. This can be done using the following cli command: ```neon_local tenant create``` Tenants use random identifiers which can be represented as a 32 symbols hexadecimal string. So neon_local tenant create accepts desired tenant id as an optional argument. The concept of timelines/branches is working independently per tenant.
### Tenants in other commands
By default during `zenith init` new tenant is created on the pageserver. Newly created tenant's id is saved to cli config, so other commands can use it automatically if no direct argument `--tenantid=<tenantid>` is provided. So generally tenantid more frequently appears in internal pageserver interface. Its commands take tenantid argument to distinguish to which tenant operation should be applied. CLI support creation of new tenants.
By default during `neon_local init` new tenant is created on the pageserver. Newly created tenant's id is saved to cli config, so other commands can use it automatically if no direct argument `--tenant_id=<tenant_id>` is provided. So generally tenant_id more frequently appears in internal pageserver interface. Its commands take tenant_id argument to distinguish to which tenant operation should be applied. CLI support creation of new tenants.
Examples for cli:
```sh
zenith tenant list
neon_local tenant list
zenith tenant create // generates new id
neon_local tenant create // generates new id
zenith tenant create ee6016ec31116c1b7c33dfdfca38892f
neon_local tenant create ee6016ec31116c1b7c33dfdfca38892f
zenith pg create main // default tenant from zenith init
neon_local pg create main // default tenant from neon init
zenith pg create main --tenantid=ee6016ec31116c1b7c33dfdfca38892f
neon_local pg create main --tenant_id=ee6016ec31116c1b7c33dfdfca38892f
zenith branch --tenantid=ee6016ec31116c1b7c33dfdfca38892f
neon_local branch --tenant_id=ee6016ec31116c1b7c33dfdfca38892f
```
### Data layout
@@ -56,4 +56,4 @@ Tenant id is passed to postgres via GUC the same way as the timeline. Tenant id
### Safety
For now particular tenant can only appear on a particular pageserver. Set of safekeepers are also pinned to particular (tenantid, timeline) pair so there can only be one writer for particular (tenantid, timeline).
For now particular tenant can only appear on a particular pageserver. Set of safekeepers are also pinned to particular (tenant_id, timeline_id) pair so there can only be one writer for particular (tenant_id, timeline_id).

View File

@@ -109,7 +109,7 @@ Repository
The repository stores all the page versions, or WAL records needed to
reconstruct them. Each tenant has a separate Repository, which is
stored in the .neon/tenants/<tenantid> directory.
stored in the .neon/tenants/<tenant_id> directory.
Repository is an abstract trait, defined in `repository.rs`. It is
implemented by the LayeredRepository object in

View File

@@ -123,7 +123,7 @@ The files are called "layer files". Each layer file covers a range of keys, and
a range of LSNs (or a single LSN, in case of image layers). You can think of it
as a rectangle in the two-dimensional key-LSN space. The layer files for each
timeline are stored in the timeline's subdirectory under
`.neon/tenants/<tenantid>/timelines`.
`.neon/tenants/<tenant_id>/timelines`.
There are two kinds of layer files: images, and delta layers. An image file
contains a snapshot of all keys at a particular LSN, whereas a delta file
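To make the key-LSN "rectangle" picture concrete, here is a rough sketch of what a single layer covers; these are not the actual pageserver types, and the key and LSN representations are simplified to plain integers:
```rust
use std::ops::Range;

// Simplified stand-ins: real keys and LSNs are richer than plain u64 values.
struct LayerDescriptor {
    key_range: Range<u64>, // which keys the layer covers
    lsn_range: Range<u64>, // which LSNs; an image layer covers a single LSN
    is_image: bool,        // image layer (snapshot) vs. delta layer (WAL records)
}
```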
@@ -351,7 +351,7 @@ branch.
Note: It doesn't make any difference if the child branch is created
when the end of the main branch was at LSN 250, or later when the tip of
the main branch had already moved on. The latter case, creating a
branch at a historic LSN, is how we support PITR in Zenith.
branch at a historic LSN, is how we support PITR in Neon.
# Garbage collection
@@ -396,9 +396,9 @@ table:
main/orders_200_300 DELETE
main/orders_300 STILL NEEDED BY orders_300_400
main/orders_300_400 KEEP, NEWER THAN GC HORIZON
main/orders_400 ..
main/orders_400_500 ..
main/orders_500 ..
main/orders_400 ..
main/orders_400_500 ..
main/orders_500 ..
main/customers_100 DELETE
main/customers_100_200 DELETE
main/customers_200 KEEP, NO NEWER VERSION

View File

@@ -9,7 +9,7 @@ This feature allows to migrate a timeline from one pageserver to another by util
Pageserver implements two new http handlers: timeline attach and timeline detach.
Timeline migration is performed in a following way:
1. Timeline attach is called on a target pageserver. This asks pageserver to download latest checkpoint uploaded to s3.
2. For now it is necessary to manually initialize replication stream via callmemaybe call so target pageserver initializes replication from safekeeper (it is desired to avoid this and initialize replication directly in attach handler, but this requires some refactoring (probably [#997](https://github.com/zenithdb/zenith/issues/997)/[#1049](https://github.com/zenithdb/zenith/issues/1049))
2. For now it is necessary to manually initialize replication stream via callmemaybe call so target pageserver initializes replication from safekeeper (it is desired to avoid this and initialize replication directly in attach handler, but this requires some refactoring (probably [#997](https://github.com/neondatabase/neon/issues/997)/[#1049](https://github.com/neondatabase/neon/issues/1049))
3. Replication state can be tracked via timeline detail pageserver call.
4. Compute node should be restarted with the new pageserver connection string. The issue of multiple compute nodes for one timeline is handled on the safekeeper consensus level, so this is not a problem here. Currently, responsibility for rescheduling the compute with the updated config lies on the external coordinator (console).
5. Timeline is detached from the old pageserver. On-disk data is removed.
@@ -18,5 +18,5 @@ Timeline migration is performed in a following way:
### Implementation details
Now safekeeper needs to track which pageserver it is replicating to. This introduces complications into replication code:
* We need to distinguish different pageservers (now this is done by connection string which is imperfect and is covered here: https://github.com/zenithdb/zenith/issues/1105). Callmemaybe subscription management also needs to track that (this is already implemented).
* We need to distinguish different pageservers (now this is done by connection string which is imperfect and is covered here: https://github.com/neondatabase/neon/issues/1105). Callmemaybe subscription management also needs to track that (this is already implemented).
* We need to track which pageserver is the primary. This is needed to avoid reconnections to non-primary pageservers, because we shouldn't reconnect to them when they decide to stop their walreceiver. I.e. this can happen when there is load on the compute and we are trying to detach the timeline from the old pageserver. In this case callmemaybe will try to reconnect to it because the replication termination condition is not met (a pageserver with an active compute could never catch up to the latest lsn, so there is always some wal tail).

View File

@@ -1,26 +1,39 @@
## Thread management
Each thread in the system is tracked by the `thread_mgr` module. It
maintains a registry of threads, and which tenant or timeline they are
operating on. This is used for safe shutdown of a tenant, or the whole
system.
The pageserver uses Tokio for handling concurrency. Everything runs in
Tokio tasks, although some parts are written in blocking style and use
spawn_blocking().
Each Tokio task is tracked by the `task_mgr` module. It maintains a
registry of tasks, and which tenant or timeline they are operating
on.
### Handling shutdown
When a tenant or timeline is deleted, we need to shut down all threads
operating on it, before deleting the data on disk. A thread registered
in the thread registry can check if it has been requested to shut down,
by calling `is_shutdown_requested()`. For async operations, there's also
a `shutdown_watcher()` async task that can be used to wake up on shutdown.
When a tenant or timeline is deleted, we need to shut down all tasks
operating on it, before deleting the data on disk. There's a function,
`shutdown_tasks`, to request all tasks of a particular tenant or
timeline to shutdown. It will also wait for them to finish.
A task registered in the task registry can check if it has been
requested to shut down, by calling `is_shutdown_requested()`. There's
also a `shutdown_watcher()` Future that can be used with `tokio::select!`
or similar, to wake up on shutdown.
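As a rough illustration (the names here are made up and are not the real `task_mgr` API), a task's main loop might combine its work with such a shutdown signal like this:
```rust
use tokio::sync::watch;

// Hypothetical per-task loop: react either to new work or to a shutdown request.
async fn timeline_task(mut shutdown_rx: watch::Receiver<bool>) {
    loop {
        tokio::select! {
            // The real code would await task_mgr's shutdown watcher here.
            _ = shutdown_rx.changed() => {
                // Clean up and return, so shutdown_tasks() sees this task finish.
                break;
            }
            _ = do_work() => {
                // Keep processing until asked to stop.
            }
        }
    }
}

async fn do_work() {
    tokio::time::sleep(std::time::Duration::from_millis(100)).await;
}
```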
### Sync vs async
The primary programming model in the page server is synchronous,
blocking code. However, there are some places where async code is
used. Be very careful when mixing sync and async code.
Async is primarily used to wait for incoming data on network
connections. For example, all WAL receivers have a shared thread pool,
with one async Task for each connection. Once a piece of WAL has been
received from the network, the thread calls the blocking functions in
We use async to wait for incoming data on network connections, and to
perform other long-running operations. For example, each WAL receiver
connection is handled by a tokio Task. Once a piece of WAL has been
received from the network, the task calls the blocking functions in
the Repository to process the WAL.
The core storage code in `layered_repository/` is synchronous, with
blocking locks and I/O calls. The current model is that we consider
disk I/Os to be short enough that we perform them while running in a
Tokio task. If that becomes a problem, we should use `spawn_blocking`
before entering the synchronous parts of the code, or switch to using
tokio I/O functions.
Be very careful when mixing sync and async code!
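If `spawn_blocking` were used for that, it could look roughly like the sketch below; the function names are invented for illustration and do not exist in the codebase:
```rust
// Run blocking repository work on Tokio's dedicated blocking thread pool, so
// the async worker thread is not stalled by synchronous locks or disk I/O.
async fn process_wal_record(record: Vec<u8>) -> anyhow::Result<()> {
    tokio::task::spawn_blocking(move || apply_record_blocking(&record)).await?
}

// Stand-in for the synchronous code path in layered_repository/.
fn apply_record_blocking(_record: &[u8]) -> anyhow::Result<()> {
    Ok(())
}
```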

View File

@@ -70,7 +70,7 @@ two options.
...start sending WAL conservatively since the horizon (1.1), and truncate
obsolete part of WAL only when recovery is finished, i.e. epochStartLsn (4) is
reached, i.e. 2.3 transferred -- that's what https://github.com/zenithdb/zenith/pull/505 proposes.
reached, i.e. 2.3 transferred -- that's what https://github.com/neondatabase/neon/pull/505 proposes.
Then the following is possible:

View File

@@ -0,0 +1,413 @@
# Name
Tenant and timeline data management in pageserver
## Summary
This RFC attempts to describe timeline-related data management as it's done now in pageserver, highlight current complexities caused by this and propose a set of changes to mitigate them.
The main goal is to prepare for future [on-demand layer downloads](https://github.com/neondatabase/neon/issues/2029), yet timeline data is one of the core primitives of the pageserver, so a number of other RFCs are affected as well.
Due to that, this document won't have a single implementation; rather, it requires a set of code changes to achieve the final state.
The RFC considers the repository at the `main` branch, commit [`28243d68e60ffc7e69f158522f589f7d2e09186d`](https://github.com/neondatabase/neon/tree/28243d68e60ffc7e69f158522f589f7d2e09186d) at the time of writing.
## Motivation
In recent discussions, it became clearer that the timeline-related code is getting harder to change: it consists of multiple disjoint modules, each requiring synchronization to access.
The lower the code is, the more complex the synchronization gets, since many concurrent processes are involved and require orchestration to keep the data consistent.
As the number of modules and isolated data grows per timeline, more questions and corner cases arise:
- https://github.com/neondatabase/neon/issues/1559
right now it's not settled what to do when the synchronization task fails too many times: every separate module's data has to be treated differently.
- https://github.com/neondatabase/neon/issues/1751
GC and compaction file activities are not well known outside their task code, causing race bugs
- https://github.com/neondatabase/neon/issues/2003
Even the tenant management gets affected: we have to alter its state based on timeline state, yet the data for making the decision is separated and the synchronisation logic has bugs
- more issues were brought up in discussions, but apparently they were too specific to the code to file as separate issues.
For instance, `tenant_mgr` itself is a static object that we cannot mock in any way, which reduces our ability to test the data synchronization logic.
In fact, we have zero Rust tests that cover the case of synchronizing more than one module's data.
On-demand layer downloads would require us to dynamically manage the layer files, which we barely do at the module level at all, so most module APIs deal with timelines rather than layer files.
Keeping this disjoint data in sync would require chains of lock acquisitions, some async and some sync, and it would be hard to unit test with the current state of the code.
This neither makes it easy to start the on-demand download epic, nor to add more timeline-related code on top, whatever the task is.
We have to develop a vision on a number of topics before progressing safely:
- timeline and tenant data structures and how we should access them
- sync and async worlds and in what way that should evolve
- unit tests for the complex logic
This RFC aims to provide a general overview of the existing situation and propose ways to improve it.
The changes proposed are quite big and no single PR is expected to make all the adjustments; they should be done gradually during the later on-demand download work.
## What is a timeline and its data
First, we need to define what data we want to manage per timeline.
Currently, the data every timeline operates on is:
- a set of layer files, on the FS
Never-updated files, created after pageserver checkpoints and compaction runs; they can be removed from the local FS due to compaction, GC or timeline deletion.
- a set of layer files, on the remote storage
Identically named files, placed in per-tenant subdirectories on the remote storage (S3), copied there by a special background sync thread.
- a `metadata` file, on the FS
Updated after every checkpoint with the newer `disk_consistent_lsn` and `latest_gc_cutoff_lsn` values. Used to quickly restore the timeline's basic metadata on pageserver restart.
Also contains data about the ancestor, if the timeline was branched off another timeline.
- an `index_part.json` file, on the remote storage
Contains the `metadata` file contents and a list of layer files available in the current S3 "directory" for the timeline (see the sketch after this list).
Used to avoid the potentially slow and expensive `S3 list` command; updated by the remote storage sync thread after every operation on the remote layer files.
- LayerMap and PageCache, in memory
Dynamic; used to store and serve page data to users.
- timeline info, in memory
LSNs, walreceiver data, `RemoteTimelineIndex` and other data to share via the HTTP API and internal processes.
- metrics data, in memory
Data to push or provide to Prometheus, OpenTelemetry, etc.
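As a rough illustration of the shape of `index_part.json` (the field names below are assumed for the sketch and are not taken from the actual `index.rs`):
```rust
use serde::{Deserialize, Serialize};

// Sketch only: the remote index holds the `metadata` contents plus the list of
// layer files present in the timeline's remote "directory", so the pageserver
// can avoid an `S3 list` call.
#[derive(Serialize, Deserialize)]
struct IndexPartSketch {
    timeline_layers: Vec<String>, // layer file names known to exist remotely
    metadata_bytes: Vec<u8>,      // serialized contents of the local `metadata` file
}
```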
Besides the data, every timeline currently needs an etcd connection to receive WAL events and connect to safekeepers.
A timeline can be an ancestor of another one, forming a dependency tree; the tree is implicit right now: the ancestor relations are looked up in place every time, based on the corresponding `TimelineMetadata` struct contents.
Yet there is the notion of a tenant as a group of timelines belonging to a single user, which is used by the GC and compaction tasks run on every tenant.
`tenant_mgr` manages tenant creation and its task startup, along with the remote storage sync for timeline layers.
The last file managed per tenant is the tenant config file, created and updated on the local FS to hold tenant-specific configuration between restarts.
It is not yet synchronized with the remote storage in any way, so it only exists on the local FS.
### How the data is stored
We have multiple places where timeline data is stored:
- `tenant_mgr` [holds](https://github.com/neondatabase/neon/blob/28243d68e60ffc7e69f158522f589f7d2e09186d/pageserver/src/tenant_mgr.rs#L43) a `static ref TENANTS: RwLock<HashMap<ZTenantId, Tenant>>`, with each `Tenant` having a `local_timelines: HashMap<ZTimelineId, Arc<DatadirTimelineImpl>>` inside
- the same `Tenant` actually has a second reference to its timelines via its `repo: Arc<RepositoryImpl>` with `pub type RepositoryImpl = LayeredRepository;` that [holds](https://github.com/neondatabase/neon/blob/28243d68e60ffc7e69f158522f589f7d2e09186d/pageserver/src/layered_repository.rs#L178) a `Mutex<HashMap<ZTimelineId, LayeredTimelineEntry>>`
- `RemoteTimelineIndex` [contains](https://github.com/neondatabase/neon/blob/28243d68e60ffc7e69f158522f589f7d2e09186d/pageserver/src/storage_sync/index.rs#L84) the metadata about timelines on the remote storage (S3) for sync reasons and possible HTTP API queries
- `walreceiver` [stores](https://github.com/neondatabase/neon/blob/28243d68e60ffc7e69f158522f589f7d2e09186d/pageserver/src/walreceiver.rs#L60) the metadata for possible HTTP API queries and its [internal state](https://github.com/neondatabase/neon/blob/28243d68e60ffc7e69f158522f589f7d2e09186d/pageserver/src/walreceiver/connection_manager.rs#L245) with a reference to the timeline, its current connections and etcd subscription (if any)
- `PageCache` contains timeline-related data, and is created globally for the whole pageserver
- implicitly, we also have files on the local FS that contain timeline state. We operate on those files for some operations (GC, compaction), yet we don't synchronize access to the files themselves in any way: there are higher-level locks ensuring that only one operation out of a group runs at a time.
In practice though, the `LayerMap` and layer files are tightly coupled: the current low-level code requires a timeline to be loaded into memory to work with it, and it removes a layer file only after first removing its entry from the `LayerMap`.
Based on this, a high-level diagram of the pageserver's modules, data and entities could look like:
![timeline tenant state diagram](./images/017-timeline-data-management/timeline_tenant_state.svg)
A few comments on the diagram:
- the diagram does not show all the data and collapses a few newtypes and type aliases (for example, it completely ignores "unloaded" timelines, for reasons described below)
It aims to show the main data and the means of synchronizing it.
- modules tend to isolate their data inside and provide access to it via an API
Due to multitenancy, that results in a common pattern for storing both tenant and timeline data: an `RwLock` or `Mutex` around a `HashMap<Id, Data>`; GC and compaction tasks also use the same lock pattern to ensure no concurrent runs happen.
- some modules are asynchronous while others are not, which complicates data access
Currently, anything not related to tasks (walreceiver, storage sync, GC, compaction) is blocking.
Async tasks that try to access data in the sync world have to call the `std::sync::Mutex::lock` method, which blocks the thread the calling async task runs on, also blocking other async tasks running on the same thread. Methods of `std::sync::RwLock` have the same issue, forcing async tasks either to block or to spawn a separate "blocking" task on another thread.
Sync tasks that try to access data in the async world cannot use `.await`, hence need some `Runtime` to do those calls for them. [`tokio::sync::Mutex`](https://docs.rs/tokio/1.19.2/tokio/sync/struct.Mutex.html#method.blocking_lock) and [`tokio::sync::RwLock`](https://docs.rs/tokio/1.19.2/tokio/sync/struct.RwLock.html#method.blocking_read) provide an API to simplify such calls. Similarly, both `std::sync` and `tokio::sync` have channels that can communicate in one direction without blocking and without requiring `.await`, so they can be used to connect both worlds without locking (a sketch of these bridging patterns follows this list).
Some modules are in transition: they are started as async "blocking" tasks while all of their code below the entry point is synchronous. The current idea is to move them further towards async, but that is not done yet.
- locks are used in two different ways:
- `RwLock<HashMap<..>>` ones to hold the shared data and ensure its atomic updates
- `Mutex<()>` for synchronizing the tasks, used to implicitly order the data access
The "shared data" locks of the first kind are mainly accessed briefly to either look up or alter the data, yet there are a few notable exceptions, such as
`latest_gc_cutoff_lsn: RwLock<Lsn>` that is explicitly held in a few places to prevent GC thread from progressing. Those are covered later in the data access diagrams.
- some synchronizations are not yet implemented
E.g. the asynchronous storage sync module does not synchronize with the (mostly synchronous) GC and compaction tasks when layer files are uploaded to the remote storage.
That occasionally results in a file being deleted before the storage upload task runs for that layer, but due to the incremental nature of the layer files, we can handle such situations without issues.
- `LayeredRepository` covers a lot of responsibilities: GC and compaction task synchronisation, timeline access (`local_timelines` in `Tenant` is not used directly before the timeline from the repository is accessed), layer flushing to the FS, scheduling of layer sync to the remote storage, etc.
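To make the sync/async bullet above more concrete, here is a minimal, self-contained sketch of the two bridging patterns it mentions; `TimelineData` and the plain `u64` ids are made-up illustrations, not the actual pageserver types.
```rust
use std::collections::HashMap;
use std::sync::Arc;

use tokio::sync::{mpsc, RwLock};

#[derive(Debug, Clone)]
struct TimelineData {
    disk_consistent_lsn: u64,
}

type SharedTimelines = Arc<RwLock<HashMap<u64, TimelineData>>>;

#[tokio::main]
async fn main() {
    let timelines: SharedTimelines = Arc::new(RwLock::new(HashMap::new()));

    // Async side: a brief write under the lock, released right away.
    timelines
        .write()
        .await
        .insert(1, TimelineData { disk_consistent_lsn: 42 });

    // Sync side (e.g. a spawn_blocking task): `blocking_read` parks the OS
    // thread instead of requiring `.await`, so no executor thread is starved.
    let timelines_for_sync = Arc::clone(&timelines);
    let read_back = tokio::task::spawn_blocking(move || {
        timelines_for_sync.blocking_read().get(&1).cloned()
    })
    .await
    .expect("blocking task panicked");
    println!("read back from sync code: {read_back:?}");

    // A channel connects the worlds without locks: the sync producer uses
    // `blocking_send` (no `.await`), the async consumer awaits `recv`.
    let (tx, mut rx) = mpsc::channel::<u64>(16);
    std::thread::spawn(move || {
        tx.blocking_send(100).ok();
    });
    assert_eq!(rx.recv().await, Some(100));
}
```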
### How is this data accessed?
There are multiple ways the data is accessed, from different sources:
1. [HTTP requests](https://github.com/neondatabase/neon/blob/28243d68e60ffc7e69f158522f589f7d2e09186d/pageserver/src/http/routes.rs)
High-level CRUD API for managing tenants, timelines and getting data about them.
Current API list (modified for readability):
```rust
.get("/v1/status", status_handler) // pageserver status
.get("/v1/tenant", tenant_list_handler)
.post("/v1/tenant", tenant_create_handler) // can create "empty" timelines or branch off the existing ones
.get("/v1/tenant/:tenant_id", tenant_status) // the only tenant public metadata
.put("/v1/tenant/config", tenant_config_handler) // tenant config data and local file manager
.get("/v1/tenant/:tenant_id/timeline", timeline_list_handler)
.post("/v1/tenant/:tenant_id/timeline", timeline_create_handler)
.post("/v1/tenant/:tenant_id/attach", tenant_attach_handler) // download entire tenant from the remote storage and load its timelines memory
.post("/v1/tenant/:tenant_id/detach", tenant_detach_handler) // delete all tenant timelines from memory, remote corresponding storage and local FS files
.get("/v1/tenant/:tenant_id/timeline/:timeline_id", timeline_detail_handler)
.delete("/v1/tenant/:tenant_id/timeline/:timeline_id", timeline_delete_handler)
.get("/v1/tenant/:tenant_id/timeline/:timeline_id/wal_receiver", wal_receiver_get_handler) // get walreceiver stats metadata
```
Overall, no HTTP operation goes below the `LayeredRepository` level or interacts with layers: instead, they manage tenant and timeline entities, their configuration and metadata.
`GET` data is small (relative to layer file contents), updated via brief `.write()/.lock()` calls and read by copying/cloning the data so the lock is released quickly.
That does not mean the operations themselves are short: e.g. `tenant_attach_handler` downloads multiple files from the remote storage, which might take time, yet the final data is inserted into memory via one brief write under the lock.
Non-`GET` operations mostly follow the same rule, with two differences:
- `tenant_detach_handler` has to wait for its background tasks to stop before shutting down, which requires more work with locks
- `timeline_create_handler` currently requires GC to be paused before branching the timeline, which also requires orchestration.
This is the only HTTP operation able to load a timeline into memory: the rest of the operations read metadata or, as in `tenant_attach_handler`, schedule a deferred task to download the timeline and load it into memory.
The "Timeline data synchronization" section below describes both complex cases in more detail.
2. [libpq requests](https://github.com/neondatabase/neon/blob/28243d68e60ffc7e69f158522f589f7d2e09186d/pageserver/src/page_service.rs)
This is the main interface of the pageserver, intended to handle libpq (and similar) requests.
It operates on the `LayeredTimeline` and, lower, `LayerMap` modules; all timelines accessed during an operation are loaded into memory immediately (if not loaded already), and operations bail on timeline load errors.
- `pagestream`
Page requests: `get_rel_exists`, `get_rel_size`, `get_page_at_lsn`, `get_db_size`
Main API points, intended to be used by `compute` to serve data to the user. All requests have to be made at a certain Lsn; if this Lsn is not yet available in memory, request processing is paused until it arrives, or bails after a timeout (a sketch of such an Lsn wait follows this list).
- `basebackup` and `fullbackup`
Options to generate postgres-compatible backup archives.
- `import basebackup`
- `import wal`
Import the `pg_wal` section of the basebackup archive.
- `get_last_record_rlsn`, `get_lsn_by_timestamp`
"Metadata" retrieval methods, that still requires internal knowledge about layers.
- `set`, `failpoints`, `show`
Utility methods to support various edge cases or help with debugging/testing.
- `do_gc`, `compact`, `checkpoint`
Manual triggers for the corresponding tenant tasks (GC, compaction) and for flushing in-memory layers to disk (checkpointing), with upload task scheduling as a follow-up.
Apart from being loaded into memory, every timeline layer has to be accessed using a specific set of locking primitives, especially if a write operation happens: otherwise, GC or compaction might corrupt the data. The user API is implicitly affected by this synchronization during branching, when GC has to be orchestrated properly before a new timeline can be branched off an existing one.
See "Timeline data synchronization" section for the united synchronization diagram on the topic.
3. internal access
Entities within the pageserver that update files on the local FS and remote storage, and metadata in memory; they have to use internal data for those operations.
Places that access internal, lower-level data also require the corresponding timeline to be successfully loaded into memory and accessed with the corresponding synchronization.
If an ancestor's data is accessed via its child branch, more than one timeline has to be loaded into memory entirely, and more locking primitives get involved.
Right now, all ancestors are resolved in place: every place that has to check a timeline's ancestor has to lock the timelines map, check whether the ancestor is loaded into memory, load it or bail if it's not present, get the required information, and so on.
- periodic GC and compaction tasks
Alter metadata (GC info), in-memory data (layer relations, page caches, etc.) and layer files on disk.
Same as their libpq counterparts, these need full synchronization with the low-level layer management code.
- storage sync task
Alters metadata (`RemoteTimelineIndex`), layer files on remote storage (upload, delete) and local FS (download) and in-memory data (registers downloaded timelines in the repository).
Currently, it does not know anything about layer file contents, focusing instead on the file structure and metadata file updates: since layer files cannot be updated (only created or deleted), the storage sync can back the files up to the remote storage without further low-level synchronization; only when a timeline is downloaded does a load operation need to run, possibly pausing GC and compaction tasks.
- walreceiver and walingest task
Per timeline, it subscribes to etcd events from safekeepers and eventually spawns a walreceiver connection task to receive WAL from a safekeeper node.
It fills memory with data, eventually triggering a checkpoint task that creates a new layer file on the local FS and schedules a remote storage upload task.
While receiving WAL, it also updates a separate in-memory structure with walreceiver stats, used later via the HTTP API.
Layer updates require the low-level set of sync primitives used to preserve data consistency.
- checkpoint (layer freeze) task
Periodic, short-lived tasks that generate a new layer file on the FS. They require low-level synchronization at the end, when the layer is registered after creation, and have an additional mode to ensure only one concurrent compaction happens at a time.
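As a concrete illustration of the "wait for the requested Lsn or bail after a timeout" behaviour described for `pagestream` requests in the list above, here is a simplified sketch; `wait_lsn`, the plain `u64` Lsn values and the `tokio::sync::watch` channel are assumptions for illustration, not the pageserver's actual implementation.
```rust
use std::time::Duration;

use tokio::sync::watch;
use tokio::time::timeout;

/// Wait until the published last-record Lsn reaches `wanted`, or give up
/// after `max_wait`. The WAL ingest side is assumed to own the sender and
/// publish a new value whenever it advances the Lsn.
async fn wait_lsn(
    mut last_record_lsn: watch::Receiver<u64>,
    wanted: u64,
    max_wait: Duration,
) -> anyhow::Result<()> {
    let wait = async {
        loop {
            if *last_record_lsn.borrow() >= wanted {
                return Ok::<(), watch::error::RecvError>(());
            }
            // Wakes up whenever a new Lsn value is published.
            last_record_lsn.changed().await?;
        }
    };
    match timeout(max_wait, wait).await {
        Ok(Ok(())) => Ok(()),
        Ok(Err(_sender_dropped)) => anyhow::bail!("WAL ingest side is gone"),
        Err(_elapsed) => anyhow::bail!("timed out waiting for LSN {wanted:#x}"),
    }
}
```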
### Timeline data synchronization
Here's a high-level timeline data access diagram, considering the synchronization locks, based on the state diagram above.
For brevity, the diagrams do not show `RwLock<HashMap<..>>` data accesses, considering them to be almost instant.
`RwLock<LayerMap>` is close to being an exception to that rule, since it is taken in multiple places to ensure all layers are inserted correctly.
Yet the only long operation in the current code is the `.write()` lock taken on the map during its creation, while all other lock usages tend to be short.
Note though that, due to the current "work with loaded timelines only" approach, the prevailing share of locks taken on the struct are `.write()` locks, not `.read()` ones.
To simplify the diagrams, these accesses are now considered "fast" data access, not the synchronization attempts.
`write_lock` synchronization diagram:
![timeline data access synchronization(1)](./images/017-timeline-data-management/timeline_data_access_sync_1.svg)
Comments:
- `write_lock: Mutex<()>` ensures that all timeline data being written into **in-memory layers** is done without races, one concurrent write at a time
- `layer_flush_lock: Mutex<()>` and layer flushing seem slightly bloated with various ways to create a layer on disk and write it in memory
The lock itself seems to duplicate the purpose of `write_lock` when it touches in-memory layers, and also to limit on-disk layer creation.
Yet the latter is not done consistently, since the remote storage sync manages to download and register new layers without touching these locks
- `freeze_inmem_layer(true)`, which takes both `write_lock` and `layer_flush_lock`, does not seem well aligned with the other uses of those primitives; it also restricts layer creation concurrency even further, yet various `freeze_inmem_layer(false)` calls ignore those restrictions at the same time
![timeline data access synchronization(2)](./images/017-timeline-data-management/timeline_data_access_sync_2.svg)
Comments:
- the `partitioning: Mutex<(KeyPartitioning, Lsn)>` lock is a data sync lock that is not used to synchronize tasks (all other locks of this kind were considered "almost instant" and omitted from the diagram), yet it is very similar to what `write_lock` and `layer_flush_lock` do: it ensures the timeline's in-memory data is up to date with the layer file state on disk, which is what `LayerMap` is for.
- there are multiple locks that do similar task management operations:
- `gc_cs: Mutex<()>` and `latest_gc_cutoff_lsn: RwLock<Lsn>` ensure that branching and GC do not run concurrently
- the `layer_removal_cs: Mutex<()>` lock ensures that GC, compaction and timeline deletion via the HTTP API do not run concurrently
- `file_lock: RwLock<()>` is used as a semaphore, to ensure "all" GC and compaction tasks are shut down and do not start
Yet that lock only covers GC and compaction runs started from the internal loops: a libpq-triggered call is neither cancelled nor waited upon.
Those operations do not seem to belong to a timeline. Moreover, some of these locks could be eliminated entirely because their tasks duplicate each other.
## Proposed implementation
### How to structure timeline data access better
- adjust tenant state handling
The current [`TenantState`](https://github.com/neondatabase/neon/blob/28243d68e60ffc7e69f158522f589f7d2e09186d/pageserver/src/tenant_mgr.rs#L108) [changes](https://github.com/neondatabase/neon/blob/28243d68e60ffc7e69f158522f589f7d2e09186d/pageserver/src/tenant_mgr.rs#L317) mainly indicate whether GC and compaction tasks are running or not; another state, `Broken`, only appears when some timeline fails to load during startup.
We could start both GC and compaction tasks at tenant creation time and make the tasks throttle/sleep while the tenant has no timelines, waking up when the first one is added.
The latter becomes more important with download on demand, since we won't have the entire timeline at hand to verify its correctness. Moreover, if a network issue happens, a timeline could fail temporarily and the entire tenant would have to be marked as broken because of that.
Since nothing checks the `TenantState` via the HTTP API currently, it makes sense to remove the state entirely and not write the code to synchronize its changes.
Instead, we could indicate internal issues per timeline and provide a better API to "stop" timeline processing without deleting its data, making our API less restrictive.
- remove the "unloaded" status for the timeline
Current approach to timeline management [assumes](https://github.com/neondatabase/neon/blob/28243d68e60ffc7e69f158522f589f7d2e09186d/pageserver/src/layered_repository.rs#L486-L493)
```rust
#[derive(Clone)]
enum LayeredTimelineEntry {
    Loaded(Arc<LayeredTimeline>),
    Unloaded {
        id: ZTimelineId,
        metadata: TimelineMetadata,
    },
}
```
that timelines can sit in an `Unloaded` state.
The difference between the two variants is whether the timeline's layer map was loaded from disk and kept in memory (`Loaded`) or not (`Unloaded`).
The idea behind this separation was to lazily load timelines into memory with all their layers only on first access, and potentially unload them later.
Yet there are now no public API methods that deal with an unloaded timeline's layers: all of them either bail when such a timeline is worked on, or load it into memory and continue.
Moreover, every timeline on the local FS is loaded on pageserver startup now, so the only two places where the `Unloaded` variant is used are branching and timeline attach, and both load the timeline into memory before the end of the operation.
Even if that loading into memory bails for some reason, the next periodic GC or compaction run would load the timeline into memory.
There are a few timeline methods that return timeline metadata without loading its layers, but such metadata also comes from the `metadata` file on the FS, not from the layer files (so no page info can be retrieved without loading the entire layer map first).
With on-demand layer download, it is no longer feasible to wait for the entire layer map to be loaded into memory, since it might not even be available on the local FS when requested: `LayerMap` needs to be changed to contain the metadata needed to retrieve missing layers and to handle a timeline that is only partially present on the local FS.
To accommodate that and move away from the redundant status, a timeline should always be "loaded", with its metadata read from disk and its layer map prepared to download layers on request, per layer.
Layers in the layer map, on the other hand, could be in various states: loaded, unloaded, downloading, download failed, etc., and it is their state that has to be handled if we want to support on-demand download in the future (a sketch of such per-layer states follows this list).
This way, tenants and timelines could always try to serve requests and run their internal tasks periodically, trying to recover.
- scale down the remote storage sync to per layer file, not per timeline as now
For the reasons given in the previous bullet, the current remote storage model needs its timeline download approach changed.
Right now, a timeline is marked as "ready" only after all of its layers from the remote storage have been downloaded to local storage.
With the on-demand download approach, only the remote timeline metadata should be downloaded from S3, leaving the rest of the layers to be downloaded if/when they are requested.
Note: while the remote storage sync should operate per layer, it should stay global for all tenants, to better manage S3 limits and sync queue priorities.
Yet the only place using the remote storage should be the layer map.
- encapsulate the `tenant_mgr` logic into a regular Rust struct, unite it with part of the `Repository` and anything else needed to manage timeline data in a single place and to test it independently
The [`Repository`](https://github.com/neondatabase/neon/blob/28243d68e60ffc7e69f158522f589f7d2e09186d/pageserver/src/repository.rs#L187) trait is getting closer to `tenant_mgr` in terms of functionality: it has two background-task-related functions run on all timelines of a tenant, `gc_iteration` (it allows running on a single timeline, but the GC task runs it on all timelines) and `compaction_iteration`, which relate to service tasks rather than to data storage; and the metadata management functions, also not really related to timeline contents.
`tenant_mgr` proxies some of the `Repository` calls, yet both service tasks use `tenant_mgr` to access the data they need, creating a circular dependency between their APIs.
To avoid excessive synchronization between components, with multiple locks and static state, we can organize data access and updates in one place.
One potential benefit Rust gains from this is the ability to track and manage timeline resources, if all the related data is located in one place.
- move `RemoteStorage` usage from `LayeredRepository` into `LayerMap`, as the rest of the layer-based entities (layer files, etc.)
Layer == file in our model, since the pageserver always either loads the `LayerMap` from disk for a timeline that is not in memory, or assumes the file contents match what is in memory.
`LayeredRepository` is one of the most loaded objects currently, and not everything in it deserves unification with `tenant_mgr`.
In particular, layer files need to be better prepared for the future download-on-demand functionality, where every layer could be dynamically loaded into and unloaded from memory and the local FS.
The current number of locks and the sync/async separation would make truly dynamic (un)loading hard to implement; moreover, we would need retries with backoffs, since unloaded layer files are most probably not on the local FS either, and the network is not always reliable.
One solution to this is already being developed for the remote storage sync: [SyncQueue](https://github.com/neondatabase/neon/blob/28243d68e60ffc7e69f158522f589f7d2e09186d/pageserver/src/storage_sync.rs#L463)
The queue can batch CRUD layer operations (for both local and remote FS contexts) and reorder them to increase sync speed.
A similar approach could be generalized to all layer modifications, including in-memory ones such as GC or compaction: this way, we could manage all layer modifications and reads in one place, with fewer locks and with tests that are closer to unit tests.
- change the approach to locking synchronization
A number of locks in the timeline seem to be used to coordinate GC and compaction tasks and related processes.
That coordination should be done in a task manager or some other place external to the timeline.
Timeline contents still need to be synchronized with respect to the tasks' work, so fields like `latest_gc_cutoff_lsn: RwLock<Lsn>` are expected to stay for that purpose, but the overall number of locks should be reduced.
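As referenced in the "remove the unloaded status" bullet above, here is a hypothetical sketch of per-layer states tracked inside the layer map instead of a per-timeline Loaded/Unloaded flag; all names and fields are illustrative assumptions, not existing pageserver types.
```rust
use std::path::PathBuf;

/// Where a single layer currently lives and what is happening to it.
enum LayerState {
    /// The layer file is present on the local FS and usable.
    Resident { local_path: PathBuf },
    /// The layer exists only on the remote storage; keep enough metadata to
    /// schedule a download on first access.
    Remote { remote_key: String, size_bytes: u64 },
    /// A download is scheduled or in flight.
    Downloading { remote_key: String },
    /// The last download attempt failed; keep the error and retry count so
    /// the next access can decide whether to retry with a backoff.
    DownloadFailed { remote_key: String, retries: u32, last_error: String },
}

/// One entry of a layer map that tracks per-layer state instead of a
/// per-timeline Loaded/Unloaded flag (key/LSN ranges are simplified here).
struct LayerEntry {
    key_start: u64,
    key_end: u64,
    lsn_start: u64,
    lsn_end: u64,
    state: LayerState,
}
```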
### Putting it all together
With the proposal bullets applied to the diagrams above, the state could be represented as:
![timeline timeline tenant state](./images/017-timeline-data-management/proposed_timeline_tenant_state.svg)
The reordering aims to put all tasks into separate modules, with strictly defined interfaces and as little knowledge about other components as possible.
This way, all timeline data now lives in `data_storage`, including the GC state, walreceiver, `RemoteTimelineIndex`, `LayerMap`, etc., with some API to get the data in a form more convenient for the data sync system inside.
So far, it seems that a few maps of `Arc<RwLock<SeparateData>>` would do, with the actual data operations added inside each `SeparateData` struct where needed.
`page_cache` is proposed to be placed into the same `data_storage`, since it contains the tenant timelines' data: this way, all metadata and data live in the same struct, which simplifies things with Rust's borrow checker, allows us to share internals between data modules, and later might simplify tracking a timeline's in-memory size.
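A rough sketch of what such a `data_storage` entry point could look like; the names (`DataStorage`, `TenantData`, `TimelineData`) and the plain `u64` ids are assumptions made for illustration, not a final API.
```rust
use std::collections::HashMap;
use std::sync::Arc;

use tokio::sync::RwLock;

struct TimelineData {
    // layer map, metadata, walreceiver state, page cache handles, ...
}

struct TenantData {
    timelines: RwLock<HashMap<u64, Arc<TimelineData>>>,
    // tenant config, etc.
}

/// A regular struct (instead of a static map in `tenant_mgr`) that owns all
/// tenant and timeline data and can be constructed independently in tests.
#[derive(Default)]
struct DataStorage {
    tenants: RwLock<HashMap<u64, Arc<TenantData>>>,
}

impl DataStorage {
    async fn get_timeline(&self, tenant_id: u64, timeline_id: u64) -> Option<Arc<TimelineData>> {
        // Two brief read locks; callers keep the cloned `Arc`s,
        // so the maps themselves are never locked for long.
        let tenant = self.tenants.read().await.get(&tenant_id).cloned()?;
        let timeline = tenant.timelines.read().await.get(&timeline_id).cloned();
        timeline
    }
}
```
Callers would hold the cloned `Arc`s, so the maps behind the locks are only touched briefly, matching the "brief write under the lock" pattern described earlier.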
`task_manager` is related to the data storage: it manages all tenant and timeline tasks, manages shared resources (runtimes, thread pools, the etcd connection, etc.) and synchronizes tasks.
All locks such as `gc_cs` belong to this module tree, as primitives inherently related to task synchronization.
Tasks have to access timelines and their metadata, but should do that through the `data_storage` API and similar.
`task_manager` should (re)start, stop and track all tasks run in it, selecting an appropriate runtime depending on the task kind (we have async/sync task separation, CPU- and IO-bound task separation, ...).
Some locks, such as the `layer_removal_cs` one, are not needed if the only component that starts the tasks ensures they do not run concurrently.
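A hedged sketch of such a `task_manager`, again with hypothetical names: it owns the background runtime and refuses to start a second task of the same kind for a tenant while the previous one is still running, which is what would make a dedicated `layer_removal_cs`-style lock unnecessary.
```rust
use std::collections::HashMap;
use std::sync::{Arc, Mutex};

use tokio::runtime::Runtime;
use tokio::task::JoinHandle;

#[derive(Clone, Copy, PartialEq, Eq, Hash)]
enum TaskKind {
    Gc,
    Compaction,
}

struct TaskManager {
    background_runtime: Runtime,
    // At most one task per (tenant, kind) at any time.
    running: Arc<Mutex<HashMap<(u64, TaskKind), JoinHandle<()>>>>,
}

impl TaskManager {
    fn new() -> std::io::Result<Self> {
        Ok(Self {
            background_runtime: tokio::runtime::Builder::new_multi_thread()
                .enable_all()
                .build()?,
            running: Arc::new(Mutex::new(HashMap::new())),
        })
    }

    /// Start `fut` on the background runtime unless a task of the same kind
    /// is already running for this tenant; returns whether it was started.
    fn spawn_exclusive<F>(&self, tenant_id: u64, kind: TaskKind, fut: F) -> bool
    where
        F: std::future::Future<Output = ()> + Send + 'static,
    {
        let mut running = self.running.lock().unwrap();
        if let Some(handle) = running.get(&(tenant_id, kind)) {
            if !handle.is_finished() {
                return false;
            }
        }
        let handle = self.background_runtime.spawn(fut);
        running.insert((tenant_id, kind), handle);
        true
    }
}
```
Whether this lives in one module or several is an open question; the point is that task exclusivity becomes the task manager's invariant rather than a lock shared across the timeline code.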
`LayeredTimeline` is still split into two parts: a more high-level one with whatever primitives are needed to sync its state, and the actual state storage with `LayerMap` and other low-level entities.
Only the `LayerMap` knows which storage its layer files come from (in-memory, local FS, etc.), and it is responsible for synchronizing the layers when needed, as well as for reacting to sync events, successful or not.
Last but not least, the `tenant config file` has to be backed up to the remote storage, as tenant-specific information for all timelines.
Tenants and timelines have volatile information that is now partially mixed with constant information (e.g. fields in the `metadata` file); that model should be better split and handled if we want to properly support its backups and synchronization.
![proposed timeline data access synchronization(1)](./images/017-timeline-data-management/proposed_timeline_data_access_sync_1.svg)
There is still a need to keep the in-memory layer buffer synchronized during layer freezing, yet that could happen at the layer level rather than at the timeline level as `write_lock` does now, so we could push the sync primitives one layer deeper, preparing for the download-on-demand feature, where multiple layers could be concurrently streamed and written from various data sources.
Flushing a frozen layer requires creating a new layer on disk and a further remote storage upload, so the `LayerMap` has to get those flushed bytes and queue them later: there is no need to block anything in the timeline itself again, rather lock at the layer level if needed.
![proposed timeline data access synchronization(2)](./images/017-timeline-data-management/proposed_timeline_data_access_sync_2.svg)
Lock diagrams legend:
![lock diagrams legend](./images/017-timeline-data-management/lock_legend.svg)
After the frozen layers are flushed, something has to ensure that the layer structure is intact, so a repartitioning lock is still needed; it could also guard layer map structure changes, since both are needed either way.
This locking belongs to the `LowLevelLayeredTimeline` from the proposed data structure diagram, as the place where all such data is held.
Similarly, branching still has to be done after a certain Lsn in our current model, but this needs only one lock to synchronize, and that could be the `gc_cs: Mutex<()>` lock.
This raises the question of where the lock should be placed: it is the only place that requires pausing a GC task while handling an external HTTP request.
The right place for the lock seems to be the `task_manager`, which could manage GC in a more fine-grained way to accommodate the incoming branching request.
There is no explicit lock sync between GC, compaction or other mutually exclusive tasks: it is the job of the `task_manager` to ensure those are not run concurrently.

View File

@@ -15,7 +15,7 @@ The stateless compute node that performs validation is separate from the storage
Limit the maximum size of a PostgreSQL instance to limit free tier users (and other tiers in the future).
First of all, this is needed to control our free tier production costs.
Another reason to limit resources is risk management — we haven't (fully) tested and optimized zenith for big clusters,
Another reason to limit resources is risk management — we haven't (fully) tested and optimized neon for big clusters,
so we don't want to give users access to the functionality that we don't think is ready.
## Components
@@ -43,20 +43,20 @@ Then this size should be reported to compute node.
`current_timeline_size` value is included in the walreceiver's custom feedback message: `ReplicationFeedback.`
(PR about protocol changes https://github.com/zenithdb/zenith/pull/1037).
(PR about protocol changes https://github.com/neondatabase/neon/pull/1037).
This message is received by the safekeeper and propagated to compute node as a part of `AppendResponse`.
Finally, when compute node receives the `current_timeline_size` from safekeeper (or from pageserver directly), it updates the global variable.
And then every zenith_extend() operation checks if limit is reached `(current_timeline_size > neon.max_cluster_size)` and throws `ERRCODE_DISK_FULL` error if so.
And then every neon_extend() operation checks if limit is reached `(current_timeline_size > neon.max_cluster_size)` and throws `ERRCODE_DISK_FULL` error if so.
(see Postgres error codes [https://www.postgresql.org/docs/devel/errcodes-appendix.html](https://www.postgresql.org/docs/devel/errcodes-appendix.html))
TODO:
We can allow autovacuum processes to bypass this check, simply checking `IsAutoVacuumWorkerProcess()`.
It would be nice to allow manual VACUUM and VACUUM FULL to bypass the check, but it's uneasy to distinguish these operations at the low level.
See issues https://github.com/neondatabase/neon/issues/1245
https://github.com/zenithdb/zenith/issues/1445
https://github.com/neondatabase/neon/issues/1445
TODO:
We should warn users if the limit is soon to be reached.

(Seven new SVG diagrams referenced by the RFC above, 15 to 70 KiB each, are added under `./images/017-timeline-data-management/`; their diffs are suppressed by the viewer because the lines are too long.)

View File

@@ -155,9 +155,11 @@ for other files and for sockets for incoming connections.
#### pg_distrib_dir
A directory with Postgres installation to use during pageserver activities.
Since pageserver supports several postgres versions, `pg_distrib_dir` contains
a subdirectory for each version with naming convention `v{PG_MAJOR_VERSION}/`.
Inside that dir, a `bin/postgres` binary should be present.
The default distrib dir is `./tmp_install/`.
The default distrib dir is `./pg_install/`.
#### workdir (-D)

View File

@@ -10,7 +10,7 @@ Intended to be used in integration tests and in CLI tools for local installation
`/docs`:
Documentation of the Zenith features and concepts.
Documentation of the Neon features and concepts.
Now it is mostly dev documentation.
`/monitoring`:
@@ -19,7 +19,7 @@ TODO
`/pageserver`:
Zenith storage service.
Neon storage service.
The pageserver has a few different duties:
- Store and manage the data.
@@ -40,21 +40,21 @@ and create new databases and accounts (control plane API in our case).
Integration tests, written in Python using the `pytest` framework.
`/vendor/postgres`:
`/vendor/postgres-v14`:
PostgreSQL source tree, with the modifications needed for Neon.
`/vendor/postgres/contrib/neon`:
`/pgxn/neon`:
PostgreSQL extension that implements storage manager API and network communications with remote page server.
`/vendor/postgres/contrib/neon_test_utils`:
`/pgxn/neon_test_utils`:
PostgreSQL extension that contains functions needed for testing and debugging.
`/safekeeper`:
The zenith WAL service that receives WAL from a primary compute nodes and streams it to the pageserver.
The neon WAL service that receives WAL from a primary compute nodes and streams it to the pageserver.
It acts as a holding area and redistribution center for recently generated WAL.
For more detailed info, see [walservice.md](./walservice.md)
@@ -64,11 +64,6 @@ The workspace_hack crate exists only to pin down some dependencies.
We use [cargo-hakari](https://crates.io/crates/cargo-hakari) for automation.
`/zenith`
Main entry point for the 'zenith' CLI utility.
TODO: Doesn't it belong to control_plane?
`/libs`:
Unites granular neon helper crates under the hood.
@@ -101,7 +96,7 @@ A single virtual environment with all dependencies is described in the single `P
sudo apt install python3.9
```
- Install `poetry`
- Exact version of `poetry` is not important, see installation instructions available at poetry's [website](https://python-poetry.org/docs/#installation)`.
- Exact version of `poetry` is not important, see installation instructions available at poetry's [website](https://python-poetry.org/docs/#installation).
- Install dependencies via `./scripts/pysync`.
- Note that CI uses specific Python version (look for `PYTHON_VERSION` [here](https://github.com/neondatabase/docker-images/blob/main/rust/Dockerfile))
so if you have different version some linting tools can yield different result locally vs in the CI.
@@ -112,11 +107,13 @@ Run `poetry shell` to activate the virtual environment.
Alternatively, use `poetry run` to run a single command in the venv, e.g. `poetry run pytest`.
### Obligatory checks
We force code formatting via `yapf` and type hints via `mypy`.
Run the following commands in the repository's root (next to `setup.cfg`):
We force code formatting via `black`, `isort` and type hints via `mypy`.
Run the following commands in the repository's root (next to `pyproject.toml`):
```bash
poetry run yapf -ri . # All code is reformatted
poetry run isort . # Imports are reformatted
poetry run black . # All code is reformatted
poetry run flake8 . # Python linter
poetry run mypy . # Ensure there are no typing errors
```
@@ -125,10 +122,59 @@ Otherwise it will not find its configuration.
Also consider:
* Running `flake8` (or a linter of your choice, e.g. `pycodestyle`) and fixing possible defects, if any.
* Running `pycodestyle` (or a linter of your choice) and fixing possible defects, if any.
* Adding more type hints to your code to avoid `Any`.
### Changing dependencies
To add new package or change an existing one you can use `poetry add` or `poetry update` or edit `pyproject.toml` manually. Do not forget to run `poetry lock` in the latter case.
More details are available in poetry's [documentation](https://python-poetry.org/docs/).
## Configuring IDEs
Neon consists of three projects in different languages which use different project models.
* A bunch of Rust crates, all available from the root `Cargo.toml`.
* Integration tests in Python in the `test_runner` directory. Some stand-alone Python scripts exist as well.
* Postgres and our Postgres extensions in C built with Makefiles under `vendor/postgres` and `pgxn`.
### CLion
You can use CLion with the [Rust plugin](https://plugins.jetbrains.com/plugin/8182-rust) to develop Neon. It should pick up Rust and Python projects whenever you open Neon's repository as a project. We have not tried setting up a debugger, though.
C code requires some extra care, as it's built via Make, not CMake. Some of our developers have successfully used [compilation database](https://www.jetbrains.com/help/clion/compilation-database.html#compdb_generate) for CLion. It is a JSON file which lists all C source files and corresponding compilation keys. CLion can use it instead of `CMakeLists.txt`. To set up a project with a compilation database:
1. Clone the Neon repository and install all dependencies, including Python. Do not open it with CLion just yet.
2. Run the following commands in the repository's root:
```bash
# Install a `compiledb` tool which can parse make's output and generate the compilation database.
poetry add -D compiledb
# Clean the build tree so we can rebuild from scratch.
# Unfortunately, our and Postgres Makefiles do not work well with either --dry-run or --assume-new,
# so we don't know a way to generate the compilation database without recompiling everything,
# see https://github.com/neondatabase/neon/issues/2378#issuecomment-1241421325
make distclean
# Rebuild the Postgres parts from scratch and save the compilation commands to the compilation database.
# You can alter the -j parameter to your liking.
# Note that we only build for a specific version of Postgres. The extension code is shared, but headers are
# different, so we set up CLion to only use a specific version of the headers.
make -j$(nproc) --print-directory postgres-v15 neon-pg-ext-v15 | poetry run compiledb --verbose --no-build
# Uninstall the tool
poetry remove -D compiledb
# Make sure the compile_commands.json file is not committed.
echo /compile_commands.json >>.git/info/exclude
```
3. Open CLion, click "Open File or Project" and choose the generated `compile_commands.json` file to be opened "as a project". You cannot add a compilation database into an existing CLion project, you have to create a new one. _Do not_ open the directory as a project, open the file.
4. The newly created project should start indexing Postgres source code in C, as well as the C standard library. You may have to [configure the C compiler for the compilation database](https://www.jetbrains.com/help/clion/compilation-database.html#compdb_toolchain).
5. Open the `Cargo.toml` file in an editor in the same project. CLion should pick up the hint and start indexing Rust code.
6. Now you have a CLion project which knows about C files, Rust files. It should pick up Python files automatically as well.
7. Set up correct code indentation in CLion's settings: Editor > Code Style > C/C++, choose the "Project" scheme on the top, and tick the "Use tab character" on the "Tabs and Indents" tab. Ensure that "Tab size" is 4.
You can also enable Cargo Clippy diagnostics and enable Rustfmt instead of built-in code formatter.
Whenever you change layout of C files, you may need to regenerate the compilation database. No need to re-create the CLion project, changes should be picked up automatically.
Known issues (fixes and suggestions are welcome):
* Test results may be hard to read in CLion, both for unit tests in Rust and integration tests in Python. Use command line to run them instead.
* CLion does not support non-local Python interpreters, unlike PyCharm. E.g. if you use WSL, CLion does not see `poetry` and installed dependencies. Python support is limited.
* Cargo Clippy diagnostics in CLion may take a lot of resources.
* `poetry add -D` updates some packages and changes `poetry.lock` drastically even when followed by `poetry remove -D`. Feel free to `git checkout poetry.lock` and `./scripts/pysync` to revert these changes.

View File

@@ -1,13 +0,0 @@
[package]
name = "integration_tests"
version = "0.1.0"
edition = "2021"
# See more keys and their definitions at https://doc.rust-lang.org/cargo/reference/manifest.html
[dependencies]
utils = { path = "../libs/utils" }
pg_bin = { path = "../libs/pg_bin" }
tokio-postgres = { git = "https://github.com/zenithdb/rust-postgres.git", rev="d052ee8b86fff9897c77b0fe89ea9daba0e1fa38" }
tokio = { version = "1.17", features = ["macros", "rt", "rt-multi-thread"] }
anyhow = "1.0.62"

View File

@@ -1,36 +0,0 @@
#[cfg(test)]
mod tests {
use pg_bin::PgDatadir;
use std::path::PathBuf;
use tokio_postgres::NoTls;
#[tokio::test]
async fn test_postgres_select_1() -> anyhow::Result<()> {
// Test setup
let output = PathBuf::from("/home/bojan/tmp/");
let pg_prefix = PathBuf::from("/home/bojan/src/neondatabase/neon/tmp_install/bin/");
// Init datadir
let pg_datadir_path = PathBuf::from("/home/bojan/tmp/t1/");
let pg_datadir = PgDatadir::new_initdb(pg_datadir_path, &pg_prefix, &output, true);
// Get a postgres
let postgres = pg_datadir.spawn_postgres(pg_prefix, output);
let conn_info = postgres.admin_conn_info();
// Get client, run connection
let (client, connection) = conn_info.connect(NoTls).await?;
tokio::spawn(async move {
if let Err(e) = connection.await {
eprintln!("connection error: {}", e);
}
});
// Run "select 1"
let rows = client.query("SELECT 'hello';", &[]).await?;
let value: &str = rows[0].get(0);
assert_eq!(value, "hello");
Ok(())
}
}

View File

@@ -1 +0,0 @@
mod basic;

View File

@@ -11,7 +11,7 @@ use std::{fmt::Display, str::FromStr};
use once_cell::sync::Lazy;
use regex::{Captures, Regex};
use utils::zid::{NodeId, ZTenantId, ZTenantTimelineId};
use utils::id::{NodeId, TenantId, TenantTimelineId};
/// The subscription kind to the timeline updates from safekeeper.
#[derive(Debug, Clone, PartialEq, Eq, Hash)]
@@ -30,13 +30,13 @@ pub enum SubscriptionKind {
/// Get every update in etcd.
All,
/// Get etcd updates for any timeiline of a certain tenant, affected by any operation from any node kind.
TenantTimelines(ZTenantId),
TenantTimelines(TenantId),
/// Get etcd updates for a certain timeline of a tenant, affected by any operation from any node kind.
Timeline(ZTenantTimelineId),
Timeline(TenantTimelineId),
/// Get etcd timeline updates, specific to a certain node kind.
Node(ZTenantTimelineId, NodeKind),
Node(TenantTimelineId, NodeKind),
/// Get etcd timeline updates for a certain operation on specific nodes.
Operation(ZTenantTimelineId, NodeKind, OperationKind),
Operation(TenantTimelineId, NodeKind, OperationKind),
}
/// All kinds of nodes, able to write into etcd.
@@ -67,7 +67,7 @@ static SUBSCRIPTION_FULL_KEY_REGEX: Lazy<Regex> = Lazy::new(|| {
/// No other etcd keys are considered during system's work.
#[derive(Debug, Clone, Copy, PartialEq, Eq, Hash)]
pub struct SubscriptionFullKey {
pub id: ZTenantTimelineId,
pub id: TenantTimelineId,
pub node_kind: NodeKind,
pub operation: OperationKind,
pub node_id: NodeId,
@@ -83,7 +83,7 @@ impl SubscriptionKey {
}
/// Subscribes to a given timeline info updates from safekeepers.
pub fn sk_timeline_info(cluster_prefix: String, timeline: ZTenantTimelineId) -> Self {
pub fn sk_timeline_info(cluster_prefix: String, timeline: TenantTimelineId) -> Self {
Self {
cluster_prefix,
kind: SubscriptionKind::Operation(
@@ -97,7 +97,7 @@ impl SubscriptionKey {
/// Subscribes to all timeine updates during specific operations, running on the corresponding nodes.
pub fn operation(
cluster_prefix: String,
timeline: ZTenantTimelineId,
timeline: TenantTimelineId,
node_kind: NodeKind,
operation: OperationKind,
) -> Self {
@@ -175,7 +175,7 @@ impl FromStr for SubscriptionFullKey {
};
Ok(Self {
id: ZTenantTimelineId::new(
id: TenantTimelineId::new(
parse_capture(&key_captures, 1)?,
parse_capture(&key_captures, 2)?,
),
@@ -247,7 +247,7 @@ impl FromStr for SkOperationKind {
#[cfg(test)]
mod tests {
use utils::zid::ZTimelineId;
use utils::id::TimelineId;
use super::*;
@@ -256,9 +256,9 @@ mod tests {
let prefix = "neon";
let node_kind = NodeKind::Safekeeper;
let operation_kind = OperationKind::Safekeeper(SkOperationKind::WalBackup);
let tenant_id = ZTenantId::generate();
let timeline_id = ZTimelineId::generate();
let id = ZTenantTimelineId::new(tenant_id, timeline_id);
let tenant_id = TenantId::generate();
let timeline_id = TimelineId::generate();
let id = TenantTimelineId::new(tenant_id, timeline_id);
let node_id = NodeId(1);
let timeline_subscription_keys = [

View File

@@ -1,10 +0,0 @@
[package]
name = "pg_bin"
version = "0.1.0"
edition = "2021"
# See more keys and their definitions at https://doc.rust-lang.org/cargo/reference/manifest.html
[dependencies]
tokio-postgres = { git = "https://github.com/zenithdb/rust-postgres.git", rev="d052ee8b86fff9897c77b0fe89ea9daba0e1fa38" }
utils = { path = "../utils" }

View File

@@ -1,106 +0,0 @@
//! Utils for runnig postgres binaries as subprocesses.
use std::{fs::{File, remove_dir_all}, path::PathBuf, process::{Child, Command}, time::Duration};
use std::io::Write;
use utils::command_extensions::NeonCommandExtensions;
pub struct PgDatadir {
path: PathBuf
}
impl PgDatadir {
pub fn new_initdb(
path: PathBuf,
pg_prefix: &PathBuf,
command_output_dir: &PathBuf,
remove_if_exists: bool
) -> Self {
if remove_if_exists {
remove_dir_all(path.clone()).ok();
}
let status = Command::new(pg_prefix.join("initdb"))
.arg("-D")
.arg(path.clone())
.capture_to_files(command_output_dir.clone(), "initdb")
.status()
.expect("failed to get status");
assert!(status.success());
Self {
path
}
}
pub fn load_existing(path: PathBuf) -> Self{
Self {
path
}
}
pub fn path(&self) -> PathBuf {
self.path.clone()
}
pub fn spawn_postgres(self, pg_prefix: PathBuf, command_output_dir: PathBuf) -> LocalPostgres {
let port = 54729;
// Write conf
// TODO don't override existing conf
// - instead infer port from conf
let mut conf = File::create(self.path().join("postgresql.conf")).expect("failed to create file");
writeln!(&mut conf, "port = {}", port).expect("failed to write conf");
let process = Command::new(pg_prefix.join("postgres"))
.env("PGDATA", self.path())
.capture_to_files(command_output_dir, "pg")
.spawn()
.expect("postgres failed to spawn");
// Wait until ready. TODO improve this
std::thread::sleep(Duration::from_millis(300));
LocalPostgres {
datadir: self,
port: 54729,
process,
}
}
}
pub struct LocalPostgres {
datadir: PgDatadir,
port: u16,
process: Child,
}
impl LocalPostgres {
pub fn admin_conn_info(&self) -> tokio_postgres::Config {
// I don't like this, but idk what else to do
let whoami = Command::new("whoami").output().unwrap().stdout;
let user = String::from_utf8_lossy(&whoami);
let user = user.trim();
let mut config = tokio_postgres::Config::new();
config
.host("127.0.0.1")
.port(self.port)
.dbname("postgres")
.user(&user);
config
}
pub fn stop(mut self) -> PgDatadir {
self.process.kill().expect("failed to kill child");
PgDatadir {
path: self.datadir.path.clone()
}
}
}
impl Drop for LocalPostgres {
fn drop(&mut self) {
self.process.kill().expect("failed to kill child");
}
}

View File

@@ -4,7 +4,6 @@ version = "0.1.0"
edition = "2021"
[dependencies]
chrono = "0.4.19"
rand = "0.8.3"
regex = "1.4.5"
bytes = "1.0.1"
@@ -22,8 +21,9 @@ workspace_hack = { version = "0.1", path = "../../workspace_hack" }
[dev-dependencies]
env_logger = "0.9"
postgres = { git = "https://github.com/zenithdb/rust-postgres.git", rev="d052ee8b86fff9897c77b0fe89ea9daba0e1fa38" }
postgres = { git = "https://github.com/neondatabase/rust-postgres.git", rev="d052ee8b86fff9897c77b0fe89ea9daba0e1fa38" }
wal_craft = { path = "wal_craft" }
[build-dependencies]
bindgen = "0.59.1"
anyhow = "1.0"
bindgen = "0.60.1"

View File

@@ -9,9 +9,11 @@ should be auto-generated too, but that's a TODO.
The PostgreSQL on-disk file format is not portable across different
CPU architectures and operating systems. It is also subject to change
in each major PostgreSQL version. Currently, this module is based on
PostgreSQL v14, but in the future we will probably need a separate
copy for each PostgreSQL version.
in each major PostgreSQL version. Currently, this module supports
PostgreSQL v14 and v15: bindings and code that depends on them are version-specific.
This code is organized in modules: `postgres_ffi::v14` and `postgres_ffi::v15`
Version independend code is explicitly exported into shared `postgres_ffi`.
TODO: Currently, there is also some code that deals with WAL records
in pageserver/src/waldecoder.rs. That should be moved into this

View File

@@ -4,6 +4,7 @@ use std::env;
use std::path::PathBuf;
use std::process::Command;
use anyhow::{anyhow, Context};
use bindgen::callbacks::ParseCallbacks;
#[derive(Debug)]
@@ -42,93 +43,112 @@ impl ParseCallbacks for PostgresFfiCallbacks {
}
}
fn main() {
fn main() -> anyhow::Result<()> {
// Tell cargo to invalidate the built crate whenever the wrapper changes
println!("cargo:rerun-if-changed=bindgen_deps.h");
// Finding the location of C headers for the Postgres server:
// - if POSTGRES_INSTALL_DIR is set look into it, otherwise look into `<project_root>/tmp_install`
// - if there's a `bin/pg_config` file use it for getting include server, otherwise use `<project_root>/tmp_install/include/postgresql/server`
let mut pg_install_dir = if let Some(postgres_install_dir) = env::var_os("POSTGRES_INSTALL_DIR")
{
// - if POSTGRES_INSTALL_DIR is set look into it, otherwise look into `<project_root>/pg_install`
// - if there's a `bin/pg_config` file use it for getting include server, otherwise use `<project_root>/pg_install/{PG_MAJORVERSION}/include/postgresql/server`
let pg_install_dir = if let Some(postgres_install_dir) = env::var_os("POSTGRES_INSTALL_DIR") {
postgres_install_dir.into()
} else {
PathBuf::from("tmp_install")
PathBuf::from("pg_install")
};
if pg_install_dir.is_relative() {
let cwd = env::current_dir().unwrap();
pg_install_dir = cwd.join("..").join("..").join(pg_install_dir);
}
let pg_config_bin = pg_install_dir.join("bin").join("pg_config");
let inc_server_path: String = if pg_config_bin.exists() {
let output = Command::new(pg_config_bin)
.arg("--includedir-server")
.output()
.expect("failed to execute `pg_config --includedir-server`");
if !output.status.success() {
panic!("`pg_config --includedir-server` failed")
for pg_version in &["v14", "v15"] {
let mut pg_install_dir_versioned = pg_install_dir.join(pg_version);
if pg_install_dir_versioned.is_relative() {
let cwd = env::current_dir().context("Failed to get current_dir")?;
pg_install_dir_versioned = cwd.join("..").join("..").join(pg_install_dir_versioned);
}
String::from_utf8(output.stdout).unwrap().trim_end().into()
} else {
pg_install_dir
.join("include")
.join("postgresql")
.join("server")
.into_os_string()
.into_string()
.unwrap()
};
let pg_config_bin = pg_install_dir_versioned
.join(pg_version)
.join("bin")
.join("pg_config");
let inc_server_path: String = if pg_config_bin.exists() {
let output = Command::new(pg_config_bin)
.arg("--includedir-server")
.output()
.context("failed to execute `pg_config --includedir-server`")?;
// The bindgen::Builder is the main entry point
// to bindgen, and lets you build up options for
// the resulting bindings.
let bindings = bindgen::Builder::default()
//
// All the needed PostgreSQL headers are included from 'bindgen_deps.h'
//
.header("bindgen_deps.h")
//
// Tell cargo to invalidate the built crate whenever any of the
// included header files changed.
//
.parse_callbacks(Box::new(PostgresFfiCallbacks))
//
// These are the types and constants that we want to generate bindings for
//
.allowlist_type("BlockNumber")
.allowlist_type("OffsetNumber")
.allowlist_type("MultiXactId")
.allowlist_type("MultiXactOffset")
.allowlist_type("MultiXactStatus")
.allowlist_type("ControlFileData")
.allowlist_type("CheckPoint")
.allowlist_type("FullTransactionId")
.allowlist_type("XLogRecord")
.allowlist_type("XLogPageHeaderData")
.allowlist_type("XLogLongPageHeaderData")
.allowlist_var("XLOG_PAGE_MAGIC")
.allowlist_var("PG_CONTROL_FILE_SIZE")
.allowlist_var("PG_CONTROLFILEDATA_OFFSETOF_CRC")
.allowlist_type("PageHeaderData")
.allowlist_type("DBState")
// Because structs are used for serialization, tell bindgen to emit
// explicit padding fields.
.explicit_padding(true)
//
.clang_arg(format!("-I{inc_server_path}"))
//
// Finish the builder and generate the bindings.
//
.generate()
.expect("Unable to generate bindings");
if !output.status.success() {
panic!("`pg_config --includedir-server` failed")
}
// Write the bindings to the $OUT_DIR/bindings.rs file.
let out_path = PathBuf::from(env::var("OUT_DIR").unwrap());
bindings
.write_to_file(out_path.join("bindings.rs"))
.expect("Couldn't write bindings!");
String::from_utf8(output.stdout)
.context("pg_config output is not UTF-8")?
.trim_end()
.into()
} else {
let server_path = pg_install_dir_versioned
.join("include")
.join("postgresql")
.join("server")
.into_os_string();
server_path
.into_string()
.map_err(|s| anyhow!("Bad postgres server path {s:?}"))?
};
// The bindgen::Builder is the main entry point
// to bindgen, and lets you build up options for
// the resulting bindings.
let bindings = bindgen::Builder::default()
//
// All the needed PostgreSQL headers are included from 'bindgen_deps.h'
//
.header("bindgen_deps.h")
//
// Tell cargo to invalidate the built crate whenever any of the
// included header files changed.
//
.parse_callbacks(Box::new(PostgresFfiCallbacks))
//
// These are the types and constants that we want to generate bindings for
//
.allowlist_type("BlockNumber")
.allowlist_type("OffsetNumber")
.allowlist_type("XLogRecPtr")
.allowlist_type("XLogSegNo")
.allowlist_type("TimeLineID")
.allowlist_type("TimestampTz")
.allowlist_type("MultiXactId")
.allowlist_type("MultiXactOffset")
.allowlist_type("MultiXactStatus")
.allowlist_type("ControlFileData")
.allowlist_type("CheckPoint")
.allowlist_type("FullTransactionId")
.allowlist_type("XLogRecord")
.allowlist_type("XLogPageHeaderData")
.allowlist_type("XLogLongPageHeaderData")
.allowlist_var("XLOG_PAGE_MAGIC")
.allowlist_var("PG_CONTROL_FILE_SIZE")
.allowlist_var("PG_CONTROLFILEDATA_OFFSETOF_CRC")
.allowlist_type("PageHeaderData")
.allowlist_type("DBState")
// Because structs are used for serialization, tell bindgen to emit
// explicit padding fields.
.explicit_padding(true)
//
.clang_arg(format!("-I{inc_server_path}"))
//
// Finish the builder and generate the bindings.
//
.generate()
.context("Unable to generate bindings")?;
// Write the bindings to the $OUT_DIR/bindings_$pg_version.rs file.
let out_path: PathBuf = env::var("OUT_DIR")
.context("Couldn't read OUT_DIR environment variable var")?
.into();
let filename = format!("bindings_{pg_version}.rs");
bindings
.write_to_file(out_path.join(filename))
.context("Couldn't write bindings")?;
}
Ok(())
}

View File

@@ -3,31 +3,41 @@
#![allow(non_snake_case)]
// bindgen creates some unsafe code with no doc comments.
#![allow(clippy::missing_safety_doc)]
// suppress warnings on rust 1.53 due to bindgen unit tests.
// https://github.com/rust-lang/rust-bindgen/issues/1651
#![allow(deref_nullptr)]
// noted at 1.63 that in many cases there's a u32 -> u32 transmutes in bindgen code.
#![allow(clippy::useless_transmute)]
// modules included with the postgres_ffi macro depend on the types of the specific version's
// types, and trigger a too eager lint.
#![allow(clippy::duplicate_mod)]
use bytes::Bytes;
use utils::bin_ser::SerializeError;
use utils::lsn::Lsn;
macro_rules! postgres_ffi {
($version:ident) => {
#[path = "."]
pub mod $version {
// fixme: does this have to be 'pub'?
pub mod bindings {
// bindgen generates bindings for a lot of stuff we don't need
#![allow(dead_code)]
use serde::{Deserialize, Serialize};
include!(concat!(env!("OUT_DIR"), "/bindings.rs"));
include!(concat!(
env!("OUT_DIR"),
"/bindings_",
stringify!($version),
".rs"
));
include!(concat!("pg_constants_", stringify!($version), ".rs"));
}
pub mod controlfile_utils;
pub mod nonrelfile_utils;
pub mod pg_constants;
pub mod relfile_utils;
pub mod waldecoder;
pub mod waldecoder_handler;
pub mod xlog_utils;
pub const PG_MAJORVERSION: &str = stringify!($version);
// Re-export some symbols from bindings
pub use bindings::DBState_DB_SHUTDOWNED;
pub use bindings::{CheckPoint, ControlFileData, XLogRecord};
@@ -36,20 +46,75 @@ macro_rules! postgres_ffi {
}
postgres_ffi!(v14);
postgres_ffi!(v15);
pub mod pg_constants;
pub mod relfile_utils;
// Export some widely used datatypes that are unlikely to change across Postgres versions
pub use v14::bindings::{uint32, uint64, Oid};
pub use v14::bindings::{BlockNumber, OffsetNumber};
pub use v14::bindings::{MultiXactId, TransactionId};
pub use v14::bindings::{TimeLineID, TimestampTz, XLogRecPtr, XLogSegNo};
// Likewise for these, although the assumption that these don't change is a little more iffy.
pub use v14::bindings::{MultiXactOffset, MultiXactStatus};
pub use v14::bindings::{PageHeaderData, XLogRecord};
pub use v14::xlog_utils::{XLOG_SIZE_OF_XLOG_RECORD, XLOG_SIZE_OF_XLOG_SHORT_PHD};
pub use v14::bindings::{CheckPoint, ControlFileData};
// from pg_config.h. These can be changed with configure options --with-blocksize=BLOCKSIZE and
// --with-segsize=SEGSIZE, but assume the defaults for now.
pub const BLCKSZ: u16 = 8192;
pub const RELSEG_SIZE: u32 = 1024 * 1024 * 1024 / (BLCKSZ as u32);
pub const XLOG_BLCKSZ: usize = 8192;
pub const WAL_SEGMENT_SIZE: usize = 16 * 1024 * 1024;
pub const MAX_SEND_SIZE: usize = XLOG_BLCKSZ * 16;
// Export some version independent functions that are used outside of this mod
pub use v14::xlog_utils::encode_logical_message;
pub use v14::xlog_utils::get_current_timestamp;
pub use v14::xlog_utils::to_pg_timestamp;
pub use v14::xlog_utils::XLogFileName;
pub use v14::bindings::DBState_DB_SHUTDOWNED;
pub fn bkpimage_is_compressed(bimg_info: u8, version: u32) -> anyhow::Result<bool> {
match version {
14 => Ok(bimg_info & v14::bindings::BKPIMAGE_IS_COMPRESSED != 0),
15 => Ok(bimg_info & v15::bindings::BKPIMAGE_COMPRESS_PGLZ != 0
|| bimg_info & v15::bindings::BKPIMAGE_COMPRESS_LZ4 != 0
|| bimg_info & v15::bindings::BKPIMAGE_COMPRESS_ZSTD != 0),
_ => anyhow::bail!("Unknown version {}", version),
}
}
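A minimal usage sketch for the version dispatch above (not part of the diff; the bimg_info values are made up for illustration, and the crate is assumed to be in scope as postgres_ffi):

fn check_compression_examples() -> anyhow::Result<()> {
    // v14: compression is a single flag (BKPIMAGE_IS_COMPRESSED = 0x02).
    assert!(postgres_ffi::bkpimage_is_compressed(0x02, 14)?);
    // v15: any of the PGLZ/LZ4/ZSTD flags (0x04 / 0x08 / 0x10) counts as compressed.
    assert!(postgres_ffi::bkpimage_is_compressed(0x08, 15)?);
    // Unknown versions are reported as an error instead of being guessed at.
    assert!(postgres_ffi::bkpimage_is_compressed(0x02, 13).is_err());
    Ok(())
}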
pub fn generate_wal_segment(
segno: u64,
system_id: u64,
pg_version: u32,
) -> Result<Bytes, SerializeError> {
match pg_version {
14 => v14::xlog_utils::generate_wal_segment(segno, system_id),
15 => v15::xlog_utils::generate_wal_segment(segno, system_id),
_ => Err(SerializeError::BadInput),
}
}
pub fn generate_pg_control(
pg_control_bytes: &[u8],
checkpoint_bytes: &[u8],
lsn: Lsn,
pg_version: u32,
) -> anyhow::Result<(Bytes, u64)> {
match pg_version {
14 => v14::xlog_utils::generate_pg_control(pg_control_bytes, checkpoint_bytes, lsn),
15 => v15::xlog_utils::generate_pg_control(pg_control_bytes, checkpoint_bytes, lsn),
_ => anyhow::bail!("Unknown version {}", pg_version),
}
}
// PG timeline is always 1, changing it doesn't have any useful meaning in Neon.
//
@@ -62,7 +127,7 @@ pub const PG_TLI: u32 = 1;
// See TransactionIdIsNormal in transam.h
pub const fn transaction_id_is_normal(id: TransactionId) -> bool {
id > v14::pg_constants::FIRST_NORMAL_TRANSACTION_ID
id > pg_constants::FIRST_NORMAL_TRANSACTION_ID
}
// See TransactionIdPrecedes in transam.c
@@ -97,3 +162,76 @@ pub fn page_set_lsn(pg: &mut [u8], lsn: Lsn) {
pg[0..4].copy_from_slice(&((lsn.0 >> 32) as u32).to_le_bytes());
pg[4..8].copy_from_slice(&(lsn.0 as u32).to_le_bytes());
}
pub mod waldecoder {
use crate::{v14, v15};
use bytes::{Buf, Bytes, BytesMut};
use std::num::NonZeroU32;
use thiserror::Error;
use utils::lsn::Lsn;
pub enum State {
WaitingForRecord,
ReassemblingRecord {
recordbuf: BytesMut,
contlen: NonZeroU32,
},
SkippingEverything {
skip_until_lsn: Lsn,
},
}
pub struct WalStreamDecoder {
pub lsn: Lsn,
pub pg_version: u32,
pub inputbuf: BytesMut,
pub state: State,
}
#[derive(Error, Debug, Clone)]
#[error("{msg} at {lsn}")]
pub struct WalDecodeError {
pub msg: String,
pub lsn: Lsn,
}
impl WalStreamDecoder {
pub fn new(lsn: Lsn, pg_version: u32) -> WalStreamDecoder {
WalStreamDecoder {
lsn,
pg_version,
inputbuf: BytesMut::new(),
state: State::WaitingForRecord,
}
}
// The latest LSN position fed to the decoder.
pub fn available(&self) -> Lsn {
self.lsn + self.inputbuf.remaining() as u64
}
pub fn feed_bytes(&mut self, buf: &[u8]) {
self.inputbuf.extend_from_slice(buf);
}
pub fn poll_decode(&mut self) -> Result<Option<(Lsn, Bytes)>, WalDecodeError> {
match self.pg_version {
// This is a trick to support both versions simultaneously.
// See WalStreamDecoderHandler comments.
14 => {
use self::v14::waldecoder_handler::WalStreamDecoderHandler;
self.poll_decode_internal()
}
15 => {
use self::v15::waldecoder_handler::WalStreamDecoderHandler;
self.poll_decode_internal()
}
_ => Err(WalDecodeError {
msg: format!("Unknown version {}", self.pg_version),
lsn: self.lsn,
}),
}
}
}
}
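The poll_decode dispatch above relies on each version module defining a trait with the same name and implementing it for the shared struct; importing one trait or the other inside a match arm selects the implementation. A self-contained sketch of the pattern, with hypothetical names (Decoder, Handler, decode_internal) rather than the real postgres_ffi types:

struct Decoder {
    pg_version: u32,
}

mod v14 {
    pub trait Handler {
        fn decode_internal(&self) -> &'static str;
    }
    impl Handler for super::Decoder {
        fn decode_internal(&self) -> &'static str {
            "decoded with the v14 page layout"
        }
    }
}

mod v15 {
    pub trait Handler {
        fn decode_internal(&self) -> &'static str;
    }
    impl Handler for super::Decoder {
        fn decode_internal(&self) -> &'static str {
            "decoded with the v15 page layout"
        }
    }
}

impl Decoder {
    // The version-independent entry point: each arm brings exactly one of the
    // same-named traits into scope, so the method call resolves unambiguously.
    fn poll_decode(&self) -> Result<&'static str, String> {
        match self.pg_version {
            14 => {
                use self::v14::Handler;
                Ok(self.decode_internal())
            }
            15 => {
                use self::v15::Handler;
                Ok(self.decode_internal())
            }
            other => Err(format!("Unknown version {other}")),
        }
    }
}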

View File

@@ -1,8 +1,8 @@
//!
//! Common utilities for dealing with PostgreSQL non-relation files.
//!
use crate::pg_constants;
use crate::transaction_id_precedes;
use super::pg_constants;
use bytes::BytesMut;
use log::*;

View File

@@ -1,14 +1,16 @@
//!
//! Misc constants, copied from PostgreSQL headers.
//!
//! Only place version-independent constants here.
//!
//! TODO: These probably should be auto-generated using bindgen,
//! rather than copied by hand. Although on the other hand, it's nice
//! to have them all here in one place, and have the ability to add
//! comments on them.
//!
use super::bindings::PageHeaderData;
use crate::BLCKSZ;
use crate::{PageHeaderData, XLogRecord};
//
// From pg_tablespace_d.h
@@ -16,14 +18,6 @@ use crate::BLCKSZ;
pub const DEFAULTTABLESPACE_OID: u32 = 1663;
pub const GLOBALTABLESPACE_OID: u32 = 1664;
//
// Fork numbers, from relpath.h
//
pub const MAIN_FORKNUM: u8 = 0;
pub const FSM_FORKNUM: u8 = 1;
pub const VISIBILITYMAP_FORKNUM: u8 = 2;
pub const INIT_FORKNUM: u8 = 3;
// From storage_xlog.h
pub const XLOG_SMGR_CREATE: u8 = 0x10;
pub const XLOG_SMGR_TRUNCATE: u8 = 0x20;
@@ -114,7 +108,6 @@ pub const XLOG_NEXTOID: u8 = 0x30;
pub const XLOG_SWITCH: u8 = 0x40;
pub const XLOG_FPI_FOR_HINT: u8 = 0xA0;
pub const XLOG_FPI: u8 = 0xB0;
pub const DB_SHUTDOWNED: u32 = 1;
// From multixact.h
pub const FIRST_MULTIXACT_ID: u32 = 1;
@@ -169,14 +162,10 @@ pub const RM_HEAP_ID: u8 = 10;
pub const XLR_INFO_MASK: u8 = 0x0F;
pub const XLR_RMGR_INFO_MASK: u8 = 0xF0;
// from dbcommands_xlog.h
pub const XLOG_DBASE_CREATE: u8 = 0x00;
pub const XLOG_DBASE_DROP: u8 = 0x10;
pub const XLOG_TBLSPC_CREATE: u8 = 0x00;
pub const XLOG_TBLSPC_DROP: u8 = 0x10;
pub const SIZEOF_XLOGRECORD: u32 = 24;
pub const SIZEOF_XLOGRECORD: u32 = std::mem::size_of::<XLogRecord>() as u32;
//
// from xlogrecord.h
@@ -197,8 +186,6 @@ pub const BKPBLOCK_SAME_REL: u8 = 0x80; /* RelFileNode omitted, same as previous
/* Information stored in bimg_info */
pub const BKPIMAGE_HAS_HOLE: u8 = 0x01; /* page image has "hole" */
pub const BKPIMAGE_IS_COMPRESSED: u8 = 0x02; /* page image is compressed */
pub const BKPIMAGE_APPLY: u8 = 0x04; /* page image should be restored during replay */
/* From transam.h */
pub const FIRST_NORMAL_TRANSACTION_ID: u32 = 3;
@@ -206,15 +193,10 @@ pub const INVALID_TRANSACTION_ID: u32 = 0;
pub const FIRST_BOOTSTRAP_OBJECT_ID: u32 = 12000;
pub const FIRST_NORMAL_OBJECT_ID: u32 = 16384;
/* FIXME: pageserver should request wal_seg_size from compute node */
pub const WAL_SEGMENT_SIZE: usize = 16 * 1024 * 1024;
pub const XLOG_CHECKPOINT_SHUTDOWN: u8 = 0x00;
pub const XLOG_CHECKPOINT_ONLINE: u8 = 0x10;
pub const XLP_LONG_HEADER: u16 = 0x0002;
pub const PG_MAJORVERSION: &str = "14";
// List of subdirectories inside pgdata.
// Copied from src/bin/initdb/initdb.c
pub const PGDATA_SUBDIRS: [&str; 22] = [

View File

@@ -0,0 +1,5 @@
pub const XLOG_DBASE_CREATE: u8 = 0x00;
pub const XLOG_DBASE_DROP: u8 = 0x10;
pub const BKPIMAGE_IS_COMPRESSED: u8 = 0x02; /* page image is compressed */
pub const BKPIMAGE_APPLY: u8 = 0x04; /* page image should be restored during replay */

View File

@@ -0,0 +1,10 @@
pub const XACT_XINFO_HAS_DROPPED_STATS: u32 = 1u32 << 8;
pub const XLOG_DBASE_CREATE_FILE_COPY: u8 = 0x00;
pub const XLOG_DBASE_CREATE_WAL_LOG: u8 = 0x00;
pub const XLOG_DBASE_DROP: u8 = 0x20;
pub const BKPIMAGE_APPLY: u8 = 0x02; /* page image should be restored during replay */
pub const BKPIMAGE_COMPRESS_PGLZ: u8 = 0x04; /* page image is compressed */
pub const BKPIMAGE_COMPRESS_LZ4: u8 = 0x08; /* page image is compressed */
pub const BKPIMAGE_COMPRESS_ZSTD: u8 = 0x10; /* page image is compressed */

View File

@@ -1,10 +1,17 @@
//!
//! Common utilities for dealing with PostgreSQL relation files.
//!
use super::pg_constants;
use once_cell::sync::OnceCell;
use regex::Regex;
//
// Fork numbers, from relpath.h
//
pub const MAIN_FORKNUM: u8 = 0;
pub const FSM_FORKNUM: u8 = 1;
pub const VISIBILITYMAP_FORKNUM: u8 = 2;
pub const INIT_FORKNUM: u8 = 3;
#[derive(Debug, Clone, thiserror::Error, PartialEq, Eq)]
pub enum FilePathError {
#[error("invalid relation fork name")]
@@ -23,10 +30,10 @@ impl From<core::num::ParseIntError> for FilePathError {
pub fn forkname_to_number(forkname: Option<&str>) -> Result<u8, FilePathError> {
match forkname {
// "main" is not in filenames, it's implicit if the fork name is not present
None => Ok(pg_constants::MAIN_FORKNUM),
Some("fsm") => Ok(pg_constants::FSM_FORKNUM),
Some("vm") => Ok(pg_constants::VISIBILITYMAP_FORKNUM),
Some("init") => Ok(pg_constants::INIT_FORKNUM),
None => Ok(MAIN_FORKNUM),
Some("fsm") => Ok(FSM_FORKNUM),
Some("vm") => Ok(VISIBILITYMAP_FORKNUM),
Some("init") => Ok(INIT_FORKNUM),
Some(_) => Err(FilePathError::InvalidForkName),
}
}
@@ -34,10 +41,10 @@ pub fn forkname_to_number(forkname: Option<&str>) -> Result<u8, FilePathError> {
/// Convert Postgres fork number to the right suffix of the relation data file.
pub fn forknumber_to_name(forknum: u8) -> Option<&'static str> {
match forknum {
pg_constants::MAIN_FORKNUM => None,
pg_constants::FSM_FORKNUM => Some("fsm"),
pg_constants::VISIBILITYMAP_FORKNUM => Some("vm"),
pg_constants::INIT_FORKNUM => Some("init"),
MAIN_FORKNUM => None,
FSM_FORKNUM => Some("fsm"),
VISIBILITYMAP_FORKNUM => Some("vm"),
INIT_FORKNUM => Some("init"),
_ => Some("UNKNOWN FORKNUM"),
}
}
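An illustrative roundtrip for the fork helpers above (sketch only, assuming the constants and functions from this module are in scope):

fn fork_name_roundtrip() -> Result<(), FilePathError> {
    // "main" has no suffix in relation file names.
    assert_eq!(forkname_to_number(None)?, MAIN_FORKNUM);
    assert_eq!(forkname_to_number(Some("vm"))?, VISIBILITYMAP_FORKNUM);
    assert_eq!(forknumber_to_name(FSM_FORKNUM), Some("fsm"));
    assert_eq!(forknumber_to_name(MAIN_FORKNUM), None);
    // Unrecognized fork names are rejected.
    assert!(forkname_to_number(Some("bogus")).is_err());
    Ok(())
}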

View File

@@ -8,63 +8,35 @@
//! to look deeper into the WAL records to also understand which blocks they modify, the code
//! for that is in pageserver/src/walrecord.rs
//!
use super::pg_constants;
use super::xlog_utils::*;
use super::super::waldecoder::{State, WalDecodeError, WalStreamDecoder};
use super::bindings::{XLogLongPageHeaderData, XLogPageHeaderData, XLogRecord, XLOG_PAGE_MAGIC};
use super::xlog_utils::*;
use crate::WAL_SEGMENT_SIZE;
use bytes::{Buf, BufMut, Bytes, BytesMut};
use crc32c::*;
use log::*;
use std::cmp::min;
use std::num::NonZeroU32;
use thiserror::Error;
use utils::lsn::Lsn;
enum State {
WaitingForRecord,
ReassemblingRecord {
recordbuf: BytesMut,
contlen: NonZeroU32,
},
SkippingEverything {
skip_until_lsn: Lsn,
},
}
pub struct WalStreamDecoder {
lsn: Lsn,
inputbuf: BytesMut,
state: State,
}
#[derive(Error, Debug, Clone)]
#[error("{msg} at {lsn}")]
pub struct WalDecodeError {
msg: String,
lsn: Lsn,
pub trait WalStreamDecoderHandler {
fn validate_page_header(&self, hdr: &XLogPageHeaderData) -> Result<(), WalDecodeError>;
fn poll_decode_internal(&mut self) -> Result<Option<(Lsn, Bytes)>, WalDecodeError>;
fn complete_record(&mut self, recordbuf: Bytes) -> Result<(Lsn, Bytes), WalDecodeError>;
}
//
// WalRecordStream is a Stream that returns a stream of WAL records
// FIXME: This isn't a proper rust stream
// This is a trick to support several postgres versions simultaneously.
//
impl WalStreamDecoder {
pub fn new(lsn: Lsn) -> WalStreamDecoder {
WalStreamDecoder {
lsn,
inputbuf: BytesMut::new(),
state: State::WaitingForRecord,
}
}
// The latest LSN position fed to the decoder.
pub fn available(&self) -> Lsn {
self.lsn + self.inputbuf.remaining() as u64
}
pub fn feed_bytes(&mut self, buf: &[u8]) {
self.inputbuf.extend_from_slice(buf);
}
// Page decoding code depends on postgres bindings, so it is compiled for each version.
// Thus WalStreamDecoder implements several WalStreamDecoderHandler traits.
// WalStreamDecoder poll_decode() method dispatches to the right handler based on the postgres version.
// Other methods are internal and are not dispatched.
//
// It is similar to having several impl blocks for the same struct,
// but the impls here are in different modules, so a trait is needed.
//
impl WalStreamDecoderHandler for WalStreamDecoder {
fn validate_page_header(&self, hdr: &XLogPageHeaderData) -> Result<(), WalDecodeError> {
let validate_impl = || {
if hdr.xlp_magic != XLOG_PAGE_MAGIC as u16 {
@@ -125,7 +97,7 @@ impl WalStreamDecoder {
/// Ok(None): there is not enough data in the input buffer. Feed more by calling the `feed_bytes` function
/// Err(WalDecodeError): an error occurred while decoding, meaning the input was invalid.
///
pub fn poll_decode(&mut self) -> Result<Option<(Lsn, Bytes)>, WalDecodeError> {
fn poll_decode_internal(&mut self) -> Result<Option<(Lsn, Bytes)>, WalDecodeError> {
// Run state machine that validates page headers, and reassembles records
// that cross page boundaries.
loop {
@@ -133,7 +105,7 @@ impl WalStreamDecoder {
// However, we may have to skip some page headers if we're processing the XLOG_SWITCH record or skipping padding for whatever reason.
match self.state {
State::WaitingForRecord | State::ReassemblingRecord { .. } => {
if self.lsn.segment_offset(pg_constants::WAL_SEGMENT_SIZE) == 0 {
if self.lsn.segment_offset(WAL_SEGMENT_SIZE) == 0 {
// parse long header
if self.inputbuf.remaining() < XLOG_SIZE_OF_XLOG_LONG_PHD {
@@ -170,6 +142,7 @@ impl WalStreamDecoder {
}
State::SkippingEverything { .. } => {}
}
// now read page contents
match &mut self.state {
State::WaitingForRecord => {
// need to have at least the xl_tot_len field
@@ -194,8 +167,8 @@ impl WalStreamDecoder {
return Ok(Some(self.complete_record(recordbuf)?));
} else {
// Need to assemble the record from pieces. Remember the size of the
// record, and loop back. On next iteration, we will reach the 'else'
// branch below, and copy the part of the record that was on this page
// record, and loop back. On next iterations, we will reach the branch
// below, and copy the part of the record that was on this or next page(s)
// to 'recordbuf'. Subsequent iterations will skip page headers, and
// append the continuations from the next pages to 'recordbuf'.
self.state = State::ReassemblingRecord {
@@ -264,7 +237,7 @@ impl WalStreamDecoder {
// to the next WAL segment.
let next_lsn = if xlogrec.is_xlog_switch_record() {
trace!("saw xlog switch record at {}", self.lsn);
self.lsn + self.lsn.calc_padding(pg_constants::WAL_SEGMENT_SIZE as u64)
self.lsn + self.lsn.calc_padding(WAL_SEGMENT_SIZE as u64)
} else {
// Pad to an 8-byte boundary
self.lsn.align()

View File

@@ -9,15 +9,16 @@
use crc32c::crc32c_append;
use super::super::waldecoder::WalStreamDecoder;
use super::bindings::{
CheckPoint, FullTransactionId, XLogLongPageHeaderData, XLogPageHeaderData, XLogRecord,
XLOG_PAGE_MAGIC,
CheckPoint, ControlFileData, DBState_DB_SHUTDOWNED, FullTransactionId, TimeLineID, TimestampTz,
XLogLongPageHeaderData, XLogPageHeaderData, XLogRecPtr, XLogRecord, XLogSegNo, XLOG_PAGE_MAGIC,
};
use super::pg_constants;
use super::pg_constants::WAL_SEGMENT_SIZE;
use crate::v14::waldecoder::WalStreamDecoder;
use super::PG_MAJORVERSION;
use crate::pg_constants;
use crate::PG_TLI;
use crate::{uint32, uint64, Oid};
use crate::{WAL_SEGMENT_SIZE, XLOG_BLCKSZ};
use bytes::BytesMut;
use bytes::{Buf, Bytes};
@@ -37,11 +38,9 @@ use utils::bin_ser::SerializeError;
use utils::lsn::Lsn;
pub const XLOG_FNAME_LEN: usize = 24;
pub const XLOG_BLCKSZ: usize = 8192;
pub const XLP_FIRST_IS_CONTRECORD: u16 = 0x0001;
pub const XLP_REM_LEN_OFFS: usize = 2 + 2 + 4 + 8;
pub const XLOG_RECORD_CRC_OFFS: usize = 4 + 4 + 8 + 1 + 1 + 2;
pub const MAX_SEND_SIZE: usize = XLOG_BLCKSZ * 16;
pub const XLOG_SIZE_OF_XLOG_SHORT_PHD: usize = std::mem::size_of::<XLogPageHeaderData>();
pub const XLOG_SIZE_OF_XLOG_LONG_PHD: usize = std::mem::size_of::<XLogLongPageHeaderData>();
@@ -49,11 +48,6 @@ pub const XLOG_SIZE_OF_XLOG_RECORD: usize = std::mem::size_of::<XLogRecord>();
#[allow(clippy::identity_op)]
pub const SIZE_OF_XLOG_RECORD_DATA_HEADER_SHORT: usize = 1 * 2;
pub type XLogRecPtr = u64;
pub type TimeLineID = u32;
pub type TimestampTz = i64;
pub type XLogSegNo = u64;
/// Interval of checkpointing metadata file. We should store metadata file to enforce
/// predicate that checkpoint.nextXid is larger than any XID in WAL.
/// But flushing checkpoint file for each transaction seems to be too expensive,
@@ -63,12 +57,10 @@ pub type XLogSegNo = u64;
/// in order to let CLOG_TRUNCATE mechanism correctly extend CLOG.
const XID_CHECKPOINT_INTERVAL: u32 = 1024;
#[allow(non_snake_case)]
pub fn XLogSegmentsPerXLogId(wal_segsz_bytes: usize) -> XLogSegNo {
(0x100000000u64 / wal_segsz_bytes as u64) as XLogSegNo
}
#[allow(non_snake_case)]
pub fn XLogSegNoOffsetToRecPtr(
segno: XLogSegNo,
offset: u32,
@@ -77,7 +69,6 @@ pub fn XLogSegNoOffsetToRecPtr(
segno * (wal_segsz_bytes as u64) + (offset as u64)
}
#[allow(non_snake_case)]
pub fn XLogFileName(tli: TimeLineID, logSegNo: XLogSegNo, wal_segsz_bytes: usize) -> String {
format!(
"{:>08X}{:>08X}{:>08X}",
@@ -87,7 +78,6 @@ pub fn XLogFileName(tli: TimeLineID, logSegNo: XLogSegNo, wal_segsz_bytes: usize
)
}
#[allow(non_snake_case)]
pub fn XLogFromFileName(fname: &str, wal_seg_size: usize) -> (XLogSegNo, TimeLineID) {
let tli = u32::from_str_radix(&fname[0..8], 16).unwrap();
let log = u32::from_str_radix(&fname[8..16], 16).unwrap() as XLogSegNo;
@@ -95,12 +85,10 @@ pub fn XLogFromFileName(fname: &str, wal_seg_size: usize) -> (XLogSegNo, TimeLin
(log * XLogSegmentsPerXLogId(wal_seg_size) + seg, tli)
}
#[allow(non_snake_case)]
pub fn IsXLogFileName(fname: &str) -> bool {
return fname.len() == XLOG_FNAME_LEN && fname.chars().all(|c| c.is_ascii_hexdigit());
}
#[allow(non_snake_case)]
pub fn IsPartialXLogFileName(fname: &str) -> bool {
fname.ends_with(".partial") && IsXLogFileName(&fname[0..fname.len() - 8])
}
@@ -120,6 +108,30 @@ pub fn normalize_lsn(lsn: Lsn, seg_sz: usize) -> Lsn {
}
}
pub fn generate_pg_control(
pg_control_bytes: &[u8],
checkpoint_bytes: &[u8],
lsn: Lsn,
) -> anyhow::Result<(Bytes, u64)> {
let mut pg_control = ControlFileData::decode(pg_control_bytes)?;
let mut checkpoint = CheckPoint::decode(checkpoint_bytes)?;
// Generate new pg_control needed for bootstrap
checkpoint.redo = normalize_lsn(lsn, WAL_SEGMENT_SIZE).0;
//reset some fields we don't want to preserve
//TODO Check this.
//We may need to determine the value from twophase data.
checkpoint.oldestActiveXid = 0;
//save new values in pg_control
pg_control.checkPoint = 0;
pg_control.checkPointCopy = checkpoint;
pg_control.state = DBState_DB_SHUTDOWNED;
Ok((pg_control.encode(), pg_control.system_identifier))
}
pub fn get_current_timestamp() -> TimestampTz {
to_pg_timestamp(SystemTime::now())
}
@@ -151,7 +163,10 @@ pub fn find_end_of_wal(
let mut result = start_lsn;
let mut curr_lsn = start_lsn;
let mut buf = [0u8; XLOG_BLCKSZ];
let mut decoder = WalStreamDecoder::new(start_lsn);
let pg_version = PG_MAJORVERSION[1..3].parse::<u32>().unwrap();
debug!("find_end_of_wal PG_VERSION: {}", pg_version);
let mut decoder = WalStreamDecoder::new(start_lsn, pg_version);
// loop over segments
loop {
@@ -161,7 +176,7 @@ pub fn find_end_of_wal(
match open_wal_segment(&seg_file_path)? {
None => {
// no more segments
info!(
debug!(
"find_end_of_wal reached end at {:?}, segment {:?} doesn't exist",
result, seg_file_path
);
@@ -184,7 +199,7 @@ pub fn find_end_of_wal(
match decoder.poll_decode() {
Ok(Some(record)) => result = record.0,
Err(e) => {
info!(
debug!(
"find_end_of_wal reached end at {:?}, decode error: {:?}",
result, e
);
@@ -318,9 +333,9 @@ impl CheckPoint {
// We need this segment to start the compute node.
//
pub fn generate_wal_segment(segno: u64, system_id: u64) -> Result<Bytes, SerializeError> {
let mut seg_buf = BytesMut::with_capacity(pg_constants::WAL_SEGMENT_SIZE as usize);
let mut seg_buf = BytesMut::with_capacity(WAL_SEGMENT_SIZE as usize);
let pageaddr = XLogSegNoOffsetToRecPtr(segno, 0, pg_constants::WAL_SEGMENT_SIZE);
let pageaddr = XLogSegNoOffsetToRecPtr(segno, 0, WAL_SEGMENT_SIZE);
let hdr = XLogLongPageHeaderData {
std: {
XLogPageHeaderData {
@@ -333,7 +348,7 @@ pub fn generate_wal_segment(segno: u64, system_id: u64) -> Result<Bytes, Seriali
}
},
xlp_sysid: system_id,
xlp_seg_size: pg_constants::WAL_SEGMENT_SIZE as u32,
xlp_seg_size: WAL_SEGMENT_SIZE as u32,
xlp_xlog_blcksz: XLOG_BLCKSZ as u32,
};
@@ -341,7 +356,7 @@ pub fn generate_wal_segment(segno: u64, system_id: u64) -> Result<Bytes, Seriali
seg_buf.extend_from_slice(&hdr_bytes);
//zero out the rest of the file
seg_buf.resize(pg_constants::WAL_SEGMENT_SIZE, 0);
seg_buf.resize(WAL_SEGMENT_SIZE, 0);
Ok(seg_buf.freeze())
}
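A test-style sketch of what the bootstrap segment looks like (illustrative only; it goes through the version-dispatching wrapper in lib.rs, assumes the crate is in scope as postgres_ffi, and uses made-up segno/system_id values):

fn empty_segment_sketch() {
    let segno = 1u64;       // hypothetical segment number
    let system_id = 42u64;  // hypothetical system identifier
    let seg = postgres_ffi::generate_wal_segment(segno, system_id, 14)
        .expect("serializing the long page header should not fail");
    // The result is exactly one WAL segment: the long page header followed by
    // zero padding up to WAL_SEGMENT_SIZE.
    assert_eq!(seg.len(), postgres_ffi::WAL_SEGMENT_SIZE);
}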
@@ -426,6 +441,7 @@ pub fn encode_logical_message(prefix: &str, message: &str) -> Vec<u8> {
#[cfg(test)]
mod tests {
use super::super::PG_MAJORVERSION;
use super::*;
use regex::Regex;
use std::cmp::min;
@@ -434,23 +450,26 @@ mod tests {
use utils::const_assert;
fn init_logging() {
let _ = env_logger::Builder::from_env(
env_logger::Env::default()
.default_filter_or("wal_craft=info,postgres_ffi::xlog_utils=trace"),
)
let _ = env_logger::Builder::from_env(env_logger::Env::default().default_filter_or(
format!("wal_craft=info,postgres_ffi::{PG_MAJORVERSION}::xlog_utils=trace"),
))
.is_test(true)
.try_init();
}
fn test_end_of_wal<C: wal_craft::Crafter>(test_name: &str) {
use wal_craft::*;
let pg_version = PG_MAJORVERSION[1..3].parse::<u32>().unwrap();
// Craft some WAL
let top_path = PathBuf::from(env!("CARGO_MANIFEST_DIR"))
.join("..")
.join("..");
let cfg = Conf {
pg_distrib_dir: top_path.join("tmp_install"),
datadir: top_path.join(format!("test_output/{}", test_name)),
pg_version,
pg_distrib_dir: top_path.join("pg_install"),
datadir: top_path.join(format!("test_output/{}-{PG_MAJORVERSION}", test_name)),
};
if cfg.datadir.exists() {
fs::remove_dir_all(&cfg.datadir).unwrap();

View File

@@ -11,6 +11,7 @@ clap = "3.0"
env_logger = "0.9"
log = "0.4"
once_cell = "1.13.0"
postgres = { git = "https://github.com/zenithdb/rust-postgres.git", rev="d052ee8b86fff9897c77b0fe89ea9daba0e1fa38" }
postgres = { git = "https://github.com/neondatabase/rust-postgres.git", rev="d052ee8b86fff9897c77b0fe89ea9daba0e1fa38" }
postgres_ffi = { path = "../" }
tempfile = "3.2"
workspace_hack = { version = "0.1", path = "../../../workspace_hack" }

View File

@@ -37,9 +37,16 @@ fn main() -> Result<()> {
Arg::new("pg-distrib-dir")
.long("pg-distrib-dir")
.takes_value(true)
.help("Directory with Postgres distribution (bin and lib directories, e.g. tmp_install)")
.help("Directory with Postgres distributions (bin and lib directories, e.g. pg_install containing subpath `v14/bin/postgresql`)")
.default_value("/usr/local")
)
.arg(
Arg::new("pg-version")
.long("pg-version")
.help("Postgres version to use for the initial tenant")
.required(true)
.takes_value(true)
)
)
.subcommand(
App::new("in-existing")
@@ -82,8 +89,14 @@ fn main() -> Result<()> {
}
Ok(())
}
Some(("with-initdb", arg_matches)) => {
let cfg = Conf {
pg_version: arg_matches
.value_of("pg-version")
.unwrap()
.parse::<u32>()
.context("Failed to parse postgres version from the argument string")?,
pg_distrib_dir: arg_matches.value_of("pg-distrib-dir").unwrap().into(),
datadir: arg_matches.value_of("datadir").unwrap().into(),
};

View File

@@ -4,10 +4,8 @@ use log::*;
use once_cell::sync::Lazy;
use postgres::types::PgLsn;
use postgres::Client;
use postgres_ffi::v14::pg_constants::WAL_SEGMENT_SIZE;
use postgres_ffi::v14::xlog_utils::{
XLOG_BLCKSZ, XLOG_SIZE_OF_XLOG_RECORD, XLOG_SIZE_OF_XLOG_SHORT_PHD,
};
use postgres_ffi::{WAL_SEGMENT_SIZE, XLOG_BLCKSZ};
use postgres_ffi::{XLOG_SIZE_OF_XLOG_RECORD, XLOG_SIZE_OF_XLOG_SHORT_PHD};
use std::cmp::Ordering;
use std::fs;
use std::path::{Path, PathBuf};
@@ -17,6 +15,7 @@ use tempfile::{tempdir, TempDir};
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct Conf {
pub pg_version: u32,
pub pg_distrib_dir: PathBuf,
pub datadir: PathBuf,
}
@@ -38,12 +37,22 @@ pub static REQUIRED_POSTGRES_CONFIG: Lazy<Vec<&'static str>> = Lazy::new(|| {
});
impl Conf {
pub fn pg_distrib_dir(&self) -> PathBuf {
let path = self.pg_distrib_dir.clone();
match self.pg_version {
14 => path.join(format!("v{}", self.pg_version)),
15 => path.join(format!("v{}", self.pg_version)),
_ => panic!("Unsupported postgres version: {}", self.pg_version),
}
}
fn pg_bin_dir(&self) -> PathBuf {
self.pg_distrib_dir.join("bin")
self.pg_distrib_dir().join("bin")
}
fn pg_lib_dir(&self) -> PathBuf {
self.pg_distrib_dir.join("lib")
self.pg_distrib_dir().join("lib")
}
pub fn wal_dir(&self) -> PathBuf {

View File

@@ -7,6 +7,7 @@ edition = "2021"
anyhow = { version = "1.0", features = ["backtrace"] }
async-trait = "0.1"
metrics = { version = "0.1", path = "../metrics" }
utils = { version = "0.1", path = "../utils" }
once_cell = "1.13.0"
rusoto_core = "0.48"
rusoto_s3 = "0.48"

View File

@@ -9,13 +9,13 @@ mod local_fs;
mod s3_bucket;
use std::{
borrow::Cow,
collections::HashMap,
ffi::OsStr,
fmt::Debug,
fmt::{Debug, Display},
num::{NonZeroU32, NonZeroUsize},
ops::Deref,
path::{Path, PathBuf},
pin::Pin,
sync::Arc,
};
use anyhow::{bail, Context};
@@ -24,10 +24,7 @@ use tokio::io;
use toml_edit::Item;
use tracing::info;
pub use self::{
local_fs::LocalFs,
s3_bucket::{S3Bucket, S3ObjectKey},
};
pub use self::{local_fs::LocalFs, s3_bucket::S3Bucket};
/// How many different timelines can be processed simultaneously when synchronizing layers with the remote storage.
/// During regular work, pageserver produces one layer file per timeline checkpoint, with bursts of concurrency
@@ -42,28 +39,62 @@ pub const DEFAULT_REMOTE_STORAGE_MAX_SYNC_ERRORS: u32 = 10;
/// https://aws.amazon.com/premiumsupport/knowledge-center/s3-request-limit-avoid-throttling/
pub const DEFAULT_REMOTE_STORAGE_S3_CONCURRENCY_LIMIT: usize = 100;
pub trait RemoteObjectName {
const REMOTE_STORAGE_PREFIX_SEPARATOR: char = '/';
#[derive(Clone, PartialEq, Eq)]
pub struct RemoteObjectId(String);
///
/// A key that refers to an object in remote storage. It works much like a Path,
/// but it's a separate datatype so that you don't accidentally mix local paths
/// and remote keys.
///
impl RemoteObjectId {
// Needed to retrieve last component for RemoteObjectId.
// In other words a file name
fn object_name(&self) -> Option<&str>;
/// Turn a/b/c or a/b/c/ into c
pub fn object_name(&self) -> Option<&str> {
// corner case, char::to_string is not const, that's why this is more verbose than it needs to be
// see https://github.com/rust-lang/rust/issues/88674
if self.0.len() == 1 && self.0.chars().next().unwrap() == REMOTE_STORAGE_PREFIX_SEPARATOR {
return None;
}
if self.0.ends_with(REMOTE_STORAGE_PREFIX_SEPARATOR) {
self.0.rsplit(REMOTE_STORAGE_PREFIX_SEPARATOR).nth(1)
} else {
self.0
.rsplit_once(REMOTE_STORAGE_PREFIX_SEPARATOR)
.map(|(_, last)| last)
}
}
}
impl Debug for RemoteObjectId {
fn fmt(&self, fmt: &mut std::fmt::Formatter<'_>) -> Result<(), std::fmt::Error> {
Debug::fmt(&self.0, fmt)
}
}
impl Display for RemoteObjectId {
fn fmt(&self, fmt: &mut std::fmt::Formatter<'_>) -> std::fmt::Result {
Display::fmt(&self.0, fmt)
}
}
/// Storage (potentially remote) API to manage its state.
/// This storage tries to be unaware of any layered repository context,
/// providing basic CRUD operations for storage files.
#[async_trait::async_trait]
pub trait RemoteStorage: Send + Sync {
/// A way to uniquely reference a file in the remote storage.
type RemoteObjectId: RemoteObjectName;
pub trait RemoteStorage: Send + Sync + 'static {
/// Attempts to derive the storage path out of the local path, if the latter is correct.
fn remote_object_id(&self, local_path: &Path) -> anyhow::Result<Self::RemoteObjectId>;
fn remote_object_id(&self, local_path: &Path) -> anyhow::Result<RemoteObjectId>;
/// Gets the download path of the given storage file.
fn local_path(&self, remote_object_id: &Self::RemoteObjectId) -> anyhow::Result<PathBuf>;
fn local_path(&self, remote_object_id: &RemoteObjectId) -> anyhow::Result<PathBuf>;
/// Lists all items the storage has right now.
async fn list(&self) -> anyhow::Result<Vec<Self::RemoteObjectId>>;
async fn list(&self) -> anyhow::Result<Vec<RemoteObjectId>>;
/// Lists all top level subdirectories for a given prefix
/// Note: here we assume that if the prefix is passed it was obtained via remote_object_id
@@ -71,34 +102,39 @@ pub trait RemoteStorage: Send + Sync {
/// so this method doesn't need to.
async fn list_prefixes(
&self,
prefix: Option<Self::RemoteObjectId>,
) -> anyhow::Result<Vec<Self::RemoteObjectId>>;
prefix: Option<&RemoteObjectId>,
) -> anyhow::Result<Vec<RemoteObjectId>>;
/// Streams the local file contents into the remote storage entry.
async fn upload(
&self,
from: impl io::AsyncRead + Unpin + Send + Sync + 'static,
from: Box<(dyn io::AsyncRead + Unpin + Send + Sync + 'static)>,
// S3 PUT requests require the content length to be specified;
// otherwise they start to fail as the number of concurrent connections grows.
from_size_bytes: usize,
to: &Self::RemoteObjectId,
to: &RemoteObjectId,
metadata: Option<StorageMetadata>,
) -> anyhow::Result<()>;
/// Streams the remote storage entry contents into the buffered writer given, returns the filled writer.
/// Returns the metadata, if any was stored with the file previously.
async fn download(&self, from: &Self::RemoteObjectId) -> Result<Download, DownloadError>;
async fn download(&self, from: &RemoteObjectId) -> Result<Download, DownloadError>;
/// Streams a given byte range of the remote storage entry contents into the buffered writer given, returns the filled writer.
/// Returns the metadata, if any was stored with the file previously.
async fn download_byte_range(
&self,
from: &Self::RemoteObjectId,
from: &RemoteObjectId,
start_inclusive: u64,
end_exclusive: Option<u64>,
) -> Result<Download, DownloadError>;
async fn delete(&self, path: &Self::RemoteObjectId) -> anyhow::Result<()>;
async fn delete(&self, path: &RemoteObjectId) -> anyhow::Result<()>;
/// Downcast to LocalFs implementation. For tests.
fn as_local(&self) -> Option<&LocalFs> {
None
}
}
pub struct Download {
@@ -141,26 +177,91 @@ impl std::error::Error for DownloadError {}
/// Every storage, currently supported.
/// Serves as a simple way to pass around the [`RemoteStorage`] without dealing with generics.
pub enum GenericRemoteStorage {
Local(LocalFs),
S3(S3Bucket),
#[derive(Clone)]
pub struct GenericRemoteStorage(Arc<dyn RemoteStorage>);
impl Deref for GenericRemoteStorage {
type Target = dyn RemoteStorage;
fn deref(&self) -> &Self::Target {
self.0.as_ref()
}
}
impl GenericRemoteStorage {
pub fn new(
pub fn new(storage: impl RemoteStorage) -> Self {
Self(Arc::new(storage))
}
pub fn from_config(
working_directory: PathBuf,
storage_config: &RemoteStorageConfig,
) -> anyhow::Result<Self> {
match &storage_config.storage {
) -> anyhow::Result<GenericRemoteStorage> {
Ok(match &storage_config.storage {
RemoteStorageKind::LocalFs(root) => {
info!("Using fs root '{}' as a remote storage", root.display());
LocalFs::new(root.clone(), working_directory).map(GenericRemoteStorage::Local)
GenericRemoteStorage::new(LocalFs::new(root.clone(), working_directory)?)
}
RemoteStorageKind::AwsS3(s3_config) => {
info!("Using s3 bucket '{}' in region '{}' as a remote storage, prefix in bucket: '{:?}', bucket endpoint: '{:?}'",
s3_config.bucket_name, s3_config.bucket_region, s3_config.prefix_in_bucket, s3_config.endpoint);
S3Bucket::new(s3_config, working_directory).map(GenericRemoteStorage::S3)
s3_config.bucket_name, s3_config.bucket_region, s3_config.prefix_in_bucket, s3_config.endpoint);
GenericRemoteStorage::new(S3Bucket::new(s3_config, working_directory)?)
}
})
}
/// Takes storage object contents and its size and uploads to remote storage,
/// mapping `from_path` to the corresponding remote object id in the storage.
///
/// The storage object does not have to be present at `from_path`;
/// this path is used for the remote object id conversion only.
pub async fn upload_storage_object(
&self,
from: Box<dyn tokio::io::AsyncRead + Unpin + Send + Sync + 'static>,
from_size_bytes: usize,
from_path: &Path,
) -> anyhow::Result<()> {
let target_storage_path = self.remote_object_id(from_path).with_context(|| {
format!(
"Failed to get the storage path for source local path '{}'",
from_path.display()
)
})?;
self.upload(from, from_size_bytes, &target_storage_path, None)
.await
.with_context(|| {
format!(
"Failed to upload from '{}' to storage path '{:?}'",
from_path.display(),
target_storage_path
)
})
}
/// Downloads the storage object into the `to_path` provided.
/// `byte_range` can be specified to download only a part of the file, if needed.
pub async fn download_storage_object(
&self,
byte_range: Option<(u64, Option<u64>)>,
to_path: &Path,
) -> Result<Download, DownloadError> {
let remote_object_path = self
.remote_object_id(to_path)
.with_context(|| {
format!(
"Failed to get the storage path for target local path '{}'",
to_path.display()
)
})
.map_err(DownloadError::BadInput)?;
match byte_range {
Some((start, end)) => {
self.download_byte_range(&remote_object_path, start, end)
.await
}
None => self.download(&remote_object_path).await,
}
}
}
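A hedged usage sketch for the new type-erased wrapper (paths are made up; it assumes this crate's LocalFs and GenericRemoteStorage are in scope and that a tokio runtime drives the future):

use std::path::PathBuf;

async fn upload_sketch() -> anyhow::Result<()> {
    let workdir = PathBuf::from("/tmp/pageserver_workdir");   // hypothetical
    let root = PathBuf::from("/tmp/remote_storage_root");     // hypothetical
    let storage = GenericRemoteStorage::new(LocalFs::new(root, workdir.clone())?);

    // Upload five bytes; the local path is only used to derive the remote object id,
    // the file does not have to exist at that path.
    let body: &[u8] = b"hello";
    storage
        .upload_storage_object(Box::new(body), body.len(), &workdir.join("some_file"))
        .await?;

    // The wrapper is a cheap, cloneable handle over Arc<dyn RemoteStorage>,
    // so callers no longer need to be generic over the storage backend.
    let _other_handle = storage.clone();
    Ok(())
}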
@@ -241,20 +342,6 @@ impl Debug for S3Config {
}
}
pub fn path_with_suffix_extension(original_path: impl AsRef<Path>, suffix: &str) -> PathBuf {
let new_extension = match original_path
.as_ref()
.extension()
.map(OsStr::to_string_lossy)
{
Some(extension) => Cow::Owned(format!("{extension}.{suffix}")),
None => Cow::Borrowed(suffix),
};
original_path
.as_ref()
.with_extension(new_extension.as_ref())
}
impl RemoteStorageConfig {
pub fn from_toml(toml: &toml_edit::Item) -> anyhow::Result<RemoteStorageConfig> {
let local_path = toml.get("local_path");
@@ -344,26 +431,21 @@ mod tests {
use super::*;
#[test]
fn test_path_with_suffix_extension() {
let p = PathBuf::from("/foo/bar");
assert_eq!(
&path_with_suffix_extension(&p, "temp").to_string_lossy(),
"/foo/bar.temp"
);
let p = PathBuf::from("/foo/bar");
assert_eq!(
&path_with_suffix_extension(&p, "temp.temp").to_string_lossy(),
"/foo/bar.temp.temp"
);
let p = PathBuf::from("/foo/bar.baz");
assert_eq!(
&path_with_suffix_extension(&p, "temp.temp").to_string_lossy(),
"/foo/bar.baz.temp.temp"
);
let p = PathBuf::from("/foo/bar.baz");
assert_eq!(
&path_with_suffix_extension(&p, ".temp").to_string_lossy(),
"/foo/bar.baz..temp"
);
fn object_name() {
let k = RemoteObjectId("a/b/c".to_owned());
assert_eq!(k.object_name(), Some("c"));
let k = RemoteObjectId("a/b/c/".to_owned());
assert_eq!(k.object_name(), Some("c"));
let k = RemoteObjectId("a/".to_owned());
assert_eq!(k.object_name(), Some("a"));
// XXX is it impossible to have an empty key?
let k = RemoteObjectId("".to_owned());
assert_eq!(k.object_name(), None);
let k = RemoteObjectId("/".to_owned());
assert_eq!(k.object_name(), None);
}
}

View File

@@ -5,7 +5,6 @@
//! volume is mounted to the local FS.
use std::{
borrow::Cow,
future::Future,
path::{Path, PathBuf},
pin::Pin,
@@ -17,15 +16,21 @@ use tokio::{
io::{self, AsyncReadExt, AsyncSeekExt, AsyncWriteExt},
};
use tracing::*;
use utils::crashsafe_dir::path_with_suffix_extension;
use crate::{path_with_suffix_extension, Download, DownloadError, RemoteObjectName};
use crate::{Download, DownloadError, RemoteObjectId};
use super::{strip_path_prefix, RemoteStorage, StorageMetadata};
impl RemoteObjectName for PathBuf {
fn object_name(&self) -> Option<&str> {
self.file_stem().and_then(|n| n.to_str())
}
const LOCAL_FS_TEMP_FILE_SUFFIX: &str = "___temp";
/// Convert a Path in the remote storage into a RemoteObjectId
fn remote_object_id_from_path(path: &Path) -> anyhow::Result<RemoteObjectId> {
Ok(RemoteObjectId(
path.to_str()
.ok_or_else(|| anyhow::anyhow!("unexpected characters found in path"))?
.to_string(),
))
}
pub struct LocalFs {
@@ -50,11 +55,17 @@ impl LocalFs {
})
}
fn resolve_in_storage(&self, path: &Path) -> anyhow::Result<PathBuf> {
///
/// Get the absolute path in the local filesystem to the given remote object.
///
/// This is public so that it can be used in tests. Should not be used elsewhere.
///
pub fn resolve_in_storage(&self, remote_object_id: &RemoteObjectId) -> anyhow::Result<PathBuf> {
let path = PathBuf::from(&remote_object_id.0);
if path.is_relative() {
Ok(self.storage_root.join(path))
} else if path.starts_with(&self.storage_root) {
Ok(path.to_path_buf())
Ok(path)
} else {
bail!(
"Path '{}' does not belong to the current storage",
@@ -92,41 +103,42 @@ impl LocalFs {
#[async_trait::async_trait]
impl RemoteStorage for LocalFs {
type RemoteObjectId = PathBuf;
fn remote_object_id(&self, local_path: &Path) -> anyhow::Result<Self::RemoteObjectId> {
Ok(self.storage_root.join(
/// Convert a "local" path into a "remote path"
fn remote_object_id(&self, local_path: &Path) -> anyhow::Result<RemoteObjectId> {
let path = self.storage_root.join(
strip_path_prefix(&self.working_directory, local_path)
.context("local path does not belong to this storage")?,
))
);
remote_object_id_from_path(&path)
}
fn local_path(&self, storage_path: &Self::RemoteObjectId) -> anyhow::Result<PathBuf> {
let relative_path = strip_path_prefix(&self.storage_root, storage_path)
fn local_path(&self, remote_object_id: &RemoteObjectId) -> anyhow::Result<PathBuf> {
let storage_path = PathBuf::from(&remote_object_id.0);
let relative_path = strip_path_prefix(&self.storage_root, &storage_path)
.context("local path does not belong to this storage")?;
Ok(self.working_directory.join(relative_path))
}
async fn list(&self) -> anyhow::Result<Vec<Self::RemoteObjectId>> {
async fn list(&self) -> anyhow::Result<Vec<RemoteObjectId>> {
get_all_files(&self.storage_root, true).await
}
async fn list_prefixes(
&self,
prefix: Option<Self::RemoteObjectId>,
) -> anyhow::Result<Vec<Self::RemoteObjectId>> {
prefix: Option<&RemoteObjectId>,
) -> anyhow::Result<Vec<RemoteObjectId>> {
let path = match prefix {
Some(prefix) => Cow::Owned(prefix),
None => Cow::Borrowed(&self.storage_root),
Some(prefix) => Path::new(&prefix.0),
None => &self.storage_root,
};
get_all_files(path.as_ref(), false).await
get_all_files(path, false).await
}
async fn upload(
&self,
from: impl io::AsyncRead + Unpin + Send + Sync + 'static,
from: Box<(dyn io::AsyncRead + Unpin + Send + Sync + 'static)>,
from_size_bytes: usize,
to: &Self::RemoteObjectId,
to: &RemoteObjectId,
metadata: Option<StorageMetadata>,
) -> anyhow::Result<()> {
let target_file_path = self.resolve_in_storage(to)?;
@@ -134,7 +146,8 @@ impl RemoteStorage for LocalFs {
// We need this dance with a sort-of-durable rename (without fsyncs)
// to prevent partial uploads. This was actually hit when a pageserver shutdown
// cancelled the upload and a partial file was left on the fs
let temp_file_path = path_with_suffix_extension(&target_file_path, "temp");
let temp_file_path =
path_with_suffix_extension(&target_file_path, LOCAL_FS_TEMP_FILE_SUFFIX);
let mut destination = io::BufWriter::new(
fs::OpenOptions::new()
.write(true)
@@ -150,8 +163,7 @@ impl RemoteStorage for LocalFs {
);
let from_size_bytes = from_size_bytes as u64;
// Require reading 1 byte more than expected, to check later that the stream and its size match.
let mut buffer_to_read = from.take(from_size_bytes + 1);
let mut buffer_to_read = from.take(from_size_bytes);
let bytes_read = io::copy(&mut buffer_to_read, &mut destination)
.await
@@ -162,17 +174,15 @@ impl RemoteStorage for LocalFs {
)
})?;
if bytes_read < from_size_bytes {
bail!("Provided stream was shorter than expected: {bytes_read} vs {from_size_bytes} bytes");
}
// Check if there is any extra data after the given size.
let mut from = buffer_to_read.into_inner();
let extra_read = from.read(&mut [1]).await?;
ensure!(
bytes_read == from_size_bytes,
"Provided stream has actual size {} fthat is smaller than the given stream size {}",
bytes_read,
from_size_bytes
);
ensure!(
buffer_to_read.read(&mut [0]).await? == 0,
"Provided stream has bigger size than the given stream size {}",
from_size_bytes
extra_read == 0,
"Provided stream was larger than expected: expected {from_size_bytes} bytes",
);
destination.flush().await.with_context(|| {
@@ -210,7 +220,7 @@ impl RemoteStorage for LocalFs {
Ok(())
}
async fn download(&self, from: &Self::RemoteObjectId) -> Result<Download, DownloadError> {
async fn download(&self, from: &RemoteObjectId) -> Result<Download, DownloadError> {
let file_path = self
.resolve_in_storage(from)
.map_err(DownloadError::BadInput)?;
@@ -244,7 +254,7 @@ impl RemoteStorage for LocalFs {
async fn download_byte_range(
&self,
from: &Self::RemoteObjectId,
from: &RemoteObjectId,
start_inclusive: u64,
end_exclusive: Option<u64>,
) -> Result<Download, DownloadError> {
@@ -298,7 +308,7 @@ impl RemoteStorage for LocalFs {
}
}
async fn delete(&self, path: &Self::RemoteObjectId) -> anyhow::Result<()> {
async fn delete(&self, path: &RemoteObjectId) -> anyhow::Result<()> {
let file_path = self.resolve_in_storage(path)?;
if file_path.exists() && file_path.is_file() {
Ok(fs::remove_file(file_path).await?)
@@ -309,6 +319,10 @@ impl RemoteStorage for LocalFs {
)
}
}
fn as_local(&self) -> Option<&LocalFs> {
Some(self)
}
}
fn storage_metadata_path(original_path: &Path) -> PathBuf {
@@ -318,7 +332,7 @@ fn storage_metadata_path(original_path: &Path) -> PathBuf {
fn get_all_files<'a, P>(
directory_path: P,
recursive: bool,
) -> Pin<Box<dyn Future<Output = anyhow::Result<Vec<PathBuf>>> + Send + Sync + 'a>>
) -> Pin<Box<dyn Future<Output = anyhow::Result<Vec<RemoteObjectId>>> + Send + Sync + 'a>>
where
P: AsRef<Path> + Send + Sync + 'a,
{
@@ -335,12 +349,12 @@ where
debug!("{:?} us a symlink, skipping", entry_path)
} else if file_type.is_dir() {
if recursive {
paths.extend(get_all_files(entry_path, true).await?.into_iter())
paths.extend(get_all_files(&entry_path, true).await?.into_iter())
} else {
paths.push(dir_entry.path())
paths.push(remote_object_id_from_path(&dir_entry.path())?)
}
} else {
paths.push(dir_entry.path());
paths.push(remote_object_id_from_path(&dir_entry.path())?);
}
}
Ok(paths)
@@ -402,9 +416,15 @@ mod pure_tests {
.join("file_name");
let expected_path = storage_root.join(local_path.strip_prefix(&workdir)?);
let actual_path = PathBuf::from(
storage
.remote_object_id(&local_path)
.expect("Matching path should map to storage path normally")
.0,
);
assert_eq!(
expected_path,
storage.remote_object_id(&local_path).expect("Matching path should map to storage path normally"),
actual_path,
"File paths from workdir should be stored in local fs storage with the same path they have relative to the workdir"
);
@@ -465,7 +485,9 @@ mod pure_tests {
assert_eq!(
local_path,
storage
.local_path(&storage_root.join(local_path.strip_prefix(&workdir)?))
.local_path(&remote_object_id_from_path(
&storage_root.join(local_path.strip_prefix(&workdir)?)
)?)
.expect("For a valid input, valid local path should be parsed"),
"Should be able to parse metadata out of the correctly named remote delta file"
);
@@ -489,8 +511,7 @@ mod pure_tests {
#[test]
fn local_path_negatives() -> anyhow::Result<()> {
#[track_caller]
#[allow(clippy::ptr_arg)] // have to use &PathBuf due to `storage.local_path` parameter requirements
fn local_path_error(storage: &LocalFs, storage_path: &PathBuf) -> String {
fn local_path_error(storage: &LocalFs, storage_path: &RemoteObjectId) -> String {
match storage.local_path(storage_path) {
Ok(wrong_path) => panic!(
"Expected local path input {:?} to cause an error, but got file path: {:?}",
@@ -507,7 +528,8 @@ mod pure_tests {
};
let totally_wrong_path = "wrong_wrong_wrong";
let error_message = local_path_error(&storage, &PathBuf::from(totally_wrong_path));
let error_message =
local_path_error(&storage, &RemoteObjectId(totally_wrong_path.to_string()));
assert!(error_message.contains(totally_wrong_path));
Ok(())
@@ -550,7 +572,7 @@ mod fs_tests {
storage: &LocalFs,
#[allow(clippy::ptr_arg)]
// have to use &PathBuf due to `storage.local_path` parameter requirements
remote_storage_path: &PathBuf,
remote_storage_path: &RemoteObjectId,
expected_metadata: Option<&StorageMetadata>,
) -> anyhow::Result<String> {
let mut download = storage
@@ -581,12 +603,20 @@ mod fs_tests {
"whatever_contents",
)
.await?;
let target_path = PathBuf::from("/").join("somewhere").join("else");
match storage.upload(file, size, &target_path, None).await {
let target_path = "/somewhere/else";
match storage
.upload(
Box::new(file),
size,
&RemoteObjectId(target_path.to_string()),
None,
)
.await
{
Ok(()) => panic!("Should not allow storing files with wrong target path"),
Err(e) => {
let message = format!("{:?}", e);
assert!(message.contains(&target_path.display().to_string()));
assert!(message.contains(target_path));
assert!(message.contains("does not belong to the current storage"));
}
}
@@ -609,6 +639,34 @@ mod fs_tests {
Ok(())
}
#[tokio::test]
async fn upload_file_negatives() -> anyhow::Result<()> {
let storage = create_storage()?;
let id = storage.remote_object_id(&storage.working_directory.join("dummy"))?;
let content = std::io::Cursor::new(b"12345");
// Check that you get an error if the size parameter doesn't match the actual
// size of the stream.
storage
.upload(Box::new(content.clone()), 0, &id, None)
.await
.expect_err("upload with zero size succeeded");
storage
.upload(Box::new(content.clone()), 4, &id, None)
.await
.expect_err("upload with too short size succeeded");
storage
.upload(Box::new(content.clone()), 6, &id, None)
.await
.expect_err("upload with too large size succeeded");
// Correct size is 5, this should succeed.
storage.upload(Box::new(content), 5, &id, None).await?;
Ok(())
}
fn create_storage() -> anyhow::Result<LocalFs> {
LocalFs::new(tempdir()?.path().to_owned(), tempdir()?.path().to_owned())
}
@@ -628,8 +686,8 @@ mod fs_tests {
"We should upload and download the same contents"
);
let non_existing_path = PathBuf::from("somewhere").join("else");
match storage.download(&non_existing_path).await {
let non_existing_path = "somewhere/else";
match storage.download(&RemoteObjectId(non_existing_path.to_string())).await {
Err(DownloadError::NotFound) => {} // Should get NotFound for non existing keys
other => panic!("Should get a NotFound error when downloading non-existing storage files, but got: {other:?}"),
}
@@ -768,7 +826,7 @@ mod fs_tests {
Err(e) => {
let error_string = e.to_string();
assert!(error_string.contains("does not exist"));
assert!(error_string.contains(&upload_target.display().to_string()));
assert!(error_string.contains(&upload_target.0));
}
}
Ok(())
@@ -829,15 +887,19 @@ mod fs_tests {
storage: &LocalFs,
name: &str,
metadata: Option<StorageMetadata>,
) -> anyhow::Result<PathBuf> {
) -> anyhow::Result<RemoteObjectId> {
let timeline_path = workdir.join("timelines").join("some_timeline");
let relative_timeline_path = timeline_path.strip_prefix(&workdir)?;
let storage_path = storage.storage_root.join(relative_timeline_path).join(name);
let remote_object_id = RemoteObjectId(storage_path.to_str().unwrap().to_string());
let from_path = storage.working_directory.join(name);
let (file, size) = create_file_for_upload(&from_path, &dummy_contents(name)).await?;
storage.upload(file, size, &storage_path, metadata).await?;
Ok(storage_path)
storage
.upload(Box::new(file), size, &remote_object_id, metadata)
.await?;
remote_object_id_from_path(&storage_path)
}
async fn create_file_for_upload(
@@ -862,9 +924,9 @@ mod fs_tests {
format!("contents for {name}")
}
async fn list_files_sorted(storage: &LocalFs) -> anyhow::Result<Vec<PathBuf>> {
async fn list_files_sorted(storage: &LocalFs) -> anyhow::Result<Vec<RemoteObjectId>> {
let mut files = storage.list().await?;
files.sort();
files.sort_by(|a, b| a.0.cmp(&b.0));
Ok(files)
}
}

Some files were not shown because too many files have changed in this diff.