Commit Graph

2263 Commits

Author SHA1 Message Date
Heikki Linnakangas
22cc8760b9 Move walredo process code under pgxn in the main 'neon' repository.
- Refactor the way the WalProposerMain function is called when started
  with --sync-safekeepers. The postgres binary now explicitly loads
  the 'neon.so' library and calls the WalProposerMain in it. This is
  simpler than the global function callback "hook" we previously used.

- Move the WAL redo process code to a new library, neon_walredo.so,
  and use the same mechanism as for --sync-safekeepers to call the
  WalRedoMain function, when launched with --walredo argument.

- Also move the seccomp code to neon_walredo.so library. I kept the
  configure check in the postgres side for now, though.
2022-10-31 01:11:50 +01:00
Arseny Sher
596d622a82 Fix test_prepare_snapshot.
It should checkpoint pageserver after waiting for all data arrival, not before.
2022-10-28 22:12:31 +04:00
Sergey Melnikov
7481fb082c Fix bugs in #2713 (#2716) 2022-10-28 14:12:49 +00:00
Arseny Sher
1eb9bd052a Bump vendor/postgres-v15 to fix XLP_FIRST_IS_CONTRECORD issue.
ref https://github.com/neondatabase/cloud/issues/2688
2022-10-28 16:45:11 +03:00
Sergey Melnikov
59a3ca4ec6 Deploy proxy to new prod regions (#2713)
* Refactor proxy deploy

* Test new prod deploy

* Remove assume role

* Add new values

* Add all regions
2022-10-28 16:25:28 +03:00
Sergey Melnikov
e86a9105a4 Deploy storage to new prod regions (#2709) 2022-10-28 10:17:27 +00:00
Stas Kelvich
d3c8749da5 Build compute postgres with openssl support
The main reason for that change is that Postgres 15 requires OpenSSL
for `pgcrypto` to work. Also not a bad idea to have SSL-enabled
Postgres in general.
2022-10-28 10:39:22 +03:00
Alexander Bayandin
128dc8d405 Nightly Benchmarks: fix workflow (#2708) 2022-10-27 19:26:10 +03:00
Alexander Bayandin
0cbae6e8f3 test_backward_compatibility: friendlier error message (#2707) 2022-10-27 15:54:49 +00:00
Alexander Stanovoy
78e412b84b The fix of #2650. (#2686)
* Wrappers and drop implementations for image and delta layer writers.
* Two regression tests for the image and delta layer files.
2022-10-27 14:02:55 +00:00
Rory de Zoete
6dbf202e0d Update crane copy target (#2704)
Co-authored-by: Rory de Zoete <rdezoete@Rorys-Mac-Studio.fritz.box>
2022-10-27 16:00:40 +02:00
Arseny Sher
b42bf9265a Enable etcd compaction in neon_local. 2022-10-27 10:47:08 +03:00
Stas Kelvich
1f08ba5790 Avoid debian-testing packages in compute Dockerfiles
plv8 can only be built with a fairly new gold linker version. We used to install
it via binutils packages from testing, but it also updates libc and that causes
troubles in the resulting image as different extensions were built against
different libc versions. We could either use libc from debian-testing everywhere
or refrain from using testing packages and install the necessary programs manually.
This patch uses the latter approach: gold for plv8 and cmake for h3 are
installed manually.

In passing, declare h3_postgis as a safe extension (previous omission).
2022-10-27 09:44:16 +03:00
bojanserafimov
0c54eb65fb Move pagestream api to libs/pageserver_api (#2698) 2022-10-26 17:32:31 -04:00
mikecaat
259a5f356e Add a docker-compose example file (#1943) (#2666)
Co-authored-by: Masahiro Ikeda <masahiro.ikeda.us@hco.ntt.co.jp>
2022-10-26 13:59:25 +03:00
Sergey Melnikov
a3cb8c11e0 Do not release to new staging proxies on release (#2685) 2022-10-25 23:51:23 +00:00
bojanserafimov
9fb2287f87 Add draw_timeline binary (#2688) 2022-10-25 11:25:22 -04:00
Alexander Bayandin
834ffe1bac Add data format backward compatibility tests (#2626) 2022-10-25 16:41:50 +02:00
Stas Kelvich
df18b041c0 Use apt version pinning instead of repo priorities
Higher `bullseye` priority doesn't work for packages installed
via `bullseye-updates`, e.g.:

```
libc-bin:
  Installed: 2.31-13+deb11u5
  Candidate: 2.35-3
  Version table:
     2.35-3 500
        500 http://ftp.debian.org/debian testing/main amd64 Packages
 *** 2.31-13+deb11u5 500
        500 http://deb.debian.org/debian bullseye-updates/main amd64 Packages
        100 /var/lib/dpkg/status
     2.31-13+deb11u4 990
        990 http://deb.debian.org/debian bullseye/main amd64 Packages
```

Try version pinning instead
2022-10-25 14:29:11 +03:00
Anastasia Lubennikova
39897105b2 Check postgres version and ensure that public schema exists
before running GRANT query on it
2022-10-25 09:55:24 +03:00
Stas Kelvich
2f399f08b2 Hotfix to disable grant create on public schema
`GRANT CREATE ON SCHEMA public` fails if there is no schema `public`.
Disable it in release for now and make a better fix later (it is
needed for v15 support).
2022-10-25 09:55:24 +03:00
Arseny Sher
9f49605041 Fix division by zero panic in determine_offloader. 2022-10-22 18:25:12 +03:00
Konstantin Knizhnik
7b6431cbd7 Disable wal_log_hints by default (#2598)
* Disable wal_log_hints by default

* Remove obsolete comment about wal_log_hints
2022-10-22 14:59:18 +03:00
Lassi Pölönen
321aeac3d4 Json logging capability (#2624)
* Support configuring the log format as json or plain.

Separately test json and plain logger. They would be competing on the
same global subscriber otherwise.

* Implement log_format for pageserver config

* Implement configurable log format for safekeeper.
2022-10-21 17:30:20 +00:00
Andrés
71ef7b6663 Remove cached_property package (#2673)
Co-authored-by: andres <andres.rodriguez@outlook.es>
2022-10-21 20:02:31 +03:00
Kirill Bulatov
5928cb33c5 Introduce timeline state (#2651)
Similar to https://github.com/neondatabase/neon/pull/2395, introduces a state field in Timeline, that's possible to subscribe to.

Adjusts

* walreceiver not to have any connections if timeline is not Active
* remote storage sync not to schedule uploads if timeline is Broken
* not to create timelines if a tenant/timeline is broken
* automatically switches timelines' states based on tenant state

Does not adjust timeline's gc, checkpointing and layer flush behaviour much, since it's not safe to cancel these processes abruptly and there's task_mgr::shutdown_tasks that does a similar thing.
2022-10-21 15:51:48 +00:00
Sergey Melnikov
6ff2c61ae0 Refactor safekeeper s3 config and change it for new account (#2672) 2022-10-21 13:44:08 +00:00
Arseny Sher
7480a0338a Determine safekeeper for offloading WAL without etcd election API.
This API is rather pointless: a sane choice requires knowledge of peers' status
anyway, and leaders' lifetimes can intersect in any case, which is fine for us --
so manual elections are straightforward. Here, we deterministically choose among
the reasonably caught up safekeepers, shifting by timeline id to spread the
load.

A step towards custom broker https://github.com/neondatabase/neon/issues/2394
2022-10-21 15:33:27 +03:00
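
The selection rule described in commit 7480a0338a can be sketched roughly as below. This is a minimal Python illustration, not the safekeeper's Rust code: the `Peer` type, the `max_lag` threshold and the name `choose_offloader` are assumptions made for the example.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass(frozen=True)
class Peer:
    """Hypothetical view of a safekeeper as seen by its peers."""
    sk_id: int
    commit_lsn: int


def choose_offloader(peers: Sequence[Peer], timeline_id: int, max_lag: int) -> Optional[int]:
    """Pick one safekeeper id deterministically, or None if no peer is known.

    Assumes max_lag >= 0, so the most advanced peer always qualifies.
    """
    if not peers:
        # Without this guard the modulo below would divide by zero.
        return None
    best = max(p.commit_lsn for p in peers)
    # Keep only peers that are reasonably caught up with the most advanced one.
    caught_up = sorted(p.sk_id for p in peers if best - p.commit_lsn <= max_lag)
    # Shift by timeline id so that different timelines spread across safekeepers.
    return caught_up[timeline_id % len(caught_up)]
```

Every safekeeper that evaluates the same peer list arrives at the same answer, so no election round-trip is needed in this sketch.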
Sergey Melnikov
2709878b8b Deploy scram proxies into new account (#2643) 2022-10-21 14:21:22 +03:00
Kirill Bulatov
39e4bdb99e Actualize tenant and timeline API modifiers (#2661)
* Actualize tenant and timeline API modifiers
* Use anyhow::Result explicitly
2022-10-21 10:58:43 +00:00
Anastasia Lubennikova
52e75fead9 Use anyhow::Result explicitly 2022-10-21 12:47:06 +03:00
Anastasia Lubennikova
a347d2b6ac #2616 handle 'Unsupported pg_version' error properly 2022-10-21 12:47:06 +03:00
Heikki Linnakangas
fc4ea3553e test_gc_cutoff.py fixes (#2655)
* Fix bogus early exit from GC.

Commit 91411c415a added this failpoint, but the early exit was not
intentional.

* Cleanup test_gc_cutoff.py test.

- Remove the 'scale' parameter, this isn't a benchmark
- Tweak pgbench and pageserver options to create garbage faster than the
  GC can collect it. The test used to take just under 5 minutes,
  which was uncomfortably close to the default 5 minute test timeout, and
  annoying even without the hard limit. These changes bring it down to
  about 1-2 minutes.
- Improve comments, fix typos
- Rename the failpoint. The old name, 'gc-before-save-metadata' implied
  that the failpoint was before the metadata update, but it was in fact
  much later in the function.
- Move the call to persist the metadata outside the lock, to avoid
  holding it for too long.

To verify that this test still covers the original bug,
https://github.com/neondatabase/neon/issues/2539, I commented out
updating the metadata file like this:
```
diff --git a/pageserver/src/tenant/timeline.rs b/pageserver/src/tenant/timeline.rs
index 1e857a9a..f8a9f34a 100644
--- a/pageserver/src/tenant/timeline.rs
+++ b/pageserver/src/tenant/timeline.rs
@@ -1962,7 +1962,7 @@ impl Timeline {
         }
         // Persist the new GC cutoff value in the metadata file, before
         // we actually remove anything.
-        self.update_metadata_file(self.disk_consistent_lsn.load(), HashMap::new())?;
+        //self.update_metadata_file(self.disk_consistent_lsn.load(), HashMap::new())?;

         info!("GC starting");

```
It doesn't fail every time with that, but it did fail after about 5
runs.
2022-10-21 02:39:55 +03:00
Dmitry Rodionov
cca1ace651 make launch_wal_receiver infallible 2022-10-21 00:40:12 +03:00
Sergey Melnikov
30984c163c Fix race between pushing image to ECR and copying to dockerhub (#2662) 2022-10-20 23:01:01 +03:00
Konstantin Knizhnik
7404777efc Pin pages with speculative insert tuples to prevent their reconstruction because spec_token is not wal logged (#2657)
* Pin pages with speculative insert tuples to prevent their reconstruction because spec_token is not wal logged

refer #2587

* Bump postgres versions
2022-10-20 20:06:05 +03:00
Heikki Linnakangas
eb1bdcc6cf If an FSM or VM page cannot be reconstructed, fill it with zeros.
If we cannot reconstruct an FSM or VM page, while creating image
layers, fill it with zeros instead. That should always be safe, for
the FSM and VM, in the sense that you won't lose actual user data. It
will get cleaned up by VACUUM later.

We had a bug with FSM/VM truncation, where we truncated the FSM and VM
at WAL replay to a smaller size than PostgreSQL originally did. We
thought that was harmless, as the FSM and VM are not critical for
correctness and can be zeroed out or truncated without affecting user
data. However, it led to a situation where PostgreSQL created
incremental WAL records for pages that we had already truncated away
in the pageserver, and when we tried to replay those WAL records, that
failed. That led to a permanent error in image layer creation, and
prevented it from ever finishing. See
https://github.com/neondatabase/neon/issues/2601. With this patch,
those pages will be filled with zeros in the image layer, which allows
the image layer creation to finish.
2022-10-20 17:27:01 +03:00
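
The fallback described in commit eb1bdcc6cf might look roughly like this. It is a Python sketch under stated assumptions, not the pageserver's implementation: `ReconstructError`, `page_for_image_layer` and the `reconstruct` callback are hypothetical names, while the fork numbers and the 8192-byte block size are the PostgreSQL defaults.

```python
BLCKSZ = 8192  # default PostgreSQL block size

# PostgreSQL relation fork numbers
MAIN_FORKNUM = 0
FSM_FORKNUM = 1
VISIBILITYMAP_FORKNUM = 2


class ReconstructError(Exception):
    """Hypothetical error raised when WAL redo for a page fails."""


def page_for_image_layer(forknum, blkno, reconstruct):
    """Return page bytes, substituting zeros for unreconstructable FSM/VM pages."""
    try:
        return reconstruct(forknum, blkno)
    except ReconstructError:
        if forknum in (FSM_FORKNUM, VISIBILITYMAP_FORKNUM):
            # FSM and VM hold no user data, so an all-zero page is a safe stand-in.
            return bytes(BLCKSZ)
        # Any other fork may hold user data, so the error must propagate.
        raise
```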
Arthur Petukhovsky
f5ab9f761b Remove flaky checks in test_delete_force (#2567) 2022-10-20 17:14:32 +04:00
Kirill Bulatov
306a47c4fa Use uninit mark files during timeline init for atomic creation (#2489)
Part of https://github.com/neondatabase/neon/pull/2239

Regular, from scratch, timeline creation involves initdb to be run in a separate directory, data from this directory to be imported into pageserver and, finally, timeline-related background tasks to start.

This PR ensures we don't leave behind any directories that are not marked as temporary and that pageserver removes such directories on restart, allowing timeline creation to be retried with the same IDs, if needed.

It would be good to later rewrite the logic to use a temporary directory, similar to what tenant creation does.
Yet currently it's harder than this change, so not done.
2022-10-20 14:19:17 +03:00
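
The mark-file idea from commit 306a47c4fa can be illustrated with a small Python sketch. The `.uninit` suffix, the function names and the `populate` callback are assumptions made for the example; the real pageserver layout may differ, and fsync calls needed for durability are left out.

```python
import shutil
from pathlib import Path

UNINIT_SUFFIX = ".uninit"  # hypothetical mark-file suffix


def create_timeline_dir(parent: Path, timeline_id: str, populate) -> Path:
    """Create a timeline directory guarded by an uninit mark file."""
    target = parent / timeline_id
    mark = parent / (timeline_id + UNINIT_SUFFIX)
    mark.touch()      # 1. the mark exists before the directory does
    target.mkdir()    # 2. the directory is created while marked
    populate(target)  # 3. may fail, leaving the mark behind
    mark.unlink()     # 4. removing the mark commits the creation
    return target


def cleanup_uninit(parent: Path) -> list:
    """On startup, remove every directory whose mark file is still present."""
    removed = []
    for mark in sorted(parent.glob("*" + UNINIT_SUFFIX)):
        target = parent / mark.name[: -len(UNINIT_SUFFIX)]
        shutil.rmtree(target, ignore_errors=True)
        mark.unlink()
        removed.append(target.name)
    return removed
```

After `cleanup_uninit` runs, a failed creation leaves no trace, so the same timeline ID can be retried.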
Kirill Bulatov
84c5f681b0 Fix test feature detection (#2659)
Follow-up of #2636 and #2654 , fixing the test detection feature.

Pageserver currently outputs features as

```
/target/debug/pageserver --version
Neon page server git:7734929a8202c8cc41596a861ffbe0b51b5f3cb9 failpoints: true, features: ["testing", "profiling"]
```
2022-10-20 13:44:03 +03:00
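
A test harness could detect the `testing` feature from the version line shown in commit 84c5f681b0 along these lines. This is a hedged sketch: `parse_features` is a hypothetical helper, not the function used in the repository's test suite, and it assumes the feature names are plain ASCII so that the printed list is also valid JSON.

```python
import json
import re


def parse_features(version_output: str) -> list:
    """Extract the feature list from a `pageserver --version` line."""
    match = re.search(r"features:\s*(\[.*?\])", version_output)
    if match is None:
        # Older or differently formatted output: report no features.
        return []
    return json.loads(match.group(1))
```

A test can then be skipped when `"testing"` is not in the returned list.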
Kirill Bulatov
50297bef9f RFC about Tenant / Timeline guard objects (#2660)
Co-authored-by: Heikki Linnakangas <heikki@neon.tech>
2022-10-20 12:49:54 +03:00
Andrés
9211923bef Pageserver Python tests should not fail if the server is built with no testing feature (#2636)
Co-authored-by: andres <andres.rodriguez@outlook.es>
2022-10-20 10:46:57 +03:00
bojanserafimov
7734929a82 Remove stale todos (#2630) 2022-10-19 22:59:22 +00:00
Heikki Linnakangas
bc5ec43056 Fix flaky physical-size tests in test_timeline_size.py.
These two tests, test_timeline_physical_size_post_compaction and
test_timeline_physical_size_post_gc, assumed that after you have
waited for the WAL from a bulk insertion to arrive, and you run a
cycle of checkpoint and compaction, no new layer files are created.
Because if a new layer file is created while we are calculating the
incremental and non-incremental physical sizes, they might differ.

However, the tests used a very small checkpoint_distance, so even a
small amount of WAL generated in PostgreSQL could cause a new layer
file to be created. Autovacuum can kick in at any time, and do that.
That caused occasional failures in the test. I was able to reproduce it
reliably by adding a long delay between the incremental and
non-incremental size calculations:

```
--- a/pageserver/src/http/routes.rs
+++ b/pageserver/src/http/routes.rs
@@ -129,6 +129,9 @@ async fn build_timeline_info(
         }
     };
     let current_physical_size = Some(timeline.get_physical_size());
+    if include_non_incremental_physical_size {
+        std::thread::sleep(std::time::Duration::from_millis(60000));
+    }

     let info = TimelineInfo {
         tenant_id: timeline.tenant_id,
```

To fix, disable autovacuum for the table. Autovacuum could still kick
in for other tables, e.g. catalog tables, but that seems less likely
to generate enough WAL to cause a new layer file to be flushed.

If this continues to be a problem in the future, we could simply retry
the physical size call a few times, if there's a mismatch. A mismatch
could happen every once in a while, but it's very unlikely to happen
more than once or twice in a row.

Fixes https://github.com/neondatabase/neon/issues/2212
2022-10-19 23:50:21 +03:00
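
The retry fallback mentioned at the end of commit bc5ec43056 is not implemented by that commit; if it were needed, it could be sketched as below. The function name `sizes_agree` and its two callables are hypothetical stand-ins for the incremental and non-incremental physical size calls.

```python
import time


def sizes_agree(get_incremental, get_non_incremental, attempts=3, delay=0.0):
    """Return True as soon as both size calculations match, retrying on mismatch."""
    for attempt in range(attempts):
        if get_incremental() == get_non_incremental():
            return True
        if attempt + 1 < attempts:
            # Give a concurrently flushed layer file time to settle.
            time.sleep(delay)
    return False
```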
MMeent
b237feedab Add more redo metrics: (#2645)
- Measure size of redo WAL (new histogram), with bounds between 24B-32kB
- Add 2 more buckets at the upper end of the redo time histogram
  We often (>0.1% of several hours each day) take more than 250ms to do the
  redo round-trip to the postgres process. We need to measure these redo
  times more precisely.
2022-10-19 22:47:11 +02:00
Alexey Kondratov
4d1e48f3b9 [compute_ctl] Use postgres::config to properly escape database names (#2652)
We've got at least one user in production that cannot create a
database with a trailing space in the name.

This happens because we use `url` crate for manipulating the
DATABASE_URL, but it follows a standard that doesn't fit really
well with Postgres. For example, it trims all trailing spaces
from the path:

  > Remove any leading and trailing C0 control or space from input.
  > See: https://url.spec.whatwg.org/#url-parsing

But we used `set_path()` to set database name and it's totally valid
to have trailing spaces in the database name in Postgres.

Thus, use `postgres::config::Config` to modify database name in the
connection details.
2022-10-19 19:20:06 +02:00
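
The problem fixed in commit 4d1e48f3b9 can be reproduced in miniature. The sketch below is Python, not the `compute_ctl` code: `whatwg_trim` only mimics the trimming step quoted from the URL standard, and `conninfo_quote` applies libpq's keyword/value quoting rules (single quotes around the value, backslash-escaping of `'` and `\`) as one alternative way to carry a database name without losing trailing spaces. All three function names are assumptions.

```python
C0_CONTROL_OR_SPACE = "".join(chr(c) for c in range(0x21))  # U+0000..U+0020


def whatwg_trim(value: str) -> str:
    """Mimic the URL-parser step: strip leading/trailing C0 controls and spaces."""
    return value.strip(C0_CONTROL_OR_SPACE)


def conninfo_quote(value: str) -> str:
    """Quote a value for a libpq keyword/value connection string."""
    escaped = value.replace("\\", "\\\\").replace("'", "\\'")
    return "'" + escaped + "'"


def conninfo_with_dbname(dbname: str) -> str:
    """Build a minimal connection string that preserves the name exactly."""
    return "dbname=" + conninfo_quote(dbname)
```

With these helpers, `whatwg_trim("mydb ")` drops the trailing space, while `conninfo_with_dbname("mydb ")` keeps it inside the quotes.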
Anastasia Lubennikova
7576b18b14 [compute_tools] fix GRANT CREATE ON SCHEMA public -
run the grant query in each database
2022-10-19 18:37:52 +03:00
Konstantin Knizhnik
6b49b370fc Fix build after applying PR #2558 2022-10-19 13:55:30 +03:00
Konstantin Knizhnik
91411c415a Persists latest_gc_cutoff_lsn before performing GC (#2558)
* Persists latest_gc_cutoff_lsn before performing GC

* Perform some refactoring and code deduplication

refer #2539

* Add test for persisting GC cutoff

* Fix python test style warnings

* Bump postgres version

* Reduce number of iterations in test_gc_cutoff test

* Bump postgres version

* Undo bumping postgres version
2022-10-19 12:32:03 +03:00
Kirill Bulatov
c67cf34040 Update GH Action version (#2646) 2022-10-19 11:16:36 +03:00