Commit Graph

146 Commits

Author SHA1 Message Date
Tristan Partin
3525d2e381 Update TimescaleDB to 2.17.1 for PG 17
Signed-off-by: Tristan Partin <tristan@neon.tech>
2024-11-08 09:15:38 -06:00
Cheng Chen
e1d0b73824 chore(compute): Bump pg_mooncake to the latest version 2024-11-06 22:41:18 -06:00
Tristan Partin
93123f2623 Rename compute_backpressure_throttling_ms to compute_backpressure_throttling_seconds
This is in line with the Prometheus guidance[0]. We also haven't started
using this metric, so renaming is essentially free.

Link: https://prometheus.io/docs/practices/naming/ [0]
Signed-off-by: Tristan Partin <tristan@neon.tech>
2024-11-06 13:28:23 -06:00
Conrad Ludgate
9761b6a64e update pg_session_jwt to use pgrx 0.12 for pg17 (#9595)
Updates the extension to use pgrx 0.12. No changes to the extensions
have been made, the only difference is the pgrx version.
2024-10-31 15:50:41 +00:00
Anastasia Lubennikova
e96398a552 Add support of extensions for v17 (part 4) (#9568)
- pg_jsonschema 0.3.3
- pg_graphql 1.5.9
- rum 65e0a752
- pg_tiktoken a5bc447e

update support of extensions for v14-v16:
- pg_jsonschema 0.3.1 -> 0.3.3
- pg_graphql 1.5.7 -> 1.5.9
- rum 6ab37053 -> 65e0a752
- pg_tiktoken e64e55aa -> a5bc447e
2024-10-31 15:05:24 +02:00
Tristan Partin
8af9412eb2 Collect compute backpressure throttling time
This will tell us how much time the compute has spent throttled if
pageserver/safekeeper cannot keep up with WAL generation.

Signed-off-by: Tristan Partin <tristan@neon.tech>
2024-10-30 09:58:29 -05:00
Alexey Kondratov
745061ddf8 chore(compute): Bump pg_mooncake to the latest version (#9576)
## Problem

There were some critical breaking changes made in the upstream since Oct
29th morning.

## Summary of changes

Point it to the topmost commit in the `neon` branch at the time of
writing this
https://github.com/Mooncake-Labs/pg_mooncake/commits/neon/
c495cd17d6
2024-10-30 11:07:02 +01:00
Tristan Partin
b77b9bdc9f Add tests for sql-exporter metrics
Should help us keep non-working metrics from hitting staging or
production.

Co-authored-by: Heikki Linnakangas <heikki@neon.tech>
Fixes: https://github.com/neondatabase/neon/issues/8569
Signed-off-by: Tristan Partin <tristan@neon.tech>
2024-10-29 15:13:06 -05:00
Anastasia Lubennikova
80e1630042 Use pg_mooncake from our fork. (#9565)
Switch to main repo once
https://github.com/Mooncake-Labs/pg_mooncake/pull/3 is merged
2024-10-29 15:57:52 +00:00
George MacKerron
7d5f6b6a52 Build pgrag extensions x3 (#8486)
Build the pgrag extensions (rag, rag_bge_small_en_v15, and
rag_jina_reranker_v1_tiny_en) as part of the compute node Dockerfile.

---------

Co-authored-by: Alexander Bayandin <alexander@neon.tech>
2024-10-28 20:06:36 +00:00
Tristan Partin
3d64a7ddcd Add pg_mooncake to compute-node.Dockerfile
Signed-off-by: Tristan Partin <tristan@neon.tech>
2024-10-28 11:23:30 -05:00
Arthur Petukhovsky
01b6843e12 Route pgbouncer logs to virtio-serial (#9488)
virtio-serial is much more performant than /dev/console emulation,
therefore, is much more suitable for the verbose logs inside vm. This
commit changes routing for pgbouncer logs, since we've recently noticed
it can emit large volumes of logs.

Manually tested on staging by pinning a compute image to my test
project.

Should help with https://github.com/neondatabase/cloud/issues/19072
2024-10-28 12:09:47 +00:00
Tristan Partin
0595320c87 Protect call to pg_current_wal_lsn() in retained_wal query
We can't call pg_current_wal_lsn() if we are a standby instance (read
replica). Any attempt to call this function while in recovery results
in:

ERROR:  recovery is in progress

Signed-off-by: Tristan Partin <tristan@neon.tech>
2024-10-23 09:55:00 -06:00
Tristan Partin
fcb55a2aa2 Fix copy-paste error in checkpoints_timed metric
Importing the wrong metric. Sigh...

Signed-off-by: Tristan Partin <tristan@neon.tech>
2024-10-22 14:34:26 -06:00
Tristan Partin
b7fa93f6b7 Use make's builtin RM variable
At least as far as removing individual files goes, this is the best
pattern for removing. I can't say the same for removing directories, but
I went ahead and changed those to `$(RM) -r` anyway.

Signed-off-by: Tristan Partin <tristan@neon.tech>
2024-10-22 09:14:29 -06:00
David Gomes
94369af782 chore(compute): bumps pg_session_jwt to latest version (#9474) 2024-10-21 23:39:30 +00:00
Ivan Efremov
ababa50cce Use '-f' for make clean in Makefile compute (#9464)
Use '-f' instead of '--force' because it is impossible to clean the
targets on MacOS
2024-10-21 16:20:39 +03:00
Alexander Bayandin
5b37485c99 Rename dockerfiles from Dockerfile.<something> to <something>.Dockerfile (#9446)
## Problem

Our dockerfiles, for some historical reason, have unconventional names
`Dockerfile.<something>`, and some tools (like GitHub UI) fail to highlight
the syntax in them.

> Some projects may need distinct Dockerfiles for specific purposes. A
common convention is to name these `<something>.Dockerfile`

From: https://docs.docker.com/build/concepts/dockerfile/#filename

## Summary of changes
- Rename `Dockerfile.build-tools` -> `build-tools.Dockerfile`
- Rename `compute/Dockerfile.compute-node` ->
`compute/compute-node.Dockerfile`
2024-10-21 09:51:12 +01:00
Tristan Partin
62a334871f Take the collector name as argument when generating sql_exporter configs
In neon_collector_autoscaling.jsonnet, the collector name is hardcoded
to neon_collector_autoscaling. This issue manifests itself such that
sql_exporter would not find the collector configuration.

Signed-off-by: Tristan Partin <tristan@neon.tech>
2024-10-18 09:36:29 -05:00
Conrad Ludgate
5cbdec9c79 [local_proxy]: install pg_session_jwt extension on demand (#9370)
Follow up on #9344. We want to install the extension automatically. We
didn't want to couple the extension into compute_ctl so instead
local_proxy is the one to issue requests specific to the extension.

depends on #9344 and #9395
2024-10-18 14:41:21 +01:00
Anastasia Lubennikova
8b47938140 Add support of extensions for v17 (part 3) (#9430)
- pgvector 7.4

update support of extensions for v14-v16:
- pgvector 7.2 -> 7.4
2024-10-17 13:37:21 +01:00
Tristan Partin
e0fa6bcf1a Fix some sql_exporter metrics for PG 17
Checkpointer related statistics moved from pg_stat_bgwriter to
pg_stat_checkpointer, so we need to adjust our queries accordingly.

Signed-off-by: Tristan Partin <tristan@neon.tech>
2024-10-16 14:46:33 -05:00
Tristan Partin
409a286eaa Fix typo in sql_exporter generator
Bad copy-paste seemingly. This manifested itself as a failure to start
for the sql_exporter, and was just dying on loop in staging. A future PR
will have E2E testing of sql_exporter.

Signed-off-by: Tristan Partin <tristan@neon.tech>
2024-10-16 13:08:40 -05:00
Anastasia Lubennikova
9668601f46 Add support of extensions for v17 (part 2) (#9389)
- plv8 3.2.3
    - HypoPG 1.4.1
    - pgtap 1.3.3
    - timescaledb 2.17.0
    - pg_hint_plan 17_1_7_0
    - rdkit Release_2024_09_1
    - pg_uuidv7 1.6.0
    - wal2json 2.6
    - pg_ivm 1.9
    - pg_partman 5.1.0

    update support of extensions for v14-v16:
    - HypoPG 1.4.0 -> 1.4.1
    - pgtap 1.2.0 -> 1.3.3
    - plpgsql_check 2.5.3 -> 2.7.11
    - pg_uuidv7 1.0.1 -> 1.6.0
    - wal2json 2.5 -> 2.6
    - pg_ivm 1.7 -> 1.9
    - pg_partman 5.0.1 -> 5.1.0
2024-10-16 15:29:23 +01:00
Tristan Partin
061ea0de7a Add jsonnetfmt targets
This should make it a little bit easier for people wanting to check if
their files are formated correctly. Has the added bonus of making the CI
check simpler as well.

Signed-off-by: Tristan Partin <tristan@neon.tech>
2024-10-15 20:01:13 -05:00
Tristan Partin
be5d6a69dc Fix jsonnet_files wildcard
Just a typo in a path.

Signed-off-by: Tristan Partin <tristan@neon.tech>
2024-10-15 16:30:31 -05:00
Matthias van de Meent
18f4e5f10c Add newly added metrics from neondatabase/neon#9116 to exports (#9402)
They weren't added in that PR, but should be available immediately on
rollout as the neon extension already defaults to 1.5.
2024-10-15 23:13:31 +02:00
Tristan Partin
cf7a596a15 Generate sql_exporter config files with Jsonnet
There are quite a few benefits to this approach:

- Reduce config duplication
  - The two sql_exporter configs were super similar with just a few
    differences
- Pull SQL queries into standalone files
  - That means we could run a SQL formatter on the file in the future
  - It also means access to syntax highlighting
- In the future, run different queries for different PG versions
  - This is relevant because right now, we have queries that are failing
    on PG 17 due to catalog updates

Signed-off-by: Tristan Partin <tristan@neon.tech>
2024-10-15 11:18:38 -05:00
Conrad Ludgate
d92d36a315 [local_proxy] update api for pg_session_jwt (#9359)
pg_session_jwt now:
1. Sets the JWK in a PGU_BACKEND session guc, no longer in the init()
function.
2. JWK no longer needs the kid.
2024-10-15 12:13:57 +00:00
Alexander Bayandin
0fc4ada3ca Switch CI, Storage and Proxy to Debian 12 (Bookworm) (#9170)
## Problem

This PR switches CI and Storage to Debain 12 (Bookworm) based images.

## Summary of changes
- Add Debian codename (`bookworm`/`bullseye`) to most of docker tags,
create un-codenamed images to be used by default
- `vm-compute-node-image`: create a separate spec for `bookworm` (we
don't need to build cgroups in the future)
- `neon-image`: Switch to `bookworm`-based `build-tools` image
  - Storage components and Proxy use it
- CI: run lints and tests on `bookworm`-based `build-tools` image
2024-10-14 21:12:43 +01:00
Anastasia Lubennikova
721803a0e7 Add partial support of extensions for v17: (#9322)
- PostGIS 3.5.0
- pgrouting 3.6.2
- h3 4.1.3
- unit 7.9
- pgjwt version (f3d82fd)
- pg_hashids 1.2.1
- ip4r 2.4.2
- prefix 1.2.10
- postgresql-hll 2.18
- pg_roaringbitmap 0.5.4
- pg-semver 0.40.0

update support of extensions for v14-v16:
- unit 7.7 -> 7.9
- pgjwt 9742dab -> f3d82fd

---------

Co-authored-by: Heikki Linnakangas <heikki@neon.tech>
2024-10-09 17:07:59 +01:00
Conrad Ludgate
6c05f89f7d proxy: add local-proxy to compute image (#8823)
1. Adds local-proxy to compute image and vm spec
2. Updates local-proxy config processing, writing PID to a file eagerly
3. Updates compute-ctl to understand local proxy compute spec and to
send SIGHUP to local-proxy over that pid.

closes https://github.com/neondatabase/cloud/issues/16867
2024-10-04 14:52:01 +00:00
David Gomes
2fac0b7fac chore: remove unnecessary comments in compute/Dockerfile.compute-node (#9253)
See [this
comment](https://github.com/neondatabase/neon/pull/8888#discussion_r1783130082).
2024-10-03 18:26:41 +00:00
Heikki Linnakangas
6a9e2d657c Remove unnecessary dependencies from postgis-build image (#9211)
The apt install stage before this commit:

    0 upgraded, 391 newly installed, 0 to remove and 9 not upgraded.
    Need to get 261 MB of archives.

after:

    0 upgraded, 367 newly installed, 0 to remove and 9 not upgraded.
    Need to get 220 MB of archives.
2024-10-03 10:05:23 +03:00
Heikki Linnakangas
d20448986c Fix metric name of the 'getpage_wait_seconds_bucket' metric (#9242)
Per convention, histogram buckets have the '_bucket' suffix. I got that
wrong in commit 0d500bbd5b.

Fixes https://github.com/neondatabase/neon/issues/9241
2024-10-02 20:05:14 +03:00
Heikki Linnakangas
2e3b7862d0 Fix compute metrics collector config (#9235) 2024-10-02 09:44:00 +01:00
Heikki Linnakangas
0d500bbd5b Add new compute metrics to sql exporter (#9190)
These are the perf counters added in commit 263dfba6ee.

Note: This relies on 'neon' extension version 1.5. The default was
bumped to 1.5 in commit d696c41807.

---------

Co-authored-by: Matthias van de Meent <matthias@neon.tech>
2024-10-01 17:38:19 +03:00
Heikki Linnakangas
1b8b50755c Use debian packages for cmake again (#9212)
On bookworm, 'cmake' is new enough that we can just use it. On bullseye,
we can get a new-enough package from backports. By including 'cmake' in
the build-deps stage, we don't need to install it separately in all the
later build stages that need it.

See https://github.com/neondatabase/neon/pull/2699, where we switched to
downloading and building a specific version.
2024-10-01 15:09:09 +03:00
David Gomes
d6c6b0a509 feat(compute): adds pg_session_jwt extension to compute image (#8888)
## Problem

We need the
[pg_session_jwt](https://github.com/neondatabase/pg_session_jwt/)
extension in the compute image. This PR adds it.

## Summary of changes

I added the `pg_session_jwt` extension in a very similar way to how the
pggraphql and pgtiktoken extensions were added (since they're all
written with pgrx). Then I tested this.

```
$ cd docker-compose/
$ PG_VERSION=16 TAG=10667533475 docker-compose up --build -d
$ psql postgresql://cloud_admin:cloud_admin@localhost:55433/postgres

cloud_admin@postgres=# create extension pg_session_jwt;
CREATE EXTENSION
Time: 43.048 ms

cloud_admin@postgres=# \df auth.*;
                              List of functions
┌────────┬──────────────────┬──────────────────┬─────────────────────┬──────┐
│ Schema │       Name       │ Result data type │ Argument data types │ Type │
├────────┼──────────────────┼──────────────────┼─────────────────────┼──────┤
│ auth   │ get              │ jsonb            │ s text              │ func │
│ auth   │ init             │ void             │ kid bigint, s jsonb │ func │
│ auth   │ jwt_session_init │ void             │ s text              │ func │
│ auth   │ user_id          │ text             │                     │ func │
└────────┴──────────────────┴──────────────────┴─────────────────────┴──────┘
(4 rows)

cloud_admin@postgres=# select auth.init(cast('1' as bigint), to_jsonb(TEXT '{ "kty": "EC", "kid": "571683be-33cf-4e67-bccc-8905c0ebb862", "crv": "P-521", "alg": "ES512", "x": "AM_GsnQvKML2yXdn_OsN8PdgO1Sf9XMXih5vQMKLmJkp-Iz_FFWJUt6uyR_qp4brr8Ji2kjGJgN4cQJpg2kskH7V", "y": "AZg-salw24lCmsBP-BCBa5jT6INkTwLtCOC7o0BIxDVvmIEH1-PQAJVYVJPTFvPMi_PLa0QlOm-ufJYkynwa2Mau" }'));
ERROR:  called `Result::unwrap()` on an `Err` value: Error("invalid type: string \"{ \\\"kty\\\": \\\"EC\\\", \\\"kid\\\": \\\"571683be-33cf-4e67-bccc-8905c0ebb862\\\", \\\"crv\\\": \\\"P-521\\\", \\\"alg\\\": \\\"ES512\\\", \\\"x\\\": \\\"AM_GsnQvKML2yXdn_OsN8PdgO1Sf9XMXih5vQMKLmJkp-Iz_FFWJUt6uyR_qp4brr8Ji2kjGJgN4cQJpg2kskH7V\\\", \\\"y\\\": \\\"AZg-salw24lCmsBP-BCBa5jT6INkTwLtCOC7o0BIxDVvmIEH1-PQAJVYVJPTFvPMi_PLa0QlOm-ufJYkynwa2Mau\\\" }\", expected struct JwkEcKey", line: 0, column: 0)
Time: 6.991 ms
```

## Checklist before requesting a review

- [x] I have performed a self-review of my code.
- [ ] If it is a core feature, I have added thorough tests.
- [ ] Do we need to implement analytics? if so did you add the relevant
metrics to the dashboard?
- [ ] If this PR requires public announcement, mark it with
/release-notes label and add several sentences in this section.

## Checklist before merging

- [ ] Move the download location to a proper URL
2024-10-01 10:29:56 +01:00
Heikki Linnakangas
65bda19051 Remove unnecessary dev package from compute image (#9210)
libcurl4-openssl-dev is needed to build pgxn/, but libcurl4 is enough at
runtime.
2024-10-01 01:07:43 +03:00
Arthur Petukhovsky
ba498a630a Set disk quotas on bind in compute_ctl (#8936)
Part of https://github.com/neondatabase/cloud/issues/13127. Resolves
#9153

What changed in this PR:
1. Adds `ComputeSpec.disk_quota_bytes: Option<u64>`
2. Adds new arg to compute_ctl: `--set-disk-quota-for-fs <mountpoint>`
3. Implements running `/neonvm/bin/set-disk-quota` with the right value
if both cmdline arg AND field in the spec are specified
4. Patches `/etc/sudoers.d` to allow `compute_ctl` to set quota with
sudo

This PR is very similar to the swap support added earlier, you can take
a look at it as prior art: #7434

In theory, it can be implemented outside of compute_ctl when we will
have a separate neonvm daemon, but we are not there yet. Current
implementation is the simplest possible to unblock computes with larger
disks.

All code related to usage of `/neonvm/bin/set-disk-quota` is located in
`disk_quota.rs`. We need to call this script with the following
arguments: `/neonvm/bin/set-disk-quota {size_kb} {mountpoint}`. Quotas
are set on the filesystem level, so we need to provide path to the
directory that filesystem was mounted to.

I tested this change locally with
https://github.com/neondatabase/cloud/pull/17270. It should be safe to
merge, because this feature is gated by both cmdline arg and field in
the spec. If control-plane doesn't set values in both places,
compute_ctl won't be affected by this change.
2024-09-27 20:52:22 +01:00
Tristan Partin
684e924211 Fix compute_logical_snapshot_files for v14
The function, pg_ls_logicalsnapdir(), was added in version 15.

Signed-off-by: Tristan Partin <tristan@neon.tech>
2024-09-25 16:25:17 -05:00
Matthias van de Meent
c4f5736d5a Build images for PG17 using Debian 12 "Bookworm" (#9132)
This increases the support window of the OS used for PG17 by 2 years
compared to the previous usage of Debian 11 "Bullseye".
2024-09-25 17:50:05 +03:00
a-masterov
b224a5a377 Move the patch to compute (#9120)
## Problem
All the other patches were moved to the compute directory, and only one
was left in the patches subdirectory in the root directory.

## Summary of changes
The patch was moved to the compute directory as others
2024-09-24 15:13:18 +02:00
Heikki Linnakangas
3ad567290c Move metric exporter and pgbouncer config files
Instead of adding them to the VM image late in the build process, when
putting together the final VM image, include them in the earlier
compute image already. That makes it more convenient to edit the
files, and to test them.
2024-09-24 00:35:52 +03:00
Heikki Linnakangas
3a110e45ed Move files related to building compute image into compute/ dir
Seems nice to keep all these together. This also provides a nice place
for a README file to describe the compute image build process. For
now, it briefly describes the contents of the directory, but can be
expanded.
2024-09-24 00:35:52 +03:00