## Problem
This is a prerequisite for neondatabase/neon#12575 to keep all things
relevant to `build-tools` image in a single directory
## Summary of changes
- Rename `build_tools/` to `build-tools/`
- Move `build-tools.Dockerfile` to `build-tools/Dockerfile`
Mainly for general readability. Some notable changes:
- Postgres can be built without the rest of the repository, and in
particular without any of the Rust bits. Some CI scripts took advantage
of that, so let's make that more explicit by separating those parts.
Also add an explicit comment about that in the new postgres.mk file.
- Add a new PG_INSTALL_CACHED variable. If it's set, `make all` and
other top-Makefile targets skip checking if Postgres is up-to-date. This
is also to be used in CI scripts that build and cache Postgres as
separate steps. (It is currently only used in the macos walproposer-lib
rule, but stay tuned for more.)
- Introduce a POSTGRES_VERSIONS variable that lists all supported
PostgreSQL versions. Refactor a few Makefile rules to use that.
Service targeted for storing and retrieving LFC prewarm data.
Can be used for proxying S3 access for Postgres extensions like
pg_mooncake as well.
Requests must include a Bearer JWT token.
Token is validated using a pemfile (should be passed in infra/).
Note: app is not tolerant to extra trailing slashes, see app.rs
`delete_prefix` test for comments.
Resolves: https://github.com/neondatabase/cloud/issues/26342
Unrelated changes: gate a `rename_noreplace` feature and disable it in
`remote_storage` so as `object_storage` can be built with musl
## Problem
We use `docker cp` to copy the files required for the extension tests
now.
It causes problems if we run older images with the newer source tree.
## Summary of changes
Copying the files was moved to the compute Dockerfile.
## Problem
During ingest_benchmark which uses `pgcopydb`
([see](https://github.com/dimitri/pgcopydb))we sometimes had outages.
- when PostgreSQL COPY step failed we got a segfault (reported
[here](https://github.com/dimitri/pgcopydb/issues/899))
- the root cause was Neon idle_in_transaction_session_timeout is set to
5 minutes which is suboptimal for long-running tasks like project import
(reported [here](https://github.com/dimitri/pgcopydb/issues/900))
## Summary of changes
Patch pgcopydb to avoid segfault.
override idle_in_transaction_session_timeout and set it to "unlimited"
Seems nice to keep all these together. This also provides a nice place
for a README file to describe the compute image build process. For
now, it briefly describes the contents of the directory, but can be
expanded.
The S3 scrubber contains "S3" in its name, but we want to make it
generic in terms of which storage is used (#7547). Therefore, rename it
to "storage scrubber", following the naming scheme of already existing
components "storage broker" and "storage controller".
Part of #7547
## Problem
We need automated tests of extensions shipped with Neon to detect
possible problems.
## Summary of changes
A new image neon-test-extensions is added. Workflow changes to test the
shipped extensions are added as well.
Currently, the regression tests, shipped with extensions are in use.
Some extensions, i.e. rum, timescaledb, rdkit, postgis, pgx_ulid, pgtap,
pg_tiktoken, pg_jsonschema, pg_graphql, kq_imcx, wal2json_2_5 are
excluded due to problems or absence of internal tests.
---------
Co-authored-by: Alexander Bayandin <alexander@neon.tech>
Co-authored-by: Heikki Linnakangas <heikki@neon.tech>
Upgrade pgvector to 0.7.0.
This PR is based on Heikki's PR #6753 and just uses pgvector 0.7.0
instead of 0.6.0
I have now done all planned manual tests.
The pull request is ready to be reviewed and merged and can be deployed
in production together / after swap enablement.
See (https://github.com/neondatabase/autoscaling/issues/800)
Fixes https://github.com/neondatabase/neon/issues/6516
Fixes https://github.com/neondatabase/neon/issues/7780
## Documentation input for usage recommendations
### maintenance_work_mem
In Neon
`maintenance_work_mem` is very small by default (depends on configured
RAM for your compute but can be as low as 64 MB).
To optimize pgvector index build time you may have to bump it up
according to your working set size (size of tuples for vector index
creation).
You can do so in the current session using
`SET maintenance_work_mem='10 GB';`
The target value you choose should fit into the memory of your compute
size and not exceed 50-60% of available RAM.
The value above has been successfully used on a 7CU endpoint.
### max_parallel_maintenance_workers
max_parallel_maintenance_workers is also small by default (2). For
efficient parallel pgvector index creation you have to bump it up with
`SET max_parallel_maintenance_workers = 7`
to make use of all the CPUs available, assuming you have configured your
endpoint to use 7CU.
## ID input for changelog
pgvector extension in Neon has been upgraded from version 0.5.1 to
version 0.7.0.
Please see https://github.com/pgvector/pgvector/ for documentation of
new capabilities in pgvector version 0.7.0
If you have existing databases with pgvector 0.5.1 already installed
there is a slight difference in behavior in the following corner cases
even if you don't run `ALTER EXTENSION UPDATE`:
### L2 distance from NULL::vector
For the following script, comparing the NULL::vector to non-null vectors
the resulting output changes:
```sql
SET enable_seqscan = off;
CREATE TABLE t (val vector(3));
INSERT INTO t (val) VALUES ('[0,0,0]'), ('[1,2,3]'), ('[1,1,1]'), (NULL);
CREATE INDEX ON t USING hnsw (val vector_l2_ops);
INSERT INTO t (val) VALUES ('[1,2,4]');
SELECT * FROM t ORDER BY val <-> (SELECT NULL::vector);
```
and now the output is
```
val
---------
[1,1,1]
[1,2,4]
[1,2,3]
[0,0,0]
(4 rows)
```
For the following script
```sql
SET enable_seqscan = off;
CREATE TABLE t (val vector(3));
INSERT INTO t (val) VALUES ('[0,0,0]'), ('[1,2,3]'), ('[1,1,1]'), (NULL);
CREATE INDEX ON t USING ivfflat (val vector_l2_ops) WITH (lists = 1);
INSERT INTO t (val) VALUES ('[1,2,4]');
SELECT * FROM t ORDER BY val <-> (SELECT NULL::vector);
```
the output now is
```
val
---------
[0,0,0]
[1,2,3]
[1,1,1]
[1,2,4]
(4 rows)
```
### changed error messages
If you provide invalid literals for datatype vector you may get
improved/changed error messages, for example:
```sql
neondb=> SELECT '[4e38,1]'::vector;
ERROR: "4e38" is out of range for type vector
LINE 1: SELECT '[4e38,1]'::vector;
^
```
---------
Co-authored-by: Heikki Linnakangas <heikki@neon.tech>
The binary etc were renamed some time ago, but the path in the source
tree remained "attachment_service" to avoid disruption to ongoing PRs.
There aren't any big PRs out right now, so it's a good time to cut over.
- Rename `attachment_service` to `storage_controller`
- Move it to the top level for symmetry with `storage_broker` & to avoid
mixing the non-prod neon_local stuff (`control_plane/`) with the storage
controller which is a production component.
The issue is still unsolved because of shmem size in VMs. Need to figure it out before applying this patch.
For more details:
```
ERROR: could not resize shared memory segment "/PostgreSQL.2892504480" to 16774205952 bytes: No space left on device
```
As an example, the same issue in community pgvector/pgvector#453.
This includes a compatibility patch that is needed because pgvector
now skips WAL-logging during the index build, and WAL-logs the index
only in one go at the end. That's how GIN, GiST and SP-GIST index
builds work in core PostgreSQL too, but we need some Neon-specific
calls to mark the beginning and end of those build phases.
pgvector is the first index AM that does that with parallel workers,
so I had to modify those functions in the Neon extension to be aware
of parallel workers. Only the leader needs to create the underlying
file and perform the WAL-logging. (In principle, the parallel workers
could participate in the WAL-logging too, but pgvector doesn't do
that. This will need some further work if that changes).
The previous attempt at this (#6592) missed that parallel workers
needed those changes, and segfaulted in parallel build that spilled to
disk.
Testing
-------
We don't have a place for regression tests of extensions at the
moment. I tested this manually with the following script:
```
CREATE EXTENSION IF NOT EXISTS vector;
DROP TABLE IF EXISTS tst;
CREATE TABLE tst (i serial, v vector(3));
INSERT INTO tst (v) SELECT ARRAY[random(), random(), random()] FROM generate_series(1, 15000) g;
-- Serial build, in memory
ALTER TABLE tst SET (parallel_workers=0);
SET maintenance_work_mem='50 MB';
CREATE INDEX idx ON tst USING hnsw (v vector_l2_ops);
-- Test that the index works. (The table contents are random, and the
-- search is approximate anyway, so we cannot check the exact values.
-- For now, just eyeball that they look reasonable)
set enable_seqscan=off;
explain SELECT * FROM tst ORDER BY v <-> ARRAY[0, 0, 0]::vector LIMIT 5;
SELECT * FROM tst ORDER BY v <-> ARRAY[0, 0, 0]::vector LIMIT 5;
DROP INDEX idx;
-- Serial build, spills to on disk
ALTER TABLE tst SET (parallel_workers=0);
SET maintenance_work_mem='5 MB';
CREATE INDEX idx ON tst USING hnsw (v vector_l2_ops);
SELECT * FROM tst ORDER BY v <-> ARRAY[0, 0, 0]::vector LIMIT 5;
DROP INDEX idx;
-- Parallel build, in memory
ALTER TABLE tst SET (parallel_workers=4);
SET maintenance_work_mem='50 MB';
CREATE INDEX idx ON tst USING hnsw (v vector_l2_ops);
SELECT * FROM tst ORDER BY v <-> ARRAY[0, 0, 0]::vector LIMIT 5;
DROP INDEX idx;
-- Parallel build, spills to disk
ALTER TABLE tst SET (parallel_workers=4);
SET maintenance_work_mem='5 MB';
CREATE INDEX idx ON tst USING hnsw (v vector_l2_ops);
SELECT * FROM tst ORDER BY v <-> ARRAY[0, 0, 0]::vector LIMIT 5;
DROP INDEX idx;
```
This adds PostgreSQL 16 as a vendored postgresql version, and adapts the
code to support this version.
The important changes to PostgreSQL 16 compared to the PostgreSQL 15
changeset include the addition of a neon_rmgr instead of altering Postgres's
original WAL format.
Co-authored-by: Alexander Bayandin <alexander@neon.tech>
Co-authored-by: Heikki Linnakangas <heikki@neon.tech>
## Problem
The S3 scrubber currently lives at
https://github.com/neondatabase/s3-scrubber
We don't have tests that use it, and it has copies of some data
structures that can get stale.
## Summary of changes
- Import the s3-scrubber as `s3_scrubber/
- Replace copied_definitions/ in the scrubber with direct access to the
`utils` and `pageserver` crates
- Modify visibility of a few definitions in `pageserver` to allow the
scrubber to use them
- Update scrubber code for recent changes to `IndexPart`
- Update `KNOWN_VERSIONS` for IndexPart and move the definition into
index.rs so that it is easier to keep up to date
As a future refinement, it would be good to pull the remote persistence
types (like IndexPart) out of `pageserver` into a separate library so
that the scrubber doesn't have to link against the whole pageserver, and
so that it's clearer which types need to be public.
Co-authored-by: Kirill Bulatov <kirill@neon.tech>
Co-authored-by: Dmitry Rodionov <dmitry@neon.tech>
Co-authored-by: Arpad Müller <arpad-m@users.noreply.github.com>
We need some real extensions in S3 to accurately test the code for
handling remote extensions.
In this PR we just upload three extensions (anon, kq_imcx and postgis), which is
enough for testing purposes for now. In addition to creating and
uploading the extension archives, we must generate a file
`ext_index.json` which specifies important metadata about the
extensions.
---------
Co-authored-by: Anastasia Lubennikova <anastasia@neon.tech>
Co-authored-by: Alexander Bayandin <alexander@neon.tech>
Which ought to replace etcd. This patch only adds the binary and adjusts
Dockerfile to include it; subsequent ones will add deploy of helm chart and the
actual replacement.
It is a simple and fast pub-sub message bus. In this patch only safekeeper
message is supported, but others can be easily added.
Compilation now requires protoc to be installed. Installing protobuf-compiler
package is fine for Debian/Ubuntu.
ref
https://github.com/neondatabase/neon/pull/2733https://github.com/neondatabase/neon/issues/2394
* Add submodule postgres-15
* Support pg_15 in pgxn/neon
* Renamed zenith -> neon in Makefile
* fix name of codestyle check
* Refactor build system to prepare for building multiple Postgres versions.
Rename "vendor/postgres" to "vendor/postgres-v14"
Change Postgres build and install directory paths to be version-specific:
- tmp_install/build -> pg_install/build/14
- tmp_install/* -> pg_install/14/*
And Makefile targets:
- "make postgres" -> "make postgres-v14"
- "make postgres-headers" -> "make postgres-v14-headers"
- etc.
Add Makefile aliases:
- "make postgres" to build "postgres-v14" and in future, "postgres-v15"
- similarly for "make postgres-headers"
Fix POSTGRES_DISTRIB_DIR path in pytest scripts
* Make postgres version a variable in codestyle workflow
* Support vendor/postgres-v15 in codestyle check workflow
* Support postgres-v15 building in Makefile
* fix pg version in Dockerfile.compute-node
* fix kaniko path
* Build neon extensions in version-specific directories
* fix obsolete mentions of vendor/postgres
* use vendor/postgres-v14 in Dockerfile.compute-node.legacy
* Use PG_VERSION_NUM to gate dependencies in inmem_smgr.c
* Use versioned ECR repositories and image names for compute-node.
The image name format is compute-node-vXX, where XX is postgres major version number.
For now only v14 is supported.
Old format unversioned name (compute-node) is left, because cloud repo depends on it.
* update vendor/postgres submodule url (zenith->neondatabase rename)
* Fix postgres path in python tests after rebase
* fix path in regress test
* Use separate dockerfiles to build compute-node:
Dockerfile.compute-node-v15 should be identical to Dockerfile.compute-node-v14 except for the version number.
This is a hack, because Kaniko doesn't support build ARGs properly
* bump vendor/postgres-v14 and vendor/postgres-v15
* Don't use Kaniko cache for v14 and v15 compute-node images
* Build compute-node images for different versions in different jobs
Co-authored-by: Heikki Linnakangas <heikki@neon.tech>
The dockerignore and dockerfile have also been excluded from being moved into
docker images, saving docker layer cache busts if only those are changed.
* make .dockerignore `ncdu -X` compatible to easily inspect build context
* remove cargo-chef as it was introducing more problems than it was solving
* remove rocksdb packages
* add ca-certs in the resulting image. We need that to be able to make https
connections from container with proxy to the console.