Commit Graph

2071 Commits

Author SHA1 Message Date
sharnoff
db5ec0dae7 Cleanup/simplify logical size calculation (#2459)
Should produce identical results; replaces an error case that shouldn't
be possible with `expect`.
2022-09-15 23:50:46 -07:00
Kirill Bulatov
031e57a973 Disable failpoints by default 2022-09-16 09:26:29 +03:00
bojanserafimov
96e867642f Validate tenant create options (#2450)
Co-authored-by: Kirill Bulatov <kirill@neon.tech>
2022-09-15 18:20:23 -04:00
Egor Suvorov
e968b5e502 tests: do not set num_safekeepers = 1, it's the default (#2457)
Also get rid if `with_safekeepers` parameter in tests.
Its meaning has changed: `False` meant "no safekeepers" which is not
supported anymore, so we assume it's always `True`.

See #1648
2022-09-15 21:43:51 +03:00
Egor Suvorov
9d9d8e9519 docs/sourcetree: update CLion set up instructions (#2454)
After #2325 the old method no longer works as our Makefile does not print compilation commands when run with --dry-run, see https://github.com/neondatabase/neon/issues/2378#issuecomment-1241421325

This method is much slower but is hopefully robust.

Add some more notes while we're here.
2022-09-15 17:16:07 +00:00
Heikki Linnakangas
1062e57fee Don't run codestyle checks separately for Postgres v14 and v15.
Previously, we compiled neon separately for Postgres v14 and v15, for
the codestyle checks. But that was bogus; we actually just ran "make
postgres", which always compiled both versions. The version really only
affected the caching.

Fix that, by copying the build steps from the main build_and_test.yml
workflow.
2022-09-15 16:33:42 +03:00
dependabot[bot]
a8d9732529 Bump axum-core from 0.2.7 to 0.2.8
Bumps [axum-core](https://github.com/tokio-rs/axum) from 0.2.7 to 0.2.8.
- [Release notes](https://github.com/tokio-rs/axum/releases)
- [Changelog](https://github.com/tokio-rs/axum/blob/main/CHANGELOG.md)
- [Commits](https://github.com/tokio-rs/axum/compare/axum-core-v0.2.7...axum-core-v0.2.8)

---
updated-dependencies:
- dependency-name: axum-core
  dependency-type: indirect
...

Signed-off-by: dependabot[bot] <support@github.com>
2022-09-15 14:07:00 +01:00
Alexey Kondratov
757e2147c1 Follow-up for neondatabase/neon#2448 (#2452)
* remove `legacy` mode from the proxy readme
* explicitly specify `authBackend` in the link auth proxy helm-values
  for all envs
2022-09-15 15:21:22 +03:00
Dmitry Ivanov
87bf7be537 [proxy] Drop support for legacy cloud API (#2448)
Apparently, it no longer exists in the cloud.
2022-09-14 21:27:47 +03:00
Heikki Linnakangas
f86ea09323 Avoid recompiling postgres_ffi every time you run "make".
Running "make" at the top level calls "make install" to install the
PostgreSQL headers into the pg_install/ directory. That always updated
the modification time of the headers even if there were no changes,
triggering recompilation of the postgres_ffi bindings. To avoid that,
use 'install -C', to install the PostgreSQL headers.

However, there was an upstream PostgreSQL issue that the
src/include/Makefile didn't respect the INSTALL configure option. That
was just fixed in upstream PostgreSQL, so cherry-pick that fix to our
vendor/postgres repositories.

Fixes https://github.com/neondatabase/neon/issues/1873.
2022-09-14 15:18:18 +03:00
Alexander Bayandin
d87c9e62d6 Nightly Benchmarks: perform tests on both pre-created and fresh projects (#2443) 2022-09-14 10:53:34 +00:00
Heikki Linnakangas
c3096532f9 Fix vendor/postgres-v15 to point to correct v15 branch.
Commit f44afbaf62 updated vendor/postgres-v15 to point to a commit that
was built on top of PostgreSQL 14 rather than 15. So we accidentally had
two copies of PostgreSQL v14 in the repository. Oops. This updates
it to point to the correct version.
2022-09-14 09:23:51 +03:00
Kirill Bulatov
6db6e7ddda Use backward-compatible safekeeper code 2022-09-14 08:14:05 +03:00
Kirill Bulatov
b8eb908a3d Rename old project name references 2022-09-14 08:14:05 +03:00
Kirill Bulatov
260ec20a02 Refotmat pgxn code, add typedefs.list that was used 2022-09-14 08:14:05 +03:00
Dmitry Rodionov
ba8698bbcb update neon_local output in readme 2022-09-14 01:41:27 +03:00
Egor Suvorov
35761ac6b6 docs/sourcetree: add info about IDE config (#2332) 2022-09-13 21:55:18 +00:00
Kirill Bulatov
32b7259d5e Timeline data management RFC (#2152) 2022-09-13 22:37:20 +03:00
Dmitry Rodionov
1d53173e62 update openapi spec (tenant state has changed) 2022-09-13 21:53:59 +03:00
Alexander Bayandin
d4d57ea2dd github/workflows: fix project creation via API (#2437) 2022-09-13 20:26:26 +02:00
Dmitry Rodionov
db0c49148d clean up metrics in handle_pagerequests 2022-09-13 21:21:45 +03:00
Alexander Bayandin
59d04ab66a test_runner: redact passwords from log messages (#2434) 2022-09-13 17:24:11 +00:00
Kirill Bulatov
1a8c8b04d7 Merge Repository and Tenant entities, rework tenant background jobs 2022-09-13 15:39:39 +03:00
Konstantin Knizhnik
f44afbaf62 Changes of neon extension to support local prefetch (#2369)
* Changes of neon extension to support local prefetch

* Catch exceptions in pageserver_receive

* Bump posgres version

* Bump posgres version

* Bump posgres version

* Bump posgres version
2022-09-13 12:26:20 +03:00
Alexander Bayandin
4f7557fb58 github/workflows: Create projects using API (#2403)
* github/actions: add neon projects related actions
* workflows/benchmarking: create projects using API
* workflows/pg_clients: create projects using API
2022-09-13 09:45:45 +01:00
Kirill Bulatov
2a837d7de7 Create tenants in temporary directory first (#2426) 2022-09-12 21:04:33 +00:00
Heikki Linnakangas
40c845e57d Switch to async for all concurrency in the pageserver.
Instead of spawning helper threads, we now use Tokio tasks. There
are multiple Tokio runtimes, for different kinds of tasks. One for
serving libpq client connections, another for background operations
like GC and compaction, and so on. That's not strictly required, we
could use just one runtime, but with this you can still get an
overview of what's happening with "top -H".

There's one subtle behavior in how TenantState is updated. Before this
patch, if you deleted all timelines from a tenant, its GC and
compaction loops were stopped, and the tenant went back to Idle
state. We no longer do that. The empty tenant stays Active. The
changes to test_tenant_tasks.py are related to that.

There's still plenty of synchronous code and blocking. For example, we
still use blocking std::io functions for all file I/O, and the
communication with WAL redo processes is still uses low-level unix
poll(). We might want to rewrite those later, but this will do for
now. The model is that local file I/O is considered to be fast enough
that blocking - and preventing other tasks running in the same thread -
is acceptable.
2022-09-12 14:21:00 +03:00
Kirill Bulatov
698d6d0bad Use stable coverage API with rustc 1.60 2022-09-12 13:44:54 +03:00
Heikki Linnakangas
a48f9f377d Fix typo in issue template 2022-09-10 01:23:19 +03:00
Kirill Bulatov
18dafbb9ba Remove deceiving rust version from the CI files 2022-09-09 22:32:00 +03:00
Kirill Bulatov
648e86e9df Use Debian images with libc 2.31 to build legacy compute tools 2022-09-09 22:32:00 +03:00
Kirill Bulatov
923f642549 Collect cargo build timings 2022-09-09 22:32:00 +03:00
Kirill Bulatov
31ec3b7906 Use the toolchain file to define current rustc version used 2022-09-09 22:32:00 +03:00
Kirill Bulatov
c9e7c2f014 Ensure all temporary and empty directories and files are cleansed on pageserver startup 2022-09-09 16:36:45 +03:00
Kirill Bulatov
d3f83eda52 Use regular agent for triggering e2e tests 2022-09-09 16:07:43 +03:00
Dmitry Rodionov
0b76b82e0e review clean up 2022-09-08 19:59:42 +03:00
Heikki Linnakangas
35b4816f09 Turn GenericRemoteStorage into just a newtype around 'Arc<dyn RemoteStorage>'
We had a pattern like this:

    match remote_storage {
        GenericRemoteStorage::Local(storage) => {
            let source = storage.remote_object_id(&file_path)?;
            ...
            storage
                .function(&source, ...)
                .await
       },
       GenericRemoteStorage::S3(storage) => {
	    ... exact same code as for the Local case ...
       },

This removes the code duplication, by allowing you to call the functions
directly on GenericRemoteStorage.

Also change RemoveObjectId to be just a type alias for String. Now that
the callers of GenericRemoteStorage functions don't know whether they're
dealing with the LocalFs or S3 implementation, RemoveObjectId must be the
same type for both.
2022-09-08 19:59:42 +03:00
Alexander Bayandin
171385ac14 Pass COPT and PG_CFLAGS to Extension's CFLAGS (#2405)
* fix incompatible-function-pointer-types warning
* Pass COPT and PG_CFLAGS  to Extension's CFLAGS
2022-09-08 16:02:11 +01:00
MMeent
1351beae19 Fix race condition in ginHeapTupleFastInsert (#2412)
Because the metadata was not locked, it could be updated concurrently
such that we wouldn't actually have the tail block.

The current ordering works better, as we still only start XLogBeginInsert()
once we have all potentially interesting buffers loaded in memory, but
still have correct lock lifetimes.

See also: access/transam/README section Write-Ahead Log Coding
2022-09-08 10:57:30 +00:00
Alexander Bayandin
9e3136ea37 scripts/ingest_regress_test_result.py: fix json data insertion (#2408) 2022-09-07 21:40:08 +01:00
Alexander Bayandin
83dca73f85 Store Allure tests statistics in database (#2367) 2022-09-07 14:16:48 +01:00
Lassi Pölönen
dc2150a90e Add built files to gitignore (#2404) 2022-09-07 12:11:03 +00:00
Anastasia Lubennikova
2794cd83c7 Prepare pg 15 support (generate bindings for pg15) (#2396)
Another preparatory commit for pg15 support:
* generate bindings for both pg14 and pg15;
* update Makefile and CI scripts: now neon build depends on both PostgreSQL versions;
* some code refactoring to decrease version-specific dependencies.
2022-09-07 12:40:48 +03:00
Heikki Linnakangas
65b592d4bd Remove deprecated management API for timeline detach.
It is no longer used anywhere.
2022-09-06 18:54:04 +03:00
Heikki Linnakangas
f441fe57d4 Register prometheus counters correctly.
Commit f081419e68 moved all the prometheus counters to `metrics.rs`,
but accidentally replaced a couple of `register_int_counter!(...)`
calls with just `IntCounter::new(...)`. Because of that, the counters
were not registered in the metrics registry, and were not exposed
through the metrics HTTP endpoint.

Fixes failures we're seeing in a bunch of 'performance' tests because
of the missing metrics.
2022-09-06 17:38:17 +03:00
Heikki Linnakangas
cf157ad8e4 Add test that repeatedly kills and restarts the pageserver.
This caught or reproduced several bugs when I originally wrote this test
back in May, including #1731, #1740, #1751, and #707. I believe all the
issues have been fixed now, but since this was a very fruitful test,
let's add it to the test suite.

We didn't commit this earlier, because the test was very slow especially
with a debug build. We've since changed the build options so that even
the debug builds are not quite so slow anymore.
2022-09-06 13:00:40 +03:00
Lassi Pölönen
f081419e68 Cleanup tenant specific metrics once a tenant is detached. (#2328)
* Add test for pageserver metric cleanup once a tenant is detached.

* Remove tenant specific timeline metrics on detach.

* Use definitions from timeline_metrics in page service.

* Move metrics to own file from layered_repository/timeline.rs

* TIMELINE_METRICS: define smgr metrics

* REMOVE SMGR cleanup from timeline_metrics. Doesn't seem to work as
expected.

* Vritual file centralized metrics, except for evicted file as there's no
tenat id or timeline id.

* Use STORAGE_TIME from timeline_metrics in layered_repository.

* Remove timelineless gc metrics for tenant on detach.

* Rename timeline metrics -> metrics as it's more generic.

* Don't create a TimelineMetrics instance for VirtualFile

* Move the rest of the metric definitions to metrics.rs too.

* UUID -> ZTenantId

* Use consistent style for dict.

* Use Repository's Drop trait for dropping STORAGE_TIME metrics.

* No need for Arc, TimelineMetrics is used in just one place. Due to that,
we can fall back using ZTenantId and ZTimelineId too to avoid additional
string allocation.
2022-09-06 11:30:20 +03:00
Anastasia Lubennikova
05e263d0d3 Prepare pg 15 support (build system and submodules) (#2337)
* Add submodule postgres-15

* Support pg_15 in pgxn/neon

* Renamed zenith -> neon in Makefile

* fix name of codestyle check

* Refactor build system to prepare for building multiple Postgres versions.

Rename "vendor/postgres" to "vendor/postgres-v14"

Change Postgres build and install directory paths to be version-specific:

- tmp_install/build -> pg_install/build/14
- tmp_install/* -> pg_install/14/*

And Makefile targets:

- "make postgres" -> "make postgres-v14"
- "make postgres-headers" -> "make postgres-v14-headers"
- etc.

Add Makefile aliases:

- "make postgres" to build "postgres-v14" and in future, "postgres-v15"
- similarly for "make postgres-headers"

Fix POSTGRES_DISTRIB_DIR path in pytest scripts

* Make postgres version a variable in codestyle workflow

* Support vendor/postgres-v15 in codestyle check workflow

* Support postgres-v15 building in Makefile

* fix pg version in Dockerfile.compute-node

* fix kaniko path

* Build neon extensions in version-specific directories

* fix obsolete mentions of vendor/postgres

* use vendor/postgres-v14 in Dockerfile.compute-node.legacy

* Use PG_VERSION_NUM to gate dependencies in inmem_smgr.c

* Use versioned ECR repositories and image names for compute-node.
The image name format is compute-node-vXX, where XX is postgres major version number.
For now only v14 is supported.
Old format unversioned name (compute-node) is left, because cloud repo depends on it.

* update vendor/postgres submodule url (zenith->neondatabase rename)

* Fix postgres path in python tests after rebase

* fix path in regress test

* Use separate dockerfiles to build compute-node:
Dockerfile.compute-node-v15 should be identical to Dockerfile.compute-node-v14 except for the version number.
This is a hack, because Kaniko doesn't support build ARGs properly

* bump vendor/postgres-v14 and vendor/postgres-v15

* Don't use Kaniko cache for v14 and v15 compute-node images

* Build compute-node images for different versions in different jobs

Co-authored-by: Heikki Linnakangas <heikki@neon.tech>
2022-09-05 18:30:54 +03:00
Alexander Bayandin
ee0071e90d Fix nightly benchmark reports (#2392) 2022-09-05 14:30:37 +01:00
Stas Kelvich
772078eb5c Reword proxy SNI error message
Be more strict with project id/name difference and explain how to
get project id out of the domain name.
2022-09-05 15:13:29 +03:00