if anon is in shared_preload_libraries.
Users cannot install it themselves, because superuser is required.
GRANT all privileges needed to use it to db_owner.
We use the neon fork of the extension, because a small change to the SQL
file is needed to allow db_owner to use it.
This feature is behind a feature flag AnonExtension,
so it is not enabled by default.
## Problem
Drizzle needs to be able to configure the array_mode flag per query.
## Summary of changes
Adds an array_mode flag to the query data json that will otherwise
default to the header flag.
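
A minimal sketch of the fallback described above, not the proxy's actual code. The header name `Neon-Array-Mode` and the JSON key `array_mode` are assumptions made for illustration:

```python
import json

# Assumed header name; the text above only says "the header flag".
ARRAY_MODE_HEADER = "Neon-Array-Mode"


def resolve_array_mode(body: str, headers: dict) -> bool:
    """Return the per-query array_mode if present, else the header value."""
    query = json.loads(body)
    per_query = query.get("array_mode")  # assumed key name
    if per_query is not None:
        # The per-query flag wins over the header.
        return bool(per_query)
    # No per-query flag: fall back to the header, defaulting to False.
    return headers.get(ARRAY_MODE_HEADER, "false").lower() == "true"
```

With this shape, a client such as Drizzle can override the header for a single query without changing the default for the rest of the request.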
## Problem
Previous test started with a new-style TenantShardId with a non-zero
ShardCount. We also need to handle the case of a ShardCount(0) (aka
`unsharded`) parent shard.
**A followup PR will refactor ShardCount to make its inner value private
and thereby make this kind of mistake harder**
## Summary of changes
- Fix a place where we were incorrectly treating a ShardCount as a number
of shards rather than as a thing that can be zero or the number of shards.
- Add a test for this case.
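
To illustrate the distinction, here is a small sketch (in Python for brevity; the names are hypothetical and this is not the real `ShardCount` type) where the raw value can be zero but the number of shards that exist never is:

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ShardCount:
    """Hypothetical model: raw == 0 means 'unsharded'."""

    raw: int

    def is_unsharded(self) -> bool:
        return self.raw == 0

    def count(self) -> int:
        """Number of shards that exist: an unsharded tenant still has one."""
        return max(self.raw, 1)
```

Callers that need "how many shards are there" go through `count()`, so code that uses the raw value in arithmetic stands out in review.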
This test occasionally fails with a difference in "pg_xact/0000" file
between the local and restored datadirs. My hypothesis is that something
changed in the database between the last explicit checkpoint and the
shutdown. I suspect autovacuum; it could certainly create transactions.
To fix, be more precise about the point in time that we compare. Shut
down the endpoint first, then read the last LSN (i.e. the shutdown
checkpoint's LSN), from the local disk with pg_controldata. And use
exactly that LSN in the basebackup.
Closes #559.
I'm proposing this as an alternative to
https://github.com/neondatabase/neon/pull/6662.
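
A sketch of reading the shutdown checkpoint's LSN, assuming the output of `pg_controldata` has already been captured as text. The helper names and the sample output are illustrative, not taken from the test:

```python
import re


def parse_checkpoint_lsn(controldata_output: str) -> str:
    """Extract the 'Latest checkpoint location' LSN from pg_controldata output."""
    match = re.search(
        r"^Latest checkpoint location:\s*([0-9A-Fa-f]+/[0-9A-Fa-f]+)\s*$",
        controldata_output,
        re.MULTILINE,
    )
    if match is None:
        raise ValueError("no checkpoint location found in pg_controldata output")
    return match.group(1)


def lsn_to_int(lsn: str) -> int:
    """Convert an LSN in 'hi/lo' hex form to a 64-bit integer."""
    hi, lo = lsn.split("/")
    return (int(hi, 16) << 32) | int(lo, 16)


# Illustrative excerpt only; real output has many more fields.
SAMPLE = """\
Database cluster state:               shut down
Latest checkpoint location:           0/15D6A88
Latest checkpoint's REDO location:    0/15D6A88
"""
```

Because the endpoint is already shut down when the file is read, nothing can advance the LSN between this lookup and the basebackup request.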
## Problem
test_lfc_resize sometimes failed with an assertion failure when acquiring
the lock in a write operation:
```
if (lfc_ctl->generation == generation)
{
Assert(LFC_ENABLED());
```
## Summary of changes
Increment generation when 0 is assigned to neon.file_cache_size_limit
## Checklist before requesting a review
- [ ] I have performed a self-review of my code.
- [ ] If it is a core feature, I have added thorough tests.
- [ ] Do we need to implement analytics? if so did you add the relevant
metrics to the dashboard?
- [ ] If this PR requires public announcement, mark it with
/release-notes label and add several sentences in this section.
## Checklist before merging
- [ ] Do not forget to reformat commit message to not include the above
checklist
Co-authored-by: Konstantin Knizhnik <knizhnik@neon.tech>
@problame noticed that the `tokio::sync::AcquireError` branch assertion
can be hit like in the first commit. We haven't seen this yet in
production, but I'd prefer not to see it there. There `take_and_deinit`
is being used, but this race must be quite timing sensitive.
- Automatically set a node's availability to Active if it is responsive
in startup_reconcile
- Impose a 5s timeout on the HTTP request to list location conf, so that
an unresponsive node can't hang it for minutes
- Do several retries if the request fails with a retryable error, to be
tolerant of concurrent pageserver & storage controller restarts
- Add a readiness hook for use with k8s so that we can tell when the
startup reconciliation is done and the service is fully ready to do work.
- Add /metrics to the list of un-authenticated endpoints (this is
unrelated but we're touching the line in this PR already, and it fixes
auth error spam in deployed container.)
- A test for the above.
Closes: #6670
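
The timeout-plus-retry behaviour above can be sketched roughly as follows. This is an illustration, not the storage controller's code: `RetryableError`, `call_with_retries`, the attempt count and the backoff values are all assumptions, and only the 5s timeout comes from the description:

```python
import time


class RetryableError(Exception):
    """Hypothetical marker for errors worth retrying (e.g. the peer is restarting)."""


def call_with_retries(request, attempts=3, timeout_s=5.0, base_delay_s=0.1, sleep=time.sleep):
    """Call request(timeout_s), retrying on RetryableError up to `attempts` times."""
    last_error = None
    for attempt in range(attempts):
        try:
            # Each attempt gets its own bounded timeout, so one unresponsive
            # node cannot stall startup for minutes.
            return request(timeout_s)
        except RetryableError as err:
            last_error = err
            if attempt + 1 < attempts:
                sleep(base_delay_s * 2 ** attempt)  # simple exponential backoff
    raise last_error
```

Any exception other than `RetryableError` propagates immediately, which keeps non-transient failures visible.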
## Problem
One doesn't know at tenant creation time how large the tenant will grow.
We need to be able to dynamically adjust the shard count at runtime.
This is implemented as "splitting" of shards into smaller child shards,
which cover a subset of the keyspace that the parent covered.
Refer to RFC: https://github.com/neondatabase/neon/pull/6358
Part of epic: #6278
## Summary of changes
This PR implements the happy path (it does not cleanly recover from a
crash mid-split, although it won't lose any data), without any optimizations
(e.g. child shards re-download their own copies of layers that the
parent shard already had on local disk)
- Add `/v1/tenant/:tenant_shard_id/shard_split` API to pageserver: this
copies the shard's index to the child shards' paths, instantiates child
`Tenant` object, and tears down parent `Tenant` object.
- Add `splitting` column to `tenant_shards` table. This is written into
an existing migration because we haven't deployed yet, so don't need to
cleanly upgrade.
- Add `/control/v1/tenant/:tenant_id/shard_split` API to
attachment_service.
- Add `test_sharding_split_smoke` test. This covers the happy path:
future PRs will add tests that exercise failure cases.
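
As an illustration of why children cover a subset of the parent's keyspace, here is a sketch under one loudly stated assumption: keys are mapped to shards by hash modulo shard count. That mapping, and the function name, are assumptions for this example and are not specified in the description above:

```python
def child_shards(parent_number: int, parent_count: int, new_count: int) -> list:
    """Shard numbers in the new layout whose keys all belonged to the parent.

    Assumes key -> shard is `hash(key) % shard_count` (an assumption of this sketch).
    A parent_count of 0 means 'unsharded' and behaves like a single shard.
    """
    effective_old = max(parent_count, 1)
    if new_count % effective_old != 0:
        raise ValueError("new shard count must be a multiple of the old one")
    return [n for n in range(new_count) if n % effective_old == parent_number]
```

Under that assumption, any key that lands on child `n` of the new layout satisfies `hash % old_count == parent_number`, so it previously lived on the parent.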
## Problem
See #6626
If there is an inactive replication slot, Postgres will not be able to
shrink WAL and delete unused snapshots.
If some other active subscription is present, then the snapshots created
every 15 seconds will overflow AUX_DIR.
Setting `max_slot_wal_keep_size` doesn't solve the problem, because even
a small WAL segment will be enough to overflow AUX_DIR if there is no
other activity on the system.
## Summary of changes
If there are active subscriptions and some logical replication slots have
not been used during the `neon.logical_replication_max_time_lag` interval,
then the unused slots are dropped.
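
The rule can be sketched as a pure decision function. The `Slot` type and its fields are hypothetical stand-ins, and the real check lives in the extension rather than in Python:

```python
from dataclasses import dataclass


@dataclass
class Slot:
    """Hypothetical view of a logical replication slot."""

    name: str
    seconds_since_last_use: float


def slots_to_drop(slots, active_subscriptions: int, max_time_lag_s: float) -> list:
    """Names of slots unused for longer than max_time_lag_s.

    Nothing is dropped unless at least one subscription is active, mirroring
    the condition described above.
    """
    if active_subscriptions == 0:
        return []
    return [s.name for s in slots if s.seconds_since_last_use > max_time_lag_s]
```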
Co-authored-by: Konstantin Knizhnik <knizhnik@neon.tech>
## Problem
The password check logic for the sql-over-http is a bit non-intuitive.
## Summary of changes
1. Perform scram auth using the same logic as for websocket cleartext
password.
2. Split establish connection logic and connection pool.
3. Parallelize param parsing logic with authentication + wake compute.
4. Limit the total number of clients
## Problem
Copyright notice is outdated
## Summary of changes
Replace the initial year `2022` with `2022 - 2024`, after brief
discussion with Stas about the format
Co-authored-by: Andreas Scherbaum <andreas@neon.tech>
## Problem
The existing behavior isn't exactly incorrect, but is operationally
risky: if the control plane compute hook breaks, then all the control
plane operations trying to call /location_config will end up retrying
forever, which could put more load on the system.
## Summary of changes
- Treat 404s as fatal errors to do fewer retries: a 404 either indicates
we have the wrong URL, or some control plane bug is failing to recognize
our tenant ID as existing.
- Do not return an error on reconciliation errors in a non-creating
/location_config response: this allows the control plane to finish its
operation (and we will eventually retry the compute notification later)
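
A rough sketch of the two rules above. Only the 404 handling and the creating/non-creating split come from the description; treating every other non-2xx status as retryable, and all names, are assumptions of this example:

```python
from enum import Enum


class Outcome(Enum):
    OK = "ok"
    RETRY = "retry"
    FATAL = "fatal"


def classify_notify_status(status: int) -> Outcome:
    """Classify an HTTP status returned by the compute notification hook."""
    if 200 <= status < 300:
        return Outcome.OK
    if status == 404:
        # Wrong URL, or the control plane does not recognize the tenant:
        # retrying is unlikely to help.
        return Outcome.FATAL
    return Outcome.RETRY  # assumption: everything else is worth retrying


def location_config_should_fail(creating: bool, reconcile_failed: bool) -> bool:
    """Only a creating request surfaces reconciliation errors to the caller."""
    return creating and reconcile_failed
```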
This PR adds an API to live-reconfigure the VirtualFile io engine.
It also adds a flag to `pagebench get-page-latest-lsn`, which is where I
found this functionality to be useful: it helps compare the io engines
in a benchmark without re-compiling a release build, which took ~50s on
the i3en.3xlarge where I was doing the benchmark.
Switching the IO engine is completely safe at runtime.
## Problem
We have a finite number of runners and intermediate results are often
wanted before a PR is ready for merging. Currently all PRs get e2e tests
run and this creates a lot of throwaway e2e results which may or may not
get to start or complete before a new push.
## Summary of changes
1. Skip e2e test when PR is in draft mode
2. Run e2e when PR status changes from draft to ready for review (change
this to having its trigger in below PR and update results of build and
test)
3. Abstract the e2e test into a separate workflow and call it from the
main workflow
4. Add a label (run-e2e-test-in-draft): if that label is present, run the
e2e test in draft
5. Automatically add a label (approve to ci) so that, for all external
contributors' PRs, e2e runs in draft
6. Document the new label changes and the above behaviour
Draft PR : https://github.com/neondatabase/neon/actions/runs/7729128470
Ready To Review :
https://github.com/neondatabase/neon/actions/runs/7733779916
Draft PR with label :
https://github.com/neondatabase/neon/actions/runs/7725691012/job/21062432342
and https://github.com/neondatabase/neon/actions/runs/7733854028
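
The draft/label rule above boils down to a small predicate. The sketch below is only an illustration in Python; the real condition is expressed in the GitHub Actions workflow, and the function name is hypothetical:

```python
RUN_IN_DRAFT_LABEL = "run-e2e-test-in-draft"


def should_run_e2e(is_draft: bool, labels) -> bool:
    """Run e2e for ready PRs, and for draft PRs only when the label is set."""
    if not is_draft:
        return True
    return RUN_IN_DRAFT_LABEL in labels
```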
---------
Co-authored-by: Alexander Bayandin <alexander@neon.tech>
## Problem
This is mainly to limit our concurrency, rather than to speed up
requests (I was doing some sanity checks on performance of the service
with thousands of shards)
## Summary of changes
- Enable the `diesel:r2d2` feature, which provides an async connection
pool
- Acquire a connection before entering spawn_blocking for a database
transaction (recall that diesel's interface is sync)
- Set a connection pool size of 99 to fit within default postgres limit
(100)
- Also set the tokio blocking thread count to accommodate the same number
of blocking tasks (the only thing we use spawn_blocking for is database
calls).
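
The "acquire a connection before handing work to a blocking thread" pattern can be sketched with a thread pool and a bounded queue. This is an analogy in Python, not the diesel/tokio code; all class and function names are hypothetical:

```python
import queue
import threading
import time
from concurrent.futures import ThreadPoolExecutor


class ConnectionPool:
    """Toy pool: a bounded set of connection handles."""

    def __init__(self, size: int):
        self._free = queue.Queue()
        for i in range(size):
            self._free.put(f"conn-{i}")

    def acquire(self):
        return self._free.get()  # blocks while every connection is in use

    def release(self, conn):
        self._free.put(conn)


def run_transactions(pool_size: int, n_tasks: int) -> int:
    """Run n_tasks fake transactions; return the peak number in flight."""
    pool = ConnectionPool(pool_size)
    lock = threading.Lock()
    state = {"in_flight": 0, "peak": 0}

    def transaction(conn):
        with lock:
            state["in_flight"] += 1
            state["peak"] = max(state["peak"], state["in_flight"])
        try:
            time.sleep(0.001)  # stand-in for a synchronous database call
        finally:
            with lock:
                state["in_flight"] -= 1
            pool.release(conn)

    # Worker count matches the pool size, like the blocking thread count above.
    with ThreadPoolExecutor(max_workers=pool_size) as executor:
        futures = []
        for _ in range(n_tasks):
            conn = pool.acquire()  # acquire *before* entering the blocking task
            futures.append(executor.submit(transaction, conn))
        for future in futures:
            future.result()
    return state["peak"]
```

Because a task only starts once it holds a connection, the number of concurrent transactions can never exceed the pool size, however many requests arrive.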
## Problem
See
https://neondb.slack.com/archives/C04DGM6SMTM/p1707149618314539?thread_ts=1707081520.140049&cid=C04DGM6SMTM
## Summary of changes
Perform checkpoint check after processing `ingest_batch_size` (default
100) WAL records.
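
A minimal sketch of checking once per batch rather than once per record. The function and callback names are hypothetical; only `ingest_batch_size` and its default of 100 come from the description:

```python
def ingest(records, ingest_batch_size=100, on_checkpoint_check=lambda: None) -> int:
    """Process records, running the checkpoint check once per full batch.

    Returns how many times the check ran.
    """
    pending = 0
    checks = 0
    for _record in records:
        # ... ingest the WAL record here ...
        pending += 1
        if pending >= ingest_batch_size:
            on_checkpoint_check()
            checks += 1
            pending = 0
    return checks
```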
---------
Co-authored-by: Konstantin Knizhnik <knizhnik@neon.tech>
## Problem
We use an outdated version of Python (3.9.2)
## Summary of changes
- Update Python to the latest patch version (3.9.18)
- Unify the usage of python caches where possible
It's awkward to point to a file when doing some kinds of ad-hoc
deployment (like right now, when I'm hacking a helm chart having not
quite hooked up secrets properly yet). We take all the rest of the
secrets as CLI args directly, so let's do the same for public key.
create_neon_superuser runs the first queries in the database after cold
start. Traces suggest that those first queries can make up a significant
fraction of the cold start time. Make it more visible by adding an
explicit tracing span to it; currently you just have to deduce it by
taking the time spent in the parent 'apply_config' span and subtracting
all the other child spans.
## Problem
We've got several issues with the current `benchmarks` job setup:
- The `benchmark_durations.json` file (which we generate at runtime to
split tests into several jobs[0]) is not consistent between these
jobs (and even less consistent with the file we get if we rerun the
job). I.e. the test selection for each job can differ, which can result
in missed tests in a test run.
- `scripts/benchmark_durations` doesn't fetch all tests from the
database (it doesn't expect any extra directories inside
`test_runner/performance`)
- For some reason, the current split into 4 groups leaves the 4th group
with no tests to run, which fails the job[1]
- [0] https://github.com/neondatabase/neon/pull/4683
- [1] https://github.com/neondatabase/neon/issues/6629
## Summary of changes
- Generate `benchmark_durations.json` file once before we start
`benchmarks` jobs (this makes it consistent across the jobs) and pass
the file content through the GitHub Actions input (this makes it
consistent for reruns)
- Fix the SQL query in `scripts/benchmark_durations` so that it fetches
all required tests
- Split benchmarks into 5 jobs instead of 4 jobs.
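
To show why a single shared durations file matters, here is a sketch of a deterministic duration-based split. The greedy "longest test to the least loaded group" strategy is an assumption for illustration and is not necessarily how the job assigns tests:

```python
import heapq


def split_by_duration(durations: dict, groups: int) -> list:
    """Assign tests to groups, longest first, always into the lightest group."""
    buckets = [[] for _ in range(groups)]
    heap = [(0.0, i) for i in range(groups)]  # (total duration, group index)
    # Sort by duration descending, then by name, so the result is deterministic.
    for name, duration in sorted(durations.items(), key=lambda kv: (-kv[1], kv[0])):
        total, index = heapq.heappop(heap)
        buckets[index].append(name)
        heapq.heappush(heap, (total + duration, index))
    return buckets
```

Every job that receives the same `durations` mapping computes the same assignment, so each test runs in exactly one group. If each job fetched its own slightly different durations, the assignments could disagree and a test could fall through the gaps.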
When we later introduce a global pool of pre-spawned walredo
processes (https://github.com/neondatabase/neon/issues/6581), this
refactoring avoids plumbing through the reference to the pool to all the
places where we create a broken tenant.
Builds atop the refactoring in #6583
Fix several test flakes:
- test_sharding_service_smoke had log failures on "Dropped LSN updates"
- test_emergency_mode had log failures on a deletion queue shutdown
check, where the check was incorrect because it was expecting channel
receiver to stay alive after cancellation token was fired.
- test_secondary_mode_eviction had racing heatmap uploads because the
test was using a live migration hook to set up locations, where that
migration was itself uploading heatmaps and generally making the
situation more complex than it needed to be.
These are the failure modes that I saw when spot checking the last few
failures of each test.
This will mostly/completely address #6511, but I'll leave that ticket
open for a couple of days and then check if either of the tests named in
that ticket is still flaky.
Related #6511
## Problem
The idea is to keep compute up and running if there are any active
logical replication subscriptions.
### Rationale
- The Write-Ahead Logging (WAL) files, which contain the data changes,
will need to be retained on the publisher side until the subscriber is
able to connect again and apply these changes. This could potentially
lead to increased disk usage on the publisher - and we do not want to
disrupt the source - I think it is more pain for our customer to resolve
storage issues on the source than to pay for the compute at the target.
- Upon resuming the compute resources, the subscriber will start
consuming and applying the changes from the retained WAL files. The time
taken to catch up will depend on the volume of changes and the
configured vCPUs.
- We can avoid explaining complex situations where we lag behind (in
extreme cases we could lag behind hours, days or even months)
- I think an important use case for logical replication from a source is
a one-time migration or release upgrade. In this case the customer would
not mind if we are not suspended for the duration of the migration.
We need to document this in the release notes and the documentation in
the context of logical replication where Neon is the target (subscriber)
### See internal discussion here
https://neondb.slack.com/archives/C04DGM6SMTM/p1706793400746539?thread_ts=1706792628.701279&cid=C04DGM6SMTM
Fix cloning the serialized heatmap on every attempt by turning it into
`bytes::Bytes` before the clone, so that each clone is a refcount bump
instead of a vec clone that only gets refcounted later on.
Also fixes one cancellation token cloning I had missed in #6618.
Cc: #6096
## Problem
We don't have a neat way to carry around migration .sql files during
deploy, and in any case would prefer to avoid depending on diesel CLI to
deploy.
## Summary of changes
- Use `diesel_migrations` crate to embed migrations in our binary
- Run migrations on startup
- Drop the diesel dependency in the `neon_local` binary, as the
attachment_service binary just needs the database to exist. Do database
creation with a simple `createdb`.
Co-authored-by: Arpad Müller <arpad-m@users.noreply.github.com>
The solution we ended up with for `backoff::retry` always requires
cloning the cancellation token, even though it is only `.await`ed. Fix
that, and also turn the return type into `Option<Result<T, E>>`, avoiding
the need for the `E::cancelled()` fn passed in.
Cc: #6096
## Problem
This change was left out of #6585 accidentally -- just forgot to push
the very last version of my branch.
Now that we can load database url from Secrets Manager, we don't always
need it on the CLI any more. We should let the user omit it instead of
passing `--database-url ""`
## Summary of changes
- Make `--database-url` optional
Cleanups from https://github.com/neondatabase/neon/pull/6394
- There was a rogue `*` breaking the `GET /tenant/:tenant_id`, which
passes through to shard zero
- There was a duplicate migrate endpoint
- There are un-prefixed API endpoints that were only needed for compat
tests and can now be removed.
## Problem
Memory profiling under a high concurrent request rate seems to show some
memory fragmentation.
## Summary of changes
Eventually, we will want to separate global memory (caches) from local
memory (per connection handshake and per passthrough).
Using a string interner for project info cache helps reduce some of the
fragmentation of the global cache by having a single heap dedicated to
project strings, and not scattering them throughout all the requests.
At the same time, the interned key is 4 bytes vs the 24 bytes that
`SmolStr` offers.
Important: we should only store verified strings in the interner because
there's no way to remove them afterwards. Good for caching responses
from console.
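
A toy interner illustrates the idea (Python here for brevity; the proxy itself is Rust and this is not its implementation). The class name is hypothetical, and the 4-byte key is modelled by packing the id as a 32-bit integer:

```python
import struct


class Interner:
    """Append-only string interner: equal strings share one small integer id."""

    def __init__(self):
        self._ids = {}      # string -> id
        self._strings = []  # id -> string; entries are never removed

    def intern(self, value: str) -> int:
        existing = self._ids.get(value)
        if existing is not None:
            return existing
        new_id = len(self._strings)
        self._strings.append(value)
        self._ids[value] = new_id
        return new_id

    def resolve(self, key: int) -> str:
        return self._strings[key]


def key_bytes(key: int) -> bytes:
    """Encode an id as the 4-byte key that caches would store."""
    return struct.pack("<I", key)
```

Since nothing is ever removed, feeding unverified input to `intern` would let the table grow without bound, which is why only verified strings should go in.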
Before tenant migration it made sense to leak broken tenants in the
metrics until restart. Nowadays it makes less sense because on
cancellations we set the tenant broken. The set metric still allows
filterable alerting.
Fixes: #6507