Removes some unnecessary initdb arguments, and fixes Neon for MacOS
since it doesn't seem to ship a C.UTF-8 locale.
Signed-off-by: Tristan Partin <tristan@neon.tech>
Right now, our environments create databases with the C locale, which is
really unfortunate for users who have data stored in other languages
that they want to analyze. For instance, show_trgm on Hebrew text
currently doesn't work in staging or production.
I don't envision this being the final solution. I think this is just a
way to set a known value so the pageserver doesn't use its parent
environment. The final solution to me is exposing initdb parameters to
users in the console. Then they could use a different locale or encoding
if they so chose.
Signed-off-by: Tristan Partin <tristan@neon.tech>
## Problem
In test environments, the `syncfs` that the pageserver does on startup
can take a long time, as other tests running concurrently might have
many gigabytes of dirty pages.
## Summary of changes
- Add a `no_sync` option to the pageserver's config.
- Skip syncfs on startup if this is set
- A subsequent PR (https://github.com/neondatabase/neon/pull/9678) will
enable this by default in tests. We need to wait until after the next
release to avoid breaking compat tests, which would fail if we set
no_sync & use an old pageserver binary.
Q: Why is this a different mechanism than safekeeper, which as a
--no-sync CLI?
A: Because the way we manage pageservers in neon_local depends on the
pageserver.toml containing the full configuration, whereas safekeepers
have a config file which is neon-local-specific and can drive a CLI
flag.
Q: Why is the option no_sync rather than sync?
A: For boolean configs with a dangerous value, it's preferable to make
"false" the safe option, so that any downstream future config tooling
that might have a "booleans are false by default" behavior (e.g. golang
structs) is safe by default.
Q: Why only skip the syncfs, and not all fsyncs?
A: Skipping all fsyncs would require more code changes, and the most
acute problem isn't fsyncs themselves (these just slow down a running
test), it's the syncfs (which makes a pageserver startup slow as a
result of _other_ tests)
## Problem
Part of https://github.com/neondatabase/neon/issues/8623
## Summary of changes
Removed all aux-v1 config processing code. Note that we persisted it
into the index part file, so we cannot really remove the field from
index part. I also kept the config item within the tenant config, but we
will not read it any more.
---------
Signed-off-by: Alex Chi Z <chi@neon.tech>
Adds a configuration variable for timeline offloading support. The added
pageserver-global config option controls whether the pageserver
automatically offloads timelines during compaction.
Therefore, already offloaded timelines are not affected by this, nor is
the manual testing endpoint.
This allows the rollout of timeline offloading to be driven by the
storage team.
Part of #8088
## Problem
We need a way to incrementally switch to direct IO. During the rollout
we might want to switch to O_DIRECT on image and delta layer read path
first before others.
## Summary of changes
- Revisited and simplified direct io config in `PageserverConf`.
- We could add a fallback mode for open, but for read there isn't a
reasonable alternative (without creating another buffered virtual file).
- Added a wrapper around `VirtualFile`, current implementation become
`VirtualFileInner`
- Use `open_v2`, `create_v2`, `open_with_options_v2` when we want to use
the IO mode specified in PS config.
- Once we onboard all IO through VirtualFile using this new API, we will
delete the old code path.
- Make io mode live configurable for benchmarking.
- Only guaranteed for files opened after the config change, so do it
before the experiment.
As an example, we are using `open_v2` with
`virtual_file::IoMode::Direct` in
https://github.com/neondatabase/neon/pull/9169
We also remove `io_buffer_alignment` config in
a04cfd754b and use it as a compile time
constant. This way we don't have to carry the alignment around or make
frequent call to retrieve this information from the static variable.
Signed-off-by: Yuchen Liang <yuchen@neon.tech>
## Problem
`Oversized vectored read [...]` logs are spewing in prod because we have
a few keys that
are unexpectedly large:
* reldir/relblock - these are unbounded, so it's known technical debt
* slru block - they can be a bit bigger than 128KiB due to storage
format overhead
## Summary of changes
* Bump threshold to 130KiB
* Don't warn on oversized reldir and dbdir keys
Closes https://github.com/neondatabase/neon/issues/8967
This constant in 'tenant_conf_defaults' was unused, but there's
another constant with the same name in the global 'defaults'. I wish
the setting was configurable per-tenant, but it isn't, so let's remove
the confusing duplicate.
The DEFAULT_CONCURRENT_TENANT_SIZE_LOGICAL_SIZE_QUERIES constant was
unused, because we had just hardcoded it to 1 where the constant
should've been used.
Remove the ConfigurableSemaphore::Default implementation, since it was
unused.
This PR simplifies the pageserver configuration parsing as follows:
* introduce the `pageserver_api::config::ConfigToml` type
* implement `Default` for `ConfigToml`
* use serde derive to do the brain-dead leg-work of processing the toml
document
* use `serde(default)` to fill in default values
* in `pageserver` crate:
* use `toml_edit` to deserialize the pageserver.toml string into a
`ConfigToml`
* `PageServerConfig::parse_and_validate` then
* consumes the `ConfigToml`
* destructures it exhaustively into its constituent fields
* constructs the `PageServerConfig`
The rules are:
* in `ConfigToml`, use `deny_unknown_fields` everywhere
* static default values go in `pageserver_api`
* if there cannot be a static default value (e.g. which default IO
engine to use, because it depends on the runtime), make the field in
`ConfigToml` an `Option`
* if runtime-augmentation of a value is needed, do that in
`parse_and_validate`
* a good example is `virtual_file_io_engine` or `l0_flush`, both of
which need to execute code to determine the effective value in
`PageServerConf`
The benefits:
* massive amount of brain-dead repetitive code can be deleted
* "unused variable" compile-time errors when removing a config value,
due to the exhaustive destructuring in `parse_and_validate`
* compile-time errors guide you when adding a new config field
Drawbacks:
* serde derive is sometimes a bit too magical
* `deny_unknown_fields` is easy to miss
Future Work / Benefits:
* make `neon_local` use `pageserver_api` to construct `ConfigToml` and
write it to `pageserver.toml`
* This provides more type safety / coompile-time errors than the current
approach.
### Refs
Fixes#3682
### Future Work
* `remote_storage` deser doesn't reject unknown fields
https://github.com/neondatabase/neon/issues/8915
* clean up `libs/pageserver_api/src/config.rs` further
* break up into multiple files, at least for tenant config
* move `models` as appropriate / refine distinction between config and
API models / be explicit about when it's the same
* use `pub(crate)` visibility on `mod defaults` to detect stale values
This is the first step towards representing all of Pageserver
configuration as clean `serde::Serialize`able Rust structs in
`pageserver_api`.
The `neon_local` code will then use those structs instead of the crude
`toml_edit` / string concatenation that it does today.
refs https://github.com/neondatabase/neon/issues/7555
---------
Co-authored-by: Alex Chi Z <iskyzh@gmail.com>