The warning:
warning: ISO C90 forbids mixed declarations and code [-Wdeclaration-after-statement]
It's PostgreSQL project style to stick to the old C90 style.
(Alternatively, we could disable it for our extension.)
Part of the cleanup issue #9217.
Silences these compiler warnings:
/pgxn/neon_walredo/walredoproc.c:452:1: warning: ‘CreateFakeSharedMemoryAndSemaphores’ was used with no prototype before its definition [-Wmissing-prototypes]
452 | CreateFakeSharedMemoryAndSemaphores()
| ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
/pgxn/neon/walproposer_pg.c:541:1: warning: no previous prototype for ‘GetWalpropShmemState’ [-Wmissing-prototypes]
541 | GetWalpropShmemState()
| ^~~~~~~~~~~~~~~~~~~~
Part of the cleanup issue #9217.
In PostgreSQL v16, BUFFERTAGS_EQUAL was replaced with a static inline
macro, BufferTagsEqual. Let's use the new name going forward, and have
backwards-compatibility glue to allow using the new name on v14 and v15,
rather than the other way round. This also makes BufferTagsEquals
consistent with InitBufferTag, for which we were already using the new
name.
Not used in production, but in benchmarks, to demonstrate minimal RTT.
(It would be nice to not have to copy the 8KiB of zeroes, but, that
would require larger protocol changes).
Found this useful in investigation
https://github.com/neondatabase/neon/pull/8952.
This adds preliminary PG17 support to Neon, based on RC1 / 2024-09-04
07b828e9d4
NOTICE: The data produced by the included version of the PostgreSQL fork
may not be compatible with the future full release of PostgreSQL 17 due to
expected or unexpected future changes in magic numbers and internals.
DO NOT EXPECT DATA IN V17-TENANTS TO BE COMPATIBLE WITH THE 17.0
RELEASE!
Co-authored-by: Anastasia Lubennikova <anastasia@neon.tech>
Co-authored-by: Alexander Bayandin <alexander@neon.tech>
Co-authored-by: Konstantin Knizhnik <knizhnik@neon.tech>
Co-authored-by: Heikki Linnakangas <heikki@neon.tech>
I saw this compiler warning on my laptop:
pgxn/neon_walredo/walredoproc.c:178:10: warning: using the result of an
assignment as a condition without parentheses [-Wparentheses]
if (err = close_range_syscall(3, ~0U, 0)) {
~~~~^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
pgxn/neon_walredo/walredoproc.c:178:10: note: place parentheses around
the assignment to silence this warning
if (err = close_range_syscall(3, ~0U, 0)) {
^
( )
pgxn/neon_walredo/walredoproc.c:178:10: note: use '==' to turn this
assignment into an equality comparison
if (err = close_range_syscall(3, ~0U, 0)) {
^
==
1 warning generated.
I'm not sure what compiler version or options cause that, but it's a
good warning. Write the call a little differently, to avoid the warning
and to make it a little more clear anyway. (The 'err' variable wasn't
used for anything, so I'm surprised we were not seeing a compiler
warning on the unused value, too.)
I was getting an error:
/home/heikki/git-sandbox/neon//pgxn/neon_walredo/walredoproc.c:161:5: error: conflicting types for ‘close_range’; have ‘int(unsigned int, unsigned int, unsigned int)’
161 | int close_range(unsigned int start_fd, unsigned int count, unsigned int flags) {
| ^~~~~~~~~~~
In file included from /usr/include/x86_64-linux-gnu/bits/sigstksz.h:24,
from /usr/include/signal.h:328,
from /home/heikki/git-sandbox/neon//pgxn/neon_walredo/walredoproc.c:50:
/usr/include/unistd.h:1208:12: note: previous declaration of ‘close_range’ with type ‘int(unsigned int, unsigned int, int)’
1208 | extern int close_range (unsigned int __fd, unsigned int __max_fd,
| ^~~~~~~~~~~
The discrepancy is in the 3rd argument. Apparently in the glibc
wrapper it's signed.
As a quick fix, rename our close_range() function, the one that calls
syscall() directly, to avoid the clash with the glibc wrapper. In the
long term, an autoconf test would be nice, and some equivalent on
macOS, see issue #6580.
At the end of ApplyRecord(), we called pfree on the decoded record, if
it was "oversized". However, we had alread linked it to the "decode
queue" list in XLogReaderState. If we later called XLogBeginRead(), it
called ResetDecoder and tried to free the same record again.
The conditions to hit this are:
- a large WAL record (larger than aboue 64 kB I think, per
DEFAULT_DECODE_BUFFER_SIZE), and
- another WAL record processed by the same WAL redo process after the
large one.
I think the reason we haven't seen this earlier is that you don't get
WAL records that large that are sent to the WAL redo process, except
when logical replication is enabled. Logical replication adds data to
the WAL records, making them larger.
To fix, allocate the buffer ourselves, and don't link it to the decode
queue. Alternatively, we could perhaps have just removed the pfree(),
but frankly I'm a bit scared about the whole queue thing.
The rust stdlib uses the efficient `posix_spawn` by default.
However, before this PR, pageserver used `pre_exec()` in our
`close_fds()` ext trait.
This PR moves the work that `close_fds()` did to the walredo C code.
I verified manually using `gdb` that we're now forking out the walredo
process using `posix_spawn`.
refs https://github.com/neondatabase/neon/issues/6565
## Problem
See
https://neondb.slack.com/archives/C036U0GRMRB/p1698652221399419?thread_ts=1698438997.903919&cid=C036U0GRMRB
## Summary of changes
Check if record pointer is not NULL before trying to print record
descriptor
## Checklist before requesting a review
- [ ] I have performed a self-review of my code.
- [ ] If it is a core feature, I have added thorough tests.
- [ ] Do we need to implement analytics? if so did you add the relevant
metrics to the dashboard?
- [ ] If this PR requires public announcement, mark it with
/release-notes label and add several sentences in this section.
## Checklist before merging
- [ ] Do not forget to reformat commit message to not include the above
checklist
Co-authored-by: Konstantin Knizhnik <knizhnik@neon.tech>
This fixes issues in pageserver's walredo process where WALRedo
logs of loglevel=LOG are interpreted as errors.
## Problem
See #5560
## Summary of changes
Set the log level to something that doesn't include LOG.
This adds PostgreSQL 16 as a vendored postgresql version, and adapts the
code to support this version.
The important changes to PostgreSQL 16 compared to the PostgreSQL 15
changeset include the addition of a neon_rmgr instead of altering Postgres's
original WAL format.
Co-authored-by: Alexander Bayandin <alexander@neon.tech>
Co-authored-by: Heikki Linnakangas <heikki@neon.tech>
Aarch64 doesn't implement some old syscalls like open and select. Use
openat instead of open to check if seccomp is supported. Leave both
select and pselect6 in the allowlist since we don't call select syscall
directly and may hope that libc will call pselect6 on aarch64.
To check whether some syscall is supported it is possible to use
`scmp_sys_resolver` from seccopm package:
```
> apt install seccopm
> scmp_sys_resolver -a x86_64 select
23
> scmp_sys_resolver -a aarch64 select
-10101
> scmp_sys_resolver -a aarch64 pselect6
72
```
Negative value means that syscall is not supported.
Another cross-check is to look up for the actuall syscall table in
`unistd.h`. To resolve all the macroses one can use `gcc -E` as it is
done in `dump_sys_aarch64()` function in
libseccomp/src/arch-syscall-validate.
---------
Co-authored-by: Heikki Linnakangas <heikki@neon.tech>
Aarch64 doesn't implement some old syscalls like open and select. Use
openat instead of open to check if seccomp is supported. Leave both
select and pselect6 in the allowlist since we don't call select syscall
directly and may hope that libc will call pselect6 on aarch64.
To check whether some syscall is supported it is possible to use
`scmp_sys_resolver` from seccopm package:
```
> apt install seccopm
> scmp_sys_resolver -a x86_64 select
23
> scmp_sys_resolver -a aarch64 select
-10101
> scmp_sys_resolver -a aarch64 pselect6
72
```
Negative value means that syscall is not supported.
Another cross-check is to look up for the actuall syscall table
in `unistd.h`. To resolve all the macroses one can use `gcc -E` as
it is done in `dump_sys_aarch64()` function in
libseccomp/src/arch-syscall-validate.
* Stop allocating and maintaining 128MB hash table for last written
LSN cache as it is not needed in wal-redo.
* Do not require access to the initialized data directory. That
saves few dozens megabytes of empty but initialized data directory.
Currently such directories do occupy about 10% of the disk space
on the pageservers as most of tenants are empty.
* Move shmem-initialization code to the extension instead of postgres
- Refactor the way the WalProposerMain function is called when started
with --sync-safekeepers. The postgres binary now explicitly loads
the 'neon.so' library and calls the WalProposerMain in it. This is
simpler than the global function callback "hook" we previously used.
- Move the WAL redo process code to a new library, neon_walredo.so,
and use the same mechanism as for --sync-safekeepers to call the
WalRedoMain function, when launched with --walredo argument.
- Also move the seccomp code to neon_walredo.so library. I kept the
configure check in the postgres side for now, though.