
rust/neon

mirror of https://github.com/neondatabase/neon.git synced 2026-07-04 12:40:37 +00:00

Author	SHA1	Message	Date
Rory de Zoete	1d4114183c	Use main, not branch for ref check (#2288 ) * Use main, not branch for ref check * Add more debug * Count main, not head * Try new approach * Conform to syntax * Update approach * Get full history * Skip checkout * Cleanup debug * Remove more debug Co-authored-by: Rory de Zoete <rdezoete@RorysMacStudio.fritz.box>	2022-08-16 15:41:31 +02:00
Rory de Zoete	4cde0e7a37	Error for fatal not git repo (#2286 ) Co-authored-by: Rory de Zoete <rdezoete@RorysMacStudio.fritz.box>	2022-08-16 13:59:41 +02:00
Rory de Zoete	83f7b8ed22	Add missing step output, revert one deploy step (#2285 ) * Add missing step output, revert one deploy step * Conform to syntax * Update approach * Add missing value * Add missing needs Co-authored-by: Rory de Zoete <rdezoete@RorysMacStudio.fritz.box>	2022-08-16 13:41:51 +02:00
Rory de Zoete	b8f0f37de2	Gen2 GH runner (#2128 ) * Re-add rustup override * Try s3 bucket * Set git version * Use v4 cache key to prevent problems * Switch to v5 for key * Add second rustup fix * Rebase * Add kaniko steps * Fix typo and set compress level * Disable global run default * Specify shell for step * Change approach with kaniko * Try less verbose shell spec * Add submodule pull * Add promote step * Adjust dependency chain * Try default swap again * Use env * Don't override aws key * Make kaniko build conditional * Specify runs on * Try without dependency link * Try soft fail * Use image with git * Try passing to next step * Fix duplicate * Try other approach * Try other approach * Fix typo * Try other syntax * Set env * Adjust setup * Try step 1 * Add link * Try global env * Fix mistake * Debug * Try other syntax * Try other approach * Change order * Move output one step down * Put output up one level * Try other syntax * Skip build * Try output * Re-enable build * Try other syntax * Skip middle step * Update check * Try first step of dockerhub push * Update needs dependency * Try explicit dir * Add missing package * Try other approach * Try other approach * Specify region * Use with * Try other approach * Add debug * Try other approach * Set region * Follow AWS example * Try github approach * Skip Qemu * Try stdin * Missing steps * Add missing close * Add echo debug * Try v2 endpoint * Use v1 endpoint * Try without quotes * Revert * Try crane * Add debug * Split steps * Fix duplicate * Add shell step * Conform to options * Add verbose flag * Try single step * Try workaround * First request fails hunch * Try bullseye image * Try other approach * Adjust verbose level * Try previous step * Add more debug * Remove debug step * Remove rogue indent * Try with larger image * Add build tag step * Update workflow for testing * Add tag step for test * Remove unused * Update dependency chain * Add ownership fix * Use matrix for promote * Force update * Force build * Remove unused * Add new image * Add missing argument * Update dockerfile copy * Update Dockerfile * Update clone * Update dockerfile * Go to correct folder * Use correct format * Update dockerfile * Remove cd * Debug find where we are * Add debug on first step * Changedir to postgres * Set workdir * Use v1 approach * Use other dependency * Try other approach * Try other approach * Update dockerfile * Update approach * Update dockerfile * Update approach * Update dockerfile * Update dockerfile * Add workspace hack * Update Dockerfile * Update Dockerfile * Update Dockerfile * Change last step * Cleanup pull in prep for review * Force build images * Add condition for latest tagging * Use pinned version * Try without name value * Remove more names * Shorten names * Add kaniko comments * Pin kaniko * Pin crane and ecr helper * Up one level * Switch to pinned tag for rust image * Force update for test Co-authored-by: Rory de Zoete <rdezoete@RorysMacStudio.fritz.box> Co-authored-by: Rory de Zoete <rdezoete@b04468bf-cdf4-41eb-9c94-aff4ca55e4bf.fritz.box> Co-authored-by: Rory de Zoete <rdezoete@Rorys-Mac-Studio.fritz.box> Co-authored-by: Rory de Zoete <rdezoete@4795e9ee-4f32-401f-85f3-f316263b62b8.fritz.box> Co-authored-by: Rory de Zoete <rdezoete@2f8bc4e5-4ec2-4ea2-adb1-65d863c4a558.fritz.box> Co-authored-by: Rory de Zoete <rdezoete@27565b2b-72d5-4742-9898-a26c9033e6f9.fritz.box> Co-authored-by: Rory de Zoete <rdezoete@ecc96c26-c6c4-4664-be6e-34f7c3f89a3c.fritz.box> Co-authored-by: Rory de Zoete <rdezoete@7caff3a5-bf03-4202-bd0e-f1a93c86bdae.fritz.box>	2022-08-16 11:15:35 +02:00
Kirill Bulatov	18f251384d	Check for entire range during sasl validation (#2281 )	2022-08-16 11:10:38 +03:00
Alexander Bayandin	4cddb0f1a4	Set up a workflow to run pgbench against captest (#2077 )	2022-08-15 18:54:31 +01:00
Arseny Sher	7b12deead7	Bump vendor/postgres to include XLP_FIRST_IS_CONTRECORD fix. (#2274 )	2022-08-15 18:24:24 +03:00
Dmitry Rodionov	63a72d99bb	increase timeout in wait_for_upload to avoid spurious failures when testing with real s3	2022-08-15 18:02:27 +03:00
Arthur Petukhovsky	116ecdf87a	Improve walreceiver logic (#2253 ) This patch makes walreceiver logic more complicated, but it should work better in most cases. Added `test_wal_lagging` to test scenarios where alive safekeepers can lag behind other alive safekeepers. - There was a bug: the filter `etcd_info.timeline.commit_lsn > Some(self.local_timeline.get_last_record_lsn())` excluded all safekeepers in some unusual cases. I removed this filter, which should probably help with #2237 - Now walreceiver_connection reports status, including commit_lsn. This allows keeping the safekeeper connection even when etcd is down. - Safekeeper connection now fails if pageserver doesn't receive safekeeper messages for some time. Usually safekeeper sends messages at least once per second. - `LaggingWal` check now uses `commit_lsn` directly from safekeeper. This fixes the issue of frequent reconnects when compute generates WAL really fast. - `NoWalTimeout` is rewritten to trigger only when we know about the new WAL and the connected safekeeper doesn't stream any WAL. This allows setting a small `lagging_wal_timeout` because it will trigger only when we observe that the connected safekeeper is stuck.	2022-08-15 13:31:26 +03:00
Arseny Sher	431393e361	Find end of WAL on safekeepers using WalStreamDecoder. We could make it inside wal_storage.rs, but taking into account that - wal_storage.rs reading is async - we don't need s3 here - error handling is different; error during decoding is normal I decided to put it separately. Test cargo test test_find_end_of_wal_last_crossing_segment prepared earlier by @yeputons passes now. Fixes https://github.com/neondatabase/neon/issues/544 https://github.com/neondatabase/cloud/issues/2004 Supersedes https://github.com/neondatabase/neon/pull/2066	2022-08-14 14:47:14 +03:00
Kirill Bulatov	f38f45b01d	Better storage sync logs (#2268 )	2022-08-13 10:58:14 +03:00
Andrey Taranik	a5154dce3e	get_binaries script fix (#2263 ) * get_binaries uses DOCKER_TAG taken from docker image build step * remove docker tag discovery at all and fix get_binaries for version variable	2022-08-12 20:35:26 +03:00
Alexander Bayandin	da5f8486ce	test_runner/pg_clients: collect docker logs (#2259 )	2022-08-12 17:03:09 +01:00
Dmitry Ivanov	ad08c273d3	[proxy] Rework wire format of the password hack and some errors (#2236 ) The new format has a few benefits: it's shorter, simpler and human-readable as well. We don't use base64 anymore, since url encoding got us covered. We also show a better error in case we couldn't parse the payload; the users should know it's all about passing the correct project name.	2022-08-12 17:38:43 +03:00
Andrey Taranik	7f97269277	get_binaries uses DOCKER_TAG taken from docker image build step (#2260 )	2022-08-12 16:01:22 +03:00
Thang Pham	6d99b4f1d8	disable `test_import_from_pageserver_multisegment` (#2258 ) This test failed consistently on `main` now. It's better to temporarily disable it to avoid blocking others' PRs while investigating the root cause for the test failure. See: #2255, #2256	2022-08-12 19:13:42 +07:00
Egor Suvorov	a7bf60631f	postgres_ffi/waldecoder: introduce explicit `enum State` Previously it was emulated with a combination of nullable fields. This change should make the logic more readable.	2022-08-12 11:40:46 +03:00
Egor Suvorov	07bb7a2afe	postgres_ffi/waldecoder: remove unused startlsn	2022-08-12 11:40:46 +03:00
Egor Suvorov	142e247e85	postgres_ffi/waldecoder: validate more header fields	2022-08-12 11:40:46 +03:00
Thang Pham	7da47d8a0a	Fix timeline physical size flaky tests (#2244 ) Resolves #2212. - use `wait_for_last_flush_lsn` in `test_timeline_physical_size_` tests ## Context Need to wait for the pageserver to catch up with the compute's last flush LSN because during the timeline physical size API call, it's possible that there are running `LayerFlushThread` threads. These threads flush new layers to disk and hence update the physical size. This results in a mismatch between the physical size reported by the API and the actual physical size on disk. ### Note The `LayerFlushThread` threads are processed concurrently, so it's possible that the above error still persists even with this patch. However, making the tests wait to finish processing all the WALs (not flushing) before calculating the physical size should help reduce the "flakiness" significantly	2022-08-12 14:28:50 +07:00
Thang Pham	dc52436a8f	Fix bug when importing large (>1GB) relations (#2172 ) Resolves #2097 - use timeline modification's `lsn` and timeline's `last_record_lsn` to determine the corresponding LSN to query data in `DatadirModification::get` - update `test_import_from_pageserver`. Split the test into 2 variants: `small` and `multisegment`. + `small` is the old test + `multisegment` is to simulate #2097 by using a larger number of inserted rows to create multiple segment files of a relation. `multisegment` is configured to only run with a `release` build	2022-08-12 09:24:20 +07:00
Kirill Bulatov	995a2de21e	Share exponential backoff code and fix logic for delete task failure (#2252 )	2022-08-11 23:21:06 +03:00
Arseny Sher	e593cbaaba	Add pageserver checkpoint_timeout option. To flush inmemory layer eventually when no new data arrives, which helps safekeepers to suspend activity (stop pushing to the broker). Default 10m should be ok.	2022-08-11 22:54:09 +03:00
Heikki Linnakangas	4b9e02be45	Update `vendor/postgres` back; it was changed accidentally. (#2251 ) Commit `4227cfc96e` accidentally reverted vendor/postgres to an older version. Update it back.	2022-08-11 19:25:08 +03:00
Kirill Bulatov	7a36d06cc2	Fix exponential backoff values	2022-08-11 08:34:57 +03:00
Konstantin Knizhnik	4227cfc96e	Safe truncate (#2218 ) * Move relation size cache to layered timeline * Fix obtaining current LSN for relation size cache * Resolve merge conflicts * Resolve merge conflicts * Restore 'lsn' field in DatadirModification * adjust DatadirModification lsn in ingest_record * Fix formatting * Pass lsn to get_relsize * Fix merge conflict * Update pageserver/src/pgdatadir_mapping.rs Co-authored-by: Heikki Linnakangas <heikki@zenith.tech> * Update pageserver/src/pgdatadir_mapping.rs Co-authored-by: Heikki Linnakangas <heikki@zenith.tech> * Check if relation exists before trying to truncate it refer #1932 * Add test reproducing FSM truncate problem Co-authored-by: Heikki Linnakangas <heikki@zenith.tech>	2022-08-09 22:45:33 +03:00
Dmitry Rodionov	1fc761983f	support node id and remote storage params in docker_entrypoint.sh	2022-08-09 18:59:00 +03:00
Stas Kelvich	227d47d2f3	Update CONTRIBUTING.md	2022-08-09 14:18:25 +03:00
Stas Kelvich	0290893bcc	Update CONTRIBUTING.md	2022-08-09 14:18:25 +03:00
Heikki Linnakangas	32fd709b34	Fix links to safekeeper protocol docs. (#2188 ) safekeeper/README_PROTO.md was moved to docs/safekeeper-protocol.md in commit `0b14fdb078`, as part of reorganizing the docs into 'mdbook' format. Fixes issue #1475. Thanks to @banks for spotting the outdated references. In addition to fixing the above issue, this patch also fixes other broken links as a result of `0b14fdb078`. See https://github.com/neondatabase/neon/pull/2188#pullrequestreview-1055918480. Co-authored-by: Heikki Linnakangas <heikki@neon.tech> Co-authored-by: Thang Pham <thang@neon.tech>	2022-08-09 10:19:18 +07:00
Kirill Bulatov	3a9bff81db	Fix etcd typos	2022-08-08 19:04:46 +03:00
bojanserafimov	743370de98	Major migration script (#2073 ) This script can be used to migrate a tenant across breaking storage versions, or (in the future) upgrading postgres versions. See the comment at the top for an overview. Co-authored-by: Anastasia Lubennikova <anastasia@neon.tech>	2022-08-08 17:52:28 +02:00
Dmitry Rodionov	cdfa9fe705	avoid duplicate parameter, increase timeout	2022-08-08 12:15:16 +03:00
Dmitry Rodionov	7cd68a0c27	increase timeout to pass test with real s3	2022-08-08 12:15:16 +03:00
Dmitry Rodionov	beaa991f81	remove debug log	2022-08-08 12:15:16 +03:00
Dmitry Rodionov	9430abae05	use event so it fires only if workload thread successfully finished	2022-08-08 12:15:16 +03:00
Dmitry Rodionov	4da4c7f769	increase statement timeout	2022-08-08 12:15:16 +03:00
Dmitry Rodionov	0d14d4a1a8	ignore record property warning to fix benchmarks	2022-08-08 12:15:16 +03:00
bojanserafimov	8c8431ebc6	Add more buckets to pageserver latency metrics (#2225 )	2022-08-06 11:45:47 +02:00
Ankur Srivastava	84d1bc06a9	refactor: replace lazy-static with once-cell (#2195 ) - Replacing all the occurrences of lazy-static with `once-cell::sync::Lazy` - fixes #1147 Signed-off-by: Ankur Srivastava <best.ankur@gmail.com>	2022-08-05 19:34:04 +02:00
Konstantin Knizhnik	5133db44e1	Move relation size cache from WalIngest to DatadirTimeline (#2094 ) * Move relation size cache to layered timeline * Fix obtaining current LSN for relation size cache * Resolve merge conflicts * Resolve merge conflicts * Restore 'lsn' field in DatadirModification * adjust DatadirModification lsn in ingest_record * Fix formatting * Pass lsn to get_relsize * Fix merge conflict * Update pageserver/src/pgdatadir_mapping.rs Co-authored-by: Heikki Linnakangas <heikki@zenith.tech> * Update pageserver/src/pgdatadir_mapping.rs Co-authored-by: Heikki Linnakangas <heikki@zenith.tech> Co-authored-by: Heikki Linnakangas <heikki@zenith.tech>	2022-08-05 16:28:59 +03:00
Alexander Bayandin	4cb1074fe5	github/workflows: Fix git dubious ownership (#2223 )	2022-08-05 13:44:57 +01:00
Arthur Petukhovsky	0a958b0ea1	Check find_end_of_wal errors instead of unwrap	2022-08-04 17:56:19 +03:00
Vadim Kharitonov	1bbc8090f3	[issue #1591 ] Add `neon_local pageserver status` handler	2022-08-04 16:38:29 +03:00
Dmitry Rodionov	f7d8db7e39	silence https://github.com/neondatabase/neon/issues/2211	2022-08-04 16:32:19 +03:00
Dmitry Rodionov	e54941b811	treat pytest warnings as errors	2022-08-04 16:32:19 +03:00
Heikki Linnakangas	52ce1c9d53	Speed up test shutdown, by polling more frequently. A fair amount of the time in our python tests is spent waiting for the pageserver and safekeeper processes to shut down. It doesn't matter so much when you're running a lot of tests in parallel, but it's quite noticeable when running them sequentially. A big part of the slowness is that after sending the SIGTERM signal, we poll to see if the process is still running, and the polling happened at 1 s intervals. Reduce it to 0.1 s.	2022-08-04 12:57:15 +03:00
Dmitry Rodionov	bc2cb5382b	run real s3 tests in CI	2022-08-04 11:14:05 +03:00
Dmitry Rodionov	5f71aa09d3	support running tests against real s3 implementation without mocking	2022-08-04 11:14:05 +03:00
Dmitry Rodionov	b4f2c5b514	run benchmarks conditionally, on main or if run_benchmarks label is set	2022-08-03 01:36:14 +03:00
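
Commit a7bf60631f in the list above describes replacing state that was emulated with a combination of nullable fields by an explicit `enum State`. The Rust sketch below illustrates that general pattern only. It is not the waldecoder code, and every name in it (`NullableDecoder`, the `State` variants, `describe`, `advance`) is a hypothetical stand-in chosen for illustration.

```rust
// Hypothetical sketch: none of these names are taken from postgres_ffi/waldecoder.

/// "Before" style: progress is emulated with nullable fields.
/// Nothing prevents both fields from being `Some` at the same time.
#[derive(Debug, Default)]
struct NullableDecoder {
    /// Bytes still missing from the record being assembled, if any.
    remaining: Option<usize>,
    /// Bytes to skip before the next record, if any.
    skip: Option<usize>,
}

impl NullableDecoder {
    fn describe(&self) -> &'static str {
        match (self.remaining, self.skip) {
            (None, None) => "waiting for header",
            (Some(_), None) => "reading record",
            (None, Some(_)) => "skipping padding",
            // A combination the type allows but the logic never intends.
            (Some(_), Some(_)) => "invalid: both set",
        }
    }
}

/// "After" style: one explicit enum, each variant carrying only its own data.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum State {
    WaitingForHeader,
    ReadingRecord { remaining: usize },
    SkippingPadding { skip: usize },
}

impl State {
    fn describe(&self) -> &'static str {
        match self {
            State::WaitingForHeader => "waiting for header",
            State::ReadingRecord { .. } => "reading record",
            State::SkippingPadding { .. } => "skipping padding",
        }
    }

    /// Consume `n` bytes and return the next state.
    fn advance(self, n: usize) -> State {
        match self {
            State::WaitingForHeader => State::WaitingForHeader,
            State::ReadingRecord { remaining } if n < remaining => {
                State::ReadingRecord { remaining: remaining - n }
            }
            State::SkippingPadding { skip } if n < skip => {
                State::SkippingPadding { skip: skip - n }
            }
            // The record or padding has been fully consumed.
            _ => State::WaitingForHeader,
        }
    }
}
```

With the enum, the "both set" case cannot be constructed at all, and the compiler checks that every `match` handles each variant, which is one plausible reading of why the commit calls the result more readable.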


2373 Commits