rust/neon

mirror of https://github.com/neondatabase/neon.git synced 2026-07-04 04:30:38 +00:00

Author	SHA1	Message	Date
Kunpeng Fan	2ea4117903	unset current dir for safekeeper	2022-12-19 16:54:24 +02:00
Christian Schwarz	c785a516aa	remove TimelineInfo.{Remote,Local} along with their types follow-up of https://github.com/neondatabase/neon/pull/2615 which is neon.git: `538876650a` must be deployed after cloud.git change https://github.com/neondatabase/cloud/issues/3232 fixes https://github.com/neondatabase/neon/issues/3041	2022-12-19 14:37:40 +01:00
Heikki Linnakangas	e23d5da51c	Tidy up and add comments to the pageserver startup code. To make it more readable.	2022-12-19 14:03:22 +02:00
Alexander Bayandin	12e6f443da	test_perf_pgbench: switch to server-side data generation (#3058) To offload the network and reduce its impact, I suggest switching to server-side data generation for the pgbench initialize workflow.	2022-12-18 00:02:04 +00:00
Dmitry Ivanov	61194ab2f4	Update rust-postgres everywhere I've rebased[1] Neon's fork of rust-postgres to incorporate latest upstream changes (including dependabot's fixes), so we need to advance revs here as well. [1] https://github.com/neondatabase/rust-postgres/commits/neon	2022-12-17 00:26:10 +03:00
MMeent	3514e6e89a	Use neon_nblocks instead of get_cached_relsize (#3132) This prevents us from overwriting all blocks of a relation when we extend the relation without first caching the size - get_cached_relsize does not guarantee a correct result when it returns `false`.	2022-12-16 21:14:57 +01:00
Dmitry Ivanov	83baf49487	[proxy] Forward compute connection params to client This fixes all kinds of problems related to missing params, like broken timestamps (due to `integer_datetimes`). This solution is not ideal, but it will help. Meanwhile, I'm going to dedicate some time to improving connection machinery. Note that this does not fix problems with passing certain parameters in a reverse direction, i.e. from client to compute. This is a separate matter and will be dealt with in an upcoming PR.	2022-12-16 21:37:50 +03:00
Alexander Bayandin	64775a0a75	test_runner/performance: fix flush for NeonCompare (#3135) Fix performance tests: ``` AttributeError: 'NeonCompare' object has no attribute 'pageserver_http' ```	2022-12-16 17:45:38 +00:00
Joonas Koivunen	c86c0c08ef	task_mgr: use CancellationToken instead of shutdown_rx (#3124) this should help us in the future to have more freedom with spawning tasks and cancelling things, most importantly blocking tasks (assuming the CancellationToken::is_cancelled is performant enough). CancellationToken allows creation of hierarchical cancellations, which would also simplify the task_mgr shutdown operation, rendering it unnecessary.	2022-12-16 17:19:47 +02:00
Alexander Bayandin	8d39fcdf72	pgbench-compare: don't run neon-captest-new (#3130) Do not run Nightly Benchmarks on `neon-captest-new`. This is a temporary solution to avoid spikes in the storage we consume during the test run. To collect data for the default instance, we could run tests weekly (i.e. not daily).	2022-12-16 13:23:36 +00:00
Joonas Koivunen	b688a538e3	fix(remote_storage): use cached credentials (#3128) IMDSv2 has limits, and if we query it on every s3 interaction we are going to go over those limits. Changes the s3_bucket client configuration to use: - ChainCredentialsProvider to handle env variables or imds usage - LazyCachingCredentialsProvider to actually cache any credentials Related: https://github.com/awslabs/aws-sdk-rust/issues/629 Possibly related: https://github.com/neondatabase/neon/issues/3118	2022-12-16 13:40:01 +02:00
Arseny Sher	e14bbb889a	Enable broker client keepalives. (#3127) Should fix stale connections. ref https://github.com/neondatabase/neon/issues/3108	2022-12-16 11:55:12 +02:00
Heikki Linnakangas	c262390214	Don't upload index file when GC doesn't remove anything. I saw an excessive number of index file upload operations in production, even when nothing on the timeline changes. It was because our GC schedules index file upload if the GC cutoff LSN is advanced, even if the GC had nothing else to do. The GC cutoff LSN marches steadily forwards, even when there is no user activity on the timeline, when the cutoff is determined by the time-based PITR interval setting. To dial that down, only schedule index file upload when GC is about to actually remove something.	2022-12-16 11:05:55 +02:00
Heikki Linnakangas	6dec85b19d	Redefine the timeline_gc API to not perform a forced compaction Previously, the /v1/tenant/:tenant_id/timeline/:timeline_id/do_gc API call performed a flush and compaction on the timeline before GC. Change it not to do that, and change all the tests that used that API to perform compaction explicitly. The compaction happens at a slightly different point now. Previously, the code performed the `refresh_gc_info_internal` step first, and only then did compaction on all the timelines. I don't think that was what was originally intended here. Presumably the idea with compaction was to make some old layer files available for GC. But if we're going to flush the current in-memory layer to disk, surely you would want to include the newly-written layer in the compaction too. I guess this didn't make any difference to the tests in practice, but in any case, the tests now perform the flush and compaction before any of the GC steps. Some of the tests might not need the compaction at all, but I didn't try hard to determine which ones might need it. I left it out from a few tests that intentionally tested calling do_gc with an invalid tenant or timeline ID, though.	2022-12-16 11:05:55 +02:00
Arseny Sher	70ce01d84d	Deploy broker with L4 LB in new env. (#3125) Seems to be fixing issue with missing keepalives.	2022-12-15 22:42:30 +01:00
Christian Schwarz	b58f7710ff	seqwait: different error messages per variant Would have been handy to get slightly more details in https://github.com/neondatabase/neon/issues/3109 refs https://github.com/neondatabase/neon/issues/3109	2022-12-15 18:19:43 +01:00
MMeent	807b110946	Update Makefile configuration: (#3011) - Use only one templated section for most postgres-versioned steps - Clean up neon_walredo, too, when running neon-pg-ext-clean - Depend on the various cleanup steps for `clean` instead of manually executing those cleanup steps.	2022-12-15 17:06:17 +00:00
Christian Schwarz	397b60feab	common abstraction for waiting for SK commit_lsn to reach PS	2022-12-15 11:50:39 +01:00
Christian Schwarz	10cd64cf8d	make TaskHandle::next_task_event cancellation-safe If we get cancelled before jh.await returns we've take()n the join handle but drop the result on the floor. Fix it by setting self.join_handle = None after the .await fixes https://github.com/neondatabase/neon/issues/3104	2022-12-15 10:26:17 +01:00
Christian Schwarz	bf3ac2be2d	add remote_physical_size metric We do the accounting exclusively after updating remote IndexPart successfully. This is cleaner & more robust than doing it upon completion of individual layer file uploads / deletions since we can use .set() instead of add()/sub(). NB: Originally, this work was intended to be part of #3013 but it turns out that it's completely orthogonal. So, spin it out into this PR for easier review. Since this change is additive, it won't break anything.	2022-12-15 09:48:35 +01:00
Sergey Melnikov	c04c201520	Push proxy metrics to Victoria Metrics (#3106)	2022-12-14 21:28:14 +01:00
Christian Schwarz	4132ae9dfe	always remove RemoteTimelineClient's metrics when dropping it	2022-12-14 19:25:29 +01:00
Alexander Bayandin	8fcba150db	test_seqscans: temporarily disable remote test (#3101) Temporarily disable `test_seqscans` for remote projects; they take up too much space and time. We can try to re-enable it after switching to per-test projects.	2022-12-14 18:05:05 +00:00
Dmitry Rodionov	df09d0375b	ignore metadata_backup files in index_part	2022-12-14 19:00:19 +03:00
Vadim Kharitonov	62f6e969e7	Fix helm value for proxy	2022-12-14 16:41:26 +01:00
Kirill Bulatov	4d201619ed	Remove large database files after every test suite (#3090) Closes https://github.com/neondatabase/neon/issues/1984 Closes https://github.com/neondatabase/neon/pull/2830 A follow-up of https://github.com/neondatabase/neon/pull/2830, I've noticed that benchmarks failed again due to out of space issues. Removes most of the pageserver and safekeeper files from disk after every pytest suite run. ``` $ poetry run pytest -vvsk "test_tenant_redownloads_truncated_file_on_startup[local_fs]" # ... $ du -h test_output/test_tenant_redownloads_truncated_file_on_startup\[local_fs\] # ... 104K test_output/test_tenant_redownloads_truncated_file_on_startup[local_fs] $ poetry run pytest -vvsk "test_tenant_redownloads_truncated_file_on_startup[local_fs]" --preserve-database-files # ... $ du -h test_output/test_tenant_redownloads_truncated_file_on_startup\[local_fs\] # ... 123M test_output/test_tenant_redownloads_truncated_file_on_startup[local_fs] ``` Co-authored-by: Bojan Serafimov <bojan.serafimov7@gmail.com>	2022-12-14 13:09:08 +00:00
Alexander Bayandin	d3787f9b47	neon-project-create/delete: print project id to stdout (#3073) Print project_id to GitHub Actions stdout	2022-12-14 13:04:04 +00:00
Shany Pozin	ada5b7158f	Fix Issue #3014 (#3059) * TenantConfigRequest now supports tenant_id as hex string input instead of bytes array * Config file is truncated in each creation/update	2022-12-14 14:09:16 +02:00
Arseny Sher	f8ab5ef3b5	Update broker endpoint for prod-us-west-2. (#3095)	2022-12-14 12:58:12 +01:00
Sergey Melnikov	827ee10b5a	Disable neon-stress deploy (#3093)	2022-12-14 01:51:42 +01:00
Alexander Bayandin	c819b699be	Nightly Benchmark: run neon-captest-reuse from staging (#3086) The project has been migrated (now it is `restless-king-632302`), and now we should run tests from staging runners. Test run: https://github.com/neondatabase/neon/actions/runs/3686865543/jobs/6241367161 Ref https://github.com/neondatabase/cloud/issues/2836	2022-12-13 23:02:45 +00:00
Sergey Melnikov	228f9e4322	Use default folder for ansible collections (#3092)	2022-12-13 23:59:49 +01:00
Sergey Melnikov	826214ae56	Force ansible-galaxy to also use local ansible.cfg (#3091)	2022-12-13 21:06:18 +01:00
Sergey Melnikov	b39d6126bb	Force ansible to use local ansible.cfg (#3089)	2022-12-13 21:57:39 +03:00
Vadim Kharitonov	0bc488b723	Add sentry environment for pageserver and safekeepers in new region (us-west-2)	2022-12-13 16:26:28 +01:00
Christian Schwarz	0c915dcb1d	Timeline::download_missing: fix handling of mismatched layer size Before this patch, when we decide to rename a layer file to backup because of layer file size mismatch, we would not remove the layer from the layer map, but remove the on-disk file. Because we re-download the file immediately after, we simply end up with two layer objects in memory that reference the same file in the layer map. So, GetPage() would work fine until one of the layers gets delete()'d. The other layer's delete() would then fail. Future work: prevent insertion of the same layer at LayerMap level so that we notice such bugs sooner.	2022-12-13 15:53:08 +01:00
Alexander Bayandin	feb07ed510	deploy (old): replace actions/setup-python@v4 with ansible image (#3081) Replace actions/setup-python@v4 with the ansible image to fix ``` Version 3.10 was not found in the local cache Error: The version '3.10' with architecture 'x64' was not found for this operating system. ```	2022-12-13 14:01:29 +00:00
Vadim Kharitonov	4603a4cbb5	Bypass SENTRY_ENVIRONMENT variable in order to filter panics in sentry by environment.	2022-12-13 14:52:04 +01:00
Kirill Bulatov	02c1c351dc	Create initial timeline without remote storage (#3077) Removes the race during pageserver initial timeline creation that led to partial layer uploads. This race is only reproducible in test code, we do not create initial timelines in cloud (yet, at least), but still nice to remove the non-deterministic behavior.	2022-12-13 15:42:59 +02:00
Dmitry Ivanov	607c0facfc	[proxy] Propagate more console API errors to the user This patch aims to fix some of the inconsistencies in error reporting, for example "Internal error" or "Console request failed" instead of "password authentication failed for user '<NAME>'".	2022-12-13 16:16:31 +03:00
Sergey Melnikov	e5d523c86a	Add new us-west-2 region (#3071)	2022-12-13 14:11:40 +01:00
Kirill Bulatov	7a16cde737	Remove useless pub trait method (#3076)	2022-12-13 12:06:20 +00:00
Arseny Sher	d6325aa79d	Disable body size limit in ingress broker deploy. We have infinite streams.	2022-12-13 13:06:30 +03:00
Arseny Sher	544777e86b	Fix storage_broker deploy typo.	2022-12-13 10:57:26 +03:00
Arseny Sher	e2ae4c09a6	Put e2e tag back. `32662ff1c4` required running e2e tests on patched branch of cloud repo; now that it is merged, put the tag back.	2022-12-13 09:53:22 +03:00
Christian Schwarz	22ae67af8d	refactor: use new type LayerFileName when referring to layer file names in PathBuf/RemotePath (#3026) Before this patch, we would sometimes carry around plain file names in `Path` types and/or awkwardly "rebase" paths to have a unified representation of the layer file name between local and remote. This patch introduces a new type `LayerFileName` which replaces the use of `Path` / `PathBuf` / `RemotePath` in the `storage_sync2` APIs. Instead of holding a string, it contains the parsed representation of the image and delta file name. When we need the file name, e.g., to construct a local path or remote object key, we construct the name ad-hoc. `LayerFileName` is also serde {Dese,Se}rializable, and in an initial version of this patch, it was supposed to be used directly inside `IndexPart`, replacing `RemotePath`. However, commit `3122f3282f` Ignore backup files (ones with .n.old suffix) in download_missing fixed handling of `.old` backup file names in IndexPart, and we need to carry that behavior forward. The solution is to remove `.old` backup file names during deserialization. When we re-serialize the IndexPart, the `*.old` file will be gone. This leaks the `.old` file in the remote storage, but makes it safe to clean it up later. There is additional churn by a preliminary refactoring that got squashed into this change: split off LayerMap's needs from trait Layer into super trait That refactoring renames `Layer` to `PersistentLayer` and splits off a subset of the functions into a super-trait called `Layer`. The super trait implements just the functions needed by `LayerMap`, whereas `PersistentLayer` adds the context of the pageserver. The naming is imperfect as some functions that reside in `PersistentLayer` have nothing persistence-specific to them. But it's a step in the right direction.	2022-12-13 01:27:59 +02:00
Rory de Zoete	d1edc8aa00	Deprecate old runner for deploy job (#3070) As we plan to no longer use them Co-authored-by: Rory de Zoete <rdezoete@RorysMacStudio.fritz.box> Co-authored-by: Rory de Zoete <rdezoete@Rorys-Mac-Studio.fritz.box>	2022-12-12 16:55:40 +01:00
Arseny Sher	f013d53230	Switch to clap derive API in safekeeper. Fewer lines and easier to read/modify. Practically no functional changes.	2022-12-12 16:25:23 +03:00
Kirill Bulatov	0aa2f5c9a5	Regroup CI testing (#3049) Part of https://github.com/neondatabase/neon/pull/2410 and https://github.com/neondatabase/neon/pull/2407 * adds `hashFiles('rust-toolchain.toml')` into Rust cache keys, thus removing one of the manual steps to do when upgrading rustc * copies Python and Rust style checks from the `codestyle.yml` workflow * adjusts shell defaults in the main workflow * replaces `codestyle.yml` with a `neon_extra_builds.yml` workflow The new workflow runs on commits to `main` (`codestyle.yml` was run per PR), and runs two custom builds on GH agents: * macos-latest, to ensure the entire project compiles on it (no tests run) There were no frequent breakages on macOS in our builds, so we can check it rarely without making every storage PR wait for it to complete. The updated mac build uses release builds now, so presumably should work a bit faster due to overall smaller files to cache between builds. * ubuntu-latest, without caches, to produce full compilation stats for Rust builds and upload it as an artifact to GitHub Old `clippy build --timings` stats were collected from the builds that use caches and incremental calculation and hence could never produce a full report, so they got removed.	2022-12-12 12:58:55 +02:00
Vadim Kharitonov	26f4ff949a	Add sentry to storage_broker.	2022-12-12 13:30:16 +03:00
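
The message of commit bf3ac2be2d above argues that recomputing the remote size after each successful IndexPart update and calling .set() is more robust than calling add()/sub() per layer upload or deletion. A minimal sketch of that idea follows. It is not the pageserver's actual code: `Gauge`, the simplified `IndexPart`, and `on_index_uploaded` are hypothetical names introduced here for illustration only.

```rust
use std::collections::HashMap;
use std::sync::atomic::{AtomicU64, Ordering};

/// Hypothetical stand-in for a gauge metric (not the real metrics type).
pub struct Gauge(AtomicU64);

impl Gauge {
    pub const fn new() -> Self {
        Gauge(AtomicU64::new(0))
    }

    /// Overwrite the gauge with an absolute value.
    pub fn set(&self, value: u64) {
        self.0.store(value, Ordering::Relaxed);
    }

    pub fn get(&self) -> u64 {
        self.0.load(Ordering::Relaxed)
    }
}

/// Hypothetical, simplified index: layer file name -> size in bytes.
#[derive(Default)]
pub struct IndexPart {
    pub layer_sizes: HashMap<String, u64>,
}

impl IndexPart {
    /// Sum of all layer sizes currently listed in the index.
    pub fn total_size(&self) -> u64 {
        self.layer_sizes.values().sum()
    }
}

/// Assumed to be called only after the index was uploaded successfully.
/// Because the gauge is recomputed from the index, a missed or repeated
/// per-layer event cannot make it drift.
pub fn on_index_uploaded(gauge: &Gauge, index: &IndexPart) {
    gauge.set(index.total_size());
}
```

With this shape, the gauge always reflects what the last uploaded index describes, regardless of the order in which individual layer uploads and deletions completed.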

2506 Commits