rust/neon

mirror of https://github.com/neondatabase/neon.git synced 2026-06-04 22:10:39 +00:00

Author	SHA1	Message	Date
bojanserafimov	a9bd05760f	Improve layer map docstrings (#3382)	2023-01-19 10:29:15 -05:00
Heikki Linnakangas	e5cc2f92c4	Switch to 'tracing' for logging, restructure code to make use of spans. Refactors Compute::prepare_and_run. It's split into subroutines differently, to make it easier to attach tracing spans to the different stages. The high-level logic for waiting for Postgres to exit is moved to the caller. Replace 'env_logger' with 'tracing', and add `#instrument` directives to different stages of the startup process. This is a fairly mechanical change, except for the changes in 'spec.rs'. 'spec.rs' contained some complicated formatting, where parts of log messages were printed directly to stdout with `print`s. That was a bit messed up because the log normally goes to stderr, but those lines were printed to stdout. In our docker images, stderr and stdout both go to the same place so you wouldn't notice, but I don't think it was intentional. This changes the log format to the default 'tracing_subscriber::format' format. It's different from the Postgres log format, however, and because both compute_tools and Postgres print to the same log, it's now a mix of two different formats. I'm not sure how the Grafana log parsing pipeline can handle that. If it's a problem, we can build a custom formatter to change the compute_tools log format to be the same as Postgres's, like it was before this commit, or we can change the Postgres log format to match tracing_formatter's, or we can start printing compute_tool's log output to a different destination than Postgres	2023-01-18 19:42:47 +02:00
Kirill Bulatov	90f66aa51b	Enable logs in unit tests	2023-01-18 17:43:27 +02:00
Kirill Bulatov	826e89b9ce	Use actual temporary dir for pageserver unit tests	2023-01-18 17:43:27 +02:00
Vadim Kharitonov	e59d32ac5d	Change SENTRY_ENVIRONMENT from "development" to "staging"	2023-01-18 16:34:49 +01:00
Anastasia Lubennikova	506086a3e2	Fix metric_collection_endpoint for prod. It was incorrectly set to staging url	2023-01-18 16:35:43 +02:00
Heikki Linnakangas	3b58c61b33	If an error happens while checking for core dumps, don't panic. If we panic, we skip the 30s wait in 'main', and don't give the console a chance to observe the error. Which is not nice. Spotted by @ololobus at https://github.com/neondatabase/neon/pull/3352#discussion_r1072806981	2023-01-18 11:25:47 +02:00
Kirill Bulatov	c6b56d2967	Add more io::Error context when failing to operate on a path (#3254) I have a test failure that shows ``` Caused by: 0: Failed to reconstruct a page image: 1: Directory not empty (os error 39) ``` but does not really show where exactly that happens. https://neon-github-public-dev.s3.amazonaws.com/reports/pr-3227/release/3823785365/index.html#categories/c0057473fc9ec8fb70876fd29a171ce8/7088dab272f2c7b7/?attachment=60fe6ed2add4d82d The PR aims to add more context in debugging that issue.	2023-01-17 22:07:38 +02:00
Anastasia Lubennikova	9d3992ef48	Increase metric_collection_interval for proxy on prod to not overwhelm the service	2023-01-17 15:50:19 +02:00
Anastasia Lubennikova	7624963e13	Enable metric_collection_endpoint for proxy on prod in all regions	2023-01-17 13:43:50 +02:00
Anastasia Lubennikova	63e3b815a2	Enable metric_collection_endpoint for pageserver on prod in all regions	2023-01-17 13:43:50 +02:00
Kirill Bulatov	1ebd145c29	Actualize the comment (#3362) Follow-up of https://github.com/neondatabase/neon/pull/3326#issuecomment-1384265759	2023-01-17 13:30:42 +02:00
sharnoff	f8e887830a	build: Use `curl -f` on vm-informant download (#3363) Without this, we can silently fail	2023-01-17 10:38:33 +01:00
Christian Schwarz	48dd9565ac	TaskHandle: tone down `sender is dropped while join handle is still alive` Rationale: see comments added as part of this commit. fixes https://github.com/neondatabase/neon/issues/3339	2023-01-17 09:42:22 +01:00
Anastasia Lubennikova	e067cd2947	Enable metric collection for proxy on staging	2023-01-16 21:15:42 +02:00
Christian Schwarz	58c8c1076c	download_all_remote_layers API: require client to specify max_concurrent_downloads Before this patch, we would start all layer downloads simultaneously. There is at most one download_all_remote_layers task per timeline. Hence, the specified limit is per timeline. There is still no global concurrency limit for layer downloads. We'll have to revisit that at some point and also prioritize on-demand initiated downloads over download_all_remote_layers downloads. But that's for another day.	2023-01-16 19:29:06 +01:00
Alexander Bayandin	4c6b507472	Update Postgres clients we test (#3359) Update client libraries and runtimes for Postgres libraries we test. - `pg8000` works with Neon now 🎉 - `PostgresClientKit` still doesn't support SNI	2023-01-16 17:22:17 +00:00
Stas Kelvich	431e464c1e	Consumption metering RFC	2023-01-16 19:15:59 +02:00
danieltprice	424fd0bd63	Update auth.rs (#3349) Update SNI error message. Users now specify the endpoint ID when making a connection to Neon. This should be reflected in the error message.	2023-01-16 12:32:00 -04:00
Joonas Koivunen	a8a9bee602	walredo: simple tests and bench updates (#3045) Separated from #2875. The microbenchmark has been validated to show a difference similar to that of the larger-scale OLTP benchmark.	2023-01-16 18:24:45 +02:00
Vadim Kharitonov	6ac5656be5	Enable earthdistance extension	2023-01-16 17:04:51 +01:00
Anastasia Lubennikova	3c571ecde8	Update docs/consumption_metrics.md	2023-01-16 17:24:13 +02:00
Anastasia Lubennikova	5f1bd0e8a3	Add documentation for consumption metrics	2023-01-16 17:24:13 +02:00
Anastasia Lubennikova	2cbe84b78f	Proxy metrics (#3290) Implement proxy metrics collection. Only collect metrics for outbound traffic. Add proxy CLI parameters: - metric-collection-endpoint - metric-collection-interval. Add test_proxy_metric_collection test. Move shared consumption metrics code to libs/consumption_metrics. Refactor the code.	2023-01-16 15:17:28 +00:00
sharnoff	5c6a7a17cb	Add VM informant to vm-compute-node (#3324) The general idea is that the VM informant binary is added to the vm-compute-node images only. `compute_tools` then will run whatever's at `/bin/vm-informant`, if the path exists.	2023-01-16 07:05:29 -08:00
Arseny Sher	84ffdc8b4f	Don't keep FDs open on cancelled timelines in safekeepers. Since PR #3300 we don't remove timelines completely until next restart, so this prevents leakage. fixes https://github.com/neondatabase/neon/issues/3336	2023-01-16 19:03:38 +04:00
Kirill Bulatov	bce4233d3a	Rework Cargo.toml dependencies (#3322) * Use workspace variables from cargo, coming with rustc [1.64](https://github.com/rust-lang/rust/blob/master/RELEASES.md#version-1640-2022-09-22) See https://doc.rust-lang.org/nightly/cargo/reference/workspaces.html#the-package-table and https://doc.rust-lang.org/nightly/cargo/reference/workspaces.html#the-dependencies-table sections. Now, all dependencies in all non-root `Cargo.toml` files are defined as ``` clap.workspace = true ``` sometimes, when extra features are needed, as ``` bytes = {workspace = true, features = ['serde'] } ``` With the actual declarations (with shared features and version numbers/file paths/etc.) in the root Cargo.toml. Features are additive: https://doc.rust-lang.org/nightly/cargo/reference/specifying-dependencies.html#inheriting-a-dependency-from-a-workspace * Uses the mechanism above to set common, 2021, edition and license across the workspace * Mechanically bumps a few dependencies * Updates hakari format, as it suggested: ``` work/neon/neon kb/cargo-templated ❯ cargo hakari generate info: no changes detected info: new hakari format version available: 3 (current: 2) (add or update `dep-format-version = "3"` in hakari.toml, then run `cargo hakari generate && cargo hakari manage-deps`) ```	2023-01-13 18:13:34 +02:00
Vadim Kharitonov	16baa91b2b	Add more information about `cargo deny`	2023-01-13 13:24:34 +01:00
Kirill Bulatov	99808558de	Avoid duplicate timeline insert (#3326) `initialize_with_lock` inserts `Arc<Timeline>` before returning it: `c1731bc4f0/pageserver/src/tenant.rs (L222)` but `setup_timeline` function did another insert, which got removed in this PR: `c1731bc4f0/pageserver/src/tenant.rs (L486)` On top, a better comment and function renames are added.	2023-01-13 12:05:54 +00:00
Anastasia Lubennikova	c6d383e239	code cleanup	2023-01-13 11:51:28 +02:00
Anastasia Lubennikova	5e3e0fbf6f	remove unneeded Cargo.lock changes	2023-01-13 11:51:28 +02:00
Anastasia Lubennikova	26f39c03f2	review code cleanup: - handle errors in calculate_synthetic_size_worker. Don't exit the bgworker if one tenant failed. - add cached_synthetic_tenant_size to cache values calculated by the bgworker - code cleanup: remove unneeded info! messages, clean comments - handle collect_metrics_task() error. Don't exit collect_metrics worker if one task failed. - add unit test to cover case when we have multiple branches at the same lsn	2023-01-13 11:51:28 +02:00
Anastasia Lubennikova	148e020fb9	Fix logical size calculation: sort updates in topological order so that the parent timeline always precedes its children. fixes #3179	2023-01-13 11:51:28 +02:00
Anastasia Lubennikova	0675859bb0	Add background worker that periodically spawns synthetic size calculation. Add new pageserver config param calculate_synthetic_size_interval	2023-01-13 11:51:28 +02:00
Anastasia Lubennikova	ba0190e3e8	Handle errors in tenant_size_model code	2023-01-13 11:51:28 +02:00
Konstantin Knizhnik	9ce5ada89e	Do not report position in SMGR message (#3307) refer #3277	2023-01-13 10:23:35 +02:00
Alexander Bayandin	c28bfd4c63	Nightly Benchmarks: add user provided example (#3308)	2023-01-12 23:03:21 +00:00
Vadim Kharitonov	dec875fee1	Disable postgis_sfcgal	2023-01-12 21:51:49 +01:00
Kirill Bulatov	fe8cef3427	Use ready! rustc 1.64 macro (#3315) rustc [1.64](https://github.com/rust-lang/rust/blob/master/RELEASES.md#version-1640-2022-09-22) introduced the `ready!` macro: https://doc.rust-lang.org/stable/std/task/macro.ready.html Use it to shorten the code slightly.	2023-01-12 21:27:34 +02:00
MMeent	bb406b21a8	Fix issue in compaction code (#3246) If we ran `compact_prefetch_buffers` with exactly one hole in the buffers, the code would fail to remove the last, now unused, entry from the array. This is now fixed. Also, add and adjust some comments in the compaction code so that the algorithm used is a bit more clear. Fixes #3192	2023-01-12 19:23:59 +01:00
Heikki Linnakangas	57a6e931ea	Comment, formatting and other cosmetic cleanup.	2023-01-12 19:05:13 +02:00
Heikki Linnakangas	0cceb14e48	Add a FIXME on ugly error message parsing.	2023-01-12 19:05:13 +02:00
Konstantin Knizhnik	1983c4d4ad	Explain prefetch (#3002) Co-authored-by: Bojan Serafimov <bojan.serafimov7@gmail.com>	2023-01-12 18:12:40 +02:00
Heikki Linnakangas	d7c41cbbee	Replace tokio::watch with CancellationToken. PR #3228 starts to use CancellationTokens more widely, this is a small part extracted from that.	2023-01-12 17:37:15 +02:00
Vadim Kharitonov	29a2465276	Update rust version in toolchain	2023-01-12 15:16:52 +01:00
Arthur Petukhovsky	f49e923d87	Keep deleted timelines in memory of safekeeper (#3300) A temporary fix for https://github.com/neondatabase/neon/issues/3146, until we come up with a reliable way to create and delete timelines in all safekeepers.	2023-01-12 15:33:07 +03:00
Vadim Kharitonov	a0ee306c74	Enable safe contrib extensions	2023-01-12 12:41:53 +01:00
Heikki Linnakangas	c1731bc4f0	Push on-demand download into Timeline::get() function itself. This makes Timeline::get() async, and with it all functions that call it directly or indirectly. The with_ondemand_download() mechanism is gone, Timeline::get() now always downloads files, whether you want it or not. That is what all the current callers want, so even though this loses the capability to get a page only if it's already in the pageserver, without downloading, we were not using that capability. There were some places that used 'no_ondemand_download' in the WAL ingestion code that would error out if a layer file was not found locally, but those were dubious. We do actually want to on-demand download in all of those places. Per discussion at https://github.com/neondatabase/neon/pull/3233#issuecomment-1368032358	2023-01-12 11:53:10 +02:00
Sergey Melnikov	95bf19b85a	Add --atomic to all helm upgrade operations (#3299) When the number of github actions workers is changed, some jobs get killed. When helm is killed during the upgrade, the release gets stuck in the pending-upgrade state. --atomic should initiate automatic rollback in this case.	2023-01-10 10:05:27 +00:00
Vadim Kharitonov	80d4afab0c	Update tokio version (RUSTSEC-2023-0001)	2023-01-10 09:02:00 +01:00
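
Commit fe8cef3427 in the list above switches to the `ready!` macro from `std::task` (stable since rustc 1.64). As a rough illustration of what that macro shortens, here is a minimal sketch using only the standard library. `Doubled`, `NoopWake` and `poll_once` are hypothetical names made up for this example and are not taken from the neon code base.

```rust
use std::future::Future;
use std::pin::Pin;
use std::sync::Arc;
use std::task::{ready, Context, Poll, Wake, Waker};

/// Hypothetical wrapper future that doubles the output of an inner future.
pub struct Doubled<F> {
    pub inner: F,
}

impl<F> Future for Doubled<F>
where
    F: Future<Output = u32> + Unpin,
{
    type Output = u32;

    fn poll(mut self: Pin<&mut Self>, cx: &mut Context<'_>) -> Poll<u32> {
        // `ready!` returns `Poll::Pending` from this function if the inner
        // future is not finished yet; otherwise it unwraps the ready value.
        // Without the macro this would be an explicit `match` on the result.
        let value = ready!(Pin::new(&mut self.inner).poll(cx));
        Poll::Ready(value * 2)
    }
}

/// Waker that does nothing; only used to poll a future by hand in this sketch.
struct NoopWake;

impl Wake for NoopWake {
    fn wake(self: Arc<Self>) {}
}

/// Polls a future exactly once without an async runtime (illustration only).
pub fn poll_once<F: Future + Unpin>(fut: &mut F) -> Poll<F::Output> {
    let waker = Waker::from(Arc::new(NoopWake));
    let mut cx = Context::from_waker(&waker);
    Pin::new(fut).poll(&mut cx)
}
```

Polling `Doubled { inner: std::future::ready(21) }` once with `poll_once` yields a ready value, while wrapping `std::future::pending::<u32>()` stays pending, because `ready!` propagates the pending state unchanged.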

2663 Commits