
rust/neon

mirror of https://github.com/neondatabase/neon.git synced 2026-07-07 22:20:36 +00:00

Author	SHA1	Message	Date
Christian Schwarz	2a0695e01a	log always, but at debug	2024-09-15 13:28:47 +01:00
Christian Schwarz	af6c168265	debug-log the buggy key	2024-09-15 13:12:25 +01:00
Christian Schwarz	1b3209497f	image_layer: don't do .rev(), it's confusing and unlike what we did before	2024-09-15 12:58:13 +01:00
Christian Schwarz	9f97523d0d	dump full missing keyspace	2024-09-15 12:40:41 +01:00
Christian Schwarz	33196c90bc	todo!	2024-09-15 12:32:17 +01:00
Christian Schwarz	c72c5df922	make sure get_cached_lsn returns what it returned before	2024-09-15 12:20:58 +01:00
Christian Schwarz	4c7599e8df	Revert "address the "This is silly" comment" This reverts commit `c736f9d6ef`.	2024-09-15 12:18:26 +01:00
Christian Schwarz	4b2b0a24da	Revert "WIP better typed" This reverts commit `ef5a95010b`.	2024-09-15 12:18:21 +01:00
Christian Schwarz	ef5a95010b	WIP better typed	2024-09-15 12:18:15 +01:00
Christian Schwarz	c736f9d6ef	address the "This is silly" comment	2024-09-14 15:38:46 +00:00
Christian Schwarz	adc798b59e	explicitly pass will_init	2024-09-14 15:15:48 +00:00
Christian Schwarz	f0668a7a4d	cargo fmt	2024-09-14 15:15:48 +00:00
Christian Schwarz	6d73642a93	hypothesis of what's wrong	2024-09-14 14:26:52 +00:00
Christian Schwarz	9012a18fa1	Revert "WIP" This reverts commit `709a6ad29b`.	2024-09-14 14:18:39 +00:00
Christian Schwarz	a6a7550bb4	add some debugging that didn't lead anywhere	2024-09-14 14:18:02 +00:00
Christian Schwarz	10556f25df	fix double .remove() on InMemoryLayer read error (not the bug)	2024-09-14 13:41:36 +00:00
Christian Schwarz	f54cf567ff	implement io concurrency control (serial / parallel). bug still happens	2024-09-14 13:22:28 +00:00
Christian Schwarz	4303914681	Revert "rule out empty keyspace bug" (it wasn't the bug) This reverts commit `539225d792`.	2024-09-14 13:02:20 +00:00
Christian Schwarz	539225d792	rule out empty keyspace bug	2024-09-14 13:02:06 +00:00
Christian Schwarz	c118736c9c	draw-timeline-dir: ability to draw line	2024-09-14 12:48:23 +00:00
Christian Schwarz	3f85246a42	4096 queuedepth in pagebench get-page-latest-lsn	2024-09-14 12:06:57 +00:00
Christian Schwarz	709a6ad29b	WIP	2024-09-14 12:05:42 +00:00
Vlad Lazar	3640824553	fixup: incorrect assumption on img layer need	2024-09-12 23:04:58 +01:00
Vlad Lazar	fea1f34f6a	fixup: serialize again in image layer	2024-09-12 23:04:10 +01:00
Vlad Lazar	5d40d1ccdd	fixup: order of waiters in delta layer	2024-09-12 23:04:01 +01:00
Vlad Lazar	b2cb10590e	fixup: deserialize shenanigans	2024-09-12 20:00:06 +01:00
Vlad Lazar	2923fd2a5b	fixup: remove stale import	2024-09-12 19:25:46 +01:00
Vlad Lazar	2a5336b9ab	fixup image deserialization	2024-09-12 19:24:41 +01:00
Christian Schwarz	6f20726610	Merge remote-tracking branch 'origin/hackaneon/lisbon24/superscalar-page_service--problame/evaluate-debouncer' into hackaneon/lisbon24/superscalar-page_service	2024-09-12 17:47:16 +00:00
Christian Schwarz	29f741e1e9	debounce: actually issue vectored get	2024-09-12 17:46:30 +00:00
Vlad Lazar	2b37a40079	Materialize future ios	2024-09-12 18:25:17 +01:00
Vlad Lazar	af2b65a2fb	Rework issuing of IOs on read path	2024-09-12 16:42:35 +01:00
Christian Schwarz	5d194c7824	debounce: bounce if shard or effective request_lsn differ	2024-09-12 14:20:32 +00:00
Christian Schwarz	ac2702afd3	debouncer: move decoding into debounce loop	2024-09-12 10:58:09 +00:00
Christian Schwarz	88fd46d795	sketch interface	2024-09-12 11:35:00 +01:00
Christian Schwarz	2d6763882e	pagebench: fake queue depth of 10	2024-09-12 11:35:00 +01:00
Christian Schwarz	c0c23cde72	debouncer	2024-09-12 11:35:00 +01:00
Christian Schwarz	942bc9544b	fixup	2024-09-11 20:04:39 +00:00
Christian Schwarz	02b7cdb305	HACK: instrument page_service to count nonblocking consecutive getpage requests	2024-09-11 19:25:19 +01:00
Joonas Koivunen	3dbd34aa78	feat(storcon): forward gc blocking and unblocking (#8956 ) Currently using gc blocking and unblocking with storage controller managed pageservers is painful. Implement the API on storage controller. Fixes: #8893	2024-09-06 22:42:55 +01:00
Arseny Sher	c1a51416db	safekeeper: fsync filesystem on start. We can't really rely on files contents after boot without fsync'ing them.	2024-09-06 19:14:25 +03:00
Arpad Müller	cbcd4058ed	Fix 1.82 clippy lint too_long_first_doc_paragraph (#8941 ) Addresses the 1.82 beta clippy lint `too_long_first_doc_paragraph` by adding newlines to the first sentence if it is short enough, and making a short first sentence if there is the need.	2024-09-06 14:33:52 +02:00
Arpad Müller	a1323231bc	Update Rust to 1.81.0 (#8939 ) We keep the practice of keeping the compiler up to date, pointing to the latest release. This is done by many other projects in the Rust ecosystem as well. [Release notes](https://github.com/rust-lang/rust/blob/master/RELEASES.md#version-1810-2024-09-05). Prior update was in #8667 and #8518	2024-09-06 12:40:19 +02:00
Christian Schwarz	06e840b884	compact_level0_phase1: ignore access mode config, always do streaming-kmerge without validation (#8934 ) refs https://github.com/neondatabase/neon/issues/8184 PR https://github.com/neondatabase/infra/pull/1905 enabled streaming-kmerge without validation everywhere. It rolls into prod sooner or in the same release as this PR.	2024-09-06 10:58:48 +02:00
Vlad Lazar	04f99a87bf	storcon: make pageserver AZ id mandatory (#8856 ) ## Problem https://github.com/neondatabase/neon/pull/8852 introduced a new nullable column for the `nodes` table: `availability_zone_id` ## Summary of changes * Make neon local and the test suite always provide an az id * Make the az id field in the ps registration request mandatory * Migrate the column to non-nullable and adjust in memory state accordingly * Remove the code that was used to populate the az id for pre-existing nodes	2024-09-05 19:14:21 +01:00
Christian Schwarz	850421ec06	refactor(pageserver): rely on serde derive for toml deserialization (#7656 ) This PR simplifies the pageserver configuration parsing as follows: * introduce the `pageserver_api::config::ConfigToml` type * implement `Default` for `ConfigToml` * use serde derive to do the brain-dead leg-work of processing the toml document * use `serde(default)` to fill in default values * in `pageserver` crate: * use `toml_edit` to deserialize the pageserver.toml string into a `ConfigToml` * `PageServerConfig::parse_and_validate` then * consumes the `ConfigToml` * destructures it exhaustively into its constituent fields * constructs the `PageServerConfig` The rules are: * in `ConfigToml`, use `deny_unknown_fields` everywhere * static default values go in `pageserver_api` * if there cannot be a static default value (e.g. which default IO engine to use, because it depends on the runtime), make the field in `ConfigToml` an `Option` * if runtime-augmentation of a value is needed, do that in `parse_and_validate` * a good example is `virtual_file_io_engine` or `l0_flush`, both of which need to execute code to determine the effective value in `PageServerConf` The benefits: * massive amount of brain-dead repetitive code can be deleted * "unused variable" compile-time errors when removing a config value, due to the exhaustive destructuring in `parse_and_validate` * compile-time errors guide you when adding a new config field Drawbacks: * serde derive is sometimes a bit too magical * `deny_unknown_fields` is easy to miss Future Work / Benefits: * make `neon_local` use `pageserver_api` to construct `ConfigToml` and write it to `pageserver.toml` * This provides more type safety / compile-time errors than the current approach.
### Refs Fixes #3682 ### Future Work * `remote_storage` deser doesn't reject unknown fields https://github.com/neondatabase/neon/issues/8915 * clean up `libs/pageserver_api/src/config.rs` further * break up into multiple files, at least for tenant config * move `models` as appropriate / refine distinction between config and API models / be explicit about when it's the same * use `pub(crate)` visibility on `mod defaults` to detect stale values	2024-09-05 14:59:49 +02:00
Alex Chi Z.	99fa1c3600	fix(pageserver): more information on aux v1 warnings (#8906 ) Part of https://github.com/neondatabase/neon/issues/8623 ## Summary of changes It seems that we have tenants with aux policy set to v1 but don't have any aux files in the storage. It is still safe to force migrate them without notifying the customers. This patch adds more details to the warning to identify the cases where we have to reach out to the users before retiring aux v1. --------- Signed-off-by: Alex Chi Z <chi@neon.tech>	2024-09-04 21:45:04 +01:00
Alex Chi Z.	3d9001d83f	fix(pageserver): is_archived should be optional (#8902 ) Set the field to optional, otherwise there will be decode errors when newer version of the storage controller receives the JSON from older version of the pageservers. Signed-off-by: Alex Chi Z <chi@neon.tech>	2024-09-03 14:05:06 -04:00
John Spray	c4fe6641c1	pageserver: separate metadata and data pages in DatadirModification (#8621 ) ## Problem Currently, DatadirModification keeps a key-indexed map of all pending writes, even though we (almost) never need to read back dirty pages for anything other than metadata pages (e.g. relation sizes). Related: https://github.com/neondatabase/neon/issues/6345 ## Summary of changes - commit() modifications before ingesting database creation wal records, so that they are guaranteed to be able to get() everything they need directly from the underlying Timeline. - Split dirty pages in DatadirModification into pending_metadata_pages and pending_data_pages. The data ones don't need to be in a key-addressable format, so they just go in a Vec instead. - Special case handling of zero-page writes in DatadirModification, putting them in a map which is flushed on the end of a WAL record. This handles the case where during ingest, we might first write a zero page, and then ingest a postgres write to that page. We used to do this via the key-indexed map of writes, but in this PR we change the data page write path to not bother indexing these by key. My least favorite thing about this PR is that I needed to change the DatadirModification interface to add the on_record_end call. This is not very invasive because there's really only one place we use it, but it changes the object's behaviour from being clearly an aggregation of many records to having some per-record state. I could avoid this by implicitly doing the work when someone calls set_lsn or commit -- I'm open to opinions on whether that's cleaner or dirtier. ## Performance There may be some efficiency improvement here, but the primary motivation is to enable an earlier stage of ingest to operate without access to a Timeline. The `pending_data_pages` part is the "fast path" bulk write data that can in principle be generated without a Timeline, in parallel with other ingest batches, and ultimately on the safekeeper. 
`test_bulk_insert` on AX102 shows approximately the same results as in the previous PR #8591: ``` ------------------------------ Benchmark results ------------------------------- test_bulk_insert[neon-release-pg16].insert: 23.577 s test_bulk_insert[neon-release-pg16].pageserver_writes: 5,428 MB test_bulk_insert[neon-release-pg16].peak_mem: 637 MB test_bulk_insert[neon-release-pg16].size: 0 MB test_bulk_insert[neon-release-pg16].data_uploaded: 1,922 MB test_bulk_insert[neon-release-pg16].num_files_uploaded: 8 test_bulk_insert[neon-release-pg16].wal_written: 1,382 MB test_bulk_insert[neon-release-pg16].wal_recovery: 18.264 s test_bulk_insert[neon-release-pg16].compaction: 0.052 s ```	2024-09-03 18:16:49 +01:00
Erik Grinaker	b37da32c6f	pageserver: reuse idempotency keys across metrics sinks (#8876 ) ## Problem Metrics event idempotency keys differ across S3 and Vector. The events should be identical. Resolves #8605. ## Summary of changes Pre-generate the idempotency keys and pass the same set into both metrics sinks. Co-authored-by: John Spray <john@neon.tech>	2024-09-03 09:05:24 +01:00
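
Commit 850421ec06 above describes a two-stage config pattern: a deserialized `ConfigToml` is consumed by `parse_and_validate`, which destructures it exhaustively and fills in runtime-dependent values. The following is a minimal std-only sketch of that idea, not the pageserver's actual code. The field names (`listen_addr`, `io_engine`), the default address, and the helper `runtime_default_io_engine` are hypothetical, and the serde/`toml_edit` deserialization step is left out so the example stays self-contained.

```rust
// Hypothetical, simplified stand-ins for the types named in the commit message.
// Field names and default values are illustrative assumptions only.

/// Stage 1: the shape of the TOML document (deserialization itself is omitted here).
#[derive(Debug, Clone, PartialEq)]
struct ConfigToml {
    /// Has a static default, so it is a plain value.
    listen_addr: String,
    /// Has no static default (depends on the runtime), so it is an `Option`.
    io_engine: Option<String>,
}

impl Default for ConfigToml {
    fn default() -> Self {
        ConfigToml {
            listen_addr: "127.0.0.1:6400".to_string(), // assumed default
            io_engine: None,
        }
    }
}

/// Stage 2: the validated, fully resolved configuration.
#[derive(Debug, PartialEq)]
struct PageServerConf {
    listen_addr: String,
    io_engine: String,
}

/// Hypothetical stand-in for code that must run to pick the effective value.
fn runtime_default_io_engine() -> String {
    "std-fs".to_string()
}

fn parse_and_validate(toml: ConfigToml) -> Result<PageServerConf, String> {
    // Exhaustive destructuring: there is no `..` in the pattern, so adding a
    // field to `ConfigToml` without handling it here fails to compile, and a
    // binding that is no longer used triggers an unused-variable warning.
    let ConfigToml { listen_addr, io_engine } = toml;

    if listen_addr.is_empty() {
        return Err("listen_addr must not be empty".to_string());
    }

    // Runtime augmentation: resolve the `Option` to an effective value.
    let io_engine = io_engine.unwrap_or_else(runtime_default_io_engine);

    Ok(PageServerConf { listen_addr, io_engine })
}

fn main() {
    let conf = parse_and_validate(ConfigToml::default()).unwrap();
    println!("{conf:?}");
}
```

The design choice being illustrated is the missing `..` in the destructuring pattern: it turns "forgot to wire up a new config field" from a silent runtime omission into a compile-time error.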


2446 Commits