
rust/neon

mirror of https://github.com/neondatabase/neon.git synced 2026-05-15 20:20:38 +00:00

Author	SHA1	Message	Date
Christian Schwarz	dc5f651600	pagebench: WIP: command to trigger initial logical size calculation	2023-12-15 17:52:39 +00:00
Christian Schwarz	d2d1432a65	include timeline ids in tenant details response	2023-12-15 17:52:38 +00:00
Christian Schwarz	fae3e01522	WIP: performance test that uses the getpage benchmark	2023-12-15 17:49:36 +00:00
Christian Schwarz	746b5d6323	Revert "expose RemotePath in layer map dump endpoint" This reverts commit `587b58a90b`.	2023-12-15 17:49:03 +00:00
Christian Schwarz	587b58a90b	expose RemotePath in layer map dump endpoint	2023-12-15 17:48:46 +00:00
Christian Schwarz	feb64cf67c	fixup	2023-12-15 17:48:41 +00:00
Christian Schwarz	eb77341bf8	find a way to duplicate a tenant in local_fs Use the script like so, against the tenant to duplicate: poetry run python3 ./test_runner/duplicate_tenant.py 7ea51af32d42bfe7fb93bf5f28114d09 200 8 backup of pageserver.toml d =1 pg_distrib_dir ='/home/admin/neon-main/pg_install' http_auth_type ='Trust' pg_auth_type ='Trust' listen_http_addr ='127.0.0.1:9898' listen_pg_addr ='127.0.0.1:64000' broker_endpoint ='http://127.0.0.1:50051/' #control_plane_api ='http://127.0.0.1:1234/' # Initial configuration file created by 'pageserver --init' #listen_pg_addr = '127.0.0.1:64000' #listen_http_addr = '127.0.0.1:9898' #wait_lsn_timeout = '60 s' #wal_redo_timeout = '60 s' #max_file_descriptors = 10000 #page_cache_size = 160000 # initial superuser role name to use when creating a new tenant #initial_superuser_name = 'cloud_admin' #broker_endpoint = 'http://127.0.0.1:50051' #log_format = 'plain' #concurrent_tenant_size_logical_size_queries = '1' #metric_collection_interval = '10 min' #cached_metric_collection_interval = '0s' #synthetic_size_calculation_interval = '10 min' #disk_usage_based_eviction = { max_usage_pct = .., min_avail_bytes = .., period = "10s"} #background_task_maximum_delay = '10s' [tenant_config] #checkpoint_distance = 268435456 # in bytes #checkpoint_timeout = 10 m #compaction_target_size = 134217728 # in bytes #compaction_period = '20 s' #compaction_threshold = 10 #gc_period = '1 hr' #gc_horizon = 67108864 #image_creation_threshold = 3 #pitr_interval = '7 days' #min_resident_size_override = .. # in bytes #evictions_low_residence_duration_metric_threshold = '24 hour' #gc_feedback = false # make it deterministic gc_period = '0s' checkpoint_timeout = '3650 day' compaction_period = '20 s' compaction_threshold = 10 compaction_target_size = 134217728 checkpoint_distance = 268435456 image_creation_threshold = 3 [remote_storage] local_path = '/home/admin/neon-main/bench_repo_dir/repo/remote_storage_local_fs' remove http handler switch to generalized rewrite_summary & impl page_ctl subcommand to use it WIP: change duplicate_tenant.py script to use the pagectl command The script works but at restart, we detach the created tenants because they're not known to the attachment service: Detaching tenant, control plane omitted it in re-attach response tenant_id=1e399d390e3aee6b11c701cbc716bb6c => figure out how to further integrate this	2023-12-15 17:48:33 +00:00
Christian Schwarz	edc2fa88b8	pagebench: add a 'getpage@lsn' benchmark	2023-12-15 17:44:27 +00:00
Christian Schwarz	7a27b811a1	WIP	2023-12-15 17:16:03 +00:00
Christian Schwarz	27a35331c0	pagebench: add a 'basebackup' benchmark	2023-12-15 17:09:23 +00:00
Christian Schwarz	0b9f0e72ac	pagebench: scaffold	2023-12-15 17:09:21 +00:00
Christian Schwarz	2c631d3dc9	clippy	2023-12-15 17:07:59 +00:00
Christian Schwarz	300d6c38ad	Merge remote-tracking branch 'origin/problame/benchmarking/pr/keyspace-in-mgmt-api' into problame/benchmarking/pr/page_service_api_client	2023-12-15 17:06:41 +00:00
Christian Schwarz	8985331533	Merge branch 'problame/benchmarking/pr/mgmt-api-client' into problame/benchmarking/pr/keyspace-in-mgmt-api	2023-12-15 15:50:15 +00:00
Christian Schwarz	a6abcbe454	hakari manage-deps	2023-12-15 15:49:08 +00:00
Christian Schwarz	888a7311f4	preseed rng for display_fromstr_bijection test case	2023-12-15 15:44:09 +00:00
Christian Schwarz	28479529ae	fixup merge: move keyspace and models::partitioning into pageserver_api	2023-12-15 15:41:00 +00:00
Christian Schwarz	feaee19d4a	Merge branch 'problame/benchmarking/pr/mgmt-api-client' into problame/benchmarking/pr/keyspace-in-mgmt-api	2023-12-15 15:32:02 +00:00
Christian Schwarz	9e238a34b4	make cargo deny happy	2023-12-15 15:31:29 +00:00
Christian Schwarz	46889d768e	move client to separate crate	2023-12-15 15:21:05 +00:00
Christian Schwarz	1a71b72c39	move serialization roundtrip to rust unit test	2023-12-15 15:09:39 +00:00
Christian Schwarz	e4509d151d	Merge branch 'problame/benchmarking/pr/mgmt-api-client' into problame/benchmarking/pr/keyspace-in-mgmt-api	2023-12-15 14:05:47 +00:00
Christian Schwarz	f91625a552	fixup	2023-12-15 14:04:09 +00:00
Christian Schwarz	795fe55332	Merge branch 'problame/benchmarking/pr/mgmt-api-client' into problame/benchmarking/pr/keyspace-in-mgmt-api	2023-12-15 13:52:12 +00:00
Christian Schwarz	cab12c02a3	clippy	2023-12-15 13:43:40 +00:00
Christian Schwarz	7ac9ef8291	remove unused dep	2023-12-15 13:38:34 +00:00
Christian Schwarz	d7a8e0b1ae	eliminate one workaround, convert sk stuff to async as well	2023-12-15 13:34:32 +00:00
Christian Schwarz	2664e9b834	fix	2023-12-15 12:29:02 +00:00
Christian Schwarz	83bdebb4af	WIP	2023-12-15 12:22:33 +00:00
Christian Schwarz	b2508a689b	fill in the todo!()s	2023-12-15 11:16:07 +00:00
Christian Schwarz	672a97993d	Merge branch 'problame/benchmarking/pr/keyspace-in-mgmt-api' into problame/benchmarking/pr/page_service_api_client	2023-12-15 10:06:13 +00:00
Christian Schwarz	7c63902741	Merge remote-tracking branch 'origin/problame/benchmarking/pr/mgmt-api-client' into problame/benchmarking/pr/keyspace-in-mgmt-api	2023-12-15 10:05:22 +00:00
Christian Schwarz	a5214b203d	Merge remote-tracking branch 'origin/problame/benchmarking/pr/mgmt-api-client' into problame/benchmarking/pr/page_service_api_client	2023-12-15 09:59:32 +00:00
Christian Schwarz	9e70c213f7	Merge remote-tracking branch 'origin/main' into problame/benchmarking/pr/mgmt-api-client	2023-12-15 09:40:55 +00:00
Christian Schwarz	e8cd645a82	fixup	2023-12-15 09:35:42 +00:00
John Spray	f1cd1a2122	pageserver: improved handling of concurrent timeline creations on the same ID (#6139) ## Problem Historically, the pageserver used an "uninit mark" file on disk for two purposes: - Track which timeline dirs are incomplete for handling on restart - Avoid trying to create the same timeline twice at the same time. The original purpose of handling restarts is now defunct, as we use remote storage as the source of truth and clean up any trash timeline dirs on startup. Using the file to mutually exclude creation operations is error-prone compared with just doing it in memory, and the existing checks happened some way into the creation operation, and could expose errors as 500s (anyhow::Errors) rather than something clean. ## Summary of changes - Creations are now mutually excluded in memory (using `Tenant::timelines_creating`), rather than relying on a file on disk for coordination. - Acquiring unique access to the timeline ID now happens earlier in the request. - Creating the same timeline which already exists is now a 201: this simplifies retry handling for clients. - 409 is still returned if a timeline with the same ID is still being created: if this happens it is probably because the client timed out an earlier request and has retried. - Colliding timeline creation requests should no longer return 500 errors This paves the way to entirely removing uninit markers in a subsequent change. --------- Co-authored-by: Joonas Koivunen <joonas@neon.tech>	2023-12-15 08:51:23 +00:00
Christian Schwarz	2e0737ce1a	pageserver: keyspace in mgmt api client Part of getpage@lsn benchmark epic: https://github.com/neondatabase/neon/issues/5771	2023-12-14 19:48:59 +00:00
Christian Schwarz	4c5b7cff49	add a Rust client for pageserver mgmt api Part of getpage@lsn benchmark epic: https://github.com/neondatabase/neon/issues/5771 This PR moves the control plane's spread-all-over-the-place client for the pageserver management API into a separate module within the pageserver crate. It also switches to the async version of reqwest, which I think is generally the right direction, and I need an async client API in the benchmark epic.	2023-12-14 19:47:26 +00:00
Joonas Koivunen	f010479107	feat(layer): pageserver_layer_redownloaded_after histogram (#6132) This is aimed at later replacing the current mtime-only-based thrashing alerting. Cc: #5331	2023-12-14 21:32:54 +02:00
Conrad Ludgate	cc633585dc	gauge guards (#6138) ## Problem The websockets gauge for active db connections seems to be growing more than the gauge for client connections over websockets, which does not make sense. ## Summary of changes Refactor how our counter-pair gauges are represented. Not sure if this will fix the problem, but it should be harder to mess up the counters. The API is much nicer now, though, and doesn't require scopeguard::defer hacks	2023-12-14 17:21:39 +00:00
Christian Schwarz	aa5581d14f	utils::logging: TracingEventCountLayer: don't use with_label_values() on hot path (#6129) fixes #6126	2023-12-14 16:31:41 +01:00
John Spray	c4e0ef507f	pageserver: heatmap uploads (#6050) Dependency (commits inline): https://github.com/neondatabase/neon/pull/5842 ## Problem Secondary mode tenants need a manifest of what to download. Ultimately this will be some kind of heat-scored set of layers, but as a robust first step we will simply use the set of resident layers: secondary tenant locations will aim to match the on-disk content of the attached location. ## Summary of changes - Add heatmap types representing the remote structure - Add hooks to Tenant/Timeline for generating these heatmaps - Create a new `HeatmapUploader` type that is external to `Tenant`, and responsible for walking the list of attached tenants and scheduling heatmap uploads. Notes to reviewers: - Putting the logic for uploads (and later, secondary mode downloads) outside of `Tenant` is an opinionated choice, motivated by: - Enable future smarter scheduling of operations, e.g. uploading the stalest tenant first, rather than having all tenants compete for a fair semaphore on a first-come-first-served basis. Similarly for downloads, we may wish to schedule the tenants with the hottest un-downloaded layers first. - Enable accessing upload-related state without synchronization (it belongs to HeatmapUploader, rather than being some Mutex<>'d part of Tenant) - Avoid further expanding the scope of Tenant/Timeline types, which are already among the largest in the codebase - You might reasonably wonder how much of the uploader code could be a generic job manager thing. Probably some of it: but let's defer pulling that out until we have at least two users (perhaps secondary downloads will be the second one) to highlight which bits are really generic. Compromises: - Later, instead of using digests of heatmaps to decide whether anything changed, I would prefer to avoid walking the layers in tenants that don't have changes: tracking that will be a bit invasive, as it needs input from both remote_timeline_client and Layer.	2023-12-14 13:09:24 +00:00
Conrad Ludgate	6987b5c44e	proxy: add more rates to endpoint limiter (#6130) ## Problem Single rate bucket is limited in usefulness ## Summary of changes Introduce a secondary bucket allowing an average of 200 requests per second over 1 minute, and a tertiary bucket allowing an average of 100 requests per second over 10 minutes. Configured by using a format like ```sh proxy --endpoint-rps-limit 300@1s --endpoint-rps-limit 100@10s --endpoint-rps-limit 50@1m ``` If the bucket limits are inconsistent, an error is returned on startup ``` $ proxy --endpoint-rps-limit 300@1s --endpoint-rps-limit 10@10s Error: invalid endpoint RPS limits. 10@10s allows fewer requests per bucket than 300@1s (100 vs 300) ```	2023-12-13 21:43:49 +00:00
Alexander Bayandin	0cd49cac84	test_compatibility: make it use initdb.tar.zst	2023-12-13 15:04:25 -06:00
Alexander Bayandin	904dff58b5	test_wal_restore_http: cleanup test	2023-12-13 15:04:25 -06:00
Arthur Petukhovsky	f401a21cf6	Fix test_simple_sync_safekeepers There is a postgres 16 version encoded in a binary message.	2023-12-13 15:04:25 -06:00
Tristan Partin	158adf602e	Update Postgres 16 series to 16.1	2023-12-13 15:04:25 -06:00
Tristan Partin	c94db6adbb	Update Postgres 15 series to 15.5	2023-12-13 15:04:25 -06:00
Tristan Partin	85720616b1	Update Postgres 14 series to 14.10	2023-12-13 15:04:25 -06:00
George MacKerron	d6fcc18eb2	Add Neon-Batch- headers to OPTIONS response for SQL-over-HTTP requests (#6116) This is needed to allow use of batch queries from browsers. ## Problem SQL-over-HTTP batch queries fail from web browsers because the relevant headers, `Neon-Batch-Isolation-Level` and `Neon-Batch-Read-Only`, are not included in the server's OPTIONS response. I think we simply forgot to add them when implementing the batch query feature. ## Summary of changes Added `Neon-Batch-Isolation-Level` and `Neon-Batch-Read-Only` to the OPTIONS response.	2023-12-13 17:18:20 +00:00
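
The bucket-consistency check described in commit 6987b5c44e (proxy: add more rates to endpoint limiter) can be sketched roughly as below. This is a minimal illustration under stated assumptions, not the proxy's actual code: `RateBucket`, `parse_bucket`, and `validate_buckets` are hypothetical names, only the `s` and `m` duration suffixes are handled, and sorting the buckets by interval before comparing them is an assumption. The only rule taken from the commit message is that a longer bucket must not allow fewer total requests than a shorter one (rate multiplied by interval, e.g. 10@10s gives 100 and 300@1s gives 300).

```rust
use std::time::Duration;

/// One rate-limit bucket: an average of `rps` requests per second over `interval`.
/// (Hypothetical type for illustration only.)
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct RateBucket {
    pub rps: u64,
    pub interval: Duration,
}

impl RateBucket {
    /// Total number of requests the bucket admits over one full interval.
    pub fn max_requests_per_interval(&self) -> u64 {
        self.rps.saturating_mul(self.interval.as_secs())
    }
}

/// Parse a spec such as "300@1s", "100@10s" or "50@1m".
/// Assumption: only whole seconds (`s`) and whole minutes (`m`) are supported here.
pub fn parse_bucket(spec: &str) -> Result<RateBucket, String> {
    let (rps, dur) = spec
        .split_once('@')
        .ok_or_else(|| format!("missing '@' in {spec:?}"))?;
    let rps: u64 = rps
        .parse()
        .map_err(|e| format!("bad rate in {spec:?}: {e}"))?;
    let (num, unit_secs) = if let Some(n) = dur.strip_suffix('s') {
        (n, 1)
    } else if let Some(n) = dur.strip_suffix('m') {
        (n, 60)
    } else {
        return Err(format!("unsupported duration unit in {spec:?}"));
    };
    let num: u64 = num
        .parse()
        .map_err(|e| format!("bad duration in {spec:?}: {e}"))?;
    Ok(RateBucket {
        rps,
        interval: Duration::from_secs(num.saturating_mul(unit_secs)),
    })
}

/// Parse all specs, order them by interval (an assumption of this sketch), and
/// reject any bucket that admits fewer total requests than the next shorter one.
pub fn validate_buckets(specs: &[&str]) -> Result<Vec<RateBucket>, String> {
    let mut parsed = Vec::with_capacity(specs.len());
    for spec in specs {
        parsed.push((*spec, parse_bucket(spec)?));
    }
    parsed.sort_by_key(|(_, bucket)| bucket.interval);

    for pair in parsed.windows(2) {
        let (short_spec, short) = pair[0];
        let (long_spec, long) = pair[1];
        if long.max_requests_per_interval() < short.max_requests_per_interval() {
            return Err(format!(
                "{long_spec} allows fewer requests per bucket than {short_spec} ({} vs {})",
                long.max_requests_per_interval(),
                short.max_requests_per_interval()
            ));
        }
    }
    Ok(parsed.into_iter().map(|(_, bucket)| bucket).collect())
}
```

With this sketch, `validate_buckets(&["300@1s", "100@10s", "50@1m"])` succeeds (300, 1000 and 3000 requests per bucket), while `validate_buckets(&["300@1s", "10@10s"])` returns an error along the lines of the one quoted in the commit message.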


4248 Commits