rust/neon - neon - Gitea: Git with a cup of tea

rust/neon

mirror of https://github.com/neondatabase/neon.git synced 2026-05-29 19:10:38 +00:00

Author	SHA1	Message	Date
Arpad Müller	4b26783c94	scrubber: remove _generic postfix and two unused functions (#8761 ) Removes the `_generic` postfix from the `GenericRemoteStorage` using APIs, as `remote_storage` is the "default" now, and add a `_s3` postfix to the remaining APIs using the S3 SDK (only in tenant snapshot). Also, remove two unused functions: `list_objects_with_retries` and `stream_tenants functions`. Part of https://github.com/neondatabase/neon/issues/7547	2024-08-19 23:58:47 +02:00
Arpad Müller	9fabdda2dc	scrubber: add remote_storage based listing APIs and use them in find-large-objects (#8541 ) Add two new functions `stream_objects_with_retries` and `stream_tenants_generic` and use them in the `find-large-objects` subcommand, migrating it to `remote_storage`. Also adds the `size` field to the `ListingObject` struct. Part of #7547	2024-07-30 09:00:37 +00:00
Alex Chi Z	b1fe8259b4	fix(storage-scrubber): use default AWS authentication (#8299 ) part of https://github.com/neondatabase/cloud/issues/14024 close https://github.com/neondatabase/neon/issues/7665 Things running in k8s container use this authentication: https://docs.aws.amazon.com/sdkref/latest/guide/feature-container-credentials.html while we did not configure the client to use it. This pull request simply uses the default s3 client credential chain for storage scrubber. It might break compatibility with minio. ## Summary of changes * Use default AWS credential provider chain. * Improvements for s3 errors, we now have detailed errors and correct backtrace on last trial of the operation. --------- Signed-off-by: Alex Chi Z <chi@neon.tech> Co-authored-by: Joonas Koivunen <joonas@neon.tech>	2024-07-09 18:41:37 +01:00
Arpad Müller	0a937b7f91	Add concurrency to the find-large-objects scrubber subcommand (#8291 ) The find-large-objects scrubber subcommand is quite fast if you run it in an environment with low latency to the S3 bucket (say an EC2 instance in the same region). However, the higher the latency gets, the slower the command becomes. Therefore, add a concurrency param and make it parallelized. This doesn't change that general relationship, but at least lets us do multiple requests in parallel and therefore hopefully faster. Running with concurrency of 64 (default): ``` 2024-07-05T17:30:22.882959Z INFO lazy_load_identity [...] [...] 2024-07-05T17:30:28.289853Z INFO Scanned 500 shards. [...] ``` With concurrency of 1, simulating state before this PR: ``` 2024-07-05T17:31:43.375153Z INFO lazy_load_identity [...] [...] 2024-07-05T17:33:51.987092Z INFO Scanned 500 shards. [...] ``` In other words, to list 500 shards, speed is increased from 2:08 minutes to 6 seconds. Follow-up of #8257, part of #5431	2024-07-05 21:36:28 +01:00
Arpad Müller	e579bc0819	Add find-large-objects subcommand to scrubber (#8257 ) Adds a find-large-objects subcommand to the scrubber to allow listing layer objects larger than a specific size. To be used like: ``` AWS_PROFILE=dev REGION=us-east-2 BUCKET=neon-dev-storage-us-east-2 cargo run -p storage_scrubber -- find-large-objects --min-size 250000000 --ignore-deltas ``` Part of #5431	2024-07-04 15:07:16 +00:00

5 Commits