rust/neon - neon - Gitea: Git with a cup of tea

rust/neon

mirror of https://github.com/neondatabase/neon.git synced 2026-07-04 04:30:38 +00:00

Author	SHA1	Message	Date
Heikki Linnakangas	7eefad691c	Improve the scaling, add Play/Stop buttons	2023-03-22 19:40:54 +02:00
Heikki Linnakangas	fe59a063ea	WIP: Collect and draw layer trace	2023-03-22 19:40:54 +02:00
Heikki Linnakangas	ae8e5b3a8e	Add test from PR #3673	2023-03-22 19:40:54 +02:00
Kirill Bulatov	8bd565e09e	Ensure branches with no layers have their remote storage counterpart created eventually (#3857 ) Discovered during writing a test for https://github.com/neondatabase/neon/pull/3843	2023-03-22 17:42:31 +02:00
Joonas Koivunen	6033dfdf4a	Re-access layers before threshold eviction (#3867 ) To avoid re-downloading evicted files on restart, re-compute logical size and partitioning before each threshold based eviction run. Cc: #3802 Co-authored-by: Christian Schwarz <christian@neon.tech>	2023-03-22 16:26:27 +02:00
mikecaat	14a40c9ca6	Fix minor things for the docker-compose file (#3862 ) * Add the REPOSITORY env to build args to avoid the following error when executing without the credentials for the repository. ``` ERROR: Service 'compute' failed to build: Head "https://369495373322.dkr.ecr.eu-central-1.amazonaws.com/v2/compute-node-v15/manifests/2221": no basic auth credentials ``` * update the tag version in the documentation to support storage broker	2023-03-22 08:10:53 +00:00
Shany Pozin	0f7de84785	Allow calling detach on ignored tenant (#3834 ) ## Describe your changes Added a query param to detach API Allow to remove local state of a tenant even if its not in the memory (following ignore API) ## Issue ticket number and link #3828 ## Checklist before requesting a review - [x] I have performed a self-review of my code. - [ ] If it is a core feature, I have added thorough tests. - [ ] Do we need to implement analytics? if so did you add the relevant metrics to the dashboard? - [ ] If this PR requires public announcement, mark it with /release-notes label and add several sentences in this section. --------- Co-authored-by: Kirill Bulatov <kirill@neon.tech>	2023-03-22 07:17:00 +00:00
Kirill Bulatov	dd22c87100	Remove older layer metadata format support code (#3854 ) The PR enforces current newest `index_part.json` format in the type system (version `1`), not allowing any previous forms of it, that were used in the past. Similarly, the code to mitigate the https://github.com/neondatabase/neon/issues/3024 issue is now also removed. Current code does not produce old formats and extra files in the index_part.json, in the future we will be able to use https://github.com/neondatabase/aversion or other approach to make version transitions more explicit. See https://neondb.slack.com/archives/C033RQ5SPDH/p1679134185248119 for the justification on the breaking changes.	2023-03-21 23:33:28 +02:00
Heikki Linnakangas	6fdd9c10d1	Read storage auth token from spec file. We read the pageserver connection string from the spec file, so let's read the auth token from the same place. We've been talking about pre-launching compute nodes that are not associated with any particular tenant at startup, so that the spec file is delivered to the compute node later. We cannot change the env variables after the process has been launched. We still pass the token to 'postgres' binary in the NEON_AUTH_TOKEN env variable, but compute_ctl is now responsible for setting it.	2023-03-21 20:12:09 +02:00
Dmitry Rodionov	4158e24e60	rfc: delete pageserver data from s3 (#3792 ) [Rendered](https://github.com/neondatabase/neon/blob/main/docs/rfcs/022-pageserver-delete-from-s3.md) --------- Co-authored-by: Joonas Koivunen <joonas@neon.tech>	2023-03-21 20:03:27 +02:00
Shany Pozin	809acb5fa9	Move neon-image-depot to a larger runner (#3860 ) ## Describe your changes https://neondb.slack.com/archives/C039YKBRZB4/p1679413279637059 ## Issue ticket number and link ## Checklist before requesting a review - [ ] I have performed a self-review of my code. - [ ] If it is a core feature, I have added thorough tests. - [ ] Do we need to implement analytics? if so did you add the relevant metrics to the dashboard? - [ ] If this PR requires public announcement, mark it with /release-notes label and add several sentences in this section.	2023-03-21 19:32:36 +02:00
Heikki Linnakangas	299db9d028	Simplify and clean up the $NEON_AUTH_TOKEN stuff in compute - Remove the neon.safekeeper_token_env GUC. It was used to set the name of an environment variable, which was then used in pageserver and safekeeper connection strings to in place of the password. Instead, always look up the environment variable called NEON_AUTH_TOKEN. That's what neon.safekeeper_token_env was always set to in practice, and I don't see the need for the extra level of indirection or configurability. - Instead of substituting $NEON_AUTH_TOKEN in the connection strings, pass $NEON_AUTH_TOKEN "out-of-band" as the password, when we connect to the pageserver or safekeepers. That's simpler. - Also use the password from $NEON_AUTH_TOKEN in compute_ctl, when it connects to the pageserver to get the "base backup".	2023-03-21 00:15:04 +02:00
Heikki Linnakangas	5a786fab4f	Remove duplicated global variables in neon extension. Walproposer used to live in the backend, while pagestore_smgr was an extension. But now that both are part of the neon extension, walproposer can access the same 'neon_tenant' and 'neon_timeline' variables as the pageserver_smgr code.	2023-03-21 00:15:04 +02:00
Arseny Sher	699f200811	Send error context chain to the client when Copy stream errors.	2023-03-21 01:22:02 +04:00
Christian Schwarz	881356c417	add metrics to detect eviction-induced thrashing (#3837 ) This patch adds two metrics that will enable us to detect thrashing of layers, i.e., repetitions of `eviction, on-demand-download, eviction, ... ` for a given layer. The first metric counts all layer evictions per timeline. It requires no further explanation. The second metric counts the layer evictions where the layer was resident for less than a given threshold. We can alert on increments to the second metric. The first metric will serve as a baseline, and further, it's generally interesting, outside of thrashing. The second metric's threshold is configurable in PageServerConf and defaults to 24h. The threshold value is reproduced as a label in the metric because the counter's value is semantically tied to that threshold. Since changes to the config and hence the label value are infrequent, this will have low storage overhead in the metrics storage. The data source to determine the time that the layer was resident is the file's `mtime`. Using `mtime` is more of a crutch. It would be better if Pageserver did its own persistent bookkeeping of residence change events instead of relying on the filesystem. We had some discussion about this: https://github.com/neondatabase/neon/pull/3809#issuecomment-1470448900 My position is that `mtime` is good enough for now. It can theoretically jump forward if someone copies files without resetting `mtime`. But that shouldn't happen in practice. Note that moving files back and forth doesn't change `mtime`, nor does `chown` or `chmod`. Lastly, `rsync -a`, which is typically used for filesystem-level backup / restore, correctly syncs `mtime`. I've added a label that identifies the data source to keep options open for a future, better data source than `mtime`. Since this value will stay the same for the time being, it's not a problem for metrics storage. refs https://github.com/neondatabase/neon/issues/3728	2023-03-20 16:11:36 +01:00
Heikki Linnakangas	fea4b5f551	Switch to EdDSA algorithm for the storage JWT authentication tokens. The control plane currently only supports EdDSA. We need to either teach the storage to use EdDSA, or the control plane to use RSA. EdDSA is more modern, so let's use that. We could support both, but it would require a little more code and tests, and we don't really need the flexibility since we control both sides.	2023-03-20 16:28:01 +02:00
Heikki Linnakangas	77107607f3	Allow JWT key generation to fail if authentication is not enabled. This allows you to run without the 'openssl' binary as long as you don't enable authentication. This becomes more important with the next commit, which switches the JWT algorithm to EdDSA. LibreSSL does not support EdDSA, and LibreSSL comes with macOS, so the next commit makes it much more likely for the key generation to fail for macOS users. To allow running without a keypair, don't generate the authentication token in the 'neon_local init' step. Instead, generate a new token on every request that needs one, using the private key.	2023-03-20 16:28:01 +02:00
Heikki Linnakangas	1da963b2f9	Remove some unused code in control plane.	2023-03-20 16:28:01 +02:00
Heikki Linnakangas	1ddb9249aa	Reduce the # of histogram buckets in metrics. (#3850 ) Shrinks the total number of metrics collected for each timeline by about 50%. See https://github.com/neondatabase/neon/issues/2848. This doesn't fully solve the problem, we still collect a lot of metrics even with this, but this gives us a lot of headroom.	2023-03-20 15:49:16 +02:00
Joonas Koivunen	0c1228c37a	feat: store initial timeline in env fixture (#3839 ) minor change, but will allow more use in future for the default tenants. Co-authored-by: Alexander Bayandin <alexander@neon.tech>	2023-03-20 11:57:27 +02:00
Christian Schwarz	3c15874c48	allow specifying eviction_policy in TenantCreateRequest This was on oversight from `175a577ad4`. Nothing uses this AFAIK, but, let's fix it anyways. Noticed while working on https://github.com/neondatabase/neon/issues/3728	2023-03-20 10:43:53 +01:00
Shany Pozin	93f3f4ab5f	Return NotFound in mgmt API requests when tenant is not present in the pageserver (#3818 ) ## Describe your changes Add Error enum for tenant state response to allow better error handling in mgmt api ## Issue ticket number and link #2238 ## Checklist before requesting a review - [x] I have performed a self-review of my code. - [ ] If it is a core feature, I have added thorough tests. - [ ] Do we need to implement analytics? if so did you add the relevant metrics to the dashboard? - [ ] If this PR requires public announcement, mark it with /release-notes label and add several sentences in this section.	2023-03-19 10:44:42 +02:00
sharnoff	f6e2e0042d	Fix + re-enable VM cgroup creation + running (#3820 ) Re-enable cgroup shenanigans in VMs, with some special care taken to make sure that our version of cgroup-tools supports cgroup v2 (debian bullseye does not, and probably won't because it requires a breaking change in libcgroup). This involves manually building libcgroup / cgroup-tools from source, then copying the output into the final build stage. We originally considered pulling the package from debian's testing repo (which is up-to-date), but decided against it. Refer to the PR for more details. Prior work, for reference: * `2153d2e0` - Run compute_ctl in a cgroup in VMs * `1360361f` - Fix missing VM cgconfig.conf * `8dae8799` - Disable VM cgroup shenanigans	2023-03-16 17:09:45 -07:00
Christian Schwarz	b917270c67	remove unused TenantConfig::update function	2023-03-16 16:25:35 +01:00
Arthur Petukhovsky	b067378d0d	Measure cross-AZ traffic in safekeepers (#3806 ) Create `safekeeper_pg_io_bytes_total` metric to track total amount of bytes written/read in a postgres connections to safekeepers. This metric has the following labels: - `client_az` – availability zone of the connection initiator, or `"unknown"` - `sk_az` – availability zone of the safekeeper, or `"unknown"` - `app_name` – `application_name` of the postgres client - `dir` – data direction, either `"read"` or `"write"` - `same_az` – `"true"`, `"false"` or `"unknown"`. Can be derived from `client_az` and `sk_az`, exists purely for convenience. This is implemented by passing availability zone in the connection string, like this: `-c tenant_id=AAA timeline_id=BBB availability-zone=AZ-1`. Update ansible deployment scripts to add availability_zone argument to safekeeper and pageserver in systemd service files.	2023-03-16 17:24:01 +03:00
Joonas Koivunen	768c8d9972	test: allow gc to get unlucky (#3826 ) this failure case was probably introduced by `b220ba6`, because earlier the gc would always have run fast enough for restart every 1s. however, test got added later, so we have just been lucky. fixes #3824 by allowing this error to happen.	2023-03-16 11:03:21 +02:00
Rahul Patil	f1d960d2c2	Add new pageserver to eu-central-1 (#3829 ) ## Describe your changes ## Issue ticket number and link ## Checklist before requesting a review - [x] I have performed a self-review of my code. - [ ] If it is a core feature, I have added thorough tests. - [ ] Do we need to implement analytics? if so did you add the relevant metrics to the dashboard? - [ ] If this PR requires public announcement, mark it with /release-notes label and add several sentences in this section.	2023-03-15 16:58:28 +01:00
Arthur Petukhovsky	cd17802b1f	Add pg_clients test for serverless driver (#3827 ) Fixes #3819	2023-03-15 16:18:38 +03:00
Heikki Linnakangas	10a5d36af8	Separate mgmt and libpq authentication configs in pageserver. (#3773 ) This makes it possible to enable authentication only for the mgmt HTTP API or the compute API. The HTTP API doesn't need to be directly accessible from compute nodes, and it can be secured through network policies. This also allows rolling out authentication in a piecemeal fashion.	2023-03-15 13:52:29 +02:00
Arseny Sher	a7ab53c80c	Forward framed read buf contents to compute before proxy pass. Otherwise they get lost. Normally buffer is empty before proxy pass, but this is not the case with pipeline mode of out npm driver; fixes connection hangup introduced by `b80fe41af3` for it. fixes https://github.com/neondatabase/neon/issues/3822	2023-03-15 14:32:41 +03:00
Heikki Linnakangas	2672fd09d8	Make test independent of the order of config lines.	2023-03-14 20:10:34 +02:00
Heikki Linnakangas	4a92799f24	Fix check for trailing garbage in basebackup import. There was a warning for trailing garbage after end-of-tar archive, but it didn't always work. The reason is that we created a StreamReader over the original copyin-stream, but performed the check for garbage on the copyin-stream. There could be some garbage bytes buffered in the StreamReader, which were not caught by the warning. I considered turning the the warning into a fatal error, aborting the import, but I wasn't sure if we handle aborting the import properly. Do we clean up the timeline directory on error? If we don't, we should make that more robust, but that's a different story. Also, normally a valid tar archive ends with two 512-byte blocks of zeros. The tokio_tar crate stops at the first all-zeros block. Read and check the second all-zeros block, and error out if it's not there, or contains something unexpected.	2023-03-14 19:45:33 +02:00
Konstantin Knizhnik	5396273541	Avoid holes between generated image layers (#3771 ) ## Describe your changes When we perform partitioning of the whole key space, we take in account actual ranges of relation present in the database. So if we have relation with relid=1 and size 100 and relation with relid=2 with size 200 then result of KeySpace::partition may contain partitions <100000000..100000099> and <200000000..200000199>. Generated image layers will contain the same boundaries. But when GC is checking image coverage to find out of old layer is fully covered by newer image layer and so can be deleted, it takes in account only full key range. I.e. if there is delta layer <100000000..300000000> then it never be garbage collected because image layers <100000000..100000099> and <200000000..200000199> are not completely covering it.This is how it looks in practice: 000000067F000032AC00000A300000000000-000000067F000032AC00000A330000000000__000000000F761828 000000067F000032AC00000A31000000001F-000000067F000032AC00000A620000000005__0000000001696070-000000000442A551 000000067F000032AC00000A3300FFFFFFFF-000000067F000032AC00000A650100000000__000000000F761828 So there are two image layers covering delta layer but ... there is a hole: A330000000000...A3300FFFFFFFF and as a result delta layer is not collected. ## Issue ticket number and link This PR is deeply related with #3673 because it is addressing the same problem: old layers are not utilized by GC. The test test_gc_old_layers.py in #3673 can be used to see effect of this patch. ## Checklist before requesting a review - [ ] I have performed a self-review of my code. - [ ] If it is a core feature, I have added thorough tests. - [ ] Do we need to implement analytics? if so did you add the relevant metrics to the dashboard? - [ ] If this PR requires public announcement, mark it with /release-notes label and add several sentences in this section. --------- Co-authored-by: Joonas Koivunen <joonas@neon.tech>	2023-03-14 16:15:07 +02:00
Joonas Koivunen	c23c8946a3	chore: clippies introduced with rust 1.68 (#3781 ) - handle automatically fixable future clippies - tune run-clippy.sh to remove macos specifics which we no longer have Co-authored-by: Alexander Bayandin <alexander@neon.tech>	2023-03-14 15:29:02 +02:00
Joonas Koivunen	15b692ccc9	test: more strict finding of WARN, ERROR lines (#3798 ) this prevents flakyness when `WARN\|ERROR` appears in some other part of the line, for example in a random filename.	2023-03-14 15:27:52 +02:00
Alexander Bayandin	3d869cbcde	Replace flake8 and isort with ruff (#3810 ) - Introduce ruff (https://beta.ruff.rs/) to replace flake8 and isort - Update mypy and black	2023-03-14 13:25:44 +00:00
Lassi Pölönen	68ae020b37	Use RollingUpdate strategy also for legacy proxy (#3814 ) ## Describe your changes We have previously changed the neon-proxy to use RollingUpdate. This should be enabled in legacy proxy too in order to avoid breaking connections for the clients and allow for example backups to run even during deployment. (https://github.com/neondatabase/neon/pull/3683) ## Issue ticket number and link https://github.com/neondatabase/neon/issues/3333	2023-03-14 13:23:46 +00:00
Joonas Koivunen	d6bb8caad4	refactor: correct return value for not found L0's on LayerMap::replace (#3805 ) in prev implementation, an `ok_or_else(...)?` is used to cause a "precondition error" on LayerMap::replace, however we only see this particular error if an L0 for which replace fails is not in the layermap because it is not in `l0_delta_layers`. changes or fixes this to be Replacement::NotFound instead, making it more clear that an error would only be raised for actual preconditions, like trying to replace layer with completly unrelated layer.	2023-03-14 13:18:26 +02:00
Alexander Bayandin	319402fc74	postgres_ffi: restore POSTGRES_INSTALL_DIR support (#3811 ) Fix path construction to `pg_config`: `pg_install_dir_versioned` already includes `pg_version`	2023-03-14 10:25:48 +00:00
Dmitry Ivanov	2e4bf7cee4	[proxy] Immediately log all compute node connection errors.	2023-03-14 01:45:57 +03:00
Dmitry Ivanov	15ed6af5f2	Add descriptions to proxy's python tests.	2023-03-14 01:32:37 +03:00
Dmitry Rodionov	50476a7cc7	test: update to match current interfaces	2023-03-13 17:50:10 +02:00
andres	d7ab69f303	add test for getting branchpoints from an inactive timeline	2023-03-13 17:50:10 +02:00
sharnoff	582620274a	Enable file cache handling by vm-informant (#3794 ) Enables the VM informant's file cache integration. See also: https://github.com/neondatabase/autoscaling/pull/47	2023-03-13 07:16:39 -07:00
Vadim Kharitonov	daeaa767c4	Add `neondatabase/release` team as a default reviewers for storage releases	2023-03-13 13:40:15 +01:00
Konstantin Knizhnik	f0573f5991	Remove block cursor cache (#3740 ) ## Describe your changes Do not pin current block in BlockCursor ## Issue ticket number and link See #3712 There are places (see get_reconstruct_data) in our code when thread is holding read layers lock and then try to read file and so lock page cache slot. So we have edge in dependency graph layers->page cache slot. At the same time (as Christian noticed) we can lock page cache slot in BlockCursor and then try obtain shard lock on layers. So there is backward edge in dependency graph page cache slot>layers which forms loop and may cause deadlock. There are three possible fixes of the problem: 1. Perform compaction under `layers` shared lock. See PR #3732. It fixes the problem but make it not possible to append any data to pageserver until compaction is completed. 2. Do not hold `layers` lock while accessing layers (not sure if it is possible to do because it definitely introduce some new race conditions). 3. Do not pin current pages in BockCursor (this PR). My experiments shows that this cache in BlockCursor is not so useful: the number of hits/misses for cursor cache on pgbench workload (-i -s 10/-c 10 -T 100/-c 10 -S -T 100): ``` hits: 163011 misses: 1023602 ``` So number of cache misses is 10x times larger. And results for read-only pgbench are mostly the same: ``` with cache: 14581 w/out cache: 14429 ``` ## Checklist before requesting a review - [ ] I have performed a self-review of my code. - [ ] If it is a core feature, I have added thorough tests. - [ ] Do we need to implement analytics? if so did you add the relevant metrics to the dashboard? - [ ] If this PR requires public announcement, mark it with /release-notes label and add several sentences in this section.	2023-03-13 14:07:35 +02:00
Nikita Kalyanov	07dcf679de	set content type explicitly (#3799 ) I moved management API v2 to ogen and the generated code seems to be more strict about content type. Let's set it properly as it is json after all ## Describe your changes ## Issue ticket number and link ## Checklist before requesting a review - [x] I have performed a self-review of my code. - [ ] If it is a core feature, I have added thorough tests. - [ ] Do we need to implement analytics? if so did you add the relevant metrics to the dashboard? - [ ] If this PR requires public announcement, mark it with /release-notes label and add several sentences in this section.	2023-03-13 14:00:01 +02:00
Heikki Linnakangas	e0ee138a8b	Add a test for tokio-postgres client to the driver test suite. It is fully supported. To enable TLS, though, it requires some extra glue code, and a dependency to a TLS library.	2023-03-13 12:14:37 +02:00
Arthur Petukhovsky	d9a1329834	Make postgres_backend use generic IO type (#3789 ) - Support measuring inbound and outbound traffic in MeasuredStream - Start using MeasuredStream in safekeepers code	2023-03-13 12:18:10 +03:00
Joonas Koivunen	8699342249	Ondemand rx bytes and layer count (#3777 ) Adds two new global metrics: - pageserver_remote_ondemand_downloaded_layers_total - pageserver_remote_ondemand_downloaded_bytes_total An existing test is repurposed once more to check that we do get some reasonable counts. These are to replace guessing from the nic RX bytes metric how much was on-demand downloaded. First part of #3745: This does not add the "(un)?avoidable" metric, which I plan to add as a new metric, which will be a subset of the counts of the metrics added here.	2023-03-13 09:26:49 +02:00

1 2 3 4 5 ...

2944 Commits