rust/neon

mirror of https://github.com/neondatabase/neon.git synced 2026-01-04 03:52:56 +00:00

Author	SHA1	Message	Date
Heikki Linnakangas	32fd709b34	Fix links to safekeeper protocol docs. (#2188 ) safekeeper/README_PROTO.md was moved to docs/safekeeper-protocol.md in commit `0b14fdb078`, as part of reorganizing the docs into 'mdbook' format. Fixes issue #1475. Thanks to @banks for spotting the outdated references. In addition to fixing the above issue, this patch also fixes other broken links as a result of `0b14fdb078`. See https://github.com/neondatabase/neon/pull/2188#pullrequestreview-1055918480. Co-authored-by: Heikki Linnakangas <heikki@neon.tech> Co-authored-by: Thang Pham <thang@neon.tech>	2022-08-09 10:19:18 +07:00
Kirill Bulatov	3a9bff81db	Fix etcd typos	2022-08-08 19:04:46 +03:00
bojanserafimov	743370de98	Major migration script (#2073 ) This script can be used to migrate a tenant across breaking storage versions, or (in the future) upgrading postgres versions. See the comment at the top for an overview. Co-authored-by: Anastasia Lubennikova <anastasia@neon.tech>	2022-08-08 17:52:28 +02:00
Dmitry Rodionov	cdfa9fe705	avoid duplicate parameter, increase timeout	2022-08-08 12:15:16 +03:00
Dmitry Rodionov	7cd68a0c27	increase timeout to pass test with real s3	2022-08-08 12:15:16 +03:00
Dmitry Rodionov	beaa991f81	remove debug log	2022-08-08 12:15:16 +03:00
Dmitry Rodionov	9430abae05	use event so it fires only if workload thread successfully finished	2022-08-08 12:15:16 +03:00
Dmitry Rodionov	4da4c7f769	increase statement timeout	2022-08-08 12:15:16 +03:00
Dmitry Rodionov	0d14d4a1a8	ignore record property warning to fix benchmarks	2022-08-08 12:15:16 +03:00
bojanserafimov	8c8431ebc6	Add more buckets to pageserver latency metrics (#2225 )	2022-08-06 11:45:47 +02:00
Ankur Srivastava	84d1bc06a9	refactor: replace lazy-static with once-cell (#2195 ) - Replacing all the occurrences of lazy-static with `once-cell::sync::Lazy` - fixes #1147 Signed-off-by: Ankur Srivastava <best.ankur@gmail.com>	2022-08-05 19:34:04 +02:00
Konstantin Knizhnik	5133db44e1	Move relation size cache from WalIngest to DatadirTimeline (#2094 ) * Move relation size cache to layered timeline * Fix obtaining current LSN for relation size cache * Resolve merge conflicts * Resolve merge conflicts * Restore 'lsn' field in DatadirModification * adjust DatadirModification lsn in ingest_record * Fix formatting * Pass lsn to get_relsize * Fix merge conflict * Update pageserver/src/pgdatadir_mapping.rs Co-authored-by: Heikki Linnakangas <heikki@zenith.tech> * Update pageserver/src/pgdatadir_mapping.rs Co-authored-by: Heikki Linnakangas <heikki@zenith.tech> Co-authored-by: Heikki Linnakangas <heikki@zenith.tech>	2022-08-05 16:28:59 +03:00
Alexander Bayandin	4cb1074fe5	github/workflows: Fix git dubious ownership (#2223 )	2022-08-05 13:44:57 +01:00
Arthur Petukhovsky	0a958b0ea1	Check find_end_of_wal errors instead of unwrap	2022-08-04 17:56:19 +03:00
Vadim Kharitonov	1bbc8090f3	[issue #1591 ] Add `neon_local pageserver status` handler	2022-08-04 16:38:29 +03:00
Dmitry Rodionov	f7d8db7e39	silence https://github.com/neondatabase/neon/issues/2211	2022-08-04 16:32:19 +03:00
Dmitry Rodionov	e54941b811	treat pytest warnings as errors	2022-08-04 16:32:19 +03:00
Heikki Linnakangas	52ce1c9d53	Speed up test shutdown, by polling more frequently. A fair amount of the time in our python tests is spent waiting for the pageserver and safekeeper processes to shut down. It doesn't matter so much when you're running a lot of tests in parallel, but it's quite noticeable when running them sequentially. A big part of the slowness is that after sending the SIGTERM signal, we poll to see if the process is still running, and the polling happened at a 1 s interval. Reduce it to 0.1 s.	2022-08-04 12:57:15 +03:00
Dmitry Rodionov	bc2cb5382b	run real s3 tests in CI	2022-08-04 11:14:05 +03:00
Dmitry Rodionov	5f71aa09d3	support running tests against real s3 implementation without mocking	2022-08-04 11:14:05 +03:00
Dmitry Rodionov	b4f2c5b514	run benchmarks conditionally, on main or if run_benchmarks label is set	2022-08-03 01:36:14 +03:00
Alexander Bayandin	71f39bac3d	github/workflows: upload artifacts to S3 (#2071 )	2022-08-02 13:57:26 +01:00
Stas Kelvich	177d5b1f22	Bump postgres to get uuid extension	2022-08-02 11:16:26 +03:00
dependabot[bot]	8ba41b8c18	Bump pywin32 from 227 to 301 (#2202 )	2022-08-01 19:08:09 +01:00
Dmitry Rodionov	1edf3eb2c8	increase timeout so mac os job can finish the build with all cache misses	2022-08-01 18:28:49 +03:00
Dmitry Rodionov	0ebb6bc4b0	Temporarily pin Werkzeug version because moto hangs with newer one. See https://github.com/spulec/moto/issues/5341	2022-08-01 18:28:49 +03:00
Dmitry Rodionov	092a9b74d3	use only s3 in boto3-stubs and update mypy Newer version of mypy fixes buggy error when trying to update only boto3 stubs. However it brings new checks and starts to yell when we index into cursor.fetchone without checking for None first. So this introduces a wrapper to simplify querying for scalar values. I tried to use cursor_factory connection argument but without success. There can be a better way to do that, but this looks the simplest	2022-08-01 18:28:49 +03:00
Ankur Srivastava	e73b95a09d	docs: linked poetry related step in tests section Added the link to the dependencies which should be installed before running the tests.	2022-08-01 18:13:01 +03:00
Alexander Bayandin	539007c173	github/workflows: make bash more strict (#2197 )	2022-08-01 12:54:39 +01:00
Heikki Linnakangas	d0494c391a	Remove wal_receiver mgmt API endpoint Move all the fields that were returned by the wal_receiver endpoint into timeline_detail. Internally, move those fields from the separate global WAL_RECEIVERS hash into the LayeredTimeline struct. That way, all the information about a timeline is kept in one place. In the passing, I noted that the 'thread_id' field was removed from WalReceiverEntry in commit `e5cb727572`, but it forgot to update openapi_spec.yml. This commit removes that too.	2022-07-29 20:51:37 +03:00
Kirill Bulatov	2af5a96f0d	Back off when reenqueueing delete tasks	2022-07-29 19:04:40 +03:00
Vadim Kharitonov	9733b24f4a	Fix README.md: Fixed several typos and changed a bit documentation for OSX	2022-07-29 19:03:57 +03:00
Heikki Linnakangas	d865892a06	Print full error with stacktrace, if compute node startup fails. It failed in staging environment a few times, and all we got in the logs was: ERROR could not start the compute node: failed to get basebackup@0/2D6194F8 from pageserver host=zenith-us-stage-ps-2.local port=6400 giving control plane 30s to collect the error before shutdown That's missing all the detail on why it failed.	2022-07-29 16:41:55 +03:00
Heikki Linnakangas	a0f76253f8	Bump Postgres version. This brings in the inclusion of 'uuid-ossp' extension.	2022-07-29 16:30:39 +03:00
Heikki Linnakangas	02afa2762c	Move Tenant- and TimelineInfo structs to models.rs. They are part of the management API response structs. Let's try to concentrate everything that's part of the API in models.rs.	2022-07-29 15:02:15 +03:00
Heikki Linnakangas	d903dd61bd	Rename 'wal_producer_connstr' to 'wal_source_connstr'. What the WAL receiver really connects to is the safekeeper. The "producer" term is a bit misleading, as the safekeeper doesn't produce the WAL, the compute node does. This change also applies to the name of the field used in the mgmt API in the response of the '/v1/tenant/:tenant_id/timeline/:timeline_id/wal_receiver' endpoint. AFAICS that's not used anywhere else than one python test, so it should be OK to change it.	2022-07-29 09:09:22 +03:00
Thang Pham	417d9e9db2	Add current physical size to tenant status endpoint (#2173 ) Ref #1902	2022-07-28 13:59:20 -04:00
Alexander Bayandin	6ace347175	github/workflows: unpause stress env deployment (#2180 ) This reverts commit `4446791397`.	2022-07-28 18:37:21 +01:00
Alexander Bayandin	14a027cce5	Makefile: get openssl prefix dynamically (#2179 )	2022-07-28 17:05:30 +01:00
Arthur Petukhovsky	09ddd34b2a	Fix checkpoints race condition in safekeeper tests (#2175 ) We should wait for WAL to arrive to pageserver before calling CHECKPOINT	2022-07-28 15:44:02 +03:00
Arthur Petukhovsky	aeb3f0ea07	Refactor test_race_conditions (#2162 ) Do not use python multiprocessing, make the test async	2022-07-28 14:38:37 +03:00
Kirill Bulatov	58b04438f0	Tweak backoff numbers to avoid no wal connection threshold trigger	2022-07-27 22:16:40 +03:00
Alexey Kondratov	01f1f1c1bf	Add OpenAPI spec for safekeeper HTTP API (neondatabase/cloud#1264, #2061 ) This spec is used in the `cloud` repo to generate HTTP client.	2022-07-27 21:29:22 +03:00
Thang Pham	6a664629fa	Add timeline physical size tracking (#2126 ) Ref #1902. - Track the layered timeline's `physical_size` using `pageserver_current_physical_size` metric when updating the layer map. - Report the local timeline's `physical_size` in timeline GET APIs. - Add `include-non-incremental-physical-size` URL flag to also report the local timeline's `physical_size_non_incremental` (similar to `logical_size_non_incremental`) - Add a `UIntGaugeVec` and `UIntGauge` to represent `u64` prometheus metrics Co-authored-by: Dmitry Rodionov <dmitry@neon.tech>	2022-07-27 12:36:46 -04:00
Sergey Melnikov	f6f29f58cd	Switch production storage to dedicated etcd (#2169 )	2022-07-27 16:41:25 +03:00
Sergey Melnikov	fd46e52e00	Switch staging storage to dedicated etcd (#2164 )	2022-07-27 12:28:05 +03:00
Heikki Linnakangas	d6f12cff8e	Make DatadirTimeline a trait, implemented by LayeredTimeline. Previously DatadirTimeline was a separate struct, and there was a 1:1 relationship between each DatadirTimeline and LayeredTimeline. That was a bit awkward; whenever you created a timeline, you also needed to create the DatadirTimeline wrapper around it, and if you only had a reference to the LayeredTimeline, you would need to look up the corresponding DatadirTimeline struct through tenant_mgr::get_local_timeline_with_load(). There were a couple of calls like that from LayeredTimeline itself. Refactor DatadirTimeline, so that it's a trait, and mark LayeredTimeline as implementing that trait. That way, there's only one object, LayeredTimeline, and you can call both Timeline and DatadirTimeline functions on that. You can now also call DatadirTimeline functions from LayeredTimeline itself. I considered just moving all the functions from DatadirTimeline directly to Timeline/LayeredTimeline, but I still like to have some separation. Timeline provides a simple key-value API, and handles durably storing key/value pairs, and branching. Whereas DatadirTimeline is stateless, and provides an abstraction over the key-value store, to present an interface with relations, databases, etc. Postgres concepts. This simplified the logical size calculation fast-path for branch creation, introduced in commit `28243d68e6`. LayeredTimeline can now access the ancestor's logical size directly, so it doesn't need the caller to pass it to it. I moved the fast-path to init_logical_size() function itself. It now checks if the ancestor's last LSN is the same as the branch point, i.e. if there haven't been any changes on the ancestor after the branch, and copies the size from there. An additional bonus is that the optimization will now work any time you have a branch of another branch, with no changes from the ancestor, not only at a create-branch command.	2022-07-27 10:26:21 +03:00
Konstantin Knizhnik	5a4394a8df	Do not hold timelines lock while calling update_gc_info to avoid recursive mutex lock and so deadlock (#2163 )	2022-07-26 22:21:05 +03:00
Heikki Linnakangas	d301b8364c	Move LayeredTimeline and related code to separate source file. The layered_repository.rs file had grown to be very large. Split off the LayeredTimeline struct and related code to a separate source file to make it more manageable. There are plans to move much of the code to track timelines from tenant_mgr.rs to LayeredRepository. That will make layered_repository.rs grow again, so now is a good time to split it. There's a lot more cleanup to do, but this commit intentionally only moves existing code and avoids doing anything else, for easier review.	2022-07-26 11:47:04 +03:00
Kirill Bulatov	172314155e	Compact only once on psql checkpoint call	2022-07-26 11:37:16 +03:00

1894 Commits