Commit Graph

9 Commits

Author SHA1 Message Date
Christian Schwarz
3526323bc4 prepare Timeline::get_reconstruct_data for becoming async (#3271)
This patch restructures the code so that PR
https://github.com/neondatabase/neon/pull/3228 can seamlessly
replace the return PageReconstructResult::NeedsDownload with
a download_remote_layer().await.

Background:

PR https://github.com/neondatabase/neon/pull/3228 will turn
get_reconstruct_data() async and do the on-demand
download right in place, instead of returning a
PageReconstructResult::NeedsDownload.

Current rustc requires that the layers lock guard be not in scope
across an await point.

For on-demand download inside get_reconstruct_data(), we need
to do download_remote_layer().await.

Supersedes https://github.com/neondatabase/neon/pull/3260

See my comment there:
https://github.com/neondatabase/neon/pull/3260#issuecomment-1370752407

Co-authored-by: Heikki Linnakangas <heikki@neon.tech>
2023-01-06 19:42:25 +02:00
Konstantin Knizhnik
1137b58b4d Fix LayerMap::search to not return delta layer preceeding image layer (#3197)
While @bojanserafimov is still working on best replacement of R-Tree in
layer_map.rs there is obvious pitfall in the current `search` method
implementation: is returns delta layer even if there is image layer if
greater LSN. I think that it should be fixed.
2022-12-26 18:21:41 +02:00
Kirill Bulatov
b77c33ee06 Move tenant-related modules below tenant module (#3190)
No real code changes besides moving code around and adjusting the
imports.
2022-12-23 15:40:37 +02:00
Christian Schwarz
a3f0111726 LayerMap::search is actually infallible
Found this while investigating failure modes of on-demand download.

I think it's a nice cleanup.
2022-12-21 12:14:55 +01:00
Christian Schwarz
22ae67af8d refactor: use new type LayerFileName when referring to layer file names in PathBuf/RemotePath (#3026)
refactor: use new type LayerFileName when referring to layer file names in PathBuf/RemotePath

Before this patch, we would sometimes carry around plain file names in
`Path` types and/or awkwardly "rebase" paths to have a unified
representation of the layer file name between local and remote.

This patch introduces a new type `LayerFileName` which replaces the use
of `Path` / `PathBuf` / `RemotePath` in the `storage_sync2` APIs.

Instead of holding a string, it contains the parsed representation of
the image and delta file name.
When we need the file name, e.g., to construct a local path or
remote object key, we construct the name ad-hoc.

`LayerFileName` is also serde {Dese,Se}rializable, and in an initial   
version of this patch, it was supposed to be used directly inside      
`IndexPart`, replacing `RemotePath`.                                   
However,                                                               
  commit 3122f3282f                      
      Ignore backup files (ones with .n.old suffix) in download_missing
fixed handling of `*.old` backup file names in IndexPart, and we need  
to carry that behavior forward.                                        
The solution is to remove `*.old` backup files names during            
deserialization. When we re-serialize the IndexPart, the `*.old` file  
will be gone.                                                          
This leaks the `.old` file in the remote storage, but makes it safe    
to clean it up later.           

There is additional churn by a preliminary refactoring that got squashed
into this change:

   split off LayerMap's needs from trait Layer into super trait

That refactoring renames `Layer` to `PersistentLayer` and splits off a subset
of the functions into a super-trait called `Layer`.
The upser trait implements just the functions needed by `LayerMap`, whereas
`PersisentLayer` adds the context of the pageserver.

The naming is imperfect as some functions that reside in `PersistentLayer`
have nothing persistence-specific to it. But it's a step in the right direction.
2022-12-13 01:27:59 +02:00
Heikki Linnakangas
2418e72649 Speed up layer_map::search, by remembering the "envelope" for each layer.
Lookups in the R-tree call the "envelope" function for every comparison,
and our envelope function isn't very cheap, so that overhead adds up.
Create the envelope once, when the layer is inserted into the tree, and
store it along with the layer. That uses some more memory per layer, but
that's not very significant.

Speeds up the search operation 2x
2022-10-18 15:00:10 +03:00
Konstantin Knizhnik
f3073a4db9 R-Tree layer map (#2317)
Replace the layer array and linear search with R-tree

So far, the in-memory layer map that holds information about layer
files that exist, has used a simple Vec, in no particular order, to
hold information about all the layers. That obviously doesn't scale
very well; with thousands of layer files the linear search was
consuming a lot of CPU. Replace it with a two-dimensional R-tree, with
Key and LSN ranges as the dimensions.

For the R-tree, use the 'rstar' crate. To be able to use that, we
convert the Keys and LSNs into 256-bit integers. 64 bits would be
enough to represent LSNs, and 128 bits would be enough to represent
Keys. However, we use 256 bits, because rstar internally performs
multiplication to calculate the area of rectangles, and the result of
multiplying two 128 bit integers doesn't necessarily fit in 128 bits,
causing integer overflow and, if overflow-checks are enabled,
panic. To avoid that, we use 256 bit integers.

Add a performance test that creates a lot of layer files, to
demonstrate the benefit.
2022-09-22 08:35:06 +03:00
Kirill Bulatov
b8eb908a3d Rename old project name references 2022-09-14 08:14:05 +03:00
Kirill Bulatov
1a8c8b04d7 Merge Repository and Tenant entities, rework tenant background jobs 2022-09-13 15:39:39 +03:00