mirror of
https://github.com/neondatabase/neon.git
synced 2026-05-18 13:40:37 +00:00
RFC fixes, per comments in the PR
@@ -7,11 +7,13 @@ existing files are never modified. That fits well with storing the
 files on S3.
 
 Currently, we create a lot of small files. That is mostly a problem
-with S3, because each GET/PUT operation is expensive. Currently, the
-files "archived" together into larger checkpoint files before they're
-uploaded to S3, but garbage collecting data from the archive files
-would be difficult and we have not implemented it. This proposal
-addresses that problem.
+with S3, because each GET/PUT operation is expensive, and LIST
+operation only returns 1000 objects at a time, and isn't free
+either. Currently, the files are "archived" together into larger
+checkpoint files before they're uploaded to S3 to alleviate that
+problem, but garbage collecting data from the archive files would be
+difficult and we have not implemented it. This proposal addresses that
+problem.
 
 
 # Overview
@@ -98,7 +100,8 @@ the overall key space, and a larger range of LSNs. This speeds up
 searches. When you're looking for a given page, you need to check all
 the files in L0, to see if they contain a page version for the requested
 page. But in L1, you only need to check the files whose key range covers
-the requested page.
+the requested page. This is particularly important at cold start, when
+checking a file means downloading it from S3.
 
 Partitioning by key range also helps with garbage collection. If only a
 part of the database is updated, we will accumulate more files for
@@ -133,13 +136,6 @@ we partition the data into the files?
 for how PebblesDB does this, and for why that's important)
 - Greedy algorithm
 
-# Next steps
-
-- Allow delta layers to cover a range keys instead of a single segment.
-
-- Implement a two-level LSM tree (or three-leveled, if you count the
-"memtable"), by adding L0.
-
 # Additional Reading
 
 [1] Paper on PebblesDB and how it does partitioning.
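The second hunk describes why L1 lookups are cheaper than L0 lookups: every L0 file must be checked, but only those L1 files whose key range covers the requested page. A minimal sketch of that selection follows. It is an illustration under stated assumptions, not the pageserver's actual code: `LayerFile`, its fields, and `files_to_check` are hypothetical names, keys are simplified to `u64`, and LSN filtering is left out.

```rust
use std::ops::Range;

/// Hypothetical descriptor of one layer file (illustrative only,
/// not a type from the Neon code base).
#[derive(Debug, Clone, PartialEq)]
pub struct LayerFile {
    pub name: &'static str,
    /// Half-open range of page keys the file covers.
    pub key_range: Range<u64>,
}

/// Returns the files that must be inspected (and, at cold start,
/// downloaded from S3) to find versions of the page `key`.
///
/// Assumption: L0 files are not partitioned by key, so all of them are
/// candidates; L1 files are partitioned, so only those whose key range
/// contains `key` are candidates.
pub fn files_to_check<'a>(
    l0: &'a [LayerFile],
    l1: &'a [LayerFile],
    key: u64,
) -> Vec<&'a LayerFile> {
    // Every L0 file has to be checked.
    let mut candidates: Vec<&LayerFile> = l0.iter().collect();
    // In L1, key-range partitioning prunes the non-covering files.
    candidates.extend(l1.iter().filter(|f| f.key_range.contains(&key)));
    candidates
}
```

With two L0 files and an L1 level split into `0..500` and `500..1000`, a lookup for key 700 touches both L0 files but only one of the two L1 files. The saving grows with the number of L1 partitions.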