RFC fixes, per comments in the PR

2026-07-06 13:40:37 +00:00 · 2022-03-18 14:18:25 +02:00
parent 2bc9ed164f
commit d756921220
1 changed files with 9 additions and 13 deletions
--- a/docs/rfcs/014-storage-lsm.md
+++ b/docs/rfcs/014-storage-lsm.md
@@ -7,11 +7,13 @@ existing files are never modified. That fits well with storing the
 files on S3.

 Currently, we create a lot of small files. That is mostly a problem
-with S3, because each GET/PUT operation is expensive. Currently, the
-files "archived" together into larger checkpoint files before they're
-uploaded to S3, but garbage collecting data from the archive files
-would be difficult and we have not implemented it. This proposal
-addresses that problem.
+with S3, because each GET/PUT operation is expensive, and a LIST
+operation returns at most 1000 objects at a time and isn't free
+either. Currently, the files are "archived" together into larger
+checkpoint files before they're uploaded to S3 to alleviate that
+problem, but garbage collecting data from the archive files would be
+difficult and we have not implemented it. This proposal addresses that
+problem.


 # Overview
@@ -98,7 +100,8 @@ the overall key space, and a larger range of LSNs. This speeds up
 searches. When you're looking for a given page, you need to check all
 the files in L0, to see if they contain a page version for the requested
 page. But in L1, you only need to check the files whose key range covers
-the requested page.
+the requested page. This is particularly important at cold start, when
+checking a file means downloading it from S3.

 Partitioning by key range also helps with garbage collection. If only a
 part of the database is updated, we will accumulate more files for
@@ -133,13 +136,6 @@ we partition the data into the files?
  for how PebblesDB does this, and for why that's important)
 - Greedy algorithm

-# Next steps
-
- Allow delta layers to cover a range keys instead of a single segment.
-
- Implement a two-level LSM tree (or three-leveled, if you count the
-"memtable"), by adding L0.
-
 # Additional Reading

 [1] Paper on PebblesDB and how it does partitioning.
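
Note on the second hunk: the lookup rule it describes (check every file in L0, but only the L1 files whose key range covers the requested page) can be sketched as below. This is a minimal illustration, not the pageserver's implementation; `LayerFile`, `files_to_check`, the integer keys, and the half-open key ranges are all assumptions made for the example.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class LayerFile:
    """Hypothetical stand-in for a layer file; not a real type from the codebase."""
    name: str
    key_start: int  # inclusive lower bound of the key range (assumed)
    key_end: int    # exclusive upper bound of the key range (assumed)


def files_to_check(l0: List[LayerFile], l1: List[LayerFile], key: int) -> List[LayerFile]:
    """Return the files that must be consulted to find a version of `key`."""
    # Every L0 file is a candidate, as the RFC text states.
    candidates = list(l0)
    # In L1, only files whose key range covers the requested key are candidates.
    candidates += [f for f in l1 if f.key_start <= key < f.key_end]
    return candidates


l0_files = [LayerFile("l0-a", 0, 100), LayerFile("l0-b", 0, 100)]
l1_files = [LayerFile("l1-a", 0, 50), LayerFile("l1-b", 50, 100)]
needed = files_to_check(l0_files, l1_files, key=70)
print([f.name for f in needed])
```

At cold start, each file in the returned list would have to be downloaded from S3 first, so skipping the L1 files that do not cover the key avoids those downloads.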