-
Notifications
You must be signed in to change notification settings - Fork 261
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[TieredStorage] Add capacity() API and limit file size to 16GB #401
Conversation
81fde7d
to
1ff9615
Compare
if self.is_empty() { | ||
return MAX_HOT_STORAGE_FILE_SIZE; | ||
} | ||
self.len() as u64 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why is this impl not just?
self.len() as u64
If we have a reader, we shouldn't be able to write any more data, right?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Right. I think it should be simply self.len() as u64
.
If we have a reader, we shouldn't be able to write any more data, right?
Right. In that case, the capacity should be equal to len(), or should we return 0 instead?
Btw, is there a reason why capacity()
is u64 but len is usize()
in AccountsFile / AppendVec API?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Right. In that case, the capacity should be equal to len(), or should we return 0 instead?
The capacity of a container is the number of elements currently allocated. So for a file-based container like tiered storage, its capacity is the file size. Returning zero would only be correct if the file size was zero.
Btw, is there a reason why capacity() is u64 but len is usize() in AccountsFile / AppendVec API?
I'm not sure. I would imagine not a strong one.
|
||
pub fn capacity(&self) -> u64 { | ||
self.reader() | ||
.map_or(MAX_TIERED_STORAGE_FILE_SIZE, |reader| reader.capacity()) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this impl makes sense. We don't want to have to specify the max size constant in multiple places (as is currently). So the one in HotStorageReader::capacity()
can hopefully be removed.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think here we need MAX_TIERED_STORAGE_FILE_SIZE. Because the or case means we don't have a reader yet, meaning we don't know whether it is cold or hot yet.
But yep, under the HotStorageReader it should return MAX_HOT_STORAGE_FILE_SIZE.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
we don't know whether it is cold or hot yet.
I think this is another example of the API being inverted/backwards. The caller does have a tiered storage though, that's how they can even call capacity()
. But the API does not go down into a specific hot/cold impl to ask for its capacity.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this is another example of the API being inverted/backwards.
My original thinking is that: the accounts-db layer manages temperature (i.e., hot / cold / warm .. etc) as it has the whole picture of active / inactive accounts and other related information like shrink, and the accounts-file layer's job is to determine the best suitable file format based on the temperature assigned by the accounts-db. (and this is the reason why it was TieredHot previously as we eventually need an enum to determine the temperature.)
So back to the main topic, if we think determining temperature is accounts-db layer's job, then we should have some temperature information here even without constructing a reader.
We can revisit this later as we currently don't have plans for cold format. For now I think MAX_TIERED_STORAGE_FILE_SIZE is good enough.
Codecov ReportAttention: Patch coverage is
Additional details and impacted files@@ Coverage Diff @@
## master #401 +/- ##
=========================================
- Coverage 81.9% 81.8% -0.1%
=========================================
Files 840 840
Lines 227913 227923 +10
=========================================
- Hits 186672 186635 -37
- Misses 41241 41288 +47 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Problem
The TieredStorage has not yet implemented the AccountsFile::capacity()
API.
Summary of Changes
Implement capacity() API for TieredStorage and limit file size to 16GB,
same as the append-vec file.