-
Notifications
You must be signed in to change notification settings - Fork 526
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
curvebs: support 512 aligned IO #1231
Conversation
2932059
to
09cef26
Compare
recheck |
1 similar comment
recheck |
726becd
to
a7fe2e2
Compare
recheck |
a7fe2e2
to
a5b44d7
Compare
This reverts commit de2a121. Signed-off-by: wuhanqing <[email protected]>
94f1947
to
b668629
Compare
recheck |
b668629
to
2836175
Compare
recheck |
# it should consist with `block_size` in chunkfilepool.meta_path and `mds.volume.blockSize` in MDS's configurations | ||
# for clone chunk and snapshot chunk, it's also the minimum granularity that each bit represents | ||
# if set to |512|, we need 4096 bytes bitmap for each chunk, so meta_page_size should be 8192 or larger. | ||
global.block_size=4096 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Will this value be verified on the mds side?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We have three verifications:
- when a chunkserver starts, we will verify
global.block_size
in chunkserver's conf and chunkfilepool's metafile, if they aren't equal, chunkserver will quit. - then, if the chunkserver is newly deployed, and in the registration process, we will verify chunkserver's block size and block size recorded in MDS, if they aren't equal, chunkserver registration will fail.
- MDS's conf also has a
block_size
, and when it starts, we will verify this item with block size recorded in etcd, if they aren't equal, mds will quit.
and for compatibility,
- if chunkfilepool's metafile doesn't have block size, we treat it as 4096.
- if etcd doesn't block size, we will insert it according to the value in conf.
@@ -25,3 +25,4 @@ s3.throttle.iopsWriteLimit={{ s3_throttle_iopsWriteLimit }} | |||
s3.throttle.bpsTotalMB= {{ s3_throttle_bpsTotalLimit }} | |||
s3.throttle.bpsReadMB= {{ s3_throttle_bpsReadLimit }} | |||
s3.throttle.bpsWriteMB= {{ s3_throttle_bpsWriteLimit }} | |||
s3.useVirtualAddressing=False |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What is this configuration for?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
it was introduced by #1253
std::string path = "/sys/block/nbd" + std::to_string(nbd_index) | ||
+ "/queue/hw_sector_size"; | ||
std::string path = | ||
sys_block_path(nbd_index) + "/queue/logical_block_size"; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why did the check items change?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not sure why we check hw_sector_size
before, but what we set is block size when mapping the device,
@@ -140,6 +142,27 @@ struct ChunkServiceOptions { | |||
std::shared_ptr<InflightThrottle> inflightThrottle; | |||
}; | |||
|
|||
inline CopysetNodeOptions::CopysetNodeOptions() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This default configuration is only used when unit testing?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I didn't modify it, just moved it from source file to header, because it's trivial.
@@ -433,9 +438,9 @@ CSErrorCode CSChunkFile::Paste(const char * buf, off_t offset, size_t length) { | |||
|
|||
// The request above must be pagesize aligned | |||
// the starting page index number of the paste area | |||
uint32_t beginIndex = offset / pageSize_; | |||
uint32_t beginIndex = offset / blockSize_; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Notes need to be changed.
// The request above must be pagesize aligned
// the starting page index number of the paste area
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
fix
@@ -488,9 +494,9 @@ CSErrorCode CSChunkFile::Read(char * buf, off_t offset, size_t length) { | |||
if (isCloneChunk_) { | |||
// The request above must be pagesize aligned | |||
// the starting page index number of the paste area | |||
uint32_t beginIndex = offset / pageSize_; | |||
uint32_t beginIndex = offset / blockSize_; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ditto
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
fix
@@ -63,12 +63,14 @@ int Register::RegisterToMDS(ChunkServerMetadata *metadata) { | |||
req.set_externalip(ops_.chunkserverExternalIp); | |||
} | |||
req.set_port(ops_.chunkserverPort); | |||
req.set_blocksize(ops_.blockSize); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is this for verification on the mds side?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yep
src/common/fast_align.h
Outdated
@@ -0,0 +1,65 @@ | |||
/* | |||
Copyright 2015 Glen Joseph Fernandes | |||
([email protected]) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same as curvefs/src/common/fast_align.h?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
fix, curvefs/src/common/fast_align.h is removed
src/common/namespace_define.h
Outdated
@@ -62,6 +62,8 @@ const char CLONEINFOKEYEND[] = "13"; | |||
const char DISCARDSEGMENTKEYPREFIX[] = "13"; | |||
const char DISCARDSEGMENTKEYEND[] = "14"; | |||
|
|||
const char BLOCKSIZEKEY[] = "14blocksize"; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why not use "15blocksize" ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
fix
src/tools/curve_format_main.cpp
Outdated
} | ||
|
||
auto bitmapBytes = FLAGS_fileSize / FLAGS_blockSize / 8; | ||
if (bitmapBytes > FLAGS_metaPageSize * 2) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If bitmapBytes > (FLAGS_metaPageSize-specified field), it should not work?
Here means if bitmapBytes = 4k, FLAGS_metaPageSize = 1k,that's ok?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
fixed, I added a hard code value 4096 - 512
to indicate minimum size usage for encoding and storing source location for clone chunk, so that, the meta page size should be great than or equal to 3584 + bitmap bytes
2836175
to
078e7ab
Compare
Signed-off-by: Hanqing Wu <[email protected]>
078e7ab
to
89ca60c
Compare
@@ -101,6 +101,8 @@ chunk_size=16777216 | |||
chunkserver_walfilepool_segment_size=8388608 | |||
retain_pool=False | |||
walfilepool_use_chunk_file_pool=True | |||
chunkserver_meta_page_size=4096 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Consider chunkserver_meta_page_size, which is automatically calculated by chunkserver according to the chunkserver_block_size?
@@ -500,6 +507,12 @@ void MDS::InitCurveFSOptions(CurveFSOption *curveFSOptions) { | |||
"mds.curvefs.minFileLength", &curveFSOptions->minFileLength); | |||
conf_->GetValueFatalIfFail( | |||
"mds.curvefs.maxFileLength", &curveFSOptions->maxFileLength); | |||
conf_->GetValueFatalIfFail("mds.curvefs.blockSize", &g_block_size); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
why not define g_block_size in curveFSOptions instead of global variable
void TopologyServiceManager::RegistChunkServer( | ||
const ChunkServerRegistRequest *request, | ||
ChunkServerRegistResponse *response) { | ||
if (request->has_blocksize()) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Consider that chunkserver gets blocksize from mds instead of configuring blocksize individually and then verifying
|
||
DEFINE_validator(metaPageSize, &ValidateMetaPageSize); | ||
|
||
DEFINE_uint32(blockSize, 4096, "minimum io alignment supported"); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do we check blockSize and metaPageSize at chunkserver startup ?
cc @wu-hanqing Please resubmit to the master branch if necessary. |
What problem does this PR solve?
Issue Number: close #985
Problem Summary:
What is changed and how it works?
What's Changed:
How it Works:
Side effects(Breaking backward compatibility? Performance regression?):
Check List