-
Notifications
You must be signed in to change notification settings - Fork 6.4k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Track each SST's timestamp information as user properties #9093
Conversation
b5db9d7
to
7bdb3a6
Compare
@riversand963 I've reviewed this mr and my colleague post it. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks @sunlike-Lipo for the PR. I think this PR needs more testing once #9092 gets merged. cc @wolfkdy
db/table_properties_collector.h
Outdated
} | ||
|
||
Status Finish(UserCollectedProperties* properties) override { | ||
properties->insert({"min_timestamp", min_timestamp}); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we check the size of min/max timestamps are both equal to cmp_->timestamp_size()?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fine. Already fixed it.
Thanks, we plan to follow other unittests for TablePropertiesCollector's subclass and add unittests for TimestampTablePropertiesCollector. |
cdcdd89
to
2a1262e
Compare
@riversand963 |
@sunlike-Lipo Will take a look. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A few more comments.
@riversand963 has imported this pull request. If you are a Facebook employee, you can view this diff on Phabricator. |
@sunlike-Lipo has updated the pull request. You must reimport the pull request before landing. |
1 similar comment
@sunlike-Lipo has updated the pull request. You must reimport the pull request before landing. |
c43c600
to
9afe576
Compare
@sunlike-Lipo has updated the pull request. You must reimport the pull request before landing. |
a0e5b0f
to
13ab3b6
Compare
@sunlike-Lipo has updated the pull request. You must reimport the pull request before landing. |
Sure. I'll do the test and post the benchmarking results as soon as possible. |
13ab3b6
to
05ebb11
Compare
@sunlike-Lipo has updated the pull request. You must reimport the pull request before landing. |
05ebb11
to
c38de84
Compare
@sunlike-Lipo has updated the pull request. You must reimport the pull request before landing. |
c38de84
to
886c942
Compare
@sunlike-Lipo has updated the pull request. You must reimport the pull request before landing. |
@riversand963 The variables are
|
@sunlike-Lipo has updated the pull request. You must reimport the pull request before landing. |
Thanks @sunlike-Lipo for addressing the comments. Will take another look soon.
I wonder if we can move the code to a standalone source file (similar to examples/) and compile with rocksdb library built with |
I fixed it to a stand alone test (sunlike-Lipo@50a1152) for the benchmark. The variables are
I will also test the db_bench, the result will be soon. |
@riversand963
With ts collector : 22.8MB/s, 22.5MB/s, 22.5MB/s, 22.6MB/s, 22.6MB/s The detailed results are as follows: Without ts collector : |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks @sunlike-Lipo for the PR. Left some final minor comments before stamping.
db/table_properties_collector.h
Outdated
// internal key when Add() is invoked. | ||
// | ||
// @param cmp the Comparator to compare timestamp of key by the | ||
// Comparator::CompareTimestamp function. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not sure, but the indentation here seems weird.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
fixed
db/table_properties_collector.h
Outdated
properties->insert({"rocksdb.min_timestamp", min_timestamp_}); | ||
properties->insert({"rocksdb.max_timestamp", max_timestamp_}); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I used sst_dump
to dump the properties of generated SST files, and I can see
...
# rocksdb.max_timestamp: 0x03000000000000000000000000000000
# rocksdb.merge.operands: 0x00
# rocksdb.min_timestamp: 0x03000000000000000000000000000000
This makes me wonder whether we can instead use rocksdb.timestamp_min and rocksdb.timestamp_max so that they can be next to each other.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
fixed it.
db/table_properties_collector.h
Outdated
return {{"rocksdb.min_timestamp", Slice(min_timestamp_).ToString(true)}, | ||
{"rocksdb.max_timestamp", Slice(max_timestamp_).ToString(true)}}; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same as previous one.
Can we instead use rocksdb.timestamp_min and rocksdb.timestamp_max so that they can be next to each other?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
fixed it.
db/table_properties_collector.h
Outdated
std::string max_timestamp_; | ||
std::string min_timestamp_; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should these two be explicitly initialized to kDisableTimestamp
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
fixed it.
db/table_properties_collector.h
Outdated
ExtractTimestampFromUserKey(user_key, cmp_->timestamp_size()); | ||
if (max_timestamp_ == kDisableUserTimestamp || | ||
cmp_->CompareTimestamp(timestamp_in_key, max_timestamp_) > 0) { | ||
max_timestamp_ = timestamp_in_key.ToString(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If we do
max_timestamp_.assign(timestamp_in_key.data(), timestamp_in_key.size());
will it save us some string creation and copy caused by calling ToString()
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
fixed it.
db/table_properties_collector.h
Outdated
} | ||
if (min_timestamp_ == kDisableUserTimestamp || | ||
cmp_->CompareTimestamp(min_timestamp_, timestamp_in_key) > 0) { | ||
min_timestamp_ = timestamp_in_key.ToString(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ditto
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
fixed it.
db/table_properties_collector.h
Outdated
class TimestampTablePropertiesCollector : public IntTblPropCollector { | ||
public: | ||
explicit TimestampTablePropertiesCollector(const Comparator* cmp) | ||
: cmp_(cmp) {} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe here we can explicitly initialize max_timestamp_
and min_timestamp_
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
fixed it.
@riversand963 has imported this pull request. If you are a Facebook employee, you can view this diff on Phabricator. |
@sunlike-Lipo has updated the pull request. You must reimport the pull request before landing. |
@riversand963 The db_bench result are as follows: Without ts collector : The standalone test from sunlike-Lipo@50a1152 result are as follows:
use string.operator=
It seems there is little difference between the result using string.assign and using string.operator=, perhaps the compiler is |
480400f
to
a7dfe15
Compare
@sunlike-Lipo has updated the pull request. You must reimport the pull request before landing. |
@riversand963 has imported this pull request. If you are a Facebook employee, you can view this diff on Phabricator. |
@riversand963 merged this pull request in e12753e. |
Track each SST's timestamp information as user properties #8959
Rockdb has supported user-defined timestamp feature. Application can specify a timestamp
when writing each k-v pair. When data flush from memory to disk file called SST files.
Each SST files consist of multiple data blocks and several metadata blocks. Among the metadata
blocks, there is one called Properties block that tracks some pre-defined properties of this SST file.
This PR is for collecting the properties of min and max timestamps of all keys in the file. With those
properties the SST file is more convenient to tell whether the keys in the SST have timestamps or not.
The changes involved are as follows:
The way TimestampTablePropertiesCollector use to compare timestamp of key should defined by
user by implementing the Comparator::CompareTimestamp function in the user defined comparator.