-
Notifications
You must be signed in to change notification settings - Fork 6.4k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Extend Transaction::GetForUpdate with do_validate #4680
Extend Transaction::GetForUpdate with do_validate #4680
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@maysamyabandeh has imported this pull request. If you are a Facebook employee, you can view this diff on Phabricator.
@maysamyabandeh has updated the pull request. Re-import the pull request |
4 similar comments
@maysamyabandeh has updated the pull request. Re-import the pull request |
@maysamyabandeh has updated the pull request. Re-import the pull request |
@maysamyabandeh has updated the pull request. Re-import the pull request |
@maysamyabandeh has updated the pull request. Re-import the pull request |
22ba753
to
6f0ab51
Compare
@maysamyabandeh has updated the pull request. Re-import the pull request |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@maysamyabandeh has imported this pull request. If you are a Facebook employee, you can view this diff on Phabricator.
s = Get(read_options, column_family, key, &pinnable_val); | ||
if (skip_validate && read_options.snapshot != nullptr) { | ||
ReadOptions ro(read_options); | ||
ro.snapshot = nullptr; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you explain why we need to do it?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
skip_validate has two functions: skip validating snapshot, which is done by passing it to TryLock, read the latest committed value independent of what the snapshot is. The latter is missing in the inline documentation. If the explanation above is satisfaction, I go ahead with improving the inline docs with that.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If the caller expects to read the latest, not from the snapshot, should they not set read_options.snapshot
? Users passing in a snapshot but we have an argument to skip it is too complicated to me.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@hermanlee @lth would it work if we return error if snapshot and skip_validate both are set?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, that would be fine. If skip_validate is set, then snapshot should be null.
s = Get(read_options, column_family, key, pinnable_val); | ||
if (skip_validate && read_options.snapshot != nullptr) { | ||
ReadOptions ro(read_options); | ||
ro.snapshot = nullptr; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same question here. Why do we need to do this.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
same as above
|
||
// An overload of the above method that receives a PinnableSlice | ||
// For backward compatibility a default implementation is provided | ||
virtual Status GetForUpdate(const ReadOptions& options, | ||
ColumnFamilyHandle* column_family, | ||
const Slice& key, PinnableSlice* pinnable_val, | ||
bool exclusive = true) { | ||
bool exclusive = true, | ||
bool skip_validate = false) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Callaghan had a suggestion which might loosely apply here: avoid argument/option name which contains negative in it, because if it is false, it is double negative and makes it difficulty for users to tune. "Skip" is kind of such thing. skip_validate = false
is harder to understand than something like do_validation = false
.
@maysamyabandeh has updated the pull request. Re-import the pull request |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@maysamyabandeh has imported this pull request. If you are a Facebook employee, you can view this diff on Phabricator.
bool exclusive, | ||
const bool do_validate) { | ||
assert(do_validate || read_options.snapshot == nullptr); | ||
if (UNLIKELY(!do_validate && read_options.snapshot != nullptr)) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As always, I'm not a fun of UNLIKELY hint. Do we really need it?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have seen cases that it improves performance and here it might too. I do not see it to be hurtful in anyway.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It is hurtful because it tries to dictate compiler behavior unnecessarily. It's just like that we don't encourage people to deep tune RocksDB behavior unless it's needed, I am not a fan of doing it to compiler either.
Furthermore, just have this macro inside the code for no reason makes it less readable. Imagine someone grabs RocksDB source code and read this function, and sees a "UNLIKELY()".
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I believe it is a "hint" rather than "dictation". And it is not "unnecessary" as the previous benchmarks has shown. I actually find them to make the code more readable if they are used properly. Here is a proper use case as the if-then branch is not taken unless there is a user bug.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I even feel that we should take the hit even we see some minor regression. "Previous benchmark" is even weaker argument. It's more readable to you because you wrote so many UNLIKELY() in the code base so that you are so familiar with what it is for when you see it. It's not the case for someone new to the code base. In fact, I'm always reluctant to the use of macros in general. This is also the rule written in the Google C++ Style we are following: https://google.github.io/styleguide/cppguide.html#Preprocessor_Macros
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am going to remove UNLIKELY here as I do not want this minor point to become a blocker for this PR.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@maysamyabandeh I appreciate it.
Status s = TryLock(column_family, key, true /* read_only */, exclusive); | ||
bool exclusive, | ||
const bool do_validate) { | ||
assert(do_validate || read_options.snapshot == nullptr); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think we should assert user input. Crashing user program (while running in debug mode) when passing invalid argument doesn't feel like something we should do.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is undefined behavior in the API and assert is a way to specify the assumption on the input. We leave the runtime error return just in case there was a corner case that we did not anticipate but during the test we would prefer a highly visible crash over a probably silent error return.
I let @hermanlee , @lth, and @yoshinorim chime in if they prefer it differently.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Assuming assert is debug only, then this is fine. I would prefer it return an error on non-debug builds though.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Correct. The next line returns the error.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We can't assume that RocksDB users never run production with debug build. It has happened and is very likely to be happening somewhere too. Turning user input validation into application crashing will prevent users to do it.
@maysamyabandeh why is it an undefined behavior? The behavior is deterministic: return error to users.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why is it easier for users to find the corner case by crashing the program, than return them an error message telling them what mistake they have made?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I let MyRocks team as a user of rocksdb chime in of what behavior would make their life easier.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's not about this concrete case or concrete user. It's about the general principle we follow in this code base. It needs to be all consistent in the code base.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The PR is prepared for MyRocks primarily. I think it is fair to hear out their experience as the user before making strong opinions about carved-in-stone principles.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We can hear it as an input for revisiting the model of crashing application while users provide invalid arguments, but I won't change opinion in here, even if they say they love it a lot.
@maysamyabandeh has updated the pull request. Re-import the pull request |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@maysamyabandeh has imported this pull request. If you are a Facebook employee, you can view this diff on Phabricator.
s = Status::InvalidArgument( | ||
"assume_exclusive_tracked is set but it is not tracked yet"); | ||
} else if (lock_upgrade) { | ||
s = Status::InvalidArgument( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Discussed offline. Shared locking should be enough. The goal of validation is just to check whether writes have happened, and a shared lock would be able to prevent that.
HISTORY.md
Outdated
@@ -3,6 +3,7 @@ | |||
### New Features | |||
|
|||
### Public API Change | |||
* Transaction::GetForUpdate is extended with a do_validate parameter with default value of true. If false it skips validating the snapshot before doing the read. Similar ::Merge, ::Put, ::Delete, and ::SingleDelete are extended with assume_exclusive_tracked with default value of false. It true it indicates that call is assumed to be after a ::GetForUpdate(do_validate=false). |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nit: typo It -> If
Status s = TryLock(column_family, key, true /* read_only */, exclusive); | ||
bool exclusive, | ||
const bool do_validate) { | ||
assert(do_validate || read_options.snapshot == nullptr); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Assuming assert is debug only, then this is fine. I would prefer it return an error on non-debug builds though.
@@ -559,7 +561,18 @@ Status PessimisticTransaction::TryLock(ColumnFamilyHandle* column_family, | |||
// any writes since this transaction's snapshot. | |||
// TODO(agiardullo): could optimize by supporting shared txn locks in the | |||
// future | |||
if (skip_validate || snapshot_ == nullptr) { | |||
if (!do_validate || snapshot_ == nullptr) { | |||
assert(!assume_exclusive_tracked || (previously_locked && !lock_upgrade)); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Seems like this assert could just be in an additional else clause after "else if (lock_upgrade)" below.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I updated the if-then clause. Let me know if the update has addressed your comment. I kept assert before the if-then clause so that we crash during the tests and the corner cases that have violated our assumptions would show up in tests.
bool exclusive, bool skip_validate) { | ||
bool exclusive, const bool do_validate, | ||
const bool assume_exclusive_tracked) { | ||
assert(!assume_exclusive_tracked || (!do_validate && exclusive)); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think Manuel mentioned acquiring read locks should be able to set do_validate = false if we want to retrieve the value with snapshot set to nullptr.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Correct. The latest PR is updated with manuel's comment. Let me know if the latest is correct.
@maysamyabandeh has updated the pull request. Re-import the pull request |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This logic makes sense by itself. I'll defer to @lth @yoshinorim @hermanlee to decide whether it makes sense to MyRocks and accept.
@maysamyabandeh has updated the pull request. Re-import the pull request |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Changes seems fine with me. I'll let Manuel have the final approval.
if (skip_validate || snapshot_ == nullptr) { | ||
if (!do_validate || snapshot_ == nullptr) { | ||
assert(!assume_tracked || previously_locked); | ||
if (assume_tracked) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Minor suggestion. Seems like this could be rewritten as:
if (assume_tracked && !previous_locked) {
s = Status::InvalidArgument();
}
@maysamyabandeh has updated the pull request. Re-import the pull request |
@maysamyabandeh has updated the pull request. Re-import the pull request |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
HISTORY.md
Outdated
@@ -3,6 +3,7 @@ | |||
### New Features | |||
|
|||
### Public API Change | |||
* Transaction::GetForUpdate is extended with a do_validate parameter with default value of true. If false it skips validating the snapshot before doing the read. Similarly ::Merge, ::Put, ::Delete, and ::SingleDelete are extended with assume_tracked with default value of false. If true it indicates that call is assumed to be after a ::GetForUpdate(do_validate=false). |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think it's more accurate to say that assume_tracked means the call is assumed to be called after any GetForUpdate call right?
Status s = TryLock(column_family, key, true /* read_only */, exclusive); | ||
bool exclusive, | ||
const bool do_validate) { | ||
if (UNLIKELY(!do_validate && read_options.snapshot != nullptr)) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why do we have UNLIKELY only in one place?
const Slice& key, const Slice& value) { | ||
const Slice& key, const Slice& value, | ||
const bool assume_exclusive_tracked) { | ||
const bool do_validate = !assume_exclusive_tracked; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we rename to just assume_tracked, if we agree that shared locks also imply that we can skip validation?
@maysamyabandeh has updated the pull request. Re-import the pull request |
@maysamyabandeh has updated the pull request. Re-import the pull request |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@maysamyabandeh is landing this pull request. If you are a Facebook employee, you can view this diff on Phabricator.
@@ -287,6 +293,9 @@ class Transaction { | |||
// functions in WriteBatch, but will also do conflict checking on the | |||
// keys being written. | |||
// | |||
// assume_tracked=false expects the key be already tracked. If valid then it |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think you meant assume_tracked=true here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
oops. will fix it.
@@ -834,11 +909,23 @@ public void delete(final byte[] key) throws RocksDBException { | |||
* @throws RocksDBException when one of the TransactionalDB conditions | |||
* described above occurs, or in the case of an unexpected error | |||
*/ | |||
public void delete(final ColumnFamilyHandle columnFamilyHandle, final byte[][] keyParts, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
-
Could you please rename
assume_tracked
toassumeTracked
so it follows the Java naming conventions. -
Could you please add the javadoc for the additional parameter
assumeTracked
. -
This occurs in several places, could you correct each please?
@@ -433,6 +433,33 @@ public void rollbackToSavePoint() throws RocksDBException { | |||
* @param key the key to retrieve the value for. | |||
* @param exclusive true if the transaction should have exclusive access to | |||
* the key, otherwise false for shared access. | |||
* @param do_validate true if it should validate the snapshot before doing the read |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
-
Could you please rename
do_validate
todoValidate
so it follows the Java naming conventions. -
Could you please add the javadoc for the additional parameter
doValidate
on the other methods which are missing it? -
This occurs in several places, could you correct each please?
Transaction::GetForUpdate is extended with a do_validate parameter with default value of true. If false it skips validating the snapshot (if there is any) before doing the read. After the read it also returns the latest value (expects the ReadOptions::snapshot to be nullptr). This allows RocksDB applications to use GetForUpdate similarly to how InnoDB does. Similarly ::Merge, ::Put, ::Delete, and ::SingleDelete are extended with assume_exclusive_tracked with default value of false. It true it indicates that call is assumed to be after a ::GetForUpdate(do_validate=false).
The Java APIs are accordingly updated.