-
Notifications
You must be signed in to change notification settings - Fork 15.5k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
protoc: fix consistency with parsing very large decimal numbers #10555
protoc: fix consistency with parsing very large decimal numbers #10555
Conversation
5047015
to
7467f75
Compare
I realized there is no new test to confirm that options and defaults allow very large decimal literals for float/double fields Adding that now. |
@fowles, I think this is ready for another look. I think I addressed your comments, and I've added a new test, too. |
…flow uint64 or underflow int64) to be used as float/double values
93e26b4
to
4c69337
Compare
BTW, I rebased to more recent master and was able to remove the commit that was just re-generating files from protos. So the diff is a little leaner/less noisy now. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
only nits remaining, I will approve and kick the CI systems on the next round
@fowles, I see you approved this, but I realized one more thing missing from this before it's ready to merge -- there's a code path that could accidentally trigger a fatal error, if the literal is an octal or hex integer that is >2^64-1. I think it's an easy fix as well as easy to test. |
@fowles, see last commit and let me know what you think. |
} else if (!io::Tokenizer::TryParseFloat(input_->current().text, output)) { | ||
// out of int range, and not valid float? 🤷 | ||
AddError("Integer out of range."); | ||
// We still return true because we did, in fact, parse a number. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is "just in case". I can't think of any actual scenario, other octal or hex literal (handled above), where ParseFloat
would fail. But I also really don't want this inadvertently raising an exception.
} else if (input_->current().text[0] == '0') { | ||
// octal or hexadecimal; don't bother parsing as float | ||
AddError("Integer out of range."); | ||
// We still return true because we did, in fact, parse a number. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I put this here, instead of just relying on the branch below, because I think an octal literal would likely be successfully parsed by ParseFloat
, but incorrectly interpreted as decimal.
src/google/protobuf/io/tokenizer.cc
Outdated
@@ -1002,9 +1002,19 @@ bool Tokenizer::ParseInteger(const std::string& text, uint64_t max_value, | |||
} | |||
|
|||
double Tokenizer::ParseFloat(const std::string& text) { | |||
double result; | |||
GOOGLE_LOG_IF(DFATAL, | |||
!TryParseFloat(text, &result)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I hate side effected in LOG statements, do you mind switching this to the more straight forward:
if (!TryParseFloat(...)) {
LOG(DFATAL) << ...
}
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done
Thanks for the PR. Sorry about the many rounds of back and forth. Running CI one last time and then I will merge! |
No worries! I totally understand the desire for high code quality, and I know that I don't write C++ often enough to get everything right the first time :) |
…y-with-very-large-decimal-numbers protoc: fix consistency with parsing very large decimal numbers
Fixes #10554.
I think the most controversial part of this would be the error messages, which I've updated to include the actual allowed range of values for integer literals (since this change makes the parser accept out-of-range integer literals, but they are interpreted as if float/double literals).
I re-generated code from protos in the third commit in order to make tests pass. I think that means the
main
branch is actually broken now because of stale generated code? I can remove that commit, since I understand it may be noise for this CL, but I included it to get a successful run of tests.