Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix bug in scientific notation parsing #711

Merged
merged 1 commit into from
Jun 27, 2022

Conversation

alangpierce
Copy link
Owner

Fixes #676

Scientific notation parsing was using the shared readInt code path, which I
had made more flexible to handle hex digits, but that means that literals like
1e2 were being treated as a complete number (with the hex digit e) rather
than as scientific notation. Normally this wouldn't be a problem since it's all
a number token anyway, but it caused trouble when using dot-style property
access like 1e2.toString(). Expressions like 1.toString() are expected to
fail because . is interpreted as a decimal point, but when using scientific
notation, the parser needs to be smart enough see that a decimal point wouldn't
be valid, and therefore not include it as part of the token.

The fix was to change readInt to only handle decimal digits, not hex digits,
and to inline the hex case into readRadixNumber. This should make number
parsing slightly faster because it's not checking for hex digits anymore.

I also added some tests since some of this code wasn't well covered by test
coverage.

@codecov
Copy link

codecov bot commented Jun 27, 2022

Codecov Report

Merging #711 (4a04c49) into main (ea15a79) will increase coverage by 0.38%.
The diff coverage is 100.00%.

@@            Coverage Diff             @@
##             main     #711      +/-   ##
==========================================
+ Coverage   84.97%   85.36%   +0.38%     
==========================================
  Files          54       54              
  Lines        5858     5855       -3     
  Branches     1338     1337       -1     
==========================================
+ Hits         4978     4998      +20     
+ Misses        597      576      -21     
+ Partials      283      281       -2     
Impacted Files Coverage Δ
src/parser/tokenizer/index.ts 84.73% <100.00%> (+4.95%) ⬆️

📣 Codecov can now indicate which changes are the most critical in Pull Requests. Learn more

@github-actions
Copy link

github-actions bot commented Jun 27, 2022

Benchmark results

Before this PR: 353.1 thousand lines per second
After this PR: 353.3 thousand lines per second

Measured change: 0.06% faster (1.12% slower to 0.54% faster)
Summary: Likely no significant difference

Fixes #676

Scientific notation parsing was using the shared `readInt` code path, which I
had made more flexible to handle hex digits, but that means that literals like
`1e2` were being treated as a complete number (with the hex digit `e`) rather
than as scientific notation. Normally this wouldn't be a problem since it's all
a number token anyway, but it caused trouble when using dot-style property
access like `1e2.toString()`. Expressions like `1.toString()` are expected to
fail because `.` is interpreted as a decimal point, but when using scientific
notation, the parser needs to be smart enough see that a decimal point wouldn't
be valid, and therefore not include it as part of the token.

The fix was to change `readInt` to only handle decimal digits, not hex digits,
and to inline the hex case into `readRadixNumber`. This should make number
parsing slightly faster because it's not checking for hex digits anymore.
@alangpierce alangpierce force-pushed the fix-scientific-notation-bug branch from f9da2bc to 4a04c49 Compare June 27, 2022 22:10
@alangpierce alangpierce merged commit 9446377 into main Jun 27, 2022
@alangpierce alangpierce deleted the fix-scientific-notation-bug branch June 27, 2022 22:17
1Lighty pushed a commit to Astra-mod/sucrase that referenced this pull request Aug 14, 2022
Fixes alangpierce#676

Scientific notation parsing was using the shared `readInt` code path, which I
had made more flexible to handle hex digits, but that means that literals like
`1e2` were being treated as a complete number (with the hex digit `e`) rather
than as scientific notation. Normally this wouldn't be a problem since it's all
a number token anyway, but it caused trouble when using dot-style property
access like `1e2.toString()`. Expressions like `1.toString()` are expected to
fail because `.` is interpreted as a decimal point, but when using scientific
notation, the parser needs to be smart enough see that a decimal point wouldn't
be valid, and therefore not include it as part of the token.

The fix was to change `readInt` to only handle decimal digits, not hex digits,
and to inline the hex case into `readRadixNumber`. This should make number
parsing slightly faster because it's not checking for hex digits anymore.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Error parsing scientific notation with prototype access
1 participant