Panic with surrogate code points in EscapedUnicode #608
SimonSapin changed the title from "Parsing surrogate code points in EscapedUnicode" to "Panic with surrogate code points in EscapedUnicode" on Sep 25, 2023
The new feature of decoding surrogate pairs is split off to a new issue: #657, leaving this one about the panic.
SimonSapin added commits that referenced this issue on Sep 25, 2023 and Sep 28, 2023
* fix(parser): emit lexer error on escaped surrogate to avoid panic (Fixes #608)
* Changelog
Description
apollo-parser 0.5.3 provides impl From<&'_ ast::StringValue> for String to extract the value represented by a string literal by resolving escape sequences. This part of the spec: https://spec.graphql.org/October2021/#sec-String-Value.Semantics
is implemented here:
apollo-rs/crates/apollo-parser/src/ast/node_ext.rs, lines 169 to 176 in 126816f
The spec is written assuming JavaScript-like strings made of 16-bit code units, but Rust’s char represents a Unicode scalar value, which excludes the range of surrogate code points, U+D800 to U+DFFF. (Surrogates are reserved to leave space for UTF-16 to encode code points beyond U+FFFF as a pair of leading and trailing surrogates.) As a result, this .unwrap() call can panic.
The draft spec fixes this (and adds a new bit of syntax):
https://spec.graphql.org/draft/#sec-String-Value
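The failure mode described above can be reproduced with the standard library alone. The following is a minimal sketch, not the apollo-parser code itself: the helper name unescape_code_unit is hypothetical, and it only assumes that the four hex digits of a \uXXXX escape are converted to a char one escape at a time.

```rust
/// Hypothetical helper (not part of apollo-parser): parse the four hex
/// digits of a `\uXXXX` escape and try to convert the value to a `char`.
fn unescape_code_unit(hex: &str) -> Option<char> {
    let value = u32::from_str_radix(hex, 16).ok()?;
    // `char::from_u32` returns `None` for surrogates (U+D800..=U+DFFF)
    // and for values above U+10FFFF.
    char::from_u32(value)
}

fn main() {
    // An ordinary BMP code point converts fine.
    assert_eq!(unescape_code_unit("0041"), Some('A'));
    // Both halves of a surrogate pair are rejected individually.
    assert_eq!(unescape_code_unit("D83E"), None);
    assert_eq!(unescape_code_unit("DD80"), None);
    // Calling `.unwrap()` on either `None` above is what would panic.
}
```

Returning Option (or a Result) from such a conversion lets the caller report an error instead of panicking.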
In particular:
TryFrom instead of From)
SyntaxTree::errors(), but that is completed without actually extracting the unescaped value of string literals. So the lexer is likely the right place to check whether that conversion would succeed.
Steps to reproduce
(Short version, see test case without apollo-compiler below.)
Expected result
Either:
Value::String("🦀") (the UTF-16 decoding of 0xD83E 0xDD80)
Actual result
Test case
Fixing this per the draft spec should make this test pass:
crates/apollo-parser/src/ast/node_ext.rs
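
For reference, the surrogate-pair decoding mentioned under "Expected result" can be sketched with the standard library's std::char::decode_utf16. This is an illustration only, not the test case from this issue or the fix that was merged. The helper name decode_code_units is hypothetical, and it assumes the code units from consecutive \uXXXX escapes have already been collected into a slice.

```rust
use std::char::{decode_utf16, DecodeUtf16Error};

/// Hypothetical helper (not part of apollo-parser): decode a sequence of
/// UTF-16 code units into a `String`, returning an error on a lone
/// (unpaired) surrogate instead of panicking.
fn decode_code_units(units: &[u16]) -> Result<String, DecodeUtf16Error> {
    // Collecting into `Result` stops at the first decoding error.
    decode_utf16(units.iter().copied()).collect()
}

fn main() {
    // A leading/trailing surrogate pair decodes to a single scalar value.
    assert_eq!(decode_code_units(&[0xD83E, 0xDD80]).unwrap(), "🦀");
    // A lone leading surrogate is reported as an error.
    assert!(decode_code_units(&[0xD83E]).is_err());
    // So is a trailing surrogate with no leading surrogate before it.
    assert!(decode_code_units(&[0xDD80, 0x0041]).is_err());
}
```

Either outcome avoids the panic: a well-formed pair yields U+1F980, and anything else surfaces as an error value the caller can handle.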