Issue with multibyte chars in source_text() computation #410

Closed
bram209 opened this issue Oct 8, 2023 · 1 comment · Fixed by #411

bram209 commented Oct 8, 2023

The fallback implementation of source_text() treats lo as a byte index into the source string, while it is actually a character index:

let trunc_lo = &self.source_text[lo..];

I expect this test to pass, but it does not:

#[cfg(span_locations)]
#[test]
fn source_text() {
    let input = "    𓀕 c    ";
    let mut tokens = input
        .parse::<proc_macro2::TokenStream>()
        .unwrap()
        .into_iter();

    let ident1 = tokens.next().unwrap();
    assert_eq!("𓀕", ident1.span().source_text().unwrap());

    let ident2 = tokens.next().unwrap();
    assert_eq!("c", ident2.span().source_text().unwrap());
}

It panics with the message below, because 𓀕 is 4 bytes long in UTF-8 (bytes 4..8), so byte index 6 lands in the middle of it:

---- source_text stdout ----
thread 'source_text' panicked at 'byte index 6 is not a char boundary; it is inside '𓀕' (bytes 4..8) of `    𓀕 c   `', src/fallback.rs:367:25
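
For illustration, here is a minimal sketch of slicing by character offsets rather than byte offsets. The helper `slice_by_chars` is hypothetical and is not part of proc-macro2; it only shows one way to translate a character index into a byte index before slicing, and it is not necessarily how the crate does it.

```rust
/// Hypothetical helper (not proc-macro2's API): return the substring of
/// `text` covering the characters `lo..hi`, where both bounds are
/// character offsets rather than byte offsets.
fn slice_by_chars(text: &str, lo: usize, hi: usize) -> Option<&str> {
    // Map a character offset to the byte offset where that character starts.
    // An offset equal to the character count maps to `text.len()`, so that
    // a range may end at the end of the string.
    let byte_at = |idx: usize| {
        text.char_indices()
            .map(|(byte, _)| byte)
            .chain(std::iter::once(text.len()))
            .nth(idx)
    };
    let start = byte_at(lo)?;
    let end = byte_at(hi)?;
    // `get` returns None instead of panicking if the range is invalid.
    text.get(start..end)
}
```

With the input from the test above, `slice_by_chars(input, 4, 5)` returns `Some("𓀕")` and `slice_by_chars(input, 6, 7)` returns `Some("c")`, whereas `input.is_char_boundary(6)` is false, which is why `&input[6..]` panics.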

dtolnay commented Oct 9, 2023

I have published a fix in proc-macro2 1.0.69.
