scan unicode identifier #16

Merged
merged 3 commits into from
Oct 19, 2023

@lu-zhengda (Collaborator) commented Oct 19, 2023

This PR adds support for tokenizing UTF-8 encoded (non-ASCII) identifiers.
UTF-8 was already supported in quoted identifiers, but unquoted ones produced an error. The error came mainly from how we advance through the input string: rune(s.src[s.cursor]) converts only the single byte at the cursor to a rune and does not take multi-byte runes into consideration.
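
To illustrate the difference, here is a minimal sketch, not the code from this PR. Only the names `s.src` and `s.cursor` come from the description above; the `scanner` type, the `peekByte`, `peekRune`, and `scanIdentifier` helpers, and the rule for what counts as an identifier character are all assumptions made for the example. It contrasts the byte-to-rune conversion with decoding through the standard `unicode/utf8` package and advancing the cursor by the decoded width.

```go
package main

import (
	"fmt"
	"unicode"
	"unicode/utf8"
)

// scanner is a hypothetical stand-in for the tokenizer state. Only the
// field names src and cursor are taken from the PR description.
type scanner struct {
	src    string
	cursor int
}

// peekByte mirrors the problematic pattern: it converts the single byte at
// the cursor into a rune, which is only correct for ASCII input.
func (s *scanner) peekByte() rune {
	return rune(s.src[s.cursor])
}

// peekRune decodes the full UTF-8 sequence starting at the cursor and
// returns the rune together with its width in bytes.
func (s *scanner) peekRune() (rune, int) {
	return utf8.DecodeRuneInString(s.src[s.cursor:])
}

// scanIdentifier consumes letters, digits, and underscores (an assumed
// identifier rule) and advances the cursor by each rune's byte width.
func (s *scanner) scanIdentifier() string {
	start := s.cursor
	for s.cursor < len(s.src) {
		r, width := s.peekRune()
		if !(unicode.IsLetter(r) || unicode.IsDigit(r) || r == '_') {
			break
		}
		s.cursor += width
	}
	return s.src[start:s.cursor]
}

func main() {
	s := &scanner{src: "café FROM t"}

	// Position 3 is the first of the two bytes that encode 'é'.
	s.cursor = 3
	fmt.Printf("byte as rune: %U\n", s.peekByte())
	r, width := s.peekRune()
	fmt.Printf("decoded rune: %U (width %d)\n", r, width)

	// Scanning from the start consumes the whole non-ASCII identifier.
	s.cursor = 0
	fmt.Printf("identifier: %q, cursor now at byte %d\n", s.scanIdentifier(), s.cursor)
}
```

With the byte-based conversion, the first byte of `é` (0xC3) is read as U+00C3, which is a different character, and stepping the cursor by one would land in the middle of the encoded rune. Decoding with `utf8.DecodeRuneInString` and adding the returned width keeps the cursor on rune boundaries.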

@lu-zhengda requested a review from a team as a code owner on October 19, 2023 22:04
@lu-zhengda merged commit a69f508 into main on Oct 19, 2023
3 checks passed
@lu-zhengda deleted the zhengda.lu/unicode-ident branch on October 19, 2023 22:31