feat(compiler): allow unicode characters for component name as described in #8564 #8666

Merged: 7 commits, Dec 26, 2018


youngrok
Contributor

resolve #8564

What kind of change does this PR introduce? (check at least one)

  • Bugfix
  • Feature
  • Code style update
  • Refactor
  • Build-related changes
  • Other, please describe:

Does this PR introduce a breaking change? (check one)

  • Yes
  • No

If yes, please describe the impact and migration path for existing applications:

The PR fulfills these requirements:

If adding a new feature, the PR's description includes:

  • A convincing reason for adding this feature (to avoid wasting your time, it's best to open a suggestion issue first and wait for approval before working on it)

Other information:

 - use unicode letters when parsing path for watcher instead of only ASCII letters
 - extract const `unicodeLetters` from html-parser to lang
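
To illustrate the watcher change described above, here is a minimal sketch of a path parser that bails on unsupported characters. It assumes `unicodeLetters` holds the ranges quoted later in this thread; `makeParsePath` and both `bailRE` names are hypothetical helpers written for this comparison, not the merged code.

```javascript
// Assumption: ranges copied from the snippet quoted in this PR, not the final merged constant.
const unicodeLetters = 'a-zA-Z\u00B7\u00C0-\u00D6\u00D8-\u00F6\u00F8-\u037D\u037F-\u1FFF\u200C-\u200D\u203F-\u2040\u2070-\u218F\u2C00-\u2FEF\u3001-\uD7FF\uF900-\uFDCF\uFDF0-\uFFFD'

// ASCII-only check: any character outside [A-Za-z0-9_.$] rejects the path.
const asciiBailRE = /[^\w.$]/
// Unicode-aware check: the letters above plus ".", "$", "_" and digits are allowed.
const unicodeBailRE = new RegExp(`[^${unicodeLetters}.$_\\d]`)

// Hypothetical helper: builds a path parser around the given bail-out pattern.
function makeParsePath (bailRE) {
  return function parsePath (path) {
    if (bailRE.test(path)) return undefined
    const segments = path.split('.')
    // The returned getter walks the object one segment at a time.
    return function (obj) {
      for (let i = 0; i < segments.length; i++) {
        if (obj == null) return undefined
        obj = obj[segments[i]]
      }
      return obj
    }
  }
}

const data = { '사용자': { '이름': 'Kim' } }
const asciiGetter = makeParsePath(asciiBailRE)('사용자.이름')
const unicodeGetter = makeParsePath(unicodeBailRE)('사용자.이름')
console.log(asciiGetter === undefined, unicodeGetter(data))
```

With the ASCII-only pattern the Korean path is rejected before it is split, while the Unicode-aware pattern yields a getter that resolves the nested value.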
@youngrok
Contributor Author

In addition to #8564, I added support for unicode property paths in the watcher.

src/core/util/options.js (outdated)
/**
* unicode letters used for parsing html tags, component names and property paths.
* use https://www.w3.org/TR/html53/semantics-scripting.html#potentialcustomelementname
* except \u10000-\uEFFFF because of performance problem
Member


I made a benchmark for this and it turns out that in most modern browsers except Safari, it runs fastest if we include \u10000-\uEFFFF:

https://jsperf.com/unicode-regex-test

Contributor Author


Well, on my computer, `npm test` failed with a timeout when \u10000-\uEFFFF was included. It passed without the extra characters. I don't know yet why the performance differs.
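
One thing that may be worth checking when comparing the two variants: in a JavaScript regex without the `u` flag, `\uXXXX` consumes exactly four hex digits, so `\u10000-\uEFFFF` is not read as the astral range U+10000 to U+EFFFF. The sketch below only demonstrates that parsing rule; it is not a diagnosis of the timeout or of the benchmark numbers, and the names `legacy` and `astral` are made up for the comparison.

```javascript
// Without the `u` flag, "\u10000" is U+1000 followed by a literal "0".
// The class below is therefore: U+1000, the range "0" (U+0030) to U+EFFF, and "F".
const legacy = /^[\u10000-\uEFFFF]$/

// With the `u` flag, the braces form names the astral code points directly.
const astral = /^[\u{10000}-\u{EFFFF}]$/u

const samples = ['<', '=', 'a', '\u{10000}']
for (const s of samples) {
  console.log(JSON.stringify(s), legacy.test(s), astral.test(s))
}
```

Here `legacy` accepts `<`, `=` and `a` but rejects the single code point U+10000 (two UTF-16 code units), while `astral` does the opposite.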

@yyx990803 yyx990803 changed the base branch from dev to 2.6 December 26, 2018 14:53
// except \u10000-\uEFFFF because of performance problem
export const pcenchars = '[\\-\\.0-9_a-zA-Z\\u00B7\u00C0-\u00D6\u00D8-\u00F6\u00F8-\u037D\u037F-\u1FFF\u200C-\u200D\u203F-\u2040\u2070-\u218F\u2C00-\u2FEF\u3001-\uD7FF\uF900-\uFDCF\uFDF0-\uFFFD]'
const ncname = `[a-zA-Z_]${pcenchars}*`
const ncname = `[a-zA-Z_][\\-\\.0-9_a-zA-Z${unicodeLetters}]*`


'a-zA-Z' is repeated in unicodeLetters
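
A quick way to confirm the repetition is redundant rather than harmful: a range listed twice inside a character class matches the same set of characters. The sketch below assumes `unicodeLetters` starts with `a-zA-Z` followed by the ranges quoted in this thread; `withRepeat` and `withoutRepeat` are hypothetical names, and the `^`/`$` anchors are added only for the comparison.

```javascript
// Assumption: unicodeLetters already begins with 'a-zA-Z', as the review comment implies.
const unicodeLetters = 'a-zA-Z\u00B7\u00C0-\u00D6\u00D8-\u00F6\u00F8-\u037D\u037F-\u1FFF\u200C-\u200D\u203F-\u2040\u2070-\u218F\u2C00-\u2FEF\u3001-\uD7FF\uF900-\uFDCF\uFDF0-\uFFFD'

// Pattern as quoted in the diff, with a-zA-Z listed explicitly and again via unicodeLetters.
const withRepeat = new RegExp(`^[a-zA-Z_][\\-\\.0-9_a-zA-Z${unicodeLetters}]*$`)
// Same pattern with the explicit a-zA-Z dropped.
const withoutRepeat = new RegExp(`^[a-zA-Z_][\\-\\.0-9_${unicodeLetters}]*$`)

const names = ['my-component', 'my-컴포넌트', 'x.y_1', '1abc', 'a b']
const results = names.map(n => [n, withRepeat.test(n), withoutRepeat.test(n)])
console.log(results)
```

Both patterns accept the first three names and reject the last two, so dropping the explicit `a-zA-Z` should not change which names match.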
