Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Modify formatting of perference's name to align vscode #9381

Merged
merged 1 commit into from
Apr 28, 2021
Merged

Conversation

shuyaqian
Copy link
Contributor

@shuyaqian shuyaqian commented Apr 21, 2021

Signed-off-by: shuyaqian [email protected]

What it does

Avoid the problem that if the perference's name contains Chinese characters or special characters like C++ will be filtered.

How to test

Review checklist

Reminder for reviewers

@shuyaqian shuyaqian changed the title Remove formatting of perferences to align vscode Modify formatting of perferences' name to align vscode Apr 21, 2021
@shuyaqian shuyaqian changed the title Modify formatting of perferences' name to align vscode Modify formatting of perference's name to align vscode Apr 21, 2021
@vince-fugnitto vince-fugnitto added the preferences issues related to preferences label Apr 21, 2021
@colin-grant-work
Copy link
Contributor

@shuyaqian, thank you for your contribution! With your changes, I'm seeing some problems where names that should be split are not being split:

image

You mention in your PR description that the problem occurs with Chinese characters and special characters like the + in C++. Could you add some tests to the test suite for the various cases?

For example completePropertyWithSemicolon should become Complete Property With Semicolon but C++ should remain C++ and 某個設定/某个设定 should remain 某個設定/某个设定.

@shuyaqian
Copy link
Contributor Author

@shuyaqian, thank you for your contribution! With your changes, I'm seeing some problems where names that should be split are not being split:

image

You mention in your PR description that the problem occurs with Chinese characters and special characters like the + in C++. Could you add some tests to the test suite for the various cases?

For example completePropertyWithSemicolon should become Complete Property With Semicolon but C++ should remain C++ and 某個設定/某个设定 should remain 某個設定/某个设定.

Thanks, I've revised it.

@colin-grant-work
Copy link
Contributor

colin-grant-work commented Apr 22, 2021

Thanks, I've revised it.

Thank you very much! One thing I notice is that the new regex uses ranges like [a-z] to a lot. That means it now works well for the vanilla Latin alphabet, but it wouldn’t work for any other alphabets. To generalize it, I think it may be better to apply an iterative approach, rather than a regex, something like this (pseudocode):

Initialize output to empty string
Iterate over input string
If the first character can be capitalized, capitalize it.
If you encounter a capitalized character following a non-capitalized character, add a space before the capitalized character.
Otherwise just add the character

e.g.
Danish:
indstillingPåEnØ -> Indstilling På En Ø

Greek:
κάποιαΡύθμιση -> Κάποια Ρύθμιση

Russian:
некоторыеНастройки -> Некоторые Настройки

Armenian:
ինչ-որՊարամետր -> Ինչ-որ Պարամետր

I haven’t tested every case, but JavaScript’s toUpperCase() and toLowerCase() methods work on a wide variety of character sets, so a check like a.toLowerCase() !== a works pretty well to tell if something is capitalized, and a.toUpperCase() !== a works pretty well to check whether something could be capitalized. I’ve got some code here that works on the Armenian example, and should cover the rest, though it can certainly be improved.

Punctuation is a challenge. For example something like hyphenated-wordC++Setting should maybe become Hyphenated-word C++ Setting, but I don’t think we have a mechanism at the moment for making any useful distinctions there. I think we should probably allow something like a displayName in the preference schema and only fall back to calculating it if that doesn’t exist, but that’s outside the scope of this PR.

@shuyaqian
Copy link
Contributor Author

Thanks, I've revised it.

Thank you very much! One thing I notice is that the new regex uses ranges like [a-z] to a lot. That means it now works well for the vanilla Latin alphabet, but it wouldn’t work for any other alphabets. To generalize it, I think it may be better to apply an iterative approach, rather than a regex, something like this (pseudocode):

Initialize output to empty string
Iterate over input string
If the first character can be capitalized, capitalize it.
If you encounter a capitalized character following a non-capitalized character, add a space before the capitalized character.
Otherwise just add the character

e.g.
Danish:
indstillingPåEnØ -> Indstilling På En Ø

Greek:
κάποιαΡύθμιση -> Κάποια Ρύθμιση

Russian:
некоторыеНастройки -> Некоторые Настройки

Armenian:
ինչ-որՊարամետր -> Ինչ-որ Պարամետր

I haven’t tested every case, but JavaScript’s toUpperCase() and toLowerCase() methods work on a wide variety of character sets, so a check like a.toLowerCase() !== a works pretty well to tell if something is capitalized, and a.toUpperCase() !== a works pretty well to check whether something could be capitalized. I’ve got some code here that works on the Armenian example, and should cover the rest, though it can certainly be improved.

Punctuation is a challenge. For example something like hyphenated-wordC++Setting should maybe become Hyphenated-word C++ Setting, but I don’t think we have a mechanism at the moment for making any useful distinctions there. I think we should probably allow something like a displayName in the preference schema and only fall back to calculating it if that doesn’t exist, but that’s outside the scope of this PR.

I have a preference for 'SecSolar', which is an acronym, and I don't want it to be 'Sec Solar', but in VSCODE it's also 'Sec Solar'. I think this can be aligned with vscode and ignored for the time being. The rest of the logic is executed according to the pseudocode above.

Copy link
Contributor

@colin-grant-work colin-grant-work left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The code looks good and the tests are a very nice addition, only very minor issues remaining.

@@ -137,7 +137,17 @@ export class PreferenceTreeGenerator {
}

private formatString(string: string): string {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
private formatString(string: string): string {
protected formatString(string: string): string {

return formatedString.trim();
}

private isUpperCase(char: string): boolean {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
private isUpperCase(char: string): boolean {
protected isUpperCase(char: string): boolean {

Copy link
Contributor

@colin-grant-work colin-grant-work left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oops, a test is now failing. Looks like it expects capitalization after white space, as well as string-initially. I think that expectation is incorrect. If you create a preference name in the following form:

"references.My perfectly formatted Display Name"`

VSCode does not capitalize the words after spaces:

image

So you can simply change the expected output.

@shuyaqian
Copy link
Contributor Author

Thanks, I've revised it. I don't know why the test cases don't run automatically after force-push🤪,maybe need you to approve.

@colin-grant-work
Copy link
Contributor

@shuyaqian, looks like it did re-run, and there's still a tiny problem: the current code is adding a space in the sequence space + capital:

@theia/preferences:   1) preference-tree-generator
@theia/preferences:        PreferenceTreeGenerator.format:
@theia/preferences:       AssertionError: expected 'Aaa Bbb Ccc  Dddd eee' to equal 'Aaa Bbb Ccc Dddd eee'
@theia/preferences:       + expected - actual
@theia/preferences:       -Aaa Bbb Ccc  Dddd eee      (colin-grant-work: two spaces in 'Ccc  Dddd' here instead of the expected one)
@theia/preferences:       +Aaa Bbb Ccc Dddd eee

You can also run the tests locally: from the packages/preferences directory, run yarn test, and it will run the tests for that package.

@shuyaqian
Copy link
Contributor Author

@shuyaqian, looks like it did re-run, and there's still a tiny problem: the current code is adding a space in the sequence space + capital:

@theia/preferences:   1) preference-tree-generator
@theia/preferences:        PreferenceTreeGenerator.format:
@theia/preferences:       AssertionError: expected 'Aaa Bbb Ccc  Dddd eee' to equal 'Aaa Bbb Ccc Dddd eee'
@theia/preferences:       + expected - actual
@theia/preferences:       -Aaa Bbb Ccc  Dddd eee      (colin-grant-work: two spaces in 'Ccc  Dddd' here instead of the expected one)
@theia/preferences:       +Aaa Bbb Ccc Dddd eee

You can also run the tests locally: from the packages/preferences directory, run yarn test, and it will run the tests for that package.

I ignored the space in front of the uppercase letter, and I modified the criteria, now it's okay.

Copy link
Contributor

@colin-grant-work colin-grant-work left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Tests are passing and the code looks good. Thanks for the contribution and for the tests, @shuyaqian!

protected formatString(string: string): string {
let formatedString = string[0].toLocaleUpperCase();
for (let i = 1; i < string.length; i++) {
if (this.isUpperCase(string[i]) && string[i - 1] !== ' ' && !this.isUpperCase(string[i - 1])) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Minor: I think this is fine as is - I'll approve it momentarily - but the check !== ' ' could be generalized to regex /\s/, since not all scripts use the ' ' character.

@colin-grant-work
Copy link
Contributor

I will merge this tomorrow, if there are no objections.

@shuyaqian
Copy link
Contributor Author

I will merge this tomorrow, if there are no objections.

Thanks, hopefully this will end the PR this time😅

@colin-grant-work
Copy link
Contributor

Thanks, hopefully this will end the PR this time😅

Thanks for going along with my expansion of the scope of the PR. I think this is a big improvement over the previous implementation.

@colin-grant-work colin-grant-work merged commit cac776c into eclipse-theia:master Apr 28, 2021
@vince-fugnitto vince-fugnitto added this to the 1.13.0 milestone Apr 29, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
preferences issues related to preferences
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants