Added support for Unicode formats and code casing #85

tdulcet · 2024-02-09T14:05:42Z

Added new Parenthesized Unicode font
Added support for Unicode formats (disabled by default)
- This just uses combining characters (diacritical marks) that appear like formatting.
Added support for code casing
Reverted some of the Chrome manifest changes from "Removed use of tabs.executeScript() and updated manifest files" (#83), which require first upgrading to MV3
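
As a rough illustration of the combining-character approach mentioned above, the sketch below appends a combining mark after every character of a string. The helper name `applyCombiningMark` and the choice of marks (U+0332 COMBINING LOW LINE, U+0336 COMBINING LONG STROKE OVERLAY) are assumptions for illustration and are not taken from the Unicodify source.

```javascript
// Hypothetical sketch: emulate "formatting" with combining diacritical marks.
const COMBINING_MARKS = {
    underline: "\u0332", // COMBINING LOW LINE
    strikethrough: "\u0336", // COMBINING LONG STROKE OVERLAY
};

function applyCombiningMark(text, mark) {
    // Array.from iterates by code point, so astral characters stay intact.
    return Array.from(text, (char) => char + mark).join("");
}

console.log(applyCombiningMark("code", COMBINING_MARKS.underline));
```

Because the result is ordinary text with extra code points, how it renders depends on the font and application displaying it.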

@rugk:

These and the menu items added in "Added option for a nested context menu" (#84) still need to be localized to German.

rugk · 2024-02-09T20:43:57Z

These and the menu items added in #84 still need to be localized to German.

Will do when merged… 🙂

tdulcet · 2024-02-29T17:38:07Z

This is ready to be merged.

rugk · 2024-02-29T19:37:09Z

Yeah, you can just click merge if that's the case. I did not want to merge it myself, as I was not sure whether you still wanted to add some things.

tDeContes · 2024-03-03T02:18:38Z

* Added support for code casing

Very nice !! :-)

Did you plan to allow enabling casing without code casing ?

I don't care about Toggle Case,
but with code casing I'll probably need sth like Capitalize Each Word, but not setting "other letters" Lowercase, do you understand what I mean ?
Did you plan that ?

tdulcet · 2024-03-08T17:19:43Z

Did you plan to allow enabling casing without code casing ?

The new coding cases have a separate checkbox on the options page.

but with code casing I'll probably need sth like Capitalize Each Word, but not setting "other letters" Lowercase

The coding cases do not make the other letters lowercase like "Capitalize Each Word" does. We could add an option for "Capitalize Each Word" to not make the other letters lowercase as well, if there was interest.

tDeContes · 2024-03-09T01:53:13Z

Did you plan to allow enabling casing without code casing ?

The new coding cases have a separate checkbox on the options page.

Fine!

but with code casing I'll probably need sth like Capitalize Each Word, but not setting "other letters" Lowercase

The coding cases do not make the other letters lowercase like "Capitalize Each Word" does.

So the upper camel case should do the job.

We could add an option for "Capitalize Each Word" to not make the other letters lowercase as well, if there was interest.

I don't think there is interest. I'll let you know if that changes.

tdulcet · 2024-03-09T11:09:27Z

So the upper camel case should do the job.

Possibly depending on your use case. All the coding cases will of course remove any whitespace or other special characters.

tDeContes · 2024-03-20T03:09:47Z

but with code casing I'll probably need sth like Capitalize Each Word, but not setting "other letters" Lowercase

The coding cases do not make the other letters lowercase like "Capitalize Each Word" does.

Oh! I didn't expect this behavior! :-D
Fine to be able to reverse createGuiHandler to create_gui_handler. :-)
(But it's broken for one-letter-words: it is a word -> ItIsAWord -> it_is_aword.)

However, I would very much like it to support acronyms.

With Pascal_Snake_Case, create GUI handler would give Create_GUI_Handler.
Of course, with something like snake_case it makes no sense, since it's all lowercase.
Maybe it makes sense for UpperCamelCase. I don't know whether it would be better to produce CreateGUIHandler, CreateGUIhandler, or something else.
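
A minimal sketch of the Pascal_Snake_Case idea above, assuming words are separated by whitespace and that only the first letter of each word is upper-cased while the rest is left as typed, so acronyms survive. `toPascalSnakeCase` is a hypothetical helper, not part of Unicodify.

```javascript
// Hypothetical helper: capitalize each word's first letter, keep the rest as typed.
function toPascalSnakeCase(text) {
    return text
        .trim()
        .split(/\s+/u)
        .filter(Boolean) // drop the empty entry produced by an empty input
        .map((word) => {
            const [first, ...rest] = word; // destructuring iterates by code point
            return first.toUpperCase() + rest.join("");
        })
        .join("_");
}

console.log(toPascalSnakeCase("create GUI handler")); // Create_GUI_Handler
```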

tdulcet · 2024-03-20T12:31:59Z

(But it's broken for one-letter-words: it is a word -> ItIsAWord -> it_is_aword.)

Thanks for testing it and for the feedback. This was a deliberate choice to handle words with multiple consecutive capitals, but I can change this in my next PR. It is just a trivial change to the regular expression.

However, I would very much like it to support acronyms.

It looks like I misspoke above, as I forgot about the constant and train cases, which require lowercasing the string to convert them to the other cases. Keeping the existing case would also make the conversion irreversible, as CreateGUIHandler would then become create_guihandler or create_g_u_i_handler (if disallowing multiple consecutive capitals).

However, we could add an option for the coding cases to not make the other letters lowercase, if there was interest.
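
To make the lowercasing trade-off concrete, here is a small sketch that converts an already split CONSTANT_CASE identifier to UpperCamelCase with and without lowercasing first. The helpers `capitalize` and `toUpperCamelCase` and the `lowercase` option are hypothetical names for illustration, not the extension's actual code or settings.

```javascript
// Hypothetical sketch: why converting from CONSTANT_CASE needs lowercasing.
function capitalize(word) {
    const [first = "", ...rest] = word; // iterate by code point
    return first.toUpperCase() + rest.join("");
}

function toUpperCamelCase(words, { lowercase = true } = {}) {
    return words
        .map((word) => capitalize(lowercase ? word.toLowerCase() : word))
        .join("");
}

const words = "MAX_RETRY_COUNT".split("_");
console.log(toUpperCamelCase(words)); // MaxRetryCount
console.log(toUpperCamelCase(words, { lowercase: false })); // MAXRETRYCOUNT
```

Without lowercasing, the word boundaries of the constant-case input are lost, which is the flip side of preserving acronyms such as GUI.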

tDeContes · 2024-03-20T20:56:16Z

(But it's broken for one-letter-words: it is a word -> ItIsAWord -> it_is_aword.)

While testing it's a word, I discovered the following strange behavior:

it is a word -> it_is_a_word: fine
it_is_a word -> itisa_word!
it_is_a-word -> it_is_a_word: still fine!

All the coding cases will of course remove any whitespace or other special characters.

Well, IMHO the best would be easier, that is, any whitespace or other special characters would be removed, of course, but always considering them as word separators, never making them void.

Thanks for testing it and for the feedback.

:-)

However, I would very much like it to support acronyms.

It looks like I misspoke above, as I forgot about the constant and train cases, which require lowercasing the string to convert them to the other cases.

It's always using lowercase, but it's fine (except with one-letter-words and acronyms) because it preserves uppercase when needed. :-)

Keeping the existing case would also make the conversion irreversible, as CreateGUIHandler would then become

create_guihandler

No sense.

or create_g_u_i_handler (if disallowing multiple consecutive capitals).

It's right for one-letter-words, and it also has a sense for acronyms, since they are made with initials!

However, we could add an option for the coding cases to not make the other letters lowercase, if there was interest.

IMHO the best thing to do would be:

- discard create_guihandler
- make create_g_u_i_handler the default (since we have at hand a sample for 2 consecutive one-letter-words: it's a word)
- add an option to get create_gui_handler back, that is to handle acronyms.

In fact, yes, it could be a good idea to just leave the case as is, without trying to recognize acronyms, because I sometimes use a mix of Pascal_Snake_Case and UpperCamelCase.
For example, in Create_ObjectHandler it would mean that ObjectHandler is an entity to which Create applies, whereas CreateObject_Handler would have a different meaning. :-)

(However I need some more experimentation to see what would really be useful.)

tdulcet · 2024-03-21T12:34:31Z

it_is_a word -> itisa_word!

That is intentional. If there is any whitespace in the string, it splits by that instead of the special characters. The hierarchy currently is whitespace, then special characters and lastly casing transitions.
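
The hierarchy described above can be sketched roughly as follows. This is an assumption-laden illustration, not the actual Unicodify implementation: `splitWords`, `toSnakeCase` and the `SPECIAL` pattern (anything that is not a letter or digit) are hypothetical, and only the casing-transition regular expression is taken from this thread.

```javascript
// Hypothetical sketch of the split hierarchy: whitespace first, then special
// characters, then casing transitions.
const SPECIAL = /[^\p{L}\p{N}]+/u; // assumed definition of "special characters"

function splitWords(text) {
    // 1. If the text contains whitespace, split on it and drop special characters.
    const byWhitespace = text.trim().split(/\s+/u).filter(Boolean);
    if (byWhitespace.length > 1) {
        return byWhitespace
            .map((word) => word.split(SPECIAL).join(""))
            .filter(Boolean);
    }
    // 2. Otherwise, split on special characters.
    const bySpecial = text.split(SPECIAL).filter(Boolean);
    if (bySpecial.length !== 1) {
        return bySpecial;
    }
    // 3. Otherwise, split on casing transitions.
    return bySpecial[0].match(/\p{Upper}*\P{Upper}+|\p{Upper}+/gu);
}

const toSnakeCase = (text) =>
    splitWords(text).map((word) => word.toLowerCase()).join("_");

console.log(toSnakeCase("it is a word")); // it_is_a_word
console.log(toSnakeCase("it_is_a word")); // itisa_word
console.log(toSnakeCase("it_is_a-word")); // it_is_a_word
```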

Well, IMHO the best would be easier, that is, any whitespace or other special characters would be removed, of course, but always considering them as word separators, never making them void.

The problem with that is your it's a word example. It would split that into it_s_a_word instead of its_a_word, which is probably not what the user would want.

It's right for one-letter-words, and it also has a sense for acronyms, since they are made with initials!

OK, thanks for the feedback. I will make this change in my next PR. If you want to test it now, in this line:

unicodify/src/common/modules/UnicodeTransformationHandler.js, line 188 in b62fb3b:

    return !arr.length || arr.length > 1 ? arr : arr[0].match(/\p{Upper}*\P{Upper}+|\p{Upper}+/gu);

Change the /\p{Upper}*\P{Upper}+|\p{Upper}+/gu regular expression to /\p{Upper}\P{Upper}*|\P{Upper}+/gu.
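
For anyone who wants to see the difference without patching the extension, the two regular expressions can be compared in isolation. The `toSnakeCase` helper below is a hypothetical stand-in that only covers the casing-transition step, so it assumes input without whitespace or special characters.

```javascript
// The two regular expressions quoted above.
const before = /\p{Upper}*\P{Upper}+|\p{Upper}+/gu;
const after = /\p{Upper}\P{Upper}*|\P{Upper}+/gu;

// Hypothetical helper for illustration only: split by casing, then join.
const toSnakeCase = (text, regex) =>
    (text.match(regex) ?? []).map((word) => word.toLowerCase()).join("_");

console.log(toSnakeCase("ItIsAWord", before)); // it_is_aword
console.log(toSnakeCase("ItIsAWord", after)); // it_is_a_word
console.log(toSnakeCase("CreateGUIHandler", before)); // create_guihandler
console.log(toSnakeCase("CreateGUIHandler", after)); // create_g_u_i_handler
```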

add an option to get create_gui_handler back, that is to handle acronyms.

The question is how to differentiate acronyms from one letter words...
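
The ambiguity can be shown with a small experiment. The `acronymAware` pattern below is a hypothetical alternative (not from Unicodify) that keeps a run of capitals together unless the last capital starts a new word. It handles CreateGUIHandler as hoped, but it cannot tell two consecutive one-letter words from a two-letter acronym.

```javascript
// Hypothetical acronym-aware pattern: a run of capitals stays together,
// except for a final capital that is followed by a non-capital.
const acronymAware = /\p{Upper}+(?=\p{Upper}\P{Upper})|\p{Upper}?\P{Upper}+|\p{Upper}+/gu;
// One-capital-per-word pattern suggested earlier in the thread.
const oneLetterWords = /\p{Upper}\P{Upper}*|\P{Upper}+/gu;

// Hypothetical helper for illustration only.
const toSnakeCase = (text, regex) =>
    (text.match(regex) ?? []).map((word) => word.toLowerCase()).join("_");

console.log(toSnakeCase("CreateGUIHandler", acronymAware)); // create_gui_handler
console.log(toSnakeCase("ItSAWord", acronymAware)); // it_sa_word
console.log(toSnakeCase("ItSAWord", oneLetterWords)); // it_s_a_word
```

Neither pattern is right for both inputs, which is why a user-facing choice would be needed to pick between them.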

(However I need some more experimentation to see what would really be useful.)

Please feel free to open additional issues if you find anything that would be useful.

tDeContes · 2024-03-22T20:31:29Z

it_is_a word -> itisa_word!

That is intentional. If there is any whitespace in the string, it splits by that instead of the special characters. The hierarchy currently is whitespace, then special characters and lastly casing transitions.

OK, I got it. If there is no space, all special characters are separators.

Well, IMHO the best would be easier, that is, any whitespace or other special characters would be removed, of course, but always considering them as word separators, never making them void.

The problem with that is your it's a word example. It would split that into it_s_a_word instead of its_a_word, which is probably not what the user would want.

Personally, I prefer it_s_a_word.
But it's not a priority. If you disagree, I can open a discussion about that later if I find it too annoying.

It's right for one-letter-words, and it also has a sense for acronyms, since they are made with initials!

OK, thanks for the feedback. I will make this change in my next PR.

If you want to test it now,

No, thanks, I'll wait.
Let me know about your next PR, so I can test it. :-)

add an option to get create_gui_handler back, that is to handle acronyms.

The question is how to differentiate acronyms from one letter words...

It's not possible, that's why it needs an option. (I mean a setting with a checkbox or radio buttons.)

Alternative: suggest both in the context menu (but that won't make it shorter ...)

It looks like I misspoke above, as I forgot about the constant and train cases, which require lowercasing the string to convert them to the other cases.

However, we could add an option for the coding cases to not make the other letters lowercase, if there was interest.

Are you able to detect constant and train cases, to not apply lowercase in other cases ?

(However I need some more experimentation to see what would really be useful.)

Please feel free to open additional issues if you find anything that would be useful.

Anyway, all of that is not high-priority.
I prefer you to make the selection on right-click first! ;-)

tdulcet · 2024-03-23T10:52:48Z

But it's not a priority. If you disagree, I can open a discussion about that later if I find it too annoying.

I used some of the test cases from this library when developing this, so I believe it is the correct behavior. However, I am always open to changing it if needed and updating the regular expressions.

Let me know about your next PR, so I can test it. :-)

I can ping you on my next PR if I remember... You could subscribe to this repository as well, so you would automatically be notified of any new PRs. 😉

It's not possible, that's why it needs an option.

OK, please create an issue with specifically what you want this new option to do. Which coding cases should it apply to? Some test cases would be helpful as well, so that we could confirm it is working as intended.

Are you able to detect constant and train cases, to not apply lowercase in other cases ?

Not currently. It prioritizes a simple implementation, so it uses the same split function for all of the coding cases.

tDeContes · 2024-04-12T01:28:04Z

Let me know about your next PR, so I can test it. :-)

I can ping you on my next PR if I remember... You could subscribe to this repository as well, so you would automatically be notified of any new PRs. 😉

Sorry, I can't find how to subscribe to this repository. Where is the button, please? :-)

It's not possible, that's why it needs an option.

OK, please create an issue with specifically what you want this new option to do. Which coding cases should it apply to? Some test cases would be helpful as well, so that we could confirm it is working as intended.

Thank you. :-)
I think it would be better to do #93 first; it would also help me figure out exactly what I want, so I'll wait for it. :-)

However I forgot to ask:
In Unicodify DEV VERSION, is it right that the details tab is empty? I count on it to copy-paste code casing names, since it doesn't work in the settings tab.

tdulcet · 2024-04-12T10:29:36Z

Sorry, I can't find how to subscribe to this repository. Where is the button, please? :-)

Click the "Watch" dropdown button at the top right of the main page of the repository and then select your preferred options.

In Unicodify DEV VERSION, is it right that the details tab is empty? I count on it to copy-paste code casing names, since it doesn't work in the settings tab.

Yes, Firefox/Thunderbird gets that text directly from the AMO/ATN listings respectively, which does not apply to side loaded extensions. You could just open the listing on AMO/ATN directly though to copy from, for example: https://addons.mozilla.org/firefox/addon/unicodify-text-transformer/.

Not being able to copy from the options page should be fixed by TinyWebEx/CommonCss#4.

Added support for Unicode formats and code casing.

18fdeb5

tdulcet requested a review from rugk February 9, 2024 14:05

rugk previously approved these changes Feb 9, 2024

Cleaned up code and restored Chrome MV2 support.

0a8541a

tdulcet dismissed rugk’s stale review via 0a8541a February 10, 2024 17:35

rugk approved these changes Feb 10, 2024

rugk merged commit e094322 into main Feb 29, 2024
7 checks passed

rugk deleted the Unicode-format branch February 29, 2024 19:36

tdulcet mentioned this pull request Mar 1, 2024

Making Toggle Case optionnal #88

Closed

tDeContes mentioned this pull request Mar 20, 2024

Pascal_Snake_Case #92

Closed
