Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Improvements to the mb_url_title() helper #3185

Closed
wants to merge 1 commit into from

Conversation

michalsn
Copy link
Member

Description
This PR adds Unicode support to the mb_url_title() helper. See: #3180

I don't want to sound totally ignorant, but it would be great if someone with better knowledge of the non-Latin alphabet could check if this is working as it should. Right now this is purely based on feedback from @mmrtonmoybd.

Checklist:

  • Securely signed commits
  • Component(s) with PHPdocs
  • Unit testing, with >80% coverage
  • User guide updated
  • Conforms to style guide

@ivantcholakov
Copy link

ivantcholakov commented Jul 1, 2020

  1. I suppose you mean UTF-8.

  2. The function convert_accented_characters() has its old name that is understandable for people with Latin-origin language. But generally speaking it is transliteration of whatever text you have (even Chinese glyphs) into ASCII. And such a transliteration from cyrillic alphabet to ASCII is standartized, it is language-dependent - there is a Bulgarian way, there is a Russin way, etc. So, even without immediate implementation the signature of this function should be convert_accented_characters($string, $language).

  3. Maybe it would be easier for supporting UTF-8, not such fragmented efforts to be done. Maybe it would be better a set of UTF-8-aware functions to be implemented first for being maintained easily, something like Kohana did many years ago with their UTF8 class. And then this set to be used where it is appropriate. At the moment, I think that hardly readable regular expressions in many places might bring difficulties.

@ivantcholakov
Copy link

ivantcholakov commented Jul 1, 2020

Here is a collection of my patches on CI3 that worked well for me for years: https://github.com/ivantcholakov/starter-public-edition-4/blob/v4.4.6/platform/common/helpers/MY_text_helper.php Not everything.

@MGatner
Copy link
Member

MGatner commented Jul 1, 2020

I will let @michalsn take a look at those first.

@mmrtonmoybd
Copy link
Contributor

@MGatner now mb_url_title() is not work perfectly as before. I test it when @michalsn add mb_url_title() and it works as expect, now it is not working as expect.

@michalsn
Copy link
Member Author

michalsn commented Jul 1, 2020

Thanks for the feedback @ivantcholakov!

  1. Yes, I meant UTF-8
  2. You're right, I guess everything should be "translated" to ASCII with the relation to a given language.
  3. I have to think about how to address this correctly but a separate class is probably a good idea.

Obviously changes in this PR are not a good idea.

@michalsn michalsn closed this Jul 1, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants