-
-
Notifications
You must be signed in to change notification settings - Fork 0
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Improving UTF-8 Support #203
Comments
I really like this, but my only concern is that it wouldn't be understandable while in the text editor. Perhaps we could save a mapping of usernames <=> ids somewhere, still show the usernames in the text editor, and swap it out on save? Essentially, just cuz we store it that way shouldn't mean that we display it in the text editor that way. Also, ref #254
I agree, this should resolve itself naturally with #286 And that brings us to slugs. I've given this a bit of thought, and would like to propose the following structure: We have an extender, with use:
The driver interface looks like this:
The sluggable model interface looks like this:
Alternatively, we could allow registration of sluggable models via an extender, or bring the SluggableModelInterface functionality out into some form of service class. We would provide a UI in admin for site admins to choose which slug drivers to use for each supported model (if more than one is registered for that model) In all sluggable models, we would also include a computed slug for the model in the serialization so that we can keep slug generation logic completely to the backend. For the API, we can use a parameter sent in get requests to single-item endpoints to determine if the model instance should be retrieved via a slug driver, otherwise defaulting to an ID (to keep a bit of compliance with JSON API? This should also resolve flarum/framework#358 and flarum/framework#1356 |
@askvortsov1 For this we could also do something like Discord, but that'd require the editor to be WYSIWYG then (at least for some functions). I agree that just showing |
I agree that Also @askvortsov1 I'm pretty much happy with this proposals for slugs, and it's pretty much what I had in mind. |
I've updated the list based on some community feedback. However I believe that what's now listed is pretty finalized at this point. I plan to start at least some work here in the coming days to resolve some of these issues. Edit: I'll start work on this once Mithril 2 rewrite is near completion/complete. This will be my focus for next beta. |
Alright, I'm just going to write down my vision of the way that his would work. Looking for feedback on this idea, it mostly addresses the slugging problems. Overall the basic idea is that each model that needs slug would use a trait or implement something that would let core know that it's a sluggable model. Further there would be a function (e.g. Each slug handler would be defined as a driver (like the mail system) and extension developers would be able to write their own driver that the admin would then be able to select on the dashboard. Admins should be able to select a driver on a model by model basis (so they could do something like discussions use php-intl slugging, users use no conversion slugging, etc.). Once the admins have changed the slugging method they should be able to use a "Update slug" button that would update all stored slugs. However that button should warn them if they have more than say 1K discussions/users and show them the proper command to use in SSH to do it from there. We should also warn admins that there may be SEO impacts if they update the slugs.
|
I would like to leave a suggestion about this feature. At the moment username can only follow strict rules and there are no ways at the moment (other than editing Flarum core) to change them to something more permissive. |
@maicol07 We do support special characters in Display Names which in the default version of Flarum isn't used much. If you look on twitter for example the actual username does not support UTF-8 but the display name does. One of the things that we in theory will be fixing as part of this UTF-8 project will be display/username support and other things that need fixing. Also of note your concern is listed in the checklist. |
For changing usernames, I believe that's possible as an admin side action, and the fof username change request extension is a great package for enabling user side requests. Tbh I don't think it's needed as a core feature given those 2 |
Twitter? Or am I thinking about that wrong? Maybe that’s not technically a username and/or, I’m sure being Twitter they find other ways to harden security? I get trying to keep usernames more private for security. But, I do think some sites, even some big ones, still have username logins and display usernames. You just need strong passwords. Maybe 2 factor eventually. Anyway... just saw that comment and thought I’d foolishly for whatever reason go to bat for sites that display usernames and allow username logins. 🙃 I’m actually a fan of going with email only logins or whatever is decided. Or, let the user decide how secure they want to be. A switch in the admin to allow or not allow username logins. I guess that could be an extension. I also guess I’m getting a little off topic. |
@tankerkiller125 I've been thinking about this, and I'm not sure if we want to over engineer this. Essentially, I'm not convinced that in 2020, transliteration is still necessary. What if we offered the following (independent, boolean) options:
This should cover most use cases, and wouldn't be too complex. It would also be fully up to site admins to enable/disable allowing direct unicode in slugs (with the knowledge that those settings would only apply to future changes). It seems to me like the only issue with unicode in usernames is for slugs, and over the past years, browsers support this better and better (as they should, IMO). Other forum softwares allow this as well (https://community.nodebb.org/topic/417/unicode-slugs-yes-please, https://meta.discourse.org/t/unicode-usernames-and-group-names/117737). We'd then just need to fix the mentions regex (or substitute the aforementioned solution), which needs to be done anyway. The only issue that remains with this aproach is what to do for extensions that define their own models that need slugs. For instance, tags. With the above framework, I don't think we need an "extender" per-se, but it would be nice to have a unified place in admin for configuring all these at once. But we could keep the same switches: Whether unicode can be used in tag slugs, and whether the numerical ID should be used instead of unicode in tag slugs. |
@askvortsov1 although browsers now display UTF-8 when I suggested that in the original discussion on discuss (or possibly a discord conversation) users with UTF-8 languages where against it do to the way that copy and paste gets handled (very poorly) making sharing of the full URL very ugly. |
I'm still unsure that we need custom drivers. Practically speaking, one of 3 slugging schemes would be used:
With the 2 switches per-model strategy described above, all these are supported simply and out of the box. We could broaden it to 3 switches if we allow ASCII only (plus periods, I think those should be added), ASCII + accented characters (easily transliteratable), or full Unicode. |
After further discussion during our meeting last night there are plans to merge two ideas namely:
We will likely need another meeting to hash out the exact plan here and determine the best way forward. |
As a follow-up, we need to decide what kind of drivers we're going with. I'm not sure that we should force a dedicated column for slugs, because for the most part, those can be generated dynamically. Realistically, there are going to be 4 drivers:
If someone wants to use a slug beyond that, they might need a DB column, but that's extension territory IMO. Another concern is how we'd want these to be handled in the frontend. We could generate and serialize slugs when sending content across, but that doesn't address the question of deciphering the slug so the frontend knows which endpoint to send API requests to. We could either:
After that, we can rework mentions, and that should allow for utf-8 usernames! |
We could actually re-work mentions regardless of slugging drivers or changes. The only hard part about mentions is getting the URL for the user mentions (which will be tied to drivers) |
True. We could even do both at once... For the mentions bit, I think s9e/TextFormatter#23 (comment) might just be our solution: We're already retrieving the users tagged by a post from the DB to insert their display names; the same could be done with the slug (and this can even be dynamically) in https://github.com/flarum/mentions/blob/8bcb66f7d728fc92c92c69333c40175c759eba9e/src/Listener/FormatUserMentions.php#L21-L29. So that leaves us with adjusting the frontend regex so it can recognize dots and utf-8 characters I suppose (although actually no, that leaves us with adjusting the frontend display code so it has a WYSIWYG effect in the text editor so we can store IDs. Which frankly I'm not 100% sure how to approach. How do you feel about not tying sluggability to DB columns? |
I see the reason for not tying it to slugs in the db, I also have some concerns around it but their fairly minimal concerns so I don't think it's worth arguing over it. Not having it tied to the DB does making changing slug drivers easier, but we'll still need a mass update function regardless since discussions already have a slug column. |
I'm worried that removing a slug column in the database will impact performance too much. Needing to loop over all discussions from the index through a Sluggerable class in contrast to saving the slug attribute on save makes a huge difference.. |
I don't think anything should be removed, but I think that whether or not a column in the database is used to store slugs is more of an implementation detail for a given driver than a part of the API. We loop through the discussions anyway when serializing, so a database column could be used in implementation to (slightly) speed up the For |
A bit more info from a meeting today for what's needed:
|
Search improvements will be a part of #286, so the remaining concrete task here is the day js plugin for scrubber date translations. |
A note on that Day js plugin is that we need a localized format for |
@tankerkiller125 i can confirm that this doesn't work with sinhala (si) language as well. We also interest to see flarum support to display every scripts slugs. e.g. when user started a discussion with particular script title |
@tankerkiller125 @luceos search facility works perfectly with |
Search function for Chinese works only when there is a perfect phrase match, but a partial search is very needed as CJK languages don't use space to separate words. |
The following kinds of bugs regarding UTF-8 are known and require the following kinds of changes/steps:
Problems:
The text was updated successfully, but these errors were encountered: