Ability to display the "uncommon" fonts in the platform #1554

Closed
dpalomino opened this issue Jun 1, 2017 · 23 comments

@dpalomino

As discussed, we need a way to properly display "uncommon" fonts not currently supported in the platform (and specifically the font used by a particular partner).

This is a follow-up to issue #1549 (please see details there).

@clash99
Contributor

clash99 commented Jun 1, 2017

The ones mentioned yesterday: KNU, Karen, Padauk (Burmese) and Noto Myanmar.

@clash99
Contributor

clash99 commented Jun 1, 2017

We should probably have a discussion related to this for the bigger picture. Scalability and maintenance could get difficult very quickly. Some options off the top of my head:

  • Is a reasonable solution to rely on users to have these fonts installed locally?
  • We could detect these uncommon fonts and provide links in an alert message to download them (would this be from us or someone like Google? But what if they aren't provided by Google...)
  • Other options?

@amplifi
Contributor

amplifi commented Jun 1, 2017

Flagging this -- even if we can get the character set mapping to work, for any non-Unicode charset, portions of the platform will still either not function at all or exhibit unpredictable behavior, including but not limited to: any type of sorting, search, string validation, export...

This is not just a font issue; this is a character set/encoding issue.

@alukach
Contributor

alukach commented Jun 1, 2017

Regarding @amplifi's comment, I think it's important that we are very clear about which problems we're talking about. There seem to be two issues at hand:

  1. What do we do with non-unicode data? Should it be supported? What do we do if it's not supported? If we are supporting it, how do we handle those fonts?
  2. If the data is encoded in unicode, how do we ensure that its characters are represented correctly on our front-end (i.e. how do we avoid "tofu")?

These are two distinct issues and can be addressed separately. @amplifi seems to be suggesting that the answer to #1 should be "no, non-unicode is not supported". I vote a +1 to this (simply for the sake that it's easier than suggesting that it is supported) and suggest that we agree on a policy that "Cadasta expects all data to be UTF-8 compliant" until we get to a place where we are confident that it is actually needed. It may be prudent to add some tooling to error on non-unicode input (maybe this already exists as part of our sanitization tooling).
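As a rough illustration of what "error on non-unicode input" tooling might look like, here is a minimal Python sketch. The function name `decode_utf8_or_raise` and the `field` parameter are hypothetical, not part of the Cadasta codebase. It only catches input whose bytes are not valid UTF-8. It would not catch text typed with a legacy font encoding that reuses ordinary code points, which decodes fine but displays as the wrong characters.

```python
def decode_utf8_or_raise(raw: bytes, field: str = "input") -> str:
    """Decode raw bytes as strict UTF-8.

    Raises ValueError naming the field and the offset of the first bad byte.
    Both the function name and the `field` label are illustrative assumptions.
    """
    try:
        return raw.decode("utf-8", errors="strict")
    except UnicodeDecodeError as exc:
        raise ValueError(
            f"{field} is not valid UTF-8 (bad byte at offset {exc.start})"
        ) from exc
```

A check like this would sit at the point where uploaded data first enters the system, so that anything failing it is rejected before sorting, search, or export ever sees it.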

I'd suggest renaming this Issue to read "Ability to display the "uncommon" unicode characters in the platform" to specify that it only relates to dilemma #2.

With regards to #2 I'd like to add a point to this solution's wishlist:

  • If we are loading fonts via URL (such as from Google or from our own /static dir), we should ideally aim for a solution that only loads fonts that are needed, as loading font-faces for every possible language seems a bit excessive. Not sure what the right way to do this would be. We could ask for a user to select the used languages when creating a project. Or, aim for some magic language detection based on characters (https://stackoverflow.com/questions/4545977/python-can-i-detect-unicode-string-language-code, https://github.com/wooorm/franc) and dynamically load fonts as-needed for that representation (this feels far-fetched but a dev can dream).
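A much simpler cousin of the "magic language detection" idea is to look only at which Unicode blocks a string touches, which is enough to decide whether a Myanmar-script font is needed at all. The sketch below is an assumption about how that could be done, not existing platform code. The names `MYANMAR_BLOCKS` and `blocks_used` are made up for illustration. It identifies the script, not the language, so it cannot tell Burmese from S'gaw Karen.

```python
from typing import Set

# Code point ranges of the three Myanmar-related Unicode blocks.
MYANMAR_BLOCKS = {
    "Myanmar": (0x1000, 0x109F),
    "Myanmar Extended-B": (0xA9E0, 0xA9FF),
    "Myanmar Extended-A": (0xAA60, 0xAA7F),
}


def blocks_used(text: str) -> Set[str]:
    """Return the names of the Myanmar-related blocks that appear in text."""
    found = set()
    for ch in text:
        cp = ord(ch)
        for name, (lo, hi) in MYANMAR_BLOCKS.items():
            if lo <= cp <= hi:
                found.add(name)
    return found
```

If `blocks_used(value)` is non-empty for any project data field, the page could include the extra font; otherwise it could skip it.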

@alukach
Contributor

alukach commented Jun 1, 2017

Looks like my request for as-needed font loading is already built in to modern browsers: https://developers.google.com/web/fundamentals/performance/optimizing-content-efficiency/webfont-optimization#unicode-range_subsetting
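For reference, unicode-range subsetting means declaring in the `@font-face` rule which code points a font file covers, so the browser only downloads it when the page actually contains those characters. Here is a hedged Python sketch that renders such a rule. The helper `font_face_rule`, the font URL, and the choice of woff2 are all assumptions for illustration, not what the platform does.

```python
from typing import List, Tuple


def font_face_rule(family: str, url: str, ranges: List[Tuple[int, int]]) -> str:
    """Render an @font-face rule restricted to the given code point ranges.

    `url` is a hypothetical location for the font file.
    """
    unicode_range = ", ".join(f"U+{lo:04X}-{hi:04X}" for lo, hi in ranges)
    return (
        "@font-face {\n"
        f"  font-family: '{family}';\n"
        f"  src: url('{url}') format('woff2');\n"
        f"  unicode-range: {unicode_range};\n"
        "}\n"
    )


# Example: a Myanmar-script font covering the main block and Extended-A.
rule = font_face_rule(
    "Padauk",
    "/static/fonts/padauk.woff2",  # hypothetical path
    [(0x1000, 0x109F), (0xAA60, 0xAA7F)],
)
```

With a rule like this in the stylesheet, pages with no Myanmar-script characters should not trigger the font download at all.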

@clash99
Contributor

clash99 commented Jun 1, 2017

@alukach

For Q1, I agree we should not support non-Unicode data. Karen is pseudo-Unicode, meaning that it is a combination of Unicode and non-Unicode, so do we throw it all out? If it's all converted to Unicode then I think we are good there.

For Q2, since the users at-hand have the fonts installed locally we don't have to do anything with the "visual images that typically represent the parts of language" display issue on the platform right now. :)
I think we need to set aside some time to discuss this in Madrid. I think we should put this into a larger discussion regarding how we handle all files that feed into the Cadasta UI (cdn, etc). It will be good to collaborate and set some thoughtful best practices.

@clash99
Contributor

clash99 commented Jun 1, 2017

@SteadyCadence and @dpalomino - Since it seems like the users we are currently supporting have these fonts locally installed, I suggest we table this for a larger discussion in Madrid (see note above). I recommend we move this from high priority to medium priority. Let me know if you agree.

@alukach
Contributor

alukach commented Jun 1, 2017

@clash99 I'm not sure what you mean by pseudo-unicode. My understanding is that all of S'gaw Karen characters should be included in the Myanmar Unicode block, meaning it is valid Unicode. Am I mistaken? Are there more characters outside of that Unicode block that make up the S'gaw Karen written language?

Are we confident that all S'gaw Karen speaking clients have a Myanmar Unicode font installed and available on their machines? For example, some of the input data was provided in a non-Unicode Karen font. Furthermore, are we certain that all browsers will utilize that Myanmar Unicode font when it encounters characters from the Myanmar Unicode block (I think this is a "yes")? I would recommend that @SteadyCadence ask a few of our S'gaw Karen speaking partners to open the Wikipedia link and to verify that there are no tofu symbols in the block: https://en.wikipedia.org/wiki/Myanmar_(Unicode_block)

@clash99
Contributor

clash99 commented Jun 1, 2017

@alukach I don't know if this is an issue or not but it seems like one symbol ( ြ) might be a problem, or at least worth verifying that it is not a problem based on what I've read.

Good questions for @SteadyCadence. When I was screen sharing with Katrina last night they were installing fonts on the machines they were using so I'm pretty sure they have everything locally that they need (for this particular partner at least). But yes, we should verify. At this point I think it would be better to install the fonts locally.

@SteadyCadence

@alukach @clash99 Looking at these tables with Mabu, he says that there are some characters in the Myanmar table that are not used in Karen and then there are some characters in the extension A and extension B that are used in Karen: https://en.wikipedia.org/wiki/S%27gaw_Karen_Script

I'll try to print out the sheet and have him highlight the characters that Karen uses... not sure how much that will help.

@seav
Contributor

seav commented Jun 2, 2017

It should be OK if there are unused characters. What we should worry about is if there are characters they use that aren't in Unicode.
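One partial way to look for "characters that aren't in Unicode" in stored data is to flag code points that are private-use or unassigned, since custom fonts sometimes park extra glyphs there. This is only a sketch under that assumption, and `suspicious_chars` is a hypothetical helper name. The result depends on the Unicode version bundled with the Python interpreter. It will not detect legacy fonts that draw Karen glyphs over ordinary Latin code points.

```python
import unicodedata


def suspicious_chars(text):
    """List (index, char, category) for private-use (Co) or unassigned (Cn) code points."""
    return [
        (i, ch, unicodedata.category(ch))
        for i, ch in enumerate(text)
        if unicodedata.category(ch) in {"Co", "Cn"}
    ]
```

An empty result for a partner's sample data would be a weak signal that the text uses only assigned Unicode characters. A non-empty one would point at exactly the characters worth asking about.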

@SteadyCadence

SteadyCadence commented Jun 2, 2017

I lied. Apparently, there are characters missing in that Wikipedia page. This file has all of the alphabet and the corresponding unicode character:
Payap unicode complain.pdf

@alukach
Contributor

alukach commented Jun 2, 2017

@SteadyCadence thanks for the update. I take it that all of the characters rendered properly when you opened the S'gaw Karen webpage? Even the extension? As @seav said, the issue isn't unused characters. Right now we just want to verify that our target users have that font installed on their machines and that it plays well with their browsers and Unicode text.

@clash99
Contributor

clash99 commented Jun 2, 2017

I've read the emails @SteadyCadence and I've created a branch called font/padauk that I believe will work. The KNU font mentioned wasn't available in the link and this Padauk font was recommended in the second email. I think it will work based off my before/after screenshots below. Can you verify that this is readable? If so, I will make a PR.

BEFORE

screenshot 2017-06-02 11 05 24

AFTER

screenshot 2017-06-02 11 05 53

@SteadyCadence

Thank you! I'll try to get someone to take a peek. I know I will go into the office on Wed, so I can certainly get it done by then.

I appreciate your hard work on this! I think it will make the team super happy, esp because they have had some doubts.

@dpalomino
Author

dpalomino commented Jun 5, 2017

First, thanks to all @clash99 @alukach @seav @amplifi and @SteadyCadence !!! This is a very, very tricky, difficult, but important thing.

> For Q1, I agree we should not support non-Unicode data. Karen is pseudo-Unicode, meaning that it is a combination of Unicode and non-Unicode, so do we throw it all out? If it's all converted to Unicode then I think we are good there.

I agree; indeed, as far as I understand, it is our only option...

> For Q2, since the users at-hand have the fonts installed locally we don't have to do anything with the "visual images that typically represent the parts of language" display issue on the platform right now. :)

If we can confirm (@SteadyCadence?) that the whole team now has the correct fonts installed and there are no problems working with the data, then I think it should be OK. If this is not working for the whole team, I think we should make an exception and try to load the proper font via font-face for this case only, given the importance of the partner.

> I think we need to set aside some time to discuss this in Madrid. I think we should put this into a larger discussion regarding how we handle all files that feed into the Cadasta UI (cdn, etc). It will be good to collaborate and set some thoughtful best practices.

Totally agree, and thanks for bringing this into the conversation!

@SteadyCadence

SteadyCadence commented Jun 6, 2017

Disregard what I wrote earlier if you received it in an email. I got very confused talking with Lawplah-- he is confusing sometimes.

I think we can successfully read KNU with the font/KNU branch. The font works well in Chrome and Safari. I couldn't test on IE yet. (They use Chrome and IE mostly). In Firefox, there are characters missing.

I think we should go ahead and merge the Padauk and KNU branches asap. Both languages will be used. This gets us one step closer to ideal readability.

Padauk is the font for the Burmese language, which is different from Karen.
In the KNU fonts showing in Cadasta, you can see some ticks off of the letters, which represent vowels. On the site, you only see one tick but in Karen that would be two ticks. So there are some small errors but this is a GOOD first step. Also they would definitely still need to install the KNU Unicode keyboard.

Overall, let's merge the branches. It's our best option so far!

@clash99 clash99 mentioned this issue Jun 6, 2017
20 tasks
@dpalomino
Author

@SteadyCadence, we need to check whether we need this landed urgently or whether we can wait for a broader solution. Can you confirm whether everyone on the partner's team can see their data properly with just the local fonts installed?

@clash99
Contributor

clash99 commented Jun 8, 2017

@SteadyCadence

The font is looking good on staging. It's working!!

Request: We need the KNU and Padauk fonts to be LARGER. Like size 16 or more. It's super hard to read.

@clash99
Contributor

clash99 commented Jun 13, 2017

@SteadyCadence Since we aren't using the lang tag to change the font of the whole platform (just the project data fields), there isn't a quick solution to this. I will look into this more but it will take more time.

@wonderchook
Contributor

@SteadyCadence for the time being, are you having them enlarge the text in the browser as a workaround?

@SteadyCadence

SteadyCadence commented Jun 16, 2017 via email

@clash99 clash99 closed this as completed Jun 26, 2017
7 participants