Add priority popup keys to it, pms and tr #824

Closed
wants to merge 1 commit into from


glemco
Contributor

@glemco glemco commented May 31, 2024

This PR adds the priority popup characters for Italian, Piedmontese and Turkish, based on which characters with diacritics are used in each language.

The following screenshots show it in action: with all hint sources disabled except the Language (priority) one, we can see that the desired characters are correctly displayed as an overlay.

Perhaps a stupid question, but while playing with the files I tried changing the extra_keys, thinking it was related. Apparently it isn't.
What is that section for?

@Helium314
Owner

Extra keys are the keys that are added in the default layout when it ends in +. I did this because there are a lot of languages that use an existing qwerty/qwertz/azerty layout and add keys on the right side. These keys on the right side are the extra keys (see layouts.md).
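
As a rough illustration of the description above, here is a minimal Python sketch of appending extra keys to the right side of a base layout when the layout name ends in `+`. This is not the app's actual code: `BASE_LAYOUTS`, `build_layout`, the per-row shape of `extra_keys`, and the German example keys are all assumptions made for the example.

```python
# Hypothetical sketch: names and data shapes are illustrative, not the app's real code.
BASE_LAYOUTS = {
    "qwertz": [list("qwertzuiop"), list("asdfghjkl"), list("yxcvbnm")],
}


def build_layout(layout_name, extra_keys):
    """Return the letter rows for layout_name.

    If the name ends in '+', the per-row extra_keys are appended on the
    right side of each row; otherwise the base layout is returned as-is.
    """
    use_extra = layout_name.endswith("+")
    base = BASE_LAYOUTS[layout_name.rstrip("+")]
    rows = [row.copy() for row in base]  # never mutate the shared base layout
    if use_extra:
        for row, extras in zip(rows, extra_keys):
            row.extend(extras)
    return rows


# Assumed example: extra keys for a German-style locale, one list per row.
german_extra = [["ü"], ["ö", "ä"], []]
with_extra = build_layout("qwertz+", german_extra)
without_extra = build_layout("qwertz", german_extra)
```

In this sketch the extra keys only affect the visible rows; they are independent of which characters appear in a key's popup, which would match the observation that changing extra_keys does not change the priority popup keys.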

I am currently revising the locale key texts, so I would like to avoid PRs on this until #659 is done (title is misleading, it's about removing all diacritics not specific to this locale).
Do you think the priority keys should still be defined in the locale key texts after #659 is done? I am considering removing the possibility of using % for this, and instead declaring all keys in the locale key texts as priority.


Btw I didn't find a good source for which diacritics are necessary for Piedmontese. My usual source only has a relatively short list of 30k sentences from Wikipedia.
From your priority list, I would guess the letters to add are à, è, é, ë, ì, ö, ò, ü, ù.
But I checked the use counts, and ü and ö only have 95 and 85 uses respectively in the source list, while the unlisted ó has 371.

What would you recommend here?
Wikipedia is not really a great source for this sort of estimation, e.g. ė has 61 uses because there are many mentions of Lithuanian cities.
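
The kind of use count mentioned above can be sketched in a few lines of Python. This is only an assumed approach, not the script actually used: `count_diacritics` is a hypothetical helper, and the sample lines are made up to show how a place name can inflate the count of a letter that the language itself does not use.

```python
from collections import Counter


def count_diacritics(sentences):
    """Count every non-ASCII letter across an iterable of sentences."""
    counts = Counter()
    for sentence in sentences:
        for ch in sentence.lower():
            # Keep letters only, and skip plain a-z.
            if ch.isalpha() and not ch.isascii():
                counts[ch] += 1
    return counts


# Made-up sample lines, not real corpus data.
sample = ["ò ò ó", "Kėdainiai ü"]
counts = count_diacritics(sample)
for letter, uses in counts.most_common():
    print(f"{letter} has {uses} uses")
```

A count like this says nothing about whether a letter belongs to the language, so rare letters still need a manual check against the writing guidelines.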

@glemco
Contributor Author

glemco commented Jun 1, 2024

Mmh sure, makes sense. I'm not sure what the consequence of removing the % logic would be: would that imply all diacritics ever reachable via popup are going to be priority?
My main goal in opening this PR was to have the diacritics as the first suggestion for popups in the above languages. I got that by enabling the Language source (since Language (priority) was empty without the changes in this PR), but that also enabled them for English (which has no diacritics, although one may want to type letters like ñ or ø every now and then).
I find Language (priority) powerful because it tells which diacritics are definitely needed, while the layout may still allow typing others. I don't fully get whether your proposed change would still allow this.


Regarding Piedmontese, the story is a little complicated: it's a language that is mostly spoken, and many speakers don't know exactly how to write it.
The standard writing system is kinda closer to French, this is the one strictly used by Wikipedia. As an attempt to make it easier for Italian speakers (the vast majority of Piedmontese speakers) a foundation created a different writing system (using among others letters like ö and ü in place of eu and u).

Now that foundation's website is about the most advanced Piedmontese source you can find on the web. I asked them for their dataset, and it seems Microsoft did the same for Swiftkey.
To create the dictionary, I combined their dataset and just about everything I could find on their website, converting as much as I could to their writing system (with their permission and license, of course).
Now, it's still likely the dictionary is a little inconsistent (not everything can be inferred that easily).

That said, the letters that I put as priority are the ones mostly used in that writing system, but I left also others (ó for instance, but I might have picked the wrong direction of the accent, my bad) belonging to the other, just in case.

Considering the missing standardisation of the language writing, I don't think it's a big deal, but it's probably good if I check again the dataset.
I would say that ö and ü should have the priority over the others appearing on o and u. But don't take my word for now, I should adjust the dictionary and check more consistently.
Regarding e, we have several (I'd say é being the most common) but does it make a difference between the second and third more common or is the keyboard going to show only the first?

That said, we can close this PR or just keep it open for me to leave more details after I check the dictionary, I assume all changes here will not be needed the way they are anyway.

@Helium314
Owner

> would that imply all diacritics ever reachable via popup are going to be priority

The diacritics for the enabled locale(s) would be priority, others added via the show more letters with diacritics setting would be non-priority.
I think that matches what you want?

Main changes compared to your proposed priority keys is that for tr, î and û would also be priority keys.

> but I might have picked the wrong direction of the accent

Certainly not, ò has 17k uses vs the 371 of ó. I just wanted to mention it because it has higher usage than ü and ö.
But with your explanation I see that it does not really matter.
Your proposed priority keys for Piedmontese should be fine then, I'm not qualified to comment on this anyway.

> does it make a difference between the second and third more common or is the keyboard going to show only the first?

The first is going to be shown as hint, otherwise it does not matter. All letters are shown in the popup.
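
A minimal sketch of that behaviour, assuming the popup is simply the priority keys followed by any remaining letters: `build_popup` and `hint_for` are hypothetical names, and placing the non-priority letters after the priority ones is an assumption for illustration, not confirmed app behaviour.

```python
def build_popup(priority, more):
    """Priority keys first, then any remaining letters, without duplicates."""
    popup = list(priority)
    popup += [key for key in more if key not in priority]
    return popup


def hint_for(popup):
    """Only the first popup key is shown as the hint on the key itself."""
    return popup[0] if popup else None


# Priority keys for 'e' as proposed in this thread; the 'more' letters are made up.
popup_e = build_popup(["é", "ë", "è"], ["è", "ê", "ē"])
hint_e = hint_for(popup_e)
```

Here swapping ë and è would change nothing visible on the key: the hint stays é and the popup still contains all letters.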

> That said, we can close this PR or just keep it open for me to leave more details after I check the dictionary, I assume all changes here will not be needed the way they are anyway.

I'd like to keep it open as a reminder, at least until I update #659

@glemco
Contributor Author

glemco commented Jun 1, 2024

> The diacritics for the enabled locale(s) would be priority, others added via the show more letters with diacritics setting would be non-priority.
> I think that matches what you want?

Yeah seems good actually, thanks!

> Main changes compared to your proposed priority keys is that for tr, î and û would also be priority keys.

Well, I'm no native speaker and have never come across those two (â is also rather rare), but indeed they seem to exist.

> Your proposed priority keys for Piedmontese should be fine then, I'm not qualified to comment on this anyway.

For now I'd say yes, but I'd definitely need to clean it up a little: I made the dictionary a few years ago and didn't bother much. If I see it doesn't fit, I can submit another PR after you refactor the whole thing.

Anyway, thanks for your quick support and keep it up!

@glemco
Contributor Author

glemco commented Jun 5, 2024

I checked the pms dictionary and the writing guidelines again, and the dictionary seems relatively accurate. According to the dictionary, these are the occurrence counts for the diacritics:
ü has 11754 occurrences
ì has 2912 occurrences
à has 2369 occurrences
ö has 1501 occurrences
ë has 1469 occurrences
é has 1416 occurrences
ò has 259 occurrences
è has 49 occurrences
ù has 9 occurrences

So I'd keep the order I mentioned before (regarding e, it doesn't change much but é feels more familiar and I'd put it first):
a -> à
e -> é ë è
i -> ì
o -> ö ò
u -> ü ù
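
For comparison, here is a small Python sketch that derives an order purely from the counts above by grouping each letter under its base letter. `base_letter` and `order_by_frequency` are hypothetical helpers written for this example, and the sketch assumes the letters are stored in precomposed (NFC) form.

```python
import unicodedata

# Occurrence counts from the pms dictionary, as listed above.
COUNTS = {
    "ü": 11754, "ì": 2912, "à": 2369, "ö": 1501, "ë": 1469,
    "é": 1416, "ò": 259, "è": 49, "ù": 9,
}


def base_letter(ch):
    """Strip the diacritic: NFD puts the base letter first, e.g. 'ü' -> 'u'."""
    return unicodedata.normalize("NFD", ch)[0]


def order_by_frequency(counts):
    """Group letters by base letter, most frequent first within each group."""
    groups = {}
    for ch, _ in sorted(counts.items(), key=lambda item: -item[1]):
        groups.setdefault(base_letter(ch), []).append(ch)
    return groups


by_freq = order_by_frequency(COUNTS)
for base in "aeiou":
    print(base, "->", " ".join(by_freq[base]))
```

Sorting by count alone puts ë slightly ahead of é, so the list above with é first is a deliberate choice based on familiarity rather than a strict frequency order. The other groups come out the same either way.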

@Helium314
Owner

Great, this fits with the preliminary list I have in #659 (61eed81)

@glemco
Contributor Author

glemco commented Jul 14, 2024

Closing since #659 was merged

@glemco glemco closed this Jul 14, 2024