Is Russian to ASCII correct? #3

Closed
voku opened this issue Sep 14, 2019 · 17 comments

Comments

@voku
Owner

voku commented Sep 14, 2019

Example: Is this correct?

"лысая гора" -> "lysaja gora"


Could a native speaker please check the character replacements? Thanks.

https://github.com/voku/portable-ascii/blob/master/src/voku/helper/data/ascii_by_languages.php#L468

@kalessil

http://translit.cc/ to the rescue: this specific case will be "lysaja gora".
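For readers following along, the character replacement being checked here can be sketched as a per-character lookup. This is only an illustration in Python, not the portable-ascii implementation: `TABLE` and `transliterate` are hypothetical names, and the table covers just the letters of the example string, with values taken from the "lysaja gora" result quoted above.

```python
# Hypothetical partial table: only the letters needed for the example string.
# Values follow the "lysaja gora" result quoted in this thread.
TABLE = {
    "л": "l", "ы": "y", "с": "s", "а": "a",
    "я": "ja", "г": "g", "о": "o", "р": "r",
}


def transliterate(text, table):
    """Replace each character via the table; unmapped characters pass through."""
    return "".join(table.get(ch, ch) for ch in text)


print(transliterate("лысая гора", TABLE))  # lysaja gora
```

A real table would need all 33 letters plus uppercase forms; the pass-through for unmapped characters is a design choice of this sketch, not something the thread specifies.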

@voku
Owner Author

voku commented Sep 14, 2019

Thanks @kalessil, I have merged the suggestions from "translit.ru" and "translit.cc" here: https://github.com/voku/portable-ascii/blob/master/src/voku/helper/data/ascii_by_languages.php#L468

@voku changed the title from "Is russian to ASCII correct?" to "Is Russian to ASCII correct?" Sep 14, 2019
@samdark

samdark commented Sep 14, 2019

There are multiple ways to transliterate, and there are many GOSTs (state standards). Using the one for passports would produce "lysaia gora".

@samdark

samdark commented Sep 14, 2019

@Hlaford

Hlaford commented Sep 16, 2019

> There are multiple ways to transliterate, and there are many GOSTs (state standards). Using the one for passports would produce "lysaia gora".

I would use this one, actually. It is GOSTed (i.e. standardized).

@samdark

samdark commented Sep 16, 2019

@Hlaford the problem is that there are multiple standards that are not deprecated.

@GrahamCampbell
Contributor

Do different standards all have the same language code?

@samdark

samdark commented Sep 16, 2019

Yes.

@GrahamCampbell
Contributor

Hmm, I guess we just need to choose a standard.

@samdark

samdark commented Sep 16, 2019

No, that won't work well. If it's something like generating slugs for an article title, it doesn't matter much which standard is used. If it's passport names, your users may get in trouble if you aren't using the correct standard. If your service is exchanging info with the state road police, it matters to use the correct standard for road signs and street names.
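To illustrate the slug case: a rough Python sketch, assuming two hypothetical partial tables that differ only in how "я" is rendered ("ja" versus the passport-style "ia" mentioned earlier in this thread). The names `SCHOLARLY`, `PASSPORT`, and `slugify` are made up for this example and are not part of portable-ascii.

```python
import re

# Hypothetical partial tables covering only the letters of the example title.
_COMMON = {"л": "l", "ы": "y", "с": "s", "а": "a", "г": "g", "о": "o", "р": "r"}
SCHOLARLY = {**_COMMON, "я": "ja"}  # yields "lysaja gora"
PASSPORT = {**_COMMON, "я": "ia"}   # yields "lysaia gora"


def slugify(title, table):
    """Lowercase, transliterate, then collapse non-alphanumerics into hyphens."""
    latin = "".join(table.get(ch, ch) for ch in title.lower())
    return re.sub(r"[^a-z0-9]+", "-", latin).strip("-")


print(slugify("Лысая гора", SCHOLARLY))  # lysaja-gora
print(slugify("Лысая гора", PASSPORT))   # lysaia-gora
```

Either result is a usable URL slug, which is why the choice of standard matters little here, unlike in the passport case.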

@samdark

samdark commented Sep 16, 2019

So the solution would be to allow choosing a standard.
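Letting the caller choose could look roughly like the Python sketch below. Everything in it is an assumption for illustration: the `STANDARDS` registry, the keys `"translit_cc"` and `"passport_2013"`, and the `to_ascii` function are hypothetical and do not describe the library's actual API. The tables cover only the letters of the example string.

```python
# Hypothetical registry of transliteration tables, keyed by standard name.
_COMMON = {"л": "l", "ы": "y", "с": "s", "а": "a", "г": "g", "о": "o", "р": "r"}
STANDARDS = {
    "translit_cc": {**_COMMON, "я": "ja"},
    "passport_2013": {**_COMMON, "я": "ia"},
}


def to_ascii(text, standard):
    """Transliterate text with the table registered under the given standard."""
    try:
        table = STANDARDS[standard]
    except KeyError:
        # Fail loudly instead of silently falling back to another standard.
        raise ValueError(f"unknown transliteration standard: {standard!r}") from None
    return "".join(table.get(ch, ch) for ch in text)


print(to_ascii("лысая гора", "translit_cc"))    # lysaja gora
print(to_ascii("лысая гора", "passport_2013"))  # lysaia gora
```

Raising on an unknown name, rather than defaulting, is a deliberate choice in this sketch: for passports or road signs a silent fallback to the wrong standard would be worse than an error.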

@kalessil

+1 for "So solution would be allowing to choose a standard."

@voku
Owner Author

voku commented Sep 16, 2019

@samdark OK, so which standards does a normal application or user need?

"Passport (2013)" + "GOST 7.79-2000(B)"? https://en.m.wikipedia.org/wiki/Romanization_of_Russian#content-collapsible-block-1

@samdark

samdark commented Sep 16, 2019

Yes. That would be enough for many typical cases.

@voku
Owner Author

voku commented Sep 18, 2019

@samdark I added mappings for "ru__passport_2013" and "ru__gost_2000_b" (copied and pasted from Wikipedia).

Can you please take a look? Here is a test case for the string "лысая гора": a2b1920

Mappings:

@samdark

samdark commented Sep 18, 2019

Looks OK to me.

@voku
Owner Author

voku commented Sep 18, 2019

Thank you for the review & feedback. 👍


5 participants