Japanese transliteration / romanization #46

cryptoquick · 2015-01-14T10:10:56Z

Japanese can be transliterated or romanized pretty effectively with software. That would be a killer feature for my future needs, as my company has several sites that will include URLs for stores with Japanese titles.

pid · 2015-01-20T20:34:40Z

I took a look at the issue -> "This chart shows in full the three main systems for the romanization of Japanese: Hepburn, Nihon-shiki and Kunrei-shiki:"
What's the way to go here? Three types of transliteration? Is there a focus on one?

Is this a good choice? -> http://www.translitteration.com/transliteration/en/japanese/iso-3602-kunrei-shiki/
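
For reference, the three systems agree on most kana and differ on only a handful. A minimal sketch of those differences follows; the table and the `romanizeKana` helper are illustrative names made up for this example, not part of SpeakingURL or any other library.

```javascript
// Kana on which the three systems disagree.
// Column order: [Hepburn, Nihon-shiki, Kunrei-shiki].
const DIFFERENCES = {
  'し': ['shi', 'si', 'si'],
  'ち': ['chi', 'ti', 'ti'],
  'つ': ['tsu', 'tu', 'tu'],
  'ふ': ['fu', 'hu', 'hu'],
  'じ': ['ji', 'zi', 'zi'],
  'ぢ': ['ji', 'di', 'zi'],
  'づ': ['zu', 'du', 'zu'],
};

const SYSTEMS = ['hepburn', 'nihon-shiki', 'kunrei-shiki'];

// Hypothetical helper: look up one kana in the chosen system.
// Returns undefined for kana that are not in the table above.
function romanizeKana(kana, system) {
  const index = SYSTEMS.indexOf(system);
  if (index === -1) {
    throw new Error(`Unknown system: ${system}`);
  }
  const row = DIFFERENCES[kana];
  return row ? row[index] : undefined;
}
```

Hepburn follows English spelling conventions most closely, which matters if the slugs are meant to be read by people who don't know Japanese.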

cryptoquick · 2015-01-21T01:16:38Z

I've been experimenting with this, and Hepburn is definitely the way to go, but unfortunately, there are two cases we'd need to handle here.

The hepburn library can detect whether the string needs to be altered, using containsKana. Then, using fromKana, we get the desired romaji.
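
To show what those two steps involve, here is a rough standalone sketch. It is not how the hepburn library is implemented: `hasKana`, `toRomaji` and the tiny lookup table are hypothetical, and the table covers only the few katakana needed for the example word.

```javascript
// Hiragana occupies U+3040-U+309F and katakana U+30A0-U+30FF.
const KANA_PATTERN = /[\u3040-\u309F\u30A0-\u30FF]/;

// Deliberately tiny table: just enough for the example word.
const KATAKANA_TO_ROMAJI = { 'デ': 'de', 'パ': 'pa', 'ト': 'to' };

// Hepburn writes long vowels with a macron.
const MACRONS = { a: 'ā', i: 'ī', u: 'ū', e: 'ē', o: 'ō' };

// U+30FC, the katakana long-vowel mark.
const LONG_VOWEL_MARK = '\u30FC';

function hasKana(str) {
  return KANA_PATTERN.test(str);
}

function toRomaji(str) {
  let out = '';
  for (const ch of str) {
    if (ch === LONG_VOWEL_MARK && out.length > 0) {
      // Lengthen the vowel that was just emitted.
      const last = out[out.length - 1];
      out = out.slice(0, -1) + (MACRONS[last] || last);
    } else {
      // Characters missing from the table pass through unchanged.
      out += KATAKANA_TO_ROMAJI[ch] || ch;
    }
  }
  return out;
}
```

With this sketch, `toRomaji('/store/デパート/12345')` produces `/store/depāto/12345`. The macron is still there, so the result is not yet ASCII.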

However, another problem is that not all Japanese is in kana; there's also Kanji. The best solution I can come up with for that is ENAMDICT.

With both, we should have good coverage. It might also be prudent to get the hiragana from ENAMDICT first, then pass that to Hepburn.
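
A sketch of that first step, kanji to hiragana, might look like the following. The `READINGS` map is a hypothetical stand-in for a real ENAMDICT lookup (I'm not assuming anything about how the dictionary would actually be loaded or queried), and `kanjiToHiragana` is a made-up name.

```javascript
// Hypothetical stand-in for an ENAMDICT lookup: kanji name -> hiragana reading.
const READINGS = new Map([
  ['東京', 'とうきょう'],
  ['大阪', 'おおさか'],
]);

// Runs of characters in the CJK Unified Ideographs block (U+4E00-U+9FFF).
const KANJI_RUN = /[\u4E00-\u9FFF]+/g;

// Replace each run of kanji with its reading when the dictionary knows it.
// Unknown runs are left as they are so the caller can decide what to do.
function kanjiToHiragana(str, readings = READINGS) {
  return str.replace(KANJI_RUN, (run) => readings.get(run) || run);
}
```

The hiragana that comes out can then go through the same kana-to-romaji step as any other kana input. Matching whole runs is the simplest approach, but it only finds names that appear in the dictionary exactly as written.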

We also want to be careful to pass the result of these techniques on to SpeakingURL, in case romanization introduces any characters with marks, diacritics, accents or long-vowel macrons that should be converted to ASCII.
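
As an illustration of why that last pass matters: Hepburn output can contain macrons such as `ō`. SpeakingURL has its own character handling, so the snippet below is not what it does internally. It is just a generic sketch using Unicode normalization, with `stripDiacritics` being a made-up name.

```javascript
// Decompose each accented letter into base letter + combining mark (NFD),
// then drop the combining marks (U+0300-U+036F).
function stripDiacritics(str) {
  return str.normalize('NFD').replace(/[\u0300-\u036F]/g, '');
}
```

For example, `stripDiacritics('depāto')` gives `depato`.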

cryptoquick · 2015-01-21T01:38:24Z

Also, I have this code written for my own slugifier based on SpeakingURL. It may be of some help.

And this is the test, but note, this only works for Kana, not for Kanji. You'll want to get it working for both, and that's the tricky part.

slugify-spec.coffee

  it "should romanize japanese characters", ->
    url_with_japanese = "/store/デパート/12345"
    romanized_url = "/store/depato/12345"
    expect(slugify(url_with_japanese)).toEqual(romanized_url)

slugify.coffee

hepburn = require 'hepburn'
getSlug = require 'speakingurl'

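module.exports = (str) ->
  # Reject anything that isn't a string up front.
  if typeof str isnt 'string'
    throw new Error "Slugify Error: Wrong type passed for value: #{str}"

  # Romanize only when the input actually contains kana.
  if hepburn.containsKana str
    str = hepburn.fromKana str

  # SpeakingURL does the rest; `uric: yes` keeps URI characters such as "/".
  getSlug str,
    uric: yes
    custom:
      '&': 'and'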

martindale · 2015-02-25T01:16:39Z

+1 on this issue. I'll need this for soundtrack.io.

pid · 2015-02-28T10:44:33Z

Due to lack of time, this issue is still unfinished.
I still hope that someone will send a pull request :-) Come on ;-)

h2non · 2015-04-03T01:09:25Z

+1

iliakan · 2018-10-19T15:21:35Z

+5

pid added the enhancement label Feb 28, 2015
