Auto-transcribe sentences in Lingua Franca Nova's Latin orthography to Cyrillic, and vice versa #1958

conlangbecca · 2019-09-24T22:15:35Z

Lingua Franca Nova has two orthographies, one Latin and one Cyrillic, with a one-to-one correspondence between letters in each orthography. A table can be found here: http://www.elefen.org/vici/gramatica/en/spelling_and_pronunciation

The practice up to now has been to contribute in both orthographies wherever possible, but I feel this is inefficient and ignores that there are already systems in place for auto-transcribing between Chinese Simplified and Chinese Traditional, among other orthographies. Could such a system be added for LFN? Thank you.

jiru · 2019-09-26T04:10:54Z

Hello conlangbecca, thank you for your request. From the information you gave, such an autotranscription system looks quite feasible in Tatoeba. To get this done, we will need your help as described in the wiki. Please follow the instructions there and get back to us with the relevant data about Latin/Cyrillic transcription in LFN.

jiru · 2020-04-26T20:20:36Z

Following the link you provided, I was able to write a simple conversion algorithm and check it against a few LFN sentence pairs on Tatoeba that were contributed in Latin and Cyrillic. Among these, I found a few sentences where my algorithm gets a different result than what was contributed on Tatoeba:

Sentence	Transcription on Tatoeba	Transcription by algorithm
5454279	Chelsea ес ентре ла дистритос ла плу модоса де Manhattan, е суа барес е ресторантес ес комун фолида а финис де семана.	Кхелсеа ес ентре ла дистритос ла плу модоса де Манхаттан, е суа барес е ресторантес ес комун фолида а финис де семана.
5459786	«Ме ес тристе,» ел иа дисе. «Ло ес ун фарса гранде, ун менти гранде, ун пиротекникал гранде. Ло куал авени но депреса ме, ма симпле ло мотива ме а апаре е парла плу.»	"Ме ес тристе," ел иа дисе. "Ло ес ун фарса гранде, ун менти гранде, ун пиротекникал гранде. Ло куал авени но депреса ме, ма симпле ло мотива ме а апаре е парла плу."
5441679	Ла аутор де ла либро "Ла еволуи – имажес де носа жовениа", Емиле Де Кооман, меа фрате, ес гравор е пинтор.	Ла аутор де ла либро "Ла еволуи – имажес де носа жовениа", Емиле де Кооман, меа фрате, ес гравор е пинтор.
8657973	Лаила ес ун традуор.	Лаyла ес ун традуор.
5623833	Christoph Schlütermann, ун лаборор пер ла Крус Рожа, иа дескриве ел комо "тан диференте де ла отрас – мулте нонкапас де ата син аида".	Кхристопх Скхлüтерманн, ун лаборор пер ла Крус Рожа, иа дескриве ел комо "тан диференте де ла отрас – мулте нонкапас де ата син аида".
5500486	Есперанто ес ун лингуа куал он дебе апренде, уса, рекорда, парла, дифуса, асета, скрибе, деже, трансмете.	Есперанто ес ун лингуа куал он дебе апренде, уса, рекорда, парла, дифуса, асета, скрибе, леже, трансмете.
5543582	"Нос иа аве но темпо," Кеllnеr иа есплика, "донке ме иа коре пос ел пер киса 15 метрес о симил. Ун де меа амис ес анке ун полисиор, донке нос иа саиси ла ом. Ел иа атента еваде, донке нoс иа тени плу форте ел."	"Нос иа аве но темпо," Kеллнер иа есплика, "донке ме иа коре пос ел пер киса 15 метрес о симил. Ун де меа амис ес анке ун полисиор, донке нос иа саиси ла ом. Ел иа атента еваде, донке нос иа тени плу форте ел."

Based on http://www.elefen.org/vici/gramatica/en/spelling_and_pronunciation The included tests use existing LFN sentence pairs picked at random from Tatoeba. The failing tests show edge cases that require either refining the algorithm or rethinking the conversion rules. Refs #1958.
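
For readers following along, here is a minimal sketch of what a letter-for-letter conversion and the comparison against contributed pairs could look like. It is not the code from the commit referenced above. The names `to_cyrillic`, `to_latin` and `mismatches` are hypothetical, and the letter pairs are read off the transcriptions quoted in this thread, except x → ш and z → з, which are assumed from the linked spelling table.

```python
# Lowercase letter pairs. Most can be read off the sentence pairs quoted in
# this thread; x -> ш and z -> з are assumed from the linked spelling table.
LATIN = "abcdefghijlmnoprstuvxz"
CYRILLIC = "абкдефгхижлмнопрстувшз"


def _build_table(source: str, target: str) -> dict:
    """Build a str.translate table covering lower- and uppercase letters."""
    pairs = dict(zip(source, target))
    pairs.update(zip(source.upper(), target.upper()))
    return str.maketrans(pairs)


TO_CYRILLIC = _build_table(LATIN, CYRILLIC)
TO_LATIN = _build_table(CYRILLIC, LATIN)


def to_cyrillic(text: str) -> str:
    # Characters outside the table (k, q, w, y, ü, digits, punctuation)
    # pass through unchanged.
    return text.translate(TO_CYRILLIC)


def to_latin(text: str) -> str:
    # Reverse direction; Latin characters already present are left as-is.
    return text.translate(TO_LATIN)


def mismatches(pairs):
    """Yield (latin, contributed, generated) where the Cyrillic forms differ."""
    for latin, contributed in pairs:
        generated = to_cyrillic(latin)
        if generated != contributed:
            yield latin, contributed, generated
```

Because unmapped characters pass through, an input such as "Layla es un traduor." keeps its Latin "y", which is the kind of difference listed in the table above. Whether such letters should instead be transcribed phonetically is a rule decision, not something this sketch settles.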

conlangbecca · 2020-09-22T13:54:57Z

Cyrillic transcriptions might not always be one-for-one with proper names, because sometimes people opt for a phonetic transcription into actual LFN phonology rather than a strict letter-for-letter transcription. The automatic transcription, though, should do the letter-for-letter transcription, which is completely acceptable.

jiru · 2020-10-06T14:53:20Z

Thanks for clarifying, @conlangbecca.

On a side note, I am mentioning #770 (and #76) because we have a number of LFN sentences that will become duplicates when this issue is solved as they will only differ in terms of script.

trang added the enhancement label (an issue that requires a change in Tatoeba's current functionality) on Oct 11, 2019
