This repository has been archived by the owner on May 6, 2021. It is now read-only.

Approximate English translations of inflected forms #81

Open
eddieantonio opened this issue Oct 3, 2018 · 1 comment
Labels
enhancement New feature or request

Comments

@eddieantonio
Member

eddieantonio commented Oct 3, 2018

This seems to be a much-requested feature! We may be able to provide approximate English translations of inflected wordforms.

e.g.,

"nimowâw" — "I eat s.o. (animate)"
"kimowâw" — "You eat s.o. (animate)"
"mowêw" — "s/he eats s.o. (animate)"

... with a BIG caveat that usage may not be the same as it is in English (some words are only used as questions, some words are only used by members of a specific gender). It's really hard to get 100% right, but fairly doable for the 20% that matters.

@eddieantonio added the "help wanted" and "enhancement" labels and removed the "help wanted" label on Oct 3, 2018
@aarppe
Contributor

aarppe commented Dec 18, 2018

I've done some gawk scripting with Wolvengrey's dictionary, where we have parsed the English glosses. In that case, it does seem that we could actually provide English translations with some clever scripting plus a listing (or FST generation) of the inflected forms of the nouns and verbs in the dictionary sources (totally doable).

However, doing this in general would likely require adding translation templates to the *.paradigm files. For each paradigm cell, a template would indicate which inflected English noun/verb form is required in that slot, with the rest of the context generated dynamically from the POS coding (e.g. s/he_PRON_SUBJ would be replaced by the subject pronoun indicated for that cell, s.o._PRON by the object pronoun for that cell, etc.).

The additional information in the *.paradigm files could be something like the following for TA verbs:

{{ lemma }}+V+TA+Ind+Prs+2Sg+1SgO: SUBJ:{you (one)}+OBJ:{me}+PRED:V+Pres
{{ lemma }}+V+TA+Ind+Prs+3Sg+4Sg/PlO: SUBJ:{s/he}+OBJ:{someone else}+PRED:V+Pres+3Sg

{{ lemma }}+V+TA+Ind+Prt+2Sg+1SgO: SUBJ:{you (one)}+OBJ:{me}+PRED:V+Past
{{ lemma }}+V+TA+Ind+Prt+3Sg+4Sg/PlO: SUBJ:{s/he}+OBJ:{someone else}+PRED:V+Past
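
To make the proposed format concrete, here is a minimal Python sketch of how one such template line might be read into a structured record. Nothing here is existing project code: the names `CellTemplate`, `TEMPLATE_RE`, and `parse_template_line` are hypothetical, and the sketch assumes every line has exactly the `analysis: SUBJ:{...}+OBJ:{...}+PRED:...` shape shown above.

```python
import re
from typing import NamedTuple, Tuple


class CellTemplate(NamedTuple):
    """One paradigm cell plus its English translation template (hypothetical)."""
    analysis: str               # e.g. "{{ lemma }}+V+TA+Ind+Prs+3Sg+4Sg/PlO"
    subject: str                # English subject pronoun for this cell
    object: str                 # English object pronoun for this cell
    pred_tags: Tuple[str, ...]  # e.g. ("V", "Pres", "3Sg")


# Assumed layout of the right-hand side: SUBJ:{...}+OBJ:{...}+PRED:tag+tag+...
TEMPLATE_RE = re.compile(
    r"SUBJ:\{(?P<subj>[^}]*)\}\+OBJ:\{(?P<obj>[^}]*)\}\+PRED:(?P<pred>\S+)"
)


def parse_template_line(line: str) -> CellTemplate:
    """Split 'analysis: spec' and pull the subject, object, and predicate tags."""
    analysis, spec = line.strip().split(": ", 1)
    match = TEMPLATE_RE.fullmatch(spec.strip())
    if match is None:
        raise ValueError(f"unrecognized template: {spec!r}")
    return CellTemplate(
        analysis=analysis,
        subject=match["subj"],
        object=match["obj"],
        pred_tags=tuple(match["pred"].split("+")),
    )


cell = parse_template_line(
    "{{ lemma }}+V+TA+Ind+Prs+3Sg+4Sg/PlO: "
    "SUBJ:{s/he}+OBJ:{someone else}+PRED:V+Pres+3Sg"
)
```

Keeping the predicate tags as a tuple leaves room for extra agreement or tense tags without changing the parser.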

Then, the English glosses which have been POS-tagged will be edited on the fly with the above information, e.g. for wâpamêw:

wâpamêw: s/he_PRON_SUBJ sees#see_V s.o._PRON, s/he_PRON_SUBJ witnesses#witness_V s.o._PRON
s/he_PRON_SUBJ <- SUBJ:{...}
s.o._PRON <- OBJ:{...}
lemma_V <- lemma:PRES{...}
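
The substitution described above could look roughly like the following Python sketch. It is only an illustration under stated assumptions: `render_gloss`, `inflect`, and `IRREGULAR_PAST` are hypothetical names, the gloss is assumed to be tagged exactly as in the wâpamêw example (`_PRON_SUBJ`, `_PRON`, and `surface#lemma_V`), and the English inflection is a toy that covers regular verbs plus a tiny hand-written irregular table.

```python
# Toy table of irregular English past forms (assumption: a real system would
# need a much larger list or a proper English morphological generator).
IRREGULAR_PAST = {"see": "saw", "eat": "ate"}


def inflect(lemma: str, tense: str, third_singular: bool) -> str:
    """Very rough English verb inflection for 'Pres' and 'Past'."""
    if tense == "Past":
        if lemma in IRREGULAR_PAST:
            return IRREGULAR_PAST[lemma]
        return lemma + ("d" if lemma.endswith("e") else "ed")
    if third_singular:
        sibilant = lemma.endswith(("s", "sh", "ch", "x", "z"))
        return lemma + ("es" if sibilant else "s")
    return lemma


def render_gloss(tagged_gloss: str, subject: str, obj: str,
                 tense: str, third_singular: bool = False) -> str:
    """Fill the pronoun slots and re-inflect the verb in each sense of a gloss."""
    rendered = []
    for sense in tagged_gloss.split(","):  # senses are comma-separated
        words = []
        for token in sense.split():
            if token.endswith("_PRON_SUBJ"):
                words.append(subject)      # s/he_PRON_SUBJ <- SUBJ:{...}
            elif token.endswith("_PRON"):
                words.append(obj)          # s.o._PRON <- OBJ:{...}
            elif token.endswith("_V") and "#" in token:
                lemma = token.split("#", 1)[1][: -len("_V")]
                words.append(inflect(lemma, tense, third_singular))
            else:
                words.append(token)        # untagged words pass through
        rendered.append(" ".join(words))
    return ", ".join(rendered)


gloss = ("s/he_PRON_SUBJ sees#see_V s.o._PRON , "
         "s/he_PRON_SUBJ witnesses#witness_V s.o._PRON")
present = render_gloss(gloss, "s/he", "someone else", "Pres", third_singular=True)
past = render_gloss(gloss, "s/he", "someone else", "Past")
```

Working from the lemma after `#` rather than from the surface form means the verb can be re-inflected for any cell, whatever form the dictionary gloss happened to use.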

The other verbs should be simpler cases of the above. Special cases are reflexives, future conditionals, and infinitives.
