Coordination in Kazakh and Basque #189
Comments
This is an interesting point. I somewhat sympathize with making the morphologically relevant conjunct the head and saying that this is required by the specific nature of the language. The only potential issue I see is that it may make it more difficult to write transformations that convert coordination to a different style, which is something people may want to do for parsing experiments etc. If we keep it unified, i.e. all coordinations in all languages are headed by the first conjunct, you can still easily and deterministically access the relevant morphology, by traversing one |
Not surprisingly, this issue is also relevant to Turkish. I just want to add another example to support the case. Besides the postpositions mentioned in the original post, there are other coordinated constructions that are affected. In Turkish, quite a few suffixes are added only to the final conjunct (I believe this is called delayed suffixation). So, marking the final conjunct as the head would be more natural. Here are two related example sentences:
The point is that the tense and person information is present only on the last conjunct. So, it makes more sense for it to be the head and to have dependency relations with the rest of the sentence. For example, if there were an overt subject, it would have to agree with the agreement marker on the last conjunct. The other coordinated clauses do not even have finite predicates in these examples. The METU-Sabancı treebank also marks the last conjunct as the head. I would also be very much in favor of allowing language-specific decisions on the head of coordinated structures. After all, the choice of head direction in other constructions is language-specific, and I cannot think of a reason for coordination to be treated differently. |
Just minor clarifications. The line "1 Мексика Мексика nom 0 nmod" should be "1 Мексика Мексика nom 5 conj" (not nmod). |
The first example should read:
e.g. Everything aside from the postposition attaches to "Мексика", the first item in the list. And the second example should read:
e.g. Everything attaches to Германиядан, including the postposition. Thanks for spotting those errors. Usually I draw ASCII art :) [1] |
This is an important issue that has also come up for Hungarian, I think, where there are cases in which morphological agreement is expressed only on the second (or last) conjunct, indicating that this is the head of the coordination. It is also related to the issue of what should be the head in complex names, where the official guidelines say the first element, but many languages have good arguments for saying that it should be the last (typically again based on inflection). Even more generally, I think we will need a mechanism for handling this kind of variation across languages. Sort of like a small set of parameters where different languages can choose different values. |
The Uppsala meeting / coordination discussion group decided that the current rule should not be changed and the first conjunct should always be the head of coordination. See here for details: http://universaldependencies.github.io/docs/2015-08-23-uppsala/coordination.html |
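With the first conjunct kept as the head, the morphology carried by the final conjunct can still be reached deterministically, as suggested earlier in the thread. The following is only a minimal sketch of that lookup, not anything from the guidelines: the token dictionaries, the function name `last_conjunct`, and the placeholder forms (A, B, C, POSTP, VERB) are all hypothetical and do not reproduce the Kazakh or Basque examples.

```python
# Toy sentence in the first-conjunct-headed style (hypothetical placeholder data).
# Each token has a 1-based id, a form, a head id (0 = root) and a relation label.
TOKENS = [
    {"id": 1, "form": "A", "head": 5, "deprel": "obl"},
    {"id": 2, "form": "B", "head": 1, "deprel": "conj"},
    {"id": 3, "form": "C", "head": 1, "deprel": "conj"},
    {"id": 4, "form": "POSTP", "head": 1, "deprel": "case"},
    {"id": 5, "form": "VERB", "head": 0, "deprel": "root"},
]


def last_conjunct(tokens, head_id):
    """Return the id of the rightmost conjunct in the coordination headed by head_id.

    If head_id has no conj dependents, head_id itself is returned.
    """
    conjunct_ids = [head_id]
    for tok in tokens:
        if tok["head"] == head_id and tok["deprel"] == "conj":
            conjunct_ids.append(tok["id"])
    return max(conjunct_ids)


# The coordination headed by token 1 ends at token 3.
final_id = last_conjunct(TOKENS, 1)
```

Here one step from the head along its `conj` dependents is enough, because all non-first conjuncts attach directly to the first one in this style.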
I'm sure this has come up in the documentation somewhere before, but I haven't been able to find it.
The docs state that for coordinated NPs "We take the first conjunct as the head of the coordination." When you have prepositions this is quite nice as you will have something like:
So your first conjoined element has the right function. But if you have a language with postpositions, where the postposition can or must be omitted from all but the last element:
Basque:
Kazakh:
Now the first conjunct doesn't have any of the relevant morphology or dependents for its function. This is not ideal.
Would this be something that could be done on a language-dependent basis? E.g. for Kazakh, make everything depend on the last element instead of the first:
Also, has this been written about somewhere before? I'm sure I'm not the first person to have come across this.
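To make the proposal concrete, here is a rough sketch of what re-heading a coordination onto its last conjunct could look like as a tree transformation. It is an illustration under assumptions, not a proposed tool: the token dictionaries, the function name `rehead_to_last`, and the placeholder forms are hypothetical, and the choice to move every dependent of the old head (including the postposition) onto the last conjunct is just the simplest reading of "make everything depend on the last element".

```python
# Toy sentence headed by the first conjunct (hypothetical placeholder data).
TOKENS = [
    {"id": 1, "form": "A", "head": 5, "deprel": "obl"},
    {"id": 2, "form": "B", "head": 1, "deprel": "conj"},
    {"id": 3, "form": "C", "head": 1, "deprel": "conj"},
    {"id": 4, "form": "POSTP", "head": 1, "deprel": "case"},
    {"id": 5, "form": "VERB", "head": 0, "deprel": "root"},
]


def rehead_to_last(tokens, head_id):
    """Return a copy of tokens in which the last conjunct heads the coordination."""
    by_id = {tok["id"]: dict(tok) for tok in tokens}  # copy, leave input untouched
    conj_ids = [t["id"] for t in tokens
                if t["head"] == head_id and t["deprel"] == "conj"]
    if not conj_ids:
        return [by_id[i] for i in sorted(by_id)]  # nothing to re-head

    last_id = max(conj_ids)
    old_head, new_head = by_id[head_id], by_id[last_id]

    # The last conjunct takes over the external attachment of the old head.
    new_head["head"], new_head["deprel"] = old_head["head"], old_head["deprel"]

    # Every remaining dependent of the old head now depends on the last conjunct.
    for tok in by_id.values():
        if tok["id"] != last_id and tok["head"] == head_id:
            tok["head"] = last_id

    # The old head becomes an ordinary conjunct.
    old_head["head"], old_head["deprel"] = last_id, "conj"
    return [by_id[i] for i in sorted(by_id)]


REHEADED = rehead_to_last(TOKENS, 1)
```

In the result, token 3 carries the `obl` relation to the verb, and tokens 1, 2 and 4 all attach to token 3. Since the mapping is deterministic in both directions for simple cases like this one, a treebank could in principle be converted between the two styles for experiments.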