Argument order inside intent applications #478
Comments
This is something that any application inferring intent (or inferring speech directly) needs to handle, but we do not specify that at all in the current spec, and I don't think we should (or that it's possible in a mathml4 timeframe)
The intent handling is specified without reference to any presentation so the order of arguments is completely explicit: if you have
But Content MathML is about implied meaning so this has to be given. |
To keep this constructive, I'll only address your example. If a generator application emits
<msup intent="power(a,b,c)">
<mi>c</mi>
<mfrac>
<mi>b</mi>
<mi>a</mi>
</mfrac>
</msup>
expecting an AT to narrate: "c raised to the fractional-exponent of b over a end-exponent", but the consumer AT instead narrates "a raised to the fractional-exponent of b over c end-exponent", then the two systems have failed to interoperate, and the listener of the AT receives a completely broken readout (significantly worse than a purely presentational readout "c superscript b over a end-superscript"). Note that there may still be a second system AT2, which correctly meets the expectations of the generator. Which system made a wrong choice is under-specified today. In fact, neither choice is "wrong"; this brokenness is a predictable consequence of an incomplete Intent spec. In such a world, any system-specific narration by a consumer AT would only be usable with generator systems specifically targeting the argument order supported by that singular AT system. (for P.S. It took me a minute, but finding a 3-argument use of power wasn't that hard after all - it just needed some extra imagination. |
On Fri, 27 Oct 2023 at 20:56, Deyan Ginev ***@***.***> wrote:
To keep this constructive, I'll only address your example.
If a generator application emits power(a,b,c) on:
<msup intent="power(a,b,c)">
<mi>c</mi>
<mfrac>
<mi>b</mi>
<mi>a</mi>
</mfrac>
</msup>
expecting an AT to narrate:
"c raised to the fractional-exponent
<https://www.cuemath.com/algebra/fractional-exponents/> of b over a
end-exponent",
No, I would not expect AT to do that at all. The `intent` there will
suppress inferring speech from the presentation (that's its purpose), so it
will say
power of a comma b comma c
unless the system has a non-core rule for a three-argument power.
|
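The fallback reading described in the reply above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: `parse_application`, `read_aloud` and the `KNOWN_PATTERNS` table are hypothetical names, the parser handles only flat applications such as `power(a,b,c)` rather than the full intent grammar, and the single two-argument rule is an example, not an entry copied from any concept list.

```python
import re

# Hypothetical rule table keyed by (concept name, arity). A system with no
# entry for a given name and arity falls back to the generic reading.
KNOWN_PATTERNS = {
    ("power", 2): lambda base, exponent: f"{base} to the power {exponent}",
}


def parse_application(intent):
    """Split a flat application such as 'power(a,b,c)' into (name, [args])."""
    match = re.fullmatch(r"\s*([A-Za-z][\w-]*)\((.*)\)\s*", intent)
    if match is None:
        raise ValueError(f"not a flat intent application: {intent!r}")
    name, body = match.groups()
    args = [part.strip() for part in body.split(",")] if body.strip() else []
    return name, args


def read_aloud(intent):
    """Use a known (name, arity) rule if present, else 'name of a comma b ...'."""
    name, args = parse_application(intent)
    rule = KNOWN_PATTERNS.get((name, len(args)))
    if rule is not None:
        return rule(*args)
    if not args:
        return name
    return f"{name} of " + " comma ".join(args)


print(read_aloud("power(a,b,c)"))
print(read_aloud("power(a,b)"))
```

Keying the table on arity as well as name means a three-argument `power` is simply an unknown pattern and gets the generic reading, whatever presentation markup it annotates.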
Note a system that does have a rule for a three argument power would say the same for
|
Any generator that creates I also expect that we're over-complicating things; There will probably be only a handful of concepts in the core dictionary with more than 1 argument which have distinct roles. Add a phrase like "the first argument is the base, the second is the exponent" to a comments column of the Core dictionary, add a few months for arguing, and we're done! |
exactly, yes agreed.
yes sure we could do this. |
No? The two of you even proceed to brainstorm an interoperability mechanism, so I am not even sure what you agreed to. If there is a mechanism to ensure argument order interoperability for The "comment column" suggestion is at least beginning to engage with the substance of the issue as opened. My view on that matches my comment for the CMML plain-text documentation, quoting from the issue description:
To me this is a good direction to brainstorm more on, and I welcome other group participants to join in. A simple convention could provide a nice intuition to both implementers and list curators. |
Being in the open list has no effect on anything, it is a list of suggestions for things that implementers might consider implementing in addition to the things in core, but a given system can implement things not in the open list and may or may not implement the things that are in the list. I am not really sure what "interoperability of argument order" for foo(a,b,c) means. The default interoperable thing is to read it as foo of a comma b comma c. If a system chooses to implement a rule for foo and give it a specific better reading, that is fine, but naturally that reading is different from the reading produced by systems without such a rule. |
Probably more the former, at least initially; at least until we see what we've collected in the list and can assess the potential for confusion. |
It means that for all systems where "foo" is a For
I should again clarify: The Open realm is larger and needs more of this care, but this is a fundamental issue to intent expressions with today's spec. The example |
There is no problem here to fix. The core entry says a to the b-th power It may be the generator should have generated
There is no "shuffling" of arguments possible, just as the order of words does not get shuffled. You are talking about explicit markup in a document, if the document says If by "shuffling" you mean that different mathml generators should generate the same intent for the same mathematical expression, that's not a general problem, just a matter of describing (especially in the open list) the intended concept in sufficient detail. Currently the open list at But I think the core list has more or less sufficient descriptions that no implementer would really be in any doubt about which function was intended by each entry. |
Firstly, I personally have been trying to get away from this notion of "known concept", preferring something more like a "known speech pattern", which includes arity (w/o requiring that a specific speech template be used). Power with 2 arguments is a known pattern, with 3 arguments is not. Whether the 3 argument form is a "known concept" is completely uninteresting to me. Secondly, with regards to at least one generator, I can see no conceivable way that LaTeXML would try to generate And finally, there already is a default speech for Clearly, for the Core list, and more so for the Open list, there will be concepts which take more than 1 argument and where the order matters; in such cases we need a way to document the expected order. Enforcing that order is out of scope, even assuming it were possible. |
If I am reading the replies correctly, both of you understand the technical need I described and have agreed with each other to experiment with a list-specific documentation solution. Great. The solution you've focused on so far is a similar approach to how CMML's As a supplement to that, I would like us to investigate a good convention for how to organize the arguments for common patterns of applications. You may not be interested in that, and that is OK, we can agree to disagree. If no one but me is interested, the issue can be closed. I find the appeal of streamlining argument order in the Open realm quite attractive - it can simplify a lot (which means broader coverage for less work). For example: Sadly this can't be so simple, due to competing notations, but all we may need is a tie-breaking clause (= ranking) for which notation to use as a "reference notation for argument order" when there are multiple known notations. I suspect we have already been doing some of this "subconsciously" when specifying speech hints, following some common sense take on "taste". Making those choices transparent, and using them consistently, can make a big difference for adopters. This kind of work is in scope for the current WG's efforts for the same reason the curation principles were in scope ( #470 ). As such, in my opinion, discussion here should be allowed to continue unencumbered. |
No, I really can not understand your issue at all. I have no idea why you think an explicit attribute such as
It's not that I'm not interested; I just don't think it's relevant to the mathml spec. It's just general advice to implementers or contributors to the open list on good concept definitions.
You could perhaps put some such suggestions in the top of the open list or in the notes-on-mathml, but there is no testable assertion here and nothing that should go in the spec.
No there is nothing to be specified here.
I can't see that we can do anything other than close this with no action. |
I agree that argument order is something to be addressed, but do not agree that it needs anything more than a comment in a few entries, or at most a separate column. Of the concepts in the current Core list, only intervals, quotient, remainder, power, 2 argument root, definite integrals, derivatives, sum, product (and other bigop, limitop) need any clarification (and most of those are already implied by the template). I do think we'll improve interoperability by being explicit and clear. The open list will likely be more work, but that shouldn't be surprising. Without having community experience using the open list, I wouldn't expect a confusing algorithm to guess argument order from "standard" notation to pose any advantage, by the time you've added to every concept what you think the standard notation is, and what it means. |
I have to admit that I remain baffled/unconvinced that there will be confusion on argument order for almost all concepts core handles. Notwithstanding (a new favorite word of mine since you pack three words into one) the above, it certainly doesn't hurt to add some text to the top of the concepts lists that says something like:
I think this says what David and Bruce have assumed and is somewhat like something that Deyan proposed above. Note that this text is informative and that the speech hints and comments are likewise informative, not normative. @dginev: does this address your concerns or am I still not understanding why you think there is a problem with argument order? I think a similar statement can be made for open concepts. I just did a quick scan through the open list and where arguments were indicated with |
@NSoiffer Yes, the phrasing you used would already auto-decide the vast majority of ordering cases. I think it is quite appropriate. Even as an informative note, this is an improvement. But I am wondering if making it normative (in the main spec text) wouldn't reap the ultimate benefit - streamlining all Historical context: Recall I was also a voice for streamlining how concepts themselves are named, and wanted us to have some "encyclopedic" convention, also lowercase-dashed. These are ultimately small design tweaks that don't change the nature of what intent is, but make it more uniform and predictable to use. |
You could have a normative testable statement that concept names have a specific form such as hyphenated lower case, but you can't have a normative statement that the names make sense. Argument order is of the latter type: there is no normative statement you can make, as the function (from a spec point of view) only exists as the concept, so there is no way you can normatively say the function's arguments have to be in any order. That is, you can not make "typically used represent the concept" in Neil's phrasing into anything normative. This is basically just guidelines for submission to the open list and could be added to the top of that file. |
also what does "order in presentation mathml" mean? We have some open issues around intent for calculus, but if you have
It really makes no sense to try to specify this order in the abstract for all functions. When But it's a judgement design call in each case, not a normative rule that should or can be followed. |
@davidcarlisle Ok, sure, the nuance of what can be prescribed is something I need to learn more about. Maybe all that's really needed is some non-normative encouragement from the main text. But I am not sure I understand the technicality. This is the same spec that documents the order of arguments of The spec hasn't provided a "testable" way to know if those were used correctly, I think. It can't prevent anyone from putting the denominator as the first arg of a fraction, or the exponent as the first arg of |
This is exactly what I've suggested (several times). But once we've specified the expected order of arguments, there's nothing we can do to enforce it. I like @davidcarlisle suggestion to add Open list guidelines about how to choose argument order. That could indeed make the system more predictable. But there's no "correct" order; only common, conventional, convenient, etc, so nothing normative. |
you can possibly say something about a specific function (typically in the comments in its entry) but there are no general rules, You just need to make an arbitrary choice. That is why we need the concept dictionaries so that once a function has been added, different implementations can make the same choice. definite integration could be any of
and dozens of other possibilities. There is no way of specifying in advance a rule that tells you which concept name and argument structure to pick; whoever is writing the concept dictionary entry needs to make an arbitrary choice. By adding it to the dictionary you are saying other systems should (for core) or may (for open) use the same choice to improve interoperability. Some version of Neil's note above could be added to the open list as general guidelines used to help people choose names and structure of concept entries, but it can't be anything more than general vague hints. There is no way of phrasing anything that applies in general. To phrase this another way, when describing mfrac's two arguments we can say the first argument is called numerator and the second is called denominator. But in general the only thing that can be said of a 2 argument intent concept is that the first argument is first and the second argument is second. The dictionary is the definition of the functions, so the entries are correct by definition; there is no previous notion of the arguments which the entries should or can follow. |
To my understanding Neil's phrasing of:
addresses the integral example, and any other construct of such variability or complexity. While an integral may need special treatment (or a more sophisticated general convention), I don't see it as a reason why we can't suggest that the simpler cases should be streamlined. I.e. a convention where The main difference I have with David is that even if we agree that sometimes a normative ( If the curator was instead recommended to make the same fixed choice (even if its principle is arbitrary), it will avoid the need for documenting every entry, and will make implementation briefer and more predictable. Aside: Currently my own taste leans towards |
With my DLMF hat on, I like to encourage (but not enforce) standard notations wherever possible; with any other hat on, I have to point out that there is no such thing. Making a rule such as proposed above be normative would actually guarantee lack of interoperability. Those who think "n choose m" is written I'm inclined to agree with @NSoiffer (& @davidcarlisle ?) that the speech hint (where given) is likely sufficient to clarify expected argument order; Compared to adding comments to the dictionary it's less work for the dictionary writer, perhaps more work for the dictionary user. |
The speech templates in the open list will have no effect unless implemented in AT systems, so at no point should a content creator be adding thousands of such things. If a content creator is adding concepts that are not known to the system they should use the order that they want the arguments spoken as |
I said "concept curator", not "content creator" |
ah sorry, misread, although the point still holds: the only people who can affect a non default reading of any given concept expression are the implementers of AT systems such as mathcat. We chose not to define a default reading of presentation mathml, leaving that up to vendor experimentation, so you don't know in general how an expression will be read unless you add an intent. But if you do add an intent concept function expression, it will be read without reference to the presentation, so whoever is adding those will add them based on how they want it read. Vague hints in the dictionary about how the argument order should be different based on some possibly different notation don't help anyone. |
I agree. I prefer a normative I think the current active discussion here boils down to a design preference for how strict we should be with prescribing argument order. If "list comments" are the only mechanism to decide it, then the "concept curator" (writing the list) has full freedom, and implementers of lists need to manually walk through each entry to find what decision the curators made. This makes it very low-friction for group members to curate the lists, but harder for people outside the group to implement them - since every entry is essentially a special case. I will continue advocating for making some deliberate design choice, adding a cross-list mechanism that guides how argument order can be automatically chosen for |
The entry is the definition of the concept function and its arguments, there is no pre-existing function, so certainly no normative statement can be made at all, and I don't really see how there is any general non normative statement either. I do not see there is any issue here and think we should close this with no action. A motivating use case for intent is disambiguating notational differences so whether you have |
Perhaps an explicit example might help show why argument order should not be tied to presentation order, other than at most as a vague hint as to general considerations that one may take into account. See https://en.wikipedia.org/wiki/Coset#Notation
A reasonable but probably non-core pair of concept definitions would be
left-cosets(G,H) "left cosets of $2 in $1"
right-cosets(G,H) "right cosets of $2 in $1"
The conventional notation of the first is There is no prior definition of these functional forms and no normative or non-normative test to say which argument order is correct. The concept dictionary entry forms the definition of the concept function and the argument order, whichever is chosen, is correct by definition. |
The good aspects about a My design preference is to think of "encyclopedic concepts" and to see https://en.wikipedia.org/wiki/Coset as one encyclopedic page defining the concept (itself summarizing use in actual mathematical practice). The intent lists should primarily aim to make transparent a list of names that systems may interoperate with, and avoid the trap of trying to become developed ontologies of discourse, with all the custom curation decisions that come along with them. The more focused the list - the smaller the friction will be for adopters. But if-and-only-if the crucial operational questions have been addressed by the main spec text. I've explained the details above, though I still wish we were given the opportunity for proper discussion. Absent that, I suggest a group meeting and vote on the questions posed. |
It should not be a SHOULD, or in the spec; we could at most include it in the notes in the dictionary on design considerations that could be used when contributing new entries to the open list. There is no SHOULD or testable assertion that can be made: concept dictionaries are conceptually (and in the current version of the core list, actually) totally independent of any visual layout. When defining a dictionary entry for a function of more than one argument you might have various things in mind.
Of the four, I'd say that the two based on presentation mathml are perhaps the least useful, I'd probably use the 1st then the 4th before that. But the point is if I'm adding a dictionary entry, whatever is in my head really doesn't matter and it doesn't matter if someone else would have made a different choice. The dictionary is there to log choices and allow different systems to use the same set of definitions. This issue is suggesting a SHOULD requirement to use the third bullet (as far as I understand the issue at all) but I don't think there can be any general rule and certainly I do not think that would be a good rule. But in any case, as the concept entry does not mention the "common notation" that was in the author's mind, it is impossible to have any requirement on the order that means anything or is testable. |
Re-reading this after being away from the issue for a while, I think I see a subtle distinction between what Deyan is asking for and what David, Bruce, and myself were saying isn't needed. What the three of us keep saying is that the list is the definition. What I think Deyan is saying is that makes everything a special case. It would be much better to have rules. We all agree that one can't state rules that are always going to work. In other words, we all agree there are special cases. I suspect the special cases comprise well less than 10% of the entries in core, and also in open (probably closer to 1% than 10%). A good part of this is because in both lists (at least so far in the spreadsheet Deyan created), there aren't too many entries with more than one argument, and where there is more than one argument, they are almost always pronounced left-to-right (hence following the stated default ordering). I believe Deyan's main complaint is that if there are thousands of special cases, it is too much work to implement. But if in fact there are some common rules, then either due to a special column in the table or the presence of a speech hint or something else, an implementer or machine generator could recognize the 90+% cases that follow the general rule and an implementer would have much less work to do, or at least less cognitive load. @dginev: did I capture what one of your main concerns is? Is this (a note at the start of the concept list together with an entry in the table that says "this is special") something everyone can get on board with? I know part of the discussion was normative vs informative. This puts the text in the list document(s) and not in the spec, so it makes it informative. On the other hand, it specifies how the table is to be interpreted, so in that sense, it is normative for people authoring the table and those reading it. Maybe it is easier for a camel to go through the eye of a needle than to make everyone happy :-) |
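One way an implementer might separate the entries that follow the left-to-right default from those that do not is to compare the order of the `$n` placeholders in a speech hint with the written argument order. The following Python sketch assumes a hypothetical `HINTS` table; its three templates are ones quoted in this thread, but the table itself and the function names are illustrative only.

```python
import re

# Hypothetical table: concept name -> speech hint with $n placeholders.
HINTS = {
    "power": "$1 to the power $2",
    "left-cosets": "left cosets of $2 in $1",
    "right-cosets": "right cosets of $2 in $1",
}


def spoken_order(hint):
    """Return the argument positions in the order the hint speaks them."""
    return [int(number) for number in re.findall(r"\$(\d+)", hint)]


def follows_default_order(hint):
    """True when the hint speaks its arguments in written (left-to-right) order."""
    order = spoken_order(hint)
    return order == sorted(order)


for name, hint in HINTS.items():
    label = "default order" if follows_default_order(hint) else "needs a closer look"
    print(f"{name}: spoken order {spoken_order(hint)} ({label})")
```

A check like this only reports where the spoken order differs from the written order. It says nothing about which order is "correct", since the entry itself is the definition.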
You did, thank you for the summary.
Your middle-ground suggestion upgrades us from "every concept is a special case" to "every list has special argument order rules", which is certainly an improvement to the implementer workload. If we are discussing 1% of list entries as needing special treatment, as you suggest, then to me it is easy to be tempted by the stronger "MathML Intent has a single convention for argument order" for the remaining 99%. |
I don't think anything should be described as "special" here, although no objection for a note at the top of the open list suggesting that people making contributions should consider placing the arguments of functions in the standard reading order of a common notation for the concept. |
So we'd have a different order for binomial of k among n and vector with components x and y although they are displayed the same (at least in some cultures the bigger number is above in the binomial coefficient). |
well that more or less indicates why this requirement really shouldn't be here. I'd read basically you have to make an arbitrary choice, that is the whole point of having the dictionary, to record that choice. Really we should close this with no action, or at most add some vague hint to take reading into consideration when specifying a new entry. |
I would have started by asking how Paul arrived at the English "binomial of k among n" and whether he considers it "the standard reading order of a common notation for the concept", before reaching such a strong conclusion. I took it as another good reason to try and leverage encyclopedic resources. For example, the wiki page informs:
Also, from encyclopedia Britannica:
This is a kind of testable evidence for "prevailing use" of at least one speech pattern. Clearly, a language capable of active and passive voice is capable of order reversal for most phrases we can build. "n choose k" and "k chosen from n" are equally meaningful to a learned listener, but one is in common use and the other is not.
I think what you are trying to claim is that "there is no testable rule which uniformly covers the full domain of math syntax", which would be correct. There are certainly a variety of testable rules which can be designed by us (including the notation-based rules in the issue description) which will cover most of the cases. For the cases where multiple choices are possible and an arbitrary choice needs to be made (apologies for using "special" before) and documented in the list, that is fine. If the group prefers focusing on English speech as deciding argument order, I think we will need to produce some language rules, of the sort "prefer active voice and brief speech patterns". That would then serve to break the theoretical tie between |
No, I am saying there is no rule that can be stated that refers to any notation, as the notation is not part of the entry. There may be one standard notation, there may be many, they may or may not have arguments in the same order. It does not matter, as the notation plays no part in this. The concept dictionary entry has no entry for notation (except possibly as mentioned in a comment) and the instance the intent processor is trying to match may be
and the concept dictionary entry for binomial-coefficient has to match (or not) and if it matches, give some speech hints Any rule (certainly any normative rule) can only be a rule about the data at hand. The person who writes the entry for the dictionary may have a notation in mind, and that may inform their choice of argument order, and of the speech hints, but that's just in their mind. |
@dginev although specifically "n choose k" (and the It certainly makes sense to have some notes highlighting this kind of issue, but there is no rule that can be made, other than advice that the contributor of a concept dictionary entry should consider these things. |
That'd be my preference: Warn that there may be interpretation differences, especially when going international. The saying of k among n is from the French pronunciation, as I remember. |
No great insights follow. Just a few reminders and an example... Reminder: the speech template is not meant to force speech to be spoken a certain way for that notation. It is an example of how it might be spoken. As @davidcarlisle said, you may speak the same intent in different ways. His example is a good one for a terse and verbose way of speaking The speech template hopefully makes clear which argument means what. If it's not clear, the comments should clarify the order. However, it doesn't mean they are spoken in that manner. As an example, Asian languages speak the denominator and then the numerator ("b under a"). [Surprisingly, at the moment, the concept list doesn't have an entry corresponding to |
I added the text suggested by @NSoiffer above to the text at the top of the open concept list, hopefully that is enough to close this issue. https://w3c.github.io/mathml-docs/intent-open-concepts/ |
Discussed at 6 June meeting... Although @dginev feels that the spec itself should have mention of argument order, he deferred to @brucemiller who felt the current solution of adding text to the open list and having speech templates is ok. Others agreed that the issue can be closed. |
Description
An intent application follows the grammar rule:
A recent problem @brucemiller and I encountered is that we are currently under-specified on how argument order is determined. As a minimal example, we have the practical choice between power(2,k) and power(k,2), one of which stands for "two to the power k" and the other "k squared".
@polx mentioned in our WG meeting on Oct 26, 2023, that Content MathML had one solution to this problem. Personally, I think it is a bit too "manual" - there is English text documenting each and every content element and its arguments, each requiring an adopter to carefully consider the description and implement it correctly. As an example, cmml power states:
A comparable approach, mentioned by @NSoiffer in the same meeting, is prescribing that the order of arguments in a listed "speech hint" for a concept becomes normative for applications of that concept. For example, in the application power($1,$2), a speech hint associated with power stating "$1 to the power $2" would dictate that $1 must be the base, while an alternative "$1-th power of $2" would dictate $1 to be the exponent.
This is workable, but manually walking a documentation of this nature would get overwhelming as we exceed one thousand concept entries. Is there a more automatic approach we could adopt?
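To make the speech-hint idea concrete, here is a minimal Python sketch that fills the two alternative hints quoted above with the arguments of power(2,k). The helper name `fill_hint` is hypothetical and not part of any spec or tool. It only substitutes `$n` placeholders, so the role of each argument is whatever the chosen hint implies.

```python
import re


def fill_hint(hint, args):
    """Replace each $n placeholder in a speech hint with the n-th argument (1-based)."""
    return re.sub(r"\$(\d+)", lambda match: args[int(match.group(1)) - 1], hint)


args = ["2", "k"]  # the arguments of power(2,k), in written order

# Under this hint the first argument is the base.
base_first = fill_hint("$1 to the power $2", args)

# Under this hint the first argument is the exponent.
exponent_first = fill_hint("$1-th power of $2", args)

print(base_first)
print(exponent_first)
```

The same written application yields two different readings, which is why a list would need to commit to one primary hint per concept for the hint to document argument order.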
Preliminary Discussion
Draft idea: I spent a little time considering if we can instead propose a notation-based convention for this argument order. MathML Intent annotates presentation expressions, which means we are annotating a known rendering, and we can pose a convention based on that rendering. Here is an example set of rules:
C is silent in intent="binomial-coefficient(n,k)"
mfrac), the arguments are provided top-to-bottom, based on their rendered order.
mtable annotated with system-of-equations($1,$2,$3), the argument order should match the reading order of the expression. While a system-of-equations flows top-to-bottom + left-to-right, a tabular diagram may have its reading start center-to-outermost-column, and the argument order should follow that.
intent="maps-to(A,B,C,A)" and intent="maps-to(B,C,A,B)" are equivalent for a circular diagram showing directed arrows between A, B and C.
Tabulars actually provide a counter-example to a claim that a "notation convention" is good enough. It's true for simple arithmetic, but in advanced tabular cases the readout is not easy to infer from the presentation markup - and often not unique.
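The circular-diagram case can be illustrated with a small Python check that treats two argument lists as the same directed cycle when one is a rotation of the other. The function name `same_cycle` and the convention that a closed path repeats its first node at the end are assumptions made for this sketch, read off the maps-to(A,B,C,A) example. No such equivalence is defined in the spec.

```python
def same_cycle(args_a, args_b):
    """True if two closed paths (first node repeated last) are rotations of each other."""
    if len(args_a) != len(args_b) or len(args_a) < 2:
        return False
    if args_a[0] != args_a[-1] or args_b[0] != args_b[-1]:
        return False  # not closed paths under the assumed convention
    ring_a, ring_b = args_a[:-1], args_b[:-1]
    doubled = ring_a + ring_a  # every rotation of ring_a is a slice of this list
    size = len(ring_a)
    return any(doubled[start:start + size] == ring_b for start in range(size))


print(same_cycle(["A", "B", "C", "A"], ["B", "C", "A", "B"]))
print(same_cycle(["A", "B", "C", "A"], ["A", "C", "B", "A"]))
```

The second call reverses the arrow direction, so it is not treated as equivalent. This shows that even a simple diagram admits several valid argument orders with no unique one to prescribe.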
Another flaw in my proposal is the same concept having multiple contradictory notations. The binomial-coefficient is notorious here: All of ${}_n C_k$, $C^n_k$, $C^k_n$ and $\binom{n}{k}$ can be spoken "n choose k" (if that is the local convention).
For this kind of diversity it is hard to imagine any automatic approach that "optimally" determines argument order. An alternative focus on "order of speaking" may be tempting, but that is easy to invert even in English, and likely multiple orders are possible in foreign languages.
In conclusion, if there is no reliable automatic way to make the choice, our best current mechanism seems to be for the list containing the concept to make a manual, normative choice for adopters' sake. That likely requires a single (primary) speech hint for each entry with two-or-more arguments, as a brief self-documenting device.
One middle-ground thought: we may still use a notational convention as a guide when creating speech hints in the concept lists - so that we have a consistent collection of argument patterns within each list. One would hope they are common sense enough that we have been doing that by inertia already - but I doubt we've been thoroughly consistent.
Better ideas are most welcome, hopefully this description seeds a robust discussion.