include (or not) a sample-set of default conversion from plain-MathML to MathML-with-intent #433
Comments
My position on the call was focused on the claim:
Such examples will create an unrealistic (and underspecified) expectation of what may or may not be possible with simple defaulting rules. In reality, very little is inferred reliably with simple rules, and even for K12 materials one needs to consider the full presentation tree, matching on both XML structure and text content of each node. Since we do not have the resources to develop this kind of mechanism in full, my preference is to defer further work to MathML 5. For the current iteration, I would be more interested in investigating domain-specific rule-sets anchored around the "isa" ( #426 ) capability. It may be possible that simple defaulting examples may be realistic within very concrete "isa" values, such as For a specific illustration of my point, consider the intuition:
If one applies such a rule over a larger text, it is bound to mispronounce a wide variety of different scripted constructs. Here is an excerpt from my survey of Khan Academy's K12 materials - 15 notations relying on a simple
I think the issue is not whether the charter says we should have a sample set of mappings of defaults to
Background
I tried to answer that as part of a position paper a few months back. See that paper for more details. Here I will just list the defaults so they can be discussed individually or as a group via comments. First off though, note that I strongly feel there needs to be a
Note: in all cases, if intent is given, it should be used (if appropriate). Even for legacy, it is possible remediation may have added an intent value.
Proposed defaults:
AT should have a specified default interpretation for every MathML element. That doesn't mean that the exact words are specified, only that AT chooses words that convey the default meaning. For example: msup is spoken as "super" or "superscript" if intent-default = "structure" and is spoken as a power ("x squared", "x raised to the n minus 1 power", etc.) if intent-default = "common". The exact words may depend upon both the audience and the arguments.
"structure"
The goal of the default is to avoid any inference of semantics. The meanings and special cases for all the MathML elements are (expand to see details):
"common"
The goal is to use common "K-14" meanings so the need to use
Summary
As stated at the start, we need to answer the question of whether there should be defaults (given my long post, it should be clear what my position is). If others agree, then we need to come to an agreement on what the defaults should be. The above list is a first cut. There is a trade-off that should be considered. If the rules/special cases become too numerous, then AT is less likely to implement them. On the other hand, if special cases aren't listed, then authors/authoring software need to go to extra work to generate them, which makes them less likely to use If you feel a default/special case is missing or if some default is wrong/has too many special cases, that's what comments are for...
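The msup example from the proposed defaults above can be made concrete with a minimal sketch. It is an illustration under stated assumptions, not a proposed rule set: the function name `speak`, the restriction to `mi`, `mn` and `msup`, and the exact English wording are all hypothetical, and only the two mode names "structure" and "common" come from the discussion.

```python
import xml.etree.ElementTree as ET


def speak(node, mode):
    """Return a rough spoken string for a tiny MathML subset (mi, mn, msup).

    mode is "structure" (no semantic inference) or "common" (read msup as a power).
    The wording is illustrative only; real AT would choose its own words.
    """
    if node.tag in ("mi", "mn"):
        return (node.text or "").strip()
    if node.tag == "msup":
        children = list(node)
        if len(children) != 2:
            raise ValueError("msup needs exactly two children")
        base = speak(children[0], mode)
        exponent = speak(children[1], mode)
        if mode == "structure":
            # Describe the layout only, with no claim about meaning.
            return f"{base} superscript {exponent}"
        if mode == "common":
            # Assume the superscript is an exponent.
            if exponent == "2":
                return f"{base} squared"
            if exponent == "3":
                return f"{base} cubed"
            return f"{base} raised to the {exponent} power"
        raise ValueError(f"unknown mode: {mode}")
    raise ValueError(f"unsupported element: {node.tag}")


x_squared = ET.fromstring("<msup><mi>x</mi><mn>2</mn></msup>")
print(speak(x_squared, "structure"))  # x superscript 2
print(speak(x_squared, "common"))     # x squared
```

The "common" branch reads every superscript as an exponent, so a superscript used for something other than a power would be misread. That is the kind of failure the earlier comment about simple defaulting rules warns about.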
Responding to @dginev's comment... While I completely agree that simple rules will fail to capture a significant number of special cases, I disagree that they are not useful. I would love to gather some data, but my guess based on looking at a lot of math textbooks and tutorials over the years is that they will capture over 95% of the cases, maybe even over 99%. These numbers don't reflect "good" speech, just "not wrong" speech. I know that is pretty bold, but simply put, there are a lot of Moving on to The least accurate rule is the one for Based on your examples, I updated the I'll be the first to admit that I don't have statistics to back up my claims. It would be great to go through a dozen textbooks and do counts, but I don't think that anyone has the time/stamina to do that. At best, we could find the number of |
My first comment regards the "default default", ie. what defaulting rule set (if any) applies when there is no |
My second comment is that I believe we should express the defaulting rule sets in terms of
And the fact that it would force us to put some non-semantic items (eg |
I think we agreed that this is likely too big an enterprise: any default set we can recommend will be frustrating. Leave this in the field of brave experimenting implementors.
I agree. Let us not promise such a set of rules. I strongly suggest starting with a vocabulary clarification first:
The easy bits in there are the function names at the bottom of David F's list (map words to intents, then to pronunciation) and the Unicode characters (map character codes to intents and pronunciation, possibly differing from Unicode). Both imply translation as well.
@NSoiffer the original reason to open this issue, the way I remember it, was more specific than the general topic of "default rule sets for intent", which may fit better in a new dedicated issue (especially if we are closer to consensus to add some recommended markup). To summarize my comments from the discussion in today's meeting:
So a first suggestion would be:
<math intent=":default-common">...</math>
<math intent=":default-structure">...</math>
<math intent=":default-legacy">...</math> <!-- identical to <math>...</math> -->
Separately, @davidfarmer expressed his hope that we won't have a prolonged discussion on what exact behaviors go into the "common" defaults. I certainly understand the sentiment. However, unless we specify a clear and fixed set of rules, it is reasonable to expect that different AT systems will implement different behaviors. Maybe that is an acceptable outcome, but we should be aware that we are making that choice. Moreover, it is good if we can now bundle this discussion together with discussing properties, as I can make this "AT alignment" point in general: Unless we clearly enumerate the exact effects each "behavioral property" is expected to enrich, we will have differences in behavior/coverage between AT systems. As an example, To bring this example back to defaults, if Neil's "proposed defaults" are a start, but they are incomplete from a Western K12/K14 education standpoint. Should we try to make them complete?
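As a rough sketch of how a consumer might pick a rule set from the three root-level values suggested above: the function name `default_mode` and the fallback for unrecognised values are assumptions made for illustration. Only the three `:default-*` strings, and the statement that `:default-legacy` is identical to a bare `<math>`, come from the suggestion itself.

```python
import xml.etree.ElementTree as ET

# The three values from the suggestion, mapped to short rule-set names.
KNOWN_DEFAULTS = {
    ":default-common": "common",
    ":default-structure": "structure",
    ":default-legacy": "legacy",
}


def default_mode(math_source):
    """Return the rule-set name requested on the root <math> element.

    A missing intent attribute maps to "legacy", matching the note that
    <math intent=":default-legacy"> is identical to <math>. Treating any
    unrecognised value as "legacy" too is an assumption of this sketch.
    """
    root = ET.fromstring(math_source)
    intent = (root.get("intent") or "").strip()
    return KNOWN_DEFAULTS.get(intent, "legacy")


print(default_mode('<math intent=":default-common"><mi>x</mi></math>'))  # common
print(default_mode("<math><mi>x</mi></math>"))                           # legacy
```

The lookup only selects a rule set by name. What each rule set actually does is the open question raised in this comment.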
I'd be tempted to drop Committing forever to an "undefined default behaviour" and forcing opt-in to get a defined behaviour seems a high price to pay to get unchanged behaviour for a possibly non-existent set of documents.
Adding a couple of my comments from the meeting on May 25th:
The names of properties seem to have changed and stabilised a bit since this discussion was last active. In the current core properties list we just have I don't think the list of so of the three cases in @NSoiffer comment #433 (comment) |
In the Jan 9 meeting, we started to discuss the "common" idea. One suggestion from that meeting by @MurrayIII and @polx is to have a JS library that inserts the intents based on a set of rules rather than AT doing that. After the meeting, I realized that while that may be viable for a web page, it isn't a solution for Word or PowerPoint documents or other non-web documents. Because of this, I don't think JS is a viable general purpose solution. Although it is in the minutes, to make it more obvious in this issue, @dginev proposed that rather than formalizing the common rules now, we allow "vendor extensions" à la the web, so maybe a value such as |
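The rule-based insertion idea from the Jan 9 meeting can be sketched as a preprocessing step that writes intents into the markup before AT sees it. This sketch uses Python rather than JS purely for illustration, and it implements a single hypothetical rule: an `msup` whose superscript is an `mn` is treated as a power. The function name `add_power_intents`, the argument names `base` and `exp`, and the use of `power` as the concept name are assumptions, not an agreed rule set.

```python
import xml.etree.ElementTree as ET


def add_power_intents(math_source):
    """Add a power intent to each msup whose superscript is a number.

    Elements that already carry intent or arg attributes are left alone,
    so author-supplied markup always takes precedence over the rule.
    """
    root = ET.fromstring(math_source)
    for msup in root.iter("msup"):
        children = list(msup)
        if msup.get("intent") is not None or len(children) != 2:
            continue
        base, exponent = children
        if exponent.tag != "mn":
            continue  # too ambiguous for this simple rule
        if base.get("arg") is not None or exponent.get("arg") is not None:
            continue
        base.set("arg", "base")
        exponent.set("arg", "exp")
        msup.set("intent", "power($base,$exp)")
    return ET.tostring(root, encoding="unicode")


print(add_power_intents("<math><msup><mi>x</mi><mn>2</mn></msup></math>"))
```

Because the rule rewrites the document rather than the reader's behaviour, the same step could in principle run in any authoring or conversion tool, which bears on the concern about non-web documents.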
I don't think we should have a vendor-specific version, as that would suggest other systems don't support it, which would hamper rather than aid portability. The situation is rather different in CSS, with existing rules for systems to ignore unknown properties, and even there the vendor extensions are introduced by the vendors, not something promoted by the working groups. As there is already a lot of flexibility on the exact words systems use, they don't have to follow the exact speech hints for core concepts or properties, I don't think having a As I note above, I'm not convinced by the need to have a "legacy" default: while there are existing JavaScript-based solutions, there simply isn't a large corpus of cross-browser documents using native MathML with AT readings for which some legacy compatibility is needed. As far as it exists at all, the "legacy" behaviour is likely to be closer to
The reason to reach for a vendored prefix is exactly so that the working group does not recommend an unfinished / potentially broken behavior for Making an unfinished/unproven design portable should not be a goal of the group. Starting with a MathML 4 in which
I quite like that syntax, as it clearly marks the experimental state of the feature, follows a proven web platform precedent, and also allows for multiple vendors to be tested on the same document. My proposal would be that only after one of these experiments has some successes under its belt can we move to standardize a clearly specified
Explicitly making a document target at one system (or worse a list of systems) If the proposal stays with the default as an unspecified legacy then the end result will be that most documents will get read with some approximation to the Having some rules (any rules) cannot be harmful in the way you suggest, as once they are specified authors can use intent to override them where needed. If the default is simply "unspecified behaviour" then that is really a failure of the group to specify something more usable and gives authors no guidance at all.
"Broken by default" is not a design philosophy I subscribe to, and I am somewhat surprised you appear to be advocating for that. What legacy systems did is not affected by any decision made in this issue, and that part of the conversation appears to be a distraction to me. |
That's not what I'm advocating.
The issue is all about what the default should be. The proposal to make that "legacy", aka completely undefined, is what I don't like. I think the specification should specify a usable default. Legacy systems that do not change will do what they do in any case, but they do not claim conformance to MathML 4, and I don't see the need to water down the rules so that they can be retrospectively considered conforming.
And yet the only reliably usable default is |
If you need to specify I don't agree with your definition of "usable", it is obviously less ambiguous, but that is solving the wrong problem. The problem is how to produce the best readings from existing
I honestly cannot understand that comment at all. The readings may be wrong in some cases, but if it is just unspecified heuristics the author has no way of knowing what will happen. If there is a specification such that conforming systems use the same heuristics, the readings are no more broken, but the author has the benefit of knowing in advance whether they need to add extra markup to guide the speech.
What? I am suggesting that specifying that the default readout of
Your take on "usable" will make arXiv readouts broken by default, so I am happy to disagree with it.
First, you are asserting that without any validation. You may be right, you may be wrong or worse - it may depend on the document, on the AT system and on the reader.
That level of design should be unspecified/vendor-specific behavior. We shouldn't standardize "sometimes wrong, but..." as a global MathML default. I see creating a dedicated vocabulary (such as |
Oh sorry I misunderstood, OK, but I can not see that being a workable proposal.
I do not agree it is broken. I do not think reading conventional notation using conventional terminology is wrong, even if it's mathematically inaccurate.
This issue is about what to do when there is no intent. The proposal above from Neil is to leave that as ":legacy" (ie unspecified) I don't think the group has seriously considered making So it seems to me the only real option other than |
To me the workable options are |
We should discuss whether the charter of the next WG should deliver a sample set that makes explicit a default conversion from MathML (without intent) to MathML (with intent), so that legacy MathML expressions can be enriched (at least partially).
In the group's last call:
@dginev indicated that we should not make this a visible deliverable, as we will not be able to make it complete in any way.
@polx suggested that we should promise it, since such a promise carries no implication of completeness and it will be helpful for hinting at a first intent-enrichment process.
Let's discuss this on this issue.