MathML Copied from JAWS 2024's Math Viewer with MathCAT Will Not Auto Convert Into Math Objects in Microsoft Word 365 #4

Open
brichwin opened this issue Jul 25, 2024 · 14 comments

Comments

@brichwin
Collaborator

brichwin commented Jul 25, 2024

eDAD tracking ticket for this issue is Accessibility 228596

Description:

Math copied from the Math Viewer in JAWS will not automatically convert into a math object in Microsoft Word 365. Instead, the MathML for the expression is pasted as plain text. The MathML copied to the clipboard lacks an xmlns attribute on the <math> element.

Steps to Reproduce:

  1. Open Microsoft Word 365 (Desktop version) on Windows 11.
  2. Create a new [blank] document.
  3. Launch JAWS
  4. Use the Early Adopter Program dialog to ensure that MathCAT is enabled.
    • Dialog is available from "Options" > "Early Adopter Program..."
    • Restart JAWS if necessary
  5. Visit a web page that contains math expressions encoded as MathML
  6. Navigate to a math expression
  7. Press "Enter" to bring up the "Math Viewer"
  8. Press "Ctrl+C" to copy the math expression
    • You should hear "Copied selection to clipboard"
  9. Switch back to the blank Microsoft Word document
  10. Press "Ctrl+V" to paste the math expression into the Word document
  11. Note that, instead of a Microsoft Math Object representing the expression copied in step 8, the raw MathML for the expression is pasted as plain text.
  12. Also note that the pasted MathML does not have an xmlns='http://www.w3.org/1998/Math/MathML' attribute on the <math> element.

Expected Behavior:

When pasted into Microsoft Word, the MathML should convert into a Microsoft Math Object for the original expression that can be edited using the Equation Editor.

Observed Behavior:

  • The MathML copied to the clipboard by JAWS lacks an xmlns='http://www.w3.org/1998/Math/MathML' attribute on the <math> element.
  • It appears that Microsoft Word requires the xmlns='http://www.w3.org/1998/Math/MathML' attribute on the <math> element before it will recognize the pasted content as MathML.
  • Since Microsoft Word does not recognize the pasted content as MathML, the pasted content becomes plain text instead of a Microsoft Math Object.

Version Information:

  • OS Version:
    -- OS Name: Microsoft Windows 11 Home
    -- Version: 10.0.22631 Build 22631
    -- Locale: United States
  • Microsoft Word Version:
    -- Microsoft 365 Apps for enterprise
    -- Microsoft® Word for Microsoft 365 MSO (Version 2406 Build 16.0.17726.20078) 64-bit
  • JAWS Version
    -- JAWS Professional Version 2024.2406.121 with MathCAT enabled.

Example Video and Sample Web Page with MathML content:

A link to a short example video demonstrating the issue and a link to the web page used to demonstrate the issue are available below:

Attachments:

Additional Context:
The xmlns='http://www.w3.org/1998/Math/MathML' attribute on the <math> element is optional for MathML contained in an HTML5 document. When MathCAT in NVDA copies math to the clipboard, it appears to ensure that the <math> element contains the correct xmlns attribute, whether or not the original <math> element had one.

Here is the MathML that was generated when copied using the JAWS Math Viewer with MathCAT enabled:

 <math id='Mpcd3zh6-0' data-id-added='true'>
  <mrow data-changed='added' id='Mpcd3zh6-1' data-id-added='true'>
    <msqrt id='Mpcd3zh6-2' data-id-added='true'>
      <mrow data-changed='added' id='Mpcd3zh6-3' data-id-added='true'>
        <mi id='Mpcd3zh6-4' data-id-added='true'>x</mi>
        <mo id='Mpcd3zh6-5' data-id-added='true'>+</mo>
        <mn id='Mpcd3zh6-6' data-id-added='true'>1</mn>
      </mrow>
    </msqrt>
    <mo id='Mpcd3zh6-7' data-id-added='true'>=</mo>
    <mn id='Mpcd3zh6-8' data-id-added='true'>3</mn>
  </mrow>
 </math>

Note that if the above MathML is copied and then pasted into Microsoft Word, it is inserted as plain text showing the MathML code:
Screenshot of Microsoft Word where MathML code rendered as paragraph text is visible.

Here is the MathML that was generated from the same expression when copied using MathCAT in NVDA:

<math xmlns='http://www.w3.org/1998/Math/MathML'>
 <mrow>
  <msqrt>
    <mrow>
      <mi>x</mi>
      <mo>+</mo>
      <mn>1</mn>
    </mrow>
  </msqrt>
  <mo>=</mo>
  <mn>3</mn>
 </mrow>
</math>

Note that if it is copied and then pasted into Microsoft Word it becomes a Microsoft Word Math Object:
Screenshot of Microsoft Word where an active Microsoft Equation is visible.

@brichwin changed the title from "MathML Copied from JAWS 2024's MathViewer with MathCAT Will Not Auto Convert Into Math Objects in Microsoft Word 365" to "MathML Copied from JAWS 2024's Math Viewer with MathCAT Will Not Auto Convert Into Math Objects in Microsoft Word 365" on Jul 25, 2024

@jkhurdan
Collaborator

I was able to reproduce the issue. I used the Freedom Scientific MathML Example page as my reference MathML code.

@brichwin
Collaborator Author

I'm wondering if this issue should be sent to both Freedom Scientific and to Microsoft?

  • For JAWS: Send the original MathML (without added IDs, etc.) after adding the appropriate xmlns attribute to the math element if it is missing. This would replace sending the MathML that has been processed with the added IDs and data attributes used to support the Math Viewer (processing which appears to strip the xmlns attribute).
  • For Microsoft: If well-formed, otherwise valid XML starting with a math element is pasted, it is reasonable to assume the namespace is "http://www.w3.org/1998/Math/MathML" when the xmlns attribute is missing from the math element. There may be other sources of MathML besides JAWS that fail to supply the xmlns attribute.
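
As a rough illustration of the JAWS-side suggestion above, here is a minimal Python sketch that strips the viewer-added id and data-* attributes and adds the xmlns attribute when it is missing. The function name clean_mathml and the regex-based approach are assumptions made for illustration only; this is not how JAWS or MathCAT is implemented, and a real tool would more likely work on a parsed XML tree.

```python
import re

MATHML_NS = "http://www.w3.org/1998/Math/MathML"


def clean_mathml(mathml: str) -> str:
    """Hypothetical helper: drop id/data-* attributes and ensure xmlns is present."""
    # Remove id='...' and data-*='...' attributes (single- or double-quoted).
    # Caveat: a plain regex would also touch matching text outside of tags.
    cleaned = re.sub(r"""\s+(?:id|data-[\w-]+)=(['"]).*?\1""", "", mathml)
    # Add the namespace to the first <math> start tag only if it has no xmlns yet.
    cleaned = re.sub(
        r"<math\b(?![^>]*\bxmlns=)",
        f"<math xmlns='{MATHML_NS}'",
        cleaned,
        count=1,
    )
    return cleaned
```

Applied to the JAWS output shown earlier in this issue, this should yield markup close to the NVDA output: the same element structure with the xmlns attribute on the math element and no id or data-* attributes.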

I wonder if @NSoiffer has an informed opinion to guide what we should do here?

@brichwin
Collaborator Author

By pasting in the two MathML examples above, I can confirm that the same behavior occurs in Microsoft® Word for Mac, Version 16.87 (24071426), License: Microsoft 365 Subscription.

@jkhurdan
Collaborator

I personally like the idea of sending to both. I think we should approach it from the perspective that other tools like equatio, mathpx, etc. may operate similarly, copying only a portion of the MathML. (I haven't tested those, however; I'm just giving them as theoretical examples.)

(Also, I tried to do this with VO. I know it's outside the scope of this group, but I couldn't find a way to copy MathML with VO as you can with JAWS/NVDA.)

@GeorgeKerscher
Collaborator

I too think it should go to both. We can confirm that the Freedom pages with MathML work properly with NVDA, right?

@NSoiffer

NSoiffer commented Aug 8, 2024

Sorry for the slow response -- just catching up from a long vacation...

FYI: in a web environment, "math" is a known tag and doesn't require a namespace attribute. Same for "svg". HTML5 is not XML and so it ignores namespaces.

The reason the copy/paste works from NVDA is because MathCAT puts the MathML out onto the clipboard using multiple "flavors". One of the flavors is Unicode, but the key for Word is that another flavor that is used is "MathML Presentation". There is also the more generic "MathML" flavor. MathCAT puts both MathML flavors out because an application might only look for one of them. Word sees the MathML flavor and then knows that the clipboard contains MathML. Otherwise, it thinks you want to paste the clipboard contents as text. So the key is to get JAWS to add the flavor when it puts things on the clipboard.

From the NVDA code:
First you need to register the format as it isn't a Windows known standard:

    CF_MathML = windll.user32.RegisterClipboardFormatW("MathML")
    CF_MathML_Presentation = windll.user32.RegisterClipboardFormatW("MathML Presentation")

When you do a copy, the code is

    with winUser.openClipboard(gui.mainFrame.Handle):
        winUser.emptyClipboard()
        if is_mathml:
            self._setClipboardData(self.CF_MathML, '<?xml version="1.0"?>' + text)
            self._setClipboardData(self.CF_MathML_Presentation, '<?xml version="1.0"?>' + text)
        self._setClipboardData(winUser.CF_UNICODETEXT, text)

If the application doesn't know about MathML flavors, the last line serves as a text fallback.

On the Mac, an analogous thing needs to be done, but the details differ. I haven't coded this for the Mac, so I don't know the details other than that the MathML spec says to use public.mathml.presentation and public.mathml.

Bottom line: the solution is not about namespaces -- it is to set up the clipboard format properly.
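
To make the "multiple flavors" idea concrete, here is a minimal, hedged Python sketch that mirrors the NVDA excerpt above. The names build_clipboard_payloads and register_mathml_formats are hypothetical helpers invented for illustration, and the "CF_UNICODETEXT" dictionary key is just a label standing in for the Windows text format. The sketch only prepares the per-format payloads and registers the format names; actually writing to the clipboard (opening it, emptying it, and setting the data) is left out.

```python
import sys

XML_DECLARATION = '<?xml version="1.0"?>'
MATHML_FORMAT_NAMES = ("MathML", "MathML Presentation")


def build_clipboard_payloads(text, is_mathml):
    """Hypothetical helper: map each clipboard format name to its payload."""
    payloads = {}
    if is_mathml:
        # Both MathML flavors carry the same XML, prefixed with an XML declaration,
        # because an application might only look for one of them.
        for name in MATHML_FORMAT_NAMES:
            payloads[name] = XML_DECLARATION + text
    # Plain text fallback for applications that don't know the MathML flavors.
    payloads["CF_UNICODETEXT"] = text
    return payloads


def register_mathml_formats():
    """Register the custom format names on Windows; return {} on other platforms."""
    if sys.platform != "win32":
        return {}
    from ctypes import windll  # only available on Windows

    return {
        name: windll.user32.RegisterClipboardFormatW(name)
        for name in MATHML_FORMAT_NAMES
    }
```

For MathML input, the payload dictionary has three entries, two MathML flavors plus the text fallback; for anything else it has only the text entry.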

@brichwin
Collaborator Author

brichwin commented Aug 8, 2024

It's always amazing how much Neil knows and how much I don't know that I don't know!

... the key for Word is that another flavor that is used is "MathML Presentation". There is also the more generic "MathML" flavor. MathCAT puts both MathML flavors out because an application might only look for one of them. Word sees the MathML flavor and then knows that the clipboard contains MathML. Otherwise, it thinks you want to paste the clipboard contents as text.

Some questions that all basically ask "Should we not request that Microsoft investigate changing the behavior of Word?":

  1. If I type a MathML expression with the xmlns attribute into notepad.exe on Windows (without naming or saving the file) and then copy it into Word, Word does build that up into a Microsoft Equation in the Professional format. I'm guessing that notepad.exe has no idea that the text was MathML. Does that mean that some background magic is happening to set the flavor, or that Word is parsing the plain text clipboard contents to some degree?
  2. Is it likely that most processes/tool(s) where an individual copies an expression as MathML would handle setting the flavors correctly for MS Word?

The main risk I see is that the user doesn't want a MathML snippet converted into a math object. However, there are already many use cases where the user needs to use the "Paste as text" feature to avoid having it inserted as an object, formatted, etc.

Are there other risks/reasons not to?

@GeorgeKerscher
Collaborator

I just tried using VS Code to write an expression and then copy it to Word.

The expression seems to copy correctly.

In VS Code, I then created a more complicated expression using the backslash codes and copied it into an empty Word expression, and it worked perfectly.
It seems that using VS Code is much easier than working in the equation editor, and simply pasting in what you want could be a good workflow.

Very interesting!

@brichwin
Collaborator Author

I found an online clipboard-inspect tool that displays all of the different "flavors"/types of content available in the currently copied item. It's at: https://evercoder.github.io/clipboard-inspector/

@GeorgeKerscher
Collaborator

This is very interesting! I was in VS Code and created an expression. I pasted it into the Word equation editor and all was well. I used Ctrl+= to build it up and copied it to the clipboard.

I pasted it into the inspector and it gave me the plain text and the HTML options.

I tried putting it back in linear format with Shift+Ctrl+= and copied it again. I went to the inspector and copied it out.

From what I can tell, it round trips between Word and VS Code with one exception.
The \sqrt becomes √

So it seems that √ is a plain text character. I suppose that is just the Unicode value.

The point is that VS Code is simple to write in, and if it round trips properly, then we may have a workflow that many people would want to use.

While doing this, I crashed MathCAT as I have done before. I filed that issue in the MathCAT issue tracker.

@MurrayIII

MurrayIII commented Sep 3, 2024

To recognize plain text as MathML, Word requires the xmlns='http://www.w3.org/1998/Math/MathML' attribute on the <math> element.
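
A tool that copies MathML as plain text could run a quick pre-flight check for this requirement before writing to the clipboard. The sketch below is only an assumption about what such a check might look like; the function name has_mathml_namespace is hypothetical, and Word's actual detection logic is not described in this thread.

```python
import xml.etree.ElementTree as ET

MATHML_NS = "http://www.w3.org/1998/Math/MathML"


def has_mathml_namespace(text: str) -> bool:
    """Return True if text is well-formed XML whose root is a namespaced <math>."""
    try:
        root = ET.fromstring(text.strip())
    except ET.ParseError:
        # Not well-formed XML, so it cannot be namespaced MathML.
        return False
    # ElementTree reports namespaced tags in the form "{namespace}localname".
    return root.tag == "{%s}math" % MATHML_NS
```

With the two samples from the issue description, the JAWS output (no xmlns) would return False and the NVDA output (with xmlns) would return True.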

@MurrayIII

MurrayIII commented Sep 18, 2024

To copy MathML from the SVG MathJax DOM (in https://murrayiii.github.io/UnicodeMathML/playground/), I wrote a JavaScript function:

function getMathJaxMathMlNode() {
   /* MathJax output-element DOM has the form:
      <mjx-container
         <svg
            <mjx-assistive-mml
               <mjx-container
                  <svg
                     <mjx-assistive-mml
                        <math ...
   */
   let node = output.firstElementChild.lastElementChild.firstElementChild
   console.log('nodename = ' + node.nodeName)
   if (node.nodeName == 'MJX-CONTAINER')
      node = node.lastElementChild.firstElementChild
   return node
}

Then in my output.addEventListener('keydown', function (e) {...}) handler, I included

if (output.firstElementChild.nodeName == 'MJX-CONTAINER') {
    // MathJax is active. Copying the whole math zone is supported
    e.preventDefault()
    if (key.length > 1)
        return
    if (e.ctrlKey && key == 'c') {
        let node = getMathJaxMathMlNode()
        let mathml = node.outerHTML
        if (mathml.startsWith('<math'))
            navigator.clipboard.writeText(mathml)
    }
    return
}

This works with desktop Word

@ways2read
Member

Response from eDAD:

Thank you for letting us know about this issue. We have reviewed it with the team working on Word, and it has been logged in their bug tracking system. The Word team reviews their list of issues regularly and takes action on them based on the priority and other work items on the list. Since there is no immediate action for eDAD, and the issue is in the Word team's backlog of issues to work on, we are closing this ticket. If you want to check the status at any point in time, please contact us, and we will assist you.

@MurrayIII

Another way to copy MathML into Word from a web page is to ensure that native MathML rendering is active, i.e., that MathJax is not active. You can compare the two using https://murrayiii.github.io/UnicodeMathML/playground/. The Settings menu lets you enable/disable MathJax rendering. The playground web page does copy MathJax MathML as well using the JavaScript code above.


6 participants