Bold Texts #15

Open
spekulatius opened this issue May 16, 2021 · 11 comments

@spekulatius

Hey @euangoddard

Super long shot here. I know this might never happen, but I'll give it a try anyway. I noticed that all bold text is dropped when I use https://euangoddard.github.io/clipboard2markdown/. Any chance this could be added?

Cheers,
Peter

@euangoddard
Owner

Thanks for reporting this. It would be good to have a reproduction of this. If I had to guess then I would suspect that the bold text in the source only appears bold and isn't actually marked as bold in a way the processor understands, i.e. it has some style applied to make it seem bold. Whenever I have tested this with formatted text it always translates correctly. Perhaps if you can attach an example source document I can have a look at what's going on?

@spekulatius
Author

Hello @euangoddard,

Yeah, I understand that it might look bold without actually being marked up as bold, and is just displayed that way.

It's happening when I use Google Docs. I've created an example here: https://docs.google.com/document/d/17tpX1h53phZ-64uP4flTiOJIMQrpvEuCcdf49hdvV2A/edit?usp=sharing

Cheers,
Peter

@euangoddard
Owner

I suspect it is a Google Docs issue. When you export a Google Doc as HTML, I know the bold sections in there aren't represented with <strong> or <b> tags but are simply styled, so I suspect the same root cause is manifesting here. It would be pretty hard to solve the problem: you'd have to detect styles that look bold and then convert those, on the fly, to semantically bold tags prior to processing. Happy to consider any PR that addresses this.

@spekulatius
Author

Hmmm, okay. I guessed something along these lines when opening the issue. I've seen the bold part working with other sources. So I guessed there is something non-standard going on.

I've got limited time atm. For the next few weeks I surely won't have time for any side projects. Maybe after this period.

@euangoddard
Owner

Fantastic that you're at least keen to try! I think it could be really quite a tricky problem to solve as you'll need to traverse the entire node tree and replace styles that look bold with semantic elements. Good luck!
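
A minimal sketch of the pre-processing pass described above, assuming the pasted HTML has already been parsed into a DOM tree. `promoteStyledElements` and all of its parameters are hypothetical names, not part of clipboard2markdown; the caller supplies the test for what counts as "looks bold".

```javascript
// Sketch only: wrap the contents of visually styled elements in a semantic tag.
// root       - element whose descendants are inspected (root itself is not checked)
// doc        - object providing createElement (the page's `document` in a browser)
// looksStyled - caller-supplied predicate, e.g. a font-weight check
// tagName    - semantic element to insert, e.g. 'strong'
function promoteStyledElements(root, doc, looksStyled, tagName) {
  // Copy the child list first: the loop below rearranges the tree.
  var children = Array.prototype.slice.call(root.childNodes);
  children.forEach(function (child) {
    if (child.nodeType !== 1) return; // skip text and comment nodes
    promoteStyledElements(child, doc, looksStyled, tagName);
    if (looksStyled(child)) {
      var wrapper = doc.createElement(tagName);
      // appendChild moves a node, so this empties `child` into `wrapper`.
      while (child.firstChild) wrapper.appendChild(child.firstChild);
      child.appendChild(wrapper);
    }
  });
}
```

In a browser this might be called as `promoteStyledElements(container, document, function (n) { return Boolean(n.style) && Number(n.style.fontWeight) >= 600; }, 'strong')` before handing the HTML to the converter, where `container` is a hypothetical element holding the pasted content. The original styled element is kept in place, so a converter rule that also matches on the style would need to be avoided or the style cleared.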

@spekulatius
Author

That sounds like an interesting challenge. Not sure I'll manage tho

@DanielOaks

Hmm, I might not make a full PR for this, but adding this to clipboard2markdown.js has worked for converting Reddit's bold text properly for me:

    {
      filter: function (node) {
        // TODO: check other font-weights (keyword values such as 'bold'
        // fail this numeric comparison and are not matched)
        return node.style.fontWeight && (node.style.fontWeight > 500);
      },
      replacement: function (content) {
        return '**' + content + '**';
      }
    },
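
One way the TODO in the snippet above might be addressed is sketched below. It keeps the same `filter`/`replacement` shape and the same `> 500` threshold, but also accepts keyword weights. `isBoldWeight` and `boldStyleRule` are hypothetical names, and treating `bolder` as bold is an assumption, since that keyword is relative to the parent's weight.

```javascript
// Hypothetical helper, not part of clipboard2markdown or its converter library.
function isBoldWeight(fontWeight) {
  if (!fontWeight) return false; // '' or undefined: no inline weight set
  if (fontWeight === 'bold' || fontWeight === 'bolder') return true; // assumption: 'bolder' counts
  var numeric = parseInt(fontWeight, 10); // '700' -> 700, 'normal' -> NaN
  return !isNaN(numeric) && numeric > 500;
}

var boldStyleRule = {
  filter: function (node) {
    // Only inline styles are inspected; weights set via stylesheets are not seen.
    return Boolean(node.style) && isBoldWeight(node.style.fontWeight);
  },
  replacement: function (content) {
    return '**' + content + '**';
  }
};
```

Because the check reads `node.style` only, it covers sources that paste inline `font-weight` declarations, which is the case reported in this thread; anything styled through a class would still be missed.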

@euangoddard
Owner

That's interesting. I am currently rewriting the project from scratch and can definitely include this patch in there. It makes sense to me to support these visually styled bold elements.

@colinbrislawn

Supporting Italics would be nice as well! My use case is also Google Docs -> Markdown

@euangoddard
Owner

@colinbrislawn that's interesting that Google Docs paste doesn't support italics. When I've looked into this previously it seemed to work. I know that some rich text editors don't implement rich text as semantic elements, which makes it quite difficult to resolve the original author's formatting intent. The approach mentioned by @DanielOaks could certainly be used for italics as well.
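
For illustration, an italic counterpart to the bold rule might look like the sketch below. `italicStyleRule` is a hypothetical name, and the choice of `*` as the emphasis marker is an assumption; the project may emit a different one.

```javascript
// Sketch only: match elements whose inline style makes them look italic.
var italicStyleRule = {
  filter: function (node) {
    var fontStyle = node.style && node.style.fontStyle;
    // Exact keyword matches only; other values are left unmatched.
    return fontStyle === 'italic' || fontStyle === 'oblique';
  },
  replacement: function (content) {
    return '*' + content + '*';
  }
};
```

An element styled both bold and italic may end up matching only one of the two rules, depending on how the converter picks between rules that apply to the same node.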

@mackaaij

Just experienced the same using text copied from SharePoint/Word using macOS Safari. Headers would be nice too; they come across via "data-ccp-parastyle" according to https://dynalist.io/clipboard


5 participants