plugin to scrape website & convert HTML to markdown #923
Comments
I use the copy as markdown plugin for Chrome and find it very convenient.
Thanks for sharing! Looks like reMarked has trouble parsing the code blocks for some reason.
(Screenshot comparing reMarked.js vs pandoc)
reMarked code blocks fenced with 'true'?
This is funny. I couldn't figure out why the reMarked demo was fencing code blocks with 'true'. I think it's just a mistake in how the reMarker object is configured on the demo page:
    // code blocks will be delimited with the string 'true'
    var reMarker = new reMarked({gfm_code: true});
    // this is what we want
    // try it by pasting into the console at reMarked demo site
    var reMarker = new reMarked({gfm_code: "```"});
    reMarker.render(document.getElementById('html-inp').value)
(Screenshot: example reMarked.js output)
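To illustrate the suspected misconfiguration, here is a minimal sketch of a converter that uses an option value verbatim as its fence string. `fenceCode` is a hypothetical helper written for this example and is not taken from reMarked's source; it only assumes, as the comment above guesses, that the `gfm_code` value ends up as the delimiter.

```javascript
// Hypothetical helper, NOT reMarked's actual code: it assumes the
// gfm_code option value is used directly as the fence delimiter.
function fenceCode(code, options) {
  // A boolean true is coerced to the string "true" here.
  const fence = String(options.gfm_code);
  return `${fence}\n${code}\n${fence}`;
}

// With gfm_code: true the block is wrapped in the word "true";
// with gfm_code: "```" it is wrapped in a normal GFM fence.
console.log(fenceCode('let x = 1;', { gfm_code: true }));
console.log(fenceCode('let x = 1;', { gfm_code: '```' }));
```

Under that assumption, passing the delimiter string itself instead of a boolean would give the expected fences.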
@kazup01 has boosted this issue with $100. Visit this issue on Issuehunt
@StormBurpee has started working. Visit this issue on Issuehunt
@StormBurpee has submitted output. Visit this issue on Issuehunt
Hey guys, feel free to take a look at the pull request I made for this feature over at #1981. In the issue I've attached a few example photos for you to see.
@Rokt33r has stopped working. Visit this issue on Issuehunt
@kazup01 cancelled funding, $100, of this issue. Visit this issue on Issuehunt
@BoostIO funded this issue with $100. Visit this issue on Issuehunt
@edokan has started working. Visit this issue on Issuehunt
A good web clipper: https://github.com/mika-cn/maoxian-web-clipper/
Would a web clipper (like Evernote's browser extension) be a better solution for this?
@ZeroX-DG has rewarded $90.00 to @AWolf81. See it on IssueHunt
sweet!
I'm enjoying boostnote after switching from evernote & quiver.app - thank you to everyone who has contributed to this promising open source tool.
I keep a "code" notebook for technical notes-to-self and today I wanted to add a "clipping" of a blog post to it. I wasn't sure what the best way was (sometimes I try copying-and-pasting directly from the browser, which worked OK in quiver's rich-text note mode... but rtf, gross), so I tried out a few tools for automatically converting from HTML to markdown.
pandoc has a command-line option to fetch content from a URL and can convert to/from HTML, markdown, and many other formats. Install on OS X with:
    brew install pandoc
then:
as a handy fish shell function:
Pandoc does an OK job but is definitely not perfect, so some manual editing of the output may be necessary, for instance deleting header & footer content.
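For anyone who would rather call pandoc from Node than from the shell, here is a minimal sketch. It assumes pandoc is installed and on the PATH; `pandocArgs` and `urlToMarkdown` are hypothetical names made up for this example, and the flags shown are one common invocation, not necessarily the one used in this post.

```javascript
const { execFile } = require('child_process');

// Build the pandoc argument list for an HTML -> markdown conversion.
// pandoc accepts a URL in place of an input file and fetches it itself.
function pandocArgs(url, outFile) {
  const args = ['-f', 'html', '-t', 'markdown', url];
  if (outFile) args.push('-o', outFile); // write to a file instead of stdout
  return args;
}

// Run pandoc and resolve with the markdown it prints to stdout.
function urlToMarkdown(url) {
  return new Promise((resolve, reject) => {
    const options = { maxBuffer: 10 * 1024 * 1024 }; // allow large pages
    execFile('pandoc', pandocArgs(url), options, (err, stdout) => {
      if (err) reject(err);
      else resolve(stdout);
    });
  });
}

// Usage (needs pandoc and network access):
// urlToMarkdown('https://example.com').then(md => console.log(md));
```

Header and footer content would still need to be trimmed by hand afterwards, as noted above.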
If you don't want to install anything, fuckyeahmarkdown.com seems to have an alright hosted converter.
feature request
Add a command (plugin?) to Boostnote that takes a URL as input, scrapes the page, converts the html to markdown, and creates a new note filled with the result.
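A rough sketch of what that command could look like, to make the request concrete. Everything here is an assumption: `clipUrlToNote`, `fetchHtml`, `extractTitle`, and the `{ title, content }` note shape are hypothetical and do not come from Boostnote's codebase. The converter is passed in so any HTML-to-markdown library can be tried; the global `fetch` used by default requires Node 18 or later.

```javascript
// Hypothetical sketch: none of these names come from Boostnote itself.

// Fetch a page's HTML using the global fetch available in Node 18+.
async function fetchHtml(url) {
  const res = await fetch(url);
  if (!res.ok) throw new Error(`Request failed: ${res.status} ${url}`);
  return res.text();
}

// Pull the <title> text out of the HTML, falling back when it is missing
// or empty.
function extractTitle(html, fallback) {
  const match = /<title[^>]*>([\s\S]*?)<\/title>/i.exec(html);
  return (match && match[1].trim()) || fallback;
}

// Scrape a URL and build a note-shaped object from the converted markdown.
// `htmlToMarkdown` is injected by the caller; `fetcher` can be swapped out
// for testing or for a different HTTP client.
async function clipUrlToNote(url, htmlToMarkdown, fetcher = fetchHtml) {
  const html = await fetcher(url);
  const markdown = htmlToMarkdown(html);
  return {
    title: extractTitle(html, url),
    content: `${markdown.trim()}\n\nSource: ${url}\n`,
  };
}
```

Keeping the converter as a parameter means the choice of library (pandoc, node-europa, or something else) stays open, and the function can be exercised with a stub instead of a real network call.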
Starting points:
node-europa: "is a Node.js module for converting HTML into valid Markdown that uses the Europa Core engine."
scrape-markdown: CLI tool based on node-europa
    npm install github:evangoer/scrape-markdown
    ./node_modules/.bin/scrape-markdown [URL]
I would be happy to help with implementation.
#405
IssueHunt Summary
awolf81 has been rewarded.
Backers (Total: $100.00)