Lompik/html-to-org

Convert HTML files to org-mode

This tool converts HTML files to org-mode, focusing on keeping the formatted text (no embedded span or div in the output). It is based on Google’s gumbo parser. It comes in two versions (Python and Nim), which are kept in sync (they do the same thing) as much as possible.
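To illustrate what "keeping the formatted text" can mean, here is a minimal sketch of the idea. It is not the project's code: it swaps gumbo for Python's standard `html.parser` so that it runs without extra dependencies, and the names `OrgSketch` and `html_to_org_sketch` are hypothetical. It maps a few tags to org-mode markup and drops span and div wrappers while keeping their text.

```python
from html.parser import HTMLParser

# Inline tags mapped to the org-mode emphasis marker that wraps their text.
INLINE_MARKERS = {"b": "*", "strong": "*", "i": "/", "em": "/", "code": "~"}
# Heading tags mapped to the number of leading stars in org-mode.
HEADING_LEVELS = {"h1": 1, "h2": 2, "h3": 3, "h4": 4}


class OrgSketch(HTMLParser):
    """Hypothetical illustration only; the real tool is built on gumbo."""

    def __init__(self):
        super().__init__()
        self.out = []          # pieces of the org-mode output
        self.href = None       # target of the link currently being read
        self.link_text = []    # text collected inside that link

    def _emit(self, text):
        # Text inside <a href=...> is buffered until the link closes.
        target = self.link_text if self.href is not None else self.out
        target.append(text)

    def handle_starttag(self, tag, attrs):
        if tag in INLINE_MARKERS:
            self._emit(INLINE_MARKERS[tag])
        elif tag in HEADING_LEVELS:
            self._emit("\n" + "*" * HEADING_LEVELS[tag] + " ")
        elif tag == "p":
            self._emit("\n")
        elif tag == "li":
            self._emit("\n- ")
        elif tag == "a":
            self.href = dict(attrs).get("href")
            self.link_text = []
        # span, div and all other tags are ignored: only their text is kept.

    def handle_endtag(self, tag):
        if tag in INLINE_MARKERS:
            self._emit(INLINE_MARKERS[tag])
        elif tag in HEADING_LEVELS or tag == "p":
            self._emit("\n")
        elif tag == "a" and self.href is not None:
            href, self.href = self.href, None
            self.out.append("[[%s][%s]]" % (href, "".join(self.link_text)))

    def handle_data(self, data):
        self._emit(data)


def html_to_org_sketch(html):
    parser = OrgSketch()
    parser.feed(html)
    parser.close()
    return "".join(parser.out).strip()


if __name__ == "__main__":
    print(html_to_org_sketch(
        '<div><h1>Title</h1><p>Some <span><b>bold</b></span> text.</p></div>'
    ))
```

The sketch ignores whitespace normalization, line wrapping, tables, and nested lists, all of which a real converter has to handle.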

Usage

  • Install gumbo.
  • Run your executable of choice with one HTML file (or URL) as its argument. The org-mode output is written to stdout.
  • For the Python version, install the gumbo-parser Python binding (append --user for a per-user install):
pip install gumbo
  • To compile the Nim executable (tested with Nim 0.18):
cd nim
nim c html_to_org.nim

Compiling the Nim version requires libgumbo-dev.

To do

  • handle HTML anchor/fragment links
    • probably needs a uid for each header
  • fix incorrect line wrapping in the Nim version
  • “browse the web in org-mode” mode
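One possible approach to the anchor/fragment item, sketched under assumptions: org-mode resolves links of the form `[[#some-id][text]]` against a heading's `CUSTOM_ID` property, so each header could be given a uid taken from its HTML `id` attribute or derived from its title. The helpers `slugify`, `org_heading`, and `org_link` below are hypothetical and not part of the tool.

```python
import re
from urllib.parse import urldefrag


def slugify(title):
    """Derive a lowercase, dash-separated uid from a heading title."""
    return re.sub(r"[^a-z0-9]+", "-", title.lower()).strip("-")


def org_heading(level, title, uid=None):
    """Render an org heading whose CUSTOM_ID can be targeted by links."""
    uid = uid or slugify(title)  # prefer the HTML id attribute when given
    return "\n".join([
        "*" * level + " " + title,
        ":PROPERTIES:",
        ":CUSTOM_ID: " + uid,
        ":END:",
    ])


def org_link(href, text):
    """Render a link; same-document fragments become CUSTOM_ID links."""
    url, fragment = urldefrag(href)
    if url == "" and fragment:
        # "#usage" points inside the same document.
        return "[[#%s][%s]]" % (fragment, text)
    return "[[%s][%s]]" % (href, text)


if __name__ == "__main__":
    print(org_heading(2, "Usage notes"))
    print(org_link("#usage-notes", "see usage"))
```

Deriving the uid from the title is only a fallback: it can collide when two headers share the same text, so a real implementation would need to make the ids unique.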
