I first wrote a MangaReader crawler in Ruby, using Typhoeus. The code became convoluted and I never properly refactored it.
The original code is here:
https://github.com/akitaonrails/manga-downloadr
I also wrote a better-structured Elixir version here:
https://github.com/akitaonrails/ex_manga_downloadr
You will need to install ImageMagick on your system (it is used to resize images to Kindle format and merge them into PDF volumes). Refer to your system's installation instructions. On Ubuntu, simply run:
sudo apt-get install imagemagick
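To set up the development environment, install the dependencies and build the release binary (on recent Crystal versions, where the deps subcommand has been removed, run shards install in place of crystal deps):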
crystal deps
crystal build src/cr_manga_downloadr.cr --release
Once you have the compiled binary, run it like this:
./cr_manga_downloadr -u http://www.mangareader.net/onepunch-man -d /tmp/onepunch-man
In this example, all the pages of "One Punch Man" will be downloaded to the directory "/tmp/onepunch-man", with filenames in the following format:
/tmp/onepunch-man/Onepunch-Man-Chap-00038-Pg-00011.jpg
Chapter and page numbers are left-padded with zeroes so that the filesystem sorts them correctly.
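To illustrate the naming scheme, here is a minimal shell sketch that rebuilds the filename shown above. The function name page_filename and its argument order are hypothetical and chosen only for this example; the five-digit width is inferred from the sample filename, not taken from the downloader's source.

```shell
# Hypothetical helper (not part of cr_manga_downloadr).
# Usage: page_filename DIRECTORY MANGA_NAME CHAPTER PAGE
# Pads chapter and page to 5 digits, matching the sample filename above.
page_filename() {
  printf '%s/%s-Chap-%05d-Pg-%05d.jpg\n' "$1" "$2" "$3" "$4"
}

page_filename /tmp/onepunch-man Onepunch-Man 38 11
# prints /tmp/onepunch-man/Onepunch-Man-Chap-00038-Pg-00011.jpg
```

Pass the numbers without leading zeroes: printf reads an argument such as 011 as octal. Without the padding, a plain name sort would place page 10 before page 2.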
You can also pass the --cache flag (or just -c) to turn on the HTTP cache. That way you can press Ctrl-C in the middle of the process and resume later where you left off.
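The general idea of why a cache makes resuming cheap can be sketched in shell as follows. This is only an illustration and not how cr_manga_downloadr stores its cache: the function name cached_fetch, the CACHE_DIR variable, its default path, and the cksum-based key are all assumptions made for this example.

```shell
# Hypothetical sketch of an on-disk HTTP cache (not the tool's real implementation).
# Usage: cached_fetch URL
# Prints the response body, fetching it with curl only if it is not cached yet.
cached_fetch() {
  url="$1"
  dir="${CACHE_DIR:-/tmp/example_http_cache}"   # assumed location
  key=$(printf '%s' "$url" | cksum | cut -d' ' -f1)
  file="$dir/$key"
  mkdir -p "$dir"
  if [ ! -s "$file" ]; then
    # Cache miss: download, and drop the partial file if the request fails.
    curl -fsSL "$url" -o "$file" || { rm -f "$file"; return 1; }
  fi
  cat "$file"
}

# Example call (performs a real request on first use):
#   cached_fetch http://www.mangareader.net/onepunch-man > /dev/null
```

With this kind of scheme, a run interrupted with Ctrl-C leaves the already fetched responses on disk, so the next run only requests what is still missing.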
You can run the specs like this:
crystal spec
If you want to benchmark against the other implementations of the downloader, you can use the test mode like this:
time ./cr_manga_downloadr --test
And you can test with cache turned on as well:
time ./cr_manga_downloadr --test --cache
This will use the One-Punch Man manga as the test sample.
- Fork it ( https://github.com/akitaonrails/cr_manga_downloadr/fork )
- Create your feature branch (git checkout -b my-new-feature)
- Commit your changes (git commit -am 'Add some feature')
- Push to the branch (git push origin my-new-feature)
- Create a new Pull Request
- AkitaOnRails - creator, maintainer