Request: Add support for simply-hentai #89
Why yes, this is very useful. That makes this a whole lot easier. Thanks. Is it necessary (or would it be useful) to add login support, or is everything available without being logged in?
Everything is available to anonymous users. I wrote a custom script a while back and didn't have any issues with limits or throttling after downloading no less than one thousand pages. And that was before discovering the json containing the full index, so I was crawling the whole thing. I did put one second of sleep between requests, though; I like to be nice to servers just in case. You can bookmark and favorite works with an account, if you want to go the extra mile and add support for that. But the site is perfectly usable without one.
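A minimal sketch of that kind of polite crawling, assuming Python and the `requests` library; the one-second delay matches what's described above, everything else (URLs, error handling) is just a placeholder:

```python
import time
import requests

def crawl(urls, delay=1.0):
    """Fetch each URL anonymously, sleeping between requests to be nice to the server."""
    session = requests.Session()
    pages = []
    for url in urls:
        response = session.get(url)
        response.raise_for_status()
        pages.append(response.text)
        time.sleep(delay)  # one second between requests, as described above
    return pages
```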
All videos hosted on their own servers seem to be dead, but myhentai.tv embeds, which are most of the videos, work fine.
H-manga/galleries, single images and gifs, and even videos should work now. I've noticed that the download speed for anything not cached by their CDN is incredibly slow and may even result in a read-timeout, but downloads still finish, given enough time, so I guess it's fine. Videos hosted on their own servers are also all gone, but most of the videos listed are hosted on another service and they work just fine. Anyway, notify me if you find anything that doesn't work the way it should.
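Given the slow responses for uncached files, something along these lines might help as a user-side workaround (a rough Python sketch; the timeout and retry values are arbitrary guesses, not what gallery-dl actually uses):

```python
import requests

def download(url, path, retries=3, timeout=(10, 300)):
    """Download one file with a generous read timeout and a few retries,
    since objects not cached by the CDN can be extremely slow to arrive."""
    for attempt in range(retries):
        try:
            with requests.get(url, stream=True, timeout=timeout) as response:
                response.raise_for_status()
                with open(path, "wb") as fp:
                    for chunk in response.iter_content(chunk_size=64 * 1024):
                        fp.write(chunk)
            return
        except requests.exceptions.Timeout:
            if attempt == retries - 1:
                raise
```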
Thanks, I have been trying it with a bunch of links and it's working nicely for the most part. Last time I crawled the site it didn't time out that often; I suppose they have changed the infrastructure on their CDN. Anyway, I said "for the most part" because I found a gallery that fails to download. Thanks again for your work and for releasing it as Free Software with Linux support and all. It's appreciated.
I've tried the link you posted and it works just fine ... well, at least now it does. Maybe the site had some sort of hiccup when you tried, or it needs some time to generate the JSON data. I also tested the 10 newest and 10 oldest galleries to try to reproduce this problem, but to no avail, i.e. everything worked as it should.
Nope, failed again for me, third day in a row. Removed my config file just in case, here's the log:
I can't open the json file in my browser either; it gets redirected endlessly too. Same with wget. I tried other links and they work flawlessly. It's not gallery-dl's fault, but it's baffling.
Since I don't have this infinite redirect problem, I kind of need to know what works and what doesn't on your side to fix this:
# should return the HTML version
$ wget --header='Accept: text/html' https://original-work.simply-hentai.com/dolls-anzai-rina-hen-dolls-rina-anzais-story/all-pages
# should get the same JSON data as all-pages.json would; or cause infinite redirects ...
$ wget --header='Accept: application/json' https://original-work.simply-hentai.com/dolls-anzai-rina-hen-dolls-rina-anzais-story/all-pages
Yes, I can access it. It works as intended, thumbnails and all.
Response from the first command: https://pastebin.com/A8FSZE5Z
Response from the second command: https://pastebin.com/Gmp92KzX
That trick with the header worked. The server still refuses to serve me the json file using the proper URL.
I've changed the HTTP request to ask for `all-pages` with an `Accept: application/json` header instead of fetching `all-pages.json`. As it turns out, the webserver only sends the HTML version if you send an `Accept` header that includes `text/html`, and serves the JSON data otherwise.
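In other words, the JSON can be obtained through content negotiation on the plain `all-pages` URL instead of the `.json` URL. A minimal illustration, assuming Python's `requests` (gallery-dl's actual request code may differ):

```python
import requests

URL = ("https://original-work.simply-hentai.com/"
       "dolls-anzai-rina-hen-dolls-rina-anzais-story/all-pages")

# The server picks the representation from the Accept header, so asking for
# JSON here sidesteps the endless redirects that all-pages.json can produce.
data = requests.get(URL, headers={"Accept": "application/json"}).json()
```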
Go figure. Can't decide whether that's clever or obscure web design. The last patch seems to work fine on my end. Thank you for your time, again.
Would it be possible to add support for https://www.simply-hentai.com/? It's a hentai site similar to nhentai, hbrowse, and the like. I could try to do it myself, but there's no documentation on how to do it, and I would rather not submit a half-baked patch that would have to be reviewed and rewritten.
Every doujin/manga/gallery has its own main page containing a cover, the title, and metadata such as tags, language, author(s), number of pages, etc. Language info is not always present, and I believe one work can have several authors, but I can't find an example now.
Depending on several factors, the URLs for these main pages can differ:

- `original-work` as the subdomain, plus the main domain and a slug, like this: https://original-work.simply-hentai.com/seductive-uniform-ch-1-21
- `www` as the subdomain, plus the main domain and two slugs, one for the series it parodies and another for the title, like this: https://www.simply-hentai.com/fresh-precure/eas-sama-no-sakusei-jigoku

Each work has a page showing thumbnails for every page, and it follows the structure `(url)/all-pages`, like this: https://pokemon.simply-hentai.com/mao-friends-9bc39/all-pages

Each page can be viewed separately, and their links follow the structure `(url)/page/(page_id)`, like this: https://pokemon.simply-hentai.com/mao-friends-9bc39/page/4052558

There are also extra sections for gif galleries and videos whose URLs are very similar to the previous ones, so some sort of detection would be needed to avoid trying to download a manga that isn't there.
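For what it's worth, here is a rough Python regular expression covering the URL shapes listed above; it is only a guess derived from the examples, not the pattern gallery-dl would end up using, and it does nothing yet to exclude the gif and video sections:

```python
import re

GALLERY_RE = re.compile(
    r"https?://([\w-]+)\.simply-hentai\.com"      # subdomain: original-work, www, or a series name
    r"/([\w-]+(?:/[\w-]+)??)"                     # title slug, or series/title for www URLs
    r"(?:/all-pages(?:\.json)?|/page/(\d+))?/?$"  # optional listing or single-page suffix
)

for url in (
    "https://original-work.simply-hentai.com/seductive-uniform-ch-1-21",
    "https://www.simply-hentai.com/fresh-precure/eas-sama-no-sakusei-jigoku",
    "https://pokemon.simply-hentai.com/mao-friends-9bc39/all-pages",
    "https://pokemon.simply-hentai.com/mao-friends-9bc39/page/4052558",
):
    print(bool(GALLERY_RE.match(url)))  # each of the example URLs above matches
```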
Each work has an associated json file containing the URLs to the files themselves, following the structure `(url)/all-pages.json`, like this: https://pokemon.simply-hentai.com/mao-friends-9bc39/all-pages.json. The content of said file is as follows: each page is defined by an id, a giant thumb, the link to view said page, and whether or not it was bookmarked by the user. The giant thumb, although big, is smaller than the full page, so the full page (property `full`) is the one that should be downloaded. The only way to know the actual page number is by the position in the list.

The main page for each work offers a download option, but it's just a list of filelockers to get an encrypted zip file.
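Based on that description, a rough sketch of how a downloader could consume the JSON, in Python. Apart from the `full` property, the key names and the top-level layout are assumptions (the real file may nest the list under another key), so treat it as illustrative only:

```python
import os
import requests

GALLERY = "https://pokemon.simply-hentai.com/mao-friends-9bc39"

# Request the page list as JSON; assumed here to be a top-level list of page objects.
pages = requests.get(GALLERY + "/all-pages",
                     headers={"Accept": "application/json"}).json()

# The page number is not stored anywhere; it is simply the position in the list.
for number, page in enumerate(pages, start=1):
    image_url = page["full"]  # "full" is the full-size image; assumed to be a plain URL string
    extension = os.path.splitext(image_url)[1] or ".jpg"
    with open("{:03}{}".format(number, extension), "wb") as fp:
        fp.write(requests.get(image_url).content)
```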
As far as I know, the site doesn't offer an API.
Hope this info is useful.