[kemono.party - Patreon] Inconsistencies downloading main files vs attachments #1899
Comments
The
There's a
You haven't, it's just that every attempt at fixing this "duplicate files for patreon posts" issue has failed so far, including the current "ignore main file if there are attachments".
BTW, for new files, the SHA-256 taken from the URL can be used to determine whether two files are actually the same or only have the same name.
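For anyone who wants to try the idea, here is a rough Python sketch. It assumes the hash shows up somewhere in the URL path as a run of exactly 64 hex characters; the exact URL layout isn't spelled out in this thread, so treat the pattern as a guess. `sha256_from_url` and `group_by_hash` are made-up helper names, not part of gallery-dl.

```python
import re
from collections import defaultdict
from urllib.parse import urlparse

# Assumption: a SHA-256 digest appears in the URL path as exactly 64 hex digits.
_SHA256_RE = re.compile(r"(?<![0-9a-fA-F])[0-9a-fA-F]{64}(?![0-9a-fA-F])")


def sha256_from_url(url):
    """Return the lowercase 64-hex digest found in the URL path, or None."""
    match = _SHA256_RE.search(urlparse(url).path)
    return match.group(0).lower() if match else None


def group_by_hash(urls):
    """Group URLs by embedded digest; URLs without one end up under the key None."""
    groups = defaultdict(list)
    for url in urls:
        groups[sha256_from_url(url)].append(url)
    return dict(groups)
```

Any group with more than one URL under a non-`None` key would point at the same file content, whatever the filenames say. Old-style URLs without a hash all fall under `None` and still need another comparison.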
Ah, I see. Thanks for clearing that up. I suppose I'll just have to download everything and manually remove duplicates, then.
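If the manual cleanup gets tedious, a small script can at least find the byte-identical files after everything is downloaded. This is only a sketch using the Python standard library, not a gallery-dl feature, and `find_duplicates` is a hypothetical helper name. It hashes file contents, so it catches duplicates even when the filenames differ.

```python
import hashlib
from collections import defaultdict
from pathlib import Path


def file_sha256(path, chunk_size=1 << 20):
    """Hash a file's contents in chunks so large files are not loaded at once."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def find_duplicates(root):
    """Return lists of paths under root whose contents are byte-identical."""
    by_hash = defaultdict(list)
    for path in sorted(Path(root).rglob("*")):
        if path.is_file():
            by_hash[file_sha256(path)].append(path)
    return [paths for paths in by_hash.values() if len(paths) > 1]
```

The function only reports groups and deletes nothing, so which copy to keep stays a manual decision.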
Yeah. I think I made the issue that led to that option being included, actually. Heh.
That's good to know. There may be something I use that for.
Well, for what it's worth, the "ignore main file if there are attachments" approach does filter out the vast, vast majority of duplicates and it's mostly solved kemono's data duplication. I just seem to have found an artist or a post that happens to store data differently.
Is there a download comparison option in gallery-dl that does that? I've looked through some of the comparison options in the config documentation but I don't remember seeing something like that.
It's the new URL format introduced 4 days ago. Currently, not all files use it.
There are some cases where the images aren't posted in the 'files' area but in the 'content' area, and the downloader skips the content ones. The images aren't links, just inline.
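To illustrate what picking up those inline images could look like: the sketch below assumes the post's content is HTML and that inline images are plain `<img src="...">` tags, which may not match every post. `inline_image_sources` is a hypothetical helper for illustration, not gallery-dl's actual extractor code.

```python
from html.parser import HTMLParser


class InlineImageCollector(HTMLParser):
    """Collect the src attribute of every <img> tag in an HTML fragment."""

    def __init__(self):
        super().__init__()
        self.sources = []

    def handle_starttag(self, tag, attrs):
        # Also called for self-closing tags such as <img ... />.
        if tag == "img":
            src = dict(attrs).get("src")
            if src:
                self.sources.append(src)


def inline_image_sources(content_html):
    """Return the image sources embedded in a post's HTML content, in order."""
    collector = InlineImageCollector()
    collector.feed(content_html)
    collector.close()
    return collector.sources
```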
And they still do not, even more than a week later. Maybe these changes only got applied to patreon posts.
$ gallery-dl -g https://kemono.party/gumroad/user/trylsc/post/IURjT
https://kemono.party/data/files/gumroad/trylsc/IURjT/reward8.jpg
https://kemono.party/data/attachments/gumroad/trylsc/IURjT/$3.zip
@skyvory inline images are supposed to be supported, unless the URLs in newer posts got changed and aren't picked up by gallery-dl.
$ gallery-dl -g https://kemono.party/fanbox/user/7356311/post/802343
https://kemono.party/data/inline/fanbox/uaozO4Yga6ydkGIJFAQDixfE.jpeg
@mikf For my artist, you can look at I'm not sure if this is something gallery-dl accounts for when crawling kemono patreon posts. From some minor testing, it doesn't seem to recognize that these embedded/inline images are even there. In any event, the workaround that I'm using now is simple but somewhat tedious using JDownloader 2:
Not sure if this is the best place, apologies. But I noticed with this URL that the main attachment 404s but the inline image isn't available to download:
Not too sure how that differs from the one posted earlier, which does come through as an inline post. Most likely because it has both a file and an inline image?
You're amazing! Thanks, that's got it!
[This might look like a wall of text, but I don't think it's actually that much information. Thanks in advance.]
I am attempting to download some files from kemono.party, but the behaviour of the downloader seems inconsistent depending on whether the target post has its content uploaded as files or attachments, and which ones are duplicates (because of course that's still a problem on kemono.party). I am using gallery-dl 1.18.4-dev.
Target URLs [no nudity, but NSFW]:
It might be worth noting that link 2 doesn't have any images listed under "content" on the page, but if you look at the image URLs you can see that the first image is under hostname/files/etc and the others are hostname/attachments/etc
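For checking which bucket each URL falls into, something like the following can be run over the output of `gallery-dl -g`. It's a sketch that assumes the kind is the path component right after an optional `data` prefix, based only on the URLs quoted in this issue; `url_kind` is a made-up helper name.

```python
from urllib.parse import urlparse

# Kinds seen in URLs quoted in this issue; other values may exist.
KNOWN_KINDS = {"files", "attachments", "inline"}


def url_kind(url):
    """Return 'files', 'attachments' or 'inline' for a URL, or None if unknown."""
    parts = [p for p in urlparse(url).path.split("/") if p]
    # Assumption: layout is /data/<kind>/... or /<kind>/...
    if parts and parts[0] == "data":
        parts = parts[1:]
    if parts and parts[0] in KNOWN_KINDS:
        return parts[0]
    return None
```

For example, `url_kind("https://kemono.party/data/attachments/gumroad/trylsc/IURjT/$3.zip")` gives `"attachments"`.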
The JSON for my gallery-dl config file:
I have configured it this way to force all Patreon attachment filenames to use underscores instead of spaces, which protects against duplicate files with slightly different filenames. It has worked for me for several months.
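The idea behind the underscore trick, shown outside of gallery-dl as a plain Python sketch: replace spaces with underscores, then see which names collide. This is not my config, just an illustration, and both helper names are made up.

```python
from collections import defaultdict


def normalize_name(filename):
    """Replace spaces with underscores so near-identical names compare equal."""
    return filename.replace(" ", "_")


def group_by_normalized_name(filenames):
    """Map each normalized name to the original names that collapse onto it."""
    groups = defaultdict(list)
    for name in filenames:
        groups[normalize_name(name)].append(name)
    return dict(groups)
```

Names that land in the same group would overwrite or skip each other on disk, which is what keeps the near-duplicates out. It only compares names, so two different files that happen to share a name would also collide.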
When using this config, I downloaded all images except for animation 1 from link 2, and there were no duplicates, but because of the filenames the order of each picture was jumbled. I tried to change the JSON to download everything and put them in the correct order:
This config improved the filenames to be in order, but it didn't download the missing picture from the first config and it downloaded the duplicate animation from link 2.
I tried to see what keywords/filters I could use in the filename by using gallery-dl -K [link 2] but that did not seem to help: according to gallery-dl, the num (index) of each picture in that link starts at 1 with the duplicate animations. Even when I remove the distinction between Patreon and other services (or remove the filename block entirely), gallery-dl does not download the first animation.
In summary:
For reference, here is the command and verbose output when using the second config.