[kemono.party] "404 NOT FOUND" error #1514

Closed
Vishvamitra opened this issue Apr 29, 2021 · 44 comments

Comments

@Vishvamitra

For the first time, I tried downloading all the posts of a user from kemono party today. I exported the cookies, I placed them in my gallery-dl conf file, but I got the error. You can see the first three lines of the command output below:

gallery-dl "kemono party user profile"

[downloader.http][warning] '404 NOT FOUND' for 'file name'

[download][error] Failed to download 

Do you have any idea on the reason why this could happen? I'm downloading other stuff using gallery-dl in the background. Might this be the reason why the command isn't working?

@Vishvamitra
Author

I've tried downloading the pics from kemono party even after finishing the former job, but I still get the same error.

@kattjevfel
Contributor

Works here, please provide a problematic URL.

@Cunabo

Cunabo commented Apr 29, 2021

Links to files on kemono slightly changed:
old
https://kemono.party/files/.....
new
https://data.kemono.party/files/.....
Sorry for bad English.
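
For illustration only, the host change described above can be sketched in Python. This is not gallery-dl's actual fix; the helper name `to_data_host` and the example path are hypothetical, and the sketch assumes that only the host changes for paths starting with /files/.

```python
from urllib.parse import urlsplit, urlunsplit


def to_data_host(url: str) -> str:
    """Move a kemono.party file URL to the data.kemono.party host.

    Hypothetical helper for illustration; any URL that is not a
    kemono.party /files/ link is returned unchanged.
    """
    parts = urlsplit(url)
    if parts.netloc == "kemono.party" and parts.path.startswith("/files/"):
        parts = parts._replace(netloc="data.kemono.party")
    return urlunsplit(parts)


# Made-up file path, used only to show the rewrite.
old_url = "https://kemono.party/files/123/456/example.png"
print(to_data_host(old_url))
```

Profile URLs such as /patreon/user/... are left alone on purpose, since only the file links are reported to have moved.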

@zenosiege

> Links to files on kemono slightly changed:
> old
> https://kemono.party/files/.....
> new
> https://data.kemono.party/files/.....
> Sorry for bad English.

When will a fix be released, please?

@Vishvamitra
Author

Vishvamitra commented Apr 29, 2021

@kattjevfel
Hello! Here's the verbose output resulting from providing the same link I tried this morning:

[gallery-dl][debug] Version 1.17.3
[gallery-dl][debug] Python 3.6.9 - Linux-5.4.0-72-generic-x86_64-with-Ubuntu-18.04-bionic
[gallery-dl][debug] requests 2.25.1 - urllib3 1.26.4
[gallery-dl][debug] Starting DownloadJob for 'https://kemono.party/patreon/user/6549841'
[kemonoparty][debug] Using KemonopartyUserExtractor for 'https://kemono.party/patreon/user/6549841'
[urllib3.connectionpool][debug] Starting new HTTPS connection (1): kemono.party:443
[urllib3.connectionpool][debug] https://kemono.party:443 "GET /api/patreon/user/6549841?o=0 HTTP/1.1" 200 None
[urllib3.connectionpool][debug] https://kemono.party:443 "GET /files/6549841/48931656/Cheerleaders_023.png HTTP/1.1" 404 None
[downloader.http][warning] '404 NOT FOUND' for 'https://kemono.party/files/6549841/48931656/Cheerleaders_023.png'

@Cunabo
Editing the URL by adding data. doesn't help: the download doesn't even start, because gallery-dl doesn't find an extractor for the modified URL.

@Cunabo

Cunabo commented Apr 29, 2021

I don't know how, but I fixed it.

Also, I don’t know how github works and therefore I don’t know how to show my solution to the developer.

And also I'm a noob in python.)

Sorry for bad English.

mikf added a commit that referenced this issue Apr 29, 2021
@mikf
Owner

mikf commented Apr 29, 2021

Fixed in 4b65ebf.
Get the executable from here or install from source to use this commit.

edit: this change seems to have broken (older?) inline images on the site itself,
e.g. https://kemono.party/fanbox/user/7356311/post/802343

@zenosiege

> Fixed in 4b65ebf.
> Get the executable from here or install from source to use this commit.
>
> edit: this change seems to have broken (older?) inline images on the site itself,
> e.g. https://kemono.party/fanbox/user/7356311/post/802343

Is it just me, or is the executable broken? I can't unzip it.

@Vishvamitra
Author

Vishvamitra commented Apr 29, 2021

@mikf
Thank you for the fix! I've just updated to the latest dev version, but how am I supposed to use it instead of the regular version? The gallery-dl --version command still calls the 1.17.3 version.

@zenosiege
I was able to extract the Ubuntu zip file with no problems, but I couldn't launch it since an additional python library was missing.

@zenosiege

zenosiege commented Apr 29, 2021

> I was able to extract the Ubuntu zip file with no problems, but I couldn't launch it since an additional python library was missing.

@Vishvamitra
Dunno, I use Windows and I've tried to unzip it with 7-Zip and Explorer. No results. It says there is an error in the zip file

@roweger

roweger commented Apr 29, 2021

I was able to unzip and get to the .exe, but it didn't seem to install. It displays this, and then closes:

error

Installing from source says 1.17.4.dev0 installed successfully, but the version readout still says 1.17.3, and downloading from Kemono still returns the 404 errors.

pip install

Any ideas?

@kattjevfel
Contributor

You don't "install" gallery-dl.exe, you run it directly in a command prompt. Anyway, if installing via pip didn't help either, then I don't know.

@Vishvamitra
Author

> Anyway if installing via pip didn't help either then I don't know.

@kattjevfel
I can confirm that the dev version gets installed but the program keeps using the latest stable version.
Screenshot_20210430_072642

@zenosiege

> I was able to unzip and get to the .exe, but it didn't seem to install. It displays this, and then closes:

@roweger
How?

@TestPolygon

At least for Fanbox it works, but with a bug: it can download a preview instead of the original image if the preview and the attachment have the same name.

https://data.kemono.party/files/fanbox/{user}/{id}/Untitled.jpe            downloaded (200 KB preview)
https://data.kemono.party/attachments/fanbox/{user}/{id}/Untitled.jpe      NOT downloaded (2 MB file)
https://data.kemono.party/attachments/fanbox/{user}/{id}/Untitled_1.jpe    downloaded (2 MB file)
https://data.kemono.party/attachments/fanbox/{user}/{id}/Untitled_2.jpe    downloaded (2 MB file)

@TestPolygon

TestPolygon commented Apr 30, 2021

Is there no option to not overwrite files with the same name? (And keep both files file.png, file (1).png)

As a workaround I can use --no-skip to overwrite the files (which are downloaded first) with the attachments (which follow later in a post).

@thatfuckingbird
Contributor

> Is there no option to not overwrite files with the same name? (And keep both files file.png, file (1).png)

You can also set skip to "enumerate" to get numbered filenames.

@TestPolygon

TestPolygon commented Apr 30, 2021

Yeah, it even was already noted in other issues.
#1436 (comment).

I think it should be enabled by default. Currently the program has unsafe behavior out of the box.
It would be an unpleasant surprise for people to find out that some of the downloaded files are previews.


But I suspect that "skip": "enumerate" is not compatible with the --download-archive option (I did not test it).
So it's not the best solution.

I think it makes sense to add an additional field, type: attachment, file.
And short aliases (type-alias): a and f, to use in the filename.

For example:
"{id}_{title}_{type-alias}_{filename}.{extension}"
or
"{id}_{title}_{filename}_{type-alias}.{extension}"


But what if the title is too long?
UPD: It looks OK, it will be trimmed.
UPD2: No, it's just already trimmed by the site down to 60 char + ...

@Twi-Hard

Twi-Hard commented Apr 30, 2021

It's come up several times now that it skips a lot of files because of duplicate names. I really think the default name should be set to "{id}_{title}_{num:>02}_{filename}.{extension}" or "{id}_{title}_{num}_{filename}.{extension}" instead of "{id}_{title}_{filename}.{extension}"
A huge number of the files I try to download have duplicate names.
Edit: I'm just mentioning this because it could be helpful for people in the future. I already have it in my config.
Edit2: I personally use "{id}_{title}_{num:>02}_{filename}.{extension}" because there are a lot of posts with 10 or more images.
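
A rough illustration of the name collision, using Python's built-in str.format as a stand-in for gallery-dl's own filename formatter (an assumption; the two are not identical). The post data below is made up.

```python
OLD_FMT = "{id}_{title}_{filename}.{extension}"
NEW_FMT = "{id}_{title}_{num:>02}_{filename}.{extension}"

# Made-up post: a preview and an attachment that share one file name.
files = [
    {"id": 1001, "title": "Sketches", "num": 1,
     "filename": "Untitled", "extension": "jpg"},
    {"id": 1001, "title": "Sketches", "num": 2,
     "filename": "Untitled", "extension": "jpg"},
]

old_names = [OLD_FMT.format(**f) for f in files]
new_names = [NEW_FMT.format(**f) for f in files]

# With the old format both entries map to one name, so the second
# file would be treated as already present and skipped.
print(old_names)
# With {num} in the format the two names differ.
print(new_names)
```

The `>02` spec right-aligns the index and pads it to two digits, which keeps names sorted correctly for posts with 10 or more files.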

@TestPolygon

TestPolygon commented Apr 30, 2021

> I think it makes sense to add an additional field, type: attachment, file.
> And short aliases (type-alias): a and f, to use in the filename.

And the same change should be made for --download-archive. The DB entries should now look like this:
{type-alias}_{the_current_row_format}.

So, if an entry has no type-alias prefix (that is, the row was created before this proposed update), it should be considered a preview f (file), since previews are placed before the original file with type a (attachment).

With this change people that used --download-archive can just easily download only the missing files. No need to redownload all files.

@RJFAC

RJFAC commented May 1, 2021

#1488

@Vishvamitra
Author

@TestPolygon

> At least for Fanbox it works

May I ask how you made it work?
The Linux executable version doesn't launch because a python library is missing and installing from source works but for some reason the program ignores the dev version. Am I being dumb or am I missing anything crucial here?

@zenosiege

image
Damn, even online extractors don't help, WTF with the windows ZIP archive?

@TestPolygon

TestPolygon commented May 1, 2021

And another question:

Is it possible to not skip posts without media, which contain only text?
I'd like to save the text metadata with --write-metadata or with postprocessors, as I described here: #1505 (comment).

Probably the cause is that the filename of the metadata file relies on media content properties.
UPD: No, even using only post properties ({user}, {id}, {date:%Y.%m.%d}, {title}) for "filename" does not help; neither do --no-skip and --no-download.


> May I ask how you made it work?

pip install -U -I --no-deps --no-cache-dir https://github.com/mikf/gallery-dl/archive/master.tar.gz

@Vishvamitra
Author

Vishvamitra commented May 1, 2021

UPDATE
I think I found the reason why gallery-dl doesn't automatically use the dev version on Linux: it looks like installing gallery-dl from source places the gallery-dl executable in a folder that normally isn't in PATH. The folder is /home/user/.local/bin. Unless you add that folder to your PATH, you need to type the whole file path to execute the dev version. See here:
Screenshot_2021-05-01_11-25-08

A question is arising in my mind now: does the dev version require a different config file? Or will it read the same config file as the stable version, which has been installed through pip with the python3 -m pip install -U gallery-dl command?
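
One way to check the PATH situation described above is a small Python sketch like the following. The helper `on_path` is hypothetical; `shutil.which` simply reports which gallery-dl executable a shell lookup would find first, and the sketch assumes the per-user install location is ~/.local/bin as in the comment above.

```python
import os
import shutil
from pathlib import Path


def on_path(directory: Path, path_var: str) -> bool:
    """Return True if `directory` is one of the entries of a PATH string."""
    entries = [Path(p) for p in path_var.split(os.pathsep) if p]
    return directory in entries


# Assumed per-user install location, taken from the comment above.
local_bin = Path.home() / ".local" / "bin"

print("gallery-dl resolved to:", shutil.which("gallery-dl"))
print(f"{local_bin} on PATH:", on_path(local_bin, os.environ.get("PATH", "")))
```

If the resolved executable is not the one in ~/.local/bin, the older copy is the one that answers `gallery-dl --version`.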

@TestPolygon

TestPolygon commented May 1, 2021

On Windows you only need to install Python with the "Add Python to PATH" checkbox set:
image
And afterwards run

  • pip install -U gallery-dl, or
  • pip install -U -I --no-deps --no-cache-dir https://github.com/mikf/gallery-dl/archive/master.tar.gz

in a console (CMD or Git Bash).

That's all. You can use it in a console now. Type gallery-dl --version to check it.

Do not forget to open the console in the download folder:

  • CMD: cd %HOMEPATH%/Downloads
  • Git Bash: cd ~/Downloads

There is only one default config for all program instances.

@zenosiege

zenosiege commented May 1, 2021

btw, just tried the dev python version
image
it's still saying "404 NOT FOUND"

UPD. Maybe I'm not a super pythonist, but I just checked kemonoparty.py file in "extractor" folder. Even with "data." it doesn't work
UPD. 2 XD LOL I'M SO DUMB. Gonna make a note: never use a prepared batch file with the executable in one folder. It works fine, thanks!

@Vishvamitra
Author

I think that I need your help again. I really can't install the dev version on my system.
Case 1: if I execute the python3 -m pip install -U -I --no-deps --no-cache-dir https://github.com/mikf/gallery-dl/archive/master.tar.gz command, as @mikf suggests, the installation of the dev package is smooth but the dev version is nowhere to be found in my system. I even checked the .local/bin folder but I already have the 1.17.3 version of gallery-dl there so where am I supposed to look for the dev version?
Case 2: if I execute the pip install -U -I --no-deps --no-cache-dir https://github.com/mikf/gallery-dl/archive/master.tar.gz command, as @TestPolygon says, I get the following error message:
DEPRECATION: Python 2.7 reached the end of its life on January 1st, 2020. Please upgrade your Python as Python 2.7 is no longer maintained. pip 21.0 will drop support for Python 2.7 in January 2021. More details about Python 2 support in pip can be found at https://pip.pypa.io/en/latest/development/release-process/#python-2-support pip 21.0 will remove support for this functionality.
In this latest case, the installation obviously fails.

Please, could anyone give me a hint on the way out of this?

@TestPolygon

> I get the following error message:
>
> DEPRECATION: Python 2.7 reached the end of its life
> Please upgrade your Python

image

@Vishvamitra
Author

After upgrading to Kubuntu 20.04 I was finally able to install and use the 1.17.4-dev version of gallery-dl, but I got a new error when I tried to download the same user profile from Kemono party. I'll upload a screenshot of my terminal below:
Screenshot_20210502_215622

@TestPolygon

The server wants to relax.

@Vishvamitra
Author

Am I the only one experiencing issues again to download files from kemono party?
I keep getting these errors and the download of every file repeatedly fails.
Screenshot_20210503_142329

@TestPolygon

TestPolygon commented May 3, 2021

If the program can't do something that you can't do manually it's not the program's problem.


Check whether it works in a browser. Do not confuse previews with the original files, by the way.

Don't forget about cookies.

The official chat: https://t.me/kemonoparty

@Vishvamitra
Author

@TestPolygon
Did I imply that I was blaming the program? It would be nice of you if you could give me a hint of the solution.

@TestPolygon

TestPolygon commented May 3, 2021

> The server wants to relax.

> If the program can't do something that you can't do manually it's not the program's problem.

Okay, one more time:
image

The data server does not work. - C.O.

Read the chat or board related to the service.

@Vishvamitra
Author

@TestPolygon
Ok thank you for the clarification.

@Diadial

Diadial commented May 3, 2021

> @TestPolygon
> Ok thank you for the clarification.

In regards to connection issues, it's not just you. The owner of KemonoParty recently upgraded the servers and implemented DDoS protection. I imagine they or the sysadmin are still configuring the site. Either that, or they've gone over the bandwidth limit.

@Terrails

Terrails commented May 14, 2021

> It's come up several times now that it skips a lot of files because of duplicate names. I really think the default name should be set to "{id}_{title}_{num:>02}_{filename}.{extension}" or "{id}_{title}_{num}_{filename}.{extension}" instead of "{id}_{title}_{filename}.{extension}"
> A huge number of the files I try to download have duplicate names.
> Edit: I'm just mentioning this because it could be helpful for people in the future. I already have it in my config.
> Edit2: I personally use "{id}_{title}_{num:>02}_{filename}.{extension}" because there are a lot of posts with 10 or more images.

I've been using a filename containing "{num}" since I started using gallery-dl for kemono, but it still stands that gallery-dl downloads the file located at "/files/{service}/{user}/{id}/FileName.{extension}" and then skips the file that has exactly the same name located at "/attachments/{service}/{user}/{id}/FileName.{extension}" instead of downloading both and giving them the proper {num}. My downloaded gallery is currently missing the majority of files with {num}: 2 because the files with {num}: 1 have the same name as {num}: 2 which results in "duplicates" being skipped.

The solution that @TestPolygon suggested should hopefully give us a simple solution, as the root of the issue seems to be the archive because all files, including the so called "duplicates", download normally when not using archive. Can anyone also confirm this?

> I think it makes sense to add an additional field, type: attachment, file.
> And short aliases (type-alias): a and f, to use in the filename.
>
> And the same change should be made for --download-archive. The DB entries should now look like this:
> {type-alias}_{the_current_row_format}.
>
> So, if an entry has no type-alias prefix (that is, the row was created before this proposed update), it should be considered a preview f (file), since previews are placed before the original file with type a (attachment).
>
> With this change people that used --download-archive can just easily download only the missing files. No need to redownload all files.

@TestPolygon

Since you use {num}/{num:>02}, I think it can be fixed by also changing archive-format, whose default is:

archive_fmt = "{service}_{user}_{id}_{filename}.{extension}"

to, for example:

 "archive-format": "{service}_{user}_{id}_{num:>02}_{filename}.{extension}"

But this change only makes sense if you have not downloaded anything yet, because old entries with the different format will be ignored after this change.


I have created a separate issue for it: #1556
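
To show why the archive key matters, here is a simplified simulation. It assumes an archive behaves like a set of formatted keys and uses Python's str.format in place of gallery-dl's formatter; the entries and the function `download_all` are made up for illustration.

```python
OLD_ARCHIVE_FMT = "{service}_{user}_{id}_{filename}.{extension}"
NEW_ARCHIVE_FMT = "{service}_{user}_{id}_{num:>02}_{filename}.{extension}"

# Made-up entries: a preview under /files and an attachment with the same name.
entries = [
    {"service": "fanbox", "user": "42", "id": "1001", "num": 1,
     "filename": "Untitled", "extension": "jpg"},
    {"service": "fanbox", "user": "42", "id": "1001", "num": 2,
     "filename": "Untitled", "extension": "jpg"},
]


def download_all(fmt):
    """Simulate an archive; return the nums of entries that get downloaded."""
    archive = set()
    downloaded = []
    for entry in entries:
        key = fmt.format(**entry)
        if key in archive:
            continue  # key already recorded, so this file is skipped
        archive.add(key)
        downloaded.append(entry["num"])
    return downloaded


print(download_all(OLD_ARCHIVE_FMT))  # only the first file gets through
print(download_all(NEW_ARCHIVE_FMT))  # both files get through
```

The same reasoning explains the caveat above: keys written with the old format never match keys generated with the new one, so an existing archive no longer marks those files as downloaded.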

@TestPolygon

TestPolygon commented May 16, 2021

Kemono 403 Forbidden

Use the search.

#1370

mikf added a commit that referenced this issue Jun 18, 2021
Add an enumeration index so that attachments and regular files with the
same filename still get downloaded and not counted as duplicate files
(even though for patreon posts they usually are)

This invalidates all previously generated archive IDs.
To keep using old names and IDs, set
'filename' to "{id}_{title}_{filename}.{extension}" and
'archive-format' to "{service}_{user}_{id}_{filename}.{extension}".
@mikf
Owner

mikf commented Jun 19, 2021

I've (finally) changed the default filenames and archive IDs to avoid duplicate names by using an enumeration index ({num}): e9ab973.
As the commit message says, this breaks backwards compatibility in the sense that any previous generated archive IDs or filenames won't get recognized as "already downloaded" anymore.
It is possible to keep using the old names and IDs (see e9ab973), but that obviously has the problem of potentially skipping downloads.
You could also use the {type} field from #1556 instead of {num} if that suits you better.

@mikf mikf closed this as completed Jun 19, 2021
@TestPolygon

TestPolygon commented Jun 19, 2021

It's of course better than it was.
But using {num} will lead to downloading more unnecessary duplicates, since some artists have a notable number of posts with duplicated URLs, where the same URL is listed twice within one post.
With {type} or {type[0]}, you will download only the unique attachment URLs within a post (but not inline ones; those will still include duplicates). This applies only to some rare artists, though. For example, the Patreon of incognit has 1403 attachment URLs, of which only 949 are unique.
It also makes it easier to see which type a media file is: inline, file, or attachment(s).
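
The difference can be sketched as follows, again with Python's str.format standing in for gallery-dl's formatter and with a made-up post. The two format strings are the ones discussed in this thread; `unique_keys` is a hypothetical helper.

```python
NUM_FMT = "{service}_{user}_{id}_{num}"
TYPE_FMT = "{service}_{user}_{id}_{type[0]}_{filename}.{extension}"

# Made-up post in which the same attachment URL is listed twice.
post_files = [
    {"type": "file", "filename": "cover", "extension": "png"},
    {"type": "attachment", "filename": "cover", "extension": "png"},
    {"type": "attachment", "filename": "cover", "extension": "png"},  # repeat
]
common = {"service": "patreon", "user": "42", "id": "1001"}


def unique_keys(fmt):
    """Return the distinct archive keys a format yields for the post."""
    keys = set()
    for num, f in enumerate(post_files, 1):
        keys.add(fmt.format(num=num, **common, **f))
    return keys


print(len(unique_keys(NUM_FMT)))   # every listed entry gets its own key
print(len(unique_keys(TYPE_FMT)))  # the repeated attachment collapses into one
```

In this sketch `{type[0]}` takes the first letter of the type, so the file and the attachment still get separate keys even though their names match.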

@left1000

left1000 commented Jun 20, 2021

Uh, so I just grabbed https://github.com/mikf/gallery-dl/releases/tag/v1.18.0 do I need to update my gallery-dl.conf file? Or just download this new exe?

edit: In case it's not clear, I want to use the new default naming options, but I know naming options are usually controlled in the gallery-dl.conf file?

@TestPolygon

TestPolygon commented Jun 20, 2021

If you want to use the new default format:

filename_fmt = "{id}_{title}_{num:>02}_{filename}.{extension}"
archive_fmt = "{service}_{user}_{id}_{num}"

just remove "filename" and "archive-format" from the config file if you have added them.


The alternative format with the first letter of {type}:

"kemonoparty": {
  "filename": "{id}_{title}_{type[0]}_{filename}.{extension}",
  "archive-format": "{service}_{user}_{id}_{type[0]}_{filename}.{extension}"
}
