bcy.net giving errors for new posts #613
Comments
Well, I also haven't managed to find anything yet,
That's not exactly the same problem as mentioned before, but the image URLs and metadata from the API endpoint are different than the embedded ones in
This can probably be solved by adding the watermark or "noop" filter to the |
Images from new posts can have incomplete/partial URLs (1) without any filename extension when fetching their data from '/apiv3/user/selfPosts', so now all data gets taken from '/item/detail/ID' pages.

It is currently unknown how to get the non-watermarked original version of these images, or if that is possible at all. (2)

Images with a watermark will have their 'filter' metadata field set to "watermark". For original images this field is an empty string "". Enabling the 'noop' option will, in addition to the watermarked version, yield the '~noop.image' filter version (3), where 'filter' is set to "noop".

(1) "https://img-bcy-qn.pstatp.com/banciyuan/3ccdff22479c4060aadc86718209b281"
(2) "https://p1-bcy.byteimg.com/img/banciyuan/3ccdff22479c4060aadc86718209b281~tplv-banciyuan-logo-v3:wqnpnLLlhZLlpKfprZTnjotfCuWNiuasoeWFgyAtIEFDR-eIseWlveiAheekvuWMug==.image"
(3) "https://p1-bcy.byteimg.com/img/banciyuan/3ccdff22479c4060aadc86718209b281~noop.image"
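For illustration only, here is a minimal sketch of how the three URL shapes (1) to (3) could be mapped to the 'filter' values described in that commit message. The helper name `image_filter` and the "other" fallback are assumptions made for this example; this is not the code gallery-dl actually uses.

```python
from urllib.parse import urlsplit


def image_filter(url: str) -> str:
    """Hypothetical helper: classify a bcy image URL by its '~...' suffix.

    Returns "" when there is no suffix, "watermark" for the
    'tplv-banciyuan-logo' template, "noop" for '~noop.image',
    and "other" for anything else (e.g. the w650 thumbnail).
    """
    # Last path segment, e.g. "<id>~noop.image" or just "<id>"
    name = urlsplit(url).path.rpartition("/")[2]
    _, sep, suffix = name.partition("~")
    if not sep:
        return ""
    if suffix.startswith("tplv-banciyuan-logo"):
        return "watermark"
    if suffix.startswith("noop"):
        return "noop"
    return "other"


base = "https://p1-bcy.byteimg.com/img/banciyuan/3ccdff22479c4060aadc86718209b281"
print(repr(image_filter("https://img-bcy-qn.pstatp.com/banciyuan/3ccdff22479c4060aadc86718209b281")))
print(repr(image_filter(base + "~noop.image")))
```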
The This commit also adds a |
The former implementation would try to use the embedded data from '/item/detail/' pages for every post, even if that wasn't really necessary. This commit also fixes some issues with posts only visible to logged in users.
Some posts aren't downloading properly because their URLs are different. So it doesn't download the "original" watermarked version or the "noop" version, which is higher quality than what it grabbed. Here's an example: https://bcy.net/item/detail/6721286314647355660 When I use '-g' it shows the URL as this, which doesn't even work when put into a browser: It should be pointed at this for the "noop" version (which I compared and is a higher quality image that is 200 kB larger with fewer compression artifacts): And here is the "original" that isn't being detected at all by gallery-dl right now: A few others have "c0qxx" or "c0r67" instead of "c0rbo". The first 3 of this user's posts download normally with both "noop" and "watermarked" detected. The last 4 posts do not.
Someone appears to have found a solution that works. I've tested it a bit myself. I don't code, but it generates a signature that matches the unwatermarked, original images. Unfortunately I don't actually understand what's being done, but it would be great if you could take a look and see whether it can be integrated into gallery-dl.
Thank you so much for implementing and supporting the site! It was honestly a real hassle to use that site, and your downloader really improved the experience.
That being said, I've been getting quite a few errors, and I believe it's because they changed the format some time ago. The old posts still use the old format I mentioned in the other post, but it seems they have changed it for the new ones.
A recent example https://bcy.net/item/detail/6780546160802143236
The display "thumbnail":
https://p1-bcy.byteimg.com/img/banciyuan/3ccdff22479c4060aadc86718209b281~tplv-banciyuan-w650.image
The "original" with watermark:
https://p1-bcy.byteimg.com/img/banciyuan/3ccdff22479c4060aadc86718209b281~tplv-banciyuan-logo-v3:wqnpnLLlhZLlpKfprZTnjotfCuWNiuasoeWFgyAtIEFDR-eIseWlveiAheekvuWMug==.image?sig=XOCQEWBAelmBFHEPfxA8dD5dX2g%3D
It seems the string "~tplv-banciyuan-logo-v3:wqnpnLLlhZLlpKfprZTnjotfCuWNiuasoeWFgyAtIEFDR-eIseWlveiAheekvuWMug==.image" gives the original image, but with a watermark added.
Surprisingly, "wqnpnLLlhZLlpKfprZTnjotfCuWNiuasoeWFgyAtIEFDR-eIseWlveiAheekvuWMug" is actually base64 for the Chinese characters on the watermark itself. The watermark includes the poster's name, which makes me believe this is NOT a coincidence. There is a real headache of a catch, though.
The characters on the watermark
"©露兒大魔王_
半次元 - ACE爱好者社区"
actually map to (in base64, UTF-8)
"wqnpnLLlhZLlpKfprZTnjotfCuWNiuasoeWFgyAtIEFDReeIseWlveiAheekvuWMugo="
while what's used above in the link is
"wqnpnLLlhZLlpKfprZTnjotfCuWNiuasoeWFgyAtIEFDR-eIseWlveiAheekvuWMug=="
Almost exactly the same, except that a repeated "e" is replaced with a "-" (and the ending differs slightly), which is very strange indeed.
Replacing the original with the "correct" string "wqnpnLLlhZLlpKfprZTnjotfCuWNiuasoeWFgyAtIEFDReeIseWlveiAheekvuWMugo=" doesn't work. It seems to be some kind of obfuscation or human error.
I tried replacing the whole string with the base64 encoding of a space, i.e. "ICAg==" or "IA==". That doesn't work either. At the moment I'm stuck.
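One way to check the mismatch is to decode the token from the URL directly. The sketch below assumes the token uses the URL-safe base64 alphabet, where "-" stands in for "+". Under that assumption it decodes cleanly, and the result reads "ACG" rather than "ACE", with no trailing newline. That would account for both differences between the two strings above (the "-" versus "e", and "g==" versus "go="). It does not explain why substituted strings are rejected; the `sig` parameter in the watermarked link hints that the URL may be signed, but that is only a guess.

```python
import base64

# Token copied from the watermarked URL in this post
token = "wqnpnLLlhZLlpKfprZTnjotfCuWNiuasoeWFgyAtIEFDR-eIseWlveiAheekvuWMug=="

# Assumption: URL-safe alphabet ('-' instead of '+', '_' instead of '/')
text = base64.urlsafe_b64decode(token).decode("utf-8")
print(repr(text))  # '©露兒大魔王_\n半次元 - ACG爱好者社区'

# Re-encoding with the standard alphabet differs only in '+' vs '-'
standard = base64.b64encode(text.encode("utf-8")).decode("ascii")
print(standard == token.replace("-", "+"))  # True
```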
The only other template I managed to find is
"https://p1-bcy.byteimg.com/img/banciyuan/3ccdff22479c4060aadc86718209b281~noop.image". It's unwatermarked, but it seems to be compressed quite a bit, so it's not exactly the original.
I think we just need to get the template right: the correct "~xxxx" tag for the unwatermarked original.
The downloader gives this output when trying to download said profile.
downloader.http: '404 Not Found' for 'https://img-bcy-qn.pstatp.com/banciyuan/3ccdff22479c4060aadc86718209b281'
download: Failed to download 6780546160802143236 35432115.part
downloader.http: '404 Not Found' for 'https://img-bcy-qn.pstatp.com/banciyuan/481a06423e3e4969bf129319541c4ab5'
download: Failed to download 6780546160802143236 35432116.part
downloader.http: '404 Not Found' for 'https://img-bcy-qn.pstatp.com/banciyuan/bc46a12d7d5b4f838506c63cdc5a126f'
download: Failed to download 6780546160802143236 35432117.part
downloader.http: '404 Not Found' for 'https://img-bcy-qn.pstatp.com/banciyuan/51936a46c02c49a09dfee28d495eea1c'
download: Failed to download 6780546160802143236 35432118.part
downloader.http: '404 Not Found' for 'https://img-bcy-qn.pstatp.com/banciyuan/a6a61bce98b448abbb1e12e9deb6cb6b'
download: Failed to download 6780546160802143236 35432119.part
downloader.http: '404 Not Found' for 'https://img-bcy-qn.pstatp.com/banciyuan/14dbc38e5bff48688716119d17639520'
download: Failed to download 6780546160802143236 35432120.part
downloader.http: '404 Not Found' for 'https://img-bcy-qn.pstatp.com/banciyuan/a19a6e8fc59c49d28e04b753fb5cb102'
download: Failed to download 6780546160802143236 35432121.part
downloader.http: '404 Not Found' for 'https://img-bcy-qn.pstatp.com/banciyuan/1f52f033ebb74293a244067f975e095c'
download: Failed to download 6780546160802143236 35432122.part
downloader.http: '404 Not Found' for 'https://img-bcy-qn.pstatp.com/banciyuan/a337da495119443aad11145aa1db7d90'
download: Failed to download 6778693005793565699 35037961.part
downloader.http: '404 Not Found' for 'https://img-bcy-qn.pstatp.com/banciyuan/1eb566a4a3854a19beb4cff899cd00a1'
download: Failed to download 6778693005793565699 35037962.part
downloader.http: '404 Not Found' for 'https://img-bcy-qn.pstatp.com/banciyuan/bdb5dae63fa6477fa478b927cbac3236'
download: Failed to download 6778693005793565699 35037963.part
downloader.http: '404 Not Found' for 'https://img-bcy-qn.pstatp.com/banciyuan/2170770b8fcb4308b3367e31d441e62b'
download: Failed to download 6778693005793565699 35037964.part
These errors most likely come from the downloader using the old technique to handle the new links, which, as I've confirmed by trying it, does not work.
A roundabout way to handle this for now, in my opinion, is to check whether the link has an image extension (.jpg/.png) and, if it does, use the old method.
If it doesn't, then just grab the watermarked originals as well as the "~noop" version mentioned above (until we find a method to remove the watermark from originals), perhaps placing them in separate folders until a final solution can be found. In the meantime I'll manually use the noop version to crop out the watermark from the original.
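A rough sketch of that workaround, to make the idea concrete. The function name `candidate_urls`, the extension list, and the `watermark_suffix` parameter are assumptions for this example; the watermark template contains the poster's name, so it would have to be taken from the post page rather than built by hand. The host and the "~noop.image" tag are the ones from the links above.

```python
import re
from urllib.parse import urlsplit

# Assumed list of "old style" image extensions
IMAGE_EXT = re.compile(r"\.(jpe?g|png|gif|webp)$", re.IGNORECASE)
# Host/path prefix seen in the new-style links in this post
NEW_PREFIX = "https://p1-bcy.byteimg.com/img/banciyuan/"


def candidate_urls(url, watermark_suffix=None):
    """Hypothetical helper: list the URLs worth trying for one image."""
    path = urlsplit(url).path
    if IMAGE_EXT.search(path):
        # Old format: the link already points at a real file
        return [url]
    # New format: keep only the image ID, drop any existing "~..." tag
    image_id = path.rpartition("/")[2].partition("~")[0]
    urls = []
    if watermark_suffix:
        # e.g. "~tplv-banciyuan-logo-v3:<token>.image" taken from the page
        urls.append(NEW_PREFIX + image_id + watermark_suffix)
    urls.append(NEW_PREFIX + image_id + "~noop.image")
    return urls


print(candidate_urls(
    "https://img-bcy-qn.pstatp.com/banciyuan/3ccdff22479c4060aadc86718209b281"))
```

Returning a list leaves it to the caller to decide whether to download every variant (for example into separate folders) or stop at the first one that works.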