URLs that were excluded from the Wayback Machine are not handled properly when using the CDX Server API (the Availability API is fine).
A manual request through the web user interface returns:
Sorry.
This URL has been excluded from the Wayback Machine.
API request returns: org.archive.util.io.RuntimeIOException: org.archive.wayback.exception.AdministrativeAccessControlException: Blocked Site Error
waybackpy's cdx_utils.py does not expect such a response and crashes with an exception:
Traceback (most recent call last):
  ...
    for snapshot in cdx.snapshots():
  File "/usr/local/lib/python3.6/dist-packages/waybackpy/cdx_api.py", line 144, in snapshots
    for text in texts:
  File "/usr/local/lib/python3.6/dist-packages/waybackpy/cdx_api.py", line 52, in cdx_api_manager
    total_pages = get_total_pages(self.url, self.user_agent)
  File "/usr/local/lib/python3.6/dist-packages/waybackpy/cdx_utils.py", line 15, in get_total_pages
    return int(response.text.strip())
ValueError: invalid literal for int() with base 10: 'org.archive.util.io.RuntimeIOException: org.archive.wayback.exception.AdministrativeAccessControlException: Blocked Site Error'
In my opinion, it is better to simply report that no snapshots are available, without raising an exception, since from the user's perspective, I think, it doesn't matter whether the URL is blocked or was simply never archived.
Though in some use cases that I am not aware of, it might be important to know this information.
But you are probably right: if the API treats this situation as an exception, then it is better to stay close to the API. I think a custom BlockedSiteError is OK.
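For illustration, here is a minimal sketch of how the custom-exception approach could look. It is not waybackpy's actual code: `parse_total_pages` is a hypothetical helper that stands in for the `int(response.text.strip())` step shown in the traceback, and the only assumption about the server is that the blocked response contains the text quoted above.

```python
class BlockedSiteError(Exception):
    """Raised when the CDX server reports that a URL is excluded
    from the Wayback Machine (name taken from the discussion above)."""


def parse_total_pages(response_text):
    """Convert the body of a CDX page-count response into an int.

    Hypothetical helper, not part of waybackpy. It checks for the
    blocked-site message before attempting the integer conversion.
    """
    text = response_text.strip()
    # Assumption: the blocked response always carries one of these markers,
    # as in the error message quoted in this issue.
    if "AdministrativeAccessControlException" in text or "Blocked Site Error" in text:
        raise BlockedSiteError(text)
    return int(text)
```

A caller that prefers the "no snapshots available" behaviour could catch `BlockedSiteError` and return an empty result, so both options discussed here remain possible.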
To Reproduce
Sample URL: http://gotceleb.com
Version: