KeyError: '/status_404' #230

Open
MaheswarReddy1194321 opened this issue Jan 31, 2020 · 4 comments

Comments


MaheswarReddy1194321 commented Jan 31, 2020

[Screenshot: key_error]

The cloner raises this error when we clone a website that has an error page, because:

def add_scheme(url):
    if url[-1] == '/':
        url = url.strip('/')
    if yarl.URL(url).scheme:
        new_url = yarl.URL(url)
        err_url = yarl.URL(url + '/status_404')
    else:
        new_url = yarl.URL('http://' + url)
        err_url = yarl.URL('http://' + url + '/status_404')
    return new_url, err_url

In the above function from cloner.py, we add the error page directly, without checking whether the website has an error page.
Kindly let me know if there is any mistake in my comment.
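
For illustration only: the traceback is not reproduced in this thread, so the sketch below assumes the KeyError comes from indexing a dict of cloned pages (for example, data loaded from meta.json) with '/status_404' when that key was never stored. The names `lookup_page` and `meta`, and the layout of each entry, are hypothetical and not the project's actual code.

```python
def lookup_page(meta, path, error_path='/status_404'):
    """Return the entry for `path`, else the error-page entry, else None.

    `meta` is assumed to be a dict keyed by URL path (hypothetical layout).
    Using dict.get() instead of meta[...] avoids a KeyError when the
    error page was never cloned.
    """
    entry = meta.get(path)
    if entry is None:
        # Fall back to the error page only if it was actually stored.
        entry = meta.get(error_path)
    return entry


# A clone without an error page: the lookup returns None instead of raising.
meta_without_error_page = {'/index.html': {'hash': 'abc'}}
print(lookup_page(meta_without_error_page, '/missing'))
```

With a guard like this, the caller can decide what to serve when no error page exists, rather than failing on the missing key.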

Collaborator

afeena commented Mar 4, 2020

@MaheswarReddy1194321 sorry for the late reply

I can't reproduce your problem :( But I know we have had some problems with the cloner, so it doesn't work reliably all the time. You should check your meta.json and probably try to re-clone.

Collaborator

afeena commented Mar 4, 2020

Maybe it's somehow related to #215

Collaborator

afeena commented Mar 4, 2020

And this #183

Contributor

lordlabuckdas commented

As per my observation, this issue arises in 2 cases:

  1. The target website does not have an error page.
  2. Upon visiting a page that does not exist, we are redirected to the homepage.

The website in question (used by @MaheswarReddy1194321) falls under the 2nd case, since in.yahoo.com/status_404 redirects to in.yahoo.com/?err=404&err_url=https%3a%2f%2fin.yahoo.com%2fstatus_404.
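
The two cases above could be told apart with a check along these lines. This is a rough sketch, not what the cloner currently does: `classify_error_probe` and its return labels are hypothetical, and it assumes the caller already knows the final URL after redirects and the response status for the /status_404 request.

```python
from urllib.parse import urlsplit


def classify_error_probe(requested_url, final_url, status):
    """Classify the response to a request for the /status_404 probe URL.

    Returns one of three hypothetical labels:
      'redirected_to_homepage' - the site redirected the probe to its root
      'error_page'             - the probe URL itself answered with 404
      'no_error_page'          - anything else
    """
    requested = urlsplit(requested_url)
    final = urlsplit(final_url)
    # Compare host and path only; the query string is ignored on purpose.
    moved = (requested.netloc, requested.path.rstrip('/')) != (
        final.netloc, final.path.rstrip('/'))
    if moved and final.path in ('', '/'):
        return 'redirected_to_homepage'
    if status == 404:
        return 'error_page'
    return 'no_error_page'


# The redirect reported for in.yahoo.com in this thread (status 200 is assumed).
print(classify_error_probe(
    'https://in.yahoo.com/status_404',
    'https://in.yahoo.com/?err=404&err_url=https%3a%2f%2fin.yahoo.com%2fstatus_404',
    200,
))
```

Only the 'error_page' outcome would justify storing a '/status_404' entry; the other two would need a fallback.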
