-
-
Notifications
You must be signed in to change notification settings - Fork 1.6k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
HTTP::Client hangs on www #8770
Comments
Adidas have lots of measures against automated crawlers, it's possible they're just not responding (I tried to do some scraping on their site a while back). Might want to try a different user agent. Did you also experience this with other sites? |
I've tried to curl it, it gave me instantly a 403 page so I've added an user agent and it worked require "http/client"
headers = HTTP::Headers.new
headers["User-Agent"] = "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_2) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/80.0.3987.87 Safari/537.36"
response = HTTP::Client.get("https://www.adidas.com/", headers: headers)
puts response.status_code but the issue is the same |
So adding a user agent makes it work? I think that's expected. |
No it does not. It only works via curl |
So when using curl it seems like the server is responding in http2. I think you will have to force http1.1 if you want the same results to compare to your crystal code. Also |
this hangs for me: |
I think we can close this, it seems to be an issue with Adidas' website not supporting 1.1 or something like that. HTTP/2 is very hard to implement but it's a separate issue. |
Here is an implementation of HTTP2. https://github.com/ysbaddaden/http2 Here is the client class. It does not and docs so there might be some trial and error. https://github.com/ysbaddaden/http2/blob/5db786ee42f9193894720f3bb9eefdb24038d83e/client.cr#L6 |
When I try send a get request to certain sites using www it simply hangs
eg.:
but this works without a problem:
even though as soon as I try to get the redirect location from the response header it hangs again since the site redirects to www
I haven't had time yet to debug the issue, but weirdly it's not happening on all sites. (youtube redirects to www as well, and it's working fine)
os: macOS 10.15.2
crystal: 0.32.1
edit: after a while it died with a time out:
The text was updated successfully, but these errors were encountered: