Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Url gets manipulated in the wrong way #2188

Closed
pvanderlinden opened this issue Aug 11, 2017 · 10 comments
Closed

Url gets manipulated in the wrong way #2188

pvanderlinden opened this issue Aug 11, 2017 · 10 comments
Labels

Comments

@pvanderlinden
Copy link

Long story short

Url's passed into aiohttp gets changed in certain ways the servers don't always accept. In this case a : in the url which is quoted to %3A already, will get unquoted by aiohttp.

Expected behaviour

%3A should stay %3A

Actual behaviour

%3A in the path get's translated to : causing errors

Steps to reproduce

pass in a url with %3A. Example url: https://thumbor-static.factorymedia.com/_9eOzvEVNwVVRsG0qxCqt-9-KxI=/1280x720/smart/http%3A%2F%2Fcoresites-cdn.factorymedia.com%2Fsidewalk%2Fwp-content%2Fuploads%2F2017%2F08%2Ffifty-fifty-twenty-years.jpg works in your browser, and in python requests, doesn't work in aiohttp.

Your environment

aiohttp 2.2.5, python 3.5, linux

@fafhrd91
Copy link
Member

@asvetlov I think yarl should support raw(unsafe) urls

@pvanderlinden
Copy link
Author

Isn't the main issue here that it unquotes characters? That aiohttp changes characters into their % form where needed makes sense, the other way around: not really.

@asvetlov
Copy link
Member

@fafhrd91 agree.
I wonder if await client.get(URL(url, encoded=True)) is working.
Need to check it.

@asvetlov
Copy link
Member

@pvanderlinden normalizing urls by default on client is good idea.
Usually servers normalize urls on their side too, but looks like your particular server doesn't do it.
For this case we should provide workaround.

@pvanderlinden
Copy link
Author

The issue seem to arise with anything which uses AWS cloudfront when I was looking around for issues.

I wasn't aware of the encode=True option, which solves my problem, thanks. I will close the issue then as I don't think it is a bug then.

@shuckc
Copy link
Contributor

shuckc commented Nov 6, 2017

I think it is a bug to undo pre-encoded values supplied by the user, as it can break hmac style authorisation headers that have been pre-computed. The operation "encode" should not be doing an opportunistic decode just because it can.

@asvetlov
Copy link
Member

asvetlov commented Nov 7, 2017

@shuckc I don't get your point.
Please elaborate how pre-encoded URL breaks HMAC?

@shuckc
Copy link
Contributor

shuckc commented Nov 7, 2017

Hi Andrew,
Great work by the way - don't mean to complain, the case I had turned out to be badly-encoded:

I used python 3's urllib to quote a json snippet passed as a GET argument. urllib encodes the ":" as %3A, the partially encoded url is then used to "sign" the request by adding an extra header.

Yarl inside aiohttp was then reversing the encoding of %3A back to ":" before sending the request, so the api-signature header no longer matched the request. The server checks the signature against the request value and failed.

From what I have learned subsequently, colon is a permissible character in URLs, so it is actually urllib that is at fault here for encoding it.

	url = 'https://server.com/.../blah?filter=' + urllib.parse.quote('{"open": true}')
	...
	headers = {}
	headers['api-nonce'] = nonce
	headers['api-key'] = apiKey
	headers['api-signature'] = generate_signature(apiSecret, method, url, nonce, body_text)

	async with session.request(method, url, headers=headers, data=body) as resp:
		...

I fixed this first by

	...
	#https://github.com/aio-libs/aiohttp/issues/2188
	yurl = yarl.URL(url, encoded=True)

	async with session.request(method, yurl, headers=headers, data=body) as resp:
		...

And later by removing urllib :-)

@asvetlov
Copy link
Member

asvetlov commented Nov 7, 2017

Yes, your fix is totally correct.
That's why encoded=True option exists and exposed in yarl API.
yarl tries to follow RFC but if some server prefer another encoding form -- the option allows to send any data.
Unfortunately it isn't combined with params argument of request method -- for adding query string params aiohttp need parse base URL and requote it to don't make a mess.
The behavior is described in http://aiohttp.readthedocs.io/en/stable/client.html#passing-parameters-in-urls

@lock
Copy link

lock bot commented Oct 28, 2019

This thread has been automatically locked since there has not been any recent activity after it was closed. Please open a [new issue] for related bugs.
If you feel like there's important points made in this discussion, please include those exceprts into that [new issue].
[new issue]: https://github.com/aio-libs/aiohttp/issues/new

@lock lock bot added the outdated label Oct 28, 2019
@lock lock bot locked as resolved and limited conversation to collaborators Oct 28, 2019
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
Projects
None yet
Development

No branches or pull requests

4 participants