Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Backwards incompatibility with header parsing from 3.8.4 to 3.8.5 #7468

Closed
1 task done
mdegat01 opened this issue Aug 2, 2023 · 12 comments · Fixed by #7480
Closed
1 task done

Backwards incompatibility with header parsing from 3.8.4 to 3.8.5 #7468

mdegat01 opened this issue Aug 2, 2023 · 12 comments · Fixed by #7480

Comments

@mdegat01
Copy link

mdegat01 commented Aug 2, 2023

Describe the bug

I received home-assistant/supervisor#4454 in my project and the root cause turned out to be the update of aiohttp from 3.8.4 to 3.8.5. I was able to clearly reproduce it in a vanilla aiohttp server set up from the example here and having a script make a call providing a Host header. The same script works when the server is running on 3.8.4 and fails if I update to 3.8.5.

To Reproduce

  1. Install aiohttp 3.8.4 and run the demo server shown here
  2. Use the following script to make a call to it and receive the hello world response as expected
#!/bin/bash

SERVER=127.0.0.1
nc -i 1 ${SERVER} 8080 <<EOF
GET /world HTTP/1.1
Host: 127.0.0.1

EOF
  1. Stop the server, update to 3.8.5 and start it again
  2. Run the same script and receive this response instead:
HTTP/1.0 400 Bad Request
Content-Type: text/plain; charset=utf-8
Content-Length: 69
Date: Wed, 02 Aug 2023 19:28:52 GMT
Server: Python/3.11 aiohttp/3.8.5

Invalid header value char:

  b'Host: 127.0.0.1'
                   ^

Expected behavior

The same script should work on 3.8.4 and 3.8.5

Logs/tracebacks

Error handling request
Traceback (most recent call last):
  File "/Users/degam/.pyenv/versions/3.11.2/lib/python3.11/site-packages/aiohttp/web_protocol.py", line 332, in data_received
    messages, upgraded, tail = self._request_parser.feed_data(data)
                               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "aiohttp/_http_parser.pyx", line 557, in aiohttp._http_parser.HttpParser.feed_data
aiohttp.http_exceptions.BadHttpMessage: 400, message:
  Invalid header value char:

    b'Host: 127.0.0.1'
                     ^

Python Version

$ python --version
Python 3.11.2

aiohttp Version

$ python -m pip show aiohttp
Name: aiohttp
Version: 3.8.5
Summary: Async http client/server framework (asyncio)
Home-page: https://github.com/aio-libs/aiohttp
Author:
Author-email:
License: Apache 2
Location: /Users/degam/.pyenv/versions/3.11.2/lib/python3.11/site-packages
Requires: aiosignal, async-timeout, attrs, charset-normalizer, frozenlist, multidict, yarl
Required-by:

multidict Version

$ python -m pip show multidict
Name: multidict
Version: 6.0.4
Summary: multidict implementation
Home-page: https://github.com/aio-libs/multidict
Author: Andrew Svetlov
Author-email: [email protected]
License: Apache 2
Location: /Users/degam/.pyenv/versions/3.11.2/lib/python3.11/site-packages
Requires:
Required-by: aiohttp, yarl

yarl Version

$ python -m pip show yarl
Name: yarl
Version: 1.9.2
Summary: Yet another URL library
Home-page: https://github.com/aio-libs/yarl/
Author: Andrew Svetlov
Author-email: [email protected]
License: Apache-2.0
Location: /Users/degam/.pyenv/versions/3.11.2/lib/python3.11/site-packages
Requires: idna, multidict
Required-by: aiohttp

OS

macOS

Also have been able to reproduce it on a variety of systems running the Home Assistant software which depends on aiohttp, as described in the issue I linked at the top.

Related component

Server

Additional context

No response

Code of Conduct

  • I agree to follow the aio-libs Code of Conduct
@mdegat01 mdegat01 added the bug label Aug 2, 2023
@webknjaz
Copy link
Member

webknjaz commented Aug 2, 2023

This is expected result of fixing a request smuggling security vulnerability. Newer RFCs are stricter regarding what should be a header separator. It can't be just CR or LF anymore and must always be a CR LF pair.

@webknjaz webknjaz removed the bug label Aug 2, 2023
@webknjaz
Copy link
Member

webknjaz commented Aug 2, 2023

@Dreamsorcerer should that byte-string marker leak like this? It's probably confusing to people..

@mdegat01
Copy link
Author

mdegat01 commented Aug 3, 2023

So the original script that caused the linked issue at the top of my post also included unix2dos like this:

#!/bin/bash

SERVER=127.0.0.1
nc -i 1 ${SERVER} 8080 <<< unix2dos<<EOF
GET /world HTTP/1.1
Host: 127.0.0.1

EOF

I stripped that out for this issue report since I was trying to keep the steps to reproduce as simple as possible. I was able to reproduce it even with that in there but upon further testing it looks like the underlying problem is really that unix2dos isn't working as the author expected in that script and the line endings remain incorrect.

So thanks for the explanation and in agreement, this is not an aiohttp issue. Although if considering improving the error message I would appreciate that. It was quite confusing, I wasn't sure what that marker was pointing to or trying to tell me as it appeared to be pointing to the single quote beyond the end of the header.

@Dreamsorcerer
Copy link
Member

I'm pretty sure in testing, I saw errors with \r in the output. I'll have to test this one out later and see why it's not including the bad character in the output...

@Dreamsorcerer
Copy link
Member

OK, it's because I'm splitting on \n, to restrict it to the current line, but if the \n is the error then it's just outside where I split the data. I could leave the full data in the output, but that'd be pretty messy, or I could just readd the \n to the end of the string after splitting.

@Dreamsorcerer
Copy link
Member

Actually, I can just split it on \r\n, which should then match the parser's logic.

Dreamsorcerer added a commit that referenced this issue Aug 5, 2023
patchback bot pushed a commit that referenced this issue Aug 5, 2023
Fixes #7468.

(cherry picked from commit 1fb06bb)
patchback bot pushed a commit that referenced this issue Aug 5, 2023
Fixes #7468.

(cherry picked from commit 1fb06bb)
Dreamsorcerer added a commit that referenced this issue Aug 6, 2023
)

**This is a backport of PR #7480 as merged into master
(1fb06bb).**

Fixes #7468.

Co-authored-by: Sam Bull <[email protected]>
@webknjaz
Copy link
Member

webknjaz commented Aug 6, 2023

I wonder if we could use https://docs.python.org/3/library/unicodedata.html#unicodedata.name for anything non-ASCII in the error representation.
It is possible to reference symbols by names like '\N{CARRIAGE RETURN}\N{LINE FEED}', so why not do a reverse? And we wouldn't have to keep it as a bytestring...
@Dreamsorcerer WDYT?

@Dreamsorcerer
Copy link
Member

Dreamsorcerer commented Aug 6, 2023

It is possible to reference symbols by names like '\N{CARRIAGE RETURN}\N{LINE FEED}', so why not do a reverse? And we wouldn't have to keep it as a bytestring...

Seems like it doesn't really work for any of the characters we'd particularly want it for (it also requires str, which means we need to encode it first and it might not be valid utf-8):

>>> unicodedata.name("\r")
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
ValueError: no such name
>>> unicodedata.name("\n")
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
ValueError: no such name
>>> unicodedata.name("\x01")
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
ValueError: no such name

It'd also become more complex to figure out the pointer positioning. I think bytes is pretty clear, a developer can look up the characters themselves if they are unfamiliar with it.

Dreamsorcerer added a commit that referenced this issue Aug 6, 2023
)

**This is a backport of PR #7480 as merged into master
(1fb06bb).**

Fixes #7468.

Co-authored-by: Sam Bull <[email protected]>
@webknjaz
Copy link
Member

webknjaz commented Aug 6, 2023

@Dreamsorcerer true. Though, it'd be nice to have something like this conceptually, maybe not through unicode but with code-points for everything non-ASCII.. I was wondering whether to task one of my mentees with this research, it doesn't seem too complicated.

@Dreamsorcerer
Copy link
Member

Up to you, but I'm not sure it adds much, and there are a bunch of other tasks I think would be more useful to look at.

@webknjaz
Copy link
Member

webknjaz commented Aug 6, 2023

Maybe we could just do repr(bytes_thing)[2:-1] to get rid of the bytes-wrapper?

@Dreamsorcerer
Copy link
Member

Yeah, if you think that's preferable. I was just thinking it would be clear what format it is being displayed in, so a user knows that \x01 is not 4 literal characters, for example.

renovate bot referenced this issue in allenporter/pyrainbird Oct 9, 2023
[![Mend
Renovate](https://app.renovatebot.com/images/banner.svg)](https://renovatebot.com)

This PR contains the following updates:

| Package | Change | Age | Adoption | Passing | Confidence |
|---|---|---|---|---|---|
| [aiohttp](https://togithub.com/aio-libs/aiohttp) | `==3.8.5` ->
`==3.8.6` |
[![age](https://developer.mend.io/api/mc/badges/age/pypi/aiohttp/3.8.6?slim=true)](https://docs.renovatebot.com/merge-confidence/)
|
[![adoption](https://developer.mend.io/api/mc/badges/adoption/pypi/aiohttp/3.8.6?slim=true)](https://docs.renovatebot.com/merge-confidence/)
|
[![passing](https://developer.mend.io/api/mc/badges/compatibility/pypi/aiohttp/3.8.5/3.8.6?slim=true)](https://docs.renovatebot.com/merge-confidence/)
|
[![confidence](https://developer.mend.io/api/mc/badges/confidence/pypi/aiohttp/3.8.5/3.8.6?slim=true)](https://docs.renovatebot.com/merge-confidence/)
|

---

### Release Notes

<details>
<summary>aio-libs/aiohttp (aiohttp)</summary>

###
[`v3.8.6`](https://togithub.com/aio-libs/aiohttp/blob/HEAD/CHANGES.rst#386-2023-10-07)

[Compare
Source](https://togithub.com/aio-libs/aiohttp/compare/v3.8.5...v3.8.6)

\==================

## Security bugfixes

- Upgraded the vendored copy of llhttp\_ to v9.1.3 -- by
:user:`Dreamsorcerer`

    Thanks to :user:`kenballus` for reporting this, see

GHSA-pjjw-qhg8-p2p9.

    .. \_llhttp: https://llhttp.org

    `#&#8203;7647 <https://github.com/aio-libs/aiohttp/issues/7647>`\_

- Updated Python parser to comply with RFCs 9110/9112 -- by
:user:`Dreamorcerer`

    Thanks to :user:`kenballus` for reporting this, see

GHSA-gfw2-4jvh-wgfg.

    `#&#8203;7663 <https://github.com/aio-libs/aiohttp/issues/7663>`\_

## Deprecation

- Added `fallback_charset_resolver` parameter in `ClientSession` to
allow a user-supplied
    character set detection function.

Character set detection will no longer be included in 3.9 as a default.
If this feature is needed,
please use `fallback_charset_resolver
<https://docs.aiohttp.org/en/stable/client_advanced.html#character-set-detection>`\_.

    `#&#8203;7561 <https://github.com/aio-libs/aiohttp/issues/7561>`\_

## Features

- Enabled lenient response parsing for more flexible parsing in the
client
(this should resolve some regressions when dealing with badly formatted
HTTP responses). -- by :user:`Dreamsorcerer`

    `#&#8203;7490 <https://github.com/aio-libs/aiohttp/issues/7490>`\_

## Bugfixes

- Fixed `PermissionError` when `.netrc` is unreadable due to
permissions.

    `#&#8203;7237 <https://github.com/aio-libs/aiohttp/issues/7237>`\_

- Fixed output of parsing errors pointing to a `\n`. -- by
:user:`Dreamsorcerer`

    `#&#8203;7468 <https://github.com/aio-libs/aiohttp/issues/7468>`\_

-   Fixed `GunicornWebWorker` max_requests_jitter not working.

    `#&#8203;7518 <https://github.com/aio-libs/aiohttp/issues/7518>`\_

- Fixed sorting in `filter_cookies` to use cookie with longest path. --
by :user:`marq24`.

    `#&#8203;7577 <https://github.com/aio-libs/aiohttp/issues/7577>`\_

- Fixed display of `BadStatusLine` messages from llhttp\_. -- by
:user:`Dreamsorcerer`

    `#&#8203;7651 <https://github.com/aio-libs/aiohttp/issues/7651>`\_

***

</details>

---

### Configuration

📅 **Schedule**: Branch creation - At any time (no schedule defined),
Automerge - At any time (no schedule defined).

🚦 **Automerge**: Enabled.

♻ **Rebasing**: Whenever PR becomes conflicted, or you tick the
rebase/retry checkbox.

🔕 **Ignore**: Close this PR and you won't be reminded about this update
again.

---

- [ ] <!-- rebase-check -->If you want to rebase/retry this PR, check
this box

---

This PR has been generated by [Mend
Renovate](https://www.mend.io/free-developer-tools/renovate/). View
repository job log
[here](https://developer.mend.io/github/allenporter/pyrainbird).

<!--renovate-debug:eyJjcmVhdGVkSW5WZXIiOiIzNy4wLjMiLCJ1cGRhdGVkSW5WZXIiOiIzNy4wLjMiLCJ0YXJnZXRCcmFuY2giOiJtYWluIn0=-->

Co-authored-by: renovate[bot] <29139614+renovate[bot]@users.noreply.github.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.

3 participants