Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

proposal/new requirement - served filename in content-disposition header must follow correct encoding #1390

Closed
elarlang opened this issue Sep 30, 2022 · 24 comments · Fixed by #2069
Assignees
Labels
4b Major-rework These issues need to be part of a full chapter rework 6) PR awaiting review V12 _5.0 - prep This needs to be addressed to prepare 5.0

Comments

@elarlang
Copy link
Collaborator

elarlang commented Sep 30, 2022

Basically, it must follow RFC 6266: "Use of the Content-Disposition Header Field in the Hypertext Transfer Protocol (HTTP)"
https://tools.ietf.org/html/rfc6266#section-5

Proposal (2023-04-29 updated requirement text and added alternative category):

  • Category: V12.5 "File and Resources > File Download" OR "V5.3 Output Encoding and Injection Prevention"
  • Verify that when Content-Disposition header is used then filename and filename* attribute values are correctly sanitized and encoded.
  • Level: 1, 2, 3
  • CWE: ?
    • CWE-172 - CWE-172: Encoding Error

Note: Content-Disposition header may be used with attachment or inline, so we can not limit requirement text only for "download file" functionality.

Why: if not converted correctly, it may give "header injection" possibility

Something like that (even this one is for mails):

Update:

  • filename - only characters from ISO-8859-1 can be used, value bust be sanitized
  • filename* - can be presented in chosen charset (utf-8) and need to be: sanitized + encoded to charset + urlencoded
@elarlang elarlang self-assigned this Sep 30, 2022
@elarlang elarlang added the 3) awaiting proposal There is some discussion in issue and reach to some results but it's not concluded with clear propos label Nov 20, 2022
@tghosth tghosth self-assigned this Dec 7, 2022
@tghosth tghosth added _5.0 - prep This needs to be addressed to prepare 5.0 josh/elar labels Dec 7, 2022
@motoyasu-saburi
Copy link

I'm commenting because I have been researching this issue.


Basically, it is better to conform to the RFC or WHAT WG.
However, the WHATWG recommendation is URL Encode to filename.

WHATWG HTML Spec:
https://html.spec.whatwg.org/#multipart-form-data

For field names and filenames for file fields, the result of the encoding in the previous bullet point must be escaped by replacing any 0x0A (LF) bytes with the byte sequence %0A, 0x0D (CR) with %0D and 0x22 (") with %22. The user agent must not perform any other escapes.

Chrome and Firefox use the URL Encode method.


HTTP Clinet and other systems basically used escaping with \ .
The same is true of Email client. (e.g. Gmail)

  • " --> \"
  • \n --> \\n
  • \r --> \\r

" escaping is necessary because field names, file extension, etc.. are tampered with when filenames such as the following are inserted into Content-Disposition.

Input filename:
malicious.sh"; name="field2"; dummy="abc.txt


Output Content-Disposition:

Content-Disposition: form-data; name="field1"; filename="malicious.sh"; name="field2"; dummy="abc.txt"

Similarly, I have confirmed that filename* can be RFD if it can generate the following Content-Disposition when URL Encoding is not done. (This is only valid in Firefox, though.)

Content-Disposition: attachment; filename*=utf-8''malicious.sh%00'normal.txt

Examples of problematic cases include:

90% of the problems were either a lack of escaping of the " in filename or a lack of escaping of CRLF.
Only Ktor had a problem with RFD due to missing escaping of filename* .


Implimentetion example

Spring:
https://github.com/spring-projects/spring-framework/blob/4cc91e46b210b4e4e7ed182f93994511391b54ed/spring-web/src/main/java/org/springframework/http/ContentDisposition.java#L259-L267

Symphony:
https://github.com/symfony/symfony/blob/123b1651c4a7e219ba59074441badfac65525efe/src/Symfony/Component/Mime/Header/ParameterizedHeader.php#L128-L133

Golang:
https://github.com/golang/go/blob/1e7e160d070443147ee38d4de530ce904637a4f3/src/mime/multipart/writer.go#L132-L136


I looked at about 50-80 well-known modules, and about 20% of them retained this vulnerability,
so I think it is very significant that clear requirements are stated in the ASVS.

@Sjord
Copy link
Contributor

Sjord commented Jan 20, 2023

Perhaps CWE - CWE-838: Inappropriate Encoding for Output Context (4.9) is more relevant.

@tghosth tghosth added 4b Major-rework These issues need to be part of a full chapter rework V12 V5 Temporary label for grouping input validation, sanitization, encoding, escaping related requirements labels May 23, 2023
@elarlang
Copy link
Collaborator Author

As we now have RFC references in requirement texts, we can use it for this one as well.

One candidate for CWE is CWE-641 Improper Restriction of Names for Files and Other Resources. There are no good one as the requirement asks validation+sanitization+encoding.

@motoyasu-saburi
Copy link

I have written a report on this issue.
https://gist.github.com/motoyasu-saburi/1b19ef18e96776fe90ba1b9f910fa714

Also, I have read some RFCs in broad strokes, but the escaping requirement was not clearly stated.

https://datatracker.ietf.org/doc/html/rfc2231
https://datatracker.ietf.org/doc/html/rfc2616
https://datatracker.ietf.org/doc/html/rfc5987
https://datatracker.ietf.org/doc/html/rfc6266
https://datatracker.ietf.org/doc/html/rfc7578
https://datatracker.ietf.org/doc/html/rfc8187

The format of form-data > filename is described in the following RFC.

RFC2616 (obsolete)
https://datatracker.ietf.org/doc/html/rfc2616#section-19.5.1

filename-param = "filename" "=" quoted-string

RFC6266 (standard)
https://datatracker.ietf.org/doc/html/rfc6266#section-4.1

filename-param = "filename" "=" value
                            | "filename*" "=" ext-value
....
value     = <value, defined in [RFC2616], Section 3.6>
              ; token | quoted-string

I reported a same problem to Scala's Web Framework (Playframework),
and the maintainer investigated the issue in considerable detail.
playframework/playframework#11571

I hope this will be helpful.

@tghosth
Copy link
Collaborator

tghosth commented Sep 26, 2023

that is interesting @motoyasu-saburi

Do you think you could try and formulate a requirement based on the conclusions of your research?

@tghosth
Copy link
Collaborator

tghosth commented Aug 12, 2024

This was tagged as V5 but my inclination is to keep it as V12 as it is specific to files and V5 is a little overloaded :)

@motoyasu-saburi it would be still be cool to get a suggested requirement draft from you :)

@tghosth tghosth removed josh/elar V5 Temporary label for grouping input validation, sanitization, encoding, escaping related requirements labels Aug 12, 2024
@elarlang
Copy link
Collaborator Author

It's not a wording proposal, but just to throw some direction in:
Verify that when file names are served in an HTTP header (such as Content-Disposition), file names are encoded or sanitized (e. g. according to RFC6266).

@motoyasu-saburi
Copy link

motoyasu-saburi commented Aug 13, 2024

Sorry I have not been able to make suggestions.

Additional materials were prepared late last year and are listed in passing.
https://codeblue.jp/2023/result/pdf/cb23%EF%BD%B0filename-in-content-disposition-is-a-landmine-vulnerability-caused-by-ambiguous-requirements-by-motoyasu-saburi.pdf


Verify that when file names are served in an HTTP header (such as Content-Disposition), file names are encoded or sanitized (e. g. according to RFC6266).

I thought the text here was generally good.
Personally, I think it is a difficult area to decide whether RFC6266 should be listed or not.
The reason is that there is no double-quote escaping requirement in this RFC.

(In fact, many services were implemented in accordance with the RFC and were vulnerable)

@motoyasu-saburi
Copy link

Proposal for V5 side:
Ensure that filenames are encoded or escaped to prevent request tampering with multipart format. (e.g. https://html.spec.whatwg.org/#multipart-form-data)

@jmanico
Copy link
Member

jmanico commented Aug 13, 2024

Why escaped? Typically I suggest we never use the submitted filename and to save files using an internal filename and not a user driven filename.

@motoyasu-saburi
Copy link

@jmanico

Why escaped?
This is because there is no clear standard requirement other than the WhatWG's HTMLSpec.
Therefore, some application implementations use \ for escaping, while others use percent encoding (I personally recommend percent encoding because it is easier to implement and less prone to bugs).
(Personally, I recommend percent encoding because it is easier to implement and less prone to bugs.)

My memory is a little fuzzy,
but I think in the case of Gmail, escaping was done with \ ( " --> \" ).

Typically I suggest we never use the submitted filename and to save files using an internal filename and not a user driven filename.

There are quite a few Web services that make requests including file names.
For example, e-mail services and e-commerce sites also have the ability to upload/send files such as images.
Such services may include the file name sent by the user in the request as it is.

Does this answer your question?

@elarlang
Copy link
Collaborator Author

Personally, I think it is a difficult area to decide whether RFC6266 should be listed or not.

The reason is that there is no double-quote escaping requirement in this RFC.

It must be encoded for filename* attribute, no?

Also the requirement is not limited here only for request tampering - correct encoding is also needed for the functionality to work correctly in every situation, not only when served filename is taken from user-request directly.

@motoyasu-saburi
Copy link

motoyasu-saburi commented Aug 14, 2024

Since some parts of the story are complicated, we will try to sort it out.

Content-Disposition is used in two patterns.

  1. HTTP Response Header
  2. Multipart format
    a. Email
    b. HTTP Request (multipart/form-data)

There are two attributes that specify the filename: filename and filename*

filename:

  • It is used for both HTTP responses header and Multipart.
  • The only clear encoding requirement is the browser standard “WhatWG > HTML Spec > multipart/form-data”.
    • Need to url encode the letter [ CRLF, Double-Quote ].
  • There is no explicit requirement in the RFC to make [ CRLF, Double-Quote] harmless.
    • Various services and OSS may URL Encode CRLF/Double-Quote, convert it to spaces, or escape it with BackSlash.
  • If CRLF/Double-Quote is not escaped, the following attack scenario may occur
    • HTTP Response Header:
      • HTTP Header Injection (CRLF Injection by \ Back-Slash)
      • Reflect File Downlaod ( Parameter Tampering by " Double-Quote )
    • Multipart Format:
      • Form/Email Value Tampering (CRLF Injection by \ Back-Slash )
      • Form/Email Value Tampering ( Parameter Name Tampering by " Double-Quote )

filename*

  • It is used for HTTP responses header.
  • The clear encoding requirement is the RFC6266 / RFC5987
  • If URLEncode is not done, the following attacks may be possible
    • Contend-Disposition Value Tampering ( Parameter Name Tampering by ;, ", CRLF)
  • filename and filename* may be used together. In such cases, filename* may take precedence over filename. (This is an unimportant specification in this Issue.)

As mentioned above, the fact that there may or may not be a standard to refer to makes the requirement very complicated.

The fact that Content-Dispotion is used in a variety of situations (Email / HTTP Request / Response) also complicates the situation.

@elarlang
Copy link
Collaborator Author

The reason is that there is no double-quote escaping requirement in this RFC.

Double quotes must be sanitized in this situation. Sometimes there are no encloser marks used, which means that it is enough to use ; in filename to have new attribute in the HTTP response header - it means, this one also must be sanitized.

Rethinking about it again, I still stand for my previous proposal (#1390 (comment))

Verify that when file names are served in an HTTP header (such as Content-Disposition), file names are encoded or sanitized (e. g. according to RFC6266).

We may list here other situations to cover it more cases, but it can be specific here to hilight this one and certain problem. In general we can say, the problem is covered with current requirement 5.3.1 (encoding for filename*) + 5.2.2 (sanitization for filename) .

@motoyasu-saburi
Copy link

Rethinking about it again, I still stand for my previous proposal

I did not fully understand the current specifications.
I think this wording is fine.

Double quotes must be sanitized in this situation. Sometimes there are no encloser marks used, which means that it is enough to use ; in filename to have new attribute in the HTTP response header - it means, this one also must be sanitized.

As an addendum, there is the problem that the specifications here are also ambiguous.
As for here, RFC may or may not say that it should be enclosed in a Double-Quote.`
For example, rfc6266 allows non-Quoted-String.
However, RFC 2616 requires Quoted-String.

https://datatracker.ietf.org/doc/html/rfc6266#appendix-A
o RFC 2616 only allows "quoted-string" for the filename parameter.
This would be an exceptional parameter syntax, and also doesn't
reflect actual use.

@elarlang
Copy link
Collaborator Author

elarlang commented Sep 4, 2024

Previous proposal:

Verify that when file names are served in an HTTP header (such as Content-Disposition), file names are encoded or sanitized (e. g. according to RFC6266).

Ok, focus here is the technique to use. Maybe should put the focus to "vector to mitigate" and rephrase it to something like:

Verify that serving a file name (e. g. in HTTP response header, as email attachment) it is encoded or sanitized (e. g. according to RFC6266) to keep the document structure.

(note: that I don't like the line above myself as it requires more work, I just try to put one additional direction to find solution here)

@jmanico
Copy link
Member

jmanico commented Sep 5, 2024 via email

@elarlang
Copy link
Collaborator Author

elarlang commented Sep 5, 2024

ping @motoyasu-saburi @tghosth @Sjord - any comments or thoughts?

@motoyasu-saburi
Copy link

I agree with this direction myself.
(At first I thought we should include detailed requirements, but this was a mistake)

@elarlang
Copy link
Collaborator Author

elarlang commented Sep 5, 2024

At first I thought we should include detailed requirements, but this was a mistake

It is not a mistake - technical input gives an opportunity to validate the requirement, and does it cover all the cases. But we took an approach for the ASVS to put more abstract principles as requirements and leave technical or step-by-step guides to testing guides and cheat sheets.

@elarlang
Copy link
Collaborator Author

elarlang commented Sep 5, 2024

I also re-open the section question for the proposed requirement:

  • V5.3 Output Encoding and Injection Prevention - by requirement decription it fits here.
  • V12.5 File Download - it already contains one similar (from functionality perspective) requirement and those fits together. At the moment it is the only requirement in this section. Maybe the section should be called "Serving a File"?

@elarlang elarlang added the next meeting Filter for leaders label Sep 5, 2024
@jmanico
Copy link
Member

jmanico commented Sep 5, 2024

I like the idea of changing the section to "serving a file".

Since this is very focused on file download, I would like to keep this requirement in v12. There are many requirements that fit into many sections. I would like v12 to be as complete as possible to help folks build file upload/download systems instead of having to look to many sections.

Or at least have a section at the end of v12 that points to other requirements that are needed for file upload, but I think that is to messy and hard to maintain.

@tghosth
Copy link
Collaborator

tghosth commented Sep 5, 2024

I like this direction. How about: Verify that file names served (e.g., in HTTP response headers or email attachments) are properly encoded or sanitized (e.g., following RFC 6266) to preserve document structure and prevent injection attacks.

This looks good to me

I think it should stay in V12.5

@elarlang
Copy link
Collaborator Author

elarlang commented Sep 5, 2024

I'll handle the PR for the requirement and opened a separate issue (#2066) for discussing the section title, as there is a bit larger picture to keep in mind.

@elarlang elarlang added 5) awaiting PR A proposal hs been accepted and reviewed and we are now waiting for a PR and removed 3) awaiting proposal There is some discussion in issue and reach to some results but it's not concluded with clear propos next meeting Filter for leaders labels Sep 5, 2024
@elarlang elarlang linked a pull request Sep 10, 2024 that will close this issue
@elarlang elarlang added 6) PR awaiting review and removed 5) awaiting PR A proposal hs been accepted and reviewed and we are now waiting for a PR labels Sep 10, 2024
tghosth pushed a commit that referenced this issue Sep 11, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
4b Major-rework These issues need to be part of a full chapter rework 6) PR awaiting review V12 _5.0 - prep This needs to be addressed to prepare 5.0
Projects
None yet
Development

Successfully merging a pull request may close this issue.

5 participants