Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

CSS URLs longer than 1024 characters are not allowed #187

Open
sapio-dwelch opened this issue Sep 26, 2019 · 4 comments · May be fixed by #287
Open

CSS URLs longer than 1024 characters are not allowed #187

sapio-dwelch opened this issue Sep 26, 2019 · 4 comments · May be fixed by #287

Comments

@sapio-dwelch
Copy link

When trying to sanitize large data URIs used as background images in CSS properties, there is a hard-coded URL limit of 1024 characters (this is in StylingPolicy.sanitizeAndAppendUrl). Any value larger than 1024 characters is removed.

public class HtmlSanitizer {
    public static String TOO_LONG = "<table style=\"width:100%;\">\r\n" + 
            "    <tr>\r\n" + 
            "        <td style=\"background-image:url('" + 
            "Qm94PSIwIDAgMjAwLjAgMjAwvoidIGZpbGwtb3BhY2l0eT0iMSIgeG1sbnM6eGxpbms9Imh0dHA6Ly93" + 
            "d3cudzMub3JnLzE5OTkveGxpbmsiIGNvbG9yLXJlbmRlcmluZz0iYXV0byIgY29sb3ItaW50ZXJwb2xh" + 
            "dGlvbj0iYXV0byIgdGV4dC1yZW5kZXJpbmc9ImF1dG8iIHN0cm9rZT0iYmxhY2siIHN0cm9rZS1saW5l" + 
            "Y2FwPSJzcXVhcmUiIHdpZHRoPSIyMDAiIHN0cm9rZS1taXRlcmxpbWl0PSIxMCIgc2hhcGUtcmVuZGVy" + 
            "aW5nPSJhdXRvIiBzdHJva2voidBhY2l0eT0iMSIgZmlsbD0iYmxhY2siIHN0cm9rZS1kYXNoYXJyYXk9" + 
            "Im5vbmUiIGZvbnQtd2VpZ2h0PSJub3JtYWwiIHN0cm9rZS13aWR0aD0iMSIgaGVpZ2h0PSIyMDAiIHht" + 
            "bG5zPSJodHRwOi8vd3d3LnczLm9yZy8yMDAwL3N2ZyIgZm9udC1mYW1pbHk9IidEaWFsb2cnIiBmb250" + 
            "LXN0eWxlPSJub3JtYWwiIHN0cm9rZS1saW5lam9pbj0ibWl0ZXIiIGZvbnQtc2l6ZT0iMTJweCIgc3Ry" + 
            "b2tlLWRhc2hvZmZzZXQ9IjAiIGltYWdlLXJlbmRlcmluZz0iYXV0byINCj48IS0tR2VuZXJhdGVkIGJ5" + 
            "IE1hcnZpbiB3aXRoIEJhdGlrIFNWRyBHZW5lcmF0b3ItLT48ZGVmcyBpZD0iZ2VuZXJpY0RlZnMiDQog" + 
            "IC8+PGcNCiAgPjxkZWZzIGlkPSIxNDQ4MjQ5ODEwMjEtZGVmczEiDQogICAgPjxjbGlwUGF0aCBjbGlw" + 
            "UGF0aFVuaXRzPSJ1c2VyU3BhY2VPblVzZSIgaWQ9voidNDgyNDk4MTA0OTgtY2xpcFBhdGgxIg0KICAg" + 
            "ICAgPjxwYXRoIGQ9Ik0wIDAgTDIwMCAwIEwyMDAgMjAwIEwwIDIwMCBMMCAwIFoiDQogICAgICAvPjwv" + 
            "Y2xpcFBhdGgNCiAgICAgID48Y2xpcFBhdGggY2xpcFBhdGhVbml0cz0idXNlclNwYWNlT25Vc2UiIGlk" + 
            "PSIxNDQ4MjQ5OD=='); background-size: contain; background-repeat: no-repeat; back" + 
            "ground-position:center; height:100px; margin:auto;\"/>\r\n" + 
            "    </tr>\r\n" + 
            "</table>";
    
    public static String NOT_TOO_LONG = "<table style=\"width:100%;\">\r\n" + 
            "    <tr>\r\n" + 
            "        <td style=\"background-image:url('" + 
            "Qm94PSIwIDAgMjAwLjAgMjAwvoidIGZpbGwtb3BhY2l0eT0iMSIgeG1sbnM6eGxpbms9Imh0dHA6Ly93" + 
            "d3cudzMub3JnLzE5OTkveGxpbmsiIGNvbG9yLXJlbmRlcmluZz0iYXV0byIgY29sb3ItaW50ZXJwb2xh" + 
            "dGlvbj0iYXV0byIgdGV4dC1yZW5kZXJpbmc9ImF1dG8ivoidcm9rZT0iYmxhY2siIHN0cm9rZS1saW5l" + 
            "Y2FwPSJzcXVhcmUiIHdpZHRoPSIyMDAiIHN0cm9rZS1taXRlcmxpbWl0PSIxMCIgc2hhcGUtcmVuZGVy" + 
            "IE1hcnZpbiB3aXRoIEJhdGlrIFNWRyBHZW5lcmF0b3ItLT48ZGVmcyBpZD0iZ2VuZXJpY0RlZnMiDQog" + 
            "IC8+PGcNCiAgPjxkZWZzIGlkPSIxNDQ4MjQ5ODEwMjEtZGVmczEiDQogICAgPjxjbGlwUGF0aCBjbGlw" + 
            "UGF0aFVuaXRzPSJ1c2VyU3BhY2VPblVzZSIgaWQ9IjE0NDgyNDk4MTA0OTgtY2xpcFBhdGgxIg0KICAg" + 
            "ICAgPjxwYXRoIGQ9Ik0wIDAgTDIwMCAwIEwyMDAgMjAwIEwwIDIwMCBMMCAwIFoiDQogICAgICAvPjwv" + 
            "Y2xpcFBhdGgNCiAgvoidID48Y2xpcFBhdGggY2xpcFBhdGhVbml0cz0idXNlclNwYWNlT25Vc2UiIGlk" + 
            "PSIxNDQ4MjQ5OD=='); background-size: contain; background-repeat: no-repeat; back" + 
            "ground-position:center; height:100px; margin:auto;\"/>\r\n" + 
            "    </tr>\r\n" + 
            "</table>";
    
    public static final PolicyFactory HTML_POLICY = new HtmlPolicyBuilder()
            // allow all elements to be styled
            .allowStyling()
            // allow urls in bg images.
            .allowUrlsInStyles(AttributePolicy.IDENTITY_ATTRIBUTE_POLICY)
            // and also allow "data" URLs
            .allowUrlProtocols("data")
            // allow some more "global" elements
            .allowElements("table", "tbody", "th", "tr", "td").toFactory();
    
    public static void main(String args[]) {
        System.out.println(HTML_POLICY.sanitize(TOO_LONG));
        System.out.println(HTML_POLICY.sanitize(NOT_TOO_LONG));
    }
}

The output for TOO_LONG will not include the background-image property, but the output for NOT_TOO_LONG will. Is there a reason for the limit?

@mikesamuel
Copy link
Contributor

I believe the thought was that long URLs can be a denial-of-service vector.

That decision was made before data: URLs were widely used for images, so could be reconsidered.

Do you know of a good cutoff?

https://stackoverflow.com/questions/417142/what-is-the-maximum-length-of-a-url-in-different-browsers seems to conflate the address bar length and length limits to fetch.spec.whatwg.org.

@sapio-dwelch
Copy link
Author

sapio-dwelch commented Oct 14, 2019

Thanks for the reply.

I am not aware of what a good cutoff would be.

I thought that maybe the justification for a limit of 1024 was long URL == attack, but, if so, why is it only being enforced for URLs found in style sheets and not for URLs found elsewhere? (I have tested an image with a data URL greater than 4000 bytes and there seems to be no such limit). If there is no enforcement of this for all URLs, then I would argue that it be removed for the URLs found in style sheets, as well.

Those who are concerned about the max lengths could still enforce a 1024 character limit (or any limit) using a custom AttributePolicy.

@ioleo
Copy link

ioleo commented Sep 13, 2023

Is there a workaround to allow any size data:image blocks? I have a size check on the whole payload of HTTP request, so I don't need to check individual properties length.

ioleo added a commit to ioleo/java-html-sanitizer that referenced this issue Sep 13, 2023
@ioleo ioleo linked a pull request Sep 13, 2023 that will close this issue
@mikesamuel
Copy link
Contributor

Length limits avoid a lot of boundary problems in downstream code so seem in scope for sanitizers.

https://stackoverflow.com/a/417184/20394 suggests that maybe the limit should be 2000 instead of 1024

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.

3 participants