Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Problem with path matching and URI manipulation when involving double slashes #4743

Closed
RussellLuo opened this issue Apr 28, 2022 · 5 comments · Fixed by #4948
Closed

Problem with path matching and URI manipulation when involving double slashes #4743

RussellLuo opened this issue Apr 28, 2022 · 5 comments · Fixed by #4948

Comments

@RussellLuo
Copy link
Collaborator

RussellLuo commented Apr 28, 2022

Problem

1. Path matcher containing double slashes won't match any request

Caddyfile:

:8080 {
    handle_path //prefix/* {
        reverse_proxy 127.0.0.1:9090
    }
    respond "oops"
}

Neither of the following requests will match handle_path:

$ curl http://127.0.0.1:8080/prefix/path
oops
$ curl http://127.0.0.1:8080//prefix/path
oops

2. Normal handle_path can match but can't strip prefix from requests containing double slashes

Caddyfile:

:8080 {
    handle_path /prefix/* {
        reverse_proxy 127.0.0.1:9090
    }
    respond "oops"
}

:9090 {
    respond {path}
}

As shown below, the second request containing double slashes will be forwarded to the upstream without being stripped:

$ curl http://127.0.0.1:8080/prefix/path 
/path
$ curl http://127.0.0.1:8080//prefix/path                                                     
//prefix/path

Cause

I think the above problems are explainable, since "path cleaning" was introduced after the fix #4407:

The solution is to unescape (convert URL-encoded characters) then clean the path, before matching.

Possible solution

If the problems mentioned above are not the normal behaviors as expected by Caddy, I think maybe we should also introduce "path cleaning" while doing path matching (besides cleaning the source path, we should also clean the target path) and URI manipulation.

@RussellLuo
Copy link
Collaborator Author

As for problem (2), I have tested the behavior of Nginx with the following config:

server {
    listen 8080;

    location /prefix/ {
        proxy_pass http://127.0.0.1:9090/; # The trailing slash matters.
    } 
}

And here are the corresponding error logs (since no upstream is defined):

..., request: "GET /prefix/path HTTP/1.1", upstream: "http://127.0.0.1:9090/path", host: "127.0.0.1:8080"
..., request: "GET //prefix/path HTTP/1.1", upstream: "http://127.0.0.1:9090/path", host: "127.0.0.1:8080"

@RussellLuo
Copy link
Collaborator Author

Instead of cleaning path implicitly within path matchers, how about enhancing the rewrite handler to let users do it explicitly if needed?

For example, as a possible solution, we can add an attribute NormalizePath to Rewrite as below:

type Rewrite struct {
	...

	// Strips the given prefix from the beginning of the URI path.
	StripPathPrefix string `json:"strip_path_prefix,omitempty"`

	// Strips the given suffix from the end of the URI path.
	StripPathSuffix string `json:"strip_path_suffix,omitempty"`

	// Performs substring replacements on the URI.
	URISubstring []substrReplacer `json:"uri_substring,omitempty"`

	// Performs regular expression replacements on the URI path.
	PathRegexp []*regexReplacer `json:"path_regexp,omitempty"`

+	// Normalize the URI path by decoding the text encoded in the “%XX” form,
+	// resolving references to relative path components “.” and “..”, and
+	// merging multiple slashes into a single slash.
+	NormalizePath bool `json:"normalize_path,omitempty"`

	logger *zap.Logger
}

Take the problem (2) for example, the corresponding Caddyfile might look like this:

:8080 {
    uri normalize_path
    handle_path /prefix/* {
        reverse_proxy 127.0.0.1:9090
    }
    respond "oops"
}

:9090 {
    respond {path}
}

As for the original problem with basicauth - the case where "path cleaning" was introduced - I think this may be an alternative fix:

example.com {
    root * /srv

    uri normalize_path
    basicauth /secret-files* {
        bob <hash>
    }   

    file_server
}

@mholt
Copy link
Member

mholt commented May 5, 2022

Thanks Russell -- as discussed in Slack, I wonder if a good solution for this would be an exported function in the caddyhttp package, that matchers and handlers can use as needed.

Edit: actually, maybe SanitizedPathJoin is related or the function we're looking for?

@mholt
Copy link
Member

mholt commented Aug 10, 2022

@RussellLuo I've been thinking on this long and hard for a while. Track #4948 where I am hopefully resolving this.

@mholt
Copy link
Member

mholt commented Aug 11, 2022

@RussellLuo This is now fixed in #4948. If the prefix (or suffix, or substring) pattern in the rewrite config contains //, then it will be assumed that slashes should not be merged when cleaning the path.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.

2 participants