Unable to encode + decode URL #799

Closed
djc opened this issue Oct 26, 2022 · 2 comments · Fixed by #817

Comments

@djc
Contributor

djc commented Oct 26, 2022

This code fails:

        let url = dbg!(Url::parse("         m:/          /%2E.//     \\           ").unwrap());
        let encoded = url.as_str();
        let reparsed = Url::parse(encoded).unwrap();

Output (the second unwrap(), on the reparsed URL, is the one that panics):

panicked at 'called `Result::unwrap()` on an `Err` value: InvalidDomainCharacter', test.rs:3:44
[test.rs:1] Url::parse("         m:/          /%2E.//     \\           ").unwrap() = Url {
    scheme: "m",
    cannot_be_a_base: false,
    username: "",
    password: None,
    host: None,
    port: None,
    path: "//%20%20%20%20%20\\",
    query: None,
    fragment: None,
}

Found as part of fuzzing the trust-dns crate (which uses Url in CAA records).

@qsantos
Contributor

qsantos commented Feb 27, 2023

The offending URL can be reduced to m:/.//\\.

@qsantos
Contributor

qsantos commented Feb 28, 2023

The issue comes from the "remove dot segments" step (RFC 3986, section 5.2.4). Consider the URI m:/.//. According to the RFC:

B. if the input buffer begins with a prefix of "/./" […], then replace that prefix with "/"

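So the path /.// becomes //, and the URI is normalized to m://. But this has different semantics: the leading // now introduces an authority, which is why the \ in the original example ends up being interpreted as part of the authority when the serialized URL is reparsed.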
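
To make the failure mode concrete, here is a minimal sketch of the RFC's remove-dot-segments procedure in plain Rust. It is not rust-url's implementation; the function names remove_dot_segments and pop_last_segment are made up for illustration, and it works on the path only (no scheme, query, or fragment handling).

```rust
/// Sketch of RFC 3986, section 5.2.4 ("Remove Dot Segments").
/// Hypothetical helper, not the code used by the `url` crate.
fn remove_dot_segments(path: &str) -> String {
    let mut input: &str = path;
    let mut output = String::new();
    while !input.is_empty() {
        if input.starts_with("../") {
            // Rule A: drop a leading "../"
            input = &input[3..];
        } else if input.starts_with("./") {
            // Rule A: drop a leading "./"
            input = &input[2..];
        } else if input.starts_with("/./") {
            // Rule B: replace "/./" with "/" (keep the second slash)
            input = &input[2..];
        } else if input == "/." {
            // Rule B: a trailing "/." becomes "/"
            input = "/";
        } else if input.starts_with("/../") {
            // Rule C: replace "/../" with "/" and drop the last output segment
            input = &input[3..];
            pop_last_segment(&mut output);
        } else if input == "/.." {
            // Rule C: a trailing "/.." becomes "/" and drops the last output segment
            input = "/";
            pop_last_segment(&mut output);
        } else if input == "." || input == ".." {
            // Rule D: a lone "." or ".." is removed
            input = "";
        } else {
            // Rule E: move the first segment (with its leading "/", if any) to the output
            let start = if input.starts_with('/') { 1 } else { 0 };
            let end = input[start..]
                .find('/')
                .map(|i| i + start)
                .unwrap_or(input.len());
            output.push_str(&input[..end]);
            input = &input[end..];
        }
    }
    output
}

/// Removes the last segment and its preceding "/" from the output buffer.
fn pop_last_segment(output: &mut String) {
    match output.rfind('/') {
        Some(i) => output.truncate(i),
        None => output.clear(),
    }
}
```

With this sketch, remove_dot_segments("/.//") returns "//", so gluing the scheme back on gives m://, which a parser reads as "scheme m followed by an authority" rather than "scheme m with path //".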

I have conducted a few tests with some URI normalization libraries:

$ node test.js
http://example.com/a/c/d
m:/.//
$ cat test.php
<?php
require_once 'URLNormalizer.php';
echo (new Normalizer('http://example.com/a/b/../c/d'))->normalize(), "\n";
echo (new Normalizer('m:/.//'))->normalize(), "\n";
?>
$ php test.php
http://example.com/a/c/d
m:/
$ cat test.pl
use feature qw(say);
use URI::Normalize qw(normalize_uri);
say normalize_uri(URI->new('http://example.com/a/b/../c/d'));
say normalize_uri(URI->new('m:/.//'));
$ perl test.pl
http://example.com/a/c/d
m://
$ cat test.rb
require 'addressable/uri'
p Addressable::URI.parse('http://example.com/a/b/../c/d').normalize.to_s
p Addressable::URI.parse('m:/.//').normalize.to_s
$ ruby test.rb
"http://example.com/a/c/d"
/usr/share/rubygems-integration/all/gems/addressable-2.8.1/lib/addressable/uri.rb:2487:in `validate': Cannot have a path with two leading slashes without an authority set: 'm://' (Addressable::URI::InvalidURIError)
	from /usr/share/rubygems-integration/all/gems/addressable-2.8.1/lib/addressable/uri.rb:2410:in `defer_validation'
	from /usr/share/rubygems-integration/all/gems/addressable-2.8.1/lib/addressable/uri.rb:839:in `initialize'
	from /usr/share/rubygems-integration/all/gems/addressable-2.8.1/lib/addressable/uri.rb:2184:in `new'
	from /usr/share/rubygems-integration/all/gems/addressable-2.8.1/lib/addressable/uri.rb:2184:in `normalize'
	from test.rb:3:in `<main>'

For PHP, I am using https://github.com/glenscott/url-normalizer. In short:

  • Node does not remove the dot before a double-slash; I think the logic is somewhere in there, but I never liked browsing Firefox's source code, so who knows?
  • PHP removes the dot but avoids leaving a double-slash where it would be reinterpreted as the authority; this is done by merging consecutive slashes, which does not conform to the RFC
  • Perl exhibits the same bug as rust-url
  • Ruby straight up refuses to parse a path with two leading slashes without an authority

I feel like Ruby's is the most consistent and straightforward solution.
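
As a rough illustration of two of the behaviours listed above, here is a sketch of a serializer that either rejects the ambiguous case (the Ruby-style option) or keeps a "/." in front of the path (the form Node printed). Everything here is hypothetical: Parts, SerializeError, serialize_strict and serialize_with_dot are names invented for this example, not rust-url's API, and this is not necessarily what #817 does.

```rust
/// Hypothetical, simplified URL parts; not the `url` crate's internal representation.
struct Parts<'a> {
    scheme: &'a str,
    authority: Option<&'a str>,
    path: &'a str,
}

#[derive(Debug, PartialEq)]
enum SerializeError {
    /// The path starts with "//" but there is no authority to put in front of it.
    AmbiguousPath,
}

/// Ruby-style option: refuse to produce a string that would be reparsed
/// with the start of the path taken as an authority.
fn serialize_strict(p: &Parts<'_>) -> Result<String, SerializeError> {
    match p.authority {
        Some(a) => Ok(format!("{}://{}{}", p.scheme, a, p.path)),
        None if p.path.starts_with("//") => Err(SerializeError::AmbiguousPath),
        None => Ok(format!("{}:{}", p.scheme, p.path)),
    }
}

/// Node-style option: emit "/." before the path so the leading "//"
/// cannot be mistaken for the start of an authority.
fn serialize_with_dot(p: &Parts<'_>) -> String {
    match p.authority {
        Some(a) => format!("{}://{}{}", p.scheme, a, p.path),
        None if p.path.starts_with("//") => format!("{}:/.{}", p.scheme, p.path),
        None => format!("{}:{}", p.scheme, p.path),
    }
}
```

For scheme m, no authority, and path //, serialize_strict returns Err(SerializeError::AmbiguousPath), while serialize_with_dot returns m:/.//, matching the Node output above. Paths that do not start with // are serialized the same way by both functions.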

This does mean that the no_panic test from #654 must be amended. However, we can cover the m:/.// case in a dedicated unit test.

See #817
