net/url: misleading error message when url has a leading space #29261
Comments
That is an incorrect error. It should be alpha character, not alpha-numeric.
On Dec 20, 2018, at 11:27 PM, Agniva De Sarker wrote:
Technically, according to RFC3986-Sec3.1 -
Scheme names consist of a sequence of characters beginning with a
letter and followed by any combination of letters, digits, plus
("+"), period ("."), or hyphen ("-").
So therefore, anything not starting with a letter should be rejected and that would be the right behavior. However, I had a look at the implementations out there in the wild.
NodeJS accepts it-
> require("url").parse(' https://example.org')
Url {
protocol: 'https:',
slashes: true,
auth: null,
host: 'example.org',
port: null,
hostname: 'example.org',
hash: null,
search: null,
query: null,
pathname: '/',
path: '/',
href: 'https://example.org/' }
Curl does not -
curl ' https://example.org'
curl: (1) Protocol " https" not supported or disabled in libcurl
Same with wget -
wget ' https://example.org'
https://example.org: Scheme missing.
Python also fails to parse-
from urllib.parse import urlparse
>>> o = urlparse(' https://example.org')
>>> o
ParseResult(scheme='', netloc='', path=' https://example.org', params='', query='', fragment='')
Note that scheme and netloc are empty.
In light of the above, let us just return a more descriptive error (with quotes around the url) if we see a space. How about -
parse " http://example.org": URL scheme does not begin with an alpha-numeric character.
|
That's right, sorry. Then we can simplify it to ".. does not begin with a letter". |
Perhaps ".. scheme does not begin with a letter" would be more accurate as
Interestingly, I found that
    u, err := url.Parse(" //example.org")
    if err != nil {
        panic(err)
    }
    fmt.Println(u)
is NOT failing, but outputs:
|
Yes, we should return an error on leading bogus chars. |
ok I will make a PR for this code change with associated tests. |
Current implementation doesn't always make it obvious what the exact problem with the URL is, so this makes it clearer by consistently quoting the invalid URL, as is the norm in other parsing implementations, e.g.:
    strconv.Atoi(" 123")
returns an error:
    parsing " 123": invalid syntax
Updates golang#29261
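The quoting convention this change refers to can be observed directly. The following is a minimal sketch, not part of the change itself: `atoiMessage` and `failedURL` are hypothetical helper names introduced only for illustration. It shows the quoted input in the `strconv.Atoi` error and reads back the URL that `url.Parse` records in its `*url.Error`. The exact wording of the `net/url` message depends on the Go version, so the sketch does not rely on it.

```go
package main

import (
	"errors"
	"fmt"
	"net/url"
	"strconv"
)

// atoiMessage (hypothetical helper) returns the error text strconv.Atoi
// produces for s, or "" if s parses cleanly.
func atoiMessage(s string) string {
	if _, err := strconv.Atoi(s); err != nil {
		return err.Error()
	}
	return ""
}

// failedURL (hypothetical helper) returns the URL stored in the *url.Error
// that url.Parse returns, and false if parsing succeeded.
func failedURL(raw string) (string, bool) {
	_, err := url.Parse(raw)
	var uerr *url.Error
	if errors.As(err, &uerr) {
		return uerr.URL, true
	}
	return "", false
}

func main() {
	// Prints: strconv.Atoi: parsing " 123": invalid syntax
	fmt.Println(atoiMessage(" 123"))

	// %q makes the leading space visible, which is the point of quoting.
	if u, ok := failedURL(" http://example.org"); ok {
		fmt.Printf("parse failed for %q\n", u)
	}
}
```

Printing the offending input with `%q` is what makes a leading space visible to the person reading the log, which is the readability problem this issue describes.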
I have implemented the generic improvement of error messages by consistently quoting the URLs in #29384, but more detailed messages and stricter validation are welcome of course. |
@bradfitz what does "bogus chars" translate to besides a space? |
Any non-letter, according to the spec. |
On further analysis, it seems it's not straightforward to implement this, for 2 reasons:
To account for the space-specific error, we could add a separate case branch and raise the error message like so:
However, this seems like a band-aid fix to me. I am not sure, so I would appreciate some input. Any thoughts on whether this is good, or if there's any other approach? |
I assume the proposed change would go just before Lines 435 to 457 in b50210f
This would make it inconsistent with other validation error cases in the same
They both just return scheme as "" and no error whatsoever. It should be consistent, so that all reasons for an invalid scheme result either in an error or in "", but not a mix of the two. |
It's hard to discuss code changes here. Maybe just send a CL and we can continue it over there. We probably need to simplify the switch-case a bit. Have an if condition first, which checks for i==0 and non-letter, and return immediately. And integrate the rest of the switch-case accordingly. We can keep the special |
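One possible reading of this suggestion is sketched below. It is not the code in `net/url` and not the CL that was eventually sent: `splitScheme`, `isLetter`, and `errLeadingSpace` are hypothetical names, and treating a leading space as an error while still returning an empty scheme for other non-letters (so that relative references such as `/path` keep working) is an assumption of this sketch.

```go
package main

import (
	"errors"
	"fmt"
)

// errLeadingSpace is a hypothetical error; the wording is an assumption.
var errLeadingSpace = errors.New("URL cannot begin with a space")

// isLetter reports whether c is an ASCII letter (RFC 3986 ALPHA).
func isLetter(c byte) bool {
	return 'a' <= c && c <= 'z' || 'A' <= c && c <= 'Z'
}

// splitScheme is a hypothetical stand-in for a scheme splitter. It returns
// the scheme and the remainder of raw, or an empty scheme when raw does not
// start with a valid scheme followed by ':'.
func splitScheme(raw string) (scheme, rest string, err error) {
	if raw == "" {
		return "", raw, nil
	}
	// Up-front check on the first byte, before the main loop.
	if !isLetter(raw[0]) {
		if raw[0] == ' ' {
			return "", "", errLeadingSpace
		}
		return "", raw, nil // e.g. "/path" or "//host": no scheme
	}
	for i := 1; i < len(raw); i++ {
		c := raw[i]
		switch {
		case isLetter(c), '0' <= c && c <= '9', c == '+', c == '-', c == '.':
			// Still inside the scheme; keep scanning.
		case c == ':':
			return raw[:i], raw[i+1:], nil
		default:
			return "", raw, nil // invalid scheme character: no scheme
		}
	}
	return "", raw, nil // no ':' found
}

func main() {
	for _, raw := range []string{"https://example.org", " https://example.org", "/a:b"} {
		scheme, rest, err := splitScheme(raw)
		fmt.Printf("%q -> scheme=%q rest=%q err=%v\n", raw, scheme, rest, err)
	}
}
```

Handling the first byte before the loop keeps the `switch` free of `i == 0` special cases. Whether a leading space should be the only error case, or whether every non-letter should be rejected, is exactly the consistency question raised earlier in the thread.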
Change https://golang.org/cl/155922 mentions this issue: |
Change https://golang.org/cl/185117 mentions this issue: |
Current implementation doesn't always make it obvious what the exact problem with the URL is, so this makes it clearer by consistently quoting the invalid URL, as is the norm in other parsing implementations, e.g.:
    strconv.Atoi(" 123")
returns an error:
    parsing " 123": invalid syntax
Updates #29261
Change-Id: Icc6bff8b4a4584677c0f769992823e6e1e0d397d
GitHub-Last-Rev: 648b9d9
GitHub-Pull-Request: #29384
Reviewed-on: https://go-review.googlesource.com/c/go/+/185117
Reviewed-by: Daniel Martí
Run-TryBot: Daniel Martí
TryBot-Result: Gobot Gobot
See golang/go#29261. Go 1.14 added quotes around the URL in net/url error messages.
What version of Go are you using (go version)?
Does this issue reproduce with the latest release?
Yes
What operating system and processor architecture are you using (go env)?
What did you do?
https://play.golang.org/p/jkcYSD6ZRcO
What did you expect to see?
Either
What did you see instead?
A misleading detailed error message:
In #24246 the proposed solution was to trim the URLs before using them, but from the given error message it is very hard to see what the real problem is.
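The trimming workaround mentioned for #24246 can be sketched as follows. `parseTrimmed` is a hypothetical helper name, and the sketch assumes that surrounding whitespace in the input is accidental and safe to discard, which may not hold for every caller.

```go
package main

import (
	"fmt"
	"net/url"
	"strings"
)

// parseTrimmed (hypothetical helper) strips leading and trailing whitespace
// before handing the string to url.Parse.
func parseTrimmed(raw string) (*url.URL, error) {
	return url.Parse(strings.TrimSpace(raw))
}

func main() {
	u, err := parseTrimmed(" http://example.org")
	if err != nil {
		fmt.Println("parse failed:", err)
		return
	}
	fmt.Println(u.Scheme, u.Host) // http example.org
}
```

This avoids the error, but it only helps callers who already suspect whitespace is the cause, which is why the error message itself is the subject of this issue.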
I propose to either:
or
or
The "cannot contain spaces" in the error message can be changed to "cannot contain whitespace" or even "contains invalid characters", depending on how the check is implemented.
Current implementation:
go/src/net/url/url.go
Lines 540 to 562 in b50210f