-
Notifications
You must be signed in to change notification settings - Fork 4.3k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Fixed isURL regex to take account file urls. #20435
Merged
Merged
Changes from 1 commit
Commits
Show all changes
4 commits
Select commit
Hold shift + click to select a range
40f5560
Fixed isURL regex to take account file urls.
SergioEstevao 024149c
Change regex to support url with the format file:/
SergioEstevao 7ddfe49
Check for start and end of line.
SergioEstevao f9be985
Simplify test now that we test for start and end of line in regex.
SergioEstevao File filter
Filter by extension
Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
There are no files selected for viewing
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -37,8 +37,10 @@ describe( 'isURL', () => { | |
[ 'https://wordpress.org/./foo' ], | ||
[ 'https://wordpress.org/path?query#fragment' ], | ||
[ 'https://localhost/foo#bar' ], | ||
[ 'https:///localhost/foo#bar' ], | ||
[ 'mailto:[email protected]' ], | ||
[ 'ssh://user:[email protected]:8080' ], | ||
[ 'file:///localfolder/file.mov' ], | ||
] )( 'valid (true): %s', ( url ) => { | ||
expect( isURL( url ) ).toBe( true ); | ||
} ); | ||
|
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In wondering whether the proposed revisions are in-fact technically valid, it eventually clicked with me that the extra slash is meant to represent the path name (e.g.
file:
is the protocol, then//
, then/foo
constituting a root-relative path). So in looking at how this is implemented, I wonder if this is something where we should be accounting for this as part of the pattern matching for the path, and not strictly as an optional extra slash following the//
. The pattern is a bit unwieldy to untangle, so it's not immediately clear to me where this change would need to occur.It's also a bit awkward that it's the second of three slashes which is made to be optional. Though I expect it works just the same.
Noting that the
//
is technically considered optional based on this pattern, I wonder if the following should be considered valid, as omitting the optional//
, but still root-relative:In the browser, this is considered valid, and evaluates the same as
file:///foo
:Under this proposed implementation, I expect that would not be supported.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Possibly related contrary information for the above-given example:
https://url.spec.whatwg.org/#scheme-state
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
At least based on relevant specification, it is optional:
https://url.spec.whatwg.org/#path-or-authority-state
The diagram here is a little easier to interpret, though technically not the specification we adhere to:
https://en.wikipedia.org/wiki/Uniform_Resource_Identifier#Generic_syntax
If the extra slash we're accounting for is relevant for the path, then the pattern should match it separately from the optional authority
//
.Edit: Oof, this gets really confusing, when you consider that
http:
is a "special scheme", and thus it would be subject to special authority slashes state, and omitting//
would be deemed invalid.There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@aduth I did some changes to support the case you mention, have a look to see if you agree.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@SergioEstevao I'm not seeing any changes. Did you push them? GitHub was having an outage incident this morning, so possible that it was missed?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
My previous comments were admittedly a bit of a winding brain dump. In short, it's not clear to me if we should consider
file:/foo
as "valid". It's tolerated by theURL
constructor and thus would returntrue
in the browser, so there's an argument in favor of consistency. But as I note in #20435 (comment), the specification is fairly clear that a double-slash should be required to be considered valid.There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Did you come up with this regular expression or find it somewhere? I was trying to decipher it yesterday, and it felt to me like there's some redundancies around
www.
, and user/pass matching. It made it hard for me to figure out how best to change it, but ultimately it felt like the problem is in-part because there's an expectation that one or more characters exist before the pathname can start (which is what the changes here are proposing is an incorrect assumption).Notably, the occurrences of:
[A-Za-z0-9\.\-]+
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've updated the expression that @etoledom used. I think he used the one for Javascript defined here: https://urlregex.com/
I'm tempted to replace it by the one defined for HTML 5:
^(([^:/?#]+):)?(//([^/?#]*))?([^?#]*)(\?([^#]*))?(#(.*))?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've tried this and it's no good, I think the current version of the commit is the best one that matches all the situations.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@aduth ^