
Url Regex Filter

zacharyberson edited this page Jun 17, 2019 · 25 revisions

This page details the URL regex used to filter URLs entered during Content Creation. To build and understand regexes, try the tool at https://www.regextester.com/20.

Note: The following characters are allowed in a URL in addition to alphanumerics, and are referred to below as CHARS. Although they are allowed, some may appear only in escaped form and only in specific contexts or locations; this regex does not enforce those constraints. Future revisions may implement them.

-._~:/?#[]@!$&'()*+,;=
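As a minimal sketch, the CHARS list can be turned into a character class programmatically. The snippet uses Python's `re` module, which is an assumption (this page does not say which engine runs the filter), and the names `CHARS`, `CHAR_CLASS`, and `only_allowed` are illustrative. Python rejects an unescaped `\w-.` sequence inside a class, so the sketch escapes CHARS with `re.escape` before wrapping it in brackets.

```python
import re

# The extra characters this page calls CHARS.
CHARS = "-._~:/?#[]@!$&'()*+,;="

# re.escape backslash-escapes the characters that are special in a pattern
# (including "-", "[" and "]"), so the result is safe inside brackets.
CHAR_CLASS = "[\\w" + re.escape(CHARS) + "]"

# re.ASCII restricts \w to [A-Za-z0-9_].
ALLOWED = re.compile(CHAR_CLASS + "+", re.ASCII)


def only_allowed(text: str) -> bool:
    """Return True if text is non-empty and uses only word characters and CHARS."""
    return ALLOWED.fullmatch(text) is not None
```

For example, `only_allowed("file-name.jpg")` is true, while a string containing a space or `%` is rejected.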


Whole REGEX

/^((http[s]?|ftp):\/\/)(((\w+\.)?\w+\.\w{2,})|(\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3}))(\/[\w-._~:/?#[\]@!$&'()*+,;=]+(\.[\w-._~:/?#[\]@!$&'()*+,;=]+)?)*(\?|\?[\w-._~:/?#[\]@!$&'()*+,;=]+=[\w-._~:/?#[\]@!$&'()*+,;=]*(&[\w-._~:/?#[\]@!$&'()*+,;=]+=[\w-._~:/?#[\]@!$&'()*+,;=]*)*)?(#[\w-._~:/?#[\]@!$&'()*+,;=]*)?\/?$/

or, with CHARS standing in for the characters listed above:

/^((http[s]?|ftp):\/\/)(((\w+\.)?\w+\.\w{2,})|(\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3}))(\/[\wCHARS]+(\.[\wCHARS]+)?)*(\?|\?[\wCHARS]+=[\wCHARS]*(&[\wCHARS]+=[\wCHARS]*)*)?(#[\wCHARS]*)?\/?$/

Example Valid URL

http://www.test.com/dir/file-name.jpg?var1=foo&var2=&var3=this2#foo/
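The sketch below checks the example URL against the shorthand form of the pattern by substituting the CHARS placeholder. It is written for Python's `re` module as an assumption about the runtime; `TEMPLATE`, `URL_RE`, and `is_valid_url` are hypothetical names. The `/` delimiters are dropped because Python has no regex literals, and the dots in the dotted-quad branch are escaped so that they match only a literal `.`.

```python
import re

CHARS = "-._~:/?#[]@!$&'()*+,;="

# Shorthand pattern with CHARS as a placeholder, split by part for readability.
TEMPLATE = (
    r"^((http[s]?|ftp)://)"
    r"(((\w+\.)?\w+\.\w{2,})|(\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3}))"
    r"(/[\wCHARS]+(\.[\wCHARS]+)?)*"
    r"(\?|\?[\wCHARS]+=[\wCHARS]*(&[\wCHARS]+=[\wCHARS]*)*)?"
    r"(#[\wCHARS]*)?"
    r"/?$"
)

# Replace each placeholder with the escaped character list.
URL_RE = re.compile(TEMPLATE.replace("CHARS", re.escape(CHARS)), re.ASCII)


def is_valid_url(url: str) -> bool:
    # fullmatch also rejects a trailing newline, which "$" alone would allow.
    return URL_RE.fullmatch(url) is not None


EXAMPLE = "http://www.test.com/dir/file-name.jpg?var1=foo&var2=&var3=this2#foo/"
```

With these definitions, `is_valid_url(EXAMPLE)` accepts the example URL, while inputs such as `http://test` (no top-level domain) or `mailto:someone@test.com` (unsupported scheme) are rejected.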


Parts that Make Up the REGEX

0.)

Beginning of regex

/^


1.)

http:// OR https:// OR ftp://

(http[s]?|ftp):\/\/
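A small sketch of part 1 in isolation, assuming Python's `re` module (the helper name `scheme_of` is illustrative). The slashes need no escaping outside a `/.../` literal. Because the pattern carries no case-insensitive flag, uppercase schemes are not matched.

```python
import re

# Part 1: the scheme, followed by "://".
SCHEME = re.compile(r"(http[s]?|ftp)://")


def scheme_of(url: str):
    """Return the scheme if url starts with an accepted one, else None."""
    m = SCHEME.match(url)
    return m.group(1) if m else None
```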


2.)

[www.](valid_chars).(valid_chars) OR #.#.#.#

(((\w+\.)?\w+\.\w{2,}) OR (\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3}))

so

(((\w+\.)?\w+\.\w{2,})|(\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3}))
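The host part can be exercised on its own. This is a sketch assuming Python's `re` module, with illustrative names (`NAME`, `DOTTED_QUAD`, `LOOSE_QUAD`, `is_host`). It escapes the dots between the digit groups, since a bare `.` matches any character; the loose variant is included only to show the difference.

```python
import re

# First branch: optional subdomain label, domain label, TLD of 2+ characters.
NAME = r"((\w+\.)?\w+\.\w{2,})"
# Second branch with every dot escaped: only a literal "." separates the groups.
DOTTED_QUAD = r"(\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3})"
# Same branch with bare dots, for comparison only.
LOOSE_QUAD = r"(\d{1,3}.\d{1,3}.\d{1,3}.\d{1,3})"

HOST = re.compile(NAME + "|" + DOTTED_QUAD, re.ASCII)


def is_host(text: str) -> bool:
    return HOST.fullmatch(text) is not None
```

Under these assumptions, `www.test.com`, `test.com`, and `127.0.0.1` pass. So does `999.999.999.999`, because the pattern does not range-check octets. `localhost` (no dot) and `my-site.com` (`\w` excludes `-`) fail. The string `1a2b3c4` matches `LOOSE_QUAD` but not `DOTTED_QUAD`.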


3.)

/sub-dir(.jpg) FOR 0 - many times

\/[\wCHARS]+(\.[\wCHARS]+)? FOR * times

so

(\/[\wCHARS]+(\.[\wCHARS]+)?)*
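A sketch of part 3 alone, assuming Python's `re` module and illustrative names (`CLASS`, `PATH`, `is_path`). Note that CHARS itself contains `/` and `.`, so a single `[\wCHARS]+` run can span several path segments; the repetition and the optional extension group are therefore more permissive than the `/sub-dir(.jpg)` description suggests.

```python
import re

CHARS = "-._~:/?#[]@!$&'()*+,;="
CLASS = "[\\w" + re.escape(CHARS) + "]"

# Part 3: zero or more "/segment" pieces, each with an optional ".ext".
PATH = re.compile("(/" + CLASS + "+(\\." + CLASS + "+)?)*", re.ASCII)


def is_path(text: str) -> bool:
    return PATH.fullmatch(text) is not None
```

Here `is_path("/dir/file-name.jpg")` and `is_path("")` both hold (zero repetitions are allowed), while `is_path("dir")` fails for lack of a leading slash and `is_path("/dir name")` fails on the space. A lone `/` is not matched by this part; the trailing slash is handled by part 6.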


4.)

? OR {

?some-thing=something-else

\?[\wCHARS]+=[\wCHARS]*

AND

&some-thing=something-else FOR 0 - many times

(&[\wCHARS]+=[\wCHARS]*)*

} FOR 0 - 1 time

(\? OR (\?[\wCHARS]+=[\wCHARS]* AND (&[\wCHARS]+=[\wCHARS]*)*))?

so

(\?|\?[\wCHARS]+=[\wCHARS]*(&[\wCHARS]+=[\wCHARS]*)*)?
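The query-string part can be tried in isolation as follows. This is a sketch assuming Python's `re` module; `CLASS`, `QUERY`, and `is_query` are hypothetical names. The pattern accepts a bare `?`, or a `?` followed by one `key=value` pair and any number of `&key=value` pairs, where a value may be empty.

```python
import re

CHARS = "-._~:/?#[]@!$&'()*+,;="
CLASS = "[\\w" + re.escape(CHARS) + "]"

# Part 4: optional "?" alone, or "?key=value" followed by "&key=value" pairs.
QUERY = re.compile(
    "(\\?|\\?" + CLASS + "+=" + CLASS + "*(&" + CLASS + "+=" + CLASS + "*)*)?",
    re.ASCII,
)


def is_query(text: str) -> bool:
    return QUERY.fullmatch(text) is not None
```

For instance, `is_query("?var1=foo&var2=&var3=this2")` and `is_query("?")` hold, whereas `is_query("?novalue")` does not, because a key must be followed by `=`.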


5.)

#pa-rt FOR 0 - 1 time

so

(#[\wCHARS]*)?


6.)

/ FOR 0 - 1 time

so

\/?
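Parts 5 and 6 can be checked together. The sketch assumes Python's `re` module, and `TAIL` and `is_tail` are illustrative names. Since CHARS includes `/`, a trailing slash after a fragment is consumed by the fragment itself; the separate `\/?` only comes into play when no fragment is present or when the fragment match stops short.

```python
import re

CHARS = "-._~:/?#[]@!$&'()*+,;="
CLASS = "[\\w" + re.escape(CHARS) + "]"

# Part 5 (optional "#fragment") followed by part 6 (optional trailing "/").
TAIL = re.compile("(#" + CLASS + "*)?/?", re.ASCII)


def is_tail(text: str) -> bool:
    return TAIL.fullmatch(text) is not None
```

Under these assumptions `#foo`, `#foo/`, `/`, and the empty string are all accepted, while `//` and `#a b` are not.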


End.)

End of regex

$/
