Misleading error message: The URL 'xxx' is missing 1 slash(es) '/' after the protocol #1034

Closed
nekennedy opened this issue Apr 4, 2019 · 9 comments · Fixed by #1131
Labels: status: has PR, type: improvement

@nekennedy

This error (RSC-023) is being generated for a variety of invalid URLs. They are all incorrect, but the problem has nothing to do with the number of slashes. I've provided some examples below, each with the text of the epubcheck error followed by a snippet of the HTML that triggered it:

The URL 'http://Medscape.com,November' is missing 1 slash(es) '/' after the protocol 'http:'
<a href="http://Medscape.com,November">Medscape.com,November</a>
The URL 'http://worldmarket.com)painted' is missing 1 slash(es) '/' after the protocol 'http:'
(from <a href="http://worldmarket.com)painted">worldmarket.com</a>)painted in
The URL 'http://HGTV.com’s' is missing 1 slash(es) '/' after the protocol 'http:'
<a href="http://HGTV.com&#8217;s">HGTV.com&#8217;s</a>
The URL 'http://cronometer.com),I' is missing 1 slash(es) '/' after the protocol 'http:'
(using <a href="http://cronometer.com),I">cronometer.com),</a>
The URL 'http://www​.pbs.org/wgbh/amex/till/sfeature/sf_remember.html' is missing 1 slash(es) '/' after the protocol 'http:'
<a class="hlink" href="http://www&#x200b;.pbs.org/wgbh/amex/till/sfeature/sf_remember.html">http://www&#x200b;.pbs.org/wgbh/amex/till/sfeature/sf_remember.html</a>
The URL 'http://Protectmarriage.com–Yes' is missing 1 slash(es) '/' after the protocol 'http:'
<a href="http://Protectmarriage.com&#x2013;Yes">Protectmarriage.com&#x2013;Yes</a>
The URL 'http://​www.​youtube.​com​/watch​?v​=Wriy3ICfF9U​&​feature​=player_embedded​' is missing 1 slash(es) '/' after the protocol 'http:'
<a href="http://&#x200b;www.&#x200b;youtube.&#x200b;com&#x200b;/watch&#x200b;?v&#x200b;=Wriy3ICfF9U&#x200b;&amp;&#x200b;feature&#x200b;=player_embedded&#x200b;">http://&#x200b;www.&#x200b;youtube.&#x200b;com&#x200b;/watch&#x200b;?v&#x200b;=Wriy3ICfF9U&#x200b;&amp;&#x200b;feature&#x200b;=playe&#x200b;r_embe&#x200b;dded</a>

There are a lot of random URLs in our backlist files, let me know if you need more examples!

@nekennedy
Author

nekennedy commented Apr 4, 2019

Reading through these again, "nothing to do with the number of slashes" was an exaggeration. I assumed that it was flagging the &#x2013;, but it seems like it's not. Feel free to close this as invalid.

@teytag

teytag commented Apr 4, 2019 via email

@mattgarrish
Member

It's definitely a bug, though, as the URIs are invalid. It's just not a problem with the number of slashes, but that must be a result of not being able to parse the URI correctly.

@tofi86
Collaborator

tofi86 commented Apr 4, 2019

Issue #708 is why I introduced this check with PR #731 2 years ago. Sorry if I missed something here...

@rdeltour
Member

rdeltour commented Apr 5, 2019

Yeah the URIs are invalid indeed, but the message is definitely confusing. Sorry I didn't catch that when reviewing #731.

In that PR, I had originally proposed URL 'XXX' has no host component, but it's too technical and not much clearer.

What about a more generic: Couldn't parse URL 'XXX'?

@tofi86 do you think it is too late a change for the translations?

@rdeltour
Member

rdeltour commented Apr 5, 2019

> What about a more generic: Couldn't parse URL 'XXX'?

or Couldn't parse host of URL 'XXX' if we want to be just a tad more specific…

@tofi86
Collaborator

tofi86 commented Apr 7, 2019

> @tofi86 do you think it is too late a change for the translations?

No problem, we can find a better message for sure.

However, I'm not sure why Java isn't correctly identifying the http:// as the getSchemeSpecificPart()?
https://github.com/w3c/epubcheck/pull/731/files#diff-384d5f886223b6b6b46d56f83cf3445dR238
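
One possible explanation, offered as a sketch of `java.net.URI` behavior rather than a description of the epubcheck code: when the authority of a URL is not a valid server name, `java.net.URI` does not throw. It keeps the text as a registry-based authority and leaves the host unset, so `getHost()` returns `null` even though `getSchemeSpecificPart()` still starts with `//`. The class and method names below (`UriHostDemo`, `hostOf`, `schemeSpecificPartOf`) are made up for illustration.

```java
import java.net.URI;
import java.net.URISyntaxException;

public class UriHostDemo {
    // Hypothetical helper: the host that java.net.URI extracts,
    // or null when the authority is not a valid server name.
    static String hostOf(String url) throws URISyntaxException {
        return new URI(url).getHost();
    }

    // Hypothetical helper: everything after "scheme:" (fragment excluded).
    static String schemeSpecificPartOf(String url) throws URISyntaxException {
        return new URI(url).getSchemeSpecificPart();
    }

    public static void main(String[] args) throws URISyntaxException {
        String[] samples = {
            "http://www.111now.com)",        // from the report in this thread
            "http://Medscape.com,November",  // from the report in this thread
            "http://example.com/"            // well-formed URL for comparison
        };
        for (String sample : samples) {
            URI uri = new URI(sample);
            System.out.println(sample
                + " | scheme=" + uri.getScheme()
                + " | ssp=" + uri.getSchemeSpecificPart()
                + " | authority=" + uri.getAuthority()
                + " | host=" + uri.getHost());
        }
    }
}
```

If a check treats "scheme present but host is null" as "slashes missing", these URLs would trigger it even though both slashes are there.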

@tofi86
Collaborator

tofi86 commented Aug 14, 2019

New report from a pagina EPUB-Checker user:

WARNING (RSC-023) at "Mediation ebook.epub/OEBPS/Text/Meditation_ackn.xhtml" (line 61, col 68):
The URL 'http://www.111now.com)' is missing 1 slash(es) '/' after the protocol 'http:'

It seems that URLs ending with a ) are invalid URLs, but the parser reports them as a different problem.
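
For reference, a minimal sketch of how such a URL can be made to fail explicitly in plain Java, assuming the URL is handled with `java.net.URI`: `URI.parseServerAuthority()` re-parses the authority strictly and throws `URISyntaxException` when it is not a valid server name. The class and method names (`StrictAuthorityDemo`, `authorityProblem`) are hypothetical and not part of epubcheck.

```java
import java.net.URI;
import java.net.URISyntaxException;

public class StrictAuthorityDemo {
    // Hypothetical helper: returns null when the URL parses and has a valid
    // server-based authority (or no authority at all), otherwise the
    // parser's own description of the problem.
    static String authorityProblem(String url) {
        try {
            new URI(url).parseServerAuthority();
            return null;
        } catch (URISyntaxException e) {
            return e.getMessage();
        }
    }

    public static void main(String[] args) {
        String[] samples = {"http://www.111now.com)", "http://example.com/"};
        for (String sample : samples) {
            String problem = authorityProblem(sample);
            System.out.println(sample + " -> " + (problem == null ? "OK" : problem));
        }
    }
}
```

A generic message such as the proposed "Couldn't parse URL 'XXX'" could be reported whenever the helper returns a non-null value.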

@rdeltour rdeltour self-assigned this Sep 13, 2019
@rdeltour rdeltour added status: accepted Ready to be further processed type: improvement The issue suggests an improvement of an existing feature labels Sep 13, 2019
@rdeltour rdeltour added this to the 4.3.3 milestone Sep 13, 2019
@rdeltour rdeltour added status: has PR The issue is being processed in a pull request and removed status: accepted Ready to be further processed labels Apr 27, 2020
@nekennedy
Author

Thanks 😀
