
Unescapes html in PageParser.href_match_to_url #191

Merged 1 commit into pex-tool:master on Jan 7, 2016

Conversation

daveFNbuck
Contributor

PageParser breaks if the links contain any escaped html characters. This fixes that bug.
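
For context, a minimal sketch of the idea behind the change, not the PR's actual diff: the function name follows the PR title, and treating the href match as a plain string is an assumption. It uses xml.sax.saxutils.unescape, which the discussion below refers to.

```python
# Sketch only: decode HTML entities in a matched href before treating it
# as a URL. xml.sax.saxutils.unescape handles &amp;, &lt; and &gt; by default.
from xml.sax.saxutils import unescape


def href_match_to_url(href):
    # An href scraped from an index page may be HTML-escaped, e.g.
    # 'simple/pkg/?a=1&amp;b=2'; unescape it so the query string survives.
    return unescape(href)


print(href_match_to_url('simple/pkg/?a=1&amp;b=2'))  # -> simple/pkg/?a=1&b=2
```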

@kwlzn
Contributor

kwlzn commented Jan 1, 2016

would you mind also adding a quick test for this?

one mild concern to call out: this introduces a dependency on the xml.* class of stdlib modules which, depending on the environment python was compiled in (e.g. lacking supporting xml devel packages), may or may not be present in the stdlib on some systems/interpreters. I suppose that's sort of outside of our scope here tho - and I don't see a great portable alternative to xml.sax.saxutils.unescape short of writing our own.

@daveFNbuck
Contributor Author

Sure, I can add a quick test. https://wiki.python.org/moin/EscapingHtml suggests a short function we can use if the xml lib not existing is a problem for some people. All I really need is the s.replace('&amp;', '&') line, as the issue I'm having is with parameters being passed in the URL on my custom PyPI server.
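
For reference, the dependency-free fallback suggested on that wiki page is roughly the following; only the last replacement matters for the query-string case described above.

```python
# Sketch of the stdlib-free fallback from the wiki page: plain string
# replacements instead of xml.sax.saxutils.unescape.
def unescape(s):
    s = s.replace("&lt;", "<")
    s = s.replace("&gt;", ">")
    # This one must come last so already-decoded text isn't unescaped twice.
    s = s.replace("&amp;", "&")
    return s
```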


Commit: PageParser breaks if the links contain any escaped characters. This fixes that bug.
@daveFNbuck
Contributor Author

ping

@kwlzn
Contributor

kwlzn commented Jan 7, 2016

lgtm! :shipit:

kwlzn added a commit that referenced this pull request on Jan 7, 2016: Unescape html in PageParser.href_match_to_url.
kwlzn merged commit 033707c into pex-tool:master on Jan 7, 2016
@kwlzn
Contributor

kwlzn commented Jan 7, 2016

thanks @daveFNbuck - this should be going out in the 1.1.2 release later today/tomorrow.

@daveFNbuck
Contributor Author

Awesome, thanks!

