Only last octet is left in non-UTF-8 %-encoded sequence #517

Closed
serhiy-storchaka opened this issue Sep 27, 2020 · 2 comments · Fixed by #532

Comments

@serhiy-storchaka
Contributor

>>> from yarl import URL
>>> URL.build(path='/%f0%9f%90', encoded=True).path
'/%90'
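
For context, the three octets in the path above are a truncated UTF-8 sequence: `F0` opens a four-byte sequence, but only three bytes follow. A minimal sketch using only Python built-ins (not yarl) shows why no strict decoder can turn them into a character; the fourth byte `80` is added here purely for illustration.

```python
# Plain-Python illustration; the variable names are hypothetical.
truncated = bytes.fromhex("f09f90")    # the octets from the report
complete = bytes.fromhex("f09f9080")   # same prefix plus one continuation byte

try:
    truncated.decode("utf-8")
    truncated_is_decodable = True
except UnicodeDecodeError:
    # A strict UTF-8 decoder rejects the incomplete sequence.
    truncated_is_decodable = False

# With the fourth byte present, the sequence decodes to one code point.
decoded_complete = complete.decode("utf-8")
```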
@besfahbod
Contributor

Since URL is expected to be able to generate a Unicode version of every URL data field, without exception, I believe the expectation is that callers deal with non-UTF-8 encodings before URL instantiation.

With that, I think there are two possible expected outcomes here:

  1. URL instantiation should raise an exception, for decoding error.
  2. URL instantiation should ignore the erroneous runs, leaving us with only / as the path here.

We probably can make (2) an option to the parser, and have (1) be the default behavior.

What do you think?

@serhiy-storchaka
Contributor Author

I think that, as a first step and for compatibility with the current code, an incorrect %-encoded sequence should be returned intact: '/%f0%9f%90' (preserving the case of letters), while correct %-sequences should be decoded. I'll try to fix this after fixing other issues in the unquoter.
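
A minimal sketch of that "decode what is valid, keep the rest intact" behavior, written in plain Python. It is not the yarl unquoter: the function names are hypothetical, and unlike a real URL unquoter it decodes every valid escape, including reserved characters such as `%2F`.

```python
import re

# One or more consecutive percent-escapes, e.g. "%e2%82%f8".
_ESCAPES = re.compile(r"(?:%[0-9A-Fa-f]{2})+")


def _decode_run(match: "re.Match[str]") -> str:
    text = match.group(0)
    raw = bytes.fromhex(text.replace("%", ""))
    out = []
    pos = 0
    while pos < len(raw):
        try:
            out.append(raw[pos:].decode("utf-8"))
            break
        except UnicodeDecodeError as err:
            # err.start/err.end are offsets into raw[pos:].
            bad_start = pos + err.start
            bad_end = pos + err.end
            # Everything before the bad bytes is valid UTF-8.
            out.append(raw[pos:bad_start].decode("utf-8"))
            # Each byte is 3 characters ("%XX") in the original text, so
            # slicing the text keeps the escapes intact, letter case included.
            out.append(text[3 * bad_start:3 * bad_end])
            pos = bad_end
    return "".join(out)


def unquote_preserving(s: str) -> str:
    """Decode valid UTF-8 escapes; leave non-decodable ones untouched."""
    return _ESCAPES.sub(_decode_run, s)
```

With this sketch, `unquote_preserving('/%f0%9f%90')` returns the input unchanged, while a complete sequence such as `%E2%82%AC` is decoded to the euro sign.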

But it would be nice to have a way to control the behavior:

  • Raise an error (at instantiation time or at decoding time).
  • Ignore incorrect escaping completely.
  • Replace it with one or more U+FFFD characters.
  • Use "surrogate escapes" as for non-decodable paths.
  • Return it intact.

This is a different and more complex issue.
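
Four of the five behaviors listed above correspond to Python's standard codec error handlers ("strict", "ignore", "replace", "surrogateescape"); returning the sequence intact has no built-in handler. The sketch below uses only the standard library to show what each handler yields for the octets from this issue. It is an illustration of the options, not a proposed yarl interface, and `try_strict` and `results` are hypothetical names.

```python
from urllib.parse import unquote_to_bytes

# Percent-decode to raw bytes first: b'/\xf0\x9f\x90'
raw = unquote_to_bytes("/%f0%9f%90")


def try_strict(data: bytes):
    """Return the decoded text, or None if strict decoding raises."""
    try:
        return data.decode("utf-8", "strict")
    except UnicodeDecodeError:
        return None


results = {
    "raise": try_strict(raw),                          # None: decoding fails
    "ignore": raw.decode("utf-8", "ignore"),           # bad bytes dropped: '/'
    "replace": raw.decode("utf-8", "replace"),         # U+FFFD substituted
    "surrogateescape": raw.decode("utf-8", "surrogateescape"),
}

# Surrogate escapes are lossless: the original bytes can be recovered.
round_trip = results["surrogateescape"].encode("utf-8", "surrogateescape")
```

Of these, only the surrogate-escape form and the intact form preserve enough information to reconstruct the original octets.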

aio-libs-github-bot bot pushed a commit to aio-libs/aiohttp that referenced this issue Nov 16, 2020
Bumps [yarl](https://github.com/aio-libs/yarl) from 1.6.2 to 1.6.3.
<details>
<summary>Release notes</summary>
<p><em>Sourced from <a href="https://github.com/aio-libs/yarl/releases">yarl's releases</a>.</em></p>
<blockquote>
<h2>yarl 1.6.3 release</h2>
<h2>Bugfixes</h2>
<ul>
<li>No longer lose characters when decoding incorrect percent-sequences (like <code>%e2%82%f8</code>). All non-decodable percent-sequences are now preserved.
<code>[#517](aio-libs/yarl#517) &lt;https://github.com/aio-libs/yarl/issues/517&gt;</code>_</li>
<li>Provide x86 Windows wheels.
<code>[#535](aio-libs/yarl#535) &lt;https://github.com/aio-libs/yarl/issues/535&gt;</code>_</li>
</ul>
</blockquote>
</details>
<details>
<summary>Changelog</summary>
<p><em>Sourced from <a href="https://github.com/aio-libs/yarl/blob/master/CHANGES.rst">yarl's changelog</a>.</em></p>
<blockquote>
<h1>1.6.3 (2020-11-14)</h1>
<h2>Bugfixes</h2>
<ul>
<li>No longer lose characters when decoding incorrect percent-sequences (like <code>%e2%82%f8</code>). All non-decodable percent-sequences are now preserved.
<code>[#517](aio-libs/yarl#517) &lt;https://github.com/aio-libs/yarl/issues/517&gt;</code>_</li>
<li>Provide x86 Windows wheels.
<code>[#535](aio-libs/yarl#535) &lt;https://github.com/aio-libs/yarl/issues/535&gt;</code>_</li>
</ul>
<hr />
</blockquote>
</details>
<details>
<summary>Commits</summary>
<ul>
<li><a href="https://github.com/aio-libs/yarl/commit/7fc35c68f23c2fe43069c9f5696f952b8ec485e8"><code>7fc35c6</code></a> Bump to 1.6.3</li>
<li><a href="https://github.com/aio-libs/yarl/commit/68257bb63488bd1309acad57e34ac7f3f7682bd2"><code>68257bb</code></a> Fix x86 wheels building (<a href="https://github-redirect.dependabot.com/aio-libs/yarl/issues/546">#546</a>)</li>
<li><a href="https://github.com/aio-libs/yarl/commit/58ee718bd41df64928d265ced5e4f5107d09c529"><code>58ee718</code></a> Bump sphinx from 3.3.0 to 3.3.1 (<a href="https://github-redirect.dependabot.com/aio-libs/yarl/issues/545">#545</a>)</li>
<li><a href="https://github.com/aio-libs/yarl/commit/ff66061cc4c55e98f2fbcdc007d513dc86032da2"><code>ff66061</code></a> Bump sphinxcontrib-spelling from 7.0.1 to 7.1.0 (<a href="https://github-redirect.dependabot.com/aio-libs/yarl/issues/544">#544</a>)</li>
<li><a href="https://github.com/aio-libs/yarl/commit/da13791327aec15877d56079aa893f7e7a58f48a"><code>da13791</code></a> Fix benchmark (<a href="https://github-redirect.dependabot.com/aio-libs/yarl/issues/533">#533</a>)</li>
<li><a href="https://github.com/aio-libs/yarl/commit/1ce7c8467bf9e1b389e333f29f62dee570942720"><code>1ce7c84</code></a> Preserve non-decodable %-sequences intact when unquote. (<a href="https://github-redirect.dependabot.com/aio-libs/yarl/issues/532">#532</a>)</li>
<li><a href="https://github.com/aio-libs/yarl/commit/ea8c41d06a8dba6c3e8fc7e82a6e8f8ff2b0196a"><code>ea8c41d</code></a> Bump sphinxcontrib-spelling from 7.0.0 to 7.0.1 (<a href="https://github-redirect.dependabot.com/aio-libs/yarl/issues/542">#542</a>)</li>
<li><a href="https://github.com/aio-libs/yarl/commit/8e737f744230e0b5f55bcb3afd60239bf83a7ccb"><code>8e737f7</code></a> Bump sphinx from 3.2.1 to 3.3.0 (<a href="https://github-redirect.dependabot.com/aio-libs/yarl/issues/543">#543</a>)</li>
<li><a href="https://github.com/aio-libs/yarl/commit/59e89f47b87825dd7b6f1ce621ab962b1f065371"><code>59e89f4</code></a> Bump pytest from 6.1.1 to 6.1.2 (<a href="https://github-redirect.dependabot.com/aio-libs/yarl/issues/541">#541</a>)</li>
<li><a href="https://github.com/aio-libs/yarl/commit/083ce28572db40bfd93ebb7393e80c996bc1d3a1"><code>083ce28</code></a> Bump sphinxcontrib-spelling from 6.0.0 to 7.0.0 (<a href="https://github-redirect.dependabot.com/aio-libs/yarl/issues/539">#539</a>)</li>
<li>Additional commits viewable in <a href="https://github.com/aio-libs/yarl/compare/v1.6.2...v1.6.3">compare view</a></li>
</ul>
</details>
<br />

