Document encoding detection behavior for unknown encodings. #2733

Merged 1 commit on Feb 13, 2018

Conversation

gilbsgilbs
Contributor

Closes #2732

What do these changes do?

Document that cchardet may sometimes detect encodings that Python does not know. In such cases, the .text() function may raise a LookupError, and get_encoding may return values that are unsafe to pass to bytes.decode() or to .text().
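
For illustration only, here is a minimal sketch of how a caller might check a detected encoding name before using it. It relies only on the standard library's `codecs.lookup`; the helper names `is_known_encoding` and `pick_encoding` and the `utf-8` fallback are assumptions made for this example and are not part of aiohttp.

```python
import codecs


def is_known_encoding(name: str) -> bool:
    """Return True if Python has a codec registered under this name."""
    try:
        codecs.lookup(name)
    except LookupError:
        # Raised when no codec matches the given name.
        return False
    return True


def pick_encoding(detected: str, fallback: str = "utf-8") -> str:
    """Return the detected name if Python knows it, else the fallback.

    Hypothetical helper: the fallback choice is an assumption.
    """
    return detected if is_known_encoding(detected) else fallback
```

A value returned by get_encoding could be passed through `pick_encoding` before being handed to bytes.decode() or .text().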

Are there changes in behavior for the user?

No.

Related issue number

#2732

Checklist

  • NA I think the code is well written
  • NA Unit tests for the changes exist
  • NA Documentation reflects the changes
  • If you provide code modification, please add yourself to CONTRIBUTORS.txt
    • The format is <Name> <Surname>.
    • Please keep alphabetical order, the file is sorted by names.
  • Add a new news fragment into the CHANGES folder
    • name it <issue_id>.<type> for example (588.bugfix)
    • if you don't have an issue_id change it to the pr id after creating the pr
    • ensure type is one of the following:
      • .feature: Signifying a new feature.
      • .bugfix: Signifying a bug fix.
      • .doc: Signifying a documentation improvement.
      • .removal: Signifying a deprecation or removal of public API.
      • .misc: A ticket has been closed, but it is not of interest to users.
    • Make sure to use full sentences with correct case and punctuation, for example: "Fix issue with non-ascii contents in doctest text files."

Sometimes, cchardet might detect encodings that Python doesn't know. In
such cases, the `.text()` function might raise a `LookupError`, and
`get_encoding` may return values that are unsafe to pass to `bytes.decode()`
or to `.text()`.

Closes aio-libs#2732
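
As a rough sketch of the failure mode described in the commit message, the snippet below decodes raw bytes with a possibly unknown encoding name and falls back when Python raises `LookupError`. The function name `decode_body` and the fallback policy (`utf-8` with `errors="replace"`) are assumptions for this example, not aiohttp behavior.

```python
def decode_body(body: bytes, encoding: str, fallback: str = "utf-8") -> str:
    """Decode body, falling back when the codec name is unknown to Python.

    Hypothetical helper for illustration only.
    """
    try:
        return body.decode(encoding)
    except LookupError:
        # bytes.decode() raises LookupError for an unknown codec name.
        # Undecodable bytes are replaced rather than raising (assumed policy).
        return body.decode(fallback, errors="replace")
```

Only `LookupError` is caught here. A `UnicodeDecodeError` from a known but wrong encoding still propagates, since that is a different problem from the one this change documents.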
@codecov-io

codecov-io commented Feb 13, 2018

Codecov Report

Merging #2733 into master will increase coverage by 0.13%.
The diff coverage is n/a.

@@            Coverage Diff             @@
##           master    #2733      +/-   ##
==========================================
+ Coverage   97.85%   97.99%   +0.13%     
==========================================
  Files          39       39              
  Lines        7335     7335              
  Branches     1283     1283              
==========================================
+ Hits         7178     7188      +10     
+ Misses         50       45       -5     
+ Partials      107      102       -5

| Impacted Files | Coverage Δ |
| --- | --- |
| aiohttp/client_reqrep.py | 97.39% <0%> (+0.37%) ⬆️ |
| aiohttp/helpers.py | 97.29% <0%> (+0.49%) ⬆️ |
| aiohttp/connector.py | 96.82% <0%> (+0.74%) ⬆️ |
| aiohttp/payload.py | 98.74% <0%> (+1.25%) ⬆️ |

Legend:
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Last update 185e3f7...b6848c5.

@asvetlov asvetlov merged commit c54f1a8 into aio-libs:master Feb 13, 2018
@asvetlov
Member

Thanks!

@asvetlov asvetlov added this to the 3.1 milestone Feb 13, 2018
@lock

lock bot commented Oct 28, 2019

This thread has been automatically locked since there has not been any recent activity after it was closed. Please open a [new issue] for related bugs.
If you feel there are important points made in this discussion, please include those excerpts in that [new issue].
[new issue]: https://github.com/aio-libs/aiohttp/issues/new

@lock lock bot added the outdated label Oct 28, 2019
@lock lock bot locked as resolved and limited conversation to collaborators Oct 28, 2019
@psf-chronographer psf-chronographer bot added the bot:chronographer:provided There is a change note present in this PR label Oct 28, 2019
Successfully merging this pull request may close these issues.

Encoding detection can lead to LookupError
3 participants