gh-91810: ElementTree: Use text file's encoding by default in XML declaration #91903
…ML declaration ElementTree method write() and function tostring() now use the text file's encoding ("UTF-8" if not available) instead of locale encoding in XML declaration when encoding="unicode" is specified.
Thanks @serhiy-storchaka for the PR 🌮🎉. I'm working now to backport this PR to: 3.9, 3.10, 3.11.
GH-92663 is a backport of this pull request to the 3.11 branch.
GH-92664 is a backport of this pull request to the 3.10 branch.
GH-92665 is a backport of this pull request to the 3.9 branch.
…ML declaration (pythonGH-91903) ElementTree method write() and function tostring() now use the text file's encoding ("UTF-8" if not available) instead of locale encoding in XML declaration when encoding="unicode" is specified. (cherry picked from commit 707839b) Co-authored-by: Serhiy Storchaka <[email protected]>
…XML declaration (GH-91903) (GH-92663) ElementTree method write() and function tostring() now use the text file's encoding ("UTF-8" if not available) instead of locale encoding in XML declaration when encoding="unicode" is specified. (cherry picked from commit 707839b) Co-authored-by: Serhiy Storchaka <[email protected]> Automerge-Triggered-By: GH:serhiy-storchaka
…XML declaration (GH-91903) (GH-92664) ElementTree method write() and function tostring() now use the text file's encoding ("UTF-8" if not available) instead of locale encoding in XML declaration when encoding="unicode" is specified. (cherry picked from commit 707839b) Co-authored-by: Serhiy Storchaka <[email protected]> Automerge-Triggered-By: GH:serhiy-storchaka
…ML declaration (GH-91903) (GH-92665) ElementTree method write() and function tostring() now use the text file's encoding ("UTF-8" if not available) instead of locale encoding in XML declaration when encoding="unicode" is specified. (cherry picked from commit 707839b) Co-authored-by: Serhiy Storchaka <[email protected]> Automerge-Triggered-By: GH:serhiy-storchaka
FYI, @pablogsal @ambv: this change broke existing behaviour in a subtle way. I'm not sure if it's worth rolling back in 3.9, especially considering it went into the final regular bugfix release. It may be worth fixing the regression in 3.10 or 3.11, or it may be deemed to be a documentation issue.
The documentation of ElementTree.write() says:
Before this change, passing a file opened in text mode regardless of the encoding used and passing
After this change, passing a file opened in text mode with UTF-8 or ascii as encoding, and passing
... will produce invalid XML because of this change:
... where before this change it would not emit the second XML declaration. (This is arguably problematic code, but unfortunately the file's encoding can of course be implicit, which makes it harder to tell what's going on. This was synthesized from real code found while upgrading to 3.9.13 at Google.)
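The snippets from the comment above did not survive on this page. As a stand-in, here is a hypothetical sketch of the kind of caller being described: it writes its own XML declaration to a text file with a non-UTF-8 encoding, then passes that file to write() with encoding="unicode". The helper name `write_with_manual_declaration` and the choice of latin-1 are assumptions made for illustration, not the code from the report.

```python
import io
import xml.etree.ElementTree as ET


def write_with_manual_declaration(encoding="latin-1"):
    """Hypothetical caller: emits its own XML declaration, then the tree."""
    raw = io.BytesIO()
    # A text-mode file whose encoding is neither UTF-8 nor ASCII.
    out = io.TextIOWrapper(raw, encoding=encoding, newline="")
    out.write(f"<?xml version='1.0' encoding='{encoding}'?>\n")
    tree = ET.ElementTree(ET.Element("root"))
    # xml_declaration is left at its default (None), as in the scenario above.
    tree.write(out, encoding="unicode")
    out.flush()
    return raw.getvalue().decode(encoding)


document = write_with_manual_declaration()
# Count how many XML declarations ended up in the output.
declaration_count = document.count("<?xml")
print(declaration_count)
```

On an interpreter affected by the regression, the count would be expected to be 2, which is not well-formed XML. On releases that include the follow-up fix mentioned later in this thread, it should be 1.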
I am a bit worried that for 3.10 people have already developed workarounds for this and we are going to break them again. On the other hand, one could argue that the situation is too specific and that's why nobody has raised issues so far. I am ok rolling it back, though, if that's the consensus.
Actually, this change didn't make it into 3.10.4, so it's only in 3.9.13 and 3.11 at this point.
#93426 fixes this.
…t in XML declaration (pythonGH-91903) (pythonGH-92665) ElementTree method write() and function tostring() now use the text file's encoding ("UTF-8" if not available) instead of locale encoding in XML declaration when encoding="unicode" is specified. (cherry picked from commit 707839b) Co-authored-by: Serhiy Storchaka <[email protected]> Automerge-Triggered-By: GH:serhiy-storchaka
I think we should fix it in 3.9.14 as well, since it's a regression only in 3.9.13.
(In the meantime we released 3.10.5 with the bug, so that version is affected too.)
ElementTree method write() and function tostring() now use the text file's
encoding ("UTF-8" if not available) instead of locale encoding in XML
declaration when encoding="unicode" is specified.
Closes #91810.