gh-98401: Invalid escape sequences emits SyntaxWarning (#99011)
A backslash-character pair that is not a valid escape sequence now
generates a SyntaxWarning instead of a DeprecationWarning. For
example, re.compile("\d+\.\d+") now emits a SyntaxWarning ("\d" is an
invalid escape sequence). Use raw strings for regular expressions:
re.compile(r"\d+\.\d+"). In a future Python version, a SyntaxError will
eventually be raised instead of a SyntaxWarning.
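
As a minimal sketch of how the warning surfaces, the snippet below compiles two small source strings and records what the compiler emits. The `compile_and_collect` helper and the `<demo>` filename are illustrative names, not part of this commit. The category should be SyntaxWarning on Python 3.12 or newer and DeprecationWarning on older versions.

```python
import warnings


def compile_and_collect(source):
    """Compile `source`; return the categories of the warnings emitted (hypothetical helper)."""
    with warnings.catch_warnings(record=True) as caught:
        warnings.simplefilter("always")  # record warnings that are ignored by default
        compile(source, "<demo>", "eval")
    return [w.category for w in caught]


# Non-raw literal: "\d" is not a valid escape sequence, so the compiler warns.
bad = compile_and_collect(r're.compile("\d+\.\d+")')
# Raw literal: the backslashes are kept as written and nothing is reported.
good = compile_and_collect(r're.compile(r"\d+\.\d+")')

print(bad, good)
```

Nothing is executed here: the warning comes from compiling the literal, so `re` does not even need to be imported.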

Octal escapes with a value larger than 0o377 (e.g. "\477"), deprecated
in Python 3.11, now produce a SyntaxWarning instead of a
DeprecationWarning. In a future Python version they will eventually
become a SyntaxError.
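
The octal case can be observed the same way. This is only a sketch: on interpreters older than 3.11 no warning is expected at all, and the variable names are illustrative.

```python
import warnings

with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    # "\477" is an octal escape above 0o377; in a str literal it still
    # evaluates to the single code point 0o477.
    value = eval(r'"\477"')

categories = [w.category for w in caught]
print(value == chr(0o477), categories)
```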

codecs.escape_decode() and codecs.unicode_escape_decode() are left
unchanged: they still emit DeprecationWarning.
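
For contrast, a rough sketch of the run-time decoding path, which never goes through the parser. It calls `codecs.unicode_escape_decode()` on the two bytes backslash and `d`; according to the commit message the category recorded here stays DeprecationWarning.

```python
import codecs
import warnings

with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    # Decode backslash + "d" at run time; the invalid escape is kept as-is.
    decoded, consumed = codecs.unicode_escape_decode(rb"\d")

categories = [w.category for w in caught]
print(repr(decoded), consumed, categories)
```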

* The parser emits SyntaxWarning only when the feature version is
  3.12 or newer, and still emits DeprecationWarning for older
  feature versions.
* Fix SyntaxWarning by using raw strings in Tools/c-analyzer/ and
  wasm_build.py.
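
One way to probe the feature-version rule from Python is `ast.parse()` with its `feature_version` argument. This sketch assumes that the argument is what ends up in the parser's `feature_version` field, which the commit itself does not state, so treat the result as something to check rather than a guarantee. `category_for` is a hypothetical helper.

```python
import ast
import warnings


def category_for(feature_version=None):
    """Parse a literal with an invalid escape; return the warning categories seen."""
    with warnings.catch_warnings(record=True) as caught:
        warnings.simplefilter("always")
        if feature_version is None:
            ast.parse(r'"\d"')
        else:
            ast.parse(r'"\d"', feature_version=feature_version)
    return [w.category for w in caught]


default = category_for()        # uses the running interpreter's version
older = category_for((3, 11))   # asks the parser to behave like 3.11
print(default, older)
```

If the assumption holds, a 3.12 or newer interpreter should report SyntaxWarning for `default` and DeprecationWarning for `older`.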
vstinner authored Nov 3, 2022
1 parent 916af11 commit a60ddd3
Showing 11 changed files with 69 additions and 29 deletions.
2 changes: 1 addition & 1 deletion Doc/library/re.rst
@@ -29,7 +29,7 @@ a literal backslash, one might have to write ``'\\\\'`` as the pattern
string, because the regular expression must be ``\\``, and each
backslash must be expressed as ``\\`` inside a regular Python string
literal. Also, please note that any invalid escape sequences in Python's
-usage of the backslash in string literals now generate a :exc:`DeprecationWarning`
+usage of the backslash in string literals now generate a :exc:`SyntaxWarning`
and in the future this will become a :exc:`SyntaxError`. This behaviour
will happen even if it is a valid escape sequence for a regular expression.
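
The backslash arithmetic described in this paragraph can be checked directly. A small sketch, with illustrative variable names:

```python
import re

# The raw literal and the doubled-backslash literal are the same string.
same = r"\d+\.\d+" == "\\d+\\.\\d+"

# Matching one literal backslash: the regular expression is two backslashes,
# written as four in a regular literal or as two in a raw literal.
pattern_plain = re.compile("\\\\")
pattern_raw = re.compile(r"\\")
subject = "a\\b"  # the three characters a, backslash, b

print(same, pattern_plain.search(subject).span(), pattern_raw.search(subject).span())
```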

18 changes: 12 additions & 6 deletions Doc/reference/lexical_analysis.rst
@@ -612,9 +612,13 @@ Notes:
As in Standard C, up to three octal digits are accepted.

.. versionchanged:: 3.11
-Octal escapes with value larger than ``0o377`` produce a :exc:`DeprecationWarning`.
-In a future Python version they will be a :exc:`SyntaxWarning` and
-eventually a :exc:`SyntaxError`.
+Octal escapes with value larger than ``0o377`` produce a
+:exc:`DeprecationWarning`.
+
+.. versionchanged:: 3.12
+Octal escapes with value larger than ``0o377`` produce a
+:exc:`SyntaxWarning`. In a future Python version they will be eventually
+a :exc:`SyntaxError`.

(3)
Unlike in Standard C, exactly two hex digits are required.
@@ -646,9 +650,11 @@ escape sequences only recognized in string literals fall into the category of
unrecognized escapes for bytes literals.

.. versionchanged:: 3.6
-Unrecognized escape sequences produce a :exc:`DeprecationWarning`. In
-a future Python version they will be a :exc:`SyntaxWarning` and
-eventually a :exc:`SyntaxError`.
+Unrecognized escape sequences produce a :exc:`DeprecationWarning`.
+
+.. versionchanged:: 3.12
+Unrecognized escape sequences produce a :exc:`SyntaxWarning`. In a future
+Python version they will be eventually a :exc:`SyntaxError`.

Even in a raw literal, quotes can be escaped with a backslash, but the
backslash remains in the result; for example, ``r"\""`` is a valid string
16 changes: 16 additions & 0 deletions Doc/whatsnew/3.12.rst
@@ -121,6 +121,22 @@ Other Language Changes
chance to execute the GC periodically. (Contributed by Pablo Galindo in
:gh:`97922`.)

+* A backslash-character pair that is not a valid escape sequence now generates
+a :exc:`SyntaxWarning`, instead of :exc:`DeprecationWarning`.
+For example, ``re.compile("\d+\.\d+")`` now emits a :exc:`SyntaxWarning`
+(``"\d"`` is an invalid escape sequence), use raw strings for regular
+expression: ``re.compile(r"\d+\.\d+")``.
+In a future Python version, :exc:`SyntaxError` will eventually be raised,
+instead of :exc:`SyntaxWarning`.
+(Contributed by Victor Stinner in :gh:`98401`.)
+
+* Octal escapes with value larger than ``0o377`` (ex: ``"\477"``), deprecated
+in Python 3.11, now produce a :exc:`SyntaxWarning`, instead of
+:exc:`DeprecationWarning`.
+In a future Python version they will be eventually a :exc:`SyntaxError`.
+(Contributed by Victor Stinner in :gh:`98401`.)
+
+
New Modules
===========

10 changes: 5 additions & 5 deletions Lib/test/test_codeop.py
@@ -310,8 +310,8 @@ def test_filename(self):
def test_warning(self):
# Test that the warning is only returned once.
with warnings_helper.check_warnings(
-(".*literal", SyntaxWarning),
-(".*invalid", DeprecationWarning),
+('"is" with a literal', SyntaxWarning),
+("invalid escape sequence", SyntaxWarning),
) as w:
compile_command(r"'\e' is 0")
self.assertEqual(len(w.warnings), 2)
@@ -321,9 +321,9 @@ def test_warning(self):
warnings.simplefilter('error', SyntaxWarning)
compile_command('1 is 1', symbol='exec')

-# Check DeprecationWarning treated as an SyntaxError
+# Check SyntaxWarning treated as an SyntaxError
with warnings.catch_warnings(), self.assertRaises(SyntaxError):
-warnings.simplefilter('error', DeprecationWarning)
+warnings.simplefilter('error', SyntaxWarning)
compile_command(r"'\e'", symbol='exec')

def test_incomplete_warning(self):
@@ -337,7 +337,7 @@ def test_invalid_warning(self):
warnings.simplefilter('always')
self.assertInvalid("'\\e' 1")
self.assertEqual(len(w), 1)
-self.assertEqual(w[0].category, DeprecationWarning)
+self.assertEqual(w[0].category, SyntaxWarning)
self.assertRegex(str(w[0].message), 'invalid escape sequence')
self.assertEqual(w[0].filename, '<input>')
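
The tests above rely on the parser turning a warning that the filter promotes to an error into a SyntaxError. A minimal stand-alone sketch of that behaviour follows. `compiles_cleanly` is a hypothetical helper, and the filter promotes every category so that the sketch does not depend on which one the running interpreter uses.

```python
import warnings


def compiles_cleanly(source):
    """Return True if `source` compiles with all warnings promoted to errors."""
    with warnings.catch_warnings():
        warnings.simplefilter("error")  # promote warnings of any category
        try:
            compile(source, "<demo>", "exec")
        except SyntaxError:
            return False
    return True


print(compiles_cleanly(r"'\e'"), compiles_cleanly(r"r'\e'"))
```

A check like this could serve as a pre-commit gate that rejects files with invalid escapes before they ever emit a warning at import time.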

2 changes: 1 addition & 1 deletion Lib/test/test_fstring.py
@@ -776,7 +776,7 @@ def test_backslashes_in_string_part(self):
self.assertEqual(f'2\x203', '2 3')
self.assertEqual(f'\x203', ' 3')

-with self.assertWarns(DeprecationWarning): # invalid escape sequence
+with self.assertWarns(SyntaxWarning): # invalid escape sequence
value = eval(r"f'\{6*7}'")
self.assertEqual(value, '\\42')
self.assertEqual(f'\\{6*7}', '\\42')
24 changes: 12 additions & 12 deletions Lib/test/test_string_literals.py
@@ -109,19 +109,19 @@ def test_eval_str_invalid_escape(self):
for b in range(1, 128):
if b in b"""\n\r"'01234567NU\\abfnrtuvx""":
continue
-with self.assertWarns(DeprecationWarning):
+with self.assertWarns(SyntaxWarning):
self.assertEqual(eval(r"'\%c'" % b), '\\' + chr(b))

with warnings.catch_warnings(record=True) as w:
-warnings.simplefilter('always', category=DeprecationWarning)
+warnings.simplefilter('always', category=SyntaxWarning)
eval("'''\n\\z'''")
self.assertEqual(len(w), 1)
self.assertEqual(str(w[0].message), r"invalid escape sequence '\z'")
self.assertEqual(w[0].filename, '<string>')
self.assertEqual(w[0].lineno, 1)

with warnings.catch_warnings(record=True) as w:
-warnings.simplefilter('error', category=DeprecationWarning)
+warnings.simplefilter('error', category=SyntaxWarning)
with self.assertRaises(SyntaxError) as cm:
eval("'''\n\\z'''")
exc = cm.exception
@@ -133,11 +133,11 @@ def test_eval_str_invalid_escape(self):

def test_eval_str_invalid_octal_escape(self):
for i in range(0o400, 0o1000):
-with self.assertWarns(DeprecationWarning):
+with self.assertWarns(SyntaxWarning):
self.assertEqual(eval(r"'\%o'" % i), chr(i))

with warnings.catch_warnings(record=True) as w:
-warnings.simplefilter('always', category=DeprecationWarning)
+warnings.simplefilter('always', category=SyntaxWarning)
eval("'''\n\\407'''")
self.assertEqual(len(w), 1)
self.assertEqual(str(w[0].message),
@@ -146,7 +146,7 @@ def test_eval_str_invalid_octal_escape(self):
self.assertEqual(w[0].lineno, 1)

with warnings.catch_warnings(record=True) as w:
-warnings.simplefilter('error', category=DeprecationWarning)
+warnings.simplefilter('error', category=SyntaxWarning)
with self.assertRaises(SyntaxError) as cm:
eval("'''\n\\407'''")
exc = cm.exception
@@ -186,19 +186,19 @@ def test_eval_bytes_invalid_escape(self):
for b in range(1, 128):
if b in b"""\n\r"'01234567\\abfnrtvx""":
continue
-with self.assertWarns(DeprecationWarning):
+with self.assertWarns(SyntaxWarning):
self.assertEqual(eval(r"b'\%c'" % b), b'\\' + bytes([b]))

with warnings.catch_warnings(record=True) as w:
-warnings.simplefilter('always', category=DeprecationWarning)
+warnings.simplefilter('always', category=SyntaxWarning)
eval("b'''\n\\z'''")
self.assertEqual(len(w), 1)
self.assertEqual(str(w[0].message), r"invalid escape sequence '\z'")
self.assertEqual(w[0].filename, '<string>')
self.assertEqual(w[0].lineno, 1)

with warnings.catch_warnings(record=True) as w:
-warnings.simplefilter('error', category=DeprecationWarning)
+warnings.simplefilter('error', category=SyntaxWarning)
with self.assertRaises(SyntaxError) as cm:
eval("b'''\n\\z'''")
exc = cm.exception
@@ -209,11 +209,11 @@ def test_eval_bytes_invalid_escape(self):

def test_eval_bytes_invalid_octal_escape(self):
for i in range(0o400, 0o1000):
-with self.assertWarns(DeprecationWarning):
+with self.assertWarns(SyntaxWarning):
self.assertEqual(eval(r"b'\%o'" % i), bytes([i & 0o377]))

with warnings.catch_warnings(record=True) as w:
-warnings.simplefilter('always', category=DeprecationWarning)
+warnings.simplefilter('always', category=SyntaxWarning)
eval("b'''\n\\407'''")
self.assertEqual(len(w), 1)
self.assertEqual(str(w[0].message),
@@ -222,7 +222,7 @@ def test_eval_bytes_invalid_octal_escape(self):
self.assertEqual(w[0].lineno, 1)

with warnings.catch_warnings(record=True) as w:
-warnings.simplefilter('error', category=DeprecationWarning)
+warnings.simplefilter('error', category=SyntaxWarning)
with self.assertRaises(SyntaxError) as cm:
eval("b'''\n\\407'''")
exc = cm.exception
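
The two octal tests encode a difference worth seeing side by side: a str literal keeps the full code point, while a bytes literal keeps only the low 8 bits. A short sketch with illustrative names (warnings are recorded only to keep the output quiet):

```python
import warnings

with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    as_str = eval(r"'\407'")     # str literal: the code point 0o407
    as_bytes = eval(r"b'\407'")  # bytes literal: 0o407 & 0o377 is kept

print(as_str == chr(0o407), as_bytes == bytes([0o407 & 0o377]))
```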
@@ -0,0 +1,7 @@
+A backslash-character pair that is not a valid escape sequence now generates a
+:exc:`SyntaxWarning`, instead of :exc:`DeprecationWarning`. For example,
+``re.compile("\d+\.\d+")`` now emits a :exc:`SyntaxWarning` (``"\d"`` is an
+invalid escape sequence), use raw strings for regular expression:
+``re.compile(r"\d+\.\d+")``. In a future Python version, :exc:`SyntaxError`
+will eventually be raised, instead of :exc:`SyntaxWarning`. Patch by Victor
+Stinner.
@@ -0,0 +1,4 @@
+Octal escapes with value larger than ``0o377`` (ex: ``"\477"``), deprecated
+in Python 3.11, now produce a :exc:`SyntaxWarning`, instead of
+:exc:`DeprecationWarning`. In a future Python version they will be
+eventually a :exc:`SyntaxError`. Patch by Victor Stinner.
11 changes: 9 additions & 2 deletions Parser/string_parser.c
@@ -21,9 +21,16 @@ warn_invalid_escape_sequence(Parser *p, const char *first_invalid_escape, Token
if (msg == NULL) {
return -1;
}
-if (PyErr_WarnExplicitObject(PyExc_DeprecationWarning, msg, p->tok->filename,
+PyObject *category;
+if (p->feature_version >= 12) {
+category = PyExc_SyntaxWarning;
+}
+else {
+category = PyExc_DeprecationWarning;
+}
+if (PyErr_WarnExplicitObject(category, msg, p->tok->filename,
t->lineno, NULL, NULL) < 0) {
-if (PyErr_ExceptionMatches(PyExc_DeprecationWarning)) {
+if (PyErr_ExceptionMatches(category)) {
/* Replace the DeprecationWarning exception with a SyntaxError
to get a more accurate error report */
PyErr_Clear();
2 changes: 1 addition & 1 deletion Tools/c-analyzer/c_parser/_state_machine.py
@@ -96,7 +96,7 @@ def parse(srclines):
# # end matched parens
# ''')

-'''
+r'''
# for loop
(?:
\s* \b for
2 changes: 1 addition & 1 deletion Tools/wasm/wasm_build.py
@@ -137,7 +137,7 @@ def read_python_version(configure: pathlib.Path = CONFIGURE) -> str:
configure and configure.ac are the canonical source for major and
minor version number.
"""
-version_re = re.compile("^PACKAGE_VERSION='(\d\.\d+)'")
+version_re = re.compile(r"^PACKAGE_VERSION='(\d\.\d+)'")
with configure.open(encoding="utf-8") as f:
for line in f:
mo = version_re.match(line)
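
To see the raw-string pattern at work outside the build script, here is a small sketch that runs it over a couple of made-up lines. The `sample_lines` list is a hypothetical stand-in for the real configure file, and the loop only mirrors the visible part of the function.

```python
import re

version_re = re.compile(r"^PACKAGE_VERSION='(\d\.\d+)'")

# Hypothetical excerpt standing in for the real configure script.
sample_lines = [
    "PACKAGE_NAME='python'\n",
    "PACKAGE_VERSION='3.12'\n",
]

found = None
for line in sample_lines:
    mo = version_re.match(line)
    if mo:
        found = mo.group(1)  # the major.minor text inside the quotes
        break

print(found)
```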