bpo-39939: Add str.removeprefix and str.removesuffix #18939

Merged on Apr 22, 2020 (39 commits)
Showing changes from 28 commits

Commits
0addc43
Add cutprefix and cutsuffix methods to str, bytes, and bytearray.
sweeneyde Mar 10, 2020
3adb9fa
pep 7: lining up argumenets
sweeneyde Mar 11, 2020
a7a1bc8
Revert "pep 7: lining up argumenets"
sweeneyde Mar 11, 2020
fe18644
pep 7: line up arguemnts
sweeneyde Mar 11, 2020
ff8e3c6
pep 7: line up arguemnts
sweeneyde Mar 11, 2020
5339a46
📜🤖 Added by blurb_it.
blurb-it[bot] Mar 11, 2020
cc85978
add UserString methods
sweeneyde Mar 11, 2020
1442ffe
Merge branch 'cut_affix' of https://github.com/sweeneyde/cpython into…
sweeneyde Mar 11, 2020
111b0f9
update count of objects in test_doctests
sweeneyde Mar 11, 2020
8265e4d
restore clinic output
sweeneyde Mar 11, 2020
7401b87
update count of objects in test_doctests
sweeneyde Mar 11, 2020
e550171
return original when bytes.cut***fix does not find match
sweeneyde Mar 11, 2020
0a5d0a9
Document cutprefix and cutsuffix
sweeneyde Mar 12, 2020
a126438
fix doctest in docs
sweeneyde Mar 12, 2020
fbc4a50
Add credit
sweeneyde Mar 12, 2020
3783dc3
make the empty affix case fast
sweeneyde Mar 12, 2020
428e733
clarified: one affix at a time
sweeneyde Mar 12, 2020
5796757
ensure tuples are not allowed
sweeneyde Mar 12, 2020
6fe9ac5
Fix userstring type behavior
sweeneyde Mar 12, 2020
13e8296
WhatsNew and ACKS
sweeneyde Mar 12, 2020
49fa220
WhatsNew and ACKS
sweeneyde Mar 12, 2020
550beca
fix spelling
sweeneyde Mar 12, 2020
01d0655
Direct readers from (l/r)strip to cut***fix
sweeneyde Mar 12, 2020
3c0e350
Merge branch 'cut_affix' of https://github.com/sweeneyde/cpython into…
sweeneyde Mar 12, 2020
fe80ba8
Fix typo in docs
sweeneyde Mar 16, 2020
ae23692
minor c formatting consistency
sweeneyde Mar 16, 2020
a9e253c
copy/paste errors; don't say 'return the original'
sweeneyde Mar 20, 2020
4c33b74
changed 'cut' to 'remove'
sweeneyde Mar 25, 2020
4413e2e
Change method names in whatsnew
sweeneyde Mar 25, 2020
5dfa968
Update Misc/NEWS.d/next/Core and Builtins/2020-03-11-19-17-36.bpo-399…
sweeneyde Mar 25, 2020
aa6eede
new names in the whatsnew header
sweeneyde Mar 28, 2020
d941711
Merge branch 'master' into cut_affix
sweeneyde Apr 9, 2020
f55836d
add examples of differences between l/rstrip and removeaffix
sweeneyde Apr 21, 2020
8d0584a
Merge branch 'cut_affix' of https://github.com/sweeneyde/cpython into…
sweeneyde Apr 21, 2020
8b6267a
apply changes from review
sweeneyde Apr 22, 2020
61cd530
apply changes from review
sweeneyde Apr 22, 2020
ffe72f1
more documentation tweaks
sweeneyde Apr 22, 2020
d8f5a99
clean up the NEWS entry
sweeneyde Apr 22, 2020
3df1f38
mention arg type in docstrings
sweeneyde Apr 22, 2020
86 changes: 84 additions & 2 deletions Doc/library/stdtypes.rst
@@ -1549,6 +1549,39 @@ expression support in the :mod:`re` module).
interpreted as in slice notation.


.. method:: str.removeprefix(prefix, /)

Return a copy of the string with the given prefix removed, if present. ::

>>> 'BarFooBaz'.removeprefix('Bar')
'FooBaz'
>>> 'BarFooBaz'.removeprefix('Baz')
'BarFooBaz'

The expression ``s.removeprefix(pre)`` is roughly equivalent to
``s[len(pre):] if s.startswith(pre) else s``.
Unlike :meth:`~str.startswith`, only one prefix can be passed
at a time.

.. versionadded:: 3.9

.. method:: str.removesuffix(suffix, /)

Return a copy of the string with the given suffix removed, if present. ::

>>> 'BarFooBaz'.removesuffix('Baz')
'BarFoo'
>>> 'BarFooBaz'.removesuffix('Bar')
'BarFooBaz'

The expression ``s.removesuffix(suf)`` is roughly equivalent to
``s[:-len(suf)] if suf and s.endswith(suf) else s``.
Unlike :meth:`~str.endswith`, only one suffix can be passed
at a time.

.. versionadded:: 3.9
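
The two "roughly equivalent" expressions quoted above can be written out as a small pure-Python sketch. The function names below are illustrative stand-ins, not part of the patch; the point is to show why the suffix form needs the extra ``suf and`` guard while the prefix form does not.

```python
def removeprefix(s: str, pre: str) -> str:
    """Illustrative model of the documented str.removeprefix equivalence."""
    return s[len(pre):] if s.startswith(pre) else s


def removesuffix(s: str, suf: str) -> str:
    """Illustrative model of the documented str.removesuffix equivalence."""
    # Without the "suf and" guard, an empty suffix would evaluate s[:-0],
    # which is s[:0], and the whole string would be lost.
    return s[:-len(suf)] if suf and s.endswith(suf) else s


print(removeprefix("BarFooBaz", "Bar"))     # FooBaz
print(removesuffix("BarFooBaz", "Baz"))     # BarFoo
print(repr(removesuffix("BarFooBaz", "")))  # 'BarFooBaz'
print(repr("BarFooBaz"[:-0]))               # ''
```

The last line shows the pitfall directly: slicing with ``-0`` is the same as slicing with ``0``.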


.. method:: str.encode(encoding="utf-8", errors="strict")

Return an encoded version of the string as a bytes object. Default encoding
@@ -1831,6 +1864,9 @@ expression support in the :mod:`re` module).
>>> 'www.example.com'.lstrip('cmowz.')
'example.com'

See :meth:`str.removeprefix` for a method that will remove a single prefix
string rather than all of a set of characters.


.. staticmethod:: str.maketrans(x[, y[, z]])

@@ -1911,6 +1947,8 @@ expression support in the :mod:`re` module).
>>> 'mississippi'.rstrip('ipz')
'mississ'

See :meth:`str.removesuffix` for a method that will remove a single suffix
string rather than all of a set of characters.
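
The difference between stripping a set of characters and removing one affix is easiest to see side by side. The sketch below assumes Python 3.9 or later (where the new methods exist); the sample strings are made up for illustration.

```python
host = "www.wow.com"

# lstrip treats its argument as a set of characters and keeps stripping
# for as long as the leading character belongs to that set.
stripped = host.lstrip("w.")

# removeprefix removes the exact string once, or nothing at all.
no_prefix = host.removeprefix("www.")

filename = "config.zig"

# rstrip with the characters '.', 'z', 'i', 'g' also eats the trailing
# "ig" of "config", because those letters are in the set too.
rstripped = filename.rstrip(".zig")

# removesuffix only removes the literal suffix.
no_suffix = filename.removesuffix(".zig")

print(stripped)   # ow.com
print(no_prefix)  # wow.com
print(rstripped)  # conf
print(no_suffix)  # config
```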

.. method:: str.split(sep=None, maxsplit=-1)

@@ -2591,6 +2629,46 @@ arbitrary binary data.
Also accept an integer in the range 0 to 255 as the subsequence.


.. method:: bytes.removeprefix(prefix, /)
bytearray.removeprefix(prefix, /)

Return a copy of the binary data with the given prefix removed,
if present. ::

>>> b'BarFooBaz'.removeprefix(b'Bar')
b'FooBaz'
>>> b'BarFooBaz'.removeprefix(b'Baz')
b'BarFooBaz'

The *prefix* may be any :term:`bytes-like object`.
The expression ``b.removeprefix(pre)`` is roughly equivalent to
``b[len(pre):] if b.startswith(pre) else b[:]``.
Unlike :meth:`~bytes.startswith`, only one prefix can be passed
at a time.

.. versionadded:: 3.9


.. method:: bytes.removesuffix(suffix, /)
bytearray.removesuffix(suffix, /)

Return a copy of the binary data with the given suffix removed,
if present. ::

>>> b'BarFooBaz'.removesuffix(b'Baz')
b'BarFoo'
>>> b'BarFooBaz'.removesuffix(b'Bar')
b'BarFooBaz'

The *suffix* may be any :term:`bytes-like object`.
The expression ``b.removesuffix(suf)`` is roughly equivalent to
``b[:-len(suf)] if suf and b.endswith(suf) else b[:]``.
Unlike :meth:`~bytes.endswith`, only one suffix can be passed
at a time.

.. versionadded:: 3.9
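
A short sketch of the binary variants, assuming Python 3.9 or later. It illustrates two points from the text above: the affix may be any bytes-like object, and the documented equivalent ends in ``b[:]``, so a ``bytearray`` caller gets a fresh object even when nothing matches.

```python
data = b"BarFooBaz"

# The affix can be bytes, bytearray, memoryview, or another bytes-like object.
from_bytearray = data.removeprefix(bytearray(b"Bar"))
from_memoryview = data.removesuffix(memoryview(b"Baz"))

# On a bytearray the result is a new bytearray, also when there is no match.
buf = bytearray(b"BarFooBaz")
unchanged = buf.removeprefix(b"Baz")
is_copy = unchanged == buf and unchanged is not buf

# A str affix is not bytes-like, so it is rejected.
try:
    data.removeprefix("Bar")
    rejected_str = False
except TypeError:
    rejected_str = True

print(from_bytearray, from_memoryview, is_copy, rejected_str)
```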


.. method:: bytes.decode(encoding="utf-8", errors="strict")
bytearray.decode(encoding="utf-8", errors="strict")

@@ -2841,7 +2919,9 @@ produce new objects.
b'example.com'

The binary sequence of byte values to remove may be any
- :term:`bytes-like object`.
+ :term:`bytes-like object`. See :meth:`~bytes.removeprefix` for a method
+ that will remove a single prefix string rather than all of a set of
+ characters.

.. note::

@@ -2890,7 +2970,9 @@ produce new objects.
b'mississ'

The binary sequence of byte values to remove may be any
- :term:`bytes-like object`.
+ :term:`bytes-like object`. See :meth:`~bytes.removesuffix` for a method
+ that will remove a single suffix string rather than all of a set of
+ characters.

.. note::

7 changes: 7 additions & 0 deletions Doc/whatsnew/3.9.rst
@@ -105,6 +105,13 @@ Merge (``|``) and update (``|=``) operators have been added to the built-in
:class:`dict` class. See :pep:`584` for a full description.
(Contributed by Brandt Bucher in :issue:`36144`.)

removeprefix and removesuffix methods for strings
-------------------------------------------------
``str.removeprefix(prefix)`` and ``str.removesuffix(suffix)`` have been added to
easily remove an unneeded prefix or a suffix from a string. Corresponding
``bytes`` and ``bytearray`` methods have also been added.
(Contributed by Dennis Sweeney in :issue:`39939`.)


Other Language Changes
======================
8 changes: 8 additions & 0 deletions Lib/collections/__init__.py
@@ -1202,6 +1202,14 @@ def count(self, sub, start=0, end=_sys.maxsize):
if isinstance(sub, UserString):
sub = sub.data
return self.data.count(sub, start, end)
def removeprefix(self, prefix, /):
if isinstance(prefix, UserString):
prefix = prefix.data
return self.__class__(self.data.removeprefix(prefix))
def removesuffix(self, suffix, /):
if isinstance(suffix, UserString):
suffix = suffix.data
return self.__class__(self.data.removesuffix(suffix))
def encode(self, encoding='utf-8', errors='strict'):
encoding = 'utf-8' if encoding is None else encoding
errors = 'strict' if errors is None else errors
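
The `UserString` wrappers above unwrap a `UserString` argument and rebuild the result with `self.__class__`, so subclasses keep their type. A minimal sketch of that behaviour, assuming Python 3.9 or later; the `Tag` subclass is hypothetical and exists only for this illustration.

```python
from collections import UserString


class Tag(UserString):
    """Hypothetical subclass, used only to show type preservation."""


tag = Tag("v1.2.3")

# The prefix may itself be a UserString; it is unwrapped to its .data first.
result = tag.removeprefix(UserString("v"))

print(type(result).__name__, result)  # Tag 1.2.3
```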
36 changes: 36 additions & 0 deletions Lib/test/string_tests.py
@@ -682,6 +682,42 @@ def test_replace_overflow(self):
self.checkraises(OverflowError, A2_16, "replace", "A", A2_16)
self.checkraises(OverflowError, A2_16, "replace", "AA", A2_16+A2_16)

def test_removeprefix(self):
self.checkequal('am', 'spam', 'removeprefix', 'sp')
self.checkequal('spamspam', 'spamspamspam', 'removeprefix', 'spam')
self.checkequal('spam', 'spam', 'removeprefix', 'python')
self.checkequal('spam', 'spam', 'removeprefix', 'spider')
self.checkequal('spam', 'spam', 'removeprefix', 'spam and eggs')

self.checkequal('', '', 'removeprefix', '')
self.checkequal('', '', 'removeprefix', 'abcde')
self.checkequal('abcde', 'abcde', 'removeprefix', '')
self.checkequal('', 'abcde', 'removeprefix', 'abcde')

self.checkraises(TypeError, 'hello', 'removeprefix')
self.checkraises(TypeError, 'hello', 'removeprefix', 42)
self.checkraises(TypeError, 'hello', 'removeprefix', 42, 'h')
self.checkraises(TypeError, 'hello', 'removeprefix', 'h', 42)
self.checkraises(TypeError, 'hello', 'removeprefix', ("he", "l"))

def test_removesuffix(self):
self.checkequal('sp', 'spam', 'removesuffix', 'am')
self.checkequal('spamspam', 'spamspamspam', 'removesuffix', 'spam')
self.checkequal('spam', 'spam', 'removesuffix', 'python')
self.checkequal('spam', 'spam', 'removesuffix', 'blam')
self.checkequal('spam', 'spam', 'removesuffix', 'eggs and spam')

self.checkequal('', '', 'removesuffix', '')
self.checkequal('', '', 'removesuffix', 'abcde')
self.checkequal('abcde', 'abcde', 'removesuffix', '')
self.checkequal('', 'abcde', 'removesuffix', 'abcde')

self.checkraises(TypeError, 'hello', 'removesuffix')
self.checkraises(TypeError, 'hello', 'removesuffix', 42)
self.checkraises(TypeError, 'hello', 'removesuffix', 42, 'h')
self.checkraises(TypeError, 'hello', 'removesuffix', 'h', 42)
self.checkraises(TypeError, 'hello', 'removesuffix', ("lo", "l"))

def test_capitalize(self):
self.checkequal(' hello ', ' hello ', 'capitalize')
self.checkequal('Hello ', 'Hello ','capitalize')
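
For readers unfamiliar with the `checkequal` and `checkraises` helpers, the sketch below restates a few of the new cases with simplified stand-ins. `check_equal` and `check_raises` are hypothetical helpers written for this note, not the real test-suite methods (which also convert the arguments to the type under test). Python 3.9 or later is assumed.

```python
def check_equal(expected, obj, method_name, *args):
    """Simplified stand-in: call obj.method_name(*args) and compare."""
    actual = getattr(obj, method_name)(*args)
    assert actual == expected, f"{obj!r}.{method_name}{args!r} gave {actual!r}"


def check_raises(exc_type, obj, method_name, *args):
    """Simplified stand-in: the call must raise exc_type."""
    try:
        getattr(obj, method_name)(*args)
    except exc_type:
        return
    raise AssertionError(f"{obj!r}.{method_name}{args!r} did not raise")


# Only one copy of the affix is removed per call.
check_equal("spamspam", "spamspamspam", "removeprefix", "spam")
check_equal("spamspam", "spamspamspam", "removesuffix", "spam")

# An affix longer than the string, or an empty affix, leaves it unchanged.
check_equal("spam", "spam", "removeprefix", "spam and eggs")
check_equal("abcde", "abcde", "removesuffix", "")

# Unlike startswith/endswith, a tuple of candidates is rejected.
check_raises(TypeError, "hello", "removeprefix", ("he", "l"))
check_raises(TypeError, "hello", "removesuffix", ("lo", "l"))
```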
2 changes: 1 addition & 1 deletion Lib/test/test_doctest.py
@@ -661,7 +661,7 @@ def non_Python_modules(): r"""

>>> import builtins
>>> tests = doctest.DocTestFinder().find(builtins)
- >>> 800 < len(tests) < 820 # approximate number of objects with docstrings
+ >>> 800 < len(tests) < 826 # approximate number of objects with docstrings
True
>>> real_tests = [t for t in tests if len(t.examples) > 0]
>>> len(real_tests) # objects that actually have doctests
1 change: 1 addition & 0 deletions Misc/ACKS
@@ -1655,6 +1655,7 @@ Hisao Suzuki
Kalle Svensson
Andrew Svetlov
Paul Swartz
Dennis Sweeney
Al Sweigart
Sviatoslav Sydorenko
Thenault Sylvain
@@ -0,0 +1 @@
Added str.removeprefix and str.removesuffix methods and corresponding bytes and bytearray methods to remove affixes from a string, if present. Patch by Dennis Sweeney.
65 changes: 65 additions & 0 deletions Objects/bytearrayobject.c
@@ -1185,6 +1185,69 @@ bytearray_endswith(PyByteArrayObject *self, PyObject *args)
return _Py_bytes_endswith(PyByteArray_AS_STRING(self), PyByteArray_GET_SIZE(self), args);
}

/*[clinic input]
bytearray.removeprefix as bytearray_removeprefix

prefix: Py_buffer
/

Return a copy of the bytearray with a given prefix removed if present.

If the bytearray starts with the prefix, return b[len(prefix):].
Otherwise, return a copy of the original bytearray.
[clinic start generated code]*/

static PyObject *
bytearray_removeprefix_impl(PyByteArrayObject *self, Py_buffer *prefix)
/*[clinic end generated code: output=6cabc585e7f502e0 input=4b4f34cda54a3c82]*/
{
const char *self_start = PyByteArray_AS_STRING(self);
Py_ssize_t self_len = PyByteArray_GET_SIZE(self);
const char *prefix_start = prefix->buf;
Py_ssize_t prefix_len = prefix->len;

if (self_len >= prefix_len
&& memcmp(self_start, prefix_start, prefix_len) == 0)
{
return PyByteArray_FromStringAndSize(self_start + prefix_len,
self_len - prefix_len);
}

return PyByteArray_FromStringAndSize(self_start, self_len);
}

/*[clinic input]
bytearray.removesuffix as bytearray_removesuffix

suffix: Py_buffer
/

Return a copy of the bytearray with a given suffix removed if present.

If the bytearray ends with the suffix, return b[:len(b)-len(suffix)].
Otherwise, return a copy of the original bytearray.
[clinic start generated code]*/

static PyObject *
bytearray_removesuffix_impl(PyByteArrayObject *self, Py_buffer *suffix)
/*[clinic end generated code: output=2bc8cfb79de793d3 input=9e99a83e43aa6ed8]*/
{
const char *self_start = PyByteArray_AS_STRING(self);
Py_ssize_t self_len = PyByteArray_GET_SIZE(self);
const char *suffix_start = suffix->buf;
Py_ssize_t suffix_len = suffix->len;

if (self_len >= suffix_len
&& memcmp(self_start + self_len - suffix_len,
suffix_start, suffix_len) == 0)
{
return PyByteArray_FromStringAndSize(self_start,
self_len - suffix_len);
}

return PyByteArray_FromStringAndSize(self_start, self_len);
}
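
The control flow of the two `bytearray` functions above can be modelled in Python roughly as follows. This is an illustrative sketch, not the implementation: the `model_` function names are made up, `memoryview(...).tobytes()` stands in for the `Py_buffer` argument, and a slice comparison stands in for `memcmp`.

```python
def model_bytearray_removeprefix(self: bytearray, prefix) -> bytearray:
    """Sketch of the length check plus memcmp logic for the prefix case."""
    prefix_bytes = memoryview(prefix).tobytes()  # stands in for Py_buffer
    n = len(prefix_bytes)
    if len(self) >= n and self[:n] == prefix_bytes:
        return bytearray(self[n:])
    # No match: still a new object, mirroring PyByteArray_FromStringAndSize.
    return bytearray(self)


def model_bytearray_removesuffix(self: bytearray, suffix) -> bytearray:
    """Sketch of the length check plus memcmp logic for the suffix case."""
    suffix_bytes = memoryview(suffix).tobytes()
    n = len(suffix_bytes)
    # Index from the front (len(self) - n), as the C code does with pointers,
    # so an empty suffix does not hit the s[:-0] pitfall.
    start = len(self) - n
    if start >= 0 and self[start:] == suffix_bytes:
        return bytearray(self[:start])
    return bytearray(self)


buf = bytearray(b"BarFooBaz")
print(model_bytearray_removeprefix(buf, b"Bar"))  # bytearray(b'FooBaz')
print(model_bytearray_removesuffix(buf, b"Baz"))  # bytearray(b'BarFoo')
```

Both branches return a new `bytearray`, which matches the docstring's "return a copy of the original bytearray".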


/*[clinic input]
bytearray.translate
@@ -2207,6 +2270,8 @@ bytearray_methods[] = {
BYTEARRAY_POP_METHODDEF
BYTEARRAY_REMOVE_METHODDEF
BYTEARRAY_REPLACE_METHODDEF
BYTEARRAY_REMOVEPREFIX_METHODDEF
BYTEARRAY_REMOVESUFFIX_METHODDEF
BYTEARRAY_REVERSE_METHODDEF
{"rfind", (PyCFunction)bytearray_rfind, METH_VARARGS, _Py_rfind__doc__},
{"rindex", (PyCFunction)bytearray_rindex, METH_VARARGS, _Py_rindex__doc__},
76 changes: 76 additions & 0 deletions Objects/bytesobject.c
@@ -2181,6 +2181,80 @@ bytes_replace_impl(PyBytesObject *self, Py_buffer *old, Py_buffer *new,

/** End DALKE **/

/*[clinic input]
bytes.removeprefix as bytes_removeprefix

prefix: Py_buffer
/

Remove a specified prefix, if present.

If the bytes starts with the prefix, return b[len(prefix):].
Otherwise, return a copy of the original bytes.
[clinic start generated code]*/

static PyObject *
bytes_removeprefix_impl(PyBytesObject *self, Py_buffer *prefix)
/*[clinic end generated code: output=f006865331a06ab6 input=51ea1fc18687503e]*/
{
const char *self_start = PyBytes_AS_STRING(self);
Py_ssize_t self_len = PyBytes_GET_SIZE(self);
const char *prefix_start = prefix->buf;
Py_ssize_t prefix_len = prefix->len;

if (self_len >= prefix_len
&& prefix_len > 0
&& memcmp(self_start, prefix_start, prefix_len) == 0)
{
return PyBytes_FromStringAndSize(self_start + prefix_len,
self_len - prefix_len);
}

if (PyBytes_CheckExact(self)) {
Py_INCREF(self);
return (PyObject *)self;
}

return PyBytes_FromStringAndSize(self_start, self_len);
}

/*[clinic input]
bytes.removesuffix as bytes_removesuffix

suffix: Py_buffer
/

Remove a specified suffix, if present.

If the bytes ends with the suffix, return b[:len(b)-len(suffix)].
Otherwise, return a copy of the original bytes.
[clinic start generated code]*/

static PyObject *
bytes_removesuffix_impl(PyBytesObject *self, Py_buffer *suffix)
/*[clinic end generated code: output=d887d308e3242eeb input=9f6172d9ddad90cd]*/
{
const char *self_start = PyBytes_AS_STRING(self);
Py_ssize_t self_len = PyBytes_GET_SIZE(self);
const char *suffix_start = suffix->buf;
Py_ssize_t suffix_len = suffix->len;

if (self_len >= suffix_len
&& suffix_len > 0
&& memcmp(self_start + self_len - suffix_len,
suffix_start, suffix_len) == 0)
{
return PyBytes_FromStringAndSize(self_start,
self_len - suffix_len);
}

if (PyBytes_CheckExact(self)) {
Py_INCREF(self);
return (PyObject *)self;
}

return PyBytes_FromStringAndSize(self_start, self_len);
}
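
The `bytes` versions differ from the `bytearray` ones in two ways visible in the code above: an empty affix is skipped by the `> 0` check, and when nothing is removed an exact `bytes` object is returned as-is rather than copied. A rough Python model of the prefix case follows; the function name and the `MyBytes` subclass are hypothetical, and returning the same object should be read as a detail of this patch rather than a documented guarantee.

```python
def model_bytes_removeprefix(self: bytes, prefix) -> bytes:
    """Sketch of bytes_removeprefix_impl's three return paths."""
    prefix_bytes = memoryview(prefix).tobytes()  # stands in for Py_buffer
    n = len(prefix_bytes)
    if len(self) >= n and n > 0 and self[:n] == prefix_bytes:
        return bytes(self[n:])
    if type(self) is bytes:
        # Mirrors the PyBytes_CheckExact branch: hand back the same object.
        return self
    # Subclass instance: build a plain bytes copy instead.
    return bytes(self)


class MyBytes(bytes):
    """Hypothetical subclass, used only to exercise the last branch."""


data = b"BarFooBaz"
sub = MyBytes(b"BarFooBaz")

print(model_bytes_removeprefix(data, b"Bar"))          # b'FooBaz'
print(model_bytes_removeprefix(data, b"Baz") is data)  # True
print(type(model_bytes_removeprefix(sub, b"Baz")))     # <class 'bytes'>
```

Since `bytes` is immutable, returning the original object is indistinguishable from a copy for callers that only look at the value.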

static PyObject *
bytes_startswith(PyBytesObject *self, PyObject *args)
@@ -2420,6 +2494,8 @@ bytes_methods[] = {
BYTES_MAKETRANS_METHODDEF
BYTES_PARTITION_METHODDEF
BYTES_REPLACE_METHODDEF
BYTES_REMOVEPREFIX_METHODDEF
BYTES_REMOVESUFFIX_METHODDEF
{"rfind", (PyCFunction)bytes_rfind, METH_VARARGS, _Py_rfind__doc__},
{"rindex", (PyCFunction)bytes_rindex, METH_VARARGS, _Py_rindex__doc__},
STRINGLIB_RJUST_METHODDEF