bpo-41842: Add a unregister function in _codecs module #22360

Merged on Sep 28, 2020 (15 commits)
8 changes: 8 additions & 0 deletions Doc/c-api/codec.rst
@@ -10,6 +10,14 @@ Codec registry and support functions
   As side effect, this tries to load the :mod:`encodings` package, if not yet
   done, to make sure that it is always first in the list of search functions.

.. c:function:: int PyCodec_Unregister(PyObject *search_function)

   Unregister a codec search function and clear the registry's cache.
   If the search function is not registered, do nothing.
   Return 0 on success. Raise an exception and return -1 on error.

   .. versionadded:: 3.10

.. c:function:: int PyCodec_KnownEncoding(const char *encoding)

   Return ``1`` or ``0`` depending on whether there is a registered codec for
11 changes: 7 additions & 4 deletions Doc/library/codecs.rst
@@ -163,11 +163,14 @@ function:
   :class:`CodecInfo` object. In case a search function cannot find
   a given encoding, it should return ``None``.

   .. note::

      Search function registration is not currently reversible,
      which may cause problems in some cases, such as unit testing or
      module reloading.
.. function:: unregister(search_function)

   Unregister a codec search function and clear the registry's cache.
   If the search function is not registered, do nothing.

   .. versionadded:: 3.10


While the builtin :func:`open` and the associated :mod:`io` module are the
recommended approach for working with encoded text files, this module
9 changes: 9 additions & 0 deletions Doc/whatsnew/3.10.rst
@@ -109,6 +109,12 @@ base64
Add :func:`base64.b32hexencode` and :func:`base64.b32hexdecode` to support the
Base32 Encoding with Extended Hex Alphabet.

codecs
------

Add a :func:`codecs.unregister` to unregister a codec search function.
(Contributed by Hai Shi in :issue:`41842`.)

curses
------

@@ -237,6 +243,9 @@ New Features
  :class:`datetime.time` objects.
  (Contributed by Zackery Spytz in :issue:`30155`.)

* Add a :c:func:`PyCodec_Unregister` to unregister a codec search function.
  (Contributed by Hai Shi in :issue:`41842`.)

Porting to Python 3.10
----------------------

8 changes: 8 additions & 0 deletions Include/codecs.h
@@ -27,6 +27,14 @@ PyAPI_FUNC(int) PyCodec_Register(
       PyObject *search_function
       );

/* Unregister a codec search function and clear the registry's cache.
   If the search function is not registered, do nothing.
   Return 0 on success. Raise an exception and return -1 on error. */

PyAPI_FUNC(int) PyCodec_Unregister(
       PyObject *search_function
       );

/* Codec registry lookup API.

Looks up the given encoding and returns a CodecInfo object with
12 changes: 12 additions & 0 deletions Lib/test/test_codecs.py
@@ -1641,6 +1641,18 @@ def test_register(self):
        self.assertRaises(TypeError, codecs.register)
        self.assertRaises(TypeError, codecs.register, 42)

    def test_unregister(self):
        name = "nonexistent_codec_name"
        search_function = mock.Mock()
        codecs.register(search_function)
        self.assertRaises(TypeError, codecs.lookup, name)
        search_function.assert_called_with(name)
        search_function.reset_mock()

        codecs.unregister(search_function)
        self.assertRaises(LookupError, codecs.lookup, name)
        search_function.assert_not_called()

    def test_lookup(self):
        self.assertRaises(TypeError, codecs.lookup)
        self.assertRaises(LookupError, codecs.lookup, "__spam__")
1 change: 1 addition & 0 deletions Misc/ACKS
@@ -1575,6 +1575,7 @@ Akash Shende
Charlie Shepherd
Bruce Sherwood
Gregory Shevchenko
Hai Shi
Alexander Shigin
Pete Shinners
Michael Shiplett
@@ -0,0 +1,2 @@
Add :func:`codecs.unregister` and :c:func:`PyCodec_Unregister` to unregister
a codec search function.
22 changes: 22 additions & 0 deletions Modules/_codecsmodule.c
@@ -68,6 +68,27 @@ _codecs_register(PyObject *module, PyObject *search_function)
    Py_RETURN_NONE;
}

/*[clinic input]
_codecs.unregister
    search_function: object
    /

Unregister a codec search function and clear the registry's cache.

If the search function is not registered, do nothing.
[clinic start generated code]*/

static PyObject *
_codecs_unregister(PyObject *module, PyObject *search_function)
/*[clinic end generated code: output=1f0edee9cf246399 input=dd7c004c652d345e]*/
{
    if (PyCodec_Unregister(search_function) < 0) {
        return NULL;
    }

    Py_RETURN_NONE;
}

/*[clinic input]
_codecs.lookup
encoding: str
@@ -992,6 +1013,7 @@ _codecs_lookup_error_impl(PyObject *module, const char *name)

static PyMethodDef _codecs_functions[] = {
    _CODECS_REGISTER_METHODDEF
    _CODECS_UNREGISTER_METHODDEF
    _CODECS_LOOKUP_METHODDEF
    _CODECS_ENCODE_METHODDEF
    _CODECS_DECODE_METHODDEF
13 changes: 12 additions & 1 deletion Modules/clinic/_codecsmodule.c.h

Generated file; its diff is not rendered by default.

21 changes: 21 additions & 0 deletions Python/codecs.c
@@ -50,6 +50,27 @@ int PyCodec_Register(PyObject *search_function)
    return -1;
}

int
PyCodec_Unregister(PyObject *search_function)
{
    PyInterpreterState *interp = PyInterpreterState_Get();
    PyObject *codec_search_path = interp->codec_search_path;
    /* Do nothing if codec_search_path is not created yet or was cleared. */
    if (codec_search_path == NULL) {
        return 0;
    }

    Py_ssize_t n = PyList_Size(codec_search_path);
Review comment (Contributor):

Please use the macro PyList_GET_SIZE() instead of PyList_Size(), as codec_search_path is guaranteed to be a list.

Review comment (Member):

While it's a list right now, its type can change tomorrow.

@shihai1991: If you modify the code to use fast macros, please add the following assertion before using it:

assert(PyList_CheckExact(interp->codec_search_path));

Review comment (Member, Author):

@vstinner Copy that, I have already updated it.
But I have a question: isn't the assert(PyList_Check(op)) in _PyList_CAST() good enough?

    for (Py_ssize_t i = 0; i < n; i++) {
        PyObject *item = PyList_GetItem(codec_search_path, i);
        if (item == search_function) {
            PyDict_Clear(interp->codec_search_cache);
            return PyList_SetSlice(codec_search_path, i, i+1, NULL);
        }
    }
    return 0;
}
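
The logic of the C function above can be paraphrased in Python as a rough model. This is only a sketch for readers of the diff, not CPython's implementation: the class `CodecRegistryModel` and its attribute names are hypothetical stand-ins for the interpreter's search path list and search cache, and name normalization is left out.

```python
class CodecRegistryModel:
    """Toy stand-in for the per-interpreter codec registry (illustration only)."""

    def __init__(self):
        self.search_path = []   # models interp->codec_search_path
        self.search_cache = {}  # models interp->codec_search_cache

    def register(self, search_function):
        if not callable(search_function):
            raise TypeError("argument must be callable")
        self.search_path.append(search_function)

    def lookup(self, name):
        if name in self.search_cache:
            return self.search_cache[name]
        for search_function in self.search_path:
            result = search_function(name)
            if result is not None:
                self.search_cache[name] = result
                return result
        raise LookupError(f"unknown encoding: {name}")

    def unregister(self, search_function):
        # Compare by identity, like `item == search_function` on pointers in C.
        for i, item in enumerate(self.search_path):
            if item is search_function:
                self.search_cache.clear()         # like PyDict_Clear(...)
                del self.search_path[i:i + 1]     # like PyList_SetSlice(..., NULL)
                return
        # Not registered: do nothing.
```

The model makes one design point visible: the whole cache is cleared, rather than only the entries produced by the removed function, because the cache does not record which search function produced each entry.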

extern int _Py_normalize_encoding(const char *, char *, size_t);

/* Convert a string to a normalized Python string(decoded from UTF-8): all characters are