bpo-41842: Add codecs.unregister() function (pythonGH-22360)
Add codecs.unregister() and PyCodec_Unregister() functions
to unregister a codec search function.
shihai1991 authored and Seth Sims committed Oct 18, 2020
1 parent 1dba9b0 commit e3625f2
Showing 11 changed files with 108 additions and 5 deletions.
8 changes: 8 additions & 0 deletions Doc/c-api/codec.rst
@@ -10,6 +10,14 @@ Codec registry and support functions
   As side effect, this tries to load the :mod:`encodings` package, if not yet
   done, to make sure that it is always first in the list of search functions.

.. c:function:: int PyCodec_Unregister(PyObject *search_function)

   Unregister a codec search function and clear the registry's cache.
   If the search function is not registered, do nothing.
   Return 0 on success. Raise an exception and return -1 on error.

   .. versionadded:: 3.10

.. c:function:: int PyCodec_KnownEncoding(const char *encoding)

   Return ``1`` or ``0`` depending on whether there is a registered codec for
11 changes: 7 additions & 4 deletions Doc/library/codecs.rst
@@ -163,11 +163,14 @@ function:
   :class:`CodecInfo` object. In case a search function cannot find
   a given encoding, it should return ``None``.

   .. note::

      Search function registration is not currently reversible,
      which may cause problems in some cases, such as unit testing or
      module reloading.
.. function:: unregister(search_function)

   Unregister a codec search function and clear the registry's cache.
   If the search function is not registered, do nothing.

   .. versionadded:: 3.10


While the builtin :func:`open` and the associated :mod:`io` module are the
recommended approach for working with encoded text files, this module
10 changes: 10 additions & 0 deletions Doc/whatsnew/3.10.rst
@@ -109,6 +109,12 @@ base64
Add :func:`base64.b32hexencode` and :func:`base64.b32hexdecode` to support the
Base32 Encoding with Extended Hex Alphabet.

codecs
------

Add a :func:`codecs.unregister` function to unregister a codec search function.
(Contributed by Hai Shi in :issue:`41842`.)

curses
------

@@ -237,6 +243,10 @@ New Features
:class:`datetime.time` objects.
(Contributed by Zackery Spytz in :issue:`30155`.)

* Add a :c:func:`PyCodec_Unregister` function to unregister a codec
search function.
(Contributed by Hai Shi in :issue:`41842`.)

Porting to Python 3.10
----------------------

8 changes: 8 additions & 0 deletions Include/codecs.h
@@ -27,6 +27,14 @@ PyAPI_FUNC(int) PyCodec_Register(
PyObject *search_function
);

/* Unregister a codec search function and clear the registry's cache.
If the search function is not registered, do nothing.
Return 0 on success. Raise an exception and return -1 on error. */

PyAPI_FUNC(int) PyCodec_Unregister(
PyObject *search_function
);

/* Codec registry lookup API.
Looks up the given encoding and returns a CodecInfo object with
12 changes: 12 additions & 0 deletions Lib/test/test_codecs.py
@@ -1641,6 +1641,18 @@ def test_register(self):
        self.assertRaises(TypeError, codecs.register)
        self.assertRaises(TypeError, codecs.register, 42)

    def test_unregister(self):
        name = "nonexistent_codec_name"
        search_function = mock.Mock()
        codecs.register(search_function)
        self.assertRaises(TypeError, codecs.lookup, name)
        search_function.assert_called_with(name)
        search_function.reset_mock()

        codecs.unregister(search_function)
        self.assertRaises(LookupError, codecs.lookup, name)
        search_function.assert_not_called()

    def test_lookup(self):
        self.assertRaises(TypeError, codecs.lookup)
        self.assertRaises(LookupError, codecs.lookup, "__spam__")
1 change: 1 addition & 0 deletions Misc/ACKS
@@ -1575,6 +1575,7 @@ Akash Shende
Charlie Shepherd
Bruce Sherwood
Gregory Shevchenko
Hai Shi
Alexander Shigin
Pete Shinners
Michael Shiplett
@@ -0,0 +1,2 @@
Add :c:func:`PyCodec_Unregister` function to unregister a codec search
function.
@@ -0,0 +1 @@
Add :func:`codecs.unregister` function to unregister a codec search function.
22 changes: 22 additions & 0 deletions Modules/_codecsmodule.c
@@ -68,6 +68,27 @@ _codecs_register(PyObject *module, PyObject *search_function)
    Py_RETURN_NONE;
}

/*[clinic input]
_codecs.unregister
    search_function: object
    /
Unregister a codec search function and clear the registry's cache.
If the search function is not registered, do nothing.
[clinic start generated code]*/

static PyObject *
_codecs_unregister(PyObject *module, PyObject *search_function)
/*[clinic end generated code: output=1f0edee9cf246399 input=dd7c004c652d345e]*/
{
    if (PyCodec_Unregister(search_function) < 0) {
        return NULL;
    }

    Py_RETURN_NONE;
}

/*[clinic input]
_codecs.lookup
encoding: str
@@ -992,6 +1013,7 @@ _codecs_lookup_error_impl(PyObject *module, const char *name)

static PyMethodDef _codecs_functions[] = {
_CODECS_REGISTER_METHODDEF
_CODECS_UNREGISTER_METHODDEF
_CODECS_LOOKUP_METHODDEF
_CODECS_ENCODE_METHODDEF
_CODECS_DECODE_METHODDEF
13 changes: 12 additions & 1 deletion Modules/clinic/_codecsmodule.c.h

Some generated files are not rendered by default.

25 changes: 25 additions & 0 deletions Python/codecs.c
@@ -50,6 +50,31 @@ int PyCodec_Register(PyObject *search_function)
    return -1;
}

int
PyCodec_Unregister(PyObject *search_function)
{
    PyInterpreterState *interp = PyInterpreterState_Get();
    PyObject *codec_search_path = interp->codec_search_path;
    /* Do nothing if codec_search_path is not created yet or was cleared. */
    if (codec_search_path == NULL) {
        return 0;
    }

    assert(PyList_CheckExact(codec_search_path));
    Py_ssize_t n = PyList_GET_SIZE(codec_search_path);
    for (Py_ssize_t i = 0; i < n; i++) {
        PyObject *item = PyList_GET_ITEM(codec_search_path, i);
        if (item == search_function) {
            if (interp->codec_search_cache != NULL) {
                assert(PyDict_CheckExact(interp->codec_search_cache));
                PyDict_Clear(interp->codec_search_cache);
            }
            return PyList_SetSlice(codec_search_path, i, i+1, NULL);
        }
    }
    return 0;
}

extern int _Py_normalize_encoding(const char *, char *, size_t);

/* Convert a string to a normalized Python string(decoded from UTF-8): all characters are
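
The control flow of `PyCodec_Unregister` above can be modelled in a few lines of Python. This is only an illustrative sketch: `unregister_model` and its arguments are hypothetical stand-ins for the interpreter's `codec_search_path` list and `codec_search_cache` dict, not an API that exists in CPython.

```python
def unregister_model(search_path, search_cache, search_function):
    """Pure-Python model of the loop in PyCodec_Unregister (illustrative only)."""
    if search_path is None:
        # Search path not created yet or already cleared: nothing to do.
        return
    for i, item in enumerate(search_path):
        # The C code compares pointers, so the match is by identity.
        if item is search_function:
            if search_cache is not None:
                search_cache.clear()
            # Mirrors PyList_SetSlice(path, i, i+1, NULL): drop one element.
            del search_path[i:i + 1]
            return


def first(name):
    return None


def second(name):
    return None


# A function registered twice loses only its first entry per call.
path = [first, second, first]
cache = {"some_encoding": "placeholder for a cached CodecInfo"}
unregister_model(path, cache, first)
```

The model makes two details of the C function easy to see: the cache is cleared only when a match is found, and the function returns after removing the first matching entry.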