gh-113626: Add allow_code parameter in marshal functions (GH-113648)
Passing allow_code=False prevents serialization and de-serialization of
code objects which are incompatible between Python versions.
serhiy-storchaka authored Jan 16, 2024
1 parent a482bc6 commit d2d8332
Showing 10 changed files with 356 additions and 53 deletions.
41 changes: 31 additions & 10 deletions Doc/library/marshal.rst
@@ -23,7 +23,11 @@ transfer of Python objects through RPC calls, see the modules :mod:`pickle` and
:mod:`shelve`. The :mod:`marshal` module exists mainly to support reading and
writing the "pseudo-compiled" code for Python modules of :file:`.pyc` files.
Therefore, the Python maintainers reserve the right to modify the marshal format
in backward incompatible ways should the need arise. If you're serializing and
in backward incompatible ways should the need arise.
The format of code objects is not compatible between Python versions,
even if the version of the format is the same.
De-serializing a code object in the incorrect Python version has undefined behavior.
If you're serializing and
de-serializing Python objects, use the :mod:`pickle` module instead -- the
performance is comparable, version independence is guaranteed, and pickle
supports a substantially wider range of objects than marshal.
@@ -40,7 +44,8 @@ Not all Python object types are supported; in general, only objects whose value
is independent from a particular invocation of Python can be written and read by
this module. The following types are supported: booleans, integers, floating
point numbers, complex numbers, strings, bytes, bytearrays, tuples, lists, sets,
frozensets, dictionaries, and code objects, where it should be understood that
frozensets, dictionaries, and code objects (if *allow_code* is true),
where it should be understood that
tuples, lists, sets, frozensets and dictionaries are only supported as long as
the values contained therein are themselves supported. The
singletons :const:`None`, :const:`Ellipsis` and :exc:`StopIteration` can also be
@@ -54,27 +59,32 @@ bytes-like objects.
The module defines these functions:


.. function:: dump(value, file[, version])
.. function:: dump(value, file, version=version, /, *, allow_code=True)

Write the value on the open file. The value must be a supported type. The
file must be a writeable :term:`binary file`.

If the value has (or contains an object that has) an unsupported type, a
:exc:`ValueError` exception is raised --- but garbage data will also be written
to the file. The object will not be properly read back by :func:`load`.
:ref:`Code objects <code-objects>` are only supported if *allow_code* is true.

The *version* argument indicates the data format that ``dump`` should use
(see below).

.. audit-event:: marshal.dumps value,version marshal.dump

.. versionchanged:: 3.13
Added the *allow_code* parameter.
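
As an illustration of the signature above, here is a minimal sketch of a file round trip with code objects disallowed. It mirrors the first half of ``test_no_allow_code`` from this commit. The ``extra`` dictionary is a hypothetical convenience that is not part of the change; it assumes the keyword exists only on Python 3.13 and later and omits it on older interpreters.

```python
import io
import marshal
import sys

# Assumption: allow_code exists only on Python 3.13+, so pass it conditionally.
extra = {"allow_code": False} if sys.version_info >= (3, 13) else {}

data = {"a": [({0},)]}  # plain data, no code objects

buf = io.BytesIO()                     # stands in for a writable binary file
marshal.dump(data, buf, **extra)       # serialize into the buffer
buf.seek(0)                            # rewind before reading
restored = marshal.load(buf, **extra)  # read back under the same restriction

print(restored == data)
```

Because ``allow_code`` is keyword-only, it cannot be confused with the positional *version* argument.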

.. function:: load(file)

.. function:: load(file, /, *, allow_code=True)

Read one value from the open file and return it. If no valid value is read
(e.g. because the data has a different Python version's incompatible marshal
format), raise :exc:`EOFError`, :exc:`ValueError` or :exc:`TypeError`. The
file must be a readable :term:`binary file`.
format), raise :exc:`EOFError`, :exc:`ValueError` or :exc:`TypeError`.
:ref:`Code objects <code-objects>` are only supported if *allow_code* is true.
The file must be a readable :term:`binary file`.

.. audit-event:: marshal.load "" marshal.load

@@ -88,24 +98,32 @@ The module defines these functions:
This call used to raise a ``code.__new__`` audit event for each code object. Now
it raises a single ``marshal.load`` event for the entire load operation.

.. versionchanged:: 3.13
Added the *allow_code* parameter.


.. function:: dumps(value[, version])
.. function:: dumps(value, version=version, /, *, allow_code=True)

Return the bytes object that would be written to a file by ``dump(value, file)``. The
value must be a supported type. Raise a :exc:`ValueError` exception if value
has (or contains an object that has) an unsupported type.
:ref:`Code objects <code-objects>` are only supported if *allow_code* is true.

The *version* argument indicates the data format that ``dumps`` should use
(see below).

.. audit-event:: marshal.dumps value,version marshal.dump

.. versionchanged:: 3.13
Added the *allow_code* parameter.
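
The sketch below shows how ``dumps`` might be wrapped so that a code object hidden inside a container is refused at serialization time. ``dumps_data_only`` is a hypothetical helper name, not something this commit adds, and the version check assumes the keyword is unavailable before Python 3.13.

```python
import marshal
import sys


def dumps_data_only(value):
    """Marshal value, refusing code objects where the interpreter supports it."""
    if sys.version_info >= (3, 13):
        # Raises ValueError if value is, or contains, a code object.
        return marshal.dumps(value, allow_code=False)
    # Before 3.13 there is no allow_code keyword; code objects are accepted.
    return marshal.dumps(value)


code = compile("1 + 1", "<example>", "eval")

try:
    dumps_data_only({"payload": code})
    rejected = False
except ValueError:
    rejected = True

print("code object rejected:", rejected)
```

Unlike ``dump``, a failed ``dumps`` call returns nothing, so no partial output needs to be cleaned up.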

.. function:: loads(bytes)

.. function:: loads(bytes, /, *, allow_code=True)

Convert the :term:`bytes-like object` to a value. If no valid value is found, raise
:exc:`EOFError`, :exc:`ValueError` or :exc:`TypeError`. Extra bytes in the
input are ignored.
:exc:`EOFError`, :exc:`ValueError` or :exc:`TypeError`.
:ref:`Code objects <code-objects>` are only supported if *allow_code* is true.
Extra bytes in the input are ignored.

.. audit-event:: marshal.loads bytes marshal.load

@@ -114,6 +132,9 @@ The module defines these functions:
This call used to raise a ``code.__new__`` audit event for each code object. Now
it raises a single ``marshal.loads`` event for the entire load operation.

.. versionchanged:: 3.13
Added the *allow_code* parameter.
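
On the reading side, a sketch under the same assumptions: ``loads_data_only`` is a hypothetical wrapper that asks ``loads`` to reject any input containing a code object, which may be useful when the bytes come from another Python version. Marshal data from untrusted sources remains unsafe to load either way; this only narrows which object types are accepted.

```python
import marshal
import sys


def loads_data_only(blob):
    """Unmarshal blob, rejecting code objects where the interpreter supports it."""
    if sys.version_info >= (3, 13):
        # Raises ValueError if the data contains a code object.
        return marshal.loads(blob, allow_code=False)
    # Before 3.13 there is no allow_code keyword; code objects are loaded.
    return marshal.loads(blob)


# allow_code defaults to True, so this serializes the code object as usual.
code_blob = marshal.dumps(compile("1 + 1", "<example>", "eval"))
data_blob = marshal.dumps({"n": 1})

try:
    loads_data_only(code_blob)
    code_rejected = False
except ValueError:
    code_rejected = True

print(loads_data_only(data_blob), code_rejected)
```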


In addition, the following constants are defined:

8 changes: 8 additions & 0 deletions Doc/whatsnew/3.13.rst
@@ -247,6 +247,14 @@ ipaddress
* Add the :attr:`ipaddress.IPv4Address.ipv6_mapped` property, which returns the IPv4-mapped IPv6 address.
(Contributed by Charles Machalow in :gh:`109466`.)

marshal
-------

* Add the *allow_code* parameter in module functions.
Passing ``allow_code=False`` prevents serialization and de-serialization of
code objects which are incompatible between Python versions.
(Contributed by Serhiy Storchaka in :gh:`113626`.)

mmap
----

1 change: 1 addition & 0 deletions Include/internal/pycore_global_objects_fini_generated.h


1 change: 1 addition & 0 deletions Include/internal/pycore_global_strings.h
@@ -276,6 +276,7 @@ struct _Py_global_strings {
STRUCT_FOR_ID(after_in_child)
STRUCT_FOR_ID(after_in_parent)
STRUCT_FOR_ID(aggregate_class)
STRUCT_FOR_ID(allow_code)
STRUCT_FOR_ID(append)
STRUCT_FOR_ID(argdefs)
STRUCT_FOR_ID(arguments)
1 change: 1 addition & 0 deletions Include/internal/pycore_runtime_init_generated.h


3 changes: 3 additions & 0 deletions Include/internal/pycore_unicodeobject_generated.h


26 changes: 26 additions & 0 deletions Lib/test/test_marshal.py
@@ -129,6 +129,32 @@ def test_different_filenames(self):
        self.assertEqual(co1.co_filename, "f1")
        self.assertEqual(co2.co_filename, "f2")

    def test_no_allow_code(self):
        data = {'a': [({0},)]}
        dump = marshal.dumps(data, allow_code=False)
        self.assertEqual(marshal.loads(dump, allow_code=False), data)

        f = io.BytesIO()
        marshal.dump(data, f, allow_code=False)
        f.seek(0)
        self.assertEqual(marshal.load(f, allow_code=False), data)

        co = ExceptionTestCase.test_exceptions.__code__
        data = {'a': [({co, 0},)]}
        dump = marshal.dumps(data, allow_code=True)
        self.assertEqual(marshal.loads(dump, allow_code=True), data)
        with self.assertRaises(ValueError):
            marshal.dumps(data, allow_code=False)
        with self.assertRaises(ValueError):
            marshal.loads(dump, allow_code=False)

        marshal.dump(data, io.BytesIO(), allow_code=True)
        self.assertEqual(marshal.load(io.BytesIO(dump), allow_code=True), data)
        with self.assertRaises(ValueError):
            marshal.dump(data, io.BytesIO(), allow_code=False)
        with self.assertRaises(ValueError):
            marshal.load(io.BytesIO(dump), allow_code=False)

    @requires_debug_ranges()
    def test_minimal_linetable_with_no_debug_ranges(self):
        # Make sure when demarshalling objects with `-X no_debug_ranges`
@@ -0,0 +1,3 @@
Add support for the *allow_code* argument in the :mod:`marshal` module.
Passing ``allow_code=False`` prevents serialization and de-serialization of
code objects which are incompatible between Python versions.