In Python, writing a metaclass lets you create new kinds of class objects whose behaviour you can control.
By analogy (and with a bit of abuse of English), writing a metamodule lets you create module objects with customized behaviour.
metamodule.py is a single-file, permissively-licensed Python library that makes it easy and safe to use custom module subtypes as the public interface for your library. For example, ordinarily in Python it's easy to issue a deprecation warning when someone calls a deprecated function (mymodule.foo()), but it's very difficult to issue a deprecation warning when someone accesses a deprecated constant (mymodule.FOO). Another commonly-requested (though somewhat dangerous) feature is the ability to delay importing a submodule until the first time it's accessed (mymodule.submodule.subfunction()). With metamodule, these are both easy to solve: we just give mymodule a custom __getattr__ method that does what we want. (And in fact, you don't even need to write this __getattr__ -- metamodule includes an implementation that provides both of the above features out-of-the-box.)
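To make the idea concrete, here is a minimal sketch of the kind of __getattr__ such a module subclass could use to warn on access to a deprecated constant. (All names here are hypothetical; this is not metamodule's built-in implementation, just an illustration of the mechanism.)

import types
import warnings

class DeprecatingModule(types.ModuleType):
    # Deprecated constants live here instead of in the module globals,
    # so normal attribute lookup fails and __getattr__ gets a chance to run.
    _deprecated = {"FOO": 1}

    def __getattr__(self, name):
        if name in self._deprecated:
            warnings.warn("{} is deprecated".format(name),
                          DeprecationWarning, stacklevel=2)
            return self._deprecated[name]
        raise AttributeError(name)

# With metamodule, you would activate this via
# metamodule.install(__name__, DeprecatingModule) at the top of your package.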
In the source directory of this project, try starting a Python REPL and running:
>>> import examplepkg
examplepkg is a module object:
>>> import types
>>> isinstance(examplepkg, types.ModuleType)
True
But it's not a regular module object; it's a custom subclass:
>>> examplepkg
<FancyModule 'examplepkg' from 'examplepkg/__init__.py'>
And this subclass has superpowers:
# Automatically loads the submodule on first access:
>>> examplepkg.submodule.subattr
... submodule loading ...
'look ma no import'
# Imports are cached, so future usage is just as fast as regular access:
>>> examplepkg.submodule.subattr
'look ma no import'
# Accessing this attribute triggers a warning:
>>> examplepkg.a
__main__:1: FutureWarning: 'a' attribute will become 2 in next release
1
# But regular attributes continue to work fine, with no speed penalty:
>>> examplepkg.b
2
# reload() works fine (except on CPython 3.3, which is buggy)
>>> import imp
>>> imp.reload(examplepkg)
<FancyModule 'examplepkg' from 'examplepkg/__init__.pyc'>
# And functions defined in the package use the same globals dict
# as the package itself. (On py2, replace .__globals__ with .func_globals)
>>> examplepkg.__dict__ is examplepkg.f.__globals__
True
To accomplish this, all we had to do was put the following code at the top of examplepkg/__init__.py:
# WARNING: this should be placed at the *very top* of your module,
# *before* you import any code that might recursively re-import
# your package.
import metamodule
metamodule.install(__name__)
del metamodule

# Any strings in this set name submodules that will be lazily imported:
# NB: you probably shouldn't use this unless you have a real,
# specific need for it, since it can cause import errors and other
# side-effects to appear at weird and confusing places.
__auto_import__.add("submodule")

# Attributes that we want to warn users about:
__warn_on_access__["a"] = (
    # Attribute value
    1,
    # Warning issued when attribute is accessed
    FutureWarning("'a' attribute will become 2 in next release"))
You can also define your own ModuleType subclass and pass it as the second argument to metamodule.install. Your class can do anything you can regularly do with a Python class -- define special methods like __getattribute__, use properties, have a custom __repr__, whatever you want. Note that your class instance's __dict__ will be the module globals dict, so assigning to self.foo is equivalent to creating a global variable in your module named foo, and vice versa.
The one thing to watch out for is that your class's __init__ will not be called -- instead, you should define a method __metamodule_init__, which will be called immediately after your metamodule class is installed.
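For example, here is a sketch of what a custom subclass might look like in a hypothetical mypkg/__init__.py. The class name and attributes are made up for illustration; only the install() call and the __metamodule_init__ hook come from metamodule's API.

import time
import types

class VersionedModule(types.ModuleType):
    def __metamodule_init__(self):
        # Runs right after install(); __init__ itself is never called.
        # Because self.__dict__ is the module globals dict, this also
        # creates a module-level global named _loaded_at.
        self._loaded_at = time.time()

    @property
    def age(self):
        # Properties work, because attribute lookup goes through the class.
        return time.time() - self._loaded_at

    def __repr__(self):
        return "<VersionedModule {!r}>".format(self.__name__)

# The class has to be defined before install(); keep any imports that
# could recursively re-import your package below this point.
import metamodule
metamodule.install(__name__, VersionedModule)
del metamodule

After this, mypkg.age reports the seconds since import, and repr(mypkg) uses the custom __repr__.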
Metamodule is currently tested against:
- CPython 2.6, 2.7
- CPython 3.3, 3.4, 3.5, 3.6, 3.7-dev
I suspect it will work on pretty much every version of CPython that has a working ctypes; I just don't have convenient access to older versions to test.
The latest PyPy3 nightlies should also work, and the PyPy3 5.9 release when it comes out. PyPy2 is currently not supported, and as far as I know neither are Jython, IronPython, etc.; hopefully they will catch up with Python 3.5 and start allowing __class__ assignment on module objects.
Python has always allowed these kinds of tricks to some extent, via the mechanism of assigning a new object to sys.modules["mymodule"]; this object can then have whatever behaviour you like. This can work well, but the end result is that you have two different objects that both represent the same module: your original module object (which owns the globals() namespace where your module code executes), and your custom object. Depending on the relative order of the assignment to sys.modules and imports of submodules, you can end up with different pieces of code in the same program thinking that mymodule refers to one or the other of these objects. If they don't share the same __dict__, then their namespaces can get out of sync; alternatively, if they do share the same __dict__, then your custom object can't subclass ModuleType (module objects don't allow reassignment of their __dict__ attribute), which breaks reload(). All in all, it's a bit of a mess. It's possible to write correct code using this method if you are extremely careful -- for example, apipkg is a somewhat similar library that uses this approach, but to keep things workable it requires that your library's public interface be defined entirely by apipkg calls. There's no easy way to take a legacy Python package and incrementally switch it over to using apipkg.
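For reference, here is a minimal sketch of that classic sys.modules trick (module and class names hypothetical), showing how it ends up with two objects representing one module:

# mymodule.py
import sys

class _FakeModule(object):
    # Not a ModuleType subclass, so it has its own separate __dict__.
    def __init__(self, real):
        self._real = real
    def __getattr__(self, name):
        return getattr(self._real, name)

sys.modules[__name__] = _FakeModule(sys.modules[__name__])

# From now on, "import mymodule" hands callers the _FakeModule
# instance, while the code in this file keeps executing in the
# original module's globals() -- two objects, two namespaces.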
The key feature that metamodule provides is: it makes it easy to set up sys.modules["mymodule"] so that it is both (a) an instance of a class that you control, so you can have arbitrary properties etc., AND (b) a regular subclass of ModuleType with your __init__.py's globals() as its __dict__ attribute, so that you can continue using the usual Python approach to defining your API.
This combination makes it easy and safe to transition an existing library to using metamodule -- just add a call to metamodule.install at the top of your __init__.py, and nothing at all will change, except that you can now start taking advantage of your new superpowers at your leisure.
How do we do it? On CPython 3.5 and later, this is easy: metamodule uses __class__ assignment on module objects (a feature that was added to CPython explicitly to support this usage).
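You can reproduce that core trick by hand at a CPython 3.5+ REPL. This is a sketch of the underlying mechanism, not of metamodule's actual code:

>>> import email
>>> import types
>>> class MyModule(types.ModuleType):
...     def __repr__(self):
...         return "<custom module {!r}>".format(self.__name__)
...
>>> email.__class__ = MyModule
>>> email
<custom module 'email'>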
On CPython 3.4 and earlier, it uses ctypes hacks. These are ugly, but safe so long as no one goes back in time and changes the internal memory layout of module objects on old, already-released versions of Python. (Which is not going to happen.) Basically, we instantiate a new object of the specified ModuleType subclass, and then we use some arcane knowledge of how these objects are laid out in order to swap the guts of your original module and the new object. Then we assign the new object into sys.modules. This preserves the key invariant that at any given point there's exactly one module that owns your globals dict, and it's in sys.modules. It does, however, mean that things will go horribly wrong if you call metamodule.install after someone else has already imported your module. So unless you only want to support Python 3.5+, make sure to call metamodule.install right at the top of your module definition file.
These two tricks together let us safely support all versions of CPython, and as alternative implementations like PyPy catch up with CPython 3.5 in supporting __class__ assignment, we'll support those too.
1.2:
- In FancyModule, don't include __warn_on_access__ attributes in dir(module), to avoid problems with test discovery trying to iterate over all attributes.
1.1:
- When looking up __metamodule_init__, go straight to the class without checking the instance. This makes our behavior more consistent with regular __init__, and avoids accidentally triggering __getattr__. (Thanks to Antony Lee for the report + fix.)
1.0:
- First public release.