Python is a flexible language often used as an interface to high-performance libraries written in less flexible native languages such as C/C++. ConfigState applies the same idea one level higher in the hierarchy: it provides a framework that bridges human-readable configuration languages (e.g. JSON or YAML) with Python.
With ConfigState, one can configure a complex hierarchy of Python classes and instantiate them from a single configuration file. To avoid pitfalls and improve the developer's experience, ConfigState prevents inconsistencies and raises explicit error messages when something fails. It is designed for low runtime overhead: most of the logic runs at class definition time.
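To make the last point concrete, here is a minimal sketch of how per-class work can be moved to class definition time. It is not ConfigState's actual implementation: `Field`, `Base` and `_fields` are hypothetical names used only for illustration. The field table is built once in `__init_subclass__`, so each instantiation only merges defaults with the provided values.

```python
class Field:
    """Hypothetical stand-in for a declared configuration field."""

    def __init__(self, default=None, doc=''):
        self.default = default
        self.doc = doc


class Base:
    """Collects Field attributes once, when a subclass is defined."""

    _fields = {}

    def __init_subclass__(cls, **kwargs):
        super().__init_subclass__(**kwargs)
        # Start from the parent's table so inherited fields are kept.
        fields = dict(cls._fields)
        for name, attr in vars(cls).items():
            if isinstance(attr, Field):
                fields[name] = attr
        cls._fields = fields

    def __init__(self, config=None):
        # Per-instance work is a cheap merge of defaults and overrides.
        values = {name: f.default for name, f in self._fields.items()}
        values.update(config or {})
        self.values = values


class Model(Base):
    learning_rate = Field(0.1, 'The learning rate')
    log_dir = Field('./', 'Path to a folder')


model = Model({'learning_rate': 0.5})
print(sorted(Model._fields))
print(model.values)
```

The trade-off of this design is that errors in field declarations surface at import time rather than at first use.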
The core component is the class ConfigState, which defines a pattern for representing Python classes with two distinct sets of attributes: a set of immutable configuration values and a set of mutable state values.
The configuration is set upon initialization and is passed through the constructor. Once initialized, the configuration is frozen and cannot change. The state variables constitute the mutable state of the instance and can be updated throughout its lifetime. The configuration and state variables are meant to represent the necessary and sufficient information required to clone the object's instance. They can be used to save and restore the object from disk.
- The configuration fields are defined using ConfigField class attributes. They can have typing constraints and be provided with a factory method for building complex types out of simpler/built-in ones.
- State variables are defined using StateVar attributes within the constructor. They can alternatively be defined as class properties using @stateproperty if arbitrary logic needs to run upon access or modification.
Implementing a class that inherits from ConfigState offers the following benefits:
- Provides clear semantic separation between the static configuration values and the mutable state variables.
- Configuration values and state variables are accessible through pythonic syntax and benefit from the IDE's type hinting feature.
- Using a configuration file, one can instantiate a complex hierarchy of Python classes. A config field may be another ConfigState object, which makes it possible to define tree-like ConfigState hierarchies.
- A config field can be a reference to a nested ConfigState object's config field. This allows coupling between config fields. For example, the path of a log folder can be injected into the nested ConfigState objects through the configuration of the topmost ConfigState object.
- ConfigState objects can be serialized/deserialized into/from a stream. They are pickleable and in some cases jsonable.
from pathlib import Path

from config_state import ConfigField
from config_state import ConfigState
from config_state import StateVar
import numpy as np

class Foo(ConfigState):
    learning_rate: float = ConfigField(0.1, 'The learning rate', force_type=True)
    license_key: str = ConfigField(None, 'License key', required=True)
    log_dir: Path = ConfigField('./', 'Path to a folder', type=Path)

    def __init__(self, config=None):
        super().__init__(config=config)
        self.weights: np.ndarray = StateVar(np.random.random((10, 10)),
                                            'The weights of the model')
        self.iteration: int = StateVar(0, 'Training iterations')
We can instantiate a ConfigState with a dictionary (which may have been obtained by loading a JSON or YAML file):
conf = {
    'learning_rate': 0.1,
    'license_key': 'ID123',
    'log_dir': 'logs/'
}
foo = Foo(conf)
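Such a dictionary could, for instance, be read from a JSON file with the standard library. The sketch below is only an illustration: the file name `foo_conf.json` is made up, and the file is written to a temporary folder first so that the example is self-contained.

```python
import json
import os
import tempfile

# Write a small JSON config file so the example is self-contained.
conf_text = '{"learning_rate": 0.1, "license_key": "ID123", "log_dir": "logs/"}'

with tempfile.TemporaryDirectory() as tmp_dir:
    conf_path = os.path.join(tmp_dir, 'foo_conf.json')  # hypothetical file name
    with open(conf_path, 'w') as f:
        f.write(conf_text)

    # json.load returns a plain dict with the same keys as the file.
    with open(conf_path) as f:
        conf = json.load(f)

print(conf)
```

The resulting `conf` is a regular dictionary and could then be passed to the constructor, as in `Foo(conf)`.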
The configuration of foo can be summarized:
print(foo.config_summary())
Output:
learning_rate: 0.1
license_key: ID123
log_dir: logs
Values are accessible with pythonic syntax (the IDE should be able to perform type hinting and code completion):
assert isinstance(foo.learning_rate, float)
assert foo.learning_rate == 0.1
Config values are immutable:
foo.learning_rate = 0.2 # Not OK, raises 'AttributeError: Updating a conf field is forbidden'
But changing a state variable is ok:
foo.iteration += 1 # Ok, state variable
Missing required fields raise an exception:
conf = {
    'learning_rate': 0.1,
    'log_dir': 'logs/'
}
foo = Foo(conf)  # ConfigError: Configuring 'Foo': Those required fields have not been specified {'license_key'}
Configuring an undefined field raises an exception:
conf = {
    'color': 'red',
    'license_key': 'ID123'
}
foo = Foo(conf)  # ConfigError: Configuring 'Foo': Trying to update the conf field 'color' which has not been defined
Configuring a field with a value of an invalid type raises an exception:
conf = {
    'learning_rate': '0.1',
    'license_key': 'ID123'
}
foo = Foo(conf)  # ConfigError: Configuring 'Foo': Value `0.1` of type `<class 'str'>` is not compatible with specified type `float`
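The three failure cases above amount to simple checks on the configuration dictionary. Here is a rough sketch in plain Python, assuming a hypothetical `validate` helper and `SPEC` table. Neither is part of ConfigState, whose actual checks and messages are the ones shown above.

```python
# Hypothetical field table: name -> (expected type, required flag).
SPEC = {
    'learning_rate': (float, False),
    'license_key': (str, True),
    'log_dir': (str, False),
}


def validate(conf, spec=SPEC):
    """Return a list of problems found in conf; empty if conf is valid."""
    problems = []
    missing = {name for name, (_, required) in spec.items()
               if required and name not in conf}
    if missing:
        problems.append(f'missing required fields: {sorted(missing)}')
    for name, value in conf.items():
        if name not in spec:
            problems.append(f'unknown field: {name!r}')
        elif not isinstance(value, spec[name][0]):
            problems.append(f'{name!r} expects {spec[name][0].__name__}, '
                            f'got {type(value).__name__}')
    return problems


print(validate({'learning_rate': 0.1, 'log_dir': 'logs/'}))
print(validate({'color': 'red', 'license_key': 'ID123'}))
print(validate({'learning_rate': '0.1', 'license_key': 'ID123'}))
```

Collecting all problems in a list, rather than raising on the first one, is a choice made for this sketch only.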
A state variable can be defined as a property with the @stateproperty decorator. This is convenient when some logic needs to run while accessing or setting the variable.
from config_state import ConfigState
from config_state import stateproperty
import numpy as np

class Model(ConfigState):
    def __init__(self, config):
        super().__init__(config)
        self._weights: np.ndarray = np.random.random((10, 10))

    @stateproperty
    def weights(self) -> np.ndarray:
        '''Weights of the model'''
        return self._weights

    @weights.setter
    def weights(self, val):
        self._weights = val
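One way such a decorator can work is as a subclass of the built-in `property` that acts as a marker, so that state variables can later be discovered on the class. The sketch below illustrates that idea with hypothetical names (`stateprop`, `state_of`); it makes no claim about how @stateproperty is implemented.

```python
class stateprop(property):
    """Hypothetical marker: a property that also counts as a state variable."""


class Model:
    def __init__(self):
        self._weights = [0.0, 0.0]
        self.updates = 0

    @stateprop
    def weights(self):
        '''Weights of the model'''
        return self._weights

    @weights.setter
    def weights(self, val):
        # Extra logic runs on every assignment.
        self.updates += 1
        self._weights = list(val)


def state_of(obj):
    """Collect the values of all stateprop attributes defined on obj's class."""
    return {name: getattr(obj, name)
            for name, attr in vars(type(obj)).items()
            if isinstance(attr, stateprop)}


model = Model()
model.weights = (1.0, 2.0)
print(state_of(model))
print(model.updates)
```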
ConfigState objects are serializable if their config and state variables are serializable too. The state of an object is considered to be entirely encapsulated within the config values and the state variables. The state can be obtained with foo.get_state(), which returns an ObjectState instance. Such an object represents the serialized information of a ConfigState object.
import pickle

with open('foo.pkl', 'wb') as f:
    pickle.dump(foo, f)
with open('foo.pkl', 'rb') as f:
    foo2 = pickle.load(f)
In some cases, ConfigState objects are JSON serializable:
from config_state.serializers import Json

class JsonableFoo(ConfigState):
    log_dir: str = ConfigField('log_dir/', 'Path to output folder')
    learning_rate: float = ConfigField(0.1, 'The learning rate')

    def __init__(self, config=None):
        super().__init__(config=config)
        self.iteration = StateVar(0, 'Training iterations')

foo = JsonableFoo()
# saving
Json().save(foo, 'foo.json')
# loading
foo = Json().load('foo.json')
Content of foo.json:
{
  "type": "__main__.JsonableFoo",
  "config": {
    "__VERSION__": {
      "value": 1.0,
      "doc": "ConfigState protocol's version",
      "type": "builtins.float"
    },
    "log_dir": {
      "value": "log_dir/",
      "doc": "Path to output folder",
      "type": "builtins.str"
    },
    "learning_rate": {
      "value": 0.1,
      "doc": "The learning rate",
      "type": "builtins.float"
    }
  },
  "internal_state": {
    "iteration": {
      "value": 0,
      "doc": "Training iterations",
      "type": "builtins.int"
    }
  }
}
The Pickle and Json serializers are available as plugins:
serializer = Serializer({'class': 'Pickle'})
serializer.save(foo, 'foo.pkl')
If a ConfigField has a specified type but the type of the provided value is different, type is used as an implicit factory by calling type(value). This is useful for nested ConfigState objects:
class NestedFoo(ConfigState):
    license_key: str = ConfigField(type=str, required=True)
    foo: Foo = ConfigField(type=Foo,
                           doc='A ConfigState as config field',
                           required=True)

conf = {
    'license_key': '4321',
    'foo': {
        'learning_rate': 0.1,
        'license_key': 'ID123',
        'log_dir': 'logs/'
    }
}
nested_foo = NestedFoo(conf)  # Ok, nested_foo.foo is instantiated using conf['foo']
isinstance(nested_foo.foo, Foo)  # True
A factory can be explicitly provided through a callable:
from datetime import datetime

def date_factory(str_date):
    return datetime.strptime(str_date, '%Y-%m-%d %H:%M:%S')

class DateFoo(ConfigState):
    date: datetime = ConfigField(value='2019-01-01 00:00:00', type=datetime,
                                 doc='some date',
                                 factory=date_factory)

date_foo = DateFoo({'date': '2021-04-28 00:00:00'})
print(type(date_foo.date))  # <class 'datetime.datetime'>
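The implicit and explicit factory rules can be summarized with a small sketch. `build_value` is a hypothetical helper, not part of ConfigState's API, and it assumes that an explicit factory, when given, takes precedence over the type.

```python
from datetime import datetime
from pathlib import Path


def build_value(value, type_, factory=None):
    """Hypothetical helper mimicking the conversion rules described above."""
    if isinstance(value, type_):
        return value  # Already of the right type, nothing to do.
    if factory is not None:
        return factory(value)  # Assumption: an explicit factory wins.
    return type_(value)  # Otherwise the type itself acts as the factory.


def date_factory(str_date):
    return datetime.strptime(str_date, '%Y-%m-%d %H:%M:%S')


log_dir = build_value('logs/', Path)
date = build_value('2021-04-28 00:00:00', datetime, factory=date_factory)
print(type(log_dir))
print(type(date))
```

The same rule explains the nested case: calling the nested class on a dictionary is just another instance of using the type as a factory.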
It may happen that the full configuration of an object is not known at the time of its instantiation. In such a case, the specification of a field can be deferred to a later time using Ellipsis:
foo = Foo({'license_key': ...})
foo.license_key is Ellipsis # True
foo.license_key = 1337 # ok, we can update an Ellipsis
foo.license_key = 42 # Not OK, raises 'AttributeError: Updating a conf field is forbidden'
# Note: For convenience with configs defined within json or yaml files, strings '...' are interpreted as Ellipsis:
foo = Foo({'license_key': str('...')})
foo.license_key is Ellipsis # True
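The write-once behaviour shown above can be mimicked in plain Python by overriding `__setattr__`. This is only a sketch of the idea with a hypothetical `Deferred` class; ConfigState's own mechanism is not shown in this document and may differ.

```python
class Deferred:
    """Hypothetical object whose fields are frozen unless set to Ellipsis."""

    def __init__(self, **fields):
        # JSON and YAML have no Ellipsis literal, so the string '...' is
        # normalized to Ellipsis.
        normalized = {k: (... if v == '...' else v) for k, v in fields.items()}
        # Bypass our own __setattr__ to store the field table.
        object.__setattr__(self, '_fields', normalized)

    def __getattr__(self, name):
        # Only called when normal lookup fails, i.e. for field names.
        try:
            return self._fields[name]
        except KeyError:
            raise AttributeError(name) from None

    def __setattr__(self, name, value):
        # A field can be set only while it still holds Ellipsis.
        if self._fields.get(name) is not ...:
            raise AttributeError('Updating a conf field is forbidden')
        self._fields[name] = value


obj = Deferred(license_key='...', learning_rate=0.1)
print(obj.license_key is ...)
obj.license_key = 1337  # Allowed once: the field was deferred.
try:
    obj.license_key = 42  # The field is now frozen.
except AttributeError as err:
    print(err)
```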
A ConfigField can be a reference to a field of a nested ConfigState field, which simplifies the configuration of complex hierarchies:
class FooWithRef(ConfigState):
    foo: Foo = ConfigField(type=Foo)  # a nested ConfigState
    license_key = ConfigField(foo.license_key)  # Reference to a nested field

# FooWithRef.license_key is a reference to FooWithRef.foo.license_key,
# which simplifies the configuration. Instead of:
FooWithRef({'foo': {'license_key': 'ABC123'}})
# we can do:
foo_with_ref = FooWithRef({'license_key': 'ABC123'})
foo_with_ref.license_key == 'ABC123'  # True
foo_with_ref.foo.license_key == 'ABC123'  # True
foo_with_ref.foo.license_key is foo_with_ref.license_key  # True
A reference can point to another reference:
class FooWithRef2(ConfigState):
    foo_with_ref: FooWithRef = ConfigField(type=FooWithRef)
    license_key = ConfigField(foo_with_ref.license_key)  # A reference to another reference

foo = FooWithRef2({'license_key': 'ABC123'})
foo.foo_with_ref.license_key is foo.license_key  # True
foo.foo_with_ref.foo.license_key is foo.license_key  # True
foo.license_key == 'ABC123'  # True
A reference can point to multiple fields using a list or a tuple:
class SubFooWithMultiRef(ConfigState):
    foo1: Foo = ConfigField(type=Foo)
    foo2: Foo = ConfigField(type=Foo)
    license_key = ConfigField([foo1.license_key, foo2.license_key])  # Reference to foo1 and foo2's license_key field

# Now instead of:
conf = {'foo1': {'license_key': 'ABC123'}, 'foo2': {'license_key': 'ABC123'}}
SubFooWithMultiRef(conf)
# One can simply do:
foo = SubFooWithMultiRef({'license_key': 'ABC123'})
foo.license_key == 'ABC123'  # True
foo.foo1.license_key is foo.license_key  # True
foo.foo2.license_key is foo.license_key  # True
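At the level of plain dictionaries, the effect of references is roughly to copy a top-level value into the nested places it refers to. The sketch below approximates this with a hypothetical `resolve_references` function. It is not how ConfigState is implemented: the library declares references on fields, as shown above, rather than on raw dictionaries.

```python
import copy


def resolve_references(conf, references):
    """Copy top-level values into the nested paths they refer to.

    references maps a top-level key to a list of paths, each path being a
    tuple of keys leading to the nested field.
    """
    resolved = copy.deepcopy(conf)
    for key, paths in references.items():
        if key not in resolved:
            continue
        for path in paths:
            node = resolved
            # Walk down to the parent of the target, creating dicts on the way.
            for part in path[:-1]:
                node = node.setdefault(part, {})
            node[path[-1]] = resolved[key]
    return resolved


references = {'license_key': [('foo1', 'license_key'), ('foo2', 'license_key')]}
conf = resolve_references({'license_key': 'ABC123'}, references)
print(conf)
```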
A ConfigState class can be decorated with @builder, which registers the class as a builder. Its subclasses can then be decorated with @register, which registers them as plugins and enables their instantiation through the parent builder.
from config_state import builder
from config_state import register

@builder
class ColoredFoo(ConfigState):
    color: str = ConfigField(None, "Color", static=True)
    value: int = ConfigField(type=int, doc="Value")

@register
class RedFoo(ColoredFoo):
    color: str = ConfigField("Red", "Color", static=True)

@register
class BlueFoo(ColoredFoo):
    color: str = ConfigField("Blue", "Color", static=True)

colored_foo = ColoredFoo({'class': 'BlueFoo', 'value': 1})
print(type(colored_foo))  # <class '__main__.BlueFoo'>
print(colored_foo.color)  # Blue
print(colored_foo.value)  # 1
Builders can be organized in a hierarchy. For instance, we can define a master builder from which every builder inherits. An object is then built by specifying its path in the hierarchy:
@builder
class MasterBuilder(ConfigState):
    pass

@builder
@register
class ColoredFoo(MasterBuilder):
    pass

colored_foo = MasterBuilder({'class': 'ColoredFoo.BlueFoo', 'value': 1})
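The builder and plugin mechanism resembles a classic registry pattern. Here is a minimal, self-contained sketch of that pattern with a dotted hierarchy path. The decorators below only share their names with ConfigState's @builder and @register; their bodies, and the `build` function, are assumptions made for illustration.

```python
def builder(cls):
    """Give cls its own registry of named subclasses (hypothetical)."""
    cls._registry = {}
    return cls


def register(cls):
    """Record cls in the registry of its nearest builder ancestor."""
    for parent in cls.__mro__[1:]:
        if '_registry' in vars(parent):
            parent._registry[cls.__name__] = cls
            break
    return cls


def build(root, conf):
    """Instantiate the class named by conf['class'], e.g. 'A.B'."""
    conf = dict(conf)
    cls = root
    # Follow the dotted path through the nested registries.
    for name in conf.pop('class').split('.'):
        cls = vars(cls)['_registry'][name]
    return cls(**conf)


@builder
class MasterBuilder:
    def __init__(self, value=None):
        self.value = value


@builder
@register
class ColoredFoo(MasterBuilder):
    color = None


@register
class BlueFoo(ColoredFoo):
    color = 'Blue'


foo = build(MasterBuilder, {'class': 'ColoredFoo.BlueFoo', 'value': 1})
print(type(foo).__name__, foo.color, foo.value)
```

In this sketch each builder owns a separate registry, which is what lets a path such as 'ColoredFoo.BlueFoo' be resolved one segment at a time.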