This issue was moved to a discussion.
You can continue the conversation there. Go to discussion →
tools for writing out parameters to a Dolang configuration file #446
Comments
As with other things, this is something that is best handled by crafting a single "template" example and then adapting/improving the template by thinking about the extent to which the template could be used for various different projects. I think good candidates would be documentation examples; you might try ConsPortfolioModelDoc, for example. |
Notes from the sprint meeting this morning:
I'll work on a demo of ways to work with this after some other directory cleanup (i.e. #440 ) |
I've been doing research into how other scientific simulation libraries handle this problem.
These are the libraries I included in my survey, with some notes about how each managed parameters in their examples.
- PySB: Systems Biology Modeling
- Mesa: Agent Based Modeling --- Individual examples have their own requirements.txt
- ActivitySim: Metropolitan Travel Activity -- Many, many configuration options
- SimuPy: Dynamic systems
- Nengo: Brain simulations
- nilearn: Neuro imaging
Special mention:
- Yggdrasil: Plant simulations |
Sounds like you did an admirably comprehensive job at looking at other
libraries.
I guess your next step should be to propose lessons, in the form of
suggestions for how we should do this systematically?
…On Fri, Dec 13, 2019 at 4:53 PM Sebastian Benthall ***@***.***> wrote:
I've been doing research into how other scientific simulation libraries
handle this problem.
What I've found is that there is no standardized way of doing it yet, but
there's some patterns that seem to hold across the libraries.
- None of these libraries has anything as tightly integrated as
REMARKs currently are for publications. The examples provided with the core
libraries vary in how 'complete' they are as useful demos or exploratory
tool, but I don't see any submodules or linking across repositories.
- There are almost never parameters hard-coded into the library
itself. Most of the time, these are coded into the python of an example
notebook or python file on a case-by-case basis. There are a couple
exceptions to this:
- ActivitySim has many, many parameters for its simulations; it
stores these in .yaml and .csv files
- Mesa has substantive model classes that are initialized at the
start of particular simulations or experiments. These have their default
parameters loaded as default arguments to the class initializer and
sometimes stored in static variables of the class itself.
These are the libraries I included in my survey, with some notes about how
each managed parameters in their examples.
PySB
Systems Biology Modeling
http://pysb.org/
--- Parameters coded into examples. No visible inheritance.
Mesa
Agent Based Modeling
https://pypi.org/project/Mesa/
https://forum.comses.net/t/mesa-an-agent-based-modeling-framework-in-python-3/7039
--- Individual examples have their own requirements.txt
--- models have default parameters in the __init__ method of the model class
ActivitySim
Metropolitan Travel Activity
https://activitysim.github.io/
-- Many, many configuration options
-- Everything provided in .csv or .yaml files, e.g.:
https://github.com/ActivitySim/activitysim/tree/master/example/configs
SimuPy
Dynamic systems
https://github.com/simupy/simupy
https://readthedocs.org/projects/simupy-personal/downloads/pdf/latest/
--- Parameters coded into each example file
--- No reuse -- library is immature
Nengo
Brain simulations
https://www.frontiersin.org/articles/10.3389/fninf.2013.00048/full
--- examples are all notebooks in the docs directory
https://github.com/nengo/nengo/tree/master/docs
--- parameters are all just entered as arguments in (very lightweight)
modeling interface, e.g.:
https://github.com/s72sue/std_neural_nets/blob/master/hopfield_network.ipynb
nilearn
Neuro imaging
http://nilearn.github.io/auto_examples/index.html#tutorial-examples
--- datasets are loaded by a data loading handler
--- many examples, with few parameters, which are hardcoded as method
arguments
E.g.
https://github.com/nilearn/nilearn/blob/master/examples/04_manipulating_images/plot_roi_extraction.py
Special mention:
Yggdrasil
Plant simulations
https://academic.oup.com/insilicoplants/article/1/1/diz001/5479575
https://github.com/cropsinsilico/yggdrasil
Software for combining models across programming languages to accommodate
different layers of abstraction.
--
- Chris Carroll
|
The main lessons from this work are:
I will make a PR with a demonstration of how this could work with the HARK core and a template example. One thing that occurred to me after I did the survey of simulation libraries, but which I think is important, is this:
This may be a more complex issue, better dealt with in a separate task. But I wanted to flag for future work the possibility of:
I've noticed that Dolo configuration files do separate meaningful categories of parameters from each other, which I think helps add clarity. |
So, back to business after a long teaching span.
Indeed in dolo the choice was made to completely separate the model part,
which is in the yaml file, from the solution instructions, which are in the
python code. However, the separation is not that strict, and one of the
reasons there are no command options in the yaml file is their lack so
far of api stability. I want the yaml files to stay.
A certain degree of separation is probably a good idea.
I'd suggest checking the toml language. It looks nice and simple. I didn't
adopt it because it isn't great for inputting equations (you need quotes
everywhere).
…On Sun, Dec 15, 2019, 9:15 PM Sebastian Benthall ***@***.***> wrote:
The main lessons from this work are:
- If there are a large number of parameters, it makes sense to put
them in a serial configuration file, like a .yaml
- If there are substantive models, it makes sense for default
parameter values to be loaded by the model's class when it initializes.
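The two lessons above can be combined in one small sketch: defaults live on the model class, and a serialized file supplies overrides. This is only an illustration under stated assumptions. `ToyModel`, `DEFAULTS`, and `from_config_text` are hypothetical names, not HARK API; the numeric values are placeholders; and JSON from the standard library stands in for a .yaml file so the example has no dependencies.

```python
import json


class ToyModel:
    """Hypothetical model class used for illustration; not part of HARK."""

    # Default parameter values live with the class (placeholder numbers).
    DEFAULTS = {"CRRA": 2.0, "DiscFac": 0.96, "Rfree": 1.03}

    def __init__(self, **overrides):
        unknown = set(overrides) - set(self.DEFAULTS)
        if unknown:
            raise ValueError(f"unknown parameters: {sorted(unknown)}")
        # Start from the defaults, then apply whatever the caller supplied.
        self.parameters = {**self.DEFAULTS, **overrides}

    @classmethod
    def from_config_text(cls, text):
        """Build a model from serialized overrides (JSON here, YAML in practice)."""
        return cls(**json.loads(text))


model = ToyModel.from_config_text('{"CRRA": 3.0}')
print(model.parameters)
```

With this shape, a configuration file only needs to list the values that differ from the defaults, and a misspelled parameter name fails loudly instead of being silently ignored.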
I will make a PR with a demonstration of how this could work with the HARK
core and a template example.
One thing that occurred to me after I did the survey of simulation
libraries, but which I think is important, is this:
- The libraries I looked at are mainly about defining a model's
content by giving it parameters, and delivering simulated output.
- *Some* HARK parameters are actually more about how the model is
executed, which is quite a bit different from model content. Maybe this
should be treated differently. This would imply a comparison with a
different set of Python libraries that emphasize model-fitting more, such as
scikit-learn and PyMC3.
This may be a more complex issue, better dealt with in a separate task.
But I wanted to flag for future work the possibility of:
- Distinguishing, when defining parameters, between those that are for
the model's substantive content (like CRRA and DiscFac), and
parameters that guide how it works operationally, perhaps like
CubicBool.
I've noticed that Dolo configuration files do separate meaningful
categories of parameters from each other, which I think helps add clarity.
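One possible way to express that distinction in Python, sketched here with standard-library dataclasses. The class names `ModelContent`, `SolverOptions`, and `Configuration` are hypothetical, the default values are placeholders, and classifying `CubicBool` as operational simply follows the tentative suggestion above.

```python
from dataclasses import asdict, dataclass, field


@dataclass(frozen=True)
class ModelContent:
    """Parameters that define the substantive content of the model."""
    CRRA: float = 2.0
    DiscFac: float = 0.96


@dataclass(frozen=True)
class SolverOptions:
    """Parameters that only guide how the model is solved."""
    CubicBool: bool = False


@dataclass(frozen=True)
class Configuration:
    """Keeps the two categories in separate, named sections."""
    model: ModelContent = field(default_factory=ModelContent)
    solver: SolverOptions = field(default_factory=SolverOptions)


config = Configuration(solver=SolverOptions(CubicBool=True))
print(asdict(config))
```

The nested dictionary returned by `asdict` has one section per category, which is the shape a sectioned configuration file would also have.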
|
(Replaced this comment with markdown version below)
|
Pablo, So, your suggestion would be that we adopt toml as our standard for For example, right now we have a giant ConsumerParameters.py file which we use with commands like
(and the relevant part of ConsumerParameters.py is excerpted below). We’ve had some discussions about alternative ways of doing this including:
One complexity is that we build our models by inheritance, so that for My inclination is for 3, but am curious about your thoughts.
|
#462 is intended to demonstrate an incremental step in the right direction here. In In this PR, With the exception of a few small changes, this PR could in principle be merged with no change to the API for downstream uses. |
@sbenthall, sounds like you've made a nice prototype (though I haven't had time to look at it yet). I'd be interested in your thinking about the pros and cons of my idea from the prior discussion, of having default values embedded in the definition of the class, then written out to a yaml file. As I see it: pro: There's one place to look both for how the parameter is used and what its default numerical value is
PS. Did you look into why Pablo suggested toml instead of yaml? |
@llorracc Ok, I'll be honest.
I don't like the idea of having the classes write the parameters out to a yaml file, with that yaml file stored in the version control, for your "countercounterpoint" reason. I think it's confusing.
I think it would accomplish the same thing, but be less confusing, if each class instance had a method that clearly reported to the user what its parameters are. These parameters might even be displayed in the __repr__ of the class. https://www.pythonforbeginners.com/basics/__str__-vs-__repr
I think these conversations are very tricky because they often depend on quite unscientific intuitions about what's "easier to use", which is a very noisy human variable. Most of the time when I have an opinion on this, it's based on my understanding of software engineering conventions. But there's always room to disagree.
I've now looked at TOML, as Pablo recommends. It looks quite similar to YAML. I think it's less widely used than YAML. My impression is that it would be idiosyncratic to adopt it. If it isn't as good as YAML for including equations, I think that's a dealbreaker for depending on it in the long run. https://gist.github.com/oconnor663/9aeb4ed56394cb013a20 |
Thinking through my feelings on this, I guess partly they boil down to an
aversion to a proliferation of different files that people have to get
loaded correctly and in the right locations. This may well be a bad
instinct on my part, to the extent that it is not conditioned on experience
with people getting everything via a `pip install` or whatever. My
fondness for embedding the default parameter values in the definition of
the class is that if we do it that way then whenever the person has a
definition of the class they are guaranteed to have a definition of
default parameter values. Our current setup requires them also to have
ConsumerParameters.py in the right place, and your extension requires them
_also_ to have a yaml file in the right place. To the extent that worrying
about people having "the right files in the right place" is anachronistic
on my part (because `pip install` or `git pull` will guarantee that), I'm on
board with your approach.
The one other point in favor of the 'defaults within the class definition'
approach is inheritance. Suppose we want ConsIndShockType to inherit all
the default parameters of PerfForesightCRRAType, and only be required to
specify values for values that are NOT default parameters for
PerfForesightConsumerType. I don't see how we do that with your setup of
standalone YAML files for each type, whereas it is inherent in my approach
of classes inheriting from parent classes and only having to define the
variables that are novel.
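A minimal sketch of that inheritance idea, assuming each class declares only its own new defaults in a class attribute and the initializer merges them along the class hierarchy. The classes below are toy stand-ins, not HARK's actual types, and the parameter values are placeholders.

```python
class AgentSketch:
    """Hypothetical base class that collects defaults from the whole hierarchy."""

    default_parameters = {}

    def __init__(self, **overrides):
        merged = {}
        # Walk the MRO from the most general class to the most specific one,
        # so a subclass can override a value declared by its parent.
        for klass in reversed(type(self).__mro__):
            merged.update(vars(klass).get("default_parameters", {}))
        # Values passed by the caller take precedence over every default.
        merged.update(overrides)
        self.parameters = merged


class PerfForesightSketch(AgentSketch):
    default_parameters = {"CRRA": 2.0, "DiscFac": 0.96}


class IndShockSketch(PerfForesightSketch):
    # Only the parameters that are new at this level are declared here.
    default_parameters = {"PermShkStd": 0.1}


agent = IndShockSketch(DiscFac=0.9)
print(agent.parameters)
```

The child class ends up with the parent's defaults plus its own, without restating any of the inherited values.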
I think it would accomplish the same thing, but be less confusing, if
each class instance had a method that clearly reported to the user what its
parameters are.
This sounds good; but to be clear, you are proposing a new standard here,
which is not yet implemented for any of our existing classes?
…On Mon, Dec 23, 2019 at 7:23 PM Sebastian Benthall ***@***.***> wrote:
@llorracc <https://github.com/llorracc> Ok, I'll be honest.
I don't like the idea of having the classes write the parameters out to a
yaml file, with that yaml file stored in the version control, for your
"countercounterpoint" reason. I think it's confusing.
I think it would accomplish the same thing, but be less confusing, if each
class instance had a method that clearly reported to the user what its
parameters are.
These parameters might even be displayed in the __repr__ of the class.
https://www.pythonforbeginners.com/basics/__str__-vs-__repr
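As a rough sketch of what such a report could look like, assuming the instance keeps its parameters in a dictionary (`ReportingAgent` and its `parameters` attribute are hypothetical, not an existing HARK class):

```python
class ReportingAgent:
    """Hypothetical class whose repr lists its current parameters."""

    def __init__(self, **parameters):
        self.parameters = dict(parameters)

    def __repr__(self):
        # Sort by name so the output is stable regardless of argument order.
        body = ", ".join(
            f"{name}={value!r}" for name, value in sorted(self.parameters.items())
        )
        return f"{type(self).__name__}({body})"


agent = ReportingAgent(DiscFac=0.96, CRRA=2.0)
print(repr(agent))  # ReportingAgent(CRRA=2.0, DiscFac=0.96)
```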
I think these conversations are very tricky because they often depend on
quite unscientific intuitions about what's "easier to use", which is a very
noisy human variable. Most of the time when I have an opinion on this, it's
based on my understanding of software engineering conventions. But there's
always room to disagree.
I've now looked at TOML, as Pablo recommends. It looks quite similar to
YAML. I think it's less widely used than YAML. My impression is that it
would be idiosyncratic to adopt it. If it isn't as good as YAML for
including equations, I think that's a dealbreaker for depending on it in
the long run.
https://gist.github.com/oconnor663/9aeb4ed56394cb013a20
--
- Chris Carroll
|
As a small step: Standardize the names of the parameter dictionaries in |
Referring to that proposed "small step"--a problem with naming the dictionaries in For example, here, This is a good example of why it would be better for classes to have default parameters that get inherited by their subclasses. Indeed, For this reason, I'm working on #466, which gives each class the parameters as overrideable defaults. |
With #442 merged, now it's easier to see why the current way of handling parameters is problematic. Because of the old way of handling parameters, there are downstream dependencies on a parameter file that shouldn't be in the HARK module:
Whatever solution we find for parameter management within HARK will, in the best case, also inform how REMARKs work as well. |
I think at yesterday's meeting we came to some conclusions about where to go with this.
For (c) and (d) there's questions about how specifically the YAML will be formatted. The idea behind (c) and (d) is to have model portability. This is quite a big lift. |
* loading init_perfect_foresight parameters by default. See #446
* loading init_idiosyncratic_shock into IndShochConsumerType on initialization #446
* using default ConsIndShock parameters when configuring RAexample
* loading kinked_R parameters into class by default
* Using default params in initializer for classes in ConsPrefShockModel
* removing act_T from default Cobb-Douglas parameters because it's already in __init__
* loading default params into initialization, last cases. see #446
* Fixing issue with multiple cycles arguments in IndShockExplicitPermIncConsumerType() and MedShockConsumerType()
Co-authored-by: Christopher Llorracc Carroll <[email protected]>
The next step in this issue is to allow the configuration of a HARK model from a YAML file. The best thing to do would be to use an existing YAML format for model configuration: Dolang! So this is related to #763 |
This also depends on having an organized representation of the parameters of a model, or #660 |
Simulations built with HARK currently start with long sections of Python code setting parameters.
This custom code is written in a few different styles; it would be cleaner to have these parameters in a configuration file. This is what Dolang does, and it is a pattern HARK would do well to adopt.
This also would be a step in the direction of automating the creation of tables for showing the mapping between notation and Python variables.
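To illustrate the table idea, here is a minimal sketch that assumes each parameter is stored together with its mathematical symbol. `PARAMETER_NOTES` and `notation_table` are hypothetical names, and the symbols and values shown are placeholders rather than an authoritative mapping.

```python
# Hypothetical metadata: Python variable name -> notation and value.
PARAMETER_NOTES = {
    "CRRA": {"symbol": r"\rho", "value": 2.0},
    "DiscFac": {"symbol": r"\beta", "value": 0.96},
}


def notation_table(notes):
    """Return a Markdown table mapping notation to Python variable names."""
    lines = ["| Notation | Python variable | Value |", "| --- | --- | --- |"]
    for name, info in notes.items():
        lines.append(f"| ${info['symbol']}$ | `{name}` | {info['value']} |")
    return "\n".join(lines)


table = notation_table(PARAMETER_NOTES)
print(table)
```

If the same metadata were read from a configuration file, the table and the model would be generated from a single source.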