Saving the trace #248
Update: this seems to work for saving the posterior: `trace.posterior.to_netcdf('posterior.h5', engine='scipy')`, reloading with `from xarray import open_dataset` and `posterior = open_dataset('posterior.h5', engine='scipy')`. I think I could do this for all the groups in `trace`, I just haven't tried yet. Has anyone got any better ideas for saving/loading the trace object? I'm fairly new to these libraries.
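The per-group workaround above can be sketched end to end. The variable name, shapes, and file name below are illustrative, not from the issue, and this assumes `xarray` and `scipy` are installed:

```python
# Sketch of the workaround described above: save one group of the
# trace (the posterior) as netCDF via xarray's scipy engine, then
# reload it and check the round trip.
import numpy as np
import xarray as xr

rng = np.random.default_rng(0)
posterior = xr.Dataset(
    {"theta": (("chain", "draw"), rng.normal(size=(2, 100)))}
)

# engine="scipy" writes the netCDF3 format; this is also the layer
# that rejects variable names containing "/".
posterior.to_netcdf("posterior.nc", engine="scipy")

reloaded = xr.open_dataset("posterior.nc", engine="scipy")
assert np.allclose(posterior["theta"].values, reloaded["theta"].values)
```

Repeating this loop for each group (`posterior`, `prior`, `observed_data`, ...) would cover the whole trace, at the cost of one file per group.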
Hi @alexlyttle! Thanks for pinging in. This is a design-interface decision between the ArviZ devs and the PyMC4 devs: the namespacing in pm4 is probably why the error you saw is happening. Tagging here @canyon289 @ColCarroll @AlexAndorra and @aloctavodia, who I think are the more active ones on ArviZ amongst us. They probably know better what kind of API changes are needed; perhaps pushing the `engine` kwarg up to ArviZ's top-level API, or collecting kwargs?
Hi @alexlyttle, and thanks for the ping @ericmjl!
We have to look carefully into this. IIRC, the `/` character is not allowed in netCDF variable names, which would explain the error. I am not following pymc4 closely enough, so the following proposal may make no sense: if the variable names are of the form `model/variable`, perhaps the part after the last `/` could be used as the stored name.
Thanks @ericmjl for tagging the relevant people!

Hi @AlexAndorra, thanks for the response. I understand this isn't a priority atm but thought it worth mentioning. With regards to PyMC4 not being production-ready: this may be a separate issue for discussion elsewhere, but we have been using PyMC3 in our research for a while and our particular model takes hours to run. I recently converted it to PyMC4 and it takes ~10 mins thanks to GPU XLA! This is a game-changer for us and we look forward to the full release. We are making sure to keep track of the PyMC4 version we use, being fully aware it is pre-release, but wondered if it would be fine to use in published work (despite the disclaimer in the README)? My opinion is that if we run the final model in PyMC3 and the results differ negligibly, it's fine. If this needs more discussion I can get in touch another way or in a different issue.

Thanks @OriolAbril, this makes a lot of sense to me. I think in PyMC4 you can have nested models, so the scoped names may contain more than one `/`.
@alexlyttle, we're thrilled that your model achieves such a big speedup with pymc4 (and the GPU). Could I ask you a few questions? How big is your model? By this I mean: how many independent variables does it have to sample from, and how many observed data points?

About the naming convention used by pymc4, you are correct. We follow tensorflow's policy of name scopes, which can be nested. Each scoped name is separated by forward slashes, and the rightmost part is the plain name of the variable. We could do a quick fix by implementing a custom save function that escapes the `/` characters in variable names.

A more complicated but cleaner solution would be to take advantage of hdf5 and netCDF groups. The root level of each name scope can be considered a group, because it is a model that contains a group of random variables and potentially nested models (nested groups). I consider this to be cleaner, but it could potentially impact the arviz groups design, so I don't know what's the best way to proceed.
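The "quick fix" could be as small as a pair of renaming helpers applied to every variable before saving and after loading. This is a hypothetical sketch; the `::` separator is an arbitrary choice for illustration:

```python
# Escape pymc4's "/" scope separators into a netCDF-safe token, and
# invert the mapping when reloading. Any token that cannot appear in
# a scoped name would work as the replacement.
SEP = "::"

def escape_name(name: str) -> str:
    """Make a scoped name like 'model/linear/weights' netCDF-safe."""
    return name.replace("/", SEP)

def unescape_name(name: str) -> str:
    """Invert escape_name when reloading a saved trace."""
    return name.replace(SEP, "/")

names = ["model/theta", "model/linear/weights"]
safe = [escape_name(n) for n in names]
assert safe == ["model::theta", "model::linear::weights"]
assert [unescape_name(n) for n in safe] == names
```

Applied via xarray's `Dataset.rename` before `to_netcdf` and after `open_dataset`, this keeps the full scope information inside flat variable names.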
@lucianopaz thanks for your interest! It's possible the main contribution to the speed-up is that we are using a regression neural network (fully connected, 6 x 128 hidden neurons) trained using TensorFlow, which maps inputs to outputs (some of which are observables). This is converted to Theano in order to sample in our PyMC3 model (which cannot utilise the GPU efficiently). In PyMC4 we are able to sample the trained neural net entirely on the GPU. It's a hierarchical model with 5 independent hyper-parameters and 5 object-level parameters which feed into the neural net. There are 4 observables derived from the neural net output for ~100 data points. I hope this makes some sense!

I like the sound of the cleaner solution. I may have a go at doing something like that in my code and feed back here, as I'd be happy to help in some way towards this, but I'm no expert in hdf5 or netCDF.
I am not sure saving the variables in a nested group structure would be cleaner in practice. If the implementation and specification of the data structure were a little different, the situation would be different.

When saving, we'd have to give the string after the last `/` as the variable name and build the nested groups from the leading parts of each name.

When loading, we'd have to recursively inspect the groups manually in order to get all the variables, load them, and once loaded merge them all together into a single xarray dataset. Currently the whole posterior dataset is loaded at once, directly as an xarray dataset.

Moreover, the nested group structure would not be part of the loaded dataset, so users could not select variables based on the values before the `/`: that information would live only in the file layout, not in the merged dataset.

And finally, groups/datasets are independent, while coordinate values are shared within a group/dataset object (that is, the chain, draw, and extra dimension labels are stored once per group/dataset, not once per variable). Thus, using nested groups we would be storing the exact same coordinate labels multiple times.
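To illustrate the loading cost described above: with nested groups, every group must be walked recursively and the variables merged back into a single flat namespace. A toy sketch with plain dicts standing in for netCDF groups (all names hypothetical):

```python
# Recursively flatten a nested group hierarchy into flat "/"-joined
# variable names, mimicking what loading a nested-group trace into
# one xarray dataset would require.
def flatten_groups(group, prefix=""):
    flat = {}
    for key, value in group.items():
        path = f"{prefix}/{key}" if prefix else key
        if isinstance(value, dict):  # a nested group: recurse
            flat.update(flatten_groups(value, path))
        else:  # a leaf variable: record it under its full path
            flat[path] = value
    return flat

nested = {"model": {"theta": [1.0, 2.0], "linear": {"weights": [0.5]}}}
flat = flatten_groups(nested)
assert flat == {"model/theta": [1.0, 2.0], "model/linear/weights": [0.5]}
```

With flat `/`-prefixed names, by contrast, the whole dataset loads in one step and a selection like `posterior["model/theta"]` works directly, with the scope information still visible in the name.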
Thanks for the feedback @alexlyttle. About publishing, you have to be aware that at the moment pymc4 has no initial tuning for the NUTS sampler. This means that the mass matrix (the covariance matrix of the momentum proposals made at the beginning of each HMC step) will not be automatically tuned to the model you are using. This can lead to poor exploration of the state space, a suboptimal acceptance rate, and potentially more divergences during sampling.

This does not mean that you cannot use the traces that you get from pymc4, but it does mean that you have to diagnose your results carefully. You will have to look at the effective sample size (which depends on the autocorrelation of the samples in the trace), the Rhat (which depends on how well mixed each chain is and whether every chain reaches the same steady state), the warmup period in which the samples go from the initial proposal to the steady-state typical set, and finally the energy diagnostic, to check whether your MCMC explored the full posterior parameter space or whether you are getting biased answers.
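For reference, the Rhat diagnostic mentioned above can be sketched in a few lines of numpy. This is the classic Gelman-Rubin form on synthetic, well-mixed chains; the production versions (rank-normalized split-Rhat and ESS) live in ArviZ as `az.rhat` and `az.ess`:

```python
# Minimal Gelman-Rubin potential scale reduction factor. Values far
# above 1 indicate the chains have not mixed / converged.
import numpy as np

def gelman_rubin_rhat(chains):
    """chains: array of shape (n_chains, n_draws) for one variable."""
    m, n = chains.shape
    within = chains.var(axis=1, ddof=1).mean()      # W: within-chain variance
    between = n * chains.mean(axis=1).var(ddof=1)   # B: between-chain variance
    var_hat = (n - 1) / n * within + between / n    # pooled variance estimate
    return np.sqrt(var_hat / within)

rng = np.random.default_rng(1)
chains = rng.normal(size=(4, 1000))   # 4 iid chains: Rhat should be near 1
rhat = gelman_rubin_rhat(chains)
assert 0.95 < rhat < 1.05
```

This only covers one of the checks listed; divergences, warmup behaviour, and the energy diagnostic each need their own inspection.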
@OriolAbril, ok, so it seems we will have to write custom save/load functions either way.
Thanks for the advice @lucianopaz! Actually, I have been noticing divergences and some other issues with the chains. With regards to the lack of initial tuning, I will make sure to diagnose the results carefully.
I love the progress with PyMC4 so far. Are there plans to implement a `save_trace` function in PyMC4? I have been trying the following:
and it gives this error:
The illegal characters seem to be the '/' in the variable names. I have a lot of variables, but I could manually replace the characters if need be. However, being able to save them as they are would be better, especially for reloading the trace.

Do you have any suggestions on saving the trace that allow such characters? If the '/' character makes saving problematic, are there any alternatives which may be specified when PyMC4 creates the variable names?