Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

blackbox_external_likelihood notebook is broken #4002

Closed
rpgoldman opened this issue Jul 6, 2020 · 14 comments
Closed

blackbox_external_likelihood notebook is broken #4002

rpgoldman opened this issue Jul 6, 2020 · 14 comments

Comments

@rpgoldman
Copy link
Contributor

Description of your problem

Investigating this ArviZ issue, I found that it was an attempt to recode this notebook: blackbox_external_likelihood.ipynb. Trying to run this, I find that it fails for a number of reasons:

  1. It requires python modules that are not in our requirements or development requirements. I managed to get by this, but only by using Jupyter magic to install new modules, which may be bad form.
  2. The cython configuration is specific to some linux, and there is no notion of how to modify it for other platforms. Presumably the %%cython magic could be modified based on platform, but I don't know how to do this.
  3. On my Mac, I dropped the linux specific directory from the cython config and it seemed to run to completion, but then when I tried to execute the cell that referenced the my_model C function, it errored out, saying that this function was not defined.
  4. This example uses a random variable as a value of observed, which does not seem to work (and even if it did, breaks ArviZ).

I don't know where to begin to fix this. I think it should either be fixed or removed, though.

Please provide a minimal, self-contained, and reproducible example.

Load this notebook into jupyter and try to execute it.

Please provide the full traceback.

---------------------------------------------------------------------------
NameError                                 Traceback (most recent call last)
<ipython-input-13-a114fa16c81c> in <module>
      7 ctrue = 3.   # true y-intercept
      8 
----> 9 truemodel = my_model([mtrue, ctrue], x)
     10 
     11 # make data

NameError: name 'my_model' is not defined

Versions and main components

arviz      0.8.3 (7cec6dd5ec38ded43ce184ddb1e18e66e70dacbb)
emcee      3.0.2
platform   1.0.8
theano     1.0.4
corner     2.0.1
pymc3      3.9.2  (78420727a5302e48834ede5c838d94ebad63cd1a)
numpy      1.19.0
cython     0.29.20
matplotlib 3.2.2
IPython    7.16.1
last updated: Sun Jul 05 2020 

CPython 3.7.7
IPython 7.16.1
watermark 2.0.2
  • Operating system: MacOS 10.15.5
  • How did you install PyMC3: pip -e from source
@fonnesbeck
Copy link
Member

This notebook seems problematic for long-term maintenance. Perhaps it would be better-suited for a blog post rather than a permanent fixture in the repository. Should we remove it?

@AlexAndorra
Copy link
Contributor

I agree with Chris' assessment -- this NB was already problematic when we did batch runs for 3.9.0 release.
The best case would indeed be to update it and turn it into a blog post, but I don't know whether the original author has time for that 🤷‍♂️

@junpenglao
Copy link
Member

This is a pretty good example and I would prefer it to be on the website so it is more complete. We do also have a few other examples that are difficult to maintain (e.g., the LDA one, the VI one that depends on the deprecated Keras version), not sure what is the best course of action here

@junpenglao
Copy link
Member

(Also, blackbox_external_likelihood is initially a blog post and we ask the author to contribute as an example)

@fonnesbeck
Copy link
Member

Perhaps we can have static examples that are not meant to be reproducible notebooks. If its already a blog post from the author, perhaps we can just link to it.

We talked at the last lab meeting about moving all of the notebooks out of the main repo and into their own. It might be time to do this.

@rpgoldman
Copy link
Contributor Author

@junpenglao Why is this a good example? Currently it blows up (even if you can get through the cython stuff) because it uses a free random variable as an observation. This is definitely something that ArviZ does not expect, because it messes up the notion of observed_data -- what's observed data that varies between samples?

Could the core message of this notebook -- explaining how to use a complex external function as likelihood -- be liberated from its use of a free RV as observation? In that case, I agree that it would be good to keep.

@SSchwoebel
Copy link

I am trying to set up MCMC sampling from a custom model. I am new to pymc and ran into this issue when trying to follow the example. Are there any other resources on how to set up sampling from a black box function you could point me to?

@OriolAbril
Copy link
Member

I remember @rpgoldman proposing to split current observed keyword in two, one that would keep the name observed and support only data that is constant for all samples and an extra one to allow passing also posterior parameters to DensityDist. So as far as I understand there are two viable options to solve this:

  • Consider observed in DensityDist accepting posterior parameters as a feature instead as a bug. This requires:
    • Documenting the behaviour clearly in DensityDist docs
    • Modifying ArviZ to stop assuming observed in DensityDist should go to observed_data group.
  • Implement the split Robert proposed, which would require:
    • Modify the code for DensityDist to add the new argument.
    • Modify the notebook to use the new argument
    • ArviZ will work without changes with this, but it would probably be nice to catch this error to output a more informative error pointing to the use of the new argument for DensityDist.

Have I missed something? @junpenglao @fonnesbeck ?

@twiecki
Copy link
Member

twiecki commented Sep 5, 2020

@OriolAbril I think we should go with the second option. Do you have time to implement this fix?

@OriolAbril
Copy link
Member

Not right now but I should have time in a couple weeks from now. I'll comment here whenever I get to it to allow anyone who has the time to comment here and work on it without duplicating any work

@OriolAbril
Copy link
Member

I have started working on this, I'll run some tests and hopefully submit a PR in the following days

@GuillaumeAEC
Copy link

Hello,
It seems to me that the the problem is not fixed yet. In my opinion, "black-box" external likelihoods are powerful ways in PyMC3 to deal with real inference problems. Please do you have any news ? Thanks a lot.

@ilia-sharafutdinov
Copy link

Is there any progress on this?

@OriolAbril
Copy link
Member

I'm not sure I know exactly what "this" is so I'll try to be thorough:

  • If "this" is fixing the blackbox external likelihood notebook. It will get done at some point. You hould not track this issue though and instead watch Blackbox likelihood pymc-examples#48
  • If "this" is getting DensityDists like the ones in the notebook where you pass a dict and some values of that dict are posterior parameters instead of actual observed variables, this is actually fixed already, but there are so many different issues and discourse topics on the exact same topic I lost track long ago and don't know anymore where I have shared the solution and where I haven't. You can use pm.sample(..., return_inferencedata=True, idata_kwargs={"density_dist_obs": False} to indicate that the observed argument of your DensityDist does not represent observed values. Note that for DensityDists that take only observed data as input, everything has always worked independently of the input type of observed.
  • If "this" is about the givens argument in DensityDist, or other changes in DensityDist that would "fix" this and avoid needing this special keyword, this won't be fixed in v3.x. The root issues of all this should be fixed with pymc3>=4.0 _if DensityDist still exists there. v4 will be more powerful and functional and there may be better ways to do this that make DensityDist obsolete and useless.

I am therefore closing this issue in favour of pymc-devs/pymc-examples#48 because the notebook doesn't live in this repo anymore but on pymc-examples.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.

9 participants