Silent integer overflow in test_values #4279
Comments
Thanks @mschmidt87 for the detailed report! Pinging @StephenHogg here, who recently did some work (#4211) on initial model checks before sampling; could be related.
Yes, I looked a bit in the code, but I think the problem is in the construction of the test value for my observation variable (which becomes negative for very high values for some reason). But I don't really know where those test values are determined/initialized.
Okay, I have dug deeper into this and found the source of the error: In my example, I am specifying the data as a numpy array with dtype Now, in the which executes Now, Now, I am not familiar with all the intricacies of the code, but the comments suggest this casting might happen because PyMC3 thinks that I am passing an array of indices since the dtype is integer. I don't quite understand why that necessitates casting it to int32, but I presume that is related to theano?
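The wraparound described here can be reproduced with NumPy alone. The following is a minimal sketch, not the PyMC3 code path itself: it assumes the data is held as `int64` and is then cast to `int32`, as the discussion in this thread suggests. The array name `counts` is made up for illustration.

```python
import numpy as np

# Counts stored as int64; the last one exceeds the int32 maximum (2**31 - 1).
counts = np.array([10, 2**31 - 1, 2**31 + 5], dtype=np.int64)

# astype between integer types wraps around silently instead of raising.
as_int32 = counts.astype(np.int32)

print(as_int32)  # the last entry comes out negative
```

Values up to `2**31 - 1` survive the cast unchanged; anything larger wraps into the negative range without any warning, which matches the negative test values reported in this issue.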
So, to me, there are essentially two questions:
The reason that we cast to int32 and not to int64 is that the latter can overflow float64 calculations for large integer values, whereas int32 won't. You could try to see if this happens by casting your observations to float64. Maybe some
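A rough sketch of the suggested experiment, again using NumPy only. It shows that converting the observations to `float64` before handing them to the model keeps large counts positive. Whether PyMC3 then treats float observations for a Negative-Binomial likelihood the same way is an assumption that this snippet does not test; `counts` is an illustrative name.

```python
import numpy as np

# Observed counts, one of them larger than the int32 maximum.
counts = np.array([10, 2**31 + 5], dtype=np.int64)

# Convert to float64 up front so no integer downcast is applied later.
counts_float = counts.astype(np.float64)

# Integers with magnitude below 2**53 are represented exactly in float64.
round_trip_ok = bool(np.all(counts_float.astype(np.int64) == counts))
print(counts_float, round_trip_ok)
```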
Thank you for your reply. I could scale down my observations, I guess, at the (acceptable) cost of losing some precision in the model. However, could you explain what you mean by "we cast to int32 and not to int64 is that the latter can overflow float64 calculations for large integer values"? It's not clear to me what you mean by this.
I think he refers to operations like int64 * float64. Now let's appreciate for a moment that the recently introduced checks from #4211 surfaced a bug that would have caused very hard to debug problems in the MCMC. What if we just add a check in
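One possible reading of the int64 * float64 remark, sketched with NumPy: `float64` has a 53-bit significand, so an `int64` above `2**53` cannot always be represented exactly once it is promoted to `float64`. This is precision loss rather than a sign-flipping wraparound, and the thread does not state that this is the exact failure mode that was meant.

```python
import numpy as np

# An int64 value just above the largest range float64 can represent exactly.
big = np.int64(2**53 + 1)

# Mixed int64 * float64 arithmetic is carried out in float64.
product = big * np.float64(1.0)

print(type(product).__name__)     # float64
print(int(product) == int(big))   # False: the low bit was rounded away
```

Every `int32` value, by contrast, fits into `float64` exactly, which is consistent with the reason given above for preferring `int32`.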
Okay, I was not aware that this cannot happen with
There's a known issue with pymc3 ignoring bounds on priors and then ending up in impossible places (similar to the open issues documented here: ICB-DCM/pyPESTO#365, here: pymc-devs/pymc#4279, and here: https://discourse.pymc.io/t/square-decision-region-model-specification-initial-evaluation-failed/6413). We discovered this bug at around 3am and will continue attempting to fix it, but would appreciate help from the teaching staff if possible. Otherwise, we can rewrite the model before the next deadline.
Heads-up, the
I think the intX change was never included in V4 |
* add type guard for inX
* fix test for pandas
* fix posterior test, ints passed for float data

Closes pymc-devs#4279
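For illustration only, a guard of the kind the commit message hints at could look like the sketch below. The function name `checked_int32` and its behavior are hypothetical and are not taken from the actual PyMC change; the idea is simply to refuse a downcast that would wrap instead of performing it silently.

```python
import numpy as np


def checked_int32(values):
    """Hypothetical helper: cast integer data to int32, refusing values that would wrap."""
    arr = np.asarray(values)
    if not np.issubdtype(arr.dtype, np.integer):
        raise TypeError(f"expected integer data, got dtype {arr.dtype}")
    info = np.iinfo(np.int32)
    # Check the range before casting, because astype would wrap silently.
    if arr.size and (arr.min() < info.min or arr.max() > info.max):
        raise OverflowError("values do not fit into int32; cast to float64 or rescale")
    return arr.astype(np.int32)


small = checked_int32(np.array([1, 2, 3], dtype=np.int64))
print(small.dtype)  # int32
```

Raising early turns the silent sign flip into an explicit error at model construction time, well before sampling starts.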
Closing as stale, feel free to reopen if still relevant
Should be solved by #7114, although we don't use
The problem may still persist, but we would need a better reproducible example that's compatible with today's API. If anyone faces this problem, calling
Description of your problem
I am trying to fit a Negative-Binomial regression model. However, I am running into
SamplingError: Initial evaluation of model at starting point failed!
, even when I try to fit samples drawn from the model prior. I found out that the test values of my observed variable (which should be positive because of the Negative-Binomial distribution) are sometimes negative, which leads to the error. I then checked
model.check_test_point()
and get:
This happens because some of the test values of the observed variable are negative:
If I understand correctly, the test values of the observed variable are basically the same as the values I am trying to fit:
However, for very large sampled values, the test value is negative:
Thus, it seems that the test value is, for some reason, overflowing and turning negative for very large values.
Below is a minimal reproducer:
Versions and main components