Fixed Posterior plot errors with boolean array. #1707
I think we could use if dtype kind == "f" -> elif dtype kind == "i" -> else (dtype kind == "b") instead of converting to int and storing is_bool.
We may also want to update the appearance of the plot a bit more? @sethaxen? For example, showing the y axis in the boolean case, or hiding the mean and annotating both bars with count (xx%) on top of them, things like that.
@OriolAbril Can the dtype of values be other than f, i or b?
That's a good question. In principle a PPL could produce samples of strings or objects, though all the ones I am aware of currently would require first assigning integers to the strings/objects and then producing integer samples. Do we assert anywhere in arviz that the dtype is among these types?
Yes, I think it makes sense to label the bars with count and percent instead of the mean, and if this is done, an axis is not necessary.
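The count (xx%) annotation suggested above could be computed roughly as follows. This is only a sketch: `boolean_bar_labels` is a hypothetical helper, not something in arviz, and it assumes a non-empty array of boolean draws.

```python
import numpy as np


def boolean_bar_labels(values):
    """Build "count (percent)" labels for the False and True bars.

    Hypothetical helper for illustration; assumes `values` is non-empty.
    """
    flat = np.asarray(values, dtype=bool).ravel()
    n_true = int(flat.sum())
    n_false = int(flat.size - n_true)
    return {
        "False": f"{n_false} ({n_false / flat.size:.1%})",
        "True": f"{n_true} ({n_true / flat.size:.1%})",
    }


labels = boolean_bar_labels(np.array([[True, False], [True, True]]))
```

With labels like these drawn on top of each bar, the y axis carries no extra information, which matches the point that an axis is not necessary.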
True/False is an edge case from category dtype with 2 categories. If we have a flag for category, then we could do this kind of (bar) plot for multiple categories too.
So, does that mean |
Sorry about the delay. I think this is a good starting point. We can then improve/refactor the else case to handle the category case too, but before that we have to decide on an API (have this info in InferenceData, as an argument, or both), which will probably take some time.
Force-pushed from fda2fae to 99aa489.
Codecov Report
@@             Coverage Diff             @@
##             main    #1707      +/-   ##
===========================================
- Coverage   90.81%   79.55%   -11.26%
===========================================
  Files         114      114
  Lines       12332    12348      +16
===========================================
- Hits        11199     9824    -1375
- Misses       1133     2524    +1391
We need to add a test that has boolean data. (Check the current tests and see if you can add boolean data as input to the test for this plot.)
I think I need to create a function to test with boolean data. Right?
Yes.
Force-pushed from 94dcae8 to 03bbf34.
I think there are now some errors that are not related to my code changes.
Force-pushed from 03bbf34 to ac335d2.
Force-pushed from c465126 to ab02a78.
@@ -319,18 +319,23 @@ def format_axes():
 rug=False,
 show=False,
 )
-else:
+elif values.dtype.kind == "i":
The initial condition is kind == "kde" and dtype kind == "f", so this condition is too restrictive: a float array with kind == "hist" would not match any branch. I think it should be dtype kind == "i" or (dtype kind == "f" and kind == "hist").
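A minimal sketch of the branching being proposed, to make the three cases explicit. `choose_plot_branch` is a hypothetical helper used only for illustration; the real function draws the plot inside each branch rather than returning a name, and the error for other dtype kinds is an assumption.

```python
import numpy as np


def choose_plot_branch(values, kind="kde"):
    """Return the name of the branch that would handle `values`.

    Hypothetical helper: the branch names stand in for the plotting code.
    """
    dtype_kind = np.asarray(values).dtype.kind
    if kind == "kde" and dtype_kind == "f":
        return "kde"
    if dtype_kind == "i" or (dtype_kind == "f" and kind == "hist"):
        return "hist"
    if dtype_kind == "b":
        return "bar"
    # Assumption: anything else (strings, objects, ...) is rejected explicitly.
    raise TypeError(f"unsupported dtype kind: {dtype_kind!r}")


branches = [
    choose_plot_branch(np.array([0.1, 0.2])),               # float, default kind
    choose_plot_branch(np.array([0.1, 0.2]), kind="hist"),  # float histogram
    choose_plot_branch(np.array([1, 2, 3])),                # integer samples
    choose_plot_branch(np.array([True, False])),            # boolean samples
]
```

Writing the boolean case as its own branch also avoids converting to int and carrying an is_bool flag around.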
Yeah, right. I should have done it that way. Thanks for pointing it out.
Tests exist because we all do this, which is also why writing tests is important everywhere.
Done.
One more observation: now that there is an added kind of plot, bar, for boolean values, maybe we should set kind = 'bar' somewhere and also accept kind = 'bar' as an argument.
Force-pushed from ab02a78 to ed6f1c1.
Were tests passing locally? Whenever I get stuck on one or a couple tests I run them locally, and only those so I can iterate fast. pytest has a lot of features to select which subset of tests to run: https://docs.pytest.org/en/6.2.x/reference.html#command-line-flags. To begin with you have to call |
def test_plot_posterior_boolean():
    data = np.random.choice(a=[False, True], size=(4, 100))
    axes = plot_posterior(data)
    plt.draw()
I think this is not needed for tests (or is it?)
The answer to this question may look weird at first.
In a notebook, assert axes.get_xticklabels() runs fine when run in the next cell, but returns an empty list when run in the same cell that contains plt.plot().
Found a solution here.
(My earlier comment got duplicated. When I deleted one, both got deleted 🤦🏼)
Ok, then this can be removed (we don't run tests in notebooks)
Tests were failing when I removed this, because get_xticklabels() then returns an empty list.
Interactive users who want to run axes.get_xticklabels() can do so by running it in a different cell, but non-interactive users have to call plt.draw() before using axes.get_xticklabels().
A clearer explanation:
In interactive mode in IPython, a draw() is automatically triggered after each function that has a plotting action. In non-interactive environments, the draw action is executed only at the end, which saves a lot of run time.
But maybe we can call plt.draw() inside our posterior plot function and remove it from here.
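The draw-before-reading behaviour described above can be reproduced outside arviz with a small sketch. The categorical ax.bar call and the `drawn_xticklabels` helper are assumptions made for illustration (the PR's actual drawing code is not shown here), and the Agg backend is chosen only so that no GUI is required.

```python
import matplotlib

matplotlib.use("Agg")  # non-interactive backend that renders without a GUI
import matplotlib.pyplot as plt


def drawn_xticklabels(ax):
    """Force a draw so tick label text is populated, then return the text.

    Hypothetical helper for illustration.
    """
    ax.figure.canvas.draw()
    return [label.get_text() for label in ax.get_xticklabels()]


fig, ax = plt.subplots()
ax.bar(["False", "True"], [40, 60])  # assumed stand-in for the boolean bar plot
labels = drawn_xticklabels(ax)
plt.close(fig)
```

Drawing the figure's canvas directly does the same job as plt.draw() for this purpose, but acts on one specific figure instead of whichever figure is current.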
Yes, tests are really run from the command line:
python -m pytest path/to/tests
So keeping them clean is OK.
@OriolAbril This is a much better approach. Thanks!
Force-pushed from c12fefa to 5db0960.
Force-pushed from 5db0960 to 1db9b3c.
I am unable to understand why Codecov shows that a lot of code becomes uncovered through these changes. It shows that the whole matplotlib posterior plot function becomes uncovered!
If the Azure job fails, nothing is uploaded to Codecov, so if all jobs fail, the coverage is 0.
Force-pushed from 5767d59 to bf9141c.
    data = np.random.choice(a=[False, True], size=(4, 100))
    axes = plot_posterior(data)
    assert axes
    assert axes.get_xticklabels()[0].get_text() != 0
I know this is not the best way to check whether the xticklabels are correct. It should be something like assert axes.get_xticklabels()[0].get_text() == 'False'.
When I tried it locally, the test passes only if I pass show=True to the plot function call. With show=False (the default) it fails with an assertion error:
AssertionError: assert ' ' == 'False'
The Azure jobs fail with the same error even when show=True is pushed.
Maybe it could be solved by using a non-interactive backend, or one that writes directly to a file? I don't really know what is happening, but it would not be surprising if Azure had some extra checks preventing GUIs, pop-ups or other interactive objects.
I already tried calling plt.draw() in the matplotlib backend file, and it didn't work! I guess we need to call plt.draw() in the test function before the assert, or maybe leave the test as it is if that is an "okayish" way to test, even if not a good one.
Thanks for the fix.
Description
Plotted bar instead of hist for a boolean value.
Also, should HDI also be visible? I have assumed no.
Fixes #1694
Checklist