-
-
Notifications
You must be signed in to change notification settings - Fork 407
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Fix credible_interval=None and plot_dist with ndim>2 #1115
Conversation
if type(values).__name__ == "DataArray": | ||
ax.hist(values.values.flatten(), bins=bins, **hist_kwargs) | ||
else: | ||
ax.hist(values, bins=bins, **hist_kwargs) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Here I'd use only values = np.asarray(values).faltten()
(and probably the same in bokeh file, haven't checked though). This will accept DataArrays stright away.
Moreover, it is probably a good idea to add some errors in the general file plots/distplot.py
in a similar way to plot_kde, so users get some informative error if they try to use the function on an inference data or a dataset.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, this solution is also working fine.
For, the second suggestion, Are following error appropriate?
"Cannot directly convert xarray.Dataset to numpy array. Instead, create an xarray.DataArray first or Use plot_posterior, plot_density, plot_joint" "or plot_pair instead of plot_dist"
and
" Inference Data object detected. Use plot_posterior instead of plot_dist"
Thanks for the PR, I love having the plots here in the description, if I may, some minor comments. Try to make the title as short and descriptive as possible (i.e. Avoid using the issue it fixes/is related in the title, we generally don't remember issues by name and GitHub does not render links in titles, only in description, moreover, if you use some keywords right before the issue number GitHub will do some magic and close the issue automatically once the PR is merged! And finally but not least, these are relevant fixes! They should be included in the changelog so users can see it has been fixed. |
Thanks for review @OriolAbril . I will make relevant changes. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There are three files relevant to any ArviZ plot, and plot_dist
is no exception:
plots/xxxplot.py
: The function exposed to users is the one defined here. It takes care of all preprocessing tasks common between backends.plots/backends/bokeh/xxxplot.py
: This function is called by the one above if using bokeh backend and does the actual plotting in bokehplots/backends/matplotlib/xxxplot.py
: Same as above but when using matplotlib bokeh
@@ -115,7 +115,7 @@ def _histplot_bokeh_op(values, values2, rotated, ax, hist_kwargs): | |||
if bins is None: | |||
bins = get_bins(values) | |||
density = hist_kwargs.pop("density", True) | |||
hist, edges = np.histogram(values, density=density, bins=bins) | |||
hist, edges = np.histogram(np.asarray(values).flatten(), density=density, bins=bins) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
we should move this before using values for bins and do values = np.asarray(values).flatten()
to use always flattened array.
@@ -91,8 +91,10 @@ def _histplot_mpl_op(values, values2, rotated, ax, hist_kwargs): | |||
raise NotImplementedError("Insert hexbin plot here") | |||
|
|||
bins = hist_kwargs.pop("bins") | |||
if bins is None: | |||
bins = get_bins(values) | |||
ax.hist(np.asarray(values).flatten(), bins=bins, **hist_kwargs) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
same as above
arviz/plots/distplot.py
Outdated
if isinstance(values, xr.Dataset): | ||
raise ValueError( | ||
"Cannot directly convert xarray.Dataset to numpy array." | ||
" Instead, create an xarray.DataArray first " | ||
"or Use plot_posterior, plot_density, plot_joint" | ||
"or plot_pair instead of plot_dist" | ||
) | ||
if isinstance(values, InferenceData): | ||
raise ValueError(" Inference Data object detected. Use plot_posterior instead of plot_dist") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Even though everything it says is correct, I would change it slightly to guide users towards our intended workflow, moreover, I figured we can have a single message for both cases:
if isinstance(values, xr.Dataset) or isinstance(values, InferenceData):
raise ValueError(
"InferenceData or xarray.Dateset object detected,"
" use plot_posterior, plot_density or plot_pair"
" instead of plot_dist"
)
They can always create a DataArray manually, but we have the var_names
argument so users should not create dataarrays manually. We also recently deprecated plot_joint
in favour of plot_pair
so error messages should not continue recommending its usage.
@OriolAbril As the pull request is approved can I add it to changelog under "Maintenance and fixes"? |
There is no need to wait for the PR to be approved to add to changelog, we often merge PRs without any approval too. We have no convention on using approved tag on reviews either, I generally use it to indicate that bulk or PR contribution is finished and only nits are pending. |
CHANGELOG.md
Outdated
@@ -5,7 +5,7 @@ | |||
### New features | |||
* Integrate jointplot into pairplot, add point-estimate and overlay of plot kinds #1079 | |||
### Maintenance and fixes | |||
|
|||
* Fixed parameter `credible_interval` and fix for Dataarray input with dimensions greater than 2(#1115) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Better use one bullet point per fix as they are actually unrelated:
* Fixed behaviour of `credible_interval=None` in `plot_posterior` (#1115)
* Fixed hist kind of `plot_dist` with multidimensional input (#1115)
LGTM |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
Description
This pull request fixes #1104 wherein changes have been made to distplot.py to flatten the array if it is a Dataarray and changes have been made to posteriorplot.py to change credible_interval to "auto" and added a condition for handling None for credible_interval.
Plots
In case of credible_interval=None
In case of credible_interval is not set, it will become auto-
Checklist