Rewriting MCMCChains with Makie.jl+AlgebraOfGraphics #306

Open
ParadaCarleton opened this issue Jun 6, 2021 · 16 comments

Comments

@ParadaCarleton
Member

I've been discussing this with a couple people, and I think you could get MCMCChains to feature parity with Bayesplot within a few months if you reimplemented the code using Makie.jl+AlgebraOfGraphics. Makie's recipe system is almost the same as StatsPlots', so you could reuse most of the current code pretty easily. AlgebraOfGraphics is an extension for Makie adding ggplot2-style syntax, and my thought there is that you can reuse most of Bayesplot's code, which relies on ggplot2. The bigger bonus would be that this feature parity would be very easy to maintain, since any plot added to Bayesplot could quickly be added to Turing; in addition, AlgebraOfGraphics provides a lot of tools to users that make it easier for them to modify plots themselves if they'd like to do so. I think that having a package that can do these things would be extremely helpful in getting people to switch from R, since even ArviZ tends to lag behind Bayesplot a bit in terms of what it can do.

I'd be interested in helping with this project if there's any interest in it.

@devmotion
Member

I think it would be great if there would be support for Makie for Chains. However, as long as there is no lightweight recipe support for Makie, similar to RecipesBase for Plots.jl, it would be better to have a separate package for Makie support, similar to https://github.com/JuliaGaussianProcesses/AbstractGPsMakie.jl that adds Makie support to https://github.com/JuliaGaussianProcesses/AbstractGPs.jl (which already natively supports Plots.jl through RecipesBase). The main reason would be that plotting functionality is not the main focus of MCMCChains (yet) and that we want to avoid the really heavy Makie.jl dependency.

In general, I think an implementation should not be based on AlgebraOfGraphics but on regular Makie and only define @recipes (if needed) and the convert_arguments pipeline for Chains. While AlgebraOfGraphics is nice, it seems limiting if Makie users have to use it if they want to plot a Chains object. And all regular Makie recipes and plotting functionality are supported automatically by AlgebraOfGraphics, so you could still use the ggplot-like syntax if you want to.
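
As a rough illustration of the convert_arguments route, here is a minimal sketch. `ToyDraws` is a hypothetical stand-in type and not part of MCMCChains, and targeting the `PointBased` conversion trait is an assumption about which plot types one would want to support first:

```julia
using CairoMakie  # CairoMakie re-exports Makie

# Hypothetical stand-in for the draws of a single parameter; not an MCMCChains type.
struct ToyDraws
    values::Vector{Float64}
end

# Tell Makie how to turn ToyDraws into (iteration, value) points.
# The PointBased trait covers plot types such as `lines` and `scatter`.
function Makie.convert_arguments(P::Makie.PointBased, d::ToyDraws)
    iterations = collect(1:length(d.values))
    return Makie.convert_arguments(P, iterations, d.values)
end

draws = ToyDraws(cumsum(randn(200)))
fig = lines(draws)   # trace plot with plain Makie
scatter!(draws)      # the same conversion serves other point-based plots
```

Because the conversion is defined on the plot trait rather than on a plotting function, nothing here depends on AlgebraOfGraphics.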

@ParadaCarleton
Member Author

> I think it would be great if there would be support for Makie for Chains. However, as long as there is no lightweight recipe support for Makie, similar to RecipesBase for Plots.jl, it would be better to have a separate package for Makie support, similar to https://github.com/JuliaGaussianProcesses/AbstractGPsMakie.jl that adds Makie support to https://github.com/JuliaGaussianProcesses/AbstractGPs.jl (which already natively supports Plots.jl through RecipesBase). The main reason would be that plotting functionality is not the main focus of MCMCChains (yet) and that we want to avoid the really heavy Makie.jl dependency.

MakieCore is the equivalent of RecipesBase for Makie, but I don't see any problems with splitting off the Makie functionality.

> In general, I think an implementation should not be based on AlgebraOfGraphics but on regular Makie and only define @recipes (if needed) and the convert_arguments pipeline for Chains. While AlgebraOfGraphics is nice, it seems limiting if Makie users have to use it if they want to plot a Chains object. And all regular Makie recipes and plotting functionality are supported automatically by AlgebraOfGraphics, so you could still use the ggplot-like syntax if you want to.

I'm not sure I understand. You're right that you can mix any Makie recipes you'd like with AlgebraOfGraphics, so there's not really much cost involved with installing it. On the other hand, there's a huge upside in terms of time saved by using ggplot syntax that would let you carry over a lot of code from Bayesplot.

@devmotion
Member

> MakieCore is the equivalent of RecipesBase for Makie, but I don't see any problems with splitting off the Makie functionality.

It's not usable yet (requires MakieOrg/Makie.jl#998), and it will only work for truly lightweight recipes (see MakieOrg/Makie.jl#996). So it is still unclear when and how an equivalent/similar approach to RecipesBase would be available.

> On the other hand, there's a huge upside in terms of time saved by using ggplot syntax that would let you carry over a lot of code from Bayesplot.

It would be strange to require users to commit to AlgebraOfGraphics and the ggplot-like syntax if you can implement stuff as generic Makie recipes that can be used both with standard Makie and AlgebraOfGraphics. I don't think this should be guided by similarities with bayesplot.

@rikhuijzer
Contributor

rikhuijzer commented Jun 14, 2021

Maybe a section in the documentation, like the existing one for Gadfly, would be a good compromise? Like Gadfly, AlgebraOfGraphics is very high level, so there is less need to implement special glue code between Makie and MCMCChains.

For example, based on the MCMCChains docs:

using AlgebraOfGraphics
using DataFrames
using CategoricalArrays
using MCMCChains
using CairoMakie
using Random

n_iter = 400
n_name = 3
n_chain = 2

# `val` is only needed for the second example (the MCMCChains.plot(chn) equivalent).
val = randn(n_iter, n_name, n_chain) .+ [1, 2, 3]'
val = hcat(val, rand(1:2, n_iter, 1, n_chain))

chn = Chains(randn(100, 2, 3), [:A, :B])
df = DataFrame(chn)
# Make `chain` categorical so that each chain gets its own color.
df[!, :chain] = categorical(df.chain)

# One density of parameter A per chain.
layers = data(df) * mapping(:A; color=:chain) * AlgebraOfGraphics.density()
axis = (; ylabel="Density")
AlgebraOfGraphics.draw(layers; axis)

[image]

EDIT. And for the MCMCChains.plot(chn):

using Makie

chn = Chains(val, [:A, :B, :C, :D])
df = DataFrame(chn)
df[!, :chain] = categorical(df.chain)
# Long format: one row per iteration, chain, and parameter.
sdf = stack(df, names(chn), variable_name=:parameter)

# Shared mapping: one row of panels per parameter, one color per chain.
layer = data(sdf) * mapping(:value; color=:chain, row=:parameter)
scat = layer * visual(Lines)  # trace plots
dens = layer * AlgebraOfGraphics.density()  # marginal densities

fig = Figure(; resolution=(800, 600))
axis = (xlabel="Iteration", ylabel="Sample value")
draw!(fig[1, 1], scat; axis)
axis = (xlabel="Sample value", ylabel="Density")
draw!(fig[1, 2], dens; axis)

[image]

(Note that the axis ranges are linked though, which isn't the case for MCMCChains.plot.)

@cpfiffer
Member

I kinda love it and would be happy to see this in the docs. I think also a glue-on package that handles some common interactions between MCMCChains and Makie/AlgebraOfGraphics would be warranted, but I am generally opposed at this time to rewriting the internals to use Makie/AlgebraOfGraphics. For now -- though I may revise my opinion later because DAMN those are some good-looking plots.

@devmotion
Member

It's nice to see how much functionality is already provided for free by the Tables interface. Probably a separate MCMCChains-Makie package would be helpful to define custom plots or specific conversion rules (e.g., for default plot types).

@rikhuijzer
Contributor

rikhuijzer commented Jun 14, 2021

> specific conversion rules (e.g., for default plot types).

@devmotion, what do you mean by this exactly? I'm afraid that I don't understand.

@devmotion
Member

It shouldn't be necessary to construct a DataFrame. Chains supports the Tables.jl interface and AlgebraOfGraphics can deal with any Tables.jl input.
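
A quick way to see this without any plotting package is to query the table directly. This is a minimal sketch that assumes Tables.jl is installed alongside MCMCChains; which extra columns appear besides the parameters is not asserted here:

```julia
using MCMCChains
using Tables

chn = Chains(randn(100, 2, 3), [:A, :B])

# Check that Chains advertises the Tables.jl interface.
Tables.istable(chn)

# Materialize the columns as a NamedTuple of vectors, without DataFrames.
cols = Tables.columntable(chn)
keys(cols)       # column names, which include the parameters :A and :B
length(cols.A)   # number of rows in the flattened table
```

Any Tables.jl consumer can take `chn` in the same way, which is why `data(chn)` works in AlgebraOfGraphics.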

@rikhuijzer
Contributor

There is a package related to this discussion at https://github.com/theogf/Turkie.jl.

@adkabo

adkabo commented Jul 9, 2021

I wrote some code for MCMCChains + AlgebraOfGraphics a while back. https://github.com/adkabo/BayesPlots.jl/blob/main/src/plots.jl

@devmotion
Member

> It shouldn't be necessary to construct a DataFrame. Chains supports the Tables.jl interface and AlgebraOfGraphics can deal with any Tables.jl input.

More concretely, the plots above can be generated with the following code without DataFrames:

using AlgebraOfGraphics
using CairoMakie
using MCMCChains

using AlgebraOfGraphics: density

chain = Chains(randn(100, 2, 3), [:A, :B])

plt = data(chain) * mapping(:A; color=:chain => nonnumeric) * density()
draw(plt; axis=(ylabel="density",))

[image: density]

using AlgebraOfGraphics
using CairoMakie
using MCMCChains

using AlgebraOfGraphics: density

val = hcat(randn(400, 3, 2), rand(1:2, 400, 1, 2))
val .+= [1 2 3 0]
chain = Chains(val, [:A, :B, :C, :D])

# exclude additional information such as log probability
params = names(chain, :parameters) 
chain_mapping = mapping(params .=> "sample value") *
    mapping(; color=:chain => nonnumeric, row=dims(1) => renamer(params))
plt1 = data(chain) * mapping(:iteration) * chain_mapping * visual(Lines)
plt2 = data(chain) * chain_mapping * density()
fig = Figure(; resolution=(800, 600))
draw!(fig[1, 1], plt1)
draw!(fig[1, 2], plt2; axis=(ylabel="density",))

[image: values]

@storopoli
Member

Looks like @kskyten is doing something with Makie.

Look at: https://github.com/kskyten/BayesPlot.jl

@sethaxen
Member

There's also a plan to have ArviZ's plots in Plots.jl or Makie.jl (arviz-devs/ArviZ.jl#108). My current thinking is that the steps are:

  1. Split diagnostics and statistics from ArviZ into smaller, modular packages
  2. Add atomic recipes for various uncertainty visualizations for uni- and bivariate draws with a consistent interface to StatsPlots and Makie
  3. Create a package full of "plotting data" functions. These are functions that compute some statistics from the inference data, to be used for specific plots, and store them in structs.
  4. Create packages for Makie.jl and/or Plots.jl that simply implement the plotting functions for the structs using the atomic uncertainty recipes.

This is PPL-agnostic. Any PPL can hook into this by overloading the "plotting data" functions for types owned by the PPL. Alternatively, one can define a single converter from the types of the PPL to a common structure, which is the ArviZ.InferenceData type.
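
Steps 3 and 4 can be sketched in a few lines. Everything named here (`IntervalPlotData`, `interval_data`, `ToyPosterior`) is hypothetical and only illustrates the pattern; none of it is an existing ArviZ or MCMCChains API:

```julia
using Statistics

# Hypothetical container holding what an interval plot needs.
struct IntervalPlotData
    parameter::Symbol
    median::Float64
    lower::Float64
    upper::Float64
end

# "Plotting data" function: computes statistics only and draws nothing.
function interval_data(parameter::Symbol, draws::AbstractVector{<:Real}; prob::Real=0.9)
    tail = (1 - prob) / 2
    lower, mid, upper = quantile(draws, [tail, 0.5, 1 - tail])
    return IntervalPlotData(parameter, mid, lower, upper)
end

# A PPL hooks in by adding a method for a type it owns (hypothetical type).
struct ToyPosterior
    draws::Dict{Symbol,Vector{Float64}}
end

function interval_data(post::ToyPosterior; kwargs...)
    names = sort!(collect(keys(post.draws)))
    return [interval_data(name, post.draws[name]; kwargs...) for name in names]
end

post = ToyPosterior(Dict(:mu => randn(1_000), :sigma => abs.(randn(1_000))))
summaries = interval_data(post)
```

A Makie or Plots.jl package would then only need to know how to draw an `IntervalPlotData`, not how any particular PPL stores its draws.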

Another idea is to have something like a Tables interface for collections of MCMC draws produced in a Bayesian workflow. This would give a unified interface for retrieving prior, prior-predictive, posterior, posterior-predictive, etc draws, and then iterating over them either chain-wise, iteration-wise, or parameter-wise, etc. If we as a community could develop such a unified interface, then any plotting or diagnostics package could hook into the interface to provide plots for any PPL.
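
To make the idea concrete, here is a minimal sketch of what such accessors could look like. `DrawSet` and the three accessor names are made up for illustration, and the iterations × parameters × chains layout is an assumption:

```julia
# Hypothetical container for one group of draws,
# stored as iterations × parameters × chains.
struct DrawSet
    group::Symbol                 # e.g. :prior, :posterior, :posterior_predictive
    parameters::Vector{Symbol}
    values::Array{Float64,3}
end

# Three ways of walking the same draws; the names are illustrative only.
chainwise(d::DrawSet) = eachslice(d.values; dims=3)       # one iterations × parameters matrix per chain
iterationwise(d::DrawSet) = eachslice(d.values; dims=1)   # one parameters × chains matrix per iteration
parameterwise(d::DrawSet) =
    (name => view(d.values, :, i, :) for (i, name) in enumerate(d.parameters))

posterior = DrawSet(:posterior, [:A, :B], randn(100, 2, 3))

# A diagnostics or plotting package would only need these accessors.
for (name, draws) in parameterwise(posterior)
    println(name, ": mean = ", sum(draws) / length(draws))
end
```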

@ParadaCarleton
Member Author

> Another idea is to have something like a Tables interface for collections of MCMC draws produced in a Bayesian workflow. This would give a unified interface for retrieving prior, prior-predictive, posterior, posterior-predictive, etc draws, and then iterating over them either chain-wise, iteration-wise, or parameter-wise, etc.

Oh man, I love this! I think it's a great idea -- it would have saved me so much time/trouble with ParetoSmooth.jl.

@JasonPekos
Member

FWIW there is now also: https://github.com/TidierOrg/TidierPlots.jl, which would make matching the BayesPlot implementation even easier. The pace of development there has been pretty impressive.

They provide a pretty magical looking method of automatically creating new geoms from Makie functions, which is simply:

geom_raincloud = geom_template("geom_raincloud", ["x", "y"], :RainClouds)

ggplot(penguins) + 
    geom_raincloud(aes(x = :species, y = :bill_depth_mm/10, color = :species), size = 4) +
    scale_y_continuous(labels = "{:.1f} cm") + 
    labs(title = "Bill Depth by Species", x = "Species", y = "Bill Depth") +
    theme_minimal()

So if a hypothetical MakieArviz.jl package adds only the basic geoms we'd need to match BayesPlot (a more approachable task), the functions that are really just a composition of smaller geoms can be created by pretty much carrying over the BayesPlot code exactly.

8 participants