
Marginalization for learn_strudel circuits #48

Closed
RenatoGeh opened this issue Jan 17, 2021 · 3 comments

Comments

@RenatoGeh (Contributor)

Hi,

I'm trying to compute the marginal probability of a circuit learned from learn_strudel, but I keep getting a keyword error. Am I doing something wrong?

Here's a minimal example to reproduce the error.

using ProbabilisticCircuits, LogicCircuits, DataFrames

R, _, T = twenty_datasets("nltcs")
M, _, _ = learn_strudel(R; num_mix = 10, init_maxiter = 10, em_maxiter = 100)
# Setting first value as missing in test set.
Q = allowmissing(T, 1)
Q[:,1] .= missing
p = MAR(M, Q)

The error backtrace:

ERROR: UndefKeywordError: keyword argument component_idx not assigned
Stacktrace:
 [1] ParamBitCircuit(::SharedSumNode, ::DataFrame) at /home/renatogeh/.julia/packages/ProbabilisticCircuits/Uh1Ay/src/param_bit_circuit.jl:45
 [2] marginal(::SharedSumNode, ::DataFrame) at /home/renatogeh/.julia/packages/ProbabilisticCircuits/Uh1Ay/src/queries/marginal_flow.jl:40
 [3] top-level scope at REPL[20]:1
 [4] run_repl(::REPL.AbstractREPL, ::Any) at /build/julia/src/julia-1.5.3/usr/share/julia/stdlib/v1.5/REPL/src/REPL.jl:288

Thanks

@MhDang (Member) commented Jan 18, 2021

Hi,

Thanks for the feedback.

learn_strudel returns a mixture of circuits that share the same structure but have different parameters. Concretely, it returns a tuple (pc, component_weights, lls): pc is a SharedProbCircuit representing the shared structure, where each column of parameters corresponds to one mixture component; component_weights holds the mixture weights; lls is the log-likelihoods.

Inference routines are not yet implemented for SharedProbCircuit under the uniform API (TODO later), so calling EVI or MAR on it directly raises an error. Instead, you can call MAR for each component by setting component_idx, and then combine the weighted results at the end.

R, _, T = twenty_datasets("nltcs")
num_mix = 10
M, W, _ = learn_strudel(R; num_mix = num_mix, init_maxiter = 10, em_maxiter = 100)
# Setting first value as missing in test set.
Q = allowmissing(T, 1)
Q[:,1] .= missing
# Compute the marginal of each mixture component separately.
lls = zeros(num_examples(T), num_mix)
for i in 1:num_mix
    pbc = ParamBitCircuit(M, Q; component_idx = i)
    lls[:, i] .= MAR(pbc, Q)
end
# Weight each component's log-likelihood and combine with log-sum-exp.
ll = logsumexp(lls .+ log.(W), 2)
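The final line computes the mixture marginal log p(x) = log Σᵢ wᵢ pᵢ(x) = logsumexpᵢ(log pᵢ(x) + log wᵢ), which stays numerically stable in log space. A minimal sketch of that log-sum-exp combination for a single example, written in Python with made-up per-component log-likelihoods and weights (the names and values are illustrative, not from the library):

```python
import math

# Hypothetical per-component marginal log-likelihoods for one example
# (a 2-component mixture for simplicity).
lls = [-2.0, -3.0]
# Hypothetical mixture weights (must sum to 1).
w = [0.6, 0.4]

# log p(x) = logsumexp_i(log p_i(x) + log w_i)
terms = [ll + math.log(wi) for ll, wi in zip(lls, w)]
m = max(terms)  # subtract the max for numerical stability
logp = m + math.log(sum(math.exp(t - m) for t in terms))

# Sanity check against the direct (non-log-space) mixture probability.
direct = math.log(sum(wi * math.exp(ll) for ll, wi in zip(lls, w)))
assert abs(logp - direct) < 1e-12
```

This mirrors what the Julia snippet does row-wise over lls with logsumexp along the component dimension.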

By the way, if you want to learn a single (non-mixture) probabilistic circuit, use learn_circuit instead:

learn_circuit(R; maxiter=20)

@RenatoGeh (Contributor, Author)

Thanks for the quick reply. :)

I see. Since the docs didn't mention this, I assumed mixtures behaved the same way as regular circuits and that the weight computation was somehow embedded in SharedProbCircuit, especially since EVI works out of the box with mixtures (though looking at the code, I guess Juice assumes uniform weights for the standard log-likelihood).

I've opened a draft PR here #49 addressing these issues.

@MhDang (Member) commented Jan 21, 2021

Thanks for the contribution; the PR has been merged. :)

@MhDang MhDang closed this as completed Jan 21, 2021